Add generic dataflow framework + example + stack value analysis #118

montyly · 2022-12-14T14:20:17Z

Add AbstractDataflow, which is a generic dataflow analysis, following an abstract interpretation approach.
Add CollectInstruction, which is just an example. The analysis collects all the instructions executed
Add StackValue, which track the values push/pop on the stack, with a focus on int/global field/transaction field

Build on top of 116

This is a draft. Todo list:

Implement missing opcode handlers for important operations
Add more tests (ex: to stress the merging operations)

Once this is done, the next step will be to increase the abstract value to also track light equation (ex: Eq(1, GroupSize)), and propagating them while collecting logical operations (or, and)

- Add AbstractDataflow, which is a generic dataflow analysis, following an abstract interpretation approach. - Add CollectInstruction, which is just an example. The analysis collects all the instructions executed - Add StackValue, which track the values push/pop on the stack, with a focus on int/global field/transaction field

S3v3ru5 · 2023-01-10T07:42:38Z

tealer/analyses/dataflow/stack_value.py

+
+        while values and values[0] == StackValue(TOP()):
+            values.pop()
+


is this supposed to be

while values and values[0] == StackValue(Top()): values.pop(0)

Or

while values and values[-1] == StackValue(Top()): values.pop()

I'm guessing it's the first one (?)

S3v3ru5 · 2023-01-10T08:26:36Z

tealer/analyses/dataflow/stack_value.py

+            poped_values = list(self.values)
+            return Stack([]), [StackValue(TOP()) for _ in range(diff)] + poped_values
+
+        poped_values = self.values[:-number]


Shouldn't this line be poped_values = self.values[-number:] (?)

S3v3ru5 · 2023-01-10T08:48:30Z

tealer/analyses/dataflow/stack_value.py

+        v2 = stack.values
+
+        min_length = min(len(v1), len(v2))
+        v1 = v1[:-min_length]


Shouldn't this lines be

v1 = v1[-min_length:] v2 = v2[-min_length:]

S3v3ru5 · 2023-01-10T08:57:41Z

tealer/analyses/dataflow/stack_value.py

+        self.ins_out: Dict[Instruction, Stack] = defaultdict(lambda: Stack([]))
+
+    def _merge_predecessor(self, bb: BasicBlock) -> Stack:
+        s = Stack([])


Stack union depends on the length of inputs.

def union(self, stack: "Stack") -> "Stack": v1 = self.values v2 = stack.values min_length = min(len(v1), len(v2)) v1 = v1[:-min_length] v2 = v2[:-min_length] v3 = [] for i in range(min_length): v3.append(v1[i].union(v2[i])) return Stack(v3)

Because initial s = Stack([]) has length 0, output of union operation will be empty stack. _merge_predecessor will always result in empty stack.
Correct approach would be:

if not bb.prev: # zero predecessors return Stack([]) s = self.bb_out[bb.prev[0]] for bb_prev in bb.prev[1:]: s = s.union(self.bb_out[bb_prev]) return s

S3v3ru5 · 2023-01-10T10:23:40Z

tealer/teal/instructions/transaction_field.py

+
+    def __hash__(self) -> int:
+        return hash(str(self))
+


^^ Won't this approach result in errors when transaction fields from gtxn instruction are tracked alongside txn instruction?

txn RekeyTo

gtxn 3 RekeyTo

RekeyTo from 1 and 2 would have same str(self) and type. They will be considered equal during stack value analysis. txn RekeyTo and gtxn 3 RekeyTo does not necessarily be equal and most of the times will not be equal.

I think we need a better way to compare and hash TransactionField objects.

S3v3ru5 reviewed Jan 10, 2023

View reviewed changes

montyly added 3 commits September 18, 2023 14:11

Merge branch 'dev' into dev-stack-value-analysis

0311e26

Merge branch 'dev' into dev-stack-value-analysis

e39bac5

minor

971adaa

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add generic dataflow framework + example + stack value analysis #118

Add generic dataflow framework + example + stack value analysis #118

montyly commented Dec 14, 2022

S3v3ru5 Jan 10, 2023

S3v3ru5 Jan 10, 2023

S3v3ru5 Jan 10, 2023

S3v3ru5 Jan 10, 2023

S3v3ru5 Jan 10, 2023


		while values and values[0] == StackValue(TOP()):
		values.pop()

Add generic dataflow framework + example + stack value analysis #118

Are you sure you want to change the base?

Add generic dataflow framework + example + stack value analysis #118

Conversation

montyly commented Dec 14, 2022

S3v3ru5 Jan 10, 2023

Choose a reason for hiding this comment

S3v3ru5 Jan 10, 2023

Choose a reason for hiding this comment

S3v3ru5 Jan 10, 2023

Choose a reason for hiding this comment

S3v3ru5 Jan 10, 2023

Choose a reason for hiding this comment

S3v3ru5 Jan 10, 2023

Choose a reason for hiding this comment