Use data flow analysis to find constraints on transaction fields in a contract #96

S3v3ru5 · 2022-09-14T14:28:39Z

Dataflow Analysis to find constraints on transaction fields

What are constraints on transaction fields

For each basic block in the CFG

For each Transaction Field, find all possible values for the given transaction field which leads to successful execution of the basic block.
- e.g: block_transaction_context[B][F] stores the possible values form field F in basic block B.
  if block_transaction_context[b]["GroupSize"] = [1, 3]. Then, if and only if the group size of the transaction invoking this contracts has groupsize 1 or 3,
  - The entry point of basic block b will be executed AND
  - The execution reaches the end of the basic block b without reverting AND
  - The execution will continue without reverting in atleast one of execution paths.
- i.e All execution paths that contain basic block b constraint the group size to 1 or 3.

NOTE: only assert instructions are considered to check if execution reverts or not.
e.g if sequence of instructions are

...
Global GroupSize
int 3
==
assert
...

Then

The groupsize must be 3 for successful execution of the basic block.
block_transaction_context[b]["GroupSize"] = [3]

Implementation

Source of constraints for a given basic block

Predecessor blocks
Block itself
Successor blocks

From Predecessors: if and only if field has these values the execution will reach this block.
From Block: if and only if field has these values the execution will reach exit point of block without reverting.
From Successors: if and only if field has these values the execution will successfully continue after exiting from the block.

block_transaction_context[b] = Union(block_transaction_context[bi] for bi in predecessors[b]) \
                            && block_transaction_context[b] \
                            && Union(block_transaction_context[bi] for bi in successors[b])
# Note: && is intersection

consider

B3 has predecessors B1, B2 and successors B4, B5.
Given following transaction contexts,

    # predecessors
    block_transaction_context[B1]["GroupSize] = [1, 3], 
    block_transaction_context[B2]["GroupSize] = [1, 5],
    # block
    block_transaction_context[B3]["GroupSize"] = [1, 3, 5]
    # successors
    block_transaction_context[B4]["GroupSize] = [3], # asserts groupsize is 3 in B4
    block_transaction_context[B5]["GroupSize] = [5], # asserts groupsize is 5 in B5

Then new context will be,

    block_transaction_context = {1, 3, 5} && {1, 3, 5} && {3, 5} = {3, 5}

Branch Based Constraints

Teal contracts generally use conditional branch instructions directly on the constraints(conditional expression).
e.g, consider B1 from above is

...
Global GroupSize
int 3
==
bnz B3

This implies

# while considering for B3 
block_transaction_context[B1]["GroupSize"] = [3] # instead of [1, 3] which is the case for other children of B1

To solve this, we can use path based transaction context while considering information from predecessors

# path_transaction_context[bi][b] gives the transaction constraints for path bi -> b
block_transaction_context[b] = Union((block_transaction_context[bi] && path_transaction_context[bi][b]) for bi in predecessors[b]) \
                            && block_transaction_context[b] \
                            && Union(block_transaction_context[bi] for bi in successors[b])
# Note: && is intersection

Algorithm

Phase 1

block_transaction_context = dict()
path_transaction_context = dict()

for B in basic_blocks:
    for F in transaction_fields:
        c = ConstraintsFromAssert(B, F)
        if c == set(): # NullSet(F)
            c = UniversalSet(F)
        
        block_transaction_context[B][F] = c

        if B[-1] == Ins(BZ) or B[-1] == Ins(BNZ):    # last instruction is conditional branch
            # constraints that must be satisfied to take the jump.
            true_branch_constraints = ComputeTrueBranchConstraints(B, F)
            # constraints if the default branch is taken
            false_branch_constraints = ComputeFalseBranchConstraints(B, D) # default branch
                        
            if true_branch_constraints = set(): # NullSet(F)
                true_branch_constraints = UniversalSet(F)

            if false_branch_constraints = set(): # NullSet(F)
                false_branch_constraints = UniversalSet(F)

            # successors[0] is the default branch and successors[1] is the target branch
            
            path_transaction_context[successors[B][0]][B][F] = false_branch_constraints
            path_transaction_context[successors[B][1]][B][F] = true_branch_constraints

Phase 2

worklist = [b for b in basic_blocks]

while worklist:
    B = worklist[0]
    worklist = worklist[1:]
    new_transaction_context = merge_information(B)
    if new_transaction_context != block_transaction_context[B]:
        worklist.extend(predecessors(B))
        worklist.extend(successors(B))
        worklist = list(set(worklist))

Transaction Fields to consider for constraints

Global Fields:

GroupSize: Finite -> [1, 16]

Transaction Fields:

GroupIndex: Finite -> [0, 15]
Type: Finite -> [pay{}, keyreg{}, acfg{}, axfer{}, afrx{}, appl{Creation, UpdateApplication, DeleteApplication, NoOp, ...}]
RekeyTo: Addr -> Store concrete constraints only
CloseRemainderTo: Addr
AssetCloseTo: Addr
Sender: Addr
Receiver: Addr
...

Note

==, != are both considered for fields with small domain.

If block B is,
```
Global GroupSize
int 2
==
assert
```
Then, block_transaction_context[B]["GroupSize"] = [2]
And if block B is
```
Global GroupSize
int 2
!=
assert
```
Then, block_transaction_context[B]["GroupSize"] = list(set(range(1, 16)) - set([2]))

However, We cannot do the same for Addr type fields. we will only consider constraints which are small enough to store and enumerate. if the number of possible values are small, then we store them otherwise the possible values are considered to be ALL possible values(no constraints).
Using the gtxn instruction, constraints can be computed for other transactions in the group as well. This helps in detecting the possible group structures. Constraints of other transactions in the group can also be used for contracts which distribute the security checks to multiple contracts.

Uses

Currently canUpdate, canDelete detectors check for following sequence of instructions and report execution paths that do not have that sequence
```
txn OnCompletion
int [UpdateApplication | DeleteApplication]
==
```
These detectors can be divided into isUpdatable, isDeletable and unprotectedUpdateApplication, unprotectedDeleteApplication detectors.
- isUpdatable: is the contract updatable?. Check all the leaf basic blocks in the CFG, if any of the block has appl{UpdateApplication} in it's context and doesn't contain return 0 then there's a execution path that will be executed when transaction is UpdateApplication call.
- unprotectedUpdateApplication: update application has proper access controls?. if the contract is Updatable then check constraints on all the execution paths(using leaf basic blocks) with appl{UpdateApplication}, if there is no constraint on SenderAddress, then report that execution path. Based on Tealer capabilities, it can be checked if SenderAddress is checked against state variable.
- isDeleteable and unprotectedDeleteApplication is same as above but for DeleteApplication.
Printers for UpdateApplication, DeleteApplication paths.
Highlight basic blocks which has appl{UpdateApplication} in their paths for UpdateApplication printer. Highlight basic blocks which has appl{DeleteApplication}.
Get Heuristics on contract type: stateless or stateful. We could check if there are any blocks which does not have any appl{UpdateApplication, DeleteApplication, ...}. This implies that either contract is stateless or contract is stateful and block is dead block. Similar deduction can be made if any of the blocks has only appl{...}.
Get information on the group size and possible combinations of group this contract might be part of.
- This helps in deducing the group structure itself.
- Detectors can perform better analysis for contracts where checks are distributed among multiple contracts. e.g First contract in the group verifies fields of all transactions in the group.
- Reduce FP for contracts which check group size and fields of all transactions in the group.
Improve analysis of current detectors which check if contracts are validating certain transaction fields.

Blockers

Currently, assert instruction or (bz/bnz) instructions only consider single comparison expression. A constraint is considered if and only if comparison operator is immediately followed by assert or bz/bnz. This does give some reasonable constrainsts. However, developers generally chain multiple conditions and then assert or branch on that large expression.
- Ideally, entire boolean expression should be constructed and constraints should be extracted from that.
- However, based on the benchmark contracts, in practice, most of this expressions are chain of && operations, which are much easier to extract from, than the generalised expressions.

ToDo

Evaluate above use cases, complexity and tradeof and decide whether to proceed with this approach. Find edgecases that are not considered and improve the algorithm.
Implement improved algorithm with as much generality as possible.
Improve False Positives of the current detectors
Implement and evaluate above detectors and printers.
Use heuristics for contract type and use it for future analysis.
Find other use cases

Related/Solves issues #75, #82, #83, #87, #90, #95

The text was updated successfully, but these errors were encountered:

S3v3ru5 · 2022-10-13T14:58:48Z

The above dataflow equations do not work correctly when there's a backedge(loops). Because:

For a field, each block is initialized with all possible values for that field.
For a node to receive information, all of the predecessors must first receive the target information.
For a loop header block to receive the information, the information must first reach the loop exit block. And the only way for the information to reach the loop exit block is from the loop header block. This results in a cycle, stopping the information from outside the loop from reaching the loop body blocks and stopping information computed within the loop body from reaching blocks outside the loop.

Ex:

information from instructions txn OnCompletion == int CloseOut in block 4 would not reach block 9 and 10. Similarly, information from instructions txn OnCompletion != int DeleteApplication in block 9 (inside the loop) would not reach block 4 and other blocks outside the loop.

An easy solution for this would be to:

find back edges and ignore predecessors or successors connected with the back edge while combining information. As information is propagated from predecessors as well as successors information would reach all blocks even if we do this.

However, we are doing path-sensitive analysis. so, if the information is part of the back edge itself and is not part of the source or target block of the edge, we would lose that information. But, this should be fine for the use cases. Because, in practice, transaction constraints are not part of the loop conditions.

Another solution is to change the equations to follow GEN and KILL type analysis. And update the implementation accordingly.

S3v3ru5 mentioned this issue Sep 19, 2022

Add generic class for dataflow analysis and class to find constraints on group size fields #97

Merged

S3v3ru5 mentioned this issue Oct 13, 2022

Update output format to reduce the number of output files #101

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use data flow analysis to find constraints on transaction fields in a contract #96

Use data flow analysis to find constraints on transaction fields in a contract #96

S3v3ru5 commented Sep 14, 2022 •

edited

Loading

S3v3ru5 commented Oct 13, 2022

Use data flow analysis to find constraints on transaction fields in a contract #96

Use data flow analysis to find constraints on transaction fields in a contract #96

Comments

S3v3ru5 commented Sep 14, 2022 • edited Loading

Dataflow Analysis to find constraints on transaction fields

What are constraints on transaction fields

Implementation

Source of constraints for a given basic block

Branch Based Constraints

Algorithm

Phase 1

Phase 2

Transaction Fields to consider for constraints

Note

Uses

Blockers

ToDo

S3v3ru5 commented Oct 13, 2022

S3v3ru5 commented Sep 14, 2022 •

edited

Loading