Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

4425 node merging to reduce branching for cse #4521

Closed
wants to merge 1,991 commits into from

Conversation

Stevengre
Copy link
Contributor

This is an implementation for #4425; the changes related to #4499 and #4502 will affect the merge_node functionality.

Please note that while this implementation is consistent with the design of #4425, it is not identical. It takes into account some general cases of mergeable nodes.

graph TD
  1 -->|"Split: X >=Int 5"| 2
  2 -->|"Edge: (5, (r1), #Top)"| 6
  1 -->|"Split: X >=Int 3 /\ X <Int 5"| 3
  3 -->|"Edge: (10, (r2,r3), X >=Int 4 /\ X <Int 5), (15, (r4), X <Int 4 /\ X >=Int 3 )"| 7
  1 -->|"Split: X >=Int 0 /\ X <Int 3"| 4
  4 -->|"Edge: (20, (r5), #Top)"| 8
  1 -->|"Split: X  <Int 0"| 5
  5 -->|"Edge: (25, (r6), #Top)"| 9
Loading

With mergeable pairs (2,3), (2,4), (3,4), (3,5), and (4,5).
Based on the largest connected subgraph, we identify two mergeable groups: (2,3,4) and (3,4,5), as 2 and 5 cannot be merged. Thus, the merged result should be:

graph TD
1 -->|"Split: X >=Int 0"| 10["2 | 3 | 4"]
10 -->|"Edge: (5, (r1), X >=Int 5),(10, (r2,r3), X >=Int 4 & X <Int 5), (15, (r4), X <Int 4 /\ X >=Int 3 ),(20, (r5), X >=Int 0 /\ X <Int 3),"| 11["6 | 7 | 8"]
11 -->|Split| 6
11 -->|Split| 7
11 -->|Split| 8
1 -->|"Split: X <Int 5"| 12["3 | 4 | 5"]
12 -->|"Edge: (10, (r2,r3), X >=Int 4 & X <Int 5), (15, (r4), X <Int 4 & X >=Int 3 ),(20, (r5), X >=Int 0 & X <Int 3),(25, (r6), X <Int 0)"| 13["7 | 8 | 9"]
13 -->|Split| 7
13 -->|Split| 8
13 -->|Split| 9
Loading

Open Issues:

  1. CTerm is anti-unified to a form without constraints. Should we add all constraints to the CTerm?
  2. Currently, CSubst can contain redundant constraints, e.g., x > 0, x > 1 should be merged to x > 1.
  3. Additionally, CSubst is sensitive to the order of constraints. Should we override the __eq__ method to address this?
  4. Furthermore, the CSubst list is essentially a sugar for $\land$. Should we standardize the constraints to conjunctive normal form, or at least flatten all $\land$ by default to simplify constraint reasoning? Or, since the constraints are user-facing, should we retain the original form for better user understanding?

@ehildenb
Copy link
Member

@Stevengre you need to rebase your branch on develop branch of this repository, not master.

Also, we can assume that the merge_nodes heuristic will create a partition, that is, it will not assign the same node to be merged into two different buckets. In yoru example above, nodes 7 and 8 seem to be in both node-merge buckets, so you get those cross-splits. We don't need that functionality.

Comment on lines 735 to 740
def remove_node_safely(
self,
node_id: NodeIdLike,
predecessors: dict[str, tuple[NodeIdLike, ...]] = None,
successors: dict[str, tuple[NodeIdLike, ...]] = None,
) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have this method? Don't we already have remove_node? It should work fine, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although the current remove node operation does not cause dangling issues, it directly removes edges related to the node, leading to some unexpected situations. For example:

graph LR
A -->|Split| B
A -->|Split| C
Loading

Removing node B results in:

graph LR
A 
C
Loading

However, what I expect is:

graph LR
A -->|Split| C
Loading

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually expected behavior, because Split nodes must maintain that they are total and mutually exclusive. Mutually exclusive means that nodes B and C must not have any intersection of their states. Total means that node A must represent the same set of states as the union of nodes B and C. So putting a split like A -> C violates the totalness requirement, so the split must be removed.

@Stevengre
Copy link
Contributor Author

Also, we can assume that the merge_nodes heuristic will create a partition, that is, it will not assign the same node to be merged into two different buckets. In yoru example above, nodes 7 and 8 seem to be in both node-merge buckets, so you get those cross-splits. We don't need that functionality.

@ehildenb If this is the case, i.e., (2,3), (2,4), (3,4), (3,5), (4,5) (2,5), then the current implementation will work as mentioned in #4425. I am just providing a more general solution to handle the special cases that might appear in the figure. I believe that the meaning of KCFGSemantics.mergable(CTerm, CTerm) is to specify that two CTerms are mergable, so this situation might arise. And this merging method aligns with the semantics of mergable.

@ehildenb
Copy link
Member

@Stevengre please rebase your branch on develop. And like I said, we don't need a more general solution, a simple solution is better if it still achieves our goals, because it leads to more maintainable code. Please read this article: https://runtimeverification.com/blog/github-guidelines-for-collaborative-coding. We must prioritize simplicity of code over completeness. It is significantly better to have simpler code that handles 80% of the cases (but handles the cases we need), than complex code that handles 100% of the cases (including cases we don't necessarily need).

@Stevengre
Copy link
Contributor Author

Thank you for your suggestion. And sorry for my late reply. I will make sure to study the article thoroughly. I will try to rebase my branch, but please give me one day to focus on my paper. Some problems still need to be fixed for it. I will complete this work by next Monday.

@Stevengre
Copy link
Contributor Author

By the way, I have one more thing to confirm. Regarding this functionality: should we assume that 1. All branch targets of the split edge will be merged? 2. Is the mergable relationship transitive?

@ehildenb
Copy link
Member

Mergability should be an equivalence relation. So it should partition the nodes. Transitive, commutative, and reflexive.

@Stevengre
Copy link
Contributor Author

Thank you! I understand. This way, we won't have a situation where (2,4) and (4,5) are mergable but not (2,5). However, it is still possible that some split targets may be merged while others are not, right?

@ehildenb
Copy link
Member

ehildenb commented Jul 19, 2024

Yes, it's still possible some split nodes are merge-able and others are not.

But you can do that by refactoring the way the splits happen. See for example here: https://docs.google.com/drawings/d/1l8i_aUMcL8E7WyxHmk2cqRIY74fQ-yvMdhhdDNcOUp0/edit

Between diagram 1 and 2, we determined that B and C are mergeable, and D not. So we introduce a pre-split between B \/ C and D, then split B \/ C into B and C. That way the merging can happen over the entire split.

@Stevengre
Copy link
Contributor Author

Thank you for sharing this article (https://runtimeverification.com/blog/github-guidelines-for-collaborative-coding)! I will do my best to make my PRs easy to review by following the correct, fast, simple, and complete guidelines. However, I apologize as I am still new to collaborative coding and might make some mistakes. If you see any areas that need improvement, please let me know. Thank you again!

@Stevengre Stevengre mentioned this pull request Jul 26, 2024
@Baltoli Baltoli removed their request for review July 26, 2024 14:46
@Stevengre Stevengre closed this Jul 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Node merging to reduce branching for CSE
2 participants