Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Java/Shared: Refactor TypeFlow.qll into a shared library #15728

Merged
merged 17 commits into from
Apr 5, 2024

Conversation

MathiasVP
Copy link
Contributor

This PR pulls out the shareable parts of Java's type-flow library into a new shared qlpack. In a subsequent PR, I plan to make use of this library for C/C++ dataflow 🤞

@MathiasVP MathiasVP requested a review from a team as a code owner February 26, 2024 16:15
@github-actions github-actions bot added the Java label Feb 26, 2024
shared/typeflow/codeql/typeflow/TypeFlow.qll Fixed Show resolved Hide resolved
java/ql/lib/semmle/code/java/dataflow/TypeFlow.qll Fixed Show resolved Hide resolved
java/ql/lib/semmle/code/java/dataflow/TypeFlow.qll Dismissed Show resolved Hide resolved
java/ql/lib/semmle/code/java/dataflow/TypeFlow.qll Dismissed Show resolved Hide resolved
java/ql/lib/semmle/code/java/dataflow/TypeFlow.qll Dismissed Show resolved Hide resolved
shared/typeflow/codeql/typeflow/TypeFlow.qll Fixed Show resolved Hide resolved
shared/typeflow/codeql/typeflow/TypeFlow.qll Dismissed Show resolved Hide resolved
@MathiasVP MathiasVP marked this pull request as draft February 26, 2024 16:57
@MathiasVP
Copy link
Contributor Author

MathiasVP commented Feb 26, 2024

Oops. There's one small test failure. I'll investigate!

Edit: Was a silly mistake of me forgetting to put J:: in front of a single type 😄. I've force-pushed it away to keep the history clean

@MathiasVP MathiasVP force-pushed the shared-typeflow-library branch from 2684b46 to 690fdc0 Compare February 26, 2024 17:23
@MathiasVP MathiasVP marked this pull request as ready for review February 26, 2024 18:11
Comment on lines +62 to +73
/**
* Gets the source declaration of this type, or `t` if `t` is already a
* source declaration.
*/
default Type getSourceDeclaration(Type t) { result = t }

/**
* Gets the erased version of this type. The erasure of a erasure of a
* parameterized type is its generic counterpart, or `t` if `t` is already
* fully erased.
*/
default Type getErasure(Type t) { result = t }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's something about including this distinction in the interface that rubs me the wrong way. Let me think about how to do this in a better way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I wasn't sure how to get rid of this either. And I wasn't comfortable about changing this too much in the Java implementation. If you can think of a way I'd love to fix this!

Comment on lines +10 to +12
isNullValue(n)
or
exists(TypeFlowNode mid | isNull(mid) and step(mid, n))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to be missing the forex case over joinStep.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I thought you were right here, but it's actually included in the Java instantiation here since it also needs to be joined with not exists(n.asField()). That's not super pretty, I know 😬

As an alternative, we can have have some kind of exclusion predicate in the interface, and have:

forex(TypeFlowNode mid | joinStep0(mid, n) | Make<J::Location, Input>::isNull(mid)) and
not excluded(n)

Then Java can implement excluded(n) as exists(n.asField())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done the above in 1acbb84. Hopefully that should make it clear that the rewrite preserves the meaning of isNull 🤞

* Holds if there is a common (reflexive, transitive) subtype of the erased
* types `t1` and `t2`.
*/
private predicate erasedHaveIntersection(Type t1, Type t2) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This predicate is somewhat volatile performance-wise, so we should probably just copy its implementation more verbatim. At the very least it needs to be nomagic'ed, and if we extract the SrcRefType in the same way as for Java, then we'll be more likely to get the two implementations unified by the compiler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried to do this in bae633a, but I'm not sure how to verify this. Can I just compile a Java query and check if the predicates have been unified? Or is this a Cachaca thing can I can only see at runtime?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I just compile a Java query and check if the predicates have been unified

Yes, I think you ought to be able to observe the unification at the RA level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... Doesn't look like it was unified:

EVALUATE NONRECURSIVE RELATION:
  `TypeFlowImpl::TypeFlow<Location::Location,TypeFlow::Input>::erasedHaveIntersection/2#bb495610`(entity t1, entity t2) :- 
    SENTINEL `Type::RefType.getSourceDeclaration/0#dispred#770ab75c`
    SENTINEL `project#TypeFlow::Input::getErasure/1#ad6f3159`
    {2} r1 = REWRITE `Type::RefType.getSourceDeclaration/0#dispred#770ab75c` WITH TEST InOut.0 't2' = InOut.1

    {2} r2 = SCAN r1 OUTPUT In.0 't2', In.0 't2'
    {2}    | JOIN WITH `project#TypeFlow::Input::getErasure/1#ad6f3159` ON FIRST 1 OUTPUT Lhs.1 't2', Lhs.0 't1'

    {1} r3 = SCAN r1 OUTPUT In.0 't2'
    {2}    | JOIN WITH `#Type::RefType.getASourceSupertype/0#dispred#418e5974Plus` ON FIRST 1 OUTPUT Rhs.1 't1', Lhs.0 't2'
    {2}    | JOIN WITH `project#TypeFlow::Input::getErasure/1#ad6f3159` ON FIRST 1 OUTPUT Lhs.1 't2', Lhs.0 't1'

    {2} r4 = r2 UNION r3

    {2} r5 = JOIN r4 WITH `project#TypeFlow::Input::getErasure/1#ad6f3159` ON FIRST 1 OUTPUT Lhs.1 't1', Lhs.0 't2'

    {2} r6 = JOIN r4 WITH `#Type::RefType.getASourceSupertype/0#dispred#418e5974Plus` ON FIRST 1 OUTPUT Rhs.1 't1', Lhs.1 't1'
    {2}    | JOIN WITH `project#TypeFlow::Input::getErasure/1#ad6f3159` ON FIRST 1 OUTPUT Lhs.1 't1', Lhs.0 't2'

    {2} r7 = r5 UNION r6
    return r7

@MathiasVP MathiasVP force-pushed the shared-typeflow-library branch from 75ee885 to bae633a Compare April 5, 2024 12:19
…s. Rename 'sccJoinStep' to 'sccJoinStepNotNull' to match the new name.
Copy link
Contributor

@aschackmull aschackmull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now. If dca is uneventful, then I think we're good to merge.

@MathiasVP
Copy link
Contributor Author

LGTM now. If dca is uneventful, then I think we're good to merge.

DCA was very uneventful. Merging!

@MathiasVP MathiasVP merged commit 2256c4c into github:main Apr 5, 2024
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants