Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix false positive detection heuristics #4009

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

alexzurbonsen
Copy link
Contributor

@alexzurbonsen alexzurbonsen commented Dec 9, 2024

Context

The newly introduced conditions on rule relevance and copyrights for false positive detections, see f9863e61, contain a bug that makes them detect copyrights and rule relevance if matches do not even have a matched_text or the rule doesn't even have a relevance attribute.

Fixes #4005 (at least for the example I am looking at)

Summary

Correct the any and all conditions to correctly detect copyrights and relevance.

Test plan

I reran the scan of the file from #4005

scancode -l <your path to boost repo>/boost/boost/assign/ptr_list_of.hpp --json scancode.json

and get a scancode.json that has the same contents as on v32.2.1.

scancode_fix.json

Tasks

  • Reviewed contribution guidelines
  • PR is descriptively titled 📑 and links the original issue above 🔗
  • Tests pass -- look for a green checkbox ✔️ a few minutes after opening your PR
    Run tests locally to check for errors.
  • Commits are in uniquely-named feature branch and has no merge conflicts 📁
  • Updated documentation pages (if applicable)
  • Updated CHANGELOG.rst (if applicable)

@alexzurbonsen
Copy link
Contributor Author

alexzurbonsen commented Dec 9, 2024

@AyanSinhaMahapatra This should fix the issue with the false positive license detection heuristics. #4005

It feels like there should be a test for this. If so, can you help me write one? It seemed a bit more involved and I wanted to get this out quickly.

@alexzurbonsen alexzurbonsen force-pushed the fix-regression-false-positive-detection branch from 7212179 to 35029f3 Compare December 9, 2024 10:54
@alexzurbonsen alexzurbonsen marked this pull request as ready for review December 9, 2024 10:55
@alexzurbonsen
Copy link
Contributor Author

There is a failing test tests/licensedcode/test_plugin_license_detection.py::test_complicated_license_text_from_ffmpeg It seems that it is relying on the buggy logic. I would be glad for some help on this one @AyanSinhaMahapatra, since I am not sure yet, what the appropriate change is.

Due to wrong handling of any and all functions license matches are
categorized as having full relevance or copyrights, even if they
do not. This leads to a regression in false positive detection.

Correct the any and all conditions to correctly detect copyrights
and relevance.

Signed-off-by: alexzurbonsen <[email protected]>
@alexzurbonsen alexzurbonsen force-pushed the fix-regression-false-positive-detection branch from 35029f3 to ee32157 Compare December 10, 2024 17:54
@alexzurbonsen
Copy link
Contributor Author

alexzurbonsen commented Dec 10, 2024

Update: I scanned plugin_license/scan/ffmpeg-LICENSE.md with the scancode version from this branch and pasted the result into the expected file. The changes seem reasonable to me.

@alexzurbonsen
Copy link
Contributor Author

@AyanSinhaMahapatra and @pombredanne Would be great to get your feedback on this one. The bug currently blocking us from updating. It would seem that it basically renders the is_false_positive function useless, since it always returns false, and thus torpedoes false positive detection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Regression: GPL false positive license detections with v32.3.0
1 participant