Validate GLE license is correctly detected #3240

Open
pombredanne opened this issue Feb 7, 2023 · 5 comments
@pombredanne
Member

The license @ https://sources.debian.org/data/main/g/gle-graphics/4.3.3-3/doc/LICENSE.txt should be scanned and we should validate that everything is detected correctly.
Reported by @pabs3

@shricodev
Contributor

@pombredanne Sir, I tried running the scan with these parameters:
./scancode --json-pp gle-result.txt --license --package --url --email --copyright LICENSE.txt

and the results look good to me; all the licenses seem to be detected properly :P
BTW is there something to look into particularly?
The output was large so I saved it here: https://github.com/OctoPie23/OctoPie23/blob/main/temp/gle-result.txt
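For a result this large, a short script can list the detections instead of reading the whole file. This is only a minimal sketch, assuming the JSON has a top-level `files` list whose entries carry `path` and `license_detections`, and that each match carries `start_line` and `end_line`. The helper name `summarize_detections` and the inline sample are illustrative and not part of ScanCode.

```python
def summarize_detections(scan):
    """Return (path, license_expression, first_line, last_line) per detection."""
    rows = []
    for file_entry in scan.get("files", []):
        for detection in file_entry.get("license_detections", []):
            matches = detection.get("matches", [])
            if not matches:
                continue
            # A detection can group several matches; report the span they cover.
            first_line = min(m["start_line"] for m in matches)
            last_line = max(m["end_line"] for m in matches)
            rows.append(
                (
                    file_entry.get("path"),
                    detection.get("license_expression"),
                    first_line,
                    last_line,
                )
            )
    return rows


# Tiny inline sample in the assumed layout; a real run would load the
# scan output file with json.load() instead.
sample = {
    "files": [
        {
            "path": "LICENSE.txt",
            "license_detections": [
                {
                    "license_expression": "gpl-2.0 AND agpl-3.0-plus",
                    "matches": [
                        {"start_line": 1514, "end_line": 1514},
                        {"start_line": 1516, "end_line": 1516},
                    ],
                }
            ],
        }
    ]
}

for row in summarize_detections(sample):
    print(row)
```

If the output of your ScanCode version uses different keys, adjust the two `.get(...)` lookups accordingly.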

@AyanSinhaMahapatra
Member

AyanSinhaMahapatra commented Feb 8, 2023

@OctoPie23 btw, you would also want to use --license-text --license-text-diagnostics when you're checking for license detection issues.

Thanks, looks good to me. Btw @pombredanne, see these interesting cases:

        {
          "license_expression": "gpl-1.0-plus",
          "detection_log": [
            "possible-false-positive",
            "not-license-clues-as-more-detections-present"
          ],
          "matches": [
            {
              "score": 50.0,
              "start_line": 1439,
              "end_line": 1439,
              "matched_length": 1,
              "match_coverage": 100.0,
              "matcher": "2-aho",
              "license_expression": "gpl-1.0-plus",
              "rule_identifier": "gpl_bare_word_only.RULE",
              "rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/gpl_bare_word_only.RULE",
              "matched_text": "gpl."
            }
          ]
        },
        {
          "license_expression": "gpl-2.0 AND agpl-3.0-plus",
          "detection_log": [
            "possible-false-positive",
            "not-license-clues-as-more-detections-present"
          ],
          "matches": [
            {
              "score": 98.0,
              "start_line": 1514,
              "end_line": 1514,
              "matched_length": 2,
              "match_coverage": 100.0,
              "matcher": "2-aho",
              "license_expression": "gpl-2.0",
              "rule_identifier": "gpl-2.0_238.RULE",
              "rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/gpl-2.0_238.RULE",
              "matched_text": "GPL v2 ***"
            },
            {
              "score": 90.0,
              "start_line": 1516,
              "end_line": 1516,
              "matched_length": 2,
              "match_coverage": 100.0,
              "matcher": "2-aho",
              "license_expression": "agpl-3.0-plus",
              "rule_identifier": "agpl-3.0-plus_143.RULE",
              "rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/agpl-3.0-plus_143.RULE",
              "matched_text": "GPL GhostScript"
            }
          ]
        },

These are marked as possible-false-positive because they are triggered by only one or two words of GPL-like text. But since there are other concrete GPL/AGPL detections in other parts of the file, they are neither discarded as false positives nor moved to license clues; they are kept as detections, i.e. not-license-clues-as-more-detections-present. What do you think?
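To pull such cases out of a scan for review, one could filter on the `detection_log` entries. The following is a minimal sketch, assuming detections are plain dicts shaped like the excerpt above. `flagged_detections` is a hypothetical helper, not a ScanCode API, and the third sample entry is made up to show an unflagged detection being skipped.

```python
FLAG = "possible-false-positive"


def flagged_detections(detections, flag=FLAG):
    """Yield (license_expression, start_line, matched_text) for every match
    that belongs to a detection whose detection_log contains `flag`."""
    for detection in detections:
        if flag not in detection.get("detection_log", []):
            continue
        for match in detection.get("matches", []):
            yield (
                match["license_expression"],
                match["start_line"],
                match.get("matched_text", ""),
            )


detections = [
    {
        "license_expression": "gpl-1.0-plus",
        "detection_log": [FLAG, "not-license-clues-as-more-detections-present"],
        "matches": [
            {"license_expression": "gpl-1.0-plus", "start_line": 1439,
             "matched_text": "gpl."},
        ],
    },
    {
        "license_expression": "gpl-2.0 AND agpl-3.0-plus",
        "detection_log": [FLAG, "not-license-clues-as-more-detections-present"],
        "matches": [
            {"license_expression": "gpl-2.0", "start_line": 1514,
             "matched_text": "GPL v2 ***"},
            {"license_expression": "agpl-3.0-plus", "start_line": 1516,
             "matched_text": "GPL GhostScript"},
        ],
    },
    {
        # Hypothetical unflagged detection, included only to show it is skipped.
        "license_expression": "gpl-2.0",
        "detection_log": [],
        "matches": [
            {"license_expression": "gpl-2.0", "start_line": 1,
             "matched_text": "(full license text)"},
        ],
    },
]

for expression, line, text in flagged_detections(detections):
    print(f"line {line}: {expression!r} matched on {text!r}")
```

Listing the matched text next to the line number makes it quick to judge whether a one- or two-word match is a real license reference.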

@shricodev
Contributor

@AyanSinhaMahapatra Sir, one thing I noticed when running with --license-text is that when
there is a complete match of the license, the full text just clutters the output file.

for reference: https://github.com/OctoPie23/OctoPie23/blob/main/temp/gle-result-with-license-text.txt#L2005

Because GitHub does not wrap long lines, the clutter is not very visible there. In such full-match cases we could add a link to the license instead of writing the complete text in matched_text.

Something like this:
matched_text: Complete Match of <License Name> <Link>

I can open up an issue if this seems to be a good idea to you and possibly start working on it ;)
This is just a very rough idea that came into my mind. Feel free to ignore it if it makes no sense :P
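A rough sketch of the idea, done as post-processing on the JSON rather than as a change inside ScanCode. It assumes each match dict has `match_coverage`, `license_expression`, `rule_url` and `matched_text` as in the output above. The function name, the `max_chars` threshold and the sample values are all made up for illustration.

```python
def shorten_matched_text(match, max_chars=200):
    """Return the match unchanged, or a copy whose long, fully covered
    matched_text is replaced by a short pointer to the rule."""
    text = match.get("matched_text", "")
    fully_covered = match.get("match_coverage") == 100.0
    # Short matches such as "gpl." also have 100% coverage, so the length
    # check is what separates them from whole license texts.
    if not (fully_covered and len(text) > max_chars):
        return match
    shortened = dict(match)  # shallow copy, the input stays untouched
    shortened["matched_text"] = "Complete Match of {} {}".format(
        match["license_expression"], match["rule_url"]
    )
    return shortened


# Hypothetical long match standing in for a full license text.
long_match = {
    "license_expression": "gpl-2.0",
    "match_coverage": 100.0,
    "rule_url": "https://example.invalid/gpl-2.0.RULE",
    "matched_text": "GNU GENERAL PUBLIC LICENSE " * 50,
}
short_match = {
    "license_expression": "gpl-1.0-plus",
    "match_coverage": 100.0,
    "rule_url": "https://example.invalid/gpl_bare_word_only.RULE",
    "matched_text": "gpl.",
}

print(shorten_matched_text(long_match)["matched_text"])
print(shorten_matched_text(short_match)["matched_text"])
```

Working on a copy keeps the original scan data intact, so the full text is still there when it is needed for review.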

@AyanSinhaMahapatra
Member

> there is complete match of the license, it just clutters the output file.

IMHO, that is why this is a CLI option that has to be enabled, and not the default.
Sometimes we do need to look at this data to better understand the issues, even if
it clutters the output somewhat. We also have this project: aboutcode-org/scancode.io#450, which helps by providing a UI for review.

@shricodev
Contributor

@AyanSinhaMahapatra Thanks for clearing my doubt. Got it :)
