[bug] Ingesting SBOMs results in license error #2127
Comments
@jeffmendoza and I were discussing this and have identified the issue. When ingesting an SPDX SBOM, a license can be defined via either an id or a name. If the name is provided, the inline has to be attached. But in the CycloneDX docs, when the name is provided, the inline is optional: https://cyclonedx.org/docs/1.6/json/#services_items_licenses_oneOf_i0_items_license_name. If the inline isn't provided, then we can't actually know if it is a valid license. So we should probably not add a license node if the license is defined via a name and the inline isn't provided.
Hey @nathannaveen UPDATE:: for licenses that only have the |
@jeffmendoza Based on the discussion, we do not add a license node, but do we still keep them as part of the license expression strings?
I'm not sure I agree with that. My inclination is to trust what software providers say the license of a software package is unless there's evidence to the contrary. FOSS projects in particular are prone to incomplete or ambiguous expression of licenses, so we have to work with what we have. The ClearlyDefined support helps here because it allows the user to find places where the declared and detected licenses don't match (of course, CD is also an incomplete data set). There's also the question of "what is a valid license?" If I released some software under the Permissive 3000 license that I wrote (translated?), it's not in SPDX nor is it OSI-approved, but it's certainly a valid license in the sense that it does what licenses are supposed to do.
@semmet95 Based on the maintainer call: if the inline isn't provided, then we can't actually know if it is a valid license. So we should NOT add a license node if the license is defined via a name and the inline isn't provided, but we DO still keep it as part of the license expression.
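For readers following along, the rule agreed on the maintainer call could be sketched roughly like this. This is only an illustration, not GUAC's actual parser code: the `LicenseChoice` type, the `planLicenses` function, and joining the entries with " AND " are all assumptions made for the example.

```go
package main

import "strings"

// LicenseChoice is a hypothetical, simplified stand-in for one CycloneDX
// license entry. It is not a type from GUAC or from a CycloneDX library.
type LicenseChoice struct {
	ID     string // SPDX identifier, e.g. "MIT"
	Name   string // free-form name, used when no SPDX id is given
	Inline string // full license text; optional in CycloneDX
}

// planLicenses (hypothetical) returns the license expression string and the
// list of licenses that should get their own license node.
func planLicenses(choices []LicenseChoice) (string, []string) {
	var parts, nodes []string
	for _, c := range choices {
		switch {
		case c.ID != "":
			// Defined via id: keep it in the expression and add a node.
			parts = append(parts, c.ID)
			nodes = append(nodes, c.ID)
		case c.Name != "":
			// Defined via name: always keep it in the expression...
			parts = append(parts, c.Name)
			// ...but only add a node when the inline text is provided.
			if c.Inline != "" {
				nodes = append(nodes, c.Name)
			}
		}
	}
	// Assumption for this sketch: entries are combined with " AND ".
	return strings.Join(parts, " AND "), nodes
}
```

With the entries `{ID: "MIT"}` and `{Name: "Ben's Cool License"}` (no inline), the name stays in the expression but only `MIT` gets a node.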
I retract my earlier comment after discussion in today's Maintainer Meeting. I misunderstood the shape of the problem, but I'm going to leave the comment there for posterity because there are some points worth preserving. My main concern with ignoring it is if there are several packages that use "Ben's Cool License", people won't be able to search for it. But that's going to be ambiguous and edge-case-y enough to not try to solve right now.
Have you considered using https://scancode-licensedb.aboutcode.org/ as a reference dataset? This is the largest such db this side of the galactic quadrant. (And unfortunately dumbed down in ClearlyDefined scan merges when reported by ScanCode there)
@pombredanne ah very interesting! This would be great to integrate into GUAC as another data point for licenses.
@funnelfiasco re:
If this exists and is used, this is a valid license alright in my book. OSI and SPDX have nothing to do with the existence of a license; they are rather helping clarify and categorize these licenses. As for the Permissive 3000 license, this is a fine license but it sees little actual usage, so we track it only as the text of a generic, permissive license in ScanCode at https://github.com/aboutcode-org/scancode-toolkit/blame/9a340fc36b971bcc04fdf255ee73bf88ce39635a/src/licensedcode/data/rules/other-permissive_210.RULE#L12 ... most occurrences in the wild are found in ScanCode forks https://github.com/search?q="list+of+conditions+and+the+following+refusal+of+responsibility."&type=code ;) In the GUAC context, you should IMHO always track whatever is the asserted license you receive as-is. And if you are lucky, this is correct. It is going to be inaccurate and creative in some (or many) cases, as many (or most) SBOM tools do something between a poor and a bad job of reporting licensing. I'd suggest that you can enrich this after the fact with correct data from ScanCode scans (either directly, recommended) or retrieved from PurlDB or raw from ClearlyDefined.
And as for getting things from ClearlyDefined, I'd focus on the raw "harvest" as we do here: https://github.com/aboutcode-org/purldb/blob/main/clearcode/
Fixed by #2164
Describe the bug
When ingesting some SBOMs we sometimes encounter the error:
This happens with these two SBOMs:
cdx_guac.json
cdx_vuln.json
To Reproduce
Steps to reproduce the behavior:
go run ./cmd/guacgql --gql-debug
go run ./cmd/guacone collect files cdx_guac.json
Expected behavior
A clear and concise description of what you expected to happen.
Screenshots
GUAC version
GUAC version:
v0.8.2