-
Notifications
You must be signed in to change notification settings - Fork 377
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(output): use new internal data structure #1609
base: main
Are you sure you want to change the base?
Conversation
@@ -1024,12 +1024,12 @@ | |||
╭───────────────────────┬──────┬───────────┬───────────────┬─────────┬──────── ≈ | |||
│ OSV URL │ CVSS │ ECOSYSTEM │ PACKAGE │ VERSION │ SOURCE ≈ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this has revealed that our test cases are structured in a way that is not expected by the new data format - we've got at least one case where a models.PackageSource
has packages from different ecosystems, which results in those packages getting associated with a single ecosystem due to output.BuildResults
assuming all packages in a source will belong to the same ecosystem
@hogo6002 would you mind reviewing the "multiple sources with a mixed count of packages across ecosystems, and multiple vulnerabilities" case (among others) to confirm if its valid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assumed that if all packages originate from a single source file, they belong to the same ecosystem. For example, a go.mod
file can only contain Go packages. For the test here, when we mock data, and we assign the same source path to packages from different ecosystems. I think the tests are more incorrect here. (is it possible define Nuget and Packagist packages in one lockfile?)
In the extracting code, we determine the package ecosystem based on the extractor used. So I don't think we will encounter a scenario where a single lockfile contains packages from different ecosystems. But I can also easily modify the outputResult code to handle this if we think this type of case is possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@another-rex what do you think Rex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I think we should be prepared to support it unless it's extremely hard, because it allows support for arbitrary "files" - though that handling could be done here or at the parser level.
First though, my example case: maybe one day we support an arbitrary CSV or JSON file, meaning I can say to the scanner "here is a single file that has multiple packages from different ecosystems".
On the one hand, that's a single file so technically it's from the same source and thus could end up here as a single source - on the other hand, we could decide to handle that at the extractor level i.e. we return a source for each ecosystem that all point to the same file.
While we've talked enough in the past about CSV support to know that that specifically is unlikely, but I still think its an easy example of how in future maybe some kind of file will come along that we do want to support.
Related, is this not actually a situation that's already possible with SBOM? can that not include packages across multiple ecosystems?
I can also easily modify the outputResult code to handle this if we think this type of case is possible.
Again, I think if it's very easy then it's probably worth doing just so we don't have to every think about it again 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually don't know about how we get ecosystems for packages in SBOM. I can just modify the OutputResult code to add support for this case, it's just a few lines code change.
(I think we might also have a sort bug or two, but first I want to confirm if the test cases are actually valid since that'll be a huge source of difference either way) |
@@ -53,6 +53,8 @@ type PackageResult struct { | |||
HiddenVulns []VulnResult | |||
LayerDetail PackageContainerInfo | |||
VulnCount VulnCount | |||
Path string `json:"-"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the package path same as SourceReuslt.Name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah looks like it probably is - I was just trying to get things working enough that the snapshots weren't completely bogus before I started refining things.
fwiw, I do think the fields could do with some documentation to make it easier to find things - for example, whats the difference between RegularVulns
and HiddenVulns
? and why is OSPackageNames
a thing - when should I use that instead of Name
? etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add more comments to clarify.
RegularVulns
refers to the called vulnerabilities, and HiddenVulns
are those we've filtered out for users. These hidden vulnerabilities might be unimportant (for OS packages), or they could be uncalled (source packages). The OSPackageName
field exists because OS packages have both source name and package name (the binary file name). For example, the source package krb5
on Debian might result in binary package names like krb5-locales
and libkrb5-3
. For project scanning, I think the package.name is enough
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added an issue to track this.
@@ -1024,12 +1024,12 @@ | |||
╭───────────────────────┬──────┬───────────┬───────────────┬─────────┬──────── ≈ | |||
│ OSV URL │ CVSS │ ECOSYSTEM │ PACKAGE │ VERSION │ SOURCE ≈ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assumed that if all packages originate from a single source file, they belong to the same ecosystem. For example, a go.mod
file can only contain Go packages. For the test here, when we mock data, and we assign the same source path to packages from different ecosystems. I think the tests are more incorrect here. (is it possible define Nuget and Packagist packages in one lockfile?)
In the extracting code, we determine the package ecosystem based on the extractor used. So I don't think we will encounter a scenario where a single lockfile contains packages from different ecosystems. But I can also easily modify the outputResult code to handle this if we think this type of case is possible.
e5bf907
to
ab9e54d
Compare
This is an initial attempt to convert the table output to use the new
Output.Result
and co that is being used for container scanning.Overall, I think I'm actually pretty close, but I'm going to put it down for now to focus on
slog
integration