Add support for SARIF output #216

Dentrax · 2023-02-17T19:13:25Z

It'd be great to have a SARIF output format so the results can be uploaded to GitHub (i.e. with the github/codeql-action/upload-sarif action).

Blocked by #217

marcinguy · 2023-02-17T19:25:32Z

@Dentrax You can get SARIF output and a scanning GH Action using betterscan: https://github.com/marcinguy/betterscan-ce

It has OSV scanner as one of its scanners.

Dentrax · 2023-02-17T19:44:53Z

Hey @marcinguy, this is slightly outside the scope of what I'm proposing here, but I'll definitely check out that project; it seems interesting! Thanks for dropping that.

marcinguy · 2023-02-17T19:52:58Z

@Dentrax No worries. Was trying to help. Yea, different outputs from OSV scanner would be great. +1

oliverchang · 2023-02-20T00:04:47Z

Thanks for the suggestion! This is indeed something we've been thinking about. It may be blocked on #150.

Can you also speak a bit about how you intend to consume the SARIF output? Is this meant to be fed to e.g. GitHub's code scanning alerts? Are there other use cases for this?

Dentrax · 2023-02-20T07:53:12Z

> Is this meant to be fed to e.g. GitHub's code scanning alerts?

Actually, yes. This is my only use-case.

jaskaransinghdr6j · 2023-05-16T17:18:23Z

Hey guys, any update on SARIF support?

pwa-tapptic · 2023-07-03T07:18:40Z

I'm also interested in this feature. It makes integration with CI/CD tools way nicer!

oliverchang · 2023-07-03T08:28:51Z

This is already being worked on by @another-rex in #420.

This PR features:

- Refactors the format flag's internal logic so that we don't need to repeat the format types so much, and so that we can test whether we forgot anything when we add a new format entry.
- Adds a new format "sarif", which returns a SARIF report (closes #216).
- Adds a GitHub Action `action.yaml` and its specialized Dockerfile `action.dockerfile`. This Docker image runs a bash script wrapping osv-scanner, first preprocessing the input so that the last argument is split by newline, allowing the workflow user to pass in multiple directories/files they wish to scan. The script also changes exit codes 127 and 128 to 0, as they indicate errors that the user can't really do anything about.
- Adds two reusable workflows using this new GitHub Action for this repo:
  - Reusable PR workflow, for checking whether PRs introduce new vulnerabilities.
  - Reusable Scheduled workflow, for regularly checking for new vulns applying to your existing dependencies.
- Adds an experimental flag, `--experimental-diff`, which outputs only the difference between a previous run and this run of osv-scanner. This is for use in the PR workflow.
- Sorts the grouped ID output. Closes #57.

Currently the reusable workflow has to point to a specific action, which cannot be relative (otherwise it would point to the wrong action when reused in another repo). This means that right now it points to this fork/branch instead of the master branch; this will need to be updated once this PR is merged.

Example of what the workflow SARIF output looks like:

![image](https://github.com/google/osv-scanner/assets/106129829/fc7a0ac4-f3d8-4524-93ba-7b03dd0313cd)

Here is an example of the PR reusable workflow working: another-rex/scorecard-check-osv-e2e#1

That PR adds an additional vulnerability, which causes it to fail. You can see that only the new vuln shows up in the code scanning report: https://github.com/another-rex/scorecard-check-osv-e2e/security/code-scanning/1

TODO after this PR is merged:

- Change links that point to this PR branch to point to main (and/or a tagged commit of main)
- Add support for annotations
- Add documentation (this is for later, as we want to dogfood it in our own repos first before broadcasting this widely)

---------

Signed-off-by: Rex P <[email protected]>
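The PR text above describes `--experimental-diff` only as outputting the difference between a previous run and the current one. As a rough illustration of that idea (not osv-scanner's actual implementation), the sketch below treats each finding as a hypothetical `(lockfile, package, version, vuln_id)` tuple and keeps only the findings absent from the previous run. The function name and the tuple shape are assumptions made for this example.

```python
def new_findings(previous, current):
    """Return the findings in `current` that did not appear in `previous`.

    Each finding is a hypothetical (lockfile, package, version, vuln_id)
    tuple; this is an assumed shape, not osv-scanner's real data model.
    """
    seen = set(previous)
    # Keep the order of the current run so the report stays stable.
    return [finding for finding in current if finding not in seen]


# Findings from the base branch (previous run) and from the PR (current run).
base_run = [("yarn.lock", "xml2js", "0.4.23", "GHSA-776f-qx25-q3cc")]
pr_run = [
    ("yarn.lock", "xml2js", "0.4.23", "GHSA-776f-qx25-q3cc"),
    ("yarn.lock", "tough-cookie", "4.1.2", "GHSA-72xf-g2v4-qvf3"),
]

introduced = new_findings(base_run, pr_run)
print(introduced)
```

In a PR check built this way, only the `tough-cookie` finding would be reported, since the `xml2js` finding already existed before the PR.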

Ma1tobiose · 2023-08-22T05:57:16Z

I tested this and it outputs SARIF fine, but the result is not quite what I expected.
If osv-scanner finds, say, 10 issues, I expect all 10 to be listed as separate entries in results, but at the moment it looks like it writes a single result and prints a table in its message. I don't think this is the desired outcome for this requirement.

oliverchang · 2023-08-22T06:03:37Z

> I tested this and it outputs SARIF fine, but the result is not quite what I expected. If osv-scanner finds, say, 10 issues, I expect all 10 to be listed as separate entries in results, but at the moment it looks like it writes a single result and prints a table in its message. I don't think this is the desired outcome for this requirement.

Thanks for the feedback @Ma1tobiose. Can you please give a sample output that you'd like to see?

Also curious if there's feedback from any others in this issue. e.g. @pwa-tapptic , @jaskaransinghdr6j @Dentrax

pwa-tapptic · 2023-08-22T06:54:00Z

I tested it on my side and I have the same remark as @Ma1tobiose.
https://github.com/anchore/grype creates SCA results in SARIF format quite nicely (syft . -o json | grype -q -o sarif).
As @Ma1tobiose said, OSV compresses everything into a single finding in SARIF, but the findings should rather be separate objects in the results array:

{
  "version": "2.1.0",
  "$schema": "https://json.schemastore.org/sarif-2.1.0-rtm.5.json",
  "runs": [
    {
      "tool": { ...
      },
      "results": [
        {
          "ruleId": "GHSA-72xf-g2v4-qvf3-tough-cookie",
          "message": {
            "text": "The path /yarn.lock reports tough-cookie at version 4.1.2  which would result in a vulnerable (npm) package installed"
          },
          "locations": [
            {
              "physicalLocation": {
                "artifactLocation": {
                  "uri": "/yarn.lock"
                },
                "region": {
                  "startLine": 1,
                  "startColumn": 1,
                  "endLine": 1,
                  "endColumn": 1
                }
              }
            }
          ]
        },
        {
          "ruleId": "GHSA-776f-qx25-q3cc-xml2js",
          "message": {
            "text": "The path /yarn.lock reports xml2js at version 0.4.23  which would result in a vulnerable (npm) package installed"
          },
          "locations": [
            {
              "physicalLocation": {
                "artifactLocation": {
                  "uri": "/yarn.lock"
                },
                "region": {
                  "startLine": 1,
                  "startColumn": 1,
                  "endLine": 1,
                  "endColumn": 1
                }
              }
            }
          ]
        },
...

just to give you an example.
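To make the requested shape concrete, here is a minimal Python sketch that emits one SARIF result per finding instead of a single aggregated result. The input list, its keys (`id`, `package`, `version`, `lockfile`), the function name, and the tool name `example-scanner` are all assumptions made for illustration; they do not reflect the real output schema of osv-scanner or grype.

```python
import json


def findings_to_sarif(findings):
    """Build a minimal SARIF 2.1.0 log with one result per finding.

    `findings` is a hypothetical list of dicts with the keys "id",
    "package", "version" and "lockfile" (an assumed shape for this sketch).
    """
    results = []
    for finding in findings:
        text = (
            f"{finding['lockfile']} reports {finding['package']} at version "
            f"{finding['version']}, which is affected by {finding['id']}"
        )
        results.append({
            "ruleId": finding["id"],
            "message": {"text": text},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": finding["lockfile"]}
                }
            }],
        })
    return {
        "version": "2.1.0",
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "runs": [{
            # Placeholder tool name, not a real scanner.
            "tool": {"driver": {"name": "example-scanner"}},
            "results": results,
        }],
    }


findings = [
    {"id": "GHSA-72xf-g2v4-qvf3", "package": "tough-cookie",
     "version": "4.1.2", "lockfile": "yarn.lock"},
    {"id": "GHSA-776f-qx25-q3cc", "package": "xml2js",
     "version": "0.4.23", "lockfile": "yarn.lock"},
]
sarif = findings_to_sarif(findings)
print(json.dumps(sarif, indent=2))
```

With separate results like these, an aggregator can treat every vulnerability as its own item, which is the behaviour being asked for here.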

oliverchang · 2023-08-22T07:00:32Z

Thanks!

Our motivation behind a single aggregated report is to better connect this to #352 via a single remediation command included in the result (e.g. "to fix these interactively, run osv-scanner fix ..."), where we provide a tool that can help with interactively (or automatically) addressing multiple vulnerabilities at once in the same lockfile/manifest file.

Could we understand a bit better how having separate results helps your use cases? Does this provide better UX for you to look at vulnerability results individually as opposed to a single one?

Ma1tobiose · 2023-08-22T07:20:32Z

Thanks! @oliverchang
I can understand that osv stores the data in an aggregated form in order to be able to fix everything in one click.
But when the output is SARIF or JSON, what I want on my side is to plug osv into a CI/CD process as one of several tools that emit a standardized report, so that its results are easy to display in aggregate with the results of other tools. So I would need a report in a format similar to the one provided by @pwa-tapptic (with separate results).

pwa-tapptic · 2023-08-22T08:09:23Z

@oliverchang a one-click fix is an awesome feature and a great motivation for a tool such as OSV; it is certainly valuable from a CLI user's perspective.
But SARIF (i.e. Static Analysis Results Interchange Format) has slightly different motivations behind it (details here), as the name of the format itself suggests. CI/CD tools are among the potential beneficiaries of a standardized interchange format.

oliverchang · 2023-08-22T08:57:39Z

@another-rex let's investigate if there's a good middle ground here.

jaskaransinghdr6j · 2023-08-23T18:26:23Z

+1 SARIF should ideally represent each "finding" as its own line item. Being an interchange format, it would be right to use it as a 1 CVE per item template.

another-rex · 2023-09-11T06:03:44Z

Here is the current format I am thinking of merging, would love any feedback (fyi: @pwa-tapptic @jaskaransinghdr6j @Ma1tobiose)

{
  "version": "2.1.0",
  "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
  "runs": [
    {
      "tool": {
        "driver": {
          "informationUri": "https://github.com/google/osv-scanner",
          "name": "osv-scanner",
          "rules": [
            {
              "id": "RUSTSEC-2023-0045",
              "shortDescription": {
                "text": "OSV.Summary"
              },
              "fullDescription": {
                "text": "OSV.Details",
                "markdown": "OSV.Details"
              },
              "help": {
                "text": "...",
                "markdown": "Markdown table of occurrences in the repo (i.e. If this vuln affects multiple packages/lockfiles)"
              }
            }
          ]
        }
      },
      "artifacts": [
        {
          "location": {
            "uri": "path/to/Cargo.lock"
          },
          "length": -1
        }
      ],
      "results": [
        {
          "ruleId": "RUSTSEC-2023-0045",
          "ruleIndex": 0,
          "level": "warning",
          "message": {
            "text": "Package '[email protected]' is vulnerable to 'RUSTSEC-2023-0045', please upgrade to versions '0.6.2' to fix this vulnerability"
          },
          "locations": [
            {
              "physicalLocation": {
                "artifactLocation": {
                  "uri": "path/to/Cargo.lock"
                }
              }
            }
          ]
        }
        // ... One result per vulnerable package per vulnerability
      ]
    }
  ]
}

> +1 SARIF should ideally represent each "finding" as its own line item. Being an interchange format, it would be right to use it as a 1 CVE per item template.

This is complicated by the fact that not every OSV entry has a corresponding CVE (e.g. RUSTSEC-2023-0045), and we can have multiple advisories from different sources referring to the exact same vulnerability.

This is common for vulnerabilities published as GitHub advisories, where for some languages there will also be security advisories published by the language maintainers.

One solution we are considering is to fill "ruleId" with all the aliased vulnerability IDs joined together (e.g. id: GHSA-wfg4-322g-9vqv,RUSTSEC-2023-0045). SARIF provides a deprecatedIds property (section 3.49.4 of the spec), which is an array of unique IDs, so we can split those IDs out into separate array entries.

Tell us if this will work for your workflows!
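As a rough sketch of the proposal above, the Python snippet below builds a rule entry whose `id` is the joined list of aliases and whose `deprecatedIds` holds each alias separately. The function name and the alphabetical ordering are assumptions made for this example; the thread does not say how the IDs would be ordered.

```python
def rule_for_alias_group(ids):
    """Return a minimal SARIF rule entry for one group of aliased IDs.

    The joined "id" and the use of "deprecatedIds" follow the proposal in
    this comment; sorting the IDs alphabetically is an assumption of this
    sketch, made only so the output is deterministic.
    """
    unique_ids = sorted(set(ids))
    return {
        "id": ",".join(unique_ids),
        "deprecatedIds": unique_ids,
    }


rule = rule_for_alias_group(["RUSTSEC-2023-0045", "GHSA-wfg4-322g-9vqv"])
print(rule["id"])
```

A consumer that only understands single IDs could then read `deprecatedIds` instead of parsing the comma-joined `id`.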

jaskaransinghdr6j · 2023-09-11T17:15:57Z

This is great! However, since many findings aggregators will use these findings to autofill fields like "Finding Title", I feel the joined ruleId might be a bit too onerous.

There seems to be no straightforward way to deal with this according to the SARIF spec. What if we default the ruleId to "CVE-XXXX" if available, and fall back to the package-specific ID "RUSTSEC-YYYY" if the former is not available? (or vice versa)

The other approach could be to list all the aliases in the message text field as: "Also reported as RUSTSEC-YYYY and GHSA-ZZZZ"

Ma1tobiose · 2023-09-12T01:32:58Z

The formatting is great, and I agree with @jaskaransinghdr6j's solution (if it's feasible); being able to easily look up the CVE and its reference sources would make it much quicker to find vulnerability details and fixes.

Fixes #216 with a new format that separates out individual vulnerabilities. Each vulnerability is now its own rule violation. The aliased vulnerabilities are grouped together as one rule violation, with an ID picked in this priority order (CVE -> [Eco Specific] -> GHSA).
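The priority order named above (CVE, then ecosystem-specific, then GHSA) can be sketched in Python as follows. This is an illustration under stated assumptions, not the merged Go implementation: "ecosystem-specific" is approximated as any ID that starts with neither `CVE-` nor `GHSA-`, ties are broken alphabetically, and `CVE-0000-0000` is a made-up placeholder ID.

```python
def pick_rule_id(aliases):
    """Pick one ID to represent a group of aliased vulnerability IDs.

    Priority: CVE first, then ecosystem-specific IDs (assumed here to be
    anything that is neither CVE nor GHSA), then GHSA. Alphabetical
    tie-breaking is an assumption of this sketch.
    """
    if not aliases:
        raise ValueError("aliases must not be empty")

    def rank(vuln_id):
        if vuln_id.startswith("CVE-"):
            return 0
        if vuln_id.startswith("GHSA-"):
            return 2
        return 1

    # Sort key: priority rank first, then the ID itself for determinism.
    return min(aliases, key=lambda vuln_id: (rank(vuln_id), vuln_id))


without_cve = pick_rule_id(["GHSA-wfg4-322g-9vqv", "RUSTSEC-2023-0045"])
# "CVE-0000-0000" is a placeholder, not a real CVE for this advisory.
with_cve = pick_rule_id(
    ["GHSA-wfg4-322g-9vqv", "CVE-0000-0000", "RUSTSEC-2023-0045"]
)
print(without_cve, with_cve)
```

The remaining aliases could still be surfaced elsewhere, for example in the message text, as suggested earlier in the thread.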

another-rex · 2023-09-22T01:04:09Z

> There seems to be no straightforward way to deal with this according to the SARIF spec. What if we default the ruleId to "CVE-XXXX" if available, and fallback to package specific id "RUSTSEC-YYYY" if the former is not available? (or vice-versa)

I implemented this in the SARIF PR and it's now been merged, so please try it out in latest main branch of osv-scanner (this change is not in 1.4.0) and see if this fits your use cases!

Dentrax mentioned this issue Feb 17, 2023

[Tracking Issue] Enrich Output Report Formats #219

Closed

9 tasks

oliverchang added the enhancement label (New feature or request) Feb 19, 2023

This was referenced Jul 4, 2023

Github action #420

Closed

Recreated Github Action PR #432

Merged

another-rex closed this as completed in #432 Jul 31, 2023

cbhernan mentioned this issue Aug 8, 2023

Find a Security Scanning Tool that can output a SARIF file that can be ingested into Github's CodeQL CityOfLosAngeles/angeleno-my-account#14

Closed

oliverchang reopened this Aug 22, 2023

another-rex mentioned this issue Sep 14, 2023

Update SARIF format #534

Merged

another-rex closed this as completed in #534 Sep 20, 2023
