Agent does not understand the hierarchy of vulnerability impacts #28

IlonaShishov · 2024-12-09T20:31:04Z

When evaluating a vulnerability in the code, such as a vulnerable method, the agent fails to recognize the absence of the method as a decisive factor. Instead of concluding that the vulnerability does not exist, it continues to check other aspects of the vulnerability unnecessarily.

CVE: CVE-2024-1485
Component: openshift4/ose-console
openshift4_ose-console_v4.15.0-202410022035.p0.gf8ac02d.assembly.stream.el8.json

zvigrinberg · 2024-12-11T10:28:31Z

Statement

I have another use case of this issue, in which the agent loop concludes that the package is not used in the code base, but nevertheless continues with irrelevant checks in the code (checks that implicitly assume the package is used, which is not correct in this case). As a consequence, it gives back a wrong answer (expected: not vulnerable, actual answer: vulnerable).

Use case details

CVE: https://access.redhat.com/security/cve/CVE-2024-44337
Component: nogomod-rhosdt_tempo-gateway-rhel8
Full Output example

zvigrinberg · 2025-01-08T09:35:46Z

After a short investigation, it turns out that this is only reproduced if the NVD intelligence source is absent from the RAG prompt (that is, the NVD service is unavailable at the time of analysis and returns a 500/503/404 HTTP status code).
After I cached the NVD responses for the above two CVEs in NGINX in advance, the problem disappeared.
But we also have the GHSA and RHSA intelligence sources, which together have all the data from NVD and beyond.
The question is, in such a case, why does the agent morpheus analysis service rely so heavily on NVD, when the latter is a very busy, faulty and problematic service? Wouldn't it be better to build the RAG prompt (for generating the checklist) around GHSA, which is a much more robust and stable service than NVD?

shawn-davis · 2025-01-09T16:45:23Z

Is this the behavior you observed with CVE-2024-44337? If so, I'll dive into the collected intel for this CVE and see if I can track down what's happening when NVD fails instead of returning intel. The intended behavior is that if NVD fails, the pipeline still uses the other intel sources, so ideally we wouldn't be seeing this much reliance on NVD.
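A minimal sketch of that intended fallback, for illustration only. The function names, the `Fetcher` type and the source keys are hypothetical and are not taken from the actual pipeline; the point is just that a failing source is recorded as missing rather than aborting intel collection:

```python
from typing import Callable, Dict, Optional

# Hypothetical type: a fetcher takes a CVE ID and returns intel text,
# or raises an exception when the upstream service fails.
Fetcher = Callable[[str], str]


def collect_intel(cve_id: str, sources: Dict[str, Fetcher]) -> Dict[str, Optional[str]]:
    """Query every intel source; a failing source is stored as None."""
    intel: Dict[str, Optional[str]] = {}
    for name, fetch in sources.items():
        try:
            intel[name] = fetch(cve_id)
        except Exception:
            # e.g. NVD answering with 500/503/404
            intel[name] = None
    return intel


def build_prompt_context(intel: Dict[str, Optional[str]]) -> str:
    """Build the checklist-prompt context from whichever sources responded."""
    available = {name: text for name, text in intel.items() if text}
    if not available:
        raise RuntimeError("no intel source returned data")
    return "\n\n".join(f"[{name}]\n{text}" for name, text in available.items())
```

With this shape, an NVD outage only removes the NVD section from the context, and the GHSA and RHSA sections are still passed on to checklist generation.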

zvigrinberg · 2025-01-15T08:19:09Z

@shawn-davis
In fact, when NVD fails, the generated checklist seems to be different, which also affects the reasoning. At least as per the observed answers, the checklist items seem to be independent of each other, and even the final result might change from correct to wrong, as the two examples above show.

ashsong-nv · 2025-01-24T20:26:55Z

Hi Zvi, thanks for the additional info. It looks like there are 2 different concerns discussed in this issue. Here are the next steps we had in mind for each:

Different checklist generation with and without NVD intel: Shawn will work on investigating the behavior with/without the NVD intel, ETA next week
Agent summarization concludes vulnerable when vulnerable function is not used: we think this may be addressed by either 1) prompt tuning the summary model or 2) switching from parallel checklist execution to sequential, with hierarchy determined by the sequence. We think option 2 may be a more promising fix, but this is pending a bigger architectural change that is in the planning stages. We'll share more updates on this as things progress.
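To illustrate option 2, here is a minimal sketch of sequential checklist execution with early termination. It is not the planned design: `CheckResult`, `run_sequential` and the check names in the usage note are hypothetical, and the sketch assumes every checklist item can be expressed as a precondition for exploitability.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class CheckResult:
    passed: bool  # True if this precondition for exploitability holds
    reason: str


# Hypothetical: each check inspects the code base and reports whether one
# precondition for the vulnerability holds.
Check = Callable[[], CheckResult]


def run_sequential(checks: List[Tuple[str, Check]]) -> Tuple[bool, List[str]]:
    """Run checks in order of decisiveness and stop at the first failure.

    Returns (vulnerable, trace), where trace lists the checks that ran.
    """
    if not checks:
        raise ValueError("at least one check is required")
    trace: List[str] = []
    for name, check in checks:
        result = check()
        trace.append(f"{name}: {result.reason}")
        if not result.passed:
            # A failed precondition is decisive, so later checks are skipped.
            return False, trace
    return True, trace
```

Ordering the list as, say, `package_used` before `function_called` before `input_reachable` encodes the hierarchy: if the package is not used, the later items never run and cannot pull the summary toward "vulnerable".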

zvigrinberg · 2025-01-26T10:30:25Z

@ashsong-nv
Hi Ashley, thanks for your response.

IMO, option 1 alone is enough, because the issue of "Agent summarization concludes vulnerable when vulnerable function is not used" does not happen when all of the needed intelligence data context is present.

ashsong-nv · 2025-02-14T20:54:40Z

Hi @zvigrinberg! You are already aware of this as the author of the PR, but just adding a note here for broader visibility that #73 was recently merged to address the root cause of the missing NVD intel. This should reduce the incidence of missing NVD intel causing accuracy issues.

However, we still need to investigate the behavior when the intel is missing, so I'm leaving the issue open for now.

Could you please comment on whether you are still frequently running into this issue?

zvigrinberg · 2025-02-16T06:45:41Z

Hi @ashsong-nv,
We're not running into this issue anymore (now that the NVD issue is out of the way).
In retrospect, this one was caused only by the issue with fetching from NVD.

ashsong-nv added the external label Jan 23, 2025

ashsong-nv assigned shawn-davis Jan 24, 2025

ruromero pushed a commit to ruromero/vulnerability-analysis that referenced this issue Feb 14, 2025

Merge pull request NVIDIA-AI-Blueprints#28 from zvigrinberg/nvd/fix-p…

dc036be

…ropagating-apiKey fix: fix propagating nvd api key in nginx proxy

Salonijain27 added the P1 label Feb 14, 2025

ashsong-nv added the accuracy label Feb 14, 2025
