Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[spdx] Fix several bugs #1377

Merged
merged 12 commits into from
Dec 6, 2024
Merged

Conversation

Thomas1664
Copy link
Contributor

@Thomas1664 Thomas1664 commented Apr 2, 2024

Fixed bugs:

  • Parsing vcpkg_from_git didn't work because it used the results of parsing vcpkg_from_github
  • vcpkg_from_bitbucket and vcpkg_from_gitlab weren't parsed
  • Indexing of SPDXRef-resource started at 1
  • Only the first occurence of one of vcpkg_from_* / vcpkg_download_distfile was parsed
    I.e., if there were 2 occurences of vcpkg_from_github, only the first one was included in the SBOM file.

Changes in behavior:

  • find_cmake_invocation no longer uses case insensitive comparison

@Thomas1664 Thomas1664 changed the title [spdx] Fix copy-paste error [spdx] Fix several bugs Apr 3, 2024
Copy link
Contributor

@ras0219-msft ras0219-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I am really happy with the improvements in this PR! However, I think there's a few aspects of the implementation that need to be addressed.
 
This PR adds additional SPDX features but no SPDX tests. I'd like to see tests covering vcpkg::run_resource_heuristics for each of the new helpers. Adding tests to cover the existing vcpkg_from_github would be appreciated but not necessary :)

Additionally, can you please include a sample of SPDX documents before and after your change for a handful of ports, demonstrating that all changes were intentional and positive?

Indexing of SPDXRef-resource started at 1

This is not a bug, unless you can reference an SPDX specification that indicates it should start at 0?

src/vcpkg-test/cmake.cpp Outdated Show resolved Hide resolved
src/vcpkg-test/cmake.cpp Outdated Show resolved Hide resolved
src/vcpkg-test/cmake.cpp Outdated Show resolved Hide resolved
{
auto res = extract_cmake_invocation_argument("lorem \"ipsum\"", "lorem");
REQUIRE(res == "ipsum");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add additional testing that covers:

  1. More quote situations (lorem \"ipsum\\\"\", lorem \"ipsum spaces\")
  2. Multiple invocations (alligator bee cactus dog elephant searching for cactus of dog)
  3. Case (in)sensitivity

src/vcpkg/cmake.cpp Outdated Show resolved Hide resolved

namespace vcpkg
{
StringView find_cmake_invocation(StringView content, StringView command);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that these functions are very much heuristics and only usable with caveats, I don't think they should be introduced as an independent "cmake" module. Instead, I think they should remain in spdx.cpp and exposed as needed for testing purposes.

Ideally, we will get rid of these heuristics entirely and instead introduce some sort of "side channel" that allows the portfile to record accesses -- such as by writing out a json document inside vcpkg_from_github() with the actual information.

If we do introduce another area that needs CMake parsing heuristics before we're able to delete them from the SPDX path, we can move them out into separate files at that point.

REQUIRE(res.empty());
}
{
auto res = find_cmake_invocation("lorem_ipsum( )", "lorem_ipsum");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should test additional edge cases:

  1. lorem_ipsum_
  2. lorem_ipsum (
  3. lorem_ipsum (EOF)

@BillyONeal BillyONeal marked this pull request as ready for review November 16, 2024 03:29
if (it == contents.end()) return {};
it += command.size();
if (it == contents.end()) return {};
if (ParserBase::is_word_char(*it)) return {};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ras0219-msft I really don't understand the significance of these 'word char' checks here. If a file has vcpkg_from_github .... vcpkg_from_git, then we should skip over the first one and go to the real one, which I added a test for and fixed. Please double check that I did not misunderstand these' purpose.

@BillyONeal BillyONeal merged commit 4323b6a into microsoft:main Dec 6, 2024
6 checks passed
@BillyONeal
Copy link
Member

@Thomas1664 Thanks for the fixes!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants