Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update package assembly in conda installations #4089

Merged
merged 4 commits into from
Jan 15, 2025

Conversation

AyanSinhaMahapatra
Copy link
Member

@AyanSinhaMahapatra AyanSinhaMahapatra commented Jan 13, 2025

Parse conda metadata JSON manifests and use the package data and files information present to improve conda package assembly.

Reference: #4083

Tasks

  • Reviewed contribution guidelines
  • PR is descriptively titled 📑 and links the original issue above 🔗
  • Tests pass -- look for a green checkbox ✔️ a few minutes after opening your PR
    Run tests locally to check for errors.
  • Commits are in uniquely-named feature branch and has no merge conflicts 📁
  • Updated documentation pages (if applicable)
  • Updated CHANGELOG.rst (if applicable)

Parse conda metadata JSON manifests and use the package data
and files information present to improve conda package assembly.

Reference: #4083
Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
@JonoYang
Copy link
Member

@AyanSinhaMahapatra You need to regenerate some package test expectations, but otherwise lgtm!

Copy link
Member

@pombredanne pombredanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments inline. You may also consider refactoring a bit to streamline your class hierarchy?

azure-pipelines.yml Show resolved Hide resolved

return

# For a conda metadata JSON, try to find the corresponding meta.yaml and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity and readability, it may make sense to make the if CondaMetaYamlHandler.is_datafile(resource.location): "else" condition explicit, or to add a new if CondaMetaJsonHandler.is_datafile(resource.location) .... also I wonder if the code is not too long and too nested? Why not move the bulk of the code to each subclasses? CondaMetaYamlHandler and CondaMetaYamlHandler? It may make the code more readable and require fewer tests

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can move the code to each subclasses, as they are needed together to consider all the possible scenarios effectively (both the manifests being present or not present, and for merging package/files data from both manifests). This is effectively also the case in many other major ecosystems where we have to merge/create packages from multiple manifests, using a Basehandler class with a base assembly:

  1. npm: https://github.com/aboutcode-org/scancode-toolkit/blob/develop/src/packagedcode/npm.py#L82
  2. pypi: https://github.com/aboutcode-org/scancode-toolkit/blob/develop/src/packagedcode/pypi.py#L159
  3. maven: https://github.com/aboutcode-org/scancode-toolkit/blob/develop/src/packagedcode/maven.py#L62

This becomes a bit more complex when workspaces are involved or like in this case where the manifests are located in entirely different places and assigning package files/getting to manifests is based on more complex logic of file paths.

It is possible to break the code down further into functions which stay in the same base class though, let me do that.

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
@AyanSinhaMahapatra AyanSinhaMahapatra merged commit abf65c7 into develop Jan 15, 2025
40 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants