-
-
Notifications
You must be signed in to change notification settings - Fork 573
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update package assembly in conda installations #4089
Conversation
Parse conda metadata JSON manifests and use the package data and files information present to improve conda package assembly. Reference: #4083 Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
@AyanSinhaMahapatra You need to regenerate some package test expectations, but otherwise lgtm! |
Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments inline. You may also consider refactoring a bit to streamline your class hierarchy?
src/packagedcode/conda.py
Outdated
|
||
return | ||
|
||
# For a conda metadata JSON, try to find the corresponding meta.yaml and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For clarity and readability, it may make sense to make the if CondaMetaYamlHandler.is_datafile(resource.location):
"else" condition explicit, or to add a new if CondaMetaJsonHandler.is_datafile(resource.location)
.... also I wonder if the code is not too long and too nested? Why not move the bulk of the code to each subclasses? CondaMetaYamlHandler and CondaMetaYamlHandler? It may make the code more readable and require fewer tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can move the code to each subclasses, as they are needed together to consider all the possible scenarios effectively (both the manifests being present or not present, and for merging package/files data from both manifests). This is effectively also the case in many other major ecosystems where we have to merge/create packages from multiple manifests, using a Basehandler class with a base assembly:
- npm: https://github.com/aboutcode-org/scancode-toolkit/blob/develop/src/packagedcode/npm.py#L82
- pypi: https://github.com/aboutcode-org/scancode-toolkit/blob/develop/src/packagedcode/pypi.py#L159
- maven: https://github.com/aboutcode-org/scancode-toolkit/blob/develop/src/packagedcode/maven.py#L62
This becomes a bit more complex when workspaces are involved or like in this case where the manifests are located in entirely different places and assigning package files/getting to manifests is based on more complex logic of file paths.
It is possible to break the code down further into functions which stay in the same base class though, let me do that.
Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Parse conda metadata JSON manifests and use the package data and files information present to improve conda package assembly.
Reference: #4083
Tasks
Run tests locally to check for errors.