-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: introduce opossum files to CLI #173
Conversation
* create interface with basic test * create empty function shell for conversion
b9c9711
to
119dabd
Compare
* Currently only contains the things where we do have structure for already in our Opossum model * This is not fully compliant to the full opossum model yet --> will be enhanced
SideNote: This required switching Metadata to a BaseModel (instead of using the annotation) in order to successfully use the extra="allow" option which does not work in the annotation version
* Update the OpossumInformation datatype accordingly
* Cleanup tests * Introduce a few core constants
22ec26a
to
e1a0e65
Compare
* Use type annotations more consistently * Mark internal functions as private * Remove unnecessary cast
* Shorten the assertion function * Use it also for SPDX files
@@ -27,6 +29,12 @@ def opossum_file() -> None: | |||
multiple=True, | |||
type=click.Path(exists=True), | |||
) | |||
@click.option( | |||
"--opossum", | |||
help="opossum files used as input.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since you're using plural, it sounds like this is a list of files but in reality you mean that each flag of --opossum
has exactly one argument. so to improve clarity, i suggest you say something like this: "Specify a path to a .opossum file that you would like to include in the final output. Option can be repeated."
also note, that i'm including "path" in the help message to make it clear that this must be a valid file path.
@@ -62,5 +63,26 @@ def generate(spdx: list[str], outfile: str) -> None: | |||
write_opossum_information_to_file(opossum_information, Path(outfile)) | |||
|
|||
|
|||
def validate_input_exit_on_error(spdx: list[str], opossum: list[str]) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe clearer: "validate_input_and_exit_on_error". at first i read this as "validate input-exit on error".
return element | ||
z.writestr( | ||
INPUT_JSON_NAME, | ||
TypeAdapter(OpossumInformation).dump_json( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice improvement 👍
# hack to override not serializing keys with corresponding value none: | ||
# In this case this is valid and should be part of the serialization |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
above, you're using exclude_none=True
so is it really a surprise that none values are not serialized?
name: str | ||
documentConfidence: int | None = 0 | ||
documentConfidence: int | float | None = 0 | ||
additionalName: str | None = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not the first time, but i'm surprised to see camel case here. in python it's usually standard to use snake case. is there some special need for camel here?
if it's because of the serializing to JSON, pydantic has a configuration option to convert camel to snake during serialization if i remember correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- True about the standard: https://peps.python.org/pep-0008/#method-names-and-instance-variables
- Added this also to Make all opossum file models inherit from BaseModel #178 (comment) where anyhow the model is reworked
ResourcePath = str | ||
type OpossumPackageIdentifier = str | ||
type ResourcePath = str | ||
type ResourceInFile = dict[str, ResourceInFile] | int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in which file? perhaps better: "OpossumFileResource"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about:
- The "Resource" class anyhow does not describe the file model --> Remove it from the file
- Then just rename it to Resource to match the model (and maybe rename to old Resource for easier distinguishing)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A good time to change this is probably when tackling #178
I think we could move the current Resource
type fully to the spdx
section because:
- the opossum code now uses the (to be renamed)
ResourceInFile
- the scancode code uses its own tree data structure and used to convert to
Resource
but this is historic and could be changed very easily (in fact I'll just do it now)
We then could rename ResourceInFile
just to Resources
or OpossumResources
to match the top-level key and perhaps make it a full pydantic.Model
with some convenience functions for construction. That has the advantage that there is a single point for the logic of how these resources
are structured and changing it (e.g. because #38) would be easy. The small downside is that we need to hook into the serialization of pydantic which shouldn't be hard.
input_json = json.load(input_json_file) | ||
return TypeAdapter(OpossumInformation).validate_python(input_json) | ||
except Exception as e: | ||
# handle the exception |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what value does this comment add?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None ... probably a reminder from pressing autocomplete and not taking care appropriately
if OUTPUT_JSON_NAME in input_zip_file.namelist(): | ||
logging.error( | ||
f"Opossum file {input_zip_file.filename} also contains" | ||
f" '{OUTPUT_JSON_NAME}' which cannot be processed" | ||
) | ||
sys.exit(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we care? can we just ignore the output file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still would argue that this would surprise the user.
There is a follow up issue to add a force modifier which implements that behavior
#177
Summary of changes
Read .opossum files with the library.
Currently this is mainly a pass-through that should result in the same file again on the output
See #49
Context and reason for change
While the pass-through is at the moment mainly a validation step, this is an important precondition to be able to use the planned merge functionality later
How can the changes be tested
Get any opossum file with only an input.json containd. Process it and observe no changes in the output. For a simple file this should also be part of the ci