-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ingest from biostudies version 1 #118
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File/directory structure comments:
- I would have expected this to be a copy of the existing biostudies.py file in the existing bia-ingest package, but i see a commented out block - is that the only change?
- Presumably once this is the only ingest path we will remove bia-ingest and rename this bia-ingest (i.e. drop the -shared-models because it's redundant)
- I was expecting these files to be in a src/ folder, rather than bia_ingest_sm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File/directory structure comments:
- I would have expected this to be a copy of the existing biostudies.py file in the existing bia-ingest package, but i see a commented out block - is that the only change?
No - there are a few other changes, but they are very similar otherwise
- Presumably once this is the only ingest path we will remove bia-ingest and rename this bia-ingest (i.e. drop the -shared-models because it's redundant)
Yes
- I was expecting these files to be in a src/ folder, rather than bia_ingest_sm.
Happy to restructure - can do this with [2] above
app = typer.Typer() | ||
|
||
|
||
@app.command(help="Ingest from biostudies and echo json of bia_data_model.Study") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the intent of this tool that it will eventually perform the api calls in order to ingest a study, or just create the json required for the submissiom?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that it will perform the API calls. Current version is quick and dirty to allow us to create json files for test exporter and new website
} | ||
study_uuid = dict_to_uuid(study_dict, ["accession_id",]) | ||
study_dict["uuid"] = study_uuid | ||
study = bia_data_model.Study.model_validate(study_dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still not used to model_validate as a way of instanciating pydantic classes but i guess it's the standard thing to be doing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what the standard recommendations are. I use model_validate because it was in a tutorial I was following for pydantic 2.0 models. It is also used in some places in the pydantic documentation. However, I do not know if it is supposed to be the standard way.
"""Return enum version of licence of study | ||
|
||
""" | ||
licence = re.sub(r"\s", "_", study_attributes.get("License", "CC0")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the enum have spaces in it's value? that might just be easier?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, their website also has spaces in the name. Should I change this in semantic_models.py?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think changing it makes sense - would we still want to do the regex to verify that the right kind of spaces are used?
First iteration of ingest using semantic_models and bia_data_models. We can now ingest studies into bia_data_model.Study. However, we do not get details of experimental imaging components or annotation components yet.