disallowed extra fields in models, and updated ingest and export code to handle these #127
Merged
+63
−76
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,11 +1,11 @@ | ||
from __future__ import annotations | ||
from pydantic import Field | ||
from typing import List, Optional | ||
from bia_shared_datamodels import bia_data_model, semantic_models | ||
from bia_shared_datamodels import bia_data_model | ||
|
||
|
||
class Study(semantic_models.Study, bia_data_model.DocumentMixin): | ||
class Study(bia_data_model.Study): | ||
experimental_imaging_component: Optional[List[ExperimentalImagingDataset]] = Field(default_factory=list, description="""A dataset of that is associated with the study.""") | ||
|
||
class ExperimentalImagingDataset(semantic_models.ExperimentalImagingDataset, bia_data_model.DocumentMixin): | ||
class ExperimentalImagingDataset(bia_data_model.ExperimentalImagingDataset): | ||
pass |
16 changes: 2 additions & 14 deletions
...t_data/experimental_imaging_datasets/S-BIADTEST/47a4ab60-c76d-4424-bfaa-c2a024de720c.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,26 +1,14 @@ | ||
{ | ||
"title_id": "Study Component 1", | ||
"uuid": "47a4ab60-c76d-4424-bfaa-c2a024de720c", | ||
"file_reference_count": 4, | ||
"description": "Description of study component 1", | ||
"acquisition_process": [ | ||
"c2e44a1b-a43c-476e-8ddf-8587f4c955b3" | ||
], | ||
"specimen_imaging_preparation_protocol": [ | ||
"7199d730-29f1-4ad8-b599-e9089cbb2d7b" | ||
], | ||
"biological_entity": [ | ||
"64a67727-4e7c-469a-91c4-6219ae072e99", | ||
"6950718c-4917-47a1-a807-11b874e80a23" | ||
], | ||
"specimen_growth_protocol": [], | ||
"analysis_method": [ | ||
{ | ||
"protocol_description": "Test image analysis", | ||
"features_analysed": "Test image analysis overview" | ||
} | ||
], | ||
"submitted_in_study_uuid": "a2fdbd58-ee11-4cd9-bc6a-f3d3da7fff71", | ||
"correlation_method": [], | ||
"example_image_uri": [], | ||
"image_count": 0 | ||
"example_image_uri": [] | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -70,6 +70,5 @@ | |
} | ||
], | ||
"funding_statement": "This work was funded by the EBI", | ||
"annotation_component": [], | ||
"attribute": {} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -150,4 +150,10 @@ def persist(object_list: List, object_path: str, sumbission_accno: str): | |
for object in object_list: | ||
output_path = output_dir / f"{object.uuid}.json" | ||
output_path.write_text(object.model_dump_json(indent=2)) | ||
logger.info(f"Written {output_path}") | ||
logger.info(f"Written {output_path}") | ||
|
||
|
||
Review comment: Nice workaround for the filter! |
||
def filter_model_dictionary(dictionary: dict, target_model: Type[BaseModel]): | ||
accepted_fields = target_model.model_fields.keys() | ||
result_dict = {key: dictionary[key] for key in accepted_fields} | ||
return result_dict |
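A minimal sketch of why the filter is needed, assuming pydantic v2 and a model configured with `extra="forbid"` (the PR title says extra fields are now disallowed, but the exact configuration in `bia_data_model` is not shown here). `ExampleProtocol` and `filter_model_dictionary_lenient` are hypothetical names used only for illustration. Note that the comprehension in the diff above indexes `dictionary[key]` for every model field, so it raises `KeyError` when a field, even an optional one, is absent from the input; the variant below skips absent keys instead.

```python
from typing import Optional, Type

from pydantic import BaseModel, ConfigDict, ValidationError


class ExampleProtocol(BaseModel):
    # Hypothetical stand-in for a bia_data_model class that forbids extra fields.
    model_config = ConfigDict(extra="forbid")

    title_id: str
    protocol_description: Optional[str] = None


def filter_model_dictionary_lenient(
    dictionary: dict, target_model: Type[BaseModel]
) -> dict:
    """Keep only the keys that are fields of target_model, skipping absent ones."""
    accepted_fields = target_model.model_fields.keys()
    return {key: dictionary[key] for key in accepted_fields if key in dictionary}


# "accno" is not a field of ExampleProtocol, so validation rejects the raw dict.
raw = {"title_id": "Protocol 1", "accno": "Protocol-1"}

try:
    ExampleProtocol.model_validate(raw)
    rejected = False
except ValidationError:
    rejected = True

# After filtering, only accepted fields remain and validation succeeds.
filtered = filter_model_dictionary_lenient(raw, ExampleProtocol)
validated = ExampleProtocol.model_validate(filtered)
```

Whether missing keys should be skipped silently or surfaced as an error is a design choice; skipping leaves it to the model's own validation to report missing required fields.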
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,8 +6,10 @@ | |
|
||
from typing import Dict, List | ||
from bia_shared_datamodels import bia_data_model, semantic_models | ||
from bia_ingest_sm.conversion.utils import dict_to_uuid | ||
|
||
from bia_ingest_sm.conversion.utils import ( | ||
dict_to_uuid, | ||
filter_model_dictionary | ||
) | ||
|
||
def get_test_annotation_method() -> List[bia_data_model.AnnotationMethod]: | ||
# For UUID | ||
|
@@ -21,7 +23,7 @@ def get_test_annotation_method() -> List[bia_data_model.AnnotationMethod]: | |
"method_type", | ||
"source_dataset", | ||
] | ||
protocol_info = [ | ||
annotation_method_info = [ | ||
{ | ||
"accno": "Annotations-29", | ||
"accession_id": "S-BIADTEST", | ||
|
@@ -34,47 +36,12 @@ def get_test_annotation_method() -> List[bia_data_model.AnnotationMethod]: | |
}, | ||
] | ||
|
||
protocol = [] | ||
for protocol_dict in protocol_info: | ||
protocol_dict["uuid"] = dict_to_uuid(protocol_dict, attributes_to_consider) | ||
protocol.append(bia_data_model.AnnotationMethod.model_validate(protocol_dict)) | ||
return protocol | ||
|
||
|
||
def get_test_specimen_growth_protocol() -> List[bia_data_model.ImageAcquisition]: | ||
Review comment: This whole method was just a duplicate, so I removed it. |
||
# For UUID | ||
attributes_to_consider = [ | ||
"accession_id", | ||
"accno", | ||
"title_id", | ||
"protocol_description", | ||
] | ||
protocol_info = [ | ||
{ | ||
"accno": "Image acquisition-3", | ||
"accession_id": "S-BIADTEST", | ||
"title_id": "Test Primary Screen Image Acquisition", | ||
"protocol_description": "Test image acquisition parameters 1", | ||
"imaging_instrument_description": "Test imaging instrument 1", | ||
"imaging_method_name": "confocal microscopy", | ||
"fbbi_id": [], | ||
}, | ||
{ | ||
"accno": "Image acquisition-7", | ||
"accession_id": "S-BIADTEST", | ||
"title_id": "Test Secondary Screen Image Acquisition", | ||
"protocol_description": "Test image acquisition parameters 2", | ||
"imaging_instrument_description": "Test imaging instrument 2", | ||
"imaging_method_name": "fluorescence microscopy", | ||
"fbbi_id": [], | ||
}, | ||
] | ||
|
||
protocol = [] | ||
for protocol_dict in protocol_info: | ||
protocol_dict["uuid"] = dict_to_uuid(protocol_dict, attributes_to_consider) | ||
protocol.append(bia_data_model.ImageAcquisition.model_validate(protocol_dict)) | ||
return protocol | ||
annotation_method = [] | ||
for annotation_method_dict in annotation_method_info: | ||
annotation_method_dict["uuid"] = dict_to_uuid(annotation_method_dict, attributes_to_consider) | ||
annotation_method_dict = filter_model_dictionary(annotation_method_dict, bia_data_model.AnnotationMethod) | ||
annotation_method.append(bia_data_model.AnnotationMethod.model_validate(annotation_method_dict)) | ||
return annotation_method | ||
|
||
|
||
def get_test_specimen_growth_protocol() -> List[bia_data_model.SpecimenGrowthProtocol]: | ||
|
@@ -103,6 +70,7 @@ def get_test_specimen_growth_protocol() -> List[bia_data_model.SpecimenGrowthPro | |
protocol = [] | ||
for protocol_dict in protocol_info: | ||
protocol_dict["uuid"] = dict_to_uuid(protocol_dict, attributes_to_consider) | ||
protocol_dict = filter_model_dictionary(protocol_dict, bia_data_model.SpecimenGrowthProtocol) | ||
protocol.append( | ||
bia_data_model.SpecimenGrowthProtocol.model_validate(protocol_dict) | ||
) | ||
|
@@ -139,6 +107,7 @@ def get_test_specimen_imaging_preparation_protocol() -> ( | |
protocol = [] | ||
for protocol_dict in protocol_info: | ||
protocol_dict["uuid"] = dict_to_uuid(protocol_dict, attributes_to_consider) | ||
protocol_dict = filter_model_dictionary(protocol_dict, bia_data_model.SpecimenImagingPrepartionProtocol) | ||
protocol.append( | ||
bia_data_model.SpecimenImagingPrepartionProtocol.model_validate(protocol_dict) | ||
) | ||
|
@@ -213,6 +182,7 @@ def get_test_biosample() -> List[bia_data_model.BioSample]: | |
biosample = [] | ||
for biosample_dict in biosample_info: | ||
biosample_dict["uuid"] = dict_to_uuid(biosample_dict, attributes_to_consider) | ||
biosample_dict = filter_model_dictionary(biosample_dict, bia_data_model.BioSample) | ||
biosample.append(bia_data_model.BioSample.model_validate(biosample_dict)) | ||
return biosample | ||
|
||
|
@@ -252,6 +222,7 @@ def get_test_image_acquisition() -> List[bia_data_model.ImageAcquisition]: | |
image_acquisition_dict["uuid"] = dict_to_uuid( | ||
image_acquisition_dict, attributes_to_consider | ||
) | ||
image_acquisition_dict = filter_model_dictionary(image_acquisition_dict, bia_data_model.ImageAcquisition) | ||
image_acquisition.append( | ||
bia_data_model.ImageAcquisition.model_validate(image_acquisition_dict) | ||
) | ||
|
@@ -310,6 +281,7 @@ def get_test_experimental_imaging_dataset() -> ( | |
], | ||
) | ||
experimental_imaging_dataset_dict["uuid"] = experimental_imaging_dataset_uuid | ||
experimental_imaging_dataset_dict = filter_model_dictionary(experimental_imaging_dataset_dict, bia_data_model.ExperimentalImagingDataset) | ||
experimental_imaging_dataset1 = ( | ||
bia_data_model.ExperimentalImagingDataset.model_validate( | ||
experimental_imaging_dataset_dict | ||
|
@@ -523,10 +495,6 @@ def get_test_study() -> bia_data_model.Study: | |
"Test keyword3", | ||
], | ||
"grant": [g.model_dump() for g in grant], | ||
"experimental_imaging_component": [ | ||
e.uuid for e in get_test_experimental_imaging_dataset() | ||
], | ||
"annotation_component": [], | ||
} | ||
study_uuid = dict_to_uuid( | ||
study_dict, | ||
|
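The test helpers above all set `uuid` via `dict_to_uuid(some_dict, attributes_to_consider)` before filtering and validating. The implementation of `dict_to_uuid` is not part of this diff, so the following is only a hypothetical sketch of the general idea (a repeatable UUID derived from a chosen subset of attributes), not the project's actual function. The name `dict_to_uuid_sketch`, the `|` separator, and the use of `uuid.uuid5` with `uuid.NAMESPACE_URL` are all assumptions.

```python
import uuid
from typing import Any, Dict, List


def dict_to_uuid_sketch(
    my_dict: Dict[str, Any], attributes_to_consider: List[str]
) -> str:
    """Derive a repeatable UUID string from selected dictionary values."""
    # Join the chosen values in a fixed order; absent keys contribute "".
    seed = "|".join(str(my_dict.get(attr, "")) for attr in attributes_to_consider)
    # uuid5 is deterministic: same namespace and seed give the same UUID.
    # The namespace choice is arbitrary for this illustration.
    return str(uuid.uuid5(uuid.NAMESPACE_URL, seed))


attributes = ["accession_id", "accno"]
record = {"accno": "Annotations-29", "accession_id": "S-BIADTEST", "note": "a"}

first = dict_to_uuid_sketch(record, attributes)

# Changing a key outside attributes_to_consider leaves the UUID unchanged.
record["note"] = "b"
second = dict_to_uuid_sketch(record, attributes)

# Changing a considered attribute produces a different UUID.
other = dict_to_uuid_sketch({**record, "accno": "Annotations-30"}, attributes)
```

Under this scheme, computing the UUID before calling the filter means fields that the model rejects (such as `accno`, if it is not a model field) can still contribute to the identifier.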
8 changes: 7 additions & 1 deletion
bia-shared-datamodels/src/bia_shared_datamodels/bia_data_model.py
I wonder whether this has utility beyond just the export code, and would be better off as a utility in the shared_models package. Thoughts?
`persist` or `filter_model_dictionary`? `import` and `export` use `persist`, so having it in a common place sounds good. However, I assume `export` is only reading data models, so it will not need the filter functionality (but will need the persisting).
`filter_model_dictionary`; I meant import. But the export could use it with slightly different data models (e.g. the study model for export is the study model + dataset model + biosample, protocols, image acquisitions, etc.), so having a filter-model-dictionary method might be helpful.
For `persist`: while both might want to write out to files, it feels like the directory/file structure might be very different for the export and ingest code, so I don't know how reusable the methods would be. Additionally, the functions related to filtering models make sense to store in the model package, since they are related to it and will be imported by both, but it's less obvious where we should put the persisting.