Data model content updates to support GH docs #67
Comments
In 24-3, at least the first two bullets are readily doable. The third bullet might be more difficult, so we need to scope it further and see how far we can get.
Currently reviewing components, attributes, and valid values.

For hierarchy/structure, I did some preliminary analysis with GPT and ontology scoping, documented here:

Summary: it seems doable, but it will be a lot of work. To help minimize the effort required, I'll source structure from existing ontologies and devise mappings when needed.

In terms of implementation, I think defining pair-wise relationships will be sufficient, since the information will be carried forward in each mapping. A generic example would be:

Take five terms: RNA-seq, scRNA-seq, ATAC-seq, scATAC-seq, WGS

Organizing terms would occur in a CSV, using the column names: Technique, Parent

Then relationships are easy to define and structure is easily inferred, using Genomic --> bulk, single-cell --> RNA-seq, scRNA-seq, ATAC-seq, scATAC-seq, WGS
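As a rough illustration of the pair-wise idea, here is a minimal Python sketch that reads `Technique, Parent` rows and infers the hierarchy from them. The assignment of each term to `bulk` or `single-cell`, the inline CSV, and the helper names (`load_parents`, `lineage`, `children_of`) are all assumptions made for this example and are not taken from the data model repo.

```python
import csv
import io

# Hypothetical pair-wise mappings. The bulk/single-cell assignments below
# are assumed for illustration only.
PAIRS_CSV = """Technique,Parent
bulk,Genomic
single-cell,Genomic
RNA-seq,bulk
ATAC-seq,bulk
WGS,bulk
scRNA-seq,single-cell
scATAC-seq,single-cell
"""


def load_parents(text):
    """Read Technique,Parent rows into a {term: parent} dict."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["Technique"].strip(): row["Parent"].strip() for row in reader}


def lineage(term, parents):
    """Follow parent links from a term up to its root term."""
    path = [term]
    seen = {term}
    while path[-1] in parents:
        nxt = parents[path[-1]]
        if nxt in seen:
            # Guard against accidental loops in the mapping file.
            raise ValueError(f"cycle detected at {nxt!r}")
        path.append(nxt)
        seen.add(nxt)
    return path


def children_of(parents):
    """Invert the mapping to {parent: sorted list of direct children}."""
    tree = {}
    for child, parent in parents.items():
        tree.setdefault(parent, []).append(child)
    return {p: sorted(c) for p, c in tree.items()}


parents = load_parents(PAIRS_CSV)
# Prints: Genomic --> single-cell --> scRNA-seq
print(" --> ".join(reversed(lineage("scRNA-seq", parents))))
```

Each row only states a term's direct parent, so adding a new level later (for example, something above `Genomic`) would only need one extra row rather than edits to every existing term.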
Suggest chatting with ANV to see how this was designed and implemented in NF.
24-6: No updates this sprint. Carry into next sprint.
I will continue to collate valid value definitions for assays, tissues, and tumor types here: https://docs.google.com/spreadsheets/d/1YL8kDB_tdvGDYqDy4x8zlBauLPDxc24W4tLEArJh0kQ/edit?usp=sharing

In addition, many valid value sets are missing descriptions/definitions, such as file formats, licenses, input/output formats, etc. The next step is to identify all value types that would benefit from this exercise and note them here.
24-7/8 close out: have new models added (per #115)
24-9: Secondary to site visit priorities. Will require some work to add new components. There might be some room for automation to help pull this information more easily as the data model updates.
Related to #49 and #66
Draft of the MC2 data model dictionary, using GitHub pages deployment, is here: https://mc2-center.github.io/data-models/
Potential actions that could improve documentation quality (should determine necessity/priority for the following):
- `Component` and `Attribute` entries: add descriptions where missing/incomplete
- `Valid Values`: add descriptions, ontology references
- `Valid Values`