DH modularity and 1:N wish list #6

Open
griffie opened this issue Jan 12, 2023 · 0 comments

griffie commented Jan 12, 2023

A sample can contain multiple organisms and multiple isolates of the same organism, and each isolate may be sequenced multiple times using different protocols or instruments. This creates a one-to-many (1:N) problem: one sample may need to be linked to multiple organisms, isolates, library IDs, associated tests (e.g. AMR drug panels from different companies), etc.

Currently, the contextual data for organisms, isolates, etc. from the same sample have to be entered repeatedly, which creates a data entry burden for data providers.

Ideally, the DH would be modular, so that sample information could be entered once and linked to different isolates.
Similarly, isolate information could be entered once and linked to different libraries with different processing details/instruments.
Likewise, libraries could be linked to multiple sequencing runs and/or associated tests.
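
To make the "enter once, link many" idea concrete, here is a minimal sketch in Python. It is not how the DH works today; the record types, field names, and values (`collection_date`, `instrument`, and so on) are hypothetical and only illustrate each record pointing at its parent by ID.

```python
from dataclasses import dataclass

# Hypothetical record types; the fields are illustrative, not an existing DH schema.
@dataclass(frozen=True)
class Sample:
    sample_id: str
    collection_date: str  # placeholder for any sample-level metadata

@dataclass(frozen=True)
class Isolate:
    isolate_id: str
    sample_id: str  # link to the parent sample
    organism: str

@dataclass(frozen=True)
class Library:
    library_id: str
    isolate_id: str  # link to the parent isolate
    instrument: str

# Each record is entered exactly once (all values are made up).
samples = {"sample 1": Sample("sample 1", "2023-01-01")}
isolates = {
    "isolate A": Isolate("isolate A", "sample 1", "organism 1"),
    "isolate C": Isolate("isolate C", "sample 1", "organism 2"),
}
libraries = {
    "library 3": Library("library 3", "isolate C", "instrument X"),
    "library 4": Library("library 4", "isolate C", "instrument Y"),
}

def sample_for_library(library_id: str) -> Sample:
    """Follow the library -> isolate -> sample links to recover the sample record."""
    isolate = isolates[libraries[library_id].isolate_id]
    return samples[isolate.sample_id]
```

Both libraries of isolate C resolve to the same sample record, so the sample metadata lives in one place and never needs to be retyped.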

image
To submit the data to a LIMS or to public repositories, every library, isolate, or organism would need the metadata from the sample, so ideally, upon export, the DH would populate that information and present each record as a separate line in a spreadsheet.
e.g. the above situation would appear like:
sample 1 --> organism 1 --> isolate A --> library 1 --> sequence 1
sample 1 --> organism 2 --> isolate B --> library 2 --> sequence 2
sample 1 --> organism 2 --> isolate C --> library 3 --> sequence 3
sample 1 --> organism 2 --> isolate C --> library 4 --> sequence 4
sample 1 --> organism 2 --> isolate C --> library 4 --> sequence 5
But the data provider wouldn't have to enter the shared metadata multiple times.
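
The export step could be sketched roughly as follows, assuming the linked records are stored as simple parent lookups. This is only an illustration in Python: the table layout, the `collection_site` field, and the function names are hypothetical, and the IDs mirror the five lines above.

```python
import csv
import io

# Hypothetical tables: each record is entered once and points at its parent.
samples = {"sample 1": {"collection_site": "placeholder site"}}
organisms = {"organism 1": "sample 1", "organism 2": "sample 1"}
isolates = {"isolate A": "organism 1", "isolate B": "organism 2", "isolate C": "organism 2"}
libraries = {
    "library 1": "isolate A",
    "library 2": "isolate B",
    "library 3": "isolate C",
    "library 4": "isolate C",
}
sequences = {
    "sequence 1": "library 1",
    "sequence 2": "library 2",
    "sequence 3": "library 3",
    "sequence 4": "library 4",
    "sequence 5": "library 4",
}

def export_rows():
    """Build one flat row per sequence, copying metadata down from the sample."""
    rows = []
    for sequence, library in sequences.items():
        isolate = libraries[library]
        organism = isolates[isolate]
        sample = organisms[organism]
        rows.append({
            "sample": sample,
            **samples[sample],  # sample metadata is repeated on every exported row
            "organism": organism,
            "isolate": isolate,
            "library": library,
            "sequence": sequence,
        })
    return rows

def export_csv() -> str:
    """Serialize the flat rows as CSV text, one spreadsheet line per sequence."""
    rows = export_rows()
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `export_csv()` yields a header plus five lines, one per sequence, each carrying the sample-level metadata even though it was entered only once.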

Can we make the DH do this modular/1:N data capture and transformation (pretty please)?
