diff --git a/mkdocs.yaml b/mkdocs.yaml
index 8fb9ba6..6db63d5 100644
--- a/mkdocs.yaml
+++ b/mkdocs.yaml
@@ -123,6 +123,7 @@ nav:
  - "Overview": metadata/overview.md
  - "Entities & Attributes": metadata/entities.md
  - "Standards": metadata/standards.md
+ - "Data Dictionary Overview": metadata/data_dictionary_overview
  - "Data Dictionary": metadata/data_dictionary
  - "CLI Tools": cli_tools
  - "FAQ": faq.md
diff --git a/user_docs/assets/img/Overview_fig.png b/user_docs/assets/img/Overview_fig.png
new file mode 100644
index 0000000..2622c4d
Binary files /dev/null and b/user_docs/assets/img/Overview_fig.png differ
diff --git a/user_docs/metadata/overview.md b/user_docs/metadata/overview.md
index fbca0cd..c176bec 100644
--- a/user_docs/metadata/overview.md
+++ b/user_docs/metadata/overview.md
@@ -1,26 +1,7 @@
 # The GHGA Metadata Model
 
-## **Glossary**
-- **Entity**: An Entity holds characteristics of a real-world object. Example: The Individual entity is described by the information (properties) for sex, year of birth and height.
- - Synonyms: class, table, object
+The GHGA metadata model aims at facilitating comprehensive submissions that maximize the amount of collected metadata without creating friction on the submitter side, enabling (reusable) submissions of different types of -omics data into GHGA. The schema consists of **Research Metadata** and the **Administrative Metadata**. The **Research Metadata** aims at maximising the reusability and FAIRness of the data and the **Administrative Metadata** focuses on managing the resources, such as creation or acquisition of the data, rights management, and disposition. The schema also differentiates between file types depending on whether they were generated through primary analysis (**Research Data File**), secondary analysis (**Process Data File**) or supplementary information to the classes (**Supporting File**)
 
-- **Property**: A Property is a single characteristic that can be used in combination with other characteristics to describe a real-world object. Example: The combination of the properties sex, year of birth and height describe the (real-world object) entity Individual.
+The GHGA metadata model follows several internationally renowned concepts, standards, and resources to provide a metadata schema to share data in a standardized and harmonized fashion. Please visit (https://zenodo.org/records/8341224) for further details.
 
- - Synonyms: attribute, element, field, slot
-
-- **FAIR**: Findable, Accessible, Interoperable, Reusable
-
-## **Introduction**
-The German Human Genome-Phenome Archive (GHGA) provides a nation-wide resource for archiving, accessing and sharing of multi-omics data produced and processed in research and health care initiatives in Germany. GHGA aims to bring these data together and make it easier to find data for secondary use, by adopting and adhering to [FAIR data principles](https://doi.org/10.1038/sdata.2016.18). In order to meet the domain-specific requirements we developed the GHGA Metadata Schema - a schema for representing information pertaining to various aspects of our data.
-
-This documentation serves as the description and reasoning behind the Metadata Model of GHGA, which encapsulates the metadata schema, its technical implementation, and resources to support submission of metadata. The Archive function of GHGA is envisioned to handle a wide variety of omics and research data. The GHGA metadata model aims at facilitating comprehensive submissions that maximize the amount of collected metadata without creating friction on the submitter side, enabling (reusable) submissions of different types of -omics data into GHGA. This metadata model can satisfy the heterogeneous needs of submitters while maintaining the FAIR principles, interoperability with EGA and facilitating streamlined user journeys.
-
-Classes in the schema can be grouped into **Research Metadata** and **Administrative Metadata** based on the information they capture. The **Research Metadata** aims at maximising the reusability and FAIRness of the data, while the **Administrative Metadata** focuses on managing the resources, such as creation or acquisition of the data, rights management, and disposition. The Research Metadata classes include *Individual*, *Biospcimen/Sample*, *Experiment*, *Experiment Method*, *Analysis* and *Analysis Method*. The Administrative Metadata captures *Dataset*, *Data Access Policy*, *Data Access Committee*, *Publication*, and *Study*.
-
-The model also differentiates between three file types:
-
-- **Research Data File**: A file which results from the omics experiment, such as sequencing of a sample.
-- **Process Data File**: A file that is generated as output from an analysis performed on a *Research Data File*, such as alignment or processing.
-- **Supporting File**: A file that provides further information about an *Individual*, *Experiment Method* or *Analysis Method*. These could be unstructured protocols or structured information, such as Phenopackets or BioCompute Objects.
-
-Furthermore we provide data submitters with a Submission Spreadsheet in order to easily deposit their data within GHGA.
\ No newline at end of file
+![GHGA Metadata Model Overview](../assets/img/Overview_fig.png)
\ No newline at end of file