Training Resources for H3A Fellows Training
To do:
- Create a markdown document for each topic and highlight the learning objectives and outcomes, based on the ISCB competencies.
We may need to start with a topic on data organization and management.
- Cleaning up and putting together large datasets with R and the tidyverse environment
- FAIR Data Principles and their implementation in our scientific space
- Metadata standards for genomic data and uploading to GenBank/ENA
These are the high-level competencies that we focus on imparting during the training. Data management is a newly introduced competency.
- Current approaches for modelling and warehousing of life science data
- The importance of data governance, curation, information architecture and ensuring interoperability
- Common document identification, tracking and control procedures
- FAIR principles
- Ethical, legal and social implications of using/storing sensitive data
- Data protection concepts
- Knowledge representation (e.g. file formats, ontologies and other controlled vocabularies)
At the end of the training, the trainees should be able to:
- Make use of suitable programming languages and/or workflow tools to automate data handling and curation tasks
- Curate biological data using proper metadata, ontologies and/or controlled vocabularies
- Draft and file an appropriate Data Management Plan
- Prepare data for submission to appropriate public bioinformatics data repositories as required, being aware of ethical and legal considerations
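As an illustration of the first two outcomes, the following is a minimal sketch in Python of one automated curation task: mapping free-text values onto a controlled vocabulary. The vocabulary, the synonym table, and the function names `normalize_term` and `curate_column` are hypothetical and stand in for whatever ontology or controlled vocabulary a real project adopts.

```python
# Hypothetical controlled vocabulary for a "sex" metadata field.
# A real project would take these terms from an agreed ontology or checklist.
CONTROLLED_TERMS = {"female", "male", "not collected"}

# Hypothetical synonyms seen in free-text entries, mapped to controlled terms.
SYNONYMS = {
    "f": "female",
    "m": "male",
    "na": "not collected",
    "": "not collected",
}


def normalize_term(raw_value):
    """Return the controlled term for raw_value, or None if it cannot be mapped."""
    cleaned = raw_value.strip().lower()
    if cleaned in CONTROLLED_TERMS:
        return cleaned
    return SYNONYMS.get(cleaned)


def curate_column(values):
    """Split values into (curated, unmapped) so unmapped entries get manual review."""
    curated, unmapped = [], []
    for value in values:
        term = normalize_term(value)
        if term is None:
            unmapped.append(value)
        else:
            curated.append(term)
    return curated, unmapped
```

Returning the unmapped values separately, rather than guessing a term for them, keeps a person in the loop for entries the synonym table does not cover.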
In this lesson, trainees will:
- Be aware of the FAIR data principles
- Understand how the FAIR Data principles are applicable in genomics
- Know how to develop a data management plan that adheres to FAIR principles
- Understand the limits of FAIRness and open science, i.e. know how to protect their data
- Be able to outline the FAIR principles and how they can be applied
- Be able to highlight and apply the FAIR principles in genomics, or in any of their research
- Know the various tools that can be used to develop a data management plan
- Be prepared to apply open science principles while protecting their data: be as open as possible and as closed as necessary
The main objective here is to understand how to collect the right metadata and make use of established standards.
- Some standards
- SRA Metadata and Submission Overview
- Minimum information by GSC paper: https://www.nature.com/articles/nbt.1823.pdf
- The checklist available at ENA for minimum information is also useful.
For completeness, we could include project planning for genomics; we could then use the data and resources from DC Genomics for part of the course.
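To make the idea of collecting the right metadata against a minimum-information checklist more concrete, here is a minimal sketch in Python that reports which required fields are missing from each sample record. The field names in `REQUIRED_FIELDS` and the function names are hypothetical; a real checklist, such as one from ENA or the GSC standards, defines its own fields and allowed values.

```python
# Hypothetical required fields; a real minimum-information checklist
# defines its own field names and allowed values.
REQUIRED_FIELDS = ["sample_id", "collection_date", "geographic_location", "host"]


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or blank in one record."""
    missing = []
    for field in required:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def validate_records(records):
    """Map each record's sample_id (or its row position) to its missing fields."""
    problems = {}
    for index, record in enumerate(records):
        missing = missing_fields(record)
        if missing:
            key = record.get("sample_id") or f"row {index}"
            problems[key] = missing
    return problems
```

Calling `validate_records` on a list of dictionaries (one per sample) before submission gives a quick report of incomplete records; it checks only for presence, not whether the values follow the format a repository expects.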