Course materials for the Genomics Aotearoa Metagenomics Summer School, to be hosted at the University of Auckland in November.
A draft timetable for each day is provided below, but please keep in mind that this is subject to change as we evaluate our course material.
For all exercises after the `bash` introduction, you will be working from the file path `/nesi/nobackup/nesi02659/MGSS_U/YOUR_USERNAME/`
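As a minimal sketch of how you might move into that working directory once logged in, the snippet below builds the path from a username and changes into it only if it exists. The helper name `mgss_workdir` is hypothetical and not part of the course materials, and the snippet assumes your NeSI username matches the `$USER` variable in your session; if it does not, substitute your username by hand.

```bash
# Hypothetical helper (not part of the course materials): print the workshop
# working directory for the username given as the first argument.
mgss_workdir() {
    printf '/nesi/nobackup/nesi02659/MGSS_U/%s/' "$1"
}

# Assumption: your NeSI username is the value of $USER.
# Falls back to the literal placeholder if $USER is unset.
workdir="$(mgss_workdir "${USER:-YOUR_USERNAME}")"

# Only change directory if the path exists on this machine.
if [ -d "$workdir" ]; then
    cd "$workdir" && pwd
else
    echo "Directory not found: $workdir" >&2
fi
```

Checking for the directory first avoids a confusing `cd` error if you run this on a machine other than NeSI or before your folder has been created.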
A few helpful commands and shortcuts for working in `bash` or with `slurm` can be found here.
If you are having trouble downloading files using `scp`, we are providing exemplar output files, which you can download through your browser, here, or via the following links:
You can find a copy of the slides presented during the workshop, with published figures removed, in the slides/ folder.
We will be using a collaborative document to share long code, results, or even code errors. This document can be found here, or by typing https://etherpad.wikimedia.org/p/MGSS_17_11_20 into your browser.
Please complete the survey before you leave to help us continue to provide A+ workshops.
- Bash scripting
- Quality filtering raw reads
- Assembly (part 1)
- Assembly (part 2)
- Evaluating the assembly
- Binning (part 1, read mapping)
- Binning (part 2, initial binning)
- Binning (part 3, dereplication)
- Bin refinement

Time | Event | Session leader |
---|---|---|
9:00 am – 9:30 am | Introduction - Welcome - Logging into NeSI | Michael Hoggard |
9:30 am – 10:30 am | TASK: Bash scripting | Ngoni Faya |
10:30 am – 10:50 am | Morning tea break | |
10:50 am – 11:30 am | TASK: Bash scripting (continued) | Ngoni Faya |
11:30 am – 12:00 pm | TALK: The metagenomics decision tree | Kim Handley |
12:00 pm – 12:45 pm | Break for lunch (lunch not provided) | |
12:45 pm – 1:45 pm | TALK: Quality filtering raw reads TASK: Visualisation with FastQC TASK: Read trimming and adapter removal TASK: Diagnosing poor libraries TASK: Common issues and best practice | Carmen Astudillo-Garcia |
1:45 pm – 3:00 pm | TASK: Run IDBA_UD assembly TALK: Assembly - Choice of assemblers - Considerations for parameters, and when to stop! TASK: Exploring assembler options TASK: Submitting jobs to NeSI via slurm TASK: Run SPAdes assembly TASK (Optional): Submitting variant assemblies to NeSI | Kim Handley |
3:00 pm – 3:20 pm | Afternoon tea break (Tea, coffee, and snacks provided) | |
3:20 pm – 5:00 pm | TALK: Future considerations - co-assembly vs. single assemblies TASK: Assembly evaluation TASK: Short contig removal | Michael Hoggard |

Time | Event | Session leader |
---|---|---|
9:00 am – 9:15 am | Introduction - Overview of yesterday, questions - Overview of today | Carmen Astudillo-Garcia |
9:15 am – 9:30 am | TALK: Automation, reproducibility, and FAIR principles | Dan Jones |
9:30 am – 10:30 am | Binning (part 1) TALK: Overview of binning history - Key parameters and strategies for binning TASK: Read mapping | Kim Handley |
10:30 am – 10:50 am | Morning tea break (Tea, coffee, and snacks provided) | |
10:50 am – 11:20 am | TALK: Overview of binning history (continued) - Key parameters and strategies for binning | Kim Handley |
11:20 am – 12:00 pm | Binning (part 2) TASK: Multi-binning strategy (Metabat and Maxbin) | Kim Handley |
12:00 pm – 12:45 pm | Break for lunch (lunch not provided) | |
12:45 pm – 2:00 pm | Binning (part 3) TASK: Bin dereplication via DAS_Tool TASK: Evaluating bins using CheckM | Michael Hoggard |
2:00 pm – 3:00 pm | Binning (part 4) - Discuss additional dereplication strategies, such as dRep - How to work with viral and eukaryotic bins - Dealing with organisms which possess minimal genomes TALK: Bin refinement - Refinement strategies | Carmen Astudillo-Garcia, Michael Hoggard |
3:00 pm – 3:20 pm | Afternoon tea break (Tea, coffee, and snacks provided) | |
3:20 pm – 5:00 pm | TALK: Bin refinement - Refinement strategies (cont) - VizBin and ESOMana TASK: Working with VizBin TASK: Submit VIBRANT job | Michael Hoggard |

Time | Event | Session leader |
---|---|---|
9:00 am – 9:30 am | Introduction - Overview of yesterday, questions - Overview of today | Michael Hoggard |
9:30 am – 10:30 am | TASK: Identifying viral contigs (VIBRANT) TALK: Identifying viruses from metagenomic data TASK: QC of viral contigs (CheckV) TASK: Coverage calculation (bowtie) TASK: Taxonomic classification (Bin taxonomy with GTDB-TK; viral taxonomy predictions with vConTACT2) | Michael Hoggard, David Waite |
10:30 am – 10:50 am | Morning tea break (Tea, coffee, and snacks provided) | |
10:50 am – 11:30 am | TALK: Gene prediction, using prodigal, and other tools (RNAmmer, Aragorn, etc) TASK: Predict open reading frames and protein sequences | David Waite |
11:30 am – 12:00 pm | TALK: Gene annotation (part 1) TASK: Gene annotation using diamond and hmmer Discussion: Evaluating the quality of gene assignment Discussion: Differences in taxonomies (GTDB, NCBI etc) | Carmen Astudillo-Garcia |
12:00 pm – 12:45 pm | Break for lunch (lunch not provided) | |
12:45 pm – 2:00 pm | TALK: Gene annotation (part 2) - Using online resources (e.g. KEGG, BioCyc, MetaCyc, HydDB, PSORT) TASK: View KEGG annotation in KEGG website | Christia Straub, Florian Pichlmueller |
2:00 pm – 3:00 pm | TALK: Bin taxonomic classification - Bin and species determination TASK: View phylogenetic trait distribution (ANNOTREE) | David Waite |
3:00 pm – 3:20 pm | Afternoon tea break | |
3:20 pm – 4:30 pm | TASK: MAG annotation with DRAM TASK: Introduce group project goals TASK: Dividing into working groups / get a group name TASK: Select a goal from your project | Carmen Astudillo-Garcia |
4:30 pm – 5:00 pm | End of day wrap up | Kim Handley |

Time | Event | Session leader |
---|---|---|
9:00 am – 9:15 am | Introduction - Overview of yesterday, questions - Overview of today | Michael Hoggard |
9:15 am – 10:00 am | TALK: DRAM results overview TASK: Explore DRAM results | Carmen Astudillo-Garcia |
10:00 am – 10:30 am | Presentation of data TALK: Visualising findings (metabolism maps, heatmaps, cell schematics, gene trees, gene maps) TASK: Coverage heatmap / Ordination (Optional) | Michael Hoggard |
10:30 am – 10:50 am | Morning tea break (Tea, coffee, and snacks provided) TASK: Workshop survey | |
10:50 am – 12:00 pm | Presentation of data (continued) TALK: Visualising findings (metabolism maps, heatmaps, cell schematics, gene trees, gene maps) TASK: KEGG metabolic pathways TASK: Gene synteny TASK: CAZy heatmaps (Optional) | Carmen Astudillo-Garcia, Boey Jian Sheng, Hwee Sze Tee |
12:00 pm – 12:45 pm | Break for lunch (lunch not provided) | |
12:45 pm – 2:30 pm | TASK: Analyse data for group work TASK: Prepare group presentation | Kim Handley |
2:30 pm – 3:00 pm | Present and discuss findings TASK: Each group to give an informal presentation of their data | Kim Handley |
3:00 pm – 3:20 pm | Afternoon tea break (Tea, coffee, and snacks provided) | |
3:20 pm – 3:40 pm | Present and discuss findings (continued) TASK: Each group to give an informal presentation of their data | Carmen Astudillo-Garcia |
3:40 pm – 4:00 pm | End of day wrap up - Final discussion | Kim Handley, Michael Hoggard |