Course materials for the Genomics Aotearoa Metagenomics Summer School, to be hosted at the University of Auckland in November

Fmh2417/metagenomics_summer_school

 
 

Metagenomics Summer School

Course materials for the Genomics Aotearoa Metagenomics Summer School, to be hosted at the University of Auckland in November.

A draft timetable for each day of the workshop is provided below, but please keep in mind that it is subject to change as we evaluate our course material.


Useful locations and links

Working directories

For all exercises after the bash introduction, you will be working from the file path

/nesi/nobackup/nesi02659/MGSS_U/YOUR_USERNAME/
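
As a minimal sketch, moving into this directory at the start of a session might look like the following. It assumes that your NeSI username matches the `$USER` variable in your login shell, which may not hold for every account. The variable names `MGSS_USER` and `MGSS_DIR` are illustrative only and are not part of the course material.

```shell
#!/usr/bin/env bash
# Assumption: your NeSI username matches $USER in your login shell.
# If it does not, replace the right-hand side with your username.
MGSS_USER="${USER:-YOUR_USERNAME}"

# Working directory given in the course notes.
MGSS_DIR="/nesi/nobackup/nesi02659/MGSS_U/${MGSS_USER}"

# Only change directory if the folder exists, so a wrong username
# produces a clear message instead of leaving you somewhere unexpected.
if [ -d "${MGSS_DIR}" ]; then
    cd "${MGSS_DIR}" && pwd
else
    echo "Working directory not found: ${MGSS_DIR}" >&2
fi
```

If the directory is reported as not found, check the spelling of your username before continuing with the exercises.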

bash/slurm cheatsheet

A few helpful commands and shortcuts for working in bash or with slurm can be found here.

Snapshots of results to download

If you have trouble downloading files with scp, exemplar output files are available to download through your browser, here, or via the following links:

Slides for workshop

You can find a copy of the slides presented during the workshop, with published figures removed, in the slides/ folder.

Etherpad

We will be using a collaborative document to share long code, results, or even code errors. This document can be found here, or by entering https://etherpad.wikimedia.org/p/MGSS_17_11_20 in your browser.

Post-Workshop Survey

Please complete the survey before you leave to help us continue to provide A+ workshops.


Workshop exercises

Day 1

  1. Bash scripting
  2. Quality filtering raw reads
  3. Assembly (part 1)
  4. Assembly (part 2)
  5. Evaluating the assembly

Day 2

  1. Binning (part 1, read mapping)
  2. Binning (part 2, initial binning)
  3. Binning (part 3, dereplication)
  4. Bin refinement

Day 3

  1. Viruses
  2. Coverage and Taxonomy
  3. Gene prediction
  4. Gene annotation (part 1)
  5. Gene annotation (part 2)

Day 4

  1. Gene annotation (part 3)
  2. Presentation of data

Timetable

Day 1 - 17th November 2020

| Time | Event | Session leader |
|---|---|---|
| 9:00 am – 9:30 am | Introduction<br>- Welcome<br>- Logging into NeSI | Michael Hoggard |
| 9:30 am – 10:30 am | TASK: Bash scripting | Ngoni Faya |
| 10:30 am – 10:50 am | Morning tea break | |
| 10:50 am – 11:30 am | TASK: Bash scripting (continued) | Ngoni Faya |
| 11:30 am – 12:00 pm | TALK: The metagenomics decision tree | Kim Handley |
| 12:00 pm – 12:45 pm | Break for lunch (lunch not provided) | |
| 12:45 pm – 1:45 pm | TALK: Quality filtering raw reads<br>TASK: Visualisation with FastQC<br>TASK: Read trimming and adapter removal<br>TASK: Diagnosing poor libraries<br>TASK: Common issues and best practice | Carmen Astudillo-Garcia |
| 1:45 pm – 3:00 pm | TASK: Run IDBA_UD assembly<br>TALK: Assembly<br>- Choice of assemblers<br>- Considerations for parameters, and when to stop!<br>TASK: Exploring assembler options<br>TASK: Submitting jobs to NeSI via slurm<br>TASK: Run SPAdes assembly<br>TASK (Optional): Submitting variant assemblies to NeSI | Kim Handley |
| 3:00 pm – 3:20 pm | Afternoon tea break (Tea, coffee, and snacks provided) | |
| 3:20 pm – 5:00 pm | TALK: Future considerations - co-assembly vs. single assemblies<br>TASK: Assembly evaluation<br>TASK: Short contig removal | Michael Hoggard |

Day 2 - 18th November 2020

| Time | Event | Session leader |
|---|---|---|
| 9:00 am – 9:15 am | Introduction<br>- Overview of yesterday, questions<br>- Overview of today | Carmen Astudillo-Garcia |
| 9:15 am – 9:30 am | TALK: Automation, reproducibility, and FAIR principles | Dan Jones |
| 9:30 am – 10:30 am | Binning (part 1)<br>TALK: Overview of binning history<br>- Key parameters and strategies for binning<br>TASK: Read mapping | Kim Handley |
| 10:30 am – 10:50 am | Morning tea break (Tea, coffee, and snacks provided) | |
| 10:50 am – 11:20 am | TALK: Overview of binning history (continued)<br>- Key parameters and strategies for binning | Kim Handley |
| 11:20 am – 12:00 pm | Binning (part 2)<br>TASK: Multi-binning strategy (MetaBAT and MaxBin) | Kim Handley |
| 12:00 pm – 12:45 pm | Break for lunch (lunch not provided) | |
| 12:45 pm – 2:00 pm | Binning (part 3)<br>TASK: Bin dereplication via DAS_Tool<br>TASK: Evaluating bins using CheckM | Michael Hoggard |
| 2:00 pm – 3:00 pm | Binning (part 4)<br>- Discuss additional dereplication strategies, such as dRep<br>- How to work with viral and eukaryotic bins<br>- Dealing with organisms which possess minimal genomes<br>TALK: Bin refinement<br>- Refinement strategies | Carmen Astudillo-Garcia<br>Michael Hoggard |
| 3:00 pm – 3:20 pm | Afternoon tea break (Tea, coffee, and snacks provided) | |
| 3:20 pm – 5:00 pm | TALK: Bin refinement<br>- Refinement strategies (continued): VizBin and ESOMana<br>TASK: Working with VizBin<br>TASK: Submit VIBRANT job | Michael Hoggard |

Day 3 - 19th November 2020

| Time | Event | Session leader |
|---|---|---|
| 9:00 am – 9:30 am | Introduction<br>- Overview of yesterday, questions<br>- Overview of today | Michael Hoggard |
| 9:30 am – 10:30 am | TASK: Identifying viral contigs (VIBRANT)<br>TALK: Identifying viruses from metagenomic data<br>TASK: QC of viral contigs (CheckV)<br>TASK: Coverage calculation (bowtie)<br>TASK: Taxonomic classification (bin taxonomy with GTDB-Tk; viral taxonomy predictions with vConTACT2) | Michael Hoggard<br>David Waite |
| 10:30 am – 10:50 am | Morning tea break (Tea, coffee, and snacks provided) | |
| 10:50 am – 11:30 am | TALK: Gene prediction using Prodigal and other tools (RNAmmer, ARAGORN, etc.)<br>TASK: Predict open reading frames and protein sequences | David Waite |
| 11:30 am – 12:00 pm | TALK: Gene annotation (part 1)<br>TASK: Gene annotation using DIAMOND and HMMER<br>Discussion: Evaluating the quality of gene assignment<br>Discussion: Differences in taxonomies (GTDB, NCBI, etc.) | Carmen Astudillo-Garcia |
| 12:00 pm – 12:45 pm | Break for lunch (lunch not provided) | |
| 12:45 pm – 2:00 pm | TALK: Gene annotation (part 2)<br>- Using online resources (e.g. KEGG, BioCyc, MetaCyc, HydDB, PSORT)<br>TASK: View KEGG annotation on the KEGG website | Christia Straub<br>Florian Pichlmueller |
| 2:00 pm – 3:00 pm | TALK: Bin taxonomic classification<br>- Bin and species determination<br>TASK: View phylogenetic trait distribution (AnnoTree) | David Waite |
| 3:00 pm – 3:20 pm | Afternoon tea break | |
| 3:20 pm – 4:30 pm | TASK: MAG annotation with DRAM<br>TASK: Introduce group project goals<br>TASK: Dividing into working groups / get a group name<br>TASK: Select a goal from your project | Carmen Astudillo-Garcia |
| 4:30 pm – 5:00 pm | End of day wrap up | Kim Handley |

Day 4 - 20th November 2020

| Time | Event | Session leader |
|---|---|---|
| 9:00 am – 9:15 am | Introduction<br>- Overview of yesterday, questions<br>- Overview of today | Michael Hoggard |
| 9:15 am – 10:00 am | TALK: DRAM results overview<br>TASK: Explore DRAM results | Carmen Astudillo-Garcia |
| 10:00 am – 10:30 am | Presentation of data<br>TALK: Visualising findings (metabolism maps, heatmaps, cell schematics, gene trees, gene maps)<br>TASK: Coverage heatmap / Ordination (Optional) | Michael Hoggard |
| 10:30 am – 10:50 am | Morning tea break (Tea, coffee, and snacks provided)<br>TASK: Workshop survey | |
| 10:50 am – 12:00 pm | Presentation of data (continued)<br>TALK: Visualising findings (metabolism maps, heatmaps, cell schematics, gene trees, gene maps)<br>TASK: KEGG metabolic pathways<br>TASK: Gene synteny<br>TASK: CAZy heatmaps (Optional) | Carmen Astudillo-Garcia<br>Boey Jian Sheng<br>Hwee Sze Tee |
| 12:00 pm – 12:45 pm | Break for lunch (lunch not provided) | |
| 12:45 pm – 2:30 pm | TASK: Analyse data for group work<br>TASK: Prepare group presentation | Kim Handley |
| 2:30 pm – 3:00 pm | Present and discuss findings<br>TASK: Each group to give an informal presentation of their data | Kim Handley |
| 3:00 pm – 3:20 pm | Afternoon tea break (Tea, coffee, and snacks provided) | |
| 3:20 pm – 3:40 pm | Present and discuss findings (continued)<br>TASK: Each group to give an informal presentation of their data | Carmen Astudillo-Garcia |
| 3:40 pm – 4:00 pm | End of day wrap up<br>- Final discussion | Kim Handley<br>Michael Hoggard |
