2023 Workshop IGS Genome reconstruction - Day 04

After having QC'ed, assembled and characterized our data, we will try to put it into context. We will look at different methods for phylogenetic clustering of sequences and do a hands-on session using chewieSnake for cgMLST-based clustering as an example.

Lecture

Slides: Phylogenetic analyses of isolate sequencing data using chewieSnake.

Hands-on

Setup of chewieSnake (DO NOT RUN!)

#set up chewiesnake
git clone https://gitlab.com/bfr_bioinformatics/chewieSnake
mamba env create -f /home/igsw/igs_workshop/software/chewieSnake/envs/chewiesnake.yaml

Usage of chewieSnake

conda activate chewiesnake
/home/igsw/igs_workshop/software/chewieSnake/chewieSnake.py --help 
# or
/home/igsw/igs_workshop/software/chewieSnake/chewieSnake.py --help | less

We have assembled a set of Yersinia pestis reads from a simulated outbreak scenario at /home/igsw/igs_workshop/aquamis_results/Assembly/assembly

To run chewieSnake, you again need a sample sheet of the samples you want to analyze.

create_sampleSheet.sh --mode assembly /home/igsw/igs_workshop/aquamis_results/Assembly/assembly
less /home/igsw/igs_workshop/aquamis_results/Assembly/assembly/samples.tsv

We can remove samples that did not pass our QC from the sample sheet and write the result to samples_fixed.tsv:

grep -v ERR9964621inter /home/igsw/igs_workshop/aquamis_results/Assembly/assembly/samples.tsv > /home/igsw/igs_workshop/aquamis_results/Assembly/assembly/samples_fixed.tsv
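
Note that `grep -v` drops every line that contains the pattern anywhere, so any other sample whose name or path happens to include the same string would be removed as well. A stricter variant could look like the sketch below. It assumes that the sample sheet is tab-separated with the sample name in the first column (please check this with `less` first); the function name `drop_sample` is made up for this example.

```shell
# Hypothetical helper: remove one sample from a sample sheet by exact name.
# Assumption: tab-separated columns, sample name in column 1.
# $1 = sample name, $2 = input sample sheet, $3 = output sample sheet
drop_sample() {
    # Keep only the lines whose first column differs from the given name.
    awk -F'\t' -v s="$1" '$1 != s' "$2" > "$3"
    # Report how many lines were dropped so the result can be sanity-checked.
    echo "removed $(( $(wc -l < "$2") - $(wc -l < "$3") )) line(s)"
}
```

If the full sample name is ERR9964621inter, the call would be `drop_sample ERR9964621inter samples.tsv samples_fixed.tsv` from within the assembly directory. A reported count other than 1 is a hint to look at the sheet again.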

Next, we want to run chewieSnake on all of these samples. As before, most of the data has been precalculated for you. Please make sure to work only in the prepared working directory (parameter -d).

First, we want to check if our data is correct. To do so, we perform a dry run.

/home/igsw/igs_workshop/software/chewieSnake/chewieSnake.py -l /home/igsw/igs_workshop/aquamis_results/Assembly/assembly/samples_fixed.tsv --scheme /home/igsw/igs_workshop/software/chewieSnake/ypestis_data --prodigal /home/igsw/igs_workshop/software/chewieSnake/ypestis_data/ypestis_ASM22297v1.trn -n -d /home/igsw/igs_workshop/chewiesnake_results/

If everything looks good, we start the actual processing.

/home/igsw/igs_workshop/software/chewieSnake/chewieSnake.py -l /home/igsw/igs_workshop/aquamis_results/Assembly/assembly/samples_fixed.tsv --scheme /home/igsw/igs_workshop/software/chewieSnake/ypestis_data --prodigal /home/igsw/igs_workshop/software/chewieSnake/ypestis_data/ypestis_ASM22297v1.trn -d /home/igsw/igs_workshop/chewiesnake_results/
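
The two calls above differ only in the -n flag, so they can be chained so that the real run starts only after a successful dry run. The following is a minimal sketch, not part of chewieSnake itself. It assumes that chewieSnake.py returns a non-zero exit code when the dry run fails; the function name `run_chewiesnake_checked` and the `CHEWIESNAKE` variable are made up for this example. Only the flags already used above are passed.

```shell
# Path to the chewieSnake wrapper as used in this workshop; override for other installs.
CHEWIESNAKE="${CHEWIESNAKE:-/home/igsw/igs_workshop/software/chewieSnake/chewieSnake.py}"

# Hypothetical helper: dry run first (-n), real run only if the dry run succeeds.
# Assumption: a failed dry run is signalled by a non-zero exit code.
# $1 = sample sheet, $2 = scheme directory, $3 = Prodigal training file, $4 = working directory
run_chewiesnake_checked() {
    if "$CHEWIESNAKE" -l "$1" --scheme "$2" --prodigal "$3" -d "$4" -n; then
        "$CHEWIESNAKE" -l "$1" --scheme "$2" --prodigal "$3" -d "$4"
    else
        echo "dry run failed, not starting the real run" >&2
        return 1
    fi
}
```

The function takes the same four paths as the commands above, in the order sample sheet, scheme directory, training file, working directory.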

chewieSnake will automatically generate an interactive report for manual curation of your QC results. Open it in the browser:

readlink -f /home/igsw/igs_workshop/chewiesnake_results/reports/cgmlst_report.html

Copy the resulting path into the address bar of your browser to check the results.
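
If the report does not open, it may simply not have been written yet. As a small sketch, the helper below checks that the file exists and is non-empty before printing a file:// URL that can be pasted into the browser. The function name `report_url` is made up for this example, and paths containing spaces or special characters are not URL-encoded here.

```shell
# Hypothetical helper: print a file:// URL for a report if it exists and is non-empty.
# $1 = path to the HTML report
report_url() {
    if [ -s "$1" ]; then
        # readlink -f resolves the path to an absolute one, as in the command above.
        printf 'file://%s\n' "$(readlink -f "$1")"
    else
        echo "report not found or empty: $1" >&2
        return 1
    fi
}
```

For this workshop, the call would be `report_url /home/igsw/igs_workshop/chewiesnake_results/reports/cgmlst_report.html`.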

chewieSnake also generates a tree file, which you can open in a viewer such as GrapeTree for further examination.

To start a GrapeTree session, just type:

grapetree

This will open a browser window connected to a local web server (http://localhost:8000/).

GrapeTree is also available as a web service.

You can upload your tree:

/home/igsw/igs_workshop/chewiesnake_results/cgmlst/exported_trees/clustering_global.tre

and the corresponding metadata. The metadata is available here:

/home/igsw/igs_workshop/raw_data/ypestis_data_curated.csv
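
Metadata can only be shown for tree labels that also appear in the metadata table, so it can help to compare the two files before uploading. The following is a rough sketch under several assumptions that are not confirmed by the workshop material: the tree is in Newick format, the metadata CSV has a header line with the sample ID in the first column, IDs contain no commas, and the tree labels are identical to those IDs. The function name `labels_missing_metadata` is made up for this example. Internal node labels, if the tree has any, would be reported as well.

```shell
# Hypothetical helper: list tree labels that have no row in the metadata table.
# Assumptions: Newick tree; CSV with a header line and the sample ID in column 1.
# $1 = tree file, $2 = metadata CSV
labels_missing_metadata() {
    ids=$(mktemp)
    # Collect the IDs from column 1, skipping the header and stripping quotes and CRs.
    tail -n +2 "$2" | cut -d',' -f1 | tr -d '"\r' | sort -u > "$ids"
    # Split the Newick string at its structural characters, drop branch lengths,
    # and keep only those labels that do not match any ID exactly.
    tr '(),;' '\n\n\n\n' < "$1" | sed 's/:.*//' | grep -v '^$' | sort -u \
        | grep -vxF -f "$ids" || true
    rm -f "$ids"
}
```

Called with the tree and metadata paths above, an empty output means that every label was found in the metadata table under these assumptions.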

Alternatively, we can visualize our results in Phandango.

In the resulting interactive view, you can visualize different metadata fields within the tree. Play around with it and let's try to read something meaningful from it!