
16. Phylogenetic Reconstruction

George Pacheco edited this page Jun 21, 2022 · 3 revisions

Using Dataset I, we reconstruct the phylogenetic relationships among the samples with ngsDist--v1.0.6, FastME--v2.1.5 and RAxML-NG--v0.5.1b.

Generates a matrix of pairwise genetic distances:
xsbatch -c 34 --mem-per-cpu 2000 -J Dist_Corr --time 3-00 -- "ngsDist --n_threads 34 --geno ~/data/Pigeons/PBGP/PBGP--Analyses/PBGP--ANGSDRuns/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.beagle.gz --pairwise_del --seed 33 --probs --n_ind 257 --n_sites 1997420 --labels ~/data/Pigeons/PBGP/PBGP--Analyses/Lists/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.labels --out ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/NJ/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.dist"
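The `--n_ind` and `--n_sites` values passed to ngsDist have to match the input file. As a rough sanity check, the sketch below derives both numbers from a tiny, made-up Beagle genotype-likelihood file. It assumes the usual Beagle layout (one header line, three leading columns, then three likelihood columns per individual); the file name and values are hypothetical, and `gzip -dc` is used in place of `zcat` only for portability.

```shell
tmp=$(mktemp -d)

# Toy Beagle file: marker, allele1, allele2, then 3 likelihood columns per individual.
printf 'marker\tallele1\tallele2\tInd0\tInd0\tInd0\tInd1\tInd1\tInd1\n' > "$tmp/toy.beagle"
printf 'chr1_10\t0\t1\t0.9\t0.1\t0.0\t0.3\t0.3\t0.4\n' >> "$tmp/toy.beagle"
printf 'chr1_20\t2\t3\t0.2\t0.6\t0.2\t0.8\t0.1\t0.1\n' >> "$tmp/toy.beagle"
gzip "$tmp/toy.beagle"

# Number of sites: every line except the header.
n_sites=$(gzip -dc "$tmp/toy.beagle.gz" | tail -n +2 | wc -l | tr -d ' ')

# Number of individuals: (total columns - 3 leading columns) / 3.
n_ind=$(gzip -dc "$tmp/toy.beagle.gz" | head -n 1 | awk -F'\t' '{print (NF-3)/3}')

echo "n_sites=$n_sites n_ind=$n_ind"
```

Run against the real .beagle.gz file, the same two commands should reproduce the values given to `--n_sites` and `--n_ind` above, provided the file follows the assumed layout.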
Tweaks the distance matrix created above by replacing every tab with a space:
perl -p -e 's/\t/ /g' ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/NJ/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.dist > ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/NJ/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra_Changed.dist
Generates the NJ phylogeny:
fastme -T 15 -i ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/NJ/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra_Changed.dist -s -o ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/NJ/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.nwk
Converts the .haplo file into a .fasta file using the tsv_merge.pl script:
xsbatch -c XXX --mem-per-cpu 95000 -J FASTA --time 2-00 -- "zcat ~/data/Pigeons/PBGP/PBGP--Analyses/PBGP--ANGSDRuns/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.haplo.gz | cut -f 4- | tail -n +2 | perl /groups/hologenomics/fgvieira/scripts/tsv_merge.pl --transp --ofs '' - | awk 'NR==FNR{id=$1; sub(".*\\/","",id); sub("\\..*","",id); x[FNR]=id} NR!=FNR{ print ">"x[FNR]"\n"$1}' ~/data/Pigeons/PBGP/PBGP--Analyses/Lists/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.labels - > ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/RAxML/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.fasta"
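The pipeline above can be hard to follow in a single line, so here is a minimal sketch of the same idea on toy data, run directly rather than through xsbatch. It assumes the .haplo file has a header line and three leading columns (chr, pos, major) followed by one column per individual, and that the labels file holds one path-like entry per individual. All file names and values are hypothetical. Because tsv_merge.pl is not reproduced here, a small awk transposition stands in for `--transp --ofs ''`; that is an assumption about what the script does, based on its flags.

```shell
tmp=$(mktemp -d)

# Hypothetical labels file: one path-like entry per individual, in column order.
printf '/path/to/Ind01.sorted.bam\n/path/to/Ind02.sorted.bam\n' > "$tmp/toy.labels"

# Hypothetical haplotype calls: chr, pos, major, then one column per individual.
printf 'chr\tpos\tmajor\tind0\tind1\n' > "$tmp/toy.haplo"
printf 'chr1\t10\tA\tA\tC\n' >> "$tmp/toy.haplo"
printf 'chr1\t20\tG\tG\tN\n' >> "$tmp/toy.haplo"
printf 'chr1\t30\tT\tT\tT\n' >> "$tmp/toy.haplo"

# Step 1: drop the three leading columns and the header line.
# Step 2: transpose, joining each individual's calls into one sequence
#         (awk stand-in for tsv_merge.pl --transp --ofs '').
# Step 3: read the labels first (NR==FNR), strip the directory and everything
#         after the first dot, then print ">ID" plus the sequence for each line of stdin.
fasta=$(
  cut -f 4- "$tmp/toy.haplo" | tail -n +2 |
  awk -F'\t' '{n=NF; for(i=1;i<=NF;i++) s[i]=s[i] $i} END{for(i=1;i<=n;i++) print s[i]}' |
  awk 'NR==FNR{id=$1; sub(".*\\/","",id); sub("\\..*","",id); x[FNR]=id} NR!=FNR{print ">"x[FNR]"\n"$1}' "$tmp/toy.labels" -
)

printf '%s\n' "$fasta"
```

The pairing relies purely on line order: the n-th label is attached to the n-th transposed sequence, so the labels file must list individuals in the same order as the columns of the .haplo file.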
Generates an ML phylogeny from the .fasta file created above, using the NJ phylogeny as the starting tree:
raxml-ng-mpi --threads XXX --search --model GTR+G --site-repeats on --msa ~/data/Pigeons/PBGP/PBGP--Analyses/PBGP--ANGSDRuns/PBGP--GoodSamples_WithWGSs--Article--Ultra.fasta --tree ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/NJ/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.nwk --prefix ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/RAxML/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.ngsDist
Bootstraps the ML phylogeny generated above:
raxml-ng-mpi --threads XXX --bootstrap --model GTR+G --bs-trees 100 --site-repeats on --msa ~/data/Pigeons/PBGP/PBGP--Analyses/PBGP--ANGSDRuns/PBGP--GoodSamples_WithWGSs--Article--Ultra.fasta --tree ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/RAxML/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.ngsDist.raxml.bestTree --prefix ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/RAxML/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.BOOT
Adds the bootstrap values to the generated ML phylogeny:
raxml-ng --threads XXX --support --tree ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/RAxML/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.ngsDist.raxml.bestTree --bs-trees ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/RAxML/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.BOOT.tree --prefix ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/RAxML/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.FINAL
These results were plotted using iTOL.
Converts the .bestTree file into a .tsv file using the tree2matrix.pl script:
perl $SCRIPTS/scripts/tree2matrix.pl -i ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/RAxML/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.ngsDist.raxml.bestTree | awk '{split($1,sp1,"[_-]"); split($2,sp2,"[_-]"); rel="Inter-breeds"; if(sp1[1]==sp2[1])rel="Intra-breeds"; if(sp1[1]sp1[2]==sp2[1]sp2[2])rel="Intra-replicates"; if(sp1[1]=="Crupestris" || sp2[1]=="Crupestris")rel="Inter-species"; if(NR==1)rel="Relatedness"; print $0"\t"rel}' > ~/data/Pigeons/PBGP/PBGP--Analyses/PBGP--GeneticDistances/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.ngsDist.raxml.bestTree.tsv
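The awk step labels each pair of samples by comparing the parts of their names. The sketch below runs the same awk program on a small, made-up table to show which label each kind of pair receives. The three-column layout (sample 1, sample 2, distance, with a header line) and the `Breed_ID-Type` sample names are assumptions for illustration; the real tree2matrix.pl output and sample names may differ.

```shell
tmp=$(mktemp -d)

# Hypothetical pairwise table: header line, then sample1, sample2, distance.
printf 'Ind1\tInd2\tDist\n' > "$tmp/toy.tsv"
printf 'Fantail_01-GBS\tFantail_01-WGS\t0.01\n' >> "$tmp/toy.tsv"
printf 'Fantail_01-GBS\tFantail_02-GBS\t0.05\n' >> "$tmp/toy.tsv"
printf 'Fantail_01-GBS\tTumbler_01-GBS\t0.10\n' >> "$tmp/toy.tsv"
printf 'Fantail_01-GBS\tCrupestris_01-WGS\t0.30\n' >> "$tmp/toy.tsv"

# Same awk program as above: split names on "_" or "-", then classify each pair.
# Later rules override earlier ones, so the order of the if statements matters.
awk '{split($1,sp1,"[_-]"); split($2,sp2,"[_-]"); rel="Inter-breeds"; if(sp1[1]==sp2[1])rel="Intra-breeds"; if(sp1[1]sp1[2]==sp2[1]sp2[2])rel="Intra-replicates"; if(sp1[1]=="Crupestris" || sp2[1]=="Crupestris")rel="Inter-species"; if(NR==1)rel="Relatedness"; print $0"\t"rel}' "$tmp/toy.tsv" > "$tmp/toy.labelled.tsv"

# Keep only the new fourth column for a quick look.
labels=$(cut -f 4 "$tmp/toy.labelled.tsv")
printf '%s\n' "$labels"
```

With these toy names, the first field is read as the breed and the first two fields together as the individual, so a GBS/WGS pair from the same bird is labelled Intra-replicates and any pair involving Crupestris is labelled Inter-species.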
These results were plotted using the Rscript below:
