16. Phylogenetic Reconstruction
George Pacheco edited this page Jun 21, 2022 · 3 revisions
Based on Dataset I, we reconstruct the phylogenetic relationships using ngsDist--v1.0.6, FastME--v2.1.5 and RAxML-NG--v0.5.1b.
xsbatch -c 34 --mem-per-cpu 2000 -J Dist_Corr --time 3-00 -- "ngsDist --n_threads 34 --geno ~/data/Pigeons/PBGP/PBGP--Analyses/PBGP--ANGSDRuns/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.beagle.gz --pairwise_del --seed 33 --probs --n_ind 257 --n_sites 1997420 --labels ~/data/Pigeons/PBGP/PBGP--Analyses/Lists/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.labels --out ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/NJ/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.dist"
perl -p -e 's/\t/ /g' ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/NJ/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.dist > ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/NJ/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra_Changed.dist
fastme -T 15 -i ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/NJ/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra_Changed.dist -s -o ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/NJ/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.nwk
Converts the .haplo file into a .fasta file using the tsv_merge.pl script:
xsbatch -c XXX --mem-per-cpu 95000 -J FASTA --time 2-00 -- "zcat ~/data/Pigeons/PBGP/PBGP--Analyses/PBGP--ANGSDRuns/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.haplo.gz | cut -f 4- | tail -n +2 | perl /groups/hologenomics/fgvieira/scripts/tsv_merge.pl --transp --ofs '' - | awk 'NR==FNR{id=$1; sub(".*\\/","",id); sub("\\..*","",id); x[FNR]=id} NR!=FNR{ print ">"x[FNR]"\n"$1}' ~/data/Pigeons/PBGP/PBGP--Analyses/Lists/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.labels - > ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/RAxML/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.fasta"
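The logic of the conversion above can be followed on a toy example. The sketch below assumes the .haplo table has three leading columns followed by one column per individual, and that the labels file lists one path per individual in the same order. Because tsv_merge.pl is not reproduced here, a plain awk transpose is used as a stand-in for `--transp --ofs ''`. All file names and sample names are hypothetical.

```shell
# Hypothetical scratch directory for the toy files.
tmp=$(mktemp -d)

# Toy haplotype table: chr, pos, major, then one column per individual.
printf 'chr\tpos\tmajor\tind0\tind1\n'  > "$tmp/toy.haplo"
printf 'c1\t10\tA\tA\tC\n'             >> "$tmp/toy.haplo"
printf 'c1\t20\tG\tG\tG\n'             >> "$tmp/toy.haplo"
printf 'c1\t30\tT\tN\tT\n'             >> "$tmp/toy.haplo"

# Toy labels file: one path per individual, in column order.
printf '/some/dir/SampleA.bam\n/some/dir/SampleB.bam\n' > "$tmp/toy.labels"

# Drop the three leading columns and the header, then transpose so that
# each individual becomes one line of concatenated bases
# (awk stand-in for tsv_merge.pl --transp --ofs '').
cut -f 4- "$tmp/toy.haplo" | tail -n +2 |
  awk -F'\t' '{ for (i = 1; i <= NF; i++) seq[i] = seq[i] $i; n = NF }
              END { for (i = 1; i <= n; i++) print seq[i] }' > "$tmp/toy.seqs"

# Pair each sequence with its label, stripping directory and extension.
toy_fasta=$(awk 'NR == FNR { id = $1; sub(/.*\//, "", id); sub(/\..*/, "", id); x[FNR] = id; next }
                 { print ">" x[FNR] "\n" $1 }' "$tmp/toy.labels" "$tmp/toy.seqs")

printf '%s\n' "$toy_fasta"
rm -r "$tmp"
```

Each individual ends up as one FASTA record whose header is the label's file name without its directory or extension.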
Generates an ML phylogeny from the .fasta file created above, using the NJ phylogeny as the starting tree, then computes 100 bootstrap replicates and maps their support onto the best ML tree:
raxml-ng-mpi --threads XXX --search --model GTR+G --site-repeats on --msa ~/data/Pigeons/PBGP/PBGP--Analyses/PBGP--ANGSDRuns/PBGP--GoodSamples_WithWGSs--Article--Ultra.fasta --tree ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/NJ/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.nwk --prefix ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/RAxML/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.ngsDist
raxml-ng-mpi --threads XXX --bootstrap --model GTR+G --bs-trees 100 --site-repeats on --msa ~/data/Pigeons/PBGP/PBGP--Analyses/PBGP--ANGSDRuns/PBGP--GoodSamples_WithWGSs--Article--Ultra.fasta --tree ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/RAxML/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.ngsDist.raxml.bestTree --prefix ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/RAxML/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.BOOT
raxml-ng --threads XXX --support --tree ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/RAxML/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.ngsDist.raxml.bestTree --bs-trees ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/RAxML/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.BOOT.tree --prefix ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/RAxML/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.FINAL
These results were plotted using iTOL.
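Since the RAxML-NG runs above take both an alignment and a starting tree, it can help to confirm beforehand that the two files carry the same taxon names. The following is a minimal sketch on toy files, not part of the original pipeline. It assumes a Newick tree without internal node labels, and every file name and sample name in it is hypothetical.

```shell
# Hypothetical scratch directory for the toy files.
tmp=$(mktemp -d)

# Toy alignment and starting tree.
printf '>SampleA\nAGN\n>SampleB\nCGT\n>SampleC\nAGT\n' > "$tmp/toy.fasta"
printf '((SampleA:0.1,SampleB:0.2):0.05,SampleC:0.3);\n' > "$tmp/toy.nwk"

# Taxon names from the FASTA headers.
grep '^>' "$tmp/toy.fasta" | sed 's/^>//' | sort > "$tmp/msa.names"

# Tip names from the Newick string: split on structural characters,
# drop branch lengths, keep the non-empty tokens.
tr '(),;' '\n\n\n\n' < "$tmp/toy.nwk" | sed 's/:.*//' | grep -v '^$' | sort > "$tmp/tree.names"

# Names present in only one of the two files; empty means they agree.
label_mismatch=$(comm -3 "$tmp/msa.names" "$tmp/tree.names")

if [ -z "$label_mismatch" ]; then
  echo "labels agree"
else
  printf 'mismatched labels:\n%s\n' "$label_mismatch"
fi
rm -r "$tmp"
```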
Converts the .bestTree file into a .tsv file using the tree2matrix.pl script:
perl $SCRIPTS/scripts/tree2matrix.pl -i ~/data/Pigeons/PBGP/PBGP--Analyses/Phylogenies/RAxML/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.ngsDist.raxml.bestTree | awk '{split($1,sp1,"[_-]"); split($2,sp2,"[_-]"); rel="Inter-breeds"; if(sp1[1]==sp2[1])rel="Intra-breeds"; if(sp1[1]sp1[2]==sp2[1]sp2[2])rel="Intra-replicates"; if(sp1[1]=="Crupestris" || sp2[1]=="Crupestris")rel="Inter-species"; if(NR==1)rel="Relatedness"; print $0"\t"rel}' > ~/data/Pigeons/PBGP/PBGP--Analyses/PBGP--GeneticDistances/PBGP--GoodSamples_WithAllWGS-GBSPairs--Article--Ultra.ngsDist.raxml.bestTree.tsv
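The relatedness labels assigned by the awk step above can be checked on a toy table. This sketch assumes tree2matrix.pl emits a header line followed by rows of two sample names and a distance; that layout is an assumption. The sample names (`BreedA_01-GBS` and so on) and the distances are hypothetical. The rules are the same as in the command above, with the concatenated comparison parenthesised for readability.

```shell
# Toy pairwise table (hypothetical names and distances).
toy_pairs=$(printf 'Ind1\tInd2\tDist\nBreedA_01-GBS\tBreedA_01-WGS\t0.01\nBreedA_01-GBS\tBreedA_02-GBS\t0.05\nBreedA_01-GBS\tBreedB_01-GBS\t0.10\nBreedA_01-GBS\tCrupestris_01-WGS\t0.30\n')

toy_labelled=$(printf '%s\n' "$toy_pairs" | awk '{
  # Split each sample name into breed, individual number and data type.
  split($1, sp1, "[_-]"); split($2, sp2, "[_-]")
  rel = "Inter-breeds"
  # Same breed prefix.
  if (sp1[1] == sp2[1]) rel = "Intra-breeds"
  # Same breed and same individual number (e.g. a WGS/GBS pair).
  if ((sp1[1] sp1[2]) == (sp2[1] sp2[2])) rel = "Intra-replicates"
  # Any comparison involving the outgroup species.
  if (sp1[1] == "Crupestris" || sp2[1] == "Crupestris") rel = "Inter-species"
  # The header row gets the column name.
  if (NR == 1) rel = "Relatedness"
  print $0 "\t" rel
}')

printf '%s\n' "$toy_labelled"
```

The order of the `if` statements matters, because each later rule overrides the earlier ones. As a result, any pair involving `Crupestris` is labelled `Inter-species` regardless of the other checks.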