GET_PHYLOMARKERS_v2.2.0_2024-04-14
This major release (v2.2.0, 2024-04-14) contains new features, significant code improvements, binary updates, and some bug fixes.
New features
- Significant extension of run mode 2 (run_get_phylomarkers_pipeline.sh -R 2) for population genetics (multiple sequences from the same species):
  - the FASTA files of non-recombinant, kdetrees-compliant, and neutral loci are saved in their own directory and concatenated
  - SNP sites are extracted from the concatenated alignment with snp-sites and saved in FASTA and VCF formats
  - ML trees are estimated from the SNP supermatrix using either IQ-TREE or FastTree
- run_get_phylomarkers_pipeline.sh now also calls the C binary WEIGHTED-ASTRAL (wASTRAL) to estimate a species tree, using as input the filtered gene trees estimated by iqtree2 or FastTree from the core-genome clusters computed by get_homologues:
  - run_ASTRAL computes the best-fitting model from protein or DNA concatenated alignments using IQ-TREE with the wASTRAL species tree as a constraint
  - run_ASTRAL calls compute_ASTRALspTree_branch_lenghts to estimate branch lengths from protein or DNA concatenated alignments using IQ-TREE with the wASTRAL species tree as a constraint
  - run_ASTRAL now calls both astral4 and wASTRAL from the ASTER package, using the wASTRAL species tree for the downstream analyses listed above
- The main script run_get_phylomarkers_pipeline.sh:
  - Adds complex protein mixture models for concatenated protein alignments
  - Prints the script invocation arguments to STDOUT at the beginning of the run
  - Collects the results of the different filtering steps and prints an overview of the pipeline's filtering process before exiting
  - Is fully shellcheck compliant
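The run_ASTRAL items above describe IQ-TREE runs constrained by the wASTRAL species tree. These release notes do not show the command line the pipeline actually uses, so the following is only a minimal sketch of how such a constrained run could be assembled. It assumes IQ-TREE 2's documented options -s (alignment), -g (topological constraint tree), -m MFP (ModelFinder followed by tree inference), and --prefix (output prefix). The helper name, file names, and defaults are hypothetical and are not taken from the pipeline.

```python
from typing import List


def build_constrained_iqtree_cmd(
    alignment: str,
    constraint_tree: str,
    prefix: str,
    model: str = "MFP",       # assumption: let ModelFinder pick the best-fitting model
    binary: str = "iqtree2",  # assumption: the binary is on PATH under this name
) -> List[str]:
    """Return an argv list for a constrained IQ-TREE 2 run (hypothetical helper)."""
    return [
        binary,
        "-s", alignment,        # concatenated protein or DNA alignment
        "-g", constraint_tree,  # species tree used as a topological constraint
        "-m", model,            # model selection / substitution model
        "--prefix", prefix,     # prefix for all output files
    ]


# Hypothetical file names, for illustration only.
cmd = build_constrained_iqtree_cmd(
    "concat_alignment.fna", "wastral_species_tree.nwk", "constrained_run"
)
print(" ".join(cmd))
```

The list returned by the helper can be passed to subprocess.run(cmd, check=True) once the input files exist. Building the argument list separately from execution keeps the sketch easy to inspect and to test without IQ-TREE installed.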
Updated and new external programs
- A static binary of ASTRAL-IV is used to estimate the concatenation-free species tree
- A static binary of WEIGHTED-ASTRAL is used to estimate the concatenation-free species tree
- snp-sites is now used under run mode 2 (run_get_phylomarkers_pipeline.sh -R 2)
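To make the run mode 2 workflow more concrete, here is a simplified sketch of the two operations it chains: concatenating per-locus alignments into a supermatrix and keeping only the variable (SNP) columns. In the pipeline this second step is done by snp-sites; the code below is not its implementation. Function names and the toy sequences are hypothetical, and the rule used here (a column counts as a SNP when it holds at least two distinct A/C/G/T states, ignoring gaps and ambiguity codes) is an assumption that may differ from snp-sites' defaults.

```python
from typing import Dict, List, Tuple

BASES = set("ACGT")  # assumption: only unambiguous bases count as states


def concatenate(alignments: List[Dict[str, str]]) -> Dict[str, str]:
    """Join per-locus alignments (same taxa in each) into one supermatrix."""
    taxa = sorted(alignments[0])
    for aln in alignments:
        if sorted(aln) != taxa:
            raise ValueError("all loci must contain the same taxa")
        if len({len(seq) for seq in aln.values()}) != 1:
            raise ValueError("sequences within a locus must have equal length")
    return {t: "".join(aln[t] for aln in alignments) for t in taxa}


def variable_sites(matrix: Dict[str, str]) -> Tuple[List[int], Dict[str, str]]:
    """Return 0-based positions of variable columns and the SNP-only alignment."""
    taxa = list(matrix)
    length = len(matrix[taxa[0]])
    positions = []
    for i in range(length):
        # Collect the distinct unambiguous bases observed in this column.
        states = {matrix[t][i].upper() for t in taxa} & BASES
        if len(states) > 1:
            positions.append(i)
    snps = {t: "".join(matrix[t][i] for i in positions) for t in taxa}
    return positions, snps


# Toy data, for illustration only.
locus1 = {"s1": "ATGC", "s2": "ATGA", "s3": "ATGC"}
locus2 = {"s1": "GG-T", "s2": "GGAT", "s3": "CGAT"}

supermatrix = concatenate([locus1, locus2])
positions, snps = variable_sites(supermatrix)
print(positions)
print(snps)
```

In this toy example the gapped column is not reported as a SNP because only one unambiguous base remains once the gap is ignored.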
Updated scripts, library code, and test files
- run_get_phylomarkers_pipeline.sh
- run_test_suite.sh
- lib/get_phylomarkers_fun_lib
- install_R_deps.R no longer installs R packages that are not used anymore
- test_get_phylomarkers.t now runs 24 tests
Docker image
The distribution contains the Dockerfile used to build the Docker image, which is ready to pull from Docker Hub. On Docker Hub, you will find detailed instructions on installing and configuring the Docker client on your machine, pulling the latest image, and running a containerized instance of the GET_PHYLOMARKERS pipeline.