SatuTe (Saturation Test) is a Python-based tool designed to evaluate the presence of phylogenetic information in analyses. Saturation occurs when multiple substitutions obscure true genetic distances, potentially leading to artifacts and errors in phylogenetic inference. SatuTe introduces a new measure that extends the concept of saturation between two sequences to a theory of saturation between subtrees. The test implemented in SatuTe quantifies whether a given alignment provides sufficient phylogenetic information shared between two subtrees connected by a branch in a phylogeny.
Using the output from SatuTe, you can perform various downstream analyses to gain deeper insights into the phylogenetic signal by addressing different questions.
The repository is organized as follows:
-
/example/: Contains small example datasets and the output generated by SatuTe. This folder allows you to follow along with the different types of analyses.
-
/scripts/: Includes all scripts required for the various analyses, such as per-category, sliding-window, and per-alignment-region analyses. Each type of analysis has its own subfolder, and the scripts can be run to generate examples within the /example/ folder. Note that installing SatuTe and IQ-TREE is not necessary to run these examples, as the required outputs are already provided.
-
/tree_of_life/: Contains the data and scripts used to generate the outputs presented in the associated paper. This includes detailed instructions and resources for replicating the findings.
Before running any scripts, ensure you have:
-
Python 3.10.12 or higher: Check your Python version with:
python3 --version
-
(Optional) Create a Virtual Environment:
python3 -m venv env
source env/bin/activate
-
Install Required Packages:
pip install -r requirements.txt
You're now ready to run the scripts!
If you are planning to run the Tree of Life analysis, you'll need these additional tools:
- SatuTe
- IQ-Tree2
-
Install pipx: If you don't have pipx installed, you can install it using pip:
pip install pipx
-
Ensure pipx is set up correctly:
pipx ensurepath
-
Install SatuTe using pipx: Once pipx is installed, you can use it to install SatuTe:
pipx install satute
-
Test the installation: After installation, verify that SatuTe is installed correctly by checking its version:
satute --version
For more detailed instructions and information about pipx, refer to the official pipx documentation.
-
Download IQ-Tree from the official website.
-
Follow the installation instructions provided on the website for your operating system.
-
Test the IQ-Tree installation: After installing IQ-Tree, verify the installation by checking its version:
iqtree2 --version
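If you prefer to check both tools from Python before launching the Tree of Life scripts, the two version checks above can be wrapped as in the sketch below. The helper name `tool_version` is hypothetical and not part of this repository; it only assumes that `satute` and `iqtree2` are the executable names on your PATH and that both accept `--version`, as shown above.

```python
import shutil
import subprocess


def tool_version(executable):
    """Return the first line printed by `<executable> --version`.

    Returns None if the executable is not on PATH or cannot be run,
    and an empty string if it runs but prints nothing.
    """
    path = shutil.which(executable)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    # Some tools print their version to stderr instead of stdout.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else ""


for tool in ("satute", "iqtree2"):
    version = tool_version(tool)
    print(f"{tool}: {version if version is not None else 'not found on PATH'}")
```

A result of `None` for either tool means the corresponding installation steps above still need to be completed.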
Using the output from SatuTe, you can perform various downstream analyses:
If an evolutionary model with rate heterogeneity is used, each site is assigned to the rate category with the highest posterior probability. For each category
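The assignment rule described above (each site goes to the category with the highest posterior probability) can be sketched as follows. This is a minimal illustration and not SatuTe's own implementation: the input layout `posteriors[site][category]`, the function name `assign_sites_to_categories`, and the numbers are all assumptions made for the example.

```python
from collections import defaultdict


def assign_sites_to_categories(posteriors):
    """Group site indices by their most probable rate category.

    posteriors[i][c] is the posterior probability that site i belongs to
    rate category c (hypothetical layout). Ties go to the lowest category
    index, because max() returns the first maximum it encounters.
    """
    sites_by_category = defaultdict(list)
    for site, probs in enumerate(posteriors):
        best = max(range(len(probs)), key=probs.__getitem__)
        sites_by_category[best].append(site)
    return dict(sites_by_category)


# Made-up posteriors for 4 sites and 3 rate categories.
example_posteriors = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
    [0.8, 0.1, 0.1],
]
for category, sites in sorted(assign_sites_to_categories(example_posteriors).items()):
    print(f"category {category}: sites {sites}")
```

Each resulting group of sites defines a sub-alignment that can then be analysed on its own.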
SatuTe also supports branch-specific analyses. To gain a more detailed understanding of changes in phylogenetic information, you can perform a sliding-window analysis with a specific window size. This approach is effective in detecting a minority of sites affected by saturation.
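The windowing itself can be sketched in a few lines. The example below is only an illustration of the idea, under stated assumptions: `per_site_values` stands for some per-site quantity for one branch, and the mean is a placeholder summary. It does not reproduce SatuTe's test statistic, and the function names are hypothetical.

```python
from statistics import mean


def sliding_windows(n_sites, window_size, step=1):
    """Yield half-open (start, end) site ranges of length window_size.

    An alignment shorter than the window yields one truncated range.
    With step > 1, trailing sites that do not fit a full window are skipped.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    if n_sites <= 0:
        return
    for start in range(0, max(n_sites - window_size, 0) + 1, step):
        yield start, min(start + window_size, n_sites)


def windowed_statistic(per_site_values, window_size, step=1, statistic=mean):
    """Apply `statistic` to each window of per-site values for one branch."""
    return [
        (start, end, statistic(per_site_values[start:end]))
        for start, end in sliding_windows(len(per_site_values), window_size, step)
    ]


# Made-up per-site values for a single branch.
values = [1, 2, 3, 4, 5]
for start, end, value in windowed_statistic(values, window_size=3, step=2):
    print(f"sites [{start}, {end}): {value}")
```

Smaller windows localize affected sites more precisely but leave fewer sites per window, so the window size is a trade-off to choose per dataset.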
When the alignment is composite (such as a concatenation of different genes, proteins, or other partitions), a key question is whether the selected alignment regions are phylogenetically informative within the reconstructed tree topology. A per-alignment-region analysis can help address this question.
By comparing the z-scores obtained from different branches, you can identify potential information loss and examine the differences. For instance, you might explore per-region z-score differences between an external branch and an internal branch. Beyond branch comparison, z-score differences can also help determine whether each region supports one of two given topologies.
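As a rough sketch of the branch comparison, the snippet below subtracts per-region z-scores of one branch from those of another. The region names, the z-score values, and the helper `zscore_differences` are made up for illustration; in practice the z-scores would be read from SatuTe's output for the two branches of interest.

```python
def zscore_differences(z_a, z_b):
    """Return z_a[region] - z_b[region] for regions present in both inputs."""
    return {region: z_a[region] - z_b[region] for region in z_a if region in z_b}


# Hypothetical per-region z-scores for an external and an internal branch.
external = {"gene1": 12.5, "gene2": 3.0, "gene3": 8.0}
internal = {"gene1": 10.0, "gene2": 4.5, "gene3": 8.0}

diffs = zscore_differences(external, internal)
for region, delta in diffs.items():
    # A positive value means the region has the higher z-score on the first branch.
    print(f"{region}: {delta:+.2f}")
```

The same subtraction applies when the two inputs are z-scores for the same region under two candidate topologies rather than two branches.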