Skip to content

Scripts for homology and orthology assessment from genomic sequences.

License

Notifications You must be signed in to change notification settings

ballesterus/UPhO

Repository files navigation

UPhO

UPhO finds orthologs with and without inparalogs from input gene family trees. Refer to the Documentation.pdf for more detailed explanations on its usage, installation and dependencies. Type UPhO.py -h for help.

The only input requierement for UPhO is a tree (or trees) in NEWICK format in which the leaves are named with a species identifier, a field separator, and sequence identifier. By default, the field separator is the character "|" but custom delimiters can be defined. Examples of trees to test UPhO are provided in the TestData folder.

Additional scripts are provided for a variety of task including:

  • minreID.py Renames sequence identifiers adding species (OTU) name and field delimiters character.
  • blast_helper.sh Assists in all vs. all blastp search.
  • BlastResultCluster.py Clusters genes in gene families based on e values threshold and a minimum number of OTUs.
  • paMATRAX+.sh Wrapper of gnu-parallel mafft, trimAl and RAxML (or FastTree) for parallel estimation of phylogenetic trees.
  • UPhO.py The orthology evaluation tool.
  • UPhO_wt.py UPhO with an additional parameter to tolerate some (n) paralogous. Maybe useful in cases where few spurious or misplaced sequences discard a whole orthogroup. Also, this feature could be useful for rooting this orthobranch.
  • Get_Fasta_from_Ref.py Creates FASTA files from lists of sequence identifiers.
  • Al2phylo.py A simple script to prepare MSA for phylogenetic inference with sanitation and representative sequences options.
  • Consensus.py Finds conserved regions in MSA. Not quite useful for this pipeline... I might move it somewhere else or repurpose it.
  • Alistats.py Writes a simple report as (tsv) from input alignments, includind number of species, GC content, and gaps content.
  • distOrth.py Functions for annotating the distribution of orthologs on a tree.
  • distOrth_interactive.py interactive helper for distOrth.

    Each script has (or should have) its own -help flag for details on its usage.

    Disclaimer

    This software is experimental, in active development and comes without warranty. UPhO scripts were developed and tested using Python 2.7 on Linux (RHLE and Debian) and MacOS. Versions of these scripts using Python3 are being tested.

    Citation

    Ballesteros JA and Hormiga G. 2016. A new orthology assessment method for phylogenomic data: Unrooted Phylogenetic Orthology. Molecular Biology and Evolution, doi: 10.1093/molbev/msw069 [abstract](https://doi.org/10.1093/molbev/msw069)
  • About

    Scripts for homology and orthology assessment from genomic sequences.

    Resources

    License

    Stars

    Watchers

    Forks

    Releases

    No releases published

    Packages

    No packages published