Calculation
fas.run is the main script of FAS. It can be used directly with a seed and a query input in FASTA format. fas.run calculates the Feature Architecture Similarity between protein pairs in an all-vs-all manner (all proteins in the seed file against all proteins in the query file). This behavior can be changed with the options described under Advanced Usage.
FAS can be run by using the following command:
fas.run -q PATH/ortholog.fasta -s PATH/seed.fasta -a PATH/ANNOTATION -o PATH/FAS_OUT
The [-s|--seed] and [-q|--query] options designate the input files for the seed and the query. These files should be in FASTA format and can contain any number of sequences. The [-a|--annotation_dir] option designates a directory where the feature annotations for the seed and query will be stored. The annotation files are in JSON format and take the same name as the corresponding FASTA file (ortholog.json, seed.json). If an annotation already exists in this directory, FAS will use it by default and skip the annotation step. The [-o|--out_dir] option designates an output directory for FAS. By default, FAS creates three output files named seed_ortholog.tsv (main scores), seed_ortholog_forward.domains (domain information) and seed_ortholog_config.yml (information on the FAS run).
You can also give FAS a so-called reference proteome with [-r|--ref_proteome] to allow for non-uniform weighting of the features. We recommend using the proteome of the query species for this:
fas.run -q PATH/ortholog.fasta -s PATH/seed.fasta -r PATH/proteome.fasta -a PATH/ANNOTATION -o PATH/FAS_OUT
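The file naming described above can be sketched in a few lines of Python. This is only an illustration, not part of FAS: the helper name `expected_fas_files` is hypothetical, and it assumes that the annotation and output names are derived from the FASTA file names with the final extension removed, as the examples (seed.json, seed_ortholog.tsv) suggest.

```python
from pathlib import Path


def expected_fas_files(seed_fasta, query_fasta, annotation_dir, out_dir):
    """Return the file paths implied by the naming convention described above.

    Hypothetical helper, not part of FAS. Assumes names are built from the
    FASTA file names without their final extension.
    """
    seed_name = Path(seed_fasta).stem    # "seed" for PATH/seed.fasta
    query_name = Path(query_fasta).stem  # "ortholog" for PATH/ortholog.fasta
    annotation_dir = Path(annotation_dir)
    out_dir = Path(out_dir)
    prefix = f"{seed_name}_{query_name}"
    return {
        "seed_annotation": annotation_dir / f"{seed_name}.json",
        "query_annotation": annotation_dir / f"{query_name}.json",
        "scores": out_dir / f"{prefix}.tsv",
        "domains": out_dir / f"{prefix}_forward.domains",
        "config": out_dir / f"{prefix}_config.yml",
    }


files = expected_fas_files(
    "PATH/seed.fasta", "PATH/ortholog.fasta", "PATH/ANNOTATION", "PATH/FAS_OUT"
)
for label, path in files.items():
    # An annotation file that already exists would be reused by FAS by default.
    print(f"{label}: {path} (exists: {path.exists()})")
```

Checking the two annotation paths before a run shows whether FAS would reuse existing annotations or annotate the sequences again.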
If you want FAS to directly calculate the score in both directions, you can use the [--bidirectional] option:
fas.run -q PATH/ortholog.fasta -s PATH/seed.fasta -r PATH/proteome.fasta -a PATH/ANNOTATION -o PATH/FAS_OUT --bidirectional --ref_2 PATH/proteome2.fasta
The [--ref_2] option allows you to give a second proteome for weighting the reverse scores; otherwise, both scoring directions use the same reference.
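When FAS is called from a script, the commands above can be assembled programmatically. The following is a minimal sketch that uses only the options shown on this page. The function `build_fas_command` is a hypothetical wrapper, not part of FAS, and rejecting `ref_2` without `bidirectional` is a choice made for this sketch (the second reference only affects reverse scores), not documented FAS behavior.

```python
import shlex


def build_fas_command(query, seed, annotation_dir, out_dir,
                      ref_proteome=None, bidirectional=False, ref_2=None):
    """Build the fas.run argument list from the options described above.

    Hypothetical wrapper, not part of FAS.
    """
    cmd = ["fas.run", "-q", query, "-s", seed, "-a", annotation_dir, "-o", out_dir]
    if ref_proteome is not None:
        # Reference proteome for non-uniform feature weighting.
        cmd += ["-r", ref_proteome]
    if bidirectional:
        cmd.append("--bidirectional")
        if ref_2 is not None:
            # Second reference, used to weight the reverse scores.
            cmd += ["--ref_2", ref_2]
    elif ref_2 is not None:
        # Choice of this sketch: a second reference is only meaningful
        # when reverse scores are calculated.
        raise ValueError("ref_2 is only meaningful together with bidirectional=True")
    return cmd


cmd = build_fas_command(
    "PATH/ortholog.fasta", "PATH/seed.fasta", "PATH/ANNOTATION", "PATH/FAS_OUT",
    ref_proteome="PATH/proteome.fasta", bidirectional=True,
    ref_2="PATH/proteome2.fasta",
)
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would start the run, assuming fas.run is installed and available on the PATH.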