diff --git a/_includes/_project-anvio-version-number-major.html b/_includes/_project-anvio-version-number-major.html
index c7930257..301160a9 100644
--- a/_includes/_project-anvio-version-number-major.html
+++ b/_includes/_project-anvio-version-number-major.html
@@ -1 +1 @@
-7
\ No newline at end of file
+8
\ No newline at end of file
diff --git a/_includes/_project-anvio-version-number.html b/_includes/_project-anvio-version-number.html
index 986084f3..301160a9 100644
--- a/_includes/_project-anvio-version-number.html
+++ b/_includes/_project-anvio-version-number.html
@@ -1 +1 @@
-7.1
\ No newline at end of file
+8
\ No newline at end of file
diff --git a/_includes/install/00_links_for_dev.html b/_includes/install/00_links_for_dev.html
new file mode 100644
index 00000000..43cea16d
--- /dev/null
+++ b/_includes/install/00_links_for_dev.html
@@ -0,0 +1,5 @@
+
+``` bash
+cat <<EOF >${CONDA_PREFIX}/etc/conda/activate.d/anvio.sh
+# creating an activation script for the conda environment for the anvi'o
+# development branch so (1) Python knows where to find anvi'o libraries,
+# (2) the shell knows where to find anvi'o programs, and (3) every time
+# the environment is activated it synchronizes with the latest code from
+# the active GitHub repository:
+export PYTHONPATH=\$PYTHONPATH:~/github/anvio/
+export PATH=\$PATH:~/github/anvio/bin:~/github/anvio/sandbox
+echo -e "\033[1;34mUpdating from anvi'o GitHub \033[0;31m(press CTRL+C to cancel)\033[0m ..."
+cd ~/github/anvio && git pull && cd -
+EOF
+```
+
+{:.warning}
+If you are using `zsh` by default, these commands may not work. If you run into trouble here, or especially if you figure out a way to make this work for both `zsh` and `bash`, please let us know. To make the above command work with `bash`, first run `exec bash`, then re-run the command above. To go back to `zsh`, run `exec zsh`.
+
+If everything worked, you should be able to type the following commands in a new terminal and see similar outputs:
+
+```
+meren ~ $ conda activate anvio-dev
+Updating from anvi'o GitHub (press CTRL+C to cancel) ...
+
+(anvio-dev) meren ~ $ which anvi-self-test
+/Users/meren/github/anvio/bin/anvi-self-test
+
+(anvio-dev) meren ~ $ anvi-self-test -v
+Anvi'o .......................................: hope (v7.1-dev)
+Python .......................................: 3.10.13
+
+Profile database .............................: 38
+Contigs database .............................: 21
+Pan database .................................: 16
+Genome data storage ..........................: 7
+Auxiliary data storage .......................: 2
+Structure database ...........................: 2
+Metabolic modules database ...................: 4
+tRNA-seq database ............................: 2
+
+(anvio-dev) meren ~ $
+```
+
+If that is the case, you're all set.
+
+Every change you make in the anvi'o codebase will immediately be reflected when you run anvi'o tools (but if you change the code and do not revert your changes, git will stop updating your branch from upstream).
+
+If you followed these instructions, every time you open a terminal you will have to run the following command to activate your anvi'o environment:
+
+```
+conda activate anvio-dev
+```
+
+If you are here, you can now jump to "[Check your anvi'o setup](#4-check-your-installation)" to see if things worked for you using `anvi-self-test`, but don't forget to take a look at the bonus chapter below, especially if you are using `bash`.
\ No newline at end of file
diff --git a/_includes/install/dev_mamba_packages.md b/_includes/install/dev_mamba_packages.md
new file mode 100644
index 00000000..e026a9a9
--- /dev/null
+++ b/_includes/install/dev_mamba_packages.md
@@ -0,0 +1,17 @@
+{:.notice}
+If the [mamba](https://github.com/mamba-org/mamba) installation somehow still doesn't work, that is OK. It is also OK if some of the commands below that start with `mamba` don't work. In either of these cases, you only need to replace every instance of `mamba` with `conda`, and everything should work smoothly (but with slightly longer wait times). But it would be extremely helpful to the community if you were to ping us on {% include _discord_invitation_button.html %} in the case of a `mamba` failure, so we can better understand under what circumstances this solution fails.
+
+Install all the necessary packages:
+
+``` bash
+mamba install -y -c conda-forge -c bioconda python=3.10 \
+ sqlite prodigal idba mcl muscle=3.8.1551 famsa hmmer diamond \
+ blast megahit spades bowtie2 bwa graphviz "samtools>=1.9" \
+ trimal iqtree trnascan-se fasttree vmatch r-base r-tidyverse \
+ r-optparse r-stringi r-magrittr bioconductor-qvalue meme
+
+# try this, if it doesn't install, don't worry (it is sad, but OK):
+mamba install -y -c bioconda fastani
+```
+
+Now you are ready for the code.
\ No newline at end of file
diff --git a/_includes/install/dev_python_dependencies.md b/_includes/install/dev_python_dependencies.md
new file mode 100644
index 00000000..418c3537
--- /dev/null
+++ b/_includes/install/dev_python_dependencies.md
@@ -0,0 +1,9 @@
+To install the Python dependencies of anvi'o please run the following command:
+
+``` bash
+cd ~/github/anvio/
+pip install -r requirements.txt
+```
+
+{:.warning}
+If `pysam` is causing you trouble during this step, you may want to try to install it with conda first by running `mamba install -y -c bioconda pysam` and then try the `pip` install command again.
diff --git a/_includes/install/dev_python_dependencies_conclusion.md b/_includes/install/dev_python_dependencies_conclusion.md
new file mode 100644
index 00000000..21096752
--- /dev/null
+++ b/_includes/install/dev_python_dependencies_conclusion.md
@@ -0,0 +1 @@
+Now you have the latest copy of the anvi'o codebase, and all of its dependencies are in place.
\ No newline at end of file
diff --git a/_includes/install/dev_python_version_warning.md b/_includes/install/dev_python_version_warning.md
new file mode 100644
index 00000000..8b881111
--- /dev/null
+++ b/_includes/install/dev_python_version_warning.md
@@ -0,0 +1,2 @@
+{:.warning}
+**Please note that we recently switched from Python 3.7 to Python 3.10 in our active development branch**. Thus, the way we set up the conda environment for the active development branch now differs from the way we do it for the latest stable version. There may be hiccups since these changes required many adjustments in the anvi'o code, and likely some bugs were missed. If you are reading these lines, please keep us posted if you run into an issue.
\ No newline at end of file
diff --git a/_includes/install/environment_setup_initial.md b/_includes/install/environment_setup_initial.md
new file mode 100644
index 00000000..e428721d
--- /dev/null
+++ b/_includes/install/environment_setup_initial.md
@@ -0,0 +1,20 @@
+{:.notice}
+It is a good idea to **make sure you are not already in a conda environment** before you run the following steps. Just to be clear, you can indeed install anvi'o in an existing conda environment, but if things go wrong, we kindly ask you to refer to meditation for help, rather than [anvi'o community resources](https://merenlab.org/2019/10/07/getting-help/). You can see which environments you have on your computer, and whether you are already in one of them in your current terminal, by running `conda env list`. **If all this is too much for you and all you want to do is to move on with the installation**, simply do this: open a new terminal, run `conda deactivate`, and continue with the rest of the text.
+
+First, a new conda environment:
+
+``` bash
+conda create -y --name anvio-8 python=3.10
+```
+
+And activate it:
+
+```
+conda activate anvio-8
+```
+
+Install `mamba` for fast dependency resolving:
+
+```
+conda install -y -c conda-forge mamba
+```
\ No newline at end of file
diff --git a/_includes/install/install_anvio.md b/_includes/install/install_anvio.md
new file mode 100644
index 00000000..4369e33e
--- /dev/null
+++ b/_includes/install/install_anvio.md
@@ -0,0 +1,16 @@
+Here you will first download the Python source package for the official anvi'o release:
+
+```
+curl -L https://github.com/merenlab/anvio/releases/download/v8/anvio-8.tar.gz \
+ --output anvio-8.tar.gz
+```
+
+And install it using `pip` like a boss:
+
+```
+pip install anvio-8.tar.gz
+```
+
+**If you don't see any error messages**, then you are probably golden and can move on to testing your anvi'o setup in the section "[Check your installation](#6-check-your-installation)" :)
+
+**If you do see error messages**, please know that you are not alone. We are as frustrated as you are. Please take a look at the problems people have reported and try these solutions, which will most likely address your issues. Common issues can be found on this page in the next section.
\ No newline at end of file
diff --git a/_includes/install/interactive_interface_windows.md b/_includes/install/interactive_interface_windows.md
new file mode 100644
index 00000000..4b439152
--- /dev/null
+++ b/_includes/install/interactive_interface_windows.md
@@ -0,0 +1,11 @@
+When using WSL, you need to add `-I localhost` to any interactive interface commands.
+
+Here is an example with `anvi-interactive`:
+```
+anvi-interactive -c CONTIGS.db -p MERGED/PROFILE.db -I localhost
+```
+The link for the interactive interface should look like this (with default port):
+```
+http://localhost:8080
+```
+
diff --git a/_includes/install/other_options.md b/_includes/install/other_options.md
new file mode 100644
index 00000000..fdf9c41c
--- /dev/null
+++ b/_includes/install/other_options.md
@@ -0,0 +1,7 @@
+You will always find the official archives of anvi'o code at the bottom of our GitHub releases as `anvio-X.tar.gz`:
+
+[https://github.com/merenlab/anvio/releases/latest](https://github.com/merenlab/anvio/releases/latest)
+
+The best way to see what additional software you will need running on your computer for anvi'o to be happy is to take a look at the contents of [this conda recipe](https://github.com/merenlab/anvio/blob/master/conda-recipe/anvio/meta.yaml). It is a conda build recipe, but it will give you the idea. You can ignore `anvio-minimal`, since you basically have that one taken care of when you have anvi'o installed.
+
+Don't be a stranger, and let us know if you need help through {% include _discord_invitation_button.html %}.
diff --git a/_includes/install/things_you_need_linux.md b/_includes/install/things_you_need_linux.md
new file mode 100644
index 00000000..6ded9ab5
--- /dev/null
+++ b/_includes/install/things_you_need_linux.md
@@ -0,0 +1,3 @@
+You will need to run the installation commands from a terminal. Since your system is using Linux, you should be good to go. :)
+
+You also need [miniconda](https://docs.conda.io/en/latest/miniconda.html) to be installed on your system. If you don't already have it, please follow their installation instructions.
\ No newline at end of file
diff --git a/_includes/install/things_you_need_macos.md b/_includes/install/things_you_need_macos.md
new file mode 100644
index 00000000..b4df9d8c
--- /dev/null
+++ b/_includes/install/things_you_need_macos.md
@@ -0,0 +1,5 @@
+You will need to run the installation commands from a terminal. Mac OSX comes with a basic Terminal application, or you can download and use a fancier one (such as [iTerm](https://www.iterm2.com/)).
+
+Some of the packages we use need compiling during the installation process, so you should also make sure that you have [Xcode Command Line Tools](https://mac.install.guide/commandlinetools/index.html) installed and up-to-date. Here is a [quick link to their installation instructions](https://mac.install.guide/commandlinetools/4.html). If you have to re-install the Command Line Tools, please remember to close your terminal window and open a new one before continuing with the anvi'o installation (a big thank you to [Hilary Morrison](https://www.mbl.edu/research/faculty-and-whitman-scientists/Hilary%20Morrison) for that tip).
+
+You also need [miniconda](https://docs.conda.io/en/latest/miniconda.html) to be installed on your system. If you don't already have it, please follow their installation instructions.
\ No newline at end of file
diff --git a/_includes/install/things_you_need_windows.md b/_includes/install/things_you_need_windows.md
new file mode 100644
index 00000000..4b1bee53
--- /dev/null
+++ b/_includes/install/things_you_need_windows.md
@@ -0,0 +1,5 @@
+Although anvi'o is developed on and rigorously tested for Linux and Mac OSX, you will be able to use it on Microsoft Windows **if and only if you first install the [Windows Subsystem for Linux](https://docs.microsoft.com/en-us/windows/wsl/install-win10)** (WSL). Our users have reported success stories with Ubuntu on WSL.
+
+Once WSL is installed, you should open the WSL terminal. You will run all of the remaining installation instructions within that terminal (and henceforth, whenever we refer to the 'terminal', we mean the WSL terminal).
+
+You also need [miniconda](https://docs.conda.io/en/latest/miniconda.html) to be installed on your system. Remember that you need to install it within WSL, so you need the Linux version. We recommend running their command line installation instructions within the WSL terminal.
\ No newline at end of file
diff --git a/_sass/_global.scss b/_sass/_global.scss
index 6cdd9dc9..f0564c91 100755
--- a/_sass/_global.scss
+++ b/_sass/_global.scss
@@ -364,3 +364,18 @@ blockquote em{
width: 100%;
height: 100%;
}
+
+.power-emoji::before {
+ content: "💪";
+ position: relative;
+ background-color: white;
+ margin-left: -15px;
+ border: 2px solid #73AD21;
+ border-radius: 50%;
+ border-color: #2aa883;
+ padding: 5px;
+ }
+
+.emoji-icon {
+ font-size: 10px;
+}
diff --git a/help/8/artifacts/aa-frequencies-txt/index.md b/help/8/artifacts/aa-frequencies-txt/index.md
new file mode 100644
index 00000000..dec3721d
--- /dev/null
+++ b/help/8/artifacts/aa-frequencies-txt/index.md
@@ -0,0 +1,55 @@
+---
+layout: artifact
+title: aa-frequencies-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/aa-frequencies-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-get-aa-counts](../../programs/anvi-get-aa-counts) [anvi-get-codon-frequencies](../../programs/anvi-get-codon-frequencies)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This file contains **the frequency of each amino acid for some reference context in your [contigs-db](/help/8/artifacts/contigs-db)**.
+
+This is a tab-delimited table where each column represents an amino acid and each row represents a specific reference context (most often this will be a gene after running [anvi-get-codon-frequencies](/help/8/programs/anvi-get-codon-frequencies)). The numbers will refer either to counts of each amino acid or to percent normalizations, depending on the parameters with which you ran [anvi-get-codon-frequencies](/help/8/programs/anvi-get-codon-frequencies).
+
+You can also use [anvi-get-aa-counts](/help/8/programs/anvi-get-aa-counts) to get this information for a [bin](/help/8/artifacts/bin), [collection](/help/8/artifacts/collection), or [splits-txt](/help/8/artifacts/splits-txt).
+
+### Example
+
+ gene_caller_id Ala Arg Thr Asp ...
+ 1 0 0 1 2
+ 2 1 0 0 2
+ .
+ .
+ .
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/aa-frequencies-txt.md) to update this information.
+
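A minimal sketch of how a table laid out like the `aa-frequencies-txt` example above could be read and turned from raw counts into per-row percentages. It assumes the file is tab-delimited with `gene_caller_id` as the first column; the function name `percent_normalize` and the inline sample data are hypothetical, and this is not necessarily how anvi'o itself computes its percent normalization.

```python
import csv
import io


def percent_normalize(handle):
    """Read a tab-delimited amino acid counts table and return per-row percentages."""
    reader = csv.DictReader(handle, delimiter="\t")
    result = {}
    for row in reader:
        gene_id = row.pop("gene_caller_id")
        counts = {aa: int(value) for aa, value in row.items()}
        total = sum(counts.values())
        # Rows without any counts are reported as all zeros to avoid dividing by zero.
        result[gene_id] = {
            aa: (100.0 * n / total if total else 0.0) for aa, n in counts.items()
        }
    return result


# Hypothetical data with the same layout as the example in the description.
sample = "gene_caller_id\tAla\tArg\tThr\tAsp\n1\t0\t0\t1\t2\n2\t1\t0\t0\t2\n"
percentages = percent_normalize(io.StringIO(sample))
```

With a real file, the `io.StringIO` object would be replaced by an open file handle.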
diff --git a/help/8/artifacts/augustus-gene-calls/index.md b/help/8/artifacts/augustus-gene-calls/index.md
new file mode 100644
index 00000000..2739efff
--- /dev/null
+++ b/help/8/artifacts/augustus-gene-calls/index.md
@@ -0,0 +1,97 @@
+---
+layout: artifact
+title: augustus-gene-calls
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/augustus-gene-calls
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-script-augustus-output-to-external-gene-calls](../../programs/anvi-script-augustus-output-to-external-gene-calls)
+
+
+## Description
+
+A gene call file from [AUGUSTUS](http://bioinf.uni-greifswald.de/augustus/).
+
+[AUGUSTUS](http://bioinf.uni-greifswald.de/augustus/) is a tool to predict genes in a variety of eukaryotic genomes. This includes predicting the 5' UTR and 3' UTR, as well as introns. You can search a sequence in the [AUGUSTUS web interface](http://bioinf.uni-greifswald.de/augustus/submission.php). After a search, you can export the results as a `.gff` text file.
+
+{:.notice}
+As of now, Anvi'o (specifically [anvi-script-augustus-output-to-external-gene-calls](/help/8/programs/anvi-script-augustus-output-to-external-gene-calls)) is only tested with AUGUSTUS v3.3.3. Feel free to be adventurous and try other versions if you feel so inclined.
+
+You can convert this file into an anvi'o [external-gene-calls](/help/8/artifacts/external-gene-calls) file using [anvi-script-augustus-output-to-external-gene-calls](/help/8/programs/anvi-script-augustus-output-to-external-gene-calls).
+
+Here is an example of a `.gff` file for the [Homo sapiens RNAP III subunit D sequence](https://www.ncbi.nlm.nih.gov/nuccore/NM_001722.3?report=fasta):
+
+ # This output was generated with AUGUSTUS (version 3.3.3).
+ # AUGUSTUS is a gene prediction tool written by M. Stanke (mario.stanke@uni-greifswald.de),
+    # O. Keller, S. König, L. Gerischer, L. Romoth and Katharina Hoff.
+ # Please cite: Mario Stanke, Mark Diekhans, Robert Baertsch, David Haussler (2008),
+ # Using native and syntenically mapped cDNA alignments to improve de novo gene finding
+ # Bioinformatics 24: 637-644, doi 10.1093/bioinformatics/btn013
+ # No extrinsic information on sequences given.
+ # Initializing the parameters using config directory /data/www/augustus/augustus/config/ ...
+ # human version. Using default transition matrix.
+ # Looks like /data/www/augustus/webservice/data/AUG-707407769/input.fa is in fasta format.
+ # We have hints for 0 sequences and for 0 of the sequences in the input set.
+ #
+ # ----- prediction on sequence number 1 (length = 5336, name = unnamed-1) -----
+ #
+ # Predicted genes for sequence number 1 on both strands
+ # start gene g1
+ unnamed-1 AUGUSTUS gene 57 1253 1 + . g1
+ unnamed-1 AUGUSTUS transcript 57 1253 1 + . g1.t1
+ unnamed-1 AUGUSTUS start_codon 57 59 . + 0 transcript_id "g1.t1"; gene_id "g1";
+ unnamed-1 AUGUSTUS single 57 1253 1 + 0 transcript_id "g1.t1"; gene_id "g1";
+ unnamed-1 AUGUSTUS CDS 57 1253 1 + 0 transcript_id "g1.t1"; gene_id "g1";
+ unnamed-1 AUGUSTUS stop_codon 1251 1253 . + 0 transcript_id "g1.t1"; gene_id "g1";
+ # coding sequence = [atgtcggaaggaaacgccgccggcgagcccagcacgccgggagggccccgacctctcctgactggggcccgggggctca
+ # tcgggcggcggccggcgcctcccctcacccccggccgccttccctccatccgttccagggacctcaccctcgggggagtcaagaagaaaaccttcacc
+ # ccaaatatcatcagtcggaagatcaaggaagagcccaaggaagaagtaactgtcaagaaggagaagcgtgaaagggacagagaccgacaacgagaggg
+ # gcatggacgagggcgaggccgtccagaagtgatccagtctcactccatctttgagcagggcccagctgaaatgatgaagaaaaaagggaactgggata
+ # agacagtggatgtgtcagacatgggaccttctcatatcatcaacatcaaaaaagagaagagagagacagacgaagaaactaaacagatcttgcgtatg
+ # ctggagaaggacgatttcctcgatgaccccggcctgaggaacgacactcgaaatatgcctgtgcagctgccgctggctcactcaggatggctttttaa
+ # ggaagaaaatgacgaaccagatgttaaaccttggctggctggccccaaggaagaggacatggaggtggacatacctgctgtgaaagtgaaagaggagc
+ # cacgagatgaggaggaagaggccaagatgaaggctcctcccaaagcagccaggaagactccaggcctcccgaaggatgtatctgtggcagagctgctg
+ # agggagctgagcctcaccaaggaagaggaactgctgtttctgcagctgccagacaccctccctggccagccacccacccaggacatcaagcctatcaa
+ # gacagaggtgcagggcgaggacggacaggtggtgctcatcaagcaggagaaagaccgagaagccaaattggcagagaatgcttgtaccctggctgacc
+ # tgacagagggtcaggttggcaagctactcatccgcaagtctggaagggtgcaactcctcttgggcaaggtgactctggacgtgaccatgggaactgcc
+ # tgctccttcctgcaggagctggtgtccgtgggccttggagacagtaggacaggggagatgacagtcctgggacacgtgaagcacaaacttgtatgttc
+ # ccctgattttgaatccctcttggatcacaaacaccggtaa]
+ # protein sequence = [MSEGNAAGEPSTPGGPRPLLTGARGLIGRRPAPPLTPGRLPSIRSRDLTLGGVKKKTFTPNIISRKIKEEPKEEVTVK
+ # KEKRERDRDRQREGHGRGRGRPEVIQSHSIFEQGPAEMMKKKGNWDKTVDVSDMGPSHIINIKKEKRETDEETKQILRMLEKDDFLDDPGLRNDTRNM
+ # PVQLPLAHSGWLFKEENDEPDVKPWLAGPKEEDMEVDIPAVKVKEEPRDEEEEAKMKAPPKAARKTPGLPKDVSVAELLRELSLTKEEELLFLQLPDT
+ # LPGQPPTQDIKPIKTEVQGEDGQVVLIKQEKDREAKLAENACTLADLTEGQVGKLLIRKSGRVQLLLGKVTLDVTMGTACSFLQELVSVGLGDSRTGE
+ # MTVLGHVKHKLVCSPDFESLLDHKHR]
+ # end gene g1
+ ###
+ # command line:
+ # /data/www/augustus/augustus/bin/augustus --species=human --strand=both --singlestrand=false --genemodel=partial --codingseq=on --sample=100 --keep_viterbi=true --alternatives-from-sampling=true --minexonintronprob=0.2 --minmeanexonintronprob=0.5 --maxtracks=2 /data/www/augustus/webservice/data/AUG-707407769/input.fa --exonnames=on
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/augustus-gene-calls.md) to update this information.
+
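A minimal sketch of how the CDS entries in a file like the `augustus-gene-calls` example above could be pulled out with plain Python. It assumes the fields are tab-separated, as in standard GFF, and that the ninth field carries a `gene_id "..."` attribute as in the example. The function name `cds_records` and the inline sample lines are hypothetical, and this is not the implementation used by `anvi-script-augustus-output-to-external-gene-calls`. Coordinates are kept exactly as they appear in the file.

```python
import re


def cds_records(lines):
    """Yield one dictionary per CDS line of a GFF-style AUGUSTUS output."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comment lines and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "CDS":
            continue  # keep only complete CDS rows
        match = re.search(r'gene_id "([^"]+)"', fields[8])
        yield {
            "contig": fields[0],
            "start": int(fields[3]),
            "stop": int(fields[4]),
            "strand": fields[6],
            "gene_id": match.group(1) if match else None,
        }


# Hypothetical lines that mirror the layout of the example in the description.
gff_lines = [
    "# start gene g1\n",
    "unnamed-1\tAUGUSTUS\tgene\t57\t1253\t1\t+\t.\tg1\n",
    'unnamed-1\tAUGUSTUS\tCDS\t57\t1253\t1\t+\t0\ttranscript_id "g1.t1"; gene_id "g1";\n',
]
records = list(cds_records(gff_lines))
```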
diff --git a/help/8/artifacts/bam-file/index.md b/help/8/artifacts/bam-file/index.md
new file mode 100644
index 00000000..42bf710f
--- /dev/null
+++ b/help/8/artifacts/bam-file/index.md
@@ -0,0 +1,49 @@
+---
+layout: artifact
+title: bam-file
+excerpt: A BAM-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/bam-file
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A BAM-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-init-bam](../../programs/anvi-init-bam)
+
+
+## Required or used by
+
+
+[anvi-get-short-reads-from-bam](../../programs/anvi-get-short-reads-from-bam) [anvi-get-short-reads-mapping-to-a-gene](../../programs/anvi-get-short-reads-mapping-to-a-gene) [anvi-get-tlen-dist-from-bam](../../programs/anvi-get-tlen-dist-from-bam) [anvi-profile](../../programs/anvi-profile) [anvi-profile-blitz](../../programs/anvi-profile-blitz) [anvi-report-linkmers](../../programs/anvi-report-linkmers) [anvi-script-get-coverage-from-bam](../../programs/anvi-script-get-coverage-from-bam)
+
+
+## Description
+
+A BAM file contains **already aligned sequence data.** It is written in a binary format to save space (so it will look like gibberish if you open it).
+
+BAM files (and their text file cousin SAM files) are often used in 'omics analysis and are described in more detail in [this file](https://samtools.github.io/hts-specs/SAMv1.pdf), written by the developers of samtools.
+
+If your BAM file is not indexed, it is actually a [raw-bam-file](/help/8/artifacts/raw-bam-file) and you can run [anvi-init-bam](/help/8/programs/anvi-init-bam) to turn it into a BAM file. Your BAM file is indexed if the folder that contains your `XXXX.bam` file also contains a file named `XXXX.bam.bai`.
+
+As of now, no anvi'o programs will output results in BAM format, so you'll primarily use BAM files to import sequence data into anvi'o. For example, in [anvi-profile](/help/8/programs/anvi-profile) (which generates a [profile-db](/help/8/artifacts/profile-db)), your BAM file is expected to contain the aligned short reads from your samples.
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/bam-file.md) to update this information.
+
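A minimal sketch of the index check described for `bam-file` above: a BAM file is treated as indexed when a file named `XXXX.bam.bai` sits next to `XXXX.bam`. The function name `is_indexed` and the temporary file names are hypothetical. The check only looks at file names and does not validate the contents of either file.

```python
import tempfile
from pathlib import Path


def is_indexed(bam_path):
    """Return True if both the BAM file and its sibling .bai index file exist."""
    bam = Path(bam_path)
    index = bam.with_name(bam.name + ".bai")  # XXXX.bam -> XXXX.bam.bai
    return bam.is_file() and index.is_file()


# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    bam = Path(tmp) / "SAMPLE-01.bam"
    bam.touch()
    before = is_indexed(bam)  # no index file yet
    (Path(tmp) / "SAMPLE-01.bam.bai").touch()
    after = is_indexed(bam)  # index file now present
```

If the function returns `False` for a real file, the text above suggests treating it as a `raw-bam-file` and running `anvi-init-bam` on it.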
diff --git a/help/8/artifacts/bam-stats-txt/index.md b/help/8/artifacts/bam-stats-txt/index.md
new file mode 100644
index 00000000..16928911
--- /dev/null
+++ b/help/8/artifacts/bam-stats-txt/index.md
@@ -0,0 +1,212 @@
+---
+layout: artifact
+title: bam-stats-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/bam-stats-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-profile-blitz](../../programs/anvi-profile-blitz)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+A collection of TAB-delimited text files generated from the profiling of BAM files.
+
+## Example outputs
+
+The number of columns and their content for files that are considered artifact [bam-stats-txt](/help/8/artifacts/bam-stats-txt) will be variable and depend on the user parameters set for [anvi-profile-blitz](/help/8/programs/anvi-profile-blitz).
+
+The column names may be one of these:
+
+* `gene_callers_id`: Unique number assigned by the gene caller during the creation of the [contigs-db](/help/8/artifacts/contigs-db).
+* `contig`: Contig name as it appears in the [bam-file](/help/8/artifacts/bam-file) and [contigs-db](/help/8/artifacts/contigs-db).
+* `sample`: The name of the [bam-file](/help/8/artifacts/bam-file) without its path and extension. I.e., the value *SAMPLE-01* will appear in the *sample* column if the BAM file path was */path/to/SAMPLE-01.bam*.
+* `length`: Depending on the context, the length of the gene or contig.
+* `num_mapped_reads`: The actual number of short reads mapping to a gene or contig in a given sample. Useful for those who wish to do TPM/RPKM normalizations.
+* `detection`: Proportion of nucleotides that have at least 1X coverage.
+* `mean_cov`: Mean coverage.
+* `q2q3_cov`: Mean coverage of the inner quartiles (Q2 and Q3).
+* `median_cov`: Median coverage.
+* `min_cov`: Minimum coverage value observed for the gene or the contig.
+* `max_cov`: Maximum coverage value observed for the gene or the contig.
+* `std_cov`: Standard deviation of coverage.
+
+### Contig mode, default output
+
+12-column TAB delimited file, where each row represents a single contig x sample pair (so the values in the first column are not unique):
+
+|**contig**|**sample**|**length**|**gc_content**|**num_mapped_reads**|**detection**|**mean_cov**|**q2q3_cov**|**median_cov**|**min_cov**|**max_cov**|**std_cov**|
+|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|
+|contig_878|SAMPLE-01|27538|0.608|11877|0.9995|63.61|65.2|65.0|0|107|15.21|
+|contig_6515|SAMPLE-01|12315|0.446|7669|0.9985|91.51|92.5|92.0|0|195|23.98|
+|contig_1720|SAMPLE-01|16856|0.312|4237|0.9993|37.85|38.25|38.0|0|56|7.961|
+|contig_878|SAMPLE-02|27538|0.608|1594|0.9999|9.262|9.161|9.0|0|21|3.42|
+|contig_6515|SAMPLE-02|12315|0.446|2562|0.9918|33.05|33.47|33.0|0|56|8.503|
+|contig_1720|SAMPLE-02|16856|0.312|926|0.9986|8.93|8.751|9.0|0|19|3.306|
+|contig_878|SAMPLE-03|27538|0.608|6395|1.0|37.32|37.21|37.0|0|75|11.46|
+|contig_6515|SAMPLE-03|12315|0.446|300|0.9276|3.953|3.682|4.0|0|15|2.644|
+|contig_1720|SAMPLE-03|16856|0.312|18175|1.0|178.1|178.1|178.0|1|269|29.13|
+
+### Contig mode, minimal output
+
+7-column TAB delimited file, where each row represents a single contig x sample pair:
+
+
+|**contig**|**sample**|**length**|**gc_content**|**num_mapped_reads**|**detection**|**mean_cov**|
+|:--|:--|:--|:--|:--|:--|:--|
+|contig_878|SAMPLE-01|27538|0.608|11877|0.9995|63.61|
+|contig_6515|SAMPLE-01|12315|0.446|7669|0.9985|91.51|
+|contig_1720|SAMPLE-01|16856|0.312|4237|0.9993|37.85|
+|contig_878|SAMPLE-02|27538|0.608|1594|0.9999|9.262|
+|contig_6515|SAMPLE-02|12315|0.446|2562|0.9918|33.05|
+|contig_1720|SAMPLE-02|16856|0.312|926|0.9986|8.93|
+|contig_878|SAMPLE-03|27538|0.608|6395|1.0|37.32|
+|contig_6515|SAMPLE-03|12315|0.446|300|0.9276|3.953|
+|contig_1720|SAMPLE-03|16856|0.312|18175|1.0|178.1|
+
+### Gene mode, default output
+
+12-column TAB delimited file, where each row represents a single gene x sample pair:
+
+|**gene_callers_id**|**contig**|**sample**|**length**|**num_mapped_reads**|**detection**|**mean_cov**|**q2q3_cov**|**median_cov**|**min_cov**|**max_cov**|**std_cov**|
+|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|
+|0|contig_878|SAMPLE-01|933|385|0.9871|53.97|58.9|59.0|0|85|21.18|
+|1|contig_878|SAMPLE-01|564|326|1.0|66.97|66.88|66.0|35|103|20.6|
+|2|contig_878|SAMPLE-01|444|318|1.0|81.72|81.59|82.0|70|95|4.88|
+|3|contig_878|SAMPLE-01|1218|522|1.0|54.45|57.1|58.0|15|87|17.8|
+|4|contig_878|SAMPLE-01|3381|1476|1.0|60.89|60.96|61.0|19|95|13.66|
+|5|contig_878|SAMPLE-01|942|472|1.0|64.34|63.98|63.0|38|92|10.49|
+|6|contig_878|SAMPLE-01|588|320|1.0|67.51|66.18|66.0|51|92|9.591|
+|7|contig_878|SAMPLE-01|1854|852|1.0|62.63|63.03|63.0|31|85|10.14|
+|8|contig_878|SAMPLE-01|285|195|1.0|67.43|68.41|70.0|51|80|8.741|
+|9|contig_878|SAMPLE-01|1215|567|1.0|60.96|63.68|64.0|16|83|13.94|
+|10|contig_878|SAMPLE-01|2250|1018|1.0|62.36|62.91|62.0|9|107|18.65|
+|11|contig_878|SAMPLE-01|741|433|1.0|70.23|70.42|71.0|44|94|11.49|
+|12|contig_878|SAMPLE-01|963|470|1.0|63.49|65.79|65.0|24|88|13.8|
+|13|contig_878|SAMPLE-01|684|310|1.0|56.33|57.69|58.0|26|85|13.38|
+|14|contig_878|SAMPLE-01|1569|724|1.0|61.79|63.8|64.0|10|95|18.3|
+|15|contig_878|SAMPLE-01|1584|775|1.0|65.69|66.14|67.0|44|88|8.792|
+|16|contig_878|SAMPLE-01|831|456|1.0|67.95|67.82|67.0|48|91|8.154|
+|17|contig_878|SAMPLE-01|192|179|1.0|81.12|81.14|82.0|69|91|5.041|
+|18|contig_878|SAMPLE-01|1467|675|1.0|60.06|60.98|62.0|25|91|14.24|
+|19|contig_878|SAMPLE-01|801|456|1.0|68.31|67.72|68.0|58|86|5.945|
+|20|contig_878|SAMPLE-01|360|252|1.0|71.42|72.38|74.0|53|87|8.963|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+|0|contig_878|SAMPLE-02|933|58|1.0|8.17|7.548|8.0|1|21|4.7|
+|1|contig_878|SAMPLE-02|564|41|1.0|8.151|8.28|8.0|2|14|3.13|
+|2|contig_878|SAMPLE-02|444|42|1.0|10.84|10.31|10.0|7|15|2.128|
+|3|contig_878|SAMPLE-02|1218|83|1.0|9.86|10.55|11.0|1|15|3.041|
+|4|contig_878|SAMPLE-02|3381|224|1.0|10.17|9.925|10.0|3|18|2.796|
+|5|contig_878|SAMPLE-02|942|78|1.0|11.07|10.98|11.0|6|17|2.34|
+|6|contig_878|SAMPLE-02|588|47|1.0|10.09|9.296|9.0|5|18|3.169|
+|7|contig_878|SAMPLE-02|1854|121|1.0|9.417|9.186|9.0|2|16|2.75|
+|8|contig_878|SAMPLE-02|285|30|1.0|10.33|10.0|10.0|7|15|2.217|
+|9|contig_878|SAMPLE-02|1215|79|1.0|9.386|8.685|9.0|3|20|3.965|
+|10|contig_878|SAMPLE-02|2250|115|0.9991|7.619|7.98|8.0|0|14|2.97|
+|11|contig_878|SAMPLE-02|741|58|1.0|10.82|11.18|11.0|3|16|3.067|
+|12|contig_878|SAMPLE-02|963|46|1.0|6.849|7.004|7.0|2|12|2.912|
+|13|contig_878|SAMPLE-02|684|36|1.0|7.281|7.029|8.0|2|14|3.168|
+|14|contig_878|SAMPLE-02|1569|74|1.0|6.505|6.345|6.0|1|13|2.363|
+|15|contig_878|SAMPLE-02|1584|102|1.0|9.199|9.064|9.0|4|15|2.398|
+|16|contig_878|SAMPLE-02|831|60|1.0|10.59|10.73|11.0|6|15|2.31|
+|17|contig_878|SAMPLE-02|192|16|1.0|8.208|8.854|9.0|3|11|2.857|
+|18|contig_878|SAMPLE-02|1467|108|1.0|10.67|10.36|11.0|4|20|3.894|
+|19|contig_878|SAMPLE-02|801|68|1.0|10.85|10.74|11.0|5|19|2.706|
+|20|contig_878|SAMPLE-02|360|34|1.0|10.32|10.24|10.0|6|15|1.886|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+
+### Gene mode, minimal output:
+
+6-column TAB delimited file, where each row represents a single gene x sample pair:
+
+|gene_callers_id|contig|sample|length|detection|mean_cov|
+|:--|:--:|:--:|:--:|:--:|:--:|
+|0|contig_878|SAMPLE-01|933|0.9871|53.97|
+|1|contig_878|SAMPLE-01|564|1.0|66.97|
+|2|contig_878|SAMPLE-01|444|1.0|81.72|
+|3|contig_878|SAMPLE-01|1218|1.0|54.45|
+|4|contig_878|SAMPLE-01|3381|1.0|60.89|
+|5|contig_878|SAMPLE-01|942|1.0|64.34|
+|6|contig_878|SAMPLE-01|588|1.0|67.51|
+|7|contig_878|SAMPLE-01|1854|1.0|62.63|
+|8|contig_878|SAMPLE-01|285|1.0|67.43|
+|9|contig_878|SAMPLE-01|1215|1.0|60.96|
+|10|contig_878|SAMPLE-01|2250|1.0|62.36|
+|11|contig_878|SAMPLE-01|741|1.0|70.23|
+|12|contig_878|SAMPLE-01|963|1.0|63.49|
+|13|contig_878|SAMPLE-01|684|1.0|56.33|
+|14|contig_878|SAMPLE-01|1569|1.0|61.79|
+|15|contig_878|SAMPLE-01|1584|1.0|65.69|
+|16|contig_878|SAMPLE-01|831|1.0|67.95|
+|17|contig_878|SAMPLE-01|192|1.0|81.12|
+|18|contig_878|SAMPLE-01|1467|1.0|60.06|
+|19|contig_878|SAMPLE-01|801|1.0|68.31|
+|20|contig_878|SAMPLE-01|360|1.0|71.42|
+|(...)|(...)|(...)|(...)|(...)|(...)|
+|0|contig_878|SAMPLE-02|933|1.0|8.17|
+|1|contig_878|SAMPLE-02|564|1.0|8.151|
+|2|contig_878|SAMPLE-02|444|1.0|10.84|
+|3|contig_878|SAMPLE-02|1218|1.0|9.86|
+|4|contig_878|SAMPLE-02|3381|1.0|10.17|
+|5|contig_878|SAMPLE-02|942|1.0|11.07|
+|6|contig_878|SAMPLE-02|588|1.0|10.09|
+|7|contig_878|SAMPLE-02|1854|1.0|9.417|
+|8|contig_878|SAMPLE-02|285|1.0|10.33|
+|9|contig_878|SAMPLE-02|1215|1.0|9.386|
+|10|contig_878|SAMPLE-02|2250|0.9991|7.619|
+|11|contig_878|SAMPLE-02|741|1.0|10.82|
+|12|contig_878|SAMPLE-02|963|1.0|6.849|
+|13|contig_878|SAMPLE-02|684|1.0|7.281|
+|14|contig_878|SAMPLE-02|1569|1.0|6.505|
+|15|contig_878|SAMPLE-02|1584|1.0|9.199|
+|16|contig_878|SAMPLE-02|831|1.0|10.59|
+|17|contig_878|SAMPLE-02|192|1.0|8.208|
+|18|contig_878|SAMPLE-02|1467|1.0|10.67|
+|19|contig_878|SAMPLE-02|801|1.0|10.85|
+|20|contig_878|SAMPLE-02|360|1.0|10.32|
+|(...)|(...)|(...)|(...)|(...)|(...)|
+
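+As a rough illustration of how the minimal gene-mode output could be consumed downstream, the sketch below averages `mean_cov` over genes for each sample. The function name `mean_coverage_per_sample` is hypothetical and not part of anvi'o, and the result is a plain unweighted mean across genes (gene length is ignored):
+
+```python
+import csv
+import io
+from collections import defaultdict
+
+
+def mean_coverage_per_sample(handle):
+    """Average the mean_cov column per sample (hypothetical helper, not anvi'o)."""
+    totals = defaultdict(lambda: [0.0, 0])  # sample -> [sum of mean_cov, gene count]
+    for row in csv.DictReader(handle, delimiter="\t"):
+        entry = totals[row["sample"]]
+        entry[0] += float(row["mean_cov"])
+        entry[1] += 1
+    return {sample: total / count for sample, (total, count) in totals.items()}
+
+
+# The first two genes of each sample from the table above, TAB-delimited
+example = io.StringIO(
+    "gene_callers_id\tcontig\tsample\tlength\tdetection\tmean_cov\n"
+    "0\tcontig_878\tSAMPLE-01\t933\t0.9871\t53.97\n"
+    "1\tcontig_878\tSAMPLE-01\t564\t1.0\t66.97\n"
+    "0\tcontig_878\tSAMPLE-02\t933\t1.0\t8.17\n"
+    "1\tcontig_878\tSAMPLE-02\t564\t1.0\t8.151\n"
+)
+means = mean_coverage_per_sample(example)
+```
+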
+## Reproducing these output files
+
+The examples above were generated by running [anvi-profile-blitz](/help/8/programs/anvi-profile-blitz) in the mini-test output directory. To reproduce them, you can run this command to generate the necessary files,
+
+```
+anvi-self-test --suite mini -o TEST
+```
+
+then go into the directory,
+
+```
+cd TEST
+```
+
+and run [anvi-profile-blitz](/help/8/programs/anvi-profile-blitz) in the corresponding modes.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/bam-stats-txt.md) to update this information.
+
diff --git a/help/8/artifacts/bams-and-profiles-txt/index.md b/help/8/artifacts/bams-and-profiles-txt/index.md
new file mode 100644
index 00000000..b12daed6
--- /dev/null
+++ b/help/8/artifacts/bams-and-profiles-txt/index.md
@@ -0,0 +1,71 @@
+---
+layout: artifact
+title: bams-and-profiles-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/bams-and-profiles-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-report-inversions](../../programs/anvi-report-inversions)
+
+
+## Description
+
+A **TAB-delimited** file to describe anvi'o [single-profile-db](/help/8/artifacts/single-profile-db) and [bam-file](/help/8/artifacts/bam-file) pairs along with the [contigs-db](/help/8/artifacts/contigs-db) used to profile the BAM file.
+
+This file type includes required and optional columns. The following four columns are **required** for this file type:
+
+* `name`: a single-word name for the entry.
+* `contigs_db_path`: path to a [contigs-db](/help/8/artifacts/contigs-db).
+* `bam_file_path`: path to a [bam-file](/help/8/artifacts/bam-file).
+* `profile_db_path`: path to a [single-profile-db](/help/8/artifacts/single-profile-db) generated from the BAM file and the contigs database mentioned.
+
+### Example
+
+Here is an example file:
+
+|name|contigs_db_path|profile_db_path|bam_file_path|
+|:--|:--:|:--:|:--:|
+|D01|CONTIGS.db|D01/PROFILE.db|D01.bam|
+|R01|CONTIGS.db|R01/PROFILE.db|R01.bam|
+|R02|CONTIGS.db|R02/PROFILE.db|R02.bam|
+
+### Optional columns
+
+In addition to the required columns shown above, you can add as many columns as you like to your file. Two of these columns, `r1` and `r2`, will be further processed during the sanity check, with the expectation that they include the locations of the raw FASTQ files. For each row the [bams-and-profiles-txt](/help/8/artifacts/bams-and-profiles-txt) file describes, the FASTQ files must be those that were used to generate the BAM file. Here is an example file with these two additional columns:
+
+|name|contigs_db_path|profile_db_path|bam_file_path|r1|r2|
+|:--|:--:|:--:|:--:|:--:|:--:|
+|D01|CONTIGS.db|D01/PROFILE.db|D01.bam|D01-R1.fastq|D01-R2.fastq|
+|R01|CONTIGS.db|R01/PROFILE.db|R01.bam|R01-R1.fastq|R01-R2.fastq|
+|R02|CONTIGS.db|R02/PROFILE.db|R02.bam|R02-R1.fastq|R02-R2.fastq|
+
+Some programs, such as [anvi-report-inversions](/help/8/programs/anvi-report-inversions), can process the `r1` and `r2` files.
+
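+A minimal sketch of checking such a file before handing it to anvi'o is shown below. The helper `check_bams_and_profiles` is hypothetical and not part of anvi'o; it only confirms that the four required columns are present, whereas anvi'o's own sanity check may do more:
+
+```python
+import csv
+import io
+
+REQUIRED = {"name", "contigs_db_path", "bam_file_path", "profile_db_path"}
+
+
+def check_bams_and_profiles(handle):
+    """Return the rows as dicts, or raise ValueError if a required column is missing.
+
+    Hypothetical helper for illustration; not an anvi'o function.
+    """
+    reader = csv.DictReader(handle, delimiter="\t")
+    missing = REQUIRED - set(reader.fieldnames or [])
+    if missing:
+        raise ValueError(f"missing required columns: {sorted(missing)}")
+    return list(reader)
+
+
+# One row from the example above, written out with TAB characters
+example = io.StringIO(
+    "name\tcontigs_db_path\tprofile_db_path\tbam_file_path\n"
+    "D01\tCONTIGS.db\tD01/PROFILE.db\tD01.bam\n"
+)
+rows = check_bams_and_profiles(example)
+```
+
+Any extra columns, including `r1` and `r2`, pass through untouched as additional keys in each row.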
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/bams-and-profiles-txt.md) to update this information.
+
diff --git a/help/8/artifacts/bin/index.md b/help/8/artifacts/bin/index.md
new file mode 100644
index 00000000..765ad4e1
--- /dev/null
+++ b/help/8/artifacts/bin/index.md
@@ -0,0 +1,48 @@
+---
+layout: artifact
+title: bin
+excerpt: A BIN-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/bin
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A BIN-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-cluster-contigs](../../programs/anvi-cluster-contigs) [anvi-display-pan](../../programs/anvi-display-pan) [anvi-interactive](../../programs/anvi-interactive) [anvi-refine](../../programs/anvi-refine) [anvi-rename-bins](../../programs/anvi-rename-bins) [anvi-script-add-default-collection](../../programs/anvi-script-add-default-collection) [anvi-script-compute-bayesian-pan-core](../../programs/anvi-script-compute-bayesian-pan-core)
+
+
+## Required or used by
+
+
+[anvi-display-metabolism](../../programs/anvi-display-metabolism) [anvi-estimate-metabolism](../../programs/anvi-estimate-metabolism) [anvi-estimate-scg-taxonomy](../../programs/anvi-estimate-scg-taxonomy) [anvi-estimate-trna-taxonomy](../../programs/anvi-estimate-trna-taxonomy) [anvi-gen-fixation-index-matrix](../../programs/anvi-gen-fixation-index-matrix) [anvi-gen-gene-level-stats-databases](../../programs/anvi-gen-gene-level-stats-databases) [anvi-gen-variability-profile](../../programs/anvi-gen-variability-profile) [anvi-get-codon-frequencies](../../programs/anvi-get-codon-frequencies) [anvi-get-codon-usage-bias](../../programs/anvi-get-codon-usage-bias) [anvi-get-short-reads-from-bam](../../programs/anvi-get-short-reads-from-bam) [anvi-get-split-coverages](../../programs/anvi-get-split-coverages) [anvi-inspect](../../programs/anvi-inspect) [anvi-interactive](../../programs/anvi-interactive) [anvi-merge-bins](../../programs/anvi-merge-bins) [anvi-refine](../../programs/anvi-refine) [anvi-rename-bins](../../programs/anvi-rename-bins) [anvi-script-gen-distribution-of-genes-in-a-bin](../../programs/anvi-script-gen-distribution-of-genes-in-a-bin)
+
+
+## Description
+
+In its simplest form, **a group of items** that are put together. Think of a literal bin in which you put data. One or more bins in anvi'o form a [collection](/help/8/artifacts/collection).
+
+In anvi'o, a bin may represent one or more contigs, or gene clusters, or any item that can be shown in the interactive interface and stored in a [profile-db](/help/8/artifacts/profile-db), [pan-db](/help/8/artifacts/pan-db), or [genes-db](/help/8/artifacts/genes-db).
+
+Bin names become handy to specifically target a group of items to investigate via programs such as [anvi-refine](/help/8/programs/anvi-refine) or [anvi-split](/help/8/programs/anvi-split), specify a group of contigs in files such as [internal-genomes](/help/8/artifacts/internal-genomes), or find them in output files anvi'o generates via programs such as [anvi-summarize](/help/8/programs/anvi-summarize) or [anvi-estimate-genome-completeness](/help/8/programs/anvi-estimate-genome-completeness).
+
+Since they are a part of the umbrella concept [collection](/help/8/artifacts/collection), information about bins is stored in various anvi'o databases, each of which can be used with the program [anvi-show-collections-and-bins](/help/8/programs/anvi-show-collections-and-bins) to see the bin content.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/bin.md) to update this information.
+
diff --git a/help/8/artifacts/binding-frequencies-txt/index.md b/help/8/artifacts/binding-frequencies-txt/index.md
new file mode 100644
index 00000000..5264766d
--- /dev/null
+++ b/help/8/artifacts/binding-frequencies-txt/index.md
@@ -0,0 +1,79 @@
+---
+layout: artifact
+title: binding-frequencies-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/binding-frequencies-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-run-interacdome](../../programs/anvi-run-interacdome)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+When the user runs [anvi-run-interacdome](/help/8/programs/anvi-run-interacdome), it stores binding frequencies directly into the [contigs-db](/help/8/artifacts/contigs-db) as [misc-data-amino-acids](/help/8/artifacts/misc-data-amino-acids). Yet [anvi-run-interacdome](/help/8/programs/anvi-run-interacdome) also outputs tabular data directly accessible by the user--this data is what is meant by [binding-frequencies-txt](/help/8/artifacts/binding-frequencies-txt).
+
+Specifically, this artifact refers to 2 files named `INTERACDOME-match_state_contributors.txt` and `INTERACDOME-domain_hits.txt` (the `INTERACDOME` prefix can be changed with `-O`).
+
+`INTERACDOME-match_state_contributors.txt` displays the binding frequencies in the following format:
+
+| gene_callers_id | codon_order_in_gene | pfam_id | match_state | ligand | binding_freq |
+|------------------:|----------------------:|:----------|--------------:|:---------|---------------:|
+| 1 | 169 | PF00534 | 22 | ADP | 0.687948 |
+| 1 | 169 | PF13692 | 8 | ADP | 0.595441 |
+| 1 | 174 | PF00534 | 27 | ADP | 0.735759 |
+| 1 | 174 | PF13692 | 14 | ADP | 0.595441 |
+| 1 | 184 | PF00534 | 37 | ADP | 0.0697656 |
+| 1 | 184 | PF13692 | 24 | ADP | 0.101399 |
+| 1 | 186 | PF00534 | 39 | ADP | 0.0697656 |
+| 1 | 186 | PF13692 | 26 | ADP | 0.101399 |
+| 1 | 187 | PF13692 | 27 | ADP | 0.201761 |
+| 1 | 189 | PF00534 | 47 | ADP | 0.0697656 |
+
+Each binding frequency is associated with both the exact residue of the user's gene sequences (from their [contigs-db](/help/8/artifacts/contigs-db)) and the exact match states (from the Pfam database) that contributed the binding frequency.
+
+`INTERACDOME-domain_hits.txt` is a parsed summary of the `hmmsearch` output in the following format:
+
+
+| pfam_name | pfam_id | corresponding_gene_call | domain | qual | score | bias | c-evalue | i-evalue | hmm_start | hmm_stop | hmm_bounds | ali_start | ali_stop | ali_bounds | env_start | env_stop | env_bounds | mean_post_prob | match_state_align | comparison_align | sequence_align | version |
+|:----------------|:----------|--------------------------:|---------:|:-------|--------:|-------:|-----------:|-----------:|------------:|-----------:|:-------------|------------:|-----------:|:-------------|------------:|-----------:|:-------------|-----------------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------:|
+| Beta_elim_lyase | PF01212 | 1762 | 1 | ! | 20.9 | 0.1 | 1e-08 | 3.5e-06 | 33 | 169 | .. | 44 | 177 | .. | 34 | 215 | .. | 0.72 | tvnrLedavaelfgke..aalfvpqGtaAnsill.kill.qr..geevivtepahihfdetgaiaelagvklrdlknkeaGkmdlekleaaikevgaheekiklisltvTnntagGqvvsleelrevaaiakkygiplhlDgA | ++ +++ael+ + f+ Gt +++ l + + +r g+ +i++ h +et + g +l ++ +++G +++e+l+++i++ e i + +++v n+ G++ +++e+ ev +a+ +i++h+D+ | LLQQARKQIAELINVSanEIYFTSGGTEGDNWVLkGTAIeKRefGNHIIISAVEHPAVTETAEQLVELGFELSYAPVDKEGRVKVEELQKLIRK-----ETILVSVMAVNNE--VGTIQPIKEISEV--LAEFPKIHFHVDAV | 20 |
+| PAPS_reduct | PF01507 | 1541 | 1 | ! | 36.1 | 0.1 | 3.6e-13 | 1.3e-10 | 2 | 164 | .. | 21 | 231 | .. | 20 | 234 | .. | 0.79 | lvvsvsgGkdslVllhLalkafkpv....pvvfvdtghefpetiefvdeleeryglrlkvyepeeevaekinaekhgs.slyee.aaeriaKveplkk.................................aLekldedall..tGaRrdesksraklpiveidedfek.........slrvfPllnWteedvwqyilrenipynpLydqgfr | + +s+sgGkds +++La + ++ ++ ++ + ++ t++f++++e+ +++ +++ ++++ + + +++ + + + e+ + p k e++ ++a+ +G+R++es +r++ +++ +++ + ++Pl++W+ d+w+ + +++yn +y++ ++ | VYFSFSGGKDSGLMVQLANLVAEKLdrnfDLLILNIEANYTATVDFIKKIEQLPRVKNIYHFCLPFFEDNNTSFFQPQwKMWDPsEKEKWIHSLP--KnaitleniddglkkyyslsngnpdrflryfqnwYKEQYPQSAIScgVGIRAQESLHRHSAVTKGENKYKNRcwinitlegNILFYPLFDWKVGDIWAATFKCELEYNYIYEKMYK | 18 |
+| Ank_2 | PF12796 | 1756 | 1 | ! | 32.2 | 0 | 6.7e-12 | 2.3e-09 | 29 | 84 | .] | 74 | 135 | .. | 53 | 135 | .. | 0.85 | aLhyAakngnleivklLle...h.a..adndgrtpLhyAarsghleivklLlekgadinlkd | aL Aa + +++ vk +l+ + + +d +g+tpL +A+ ++ +ei+k L+++gadinl++ | ALLEAANQRDTKKVKEILQdttYqVdeVDTEGNTPLNIAVHNNDIEIAKALIDRGADINLQN | 6 |
+| Ank_2 | PF12796 | 1756 | 2 | ! | 28.5 | 0 | 9.5e-11 | 3.3e-08 | 22 | 75 | .. | 199 | 265 | .. | 195 | 267 | .] | 0.76 | pn..k.ngktaLhyAak..ngnl...eivklLleha.....adndgrtpLhyAarsghleivklLle | ++ + +g taL+ A+ +gn +ivklL+e++ dn+grt++ yA ++g++ei k+L + | IDfqNdFGYTALIEAVGlrEGNQlyqDIVKLLMENGadqsiKDNSGRTAMDYANQKGYTEISKILAQ | 6 |
+| IGPS | PF00218 | 1615 | 1 | ! | 20.6 | 0.1 | 1.2e-08 | 4e-06 | 202 | 249 | .. | 195 | 242 | .. | 73 | 248 | .. | 0.88 | LaklvpkdvllvaeSGiktredveklkeegvnafLvGeslmrqedvek | +++lv+++++++ae i+t+e+++++k+ gv ++ vG +++r ++ +k | IKQLVQENICVIAEGKIHTPEQARQIKKLGVAGIVVGGAITRPQEIAK | 20 |
+| Ribosomal_L33 | PF00471 | 1562 | 1 | ! | 66.6 | 1.5 | 1.1e-22 | 3.7e-20 | 2 | 47 | .] | 4 | 49 | .] | 3 | 49 | .] | 0.97 | kvtLeCteCksrnYtttknkrntperLelkKYcprcrkhtlhkEtK | +++LeC e+++r Y t+knkrn+perLelkKY p++r++ ++kE K | NIILECVETGERLYLTSKNKRNNPERLELKKYSPKLRRRAIFKEVK | 19 |
+| Ribosomal_S14 | PF00253 | 1565 | 1 | ! | 83.3 | 0.1 | 3.9e-28 | 1.3e-25 | 2 | 54 | .] | 36 | 88 | .. | 35 | 88 | .. | 0.98 | laklprnssptrirnrCrvtGrprGvirkfgLsRicfRelAlkgelpGvkKaS | laklpr+s+p+r+r r++ +GrprG++rkfg+sRi+fRel ++g +pGvkKaS | LAKLPRDSNPNRLRLRDQTDGRPRGYMRKFGMSRIKFRELDHQGLIPGVKKAS | 20 |
+| Polysacc_synt_C | PF14667 | 1593 | 1 | ! | 61.4 | 19.2 | 5.4e-21 | 1.9e-18 | 2 | 139 | .. | 371 | 516 | .. | 370 | 519 | .. | 0.83 | LailalsiiflslstvlssiLqglgrqkialkalvigalvklilnllliplfgivGaaiatvlallvvavlnlyalrrllgikl...llrrllkpllaalvmgivvylllllllglllla...al..alllavlvgalvYllllll | L+ ++s+ +l+++t++ siLq+l +k+a+ ++ i++l+kli+++++i+lf +G +iat+++ ++++++ +++l+r++ i++ ++ +++ +++vm i+ +l+l+++ ++ + +l + l +++g++v+ + l++ | LSATIISTSLLGIFTIVLSILQALSFHKKAMQITSITLLLKLIIQIPCIYLFKGYGLSIATIICTMFTTIIAYRFLSRKFDINPikyNRKYYSRLVYSTIVMTILSLLMLKIISSVYKFEstlQLffLISLIGCLGGVVFSVTLFR | 5 |
+
+For each hit, this table includes how good the hit was, the alignment of the user gene to the exact HMM match states, and more! In fact, it includes all of the domain hit summary information, the sequence of the consensus match states, the comparison string for the hit, and the sequence of the user's gene.
+
+For more information, check out [this blogpost](https://merenlab.org/2020/07/22/interacdome/#6-storing-the-per-residue-binding-frequencies-into-the-contigs-database).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/binding-frequencies-txt.md) to update this information.
+
diff --git a/help/8/artifacts/blast-table/index.md b/help/8/artifacts/blast-table/index.md
new file mode 100644
index 00000000..2af23dc2
--- /dev/null
+++ b/help/8/artifacts/blast-table/index.md
@@ -0,0 +1,48 @@
+---
+layout: artifact
+title: blast-table
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/blast-table
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-script-filter-fasta-by-blast](../../programs/anvi-script-filter-fasta-by-blast)
+
+
+## Description
+
+This describes the BLAST table that is outputted when you run [Protein BLAST](https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins) from the terminal.
+
+When given to [anvi-script-filter-fasta-by-blast](/help/8/programs/anvi-script-filter-fasta-by-blast), which is currently the only program that uses this artifact, it expects output format 6. By default, this includes the following data columns:
+
+ qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore slen
+
+However, you'll have to provide the columns in your file and their order to the program with the flag `--outfmt`. For the program to work properly, your table must at least include the columns `qseqid`, `bitscore`, `length`, `qlen`, and `pident`.
+
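+As a rough sketch of what the column requirement means for your own table, the snippet below reads a headerless tabular BLAST file using a column order you supply and checks that the columns listed above are present. The function name `read_blast_table` is hypothetical and not part of anvi'o, and the single data row is made up for illustration:
+
+```python
+import csv
+import io
+
+REQUIRED = {"qseqid", "bitscore", "length", "qlen", "pident"}
+
+
+def read_blast_table(handle, columns):
+    """Parse a headerless TAB-delimited BLAST table into a list of dicts.
+
+    Hypothetical helper for illustration; not an anvi'o function.
+    """
+    missing = REQUIRED - set(columns)
+    if missing:
+        raise ValueError(f"table lacks required columns: {sorted(missing)}")
+    reader = csv.reader(handle, delimiter="\t")
+    # Values stay as strings; convert them as needed downstream
+    return [dict(zip(columns, row)) for row in reader if row]
+
+
+# The column order must match the order used when BLAST wrote the file
+columns = "qseqid sseqid pident length bitscore qlen".split()
+table = io.StringIO("read_1\tref_9\t98.5\t120\t215\t125\n")  # made-up row
+hits = read_blast_table(table, columns)
+```
+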
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/blast-table.md) to update this information.
+
diff --git a/help/8/artifacts/cazyme-data/index.md b/help/8/artifacts/cazyme-data/index.md
new file mode 100644
index 00000000..b2ef74c1
--- /dev/null
+++ b/help/8/artifacts/cazyme-data/index.md
@@ -0,0 +1,43 @@
+---
+layout: artifact
+title: cazyme-data
+excerpt: A DATA-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/cazyme-data
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DATA-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-setup-cazymes](../../programs/anvi-setup-cazymes)
+
+
+## Required or used by
+
+
+[anvi-run-cazymes](../../programs/anvi-run-cazymes)
+
+
+## Description
+
+This stores a local copy of the data from the [dbCAN2 CAZyme HMM database](https://bcb.unl.edu/dbCAN2/download/Databases/) for functional annotation.
+
+It is required to run [anvi-run-cazymes](/help/8/programs/anvi-run-cazymes) and is set up on your computer by the program [anvi-setup-cazymes](/help/8/programs/anvi-setup-cazymes).
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/cazyme-data.md) to update this information.
+
diff --git a/help/8/artifacts/clustering-configuration/index.md b/help/8/artifacts/clustering-configuration/index.md
new file mode 100644
index 00000000..6d6815ae
--- /dev/null
+++ b/help/8/artifacts/clustering-configuration/index.md
@@ -0,0 +1,39 @@
+---
+layout: artifact
+title: clustering-configuration
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/clustering-configuration
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-experimental-organization](../../programs/anvi-experimental-organization)
+
+
+## Description
+
+{:.notice}
+**No one has described this artifact yet** :/ If you would like to contribute by describing it, please see previous examples [here](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts), and add a Markdown formatted file in that directory named "clustering-configuration.md". Its contents will replace this sad text. THANK YOU!
+
diff --git a/help/8/artifacts/codon-frequencies-txt/index.md b/help/8/artifacts/codon-frequencies-txt/index.md
new file mode 100644
index 00000000..956e3748
--- /dev/null
+++ b/help/8/artifacts/codon-frequencies-txt/index.md
@@ -0,0 +1,53 @@
+---
+layout: artifact
+title: codon-frequencies-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/codon-frequencies-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-get-codon-frequencies](../../programs/anvi-get-codon-frequencies)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This file contains **the frequency of each codon in each gene in your [contigs-db](/help/8/artifacts/contigs-db).**
+
+This is a tab-delimited table where each column represents a codon and each row represents a specific gene. The numbers refer either to counts of each codon or to percent normalizations, depending on the parameters with which you ran [anvi-get-codon-frequencies](/help/8/programs/anvi-get-codon-frequencies).
+
+### Example
+
+ gene_caller_id GCA GCC GCG GCT ...
+ 1 0 0 1 2
+ 2 1 0 0 2
+ .
+ .
+ .
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/codon-frequencies-txt.md) to update this information.
+
diff --git a/help/8/artifacts/cogs-data/index.md b/help/8/artifacts/cogs-data/index.md
new file mode 100644
index 00000000..056b5af7
--- /dev/null
+++ b/help/8/artifacts/cogs-data/index.md
@@ -0,0 +1,44 @@
+---
+layout: artifact
+title: cogs-data
+excerpt: A DATA-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/cogs-data
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DATA-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-setup-ncbi-cogs](../../programs/anvi-setup-ncbi-cogs)
+
+
+## Required or used by
+
+
+[anvi-run-ncbi-cogs](../../programs/anvi-run-ncbi-cogs)
+
+
+## Description
+
+This stores **a local copy of the data from the NCBI [COGs database](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC102395/) for functional annotation.**
+
+It is required to run [anvi-run-ncbi-cogs](/help/8/programs/anvi-run-ncbi-cogs) and is set up on your computer by the program [anvi-setup-ncbi-cogs](/help/8/programs/anvi-setup-ncbi-cogs).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/cogs-data.md) to update this information.
+
diff --git a/help/8/artifacts/collection-txt/index.md b/help/8/artifacts/collection-txt/index.md
new file mode 100644
index 00000000..cd2c0826
--- /dev/null
+++ b/help/8/artifacts/collection-txt/index.md
@@ -0,0 +1,72 @@
+---
+layout: artifact
+title: collection-txt
+excerpt: A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/collection-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-export-collection](../../programs/anvi-export-collection)
+
+
+## Required or used by
+
+
+[anvi-import-collection](../../programs/anvi-import-collection) [anvi-script-get-coverage-from-bam](../../programs/anvi-script-get-coverage-from-bam) [anvi-script-merge-collections](../../programs/anvi-script-merge-collections)
+
+
+## Description
+
+A two-column TAB-delimited file **without a header** that describes a [collection](/help/8/artifacts/collection) by associating items with [bin](/help/8/artifacts/bin) names.
+
+It can be used to import collections into or export them from anvi'o databases, and to transfer them between anvi'o projects seamlessly.
+
+The first column in the file lists item names and the second column associates a given item with a bin.
+
+
+item_01 bin_1
+item_02 bin_1
+item_03 bin_1
+item_04 bin_2
+item_05 bin_3
+item_06 bin_3
+
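+To make the two-column layout concrete, here is a minimal sketch that reads such a file into a mapping from bin names to item names. The function `read_collection_txt` is a hypothetical helper and not part of anvi'o:
+
+```python
+import io
+from collections import defaultdict
+
+
+def read_collection_txt(handle):
+    """Map each bin name to the list of items assigned to it.
+
+    Hypothetical helper for illustration; not an anvi'o function.
+    """
+    bins = defaultdict(list)
+    for line in handle:
+        line = line.rstrip("\n")
+        if not line:
+            continue  # skip blank lines
+        item, bin_name = line.split("\t")  # exactly two TAB-separated fields, no header
+        bins[bin_name].append(item)
+    return dict(bins)
+
+
+example = io.StringIO("item_01\tbin_1\nitem_02\tbin_1\nitem_04\tbin_2\n")
+bins = read_collection_txt(example)
+```
+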
+
+### The optional bins info file
+
+In addition to the essential file above, you can associate an optional TAB-delimited file with three columns with a collection to provide information about 'bins' in it, such as their source, and/or color to be used when they are displayed in [summary](/help/8/artifacts/summary) outputs or anvi'o [interactive](/help/8/artifacts/interactive) interfaces. Here is an example:
+
+```
+bin_1 CONCOCT #c9d433
+bin_2 CONCOCT #e86548
+bin_3 anvi-refine #0b8500
+```
+
+In this file format, the first column is a bin name, the second column is a source, and the third column is an HTML color.
+
+{:.notice}
+The source is free-form text and can be anything. We often use `anvi-interactive`, `CONCOCT`, or `anvi-refine` for our bins to track which ones were manually refined and which ones came from an automated binning algorithm.
+
+You can provide this optional file to the program [anvi-import-collection](/help/8/programs/anvi-import-collection) with the parameter `--bins-info`.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/collection-txt.md) to update this information.
+
diff --git a/help/8/artifacts/collection/index.md b/help/8/artifacts/collection/index.md
new file mode 100644
index 00000000..c40179d2
--- /dev/null
+++ b/help/8/artifacts/collection/index.md
@@ -0,0 +1,48 @@
+---
+layout: artifact
+title: collection
+excerpt: A COLLECTION-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/collection
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A COLLECTION-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-cluster-contigs](../../programs/anvi-cluster-contigs) [anvi-display-pan](../../programs/anvi-display-pan) [anvi-import-collection](../../programs/anvi-import-collection) [anvi-interactive](../../programs/anvi-interactive) [anvi-rename-bins](../../programs/anvi-rename-bins) [anvi-script-add-default-collection](../../programs/anvi-script-add-default-collection)
+
+
+## Required or used by
+
+
+[anvi-cluster-contigs](../../programs/anvi-cluster-contigs) [anvi-delete-collection](../../programs/anvi-delete-collection) [anvi-display-metabolism](../../programs/anvi-display-metabolism) [anvi-estimate-genome-completeness](../../programs/anvi-estimate-genome-completeness) [anvi-estimate-metabolism](../../programs/anvi-estimate-metabolism) [anvi-estimate-scg-taxonomy](../../programs/anvi-estimate-scg-taxonomy) [anvi-estimate-trna-taxonomy](../../programs/anvi-estimate-trna-taxonomy) [anvi-export-collection](../../programs/anvi-export-collection) [anvi-gen-gene-level-stats-databases](../../programs/anvi-gen-gene-level-stats-databases) [anvi-get-aa-counts](../../programs/anvi-get-aa-counts) [anvi-get-codon-frequencies](../../programs/anvi-get-codon-frequencies) [anvi-get-codon-usage-bias](../../programs/anvi-get-codon-usage-bias) [anvi-get-split-coverages](../../programs/anvi-get-split-coverages) [anvi-merge-bins](../../programs/anvi-merge-bins) [anvi-rename-bins](../../programs/anvi-rename-bins) [anvi-split](../../programs/anvi-split) [anvi-summarize](../../programs/anvi-summarize) [anvi-script-gen-distribution-of-genes-in-a-bin](../../programs/anvi-script-gen-distribution-of-genes-in-a-bin) [anvi-script-gen-genomes-file](../../programs/anvi-script-gen-genomes-file)
+
+
+## Description
+
+An anvi'o concept that describes one or more [bin](/help/8/artifacts/bin)s.
+
+You can generate and store a collection by selecting items on any anvi'o [interactive](/help/8/artifacts/interactive) interface or by importing them via [anvi-import-collection](/help/8/programs/anvi-import-collection) into any anvi'o database that can store collections using the file format [collection-txt](/help/8/artifacts/collection-txt).
+
+You can always use the program [anvi-show-collections-and-bins](/help/8/programs/anvi-show-collections-and-bins) to list all collections and bins stored in a given anvi'o database.
+
+Collections are used in many ways in anvi'o depending on your workflow as you can see from the number of programs that require or can make use of the concept [collection](/help/8/artifacts/collection).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/collection.md) to update this information.
+
diff --git a/help/8/artifacts/completion/index.md b/help/8/artifacts/completion/index.md
new file mode 100644
index 00000000..75a8e2bb
--- /dev/null
+++ b/help/8/artifacts/completion/index.md
@@ -0,0 +1,65 @@
+---
+layout: artifact
+title: completion
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/completion
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-estimate-genome-completeness](../../programs/anvi-estimate-genome-completeness)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+An estimate of the completeness and purity of a genome based on single-copy core genes.
+
+{:.notice}
+See [this blog post](http://merenlab.org/2016/06/09/assessing-completion-and-contamination-of-MAGs/) for more information, and [this paper](https://doi.org/10.1038/nbt.3893) for the community standards for metagenome-assembled and single-amplified genomes.
+
+There are two essential features to this metric: **completion** and **redundancy**.
+
+### Completion
+
+A rough estimate of how completely a set of contigs represents a full genome based on the presence or absence of single-copy core genes (SCGs) they contain.
+
+SCGs are a set of special genes that occur once, and only once, in every genome. So theoretically, the higher the percentage of SCGs found in a genome bin, the more likely it is that the bin represents a complete genome. Of course, SCGs are typically determined by analyzing available isolate genomes to find out which genes match this criterion. Hence, the accuracy of their predictions may be limited when this approach is applied to genome bins that represent populations from poorly studied clades of life. Even for genomes of well-studied organisms, our methods for identifying these genes may prevent us from reaching 100% completeness.
+
+### Redundancy
+
+A measure of how many copies of each single-copy core gene (SCG) are found in a genome or a genome bin.
+
+Usually, we expect to have only one copy of each of these genes (that's why they're called "single-copy"), and for this reason, redundancy of SCGs is commonly used to estimate the level of potential "contamination" within a bin (i.e., higher values of redundancy may indicate that more than one population may be contributing to a given genome bin).
+
+However, interpretations of "contamination" as a function of redundant occurrence of SCGs may not be straightforward, as some genomes may have multiple copies of generally single-copy core genes. Hence, we prefer not to draw conclusions about contamination right away. In addition, lack of redundancy does not necessarily mean lack of contamination, since contaminant contigs that do not include SCGs will not be on the radar of these estimates.
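
As a rough illustration of the two metrics described above, the sketch below computes completion and redundancy from a table of SCG hit counts. The gene names and the exact formulas are assumptions made for this example (completion as the percentage of SCGs seen at least once, redundancy as the percentage of extra copies relative to the size of the SCG set); the numbers anvi'o reports may be computed differently.

```python
def completion_and_redundancy(scg_hit_counts):
    """Return (percent completion, percent redundancy) for one bin.

    `scg_hit_counts` maps each expected single-copy core gene to the
    number of times it was found in the bin (0 if it is missing).
    """
    num_scgs = len(scg_hit_counts)
    if num_scgs == 0:
        raise ValueError("at least one SCG is required")

    # completion: share of expected SCGs seen at least once
    num_present = sum(1 for count in scg_hit_counts.values() if count > 0)

    # redundancy: extra copies beyond the first, relative to the SCG set size
    num_extra_copies = sum(count - 1 for count in scg_hit_counts.values() if count > 1)

    return 100 * num_present / num_scgs, 100 * num_extra_copies / num_scgs


# hypothetical hit counts for a toy set of four SCGs
toy_bin = {"Ribosomal_L1": 1, "Ribosomal_L2": 2, "Ribosomal_S3_C": 0, "tRNA-synt_1d": 1}
completion, redundancy = completion_and_redundancy(toy_bin)
print(f"completion: {completion:.1f}%, redundancy: {redundancy:.1f}%")
```

In this toy bin one gene is missing and one occurs twice, which is why neither number alone tells the whole story about the quality of a bin.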
+
+### Attention
+
+Regardless of their utility to gain quick insights, single-copy core genes are mere approximations to understanding the quality of a genome and [SCGs cannot ensure the absence of contamination or level of true completion](https://doi.org/10.1101/gr.258640.119).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/completion.md) to update this information.
+
diff --git a/help/8/artifacts/concatenated-gene-alignment-fasta/index.md b/help/8/artifacts/concatenated-gene-alignment-fasta/index.md
new file mode 100644
index 00000000..2386543f
--- /dev/null
+++ b/help/8/artifacts/concatenated-gene-alignment-fasta/index.md
@@ -0,0 +1,55 @@
+---
+layout: artifact
+title: concatenated-gene-alignment-fasta
+excerpt: A FASTA-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/concatenated-gene-alignment-fasta
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A FASTA-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-get-sequences-for-gene-clusters](../../programs/anvi-get-sequences-for-gene-clusters) [anvi-get-sequences-for-hmm-hits](../../programs/anvi-get-sequences-for-hmm-hits)
+
+
+## Required or used by
+
+
+[anvi-gen-phylogenomic-tree](../../programs/anvi-gen-phylogenomic-tree)
+
+
+## Description
+
+This file **contains the alignment information for multiple genes across different organisms**.
+
+Basically, a single gene alignment compares a single gene's sequence across multiple organisms. For example, you could align some specific rRNA sequence across all of the organisms in your sample. This alignment highlights substitutions as well as insertions and deletions (indicated with dashes).
+
+Clustal programs do a great job of visualizing this data, by color coding it. Here is an example from Anvi'o's pangenome display:
+
+![A lovely clustal-like alignment from the anvi'o pangenome display](../../images/example_alignment.png)
+
+A concatenated gene alignment FASTA contains several of these gene alignments joined end to end, so that a tree can be built from multiple genes rather than a single one.
+
+This information can then be used to generate a phylogenomic tree using [anvi-gen-phylogenomic-tree](/help/8/programs/anvi-gen-phylogenomic-tree) or through programs like [FastTree](http://www.microbesonline.org/fasttree/).
+
+In Anvi'o, this is an output of [anvi-get-sequences-for-gene-clusters](/help/8/programs/anvi-get-sequences-for-gene-clusters) (for generating a tree based off of gene clusters in your workflow) as well as [anvi-get-sequences-for-hmm-hits](/help/8/programs/anvi-get-sequences-for-hmm-hits) (for generating a tree based off of the genes that got HMM hits).
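
To make the idea of concatenation concrete, here is a minimal sketch that joins per-gene alignments into one sequence per genome. The function name, the toy sequences, and the choice to pad a missing gene with gap characters are assumptions for illustration, not a description of how the anvi'o programs above are implemented.

```python
def concatenate_alignments(gene_alignments, gap_char="-"):
    """Concatenate per-gene alignments into one sequence per genome.

    `gene_alignments` is a list of dicts, one per gene, each mapping a
    genome name to its aligned sequence for that gene. Genomes missing a
    gene are padded with gap characters so all rows keep the same length.
    """
    genomes = sorted({name for alignment in gene_alignments for name in alignment})
    concatenated = {name: "" for name in genomes}

    for alignment in gene_alignments:
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one gene alignment must have equal length")
        aligned_length = lengths.pop()

        for name in genomes:
            concatenated[name] += alignment.get(name, gap_char * aligned_length)

    return concatenated


# two hypothetical gene alignments across three genomes
gene_1 = {"genome_A": "ATG-CA", "genome_B": "ATGGCA", "genome_C": "ATG-CT"}
gene_2 = {"genome_A": "TTGA", "genome_C": "TT-A"}  # this gene is missing from genome_B

concatenated = concatenate_alignments([gene_1, gene_2])
for name, seq in concatenated.items():
    print(f">{name}\n{seq}")
```

Every row in the result has the same length, which is what tree-building programs expect from an alignment.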
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/concatenated-gene-alignment-fasta.md) to update this information.
+
diff --git a/help/8/artifacts/configuration-ini/index.md b/help/8/artifacts/configuration-ini/index.md
new file mode 100644
index 00000000..2eca6fe7
--- /dev/null
+++ b/help/8/artifacts/configuration-ini/index.md
@@ -0,0 +1,46 @@
+---
+layout: artifact
+title: configuration-ini
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/configuration-ini
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-script-gen-short-reads](../../programs/anvi-script-gen-short-reads)
+
+
+## Description
+
+This describes an INI formatted file used to configure a program.
+
+If you're unsure what the INI format is, you can check out its [Wikipedia page](https://en.wikipedia.org/wiki/INI_file). In essence, it is a text file divided into sections, each of which defines the values of several keys.
+
+As of now, it is only required in Anvi'o by the program [anvi-script-gen-short-reads](/help/8/programs/anvi-script-gen-short-reads), where the file describes various parameters for the short reads that you want to generate, such as the desired length and coverage. Take a look at the page for that program for an example.
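
For readers who have not worked with INI files before, the sketch below parses one with Python's standard `configparser` module. The section and key names are made up for this example and are not the ones the program above expects; its own help page remains the reference for the real format.

```python
import configparser

# hypothetical INI content; these section and key names are invented
ini_text = """
[general]
output_file = reads.fa
short_read_length = 100

[contig_1]
coverage = 50
"""

config = configparser.ConfigParser()
config.read_string(ini_text)

# each section behaves like a dictionary of string keys and string values
for section in config.sections():
    print(section, dict(config[section]))

# typed getters convert the stored strings on request
read_length = config.getint("general", "short_read_length")
coverage = config.getint("contig_1", "coverage")
print(read_length, coverage)
```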
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/configuration-ini.md) to update this information.
+
diff --git a/help/8/artifacts/contig-inspection/index.md b/help/8/artifacts/contig-inspection/index.md
new file mode 100644
index 00000000..30afd433
--- /dev/null
+++ b/help/8/artifacts/contig-inspection/index.md
@@ -0,0 +1,41 @@
+---
+layout: artifact
+title: contig-inspection
+excerpt: A DISPLAY-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/contig-inspection
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DISPLAY-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-inspect](../../programs/anvi-inspect) [anvi-interactive](../../programs/anvi-interactive)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This page describes general properties of the interactive inspect page.
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/contig-inspection.md) to update this information.
+
diff --git a/help/8/artifacts/contigs-db/index.md b/help/8/artifacts/contigs-db/index.md
new file mode 100644
index 00000000..4888fcfa
--- /dev/null
+++ b/help/8/artifacts/contigs-db/index.md
@@ -0,0 +1,367 @@
+---
+layout: artifact
+title: contigs-db
+excerpt: A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/contigs-db
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-gen-contigs-database](../../programs/anvi-gen-contigs-database)
+
+
+## Required or used by
+
+
+[anvi-cluster-contigs](../../programs/anvi-cluster-contigs) [anvi-compute-completeness](../../programs/anvi-compute-completeness) [anvi-db-info](../../programs/anvi-db-info) [anvi-delete-functions](../../programs/anvi-delete-functions) [anvi-delete-hmms](../../programs/anvi-delete-hmms) [anvi-display-contigs-stats](../../programs/anvi-display-contigs-stats) [anvi-display-metabolism](../../programs/anvi-display-metabolism) [anvi-display-structure](../../programs/anvi-display-structure) [anvi-estimate-genome-completeness](../../programs/anvi-estimate-genome-completeness) [anvi-estimate-metabolism](../../programs/anvi-estimate-metabolism) [anvi-estimate-scg-taxonomy](../../programs/anvi-estimate-scg-taxonomy) [anvi-estimate-trna-taxonomy](../../programs/anvi-estimate-trna-taxonomy) [anvi-export-contigs](../../programs/anvi-export-contigs) [anvi-export-functions](../../programs/anvi-export-functions) [anvi-export-gene-calls](../../programs/anvi-export-gene-calls) [anvi-export-gene-coverage-and-detection](../../programs/anvi-export-gene-coverage-and-detection) [anvi-export-locus](../../programs/anvi-export-locus) [anvi-export-misc-data](../../programs/anvi-export-misc-data) [anvi-export-splits-and-coverages](../../programs/anvi-export-splits-and-coverages) [anvi-export-splits-taxonomy](../../programs/anvi-export-splits-taxonomy) [anvi-gen-fixation-index-matrix](../../programs/anvi-gen-fixation-index-matrix) [anvi-gen-gene-consensus-sequences](../../programs/anvi-gen-gene-consensus-sequences) [anvi-gen-gene-level-stats-databases](../../programs/anvi-gen-gene-level-stats-databases) [anvi-gen-structure-database](../../programs/anvi-gen-structure-database) [anvi-gen-variability-profile](../../programs/anvi-gen-variability-profile) [anvi-get-aa-counts](../../programs/anvi-get-aa-counts) [anvi-get-codon-frequencies](../../programs/anvi-get-codon-frequencies) [anvi-get-codon-usage-bias](../../programs/anvi-get-codon-usage-bias) 
[anvi-get-metabolic-model-file](../../programs/anvi-get-metabolic-model-file) [anvi-get-pn-ps-ratio](../../programs/anvi-get-pn-ps-ratio) [anvi-get-sequences-for-gene-calls](../../programs/anvi-get-sequences-for-gene-calls) [anvi-get-sequences-for-hmm-hits](../../programs/anvi-get-sequences-for-hmm-hits) [anvi-get-short-reads-from-bam](../../programs/anvi-get-short-reads-from-bam) [anvi-get-short-reads-mapping-to-a-gene](../../programs/anvi-get-short-reads-mapping-to-a-gene) [anvi-get-split-coverages](../../programs/anvi-get-split-coverages) [anvi-import-collection](../../programs/anvi-import-collection) [anvi-import-functions](../../programs/anvi-import-functions) [anvi-import-misc-data](../../programs/anvi-import-misc-data) [anvi-import-taxonomy-for-genes](../../programs/anvi-import-taxonomy-for-genes) [anvi-inspect](../../programs/anvi-inspect) [anvi-interactive](../../programs/anvi-interactive) [anvi-merge](../../programs/anvi-merge) [anvi-migrate](../../programs/anvi-migrate) [anvi-profile](../../programs/anvi-profile) [anvi-profile-blitz](../../programs/anvi-profile-blitz) [anvi-reaction-network](../../programs/anvi-reaction-network) [anvi-refine](../../programs/anvi-refine) [anvi-rename-bins](../../programs/anvi-rename-bins) [anvi-run-cazymes](../../programs/anvi-run-cazymes) [anvi-run-hmms](../../programs/anvi-run-hmms) [anvi-run-interacdome](../../programs/anvi-run-interacdome) [anvi-run-kegg-kofams](../../programs/anvi-run-kegg-kofams) [anvi-run-ncbi-cogs](../../programs/anvi-run-ncbi-cogs) [anvi-run-pfams](../../programs/anvi-run-pfams) [anvi-run-scg-taxonomy](../../programs/anvi-run-scg-taxonomy) [anvi-run-trna-taxonomy](../../programs/anvi-run-trna-taxonomy) [anvi-scan-trnas](../../programs/anvi-scan-trnas) [anvi-search-functions](../../programs/anvi-search-functions) [anvi-search-palindromes](../../programs/anvi-search-palindromes) [anvi-search-sequence-motifs](../../programs/anvi-search-sequence-motifs) 
[anvi-show-misc-data](../../programs/anvi-show-misc-data) [anvi-split](../../programs/anvi-split) [anvi-summarize](../../programs/anvi-summarize) [anvi-summarize-blitz](../../programs/anvi-summarize-blitz) [anvi-update-db-description](../../programs/anvi-update-db-description) [anvi-update-structure-database](../../programs/anvi-update-structure-database) [anvi-script-add-default-collection](../../programs/anvi-script-add-default-collection) [anvi-script-estimate-metabolic-independence](../../programs/anvi-script-estimate-metabolic-independence) [anvi-script-filter-hmm-hits-table](../../programs/anvi-script-filter-hmm-hits-table) [anvi-script-gen-distribution-of-genes-in-a-bin](../../programs/anvi-script-gen-distribution-of-genes-in-a-bin) [anvi-script-gen-genomes-file](../../programs/anvi-script-gen-genomes-file) [anvi-script-gen_stats_for_single_copy_genes.py](../../programs/anvi-script-gen_stats_for_single_copy_genes.py) [anvi-script-get-hmm-hits-per-gene-call](../../programs/anvi-script-get-hmm-hits-per-gene-call) [anvi-script-merge-collections](../../programs/anvi-script-merge-collections) [anvi-script-permute-trnaseq-seeds](../../programs/anvi-script-permute-trnaseq-seeds)
+
+
+## Description
+
+A contigs database is an anvi'o database that **contains key information associated with your sequences**.
+
+In a way, **an anvi'o contigs database is a modern, more talented form of a FASTA file**, where you can store additional information about your sequences and others can query and use it. Information storage and access is primarily done by anvi'o programs; however, it can also be done through the command line interface or programmatically.
+
+The information a contigs database contains about its sequences can include the positions of open reading frames, tetra-nucleotide frequencies, functional and taxonomic annotations, information on individual nucleotide or amino acid positions, and more.
+
+Here is a graphic that shows what sort of information goes into the contigs database (and also the [profile-db](/help/8/artifacts/profile-db)):
+
+![Contents of the contigs and profile databases](../../images/contigs-profile-db.png)
+
+### Another (less computation-heavy) way of thinking about it
+
+When working in anvi'o, you'll need to be able to access previous analysis done on a genome or transcriptome. To do this, anvi'o uses tools like contigs databases instead of regular fasta files. So, you'll want to convert the data that you have into a contigs database to use other anvi'o programs (using [anvi-gen-contigs-database](/help/8/programs/anvi-gen-contigs-database)). As seen on the page for [metagenomes](/help/8/artifacts/metagenomes), you can then use this contigs database instead of your fasta file for all of your anvi'o needs.
+
+In short, to get the most out of your data in anvi'o, you'll want to use your data (which was probably originally in a [fasta](/help/8/artifacts/fasta) file) to create both a [contigs-db](/help/8/artifacts/contigs-db) and a [profile-db](/help/8/artifacts/profile-db). That way, anvi'o is able to keep track of many different kinds of analysis and you can easily interact with other anvi'o programs.
+
+## Usage Information
+
+### Creating and populating a contigs database
+
+Contigs databases are initialized with **[anvi-gen-contigs-database](/help/8/programs/anvi-gen-contigs-database)** from a [contigs-fasta](/help/8/artifacts/contigs-fasta). This step computes the k-mer frequencies for each contig, soft-splits your contigs, and identifies open reading frames. To populate a contigs database with more information, you can then run various other programs.
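
To give a feel for what a k-mer frequency table looks like, here is a minimal sketch that counts tetranucleotides in a single sequence. It is an illustration only: the function name is hypothetical, only one strand is counted, and k-mers containing characters other than A, C, G, and T are ignored, none of which is claimed to match how anvi'o computes its own frequencies.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=4):
    """Count every k-mer over A, C, G, T in `sequence` (one strand only)."""
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

    # report all 4**k possible k-mers so vectors are comparable across contigs
    all_kmers = ("".join(letters) for letters in product("ACGT", repeat=k))
    return {kmer: counts.get(kmer, 0) for kmer in all_kmers}


freqs = kmer_frequencies("ATGCATGCATGC")
print(len(freqs), freqs["ATGC"], freqs["TGCA"])
```

Because every contig yields a vector of the same length (256 values for `k=4`), these vectors can be compared directly between contigs.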
+
+**Key programs that populate an anvi'o contigs database with essential information** include,
+
+* [anvi-run-hmms](/help/8/programs/anvi-run-hmms) (which uses HMMs to annotate your genes against an [hmm-source](/help/8/artifacts/hmm-source))
+* [anvi-run-scg-taxonomy](/help/8/programs/anvi-run-scg-taxonomy) (which associates its single-copy core genes with taxonomic data)
+* [anvi-scan-trnas](/help/8/programs/anvi-scan-trnas) (which identifies the tRNA genes)
+* [anvi-run-ncbi-cogs](/help/8/programs/anvi-run-ncbi-cogs) (which tries to assign functions to your genes using the COGs database)
+
+Once an anvi'o contigs database is generated and populated with information, it is **always a good idea to run [anvi-display-contigs-stats](/help/8/programs/anvi-display-contigs-stats)** to see a numerical summary of its contents.
+
+Other programs you can run to populate a contigs database with functions include,
+
+* [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) (which annotates the genes in the database with the KEGG KOfam database)
+
+### Analysis on a populated contigs database
+
+Other essential programs that read from a contigs database and yield key information include [anvi-estimate-genome-completeness](/help/8/programs/anvi-estimate-genome-completeness), [anvi-get-sequences-for-hmm-hits](/help/8/programs/anvi-get-sequences-for-hmm-hits), and [anvi-estimate-scg-taxonomy](/help/8/programs/anvi-estimate-scg-taxonomy).
+
+If you wish to run programs like [anvi-cluster-contigs](/help/8/programs/anvi-cluster-contigs), [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism), and [anvi-gen-gene-level-stats-databases](/help/8/programs/anvi-gen-gene-level-stats-databases), or view your database with [anvi-interactive](/help/8/programs/anvi-interactive), you'll need to first use your contigs database to create a [profile-db](/help/8/artifacts/profile-db).
+
+## Variants
+
+Contigs databases, like [profile-db](/help/8/artifacts/profile-db)s, are allowed to have different variants, though the only currently implemented variant, the [trnaseq-contigs-db](/help/8/artifacts/trnaseq-contigs-db), is for tRNA transcripts from tRNA-seq experiments. The default variant stored for "standard" contigs databases is `unknown`. Variants should indicate that substantially different information is stored in the database. For instance, open reading frames are applicable to protein-coding genes but not tRNA transcripts, so ORF data is not recorded for the `trnaseq` variant. The `trnaseq-workflow` generates [trnaseq-contigs-db](/help/8/artifacts/trnaseq-contigs-db)s using a very different approach from [anvi-gen-contigs-database](/help/8/programs/anvi-gen-contigs-database).
+
+## For power users
+
+Since the anvi'o contigs database is a stand-alone SQLite database, users can run very complex queries on it using SQL, or [Structured Query Language](https://en.wikipedia.org/wiki/SQL). One way to do this is to enter the SQLite command line environment from your terminal by typing,
+
+```
+sqlite3 CONTIGS.db
+```
+
+which starts the SQLite program on your contigs database and welcomes you with a new command prompt in your terminal (you can quit the SQLite terminal and go back to your original terminal anytime by pressing `CTRL+D`):
+
+```
+SQLite version 3.31.1 2020-01-27 19:55:54
+Enter ".help" for usage hints.
+sqlite>
+```
+
+In this prompt you can run any query that is a [valid SQLite query](https://www.sqlitetutorial.net/sqlite-select/) to learn about the structure and contents of the database. Here is an example, where the table names in the database are listed:
+
+```
+sqlite> .tables
+amino_acid_additional_data hmm_hits_in_splits
+collections_bins_info hmm_hits_info
+collections_info kmer_contigs
+collections_of_contigs kmer_splits
+collections_of_splits nt_position_info
+contig_sequences nucleotide_additional_data
+contigs_basic_info scg_taxonomy
+gene_amino_acid_sequences self
+gene_functions splits_basic_info
+genes_in_contigs splits_taxonomy
+genes_in_splits taxon_names
+genes_taxonomy trna_taxonomy
+```
+
+Another example to see the schema of a given table to learn about the field names:
+
+```
+sqlite> .schema contigs_basic_info
+CREATE TABLE contigs_basic_info (contig str, length numeric, gc_content numeric, num_splits numeric);
+```
+
+Another example to see the contents of a given table:
+
+```
+sqlite> select * from self;
+db_type|contigs
+contigs_db_hash|d51abf0a
+split_length|20000
+kmer_size|4
+num_contigs|4189
+total_length|35766167
+num_splits|4784
+genes_are_called|1
+splits_consider_gene_calls|1
+creation_date|1466453807.46107
+project_name|Infant Gut Contigs from Sharon et al.
+description|No description is given
+external_gene_calls|0
+external_gene_amino_acid_seqs|0
+skip_predict_frame|0
+trna_taxonomy_was_run|0
+trna_taxonomy_database_version|
+version|20
+modules_db_hash|72700e4db2bc
+gene_function_sources|COG20_CATEGORY,KOfam,COG20_PATHWAY,COG14_FUNCTION,KEGG_Module,COG20_FUNCTION,Transfer_RNAs,COG14_CATEGORY,KEGG_Class
+scg_taxonomy_was_run|1
+scg_taxonomy_database_version|v202.0
+gene_level_taxonomy_source|centrifuge
+```
+
+This example shows the contents of the `self` table, which is a special table that keeps track of some meta information about the contigs database itself. If you compare this output to the output you get from the program [anvi-db-info](/help/8/programs/anvi-db-info), you may feel like you have found a shortcut to see something very core about the philosophy behind anvi'o and how it works.
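
The same table can be read from Python with the standard `sqlite3` module. The sketch below builds a tiny in-memory stand-in so that it runs anywhere; the column names `key` and `value` are an assumption based on the two-column output above, and with a real database you would connect to `CONTIGS.db` instead.

```python
import sqlite3

# toy stand-in for a contigs database, only so the example is self-contained;
# with a real file you would call sqlite3.connect("CONTIGS.db") instead
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE self (key text, value text)")
conn.executemany(
    "INSERT INTO self VALUES (?, ?)",
    [("db_type", "contigs"), ("kmer_size", "4"), ("num_contigs", "4189")],
)

# every row of `self` is a key/value pair, so the table maps onto a dict
meta = dict(conn.execute("SELECT key, value FROM self"))
conn.close()

print(meta["db_type"], meta["num_contigs"])
```

Note that the values come back as strings here, so numeric entries need an explicit conversion such as `int(meta["num_contigs"])`.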
+
+This environment is extremely powerful to ask complex, creative, or unconventional questions to learn anything you may want to learn about your data, even if anvi'o is not ready to answer those questions for you. Here we can use a question that was asked on anvi'o Discord by an anvi'o user to demonstrate this:
+
+> My contigs database includes contigs longer than 500 nts, how can I get summary statistics for my genes (such as the number of gene calls and the number of annotations per function annotation source) but **only for contigs that are longer than 10,000 nts**?
+
+One anvi'o way to answer this question is the following:
+
+* Create a [collection](/help/8/artifacts/collection) for contigs that are longer than 10,000 nts (which would require one to parse a previous summary of the data),
+* Create a blank [profile-db](/help/8/artifacts/profile-db) using [anvi-profile](/help/8/programs/anvi-profile),
+* Import the collection into the blank profile database using [anvi-import-collection](/help/8/programs/anvi-import-collection),
+* Summarize the collection using [anvi-summarize](/help/8/programs/anvi-summarize),
+* Survey the output files to find out how many genes there are in the collection and what their annotations are (using BASH tools such as `grep` and `wc`, or other tools such as EXCEL or R).
+
+But the power of SQL enables a much quicker answer to it. Here are the steps to answer this particular question as an exercise.
+
+Learning the number of genes:
+
+``` sql
+sqlite> select count(*) from genes_in_contigs;
+32597
+```
+
+Number of contigs:
+
+``` sql
+sqlite> select count(*) from contigs_basic_info;
+4189
+```
+
+Contigs longer than 10000 nts:
+
+
+``` sql
+sqlite> select count(*) from contigs_basic_info where length > 10000;
+658
+```
+
+Number of genes that occur in contigs that are longer than 10000 nts:
+
+
+``` sql
+sqlite> select count(*) from genes_in_contigs where contig IN (select contig from contigs_basic_info where length > 10000);
+22117
+```
+
+Number of function annotations for genes per annotation source:
+
+``` sql
+sqlite> select source, count(*) from gene_functions group by source;
+COG14_CATEGORY|21121
+COG14_FUNCTION|21121
+COG20_CATEGORY|20878
+COG20_FUNCTION|20878
+COG20_PATHWAY|6088
+KEGG_Class|2760
+KEGG_Module|2760
+KOfam|14391
+Transfer_RNAs|323
+```
+
+Number of function annotations per annotation source for genes that occur in contigs that are longer than 10000 nts:
+
+``` sql
+sqlite> select source, count(*) from gene_functions where gene_callers_id IN (select gene_callers_id from genes_in_contigs where contig IN (select contig from contigs_basic_info where length > 10000)) group by source;
+COG14_CATEGORY|14472
+COG14_FUNCTION|14472
+COG20_CATEGORY|14235
+COG20_FUNCTION|14235
+COG20_PATHWAY|4154
+KEGG_Class|2260
+KEGG_Module|2260
+KOfam|11652
+Transfer_RNAs|280
+```
+
+At this point we have the answers to both questions in a few minutes. The power of this is that the same query will work on any computer and on any anvi'o [contigs-db](/help/8/artifacts/contigs-db) out there, which makes it possible to conduct specific yet complex interrogations of any anvi'o project in the wild by simply formatting a query and sending it to collaborators or colleagues.
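
The nested query above can also be run from a script. The following sketch uses Python's standard `sqlite3` module on a tiny in-memory database that mimics only the three tables and the few columns the query touches; the toy rows are invented, and the real tables contain more columns than shown here.

```python
import sqlite3

# use sqlite3.connect("CONTIGS.db") to query a real contigs database instead
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contigs_basic_info (contig text, length numeric);
    CREATE TABLE genes_in_contigs (gene_callers_id numeric, contig text);
    CREATE TABLE gene_functions (gene_callers_id numeric, source text);
    INSERT INTO contigs_basic_info VALUES ('c1', 25000), ('c2', 800);
    INSERT INTO genes_in_contigs VALUES (0, 'c1'), (1, 'c1'), (2, 'c2');
    INSERT INTO gene_functions VALUES
        (0, 'KOfam'), (1, 'KOfam'), (1, 'COG20_FUNCTION'), (2, 'KOfam');
""")

# annotations per source, limited to genes on contigs above a length cutoff
query = """
    SELECT source, COUNT(*) FROM gene_functions
    WHERE gene_callers_id IN (
        SELECT gene_callers_id FROM genes_in_contigs
        WHERE contig IN (SELECT contig FROM contigs_basic_info WHERE length > ?)
    )
    GROUP BY source
"""
min_length = 10000
counts = dict(conn.execute(query, (min_length,)))
conn.close()

print(counts)
```

Passing the length cutoff as a query parameter (the `?` placeholder) keeps the SQL text unchanged when the threshold varies, which is convenient when the same query is shared between projects.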
+
+SQLite queries can be run without having to go into the SQLite terminal, too. For instance, one could run this directly from their terminal,
+
+``` bash
+sqlite3 CONTIGS.db 'select source, count(*) from gene_functions group by source'
+```
+
+which would result in this output in the terminal:
+
+```
+COG14_CATEGORY|21121
+COG14_FUNCTION|21121
+COG20_CATEGORY|20878
+COG20_FUNCTION|20878
+COG20_PATHWAY|6088
+KEGG_Class|2760
+KEGG_Module|2760
+KOfam|14391
+Transfer_RNAs|323
+```
+
+Or this,
+
+``` bash
+sqlite3 -column -header CONTIGS.db 'select source, count(*) from gene_functions group by source'
+```
+
+to get a slightly fancier output:
+
+```
+source count(*)
+-------------- ----------
+COG14_CATEGORY 21121
+COG14_FUNCTION 21121
+COG20_CATEGORY 20878
+COG20_FUNCTION 20878
+COG20_PATHWAY 6088
+KEGG_Class 2760
+KEGG_Module 2760
+KOfam 14391
+Transfer_RNAs 323
+```
+
+Or this,
+
+
+``` bash
+sqlite3 -separator $'\t' -header CONTIGS.db 'select source, count(*) from gene_functions group by source' | anvi-script-tabulate
+```
+
+to get an even fancier output,
+
+``` bash
+╒════════════════╤════════════╕
+│ source         │   count(*) │
+╞════════════════╪════════════╡
+│ COG14_CATEGORY │      21121 │
+├────────────────┼────────────┤
+│ COG14_FUNCTION │      21121 │
+├────────────────┼────────────┤
+│ COG20_CATEGORY │      20878 │
+├────────────────┼────────────┤
+│ COG20_FUNCTION │      20878 │
+├────────────────┼────────────┤
+│ COG20_PATHWAY  │       6088 │
+├────────────────┼────────────┤
+│ KEGG_Class     │       2760 │
+├────────────────┼────────────┤
+│ KEGG_Module    │       2760 │
+├────────────────┼────────────┤
+│ KOfam          │      14391 │
+├────────────────┼────────────┤
+│ Transfer_RNAs  │        323 │
+╘════════════════╧════════════╛
+```
+
+Or this,
+
+``` bash
+sqlite3 -separator $'\t' -header CONTIGS.db 'select source, count(*) from gene_functions group by source' | anvi-script-as-markdown
+```
+
+to get output that can be pasted into markdown-aware editors such as GitHub, which would look like this:
+
+|**source**|**count(*)**|
+|:--|:--|
+|COG14_CATEGORY|21121|
+|COG14_FUNCTION|21121|
+|COG20_CATEGORY|20878|
+|COG20_FUNCTION|20878|
+|COG20_PATHWAY|6088|
+|KEGG_Class|2760|
+|KEGG_Module|2760|
+|KOfam|14391|
+|Transfer_RNAs|323|
+
+[Learning SQL](https://www.w3schools.com/sql/sql_intro.asp) is not difficult, and one can practice their skills using their existing anvi'o databases. When there is a specific question, forming a meaningful SQL query takes only minutes. If you are not sure where to start or how to form an SQL query, you can always reach out to the community at the anvi'o Discord.
+
+## For programmers
+
+It is also possible to use anvi'o as a Python library to work with anvi'o artifacts, including [contigs-db](/help/8/artifacts/contigs-db). The purpose of this section is to list tips and use cases for programmers, and it is extended by questions we have received from the community. If you have a problem you wish to solve programmatically, but not sure how, please reach out to the community via anvi'o Discord or anvi'o GitHub.
+
+### Get the approximate number of genomes
+
+You can estimate the number of genomes once [anvi-run-hmms](/help/8/programs/anvi-run-hmms) is run on a contigs database. Here are some examples:
+
+``` python
+from anvio.hmmops import NumGenomesEstimator
+
+# the raw data, where each key is one of the HMM collections
+# of type `singlecopy` run on the contigs-db
+NumGenomesEstimator('CONTIGS.db').estimates_dict
+# {'Bacteria_71': {'num_genomes': 9, 'domain': 'bacteria'},
+#  'Archaea_76': {'num_genomes': 1, 'domain': 'archaea'},
+#  'Protista_83': {'num_genomes': 1, 'domain': 'eukarya'}}
+
+# slightly fancier output with a single integer for
+# estimated number of genomes summarized, along with
+# domains used
+num_genomes, domains_included = NumGenomesEstimator('CONTIGS.db').num_genomes()
+print(num_genomes)
+# 11
+
+print(domains_included)
+# ['bacteria', 'archaea', 'eukarya']
+
+# limiting the domains
+num_genomes, domains_included = NumGenomesEstimator('CONTIGS.db').num_genomes(for_domains=['archaea', 'eukarya'])
+print(num_genomes)
+# 2
+
+print(domains_included)
+# ['archaea', 'eukarya']
+```
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/contigs-db.md) to update this information.
+
diff --git a/help/8/artifacts/contigs-fasta/index.md b/help/8/artifacts/contigs-fasta/index.md
new file mode 100644
index 00000000..87091216
--- /dev/null
+++ b/help/8/artifacts/contigs-fasta/index.md
@@ -0,0 +1,61 @@
+---
+layout: artifact
+title: contigs-fasta
+excerpt: A FASTA-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/contigs-fasta
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A FASTA-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-export-contigs](../../programs/anvi-export-contigs) [anvi-export-splits-and-coverages](../../programs/anvi-export-splits-and-coverages) [anvi-script-filter-fasta-by-blast](../../programs/anvi-script-filter-fasta-by-blast) [anvi-script-permute-trnaseq-seeds](../../programs/anvi-script-permute-trnaseq-seeds) [anvi-script-process-genbank](../../programs/anvi-script-process-genbank) [anvi-script-process-genbank-metadata](../../programs/anvi-script-process-genbank-metadata) [anvi-script-reformat-fasta](../../programs/anvi-script-reformat-fasta)
+
+
+## Required or used by
+
+
+[anvi-gen-contigs-database](../../programs/anvi-gen-contigs-database) [anvi-script-filter-fasta-by-blast](../../programs/anvi-script-filter-fasta-by-blast)
+
+
+## Description
+
+A [contigs-fasta](/help/8/artifacts/contigs-fasta) is a [fasta](/help/8/artifacts/fasta) file that is suitable to be used by [anvi-gen-contigs-database](/help/8/programs/anvi-gen-contigs-database) to create a [contigs-db](/help/8/artifacts/contigs-db).
+
+The most critical requirement for this file is that **it must have simple deflines**. If your [fasta](/help/8/artifacts/fasta) file doesn't have simple deflines, it is not a proper [contigs-fasta](/help/8/artifacts/contigs-fasta). If you intend to use this file with anvi'o, **you must fix your FASTA file prior to mapping**.
+
+Take a look at your deflines prior to mapping, and remove anything that is not a digit, an ASCII letter, an underscore, or a dash character. Here are some example deflines that are not suitable for a [fasta](/help/8/artifacts/fasta) to be considered a [contigs-fasta](/help/8/artifacts/contigs-fasta):
+
+``` bash
+>Contig-123 length:4567
+>Another defline 42
+>gi|478446819|gb|JN117275.2|
+```
+
+And here are some OK ones:
+
+``` bash
+>Contig-123
+>Another_defline_42
+>gi_478446819_gb_JN117275_2
+```
+
+The program [anvi-script-reformat-fasta](/help/8/programs/anvi-script-reformat-fasta) can do this automatically for you.
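+
+To make the defline rule concrete, here is a minimal Python sketch of what "simple" means in practice. The function names `is_simple_defline` and `simplify_defline` are hypothetical and are not part of anvi'o, and the renaming strategy shown here (replacing every run of disallowed characters with an underscore) is only an assumption for illustration. The actual program may rename your sequences differently.
+
+``` python
+import re
+
+# characters allowed in a simple defline: digits, ASCII letters,
+# underscores, and dashes
+SIMPLE_DEFLINE = re.compile(r"^[A-Za-z0-9_-]+$")
+
+
+def is_simple_defline(defline):
+    """Return True if the defline (without the leading '>') is simple."""
+    return bool(SIMPLE_DEFLINE.match(defline))
+
+
+def simplify_defline(defline):
+    """Replace each run of disallowed characters with one underscore.
+
+    This is an illustrative strategy, not the one anvi'o uses.
+    """
+    return re.sub(r"[^A-Za-z0-9_-]+", "_", defline).strip("_")
+
+
+for defline in ["Contig-123 length:4567",
+                "Another defline 42",
+                "gi|478446819|gb|JN117275.2|"]:
+    print(is_simple_defline(defline), simplify_defline(defline))
+```
+
+Note that a naive replacement like this one can produce duplicate names if two deflines differ only in their disallowed characters, so it is worth checking that the resulting names are still unique.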
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/contigs-fasta.md) to update this information.
+
diff --git a/help/8/artifacts/contigs-stats/index.md b/help/8/artifacts/contigs-stats/index.md
new file mode 100644
index 00000000..797c05f1
--- /dev/null
+++ b/help/8/artifacts/contigs-stats/index.md
@@ -0,0 +1,42 @@
+---
+layout: artifact
+title: contigs-stats
+excerpt: A STATS-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/contigs-stats
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A STATS-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-display-contigs-stats](../../programs/anvi-display-contigs-stats)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This artifact contains all of the information provided in the interface of [anvi-display-contigs-stats](/help/8/programs/anvi-display-contigs-stats) in a series of tab-delimited files. See that page for more information.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/contigs-stats.md) to update this information.
+
diff --git a/help/8/artifacts/coverages-txt/index.md b/help/8/artifacts/coverages-txt/index.md
new file mode 100644
index 00000000..3d34c57d
--- /dev/null
+++ b/help/8/artifacts/coverages-txt/index.md
@@ -0,0 +1,71 @@
+---
+layout: artifact
+title: coverages-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/coverages-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-export-gene-coverage-and-detection](../../programs/anvi-export-gene-coverage-and-detection) [anvi-export-splits-and-coverages](../../programs/anvi-export-splits-and-coverages) [anvi-get-split-coverages](../../programs/anvi-get-split-coverages) [anvi-script-get-coverage-from-bam](../../programs/anvi-script-get-coverage-from-bam)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This is a text file containing **the average coverage for each contig in each sample** that was in the [profile-db](/help/8/artifacts/profile-db) and [contigs-db](/help/8/artifacts/contigs-db) that you used when you ran [anvi-export-splits-and-coverages](/help/8/programs/anvi-export-splits-and-coverages) or [anvi-export-gene-coverage-and-detection](/help/8/programs/anvi-export-gene-coverage-and-detection).
+
+This is a tab-delimited file where each row describes a specific split or gene and each column describes one of your samples. Each cell contains the average coverage of that split or gene in that sample.
+
+This artifact is really only used when taking information out of anvi'o, so enjoy your coverage information :)
+
+### Example for splits
+
+(the type of output you would get from [anvi-export-splits-and-coverages](/help/8/programs/anvi-export-splits-and-coverages))
+
+    contig                 sample_1    sample_2    sample_3     ...
+    Day1_contig1_split1    5.072727    4.523432    1.2343243
+    Day1_contig1_split2    6.895844    5.284812    9.3721947
+    Day1_contig2_split1    2.357049    3.519150    8.2385691
+    ...
+
+
+### Example for genes
+
+(the type of output you would get from [anvi-export-gene-coverage-and-detection](/help/8/programs/anvi-export-gene-coverage-and-detection))
+
+    key      sample_1    sample_2    sample_3     ...
+    13947    10.29109    1.984394    6.8289432
+    13948    34.89584    6.284812    3.3721947
+    23026    23.94938    9.239235    13.238569
+    ...
+
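+If you wish to work with these values programmatically, the following is a minimal sketch that reads such a tab-delimited file using only the Python standard library. The function name `read_coverages` is hypothetical, and the example data is the gene table shown above, written out with TAB characters. The same code should also work for [detection-txt](/help/8/artifacts/detection-txt) files, assuming they follow the same layout.
+
+``` python
+import csv
+import io
+
+
+def read_coverages(handle):
+    """Parse a coverages-txt style table from an open file handle.
+
+    Returns the list of sample names, and a dictionary that maps each
+    row identifier (a split name or a gene call id) to a dictionary
+    of {sample name: value}.
+    """
+    reader = csv.reader(handle, delimiter="\t")
+    header = next(reader)
+    samples = header[1:]
+
+    table = {}
+    for row in reader:
+        if not row:
+            continue  # skip empty lines
+        table[row[0]] = {s: float(v) for s, v in zip(samples, row[1:])}
+
+    return samples, table
+
+
+# an in-memory stand-in for a file on disk
+example = io.StringIO(
+    "key\tsample_1\tsample_2\tsample_3\n"
+    "13947\t10.29109\t1.984394\t6.8289432\n"
+    "13948\t34.89584\t6.284812\t3.3721947\n"
+    "23026\t23.94938\t9.239235\t13.238569\n"
+)
+
+samples, coverages = read_coverages(example)
+print(samples)
+print(coverages["13948"]["sample_2"])
+```
+
+For a real file, you would replace the `io.StringIO` object with `open('your-coverages.txt')`. Row identifiers are kept as strings here so that the same function can handle both split names and gene call ids.
+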
+
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/coverages-txt.md) to update this information.
+
diff --git a/help/8/artifacts/dendrogram/index.md b/help/8/artifacts/dendrogram/index.md
new file mode 100644
index 00000000..ba7cb009
--- /dev/null
+++ b/help/8/artifacts/dendrogram/index.md
@@ -0,0 +1,49 @@
+---
+layout: artifact
+title: dendrogram
+excerpt: A NEWICK-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/dendrogram
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A NEWICK-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-experimental-organization](../../programs/anvi-experimental-organization) [anvi-export-items-order](../../programs/anvi-export-items-order) [anvi-matrix-to-newick](../../programs/anvi-matrix-to-newick)
+
+
+## Required or used by
+
+
+[anvi-import-items-order](../../programs/anvi-import-items-order) [anvi-import-misc-data](../../programs/anvi-import-misc-data) [anvi-interactive](../../programs/anvi-interactive)
+
+
+## Description
+
+This describes a [NEWICK-formatted](https://en.wikipedia.org/wiki/Newick_format) tree that is not representative of the phylogenetic relationships between your samples.
+
+{:.notice}
+If you're looking for phylogenetic trees, take a look at [phylogeny](/help/8/artifacts/phylogeny).
+
+Instead, the dendrogram artifact most often describes the tree used as a [misc-data-items-order](/help/8/artifacts/misc-data-items-order): the order that the items in [anvi-interactive](/help/8/programs/anvi-interactive) are displayed in (the central tree in the circular display). Often, this is the order of your contigs or genes based on their relatedness to each other (for example by tetranucleotide frequency or differential coverage). These trees are also contained in [misc-data-layer-orders](/help/8/artifacts/misc-data-layer-orders).
+
+A dendrogram is also listed as the output of programs that are not necessarily related to phylogenetics (like [anvi-matrix-to-newick](/help/8/programs/anvi-matrix-to-newick)).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/dendrogram.md) to update this information.
+
diff --git a/help/8/artifacts/detection-txt/index.md b/help/8/artifacts/detection-txt/index.md
new file mode 100644
index 00000000..bd3a633a
--- /dev/null
+++ b/help/8/artifacts/detection-txt/index.md
@@ -0,0 +1,56 @@
+---
+layout: artifact
+title: detection-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/detection-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-export-gene-coverage-and-detection](../../programs/anvi-export-gene-coverage-and-detection)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This is a text file containing **the detection value for each gene in each sample** that was in the [profile-db](/help/8/artifacts/profile-db) and [contigs-db](/help/8/artifacts/contigs-db) that you used when you ran [anvi-export-gene-coverage-and-detection](/help/8/programs/anvi-export-gene-coverage-and-detection).
+
+This is a tab-delimited file where each row describes a specific gene and each column describes one of your samples. Each cell contains the detection of that gene in that sample.
+
+### Example
+
+    key      sample_1    sample_2    sample_3     ...
+    13947    0.291093    0.984394    0.9289432
+    13948    0.895842    0.828481    0.3721947
+    23026    0.949383    0.983923    1.0000000
+    ...
+
+
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/detection-txt.md) to update this information.
+
diff --git a/help/8/artifacts/dna-sequence/index.md b/help/8/artifacts/dna-sequence/index.md
new file mode 100644
index 00000000..393c768f
--- /dev/null
+++ b/help/8/artifacts/dna-sequence/index.md
@@ -0,0 +1,44 @@
+---
+layout: artifact
+title: dna-sequence
+excerpt: A SEQUENCE-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/dna-sequence
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A SEQUENCE-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-estimate-trna-taxonomy](../../programs/anvi-estimate-trna-taxonomy) [anvi-search-palindromes](../../programs/anvi-search-palindromes)
+
+
+## Description
+
+A single sequence in DNA alphabet that is **not** stored in or provided by a standard file, such as a [fasta](/help/8/artifacts/fasta) file.
+
+A typical sequence artifact in the anvi'o ecosystem will be provided by the user to a sequence-accepting program through the command line, or will be printed to the terminal by a program that provides it.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/dna-sequence.md) to update this information.
+
diff --git a/help/8/artifacts/enzymes-list-for-module/index.md b/help/8/artifacts/enzymes-list-for-module/index.md
new file mode 100644
index 00000000..a71b4989
--- /dev/null
+++ b/help/8/artifacts/enzymes-list-for-module/index.md
@@ -0,0 +1,54 @@
+---
+layout: artifact
+title: enzymes-list-for-module
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/enzymes-list-for-module
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-script-gen-user-module-file](../../programs/anvi-script-gen-user-module-file)
+
+
+## Description
+
+This is a 3-column, tab-delimited file that lists the enzymes to be included in a user-defined metabolic module. It is used as input to [anvi-script-gen-user-module-file](/help/8/programs/anvi-script-gen-user-module-file), which will create a module file using the enzyme definitions within the file.
+
+The first column in this file must be an enzyme accession, the second column must be the annotation source of the enzyme, and the third column specifies the orthology (or functional definition) of the enzyme. Note that if the annotation source is 'KOfam' and the enzyme is a KEGG Ortholog that is present in the KEGG KOfam profiles in [kegg-data](/help/8/artifacts/kegg-data), then the orthology field can be blank. In this case, the orthology field will be filled in automatically with the enzyme's known orthology in the KOfam data.
+
+Here is an example file:
+
+|**enzyme**|**source**|**orthology**|
+|:--|:--|:--|
+|K01657|KOfam||
+|K01658|KOfam||
+|PF06603.14|METABOLISM_HMM|UpxZ|
+|COG1362|COG20_FUNCTION|Aspartyl aminopeptidase|
+|TIGR01709.2|TIGRFAM|type II secretion system protein GspL|
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/enzymes-list-for-module.md) to update this information.
+
diff --git a/help/8/artifacts/enzymes-txt/index.md b/help/8/artifacts/enzymes-txt/index.md
new file mode 100644
index 00000000..fd23c404
--- /dev/null
+++ b/help/8/artifacts/enzymes-txt/index.md
@@ -0,0 +1,84 @@
+---
+layout: artifact
+title: enzymes-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/enzymes-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-estimate-metabolism](../../programs/anvi-estimate-metabolism)
+
+
+## Description
+
+This artifact is a TAB-delimited file that describes a set of enzymes.
+
+The user can generate this file to define an arbitrary set of enzymes that they want to estimate metabolism on, using the program [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism).
+
+
+## Minimal file format
+
+Each row (besides the header) in this file represents one enzyme in the set. At minimum, the file must contain three columns:
+- a `gene_id` column containing a _unique_ value to identify a gene for the enzyme. The value can be either a string (like a gene name) or an integer (like a gene callers id), but it has to be unique because sometimes multiple genes can have the same enzyme annotation.
+- an `enzyme_accession` column containing the accession of the enzyme, such as a KEGG Ortholog accession for KOfams, a COG accession for NCBI COGs, a Pfam accession, etc.
+- a `source` column containing the name of the database that would be used to annotate the enzyme. For example, "KOfam", "COG20_FUNCTION", "Pfam", etc.
+
+{: .notice}
+Ideally, all annotation sources in this column would match those used to define the metabolic pathways you are estimating completeness for (whether those are [KEGG Modules](https://www.genome.jp/kegg/module.html) or user-defined modules as in [user-modules-data](/help/8/artifacts/user-modules-data)), but in practice, we don't currently check for this. If you include some enzymes that are not part of any metabolic modules, they simply will not contribute to the completeness scores of any pathways, and you would therefore only see them in "hits" mode output files.
+So the `source` column is (at this time) mostly for you to make sure you know which database these enzymes are coming from and that at least some (hopefully most) will actually be part of the metabolic pathways you are interested in, because otherwise the results from [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) might not make much sense. However, you do have complete freedom to define the `source` value arbitrarily, if you want. But please keep in mind that this may change in the very near future: one day these `source` values might actually matter for the functioning of [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) (in which case this documentation will be updated to reflect that). So it is best to get used to setting them properly. :)
+
+## Minimal file example
+
+Here is an example file with the minimum set of columns:
+
+|**gene_id**|**enzyme_accession**|**source**|
+|:--|:--|:--|
+|aad:TC41_3038|K02886|KOfam|
+|aca:ACP_1744|K02626|KOfam|
+|aco:Amico_1604|K00606|KOfam|
+|ade:Adeh_0623|K02669|KOfam|
+
+## Adding gene coverage and detection values
+
+If you want downstream programs like [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) to have access to the coverage and detection data for each enzyme (well, technically, its gene), then you can add two additional columns to this file:
+- the `coverage` column should contain the numerical coverage value for the gene encoding the enzyme
+- the `detection` column should contain the numerical detection value for the gene encoding the enzyme
+
+If these columns are included, you can use the `--add-coverage` flag with [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) so that this data is included in the output for each metabolic pathway and/or enzyme. However, you do need to include _both_ of the columns - that program does not currently support adding just coverage or just detection.
+
+## Example with coverage and detection
+
+|**gene_id**|**enzyme_accession**|**source**|**coverage**|**detection**|
+|:--|:--|:--|:--|:--|
+|aad:TC41_3038|K02886|KOfam|4.44|0.7862318840579711|
+|aca:ACP_1744|K02626|KOfam|4.522875816993464|0.7790055248618785|
+|aco:Amico_1604|K00606|KOfam|2.63953488372093|0.8063380281690141|
+|ade:Adeh_0623|K02669|KOfam|2.011764705882353|0.6639344262295082|
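+
+If you are generating this file with your own scripts, a quick sanity check before passing it to anvi'o can save time. The following is a minimal Python sketch of the requirements described on this page: the three required columns, unique `gene_id` values, and `coverage` and `detection` appearing together or not at all. The function name `check_enzymes_txt` is hypothetical, and this is not the validation code anvi'o itself runs.
+
+``` python
+import csv
+import io
+
+REQUIRED_COLUMNS = ("gene_id", "enzyme_accession", "source")
+OPTIONAL_PAIR = ("coverage", "detection")
+
+
+def check_enzymes_txt(handle):
+    """Return a list of problems found in an enzymes-txt style file."""
+    reader = csv.DictReader(handle, delimiter="\t")
+    columns = reader.fieldnames or []
+    problems = []
+
+    for column in REQUIRED_COLUMNS:
+        if column not in columns:
+            problems.append(f"missing required column: {column}")
+
+    # coverage and detection must be given together, or not at all
+    present = [c for c in OPTIONAL_PAIR if c in columns]
+    if len(present) == 1:
+        problems.append("coverage and detection must be provided together")
+
+    seen = set()
+    for row in reader:
+        gene_id = row.get("gene_id")
+        if gene_id is None:
+            continue  # already reported as a missing column
+        if gene_id in seen:
+            problems.append(f"duplicate gene_id: {gene_id}")
+        seen.add(gene_id)
+
+    return problems
+
+
+example = io.StringIO(
+    "gene_id\tenzyme_accession\tsource\n"
+    "aad:TC41_3038\tK02886\tKOfam\n"
+    "aca:ACP_1744\tK02626\tKOfam\n"
+)
+print(check_enzymes_txt(example))
+```
+
+An empty list means that none of the checks above failed. It does not guarantee that the accessions or sources are meaningful to [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism).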
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/enzymes-txt.md) to update this information.
+
diff --git a/help/8/artifacts/external-gene-calls/index.md b/help/8/artifacts/external-gene-calls/index.md
new file mode 100644
index 00000000..dd217439
--- /dev/null
+++ b/help/8/artifacts/external-gene-calls/index.md
@@ -0,0 +1,115 @@
+---
+layout: artifact
+title: external-gene-calls
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/external-gene-calls
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-get-sequences-for-gene-calls](../../programs/anvi-get-sequences-for-gene-calls) [anvi-script-augustus-output-to-external-gene-calls](../../programs/anvi-script-augustus-output-to-external-gene-calls) [anvi-script-process-genbank](../../programs/anvi-script-process-genbank) [anvi-script-process-genbank-metadata](../../programs/anvi-script-process-genbank-metadata)
+
+
+## Required or used by
+
+
+[anvi-gen-contigs-database](../../programs/anvi-gen-contigs-database)
+
+
+## Description
+
+By default, anvi'o uses Prodigal for gene calling when the user is generating a [contigs-db](/help/8/artifacts/contigs-db). Yet, if the user provides an external gene calls file, then anvi'o does not perform gene calling, and uses this file to store the gene information into the new [contigs-db](/help/8/artifacts/contigs-db).
+
+External gene calls is a user-provided TAB-delimited file that should follow this format:
+
+|gene_callers_id|contig|start|stop|direction|partial|call_type|source|version|
+|:---:|:---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
+|1|contig_01|1113|1677|f|0|1|program|v1.0|
+|2|contig_01|1698|2142|f|0|1|program|v1.0|
+|3|contig_01|2229|3447|f|0|1|program|v1.0|
+|4|contig_01|3439|6820|r|0|1|program|v1.0|
+|7|contig_01|8496|10350|r|1|1|program|v1.0|
+|8|contig_02|306|1650|f|0|1|program|v1.0|
+|9|contig_02|1971|3132|f|0|1|program|v1.0|
+|10|contig_02|3230|4007|f|0|1|program|v1.0|
+|11|contig_02|4080|5202|f|0|1|program|v1.0|
+|12|contig_02|5194|5926|f|0|1|program|v1.0|
+|13|contig_03|606|2514|f|0|1|program|v1.0|
+|14|contig_03|2751|3207|f|0|1|program|v1.0|
+|15|contig_03|3219|5616|f|0|1|program|v1.0|
+|16|contig_03|5720|6233|f|0|1|program|v1.0|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+
+Please note that while anvi'o will not perform any gene prediction, it will still try to translate DNA sequence found in start-stop positions of each gene using the standard genetic code. You can prevent that by providing your own amino acid sequences by adding an optional column to your external gene calls file, `aa_sequence`:
+
+|gene_callers_id|contig|start|stop|direction|partial|call_type|source|version|aa_sequence|
+|:---:|:---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:--|
+|1|contig_01|1113|1677|f|0|1|program|v1.0|MAQTTNDIKNGSVLNLDGQLWTVI(...)|
+|2|contig_01|1698|2142|f|0|1|program|v1.0|MARSTARKRALNTLYEADEKGQDI(...)|
+|3|contig_01|2229|3447|f|0|1|program|v1.0|MNQYDSEAVMFDPQDAVLVLEDGQ(...)|
+|4|contig_01|3439|6820|r|0|1|program|v1.0|MPKRTDIKSVMVIGSGPIVIGQAA(...)|
+|7|contig_01|8496|10350|r|1|1|program|v1.0|MMSSPSSEEVNAQRSDFGLRLSNS(...)|
+|8|contig_02|306|1650|f|0|1|program|v1.0|MADSQHGRLIVLCGPAGVGKGTVL(...)|
+|9|contig_02|1971|3132|f|0|1|program|v1.0|MRSAKLMNGRVFAGARALYRAAGV(...)|
+|10|contig_02|3230|4007|f|0|1|program|v1.0|MAFGTEPTPTGLADPPIDDLMEHA(...)|
+|11|contig_02|4080|5202|f|0|1|program|v1.0|MAELKLISAESVTEGHPDKVCDQI(...)|
+|12|contig_02|5194|5926|f|0|1|program|v1.0|MRYPCIMTNEDAEQLALDGLAPRK(...)|
+|13|contig_03|606|2514|f|0|1|program|v1.0|MTLTLRMEKRMKGWPGEPQMEYDV(...)|
+|14|contig_03|2751|3207|f|0|1|program|v1.0|MLKVLFAGTPDVAVPSLKLLAQDT(...)|
+|15|contig_03|3219|5616|f|0|1|program|v1.0|MLEQETPNIASMASLPTLSAPGLL(...)|
+|16|contig_03|5720|6233|f|0|1|program|v1.0|MLESEVDMNDHDEETLASLQQAND(...)|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+
+Explicitly defining amino acid sequences could be particularly useful when working with eukaryotic genomes, and/or genomes that use non-standard genetic code. Sections below discuss specific information about columns of this file.
+
+### Gene start/stop positions
+
+Anvi'o follows the convention of string indexing and slicing that is identical to the way one does it in Python or C. This means that the index of the first nucleotide in any contig should be `0`. In other words, for a gene call that starts at the `x`th position and ends at the `y`th position, we start counting from `x-1`, and not from `x` (but we still end at `y`). The `start` and `stop` positions in the input file should comply with this criterion. Here is an example gene in a contig:
+
+``` bash
+                 1         2         3
+nt pos: 12345678901234567890123456789012 (...)
+contig: NNNATGNNNNNNNNNNNNNNNNNTAGAAAAAA (...)
+           |______ gene X _______|
+```
+
+The `start` and `stop` positions in the input file for this gene should be `3` and `26`, respectively. This means that if you are generating an external gene calls file from the output of a gene caller that reports start/stop positions starting with the index of `1` rather than `0`, you need to subtract one from the start position of every gene call (and leave the stop position as is) to obtain a matching anvi'o external gene calls file.
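+
+The conversion can be sketched in a few lines of Python using the example contig above. The function name `to_anvio_coords` is hypothetical, and the sketch assumes that your gene caller reports 1-based positions where both the start and the stop are inclusive:
+
+``` python
+def to_anvio_coords(start_1based, stop_1based):
+    """Convert 1-based, inclusive positions to anvi'o start/stop values."""
+    if start_1based < 1 or stop_1based < start_1based:
+        raise ValueError("expected 1 <= start <= stop")
+
+    # only the start position shifts, the stop position stays as is
+    return start_1based - 1, stop_1based
+
+
+# the example contig shown above, where the gene occupies the
+# 4th through the 26th nucleotides
+contig = "NNNATG" + "N" * 17 + "TAGAAAAAA"
+
+start, stop = to_anvio_coords(4, 26)
+print(start, stop)
+
+# the resulting values slice the gene out of the contig directly
+print(contig[start:stop])
+```
+
+Because the resulting values behave like Python slice indices, `stop - start` gives the length of the gene, which is 23 nucleotides in this example.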
+
+Gene `start` and `stop` positions do not care about the direction of the gene as they simply address how the gene sequence should be sliced out from a longer sequence. Whether a gene is forward or reverse is defined in the column `direction`.
+
+{:.notice}
+You can read the previous discussions regarding this behavior in [this issue](https://github.com/meren/anvio/issues/374). Thanks for your patience!
+
+### Call type
+
+{:.notice}
+This is a feature added after anvi'o `v6.2`. If you are using anvi'o `v6.2` or earlier, please remove `call_type` column from your external gene calls file.
+
+The column `call_type` declares the nature of the call. It can take one of the following three integer values:
+
+* `1`, indicates that the gene call is for a CODING gene. For gene calls marked as CODING genes, anvi'o will try to predict the proper coding frame when [anvi-gen-contigs-database](/help/8/programs/anvi-gen-contigs-database) is run. The prediction uses Markov models trained on a large number of protein sequences, and was first described in this [pull request](https://github.com/merenlab/anvio/pull/1428). This is the default behavior for CODING sequences regardless of whether the gene call is partial or not. However, there are two ways the user can change this: (1) by providing an amino acid sequence for the call in the `aa_sequence` column or (2) by asking [anvi-gen-contigs-database](/help/8/programs/anvi-gen-contigs-database) to `--skip-predict-frame`.
+
+* `2`, indicates that the gene call is for a NONCODING gene. This is used for non-coding RNAs (transfer RNAs or ribosomal RNAs). For gene calls marked as NONCODING, anvi'o will not attempt to predict an amino acid sequence (nor will it tolerate entries in the `aa_sequence` column).
+
+* `3`, indicates that the gene call is for an UNKNOWN genomic region. This is currently reserved for experimental purposes.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/external-gene-calls.md) to update this information.
+
diff --git a/help/8/artifacts/external-genomes/index.md b/help/8/artifacts/external-genomes/index.md
new file mode 100644
index 00000000..d6de8f6f
--- /dev/null
+++ b/help/8/artifacts/external-genomes/index.md
@@ -0,0 +1,65 @@
+---
+layout: artifact
+title: external-genomes
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/external-genomes
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-script-gen-genomes-file](../../programs/anvi-script-gen-genomes-file)
+
+
+## Required or used by
+
+
+[anvi-compute-functional-enrichment-across-genomes](../../programs/anvi-compute-functional-enrichment-across-genomes) [anvi-compute-genome-similarity](../../programs/anvi-compute-genome-similarity) [anvi-compute-metabolic-enrichment](../../programs/anvi-compute-metabolic-enrichment) [anvi-dereplicate-genomes](../../programs/anvi-dereplicate-genomes) [anvi-display-functions](../../programs/anvi-display-functions) [anvi-estimate-genome-completeness](../../programs/anvi-estimate-genome-completeness) [anvi-estimate-metabolism](../../programs/anvi-estimate-metabolism) [anvi-gen-genomes-storage](../../programs/anvi-gen-genomes-storage) [anvi-get-codon-frequencies](../../programs/anvi-get-codon-frequencies) [anvi-get-codon-usage-bias](../../programs/anvi-get-codon-usage-bias) [anvi-get-sequences-for-hmm-hits](../../programs/anvi-get-sequences-for-hmm-hits) [anvi-script-gen-function-matrix-across-genomes](../../programs/anvi-script-gen-function-matrix-across-genomes) [anvi-script-gen-functions-per-group-stats-output](../../programs/anvi-script-gen-functions-per-group-stats-output) [anvi-script-gen-hmm-hits-matrix-across-genomes](../../programs/anvi-script-gen-hmm-hits-matrix-across-genomes)
+
+
+## Description
+
+In the anvi'o lingo, an external genome is any [contigs-db](/help/8/artifacts/contigs-db) generated from a FASTA file that describes a single genome for a single microbial population (and not a metagenome).
+
+The purpose of the external genomes file is to describe one or more external genomes, so this file can be passed to anvi'o programs that can operate on multiple genomes.
+
+For a given set of [contigs-db](/help/8/artifacts/contigs-db) files, you can generate an external-genomes file automatically using the program [anvi-script-gen-genomes-file](/help/8/programs/anvi-script-gen-genomes-file). Alternatively, you can manually create the file using a text editor, or a program like Excel.
+
+The external-genomes file is a TAB-delimited file with at least two columns (you can add more columns to this file, and anvi'o will not mind):
+
+* `name`. The name of the external genome. You can call it anything, but you should keep it to a single word without any spaces or funny characters.
+* `contigs_db_path`. The full path to each [contigs-db](/help/8/artifacts/contigs-db) file (tip: the command `pwd` will tell you the full path to the directory you are in).
+
+The format of the file should look like this:
+
+|name|contigs_db_path|
+|:--|:--|
+|Name_01|/path/to/contigs-01.db|
+|Name_02|/path/to/contigs-02.db|
+|Name_03|/path/to/contigs-03.db|
+|(...)|(...)|
+
+{:.warning}
+Please make sure names in the `name` column do not include any special characters (underscore is fine). It is also a good idea to keep these names short and descriptive as they will appear in various figures in downstream analyses.
+
+Also see **[internal-genomes](/help/8/artifacts/internal-genomes)** and **[metagenomes](/help/8/artifacts/metagenomes)**.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/external-genomes.md) to update this information.
+
diff --git a/help/8/artifacts/external-structures/index.md b/help/8/artifacts/external-structures/index.md
new file mode 100644
index 00000000..6f4bd496
--- /dev/null
+++ b/help/8/artifacts/external-structures/index.md
@@ -0,0 +1,60 @@
+---
+layout: artifact
+title: external-structures
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/external-structures
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+By default, anvi'o predicts protein structures using MODELLER when creating a [structure-db](/help/8/artifacts/structure-db). Yet, if the user provides an external structures file, then anvi'o does not perform template-based homology modelling, and instead uses this file to obtain the structure information for the [structure-db](/help/8/artifacts/structure-db).
+
+External structures is a user-provided TAB-delimited file that should follow this format:
+
+|gene_callers_id|path|
+|:---:|:---|
+|1|path/to/gene1/structure.pdb|
+|2|path/to/gene2/structure.pdb|
+|3|path/to/gene3/structure.pdb|
+|4|path/to/gene4/structure.pdb|
+|7|path/to/gene5/structure.pdb|
+|8|path/to/gene6/structure.pdb|
+|(...)|(...)|
+
+Each path should point to a [protein-structure-txt](/help/8/artifacts/protein-structure-txt).
+
+{:.notice}
+Please note that anvi'o will try its best to test the integrity of each file and work with any limitations; ultimately, however, the user may be subject to the strict requirements set forth by anvi'o. For example, if a structure has a missing residue, you will hear about it.
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/external-structures.md) to update this information.
+
diff --git a/help/8/artifacts/fasta-txt/index.md b/help/8/artifacts/fasta-txt/index.md
new file mode 100644
index 00000000..62068c6b
--- /dev/null
+++ b/help/8/artifacts/fasta-txt/index.md
@@ -0,0 +1,62 @@
+---
+layout: artifact
+title: fasta-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/fasta-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This is a file used by [anvi-run-workflow](/help/8/programs/anvi-run-workflow) that lists the name and path of all of the input [fasta](/help/8/artifacts/fasta) files.
+
+In its simplest form, a [fasta-txt](/help/8/artifacts/fasta-txt) is a TAB-delimited file with two columns for `name` and `path`. Here is an example:
+
+|name|path|
+|:--|:--|
+|SAMPLE_01|path/to/sample_01.fa|
+|SAMPLE_02|path/to/sample_02.fa|
+
+Paths can be absolute or relative, and FASTA files can be compressed or not. That's all up to you.
+
+One of the primary users of the [fasta-txt](/help/8/artifacts/fasta-txt) is the [anvi'o snakemake workflows](https://merenlab.org/2018/07/09/anvio-snakemake-workflows/), and to make it more compatible with complex workflow needs, [fasta-txt](/help/8/artifacts/fasta-txt) supports the following additional columns to provide more information for each FASTA file when available, such as an [external-gene-calls](/help/8/artifacts/external-gene-calls) file and/or a [functions-txt](/help/8/artifacts/functions-txt) file.
+
+Here is an example with those additional columns:
+
+|name|path|external_gene_calls|gene_functional_annotation|
+|:--|:--|:--|:--|
+|SAMPLE_01|path/to/sample_01.fa|external-gene-calls_01.txt|functions-txt_01.txt|
+|SAMPLE_02|path/to/sample_02.fa|external-gene-calls_02.txt|functions-txt_02.txt|
+
+For more information, check out the [anvi'o workflow tutorial](https://merenlab.org/2018/07/09/anvio-snakemake-workflows/#fastatxt).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/fasta-txt.md) to update this information.
+
diff --git a/help/8/artifacts/fasta/index.md b/help/8/artifacts/fasta/index.md
new file mode 100644
index 00000000..a0d13d39
--- /dev/null
+++ b/help/8/artifacts/fasta/index.md
@@ -0,0 +1,58 @@
+---
+layout: artifact
+title: fasta
+excerpt: A FASTA-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/fasta
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A FASTA-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-dereplicate-genomes](../../programs/anvi-dereplicate-genomes) [anvi-script-fix-homopolymer-indels](../../programs/anvi-script-fix-homopolymer-indels)
+
+
+## Required or used by
+
+
+[anvi-dereplicate-genomes](../../programs/anvi-dereplicate-genomes) [anvi-search-palindromes](../../programs/anvi-search-palindromes) [anvi-script-compute-ani-for-fasta](../../programs/anvi-script-compute-ani-for-fasta) [anvi-script-fix-homopolymer-indels](../../programs/anvi-script-fix-homopolymer-indels) [anvi-script-reformat-fasta](../../programs/anvi-script-reformat-fasta)
+
+
+## Description
+
+A [FASTA](https://en.wikipedia.org/wiki/FASTA_format) file that does not necessarily meet the standards of a [contigs-fasta](/help/8/artifacts/contigs-fasta). While it is not necessary for all programs, if a given anvi'o program requires a [contigs-fasta](/help/8/artifacts/contigs-fasta), the program [anvi-script-reformat-fasta](/help/8/programs/anvi-script-reformat-fasta) can turn a regular fasta into a [contigs-fasta](/help/8/artifacts/contigs-fasta) with the flag `--simplify-names`.
+
+### What is a FASTA file?
+
+A FASTA file typically contains one or more DNA, RNA, or amino acid sequences that are formatted as follows:
+
+```
+>SEQUENCE_ID VARIOUS_SEQUENCE_DATA
+SEQUENCE
+(...)
+```
+
+The line that starts with the character `>` is also known as the 'defline' for a given sequence. The `VARIOUS_SEQUENCE_DATA` region of the defline can be empty, or contain additional data such as the NCBI taxon ID, GI accession number, a text description of the sequence, or the start and end positions if the sequence is a portion of a larger sample. Because the FASTA file format was designed before there were even enough electronic calculators on the planet, there is no actual standard format to organize additional information shared in the defline.
+
+The sequence itself is typically written in standard [IUPAC format](https://en.wikipedia.org/wiki/Nucleic_acid_notation), although you may find FASTA files with sequences that contain lower-case letters, mixed letters, no letters, or pretty much anything really. Over the years we have seen everything, and we suggest that you take a careful look at your FASTA files before doing anything with them unless you generated them yourself.
+
+You can learn more about the FASTA format on its [glorious Wikipedia page](https://en.wikipedia.org/wiki/FASTA_format).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/fasta.md) to update this information.
+
diff --git a/help/8/artifacts/fixation-index-matrix/index.md b/help/8/artifacts/fixation-index-matrix/index.md
new file mode 100644
index 00000000..419d2fa7
--- /dev/null
+++ b/help/8/artifacts/fixation-index-matrix/index.md
@@ -0,0 +1,54 @@
+---
+layout: artifact
+title: fixation-index-matrix
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/fixation-index-matrix
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-gen-fixation-index-matrix](../../programs/anvi-gen-fixation-index-matrix)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+A fixation index matrix is what it sounds like (a matrix of fixation indices) and is generated by [anvi-gen-fixation-index-matrix](/help/8/programs/anvi-gen-fixation-index-matrix).
+
+This is a distance matrix where each column represents a metagenome/sample and each row represents a metagenome/sample, so each cell represents the fixation index between two samples. The fixation index is a number from 0 (maximum similarity) to 1 (maximum distance).
+
+This is a form of [view-data](/help/8/artifacts/view-data), so it can be provided to [anvi-matrix-to-newick](/help/8/programs/anvi-matrix-to-newick) as is done in the [Infant Gut tutorial](https://merenlab.org/tutorials/infant-gut/#measuring-distances-between-metagenomes-with-fst).
+
+For example, here is the fixation index matrix based on a few random genes (1, 2, 3, 5, 6, 7, 8, 9, 21, 32, 35, 56, 567) in the Infant Gut tutorial:
+
+ DAY_15A DAY_15B DAY_16 DAY_17A DAY_17B DAY_18 DAY_19 DAY_22A DAY_22B DAY_23 DAY_24
+ DAY_15A 0.0 0.04918776635439459 0.11098963862572608 0.047199977957045336 0.0 0.08439930158545839 0.06141920095408482 0.09624218229853498 0.0 0.04711838006230529 0.08709060259498314
+ DAY_15B 0.04918776635439459 0.0 0.07095764349628797 0.04742239843190177 0.04918776635439459 0.058754132113538526 0.012860485049109083 0.056111593790529324 0.04918776635439459 0.012757678005705375 0.05031628777187558
+
+ ...
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/fixation-index-matrix.md) to update this information.
+
diff --git a/help/8/artifacts/functional-enrichment-txt/index.md b/help/8/artifacts/functional-enrichment-txt/index.md
new file mode 100644
index 00000000..6c25327b
--- /dev/null
+++ b/help/8/artifacts/functional-enrichment-txt/index.md
@@ -0,0 +1,75 @@
+---
+layout: artifact
+title: functional-enrichment-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/functional-enrichment-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-compute-functional-enrichment-across-genomes](../../programs/anvi-compute-functional-enrichment-across-genomes) [anvi-compute-functional-enrichment-in-pan](../../programs/anvi-compute-functional-enrichment-in-pan) [anvi-compute-metabolic-enrichment](../../programs/anvi-compute-metabolic-enrichment) [anvi-display-functions](../../programs/anvi-display-functions) [anvi-script-gen-function-matrix-across-genomes](../../programs/anvi-script-gen-function-matrix-across-genomes)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This is a TAB-delimited output file that describes enrichment scores and associated groups for functions or metabolic modules in groups of genomes or samples.
+
+## General format
+
+Each row in the matrix describes an entity (a function, functional association of a gene cluster, or metabolic module) that is associated with one or more groups of samples or genomes. These are listed with the highest enrichment scores displayed first.
+
+The following columns of information are listed in the file:
+
+- the name of the enriched entity, which can be a functional association, metabolic module, or function. The header of this column is either your functional annotation source, OR 'KEGG_MODULE' if you are working with metabolic modules
+- enrichment_score: a measure of how much this particular entity is enriched in the group(s) it is associated with (i.e., how unique the entity in column 1 is to the group(s) in column 5)
+- unadjusted_p_value: the significance value of the hypothesis test for enrichment, unadjusted for multiple hypothesis testing
+- adjusted_q_value: the adjusted p-value after taking into account multiple hypothesis testing
+- associated groups: the list of groups that this entity is associated with
+- accession: a function accession number or KEGG module number
+- a list of gene cluster ids, sample names, or genome names that this entity is found in
+- p values for each group: gives the proportion of the group's member genomes or samples in which this entity was found.
+- N values for each group: gives the total number of genomes or samples in each group.
+
+## A specific example - enriched functions in pangenomes
+
+When you run [anvi-compute-functional-enrichment-in-pan](/help/8/programs/anvi-compute-functional-enrichment-in-pan) to compute enrichment scores for functions in a pangenome, the resulting matrix describes the gene cluster-level functional associations that are enriched within specific groups of your pangenome. This is described in more detail [in the pangenomics tutorial](http://merenlab.org/2016/11/08/pangenomics-v2/#making-sense-of-functions-in-your-pangenome).
+
+Here is a more concrete example (the same example as in the [pangenomics tutorial](http://merenlab.org/2016/11/08/pangenomics-v2/#making-sense-of-functions-in-your-pangenome)). Note that that tutorial uses `COG_FUNCTION` as the functional annotation source, and has `LL` (low light) and `HL` (high light) as the two pan-groups.
+
+|COG_FUNCTION | enrichment_score | unadjusted_p_value | adjusted_q_value | associated_groups | accession | gene_clusters_ids | p_LL | p_HL | N_LL | N_HL|
+|-- | -- | -- | -- | -- | -- | -- | -- | -- | --| --|
+|Proteasome lid subunit RPN8/RPN11, contains Jab1/MPN domain metalloenzyme (JAMM) motif | 31.00002279 | 2.58E-08 | 1.43E-06 | LL | COG1310 | GC_00002219, GC_00003850, GC_00004483 | 1 | 0 | 11 | 20|
+|Adenine-specific DNA glycosylase, acts on AG and A-oxoG pairs | 31.00002279 | 2.58E-08 | 1.43E-06 | LL | COG1194 | GC_00001711 | 1 | 0 | 11 | 20|
+|Periplasmic beta-glucosidase and related glycosidases | 31.00002279 | 2.58E-08 | 1.43E-06 | LL | COG1472 | GC_00002086, GC_00003909 | 1 | 0 | 11 | 20|
+|Single-stranded DNA-specific exonuclease, DHH superfamily, may be involved in archaeal DNA replication intiation | 31.00002279 | 2.58E-08 | 1.43E-06 | LL | COG0608 | GC_00002752, GC_00003786, GC_00004838, GC_00007241 | 1 | 0 | 11 | 20|
+|Ser/Thr protein kinase RdoA involved in Cpx stress response, MazF antagonist | 31.00002279 | 2.58E-08 | 1.43E-06 | LL | COG2334 | GC_00002783, GC_00003936, GC_00004631, GC_00005468 | 1 | 0 | 11 | 20|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+|Signal transduction histidine kinase | -7.34E-41 | 1 | 1 | NA | COG5002 | GC_00000773, GC_00004293 | 1 | 1 | 11 | 20|
+|tRNA A37 methylthiotransferase MiaB | -7.34E-41 | 1 | 1 | NA | COG0621 | GC_00000180, GC_00000851 | 1 | 1 | 11 | 20|
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/functional-enrichment-txt.md) to update this information.
+
diff --git a/help/8/artifacts/functions-across-genomes-txt/index.md b/help/8/artifacts/functions-across-genomes-txt/index.md
new file mode 100644
index 00000000..49a52884
--- /dev/null
+++ b/help/8/artifacts/functions-across-genomes-txt/index.md
@@ -0,0 +1,100 @@
+---
+layout: artifact
+title: functions-across-genomes-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/functions-across-genomes-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-script-gen-function-matrix-across-genomes](../../programs/anvi-script-gen-function-matrix-across-genomes)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+A TAB-delimited output file that describes the distribution of functions across a group of genomes.
+
+## General format
+
+With this file type anvi'o may describe the presence-absence of individual functions across genomes, or the frequency of functions. The header of the last column in a [functions-across-genomes-txt](/help/8/artifacts/functions-across-genomes-txt) file will be the name of the function annotation source from which the gene functions were recovered.
+
+### Frequency of function hits across genomes
+
+Here is an example output file that shows the frequency of functions across 9 _Bifidobacterium_ genomes:
+
+|**key**|**B_adolescentis_1_11**|**B_adolescentis_22L**|**B_adolescentis_6**|**B_lactis_Bl_04**|**B_lactis_CNCM_I_2494**|**B_lactis_DSM_10140**|**B_longum_JDM301**|**B_longum_KACC_91563**|**B_longum_NCIMB8809**|**COG20_FUNCTION**|
+|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|
+|`func_3874a4219ffa`|1|1|1|1|1|1|1|1|1|Chromosomal replication initiation ATPase DnaA (DnaA) (PDB:1L8Q)|
+|`func_1530552f6b61`|1|1|1|1|1|1|1|1|1|DNA polymerase III sliding clamp (beta) subunit, PCNA homolog (DnaN) (PDB:1JQJ)|
+|`func_060988dd6d3a`|1|1|1|1|1|1|1|1|1|Recombinational DNA repair ATPase RecF (RecF) (PDB:5Z67)|
+|`func_46477a763d38`|1|1|1|1|1|1|1|1|1|Predicted nucleic acid-binding protein, contains Zn-ribbon domain (includes truncated derivatives)|
+|`func_184ca83fe067`|2|2|2|2|2|2|2|2|2|DNA gyrase/topoisomerase IV, subunit B (GyrB) (PDB:1EI1)|
+|`func_2ee67347f04b`|2|2|2|2|2|2|2|2|2|DNA gyrase/topoisomerase IV, subunit A (GyrA) (PDB:1SUU)|
+|`func_ff160914972f`|1|0|0|0|0|0|3|1|1|Predicted ATPase, archaeal AAA+ ATPase superfamily|
+|`func_598325c18ceb`|1|0|0|0|0|0|1|1|1|Molybdopterin or thiamine biosynthesis adenylyltransferase (ThiF) (PDB:1ZUD) (PUBMED:32239579)|
+|`func_8ce5ff84aa42`|14|13|13|10|10|10|22|17|16|Predicted arabinose efflux permease AraJ, MFS family (AraJ) (PDB:4LDS)|
+|`func_300e2c6e37e4`|1|1|1|1|1|1|1|1|1|Glutamate dehydrogenase/leucine dehydrogenase (GdhA) (PDB:1B3B) (PUBMED:24391520)|
+|`func_a666bc87a03f`|1|1|1|1|1|1|1|1|1|Large-conductance mechanosensitive channel (MscL) (PDB:2OAR)|
+|`func_71379d89f0c6`|1|1|1|1|1|1|1|1|1|UTP pyrophosphatase, metal-dependent hydrolase family (YgjP) (PDB:4JIU) (PUBMED:27941785)|
+|`func_772efbb7cb2e`|2|2|2|2|2|2|2|2|2|Hemolysin-related protein, contains CBS domains, UPF0053 family (TlyC) (PDB:2NQW)|
+|`func_5a5bf9735c8a`|1|1|1|1|1|1|1|1|1|Carbonic anhydrase (CynT) (PDB:1EKJ) (PUBMED:22081392)|
+|`func_1ffc26d0034f`|4|4|1|3|3|3|12|13|2|Transposase (or an inactivated derivative) (IS285)|
+|`func_0631d9fecae6`|2|1|1|1|1|1|2|1|1|Phosphoenolpyruvate carboxylase (Ppc) (PDB:1FIY)|
+|`func_62068e09f9f3`|2|1|1|0|0|0|1|1|1|Chromosome segregation ATPase Smc (Smc) (PDB:5XG3)|
+|`func_f4b8779f6ad8`|2|0|0|0|0|0|0|0|0|Peptidoglycan-binding (PGRP) domain of peptidoglycan hydrolases (PGRP) (PDB:4FET)|
+|`func_700652d0a2dd`|1|1|1|1|1|1|1|1|1|Uncharacterized membrane protein YjjP, DUF1212 family (YjjP)|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+
+### Presence/absence of function hits across genomes
+
+In contrast, here is the presence/absence report for the same data:
+
+|**key**|**B_adolescentis_1_11**|**B_adolescentis_22L**|**B_adolescentis_6**|**B_lactis_Bl_04**|**B_lactis_CNCM_I_2494**|**B_lactis_DSM_10140**|**B_longum_JDM301**|**B_longum_KACC_91563**|**B_longum_NCIMB8809**|**COG20_FUNCTION**|
+|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|
+|`func_3874a4219ffa`|1|1|1|1|1|1|1|1|1|Chromosomal replication initiation ATPase DnaA (DnaA) (PDB:1L8Q)|
+|`func_1530552f6b61`|1|1|1|1|1|1|1|1|1|DNA polymerase III sliding clamp (beta) subunit, PCNA homolog (DnaN) (PDB:1JQJ)|
+|`func_060988dd6d3a`|1|1|1|1|1|1|1|1|1|Recombinational DNA repair ATPase RecF (RecF) (PDB:5Z67)|
+|`func_46477a763d38`|1|1|1|1|1|1|1|1|1|Predicted nucleic acid-binding protein, contains Zn-ribbon domain (includes truncated derivatives)|
+|`func_184ca83fe067`|1|1|1|1|1|1|1|1|1|DNA gyrase/topoisomerase IV, subunit B (GyrB) (PDB:1EI1)|
+|`func_2ee67347f04b`|1|1|1|1|1|1|1|1|1|DNA gyrase/topoisomerase IV, subunit A (GyrA) (PDB:1SUU)|
+|`func_ff160914972f`|1|0|0|0|0|0|1|1|1|Predicted ATPase, archaeal AAA+ ATPase superfamily|
+|`func_598325c18ceb`|1|0|0|0|0|0|1|1|1|Molybdopterin or thiamine biosynthesis adenylyltransferase (ThiF) (PDB:1ZUD) (PUBMED:32239579)|
+|`func_8ce5ff84aa42`|1|1|1|1|1|1|1|1|1|Predicted arabinose efflux permease AraJ, MFS family (AraJ) (PDB:4LDS)|
+|`func_300e2c6e37e4`|1|1|1|1|1|1|1|1|1|Glutamate dehydrogenase/leucine dehydrogenase (GdhA) (PDB:1B3B) (PUBMED:24391520)|
+|`func_a666bc87a03f`|1|1|1|1|1|1|1|1|1|Large-conductance mechanosensitive channel (MscL) (PDB:2OAR)|
+|`func_71379d89f0c6`|1|1|1|1|1|1|1|1|1|UTP pyrophosphatase, metal-dependent hydrolase family (YgjP) (PDB:4JIU) (PUBMED:27941785)|
+|`func_772efbb7cb2e`|1|1|1|1|1|1|1|1|1|Hemolysin-related protein, contains CBS domains, UPF0053 family (TlyC) (PDB:2NQW)|
+|`func_5a5bf9735c8a`|1|1|1|1|1|1|1|1|1|Carbonic anhydrase (CynT) (PDB:1EKJ) (PUBMED:22081392)|
+|`func_1ffc26d0034f`|1|1|1|1|1|1|1|1|1|Transposase (or an inactivated derivative) (IS285)|
+|`func_0631d9fecae6`|1|1|1|1|1|1|1|1|1|Phosphoenolpyruvate carboxylase (Ppc) (PDB:1FIY)|
+|`func_62068e09f9f3`|1|1|1|0|0|0|1|1|1|Chromosome segregation ATPase Smc (Smc) (PDB:5XG3)|
+|`func_f4b8779f6ad8`|1|0|0|0|0|0|0|0|0|Peptidoglycan-binding (PGRP) domain of peptidoglycan hydrolases (PGRP) (PDB:4FET)|
+|`func_700652d0a2dd`|1|1|1|1|1|1|1|1|1|Uncharacterized membrane protein YjjP, DUF1212 family (YjjP)|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/functions-across-genomes-txt.md) to update this information.
+
diff --git a/help/8/artifacts/functions-txt/index.md b/help/8/artifacts/functions-txt/index.md
new file mode 100644
index 00000000..e64193fe
--- /dev/null
+++ b/help/8/artifacts/functions-txt/index.md
@@ -0,0 +1,108 @@
+---
+layout: artifact
+title: functions-txt
+excerpt: A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/functions-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-export-functions](../../programs/anvi-export-functions) [anvi-search-functions](../../programs/anvi-search-functions) [anvi-script-get-hmm-hits-per-gene-call](../../programs/anvi-script-get-hmm-hits-per-gene-call) [anvi-script-process-genbank](../../programs/anvi-script-process-genbank) [anvi-script-process-genbank-metadata](../../programs/anvi-script-process-genbank-metadata) [anvi-script-transpose-matrix](../../programs/anvi-script-transpose-matrix)
+
+
+## Required or used by
+
+
+[anvi-import-functions](../../programs/anvi-import-functions) [anvi-script-transpose-matrix](../../programs/anvi-script-transpose-matrix)
+
+
+## Description
+
+This artifact is a TAB-delimited file that **associates genes and functions**.
+
+The user can generate this file to import gene functions into a [contigs-db](/help/8/artifacts/contigs-db) via [anvi-import-functions](/help/8/programs/anvi-import-functions) or can acquire this file by recovering it from a [contigs-db](/help/8/artifacts/contigs-db) via [anvi-export-functions](/help/8/programs/anvi-export-functions). It is also the output of [anvi-search-functions](/help/8/programs/anvi-search-functions) which searches for specific terms in your functional annotations.
+
+In general, this is the simplest way to get gene functions into anvi'o, and all downstream analyses, including pangenomics. For other ways to get gene functions into anvi'o you can take a look at [this page](http://merenlab.org/2016/06/18/importing-functions/).
+
+
+## Simple matrix file format
+
+
+The TAB-delimited file for this artifact has five columns:
+
+1. `gene_callers_id`: The gene caller ID recognized by anvi'o (see the note below).
+2. `source`: The name of the functional annotation source (i.e., the database that you got this function data from).
+3. `accession`: A unique accession id per function, preferably a single word.
+4. `function`: Full name / description of the function.
+5. `e_value`: The significance score of this annotation, where zero is maximum significance. This information may be used by anvi'o in operations that require filtering of functions based on their significance.
+
+Through this file format **you can import functions from any source** into anvi'o, whether those sources are commonly used programs to annotate genes with functions or your ad hoc manual curations for genes of interest. But **please note that while there are many ways to have your genes annotated with functions, there is only one way to make sure the gene caller ids anvi'o knows will match perfectly to the gene caller ids in your input file**. The best way to ensure that linkage is to export the gene DNA or amino acid sequences from your [contigs-db](/help/8/artifacts/contigs-db) using the anvi'o program `anvi-get-sequences-for-gene-calls`.
+
+## An example matrix
+
+Here is an example file that matches this format and can be used with [anvi-import-functions](/help/8/programs/anvi-import-functions) to import functions into a [contigs-db](/help/8/artifacts/contigs-db):
+
+|gene_callers_id|source|accession|function|e_value|
+|:--|:--:|:--:|:--|:--:|
+|1|Pfam|PF01132|Elongation factor P (EF-P) OB domain|4e-23|
+|1|Pfam|PF08207|Elongation factor P (EF-P) KOW-like domain|3e-25|
+|1|TIGRFAM|TIGR00038|efp: translation elongation factor P|1.5e-75|
+|2|Pfam|PF01029|NusB family|2.5e-30|
+|2|TIGRFAM|TIGR01951|nusB: transcription antitermination factor NusB|1.5e-36|
+|3|Pfam|PF00117|Glutamine amidotransferase class-I|2e-36|
+|3|Pfam|PF00988|Carbamoyl-phosphate synthase small chain, CPSase domain|1.2e-48|
+|3|TIGRFAM|TIGR01368|CPSaseIIsmall: carbamoyl-phosphate synthase, small subunit|1.5e-132|
+|4|Pfam|PF02787|Carbamoyl-phosphate synthetase large chain, oligomerisation domain|1.4e-31|
+|4|TIGRFAM|TIGR01369|CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit|0|
+|5|TIGRFAM|TIGR02127|pyrF_sub2: orotidine 5'-phosphate decarboxylase|1.9e-59|
+|6|Pfam|PF00625|Guanylate kinase|5.7e-39|
+|6|TIGRFAM|TIGR03263|guanyl_kin: guanylate kinase|3.5e-62|
+|8|Pfam|PF01192|RNA polymerase Rpb6|4.9e-13|
+|8|TIGRFAM|TIGR00690|rpoZ: DNA-directed RNA polymerase, omega subunit|1.7e-20|
+|9|TIGRFAM|TIGR01034|metK: methionine adenosyltransferase|2.5e-169|
+|11|Pfam|PF13419|Haloacid dehalogenase-like hydrolase|2.8e-27|
+|11|TIGRFAM|TIGR01509|HAD-SF-IA-v3: HAD hydrolase, family IA, variant 3|1.2e-11|
+|12|Pfam|PF00551|Formyl transferase|1.4e-34|
+|12|TIGRFAM|TIGR00460|fmt: methionyl-tRNA formyltransferase|2.9e-70|
+|13|Pfam|PF12710|haloacid dehalogenase-like hydrolase|2.3e-14|
+|13|TIGRFAM|TIGR00338|serB: phosphoserine phosphatase SerB|4.9e-76|
+|13|TIGRFAM|TIGR01488|HAD-SF-IB: HAD phosphoserine phosphatase-like hydrolase, family IB|6e-29|
+|14|Pfam|PF00004|ATPase family associated with various cellular activities (AAA)|7.7e-45|
+|14|Pfam|PF16450|Proteasomal ATPase OB/ID domain|1.8e-34|
+|14|TIGRFAM|TIGR03689|pup_AAA: proteasome ATPase|1e-206|
+|(...)|(...)|(...)|(...)|(...)|
+
+
+Please note that,
+
+* Not every gene call has to be present in the matrix,
+
+* It is OK if there are multiple annotations from the same source for a given gene call,
+
+* It is OK if a given gene is annotated only by a single source.
+
+* If the **accession** information is not available to you, it is OK to leave it blank (but it will prevent you from being able to use some toys, such as functional enrichment analyses later for pangenomes).
+
+* If you have no e-values associated with your annotations, it is OK to put `0` for every entry (you should make sure you keep this in mind for your downstream analyses that may require filtering of weak hits).
+
+* If there are multiple annotations from a single source for a single gene call, anvi'o uses the e-values in this file to show only the most significant one in its interfaces.
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/functions-txt.md) to update this information.
+
diff --git a/help/8/artifacts/functions/index.md b/help/8/artifacts/functions/index.md
new file mode 100644
index 00000000..9bca1df4
--- /dev/null
+++ b/help/8/artifacts/functions/index.md
@@ -0,0 +1,58 @@
+---
+layout: artifact
+title: functions
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/functions
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-import-functions](../../programs/anvi-import-functions) [anvi-run-cazymes](../../programs/anvi-run-cazymes) [anvi-run-kegg-kofams](../../programs/anvi-run-kegg-kofams) [anvi-run-ncbi-cogs](../../programs/anvi-run-ncbi-cogs) [anvi-run-pfams](../../programs/anvi-run-pfams)
+
+
+## Required or used by
+
+
+[anvi-analyze-synteny](../../programs/anvi-analyze-synteny) [anvi-compute-functional-enrichment-across-genomes](../../programs/anvi-compute-functional-enrichment-across-genomes) [anvi-compute-functional-enrichment-in-pan](../../programs/anvi-compute-functional-enrichment-in-pan) [anvi-compute-metabolic-enrichment](../../programs/anvi-compute-metabolic-enrichment) [anvi-delete-functions](../../programs/anvi-delete-functions) [anvi-display-functions](../../programs/anvi-display-functions) [anvi-export-functions](../../programs/anvi-export-functions) [anvi-setup-modelseed-database](../../programs/anvi-setup-modelseed-database) [anvi-script-gen-function-matrix-across-genomes](../../programs/anvi-script-gen-function-matrix-across-genomes) [anvi-script-gen-functions-per-group-stats-output](../../programs/anvi-script-gen-functions-per-group-stats-output)
+
+
+## Description
+
+This is an artifact that describes **annotation of genes in your [contigs-db](/help/8/artifacts/contigs-db) with functions**.
+
+Broadly used across anvi'o, functions are one of the most essential pieces of information stored in any [contigs-db](/help/8/artifacts/contigs-db). To see what annotation sources for functions are available in a given [contigs-db](/help/8/artifacts/contigs-db) or [genomes-storage-db](/help/8/artifacts/genomes-storage-db), you can use the program [anvi-db-info](/help/8/programs/anvi-db-info).
+
+To populate a given [contigs-db](/help/8/artifacts/contigs-db) with functions, anvi'o includes multiple programs that can annotate genes using various sources of annotation. These programs include,
+
+* [anvi-run-ncbi-cogs](/help/8/programs/anvi-run-ncbi-cogs), which uses NCBI's [COGs database](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC102395/),
+* [anvi-run-pfams](/help/8/programs/anvi-run-pfams), which uses EBI's [Pfam database](https://pfam.xfam.org/),
+* [anvi-run-cazymes](/help/8/programs/anvi-run-cazymes), which uses the dbCAN [CAZyme HMMs](https://bcb.unl.edu/dbCAN2/download/Databases/),
+* [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams), which uses the [Kyoto Encyclopedia of Genes and Genomes](https://www.genome.jp/kegg/) (KEGG) database and produces [kegg-functions](/help/8/artifacts/kegg-functions), which is the necessary annotation information that can be used by the program [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism).
+
+In addition, you can use the program [anvi-import-functions](/help/8/programs/anvi-import-functions) with a simple [functions-txt](/help/8/artifacts/functions-txt) to import functions from any other annotation source, or to import any ad hoc, user-defined function to later access through anvi'o interfaces or programs.
+
+{:.notice}
+You can also use [anvi-import-functions](/help/8/programs/anvi-import-functions) to import functions from EggNOG or InterProScan, as described in [this blog post](http://merenlab.org/2016/06/18/importing-functions/).
+
+You can also use [anvi-export-functions](/help/8/programs/anvi-export-functions) to obtain a file containing these functional annotations through a [functions-txt](/help/8/artifacts/functions-txt) artifact, and use [anvi-display-functions](/help/8/programs/anvi-display-functions) to show the distribution of functions across multiple [contigs-db](/help/8/artifacts/contigs-db)s.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/functions.md) to update this information.
+
diff --git a/help/8/artifacts/genbank-file/index.md b/help/8/artifacts/genbank-file/index.md
new file mode 100644
index 00000000..2c4a5331
--- /dev/null
+++ b/help/8/artifacts/genbank-file/index.md
@@ -0,0 +1,46 @@
+---
+layout: artifact
+title: genbank-file
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/genbank-file
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-script-process-genbank](../../programs/anvi-script-process-genbank)
+
+
+## Description
+
+The GenBank file format was created by NCBI.
+
+You can find an [explanation](https://www.ncbi.nlm.nih.gov/genbank/) and an [example](https://www.ncbi.nlm.nih.gov/genbank/samplerecord/) on the NCBI website.
+
+In anvi'o, this is used by [anvi-script-process-genbank](/help/8/programs/anvi-script-process-genbank) to convert the information in the GenBank file to a [contigs-fasta](/help/8/artifacts/contigs-fasta), [external-gene-calls](/help/8/artifacts/external-gene-calls), and [functions-txt](/help/8/artifacts/functions-txt).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/genbank-file.md) to update this information.
+
diff --git a/help/8/artifacts/gene-calls-txt/index.md b/help/8/artifacts/gene-calls-txt/index.md
new file mode 100644
index 00000000..67838066
--- /dev/null
+++ b/help/8/artifacts/gene-calls-txt/index.md
@@ -0,0 +1,56 @@
+---
+layout: artifact
+title: gene-calls-txt
+excerpt: A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/gene-calls-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-export-gene-calls](../../programs/anvi-export-gene-calls) [anvi-script-transpose-matrix](../../programs/anvi-script-transpose-matrix)
+
+
+## Required or used by
+
+
+[anvi-script-transpose-matrix](../../programs/anvi-script-transpose-matrix)
+
+
+## Description
+
+This file describes all of the gene calls contained in a [contigs-db](/help/8/artifacts/contigs-db) from a specified list of sources. It is the output of [anvi-export-gene-calls](/help/8/programs/anvi-export-gene-calls).
+
+For each gene identified, this file provides various information, including the caller ID, [start and stop position](http://merenlab.org/software/anvio/help/artifacts/external-gene-calls/#gene-startstop-positions), direction, whether or not the gene is partial, the [call type](http://merenlab.org/software/anvio/help/artifacts/external-gene-calls/#call-type), source and version (if available), and the amino acid sequence.
+
+{:.notice}
+Want more information? This file is in the same format as an [external-gene-calls](/help/8/artifacts/external-gene-calls), so check out that page.
+
+Here is an example from the Infant Gut Dataset:
+
+ gene_callers_id contig start stop direction partial call_type source version aa_sequence
+ 0 Day17a_QCcontig1 0 186 f 1 1 prodigal v2.60 GSSPTAGVEQKQKPTWFLLFLFYSLFFDKLEEGTLKTFIRLKGSYRRMNTSNFSYGIMCLL
+ 1 Day17a_QCcontig1 214 1219 f 0 1 prodigal v2.60 MKILLYFEGEKILAKSGIGRALDHQKRALSEVGIEYTLDADCSDYDILHINTYGVNSHRMVRKARKLGKKVIYHAHSTEEDFRNSFIGSNQLAPLVKKYLISLYSKADHLITPTPYSKTLLEGYGIKVPISAISNGIDLSRFYPSEEKEQKFREYFKIDEEKKVIICVGLFFERKGITDFIEVARQLPEYQFIWFGDTPMYSIPKNIRQLVKEDHPENVIFPGYIKGDVIEGAYAAANLFFFPSREETEGIVVLEALASQQQVLVRDIPVYQGWLVANENCYMGHSIEEFKKYIEGLLEGKIPSTREAGYQVAEQRSIKQIGYELKEVYETVLS
+ 2 Day17a_QCcontig1 1265 2489 f 0 1 prodigal v2.60 MKIGFFTDTYFPQVSGVATSIKTLKDELEKHGHEVYIFTTTDPNATDFEEDVIRMPSVPFVSFKDRRVVVRGMWYAYLIAKELELDLIHTHTEFGAGILGKMVGKKMKIPVIHTYHTMYEDYLHYIAKGKVVRPSHVKFFSRVFTNHTTGVVCPSERVIEKLRDYGVTAPMRIIPTGIEIDKFLRPDITEEMIAGMRQQLGIEEQQIMLLSLSRISYEKNIQAIIQGLPQVIEKLPQTRLVIVGNGPYLEDLKELAEELEVSEYVQFTGEVPNEEVAIYYKAADYFVSASTSETQGLTYTEAMAAGVQCVAEGNAYLNNLFDHESLGKTFKTDSDFAPTLIDYIQANIKMDQTILDEKLFEISSTNFGNKMIEFYQDTLIYFDQLQMEKENADSIKKIKVKFTSLRK
+ ...
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/gene-calls-txt.md) to update this information.
+
diff --git a/help/8/artifacts/gene-cluster-inspection/index.md b/help/8/artifacts/gene-cluster-inspection/index.md
new file mode 100644
index 00000000..69f0ca3e
--- /dev/null
+++ b/help/8/artifacts/gene-cluster-inspection/index.md
@@ -0,0 +1,41 @@
+---
+layout: artifact
+title: gene-cluster-inspection
+excerpt: A DISPLAY-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/gene-cluster-inspection
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DISPLAY-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-display-pan](../../programs/anvi-display-pan)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This page describes general properties of the gene cluster inspection page.
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/gene-cluster-inspection.md) to update this information.
+
diff --git a/help/8/artifacts/gene-taxonomy-txt/index.md b/help/8/artifacts/gene-taxonomy-txt/index.md
new file mode 100644
index 00000000..7435d68d
--- /dev/null
+++ b/help/8/artifacts/gene-taxonomy-txt/index.md
@@ -0,0 +1,55 @@
+---
+layout: artifact
+title: gene-taxonomy-txt
+excerpt: A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/gene-taxonomy-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-import-taxonomy-for-genes](../../programs/anvi-import-taxonomy-for-genes)
+
+
+## Description
+
+This is a file containing **the taxonomy information for the genes in your [contigs-db](/help/8/artifacts/contigs-db)**.
+
+You can use [anvi-import-taxonomy-for-genes](/help/8/programs/anvi-import-taxonomy-for-genes) to integrate this information into your contigs database. See [this blog post](http://merenlab.org/2016/06/18/importing-taxonomy/) for a comprehensive tutorial.
+
+In its simplest form, this file is a tab-delimited text file that lists gene caller IDs and their associated taxonomy information. However, anvi'o can also parse outputs from taxonomy-based software like [Kaiju](https://github.com/bioinformatics-centre/kaiju) or [Centrifuge](https://github.com/infphilo/centrifuge).
+
+For example:
+
+ gene_caller_id t_domain t_phylum t_class ...
+ 1 Eukarya Chordata Mammalia
+ 2 Prokarya Bacteroidetes Bacteroidia
+ ...
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/gene-taxonomy-txt.md) to update this information.
+
diff --git a/help/8/artifacts/gene-taxonomy/index.md b/help/8/artifacts/gene-taxonomy/index.md
new file mode 100644
index 00000000..efcdff7f
--- /dev/null
+++ b/help/8/artifacts/gene-taxonomy/index.md
@@ -0,0 +1,48 @@
+---
+layout: artifact
+title: gene-taxonomy
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/gene-taxonomy
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-import-taxonomy-for-genes](../../programs/anvi-import-taxonomy-for-genes)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This describes **the taxonomy information for the genes in your [contigs-db](/help/8/artifacts/contigs-db)**.
+
+You can use [anvi-import-taxonomy-for-genes](/help/8/programs/anvi-import-taxonomy-for-genes) to import this information through a [gene-taxonomy-txt](/help/8/artifacts/gene-taxonomy-txt), either from external data or by using software like [Kaiju](https://github.com/bioinformatics-centre/kaiju) or [Centrifuge](https://github.com/infphilo/centrifuge). See [this blog post](http://merenlab.org/2016/06/18/importing-taxonomy/) for a comprehensive tutorial.
+
+Once this information is populated, it will be displayed in most downstream interfaces, including [anvi-interactive](/help/8/programs/anvi-interactive).
+
+You can also add taxonomy information for the layers in your interface (most likely sections of your sample when analyzing a single sample) using [anvi-import-taxonomy-for-layers](/help/8/programs/anvi-import-taxonomy-for-layers), or at the genome or metagenome level using [anvi-estimate-scg-taxonomy](/help/8/programs/anvi-estimate-scg-taxonomy).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/gene-taxonomy.md) to update this information.
+
diff --git a/help/8/artifacts/genes-db/index.md b/help/8/artifacts/genes-db/index.md
new file mode 100644
index 00000000..12b5a570
--- /dev/null
+++ b/help/8/artifacts/genes-db/index.md
@@ -0,0 +1,48 @@
+---
+layout: artifact
+title: genes-db
+excerpt: A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/genes-db
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-gen-gene-level-stats-databases](../../programs/anvi-gen-gene-level-stats-databases)
+
+
+## Required or used by
+
+
+[anvi-db-info](../../programs/anvi-db-info) [anvi-interactive](../../programs/anvi-interactive) [anvi-migrate](../../programs/anvi-migrate) [anvi-search-sequence-motifs](../../programs/anvi-search-sequence-motifs)
+
+
+## Description
+
+An anvi'o genes database is a [profile-db](/help/8/artifacts/profile-db)-like database that contains statistics, such as coverage and detection across samples, for genes rather than for contigs in a given [contigs-db](/help/8/artifacts/contigs-db).
+
+A genes database for a given [bin](/help/8/artifacts/bin) stored in a [collection](/help/8/artifacts/collection) will be automatically generated when [anvi-interactive](/help/8/programs/anvi-interactive) is run in 'gene mode'. For details, see the [relevant section](../programs/anvi-interactive/#visualizing-genes-instead-of-contigs) in [anvi-interactive](/help/8/programs/anvi-interactive).
+
+Alternatively, genes databases can be explicitly generated using the program [anvi-gen-gene-level-stats-databases](/help/8/programs/anvi-gen-gene-level-stats-databases). By default, this program will generate a genes database for each [bin](/help/8/artifacts/bin) in a given [collection](/help/8/artifacts/collection).
+
+Due to the structural similarities between a [genes-db](/help/8/artifacts/genes-db) and a [profile-db](/help/8/artifacts/profile-db), many of the anvi'o programs that operate on profile databases will also run on genes databases. These programs include those that import/export states and import/export miscellaneous additional data.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/genes-db.md) to update this information.
+
diff --git a/help/8/artifacts/genes-fasta/index.md b/help/8/artifacts/genes-fasta/index.md
new file mode 100644
index 00000000..ba6e7960
--- /dev/null
+++ b/help/8/artifacts/genes-fasta/index.md
@@ -0,0 +1,44 @@
+---
+layout: artifact
+title: genes-fasta
+excerpt: A FASTA-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/genes-fasta
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A FASTA-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-gen-gene-consensus-sequences](../../programs/anvi-gen-gene-consensus-sequences) [anvi-get-sequences-for-gene-calls](../../programs/anvi-get-sequences-for-gene-calls) [anvi-get-sequences-for-gene-clusters](../../programs/anvi-get-sequences-for-gene-clusters) [anvi-get-sequences-for-hmm-hits](../../programs/anvi-get-sequences-for-hmm-hits)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+A genes-fasta is what it sounds like: a FASTA-formatted file that contains genes. In anvi'o, this is an output for programs that return gene sequences. This includes [anvi-get-sequences-for-gene-calls](/help/8/programs/anvi-get-sequences-for-gene-calls), [anvi-get-sequences-for-gene-clusters](/help/8/programs/anvi-get-sequences-for-gene-clusters) (when working with pangenomes), and [anvi-get-sequences-for-hmm-hits](/help/8/programs/anvi-get-sequences-for-hmm-hits).
+
+If you're unsure what a FASTA file is, check out [fasta](/help/8/artifacts/fasta).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/genes-fasta.md) to update this information.
+
diff --git a/help/8/artifacts/genes-stats/index.md b/help/8/artifacts/genes-stats/index.md
new file mode 100644
index 00000000..a2a9bdd7
--- /dev/null
+++ b/help/8/artifacts/genes-stats/index.md
@@ -0,0 +1,52 @@
+---
+layout: artifact
+title: genes-stats
+excerpt: A STATS-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/genes-stats
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A STATS-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-script-gen_stats_for_single_copy_genes.py](../../programs/anvi-script-gen_stats_for_single_copy_genes.py)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This file contains information about your genes.
+
+It is a tab-delimited text file where each row represents a specific gene and each column provides different information.
+
+As of now, the only program that returns data in this format is [anvi-script-gen_stats_for_single_copy_genes.py](/help/8/programs/anvi-script-gen_stats_for_single_copy_genes.py), which returns this information for the single copy core genes in your [contigs-db](/help/8/artifacts/contigs-db).
+
+From left to right, the columns tell you:
+* The source for this gene (e.g., `Protista_83`)
+* The name of the contig that this gene is a part of
+* The gene name
+* The e-value (of the HMM hit that was used to find this gene)
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/genes-stats.md) to update this information.
+
diff --git a/help/8/artifacts/genome-similarity/index.md b/help/8/artifacts/genome-similarity/index.md
new file mode 100644
index 00000000..aec33a4d
--- /dev/null
+++ b/help/8/artifacts/genome-similarity/index.md
@@ -0,0 +1,62 @@
+---
+layout: artifact
+title: genome-similarity
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/genome-similarity
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-compute-genome-similarity](../../programs/anvi-compute-genome-similarity) [anvi-script-compute-ani-for-fasta](../../programs/anvi-script-compute-ani-for-fasta)
+
+
+## Required or used by
+
+
+[anvi-dereplicate-genomes](../../programs/anvi-dereplicate-genomes)
+
+
+## Description
+
+This is the output of [anvi-compute-genome-similarity](/help/8/programs/anvi-compute-genome-similarity) (which describes the level of similarity between all of the input genomes) or [anvi-script-compute-ani-for-fasta](/help/8/programs/anvi-script-compute-ani-for-fasta) (which describes the level of similarity between contigs in a fasta file).
+
+{:.notice}
+The output of [anvi-compute-genome-similarity](/help/8/programs/anvi-compute-genome-similarity) will only be in this structure if you did not input a [pan-db](/help/8/artifacts/pan-db). Otherwise, the data will be put directly into the additional data tables of the [pan-db](/help/8/artifacts/pan-db). The same is true of [anvi-script-compute-ani-for-fasta](/help/8/programs/anvi-script-compute-ani-for-fasta).
+
+This is a directory (named by the user) that contains a [dendrogram](/help/8/artifacts/dendrogram) (Newick tree) and a matrix of pairwise similarity scores for each of several metrics. Which metrics you get depends on the program you asked [anvi-compute-genome-similarity](/help/8/programs/anvi-compute-genome-similarity) or [anvi-script-compute-ani-for-fasta](/help/8/programs/anvi-script-compute-ani-for-fasta) to use.
+
+For example, if you used `pyANI`'s `ANIb` (the default program), the output directory will contain the following twelve files. These are created directly from the heatmaps generated by pyANI, just converted into matrices and Newick files:
+
+- `ANIb_alignment_coverage.newick` and `ANIb_alignment_coverage.txt`: contain the percent coverage (for query and subject)
+
+- `ANIb_percentage_identity.newick` and `ANIb_percentage_identity.txt`: contain the percent identity
+
+- `ANIb_full_percentage_identity.newick` and `ANIb_full_percentage_identity.txt`: contain the percent identity in the context of the length of the entire query and subject sequences (not just the aligned segment)
+
+- `ANIb_alignment_lengths.newick` and `ANIb_alignment_lengths.txt`: contain the total aligned lengths
+
+- `ANIb_similarity_errors.newick` and `ANIb_similarity_errors.txt`: contain similarity errors (total number of mismatches, not including indels)
+
+- `ANIb_hadamard.newick` and `ANIb_hadamard.txt`: contain the Hadamard matrix (element-wise product of the identity and coverage matrices)
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/genome-similarity.md) to update this information.
+
diff --git a/help/8/artifacts/genome-taxonomy-txt/index.md b/help/8/artifacts/genome-taxonomy-txt/index.md
new file mode 100644
index 00000000..58da9d6a
--- /dev/null
+++ b/help/8/artifacts/genome-taxonomy-txt/index.md
@@ -0,0 +1,46 @@
+---
+layout: artifact
+title: genome-taxonomy-txt
+excerpt: A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/genome-taxonomy-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-estimate-scg-taxonomy](../../programs/anvi-estimate-scg-taxonomy) [anvi-estimate-trna-taxonomy](../../programs/anvi-estimate-trna-taxonomy)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+These are the output tables that are displayed when you run [anvi-estimate-scg-taxonomy](/help/8/programs/anvi-estimate-scg-taxonomy) or [anvi-estimate-trna-taxonomy](/help/8/programs/anvi-estimate-trna-taxonomy), formatted as a tab-delimited text file.
+
+To get this output, just provide the `-o` or the `-O` flag when running [anvi-estimate-scg-taxonomy](/help/8/programs/anvi-estimate-scg-taxonomy) or [anvi-estimate-trna-taxonomy](/help/8/programs/anvi-estimate-trna-taxonomy).
+
+The file contains exactly the same information that is normally displayed in the terminal, just in a separate file that is easier to share or include as supplemental data. For an explanation of the data within this file, see the page for [genome-taxonomy](/help/8/artifacts/genome-taxonomy).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/genome-taxonomy-txt.md) to update this information.
+
diff --git a/help/8/artifacts/genome-taxonomy/index.md b/help/8/artifacts/genome-taxonomy/index.md
new file mode 100644
index 00000000..7e15b919
--- /dev/null
+++ b/help/8/artifacts/genome-taxonomy/index.md
@@ -0,0 +1,56 @@
+---
+layout: artifact
+title: genome-taxonomy
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/genome-taxonomy
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-estimate-scg-taxonomy](../../programs/anvi-estimate-scg-taxonomy) [anvi-estimate-trna-taxonomy](../../programs/anvi-estimate-trna-taxonomy)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This artifact represents the output tables that are displayed when you run [anvi-estimate-scg-taxonomy](/help/8/programs/anvi-estimate-scg-taxonomy) or [anvi-estimate-trna-taxonomy](/help/8/programs/anvi-estimate-trna-taxonomy).
+
+By default, they are not written to a file, just displayed in the terminal for your viewing pleasure. If you want them in a tab-delimited file (as a [genome-taxonomy-txt](/help/8/artifacts/genome-taxonomy-txt)), just provide the `-o` or the `-O` flag and anvi'o will do that for you.
+
+The content of these tables will depend on how you ran [anvi-estimate-trna-taxonomy](/help/8/programs/anvi-estimate-trna-taxonomy) or [anvi-estimate-scg-taxonomy](/help/8/programs/anvi-estimate-scg-taxonomy). [This blog post](http://merenlab.org/2019/10/08/anvio-scg-taxonomy/#estimating-taxonomy-in-the-terminal) gives you examples of what this looks like for each of the input scenarios for anvi-estimate-scg-taxonomy. The output of anvi-estimate-trna-taxonomy is very similar, just with the results coming from different gene types. They will also be briefly described below.
+
+When you run [anvi-estimate-scg-taxonomy](/help/8/programs/anvi-estimate-scg-taxonomy) or [anvi-estimate-trna-taxonomy](/help/8/programs/anvi-estimate-trna-taxonomy) on
+
+- a single genome, this table will contain a single line telling you the taxonomy estimate for your genome. It will also show the number of single-copy core genes or tRNA genes that support this estimate. If you run the program with the `--debug` flag, it will also display the hits for all of the single-copy core genes.
+- a single metagenome, this table will list all of the hits for the chosen single-copy core gene or anticodon (by default, the one with the most hits) and their taxonomy information.
+- a [contigs-db](/help/8/artifacts/contigs-db) and [profile-db](/help/8/artifacts/profile-db) with the flag `--compute-scg-coverages`, additional columns will be added that describe the coverage values for your single-copy core gene or tRNA gene hits across your samples.
+- a [collection](/help/8/artifacts/collection), this table will show you each of your bins, and the best taxonomy estimate for each one, similarly to how it's displayed for a run on a single genome.
+- a [metagenomes](/help/8/artifacts/metagenomes) artifact, this table will give a gene entry ID, its taxonomy, and its corresponding coverage in your metagenomes. This format is essentially identical to the output for a single metagenome. If you provide the flag `--matrix-format`, then it will list taxonomy information in each row, and tell you the coverage of each in each of your metagenomes.
+
+This may sound confusing, but it is easier to understand when looking at the functionality of [anvi-estimate-scg-taxonomy](/help/8/programs/anvi-estimate-scg-taxonomy) and the comprehensive examples given on [this page](http://merenlab.org/2019/10/08/anvio-scg-taxonomy/#estimating-taxonomy-in-the-terminal).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/genome-taxonomy.md) to update this information.
+
diff --git a/help/8/artifacts/genomes-storage-db/index.md b/help/8/artifacts/genomes-storage-db/index.md
new file mode 100644
index 00000000..b70c43a5
--- /dev/null
+++ b/help/8/artifacts/genomes-storage-db/index.md
@@ -0,0 +1,62 @@
+---
+layout: artifact
+title: genomes-storage-db
+excerpt: A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/genomes-storage-db
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-gen-genomes-storage](../../programs/anvi-gen-genomes-storage)
+
+
+## Required or used by
+
+
+[anvi-analyze-synteny](../../programs/anvi-analyze-synteny) [anvi-compute-functional-enrichment-across-genomes](../../programs/anvi-compute-functional-enrichment-across-genomes) [anvi-compute-functional-enrichment-in-pan](../../programs/anvi-compute-functional-enrichment-in-pan) [anvi-compute-gene-cluster-homogeneity](../../programs/anvi-compute-gene-cluster-homogeneity) [anvi-db-info](../../programs/anvi-db-info) [anvi-display-functions](../../programs/anvi-display-functions) [anvi-display-pan](../../programs/anvi-display-pan) [anvi-get-sequences-for-gene-calls](../../programs/anvi-get-sequences-for-gene-calls) [anvi-get-sequences-for-gene-clusters](../../programs/anvi-get-sequences-for-gene-clusters) [anvi-meta-pan-genome](../../programs/anvi-meta-pan-genome) [anvi-migrate](../../programs/anvi-migrate) [anvi-pan-genome](../../programs/anvi-pan-genome) [anvi-search-functions](../../programs/anvi-search-functions) [anvi-split](../../programs/anvi-split) [anvi-summarize](../../programs/anvi-summarize) [anvi-update-db-description](../../programs/anvi-update-db-description) [anvi-script-compute-bayesian-pan-core](../../programs/anvi-script-compute-bayesian-pan-core) [anvi-script-gen-function-matrix-across-genomes](../../programs/anvi-script-gen-function-matrix-across-genomes) [anvi-script-gen-functions-per-group-stats-output](../../programs/anvi-script-gen-functions-per-group-stats-output)
+
+
+## Description
+
+This is an Anvi'o database that **stores information about your genomes, primarily for use in pangenomic analyses.**
+
+You can think of it like this: in a way, a genomes-storage-db is to [the pangenomic workflow](http://merenlab.org/2016/11/08/pangenomics-v2/#generating-an-anvio-genomes-storage) what a [contigs-db](/help/8/artifacts/contigs-db) is to [the metagenomic workflow](http://merenlab.org/2016/06/22/anvio-tutorial-v2/). They both describe key information unique to your particular dataset and are required to run the vast majority of programs.
+
+### What kind of information?
+
+A genomes storage database contains information about the genomes that you inputted to create it, as well as the genes within them.
+
+Specifically, there are three tables stored in a genomes storage database:
+
+* A table describing the information about each of your genomes, such as their name, type (internal or external), GC content, number of contigs, completion, redundancy, number of genes, etc.
+* A table describing the genes within your genomes. For each gene, this includes its gene caller id, associated genome and position, sequence, length, and whether or not it is partial.
+* A table describing the functions of your genes, including their sources and e-values.
+
+### Cool. How do I make one?
+
+You can generate one of these from an [internal-genomes](/help/8/artifacts/internal-genomes) (genomes described in [bin](/help/8/artifacts/bin)s), [external-genomes](/help/8/artifacts/external-genomes) (genomes described in [contigs-db](/help/8/artifacts/contigs-db)s), or both using the program [anvi-gen-genomes-storage](/help/8/programs/anvi-gen-genomes-storage).
+
+### Cool cool. What can I do with one?
+
+With one of these, you can run [anvi-pan-genome](/help/8/programs/anvi-pan-genome) to get a [pan-db](/help/8/artifacts/pan-db). If a genomes storage database is the [contigs-db](/help/8/artifacts/contigs-db) of pangenomics, then a [pan-db](/help/8/artifacts/pan-db) is the [profile-db](/help/8/artifacts/profile-db). It contains lots of information that is vital for analysis, and most programs will require both the [pan-db](/help/8/artifacts/pan-db) and its genomes storage database as an input.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/genomes-storage-db.md) to update this information.
+
diff --git a/help/8/artifacts/groups-txt/index.md b/help/8/artifacts/groups-txt/index.md
new file mode 100644
index 00000000..a09d658b
--- /dev/null
+++ b/help/8/artifacts/groups-txt/index.md
@@ -0,0 +1,56 @@
+---
+layout: artifact
+title: groups-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/groups-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-compute-functional-enrichment-across-genomes](../../programs/anvi-compute-functional-enrichment-across-genomes) [anvi-compute-metabolic-enrichment](../../programs/anvi-compute-metabolic-enrichment) [anvi-display-functions](../../programs/anvi-display-functions) [anvi-script-gen-function-matrix-across-genomes](../../programs/anvi-script-gen-function-matrix-across-genomes)
+
+
+## Description
+
+This is a 2-column TAB-delimited text file to associate a given set of items with a set of groups. Depending on the context, items here may be individual samples or genomes. The first column can have a header name `item`, `sample`, `genome` or anything else that is appropriate, and list the items that are relevant to your input data. The second column should have the header `group`, and associate each item in your data with a group.
+
+Each item should be associated with a single group, and it is always a good idea to define groups using single words without any fancy characters. For instance, `HIGH_TEMPERATURE` or `LOW_FITNESS` are good group names. In contrast, `my group #1` or `IS-THIS-OK?` are not good group names and may cause issues downstream depending on who uses this file.
+
+Here is an example [groups-txt](/help/8/artifacts/groups-txt) file:
+
+|item|group|
+|:--|:--|
+|item_01|GROUP_A|
+|item_02|GROUP_B|
+|item_03|GROUP_A|
+|(...)|(...)|
+
+{:.warning}
+If you are passing this file to the program [anvi-compute-metabolic-enrichment](/help/8/programs/anvi-compute-metabolic-enrichment), the names in the `sample` column must match those in the "modules" mode output file that you provide to the program via the `--modules-txt` parameter. If you know that the sample names match but you are still getting errors, you might need to specify which column in the "modules" mode output contains those sample names using the `--sample-header` parameter.
+
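The constraints described above can be checked before the file is handed to anvi'o. The following is a minimal sketch and not part of anvi'o: the function name `read_groups_txt` is hypothetical, and the rule that a group name may contain only letters, digits, and underscores is an assumption that approximates the advice about avoiding fancy characters.

```python
import csv
import io
import re

# Assumption: "single words without fancy characters" is approximated here
# as letters, digits, and underscores only.
SAFE_GROUP_NAME = re.compile(r"[A-Za-z0-9_]+")


def read_groups_txt(handle):
    """Parse a 2-column TAB-delimited groups file into {item: group}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header is None or len(header) != 2 or header[1] != "group":
        raise ValueError("expected two columns, the second one named 'group'")

    groups = {}
    for line_number, row in enumerate(reader, start=2):
        if not row:
            continue  # tolerate blank lines
        if len(row) != 2:
            raise ValueError(f"line {line_number}: expected 2 columns, got {len(row)}")
        item, group = row
        if item in groups:
            raise ValueError(f"line {line_number}: item '{item}' is listed more than once")
        if not SAFE_GROUP_NAME.fullmatch(group):
            raise ValueError(f"line {line_number}: '{group}' is a risky group name")
        groups[item] = group
    return groups


example = "item\tgroup\nitem_01\tGROUP_A\nitem_02\tGROUP_B\nitem_03\tGROUP_A\n"
print(read_groups_txt(io.StringIO(example)))
```

The same function accepts an open file handle, so a real file can be checked with `read_groups_txt(open("groups.txt", newline=""))`, where `groups.txt` is a placeholder file name.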
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/groups-txt.md) to update this information.
+
diff --git a/help/8/artifacts/hmm-hits-across-genomes-txt/index.md b/help/8/artifacts/hmm-hits-across-genomes-txt/index.md
new file mode 100644
index 00000000..9b9c996e
--- /dev/null
+++ b/help/8/artifacts/hmm-hits-across-genomes-txt/index.md
@@ -0,0 +1,50 @@
+---
+layout: artifact
+title: hmm-hits-across-genomes-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/hmm-hits-across-genomes-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-script-gen-hmm-hits-matrix-across-genomes](../../programs/anvi-script-gen-hmm-hits-matrix-across-genomes)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This file is the output of [anvi-script-gen-hmm-hits-matrix-across-genomes](/help/8/programs/anvi-script-gen-hmm-hits-matrix-across-genomes) and describes the [hmm-hits](/help/8/artifacts/hmm-hits) across multiple genomes or bins for a single [hmm-source](/help/8/artifacts/hmm-source).
+
+The first column describes each of the genomes (if the input was an [external-genomes](/help/8/artifacts/external-genomes)) or bins (if the input was an [internal-genomes](/help/8/artifacts/internal-genomes)) that the matrix describes. The following columns describe each of the genes in your [hmm-source](/help/8/artifacts/hmm-source). The data within the table describes the number of hits that gene had in that genome or bin.
+
+For example, if you were to run [anvi-script-gen-hmm-hits-matrix-across-genomes](/help/8/programs/anvi-script-gen-hmm-hits-matrix-across-genomes) with the `Bacteria_71` [hmm-source](/help/8/artifacts/hmm-source) on two hypothetical genomes, you would get a file like this:
+
+|genome_or_bin|ADK|AICARFT_IMPCHas|ATP-synt|ATP-synt_A|Adenylsucc_synt|Chorismate_synt|EF_TS|...|
+|:--|:--|:--|:--|:--|:--|:--|:--|:--|
+|Genome_1|11|10|9|9|11|8|9|...|
+|Genome_2|2|1|1|2|3|2|2|...|
+
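Because the matrix is plain text, it is easy to post-process outside anvi'o. The sketch below assumes the file is TAB-delimited with the genome or bin name in the first column; the helper names `read_hits_matrix` and `genes_with_unexpected_counts` are hypothetical and not part of anvi'o. It flags genes whose hit count differs from one, which is the count one would expect for a single-copy core gene in a single complete genome.

```python
import csv
import io


def read_hits_matrix(handle):
    """Return {genome_or_bin: {gene_name: hit_count}} from a TAB-delimited matrix."""
    reader = csv.reader(handle, delimiter="\t")
    gene_names = next(reader)[1:]  # the first header cell labels the genome/bin column
    matrix = {}
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        matrix[row[0]] = {gene: int(count) for gene, count in zip(gene_names, row[1:])}
    return matrix


def genes_with_unexpected_counts(matrix, expected=1):
    """List, per genome or bin, the genes whose hit count is not `expected`."""
    return {
        name: [gene for gene, count in counts.items() if count != expected]
        for name, counts in matrix.items()
    }


# First three gene columns of the example above.
example = (
    "genome_or_bin\tADK\tAICARFT_IMPCHas\tATP-synt\n"
    "Genome_1\t11\t10\t9\n"
    "Genome_2\t2\t1\t1\n"
)
matrix = read_hits_matrix(io.StringIO(example))
print(genes_with_unexpected_counts(matrix))
```

A genome with many genes reported more than once, like `Genome_1` here, may deserve a closer look for redundancy or contamination.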
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/hmm-hits-across-genomes-txt.md) to update this information.
+
diff --git a/help/8/artifacts/hmm-hits/index.md b/help/8/artifacts/hmm-hits/index.md
new file mode 100644
index 00000000..782ea184
--- /dev/null
+++ b/help/8/artifacts/hmm-hits/index.md
@@ -0,0 +1,44 @@
+---
+layout: artifact
+title: hmm-hits
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/hmm-hits
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-run-hmms](../../programs/anvi-run-hmms) [anvi-scan-trnas](../../programs/anvi-scan-trnas) [anvi-script-filter-hmm-hits-table](../../programs/anvi-script-filter-hmm-hits-table)
+
+
+## Required or used by
+
+
+[anvi-delete-hmms](../../programs/anvi-delete-hmms) [anvi-get-sequences-for-hmm-hits](../../programs/anvi-get-sequences-for-hmm-hits) [anvi-run-scg-taxonomy](../../programs/anvi-run-scg-taxonomy) [anvi-script-filter-hmm-hits-table](../../programs/anvi-script-filter-hmm-hits-table) [anvi-script-gen-hmm-hits-matrix-across-genomes](../../programs/anvi-script-gen-hmm-hits-matrix-across-genomes) [anvi-script-get-hmm-hits-per-gene-call](../../programs/anvi-script-get-hmm-hits-per-gene-call)
+
+
+## Description
+
+The search results for an [hmm-source](/help/8/artifacts/hmm-source) in a [contigs-db](/help/8/artifacts/contigs-db). Essentially, this is the part of a [contigs-db](/help/8/artifacts/contigs-db) that handles the HMM data. In anvi'o, these are usually functional annotations, such as identifying specific ribosomal RNAs, various single-copy core genes, and transfer RNAs, though the user can also define their own HMM sources.
+
+Upon creation, a [contigs-db](/help/8/artifacts/contigs-db) will not contain any HMM results. In order to populate it, users can run [anvi-run-hmms](/help/8/programs/anvi-run-hmms) using any [hmm-source](/help/8/artifacts/hmm-source). The program [anvi-scan-trnas](/help/8/programs/anvi-scan-trnas) also populates a [contigs-db](/help/8/artifacts/contigs-db)'s hmm-hits with potential transfer RNA hits.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/hmm-hits.md) to update this information.
+
diff --git a/help/8/artifacts/hmm-list/index.md b/help/8/artifacts/hmm-list/index.md
new file mode 100644
index 00000000..11dca9f4
--- /dev/null
+++ b/help/8/artifacts/hmm-list/index.md
@@ -0,0 +1,66 @@
+---
+layout: artifact
+title: hmm-list
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/hmm-list
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+The [hmm-list](/help/8/artifacts/hmm-list) file is a TAB-delimited file with at least three columns:
+
+* `name`: The name of the HMM. If you are using an external HMM, it MUST match the name found in `genes.txt`.
+* `source`: The name of the collection of HMMs in which the HMM is found, e.g. `Bacteria_71`. If you are using an external HMM, simply put the name of the directory.
+* `path`: If you are using one of the [default HMM sources](http://127.0.0.1:4000/help/main/artifacts/hmm-source/#default-hmm-sources), simply put "INTERNAL". On the other hand, if you are using a [user-defined HMM source](http://127.0.0.1:4000/help/main/artifacts/hmm-source/#user-defined-hmm-sources), please put the full path to the anvi'o-formatted HMM directory.
+
+Here is an example of an [hmm-list](/help/8/artifacts/hmm-list) using the HMMs `Ribosomal_L16` and `Ribosomal_S2` from the internal anvi'o collection `Bacteria_71`:
+
+| name | source | path |
+|---------------|-------------|----------|
+| Ribosomal_L16 | Bacteria_71 | INTERNAL |
+| Ribosomal_S2 | Bacteria_71 | INTERNAL |
+
+You can also use external [hmm-source](/help/8/artifacts/hmm-source)s! An easy way to get an anvi'o-ready HMM directory is to use the script [anvi-script-pfam-accessions-to-hmms-directory](/help/8/programs/anvi-script-pfam-accessions-to-hmms-directory) to download a [Pfam HMM](https://pfam.xfam.org/):
+
+
+anvi-script-pfam-accessions-to-hmms-directory --pfam-accessions-list PF00016 -O RuBisCO_large_HMM
+
+
+Here is what the `hmm_list.txt` should look like with a combination of internal and external [hmm-source](/help/8/artifacts/hmm-source)s:
+
+| name | source | path |
+|---------------|-------------------|----------------------------|
+| Ribosomal_L16 | Bacteria_71 | INTERNAL |
+| Ribosomal_S2 | Bacteria_71 | INTERNAL |
+| RuBisCO_large | RuBisCO_large_HMM | PATH/TO/RuBisCO_large_HMM/ |
+
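A file like the one above can be written by hand, but generating it from a script avoids stray spaces where TABs are required. The following is a minimal sketch using only the Python standard library; `write_hmm_list` is a hypothetical helper and not an anvi'o function, and the third path is the same placeholder used in the table.

```python
import csv
import io

COLUMNS = ("name", "source", "path")


def write_hmm_list(rows, handle):
    """Write (name, source, path) tuples as a TAB-delimited hmm-list file."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for row in rows:
        if len(row) != len(COLUMNS) or not all(row):
            raise ValueError(f"every row needs a name, a source, and a path: {row!r}")
        writer.writerow(row)


rows = [
    ("Ribosomal_L16", "Bacteria_71", "INTERNAL"),
    ("Ribosomal_S2", "Bacteria_71", "INTERNAL"),
    # Placeholder path, as in the table above; replace it with a real directory.
    ("RuBisCO_large", "RuBisCO_large_HMM", "PATH/TO/RuBisCO_large_HMM/"),
]
buffer = io.StringIO()
write_hmm_list(rows, buffer)
print(buffer.getvalue(), end="")
```

To write to disk instead of an in-memory buffer, pass a handle opened with `open("hmm_list.txt", "w", newline="")`.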
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/hmm-list.md) to update this information.
+
diff --git a/help/8/artifacts/hmm-source/index.md b/help/8/artifacts/hmm-source/index.md
new file mode 100644
index 00000000..ceef2db8
--- /dev/null
+++ b/help/8/artifacts/hmm-source/index.md
@@ -0,0 +1,154 @@
+---
+layout: artifact
+title: hmm-source
+excerpt: An HMM-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/hmm-source
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+An HMM-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-script-pfam-accessions-to-hmms-directory](../../programs/anvi-script-pfam-accessions-to-hmms-directory)
+
+
+## Required or used by
+
+
+[anvi-compute-completeness](../../programs/anvi-compute-completeness) [anvi-delete-hmms](../../programs/anvi-delete-hmms) [anvi-get-sequences-for-hmm-hits](../../programs/anvi-get-sequences-for-hmm-hits) [anvi-run-hmms](../../programs/anvi-run-hmms) [anvi-script-filter-hmm-hits-table](../../programs/anvi-script-filter-hmm-hits-table) [anvi-script-gen-hmm-hits-matrix-across-genomes](../../programs/anvi-script-gen-hmm-hits-matrix-across-genomes) [anvi-script-get-hmm-hits-per-gene-call](../../programs/anvi-script-get-hmm-hits-per-gene-call)
+
+
+## Description
+
+An HMM source is a collection of one or more hidden Markov models. HMM sources can be used to identify and recover genes in a [contigs-db](/help/8/artifacts/contigs-db) that match those described by the models.
+
+Models in a given HMM source can be searched in a given [contigs-db](/help/8/artifacts/contigs-db) via the program [anvi-run-hmms](/help/8/programs/anvi-run-hmms), which yields an [hmm-hits](/help/8/artifacts/hmm-hits) artifact. An anvi'o installation will include multiple HMM sources by default, but HMMs for any set of genes can also be put together by the end user and run on any anvi'o [contigs-db](/help/8/artifacts/contigs-db).
+
+HMM hits in a [contigs-db](/help/8/artifacts/contigs-db) for a given [hmm-source](/help/8/artifacts/hmm-source) will be accessible to anvi'o programs globally. Sequences that match HMM hits can be recovered in an aligned or non-aligned fashion as [fasta](/help/8/artifacts/fasta) files for downstream analyses including phylogenomics. They can also be displayed in anvi'o interfaces, reported in summary outputs, and so on.
+
+Running [anvi-db-info](/help/8/programs/anvi-db-info) on a [contigs-db](/help/8/artifacts/contigs-db) will list HMM sources available in it.
+
+### Default HMM sources
+
+An anvi'o installation will include [multiple HMM sources](https://github.com/meren/anvio/tree/master/anvio/data/hmm) by default. These HMM sources can be run on any [contigs-db](/help/8/artifacts/contigs-db) with [anvi-run-hmms](/help/8/programs/anvi-run-hmms) to identify and store [hmm-hits](/help/8/artifacts/hmm-hits):
+
+
+[anvi-run-hmms](/help/8/programs/anvi-run-hmms) -c [contigs-db](/help/8/artifacts/contigs-db)
+
+
+The default HMM sources in anvi'o include:
+
+* **Bacteria_71**: 71 single-copy core genes for domain bacteria that represent a modified version of the HMM profiles published by [Mike Lee](https://doi.org/10.1093/bioinformatics/btz188). The anvi'o collection excludes `Ribosomal_S20p`, `PseudoU_synth_1`, `Exonuc_VII_S`, `5-FTHF_cyc-lig`, `YidD` and `Peptidase_A8`, which occur in Lee's collection (as they were exceptionally redundant or rare among MAGs from various habitats), and includes `Ribosomal_S3_C`, `Ribosomal_L5`, and `Ribosomal_L2` to make it more compatible with [Hug et al](https://www.nature.com/articles/nmicrobiol201648)'s set of ribosomal proteins.
+* **Archaea_76**: 76 single-copy core genes for domain archaea by [Mike Lee](https://doi.org/10.1093/bioinformatics/btz188).
+* **Protista_83**: 83 single-copy core genes for protists (domain eukarya) by [Tom O. Delmont](http://merenlab.org/delmont-euk-scgs).
+
+Apart from these, anvi'o also includes a number of HMM profiles for individual ribosomal RNA classes derived from [Torsten Seemann's tool](https://github.com/tseemann/barrnap) (we split them into individual classes after [this](https://github.com/merenlab/anvio/issues/1411)):
+
+* **Ribosomal\_RNA\_5S** (eukarya + archaea + bacteria; also includes 5.8S).
+* **Ribosomal\_RNA\_12S** (mitochondria)
+* **Ribosomal\_RNA\_16S** (bacteria + archaea + mitochondria)
+* **Ribosomal\_RNA\_18S** (eukarya)
+* **Ribosomal\_RNA\_23S** (bacteria + archaea)
+* **Ribosomal\_RNA\_28S** (eukarya)
+
+When [anvi-run-hmms](/help/8/programs/anvi-run-hmms) is run on an anvi'o [contigs-db](/help/8/artifacts/contigs-db) without providing any further arguments, it automatically utilizes all the default HMM sources.
+
+{:.notice}
+Similar to ribosomal RNAs, anvi'o can also identify transfer RNAs. Even though transfer RNAs will also appear as an HMM source for all downstream analyses, their initial identification requires running the program [anvi-scan-trnas](/help/8/programs/anvi-scan-trnas) on a [contigs-db](/help/8/artifacts/contigs-db).
+
+### User-defined HMM sources
+
+The user can employ additional HMM sources to identify matching genes in a given [contigs-db](/help/8/artifacts/contigs-db).
+
+Any directory with expected files in it will serve as an HMM source:
+
+
+anvi-run-hmms -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --hmm-source /PATH/TO/USER-HMM-DIRECTORY/
+
+
+Anvi'o will expect the HMM source directory to contain six files (see this for [an example directory](https://github.com/merenlab/anvio/tree/master/anvio/data/hmm/Protista_83)). These files are explicitly defined as follows:
+
+* **genes.hmm.gz**: A gzip of concatenated HMM profiles. One can (1) obtain one or more HMMs by computing them from sequence alignments or by downloading previously computed ones from online resources such as [Pfams](https://pfam.xfam.org/family/browse?browse=new), (2) concatenate all profiles into a single file called `genes.hmm`, and finally (3) compress this file using `gzip`.
+* **genes.txt**: A TAB-delimited file that must contain three columns: `gene` (gene name), `accession` (gene accession number, which can be anything unique), and `hmmsource` (source of HMM profiles listed in genes.hmm.gz). The list of gene names in this file must perfectly match the list of gene names in genes.hmm.gz.
+* **kind.txt**: A flat text file which contains a single word identifying what type of profile the directory contains. This information will appear in interfaces. Use a single, descriptive word for your collection.
+* **reference.txt**: A file containing source information for this profile to cite it properly.
+* **target.txt**: A file that specifies the target *alphabet* and *context* that define how HMMs should be searched (this is a function of the HMM source that is used). The proper notation is 'alphabet:context'. Alphabet can be `AA`, `DNA`, or `RNA`. Context can be `GENE` or `CONTIG`. The content of this file should be any combination of one alphabet and one context term. For instance, if the content of this file is `AA:GENE`, anvi'o will search the amino acid sequences of genes, and so on. An exception is `AA:CONTIG`, which is an improper target since anvi'o can't translate contigs to amino acid sequences. See [this](https://github.com/meren/anvio/pull/402) for more details. Please note that HMMs that target `DNA:CONTIG` will result in new gene calls in the contigs database to describe their hits.
+* **noise_cutoff_terms.txt**: A file to specify how to deal with noise. [See this comment](https://github.com/merenlab/anvio/issues/498#issuecomment-362115921) for more information on the contents of this file.
+
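The file list and the `alphabet:context` rule above lend themselves to a quick sanity check before running [anvi-run-hmms](/help/8/programs/anvi-run-hmms). The sketch below is an illustration only: `parse_target` and `missing_files` are hypothetical helpers, not anvi'o functions, and they encode nothing beyond the rules stated in the list above.

```python
import tempfile
from pathlib import Path

REQUIRED_FILES = (
    "genes.hmm.gz",
    "genes.txt",
    "kind.txt",
    "reference.txt",
    "target.txt",
    "noise_cutoff_terms.txt",
)
ALPHABETS = {"AA", "DNA", "RNA"}
CONTEXTS = {"GENE", "CONTIG"}


def parse_target(text):
    """Split the content of target.txt into (alphabet, context) or raise ValueError."""
    alphabet, separator, context = text.strip().partition(":")
    if not separator or alphabet not in ALPHABETS or context not in CONTEXTS:
        raise ValueError(f"'{text.strip()}' is not in 'alphabet:context' notation")
    if (alphabet, context) == ("AA", "CONTIG"):
        raise ValueError("AA:CONTIG is an improper target")
    return alphabet, context


def missing_files(directory):
    """Return the names of required files that are absent from an HMM source directory."""
    directory = Path(directory)
    return [name for name in REQUIRED_FILES if not (directory / name).is_file()]


print(parse_target("AA:GENE"))

# A throwaway directory that contains only target.txt, to show what gets reported.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "target.txt").write_text("AA:GENE\n")
    print(missing_files(tmp))
```

An empty list from `missing_files` only means the six files exist; it does not verify that the gene names in `genes.txt` match those in `genes.hmm.gz`.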
+
+### Creating anvi'o HMM sources from ad hoc PFAM accessions
+
+It is also possible to generate an anvi'o compatible HMMs directory for a given set of PFAM accession ids. For instance, the following command will result in a new directory that can be used immediately with the program [anvi-run-hmms](/help/8/programs/anvi-run-hmms):
+
+
+[anvi-script-pfam-accessions-to-hmms-directory](/help/8/programs/anvi-script-pfam-accessions-to-hmms-directory) --pfam-accessions-list PF00705 PF00706 \
+ -O AD_HOC_HMMs
+
+
+These IDs can be given through the command line as a list, or through an input file where every line is a unique accession id.
+
+An example. Let's assume we have a genome or a metagenome that looks like this:
+
+![PFAM example](../../images/p214-wo-upxz.png)
+
+And we wish to identify locations of genes that match to this model: [http://pfam.xfam.org/family/PF06603](http://pfam.xfam.org/family/PF06603)
+
+One can run this command:
+
+
+[anvi-script-pfam-accessions-to-hmms-directory](/help/8/programs/anvi-script-pfam-accessions-to-hmms-directory) --pfam-accessions-list PF06603 \
+ -O UpxZ
+
+
+which would create a directory called `UpxZ`. Then, one would run this command to find matches to this model in a given contigs database:
+
+
+[anvi-run-hmms](/help/8/programs/anvi-run-hmms) -c CONTIGS.db \
+ -H UpxZ/ \
+ --num-threads 4
+
+
+Now it is possible to get the sequences that match this model:
+
+
+[anvi-get-sequences-for-hmm-hits](/help/8/programs/anvi-get-sequences-for-hmm-hits) -c CONTIGS.db \
+ --hmm-source UpxZ \
+ -o UpxZ.fa
+
+
+```
+Contigs DB ...................................: Initialized: CONTIGS.db (v. 19)
+Hits .........................................: 8 hits for 1 source(s)
+Mode .........................................: DNA sequences
+Genes are concatenated .......................: False
+Output .......................................: UpxZ.fa
+```
+
+Or see where they are by visualizing the project again:
+
+
+[anvi-interactive](/help/8/programs/anvi-interactive) -p PROFILE.db \
+ -c CONTIGS.db
+
+
+![PFAM example](../../images/p214-w-upxz.png)
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/hmm-source.md) to update this information.
+
diff --git a/help/8/artifacts/interacdome-data/index.md b/help/8/artifacts/interacdome-data/index.md
new file mode 100644
index 00000000..da628056
--- /dev/null
+++ b/help/8/artifacts/interacdome-data/index.md
@@ -0,0 +1,47 @@
+---
+layout: artifact
+title: interacdome-data
+excerpt: A DATA-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/interacdome-data
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DATA-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-setup-interacdome](../../programs/anvi-setup-interacdome)
+
+
+## Required or used by
+
+
+[anvi-run-interacdome](../../programs/anvi-run-interacdome)
+
+
+## Description
+
+This artifact stores the data downloaded by [anvi-setup-interacdome](/help/8/programs/anvi-setup-interacdome) and is required to run [anvi-run-interacdome](/help/8/programs/anvi-run-interacdome).
+
+As described in the [InteracDome blogpost](https://merenlab.org/2020/07/22/interacdome/#anvi-setup-interacdome), this data includes [the tab-separated files](https://interacdome.princeton.edu/#tab-6136-4) from [InteracDome](https://interacdome.princeton.edu/) and the Pfam 31.0 HMMs for the subset of Pfams found in the InteracDome dataset.
+
+By default, this data is stored in `anvio/data/misc/InteracDome`, but a custom path can be set when the user runs [anvi-setup-interacdome](/help/8/programs/anvi-setup-interacdome) if desired.
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/interacdome-data.md) to update this information.
+
diff --git a/help/8/artifacts/interactive/index.md b/help/8/artifacts/interactive/index.md
new file mode 100644
index 00000000..6eeaa1bc
--- /dev/null
+++ b/help/8/artifacts/interactive/index.md
@@ -0,0 +1,241 @@
+---
+layout: artifact
+title: interactive
+excerpt: A DISPLAY-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/interactive
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DISPLAY-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-display-contigs-stats](../../programs/anvi-display-contigs-stats) [anvi-display-functions](../../programs/anvi-display-functions) [anvi-display-metabolism](../../programs/anvi-display-metabolism) [anvi-display-pan](../../programs/anvi-display-pan) [anvi-display-structure](../../programs/anvi-display-structure) [anvi-inspect](../../programs/anvi-inspect) [anvi-interactive](../../programs/anvi-interactive) [anvi-script-checkm-tree-to-interactive](../../programs/anvi-script-checkm-tree-to-interactive) [anvi-script-gen-functions-per-group-stats-output](../../programs/anvi-script-gen-functions-per-group-stats-output) [anvi-script-snvs-to-interactive](../../programs/anvi-script-snvs-to-interactive)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This page describes general properties of anvi'o interactive displays and programs that offer anvi'o interactive artifacts.
+
+## Terminology
+
+Anvi'o uses a simple terminology to address various aspects of the interactive displays it produces, such as items, layers, views, orders, and so on. The purpose of this section is to provide some insights into this terminology using the figure below:
+
+![an anvi'o display](../../images/interactive_interface/anvio_display_template.png){:.center-img}
+
+Even though the figure is a product of [anvi-display-pan](/help/8/programs/anvi-display-pan), the general terminology does not change across different interfaces, including the default visualizations of [anvi-interactive](/help/8/programs/anvi-interactive). Here are the descriptions of numbered areas in the figure:
+
+* The tree denoted by **(1)** shows the organization of each `item`. Items could be contigs, gene clusters, bins, genes, or anything else depending on the mode in which the anvi'o interactive interface was initiated. The structure that orders items, denoted by **(1)** in the figure, can be a phylogenetic or phylogenomic tree, or a dendrogram produced by a hierarchical clustering algorithm. Alternatively, there may be nothing there if the user has requested or set a linear items order through [misc-data-items-order](/help/8/artifacts/misc-data-items-order).
+* Each concentric circle underneath the number **(2)** is called a `layer`, and the data shown for items and layers as a whole is called a `view`. A **layer** can be a genome, a metagenome, or anything else depending on the mode in which the anvi'o interactive interface was initiated. The **view** is like a data table where a datum is set for each **item** in each **layer**. The view data is typically computed by anvi'o and stored in pan databases by [anvi-pan-genome](/help/8/programs/anvi-pan-genome) or in profile databases by [anvi-profile](/help/8/programs/anvi-profile). The user can add another view to the relevant combo box in the interface by providing a TAB-delimited file to [anvi-interactive](/help/8/programs/anvi-interactive) through the command line argument `--additional-view`, or add new layers to extend these views with additional data through [misc-data-items](/help/8/artifacts/misc-data-items).
+* The tree denoted by **(3)** shows a specific ordering of layers. Anvi'o will compute various layer orders automatically based on the available **views**, depending on the analysis or visualization mode, and users can extend the available **layer orders** through [misc-data-layer-orders](/help/8/artifacts/misc-data-layer-orders).
+* What is shown by **(4)** is the additional data for layers. The user can extend this section with additional information on layers using [misc-data-layers](/help/8/artifacts/misc-data-layers).
+
+The orchestrated use of [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data), [anvi-export-misc-data](/help/8/programs/anvi-export-misc-data), and [anvi-delete-misc-data](/help/8/programs/anvi-delete-misc-data) provides a powerful framework to decorate items or layers in a display and enhance visualization of complex data. Please take a look at the following article on how to extend anvi'o displays:
+
+* [https://merenlab.org/2017/12/11/additional-data-tables/](https://merenlab.org/2017/12/11/additional-data-tables/)
+
+## Programs that give interactive access
+
+If you're new to the anvi'o interactive interface, you'll probably want to check out [this tutorial for beginners](http://merenlab.org/tutorials/interactive-interface/) or the other resources on the [anvi-interactive](/help/8/programs/anvi-interactive) page.
+
+However, there are more interfaces available in anvi'o than just that one, so let's list them out:
+
+- [anvi-display-structure](/help/8/programs/anvi-display-structure) lets you examine specific protein structures, along with the SCVs and SAAVs within them. (It even has [its own software page](http://merenlab.org/software/anvio-structure/). It's kind of a big deal.)
+
+- [anvi-display-contigs-stats](/help/8/programs/anvi-display-contigs-stats) shows you various stats about the contigs within a [contigs-db](/help/8/artifacts/contigs-db), such as their hmm-hits, lengths, N and L statistics, and so on.
+
+- [anvi-display-functions](/help/8/programs/anvi-display-functions) lets you quickly browse the functional pool for a given set of genomes or metagenomes.
+
+- [anvi-display-metabolism](/help/8/programs/anvi-display-metabolism) is still under development but will allow you to interactively view metabolism estimation data using [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) under the hood.
+
+- [anvi-display-pan](/help/8/programs/anvi-display-pan) displays information about the gene clusters that are stored in a [pan-db](/help/8/artifacts/pan-db). It lets you easily view your core and accessory genes, and can even be turned into a metapangenome through importing additional data tables.
+
+- [anvi-inspect](/help/8/programs/anvi-inspect) lets you look at a single split across your samples, as well as the genes identified within it. This interface can also be opened from the [anvi-interactive](/help/8/programs/anvi-interactive) interface by asking for details about a specific split.
+
+- [anvi-interactive](/help/8/programs/anvi-interactive) displays the information in a [profile-db](/help/8/artifacts/profile-db). It lets you view the distribution of your contigs across your samples, manually bin metagenomic data into MAGs (and refine those bins with [anvi-refine](/help/8/programs/anvi-refine)), and much more. You can also use this to look at your genes instead of your contigs or [examine the genomes after a phylogenomic analysis](http://merenlab.org/2017/06/07/phylogenomics/). Just look at that program page for a glimpse of this program's amazingness.
+
+- [anvi-script-snvs-to-interactive](/help/8/programs/anvi-script-snvs-to-interactive) lets you view a comprehensive summary of the SNVs, SCVs, and SAAVs within your contigs.
+
+## Artifacts that give interactive access
+
+- [gene-cluster-inspection](/help/8/artifacts/gene-cluster-inspection) lets you examine specific gene clusters.
+
+- [contig-inspection](/help/8/artifacts/contig-inspection) shows you detailed contig information.
+
+## An overview of the display
+
+The interactive interface has three major areas of interaction:
+
+* The space for visualization in the middle area,
+* The Settings panel on the left of the screen,
+* And three additional panels for 'News', 'Description', and 'Mouse' on the right.
+
+Each panel is important, but the most important and functionally rich one is the 'Settings' panel.
+
+### Settings panel
+
+If closed, the settings panel can be opened by clicking on the little button on the left-middle part of your browser. When opened, you will see multiple tabs:
+
+![an anvi'o settings panel](../../images/interactive_interface/interactive-settings-panel-tabs.png){:.center-img}
+
+But before we start talking about these tabs, it is worthwhile to mention that at the bottom of the settings panel you will find a section with tiny controls that are available in all tabs:
+
+![settings panel bottom controls](../../images/interactive_interface/interactive-settings-bottom.png){:.center-img}
+
+Through these controls you can,
+
+* __Create or refresh__ the display when necessary using the draw button (some changes require you to do that),
+
+* __Zoom in, zoom out, and center__ the display.
+
+* __Download your display as an SVG file.__
+
+Finally, at the top-right of the Settings panel header you will find a dropdown menu (hamburger menu) which provides links to external information, resources, and issue-reporting related to anvi'o.
+
+OK. Let's talk about each tab you will find in the settings panel.
+
+
+### Main Tab
+
+This is one of the most frequently used tabs in the interface, and there are multiple sections in it (it keeps growing over time, so some things may be missing here).
+
+![an anvi'o main tab](../../images/interactive_interface/interactive-settings-display-additional-settings.png){:.center-img}
+
+* **Display subsection**. Provides high level options for adjusting _items order_, _view_, and _drawing type_.
+
+Clicking the _Show Additional Settings_ button provides access to myriad additional, more-granular adjustments, including,
+
+* **Dendrogram subsection**. _Radius_, _Angle_, and _Edge length normalization_ adjustments for the dendrogram.
+* **Branch support subsection**. Settings for displaying _bootstrap values_ on the dendrogram.
+* **Selections subsection**. To adjust _height_, _grid_ and/or _shade_ display, as well as selection _name_ settings.
+* **Layers subsection**. Display and label settings.
+* **Performance subsection**. Whether the SVG output is optimized for performance or granularity (very advanced stuff).
+* **Layers subsection**. This is arguably the most important subsection in the Main tab, as it enables you to make very precise adjustments to how things should look on your screen. You can adjust individual layer attributes like _color_, display _type_, _height_, and _min/max_ values. Click and drag a layer to change the order of layers, or _edit attributes for multiple layers_ at once.
+
+![an anvi'o settings layers](../../images/interactive_interface/interactive-settings-layers.png){:.center-img}
+
+Mastering these settings in the Main tab will minimize the post-processing your anvi'o figures need to become high-quality, publication-ready images.
+
+### Layers tab
+
+Through the layers tab you can,
+
+- __Change general settings for the tree__ (e.g., switching between circular or rectangular displays, changing tree radius or width), __and layers__ (e.g., editing layer margins, or activating custom layer margins).
+
+- __Load or save states__ to store all visual settings, or load a previously saved state.
+
+- __Customize individual__ layers by switching between different __display modes__ depending on the layer type (i.e., "text" or "color" mode for categorical layers, or "bar" or "intensity" mode for numerical layers), __set normalization__ (i.e., "square-root", or "log" normalization), __minimum, and maximum cutoff__ values for numerical layers, or set __layer height__, and __layer margin__ (i.e., its distance from the previous layer).
+
+- Use the __multi-selector__ at the bottom to change settings for multiple layers at once.
+
+![an anvi'o layers tab](../../images/interactive_interface/interactive-settings-layers-tab.png){:.center-img}
+
+
+### Samples tab
+
+The Samples tab is for the additional data you provide the interface through a samples database (see samples order and samples information sections above). Through this tab you can,
+
+- __Change the order__ of layers by picking an automatically generated or user-provided layer order from the Sample order combo box,
+
+- __Customize individual samples information entries.__ Changes in this tab can be reflected to the current display without re-drawing the entire tree unless the sample order is changed.
+
+### Bins tab
+
+Anvi'o allows you to create selections of items shown in the display (whether they are contigs, gene clusters, or any other type of data shown in the display). The Bins tab allows you to maintain these selections. Any selection on the tree will be added to the active bin in this tab (the state radio button next to a bin defines its activity). Through this tab you can,
+
+- __Create or delete bins, set bin names, change the color of a given bin__, or sort bins based on their name, the number of units they carry, or completion and contamination estimates (completion / contamination estimates are only computed for genomic or metagenomic analyses).
+
+- View the __number of selected units__ in a given bin, and see the __list of names in the selection__ by clicking the button that shows the number of units described in the bin.
+
+- __Store a collection of bins__, or __load a previously stored collection.__
+
+![an anvi'o bins tab](../../images/interactive_interface/interactive-settings-bins-tab.png){:.center-img}
+
+### Legends tab
+
+The legends tab enables users to easily change individual or batch legend colors for any of their additional data items.
+
+
+![an anvi'o legends tab](../../images/interactive_interface/interactive-settings-legends-tab.png){:.center-img}
+
+### Search tab
+
+It does what the name suggests. Using this tab you can,
+
+- __Build expressions to search items__ visualized in the main display.
+
+- __Highlight matches__, and __append__ them to, or __remove__ them from the __selected bin__ in the Bins tab.
+
+
+
+### Mouse panel
+
+The mouse panel displays the value of items underneath the mouse pointer while the user browses the tree.
+
+Displaying the numerical or categorical value of an item shown on the tree is not an easy task. We originally thought that displaying pop-up windows would solve it, but besides the great overhead, it often became a nuisance while browsing parts of the tree. We could show those pop-up displays only when the user clicks on the tree; however, click behavior is much more appropriate for adding or removing individual items from a bin, so it wasn't the best solution either. So we came up with the 'mouse panel'. You have a better idea? I am not surprised! We would love to try to improve your experience: please enter an issue, and let's discuss.
+
+### News panel
+The news panel provides information and external links tracking major Anvi'o releases and development updates.
+
+### Description panel
+
+The description panel is a flexible, multipurpose space where users can,
+
+- Store notes, comments, and any other stray items related to their project, in a feature-rich markdown environment.
+- Display context, references, reproducibility instructions, and any other salient details for published figures.
+
+![The Description panel in action](../../images/interactive_interface/interactive-settings-description-panel.png){:.center-img}
+
+## Interactive interface tips + tricks
+
+Here are some small conveniences that may help the interface serve you better (we are happy to expand these little tricks with your suggestions).
+
+* You can zoom to a section of the display by making a rectangular selection of the area __while pressing the shift key.__
+
+* You can click an entire branch to add items into the selected bin, and remove them by __right-clicking__ a branch.
+
+* If you click a branch __while pressing the `Command` or `CTRL` button__, it will create a new bin, and add the content of the selection into that bin.
+
+* Tired of selecting items for binning one by one? __Right-click__ on an item and select __Mark item as 'range start'__ to set an 'in point', then __right-click__ on another item and select __Add items in range to active bin__ or __Remove items in range from any bin__ to manipulate many items with few clicks. Nice!
+
+* By pressing `1`, `2`, `3`, `4`, and `5`, you can switch between the Layers, Bins, Samples, Mouse, and Search tabs!
+
+## Keyboard shortcuts
+
+The interactive interface recognizes a handful of keyboard shortcuts to help speed up your workflow:
+
+- The `S` key toggles the Settings panel
+- The `M` key toggles the Mouse panel
+- The `N` key toggles the Description panel
+- The `W` key toggles the News panel
+- The `D` key triggers a redraw of your visualization
+- The `T` key toggles showing the Title panel
+- Keys `1` through `5` will toggle between tabs within the Settings panel, granted the Settings panel is currently shown.
+- `CTRL`+`Z` and `CTRL`+`SHIFT`+`Z` will undo or redo bin actions, respectively.
+
+
+
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/interactive.md) to update this information.
+
diff --git a/help/8/artifacts/internal-genomes/index.md b/help/8/artifacts/internal-genomes/index.md
new file mode 100644
index 00000000..547466e2
--- /dev/null
+++ b/help/8/artifacts/internal-genomes/index.md
@@ -0,0 +1,58 @@
+---
+layout: artifact
+title: internal-genomes
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/internal-genomes
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-script-gen-genomes-file](../../programs/anvi-script-gen-genomes-file)
+
+
+## Required or used by
+
+
+[anvi-compute-functional-enrichment-across-genomes](../../programs/anvi-compute-functional-enrichment-across-genomes) [anvi-compute-genome-similarity](../../programs/anvi-compute-genome-similarity) [anvi-compute-metabolic-enrichment](../../programs/anvi-compute-metabolic-enrichment) [anvi-dereplicate-genomes](../../programs/anvi-dereplicate-genomes) [anvi-display-functions](../../programs/anvi-display-functions) [anvi-estimate-metabolism](../../programs/anvi-estimate-metabolism) [anvi-gen-genomes-storage](../../programs/anvi-gen-genomes-storage) [anvi-get-codon-frequencies](../../programs/anvi-get-codon-frequencies) [anvi-get-codon-usage-bias](../../programs/anvi-get-codon-usage-bias) [anvi-get-sequences-for-hmm-hits](../../programs/anvi-get-sequences-for-hmm-hits) [anvi-meta-pan-genome](../../programs/anvi-meta-pan-genome) [anvi-script-gen-function-matrix-across-genomes](../../programs/anvi-script-gen-function-matrix-across-genomes) [anvi-script-gen-functions-per-group-stats-output](../../programs/anvi-script-gen-functions-per-group-stats-output) [anvi-script-gen-hmm-hits-matrix-across-genomes](../../programs/anvi-script-gen-hmm-hits-matrix-across-genomes)
+
+
+## Description
+
+In the anvi'o lingo, an internal genome is any [bin](/help/8/artifacts/bin) stored in an anvi'o [collection](/help/8/artifacts/collection) that describes a single genome. You can obtain one of these by binning a metagenome manually in the interactive interface, automatically using a binning software, or by importing a [collection](/help/8/artifacts/collection) into anvi'o using the program [anvi-import-collection](/help/8/programs/anvi-import-collection).
+
+The purpose of the internal genomes file is to describe one or more internal genomes, so this file can be passed to anvi'o programs that can operate on multiple genomes. The internal genomes file format enables anvi'o programs to work with one or more bins from one or more collections that may be defined in different anvi'o [profile-db](/help/8/artifacts/profile-db) files.
+
+The internal-genomes file is a TAB-delimited file with at least the following five columns:
+
+|name|bin_id|collection_id|profile_db_path|contigs_db_path|
+|:--|:--:|:--:|:--|:--|
+|Name_01|Bin_id_01|Collection_A|/path/to/profile.db|/path/to/contigs.db|
+|Name_02|Bin_id_02|Collection_A|/path/to/profile.db|/path/to/contigs.db|
+|Name_03|Bin_id_03|Collection_B|/path/to/another_profile.db|/path/to/another/contigs.db|
+|(...)|(...)|(...)|(...)|(...)|
+
+{:.warning}
+Please make sure names in the `name` column do not include any special characters (underscore is fine). It is also a good idea to keep these names short and descriptive as they will appear in various figures in downstream analyses.
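+
+The file layout above can be sketched in a few lines of Python. This is only an illustration, not an anvi'o API: the helper `write_internal_genomes` is hypothetical, and it assumes that "no special characters" means ASCII letters, digits, and underscores only.
+
+```python
+import csv
+import io
+import re
+
+# Column names taken from the table above.
+COLUMNS = ["name", "bin_id", "collection_id", "profile_db_path", "contigs_db_path"]
+
+# Assumption: "no special characters" is read as letters, digits, and underscores only.
+NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")
+
+
+def write_internal_genomes(rows, handle):
+    """Write rows (dicts keyed by COLUMNS) as a TAB-delimited internal-genomes file."""
+    for row in rows:
+        if not NAME_PATTERN.fullmatch(row["name"]):
+            raise ValueError(f"name contains special characters: {row['name']!r}")
+    writer = csv.DictWriter(handle, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n")
+    writer.writeheader()
+    writer.writerows(rows)
+
+
+# Placeholder rows mirroring the example table; the paths are not real files.
+rows = [
+    {"name": "Name_01", "bin_id": "Bin_id_01", "collection_id": "Collection_A",
+     "profile_db_path": "/path/to/profile.db", "contigs_db_path": "/path/to/contigs.db"},
+    {"name": "Name_03", "bin_id": "Bin_id_03", "collection_id": "Collection_B",
+     "profile_db_path": "/path/to/another_profile.db", "contigs_db_path": "/path/to/another/contigs.db"},
+]
+
+buffer = io.StringIO()
+write_internal_genomes(rows, buffer)
+print(buffer.getvalue())
+```
+
+Writing to an in-memory buffer keeps the sketch side-effect free; passing an open file handle instead would produce a file on disk.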
+
+Also see **[external-genomes](/help/8/artifacts/external-genomes)** and **[metagenomes](/help/8/artifacts/metagenomes)**.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/internal-genomes.md) to update this information.
+
diff --git a/help/8/artifacts/inversions-txt/index.md b/help/8/artifacts/inversions-txt/index.md
new file mode 100644
index 00000000..6ac9dfca
--- /dev/null
+++ b/help/8/artifacts/inversions-txt/index.md
@@ -0,0 +1,152 @@
+---
+layout: artifact
+title: inversions-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/inversions-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-report-inversions](../../programs/anvi-report-inversions)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This is the output of [anvi-report-inversions](/help/8/programs/anvi-report-inversions).
+
+### Per sample inversion report table
+
+[anvi-report-inversions](/help/8/programs/anvi-report-inversions) searches for inversions one sample at a time and thus generates a TAB-delimited table for every sample: `INVERSIONS-IN-SAMPLE_01.txt`, `INVERSIONS-IN-SAMPLE_02.txt`, ...
+
+Here is an example output:
+
+|**entry_id**|**sample_name**|**contig_name**|**first_seq**|**midline**|**second_seq**|**first_start**|**first_end**|**first_oligo_primer**|**first_oligo_reference**|**second_start**|**second_end**|**second_oligo_primer**|**second_oligo_reference**|**num_mismatches**|**num_gaps**|**length**|**distance**|
+|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|
+|c_000000000001_10214_10541|S01|c_000000000001|TGTTTCAAAAAACGTTCGT|\|\|\|\|\|\|\|\|\|\|x\|\|\|\|\|\|\|\||ACGAACGTCTTTTGAAACA|10306|10325|TCGATCAATTGATGTTTCAAAA.ACGTTCGT|CTTTTG|10493|10512|TCAGTAGTGAATGTGTTTCAAAA.ACGTTCGT|TTAATA|1|0|19|168|
+|c_000000000002_10148_11135|S01|c_000000000002|TGTTTCAAAAAACGTTCGT|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\||ACGAACGTTTTTTGAAACA|10514|10533|AGATACGGTTTATGTTTCAAAAAACGTTCGT|CTTTTG|10724|10743|AGAAAAGAAGGCGTGTTTCAAAAAACGTTCGT|TCAATG|0|0|19|191|
+
+These tables contain the following columns:
+
+* Entry ID made with the contig's name and the start and stop position of the stretch
+* The contig's name
+* The first palindrome sequence
+* The alignment midline
+* The second palindrome sequence
+* The start and stop position of the first and second palindrome sequence
+* The number of mismatches
+* The number of gaps
+* The length of the palindrome sequence
+* The distance between the first and second palindrome sequences, i.e. the size of the inversion
+* The number of samples in which it was detected and confirmed
+* The in silico primers used to compute the inversion's activity, for the first and second palindrome
+* The oligo corresponding to the reference sequence
+
+### Inversions consensus table
+
+Anvi'o eventually creates a consensus table with all the unique inversions found across all your samples in a file called `INVERSIONS-CONSENSUS.txt`. This table has the same format as the individual sample outputs, with the 'entry ID' replaced by a unique inversion ID. It also has columns reporting the number and names of the samples where the inversion was detected.
+
+The table should look like this:
+
+|**inversion_id**|**contig_name**|**first_seq**|**midline**|**second_seq**|**first_start**|**first_end**|**second_start**|**second_end**|**num_mismatches**|**num_gaps**|**length**|**distance**|**num_samples**|**sample_names**|**first_oligo_primer**|**first_oligo_reference**|**second_oligo_primer**|**second_oligo_reference**|
+|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|
+|INV_0001|c_000000000001|TGTTTCAAAAAACGTTCGT|\|\|\|\|\|\|\|\|\|\|x\|\|\|\|\|\|\|\||ACGAACGTCTTTTGAAACA|10306|10325|10493|10512|1|0|19|168|3|S01,S02,S03|TCGATCAATTGATGTTTCAAAA.ACGTTCGT|CTTTTG|TCAGTAGTGAATGTGTTTCAAAA.ACGTTCGT|TTAATA|
+|INV_0002|c_000000000002|TGTTTCAAAAAACGTTCGT|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\||ACGAACGTTTTTTGAAACA|10514|10533|10724|10743|0|0|19|191|3|S01,S02,S03|AGATACGGTTTATGTTTCAAAAAACGTTCGT|CTTTTG|AGAAAAGAAGGCGTGTTTCAAAAAACGTTCGT|TCAATG|
+
+### All stretches considered
+
+Another default output table is named `ALL-STRETCHES-CONSIDERED.txt` and it reports every stretch that passed the 'Identifying regions of interest' parameters. It reports the maximum coverage of FWD/FWD and REV/REV in that stretch, per sample. It also reports the number of palindromes found and if a true inversion was confirmed.
+
+|**entry_id**|**sequence_name**|**sample_name**|**contig_name**|**start_stop**|**max_coverage**|**num_palindromes_found**|**true_inversions_found**|
+|:--|:--|:--|:--|:--|:--|:--|:--|
+|S01_c_000000000001_10214_10541|c_000000000001_10214_10541|S01|c_000000000001|10214-10541|16|4|False|
+|S01_c_000000000002_10148_11135|c_000000000002_10148_11135|S01|c_000000000002|10148-11135|69|20|False|
+|S02_c_000000000001_10283_10542|c_000000000001_10283_10542|S02|c_000000000001|10283-10542|11|3|False|
+|S02_c_000000000002_10200_11052|c_000000000002_10200_11052|S02|c_000000000002|10200-11052|96|17|False|
+|S03_c_000000000001_10033_10801|c_000000000001_10033_10801|S03|c_000000000001|10033-10801|30|12|False|
+|S03_c_000000000002_10498_10764|c_000000000002_10498_10764|S03|c_000000000002|10498-10764|13|2|False|
+
+### Surrounding genes and functions
+
+If the user enables the reporting of the genomic context, two additional TAB-delimited tables are generated: `INVERSIONS-CONSENSUS-SURROUNDING-GENES.txt` and `INVERSIONS-CONSENSUS-SURROUNDING-FUNCTIONS.txt`.
+
+The first table reports the gene calls surrounding every inversion when possible:
+
+|**inversion_id**|**entry_type**|**gene_callers_id**|**start**|**stop**|**direction**|**partial**|**call_type**|**source**|**version**|**contig**|
+|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|
+|INV_0001|FIRST_PALINDROME||10306|10325||||||c_000000000001|
+|INV_0001|SECOND_PALINDROME||10493|10512||||||c_000000000001|
+|INV_0001|GENE|6|6818|7595|r|0|1|prodigal|v2.6.3|c_000000000001|
+|INV_0001|GENE|7|7632|8976|r|0|1|prodigal|v2.6.3|c_000000000001|
+|INV_0001|GENE|8|9145|9616|r|0|1|prodigal|v2.6.3|c_000000000001|
+|INV_0001|GENE|9|9651|10170|r|0|1|prodigal|v2.6.3|c_000000000001|
+|INV_0001|GENE|10|11311|14161|r|0|1|prodigal|v2.6.3|c_000000000001|
+|INV_0001|GENE|11|14165|14495|r|0|1|prodigal|v2.6.3|c_000000000001|
+|INV_0001|GENE|12|14524|16072|r|0|1|prodigal|v2.6.3|c_000000000001|
+
+The second table reports the functions associated with every gene call reported in the first file:
+
+|**inversion_id**|**gene_callers_id**|**source**|**accession**|**function**|
+|:--|:--|:--|:--|:--|
+|INV_0001|6|COG20_FUNCTION|COG1208|NDP-sugar pyrophosphorylase, includes eIF-2Bgamma, eIF-2Bepsilon, and LPS biosynthesis protein s (GCD1) (PDB:6JQ8)|
+|INV_0001|6|COG20_CATEGORY|J|Translation, ribosomal structure and biogenesis|
+|INV_0001|6|KOfam|K00978|glucose-1-phosphate cytidylyltransferase [EC:2.7.7.33]|
+|INV_0001|7|COG20_FUNCTION|COG0399|dTDP-4-amino-4,6-dideoxygalactose transaminase (WecE) (PDB:4PIW) (PUBMED:15271350)|
+|INV_0001|7|COG20_CATEGORY|M|Cell wall/membrane/envelope biogenesis|
+|INV_0001|7|KOfam|K12452|CDP-4-dehydro-6-deoxyglucose reductase, E1 [EC:1.17.1.1]|
+|INV_0001|9|COG20_FUNCTION|COG0250|Transcription termination/antitermination protein NusG (NusG) (PDB:3EWG) (PUBMED:19500594)|
+|INV_0001|9|COG20_CATEGORY|K|Transcription|
+|INV_0001|10|COG20_FUNCTION|COG2605|Predicted kinase related to galactokinase and mevalonate kinase (PDB:4USK)|
+|INV_0001|10|COG20_CATEGORY|R|General function prediction only|
+|INV_0001|10|KOfam|K05305|fucokinase [EC:2.7.1.52]|
+|INV_0001|11|COG20_FUNCTION|COG3254|L-rhamnose mutarotase (RhaM) (PDB:1X8D)|
+|INV_0001|11|COG20_CATEGORY|M|Cell wall/membrane/envelope biogenesis|
+|INV_0001|11|KOfam|K03534|L-rhamnose mutarotase [EC:5.1.3.32]|
+|INV_0001|12|COG20_FUNCTION|COG0305|Replicative DNA helicase (DnaB) (PDB:1B79)|
+|INV_0001|12|COG20_CATEGORY|L|Replication, recombination and repair|
+|INV_0001|12|KOfam|K02314|replicative DNA helicase [EC:3.6.4.12]|
+
+
+### Inversion's activity
+
+Finally, if the user provides R1 and R2 fastq files and enables the reporting of inversion activity, [anvi-report-inversions](/help/8/programs/anvi-report-inversions) will generate a long-format file named `INVERSION-ACTIVITY.txt`. This file reports, for every inversion and sample, the relative proportion and read abundance of unique oligos, which correspond either to the reference contig (no inversion) or to an inversion sequence. The inversion's activity is computed and reported for both sides of each inversion.
+
+|**sample**|**inversion_id**|**oligo_primer**|**oligo**|**reference**|**frequency_count**|**relative_abundance**|
+|:--|:--|:--|:--|:--|:--|:--|
+|S01|INV_0001|first_oligo_primer|TTAATA|False|6|0.097|
+|S01|INV_0001|first_oligo_primer|CTTTTG|True|55|0.887|
+|S01|INV_0001|first_oligo_primer|CTTTTT|False|1|0.016|
+|S01|INV_0001|second_oligo_primer|CTTTTG|False|11|0.169|
+|S01|INV_0001|second_oligo_primer|TTAATA|True|54|0.831|
+|S01|INV_0002|first_oligo_primer|TCAATG|False|37|0.587|
+|S01|INV_0002|first_oligo_primer|CTTTTG|True|25|0.397|
+|S01|INV_0002|first_oligo_primer|TCAATT|False|1|0.016|
+|S01|INV_0002|second_oligo_primer|CTTTTG|False|53|0.609|
+|S01|INV_0002|second_oligo_primer|TCAATG|True|33|0.379|
+|S01|INV_0002|second_oligo_primer|TCAATC|False|1|0.011|
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/inversions-txt.md) to update this information.
+
diff --git a/help/8/artifacts/kegg-data/index.md b/help/8/artifacts/kegg-data/index.md
new file mode 100644
index 00000000..5719efe6
--- /dev/null
+++ b/help/8/artifacts/kegg-data/index.md
@@ -0,0 +1,106 @@
+---
+layout: artifact
+title: kegg-data
+excerpt: A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/kegg-data
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-setup-kegg-data](../../programs/anvi-setup-kegg-data)
+
+
+## Required or used by
+
+
+[anvi-display-metabolism](../../programs/anvi-display-metabolism) [anvi-estimate-metabolism](../../programs/anvi-estimate-metabolism) [anvi-reaction-network](../../programs/anvi-reaction-network) [anvi-run-kegg-kofams](../../programs/anvi-run-kegg-kofams)
+
+
+## Description
+
+A **directory of data** downloaded from the [KEGG database resource](https://www.kegg.jp/) for use in function annotation and metabolism estimation.
+
+It is created by running the program [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data). Not everything from KEGG is included in this directory, only the information relevant to downstream programs. The most critical components of this directory are KOfam HMM profiles and the [modules-db](/help/8/artifacts/modules-db) which contains information on metabolic pathways as described in the [KEGG MODULES resource](https://www.genome.jp/kegg/module.html), as well as functional classification hierarchies from [KEGG BRITE](https://www.genome.jp/kegg/brite.html).
+
+Programs that rely on this data directory include [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) and [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism).
+
+## Directory Location
+The default location of this data is in the anvi'o folder, at `anvio/anvio/data/misc/KEGG/`.
+
+You can change this location when you run [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data) by providing a different path to the `--kegg-data-dir` parameter:
+
+
+anvi-setup-kegg-data --kegg-data-dir /path/to/directory/KEGG
+
+
+If you do this, you will need to provide this path to downstream programs that require this data as well.
+
+## Directory Contents
+
+Here is a schematic of how the [kegg-data](/help/8/artifacts/kegg-data) folder will look after setup:
+
+```
+KEGG
+ |- MODULES.db
+ |- ko_list.txt
+ |- modules.keg
+ |- hierarchies.json
+ |- HMMs
+ | |- Kofam.hmm
+ | |- Kofam.hmm.h3f
+ | |- (....)
+ |
+ |- modules
+ | |- M00001
+ | |- M00002
+ | |- (....)
+ |
+ |- BRITE
+ | |- ko00001
+ | |- ko00194
+ | |- (....)
+ |
+ |- orphan_data
+ |- 01_ko_fams_with_no_threshold.txt
+ |- 02_hmm_profiles_with_ko_fams_with_no_threshold.hmm
+
+```
+
+### What is this data?
+
+Typically, users will not have to work directly with any of these files, as downstream programs will interface directly with the [modules-db](/help/8/artifacts/modules-db).
+
+However, for the curious, here is a description of each component in this data directory:
+- `ko_list.txt`: a tab-delimited file from the [KEGG KOfam](https://www.genome.jp/ftp/db/kofam/) resource that describes the KOfam profile for each KEGG Ortholog (KO). It contains information like the bitscore threshold (used to differentiate between 'good' and 'bad' hits when annotating sequences), the function definition, and various data about the sequences used to generate the profile.
+- The `HMMs` subfolder: contains a file of concatenated KOfam profiles (also originally downloaded from [KEGG](https://www.genome.jp/ftp/db/kofam/)), as well as the indexes for this file.
+- The `orphan_data` subfolder: contains KOfam profiles for KOs that do not have a bitscore threshold in the `ko_list.txt` file (in the `.hmm` file) and their corresponding entries from the `ko_list.txt` file (in `01_ko_fams_with_no_threshold.txt`). Please note that KOs from the `orphan_data` directory will *not* be annotated in your [contigs-db](/help/8/artifacts/contigs-db) when you run [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams). However, if you ever need to take a look at these profiles or use them in any way, here they are. :)
+- `modules.keg`: a flat text file describing all metabolic modules available in the [KEGG MODULE](https://www.genome.jp/kegg/module.html) resource. This includes pathway and signature modules, but not reaction modules.
+- The `modules` subfolder: contains flat text files, one for each metabolic module, downloaded using the [KEGG REST API](https://www.kegg.jp/kegg/rest/keggapi.html). Each file describes a metabolic module's definition, classification, component orthologs, metabolic reactions, compounds, and any miscellaneous data like references and such. For an example, see the [module file for M00001](https://rest.kegg.jp/get/M00001/).
+- `hierarchies.json`: a JSON-formatted file describing the available functional hierarchies in the [KEGG BRITE](https://www.genome.jp/kegg/brite.html) resource.
+- The `BRITE` subfolder: contains JSON-formatted files, each one of which describes a BRITE hierarchy.
+- `MODULES.db`: a SQLite database containing data parsed from the module files and BRITE hierarchies. See [modules-db](/help/8/artifacts/modules-db).
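+
+If you ever want to check that a data directory has the layout described above, a small script can do it. The sketch below is not part of anvi'o: the function `missing_entries` and the `EXPECTED` list are hypothetical, the list only covers entries named in the schematic, and a real setup may contain more.
+
+```python
+import tempfile
+from pathlib import Path
+
+# Entries taken from the directory schematic above; a real setup may contain more.
+EXPECTED = [
+    "MODULES.db",
+    "ko_list.txt",
+    "modules.keg",
+    "hierarchies.json",
+    "HMMs/Kofam.hmm",
+    "modules",
+    "BRITE",
+    "orphan_data",
+]
+
+
+def missing_entries(kegg_dir):
+    """Return the expected entries that are absent from kegg_dir."""
+    root = Path(kegg_dir)
+    return [entry for entry in EXPECTED if not (root / entry).exists()]
+
+
+# Demonstration on a throwaway directory that only has part of the layout.
+with tempfile.TemporaryDirectory() as tmp:
+    (Path(tmp) / "HMMs").mkdir()
+    (Path(tmp) / "HMMs" / "Kofam.hmm").touch()
+    (Path(tmp) / "MODULES.db").touch()
+    demo_missing = missing_entries(tmp)
+
+print(demo_missing)
+```
+
+Pointing `missing_entries` at the path you gave to `--kegg-data-dir` would list anything from the schematic that is not present there.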
+
+### How do we use it?
+
+The KOfam profiles are used directly by [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) for annotating genes with KEGG Orthologs. The MODULE and BRITE data in the above files are processed and organized into the [modules-db](/help/8/artifacts/modules-db) for easier programmatic access. [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) uses this database to annotate genes with BRITE categories and with the modules they participate in, when relevant. [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) uses this database to get module information when computing completeness scores for each metabolic module.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/kegg-data.md) to update this information.
+
diff --git a/help/8/artifacts/kegg-functions/index.md b/help/8/artifacts/kegg-functions/index.md
new file mode 100644
index 00000000..f4e8a319
--- /dev/null
+++ b/help/8/artifacts/kegg-functions/index.md
@@ -0,0 +1,46 @@
+---
+layout: artifact
+title: kegg-functions
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/kegg-functions
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-run-kegg-kofams](../../programs/anvi-run-kegg-kofams)
+
+
+## Required or used by
+
+
+[anvi-display-metabolism](../../programs/anvi-display-metabolism) [anvi-estimate-metabolism](../../programs/anvi-estimate-metabolism) [anvi-reaction-network](../../programs/anvi-reaction-network)
+
+
+## Description
+
+[KEGG Orthology](https://www.genome.jp/kegg/ko.html) (KO) functional annotations, produced by finding HMM hits to the KEGG KOfam database.
+
+You can annotate a [contigs-db](/help/8/artifacts/contigs-db) with these KEGG functions by running [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams). They will be added to the gene functions table under the source 'KOfam'.
+
+Another program that relies on these annotations is [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism), which uses them to determine presence and completeness of metabolic pathways that are defined by KOs.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/kegg-functions.md) to update this information.
+
diff --git a/help/8/artifacts/kegg-metabolism/index.md b/help/8/artifacts/kegg-metabolism/index.md
new file mode 100644
index 00000000..f8377429
--- /dev/null
+++ b/help/8/artifacts/kegg-metabolism/index.md
@@ -0,0 +1,279 @@
+---
+layout: artifact
+title: kegg-metabolism
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/kegg-metabolism
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-estimate-metabolism](../../programs/anvi-estimate-metabolism)
+
+
+## Required or used by
+
+
+[anvi-compute-metabolic-enrichment](../../programs/anvi-compute-metabolic-enrichment)
+
+
+## Description
+
+Output text files produced by [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) that describe the presence of metabolic pathways in a [contigs-db](/help/8/artifacts/contigs-db).
+
+Depending on the output options used when running [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism), these files will have different formats. This page describes and provides examples of the various output file types.
+
+Please note that the examples below show only KEGG data, but user-defined metabolic pathways ([user-metabolism](/help/8/artifacts/user-metabolism)) can also be included in this output!
+
+### How to get to this output
+![A beautiful workflow of metabolism reconstruction in anvi'o](../../images/metabolism_reconstruction.png)
+
+## Long-format output modes
+
+The long-format output option produces tab-delimited files. Different output "modes" will result in output files with different information. You can use the `--list-available-modes` parameter to see which modes are implemented in your version of anvi'o.
+
+Some of these modes are customizable, such that you can select which columns of information to include in the output with the flag `--custom-output-headers`. Use the `--list-available-output-headers` parameter to see what kinds of information you can choose from.
+
+### 'Modules' Mode
+
+The 'modules' mode output file will have the suffix `modules.txt`. Each line in the file will represent information about a metabolic module in a given genome, bin, or contig of a metagenome assembly. Here is one example, produced by running metabolism estimation on the _Enterococcus_ external genomes in the [Infant Gut dataset](http://merenlab.org/tutorials/infant-gut/):
+
+|**module**|**genome_name**|**db_name**|**module_name**|**module_class**|**module_category**|**module_subcategory**|**module_definition**|**stepwise_module_completeness**|**stepwise_module_is_complete**|**pathwise_module_completeness**|**pathwise_module_is_complete**|**proportion_unique_enzymes_present**|**enzymes_unique_to_module**|**unique_enzymes_hit_counts**|**enzyme_hits_in_module**|**gene_caller_ids_in_module**|**warnings**|
+|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|
+|M00001|Enterococcus_faecalis_6240|E_faecalis_6240|Glycolysis (Embden-Meyerhof pathway), glucose => pyruvate|Pathway modules|Carbohydrate metabolism|Central carbohydrate metabolism|"(K00844,K12407,K00845,K25026,K00886,K08074,K00918) (K01810,K06859,K13810,K15916) (K00850,K16370,K21071,K00918) (K01623,K01624,K11645,K16305,K16306) K01803 ((K00134,K00150) K00927,K11389) (K01834,K15633,K15634,K15635) K01689 (K00873,K12406)"|1.0|True|1.0|True|NA|No enzymes unique to module|NA|K00134,K00134,K00850,K00873,K00927,K01624,K01689,K01803,K01803,K01810,K01834,K01834,K25026|1044,642,225,226,1043,348,1041,1042,1043,600,2342,2646,1608|K00850 is present in multiple modules: M00001/M00345,K00927 is present in multiple modules: M00001/M00002/M00003/M00308/M00552/M00165/M00166/M00611/M00612,K00873 is present in multiple modules: M00001/M00002,K01689 is present in multiple modules: M00001/M00002/M00003/M00346,K01624 is present in multiple modules: M00001/M00003/M00165/M00167/M00345/M00344/M00611/M00612,K01803 is present in multiple modules: M00001/M00002/M00003,K00134 is present in multiple modules: M00001/M00002/M00003/M00308/M00552/M00165/M00166/M00611/M00612,K01834 is present in multiple modules: M00001/M00002/M00003,K25026 is present in multiple modules: M00001/M00549/M00909,K01810 is present in multiple modules: M00001/M00004/M00892/M00909|
+|M00002|Enterococcus_faecalis_6240|E_faecalis_6240|Glycolysis, core module involving three-carbon compounds|Pathway modules|Carbohydrate metabolism|Central carbohydrate metabolism|"K01803 ((K00134,K00150) K00927,K11389) (K01834,K15633,K15634,K15635) K01689 (K00873,K12406)"|1.0|True|1.0|True|NA|No enzymes unique to module|NA|K00134,K00134,K00873,K00927,K01689,K01803,K01803,K01834,K01834|1044,642,226,1043,1041,1042,1043,2342,2646|K00927 is present in multiple modules: M00001/M00002/M00003/M00308/M00552/M00165/M00166/M00611/M00612,K00873 is present in multiple modules: M00001/M00002,K01689 is present in multiple modules: M00001/M00002/M00003/M00346,K01803 is present in multiple modules: M00001/M00002/M00003,K00134 is present in multiple modules: M00001/M00002/M00003/M00308/M00552/M00165/M00166/M00611/M00612,K01834 is present in multiple modules: M00001/M00002/M00003|
+|M00003|Enterococcus_faecalis_6240|E_faecalis_6240|Gluconeogenesis, oxaloacetate => fructose-6P|Pathway modules|Carbohydrate metabolism|Central carbohydrate metabolism|"(K01596,K01610) K01689 (K01834,K15633,K15634,K15635) K00927 (K00134,K00150) K01803 ((K01623,K01624,K11645) (K03841,K02446,K11532,K01086,K04041),K01622)"|0.8571428571428571|True|0.875|True|NA|No enzymes unique to module|NA|K00134,K00134,K00927,K01624,K01689,K01803,K01803,K01834,K01834,K04041|1044,642,1043,348,1041,1042,1043,2342,2646,617|K04041 is present in multiple modules: M00003/M00611/M00612,K00927 is present in multiple modules: M00001/M00002/M00003/M00308/M00552/M00165/M00166/M00611/M00612,K01689 is present in multiple modules: M00001/M00002/M00003/M00346,K01624 is present in multiple modules: M00001/M00003/M00165/M00167/M00345/M00344/M00611/M00612,K01803 is present in multiple modules: M00001/M00002/M00003,K00134 is present in multiple modules: M00001/M00002/M00003/M00308/M00552/M00165/M00166/M00611/M00612,K01834 is present in multiple modules: M00001/M00002/M00003|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+
+What are the data in each of these columns?
+
+- `module`: the module identifier for a metabolic pathway (from the KEGG MODULE database or from user-defined modules)
+- `genome_name`/`bin_name`/`contig_name`: the identifier for the current sample, whether that is a genome, bin, or contig from a metagenome assembly
+- `db_name`: the name of the contigs database from which this data comes (only appears in output from multi-mode, in which multiple DBs are processed at once)
+- `module_name`/`module_class`/`module_category`/`module_subcategory`/`module_definition`: metabolic pathway information, from the KEGG MODULE database or from user-defined metabolic modules
+- `stepwise_module_completeness`/`pathwise_module_completeness`: a fraction between 0 and 1 indicating the proportion of steps in the metabolic pathway that have an associated enzyme annotation. There are currently two strategies for defining the 'steps' in a metabolic pathway - 'stepwise' and 'pathwise'. To learn how these numbers are calculated, see [the anvi-estimate-metabolism help page](https://anvio.org/help/main/programs/anvi-estimate-metabolism/#technical-details)
+- `stepwise_module_is_complete`/`pathwise_module_is_complete`: a Boolean value indicating whether the corresponding `module_completeness` score is above a certain threshold or not (the default threshold is 0.75)
+- `proportion_unique_enzymes_present`: some enzymes only belong to one metabolic pathway, which means that their presence is a better indicator for the presence of a metabolic capacity than other enzymes that are shared across multiple pathways. This column contains the fraction of these unique enzymes that are present in the sample. For instance, if the module has only 1 unique enzyme and it is present, you will see a 1 in this column. You can find out the denominator of this fraction (i.e., the number of unique enzymes for this module) by either calculating the length of the list in the `enzymes_unique_to_module` column, or by requesting custom modules mode output with the `unique_enzymes_context_string` header
+- `enzymes_unique_to_module`: a comma-separated list of the enzymes that only belong to the current module (i.e., are not shared across multiple metabolic pathways)
+- `unique_enzymes_hit_counts`: a comma-separated list of how many times each unique enzyme appears in the sample, in the same order as the `enzymes_unique_to_module` list
+- `enzyme_hits_in_module`: a comma-separated list of the enzyme annotations that were found in the current sample and contribute to this metabolic pathway (these will be enzymes from the metabolic pathway definition in the `module_definition` column)
+- `gene_caller_ids_in_module`: a comma-separated list of the genes with enzyme annotations that contribute to this pathway, in the same order as the annotations in the `enzyme_hits_in_module` column
+- `warnings`: miscellaneous caveats to consider when interpreting the `module_completeness` scores. For example, a warning like "No KOfam profile for K00172" would indicate that we cannot annotate K00172 because we have no HMM profile for that gene family, which means that any metabolic pathway containing this KO can never be fully complete (even if a gene from that family does exist in your sequences). Seeing many warnings like "K01810 is present in multiple modules: M00001/M00004/M00892/M00909" indicates that the current module shares many enzymes with other metabolic pathways, meaning that it may appear to be complete only because its component enzymes are common. Extra caution should be taken when considering the completeness of modules with warnings
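+
+Because `enzyme_hits_in_module` and `gene_caller_ids_in_module` are listed in the same order, they can be paired up position by position. Here is a minimal sketch of that, using only the Python standard library and an excerpt of the M00002 row shown above. The function name `genes_per_enzyme` is made up for this illustration and is not part of anvi'o:
+
+```python
+import csv
+import io
+
+# Three-column excerpt of a 'modules' mode file (values from the M00002 row above).
+example = (
+    "module\tenzyme_hits_in_module\tgene_caller_ids_in_module\n"
+    "M00002\tK00134,K00134,K00873,K00927,K01689,K01803,K01803,K01834,K01834\t"
+    "1044,642,226,1043,1041,1042,1043,2342,2646\n"
+)
+
+
+def genes_per_enzyme(row):
+    """Map each enzyme to the gene caller ids that carry its annotation."""
+    enzymes = row["enzyme_hits_in_module"].split(",")
+    gene_ids = row["gene_caller_ids_in_module"].split(",")
+    pairs = {}
+    for enzyme, gene_id in zip(enzymes, gene_ids):
+        pairs.setdefault(enzyme, []).append(int(gene_id))
+    return pairs
+
+
+rows = list(csv.DictReader(io.StringIO(example), delimiter="\t"))
+hits = genes_per_enzyme(rows[0])
+print(hits["K00134"])  # [1044, 642]
+```
+
+To use this on a real output file, you would replace `io.StringIO(example)` with an open file handle to your own `modules.txt` file.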
+
+**Module copy number values in the output**
+
+If you use the flag `--add-copy-number` when running [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism), you will see three additional columns describing the estimated copy number of each module: `pathwise_copy_number`, `stepwise_copy_number`, and `per_step_copy_numbers`.
+- `pathwise_copy_number` is the number of 'complete' copies of the path through the module with the highest (pathwise) completeness score, where 'complete' here means 'greater than or equal to the module completeness threshold'. If there are multiple paths through the module with the highest pathwise completeness score, we take the maximum copy number of all of these paths.
+- `stepwise_copy_number` is the minimum number of times we see each top-level step in the module.
+- `per_step_copy_numbers` is a comma-separated list of the copy number of each individual top-level step in the module. It is meant to be used for interpreting the stepwise copy number (which is simply the minimum of this list).
+
+A discussion of how copy numbers are computed can be found [here](https://anvio.org/help/main/programs/anvi-estimate-metabolism/#technical-details).
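+
+As a small illustration of the relationship between the last two columns, the stepwise copy number can be recovered from `per_step_copy_numbers` by taking the minimum of the list. The input value below is hypothetical, and the function is only a sketch of the relationship described above, not anvi'o code:
+
+```python
+def stepwise_copy_number(per_step_copy_numbers):
+    """Return the minimum of a comma-separated list of per-step copy numbers."""
+    return min(int(value) for value in per_step_copy_numbers.split(","))
+
+
+# Hypothetical module with five top-level steps.
+print(stepwise_copy_number("2,1,3,1,2"))  # 1
+```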
+
+**Coverage and detection values in the output**
+
+If you use the flag `--add-coverage` and provide a profile database, additional columns containing coverage and detection data will be added for each sample in the profile database. Here is a mock example of the additional columns you will see (for a generic sample called 'SAMPLE_1'):
+
+| SAMPLE_1_gene_coverages | SAMPLE_1_avg_coverage | SAMPLE_1_gene_detection | SAMPLE_1_avg_detection |
+|:--|:--|:--|:--|
+| 3.0,5.0,10.0,2.0 | 5.0 | 1.0,1.0,1.0,1.0 | 1.0 |
+
+In this mock example, the module in this row has four gene calls in it. The `SAMPLE_1_gene_coverages` column lists the mean coverage of each of those genes in SAMPLE_1 (in the same order as the gene calls are listed in the `gene_caller_ids_in_module` column), and the `SAMPLE_1_avg_coverage` column holds the average of these values. As you probably expected, the `detection` columns are similarly defined, except that they contain detection values instead of coverage.
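+
+If you want to double-check the average columns yourself, the following sketch recomputes the average from the comma-separated per-gene values in the mock example above. The function name is hypothetical, and we assume a plain arithmetic mean, as described in the previous paragraph:
+
+```python
+from statistics import mean
+
+
+def average_of_column(value):
+    """Average the comma-separated per-gene values of one table cell."""
+    return mean(float(x) for x in value.split(","))
+
+
+print(average_of_column("3.0,5.0,10.0,2.0"))  # 5.0
+print(average_of_column("1.0,1.0,1.0,1.0"))   # 1.0
+```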
+
+**Pathway substrates, products, and intermediates**
+
+To add information about molecular compounds that are relevant to each metabolic pathway, you can customize the `modules mode` output. Here is an example command to do that:
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --output-modes modules_custom \
+ --custom-output-headers module,module_name,module_substrates,module_intermediates,module_products,pathwise_module_completeness,stepwise_module_completeness
+
+
+The resulting output file will have a column for each item in the `--custom-output-headers` list, including one each for the substrates (input compounds), products (output compounds) and intermediates.
+
+{:.warning}
+The 'hits_in_modules' output mode has been deprecated as of anvi'o `v7.1-dev`. If you have one of these output files and need information about it, you should look in the documentation pages for anvi'o `v7`. If you would like to obtain a similar output, the closest available is 'module_paths' mode.
+
+### 'Module Paths' Mode
+
+The 'module_paths' output file will have the suffix `module_paths.txt`. Each line in the file will represent information about one path through a metabolic module.
+
+What is a path through a module, you ask? Well. There is a lengthier explanation of this [here](https://anvio.org/help/main/programs/anvi-estimate-metabolism/#technical-details), but we will go through it briefly below.
+
+Modules are metabolic pathways defined by a set of enzymes - for KEGG modules, these enzymes are KEGG orthologs, or KOs. For example, here is the definition of module [M00001](https://www.genome.jp/kegg-bin/show_module?M00001), better known as "Glycolysis (Embden-Meyerhof pathway), glucose => pyruvate":
+
+(K00844,K12407,K00845,K00886,K08074,K00918) (K01810,K06859,K13810,K15916) (K00850,K16370,K21071,K00918) (K01623,K01624,K11645,K16305,K16306) K01803 ((K00134,K00150) K00927,K11389) (K01834,K15633,K15634,K15635) K01689 (K00873,K12406)
+
+Spaces separate steps (reactions) in the metabolic pathway, and commas separate alternative KOs or alternative sub-pathways that can facilitate the same overall reaction. So a definition such as the one above can be "unrolled" into several different linear sequences of KOs, each of which we consider to be a possible "path" through the module. As an example, we can take the first option for every step in the Embden-Meyerhof pathway definition from above:
+
+(**K00844**,K12407,K00845,K00886,K08074,K00918) (**K01810**,K06859,K13810,K15916) (**K00850**,K16370,K21071,K00918) (**K01623**,K01624,K11645,K16305,K16306) **K01803** ((**K00134**,K00150) **K00927**,K11389) (**K01834**,K15633,K15634,K15635) **K01689** (**K00873**,K12406)
+
+to get the following path of KOs (which happens to be the first path shown in the output example below):
+
+K00844 K01810 K00850 K01623 K01803 K00134 K00927 K01834 K01689 K00873
+
+In summary, a 'path' is one set of enzymes that can be used to catalyze all reactions in a given metabolic pathway, and there can be many possible paths (containing different sets of alternative enzymes) for a module. For every possible path through a module, there will be a corresponding line in the 'module_paths' output file.
+
+The same principle applies to user-defined metabolic modules, except that the enzymes can be from a variety of different annotation sources (not just KOfam).
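+
+The "unrolling" described above can be sketched in a few lines of Python. This is a simplified illustration and not the code anvi'o uses: the function names are made up, it only understands spaces, commas, and parentheses (real module definitions can contain other symbols), and it assumes that a comma binds less tightly than a space inside parentheses, which is consistent with the example path shown above:
+
+```python
+from itertools import product
+
+
+def split_top_level(text, sep):
+    """Split text on sep, ignoring separators nested inside parentheses."""
+    parts, depth, current = [], 0, []
+    for ch in text:
+        if ch == "(":
+            depth += 1
+        elif ch == ")":
+            depth -= 1
+        if ch == sep and depth == 0:
+            parts.append("".join(current))
+            current = []
+        else:
+            current.append(ch)
+    parts.append("".join(current))
+    return [p for p in parts if p]
+
+
+def strip_outer_parens(text):
+    """Remove one pair of parentheses if it wraps the whole string."""
+    if not (text.startswith("(") and text.endswith(")")):
+        return text
+    depth = 0
+    for i, ch in enumerate(text):
+        depth += ch == "("
+        depth -= ch == ")"
+        if depth == 0 and i < len(text) - 1:
+            return text  # the first "(" closes before the end
+    return text[1:-1]
+
+
+def unroll(definition):
+    """Return every path (a list of enzymes) through a module definition."""
+    definition = definition.strip()
+    alternatives = split_top_level(definition, ",")
+    if len(alternatives) > 1:
+        # Commas: each alternative contributes its own paths.
+        return [path for alt in alternatives for path in unroll(alt)]
+    steps = split_top_level(definition, " ")
+    if len(steps) > 1:
+        # Spaces: combine one option from every consecutive step.
+        return [sum(combo, []) for combo in product(*(unroll(s) for s in steps))]
+    inner = strip_outer_parens(definition)
+    if inner != definition:
+        return unroll(inner)
+    return [[definition]]
+
+
+m00001 = (
+    "(K00844,K12407,K00845,K00886,K08074,K00918) (K01810,K06859,K13810,K15916) "
+    "(K00850,K16370,K21071,K00918) (K01623,K01624,K11645,K16305,K16306) K01803 "
+    "((K00134,K00150) K00927,K11389) (K01834,K15633,K15634,K15635) K01689 "
+    "(K00873,K12406)"
+)
+paths = unroll(m00001)
+print(" ".join(paths[0]))
+```
+
+The first path printed here is the one obtained by taking the first option at every step, as in the example above.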
+
+Without further ado, here is an example of this output mode (also from the Infant Gut dataset):
+
+|**module**|**genome_name**|**db_name**|**pathwise_module_completeness**|**pathwise_module_is_complete**|**path_id**|**path**|**path_completeness**|**annotated_enzymes_in_path**|
+|:--|:--|:--|:--|:--|:--|:--|:--|:--|
+|M00001|Enterococcus_faecalis_6240|E_faecalis_6240|1.0|True|0|K00844,K01810,K00850,K01623,K01803,K00134,K00927,K01834,K01689,K00873|0.8|[MISSING K00844],K01810,K00850,[MISSING K01623],K01803,K00134,K00927,K01834,K01689,K00873|
+|M00001|Enterococcus_faecalis_6240|E_faecalis_6240|1.0|True|1|K12407,K01810,K00850,K01623,K01803,K00134,K00927,K01834,K01689,K00873|0.8|[MISSING K12407],K01810,K00850,[MISSING K01623],K01803,K00134,K00927,K01834,K01689,K00873|
+|M00001|Enterococcus_faecalis_6240|E_faecalis_6240|1.0|True|2|K00845,K01810,K00850,K01623,K01803,K00134,K00927,K01834,K01689,K00873|0.8|[MISSING K00845],K01810,K00850,[MISSING K01623],K01803,K00134,K00927,K01834,K01689,K00873|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+
+Many of the columns in this data overlap with the 'modules' mode columns; you can find descriptions of those in the previous section. Below are the descriptions of new columns in this mode:
+- `path_id`: a unique identifier of the current path through the module
+- `path`: the current path of enzymes through the module (described above)
+- `path_completeness`: a fraction between 0 and 1 indicating the proportion of enzymes in the _current path_ that are annotated. To learn how this number is calculated, see [the anvi-estimate-metabolism help page](https://anvio.org/help/main/programs/anvi-estimate-metabolism/#how-is-pathwise-completenesscopy-number-calculated)
+- `annotated_enzymes_in_path`: a list of enzymes in the current path that were annotated in the current sample (in the same order as the path). If an enzyme is missing annotations, that is indicated with the string `[MISSING (enzyme)]`.
+
+Note that in this output mode, `pathwise_module_completeness` and `pathwise_module_is_complete` are the pathwise completeness scores of the module overall, not of a particular path through the module. These values will be repeated for all lines describing the same module.
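+
+For simple paths like the ones in the table above, the `path_completeness` value can be reproduced by counting the entries of `annotated_enzymes_in_path` that are not marked as missing. The sketch below assumes every entry in the path counts equally, so it may not match the reported value for paths with more complicated components. The function name is hypothetical:
+
+```python
+def path_completeness(annotated_enzymes_in_path):
+    """Fraction of entries in the path that are not marked as [MISSING ...]."""
+    entries = annotated_enzymes_in_path.split(",")
+    present = [entry for entry in entries if not entry.startswith("[MISSING")]
+    return len(present) / len(entries)
+
+
+# Value copied from the first row of the example table above.
+first_row = (
+    "[MISSING K00844],K01810,K00850,[MISSING K01623],K01803,"
+    "K00134,K00927,K01834,K01689,K00873"
+)
+print(path_completeness(first_row))  # 0.8
+```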
+
+**Path copy number values in the output**
+
+If you use the flag `--add-copy-number`, this output mode will gain an additional column, `num_complete_copies_of_path`, which describes the number of 'complete' copies of the current path through the module. To calculate this, we look at the number of annotations for each enzyme in the path and figure out how many times we can use different annotations to get a copy of the path with a completeness score that is greater than or equal to the completeness score threshold. For more details, check out [the anvi-estimate-metabolism help page](https://anvio.org/help/main/programs/anvi-estimate-metabolism/#how-is-pathwise-completenesscopy-number-calculated).
+
+### 'Module Steps' Mode
+
+The 'module_steps' output file will have the suffix `module_steps.txt`. Each line in the file will represent information about one top-level step in a metabolic module. The top-level steps are the major steps that you get when you split the module definition on a space. Each "top-level" step consists of one or more enzymes that either work together or serve as alternatives to each other to catalyze (usually) one reaction in the metabolic pathway.
+
+If we use module [M00001](https://www.genome.jp/kegg-bin/show_module?M00001) as an example again, we would get the following top-level steps for this module:
+
+1. (K00844,K12407,K00845,K00886,K08074,K00918)
+2. (K01810,K06859,K13810,K15916)
+3. (K00850,K16370,K21071,K00918)
+4. (K01623,K01624,K11645,K16305,K16306)
+5. K01803
+6. ((K00134,K00150) K00927,K11389)
+7. (K01834,K15633,K15634,K15635)
+8. K01689
+9. (K00873,K12406)
+
+The first top-level step in this module consists of different glucokinase enzymes, all of which can catalyze the conversion from alpha-D-Glucose to alpha-D-Glucose 6-phosphate. The second top-level step is made up of alternative KOs for the glucose-6-phosphate isomerase enzyme, which converts alpha-D-Glucose 6-phosphate to beta-D-Fructose 6-phosphate. And so on.
+
+Each top-level step in a metabolic module gets its own line in the 'module_steps' output file. Here is an example showing all 9 steps for module M00001 in one Enterococcus genome from the Infant Gut Dataset:
+
+|**module**|**genome_name**|**db_name**|**stepwise_module_completeness**|**stepwise_module_is_complete**|**step_id**|**step**|**step_completeness**|
+|:--|:--|:--|:--|:--|:--|:--|:--|
+|M00001|Enterococcus_faecalis_6240|E_faecalis_6240|1.0|True|0|(K00844,K12407,K00845,K25026,K00886,K08074,K00918)|1|
+|M00001|Enterococcus_faecalis_6240|E_faecalis_6240|1.0|True|1|(K01810,K06859,K13810,K15916)|1|
+|M00001|Enterococcus_faecalis_6240|E_faecalis_6240|1.0|True|2|(K00850,K16370,K21071,K00918)|1|
+|M00001|Enterococcus_faecalis_6240|E_faecalis_6240|1.0|True|3|(K01623,K01624,K11645,K16305,K16306)|1|
+|M00001|Enterococcus_faecalis_6240|E_faecalis_6240|1.0|True|4|K01803|1|
+|M00001|Enterococcus_faecalis_6240|E_faecalis_6240|1.0|True|5|((K00134,K00150) K00927,K11389)|1|
+|M00001|Enterococcus_faecalis_6240|E_faecalis_6240|1.0|True|6|(K01834,K15633,K15634,K15635)|1|
+|M00001|Enterococcus_faecalis_6240|E_faecalis_6240|1.0|True|7|K01689|1|
+|M00001|Enterococcus_faecalis_6240|E_faecalis_6240|1.0|True|8|(K00873,K12406)|1|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+
+As in the previous section, you should look at the 'modules' mode section for descriptions of any columns that are shared with that mode. Below are the descriptions of new columns in this mode:
+- `step_id`: a unique identifier of each top-level step in the module
+- `step`: the definition of the top-level step, as extracted from the module definition
+- `step_completeness`: an integer value of 1 in this column indicates that the step is complete, meaning that (at least) one of any alternative enzymes (or sets of enzymes) in this step has been annotated. A value of 0 indicates that the step is incomplete, meaning that there is no way for the step's reaction to be catalyzed based on the set of enzyme annotations we are considering. This value is binary (so 0 and 1 are the only possible values for this column). To learn how this number is calculated, see [the anvi-estimate-metabolism help page](https://anvio.org/help/main/programs/anvi-estimate-metabolism/#how-is-stepwise-completenesscopy-number-calculated).
+
+Note that in this output mode, `stepwise_module_completeness` and `stepwise_module_is_complete` are the stepwise completeness scores of the module overall, not of a particular step in the module. These values will be repeated for all lines describing the same module.
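+
+Since each `step_completeness` value is binary, the overall stepwise score is consistent with the fraction of top-level steps that are complete. Here is a minimal sketch of that relationship (the function name is hypothetical). The second example assumes that only the first of the seven top-level steps of M00003 is incomplete, which matches the enzyme hits listed for that module in the 'modules' mode example:
+
+```python
+def stepwise_module_completeness(step_values):
+    """Fraction of top-level steps whose binary step_completeness is 1."""
+    return sum(step_values) / len(step_values)
+
+
+# M00001: all 9 steps are complete in the table above.
+print(stepwise_module_completeness([1] * 9))  # 1.0
+
+# M00003: 7 top-level steps, assuming only the first one is incomplete.
+print(stepwise_module_completeness([0, 1, 1, 1, 1, 1, 1]))
+```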
+
+**Step copy number values in the output**
+
+If you use the flag `--add-copy-number`, this output mode will gain an additional column, `step_copy_number`, which describes the number of copies of the current step. To calculate this value, we look at the number of annotations for each alternative enzyme in the step and figure out how many different versions of the step we can make by combining different annotations. For more details, check out [the anvi-estimate-metabolism help page](https://anvio.org/help/main/programs/anvi-estimate-metabolism/#how-is-stepwise-completenesscopy-number-calculated).
+
+### Enzyme 'Hits' Mode
+
+The 'hits' output file will have the suffix `hits.txt`. Unlike the previous modes, this output will include ALL enzyme hits (from all annotation sources used for metabolism estimation), regardless of whether the enzyme belongs to a metabolic module or not. Since only a subset of these enzymes belong to modules, this output does not include module-related information like paths and module completeness.
+
+Here is an example of this output mode (also from the Infant Gut dataset):
+
+| enzyme | genome_name | db_name | gene_caller_id | contig | modules_with_enzyme | enzyme_definition |
+|:--|:--|:--|:--|:--|:--|:--|
+| K25026 | Enterococcus_faecalis_6240 | E_faecalis_6240 | 1608 | Enterococcus_faecalis_6240_contig_00003_chromosome | M00001,M00549,M00909 | glucokinase [EC:2.7.1.2] |
+| K01810 | Enterococcus_faecalis_6240 | E_faecalis_6240 | 600 | Enterococcus_faecalis_6240_contig_00003_chromosome | M00001,M00004,M00892,M00909 | glucose-6-phosphate isomerase [EC:5.3.1.9] |
+| K00850 | Enterococcus_faecalis_6240 | E_faecalis_6240 | 225 | Enterococcus_faecalis_6240_contig_00003_chromosome | M00001,M00345 | 6-phosphofructokinase 1 [EC:2.7.1.11] |
+| (...) | (...) | (...) | (...) | (...) | (...) | (...) |
+
+Here are the descriptions of any new columns not yet discussed in the previous sections:
+
+- `enzyme`: an enzyme that was annotated in the contigs database
+- `modules_with_enzyme`: the modules (if any) that this enzyme belongs to
+- `enzyme_definition`: the function of this enzyme (often includes the enzyme name and EC number)
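+
+One thing you might want to do with this output is to group enzymes by the modules they belong to. The sketch below does this for the three example rows above, with only the relevant columns copied over. The function name is hypothetical, and treating an empty `modules_with_enzyme` value as "no modules" is an assumption, since real files may use a different placeholder:
+
+```python
+from collections import defaultdict
+
+# (enzyme, modules_with_enzyme) pairs copied from the example table above.
+example_hits = [
+    ("K25026", "M00001,M00549,M00909"),
+    ("K01810", "M00001,M00004,M00892,M00909"),
+    ("K00850", "M00001,M00345"),
+]
+
+
+def enzymes_by_module(hits):
+    """Group enzyme accessions by each module they are listed under."""
+    by_module = defaultdict(set)
+    for enzyme, modules in hits:
+        # Assumption: an empty string means the enzyme belongs to no module.
+        for module in filter(None, modules.split(",")):
+            by_module[module].add(enzyme)
+    return by_module
+
+
+grouped = enzymes_by_module(example_hits)
+print(sorted(grouped["M00909"]))  # ['K01810', 'K25026']
+```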
+
+**Coverage and detection values in the output**
+
+If you use the flag `--add-coverage` and provide a profile database, you will get one additional column per sample for coverage (containing the coverage value of the enzyme annotation in the sample) and one additional column per sample for detection (containing the detection value of the enzyme annotation in the sample). Here is a mock example:
+
+| SAMPLE_1_coverage | SAMPLE_1_detection |
+|:--|:--|
+| 3.0 | 1.0 |
+
+Since each row is a single gene in this output mode, these columns will contain the coverage/detection values for that gene only.
+
+### Custom Mode (for module data)
+
+The 'modules_custom' output mode will have user-defined content and the suffix `modules_custom.txt` (we currently only support output customization for modules data). See [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) for an example command to work with this mode. The output file will look similar to the 'modules' mode output, but with a different (sub)set of columns. You can use the flag `--list-available-output-headers` to see all of the possible columns you can choose from - this list will change depending on what input type you have and whether you use the `--add-copy-number` or `--add-coverage` flags (one caveat: using these flags with 'Multi Mode' input does not show you all possible output headers, so it is best to build your custom header list by looking at the possible headers for one sample).
+
+## Matrix format output
+
+Matrix format is an output option when [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) is working with multiple contigs databases at once (otherwise known as 'Multi Mode'). The purpose of this output type is to generate matrices of module statistics for easy visualization and clustering. Currently, the matrix-formatted output includes:
+- matrices of module completeness scores, one for pathwise completeness and one for stepwise completeness
+- matrices of binary module presence/absence values, one for pathwise completeness and one for stepwise completeness
+- matrix of binary top-level step completeness values
+- matrix of enzyme annotation counts
+
+If you use the `--add-copy-number` flag, you will get three additional matrix files:
+- matrices of module copy number, one for pathwise copy number and one for stepwise copy number
+- matrix of top-level step copy number
+
+In these tab-delimited matrix files, each row is a module, top-level step, or enzyme, and each column is an input sample.
+
+Here is an example of a module pathwise completeness matrix, for bins in a metagenome:
+
+| module | bin_1 | bin_2 | bin_3 | bin_4 | bin_5 | bin_6 |
+|:--|:--|:--|:--|:--|:--|:--|
+| M00001 | 1.00 | 0.00 | 1.00 | 1.00 | 1.00 | 0.00 |
+| M00002 | 1.00 | 0.00 | 1.00 | 1.00 | 1.00 | 1.00 |
+| M00003 | 0.88 | 0.00 | 1.00 | 0.75 | 1.00 | 0.88 |
+| M00004 | 0.88 | 0.00 | 0.88 | 0.88 | 0.88 | 0.00 |
+| M00005 | 1.00 | 0.00 | 1.00 | 1.00 | 1.00 | 1.00 |
+|(...) | (...) | (...) | (...) | (...) | (...) | (...) |
+
+Each cell of the matrix is the pathwise completeness score for the corresponding module in the corresponding sample (which is, in this case, a bin).
+
+While the above is the default matrix format, some users may want to include more annotation information in the matrices so that it is easier to know what is going on when looking at the matrix data manually. You can add this metadata to the matrices by using the `--include-metadata` flag when running [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism), and the output will look something like the following:
+
+| module | module_name | module_class | module_category | module_subcategory | bin_1 | bin_2 | bin_3 | bin_4 | bin_5 | bin_6 |
+|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|
+| M00001 |Glycolysis (Embden-Meyerhof pathway), glucose => pyruvate | Pathway modules | Carbohydrate metabolism | Central carbohydrate metabolism | 1.00 | 0.00 | 1.00 | 1.00 | 1.00 | 0.00 |
+| M00002 | Glycolysis, core module involving three-carbon compounds | Pathway modules | Carbohydrate metabolism | Central carbohydrate metabolism | 1.00 | 0.00 | 1.00 | 1.00 | 1.00 | 1.00 |
+| M00003 | Gluconeogenesis, oxaloacetate => fructose-6P | Pathway modules | Carbohydrate metabolism | Central carbohydrate metabolism | 0.88 | 0.00 | 1.00 | 0.75 | 1.00 | 0.88 |
+|(...) | (...) | (...) | (...) | (...) | (...) | (...) | (...) | (...) | (...) | (...) |
+
+The module/step completeness matrix files will have the suffix `completeness-MATRIX.txt`.
+
+Module presence/absence matrix files will have the suffix `presence-MATRIX.txt`. In these files, each cell of the matrix will have either a 1.0 or a 0.0. A 1.0 indicates that the module has a completeness score above the module completeness threshold in that sample, while a 0.0 indicates that the module's completeness score is not above the threshold.
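+
+To illustrate how the presence/absence values relate to the completeness scores, here is a minimal sketch that converts two rows of the example completeness matrix above into presence/absence values. It uses the default threshold of 0.75 and assumes that a score equal to the threshold counts as complete (following the 'greater than or equal to' wording used for copy numbers). The names are hypothetical, and the scores are the rounded values from the example:
+
+```python
+THRESHOLD = 0.75  # default module completeness threshold
+
+# Rows copied from the example pathwise completeness matrix (bin_1 to bin_6).
+completeness = {
+    "M00001": [1.00, 0.00, 1.00, 1.00, 1.00, 0.00],
+    "M00003": [0.88, 0.00, 1.00, 0.75, 1.00, 0.88],
+}
+
+
+def to_presence(scores, threshold=THRESHOLD):
+    """Convert completeness scores into 1.0/0.0 presence values."""
+    # Assumption: scores equal to the threshold count as present.
+    return [1.0 if score >= threshold else 0.0 for score in scores]
+
+
+presence = {module: to_presence(scores) for module, scores in completeness.items()}
+print(presence["M00003"])  # [1.0, 0.0, 1.0, 1.0, 1.0, 1.0]
+```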
+
+Enzyme hit matrix files will have the suffix `enzyme_hits-MATRIX.txt`. Each row of the matrix will be an enzyme, and each column will be an input sample. Cells in this matrix will contain an integer value, representing the number of times the enzyme was annotated in that sample. (Note: metadata will also be added to this matrix type when you use the `--include-metadata` flag.)
+
+Copy number matrices will have the suffix `copy_number-MATRIX.txt`. In these files, each cell of the matrix will be an integer representing the number of copies of a module or top-level step in a given sample.
+
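As a rough illustration of how the completeness and presence/absence matrices relate, the sketch below thresholds completeness scores into 1.0/0.0 values. The function name `to_presence` and the threshold of 0.8 are assumptions made for this example only (they are not anvi'o API or anvi'o defaults), and the two rows are copied from the example matrix above.

```python
# Hypothetical helper for illustration; this is not part of anvi'o.
def to_presence(completeness_row, threshold):
    """Map each completeness score to 1.0 if it is above the threshold, else 0.0."""
    return [1.0 if score > threshold else 0.0 for score in completeness_row]


# Rows copied from the example completeness matrix (columns bin_1 ... bin_6).
completeness = {
    "M00001": [1.00, 0.00, 1.00, 1.00, 1.00, 0.00],
    "M00003": [0.88, 0.00, 1.00, 0.75, 1.00, 0.88],
}

# Assumption: an arbitrary threshold picked for this example.
ILLUSTRATIVE_THRESHOLD = 0.8

presence = {
    module: to_presence(row, ILLUSTRATIVE_THRESHOLD)
    for module, row in completeness.items()
}

for module, row in presence.items():
    print(module, row)
```

None of the example scores equals the threshold, so the sketch does not take a position on how a score exactly at the threshold is treated.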
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/kegg-metabolism.md) to update this information.
+
diff --git a/help/8/artifacts/layer-taxonomy-txt/index.md b/help/8/artifacts/layer-taxonomy-txt/index.md
new file mode 100644
index 00000000..3e4dff08
--- /dev/null
+++ b/help/8/artifacts/layer-taxonomy-txt/index.md
@@ -0,0 +1,50 @@
+---
+layout: artifact
+title: layer-taxonomy-txt
+excerpt: A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/layer-taxonomy-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-import-taxonomy-for-layers](../../programs/anvi-import-taxonomy-for-layers)
+
+
+## Description
+
+This is a text file containing taxonomy information for your layers (the same information as a [layer-taxonomy](/help/8/artifacts/layer-taxonomy)). You can bring this information into a [single-profile-db](/help/8/artifacts/single-profile-db) using [anvi-import-taxonomy-for-layers](/help/8/programs/anvi-import-taxonomy-for-layers).
+
+This is a tab-delimited text file that is formatted similarly to a [gene-taxonomy-txt](/help/8/artifacts/gene-taxonomy-txt). The first column describes the names of your layers, and the following columns each correspond to the taxonomy level described in the header. Here is an example:
+
+    sample    t_domain    t_phylum    t_class     ...
+    c1        Eukarya     Chordata    Mammalia
+    ...
+
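A minimal sketch of reading such a file may help make the layout concrete. It assumes only what is described above (tab-delimited text, layer names in the first column, one taxonomic level per remaining column); the function name `read_layer_taxonomy` and the in-memory example content are hypothetical and not part of anvi'o.

```python
import csv
import io

# Example content modeled on the snippet above; a real file would be opened from disk.
EXAMPLE = (
    "sample\tt_domain\tt_phylum\tt_class\n"
    "c1\tEukarya\tChordata\tMammalia\n"
)


def read_layer_taxonomy(handle):
    """Return {layer_name: {taxonomic_level: value}} from a tab-delimited file."""
    reader = csv.DictReader(handle, delimiter="\t")
    layer_column = reader.fieldnames[0]  # the first column holds the layer names
    taxonomy = {}
    for row in reader:
        layer = row.pop(layer_column)
        taxonomy[layer] = dict(row)
    return taxonomy


taxonomy = read_layer_taxonomy(io.StringIO(EXAMPLE))
print(taxonomy["c1"]["t_phylum"])
```

Keying on the first column rather than on a fixed header name keeps the sketch independent of what that column happens to be called.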
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/layer-taxonomy-txt.md) to update this information.
+
diff --git a/help/8/artifacts/layer-taxonomy/index.md b/help/8/artifacts/layer-taxonomy/index.md
new file mode 100644
index 00000000..ea047939
--- /dev/null
+++ b/help/8/artifacts/layer-taxonomy/index.md
@@ -0,0 +1,44 @@
+---
+layout: artifact
+title: layer-taxonomy
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/layer-taxonomy
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-import-taxonomy-for-layers](../../programs/anvi-import-taxonomy-for-layers)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This is taxonomy information about the layers stored in a [single-profile-db](/help/8/artifacts/single-profile-db). When you open this [single-profile-db](/help/8/artifacts/single-profile-db) with [anvi-interactive](/help/8/programs/anvi-interactive), this information will appear the same way that [misc-data-layers](/help/8/artifacts/misc-data-layers) does: in graphs at the right side of the interface, similarly to how the layer names are displayed.
+
+You can bring this information into your profile database using [anvi-import-taxonomy-for-layers](/help/8/programs/anvi-import-taxonomy-for-layers) by providing a [layer-taxonomy-txt](/help/8/artifacts/layer-taxonomy-txt).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/layer-taxonomy.md) to update this information.
+
diff --git a/help/8/artifacts/linkmers-txt/index.md b/help/8/artifacts/linkmers-txt/index.md
new file mode 100644
index 00000000..9ccf638b
--- /dev/null
+++ b/help/8/artifacts/linkmers-txt/index.md
@@ -0,0 +1,150 @@
+---
+layout: artifact
+title: linkmers-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/linkmers-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-report-linkmers](../../programs/anvi-report-linkmers) [anvi-script-transpose-matrix](../../programs/anvi-script-transpose-matrix)
+
+
+## Required or used by
+
+
+[anvi-oligotype-linkmers](../../programs/anvi-oligotype-linkmers) [anvi-script-transpose-matrix](../../programs/anvi-script-transpose-matrix)
+
+
+## Description
+
+This is a tab-delimited table where each row represents a short read that mapped to a specific position in a reference contig. This is the output of [anvi-report-linkmers](/help/8/programs/anvi-report-linkmers), where those reference positions are given by the user.
+
+For instance, if [anvi-report-linkmers](/help/8/programs/anvi-report-linkmers) was run on three samples (`SAMPLE-01.bam`, `SAMPLE-02.bam`, and `SAMPLE-03.bam`) with this contigs-and-positions file,
+
+|contig_1720|7111,7115,7120|
+
+then the output would be the following:
+
+|entry_id|sample_id|request_id|contig_name|pos_in_contig|pos_in_read|base|read_unique_id|read_X|reverse|sequence|
+|:--|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--|
+|000000001|SAMPLE-01|001|contig_1720|7111|160|G|bc3aa6b95ce110067|read-2|True|ATTGGTTTTGCTATTGGGTTTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTA|
+|000000002|SAMPLE-01|001|contig_1720|7111|160|G|156295ff5928fc055|read-2|True|ATTGGTTTTGCTATTGGGTTTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTT|
+|000000003|SAMPLE-01|001|contig_1720|7111|141|G|7a8947678111bb905|read-2|True|TTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACA|
+|000000004|SAMPLE-01|001|contig_1720|7111|135|A|2a3de408930252949|read-2|False|TGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTAGGAGTGTTT|
+|000000005|SAMPLE-01|001|contig_1720|7111|130|G|5f995d129fdabe64f|read-2|True|ATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATT|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+|000000033|SAMPLE-01|001|contig_1720|7115|164|A|bc3aa6b95ce110067|read-2|True|ATTGGTTTTGCTATTGGGTTTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTA|
+|000000034|SAMPLE-01|001|contig_1720|7115|164|A|156295ff5928fc055|read-2|True|ATTGGTTTTGCTATTGGGTTTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTT|
+|000000035|SAMPLE-01|001|contig_1720|7115|145|A|7a8947678111bb905|read-2|True|TTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACA|
+|000000036|SAMPLE-01|001|contig_1720|7115|139|G|2a3de408930252949|read-2|False|TGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTAGGAGTGTTT|
+|000000064|SAMPLE-01|001|contig_1720|7115|16|A|28a3a440fda5142b9|read-2|True|AGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTTATAGTGGTAATAGTAATTTACCGGGAAGTGCAATAATTATTATTTTTGTACCGGTAATGTTAGGAATTATCTATTTAGCTAGTAAGGGTGTGGTTAAA|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+|000000065|SAMPLE-01|001|contig_1720|7120|169|T|bc3aa6b95ce110067|read-2|True|ATTGGTTTTGCTATTGGGTTTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTA|
+|000000066|SAMPLE-01|001|contig_1720|7120|169|T|156295ff5928fc055|read-2|True|ATTGGTTTTGCTATTGGGTTTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTT|
+|000000067|SAMPLE-01|001|contig_1720|7120|150|T|7a8947678111bb905|read-2|True|TTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACA|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+|000000097|SAMPLE-02|001|contig_1720|7111|140|G|ddb72ab632d753591|read-2|True|TGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCA|
+|000000098|SAMPLE-02|001|contig_1720|7111|75|G|e7506ec6da1f08697|read-2|False|TAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTTATAGTGGTAATAGTAATTTACCGGGAAGTGCA|
+|000000099|SAMPLE-02|001|contig_1720|7111|65|G|07f926c7d8dd57e03|read-2|True|TCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTTATAGTGGTAATAGTAATTTACCGGGAAGTGCAATAATTATTATTTTTGTACC|
+|000000100|SAMPLE-02|001|contig_1720|7111|54|G|97fb9b743bbe5d89a|read-2|True|GTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTTATAGTGGTAATAGTAATTTACCGGGAAGTGCAATAATTATTATTTTTGTACCGGTAATGTTA|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+|000000103|SAMPLE-02|001|contig_1720|7115|69|A|07f926c7d8dd57e03|read-2|True|TCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTTATAGTGGTAATAGTAATTTACCGGGAAGTGCAATAATTATTATTTTTGTACC|
+|000000104|SAMPLE-02|001|contig_1720|7115|58|A|97fb9b743bbe5d89a|read-2|True|GTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTTATAGTGGTAATAGTAATTTACCGGGAAGTGCAATAATTATTATTTTTGTACCGGTAATGTTA|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+|000000105|SAMPLE-02|001|contig_1720|7120|149|T|ddb72ab632d753591|read-2|True|TGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCA|
+|000000106|SAMPLE-02|001|contig_1720|7120|84|T|e7506ec6da1f08697|read-2|False|TAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTTATAGTGGTAATAGTAATTTACCGGGAAGTGCA|
+|000000107|SAMPLE-02|001|contig_1720|7120|74|T|07f926c7d8dd57e03|read-2|True|TCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTTATAGTGGTAATAGTAATTTACCGGGAAGTGCAATAATTATTATTTTTGTACC|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+|000000109|SAMPLE-03|001|contig_1720|7111|181|G|5b30beaad5028d9be|read-2|False|CATCATAGTAATATTGCAACTATTGGTTTTGCTATTGGGTTTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTT|
+|000000110|SAMPLE-03|001|contig_1720|7111|167|G|a74a16460eee34549|read-2|False|TGCAACTATTGGTTTTGCTATTGGGTTTGTCGTGATGATGATGATAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTT|
+|000000111|SAMPLE-03|001|contig_1720|7111|158|G|3ec0b8a88b8cf6f6b|read-2|False|TGGTTTTGCTATTGGGTTTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTA|
+|000000112|SAMPLE-03|001|contig_1720|7111|158|G|237828c6637b1648e|read-2|False|TGGTTTTGCTATTGGGTTTGTCGGGATGATGATGTTAGATGTCGCCTTAGGTTAATCTGTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTA|
+|000000113|SAMPLE-03|001|contig_1720|7111|154|G|006dfcf9742b2a323|read-2|True|TTTGCTATTGGGTTTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGATGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTA|
+|000000114|SAMPLE-03|001|contig_1720|7111|153|G|ab174bde7a9f86013|read-2|False|TTGCTATTGGGTTTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAA|
+|000000115|SAMPLE-03|001|contig_1720|7111|150|G|4fdd1e6de2a10c923|read-2|True|CTATTGGGTTTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTT|
+|000000116|SAMPLE-03|001|contig_1720|7111|149|G|cc5f916935b4b42be|read-2|False|TATTGGGTTTTTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAA|
+|000000117|SAMPLE-03|001|contig_1720|7111|148|G|857199a218edef55d|read-2|False|ATTGGGTTTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACGGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTT|
+|000000118|SAMPLE-03|001|contig_1720|7111|147|G|7f5890d7daeb66d06|read-2|False|TTGGGTTTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAA|
+|000000119|SAMPLE-03|001|contig_1720|7111|145|G|e1f31dfa0435ffb78|read-2|True|GGGTTTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAATAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTT|
+|000000120|SAMPLE-03|001|contig_1720|7111|143|G|8204d0adc702d99ba|read-2|True|GTTTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAA|
+|000000121|SAMPLE-03|001|contig_1720|7111|142|G|baaa46d85f2425750|read-2|False|TTTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTCAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAAT|
+|000000122|SAMPLE-03|001|contig_1720|7111|142|G|571209d8eacfbec10|read-2|True|TTTGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAATAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAA|
+|000000123|SAMPLE-03|001|contig_1720|7111|140|T|3dda82d075cb90188|read-2|False|TGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTTGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGG|
+|000000124|SAMPLE-03|001|contig_1720|7111|140|G|4cc7969a78d30d313|read-2|False|TGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTA|
+|000000125|SAMPLE-03|001|contig_1720|7111|140|G|418eca79d0630e0fe|read-2|False|TGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTATGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTC|
+|000000126|SAMPLE-03|001|contig_1720|7111|140|G|6f72cff910f80d2c5|read-2|True|TGTCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTT|
+|000000127|SAMPLE-03|001|contig_1720|7111|138|G|3fd470af01ec4eb4f|read-2|True|TCGTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTC|
+|000000128|SAMPLE-03|001|contig_1720|7111|136|G|a533ced1530eea9a8|read-2|True|GTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCA|
+|000000129|SAMPLE-03|001|contig_1720|7111|136|G|6a9f13baa13e23d0a|read-2|True|GTGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCA|
+|000000130|SAMPLE-03|001|contig_1720|7111|135|G|72d19ae35f5136c1a|read-2|False|TGATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTA|
+|000000131|SAMPLE-03|001|contig_1720|7111|134|G|529c1460d824e9bef|read-2|False|GATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTT|
+|000000132|SAMPLE-03|001|contig_1720|7111|134|G|69c5dbf13623487fb|read-2|False|GATGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATTGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTCTATCTAA|
+|000000133|SAMPLE-03|001|contig_1720|7111|132|G|fcd57363c61946297|read-2|False|TGATGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTG|
+|000000134|SAMPLE-03|001|contig_1720|7111|129|G|c462177c026ee961e|read-2|True|TGATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTA|
+|000000135|SAMPLE-03|001|contig_1720|7111|128|G|2295d9de5ef4cf13e|read-2|False|GATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGGGTTATGG|
+|000000136|SAMPLE-03|001|contig_1720|7111|127|G|305879fc8883daf8d|read-2|True|ATGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAAT|
+|000000137|SAMPLE-03|001|contig_1720|7111|126|G|5df2a0285fcc03a9e|read-2|False|TGTTAGATGTCGCCTTAGGTTAAGCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTATGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGG|
+|000000138|SAMPLE-03|001|contig_1720|7111|126|G|ee2961de43ccb9aa1|read-2|False|TGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCATGAG|
+|000000139|SAMPLE-03|001|contig_1720|7111|126|G|3c7c5b2afc9694223|read-2|True|TGTTAGATGTCGCCTTAGGTTAATCTCTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCT|
+|000000140|SAMPLE-03|001|contig_1720|7111|126|G|0b0eb68568f992851|read-2|True|TGTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAACAACAATTTTATCTAATATGTCAGGAGTTATG|
+|000000141|SAMPLE-03|001|contig_1720|7111|125|G|6ad74742e16c874d9|read-2|False|GTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATG|
+|000000142|SAMPLE-03|001|contig_1720|7111|125|G|d8a10c29daa22874a|read-2|False|GTTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTA|
+|000000143|SAMPLE-03|001|contig_1720|7111|124|G|2fcdca66158af9cc9|read-2|False|TTAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATC|
+|000000144|SAMPLE-03|001|contig_1720|7111|123|G|eb182dc3869777d40|read-2|True|TAGATGTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTA|
+|000000145|SAMPLE-03|001|contig_1720|7111|118|G|07067ba83a92c090d|read-2|False|GTCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAG|
+|000000146|SAMPLE-03|001|contig_1720|7111|117|G|221720449a9df5f2e|read-2|False|TCGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTTAT|
+|000000147|SAMPLE-03|001|contig_1720|7111|116|G|6a32ac39d688c9342|read-2|True|CGCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTAT|
+|000000148|SAMPLE-03|001|contig_1720|7111|115|G|7a35d6e5df5c0313c|read-2|True|GCCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGATGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTTATAGTGGTA|
+|000000149|SAMPLE-03|001|contig_1720|7111|114|G|d71843e1b8dfc97c1|read-2|False|CCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGATATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTTATAGTGGTAATAGTAATT|
+|000000150|SAMPLE-03|001|contig_1720|7111|114|G|97810094800c2730d|read-2|False|CCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTT|
+|000000151|SAMPLE-03|001|contig_1720|7111|114|G|a1a82ff1ac7d6c00a|read-2|True|CCTTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGTAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCT|
+|000000152|SAMPLE-03|001|contig_1720|7111|113|G|e79875a99e454d374|read-2|False|CTTAGGTTAATCTTTACATAATCGTAAGGACCGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTT|
+|000000153|SAMPLE-03|001|contig_1720|7111|112|G|6b818ac19ef9992b2|read-2|False|TTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTAT|
+|000000154|SAMPLE-03|001|contig_1720|7111|112|G|33a1ab1044fbce5d2|read-2|True|TTAGGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCT|
+|000000155|SAMPLE-03|001|contig_1720|7111|109|G|ed19fd79f7f532930|read-2|False|GGTTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTTATAGTGGTAATAGTAAT|
+|000000156|SAMPLE-03|001|contig_1720|7111|107|G|2eb280af5a55f65d2|read-2|False|TTAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTT|
+|000000157|SAMPLE-03|001|contig_1720|7111|106|G|d6295261bd3682093|read-2|False|TAATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTTATAGTGGTAAT|
+|000000158|SAMPLE-03|001|contig_1720|7111|106|G|e1324c70548d7aac2|read-2|False|TAATCTTTACATAATCTTAAGCAGAGTTGTATAGTTTCGTTTCTGTAGTTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTTATAGTGGTAAT|
+|000000159|SAMPLE-03|001|contig_1720|7111|106|G|e0ec923db2ffe9978|read-2|True|TAATCTTTACATAATCTTAAGCACAGTTGTATAGCTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTTATAGTGGTAATAGTAATTTACCGGG|
+|000000160|SAMPLE-03|001|contig_1720|7111|104|G|7cc251d99ddec1738|read-2|False|ATCTTTACATAATCTTAAGCACAGTTGTATAGTTTCGTTTCTGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTTATAGTGGTAATAGTAAT|
+|000000161|SAMPLE-03|001|contig_1720|7111|100|G|18f3cf83405581ea6|read-2|False|TTACATAATCGTAAGCACAGTTGTATAGTTTCGTTTCAGTAATTAAGTAAAATGAGGTTAAAGAGGTGACAGAAATGAAAAAGAGATTAGGGTTAGGTTTGGGAATGTTTTTAATAACAATTTTATCTAATTTGTCAGGAGTTATGGCTTATAGTGGTAATAGTAATTTACCGGGAAGT|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+
+In this table:
+
+* `sample_id` corresponds to the name of the input BAM file,
+
+* `request_id` corresponds to the order of the contig name in the input file,
+
+* `pos_in_contig` is the position of interest declared in the input file,
+
+* and `pos_in_read` is the location, within the short read, of the nucleotide that corresponds to the position of interest.
+
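To show how these columns can be used together, here is a small sketch that tallies the bases observed at each position of interest per sample. It relies only on the column names shown in the table above; the function name `tally_bases` is hypothetical, and the example rows are abbreviated from the table (only the columns needed for the tally are kept).

```python
import csv
import io
from collections import Counter, defaultdict

# A few rows abbreviated from the table above; unused columns are omitted.
ROWS = [
    ("entry_id", "sample_id", "request_id", "contig_name", "pos_in_contig", "pos_in_read", "base"),
    ("000000001", "SAMPLE-01", "001", "contig_1720", "7111", "160", "G"),
    ("000000002", "SAMPLE-01", "001", "contig_1720", "7111", "160", "G"),
    ("000000003", "SAMPLE-01", "001", "contig_1720", "7111", "141", "G"),
    ("000000004", "SAMPLE-01", "001", "contig_1720", "7111", "135", "A"),
    ("000000005", "SAMPLE-01", "001", "contig_1720", "7111", "130", "G"),
    ("000000033", "SAMPLE-01", "001", "contig_1720", "7115", "164", "A"),
    ("000000036", "SAMPLE-01", "001", "contig_1720", "7115", "139", "G"),
]
EXAMPLE = "\n".join("\t".join(fields) for fields in ROWS) + "\n"


def tally_bases(handle):
    """Count bases per (sample_id, contig_name, pos_in_contig) in a linkmers table."""
    counts = defaultdict(Counter)
    for row in csv.DictReader(handle, delimiter="\t"):
        key = (row["sample_id"], row["contig_name"], int(row["pos_in_contig"]))
        counts[key][row["base"]] += 1
    return counts


counts = tally_bases(io.StringIO(EXAMPLE))
for key, bases in sorted(counts.items()):
    print(key, dict(bases))
```

Because the rows are looked up by column name, the same function should work on a full file that also carries the `read_unique_id`, `read_X`, `reverse`, and `sequence` columns.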
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/linkmers-txt.md) to update this information.
+
diff --git a/help/8/artifacts/locus-fasta/index.md b/help/8/artifacts/locus-fasta/index.md
new file mode 100644
index 00000000..b6e1b361
--- /dev/null
+++ b/help/8/artifacts/locus-fasta/index.md
@@ -0,0 +1,46 @@
+---
+layout: artifact
+title: locus-fasta
+excerpt: A FASTA-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/locus-fasta
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A FASTA-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-export-locus](../../programs/anvi-export-locus)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+A locus-fasta is one of the outputs of [anvi-export-locus](/help/8/programs/anvi-export-locus), which exports specific regions of interest out of a [contigs-db](/help/8/artifacts/contigs-db).
+
+This artifact specifically describes the [fasta](/help/8/artifacts/fasta) file that contains the sequence of one of the hits to the locus.
+
+This file is contained within the directory specified by the `-o` parameter and is named with the prefix defined by the `-O` parameter, followed by a numerical identifier for this particular hit. The sequence in this fasta file is also contained in the [contigs-db](/help/8/artifacts/contigs-db) of the same name.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/locus-fasta.md) to update this information.
+
diff --git a/help/8/artifacts/markdown-txt/index.md b/help/8/artifacts/markdown-txt/index.md
new file mode 100644
index 00000000..e5174085
--- /dev/null
+++ b/help/8/artifacts/markdown-txt/index.md
@@ -0,0 +1,39 @@
+---
+layout: artifact
+title: markdown-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/markdown-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-script-as-markdown](../../programs/anvi-script-as-markdown)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+{:.notice}
+**No one has described this artifact yet** :/ If you would like to contribute by describing it, please see previous examples [here](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts), and add a Markdown formatted file in that directory named "markdown-txt.md". Its contents will replace this sad text. THANK YOU!
+
diff --git a/help/8/artifacts/metabolic-independence-score/index.md b/help/8/artifacts/metabolic-independence-score/index.md
new file mode 100644
index 00000000..2baf8f4a
--- /dev/null
+++ b/help/8/artifacts/metabolic-independence-score/index.md
@@ -0,0 +1,39 @@
+---
+layout: artifact
+title: metabolic-independence-score
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/metabolic-independence-score
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-script-estimate-metabolic-independence](../../programs/anvi-script-estimate-metabolic-independence)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+{:.notice}
+**No one has described this artifact yet** :/ If you would like to contribute by describing it, please see previous examples [here](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts), and add a Markdown formatted file in that directory named "metabolic-independence-score.md". Its contents will replace this sad text. THANK YOU!
+
diff --git a/help/8/artifacts/metagenomes/index.md b/help/8/artifacts/metagenomes/index.md
new file mode 100644
index 00000000..81e8a8a7
--- /dev/null
+++ b/help/8/artifacts/metagenomes/index.md
@@ -0,0 +1,65 @@
+---
+layout: artifact
+title: metagenomes
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/metagenomes
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-estimate-metabolism](../../programs/anvi-estimate-metabolism) [anvi-estimate-scg-taxonomy](../../programs/anvi-estimate-scg-taxonomy) [anvi-estimate-trna-taxonomy](../../programs/anvi-estimate-trna-taxonomy)
+
+
+## Description
+
+A metagenome is any set of sequences that collectively describes multiple different populations (rather than just one genome) and has been converted into a [contigs-db](/help/8/artifacts/contigs-db).
+
+The metagenomes file format enables anvi'o to work with one or more metagenomes. A TAB-delimited metagenomes file is composed of at least the following two columns:
+
+|name|contigs_db_path|
+|:--|:--|
+|Name_01|/path/to/contigs-01.db|
+|Name_02|/path/to/contigs-02.db|
+|Name_03|/path/to/contigs-03.db|
+|(...)|(...)|
+
+In some cases (for example, when running [anvi-estimate-scg-taxonomy](/help/8/programs/anvi-estimate-scg-taxonomy)), you may also want to provide the [profile-db](/help/8/artifacts/profile-db) that is associated with the [contigs-db](/help/8/artifacts/contigs-db). The metagenomes file will then be composed of three columns:
+
+|name|contigs_db_path|profile_db_path|
+|:--|:--|:--|
+|Name_01|/path/to/contigs-01.db|/path/to/profile.db|
+|Name_02|/path/to/contigs-02.db|/path/to/profile.db|
+|Name_03|/path/to/contigs-03.db|/path/to/profile.db|
+|(...)|(...)|(...)|
+
+{:.warning}
+Please make sure names in the `name` column do not include any special characters (underscore is fine). It is also a good idea to keep these names short and descriptive, as they will appear in various figures in downstream analyses.
+
+Also see **[internal-genomes](/help/8/artifacts/internal-genomes)** and **[external-genomes](/help/8/artifacts/external-genomes)**.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/metagenomes.md) to update this information.
+
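As a minimal sketch of the metagenomes file layout described in the page above, the following Python snippet writes the two-column (or three-column) TAB-delimited table. The helper name `build_metagenomes_txt` and the rule that names may contain only letters, digits, and underscores are assumptions made for illustration; anvi'o's own validation may differ. Only the column names `name`, `contigs_db_path`, and `profile_db_path` come from the page.

```python
import csv
import io
import re

# Assumption: "no special characters" is approximated as letters, digits, underscore.
VALID_NAME = re.compile(r"^[A-Za-z0-9_]+$")


def build_metagenomes_txt(entries):
    """Return the content of a TAB-delimited metagenomes file (hypothetical helper).

    `entries` is a list of (name, contigs_db_path) tuples, or a list of
    (name, contigs_db_path, profile_db_path) tuples; the two shapes cannot be mixed.
    """
    widths = {len(entry) for entry in entries}
    if widths not in ({2}, {3}):
        raise ValueError("every entry needs 2 columns, or every entry needs 3")
    header = ["name", "contigs_db_path", "profile_db_path"][: widths.pop()]

    for name, *_ in entries:
        if not VALID_NAME.match(name):
            raise ValueError(f"name contains special characters: {name!r}")

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(entries)
    return out.getvalue()


metagenomes_txt = build_metagenomes_txt([
    ("Name_01", "/path/to/contigs-01.db"),
    ("Name_02", "/path/to/contigs-02.db"),
])
print(metagenomes_txt, end="")
```

Writing the returned string to a file gives a table that could then be passed to the programs listed under "Required or used by".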
diff --git a/help/8/artifacts/metapangenome/index.md b/help/8/artifacts/metapangenome/index.md
new file mode 100644
index 00000000..70ebc8ba
--- /dev/null
+++ b/help/8/artifacts/metapangenome/index.md
@@ -0,0 +1,44 @@
+---
+layout: artifact
+title: metapangenome
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/metapangenome
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-meta-pan-genome](../../programs/anvi-meta-pan-genome)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+A metapangenome is a way of comparing a bunch of genomes based both on the gene clusters they contain (like in a pangenome) and their abundances in different environments/samples (using metagenomic read recruitment).
+
+Here is a [wonderful workflow](http://merenlab.org/data/prochlorococcus-metapangenome/) that takes you through the analysis of 31 *Prochlorococcus* isolate genomes and 93 TARA Oceans metagenomes, based on [this paper](https://peerj.com/articles/4320/).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/metapangenome.md) to update this information.
+
diff --git a/help/8/artifacts/misc-data-amino-acids-txt/index.md b/help/8/artifacts/misc-data-amino-acids-txt/index.md
new file mode 100644
index 00000000..a563f040
--- /dev/null
+++ b/help/8/artifacts/misc-data-amino-acids-txt/index.md
@@ -0,0 +1,56 @@
+---
+layout: artifact
+title: misc-data-amino-acids-txt
+excerpt: A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/misc-data-amino-acids-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-export-misc-data](../../programs/anvi-export-misc-data)
+
+
+## Required or used by
+
+
+[anvi-import-misc-data](../../programs/anvi-import-misc-data)
+
+
+## Description
+
+This is a tab-delimited text file that describes information contained in a [misc-data-amino-acids](/help/8/artifacts/misc-data-amino-acids).
+
+To import this information into a database, use [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data).
+
+In this table, the first column should provide two pieces of information, both identifying a specific amino acid residue: the gene caller id (for the gene this residue is on) and its codon order in that gene. These should be separated by a colon. The following columns can contain any categorical or numerical data of your choosing.
+
+Here is an example with very abstract data:
+
+ item_name categorical_data numerical_data data_group
+ 1:42 group_1 4.3245 cool_data
+ 6:3 group_2 1.3542 cool_data
+ 9:96 group_1 3.2526 cool_data
+ ...
+
+There is another example of this table [here](http://merenlab.org/2020/07/22/interacdome/#6-storing-the-per-residue-binding-frequencies-into-the-contigs-database). The second table on this page is what you would provide to [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data), whereas the first lays out the same data more conceptually.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/misc-data-amino-acids-txt.md) to update this information.
+
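The first-column convention described in the page above (gene caller id and codon order joined by a colon) can be sketched in Python as follows. The function name `misc_data_amino_acids_txt` and the input tuple layout are hypothetical; the column headers mirror the abstract example on the page.

```python
def misc_data_amino_acids_txt(records, data_group):
    """Build a tab-delimited table for per-residue data (hypothetical helper).

    `records` is a list of (gene_callers_id, codon_order, category, value) tuples.
    """
    header = ["item_name", "categorical_data", "numerical_data", "data_group"]
    lines = ["\t".join(header)]
    for gene_callers_id, codon_order, category, value in records:
        # The item name identifies one residue as <gene caller id>:<codon order>.
        item_name = f"{int(gene_callers_id)}:{int(codon_order)}"
        lines.append("\t".join([item_name, category, str(value), data_group]))
    return "\n".join(lines) + "\n"


table = misc_data_amino_acids_txt(
    [(1, 42, "group_1", 4.3245), (6, 3, "group_2", 1.3542)],
    data_group="cool_data",
)
print(table, end="")
```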
diff --git a/help/8/artifacts/misc-data-amino-acids/index.md b/help/8/artifacts/misc-data-amino-acids/index.md
new file mode 100644
index 00000000..57e39608
--- /dev/null
+++ b/help/8/artifacts/misc-data-amino-acids/index.md
@@ -0,0 +1,50 @@
+---
+layout: artifact
+title: misc-data-amino-acids
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/misc-data-amino-acids
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-import-misc-data](../../programs/anvi-import-misc-data) [anvi-run-interacdome](../../programs/anvi-run-interacdome)
+
+
+## Required or used by
+
+
+[anvi-delete-misc-data](../../programs/anvi-delete-misc-data) [anvi-export-misc-data](../../programs/anvi-export-misc-data)
+
+
+## Description
+
+
+This is a section of your [contigs-db](/help/8/artifacts/contigs-db) that contains custom additional information about specific amino acid residues.
+
+Take a look at [this blog post](http://merenlab.org/2020/07/22/interacdome/#6-storing-the-per-residue-binding-frequencies-into-the-contigs-database) for potential uses with InteracDome (available in anvi'o through [anvi-run-interacdome](/help/8/programs/anvi-run-interacdome)) and the motivation behind this program.
+
+Similarly to other types of miscellaneous data (like [misc-data-items](/help/8/artifacts/misc-data-items)), this information is either numerical or categorical and can be populated into a [contigs-db](/help/8/artifacts/contigs-db) (from a [misc-data-amino-acids-txt](/help/8/artifacts/misc-data-amino-acids-txt)) with [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data). It is also displayed when you run [anvi-show-misc-data](/help/8/programs/anvi-show-misc-data) and can be exported or deleted with [anvi-export-misc-data](/help/8/programs/anvi-export-misc-data) and [anvi-delete-misc-data](/help/8/programs/anvi-delete-misc-data) respectively.
+
+For example, this could describe various key residues for binding to ligands, or residues otherwise determined to be important to the user for whatever reason.
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/misc-data-amino-acids.md) to update this information.
+
diff --git a/help/8/artifacts/misc-data-items-order-txt/index.md b/help/8/artifacts/misc-data-items-order-txt/index.md
new file mode 100644
index 00000000..3a4bbf31
--- /dev/null
+++ b/help/8/artifacts/misc-data-items-order-txt/index.md
@@ -0,0 +1,78 @@
+---
+layout: artifact
+title: misc-data-items-order-txt
+excerpt: A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/misc-data-items-order-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-export-items-order](../../programs/anvi-export-items-order)
+
+
+## Required or used by
+
+
+[anvi-import-items-order](../../programs/anvi-import-items-order)
+
+
+## Description
+
+This is a text file that **contains the information for a [misc-data-items-order](/help/8/artifacts/misc-data-items-order)**, used for importing into and exporting this information from your anvi'o project.
+
+## NEWICK order
+
+If you intend to import a tree order, the contents of your file should look something like this (but probably much more complicated depending on the number of items in your anvi'o database):
+
+```
+(contig_4, ((contig_1, contig_2), contig_3))
+```
+
+When a NEWICK order is imported into an anvi'o project, the contigs will be displayed in the order `contig_4, contig_1, contig_2, contig_3`, and the following tree will be generated in the interface:
+
+```
+ contig_4 contig_1 contig_2 contig_3
+ | | | |
+ | ------------- |
+ | | |
+ | -------------------
+ | |
+ -----------------------------
+ |
+ |
+```
+
+## LIST order
+
+As an alternative to the NEWICK order, you can provide a flat list of items. In that case, your text file should look like the following, where each line contains a single item name from your database:
+
+```
+contig_4
+contig_1
+contig_2
+contig_3
+```
+
+{:.warning}
+After importing an order into a database, you may need to explicitly select that order in the interactive interface through the "Item orders" drop-down menu and redraw your display to change the default order.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/misc-data-items-order-txt.md) to update this information.
+
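To illustrate how the NEWICK and LIST orders in the page above relate, here is a rough Python sketch that reads the leaf names of a simple NEWICK string from left to right and writes them as a flat list. The function `newick_leaf_order` is hypothetical and is not how anvi'o parses trees; it assumes unquoted leaf names and ignores branch lengths and internal node labels.

```python
import re


def newick_leaf_order(newick):
    """Return leaf names in left-to-right order (simplified, hypothetical parser)."""
    # In NEWICK, a leaf name directly follows '(' or ','. Internal node labels
    # and support values follow ')', so they are not matched here.
    return [m.group(1) for m in re.finditer(r"[(,]\s*([^\s(),:;]+)", newick)]


tree = "(contig_4, ((contig_1, contig_2), contig_3))"
leaves = newick_leaf_order(tree)

# The LIST form is one item name per line.
flat_list = "\n".join(leaves) + "\n"
print(flat_list, end="")
```

For real trees with quoted names or comments, a dedicated NEWICK parser would be a safer choice than this regular expression.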
diff --git a/help/8/artifacts/misc-data-items-order/index.md b/help/8/artifacts/misc-data-items-order/index.md
new file mode 100644
index 00000000..5ef41a3c
--- /dev/null
+++ b/help/8/artifacts/misc-data-items-order/index.md
@@ -0,0 +1,51 @@
+---
+layout: artifact
+title: misc-data-items-order
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/misc-data-items-order
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-import-items-order](../../programs/anvi-import-items-order) [anvi-merge](../../programs/anvi-merge) [anvi-pan-genome](../../programs/anvi-pan-genome) [anvi-profile](../../programs/anvi-profile)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This artifact describes the **order of items in visualization tasks**.
+
+In anvi'o, main display items (such as 'gene clusters' in a pan database, 'contigs' in a profile database, etc) can be ordered either by a NEWICK formatted tree (such as a phylogenetic tree or a hierarchical clustering dendrogram), or by an array (such as a flat list of item names).
+
+When a NEWICK tree is used to order items, it will appear as the tree in the central section of the default anvi'o interactive interface. When a flat list of items is provided to order items, the central display where a tree appears will be blank and the displayed items will still be ordered according to the list. In other words, items order is to [misc-data-items](/help/8/artifacts/misc-data-items) as [misc-data-layer-orders](/help/8/artifacts/misc-data-layer-orders) is to [misc-data-layers](/help/8/artifacts/misc-data-layers): a description not of the items themselves, but of the order in which they appear in the interface.
+
+Anvi'o programs such as [anvi-pan-genome](/help/8/programs/anvi-pan-genome), [anvi-merge](/help/8/programs/anvi-merge), and [anvi-profile](/help/8/programs/anvi-profile) automatically generate a NEWICK-formatted items order if possible (i.e., if you have fewer than 20,000 items). When you run these programs, they will put this information into your resulting [pan-db](/help/8/artifacts/pan-db) or [profile-db](/help/8/artifacts/profile-db).
+
+You can also export this information to give to a fellow anvi'o user or import this information if you have your own phylogenetic tree or desired order for your contigs.
+
+You can use [anvi-import-items-order](/help/8/programs/anvi-import-items-order) to import specific orders for your items, or [anvi-export-items-order](/help/8/programs/anvi-export-items-order) to export this information.
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/misc-data-items-order.md) to update this information.
+
diff --git a/help/8/artifacts/misc-data-items-txt/index.md b/help/8/artifacts/misc-data-items-txt/index.md
new file mode 100644
index 00000000..0c91a774
--- /dev/null
+++ b/help/8/artifacts/misc-data-items-txt/index.md
@@ -0,0 +1,48 @@
+---
+layout: artifact
+title: misc-data-items-txt
+excerpt: A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/misc-data-items-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-export-misc-data](../../programs/anvi-export-misc-data) [anvi-script-gen-distribution-of-genes-in-a-bin](../../programs/anvi-script-gen-distribution-of-genes-in-a-bin) [anvi-script-transpose-matrix](../../programs/anvi-script-transpose-matrix)
+
+
+## Required or used by
+
+
+[anvi-import-misc-data](../../programs/anvi-import-misc-data) [anvi-script-transpose-matrix](../../programs/anvi-script-transpose-matrix)
+
+
+## Description
+
+This is a tab-delimited text file that describes information contained in a [misc-data-items](/help/8/artifacts/misc-data-items).
+
+To import this information into a database, use [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data).
+
+In this table, the first column should match the names of the items that you're displaying, and the following columns can contain any categorical or numerical data of your choosing. (You can even be fancy and display data as a stacked bar graph.)
+
+For an example, check out [the table on this page](http://merenlab.org/2017/12/11/additional-data-tables/#items-additional-data-table).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/misc-data-items-txt.md) to update this information.
+
diff --git a/help/8/artifacts/misc-data-items/index.md b/help/8/artifacts/misc-data-items/index.md
new file mode 100644
index 00000000..cc154013
--- /dev/null
+++ b/help/8/artifacts/misc-data-items/index.md
@@ -0,0 +1,48 @@
+---
+layout: artifact
+title: misc-data-items
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/misc-data-items
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-get-sequences-for-gene-clusters](../../programs/anvi-get-sequences-for-gene-clusters) [anvi-import-misc-data](../../programs/anvi-import-misc-data) [anvi-search-sequence-motifs](../../programs/anvi-search-sequence-motifs)
+
+
+## Required or used by
+
+
+[anvi-delete-misc-data](../../programs/anvi-delete-misc-data) [anvi-export-misc-data](../../programs/anvi-export-misc-data)
+
+
+## Description
+
+This is the section of your [profile-db](/help/8/artifacts/profile-db)/[pan-db](/help/8/artifacts/pan-db) that contains custom additional information about each of the items in the central section of the interactive interface. When you run [anvi-interactive](/help/8/programs/anvi-interactive), this data will appear as additional concentric circles.
+
+As also defined in [this blog post](http://merenlab.org/2017/12/11/additional-data-tables/#views-items-layers-orders-some-anvio-terminology), this type of data will include information about each item (whether that's a contig, gene, or bin). This data is either numerical or categorical and can be imported into another database from a [misc-data-items-txt](/help/8/artifacts/misc-data-items-txt) using [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data). It is also displayed when you run [anvi-show-misc-data](/help/8/programs/anvi-show-misc-data) and can be exported or deleted with [anvi-export-misc-data](/help/8/programs/anvi-export-misc-data) and [anvi-delete-misc-data](/help/8/programs/anvi-delete-misc-data) respectively.
+
+To change the order that the items are displayed in, take a look at [anvi-import-items-order](/help/8/programs/anvi-import-items-order).
+
+For example, this information could describe whether or not each bin reached a certain completion threshold, the e-score of the function annotation on each gene, or different categories that the total length of a contig could fall into (1-1.5 kb, 1.5-2 kb, 2-2.5 kb, and so on).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/misc-data-items.md) to update this information.
+
diff --git a/help/8/artifacts/misc-data-layer-orders-txt/index.md b/help/8/artifacts/misc-data-layer-orders-txt/index.md
new file mode 100644
index 00000000..8cf1e0d0
--- /dev/null
+++ b/help/8/artifacts/misc-data-layer-orders-txt/index.md
@@ -0,0 +1,48 @@
+---
+layout: artifact
+title: misc-data-layer-orders-txt
+excerpt: A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/misc-data-layer-orders-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-export-misc-data](../../programs/anvi-export-misc-data)
+
+
+## Required or used by
+
+
+[anvi-import-misc-data](../../programs/anvi-import-misc-data)
+
+
+## Description
+
+This is a tab-delimited text file that describes information contained in a [misc-data-layer-orders](/help/8/artifacts/misc-data-layer-orders).
+
+To import this information into a database, use [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data).
+
+This table should contain trees formatted in either basic or NEWICK form, where each branch represents the samples displayed by your layers. The order of the branches from left to right is the order they will be displayed in, from the center moving out.
+
+For an example, check out [the table on this page](http://merenlab.org/2017/12/11/additional-data-tables/#layer-orders-additional-data-table).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/misc-data-layer-orders-txt.md) to update this information.
+
diff --git a/help/8/artifacts/misc-data-layer-orders/index.md b/help/8/artifacts/misc-data-layer-orders/index.md
new file mode 100644
index 00000000..abb03101
--- /dev/null
+++ b/help/8/artifacts/misc-data-layer-orders/index.md
@@ -0,0 +1,50 @@
+---
+layout: artifact
+title: misc-data-layer-orders
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/misc-data-layer-orders
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-import-misc-data](../../programs/anvi-import-misc-data)
+
+
+## Required or used by
+
+
+[anvi-delete-misc-data](../../programs/anvi-delete-misc-data) [anvi-export-misc-data](../../programs/anvi-export-misc-data)
+
+
+## Description
+
+This is the section of your [profile-db](/help/8/artifacts/profile-db)/[pan-db](/help/8/artifacts/pan-db) that contains custom additional information about the order in which your layers are displayed and the tree that relates them to each other. When you run [anvi-interactive](/help/8/programs/anvi-interactive), this data will determine the order in which the concentric circles are displayed, as well as the tree that appears above the [misc-data-layers](/help/8/artifacts/misc-data-layers) graphs.
+
+As also defined in [this blog post](http://merenlab.org/2017/12/11/additional-data-tables/#views-items-layers-orders-some-anvio-terminology), this type of data will include information about how your layers are related to each other and determines their order. This data is stored as a tree that is displayed in the top-right.
+
+This data can be imported from a [misc-data-layer-orders-txt](/help/8/artifacts/misc-data-layer-orders-txt) artifact with [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data). It is also displayed when you run [anvi-show-misc-data](/help/8/programs/anvi-show-misc-data) and can be exported or deleted with [anvi-export-misc-data](/help/8/programs/anvi-export-misc-data) and [anvi-delete-misc-data](/help/8/programs/anvi-delete-misc-data) respectively.
+
+For example, you could use this tree to indicate and group together samples that came from the same geographic location, samples that came from the same donor, samples of the same type, samples collected with the same collection method, and so on.
+
+This is also used to import the taxonomy information at the end of [the pangenomics and phylogenomics workflow](http://merenlab.org/2017/06/07/phylogenomics/#pangenomic--phylogenomics).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/misc-data-layer-orders.md) to update this information.
+
diff --git a/help/8/artifacts/misc-data-layers-txt/index.md b/help/8/artifacts/misc-data-layers-txt/index.md
new file mode 100644
index 00000000..fb1647a1
--- /dev/null
+++ b/help/8/artifacts/misc-data-layers-txt/index.md
@@ -0,0 +1,48 @@
+---
+layout: artifact
+title: misc-data-layers-txt
+excerpt: A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/misc-data-layers-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-export-misc-data](../../programs/anvi-export-misc-data) [anvi-script-transpose-matrix](../../programs/anvi-script-transpose-matrix)
+
+
+## Required or used by
+
+
+[anvi-import-misc-data](../../programs/anvi-import-misc-data) [anvi-script-transpose-matrix](../../programs/anvi-script-transpose-matrix)
+
+
+## Description
+
+This is a tab-delimited text file that describes information contained in a [misc-data-layers](/help/8/artifacts/misc-data-layers).
+
+To import this information into a database, use [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data).
+
+In this table, the first column should match the names of the samples that you're displaying, and the following columns can contain any categorical or numerical data of your choosing. (You can even be fancy and display data as a stacked bar graph.)
+
+For an example, check out [the table on this page](http://merenlab.org/2017/12/11/additional-data-tables/#layers-additional-data-table).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/misc-data-layers-txt.md) to update this information.
+
diff --git a/help/8/artifacts/misc-data-layers/index.md b/help/8/artifacts/misc-data-layers/index.md
new file mode 100644
index 00000000..d7fac909
--- /dev/null
+++ b/help/8/artifacts/misc-data-layers/index.md
@@ -0,0 +1,48 @@
+---
+layout: artifact
+title: misc-data-layers
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/misc-data-layers
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-import-misc-data](../../programs/anvi-import-misc-data) [anvi-search-sequence-motifs](../../programs/anvi-search-sequence-motifs)
+
+
+## Required or used by
+
+
+[anvi-compute-functional-enrichment-in-pan](../../programs/anvi-compute-functional-enrichment-in-pan) [anvi-delete-misc-data](../../programs/anvi-delete-misc-data) [anvi-export-misc-data](../../programs/anvi-export-misc-data)
+
+
+## Description
+
+This is the section of your [profile-db](/help/8/artifacts/profile-db)/[pan-db](/help/8/artifacts/pan-db) that contains custom additional information about each of the layers of the interactive interface (usually displayed as the concentric circles). When you run [anvi-interactive](/help/8/programs/anvi-interactive), this data will appear as additional graphs in line with your layers, similar to how the sample names are displayed at the top.
+
+As also defined in [this blog post](http://merenlab.org/2017/12/11/additional-data-tables/#views-items-layers-orders-some-anvio-terminology), this type of data will include information about each layer of the interface (usually representing your samples). This data is either numerical or categorical and can be imported into another database from a [misc-data-layers-txt](/help/8/artifacts/misc-data-layers-txt) using [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data). It is also displayed when you run [anvi-show-misc-data](/help/8/programs/anvi-show-misc-data) and can be exported or deleted with [anvi-export-misc-data](/help/8/programs/anvi-export-misc-data) and [anvi-delete-misc-data](/help/8/programs/anvi-delete-misc-data) respectively.
+
+If you would like to change the order in which your layers are displayed, take a look at [misc-data-layer-orders](/help/8/artifacts/misc-data-layer-orders). Or, if you want to specifically import taxonomic information for your layers (if applicable), check out [anvi-import-taxonomy-for-layers](/help/8/programs/anvi-import-taxonomy-for-layers).
+
+For example, this information could describe the salinity of a series of ocean samples, the continent your samples were taken in, or which of several collection methods was used.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/misc-data-layers.md) to update this information.
+
diff --git a/help/8/artifacts/misc-data-nucleotides-txt/index.md b/help/8/artifacts/misc-data-nucleotides-txt/index.md
new file mode 100644
index 00000000..664c881c
--- /dev/null
+++ b/help/8/artifacts/misc-data-nucleotides-txt/index.md
@@ -0,0 +1,56 @@
+---
+layout: artifact
+title: misc-data-nucleotides-txt
+excerpt: A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/misc-data-nucleotides-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-export-misc-data](../../programs/anvi-export-misc-data)
+
+
+## Required or used by
+
+
+[anvi-import-misc-data](../../programs/anvi-import-misc-data)
+
+
+## Description
+
+This is a tab-delimited text file that describes information contained in a [misc-data-nucleotides](/help/8/artifacts/misc-data-nucleotides).
+
+To import this information into a database, use [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data).
+
+In this table, the first column should provide two pieces of information, both identifying a specific nucleotide position: the name of the contig the nucleotide is on, and its position on that contig. These should be separated by a colon. The following columns can contain any categorical or numerical data of your choosing.
+
+Here is an example with very abstract data:
+
+ item_name categorical_data numerical_data data_group
+ contig_1:4 group_1 4.3245 cool_data
+ contig_4:72 group_2 1.3542 cool_data
+ contig_7:24 group_1 3.2526 cool_data
+ ...
+
+For a more concrete example, check out the example table for [misc-data-amino-acids](/help/8/artifacts/misc-data-amino-acids) (which is formatted very similarly) [here](http://merenlab.org/2020/07/22/interacdome/#6-storing-the-per-residue-binding-frequencies-into-the-contigs-database). The second table on this page is what you would provide to [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/misc-data-nucleotides-txt.md) to update this information.
+
diff --git a/help/8/artifacts/misc-data-nucleotides/index.md b/help/8/artifacts/misc-data-nucleotides/index.md
new file mode 100644
index 00000000..89887118
--- /dev/null
+++ b/help/8/artifacts/misc-data-nucleotides/index.md
@@ -0,0 +1,48 @@
+---
+layout: artifact
+title: misc-data-nucleotides
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/misc-data-nucleotides
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-import-misc-data](../../programs/anvi-import-misc-data)
+
+
+## Required or used by
+
+
+[anvi-delete-misc-data](../../programs/anvi-delete-misc-data) [anvi-export-misc-data](../../programs/anvi-export-misc-data)
+
+
+## Description
+
+This is a section of your [contigs-db](/help/8/artifacts/contigs-db) that contains custom additional information about specific nucleotides.
+
+Take a look at [this blogpost](http://merenlab.org/2020/07/22/interacdome/#6-storing-the-per-residue-binding-frequencies-into-the-contigs-database) for potential uses in the InteracDome (which will likely be added to anvi'o in v7) and the motivation behind this program.
+
+Similarly to other types of miscellaneous data (like [misc-data-items](/help/8/artifacts/misc-data-items)), this information is either numerical or categorical and can be populated into a [contigs-db](/help/8/artifacts/contigs-db) (from a [misc-data-nucleotides-txt](/help/8/artifacts/misc-data-nucleotides-txt)) with [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data). It is also displayed when you run [anvi-show-misc-data](/help/8/programs/anvi-show-misc-data) and can be exported or deleted with [anvi-export-misc-data](/help/8/programs/anvi-export-misc-data) and [anvi-delete-misc-data](/help/8/programs/anvi-delete-misc-data) respectively.
+
+For example, this information could describe specific nucleotides that are known to be SNVs from another experiment, various key nucleotides for binding to ligands, or positions known to have other modifications (such as m1A or s4U).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/misc-data-nucleotides.md) to update this information.
+
diff --git a/help/8/artifacts/modifications-txt/index.md b/help/8/artifacts/modifications-txt/index.md
new file mode 100644
index 00000000..63ac53b9
--- /dev/null
+++ b/help/8/artifacts/modifications-txt/index.md
@@ -0,0 +1,65 @@
+---
+layout: artifact
+title: modifications-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/modifications-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-tabulate-trnaseq](../../programs/anvi-tabulate-trnaseq)
+
+
+## Required or used by
+
+
+[anvi-plot-trnaseq](../../programs/anvi-plot-trnaseq)
+
+
+## Description
+
+This tabular file contains data on predicted modifications in tRNA-seq seeds.
+
+This file is produced by [anvi-tabulate-trnaseq](/help/8/programs/anvi-tabulate-trnaseq). The help page for that program describes this and related tables in detail.
+
+This tab-delimited file can be easily manipulated by the user. It is required input for [anvi-plot-trnaseq](/help/8/programs/anvi-plot-trnaseq).
+
+## Example
+
+The modifications shown in this table are from the seeds represented in the [seeds-specific-txt](/help/8/artifacts/seeds-specific-txt) and [seeds-non-specific-txt](/help/8/artifacts/seeds-non-specific-txt) example tables.
+
+| gene_callers_id | contig_name | anticodon | aa | domain | phylum | class | order | family | genus | species | taxon_percent_id | seed_position | ordinal_name | ordinal_position | canonical_position | reference | sample_name | A | C | G | T |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| 0 | c_000000684460_DB_R05_06 | TAC | Val | Bacteria | Firmicutes | Clostridia | Lachnospirales | Lachnospiraceae | | | 100 | 19 | d_loop_beta_1 | 22 | 20 | G | DB_01 | 142 | 589 | 69411 | 1315 |
+| 0 | c_000000684460_DB_R05_06 | TAC | Val | Bacteria | Firmicutes | Clostridia | Lachnospirales | Lachnospiraceae | | | 100 | 19 | d_loop_beta_1 | 22 | 20 | G | DB_03 | 217 | 1056 | 83751 | 2592 |
+| 0 | c_000000684460_DB_R05_06 | TAC | Val | Bacteria | Firmicutes | Clostridia | Lachnospirales | Lachnospiraceae | | | 100 | 19 | d_loop_beta_1 | 22 | 20 | G | DB_05 | 42 | 212 | 28784 | 515 |
+| 0 | c_000000684460_DB_R05_06 | TAC | Val | Bacteria | Firmicutes | Clostridia | Lachnospirales | Lachnospiraceae | | | 100 | 19 | d_loop_beta_1 | 22 | 20 | G | DB_07 | 102 | 429 | 45633 | 977 |
+| 1 | c_000000805276_DB_R05_05 | ACG | Arg | Bacteria | Firmicutes | | | | | | 98.649 | 32 | anticodon_loop_1 | 36 | 32 | T | DB_01 | 0 | 51 | 14 | 77 |
+| 1 | c_000000805276_DB_R05_05 | ACG | Arg | Bacteria | Firmicutes | | | | | | 98.649 | 32 | anticodon_loop_1 | 36 | 32 | T | DB_03 | 1 | 274 | 97 | 642 |
+| 1 | c_000000805276_DB_R05_05 | ACG | Arg | Bacteria | Firmicutes | | | | | | 98.649 | 32 | anticodon_loop_1 | 36 | 32 | T | DB_05 | 0 | 78 | 17 | 137 |
+| 1 | c_000000805276_DB_R05_05 | ACG | Arg | Bacteria | Firmicutes | | | | | | 98.649 | 32 | anticodon_loop_1 | 36 | 32 | T | DB_07 | 0 | 19 | 18 | 87 |
+| 1 | c_000000805276_DB_R05_05 | ACG | Arg | Bacteria | Firmicutes | | | | | | 98.649 | 37 | anticodon_loop_6 | 41 | 37 | G | DB_01 | 0 | 1 | 137 | 5 |
+| 1 | c_000000805276_DB_R05_05 | ACG | Arg | Bacteria | Firmicutes | | | | | | 98.649 | 37 | anticodon_loop_6 | 41 | 37 | G | DB_03 | 5 | 18 | 916 | 64 |
+| 1 | c_000000805276_DB_R05_05 | ACG | Arg | Bacteria | Firmicutes | | | | | | 98.649 | 37 | anticodon_loop_6 | 41 | 37 | G | DB_05 | 6 | 3 | 222 | 7 |
+| 1 | c_000000805276_DB_R05_05 | ACG | Arg | Bacteria | Firmicutes | | | | | | 98.649 | 37 | anticodon_loop_6 | 41 | 37 | G | DB_07 | 0 | 15 | 104 | 1 |
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/modifications-txt.md) to update this information.
+
diff --git a/help/8/artifacts/modules-db/index.md b/help/8/artifacts/modules-db/index.md
new file mode 100644
index 00000000..6d2e9845
--- /dev/null
+++ b/help/8/artifacts/modules-db/index.md
@@ -0,0 +1,177 @@
+---
+layout: artifact
+title: modules-db
+excerpt: A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/modules-db
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-setup-kegg-data](../../programs/anvi-setup-kegg-data) [anvi-setup-user-modules](../../programs/anvi-setup-user-modules)
+
+
+## Required or used by
+
+
+[anvi-migrate](../../programs/anvi-migrate)
+
+
+## Description
+
+A type of database containing information from either A) the [KEGG MODULE database](https://www.genome.jp/kegg/module.html) and [KEGG BRITE database](https://www.genome.jp/kegg/brite.html), or B) user-defined metabolic modules, for use in metabolism estimation and/or functional annotation of KEGG Orthologs (KOs).
+
+These databases are part of the [kegg-data](/help/8/artifacts/kegg-data) and [user-modules-data](/help/8/artifacts/user-modules-data) directories. You can get one on your computer by running [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data) or [anvi-setup-user-modules](/help/8/programs/anvi-setup-user-modules). Programs that rely on this type of database include [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) and [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism).
+
+Most users will never have to interact directly with this kind of database. However, for the brave few who want to try this (or who are figuring out how anvi'o works under the hood), there is some relevant information below.
+
+## Database Contents
+
+### The modules table
+
+In the current implementation, data about each metabolic pathway (from the KEGG MODULE database, or from user-defined modules) is present in the `modules` table, which looks like this:
+
+| module | data_name | data_value | data_definition | line |
+|:--|:--|:--|:--|:--|
+| M00001 | ENTRY | M00001 | Pathway | 1 |
+| M00001 | NAME | Glycolysis (Embden-Meyerhof pathway), glucose => pyruvate | _NULL_ | 2 |
+| M00001 | DEFINITION | (K00844,K12407,K00845,K00886,K08074,K00918) (K01810,K06859,K13810,K15916) (K00850,K16370,K21071,K00918) (K01623,K01624,K11645,K16305,K16306) K01803 ((K00134,K00150) K00927,K11389) (K01834,K15633,K15634,K15635) K01689 (K00873,K12406) | _NULL_ | 3 |
+| M00001 | ORTHOLOGY | K00844 | hexokinase/glucokinase [EC:2.7.1.1 2.7.1.2] [RN:R01786] | 4 |
+| M00001 | ORTHOLOGY | K12407 | hexokinase/glucokinase [EC:2.7.1.1 2.7.1.2] [RN:R01786] | 4 |
+| (...) | (...) | (...) | (...) | (...) |
+
+For the MODULES.db that comes out of [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data), these data correspond to the information that can be found on the KEGG website for each metabolic module - for an example, you can see the page for [M00001](https://www.genome.jp/dbget-bin/www_bget?md:M00001) (or, alternatively, its [flat text file version](http://rest.kegg.jp/get/M00001) from the KEGG REST API).
+
+The USER_MODULES.db that comes out of [anvi-setup-user-modules](/help/8/programs/anvi-setup-user-modules) contains similar information, but defined by the user instead of downloaded from the KEGG website.
+
+In either case, the `module` column indicates the module ID number while the `data_name` column indicates what type of data the row is describing about the module. These data names are usually fairly self-explanatory - for instance, the `DEFINITION` rows describe the module definition and the `ORTHOLOGY` rows describe the enzymes belonging to the module - however, for an official explanation, you can check [the KEGG help page](https://www.genome.jp/kegg/document/help_bget_module.html).
+
+The `data_value` and `data_definition` columns hold the information corresponding to the row's `data_name`; for `ORTHOLOGY` fields these are the enzyme accession number and its functional annotation, respectively. Not all rows have a `data_definition` field.
+
+Finally, some rows of data originate from the same line in the original KEGG MODULE text file; these rows will have the same number in the `line` column. Perhaps this is a useless field. But it is there.
+
+### The BRITE hierarchies table
+
+In database version 4 or later, there is the option to include KEGG BRITE data in the modules database when setting one up using [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data). If this is done, the database will include a table called `brite_hierarchies` which stores the set of functional hierarchies that each KEGG Ortholog belongs to. It will look like this:
+
+|**hierarchy_accession**|**hierarchy_name**|**ortholog_accession**|**ortholog_name**|**categorization**|
+|:--|:--|:--|:--|:--|
+|ko00001|KEGG Orthology (KO)|K00844|HK; hexokinase [EC:2.7.1.1]|09100 Metabolism>>>09101 Carbohydrate metabolism>>>00010 Glycolysis / Gluconeogenesis [PATH:ko00010]|
+|ko00001|KEGG Orthology (KO)|K00844|HK; hexokinase [EC:2.7.1.1]|09100 Metabolism>>>09101 Carbohydrate metabolism>>>00051 Fructose and mannose metabolism [PATH:ko00051]|
+|ko00001|KEGG Orthology (KO)|K00844|HK; hexokinase [EC:2.7.1.1]|09100 Metabolism>>>09101 Carbohydrate metabolism>>>00052 Galactose metabolism [PATH:ko00052]|
+|ko00001|KEGG Orthology (KO)|K00844|HK; hexokinase [EC:2.7.1.1]|09100 Metabolism>>>09101 Carbohydrate metabolism>>>00500 Starch and sucrose metabolism [PATH:ko00500]|
+|ko00001|KEGG Orthology (KO)|K00844|HK; hexokinase [EC:2.7.1.1]|09100 Metabolism>>>09101 Carbohydrate metabolism>>>00520 Amino sugar and nucleotide sugar metabolism [PATH:ko00520]|
+| (...) | (...) | (...) | (...) | (...) |
+
+These data come from the JSON files describing each BRITE hierarchy, which can be downloaded from the [KEGG BRITE website](https://www.genome.jp/kegg/brite.html). For an example, [click here](https://www.genome.jp/kegg-bin/show_brite?ko00001.keg).
+
+The first four columns in this table are hopefully self-explanatory from the column names. In the `categorization` column, different functional categorization levels are separated by the string `>>>`.
+
+### The database hash value
+
+In the `self` table of this database, there is an entry called `hash`. This string is a hash of the contents of the database (specifically, it is a hash of the module and enzyme accessions in the database), and it allows us to identify the version of the data within the database. This value is important for ensuring that the same MODULES.db is used both for annotating a contigs database with [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) and for estimating metabolism on that contigs database with [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism).
+
+You can easily check the hash value by running the following:
+
+
+anvi-db-info [modules-db](/help/8/artifacts/modules-db)
+
+
+It will appear in the `DB Info` section of the output, like so:
+```
+DB Info (no touch also)
+===============================================
+num_modules ..................................: 443
+total_entries ................................: 13720
+creation_date ................................: 1608740335.30248
+hash .........................................: 45b7cc2e4fdc
+```
+
+If you have annotated a [contigs-db](/help/8/artifacts/contigs-db) using [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams), you will find that the corresponding hash in that contigs database matches this one:
+
+
+anvi-db-info [contigs-db](/help/8/artifacts/contigs-db)
+
+
+```
+DB Info (no touch also)
+===============================================
+[....]
+modules_db_hash ..............................: 45b7cc2e4fdc
+```
+
+### Other important values in the self table
+
+The `data_source` key will tell you if the current database was generated from KEGG data using [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data) or from user-defined metabolic modules using [anvi-setup-user-modules](/help/8/programs/anvi-setup-user-modules).
+
+The `annotation_sources` key will list the functional annotation sources that are required to annotate all enzymes found in the module definitions.
+
+Here is an example of what these fields look like for a KEGG MODULES.db:
+```
+DB Info (no touch also)
+===============================================
+data_source ..................................: KEGG
+annotation_sources ...........................: KOfam
+```
+
+And here is an example of what they look like for a USER_MODULES.db:
+```
+DB Info (no touch also)
+===============================================
+data_source ..................................: USER
+annotation_sources ...........................: KOfam,UpxZ,COG20_FUNCTION
+```
+
+## Querying the database
+
+If you want to extract information directly from a modules database, you can do it with a bit of SQL :)
+
+Here is one example, which obtains the name of every module in the default KEGG database:
+
+```
+# learn where the MODULES.db is:
+export ANVIO_MODULES_DB=`python -c "import anvio; import os; print(os.path.join(os.path.dirname(anvio.__file__), 'data/misc/KEGG/MODULES.db'))"`
+# get module names:
+sqlite3 $ANVIO_MODULES_DB "select module,data_value from modules where data_name='NAME'" | \
+ tr '|' '\t' > module_names.txt
+```
+
+## Loading the database in Python
+
+The modules database class has plenty of helpful functions defined for it. You can easily load one in Python and use these functions to access the data within. Here is how you load the database:
+
+```python
+import anvio
+import argparse
+import os
+from anvio import kegg
+
+args = argparse.Namespace()
+# CHANGE THIS PATH IF YOU WANT TO LOAD A MODULES DB AT A NON-DEFAULT LOCATION
+path_to_db = os.path.join(os.path.dirname(anvio.__file__), 'data/misc/KEGG/MODULES.db')
+db = kegg.ModulesDatabase(path_to_db, args)
+```
+Once you have done this, you can start to use the helper functions. For example, the following function will return a list of all paths through a module:
+```python
+db.unroll_module_definition('M00001')
+```
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/modules-db.md) to update this information.
+
diff --git a/help/8/artifacts/ngrams/index.md b/help/8/artifacts/ngrams/index.md
new file mode 100644
index 00000000..7e7b4ca6
--- /dev/null
+++ b/help/8/artifacts/ngrams/index.md
@@ -0,0 +1,44 @@
+---
+layout: artifact
+title: ngrams
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/ngrams
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-analyze-synteny](../../programs/anvi-analyze-synteny)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+An [ngrams](/help/8/artifacts/ngrams) object is a DataFrame that contains count data of synteny patterns collected from a group of similar loci or genomes. It is produced by running [anvi-analyze-synteny](/help/8/programs/anvi-analyze-synteny) when given a [genomes-storage-db](/help/8/artifacts/genomes-storage-db) and an annotation source.
+
+An `ngram` is a group of neighboring genes that includes precisely `n` genes, inspired by the term n-gram in [linguistics and natural language processing](https://en.wikipedia.org/wiki/N-gram). This object was inspired by k-mer count tables but is inherently different because it counts adjacent genes rather than nucleotides.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/ngrams.md) to update this information.
+
diff --git a/help/8/artifacts/oligotypes/index.md b/help/8/artifacts/oligotypes/index.md
new file mode 100644
index 00000000..dd2870ff
--- /dev/null
+++ b/help/8/artifacts/oligotypes/index.md
@@ -0,0 +1,57 @@
+---
+layout: artifact
+title: oligotypes
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/oligotypes
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-oligotype-linkmers](../../programs/anvi-oligotype-linkmers)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+Oligotyping is a computational strategy that **partitions a given set of sequences into homogeneous groups using only a subset of target nucleotide positions**.
+
+### History
+
+Oligotyping was [first described in 2011](https://doi.org/10.1371/journal.pone.0026732) and has primarily been applied to 16S ribosomal RNA gene amplicons to resolve closely related but distinct taxa that differ by as little as one nucleotide at the amplified region, exceeding the sensitivity of the popular strategy of the time, 97% OTU clustering. Other papers that demonstrate the strengths of this approach include the following: [1](https://doi.org/10.1111/2041-210X.12114), [2](https://doi.org/10.1073/pnas.1409644111), [3](https://doi.org/10.1038/ismej.2014.195), and [4](https://doi.org/10.3389/fmicb.2014.00568).
+
+The following figure from [the oligotyping methods paper](https://doi.org/10.1111/2041-210X.12114) depicts major steps of an oligotyping analysis for amplicon sequences:
+
+![Oligotyping](../../images/oligotyping.jpg)
+
+Shannon entropy is not the only approach to identifying highly variable nucleotide positions of interest, however; these positions can also be provided by the user.
+
+### Applications to metagenomics
+
+This strategy also applies to metagenomic sequences that are mapped to a genomic context to describe the diversity of variable regions that are fully covered by short reads. In the context of metagenomic read recruitment, variable nucleotide positions can be chosen by the user from the positions of single-nucleotide variants [anvi'o recovers](https://merenlab.org/2015/07/20/analyzing-variability/) and presents through inspection pages in the interactive interface or through [variability-profile](/help/8/artifacts/variability-profile). An example application of oligotyping to metagenomics is demonstrated here:
+
+* [An application of oligotyping in the metagenomic context: Oligotyping AmoC](https://merenlab.org/2015/12/09/musings-over-commamox/#an-application-of-oligotyping-in-the-metagenomic-context-oligotyping-amoc)
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/oligotypes.md) to update this information.
+
diff --git a/help/8/artifacts/paired-end-fastq/index.md b/help/8/artifacts/paired-end-fastq/index.md
new file mode 100644
index 00000000..3e2f3f64
--- /dev/null
+++ b/help/8/artifacts/paired-end-fastq/index.md
@@ -0,0 +1,39 @@
+---
+layout: artifact
+title: paired-end-fastq
+excerpt: A FASTQ-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/paired-end-fastq
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A FASTQ-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-script-gen-pseudo-paired-reads-from-fastq](../../programs/anvi-script-gen-pseudo-paired-reads-from-fastq)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+{:.notice}
+**No one has described this artifact yet** :/ If you would like to contribute by describing it, please see previous examples [here](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts), and add a Markdown formatted file in that directory named "paired-end-fastq.md". Its contents will replace this sad text. THANK YOU!
+
diff --git a/help/8/artifacts/palindromes-txt/index.md b/help/8/artifacts/palindromes-txt/index.md
new file mode 100644
index 00000000..9cf5f2d4
--- /dev/null
+++ b/help/8/artifacts/palindromes-txt/index.md
@@ -0,0 +1,82 @@
+---
+layout: artifact
+title: palindromes-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/palindromes-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-search-palindromes](../../programs/anvi-search-palindromes)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+A TAB-delimited file of [palindromic sequences](https://en.wikipedia.org/wiki/Palindromic_sequence) reported by [anvi-search-palindromes](/help/8/programs/anvi-search-palindromes).
+
+The following example is the output generated by the command below when it was run on the [contigs-db](/help/8/artifacts/contigs-db) of the [Infant Gut Dataset](/tutorials/infant-gut/#downloading-the-pre-packaged-infant-gut-dataset):
+
+
+[anvi-search-palindromes](/help/8/programs/anvi-search-palindromes) -c CONTIGS.db \
+ --min-palindrome-length 50 \
+ --max-num-mismatches 1 \
+ --output-file palindromes.txt
+
+
+|sequence_name|length|distance|num_mismatches|first_start|first_end|first_sequence|second_start|second_end|second_sequence|midline|
+|:--|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
+|Day17a_QCcontig1|48|0|0|195100|195148|AAGAGAAGAGGAGAAGTTCATCCATGGATGAACTTCTCCTCTTCTCTT|195100|195148|AAGAGAAGAGGAGAAGTTCATCCATGGATGAACTTCTCCTCTTCTCTT|`||||||||||||||||||||||||||||||||||||||||||||||||`|
+|Day17a_QCcontig4|147|759|1|268872|269019|TTTCGTAATACTTTTTTGCAGTAGGCATCAAATTGGTGTTGTATAGATTTCTCATTATAATTTTGTTGCATGATAATATGCTCCTTTTTCCCCTTTCCACTAATACAACAATCAGAGAGCCCCTTTTTTTCGAAAAAGCTAGAAAAA|269631|269778|TTTCGTAATACTTTTTTGCAGTAGGCATCAAATTGGTGTTGTATAGATTTCTCATTATAATTTTGTTGCATGATAATATGCTCCTTTTTCCCCTTTCCACTAATACAACAATCAGAGAGCCCCTTTTTTTCGAAAAAACTAGAAAAA|`|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||x|||||||||`|
+|Day17a_QCcontig4|53|1956|1|268237|268290|CAGCTGCTTTTGTCAAAAGCACATAGGAATTTCACCTCTCCCCAAGTTTACGG|270193|270246|CAGCTGCTTTTGTCAAAAGCACATAGGAATTTCACCTCTCTCCAAGTTTACGG|`||||||||||||||||||||||||||||||||||||||||x||||||||||||`|
+|Day17a_QCcontig4|66|1956|1|268325|268391|ATCATCACTTTTTATTGACTATAAAAATTATTTTAGAATATTTATCGCTCCTTCTTTACGATAAGA|270281|270347|ATCATCACTTTTTATTGACTATAAAAATTATTTTAGAATGTTTATCGCTCCTTCTTTACGATAAGA|`|||||||||||||||||||||||||||||||||||||||x||||||||||||||||||||||||||`|
+|Day17a_QCcontig4|60|98694|1|16368|16428|AGAACAATTTTCGGAAATTCCTTCTTATTTCTCGGAGTTAAACGCTTCTGTCCCGACCTC|115062|115122|AGAACAATTTTCGGAAATTCCTTCTTATTTCTCGGAGTTAAACACTTCTGTCCCGACCTC|`|||||||||||||||||||||||||||||||||||||||||||x||||||||||||||||`|
+|Day17a_QCcontig16|42|0|0|105735|105777|AAAAAGAACGCTCTTTTGCTTAAGCAAAAGAGCGTTCTTTTT|105735|105777|AAAAAGAACGCTCTTTTGCTTAAGCAAAAGAGCGTTCTTTTT|`||||||||||||||||||||||||||||||||||||||||||`|
+|Day17a_QCcontig23|50|0|0|51287|51337|ATAAATAAACAGAGGCCTTAGAAATATTTCTAAGGCCTCTGTTTATTTAT|51287|51337|ATAAATAAACAGAGGCCTTAGAAATATTTCTAAGGCCTCTGTTTATTTAT|`||||||||||||||||||||||||||||||||||||||||||||||||||`|
+
+In which,
+
+* `sequence_name` is the sequence name on which a given palindrome was found.
+* `length` is the length of the palindrome.
+* `distance` is the number of nucleotides between the location of the palindromic sequences in the larger sequence.
+* `num_mismatches` is the number of nucleotides in the palindrome sequence that did not match their counterparts when the sequence was reverse-complemented.
+* `first_start` is the start position of the first palindrome in the reference sequence.
+* `first_end` is the end position of the first palindrome.
+* `second_start` and `second_end` are just like `first_start` and `first_end` but for the second sequence. For perfect palindromes (i.e., palindromes with zero distance), these values will be identical to their counterparts in the first sequence.
+* `first_sequence` and `second_sequence` are the actual nucleotide sequences of both. They will be identical if the number of mismatches is zero. Please note that only the reverse complement of the `second_sequence` will be found in the reference sequence.
+* `midline` is a string composed of `|` and `x` characters that shows where the matching and mismatching nucleotides were (if any).
+
+**Please note** that the `sequence_name` column may not have unique sequence names if multiple palindromes are found on the same sequence (which will almost certainly be the case for most searches on circular genomes).
+
+**Please also note** that the `start` and `end` positions are *0-indexed*, which means (1) the first nucleotide in the sequence should be counted as the zeroth element, and (2) if you do this in Python using the example above, you will get the matching palindrome from the larger sequence context:
+
+``` python
+# assuming `contig_sequences` maps contig names to their sequence strings
+contig_sequences['Day17a_QCcontig1'][195100:195148]
+# AAGAGAAGAGGAGAAGTTCATCCATGGATGAACTTCTCCTCTTCTCTT
+```
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/palindromes-txt.md) to update this information.
+
diff --git a/help/8/artifacts/pan-db/index.md b/help/8/artifacts/pan-db/index.md
new file mode 100644
index 00000000..7dd80e6a
--- /dev/null
+++ b/help/8/artifacts/pan-db/index.md
@@ -0,0 +1,136 @@
+---
+layout: artifact
+title: pan-db
+excerpt: A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/pan-db
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-pan-genome](../../programs/anvi-pan-genome)
+
+
+## Required or used by
+
+
+[anvi-analyze-synteny](../../programs/anvi-analyze-synteny) [anvi-compute-functional-enrichment-in-pan](../../programs/anvi-compute-functional-enrichment-in-pan) [anvi-compute-gene-cluster-homogeneity](../../programs/anvi-compute-gene-cluster-homogeneity) [anvi-compute-genome-similarity](../../programs/anvi-compute-genome-similarity) [anvi-db-info](../../programs/anvi-db-info) [anvi-delete-misc-data](../../programs/anvi-delete-misc-data) [anvi-delete-state](../../programs/anvi-delete-state) [anvi-display-pan](../../programs/anvi-display-pan) [anvi-export-items-order](../../programs/anvi-export-items-order) [anvi-export-misc-data](../../programs/anvi-export-misc-data) [anvi-export-state](../../programs/anvi-export-state) [anvi-get-sequences-for-gene-clusters](../../programs/anvi-get-sequences-for-gene-clusters) [anvi-import-collection](../../programs/anvi-import-collection) [anvi-import-items-order](../../programs/anvi-import-items-order) [anvi-import-misc-data](../../programs/anvi-import-misc-data) [anvi-import-state](../../programs/anvi-import-state) [anvi-merge-bins](../../programs/anvi-merge-bins) [anvi-meta-pan-genome](../../programs/anvi-meta-pan-genome) [anvi-migrate](../../programs/anvi-migrate) [anvi-show-collections-and-bins](../../programs/anvi-show-collections-and-bins) [anvi-show-misc-data](../../programs/anvi-show-misc-data) [anvi-split](../../programs/anvi-split) [anvi-summarize](../../programs/anvi-summarize) [anvi-update-db-description](../../programs/anvi-update-db-description) [anvi-script-add-default-collection](../../programs/anvi-script-add-default-collection) [anvi-script-compute-bayesian-pan-core](../../programs/anvi-script-compute-bayesian-pan-core)
+
+
+## Description
+
+A pan-db is an anvi'o database that contains **key information associated with your gene clusters**. This is vital for pangenomic analysis, hence the name. If you want to learn more about the pangenomic workflow in anvi'o, it has [its own tutorial here](http://merenlab.org/2016/11/08/pangenomics-v2/).
+
+This is the output of the program [anvi-pan-genome](/help/8/programs/anvi-pan-genome), which can be run after you've created a [genomes-storage-db](/help/8/artifacts/genomes-storage-db) with the genomes you want to analyze. That program does the brunt of the pangenomic analysis: it calculates the similarity between all of the genes in your genomes-storage-db, clusters them, and organizes the final clusters. All of the results of that analysis are stored in a pan-db.
+
+You can use a pan database to run a variety of pangenomic analyses, including [anvi-compute-genome-similarity](/help/8/programs/anvi-compute-genome-similarity), [anvi-analyze-synteny](/help/8/programs/anvi-analyze-synteny), and [anvi-compute-functional-enrichment-in-pan](/help/8/programs/anvi-compute-functional-enrichment-in-pan). You can also view and interact with the data in a pan-db using [anvi-display-pan](/help/8/programs/anvi-display-pan).
+
+To add additional information to the pangenome display, you'll probably want to use [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data).
+
+## Advanced information for programmers
+
+While it is possible to read and write a given anvi'o pan database through SQLite functions directly, one can also use anvi'o libraries to initiate a pan database to read from.
+
+### Initiate a pan database instance
+
+``` python
+import argparse
+
+from anvio.dbops import PanSuperclass
+
+args = argparse.Namespace(pan_db="PAN.db", genomes_storage="GENOMES.db")
+
+pan_db = PanSuperclass(args)
+
+```
+
+### Gene clusters dictionary
+
+Once an instance of `PanSuperclass` is initialized, the following member function will give access to gene clusters:
+
+``` python
+pan_db.init_gene_clusters()
+print(pan_db.gene_clusters)
+```
+
+```
+{
+ "GC_00000001": {
+ "Genome_A": [19, 21],
+ "Genome_B": [30, 32],
+ "Genome_C": [122, 125],
+ "Genome_D": [44, 42]
+ },
+ "GC_00000002": {
+ "Genome_A": [123],
+ "Genome_B": [176],
+ "Genome_C": [175],
+ "Genome_D": []
+ },
+ (...)
+ "GC_00000036": {
+ "Genome_A": [],
+ "Genome_B": [24],
+ "Genome_C": [],
+ "Genome_D": []
+ }
+ (...)
+```
+
+Each item in this dictionary is a gene cluster, which lists the anvi'o gene caller ids of the genes that each genome contributes to that cluster.
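
As a minimal sketch of working with this structure, the snippet below sorts gene clusters into core clusters (every genome contributes at least one gene) and singletons (exactly one gene in total). It runs on a hand-written dictionary that mimics the shape of `pan_db.gene_clusters` shown above rather than on a real pan-db, and the helper name `genomes_contributing` is hypothetical, not part of anvi'o.

```python
# A toy dictionary with the same shape as `pan_db.gene_clusters` above:
# gene cluster name -> genome name -> list of gene caller ids.
gene_clusters = {
    "GC_00000001": {"Genome_A": [19, 21], "Genome_B": [30, 32], "Genome_C": [122, 125], "Genome_D": [44, 42]},
    "GC_00000002": {"Genome_A": [123], "Genome_B": [176], "Genome_C": [175], "Genome_D": []},
    "GC_00000036": {"Genome_A": [], "Genome_B": [24], "Genome_C": [], "Genome_D": []},
}


def genomes_contributing(cluster):
    """Sorted names of genomes with at least one gene in the cluster (hypothetical helper)."""
    return sorted(genome for genome, gene_ids in cluster.items() if gene_ids)


# Every genome name that appears anywhere in the dictionary.
all_genomes = sorted({genome for cluster in gene_clusters.values() for genome in cluster})

# Core: all genomes contribute. Singleton: one gene across all genomes.
core = [name for name, cluster in gene_clusters.items()
        if genomes_contributing(cluster) == all_genomes]
singletons = [name for name, cluster in gene_clusters.items()
              if sum(len(gene_ids) for gene_ids in cluster.values()) == 1]

print("core:", core)
print("singletons:", singletons)
```

With a real pan-db, the same loops should apply to `pan_db.gene_clusters` after `pan_db.init_gene_clusters()` has been called, assuming the dictionary has the shape shown above.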
+
+### Sequences in gene clusters
+
+``` python
+gene_clusters_of_interest = set(["GC_00000006", "GC_00000036"])
+gene_cluster_sequences = pan_db.get_sequences_for_gene_clusters(gene_cluster_names=gene_clusters_of_interest)
+
+print(gene_cluster_sequences)
+```
+
+```
+{
+ "GC_00000006": {
+ "Genome_A": {
+ 23: "MDVKKGWSGNNLND--NNNGSFTLFNAYLPQAKLANEAMHQKIMEMSAKAPNATMSITGHSLGTMISIQAVANLPQAD"
+ },
+ "Genome_B": {
+ 34: "MDVKKGWSGNNLND--NNNGSFTLFNAYLPQAKLANEAMHQKIMEMSAKAPNATMSITGHSLGTMISIQAVANLPQAD"
+ },
+ "Genome_C": {
+ 23: "MDVKKGWSGNNLNDWVNNNGSFTLFNAYLPQAKLANEAMHQKIMEMSAKAPNATMSITGHSLGTMISIQAVANLPQAD"
+ },
+ "Genome_D": {
+ 23: "MDVKKGWSGNNLNDWVNNAGSFTLFNAYLPQAKLANEAMHQKIMEMSAKAPNATMSITGHSLGTMISIQAVANLPQAD"
+ }
+ },
+ "GC_00000036": {
+ "Genome_A": {},
+ "Genome_B": {
+ 24: "MSKRHKFKQFMKKKNLNPMNNRKKVGIILFATSIGLFFLFAFRTTYIVATGKVAGVSLKEKTA"
+ },
+ "Genome_C": {},
+ "Genome_D": {}
+ }
+}
+```
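
The nested dictionary above can be flattened into FASTA-formatted text with a few lines of Python. This is only a sketch under stated assumptions: the function `to_fasta` and the `cluster|genome|gene_caller_id` defline layout are made up for illustration, and the input is a small hand-copied subset of the output shown above.

```python
import io

# A subset of the output above:
# gene cluster -> genome -> gene caller id -> aligned amino acid sequence.
gene_cluster_sequences = {
    "GC_00000006": {
        "Genome_A": {
            23: "MDVKKGWSGNNLND--NNNGSFTLFNAYLPQAKLANEAMHQKIMEMSAKAPNATMSITGHSLGTMISIQAVANLPQAD"
        },
    },
    "GC_00000036": {
        "Genome_A": {},
        "Genome_B": {
            24: "MSKRHKFKQFMKKKNLNPMNNRKKVGIILFATSIGLFFLFAFRTTYIVATGKVAGVSLKEKTA"
        },
    },
}


def to_fasta(sequences, strip_gaps=True):
    """Return FASTA text for the nested dictionary (hypothetical helper, not an anvi'o function)."""
    out = io.StringIO()
    for cluster_name in sorted(sequences):
        for genome_name in sorted(sequences[cluster_name]):
            for gene_caller_id, sequence in sorted(sequences[cluster_name][genome_name].items()):
                if strip_gaps:
                    # Remove alignment gap characters to recover the unaligned sequence.
                    sequence = sequence.replace("-", "")
                out.write(f">{cluster_name}|{genome_name}|{gene_caller_id}\n{sequence}\n")
    return out.getvalue()


fasta = to_fasta(gene_cluster_sequences)
print(fasta)
```

Genomes that contribute no genes to a cluster (empty dictionaries) simply produce no records.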
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/pan-db.md) to update this information.
+
diff --git a/help/8/artifacts/pdb-db/index.md b/help/8/artifacts/pdb-db/index.md
new file mode 100644
index 00000000..edf7ec58
--- /dev/null
+++ b/help/8/artifacts/pdb-db/index.md
@@ -0,0 +1,56 @@
+---
+layout: artifact
+title: pdb-db
+excerpt: A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/pdb-db
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-setup-pdb-database](../../programs/anvi-setup-pdb-database)
+
+
+## Required or used by
+
+
+[anvi-gen-structure-database](../../programs/anvi-gen-structure-database)
+
+
+## Description
+
+
+## What is this thing?
+
+This is a comprehensive database of non-redundant protein structures downloaded from the RCSB PDB. Currently, it is used by those who want to run [anvi-gen-structure-database](/help/8/programs/anvi-gen-structure-database) without an internet connection.
+
+
+## Where does it come from?
+
+A [pdb-db](/help/8/artifacts/pdb-db) can be created via the program [anvi-setup-pdb-database](/help/8/programs/anvi-setup-pdb-database). Alternatively, a [pdb-db](/help/8/artifacts/pdb-db) that contains custom structures not found in the [RCSB PDB](https://www.rcsb.org/) can in theory be generated by the user, but anvi'o currently offers no reasonable way of doing this.
+
+
+## Notes
+
+The [pdb-db](/help/8/artifacts/pdb-db) generated via [anvi-setup-pdb-database](/help/8/programs/anvi-setup-pdb-database) is ~20GB.
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/pdb-db.md) to update this information.
+
diff --git a/help/8/artifacts/pfams-data/index.md b/help/8/artifacts/pfams-data/index.md
new file mode 100644
index 00000000..f61c5112
--- /dev/null
+++ b/help/8/artifacts/pfams-data/index.md
@@ -0,0 +1,44 @@
+---
+layout: artifact
+title: pfams-data
+excerpt: A DATA-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/pfams-data
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DATA-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-setup-pfams](../../programs/anvi-setup-pfams)
+
+
+## Required or used by
+
+
+[anvi-run-pfams](../../programs/anvi-run-pfams)
+
+
+## Description
+
+This basically stores **a local copy of the data from the EBI's [Pfam database](https://pfam.xfam.org/) for function annotation.**
+
+It is required to run [anvi-run-pfams](/help/8/programs/anvi-run-pfams) and is set up on your computer by the program [anvi-setup-pfams](/help/8/programs/anvi-setup-pfams).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/pfams-data.md) to update this information.
+
diff --git a/help/8/artifacts/phylogeny/index.md b/help/8/artifacts/phylogeny/index.md
new file mode 100644
index 00000000..4d2c1749
--- /dev/null
+++ b/help/8/artifacts/phylogeny/index.md
@@ -0,0 +1,65 @@
+---
+layout: artifact
+title: phylogeny
+excerpt: A NEWICK-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/phylogeny
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A NEWICK-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-export-items-order](../../programs/anvi-export-items-order) [anvi-gen-phylogenomic-tree](../../programs/anvi-gen-phylogenomic-tree)
+
+
+## Required or used by
+
+
+[anvi-import-items-order](../../programs/anvi-import-items-order) [anvi-import-misc-data](../../programs/anvi-import-misc-data) [anvi-interactive](../../programs/anvi-interactive) [anvi-script-checkm-tree-to-interactive](../../programs/anvi-script-checkm-tree-to-interactive)
+
+
+## Description
+
+This is a NEWICK-formatted tree that describes the phylogenetic relationships of your data.
+
+{:.notice}
+Wondering what the NEWICK format is? Then you're in luck! It has its own [Wikipedia page](https://en.wikipedia.org/wiki/Newick_format).
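
To make the format concrete, here is a small sketch that pulls the leaf names out of a NEWICK string with a regular expression. The tree string and the function name `newick_leaf_names` are invented for illustration, and the pattern assumes simple unquoted labels; a dedicated tree library is the safer choice for real files.

```python
import re


def newick_leaf_names(newick):
    """Return leaf labels in the order they appear (hypothetical helper).

    A leaf label directly follows '(' or ','. Labels of internal nodes follow
    ')' and are therefore skipped. Quoted labels are not handled.
    """
    return re.findall(r"[(,]\s*([^(),:;\s]+)", newick)


# A made-up tree with four leaves and branch lengths.
tree = "((Genome_A:0.1,Genome_B:0.2):0.3,(Genome_C:0.15,Genome_D:0.25):0.4);"

print(newick_leaf_names(tree))
```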
+
+### How to get one of these?
+
+You can use [anvi-gen-phylogenomic-tree](/help/8/programs/anvi-gen-phylogenomic-tree) to create a phylogeny based on a series of genes.
+
+As discussed on the page for [anvi-gen-phylogenomic-tree](/help/8/programs/anvi-gen-phylogenomic-tree), you can also use an external program to get a NEWICK-formatted tree and use that.
+
+### What can I do with it?
+
+Firstly, you can use it to reorder elements of the interactive interface. To use it as the order in which your items appear (in other words, as the central phylogenetic tree when you open the interface), import it using [anvi-import-items-order](/help/8/programs/anvi-import-items-order). To use it as a tree describing your layers (the concentric circles in the anvi'o interface), convert it to a [misc-data-layer-orders-txt](/help/8/artifacts/misc-data-layer-orders-txt) and use the program [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data).
+
+Secondly, as done in the [Phylogenetics tutorial](http://merenlab.org/2017/06/07/phylogenomics/#working-with-fasta-files), you can open it in the interactive interface without an associated [contigs-db](/help/8/artifacts/contigs-db). To do this, run [anvi-interactive](/help/8/programs/anvi-interactive) like so:
+
+
+anvi-interactive -t [phylogeny](/help/8/artifacts/phylogeny) \
+ --title "Phylogenomics Tutorial" \
+ --manual
+
+
+This will create an empty [profile-db](/help/8/artifacts/profile-db) to store any [bin](/help/8/artifacts/bin)s you create and other such data. You can also add various information, such as taxonomy hits, as done in that same [Phylogenetics tutorial](http://merenlab.org/2017/06/07/phylogenomics/#working-with-fasta-files).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/phylogeny.md) to update this information.
+
diff --git a/help/8/artifacts/pn-ps-data/index.md b/help/8/artifacts/pn-ps-data/index.md
new file mode 100644
index 00000000..905de088
--- /dev/null
+++ b/help/8/artifacts/pn-ps-data/index.md
@@ -0,0 +1,84 @@
+---
+layout: artifact
+title: pn-ps-data
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/pn-ps-data
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-get-pn-ps-ratio](../../programs/anvi-get-pn-ps-ratio)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This describes the output of [anvi-get-pn-ps-ratio](/help/8/programs/anvi-get-pn-ps-ratio), which calculates the pN/pS ratio for each gene in a [contigs-db](/help/8/artifacts/contigs-db).
+
+{:.notice}
+See the page for [anvi-get-pn-ps-ratio](/help/8/programs/anvi-get-pn-ps-ratio) for an explanation of the pN/pS ratio.
+
+This describes a directory that contains the following four files:
+
+`pNpS.txt`: a long-format table of the pN/pS values, along with the groupby variables:
+
+| | corresponding_gene_call | sample_id | pNpS_reference |
+| - | ----------------------- | ----------- | -------------------- |
+| 0 | 1744 | ANE_004_05M | 0.043503524536208836 |
+| 1 | 1744 | ANE_004_40M | 0.043628712253629943 |
+| 2 | 1744 | ANE_150_05M | 0.03810623760551494 |
+| 3 | 1744 | ANE_150_40M | 0.040815421982026576 |
+
+`pN.txt`: a long-format table of the pN values, along with the groupby variables:
+
+| | corresponding_gene_call | sample_id | pN_reference |
+| - | ----------------------- | ----------- | ------------------ |
+| 0 | 1744 | ANE_004_05M | 11.827627600424583 |
+| 1 | 1744 | ANE_004_40M | 11.106801744995472 |
+| 2 | 1744 | ANE_150_05M | 9.62355553228605 |
+| 3 | 1744 | ANE_150_40M | 10.067364489809782 |
+
+`pS.txt`: a long-format table of the pS values, along with the groupby variables:
+
+| | corresponding_gene_call | sample_id | pS_reference |
+| - | ----------------------- | ----------- | ------------------ |
+| 0 | 1744 | ANE_004_05M | 271.87745651689016 |
+| 1 | 1744 | ANE_004_40M | 254.57551165909962 |
+| 2 | 1744 | ANE_150_05M | 252.54541348089631 |
+| 3 | 1744 | ANE_150_40M | 246.6558962502711 |
+
+`num_SCVs.txt`: a long-format table of the number of SCVs belonging to each group:
+
+| | corresponding_gene_call | sample_id | num_SCVs |
+| - | ----------------------- | ----------- | -------- |
+| 0 | 1744 | ANE_004_05M | 180 |
+| 1 | 1744 | ANE_004_40M | 166 |
+| 2 | 1744 | ANE_150_05M | 162 |
+| 3 | 1744 | ANE_150_40M | 160 |
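
The example rows above appear to be consistent with `pNpS_reference` being `pN_reference` divided by `pS_reference` for the same gene and sample. The sketch below checks this on the first two example rows only; it is an arithmetic check on the numbers shown here, not a description of how [anvi-get-pn-ps-ratio](/help/8/programs/anvi-get-pn-ps-ratio) computes them internally.

```python
# Values copied from the first two rows of the example tables above,
# keyed by (corresponding_gene_call, sample_id).
pn = {
    (1744, "ANE_004_05M"): 11.827627600424583,
    (1744, "ANE_004_40M"): 11.106801744995472,
}
ps = {
    (1744, "ANE_004_05M"): 271.87745651689016,
    (1744, "ANE_004_40M"): 254.57551165909962,
}
reported_pnps = {
    (1744, "ANE_004_05M"): 0.043503524536208836,
    (1744, "ANE_004_40M"): 0.043628712253629943,
}

# Recompute the ratio and compare it with the reported value.
recomputed = {key: pn[key] / ps[key] for key in pn}
agrees = {key: abs(recomputed[key] - reported_pnps[key]) < 1e-6 for key in pn}

for key in sorted(recomputed):
    print(key, recomputed[key], agrees[key])
```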
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/pn-ps-data.md) to update this information.
+
diff --git a/help/8/artifacts/primers-txt/index.md b/help/8/artifacts/primers-txt/index.md
new file mode 100644
index 00000000..5c8112eb
--- /dev/null
+++ b/help/8/artifacts/primers-txt/index.md
@@ -0,0 +1,59 @@
+---
+layout: artifact
+title: primers-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/primers-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-search-primers](../../programs/anvi-search-primers)
+
+
+## Description
+
+A **TAB-delimited** file to describe primer sequences. A primer sequence can be exact (such as `ATCG`), or fuzzy (such as `AT.G`, which would match any of `ATAG`, `ATTG`, `ATCG`, or `ATGG`). Fuzzy primers are defined by regular expressions, properties of which are explained best in [the Python documentation](https://docs.python.org/3/library/re.html), or cheatsheets like [this one](https://www.debuggex.com/cheatsheet/regex/python).
+
+This file type includes two required columns and any number of user-defined optional columns.
+
+The following two columns are **required** for this file type:
+
+* `name`: a single-word name for a given primer,
+* `primer_sequence`: the primer sequence.
+
+### An example primers-txt file
+
+Here is an example file with three primers:
+
+|name|primer_sequence|
+|:--|:--|
+|PR01|AA.A..G..G..G.CCG.C.A.C|
+|PR02|AACACCGCAGTCCATGAGA|
+|PR03|A[TC]A[CG]T[ATC]TCGAGC|
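
Because fuzzy primers are regular expressions, their behavior can be explored with Python's `re` module before running any anvi'o program. The sketch below parses the example file from an in-memory string and searches a made-up read; the read sequence is hypothetical, and how [anvi-search-primers](/help/8/programs/anvi-search-primers) applies the patterns internally is not shown here.

```python
import csv
import io
import re

# The example primers-txt file above as TAB-delimited text.
primers_txt = (
    "name\tprimer_sequence\n"
    "PR01\tAA.A..G..G..G.CCG.C.A.C\n"
    "PR02\tAACACCGCAGTCCATGAGA\n"
    "PR03\tA[TC]A[CG]T[ATC]TCGAGC\n"
)

# Compile each primer sequence as a regular expression, keyed by primer name.
reader = csv.DictReader(io.StringIO(primers_txt), delimiter="\t")
primers = {row["name"]: re.compile(row["primer_sequence"]) for row in reader}

# A hypothetical read that contains one sequence matching PR03.
read = "GGGATAGTTTCGAGCTTT"
hits = [name for name, pattern in primers.items() if pattern.search(read)]
print(hits)

# The fuzzy primer `AT.G` from the description matches all four listed sequences.
fuzzy_matches = [s for s in ("ATAG", "ATTG", "ATCG", "ATGG") if re.fullmatch("AT.G", s)]
print(fuzzy_matches)
```

Note that `.` matches any single character, so `AT.G` would also match a sequence such as `ATNG`; a character class like `[ACGT]` is stricter.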
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/primers-txt.md) to update this information.
+
diff --git a/help/8/artifacts/profile-db/index.md b/help/8/artifacts/profile-db/index.md
new file mode 100644
index 00000000..1646507e
--- /dev/null
+++ b/help/8/artifacts/profile-db/index.md
@@ -0,0 +1,73 @@
+---
+layout: artifact
+title: profile-db
+excerpt: A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/profile-db
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-merge](../../programs/anvi-merge)
+
+
+## Required or used by
+
+
+[anvi-cluster-contigs](../../programs/anvi-cluster-contigs) [anvi-db-info](../../programs/anvi-db-info) [anvi-delete-collection](../../programs/anvi-delete-collection) [anvi-delete-misc-data](../../programs/anvi-delete-misc-data) [anvi-delete-state](../../programs/anvi-delete-state) [anvi-display-metabolism](../../programs/anvi-display-metabolism) [anvi-display-structure](../../programs/anvi-display-structure) [anvi-estimate-genome-completeness](../../programs/anvi-estimate-genome-completeness) [anvi-estimate-metabolism](../../programs/anvi-estimate-metabolism) [anvi-estimate-scg-taxonomy](../../programs/anvi-estimate-scg-taxonomy) [anvi-estimate-trna-taxonomy](../../programs/anvi-estimate-trna-taxonomy) [anvi-export-collection](../../programs/anvi-export-collection) [anvi-export-gene-coverage-and-detection](../../programs/anvi-export-gene-coverage-and-detection) [anvi-export-items-order](../../programs/anvi-export-items-order) [anvi-export-misc-data](../../programs/anvi-export-misc-data) [anvi-export-splits-and-coverages](../../programs/anvi-export-splits-and-coverages) [anvi-export-state](../../programs/anvi-export-state) [anvi-gen-fixation-index-matrix](../../programs/anvi-gen-fixation-index-matrix) [anvi-gen-gene-consensus-sequences](../../programs/anvi-gen-gene-consensus-sequences) [anvi-gen-gene-level-stats-databases](../../programs/anvi-gen-gene-level-stats-databases) [anvi-gen-variability-profile](../../programs/anvi-gen-variability-profile) [anvi-get-aa-counts](../../programs/anvi-get-aa-counts) [anvi-get-codon-frequencies](../../programs/anvi-get-codon-frequencies) [anvi-get-codon-usage-bias](../../programs/anvi-get-codon-usage-bias) [anvi-get-sequences-for-hmm-hits](../../programs/anvi-get-sequences-for-hmm-hits) [anvi-get-short-reads-from-bam](../../programs/anvi-get-short-reads-from-bam) [anvi-get-split-coverages](../../programs/anvi-get-split-coverages) [anvi-import-collection](../../programs/anvi-import-collection) 
[anvi-import-items-order](../../programs/anvi-import-items-order) [anvi-import-misc-data](../../programs/anvi-import-misc-data) [anvi-import-state](../../programs/anvi-import-state) [anvi-inspect](../../programs/anvi-inspect) [anvi-interactive](../../programs/anvi-interactive) [anvi-merge-bins](../../programs/anvi-merge-bins) [anvi-migrate](../../programs/anvi-migrate) [anvi-refine](../../programs/anvi-refine) [anvi-rename-bins](../../programs/anvi-rename-bins) [anvi-search-sequence-motifs](../../programs/anvi-search-sequence-motifs) [anvi-show-collections-and-bins](../../programs/anvi-show-collections-and-bins) [anvi-show-misc-data](../../programs/anvi-show-misc-data) [anvi-split](../../programs/anvi-split) [anvi-summarize](../../programs/anvi-summarize) [anvi-update-db-description](../../programs/anvi-update-db-description) [anvi-script-add-default-collection](../../programs/anvi-script-add-default-collection) [anvi-script-gen-distribution-of-genes-in-a-bin](../../programs/anvi-script-gen-distribution-of-genes-in-a-bin) [anvi-script-gen-genomes-file](../../programs/anvi-script-gen-genomes-file) [anvi-script-permute-trnaseq-seeds](../../programs/anvi-script-permute-trnaseq-seeds)
+
+
+## Description
+
+An anvi'o database that **contains key information about the mapping of short reads *from multiple samples* to your contigs.**
+
+You can think of this as an extension of a [contigs-db](/help/8/artifacts/contigs-db) that contains information about how your contigs align with each of your samples. The vast majority of programs that use a profile database will also ask for the contigs database associated with it.
+
+A profile database contains information about how short reads map to the contigs in a [contigs-db](/help/8/artifacts/contigs-db). Specifically, for each sample, a profile database contains
+
+* the coverage and abundance per nucleotide position for each contig
+* variants of various kinds (single-nucleotide, single-codon, and single-amino acid)
+* structural variants (e.g., insertions and deletions)
+
+These terms are explained on the [anvi'o vocabulary page](http://merenlab.org/vocabulary/).
+
+![Contents of the contigs and profile databases](../../images/contigs-profile-db.png)
+
+This information is necessary to run anvi'o programs like [anvi-cluster-contigs](/help/8/programs/anvi-cluster-contigs), [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism), and [anvi-gen-gene-level-stats-databases](/help/8/programs/anvi-gen-gene-level-stats-databases). You can also interact with a profile database using programs like [anvi-interactive](/help/8/programs/anvi-interactive).
+
+Technically, "profile-db" refers to a profile database that contains the data from several samples -- in other words, the result of running [anvi-merge](/help/8/programs/anvi-merge) on several [single-profile-db](/help/8/artifacts/single-profile-db)s. However, since a [single-profile-db](/help/8/artifacts/single-profile-db) has a lot of the functionality of a profile-db, it might be easier to think of a profile database as a header referring to both single-profile-dbs and profile-dbs (which can also be called merged-profile-dbs). For simplicity's sake, since most users are dealing with multiple samples, the name was shortened to just profile-db. The following is a list of differences in functionality between a single profile database and a merged profile database:
+
+* You can run [anvi-cluster-contigs](/help/8/programs/anvi-cluster-contigs) or [anvi-mcg-classifier](/help/8/programs/anvi-mcg-classifier) on only a merged profile database (or profile-db), since they look at the alignment data in many samples.
+* You cannot run [anvi-merge](/help/8/programs/anvi-merge) or [anvi-import-taxonomy-for-layers](/help/8/programs/anvi-import-taxonomy-for-layers) on a merged profile database, only on a [single-profile-db](/help/8/artifacts/single-profile-db).
+
+## How to make a profile database
+
+### If you have multiple samples
+1. Prepare your [contigs-db](/help/8/artifacts/contigs-db)
+2. Run [anvi-profile](/help/8/programs/anvi-profile) with an appropriate [bam-file](/help/8/artifacts/bam-file). The output of this will give you a [single-profile-db](/help/8/artifacts/single-profile-db). You will need to do this for each of your samples, which have been converted into a [bam-file](/help/8/artifacts/bam-file) with your short reads.
+3. Run [anvi-merge](/help/8/programs/anvi-merge) on your [contigs-db](/help/8/artifacts/contigs-db) (from step 1) and your [single-profile-db](/help/8/artifacts/single-profile-db)s (from step 2). The output of this is a profile-db.
+
+### If you have a single sample
+1. Prepare your [contigs-db](/help/8/artifacts/contigs-db)
+2. Run [anvi-profile](/help/8/programs/anvi-profile) with an appropriate [bam-file](/help/8/artifacts/bam-file). The output of this will give you a [single-profile-db](/help/8/artifacts/single-profile-db). You can see that page for more information, but essentially you can use a single-profile-db instead of a profile database to run most anvi'o functions.
+
+## Variants
+
+Profile databases, like [contigs-db](/help/8/artifacts/contigs-db)s, are allowed to have different variants, though the only currently implemented variant, the [trnaseq-profile-db](/help/8/artifacts/trnaseq-profile-db), is for tRNA transcripts from tRNA-seq experiments. The default variant stored for "standard" profile databases is `unknown`. Variants should indicate that substantially different information is stored in the database. For instance, single codon variability is applicable to protein-coding genes but not tRNA transcripts, so SCV data is not recorded for the `trnaseq` variant. The trnaseq workflow generates [trnaseq-profile-db](/help/8/artifacts/trnaseq-profile-db)s using a very different approach to [anvi-profile](/help/8/programs/anvi-profile).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/profile-db.md) to update this information.
+
diff --git a/help/8/artifacts/protein-structure-txt/index.md b/help/8/artifacts/protein-structure-txt/index.md
new file mode 100644
index 00000000..7ad7978a
--- /dev/null
+++ b/help/8/artifacts/protein-structure-txt/index.md
@@ -0,0 +1,84 @@
+---
+layout: artifact
+title: protein-structure-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/protein-structure-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-export-structures](../../programs/anvi-export-structures)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This is a Protein Data Bank (`X.pdb`) file that describes the structure of a protein as stored in your [structure-db](/help/8/artifacts/structure-db). This is the output of running [anvi-export-structures](/help/8/programs/anvi-export-structures).
+
+This file format has its own [Wikipedia page](https://en.wikipedia.org/wiki/Protein_Data_Bank_(file_format)), as well as pages on PDB-101 [for beginners](https://pdb101.rcsb.org/learn/guide-to-understanding-pdb-data/beginner's-guide-to-pdb-structures-and-the-pdbx-mmcif-format) and for [coordinates specifically](https://pdb101.rcsb.org/learn/guide-to-understanding-pdb-data/dealing-with-coordinates), but is also briefly explained here.
+
+The header describes the title (if one exists), the type of data (denoted by `EXPDTA`), and any free-form annotations (denoted by `REMARK`). In anvi'o, these are primarily MODELLER information calculated when you ran [anvi-gen-structure-database](/help/8/programs/anvi-gen-structure-database).
+
+Most of the data will describe the position of individual atoms (denoted by `ATOM`) in your protein, where columns 6, 7, and 8 describe the three-dimensional coordinates of the atom. The rest of the columns describe information like what position that atom is in the amino acid.
+
+`TER` statements separate independent chains from each other.
+
+Here is an example:
+
+ EXPDTA THEORETICAL MODEL, MODELLER 9.22 2020/10/13 14:38:54
+ REMARK 6 MODELLER OBJECTIVE FUNCTION: 255.0071
+ REMARK 6 MODELLER BEST TEMPLATE percent SEQ ID: 73.077
+ ATOM 1 N MET A 1 4.009 -3.600 -0.411 1.00 59.26 N
+ ATOM 2 CA MET A 1 5.250 -3.864 -1.173 1.00 59.26 C
+ ATOM 3 CB MET A 1 6.409 -3.005 -0.631 1.00 59.26 C
+ ATOM 4 CG MET A 1 6.204 -1.504 -0.854 1.00 59.26 C
+ ATOM 5 SD MET A 1 7.545 -0.444 -0.229 1.00 59.26 S
+ ATOM 6 CE MET A 1 6.982 -0.449 1.495 1.00 59.26 C
+ ATOM 7 C MET A 1 5.617 -5.306 -1.058 1.00 59.26 C
+ ATOM 8 O MET A 1 4.900 -6.175 -1.552 1.00 59.26 O
+ ATOM 9 N SER A 2 6.753 -5.603 -0.397 1.00 49.70 N
+ ATOM 10 CA SER A 2 7.165 -6.971 -0.290 1.00 49.70 C
+ ATOM 11 CB SER A 2 8.547 -7.150 0.362 1.00 49.70 C
+ ATOM 12 OG SER A 2 9.546 -6.534 -0.437 1.00 49.70 O
+ ATOM 13 C SER A 2 6.184 -7.694 0.556 1.00 49.70 C
+ ATOM 14 O SER A 2 5.954 -7.346 1.714 1.00 49.70 O
+ ATOM 15 N GLU A 3 5.553 -8.718 -0.037 1.00103.21 N
+ ATOM 16 CA GLU A 3 4.632 -9.540 0.676 1.00103.21 C
+ ATOM 17 CB GLU A 3 3.856 -10.490 -0.249 1.00103.21 C
+ ATOM 18 CG GLU A 3 4.774 -11.467 -0.988 1.00103.21 C
+ ATOM 19 CD GLU A 3 3.918 -12.407 -1.826 1.00103.21 C
+ ATOM 20 OE1 GLU A 3 2.672 -12.402 -1.638 1.00103.21 O
+ ATOM 21 OE2 GLU A 3 4.502 -13.146 -2.663 1.00103.21 O
+ ATOM 22 C GLU A 3 5.402 -10.410 1.614 1.00103.21 C
+ ...
+ ATOM 594 C ASN A 79 19.969 -15.504 4.267 1.00 61.57 C
+ ATOM 595 O ASN A 79 21.042 -14.862 4.423 1.00 61.57 O
+ ATOM 596 OXT ASN A 79 19.857 -16.742 4.474 1.00 61.57 O
+ TER 597 ASN A 79
+ END
+
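As a rough sketch of how the `ATOM` records above can be read programmatically, the snippet below splits each line on whitespace and takes the coordinates from fields 6, 7, and 8 (counting from zero, with `ATOM` itself as field 0). The function name `atom_coordinates` is hypothetical. Whitespace splitting is an assumption that holds for the example lines shown here; the PDB format is defined by fixed column positions, and neighbouring fields can run together (as in `1.00103.21` above), so a dedicated structure parser is the more robust choice for real files.

```python
def atom_coordinates(pdb_text):
    """Return (serial, atom_name, residue_name, x, y, z) for each ATOM line (hypothetical helper)."""
    records = []
    for line in pdb_text.splitlines():
        fields = line.split()
        # Skip EXPDTA, REMARK, TER, END, and blank lines.
        if not fields or fields[0] != "ATOM":
            continue
        serial = int(fields[1])
        atom_name, residue_name = fields[2], fields[3]
        # Fields 6, 7, and 8 hold the x, y, and z coordinates.
        x, y, z = (float(value) for value in fields[6:9])
        records.append((serial, atom_name, residue_name, x, y, z))
    return records


# A few lines copied from the example above.
example = """\
EXPDTA THEORETICAL MODEL, MODELLER 9.22 2020/10/13 14:38:54
ATOM 1 N MET A 1 4.009 -3.600 -0.411 1.00 59.26 N
ATOM 15 N GLU A 3 5.553 -8.718 -0.037 1.00103.21 N
TER 597 ASN A 79
END
"""

coordinates = atom_coordinates(example)
for record in coordinates:
    print(record)
```
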
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/protein-structure-txt.md) to update this information.
+
diff --git a/help/8/artifacts/quick-summary/index.md b/help/8/artifacts/quick-summary/index.md
new file mode 100644
index 00000000..2ab1cc3f
--- /dev/null
+++ b/help/8/artifacts/quick-summary/index.md
@@ -0,0 +1,58 @@
+---
+layout: artifact
+title: quick-summary
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/quick-summary
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-summarize-blitz](../../programs/anvi-summarize-blitz)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+The output of [anvi-summarize-blitz](/help/8/programs/anvi-summarize-blitz).
+
+[anvi-summarize-blitz](/help/8/programs/anvi-summarize-blitz) summarizes read-recruitment statistics for a collection of bins across multiple samples. It produces long format output in which each row contains the (weighted) average statistics of a bin in a sample. Each statistic is summarized in a different column of the file.
+
+Here is an example output file from this program, summarizing `detection` and `mean_coverage_Q2Q3` data (the default statistics) for 3 bins across multiple samples:
+
+unique_id | bin_name | sample | detection | mean_coverage_Q2Q3
+|:---|:---|:---|:---|:---|
+0 | bin_1 | sample_1 | 0.015553023620503776 | 1.0272713907674214
+1 | bin_2 | sample_1 | 0.0004871607502275562 | 0.0
+2 | bin_3 | sample_1 | 0.0023636043452898497 | 0.0
+3 | bin_1 | sample_2 | 0.015767421346662747 | 1.1101759286484367
+4 | bin_2 | sample_2 | 0.0004871607502275562 | 0.0
+5 | bin_3 | sample_2 | 0.001595914458984989 | 0.0
+[...] | [...] |[...] |[...] |[...]
+
+The `unique_id` column is just a unique index for each row. Each column after the `sample` column contains a different statistic. To learn how to include different or additional statistics in this output, read the [anvi-summarize-blitz](/help/8/programs/anvi-summarize-blitz) page.
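
Long-format output like this is convenient to reshape. The sketch below turns the `detection` column into a bin-by-sample lookup using only the standard library. It assumes the file is TAB-delimited, which is not stated above, and it works on an in-memory copy of the example rows rather than on a real output file.

```python
import csv
import io
from collections import defaultdict

# The example rows above, rebuilt as TAB-delimited text (the delimiter is an assumption).
rows = [
    ("unique_id", "bin_name", "sample", "detection", "mean_coverage_Q2Q3"),
    ("0", "bin_1", "sample_1", "0.015553023620503776", "1.0272713907674214"),
    ("1", "bin_2", "sample_1", "0.0004871607502275562", "0.0"),
    ("2", "bin_3", "sample_1", "0.0023636043452898497", "0.0"),
    ("3", "bin_1", "sample_2", "0.015767421346662747", "1.1101759286484367"),
    ("4", "bin_2", "sample_2", "0.0004871607502275562", "0.0"),
    ("5", "bin_3", "sample_2", "0.001595914458984989", "0.0"),
]
quick_summary_txt = "\n".join("\t".join(row) for row in rows) + "\n"

# Pivot from long format (one row per bin and sample) to detection[bin][sample].
detection = defaultdict(dict)
for row in csv.DictReader(io.StringIO(quick_summary_txt), delimiter="\t"):
    detection[row["bin_name"]][row["sample"]] = float(row["detection"])

for bin_name in sorted(detection):
    print(bin_name, detection[bin_name])
```

To read a real file, the `io.StringIO(...)` object would be replaced with an open file handle.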
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/quick-summary.md) to update this information.
+
diff --git a/help/8/artifacts/raw-bam-file/index.md b/help/8/artifacts/raw-bam-file/index.md
new file mode 100644
index 00000000..bc8907a0
--- /dev/null
+++ b/help/8/artifacts/raw-bam-file/index.md
@@ -0,0 +1,60 @@
+---
+layout: artifact
+title: raw-bam-file
+excerpt: A BAM-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/raw-bam-file
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A BAM-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-init-bam](../../programs/anvi-init-bam)
+
+
+## Description
+
+This is a **[bam-file](/help/8/artifacts/bam-file) (which contains aligned sequence data) that has not yet been indexed and sorted**.
+
+### What does being "indexed" mean?
+
+Think of your BAM file as a long, complex book. To get the most out of it during analysis, it helps to have a table of contents. Indexing your BAM file creates a second file that serves as an external table of contents, so that anvi'o doesn't have to keep looking through the entire BAM file during analysis.
+
+You can tell whether or not your BAM file is indexed based on the presence of this second file, which will have the same name as your BAM file, but end with the extension `.bai`. For example, if your directory contained these files:
+
+
+Lake_Michigan_Sample_1.bam
+Lake_Michigan_Sample_1.bam.bai
+Lake_Michigan_Sample_2.bam
+
+
+then you would still need to index `Lake_Michigan_Sample_2.bam`.
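+
+The same check can be scripted. This is a small sketch under the naming convention described above (an index named like the BAM file plus `.bai`); the helper `bams_missing_index` is hypothetical and only looks at file names, not at file contents.
+
```python
import tempfile
from pathlib import Path


def bams_missing_index(directory):
    """Return the names of .bam files that have no matching .bam.bai file."""
    directory = Path(directory)
    return sorted(
        bam.name
        for bam in directory.glob("*.bam")
        if not bam.with_name(bam.name + ".bai").exists()
    )


# Recreate the example directory from the text with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in (
        "Lake_Michigan_Sample_1.bam",
        "Lake_Michigan_Sample_1.bam.bai",
        "Lake_Michigan_Sample_2.bam",
    ):
        (Path(tmp) / name).touch()
    missing = bams_missing_index(tmp)

print(missing)
```
+
+For the example directory, only `Lake_Michigan_Sample_2.bam` is reported as lacking an index.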
+
+### How do you index a BAM file?
+
+You can either do this directly using samtools, or you can just run the anvi'o program [anvi-init-bam](/help/8/programs/anvi-init-bam) (which uses samtools for you).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/raw-bam-file.md) to update this information.
+
diff --git a/help/8/artifacts/reaction-network-json/index.md b/help/8/artifacts/reaction-network-json/index.md
new file mode 100644
index 00000000..81be324d
--- /dev/null
+++ b/help/8/artifacts/reaction-network-json/index.md
@@ -0,0 +1,46 @@
+---
+layout: artifact
+title: reaction-network-json
+excerpt: A JSON-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/reaction-network-json
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A JSON-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-get-metabolic-model-file](../../programs/anvi-get-metabolic-model-file)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This artifact represents **a JSON-formatted file derived from a [reaction-network](/help/8/artifacts/reaction-network)**.
+
+The program [anvi-get-metabolic-model-file](/help/8/programs/anvi-get-metabolic-model-file) produces this file from the [reaction-network](/help/8/artifacts/reaction-network) stored in a [contigs-db](/help/8/artifacts/contigs-db). The genes, reactions, and metabolites predicted to be involved in metabolism can be inspected in this file, which is formatted for compatibility with software used for flux balance analysis, such as [COBRApy](https://opencobra.github.io/cobrapy/).
+
+[anvi-get-metabolic-model-file](/help/8/programs/anvi-get-metabolic-model-file) includes an "objective function" as the first entry of the "reactions" section of the file, a prerequisite for flux balance analysis. The objective function represents the biomass composition of metabolites in the ["core metabolism" of *E. coli*](http://bigg.ucsd.edu/models/e_coli_core).
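+
+To illustrate the layout described above, the sketch below separates the first entry of the "reactions" section (the objective function) from the remaining reactions. Only the top-level "reactions" list is taken from this page; the `id` field, the placeholder identifiers, and the helper `split_objective` are assumptions made for illustration and may not match the real file.
+
```python
import json

# Tiny hypothetical stand-in for a reaction-network-json file. Only the
# top-level "reactions" list comes from the description above; the "id"
# field and every identifier value are illustrative assumptions.
EXAMPLE = """
{
  "reactions": [
    {"id": "objective_function_placeholder"},
    {"id": "reaction_placeholder_1"},
    {"id": "reaction_placeholder_2"}
  ]
}
"""


def split_objective(model):
    """Separate the first "reactions" entry (the objective) from the rest."""
    reactions = model["reactions"]
    if not reactions:
        raise ValueError('the "reactions" section is empty')
    return reactions[0], reactions[1:]


# With a real file, load it with json.load() on an open file handle instead.
model = json.loads(EXAMPLE)
objective, others = split_objective(model)
print(objective["id"], len(others))
```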
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/reaction-network-json.md) to update this information.
+
diff --git a/help/8/artifacts/reaction-network/index.md b/help/8/artifacts/reaction-network/index.md
new file mode 100644
index 00000000..6db41a1a
--- /dev/null
+++ b/help/8/artifacts/reaction-network/index.md
@@ -0,0 +1,46 @@
+---
+layout: artifact
+title: reaction-network
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/reaction-network
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-reaction-network](../../programs/anvi-reaction-network)
+
+
+## Required or used by
+
+
+[anvi-get-metabolic-model-file](../../programs/anvi-get-metabolic-model-file)
+
+
+## Description
+
+This artifact represents **the metabolic reaction network stored in a [contigs-db](/help/8/artifacts/contigs-db) by [anvi-reaction-network](/help/8/programs/anvi-reaction-network).**
+
+The program, [anvi-reaction-network](/help/8/programs/anvi-reaction-network), generates a reaction network from genes encoding enzymes in the [contigs-db](/help/8/artifacts/contigs-db). The reaction network represents biochemical reactions and the constituent metabolites predicted from the genome. The program relies upon [KEGG Orthology (KO)](https://www.genome.jp/kegg/ko.html) annotations of protein-coding genes and reference data in the [ModelSEED Biochemistry database](https://github.com/ModelSEED/ModelSEEDDatabase), and is therefore subject to all the limitations thereof, including incomplete annotation of genes with protein orthologs and imprecise knowledge of the reactions catalyzed by enzymes.
+
+The representation of the reaction network in two tables of the [contigs-db](/help/8/artifacts/contigs-db), `gene_function_reactions` and `gene_function_metabolites`, is generalizable to other sources of metabolic data, linking genes to predicted functional orthologs and the associated reactions and metabolites. This data can be exported to a JSON-formatted file by [anvi-get-metabolic-model-file](/help/8/programs/anvi-get-metabolic-model-file) for inspection and metabolic model analyses.
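+
+For a quick look at whether a network has been stored, the two tables can be counted directly. This is a rough sketch that assumes the [contigs-db](/help/8/artifacts/contigs-db) can be opened as a SQLite database; the in-memory database and its single `placeholder` column are stand-ins, and the real table schemas are not shown here.
+
```python
import sqlite3

# Table names as given in the text above.
TABLES = ("gene_function_reactions", "gene_function_metabolites")


def count_network_rows(connection):
    """Return {table_name: row_count} for the two reaction network tables."""
    counts = {}
    for table in TABLES:
        # Table names come from the fixed tuple above, never from user input.
        (counts[table],) = connection.execute(
            f"SELECT COUNT(*) FROM {table}"
        ).fetchone()
    return counts


# In-memory stand-in for a contigs-db; the column is a placeholder, not the
# real schema. With a real database, connect to its file path instead.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE gene_function_reactions (placeholder TEXT)")
connection.execute("CREATE TABLE gene_function_metabolites (placeholder TEXT)")
connection.executemany(
    "INSERT INTO gene_function_reactions VALUES (?)", [("a",), ("b",)]
)
counts = count_network_rows(connection)
connection.close()
print(counts)
```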
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/reaction-network.md) to update this information.
+
diff --git a/help/8/artifacts/reaction-ref-data/index.md b/help/8/artifacts/reaction-ref-data/index.md
new file mode 100644
index 00000000..14556afa
--- /dev/null
+++ b/help/8/artifacts/reaction-ref-data/index.md
@@ -0,0 +1,46 @@
+---
+layout: artifact
+title: reaction-ref-data
+excerpt: A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/reaction-ref-data
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-setup-modelseed-database](../../programs/anvi-setup-modelseed-database)
+
+
+## Required or used by
+
+
+[anvi-reaction-network](../../programs/anvi-reaction-network)
+
+
+## Description
+
+Reference databases required for [anvi-reaction-network](/help/8/programs/anvi-reaction-network) are stored in **directories of downloaded files set up by [anvi-setup-modelseed-database](/help/8/programs/anvi-setup-modelseed-database) and [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data)**.
+
+[anvi-reaction-network](/help/8/programs/anvi-reaction-network) currently relies upon comparison of KEGG Orthology (KO) gene annotations ([kegg-functions](/help/8/artifacts/kegg-functions)) stored in a [contigs-db](/help/8/artifacts/contigs-db) to reference databases: KEGG [KO](https://www.genome.jp/kegg/ko.html) and [ModelSEED Biochemistry](https://github.com/ModelSEED/ModelSEEDDatabase). The ModelSEED Biochemistry database harmonizes and consolidates reference data from multiple sources, including KEGG, in two comprehensive tables of reactions and compounds.
+
+The KEGG databases ([kegg-data](/help/8/artifacts/kegg-data)) can be obtained by running [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data), and the ModelSEED database can be obtained by running [anvi-setup-modelseed-database](/help/8/programs/anvi-setup-modelseed-database).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/reaction-ref-data.md) to update this information.
+
diff --git a/help/8/artifacts/samples-txt/index.md b/help/8/artifacts/samples-txt/index.md
new file mode 100644
index 00000000..e5923a5b
--- /dev/null
+++ b/help/8/artifacts/samples-txt/index.md
@@ -0,0 +1,74 @@
+---
+layout: artifact
+title: samples-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/samples-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-search-primers](../../programs/anvi-search-primers)
+
+
+## Description
+
+A **TAB-delimited** file to describe samples and paired-end FASTQ files associated with them. By doing so, this file type links sample names to raw sequencing reads.
+
+This file type includes required and optional columns.
+
+{:.notice}
+Anvi'o looks for these required and optional columns whenever it processes a TAB-delimited file as [samples-txt](/help/8/artifacts/samples-txt). The file can contain as many additional columns as you like, as long as the required columns are present.
+
+The following three columns are **required** for this file type:
+
+* `sample`: a single-word sample name,
+* `r1`: path to the FASTQ file for pair one, and
+* `r2`: path to the FASTQ file for pair two.
+
+{:.notice}
+You can also use `name` as your first column instead of `sample`.
+
+While you can use relative paths for `r1` and `r2`, it is always better to have absolute paths to improve reproducibility.
+
+The following is an **optional** column:
+
+* `group`: A single-word categorical variable that assigns two or more samples into two or more groups. This is useful to co-assemble multiple samples so that you can bin them later.
+
+For more information, see the [anvi'o workflow tutorial](https://merenlab.org/2018/07/09/anvio-snakemake-workflows/#samplestxt).
+
+### Example samples.txt file
+
+Here is an example file:
+
+|sample|group|r1|r2|
+|:--|:--|:--|:--|
+|Sample_01|WARM|/path/to/XXX-01-R1.fastq.gz|/path/to/XXX-01-R2.fastq.gz|
+|Sample_02|COLD|/path/to/YYY-02-R1.fastq.gz|/path/to/YYY-02-R2.fastq.gz|
+|Sample_03|COLD|/path/to/ZZZ-03-R1.fastq.gz|/path/to/ZZZ-03-R2.fastq.gz|
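+
+The column rules above can be checked before handing a file to anvi'o. The sketch below is not an anvi'o function: `check_samples_txt` is a hypothetical helper that tests only the rules stated on this page (first column `sample` or `name`, required `r1` and `r2`, single-word sample and group values) and does not verify that the FASTQ paths exist.
+
```python
import csv
import io

REQUIRED_PATH_COLUMNS = ("r1", "r2")


def check_samples_txt(handle):
    """Return a list of problems found in a samples-txt file (empty if none)."""
    reader = csv.DictReader(handle, delimiter="\t")
    columns = reader.fieldnames or []
    problems = []

    # The first column may be called either `sample` or `name`.
    name_column = columns[0] if columns and columns[0] in ("sample", "name") else None
    if name_column is None:
        problems.append("first column must be `sample` or `name`")
    for column in REQUIRED_PATH_COLUMNS:
        if column not in columns:
            problems.append(f"missing required column `{column}`")
    if problems:
        return problems

    # Data rows start on line 2 because line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        for column in (name_column, "group"):
            value = row.get(column)
            # `group` is optional, so it is only checked when present.
            if value is not None and len(value.split()) != 1:
                problems.append(f"line {line_number}: `{column}` must be a single word")
    return problems


good = (
    "sample\tgroup\tr1\tr2\n"
    "Sample_01\tWARM\t/path/to/XXX-01-R1.fastq.gz\t/path/to/XXX-01-R2.fastq.gz\n"
)
bad = "name\tr1\tr2\nSample 01\t/path/to/R1.fastq.gz\t/path/to/R2.fastq.gz\n"

print(check_samples_txt(io.StringIO(good)))
print(check_samples_txt(io.StringIO(bad)))
```
+
+An empty list means the file passed these basic checks; each string in a non-empty list describes one problem.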
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/samples-txt.md) to update this information.
+
diff --git a/help/8/artifacts/scgs-taxonomy-db/index.md b/help/8/artifacts/scgs-taxonomy-db/index.md
new file mode 100644
index 00000000..d0435a91
--- /dev/null
+++ b/help/8/artifacts/scgs-taxonomy-db/index.md
@@ -0,0 +1,44 @@
+---
+layout: artifact
+title: scgs-taxonomy-db
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/scgs-taxonomy-db
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-setup-scg-taxonomy](../../programs/anvi-setup-scg-taxonomy)
+
+
+## Required or used by
+
+
+[anvi-run-scg-taxonomy](../../programs/anvi-run-scg-taxonomy)
+
+
+## Description
+
+This describes the databases downloaded from [The Genome Taxonomy Database](https://gtdb.ecogenomic.org/) when you run [anvi-setup-scg-taxonomy](/help/8/programs/anvi-setup-scg-taxonomy).
+
+Once you have these databases, you'll want to search the single-copy core genes in your [contigs-db](/help/8/artifacts/contigs-db) against them. To do this, run [anvi-run-scg-taxonomy](/help/8/programs/anvi-run-scg-taxonomy).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/scgs-taxonomy-db.md) to update this information.
+
diff --git a/help/8/artifacts/scgs-taxonomy/index.md b/help/8/artifacts/scgs-taxonomy/index.md
new file mode 100644
index 00000000..3cdd3ba7
--- /dev/null
+++ b/help/8/artifacts/scgs-taxonomy/index.md
@@ -0,0 +1,46 @@
+---
+layout: artifact
+title: scgs-taxonomy
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/scgs-taxonomy
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-run-scg-taxonomy](../../programs/anvi-run-scg-taxonomy)
+
+
+## Required or used by
+
+
+[anvi-estimate-scg-taxonomy](../../programs/anvi-estimate-scg-taxonomy)
+
+
+## Description
+
+This contains the taxonomy annotations for each of the single-copy core genes found in your [contigs-db](/help/8/artifacts/contigs-db); in other words, this contains the results of a run of [anvi-run-scg-taxonomy](/help/8/programs/anvi-run-scg-taxonomy).
+
+These annotations are derived from the [GTDB](https://gtdb.ecogenomic.org/), which covers only bacteria and archaea, so they are not available for eukaryotic genomes.
+
+This information allows you to quickly estimate the taxonomy of genomes, metagenomes, or bins stored in your contigs-db by using the command [anvi-estimate-scg-taxonomy](/help/8/programs/anvi-estimate-scg-taxonomy).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/scgs-taxonomy.md) to update this information.
+
diff --git a/help/8/artifacts/seeds-non-specific-txt/index.md b/help/8/artifacts/seeds-non-specific-txt/index.md
new file mode 100644
index 00000000..0319550c
--- /dev/null
+++ b/help/8/artifacts/seeds-non-specific-txt/index.md
@@ -0,0 +1,63 @@
+---
+layout: artifact
+title: seeds-non-specific-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/seeds-non-specific-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This tabular file contains data on the nonspecific coverages of tRNA-seq seeds.
+
+Nonspecific coverage represents reads that are not unique to a single tRNA seed. See the [trnaseq-profile-db](/help/8/artifacts/trnaseq-profile-db) artifact for a fuller explanation of specific versus nonspecific coverage. The rows and columns of this table are identical to [seeds-specific-txt](/help/8/artifacts/seeds-specific-txt) except for the type of coverage data reported in each.
+
+This file is produced by [anvi-tabulate-trnaseq](/help/8/programs/anvi-tabulate-trnaseq). The artifact for that program describes this and related tables in detail.
+
+This tab-delimited file can be easily manipulated by the user. It is optional input for [anvi-plot-trnaseq](/help/8/programs/anvi-plot-trnaseq).
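+
+Since this table shares its rows and columns with [seeds-specific-txt](/help/8/artifacts/seeds-specific-txt), the two can be paired per seed and sample. The sketch below assumes both files are TAB-delimited with the `gene_callers_id`, `sample_name`, and `mean_coverage` columns shown in the example tables; the helper `mean_coverages` is hypothetical, and the inline data copies only a few values from those tables.
+
```python
import csv
import io

# Columns that together identify one seed in one sample.
KEY = ("gene_callers_id", "sample_name")


def mean_coverages(handle):
    """Return {(gene_callers_id, sample_name): mean_coverage} for one table."""
    rows = csv.DictReader(handle, delimiter="\t")
    return {tuple(row[k] for k in KEY): float(row["mean_coverage"]) for row in rows}


# Reduced stand-ins for the two files; real files carry many more columns.
SPECIFIC = (
    "gene_callers_id\tsample_name\tmean_coverage\n"
    "0\tDB_01\t71456.2\n"
    "1\tDB_01\t142.4\n"
)
NONSPECIFIC = (
    "gene_callers_id\tsample_name\tmean_coverage\n"
    "0\tDB_01\t100372.4\n"
    "1\tDB_01\t3923.1\n"
)

specific = mean_coverages(io.StringIO(SPECIFIC))
nonspecific = mean_coverages(io.StringIO(NONSPECIFIC))

# Pair (specific, nonspecific) mean coverage for keys found in both tables.
paired = {
    key: (specific[key], nonspecific[key])
    for key in sorted(specific.keys() & nonspecific.keys())
}
print(paired)
```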
+
+## Example
+
+The seeds shown in this table are also shown in the [seeds-specific-txt](/help/8/artifacts/seeds-specific-txt) example. Modifications from these seeds are shown in the [modifications-txt](/help/8/artifacts/modifications-txt) example.
+
+| gene_callers_id | contig_name | anticodon | aa | domain | phylum | class | order | family | genus | species | taxon_percent_id | sample_name | mean_coverage | relative_mean_coverage | relative_discriminator_coverage | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 17a | 18 | 19 | 20 | 20a | 20b | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44.01 | 44.02 | 44.03 | 44.04 | 44.05 | 44.06 | 44.07 | 44.08 | 44.09 | 44.1 | 44.11 | 44.12 | 44.13 | 44.14 | 44.15 | 44.16 | 44.17 | 44.18 | 44.19 | 44.2 | 44.21 | 44.22 | 44.23 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | 72 | 73 |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+0 | c_000000684460_DB_R05_06 | TAC | Val | Bacteria | Firmicutes | Clostridia | Lachnospirales | Lachnospiraceae | | | 100 | DB_01 | 100372.4 | | 14497 | 14608 | 14815 | 14882 | 14985 | 15828 | 15854 | 15895 | 16410 | 16565 | 16840 | 16960 | 16975 | 16990 | 17490 | 18529 | 19087 | | 19683 | 21763 | 24353 | | | 24699 | 25182 | 29097 | 30476 | 30609 | 30612 | 30491 | 30125 | 29973 | 29506 | 29417 | 26259 | 31169 | 145828 | 153750 | 155936 | 156187 | 156518 | 157233 | 157226 | 157178 | 157429 | 158124 | 159941 | 164453 | 167924 | 170595 | 170567 | | | | | | | | | | | | | | | | | | | 170572 | 170577 | 170547 | 170541 | 170326 | 169581 | 169509 | 169509 | 169497 | 169497 | 169494 | 169491 | 168727 | 168721 | 168719 | 168719 | 168719 | 168719 | 168719 | 168719 | 168489 | 168485 | 168480 | 167688 | 155628 |
+0 | c_000000684460_DB_R05_06 | TAC | Val | Bacteria | Firmicutes | Clostridia | Lachnospirales | Lachnospiraceae | | | 100 | DB_03 | 203816.6 | | 10498 | 10599 | 11105 | 11217 | 11255 | 11270 | 11350 | 11444 | 12349 | 12539 | 13028 | 13331 | 13337 | 13390 | 14325 | 15079 | 15603 | | 15769 | 18168 | 21167 | | | 23927 | 24910 | 27041 | 28271 | 28395 | 28604 | 28612 | 28749 | 29242 | 30775 | 32254 | 33570 | 44299 | 319895 | 335328 | 337653 | 341382 | 342776 | 344543 | 345794 | 345762 | 345808 | 346281 | 347647 | 354052 | 360294 | 361948 | 361948 | | | | | | | | | | | | | | | | | | | 362317 | 362317 | 362303 | 362303 | 362226 | 362056 | 362056 | 362056 | 362056 | 362050 | 362046 | 362044 | 362044 | 362044 | 362044 | 362044 | 362032 | 362027 | 362027 | 362027 | 361349 | 361349 | 361349 | 360094 | 345770 |
+0 | c_000000684460_DB_R05_06 | TAC | Val | Bacteria | Firmicutes | Clostridia | Lachnospirales | Lachnospiraceae | | | 100 | DB_05 | 26137.9 | | 5111 | 5184 | 5259 | 5704 | 5979 | 5979 | 5979 | 6011 | 6550 | 6587 | 6587 | 6611 | 6611 | 6611 | 6936 | 7087 | 7090 | | 7170 | 8243 | 9158 | | | 9488 | 9868 | 12268 | 12323 | 12866 | 12616 | 12506 | 12640 | 12630 | 12838 | 12292 | 11336 | 11621 | 36476 | 37479 | 38030 | 38892 | 39018 | 39018 | 39084 | 39068 | 39272 | 39272 | 39272 | 40666 | 41573 | 41879 | 41873 | | | | | | | | | | | | | | | | | | | 42019 | 42019 | 42015 | 42015 | 41996 | 41873 | 41857 | 41695 | 41689 | 41689 | 41689 | 41689 | 41683 | 41683 | 41683 | 41569 | 41569 | 40839 | 40839 | 40839 | 40839 | 40538 | 40495 | 40464 | 36174 |
+0 | c_000000684460_DB_R05_06 | TAC | Val | Bacteria | Firmicutes | Clostridia | Lachnospirales | Lachnospiraceae | | | 100 | DB_07 | 182536.6 | | 16048 | 16134 | 16358 | 16639 | 16664 | 16679 | 16679 | 16757 | 17351 | 17494 | 17494 | 17547 | 17613 | 17737 | 18346 | 18771 | 18776 | | 19172 | 20831 | 21986 | | | 22549 | 22933 | 25352 | 26246 | 26370 | 25534 | 25798 | 25385 | 25277 | 25529 | 25758 | 25079 | 33202 | 289207 | 300224 | 303731 | 306231 | 306774 | 307692 | 307695 | 307604 | 307604 | 307828 | 308578 | 314195 | 317240 | 321017 | 321023 | | | | | | | | | | | | | | | | | | | 321634 | 321634 | 321615 | 321615 | 321596 | 321066 | 321066 | 321066 | 321066 | 321066 | 321066 | 321066 | 321066 | 321066 | 321066 | 321066 | 321066 | 321066 | 321059 | 321059 | 320379 | 320373 | 320363 | 320230 | 303028 |
+1 | c_000000805276_DB_R05_05 | ACG | Arg | Bacteria | Firmicutes | | | | | | 98.649 | DB_01 | 3923.1 | | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 33 | 33 | 33 | 33 | 33 | 33 | 33 | | 33 | 50 | 79 | 82 | | 82 | 82 | 82 | 82 | 82 | 82 | 82 | 90 | 90 | 90 | 96 | 109 | 118 | 117 | 125 | 163 | 320 | 7564 | 7948 | 7970 | 7991 | 7991 | 7991 | 8014 | 8014 | 8016 | 8041 | 8041 | | | | | | | | | | | | | | | | | | | 8041 | 8046 | 8046 | 8046 | 8046 | 8046 | 8046 | 8046 | 8046 | 8046 | 8046 | 8046 | 8046 | 8046 | 8046 | 8046 | 8046 | 8046 | 8022 | 8022 | 8022 | 8021 | 7961 | 7955 | 7213 |
+1 | c_000000805276_DB_R05_05 | ACG | Arg | Bacteria | Firmicutes | | | | | | 98.649 | DB_03 | 8502 | | 27 | 27 | 27 | 29 | 29 | 31 | 31 | 43 | 49 | 59 | 59 | 59 | 59 | 59 | 59 | 59 | 59 | | 59 | 60 | 65 | 65 | | 65 | 65 | 65 | 65 | 65 | 65 | 65 | 72 | 72 | 78 | 78 | 92 | 116 | 140 | 173 | 236 | 679 | 16238 | 17144 | 17344 | 17428 | 17428 | 17452 | 17482 | 17482 | 17482 | 17482 | 17482 | | | | | | | | | | | | | | | | | | | 17482 | 17482 | 17482 | 17482 | 17482 | 17482 | 17480 | 17480 | 17480 | 17480 | 17480 | 17480 | 17480 | 17480 | 17480 | 17480 | 17480 | 17480 | 17480 | 17480 | 17480 | 17480 | 17411 | 17320 | 16198 |
+1 | c_000000805276_DB_R05_05 | ACG | Arg | Bacteria | Firmicutes | | | | | | 98.649 | DB_05 | 1254.6 | | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | | 0 | 4 | 18 | 18 | | 18 | 18 | 18 | 18 | 26 | 26 | 26 | 32 | 32 | 32 | 32 | 40 | 45 | 45 | 60 | 62 | 89 | 2379 | 2492 | 2502 | 2562 | 2594 | 2604 | 2604 | 2604 | 2604 | 2604 | 2604 | | | | | | | | | | | | | | | | | | | 2604 | 2604 | 2604 | 2604 | 2604 | 2604 | 2604 | 2604 | 2604 | 2604 | 2604 | 2604 | 2604 | 2604 | 2604 | 2604 | 2604 | 2604 | 2557 | 2557 | 2557 | 2557 | 2446 | 2426 | 2058 |
+1 | c_000000805276_DB_R05_05 | ACG | Arg | Bacteria | Firmicutes | | | | | | 98.649 | DB_07 | 4217.8 | | 60 | 60 | 60 | 60 | 60 | 60 | 60 | 64 | 72 | 78 | 78 | 78 | 78 | 78 | 78 | 78 | 78 | | 78 | 90 | 107 | 107 | | 107 | 113 | 113 | 113 | 113 | 119 | 119 | 117 | 117 | 117 | 120 | 120 | 132 | 140 | 172 | 212 | 410 | 7980 | 8475 | 8539 | 8553 | 8584 | 8594 | 8594 | 8594 | 8594 | 8599 | 8599 | | | | | | | | | | | | | | | | | | | 8599 | 8599 | 8599 | 8599 | 8599 | 8599 | 8599 | 8599 | 8599 | 8599 | 8599 | 8599 | 8599 | 8599 | 8599 | 8599 | 8599 | 8599 | 8599 | 8599 | 8599 | 8599 | 8546 | 8546 | 8124 |
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/seeds-non-specific-txt.md) to update this information.
+
diff --git a/help/8/artifacts/seeds-specific-txt/index.md b/help/8/artifacts/seeds-specific-txt/index.md
new file mode 100644
index 00000000..b2636e30
--- /dev/null
+++ b/help/8/artifacts/seeds-specific-txt/index.md
@@ -0,0 +1,63 @@
+---
+layout: artifact
+title: seeds-specific-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/seeds-specific-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This tabular file contains data on the specific coverages of tRNA-seq seeds.
+
+Specific coverage represents reads that are assigned uniquely to a tRNA seed. See the [trnaseq-profile-db](/help/8/artifacts/trnaseq-profile-db) artifact for a fuller explanation of specific versus nonspecific coverage. The rows and columns of this table are identical to [seeds-non-specific-txt](/help/8/artifacts/seeds-non-specific-txt) except for the type of coverage data reported in each.
+
+This file is produced by [anvi-tabulate-trnaseq](/help/8/programs/anvi-tabulate-trnaseq). The artifact for that program describes this and related tables in detail.
+
+This tab-delimited file can be easily manipulated by the user. It is required input for [anvi-plot-trnaseq](/help/8/programs/anvi-plot-trnaseq).
+
+## Example
+
+The seeds shown in this table are also shown in the [seeds-non-specific-txt](/help/8/artifacts/seeds-non-specific-txt) example. Modifications from these seeds are shown in the [modifications-txt](/help/8/artifacts/modifications-txt) example.
+
+| gene_callers_id | contig_name | anticodon | aa | domain | phylum | class | order | family | genus | species | taxon_percent_id | sample_name | mean_coverage | relative_mean_coverage | relative_discriminator_coverage | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 17a | 18 | 19 | 20 | 20a | 20b | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44.01 | 44.02 | 44.03 | 44.04 | 44.05 | 44.06 | 44.07 | 44.08 | 44.09 | 44.1 | 44.11 | 44.12 | 44.13 | 44.14 | 44.15 | 44.16 | 44.17 | 44.18 | 44.19 | 44.2 | 44.21 | 44.22 | 44.23 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | 72 | 73 |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+0 | c_000000684460_DB_R05_06 | TAC | Val | Bacteria | Firmicutes | Clostridia | Lachnospirales | Lachnospiraceae | | | 100 | DB_01 | 71456.2 | 0.25805309 | 0.2578848 | | 71392 | 71398 | 71398 | 71400 | 71413 | 71414 | 71414 | 71414 | 71419 | 71425 | 71426 | 71426 | 71426 | 71437 | 71445 | 71451 | 71451 | | 71451 | 71455 | 71457 | | | 71460 | 71467 | 71467 | 71471 | 71468 | 71470 | 71470 | 71470 | 71475 | 71489 | 71540 | 71534 | 71529 | 71606 | 71579 | 71582 | 71583 | 71586 | 71586 | 71583 | 71583 | 71583 | 71585 | 71584 | 71584 | 71584 | 71587 | 71587 | | | | | | | | | | | | | | | | | | | 71586 | 71586 | 71572 | 71572 | 71570 | 71504 | 71505 | 71504 | 71503 | 71503 | 71503 | 71503 | 71503 | 71503 | 71503 | 71503 | 71502 | 71500 | 71500 | 71500 | 71497 | 71496 | 71488 | 71466 | 68324 |
+0 | c_000000684460_DB_R05_06 | TAC | Val | Bacteria | Firmicutes | Clostridia | Lachnospirales | Lachnospiraceae | | | 100 | DB_03 | 87746.6 | 0.22523292 | 0.22512637 | | 87568 | 87582 | 87588 | 87594 | 87598 | 87599 | 87596 | 87597 | 87598 | 87606 | 87608 | 87611 | 87611 | 87611 | 87612 | 87615 | 87615 | | 87614 | 87614 | 87616 | | | 87617 | 87619 | 87619 | 87620 | 87620 | 87620 | 87621 | 87619 | 87620 | 87630 | 87689 | 87683 | 87694 | 87732 | 87898 | 87912 | 87921 | 87925 | 87926 | 87926 | 87926 | 87931 | 87924 | 87925 | 87929 | 87981 | 87986 | 87986 | | | | | | | | | | | | | | | | | | | 87985 | 87985 | 87981 | 87980 | 87978 | 87952 | 87952 | 87951 | 87952 | 87954 | 87952 | 87950 | 87946 | 87946 | 87946 | 87943 | 87941 | 87939 | 87939 | 87939 | 87936 | 87935 | 87929 | 87909 | 84530 |
+0 | c_000000684460_DB_R05_06 | TAC | Val | Bacteria | Firmicutes | Clostridia | Lachnospirales | Lachnospiraceae | | | 100 | DB_05 | 29533 | 0.22692849 | 0.22190602 | | 29490 | 29494 | 29499 | 29509 | 29516 | 29516 | 29518 | 29521 | 29525 | 29528 | 29536 | 29536 | 29536 | 29538 | 29539 | 29547 | 29550 | | 29550 | 29549 | 29553 | | | 29552 | 29552 | 29553 | 29552 | 29553 | 29553 | 29549 | 29550 | 29552 | 29553 | 29564 | 29561 | 29559 | 29638 | 29605 | 29602 | 29604 | 29607 | 29607 | 29607 | 29608 | 29607 | 29607 | 29607 | 29607 | 29607 | 29609 | 29609 | | | | | | | | | | | | | | | | | | | 29609 | 29609 | 29606 | 29606 | 29605 | 29601 | 29601 | 29601 | 29601 | 29601 | 29599 | 29599 | 29599 | 29599 | 29596 | 29596 | 29597 | 29595 | 29595 | 29594 | 29593 | 29504 | 29486 | 29451 | 26980 |
+0 | c_000000684460_DB_R05_06 | TAC | Val | Bacteria | Firmicutes | Clostridia | Lachnospirales | Lachnospiraceae | | | 100 | DB_07 | 47065 | 0.181019 | 0.18087983 | | 47078 | 47087 | 47105 | 47113 | 47114 | 47114 | 47114 | 47116 | 47124 | 47127 | 47129 | 47129 | 47131 | 47133 | 47139 | 47139 | 47139 | | 47135 | 47138 | 47141 | | | 47145 | 47142 | 47147 | 47145 | 47145 | 47142 | 47143 | 47135 | 47134 | 47133 | 47154 | 47125 | 47093 | 47111 | 47051 | 47058 | 47067 | 47071 | 47073 | 47072 | 47069 | 47069 | 47069 | 47072 | 47072 | 47073 | 47073 | 47074 | | | | | | | | | | | | | | | | | | | 47074 | 47074 | 47073 | 47069 | 47069 | 47042 | 47043 | 47042 | 47038 | 47042 | 47042 | 47042 | 47042 | 47041 | 47041 | 47040 | 47040 | 47040 | 47040 | 47040 | 47033 | 47032 | 47031 | 47018 | 45352 |
+1 | c_000000805276_DB_R05_05 | ACG | Arg | Bacteria | Firmicutes | | | | | | 98.649 | DB_01 | 142.4 | 0.00051437 | 0.00052465 | | 141 | 142 | 142 | 142 | 142 | 142 | 142 | 142 | 142 | 142 | 142 | 142 | 142 | 142 | 142 | 142 | 142 | | 142 | 142 | 142 | 142 | | 142 | 142 | 142 | 142 | 142 | 142 | 142 | 142 | 142 | 142 | 142 | 142 | 141 | 141 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | | | | | | | | | | | | | | | | | | | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 143 | 139 |
+1 | c_000000805276_DB_R05_05 | ACG | Arg | Bacteria | Firmicutes | | | | | | 98.649 | DB_03 | 1006.4 | 0.00258316 | 0.00249815 | | 1010 | 1010 | 1010 | 1010 | 1011 | 1011 | 1011 | 1011 | 1011 | 1010 | 1010 | 1010 | 1010 | 1010 | 1010 | 1010 | 1011 | | 1011 | 1011 | 1014 | 1014 | | 1014 | 1014 | 1014 | 1014 | 1014 | 1014 | 1014 | 1014 | 1014 | 1014 | 1014 | 1014 | 1013 | 1012 | 1012 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | | | | | | | | | | | | | | | | | | | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 998 | 998 | 938 |
+1 | c_000000805276_DB_R05_05 | ACG | Arg | Bacteria | Firmicutes | | | | | | 98.649 | DB_05 | 297.5 | 0.00228586 | 0.00285402 | | 225 | 227 | 227 | 227 | 227 | 227 | 225 | 228 | 228 | 230 | 230 | 230 | 230 | 230 | 230 | 230 | 230 | | 230 | 230 | 230 | 230 | | 232 | 232 | 232 | 232 | 232 | 232 | 232 | 232 | 232 | 232 | 232 | 232 | 232 | 232 | 232 | 233 | 238 | 360 | 367 | 367 | 367 | 370 | 370 | 370 | 370 | 370 | 370 | 370 | | | | | | | | | | | | | | | | | | | 370 | 370 | 370 | 370 | 370 | 370 | 370 | 370 | 370 | 370 | 370 | 370 | 370 | 370 | 370 | 370 | 370 | 370 | 370 | 370 | 370 | 370 | 362 | 362 | 347 |
+1 | c_000000805276_DB_R05_05 | ACG | Arg | Bacteria | Firmicutes | | | | | | 98.649 | DB_07 | 120 | 0.00046169 | 0.00046664 | | 116 | 116 | 116 | 116 | 116 | 116 | 116 | 116 | 116 | 119 | 119 | 119 | 118 | 118 | 118 | 118 | 118 | | 119 | 122 | 126 | 126 | | 126 | 126 | 126 | 126 | 126 | 126 | 126 | 126 | 126 | 126 | 126 | 124 | 124 | 122 | 123 | 120 | 120 | 120 | 120 | 119 | 119 | 119 | 119 | 119 | 119 | 119 | 119 | 119 | | | | | | | | | | | | | | | | | | | 119 | 119 | 119 | 119 | 119 | 119 | 119 | 119 | 119 | 119 | 119 | 119 | 119 | 119 | 119 | 119 | 119 | 119 | 119 | 119 | 119 | 119 | 117 | 117 | 117 |
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/seeds-specific-txt.md) to update this information.
+
diff --git a/help/8/artifacts/short-reads-fasta/index.md b/help/8/artifacts/short-reads-fasta/index.md
new file mode 100644
index 00000000..57aa69dc
--- /dev/null
+++ b/help/8/artifacts/short-reads-fasta/index.md
@@ -0,0 +1,42 @@
+---
+layout: artifact
+title: short-reads-fasta
+excerpt: A FASTA-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/short-reads-fasta
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A FASTA-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-get-short-reads-from-bam](../../programs/anvi-get-short-reads-from-bam) [anvi-get-short-reads-mapping-to-a-gene](../../programs/anvi-get-short-reads-mapping-to-a-gene) [anvi-search-primers](../../programs/anvi-search-primers) [anvi-script-gen-short-reads](../../programs/anvi-script-gen-short-reads)
+
+
+## Required or used by
+
+
+[anvi-script-gen-pseudo-paired-reads-from-fastq](../../programs/anvi-script-gen-pseudo-paired-reads-from-fastq)
+
+
+## Description
+
+A [fasta](/help/8/artifacts/fasta) file that contains short (and typically unassembled) metagenomic reads.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/short-reads-fasta.md) to update this information.
+
diff --git a/help/8/artifacts/single-profile-db/index.md b/help/8/artifacts/single-profile-db/index.md
new file mode 100644
index 00000000..b9a25c05
--- /dev/null
+++ b/help/8/artifacts/single-profile-db/index.md
@@ -0,0 +1,57 @@
+---
+layout: artifact
+title: single-profile-db
+excerpt: A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/single-profile-db
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-profile](../../programs/anvi-profile)
+
+
+## Required or used by
+
+
+[anvi-import-taxonomy-for-layers](../../programs/anvi-import-taxonomy-for-layers) [anvi-interactive](../../programs/anvi-interactive) [anvi-merge](../../programs/anvi-merge) [anvi-summarize-blitz](../../programs/anvi-summarize-blitz)
+
+
+## Description
+
+An anvi'o database that contains the same information as a merged [profile-db](/help/8/artifacts/profile-db), namely **key information about the mapping of short reads *in a single sample* to your contigs.**
+
+You can think of this as an extension of a [contigs-db](/help/8/artifacts/contigs-db) that contains information about how the reads from a single one of your samples map to your contigs. If you have more than one sample, you'll probably want to use [anvi-merge](/help/8/programs/anvi-merge) to merge your databases into a merged [profile-db](/help/8/artifacts/profile-db). The vast majority of programs that use a profile database will also ask for the contigs database associated with it.
+
+A single profile database contains information about how the short reads in a single BAM file (see [bam-file](/help/8/artifacts/bam-file)) map to the contigs in a [contigs-db](/help/8/artifacts/contigs-db). Specifically, a profile database contains:
+* the coverage and abundance per nucleotide position for each contig
+* variants of various kinds (single-nucleotide, single-codon, and single-amino acid)
+* structural variants (e.g., insertions and deletions)
+
+Once created, a single profile database is almost interchangeable with a merged [profile-db](/help/8/artifacts/profile-db). The names can be a little confusing: think of a single-profile-db as a type of profile-db, since the two have only a few differences. The main differences are as follows:
+* You cannot run [anvi-cluster-contigs](/help/8/programs/anvi-cluster-contigs) or [anvi-mcg-classifier](/help/8/programs/anvi-mcg-classifier) on a single profile database, since these two programs look at the alignment data in many samples.
+* You can run [anvi-import-taxonomy-for-layers](/help/8/programs/anvi-import-taxonomy-for-layers) on a single profile database but not on a merged one.
+* You can run [anvi-merge](/help/8/programs/anvi-merge) only on single profile databases, not on merged ones.
+* You can run [anvi-report-inversions](/help/8/programs/anvi-report-inversions) only with a single profile database that was created with the inversion fetch filter.
+
+If you want to look at the contents of a single profile database, you can do so using [anvi-interactive](/help/8/programs/anvi-interactive).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/single-profile-db.md) to update this information.
+
diff --git a/help/8/artifacts/split-bins/index.md b/help/8/artifacts/split-bins/index.md
new file mode 100644
index 00000000..acaf7665
--- /dev/null
+++ b/help/8/artifacts/split-bins/index.md
@@ -0,0 +1,44 @@
+---
+layout: artifact
+title: split-bins
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/split-bins
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-split](../../programs/anvi-split)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This is the result of [anvi-split](/help/8/programs/anvi-split): self-contained anvi'o projects that contain just the contents of a single [bin](/help/8/artifacts/bin) from your original database.
+
+This describes a directory that contains either a [genomes-storage-db](/help/8/artifacts/genomes-storage-db) and [pan-db](/help/8/artifacts/pan-db) pair (if that's what you gave [anvi-split](/help/8/programs/anvi-split) as an input) or a [profile-db](/help/8/artifacts/profile-db) and [contigs-db](/help/8/artifacts/contigs-db) pair. The contigs or genomes and gene clusters described in these databases will be only those contained in the bin that the directory's name corresponds to.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/split-bins.md) to update this information.
+
diff --git a/help/8/artifacts/splits-taxonomy-txt/index.md b/help/8/artifacts/splits-taxonomy-txt/index.md
new file mode 100644
index 00000000..b92f90c5
--- /dev/null
+++ b/help/8/artifacts/splits-taxonomy-txt/index.md
@@ -0,0 +1,50 @@
+---
+layout: artifact
+title: splits-taxonomy-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/splits-taxonomy-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-export-splits-taxonomy](../../programs/anvi-export-splits-taxonomy)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This is the output of [anvi-export-splits-taxonomy](/help/8/programs/anvi-export-splits-taxonomy). It is in the same format as a [gene-taxonomy-txt](/help/8/artifacts/gene-taxonomy-txt), namely the first column identifies splits and the following columns describe the taxonomy hit associated with that split.
+
+For example:
+
+ split_id t_domain t_phylum t_class ...
+ 1 Eukarya Chordata Mammalia
+ 2 Prokarya Bacteroidetes Bacteroidia
+ ...
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/splits-taxonomy-txt.md) to update this information.
+
diff --git a/help/8/artifacts/splits-txt/index.md b/help/8/artifacts/splits-txt/index.md
new file mode 100644
index 00000000..2918b663
--- /dev/null
+++ b/help/8/artifacts/splits-txt/index.md
@@ -0,0 +1,51 @@
+---
+layout: artifact
+title: splits-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/splits-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-compute-completeness](../../programs/anvi-compute-completeness) [anvi-display-structure](../../programs/anvi-display-structure) [anvi-gen-fixation-index-matrix](../../programs/anvi-gen-fixation-index-matrix) [anvi-gen-variability-profile](../../programs/anvi-gen-variability-profile) [anvi-get-aa-counts](../../programs/anvi-get-aa-counts)
+
+
+## Description
+
+This is a **text file containing a list of splits**.
+
+This file has only one column, with one split name per line. For example:
+
+ split_name_1
+ split_name_2
+ split_name_3
+
+
+This kind of file is used when you want to focus your analysis on only a specific set of splits. Just provide one of these and the program will only look at `split_name_1`, `split_name_2`, and `split_name_3`.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/splits-txt.md) to update this information.
+
diff --git a/help/8/artifacts/state-json/index.md b/help/8/artifacts/state-json/index.md
new file mode 100644
index 00000000..73bc1346
--- /dev/null
+++ b/help/8/artifacts/state-json/index.md
@@ -0,0 +1,44 @@
+---
+layout: artifact
+title: state-json
+excerpt: A JSON-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/state-json
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A JSON-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-export-state](../../programs/anvi-export-state)
+
+
+## Required or used by
+
+
+[anvi-import-state](../../programs/anvi-import-state)
+
+
+## Description
+
+This is a JSON file that describes a [state](/help/8/artifacts/state). It is the output of [anvi-export-state](/help/8/programs/anvi-export-state) and can be imported into the interface (through a database) using [anvi-import-state](/help/8/programs/anvi-import-state).
+
+This is how you would give a state to a fellow anvi'o user. If you open the file, you'll be able to view all of the data that's contained in the state.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/state-json.md) to update this information.
+
diff --git a/help/8/artifacts/state/index.md b/help/8/artifacts/state/index.md
new file mode 100644
index 00000000..daa4e416
--- /dev/null
+++ b/help/8/artifacts/state/index.md
@@ -0,0 +1,56 @@
+---
+layout: artifact
+title: state
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/state
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-import-state](../../programs/anvi-import-state)
+
+
+## Required or used by
+
+
+[anvi-delete-state](../../programs/anvi-delete-state) [anvi-export-state](../../programs/anvi-export-state)
+
+
+## Description
+
+A state describes the configuration of the anvi'o [interactive](/help/8/artifacts/interactive) interface (i.e. the cosmetic and organizational settings that you have enabled).
+
+From the interface, the bottom section of the left panel enables you to save and load states. You also have the option to import states with [anvi-import-state](/help/8/programs/anvi-import-state) or export them with [anvi-export-state](/help/8/programs/anvi-export-state). You can also delete states you no longer need with [anvi-delete-state](/help/8/programs/anvi-delete-state).
+
+Here is the information stored in a state:
+* The current items (see [misc-data-items](/help/8/artifacts/misc-data-items)) and layers (see [misc-data-layers](/help/8/artifacts/misc-data-layers)) displayed
+    * Related information, such as the minimum and maximum values for the data displayed in each layer
+* The current items order (see [misc-data-items-order](/help/8/artifacts/misc-data-items-order)) and layers order (see [misc-data-layer-orders](/help/8/artifacts/misc-data-layer-orders))
+* The views you have available
+* Any sample groups you have
+* Various cosmetic settings, like font size, angles, dimensions, colors, whether or not labels are displayed, etc.
+ * This includes whether your display is in circles or rectangles
+
+No more having to manually set parameters like your layer order for each bin you look at! Just save a state when the interface is adjusted to your liking, and using anvi'o will be that much easier.
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/state.md) to update this information.
+
diff --git a/help/8/artifacts/structure-db/index.md b/help/8/artifacts/structure-db/index.md
new file mode 100644
index 00000000..7ffb9b1f
--- /dev/null
+++ b/help/8/artifacts/structure-db/index.md
@@ -0,0 +1,53 @@
+---
+layout: artifact
+title: structure-db
+excerpt: A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/structure-db
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-gen-structure-database](../../programs/anvi-gen-structure-database)
+
+
+## Required or used by
+
+
+[anvi-db-info](../../programs/anvi-db-info) [anvi-display-structure](../../programs/anvi-display-structure) [anvi-export-structures](../../programs/anvi-export-structures) [anvi-gen-fixation-index-matrix](../../programs/anvi-gen-fixation-index-matrix) [anvi-gen-variability-profile](../../programs/anvi-gen-variability-profile) [anvi-migrate](../../programs/anvi-migrate) [anvi-update-structure-database](../../programs/anvi-update-structure-database)
+
+
+## Description
+
+
+This database contains the protein structural data for genes in a corresponding [contigs-db](/help/8/artifacts/contigs-db) and can be generated with [anvi-gen-structure-database](/help/8/programs/anvi-gen-structure-database).
+
+
+Currently, this database is best utilized for visualizing 3D structures with [anvi-display-structure](/help/8/programs/anvi-display-structure).
+
+For more information on the structure database, see [this blog post](http://merenlab.org/2018/09/04/getting-started-with-anvio-structure/#the-structure-database).
+
+
+{:.notice}
+This artifact is currently a stub. I'm looking at you, Evan. - Evan
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/structure-db.md) to update this information.
+
diff --git a/help/8/artifacts/summary/index.md b/help/8/artifacts/summary/index.md
new file mode 100644
index 00000000..b1a2b5b2
--- /dev/null
+++ b/help/8/artifacts/summary/index.md
@@ -0,0 +1,84 @@
+---
+layout: artifact
+title: summary
+excerpt: A SUMMARY-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/summary
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A SUMMARY-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-summarize](../../programs/anvi-summarize)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This is the output of the program [anvi-summarize](/help/8/programs/anvi-summarize) and it comprehensively describes the data stored in a [contigs-db](/help/8/artifacts/contigs-db) and [profile-db](/help/8/artifacts/profile-db) pair.
+
+By default, this will be a directory called `SUMMARY` that will contain some subdirectories, a text file that summarizes your bins, and an html file that formats the data in the summary nicely.
+
+#### The bin summary
+
+By default, this is stored in a tab-delimited matrix called `bins_summary.txt`. In this matrix, the rows represent the [bin](/help/8/artifacts/bin)s in your [profile-db](/help/8/artifacts/profile-db). The columns represent the following, from left to right: the bin name, the taxon ID (if calculated), the total number of nucleotides in the bin, the total number of contigs in the bin, the N50 statistic (see the page for [anvi-display-contigs-stats](/help/8/programs/anvi-display-contigs-stats)), the GC content, and the completion and redundancy.
+
+#### Three subdirectories
+
+The subdirectories in the `SUMMARY` folder are as follows:
+
+- `bin_by_bin`: this directory contains a subdirectory for each of your [bin](/help/8/artifacts/bin)s. Each of these subdirectories contains various information about the contents of that bin. For example, you get a fasta file that contains the sequences of all of the contigs in your bin, various statistics for that bin (e.g., coverage and detection) across each of your samples in tab-delimited matrices, and fasta files that contain only the sequences of a specific taxon (e.g., only archaeal sequences).
+
+- `bins_across_samples`: this directory contains various text files, each of which describes a single statistic about your bins across all of your samples. Most of these files are tab-delimited matrices where each row represents a bin and each column describes one of your samples; each cell describes the value of a single statistic like mean coverage, relative abundance, or variability. The only files that are not formatted this way are those describing the hmm-hits in the database, which only give total counts for hmm-hits of a certain kind in your bins and don't break these results down by sample.
+
+- `misc_data_layers` or `misc_data_items`: these directories contain all of the [misc-data-items](/help/8/artifacts/misc-data-items) and [misc-data-layers](/help/8/artifacts/misc-data-layers) stored in your database pair, formatted in [misc-data-items-txt](/help/8/artifacts/misc-data-items-txt) and [misc-data-layers-txt](/help/8/artifacts/misc-data-layers-txt) files respectively.
+
+#### The HTML document
+
+When you open this file (usually with a web browser), you should see a page that looks somewhat like this:
+
+![An example of the HTML file that results from anvi-summarize.](../../images/summary_example.png)
+
+The top bar provides links to various anvi'o resources, while the large text at the top provides an overall summary of your data, including the name, size, and format of the database.
+
+Following this, basic information about your databases is listed, such as the parameters used to create the databases and information about when they were created.
+
+After this, several sections are listed:
+
+- The description of your database (which you can change with [anvi-update-db-description](/help/8/programs/anvi-update-db-description))
+
+- "Summary of Bins", which contains the information from the `bin_by_bin` subdirectory (but in a format that's a little easier on the eyes)
+
+- "Across Samples", which contains the information from the `bins_across_samples` subdirectory. Here, you can change which metric you're looking at from the tabs at the top of this section (i.e. under the "Across Samples" header but above the displayed data)
+
+- "Percent Recruitment": This is also from the `bins_across_samples` subdirectory. It describes the percent of mapped reads in each sample that mapped to splits within each bin.
+
+- "Gene Calls": lists all of the gene calls in your database by bin, including their functional annotation and coverage and detection values.
+
+- "Hits for non-single-copy gene HMM profiles": This is also from the `bins_across_samples` subdirectory. The first table displays the total number of hits in each bin, while the table underneath provides a breakdown of those HMM hits. Note that each cell in the first table is a link that leads to a fasta file that contains only the relevant sequences.
+
+- "Misc Data": contains the information from the `misc_data_layers` or `misc_data_items` subdirectories.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/summary.md) to update this information.
+
diff --git a/help/8/artifacts/svg/index.md b/help/8/artifacts/svg/index.md
new file mode 100644
index 00000000..235147c9
--- /dev/null
+++ b/help/8/artifacts/svg/index.md
@@ -0,0 +1,46 @@
+---
+layout: artifact
+title: svg
+excerpt: An SVG-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/svg
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+An SVG-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-display-contigs-stats](../../programs/anvi-display-contigs-stats) [anvi-display-pan](../../programs/anvi-display-pan) [anvi-interactive](../../programs/anvi-interactive)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+SVG stands for scalable vector graphics, which is a vector-based image format. In anvi'o, programs that give you pretty-looking outputs will also give you an SVG, so you can look at the beauty of your data without having to open anvi'o for analysis. This also makes it easier to share with others (so you don't have to use screenshots in your poster).
+
+As of now, [anvi-display-contigs-stats](/help/8/programs/anvi-display-contigs-stats), [anvi-display-pan](/help/8/programs/anvi-display-pan), and [anvi-interactive](/help/8/programs/anvi-interactive) will give you an svg output every time you click the little save button at the bottom-left corner of the settings panel in the interface. You can even customize the location of that output using the flag `--export-svg`.
+
+Take a look at [this blogpost](http://merenlab.org/2016/10/27/high-resolution-figures/) for an outline of how to get this svg file into a publication-quality figure.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/svg.md) to update this information.
+
diff --git a/help/8/artifacts/trna-taxonomy-db/index.md b/help/8/artifacts/trna-taxonomy-db/index.md
new file mode 100644
index 00000000..76dd198f
--- /dev/null
+++ b/help/8/artifacts/trna-taxonomy-db/index.md
@@ -0,0 +1,47 @@
+---
+layout: artifact
+title: trna-taxonomy-db
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/trna-taxonomy-db
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-setup-trna-taxonomy](../../programs/anvi-setup-trna-taxonomy)
+
+
+## Required or used by
+
+
+[anvi-run-trna-taxonomy](../../programs/anvi-run-trna-taxonomy)
+
+
+## Description
+
+This artifact represents the [GTDB](https://gtdb.ecogenomic.org/) data (from [Parks et al. 2018](https://doi.org/10.1038/nbt.4229)) downloaded by [anvi-setup-trna-taxonomy](/help/8/programs/anvi-setup-trna-taxonomy). This information is required to run [anvi-run-trna-taxonomy](/help/8/programs/anvi-run-trna-taxonomy) and [anvi-estimate-trna-taxonomy](/help/8/programs/anvi-estimate-trna-taxonomy).
+
+{:.notice}
+If the results from this tRNA taxonomy search end up in a paper, make sure to cite [Parks et al. 2018](https://doi.org/10.1038/nbt.4229) for their information.
+
+By default, it is stored at `anvio/data/misc/TRNA-TAXONOMY`. This directory contains a few files for each anticodon, each forming a fancy search database so that you can associate tRNA reads in your [contigs-db](/help/8/artifacts/contigs-db) with taxonomy information.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/trna-taxonomy-db.md) to update this information.
+
diff --git a/help/8/artifacts/trna-taxonomy/index.md b/help/8/artifacts/trna-taxonomy/index.md
new file mode 100644
index 00000000..759c8d96
--- /dev/null
+++ b/help/8/artifacts/trna-taxonomy/index.md
@@ -0,0 +1,46 @@
+---
+layout: artifact
+title: trna-taxonomy
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/trna-taxonomy
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-run-trna-taxonomy](../../programs/anvi-run-trna-taxonomy)
+
+
+## Required or used by
+
+
+[anvi-estimate-trna-taxonomy](../../programs/anvi-estimate-trna-taxonomy)
+
+
+## Description
+
+This contains the taxonomic annotations for each of the tRNA sequences found in your [contigs-db](/help/8/artifacts/contigs-db), which are the results of running [anvi-run-trna-taxonomy](/help/8/programs/anvi-run-trna-taxonomy).
+
+You can use this information to estimate the taxonomy of genomes, metagenomes, or collections stored in your [contigs-db](/help/8/artifacts/contigs-db) using the program [anvi-estimate-trna-taxonomy](/help/8/programs/anvi-estimate-trna-taxonomy).
+
+Recall that this information was calculated using [GTDB](https://gtdb.ecogenomic.org/), so it might not be entirely accurate for eukaryotic tRNAs.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/trna-taxonomy.md) to update this information.
+
diff --git a/help/8/artifacts/trnaseq-contigs-db/index.md b/help/8/artifacts/trnaseq-contigs-db/index.md
new file mode 100644
index 00000000..44e2944f
--- /dev/null
+++ b/help/8/artifacts/trnaseq-contigs-db/index.md
@@ -0,0 +1,56 @@
+---
+layout: artifact
+title: trnaseq-contigs-db
+excerpt: A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/trnaseq-contigs-db
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-merge-trnaseq](../../programs/anvi-merge-trnaseq)
+
+
+## Required or used by
+
+
+[anvi-plot-trnaseq](../../programs/anvi-plot-trnaseq) [anvi-tabulate-trnaseq](../../programs/anvi-tabulate-trnaseq)
+
+
+## Description
+
+A tRNA-seq contigs database is a **[contigs-db](/help/8/artifacts/contigs-db) variant containing information on tRNA transcripts identified from tRNA-seq experiments**.
+
+This database is created by the program [anvi-merge-trnaseq](/help/8/programs/anvi-merge-trnaseq), which is part of the [trnaseq-workflow](../../workflows/trnaseq/). This program also creates [trnaseq-profile-db](/help/8/artifacts/trnaseq-profile-db)s. [anvi-run-trna-taxonomy](/help/8/programs/anvi-run-trna-taxonomy) populates the tRNA-seq contigs database with taxonomic annotations.
+
+This database functions in a manner equivalent to the normal metagenomic-style contigs database. Just as normal contigs databases are associated with a normal [profile-db](/help/8/artifacts/profile-db) containing coverage-related data, tRNA-seq contigs databases are associated with [trnaseq-profile-db](/help/8/artifacts/trnaseq-profile-db)s. The name can be misleading: tRNA-seq contigs databases do not contain information on assembled contigs as such. Rather, the fundamental type of sequence reconstructed from a tRNA-seq experiment is a **tRNA seed**, representing a mature tRNA sequence (minus the 3'-CCA acceptor) found in one or more samples in the experiment. tRNA seeds are not predicted by assembly at all, but by the specialized software of [anvi-trnaseq](/help/8/programs/anvi-trnaseq) and [anvi-merge-trnaseq](/help/8/programs/anvi-merge-trnaseq).
+
+A variety of information on tRNA seeds is contained in a tRNA-seq contigs database, including structural profiles, taxonomic annotations, and user-defined bins.
+
+## Uses
+
+Tabulation of tRNA-seq data by [anvi-tabulate-trnaseq](/help/8/programs/anvi-tabulate-trnaseq) requires a tRNA-seq contigs database and a [trnaseq-profile-db](/help/8/artifacts/trnaseq-profile-db).
+
+Interactive visualization of tRNA-seq datasets in [anvi-interactive](/help/8/programs/anvi-interactive) requires this database and a [trnaseq-profile-db](/help/8/artifacts/trnaseq-profile-db).
+
+Visualization of grouped seeds by [anvi-plot-trnaseq](/help/8/programs/anvi-plot-trnaseq) requires this database in addition to files produced by [anvi-tabulate-trnaseq](/help/8/programs/anvi-tabulate-trnaseq).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/trnaseq-contigs-db.md) to update this information.
+
diff --git a/help/8/artifacts/trnaseq-db/index.md b/help/8/artifacts/trnaseq-db/index.md
new file mode 100644
index 00000000..8a50c17e
--- /dev/null
+++ b/help/8/artifacts/trnaseq-db/index.md
@@ -0,0 +1,56 @@
+---
+layout: artifact
+title: trnaseq-db
+excerpt: A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/trnaseq-db
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-trnaseq](../../programs/anvi-trnaseq)
+
+
+## Required or used by
+
+
+[anvi-merge-trnaseq](../../programs/anvi-merge-trnaseq)
+
+
+## Description
+
+A tRNA-seq database **contains information on tRNA sequences predicted from a single tRNA-seq sample**.
+
+This database is the key output of **[anvi-trnaseq](/help/8/programs/anvi-trnaseq)**. That program predicts which reads are tRNA through structural profiling, clusters tRNA reads into discrete biological sequences, and predicts the positions of nucleotide modifications.
+
+The series of steps implemented in [anvi-trnaseq](/help/8/programs/anvi-trnaseq) sequentially adds the following information to the database.
+
+* Unique sequences predicted to be tRNA, including read counts
+* Primary sequence and secondary structural features (stems and loops) predicted in each profiled tRNA
+* Unconserved nucleotides in the primary sequence that differ from expectation
+* Unpaired nucleotides in the stems
+* "Trimmed" tRNA sequences, formed from unique sequences only differing by 3' nucleotides of the CCA acceptor region and 5' nucleotides beyond the acceptor stem
+* "Normalized" tRNA sequences, formed by dereplicating trimmed tRNA sequences that are 3' fragments from incomplete reverse transcription and by mapping biological 5' and interior tRNA fragments
+* Potentially modified tRNA sequences, formed by clustering normalized tRNA sequences and retaining those clusters that differ by 3-4 nucleotides at aligned positions
+
+This database is the key input to **[anvi-merge-trnaseq](/help/8/programs/anvi-merge-trnaseq)**, which takes one or more databases comprising the samples in an experiment and generates a [trnaseq-contigs-db](/help/8/artifacts/trnaseq-contigs-db) of tRNA seed sequences and [trnaseq-profile-db](/help/8/artifacts/trnaseq-profile-db)s. These tRNA-seq variant contigs and profile databases can then be manipulated and displayed in anvi'o like normal [contigs-db](/help/8/artifacts/contigs-db)s and [profile-db](/help/8/artifacts/profile-db)s.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/trnaseq-db.md) to update this information.
+
diff --git a/help/8/artifacts/trnaseq-fasta/index.md b/help/8/artifacts/trnaseq-fasta/index.md
new file mode 100644
index 00000000..d1f4f80a
--- /dev/null
+++ b/help/8/artifacts/trnaseq-fasta/index.md
@@ -0,0 +1,46 @@
+---
+layout: artifact
+title: trnaseq-fasta
+excerpt: A FASTA-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/trnaseq-fasta
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A FASTA-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-trnaseq](../../programs/anvi-trnaseq)
+
+
+## Description
+
+A [trnaseq-fasta](/help/8/artifacts/trnaseq-fasta) is a [fasta](/help/8/artifacts/fasta) file of sequences from a single tRNA-seq sample or split that is suitable to be used by [anvi-trnaseq](/help/8/programs/anvi-trnaseq) to create a [trnaseq-db](/help/8/artifacts/trnaseq-db).
+
+Like [contigs-fasta](/help/8/artifacts/contigs-fasta) files, this file **is required to have simple deflines**. Take a look at your deflines prior to mapping, and remove anything that is not a digit, an ASCII letter, an underscore, or a dash character. The program [anvi-script-reformat-fasta](/help/8/programs/anvi-script-reformat-fasta) can do this automatically for you with the flag `--simplify-names`.
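+
+As a rough illustration of the defline rule above, the following Python sketch flags deflines that contain anything other than digits, ASCII letters, underscores, or dashes. The function names (`is_simple_defline`, `find_bad_deflines`) are invented for this example and are not part of anvi'o; in practice, [anvi-script-reformat-fasta](/help/8/programs/anvi-script-reformat-fasta) with `--simplify-names` does the actual fixing.
+
+```python
+import re
+
+# Allowed characters per the rule above: digits, ASCII letters, underscore, dash.
+SIMPLE_DEFLINE = re.compile(r"[A-Za-z0-9_-]+")
+
+
+def is_simple_defline(defline):
+    """Return True if the defline (without the leading '>') is 'simple'."""
+    return SIMPLE_DEFLINE.fullmatch(defline) is not None
+
+
+def find_bad_deflines(fasta_lines):
+    """Collect deflines that would need to be simplified (hypothetical helper)."""
+    bad = []
+    for line in fasta_lines:
+        if line.startswith(">"):
+            defline = line[1:].rstrip("\n")
+            if not is_simple_defline(defline):
+                bad.append(defline)
+    return bad
+
+
+example = [">read_001", "ACGT", ">read 002 length=75", "GGCA", ">sample-A_3", "TTGA"]
+print(find_bad_deflines(example))  # ['read 002 length=75']
+```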
+
+We recommend using [anvi-run-workflow](/help/8/programs/anvi-run-workflow) to create this file from paired-end tRNA-seq reads. The [trnaseq-workflow](../../workflows/trnaseq/) uses [illumina-utils](https://github.com/merenlab/illumina-utils) to merge FASTQ files that may contain a mixture of fully and partially overlapping reads. Given the length of tRNA, both kinds occur with barcoded reads of 100 bp or shorter. Even with 150 bp reads, there may be pre-tRNA covered by partially but not fully overlapping reads. [anvi-script-reformat-fasta](/help/8/programs/anvi-script-reformat-fasta) comes after illumina-utils in the workflow.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/trnaseq-fasta.md) to update this information.
+
diff --git a/help/8/artifacts/trnaseq-plot/index.md b/help/8/artifacts/trnaseq-plot/index.md
new file mode 100644
index 00000000..6c40bd88
--- /dev/null
+++ b/help/8/artifacts/trnaseq-plot/index.md
@@ -0,0 +1,39 @@
+---
+layout: artifact
+title: trnaseq-plot
+excerpt: A DISPLAY-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/trnaseq-plot
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DISPLAY-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-plot-trnaseq](../../programs/anvi-plot-trnaseq)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+{:.notice}
+**No one has described this artifact yet** :/ If you would like to contribute by describing it, please see previous examples [here](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts), and add a Markdown formatted file in that directory named "trnaseq-plot.md". Its contents will replace this sad text. THANK YOU!
+
diff --git a/help/8/artifacts/trnaseq-profile-db/index.md b/help/8/artifacts/trnaseq-profile-db/index.md
new file mode 100644
index 00000000..3025a084
--- /dev/null
+++ b/help/8/artifacts/trnaseq-profile-db/index.md
@@ -0,0 +1,60 @@
+---
+layout: artifact
+title: trnaseq-profile-db
+excerpt: A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/trnaseq-profile-db
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DB-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-merge-trnaseq](../../programs/anvi-merge-trnaseq)
+
+
+## Required or used by
+
+
+[anvi-tabulate-trnaseq](../../programs/anvi-tabulate-trnaseq)
+
+
+## Description
+
+A tRNA-seq profile database is a **[profile-db](/help/8/artifacts/profile-db) variant containing tRNA seed coverage information from one or more samples**.
+
+This database is created by the program [anvi-merge-trnaseq](/help/8/programs/anvi-merge-trnaseq), which is part of the [trnaseq-workflow](../../workflows/trnaseq/). This program also creates a [trnaseq-contigs-db](/help/8/artifacts/trnaseq-contigs-db).
+
+## Specific and nonspecific coverage
+
+The coverage of tRNA seeds by tRNA-seq reads is determined differently than the coverage of contigs by metagenomic reads. Metagenomic contigs are constructed by an assembly tool and reads are assigned to contigs by a mapping tool. Reads that map to multiple contigs are randomly assigned to one contig. tRNA-seq seeds and coverages are found simultaneously by [anvi-trnaseq](/help/8/programs/anvi-trnaseq) and [anvi-merge-trnaseq](/help/8/programs/anvi-merge-trnaseq). Two types of coverage are tracked: **specific** coverage of reads unique to seeds and **nonspecific** coverage of reads in multiple seeds. tRNA-seq reads are often short fragments found in numerous tRNAs; random assignment of these reads would distort tRNA abundances and coverage patterns.
+
+Separate tRNA-seq profile databases are produced for specific and nonspecific coverages. A "combined" database containing both sets of data is produced by default for convenience, allowing specific and nonspecific coverages to be compared side-by-side in the [anvi-interactive](/help/8/programs/anvi-interactive) interface. A "summed" database of specific + nonspecific coverage can optionally be produced. The `--nonspecific-output` option of [anvi-merge-trnaseq](/help/8/programs/anvi-merge-trnaseq) controls the production of nonspecific, combined, and summed databases.
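+
+To make the distinction concrete, here is a toy Python sketch. It is an illustration only, not the algorithm used by [anvi-trnaseq](/help/8/programs/anvi-trnaseq) or [anvi-merge-trnaseq](/help/8/programs/anvi-merge-trnaseq). Each read is represented by the set of seeds it is found in: a read found in exactly one seed counts toward that seed's specific tally, while a read found in several seeds counts toward the nonspecific tally of each of them. The seed and read names are invented, and the sketch counts whole reads per seed rather than per-nucleotide coverage.
+
+```python
+from collections import Counter
+
+
+def tally_reads(read_to_seeds):
+    """Count specific and nonspecific reads per seed (toy illustration)."""
+    specific, nonspecific = Counter(), Counter()
+    for seeds in read_to_seeds.values():
+        if len(seeds) == 1:
+            # The read is unique to one seed.
+            specific.update(seeds)
+        else:
+            # The read is shared: credit every seed instead of picking one at random.
+            nonspecific.update(seeds)
+    return specific, nonspecific
+
+
+reads = {
+    "read_1": {"seed_A"},
+    "read_2": {"seed_A"},
+    "read_3": {"seed_A", "seed_B"},  # short fragment found in two seeds
+    "read_4": {"seed_B"},
+}
+specific, nonspecific = tally_reads(reads)
+summed = specific + nonspecific  # analogous in spirit to a "summed" database
+print(specific["seed_A"], nonspecific["seed_A"], summed["seed_A"])  # 2 1 3
+```
+
+Keeping the two tallies separate is what allows them to be compared side by side, or added together, after the fact.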
+
+## Modifications versus SNVs
+
+The other significant difference between a tRNA-seq profile database and a normal [profile-db](/help/8/artifacts/profile-db) is that variable nucleotides are restricted to tRNA modification positions predicted from mutation signatures. Single nucleotide variants are purposefully excluded, though they can be mistaken for modifications, especially with permissive parameterization of [anvi-merge-trnaseq](/help/8/programs/anvi-merge-trnaseq) (see that program's help page for more information).
+
+## Uses
+
+Tabulation of tRNA-seq data by [anvi-tabulate-trnaseq](/help/8/programs/anvi-tabulate-trnaseq) takes a specific and optionally nonspecific profile database in addition to a [trnaseq-contigs-db](/help/8/artifacts/trnaseq-contigs-db).
+
+Interactive visualization of tRNA-seq data in [anvi-interactive](/help/8/programs/anvi-interactive) requires a specific, nonspecific, combined, or summed profile database in addition to a [trnaseq-contigs-db](/help/8/artifacts/trnaseq-contigs-db).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/trnaseq-profile-db.md) to update this information.
+
diff --git a/help/8/artifacts/trnaseq-seed-txt/index.md b/help/8/artifacts/trnaseq-seed-txt/index.md
new file mode 100644
index 00000000..0931859b
--- /dev/null
+++ b/help/8/artifacts/trnaseq-seed-txt/index.md
@@ -0,0 +1,39 @@
+---
+layout: artifact
+title: trnaseq-seed-txt
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/trnaseq-seed-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-tabulate-trnaseq](../../programs/anvi-tabulate-trnaseq)
+
+
+## Required or used by
+
+
+[anvi-plot-trnaseq](../../programs/anvi-plot-trnaseq)
+
+
+## Description
+
+{:.notice}
+**No one has described this artifact yet** :/ If you would like to contribute by describing it, please see previous examples [here](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts), and add a Markdown formatted file in that directory named "trnaseq-seed-txt.md". Its contents will replace this sad text. THANK YOU!
+
diff --git a/help/8/artifacts/user-metabolism/index.md b/help/8/artifacts/user-metabolism/index.md
new file mode 100644
index 00000000..73b21948
--- /dev/null
+++ b/help/8/artifacts/user-metabolism/index.md
@@ -0,0 +1,56 @@
+---
+layout: artifact
+title: user-metabolism
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/user-metabolism
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-estimate-metabolism](../../programs/anvi-estimate-metabolism)
+
+
+## Required or used by
+
+
+[anvi-compute-metabolic-enrichment](../../programs/anvi-compute-metabolic-enrichment)
+
+
+## Description
+
+Output text files produced by [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) that describe the presence of **user-defined metabolic pathways** in a [contigs-db](/help/8/artifacts/contigs-db).
+
+These files have exactly the same format as those described by [kegg-metabolism](/help/8/artifacts/kegg-metabolism), but in addition to (or instead of) information on KEGG modules and KEGG Orthologs, they contain information on user-defined metabolic pathways (and their component enzymes), as described in [user-modules-data](/help/8/artifacts/user-modules-data).
+
+## How to get to this output?
+
+You should first read the page on [user-modules-data](/help/8/artifacts/user-modules-data) to learn how to define and set up your own metabolic pathways for use in anvi'o. The program that generates this output is [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism), and you should run that program with the `--user-modules` parameter to make sure the resulting text files contain the information on your user-defined metabolic pathways. There are two main ways to do it (which are also described on the [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) help page):
+
+1. To get files describing user-defined metabolic modules _in addition to_ KEGG modules, just use the `--user-modules` parameter.
+
+2. To get files describing _only_ user-defined metabolic modules (instead of KEGG stuff), use both `--user-modules` and `--only-user-modules` parameters.
+
+## What do these files look like?
+
+Check out the [kegg-metabolism](/help/8/artifacts/kegg-metabolism) page for a comprehensive description of the file formats and various options to customize them. The examples on that page show KEGG data, but the format is the same for user-defined data.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/user-metabolism.md) to update this information.
+
diff --git a/help/8/artifacts/user-modules-data/index.md b/help/8/artifacts/user-modules-data/index.md
new file mode 100644
index 00000000..a0ebe10c
--- /dev/null
+++ b/help/8/artifacts/user-modules-data/index.md
@@ -0,0 +1,192 @@
+---
+layout: artifact
+title: user-modules-data
+excerpt: A DB-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/user-modules-data
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A DB-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-setup-user-modules](../../programs/anvi-setup-user-modules) [anvi-script-gen-user-module-file](../../programs/anvi-script-gen-user-module-file)
+
+
+## Required or used by
+
+
+[anvi-estimate-metabolism](../../programs/anvi-estimate-metabolism) [anvi-setup-user-modules](../../programs/anvi-setup-user-modules)
+
+
+## Description
+
+A directory of **user-defined metabolism data**, created by the user for estimating metabolism on custom metabolic pathways. The program [anvi-setup-user-modules](/help/8/programs/anvi-setup-user-modules) takes this directory as input and creates a [modules-db](/help/8/artifacts/modules-db) out of the data within, for use by [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism).
+
+Instructions for creating this data directory and using it to estimate completeness of custom (i.e., non-KEGG) metabolic pathways can be found below.
+
+## A step-by-step guide to creating your own metabolic modules for anvi-estimate-metabolism
+
+If you want to define your own metabolic pathway so that you can estimate its completeness in genomes, MAGs, and metagenomes, follow the steps below!
+
+### 1. Find the enzymes
+
+What you need first is a list of enzyme accession numbers. For each reaction in your metabolic pathway, figure out what enzyme(s) or enzyme complexes (if any) are required to catalyze the reaction. Then, for each of these enzymes and/or components of enzyme complexes, figure out if they are present in common databases like [NCBI COG](https://www.ncbi.nlm.nih.gov/research/cog), [KEGG KOfam](https://www.genome.jp/tools/kofamkoala/), or [PFAM](http://pfam.xfam.org/). If so, mark down their accession numbers in those databases. If not, you may need to create your own HMM profile for the enzyme (and create an accession number for it).
+
+Also, think about how you will annotate each enzyme, because for each one you will need to write down its functional annotation source in the module file in step 3. Here is a short guide to common annotation sources:
+
+Enzyme comes from... | annotation program | ANNOTATION_SOURCE
+|:---|:---|:---|
+KEGG KOfam | [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) | KOfam
+NCBI COG (2020) | [anvi-run-ncbi-cogs](/help/8/programs/anvi-run-ncbi-cogs) | COG20_FUNCTION
+NCBI COG (2014) | [anvi-run-ncbi-cogs](/help/8/programs/anvi-run-ncbi-cogs) | COG14_FUNCTION
+PFAM | [anvi-run-pfams](/help/8/programs/anvi-run-pfams) | Pfam
+custom HMMs | [anvi-run-hmms](/help/8/programs/anvi-run-hmms) with `--hmm-source` and `--add-to-functions-table` parameters | name of directory given to `--hmm-source`
+other annotation strategy | [anvi-import-functions](/help/8/programs/anvi-import-functions) | source defined in input file
+
+### 2. Define the module
+
+You need to write a DEFINITION string for the module. This string should be in the style of KEGG MODULE definitions, which are described [here](https://merenlab.org/software/anvio/help/main/programs/anvi-estimate-metabolism/#what-data-is-used-for-estimation). Briefly, you will put the enzyme accessions in order of their corresponding reactions in the metabolic pathway. Different steps (reactions) in the pathway should be separated by spaces, and alternative enzymes that can catalyze the same reaction should be separated by commas. You can use parentheses to distinguish alternatives with multiple steps. For enzyme complexes, all components should be in one string, with essential components separated by '+' signs and non-essential components separated by '-' signs.
+
+### 3. Write a module file
+
+Put all the information about your metabolic pathway into a text file. The file format and types of information you need to include are discussed [here](https://merenlab.org/software/anvio/help/main/programs/anvi-setup-user-modules/#how-do-i-format-the-module-files). At minimum, you need to pick an identifier (ENTRY) and NAME for the module, include your DEFINITION string from step 2, write an ORTHOLOGY line and an ANNOTATION_SOURCE line for each enzyme and/or enzyme component, and write a CLASS string to categorize your module into its class/category/subcategory. The module file should be given the same name as the identifier in the ENTRY line, and this identifier should not be the same as any module in the KEGG database.
+
+{:.notice}
+Check out [anvi-script-gen-user-module-file](/help/8/programs/anvi-script-gen-user-module-file) for a way to automatically format your user module files.
+
+### 4. Set up the USER_MODULES.db
+
+Once you have created a module file for each metabolic pathway you are interested in, you should put these files within a folder called `modules`, within a parent directory (that can have any name you choose), as described [here](https://merenlab.org/software/anvio/help/main/programs/anvi-setup-user-modules/#input-directory-format). This parent directory is the [user-modules-data](/help/8/artifacts/user-modules-data) directory. Then you should run the program [anvi-setup-user-modules](/help/8/programs/anvi-setup-user-modules) and provide this directory to the `--user-modules` parameter. If all goes well, you will end up with a database called `USER_MODULES.db` in this folder.
+
+### 5. Annotate your contigs database(s)
+
+Before you can estimate metabolism, you will need to annotate your contigs database(s) with each annotation source that you used to define your modules. This will require running one or more annotation programs, as described in the table given for step 1 above. If you want to quickly remind yourself of which annotation sources are required for your metabolic modules, you can run [anvi-db-info](/help/8/programs/anvi-db-info) on the `USER_MODULES.db`. But don't worry - if you forget one, you will get a helpful error message telling you what you missed when you try to run [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism).
+
+Since estimation will be run on KEGG data, too, you will have to make sure you also run [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) on your database(s), if you haven't already, _UNLESS_ you are choosing to skip KEGG estimation by using the `--only-user-modules` parameter for [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism).
+
+### 6. Estimate the completeness of your pathways
+
+The last step is to run [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) and provide the [user-modules-data](/help/8/artifacts/user-modules-data) directory to the `--user-modules` parameter. This program will estimate the completeness of the metabolic modules defined in the `USER_MODULES.db`. By default, this will be in addition to the KEGG modules from the [kegg-data](/help/8/artifacts/kegg-data) directory but, as mentioned above, you can specify `--only-user-modules` to estimate only on your own data.
+
+## A toy example
+
+For this example, we will be creating a completely FAKE, biologically-nonsensical Frankenstein of a metabolic pathway. This is not anything you should be putting in your own `USER_MODULES.db`; it only exists for demonstrating the steps above, and particularly so you have a reference for how to handle the different annotation sources mentioned in step 1.
+
+First, let's select 5 different enzymes. Typically at this step you would use your biological knowledge of a real metabolic pathway to determine the specific set of enzymes catalyzing the reactions of the pathway - but for this toy example, we're going to use random enzymes that come from a variety of annotation sources, not enzymes that actually work together biologically in a real cell. So we'll go with a couple of KOfams, a COG, a PFAM, and a TIGRFAM (to demonstrate the 'other' annotation strategy). Here is the list of accessions: K01657, K01658, COG1362, PF06603.14, and TIGR01709.2.
+
+It doesn't matter what they are or what they do. What matters is that we will learn how to annotate each one. So let's talk about their annotation sources. K01657 and K01658 will both come from `KOfam`, and COG1362 will come from the 2020 distribution of the COGs database, so its source will be `COG20_FUNCTION`. PF06603.14 is a PFAM, so it _could_ come from the `Pfam` source. But let's suppose we don't want to waste our precious computational resources on running [anvi-run-pfams](/help/8/programs/anvi-run-pfams) when we are only interested in one enzyme from this database. Instead, we'll make a custom HMM profile for this particular enzyme by following the directions on [creating HMM sources from ad hoc PFAM accessions](https://merenlab.org/software/anvio/help/main/artifacts/hmm-source/#creating-anvio-hmm-sources-from-ad-hoc-pfam-accessions), and then we will annotate it using [anvi-run-hmms](/help/8/programs/anvi-run-hmms). In this case, the annotation source will be the name of the directory we make using [anvi-script-pfam-accessions-to-hmms-directory](/help/8/programs/anvi-script-pfam-accessions-to-hmms-directory), so we need to pick a name for it - let's call it `METABOLISM_HMM`. Last but not least, what about the TIGRFAM enzyme TIGR01709.2? Anvi'o doesn't have a program for annotating TIGRFAMs, but we can annotate our gene sequences with TIGRFAM using [InterProScan](https://www.ebi.ac.uk/interpro/search/sequence/), compile the results into a [functions-txt](/help/8/artifacts/functions-txt), and import those annotations into our contigs database using [anvi-import-functions](/help/8/programs/anvi-import-functions). We'll put the source `TIGRFAM` in the [functions-txt](/help/8/artifacts/functions-txt) file.
+
+Great, so now that we have our enzymes and we know how we will annotate them, it's time for step 2 - creating the module DEFINITION string. Again, this is not going to be a biologically-realistic metabolic pathway, but an example to demonstrate the different ways of representing steps in a pathway.
+
+Let's say the first reaction in our pathway is catalyzed by an enzyme complex made up of two essential components, K01657 and K01658. We represent this step by the string "K01657+K01658" (no spaces between the components). Suppose the next part of the pathway can _either_ be one reaction catalyzed by the enzyme PF06603.14, _or_ it can be a two-step reaction in which the first reaction is catalyzed by COG1362 and the second is catalyzed by TIGR01709.2. We use a comma to separate the alternatives, and since the second option requires two different steps (that will be separated by a space), we surround the second option with parentheses to make sure both steps are considered as the alternative. It looks like this: "PF06603.14,(COG1362 TIGR01709.2)".
+
+So our full module DEFINITION string is "K01657+K01658 PF06603.14,(COG1362 TIGR01709.2)".
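+
+To check your reading of a DEFINITION string, the rules above (spaces between steps, commas between alternatives, parentheses for multi-step alternatives, '+' and '-' within complexes) can be sketched in a few lines of Python. This is a simplified yes/no check written for this page, not the completeness calculation that [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) performs. The function names are invented, non-essential ('-') components are simply ignored, and parenthesized groups inside an enzyme complex are not handled.
+
+```python
+import re
+
+
+def split_top_level(text, sep):
+    """Split text on sep, ignoring separators nested inside parentheses."""
+    parts, current, depth = [], [], 0
+    for ch in text:
+        if ch == "(":
+            depth += 1
+        elif ch == ")":
+            depth -= 1
+        if ch == sep and depth == 0:
+            parts.append("".join(current))
+            current = []
+        else:
+            current.append(ch)
+    parts.append("".join(current))
+    return [part for part in parts if part]
+
+
+def wrapped_in_parens(expr):
+    """True if the first '(' is closed by the very last character."""
+    depth = 0
+    for i, ch in enumerate(expr):
+        if ch == "(":
+            depth += 1
+        elif ch == ")":
+            depth -= 1
+            if depth == 0:
+                return i == len(expr) - 1
+    return False
+
+
+def is_satisfied(expr, present):
+    """Yes/no: are enough of the enzymes in `present` to cover every step of expr?"""
+    expr = expr.strip()
+    steps = split_top_level(expr, " ")
+    if len(steps) > 1:  # consecutive steps: all are required
+        return all(is_satisfied(step, present) for step in steps)
+    alternatives = split_top_level(expr, ",")
+    if len(alternatives) > 1:  # alternatives: any one is enough
+        return any(is_satisfied(alt, present) for alt in alternatives)
+    if expr.startswith("(") and wrapped_in_parens(expr):
+        return is_satisfied(expr[1:-1], present)
+    # Enzyme complex: components introduced by '-' are non-essential and skipped.
+    components = re.findall(r"([+-]?)([^+-]+)", expr)
+    return all(acc in present for sign, acc in components if sign != "-")
+
+
+definition = "K01657+K01658 PF06603.14,(COG1362 TIGR01709.2)"
+print(split_top_level(definition, " "))
+# ['K01657+K01658', 'PF06603.14,(COG1362 TIGR01709.2)']
+print(is_satisfied(definition, {"K01657", "K01658", "COG1362", "TIGR01709.2"}))  # True
+print(is_satisfied(definition, {"K01657", "K01658", "COG1362"}))  # False
+```
+
+The last call returns `False` because the two-step alternative is missing TIGR01709.2 and the one-step alternative, PF06603.14, is absent as well.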
+
+Now we need to put this information into a module file. We'll give the module the identifier `UD0042` (UD for 'user-defined', and 42 because 42 is the answer to life, the universe, and everything), and this will also be the name of the file.
+
+Here is the module file. Any information that we didn't discuss above has been filled in to demonstrate the formatting requirements:
+```
+ENTRY UD0042
+NAME Frankenstein pathway for demo purposes
+DEFINITION K01657+K01658 PF06603.14,(COG1362 TIGR01709.2)
+ORTHOLOGY K01657 anthranilate synthase component I [EC:4.1.3.27]
+ K01658 anthranilate synthase component II [EC:4.1.3.27]
+ PF06603.14 UpxZ
+ COG1362 Aspartyl aminopeptidase
+ TIGR01709.2 type II secretion system protein GspL
+CLASS User modules; Demo set; Frankenstein metabolism
+ANNOTATION_SOURCE K01657 KOfam
+ K01658 KOfam
+ PF06603.14 METABOLISM_HMM
+ COG1362 COG20_FUNCTION
+ TIGR01709.2 TIGRFAM
+///
+```
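+
+Before setting up the database, it can be worth checking that every accession in the DEFINITION line also has an ORTHOLOGY line and an ANNOTATION_SOURCE line. The Python sketch below does this for the module file above. It assumes that keywords start in the first column and that continuation lines are indented, and its helper names (`parse_module`, `accessions`) are invented for this page. It is not the parser used by [anvi-setup-user-modules](/help/8/programs/anvi-setup-user-modules).
+
+```python
+import re
+
+MODULE_TEXT = """\
+ENTRY       UD0042
+NAME        Frankenstein pathway for demo purposes
+DEFINITION  K01657+K01658 PF06603.14,(COG1362 TIGR01709.2)
+ORTHOLOGY   K01657  anthranilate synthase component I [EC:4.1.3.27]
+            K01658  anthranilate synthase component II [EC:4.1.3.27]
+            PF06603.14  UpxZ
+            COG1362  Aspartyl aminopeptidase
+            TIGR01709.2  type II secretion system protein GspL
+CLASS       User modules; Demo set; Frankenstein metabolism
+ANNOTATION_SOURCE  K01657  KOfam
+            K01658  KOfam
+            PF06603.14  METABOLISM_HMM
+            COG1362  COG20_FUNCTION
+            TIGR01709.2  TIGRFAM
+///
+"""
+
+
+def parse_module(text):
+    """Group each data line under its keyword; indented lines continue the previous one."""
+    fields, key = {}, None
+    for line in text.splitlines():
+        if not line.strip() or line.startswith("///"):
+            continue
+        if line[0].isspace():
+            value = line.strip()  # continuation of the previous keyword
+        else:
+            key, _, value = line.partition(" ")
+            value = value.strip()
+        fields.setdefault(key, []).append(value)
+    return fields
+
+
+def accessions(fields, keyword):
+    """First whitespace-separated token of each line under a keyword."""
+    return {value.split()[0] for value in fields.get(keyword, [])}
+
+
+module = parse_module(MODULE_TEXT)
+# Everything in the DEFINITION that is not whitespace or an operator is an accession.
+defined = set(re.findall(r"[^\s,()+-]+", module["DEFINITION"][0]))
+missing_orthology = defined - accessions(module, "ORTHOLOGY")
+missing_source = defined - accessions(module, "ANNOTATION_SOURCE")
+print(missing_orthology, missing_source)  # set() set()
+```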
+
+If you were actually going to use this pathway, this is how you could create the [user-modules-data](/help/8/artifacts/user-modules-data) directory and then the `USER_MODULES.db`:
+
+
+mkdir USER_METABOLISM
+mkdir USER_METABOLISM/modules
+vi USER_METABOLISM/modules/UD0042
+\#copy the above into this file, save and quit
+anvi-setup-user-modules --user-modules USER_METABOLISM/
+
+
+You would see the following output after [anvi-setup-user-modules](/help/8/programs/anvi-setup-user-modules) completed:
+```
+Modules database .............................: A new database, USER_METABOLISM/USER_MODULES.db, has been created.
+Number of modules ............................: 1
+Number of entries ............................: 14
+Number of parsing errors (corrected) .........: 0
+Number of parsing errors (uncorrected) .......: 0
+Annotation sources required for estimation ...: COG20_FUNCTION, METABOLISM_HMM, KOfam, TIGRFAM
+```
+As expected, if we want to use this modules database for estimating completeness of our Frankenstein pathway, we would need to annotate our [contigs-db](/help/8/artifacts/contigs-db) of interest with the four annotation sources we discussed above. And that, in fact, is the next step.
+
+The first two annotation sources we discussed are easy, because we don't need to do anything besides run the designated anvi'o program for KEGG KOfams and NCBI COGs, respectively:
+
+
+[anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) -c CONTIGS.db \
+ --num-threads 4
+[anvi-run-ncbi-cogs](/help/8/programs/anvi-run-ncbi-cogs) -c CONTIGS.db \
+ --num-threads 4
+
+
+Annotating PF06603.14 requires an extra step, because we first need to create a custom HMM for this enzyme. Luckily, there is another anvi'o program to do that. We give the enzyme accession to [anvi-script-pfam-accessions-to-hmms-directory](/help/8/programs/anvi-script-pfam-accessions-to-hmms-directory), and we make sure to set the output directory name to be the same as the annotation source string that we put in the module file:
+
+
+[anvi-script-pfam-accessions-to-hmms-directory](/help/8/programs/anvi-script-pfam-accessions-to-hmms-directory) --pfam-accessions-list PF06603.14 \
+ -O METABOLISM_HMM
+[anvi-run-hmms](/help/8/programs/anvi-run-hmms) -c CONTIGS.db \
+ -H METABOLISM_HMM \
+ --add-to-functions-table \
+ --num-threads 4
+
+
+Please note that you _must_ use the `--add-to-functions-table` parameter when you use [anvi-run-hmms](/help/8/programs/anvi-run-hmms), otherwise the annotations for PF06603.14 will not be stored in the proper database table and [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) will not be able to find them later. Also, if you use the [anvi-script-pfam-accessions-to-hmms-directory](/help/8/programs/anvi-script-pfam-accessions-to-hmms-directory) program to create your custom HMM profiles, you should make sure that the accessions in the resulting `genes.txt` file match the corresponding enzyme accessions in the module file, because those are the accessions that will be put into your contigs database.
+
+Finally, to annotate TIGR01709.2 we need to take our (hypothetical) InterProScan results and convert them into a [functions-txt](/help/8/artifacts/functions-txt) file. You can visit that page for a lengthier discussion of the file format, but let's say the TIGR01709.2 annotations in that file looked like this:
+
+|gene_callers_id|source|accession|function|e_value|
+|:--|:--:|:--:|:--|:--:|
+|7|TIGRFAM|TIGR01709.2|type II secretion system protein GspL|1.5e-75|
+|23|TIGRFAM|TIGR01709.2|type II secretion system protein GspL|3.4e-20|
+
+Two things are especially critical here: the `accession` must match the accession in the module DEFINITION, ORTHOLOGY, and ANNOTATION_SOURCE lines, and the `source` must match the source string in the module ANNOTATION_SOURCE line(s).
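+
+As a minimal sketch of that check (not part of anvi'o), the Python below reads a tab-delimited functions-txt and reports rows whose `source` or `accession` would not match the module file. The helper name `find_mismatches` and the inline sample data are hypothetical, and the tab delimiter is an assumption about the file; only the column names come from the table above.
+
+```python
+import csv
+import io
+
+def find_mismatches(handle, expected_source, expected_accessions):
+    """Return (gene_callers_id, reason) tuples for rows that would not match the module file."""
+    problems = []
+    for row in csv.DictReader(handle, delimiter="\t"):
+        if row["source"] != expected_source:
+            problems.append((row["gene_callers_id"], "source"))
+        elif row["accession"] not in expected_accessions:
+            problems.append((row["gene_callers_id"], "accession"))
+    return problems
+
+# Hypothetical file contents: the second row drops the version suffix.
+sample = (
+    "gene_callers_id\tsource\taccession\tfunction\te_value\n"
+    "7\tTIGRFAM\tTIGR01709.2\ttype II secretion system protein GspL\t1.5e-75\n"
+    "23\tTIGRFAM\tTIGR01709\ttype II secretion system protein GspL\t3.4e-20\n"
+)
+print(find_mismatches(io.StringIO(sample), "TIGRFAM", {"TIGR01709.2"}))
+```
+
+Running a check like this before importing makes it easier to catch a missing version suffix or a misspelled source string.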
+
+Suppose the file is called `TIGRFAM_annotations.txt`. Then you can import those annotations, like so:
+
+
+[anvi-import-functions](/help/8/programs/anvi-import-functions) -c CONTIGS.db \
+ -i TIGRFAM_annotations.txt
+
+
+Once this is done, you are ready to estimate the pathway's completeness! Here is the command:
+
+
+[anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) -c CONTIGS.db \
+ --user-modules USER_METABOLISM/ \
+ -O frankenstein
+
+
+If you did this, the results for module UD0042 would appear at the end of the resulting 'modules' output file, after the estimation results for KEGG modules.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/user-modules-data.md) to update this information.
+
diff --git a/help/8/artifacts/variability-profile-txt/index.md b/help/8/artifacts/variability-profile-txt/index.md
new file mode 100644
index 00000000..cb463535
--- /dev/null
+++ b/help/8/artifacts/variability-profile-txt/index.md
@@ -0,0 +1,90 @@
+---
+layout: artifact
+title: variability-profile-txt
+excerpt: A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/variability-profile-txt
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-gen-variability-profile](../../programs/anvi-gen-variability-profile)
+
+
+## Required or used by
+
+
+[anvi-display-structure](../../programs/anvi-display-structure) [anvi-gen-fixation-index-matrix](../../programs/anvi-gen-fixation-index-matrix) [anvi-gen-variability-network](../../programs/anvi-gen-variability-network) [anvi-get-pn-ps-ratio](../../programs/anvi-get-pn-ps-ratio) [anvi-script-snvs-to-interactive](../../programs/anvi-script-snvs-to-interactive) [anvi-script-variability-to-vcf](../../programs/anvi-script-variability-to-vcf)
+
+
+## Description
+
+
+This artifact contains information about the SNVs, SCVs, and SAAVs found across a [profile-db](/help/8/artifacts/profile-db). Its format is thoroughly described [in this blog post](http://merenlab.org/2015/07/20/analyzing-variability/#the-output-matrix).
+
+This is generated by [anvi-gen-variability-profile](/help/8/programs/anvi-gen-variability-profile), which is also described in [that blogpost](http://merenlab.org/2015/07/20/analyzing-variability/#the-anvio-way).
+
+{:.notice}
+Unsure what SNVs, SCVs, and SAAVs are, or looking for a refresher? You can find that information [on the same blogpost](http://merenlab.org/2015/07/20/analyzing-variability/#an-intro-to-single-nucleotidecodonamino-acid-variation).
+
+In summary, [go to the blogpost](http://merenlab.org/2015/07/20/analyzing-variability/). Because the blogpost preceded this document by 5 years, most of the pertinent information that should be in here is actually over there. One day we will remedy this situation. Until then, this document serves as a quick reference for content more verbosely explained in the blog post.
+
+
+[variability-profile-txt](/help/8/artifacts/variability-profile-txt) is the output matrix for your SNVs, SCVs, or SAAVs. What you do with your [variability-profile-txt](/help/8/artifacts/variability-profile-txt) is entirely up to your discretion. We maintain the stance that this output should be as raw as possible, so that you can analyze it how you please. Attached to each SNV, SCV, and SAAV is a plethora of annotated information.
+
+
+### What kinds of information?
+
+#### SNVs
+
+For each of your SNVs, this matrix includes its position in the contig and gene, sample, coverage data, the A, C, G, and T counts, the reference and consensus nucleotides, entropy value, and more.
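+
+To make the relationship between the per-nucleotide counts and an entropy value concrete, here is a minimal Python sketch that computes Shannon entropy from the A, C, G, and T counts at one position. It is only an illustration: the function name `shannon_entropy` is hypothetical, the natural logarithm is an assumption, and the exact formula anvi'o uses for its entropy column is not stated on this page.
+
+```python
+import math
+
+def shannon_entropy(counts):
+    """Shannon entropy (natural log, an assumption) of a dict of nucleotide counts."""
+    total = sum(counts.values())
+    if total == 0:
+        return 0.0
+    entropy = 0.0
+    for n in counts.values():
+        if n > 0:
+            p = n / total
+            entropy -= p * math.log(p)
+    return entropy
+
+# A position split evenly between two nucleotides versus a fully conserved one.
+print(shannon_entropy({"A": 50, "C": 50, "G": 0, "T": 0}))
+print(shannon_entropy({"A": 100, "C": 0, "G": 0, "T": 0}))
+```
+
+A position where every read agrees has zero entropy, and the value grows as the counts spread across more nucleotides.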
+
+#### SCVs
+
+This information will only appear if you requested it when running your earlier analysis. To do this, use the flag `--profile-SCVs` when you run [anvi-profile](/help/8/programs/anvi-profile). Then, when running [anvi-gen-variability-profile](/help/8/programs/anvi-gen-variability-profile) use the flag `--engine CDN`.
+
+For each SCV, this matrix details the position, sample, coverage data, count for each of the 64 codons (AAA, AAC, ..., TTG, TTT), entropy, synonymity, etc.
+
+#### SAAVs
+
+Like the information about SCVs, this information will only appear if you requested it when running your earlier analysis. To do this, use the flag `--profile-SCVs` when you run [anvi-profile](/help/8/programs/anvi-profile) or [anvi-merge](/help/8/programs/anvi-merge). Then, when running [anvi-gen-variability-profile](/help/8/programs/anvi-gen-variability-profile) use the flag `--engine AA`.
+
+For each SAAV, this matrix details the position, sample, coverage data, count for each of the 20 amino acids (as well as the stop codon), entropy, BLOSUM62, etc.
+
+#### Structural information
+
+If you provided [anvi-gen-variability-profile](/help/8/programs/anvi-gen-variability-profile) with a [structure-db](/help/8/artifacts/structure-db), then you'll also have some additional columns in your matrices. These include structural annotations, the residue's solvent accessibility, information about bond angles, and a list of residues that are in physical contact with the residue you're looking at.
+
+
+For more information on any of this, check out [this page](http://merenlab.org/2015/07/20/analyzing-variability/#the-output-matrix), where every column in these matrices is not only listed, but explained.
+
+
+#### Additional amino acid and nucleotide data
+
+If you provided [anvi-gen-variability-profile](/help/8/programs/anvi-gen-variability-profile) with the flag `--include-additional-data` and you have any [misc-data-amino-acids](/help/8/artifacts/misc-data-amino-acids) data stored in your [contigs-db](/help/8/artifacts/contigs-db), that data will be added as additional columns to the matrix.
+
+
+{:.notice}
+This is currently only implemented for `--engine AA` and `--engine CDN`. `--include-additional-data` will not currently append [misc-data-nucleotides](/help/8/artifacts/misc-data-nucleotides) data to your matrix output when `--engine NT` is used.
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/variability-profile-txt.md) to update this information.
+
diff --git a/help/8/artifacts/variability-profile-xml/index.md b/help/8/artifacts/variability-profile-xml/index.md
new file mode 100644
index 00000000..400b0008
--- /dev/null
+++ b/help/8/artifacts/variability-profile-xml/index.md
@@ -0,0 +1,103 @@
+---
+layout: artifact
+title: variability-profile-xml
+excerpt: An XML-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/variability-profile-xml
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+An XML-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-gen-variability-network](../../programs/anvi-gen-variability-network)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+An XML formatted network file that can be read and visualized by the program Gephi.
+
+At the time of preparing this particular artifact document, an example output looked like this:
+
+``` xml
+
+
+
+ Oligotyping pipeline
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ (...)
+
+
+
+
+
+
+
+
+
+
+
+ (...)
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+```
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/variability-profile-xml.md) to update this information.
+
diff --git a/help/8/artifacts/variability-profile/index.md b/help/8/artifacts/variability-profile/index.md
new file mode 100644
index 00000000..85dc76e5
--- /dev/null
+++ b/help/8/artifacts/variability-profile/index.md
@@ -0,0 +1,89 @@
+---
+layout: artifact
+title: variability-profile
+excerpt: A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/variability-profile
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A CONCEPT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-profile](../../programs/anvi-profile)
+
+
+## Required or used by
+
+
+[anvi-gen-variability-profile](../../programs/anvi-gen-variability-profile)
+
+
+## Description
+
+As an artifact, this describes the variability information about a single sample calculated when you ran [anvi-profile](/help/8/programs/anvi-profile). To examine variability across samples, you'll want to use this information (which is stored within your [profile-db](/help/8/artifacts/profile-db)) to run [anvi-gen-variability-profile](/help/8/programs/anvi-gen-variability-profile).
+
+## Details about Variability
+
+In the context of anvi'o, variability means divergence of environmental populations from the reference used to perform metagenomic read recruitment.
+
+Here, the term "population" describes an assemblage of co-existing microbial genomes in an environment that are similar enough to map to the context of the same reference genome.
+
+The variability profile of a metagenome enables studies of [microbial population genetics with anvi'o](http://merenlab.org/2015/07/20/analyzing-variability/).
+
+There are two types of variability the program [anvi-profile](/help/8/programs/anvi-profile) can characterize and store: substitutions and indels.
+
+### Substitutions: SNVs, SCVs, SAAVs
+
+Anvi'o can make sense of single-nucleotide variants (SNVs), single-codon variants (SCVs), and single-amino acid variants (SAAVs). See [this article](http://merenlab.org/2015/07/20/analyzing-variability) for more information.
+
+You can learn the name of the table in which anvi'o stores this in a given [profile-db](/help/8/artifacts/profile-db) by running this command in your anvi'o environment:
+
+``` bash
+python -c 'import anvio.tables as t; print(t.variable_nts_table_name)'
+```
+
+This will tell you about its structure:
+
+``` bash
+python -c 'import anvio.tables as t; print(t.variable_nts_table_structure)'
+```
+
+### Indels: insertions and deletions
+
+Anvi'o can also characterize insertions and deletions found within an environment based on short-read recruitment results, and it stores them in the following table:
+
+``` bash
+python -c 'import anvio.tables as t; print(t.indels_table_name)'
+```
+
+**Notes for programmers**: The convention for the start position of an insertion is defined like so:
+
+```
+    pos_in_contig   ...0123456 7890123456
+    reference       ...CTACTAC TACTTCATGA...
+    read                TACTAC TAC
+    insertion                 └──ACTG
+```
+
+In this case, the start position of the insertion in the contig is 6. The insertion _follows_ the position it is defined by. This is opposite to IGV, in which the insertion _precedes_ the position it is defined by.
+
+For deletions, there is no such ambiguity in the start position, since the deletion starts on a reference position, not in between two reference positions.
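+
+The convention can be sketched in a few lines of Python. This is an illustration rather than anvi'o code: the function name `apply_insertion` is hypothetical, and the window of reference sequence shown in the diagram is treated as if it were the whole contig so that the position digits can be used directly as 0-based indices.
+
+```python
+def apply_insertion(reference, pos, inserted):
+    """Place `inserted` right after 0-based position `pos`, i.e. the insertion follows its position."""
+    if not 0 <= pos < len(reference):
+        raise ValueError("pos must be a valid 0-based reference position")
+    return reference[:pos + 1] + inserted + reference[pos + 1:]
+
+# Reference window from the diagram above, with the ACTG insertion at position 6.
+reference = "CTACTACTACTTCATGA"
+print(apply_insertion(reference, 6, "ACTG"))
+```
+
+Under a convention in which the insertion _precedes_ its position, the same event would instead be reported at the next position.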
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/variability-profile.md) to update this information.
+
diff --git a/help/8/artifacts/vcf/index.md b/help/8/artifacts/vcf/index.md
new file mode 100644
index 00000000..f8887d82
--- /dev/null
+++ b/help/8/artifacts/vcf/index.md
@@ -0,0 +1,59 @@
+---
+layout: artifact
+title: vcf
+excerpt: A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/vcf
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-script-variability-to-vcf](../../programs/anvi-script-variability-to-vcf)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+This represents a file in the Variant Call Format, which is a standard format for storing sequence variations like SNVs.
+
+You can convert the information in a [variability-profile-txt](/help/8/artifacts/variability-profile-txt) to [vcf](/help/8/artifacts/vcf) with the program [anvi-script-variability-to-vcf](/help/8/programs/anvi-script-variability-to-vcf).
+
+### What's in this file?
+
+For more details, you can check out the [VCF wikipedia page](https://en.wikipedia.org/wiki/Variant_Call_Format).
+
+#### Header
+
+Briefly, this file's header (marked by `##` at the beginning of each line) contains various metadata. This includes the date, link to the reference file, contig information, etc. It also specifies:
+- what information will be reported (denoted by `INFO`).
+- what additional filters will be run on each SNV (denoted by `FILTER`). For example, marking which variants are below a certain quality threshold.
+- what format to display additional data in (denoted by `FORMAT`).
+
+#### Body
+
+The body of the file contains identifying information for each variant (the chromosome, position, and ID), followed by the reference allele at that position and the alternative alleles present in your data. After these come a quality score and the additional information specified by the header.
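+
+As a rough sketch of how the header and body can be told apart when reading such a file, the Python below separates `##` metadata lines, the single `#`-prefixed column line, and the tab-delimited records. The function name `split_vcf` and the three example lines are hypothetical and only meant to illustrate the layout; the sketch assumes a well-formed file in which the column line comes before any record.
+
+```python
+def split_vcf(lines):
+    """Split VCF text lines into metadata lines, column names, and record dicts."""
+    meta, columns, records = [], None, []
+    for line in lines:
+        line = line.rstrip("\n")
+        if line.startswith("##"):
+            meta.append(line)
+        elif line.startswith("#"):
+            # The column line starts with a single '#', e.g. '#CHROM'.
+            columns = line.lstrip("#").split("\t")
+        elif line:
+            records.append(dict(zip(columns, line.split("\t"))))
+    return meta, columns, records
+
+# Hypothetical, minimal file contents.
+example = [
+    "##fileformat=VCFv4.2\n",
+    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
+    "contig_1\t42\t.\tA\tG\t.\tPASS\t.\n",
+]
+meta, columns, records = split_vcf(example)
+print(records[0]["REF"], records[0]["ALT"])
+```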
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/vcf.md) to update this information.
+
diff --git a/help/8/artifacts/view-data/index.md b/help/8/artifacts/view-data/index.md
new file mode 100644
index 00000000..f47b4ecd
--- /dev/null
+++ b/help/8/artifacts/view-data/index.md
@@ -0,0 +1,46 @@
+---
+layout: artifact
+title: view-data
+excerpt: A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported by anvi'o. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/view-data
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A TXT-type anvi'o artifact. This artifact can be generated, used, and/or exported **by anvi'o**. It can also be provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-script-gen-distribution-of-genes-in-a-bin](../../programs/anvi-script-gen-distribution-of-genes-in-a-bin) [anvi-script-transpose-matrix](../../programs/anvi-script-transpose-matrix)
+
+
+## Required or used by
+
+
+[anvi-interactive](../../programs/anvi-interactive) [anvi-matrix-to-newick](../../programs/anvi-matrix-to-newick) [anvi-script-transpose-matrix](../../programs/anvi-script-transpose-matrix)
+
+
+## Description
+
+View data refers to a matrix where each column represents a specific sample and each row describes some attribute of that sample (most often a sequence's abundance per sample).
+
+For example, in the [pangenomics tutorial](http://merenlab.org/2016/11/08/pangenomics-v2/#creating-a-quick-pangenome-with-functions), the `PROCHLORO-functions-occurrence-frequency.txt` is a view-data.
+
+You can use this to compute a distance matrix to generate a dendrogram (using [anvi-matrix-to-newick](/help/8/programs/anvi-matrix-to-newick)) or directly input it to [anvi-interactive](/help/8/programs/anvi-interactive) to visualize the distribution of your items across samples.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/view-data.md) to update this information.
+
diff --git a/help/8/artifacts/workflow-config/index.md b/help/8/artifacts/workflow-config/index.md
new file mode 100644
index 00000000..d26f0d91
--- /dev/null
+++ b/help/8/artifacts/workflow-config/index.md
@@ -0,0 +1,91 @@
+---
+layout: artifact
+title: workflow-config
+excerpt: A JSON-type anvi'o artifact. This artifact is typically provided by the user for anvi'o to import into its databases, process, and/or use.
+categories: [anvio]
+comments: false
+redirect_from: /8/workflow-config
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A JSON-type anvi'o artifact. This artifact is typically provided **by the user** for anvi'o to import into its databases, process, and/or use.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+There are no anvi'o tools that generate this artifact, which means it is most likely provided to the anvi'o ecosystem by the user.
+
+
+## Required or used by
+
+
+[anvi-migrate](../../programs/anvi-migrate) [anvi-run-workflow](../../programs/anvi-run-workflow)
+
+
+## Description
+
+A `JSON`-formatted configuration file that describes steps and parameters to be considered by an anvi'o [workflow](/help/8/artifacts/workflow).
+
+You can create a default config file for a given workflow using the following command:
+
+```
+anvi-run-workflow --workflow ANVIO-WORKFLOW \
+ --get-default-config CONFIG.json
+```
+
+Following this, the file `CONFIG.json` will contain all configurable flags and parameters set to their default value for that workflow. From there, you can edit this file to your heart's content.
+
+### What's in this file?
+
+The config file contains three types of information:
+
+1. **General parameters**, including the name of the workflow, the version of this config file, and links to the [fasta-txt](/help/8/artifacts/fasta-txt) or [samples-txt](/help/8/artifacts/samples-txt) file.
+2. **Rule specific parameters** which allow you to set the parameters on individual anvi'o programs that are run in the workflow.
+3. **Output directory names** which just tell anvi'o what to name all of the intermediate and final outputs (to help keep things organized).
+
+For example, the default config file for the [contigs workflow](../../workflows/contigs) has no rule specific parameters and looks like this:
+
+ {
+ "workflow_name": "contigs",
+ "config_version": 1,
+ "fasta_txt": "fasta.txt",
+ "output_dirs": {
+ "FASTA_DIR": "01_FASTA_contigs_workflow",
+ "CONTIGS_DIR": "02_CONTIGS_contigs_workflow",
+ "LOGS_DIR": "00_LOGS_contigs_workflow"
+ }
+ }
+
+On the other hand, the default config file for the [metagenomics workflow](../../workflows/metagenomics) is much longer, because it has a section of parameters for each rule. For example, its section on parameters for the program [anvi-gen-contigs-database](/help/8/programs/anvi-gen-contigs-database) looks like this:
+
+ "anvi_gen_contigs_database": {
+ "--project-name": "{group}",
+ "threads": 5,
+ "--description": "",
+ "--skip-gene-calling": "",
+ "--ignore-internal-stop-codons": "",
+ "--skip-mindful-splitting": "",
+ "--contigs-fasta": "",
+ "--split-length": "",
+ "--kmer-size": ""
+ },
+
+Note that the empty string `""` here means that the default parameter for the program [anvi-gen-contigs-database](/help/8/programs/anvi-gen-contigs-database) will be used.
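+
+If you want to see at a glance which parameters a config file actually overrides, a short Python sketch like the one below can help. It is not an anvi'o utility: the helper name `explicitly_set` is hypothetical, and the embedded JSON is an abbreviated copy of the fragment shown above.
+
+```python
+import json
+
+# Abbreviated copy of the rule section shown above.
+config_fragment = """
+{
+    "anvi_gen_contigs_database": {
+        "--project-name": "{group}",
+        "threads": 5,
+        "--description": "",
+        "--split-length": "",
+        "--kmer-size": ""
+    }
+}
+"""
+
+def explicitly_set(rule_params):
+    """Return only the parameters whose value is not the empty string."""
+    return {name: value for name, value in rule_params.items() if value != ""}
+
+config = json.loads(config_fragment)
+for rule, params in config.items():
+    print(rule, explicitly_set(params))
+```
+
+Everything this leaves out is left at the default of the underlying program.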
+
+For more details on the anvi'o snakemake workflows, please refer to [this tutorial](https://merenlab.org/2018/07/09/anvio-snakemake-workflows/).
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/workflow-config.md) to update this information.
+
diff --git a/help/8/artifacts/workflow/index.md b/help/8/artifacts/workflow/index.md
new file mode 100644
index 00000000..01b75c96
--- /dev/null
+++ b/help/8/artifacts/workflow/index.md
@@ -0,0 +1,54 @@
+---
+layout: artifact
+title: workflow
+excerpt: A WORKFLOW-type anvi'o artifact. This artifact is typically generated, used, and/or exported by anvi'o (and not provided by the user).
+categories: [anvio]
+comments: false
+redirect_from: /8/workflow
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+
+{% include _toc.html %}
+
+
+
+
+A WORKFLOW-type anvi'o artifact. This artifact is typically generated, used, and/or exported **by anvi'o** (and not provided by the user).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Provided by
+
+
+[anvi-run-workflow](../../programs/anvi-run-workflow)
+
+
+## Required or used by
+
+
+There are no anvi'o tools that use or require this artifact directly, which means it is most likely an end-product for the user.
+
+
+## Description
+
+A set of output files generated by [anvi-run-workflow](/help/8/programs/anvi-run-workflow) for a given anvi'o workflow. An anvi'o workflow is a set of instructions to be run via [Snakemake](https://snakemake.readthedocs.io/en/stable/).
+
+As of now, the available workflows include:
+
+* [Contigs workflow](../../workflows/contigs)
+* [Metagenomics workflow](../../workflows/metagenomics)
+* [Pangenomics workflow](../../workflows/pangenomics)
+* [Phylogenomics workflow](../../workflows/phylogenomics)
+* [tRNAseq workflow](../../workflows/trnaseq)
+* [EcoPhylo workflow](../../workflows/ecophylo)
+* [SRA-download workflow](../../workflows/sra-download)
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/artifacts/workflow.md) to update this information.
+
diff --git a/help/8/images/FMT_HMI_score_plot.png b/help/8/images/FMT_HMI_score_plot.png
new file mode 100644
index 00000000..cfbadcce
Binary files /dev/null and b/help/8/images/FMT_HMI_score_plot.png differ
diff --git a/help/8/images/M00011.png b/help/8/images/M00011.png
new file mode 100644
index 00000000..1638d4a1
Binary files /dev/null and b/help/8/images/M00011.png differ
diff --git a/help/8/images/M00018.png b/help/8/images/M00018.png
new file mode 100644
index 00000000..f59e7264
Binary files /dev/null and b/help/8/images/M00018.png differ
diff --git a/help/8/images/anvi-display-functions-01.png b/help/8/images/anvi-display-functions-01.png
new file mode 100644
index 00000000..9156209c
Binary files /dev/null and b/help/8/images/anvi-display-functions-01.png differ
diff --git a/help/8/images/anvi-display-functions-02.png b/help/8/images/anvi-display-functions-02.png
new file mode 100644
index 00000000..7fdf3e22
Binary files /dev/null and b/help/8/images/anvi-display-functions-02.png differ
diff --git a/help/8/images/anvi-display-functions-03.png b/help/8/images/anvi-display-functions-03.png
new file mode 100644
index 00000000..5754323c
Binary files /dev/null and b/help/8/images/anvi-display-functions-03.png differ
diff --git a/help/8/images/anvi-get-tlen-dist-from-bam.png b/help/8/images/anvi-get-tlen-dist-from-bam.png
new file mode 100644
index 00000000..fc487552
Binary files /dev/null and b/help/8/images/anvi-get-tlen-dist-from-bam.png differ
diff --git a/help/8/images/anvi-profile-blitz.png b/help/8/images/anvi-profile-blitz.png
new file mode 100644
index 00000000..4de7ee2a
Binary files /dev/null and b/help/8/images/anvi-profile-blitz.png differ
diff --git a/help/8/images/anvi-script-fix-homopolymer-indels-test.gif b/help/8/images/anvi-script-fix-homopolymer-indels-test.gif
new file mode 100644
index 00000000..5f14de5e
Binary files /dev/null and b/help/8/images/anvi-script-fix-homopolymer-indels-test.gif differ
diff --git a/help/8/images/authors/AstrobioMike.jpg b/help/8/images/authors/AstrobioMike.jpg
new file mode 100644
index 00000000..e67efbf7
Binary files /dev/null and b/help/8/images/authors/AstrobioMike.jpg differ
diff --git a/help/8/images/authors/FlorianTrigodet.jpg b/help/8/images/authors/FlorianTrigodet.jpg
new file mode 100644
index 00000000..c4392186
Binary files /dev/null and b/help/8/images/authors/FlorianTrigodet.jpg differ
diff --git a/help/8/images/authors/Jessica-Pan.png b/help/8/images/authors/Jessica-Pan.png
new file mode 100644
index 00000000..d45015e0
Binary files /dev/null and b/help/8/images/authors/Jessica-Pan.png differ
diff --git a/help/8/images/authors/ShaiberAlon.jpg b/help/8/images/authors/ShaiberAlon.jpg
new file mode 100644
index 00000000..5df839cc
Binary files /dev/null and b/help/8/images/authors/ShaiberAlon.jpg differ
diff --git a/help/8/images/authors/adw96.jpg b/help/8/images/authors/adw96.jpg
new file mode 100644
index 00000000..af6ad76b
Binary files /dev/null and b/help/8/images/authors/adw96.jpg differ
diff --git a/help/8/images/authors/ctitusbrown.jpg b/help/8/images/authors/ctitusbrown.jpg
new file mode 100644
index 00000000..6f009c28
Binary files /dev/null and b/help/8/images/authors/ctitusbrown.jpg differ
diff --git a/help/8/images/authors/efogarty11.jpg b/help/8/images/authors/efogarty11.jpg
new file mode 100644
index 00000000..fc0a7c31
Binary files /dev/null and b/help/8/images/authors/efogarty11.jpg differ
diff --git a/help/8/images/authors/ekiefl.jpg b/help/8/images/authors/ekiefl.jpg
new file mode 100644
index 00000000..23ee6047
Binary files /dev/null and b/help/8/images/authors/ekiefl.jpg differ
diff --git a/help/8/images/authors/ge0rges.jpg b/help/8/images/authors/ge0rges.jpg
new file mode 100644
index 00000000..fc32ef7c
Binary files /dev/null and b/help/8/images/authors/ge0rges.jpg differ
diff --git a/help/8/images/authors/isaacfink21.png b/help/8/images/authors/isaacfink21.png
new file mode 100644
index 00000000..75f50ffd
Binary files /dev/null and b/help/8/images/authors/isaacfink21.png differ
diff --git a/help/8/images/authors/ivagljiva.jpg b/help/8/images/authors/ivagljiva.jpg
new file mode 100644
index 00000000..a9ec4f3f
Binary files /dev/null and b/help/8/images/authors/ivagljiva.jpg differ
diff --git a/help/8/images/authors/mahmoudyousef98.jpg b/help/8/images/authors/mahmoudyousef98.jpg
new file mode 100644
index 00000000..ed30136c
Binary files /dev/null and b/help/8/images/authors/mahmoudyousef98.jpg differ
diff --git a/help/8/images/authors/matthewlawrenceklein.jpg b/help/8/images/authors/matthewlawrenceklein.jpg
new file mode 100644
index 00000000..6106f85d
Binary files /dev/null and b/help/8/images/authors/matthewlawrenceklein.jpg differ
diff --git a/help/8/images/authors/meren.jpg b/help/8/images/authors/meren.jpg
new file mode 100644
index 00000000..87212022
Binary files /dev/null and b/help/8/images/authors/meren.jpg differ
diff --git a/help/8/images/authors/mooreryan.jpeg b/help/8/images/authors/mooreryan.jpeg
new file mode 100644
index 00000000..598c6b77
Binary files /dev/null and b/help/8/images/authors/mooreryan.jpeg differ
diff --git a/help/8/images/authors/mschecht.jpg b/help/8/images/authors/mschecht.jpg
new file mode 100644
index 00000000..8667c5ad
Binary files /dev/null and b/help/8/images/authors/mschecht.jpg differ
diff --git a/help/8/images/authors/no-avatar.png b/help/8/images/authors/no-avatar.png
new file mode 100644
index 00000000..5d75c281
Binary files /dev/null and b/help/8/images/authors/no-avatar.png differ
diff --git a/help/8/images/authors/ozcan.jpg b/help/8/images/authors/ozcan.jpg
new file mode 100644
index 00000000..e59ab8bc
Binary files /dev/null and b/help/8/images/authors/ozcan.jpg differ
diff --git a/help/8/images/authors/qclayssen.jpg b/help/8/images/authors/qclayssen.jpg
new file mode 100644
index 00000000..4c580419
Binary files /dev/null and b/help/8/images/authors/qclayssen.jpg differ
diff --git a/help/8/images/authors/semiller10.jpg b/help/8/images/authors/semiller10.jpg
new file mode 100644
index 00000000..422d48bd
Binary files /dev/null and b/help/8/images/authors/semiller10.jpg differ
diff --git a/help/8/images/authors/vinisalazar.jpg b/help/8/images/authors/vinisalazar.jpg
new file mode 100644
index 00000000..33b98e38
Binary files /dev/null and b/help/8/images/authors/vinisalazar.jpg differ
diff --git a/help/8/images/authors/watsonar.jpg b/help/8/images/authors/watsonar.jpg
new file mode 100644
index 00000000..f3efc719
Binary files /dev/null and b/help/8/images/authors/watsonar.jpg differ
diff --git a/help/8/images/authors/youngblut.jpg b/help/8/images/authors/youngblut.jpg
new file mode 100644
index 00000000..d5c892e9
Binary files /dev/null and b/help/8/images/authors/youngblut.jpg differ
diff --git a/help/8/images/contigs-profile-db.png b/help/8/images/contigs-profile-db.png
new file mode 100644
index 00000000..77d8e3c1
Binary files /dev/null and b/help/8/images/contigs-profile-db.png differ
diff --git a/help/8/images/contigs-stats-interface-example.png b/help/8/images/contigs-stats-interface-example.png
new file mode 100644
index 00000000..e40619b8
Binary files /dev/null and b/help/8/images/contigs-stats-interface-example.png differ
diff --git a/help/8/images/display_contigs_stats_pandoc_output.png b/help/8/images/display_contigs_stats_pandoc_output.png
new file mode 100644
index 00000000..b18286b7
Binary files /dev/null and b/help/8/images/display_contigs_stats_pandoc_output.png differ
diff --git a/help/8/images/example_alignment.png b/help/8/images/example_alignment.png
new file mode 100644
index 00000000..49be1c8a
Binary files /dev/null and b/help/8/images/example_alignment.png differ
diff --git a/help/8/images/header.png b/help/8/images/header.png
new file mode 100644
index 00000000..76a79142
Binary files /dev/null and b/help/8/images/header.png differ
diff --git a/help/8/images/icons/ALL-ICONS.svg.gz b/help/8/images/icons/ALL-ICONS.svg.gz
new file mode 100644
index 00000000..72e83b1a
Binary files /dev/null and b/help/8/images/icons/ALL-ICONS.svg.gz differ
diff --git a/help/8/images/icons/BAM.png b/help/8/images/icons/BAM.png
new file mode 100644
index 00000000..5cb6f891
Binary files /dev/null and b/help/8/images/icons/BAM.png differ
diff --git a/help/8/images/icons/BIN.png b/help/8/images/icons/BIN.png
new file mode 100644
index 00000000..ef44e6c4
Binary files /dev/null and b/help/8/images/icons/BIN.png differ
diff --git a/help/8/images/icons/COLLECTION.png b/help/8/images/icons/COLLECTION.png
new file mode 100644
index 00000000..6d3666a6
Binary files /dev/null and b/help/8/images/icons/COLLECTION.png differ
diff --git a/help/8/images/icons/CONCEPT.png b/help/8/images/icons/CONCEPT.png
new file mode 100644
index 00000000..7346c2ed
Binary files /dev/null and b/help/8/images/icons/CONCEPT.png differ
diff --git a/help/8/images/icons/DATA.png b/help/8/images/icons/DATA.png
new file mode 100644
index 00000000..bb5299ff
Binary files /dev/null and b/help/8/images/icons/DATA.png differ
diff --git a/help/8/images/icons/DB.png b/help/8/images/icons/DB.png
new file mode 100644
index 00000000..f45c43c1
Binary files /dev/null and b/help/8/images/icons/DB.png differ
diff --git a/help/8/images/icons/DISPLAY.png b/help/8/images/icons/DISPLAY.png
new file mode 100644
index 00000000..07a5cc8b
Binary files /dev/null and b/help/8/images/icons/DISPLAY.png differ
diff --git a/help/8/images/icons/FASTA.png b/help/8/images/icons/FASTA.png
new file mode 100644
index 00000000..e8e0f79d
Binary files /dev/null and b/help/8/images/icons/FASTA.png differ
diff --git a/help/8/images/icons/FASTQ.png b/help/8/images/icons/FASTQ.png
new file mode 100644
index 00000000..2f12bfa6
Binary files /dev/null and b/help/8/images/icons/FASTQ.png differ
diff --git a/help/8/images/icons/HMM.png b/help/8/images/icons/HMM.png
new file mode 100644
index 00000000..7e5974b2
Binary files /dev/null and b/help/8/images/icons/HMM.png differ
diff --git a/help/8/images/icons/JSON.png b/help/8/images/icons/JSON.png
new file mode 100644
index 00000000..58e19a8c
Binary files /dev/null and b/help/8/images/icons/JSON.png differ
diff --git a/help/8/images/icons/NEWICK.png b/help/8/images/icons/NEWICK.png
new file mode 100644
index 00000000..8c1787af
Binary files /dev/null and b/help/8/images/icons/NEWICK.png differ
diff --git a/help/8/images/icons/PROGRAM.png b/help/8/images/icons/PROGRAM.png
new file mode 100644
index 00000000..ed6a1062
Binary files /dev/null and b/help/8/images/icons/PROGRAM.png differ
diff --git a/help/8/images/icons/SEQUENCE.png b/help/8/images/icons/SEQUENCE.png
new file mode 100644
index 00000000..61383a1a
Binary files /dev/null and b/help/8/images/icons/SEQUENCE.png differ
diff --git a/help/8/images/icons/STATS.png b/help/8/images/icons/STATS.png
new file mode 100644
index 00000000..4bece34a
Binary files /dev/null and b/help/8/images/icons/STATS.png differ
diff --git a/help/8/images/icons/SUMMARY.png b/help/8/images/icons/SUMMARY.png
new file mode 100644
index 00000000..366a7acd
Binary files /dev/null and b/help/8/images/icons/SUMMARY.png differ
diff --git a/help/8/images/icons/SVG.png b/help/8/images/icons/SVG.png
new file mode 100644
index 00000000..620b4b1f
Binary files /dev/null and b/help/8/images/icons/SVG.png differ
diff --git a/help/8/images/icons/TXT.png b/help/8/images/icons/TXT.png
new file mode 100644
index 00000000..ab3a6617
Binary files /dev/null and b/help/8/images/icons/TXT.png differ
diff --git a/help/8/images/icons/WORKFLOW.png b/help/8/images/icons/WORKFLOW.png
new file mode 100644
index 00000000..b0daf2d7
Binary files /dev/null and b/help/8/images/icons/WORKFLOW.png differ
diff --git a/help/8/images/icons/XML.png b/help/8/images/icons/XML.png
new file mode 100644
index 00000000..0334acbf
Binary files /dev/null and b/help/8/images/icons/XML.png differ
diff --git a/help/8/images/interactive_interface/anvio_display_template.png b/help/8/images/interactive_interface/anvio_display_template.png
new file mode 100644
index 00000000..e05ba932
Binary files /dev/null and b/help/8/images/interactive_interface/anvio_display_template.png differ
diff --git a/help/8/images/interactive_interface/interactive-mouse-panel.png b/help/8/images/interactive_interface/interactive-mouse-panel.png
new file mode 100644
index 00000000..728e0d93
Binary files /dev/null and b/help/8/images/interactive_interface/interactive-mouse-panel.png differ
diff --git a/help/8/images/interactive_interface/interactive-settings-bins-tab.png b/help/8/images/interactive_interface/interactive-settings-bins-tab.png
new file mode 100644
index 00000000..0532969d
Binary files /dev/null and b/help/8/images/interactive_interface/interactive-settings-bins-tab.png differ
diff --git a/help/8/images/interactive_interface/interactive-settings-bottom.png b/help/8/images/interactive_interface/interactive-settings-bottom.png
new file mode 100644
index 00000000..0eb08697
Binary files /dev/null and b/help/8/images/interactive_interface/interactive-settings-bottom.png differ
diff --git a/help/8/images/interactive_interface/interactive-settings-description-panel.png b/help/8/images/interactive_interface/interactive-settings-description-panel.png
new file mode 100644
index 00000000..db11874e
Binary files /dev/null and b/help/8/images/interactive_interface/interactive-settings-description-panel.png differ
diff --git a/help/8/images/interactive_interface/interactive-settings-display-additional-settings.png b/help/8/images/interactive_interface/interactive-settings-display-additional-settings.png
new file mode 100644
index 00000000..066b0cf7
Binary files /dev/null and b/help/8/images/interactive_interface/interactive-settings-display-additional-settings.png differ
diff --git a/help/8/images/interactive_interface/interactive-settings-layers-tab.png b/help/8/images/interactive_interface/interactive-settings-layers-tab.png
new file mode 100644
index 00000000..968dcd05
Binary files /dev/null and b/help/8/images/interactive_interface/interactive-settings-layers-tab.png differ
diff --git a/help/8/images/interactive_interface/interactive-settings-layers.png b/help/8/images/interactive_interface/interactive-settings-layers.png
new file mode 100644
index 00000000..f4891350
Binary files /dev/null and b/help/8/images/interactive_interface/interactive-settings-layers.png differ
diff --git a/help/8/images/interactive_interface/interactive-settings-legends-tab.png b/help/8/images/interactive_interface/interactive-settings-legends-tab.png
new file mode 100644
index 00000000..1982f5c6
Binary files /dev/null and b/help/8/images/interactive_interface/interactive-settings-legends-tab.png differ
diff --git a/help/8/images/interactive_interface/interactive-settings-panel-tabs.png b/help/8/images/interactive_interface/interactive-settings-panel-tabs.png
new file mode 100644
index 00000000..a4d03628
Binary files /dev/null and b/help/8/images/interactive_interface/interactive-settings-panel-tabs.png differ
diff --git a/help/8/images/layers_for_sequence_motifs.png b/help/8/images/layers_for_sequence_motifs.png
new file mode 100644
index 00000000..77c6c2ef
Binary files /dev/null and b/help/8/images/layers_for_sequence_motifs.png differ
diff --git a/help/8/images/metabolism_reconstruction.png b/help/8/images/metabolism_reconstruction.png
new file mode 100644
index 00000000..da0a0556
Binary files /dev/null and b/help/8/images/metabolism_reconstruction.png differ
diff --git a/help/8/images/oligotyping.jpg b/help/8/images/oligotyping.jpg
new file mode 100644
index 00000000..c2d5ca65
Binary files /dev/null and b/help/8/images/oligotyping.jpg differ
diff --git a/help/8/images/p214-w-upxz.png b/help/8/images/p214-w-upxz.png
new file mode 100644
index 00000000..95dd4100
Binary files /dev/null and b/help/8/images/p214-w-upxz.png differ
diff --git a/help/8/images/p214-wo-upxz.png b/help/8/images/p214-wo-upxz.png
new file mode 100644
index 00000000..386d3075
Binary files /dev/null and b/help/8/images/p214-wo-upxz.png differ
diff --git a/help/8/images/pathwise_vs_stepwise.png b/help/8/images/pathwise_vs_stepwise.png
new file mode 100644
index 00000000..b8645b3e
Binary files /dev/null and b/help/8/images/pathwise_vs_stepwise.png differ
diff --git a/help/8/images/summary_example.png b/help/8/images/summary_example.png
new file mode 100644
index 00000000..5f9351e6
Binary files /dev/null and b/help/8/images/summary_example.png differ
diff --git a/help/8/images/workflows/contigs/DAG-contigs.png b/help/8/images/workflows/contigs/DAG-contigs.png
new file mode 100644
index 00000000..e2081fc9
Binary files /dev/null and b/help/8/images/workflows/contigs/DAG-contigs.png differ
diff --git a/help/8/images/workflows/contigs/display-contigs.png b/help/8/images/workflows/contigs/display-contigs.png
new file mode 100644
index 00000000..4b2fc4dd
Binary files /dev/null and b/help/8/images/workflows/contigs/display-contigs.png differ
diff --git a/help/8/images/workflows/metagenomics/dag-references-mode.png b/help/8/images/workflows/metagenomics/dag-references-mode.png
new file mode 100644
index 00000000..a730d8df
Binary files /dev/null and b/help/8/images/workflows/metagenomics/dag-references-mode.png differ
diff --git a/help/8/images/workflows/metagenomics/idba_ud-all-against-all.png b/help/8/images/workflows/metagenomics/idba_ud-all-against-all.png
new file mode 100644
index 00000000..eb176e90
Binary files /dev/null and b/help/8/images/workflows/metagenomics/idba_ud-all-against-all.png differ
diff --git a/help/8/images/workflows/metagenomics/idba_ud_min_contig.png b/help/8/images/workflows/metagenomics/idba_ud_min_contig.png
new file mode 100644
index 00000000..e16d99d4
Binary files /dev/null and b/help/8/images/workflows/metagenomics/idba_ud_min_contig.png differ
diff --git a/help/8/images/workflows/metagenomics/idba_ud_workflow1.png b/help/8/images/workflows/metagenomics/idba_ud_workflow1.png
new file mode 100644
index 00000000..d84a0af7
Binary files /dev/null and b/help/8/images/workflows/metagenomics/idba_ud_workflow1.png differ
diff --git a/help/8/images/workflows/metagenomics/merged_profile_idba_ud1.png b/help/8/images/workflows/metagenomics/merged_profile_idba_ud1.png
new file mode 100644
index 00000000..59824419
Binary files /dev/null and b/help/8/images/workflows/metagenomics/merged_profile_idba_ud1.png differ
diff --git a/help/8/images/workflows/metagenomics/single_profile_idba_ud.png b/help/8/images/workflows/metagenomics/single_profile_idba_ud.png
new file mode 100644
index 00000000..278db2c4
Binary files /dev/null and b/help/8/images/workflows/metagenomics/single_profile_idba_ud.png differ
diff --git a/help/8/index.md b/help/8/index.md
new file mode 100644
index 00000000..acad8b6f
--- /dev/null
+++ b/help/8/index.md
@@ -0,0 +1,4602 @@
+---
+layout: help
+title: Help pages for anvi'o programs and artifacts
+categories: [anvio]
+comments: false
+image:
+ featurerelative: images/header.png
+ display: true
+redirect_from:
+ - /help
+---
+
+Here you will find a list of all anvi'o programs and artifacts that enable constructing workflows for integrated multi 'omics investigations.
+
+If you need an introduction to the terminology used in 'omics research or in anvi'o, please take a look at our vocabulary page. The anvi'o community is with you! If you have practical, technical, or science questions, visit this page to learn about resources available to you. If you are feeling overwhelmed, you can always scream towards the anvi'o {% include _discord_invitation_button.html %}
+
+
+
+{:.notice}
+The help contents were last updated on **27 Sep 23 12:58:43** for anvi'o version **8 (marie)**.
+
+
+{% include _project-anvio-version.html %}
+{% include _toc.html %}
+
+
+## Anvi'o workflows
+
+Anvi'o workflows are dynamic recipes for easy-to-use, scalable, and reproducible bioinformatics analyses through orchestrated use of [anvi'o programs](#anvio-programs) as well as third-party software. These workflows typically start with raw data files and a [workflow-config](artifacts/workflow-config/) and produce [anvi'o artifacts](#anvio-artifacts), which enable you to outsource rudimentary and relatively well-understood initial steps of your 'omics analyses so you can focus on more critical downstream research questions by further analyzing these data products inside or outside of the anvi'o software ecosystem.
+
+Anvi'o 8 (marie) contains 5 workflows:
+
+
+
+## Anvi'o artifacts
+
+Anvi'o artifacts represent **concepts, file types, or data types** anvi'o programs can work with. A given anvi'o artifact can be provided by the user (such as a FASTA file), produced by anvi'o (such as a profile database), or both (such as phylogenomic trees). Anvi'o artifacts link anvi'o programs to each other to build novel workflows.
+
+Listed below are **a total of 132 artifacts**.
+
+
+
+
+ | [pan-db](artifacts/pan-db) [contigs-db](artifacts/contigs-db) [trnaseq-db](artifacts/trnaseq-db) [trnaseq-contigs-db](artifacts/trnaseq-contigs-db) [trnaseq-profile-db](artifacts/trnaseq-profile-db) [modules-db](artifacts/modules-db) [structure-db](artifacts/structure-db) [pdb-db](artifacts/pdb-db) [kegg-data](artifacts/kegg-data) [user-modules-data](artifacts/user-modules-data) [reaction-ref-data](artifacts/reaction-ref-data) [single-profile-db](artifacts/single-profile-db) [profile-db](artifacts/profile-db) [genes-db](artifacts/genes-db) [genomes-storage-db](artifacts/genomes-storage-db) |
+
+
+
+ | [fasta](artifacts/fasta) [contigs-fasta](artifacts/contigs-fasta) [trnaseq-fasta](artifacts/trnaseq-fasta) [concatenated-gene-alignment-fasta](artifacts/concatenated-gene-alignment-fasta) [short-reads-fasta](artifacts/short-reads-fasta) [genes-fasta](artifacts/genes-fasta) [locus-fasta](artifacts/locus-fasta) |
+
+
+
+ | [dna-sequence](artifacts/dna-sequence) |
+
+
+
+ | [configuration-ini](artifacts/configuration-ini) [external-gene-calls](artifacts/external-gene-calls) [external-structures](artifacts/external-structures) [bam-stats-txt](artifacts/bam-stats-txt) [bams-and-profiles-txt](artifacts/bams-and-profiles-txt) [markdown-txt](artifacts/markdown-txt) [protein-structure-txt](artifacts/protein-structure-txt) [samples-txt](artifacts/samples-txt) [primers-txt](artifacts/primers-txt) [fasta-txt](artifacts/fasta-txt) [collection-txt](artifacts/collection-txt) [misc-data-items-txt](artifacts/misc-data-items-txt) [misc-data-layers-txt](artifacts/misc-data-layers-txt) [misc-data-nucleotides-txt](artifacts/misc-data-nucleotides-txt) [misc-data-amino-acids-txt](artifacts/misc-data-amino-acids-txt) [misc-data-layer-orders-txt](artifacts/misc-data-layer-orders-txt) [misc-data-items-order-txt](artifacts/misc-data-items-order-txt) [linkmers-txt](artifacts/linkmers-txt) [palindromes-txt](artifacts/palindromes-txt) [inversions-txt](artifacts/inversions-txt) [gene-calls-txt](artifacts/gene-calls-txt) [binding-frequencies-txt](artifacts/binding-frequencies-txt) [functions-txt](artifacts/functions-txt) [functional-enrichment-txt](artifacts/functional-enrichment-txt) [functions-across-genomes-txt](artifacts/functions-across-genomes-txt) [hmm-hits-across-genomes-txt](artifacts/hmm-hits-across-genomes-txt) [view-data](artifacts/view-data) [layer-taxonomy-txt](artifacts/layer-taxonomy-txt) [gene-taxonomy-txt](artifacts/gene-taxonomy-txt) [genome-taxonomy-txt](artifacts/genome-taxonomy-txt) [external-genomes](artifacts/external-genomes) [internal-genomes](artifacts/internal-genomes) [metagenomes](artifacts/metagenomes) [hmm-list](artifacts/hmm-list) [coverages-txt](artifacts/coverages-txt) [detection-txt](artifacts/detection-txt) [variability-profile-txt](artifacts/variability-profile-txt) [codon-frequencies-txt](artifacts/codon-frequencies-txt) [aa-frequencies-txt](artifacts/aa-frequencies-txt) 
[fixation-index-matrix](artifacts/fixation-index-matrix) [trnaseq-seed-txt](artifacts/trnaseq-seed-txt) [seeds-specific-txt](artifacts/seeds-specific-txt) [seeds-non-specific-txt](artifacts/seeds-non-specific-txt) [modifications-txt](artifacts/modifications-txt) [quick-summary](artifacts/quick-summary) [kegg-metabolism](artifacts/kegg-metabolism) [user-metabolism](artifacts/user-metabolism) [augustus-gene-calls](artifacts/augustus-gene-calls) [vcf](artifacts/vcf) [blast-table](artifacts/blast-table) [splits-txt](artifacts/splits-txt) [genbank-file](artifacts/genbank-file) [groups-txt](artifacts/groups-txt) [splits-taxonomy-txt](artifacts/splits-taxonomy-txt) [clustering-configuration](artifacts/clustering-configuration) [enzymes-txt](artifacts/enzymes-txt) [enzymes-list-for-module](artifacts/enzymes-list-for-module) |
+
+
+
+ | [paired-end-fastq](artifacts/paired-end-fastq) |
+
+
+
+ | [bam-file](artifacts/bam-file) [raw-bam-file](artifacts/raw-bam-file) |
+
+
+
+ | [contigs-stats](artifacts/contigs-stats) [genes-stats](artifacts/genes-stats) |
+
+
+
+ | [svg](artifacts/svg) |
+
+
+
+ | [bin](artifacts/bin) |
+
+
+
+ | [collection](artifacts/collection) |
+
+
+
+ | [hmm-source](artifacts/hmm-source) |
+
+
+
+ | [hmm-hits](artifacts/hmm-hits) [completion](artifacts/completion) [misc-data-items](artifacts/misc-data-items) [misc-data-layers](artifacts/misc-data-layers) [misc-data-nucleotides](artifacts/misc-data-nucleotides) [misc-data-amino-acids](artifacts/misc-data-amino-acids) [genome-similarity](artifacts/genome-similarity) [misc-data-layer-orders](artifacts/misc-data-layer-orders) [misc-data-items-order](artifacts/misc-data-items-order) [metapangenome](artifacts/metapangenome) [oligotypes](artifacts/oligotypes) [functions](artifacts/functions) [kegg-functions](artifacts/kegg-functions) [reaction-network](artifacts/reaction-network) [layer-taxonomy](artifacts/layer-taxonomy) [gene-taxonomy](artifacts/gene-taxonomy) [genome-taxonomy](artifacts/genome-taxonomy) [scgs-taxonomy-db](artifacts/scgs-taxonomy-db) [scgs-taxonomy](artifacts/scgs-taxonomy) [trna-taxonomy-db](artifacts/trna-taxonomy-db) [trna-taxonomy](artifacts/trna-taxonomy) [variability-profile](artifacts/variability-profile) [split-bins](artifacts/split-bins) [state](artifacts/state) [ngrams](artifacts/ngrams) [pn-ps-data](artifacts/pn-ps-data) [metabolic-independence-score](artifacts/metabolic-independence-score) |
+
+
+
+ | [cogs-data](artifacts/cogs-data) [pfams-data](artifacts/pfams-data) [cazyme-data](artifacts/cazyme-data) [interacdome-data](artifacts/interacdome-data) |
+
+
+
+ | [dendrogram](artifacts/dendrogram) [phylogeny](artifacts/phylogeny) |
+
+
+
+ | [reaction-network-json](artifacts/reaction-network-json) [state-json](artifacts/state-json) [workflow-config](artifacts/workflow-config) |
+
+
+
+ | [interactive](artifacts/interactive) [trnaseq-plot](artifacts/trnaseq-plot) [contig-inspection](artifacts/contig-inspection) [gene-cluster-inspection](artifacts/gene-cluster-inspection) |
+
+
+
+ | [variability-profile-xml](artifacts/variability-profile-xml) |
+
+
+
+ | [summary](artifacts/summary) |
+
+
+
+ | [workflow](artifacts/workflow) |
+
+
+
+
+## Anvi'o programs
+
+Anvi'o programs perform atomic tasks that can be woven together to implement complete 'omics workflows. Please note that there may be programs that are not listed on this page. You can type `anvi-` in your terminal and press the TAB key twice to see the full list of programs available on your system, and type `anvi-program-name --help` to read the full list of command line options.
+
+Listed below are **a total of 147 programs**.
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-compute-completeness](programs/anvi-compute-completeness)**. A script to generate completeness info for a given list of _splits_.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [splits-txt](artifacts/splits-txt) [hmm-source](artifacts/hmm-source)
+ |
+
+
+
+ ๐
+ n/a
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-compute-gene-cluster-homogeneity](programs/anvi-compute-gene-cluster-homogeneity)**. Compute homogeneity for gene clusters.
+ |
+
+
+
+ ๐ง
+ [pan-db](artifacts/pan-db) [genomes-storage-db](artifacts/genomes-storage-db)
+ |
+
+
+
+ ๐
+ n/a
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-delete-collection](programs/anvi-delete-collection)**. Remove a collection from a given profile database.
+ |
+
+
+
+ ๐ง
+ [profile-db](artifacts/profile-db) [collection](artifacts/collection)
+ |
+
+
+
+ ๐
+ n/a
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-delete-functions](programs/anvi-delete-functions)**. Remove functional annotation sources from an anvi'o contigs database.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [functions](artifacts/functions)
+ |
+
+
+
+ ๐
+ n/a
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-delete-hmms](programs/anvi-delete-hmms)**. Remove HMM hits from an anvi'o contigs database.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [hmm-source](artifacts/hmm-source) [hmm-hits](artifacts/hmm-hits)
+ |
+
+
+
+ ๐
+ n/a
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-delete-state](programs/anvi-delete-state)**. Delete an anvi'o state from a pan or profile database.
+ |
+
+
+
+ ๐ง
+ [pan-db](artifacts/pan-db) [profile-db](artifacts/profile-db) [state](artifacts/state)
+ |
+
+
+
+ ๐
+ n/a
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-experimental-organization](programs/anvi-experimental-organization)**. Create an experimental clustering dendrogram..
+ |
+
+
+
+ ๐ง
+ [clustering-configuration](artifacts/clustering-configuration)
+ |
+
+
+
+ ๐
+ [dendrogram](artifacts/dendrogram)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-export-collection](programs/anvi-export-collection)**. Export a collection from an anvi'o database.
+ |
+
+
+
+ ๐ง
+ [profile-db](artifacts/profile-db) [collection](artifacts/collection)
+ |
+
+
+
+ ๐
+ [collection-txt](artifacts/collection-txt)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-export-contigs](programs/anvi-export-contigs)**. Export contigs (or splits) from an anvi'o contigs database.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db)
+ |
+
+
+
+ ๐
+ [contigs-fasta](artifacts/contigs-fasta)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-export-functions](programs/anvi-export-functions)**. Export functions of genes from an anvi'o contigs database for a given annotation source.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [functions](artifacts/functions)
+ |
+
+
+
+ ๐
+ [functions-txt](artifacts/functions-txt)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-export-gene-calls](programs/anvi-export-gene-calls)**. Export gene calls from an anvi'o contigs database.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db)
+ |
+
+
+
+ ๐
+ [gene-calls-txt](artifacts/gene-calls-txt)
+ |
+
+
+
+ ๐ง
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-export-locus](programs/anvi-export-locus)**. This program helps you cut a 'locus' from a larger genetic context (e.g., contigs, genomes). By default, anvi'o will locate a user-defined anchor gene, extend its selection upstream and downstream based on the --num-genes argument, then extract the locus to create a new contigs database. The anchor gene must be provided as --search-term, --gene-caller-ids, or --hmm-sources. If --flank-mode is designated, you MUST provide TWO flanking genes that define the locus region (Please see --flank-mode help for more information). If everything goes as plan, anvi'o will give you individual locus contigs databases for every matching anchor gene found in the original contigs database provided. Enjoy your mini contigs databases!.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db)
+ |
+
+
+
+ ๐
+ [locus-fasta](artifacts/locus-fasta)
+ |
+
+
+
+ ๐ง
+
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-export-splits-taxonomy](programs/anvi-export-splits-taxonomy)**. Export taxonomy for splits found in an anvi'o contigs database.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db)
+ |
+
+
+
+ ๐
+ [splits-taxonomy-txt](artifacts/splits-taxonomy-txt)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-export-structures](programs/anvi-export-structures)**. Export .pdb structure files from a structure database.
+ |
+
+
+
+ ๐ง
+ [structure-db](artifacts/structure-db)
+ |
+
+
+
+ ๐
+ [protein-structure-txt](artifacts/protein-structure-txt)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-gen-contigs-database](programs/anvi-gen-contigs-database)**. Generate a new anvi'o contigs database.
+ |
+
+
+
+ ๐ง
+ [contigs-fasta](artifacts/contigs-fasta) [external-gene-calls](artifacts/external-gene-calls)
+ |
+
+
+
+ ๐
+ [contigs-db](artifacts/contigs-db)
+ |
+
+
+
+ ๐ง
+
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-gen-gene-consensus-sequences](programs/anvi-gen-gene-consensus-sequences)**. Collapse variability for a set of genes across samples.
+ |
+
+
+
+ ๐ง
+ [profile-db](artifacts/profile-db) [contigs-db](artifacts/contigs-db)
+ |
+
+
+
+ ๐
+ [genes-fasta](artifacts/genes-fasta)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-gen-genomes-storage](programs/anvi-gen-genomes-storage)**. Create a genome storage from internal and/or external genomes for a pangenome analysis.
+ |
+
+
+
+ ๐ง
+ [external-genomes](artifacts/external-genomes) [internal-genomes](artifacts/internal-genomes)
+ |
+
+
+
+ ๐
+ [genomes-storage-db](artifacts/genomes-storage-db)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-gen-phylogenomic-tree](programs/anvi-gen-phylogenomic-tree)**. Generate phylogenomic tree from aligment file.
+ |
+
+
+
+ ๐ง
+ [concatenated-gene-alignment-fasta](artifacts/concatenated-gene-alignment-fasta)
+ |
+
+
+
+ ๐
+ [phylogeny](artifacts/phylogeny)
+ |
+
+
+
+ ๐ง
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-gen-structure-database](programs/anvi-gen-structure-database)**. Creates a database of protein structures. Predict protein structures using template-based homology modelling of genes in your contigs database, or import pre-computed PDB structures you already have..
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [pdb-db](artifacts/pdb-db)
+ |
+
+
+
+ ๐
+ [structure-db](artifacts/structure-db)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-gen-variability-network](programs/anvi-gen-variability-network)**. Generate a network description from an anvi'o variability profile..
+ |
+
+
+
+ ๐ง
+ [variability-profile-txt](artifacts/variability-profile-txt)
+ |
+
+
+
+ ๐
+ [variability-profile-xml](artifacts/variability-profile-xml)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-get-metabolic-model-file](programs/anvi-get-metabolic-model-file)**. This program writes a metabolic reaction network to a file suitable for flux balance analysis..
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [reaction-network](artifacts/reaction-network)
+ |
+
+
+
+ ๐
+ [reaction-network-json](artifacts/reaction-network-json)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-get-pn-ps-ratio](programs/anvi-get-pn-ps-ratio)**. Calculate the rates of non-synonymous and synonymous polymorphism for genes across environmetns using the output of anvi-gen-variability-profile..
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [variability-profile-txt](artifacts/variability-profile-txt)
+ |
+
+
+
+ ๐
+ [pn-ps-data](artifacts/pn-ps-data)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-get-short-reads-mapping-to-a-gene](programs/anvi-get-short-reads-mapping-to-a-gene)**. Recover short reads from BAM files that were mapped to genes you are interested in. It is possible to work with a single gene call, or a bunch of them. Similarly, you can get short reads from a single BAM file, or from many of them.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [bam-file](artifacts/bam-file)
+ |
+
+
+
+ ๐
+ [short-reads-fasta](artifacts/short-reads-fasta)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-get-tlen-dist-from-bam](programs/anvi-get-tlen-dist-from-bam)**. Report the distribution of template lengths from a BAM file. The purpose of this is to get an idea about the insert size distribution in a BAM file rapidly by summarizing distances between each paired-end read in a given read recruitment experiment..
+ |
+
+
+
+ ๐ง
+ [bam-file](artifacts/bam-file)
+ |
+
+
+
+ ๐
+ n/a
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-import-functions](programs/anvi-import-functions)**. Parse and store functional annotation of genes.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [functions-txt](artifacts/functions-txt)
+ |
+
+
+
+ ๐
+ [functions](artifacts/functions)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-import-taxonomy-for-genes](programs/anvi-import-taxonomy-for-genes)**. Import gene-level taxonomy into an anvi'o contigs database.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [gene-taxonomy-txt](artifacts/gene-taxonomy-txt)
+ |
+
+
+
+ ๐
+ [gene-taxonomy](artifacts/gene-taxonomy)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-import-taxonomy-for-layers](programs/anvi-import-taxonomy-for-layers)**. Import layers-level taxonomy into an anvi'o additional layer data table in an anvi'o single-profile database.
+ |
+
+
+
+ ๐ง
+ [single-profile-db](artifacts/single-profile-db) [layer-taxonomy-txt](artifacts/layer-taxonomy-txt)
+ |
+
+
+
+ ๐
+ [layer-taxonomy](artifacts/layer-taxonomy)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-init-bam](programs/anvi-init-bam)**. Sort/Index BAM files.
+ |
+
+
+
+ ๐ง
+ [raw-bam-file](artifacts/raw-bam-file)
+ |
+
+
+
+ ๐
+ [bam-file](artifacts/bam-file)
+ |
+
+
+
+ ๐ง
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-matrix-to-newick](programs/anvi-matrix-to-newick)**. Takes a distance matrix, returns a newick tree.
+ |
+
+
+
+ ๐ง
+ [view-data](artifacts/view-data)
+ |
+
+
+
+ ๐
+ [dendrogram](artifacts/dendrogram)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-merge-trnaseq](programs/anvi-merge-trnaseq)**. This program processes one or more anvi'o tRNA-seq databases produced by `anvi-trnaseq` and outputs anvi'o contigs and merged profile databases accessible to other tools in the anvi'o ecosystem. Final tRNA "seed sequences" are determined from a set of samples. Each sample yields a set of tRNA predictions stored in a tRNA-seq database, and these tRNAs may be shared among the samples. tRNA may be 3' fragments and thereby subsequences of longer tRNAs from other samples which would become seeds. The profile database produced by this program records the coverages of seeds in each sample. This program finalizes predicted nucleotide modification sites using tunable substitution rate parameters..
+ |
+
+
+
+ ๐ง
+ [trnaseq-db](artifacts/trnaseq-db)
+ |
+
+
+
+ ๐
+ [trnaseq-contigs-db](artifacts/trnaseq-contigs-db) [trnaseq-profile-db](artifacts/trnaseq-profile-db)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-oligotype-linkmers](programs/anvi-oligotype-linkmers)**. Takes an anvi'o linkmers report, generates an oligotyping output.
+ |
+
+
+
+ ๐ง
+ [linkmers-txt](artifacts/linkmers-txt)
+ |
+
+
+
+ ๐
+ [oligotypes](artifacts/oligotypes)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-pan-genome](programs/anvi-pan-genome)**. An anvi'o program to compute a pangenome from an anvi'o genome storage.
+ |
+
+
+
+ ๐ง
+ [genomes-storage-db](artifacts/genomes-storage-db)
+ |
+
+
+
+ ๐
+ [pan-db](artifacts/pan-db) [misc-data-items-order](artifacts/misc-data-items-order)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-profile](programs/anvi-profile)**. The flagship anvi'o program to profile a BAM file. Running this program on a BAM file will quantify coverages per nucleotide position in read recruitment results and will average coverage and detection data per contig. It will also calculate single-nucleotide, single-codon, and single-amino acid variants, as well as structural variants, such as insertion and deletions, to eventually stores all data into a single anvi'o profile database. For very large projects, this program can demand a lot of time, memory, and storage resources. If all you want is to learn coverages of your nutleotides, genes, contigs, or your bins collections from BAM files very rapidly, and/or you do not need anvi'o single profile databases for your project, please see other anvi'o programs that profile BAM files, `anvi-script-get-coverage-from-bam` and `anvi-profile-blitz`.
+ |
+
+
+
+ ๐ง
+ [bam-file](artifacts/bam-file) [contigs-db](artifacts/contigs-db)
+ |
+
+
+
+ ๐
+ [single-profile-db](artifacts/single-profile-db) [misc-data-items-order](artifacts/misc-data-items-order) [variability-profile](artifacts/variability-profile)
+ |
+
+
+
+ ๐ง
+
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-profile-blitz](programs/anvi-profile-blitz)**. FAST profiling of BAM files to get contig- or gene-level coverage and detection stats. Unlike `anvi-profile`, which is another anvi'o program that can profile BAM files, this program is designed to be very quick and only report long-format files for various read recruitment statistics per item. Plase also see the program `anvi-script-get-coverage-from-bam` for recovery of data from BAM files without an anvi'o contigs database.
+ |
+
+
+
+ ๐ง
+ [bam-file](artifacts/bam-file) [contigs-db](artifacts/contigs-db)
+ |
+
+
+
+ ๐
+ [bam-stats-txt](artifacts/bam-stats-txt)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-report-inversions](programs/anvi-report-inversions)**. Reports inversions.
+ |
+
+
+
+ ๐ง
+ [bams-and-profiles-txt](artifacts/bams-and-profiles-txt)
+ |
+
+
+
+ ๐
+ [inversions-txt](artifacts/inversions-txt)
+ |
+
+
+
+ ๐ง
+
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-report-linkmers](programs/anvi-report-linkmers)**. Reports sequences stored in one or more BAM files that cover one or more specific nucleotide positions in a reference.
+ |
+
+
+
+ ๐ง
+ [bam-file](artifacts/bam-file)
+ |
+
+
+
+ ๐
+ [linkmers-txt](artifacts/linkmers-txt)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-run-cazymes](programs/anvi-run-cazymes)**. Run dbCAN CAZymes on contigs-db.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [cazyme-data](artifacts/cazyme-data)
+ |
+
+
+
+ ๐
+ [functions](artifacts/functions)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-run-hmms](programs/anvi-run-hmms)**. This program deals with populating tables that store HMM hits in an anvi'o contigs database.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [hmm-source](artifacts/hmm-source)
+ |
+
+
+
+ ๐
+ [hmm-hits](artifacts/hmm-hits)
+ |
+
+
+
+ ๐ง
+
+
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-run-ncbi-cogs](programs/anvi-run-ncbi-cogs)**. This program runs NCBI's COGs to associate genes in an anvi'o contigs database with functions. The COGs database was designed as an attempt to classify proteins from completely sequenced genomes on the basis of the orthology concept.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [cogs-data](artifacts/cogs-data)
+ |
+
+
+
+ ๐
+ [functions](artifacts/functions)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-run-pfams](programs/anvi-run-pfams)**. Run Pfam on Contigs Database.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [pfams-data](artifacts/pfams-data)
+ |
+
+
+
+ ๐
+ [functions](artifacts/functions)
+ |
+
+
+
+ ๐ง
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-run-trna-taxonomy](programs/anvi-run-trna-taxonomy)**. The purpose of this program is to affiliate tRNA gene sequences in an anvi'o contigs database with taxonomic names. A properly set up local tRNA taxonomy database is required for this program to perform properly. After its successful run, `anvi-estimate-trna-taxonomy` will be useful to estimate taxonomy at the genome, collection, or metagenome level.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [trna-taxonomy-db](artifacts/trna-taxonomy-db)
+ |
+
+
+
+ ๐
+ [trna-taxonomy](artifacts/trna-taxonomy)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-run-workflow](programs/anvi-run-workflow)**. Execute, manage, parallelize, and troubleshoot entire 'omics workflows and chain together anvi'o and third party programs.
+ |
+
+
+
+ ๐ง
+ [workflow-config](artifacts/workflow-config)
+ |
+
+
+
+ ๐
+ [workflow](artifacts/workflow)
+ |
+
+
+
+ ๐ง
+
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-scan-trnas](programs/anvi-scan-trnas)**. Identify and store tRNA genes in a contigs database.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db)
+ |
+
+
+
+ ๐
+ [hmm-hits](artifacts/hmm-hits)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-search-functions](programs/anvi-search-functions)**. Search functions in an anvi'o contigs database or genomes storage. Basically, this program searches for one or more search terms you define in functional annotations of genes in an anvi'o contigs database, and generates multiple reports. The default report simply tells you which contigs contain genes with functions matching the search terms you used, which is useful for viewing in the interface. You can also request a much more comprehensive report, which gives you anything you might need to know for each hit and search term.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [genomes-storage-db](artifacts/genomes-storage-db)
+ |
+
+
+
+ ๐
+ [functions-txt](artifacts/functions-txt)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-search-primers](programs/anvi-search-primers)**. You provide this program with FASTQ files for one or more samples AND one or more primer sequences, and it collects reads from FASTQ files that match your primers. This tool can be most powerful if you want to collect all short reads from one or more metagenomes that are downstream of a known sequence. Using the comprehensive output files you can analyze the diversity of sequences visually, manually, or using established strategies such as oligotyping.
+ |
+
+
+
+ ๐ง
+ [samples-txt](artifacts/samples-txt) [primers-txt](artifacts/primers-txt)
+ |
+
+
+
+ ๐
+ [short-reads-fasta](artifacts/short-reads-fasta)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-setup-cazymes](programs/anvi-setup-cazymes)**. Download and set up the dbCAN CAZyme data.
+ |
+
+
+
+ ๐ง
+ n/a
+ |
+
+
+
+ ๐
+ [cazyme-data](artifacts/cazyme-data)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-setup-interacdome](programs/anvi-setup-interacdome)**. Setup InteracDome data.
+ |
+
+
+
+ ๐ง
+ n/a
+ |
+
+
+
+ ๐
+ [interacdome-data](artifacts/interacdome-data)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-setup-kegg-data](programs/anvi-setup-kegg-data)**. Download and set up various databases from KEGG.
+ |
+
+
+
+ ๐ง
+ n/a
+ |
+
+
+
+ ๐
+ [kegg-data](artifacts/kegg-data) [modules-db](artifacts/modules-db)
+ |
+
+
+
+ ๐ง
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-setup-modelseed-database](programs/anvi-setup-modelseed-database)**. This program downloads and sets up the ModelSEED Biochemistry database.
+ |
+
+
+
+ ๐ง
+ [functions](artifacts/functions)
+ |
+
+
+
+ ๐
+ [reaction-ref-data](artifacts/reaction-ref-data)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-setup-ncbi-cogs](programs/anvi-setup-ncbi-cogs)**. Download and set up NCBI's Clusters of Orthologous Groups database.
+ |
+
+
+
+ ๐ง
+ n/a
+ |
+
+
+
+ ๐
+ [cogs-data](artifacts/cogs-data)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-setup-pdb-database](programs/anvi-setup-pdb-database)**. Set up or update an offline database of representative PDB structures clustered at 95%.
+ |
+
+
+
+ ๐ง
+ n/a
+ |
+
+
+
+ ๐
+ [pdb-db](artifacts/pdb-db)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-setup-pfams](programs/anvi-setup-pfams)**. Download and set up Pfam data from the EBI.
+ |
+
+
+
+ ๐ง
+ n/a
+ |
+
+
+
+ ๐
+ [pfams-data](artifacts/pfams-data)
+ |
+
+
+
+ ๐ง
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-setup-scg-taxonomy](programs/anvi-setup-scg-taxonomy)**. The purpose of this program is to download necessary information from GTDB (https://gtdb.ecogenomic.org/), and set it up in such a way that your anvi'o installation is able to assign taxonomy to single-copy core genes using `anvi-run-scg-taxonomy` and estimate taxonomy for genomes or metagenomes using `anvi-estimate-scg-taxonomy`.
+ |
+
+
+
+ ๐ง
+ n/a
+ |
+
+
+
+ ๐
+ [scgs-taxonomy-db](artifacts/scgs-taxonomy-db)
+ |
+
+
+
+ ๐ง
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-setup-trna-taxonomy](programs/anvi-setup-trna-taxonomy)**. The purpose of this program is to set up, in your local anvi'o installation, the necessary databases of tRNA genes collected from GTDB (https://gtdb.ecogenomic.org/) genomes, so that taxonomy information for a given set of tRNA sequences can be identified using `anvi-run-trna-taxonomy` and made sense of via `anvi-estimate-trna-taxonomy`.
+ |
+
+
+
+ ๐ง
+ n/a
+ |
+
+
+
+ ๐
+ [trna-taxonomy-db](artifacts/trna-taxonomy-db)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-setup-user-modules](programs/anvi-setup-user-modules)**. Set up user-defined metabolic pathways into an anvi'o-compatible database.
+ |
+
+
+
+ ๐ง
+ [user-modules-data](artifacts/user-modules-data)
+ |
+
+
+
+ ๐
+ [modules-db](artifacts/modules-db) [user-modules-data](artifacts/user-modules-data)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-show-collections-and-bins](programs/anvi-show-collections-and-bins)**. A script to display collections stored in an anvi'o profile or pan database.
+ |
+
+
+
+ ๐ง
+ [pan-db](artifacts/pan-db) [profile-db](artifacts/profile-db)
+ |
+
+
+
+ ๐
+ n/a
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-show-misc-data](programs/anvi-show-misc-data)**. Show all misc data keys in all misc data tables.
+ |
+
+
+
+ ๐ง
+ [pan-db](artifacts/pan-db) [profile-db](artifacts/profile-db) [contigs-db](artifacts/contigs-db)
+ |
+
+
+
+ ๐
+ n/a
+ |
+
+
+
+ ๐ง
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-summarize-blitz](programs/anvi-summarize-blitz)**. FAST summary of many anvi'o single profile databases (without having to use the program anvi-merge).
+ |
+
+
+
+ ๐ง
+ [single-profile-db](artifacts/single-profile-db) [contigs-db](artifacts/contigs-db)
+ |
+
+
+
+ ๐
+ [quick-summary](artifacts/quick-summary)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-trnaseq](programs/anvi-trnaseq)**. A program to process reads from a tRNA-seq dataset to generate an anvi'o tRNA-seq database.
+ |
+
+
+
+ ๐ง
+ [trnaseq-fasta](artifacts/trnaseq-fasta)
+ |
+
+
+
+ ๐
+ [trnaseq-db](artifacts/trnaseq-db)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-update-structure-database](programs/anvi-update-structure-database)**. Add or re-run genes from an already existing structure database. All settings used to generate your database will be used in this program.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [structure-db](artifacts/structure-db)
+ |
+
+
+
+ ๐
+ n/a
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-as-markdown](programs/anvi-script-as-markdown)**. Markdownizes TAB-delimited data with headers in the terminal.
+ |
+
+
+
+ ๐ง
+ n/a
+ |
+
+
+
+ ๐
+ [markdown-txt](artifacts/markdown-txt)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-augustus-output-to-external-gene-calls](programs/anvi-script-augustus-output-to-external-gene-calls)**. Takes in gene calls by AUGUSTUS v3.3.3 and generates an anvi'o external gene calls file. It may work well with other versions of AUGUSTUS, too. It is just that no one has tested the script with different versions of the program.
+ |
+
+
+
+ ๐ง
+ [augustus-gene-calls](artifacts/augustus-gene-calls)
+ |
+
+
+
+ ๐
+ [external-gene-calls](artifacts/external-gene-calls)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-checkm-tree-to-interactive](programs/anvi-script-checkm-tree-to-interactive)**. A helper script to convert CheckM trees into an anvi'o interactive display with taxonomy information.
+ |
+
+
+
+ ๐ง
+ [phylogeny](artifacts/phylogeny)
+ |
+
+
+
+ ๐
+ [interactive](artifacts/interactive)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-compute-ani-for-fasta](programs/anvi-script-compute-ani-for-fasta)**. Run ANI between contigs in a single FASTA file.
+ |
+
+
+
+ ๐ง
+ [fasta](artifacts/fasta)
+ |
+
+
+
+ ๐
+ [genome-similarity](artifacts/genome-similarity)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-compute-bayesian-pan-core](programs/anvi-script-compute-bayesian-pan-core)**. Runs mOTUpan on your gene clusters to estimate whether they are core or accessory.
+ |
+
+
+
+ ๐ง
+ [pan-db](artifacts/pan-db) [genomes-storage-db](artifacts/genomes-storage-db)
+ |
+
+
+
+ ๐
+ [bin](artifacts/bin)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-estimate-metabolic-independence](programs/anvi-script-estimate-metabolic-independence)**. Takes a genome as a contigs-db, and tells you whether it can be considered as an organism of high metabolic independence, or not.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db)
+ |
+
+
+
+ ๐
+ [metabolic-independence-score](artifacts/metabolic-independence-score)
+ |
+
+
+
+ ๐ง
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-filter-fasta-by-blast](programs/anvi-script-filter-fasta-by-blast)**. Filter FASTA file according to BLAST table (remove sequences with bad BLAST alignment).
+ |
+
+
+
+ ๐ง
+ [contigs-fasta](artifacts/contigs-fasta) [blast-table](artifacts/blast-table)
+ |
+
+
+
+ ๐
+ [contigs-fasta](artifacts/contigs-fasta)
+ |
+
+
+
+ ๐ง
+
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-fix-homopolymer-indels](programs/anvi-script-fix-homopolymer-indels)**. Corrects homopolymer-region associated INDELs in a given genome based on a reference genome. The most effective use of this script is when the input genome is a genome reconstructed from MinION long reads, and the reference genome is one that is of high quality. Essentially, this script will BLAST the genome you wish to correct against the reference genome you provide, identify INDELs in the BLAST results that are exclusively associated with homopolymer regions, take the reference genome as a guide to correct the input sequences, and report a new FASTA file. You can use the corrected output FASTA file as the input FASTA file over and over again to see if you can eliminate all homopolymer-associated INDELs.
+ |
+
+
+
+ ๐ง
+ [fasta](artifacts/fasta)
+ |
+
+
+
+ ๐
+ [fasta](artifacts/fasta)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-gen-pseudo-paired-reads-from-fastq](programs/anvi-script-gen-pseudo-paired-reads-from-fastq)**. A script that takes a FASTQ file that is not paired-end (i.e., R1 alone) and converts it into two FASTQ files that are paired-end (i.e., R1 and R2). This is a quick-and-dirty workaround that halves each read from the original FASTQ and puts one half in the FASTQ file for R1 and puts the reverse-complement of the second half in the FASTQ file for R2. If you've ended up here, things have clearly not gone very well for you, and Evan, who battled similar battles and ended up implementing this solution, wholeheartedly sympathizes.
+ |
+
+
+
+ ๐ง
+ [short-reads-fasta](artifacts/short-reads-fasta)
+ |
+
+
+
+ ๐
+ [paired-end-fastq](artifacts/paired-end-fastq)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-gen-short-reads](programs/anvi-script-gen-short-reads)**. Generate short reads from contigs. Useful to reconstruct mock data sets from already assembled contigs.
+ |
+
+
+
+ ๐ง
+ [configuration-ini](artifacts/configuration-ini)
+ |
+
+
+
+ ๐
+ [short-reads-fasta](artifacts/short-reads-fasta)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-gen-user-module-file](programs/anvi-script-gen-user-module-file)**. This script generates a user-defined module file from a tab-delimited file of enzymes and other input parameters.
+ |
+
+
+
+ ๐ง
+ [enzymes-list-for-module](artifacts/enzymes-list-for-module)
+ |
+
+
+
+ ๐
+ [user-modules-data](artifacts/user-modules-data)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-gen_stats_for_single_copy_genes.py](programs/anvi-script-gen_stats_for_single_copy_genes.py)**. A simple script to generate info from search tables, given a contigs-db.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db)
+ |
+
+
+
+ ๐
+ [genes-stats](artifacts/genes-stats)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-get-coverage-from-bam](programs/anvi-script-get-coverage-from-bam)**. Get nucleotide-level, contig-level, or bin-level coverage values from a BAM file very rapidly. For other anvi'o programs that are designed to profile BAM files, see `anvi-profile` and `anvi-profile-blitz`.
+ |
+
+
+
+ ๐ง
+ [bam-file](artifacts/bam-file) [collection-txt](artifacts/collection-txt)
+ |
+
+
+
+ ๐
+ [coverages-txt](artifacts/coverages-txt)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-merge-collections](programs/anvi-script-merge-collections)**. Generate an additional data file from multiple collections.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [collection-txt](artifacts/collection-txt)
+ |
+
+
+
+ ๐
+ n/a
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-permute-trnaseq-seeds](programs/anvi-script-permute-trnaseq-seeds)**. This script generates a FASTA file of tRNA-seq seeds with permuted nucleotides at positions of predicted modification-induced substitutions. The underlying nucleotide without modification is not always the most common base call. The resulting FASTA file can be queried against a database of tRNA genes to validate nucleotides at modified positions and find the most similar sequences.
+ |
+
+
+
+ ๐ง
+ [contigs-db](artifacts/contigs-db) [profile-db](artifacts/profile-db)
+ |
+
+
+
+ ๐
+ [contigs-fasta](artifacts/contigs-fasta)
+ |
+
+
+
+ ๐ง
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-pfam-accessions-to-hmms-directory](programs/anvi-script-pfam-accessions-to-hmms-directory)**. You give this program one or more PFAM accession ids, and it generates an anvi'o compatible HMM directory to be used with `anvi-run-hmms`.
+ |
+
+
+
+ ๐ง
+ n/a
+ |
+
+
+
+ ๐
+ [hmm-source](artifacts/hmm-source)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-process-genbank-metadata](programs/anvi-script-process-genbank-metadata)**. This script takes the 'metadata' output of the program `ncbi-genome-download` (see [https://github.com/kblin/ncbi-genome-download](https://github.com/kblin/ncbi-genome-download) for details), and processes each GenBank file found in the metadata file to generate a FASTA file, as well as genes and functions files for each entry. Plus, it automatically generates a FASTA TXT file descriptor for anvi'o snakemake workflows. So it is a multi-talented program like that.
+ |
+
+
+
+ ๐ง
+ n/a
+ |
+
+
+
+ ๐
+ [contigs-fasta](artifacts/contigs-fasta) [functions-txt](artifacts/functions-txt) [external-gene-calls](artifacts/external-gene-calls)
+ |
+
+
+
+ ๐ง
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-reformat-fasta](programs/anvi-script-reformat-fasta)**. Reformat FASTA file (remove contigs based on length, or based on a given list of deflines, and/or generate an output with simpler names).
+ |
+
+
+
+ ๐ง
+ [fasta](artifacts/fasta)
+ |
+
+
+
+ ๐
+ [contigs-fasta](artifacts/contigs-fasta)
+ |
+
+
+
+ ๐ง
+
+
+
+ |
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-snvs-to-interactive](programs/anvi-script-snvs-to-interactive)**. Take the output of anvi-gen-variability-profile, prepare an output for interactive interface.
+ |
+
+
+
+ ๐ง
+ [variability-profile-txt](artifacts/variability-profile-txt)
+ |
+
+
+
+ ๐
+ [interactive](artifacts/interactive)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
+
+
+
+
+
+
+
+ ๐ฅ **[anvi-script-variability-to-vcf](programs/anvi-script-variability-to-vcf)**. A script to convert SNV output obtained from anvi-gen-variability-profile to the standard VCF format.
+ |
+
+
+
+ ๐ง
+ [variability-profile-txt](artifacts/variability-profile-txt)
+ |
+
+
+
+ ๐
+ [vcf](artifacts/vcf)
+ |
+
+
+
+ ๐ง
+
+ |
+
+
+
+
+
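Note on the `anvi-script-gen-pseudo-paired-reads-from-fastq` entry in the program index above: the halving strategy it describes can be illustrated in a few lines of Python. This is a minimal sketch, not the script's actual code. The function names are hypothetical, and two details are assumptions of this sketch: an odd-length read gives its extra base to the second half, and the quality string of the second half is reversed so it stays aligned with the reverse-complemented bases.

```python
# Hypothetical helper names; a sketch of the halving idea only.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_read(seq: str, qual: str):
    """Split one single-end read into a pseudo R1/R2 pair.

    R1 gets the first half as-is; R2 gets the reverse complement of the
    second half, with its quality string reversed to stay aligned.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    mid = len(seq) // 2
    r1 = (seq[:mid], qual[:mid])
    r2 = (reverse_complement(seq[mid:]), qual[mid:][::-1])
    return r1, r2


r1, r2 = split_read("AACCGGTT", "IIIIHHHH")
print(r1)  # ('AACC', 'IIII')
print(r2)  # ('AACC', 'HHHH')
```

Because both halves come from the same original read, the resulting "pairs" carry no real insert-size information, which is why the description calls this a workaround.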
diff --git a/help/8/programs/anvi-analyze-synteny/index.md b/help/8/programs/anvi-analyze-synteny/index.md
new file mode 100644
index 00000000..be247093
--- /dev/null
+++ b/help/8/programs/anvi-analyze-synteny/index.md
@@ -0,0 +1,145 @@
+---
+layout: program
+title: anvi-analyze-synteny
+excerpt: An anvi'o program. Extract ngrams, as in 'co-occurring genes in synteny', from genomes.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-analyze-synteny
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Extract ngrams, as in 'co-occurring genes in synteny', from genomes.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[genomes-storage-db](../../artifacts/genomes-storage-db) [functions](../../artifacts/functions) [pan-db](../../artifacts/pan-db)
+
+
+## Can provide
+
+
+[ngrams](../../artifacts/ngrams)
+
+
+## Usage
+
+
+Briefly, [anvi-analyze-synteny](/help/8/programs/anvi-analyze-synteny) counts [ngrams](/help/8/artifacts/ngrams) by converting contigs into strings of annotations for a given user-defined source of gene annotation. A source annotation for [functions](/help/8/artifacts/functions) **must** be provided to create [ngrams](/help/8/artifacts/ngrams), upon which anvi'o will use a sliding window of size `N` to deconstruct the loci of interest into [ngrams](/help/8/artifacts/ngrams) and count their frequencies.
+
+### Run for a given function annotation source
+
+
+anvi-analyze-synteny -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ --annotation-source [functions](/help/8/artifacts/functions) \
+ --ngram-window-range 2:3 \
+ -o [ngrams](/help/8/artifacts/ngrams)
+
+
+For instance, if you have run [anvi-run-ncbi-cogs](/help/8/programs/anvi-run-ncbi-cogs) on each [contigs-db](/help/8/artifacts/contigs-db) you have used to generate your [genomes-storage-db](/help/8/artifacts/genomes-storage-db), your `--annotation-source` can be `NCBI_COGS`:
+
+
+anvi-analyze-synteny -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ --annotation-source NCBI_COGS \
+ --ngram-window-range 2:3 \
+ -o [ngrams](/help/8/artifacts/ngrams)
+
+
+
+### Handling genes with unknown functions
+
+By default, [anvi-analyze-synteny](/help/8/programs/anvi-analyze-synteny) will ignore genes with unknown functions based on the annotation source of interest. However, this can be circumvented either by providing a [pan-db](/help/8/artifacts/pan-db), so the program would use gene cluster identities as function names:
+
+
+anvi-analyze-synteny -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -p [pan-db](/help/8/artifacts/pan-db) \
+ --ngram-window-range 2:3 \
+ -o [ngrams](/help/8/artifacts/ngrams)
+
+
+or by explicitly asking the program to consider unknown functions, in which case the program would not discard ngrams that include genes without functions:
+
+
+anvi-analyze-synteny -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ --annotation-source [functions](/help/8/artifacts/functions) \
+ --ngram-window-range 2:3 \
+ -o [ngrams](/help/8/artifacts/ngrams) \
+ --analyze-unknown-functions
+
+
+The disadvantage of the latter strategy is that since all genes with unknown functions will be considered the same, the frequency of ngrams that contain genes with unknown functions may be inflated in your final results.
+
+### Run with multiple annotations
+
+If multiple gene annotation sources are provided (e.g., a pangenome for gene cluster identities as well as a functional annotation source), the user must define which annotation source will be used to create the [ngrams](/help/8/artifacts/ngrams) using the parameter `--ngram-source`. The resulting [ngrams](/help/8/artifacts/ngrams) will then be re-annotated with the second annotation source, and both annotations will be reported.
+
+
+anvi-analyze-synteny -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -p [pan-db](/help/8/artifacts/pan-db) \
+ --annotation-source [functions](/help/8/artifacts/functions) \
+ --ngram-source gene_clusters \
+ --ngram-window-range 2:3 \
+ -o [ngrams](/help/8/artifacts/ngrams)
+
+
+### Test cases for developers
+
+If you are following the anvi'o master branch on your computer, you can create a test case for this program.
+
+First, go to any work directory, and run the following commands:
+
+``` bash
+anvi-self-test --suite metagenomics-full \
+ --output-dir TEST_OUTPUT
+```
+
+Run one or more alternative scenarios and check output files:
+
+``` bash
+anvi-analyze-synteny -g TEST_OUTPUT/TEST-GENOMES.db \
+ --annotation-source COG20_FUNCTION \
+ --ngram-window-range 2:3 \
+ -o TEST_OUTPUT/synteny_output_no_unknowns.tsv
+
+anvi-analyze-synteny -g TEST_OUTPUT/TEST-GENOMES.db \
+ --annotation-source COG20_FUNCTION \
+ --ngram-window-range 2:3 \
+ -o TEST_OUTPUT/synteny_output_with_unknowns.tsv \
+ --analyze-unknown-functions
+
+anvi-analyze-synteny -g TEST_OUTPUT/TEST-GENOMES.db \
+ --annotation-source COG20_FUNCTION \
+ --ngram-window-range 2:3 \
+ -o TEST_OUTPUT/tsv.txt \
+ --analyze-unknown-functions
+```
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-analyze-synteny.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-analyze-synteny) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
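Note on the usage text in the file above: the sliding-window counting it describes, including the effect of `--analyze-unknown-functions`, can be sketched as follows. This is a minimal illustration and not how anvi'o implements it. It assumes each contig has already been reduced to an ordered list of annotation strings, that a gene without an annotation is represented by `None`, and that window sizes are passed explicitly; the function name, the `None` convention, and the placeholder label are all hypothetical.

```python
from collections import Counter

UNKNOWN = "unknown-function"  # placeholder label, an assumption of this sketch


def count_ngrams(contigs, sizes=(2, 3), analyze_unknown=False):
    """Count ngrams of gene annotations with a sliding window.

    contigs: dict mapping contig name -> ordered list of annotations,
             where None marks a gene with no known function.
    sizes:   window sizes (values of N) to slide over each contig.
    """
    counts = Counter()
    for annotations in contigs.values():
        for n in sizes:
            for start in range(len(annotations) - n + 1):
                window = annotations[start:start + n]
                if None in window:
                    if not analyze_unknown:
                        continue  # discard ngrams containing unknown genes
                    # every unknown gene collapses to the same label
                    window = [UNKNOWN if a is None else a for a in window]
                counts[tuple(window)] += 1
    return counts


contigs = {
    "contig_1": ["COG1", "COG2", None, "COG3"],
    "contig_2": ["COG1", "COG2", "COG3"],
}
known_only = count_ngrams(contigs)
with_unknown = count_ngrams(contigs, analyze_unknown=True)
print(known_only[("COG1", "COG2")])
print(with_unknown[("COG2", UNKNOWN)])
```

Since every unknown gene maps to one shared label here, unrelated loci can end up contributing to the same ngram, which is the inflation the usage text warns about. The sketch also ignores gene orientation.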
diff --git a/help/8/programs/anvi-analyze-synteny/network.json b/help/8/programs/anvi-analyze-synteny/network.json
new file mode 100644
index 00000000..3d226e34
--- /dev/null
+++ b/help/8/programs/anvi-analyze-synteny/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "ngrams",
+ "name": "ngrams",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions",
+ "name": "functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-analyze-synteny",
+ "name": "anvi-analyze-synteny",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "target": 4,
+ "source": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
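Note on `network.json` above: the entries under `links` appear to refer to nodes by their position in the `nodes` list (index `4` being the program itself). Under that assumption, a minimal sketch of resolving the links into readable edges looks like this. The trimmed JSON below copies only the `id` fields and the links from the file above, and the function name is hypothetical.

```python
import json

# Trimmed copy of the file above: only node ids and links are kept.
NETWORK = json.loads("""
{
  "nodes": [
    {"id": "ngrams"},
    {"id": "genomes-storage-db"},
    {"id": "functions"},
    {"id": "pan-db"},
    {"id": "anvi-analyze-synteny"}
  ],
  "links": [
    {"source": 4, "target": 0},
    {"source": 1, "target": 4},
    {"source": 2, "target": 4},
    {"source": 3, "target": 4}
  ]
}
""")


def resolve_links(network):
    """Translate index-based links into (source id, target id) pairs."""
    ids = [node["id"] for node in network["nodes"]]
    return [(ids[link["source"]], ids[link["target"]]) for link in network["links"]]


for source, target in resolve_links(NETWORK):
    print(f"{source} -> {target}")
```

Read this way, edges pointing into the program correspond to the "Can consume" artifacts and the edge leaving it corresponds to "Can provide".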
diff --git a/help/8/programs/anvi-cluster-contigs/index.md b/help/8/programs/anvi-cluster-contigs/index.md
new file mode 100644
index 00000000..e07204d0
--- /dev/null
+++ b/help/8/programs/anvi-cluster-contigs/index.md
@@ -0,0 +1,75 @@
+---
+layout: program
+title: anvi-cluster-contigs
+excerpt: An anvi'o program. A program to cluster items in a merged anvi'o profile using automatic binning algorithms.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-cluster-contigs
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A program to cluster items in a merged anvi'o profile using automatic binning algorithms.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db) [collection](../../artifacts/collection)
+
+
+## Can provide
+
+
+[collection](../../artifacts/collection) [bin](../../artifacts/bin)
+
+
+## Usage
+
+
+This program clusters the contigs stored in a [profile-db](/help/8/artifacts/profile-db) using your binning algorithm of choice and stores the results in several [bin](/help/8/artifacts/bin)s.
+
+This is a quick alternative to manually binning your contigs, but it might miss some details that a human doing manual binning would find. After running this, you might want to run [anvi-summarize](/help/8/programs/anvi-summarize) on the resulting [collection](/help/8/artifacts/collection) to look through your bins, and, if necessary, use [anvi-refine](/help/8/programs/anvi-refine) to change the contents of them.
+
+You have the option to use several different clustering algorithms, which you'll specify with the `driver` parameter: [concoct](https://github.com/BinPro/CONCOCT/blob/develop/doc/source/index.rst), [metabat2](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6662567/), [maxbin2](https://academic.oup.com/bioinformatics/article/32/4/605/1744462), [dastool](https://github.com/cmks/DAS_Tool), and [binsanity](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5345454/).
+
+So, a run of this program will look like the following:
+
+
+anvi-cluster-contigs -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ --driver concoct
+
+
+Once you specify an algorithm, there are many algorithm-specific parameters that you can change to your liking. These parameters will appear in the help menu of this program for each algorithm that anvi'o can find on your system.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-cluster-contigs.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-cluster-contigs) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
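Note on the usage text in the file above: when the same clustering is repeated with several drivers, it can help to assemble the command line programmatically. The sketch below only builds the argument list from the flags shown in the usage example (`-p`, `-c`, `-C`, `--driver`) and rejects driver names other than the five listed there. The function name and the file and collection names are made up for illustration, and nothing is executed.

```python
# Driver names as listed in the usage text above.
DRIVERS = ("concoct", "metabat2", "maxbin2", "dastool", "binsanity")


def cluster_command(profile_db, contigs_db, collection, driver):
    """Build the argument list for one anvi-cluster-contigs run."""
    if driver not in DRIVERS:
        raise ValueError(f"unknown driver {driver!r}; expected one of {DRIVERS}")
    return [
        "anvi-cluster-contigs",
        "-p", profile_db,
        "-c", contigs_db,
        "-C", collection,
        "--driver", driver,
    ]


print(" ".join(cluster_command("PROFILE.db", "CONTIGS.db", "concoct_bins", "concoct")))
```

In an environment where anvi'o and the chosen binning tool are installed, the returned list could be handed to `subprocess.run`.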
diff --git a/help/8/programs/anvi-cluster-contigs/network.json b/help/8/programs/anvi-cluster-contigs/network.json
new file mode 100644
index 00000000..0319b594
--- /dev/null
+++ b/help/8/programs/anvi-cluster-contigs/network.json
@@ -0,0 +1,73 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 2,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-cluster-contigs",
+ "name": "anvi-cluster-contigs",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "target": 4,
+ "source": 0
+ },
+ {
+ "source": 4,
+ "target": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-compute-completeness/index.md b/help/8/programs/anvi-compute-completeness/index.md
new file mode 100644
index 00000000..07833eca
--- /dev/null
+++ b/help/8/programs/anvi-compute-completeness/index.md
@@ -0,0 +1,76 @@
+---
+layout: program
+title: anvi-compute-completeness
+excerpt: An anvi'o program. A script to generate completeness info for a given list of _splits_.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-compute-completeness
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A script to generate completeness info for a given list of _splits_.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [splits-txt](../../artifacts/splits-txt) [hmm-source](../../artifacts/hmm-source)
+
+
+## Can provide
+
+
+This program does not seem to provide any artifacts. Such programs usually print out some information for you to see or alter some anvi'o artifacts without producing any immediate outputs.
+
+
+## Usage
+
+
+This program tells you the completeness and redundancy of single-copy gene sources available for your [contigs-db](/help/8/artifacts/contigs-db).
+
+For example, some of the defaults are collections of single-copy core genes named `Protista_83`, `Archaea_76`, and `Bacteria_71`. This program will give you a rough estimate of how many Protist, Archaeal, and Bacterial genomes are included in your dataset using these single-copy core genes.
+
+You can use the following run to list available completeness sources in your [contigs-db](/help/8/artifacts/contigs-db):
+
+
+anvi-compute-completeness -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --list-completeness-sources
+
+
+Then you can run this program on a specific source as follows:
+
+
+anvi-compute-completeness -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --completeness-source Bacteria_71
+
+
+You can also provide a [splits-txt](/help/8/artifacts/splits-txt) to focus on a specific set of splits, or declare a minimum e-value for a gene to count as a hit (the default is `1e-15`).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-compute-completeness.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-compute-completeness) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-compute-completeness/network.json b/help/8/programs/anvi-compute-completeness/network.json
new file mode 100644
index 00000000..b48a006b
--- /dev/null
+++ b/help/8/programs/anvi-compute-completeness/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "splits-txt",
+ "name": "splits-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "hmm-source",
+ "name": "hmm-source",
+ "provided_by_anvio": false,
+ "type": "HMM"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-compute-completeness",
+ "name": "anvi-compute-completeness",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "target": 3,
+ "source": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-compute-functional-enrichment-across-genomes/index.md b/help/8/programs/anvi-compute-functional-enrichment-across-genomes/index.md
new file mode 100644
index 00000000..bb6d8f9b
--- /dev/null
+++ b/help/8/programs/anvi-compute-functional-enrichment-across-genomes/index.md
@@ -0,0 +1,116 @@
+---
+layout: program
+title: anvi-compute-functional-enrichment-across-genomes
+excerpt: An anvi'o program. A program that computes functional enrichment across groups of genomes.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-compute-functional-enrichment-across-genomes
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A program that computes functional enrichment across groups of genomes.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[groups-txt](../../artifacts/groups-txt) [genomes-storage-db](../../artifacts/genomes-storage-db) [external-genomes](../../artifacts/external-genomes) [internal-genomes](../../artifacts/internal-genomes) [functions](../../artifacts/functions)
+
+
+## Can provide
+
+
+[functional-enrichment-txt](../../artifacts/functional-enrichment-txt)
+
+
+## Usage
+
+
+This program computes functional enrichment across groups of genomes and returns a [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt) file.
+
+{:.warning}
+For its sister programs, see [anvi-compute-functional-enrichment-in-pan](/help/8/programs/anvi-compute-functional-enrichment-in-pan) and [anvi-compute-metabolic-enrichment](/help/8/programs/anvi-compute-metabolic-enrichment).
+
+{:.notice}
+Please also see [anvi-display-functions](/help/8/programs/anvi-display-functions) which can both calculate functional enrichment, AND give you an interactive interface to display the distribution of functions.
+
+## Functional enrichment
+
+You can use this program by combining genomes described through [external-genomes](/help/8/artifacts/external-genomes), [internal-genomes](/help/8/artifacts/internal-genomes), and/or stored in a [genomes-storage-db](/help/8/artifacts/genomes-storage-db). In addition to sources for your genomes, you will need to provide a [groups-txt](/help/8/artifacts/groups-txt) file to declare which genome belongs to which group for enrichment analysis to consider.
+
+### How does it work?
+
+1. **Aggregate functions from all sources**. Gene calls in each genome are tallied according to their functional annotations from the given annotation source.
+
+2. **Quantify the distribution of functions in each group of genomes**. This information is then used by `anvi-script-enrichment-stats` to fit a GLM to determine (1) the extent to which a particular functional annotation is unique to a single group and (2) the percentage of genomes in each group in which it appears. This produces a [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt) file.
+
+{:.notice}
+The script `anvi-script-enrichment-stats` was implemented by [Amy Willis](https://github.com/adw96), and described first in [this paper](https://doi.org/10.1186/s13059-020-02195-w).
+
+
+### Basic usage
+
+You can use it with a single source of genomes:
+
+
+anvi-compute-functional-enrichment-across-genomes -i [internal-genomes](/help/8/artifacts/internal-genomes) \
+ -o [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt) \
+ -G [groups-txt](/help/8/artifacts/groups-txt) \
+ --annotation-source FUNCTION_SOURCE
+
+
+or many:
+
+
+anvi-compute-functional-enrichment-across-genomes -i [internal-genomes](/help/8/artifacts/internal-genomes) \
+ -e [external-genomes](/help/8/artifacts/external-genomes) \
+ -G [groups-txt](/help/8/artifacts/groups-txt) \
+ -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -o [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt) \
+ --annotation-source FUNCTION_SOURCE
+
+
+### Additional Parameters
+
+You can get a tab-delimited matrix describing the occurrence (counts) of each function within each genome using the `--functional-occurrence-table-output` parameter:
+
+
+anvi-compute-functional-enrichment-across-genomes -i [internal-genomes](/help/8/artifacts/internal-genomes) \
+ -G [groups-txt](/help/8/artifacts/groups-txt) \
+ -o [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt) \
+ --annotation-source FUNCTION_SOURCE \
+ --functional-occurrence-table-output FUNC_OCCURRENCE.TXT
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-compute-functional-enrichment-across-genomes.md) to update this information.
+
+
+## Additional Resources
+
+
+* [A description of the enrichment script run by this program can be found in Shaiber et al 2020](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-02195-w)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-compute-functional-enrichment-across-genomes) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-compute-functional-enrichment-across-genomes/network.json b/help/8/programs/anvi-compute-functional-enrichment-across-genomes/network.json
new file mode 100644
index 00000000..40fe2517
--- /dev/null
+++ b/help/8/programs/anvi-compute-functional-enrichment-across-genomes/network.json
@@ -0,0 +1,95 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functional-enrichment-txt",
+ "name": "functional-enrichment-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "groups-txt",
+ "name": "groups-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-genomes",
+ "name": "external-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "internal-genomes",
+ "name": "internal-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions",
+ "name": "functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 6,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-compute-functional-enrichment-across-genomes",
+ "name": "anvi-compute-functional-enrichment-across-genomes",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 6,
+ "target": 0
+ },
+ {
+ "target": 6,
+ "source": 1
+ },
+ {
+ "target": 6,
+ "source": 2
+ },
+ {
+ "target": 6,
+ "source": 3
+ },
+ {
+ "target": 6,
+ "source": 4
+ },
+ {
+ "target": 6,
+ "source": 5
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-compute-functional-enrichment-in-pan/index.md b/help/8/programs/anvi-compute-functional-enrichment-in-pan/index.md
new file mode 100644
index 00000000..edc9c0de
--- /dev/null
+++ b/help/8/programs/anvi-compute-functional-enrichment-in-pan/index.md
@@ -0,0 +1,137 @@
+---
+layout: program
+title: anvi-compute-functional-enrichment-in-pan
+excerpt: An anvi'o program. A program that computes functional enrichment within a pangenome.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-compute-functional-enrichment-in-pan
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A program that computes functional enrichment within a pangenome.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+
+
+## Can consume
+
+
+[misc-data-layers](../../artifacts/misc-data-layers) [pan-db](../../artifacts/pan-db) [genomes-storage-db](../../artifacts/genomes-storage-db) [functions](../../artifacts/functions)
+
+
+## Can provide
+
+
+[functional-enrichment-txt](../../artifacts/functional-enrichment-txt)
+
+
+## Usage
+
+
+This program computes functional enrichment within a pangenome and returns a [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt) file.
+
+{:.warning}
+For its sister programs, see [anvi-compute-metabolic-enrichment](/help/8/programs/anvi-compute-metabolic-enrichment) and [anvi-compute-functional-enrichment-across-genomes](/help/8/programs/anvi-compute-functional-enrichment-across-genomes).
+
+{:.notice}
+Please also see [anvi-display-functions](/help/8/programs/anvi-display-functions) which can both calculate functional enrichment, AND give you an interactive interface to display the distribution of functions.
+
+## Enriched functions in a pangenome
+
+For this to run, you must provide a [pan-db](/help/8/artifacts/pan-db) and [genomes-storage-db](/help/8/artifacts/genomes-storage-db) pair, as well as a [misc-data-layers](/help/8/artifacts/misc-data-layers) that associates genomes in your pan database with categorical data. The program will then find functions that are enriched in each group (i.e., functions that are associated with gene clusters that are characteristic of the genomes in that group).
+
+{:.notice}
+Note that your [genomes-storage-db](/help/8/artifacts/genomes-storage-db) must have at least one functional annotation source for this to work.
+
+This analysis will help you identify functions that are associated with a specific group of genomes in a pangenome and determine the functional core of your pangenome. For example, in the *Prochlorococcus* pangenome (the one used in [the pangenomics tutorial, where you can find more info about this program](http://merenlab.org/2016/11/08/pangenomics-v2/#making-sense-of-functions-in-your-pangenome)), this program finds that `Exonuclease VII` is enriched in the `low-light` genomes and not in `high-light` genomes. The output file provides various statistics about how confident the program is in making this association.
+
+### How does it work?
+
+What this program does can be broken down into three steps:
+
+1. **Determine groups of genomes**. The program uses a [misc-data-layers](/help/8/artifacts/misc-data-layers) variable (containing categorical, not numerical, data) to split genomes in a pangenome into two or more groups. For example, in the pangenome tutorial, the categorical variable was named `light`, and it partitioned genomes into `low-light` and `high-light` groups.
+
+2. **Determine the "functional associations" of gene clusters**. In short, this is collecting the functional annotations for all of the genes in each cluster and assigning the one that appears most frequently to represent the entire cluster.
+
+3. **Quantify the distribution of functions in each group of genomes**. For this, the program determines to what extent a particular function is enriched in specific groups of genomes and reports it as a [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt) file. It does so by running the script `anvi-script-enrichment-stats`.
+
+{:.notice}
+The script `anvi-script-enrichment-stats` was implemented by [Amy Willis](https://github.com/adw96), and described first in [this paper](https://doi.org/10.1186/s13059-020-02195-w).
+
+{:.notice}
+Check out [Alon's behind the scenes post](http://merenlab.org/2016/11/08/pangenomics-v2/#making-sense-of-functions-in-your-pangenome), which goes into a lot more detail.
+
+### Basic usage
+
+Here is the simplest way to run this program:
+
+
+anvi-compute-functional-enrichment-in-pan -p [pan-db](/help/8/artifacts/pan-db) \
+ -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -o [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt) \
+ --category-variable CATEGORY \
+ --annotation-source FUNCTION_SOURCE
+
+
+The [pan-db](/help/8/artifacts/pan-db) must contain at least one categorical data layer in [misc-data-layers](/help/8/artifacts/misc-data-layers), and you must choose one of these categories to define your pan-groups with the `--category-variable` parameter. You can see the available variables by running the [anvi-show-misc-data](/help/8/programs/anvi-show-misc-data) program with the parameters `-t layers --debug`.
+
+The [genomes-storage-db](/help/8/artifacts/genomes-storage-db) must have at least one functional annotation source, and you must choose one of these sources with the `--annotation-source` parameter. If you do not know which functional annotation sources are available in your [genomes-storage-db](/help/8/artifacts/genomes-storage-db), you can use the `--list-annotation-sources` parameter to find out.
+
+### Additional options
+
+By default, gene clusters with the same functional annotation will be merged. But if you provide the `--include-gc-identity-as-function` parameter and set the annotation source to be 'IDENTITY', anvi'o will treat gene cluster names as functions and enable you to investigate enrichment of each gene cluster independently. This is how you do it:
+
+
+anvi-compute-functional-enrichment-in-pan -p [pan-db](/help/8/artifacts/pan-db) \
+ -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -o [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt) \
+ --category-variable CATEGORY \
+ --annotation-source IDENTITY \
+ --include-gc-identity-as-function
+
+
+To output a functional occurrence table, which describes the number of times each of your functional associations occurs in each genome you're looking at, use the `--functional-occurrence-table-output` parameter, like so:
+
+
+anvi-compute-functional-enrichment-in-pan -p [pan-db](/help/8/artifacts/pan-db) \
+ -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -o [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt) \
+ --category-variable CATEGORY \
+ --annotation-source FUNCTION_SOURCE \
+ --functional-occurrence-table-output FUNC_OCCURRENCE.TXT
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-compute-functional-enrichment-in-pan.md) to update this information.
+
+
+## Additional Resources
+
+
+* [A description of the enrichment script run by this program can be found in Shaiber et al 2020](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-02195-w)
+
+* [An example of pangenome functional enrichment in the context of the Prochlorococcus metapangenome from Delmont and Eren 2018 is included in the pangenomics tutorial](http://merenlab.org/2016/11/08/pangenomics-v2/)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-compute-functional-enrichment-in-pan) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-compute-functional-enrichment-in-pan/network.json b/help/8/programs/anvi-compute-functional-enrichment-in-pan/network.json
new file mode 100644
index 00000000..39d202de
--- /dev/null
+++ b/help/8/programs/anvi-compute-functional-enrichment-in-pan/network.json
@@ -0,0 +1,82 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functional-enrichment-txt",
+ "name": "functional-enrichment-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-layers",
+ "name": "misc-data-layers",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions",
+ "name": "functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-compute-functional-enrichment-in-pan",
+ "name": "anvi-compute-functional-enrichment-in-pan",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 5,
+ "target": 0
+ },
+ {
+ "target": 5,
+ "source": 1
+ },
+ {
+ "target": 5,
+ "source": 2
+ },
+ {
+ "target": 5,
+ "source": 3
+ },
+ {
+ "target": 5,
+ "source": 4
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-compute-gene-cluster-homogeneity/index.md b/help/8/programs/anvi-compute-gene-cluster-homogeneity/index.md
new file mode 100644
index 00000000..f29adda3
--- /dev/null
+++ b/help/8/programs/anvi-compute-gene-cluster-homogeneity/index.md
@@ -0,0 +1,90 @@
+---
+layout: program
+title: anvi-compute-gene-cluster-homogeneity
+excerpt: An anvi'o program. Compute homogeneity for gene clusters.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-compute-gene-cluster-homogeneity
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Compute homogeneity for gene clusters.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[pan-db](../../artifacts/pan-db) [genomes-storage-db](../../artifacts/genomes-storage-db)
+
+
+## Can provide
+
+
+This program does not seem to provide any artifacts. Such programs usually print out some information for you to see or alter some anvi'o artifacts without producing any immediate outputs.
+
+
+## Usage
+
+
+This program **computes both the geometric homogeneity and functional homogeneity for the gene clusters in a [pan-db](/help/8/artifacts/pan-db).**
+
+*Geometric homogeneity* and *functional homogeneity* are anvi'o specific terms that describe how similar genes within a gene cluster are to each other in different ways. Briefly, geometric homogeneity compares the positions of gaps in the aligned residues without considering specific amino acids, and functional homogeneity examines point mutations to amino acids and compares how similar the resulting amino acids are chemically. See [this page](http://merenlab.org/2016/11/08/pangenomics-v2/#inferring-the-homogeneity-of-gene-clusters) for more details.
+
+You can run this program as follows:
+
+
+anvi-compute-gene-cluster-homogeneity -p [pan-db](/help/8/artifacts/pan-db) \
+ -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -o path/to/output.txt \
+ --store-in-db
+
+
+This run will put the output directly in the database, as well as provide it as a separate file at the specified output path.
+
+You also have the option to calculate this information about only specific gene clusters, either by providing a gene cluster ID, list of gene cluster IDs, [collection](/help/8/artifacts/collection) or [bin](/help/8/artifacts/bin).
+
+To save on runtime, you can also enable `--quick-homogeneity`, which will not check for horizontal geometric homogeneity (i.e. it will not look at alignments within a single gene). This will be less accurate for detailed analyses, but it will run faster.
+
+Here is an example run that uses this flag and only looks at a specific collection:
+
+
+anvi-compute-gene-cluster-homogeneity -p [pan-db](/help/8/artifacts/pan-db) \
+ -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -o path/to/output.txt \
+ --store-in-db \
+ -C [collection](/help/8/artifacts/collection) \
+ --quick-homogeneity
+
+
+You can also use multithreading if you're familiar with that.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-compute-gene-cluster-homogeneity.md) to update this information.
+
+
+## Additional Resources
+
+
+* [The role of gene cluster homogeneity described in the Anvi'o pangenomics tutorial](http://merenlab.org/2016/11/08/pangenomics-v2/#inferring-the-homogeneity-of-gene-clusters)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-compute-gene-cluster-homogeneity) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-compute-gene-cluster-homogeneity/network.json b/help/8/programs/anvi-compute-gene-cluster-homogeneity/network.json
new file mode 100644
index 00000000..5ddc2b4b
--- /dev/null
+++ b/help/8/programs/anvi-compute-gene-cluster-homogeneity/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-compute-gene-cluster-homogeneity",
+ "name": "anvi-compute-gene-cluster-homogeneity",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "target": 2,
+ "source": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-compute-genome-similarity/index.md b/help/8/programs/anvi-compute-genome-similarity/index.md
new file mode 100644
index 00000000..2951093b
--- /dev/null
+++ b/help/8/programs/anvi-compute-genome-similarity/index.md
@@ -0,0 +1,133 @@
+---
+layout: program
+title: anvi-compute-genome-similarity
+excerpt: An anvi'o program. Export sequences from sequence sources and compute a similarity metric (e.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-compute-genome-similarity
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Export sequences from sequence sources and compute a similarity metric (e.g. ANI). If a pan database is given, anvi'o will write the computed output to the misc data tables of the pan database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[external-genomes](../../artifacts/external-genomes) [internal-genomes](../../artifacts/internal-genomes) [pan-db](../../artifacts/pan-db)
+
+
+## Can provide
+
+
+[genome-similarity](../../artifacts/genome-similarity)
+
+
+## Usage
+
+
+
+This program uses the user's similarity metric of choice to calculate the similarity between the input genomes.
+
+The currently available programs for calculating similarity metrics, which can be chosen with `--program`, are:
+- [PyANI](https://github.com/widdowquinn/pyani) to calculate the average nucleotide identity (ANI) (i.e. what portion of orthologous gene pairs align)
+- [fastANI](https://github.com/ParBLiSS/FastANI) also to calculate the ANI, but at a faster speed (at the cost of a slight reduction in accuracy)
+- [sourmash](https://sourmash.readthedocs.io/en/latest/) to calculate the mash distance between genomes. Though we provide this option, we don't recommend using sourmash for genome comparisons--it excels at other tasks--yet it remains as a legacy option.
+
+### Input/Output
+
+The expected input is any combination of [external-genomes](/help/8/artifacts/external-genomes), [internal-genomes](/help/8/artifacts/internal-genomes), and a text file that contains paths to [fasta](/help/8/artifacts/fasta) files that describe each of your genomes. This text file is tab-delimited with two columns, `name` and `path`, where each path points to a FASTA file that is assumed to be a single genome.
+
+
+The program outputs a directory with [genome-similarity](/help/8/artifacts/genome-similarity) data. The specific contents will depend on how similarity scores are computed (specified with `--program`), but generally contains tab-separated files of similarity scores between genomes and related metrics.
+
+
+You also have the option to provide a [pan-db](/help/8/artifacts/pan-db), in which case the output data will additionally be stored in the database as [misc-data-layers](/help/8/artifacts/misc-data-layers) and [misc-data-layer-orders](/help/8/artifacts/misc-data-layer-orders) data. This was done in the [pangenomic tutorial](http://merenlab.org/2016/11/08/pangenomics-v2/#computing-the-average-nucleotide-identity-for-genomes-and-other-genome-similarity-metrics-too).
+
+Here is an example run with pyANI from an [external-genomes](/help/8/artifacts/external-genomes) file without any parameter changes:
+
+
+anvi-compute-genome-similarity -e [external-genomes](/help/8/artifacts/external-genomes) \
+ -o path/for/[genome-similarity](/help/8/artifacts/genome-similarity) \
+ --program pyANI
+
+
+### Genome similarity metrics: parameters
+
+Parameters have been divided up based on which `--program` you use.
+
+#### pyANI
+
+You have the option to change any of the following parameters:
+
+- The method used for alignment. The options are:
+ - `ANIb` (default): uses [BLASTN](https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome)+ to align 1020 nt fragments of the inputs
+ - `ANIm`: uses [MUMmer](http://mummer.sourceforge.net/) to align
+ - `ANIblastall`: Uses legacy [BLASTN](https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome) to align 1020 nt fragments
+ - `TETRA`: Alignment free. This calculates similarity scores by comparing tetranucleotide frequencies for each input
+
+- The minimum alignment fraction (percent identity scores between genomes whose alignment fraction is lower than this will be set to 0). The default is 0.
+
+
+- If you want to keep alignments that are long, despite them not passing the minimum alignment fraction filter, you can supply a `--significant-alignment-length` to override `--min-alignment-fraction`.
+
+
+- Similarly, you can discard all results less than some full percent identity (percent identity of aligned segments * aligned fraction).
+
+
+#### fastANI
+
+You can change any of the following fastANI parameters:
+
+* The kmer size. The default is 16.
+
+* The fragment length. The default is 30.
+
+* The minimum number of fragments for a result to count. The default is 50.
+
+#### sourmash
+
+You have the option to change the `kmer-size`. This value should depend on the relationship between your samples. The default is 31 ([as recommended by sourmash for genus-level distances](https://sourmash.readthedocs.io/en/latest/using-sourmash-a-guide.html)), but we found that 13 most closely parallels the results from an ANI alignment.
+
+You can also set the compression ratio for your fasta files. Decreasing this from the default (1000) will decrease sensitivity.
+
+### Other Parameters
+
+Once calculated, the similarity matrix is used to create dendrograms via hierarchical clustering, which are stored in the output directory (and in the [pan-db](/help/8/artifacts/pan-db), if provided). You can choose to change the distance metric or linkage algorithm used for this clustering.
+
+
+If you're getting a lot of debug/output messages, you can turn them off with `--just-do-it` or helpfully store them into a file with `--log-file`.
+
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-compute-genome-similarity.md) to update this information.
+
+
+## Additional Resources
+
+
+* [In action in the pangenomic workflow tutorial](http://merenlab.org/2016/11/08/pangenomics-v2/#computing-the-average-nucleotide-identity-for-genomes-and-other-genome-similarity-metrics-too)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-compute-genome-similarity) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-compute-genome-similarity/network.json b/help/8/programs/anvi-compute-genome-similarity/network.json
new file mode 100644
index 00000000..239ca33c
--- /dev/null
+++ b/help/8/programs/anvi-compute-genome-similarity/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genome-similarity",
+ "name": "genome-similarity",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-genomes",
+ "name": "external-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "internal-genomes",
+ "name": "internal-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-compute-genome-similarity",
+ "name": "anvi-compute-genome-similarity",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "target": 4,
+ "source": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-compute-metabolic-enrichment/index.md b/help/8/programs/anvi-compute-metabolic-enrichment/index.md
new file mode 100644
index 00000000..1345cac7
--- /dev/null
+++ b/help/8/programs/anvi-compute-metabolic-enrichment/index.md
@@ -0,0 +1,128 @@
+---
+layout: program
+title: anvi-compute-metabolic-enrichment
+excerpt: An anvi'o program. A program that computes metabolic enrichment across groups of genomes and metagenomes.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-compute-metabolic-enrichment
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A program that computes metabolic enrichment across groups of genomes and metagenomes.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[kegg-metabolism](../../artifacts/kegg-metabolism) [user-metabolism](../../artifacts/user-metabolism) [groups-txt](../../artifacts/groups-txt) [external-genomes](../../artifacts/external-genomes) [internal-genomes](../../artifacts/internal-genomes) [functions](../../artifacts/functions)
+
+
+## Can provide
+
+
+[functional-enrichment-txt](../../artifacts/functional-enrichment-txt)
+
+
+## Usage
+
+
+This program computes metabolic module enrichment across groups of genomes or metagenomes and returns a [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt) file (throughout this text, we will use the term genome to describe both for simplicity).
+
+{:.warning}
+For its sister programs, see [anvi-compute-functional-enrichment-in-pan](/help/8/programs/anvi-compute-functional-enrichment-in-pan) and [anvi-compute-functional-enrichment-across-genomes](/help/8/programs/anvi-compute-functional-enrichment-across-genomes).
+
+## Module enrichment
+
+To run this program, you must already have estimated the completeness of metabolic modules in your genomes using the program [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) and obtained a "modules" mode output file (which is the default output mode of that program). In addition to that, you will need to provide a [groups-txt](/help/8/artifacts/groups-txt) file to declare which genome belongs to which group for enrichment analysis to consider.
+
+### How does it work?
+
+1. **Determine the presence of modules**. Each module in the "modules" mode output has a completeness score associated with it in each genome, and any module with a completeness score over a given threshold (set by `--module-completion-threshold`) will be considered to be *present* in that genome.
+
+2. **Quantify the distribution of modules in each group of genomes**. The distribution of a given module across genomes in each group will determine its enrichment. This is done by fitting a generalized linear model (GLM) with a logit link function in `anvi-script-enrichment-stats`, which produces a [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt) file.
+
+{:.notice}
+The script `anvi-script-enrichment-stats` was implemented by [Amy Willis](https://github.com/adw96), and described first in [this paper](https://doi.org/10.1186/s13059-020-02195-w).
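
Step 1 above (deciding which modules are *present*) can be sketched in a few lines of Python. This is a minimal illustration, not anvi'o's implementation: the row layout, the `call_presence` helper, the example values, and the choice to treat a score equal to the threshold as present are all assumptions made for the example. The GLM fit in step 2 is not reproduced here.

```python
from collections import defaultdict

# Hypothetical (genome, module, completeness) rows standing in for a
# "modules" mode output file; the values are made up for illustration.
rows = [
    ("genome_A", "M00001", 0.90),
    ("genome_A", "M00002", 0.50),
    ("genome_B", "M00001", 0.75),
    ("genome_B", "M00002", 1.00),
]


def call_presence(rows, threshold=0.75):
    """Return {genome: set of modules considered present}."""
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in the range (0, 1]")
    present = defaultdict(set)
    for genome, module, completeness in rows:
        # Assumption of this sketch: a score equal to the threshold
        # counts as present.
        if completeness >= threshold:
            present[genome].add(module)
    return dict(present)


presence = call_presence(rows)
print(presence)
```

Raising the threshold (for instance `call_presence(rows, threshold=0.9)`) shrinks the set of modules that count as present, which in turn changes the presence/absence table that the enrichment test sees.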
+
+### Basic usage
+
+See [kegg-metabolism](/help/8/artifacts/kegg-metabolism) or [user-metabolism](/help/8/artifacts/user-metabolism) for more information on how to generate a "modules" mode output format from [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism). Please note that the genome names in the modules file must match those that you will mention in the [groups-txt](/help/8/artifacts/groups-txt) file.
+
+
+anvi-compute-metabolic-enrichment -M MODULES.TXT \
+ -G [groups-txt](/help/8/artifacts/groups-txt) \
+ -o [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt)
+
+
+### Additional parameters
+
+The default completeness threshold for a module to be considered 'present' in a genome is 0.75 (=75%). If you wish to change this, you can provide a different threshold in the range (0, 1] using the `--module-completion-threshold` parameter:
+
+
+anvi-compute-metabolic-enrichment -M MODULES.TXT \
+ -G [groups-txt](/help/8/artifacts/groups-txt) \
+ -o [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt) \
+ --module-completion-threshold 0.9
+
+
+By default, this program uses the [pathwise completeness score](https://anvio.org/help/main/programs/anvi-estimate-metabolism/#two-estimation-strategies---pathwise-and-stepwise) to determine which modules are 'present' in a genome, but you can ask it to use stepwise completeness instead by using the `--use-stepwise-completeness` flag.
+
+
+anvi-compute-metabolic-enrichment -M MODULES.TXT \
+ -G [groups-txt](/help/8/artifacts/groups-txt) \
+ -o [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt) \
+ --use-stepwise-completeness
+
+
+By default, the column containing genome names in your MODULES.TXT file will have the header `db_name`, **but in certain cases your genome or metagenome names may be stored under a different column name** (such as when you did not run [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) in multi-mode). In those cases, you can tell this program which column to look in using the `--sample-header` parameter. For example, if your metagenome names are listed under the `metagenome_name` column, you would do the following:
+
+
+anvi-compute-metabolic-enrichment -M MODULES.TXT \
+ -G [groups-txt](/help/8/artifacts/groups-txt) \
+ -o [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt) \
+ --sample-header metagenome_name
+
+
+If you ran [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) on a bunch of extra genomes but only want to include a subset of them in the [groups-txt](/help/8/artifacts/groups-txt), that is fine. By default, any samples from the `MODULES.TXT` file that are missing from the [groups-txt](/help/8/artifacts/groups-txt) will be **ignored**. However, there is also an option to include those missing samples in the analysis, as one big group called 'UNGROUPED'. To do this, you can use the `--include-samples-missing-from-groups-txt` parameter.
+
+
+anvi-compute-metabolic-enrichment -M MODULES.TXT \
+ -G [groups-txt](/help/8/artifacts/groups-txt) \
+ -o [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt) \
+ --include-samples-missing-from-groups-txt
+
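The two behaviors described above (ignore samples that are missing from the groups file, or pool them into 'UNGROUPED') can be sketched as follows. This is only an illustration of the documented behavior under stated assumptions: the `assign_groups` helper, the sample names, and the group names are hypothetical and are not part of anvi'o.

```python
def assign_groups(samples, groups, include_missing=False):
    """Map each sample to its group; handle samples absent from `groups`."""
    assigned = {}
    for sample in samples:
        if sample in groups:
            assigned[sample] = groups[sample]
        elif include_missing:
            # Mirrors the documented behavior of the flag: missing samples
            # are pooled into one group called 'UNGROUPED'.
            assigned[sample] = "UNGROUPED"
        # Otherwise the sample is left out, as in the default behavior.
    return assigned


samples = ["g1", "g2", "g3"]               # names found in MODULES.TXT (made up)
groups = {"g1": "soil", "g2": "marine"}    # contents of a groups-txt (made up)

print(assign_groups(samples, groups))
print(assign_groups(samples, groups, include_missing=True))
```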
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-compute-metabolic-enrichment.md) to update this information.
+
+
+## Additional Resources
+
+
+* [A description of the enrichment script run by this program can be found in Shaiber et al 2020](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-02195-w)
+
+* [An example of pangenome functional enrichment in the context of the Prochlorococcus metapangenome from Delmont and Eren 2018 is included in the pangenomics tutorial](http://merenlab.org/2016/11/08/pangenomics-v2/)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-compute-metabolic-enrichment) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-compute-metabolic-enrichment/network.json b/help/8/programs/anvi-compute-metabolic-enrichment/network.json
new file mode 100644
index 00000000..3f6e17ee
--- /dev/null
+++ b/help/8/programs/anvi-compute-metabolic-enrichment/network.json
@@ -0,0 +1,108 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functional-enrichment-txt",
+ "name": "functional-enrichment-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "kegg-metabolism",
+ "name": "kegg-metabolism",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "user-metabolism",
+ "name": "user-metabolism",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "groups-txt",
+ "name": "groups-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-genomes",
+ "name": "external-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "internal-genomes",
+ "name": "internal-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions",
+ "name": "functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 7,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-compute-metabolic-enrichment",
+ "name": "anvi-compute-metabolic-enrichment",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 7,
+ "target": 0
+ },
+ {
+ "target": 7,
+ "source": 1
+ },
+ {
+ "target": 7,
+ "source": 2
+ },
+ {
+ "target": 7,
+ "source": 3
+ },
+ {
+ "target": 7,
+ "source": 4
+ },
+ {
+ "target": 7,
+ "source": 5
+ },
+ {
+ "target": 7,
+ "source": 6
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-db-info/index.md b/help/8/programs/anvi-db-info/index.md
new file mode 100644
index 00000000..fc0d7a87
--- /dev/null
+++ b/help/8/programs/anvi-db-info/index.md
@@ -0,0 +1,167 @@
+---
+layout: program
+title: anvi-db-info
+excerpt: An anvi'o program. Access self tables, display values, or set new ones entirely at your own risk.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-db-info
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Access self tables, display values, or set new ones entirely at your own risk.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[pan-db](../../artifacts/pan-db) [profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db) [genomes-storage-db](../../artifacts/genomes-storage-db) [structure-db](../../artifacts/structure-db) [genes-db](../../artifacts/genes-db)
+
+
+## Can provide
+
+
+This program does not seem to provide any artifacts. Such programs usually print out some information for you to see or alter some anvi'o artifacts without producing any immediate outputs.
+
+
+## Usage
+
+
+Displays information about an anvi'o database, and allows users to modify that information when absolutely necessary.
+
+This program is particularly useful for debugging, but also handy in a pinch if you want to check some facts about your database - to answer questions like "did I run HMMs on this [contigs-db](/help/8/artifacts/contigs-db) yet?" or "is this a merged [profile-db](/help/8/artifacts/profile-db)?" This program can also be very dangerous when used to inappropriately modify database information, so if you want to change something, please proceed with caution.
+
+### What information will I see?
+
+All anvi'o databases contain a table of self-describing information known as the "self" table. It helps anvi'o keep track of critical facts such as the type of the database, its version number, and the date it was created. It also saves information about how the database was generated, what sorts of data it contains, what programs have been run on it, and so on. In general, this table exists so that anvi'o can make sure you are doing the right things with your data and that nothing will blow up. `anvi-db-info` will show you the contents of the self table when you run this program on an anvi'o database.
+
+The information in the self table will be different depending on the kind of database you are looking at. For example, a [contigs-db](/help/8/artifacts/contigs-db) self table will indicate the number of contigs (and splits) in the database, whether or not gene calling was done (and with what gene callers), and which functional annotation sources have been used to annotate the genes. A [profile-db](/help/8/artifacts/profile-db) self table will list which samples it contains mapping information for, how many reads were mapped from each sample, and whether or not SNVs have been profiled. A [modules-db](/help/8/artifacts/modules-db) (see also [kegg-data](/help/8/artifacts/kegg-data)) self table will tell you how many KEGG modules are saved in the database and what the hash value of the database contents is. We could go on, but you probably get the picture.
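
Since anvi'o databases are SQLite files, the idea of a self table can be illustrated with Python's standard `sqlite3` module. The sketch below builds a toy in-memory database rather than opening a real one. The `key`/`value` column names, the `read_self_table` helper, and the stored values are assumptions for the example, and `anvi-db-info` remains the supported way to inspect a real database.

```python
import sqlite3

# Toy stand-in for an anvi'o database: a key/value table named "self".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE self (key TEXT PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO self VALUES (?, ?)",
    [("db_type", "contigs"), ("version", "20"), ("genes_are_called", "1")],
)


def read_self_table(connection):
    """Return the self table as a plain dictionary."""
    return dict(connection.execute("SELECT key, value FROM self"))


info = read_self_table(conn)
for key, value in sorted(info.items()):
    print(f"{key}: {value}")
```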
+
+### View information about a database
+
+This is the only way that most people will use this program, and it is very simple. Just provide the path to any anvi'o database to this program, and check the output on your terminal screen:
+
+
+anvi-db-info path-to-DB.db
+
+
+Let's be even more specific and say you have a [contigs-db](/help/8/artifacts/contigs-db) called `CONTIGS.db`. To look at its self table, you would run the following:
+
+anvi-db-info CONTIGS.db
+
+
+That's it! Easy-peasy lemon-squeezy.
+
+### Example output
+
+Here is an example of what you might see for a [contigs-db](/help/8/artifacts/contigs-db).
+
+```
+DB Info (no touch)
+===============================================
+Database Path ................................: CONTIGS.db
+Description ..................................: No description is given
+Type .........................................: contigs
+Variant ......................................: None
+Version ......................................: 20
+
+
+DB Info (no touch also)
+===============================================
+contigs_db_hash ..............................: d51abf0a
+split_length .................................: 20000
+kmer_size ....................................: 4
+num_contigs ..................................: 4189
+total_length .................................: 35766167
+num_splits ...................................: 4784
+genes_are_called .............................: 1
+splits_consider_gene_calls ...................: 1
+creation_date ................................: 1466453807.46107
+project_name .................................: Infant Gut Contigs from Sharon et al.
+gene_level_taxonomy_source ...................:
+scg_taxonomy_was_run .........................: 0
+external_gene_calls ..........................: 0
+external_gene_amino_acid_seqs ................: 0
+skip_predict_frame ...........................: 0
+scg_taxonomy_database_version ................: None
+trna_taxonomy_was_run ........................: 0
+trna_taxonomy_database_version ...............: None
+modules_db_hash ..............................: 72700e4db2bc
+gene_function_sources ........................: KEGG_Module,COG14_CATEGORY,COG14_FUNCTION,KEGG_Class,KOfam
+
+* Please remember that it is never a good idea to change these values. But in some
+cases it may be absolutely necessary to update something here, and a programmer
+may ask you to run this program and do it. But even then, you should be
+extremely careful.
+
+AVAILABLE GENE CALLERS
+===============================================
+* 'prodigal' (32,265 gene calls)
+* 'Ribosomal_RNAs' (9 gene calls)
+
+
+AVAILABLE FUNCTIONAL ANNOTATION SOURCES
+===============================================
+* COG14_CATEGORY (21,121 annotations)
+* COG14_FUNCTION (21,121 annotations)
+* KEGG_Class (2,760 annotations)
+* KEGG_Module (2,760 annotations)
+* KOfam (14,391 annotations)
+
+
+AVAILABLE HMM SOURCES
+===============================================
+* 'Archaea_76' (type 'singlecopy' with 76 models and 404 hits)
+* 'Bacteria_71' (type 'singlecopy' with 71 models and 674 hits)
+* 'Protista_83' (type 'singlecopy' with 83 models and 100 hits)
+* 'Ribosomal_RNAs' (type 'Ribosomal_RNAs' with 12 models and 9 hits)
+```
+
+Most of this output is self-explanatory. One thing that may not be obvious is that in many cases we use `0` to indicate 'False' and `1` to indicate 'True'. So in this example, you can see that genes were called in this database (`genes_are_called` is `1`), but neither SCG taxonomy nor tRNA taxonomy was run (both are `0`).
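
If you ever need to read these flags programmatically, the `0`/`1` convention can be made explicit with a small parser. This is a sketch that assumes the dotted `key ....: value` layout shown in the example output above; the `parse_flags` helper is hypothetical and not part of anvi'o, and the layout of the real output may change between versions.

```python
import re

# A few lines copied from the example output above.
output = """\
genes_are_called .............................: 1
scg_taxonomy_was_run .........................: 0
trna_taxonomy_was_run ........................: 0
"""

# key, a run of dots, a colon, then the value.
LINE = re.compile(r"^(\S+) \.+: ?(.*)$")


def parse_flags(text):
    """Turn 'key ....: 0|1' lines into a {key: bool} dictionary."""
    flags = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match and match.group(2) in ("0", "1"):
            flags[match.group(1)] = match.group(2) == "1"
    return flags


flags = parse_flags(output)
print(flags)
```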
+
+### Modifying database information
+We just need to start by saying - you probably shouldn't do this. Manually changing the values in the self table has the potential to break things downstream because it lets you avoid some of anvi'o's internal sanity checks which prevent you from doing things you shouldn't. If you change things and start running into ugly errors, do not be surprised.
+
+That being said, sometimes you just need to live on the edge and do some hacking, and `anvi-db-info` will let you do that. If a programmer sent you here to update a value in the self table or if you are just forging ahead on your own, this is how you would do it. Let's change the `project_name` value as an example because it is mostly descriptive and seems fairly safe:
+
+
+anvi-db-info --self-key project_name --self-value "test" CONTIGS.db
+
+
+If you run this, you will see a warning telling you what the current value of `project_name` is and what it will be changed to, but the value will not actually be changed just yet. If you are sure you want to do this, you then need to run:
+
+
+anvi-db-info --self-key project_name --self-value "test" CONTIGS.db --just-do-it
+
+
+Then go on your merry adventuring way.
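
The "show the change first, apply it only on request" pattern described above can be sketched on a toy SQLite table. This is not how `anvi-db-info` is implemented: the `set_self_value` helper, the `key`/`value` column names, and the in-memory database are assumptions for illustration only, and real databases should be changed through `anvi-db-info` itself.

```python
import sqlite3


def set_self_value(conn, key, new_value, just_do_it=False):
    """Report the pending change; apply it only when just_do_it is True."""
    row = conn.execute("SELECT value FROM self WHERE key = ?", (key,)).fetchone()
    if row is None:
        raise KeyError(key)
    current = row[0]
    print(f"'{key}': '{current}' -> '{new_value}'")
    if just_do_it:
        conn.execute("UPDATE self SET value = ? WHERE key = ?", (new_value, key))
        conn.commit()
    return current


# Toy in-memory table standing in for a real self table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE self (key TEXT PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO self VALUES ('project_name', 'Infant Gut')")

set_self_value(conn, "project_name", "test")                   # preview only
set_self_value(conn, "project_name", "test", just_do_it=True)  # applies it
```

Keeping the write behind an explicit opt-in means a mistyped key or value shows up in the preview before anything is modified.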
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-db-info.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-db-info) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-db-info/network.json b/help/8/programs/anvi-db-info/network.json
new file mode 100644
index 00000000..01ef8cc9
--- /dev/null
+++ b/help/8/programs/anvi-db-info/network.json
@@ -0,0 +1,95 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "structure-db",
+ "name": "structure-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genes-db",
+ "name": "genes-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 6,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-db-info",
+ "name": "anvi-db-info",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "target": 6,
+ "source": 0
+ },
+ {
+ "target": 6,
+ "source": 1
+ },
+ {
+ "target": 6,
+ "source": 2
+ },
+ {
+ "target": 6,
+ "source": 3
+ },
+ {
+ "target": 6,
+ "source": 4
+ },
+ {
+ "target": 6,
+ "source": 5
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-delete-collection/index.md b/help/8/programs/anvi-delete-collection/index.md
new file mode 100644
index 00000000..7a7104ec
--- /dev/null
+++ b/help/8/programs/anvi-delete-collection/index.md
@@ -0,0 +1,67 @@
+---
+layout: program
+title: anvi-delete-collection
+excerpt: An anvi'o program. Remove a collection from a given profile database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-delete-collection
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Remove a collection from a given profile database.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[profile-db](../../artifacts/profile-db) [collection](../../artifacts/collection)
+
+
+## Can provide
+
+
+This program does not seem to provide any artifacts. Such programs usually print out some information for you to see or alter some anvi'o artifacts without producing any immediate outputs.
+
+
+## Usage
+
+
+This program, as implied by the name, will delete a [collection](/help/8/artifacts/collection) from a [profile-db](/help/8/artifacts/profile-db) or a [pan-db](/help/8/artifacts/pan-db).
+
+Using this program will delete the [collection](/help/8/artifacts/collection) and every [bin](/help/8/artifacts/bin) it describes from the database forever and without any additional warning. If you are not sure whether you may need a given collection later, it may be a good idea to export your binning effort into a [collection-txt](/help/8/artifacts/collection-txt) using [anvi-export-collection](/help/8/programs/anvi-export-collection) before deleting it, just to be safe.
+
+To list available collections in a database you can use the program [anvi-show-collections-and-bins](/help/8/programs/anvi-show-collections-and-bins). When you know which collection you wish to remove, you can run the program on the target collection name:
+
+
+anvi-delete-collection -p [profile-db](/help/8/artifacts/profile-db) \
+ -C [collection](/help/8/artifacts/collection)
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-delete-collection.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-delete-collection) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-delete-collection/network.json b/help/8/programs/anvi-delete-collection/network.json
new file mode 100644
index 00000000..9af4fbb4
--- /dev/null
+++ b/help/8/programs/anvi-delete-collection/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-delete-collection",
+ "name": "anvi-delete-collection",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "target": 2,
+ "source": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-delete-functions/index.md b/help/8/programs/anvi-delete-functions/index.md
new file mode 100644
index 00000000..01576c6a
--- /dev/null
+++ b/help/8/programs/anvi-delete-functions/index.md
@@ -0,0 +1,55 @@
+---
+layout: program
+title: anvi-delete-functions
+excerpt: An anvi'o program. Remove functional annotation sources from an anvi'o contigs database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-delete-functions
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Remove functional annotation sources from an anvi'o contigs database.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [functions](../../artifacts/functions)
+
+
+## Can provide
+
+
+This program does not seem to provide any artifacts. Such programs usually print out some information for you to see or alter some anvi'o artifacts without producing any immediate outputs.
+
+
+## Usage
+
+
+{:.notice}
+**No one has described the usage of this program** :/ If you would like to contribute, please see previous examples [here](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs), and feel free to add a Markdown formatted file in that directory named "anvi-delete-functions.md". For a template, you can use the markdown file for `anvi-gen-contigs-database`. THANK YOU!
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-delete-functions) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-delete-functions/network.json b/help/8/programs/anvi-delete-functions/network.json
new file mode 100644
index 00000000..16899e32
--- /dev/null
+++ b/help/8/programs/anvi-delete-functions/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions",
+ "name": "functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-delete-functions",
+ "name": "anvi-delete-functions",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "target": 2,
+ "source": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-delete-hmms/index.md b/help/8/programs/anvi-delete-hmms/index.md
new file mode 100644
index 00000000..fbb2b0fc
--- /dev/null
+++ b/help/8/programs/anvi-delete-hmms/index.md
@@ -0,0 +1,74 @@
+---
+layout: program
+title: anvi-delete-hmms
+excerpt: An anvi'o program. Remove HMM hits from an anvi'o contigs database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-delete-hmms
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Remove HMM hits from an anvi'o contigs database.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [hmm-source](../../artifacts/hmm-source) [hmm-hits](../../artifacts/hmm-hits)
+
+
+## Can provide
+
+
+This program does not seem to provide any artifacts. Such programs usually print out some information for you to see or alter some anvi'o artifacts without producing any immediate outputs.
+
+
+## Usage
+
+
+This program, as implied by the name, is used to delete [hmm-hits](/help/8/artifacts/hmm-hits) from a [contigs-db](/help/8/artifacts/contigs-db). This way, you can repopulate the function annotations with a different source or program or just delete data that's clogging up the interface.
+
+It is generally a good idea to export your information before deleting it, just in case. The HMM hits will show up in most displays, so if you've already run [anvi-summarize](/help/8/programs/anvi-summarize), you should be good.
+
+To list available [hmm-source](/help/8/artifacts/hmm-source)s in a database, call
+
+
+anvi-delete-hmms -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --list-hmm-sources
+
+
+Then, you can easily delete [hmm-hits](/help/8/artifacts/hmm-hits) from a specific source with the command
+
+
+anvi-delete-hmms -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --hmm-source [hmm-source](/help/8/artifacts/hmm-source)
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-delete-hmms.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-delete-hmms) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-delete-hmms/network.json b/help/8/programs/anvi-delete-hmms/network.json
new file mode 100644
index 00000000..7b82d236
--- /dev/null
+++ b/help/8/programs/anvi-delete-hmms/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "hmm-source",
+ "name": "hmm-source",
+ "provided_by_anvio": false,
+ "type": "HMM"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "hmm-hits",
+ "name": "hmm-hits",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-delete-hmms",
+ "name": "anvi-delete-hmms",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "target": 3,
+ "source": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-delete-misc-data/index.md b/help/8/programs/anvi-delete-misc-data/index.md
new file mode 100644
index 00000000..e0d4e91f
--- /dev/null
+++ b/help/8/programs/anvi-delete-misc-data/index.md
@@ -0,0 +1,134 @@
+---
+layout: program
+title: anvi-delete-misc-data
+excerpt: An anvi'o program. Remove stuff from 'additional data' or 'order' tables for either items or layers in either pan or profile databases.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-delete-misc-data
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Remove stuff from 'additional data' or 'order' tables for either items or layers in either pan or profile databases. OR, remove stuff from the 'additional data' tables for nucleotides or amino acids in contigs databases.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[pan-db](../../artifacts/pan-db) [profile-db](../../artifacts/profile-db) [misc-data-items](../../artifacts/misc-data-items) [misc-data-layers](../../artifacts/misc-data-layers) [misc-data-layer-orders](../../artifacts/misc-data-layer-orders) [misc-data-nucleotides](../../artifacts/misc-data-nucleotides) [misc-data-amino-acids](../../artifacts/misc-data-amino-acids)
+
+
+## Can provide
+
+
+This program does not seem to provide any artifacts. Such programs usually print out some information for you to see or alter some anvi'o artifacts without producing any immediate outputs.
+
+
+## Usage
+
+
+After you've added misc-data of some kind ([misc-data-items](/help/8/artifacts/misc-data-items), [misc-data-layers](/help/8/artifacts/misc-data-layers), [misc-data-layer-orders](/help/8/artifacts/misc-data-layer-orders), [misc-data-nucleotides](/help/8/artifacts/misc-data-nucleotides), or [misc-data-amino-acids](/help/8/artifacts/misc-data-amino-acids)) using [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data), you can **delete that data and remove it from the interactive interface** using this program.
+
+This program will release your data into the ether, never to be seen again. If you would like to first export it into a text file (so that it can be seen again), you can do so with [anvi-export-misc-data](/help/8/programs/anvi-export-misc-data).
+
+This program only works on data that is listed as an available key (most often because it was previously imported by the user). To view available keys, call either
+
+
+anvi-delete-misc-data -p [profile-db](/help/8/artifacts/profile-db) \
+ --target-data-table items|layers|layer_orders \
+ --list-available-keys
+
+
+or
+
+
+anvi-delete-misc-data -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --target-data-table nucleotides|amino_acids \
+ --list-available-keys
+
+
+where you choose the appropriate option for `--target-data-table`.
+
+If your misc-data is associated with a specific data group, you can provide that data group to this program with the `-D` flag.
+
+## Data types you can delete
+
+### From a pan-db or profile-db: items, layers, layer orders
+
+**From a [pan-db](/help/8/artifacts/pan-db) or [profile-db](/help/8/artifacts/profile-db), you can delete**
+
+- items data ([misc-data-items](/help/8/artifacts/misc-data-items))
+
+
+anvi-delete-misc-data -p [profile-db](/help/8/artifacts/profile-db) \
+ --target-data-table items \
+ --keys-to-remove key_1
+
+
+- layers data ([misc-data-layers](/help/8/artifacts/misc-data-layers))
+
+
+anvi-delete-misc-data -p [pan-db](/help/8/artifacts/pan-db) \
+ --target-data-table layers \
+ --keys-to-remove key_2,key_3
+
+
+- layer orders data ([misc-data-layer-orders](/help/8/artifacts/misc-data-layer-orders))
+
+
+anvi-delete-misc-data -p [profile-db](/help/8/artifacts/profile-db) \
+ --target-data-table layer_orders \
+ --keys-to-remove key_4
+
+
+### From a contigs-db: nucleotide and amino acid data
+
+**From a [contigs-db](/help/8/artifacts/contigs-db), you can delete**
+
+- nucleotide data ([misc-data-nucleotides](/help/8/artifacts/misc-data-nucleotides))
+
+
+anvi-delete-misc-data -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --target-data-table nucleotides \
+ --keys-to-remove key_1
+
+
+- amino acid data ([misc-data-amino-acids](/help/8/artifacts/misc-data-amino-acids))
+
+
+anvi-delete-misc-data -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --target-data-table amino_acids \
+ --keys-to-remove key_2
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-delete-misc-data.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Working with anvi'o additional data tables](http://merenlab.org/2017/12/11/additional-data-tables/#views-items-layers-orders-some-anvio-terminology)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-delete-misc-data) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-delete-misc-data/network.json b/help/8/programs/anvi-delete-misc-data/network.json
new file mode 100644
index 00000000..e594ac39
--- /dev/null
+++ b/help/8/programs/anvi-delete-misc-data/network.json
@@ -0,0 +1,108 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-items",
+ "name": "misc-data-items",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-layers",
+ "name": "misc-data-layers",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-layer-orders",
+ "name": "misc-data-layer-orders",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-nucleotides",
+ "name": "misc-data-nucleotides",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-amino-acids",
+ "name": "misc-data-amino-acids",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 7,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-delete-misc-data",
+ "name": "anvi-delete-misc-data",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "target": 7,
+ "source": 0
+ },
+ {
+ "target": 7,
+ "source": 1
+ },
+ {
+ "target": 7,
+ "source": 2
+ },
+ {
+ "target": 7,
+ "source": 3
+ },
+ {
+ "target": 7,
+ "source": 4
+ },
+ {
+ "target": 7,
+ "source": 5
+ },
+ {
+ "target": 7,
+ "source": 6
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-delete-state/index.md b/help/8/programs/anvi-delete-state/index.md
new file mode 100644
index 00000000..8da4da84
--- /dev/null
+++ b/help/8/programs/anvi-delete-state/index.md
@@ -0,0 +1,74 @@
+---
+layout: program
+title: anvi-delete-state
+excerpt: An anvi'o program. Delete an anvi'o state from a pan or profile database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-delete-state
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Delete an anvi'o state from a pan or profile database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[pan-db](../../artifacts/pan-db) [profile-db](../../artifacts/profile-db) [state](../../artifacts/state)
+
+
+## Can provide
+
+
+This program does not seem to provide any artifacts. Such programs usually print out some information for you to see or alter some anvi'o artifacts without producing any immediate outputs.
+
+
+## Usage
+
+
+This program, as implied by the name, is used to delete a [state](/help/8/artifacts/state) from a [pan-db](/help/8/artifacts/pan-db) or [profile-db](/help/8/artifacts/profile-db). This way, you can remove states that are clogging up the state list in the interface.
+
+It is generally a good idea to export your state before deleting it, just in case (you can do so with [anvi-export-state](/help/8/programs/anvi-export-state)).
+
+To list available [state](/help/8/artifacts/state)s in a database, call
+
+
+anvi-delete-state -p [pan-db](/help/8/artifacts/pan-db) \
+ --list-states
+
+
+Then, you can easily delete a [state](/help/8/artifacts/state) with the command
+
+
+anvi-delete-state -p [profile-db](/help/8/artifacts/profile-db) \
+ -s [state](/help/8/artifacts/state)
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-delete-state.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-delete-state) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-delete-state/network.json b/help/8/programs/anvi-delete-state/network.json
new file mode 100644
index 00000000..f93a339d
--- /dev/null
+++ b/help/8/programs/anvi-delete-state/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "state",
+ "name": "state",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-delete-state",
+ "name": "anvi-delete-state",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "target": 3,
+ "source": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-dereplicate-genomes/index.md b/help/8/programs/anvi-dereplicate-genomes/index.md
new file mode 100644
index 00000000..50a6894f
--- /dev/null
+++ b/help/8/programs/anvi-dereplicate-genomes/index.md
@@ -0,0 +1,129 @@
+---
+layout: program
+title: anvi-dereplicate-genomes
+excerpt: An anvi'o program. Identify redundant (highly similar) genomes.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-dereplicate-genomes
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Identify redundant (highly similar) genomes.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[external-genomes](../../artifacts/external-genomes) [internal-genomes](../../artifacts/internal-genomes) [fasta](../../artifacts/fasta) [genome-similarity](../../artifacts/genome-similarity)
+
+
+## Can provide
+
+
+[fasta](../../artifacts/fasta)
+
+
+## Usage
+
+
+
+This program uses the user's similarity metric of choice to identify genomes that are highly similar to each other, and groups them together into redundant clusters. The program finds representative sequences for each cluster and outputs them into [fasta](/help/8/artifacts/fasta) files.
+
+
+#### Input Options
+
+You have two options for the input to this program:
+
+- the results of [anvi-compute-genome-similarity](/help/8/programs/anvi-compute-genome-similarity) (a [genome-similarity](/help/8/artifacts/genome-similarity) directory). If you used `fastANI` or `pyANI` when you ran [anvi-compute-genome-similarity](/help/8/programs/anvi-compute-genome-similarity), provide this using the parameter `--ani-dir`; if you used sourmash, use the parameter `--mash-dir`.
+
+- an [internal-genomes](/help/8/artifacts/internal-genomes), [external-genomes](/help/8/artifacts/external-genomes) or a series of [fasta](/help/8/artifacts/fasta) files (each of which represents a genome), in which case anvi'o will run [anvi-compute-genome-similarity](/help/8/programs/anvi-compute-genome-similarity) for you. When providing these inputs, you can also provide any of the parameters that [anvi-compute-genome-similarity](/help/8/programs/anvi-compute-genome-similarity) can take, including the `--program` you want to use (out of [PyANI](https://github.com/widdowquinn/pyani), [fastANI](https://github.com/ParBLiSS/FastANI), [sourmash](https://sourmash.readthedocs.io/en/latest/)) and their parameters. Details about all of this can be found in the help menu for [anvi-compute-genome-similarity](/help/8/programs/anvi-compute-genome-similarity).
+
+#### Output Format
+
+By default, the output of this program is a directory containing two descriptive text files (the cluster report and fasta report) and a subdirectory called `GENOMES`:
+
+- The cluster report is a tab-delimited text file where each row describes a cluster. This file contains four columns: the cluster name, the number of genomes in the cluster, the representative genome of the cluster, and a list of the genomes that are in the cluster. Here is an example describing 11 genomes in three clusters:
+
+|**cluster**|**size**|**representative**|**genomes**|
+|:--|:--|:--|:--|
+|cluster_000001|1|G11_IGD_MAG_00001|G11_IGD_MAG_00001|
+|cluster_000002|8|G11_IGD_MAG_00012|G08_IGD_MAG_00008,G33_IGD_MAG_00011,G01_IGD_MAG_00013,G06_IGD_MAG_00023,G03_IGD_MAG_00021,G05_IGD_MAG_00014,G11_IGD_MAG_00012,G10_IGD_MAG_00010|
+|cluster_000003|2|G03_IGD_MAG_00011|G11_IGD_MAG_00013,G03_IGD_MAG_00011|
+
+- The subdirectory `GENOMES` contains fasta files describing the representative genome from each cluster. For example, if your original set of genomes had two identical genomes, this program would cluster them together, and the `GENOMES` folder would only include one of their sequences.
+
+- The fasta report describes the fasta files contained in the subdirectory `GENOMES`. By default, this describes the representative sequence of each of the final clusters. It tells you the genome name, its source, its cluster (and the representative sequence of that cluster), and the path to its fasta file in `GENOMES`. So, for the example above, the fasta report would look like this:
+
+|**name**|**source**|**cluster**|**cluster_rep**|**path**|
+|:--|:--|:--|:--|:--|
+|G11_IGD_MAG_00001|fasta|cluster_000001|G11_IGD_MAG_00001|GENOMES/G11_IGD_MAG_00001.fa|
+|G11_IGD_MAG_00012|fasta|cluster_000002|G11_IGD_MAG_00012|GENOMES/G11_IGD_MAG_00012.fa|
+|G03_IGD_MAG_00011|fasta|cluster_000003|G03_IGD_MAG_00011|GENOMES/G03_IGD_MAG_00011.fa|
+
+You can also choose to report all genome fasta files (including redundant genomes) (with `--report-all`) or report no fasta files (with `--skip-fasta-report`). This would change the fasta files included in `GENOMES` and the genomes mentioned in the fasta report. The cluster report would be identical.
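
The cluster report described above is easy to process downstream. The following is a minimal sketch, assuming the header names in the file match the columns shown in the example table (`cluster`, `size`, `representative`, `genomes`); the in-memory report and the function name `read_cluster_report` are made up for illustration and are not part of anvi'o.

```python
import csv
import io

# A small in-memory stand-in for a cluster report (TAB-delimited).
# The header names are an assumption based on the example table above.
EXAMPLE_REPORT = (
    "cluster\tsize\trepresentative\tgenomes\n"
    "cluster_000001\t1\tG11_IGD_MAG_00001\tG11_IGD_MAG_00001\n"
    "cluster_000003\t2\tG03_IGD_MAG_00011\tG11_IGD_MAG_00013,G03_IGD_MAG_00011\n"
)


def read_cluster_report(handle):
    """Map each cluster name to its representative and member genomes."""
    clusters = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        members = row["genomes"].split(",")
        # the 'size' column should agree with the number of listed genomes
        if len(members) != int(row["size"]):
            raise ValueError(f"size mismatch in {row['cluster']}")
        clusters[row["cluster"]] = {
            "representative": row["representative"],
            "members": members,
        }
    return clusters


clusters = read_cluster_report(io.StringIO(EXAMPLE_REPORT))
for name, info in clusters.items():
    # genomes that were judged redundant with the representative
    redundant = [g for g in info["members"] if g != info["representative"]]
    print(name, info["representative"], redundant)
```

To read a real report, you would pass an open file handle instead of the `io.StringIO` object.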
+
+#### Required Parameters and Example Runs
+
+You are required to set the threshold for two genomes to be considered redundant and put in the same cluster.
+
+For example, if you had the results from an [anvi-compute-genome-similarity](/help/8/programs/anvi-compute-genome-similarity) run where you had used `pyANI` and wanted the threshold to be 90 percent, you would run:
+
+
+anvi-dereplicate-genomes --ani-dir [genome-similarity](/help/8/artifacts/genome-similarity) \
+ -o path/to/output \
+ --program pyANI \
+ --similarity-threshold 0.90
+
+
+If you hadn't yet run [anvi-compute-genome-similarity](/help/8/programs/anvi-compute-genome-similarity) and instead wanted to cluster the genomes in your [external-genomes](/help/8/artifacts/external-genomes) file with similarity 85 percent or more (no fasta files necessary) using sourmash, you could run:
+
+
+anvi-dereplicate-genomes -e [external-genomes](/help/8/artifacts/external-genomes) \
+ --skip-fasta-report \
+ --program sourmash \
+ -o path/to/output \
+ --similarity-threshold 0.85
+
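
To build some intuition for what the similarity threshold does, here is a toy sketch that groups genomes whenever a pair meets the threshold. It assumes single-linkage grouping (any one sufficiently similar pair joins two clusters), which is an illustrative choice and not necessarily the rule anvi'o applies internally; the function name and the similarity values are made up.

```python
def cluster_by_threshold(names, similarity, threshold):
    """Single-linkage grouping: two genomes share a cluster if a chain of
    pairs with similarity >= threshold connects them (an assumption made
    for illustration)."""
    parent = {name: name for name in names}

    def find(x):
        # follow parent links to the cluster root, shortening the path
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarity.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


names = ["A", "B", "C", "D"]
similarity = {("A", "B"): 0.97, ("A", "C"): 0.88, ("B", "C"): 0.91, ("C", "D"): 0.60}

print(cluster_by_threshold(names, similarity, 0.90))
print(cluster_by_threshold(names, similarity, 0.95))
```

Raising the threshold from 0.90 to 0.95 splits genome `C` away from `A` and `B` in this toy input, which is the kind of effect to expect when choosing a stricter value.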
+
+#### Other parameters
+
+You can change how anvi'o picks the representative sequence from each cluster with the parameter `--representative-method`. For this you have three options:
+
+- `Qscore`: picks the genome with highest completion and lowest redundancy
+- `length`: picks the longest genome in the cluster
+- `centrality` (default): picks the genome with highest average similarity to every other genome in the cluster
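
As a rough illustration of the `centrality` idea described above, the sketch below picks the cluster member with the highest average similarity to the others. The function name, the pairwise similarity dictionary, and its values are hypothetical; this is not anvi'o's own implementation.

```python
def pick_by_centrality(members, similarity):
    """Return the member with the highest mean similarity to the others."""
    if len(members) == 1:
        return members[0]

    def score(a, b):
        # each unordered pair is stored once, so check both orientations
        return similarity.get((a, b), similarity.get((b, a), 0.0))

    def mean_similarity(genome):
        others = [m for m in members if m != genome]
        return sum(score(genome, other) for other in others) / len(others)

    return max(members, key=mean_similarity)


members = ["A", "B", "C"]
similarity = {("A", "B"): 0.97, ("A", "C"): 0.92, ("B", "C"): 0.99}

# mean similarities: A = 0.945, B = 0.98, C = 0.955
print(pick_by_centrality(members, similarity))
```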
+
+You can also choose to skip checking genome hashes (which will warn you if you have identical sequences in separate genomes with different names), provide a log path for debug messages or use multithreading (relevant only if not providing `--ani-dir` or `--mash-dir`).
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-dereplicate-genomes.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-dereplicate-genomes) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-dereplicate-genomes/network.json b/help/8/programs/anvi-dereplicate-genomes/network.json
new file mode 100644
index 00000000..a9ea3c0d
--- /dev/null
+++ b/help/8/programs/anvi-dereplicate-genomes/network.json
@@ -0,0 +1,73 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 2,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "fasta",
+ "name": "fasta",
+ "provided_by_anvio": false,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-genomes",
+ "name": "external-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "internal-genomes",
+ "name": "internal-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genome-similarity",
+ "name": "genome-similarity",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-dereplicate-genomes",
+ "name": "anvi-dereplicate-genomes",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "target": 4,
+ "source": 0
+ },
+ {
+ "target": 4,
+ "source": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-display-contigs-stats/index.md b/help/8/programs/anvi-display-contigs-stats/index.md
new file mode 100644
index 00000000..d2258511
--- /dev/null
+++ b/help/8/programs/anvi-display-contigs-stats/index.md
@@ -0,0 +1,171 @@
+---
+layout: program
+title: anvi-display-contigs-stats
+excerpt: An anvi'o program. Start the anvi'o interactive interface for viewing or comparing contigs statistics.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-display-contigs-stats
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Start the anvi'o interactive interface for viewing or comparing contigs statistics.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+[contigs-stats](../../artifacts/contigs-stats) [interactive](../../artifacts/interactive) [svg](../../artifacts/svg)
+
+
+## Usage
+
+
+This program **helps you make sense of contigs in one or more [contigs-db](/help/8/artifacts/contigs-db)s**.
+
+### Working with single or multiple contigs databases
+
+You can use this program on a single contigs database the following way:
+
+
+anvi-display-contigs-stats CONTIGS-01.db
+
+
+Alternatively, you may use it to compare multiple contigs databases:
+
+
+anvi-display-contigs-stats CONTIGS-01.db \
+ CONTIGS-02.db \
+ (...)
+ CONTIGS-XX.db
+
+
+If you are comparing multiple databases, each contigs database will become an individual column in all outputs.
+
+### Interactive output
+
+If you run this program on an anvi'o contigs database with default parameters,
+
+
+anvi-display-contigs-stats [contigs-db](/help/8/artifacts/contigs-db)
+
+
+it will open an interactive interface that looks like this:
+
+![An example of the anvi'o interface for contigs stats](../../images/contigs-stats-interface-example.png)
+
+At the top of the page are two graphs:
+
+* The bars in the top graph represent every integer N and L statistic from 1 to 100. The y-axis is the respective N length and the x-axis is the percentage of the total dataset looked at (the exact L and N values can be seen by hovering over each bar). In other words, if you had sorted your contigs by length (from longest to shortest), and walked through each one, every time you had seen another 1 percent of your total dataset, you would add a bar to the graph showing the number of contigs that you had seen (the L statistic) and the length of the one you were looking at at the moment (the N statistic).
+
+* The lower graph tells you which HMM hits your contigs database has. Each column is a gene in a specific [hmm-source](/help/8/artifacts/hmm-source), and the graph tells you how many hits each gene has in your data. (Hover your mouse over the graph to see the specifics of each gene.) The sidebar shows you how many of the genes in this graph were seen exactly that many times. For example, in the graph above, for the Bacteria_71 [hmm-source](/help/8/artifacts/hmm-source), a lot of genes were detected 9-11 times, so those bars are longer. This helps you estimate how many genomes there are in your contigs database (so here, there are likely around 9-11 bacterial genomes in this contigs database).
+
+Below the graphs are the **contigs stats** which are displayed in the following order:
+
+- The total length of your contigs in nucleotides
+- The number of contigs in your database
+- The number of contigs that are of varying lengths. (for example "Num Contigs > 2.5 kb" gives you the number of contigs that are longer than 2500 base pairs)
+- The length of the longest and shortest contig in your database in nucleotides
+- The number of genes in your contigs (as predicted by [Prodigal](https://github.com/hyattpd/Prodigal))
+- L50, L75, L90: If you ordered the contigs in your database from longest to shortest, these stats describe the *number of contigs* you would need to go through before you had looked at a certain percent of a genome. For example, L50 describes the number of contigs you would have to go through before you reached 50 percent of the entire dataset.
+- N50, N75, N90: If you ordered the contigs in your database from longest to shortest, these stats describe the *length of the contig* you would be looking at when you had looked at a certain percent of a genome. For example, N50 describes the length of the contig you would be on when you reached 50 percent of the entire genome length.
+- The number of HMM hits in your contigs. This goes through every [hmm-source](/help/8/artifacts/hmm-source) and gives the number of hits its genes had in all of your contigs. Basically, this is the number of hits that is given in the lower graph at the top of the page.
+- The number of genomes that anvi'o predicts are in your sample, based on how many hits the single copy core genes got from the various [hmm-source](/help/8/artifacts/hmm-source)s. See the description of the lower graph above, or [this blog post](http://merenlab.org/2015/12/07/predicting-number-of-genomes/) for more information.
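
The N and L statistics in the list above can be made concrete with a short sketch. The function name `nx_lx` and the toy contig lengths are made up for illustration; this follows the definition given in the text rather than anvi'o's own code.

```python
def nx_lx(contig_lengths, percent):
    """Return (N, L) for the given percent of the total length.

    N is the length of the contig at which the running total first reaches
    `percent` of the total; L is how many contigs it took to get there.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")

    ordered = sorted(contig_lengths, reverse=True)  # longest first
    total = sum(ordered)

    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # compare scaled values instead of dividing, to avoid rounding issues
        if running * 100 >= total * percent:
            return length, count


lengths = [100, 80, 60, 40, 20]  # toy contig lengths, total = 300
for pct in (50, 75, 90):
    n, l = nx_lx(lengths, pct)
    print(f"N{pct} = {n}, L{pct} = {l}")
```

With these toy lengths, half of the 300 nucleotides are covered after the second contig, so N50 is 80 and L50 is 2.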
+
+
+### Text output
+
+If you wish to report [contigs-db](/help/8/artifacts/contigs-db) stats as a supplementary table, a text output will be much more appropriate. If you add the flag `--report-as-text`, anvi'o will not attempt to initiate an interactive interface, and instead will report the stats as a TAB-delimited file:
+
+
+anvi-display-contigs-stats [contigs-db](/help/8/artifacts/contigs-db) \
+ --report-as-text \
+ -o OUTPUT_FILE_NAME.txt
+
+
+There is also another flag you can add to get the output formatted as markdown, which makes it easier to copy-paste to GitHub or other markdown-friendly services. This is how you get a markdown output instead:
+
+
+anvi-display-contigs-stats [contigs-db](/help/8/artifacts/contigs-db) \
+ --report-as-text \
+ --as-markdown \
+ -o OUTPUT_FILE_NAME.md
+
+
+Here is an example output:
+
+contigs_db|oral_HMW_4_1|oral_HMW_4_2|oral_HMW_4_1_SS|oral_HMW_4_2_SS
+--|--|--|--|--
+Total Length|531641122|759470437|306115616|288581831
+Num Contigs|468071|1007070|104273|148873
+Num Contigs > 5 kb|19626|24042|25014|20711
+Num Contigs > 10 kb|6403|8936|3531|2831
+Num Contigs > 20 kb|1269|2294|300|407
+Num Contigs > 50 kb|34|95|3|10
+Num Contigs > 100 kb|0|0|0|0
+Longest Contig|73029|92515|57337|63976
+Shortest Contig|56|51|80|85
+Num Genes (prodigal)|676577|994050|350657|327423
+L50|38513|62126|17459|17161
+L75|143030|328008|33063|35530
+L90|301803|670992|53293|70806
+N50|2810|1929|6106|5594
+N75|686|410|3536|2422
+N90|394|275|1360|640
+Archaea_76|1594|1697|930|805
+Protista_83|6|1|1|0
+Ribosomal_RNAs|901|1107|723|647
+Bacteria_71|2893|3131|1696|1441
+archaea (Archaea_76)|0|0|0|0
+eukarya (Protista_83)|0|0|0|0
+bacteria (Bacteria_71)|33|26|20|18
+
+You can easily convert the markdown output into PDF or HTML pages using [pandoc](https://pandoc.org/). For instance, running the following command on the previous output,
+
+```
+pandoc -V geometry:landscape \
+ OUTPUT_FILE_NAME.md \
+ -o OUTPUT_FILE_NAME.pdf
+```
+
+will result in a PDF file that looks like this:
+
+![an anvi'o display](../../images/display_contigs_stats_pandoc_output.png){:.center-img}
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-display-contigs-stats.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-display-contigs-stats) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-display-contigs-stats/network.json b/help/8/programs/anvi-display-contigs-stats/network.json
new file mode 100644
index 00000000..28174116
--- /dev/null
+++ b/help/8/programs/anvi-display-contigs-stats/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-stats",
+ "name": "contigs-stats",
+ "provided_by_anvio": true,
+ "type": "STATS"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "interactive",
+ "name": "interactive",
+ "provided_by_anvio": true,
+ "type": "DISPLAY"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "svg",
+ "name": "svg",
+ "provided_by_anvio": true,
+ "type": "SVG"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-display-contigs-stats",
+ "name": "anvi-display-contigs-stats",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "source": 4,
+ "target": 1
+ },
+ {
+ "source": 4,
+ "target": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-display-functions/index.md b/help/8/programs/anvi-display-functions/index.md
new file mode 100644
index 00000000..e9befb53
--- /dev/null
+++ b/help/8/programs/anvi-display-functions/index.md
@@ -0,0 +1,185 @@
+---
+layout: program
+title: anvi-display-functions
+excerpt: An anvi'o program. Start an anvi'o interactive display to see functions across genomes.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-display-functions
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Start an anvi'o interactive display to see functions across genomes.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[functions](../../artifacts/functions) [genomes-storage-db](../../artifacts/genomes-storage-db) [internal-genomes](../../artifacts/internal-genomes) [external-genomes](../../artifacts/external-genomes) [groups-txt](../../artifacts/groups-txt)
+
+
+## Can provide
+
+
+[interactive](../../artifacts/interactive) [functional-enrichment-txt](../../artifacts/functional-enrichment-txt)
+
+
+## Usage
+
+
+For a given annotation source for [functions](/help/8/artifacts/functions), this program will display distribution patterns of unique function names (or accession numbers) across genomes stored in anvi'o databases.
+
+It is a powerful way to analyze differentially occurring functions for any source of annotation that is shared across all genomes.
+
+Currently, [anvi-display-functions](/help/8/programs/anvi-display-functions) can work with any combination of genomes from [external-genomes](/help/8/artifacts/external-genomes), [internal-genomes](/help/8/artifacts/internal-genomes), and [genomes-storage-db](/help/8/artifacts/genomes-storage-db).
+
+{:.notice}
+If you are only interested in a text output, see the program [anvi-script-gen-function-matrix-across-genomes](/help/8/programs/anvi-script-gen-function-matrix-across-genomes) that can report [functions-across-genomes-txt](/help/8/artifacts/functions-across-genomes-txt) files.
+
+### Quick & Simple Run
+
+The simplest way to run this program is as follows:
+
+
+anvi-display-functions -e [external-genomes](/help/8/artifacts/external-genomes) \
+ --annotation-source KOfam \
+ --profile-db KOFAM-PROFILE.db
+
+
+You can replace the annotation source based on what is available across your genomes. You can use the program [anvi-db-info](/help/8/programs/anvi-db-info) to see all available function annotation sources in a given [contigs-db](/help/8/artifacts/contigs-db) or [genomes-storage-db](/help/8/artifacts/genomes-storage-db). You can also use the program [anvi-import-functions](/help/8/programs/anvi-import-functions) to import ANY kind of functional grouping of your genes and use those ad hoc functional sources to display their distribution across genomes. Please see [functions](/help/8/artifacts/functions) for more information on functions and how to obtain them.
+
+{:.notice}
+Please note that a [profile-db](/help/8/artifacts/profile-db) will be automatically generated for you. Once it is generated, the same profile database can be visualized over and over again using [anvi-interactive](/help/8/programs/anvi-interactive) in manual mode, without having to retain any other files.
+
+
+### Combining genomes from multiple sources
+
+You can run this program by combining genomes from multiple sources:
+
+
+anvi-display-functions -e [external-genomes](/help/8/artifacts/external-genomes) \
+ -i [internal-genomes](/help/8/artifacts/internal-genomes) \
+ -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ --annotation-source KOfam \
+ --profile-db KOFAM-PROFILE.db
+
+
+
+This way, you can bring together functions in your metagenome-assembled genomes, the isolates you have acquired from external sources, and even genomes in an anvi'o pangenome into a single framework in a disturbingly easy fashion.
+
+### Performing functional enrichment analysis for free
+
+This is an optional step, but may be very useful for some investigations. If your genomes are divided into meaningful groups, you can also perform a functional enrichment analysis while running this program. All you need to do for this to be included in your analysis is to provide a [groups-txt](/help/8/artifacts/groups-txt) file that describes which genome belongs to which group:
+
+
+anvi-display-functions -e [external-genomes](/help/8/artifacts/external-genomes) \
+ --groups-txt [groups-txt](/help/8/artifacts/groups-txt) \
+ --annotation-source KOfam \
+ --profile-db KOFAM-PROFILE.db
+
+
+If you are using multiple sources for your genomes, you may not immediately know which genomes to list in your [groups-txt](/help/8/artifacts/groups-txt) file. In that case, you can first run the program with this additional parameter,
+
+
+anvi-display-functions -e [external-genomes](/help/8/artifacts/external-genomes) \
+ -i [internal-genomes](/help/8/artifacts/internal-genomes) \
+ -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ --annotation-source COG20_FUNCTION \
+ --profile-db COGS-PROFILE.db \
+ --print-genome-names-and-quit
+
+
+in which case anvi'o will recover the genomes from all sources and print out their names, so that you can create a groups file before re-running the program with it.
+
+This analysis will add the following additional layers in your [interactive](/help/8/artifacts/interactive) display: 'enrichment_score', 'unadjusted_p_value', 'adjusted_q_value', 'associated_groups'. See [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt) to learn more about these columns.
+
+### Aggregating functions using accession IDs
+
+Once it is run, this program essentially aggregates all function names that occur in one or more genomes among the set of genomes found in input sources. The user can ask the program to use accession IDs to aggregate functions rather than function names:
+
+
+anvi-display-functions -e [external-genomes](/help/8/artifacts/external-genomes) \
+ --annotation-source KOfam \
+ --profile-db KOFAM-PROFILE.db \
+ --aggregate-based-on-accession
+
+
+While the default setting, which is to use function names, will be appropriate for most applications, using accession IDs instead of function names may be important for specific purposes. The two approaches can give different results, since multiple accession IDs in various databases may correspond to the same function. Aggregating by accession ID may therefore lead to misleading enrichment analyses downstream, as identical function annotations may be over-split into multiple groups. This is why the default aggregation method uses function names.
+
+### Aggregating functions using all function hits
+
+This is a bit confusing, but actually it is not. In some cases a gene may be annotated with more than one function name. This is a decision often made at the level of the function annotation tool. For instance, [anvi-run-ncbi-cogs](/help/8/programs/anvi-run-ncbi-cogs) may yield two COG annotations for a single gene because the significance scores for both hits exceed the default cutoff. While this can be useful in [anvi-summarize](/help/8/programs/anvi-summarize) output where things should be most comprehensive, having some genes annotated with multiple functions and others with one function may over-split them (since in this scenario a gene with COGXXX and another with COGXXX;COGYYY would end up in different bins). Thus, by default, [anvi-display-functions](/help/8/programs/anvi-display-functions) will use the best hit for any gene that has multiple hits. But this behavior can be turned off the following way:
+
+
+anvi-display-functions -e [external-genomes](/help/8/artifacts/external-genomes) \
+ --annotation-source KOfam \
+ --profile-db KOFAM-PROFILE.db \
+ --aggregate-using-all-hits
+
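The difference between the two behaviors can be sketched in a few lines of Python. This is only an illustration and not anvi'o's actual implementation: the function `aggregation_label`, the gene names, and the choice of the lowest e-value as the "best" hit are all assumptions made for the example.

```python
# Hypothetical sketch: how a gene's hits could be turned into the label
# under which the gene is aggregated. Not taken from the anvi'o codebase.

def aggregation_label(hits, use_all_hits=False):
    """Return an aggregation label for one gene.

    `hits` is a list of (function_name, e_value) tuples. Treating the
    lowest e-value as the best hit is an assumption of this sketch.
    """
    if use_all_hits:
        # Keep every hit, so genes with different hit sets get different labels.
        return ";".join(name for name, _ in hits)
    best_name, _ = min(hits, key=lambda hit: hit[1])
    return best_name


# Two made-up genes: one with a single hit, one with two hits.
gene_hits = {
    "gene_1": [("COGXXX", 1e-50)],
    "gene_2": [("COGXXX", 1e-60), ("COGYYY", 1e-20)],
}

best_hit_labels = {g: aggregation_label(h) for g, h in gene_hits.items()}
all_hit_labels = {g: aggregation_label(h, use_all_hits=True) for g, h in gene_hits.items()}

print(best_hit_labels)  # both genes share the label COGXXX
print(all_hit_labels)   # gene_2 is split off as COGXXX;COGYYY
```

With best hits, both genes fall under the same function, while using all hits places them in two separate groups, which is the over-splitting described above.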
+
+### The min-occurrence limit
+
+You can choose to limit the number of functions to be considered to those that occur in more than a minimum number of genomes:
+
+
+anvi-display-functions -e [external-genomes](/help/8/artifacts/external-genomes) \
+ --annotation-source KOfam \
+ --profile-db KOFAM-PROFILE.db \
+ --min-occurrence 5
+
+
+Here the `--min-occurrence 5` parameter will exclude any function that occurs in fewer than 5 genomes in your collection.
+
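The filtering rule can be sketched as follows. This is a minimal illustration under stated assumptions rather than the program's real code: `filter_by_min_occurrence`, the genome names, and the input dictionary are hypothetical, and the threshold is treated as inclusive, as the sentence above describes.

```python
from collections import defaultdict

# Hypothetical sketch of a minimum-occurrence filter; names are illustrative.

def filter_by_min_occurrence(functions_per_genome, min_occurrence):
    """Keep functions that occur in at least `min_occurrence` genomes."""
    genomes_with_function = defaultdict(set)
    for genome, functions in functions_per_genome.items():
        for function in functions:
            genomes_with_function[function].add(genome)
    return {
        function: sorted(genomes)
        for function, genomes in genomes_with_function.items()
        if len(genomes) >= min_occurrence
    }


# Made-up input: which functions were annotated in which genome.
functions_per_genome = {
    "genome_A": {"K00001", "K00002"},
    "genome_B": {"K00001"},
    "genome_C": {"K00001", "K00002", "K00003"},
}

kept = filter_by_min_occurrence(functions_per_genome, 2)
print(sorted(kept))
```

Here `K00003` is dropped because it occurs in a single genome, while the other two functions pass the threshold of 2.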
+
+### A real-world example
+
+Assume we have a list of [external-genomes](/help/8/artifacts/external-genomes) that include three different species of *Bifidobacterium*. Running the following command,
+
+
+anvi-display-functions --external-genomes Bifidobacterium.txt \
+ --annotation-source COG20_FUNCTION \
+ --profile-db COG20-PROFILE.db \
+ --min-occurrence 3
+
+
+would produce the following display by default, where each layer is one of the genomes described in the [external-genomes](/help/8/artifacts/external-genomes) file, and each item is a unique function name that occurs in `COG20_FUNCTION` (which was obtained by running [anvi-run-ncbi-cogs](/help/8/programs/anvi-run-ncbi-cogs) on each [contigs-db](/help/8/artifacts/contigs-db) in the external genomes file) and that was found in at least three genomes:
+
+[![Example output](../../images/anvi-display-functions-01.png){:.center-img .width-50}](../../images/anvi-display-functions-01.png)
+
+The outermost layer shows the function names:
+
+[![Example output](../../images/anvi-display-functions-02.png){:.center-img .width-50}](../../images/anvi-display-functions-02.png)
+
+A quick prettification through the [interactive](/help/8/artifacts/interactive) interface leads to a cleaner display of the three distinct species in this group, and of the functions that are uniquely enriched in each of them:
+
+[![Example output](../../images/anvi-display-functions-03.png){:.center-img .width-80}](../../images/anvi-display-functions-03.png)
+
+Now the resulting [profile-db](/help/8/artifacts/profile-db) can be used by [anvi-interactive](/help/8/programs/anvi-interactive) to re-visualize these data, or can be shared with the community without sharing the underlying contigs databases.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-display-functions.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-display-functions) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-display-functions/network.json b/help/8/programs/anvi-display-functions/network.json
new file mode 100644
index 00000000..dc888403
--- /dev/null
+++ b/help/8/programs/anvi-display-functions/network.json
@@ -0,0 +1,108 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "interactive",
+ "name": "interactive",
+ "provided_by_anvio": true,
+ "type": "DISPLAY"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functional-enrichment-txt",
+ "name": "functional-enrichment-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions",
+ "name": "functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "internal-genomes",
+ "name": "internal-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-genomes",
+ "name": "external-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "groups-txt",
+ "name": "groups-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 7,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-display-functions",
+ "name": "anvi-display-functions",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 7,
+ "target": 0
+ },
+ {
+ "source": 7,
+ "target": 1
+ },
+ {
+ "target": 7,
+ "source": 2
+ },
+ {
+ "target": 7,
+ "source": 3
+ },
+ {
+ "target": 7,
+ "source": 4
+ },
+ {
+ "target": 7,
+ "source": 5
+ },
+ {
+ "target": 7,
+ "source": 6
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-display-metabolism/index.md b/help/8/programs/anvi-display-metabolism/index.md
new file mode 100644
index 00000000..5504c455
--- /dev/null
+++ b/help/8/programs/anvi-display-metabolism/index.md
@@ -0,0 +1,62 @@
+---
+layout: program
+title: anvi-display-metabolism
+excerpt: An anvi'o program. Start the anvi'o interactive interface for viewing KEGG metabolism data.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-display-metabolism
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Start the anvi'o interactive interface for viewing KEGG metabolism data.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [kegg-data](../../artifacts/kegg-data) [kegg-functions](../../artifacts/kegg-functions) [profile-db](../../artifacts/profile-db) [collection](../../artifacts/collection) [bin](../../artifacts/bin)
+
+
+## Can provide
+
+
+[interactive](../../artifacts/interactive)
+
+
+## Usage
+
+
+The purpose of [anvi-display-metabolism](/help/8/programs/anvi-display-metabolism) is to interactively view metabolism estimation data.
+
+This program internally uses [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) to obtain this data for the provided [contigs-db](/help/8/artifacts/contigs-db).
+
+It is still under development.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-display-metabolism.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-display-metabolism) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-display-metabolism/network.json b/help/8/programs/anvi-display-metabolism/network.json
new file mode 100644
index 00000000..35bf8569
--- /dev/null
+++ b/help/8/programs/anvi-display-metabolism/network.json
@@ -0,0 +1,108 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "interactive",
+ "name": "interactive",
+ "provided_by_anvio": true,
+ "type": "DISPLAY"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "kegg-data",
+ "name": "kegg-data",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "kegg-functions",
+ "name": "kegg-functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 7,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-display-metabolism",
+ "name": "anvi-display-metabolism",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 7,
+ "target": 0
+ },
+ {
+ "target": 7,
+ "source": 1
+ },
+ {
+ "target": 7,
+ "source": 2
+ },
+ {
+ "target": 7,
+ "source": 3
+ },
+ {
+ "target": 7,
+ "source": 4
+ },
+ {
+ "target": 7,
+ "source": 5
+ },
+ {
+ "target": 7,
+ "source": 6
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-display-pan/index.md b/help/8/programs/anvi-display-pan/index.md
new file mode 100644
index 00000000..f5560a29
--- /dev/null
+++ b/help/8/programs/anvi-display-pan/index.md
@@ -0,0 +1,99 @@
+---
+layout: program
+title: anvi-display-pan
+excerpt: An anvi'o program. Start an anvi'o server to display a pan-genome.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-display-pan
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Start an anvi'o server to display a pan-genome.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[pan-db](../../artifacts/pan-db) [genomes-storage-db](../../artifacts/genomes-storage-db)
+
+
+## Can provide
+
+
+[collection](../../artifacts/collection) [bin](../../artifacts/bin) [interactive](../../artifacts/interactive) [svg](../../artifacts/svg) [gene-cluster-inspection](../../artifacts/gene-cluster-inspection)
+
+
+## Usage
+
+
+This program **displays the contents of a [pan-db](/help/8/artifacts/pan-db) in the [anvi'o interactive interface](http://merenlab.org/2016/02/27/the-anvio-interactive-interface//#using-the-anvio-interactive-interface), much like [anvi-interactive](/help/8/programs/anvi-interactive).**
+
+As you can see in the [pangenomics tutorial](http://merenlab.org/2016/11/08/pangenomics-v2/#displaying-the-pan-genome), this opens a window of the interactive interface where each item is a gene cluster and each layer represents one of your genomes.
+
+### A general run
+
+You can run it with only two parameters:
+
+
+anvi-display-pan -p [pan-db](/help/8/artifacts/pan-db) \
+ -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db)
+
+
+There are several default layer orders to choose from, including organizing based on gene cluster presence/absence or gene cluster frequency. These will both group your core gene clusters and singletons separately.
+
+Beyond that, there are many different settings you can change in the side panel of the interface and you can import various additional data (primarily with the program [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data)). Once you're happy with the data displayed in the interface (and the prettiness of that data), you can save those preferences in a [state](/help/8/artifacts/state).
+
+### I want MORE data displayed
+
+There are several other data types you can additionally choose to display in this program. Namely, you can add:
+
+- a title (very fancy I know) using `--title`
+- a NEWICK formatted tree (or import it as a [misc-data-items-order-txt](/help/8/artifacts/misc-data-items-order-txt) with [anvi-import-items-order](/help/8/programs/anvi-import-items-order) or as a [misc-data-layer-orders](/help/8/artifacts/misc-data-layer-orders) with [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data)).
+- view data in a tab-delimited file
+- an additional view (provide this in a tab-delimited matrix where each column corresponds to a sample and each row corresponds to a gene cluster)
+- an additional layer in the form of a [misc-data-layers-txt](/help/8/artifacts/misc-data-layers-txt) (or import it into your [pan-db](/help/8/artifacts/pan-db) with [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data))
+
+### How to minimize mouse clicks
+
+Wondering how to autoload specific aspects of the interface? You're in the right place.
+
+You have the option to specify quite a few aspects of the interface through the parameters to save you those sweet mouse clicks.
+
+- You can specify which view to start the interface with. Check which views are available with `--list-views`.
+- You can load a specific [state](/help/8/artifacts/state) (either a previous state or a state that you've imported with [anvi-import-state](/help/8/programs/anvi-import-state)). Check which states are available with the flag `--list-states`.
+- You can also load a specific [collection](/help/8/artifacts/collection) with `--collection-autoload`. To check which collections are available, use `--list-collections`.
+
+### Other parameters
+
+You can also skip processes like initializing functions or automatically ordering your items to save time, as well as configure the server to your heart's content.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-display-pan.md) to update this information.
+
+
+## Additional Resources
+
+
+* [See this program in action on the pangenomics tutorial](http://merenlab.org/2016/11/08/pangenomics-v2/#displaying-the-pan-genome)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-display-pan) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-display-pan/network.json b/help/8/programs/anvi-display-pan/network.json
new file mode 100644
index 00000000..82865026
--- /dev/null
+++ b/help/8/programs/anvi-display-pan/network.json
@@ -0,0 +1,108 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "interactive",
+ "name": "interactive",
+ "provided_by_anvio": true,
+ "type": "DISPLAY"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "svg",
+ "name": "svg",
+ "provided_by_anvio": true,
+ "type": "SVG"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "gene-cluster-inspection",
+ "name": "gene-cluster-inspection",
+ "provided_by_anvio": true,
+ "type": "DISPLAY"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 7,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-display-pan",
+ "name": "anvi-display-pan",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 7,
+ "target": 0
+ },
+ {
+ "source": 7,
+ "target": 1
+ },
+ {
+ "source": 7,
+ "target": 2
+ },
+ {
+ "source": 7,
+ "target": 3
+ },
+ {
+ "source": 7,
+ "target": 4
+ },
+ {
+ "target": 7,
+ "source": 5
+ },
+ {
+ "target": 7,
+ "source": 6
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-display-structure/index.md b/help/8/programs/anvi-display-structure/index.md
new file mode 100644
index 00000000..d4bfb384
--- /dev/null
+++ b/help/8/programs/anvi-display-structure/index.md
@@ -0,0 +1,124 @@
+---
+layout: program
+title: anvi-display-structure
+excerpt: An anvi'o program. Interactively visualize sequence variants on protein structures.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-display-structure
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Interactively visualize sequence variants on protein structures.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[structure-db](../../artifacts/structure-db) [variability-profile-txt](../../artifacts/variability-profile-txt) [contigs-db](../../artifacts/contigs-db) [profile-db](../../artifacts/profile-db) [splits-txt](../../artifacts/splits-txt)
+
+
+## Can provide
+
+
+[interactive](../../artifacts/interactive)
+
+
+## Usage
+
+
+
+This program opens an interactive interface to explore single amino acid variants (SAAVs) and single codon variants (SCVs) in the context of predicted tertiary protein structures and binding sites. There are many example uses [here](http://merenlab.org/2018/09/04/getting-started-with-anvio-structure/#display-metagenomic-sequence-variants-directly-on-predicted-structures) and you can work through an example as part of [the infant gut tutorial](http://merenlab.org/tutorials/infant-gut/#chapter-vii-from-single-amino-acid-variants-to-protein-structures) as well. This is an integral program of anvi'o structure, which you can learn more about [here](https://merenlab.org/software/anvio-structure/).
+
+
+In short, this program enables users to explore sequence variation in the context of 3D protein structure, which reveals insight that cannot be learned from purely sequence-based approaches.
+
+
+### Before running
+
+To run this program, you'll need to have created a [structure-db](/help/8/artifacts/structure-db) which can be easily done with a [contigs-db](/help/8/artifacts/contigs-db) and the program [anvi-gen-structure-database](/help/8/programs/anvi-gen-structure-database).
+
+
+You'll also need a [profile-db](/help/8/artifacts/profile-db) that was created using [anvi-profile](/help/8/programs/anvi-profile)'s flag `--profile-SCVs`, which means that single codon variants (SCVs) have been profiled. Very sorry if this forces you to re-profile, but as of v6.2, this is now a very expedient process.
+
+
+### Basic Run
+
+There are two ways to provide the variability information to this program.
+
+The first is to provide a [contigs-db](/help/8/artifacts/contigs-db) and [profile-db](/help/8/artifacts/profile-db) pair, and let this program calculate SAAVs and SCVs as they are requested by the interface.
+
+
+
+anvi-display-structure -s [structure-db](/help/8/artifacts/structure-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db)
+
+
+The second is to use [anvi-gen-variability-profile](/help/8/programs/anvi-gen-variability-profile) to create a [variability-profile-txt](/help/8/artifacts/variability-profile-txt). This way, you pre-load all of the variability data and don't have to wait for [anvi-display-structure](/help/8/programs/anvi-display-structure) to calculate variability on-the-fly. This option is probably most convenient in instances where you have already generated a [variability-profile-txt](/help/8/artifacts/variability-profile-txt) for other reasons. If you fall into this camp, you can run [anvi-display-structure](/help/8/programs/anvi-display-structure) like so:
+
+
+
+anvi-display-structure -s [structure-db](/help/8/artifacts/structure-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -v [variability-profile-txt](/help/8/artifacts/variability-profile-txt)
+
+
+{:.notice}
+You still must provide the [contigs-db](/help/8/artifacts/contigs-db) used to generate the [variability-profile-txt](/help/8/artifacts/variability-profile-txt), since it contains other necessary information such as functional annotations and ligand binding predictions. You may optionally provide a [profile-db](/help/8/artifacts/profile-db) if custom sample grouping is important to you.
+
+{:.notice}
+If you are _only_ interested in genes that have predicted structures, you may want to run [anvi-gen-variability-profile](/help/8/programs/anvi-gen-variability-profile) with the flag `--only-if-structure`.
+
+### Refining your search
+
+You have several options to refine what proteins and variants you're looking at:
+
+- Provide a list of gene caller IDs to only display specific genes (this can be provided either directly as a parameter or as a file with one gene caller ID per line)
+- Specify the minimum departure from the consensus sequence. This is a number from 0 to 1 that sets the threshold for a variable position to be displayed. For example, if this is set to 0.2, then all SAAVs and SCVs where less than 20 percent of the reads vary from the consensus sequence will not be displayed.
+- Specify samples of interest. Those in your [profile-db](/help/8/artifacts/profile-db) or [variability-profile-txt](/help/8/artifacts/variability-profile-txt) that are not in the samples of interest will be filtered out.
+
+If you're choosing to have [anvi-display-structure](/help/8/programs/anvi-display-structure) calculate variability on-the-fly, you can speed things up by choosing to _only_ calculate SAAVs or _only_ calculate SCVs.
+
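The minimum departure from consensus filter described above can be sketched like this. It is an illustration under assumptions, not the program's code: `filter_by_departure` and the `position` key are hypothetical, and the `departure_from_consensus` key is used here simply as a per-variant value between 0 and 1.

```python
# Hypothetical sketch of filtering variants by departure from consensus.

def filter_by_departure(variants, min_departure_from_consensus):
    """Keep variants whose departure from consensus meets the threshold."""
    return [
        variant
        for variant in variants
        if variant["departure_from_consensus"] >= min_departure_from_consensus
    ]


# Made-up variants; a departure of 0.05 means 5 percent of the reads
# differ from the consensus at that position.
variants = [
    {"position": 12, "departure_from_consensus": 0.05},
    {"position": 48, "departure_from_consensus": 0.20},
    {"position": 97, "departure_from_consensus": 0.45},
]

displayed = filter_by_departure(variants, 0.2)
print([variant["position"] for variant in displayed])
```

With a threshold of 0.2, the variant at position 12 is filtered out because only 5 percent of its reads depart from the consensus.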
+
+### Other parameters
+
+Power users can also change the server configuration (e.g., set the IP address, port number, browser path, or server password).
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-display-structure.md) to update this information.
+
+
+## Additional Resources
+
+
+* [The overview page from the release](http://merenlab.org/software/anvio-structure/)
+
+* [The section of the Infant Gut Tutorial focused on anvi-display-structure](http://merenlab.org/tutorials/infant-gut/#chapter-vii-from-single-amino-acid-variants-to-protein-structures)
+
+* [Integrating sequence variants and predicted protein structures](http://merenlab.org/2018/09/04/getting-started-with-anvio-structure/)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-display-structure) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-display-structure/network.json b/help/8/programs/anvi-display-structure/network.json
new file mode 100644
index 00000000..82c7ff4a
--- /dev/null
+++ b/help/8/programs/anvi-display-structure/network.json
@@ -0,0 +1,95 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "interactive",
+ "name": "interactive",
+ "provided_by_anvio": true,
+ "type": "DISPLAY"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "structure-db",
+ "name": "structure-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "variability-profile-txt",
+ "name": "variability-profile-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "splits-txt",
+ "name": "splits-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 6,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-display-structure",
+ "name": "anvi-display-structure",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 6,
+ "target": 0
+ },
+ {
+ "target": 6,
+ "source": 1
+ },
+ {
+ "target": 6,
+ "source": 2
+ },
+ {
+ "target": 6,
+ "source": 3
+ },
+ {
+ "target": 6,
+ "source": 4
+ },
+ {
+ "target": 6,
+ "source": 5
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-estimate-genome-completeness/index.md b/help/8/programs/anvi-estimate-genome-completeness/index.md
new file mode 100644
index 00000000..25d22e9b
--- /dev/null
+++ b/help/8/programs/anvi-estimate-genome-completeness/index.md
@@ -0,0 +1,102 @@
+---
+layout: program
+title: anvi-estimate-genome-completeness
+excerpt: An anvi'o program. Estimate completion and redundancy using domain-specific single-copy core genes.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-estimate-genome-completeness
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Estimate completion and redundancy using domain-specific single-copy core genes.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [profile-db](../../artifacts/profile-db) [external-genomes](../../artifacts/external-genomes) [collection](../../artifacts/collection)
+
+
+## Can provide
+
+
+[completion](../../artifacts/completion)
+
+
+## Usage
+
+
+This program estimates the completeness and redundancy of genomes provided to it, based on domain-level single-copy core genes.
+
+{:.notice}
+Wondering what single-copy core genes anvi'o uses? Check out [hmm-source](/help/8/artifacts/hmm-source). It uses the tables populated when you ran [anvi-run-hmms](/help/8/programs/anvi-run-hmms) on your [contigs-db](/help/8/artifacts/contigs-db).
+
+Genomes provided to this program must be contained in either a [bin](/help/8/artifacts/bin) (within a [collection](/help/8/artifacts/collection)) or a [contigs-db](/help/8/artifacts/contigs-db) (which can be provided alone or as part of an [external-genomes](/help/8/artifacts/external-genomes)).
+
+### Running on contigs databases
+
+For example, calling
+
+
+anvi-estimate-genome-completeness -c [contigs-db](/help/8/artifacts/contigs-db)
+
+
+will output to the terminal the completion and redundancy of the single-copy core genes in your [contigs-db](/help/8/artifacts/contigs-db), assuming that all of its contigs represent a single genome. To output this information to a file, you can add the flag `-o` and provide an output path.
+
+To get this information for several contigs databases at once, you can provide them as an [external-genomes](/help/8/artifacts/external-genomes), like so:
+
+
+anvi-estimate-genome-completeness -e [external-genomes](/help/8/artifacts/external-genomes) \
+ -o completion.txt
+
+
+### Running on bins
+
+To get this data for a series of bins, just provide a [profile-db](/help/8/artifacts/profile-db) and [collection](/help/8/artifacts/collection) along with the [contigs-db](/help/8/artifacts/contigs-db).
+
+
+anvi-estimate-genome-completeness -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -C [collection](/help/8/artifacts/collection)
+
+
+To see what collections are available in your profile database, call
+
+
+anvi-estimate-genome-completeness -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ --list-collections
+
+
+or run [anvi-show-collections-and-bins](/help/8/programs/anvi-show-collections-and-bins) for a more comprehensive overview.
+
+If you're looking for a more comprehensive overview of your entire collection and its contents, the completion and redundancy statistics for your bins are also included when you run [anvi-summarize](/help/8/programs/anvi-summarize).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-estimate-genome-completeness.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-estimate-genome-completeness) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-estimate-genome-completeness/network.json b/help/8/programs/anvi-estimate-genome-completeness/network.json
new file mode 100644
index 00000000..f4c53d97
--- /dev/null
+++ b/help/8/programs/anvi-estimate-genome-completeness/network.json
@@ -0,0 +1,82 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "completion",
+ "name": "completion",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-genomes",
+ "name": "external-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-estimate-genome-completeness",
+ "name": "anvi-estimate-genome-completeness",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 5,
+ "target": 0
+ },
+ {
+ "target": 5,
+ "source": 1
+ },
+ {
+ "target": 5,
+ "source": 2
+ },
+ {
+ "target": 5,
+ "source": 3
+ },
+ {
+ "target": 5,
+ "source": 4
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-estimate-metabolism/index.md b/help/8/programs/anvi-estimate-metabolism/index.md
new file mode 100644
index 00000000..b97f828f
--- /dev/null
+++ b/help/8/programs/anvi-estimate-metabolism/index.md
@@ -0,0 +1,926 @@
+---
+layout: program
+title: anvi-estimate-metabolism
+excerpt: An anvi'o program. Reconstructs metabolic pathways and estimates pathway completeness for a given set of contigs.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-estimate-metabolism
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Reconstructs metabolic pathways and estimates pathway completeness for a given set of contigs.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [kegg-data](../../artifacts/kegg-data) [kegg-functions](../../artifacts/kegg-functions) [profile-db](../../artifacts/profile-db) [collection](../../artifacts/collection) [bin](../../artifacts/bin) [external-genomes](../../artifacts/external-genomes) [internal-genomes](../../artifacts/internal-genomes) [metagenomes](../../artifacts/metagenomes) [user-modules-data](../../artifacts/user-modules-data) [enzymes-txt](../../artifacts/enzymes-txt)
+
+
+## Can provide
+
+
+[kegg-metabolism](../../artifacts/kegg-metabolism) [user-metabolism](../../artifacts/user-metabolism)
+
+
+## Usage
+
+
+[anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) predicts the metabolic capabilities of organisms based on their genetic content. It relies upon [kegg-functions](/help/8/artifacts/kegg-functions) and metabolism information from [KEGG](https://www.genome.jp/kegg/) ([kegg-data](/help/8/artifacts/kegg-data)), which is stored in a [modules-db](/help/8/artifacts/modules-db). It can also use user-defined metabolic pathways, as described in [user-modules-data](/help/8/artifacts/user-modules-data).
+
+The metabolic pathways that this program considers (by default) are those defined by KEGG Orthologs (KOs) in the [KEGG MODULE resource](https://www.genome.jp/kegg/module.html). Each KO represents a gene function, and a KEGG module is a set of KOs that collectively carry out the steps in a metabolic pathway.
+
+Alternatively or additionally, you can define your own set of metabolic modules and estimate their completeness with this program. Detailed instructions for doing this can be found by looking at the [user-modules-data](/help/8/artifacts/user-modules-data) and [anvi-setup-user-modules](/help/8/programs/anvi-setup-user-modules) pages.
+
+Given a properly annotated [contigs-db](/help/8/artifacts/contigs-db), this program determines which enzymes are present and uses these functions to compute the completeness of each metabolic module. There are currently two strategies for estimating module completeness - pathwise and stepwise - which are discussed in the technical details section on this page. The output of this program is one or more tabular text files - see [kegg-metabolism](/help/8/artifacts/kegg-metabolism) for the output description and examples.
+
+For a practical tutorial on how to use this program, visit [this link](https://merenlab.org/tutorials/infant-gut/#chapter-v-metabolism-prediction). A more abstract discussion of available parameters, as well as technical details about how the metabolism estimation is done, can be found below.
+
+## What metabolism data can I use?
+
+You have three options when it comes to estimating metabolism.
+
+1. KEGG only (this is the default). In this case, estimation will be run on modules from the KEGG MODULES database, which you must set up on your computer using [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data). If you have a default setup of KEGG, you need not provide any parameters to choose this option. However, if you have your KEGG data in a non-default location on your computer, you will have to use the `--kegg-data-dir` parameter to point out its location.
+2. KEGG + USER data. In this case, we estimate on KEGG modules as in (1), but _also_ on user-defined metabolic modules that you set up with [anvi-setup-user-modules](/help/8/programs/anvi-setup-user-modules) and provide to this program with the `--user-modules` parameter.
+3. USER data only. You can elect to skip estimation on KEGG modules and _only_ run on your own data by providing both the `--user-modules` and `--only-user-modules` parameters.
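+
The three options above correspond to three combinations of flags. As a minimal sketch, the helper below assembles the argument list for each case. The function name `metabolism_args` and the example paths are hypothetical; the only flags it uses are the ones named in the list above.

```python
def metabolism_args(contigs_db, kegg_data_dir=None, user_modules=None, only_user=False):
    """Assemble an anvi-estimate-metabolism argument list (hypothetical helper)."""
    if only_user and user_modules is None:
        # option 3 needs both --user-modules and --only-user-modules
        raise ValueError("--only-user-modules requires --user-modules")
    args = ["anvi-estimate-metabolism", "-c", contigs_db]
    if kegg_data_dir is not None:
        # only needed when the KEGG data is in a non-default location
        args += ["--kegg-data-dir", kegg_data_dir]
    if user_modules is not None:
        args += ["--user-modules", user_modules]
    if only_user:
        args.append("--only-user-modules")
    return args


kegg_only = metabolism_args("CONTIGS.db")
kegg_and_user = metabolism_args("CONTIGS.db", user_modules="/path/to/USER/directory")
user_only = metabolism_args("CONTIGS.db", user_modules="/path/to/USER/directory", only_user=True)
```

Each returned list can be joined with spaces to get the command line, or passed to `subprocess.run` on a machine where anvi'o is installed.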
+
+## Prerequisites to using this program
+
+Metabolism estimation relies on gene annotations from the functional annotation source 'KOfam', also referred to as [kegg-functions](/help/8/artifacts/kegg-functions). Therefore, for this to work, you need to have annotated your [contigs-db](/help/8/artifacts/contigs-db) with hits to the KEGG KOfam database by running [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) prior to using this program, unless you are using the `--only-user-modules` option to ONLY estimate on user-defined metabolic modules.
+
+Both [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) and [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) rely on the [kegg-data](/help/8/artifacts/kegg-data) provided by [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data), so if you do not already have that data on your computer, [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data) needs to be run first. To summarize, these are the steps that need to be done before you can use [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism):
+
+1. Run [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data) to get data from KEGG onto your computer. This step only needs to be done once.
+2. [If not using `--only-user-modules`] Run [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) to annotate your [contigs-db](/help/8/artifacts/contigs-db) with [kegg-functions](/help/8/artifacts/kegg-functions). This program must be run on each contigs database that you want to estimate metabolism for.
+
+If you want to estimate for your own metabolism data, then you have a couple of extra steps to go through:
+
+3. Define your own metabolic modules by following the formatting guidelines described [here](https://merenlab.org/software/anvio/help/main/programs/anvi-setup-user-modules/#how-do-i-format-the-module-files) and [here](https://merenlab.org/software/anvio/help/main/artifacts/user-modules-data/#a-step-by-step-guide-to-creating), and then run [anvi-setup-user-modules](/help/8/programs/anvi-setup-user-modules) to parse them into a [modules-db](/help/8/artifacts/modules-db),
+4. Annotate your [contigs-db](/help/8/artifacts/contigs-db) with the functional annotation sources that are required for your module definitions. This may require running a few different programs. For instance, if your modules are defined in terms of NCBI COGs (i.e., the `COG20_FUNCTION` annotation source), you will need to run [anvi-run-ncbi-cogs](/help/8/programs/anvi-run-ncbi-cogs). If you are using a set of custom HMMs, you will need to run [anvi-run-hmms](/help/8/programs/anvi-run-hmms) on that set using the `--add-to-functions-table` parameter. If you already have annotations from one or more of these sources, you could also import them into the contigs database using the program [anvi-import-functions](/help/8/programs/anvi-import-functions).
+
+## Running metabolism estimation
+
+You can run metabolism estimation on any set of annotated sequences, but these sequences typically fall under one of the following categories:
+
+- Single genomes, also referred to as [external-genomes](/help/8/artifacts/external-genomes). These can be isolate genomes or metagenome-assembled genomes, for example. Each one is described in its own individual [contigs-db](/help/8/artifacts/contigs-db).
+- Bins, also referred to as [internal-genomes](/help/8/artifacts/internal-genomes). These often represent metagenome-assembled genomes, but generally can be any subset of sequences within a database. A single [contigs-db](/help/8/artifacts/contigs-db) can contain multiple bins.
+- Assembled, unbinned metagenomes. There is no distinction between sequences that belong to different microbial populations in the [contigs-db](/help/8/artifacts/contigs-db) for an unbinned metagenome.
+
+As you can see, [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) takes one or more contigs database(s) as input, but the information that is taken from those databases depends on the context (i.e., genome, metagenome, bin). In the case of internal genomes (or bins), it is possible to have multiple inputs but only one input contigs database. So for clarity's sake, we sometimes refer to the inputs as 'samples' in the descriptions below. If you are getting confused, just try to remember that a 'sample' can be a genome, a metagenome, or a bin.
+
+If you don't have any sequences, there is an additional input option for you:
+- A list of enzymes, as described in an [enzymes-txt](/help/8/artifacts/enzymes-txt) file. For the purposes of metabolism estimation, the enzymes in this file will be interpreted as all coming from the same 'genome'.
+
+Different input contexts can require different parameters or additional inputs. The following sections describe what is necessary for each input type.
+
+
+### Estimation for a single genome
+
+The most basic use-case for this program is when you have one contigs database describing a single genome. Since all of the sequences in this database belong to the same genome, all of the gene annotations will be used for metabolism estimation.
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db)
+
+
+### Estimation for bins in a metagenome
+
+You can estimate metabolism for different subsets of the sequences in your contigs database if you first [bin](/help/8/artifacts/bin) them and save them as a [collection](/help/8/artifacts/collection). For each bin, only the gene annotations from its subset of sequences will contribute to the module completeness scores.
+
+You can estimate metabolism for every individual bin in a collection by providing the profile database that describes the collection as well as the collection name:
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) -p [profile-db](/help/8/artifacts/profile-db) -C [collection](/help/8/artifacts/collection)
+
+
+The metabolism estimation results for each bin will be printed to the same output file(s). The `bin_name` column in long-format output will distinguish between results from different bins.
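+
Since every bin ends up in the same file, a common next step is to pull out the rows for one bin. The sketch below does this with Python's standard `csv` module on a made-up excerpt of a long-format file. The values are invented for illustration, real output has more columns, and the function name `rows_for_bin` is hypothetical.

```python
import csv
import io

# Made-up excerpt of a tab-delimited long-format "modules" file.
EXAMPLE = (
    "module\tbin_name\tpathwise_module_completeness\n"
    "M00001\tbin_1\t1.0\n"
    "M00001\tbin_3\t0.5\n"
    "M00002\tbin_1\t0.75\n"
)


def rows_for_bin(handle, bin_name):
    """Return the rows of a long-format table that belong to one bin."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader if row["bin_name"] == bin_name]


bin_1_rows = rows_for_bin(io.StringIO(EXAMPLE), "bin_1")
bin_1_modules = [row["module"] for row in bin_1_rows]
```

For a real file, pass an open file handle instead of the `io.StringIO` object.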
+
+If you only want estimation results for a single bin, you can instead provide a specific bin name from that collection using the `-b` parameter:
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) -p [profile-db](/help/8/artifacts/profile-db) -C [collection](/help/8/artifacts/collection) -b [bin](/help/8/artifacts/bin)
+
+
+Or, to estimate on a subset of bins in the collection, you can provide a text file containing the specific list of bins that you are interested in:
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) -p [profile-db](/help/8/artifacts/profile-db) -C [collection](/help/8/artifacts/collection) -B bin_ids.txt
+
+
+Each line in the `bin_ids.txt` file should be a bin name from the collection (there is no header line). Here is an example file containing three bin names:
+
+```
+bin_1
+bin_3
+bin_5
+```
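+
If the bin names already live in a script, the file can be generated rather than typed by hand. This is a minimal sketch that follows the format described above (one name per line, no header); the helper name `write_bin_ids` is hypothetical.

```python
from pathlib import Path


def write_bin_ids(bin_names, path="bin_ids.txt"):
    """Write one bin name per line, with no header line."""
    Path(path).write_text("".join(f"{name}\n" for name in bin_names))
    return path
```

Calling `write_bin_ids(["bin_1", "bin_3", "bin_5"])` reproduces the example file shown above in the current working directory.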
+
+### Estimation for a metagenome
+
+If you have an unbinned metagenome assembly, you can estimate metabolism for it using `--metagenome-mode`. In this case, since there is no way to determine which contigs belong to which microbial populations in the sample, estimation will be done on a per-contig basis; that is, for each contig, only the genes present on that contig will be used to determine pathway completeness within the contig.
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) --metagenome-mode
+
+
+{:.notice}
+In metagenome mode, this program will estimate metabolism for each contig in the metagenome separately. This will tend to underestimate module completeness because it is likely that many modules will be broken up across multiple contigs belonging to the same population. If you prefer to instead treat all enzyme annotations in the metagenome as belonging to one collective genome, you can do so by simply leaving out the `--metagenome-mode` flag (to effectively pretend that you are doing estimation for a single genome, although in your heart you will know that your contigs database really contains a metagenome). Please note that this will result in the opposite tendency to overestimate module completeness (as the enzymes will in reality be coming from multiple different populations), and there will be a lot of redundancy. We are working on improving our estimation algorithm for metagenome mode. In the meantime, if you are worried about the misleading results from either of these situations, we suggest binning your metagenomes first and running estimation for the bins as described above.
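+
The two opposite biases described in this note can be seen in a toy example. The sketch below is not the scoring that anvi'o performs: "completeness" here is simply the fraction of a module's enzymes that were found, and the module definition, enzyme names, and contig names are all hypothetical.

```python
# Hypothetical module that needs four enzymes.
MODULE_ENZYMES = {"enzyme_1", "enzyme_2", "enzyme_3", "enzyme_4"}

# Hypothetical annotations: contig name -> enzymes annotated on that contig.
ANNOTATIONS = {
    "contig_A": {"enzyme_1", "enzyme_2"},  # from population 1
    "contig_B": {"enzyme_3"},              # from population 1
    "contig_C": {"enzyme_4"},              # from population 2
}


def fraction_present(found, required=MODULE_ENZYMES):
    """Toy completeness: share of the required enzymes that were found."""
    return len(found & required) / len(required)


# With --metagenome-mode: every contig is scored on its own.
per_contig = {name: fraction_present(found) for name, found in ANNOTATIONS.items()}

# Without --metagenome-mode: all annotations are pooled into one 'genome'.
pooled = fraction_present(set().union(*ANNOTATIONS.values()))
```

In this toy data, population 1 carries three of the four enzymes (0.75). The best per-contig score is 0.5, which underestimates it, while the pooled score is 1.0, which overestimates it.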
+
+### Estimation for a set of enzymes
+
+Suppose you have a list of enzymes. This could be an entirely theoretical list, or they could come from some annotation data that you got outside of anvi'o - regardless of where you came up with this set, you can figure out what metabolic pathways these enzymes contribute to. All you have to do is format that list as an [enzymes-txt](/help/8/artifacts/enzymes-txt) file, and give that input file to this program, like so:
+
+
+anvi-estimate-metabolism --enzymes-txt [enzymes-txt](/help/8/artifacts/enzymes-txt)
+
+
+The program will pretend all of these enzymes are coming from one theoretical 'genome' (though the reality depends on how you defined or obtained the set), so the completion estimates for each metabolic pathway will consider all enzymes in the file. If you want to instead break up your set of enzymes across multiple 'genomes', then you will have to make multiple different input files and run this program on each one.
+
+
+## MULTI-MODE: Running metabolism estimation on multiple contigs databases
+
+If you have a set of contigs databases of the same type (i.e., all of them are single genomes or all are binned metagenomes), you can analyze them all at once. What you need to do is put the relevant information for each [contigs-db](/help/8/artifacts/contigs-db) into a text file and pass that text file to [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism). The program will then run estimation individually on each contigs database in the file. The estimation results for each database will be aggregated and printed to the same output file(s).
+
+One advantage that multi-mode unlocks is the ability to generate matrix-formatted output, which is convenient for clustering or visualizing the metabolic potential of multiple samples. See the [Output options](#output-options) section below for more details.
+
+### Estimation for multiple single genomes
+
+Multiple single genomes (also known as [external-genomes](/help/8/artifacts/external-genomes)) can be analyzed with the same command by providing an external genomes file to [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism). To see the required format for the external genomes file, see [external-genomes](/help/8/artifacts/external-genomes).
+
+
+anvi-estimate-metabolism -e external-genomes.txt
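+
The external genomes file can also be written from a script. The sketch below assumes the tab-delimited, two-column layout (`name` and `contigs_db_path`) documented on the [external-genomes](/help/8/artifacts/external-genomes) page; check that page before relying on it. The helper name and the example genome names and paths are hypothetical.

```python
import csv


def write_external_genomes(genomes, path="external-genomes.txt"):
    """Write a two-column file; `genomes` maps genome name -> contigs database path."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        # assumed header names, taken from the external-genomes artifact page
        writer.writerow(["name", "contigs_db_path"])
        for name, db_path in genomes.items():
            writer.writerow([name, db_path])
    return path
```

For example, `write_external_genomes({"genome_1": "/path/to/genome_1.db", "genome_2": "/path/to/genome_2.db"})` creates a file that can be passed to the `-e` parameter.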
+
+
+### Estimation for multiple bins
+
+If you have multiple bins (also known as [internal-genomes](/help/8/artifacts/internal-genomes)), they can be analyzed with the same command by providing an internal genomes file to [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism). The bins in this file can be from the same collection, from different collections, or even from different metagenomes. To see the required format for the internal genomes file, see [internal-genomes](/help/8/artifacts/internal-genomes).
+
+
+anvi-estimate-metabolism -i internal-genomes.txt
+
+
+### Estimation for multiple metagenomes
+
+Multiple metagenomes can be analyzed with the same command by providing a metagenomes input file. Metagenome mode will be used to analyze each contigs database in the file. To see the required format for the metagenomes file, see [metagenomes](/help/8/artifacts/metagenomes).
+
+
+anvi-estimate-metabolism -M metagenomes.txt
+
+
+## Adjustable Parameters
+
+There are many ways to alter the behavior of this program to fit your needs. You can find some commonly adjusted parameters below. For a full list of parameters, check the program help (`-h`) output.
+
+### Changing the module completion threshold
+
+As explained in the [technical details section](#how-is-the-module-completeness-score-calculated) below, module completeness is computed as the percentage of steps in the metabolic pathway that are 'present' based on the annotated enzymes in the contigs database. If this completeness is larger than a certain percentage, then the entire module is considered to be 'complete' in the sample and the corresponding row in the long-format modules mode output file will have 'True' under the `module_is_complete` column. By default, the module completion threshold is 0.75, or 75%.
+
+Changing this parameter _usually_ doesn't have any effect other than changing the proportions of 'True' and 'False' values in the `module_is_complete` column of long-format modules mode output (or the proportion of 1s and 0s in the module presence-absence matrix for `--matrix-format` output). It does _not_ alter completeness scores. It also does not affect which modules are printed to the output file, unless you use the `--only-complete` flag (described in a later section). Therefore, the purpose of changing this threshold is usually so that you can filter the output later somehow (i.e., by searching for 'True' values in the long-format output).
+
+The one exception is when `--add-copy-number` is used. We use the module completeness threshold to determine pathwise copy number of a module, which is based on the number of complete copies of paths through a module. So if you change this threshold, you can expect to see some differences in pathwise copy number values (which are found in certain long-format and matrix output files).
+
+In this example, we change the threshold to 50 percent.
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) --module-completion-threshold 0.5
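+
The effect of the threshold can be sketched in a few lines. The completeness scores below are made up, and treating a score exactly equal to the threshold as 'complete' is an assumption of this sketch rather than something stated on this page.

```python
def module_is_complete(completeness, threshold=0.75):
    """Flag a module as complete (inclusive comparison is an assumption)."""
    return completeness >= threshold


# Made-up completeness scores for three hypothetical modules.
scores = {"module_1": 0.8, "module_2": 0.6, "module_3": 0.4}

default_flags = {name: module_is_complete(score) for name, score in scores.items()}
relaxed_flags = {name: module_is_complete(score, threshold=0.5) for name, score in scores.items()}
```

Lowering the threshold from 0.75 to 0.5 flips `module_2` from 'False' to 'True', while the scores themselves stay the same.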
+
+
+### Working with a non-default KEGG data directory
+
+If you have previously annotated your contigs databases using a non-default KEGG data directory with `--kegg-data-dir` (see [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams)), or you have moved the KEGG data directory that you wish to use to a non-default location, then you will need to specify where to find the KEGG data so that this program can use the right one. In that case, this is how you do it:
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) --kegg-data-dir /path/to/directory/KEGG
+
+
+### Working with user-defined metabolism data
+
+If you have defined your own set of metabolic modules and generated a [modules-db](/help/8/artifacts/modules-db) for them using [anvi-setup-user-modules](/help/8/programs/anvi-setup-user-modules), you can estimate the completeness of these pathways (in addition to the KEGG modules) by providing the path to the directory containing this data:
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) --user-modules /path/to/USER/directory
+
+
+The `--user-modules` parameter can be used in conjunction with the `--kegg-data-dir` parameter to control which KEGG data is being used at the same time.
+
+### Skipping KEGG data (ie, only working with user-defined metabolism data)
+
+If you wish to only estimate for your own metabolic modules, you can skip estimating for KEGG modules by providing the `--only-user-modules` flag. The nice thing about doing this is that you can skip running [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) on your databases (which will save you lots of time and computational resources).
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) --user-modules /path/to/USER/directory --only-user-modules
+
+
+### Including KEGG Orthologs not in KOfam
+
+Sometimes, your input data may have annotations for KOs that are not in the KOfam profiles that we use for annotation. This can happen if you are using [enzymes-txt](/help/8/artifacts/enzymes-txt), or if you have imported external annotations with the source name `KOfam`. By default, we don't consider these annotations, and you will probably see an error message. However, (as suggested in that message) you can explicitly include these non-KOfam annotations into the analysis by providing the flag `--include-kos-not-in-kofam`, like so:
+
+
+anvi-estimate-metabolism --enzymes-txt [enzymes-txt](/help/8/artifacts/enzymes-txt) --include-kos-not-in-kofam
+
+
+## Output options
+This program has two types of output files: long-format (tab-delimited) output files and matrices. The long-format output is the default. If you are using multi-mode to work with multiple samples, you can request matrix output by using the flag `--matrix-format`.
+
+You can find more details on the output formats by looking at [kegg-metabolism](/help/8/artifacts/kegg-metabolism). Below, you will find examples of how to use output-related parameters.
+
+### Long-Format Output
+Long-format output has several preset "modes" as well as a "custom" mode in which the user can define the contents of the output file. Multiple modes can be used at once, and each requested "mode" will result in a separate output file. The default output mode is "modules" mode.
+
+**Viewing available output modes**
+
+To see what long-format output modes are currently available, use the `--list-available-modes` flag:
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) --list-available-modes
+
+
+The program will print a list like the one below and then exit.
+
+```
+AVAILABLE OUTPUT MODES
+===============================================
+modules ......................................: Information on metabolic modules
+modules_custom ...............................: A custom tab-delimited output file where you choose the included modules data using --custom-output-headers
+module_paths .................................: Information on each possible path (complete set of enzymes) in a module
+module_steps .................................: Information on each top-level step in a module
+hits .........................................: Information on all enzyme annotations in the contigs DB, regardless of module membership
+```
+
+Please note that you _must_ provide your input file(s) for this to work.
+
+**Using a non-default output mode**
+
+You can specify one or more long-format output modes using the `--output-modes` parameter. The mode names must exactly match one of the available modes from the `--list-available-modes` output.
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) --output-modes hits
+
+
+**Using multiple output modes**
+
+If you want more than one output mode, you can provide multiple comma-separated mode names to the `--output-modes` parameter. There should be no spaces between the mode names.
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) --output-modes hits,modules
+
+
+When multiple output modes are requested, a different output file is produced for each mode. All output files will have the same prefix, and the file suffixes specify the output mode. For example, modules mode output has the suffix `_modules.txt` while hits mode has the suffix `_hits.txt`.
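+
As a rough sketch, the helpers below build the value for `--output-modes` and predict the resulting file names. Only the `_modules.txt` and `_hits.txt` suffixes are documented above; extending the same `<prefix>_<mode>.txt` pattern to the other modes is an assumption. The function names and the `my_sample` prefix are hypothetical.

```python
# Mode names as printed by --list-available-modes in the example above.
AVAILABLE_MODES = ["modules", "modules_custom", "module_paths", "module_steps", "hits"]


def output_modes_argument(modes):
    """Join mode names with commas and no spaces, rejecting unknown names."""
    unknown = [mode for mode in modes if mode not in AVAILABLE_MODES]
    if unknown:
        raise ValueError(f"unknown output mode(s): {', '.join(unknown)}")
    return ",".join(modes)


def expected_output_files(prefix, modes):
    """Predict file names, assuming every mode uses the <prefix>_<mode>.txt pattern."""
    return [f"{prefix}_{mode}.txt" for mode in modes]


argument = output_modes_argument(["hits", "modules"])
files = expected_output_files("my_sample", ["hits", "modules"])
```

Validating the names before launching a long run avoids discovering a typo in a mode name only after the program has started.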
+
+**Viewing available output headers for 'custom' mode**
+
+The `modules_custom` output mode allows you to specify which information to include (as columns) in your long-format output. It is essentially a customizable version of modules mode output. To use this mode, you must specify which columns to include by listing the column names after the `--custom-output-headers` flag.
+
+To find out what column headers are available, use the `--list-available-output-headers` parameter:
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) --list-available-output-headers
+
+
+The program will print a list like the one below and then exit.
+
+```
+AVAILABLE OUTPUT HEADERS
+===============================================
+module .......................................: Module number [modules output mode]
+stepwise_module_is_complete ..................: Whether a module is considered complete or not based on its STEPWISE percent completeness and the completeness threshold [modules output mode]
+stepwise_module_completeness .................: Percent completeness of a module, computed as the number of complete steps divided by the number of total steps (where 'steps' are determined by splitting the module definition
+ on the space character) [modules output mode]
+pathwise_module_is_complete ..................: Whether a module is considered complete or not based on its PATHWISE percent completeness and the completeness threshold [modules output mode]
+pathwise_module_completeness .................: Percent completeness of a module, computed as maximum completeness of all possible combinations of enzymes ('paths') in the definition [modules output mode]
+enzymes_unique_to_module .....................: A list of enzymes that only belong to this module (ie, they are not members of multiple modules) [modules output mode]
+unique_enzymes_hit_counts ....................: How many times each unique enzyme appears in the sample (order of counts corresponds to list in `enzymes_unique_to_module` field) [modules output mode]
+proportion_unique_enzymes_present ............: Proportion of enzymes unique to this one module that are present in the sample [modules output mode]
+unique_enzymes_context_string ................: Describes the unique enzymes contributing to the `proportion_unique_enzymes_present` field [modules output mode]
+module_name ..................................: Name/description of a module [modules output mode]
+[....]
+```
+
+As you can see, this flag is also useful when you want to quickly look up the description of each column of data in your output files.
+
+For each header, the output mode(s) that it is applicable to are listed after the description. The headers you can choose from for `modules_custom` output end in either `[modules output mode]` or `[all output modes]`.
+
+Just as with `--list-available-modes`, you must provide your input file(s) for this to work. In fact, some headers will change depending on which input types you provide. You will see additional possible headers if you use the `--add-copy-number` or `--add-coverage` flags, though this only works for single-sample inputs and not for multi-mode. If you wish to get custom output in multi-mode, it is best to construct your custom header list by looking at the possible headers for your given parameter set for a single sample from your input file.
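+
If you save the printed header list to a text file, you can pick out the headers that are valid for `modules_custom` with a short script. The sketch below assumes the layout shown in the example listing above (name, a run of dots, a colon, then the description, with wrapped descriptions continuing on indented lines). The `made_up_header` entry is invented to show a header that gets filtered out, and the function name is hypothetical.

```python
import re

LISTING = """\
module .......................................: Module number [modules output mode]
stepwise_module_completeness .................: Percent completeness of a module, computed as the number of complete steps divided by the number of total steps (where 'steps' are determined by splitting the module definition
                                                on the space character) [modules output mode]
module_name ..................................: Name/description of a module [modules output mode]
made_up_header ...............................: Invented entry for this sketch [hits output mode]
"""

ENTRY = re.compile(r"^(\S+) \.+: (.*)$")


def headers_for_modules_custom(listing):
    """Return header names whose description ends in a modules/all-modes tag."""
    entries = []  # each item is [header name, description]
    for line in listing.splitlines():
        match = ENTRY.match(line)
        if match:
            entries.append([match.group(1), match.group(2)])
        elif entries and line[:1].isspace() and line.strip():
            # indented line: continuation of the previous description
            entries[-1][1] += " " + line.strip()
    allowed = ("[modules output mode]", "[all output modes]")
    return [name for name, description in entries if description.rstrip().endswith(allowed)]


usable = headers_for_modules_custom(LISTING)
```

Joining `usable` with commas gives a value in the form that `--custom-output-headers` expects.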
+
+**Using custom output mode**
+
+Here is an example of defining the modules output to contain columns with the module number, the module name, and the pathwise completeness score. The corresponding headers for these columns are provided as a comma-separated list (no spaces) to the `--custom-output-headers` flag.
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) --output-modes modules_custom --custom-output-headers module,module_name,pathwise_module_completeness
+
+
+**Including modules with 0% completeness in long-format output**
+
+By default, modules with completeness scores of 0 are not printed to the output files to save on space (both pathwise completeness and stepwise completeness must be 0 to exclude modules from the output). But you can explicitly include them by adding the `--include-zeros` flag.
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) --include-zeros
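+
The rule for which modules reach the output can be written as a small predicate. This is a sketch of the behavior described above, not code from anvi'o, and the function name `is_reported` is hypothetical.

```python
def is_reported(pathwise, stepwise, include_zeros=False):
    """A module is left out only if both scores are 0 and --include-zeros is absent."""
    return include_zeros or pathwise > 0 or stepwise > 0
```

A module with a pathwise score of 0 but a nonzero stepwise score is therefore still printed, even without `--include-zeros`.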
+
+
+**Including module copy number in long-format output**
+
+You can ask this program to count the number of copies of each module in your input samples by providing the `--add-copy-number` flag:
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) --output-modes modules,module_paths,module_steps --add-copy-number
+
+
+Just like module completeness, copy number can be calculated using two different strategies. You can find information about the calculations in the technical details section below, and information about what copy number output looks like in [kegg-metabolism](/help/8/artifacts/kegg-metabolism).
+
+This flag also works for matrix output.
+
+**Including coverage and detection in long-format output**
+
+If you have a profile database associated with your contigs database and you would like to include coverage and detection data in the metabolism estimation output files, you can use the `--add-coverage` flag. You will need to provide the profile database as well, of course. :)
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) -p [profile-db](/help/8/artifacts/profile-db) --output-modes modules,hits --add-coverage
+
+
+This option also works for the `--enzymes-txt` input option, provided that you include _both_ a `coverage` column and a `detection` column in the [enzymes-txt](/help/8/artifacts/enzymes-txt) input file.
+
+
+anvi-estimate-metabolism --enzymes-txt [enzymes-txt](/help/8/artifacts/enzymes-txt) --add-coverage
+
+
+For `hits` mode output files, in which each row describes one enzyme annotation for a gene in the contigs database, the output will contain two additional columns per sample in the profile database. One column will contain the mean coverage of that particular gene call by reads from that sample and the other will contain the detection of that gene in the sample.
+
+For `modules` mode output files, in which each row describes a metabolic module, the output will contain _four_ additional columns per sample in the profile database. One column will contain comma-separated mean coverage values for each gene call in the module, in the same order as the corresponding gene calls in the `gene_caller_ids_in_module` column. Another column will contain the average of these gene coverage values, which represents the average coverage of the entire module. Likewise, the third and fourth columns will contain comma-separated detection values for each gene call and the average detection, respectively.
+
+Note that you can customize which coverage/detection columns are in the output files if you use the `modules_custom` output mode. Use the following command to find out which coverage/detection headers are available:
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) -p [profile-db](/help/8/artifacts/profile-db) --add-coverage --list-available-output-headers
+
+
+### Matrix Output
+Matrix format is only available when working with multiple contigs databases. Several output matrices will be generated, each of which describes one statistic such as module completion score, module presence/absence, or enzyme annotation (hit) counts. As with long-format output, each output file will have the same prefix and the file suffixes will indicate which type of data is present in the file.
+
+In each matrix, the rows will describe modules, top-level steps, or enzymes. The columns will describe your input samples (i.e. genomes, metagenomes, bins), and each cell will be the corresponding statistic. You can see examples of this output format by viewing [kegg-metabolism](/help/8/artifacts/kegg-metabolism).
+
+**Obtaining matrix-formatted output**
+
+Getting these matrices is as easy as providing the `--matrix-format` flag.
+
+
+anvi-estimate-metabolism -i internal-genomes.txt --matrix-format
+
+
+**Including metadata in the matrix output**
+
+By default, the matrix output is a matrix ready for use in other computational applications, like visualizing as a heatmap or performing clustering. That means it has a header line and an index in the left-most column, but all other cells are numerical. However, you may want to instead have a matrix that is annotated with more information, like the names and categories of each module or the functional annotations of each enzyme. To include this additional information in the matrix output (as columns that occur before the sample columns), use the `--include-metadata` flag.
+
+
+anvi-estimate-metabolism -i internal-genomes.txt --matrix-format --include-metadata
+
+
+Note that this flag only works for matrix output because, well, the long-format output inherently includes metadata.
+
+**Including rows of all zeros in the matrix output**
+
+The `--include-zeros` flag works for matrix output, too. By default, modules that have 0 completeness (or enzymes that have 0 hits) in every input sample will be left out of the matrix files. Using `--include-zeros` results in the inclusion of these items (that is, the inclusion of rows of all zeros).
+
+
+anvi-estimate-metabolism -i internal-genomes.txt --matrix-format --include-zeros
+
+
+**Getting module-specific enzyme matrices**
+
+The standard enzyme hit matrix includes all enzymes that were annotated at least once in your input databases (or all enzymes that we know about, if `--include-zeros` is used). But sometimes you might want to see a matrix with only the enzymes from a particular metabolic pathway. To do this, pass a comma-separated (no spaces) list of module numbers to the `--module-specific-matrices` flag, and then your matrix output will include enzyme hit matrices for each module in the list.
+
+For example,
+
+
+anvi-estimate-metabolism -e input_txt_files/external_genomes.txt \
+ --matrix-format \
+ --module-specific-matrices M00001,M00009 \
+ -O external_genomes
+
+
+will produce the output files `external_genomes-M00001_enzyme_hits-MATRIX.txt` and `external_genomes-M00009_enzyme_hits-MATRIX.txt` (in addition to the typical output matrices). Each additional output matrix will include one row for each enzyme in the module, in the order it appears in the module definition. It will also include comment lines for each major step (or set of steps) in the module definition, to help with interpreting the output.
+
+Check out this (partial) example for module M00001:
+```
+enzyme isolate E_faecalis_6240 test_2
+# (K00844,K12407,K00845,K25026,K00886,K08074,K00918)
+K00844 0 0 0
+K12407 0 0 0
+K00845 0 0 0
+K25026 0 1 0
+K00886 1 0 1
+K08074 0 0 0
+K00918 0 0 0
+# (K01810,K06859,K13810,K15916)
+K01810 1 1 1
+K06859 0 0 0
+K13810 0 0 0
+K15916 0 0 0
+# (K00850,K16370,K21071,K00918)
+K00850 0 1 0
+K16370 0 0 0
+K21071 0 0 0
+K00918 0 0 0
+[....]
+```
+
+If you don't want those comment lines in there, you can combine this with the `--no-comments` flag to get a clean matrix. This might be useful if you want to do some downstream processing of the matrices.
+
+
+anvi-estimate-metabolism -e input_txt_files/external_genomes.txt \
+ --matrix-format \
+ --module-specific-matrices M00001,M00009 \
+ --no-comments \
+ -O external_genomes
+
+
+In this case, the above file would look like this:
+```
+enzyme isolate E_faecalis_6240 test_2
+K00844 0 0 0
+K12407 0 0 0
+K00845 0 0 0
+K25026 0 1 0
+K00886 1 0 1
+K08074 0 0 0
+K00918 0 0 0
+K01810 1 1 1
+K06859 0 0 0
+K13810 0 0 0
+K15916 0 0 0
+K00850 0 1 0
+K16370 0 0 0
+K21071 0 0 0
+K00918 0 0 0
+[....]
+```
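+
+If you already have a matrix that contains comment lines, you can also skip them while parsing. The following is a minimal Python sketch, assuming the matrix file is tab-delimited and that comment lines start with `#`. The function name `read_enzyme_matrix` and the inline excerpt are illustrative and not part of anvi'o.
+
```python
import csv
import io


def read_enzyme_matrix(text):
    """Parse a tab-delimited enzyme hit matrix, skipping '#' comment lines.

    Returns (samples, rows); rows is a list of (enzyme, counts) in file order.
    """
    data_lines = [line for line in text.splitlines()
                  if line.strip() and not line.startswith("#")]
    reader = csv.reader(io.StringIO("\n".join(data_lines)), delimiter="\t")
    header = next(reader)
    samples = header[1:]  # the first column holds the enzyme accession
    rows = [(fields[0], [int(value) for value in fields[1:]]) for fields in reader]
    return samples, rows


# Illustrative excerpt of a module-specific matrix (tab-delimited by assumption)
excerpt = "\n".join([
    "enzyme\tisolate\tE_faecalis_6240\ttest_2",
    "# (K00844,K12407,K00845,K25026,K00886,K08074,K00918)",
    "K25026\t0\t1\t0",
    "K00886\t1\t0\t1",
    "# (K01810,K06859,K13810,K15916)",
    "K01810\t1\t1\t1",
])

samples, rows = read_enzyme_matrix(excerpt)
```
+
+Rows are kept as a list rather than a dictionary because the same enzyme (such as K00918 in the example above) can appear in more than one step of a module.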
+
+**Including copy number in matrix output**
+
+The `--add-copy-number` flag, which was discussed above for including module copy number values in long-format output, also works for matrix output:
+
+
+anvi-estimate-metabolism -i internal-genomes.txt --matrix-format --add-copy-number
+
+
+When you use this flag, you will get matrices describing copy number statistics in addition to the typical set of matrix output files.
+
+### Other output options
+
+Regardless of which output type you are working with, there are a few generic options for controlling what the output files look like.
+
+**Changing the output file prefix**
+
+[anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) can produce a variety of output files. All will be prefixed with the same string, which by default is `kegg-metabolism`. If you want to change this prefix, use the `-O` flag.
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) -O my-cool-prefix
+
+
+**Including only complete modules in the output**
+
+Remember that module completion threshold? Well, you can use that to control which modules make it into your output files. If you provide the `--only-complete` flag, then any module-related output files will only include modules that have a completeness score (either pathwise or stepwise) at or above the module completion threshold. (This doesn't affect enzyme-related outputs, for obvious reasons.)
+
+Here is an example of using this flag with long-format output (which is the default, as described above, but we are asking for it explicitly here just to be clear):
+
+
+anvi-estimate-metabolism -c [contigs-db](/help/8/artifacts/contigs-db) --output-modes modules --only-complete
+
+
+And here is an example of using this flag with matrix output. In this case, we are working with multiple input samples, and the behavior of this flag is slightly different: a module will be included in the matrix if it is at or above the module completion threshold in **at least one sample, for either pathwise or stepwise completeness**. That means you may see numbers lower than the threshold in the completeness matrices.
+
+
+anvi-estimate-metabolism -i internal-genomes.txt --matrix-format --only-complete
+
+
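+The inclusion rule for matrix output can be written as a small predicate. This is only a sketch of the rule as described above, not anvi'o's code. The function name `keep_module_in_matrix` and the example scores are hypothetical, and the default threshold of 0.75 is the default module completion threshold.
+
```python
def keep_module_in_matrix(pathwise_scores, stepwise_scores, threshold=0.75):
    """Return True if any sample reaches the threshold by either strategy."""
    all_scores = list(pathwise_scores) + list(stepwise_scores)
    return any(score >= threshold for score in all_scores)


# Hypothetical completeness values for one module across three samples
pathwise = [0.40, 0.80, 0.10]
stepwise = [0.25, 0.50, 0.00]

# The second sample reaches the threshold pathwise, so the module is kept,
# even though every other value in its row is below the threshold.
kept = keep_module_in_matrix(pathwise, stepwise)
```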
+
+## Testing this program
+You can see if this program is working on your computer by running the following suite of tests, which will check several common use-cases:
+
+
+anvi-self-test --suite metabolism
+
+
+
+## Help! I'm getting version errors!
+If you have gotten an error that looks something like this:
+
+```
+Config Error: The contigs DB that you are working with has been annotated with a different version of the MODULES.db than you are working with now.
+```
+
+This means that the [modules-db](/help/8/artifacts/modules-db) used by [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) has different contents (different KOs and/or different modules) than the one you are currently using to estimate metabolism, which would lead to mismatches if metabolism estimation were to continue. There are a few ways this can happen:
+
+1. You upgraded to a new anvi'o version and downloaded the default [kegg-data](/help/8/artifacts/kegg-data) associated with that release, but are working with a [contigs-db](/help/8/artifacts/contigs-db) that was annotated with a previous anvi'o version (and therefore a different instance of [kegg-data](/help/8/artifacts/kegg-data)).
+2. Without changing anvi'o versions, you annotated your [contigs-db](/help/8/artifacts/contigs-db) with default [kegg-data](/help/8/artifacts/kegg-data), and subsequently replaced that data with a different instance by running [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data) again with the `--reset` flag (and likely also with the `--kegg-archive`, `--kegg-snapshot`, or `--download-from-kegg` options, all of which get you a non-default version of KEGG data). Then you tried to run [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) with the new data.
+3. You have multiple instances of [kegg-data](/help/8/artifacts/kegg-data) on your computer in different locations, and you used different ones for [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) and [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism).
+4. Your collaborator gave you some databases that they annotated with a different version of [kegg-data](/help/8/artifacts/kegg-data) than you have on your computer.
+
+There are two main solutions for most of these situations, which differ according to which set of annotations you would prefer to use.
+
+**First option**: you want to update your [contigs-db](/help/8/artifacts/contigs-db) to have new annotations that match the current [modules-db](/help/8/artifacts/modules-db). In this case, you have to rerun [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) on the [contigs-db](/help/8/artifacts/contigs-db). Make sure you provide the same `--kegg-data-dir` value (if any) that you put in the `anvi-estimate-metabolism` command that gave you this error.
+
+**Second option**: you want to continue working with the existing set of annotations in the [contigs-db](/help/8/artifacts/contigs-db). This means you need to change which [modules-db](/help/8/artifacts/modules-db) you are using for [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism). The error message should tell you the hash of the [modules-db](/help/8/artifacts/modules-db) used for annotation. You can use that hash to identify the matching database so that you can either re-download that database, or (if you already have it) find it on your computer.
+
+If you have multiple instances of [kegg-data](/help/8/artifacts/kegg-data) on your computer, you can run `anvi-db-info` on the [modules-db](/help/8/artifacts/modules-db) in each of those directories until you find the one with the hash you are looking for. Then provide the path to that directory using the `--kegg-data-dir` parameter of `anvi-estimate-metabolism`.
+
+{:.notice}
+If you've recently upgraded your anvi'o version (i.e., situation 1 from above) and you kept your previous installation of anvi'o, the database you want should still be available as part of that environment. You can find its location by activating the environment and running the following code in your terminal: `export ANVIO_MODULES_DB=$(python -c "import anvio; import os; print(os.path.join(os.path.dirname(anvio.__file__), 'data/misc/KEGG/MODULES.db'))")`. Use `echo $ANVIO_MODULES_DB` to print the path in your terminal, and `anvi-db-info $ANVIO_MODULES_DB` to verify that its hash matches the one in your contigs database.
+
+If you don't have any matching instances of [kegg-data](/help/8/artifacts/kegg-data) on your computer, you will need to download it. First, check if the version you want is one of the KEGG snapshots that anvi'o provides by looking at the `KEGG-SNAPSHOTS.yaml` file in the anvi'o codebase. For instance, you can get the location of that file and print it to your terminal by running the following:
+
+
+export ANVIO_KEGG_SNAPSHOTS=$(python -c "import anvio; import os; print(os.path.join(os.path.dirname(anvio.__file__), 'data/misc/KEGG-SNAPSHOTS.yaml'))")
+cat $ANVIO_KEGG_SNAPSHOTS
+
+
+Take a look through the different versions. If you see one with a hash matching to the one used to annotate your [contigs-db](/help/8/artifacts/contigs-db), then you can download that version by following [the directions for setting up a KEGG snapshot](https://anvio.org/help/main/programs/anvi-setup-kegg-data/#setting-up-an-earlier-kegg-snapshot). Provide the snapshot version name to the `--kegg-snapshot` parameter of [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data).
+
+**I can't find KEGG data with a matching hash!**
+If you don't have a matching metabolism database on your computer, and none of the snapshots in the `KEGG-SNAPSHOTS.yaml` file have the hash that you need, your [contigs-db](/help/8/artifacts/contigs-db) was probably annotated with KO and module data [downloaded directly from KEGG](https://anvio.org/help/main/programs/anvi-setup-kegg-data/#getting-the-most-up-to-date-kegg-data-downloading-directly-from-kegg). If you have obtained the [contigs-db](/help/8/artifacts/contigs-db) from a collaborator (i.e., situation 4 from above), ask them to also share their [kegg-data](/help/8/artifacts/kegg-data) with you, following [these steps](https://anvio.org/help/main/programs/anvi-setup-kegg-data/#how-do-i-share-this-data). Otherwise, anvi'o cannot really help you get this data back, and you may have to resort to the first option described above (re-running [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) on your [contigs-db](/help/8/artifacts/contigs-db)).
+
+If none of these solutions help you to get rid of the version incompatibility error, please feel free to reach out to the anvi'o developers for help.
+
+
+## What to do if estimation is not working as expected for user-defined metabolic modules?
+
+If you are estimating completeness of user-defined modules and find that the results are not as expected, you should double check your module files to make sure the pathway is defined properly. Are the enzyme accession numbers in the DEFINITION correct? Do you have the proper ANNOTATION_SOURCE for each enzyme, and are these lines spelled properly and matching to the annotation sources in your contigs database(s)? If you are using custom HMM profiles, did you remember to use the `--add-to-functions-table` parameter?
+
+If these things are correct but you are still not finding an annotation for one or more enzymes that you _know_ should be in your sequence data, consider why those annotations might not be there - perhaps the hits are too weak (i.e., their e-values are too high) for the annotations to be kept in the database? Keep in mind that you can always try to add enzyme annotations (with the proper sources) to your database using [anvi-import-functions](/help/8/programs/anvi-import-functions) before running [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) again.
+
+## Technical Details
+
+### What data is used for estimation?
+
+Regardless of which input type is provided to this program, the basic requirements for metabolism estimation are 1) a set of metabolic pathway definitions, and 2) a 'pool' of gene annotations.
+
+#### Module Definitions
+One set of metabolic pathway definitions that can be used by this program is the [KEGG MODULE resource](https://www.genome.jp/kegg/module.html). You can also define your own set of metabolic modules, but the definition format and estimation strategy will be the same. So for brevity's sake, the following discussion will cover the KEGG data case.
+
+The program [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data) acquires the definitions of these modules using the KEGG API and puts them into the [modules-db](/help/8/artifacts/modules-db). The definitions are strings of KEGG Ortholog (KO) identifiers, representing the functions necessary to carry out each step of the metabolic pathway. Let's use module [M00018](https://www.genome.jp/kegg-bin/show_module?M00018), Threonine Biosynthesis, as an example. Here is the module definition, in picture form:
+
+![Module M00018 Definition](../../images/M00018.png){:.center-img .width-50}
+
+This biosynthesis pathway has five major steps, or chemical reactions (we call these major steps 'top-level steps', which will be important later). The [first reaction](https://www.genome.jp/dbget-bin/www_bget?R00480) in the pathway requires an aspartate kinase enzyme, and there are four possible orthologs known to encode this function: K00928, K12524, K12525, or K12526 (K12524 and K12525 are bifunctional enzymes that also act as homoserine dehydrogenases, which is why they appear again in the third step). Only one of these genes is required to be able to carry out this step. In contrast, the [second reaction](https://www.genome.jp/dbget-bin/www_bget?R02291) can be fulfilled by only one known KO, the aspartate-semialdehyde dehydrogenase [K00133](https://www.genome.jp/dbget-bin/www_bget?ko:K00133).
+
+The definition string for module M00018 is this:
+
+```
+(K00928,K12524,K12525,K12526) K00133 (K00003,K12524,K12525) (K00872,K02204,K02203) K01733
+```
+
+Hopefully the correspondence between the picture and text is clear - spaces separate distinct steps in the pathway, while commas separate alternatives.
+
+That was a simple example, so let's look at a more complicated one: [M00011](https://www.genome.jp/kegg-bin/show_module?M00011), the second carbon oxidation phase of the citrate cycle.
+
+![Module M00011 Definition](../../images/M00011.png){:.center-img .width-50}
+
+This pathway also has five steps, but this time, most of the reactions require an _enzyme complex_. Each KO within a multi-KO box is a component of an enzyme. For example, one option for the first reaction is 2-oxoglutarate dehydrogenase, a 3-component enzyme made up of [K00164](https://www.genome.jp/dbget-bin/www_bget?K00164), [K00658](https://www.genome.jp/dbget-bin/www_bget?K00658), and [K00382](https://www.genome.jp/dbget-bin/www_bget?K00382).
+
+This is the definition string for module M00011:
+
+```
+(K00164+K00658+K00382,K00174+K00175-K00177-K00176) (K01902+K01903,K01899+K01900,K18118) (K00234+K00235+K00236+K00237,K00239+K00240+K00241-(K00242,K18859,K18860),K00244+K00245+K00246-K00247) (K01676,K01679,K01677+K01678) (K00026,K00025,K00024,K00116)
+```
+
+And here is a detail that is difficult to tell from the pictorial definition - not all enzyme components are equally important. You can see in the definition string that KO components of an enzyme complex are connected with either '+' signs or '-' signs. The '+' sign indicates that the following KO is an essential component of the enzyme, while the '-' sign indicates that it is non-essential. For the purposes of module completeness estimation, we only consider a reaction to be fulfilled if all the _essential_ component KOs are present in the annotation pool (and we don't care about the 'non-essential' components). So, for example, we would consider the first step in this pathway complete if just K00174 and K00175 were present. The presence/absence of either K00177 or K00176 would not affect the module completeness score at all.
+
+Module definitions can be even more complex than this. Both of these examples had exactly five top-level steps, no matter which set of KOs you use to fulfill each reaction. However, in some modules, there can be alternative sets with different numbers of steps. In addition, some modules (such as [M00611](https://www.genome.jp/kegg-bin/show_module?M00611), the module representing photosynthesis), are made up of _other_ modules, in which case they are only complete if their component modules are complete.
+
+Hopefully this information will help you understand our estimation strategies in the next section.
+
+#### KOfam (enzyme) annotations
+For metabolism estimation to work properly, gene identifiers in the pool of annotations must match to the gene identifiers used in the pathway definitions. For KEGG MODULEs, we rely on annotations from the [KEGG KOfam database](https://www.genome.jp/tools/kofamkoala/), which is a set of HMM profiles for KEGG Orthologs (KOs). The program [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) can annotate your [contigs-db](/help/8/artifacts/contigs-db) with hits to the KEGG KOfam database. It adds these annotations under the source name 'KOfam'.
+
+Which of the annotations are considered for metabolism estimation depends on the input context. If you are working with isolate genomes (i.e., _not_ metagenome mode or bins), then all of the annotations under source 'KOfam' will be used. If you are working with bins in metagenomes, then for each bin, only the 'KOfam' annotations that are present in that bin will be in the annotation pool. Finally, for metagenome mode, since estimation is done for each contig separately, only the annotations present in each contig will be considered at a time.
+
+User-defined metabolic modules must specify the annotation source(s) needed to find their component enzymes in your data. Adding these annotation sources to your contigs databases may require running a variety of programs. However, `anvi-estimate-metabolism` loads these gene annotations and uses them in the same way as it does 'KOfam' annotations for KEGG data.
+
+### Two estimation strategies - pathwise and stepwise
+
+We currently have two ways of estimating the completeness of a module, which differ in how we decompose the module DEFINITION string into smaller parts.
+
+![A comparison of the pathwise and stepwise strategies](../../images/pathwise_vs_stepwise.png)
+
+For the 'pathwise' strategy, we consider all possible 'paths' through the module - each alternative set of enzymes that could be used together to catalyze every reaction in the metabolic pathway. After calculating the percent completeness in all possible paths, we take the maximum completeness to be the pathwise completeness score of the module as a whole. This is the most granular way of estimating module completeness because we consider all the possible alternatives.
+
+For the 'stepwise' strategy, we break down the module DEFINITION into its major, or 'top-level', steps. Each "top-level" step usually represents either one metabolic reaction or a branch point in the pathway, and is defined by one or more enzymes that either work together or serve as alternatives to each other to catalyze this reaction or set of reactions. We use the available enzyme annotations to determine whether each step can be catalyzed or not - just a binary value representing whether the step is present or not. Then we compute the stepwise module completeness as the percent of present top-level steps. This is the least granular way of estimating module completeness because we do not distinguish between enzyme alternatives - these are all considered as one step which is either entirely present or entirely absent.
+
+The pathwise and stepwise strategies also apply to copy number calculations, in which enzyme annotations are allocated to create different copies of a path, step, or module. Path copy number is computed as the number of complete copies of a path through a module, and a module's pathwise copy number is then calculated as the maximum copy number of any of its paths that have the highest completeness score. Step copy number is the number of complete copies of a top-level step, and a module's stepwise copy number is the minimum copy number of all of its top-level steps.
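+
+To make these aggregation rules concrete, here is a minimal Python sketch. It assumes the per-path and per-step values have already been computed. The function names and the example numbers are hypothetical, and this is not the code anvi'o actually runs.
+
```python
def stepwise_completeness(step_present):
    """Fraction of top-level steps that are present (each step is True/False)."""
    return sum(1 for present in step_present if present) / len(step_present)


def pathwise_copy_number(path_completeness, path_copy_numbers):
    """Maximum copy number among the paths that share the highest completeness."""
    best = max(path_completeness)
    return max(copies
               for score, copies in zip(path_completeness, path_copy_numbers)
               if score == best)


def stepwise_copy_number(step_copy_numbers):
    """Minimum copy number across all top-level steps."""
    return min(step_copy_numbers)


# Hypothetical module with three paths and five top-level steps
path_scores = [1.0, 1.0, 0.8]
path_copies = [1, 2, 0]
step_copies = [2, 1, 3, 2, 2]

module_pathwise_copies = pathwise_copy_number(path_scores, path_copies)
module_stepwise_copies = stepwise_copy_number(step_copies)
module_stepwise_score = stepwise_completeness([True, True, False, True, True])
```
+
+In this example the two fully complete paths have 1 and 2 copies, so the pathwise copy number is 2, while the stepwise copy number is limited to 1 by the rarest top-level step.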
+
+Confused? Yeah, this is complicated stuff! But hopefully the illustrative examples in the next few sections will clear it up.
+
+### How is pathwise completeness/copy number calculated?
+
+For demonstration purposes, let's talk through the estimation of pathwise completeness and copy number for one module, in one 'sample' (i.e., a genome, bin, or contig in a metagenome). Just keep in mind that the steps described below are followed for each module in each sample.
+
+#### Part 1: Unrolling module definitions
+As you saw above in the module examples, there can be multiple alternative KOs for a given step in a pathway. This means that there can be more than one way to have a 'complete' metabolic module. Therefore, to estimate completeness, we first have to identify all possible 'paths' through the module definition, where a 'path' is a set of KOs that could make the module complete (if they were all present in the annotation pool).
+
+`anvi-estimate-metabolism` uses a recursive algorithm to "unroll" the module definition string into a list of all possible paths. First, the definition string is split into its top-level steps (which are separated by spaces). Each step is either an atomic step, a protein complex (KO components separated by '+' or '-'), or a compound step (multiple alternatives, separated by commas). Compound steps and protein complexes are recursively broken down until we have only atomic steps. An atomic step can be a single KO, a module number, a nonessential KO starting with '-', or `--` (a string indicating that there is a reaction for which we do not have a KO). We use the atomic steps to build a list of alternative paths through the module definition. Protein complexes are split into their respective components using this strategy to find all possible alternative complexes, and then these complexes (with all their component KOs) are used to build the alternative paths.
+
+Let's see this in action, using the Threonine Biosynthesis module from above as an example. We first split the definition on spaces to get all top-level steps. Here we show each top-level step on its own line:
+```
+(K00928,K12524,K12525,K12526)
+K00133
+(K00003,K12524,K12525)
+(K00872,K02204,K02203)
+K01733
+```
+The first step is made up of 4 alternative KOs. We split on the commas to get these, and thus we have the starting KO for 4 possible alternative paths:
+```
+K00928 K12524 K12525 K12526
+ | | | |
+```
+The second step, K00133, is already an atomic step, so we can simply extend each of the paths with this KO:
+```
+K00928 K12524 K12525 K12526
+ | | | |
+K00133 K00133 K00133 K00133
+```
+The third step is another compound step, but this time we can get 3 atomic steps out of it. That means that our 4 possible paths so far each gets 3 alternatives, bringing our total alternative path count up to 12:
+```
+ K00928 K12524 K12525 K12526
+ | | | |
+ K00133 K00133 K00133 K00133
+ / | \ / | \ / | \ / | \
+K00003 K12524 K12525 K00003 K12524 K12525 K00003 K12524 K12525 K00003 K12524 K12525
+```
+Okay, hopefully you get the picture by now. The end result is a list of lists, like this:
+```
+[[K00928,K00133,K00003,K00872,K01733],
+[K00928,K00133,K00003,K02204,K01733],
+......
+[K12526,K00133,K12525,K02203,K01733]]
+```
+in which every inner list is one of the alternative paths through the module definition - one of the possible ways to have a complete module.
+
+By the way, here is one alternative path from the module M00011, just so you know what these look like with protein complexes:
+```
+[K00164+K00658+K00382,K01902+K01903,K00234+K00235+K00236+K00237,K01676,K00026]
+```
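+
+The unrolling idea can be sketched in a few lines of Python. This is a simplified illustration rather than the implementation in `anvi-estimate-metabolism`. The function names are hypothetical, and protein complexes (including any nested alternatives inside a complex) are deliberately kept as single tokens instead of being expanded further.
+
```python
def split_top_level(text, sep):
    """Split text on sep, ignoring separators nested inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return [part for part in parts if part]


def is_wrapped(text):
    """True if the opening '(' at the start is closed by the final ')'."""
    if not (text.startswith("(") and text.endswith(")")):
        return False
    depth = 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0 and i < len(text) - 1:
                return False
    return True


def unroll_step(step):
    """Return the alternative sub-paths (lists of tokens) for one step."""
    alternatives = split_top_level(step, ",")
    if len(alternatives) > 1:
        return [path for alt in alternatives for path in unroll_step(alt)]
    if is_wrapped(step):
        return unroll(step[1:-1])
    return [[step]]  # a KO, a module number, '--', or a complex kept whole


def unroll(definition):
    """Expand a definition string into the list of all alternative paths."""
    paths = [[]]
    for step in split_top_level(definition, " "):
        alternatives = unroll_step(step)
        paths = [path + alt for path in paths for alt in alternatives]
    return paths


m00018 = ("(K00928,K12524,K12525,K12526) K00133 (K00003,K12524,K12525) "
          "(K00872,K02204,K02203) K01733")
paths = unroll(m00018)
```
+
+For M00018 this yields 4 x 1 x 3 x 3 x 1 = 36 alternative paths, starting with `[K00928, K00133, K00003, K00872, K01733]` and ending with `[K12526, K00133, K12525, K02203, K01733]`.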
+
+#### Part 2: Marking steps complete
+Once we have our list of alternative paths through the module, the next task is to compute the completeness of each path. Each alternative path is a list of atomic steps or protein complexes. We loop over every step in the path and use the annotation pool of KOs to decide whether the step is complete (1) or not (0). We have the following cases to handle:
+
+1. A single KO - this is easy. If we have an annotation for this KO in our pool of 'KOfam' annotations, then the step is complete (1).
+
+2. A protein complex - remember that these are multiple KOs connected with '+' (if they are essential components) or '-' (if they are non-essential). Well, for these steps, we compute a fractional completeness based on the number of essential components that are present in the annotation pool. We basically ignore the non-essential KOs. For example, the complex 'K00174+K00175-K00177-K00176' would be considered 50% complete (a score of 0.5) if only 'K00174' were present in the annotation pool.
+
+3. Non-essential KOs - some KOs are marked as non-essential even when they are not part of a protein complex. They look like this: '-K12420', with a minus sign in front of the KO identifier (that particular example comes from module [M00778](https://www.genome.jp/kegg-bin/show_module?M00778)). These steps are ignored for the purposes of computing module completeness.
+
+4. Steps without associated KOs - some reactions do not have a KO identifier, but instead there is the string `--` serving as a placeholder in the module definition. Since we can't annotate the genes required for these steps, we have no idea if they are complete or not, so we always consider them incomplete (0). Modules that have steps like this can therefore never have 100% completeness - it is sad, but what can we do? We warn the user about these instances so that they can check manually for any missing steps.
+
+5. Modules - finally, some modules are defined by other modules. We can't determine if these steps are complete until we've estimated completeness for every module, so we ignore these for now.
+
+To get the completeness score for a given path through the module, we first add up the completeness of each essential step in the path and then we divide that sum by the number of essential steps.
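+
+The five cases above can be summarized in a short Python sketch. It is an illustration under stated assumptions, not anvi'o's code: accessions are assumed to look like KEGG KOs (`K` followed by five digits), complexes are assumed to contain no nested parentheses, and module steps are left out of both the sum and the step count until their scores are supplied through the hypothetical `module_scores` argument (which corresponds to the adjustment described in Part 4).
+
```python
import re


def step_completeness(step, annotations, module_scores=None):
    """Return a score in [0, 1] for one step, or None if the step is ignored."""
    if step == "--":
        return 0.0  # reaction without a KO: always counted as incomplete
    if step.startswith("-"):
        return None  # stand-alone non-essential KO: ignored
    if re.fullmatch(r"M\d{5}", step):
        # module step: ignored until component module scores are known
        return None if module_scores is None else module_scores[step]
    # single KO or protein complex: only essential components count
    components = re.findall(r"([+-]?)(K\d{5})", step)
    essential = [ko for sign, ko in components if sign != "-"]
    return sum(ko in annotations for ko in essential) / len(essential)


def path_completeness(path, annotations, module_scores=None):
    """Average the scores of all steps in the path that are not ignored."""
    scores = [step_completeness(step, annotations, module_scores) for step in path]
    counted = [score for score in scores if score is not None]
    return sum(counted) / len(counted) if counted else 0.0


# Hypothetical annotation pool for one sample
pool = {"K00928", "K00133", "K00872", "K00174"}

complex_score = step_completeness("K00174+K00175-K00177-K00176", pool)
path_score = path_completeness(
    ["K00928", "K00133", "K00003", "K00872", "K01733"], pool)
```
+
+With this annotation pool, the complex scores 0.5 (one of two essential components is present) and the path scores 0.6 (three of five steps are complete).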
+
+#### Part 3: Module completeness
+By this time, we have a completeness score (a fraction between 0 and 1) for every possible path through the module. To get the completeness score for the module overall, we simply take the maximum of all these completeness scores.
+
+{:.notice}
+Why take the maximum? We are assuming here that if the metabolic pathway is actually being used in a cell (which we can't know for sure without doing some transcriptomics and possibly metabolomics), the most complete set of enzymes in that pathway is the most likely to be used. This is certainly a questionable assumption, but we need to make some choices like this in order to summarize the data, so we do it. It gets trickier to interpret this number when there is more than one path through the module that has the maximum completeness score - which one is being used? We cannot know just from (meta)genomics data, so it would take additional data types or knowledge of the biological system to figure this out.
+
+We can then check this number against the module completeness threshold (which is 0.75 by default). If the module completeness score is greater than or equal to the threshold, we mark the module as 'complete'. This boolean value is meant only as a way to easily filter through the modules output, and you shouldn't put too much stock in it because it covers up a lot of nuances, as you can tell from the details above :).
+
+Note that some modules, especially those with a lot of possible paths, can have more than one path which has the maximum completeness score (that is, the score that determines the completeness of the module). This will be important later for calculating pathwise copy number, so we keep track of all of the paths with this maximum completeness score.
+
+#### Part 4: Adjusting completeness
+But don't forget that there are some modules defined by other modules. These are usually what KEGG calls 'Signature Modules', which are collections of enzymes that collectively encode some phenotype, rather than a typical pathway of chemical reactions. For these modules, we have to go back and adjust the completeness score after we know the completeness of its component modules. To do this, we basically re-do the previous two tasks to recompute the number of complete steps in each path and the overall completeness of the module. This time, when we reach a 'Module' atomic step (case 5), we take that module's fractional completeness score to be the completeness of the step.
+
+As an example, consider module [M00618](https://www.genome.jp/kegg-bin/show_module?M00618), the Acetogen signature module. Its definition is
+```
+M00377 M00579
+```
+Suppose module M00377 had a completeness score of 0.7 and module M00579 had a score of 0.4, based on the prior estimations. Then the completeness score of the `[M00377,M00579]` path would be (0.7+0.4)/2 = 0.55. Since this is the only possible path through the module, M00618 is 55% complete.
+
+#### Part 5: Path copy number
+
+In reality, this part is actually done at the same time as Part 2, but it is easier to understand if we think of it separately. We have all our possible paths through the module from Part 1, and we need to figure out the copy number of each path. To do this, we look at the number of annotations of each atomic step in the path. Here is how we handle each atomic step:
+
+1. A single KO - the copy number of this atomic step is equal to the number of annotations (hits) of this enzyme. If the atomic step is K00133 and you have 5 genes annotated with K00133, then you have 5 copies of that step. If you have 0 annotations, then the step copy number is 0. This is the simplest case.
+
+2. A protein complex - these are a bit tricky. Once again, we ignore the non-essential components and only consider essential ones. But we use the number of annotations to figure out how many _complete_ copies of the protein complex exist in the input sample. For example, suppose we have 2 annotations for K00174 and 3 annotations for K00175. These are the only two essential components in the complex 'K00174+K00175-K00177-K00176'. Since we have at least two annotations for both of these enzymes, we have two copies of the complex. What about the third annotation for K00175? Well, it can't do much all by itself. This hypothetical third copy of the complex is only 50% complete (1 out of 2 essential components), which is less than the default module completeness threshold of 75%. So we ignore it, and say that we have 2 copies of the enzyme complex. _However, this calculation can change if you were to adjust the module completeness threshold._ If you set the threshold to be 50% or lower, then 50% is enough to consider the third copy of the complex complete (in which case, we would say that we have 3 copies of the enzyme complex).
+
+3. Non-essential KOs - just like we ignore these steps when computing completeness, we also ignore them when computing copy number.
+
+4. Steps without associated KOs (the `--` case) - Just like we always consider these atomic steps to be incomplete, we also always give them a copy number of 0.
+
+5. Modules - as you might guess, the copy number of these atomic steps is obtained later, after we've computed copy number for every other module. There is an adjustment step for copy number just like there is one for completeness (Part 4 above).
+
+To get the copy number for a given path through the module, we determine the number of complete copies of the path. This is the same as the way we handle copy number of protein complexes, as described above. And it _also depends on the module completeness threshold_. Suppose a path has 4 essential atomic steps (call them A, B, C, and D) with the following copy numbers: 4, 3, 1, and 2. Using the default completeness threshold of 0.75, we need at least 3 out of 4 atomic steps to be present in order for a copy of the path to be considered complete. There is one copy that has all 4 steps, one copy that has 3/4, one copy that has 2/4 and one that has 1/4. This is perhaps easier to see in graph form, with atomic steps on the x-axis and atomic step copy number on the y-axis:
+
+```
+X
+X X
+X X X <-- (3/4 enzymes present)
+X X X X <-- (4/4 enzymes present)
+A B C D
+```
+
+Each copy of the path is a horizontal row of X's in the simple graphic above. There are 2 copies of the path with at least 3/4 atomic steps in the list (marked with arrows), which means that the path copy number is 2.
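+
+The counting in the graphic above can be sketched in a few lines of Python. This is only an illustration of the idea as described here, not the code that anvi'o actually runs; the function name `path_copy_number` and its arguments are made up for this example.
+
+```python
+def path_copy_number(step_copy_numbers, threshold=0.75):
+    """Count the copies of a path whose fraction of present steps meets the threshold.
+
+    step_copy_numbers: copy numbers of the essential atomic steps in one path
+    threshold: module completeness threshold (0.75 is the default named above)
+    """
+    if not step_copy_numbers:
+        return 0
+    num_steps = len(step_copy_numbers)
+    complete_copies = 0
+    # the n-th copy of the path is one horizontal row of X's in the graphic:
+    # it contains every atomic step that has at least n annotations
+    for n in range(1, max(step_copy_numbers) + 1):
+        steps_present = sum(1 for count in step_copy_numbers if count >= n)
+        if steps_present / num_steps >= threshold:
+            complete_copies += 1
+    return complete_copies
+
+
+print(path_copy_number([4, 3, 1, 2]))           # steps A, B, C, D from the example
+print(path_copy_number([2, 3]))                 # the K00174+K00175 complex
+print(path_copy_number([2, 3], threshold=0.5))  # same complex, lower threshold
+```
+
+With the numbers from the text, this gives 2 for the four-step path, 2 for the protein complex at the default threshold, and 3 for the complex once the threshold is lowered to 0.5.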
+
+#### Part 6: Module copy number
+
+Once we have the completeness scores and copy numbers of all possible paths through the module, we can compute the copy number of the module itself. Remember from Part 3 that we saved the paths which have the highest completeness score? We take the maximum copy number of those paths of highest completeness.
+
+So if the module does not have any complete paths, then its copy number is 0. If it has one complete path, then its copy number is the copy number of that path. If there are multiple paths with highest completeness score, then its copy number is the maximum of the copy numbers of those paths - for example, let's say we have two paths, both of which are 90% complete. One of those paths has a copy number of 1 and the other has a copy number of 3. The module copy number would be 3 in this case.
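+
+As a rough sketch of how Part 3 and Part 6 fit together, suppose each path has already been reduced to a `(completeness, copy_number)` pair. The function name `summarize_module` and this input layout are assumptions made for illustration and are not part of anvi'o. The sketch also assumes the module has at least one path.
+
+```python
+def summarize_module(paths, threshold=0.75):
+    """Summarize a module from (completeness, copy_number) pairs, one pair per path."""
+    # Part 3: module completeness is the maximum completeness over all paths
+    best = max(completeness for completeness, _ in paths)
+    # keep the copy numbers of every path that reaches this maximum
+    # (exact float comparison is fine here because `best` is one of the input values)
+    best_copy_numbers = [copies for completeness, copies in paths if completeness == best]
+    return {
+        "completeness": best,
+        "is_complete": best >= threshold,
+        # Part 6: module copy number is the maximum copy number among those paths
+        "copy_number": max(best_copy_numbers),
+    }
+
+
+# two paths that are both 90% complete (copy numbers 1 and 3), plus a less complete path
+print(summarize_module([(0.9, 1), (0.9, 3), (0.5, 7)]))
+```
+
+The less complete path has the largest copy number in this toy input, but it is ignored because only paths of highest completeness contribute to the module copy number.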
+
+{:.notice}
+We're making assumptions here again, just like we were when computing module completeness. Any of those paths (or none of them) could be the one that is used in the cell, and we don't know which one. But the idea here is that if a sample has the most copies of path X, there is probably a good reason that it has that many copies because microbial cells like to streamline their genomes whenever possible.
+
+One last note - if a module does not have any paths of highest completeness, we cannot compute the copy number. In this case, the copy number of the module will be reported as 'NA' in the output file(s).
+
+#### Part 7: Adjusting copy number
+
+This part is analogous to Part 4, in that we go back later to adjust the copy number of modules defined by other modules. We set the copy number of a module atomic step to be the previously-computed copy number of that module (if any). The tricky bit here is that some modules can have a copy number of 'NA'. When this is the case for one of our atomic steps, we make the adjusted module copy number 'NA' as well.
+
+#### Pathwise Strategy Summary
+In short: pathwise module completeness in a given sample is calculated as the maximum fraction of essential KOs (enzymes) that are annotated in the sample, where the maximum is taken over all possible sets of KOs (enzymes) from the module definition. Likewise, pathwise module copy number is calculated as the maximum copy number of any path with the module's completeness score.
+
+These values get harder to interpret when we are considering metagenomes rather than the genomes of individual organisms. There could be lots of different paths through a module used by different populations in a metagenome, but the module completeness/copy number values would summarize only the most common path(s). For situations like this, it is a good idea to take advantage of the ['module_paths' output mode](https://anvio.org/help/main/artifacts/kegg-metabolism/#module-paths-mode) to look at these scores for all individual paths through each module.
+
+### How is stepwise completeness/copy number calculated?
+
+Now we'll walk through an example of estimating stepwise completeness and copy number in one sample, again keeping in mind that the following steps are repeated for each module in each sample.
+
+#### Part 1: Top-level steps
+
+For stepwise completeness and copy number, we do our calculations at the level of top-level steps. These are the major steps in a metabolic pathway, each of which usually represents a single chemical reaction. A top-level step describes the enzyme(s) that can be used to catalyze that reaction. We can get the top-level steps of a module by splitting its DEFINITION string by its spaces (not including any spaces within parentheses).
+
+Let's use module [M00018](https://www.genome.jp/kegg-bin/show_module?M00018) as an example again. Earlier we described how M00018 is made up of five major steps, each one of which represents a single reaction in this metabolic pathway.
+
+We take the M00018 DEFINITION string:
+```
+(K00928,K12524,K12525,K12526) K00133 (K00003,K12524,K12525) (K00872,K02204,K02203) K01733
+```
+
+and split it by spaces to get the top-level steps:
+```
+(K00928,K12524,K12525,K12526)
+K00133
+(K00003,K12524,K12525)
+(K00872,K02204,K02203)
+K01733
+```
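+
+Splitting on spaces that are not inside parentheses only requires tracking the nesting depth while scanning the string. Here is a minimal sketch; the helper name `split_top_level_steps` is hypothetical and this is not the function anvi'o uses internally.
+
+```python
+def split_top_level_steps(definition):
+    """Split a module DEFINITION string on spaces that are outside all parentheses."""
+    steps, depth, current = [], 0, ""
+    for ch in definition:
+        if ch == "(":
+            depth += 1
+        elif ch == ")":
+            depth -= 1
+        if ch == " " and depth == 0:
+            # a space at depth 0 ends the current top-level step
+            if current:
+                steps.append(current)
+            current = ""
+        else:
+            current += ch
+    if current:
+        steps.append(current)
+    return steps
+
+
+m00018 = "(K00928,K12524,K12525,K12526) K00133 (K00003,K12524,K12525) (K00872,K02204,K02203) K01733"
+for step in split_top_level_steps(m00018):
+    print(step)
+```
+
+Spaces inside parentheses are left alone, so a step that contains sequential enzymes within parentheses stays in one piece.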
+
+This is far more straightforward than unrolling the module into all possible paths. For the stepwise metrics, we will focus only on these major steps, and what's more, we will ignore a lot of the nuance that comes from alternative enzymes within a top-level step.
+
+#### Part 2: Step completeness
+
+Unlike pathwise completeness, where we consider all possible alternatives and compute a fractional completeness for each path, a top-level step can only be entirely complete (1) or entirely incomplete (0). In other words, step completeness is binary. We don't care _how_ the step is complete. It doesn't matter which of the enzymes in a step are used to make it complete.
+
+To compute this binary completeness for each top-level step, we convert the step into a Boolean expression by following this set of rules:
+- enzyme accessions (ie, KOs) are replaced with 'True' if the enzyme is annotated in the sample, and otherwise are replaced with 'False'.
+- `--` steps do not have associated enzyme profiles, so we cannot say whether these steps are complete. These are always 'False'.
+- commas represent alternative enzymes, meaning you can use either one or the other. We convert commas into OR relationships.
+- spaces represent sequential enzymes, meaning that you need both (one after the other). We convert spaces into AND relationships.
+- plus signs ('+') represent essential enzyme components, meaning that you need both (at the same time). We convert plus signs into AND relationships.
+- minus signs ('-') represent nonessential enzyme components, meaning that you don't need them. We ignore these.
+- parentheses are kept where they are to maintain proper order of operations.
+
+After this conversion is done, we can simply evaluate the Boolean expression to determine whether or not the step is complete.
+
+Let's use the step `(K00928,K12524,K12525,K12526)` from M00018 as an example. Suppose that both K12524 and K12526 are annotated in the sample. Then this step would be converted into the following Boolean expression:
+```
+(False OR True OR False OR True)
+```
+which evaluates to True, meaning that this step is complete (1).
+
+Here's a more complicated example from module [M00849](https://www.genome.jp/module/M00849+R02082):
+```
+((K00869 (K17942,(K25517+K09128 K25518+K03186))),(K18689 K18690 K22813))
+```
+Suppose the following enzymes are annotated in the sample: K00869, K25517, K09128, K25518, K18689, and K22813. Then this step would become the following Boolean expression:
+```
+((True AND (False OR (True AND True AND True AND False))) OR (True AND False AND True))
+```
+Since this is a bit more complicated, we must evaluate it by following order of operations:
+```
+=> ((True AND (False OR (False))) OR (True AND False AND True))
+=> ((True AND (False)) OR (True AND False AND True))
+=> (False) OR (True AND False AND True)
+=> (False) OR (False)
+=> False
+```
+Since it ultimately evaluates to False, this step is incomplete (0).
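+
+The rules above can also be applied recursively without building the intermediate Boolean string. The following is a simplified sketch, not the implementation in anvi'o. The function names are invented for this example, and it assumes that outside of parentheses commas bind most loosely, followed by spaces, then plus and minus signs.
+
+```python
+def split_outside_parens(text, delimiters):
+    """Split text at delimiters outside parentheses; return (leading_delimiter, piece) pairs."""
+    pieces, depth, start, lead = [], 0, 0, ""
+    for i, ch in enumerate(text):
+        if ch == "(":
+            depth += 1
+        elif ch == ")":
+            depth -= 1
+        elif depth == 0 and ch in delimiters:
+            pieces.append((lead, text[start:i]))
+            lead, start = ch, i + 1
+    pieces.append((lead, text[start:]))
+    return pieces
+
+
+def wrapped_in_parens(text):
+    """True if the parenthesis opened at position 0 is closed by the last character."""
+    if not text.startswith("("):
+        return False
+    depth = 0
+    for i, ch in enumerate(text):
+        if ch == "(":
+            depth += 1
+        elif ch == ")":
+            depth -= 1
+            if depth == 0:
+                return i == len(text) - 1
+    return False
+
+
+def step_is_complete(step, annotated):
+    """Evaluate one top-level step against the set of annotated enzyme accessions."""
+    step = step.strip()
+    if step == "--":  # no enzyme profile, so always 'False'
+        return False
+    # assumed precedence: commas (OR), then spaces (AND), then +/- components (AND)
+    for delimiters, combine in ((",", any), (" ", all), ("+-", all)):
+        pieces = split_outside_parens(step, delimiters)
+        if len(pieces) > 1:
+            # pieces that follow a minus sign are nonessential and are ignored
+            return combine(step_is_complete(piece, annotated)
+                           for lead, piece in pieces if lead != "-")
+    if wrapped_in_parens(step):
+        return step_is_complete(step[1:-1], annotated)
+    return step in annotated
+
+
+annotated = {"K00869", "K25517", "K09128", "K25518", "K18689", "K22813"}
+step = "((K00869 (K17942,(K25517+K09128 K25518+K03186))),(K18689 K18690 K22813))"
+print(step_is_complete(step, annotated))
+```
+
+For the M00849 step and the annotations listed above, this prints `False`, matching the manual evaluation.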
+
+Note: if a top-level step includes entire modules in its definition, we skip evaluating its completeness for now.
+
+#### Part 3: Module completeness
+
+Once we've evaluated the binary completeness of each top-level step in a module, we calculate the stepwise completeness of the module by simply taking the percentage of complete top-level steps. So if, for instance, our five top-level steps in M00018 were evaluated like this:
+
+```
+(K00928,K12524,K12525,K12526) => complete (1)
+K00133 => complete (1)
+(K00003,K12524,K12525) => not complete (0)
+(K00872,K02204,K02203) => complete (1)
+K01733 => complete (1)
+```
+Then the overall stepwise completeness of M00018 would be 4/5, or 80%. If we were using the default module completeness threshold of 0.75, then this module would be considered 'complete' overall based on its stepwise score.
+
+Note: if any of the module's top-level steps are defined by other modules, we skip computing its completeness for now because we don't know the completeness of these steps yet. These will be adjusted in the next section.
+
+#### Part 4: Adjusting completeness
+
+Just like in pathwise completeness, we need to deal with the case when a module is defined by another module. In Part 3, we skipped any modules that have top-level steps defined by other modules. Now, we go back and re-compute the completeness of any of these steps using the Boolean expression from Part 2 (ie, by replacing module accessions with 'True' if their stepwise completeness is above the module completeness threshold, or 'False' otherwise). Then we repeat Part 3 on the affected modules to calculate their overall stepwise completeness.
+
+#### Part 5: Step copy number
+
+Now we need to calculate the copy number of each top-level step. We can do this by converting the step into an _arithmetic expression_ this time, by following a new set of rules:
+- enzyme accessions (ie, KOs) are replaced with the number of annotations this accession has in the given sample.
+- `--` steps are unknown, so we replace these with a count of '0'.
+- commas represent alternative enzymes, meaning you can use either one or the other. We convert commas into addition operations.
+- spaces represent sequential enzymes, meaning that you need both (one after the other). We convert spaces into min() operations.
+- plus signs ('+') represent essential enzyme components, meaning that you need both (at the same time). We convert plus signs into min() operations.
+- minus signs ('-') represent nonessential enzyme components, meaning that you don't need them. We ignore these.
+- parentheses are kept where they are to maintain proper order of operations.
+
+By doing it this way, we take into account all possible ways to complete the step without caring about which of the enzymes are contributing.
+
+Let's go through our examples again, this time with enzyme hit counts. For `(K00928,K12524,K12525,K12526)`, suppose that K12524 was annotated once and K12526 was annotated twice in the sample. Then this step becomes the following arithmetic expression:
+```
+(0 + 1 + 0 + 2)
+=> 3
+```
+This evaluates to a step copy number of 3. K12524 and K12526 can catalyze the same reaction, so their hit counts are combined when we compute the copy number.
+
+Now our complex example, `((K00869 (K17942,(K25517+K09128 K25518+K03186))),(K18689 K18690 K22813))`. Suppose that K00869 was annotated twice, K25517/K09128/K25518 were each annotated once, K03186 was annotated twice, and K18689 was annotated once. The conversion is a little bit more complicated here because we now need min() operations, so let's go through it step-by-step. We do this by following the order of operations - so the innermost set of parentheses are converted first.
+
+First, for the sequential enzyme complexes in `(K25517+K09128 K25518+K03186)`, we require all four enzyme components to be present. So the minimum number of annotations of any of these four components determines the copy number of this part of the step. In other words, this combo is only as strong as its weakest link: `min(K25517,K09128,K25518,K03186)` becomes `min(1,1,1,2)` when we replace the enzyme accessions with their hit counts. We might have 2 copies of component K03186, but because we are limited by the copy numbers of the other three components, we only have 1 copy of this sub-step overall.
+
+The comma in `(K17942,(K25517+K09128 K25518+K03186))` indicates that we can use either `K17942` OR the sequential complexes `(K25517+K09128 K25518+K03186)`. We've already converted the latter set of enzymes, so all that is left is to add the hit count of enzyme K17942 (which happens to be annotated 0 times): `(0 + min(1,1,1,2))`.
+
+To finish up the left side of the step definition, we have an AND relationship between K00869 and the other enzymes in `(K00869 (K17942,(K25517+K09128 K25518+K03186)))`. Since we need both, we take the minimum of K00869's annotation count and of the expression we already converted: `min(2,(0 + min(1,1,1,2)))`.
+
+Moving on to the second half of the step definition, the sequential enzymes in `(K18689 K18690 K22813)` become `min(1,0,0)`.
+
+Finally, we put everything all together, using addition since there is a comma (OR relationship) between the two halves of the definition:
+```
+(min(2,(0 + min(1,1,1,2))) + min(1,0,0))
+```
+
+Here is the evaluation of the resulting arithmetic expression:
+```
+=> (min(2,(0 + 1)) + min(1,0,0))
+=> (min(2,1) + min(1,0,0))
+=> (1 + min(1,0,0))
+=> (1 + 0)
+=> 1
+```
+Ultimately, this step has a copy number of 1. This happened because one route through the first half of the step definition (K00869 followed by the K25517+K09128 and K25518+K03186 complexes) had at least one copy of every enzyme it requires (though you wouldn't be able to figure this out just by looking at the step copy number).
+
+Fun fact: this conversion from definition string to arithmetic expression is quite complex for a computer to do, and in the code for this program, it is implemented as a recursive function.
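+
+To give a feel for what such a recursive function can look like, here is a simplified sketch that evaluates the copy number directly instead of building the arithmetic expression first. It is not the code used by this program. The function names are invented for this example, and it assumes that outside of parentheses commas bind most loosely, followed by spaces, then plus and minus signs.
+
+```python
+def split_outside_parens(text, delimiters):
+    """Split text at delimiters outside parentheses; return (leading_delimiter, piece) pairs."""
+    pieces, depth, start, lead = [], 0, 0, ""
+    for i, ch in enumerate(text):
+        if ch == "(":
+            depth += 1
+        elif ch == ")":
+            depth -= 1
+        elif depth == 0 and ch in delimiters:
+            pieces.append((lead, text[start:i]))
+            lead, start = ch, i + 1
+    pieces.append((lead, text[start:]))
+    return pieces
+
+
+def wrapped_in_parens(text):
+    """True if the parenthesis opened at position 0 is closed by the last character."""
+    if not text.startswith("("):
+        return False
+    depth = 0
+    for i, ch in enumerate(text):
+        if ch == "(":
+            depth += 1
+        elif ch == ")":
+            depth -= 1
+            if depth == 0:
+                return i == len(text) - 1
+    return False
+
+
+def step_copy_number(step, counts):
+    """Compute the copy number of one top-level step from per-enzyme annotation counts."""
+    step = step.strip()
+    if step == "--":  # unknown step, so a count of 0
+        return 0
+    # assumed precedence: commas (addition), then spaces (min), then +/- components (min)
+    for delimiters, combine in ((",", sum), (" ", min), ("+-", min)):
+        pieces = split_outside_parens(step, delimiters)
+        if len(pieces) > 1:
+            # pieces that follow a minus sign are nonessential and are ignored
+            return combine(step_copy_number(piece, counts)
+                           for lead, piece in pieces if lead != "-")
+    if wrapped_in_parens(step):
+        return step_copy_number(step[1:-1], counts)
+    return counts.get(step, 0)  # enzymes without annotations count as 0
+
+
+counts = {"K00869": 2, "K25517": 1, "K09128": 1, "K25518": 1, "K03186": 2, "K18689": 1}
+step = "((K00869 (K17942,(K25517+K09128 K25518+K03186))),(K18689 K18690 K22813))"
+print(step_copy_number(step, counts))
+print(step_copy_number("(K00928,K12524,K12525,K12526)", {"K12524": 1, "K12526": 2}))
+```
+
+With the hit counts from the two examples above, this prints 1 for the M00849 step and 3 for the M00018 step.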
+
+#### Part 6: Module copy number
+
+Every top-level step in the module is connected by an AND relationship - you need all of the steps in order to have the module complete. For this reason, we compute the module stepwise copy number by taking the minimum copy number of all top-level steps. So if we had the following copy numbers for each top-level step in M00018:
+
+```
+(K00928,K12524,K12525,K12526) => 1 copy
+K00133 => 2 copies
+(K00003,K12524,K12525) => 0 copies
+(K00872,K02204,K02203) => 1 copy
+K01733 => 1 copy
+```
+Then the overall stepwise copy number of M00018 would be 0 (because the third step has 0 copies).
+
+To make stepwise copy number easier to interpret, the output files of this program will include the individual step copy numbers in addition to the overall module copy number.
+
+#### Part 7: Adjusting copy number
+
+Once again, we must go back and adjust the copy number for any modules that are defined by other modules. For any top-level step whose definition includes modules, we take the arithmetic expression from Part 5 and replace those module accessions with the pre-computed stepwise copy number of the module. After evaluating the expression to get the copy number of these top-level steps, we can repeat Part 6 to get the module copy number.
+
+#### Stepwise Strategy Summary
+In short: stepwise module completeness in a given sample is calculated as the percentage of complete top-level steps. Likewise, stepwise module copy number is calculated as the minimum copy number of all top-level steps in the module definition.
+
+To help interpret these stepwise metrics for modules, it is a good idea to look at the ['module_steps' output mode](https://anvio.org/help/main/artifacts/kegg-metabolism/#module-steps-mode) to see the scores for all individual top-level steps in a module.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-estimate-metabolism.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-estimate-metabolism) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-estimate-metabolism/network.json b/help/8/programs/anvi-estimate-metabolism/network.json
new file mode 100644
index 00000000..b195bc8a
--- /dev/null
+++ b/help/8/programs/anvi-estimate-metabolism/network.json
@@ -0,0 +1,186 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "kegg-metabolism",
+ "name": "kegg-metabolism",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "user-metabolism",
+ "name": "user-metabolism",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "kegg-data",
+ "name": "kegg-data",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "kegg-functions",
+ "name": "kegg-functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-genomes",
+ "name": "external-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "internal-genomes",
+ "name": "internal-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "metagenomes",
+ "name": "metagenomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "user-modules-data",
+ "name": "user-modules-data",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "enzymes-txt",
+ "name": "enzymes-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 13,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-estimate-metabolism",
+ "name": "anvi-estimate-metabolism",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 13,
+ "target": 0
+ },
+ {
+ "source": 13,
+ "target": 1
+ },
+ {
+ "target": 13,
+ "source": 2
+ },
+ {
+ "target": 13,
+ "source": 3
+ },
+ {
+ "target": 13,
+ "source": 4
+ },
+ {
+ "target": 13,
+ "source": 5
+ },
+ {
+ "target": 13,
+ "source": 6
+ },
+ {
+ "target": 13,
+ "source": 7
+ },
+ {
+ "target": 13,
+ "source": 8
+ },
+ {
+ "target": 13,
+ "source": 9
+ },
+ {
+ "target": 13,
+ "source": 10
+ },
+ {
+ "target": 13,
+ "source": 11
+ },
+ {
+ "target": 13,
+ "source": 12
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-estimate-scg-taxonomy/index.md b/help/8/programs/anvi-estimate-scg-taxonomy/index.md
new file mode 100644
index 00000000..f478aa35
--- /dev/null
+++ b/help/8/programs/anvi-estimate-scg-taxonomy/index.md
@@ -0,0 +1,171 @@
+---
+layout: program
+title: anvi-estimate-scg-taxonomy
+excerpt: An anvi'o program. Estimates taxonomy at genome and metagenome level.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-estimate-scg-taxonomy
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Estimates taxonomy at genome and metagenome level. This program is the entry point to estimate taxonomy for a given set of contigs (i.e., all contigs in a contigs database, or contigs described in collections as bins). For this, it uses single-copy core gene sequences and the GTDB database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db) [scgs-taxonomy](../../artifacts/scgs-taxonomy) [collection](../../artifacts/collection) [bin](../../artifacts/bin) [metagenomes](../../artifacts/metagenomes)
+
+
+## Can provide
+
+
+[genome-taxonomy](../../artifacts/genome-taxonomy) [genome-taxonomy-txt](../../artifacts/genome-taxonomy-txt)
+
+
+## Usage
+
+
+This program makes **quick taxonomy estimates for genomes, metagenomes, or bins stored in your [contigs-db](/help/8/artifacts/contigs-db)** using single-copy core genes.
+
+You can run this program on an anvi'o contigs database only if you have already set up the necessary databases to assign taxonomy on your computer by running [anvi-setup-scg-taxonomy](/help/8/programs/anvi-setup-scg-taxonomy) and annotated the [contigs-db](/help/8/artifacts/contigs-db) you are working with using [anvi-run-scg-taxonomy](/help/8/programs/anvi-run-scg-taxonomy). Both steps are described in greater detail in [this document](http://merenlab.org/2019/10/08/anvio-scg-taxonomy/), which also offers a [comprehensive overview](http://merenlab.org/2019/10/08/anvio-scg-taxonomy/#estimating-taxonomy-in-the-terminal) of what [anvi-estimate-scg-taxonomy](/help/8/programs/anvi-estimate-scg-taxonomy) can do.
+
+Keep in mind that the scg-taxonomy framework currently uses single-copy core genes found in [GTDB](https://gtdb.ecogenomic.org/) genomes, thus it will not work well for low-completion, viral, or eukaryotic genomes.
+
+The same functionality of [anvi-estimate-scg-taxonomy](/help/8/programs/anvi-estimate-scg-taxonomy) is implicitly accessed through the anvi'o [interactive](/help/8/artifacts/interactive) interface when you turn on real-time taxonomy estimation for bins. So, if you've ever wondered where those estimates were coming from, now you know.
+
+So, what can this program do?
+
+### 1. Estimate the taxonomy of a single genome
+
+By default, this program will assume your [contigs-db](/help/8/artifacts/contigs-db) contains only one genome, and will use the single-copy core genes (that were associated with taxonomy when you ran [anvi-run-scg-taxonomy](/help/8/programs/anvi-run-scg-taxonomy)) to try to identify the taxonomy of your genome.
+
+When you run
+
+
+anvi-estimate-scg-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db)
+
+
+It will give you the best taxonomy hit for your genome. If you would like to see how it got there (by looking at the hits for each of the single-copy core genes), just use the `--debug` flag to see more information, as so:
+
+
+anvi-estimate-scg-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --debug
+
+
+### 2. Estimate the taxa within a metagenome
+
+When you run this program in metagenome mode, it will assume that your [contigs-db](/help/8/artifacts/contigs-db) contains multiple genomes and will try to give you an overview of the taxa within it. To do this, it will determine which single-copy core gene has the most hits in your contigs (for example `Ribosomal_S6`), and then will look at the taxonomy hits for that gene across your contigs. The output will be this list of taxonomy results.
+
+
+anvi-estimate-scg-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --metagenome-mode
+
+
+If you want to look at a specific gene (instead of the one with the most hits), you can also tell it to do that. For example, to tell it to look at Ribosomal_S9, run
+
+
+anvi-estimate-scg-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --metagenome-mode \
+ --scg-name Ribosomal_S9
+
+
+### 3. Look at relative abundance of taxa across samples
+
+If you provide a merged [profile-db](/help/8/artifacts/profile-db) or [single-profile-db](/help/8/artifacts/single-profile-db), then you'll be able to look at the relative abundance of your taxonomy hits (through a single-copy core gene) across your samples. Essentially, this adds additional columns to your output (one per sample) that describe the relative abundance of each hit in each sample.
+
+Running this will look something like this,
+
+anvi-estimate-scg-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --metagenome-mode \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ --compute-scg-coverages
+
+
+For an example output, take a look at [this page](http://merenlab.org/2019/10/08/anvio-scg-taxonomy/#contigs-db--profile-db).
+
+### 4. Estimate the taxonomy of your bins
+
+This program basically looks at each of the [bin](/help/8/artifacts/bin)s in your [collection](/help/8/artifacts/collection) as a single genome and tries to assign it taxonomy information. To do this, simply provide a collection, like this:
+
+
+anvi-estimate-scg-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -C [collection](/help/8/artifacts/collection)
+
+
+You can also look at the relative abundances across your samples at the same time, by running something like this:
+
+
+anvi-estimate-scg-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ --compute-scg-coverages
+
+
+Pro tip: you can use the output that emerges from the following command,
+
+
+anvi-estimate-scg-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ -o TAXONOMY.txt
+
+
+to display the taxonomy of your bins in the anvi'o interactive interface in **collection mode**:
+
+
+[anvi-interactive](/help/8/programs/anvi-interactive) -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ --additional-layers TAXONOMY.txt
+
+
+That simple.
+
+### 5. Look at multiple metagenomes at the same time
+
+You can even use this program to look at multiple metagenomes by providing a [metagenomes](/help/8/artifacts/metagenomes) artifact. This is useful to get an overview of what kinds of taxa might be in your metagenomes, and what kinds of taxa they share.
+
+Running this
+
+
+anvi-estimate-scg-taxonomy --metagenomes [metagenomes](/help/8/artifacts/metagenomes) \
+ --output-file-prefix EXAMPLE
+
+
+will give you an output file containing all taxonomic levels found and their coverages in each of your metagenomes.
+
+For a concrete example, check out [this page](http://merenlab.org/2019/10/08/anvio-scg-taxonomy/#many-contigs-dbs-for-many-metagenomes).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-estimate-scg-taxonomy.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Usage examples and warnings](http://merenlab.org/scg-taxonomy)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-estimate-scg-taxonomy) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-estimate-scg-taxonomy/network.json b/help/8/programs/anvi-estimate-scg-taxonomy/network.json
new file mode 100644
index 00000000..9e9897f2
--- /dev/null
+++ b/help/8/programs/anvi-estimate-scg-taxonomy/network.json
@@ -0,0 +1,121 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genome-taxonomy",
+ "name": "genome-taxonomy",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "genome-taxonomy-txt",
+ "name": "genome-taxonomy-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "scgs-taxonomy",
+ "name": "scgs-taxonomy",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "metagenomes",
+ "name": "metagenomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 8,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-estimate-scg-taxonomy",
+ "name": "anvi-estimate-scg-taxonomy",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 8,
+ "target": 0
+ },
+ {
+ "source": 8,
+ "target": 1
+ },
+ {
+ "target": 8,
+ "source": 2
+ },
+ {
+ "target": 8,
+ "source": 3
+ },
+ {
+ "target": 8,
+ "source": 4
+ },
+ {
+ "target": 8,
+ "source": 5
+ },
+ {
+ "target": 8,
+ "source": 6
+ },
+ {
+ "target": 8,
+ "source": 7
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-estimate-trna-taxonomy/index.md b/help/8/programs/anvi-estimate-trna-taxonomy/index.md
new file mode 100644
index 00000000..fdb7c095
--- /dev/null
+++ b/help/8/programs/anvi-estimate-trna-taxonomy/index.md
@@ -0,0 +1,172 @@
+---
+layout: program
+title: anvi-estimate-trna-taxonomy
+excerpt: An anvi'o program. Estimates taxonomy at genome and metagenome level using tRNA sequences.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-estimate-trna-taxonomy
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Estimates taxonomy at genome and metagenome level using tRNA sequences.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db) [trna-taxonomy](../../artifacts/trna-taxonomy) [collection](../../artifacts/collection) [bin](../../artifacts/bin) [metagenomes](../../artifacts/metagenomes) [dna-sequence](../../artifacts/dna-sequence)
+
+
+## Can provide
+
+
+[genome-taxonomy](../../artifacts/genome-taxonomy) [genome-taxonomy-txt](../../artifacts/genome-taxonomy-txt)
+
+
+## Usage
+
+
+This program **uses the taxonomy associations of your tRNA sequences to estimate the taxonomy for genomes, metagenomes, or a [collection](/help/8/artifacts/collection) stored in your [contigs-db](/help/8/artifacts/contigs-db)**.
+
+This is the final step in the trna-taxonomy workflow. Before running this program, you'll need to have run [anvi-run-trna-taxonomy](/help/8/programs/anvi-run-trna-taxonomy) on the [contigs-db](/help/8/artifacts/contigs-db) that you're inputting to this program.
+
+## Input options
+
+### 1: Running on a single genome
+
+By default, this program will assume that your [contigs-db](/help/8/artifacts/contigs-db) contains only a single genome and will determine the taxonomy of that single genome.
+
+
+anvi-estimate-trna-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db)
+
+
+This will give you only the best taxonomy hit for your genome based on your tRNA data. If you want to look under the hood and see what results from [anvi-run-trna-taxonomy](/help/8/programs/anvi-run-trna-taxonomy) it's using to get there, add the `--debug` flag.
+
+
+anvi-estimate-trna-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --debug
+
+
+### 2: Running on a metagenome
+
+In metagenome mode, this program will assume that your [contigs-db](/help/8/artifacts/contigs-db) contains multiple genomes and will try to give you an overview of the taxa within it. To do this, anvi'o will determine which anticodon has the most hits in your contigs (for example `GGG`), and then will look at the taxonomy hits for tRNAs with that anticodon across your contigs.
+
+
+anvi-estimate-trna-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --metagenome-mode
+
+
+If instead you want to look at a specific anticodon, you can specify that with the `-S` parameter. For example, to look at `GGT`, just run the following:
+
+
+anvi-estimate-trna-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --metagenome-mode \
+ -S GGT
+
+
+### 3: Running on multiple metagenomes
+
+You can use this program to look at multiple metagenomes by providing a [metagenomes](/help/8/artifacts/metagenomes) artifact. This is useful to get an overview of what kinds of taxa might be in your metagenomes, and what kinds of taxa they share.
+
+Running this
+
+
+anvi-estimate-trna-taxonomy --metagenomes [metagenomes](/help/8/artifacts/metagenomes) \
+ --output-file-prefix EXAMPLE
+
+
+will give you an output file containing all taxonomic levels found and their coverages in each of your metagenomes, based on their tRNA.
+
+### 4: Estimating the taxonomy of bins
+
+You can use this program to estimate the taxonomy of all of the [bin](/help/8/artifacts/bin)s in a [collection](/help/8/artifacts/collection) by providing the [collection](/help/8/artifacts/collection) and the associated [profile-db](/help/8/artifacts/profile-db).
+
+
+anvi-estimate-trna-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ -p [profile-db](/help/8/artifacts/profile-db)
+
+
+When doing this, you can also put the final results into your [profile-db](/help/8/artifacts/profile-db) as a [misc-data-layers](/help/8/artifacts/misc-data-layers) with the flag `--update-profile-db-with-taxonomy`.
+
+### 5: I don't even have a contigs-db. Just a fasta file.
+
+This program can run the entire ad hoc sequence search without a [contigs-db](/help/8/artifacts/contigs-db) involved (just a fasta and number of target sequences as a percent of the total; default: 20 percent), but this is not recommended. Note that if you provide other parameters, they will be ignored.
+
+
+anvi-estimate-trna-taxonomy --dna-sequence [fasta](/help/8/artifacts/fasta) \
+ --max-num-target-sequences 10
+
+
+## The Output
+
+Now that you've provided your desired inputs, you can think about whether you want an output and what it will look like. By default, this program won't give you an output (just [genome-taxonomy](/help/8/artifacts/genome-taxonomy) information in your [contigs-db](/help/8/artifacts/contigs-db)). However, if you add any of these output options, it will instead produce a [genome-taxonomy-txt](/help/8/artifacts/genome-taxonomy-txt).
+
+### Anticodon Frequencies
+
+If you want to look at the anticodon frequencies before getting taxonomy info at all (for example because you can't decide which anticodon to use for input option 2), add the flag `--report-anticodon-frequencies`. This will report the anticodon frequencies to a tab-delimited file and quit the program.
+
+### A single output
+
+To get a single output (a fancy table for your viewing pleasure), just add the output file path.
+
+In this example, the input will be a single [contigs-db](/help/8/artifacts/contigs-db) (input option 1):
+
+
+anvi-estimate-trna-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/output.txt
+
+
+This will give you a tab-delimited matrix with all levels of taxonomic information for the genome stored in your [contigs-db](/help/8/artifacts/contigs-db). Specifically, the output is a [genome-taxonomy-txt](/help/8/artifacts/genome-taxonomy-txt).
+
+If you want to focus on a single taxonomic level, use the parameter `--taxonomic-level`, like so:
+
+
+anvi-estimate-trna-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/output.txt \
+ --taxonomic-level genus
+
+
+You can also simplify the taxonomy names in the table with the flag `--simplify-taxonomy-information`.
+
+If you're running on a [profile-db](/help/8/artifacts/profile-db), you can also choose to add the anticodon coverage to the output with `--compute-anticodon-coverages`.
+
+### Multiple outputs
+
+If you have multiple outputs (i.e. you are looking at multiple metagenomes (input option number 3) or you are looking at each anticodon individually with `--per-anticodon-output-file`), you should instead provide an output filename prefix.
+
+
+anvi-estimate-trna-taxonomy --metagenomes [metagenomes](/help/8/artifacts/metagenomes) \
+ --output-file-prefix EXAMPLE
+
+
+The rest of the options listed for the single output (i.e. focusing on a taxonomic level, simplifying taxonomy information, etc.) still apply.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-estimate-trna-taxonomy.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-estimate-trna-taxonomy) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-estimate-trna-taxonomy/network.json b/help/8/programs/anvi-estimate-trna-taxonomy/network.json
new file mode 100644
index 00000000..d305e8f1
--- /dev/null
+++ b/help/8/programs/anvi-estimate-trna-taxonomy/network.json
@@ -0,0 +1,134 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genome-taxonomy",
+ "name": "genome-taxonomy",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "genome-taxonomy-txt",
+ "name": "genome-taxonomy-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "trna-taxonomy",
+ "name": "trna-taxonomy",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "metagenomes",
+ "name": "metagenomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "dna-sequence",
+ "name": "dna-sequence",
+ "provided_by_anvio": true,
+ "type": "SEQUENCE"
+ },
+ {
+ "size": 9,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-estimate-trna-taxonomy",
+ "name": "anvi-estimate-trna-taxonomy",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 9,
+ "target": 0
+ },
+ {
+ "source": 9,
+ "target": 1
+ },
+ {
+ "target": 9,
+ "source": 2
+ },
+ {
+ "target": 9,
+ "source": 3
+ },
+ {
+ "target": 9,
+ "source": 4
+ },
+ {
+ "target": 9,
+ "source": 5
+ },
+ {
+ "target": 9,
+ "source": 6
+ },
+ {
+ "target": 9,
+ "source": 7
+ },
+ {
+ "target": 9,
+ "source": 8
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-experimental-organization/index.md b/help/8/programs/anvi-experimental-organization/index.md
new file mode 100644
index 00000000..54821f85
--- /dev/null
+++ b/help/8/programs/anvi-experimental-organization/index.md
@@ -0,0 +1,62 @@
+---
+layout: program
+title: anvi-experimental-organization
+excerpt: An anvi'o program. Create an experimental clustering dendrogram.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-experimental-organization
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Create an experimental clustering dendrogram.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[clustering-configuration](../../artifacts/clustering-configuration)
+
+
+## Can provide
+
+
+[dendrogram](../../artifacts/dendrogram)
+
+
+## Usage
+
+
+This program can use an anvi'o [clustering-configuration](/help/8/artifacts/clustering-configuration) file to access various data sources in anvi'o databases to produce a hierarchical clustering dendrogram for items.
+
+It is especially powerful when the user wishes to create a hierarchical clustering of contigs or gene clusters using only a specific set of samples. If you would like to see an example usage of this program, see the article on [combining metagenomics with metatranscriptomics](https://merenlab.org/2015/06/10/combining-omics-data/).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-experimental-organization.md) to update this information.
+
+
+## Additional Resources
+
+
+* [An example use of this program](https://merenlab.org/2015/06/10/combining-omics-data/)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-experimental-organization) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-experimental-organization/network.json b/help/8/programs/anvi-experimental-organization/network.json
new file mode 100644
index 00000000..f6f5b64e
--- /dev/null
+++ b/help/8/programs/anvi-experimental-organization/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "dendrogram",
+ "name": "dendrogram",
+ "provided_by_anvio": true,
+ "type": "NEWICK"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "clustering-configuration",
+ "name": "clustering-configuration",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-experimental-organization",
+ "name": "anvi-experimental-organization",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-export-collection/index.md b/help/8/programs/anvi-export-collection/index.md
new file mode 100644
index 00000000..d55ea31c
--- /dev/null
+++ b/help/8/programs/anvi-export-collection/index.md
@@ -0,0 +1,76 @@
+---
+layout: program
+title: anvi-export-collection
+excerpt: An anvi'o program. Export a collection from an anvi'o database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-export-collection
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Export a collection from an anvi'o database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[profile-db](../../artifacts/profile-db) [collection](../../artifacts/collection)
+
+
+## Can provide
+
+
+[collection-txt](../../artifacts/collection-txt)
+
+
+## Usage
+
+
+This program, as one might think, allows you to export a [collection](/help/8/artifacts/collection). This allows you to take your binning results elsewhere (including into another anvi'o project with the command [anvi-import-collection](/help/8/programs/anvi-import-collection)).
+
+You can run this program on a [profile-db](/help/8/artifacts/profile-db) or [pan-db](/help/8/artifacts/pan-db) as follows:
+
+
+anvi-export-collection -C my_favorite_collection \
+ -p [profile-db](/help/8/artifacts/profile-db)
+
+
+This will give you a [collection-txt](/help/8/artifacts/collection-txt) file that describes the collection `my_favorite_collection`.
+
+To list the collections available in this database, you can run
+
+
+anvi-export-collection -p [pan-db](/help/8/artifacts/pan-db) \
+ --list-collections
+
+
+You can also add the flag `--include-unbinned` to have all unbinned contigs in the database show up at the end of your [collection-txt](/help/8/artifacts/collection-txt) file in a bin titled `UNBINNED`.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-export-collection.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-export-collection) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-export-collection/network.json b/help/8/programs/anvi-export-collection/network.json
new file mode 100644
index 00000000..f43931a0
--- /dev/null
+++ b/help/8/programs/anvi-export-collection/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection-txt",
+ "name": "collection-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-export-collection",
+ "name": "anvi-export-collection",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-export-contigs/index.md b/help/8/programs/anvi-export-contigs/index.md
new file mode 100644
index 00000000..0a878f46
--- /dev/null
+++ b/help/8/programs/anvi-export-contigs/index.md
@@ -0,0 +1,89 @@
+---
+layout: program
+title: anvi-export-contigs
+excerpt: An anvi'o program. Export contigs (or splits) from an anvi'o contigs database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-export-contigs
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Export contigs (or splits) from an anvi'o contigs database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+[contigs-fasta](../../artifacts/contigs-fasta)
+
+
+## Usage
+
+
+This program **exports the contig sequences from a [contigs-db](/help/8/artifacts/contigs-db)**, outputting them as a [contigs-fasta](/help/8/artifacts/contigs-fasta). It also has the ability to output the sequences of your splits instead.
+
+You can run this program as follows:
+
+
+anvi-export-contigs -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/[contigs-fasta](/help/8/artifacts/contigs-fasta)
+
+
+To run it on only a named subset of your contigs, you can provide a list of contigs as a separate file (in the same format as a [splits-txt](/help/8/artifacts/splits-txt)). For example:
+
+
+anvi-export-contigs -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/[contigs-fasta](/help/8/artifacts/contigs-fasta) \
+ --contigs-of-interest my_favorite_contigs.txt
+
+
+where `my_favorite_contigs.txt` looks like this:
+
+ contig_0001
+ contig_0005
+ contig_0035
+
+### Splits mode
+
+Want to look at your splits instead of your contigs? Just run with the flag `--splits-mode` attached.
+
+
+anvi-export-contigs -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/[contigs-fasta](/help/8/artifacts/contigs-fasta) \
+ --splits-mode
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-export-contigs.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-export-contigs) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-export-contigs/network.json b/help/8/programs/anvi-export-contigs/network.json
new file mode 100644
index 00000000..a869e59e
--- /dev/null
+++ b/help/8/programs/anvi-export-contigs/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-fasta",
+ "name": "contigs-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-export-contigs",
+ "name": "anvi-export-contigs",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-export-functions/index.md b/help/8/programs/anvi-export-functions/index.md
new file mode 100644
index 00000000..f80a859c
--- /dev/null
+++ b/help/8/programs/anvi-export-functions/index.md
@@ -0,0 +1,72 @@
+---
+layout: program
+title: anvi-export-functions
+excerpt: An anvi'o program. Export functions of genes from an anvi'o contigs database for a given annotation source.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-export-functions
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Export functions of genes from an anvi'o contigs database for a given annotation source.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [functions](../../artifacts/functions)
+
+
+## Can provide
+
+
+[functions-txt](../../artifacts/functions-txt)
+
+
+## Usage
+
+
+This program **takes in a [functions](/help/8/artifacts/functions) artifact to create a [functions-txt](/help/8/artifacts/functions-txt).** Basically, if you want to take the information in your [functions](/help/8/artifacts/functions) artifact out of anvi'o or give it to a fellow anvi'o user (for them to [import it](http://merenlab.org/software/anvio/help/programs/anvi-import-functions/) into their own project), you get that information using this command.
+
+Simply provide the [contigs-db](/help/8/artifacts/contigs-db) that has been annotated with [functions](/help/8/artifacts/functions):
+
+
+anvi-export-functions -c [contigs-db](/help/8/artifacts/contigs-db)
+
+
+You can also get annotations for only a specific list of sources. For example:
+
+
+anvi-export-functions -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --annotation-sources source_1,source_2,source_3
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-export-functions.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-export-functions) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-export-functions/network.json b/help/8/programs/anvi-export-functions/network.json
new file mode 100644
index 00000000..fe1c9e36
--- /dev/null
+++ b/help/8/programs/anvi-export-functions/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions-txt",
+ "name": "functions-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions",
+ "name": "functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-export-functions",
+ "name": "anvi-export-functions",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-export-gene-calls/index.md b/help/8/programs/anvi-export-gene-calls/index.md
new file mode 100644
index 00000000..fb9d6550
--- /dev/null
+++ b/help/8/programs/anvi-export-gene-calls/index.md
@@ -0,0 +1,117 @@
+---
+layout: program
+title: anvi-export-gene-calls
+excerpt: An anvi'o program. Export gene calls from an anvi'o contigs database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-export-gene-calls
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Export gene calls from an anvi'o contigs database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+[gene-calls-txt](../../artifacts/gene-calls-txt)
+
+
+## Usage
+
+
+The purpose of this program is to export the gene calls in a given [contigs-db](/help/8/artifacts/contigs-db) for a given gene caller, in the form of a [gene-calls-txt](/help/8/artifacts/gene-calls-txt).
+
+To see the gene callers available in your contigs database, you can use [anvi-db-info](/help/8/programs/anvi-db-info) or use this program with the following flag:
+
+
+anvi-export-gene-calls -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --list-gene-callers
+
+
+Running this will export all of your gene calls identified by the gene caller [Prodigal](https://github.com/hyattpd/Prodigal) (assuming it is in your [contigs-db](/help/8/artifacts/contigs-db)):
+
+
+anvi-export-gene-calls -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --gene-caller Prodigal \
+ -o [gene-calls-txt](/help/8/artifacts/gene-calls-txt)
+
+
+{:.notice}
+You can export genes from more gene callers by providing a comma-separated list of gene caller names.
+
+If you don't want to display the amino acid sequences of each gene (they can crowd the file very quickly if you don't want to see them), you can add the following flag:
+
+
+anvi-export-gene-calls -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --gene-caller Prodigal \
+ --skip-sequence-reporting \
+ -o [gene-calls-txt](/help/8/artifacts/gene-calls-txt)
+
+
+## Advanced uses
+
+This program can take a lot of time and memory when working with very large [contigs-db](/help/8/artifacts/contigs-db) files (such as those that are more than 10 GB in file size or contain more than 10 million contigs).
+
+In that case, you can export your gene calls in the following way within minutes and with a small memory footprint.
+
+First open your [contigs-db](/help/8/artifacts/contigs-db):
+
+
+sqlite3 [contigs-db](/help/8/artifacts/contigs-db)
+
+
+Then run these lines,
+
+
+.mode csv
+.headers on
+.out [gene-calls-txt](/help/8/artifacts/gene-calls-txt)
+select gene_callers_id, contig, start, stop, direction, partial from genes_in_contigs;
+
+
+You can also continue with these lines to get the amino acid sequences for them:
+
+
+.mode csv
+.headers on
+.out AMINO-ACID-SEQUENCES.txt
+select * from gene_amino_acid_sequences;
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-export-gene-calls.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-export-gene-calls) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-export-gene-calls/network.json b/help/8/programs/anvi-export-gene-calls/network.json
new file mode 100644
index 00000000..de14431a
--- /dev/null
+++ b/help/8/programs/anvi-export-gene-calls/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "gene-calls-txt",
+ "name": "gene-calls-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-export-gene-calls",
+ "name": "anvi-export-gene-calls",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-export-gene-coverage-and-detection/index.md b/help/8/programs/anvi-export-gene-coverage-and-detection/index.md
new file mode 100644
index 00000000..3ae0530d
--- /dev/null
+++ b/help/8/programs/anvi-export-gene-coverage-and-detection/index.md
@@ -0,0 +1,66 @@
+---
+layout: program
+title: anvi-export-gene-coverage-and-detection
+excerpt: An anvi'o program. Export gene coverage and detection data for all genes associated with contigs described in a profile database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-export-gene-coverage-and-detection
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Export gene coverage and detection data for all genes associated with contigs described in a profile database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+[coverages-txt](../../artifacts/coverages-txt) [detection-txt](../../artifacts/detection-txt)
+
+
+## Usage
+
+
+This program gives you the **coverage and detection data** for all of the genes found in your [contigs-db](/help/8/artifacts/contigs-db), using the short reads data in your [profile-db](/help/8/artifacts/profile-db).
+
+
+anvi-export-gene-coverage-and-detection -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -O MY_DATA
+
+
+This will give you a [coverages-txt](/help/8/artifacts/coverages-txt) and a [detection-txt](/help/8/artifacts/detection-txt) whose file names will begin with `MY_DATA`.
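Once the two tables exist, they can be read with a few lines of Python. The sketch below is a minimal illustration, not part of anvi'o: it assumes each table is tab-delimited, with the gene identifier in the first column and one column per sample. The helper names, the sample names, and the 0.5 detection cutoff are all hypothetical.

```python
import csv
import io


def read_gene_matrix(handle):
    """Read a tab-delimited gene-by-sample matrix into {gene_id: {sample: value}}."""
    reader = csv.DictReader(handle, delimiter="\t")
    id_column = reader.fieldnames[0]  # assumption: first column holds the gene identifier
    matrix = {}
    for row in reader:
        gene_id = row.pop(id_column)
        matrix[gene_id] = {sample: float(value) for sample, value in row.items()}
    return matrix


def genes_detected(detection, sample, min_detection=0.5):
    """Return gene IDs whose detection in `sample` is at least `min_detection`."""
    return sorted(g for g, values in detection.items() if values[sample] >= min_detection)


# Hypothetical content standing in for a detection table.
example = io.StringIO("key\tsample_01\tsample_02\n1\t1.0\t0.25\n2\t0.4\t0.9\n")
detection = read_gene_matrix(example)
print(genes_detected(detection, "sample_01"))
```

To use it on real output, replace the `io.StringIO` object with an open file handle for the detection or coverage table.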
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-export-gene-coverage-and-detection.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-export-gene-coverage-and-detection) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-export-gene-coverage-and-detection/network.json b/help/8/programs/anvi-export-gene-coverage-and-detection/network.json
new file mode 100644
index 00000000..e39cb6d9
--- /dev/null
+++ b/help/8/programs/anvi-export-gene-coverage-and-detection/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "coverages-txt",
+ "name": "coverages-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "detection-txt",
+ "name": "detection-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-export-gene-coverage-and-detection",
+ "name": "anvi-export-gene-coverage-and-detection",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "source": 4,
+ "target": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-export-items-order/index.md b/help/8/programs/anvi-export-items-order/index.md
new file mode 100644
index 00000000..526b4270
--- /dev/null
+++ b/help/8/programs/anvi-export-items-order/index.md
@@ -0,0 +1,75 @@
+---
+layout: program
+title: anvi-export-items-order
+excerpt: An anvi'o program. Export an item order from an anvi'o database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-export-items-order
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Export an item order from an anvi'o database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[pan-db](../../artifacts/pan-db) [profile-db](../../artifacts/profile-db)
+
+
+## Can provide
+
+
+[misc-data-items-order-txt](../../artifacts/misc-data-items-order-txt) [dendrogram](../../artifacts/dendrogram) [phylogeny](../../artifacts/phylogeny)
+
+
+## Usage
+
+
+This program, as one might think, allows you to export a [misc-data-items-order](/help/8/artifacts/misc-data-items-order), outputting a [misc-data-items-order-txt](/help/8/artifacts/misc-data-items-order-txt).
+
+You can export one of the item orders in a [profile-db](/help/8/artifacts/profile-db) or [pan-db](/help/8/artifacts/pan-db) as follows:
+
+
+anvi-export-items-order -p [profile-db](/help/8/artifacts/profile-db) \
+ --name cov
+
+
+The `cov` here refers to the tree that is generated using only differential coverage. Almost all anvi'o profile databases will also have available an items-order based on the tetranucleotide frequency called `tnf`, and one based on both called `tnf-cov`.
+
+However, to list the item orders available in this database, just don't include the name flag.
+
+
+anvi-export-items-order -p [pan-db](/help/8/artifacts/pan-db)
+
+
+You'll get a `Config Error` that will tell you what item orders are available.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-export-items-order.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-export-items-order) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-export-items-order/network.json b/help/8/programs/anvi-export-items-order/network.json
new file mode 100644
index 00000000..ce09f14b
--- /dev/null
+++ b/help/8/programs/anvi-export-items-order/network.json
@@ -0,0 +1,82 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-items-order-txt",
+ "name": "misc-data-items-order-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "dendrogram",
+ "name": "dendrogram",
+ "provided_by_anvio": true,
+ "type": "NEWICK"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "phylogeny",
+ "name": "phylogeny",
+ "provided_by_anvio": true,
+ "type": "NEWICK"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-export-items-order",
+ "name": "anvi-export-items-order",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 5,
+ "target": 0
+ },
+ {
+ "source": 5,
+ "target": 1
+ },
+ {
+ "source": 5,
+ "target": 2
+ },
+ {
+ "target": 5,
+ "source": 3
+ },
+ {
+ "target": 5,
+ "source": 4
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-export-locus/index.md b/help/8/programs/anvi-export-locus/index.md
new file mode 100644
index 00000000..761eaf6b
--- /dev/null
+++ b/help/8/programs/anvi-export-locus/index.md
@@ -0,0 +1,124 @@
+---
+layout: program
+title: anvi-export-locus
+excerpt: An anvi'o program. This program helps you cut a 'locus' from a larger genetic context (e.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-export-locus
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+This program helps you cut a 'locus' from a larger genetic context (e.g., contigs, genomes). By default, anvi'o will locate a user-defined anchor gene, extend its selection upstream and downstream based on the --num-genes argument, then extract the locus to create a new contigs database. The anchor gene must be provided as --search-term, --gene-caller-ids, or --hmm-sources. If --flank-mode is designated, you MUST provide TWO flanking genes that define the locus region (please see the --flank-mode help for more information). If everything goes as planned, anvi'o will give you individual locus contigs databases for every matching anchor gene found in the original contigs database provided. Enjoy your mini contigs databases!
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+[locus-fasta](../../artifacts/locus-fasta)
+
+
+## Usage
+
+
+This program lets you export selections of your [contigs-db](/help/8/artifacts/contigs-db) around all occurrences of a user-defined anchor gene.
+
+The output of this is a folder that contains a separate [contigs-db](/help/8/artifacts/contigs-db) for the region around each hit of the anchor gene. (In fact, you'll get a FASTA file, [contigs-db](/help/8/artifacts/contigs-db), [profile-db](/help/8/artifacts/profile-db), and a copy of the runlog).
+
+For example, you could specify the recognition site for a specific enzyme and use this program to pull out all potential sites where that enzyme could bind.
+
+### Required Parameters
+
+You'll need to provide a [contigs-db](/help/8/artifacts/contigs-db) (of course), as well as the name of the output directory and a prefix to use when naming all of the output databases.
+
+You can define the region of interest either by defining the two flanking genes or by searching for an anchor gene and defining the number of genes around it that you want to include. For example, if you set `--num-genes` to 1, then each locus will contain the gene of interest, one gene upstream of it, and one gene downstream of it, for a total of three genes.
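The window arithmetic described above can be sketched in a few lines. This is only an illustration of the symmetric case, not anvi'o's implementation: the function name is hypothetical, genes are addressed by their position on the contig, and clipping the window at the ends of the contig is an assumption made for this example.

```python
def locus_gene_range(anchor_index, num_genes, genes_in_contig):
    """Return (first, last) gene positions, inclusive, for a symmetric window around the anchor."""
    if not 0 <= anchor_index < genes_in_contig:
        raise ValueError("anchor gene is not on this contig")
    # Assumption: the window is clipped where the contig ends.
    first = max(0, anchor_index - num_genes)
    last = min(genes_in_contig - 1, anchor_index + num_genes)
    return first, last


first, last = locus_gene_range(anchor_index=10, num_genes=1, genes_in_contig=50)
print(last - first + 1)  # 3 genes: the anchor, one upstream, one downstream
```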
+
+### Defining the region of interest
+
+There are four ways to define the region of interest. The first three identify an anchor gene, and the fourth uses two flanking genes:
+
+1. Provide a search term in the functional annotations of all of your genes. (If you're trying to find a gene with a vague function, you might want to use [anvi-search-functions](/help/8/programs/anvi-search-functions) to find out which genes will show up first. Alternatively, you can use [anvi-export-functions](/help/8/programs/anvi-export-functions) to look at a full list of the functional annotations in this database).
+
+
+ anvi-export-locus -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --num-genes 2 \
+ -o GLYCO_DIRECTORY \
+ -O Glyco \
+ --search-term "Glycosyltransferase involved in cell wall bisynthesis"
+
+
+ You also have the option to specify an annotation source with the flag `--annotation-source`.
+
+2. Provide a specific gene caller ID.
+
+
+ anvi-export-locus -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --num-genes 2 \
+ -o output_directory \
+ -O GENE_1 \
+ --gene-caller-ids 1
+
+
+3. Provide a search term for the HMM source annotations. To do this, you must also specify an hmm-source. (You can use the flag `--list-hmm-sources` to list the available sources).
+
+
+ anvi-export-locus -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --num-genes 2 \
+ -o Ribosomal_S20p \
+ -O Ribosomal_S20p \
+ --use-hmm \
+ --hmm-source Bacteria_71 \
+ --search-term Ribosomal_S20p
+
+
+4. Run in `--flank-mode` and provide two flanking genes that define the locus region.
+
+
+ anvi-export-locus -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --flank-mode \
+ -o locus_output \
+ -O glyco_to_acyl \
+ --search-term "Glycosyltransferase involved in cell wall bisynthesis","Acyl carrier protein"
+
+
+### Additional Options
+
+You can also remove partial hits, ignore reverse complement hits, or overwrite all files in a pre-existing output.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-export-locus.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-export-locus) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-export-locus/network.json b/help/8/programs/anvi-export-locus/network.json
new file mode 100644
index 00000000..926f40f3
--- /dev/null
+++ b/help/8/programs/anvi-export-locus/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "locus-fasta",
+ "name": "locus-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-export-locus",
+ "name": "anvi-export-locus",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-export-misc-data/index.md b/help/8/programs/anvi-export-misc-data/index.md
new file mode 100644
index 00000000..42419960
--- /dev/null
+++ b/help/8/programs/anvi-export-misc-data/index.md
@@ -0,0 +1,111 @@
+---
+layout: program
+title: anvi-export-misc-data
+excerpt: An anvi'o program. Export additional data or order tables in pan or profile databases for items or layers.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-export-misc-data
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Export additional data or order tables in pan or profile databases for items or layers.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[pan-db](../../artifacts/pan-db) [profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db) [misc-data-items](../../artifacts/misc-data-items) [misc-data-layers](../../artifacts/misc-data-layers) [misc-data-layer-orders](../../artifacts/misc-data-layer-orders) [misc-data-nucleotides](../../artifacts/misc-data-nucleotides) [misc-data-amino-acids](../../artifacts/misc-data-amino-acids)
+
+
+## Can provide
+
+
+[misc-data-items-txt](../../artifacts/misc-data-items-txt) [misc-data-layers-txt](../../artifacts/misc-data-layers-txt) [misc-data-layer-orders-txt](../../artifacts/misc-data-layer-orders-txt) [misc-data-nucleotides-txt](../../artifacts/misc-data-nucleotides-txt) [misc-data-amino-acids-txt](../../artifacts/misc-data-amino-acids-txt)
+
+
+## Usage
+
+
+This program lets you export miscellaneous data of your choosing into a text file, which can be imported into another anvi'o project using [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data). You can export the same types of data that you can import with that program. These are also listed below.
+
+To see what misc-data is available in your database, use [anvi-show-misc-data](/help/8/programs/anvi-show-misc-data).
+
+If your misc-data is associated with a specific data group, you can provide that data group to this program with the `-D` flag.
+
+## Data types you can export
+
+### From a pan-db or profile-db: items, layers, layer orders
+
+**From a [pan-db](/help/8/artifacts/pan-db) or [profile-db](/help/8/artifacts/profile-db), you can export**
+
+- items data ([misc-data-items](/help/8/artifacts/misc-data-items)) into a [misc-data-items-txt](/help/8/artifacts/misc-data-items-txt).
+
+
+anvi-export-misc-data -p [profile-db](/help/8/artifacts/profile-db) \
+ --target-data-table items
+
+
+- layers data ([misc-data-layers](/help/8/artifacts/misc-data-layers)) into a [misc-data-layers-txt](/help/8/artifacts/misc-data-layers-txt).
+
+
+anvi-export-misc-data -p [pan-db](/help/8/artifacts/pan-db) \
+ --target-data-table layers
+
+
+- layer orders data ([misc-data-layer-orders](/help/8/artifacts/misc-data-layer-orders)) into a [misc-data-layer-orders-txt](/help/8/artifacts/misc-data-layer-orders-txt).
+
+
+anvi-export-misc-data -p [profile-db](/help/8/artifacts/profile-db) \
+ --target-data-table layer_orders
+
+
+### From a contigs-db: nucleotide and amino acid information
+
+**From a [contigs-db](/help/8/artifacts/contigs-db), you can export**
+
+- nucleotide data ([misc-data-nucleotides](/help/8/artifacts/misc-data-nucleotides)) into a [misc-data-nucleotides-txt](/help/8/artifacts/misc-data-nucleotides-txt).
+
+
+anvi-export-misc-data -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --target-data-table nucleotides
+
+
+- amino acid data ([misc-data-amino-acids](/help/8/artifacts/misc-data-amino-acids)) into a [misc-data-amino-acids-txt](/help/8/artifacts/misc-data-amino-acids-txt).
+
+
+anvi-export-misc-data -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --target-data-table amino_acids
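The pairing between each `--target-data-table` value and the database flag used in the examples above can be captured in a small lookup. The following is a hedged sketch for scripting several exports at once. The function and dictionary names are hypothetical, the database file name is a placeholder, and only the flags shown on this page are used.

```python
# Which database flag each `--target-data-table` value goes with, per the examples above.
DB_FLAG_FOR_TABLE = {
    "items": "-p",
    "layers": "-p",
    "layer_orders": "-p",
    "nucleotides": "-c",
    "amino_acids": "-c",
}


def export_misc_data_command(db_path, target_table):
    """Build the argument list for one anvi-export-misc-data call."""
    if target_table not in DB_FLAG_FOR_TABLE:
        raise ValueError(f"unknown target data table: {target_table}")
    return [
        "anvi-export-misc-data",
        DB_FLAG_FOR_TABLE[target_table], db_path,
        "--target-data-table", target_table,
    ]


print(" ".join(export_misc_data_command("CONTIGS.db", "nucleotides")))
```

The returned list can be passed to `subprocess.run` as is. Building a list rather than a single string avoids shell quoting problems with unusual file paths.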
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-export-misc-data.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Working with anvi'o additional data tables](http://merenlab.org/2017/12/11/additional-data-tables/#views-items-layers-orders-some-anvio-terminology)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-export-misc-data) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-export-misc-data/network.json b/help/8/programs/anvi-export-misc-data/network.json
new file mode 100644
index 00000000..441bfd28
--- /dev/null
+++ b/help/8/programs/anvi-export-misc-data/network.json
@@ -0,0 +1,186 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-items-txt",
+ "name": "misc-data-items-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-layers-txt",
+ "name": "misc-data-layers-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-layer-orders-txt",
+ "name": "misc-data-layer-orders-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-nucleotides-txt",
+ "name": "misc-data-nucleotides-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-amino-acids-txt",
+ "name": "misc-data-amino-acids-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-items",
+ "name": "misc-data-items",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-layers",
+ "name": "misc-data-layers",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-layer-orders",
+ "name": "misc-data-layer-orders",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-nucleotides",
+ "name": "misc-data-nucleotides",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-amino-acids",
+ "name": "misc-data-amino-acids",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 13,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-export-misc-data",
+ "name": "anvi-export-misc-data",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 13,
+ "target": 0
+ },
+ {
+ "source": 13,
+ "target": 1
+ },
+ {
+ "source": 13,
+ "target": 2
+ },
+ {
+ "source": 13,
+ "target": 3
+ },
+ {
+ "source": 13,
+ "target": 4
+ },
+ {
+ "target": 13,
+ "source": 5
+ },
+ {
+ "target": 13,
+ "source": 6
+ },
+ {
+ "target": 13,
+ "source": 7
+ },
+ {
+ "target": 13,
+ "source": 8
+ },
+ {
+ "target": 13,
+ "source": 9
+ },
+ {
+ "target": 13,
+ "source": 10
+ },
+ {
+ "target": 13,
+ "source": 11
+ },
+ {
+ "target": 13,
+ "source": 12
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-export-splits-and-coverages/index.md b/help/8/programs/anvi-export-splits-and-coverages/index.md
new file mode 100644
index 00000000..3ec3ed66
--- /dev/null
+++ b/help/8/programs/anvi-export-splits-and-coverages/index.md
@@ -0,0 +1,79 @@
+---
+layout: program
+title: anvi-export-splits-and-coverages
+excerpt: An anvi'o program. Export split or contig sequences and coverages across samples stored in an anvi'o profile database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-export-splits-and-coverages
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Export split or contig sequences and coverages across samples stored in an anvi'o profile database. This program is especially useful if you would like to 'bin' your splits or contigs outside of anvi'o and import the binning results into anvi'o using the `anvi-import-collection` program.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+[contigs-fasta](../../artifacts/contigs-fasta) [coverages-txt](../../artifacts/coverages-txt)
+
+
+## Usage
+
+
+This program **gives you the coverage information in your [profile-db](/help/8/artifacts/profile-db) as external files**. Basically, if you want to take that information in your [profile-db](/help/8/artifacts/profile-db) out of anvi'o, this is for you.
+
+Once you input your [profile-db](/help/8/artifacts/profile-db) and the [contigs-db](/help/8/artifacts/contigs-db) you used to generate it, it will create a [contigs-fasta](/help/8/artifacts/contigs-fasta) that lists your contigs for you, as well as a [coverages-txt](/help/8/artifacts/coverages-txt), which describes your coverage information.
+
+
+anvi-export-splits-and-coverages -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db)
+
+
+If your coverages are skewed by outlier positions, consider using Q2Q3-coverages instead.
+
+
+anvi-export-splits-and-coverages -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --use-Q2Q3-coverages
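To see why an inner-quartile estimate is less sensitive to outlier positions, consider the sketch below. It compares a plain mean with the mean of the middle half of the sorted values. It is an illustration under stated assumptions only: the coverage values are made up, the function names are hypothetical, and anvi'o's exact quartile handling may differ.

```python
def mean_coverage(coverages):
    """Plain arithmetic mean of per-position coverage values."""
    return sum(coverages) / len(coverages)


def q2q3_mean_coverage(coverages):
    """Mean of the middle half of the sorted values (second and third quartiles)."""
    ordered = sorted(coverages)
    quarter = len(ordered) // 4
    inner = ordered[quarter:len(ordered) - quarter]
    return sum(inner) / len(inner)


# Hypothetical per-nucleotide coverage with one extreme position.
coverages = [10, 11, 9, 10, 12, 10, 11, 500]
print(mean_coverage(coverages))       # 71.625
print(q2q3_mean_coverage(coverages))  # 10.5
```

A single position with a coverage of 500 pulls the plain mean far above what most positions show, while the inner-quartile mean stays close to the typical value.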
+
+
+### Contigs or splits?
+
+*Wondering what the difference is? Check out [our vocab page](http://merenlab.org/vocabulary/#split).*
+
+By default, this program will give you the sequences of your splits, but will look at coverage data in terms of the parent contig. If you want to get coverage information for your splits, use `--splits-mode`. Alternatively, you can ask the program to `--report-contigs` to look at contig sequences instead.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-export-splits-and-coverages.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-export-splits-and-coverages) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-export-splits-and-coverages/network.json b/help/8/programs/anvi-export-splits-and-coverages/network.json
new file mode 100644
index 00000000..ceca930f
--- /dev/null
+++ b/help/8/programs/anvi-export-splits-and-coverages/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-fasta",
+ "name": "contigs-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "coverages-txt",
+ "name": "coverages-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-export-splits-and-coverages",
+ "name": "anvi-export-splits-and-coverages",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "source": 4,
+ "target": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-export-splits-taxonomy/index.md b/help/8/programs/anvi-export-splits-taxonomy/index.md
new file mode 100644
index 00000000..989fa2d9
--- /dev/null
+++ b/help/8/programs/anvi-export-splits-taxonomy/index.md
@@ -0,0 +1,68 @@
+---
+layout: program
+title: anvi-export-splits-taxonomy
+excerpt: An anvi'o program. Export taxonomy for splits found in an anvi'o contigs database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-export-splits-taxonomy
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Export taxonomy for splits found in an anvi'o contigs database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+[splits-taxonomy-txt](../../artifacts/splits-taxonomy-txt)
+
+
+## Usage
+
+
+This program exports the taxonomy hits for the splits contained in a [contigs-db](/help/8/artifacts/contigs-db), outputting them in a [splits-taxonomy-txt](/help/8/artifacts/splits-taxonomy-txt).
+
+To do this, anvi'o examines all of the annotated genes within your splits and returns the taxon ID with the most genes associated with it. For example, a split with 3 genes identified as E. coli, 2 genes identified as Staphylococcus aureus, and 1 as Streptococcus pneumoniae would be annotated as E. coli.
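The majority rule described above can be sketched as follows. This is a minimal illustration rather than anvi'o's code: the function name is hypothetical, unannotated genes are represented as `None`, and ties go to the taxon encountered first, which is an assumption of this example.

```python
from collections import Counter


def split_taxonomy(gene_taxa):
    """Return the taxon assigned to the most genes in a split, or None if no gene is annotated."""
    counts = Counter(taxon for taxon in gene_taxa if taxon)
    if not counts:
        return None
    # most_common() orders equal counts by first appearance, so ties go to the earliest taxon.
    return counts.most_common(1)[0][0]


genes = ["E. coli"] * 3 + ["Staphylococcus aureus"] * 2 + ["Streptococcus pneumoniae"]
print(split_taxonomy(genes))  # E. coli
```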
+
+To run this program, just provide a [contigs-db](/help/8/artifacts/contigs-db):
+
+
+anvi-export-splits-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o PATH/TO/[splits-taxonomy-txt](/help/8/artifacts/splits-taxonomy-txt)
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-export-splits-taxonomy.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-export-splits-taxonomy) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-export-splits-taxonomy/network.json b/help/8/programs/anvi-export-splits-taxonomy/network.json
new file mode 100644
index 00000000..367c8173
--- /dev/null
+++ b/help/8/programs/anvi-export-splits-taxonomy/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "splits-taxonomy-txt",
+ "name": "splits-taxonomy-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-export-splits-taxonomy",
+ "name": "anvi-export-splits-taxonomy",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-export-state/index.md b/help/8/programs/anvi-export-state/index.md
new file mode 100644
index 00000000..1664eb6c
--- /dev/null
+++ b/help/8/programs/anvi-export-state/index.md
@@ -0,0 +1,73 @@
+---
+layout: program
+title: anvi-export-state
+excerpt: An anvi'o program. Export an anvi'o state from a pan or profile database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-export-state
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Export an anvi'o state from a pan or profile database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[pan-db](../../artifacts/pan-db) [profile-db](../../artifacts/profile-db) [state](../../artifacts/state)
+
+
+## Can provide
+
+
+[state-json](../../artifacts/state-json)
+
+
+## Usage
+
+
+This program allows you to export a [state](/help/8/artifacts/state) from a [pan-db](/help/8/artifacts/pan-db) or [profile-db](/help/8/artifacts/profile-db). The output of this is a [state-json](/help/8/artifacts/state-json), which you can import into another anvi'o project with [anvi-import-state](/help/8/programs/anvi-import-state).
+
+You can run this program on a [profile-db](/help/8/artifacts/profile-db) or [pan-db](/help/8/artifacts/pan-db) as follows:
+
+
+anvi-export-state -s [state](/help/8/artifacts/state) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -o path/to/output
+
+
+To list the states available in this database, you can run
+
+
+anvi-export-state -p [pan-db](/help/8/artifacts/pan-db) \
+ --list-states
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-export-state.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-export-state) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-export-state/network.json b/help/8/programs/anvi-export-state/network.json
new file mode 100644
index 00000000..c90c1e70
--- /dev/null
+++ b/help/8/programs/anvi-export-state/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "state-json",
+ "name": "state-json",
+ "provided_by_anvio": true,
+ "type": "JSON"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "state",
+ "name": "state",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-export-state",
+ "name": "anvi-export-state",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "target": 4,
+ "source": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-export-structures/index.md b/help/8/programs/anvi-export-structures/index.md
new file mode 100644
index 00000000..9febdc04
--- /dev/null
+++ b/help/8/programs/anvi-export-structures/index.md
@@ -0,0 +1,71 @@
+---
+layout: program
+title: anvi-export-structures
+excerpt: An anvi'o program. Export .pdb structure files from a structure database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-export-structures
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Export .pdb structure files from a structure database.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[structure-db](../../artifacts/structure-db)
+
+
+## Can provide
+
+
+[protein-structure-txt](../../artifacts/protein-structure-txt)
+
+
+## Usage
+
+
+
+This program exports the structures from a [structure-db](/help/8/artifacts/structure-db) into the globally understood pdb format ([protein-structure-txt](/help/8/artifacts/protein-structure-txt)), so they may be used for any follow-up analyses taking place outside of anvi'o.
+
+
+To run, just provide a [structure-db](/help/8/artifacts/structure-db) and an output path:
+
+
+anvi-export-structures -s [structure-db](/help/8/artifacts/structure-db) \
+ -o path/to/output
+
+
+You can also provide a list of gene caller IDs, either directly through the parameter `--gene-caller-ids` or through a file with one gene caller ID per line.
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-export-structures.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-export-structures) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-export-structures/network.json b/help/8/programs/anvi-export-structures/network.json
new file mode 100644
index 00000000..5bf02ea4
--- /dev/null
+++ b/help/8/programs/anvi-export-structures/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "protein-structure-txt",
+ "name": "protein-structure-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "structure-db",
+ "name": "structure-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-export-structures",
+ "name": "anvi-export-structures",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-gen-contigs-database/index.md b/help/8/programs/anvi-gen-contigs-database/index.md
new file mode 100644
index 00000000..f6b3feb0
--- /dev/null
+++ b/help/8/programs/anvi-gen-contigs-database/index.md
@@ -0,0 +1,103 @@
+---
+layout: program
+title: anvi-gen-contigs-database
+excerpt: An anvi'o program. Generate a new anvi'o contigs database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-gen-contigs-database
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Generate a new anvi'o contigs database.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+
+
+## Can consume
+
+
+[contigs-fasta](../../artifacts/contigs-fasta) [external-gene-calls](../../artifacts/external-gene-calls)
+
+
+## Can provide
+
+
+[contigs-db](../../artifacts/contigs-db)
+
+
+## Usage
+
+
+The input for this program is a [contigs-fasta](/help/8/artifacts/contigs-fasta), which should contain one or more sequences. These sequences may belong to a single genome or could be contigs obtained from an assembly.
+
+Make sure the input file matches the requirements of a [contigs-fasta](/help/8/artifacts/contigs-fasta). If you are planning to use the resulting contigs-db with [anvi-profile](/help/8/programs/anvi-profile), it is essential that you convert your [fasta](/help/8/artifacts/fasta) file to a properly formatted [contigs-fasta](/help/8/artifacts/contigs-fasta) *before* you perform the read recruitment.
+
+An anvi'o contigs database will keep all the information related to your sequences: positions of open reading frames, k-mer frequencies for each contig, functional and taxonomic annotation of genes, etc. The contigs database is one of the most essential components of anvi'o.
+
+When run on a [contigs-fasta](/help/8/artifacts/contigs-fasta), this program will:
+
+* **Compute k-mer frequencies** for each contig (the default is `4`, but you can change it using the `--kmer-size` parameter if you feel adventurous).
+
+* **Soft-split contigs** longer than 20,000 bp into smaller ones (you can change the split size using the `--split-length` flag). When the gene calling step is not skipped, the process of splitting contigs will consider where genes are and avoid cutting genes in the middle. For very, very large assemblies this process can take a while, and you can skip it with the `--skip-mindful-splitting` flag.
+
+* **Identify open reading frames** using [Prodigal](http://prodigal.ornl.gov/), unless (1) you have used the flag `--skip-gene-calling` (no gene calls will be made) or (2) you have provided [external-gene-calls](/help/8/artifacts/external-gene-calls).
+
+{:.notice}
+This program can work with compressed input FASTA files (i.e., the file name ends with a `.gz` extension).
+
+### Create a contigs database from a FASTA file
+
+
+anvi-gen-contigs-database -f [contigs-fasta](/help/8/artifacts/contigs-fasta) \
+ -o [contigs-db](/help/8/artifacts/contigs-db)
+
+
+### Create a contigs database with external gene calls
+
+
+anvi-gen-contigs-database -f [contigs-fasta](/help/8/artifacts/contigs-fasta) \
+ -o [contigs-db](/help/8/artifacts/contigs-db) \
+ --external-gene-calls [external-gene-calls](/help/8/artifacts/external-gene-calls)
+
+
+See [external-gene-calls](/help/8/artifacts/external-gene-calls) for the description and formatting requirements of this file.
+
+If user-provided or anvi'o-calculated amino acid sequences contain internal stop codons, anvi'o will raise an error. To proceed despite internal stop codons, add the `--ignore-internal-stop-codons` flag:
+
+
+anvi-gen-contigs-database -f [contigs-fasta](/help/8/artifacts/contigs-fasta) \
+ -o [contigs-db](/help/8/artifacts/contigs-db) \
+ --external-gene-calls [external-gene-calls](/help/8/artifacts/external-gene-calls) \
+ --ignore-internal-stop-codons
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-gen-contigs-database.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-gen-contigs-database) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-gen-contigs-database/network.json b/help/8/programs/anvi-gen-contigs-database/network.json
new file mode 100644
index 00000000..7be852f4
--- /dev/null
+++ b/help/8/programs/anvi-gen-contigs-database/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-fasta",
+ "name": "contigs-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-gene-calls",
+ "name": "external-gene-calls",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-gen-contigs-database",
+ "name": "anvi-gen-contigs-database",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-gen-fixation-index-matrix/index.md b/help/8/programs/anvi-gen-fixation-index-matrix/index.md
new file mode 100644
index 00000000..8e94740b
--- /dev/null
+++ b/help/8/programs/anvi-gen-fixation-index-matrix/index.md
@@ -0,0 +1,108 @@
+---
+layout: program
+title: anvi-gen-fixation-index-matrix
+excerpt: An anvi'o program. Generate a pairwise matrix of fixation indices between samples.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-gen-fixation-index-matrix
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Generate a pairwise matrix of fixation indices between samples.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [profile-db](../../artifacts/profile-db) [structure-db](../../artifacts/structure-db) [bin](../../artifacts/bin) [variability-profile-txt](../../artifacts/variability-profile-txt) [splits-txt](../../artifacts/splits-txt)
+
+
+## Can provide
+
+
+[fixation-index-matrix](../../artifacts/fixation-index-matrix)
+
+
+## Usage
+
+
+
+This program generates a matrix of the pairwise fixation indices (FST) between your samples.
+
+### What's a fixation index?
+
+As described [in the Infant Gut Tutorial](https://merenlab.org/tutorials/infant-gut/#measuring-distances-between-metagenomes-with-fst), the fixation index is a measure of the distance between two populations, based on their sequence variants (usually SNVs). Specifically, the fixation index is the ratio between the variance in allele frequency between subpopulations and the variance in the total population.
+
+
+The fixation index has its own [Wikipedia page](https://en.wikipedia.org/wiki/Fixation_index) and is a special case of [F-statistics](https://en.wikipedia.org/wiki/F-statistics).
+
+
+In anvi'o, the fixation index is calculated in accordance with [Schloissnig et al. (2013)](https://doi.org/10.1038/nature11711)'s work to allow variant positions with multiple competing alleles.
+
+
+## Anvi-gen-fixation-index-matrix
+
+There are two ways to run this program.
+
+### Input 1: Variability Profile
+
+The simplest one is the one shown [in the Infant Gut Tutorial](https://merenlab.org/tutorials/infant-gut/#measuring-distances-between-metagenomes-with-fst): just provide a [variability-profile](/help/8/artifacts/variability-profile), like so:
+
+
+anvi-gen-fixation-index-matrix --variability-profile [variability-profile](/help/8/artifacts/variability-profile) \
+ --output-file my_matrix.txt
+
+
+This will use the information in your [variability-profile-txt](/help/8/artifacts/variability-profile-txt) to generate the fixation index for each of the pairwise sample comparisons, and store the results in a [fixation-index-matrix](/help/8/artifacts/fixation-index-matrix) named `my_matrix.txt`.
+
+### Input 2: Anvi'o databases
+
+Instead of providing a [variability-profile](/help/8/artifacts/variability-profile), you can provide the inputs to [anvi-gen-variability-profile](/help/8/programs/anvi-gen-variability-profile) and let anvi'o do all of the work for you. Specifically, this means providing a [contigs-db](/help/8/artifacts/contigs-db) and [profile-db](/help/8/artifacts/profile-db) pair to find your variable positions, along with a specific subset to focus on, chosen in any of these ways:
+
+- Provide a list of gene caller IDs (as a parameter with the flag `--gene-caller-ids` or in a file with one ID per line with the flag `--genes-of-interest`)
+- Provide a list of splits (in a [splits-txt](/help/8/artifacts/splits-txt))
+- Provide a [collection](/help/8/artifacts/collection) and [bin](/help/8/artifacts/bin)
+
+Additionally, you can add structural annotations by inputting a [structure-db](/help/8/artifacts/structure-db) (and focus only on genes with structural annotations with the flag `--only-if-structure`) or choose to focus on only a subset of your samples by providing a file of samples of interest.
+
+When doing this, you can also set the variability engine to get the fixation index for SCVs (`--engine CDN`) or SAAVs (`--engine AA`).
+
+You can find more information about these parameters on the page for [anvi-gen-variability-profile](/help/8/programs/anvi-gen-variability-profile).
+
+### Additional Parameters
+
+While a fixation index is usually between 0 and 1, it is possible for an index to be negative (usually because of out-breeding). By default, anvi'o sets these negative values to 0, but you can choose to keep the negative values with the flag `--keep-negatives`.
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-gen-fixation-index-matrix.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Utilizing fixation index to study SAR11 population structure](http://merenlab.org/data/sar11-saavs/#generating-distance-matrices-from-fixation-index-for-saavs-and-snvs-data)
+
+* [Measuring Distances Between Genomes in the Infant Gut Tutorial](http://merenlab.org/tutorials/infant-gut/#measuring-distances-between-metagenomes-with-fst)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-gen-fixation-index-matrix) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-gen-fixation-index-matrix/network.json b/help/8/programs/anvi-gen-fixation-index-matrix/network.json
new file mode 100644
index 00000000..91f1b61d
--- /dev/null
+++ b/help/8/programs/anvi-gen-fixation-index-matrix/network.json
@@ -0,0 +1,108 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "fixation-index-matrix",
+ "name": "fixation-index-matrix",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "structure-db",
+ "name": "structure-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "variability-profile-txt",
+ "name": "variability-profile-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "splits-txt",
+ "name": "splits-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 7,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-gen-fixation-index-matrix",
+ "name": "anvi-gen-fixation-index-matrix",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 7,
+ "target": 0
+ },
+ {
+ "target": 7,
+ "source": 1
+ },
+ {
+ "target": 7,
+ "source": 2
+ },
+ {
+ "target": 7,
+ "source": 3
+ },
+ {
+ "target": 7,
+ "source": 4
+ },
+ {
+ "target": 7,
+ "source": 5
+ },
+ {
+ "target": 7,
+ "source": 6
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-gen-gene-consensus-sequences/index.md b/help/8/programs/anvi-gen-gene-consensus-sequences/index.md
new file mode 100644
index 00000000..05809c80
--- /dev/null
+++ b/help/8/programs/anvi-gen-gene-consensus-sequences/index.md
@@ -0,0 +1,82 @@
+---
+layout: program
+title: anvi-gen-gene-consensus-sequences
+excerpt: An anvi'o program. Collapse variability for a set of genes across samples.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-gen-gene-consensus-sequences
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Collapse variability for a set of genes across samples.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+[genes-fasta](../../artifacts/genes-fasta)
+
+
+## Usage
+
+
+This program **provides consensus sequences for the genes within a [contigs-db](/help/8/artifacts/contigs-db) and [profile-db](/help/8/artifacts/profile-db) pair**.
+
+In other words, this program collapses variability by assigning the most abundant nucleotide in each sample to each position, giving a single consensus sequence for each gene in each sample.
+
+A basic run of this program will resemble the following:
+
+
+anvi-gen-gene-consensus-sequences -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o [genes-fasta](/help/8/artifacts/genes-fasta)
+
+
+The default output is a [genes-fasta](/help/8/artifacts/genes-fasta), but you can also get a tab-delimited output matrix by adding the flag `--tab-delimited`.
+
+You also have the option to focus on a subset of the data in your [contigs-db](/help/8/artifacts/contigs-db) and [profile-db](/help/8/artifacts/profile-db) by providing either:
+
+- A list of gene caller IDs (either as a parameter or through a file with one gene caller ID per line)
+- A list of samples to focus on (as a file with a single sample name per line)
+
+### Additional Parameters
+
+- You have the option to change the variability engine (e.g., to codons), which sets the level at which variability is resolved.
+- To compress the variability profiles of all of your samples for a single gene, use the flag `--compress-samples`. This way, the program will only report one consensus sequence for each gene instead of reporting one for each sample.
+- You can get consensus sequences for each contig instead of for each gene with `--contigs-mode`.
+- To report all consensus sequences (even when there are no variable positions), activate `--quince-mode`.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-gen-gene-consensus-sequences.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-gen-gene-consensus-sequences) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-gen-gene-consensus-sequences/network.json b/help/8/programs/anvi-gen-gene-consensus-sequences/network.json
new file mode 100644
index 00000000..01b0dd64
--- /dev/null
+++ b/help/8/programs/anvi-gen-gene-consensus-sequences/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genes-fasta",
+ "name": "genes-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-gen-gene-consensus-sequences",
+ "name": "anvi-gen-gene-consensus-sequences",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-gen-gene-level-stats-databases/index.md b/help/8/programs/anvi-gen-gene-level-stats-databases/index.md
new file mode 100644
index 00000000..a80c6a8c
--- /dev/null
+++ b/help/8/programs/anvi-gen-gene-level-stats-databases/index.md
@@ -0,0 +1,78 @@
+---
+layout: program
+title: anvi-gen-gene-level-stats-databases
+excerpt: An anvi'o program. A program to compute genes databases for a given set of bins stored in an anvi'o collection.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-gen-gene-level-stats-databases
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A program to compute genes databases for a given set of bins stored in an anvi'o collection. Genes databases store gene-level coverage and detection statistics, and they are usually computed and generated automatically when they are required (such as when running anvi-interactive with the `--gene-mode` flag). This program allows you to pre-compute them if you don't want them to be done all at once.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db) [collection](../../artifacts/collection) [bin](../../artifacts/bin)
+
+
+## Can provide
+
+
+[genes-db](../../artifacts/genes-db)
+
+
+## Usage
+
+
+This program **generates a [genes-db](/help/8/artifacts/genes-db), which stores the coverage and detection values for all of the genes in your [contigs-db](/help/8/artifacts/contigs-db).**
+
+This information is usually calculated when it's needed (for example when running [anvi-interactive](/help/8/programs/anvi-interactive) in genes mode), but this program lets you break this process into two steps. This way, you can easily change the parameters of [anvi-interactive](/help/8/programs/anvi-interactive) without having to recalculate the gene-level statistics.
+
+Given a [contigs-db](/help/8/artifacts/contigs-db) and [profile-db](/help/8/artifacts/profile-db) pair, as well as a [collection](/help/8/artifacts/collection), this program will calculate the stats for the genes in each of your [bin](/help/8/artifacts/bin)s and give each bin its own [profile-db](/help/8/artifacts/profile-db) that includes this information.
+
+For example, if a [collection](/help/8/artifacts/collection) called `GENE_COLLECTION` contained the bins `bin_0001`, `bin_0002`, and `bin_0003` and you ran:
+
+
+anvi-gen-gene-level-stats-databases -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -C [collection](/help/8/artifacts/collection)
+
+
+Then it will create a directory called `GENES` that contains three [profile-db](/help/8/artifacts/profile-db)s named `GENE_COLLECTION-bin_0001.db`, `GENE_COLLECTION-bin_0002.db`, and `GENE_COLLECTION-bin_0003.db`. In terms of output, this program is similar to [anvi-split](/help/8/programs/anvi-split): each of these databases can now be treated as a self-contained anvi'o project, but they also contain the gene-level information. Thus, you could then run [anvi-interactive](/help/8/programs/anvi-interactive) in genes mode on one of these profile databases.
+
+You also have the option to provide a list of [bin](/help/8/artifacts/bin)s (either as a file or as a string) to analyze instead of a single [collection](/help/8/artifacts/collection).
+
+### Other Parameters
+
+You can also change the definition of an outlier nucleotide position or switch calculations to use the [INSeq/Tn-Seq](https://www.illumina.com/science/sequencing-method-explorer/kits-and-arrays/in-seq-tn-seq.html) statistical methods.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-gen-gene-level-stats-databases.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-gen-gene-level-stats-databases) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-gen-gene-level-stats-databases/network.json b/help/8/programs/anvi-gen-gene-level-stats-databases/network.json
new file mode 100644
index 00000000..ab458477
--- /dev/null
+++ b/help/8/programs/anvi-gen-gene-level-stats-databases/network.json
@@ -0,0 +1,82 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genes-db",
+ "name": "genes-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-gen-gene-level-stats-databases",
+ "name": "anvi-gen-gene-level-stats-databases",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 5,
+ "target": 0
+ },
+ {
+ "target": 5,
+ "source": 1
+ },
+ {
+ "target": 5,
+ "source": 2
+ },
+ {
+ "target": 5,
+ "source": 3
+ },
+ {
+ "target": 5,
+ "source": 4
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-gen-genomes-storage/index.md b/help/8/programs/anvi-gen-genomes-storage/index.md
new file mode 100644
index 00000000..5a901e8b
--- /dev/null
+++ b/help/8/programs/anvi-gen-genomes-storage/index.md
@@ -0,0 +1,99 @@
+---
+layout: program
+title: anvi-gen-genomes-storage
+excerpt: An anvi'o program. Create a genome storage from internal and/or external genomes for a pangenome analysis.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-gen-genomes-storage
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Create a genome storage from internal and/or external genomes for a pangenome analysis.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[external-genomes](../../artifacts/external-genomes) [internal-genomes](../../artifacts/internal-genomes)
+
+
+## Can provide
+
+
+[genomes-storage-db](../../artifacts/genomes-storage-db)
+
+
+## Usage
+
+
+This program **generates a [genomes-storage-db](/help/8/artifacts/genomes-storage-db), which stores information about your genomes, primarily for use in pangenomic analysis.**
+
+A genomes storage database is to anvi'o's pangenomic workflow what a [contigs-db](/help/8/artifacts/contigs-db) is to a metagenomic workflow: it stores vital information and is passed to most programs you'll want to run.
+
+Once you've generated a [genomes-storage-db](/help/8/artifacts/genomes-storage-db), you can run [anvi-pan-genome](/help/8/programs/anvi-pan-genome), which creates a [pan-db](/help/8/artifacts/pan-db) and runs various pangenomic analyses (including calculating the similarities between your sequences, identifying gene clusters, and organizing your gene clusters and genomes). After that, you can display your pangenome with [anvi-display-pan](/help/8/programs/anvi-display-pan). For more information, check out [the pangenomic workflow](http://merenlab.org/2016/11/08/pangenomics-v2/#generating-an-anvio-genomes-storage).
+
+### Inputs: internal and external genomes
+
+You can initialize your genomes storage database with [internal-genomes](/help/8/artifacts/internal-genomes), [external-genomes](/help/8/artifacts/external-genomes), or both.
+
+[internal-genomes](/help/8/artifacts/internal-genomes) describe genomes that are stored as a [bin](/help/8/artifacts/bin) within a [collection](/help/8/artifacts/collection) in an anvi'o [profile-db](/help/8/artifacts/profile-db). This is the case if, for example, you have gone through [the metagenomic workflow](http://merenlab.org/2016/06/22/anvio-tutorial-v2/) and have several MAGs that you want to include in a pangenomic analysis.
+
+
+anvi-gen-genomes-storage -i [internal-genomes](/help/8/artifacts/internal-genomes) \
+ -o [genomes-storage-db](/help/8/artifacts/genomes-storage-db)
+
+
+{:.notice}
+The name of your genomes storage database (which follows the `-o` flag) must end with `-GENOMES.db`. This just helps differentiate it from other types of anvi'o databases, such as the [contigs-db](/help/8/artifacts/contigs-db) and [profile-db](/help/8/artifacts/profile-db).
+
+In contrast, [external-genomes](/help/8/artifacts/external-genomes) describe genomes that are contained in a [fasta](/help/8/artifacts/fasta) file that you've turned into a [contigs-db](/help/8/artifacts/contigs-db) (using [anvi-gen-contigs-database](/help/8/programs/anvi-gen-contigs-database)). This is the case if, for example, you have downloaded genomes from [NCBI](https://www.ncbi.nlm.nih.gov/).
+
+
+anvi-gen-genomes-storage -e [external-genomes](/help/8/artifacts/external-genomes) \
+ -o [genomes-storage-db](/help/8/artifacts/genomes-storage-db)
+
+
+You can also create a genomes storage database from both types of genomes at the same time. This is useful if, for example, you have MAGs from a metagenomic analysis of an environmental sample and want to compare them with reference genomes from [NCBI](https://www.ncbi.nlm.nih.gov/). To do this, simply provide both types of genomes as parameters, like so:
+
+
+anvi-gen-genomes-storage -i [internal-genomes](/help/8/artifacts/internal-genomes) \
+ -e [external-genomes](/help/8/artifacts/external-genomes) \
+ -o [genomes-storage-db](/help/8/artifacts/genomes-storage-db)
+
+
+### Changing the gene caller
+
+By default, Anvi'o will use [Prodigal](https://github.com/hyattpd/Prodigal) and will let you know if you have gene calls identified by other gene callers. However, you are welcome to explicitly use a specific gene caller with the flag `--gene-caller`.
+
+If you're wondering what gene callers are available in your [contigs-db](/help/8/artifacts/contigs-db), you can check by running the program [anvi-export-gene-calls](/help/8/programs/anvi-export-gene-calls) on a specific [contigs-db](/help/8/artifacts/contigs-db) with the flag `--list-gene-callers`.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-gen-genomes-storage.md) to update this information.
+
+
+## Additional Resources
+
+
+* [A tutorial on pangenomics](http://merenlab.org/2016/11/08/pangenomics-v2/)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-gen-genomes-storage) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-gen-genomes-storage/network.json b/help/8/programs/anvi-gen-genomes-storage/network.json
new file mode 100644
index 00000000..c8a6d57f
--- /dev/null
+++ b/help/8/programs/anvi-gen-genomes-storage/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-genomes",
+ "name": "external-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "internal-genomes",
+ "name": "internal-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-gen-genomes-storage",
+ "name": "anvi-gen-genomes-storage",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-gen-phylogenomic-tree/index.md b/help/8/programs/anvi-gen-phylogenomic-tree/index.md
new file mode 100644
index 00000000..5dfe7713
--- /dev/null
+++ b/help/8/programs/anvi-gen-phylogenomic-tree/index.md
@@ -0,0 +1,71 @@
+---
+layout: program
+title: anvi-gen-phylogenomic-tree
+excerpt: An anvi'o program. Generate phylogenomic tree from alignment file.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-gen-phylogenomic-tree
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Generate phylogenomic tree from alignment file.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[concatenated-gene-alignment-fasta](../../artifacts/concatenated-gene-alignment-fasta)
+
+
+## Can provide
+
+
+[phylogeny](../../artifacts/phylogeny)
+
+
+## Usage
+
+
+This program generates a NEWICK-formatted phylogenomic tree (see [phylogeny](/help/8/artifacts/phylogeny)) based on a given [concatenated-gene-alignment-fasta](/help/8/artifacts/concatenated-gene-alignment-fasta).
+
+As mentioned in the [phylogenetics tutorial](http://merenlab.org/2017/06/07/phylogenomics/), it currently only has the option to use [FastTree](http://microbesonline.org/fasttree/) to do so, but be aware that there are many other programs that you can do this with. Some of the options we are familiar with (and are not yet represented in `anvi-gen-phylogenomic-tree`) include [MrBayes](http://mrbayes.sourceforge.net/), [MEGA](http://www.megasoftware.net/), and PHYLIP, [among many others](http://evolution.genetics.washington.edu/phylip/software.html#methods), most of which will happily take a [concatenated-gene-alignment-fasta](/help/8/artifacts/concatenated-gene-alignment-fasta).
+
+Anyway, running this program is simple. Just provide the [concatenated-gene-alignment-fasta](/help/8/artifacts/concatenated-gene-alignment-fasta) with all of the genes that you want to use and the output file path for your [phylogeny](/help/8/artifacts/phylogeny):
+
+
+anvi-gen-phylogenomic-tree -f [concatenated-gene-alignment-fasta](/help/8/artifacts/concatenated-gene-alignment-fasta) \
+ -o PATH/TO/[phylogeny](/help/8/artifacts/phylogeny)
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-gen-phylogenomic-tree.md) to update this information.
+
+
+## Additional Resources
+
+
+* [View this program in action in the anvi'o phylogenetics workflow](http://merenlab.org/2017/06/07/phylogenomics/)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-gen-phylogenomic-tree) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-gen-phylogenomic-tree/network.json b/help/8/programs/anvi-gen-phylogenomic-tree/network.json
new file mode 100644
index 00000000..d11ff100
--- /dev/null
+++ b/help/8/programs/anvi-gen-phylogenomic-tree/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "phylogeny",
+ "name": "phylogeny",
+ "provided_by_anvio": true,
+ "type": "NEWICK"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "concatenated-gene-alignment-fasta",
+ "name": "concatenated-gene-alignment-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-gen-phylogenomic-tree",
+ "name": "anvi-gen-phylogenomic-tree",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-gen-structure-database/index.md b/help/8/programs/anvi-gen-structure-database/index.md
new file mode 100644
index 00000000..d90d6903
--- /dev/null
+++ b/help/8/programs/anvi-gen-structure-database/index.md
@@ -0,0 +1,134 @@
+---
+layout: program
+title: anvi-gen-structure-database
+excerpt: An anvi'o program. Creates a database of protein structures.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-gen-structure-database
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Creates a database of protein structures. Predict protein structures using template-based homology modelling of genes in your contigs database, or import pre-computed PDB structures you already have.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [pdb-db](../../artifacts/pdb-db)
+
+
+## Can provide
+
+
+[structure-db](../../artifacts/structure-db)
+
+
+## Usage
+
+
+
+This program creates a [structure-db](/help/8/artifacts/structure-db) either by (a) attempting to solve for the 3D structures of proteins encoded by genes in your [contigs-db](/help/8/artifacts/contigs-db) using DIAMOND and MODELLER, or (b) importing pre-existing structures provided by the user using an [external-structures](/help/8/artifacts/external-structures) file.
+
+### The basics of the pipeline
+
+This section covers option (a), where the user is interested in having structures predicted for them.
+
+DIAMOND first searches your sequence(s) against a database of proteins with a known structure. This database is downloaded from the [Sali lab](https://salilab.org/modeller/supplemental.html), who created and maintain MODELLER, and contains all of the PDB sequences clustered at 95% identity.
+
+If any good hits are found, they are selected as templates, and their structures are nabbed either from [the RCSB directly](https://www.rcsb.org/), or from a local [pdb-db](/help/8/artifacts/pdb-db) database which you can create yourself with [anvi-setup-pdb-database](/help/8/programs/anvi-setup-pdb-database). Then, anvi'o passes control over to MODELLER, which creates a 3D alignment for your sequence to the template structures, and makes final adjustments to it based off of empirical distributions of bond angles. For more information, check [this blogpost](http://merenlab.org/2018/09/04/getting-started-with-anvio-structure/#how-modeller-works).
+
+The output of this program is a [structure-db](/help/8/artifacts/structure-db), which contains all of the modelled structures. Currently, the primary use of the [structure-db](/help/8/artifacts/structure-db) is for interactive exploration with [anvi-display-structure](/help/8/programs/anvi-display-structure). You can also export your structures into external .pdb files with [anvi-export-structures](/help/8/programs/anvi-export-structures), or incorporate structural information in the [variability-profile-txt](/help/8/artifacts/variability-profile-txt) with [anvi-gen-variability-profile](/help/8/programs/anvi-gen-variability-profile).
+
+### Basic standard run
+
+Here is a simple run:
+
+
+anvi-gen-structure-database -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --gene-caller-ids 1,2,3 \
+ -o STRUCTURE.db
+
+
+Following this, you will have the structures for genes 1, 2, and 3 stored in `STRUCTURE.db`, assuming reasonable templates were found. Alternatively, you can provide a file name with the gene caller IDs (one ID per line) with the flag `--genes-of-interest`.
+
+If you have already run [anvi-setup-pdb-database](/help/8/programs/anvi-setup-pdb-database) and therefore have a local copy of representative PDB structures, make sure you use it by providing the `--offline` flag. If you put it in a non-default location, provide the path to your [pdb-db](/help/8/artifacts/pdb-db):
+
+
+anvi-gen-structure-database -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --gene-caller-ids 1,2,3 \
+ --pdb-database [pdb-db](/help/8/artifacts/pdb-db) \
+ -o STRUCTURE.db
+
+
+To quickly get a very rough estimate for your structures, you can run with the flag `--very-fast`.
+
+### Basic import run
+
+If you already possess structures and would like to create a [structure-db](/help/8/artifacts/structure-db) for downstream anvi'o uses such as [anvi-display-structure](/help/8/programs/anvi-display-structure), you should create an [external-structures](/help/8/artifacts/external-structures) file. Then, create the database as follows:
+
+
+anvi-gen-structure-database -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --external-structures [external-structures](/help/8/artifacts/external-structures) \
+ -o STRUCTURE.db
+
+
+{:.notice}
+Please avoid using any MODELLER-specific parameters when using this mode, as they will be silently ignored.
+
+
+### Advanced Parameters
+
+Here, we will go through a brief overview of the MODELLER parameters that you are able to change. See [this page](http://merenlab.org/2018/09/04/getting-started-with-anvio-structure/#description-of-all-modeller-parameters) for more information.
+
+- The number of models to be simulated. The default is 1.
+- The standard deviation of atomic perturbation of the initial structure (i.e. how much you change the position of the atoms before fine tuning with other analysis). The default is 4 angstroms.
+- The MODELLER database used. The default is `pdb_95`, which can be found [here](https://salilab.org/modeller/supplemental.html). This is the same database that is downloaded by [anvi-setup-pdb-database](/help/8/programs/anvi-setup-pdb-database).
+- The scoring function used to compare potential models. The default is `DOPE_score`.
+- The minimum percent identity cutoff for a template to be further considered.
+- The minimum alignment fraction that the sequence is covered by the template in order to be further considered.
+- The maximum number of templates that the program will consider. The default is 5.
+- The MODELLER program to use. The default is `mod9.19`, but anvi'o is somewhat intelligent and will
+ look for the most recent version it can find.
+
+For a case study on how some of these parameters matter, see [here](http://merenlab.org/2018/09/04/getting-started-with-anvio-structure/#a-quick-case-study-on-the-importance-of-key-parameters).
+
+You also have the option to
+
+- Skip the use of DSSP, which predicts beta sheets, alpha helices, certain bond angles, and relative
+ solvent accessibility of residues.
+- Output **all** the raw data. To do so, just provide a path to the desired directory with the flag `--dump-dir`.
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-gen-structure-database.md) to update this information.
+
+
+## Additional Resources
+
+
+* [A conceptual tutorial on the structural biology capabilities of anvio](http://merenlab.org/2018/09/04/structural-biology-with-anvio/)
+
+* [A practical tutorial section in the infant gut tutorial](http://merenlab.org/tutorials/infant-gut/#chapter-vii-linking-genomic-heterogeneity-to-protein-structures)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-gen-structure-database) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-gen-structure-database/network.json b/help/8/programs/anvi-gen-structure-database/network.json
new file mode 100644
index 00000000..0c8c560e
--- /dev/null
+++ b/help/8/programs/anvi-gen-structure-database/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "structure-db",
+ "name": "structure-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pdb-db",
+ "name": "pdb-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-gen-structure-database",
+ "name": "anvi-gen-structure-database",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-gen-variability-network/index.md b/help/8/programs/anvi-gen-variability-network/index.md
new file mode 100644
index 00000000..ff185e1c
--- /dev/null
+++ b/help/8/programs/anvi-gen-variability-network/index.md
@@ -0,0 +1,149 @@
+---
+layout: program
+title: anvi-gen-variability-network
+excerpt: An anvi'o program. Generate a network description from an anvi'o variability profile.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-gen-variability-network
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Generate a network description from an anvi'o variability profile.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[variability-profile-txt](../../artifacts/variability-profile-txt)
+
+
+## Can provide
+
+
+[variability-profile-xml](../../artifacts/variability-profile-xml)
+
+
+## Usage
+
+
+This program takes in a [variability-profile-txt](/help/8/artifacts/variability-profile-txt) file reported by the program [anvi-gen-variability-profile](/help/8/programs/anvi-gen-variability-profile), and generates a simple representation of this complex data so it can be visualized as a network.
+
+## Default mode of operation
+
+By default, the program will report an XML file properly formatted to be an input for the open-source network visualization program Gephi:
+
+
+anvi-gen-variability-network -i [variability-profile-txt](/help/8/artifacts/variability-profile-txt) \
+ -o variability_profile.gexf
+
+
+## Reporting a text file instead
+
+Alternatively, the user may ask for the results to be reported as a TAB-delimited text file:
+
+
+anvi-gen-variability-network -i [variability-profile-txt](/help/8/artifacts/variability-profile-txt) \
+ -o variability_output.txt \
+ --as-matrix
+
+
+Reporting the data as a matrix enables quick visualization opportunities using the programs [anvi-matrix-to-newick](/help/8/programs/anvi-matrix-to-newick) and [anvi-interactive](/help/8/programs/anvi-interactive):
+
+```
+anvi-matrix-to-newick variability_output.txt -o variability_output.newick
+
+anvi-interactive -d variability_output.txt \
+ -t variability_output.newick \
+ -p variability_profile.db \
+ --manual
+```
+
+## Using competing nucleotides as features
+
+By default, this program will take every unique nucleotide position that was variable in at least one sample, along with every sample, and then connect samples and positions with edges if a given sample has a variable nucleotide at a given position. This means that even if a given variable nucleotide position X differs from the reference nucleotide in different ways in different samples N and M, this approach will be agnostic to that and will simply report that both N and M had a variable nucleotide at the position X.
+
+|sample|X|
+|N|True|
+|M|True|
+
+Alternatively, the user can ask for the variation to be reported based on competing nucleotides in a given sample. In that case, if the position X in sample N varies between nucleotides `A` and `G` and in sample M between nucleotides `A` and `T`, the user may get a higher resolution for their inference by asking anvi'o to include 'competing nucleotides' in the report:
+
+|sample|X_AG|X_AT|
+|N|True|False|
+|M|False|True|
+
+This is done through the parameter `--include-competing-NTs`, which requires an option. Currently available options are the following:
+
+* 'default': Returns the default competing nucleotides column from the variability as calculated by anvi'o during profiling.
+* 'noise-robust': When the departure from consensus for a given nucleotide position is close to zero, which means the position is almost fixed in the environment, the default way to calculate competing nucleotides can yield noisy results, simply because the second most frequent nucleotide can be driven by artifacts (such as sequencing error) rather than biology. For instance, if a given position that is represented by a nucleotide `G` in the reference has SNV frequencies of {'A': 1000, 'T': 1, 'C': 0, 'G': 0} in one sample and {'A': 1000, 'T': 0, 'C': 1, 'G': 0} in the other, the competing nucleotides for this position in the variability table will be `AT` and `AC`, respectively. However, some applications, in which competing nucleotides are used as categorical variables to associate samples, may require a more robust approach. The `noise-robust` alternative would yield `AG` and `AG` for this position in both samples.
+
+
+anvi-gen-variability-network -i [variability-profile-txt](/help/8/artifacts/variability-profile-txt) \
+ -o variability_profile.gexf \
+ --include-competing-NTs noise-robust
+
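+One way to read the example above is that, for a nearly fixed position, the `noise-robust` option pairs the consensus nucleotide with the reference nucleotide instead of with the second most frequent one. The Python sketch below assumes that reading. It is not anvi'o's implementation: the function names and the `NEAR_FIXED_THRESHOLD` cutoff are hypothetical and exist only to illustrate the idea.
+
+```python
+# Illustrative sketch only, not anvi'o code.
+# NEAR_FIXED_THRESHOLD is a hypothetical cutoff for "close to zero".
+NEAR_FIXED_THRESHOLD = 0.01
+
+
+def competing_nts_default(freqs):
+    """Return the two most frequent nucleotides, sorted alphabetically."""
+    ranked = sorted(freqs, key=lambda nt: (-freqs[nt], nt))
+    return "".join(sorted(ranked[:2]))
+
+
+def competing_nts_noise_robust(freqs, reference):
+    """Pair the consensus with the reference when the position is nearly fixed."""
+    total = sum(freqs.values())
+    consensus = sorted(freqs, key=lambda nt: (-freqs[nt], nt))[0]
+    departure_from_consensus = 1 - freqs[consensus] / total
+    if departure_from_consensus < NEAR_FIXED_THRESHOLD:
+        return "".join(sorted((consensus, reference)))
+    return competing_nts_default(freqs)
+
+
+sample_1 = {"A": 1000, "T": 1, "C": 0, "G": 0}
+sample_2 = {"A": 1000, "T": 0, "C": 1, "G": 0}
+
+default_calls = [competing_nts_default(s) for s in (sample_1, sample_2)]
+robust_calls = [competing_nts_noise_robust(s, "G") for s in (sample_1, sample_2)]
+```
+
+With the frequencies from the example above, `default_calls` is `['AT', 'AC']` and `robust_calls` is `['AG', 'AG']`, so the two samples share a single feature under the second approach.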
+
+## Changing the default report variable
+
+By default, [anvi-gen-variability-network](/help/8/programs/anvi-gen-variability-network) will report the estimate `departure_from_reference` as the quantity that defines the weight of the edges that connect variable nucleotide positions and samples. The edge weight can later be used during network visualization as a factor that influences network convergence. It is possible to change the variable using the parameter `--edge-variable`.
+
+
+anvi-gen-variability-network -i [variability-profile-txt](/help/8/artifacts/variability-profile-txt) \
+ -o variability_profile.gexf \
+ --include-competing-NTs noise-robust \
+ --edge-variable departure_from_consensus
+
+
+The parameter will accept any variable reported in the variability profile output. To get a list of the variables that are appropriate to use here, one can always pass a value that certainly does not exist:
+
+
+anvi-gen-variability-network -i [variability-profile-txt](/help/8/artifacts/variability-profile-txt) \
+ -o variability_profile.gexf \
+ --include-competing-NTs noise-robust \
+ --edge-variable THIS_DOESNT_EXIST
+
+Config Error: The edge weight variable `THIS_DOESNT_EXIST` does not seem to be among those
+ that are represented within the variability data :/ Here is a list of the
+ variables you could choose from (although not each one of them will be equally
+ useful to serve as edge weights in the resulting network, but anvi'o leaves the
+ responsibility to choose something relevant completely to you and will not
+ scrutinize your decision): 'pos', 'pos_in_contig', 'corresponding_gene_call',
+ 'in_noncoding_gene_call', 'in_coding_gene_call', 'base_pos_in_codon',
+ 'codon_order_in_gene', 'coverage', 'cov_outlier_in_split',
+ 'cov_outlier_in_contig', 'departure_from_reference', 'A', 'C', 'G', 'T', 'N',
+ 'codon_number', 'gene_length', 'unique_pos_identifier',
+ 'departure_from_consensus', 'n2n1ratio', 'entropy', 'gene_coverage',
+ 'non_outlier_gene_coverage', 'non_outlier_gene_coverage_std',
+ 'mean_normalized_coverage'.
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-gen-variability-network.md) to update this information.
+
+
+## Additional Resources
+
+
+* [An application of this program in the Infant Gut Tutorial](https://merenlab.org/tutorials/infant-gut/#visualizing-snv-profiles-as-a-network)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-gen-variability-network) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-gen-variability-network/network.json b/help/8/programs/anvi-gen-variability-network/network.json
new file mode 100644
index 00000000..0d50cc19
--- /dev/null
+++ b/help/8/programs/anvi-gen-variability-network/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "variability-profile-xml",
+ "name": "variability-profile-xml",
+ "provided_by_anvio": true,
+ "type": "XML"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "variability-profile-txt",
+ "name": "variability-profile-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-gen-variability-network",
+ "name": "anvi-gen-variability-network",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-gen-variability-profile/index.md b/help/8/programs/anvi-gen-variability-profile/index.md
new file mode 100644
index 00000000..154d7ecf
--- /dev/null
+++ b/help/8/programs/anvi-gen-variability-profile/index.md
@@ -0,0 +1,200 @@
+---
+layout: program
+title: anvi-gen-variability-profile
+excerpt: An anvi'o program. Generate a table that comprehensively summarizes the variability of nucleotide, codon, or amino acid positions.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-gen-variability-profile
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Generate a table that comprehensively summarizes the variability of nucleotide, codon, or amino acid positions. We call these single nucleotide variants (SNVs), single codon variants (SCVs), and single amino acid variants (SAAVs), respectively.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [profile-db](../../artifacts/profile-db) [structure-db](../../artifacts/structure-db) [bin](../../artifacts/bin) [variability-profile](../../artifacts/variability-profile) [splits-txt](../../artifacts/splits-txt)
+
+
+## Can provide
+
+
+[variability-profile-txt](../../artifacts/variability-profile-txt)
+
+
+## Usage
+
+
+
+This program takes the variability data stored within a [profile-db](/help/8/artifacts/profile-db) and compiles it from across samples into a single matrix that comprehensively describes your SNVs, SCVs or SAAVs (a [variability-profile-txt](/help/8/artifacts/variability-profile-txt)).
+
+This program is described on [this blog post](http://merenlab.org/2015/07/20/analyzing-variability/#the-anvio-way), so take a look at that for more details.
+
+## Let's talk parameters
+
+Here is a basic run with no bells or whistles:
+
+
+anvi-gen-variability-profile -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -C DEFAULT \
+ -b EVERYTHING
+
+
+Note that this program requires you to specify a subset of the databases that you want to focus on, so to focus on everything in the databases, run [anvi-script-add-default-collection](/help/8/programs/anvi-script-add-default-collection) and use the resulting [collection](/help/8/artifacts/collection) and [bin](/help/8/artifacts/bin), as shown above.
+
+You can add structural annotations by providing a [structure-db](/help/8/artifacts/structure-db).
+
+
+anvi-gen-variability-profile -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -C DEFAULT \
+ -b EVERYTHING \
+ -s [structure-db](/help/8/artifacts/structure-db)
+
+
+### Focusing on a subset of the input
+
+Instead of focusing on everything (providing the collection `DEFAULT` and the bin `EVERYTHING`), there are three ways to focus on a subset of the input:
+
+1. Provide a list of gene caller IDs (as a parameter with the flag `--gene-caller-ids` as shown below, or as a file with the flag `--genes-of-interest`)
+
+
+ anvi-gen-variability-profile -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --gene-caller-ids 1,2,3
+
+
+2. Provide a [splits-txt](/help/8/artifacts/splits-txt) to focus only on a specific set of splits.
+
+
+ anvi-gen-variability-profile -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --splits-of-interest [splits-txt](/help/8/artifacts/splits-txt)
+
+
+3. Provide some other [collection](/help/8/artifacts/collection) and [bin](/help/8/artifacts/bin).
+
+
+ anvi-gen-variability-profile -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ -b [bin](/help/8/artifacts/bin)
+
+
+### Additional ways to focus the input
+
+When providing a [structure-db](/help/8/artifacts/structure-db), you can also limit your analysis to only genes that have structures in your database.
+
+
+anvi-gen-variability-profile -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -s [structure-db](/help/8/artifacts/structure-db) \
+ --only-if-structure
+
+
+You can also choose to look only at data from specific samples by providing a file with one sample name per line. For example:
+
+
+anvi-gen-variability-profile -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ -b [bin](/help/8/artifacts/bin) \
+ --samples-of-interest my_samples.txt
+
+
+where `my_samples.txt` looks like this:
+
+
+DAY_17A
+DAY_18A
+DAY_22A
+...
+
+
+### SNVs vs. SCVs vs. SAAVs
+
+Which one you're analyzing depends entirely on the `engine` parameter, which you can set to `NT` (nucleotides), `CDN` (codons), or `AA` (amino acids). The default value is nucleotides. Note that to analyze SCVs or SAAVs, you need to have used the flag `--profile-SCVs` when you ran [anvi-profile](/help/8/programs/anvi-profile).
+
+For example, to analyze SAAVs, run
+
+
+anvi-gen-variability-profile -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ -b [bin](/help/8/artifacts/bin) \
+ --engine AA
+
+
+To analyze SCVs, run
+
+
+anvi-gen-variability-profile -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ -b [bin](/help/8/artifacts/bin) \
+ --engine CDN
+
+
+### Filtering the output
+
+You can filter the output in various ways, so that you can get straight to the variability positions that you're most interested in. Here are some of the filters that you can set:
+
+* The maximum number of variable positions that can come from a single split (e.g. to look at a max of 100 SCVs from each split, randomly sampled)
+* The maximum and minimum departure from the reference or consensus
+* The minimum coverage value in all samples (if a position is covered less than that value in _one_ sample, it will not be reported for _all_ samples)
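+
+The last filter is the least intuitive of the three, so here is a minimal Python sketch of its logic. It is not anvi'o's implementation: the function name and the nested dictionary layout are assumptions made purely for illustration.
+
+```python
+def positions_passing_min_coverage(coverage, min_coverage):
+    """Keep only positions covered at least `min_coverage` times in every sample.
+
+    `coverage` uses a hypothetical layout: {position: {sample_name: coverage}}.
+    """
+    return {
+        pos: per_sample
+        for pos, per_sample in coverage.items()
+        if all(cov >= min_coverage for cov in per_sample.values())
+    }
+
+
+coverage = {
+    34: {"DAY_17A": 120, "DAY_18A": 95, "DAY_22A": 8},
+    51: {"DAY_17A": 60, "DAY_18A": 44, "DAY_22A": 37},
+}
+
+# Position 34 is dropped for all samples because DAY_22A falls below 10.
+kept = positions_passing_min_coverage(coverage, min_coverage=10)
+```
+
+Here `kept` contains only position 51, even though position 34 is well covered in two of the three samples.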
+
+
+### --quince-mode
+
+You can also set `--quince-mode`, which reports the variability data across all samples for each position reported (even if that position isn't variable in some samples). For example, if nucleotide position 34 of contig 1 was a SNV in one sample, the output would contain data for nucleotide position 34 for all of your samples.
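+
+The effect of this flag on the output can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the row layout and the function name are hypothetical, and filled-in rows are given a departure of zero as a placeholder rather than values read from a real [profile-db](/help/8/artifacts/profile-db).
+
+```python
+def apply_quince_mode(rows, samples):
+    """Make sure every reported position has one row per sample.
+
+    `rows` is a hypothetical list of dicts with the keys `sample`, `pos`,
+    and `departure_from_reference`.
+    """
+    seen = {(row["sample"], row["pos"]) for row in rows}
+    positions = sorted({row["pos"] for row in rows})
+    filled = list(rows)
+    for pos in positions:
+        for sample in samples:
+            if (sample, pos) not in seen:
+                # Placeholder row for a sample in which the position was not variable.
+                filled.append(
+                    {"sample": sample, "pos": pos, "departure_from_reference": 0.0}
+                )
+    return sorted(filled, key=lambda row: (row["pos"], row["sample"]))
+
+
+samples = ["DAY_17A", "DAY_18A", "DAY_22A"]
+rows = [{"sample": "DAY_17A", "pos": 34, "departure_from_reference": 0.4}]
+
+quince_rows = apply_quince_mode(rows, samples)
+```
+
+In this toy input, position 34 was variable only in `DAY_17A`, yet `quince_rows` holds three rows for it, one per sample.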
+
+### --kiefl-mode
+
+The default behavior is to report codon/amino-acid frequencies only at positions where variation was reported during profiling (which by default uses some heuristics to minimize the impact of error-driven variation). Fair enough, but for some diabolical cases, you may want to report _even_ invariant positions. When this flag is used, all positions are reported, regardless of whether they contained variation in any sample. The reference codon for all such entries is given a codon frequency of 1. All other entries (aka those with legitimate variation to be reported) remain unchanged. This flag can only be used with `--engine AA` or `--engine CDN` and is incompatible with `--quince-mode`.
+
+This flag was added in this [pull request](https://github.com/merenlab/anvio/pull/1794) where you can read about all of the tests that were performed to ensure this mode is behaving properly.
+
+### Adding additional information
+
+You can also ask the program to report the contig names, split names, and gene-level coverage statistics, which appear as additional columns in the output.
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-gen-variability-profile.md) to update this information.
+
+
+## Additional Resources
+
+
+* [All about SNVs, SCVs, and SAAVs](http://merenlab.org/2015/07/20/analyzing-variability/)
+
+* [This program in action in the anvi'o structure tutorial](http://merenlab.org/2018/09/04/getting-started-with-anvio-structure/#supplying-anvi-display-structure-with-sequence-variability)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-gen-variability-profile) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-gen-variability-profile/network.json b/help/8/programs/anvi-gen-variability-profile/network.json
new file mode 100644
index 00000000..357120cd
--- /dev/null
+++ b/help/8/programs/anvi-gen-variability-profile/network.json
@@ -0,0 +1,108 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "variability-profile-txt",
+ "name": "variability-profile-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "structure-db",
+ "name": "structure-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "variability-profile",
+ "name": "variability-profile",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "splits-txt",
+ "name": "splits-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 7,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-gen-variability-profile",
+ "name": "anvi-gen-variability-profile",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 7,
+ "target": 0
+ },
+ {
+ "target": 7,
+ "source": 1
+ },
+ {
+ "target": 7,
+ "source": 2
+ },
+ {
+ "target": 7,
+ "source": 3
+ },
+ {
+ "target": 7,
+ "source": 4
+ },
+ {
+ "target": 7,
+ "source": 5
+ },
+ {
+ "target": 7,
+ "source": 6
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-get-aa-counts/index.md b/help/8/programs/anvi-get-aa-counts/index.md
new file mode 100644
index 00000000..8ab1f95f
--- /dev/null
+++ b/help/8/programs/anvi-get-aa-counts/index.md
@@ -0,0 +1,114 @@
+---
+layout: program
+title: anvi-get-aa-counts
+excerpt: An anvi'o program. Fetches the number of times each amino acid occurs from a contigs database in a given bin, set of contigs, or set of genes.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-get-aa-counts
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Fetches the number of times each amino acid occurs from a contigs database in a given bin, set of contigs, or set of genes.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[splits-txt](../../artifacts/splits-txt) [contigs-db](../../artifacts/contigs-db) [profile-db](../../artifacts/profile-db) [collection](../../artifacts/collection)
+
+
+## Can provide
+
+
+[aa-frequencies-txt](../../artifacts/aa-frequencies-txt)
+
+
+## Usage
+
+
+Similarly to [anvi-get-codon-frequencies](/help/8/programs/anvi-get-codon-frequencies), this program counts the number of times each amino acid occurs in a given sequence, whether that's a [collection](/help/8/artifacts/collection), [bin](/help/8/artifacts/bin), set of contigs (listed in a [splits-txt](/help/8/artifacts/splits-txt)), or a set of genes. The output of this is an [aa-frequencies-txt](/help/8/artifacts/aa-frequencies-txt).
+
+You can count amino acid frequencies in four kinds of input:
+* All of the contigs in a [contigs-db](/help/8/artifacts/contigs-db)
+* A series of [bin](/help/8/artifacts/bin)s
+* A list of contigs
+* A list of genes
+
+Examples for each are below.
+
+### Option 1: all contigs in a contigs-db
+
+To count the amino acids in all of the contigs in a [contigs-db](/help/8/artifacts/contigs-db), you can just provide the [contigs-db](/help/8/artifacts/contigs-db) of interest, like so:
+
+
+anvi-get-aa-counts -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/[aa-frequencies-txt](/help/8/artifacts/aa-frequencies-txt)
+
+
+### Option 2: a series of bins in a collection
+
+To count the amino acid frequencies for a series of [bin](/help/8/artifacts/bin)s, you'll need to provide three additional parameters: the [profile-db](/help/8/artifacts/profile-db) that you used for binning, the [collection](/help/8/artifacts/collection) that your bins are contained in, and a text file that describes which bins you are interested in. This text file should have only one bin ID per line.
+
+So, your run would look something like this:
+
+
+anvi-get-aa-counts -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/[aa-frequencies-txt](/help/8/artifacts/aa-frequencies-txt) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ -B my_favorite_bins.txt
+
+
+`my_favorite_bins.txt` would look something like this:
+
+ bin_00001
+ bin_00004
+
+### Option 3: a list of contigs
+
+Just provide a [splits-txt](/help/8/artifacts/splits-txt) file that lists the contigs you want to look at.
+
+
+anvi-get-aa-counts -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/[aa-frequencies-txt](/help/8/artifacts/aa-frequencies-txt) \
+ --contigs-of-interest [splits-txt](/help/8/artifacts/splits-txt)
+
+
+### Option 4: a list of genes
+
+Just provide a list of gene caller ids, straight into the terminal, like so:
+
+
+anvi-get-aa-counts -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/[aa-frequencies-txt](/help/8/artifacts/aa-frequencies-txt) \
+ --gene-caller-ids gene_1,gene_2,gene_3
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-get-aa-counts.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-get-aa-counts) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-get-aa-counts/network.json b/help/8/programs/anvi-get-aa-counts/network.json
new file mode 100644
index 00000000..fd153d1f
--- /dev/null
+++ b/help/8/programs/anvi-get-aa-counts/network.json
@@ -0,0 +1,82 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "aa-frequencies-txt",
+ "name": "aa-frequencies-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "splits-txt",
+ "name": "splits-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-get-aa-counts",
+ "name": "anvi-get-aa-counts",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 5,
+ "target": 0
+ },
+ {
+ "target": 5,
+ "source": 1
+ },
+ {
+ "target": 5,
+ "source": 2
+ },
+ {
+ "target": 5,
+ "source": 3
+ },
+ {
+ "target": 5,
+ "source": 4
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-get-codon-frequencies/index.md b/help/8/programs/anvi-get-codon-frequencies/index.md
new file mode 100644
index 00000000..435e00a4
--- /dev/null
+++ b/help/8/programs/anvi-get-codon-frequencies/index.md
@@ -0,0 +1,242 @@
+---
+layout: program
+title: anvi-get-codon-frequencies
+excerpt: An anvi'o program. Get codon or amino acid frequency statistics from genomes, genes, and functions.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-get-codon-frequencies
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Get codon or amino acid frequency statistics from genomes, genes, and functions.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [profile-db](../../artifacts/profile-db) [collection](../../artifacts/collection) [bin](../../artifacts/bin) [internal-genomes](../../artifacts/internal-genomes) [external-genomes](../../artifacts/external-genomes)
+
+
+## Can provide
+
+
+[codon-frequencies-txt](../../artifacts/codon-frequencies-txt) [aa-frequencies-txt](../../artifacts/aa-frequencies-txt)
+
+
+## Usage
+
+
+This program **calculates codon or amino acid frequencies from genes or functions**.
+
+A range of options allows calculation of different frequency statistics. This program is "maximalist," in that it has many options that do the equivalent of a couple extra commands in R or pandas -- because we (not you) tend to be lazy and prone to mistakes.
+
+## Basic commands
+
+### Gene frequencies
+
+This command produces a table of codon frequencies from coding sequences in the contigs database. The first column of the table contains gene caller IDs and subsequent columns contain frequency data. The decoded amino acid is included in each codon column name with the flag, `--header-amino-acids`.
+
+
+anvi-get-codon-frequencies -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/output.txt \
+ --header-amino-acids
+
+
+### Function frequencies
+
+This command produces a table of function frequencies rather than gene frequencies. By using `--function-sources` without any arguments, the output will include every [functions](/help/8/artifacts/functions) source available in a given [contigs-db](/help/8/artifacts/contigs-db), e.g., `KOfam`, `KEGG_BRITE`, `Pfam` (you can always see the complete list of [functions](/help/8/artifacts/functions) in *your* [contigs-db](/help/8/artifacts/contigs-db) by running the program [anvi-db-info](/help/8/programs/anvi-db-info) on it). The first four columns of the table before frequency data contain, respectively, gene caller IDs, function sources, accessions, and names.
+
+
+anvi-get-codon-frequencies -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --function-sources \
+ --function-table-output path/to/function_output.txt
+
+
+### Gene frequencies with function information
+
+In contrast to the previous example, this command produces a table of gene frequencies, but has an entry for every gene/function pair, allowing statistical interrogation of the gene components of functions. The function table output is derived from this table by grouping rows by function source, retaining only one row per gene caller ID, and summing frequencies across rows of the groups.
+
+
+anvi-get-codon-frequencies -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --function-sources \
+ --gene-table-output path/to/gene_output.txt
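+
+The relationship between the gene table and the function table described above can be sketched in a few lines of Python. This is a hedged illustration, not the program's code: the row layout is hypothetical, and grouping by the full (source, accession, name) triple is our reading of the description.
+
+```python
+from collections import Counter, defaultdict
+
+
+def function_table(gene_rows):
+    """Collapse gene/function rows into one row of summed counts per function.
+
+    `gene_rows` is an iterable of (gene_id, source, accession, name, codon_counts).
+    """
+    seen = set()
+    totals = defaultdict(Counter)
+    for gene_id, source, accession, name, counts in gene_rows:
+        key = (source, accession, name)
+        if (key, gene_id) in seen:  # keep only one row per gene within a function
+            continue
+        seen.add((key, gene_id))
+        totals[key].update(counts)  # add this gene's codon counts to the function
+    return {key: dict(counter) for key, counter in totals.items()}
+
+
+rows = [
+    (0, "KOfam", "K00001", "hypothetical name", {"GCC": 2, "GCT": 1}),
+    (1, "KOfam", "K00001", "hypothetical name", {"GCC": 1}),
+    (1, "KOfam", "K00001", "hypothetical name", {"GCC": 1}),  # duplicate gene row
+]
+print(function_table(rows))
+```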
+
+
+### Codon frequencies from multiple internal and external genomes
+
+This command produces a table of codon frequencies from coding sequences in multiple genomes. A column is added at the beginning of the table for genome name.
+
+
+anvi-get-codon-frequencies -i [internal-genomes](/help/8/artifacts/internal-genomes) \
+ -e [external-genomes](/help/8/artifacts/external-genomes) \
+ -o path/to/output.txt
+
+
+## Option examples
+
+The following tables show the options to get the requested results.
+
+### _Different_ frequency statistics
+
+| Get | Options |
+| --- | ------- |
+| Codon absolute frequencies | |
+| Codon relative frequencies | `--relative` |
+| [Synonymous (per-amino acid) codon relative frequencies](#synonymous-codon-relative-frequencies) | `--synonymous` |
+| Amino acid frequencies | `--amino-acid` |
+| Amino acid relative frequencies | `--amino-acid --relative` |
+| [Summed frequencies across genes](#frequencies-across-genes) | `--sum` |
+| [Synonymous relative summed frequencies across genes](#frequencies-across-genes) | `--sum --synonymous` |
+| [Summed frequencies across genes annotated by each function source](#frequencies-across-genes) | `--sum --function-sources` |
+| [Relative summed frequencies across genes with KOfam annotations](#frequencies-across-genes) | `--sum --relative --function-sources KOfam` |
+| [Average frequencies across all genes](#frequencies-across-genes) | `--average` |
+
+### Frequencies from _sets of genes with shared functions_
+
+| Get | Options |
+| --- | ------- |
+| All function annotation sources | `--function-sources` |
+| [All KEGG BRITE categories](#brite-hierarchies) | `--function-sources KEGG_BRITE` |
+| All KEGG KOfams and all Pfams | `--function-sources KOfam Pfam` |
+| [Certain KEGG BRITE categories](#brite-hierarchies) | `--function-sources KEGG_BRITE --function-names Ribosome Ribosome>>>Ribosomal proteins` |
+| [Certain KEGG KOfam accessions](#inputs) | `--function-sources KOfam --function-accessions K00001 K00002` |
+| [Certain BRITE categories and KOfam accessions](#inputs) | `--select-functions-txt path/to/select_functions.txt` |
+
+### Frequencies from _selections of genes_
+
+| Get | Options |
+| --- | ------- |
+| From contigs database | `--contigs-db path/to/contigs.db` |
+| From collection of internal genomes | `--contigs-db path/to/contigs.db --profile-db path/to/profile.db --collection-name my_bins` |
+| From internal genome | `--contigs-db path/to/contigs.db --profile-db path/to/profile.db --collection-name my_bins --bin-id my_bin` |
+| From internal genomes listed in a file | `--internal-genomes path/to/genomes.txt` |
+| From external genomes (contigs databases) listed in a file | `--external-genomes path/to/genomes.txt` |
+| With certain gene IDs | `--gene-caller-ids 0 2 500` |
+| With certain gene IDs or genes annotated with certain KOfams | `--gene-caller-ids 0 2 500 --function-sources KOfam --function-accessions K00001` |
+
+### _Filtering genes and codons_ that are analyzed and reported
+
+| Get | Options |
+| --- | ------- |
+| [Exclude genes shorter than 300 codons](#gene-length-and-codon-count) | `--gene-min-codons 300` |
+| [Exclude genes shorter than 300 codons from contributing to function codon frequencies](#function-codon-count) | `--gene-min-codons 300 --function-sources` |
+| [Exclude functions with <300 codons](#function-codon-count) | `--function-min-codons 300` |
+| [Exclude stop codons and single-codon amino acids](#codons) | `--exclude-amino-acids STP Met Trp` |
+| [Only include certain codons](#codons) | `--include-amino-acids Leu Ile` |
+| [Exclude codons for amino acids with <5 codons in >90% of genes](#codons) | `--pansequence-min-amino-acids 5 0.9` |
+| [Replace codons for amino acids with <5 codons in the gene or function with NaN](#codons) | `--sequence-min-amino-acids 5` |
+
+## Option details
+
+### Synonymous codon relative frequencies
+
+This flag returns the relative frequency of each codon among the codons encoding the same amino acid, e.g., 0.4 GCC and 0.6 GCT for Ala. By default, stop codons and single-codon amino acids (Met ATG and Trp TGG) in the standard translation table are excluded, equivalent to using `--exclude-amino-acids STP Met Trp` for other frequency statistics.
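+
+To make the arithmetic concrete, here is a minimal Python sketch of the calculation. It assumes a hypothetical dictionary of codon counts for one gene and uses only a partial codon table; it is not the code the program runs.
+
+```python
+from collections import defaultdict
+
+# Partial codon table for illustration only (standard genetic code).
+CODON_TO_AA = {
+    "GCA": "Ala", "GCC": "Ala", "GCG": "Ala", "GCT": "Ala",
+    "AAC": "Asn", "AAT": "Asn",
+}
+
+
+def synonymous_relative_frequencies(codon_counts):
+    """Divide each codon count by the total count of codons for the same amino acid."""
+    aa_totals = defaultdict(int)
+    for codon, count in codon_counts.items():
+        aa_totals[CODON_TO_AA[codon]] += count
+    return {
+        codon: count / aa_totals[CODON_TO_AA[codon]]
+        for codon, count in codon_counts.items()
+        if aa_totals[CODON_TO_AA[codon]] > 0
+    }
+
+
+freqs = synonymous_relative_frequencies({"GCC": 4, "GCT": 6, "AAC": 1, "AAT": 1})
+print(freqs["GCC"], freqs["GCT"])  # 0.4 0.6
+```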
+
+### Frequencies across genes
+
+`--sum` and `--average` produce a table with a single row of frequencies from across genes. For example, the following command sums the codon frequencies of each decoded amino acid (and STP) across all genes, and then calculates the relative frequencies of the amino acids.
+
+
+anvi-get-codon-frequencies -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/output_table.txt \
+ --sum \
+ --amino-acid \
+ --relative
+
+
+The first column of the output table has the header, 'gene_caller_ids', and the value, 'all', indicating that the data is aggregated across genes.
+
+`--sum` and `--average` operate on genes. When used with a function option, the program subsets the genes annotated by the functions of interest. With `--average`, it calculates the average frequency across genes rather than functions (sums of genes with functional annotation). For example, the following command calculates the average synonymous relative frequency across genes annotated by `KOfam`.
+
+
+anvi-get-codon-frequencies -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/output_table.txt \
+ --average \
+ --synonymous \
+ --function-sources KOfam
+
+
+### Functions
+
+Functions and function annotation sources can be provided to subset genes (as seen in the [last section](#frequencies-across-genes) with `--average`) and to calculate statistics for functions in addition to genes (as seen in a [previous example](#function-frequencies)).
+
+Using `--output-file` is equivalent to `--gene-table-output` rather than `--function-table-output`, producing rows containing frequencies for annotated genes rather than summed frequencies for functions.
+
+#### Inputs
+
+There are multiple options to define which functions and sources should be used. `--function-sources` without arguments uses all available sources that had been used to annotate genes.
+
+`--function-accessions` and `--function-names` select functions from a single provided source. The following example uses both options to select COG functions.
+
+
+anvi-get-codon-frequencies -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/output_table.txt \
+ --function-sources COG14_FUNCTION \
+ --function-accessions COG0004 COG0005 \
+ --function-names "Ammonia channel protein AmtB" "Purine nucleoside phosphorylase"
+
+
+To use different functions from different sources, a tab-delimited file can be provided to `--select-functions-txt`. This headerless file must have three columns, for source, accession, and name of functions, respectively, with an entry in each row for source.
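+
+If you want to generate such a file programmatically, a sketch like the one below may help. The helper name is hypothetical, the rows reuse accessions and names from the examples on this page, and leaving the accession or name column empty is an assumption about what the program accepts.
+
+```python
+import csv
+import io
+
+
+def format_select_functions(rows):
+    """Format (source, accession, name) rows as headerless tab-delimited text."""
+    buffer = io.StringIO()
+    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
+    writer.writerows(rows)
+    return buffer.getvalue()
+
+
+rows = [
+    ("KOfam", "K00001", ""),                              # selected by accession
+    ("KEGG_BRITE", "", "Ribosome>>>Ribosomal proteins"),  # selected by name
+]
+text = format_select_functions(rows)
+print(text, end="")
+```
+
+Writing `text` to a file then gives something that can be passed on the command line.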
+
+By default, selected function accessions or names do not need to be present in the input genomes; the program will return data for any selected function accessions or names that annotated genes. This behavior can be changed using the flag, `--expect-functions`, so that the program will throw an error when any of the selected accessions or names are absent.
+
+#### BRITE hierarchies
+
+Genes are classified in KEGG BRITE functional hierarchies by [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams). For example, a bacterial SSU ribosomal protein is classified in a hierarchy of ribosomal genes, `Ribosome>>>Ribosomal proteins>>>Bacteria>>>Small subunit`. Codon frequencies can be calculated for genes classified at each level of the hierarchy, from the most general, those genes in the `Ribosome`, to the most specific -- in the example, those genes in `Ribosome>>>Ribosomal proteins>>>Bacteria>>>Small subunit`. Therefore, the following command returns summed codon frequencies for each annotated hierarchy level -- in the example, the output would include four rows for the genes in each level from `Ribosome` to `Small subunit`.
+
+
+anvi-get-codon-frequencies -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/output_table.txt \
+ --function-sources KEGG_BRITE
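+
+The "four rows" in the example follow directly from how the hierarchy string is built. Here is a small Python sketch (the helper is hypothetical and only splits the string; it does not query any database) that lists the levels a gene with that classification would contribute to:
+
+```python
+def brite_levels(hierarchy, separator=">>>"):
+    """Return every level of a BRITE hierarchy string, from general to specific."""
+    parts = hierarchy.split(separator)
+    return [separator.join(parts[:i]) for i in range(1, len(parts) + 1)]
+
+
+for level in brite_levels("Ribosome>>>Ribosomal proteins>>>Bacteria>>>Small subunit"):
+    print(level)
+```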
+
+
+### Filter genes and codons
+
+#### Codons
+
+It may be useful to restrict codons in the analysis to those encoding certain amino acids. Stop codons and the single codons encoding Met and Trp are excluded by default from calculation of synonymous codon relative frequencies (`--synonymous`). Relative frequencies across codons in a gene (`--relative`) are calculated for the selected amino acids, so the following option would return a table of codon frequencies relative to the codons encoding the selected nonpolar amino acids: `--include-amino-acids Gly Ala Val Leu Met Ile`.
+
+Dynamic exclusion of amino acids can be useful in the calculation of synonymous codon frequencies. For example, 0.5 AAT and 0.5 AAC for Asn may be statistically insignificant for a gene with 1 AAT and 1 AAC; even more meaningless would be 1.0 AAT and 0.0 AAC for a gene with 1 AAT and 0 AAC. `--pansequence-min-amino-acids` removes rarer amino acids across the dataset, setting a minimum number of codons in a minimum number of genes to retain the amino acid. For example, amino acids with <5 codons in >90% of genes will be excluded from the analysis with `--pansequence-min-amino-acids 5 0.9`.
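+
+The rule behind `--pansequence-min-amino-acids 5 0.9` can be sketched as follows. This is a simplified illustration under stated assumptions: the input is a hypothetical list of per-gene amino acid counts, and the strict inequalities mirror the "<5 codons in >90% of genes" wording rather than the program's source.
+
+```python
+def excluded_amino_acids(aa_counts_per_gene, min_codons, min_gene_fraction):
+    """Find amino acids that are rare in too large a fraction of genes.
+
+    `aa_counts_per_gene` is a list of dicts mapping amino acid -> codon count.
+    """
+    n_genes = len(aa_counts_per_gene)
+    amino_acids = set().union(*aa_counts_per_gene)
+    excluded = set()
+    for aa in amino_acids:
+        # Number of genes in which this amino acid has too few codons.
+        rare = sum(1 for gene in aa_counts_per_gene if gene.get(aa, 0) < min_codons)
+        if rare / n_genes > min_gene_fraction:
+            excluded.add(aa)
+    return excluded
+
+
+genes = [{"Cys": 1, "Leu": 20} for _ in range(10)]
+print(excluded_amino_acids(genes, min_codons=5, min_gene_fraction=0.9))  # {'Cys'}
+```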
+
+Codons for rarer amino acids within each gene or function row can be excluded in the results table (replaced by NaN) with `--sequence-min-amino-acids`. This parameter only affects how the results are displayed. For example, amino acids with <5 codons in each row will be discarded in the results table with `--sequence-min-amino-acids 5`.
+
+#### Gene length and codon count
+
+Removal of genes with few codons can improve the statistical utility of relative frequencies. `--gene-min-codons` sets the minimum number of codons required in a gene, and this filter can be applied before and/or after the removal of rarer codons. Applied before, `--gene-min-codons` filters genes by length; applied after, it filters genes by codons remaining after removing rarer codons. `--min-codon-filter` can take three possible arguments: `length`, `remaining`, or, by default when codons are removed, `both`, which applies the `--gene-min-codons` filter both before and after codon removal.
+
+It may seem redundant for `remaining` and `both` to both be possibilities, but this is due to the possibility of dynamic amino acid exclusion using `--pansequence-min-amino-acids`. Amino acids are removed based on their frequency in a proportion of genes, so removing shorter genes by length before removing amino acids can affect which amino acids are dynamically excluded.
+
+#### Function codon count
+
+`--function-min-codons` can be used to remove functions with fewer than a minimum number of codons. Function codon count filters occur after gene codon count filters: the set of genes contributing to function codon frequency can be restricted by applying `--gene-min-codons`.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-get-codon-frequencies.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-get-codon-frequencies) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-get-codon-frequencies/network.json b/help/8/programs/anvi-get-codon-frequencies/network.json
new file mode 100644
index 00000000..97b538be
--- /dev/null
+++ b/help/8/programs/anvi-get-codon-frequencies/network.json
@@ -0,0 +1,121 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "codon-frequencies-txt",
+ "name": "codon-frequencies-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "aa-frequencies-txt",
+ "name": "aa-frequencies-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "internal-genomes",
+ "name": "internal-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-genomes",
+ "name": "external-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 8,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-get-codon-frequencies",
+ "name": "anvi-get-codon-frequencies",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 8,
+ "target": 0
+ },
+ {
+ "source": 8,
+ "target": 1
+ },
+ {
+ "target": 8,
+ "source": 2
+ },
+ {
+ "target": 8,
+ "source": 3
+ },
+ {
+ "target": 8,
+ "source": 4
+ },
+ {
+ "target": 8,
+ "source": 5
+ },
+ {
+ "target": 8,
+ "source": 6
+ },
+ {
+ "target": 8,
+ "source": 7
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-get-codon-usage-bias/index.md b/help/8/programs/anvi-get-codon-usage-bias/index.md
new file mode 100644
index 00000000..f47c0c98
--- /dev/null
+++ b/help/8/programs/anvi-get-codon-usage-bias/index.md
@@ -0,0 +1,254 @@
+---
+layout: program
+title: anvi-get-codon-usage-bias
+excerpt: An anvi'o program. Get codon usage bias (CUB) statistics of genes and functions.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-get-codon-usage-bias
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Get codon usage bias (CUB) statistics of genes and functions.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [profile-db](../../artifacts/profile-db) [collection](../../artifacts/collection) [bin](../../artifacts/bin) [internal-genomes](../../artifacts/internal-genomes) [external-genomes](../../artifacts/external-genomes)
+
+
+## Can provide
+
+
+This program does not seem to provide any artifacts. Such programs usually print out some information for you to see or alter some anvi'o artifacts without producing any immediate outputs.
+
+
+## Usage
+
+
+This program **calculates codon usage bias (CUB) among genes or functions**.
+
+A range of options allows control over the CUB calculation to remove statistically spurious results.
+
+Some CUB metrics depend on a reference codon composition while others are reference-independent. The reference often constitutes expected high-expression genes, such as ribosomal proteins. This program allows customization of the reference. It also introduces an "omnibias" mode, in which reference compositions are determined from every gene (or function), and CUB is calculated for every combination of query and reference gene. This produces a square distance-like matrix of CUB values that can be used to cluster genes or functions by their biases relative to one another.
+
+## Basic commands
+
+### CUB of genes
+
+This command produces a table of CUB values from coding sequences in the contigs database. The first column of the table contains gene caller IDs and each subsequent column contains values for a CUB metric, e.g., the Codon Adaptation Index (CAI) of [Sharp and Li, 1987](https://academic.oup.com/nar/article-abstract/15/3/1281/1166844?redirectedFrom=fulltext) and 𝛿 of [Ran and Higgs, 2012](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0051652). CUB metrics that rely upon a reference codon composition, such as CAI and 𝛿, establish this composition from genes in the genome that are annotated as ribosomal proteins by KEGG KOfams/BRITE. Certain parameters have default values to increase the statistical significance of results (`--query-min-analyzed-codons`, `--reference-exclude-amino-acid-count`, `--reference-min-analyzed-codons`).
+
+
+anvi-get-codon-usage-bias -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/output.txt
+
+
+### CUB of functions
+
+This command produces a table of CUB values for functions rather than genes. By using `--function-sources` without any arguments, the output will include every [functions](/help/8/artifacts/functions) source available in a given [contigs-db](/help/8/artifacts/contigs-db), e.g., `KOfam`, `KEGG_BRITE`, `Pfam` (you can always see the complete list of [functions](/help/8/artifacts/functions) in *your* [contigs-db](/help/8/artifacts/contigs-db) by running the program [anvi-db-info](/help/8/programs/anvi-db-info) on it). The first four columns of the table before CUB values contain, respectively, gene caller IDs, function sources, accessions, and names.
+
+
+anvi-get-codon-usage-bias -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/output.txt \
+ --function-sources
+
+
+### Custom CUB reference gene set
+
+This command establishes a CUB reference codon composition from genes with functional annotations defined in a supplemental file. This file should be headerless, tab-delimited, and have three columns of function annotation sources (required), function accessions, and function names (either an accession or name is required).
+
+
+anvi-get-codon-usage-bias -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/output.txt \
+ --select-reference-functions-txt path/to/select-reference-functions.txt
+
+
+### "Omnibias" CUB
+
+In reference-dependent CUB metrics, genes or functions are compared to a reference codon composition, such as from ribosomal proteins. "Omnibias" mode compares each gene/function "query" not to a single reference, but to each other gene/function. This generates a square distance-like matrix of CUB values that can be used to cluster genes/functions by their biases relative to one another. The output path given in the command is modified to create an output file of the omnibias matrix for each CUB metric.
+
+
+anvi-get-codon-usage-bias -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/output.txt \
+ --omnibias
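+
+To picture the shape of the omnibias output, here is a hedged Python sketch. The metric is a deliberately simple stand-in (total absolute difference in relative codon frequencies), not CAI, 𝛿, or any metric the program implements, and all names are hypothetical.
+
+```python
+def relative(counts):
+    """Convert absolute codon counts into relative frequencies."""
+    total = sum(counts.values())
+    return {codon: n / total for codon, n in counts.items()}
+
+
+def placeholder_bias(query, reference):
+    """Stand-in metric: total absolute difference in relative codon frequencies."""
+    q, r = relative(query), relative(reference)
+    return sum(abs(q.get(c, 0.0) - r.get(c, 0.0)) for c in set(q) | set(r))
+
+
+def omnibias_matrix(codon_counts_by_gene, metric=placeholder_bias):
+    """Score every gene as a query against every gene as a reference."""
+    genes = sorted(codon_counts_by_gene)
+    return {
+        query: {
+            ref: metric(codon_counts_by_gene[query], codon_counts_by_gene[ref])
+            for ref in genes
+        }
+        for query in genes
+    }
+
+
+counts = {1: {"GCC": 4, "GCT": 6}, 2: {"GCC": 5, "GCT": 5}}
+matrix = omnibias_matrix(counts)
+print(matrix[1][1])  # 0.0, a gene compared to itself
+```
+
+The result has one row and one column per gene, which is what makes it usable as input for clustering.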
+
+
+### CUB from multiple internal and external genomes
+
+This command produces a CUB table per genome, modifying the output path given in the command to create a separate output file per genome. A collection of genomes can be provided as [internal-genomes](/help/8/artifacts/internal-genomes), [external-genomes](/help/8/artifacts/external-genomes), or both:
+
+
+anvi-get-codon-usage-bias -i [internal-genomes](/help/8/artifacts/internal-genomes) \
+ -e [external-genomes](/help/8/artifacts/external-genomes) \
+ -o path/to/output.txt
+
+
+## Option examples
+
+The following tables show the options to get the requested results.
+
+### Selecting CUB _metrics and references_
+
+| Get | Options |
+| --- | ------- |
+| With certain, rather than all, CUB metrics | `--metrics cai` |
+| With query sequences also used as references | `--omnibias` |
+| [With a custom reference defined by functional annotations](#reference-functions) | `--select-reference-functions-txt path/to/select_reference_functions.txt` |
+| [With a custom reference defined by certain genes](#reference-genes) | `--reference-gene-caller-ids 0 2 500` |
+
+### CUB from _sets of genes with shared functions_
+
+| Get | Options |
+| --- | ------- |
+| All function annotation sources | `--function-sources` |
+| [All KEGG BRITE categories](#brite-hierarchies) | `--function-sources KEGG_BRITE` |
+| All KEGG KOfams and all Pfams | `--function-sources KOfam Pfam` |
+| [Certain KEGG BRITE categories](#brite-hierarchies) | `--function-sources KEGG_BRITE --function-names Ribosome Ribosome>>>Ribosomal proteins` |
+| [Certain KEGG KOfam accessions](#inputs) | `--function-sources KOfam --function-accessions K00001 K00002` |
+| [Certain BRITE categories and KOfam accessions](#inputs) | `--select-functions-txt path/to/select_functions.txt` |
+
+### CUB from _selections of genes_
+
+| Get | Options |
+| --- | ------- |
+| From contigs database | `--contigs-db path/to/contigs.db` |
+| From collection of internal genomes | `--contigs-db path/to/contigs.db --profile-db path/to/profile.db --collection-name my_bins` |
+| From internal genome | `--contigs-db path/to/contigs.db --profile-db path/to/profile.db --collection-name my_bins --bin-id my_bin` |
+| From internal genomes listed in a file | `--internal-genomes path/to/genomes.txt` |
+| From external genomes (contigs databases) listed in a file | `--external-genomes path/to/genomes.txt` |
+| With certain gene IDs | `--gene-caller-ids 0 2 500` |
+| With certain gene IDs or genes annotated with certain KOfams | `--gene-caller-ids 0 2 500 --function-sources KOfam --function-accessions K00001` |
+
+### _Filtering genes and codons_ in analyzed and reported queries
+
+| Get | Options |
+| --- | ------- |
+| [Exclude genes shorter than 300 codons from queries](#gene-length-and-codon-count) | `--gene-min-codons 300` |
+| [Exclude genes shorter than 300 codons from contributing to function queries](#function-codon-count) | `--gene-min-codons 300 --function-sources` |
+| [Exclude function queries with <300 codons](#function-codon-count) | `--function-min-codons 300` |
+| [Exclude stop codons and single-codon amino acids](#codons) | `--exclude-amino-acids STP Met Trp` |
+| [Exclude codons for amino acids with <5 codons in >90% of genes](#codons) | `--pansequence-min-amino-acids 5 0.9` |
+| [Replace codons for amino acids with <5 codons in the gene or function with NaN](#codons) | `--sequence-min-amino-acids 5` |
+| [Exclude queries with <300 codons involved in the CUB calculation](#analyzed-query-codon-count) | `--query-min-analyzed-codons 300` |
+
+### _Filtering genes and codons_ in analyzed references
+
+| Get | Options |
+| --- | ------- |
+| Codons with a frequency <20 are excluded from the reference and CUB calculation | `--reference-exclude-amino-acid-count 20` |
+| Exclude references with <300 codons involved in the CUB calculation | `--reference-min-analyzed-codons 300` |
+
+## Option details
+
+### Functions
+
+Functions and function annotation sources (e.g., 'KOfam', 'Pfam') can be provided to calculate CUB for functions rather than genes.
+
+#### Inputs
+
+There are multiple options to define which functions and sources should be used as CUB queries. `--function-sources` without arguments uses all available sources that were used to annotate genes.
+
+`--function-accessions` and `--function-names` select functions from a single provided source. The following example uses both options to select COG functions.
+
+
+anvi-get-codon-usage-bias -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/output_table.txt \
+ --function-sources COG14_FUNCTION \
+ --function-accessions COG0004 COG0005 \
+ --function-names "Ammonia channel protein AmtB" "Purine nucleoside phosphorylase"
+
+
+To use different functions from different sources, a tab-delimited file can be provided to `--select-functions-txt`. This headerless file must have three columns, for source, accession, and name of functions, respectively, and every row must have an entry for source.
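A minimal sketch of how such a file could be generated, assuming that a column left blank is written as an empty field between tabs (the text above only requires a source in every row). The selections in `rows` and the helper name `format_select_functions` are hypothetical and not part of anvi'o.

```python
import csv
import io

# Hypothetical selections: (source, accession, name). An empty string leaves a column blank.
rows = [
    ("KOfam", "K00001", ""),
    ("KEGG_BRITE", "", "Ribosome>>>Ribosomal proteins"),
    ("COG14_FUNCTION", "COG0004", "Ammonia channel protein AmtB"),
]


def format_select_functions(rows):
    """Return headerless, tab-delimited text with source, accession, and name columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for source, accession, name in rows:
        if not source:
            raise ValueError("every row needs a function annotation source")
        writer.writerow([source, accession, name])
    return buffer.getvalue()


table_text = format_select_functions(rows)
print(table_text, end="")
```

Saving `table_text` to a file gives something that could be passed to `--select-functions-txt`; whether anvi'o accepts blank columns exactly as written here should be checked against the program itself.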
+
+By default, selected function accessions or names do not need to be present in input genomes; the program will query any selected function accessions or names that annotate genes. This behavior can be changed using the flag `--expect-functions`, so that the program will throw an error when any of the selected accessions or names are absent.
+
+#### BRITE hierarchies
+
+Genes are classified in KEGG BRITE functional hierarchies by [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams). For example, a bacterial SSU ribosomal protein is classified in a hierarchy of ribosomal genes, `Ribosome>>>Ribosomal proteins>>>Bacteria>>>Small subunit`. CUB can be calculated for function queries at each level of the hierarchy, from the most general, those genes in the `Ribosome`, to the most specific -- in the example, those genes in `Ribosome>>>Ribosomal proteins>>>Bacteria>>>Small subunit`. Therefore, the following command returns CUB values for each annotated hierarchy level -- in the example, the output would include four rows for the genes in each level from `Ribosome` to `Small subunit`.
+
+
+anvi-get-codon-usage-bias -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/output_table.txt \
+ --function-sources KEGG_BRITE
+
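To make the "one query per hierarchy level" idea concrete, here is a small sketch that expands a BRITE hierarchy string into its nested levels. It only illustrates the naming scheme described above; the function `brite_levels` is hypothetical and is not how anvi'o itself builds its queries.

```python
def brite_levels(hierarchy, separator=">>>"):
    """Return every level of a BRITE hierarchy, from most general to most specific."""
    parts = hierarchy.split(separator)
    return [separator.join(parts[: depth + 1]) for depth in range(len(parts))]


levels = brite_levels("Ribosome>>>Ribosomal proteins>>>Bacteria>>>Small subunit")
for level in levels:
    print(level)
```

For the example gene this gives four levels, matching the four output rows mentioned above.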
+
+### Custom reference
+
+A custom reference codon composition for reference-dependent CUB calculations can be defined by functional annotations and/or gene IDs. By default, the reference codon composition is defined by the concatenation of all genes in the genome annotated as ribosomal proteins by KEGG KOfams/BRITE.
+
+#### Reference functions
+
+Use `--select-reference-functions-txt` to provide a table of functional annotations defining a custom set of reference genes. The tab-delimited file should not have a header, but should have three columns of 1) function annotation sources, such as 'KEGG_BRITE', 'KOfam', or 'Pfam', 2) function accession, such as 'K00001' for KOfam, and 3) function name, such as 'Ribosome>>>Ribosomal proteins' for KEGG_BRITE. Every row must contain a source and either a function accession or name.
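The row rule above (a source plus either an accession or a name) can be checked before running the program. The following is a sketch only: `check_reference_functions` is a hypothetical helper, and treating a missing trailing column as empty is an assumption rather than documented anvi'o behavior.

```python
import csv
import io


def check_reference_functions(text):
    """Parse headerless, tab-delimited rows and enforce the source/accession/name rule."""
    entries = []
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    for line_number, fields in enumerate(reader, start=1):
        if not fields:
            continue  # skip blank lines
        # Pad short rows so that a missing trailing column reads as empty.
        source, accession, name = (fields + ["", ""])[:3]
        if not source.strip():
            raise ValueError(f"line {line_number}: a source is required")
        if not accession.strip() and not name.strip():
            raise ValueError(f"line {line_number}: give an accession or a name")
        entries.append((source, accession, name))
    return entries


example = "KOfam\tK00001\t\nKEGG_BRITE\t\tRibosome>>>Ribosomal proteins\n"
entries = check_reference_functions(example)
print(entries)
```

A row such as `Pfam` followed by two empty columns would raise a `ValueError`, because it names a source but no function.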
+
+#### Require all functions be present
+
+Use `--expect-functions` to require that all functions that define the reference are found in the genome, else an error will be thrown. By default, not all provided functions need to be represented among the reference genes.
+
+#### Reference genes
+
+Use `--reference-gene-caller-ids` to select genes for the custom reference gene set. This only works when processing a single genome. An error will be thrown if not all of the provided gene caller IDs are present in the genome.
+
+### Filter genes and codons
+
+#### Codons
+
+It may be useful to restrict codons in the analysis to those encoding certain amino acids. Stop codons are excluded by default from CUB calculations. Codons encoding a single amino acid (Met and Trp) do not factor into CUB calculations. Example: exclude Ala, Arg, and stop codons with `--exclude-amino-acids Ala Arg STP`.
+
+Dynamic exclusion of amino acids can be useful in CUB calculations. For example, a query gene with 1 AAT and 1 AAC encoding Asn, or "synonymous relative frequencies" of 0.5 AAT and 0.5 AAC, has very little data to support comparison to the synonymous relative frequencies of a large number of Asn codons in a set of reference genes. A query with 1 AAT and 0 AAC, or synonymous relative frequencies of 1.0 AAT and 0.0 AAC, would be even less statistically meaningful. Reference-dependent CUB metrics, such as 𝛿, rely upon the ratio of synonymous relative codon frequencies in the query and reference, and so can be skewed for queries with small counts of various codons. `--pansequence-min-amino-acids` removes rarer amino acids across the dataset, setting a minimum number of codons in a minimum number of genes to retain the amino acid. For example, amino acids with <5 codons in >90% of genes will be excluded from the analysis with the arguments, `--pansequence-min-amino-acids 5 0.9`.
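The Asn example and the exclusion rule can be worked through in a few lines. This is an illustrative sketch, not anvi'o code: the two-entry codon table, the toy genes, and both function names are assumptions, and the strict comparisons (`<` and `>`) simply follow the wording "<5 codons in >90% of genes".

```python
from collections import Counter

# Only two amino acids are listed to keep the sketch short.
SYNONYMOUS_CODONS = {
    "Asn": ("AAT", "AAC"),
    "Lys": ("AAA", "AAG"),
}


def synonymous_relative_frequencies(codon_counts, amino_acid):
    """Frequency of each codon relative to all codons encoding the same amino acid."""
    codons = SYNONYMOUS_CODONS[amino_acid]
    total = sum(codon_counts.get(codon, 0) for codon in codons)
    if total == 0:
        return {codon: float("nan") for codon in codons}
    return {codon: codon_counts.get(codon, 0) / total for codon in codons}


def rare_amino_acids(per_gene_counts, min_codons, min_gene_fraction):
    """Amino acids with < min_codons codons in > min_gene_fraction of genes."""
    rare = []
    for amino_acid, codons in SYNONYMOUS_CODONS.items():
        genes_below = sum(
            1
            for counts in per_gene_counts
            if sum(counts.get(codon, 0) for codon in codons) < min_codons
        )
        if genes_below / len(per_gene_counts) > min_gene_fraction:
            rare.append(amino_acid)
    return rare


gene_a = Counter({"AAT": 1, "AAC": 1, "AAA": 6, "AAG": 4})
gene_b = Counter({"AAT": 1, "AAC": 0, "AAA": 3, "AAG": 7})

print(synonymous_relative_frequencies(gene_a, "Asn"))  # {'AAT': 0.5, 'AAC': 0.5}
print(synonymous_relative_frequencies(gene_b, "Asn"))  # {'AAT': 1.0, 'AAC': 0.0}
print(rare_amino_acids([gene_a, gene_b], min_codons=5, min_gene_fraction=0.9))  # ['Asn']
```

Both toy genes have fewer than five Asn codons, so Asn would be dropped under the thresholds `5 0.9`, while Lys, with ten codons in each gene, would be kept.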
+
+Codons for rarer amino acids within each gene or function query can be excluded from the CUB calculation with `--sequence-min-amino-acids`. For example, amino acids with <5 codons in a query will be excluded from the analysis with `--sequence-min-amino-acids 5`.
+
+#### Gene length and codon count
+
+Genes with fewer than a minimum number of codons can be ignored in the CUB analysis. `--gene-min-codons` sets the minimum number of codons required in a gene, and this filter can be applied before and/or after the removal of rarer codons. Applied before, `--gene-min-codons` filters genes by length; applied after, it filters genes by codons remaining after removing rarer codons. `--min-codon-filter` can take three possible arguments: `length`, `remaining`, or, by default when codons are removed, `both`, which applies the `--gene-min-codons` filter both before and after codon removal.
+
+It may seem redundant for `remaining` and `both` to both be possibilities, but this is due to the possibility of dynamic amino acid exclusion using `--pansequence-min-amino-acids`. Amino acids are removed based on their frequency in a proportion of genes, so removing shorter genes by length before removing amino acids can affect which amino acids are dynamically excluded.
+
+| Get | Options |
+| --- | ------- |
+| [Exclude genes shorter than 300 codons from queries](#gene-length-and-codon-count) | `--gene-min-codons 300` |
+| [Exclude genes shorter than 300 codons from contributing to function queries](#function-codon-count) | `--gene-min-codons 300 --function-sources` |
+| [Exclude function queries with <300 codons](#function-codon-count) | `--function-min-codons 300` |
+| [Exclude stop codons and single-codon amino acids](#codons) | `--exclude-amino-acids STP Met Trp` |
+| [Exclude codons for amino acids with <5 codons in >90% of genes](#codons) | `--pansequence-min-amino-acids 5 0.9` |
+| [Replace codons for amino acids with <5 codons in the gene or function with NaN](#codons) | `--sequence-min-amino-acids 5` |
+| [Exclude queries with <300 codons involved in the CUB calculation](#analyzed-query-codon-count) | `--query-min-analyzed-codons 300` |
+
+#### Function codon count
+
+Functions with fewer than a minimum number of codons can be ignored in the CUB analysis using `--function-min-codons`. Function codon count filters occur after gene codon count filters: the set of genes contributing to function codon frequency can be restricted by applying `--gene-min-codons`.
+
+#### Analyzed query codon count
+
+Filters removing codons from a gene or function query reduce the codon count factoring into the CUB calculation. `--query-min-analyzed-codons` ignores queries in the CUB calculation that have fewer than a minimum number of codons remaining. For example, `--query-min-analyzed-codons 300` ensures that after removing codons by `--exclude-amino-acids`, `--pansequence-min-amino-acids`, and/or `--sequence-min-amino-acids`, queries must have 300 codons involved in the CUB calculation.
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-get-codon-usage-bias.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-get-codon-usage-bias) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-get-codon-usage-bias/network.json b/help/8/programs/anvi-get-codon-usage-bias/network.json
new file mode 100644
index 00000000..0b12da6b
--- /dev/null
+++ b/help/8/programs/anvi-get-codon-usage-bias/network.json
@@ -0,0 +1,95 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "internal-genomes",
+ "name": "internal-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-genomes",
+ "name": "external-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 6,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-get-codon-usage-bias",
+ "name": "anvi-get-codon-usage-bias",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "target": 6,
+ "source": 0
+ },
+ {
+ "target": 6,
+ "source": 1
+ },
+ {
+ "target": 6,
+ "source": 2
+ },
+ {
+ "target": 6,
+ "source": 3
+ },
+ {
+ "target": 6,
+ "source": 4
+ },
+ {
+ "target": 6,
+ "source": 5
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-get-metabolic-model-file/index.md b/help/8/programs/anvi-get-metabolic-model-file/index.md
new file mode 100644
index 00000000..325281e1
--- /dev/null
+++ b/help/8/programs/anvi-get-metabolic-model-file/index.md
@@ -0,0 +1,87 @@
+---
+layout: program
+title: anvi-get-metabolic-model-file
+excerpt: An anvi'o program. This program writes a metabolic reaction network to a file suitable for flux balance analysis.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-get-metabolic-model-file
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+This program writes a metabolic reaction network to a file suitable for flux balance analysis.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [reaction-network](../../artifacts/reaction-network)
+
+
+## Can provide
+
+
+[reaction-network-json](../../artifacts/reaction-network-json)
+
+
+## Usage
+
+
+This program **exports a metabolic [reaction-network](/help/8/artifacts/reaction-network) from a [contigs-db](/help/8/artifacts/contigs-db) to a [reaction-network-json](/help/8/artifacts/reaction-network-json) file** suitable for inspection and flux balance analysis.
+
+The required input to this program is a [contigs-db](/help/8/artifacts/contigs-db) in which a [reaction-network](/help/8/artifacts/reaction-network) has been stored by [anvi-reaction-network](/help/8/programs/anvi-reaction-network).
+
+The [reaction-network-json](/help/8/artifacts/reaction-network-json) file output contains sections on the metabolites, reactions, and genes constituting the [reaction-network](/help/8/artifacts/reaction-network) that had been predicted from the genome. An "objective function" representing the biomass composition of metabolites in the ["core metabolism" of *E. coli*](http://bigg.ucsd.edu/models/e_coli_core) is automatically added as the first entry in the "reactions" section of the file and can be deleted as needed. An objective function is needed for flux balance analysis.
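As a sketch of what inspecting such a file and deleting the objective function might look like, the example below uses a tiny stand-in document. The top-level keys `metabolites` and `genes`, the entry layout, and all IDs are assumptions made for illustration; only the `reactions` section and the position of the objective as its first entry come from the description above.

```python
import json

# Toy stand-in for an exported file; the key names and IDs below are assumptions.
model_text = json.dumps(
    {
        "metabolites": [{"id": "atp_c"}, {"id": "adp_c"}],
        "reactions": [
            {"id": "objective_placeholder", "metabolites": {"atp_c": -1.0, "adp_c": 1.0}},
            {"id": "R00001", "metabolites": {"atp_c": -1.0}},
        ],
        "genes": [{"id": "0"}],
    }
)

model = json.loads(model_text)

# Count the entries in each section of the file.
section_sizes = {section: len(model[section]) for section in ("metabolites", "reactions", "genes")}
print(section_sizes)

# The objective function is described as the first entry of "reactions"; drop it if unwanted.
objective = model["reactions"].pop(0)
print(objective["id"], len(model["reactions"]))
```

With a real export, `model_text` would be read from the output file instead, and the modified `model` could be written back with `json.dump`.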
+
+## Usage
+
+[anvi-get-metabolic-model-file](/help/8/programs/anvi-get-metabolic-model-file) requires a [contigs-db](/help/8/artifacts/contigs-db) as input and the path to an output [reaction-network-json](/help/8/artifacts/reaction-network-json) file.
+
+
+anvi-get-metabolic-model-file -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o /path/to/output.json
+
+
+An existing file at the target output location must be explicitly overwritten with the `-W` flag.
+
+
+anvi-get-metabolic-model-file -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o /path/to/output.json \
+ -W
+
+
+The flag `--remove-missing-objective-metabolites` must be used to remove metabolites in the *E. coli* core biomass objective function from the output file if the metabolites are not produced or consumed by the predicted [reaction-network](/help/8/artifacts/reaction-network). [COBRApy](https://opencobra.github.io/cobrapy/), for instance, cannot load the JSON file if metabolites in the objective function are missing from the genomic model.
+
+
+anvi-get-metabolic-model-file -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o /path/to/output.json \
+ --remove-missing-objective-metabolites
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-get-metabolic-model-file.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-get-metabolic-model-file) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-get-metabolic-model-file/network.json b/help/8/programs/anvi-get-metabolic-model-file/network.json
new file mode 100644
index 00000000..786f3bc6
--- /dev/null
+++ b/help/8/programs/anvi-get-metabolic-model-file/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "reaction-network-json",
+ "name": "reaction-network-json",
+ "provided_by_anvio": true,
+ "type": "JSON"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "reaction-network",
+ "name": "reaction-network",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-get-metabolic-model-file",
+ "name": "anvi-get-metabolic-model-file",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-get-pn-ps-ratio/index.md b/help/8/programs/anvi-get-pn-ps-ratio/index.md
new file mode 100644
index 00000000..c2a28a8d
--- /dev/null
+++ b/help/8/programs/anvi-get-pn-ps-ratio/index.md
@@ -0,0 +1,85 @@
+---
+layout: program
+title: anvi-get-pn-ps-ratio
+excerpt: An anvi'o program. Calculate the rates of non-synonymous and synonymous polymorphism for genes across environments using the output of anvi-gen-variability-profile.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-get-pn-ps-ratio
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Calculate the rates of non-synonymous and synonymous polymorphism for genes across environments using the output of anvi-gen-variability-profile.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [variability-profile-txt](../../artifacts/variability-profile-txt)
+
+
+## Can provide
+
+
+[pn-ps-data](../../artifacts/pn-ps-data)
+
+
+## Usage
+
+
+This program **calculates the pN/pS ratio** for each gene in a [contigs-db](/help/8/artifacts/contigs-db) and outputs it as a [pn-ps-data](/help/8/artifacts/pn-ps-data) artifact.
+
+### What is the pN/pS ratio?
+
+The pN/pS ratio (first described in [Schloissnig et al. 2012](https://doi.org/10.1038/nature11711)) is the ratio of two rates: the rates of non-synonymous (pN) and synonymous (pS) **polymorphism**. It is analogous to dN/dS, the ratio of the rates of non-synonymous (dN) and synonymous (dS) **substitutions** between two strains. We calculate pN/pS from allele frequencies obtained through SCVs and SAAVs. See the study by [Kiefl et al. 2023](https://www.science.org/doi/10.1126/sciadv.abq4632) for additional information, and [this reproducible workflow](https://merenlab.org/data/anvio-structure/chapter-III/) associated with that study to see use cases.
+
+### How do I use this program?
+
+First, you will need to run [anvi-gen-variability-profile](/help/8/programs/anvi-gen-variability-profile) using the flag `--engine CDN` to get a [variability-profile-txt](/help/8/artifacts/variability-profile-txt) for SCVs (single codon variants), which we'll name `SCVs.txt` in this example.
+
+Then you can run this program like so:
+
+
+anvi-get-pn-ps-ratio -V SCVs.txt \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o output_dir
+
+
+A pN/pS value is calculated for each gene x sample combo. This will result in a directory called `output_dir` that contains several tables that describe each of your genes. See [pn-ps-data](/help/8/artifacts/pn-ps-data) for more information.
+
+### Other parameters
+
+This program has some default filtering choices that you should pay attention to. You can tune these filters with the following parameters:
+
+- The minimum departure from consensus for a variable position (`--min-departure-from-consensus`).
+- The minimum departure from reference for a variable position (`--min-departure-from-reference`).
+- The minimum number of SCVs in a grouping (`--minimum-num-variants`).
+- The minimum coverage at a variable position (`--min-coverage`).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-get-pn-ps-ratio.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-get-pn-ps-ratio) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-get-pn-ps-ratio/network.json b/help/8/programs/anvi-get-pn-ps-ratio/network.json
new file mode 100644
index 00000000..5b5945f1
--- /dev/null
+++ b/help/8/programs/anvi-get-pn-ps-ratio/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pn-ps-data",
+ "name": "pn-ps-data",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "variability-profile-txt",
+ "name": "variability-profile-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-get-pn-ps-ratio",
+ "name": "anvi-get-pn-ps-ratio",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-get-sequences-for-gene-calls/index.md b/help/8/programs/anvi-get-sequences-for-gene-calls/index.md
new file mode 100644
index 00000000..0696daac
--- /dev/null
+++ b/help/8/programs/anvi-get-sequences-for-gene-calls/index.md
@@ -0,0 +1,97 @@
+---
+layout: program
+title: anvi-get-sequences-for-gene-calls
+excerpt: An anvi'o program. A script to get back sequences for gene calls.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-get-sequences-for-gene-calls
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A script to get back sequences for gene calls.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [genomes-storage-db](../../artifacts/genomes-storage-db)
+
+
+## Can provide
+
+
+[genes-fasta](../../artifacts/genes-fasta) [external-gene-calls](../../artifacts/external-gene-calls)
+
+
+## Usage
+
+
+This program allows you to **export the sequences of your gene calls** from a [contigs-db](/help/8/artifacts/contigs-db) or [genomes-storage-db](/help/8/artifacts/genomes-storage-db) in the form of a [genes-fasta](/help/8/artifacts/genes-fasta).
+
+If you want other information about your gene calls from a [contigs-db](/help/8/artifacts/contigs-db), you can run [anvi-export-gene-calls](/help/8/programs/anvi-export-gene-calls) (which outputs a [gene-calls-txt](/help/8/artifacts/gene-calls-txt)) or get the coverage and detection information with [anvi-export-gene-coverage-and-detection](/help/8/programs/anvi-export-gene-coverage-and-detection).
+
+### Running on a contigs database
+
+You can run this program on a [contigs-db](/help/8/artifacts/contigs-db) like so:
+
+
+anvi-get-sequences-for-gene-calls -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/output
+
+
+This will create a [genes-fasta](/help/8/artifacts/genes-fasta) that contains every gene in your contigs database. If you only want a specific subset of genes, you can run the following:
+
+
+anvi-get-sequences-for-gene-calls -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/output \
+ --gene-caller-ids 897,898,1312 \
+ --delimiter ,
+
+
+Now the resulting [genes-fasta](/help/8/artifacts/genes-fasta) will contain only those three genes.
+
+You also have the option to report the output in [gff3 format](https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md), report extended deflines for each gene, or report amino acid sequences instead of nucleotide sequences.
+
+### Running on a genomes storage database
+
+You can also get the sequences from gene calls in a [genomes-storage-db](/help/8/artifacts/genomes-storage-db), like so:
+
+
+anvi-get-sequences-for-gene-calls -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -o path/to/output
+
+
+This will create a [genes-fasta](/help/8/artifacts/genes-fasta) that contains every gene in your genomes storage database. To focus on only a subset of the genomes contained in your database, use the flag `--genome-names`. You can provide a comma-delimited list of genome names or a flat text file that contains one genome per line. Alternatively, you could provide a list of gene-caller-ids as specified above.
+
+You also have the option to report the output in [gff3 format](https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md), report extended deflines for each gene, or report amino acid sequences instead of nucleotide sequences.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-get-sequences-for-gene-calls.md) to update this information.
+
+
+## Additional Resources
+
+
+* [A tutorial on getting gene-level taxonomy for a contigs-db](http://merenlab.org/2016/06/18/importing-taxonomy/)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-get-sequences-for-gene-calls) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-get-sequences-for-gene-calls/network.json b/help/8/programs/anvi-get-sequences-for-gene-calls/network.json
new file mode 100644
index 00000000..53f9c2ae
--- /dev/null
+++ b/help/8/programs/anvi-get-sequences-for-gene-calls/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genes-fasta",
+ "name": "genes-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-gene-calls",
+ "name": "external-gene-calls",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-get-sequences-for-gene-calls",
+ "name": "anvi-get-sequences-for-gene-calls",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "source": 4,
+ "target": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-get-sequences-for-gene-clusters/index.md b/help/8/programs/anvi-get-sequences-for-gene-clusters/index.md
new file mode 100644
index 00000000..ff338110
--- /dev/null
+++ b/help/8/programs/anvi-get-sequences-for-gene-clusters/index.md
@@ -0,0 +1,253 @@
+---
+layout: program
+title: anvi-get-sequences-for-gene-clusters
+excerpt: An anvi'o program. Do cool stuff with gene clusters in anvi'o pan genomes.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-get-sequences-for-gene-clusters
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Do cool stuff with gene clusters in anvi'o pan genomes.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[pan-db](../../artifacts/pan-db) [genomes-storage-db](../../artifacts/genomes-storage-db)
+
+
+## Can provide
+
+
+[genes-fasta](../../artifacts/genes-fasta) [concatenated-gene-alignment-fasta](../../artifacts/concatenated-gene-alignment-fasta) [misc-data-items](../../artifacts/misc-data-items)
+
+
+## Usage
+
+
+This aptly-named program **gets the sequences for the gene clusters stored in a [pan-db](/help/8/artifacts/pan-db) and returns them as either a [genes-fasta](/help/8/artifacts/genes-fasta) or a [concatenated-gene-alignment-fasta](/help/8/artifacts/concatenated-gene-alignment-fasta)**, which can directly go into the program [anvi-gen-phylogenomic-tree](/help/8/programs/anvi-gen-phylogenomic-tree) for phylogenomics. This gives you advanced access to your gene clusters, so you can take them out of anvi'o and do whatever you please with them.
+
+The program parameters also include a large collection of advanced filtering options. Using these options you can scrutinize your gene clusters in creative and precise ways. By combining these filters you can focus on single-copy core gene clusters in a pangenome, or those that occur only as singletons, or paralogs that contain more than a given number of sequences, and so on. Once you are satisfied with the output a given set of filters generates, you can add the matching gene clusters to a [misc-data-items](/help/8/artifacts/misc-data-items) with the flag `--add-into-items-additional-data-table`, which can be added to the [interactive](/help/8/artifacts/interactive) interface as additional layers when you visualize your [pan-db](/help/8/artifacts/pan-db) using the program [anvi-display-pan](/help/8/programs/anvi-display-pan).
+
+By default, [anvi-get-sequences-for-gene-clusters](/help/8/programs/anvi-get-sequences-for-gene-clusters) will generate a single output file. Depending on your downstream analyses, you can instead ask the program to report every gene cluster that matches your filters as a separate FASTA file.
+
+While the number of parameters this powerful program can utilize may seem daunting, many of the options just help you specify exactly from which gene clusters you want to get sequences.
+
+### Running on all gene clusters
+
+Here is an example that shows the simplest possible run, which will export sequences for every single gene cluster found in the [pan-db](/help/8/artifacts/pan-db) as amino acid sequences:
+
+
+anvi-get-sequences-for-gene-clusters -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -p [pan-db](/help/8/artifacts/pan-db) \
+ -o [genes-fasta](/help/8/artifacts/genes-fasta)
+
+
+{:.notice}
+The program will report the DNA sequences if the flag `--report-DNA-sequences` is used.
+
+### Splitting gene clusters into their own files
+
+The command above will put all gene cluster sequences in a single output [fasta](/help/8/artifacts/fasta) file. If you would like to report each gene cluster in a separate FASTA file, you can do so with the flag `--split-output-per-gene-cluster`. This flag can be combined with any of the commands shown on this page. For instance, the following command will report every gene cluster as a separate FASTA file in your directory,
+
+
+anvi-get-sequences-for-gene-clusters -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -p [pan-db](/help/8/artifacts/pan-db) \
+ --split-output-per-gene-cluster
+
+
+where the output files and paths will look like this in your work directory:
+
+```
+GC_00000001.fa
+GC_00000002.fa
+GC_00000003.fa
+GC_00000004.fa
+GC_00000005.fa
+GC_00000006.fa
+GC_00000007.fa
+GC_00000008.fa
+GC_00000009.fa
+GC_00000010.fa
+(...)
+```
+
+You can use the parameter `--output-file-prefix` to add file name prefixes to your output files. For instance, the following command,
+
+
+anvi-get-sequences-for-gene-clusters -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -p [pan-db](/help/8/artifacts/pan-db) \
+ --split-output-per-gene-cluster \
+ --output-file-prefix MY_PROJECT
+
+
+will result in the following files in your work directory:
+
+```
+MY_PROJECT_GC_00000001.fa
+MY_PROJECT_GC_00000002.fa
+MY_PROJECT_GC_00000003.fa
+MY_PROJECT_GC_00000004.fa
+MY_PROJECT_GC_00000005.fa
+MY_PROJECT_GC_00000006.fa
+MY_PROJECT_GC_00000007.fa
+MY_PROJECT_GC_00000008.fa
+MY_PROJECT_GC_00000009.fa
+MY_PROJECT_GC_00000010.fa
+(...)
+```
+
+You can also use the parameter `--output-file-prefix` to store files in different directories. For instance, the following command (note the trailing `/` in the `--output-file-prefix`),
+
+
+anvi-get-sequences-for-gene-clusters -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -p [pan-db](/help/8/artifacts/pan-db) \
+ --split-output-per-gene-cluster \
+ --output-file-prefix A_TEST_DIRECTORY/
+
+
+will result in the following files:
+
+```
+A_TEST_DIRECTORY/GC_00000001.fa
+A_TEST_DIRECTORY/GC_00000002.fa
+A_TEST_DIRECTORY/GC_00000003.fa
+A_TEST_DIRECTORY/GC_00000004.fa
+A_TEST_DIRECTORY/GC_00000005.fa
+A_TEST_DIRECTORY/GC_00000006.fa
+A_TEST_DIRECTORY/GC_00000007.fa
+A_TEST_DIRECTORY/GC_00000008.fa
+A_TEST_DIRECTORY/GC_00000009.fa
+A_TEST_DIRECTORY/GC_00000010.fa
+(...)
+```
+
+In contrast, the following command,
+
+
+anvi-get-sequences-for-gene-clusters -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -p [pan-db](/help/8/artifacts/pan-db) \
+ --split-output-per-gene-cluster \
+ --output-file-prefix A_TEST_DIRECTORY/A_NEW_PREFIX
+
+
+will result in the following files:
+
+```
+A_TEST_DIRECTORY/A_NEW_PREFIX_GC_00000001.fa
+A_TEST_DIRECTORY/A_NEW_PREFIX_GC_00000002.fa
+A_TEST_DIRECTORY/A_NEW_PREFIX_GC_00000003.fa
+A_TEST_DIRECTORY/A_NEW_PREFIX_GC_00000004.fa
+A_TEST_DIRECTORY/A_NEW_PREFIX_GC_00000005.fa
+A_TEST_DIRECTORY/A_NEW_PREFIX_GC_00000006.fa
+A_TEST_DIRECTORY/A_NEW_PREFIX_GC_00000007.fa
+A_TEST_DIRECTORY/A_NEW_PREFIX_GC_00000008.fa
+A_TEST_DIRECTORY/A_NEW_PREFIX_GC_00000009.fa
+A_TEST_DIRECTORY/A_NEW_PREFIX_GC_00000010.fa
+(...)
+```
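
The four listings above suggest a simple naming pattern: a prefix that ends with `/` acts as a directory, and any other non-empty prefix is joined to the gene cluster name with an underscore. The sketch below restates that pattern so you can predict file names before a long run. It is inferred only from the example outputs on this page, and the helper name `gc_output_path` is hypothetical (it is not part of anvi'o).

```shell
# Hypothetical helper that mimics the naming pattern seen in the listings above.
# Usage: gc_output_path PREFIX GENE_CLUSTER_NAME
gc_output_path() {
    prefix="$1"
    gene_cluster="$2"
    case "$prefix" in
        "")  printf '%s.fa\n' "$gene_cluster" ;;               # no prefix at all
        */)  printf '%s%s.fa\n' "$prefix" "$gene_cluster" ;;   # prefix is a directory
        *)   printf '%s_%s.fa\n' "$prefix" "$gene_cluster" ;;  # prefix joined with "_"
    esac
}

# One call per example shown above.
gc_output_path "" GC_00000001
gc_output_path MY_PROJECT GC_00000001
gc_output_path A_TEST_DIRECTORY/ GC_00000001
gc_output_path A_TEST_DIRECTORY/A_NEW_PREFIX GC_00000001
```

If the printed paths differ from what the program writes on your system, trust the program; this sketch only restates the examples shown here.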
+
+### Exporting only specific gene clusters
+
+#### Part 1: Choosing gene clusters by collection, bin, or name
+
+You can export only the sequences for a specific [collection](/help/8/artifacts/collection) or [bin](/help/8/artifacts/bin) with the parameters `-C` or `-b` respectively.
+
+
+anvi-get-sequences-for-gene-clusters -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -p [pan-db](/help/8/artifacts/pan-db) \
+ -o [genes-fasta](/help/8/artifacts/genes-fasta) \
+ -C [collection](/help/8/artifacts/collection)
+
+
+{:.notice}
+You can always display the collections and bins available in your [pan-db](/help/8/artifacts/pan-db) by adding `--list-collections` or `--list-bins` flags to your command.
+
+Alternatively, you can export the specific gene clusters by name, either by providing a single gene cluster ID or a file with one gene cluster ID per line. For example:
+
+
+anvi-get-sequences-for-gene-clusters -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -p [pan-db](/help/8/artifacts/pan-db) \
+ -o [genes-fasta](/help/8/artifacts/genes-fasta) \
+ --gene-cluster-ids-file gene_clusters.txt
+
+
+where `gene_clusters.txt` contains the following:
+
+```
+GC_00000618
+GC_00000643
+GC_00000729
+```
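
Such a file can be written by hand or generated from the shell. As a small sketch (the file name `gene_clusters.txt` and the three IDs are simply the ones used in the example above), `printf` writes one ID per line, and a quick check confirms that every line looks like a gene cluster name before the file is passed to `--gene-cluster-ids-file`:

```shell
# Write one gene cluster ID per line.
printf '%s\n' GC_00000618 GC_00000643 GC_00000729 > gene_clusters.txt

# Count lines that do NOT look like a gene cluster name ("GC_" followed by digits).
# grep exits non-zero when nothing matches, hence the "|| true".
bad_lines=$(grep -c -v -E '^GC_[0-9]+$' gene_clusters.txt || true)
echo "lines: $(wc -l < gene_clusters.txt), malformed: ${bad_lines}"
```

The pattern `^GC_[0-9]+$` is an assumption based on the names shown on this page; adjust it if your gene clusters are named differently.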
+
+#### Part 2: Choosing gene clusters by their attributes
+
+These parameters are used to exclude gene clusters that don't reach certain thresholds and are applied on top of any filters already in place (for example, you can use these to exclude clusters within a specific bin).
+
+Here is a list of the different filters that you can use to exclude some subsection of your gene clusters:
+
+- min/max number of genomes that the gene cluster occurs in.
+- min/max number of genes from each genome. For example, you could exclude clusters that don't appear in every genome 3 times, or get single-copy genes by setting `--max-num-genes-from-each-genome` to 1.
+- min/max [geometric homogeneity index](http://merenlab.org/2016/11/08/pangenomics-v2/#geometric-homogeneity-index)
+- min/max [functional homogeneity index](http://merenlab.org/2016/11/08/pangenomics-v2/#functional-homogeneity-index)
+- min/max combined homogeneity index
+
+For example, the following run on a [genomes-storage-db](/help/8/artifacts/genomes-storage-db) that contains 50 genomes will report only the single-copy core genes with a functional homogeneity index above 0.25:
+
+
+anvi-get-sequences-for-gene-clusters -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -p [pan-db](/help/8/artifacts/pan-db) \
+ -o [genes-fasta](/help/8/artifacts/genes-fasta) \
+ --max-num-genes-from-each-genome 1 \
+ --min-num-genomes-gene-cluster-occurs 50 \
+ --min-functional-homogeneity-index 0.25
+
+
+You can also exclude genomes that are missing some number of the gene clusters that you're working with by using the parameter `--max-num-gene-clusters-missing-from-genome`.
+
+For each of these parameters, see the program's help menu for more information.
+
+### Fun with phylogenomics!
+
+To get a [concatenated-gene-alignment-fasta](/help/8/artifacts/concatenated-gene-alignment-fasta) (which you can use to run [anvi-gen-phylogenomic-tree](/help/8/programs/anvi-gen-phylogenomic-tree)), use the parameter `--concatenate-gene-clusters`:
+
+
+anvi-get-sequences-for-gene-clusters -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -p [pan-db](/help/8/artifacts/pan-db) \
+ -o [genes-fasta](/help/8/artifacts/genes-fasta) \
+ --concatenate-gene-clusters
+
+
+Here, you also have the option to choose a specific aligner (or list the available aligners), as well as to provide a NEXUS-formatted partition file.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-get-sequences-for-gene-clusters.md) to update this information.
+
+
+## Additional Resources
+
+
+* [In action in the Anvi'o pangenomics tutorial](http://merenlab.org/2016/11/08/pangenomics-v2/#scrutinizing-phylogenomics)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-get-sequences-for-gene-clusters) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-get-sequences-for-gene-clusters/network.json b/help/8/programs/anvi-get-sequences-for-gene-clusters/network.json
new file mode 100644
index 00000000..986600b3
--- /dev/null
+++ b/help/8/programs/anvi-get-sequences-for-gene-clusters/network.json
@@ -0,0 +1,82 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genes-fasta",
+ "name": "genes-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "concatenated-gene-alignment-fasta",
+ "name": "concatenated-gene-alignment-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-items",
+ "name": "misc-data-items",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-get-sequences-for-gene-clusters",
+ "name": "anvi-get-sequences-for-gene-clusters",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 5,
+ "target": 0
+ },
+ {
+ "source": 5,
+ "target": 1
+ },
+ {
+ "source": 5,
+ "target": 2
+ },
+ {
+ "target": 5,
+ "source": 3
+ },
+ {
+ "target": 5,
+ "source": 4
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-get-sequences-for-hmm-hits/index.md b/help/8/programs/anvi-get-sequences-for-hmm-hits/index.md
new file mode 100644
index 00000000..c96fd95f
--- /dev/null
+++ b/help/8/programs/anvi-get-sequences-for-hmm-hits/index.md
@@ -0,0 +1,209 @@
+---
+layout: program
+title: anvi-get-sequences-for-hmm-hits
+excerpt: An anvi'o program. Get sequences for HMM hits from many inputs.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-get-sequences-for-hmm-hits
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Get sequences for HMM hits from many inputs.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [profile-db](../../artifacts/profile-db) [external-genomes](../../artifacts/external-genomes) [internal-genomes](../../artifacts/internal-genomes) [hmm-source](../../artifacts/hmm-source) [hmm-hits](../../artifacts/hmm-hits)
+
+
+## Can provide
+
+
+[genes-fasta](../../artifacts/genes-fasta) [concatenated-gene-alignment-fasta](../../artifacts/concatenated-gene-alignment-fasta)
+
+
+## Usage
+
+
+This program can work with anvi'o [contigs-db](/help/8/artifacts/contigs-db), [external-genomes](/help/8/artifacts/external-genomes), or [internal-genomes](/help/8/artifacts/internal-genomes) files to return sequences for HMM hits identified through the default anvi'o [hmm-source](/help/8/artifacts/hmm-source)s (such as the domain-specific single-copy core genes) or user-defined [hmm-source](/help/8/artifacts/hmm-source)s (such as HMMs for specific antibiotic resistance gene families or any other targets).
+
+Using it with the single-copy core genes in the default anvi'o HMMs makes it a very versatile tool for phylogenomics, as the user can define specific sets of genes to be aligned and concatenated.
+
+
+### Learn available HMM sources
+
+
+anvi-get-sequences-for-hmm-hits -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --list-hmm-sources
+
+AVAILABLE HMM SOURCES
+===============================================
+* 'Bacteria_71' (type: singlecopy; num genes: 71)
+* 'Archaea_76' (type: singlecopy; num genes: 76)
+* 'Protista_83' (type: singlecopy; num genes: 83)
+* 'Ribosomal_RNAs' (type: Ribosomal_RNAs; num genes: 12)
+
+
+### Get all sequences in a given HMM source
+
+
+anvi-get-sequences-for-hmm-hits -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --hmm-source Bacteria_71 \
+ -o [genes-fasta](/help/8/artifacts/genes-fasta)
+
+
+### Learn available genes in a given HMM source
+
+Please note that the flag `--list-available-gene-names` will give you the list of genes in an **HMM collection** (for example, for `Bacteria_71` in the following use case). It will not give you the list of genes in your genomes or metagenomes that match them. You can generate a table of HMM hits across your genomes or metagenomes with another program, [anvi-script-gen-hmm-hits-matrix-across-genomes](/help/8/programs/anvi-script-gen-hmm-hits-matrix-across-genomes).
+
+
+anvi-get-sequences-for-hmm-hits -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --hmm-source Bacteria_71 \
+ --list-available-gene-names
+
+* Bacteria_71 [type: singlecopy]: ADK, AICARFT_IMPCHas, ATP-synt, ATP-synt_A,
+Chorismate_synt, EF_TS, Exonuc_VII_L, GrpE, Ham1p_like, IPPT, OSCP, PGK,
+Pept_tRNA_hydro, RBFA, RNA_pol_L, RNA_pol_Rpb6, RRF, RecO_C, Ribonuclease_P,
+Ribosom_S12_S23, Ribosomal_L1, Ribosomal_L13, Ribosomal_L14, Ribosomal_L16,
+Ribosomal_L17, Ribosomal_L18p, Ribosomal_L19, Ribosomal_L2, Ribosomal_L20,
+Ribosomal_L21p, Ribosomal_L22, Ribosomal_L23, Ribosomal_L27, Ribosomal_L27A,
+Ribosomal_L28, Ribosomal_L29, Ribosomal_L3, Ribosomal_L32p, Ribosomal_L35p,
+Ribosomal_L4, Ribosomal_L5, Ribosomal_L6, Ribosomal_L9_C, Ribosomal_S10,
+Ribosomal_S11, Ribosomal_S13, Ribosomal_S15, Ribosomal_S16, Ribosomal_S17,
+Ribosomal_S19, Ribosomal_S2, Ribosomal_S20p, Ribosomal_S3_C, Ribosomal_S6,
+Ribosomal_S7, Ribosomal_S8, Ribosomal_S9, RsfS, RuvX, SecE, SecG, SecY, SmpB,
+TsaE, UPF0054, YajC, eIF-1a, ribosomal_L24, tRNA-synt_1d, tRNA_m1G_MT,
+Adenylsucc_synt
+
+
+### Get sequences for some genes in a given HMM source
+
+
+anvi-get-sequences-for-hmm-hits -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --hmm-source Bacteria_71 \
+ --gene-names Ribosomal_L27,Ribosomal_L28,Ribosomal_L3 \
+ -o [genes-fasta](/help/8/artifacts/genes-fasta)
+
+
+### Get HMM hits in bins of a collection
+
+
+anvi-get-sequences-for-hmm-hits -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ --hmm-source Bacteria_71 \
+ --gene-names Ribosomal_L27,Ribosomal_L28,Ribosomal_L3 \
+ -o [genes-fasta](/help/8/artifacts/genes-fasta)
+
+
+### Get amino acid sequences for HMM hits
+
+
+anvi-get-sequences-for-hmm-hits -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ --hmm-source Bacteria_71 \
+ --gene-names Ribosomal_L27,Ribosomal_L28,Ribosomal_L3 \
+ --get-aa-sequences \
+ -o [genes-fasta](/help/8/artifacts/genes-fasta)
+
+
+### Get HMM hits independently aligned and concatenated
+
+The resulting file can be used for phylogenomics analyses via [anvi-gen-phylogenomic-tree](/help/8/programs/anvi-gen-phylogenomic-tree) or through more sophisticated tools for curating alignments and computing trees.
+
+
+anvi-get-sequences-for-hmm-hits -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ --hmm-source Bacteria_71 \
+ --gene-names Ribosomal_L27,Ribosomal_L28,Ribosomal_L3 \
+ --get-aa-sequences \
+ --concatenate-genes \
+ --return-best-hit \
+ -o [genes-fasta](/help/8/artifacts/genes-fasta)
+
+
+Please note the presence of a new flag in this particular command line, `--return-best-hit`. This flag is most appropriate for phylogenomic analyses, as it ensures that for any given protein family only one gene is reported from a given genome. This is necessary due to the nature of the data that goes into phylogenomic analyses, where typically multiple single-copy core genes from each genome are individually aligned and the results are concatenated into a super matrix for tree construction. This requirement will be violated *if* for a given single-copy core gene (SCG) family any given genome in the dataset has two or more genes rather than one, which can happen for a variety of technical or biological reasons. In that case, we need to pick only one of those genes, which is exactly what the `--return-best-hit` flag does for us. Let's say we have two `Ribosomal_L3` gene hits in a given genome. When declared, this flag will choose the `Ribosomal_L3` gene that has the most significant hit given the hidden Markov model for `Ribosomal_L3` that was used to search for `Ribosomal_L3` genes in genomes. In cases where genome quality is sufficient and contamination is not a considerable risk, this step will choose the right hit, since in many cases of multiple hits for SCGs the additional ones will have very low significance. Essentially, `--return-best-hit` makes sure you are working with the most appropriate genes for phylogenomics given the HMM models and the significance scores for the matches in your genomes.
+
+## Tips
+
+### Get amino acid sequences for each gene in a model individually
+
+If you are interested in recovering HMM hits for each gene in a model anvi'o knows about as a separate FASTA file, you can do it with a `for` loop easily. After learning your genes of interest, first run this to make sure your terminal environment knows about them (this is an example with a few genes from the HMM source `Bacteria_71`, but you can add as many genes as you like and use any HMM source anvi'o recognizes, of course):
+
+``` bash
+export genes="Ribosomal_L22 Ribosomal_L23 Ribosomal_L27 Ribosomal_L27A Ribosomal_L28"
+export hmm_source="Bacteria_71"
+```
+
+Then, you can run the program in a loop to have your FASTA files:
+
+``` bash
+for gene in $genes
+do
+ anvi-get-sequences-for-hmm-hits -c CONTIGS.db \
+ --hmm-source "$hmm_source" \
+ --gene-names "$gene" \
+ -o "${hmm_source}-${gene}.fa"
+done
+```
+
+Voila!
+
+### Exercise with the program or test scenarios
+
+You can play with this program using the anvi'o data pack for the [infant gut data](/tutorials/infant-gut) and by replacing the parameters above with appropriate ones in the following commands.
+
+Download the latest version of the data from here: [doi:10.6084/m9.figshare.3502445](https://doi.org/10.6084/m9.figshare.3502445)
+
+Then, unpack it:
+
+
+tar -zxvf INFANTGUTTUTORIAL.tar.gz && cd INFANT-GUT-TUTORIAL
+
+
+Finally, import the collection `merens`:
+
+
+[anvi-import-collection](/help/8/programs/anvi-import-collection) additional-files/collections/merens.txt \
+ -p PROFILE.db \
+ -c CONTIGS.db \
+ -C merens
+
+
+Then run the program using the `PROFILE.db`, `CONTIGS.db`, and optionally the [collection](/help/8/artifacts/collection) `merens` to try some of the commands shown on this page.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-get-sequences-for-hmm-hits.md) to update this information.
+
+
+## Additional Resources
+
+
+* [A tutorial on anvi'o phylogenomics workflow](http://merenlab.org/2017/06/07/phylogenomics/)
+
+* [A detailed application of phylogenomics to place a new genome on a tree](http://merenlab.org/data/parcubacterium-in-hbcfdna/)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-get-sequences-for-hmm-hits) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-get-sequences-for-hmm-hits/network.json b/help/8/programs/anvi-get-sequences-for-hmm-hits/network.json
new file mode 100644
index 00000000..112e7f57
--- /dev/null
+++ b/help/8/programs/anvi-get-sequences-for-hmm-hits/network.json
@@ -0,0 +1,121 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genes-fasta",
+ "name": "genes-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "concatenated-gene-alignment-fasta",
+ "name": "concatenated-gene-alignment-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-genomes",
+ "name": "external-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "internal-genomes",
+ "name": "internal-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "hmm-source",
+ "name": "hmm-source",
+ "provided_by_anvio": false,
+ "type": "HMM"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "hmm-hits",
+ "name": "hmm-hits",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 8,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-get-sequences-for-hmm-hits",
+ "name": "anvi-get-sequences-for-hmm-hits",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 8,
+ "target": 0
+ },
+ {
+ "source": 8,
+ "target": 1
+ },
+ {
+ "target": 8,
+ "source": 2
+ },
+ {
+ "target": 8,
+ "source": 3
+ },
+ {
+ "target": 8,
+ "source": 4
+ },
+ {
+ "target": 8,
+ "source": 5
+ },
+ {
+ "target": 8,
+ "source": 6
+ },
+ {
+ "target": 8,
+ "source": 7
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-get-short-reads-from-bam/index.md b/help/8/programs/anvi-get-short-reads-from-bam/index.md
new file mode 100644
index 00000000..6944d16d
--- /dev/null
+++ b/help/8/programs/anvi-get-short-reads-from-bam/index.md
@@ -0,0 +1,146 @@
+---
+layout: program
+title: anvi-get-short-reads-from-bam
+excerpt: An anvi'o program. Get short reads back from a BAM file with options for compression, splitting of forward and reverse reads, etc.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-get-short-reads-from-bam
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Get short reads back from a BAM file with options for compression, splitting of forward and reverse reads, etc.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db) [bin](../../artifacts/bin) [bam-file](../../artifacts/bam-file)
+
+
+## Can provide
+
+
+[short-reads-fasta](../../artifacts/short-reads-fasta)
+
+
+## Usage
+
+
+Get short reads from a [bam-file](/help/8/artifacts/bam-file) in the form of a [short-reads-fasta](/help/8/artifacts/short-reads-fasta).
+
+{:.warning}
+The purpose of this program is not to replace more efficient tools for recovering short reads from BAM files, such as `samtools`. Since it was designed to address much more subtle needs, this program may have a huge memory footprint when working with very large or numerous BAM files.
+
+Using this program you can,
+
+* Get all reads from one or more BAM files
+* Get reads matching to contig names found in any [bin](/help/8/artifacts/bin) in a [collection](/help/8/artifacts/collection)
+* Get reads matching to contig names found in one or more specific [bin](/help/8/artifacts/bin)s in a [collection](/help/8/artifacts/collection)
+* Get all reads matching to a specific contig name
+* Get reads matching to a specific region of a specific contig name
+
+In addition, you can use the previously-defined fetch filters via the `--fetch-filter` parameter to get only the short reads that satisfy a particular set of criteria (e.g., those that are in forward-forward or reverse-reverse orientation, those that have a template length longer than 1,000 nucleotides, and so on). For the complete set of fetch filters you can use, please see the help menu of the program.
+
+The program can report all reads in a single file, or you can ask for reads to be split into R1 and R2 files for mapping results of paired-end sequences using the flag `--split-R1-and-R2`. In this case, reads that are not paired will be reported in a file with the suffix `_UNPAIRED.fa`.
+
+Reads reported in the FASTA output will carry the necessary information in their deflines to recover which BAM file, contig, and sample they came from, along with explicit start/stop positions on the contig to which they matched.
+
+### Getting all reads
+
+A basic run of this program is as follows:
+
+
+anvi-get-short-reads-from-bam BAM_FILE_1.bam BAM_FILE_2.bam (...) \
+ --output-file OUTPUT.fa
+
+
+This will report all short reads found in BAM files `BAM_FILE_1.bam` and `BAM_FILE_2.bam` and store them into a single file. You can use as many BAM files as you wish.
+
+### Narrowing the input with anvi'o files
+
+You can choose to only return the short reads that are contained within a [collection](/help/8/artifacts/collection):
+
+
+anvi-get-short-reads-from-bam BAM_FILE_1.bam BAM_FILE_2.bam \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ --output-file OUTPUT.fa
+
+
+Or in a bin that is described in a collection:
+
+
+anvi-get-short-reads-from-bam BAM_FILE_1.bam BAM_FILE_2.bam \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ -b [bin](/help/8/artifacts/bin) \
+ --output-file OUTPUT.fa
+
+
+### Focusing on individual contigs
+
+You can get all reads mapped to a contig:
+
+
+anvi-get-short-reads-from-bam BAM_FILE_1.bam BAM_FILE_2.bam \
+ --target-contig CONTIG_NAME \
+ --output-file OUTPUT.fa
+
+
+Or define explicit start/stop positions on it:
+
+
+anvi-get-short-reads-from-bam BAM_FILE_1.bam BAM_FILE_2.bam \
+ --target-contig CONTIG_NAME \
+ --target-region-start 100 \
+ --target-region-end 1000 \
+ --output-file OUTPUT.fa
+
+
+In this mode, the program will fetch any read that includes a nucleotide matching anywhere in the region defined by the user. This means that if the user sets `--target-region-start` to `100` and `--target-region-end` to `101`, all reads that have a nucleotide mapping to the `100th` position will be returned.
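
The behavior described above reads like an ordinary overlap test on half-open intervals. The sketch below only illustrates that reading. It is not the program's own code, the function name `read_overlaps_region` is hypothetical, and it assumes that read and region coordinates are both expressed as `[start, end)` on the same contig.

```shell
# read_overlaps_region READ_START READ_END REGION_START REGION_END
# Prints "overlap" if the half-open intervals [READ_START, READ_END) and
# [REGION_START, REGION_END) share at least one position, "no overlap" otherwise.
read_overlaps_region() {
    if [ "$1" -lt "$4" ] && [ "$2" -gt "$3" ]; then
        echo "overlap"
    else
        echo "no overlap"
    fi
}

# A region of 100-101 covers the single position 100.
read_overlaps_region 40 101 100 101    # this read covers position 100
read_overlaps_region 101 250 100 101   # this read starts right after the region
```

Under these assumptions, a read is reported as soon as a single one of its positions falls inside the target region, which is why a one-nucleotide region can still return many reads.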
+
+### Changing the output format
+
+You can split the output based on the directionality of paired-end reads. Adding the flag `--split-R1-and-R2` causes the program to create three separate output files: one for R1 (sequences in the forward direction), one for R2 (sequences in the reverse direction; i.e. reverse complement of R1 sequences), and one for unpaired reads. When doing this, you can name these three files with a prefix by using the flag `-O`.
+
+
+anvi-get-short-reads-from-bam -o path/to/output \
+ --split-R1-and-R2 \
+ -O BAM_1_and_BAM_2 \
+ BAM_FILE_1.bam BAM_FILE_2.bam
+
+
+You can also compress the output by adding the flag `--gzip-output`.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-get-short-reads-from-bam.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-get-short-reads-from-bam) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-get-short-reads-from-bam/network.json b/help/8/programs/anvi-get-short-reads-from-bam/network.json
new file mode 100644
index 00000000..6ac6aca4
--- /dev/null
+++ b/help/8/programs/anvi-get-short-reads-from-bam/network.json
@@ -0,0 +1,82 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "short-reads-fasta",
+ "name": "short-reads-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "bam-file",
+ "name": "bam-file",
+ "provided_by_anvio": false,
+ "type": "BAM"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-get-short-reads-from-bam",
+ "name": "anvi-get-short-reads-from-bam",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 5,
+ "target": 0
+ },
+ {
+ "target": 5,
+ "source": 1
+ },
+ {
+ "target": 5,
+ "source": 2
+ },
+ {
+ "target": 5,
+ "source": 3
+ },
+ {
+ "target": 5,
+ "source": 4
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-get-short-reads-mapping-to-a-gene/index.md b/help/8/programs/anvi-get-short-reads-mapping-to-a-gene/index.md
new file mode 100644
index 00000000..d8a30b3c
--- /dev/null
+++ b/help/8/programs/anvi-get-short-reads-mapping-to-a-gene/index.md
@@ -0,0 +1,83 @@
+---
+layout: program
+title: anvi-get-short-reads-mapping-to-a-gene
+excerpt: An anvi'o program. Recover short reads from BAM files that were mapped to genes you are interested in.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-get-short-reads-mapping-to-a-gene
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Recover short reads from BAM files that were mapped to genes you are interested in. It is possible to work with a single gene call, or a bunch of them. Similarly, you can get short reads from a single BAM file, or from many of them.
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [bam-file](../../artifacts/bam-file)
+
+
+## Can provide
+
+
+[short-reads-fasta](../../artifacts/short-reads-fasta)
+
+
+## Usage
+
+
+This program finds all short reads in a [bam-file](/help/8/artifacts/bam-file) that align to a specific gene and returns them as a [short-reads-fasta](/help/8/artifacts/short-reads-fasta).
+
+If instead you want to extract these short reads from a FASTQ file, get your gene sequence with [anvi-export-gene-calls](/help/8/programs/anvi-export-gene-calls) and take a look at [anvi-search-primers](/help/8/programs/anvi-search-primers).
+
+To run this program, just specify the bam files you're looking at and the gene of interest. To do this, name the [contigs-db](/help/8/artifacts/contigs-db) containing your gene and the gene caller ID (either directly through the parameter `--gene-caller-id` or through a file). Here is an example:
+
+
+anvi-get-short-reads-mapping-to-a-gene -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --gene-caller-id 2 \
+ -i BAM_FILE_ONE.bam \
+ -O GENE_2_MATCHES
+
+
+The output of this will be a file named `GENE_2_MATCHES_BAM_FILE_ONE.fasta` (prefix + bam file name), which will contain all short reads that aligned to gene 2 with more than 100 nucleotides.
+
+You also have the option to provide multiple bam files; in this case, there will be one output file for each bam file provided.
+
+Additionally, you can change the minimum number of nucleotides of a short read that must map to the gene for the read to be reported. For example, to expand your search, you could decrease the required mapping length to 50 nucleotides, like so:
+
+
+anvi-get-short-reads-mapping-to-a-gene -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --gene-caller-id 2 \
+ -i Bam_file_one.bam Bam_file_two.bam \
+ -O GENE_2_MATCHES \
+ --leeway 50
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-get-short-reads-mapping-to-a-gene.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-get-short-reads-mapping-to-a-gene) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-get-short-reads-mapping-to-a-gene/network.json b/help/8/programs/anvi-get-short-reads-mapping-to-a-gene/network.json
new file mode 100644
index 00000000..84088a25
--- /dev/null
+++ b/help/8/programs/anvi-get-short-reads-mapping-to-a-gene/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "short-reads-fasta",
+ "name": "short-reads-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "bam-file",
+ "name": "bam-file",
+ "provided_by_anvio": false,
+ "type": "BAM"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-get-short-reads-mapping-to-a-gene",
+ "name": "anvi-get-short-reads-mapping-to-a-gene",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-get-split-coverages/index.md b/help/8/programs/anvi-get-split-coverages/index.md
new file mode 100644
index 00000000..6fd173b2
--- /dev/null
+++ b/help/8/programs/anvi-get-split-coverages/index.md
@@ -0,0 +1,96 @@
+---
+layout: program
+title: anvi-get-split-coverages
+excerpt: An anvi'o program. Export splits and the coverage table from database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-get-split-coverages
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Export splits and the coverage table from database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db) [collection](../../artifacts/collection) [bin](../../artifacts/bin)
+
+
+## Can provide
+
+
+[coverages-txt](../../artifacts/coverages-txt)
+
+
+## Usage
+
+
+This program returns the nucleotide-level coverage data for a specific set of splits, or for a gene, in your [profile-db](/help/8/artifacts/profile-db).
+
+If you want to get the coverage data for all splits in your [profile-db](/help/8/artifacts/profile-db), run [anvi-export-splits-and-coverages](/help/8/programs/anvi-export-splits-and-coverages) with the flag `--splits-mode`.
+
+Simply provide a [profile-db](/help/8/artifacts/profile-db) and [contigs-db](/help/8/artifacts/contigs-db) pair and specify which splits, or gene, you want to look at. You have three ways to do this:
+
+1. Provide a single split name. (You can list all splits available with `--list-splits`)
+
+
+anvi-get-split-coverages -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o [coverages-txt](/help/8/artifacts/coverages-txt) \
+ --split-name Day17a_QCcontig9_split_00003
+
+
+
+2. Provide both the name of a [bin](/help/8/artifacts/bin) and the [collection](/help/8/artifacts/collection) it is contained in.
+
+
+anvi-get-split-coverages -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o [coverages-txt](/help/8/artifacts/coverages-txt) \
+ -b [bin](/help/8/artifacts/bin) \
+ -C [collection](/help/8/artifacts/collection)
+
+
+You can list all collections available with `--list-collections` or all bins in a collection with `--list-bins`. Alternatively, you could run [anvi-show-collections-and-bins](/help/8/programs/anvi-show-collections-and-bins) on your [profile-db](/help/8/artifacts/profile-db) to get a more comprehensive overview.
+
+3. Provide a gene caller id and a flanking size (bp).
+
+
+anvi-get-split-coverages -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o [coverages-txt](/help/8/artifacts/coverages-txt) \
+ --gene-caller-id 25961 \
+ --flank-length 500
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-get-split-coverages.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Using this program to generate split coverage visualizations across samples](http://merenlab.org/2019/11/25/visualizing-coverages/#visualize-only-the-coverage-of-a-split-across-samples)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-get-split-coverages) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-get-split-coverages/network.json b/help/8/programs/anvi-get-split-coverages/network.json
new file mode 100644
index 00000000..d868ea82
--- /dev/null
+++ b/help/8/programs/anvi-get-split-coverages/network.json
@@ -0,0 +1,82 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "coverages-txt",
+ "name": "coverages-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-get-split-coverages",
+ "name": "anvi-get-split-coverages",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 5,
+ "target": 0
+ },
+ {
+ "target": 5,
+ "source": 1
+ },
+ {
+ "target": 5,
+ "source": 2
+ },
+ {
+ "target": 5,
+ "source": 3
+ },
+ {
+ "target": 5,
+ "source": 4
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-get-tlen-dist-from-bam/index.md b/help/8/programs/anvi-get-tlen-dist-from-bam/index.md
new file mode 100644
index 00000000..08451215
--- /dev/null
+++ b/help/8/programs/anvi-get-tlen-dist-from-bam/index.md
@@ -0,0 +1,129 @@
+---
+layout: program
+title: anvi-get-tlen-dist-from-bam
+excerpt: An anvi'o program. Report the distribution of template lengths from a BAM file.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-get-tlen-dist-from-bam
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Report the distribution of template lengths from a BAM file. The purpose of this is to get an idea about the insert size distribution in a BAM file rapidly by summarizing distances between each paired-end read in a given read recruitment experiment.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[bam-file](../../artifacts/bam-file)
+
+
+## Can provide
+
+
+This program does not seem to provide any artifacts. Such programs usually print out some information for you to see or alter some anvi'o artifacts without producing any immediate outputs.
+
+
+## Usage
+
+
+This program may be useful if you are interested in learning the insert size distribution in a given [bam-file](/help/8/artifacts/bam-file).
+
+## Example run
+
+The most straightforward way to run the program is the following:
+
+
+anvi-get-tlen-dist-from-bam [bam-file](/help/8/artifacts/bam-file)
+
+
+Here is an example output in the terminal:
+
+```
+BAM file .....................................: ./INVERSION-TEST/CP_R03_CDI_C_07_POST.bam
+Number of contigs ............................: 8
+Number of reads ..............................: 2,331,062
+Minimum template length frequency ............: 10
+Maximum template length to consider ..........: 500,000
+
+
+WARNING
+===============================================
+Some of your contigs, 2 of 8 to be precise, did not seem to have any template
+lenght data. There are many reasons this could happen, including a very high
+`--min-tlen-frequency` parameter for BAM files with small number of reads. But
+since there are some contigs that seem to have proper paired-end reads with
+template lengths, anvi'o will continue reporting and put zeros for contigs that
+have no data in output files.
+
+Output file ..................................: TEMPLATE-LENGTH-STATS.txt
+
+
+✓ anvi-get-tlen-dist-from-bam took 0:00:05.483682
+```
+
+## Output file
+
+The program will report a TAB-delimited output file with the following format:
+
+|contig|length|num_reads_considered|mean|mean_Q2Q3|median|min|max|std|
+|:--|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
+|B_fragilis_ARW016_000000000001|5526948|2213634|367.6|392.5|393.0|32|71613|663.1|
+|B_fragilis_ARW016_000000000002|3905|634|398.6|398.7|399.0|382|415|7.926|
+|B_fragilis_ARW016_000000000003|4039|588|397.5|397.8|398.0|380|411|7.206|
+|B_fragilis_ARW016_000000000004|3950|524|397.4|397.2|397.0|382|414|7.492|
+|B_fragilis_ARW016_000000000005|911|0|||||||
+|B_fragilis_ARW016_000000000006|647|0|||||||
+|B_fragilis_ARW016_000000000007|12247|1860|394.9|396.7|397.0|99|419|23.61|
+|B_fragilis_ARW016_000000000008|4861|528|396.7|396.7|397.0|384|410|6.746|
+
+## Histogram
+
+If you run the program with the flag `--plot`, it will attempt to plot a histogram for all contigs in the BAM file.
+
+{:.warning}
+The plotting function requires an additional Python library, [plotext](https://github.com/piccolomo/plotext), to be installed. While it is not a part of the default anvi'o distribution, you can install it in your environment by running `pip install plotext`.
+
+Here is an example run:
+
+```
+anvi-get-tlen-dist-from-bam CP_R03_CDI_C_07_POST.bam \
+ --plot \
+ --max-template-length-to-consider 5000 \
+ --min-template-length-frequency 2500
+```
+
+And the result in the terminal:
+
+[![Example output](../../images/anvi-get-tlen-dist-from-bam.png){:.center-img}](../../images/anvi-get-tlen-dist-from-bam.png)
+
+The histogram may require additional filters to avoid skewed displays due to outliers. The parameters `--max-template-length-to-consider` and/or `--min-template-length-frequency` may be useful for such adjustments. Please see the help menu for their details.
+
+The best practice is likely to run the program without any of these parameters and without the `--plot` flag to generate a comprehensive TAB-delimited report, and then use the `--plot` flag to visualize trends.
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-get-tlen-dist-from-bam.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-get-tlen-dist-from-bam) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-get-tlen-dist-from-bam/network.json b/help/8/programs/anvi-get-tlen-dist-from-bam/network.json
new file mode 100644
index 00000000..7ade6d21
--- /dev/null
+++ b/help/8/programs/anvi-get-tlen-dist-from-bam/network.json
@@ -0,0 +1,30 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "bam-file",
+ "name": "bam-file",
+ "provided_by_anvio": false,
+ "type": "BAM"
+ },
+ {
+ "size": 1,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-get-tlen-dist-from-bam",
+ "name": "anvi-get-tlen-dist-from-bam",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "target": 1,
+ "source": 0
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-import-collection/index.md b/help/8/programs/anvi-import-collection/index.md
new file mode 100644
index 00000000..76439bf8
--- /dev/null
+++ b/help/8/programs/anvi-import-collection/index.md
@@ -0,0 +1,81 @@
+---
+layout: program
+title: anvi-import-collection
+excerpt: An anvi'o program. Import an external binning result into anvi'o.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-import-collection
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Import an external binning result into anvi'o.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [profile-db](../../artifacts/profile-db) [pan-db](../../artifacts/pan-db) [collection-txt](../../artifacts/collection-txt)
+
+
+## Can provide
+
+
+[collection](../../artifacts/collection)
+
+
+## Usage
+
+
+The purpose of this program is to import a [collection](/help/8/artifacts/collection) into an anvi'o database.
+
+The input to this program, a [collection-txt](/help/8/artifacts/collection-txt), may either have been generated by another anvi'o program (such as [anvi-export-collection](/help/8/programs/anvi-export-collection)), or may have been generated by the user manually. To import a collection into a database, you can run the following command,
+
+
+anvi-import-collection [collection-txt](/help/8/artifacts/collection-txt) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -C COLLECTION_NAME
+
+
+which would import the collection described in the input file formatted as a [collection-txt](/help/8/artifacts/collection-txt) into the [profile-db](/help/8/artifacts/profile-db) with the name `COLLECTION_NAME`.
+
+If your [collection-txt](/help/8/artifacts/collection-txt) describes contig names rather than split names, you will likely get an anvi'o error. You can fix that by adding the flag `--contigs-mode` to your command:
+
+
+anvi-import-collection [collection-txt](/help/8/artifacts/collection-txt) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --contigs-mode \
+ -C COLLECTION_NAME
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-import-collection.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Another description as part of the metagenomic workflow](http://merenlab.org/2016/06/22/anvio-tutorial-v2/#anvi-import-collection)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-import-collection) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-import-collection/network.json b/help/8/programs/anvi-import-collection/network.json
new file mode 100644
index 00000000..84d78f88
--- /dev/null
+++ b/help/8/programs/anvi-import-collection/network.json
@@ -0,0 +1,82 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection-txt",
+ "name": "collection-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-import-collection",
+ "name": "anvi-import-collection",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 5,
+ "target": 0
+ },
+ {
+ "target": 5,
+ "source": 1
+ },
+ {
+ "target": 5,
+ "source": 2
+ },
+ {
+ "target": 5,
+ "source": 3
+ },
+ {
+ "target": 5,
+ "source": 4
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-import-functions/index.md b/help/8/programs/anvi-import-functions/index.md
new file mode 100644
index 00000000..385345a2
--- /dev/null
+++ b/help/8/programs/anvi-import-functions/index.md
@@ -0,0 +1,69 @@
+---
+layout: program
+title: anvi-import-functions
+excerpt: An anvi'o program. Parse and store functional annotation of genes.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-import-functions
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Parse and store functional annotation of genes.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [functions-txt](../../artifacts/functions-txt)
+
+
+## Can provide
+
+
+[functions](../../artifacts/functions)
+
+
+## Usage
+
+
+This program **takes in a [functions-txt](/help/8/artifacts/functions-txt) to annotate your [contigs-db](/help/8/artifacts/contigs-db) with [functions](/help/8/artifacts/functions).** Basically, if you already have the gene functions for the contigs in your [contigs-db](/help/8/artifacts/contigs-db) available in a file, you can import them into anvi'o using this command.
+
+You can find a really comprehensive walkthrough of this program in [this blog post about importing functions](http://merenlab.org/2016/06/18/importing-functions/), including information about built-in anvi'o parsers for InterProScan and the EggNOG database.
+
+If you want to overwrite any function annotations you already have, just add the flag `--drop-previous-annotations`.
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-import-functions.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Importing functions](http://merenlab.org/2016/06/18/importing-functions/)
+
+* [Importing GhostKOALA/KEGG annotations](http://merenlab.org/2018/01/17/importing-ghostkoala-annotations/)
+
+* [Importing VirSorter phage annotations](http://merenlab.org/2018/02/08/importing-virsorter-annotations/)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-import-functions) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-import-functions/network.json b/help/8/programs/anvi-import-functions/network.json
new file mode 100644
index 00000000..8f725b6c
--- /dev/null
+++ b/help/8/programs/anvi-import-functions/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions",
+ "name": "functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions-txt",
+ "name": "functions-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-import-functions",
+ "name": "anvi-import-functions",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-import-items-order/index.md b/help/8/programs/anvi-import-items-order/index.md
new file mode 100644
index 00000000..180d568e
--- /dev/null
+++ b/help/8/programs/anvi-import-items-order/index.md
@@ -0,0 +1,71 @@
+---
+layout: program
+title: anvi-import-items-order
+excerpt: An anvi'o program. Import a new items order into an anvi'o database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-import-items-order
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Import a new items order into an anvi'o database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[pan-db](../../artifacts/pan-db) [profile-db](../../artifacts/profile-db) [misc-data-items-order-txt](../../artifacts/misc-data-items-order-txt) [dendrogram](../../artifacts/dendrogram) [phylogeny](../../artifacts/phylogeny)
+
+
+## Can provide
+
+
+[misc-data-items-order](../../artifacts/misc-data-items-order)
+
+
+## Usage
+
+
+This program, as one might think, allows you to import a [misc-data-items-order-txt](/help/8/artifacts/misc-data-items-order-txt) to describe a specific order of items stored in a [profile-db](/help/8/artifacts/profile-db), [pan-db](/help/8/artifacts/pan-db), or [genes-db](/help/8/artifacts/genes-db).
+
+
+anvi-import-items-order -p [profile-db](/help/8/artifacts/profile-db) \
+ -i [misc-data-items-order-txt](/help/8/artifacts/misc-data-items-order-txt)
+
+
+It may also be nice to give it a good name, so that it's easy to find in the interface.
+
+
+anvi-import-items-order -p [profile-db](/help/8/artifacts/profile-db) \
+ -i [misc-data-items-order-txt](/help/8/artifacts/misc-data-items-order-txt) \
+ --name ORDER_NAME
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-import-items-order.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-import-items-order) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-import-items-order/network.json b/help/8/programs/anvi-import-items-order/network.json
new file mode 100644
index 00000000..dda21eaf
--- /dev/null
+++ b/help/8/programs/anvi-import-items-order/network.json
@@ -0,0 +1,95 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-items-order",
+ "name": "misc-data-items-order",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-items-order-txt",
+ "name": "misc-data-items-order-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "dendrogram",
+ "name": "dendrogram",
+ "provided_by_anvio": true,
+ "type": "NEWICK"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "phylogeny",
+ "name": "phylogeny",
+ "provided_by_anvio": true,
+ "type": "NEWICK"
+ },
+ {
+ "size": 6,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-import-items-order",
+ "name": "anvi-import-items-order",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 6,
+ "target": 0
+ },
+ {
+ "target": 6,
+ "source": 1
+ },
+ {
+ "target": 6,
+ "source": 2
+ },
+ {
+ "target": 6,
+ "source": 3
+ },
+ {
+ "target": 6,
+ "source": 4
+ },
+ {
+ "target": 6,
+ "source": 5
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-import-misc-data/index.md b/help/8/programs/anvi-import-misc-data/index.md
new file mode 100644
index 00000000..84970c55
--- /dev/null
+++ b/help/8/programs/anvi-import-misc-data/index.md
@@ -0,0 +1,96 @@
+---
+layout: program
+title: anvi-import-misc-data
+excerpt: An anvi'o program. Populate additional data or order tables in pan or profile databases for items and layers, OR additional data in contigs databases for nucleotides and amino acids (the Swiss army knife-level serious stuff).
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-import-misc-data
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Populate additional data or order tables in pan or profile databases for items and layers, OR additional data in contigs databases for nucleotides and amino acids (the Swiss army knife-level serious stuff).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[pan-db](../../artifacts/pan-db) [profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db) [misc-data-items-txt](../../artifacts/misc-data-items-txt) [dendrogram](../../artifacts/dendrogram) [phylogeny](../../artifacts/phylogeny) [misc-data-layers-txt](../../artifacts/misc-data-layers-txt) [misc-data-layer-orders-txt](../../artifacts/misc-data-layer-orders-txt) [misc-data-nucleotides-txt](../../artifacts/misc-data-nucleotides-txt) [misc-data-amino-acids-txt](../../artifacts/misc-data-amino-acids-txt)
+
+
+## Can provide
+
+
+[misc-data-items](../../artifacts/misc-data-items) [misc-data-layers](../../artifacts/misc-data-layers) [misc-data-layer-orders](../../artifacts/misc-data-layer-orders) [misc-data-nucleotides](../../artifacts/misc-data-nucleotides) [misc-data-amino-acids](../../artifacts/misc-data-amino-acids)
+
+
+## Usage
+
+
+This program enables extending anvi'o projects with many kinds of **additional data**. Additional data will extend anvi'o [interactive](/help/8/artifacts/interactive) displays, appear in [summary](/help/8/artifacts/summary) files, and become accessible to other anvi'o programs throughout.
+
+This program can add additional data for your items or layers in a [pan-db](/help/8/artifacts/pan-db) or [profile-db](/help/8/artifacts/profile-db), or add additional data for your nucleotides or amino acids in a [contigs-db](/help/8/artifacts/contigs-db).
+
+You also have the option to associate keys with only a specific data group, or transpose the input before processing.
+
+Also see the programs [anvi-show-misc-data](/help/8/programs/anvi-show-misc-data), [anvi-export-misc-data](/help/8/programs/anvi-export-misc-data), and [anvi-delete-misc-data](/help/8/programs/anvi-delete-misc-data).
+
+## Items Data, Layers Data, and Orders
+
+Please see [this blog post](http://merenlab.org/2017/12/11/additional-data-tables) for comprehensive documentation on these misc data types.
+
+## Nucleotides, Amino Acids, and Contigs Databases
+
+This feature lets you import additional data about specific residues or specific base pairs into your [contigs-db](/help/8/artifacts/contigs-db). This is especially useful for structural analysis (for example, when running programs like [anvi-display-structure](/help/8/programs/anvi-display-structure)) and will be very relevant to the InteracDome functionality when it's added in anvi'o v7 (curious readers can take a look at [this blog post](http://merenlab.org/2020/07/22/interacdome/)).
+
+When adding additional data, unlike with layers and items, you do not have to provide values for every single nucleotide in your database. With this program, you can easily provide data for only a select few.
+
+Basically, you can add two types of data to your contigs database:
+
+1. [misc-data-nucleotides](/help/8/artifacts/misc-data-nucleotides) by providing a [misc-data-nucleotides-txt](/help/8/artifacts/misc-data-nucleotides-txt). This contains information about *specific nucleotides in your database.*
+
+
+anvi-import-misc-data -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -t nucleotides \
+ [misc-data-nucleotides-txt](/help/8/artifacts/misc-data-nucleotides-txt)
+
+
+2. [misc-data-amino-acids](/help/8/artifacts/misc-data-amino-acids) by providing a [misc-data-amino-acids-txt](/help/8/artifacts/misc-data-amino-acids-txt). This contains information about *specific amino acid residues in your database.*
+
+
+anvi-import-misc-data -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -t amino_acids \
+ [misc-data-amino-acids-txt](/help/8/artifacts/misc-data-amino-acids-txt)
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-import-misc-data.md) to update this information.
+
+
+## Additional Resources
+
+
+* [A primer on anvi'o misc data tables](http://merenlab.org/2017/12/11/additional-data-tables/)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-import-misc-data) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-import-misc-data/network.json b/help/8/programs/anvi-import-misc-data/network.json
new file mode 100644
index 00000000..2487fc6f
--- /dev/null
+++ b/help/8/programs/anvi-import-misc-data/network.json
@@ -0,0 +1,212 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-items",
+ "name": "misc-data-items",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-layers",
+ "name": "misc-data-layers",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-layer-orders",
+ "name": "misc-data-layer-orders",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-nucleotides",
+ "name": "misc-data-nucleotides",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-amino-acids",
+ "name": "misc-data-amino-acids",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-items-txt",
+ "name": "misc-data-items-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "dendrogram",
+ "name": "dendrogram",
+ "provided_by_anvio": true,
+ "type": "NEWICK"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "phylogeny",
+ "name": "phylogeny",
+ "provided_by_anvio": true,
+ "type": "NEWICK"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-layers-txt",
+ "name": "misc-data-layers-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-layer-orders-txt",
+ "name": "misc-data-layer-orders-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-nucleotides-txt",
+ "name": "misc-data-nucleotides-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-amino-acids-txt",
+ "name": "misc-data-amino-acids-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 15,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-import-misc-data",
+ "name": "anvi-import-misc-data",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 15,
+ "target": 0
+ },
+ {
+ "source": 15,
+ "target": 1
+ },
+ {
+ "source": 15,
+ "target": 2
+ },
+ {
+ "source": 15,
+ "target": 3
+ },
+ {
+ "source": 15,
+ "target": 4
+ },
+ {
+ "target": 15,
+ "source": 5
+ },
+ {
+ "target": 15,
+ "source": 6
+ },
+ {
+ "target": 15,
+ "source": 7
+ },
+ {
+ "target": 15,
+ "source": 8
+ },
+ {
+ "target": 15,
+ "source": 9
+ },
+ {
+ "target": 15,
+ "source": 10
+ },
+ {
+ "target": 15,
+ "source": 11
+ },
+ {
+ "target": 15,
+ "source": 12
+ },
+ {
+ "target": 15,
+ "source": 13
+ },
+ {
+ "target": 15,
+ "source": 14
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-import-state/index.md b/help/8/programs/anvi-import-state/index.md
new file mode 100644
index 00000000..4e9315eb
--- /dev/null
+++ b/help/8/programs/anvi-import-state/index.md
@@ -0,0 +1,68 @@
+---
+layout: program
+title: anvi-import-state
+excerpt: An anvi'o program. Import an anvi'o state into a profile database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-import-state
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Import an anvi'o state into a profile database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[pan-db](../../artifacts/pan-db) [profile-db](../../artifacts/profile-db) [state-json](../../artifacts/state-json)
+
+
+## Can provide
+
+
+[state](../../artifacts/state)
+
+
+## Usage
+
+
+This program allows you to import a [state](/help/8/artifacts/state) from a [state-json](/help/8/artifacts/state-json).
+
+You can run this program on a [profile-db](/help/8/artifacts/profile-db) or [pan-db](/help/8/artifacts/pan-db) like so:
+
+
+anvi-import-state -p [profile-db](/help/8/artifacts/profile-db) \
+ -s [state-json](/help/8/artifacts/state-json) \
+ -n MY_STATE
+
+
+This will import the state described in your [state-json](/help/8/artifacts/state-json) into your [profile-db](/help/8/artifacts/profile-db) with the name `MY_STATE`.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-import-state.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-import-state) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-import-state/network.json b/help/8/programs/anvi-import-state/network.json
new file mode 100644
index 00000000..525d40ed
--- /dev/null
+++ b/help/8/programs/anvi-import-state/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "state",
+ "name": "state",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "state-json",
+ "name": "state-json",
+ "provided_by_anvio": true,
+ "type": "JSON"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-import-state",
+ "name": "anvi-import-state",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "target": 4,
+ "source": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-import-taxonomy-for-genes/index.md b/help/8/programs/anvi-import-taxonomy-for-genes/index.md
new file mode 100644
index 00000000..85a09d63
--- /dev/null
+++ b/help/8/programs/anvi-import-taxonomy-for-genes/index.md
@@ -0,0 +1,62 @@
+---
+layout: program
+title: anvi-import-taxonomy-for-genes
+excerpt: An anvi'o program. Import gene-level taxonomy into an anvi'o contigs database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-import-taxonomy-for-genes
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Import gene-level taxonomy into an anvi'o contigs database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [gene-taxonomy-txt](../../artifacts/gene-taxonomy-txt)
+
+
+## Can provide
+
+
+[gene-taxonomy](../../artifacts/gene-taxonomy)
+
+
+## Usage
+
+
+This program uses a [gene-taxonomy-txt](/help/8/artifacts/gene-taxonomy-txt) to populate the taxonomic information for the genes in a [contigs-db](/help/8/artifacts/contigs-db).
+
+Once finished, your gene taxonomy will appear as an additional layer if you open the [contigs-db](/help/8/artifacts/contigs-db) and an associated [profile-db](/help/8/artifacts/profile-db) in [anvi-interactive](/help/8/programs/anvi-interactive).
+
+There is an entire blog post about different ways to do this [here](http://merenlab.org/2016/06/18/importing-taxonomy/). It outlines how to get your sequences using [anvi-get-sequences-for-gene-calls](/help/8/programs/anvi-get-sequences-for-gene-calls), then use either [Kaiju](https://github.com/bioinformatics-centre/kaiju) or [Centrifuge](https://github.com/infphilo/centrifuge) to get the taxonomy information for your genes. Finally, you bring that information back into anvi'o using this program.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-import-taxonomy-for-genes.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-import-taxonomy-for-genes) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-import-taxonomy-for-genes/network.json b/help/8/programs/anvi-import-taxonomy-for-genes/network.json
new file mode 100644
index 00000000..9dfe1700
--- /dev/null
+++ b/help/8/programs/anvi-import-taxonomy-for-genes/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "gene-taxonomy",
+ "name": "gene-taxonomy",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "gene-taxonomy-txt",
+ "name": "gene-taxonomy-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-import-taxonomy-for-genes",
+ "name": "anvi-import-taxonomy-for-genes",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-import-taxonomy-for-layers/index.md b/help/8/programs/anvi-import-taxonomy-for-layers/index.md
new file mode 100644
index 00000000..546e42bb
--- /dev/null
+++ b/help/8/programs/anvi-import-taxonomy-for-layers/index.md
@@ -0,0 +1,73 @@
+---
+layout: program
+title: anvi-import-taxonomy-for-layers
+excerpt: An anvi'o program. Import layers-level taxonomy into an anvi'o additional layer data table in an anvi'o single-profile database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-import-taxonomy-for-layers
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Import layers-level taxonomy into an anvi'o additional layer data table in an anvi'o single-profile database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[single-profile-db](../../artifacts/single-profile-db) [layer-taxonomy-txt](../../artifacts/layer-taxonomy-txt)
+
+
+## Can provide
+
+
+[layer-taxonomy](../../artifacts/layer-taxonomy)
+
+
+## Usage
+
+
+This program lets you associate your layers with taxonomic information through a [single-profile-db](/help/8/artifacts/single-profile-db).
+
+This information is displayed in the interactive interface at the same place as [misc-data-layers](/help/8/artifacts/misc-data-layers), which is point (4) on [this page](http://merenlab.org/2017/12/11/additional-data-tables/#views-items-layers-orders-some-anvio-terminology).
+
+If instead you want the layers to *represent* taxonomic ranks, then you'll want to take a look at [this tutorial on phylogenomics](http://merenlab.org/2017/06/07/phylogenomics/).
+
+Usually, the layers describe separate samples. However, when working with only one sample, you may break up different aspects of that sample to be represented in each layer, hence why you might want to associate them with taxonomic information.
+
+To run this program, simply provide a [layer-taxonomy-txt](/help/8/artifacts/layer-taxonomy-txt):
+
+
+anvi-import-taxonomy-for-layers -p [single-profile-db](/help/8/artifacts/single-profile-db) \
+ -i [layer-taxonomy-txt](/help/8/artifacts/layer-taxonomy-txt)
+
+
+You also have the option to change the minimum abundance cut off using `--min-abundance`. The default value is 0.1 percent.
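+
+As a minimal sketch, the cutoff could be raised like this. The file names `PROFILE.db` and `layer-taxonomy.txt` and the value `0.5` are placeholders chosen for illustration, and the value is assumed to be expressed in percent, like the default mentioned above:
+
+```
+# hypothetical file names; raises the minimum abundance cutoff above its default
+anvi-import-taxonomy-for-layers -p PROFILE.db \
+                                -i layer-taxonomy.txt \
+                                --min-abundance 0.5
+```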
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-import-taxonomy-for-layers.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-import-taxonomy-for-layers) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-import-taxonomy-for-layers/network.json b/help/8/programs/anvi-import-taxonomy-for-layers/network.json
new file mode 100644
index 00000000..72f7d491
--- /dev/null
+++ b/help/8/programs/anvi-import-taxonomy-for-layers/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "layer-taxonomy",
+ "name": "layer-taxonomy",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "single-profile-db",
+ "name": "single-profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "layer-taxonomy-txt",
+ "name": "layer-taxonomy-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-import-taxonomy-for-layers",
+ "name": "anvi-import-taxonomy-for-layers",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-init-bam/index.md b/help/8/programs/anvi-init-bam/index.md
new file mode 100644
index 00000000..d2b6302a
--- /dev/null
+++ b/help/8/programs/anvi-init-bam/index.md
@@ -0,0 +1,74 @@
+---
+layout: program
+title: anvi-init-bam
+excerpt: An anvi'o program. Sort/Index BAM files.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-init-bam
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Sort/Index BAM files.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[raw-bam-file](../../artifacts/raw-bam-file)
+
+
+## Can provide
+
+
+[bam-file](../../artifacts/bam-file)
+
+
+## Usage
+
+
+This program sorts and indexes your BAM files, essentially converting a [raw-bam-file](/help/8/artifacts/raw-bam-file) into a [bam-file](/help/8/artifacts/bam-file), which is ready to be used in anvi'o.
+
+If you're unsure what a BAM file is, check out the [bam-file](/help/8/artifacts/bam-file) page or [this file](https://samtools.github.io/hts-specs/SAMv1.pdf), written by the developers of samtools. For a description of what indexing a BAM file does, check out the page for [raw-bam-file](/help/8/artifacts/raw-bam-file).
+
+To run this program, just provide a path to the BAM file that you want to sort and index. For example:
+
+
+anvi-init-bam [raw-bam-file](/help/8/artifacts/raw-bam-file)
+
+
+If your system supports it, you can also shorten the runtime by using multiple threads with the flag `-T` followed by the desired number of threads.
+
+To see it in action (plus a description on how to run it on an entire folder), check out [this page](http://merenlab.org/2016/06/22/anvio-tutorial-v2/#anvi-init-bam).
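+
+As a rough sketch of the folder-wide case, a shell loop along these lines may be enough. The `*-RAW.bam` naming scheme and the thread count are assumptions made for illustration, and the `-o` flag is assumed to set the name of the output file, so please check `anvi-init-bam -h` before relying on it:
+
+```
+# assumes raw files are named like SAMPLE-01-RAW.bam (hypothetical convention)
+for raw in *-RAW.bam; do
+    # skip the literal pattern if no file matches
+    [ -e "$raw" ] || continue
+    # strip the -RAW.bam suffix to name the sorted and indexed output
+    anvi-init-bam "$raw" -o "${raw%-RAW.bam}.bam" -T 4
+done
+```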
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-init-bam.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Another description as part of the metagenomic workflow](http://merenlab.org/2016/06/22/anvio-tutorial-v2/#anvi-profile)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-init-bam) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-init-bam/network.json b/help/8/programs/anvi-init-bam/network.json
new file mode 100644
index 00000000..06bfbcbc
--- /dev/null
+++ b/help/8/programs/anvi-init-bam/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "bam-file",
+ "name": "bam-file",
+ "provided_by_anvio": false,
+ "type": "BAM"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "raw-bam-file",
+ "name": "raw-bam-file",
+ "provided_by_anvio": false,
+ "type": "BAM"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-init-bam",
+ "name": "anvi-init-bam",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-inspect/index.md b/help/8/programs/anvi-inspect/index.md
new file mode 100644
index 00000000..405d0424
--- /dev/null
+++ b/help/8/programs/anvi-inspect/index.md
@@ -0,0 +1,72 @@
+---
+layout: program
+title: anvi-inspect
+excerpt: An anvi'o program. Start an anvi'o inspect interactive interface.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-inspect
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Start an anvi'o inspect interactive interface.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db) [bin](../../artifacts/bin)
+
+
+## Can provide
+
+
+[interactive](../../artifacts/interactive) [contig-inspection](../../artifacts/contig-inspection)
+
+
+## Usage
+
+
+This lets you inspect a single split across your samples. This interface can also be opened from the [anvi-interactive](/help/8/programs/anvi-interactive) interface by asking for details about a specific split.
+
+From this view, you can clearly see the coverage and detection across your split, all SNVs, and the genes identified within your split and their functional annotations. You can also easily compare all of this data across all of the samples that this split is present in.
+
+To run this program, just provide a [profile-db](/help/8/artifacts/profile-db) and [contigs-db](/help/8/artifacts/contigs-db) pair and a single split name to inspect.
+
+
+anvi-inspect -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --split-name Day17a_QCcontig9_split_00003
+
+
+You can also choose to hide SNVs marked as outliers or configure the server in various ways.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-inspect.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Visualizing contig coverages](https://merenlab.org/2019/11/25/visualizing-coverages/)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-inspect) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-inspect/network.json b/help/8/programs/anvi-inspect/network.json
new file mode 100644
index 00000000..db839bed
--- /dev/null
+++ b/help/8/programs/anvi-inspect/network.json
@@ -0,0 +1,82 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "interactive",
+ "name": "interactive",
+ "provided_by_anvio": true,
+ "type": "DISPLAY"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contig-inspection",
+ "name": "contig-inspection",
+ "provided_by_anvio": true,
+ "type": "DISPLAY"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-inspect",
+ "name": "anvi-inspect",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 5,
+ "target": 0
+ },
+ {
+ "source": 5,
+ "target": 1
+ },
+ {
+ "target": 5,
+ "source": 2
+ },
+ {
+ "target": 5,
+ "source": 3
+ },
+ {
+ "target": 5,
+ "source": 4
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-interactive/index.md b/help/8/programs/anvi-interactive/index.md
new file mode 100644
index 00000000..6bf9344e
--- /dev/null
+++ b/help/8/programs/anvi-interactive/index.md
@@ -0,0 +1,216 @@
+---
+layout: program
+title: anvi-interactive
+excerpt: An anvi'o program. Start an anvi'o server for the interactive interface.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-interactive
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Start an anvi'o server for the interactive interface.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+
+
+
+
+
+
+## Can consume
+
+
+[profile-db](../../artifacts/profile-db) [single-profile-db](../../artifacts/single-profile-db) [contigs-db](../../artifacts/contigs-db) [genes-db](../../artifacts/genes-db) [bin](../../artifacts/bin) [view-data](../../artifacts/view-data) [dendrogram](../../artifacts/dendrogram) [phylogeny](../../artifacts/phylogeny)
+
+
+## Can provide
+
+
+[collection](../../artifacts/collection) [bin](../../artifacts/bin) [interactive](../../artifacts/interactive) [svg](../../artifacts/svg) [contig-inspection](../../artifacts/contig-inspection)
+
+
+## Usage
+
+
+Initiates an interactive environment in your default browser.
+
+Although it is generally associated with the typical concentric circles of 'omics data, the anvi'o interactive interface has many forms and offers a vast amount of functionality, from manual reconstruction of genomes from metagenomes to refinement of metagenome-assembled genomes, displaying nucleotide-level coverage patterns, single-nucleotide variants, pangenomes, phylogenomic trees, and more. While the circular display is the default method for data presentation, you can also display your data in a rectangular form (as seen [here](http://merenlab.org/tutorials/interactive-interface/#lets-go-all-corners)).
+
+In fact, the interface has many of its own blog posts, including a pretty comprehensive introductory tutorial [here](http://merenlab.org/tutorials/interactive-interface/) and a breakdown of its data types [here](http://merenlab.org/2016/02/27/the-anvio-interactive-interface/).
+
+Here, we'll go through *some* things that the anvi'o interactive interface is capable of through this program. More information about most of this can be found by calling `anvi-interactive -h` or by checking out the additional resources at the bottom of this page.
+
+Please make sure you are familiar with the terminology that describes various parts of a given display, which is **explained in the [interactive](/help/8/artifacts/interactive) artifact**:
+
+![an anvi'o display](../../images/interactive_interface/anvio_display_template.png){:.center-img}
+
+
+## Running anvi-interactive on a profile database
+
+One of the simplest ways to run the interactive interface (especially useful for manual binning) is just providing an anvi'o profile database and an anvi'o contigs database:
+
+
+anvi-interactive -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db)
+
+
+For the central tree to display correctly, you'll need to have run hierarchical clustering at some point while making your profile database (either during [anvi-merge](/help/8/programs/anvi-merge), or, if this is a [single-profile-db](/help/8/artifacts/single-profile-db), while running [anvi-profile](/help/8/programs/anvi-profile)). It is also possible to provide a phylogenetic tree or a clustering dendrogram from the command line using the `--tree` parameter.
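+
+For instance, assuming you already have a NEWICK-formatted tree whose leaf names match the items in your profile database (the file names below are placeholders), a call could look like this:
+
+```
+# hypothetical file names; the tree supplies the items order for the display
+anvi-interactive -p PROFILE.db \
+                 -c CONTIGS.db \
+                 --tree my_tree.newick
+```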
+
+If you do not have a [state](/help/8/artifacts/state) stored in your profile database named `default`, you will need to click the "Draw" button for anvi'o to provide you with an [interactive](/help/8/artifacts/interactive) display of your data.
+
+### How to visualize things when you don't have a hierarchical clustering of your contigs?
+
+Typically the [interactive](/help/8/artifacts/interactive) displays that will be initiated with `anvi-interactive` will require an items order to display all your contigs. There are multiple ways for anvi'o to generate dendrograms.
+
+{:.notice}
+Some advanced information you should feel free to skip: anvi'o uses a set of [clustering-configuration](/help/8/artifacts/clustering-configuration) files to decide which sources of data to use to cluster items. These recipes are essentially a set of configuration files for anvi'o to learn which information to use from [contigs-db](/help/8/artifacts/contigs-db), [profile-db](/help/8/artifacts/profile-db), or [pan-db](/help/8/artifacts/pan-db) type databases.
+
+Some of the programs that generate dendrograms include [anvi-merge](/help/8/programs/anvi-merge), [anvi-profile](/help/8/programs/anvi-profile), and [anvi-experimental-organization](/help/8/programs/anvi-experimental-organization). But since hierarchical clustering is an extremely demanding process, anvi'o will skip this step during [anvi-merge](/help/8/programs/anvi-merge) if there are more than 20,000 splits in the database. This is because the process becomes less and less feasible computationally as the number of splits increases. You can force anvi'o to try to cluster your splits regardless of how many there are by using the flag `--enforce-hierarchical-clustering`. However, we strongly advise against it, especially if you have more than 30,000 splits, since your process will likely be killed by the operating system or take a very long time to finish (plus, if you have that many splits, the performance of the interactive interface will be very low).
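+
+If you decide to try it anyway, the merge step might look like the sketch below, where the directory layout and the output name are placeholders rather than anything anvi'o requires:
+
+```
+# hypothetical paths; forces clustering even above the 20,000-split limit
+anvi-merge */PROFILE.db -c CONTIGS.db \
+                        -o SAMPLES-MERGED \
+                        --enforce-hierarchical-clustering
+```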
+
+What happens if you don't have a hierarchical clustering dendrogram, but you still wish to have an overall understanding of your data, or visualize the coverages of some contigs of interest or any contig at all? There are multiple ways you can do that:
+
+* You can use [anvi-inspect](/help/8/programs/anvi-inspect) to visualize nucleotide- and gene-level coverages and single-nucleotide variants on individual contigs,
+* You can use [anvi-cluster-contigs](/help/8/programs/anvi-cluster-contigs) to create a collection for your contigs and initiate `anvi-interactive` in collection mode (see the subsection "[visualizing bins instead of contigs](#visualizing-bins-instead-of-contigs)" below).
+* You can import external binning results using [anvi-import-collection](/help/8/programs/anvi-import-collection), or manually identify contigs of interest, and use [anvi-import-collection](/help/8/programs/anvi-import-collection) to create a collection of a smaller number of contigs. You can then use [anvi-refine](/help/8/programs/anvi-refine) to visualize contigs in a single bin, or use [anvi-split](/help/8/programs/anvi-split) to first generate a split profile for your contigs to visualize your smaller dataset using [anvi-interactive](/help/8/programs/anvi-interactive).
+
+
+### Collection mode: Visualizing *bins* instead of contigs
+
+By default, when run on a profile database that resulted from a metagenomic workflow, [anvi-interactive](/help/8/programs/anvi-interactive) will initiate each contig as a separate item and organize them based on the clustering dendrograms provided to it (either automatically or by the user). But if there is a [collection](/help/8/artifacts/collection) stored in the profile database, it is also possible to run [anvi-interactive](/help/8/programs/anvi-interactive) on a specific collection, during which anvi'o will use the underlying contig data to calculate summary statistics for each bin before displaying them. In collection mode, each item of your central plot will not represent a contig, but a bin within your collection. This is how the collection mode can be initialized in comparison to the default mode:
+
+
+anvi-interactive -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -C [collection](/help/8/artifacts/collection)
+
+
+In this case, the clustering of [bin](/help/8/artifacts/bin)s based on their distribution across samples will be done automatically on the fly. See the note on this mode in [the metagenomic workflow](http://merenlab.org/2016/06/22/anvio-tutorial-v2/#anvi-interactive) for more information.
+
+### Genes mode: Visualizing *genes* instead of contigs
+
+You can also start the interactive interface in "gene mode", in which each item of the central tree is a gene instead of a split or contig (or bin like in "collection mode").
+
+To initiate the visualization in gene mode you need the following:
+
+
+anvi-interactive -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ -b [bin](/help/8/artifacts/bin) \
+ --gene-mode
+
+
+If there isn't one already, this command will automatically generate an anvi'o [genes-db](/help/8/artifacts/genes-db) under the `GENES` directory at the same level as the profile database. When the same command is run again, [anvi-interactive](/help/8/programs/anvi-interactive) will use the existing genes database.
+
+In this view you can order genes based on their distribution patterns across metagenomes (*without paying attention to their synteny*) or based on their synteny in a given genome (*without paying attention to their differential distribution*). [Figure 2 in this paper](https://peerj.com/articles/4320/) illustrates the latter, and [Figure 5 in this paper](https://stm.sciencemag.org/content/11/507/eaau9356) illustrates the former case, which is also shown below:
+
+![](http://merenlab.org/images/gene-distribution-across-metagenomes.png)
+
+You can also visit [this page](http://merenlab.org/tutorials/infant-gut/#the-gene-mode-studying-distribution-patterns-at-the-gene-level) to see another practical example from the Infant Gut tutorial.
+
+## Manual mode: visualize anything
+
+You can initiate the anvi'o interactive interface in manual mode to run it on *ad hoc* tabular data (here is [a tutorial on this](http://merenlab.org/tutorials/interactive-interface/)).
+
+To do so, start the anvi'o interactive interface with the flag `--manual-mode` and provide *any* of the following types of files, individually or together:
+
+- a TAB-delimited data file,
+- a NEWICK-formatted tree.
+
+When doing this kind of run, anvi'o does not expect you to have a profile database, but it still needs you to provide a name for it. Anvi'o will simply create an empty one for you so you can store your state or collections in it for reproducibility.
+
+## Extending anvi'o displays
+
+You can extend any [interactive](/help/8/artifacts/interactive) display in anvi'o with additional data related to your project through the program [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data). [This article](https://merenlab.org/2017/12/11/additional-data-tables/) describes a detailed use of this program.
+
+While the use of [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data) is the most effective way to improve anvi'o displays, you can also use the parameter `--additional-layers` to provide a TAB-delimited file ([misc-data-items-txt](/help/8/artifacts/misc-data-items-txt)) that contains additional layers of information over your items.
+
+If you want to add an entirely new view to the interface, you can do that too, as long as you provide a file containing all split names and their associated values. For more information, see the parameter `--additional-view`.
+
+You can also provide the manual inputs even if you're using an anvi'o database. For example, if you provide your own NEWICK formatted tree, you will have the option to display it instead of the one in your database.
+
+
+## Visualization Settings
+
+In anvi'o, the visualization settings at a given time are called a [state](/help/8/artifacts/state).
+
+To open the interface in a specific state, you can use the `--state-autoload` flag or import a state using [anvi-import-state](/help/8/programs/anvi-import-state).
+
+You can also customize various aspects of the interactive interface. For example, you can change the preselected view, title, and taxonomic level displayed (for example, showing the class name instead of the genus name). You can also hide outlier single nucleotide variations or open only a specific collection.
+
+## Password protection
+
+Use the `--password-protected` flag to limit access to your interactive instances, which by default are accessible to anyone on your network.
+
+
+## Quick solutions for network problems
+
+In a typical run, [anvi-interactive](/help/8/programs/anvi-interactive) initiates a local server to which you connect through your browser to visualize data. This can yield unexpected problems if you are running anvi'o in virtual environments such as Windows Subsystem for Linux. If your browser does not open, or you get cryptic errors such as "*tcgetpgrp failed: Not a tty*", you can always simplify things by manually setting network properties such as `--ip-address` and `--port-number`.
+
+For instance, you can start an interactive interface the following way:
+
+
+anvi-interactive -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --ip-address 127.0.0.1 \
+ --port-number 8901 \
+ --server-only
+
+
+This will not launch your browser, but you can then open your browser yourself and go to this address to work with the anvi'o interactive interface:
+
+* [http://127.0.0.1:8901](http://127.0.0.1:8901)
+
+## Other things
+
+### Viewing your data
+
+You can also use this program to list the information available in your databases. For example, you can view all of the available:
+
+- views (using `--show-views`)
+- states (using `--show-states`)
+- collections (using `--list-collections`)
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-interactive.md) to update this information.
+
+
+## Additional Resources
+
+
+* [A beginners tutorial on anvi'o interactive interface](http://merenlab.org/tutorials/interactive-interface/)
+
+* [How to add more data to a display for layers and items](http://merenlab.org/2017/12/11/additional-data-tables/)
+
+* [An overview of interactive data types](http://merenlab.org/2016/02/27/the-anvio-interactive-interface/)
+
+* [Anvi'o 'views' demystified](http://merenlab.org/2017/05/08/anvio-views/)
+
+* [Working with SVG files from the interactive interface](http://merenlab.org/2016/10/27/high-resolution-figures/)
+
+* [Running remote anvi'o interactive interfaces on your local computer](http://merenlab.org/2018/03/07/working-with-remote-interative/)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-interactive) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-interactive/network.json b/help/8/programs/anvi-interactive/network.json
new file mode 100644
index 00000000..c080f3b3
--- /dev/null
+++ b/help/8/programs/anvi-interactive/network.json
@@ -0,0 +1,177 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 2,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "interactive",
+ "name": "interactive",
+ "provided_by_anvio": true,
+ "type": "DISPLAY"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "svg",
+ "name": "svg",
+ "provided_by_anvio": true,
+ "type": "SVG"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contig-inspection",
+ "name": "contig-inspection",
+ "provided_by_anvio": true,
+ "type": "DISPLAY"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "single-profile-db",
+ "name": "single-profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genes-db",
+ "name": "genes-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "view-data",
+ "name": "view-data",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "dendrogram",
+ "name": "dendrogram",
+ "provided_by_anvio": true,
+ "type": "NEWICK"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "phylogeny",
+ "name": "phylogeny",
+ "provided_by_anvio": true,
+ "type": "NEWICK"
+ },
+ {
+ "size": 13,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-interactive",
+ "name": "anvi-interactive",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 12,
+ "target": 0
+ },
+ {
+ "source": 12,
+ "target": 1
+ },
+ {
+ "target": 12,
+ "source": 1
+ },
+ {
+ "source": 12,
+ "target": 2
+ },
+ {
+ "source": 12,
+ "target": 3
+ },
+ {
+ "source": 12,
+ "target": 4
+ },
+ {
+ "target": 12,
+ "source": 5
+ },
+ {
+ "target": 12,
+ "source": 6
+ },
+ {
+ "target": 12,
+ "source": 7
+ },
+ {
+ "target": 12,
+ "source": 8
+ },
+ {
+ "target": 12,
+ "source": 9
+ },
+ {
+ "target": 12,
+ "source": 10
+ },
+ {
+ "target": 12,
+ "source": 11
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-matrix-to-newick/index.md b/help/8/programs/anvi-matrix-to-newick/index.md
new file mode 100644
index 00000000..98ee355e
--- /dev/null
+++ b/help/8/programs/anvi-matrix-to-newick/index.md
@@ -0,0 +1,73 @@
+---
+layout: program
+title: anvi-matrix-to-newick
+excerpt: An anvi'o program. Takes a distance matrix, returns a newick tree.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-matrix-to-newick
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Takes a distance matrix, returns a newick tree.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[view-data](../../artifacts/view-data)
+
+
+## Can provide
+
+
+[dendrogram](../../artifacts/dendrogram)
+
+
+## Usage
+
+
+You can send any matrix file to this program to get a [dendrogram](/help/8/artifacts/dendrogram) from it.
+
+An example run would look like this:
+
+
+anvi-matrix-to-newick TAB_DELIMITED_DATA.txt \
+ [dendrogram](/help/8/artifacts/dendrogram)
+
+
+By default, [anvi-matrix-to-newick](/help/8/programs/anvi-matrix-to-newick) will cluster rows. With the flag `--transpose`, it will cluster columns.
+
+See [here](https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.pdist.html) for a list of distance metrics you can use, and [here](https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.linkage.html) for a list of linkage methods you can use.
+
+[anvi-matrix-to-newick](/help/8/programs/anvi-matrix-to-newick) can handle missing data, but in that case the program will not normalize your data and will assume that it is already normalized.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-matrix-to-newick.md) to update this information.
+
+
+## Additional Resources
+
+
+* [See this program in action in the pangenomics tutorial](http://merenlab.org/2016/11/08/pangenomics-v2/#creating-a-quick-pangenome-with-functions)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-matrix-to-newick) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-matrix-to-newick/network.json b/help/8/programs/anvi-matrix-to-newick/network.json
new file mode 100644
index 00000000..d4d099ed
--- /dev/null
+++ b/help/8/programs/anvi-matrix-to-newick/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "dendrogram",
+ "name": "dendrogram",
+ "provided_by_anvio": true,
+ "type": "NEWICK"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "view-data",
+ "name": "view-data",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-matrix-to-newick",
+ "name": "anvi-matrix-to-newick",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-merge-bins/index.md b/help/8/programs/anvi-merge-bins/index.md
new file mode 100644
index 00000000..b010576b
--- /dev/null
+++ b/help/8/programs/anvi-merge-bins/index.md
@@ -0,0 +1,73 @@
+---
+layout: program
+title: anvi-merge-bins
+excerpt: An anvi'o program. Merge a given set of bins in an anvi'o collection.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-merge-bins
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Merge a given set of bins in an anvi'o collection.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[pan-db](../../artifacts/pan-db) [profile-db](../../artifacts/profile-db) [collection](../../artifacts/collection) [bin](../../artifacts/bin)
+
+
+## Can provide
+
+
+This program does not seem to provide any artifacts. Such programs usually print out some information for you to see or alter some anvi'o artifacts without producing any immediate outputs.
+
+
+## Usage
+
+
+This program **merges two or more [bin](/help/8/artifacts/bin)s together** into a single [bin](/help/8/artifacts/bin).
+
+To run this program, the bins that you want to merge must be contained within a single [collection](/help/8/artifacts/collection). Just provide the collection name, the [pan-db](/help/8/artifacts/pan-db) or [profile-db](/help/8/artifacts/profile-db) you're working with, the bins that you want to merge, and the name of the output bin.
+
+To check what collections and bins are contained in a database, you can either run this program with the flag `--list-collections` or `--list-bins`, or you can run [anvi-show-collections-and-bins](/help/8/programs/anvi-show-collections-and-bins).
+
+For example, if you wanted to merge the bins `first_third`, `middle_third`, and `last_third` in a pan-db into a single bin called `complete_bin`, just run:
+
+
+anvi-merge-bins -p [pan-db](/help/8/artifacts/pan-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ -b first_third,middle_third,last_third \
+ -B complete_bin
+
+
+Now your collection will contain the bin `complete_bin` and the original bins will be gone forever (unless you had run [anvi-summarize](/help/8/programs/anvi-summarize), [anvi-export-collection](/help/8/programs/anvi-export-collection), or a similar program beforehand).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-merge-bins.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-merge-bins) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-merge-bins/network.json b/help/8/programs/anvi-merge-bins/network.json
new file mode 100644
index 00000000..fb3a6896
--- /dev/null
+++ b/help/8/programs/anvi-merge-bins/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-merge-bins",
+ "name": "anvi-merge-bins",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "target": 4,
+ "source": 0
+ },
+ {
+ "target": 4,
+ "source": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-merge-trnaseq/index.md b/help/8/programs/anvi-merge-trnaseq/index.md
new file mode 100644
index 00000000..14cf3a6d
--- /dev/null
+++ b/help/8/programs/anvi-merge-trnaseq/index.md
@@ -0,0 +1,93 @@
+---
+layout: program
+title: anvi-merge-trnaseq
+excerpt: An anvi'o program. This program processes one or more anvi'o tRNA-seq databases produced by `anvi-trnaseq` and outputs anvi'o contigs and merged profile databases accessible to other tools in the anvi'o ecosystem.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-merge-trnaseq
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+This program processes one or more anvi'o tRNA-seq databases produced by `anvi-trnaseq` and outputs anvi'o contigs and merged profile databases accessible to other tools in the anvi'o ecosystem. Final tRNA "seed sequences" are determined from a set of samples. Each sample yields a set of tRNA predictions stored in a tRNA-seq database, and these tRNAs may be shared among the samples. tRNAs may be 3' fragments and thereby subsequences of longer tRNAs from other samples, which would become seeds. The profile database produced by this program records the coverages of seeds in each sample. This program finalizes predicted nucleotide modification sites using tunable substitution rate parameters.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[trnaseq-db](../../artifacts/trnaseq-db)
+
+
+## Can provide
+
+
+[trnaseq-contigs-db](../../artifacts/trnaseq-contigs-db) [trnaseq-profile-db](../../artifacts/trnaseq-profile-db)
+
+
+## Usage
+
+
+This program **finds tRNA seed sequences from a set of tRNA-seq samples**.
+
+This program follows [anvi-trnaseq](/help/8/programs/anvi-trnaseq) in the [trnaseq-workflow](../../workflows/trnaseq/). [anvi-trnaseq](/help/8/programs/anvi-trnaseq) is run on each tRNA-seq sample, producing sample [trnaseq-db](/help/8/artifacts/trnaseq-db)s. A tRNA-seq database contains predictions of tRNA sequences, structures, and modification sites in the sample. anvi-merge-trnaseq takes as input the tRNA-seq databases from a set of samples. It compares tRNAs predicted from the samples, finding those in common and calculating their sample coverages. The final tRNA sequences predicted from all samples are called **tRNA seeds** and function like contigs in metagenomic experiments. Seeds are stored in a [trnaseq-contigs-db](/help/8/artifacts/trnaseq-contigs-db) and sample coverages are stored in a [trnaseq-profile-db](/help/8/artifacts/trnaseq-profile-db). These databases are **variants** of normal [contigs-db](/help/8/artifacts/contigs-db)s and [profile-db](/help/8/artifacts/profile-db)s, performing similar functions in the anvi'o ecosystem but containing somewhat different information.
+
+Most of the heavy computational work in the [trnaseq-workflow](../../workflows/trnaseq/) is performed by [anvi-trnaseq](/help/8/programs/anvi-trnaseq). anvi-merge-trnaseq is meant to run relatively quickly, allowing its parameters to be tuned to fit the dataset.
+
+The `anvi-merge-trnaseq --help` menu provides detailed explanations of the parameters controlling the multifaceted analyses performed by the program.
+
+## Key parameters
+
+### Number of reported seeds
+
+One key parameter is the number of reported tRNA seed sequences (`--max-reported-trna-seeds`). The default value of 10,000 seeds is more appropriate for a complex microbial community than a pure culture of a bacterial isolate, which should yield a number of tRNA seeds equal to the number of expressed tRNAs, say ~30. With a higher value like 10,000, sequence artifacts may be reported in addition to the ~30 actual tRNAs. Artifacts are relatively common despite intensive screening by [anvi-trnaseq](/help/8/programs/anvi-trnaseq) and anvi-merge-trnaseq, due to nontemplated nucleotides and modification-induced mutations introduced into tRNA-seq reads by reverse transcription. In practice, artifacts are easy to distinguish from true tRNA seeds by analyzing seed coverage in [anvi-interactive](/help/8/programs/anvi-interactive) and checking seed homology to reference databases, among other measures.
+
+### Modification filters
+
+Other key parameters, `--min-variation` and `--min-third-fourth-nt`, determine the coverage cutoffs that distinguish predicted positions of modified nucleotides from single nucleotide variants (SNVs). Compared to SNVs, modifications typically produce higher nucleotide variability, to three or four different nucleotides. However, modification-induced mutations are often highly skewed to one other nucleotide rather than all three mutant nucleotides. Furthermore, the high coverage of seeds in many tRNA-seq libraries can uncover SNVs with a low-frequency third nucleotide rather than the expected two. Some SNVs that are wrongly called modifications can be easily spotted in [anvi-interactive](/help/8/programs/anvi-interactive) and the output of [anvi-plot-trnaseq](/help/8/programs/anvi-plot-trnaseq) due to covariation at two positions in the seed as a result of base pairing. In other words, SNV frequencies are equivalent at the two base-paired positions in every sample, whereas modification artifacts have no effect on nucleotide variability at another position across the molecule.
+
+## Examples
+
+*Merge two samples.*
+
+
+anvi-merge-trnaseq trnaseq_database_1 trnaseq_database_2 (...) \
+ -o OUTPUT_DIRECTORY \
+ -n PROJECT_NAME
+
+
+*Merge two samples with and without demethylase treatment, giving priority to the demethylase split in calling the underlying nucleotide at modified positions.*
+
+
+anvi-merge-trnaseq untreated_trnaseq_database demethylase_trnaseq_database (...) \
+ -o OUTPUT_DIRECTORY \
+ -n PROJECT_NAME \
+ --preferred-treatment demethylase
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-merge-trnaseq.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-merge-trnaseq) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-merge-trnaseq/network.json b/help/8/programs/anvi-merge-trnaseq/network.json
new file mode 100644
index 00000000..1d678f6b
--- /dev/null
+++ b/help/8/programs/anvi-merge-trnaseq/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "trnaseq-contigs-db",
+ "name": "trnaseq-contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "trnaseq-profile-db",
+ "name": "trnaseq-profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "trnaseq-db",
+ "name": "trnaseq-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-merge-trnaseq",
+ "name": "anvi-merge-trnaseq",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "source": 3,
+ "target": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-merge/index.md b/help/8/programs/anvi-merge/index.md
new file mode 100644
index 00000000..cbb84533
--- /dev/null
+++ b/help/8/programs/anvi-merge/index.md
@@ -0,0 +1,99 @@
+---
+layout: program
+title: anvi-merge
+excerpt: An anvi'o program. Merge multiple anvio profiles.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-merge
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Merge multiple anvio profiles.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[single-profile-db](../../artifacts/single-profile-db) [contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+[profile-db](../../artifacts/profile-db) [misc-data-items-order](../../artifacts/misc-data-items-order)
+
+
+## Usage
+
+
+The main function of `anvi-merge` is to convert multiple [single-profile-db](/help/8/artifacts/single-profile-db)s into a single [profile-db](/help/8/artifacts/profile-db) (also called a merged profile database). Basically, this takes the alignment data from each sample (each contained in its own [single-profile-db](/help/8/artifacts/single-profile-db)) and combines them into a single database that anvi'o can look through more easily.
+
+### Overview: How to run anvi-merge
+
+1. Set up your [contigs-db](/help/8/artifacts/contigs-db). See that page for more information.
+
+1. Use [anvi-profile](/help/8/programs/anvi-profile) to create a [single-profile-db](/help/8/artifacts/single-profile-db) for each of your samples (formatted into a [bam-file](/help/8/artifacts/bam-file)). *(Note: for each of these runs, you'll need to use the same [contigs-db](/help/8/artifacts/contigs-db) and parameters.)*
+
+1. Use `anvi-merge` to merge those [single-profile-db](/help/8/artifacts/single-profile-db)s into a single database, called a [profile-db](/help/8/artifacts/profile-db). This will look something like the following:
+
+
+anvi-merge -c cool_contigs.db \
+ Single_profile_db_1 Single_profile_db_2 \
+ -o cool_contigs_merge
+
+
+This will put all of the output files into the folder `cool_contigs_merge`: the final [profile-db](/help/8/artifacts/profile-db), as well as a [misc-data-items-order](/help/8/artifacts/misc-data-items-order), which is the result of your hierarchical clustering and describes the order in which to display your contigs.
+
+
+## Other Parameters
+
+You must give `anvi-merge` your contigs database and single profile databases. However, you can also provide more information or give additional instructions. Use the flag `-h` at any time to display the help menu.
+
+### Hierarchical Clustering
+
+#### To run or not to run?
+* Use the flag `--skip-hierarchical-clustering` to turn hierarchical clustering off. This will save on computation time, but will skip out on creating the tree of contigs at the center of the interactive interface. If you have more than 25,000 splits in the final profile, this will be set automatically.
+* Use the flag `--enforce-hierarchical-clustering` to turn hierarchical clustering back on. This will take a long time, but will produce a lovely contigs tree for the interactive interface.
+
+#### Additional parameters
+* Provide a custom distance metric for clustering using the flag `--distance` (the default is euclidean).
+* Provide a custom linkage method for clustering using the flag `--linkage` (the default is ward).
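+
+For reference, the default euclidean metric is the standard straight-line distance. As a sketch, for two items described by $$n$$ values each, $$x_i$$ and $$y_i$$ (notation introduced here for illustration, not taken from the anvi'o code):
+
+$$
+d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}
+$$
+
+Items with similar values therefore get small distances and end up close together in the resulting tree.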
+
+### Providing additional information
+* Provide the sample name using the flag `-S`. If you don't, anvi'o will come up with one, but it probably won't be any good.
+* Provide a text file in markdown to describe the project using the flag `--description`. This will show up when you later use the interactive interface to analyze your profile-db.
+
+### Output Information
+* Provide an output destination with the flag `-o`.
+* Add the flag `-W` to overwrite existing files in that directory.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-merge.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Another description as part of the metagenomic workflow](http://merenlab.org/2016/06/22/anvio-tutorial-v2/#anvi-profile)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-merge) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-merge/network.json b/help/8/programs/anvi-merge/network.json
new file mode 100644
index 00000000..96c5de1c
--- /dev/null
+++ b/help/8/programs/anvi-merge/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-items-order",
+ "name": "misc-data-items-order",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "single-profile-db",
+ "name": "single-profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-merge",
+ "name": "anvi-merge",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "source": 4,
+ "target": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-meta-pan-genome/index.md b/help/8/programs/anvi-meta-pan-genome/index.md
new file mode 100644
index 00000000..4f8c2899
--- /dev/null
+++ b/help/8/programs/anvi-meta-pan-genome/index.md
@@ -0,0 +1,73 @@
+---
+layout: program
+title: anvi-meta-pan-genome
+excerpt: An anvi'o program. Convert a pangenome into a metapangenome.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-meta-pan-genome
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Convert a pangenome into a metapangenome.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[internal-genomes](../../artifacts/internal-genomes) [pan-db](../../artifacts/pan-db) [genomes-storage-db](../../artifacts/genomes-storage-db)
+
+
+## Can provide
+
+
+[metapangenome](../../artifacts/metapangenome)
+
+
+## Usage
+
+
+This program integrates the information from an [internal-genomes](/help/8/artifacts/internal-genomes) artifact into a [pan-db](/help/8/artifacts/pan-db), creating a metapangenome.
+
+A metapangenome contains both the information in a metagenome (i.e., abundances across different samples, as described in your [profile-db](/help/8/artifacts/profile-db)) and the information in a pangenome (i.e., the gene clusters in your dataset). This is useful because it lets you observe which gene cluster patterns are present in certain environments. For an example of a metapangenomic workflow, take a look [here](http://merenlab.org/data/prochlorococcus-metapangenome/) (this tutorial was written before this program existed, but the insights persist).
+
+To use this program, provide a [pan-db](/help/8/artifacts/pan-db) and [genomes-storage-db](/help/8/artifacts/genomes-storage-db) pair, as well as an [internal-genomes](/help/8/artifacts/internal-genomes).
+
+
+anvi-meta-pan-genome -p [pan-db](/help/8/artifacts/pan-db) \
+ -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -i [internal-genomes](/help/8/artifacts/internal-genomes)
+
+
+However, when integrating metagenomic and pangenomic data together, you'll get a lot of data. You can set two additional parameters to help you filter out data that doesn't meet certain standards:
+
+- `--fraction-of-median-coverage`: this threshold removes genes with less than this fraction of the median coverage. The default is 0.25. So, for example, if the median coverage in your data was 100X, this would remove all genes with coverage less than 25X.
+- `--min-detection`: this threshold removes genomes with detection less than this value in all samples. The default is 0.5.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-meta-pan-genome.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-meta-pan-genome) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-meta-pan-genome/network.json b/help/8/programs/anvi-meta-pan-genome/network.json
new file mode 100644
index 00000000..a448cefc
--- /dev/null
+++ b/help/8/programs/anvi-meta-pan-genome/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "metapangenome",
+ "name": "metapangenome",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "internal-genomes",
+ "name": "internal-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-meta-pan-genome",
+ "name": "anvi-meta-pan-genome",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "target": 4,
+ "source": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-migrate/index.md b/help/8/programs/anvi-migrate/index.md
new file mode 100644
index 00000000..cad7daf9
--- /dev/null
+++ b/help/8/programs/anvi-migrate/index.md
@@ -0,0 +1,97 @@
+---
+layout: program
+title: anvi-migrate
+excerpt: An anvi'o program. Migrates any anvi'o artifact, whether it is a database or a config file, to a newer version.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-migrate
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Migrates any anvi'o artifact, whether it is a database or a config file, to a newer version. Pure magic.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [profile-db](../../artifacts/profile-db) [pan-db](../../artifacts/pan-db) [genes-db](../../artifacts/genes-db) [genomes-storage-db](../../artifacts/genomes-storage-db) [structure-db](../../artifacts/structure-db) [modules-db](../../artifacts/modules-db) [workflow-config](../../artifacts/workflow-config)
+
+
+## Can provide
+
+
+This program does not seem to provide any artifacts. Such programs usually print out some information for you to see or alter some anvi'o artifacts without producing any immediate outputs.
+
+
+## Usage
+
+
+This is a multi-talented program that seamlessly updates any anvi'o artifact (i.e., anvi'o databases or config files) to its latest version.
+
+You can provide one or more files and the program will migrate them all. However, you must choose whether to migrate the databases safely or quickly.
+
+If you choose to migrate safely, anvi'o will first make a copy of each file and save it as a backup. In case something goes wrong during the migration, it will let you know what happened and restore your original file from its copy. In the case of a successful migration, the old copy will go away gracefully (and quietly).
+
+This is how you migrate safely:
+
+```
+anvi-migrate --migrate-safely *.db
+```
+
+Of course, we will always suggest that migrating safely is better, because fewer people get angry at us when we do that. In practice though, making those backup copies takes up extra time and it is unlikely that the migration will fail anyway, so if you have a lot of databases to migrate and are okay with a bit of risk, you have the option to migrate quickly instead. In this case, anvi'o will _not_ copy your databases before starting the migration.
+
+```
+anvi-migrate --migrate-quickly *.db
+```
+Please remember that by living life in the fast lane, you forego your safety net. On the rare occasion that the migration does fail, this program will let you know what happened and leave you with a database that has a `.broken` file extension. In this unlikely event, you can always reach out to us.
+
+### Migrating to a specific version
+
+If your database is a few versions behind the highest available version but for whatever reason you don't want to migrate it all the way, you can specify which version to update your database to. Just use the `-t` flag (note: migrating with this parameter only works on ONE database at a time):
+
+```
+anvi-migrate --migrate-safely -t 15 CONTIGS.db
+```
+
+Then anvi'o will update your database until it reaches the version you specified and stop. Of course, you cannot provide a version number that is higher than the highest available version. Nor can you provide a number that is lower than your database's current version (i.e., backwards migration is not possible).
+
+Not sure what your database's current version is? Try [anvi-db-info](/help/8/programs/anvi-db-info).
+
+You can always run `anvi-migrate -v` to learn about the versions of artifact types _your_ anvi'o installation can work with.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-migrate.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-migrate) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-migrate/network.json b/help/8/programs/anvi-migrate/network.json
new file mode 100644
index 00000000..d425d154
--- /dev/null
+++ b/help/8/programs/anvi-migrate/network.json
@@ -0,0 +1,121 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genes-db",
+ "name": "genes-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "structure-db",
+ "name": "structure-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "modules-db",
+ "name": "modules-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "workflow-config",
+ "name": "workflow-config",
+ "provided_by_anvio": false,
+ "type": "JSON"
+ },
+ {
+ "size": 8,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-migrate",
+ "name": "anvi-migrate",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "target": 8,
+ "source": 0
+ },
+ {
+ "target": 8,
+ "source": 1
+ },
+ {
+ "target": 8,
+ "source": 2
+ },
+ {
+ "target": 8,
+ "source": 3
+ },
+ {
+ "target": 8,
+ "source": 4
+ },
+ {
+ "target": 8,
+ "source": 5
+ },
+ {
+ "target": 8,
+ "source": 6
+ },
+ {
+ "target": 8,
+ "source": 7
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-oligotype-linkmers/index.md b/help/8/programs/anvi-oligotype-linkmers/index.md
new file mode 100644
index 00000000..8a54db99
--- /dev/null
+++ b/help/8/programs/anvi-oligotype-linkmers/index.md
@@ -0,0 +1,78 @@
+---
+layout: program
+title: anvi-oligotype-linkmers
+excerpt: An anvi'o program. Takes an anvi'o linkmers report, generates an oligotyping output.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-oligotype-linkmers
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Takes an anvi'o linkmers report, generates an oligotyping output.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[linkmers-txt](../../artifacts/linkmers-txt)
+
+
+## Can provide
+
+
+[oligotypes](../../artifacts/oligotypes)
+
+
+## Usage
+
+
+This program converts a [linkmers-txt](/help/8/artifacts/linkmers-txt) artifact into [oligotypes](/help/8/artifacts/oligotypes) data.
+
+A [linkmers-txt](/help/8/artifacts/linkmers-txt) artifact describes each of your short reads that mapped to specific target nucleotide positions in a reference contig. This program counts the total occurrences of each combination of nucleotides at those target positions within each of your samples.
+
+For example, if your [linkmers-txt](/help/8/artifacts/linkmers-txt) focused on two target positions, and you ran the following:
+
+
+anvi-oligotype-linkmers -i [linkmers-txt](/help/8/artifacts/linkmers-txt)
+
+
+The output (which by default is called `oligotype-counts-001.txt`) might look like the following:
+
+ key AG CA CG GA GG TA TG
+ sample_001 0 320 12 2 0 3 579
+ sample_002 0 142 2 0 2 10 353
+ sample_003 3 404 1 1 0 2 610
+ sample_004 0 209 6 0 1 0 240
+
+Note that combinations with zero reads in every sample are not included.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-oligotype-linkmers.md) to update this information.
+
+
+## Additional Resources
+
+
+* [An application of the oligotyping workflow in metagenomics](https://merenlab.org/2015/12/09/musings-over-commamox/#an-application-of-oligotyping-in-the-metagenomic-context-oligotyping-amoc)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-oligotype-linkmers) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-oligotype-linkmers/network.json b/help/8/programs/anvi-oligotype-linkmers/network.json
new file mode 100644
index 00000000..8797ab9f
--- /dev/null
+++ b/help/8/programs/anvi-oligotype-linkmers/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "oligotypes",
+ "name": "oligotypes",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "linkmers-txt",
+ "name": "linkmers-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-oligotype-linkmers",
+ "name": "anvi-oligotype-linkmers",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-pan-genome/index.md b/help/8/programs/anvi-pan-genome/index.md
new file mode 100644
index 00000000..48a64aae
--- /dev/null
+++ b/help/8/programs/anvi-pan-genome/index.md
@@ -0,0 +1,104 @@
+---
+layout: program
+title: anvi-pan-genome
+excerpt: An anvi'o program. An anvi'o program to compute a pangenome from an anvi'o genome storage.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-pan-genome
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+An anvi'o program to compute a pangenome from an anvi'o genome storage.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[genomes-storage-db](../../artifacts/genomes-storage-db)
+
+
+## Can provide
+
+
+[pan-db](../../artifacts/pan-db) [misc-data-items-order](../../artifacts/misc-data-items-order)
+
+
+## Usage
+
+
+This program implements pangenomics, and organizes genes found within a [genomes-storage-db](/help/8/artifacts/genomes-storage-db) to create a [pan-db](/help/8/artifacts/pan-db).
+
+Please first read [the pangenomics tutorial](http://merenlab.org/2016/11/08/pangenomics-v2) to have a better understanding of the steps that lead to the generation of a [pan-db](/help/8/artifacts/pan-db).
+
+### Making sure your installation can do pangenomics
+
+You can always test if your computer has all the dependencies for a successful pangenomics analysis by running,
+
+
+anvi-self-test --suite pangenomics
+
+
+If it runs without errors, you're golden. If not, please consult the most up-to-date installation instructions for anvi'o and get in touch with the anvi'o community for guidance.
+
+### A brief summary
+
+The program [anvi-pan-genome](/help/8/programs/anvi-pan-genome) does three major things for its user:
+
+* Calculates the similarity between all gene amino acid sequences found in the genomes described in your [genomes-storage-db](/help/8/artifacts/genomes-storage-db) using [DIAMOND](https://www.wsi.uni-tuebingen.de/lehrstuehle/algorithms-in-bioinformatics/software/diamond/). You have some options, though: (1) you can use NCBI's BLAST program [`blastp`](https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins) instead of DIAMOND with the `--use-ncbi-blast` flag, (2) instead of analyzing all genomes you can focus on a subset using the `--genome-names` parameter, and (3) you can exclude partial genes from your analysis using the flag `--exclude-partial-gene-calls` if you think you must.
+
+* Resolves gene clusters using the BLAST results via the [MCL](http://micans.org/mcl/) algorithm after discarding weak hits from the search results using the `--minbit` heuristic (inspired by the workflow implemented by ITEP ([Benedict et al., 2014](https://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-15-8))).
+
+* Performs post-analyses of the resulting gene clusters for downstream analyses and visualization (including aligning amino acid sequences in each cluster, computing functional and geometric homogeneity indices, and computing a hierarchical clustering of gene clusters in preparation for the visualization of the pangenome using the program [anvi-display-pan](/help/8/programs/anvi-display-pan)).
+
+### The 'additional parameters' mechanism for power users
+
+At the core of the pangenomics workflow lies the reciprocal BLAST search that identifies sequence similarities within a pool of gene sequences. For this, anvi'o uses DIAMOND by default, but the user can change the search algorithm. Based on the algorithm used for this step, the matching anvi'o driver sets some default parameters for a successful run, such as the parameter that explicitly defines where the output files generated by DIAMOND should go. Apart from those mandatory parameters that are critical for a successful run, anvi'o allows the user to define a set of additional parameters to pass to the search algorithm.
+
+This is done via the flag `--additional-params-for-seq-search`. For instance, the user could take a look at the parameters DIAMOND offers by typing `diamond help` in their terminal, and may decide to use the `--sensitive` flag implemented by DIAMOND to enable a slower but more sensitive search, together with the parameter `--id 98` to ask DIAMOND not to report any hits between genes with less than 98% sequence identity. This limits gene clusters to only those sequences that are extremely closely related, while pushing everything else to be singletons (which can also be removed from the analysis with the separate `--min-occurrence 2` flag that [anvi-pan-genome](/help/8/programs/anvi-pan-genome) accepts). They can pass these parameters to DIAMOND by running their analysis the following way:
+
+
+anvi-pan-genome --genomes-storage [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ --project-name PROJECT_NAME \
+ --additional-params-for-seq-search "--masking 0 --sensitive --id 98"
+
+
+{:.notice}
+The additional parameters used for the search will be stored in the resulting [pan-db](/help/8/artifacts/pan-db) and can be viewed anytime using the program [anvi-display-pan](/help/8/programs/anvi-display-pan).
+
+{:.notice}
+For DIAMOND, if no additional parameters are declared, anvi'o will include `--masking 0` by default since we recently learned that not using that flag leads to the elimination of genes with many repeated elements (see [#1955](https://github.com/merenlab/anvio/issues/1955)).
+
+With the freedom of additional parameters for sequence search, it is possible to make significant mistakes since anvi'o will have no opportunity to sanity-check user-defined additional parameters. If you are doing something experimental, please keep an eye on the output messages and error logs.
+
+If the user chooses to use NCBI's BLAST program, anvi'o will pass the value of the parameter `--additional-params-for-seq-search` to NCBI's `blastp`.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-pan-genome.md) to update this information.
+
+
+## Additional Resources
+
+
+* [A tutorial on pangenomics](http://merenlab.org/2016/11/08/pangenomics-v2/)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-pan-genome) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-pan-genome/network.json b/help/8/programs/anvi-pan-genome/network.json
new file mode 100644
index 00000000..6356b341
--- /dev/null
+++ b/help/8/programs/anvi-pan-genome/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-items-order",
+ "name": "misc-data-items-order",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-pan-genome",
+ "name": "anvi-pan-genome",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "source": 3,
+ "target": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-plot-trnaseq/index.md b/help/8/programs/anvi-plot-trnaseq/index.md
new file mode 100644
index 00000000..0768cb1b
--- /dev/null
+++ b/help/8/programs/anvi-plot-trnaseq/index.md
@@ -0,0 +1,64 @@
+---
+layout: program
+title: anvi-plot-trnaseq
+excerpt: An anvi'o program. A program to write plots of coverage and modification data from flexible groups of tRNA-seq seeds.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-plot-trnaseq
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A program to write plots of coverage and modification data from flexible groups of tRNA-seq seeds.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+## Can consume
+
+
+[trnaseq-contigs-db](../../artifacts/trnaseq-contigs-db) [trnaseq-seed-txt](../../artifacts/trnaseq-seed-txt) [modifications-txt](../../artifacts/modifications-txt)
+
+
+## Can provide
+
+
+[trnaseq-plot](../../artifacts/trnaseq-plot)
+
+
+## Usage
+
+
+This program **generates plots of groups of tRNA-seq seeds. The plots show seed coverages and the nucleotide frequencies at modification sites in each sample**.
+
+The [trnaseq-workflow](../../workflows/trnaseq/) predicts tRNA seeds and their nucleotide modifications from a set of samples. The inspect webpage in [anvi-interactive](/help/8/programs/anvi-interactive) displays information on a selected seed, including coverages, mutation frequencies at predicted modification sites, and indel frequencies. anvi-plot-trnaseq generates plots that similarly show seed coverages in each sample but otherwise differ in many ways from the inspect page.
+
+This program generates plots for a user-defined group of seeds. Seeds may be grouped by tRNA taxonomy and/or amino acid/anticodon identity. All of the seeds in a taxonomic/anticodon group are represented on the same plot. For example, all Arg-ACG seeds resolving to family Lachnospiraceae can be displayed on a single plot. Each panel of the plot shows coverages of all the seeds in a given sample. If there are five Lachnospiraceae Arg-ACG seeds, then five coverage traces will be stacked atop each other in each subplot, with the seed with the highest mean coverage on the bottom. Nucleotide frequencies at predicted modification positions are shown as bars, with sections of the bar for each of the four nucleotides. If three of the five seeds have a modification, say m1A22, then the total height of the bar will rise to the height of the summed coverage of the three seeds at position 22. The number of seeds represented by the group is displayed on the plot, and the mean and 3' (discriminator nucleotide) coverages of the group of seeds are displayed adjacent to each sample subplot.
+
+Multiple groups may be specified at the same time, producing a set of plots. If **only** a taxonomic group is given, then plots for **every** isoacceptor will be produced, e.g., Lachnospiraceae Arg-ACG, Arg-CCG, Arg-GCG, etc. If a taxonomic **rank** and an anticodon are given, then plots for each taxon will be produced, e.g., at the family level, Lachnospiraceae Arg-ACG, Ruminococcaceae Arg-ACG, Bacteroidaceae Arg-ACG, etc.
+
+anvi-plot-trnaseq is interactive through the command prompt, allowing the required [trnaseq-contigs-db](/help/8/artifacts/trnaseq-contigs-db), [seeds-specific-txt](/help/8/artifacts/seeds-specific-txt), and [modifications-txt](/help/8/artifacts/modifications-txt) to be loaded only once and plots to be generated on the fly. Aesthetic parameters of the plots can be tweaked through the program. A comprehensive help menu with examples appears upon starting the program.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-plot-trnaseq.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-plot-trnaseq) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-plot-trnaseq/network.json b/help/8/programs/anvi-plot-trnaseq/network.json
new file mode 100644
index 00000000..a942012b
--- /dev/null
+++ b/help/8/programs/anvi-plot-trnaseq/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "trnaseq-plot",
+ "name": "trnaseq-plot",
+ "provided_by_anvio": true,
+ "type": "DISPLAY"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "trnaseq-contigs-db",
+ "name": "trnaseq-contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "trnaseq-seed-txt",
+ "name": "trnaseq-seed-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "modifications-txt",
+ "name": "modifications-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-plot-trnaseq",
+ "name": "anvi-plot-trnaseq",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "target": 4,
+ "source": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-profile-blitz/index.md b/help/8/programs/anvi-profile-blitz/index.md
new file mode 100644
index 00000000..45ce5b29
--- /dev/null
+++ b/help/8/programs/anvi-profile-blitz/index.md
@@ -0,0 +1,148 @@
+---
+layout: program
+title: anvi-profile-blitz
+excerpt: An anvi'o program. FAST profiling of BAM files to get contig- or gene-level coverage and detection stats.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-profile-blitz
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+FAST profiling of BAM files to get contig- or gene-level coverage and detection stats. Unlike `anvi-profile`, which is another anvi'o program that can profile BAM files, this program is designed to be very quick and only report long-format files for various read recruitment statistics per item. Please also see the program `anvi-script-get-coverage-from-bam` for recovery of data from BAM files without an anvi'o contigs database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[bam-file](../../artifacts/bam-file) [contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+[bam-stats-txt](../../artifacts/bam-stats-txt)
+
+
+## Usage
+
+
+This program **produces a [bam-stats-txt](/help/8/artifacts/bam-stats-txt) from one or more [bam-file](/help/8/artifacts/bam-file) given a [contigs-db](/help/8/artifacts/contigs-db)**. It is designed to serve people who only need to process read recruitment data stored in a [bam-file](/help/8/artifacts/bam-file) to recover coverage and detection statistics as well as the number of mapped reads (along with other statistics) for their genes and/or contigs. It will report what's going on nicely with memory usage information and estimated time of completion:
+
+[![anvi-profile-blitz](../../images/anvi-profile-blitz.png){:.center-img}](../../images/anvi-profile-blitz.png)
+
+There are other programs in the anvi'o software ecosystem that are similar to this one:
+
+* [anvi-profile](/help/8/programs/anvi-profile) also takes a [bam-file](/help/8/artifacts/bam-file) and profiles it. **They both require a [contigs-db](/help/8/artifacts/contigs-db)**. But while [anvi-profile](/help/8/programs/anvi-profile) produces a [single-profile-db](/help/8/artifacts/single-profile-db) for downstream analyses in anvi'o, [anvi-profile-blitz](/help/8/programs/anvi-profile-blitz) produces text files for downstream analyses by the user (via R, Python, or other solutions). In contrast to [anvi-profile](/help/8/programs/anvi-profile), [anvi-profile-blitz](/help/8/programs/anvi-profile-blitz) is orders of magnitude faster with similar memory usage.
+
+* [anvi-script-get-coverage-from-bam](/help/8/programs/anvi-script-get-coverage-from-bam) also takes a [bam-file](/help/8/artifacts/bam-file) and profiles it. **They both produce text output files.** But while [anvi-script-get-coverage-from-bam](/help/8/programs/anvi-script-get-coverage-from-bam) does not require a [contigs-db](/help/8/artifacts/contigs-db), [anvi-profile-blitz](/help/8/programs/anvi-profile-blitz) requires one to work. Both run very rapidly, but [anvi-script-get-coverage-from-bam](/help/8/programs/anvi-script-get-coverage-from-bam) works with a much smaller amount of memory.
+
+## Output files
+
+For output file formats, please see [bam-stats-txt](/help/8/artifacts/bam-stats-txt).
+
+## Running
+
+You can use this program with one or more BAM files to recover minimal or extended statistics for contigs or genes in a [contigs-db](/help/8/artifacts/contigs-db).
+
+{:.warning}
+Since the program will not be able to ensure the [contigs-db](/help/8/artifacts/contigs-db) was generated from the same [contigs-fasta](/help/8/artifacts/contigs-fasta) that was used for the read recruitment that resulted in the [bam-file](/help/8/artifacts/bam-file)s for analysis, you can make serious mistakes if you mix up your workflow and start profiling BAM files that have nothing to do with a [contigs-db](/help/8/artifacts/contigs-db). If you make a mistake like that, in the best case scenario you will get an empty output file because the program will skip all contigs with non-matching names. In the worst case scenario you will get an output file anyway if some names in the [contigs-db](/help/8/artifacts/contigs-db) coincidentally match some names in the [bam-file](/help/8/artifacts/bam-file). While this warning may be confusing, you can avoid all of this if you use the SAME FASTA FILE both as the reference for read recruitment and as the input for [anvi-gen-contigs-database](/help/8/programs/anvi-gen-contigs-database).
+
+### Contigs mode, default output
+
+Profile contigs, produce a default output:
+
+
+anvi-profile-blitz [bam-file](/help/8/artifacts/bam-file) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o OUTPUT.txt
+
+
+This example is with a single BAM file, but you can also have multiple BAM files as a parameter by using wildcards,
+
+
+anvi-profile-blitz *.bam \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o OUTPUT.txt
+
+
+or by providing multiple paths:
+
+
+anvi-profile-blitz /path/to/SAMPLE-01.bam \
+ /path/to/SAMPLE-02.bam \
+ /another/path/to/SAMPLE-03.bam \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o OUTPUT.txt
+
+
+### Contigs mode, minimal output
+
+Profile contigs, produce a minimal output. This is the fastest option:
+
+
+anvi-profile-blitz [bam-file](/help/8/artifacts/bam-file) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --report-minimal \
+ -o OUTPUT.txt
+
+
+### Genes mode, default output
+
+Profile genes, produce a default output:
+
+
+anvi-profile-blitz [bam-file](/help/8/artifacts/bam-file) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --gene-mode \
+ -o OUTPUT.txt
+
+
+### Genes mode, minimal output
+
+Profile genes, produce a minimal output:
+
+
+anvi-profile-blitz [bam-file](/help/8/artifacts/bam-file) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --gene-mode \
+ --report-minimal \
+ -o OUTPUT.txt
+
+
+
+## Performance
+
+The memory use will be correlated linearly with the size of the [contigs-db](/help/8/artifacts/contigs-db), but once everything is loaded, the memory usage will not increase substantially over time.
+
+With the flag `--report-minimal`, [anvi-profile-blitz](/help/8/programs/anvi-profile-blitz) profiled 100,000 contigs that contained 1 billion nts in 6 minutes on a laptop computer and used ~300 MB of memory. This contigs database had 1.5 million genes, and memory usage increased to 1.7 GB when [anvi-profile-blitz](/help/8/programs/anvi-profile-blitz) was run in `--gene-mode`. The flag `--gene-mode` does not change the run time dramatically.
+
+Anvi'o has this program because [Emile Faure](https://twitter.com/faureemile) presented us with a challenge: Emile had a ~140 GB anvi'o [contigs-db](/help/8/artifacts/contigs-db) that contained nearly 70 million contig sequences from over 200 single-assembled metagenomes, and wanted to learn the coverages of each gene in the contigs database in 200 metagenomes individually. Yet the combination of [anvi-profile](/help/8/programs/anvi-profile) and [anvi-summarize](/help/8/programs/anvi-summarize) jobs would take **more than 40 days** to complete. Since all Emile needed was to learn the coverages from BAM files, we implemented [anvi-profile-blitz](/help/8/programs/anvi-profile-blitz) to skip the profiling step. The run took **8 hours to compute and report coverage values for 175 million genes in 70 million contigs**, and the memory use remained below 200 GB.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-profile-blitz.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-profile-blitz) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
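Note on the anvi-profile-blitz usage examples above: when many BAM files are involved, the command line can be assembled programmatically. The following is a minimal Python sketch and is not part of anvi'o. `build_blitz_command` is a hypothetical helper, and it relies only on the flags that appear in the examples (`-c`, `-o`, `--gene-mode`, `--report-minimal`).

```python
import shlex


def build_blitz_command(bam_files, contigs_db, output_path,
                        gene_mode=False, report_minimal=False):
    """Return the argument list for a single anvi-profile-blitz run.

    Hypothetical helper: it only arranges the flags shown in the usage
    examples and does not validate that the files exist.
    """
    bam_files = [str(b) for b in bam_files]
    if not bam_files:
        raise ValueError("at least one BAM file is required")

    # BAM files are positional arguments, followed by the contigs database.
    command = ["anvi-profile-blitz", *bam_files, "-c", str(contigs_db)]
    if gene_mode:
        command.append("--gene-mode")
    if report_minimal:
        command.append("--report-minimal")
    command += ["-o", str(output_path)]
    return command


if __name__ == "__main__":
    cmd = build_blitz_command(
        ["/path/to/SAMPLE-01.bam", "/path/to/SAMPLE-02.bam"],
        "CONTIGS.db", "OUTPUT.txt",
        gene_mode=True, report_minimal=True)
    # Print a copy-pasteable shell command.
    print(shlex.join(cmd))
    # To execute it instead (requires an anvi'o installation):
    # import subprocess; subprocess.run(cmd, check=True)
```

Returning a list rather than a single string means the command can be passed to `subprocess.run` without shell quoting concerns.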
diff --git a/help/8/programs/anvi-profile-blitz/network.json b/help/8/programs/anvi-profile-blitz/network.json
new file mode 100644
index 00000000..3d3d8ba7
--- /dev/null
+++ b/help/8/programs/anvi-profile-blitz/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bam-stats-txt",
+ "name": "bam-stats-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "bam-file",
+ "name": "bam-file",
+ "provided_by_anvio": false,
+ "type": "BAM"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-profile-blitz",
+ "name": "anvi-profile-blitz",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
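Note on the network.json file above: the `source` and `target` values in `links` appear to be zero-based positions in the `nodes` array, which is consistent with the "Can consume" and "Can provide" lists of the program page. Under that assumption, a minimal Python sketch for turning the links into readable pairs could look like this (`resolve_links` is a hypothetical helper, and the embedded JSON is a trimmed copy of the file):

```python
import json

# Trimmed copy of the structure shown above (only the fields used here).
NETWORK_JSON = """
{
  "nodes": [
    {"id": "bam-stats-txt", "type": "TXT"},
    {"id": "bam-file", "type": "BAM"},
    {"id": "contigs-db", "type": "DB"},
    {"id": "anvi-profile-blitz", "type": "PROGRAM"}
  ],
  "links": [
    {"source": 3, "target": 0},
    {"target": 3, "source": 1},
    {"target": 3, "source": 2}
  ]
}
"""


def resolve_links(network):
    """Translate index-based links into (source_id, target_id) pairs.

    Assumes 'source' and 'target' are positions in network['nodes'].
    """
    nodes = network["nodes"]
    return [(nodes[link["source"]]["id"], nodes[link["target"]]["id"])
            for link in network["links"]]


if __name__ == "__main__":
    for source, target in resolve_links(json.loads(NETWORK_JSON)):
        print(f"{source} -> {target}")
```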
diff --git a/help/8/programs/anvi-profile/index.md b/help/8/programs/anvi-profile/index.md
new file mode 100644
index 00000000..97101d6c
--- /dev/null
+++ b/help/8/programs/anvi-profile/index.md
@@ -0,0 +1,196 @@
+---
+layout: program
+title: anvi-profile
+excerpt: An anvi'o program. The flagship anvi'o program to profile a BAM file.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-profile
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+The flagship anvi'o program to profile a BAM file. Running this program on a BAM file will quantify coverages per nucleotide position in read recruitment results and will average coverage and detection data per contig. It will also calculate single-nucleotide, single-codon, and single-amino acid variants, as well as structural variants such as insertions and deletions, and will eventually store all data in a single anvi'o profile database. For very large projects, this program can demand a lot of time, memory, and storage resources. If all you want is to learn the coverages of your nucleotides, genes, contigs, or bin collections from BAM files very rapidly, and/or you do not need anvi'o single profile databases for your project, please see other anvi'o programs that profile BAM files, `anvi-script-get-coverage-from-bam` and `anvi-profile-blitz`.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+
+
+## Can consume
+
+
+[bam-file](../../artifacts/bam-file) [contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+[single-profile-db](../../artifacts/single-profile-db) [misc-data-items-order](../../artifacts/misc-data-items-order) [variability-profile](../../artifacts/variability-profile)
+
+
+## Usage
+
+
+This program **creates a [single-profile-db](/help/8/artifacts/single-profile-db) from a [bam-file](/help/8/artifacts/bam-file) and [contigs-db](/help/8/artifacts/contigs-db)**.
+
+Once you have a [single-profile-db](/help/8/artifacts/single-profile-db), you can run programs like [anvi-cluster-contigs](/help/8/programs/anvi-cluster-contigs), [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism), and [anvi-gen-gene-level-stats-databases](/help/8/programs/anvi-gen-gene-level-stats-databases), as well as use the interactive interface with [anvi-interactive](/help/8/programs/anvi-interactive). If you want to run these same contigs against multiple BAM files (because you have multiple samples), you'll combine your [single-profile-db](/help/8/artifacts/single-profile-db)s into a [profile-db](/help/8/artifacts/profile-db) after you've created them all using [anvi-merge](/help/8/programs/anvi-merge). See the pages for [single-profile-db](/help/8/artifacts/single-profile-db) or [profile-db](/help/8/artifacts/profile-db) for more you can do with these artifacts.
+
+In short, this program runs various analyses on the contigs in your [contigs-db](/help/8/artifacts/contigs-db) and how they relate to the sample information stored in the [bam-file](/help/8/artifacts/bam-file) you provided. It then stores this information into a [single-profile-db](/help/8/artifacts/single-profile-db). Specifically, this program calculates
+* coverage per nucleotide position (if you're unsure what coverage refers to, check out [this page](http://merenlab.org/vocabulary/#coverage))
+* single-nucleotide, single-codon, and single-amino acid variants (you can find all of those terms on the vocab page linked above, as well as a more detailed explanation [here](http://merenlab.org/2015/07/20/analyzing-variability/#an-intro-to-single-nucleotidecodonamino-acid-variation))
+* structural variants such as insertions or deletions
+
+## Basic Usage
+
+### Inputs
+
+This program takes in an [indexed](https://merenlab.org/software/anvio/help/programs/anvi-init-bam) [bam-file](/help/8/artifacts/bam-file) and a [contigs-db](/help/8/artifacts/contigs-db). The BAM file contains the short reads from a single sample that will be used to create the profile database. Thus, here is a standard run with default parameters:
+
+
+anvi-profile -i [bam-file](/help/8/artifacts/bam-file) \
+ -c [contigs-db](/help/8/artifacts/contigs-db)
+
+
+Alternatively, if you lack mapping data, you can add the flag `--blank-profile` so that you can still get the functionality of a profile database.
+
+
+anvi-profile -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --blank-profile
+
+
+### Checking your BAM file: always a good idea
+
+If you want to first check your BAM file to see what contigs it contains, just use the flag `--list-contigs` to see a comprehensive list.
+
+### Profiling a subset of contigs
+
+*Note: This describes how to profile a named subset of contigs. To profile a subset of contigs based on their characteristics (for example, only contigs of a certain length or that have a certain coverage), see the section below on "Contig Specification".*
+
+By default, anvi'o will use every contig in your [contigs-db](/help/8/artifacts/contigs-db). However, if you wish to focus specifically on a subset of these contigs, just provide a file that contains only the names of the contigs you want to analyze, one per line, using the tag `--contigs-of-interest`.
+
+For example, you could run
+
+
+anvi-profile -c Ross_sea_contigs.db \
+ --blank-profile \
+ --contigs-of-interest contigs_i_like.txt
+
+
+Where `contigs_i_like.txt` looks like this:
+
+ SF15-RossSeacontig4922
+ SF15-RossSeacontig702
+
+## Analysis Parameters
+
+Changing these will affect the way that your sequences are analyzed.
+
+Keep in mind that if you plan to merge your resulting [single-profile-db](/help/8/artifacts/single-profile-db) with others later in the project, you'll want to keep these parameters consistent.
+
+### Contig Specification
+
+To profile only contigs within a specific length range, you can use the flags `--min-contig-length` and `--max-contig-length`. By default, the minimum length for analysis is 1000 and there is no maximum length.
+
+But beyond these flags, you can specify which contigs you would like to profile much more explicitly using the flag `--contigs-of-interest`.
+
+For instance, if you wish to work only with contigs that have more than a certain coverage across your samples, you can first run the program [anvi-profile-blitz](/help/8/programs/anvi-profile-blitz) on all BAM files, then use the resulting [bam-stats-txt](/help/8/artifacts/bam-stats-txt) output file to identify contigs of interest based on their coverages across samples, put their names in a text file, and pass this file to [anvi-profile](/help/8/programs/anvi-profile) using the flag `--contigs-of-interest`. (`anvi-profile` used to have a flag for this, `--min-mean-coverage`, that allowed users to remove contigs based on their coverage in a given sample, but [we recently removed it](https://github.com/merenlab/anvio/issues/2047) to promote explicit specification of contigs.)
+
+### Filter reads
+
+You can also ignore reads in your BAM file with a percent identity to the reference less than some threshold using the flag `--min-percent-identity`. By default, all reads are used.
+
+For example, the following code will only look at contigs longer than 2000 nts and will ignore BAM file reads with less than 95 percent identity to the reference:
+
+
+anvi-profile -c Ross_sea_contigs.db \
+ -i bam_file.bam \
+ --min-contig-length 2000 \
+ --min-percent-identity 95
+
+
+By default, anvi'o fetches all reads from the BAM file. With `--fetch-filter` you can determine which reads from a BAM file will be used for profiling. The current filters are:
+
+* `double-forwards`: only paired-end reads with both R1 and R2 with a 'forward' orientation,
+* `double-reverses`: only paired-end reads with both R1 and R2 with a 'reverse' orientation,
+* `inversions`: only paired-end reads with both R1 and R2 either 'forward' or 'reverse' and a maximum insert size of 2000 nts,
+* `single-mapped-reads`: only single mapped reads (mate is unmapped),
+* `distant-pairs-1K`: only paired-end reads with a minimum 1000 nts insert size.
+
+For example, the following code only considers 'inversions' reads:
+
+
+anvi-profile -c Ross_sea_contigs.db \
+ -i bam_file.bam \
+ --fetch-filter inversions
+
+
+### Hierarchical Clustering
+
+#### To cluster or not to cluster?
+
+By default, anvi'o will not try to cluster your splits (since it takes quite a bit of runtime) unless you are using the flag `--blank-profile`. If you are using `--blank-profile` and don't want clustering to run, use the flag `--skip-hierarchical-clustering`.
+
+If you're planning to later merge this sample with others, it is better to perform clustering while running [anvi-merge](/help/8/programs/anvi-merge) than at this stage.
+
+However, if you want to bin this single sample or otherwise want clustering to happen, just use the tag `--cluster-contigs`.
+
+If you do plan to cluster, you can set a custom distance metric or a custom linkage method.
+
+### Variability
+
+`anvi-profile` will throw away variability data below certain thresholds to reduce noise. After all, if you have a single C read at a position with a 1000X coverage where all other reads are T, this is probably not a variant position that you want to investigate further. By default, it will not analyze positions with coverage less than 10X, and it will further discard variants based on [these criteria](https://merenlab.org/2015/07/20/analyzing-variability/#de-novo-characterization-and-reporting-of-snvs).
+
+However, you can change the coverage threshold using the `--min-coverage-for-variability` flag. You can also report every variability position using the flag `--report-variability-full`.
+
+For example, if you wanted to view every variant, you would profile with the following:
+
+
+anvi-profile -c Ross_sea_contigs.db \
+ -i bam_file.bam \
+ --min-coverage-for-variability 1 \
+ --report-variability-full
+
+
+## Other Parameters
+
+You should provide the sample name with the flag `-S` and can provide a description of your project using the `--description` tag followed by a text file. These will help anvi'o name output files and will show up in the anvi'o interfaces down the line.
+
+You can characterize the codon frequencies of genes in your sample at the cost of some runtime. Despite time being money, codon frequency analysis can be helpful downstream. Simply add the tag `--profile-SCVs` and watch the magic happen.
+
+{:.notice}
+If you have prior experience with `--profile-SCVs` being slow, you will be surprised how fast it is
+since v6.2.
+
+Alternatively, you can choose not to store insertion and deletion data or single nucleotide variant data.
+
+If you know the limits of your system, you can also multithread this program. See the program help menu for more information.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-profile.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Another description as part of the metagenomic workflow](http://merenlab.org/2016/06/22/anvio-tutorial-v2/#anvi-profile)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-profile) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
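Note on the `--contigs-of-interest` workflow described under "Contig Specification" above: the step that turns an [anvi-profile-blitz] output into a list of contig names can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not anvi'o code: it assumes the output is a TAB-delimited file with a header line, the column names `contig` and `mean_cov` are assumptions that should be checked against the bam-stats-txt documentation, and keeping a contig only when it meets the threshold in every row (sample) in which it appears is just one possible selection rule.

```python
import csv


def contigs_above_coverage(stats_path, min_mean_cov,
                           name_col="contig", cov_col="mean_cov"):
    """Return sorted contig names whose coverage meets the threshold in
    every row (sample) in which they appear.

    The column names are assumptions; adjust them to the actual header.
    """
    lowest_cov = {}
    with open(stats_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            name = row[name_col]
            cov = float(row[cov_col])
            # Track the lowest coverage seen for each contig.
            lowest_cov[name] = min(cov, lowest_cov.get(name, cov))
    return sorted(n for n, c in lowest_cov.items() if c >= min_mean_cov)


def write_contigs_of_interest(names, out_path):
    """Write one contig name per line, as expected by --contigs-of-interest."""
    with open(out_path, "w") as out:
        for name in names:
            out.write(name + "\n")


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    names = contigs_above_coverage("OUTPUT.txt", min_mean_cov=10)
    write_contigs_of_interest(names, "contigs_i_like.txt")
```

The resulting file has the same one-name-per-line layout as the `contigs_i_like.txt` example earlier on the page.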
diff --git a/help/8/programs/anvi-profile/network.json b/help/8/programs/anvi-profile/network.json
new file mode 100644
index 00000000..4b92fb59
--- /dev/null
+++ b/help/8/programs/anvi-profile/network.json
@@ -0,0 +1,82 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "single-profile-db",
+ "name": "single-profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-items-order",
+ "name": "misc-data-items-order",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "variability-profile",
+ "name": "variability-profile",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "bam-file",
+ "name": "bam-file",
+ "provided_by_anvio": false,
+ "type": "BAM"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-profile",
+ "name": "anvi-profile",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 5,
+ "target": 0
+ },
+ {
+ "source": 5,
+ "target": 1
+ },
+ {
+ "source": 5,
+ "target": 2
+ },
+ {
+ "target": 5,
+ "source": 3
+ },
+ {
+ "target": 5,
+ "source": 4
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-reaction-network/index.md b/help/8/programs/anvi-reaction-network/index.md
new file mode 100644
index 00000000..6487a7cb
--- /dev/null
+++ b/help/8/programs/anvi-reaction-network/index.md
@@ -0,0 +1,84 @@
+---
+layout: program
+title: anvi-reaction-network
+excerpt: An anvi'o program. Generate a metabolic reaction network in an anvi'o contigs database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-reaction-network
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Generate a metabolic reaction network in an anvi'o contigs database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [kegg-functions](../../artifacts/kegg-functions) [reaction-ref-data](../../artifacts/reaction-ref-data) [kegg-data](../../artifacts/kegg-data)
+
+
+## Can provide
+
+
+[reaction-network](../../artifacts/reaction-network)
+
+
+## Usage
+
+
+This program **stores a metabolic [reaction-network](/help/8/artifacts/reaction-network) in a [contigs-db](/help/8/artifacts/contigs-db).**
+
+The network consists of data on biochemical reactions predicted to be encoded by the genome, referencing the [KEGG Orthology (KO)](https://www.genome.jp/kegg/ko.html) and [ModelSEED Biochemistry](https://github.com/ModelSEED/ModelSEEDDatabase) databases.
+
+Information on the predicted reactions and the involved metabolites is stored in two tables of the [contigs-db](/help/8/artifacts/contigs-db). The program [anvi-get-metabolic-model-file](/help/8/programs/anvi-get-metabolic-model-file) can be used to export the [reaction-network](/help/8/artifacts/reaction-network) from the database to a [reaction-network-json](/help/8/artifacts/reaction-network-json) file suitable for inspection and flux balance analysis.
+
+## Usage
+
+[anvi-reaction-network](/help/8/programs/anvi-reaction-network) takes a [contigs-db](/help/8/artifacts/contigs-db) as required input. Genes stored within the database must have KO protein annotations, which can be assigned by [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams).
+
+The KO and ModelSEED Biochemistry databases must be set up and available to the program. By default, these are expected to be set up in default anvi'o data directories. [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data) and [anvi-setup-modelseed-database](/help/8/programs/anvi-setup-modelseed-database) must be run to set up these databases.
+
+
+anvi-reaction-network -c /path/to/contigs-db
+
+
+Custom locations for the reference databases can be provided with the flags, `--ko-dir` and `--modelseed-dir`.
+
+
+anvi-reaction-network -c /path/to/contigs-db --ko-dir /path/to/set-up/ko-dir --modelseed-dir /path/to/set-up/modelseed-dir
+
+
+If a [contigs-db](/help/8/artifacts/contigs-db) already contains a [reaction-network](/help/8/artifacts/reaction-network) from a previous run of this program, the flag `--overwrite-existing-network` can overwrite the existing network with a new one. For example, if [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) is run again on a database using a newer version of KEGG, then [anvi-reaction-network](/help/8/programs/anvi-reaction-network) should be rerun to update the [reaction-network](/help/8/artifacts/reaction-network) derived from the KO annotations.
+
+
+anvi-reaction-network -c /path/to/contigs-db --overwrite-existing-network
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-reaction-network.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-reaction-network) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-reaction-network/network.json b/help/8/programs/anvi-reaction-network/network.json
new file mode 100644
index 00000000..27d2f234
--- /dev/null
+++ b/help/8/programs/anvi-reaction-network/network.json
@@ -0,0 +1,82 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "reaction-network",
+ "name": "reaction-network",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "kegg-functions",
+ "name": "kegg-functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "reaction-ref-data",
+ "name": "reaction-ref-data",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "kegg-data",
+ "name": "kegg-data",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-reaction-network",
+ "name": "anvi-reaction-network",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 5,
+ "target": 0
+ },
+ {
+ "target": 5,
+ "source": 1
+ },
+ {
+ "target": 5,
+ "source": 2
+ },
+ {
+ "target": 5,
+ "source": 3
+ },
+ {
+ "target": 5,
+ "source": 4
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-refine/index.md b/help/8/programs/anvi-refine/index.md
new file mode 100644
index 00000000..55219224
--- /dev/null
+++ b/help/8/programs/anvi-refine/index.md
@@ -0,0 +1,92 @@
+---
+layout: program
+title: anvi-refine
+excerpt: An anvi'o program. Start an anvi'o interactive interface to manually curate or refine a genome, whether it is a metagenome-assembled, single-cell, or an isolate genome.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-refine
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Start an anvi'o interactive interface to manually curate or refine a genome, whether it is a metagenome-assembled, single-cell, or an isolate genome.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db) [bin](../../artifacts/bin)
+
+
+## Can provide
+
+
+[bin](../../artifacts/bin)
+
+
+## Usage
+
+
+This program **opens the anvi'o interactive interface** to let the user **refine what contigs are contained in a specific [bin](/help/8/artifacts/bin) or manually split one bin into several.**
+
+In the [interactive](/help/8/artifacts/interactive) interface, any bins that you create will overwrite the bin that you originally opened. If you don't provide any names, the new bins' titles will be prefixed with the name of the original bin, so that bin will continue to live on in spirit.
+
+Essentially, it is like running [anvi-interactive](/help/8/programs/anvi-interactive), but disposing of the original bin when you're done.
+
+### Potential Use Cases
+
+There are several reasons you might want to use anvi-refine:
+
+- Your dataset is just really big.
+
+ This process has its own [dedicated blog post](http://merenlab.org/2015/05/11/anvi-refine/), but, in short, the interactive interface and analysis, like all things, has its limits. Instead of trying to actively analyze all of your data at once, which will be very computationally heavy and might limit the types of analysis you'll even be able to do, it will be helpful to split your data into several smaller bins first. Then you can use anvi-refine on each one to replace those temporary bins with your final bins.
+
+- You want to refine a bin generated by automated binning software.
+
+ After automated binning (which you can do in anvi'o with [anvi-cluster-contigs](/help/8/programs/anvi-cluster-contigs)), you might see an awkward bin or two. These programs often have difficulty separating bins in some scenarios, as well as sorting out prophages, mobile genetic elements, and other things that you might want to have in their own bin. You can take these out of your MAGs and put them in their own bins, split bins with high redundancy, or just refine them in any way you please, using anvi-refine.
+
+- You're just not happy with one of your bins.
+
+ Happens to everyone. Maybe the redundancy is just a little too high and you want to take a closer look, or maybe you just want to split up two contigs giving you different taxonomy results. Feel free to go in and take specific contigs out or split one bin into several.
+
+If you just want to look at the contents of a particularly glorious bin, you should probably just use [anvi-interactive](/help/8/programs/anvi-interactive) for that, since you don't want to risk overwriting your bin.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-refine.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Refining a bin](http://merenlab.org/2015/05/11/anvi-refine/)
+
+* [Notes on genome refinement](http://merenlab.org/2017/05/11/anvi-refine-by-veronika/)
+
+* [A case study: Inspecting the genomic link between Archaea and Eukaryota](http://merenlab.org/2017/01/03/loki-the-link-archaea-eukaryota/)
+
+* [As part of the metagenomic workflow](http://merenlab.org/2016/06/22/anvio-tutorial-v2/#anvi-refine)
+
+* [A demo](https://www.youtube.com/watch?v=vXPKP5vKiBM)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-refine) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-refine/network.json b/help/8/programs/anvi-refine/network.json
new file mode 100644
index 00000000..bb2d6494
--- /dev/null
+++ b/help/8/programs/anvi-refine/network.json
@@ -0,0 +1,60 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 2,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-refine",
+ "name": "anvi-refine",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-rename-bins/index.md b/help/8/programs/anvi-rename-bins/index.md
new file mode 100644
index 00000000..de64c501
--- /dev/null
+++ b/help/8/programs/anvi-rename-bins/index.md
@@ -0,0 +1,128 @@
+---
+layout: program
+title: anvi-rename-bins
+excerpt: An anvi'o program. Rename all bins in a given collection (so they have pretty names).
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-rename-bins
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Rename all bins in a given collection (so they have pretty names).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[collection](../../artifacts/collection) [bin](../../artifacts/bin) [profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+[collection](../../artifacts/collection) [bin](../../artifacts/bin)
+
+
+## Usage
+
+
+This program **creates a new [collection](/help/8/artifacts/collection) from the [bin](/help/8/artifacts/bin)s in another collection with specific guidelines.** This is especially helpful when you wish to standardize your bin names, add project specific prefixes, and/or exclude those that do not match your criteria of completion, redundancy, and/or size estimates.
+
+### Renaming all bins in a collection
+
+Let's say you have a [collection](/help/8/artifacts/collection) called `MY_COLLECTION`, which has four bins that are named poorly (which can happen due to decisions made by automatic binning tools, or after a few steps of manual refinement): `Bin_1_2_1`, `Bin_2`, `Bin_3_1_1`, and `Bin_4`. In an instance like this, running the program [anvi-rename-bins](/help/8/programs/anvi-rename-bins) the following way will standardize these bin names with a prefix specific to your project:
+
+
+anvi-rename-bins -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ --prefix SURFACE_OCEAN \
+ --collection-to-read MY_COLLECTION \
+ --collection-to-write SURFACE_OCEAN_SAMPLES \
+ --report-file rename.txt
+
+
+Now your [profile-db](/help/8/artifacts/profile-db) will have a new collection named `SURFACE_OCEAN_SAMPLES` that contains your four bins with their new names `SURFACE_OCEAN_Bin_00001`, `SURFACE_OCEAN_Bin_00002`, `SURFACE_OCEAN_Bin_00003`, and `SURFACE_OCEAN_Bin_00004`. The new naming will order your bins based on their substantive completion (i.e., completion minus redundancy).
+
+The file `rename.txt` is a TAB-delimited file that contains a summary of your renaming process. The first column has the original names of the bins that you renamed, the second has their new names, and the remaining columns contain information about those bins (like their completion, redundancy, and size).
+
+### Separating out the MAGs
+
+You can also label your MAGs separately from your bins via the flag `--call-MAGs`:
+
+
+anvi-rename-bins -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ --prefix SURFACE_OCEAN \
+ --collection-to-read MY_COLLECTION \
+ --collection-to-write SURFACE_OCEAN_MAGS \
+ --report-file rename.txt \
+ --call-MAGs \
+ --min-completion-for-MAG 70
+
+
+Now, the [collection](/help/8/artifacts/collection) `SURFACE_OCEAN_MAGS` will include `SURFACE_OCEAN_MAG_00001`, `SURFACE_OCEAN_MAG_00002`, `SURFACE_OCEAN_MAG_00003`, and `SURFACE_OCEAN_Bin_00004`. These are exactly the same bins that the collection contained before, but now the names differentiate the wheat from the chaff.
+
+In addition to the minimum completion estimate, you can also adjust the maximum redundancy and the minimum size required to call MAGs. Please see the help menu for all parameters and their descriptions.
+
+### Exclude bins that are not MAGs
+
+When you use the flag `--call-MAGs`, anvi'o identifies those bins that could be considered 'MAGs' based on your specific criteria. But regardless of whether an original bin remains a bin or is tagged as a MAG, everything in your original collection will end up in your new collection. The flag `--exclude-bins` enables you to filter out those that end up not being tagged as MAGs:
+
+
+anvi-rename-bins -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ --prefix SURFACE_OCEAN \
+ --collection-to-read MY_COLLECTION \
+ --collection-to-write SURFACE_OCEAN_MAGS \
+ --report-file rename.txt \
+ --min-completion-for-MAG 70 \
+ --call-MAGs \
+ --exclude-bins
+
+
+With the addition of the flag `--exclude-bins` to the same command, the [collection](/help/8/artifacts/collection) `SURFACE_OCEAN_MAGS` will no longer include the [bin](/help/8/artifacts/bin) `SURFACE_OCEAN_Bin_00004`, which was not tagged as a MAG.
+
+See also the program [anvi-delete-collection](/help/8/programs/anvi-delete-collection).
+
+### The report file
+
+The following is an example of the report file anvi'o will generate at the file path declared with the parameter `--report-file`:
+
+|**old_bin_name**|**new_bin_name**|**SCG_domain**|**completion**|**redundancy**|**size_in_Mbp**|
+|:--|:--|:--|:--|:--|:--|
+|Bin_2|p800_MAG_00001|eukarya|61.45|7.23|26.924911|
+|Bin_1|p800_MAG_00002|bacteria|98.59|8.45|1.612349|
+|Bin_3|p800_Bin_00003|blank|0.00|0.00|0.103694|
+|Bin_5|p800_Bin_00004|blank|0.00|0.00|0.128382|
+|Bin_4|p800_Bin_00005|bacteria|1.41|0.00|0.378418|
+
+The column `SCG_domain` explains which collection of single-copy core genes was used to generate these completion/redundancy estimates. The absence of a domain prediction for any given bin will be marked with the keyword `blank`.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-rename-bins.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-rename-bins) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-rename-bins/network.json b/help/8/programs/anvi-rename-bins/network.json
new file mode 100644
index 00000000..e07dce37
--- /dev/null
+++ b/help/8/programs/anvi-rename-bins/network.json
@@ -0,0 +1,77 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 2,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 2,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 6,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-rename-bins",
+ "name": "anvi-rename-bins",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "target": 4,
+ "source": 0
+ },
+ {
+ "source": 4,
+ "target": 1
+ },
+ {
+ "target": 4,
+ "source": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-report-inversions/index.md b/help/8/programs/anvi-report-inversions/index.md
new file mode 100644
index 00000000..485f1d9f
--- /dev/null
+++ b/help/8/programs/anvi-report-inversions/index.md
@@ -0,0 +1,238 @@
+---
+layout: program
+title: anvi-report-inversions
+excerpt: An anvi'o program. Reports inversions.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-report-inversions
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Reports inversions.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+
+
+## Can consume
+
+
+[bams-and-profiles-txt](../../artifacts/bams-and-profiles-txt)
+
+
+## Can provide
+
+
+[inversions-txt](../../artifacts/inversions-txt)
+
+
+## Usage
+
+
+This program allows you to find genomic inversions using metagenomic read recruitment results, and to characterize their activity patterns across samples.
+
+An inversion is typically carried out by an invertase. This enzyme recognizes a pair of inverted repeats (IRs), a special case of palindromic sequence in which the repeats face inward on different DNA strands. The IRs are distant from each other, and the invertase inverts the DNA fragment between them.
+
+In brief, anvi'o leverages paired-read orientation (through the `--fetch-filter` mechanism in [anvi-profile](/help/8/programs/anvi-profile) explained below) to locate regions of interest in a set of contigs. It screens for IRs within regions that are enriched in read pairs with forward/forward or reverse/reverse orientations, and uses short reads to confirm which IRs correspond to real inversions. Anvi'o can also compute the 'inversion activity', i.e., the relative proportion of each orientation of an inversion in each sample.
+
+### Anvi'o philosophy to find inversions
+
+Much like a T-Rex, the vision of anvi'o relies on movement, and it cannot see an inversion if it does not move or, in this case, invert. So let's start with what you cannot do with this command: you cannot find inversions in a set of contigs alone.
+
+To find an inversion, you need to have short reads from at least one sample. If even a small fraction of the members of a microbial population has an inverted sequence, then [anvi-report-inversions](/help/8/programs/anvi-report-inversions) will very likely find it for you!
+
+### Before you run this program
+
+Anvi'o is able to locate inversions using paired-end read orientation. Regular paired-end reads face inward with a FWD/REV orientation, but when an inversion happens, some reads will map in the opposite orientation relative to the reference. As a consequence, some read pairs will have the same orientation: FWD/FWD or REV/REV.
+
+To leverage that information, anvi'o can profile BAM files for FWD/FWD and REV/REV reads only with [anvi-profile](/help/8/programs/anvi-profile) to make a special [single-profile-db](/help/8/artifacts/single-profile-db).
+
+
+anvi-profile -i [bam-file](/help/8/artifacts/bam-file) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --fetch-filter inversion
+
+
+### Other essential inputs to run this program
+
+The main input for [anvi-report-inversions](/help/8/programs/anvi-report-inversions) is a [bams-and-profiles-txt](/help/8/artifacts/bams-and-profiles-txt), which is a TAB-delimited file composed of at least four columns:
+
+* Sample name,
+* [contigs-db](/help/8/artifacts/contigs-db),
+* [single-profile-db](/help/8/artifacts/single-profile-db) generated with the inversion fetch filter,
+* [bam-file](/help/8/artifacts/bam-file).
+
+If you are interested in also characterizing inversion activity statistics across samples, you will also need to add two more columns into the [bams-and-profiles-txt](/help/8/artifacts/bams-and-profiles-txt) file to point out the paths for the R1 and R2 FASTQ files.
+
+Here is a standard run with default parameters:
+
+
+anvi-report-inversions -P [bams-and-profiles-txt](/help/8/artifacts/bams-and-profiles-txt) \
+ -o inversions_output
+
+
+### Identifying regions of interest
+
+While anvi'o could directly search for inverted repeats in all the contigs, it would be a waste of time as many IRs are actually not related to inversions. Instead, anvi'o uses the FWD/FWD and REV/REV reads to identify regions of interest and constrains the search for IRs to these regions.
+
+For this step, you can set the minimum coverage of FWD/FWD and REV/REV reads to define 'stretches' with `--min-coverage-to-define-stretches`. Lower thresholds yield more stretches, but also more noise.
+
+The parameter `--min-stretch-length` defines the minimum length for a stretch to be considered.
+
+FWD/FWD reads are found on the leftmost side of an inversion, while the REV/REV reads are found on the right side. When an inversion is quite long, a region of low to no coverage can separate the two groups of reads. With the parameter `--min-distance-between-independent-stretches`, anvi'o merges fragmented stretches if they are closer than this value.
+
+When the coverage is quite low, a stretch can be a little too short and miss potential IRs, so you can use `--num-nts-to-pad-a-stretch` to extend the stretch by a given number of nucleotides upstream and downstream.
+
+Here are the default values for these flags:
+
+
+anvi-report-inversions -P [bams-and-profiles-txt](/help/8/artifacts/bams-and-profiles-txt) \
+ -o inversions_output \
+ --min-coverage-to-define-stretches 10 \
+ --min-stretch-length 50 \
+ --min-distance-between-independent-stretches 2000 \
+ --num-nts-to-pad-a-stretch 100
+
+
+### Finding palindromes
+
+There are a few parameters to constrain the search for the palindromic sequences of the IRs, such as a minimum length that can be set with `--min-palindrome-length`, and a maximum number of mismatches with `--max-num-mismatches`.
+
+You can set the minimum distance between two palindromic sequences with `--min-distance`. A distance of 0 would correspond to an in-place palindrome, though these do not relate to genomic inversions.
+
+When searching for palindromes with mismatches, the algorithm will extend the palindrome length as much as possible, often including mismatches that are outside of the true palindrome sequences. The flag `--min-mismatch-distance-to-first-base` allows you to trim the palindrome when one or more mismatches are n nucleotides away from a palindrome's start or stop. The default value is 1, meaning that a palindrome `MMMMMM(X)M`, where M denotes matching nucleotides and X a mismatch, will be trimmed to the first 6 matches `MMMMMM`.
+
+There are currently two algorithms to find palindromes in anvi'o: numba and BLAST. Numba is very fast when looking for palindromes in short sequences, and BLAST is more efficient for longer stretches. Anvi'o dynamically sets the algorithm according to the length of each stretch: numba for stretches under 5,000 bp and BLAST for longer stretches. You can use the flag `--palindrome-search-algorithm` to ask anvi'o to use either of these methods explicitly. Note that results between the two methods may differ.
+
+
+anvi-report-inversions -P [bams-and-profiles-txt](/help/8/artifacts/bams-and-profiles-txt) \
+ -o inversions_output \
+ --min-palindrome-length 10 \
+ --max-num-mismatches 1 \
+ --min-distance 50 \
+ --min-mismatch-distance-to-first-base 1 \
+ --palindrome-search-algorithm numba
+
+
+### Confirming inversions
+
+Multiple palindromes are usually reported for each stretch. To confirm which one actually relates to an inversion, anvi'o searches short reads in the BAM file for unique constructs that can only occur when a genomic region is inverted.
+
+By default, anvi'o reports the first confirmed palindrome and moves on to the next stretch. This process is very efficient, as a stretch usually has only one inversion. But in rare cases, multiple inversions can happen in a single stretch. In that case, you can use the flag `--check-all-palindromes`, and anvi'o will look for evidence of inversion in the short reads for every palindrome in a stretch.
+
+Anvi'o looks for inversion evidence in the FWD/FWD and REV/REV reads first. If no evidence is found, it then searches the rest of the reads mapping to the region of interest. If you want to search only the FWD/FWD and REV/REV reads, you can use the flag `--process-only-inverted-reads`.
+
+### Computing inversion activity
+
+If you provide the R1 and R2 short reads in the [bams-and-profiles-txt](/help/8/artifacts/bams-and-profiles-txt), anvi'o can compute the proportion of each orientation of the inversion in each sample.
+
+This is a very time-consuming step. If you have multiple samples, you can use the parameter `--num-threads` to set the maximum number of threads for multithreading when possible.
+
+To compute the inversion ratios, anvi'o designs in silico primers based on the palindrome sequence and the upstream/downstream genomic context to search short reads in the raw FASTQ files. The parameter `--oligo-primer-base-length` controls how much of the palindrome should be used to design the primers. The longer the primer, the more specific it is, but if it is too long, fewer reads will match it.
+
+This step is very computationally intensive, but you can test it with the parameter `--end-primer-search-after-x-hits`. Once the total number of reads reaches this value, anvi'o will stop searching further and will continue with the next sample. This parameter is only good for testing.
+
+If you want to skip this step, you can use the flag `--skip-compute-inversion-activity`.
+
+An example command with 12 threads:
+
+
+anvi-report-inversions -P [bams-and-profiles-txt](/help/8/artifacts/bams-and-profiles-txt) \
+ -o inversions_output \
+ --num-threads 12 \
+ --oligo-primer-base-length 12
+
+
+### Computing inversion activity using previously computed inversions
+
+It is possible to instruct anvi'o to use previously reported inversions to characterize their activity across a larger set of samples. This is possible by passing the program [anvi-report-inversions](/help/8/programs/anvi-report-inversions) the output file for consensus inversions (i.e., 'INVERSIONS-CONSENSUS.txt') or the output file for sample-specific inversions (i.e., 'INVERSIONS-IN-[SAMPLE-NAME].txt') from a previous run using the flag `--pre-computed-inversions`:
+
+
+anvi-report-inversions -P [bams-and-profiles-txt](/help/8/artifacts/bams-and-profiles-txt) \
+ --pre-computed-inversions inversions_output/INVERSIONS-CONSENSUS.txt \
+ -o activity_calculations
+
+
+In this mode, [anvi-report-inversions](/help/8/programs/anvi-report-inversions) will not recalculate inversions, and will only report the activity of inversions found in the input file across samples listed in the [bams-and-profiles-txt](/help/8/artifacts/bams-and-profiles-txt) file.
+
+### Reporting genomic context around inversions
+
+For every inversion, anvi'o can report the surrounding genes and their functions as additional files.
+
+You can use the flag `--num-genes-to-consider-in-context` to choose how many genes to consider upstream/downstream of the inversion. By default, anvi'o reports three genes downstream and three genes upstream.
+
+To select a specific gene caller, you can use `--gene-caller`. The default is prodigal.
+
+If you want to skip this step, you can use the flag `--skip-recovering-genomic-context`.
+
+### Targeted search
+
+If you are interested in a given contig region you can use the following flags to limit the search:
+
+* `--target-contig`: contig of interest,
+* `--target-region-start`: the start position of the region of interest,
+* `--target-region-end`: the end position of the region of interest.
+
+### Output
+[anvi-report-inversions](/help/8/programs/anvi-report-inversions) searches for inversions in one sample at a time and thus generates a TAB-delimited table for every sample: `INVERSIONS-IN-SAMPLE_01.txt`, `INVERSIONS-IN-SAMPLE_02.txt`, ...
+
+These tables contain the following information:
+
+* entry ID,
+* contig name,
+* first palindrome sequence,
+* alignment midline,
+* second palindrome sequence,
+* start and stop position of the first and second palindrome sequence,
+* number of mismatches,
+* number of gaps,
+* length of the palindrome sequence,
+* distance between the first and second palindrome sequences, i.e., the size of the inversion,
+* the number of samples in which it was detected and confirmed,
+* the in silico primers used to compute the inversion's activity, for the first and second palindrome,
+* the oligo corresponding to the reference sequence.
+
+Anvi'o eventually creates a consensus table with all the unique inversions found across all your samples in a file called `INVERSIONS-CONSENSUS.txt`. This table has the same format as the individual sample outputs, with the 'entry ID' replaced by a unique inversion ID.
+
+Another default output table is named `ALL-STRETCHES-CONSIDERED.txt` and it reports every stretch that passed the ['Identifying regions of interest'](#identifying-regions-of-interest) parameters. It reports the maximum coverage of FWD/FWD and REV/REV in that stretch, per sample. It also reports the number of palindromes found and if a true inversion was confirmed.
+
+If the user enables the reporting of the genomic context, two additional TAB-delimited tables are generated: `INVERSIONS-CONSENSUS-SURROUNDING-GENES.txt` and `INVERSIONS-CONSENSUS-SURROUNDING-FUNCTIONS.txt`.
+The first table reports the gene calls surrounding every inversion when possible (inversions_id, gene_caller_id, start and stop position, orientation, gene_caller and contig).
+The second table reports the functions associated with every gene call reported in the first file (inversions_id, gene_caller_id, source, accession, function).
+
+Finally, if the user provides R1 and R2 FASTQ files and enables the reporting of inversion activity, [anvi-report-inversions](/help/8/programs/anvi-report-inversions) will generate a long-format file named `INVERSION-ACTIVITY.txt`. This file reports, for every inversion and sample, the relative proportion and read abundance of unique oligos, which correspond either to the reference contig (no inversion) or to an inversion sequence. The inversion activity is computed and reported for both sides of each inversion.
+
+
+
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-report-inversions.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-report-inversions) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-report-inversions/network.json b/help/8/programs/anvi-report-inversions/network.json
new file mode 100644
index 00000000..57e8fedf
--- /dev/null
+++ b/help/8/programs/anvi-report-inversions/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "inversions-txt",
+ "name": "inversions-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "bams-and-profiles-txt",
+ "name": "bams-and-profiles-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-report-inversions",
+ "name": "anvi-report-inversions",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-report-linkmers/index.md b/help/8/programs/anvi-report-linkmers/index.md
new file mode 100644
index 00000000..802131a1
--- /dev/null
+++ b/help/8/programs/anvi-report-linkmers/index.md
@@ -0,0 +1,112 @@
+---
+layout: program
+title: anvi-report-linkmers
+excerpt: An anvi'o program. Reports sequences stored in one or more BAM files that cover one or more specific nucleotide positions in a reference.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-report-linkmers
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Reports sequences stored in one or more BAM files that cover one or more specific nucleotide positions in a reference.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[bam-file](../../artifacts/bam-file)
+
+
+## Can provide
+
+
+[linkmers-txt](../../artifacts/linkmers-txt)
+
+
+## Usage
+
+
+Reports sequences stored in a [bam-file](/help/8/artifacts/bam-file) that cover one or more specific nucleotide positions in a reference.
+
+### Basic mode of operation
+
+Assume you wish to recover reads stored in one or more BAM files, where the matching reads contain at least one nucleotide position that aligns to a nucleotide position `nucleotide_position_N` in a contig `contig_name_X`. In that case, the user would first generate a two-column TAB-delimited file, for example `positions_for_linkmers.txt`, with no header line:
+
+
+
+
+
+ contig_name_X |
+ nucleotide_position_N |
+
+
+
+
+And run the program this way to recover the short reads:
+
+```
+anvi-report-linkmers --contigs-and-positions positions_for_linkmers.txt \
+ -i SAMPLE_01.bam SAMPLE_02.bam SAMPLE_03.bam (...) \
+ -o linkmers.txt
+```
+
+The user can define multiple contigs in the input file, and one or more nucleotide positions for each one of them:
+
+
+
+
+ contig_name_X |
+ nucleotide_position_01,nucleotide_position_02,nucleotide_position_03 |
+
+
+ contig_name_Y |
+ nucleotide_position_04 |
+
+
+ contig_name_Z |
+ nucleotide_position_05,nucleotide_position_06 |
+
+
+
+
+The resulting [linkmers-txt](/help/8/artifacts/linkmers-txt) would include all short reads that match any of these criteria.
+
+### Complete or incomplete links?
+
+Using the `--only-complete-links` flag, the user can require that only complete links be reported, where each reported short read must cover every nucleotide position listed for a given contig.
+
+Please note that if the nucleotide positions chosen for a given contig are too distant from each other given the short read length, zero reads may satisfy the complete links criterion.
+
+Having complete links, however, will enable [oligotyping](https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.12114) analyses on **metagenomic reads** through the anvi'o program [anvi-oligotype-linkmers](/help/8/programs/anvi-oligotype-linkmers).
+
+### See this program in action
+
+[http://merenlab.org/2015/12/09/musings-over-commamox/](http://merenlab.org/2015/12/09/musings-over-commamox/)
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-report-linkmers.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-report-linkmers) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-report-linkmers/network.json b/help/8/programs/anvi-report-linkmers/network.json
new file mode 100644
index 00000000..1f9050a4
--- /dev/null
+++ b/help/8/programs/anvi-report-linkmers/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "linkmers-txt",
+ "name": "linkmers-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "bam-file",
+ "name": "bam-file",
+ "provided_by_anvio": false,
+ "type": "BAM"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-report-linkmers",
+ "name": "anvi-report-linkmers",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-run-cazymes/index.md b/help/8/programs/anvi-run-cazymes/index.md
new file mode 100644
index 00000000..6734e7c8
--- /dev/null
+++ b/help/8/programs/anvi-run-cazymes/index.md
@@ -0,0 +1,92 @@
+---
+layout: program
+title: anvi-run-cazymes
+excerpt: An anvi'o program. Run dbCAN CAZymes on contigs-db.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-run-cazymes
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Run dbCAN CAZymes on contigs-db.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [cazyme-data](../../artifacts/cazyme-data)
+
+
+## Can provide
+
+
+[functions](../../artifacts/functions)
+
+
+## Usage
+
+
+This program **annotates genes in your [contigs-db](/help/8/artifacts/contigs-db) with functions using dbCAN [CAZyme HMMs](https://bcb.unl.edu/dbCAN2/download/Databases/)**.
+
+Before you run this program, you'll have to set up the CAZyme database on your computer with the program [anvi-setup-cazymes](/help/8/programs/anvi-setup-cazymes).
+
+The CAZyme database is based on protein sequences, so anvi'o will convert your genetic information into protein sequences and then use HMMs to compare them to the database.
+
+{:.notice}
+Unsure what an HMM is? Check out [our vocab page](http://merenlab.org/vocabulary/#hmm).
+
+To run, you'll need to provide a [contigs-db](/help/8/artifacts/contigs-db) and the output will be a [functions](/help/8/artifacts/functions) artifact. Here is a default run:
+
+
+anvi-run-cazymes -c [contigs-db](/help/8/artifacts/contigs-db)
+
+
+If you stored the [cazyme-data](/help/8/artifacts/cazyme-data) that you got from running [anvi-setup-cazymes](/help/8/programs/anvi-setup-cazymes) in a custom location, you'll need to provide that path as well.
+
+
+anvi-run-cazymes -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --cazyme-data-dir [cazyme-data](/help/8/artifacts/cazyme-data)
+
+
+By default, this uses `hmmsearch` to run HMMs. You can choose to use `hmmscan` instead by running
+
+
+anvi-run-cazymes -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --cazyme-data-dir [cazyme-data](/help/8/artifacts/cazyme-data) \
+ --hmmer-program hmmscan
+
+
+Use the parameter `--noise-cutoff-terms` to filter out hits. The default value is `--noise-cutoff-terms -E 1e-12`. If you want to explore filtering options, check out the help menu of the underlying HMMER program you are using, e.g., `hmmsearch -h`.
+
+
+anvi-run-cazymes -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --noise-cutoff-terms "-E 1e-14"
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-run-cazymes.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-run-cazymes) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-run-cazymes/network.json b/help/8/programs/anvi-run-cazymes/network.json
new file mode 100644
index 00000000..936c0678
--- /dev/null
+++ b/help/8/programs/anvi-run-cazymes/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions",
+ "name": "functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "cazyme-data",
+ "name": "cazyme-data",
+ "provided_by_anvio": true,
+ "type": "DATA"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-run-cazymes",
+ "name": "anvi-run-cazymes",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-run-hmms/index.md b/help/8/programs/anvi-run-hmms/index.md
new file mode 100644
index 00000000..49e88872
--- /dev/null
+++ b/help/8/programs/anvi-run-hmms/index.md
@@ -0,0 +1,174 @@
+---
+layout: program
+title: anvi-run-hmms
+excerpt: An anvi'o program. This program deals with populating tables that store HMM hits in an anvi'o contigs database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-run-hmms
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+This program deals with populating tables that store HMM hits in an anvi'o contigs database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [hmm-source](../../artifacts/hmm-source)
+
+
+## Can provide
+
+
+[hmm-hits](../../artifacts/hmm-hits)
+
+
+## Usage
+
+
+Stores [hmm-hits](/help/8/artifacts/hmm-hits) for a given [hmm-source](/help/8/artifacts/hmm-source) in a [contigs-db](/help/8/artifacts/contigs-db). In short, this is the program that will do a search for HMMs against a [contigs-db](/help/8/artifacts/contigs-db) and store that information into the contigs-db's [hmm-hits](/help/8/artifacts/hmm-hits).
+
+This is one of the programs that users commonly run on a newly generated [contigs-db](/help/8/artifacts/contigs-db), along with [anvi-scan-trnas](/help/8/programs/anvi-scan-trnas), [anvi-run-ncbi-cogs](/help/8/programs/anvi-run-ncbi-cogs), [anvi-run-scg-taxonomy](/help/8/programs/anvi-run-scg-taxonomy), and so on.
+
+### HMMs in the context of anvi'o
+
+In a nutshell, [hidden Markov models](https://en.wikipedia.org/wiki/Hidden_Markov_model) are statistical models typically generated from known genes which enable 'searching' for similar genes in other sequence contexts.
+
+The default anvi'o distribution includes numerous [curated HMM profiles](https://github.com/merenlab/anvio/tree/master/anvio/data/hmm) for single-copy core genes and ribosomal RNAs, and anvi'o can work with custom HMM profiles provided by the user. In anvi'o lingo, each of these HMM profiles, whether they are built-in or user defined, is called an [hmm-source](/help/8/artifacts/hmm-source).
+
+### Default Usage
+
+To run this program with all default settings (against all default anvi'o [hmm-source](/help/8/artifacts/hmm-source)), you only need to provide a [contigs-db](/help/8/artifacts/contigs-db):
+
+
+anvi-run-hmms -c [contigs-db](/help/8/artifacts/contigs-db)
+
+
+Multithreading will dramatically improve the performance of `anvi-run-hmms`. If you have multiple CPUs or cores, you may parallelize your search:
+
+
+
+anvi-run-hmms -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --num-threads 6
+
+
+
+You can also run this program on a specific built-in [hmm-source](/help/8/artifacts/hmm-source):
+
+
+anvi-run-hmms -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -I Bacteria_71
+
+
+### User-defined HMMs
+
+Running `anvi-run-hmms` with a custom model is easy. All you need to do is to create a directory with necessary files:
+
+
+anvi-run-hmms -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -H MY_HMM_PROFILE
+
+
+See the relevant section in the artifact [hmm-source](/help/8/artifacts/hmm-source) for details.
+
+### Adding HMM hits as a functional annotation source
+
+By default, HMM hits are not considered functional annotations and are kept in a distinct table (the 'hmm_hits' table) in the contigs database. However, there are certain cases when you may want them to be considered as functions instead. For instance, if you want to run [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) on a set of user-defined metabolic pathways and you have a set of custom HMMs for their enzymes.
+
+To treat the HMM hits as functional annotations and add them to the 'gene_functions' table in your database, you must use the `--add-to-functions-table` flag:
+
+
+anvi-run-hmms -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -H MY_HMM_PROFILE \
+ --add-to-functions-table
+
+
+### Changing the HMMER program
+
+By default, `anvi-run-hmms` will use [HMMER](http://hmmer.org/)'s `hmmscan` for amino acid HMM profiles, but you can use `hmmsearch` if you are searching a very large number of models against a relatively smaller number of sequences:
+
+
+anvi-run-hmms -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --hmmer-program hmmsearch
+
+
+{:.notice}
+This flag has no effect when your HMM profile source is for nucleotide sequences (like any of the Ribosomal RNA sources). In those cases anvi'o will use `nhmmscan` exclusively.
+
+### Saving the HMMER output
+
+If you want to see the output from the HMMER program (e.g., `hmmscan`) used to annotate your data, you can request that it be saved in a directory of your choosing. Please note that this only works when you are running a single HMM source, as in the example below:
+
+
+anvi-run-hmms -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -I Bacteria_71 \
+ --hmmer-output-dir OUTPUT_DIR
+
+
+If you do this, file(s) with the prefix `hmm` will appear in that directory, with the file extension indicating the format of the output file. For example, the table output format would be called `hmm.table`.
+
+{:.warning}
+These resulting files are not _exactly_ the raw output of HMMER because anvi'o does quite a bit of pre-processing on the raw input and output file(s) while jumping through some hoops to make the HMM searches multi-threaded. If this is causing you a lot of headache, please let us know.
+
+#### Requesting domain table output
+
+{:.notice}
+Please also see [anvi-script-filter-hmm-hits-table](/help/8/programs/anvi-script-filter-hmm-hits-table)
+
+No matter what, anvi'o will use the regular table output to annotate your contigs database. However, if you are using the `--hmmer-output-dir` to store the HMMER output, you can also request a domain table output using the flag `--domain-hits-table`.
+
+
+anvi-run-hmms -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -I Bacteria_71 \
+ --hmmer-output-dir OUTPUT_DIR \
+ --domain-hits-table
+
+
+In this case anvi'o will run [HMMER](http://hmmer.org) using the `--domtblout` flag to generate this output file.
+
+{:.notice}
+This flag will only work with HMM profiles made for amino acid sequences. Profiles for nucleotide sequences require the use of the program `nhmmscan`, which does not have an option to store domain output.
+
+Please note that this output **won't be used to filter hits to be added to the contigs database**. It will, however, give you the output file you need to investigate the coverage of HMM hits, and you can later use the program [anvi-script-filter-hmm-hits-table](/help/8/programs/anvi-script-filter-hmm-hits-table) with this file to remove weak hits from your HMM hits table.
+
+
+### Other things anvi-run-hmms can do
+
+* Add the flag `--also-scan-trnas` to run [anvi-scan-trnas](/help/8/programs/anvi-scan-trnas) for you at the same time. It's very convenient. (But it only works if you are not using the `-I` or `-H` flags at the same time because reasons.)
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-run-hmms.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Another description as part of the metagenomic workflow](http://merenlab.org/2016/06/22/anvio-tutorial-v2/#anvi-profile)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-run-hmms) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-run-hmms/network.json b/help/8/programs/anvi-run-hmms/network.json
new file mode 100644
index 00000000..901fd31f
--- /dev/null
+++ b/help/8/programs/anvi-run-hmms/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "hmm-hits",
+ "name": "hmm-hits",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "hmm-source",
+ "name": "hmm-source",
+ "provided_by_anvio": false,
+ "type": "HMM"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-run-hmms",
+ "name": "anvi-run-hmms",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-run-interacdome/index.md b/help/8/programs/anvi-run-interacdome/index.md
new file mode 100644
index 00000000..947ef55b
--- /dev/null
+++ b/help/8/programs/anvi-run-interacdome/index.md
@@ -0,0 +1,109 @@
+---
+layout: program
+title: anvi-run-interacdome
+excerpt: An anvi'o program. Run InteracDome on a contigs database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-run-interacdome
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Run InteracDome on a contigs database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [interacdome-data](../../artifacts/interacdome-data)
+
+
+## Can provide
+
+
+[binding-frequencies-txt](../../artifacts/binding-frequencies-txt) [misc-data-amino-acids](../../artifacts/misc-data-amino-acids)
+
+
+## Usage
+
+
+
+This program predicts per-residue binding scores for genes in your [contigs-db](/help/8/artifacts/contigs-db) via the [InteracDome](https://interacdome.princeton.edu/) database.
+
+
+The full process is detailed in [this blog post](https://merenlab.org/2020/07/22/interacdome/). Ideally, all of that information would be in this very document, but because the blog post preceded this document, it hasn't been translated over yet. So if you want to get into the nitty-gritty details, you should read that blog post. Otherwise, the quick reference herein should be sufficient.
+
+
+In summary, this program runs an HMM search of the genes in your [contigs-db](/help/8/artifacts/contigs-db) to all the Pfam gene families that have been annotated with InteracDome binding frequencies. Then, it parses and filters results, associates binding frequencies of HMM match states to the user's genes of interest, and then stores the resulting per-residue binding frequencies for each gene into the [contigs-db](/help/8/artifacts/contigs-db) as [misc-data-amino-acids](/help/8/artifacts/misc-data-amino-acids).
+
+
+Before running this program, you'll have to run [anvi-setup-interacdome](/help/8/programs/anvi-setup-interacdome) to set up a local copy of [InteracDome's tab-separated files](https://interacdome.princeton.edu/#tab-6136-4).
+
+
+
+## Basic Usage
+
+A basic run of this program looks like this:
+
+
+anvi-run-interacdome -c [contigs-db](/help/8/artifacts/contigs-db) -T 4
+
+
+In addition to storing per-residue binding frequencies as [misc-data-amino-acids](/help/8/artifacts/misc-data-amino-acids) in your [contigs-db](/help/8/artifacts/contigs-db), this also outputs additional files prefixed with `INTERACDOME` by default (the prefix can be changed with `-O`). These are provided as [binding-frequencies-txt](/help/8/artifacts/binding-frequencies-txt) files named `INTERACDOME-match_state_contributors.txt` and `INTERACDOME-domain_hits.txt`. See [binding-frequencies-txt](/help/8/artifacts/binding-frequencies-txt) for details.
+
+
+## Parameters
+
+[InteracDome](https://interacdome.princeton.edu/) offers two different binding frequency datasets that can be chosen with `--interacdome-dataset`. Choose 'representable' to include Pfams that correspond to domain-ligand interactions that had nonredundant instances across three or more distinct PDB structures. InteracDome authors recommend using this collection to learn more about domain binding properties. Choose 'confident' to include Pfams that correspond to domain-ligand interactions that had nonredundant instances across three or more distinct PDB entries and achieved a cross-validated precision of at least 0.5. The default is 'representable', and you can change it like so:
+
+
+
+anvi-run-interacdome -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --interacdome-dataset confident
+
+
+This program is multi-threaded, so be sure to make use of it:
+
+
+anvi-run-interacdome -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --interacdome-dataset confident \
+ -T 8
+
+
+Additionally, there are numerous thresholds that you can set:
+
+1. [`--min-binding-frequency` to ignore very low frequencies](https://merenlab.org/2020/07/22/interacdome/#filtering-low-binding-frequency-scores). The InteracDome scale is from 0 (most likely not involved in binding) to 1 (most likely involved in binding). The default cutoff is 0.2.
+2. [`--min-hit-fraction` to remove poor quality HMM hits](https://merenlab.org/2020/07/22/interacdome/#filtering-partial-hits). The default value is 0.5, so at least half of a profile HMM's length must align to your gene, otherwise the hit will be discarded.
+3. [`--information-content-cutoff` to ignore low-quality domain hits](https://merenlab.org/2020/07/22/interacdome/#filtering-bad-hits-with-information-content). The default value is 4, which means every amino acid of your gene must match the consensus amino acid of the match state for each match state with [information content](https://en.wikipedia.org/wiki/Sequence_logo) greater than 4. Decreasing this cutoff yields an increasingly stringent filter.
+
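The first two thresholds above can be pictured as successive filters over the HMM hits. The following is a minimal Python sketch of that idea, not anvi'o's actual implementation: the `DomainHit` record, its field names, and the `filter_hits` helper are hypothetical and exist only for illustration. It assumes that a hit is kept when its aligned fraction is at least `min_hit_fraction`, and that a binding frequency is kept when it is at least `min_binding_frequency`. How values exactly at the cutoff are treated is an assumption. The information content filter is not covered here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DomainHit:
    """Hypothetical record for one Pfam HMM hit to a gene (illustration only)."""
    pfam_id: str
    hmm_length: int                  # number of match states in the profile HMM
    aligned_length: int              # number of match states aligned to the gene
    binding_freqs: Dict[int, float]  # residue position in gene -> binding frequency


def filter_hits(hits: List[DomainHit],
                min_hit_fraction: float = 0.5,
                min_binding_frequency: float = 0.2) -> List[DomainHit]:
    """Drop partial hits, then drop low binding frequencies within the kept hits."""
    kept = []
    for hit in hits:
        # Discard hits where too little of the profile HMM aligned to the gene.
        if hit.aligned_length / hit.hmm_length < min_hit_fraction:
            continue
        # Ignore per-residue frequencies below the cutoff (assumed inclusive).
        freqs = {pos: freq for pos, freq in hit.binding_freqs.items()
                 if freq >= min_binding_frequency}
        kept.append(DomainHit(hit.pfam_id, hit.hmm_length, hit.aligned_length, freqs))
    return kept
```

With the default values, a hit that aligns to 40 of 100 match states would be discarded, while a hit that aligns to 50 of 100 would be kept with only its frequencies of 0.2 or higher.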
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-run-interacdome.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Estimating per-residue binding frequencies with InteracDome](http://merenlab.org/2020/07/22/interacdome/)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-run-interacdome) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-run-interacdome/network.json b/help/8/programs/anvi-run-interacdome/network.json
new file mode 100644
index 00000000..9a2f9925
--- /dev/null
+++ b/help/8/programs/anvi-run-interacdome/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "binding-frequencies-txt",
+ "name": "binding-frequencies-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-amino-acids",
+ "name": "misc-data-amino-acids",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "interacdome-data",
+ "name": "interacdome-data",
+ "provided_by_anvio": true,
+ "type": "DATA"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-run-interacdome",
+ "name": "anvi-run-interacdome",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "source": 4,
+ "target": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-run-kegg-kofams/index.md b/help/8/programs/anvi-run-kegg-kofams/index.md
new file mode 100644
index 00000000..4f21495a
--- /dev/null
+++ b/help/8/programs/anvi-run-kegg-kofams/index.md
@@ -0,0 +1,167 @@
+---
+layout: program
+title: anvi-run-kegg-kofams
+excerpt: An anvi'o program. Run KOfam HMMs on an anvi'o contigs database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-run-kegg-kofams
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Run KOfam HMMs on an anvi'o contigs database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [kegg-data](../../artifacts/kegg-data)
+
+
+## Can provide
+
+
+[kegg-functions](../../artifacts/kegg-functions) [functions](../../artifacts/functions)
+
+
+## Usage
+
+
+Essentially, this program uses the KEGG database to annotate functions and metabolic pathways in a [contigs-db](/help/8/artifacts/contigs-db). More specifically, [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) annotates a [contigs-db](/help/8/artifacts/contigs-db) with HMM hits from KOfam, a database of KEGG Orthologs (KOs). You must set up these HMMs on your computer using [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data) before you can use this program. If a [modules-db](/help/8/artifacts/modules-db) is available, membership of KOfam functions in KEGG metabolic MODULES and BRITE hierarchies is also stored in the [contigs-db](/help/8/artifacts/contigs-db).
+
+Running this program is a pre-requisite for metabolism estimation with [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism). Note that if you are planning to run metabolism estimation, it must be run with the same [kegg-data](/help/8/artifacts/kegg-data) that is used in this program to annotate KOfam hits.
+
+## How does it work?
+**1) Run an HMM search against KOfam**
+Briefly, this program extracts all the gene calls from the [contigs-db](/help/8/artifacts/contigs-db) and checks each one for hits to the KOfam HMM profiles in your [kegg-data](/help/8/artifacts/kegg-data). This can be time-consuming given that the number of HMM profiles is quite large, even more so if the number of genes in the [contigs-db](/help/8/artifacts/contigs-db) is also large. Multi-threading is a good idea if you have the computational capability to do so.
+
+**2) Eliminate weak hits based on bitscore**
+Many HMM hits will be found, most of them weak. The weak hits will by default be eliminated according to the bitscore thresholds provided by KEGG; that is, hits with bitscores below the threshold for a given KO profile will be discarded, and those with bitscores above the threshold will be annotated in the [contigs-db](/help/8/artifacts/contigs-db). It is perfectly normal to notice that the number of raw hits found is many, many times larger than the number of annotated KO hits in your database.
+
+**3) Add back valid hits that were missed**
+There is one issue with this practice of removing _all_ KOfam hits below the KEGG bitscore threshold for a given profile. We (and others) have noticed that the KEGG thresholds can sometimes be too stringent, eliminating hits that are actually valid annotations. To solve this problem, we
+have implemented the following heuristic for relaxing the bitscore thresholds and annotating genes that would otherwise go without a valid KO annotation:
+
+For every gene without a KOfam annotation, we examine all the hits with an e-value below `X` and a bitscore above `Y` percent of the threshold. If those hits are all to a unique KOfam profile, then we annotate the gene call with that KO.
+
+`X` and `Y` are parameters that can be modified (see below), but by default the e-value threshold (`X`) is 1e-05 and the bitscore fraction (`Y`) is 0.5.
+
+Please note that this strategy is just a heuristic. We have tried to pick default parameters that seemed reasonable but by no means have we comprehensively tested and optimized them. This is why X and Y are mutable so that you can explore different values and see how they work for your data. It is always a good idea to double-check your annotations to make sure they are reasonable and as stringent as you'd like them to be. In addition, if you do not feel comfortable using this heuristic at all, you can always turn this behavior off and rely solely on KEGG's bitscore thresholds. :)
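The threshold filter and the relaxation heuristic described above can be sketched in a few lines of Python. This is an illustration of the logic as described in this document, not the code anvi'o runs: the `Hit` record, the `annotate` function, and the shape of the `ko_thresholds` dictionary are hypothetical. It assumes a hit exactly at the KEGG threshold passes, and it uses the comparison `e-value <= X` and `bitscore > Y * threshold` for the relaxed pass.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set


class Hit(NamedTuple):
    """Hypothetical record for one KOfam HMM hit (illustration only)."""
    gene_id: int
    ko: str
    bitscore: float
    e_value: float


def annotate(hits: List[Hit],
             ko_thresholds: Dict[str, float],
             e_value_cutoff: float = 1e-5,      # X
             bitscore_fraction: float = 0.5     # Y
             ) -> Dict[int, Set[str]]:
    """Return gene_id -> set of KOs after thresholding and the relaxation heuristic."""
    hits_by_gene = defaultdict(list)
    for hit in hits:
        hits_by_gene[hit.gene_id].append(hit)

    annotations = {}
    for gene_id, gene_hits in hits_by_gene.items():
        # Step 2: keep hits that pass the KEGG bitscore threshold of their KO.
        strong = {h.ko for h in gene_hits if h.bitscore >= ko_thresholds[h.ko]}
        if strong:
            annotations[gene_id] = strong
            continue
        # Step 3: for genes left without annotation, relax the threshold.
        relaxed = {h.ko for h in gene_hits
                   if h.e_value <= e_value_cutoff
                   and h.bitscore > bitscore_fraction * ko_thresholds[h.ko]}
        # Annotate only if all relaxed hits point to a single KOfam profile.
        if len(relaxed) == 1:
            annotations[gene_id] = relaxed
    return annotations
```

In this sketch, a gene whose only relaxed hits point to two different KOs stays unannotated, which mirrors the requirement that the hits are all to a unique KOfam profile.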
+
+**4) Put annotations in the database**
+In the [contigs-db](/help/8/artifacts/contigs-db) functions table, annotated KO hits ([kegg-functions](/help/8/artifacts/kegg-functions)) will have the source `KOfam`. If a [modules-db](/help/8/artifacts/modules-db) is available, metabolic modules and BRITE functional classifications containing these functions also have entries in the table, with sources labeled `KEGG_Module` and `KEGG_BRITE`. BRITE classification will not occur if [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data) was not set up with BRITE data (see the artifact for that program to see how to include BRITE).
+
+## Standard usage
+
+
+anvi-run-kegg-kofams -c [contigs-db](/help/8/artifacts/contigs-db)
+
+
+## Use a specific non-default KEGG data directory
+If you have previously set up your KEGG data directory using `--kegg-data-dir` (see [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data)), or have moved the KEGG data directory that you wish to use to a non-default location (maybe you like keeping the older versions around when you update, we don't know how you roll), then you may need to specify where to find the KEGG data so that this program can use the right one. In that case, this is how you do it:
+
+
+anvi-run-kegg-kofams -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --kegg-data-dir /path/to/directory/KEGG
+
+
+## Run with multiple threads
+
+
+anvi-run-kegg-kofams -c [contigs-db](/help/8/artifacts/contigs-db) -T 4
+
+
+## Use a different HMMER program
+By default, [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) uses `hmmsearch` to find KO hits. If for some reason you would rather use a different program (`hmmscan` is also currently supported), you can do so.
+
+
+anvi-run-kegg-kofams -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --hmmer-program hmmscan
+
+
+## Keep all HMM hits
+Usually, this program parses out weak HMM hits and keeps only those that are above the score threshold for a given KO. If you would like to turn off this behavior and keep all hits (there will be _a lot_ of weak ones), you can follow the example below:
+
+
+anvi-run-kegg-kofams -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --keep-all-hits
+
+
+## Save the bitscores of HMM hits
+
+If you want to see the bitscores of all KOfam hits that were added to your contigs database, you can use the `--log-bitscores` option to save these values into a tab-delimited file:
+
+
+anvi-run-kegg-kofams -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --log-bitscores
+
+
+Here is an example of what the resulting bitscore file would look like:
+
+|**entry_id**|**bit_score**|**domain_bit_score**|**e_value**|**entry_id**|**gene_callers_id**|**gene_hmm_id**|**gene_name**|
+|:--|:--|:--|:--|:--|:--|:--|:--|
+|1|177.4|85.1|8e-54|0|1371|-|K10681|
+|2|34.1|33.7|9.1e-11|1|1141|-|K01954|
+|3|22.4|22.4|3.1e-07|2|1402|-|K01954|
+|4|12.8|11.8|0.00024|3|1099|-|K01954|
+|5|17.1|16.7|4.4e-05|4|1267|-|K20024|
+
+Combining this flag with the `--keep-all-hits` option is one way to get the bitscores of all matches to the KOfam profiles, even the ones that would usually not pass the bitscore threshold provided by KEGG.
+
+## Modify the bitscore relaxation heuristic
+As described above, this program does its best to avoid missing valid annotations by relaxing the bitscore threshold for genes without any annotations. For such a gene, hits with e-value <= X and bitscore > (Y * KEGG threshold) that are all hits to the same KOfam profile are used to annotate the gene with that KO.
+
+### Skip this heuristic entirely
+If you don't want any previously-eliminated hits to be used for annotation, you can skip this heuristic by using the flag `--skip-bitscore-heuristic`. Then, _only_ hits with bitscores above the KEGG-provided threshold for a given KO will be used for annotation.
+
+
+anvi-run-kegg-kofams -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --skip-bitscore-heuristic
+
+
+### Modify the heuristic parameters
+If our default values are too stringent or not stringent enough for your tastes, you can change them! The e-value threshold (X, default: 1e-05) can be set using `-E` or `--heuristic-e-value` and the bitscore fraction (Y, default: 0.50) can be set using `-H` or `--heuristic-bitscore-fraction`. Like so:
+
+
+anvi-run-kegg-kofams -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -E 1e-15 \
+ -H 0.90
+
+
+## Skip BRITE annotations
+If for some strange reason you do not want KEGG BRITE annotations to be added to your contigs database, you can skip them by providing the `--skip-brite-hierarchies` flag:
+
+
+anvi-run-kegg-kofams -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --skip-brite-hierarchies
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-run-kegg-kofams.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-run-kegg-kofams) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-run-kegg-kofams/network.json b/help/8/programs/anvi-run-kegg-kofams/network.json
new file mode 100644
index 00000000..af191e88
--- /dev/null
+++ b/help/8/programs/anvi-run-kegg-kofams/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "kegg-functions",
+ "name": "kegg-functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions",
+ "name": "functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "kegg-data",
+ "name": "kegg-data",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-run-kegg-kofams",
+ "name": "anvi-run-kegg-kofams",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "source": 4,
+ "target": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-run-ncbi-cogs/index.md b/help/8/programs/anvi-run-ncbi-cogs/index.md
new file mode 100644
index 00000000..f1717655
--- /dev/null
+++ b/help/8/programs/anvi-run-ncbi-cogs/index.md
@@ -0,0 +1,80 @@
+---
+layout: program
+title: anvi-run-ncbi-cogs
+excerpt: An anvi'o program. This program runs NCBI's COGs to associate genes in an anvi'o contigs database with functions.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-run-ncbi-cogs
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+This program runs NCBI's COGs to associate genes in an anvi'o contigs database with functions. The COGs database was designed as an attempt to classify proteins from completely sequenced genomes on the basis of the orthology concept.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [cogs-data](../../artifacts/cogs-data)
+
+
+## Can provide
+
+
+[functions](../../artifacts/functions)
+
+
+## Usage
+
+
+This program **annotates genes in your [contigs-db](/help/8/artifacts/contigs-db) with functions using NCBI's [Clusters of Orthologous Groups (COGs) database](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC102395/).**
+
+This program assumes that the user has successfully set up the COGs database on their computer using the anvi'o program [anvi-setup-ncbi-cogs](/help/8/programs/anvi-setup-ncbi-cogs).
+
+The only critical parameter to [anvi-run-ncbi-cogs](/help/8/programs/anvi-run-ncbi-cogs) is a [contigs-db](/help/8/artifacts/contigs-db). The program will store its output in the [contigs-db](/help/8/artifacts/contigs-db) as a [functions](/help/8/artifacts/functions) artifact.
+
+If the [cogs-data](/help/8/artifacts/cogs-data) was stored at a specific path when [anvi-setup-ncbi-cogs](/help/8/programs/anvi-setup-ncbi-cogs) was run, then providing that path using the `--cog-data-dir` parameter is also necessary.
+
+
+anvi-run-ncbi-cogs -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --cog-data-dir path/to/[cogs-data](/help/8/artifacts/cogs-data)
+
+
+Without the flag `--cog-data-dir`, anvi'o will just search in the default location.
+
+You can also use blastp to search, by running:
+
+
+anvi-run-ncbi-cogs -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --search-with blastp
+
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-run-ncbi-cogs.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-run-ncbi-cogs) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-run-ncbi-cogs/network.json b/help/8/programs/anvi-run-ncbi-cogs/network.json
new file mode 100644
index 00000000..276a9215
--- /dev/null
+++ b/help/8/programs/anvi-run-ncbi-cogs/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions",
+ "name": "functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "cogs-data",
+ "name": "cogs-data",
+ "provided_by_anvio": true,
+ "type": "DATA"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-run-ncbi-cogs",
+ "name": "anvi-run-ncbi-cogs",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-run-pfams/index.md b/help/8/programs/anvi-run-pfams/index.md
new file mode 100644
index 00000000..2534f0ce
--- /dev/null
+++ b/help/8/programs/anvi-run-pfams/index.md
@@ -0,0 +1,86 @@
+---
+layout: program
+title: anvi-run-pfams
+excerpt: An anvi'o program. Run Pfam on Contigs Database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-run-pfams
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Run Pfam on Contigs Database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [pfams-data](../../artifacts/pfams-data)
+
+
+## Can provide
+
+
+[functions](../../artifacts/functions)
+
+
+## Usage
+
+
+This program **associates genes in your [contigs-db](/help/8/artifacts/contigs-db) with functions using the EBI's [Pfam database](https://pfam.xfam.org/).**
+
+Before you run this program, you'll have to set up the Pfam database on your computer with the program [anvi-setup-pfams](/help/8/programs/anvi-setup-pfams).
+
+The Pfam database is based on protein sequences, so anvi'o will convert your genetic information into protein sequences and then use HMMs to compare them to the database.
+
+{:.notice}
+Unsure what an HMM is? Check out [our vocab page](http://merenlab.org/vocabulary/#hmm)
+
+To run, you'll need to provide a [contigs-db](/help/8/artifacts/contigs-db). If you stored the [pfams-data](/help/8/artifacts/pfams-data) that you got from running [anvi-setup-pfams](/help/8/programs/anvi-setup-pfams) in a custom location, you'll need to provide that path as well. The output is a [functions](/help/8/artifacts/functions) artifact.
+
+Here is a default run:
+
+
+anvi-run-pfams -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --pfam-data-dir [pfams-data](/help/8/artifacts/pfams-data)
+
+
+By default, this uses `hmmsearch` to run HMMs. You can choose to use `hmmscan` instead by running:
+
+
+anvi-run-pfams -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --pfam-data-dir [pfams-data](/help/8/artifacts/pfams-data) \
+ --hmmer-program hmmscan
+
+
+See [this article](https://cryptogenomicon.org/2011/05/27/hmmscan-vs-hmmsearch-speed-the-numerology/) for a discussion on the performance of the two HMMER programs.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-run-pfams.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-run-pfams) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-run-pfams/network.json b/help/8/programs/anvi-run-pfams/network.json
new file mode 100644
index 00000000..ffb0ca00
--- /dev/null
+++ b/help/8/programs/anvi-run-pfams/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions",
+ "name": "functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pfams-data",
+ "name": "pfams-data",
+ "provided_by_anvio": true,
+ "type": "DATA"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-run-pfams",
+ "name": "anvi-run-pfams",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-run-scg-taxonomy/index.md b/help/8/programs/anvi-run-scg-taxonomy/index.md
new file mode 100644
index 00000000..ce77f67f
--- /dev/null
+++ b/help/8/programs/anvi-run-scg-taxonomy/index.md
@@ -0,0 +1,83 @@
+---
+layout: program
+title: anvi-run-scg-taxonomy
+excerpt: An anvi'o program. The purpose of this program is to affiliate single-copy core genes in an anvi'o contigs database with taxonomic names.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-run-scg-taxonomy
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+The purpose of this program is to affiliate single-copy core genes in an anvi'o contigs database with taxonomic names. A properly set up local SCG taxonomy database is required for this program to work. After a successful run, `anvi-estimate-scg-taxonomy` can be used to estimate taxonomy at the genome, collection, or metagenome level.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [scgs-taxonomy-db](../../artifacts/scgs-taxonomy-db) [hmm-hits](../../artifacts/hmm-hits)
+
+
+## Can provide
+
+
+[scgs-taxonomy](../../artifacts/scgs-taxonomy)
+
+
+## Usage
+
+
+This program **associates the single-copy core genes in your [contigs-db](/help/8/artifacts/contigs-db) with taxonomy information.**
+
+Once this information is stored in your [contigs-db](/help/8/artifacts/contigs-db) (in the form of a [scgs-taxonomy](/help/8/artifacts/scgs-taxonomy) artifact), you can run [anvi-estimate-scg-taxonomy](/help/8/programs/anvi-estimate-scg-taxonomy), or use [anvi-interactive](/help/8/programs/anvi-interactive) and enable "Realtime taxonomy estimate for bins". Check out [this tutorial](http://merenlab.org/2019/10/08/anvio-scg-taxonomy/) for more information.
+
+In order to run this program, you'll need a [scgs-taxonomy-db](/help/8/artifacts/scgs-taxonomy-db), which you can set up by running [anvi-setup-scg-taxonomy](/help/8/programs/anvi-setup-scg-taxonomy).
+
+### What does this program do?
+
+In short, this program searches all of the single-copy core genes that it uses for this workflow (the 22 listed on [this page](https://github.com/merenlab/anvio/tree/master/anvio/data/misc/SCG_TAXONOMY/GTDB/SCG_SEARCH_DATABASES)) against the [GTDB](https://gtdb.ecogenomic.org/) databases that you downloaded, and stores the hits in your [contigs-db](/help/8/artifacts/contigs-db). In other words, it finds your single-copy core genes and assigns them taxonomy. When you later run [anvi-estimate-scg-taxonomy](/help/8/programs/anvi-estimate-scg-taxonomy), anvi'o can use these genes to estimate the taxonomy of larger groups of contigs that include them.
+
+### Sweet. How do I run it?
+
+
+anvi-run-scg-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db)
+
+
+In case you're running this on a genome and not getting any hits, you have the option to try lowering the percent identity required for a hit (as long as you're careful with it). The default value is 90 percent.
+
+
+anvi-run-scg-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --min-percent-identity 70
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-run-scg-taxonomy.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Usage examples and warnings](http://merenlab.org/scg-taxonomy)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-run-scg-taxonomy) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-run-scg-taxonomy/network.json b/help/8/programs/anvi-run-scg-taxonomy/network.json
new file mode 100644
index 00000000..282a3c99
--- /dev/null
+++ b/help/8/programs/anvi-run-scg-taxonomy/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "scgs-taxonomy",
+ "name": "scgs-taxonomy",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "scgs-taxonomy-db",
+ "name": "scgs-taxonomy-db",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "hmm-hits",
+ "name": "hmm-hits",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-run-scg-taxonomy",
+ "name": "anvi-run-scg-taxonomy",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "target": 4,
+ "source": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-run-trna-taxonomy/index.md b/help/8/programs/anvi-run-trna-taxonomy/index.md
new file mode 100644
index 00000000..925a880f
--- /dev/null
+++ b/help/8/programs/anvi-run-trna-taxonomy/index.md
@@ -0,0 +1,85 @@
+---
+layout: program
+title: anvi-run-trna-taxonomy
+excerpt: An anvi'o program. The purpose of this program is to affiliate tRNA gene sequences in an anvi'o contigs database with taxonomic names.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-run-trna-taxonomy
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+The purpose of this program is to affiliate tRNA gene sequences in an anvi'o contigs database with taxonomic names. A properly set up local tRNA taxonomy database is required for this program to work. After a successful run, `anvi-estimate-trna-taxonomy` can be used to estimate taxonomy at the genome, collection, or metagenome level.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [trna-taxonomy-db](../../artifacts/trna-taxonomy-db)
+
+
+## Can provide
+
+
+[trna-taxonomy](../../artifacts/trna-taxonomy)
+
+
+## Usage
+
+
+This program associates the tRNA gene sequences found in your [contigs-db](/help/8/artifacts/contigs-db) with taxonomy information.
+
+Once these associations are stored in your [contigs-db](/help/8/artifacts/contigs-db) (represented by a [trna-taxonomy](/help/8/artifacts/trna-taxonomy) artifact), you'll be able to run [anvi-estimate-trna-taxonomy](/help/8/programs/anvi-estimate-trna-taxonomy) to use the associations to estimate taxonomy on a larger scale (i.e. for a genome or metagenome).
+
+To run this program, you'll need to have set up two things:
+1. a [trna-taxonomy-db](/help/8/artifacts/trna-taxonomy-db), which you can set up by running [anvi-setup-trna-taxonomy](/help/8/programs/anvi-setup-trna-taxonomy).
+2. the 'transfer-RNAs' HMM hits in your [contigs-db](/help/8/artifacts/contigs-db), which you can set up by running [anvi-scan-trnas](/help/8/programs/anvi-scan-trnas).
+
+This program will then go through the tRNA hits in your contigs database and search them against the sequences in the [GTDB](https://gtdb.ecogenomic.org/) databases that you downloaded to assign them taxonomy.
+
+### Basic run
+
+The following is a basic run of this program:
+
+
+anvi-run-trna-taxonomy -c [contigs-db](/help/8/artifacts/contigs-db)
+
+
+If you have set up the two requirements listed above, this should run smoothly.
+
+### Additional Parameters
+
+When changing these parameters, it might be a good idea to run [anvi-estimate-trna-taxonomy](/help/8/programs/anvi-estimate-trna-taxonomy) with the `--debug` flag so that you can see what your results look like under the hood.
+
+1. `--max-num-target-sequences`: the number of hits that this program considers for each tRNA sequence before making a final decision for the taxonomy association. The default is 100, but if you want to ensure that you have accurate data at the expense of some runtime, you can increase it.
+2. `--min-percent-identity`: the minimum percent identity an alignment needs to be considered a hit. The default is 90, but if you're not getting any hits on a specific sequence, you can decrease it at the risk of getting some nonsense results.
+
+Finally, this program does not usually have an output file, but if desired you can add the parameter `--all-hits-output-file` to store the list of hits that anvi'o looked at to determine the consensus hit for each sequence.
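+
+As a rough sketch of how these options combine (the file names `CONTIGS.db` and `trna-hits.txt` are placeholders, and the values are arbitrary rather than recommendations), a run that considers more hits per sequence, relaxes the identity threshold slightly, and keeps the full list of hits might look like this:
+
+``` bash
+# CONTIGS.db and trna-hits.txt are hypothetical file names
+anvi-run-trna-taxonomy -c CONTIGS.db \
+                       --max-num-target-sequences 200 \
+                       --min-percent-identity 85 \
+                       --all-hits-output-file trna-hits.txt
+```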
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-run-trna-taxonomy.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-run-trna-taxonomy) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-run-trna-taxonomy/network.json b/help/8/programs/anvi-run-trna-taxonomy/network.json
new file mode 100644
index 00000000..9f8bd3bb
--- /dev/null
+++ b/help/8/programs/anvi-run-trna-taxonomy/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "trna-taxonomy",
+ "name": "trna-taxonomy",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "trna-taxonomy-db",
+ "name": "trna-taxonomy-db",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-run-trna-taxonomy",
+ "name": "anvi-run-trna-taxonomy",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-run-workflow/index.md b/help/8/programs/anvi-run-workflow/index.md
new file mode 100644
index 00000000..98b86f23
--- /dev/null
+++ b/help/8/programs/anvi-run-workflow/index.md
@@ -0,0 +1,96 @@
+---
+layout: program
+title: anvi-run-workflow
+excerpt: An anvi'o program. Execute, manage, parallelize, and troubleshoot entire 'omics workflows and chain together anvi'o and third party programs.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-run-workflow
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Execute, manage, parallelize, and troubleshoot entire 'omics workflows and chain together anvi'o and third party programs.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+
+
+## Can consume
+
+
+[workflow-config](../../artifacts/workflow-config)
+
+
+## Can provide
+
+
+[workflow](../../artifacts/workflow)
+
+
+## Usage
+
+
+This program allows you to run a [workflow](/help/8/artifacts/workflow) implemented by anvi'o developers for commonly used sets of steps that process raw data (short reads or contigs from genomes, transcriptomes, metagenomes, metatranscriptomes, and so on). Some aspects of this program are described in [this tutorial](https://merenlab.org/2018/07/09/anvio-snakemake-workflows/).
+
+For a list of currently available anvi'o workflows, please see the [workflow](/help/8/artifacts/workflow) artifact.
+
+### Before running the workflow
+
+Each workflow requires a [workflow-config](/help/8/artifacts/workflow-config): the file that details all of the parameters for the workflow. To get the [workflow-config](/help/8/artifacts/workflow-config) with the default parameters, just run
+
+
+anvi-run-workflow -w [workflow](/help/8/artifacts/workflow) \
+ --get-default-config CONFIG.json
+
+
+Before running a workflow, it is also a good idea to check the required dependencies by running
+
+
+anvi-run-workflow -w [workflow](/help/8/artifacts/workflow) \
+ --list-dependencies
+
+
+### The main run
+
+The main run of the workflow should look like this:
+
+
+anvi-run-workflow -w [workflow](/help/8/artifacts/workflow) \
+ -c CONFIG.json \
+ --save-workflow-graph
+
+
+The flag `--save-workflow-graph` creates a visual representation of the anvi'o programs that the workflow runs.
+
+You can also use the `-A` flag at the end of the parameter list to change other [Snakemake](https://snakemake.readthedocs.io/en/stable/) parameters.
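+
+For instance, a sketch that passes Snakemake's `--jobs` option through `-A` (the workflow name `metagenomics` and the job count are only illustrative choices, and `CONFIG.json` is assumed to be the config file generated earlier):
+
+``` bash
+# everything after -A is handed over to Snakemake
+anvi-run-workflow -w metagenomics \
+                  -c CONFIG.json \
+                  -A --jobs 4
+```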
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-run-workflow.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Tutorial](http://merenlab.org/2018/07/09/anvio-snakemake-workflows/)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-run-workflow) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-run-workflow/network.json b/help/8/programs/anvi-run-workflow/network.json
new file mode 100644
index 00000000..c7db0f45
--- /dev/null
+++ b/help/8/programs/anvi-run-workflow/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "workflow",
+ "name": "workflow",
+ "provided_by_anvio": true,
+ "type": "WORKFLOW"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "workflow-config",
+ "name": "workflow-config",
+ "provided_by_anvio": false,
+ "type": "JSON"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-run-workflow",
+ "name": "anvi-run-workflow",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-scan-trnas/index.md b/help/8/programs/anvi-scan-trnas/index.md
new file mode 100644
index 00000000..f66f70b0
--- /dev/null
+++ b/help/8/programs/anvi-scan-trnas/index.md
@@ -0,0 +1,91 @@
+---
+layout: program
+title: anvi-scan-trnas
+excerpt: An anvi'o program. Identify and store tRNA genes in a contigs database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-scan-trnas
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Identify and store tRNA genes in a contigs database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+[hmm-hits](../../artifacts/hmm-hits)
+
+
+## Usage
+
+
+This program identifies the tRNA genes in a [contigs-db](/help/8/artifacts/contigs-db) and stores them as [hmm-hits](/help/8/artifacts/hmm-hits).
+
+To run, just provide a [contigs-db](/help/8/artifacts/contigs-db) that you want to look through.
+
+
+anvi-scan-trnas -c [contigs-db](/help/8/artifacts/contigs-db)
+
+
+### Customizing the cutoff score
+
+What counts as a tRNA gene? That could be up to you.
+
+The default minimum score for a gene to be counted is 20. However, you can set this cutoff to any value between 0 and 100. This value is actually used by tRNAscan-SE, so see [their documentation](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6768409/) for details. For example, to find more non-canonical tRNA genes, you could lower the cutoff score to 10 as follows:
+
+
+anvi-scan-trnas -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --trna-cutoff-score 10
+
+
+### Other options
+
+- It is easy to modify where the outputs will go:
+
+ - Use the parameter `--log-file` to provide a path for the output messages.
+
+ - Use the parameter `--trna-hits-file` to provide a path for the raw tRNA scan data.
+
+- Like many anvi'o programs, you can use the flag `--just-do-it` to skip questions and warnings.
+
+- You can also try to multithread whenever possible by setting the `--num-threads` parameter (it is 1 by default). This can be used to speed up runtime, but please be aware of your system and its limitations before trying this.
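+
+Assuming a contigs database named `CONTIGS.db` (a placeholder, as are the output file names and the thread count), the options above could be combined roughly as follows:
+
+``` bash
+# all file names and the thread count are hypothetical
+anvi-scan-trnas -c CONTIGS.db \
+                --log-file scan-trnas.log \
+                --trna-hits-file trna-hits.txt \
+                --num-threads 4 \
+                --just-do-it
+```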
+
+### Understanding the output
+
+Essentially, the output of this program states the probability that each gene is a tRNA gene. See [hmm-hits](/help/8/artifacts/hmm-hits) for more information.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-scan-trnas.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-scan-trnas) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-scan-trnas/network.json b/help/8/programs/anvi-scan-trnas/network.json
new file mode 100644
index 00000000..f9b15c70
--- /dev/null
+++ b/help/8/programs/anvi-scan-trnas/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "hmm-hits",
+ "name": "hmm-hits",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-scan-trnas",
+ "name": "anvi-scan-trnas",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-add-default-collection/index.md b/help/8/programs/anvi-script-add-default-collection/index.md
new file mode 100644
index 00000000..2b1c8e06
--- /dev/null
+++ b/help/8/programs/anvi-script-add-default-collection/index.md
@@ -0,0 +1,80 @@
+---
+layout: program
+title: anvi-script-add-default-collection
+excerpt: An anvi'o program. A script to add a 'DEFAULT' collection in an anvi'o pan or profile database with either (1) a single bin that describes all items available in the profile database, or (2) as many bins as there are items in the profile database where every item has its own bin.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-add-default-collection
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A script to add a 'DEFAULT' collection in an anvi'o pan or profile database with either (1) a single bin that describes all items available in the profile database, or (2) as many bins as there are items in the profile database where every item has its own bin. The former is the default behavior that will be useful in most instances where you need to use this script. The latter is most useful if you are Florian and/or have something very specific in mind.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[pan-db](../../artifacts/pan-db) [profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+[collection](../../artifacts/collection) [bin](../../artifacts/bin)
+
+
+## Usage
+
+
+This program adds a 'default' [collection](/help/8/artifacts/collection) and [bin](/help/8/artifacts/bin) to your [pan-db](/help/8/artifacts/pan-db) or [profile-db](/help/8/artifacts/profile-db) and [contigs-db](/help/8/artifacts/contigs-db) that describes every item in your database.
+
+This way, you can perform anvi'o tasks that require a collection or a bin even if you do not have a particular collection for your data, or when all items in your database already represent a meaningful bin (such as every contig in a [contigs-db](/help/8/artifacts/contigs-db) that represents a single genome).
+
+As an example, see this program in action in the [Infant Gut Tutorial](http://merenlab.org/tutorials/infant-gut/#the-gene-mode-studying-distribution-patterns-at-the-gene-level) where it is used to run [anvi-interactive](/help/8/programs/anvi-interactive) on a genome in 'gene mode'.
+
+Run in its simplest form,
+
+
+anvi-script-add-default-collection -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db)
+
+
+the program will add a new collection into the profile database named `DEFAULT`, which will contain a single bin that describes all items in the database named `EVERYTHING`. You can set these default names to your liking using additional parameters:
+
+
+anvi-script-add-default-collection -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -C MY_COLLECTION \
+ -b MY_BIN
+
+
+Also see related programs, [anvi-show-collections-and-bins](/help/8/programs/anvi-show-collections-and-bins) and [anvi-delete-collection](/help/8/programs/anvi-delete-collection).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-add-default-collection.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-add-default-collection) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-add-default-collection/network.json b/help/8/programs/anvi-script-add-default-collection/network.json
new file mode 100644
index 00000000..59848241
--- /dev/null
+++ b/help/8/programs/anvi-script-add-default-collection/network.json
@@ -0,0 +1,82 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-add-default-collection",
+ "name": "anvi-script-add-default-collection",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 5,
+ "target": 0
+ },
+ {
+ "source": 5,
+ "target": 1
+ },
+ {
+ "target": 5,
+ "source": 2
+ },
+ {
+ "target": 5,
+ "source": 3
+ },
+ {
+ "target": 5,
+ "source": 4
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-as-markdown/index.md b/help/8/programs/anvi-script-as-markdown/index.md
new file mode 100644
index 00000000..fb03c9fd
--- /dev/null
+++ b/help/8/programs/anvi-script-as-markdown/index.md
@@ -0,0 +1,179 @@
+---
+layout: program
+title: anvi-script-as-markdown
+excerpt: An anvi'o program. Markdownizes TAB-delimited data with headers in terminal.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-as-markdown
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Markdownizes TAB-delimited data with headers in terminal.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+This program seems to know what it's doing. It needs no input material from its user. Good program.
+
+
+## Can provide
+
+
+[markdown-txt](../../artifacts/markdown-txt)
+
+
+## Usage
+
+
+A helper script to format TAB-delimited files in markdown for bloggers, tutorial writers, those who wish to share example anvi'o outputs as text on GitHub issues, and so on.
+
+Anvi'o programs often generate TAB-delimited files. While this simple format is useful to pass around to other software or share with others, it is not easily interpretable in visual media. The purpose of this script is to make the sharing part simpler for platforms that can render markdown.
+
+You can pipe any TAB-delimited content to this script:
+
+```
+cat file.txt | anvi-script-as-markdown
+```
+
+## Examples
+
+Assume a TAB-delimited file with many lines:
+
+``` bash
+wc -l additional_view_data.txt
+
+301 additional_view_data.txt
+```
+
+The first few lines of which look like this:
+
+``` bash
+head -n 10 additional_view_data.txt
+
+contig categorical_1 categorical_2 text_layer_01 numerical bars_main!A bars_main!B bars_main!C
+backrest b y nmwje 2.78 278 23 1
+backward b x bqmyujr psrd doefhi 2.49 249 52 2
+backwind b y hkfer lchpmzix 2.69 269 32 3
+backyard b x advoe bfkyhmg 2.05 205 96 4
+bacteria b x lqmcwn hywco 2.63 263 38 5
+bacterin b vxqdmn 2.98 298 3 6
+baetylus b x fkgpydi owgyhfx xwlpj 2.19 219 82 7
+bagpiped b y ijmnur 2.12 212 89 8
+balconet b y ecizgs 2.89 289 12 9
+```
+
+### Default run
+
+
+head -n 10 additional_view_data.txt | [anvi-script-as-markdown](/help/8/programs/anvi-script-as-markdown)
+
+
+which is rendered as,
+
+|**contig**|**categorical_1**|**categorical_2**|**text_layer_01**|**numerical**|**bars_main!A**|**bars_main!B**|**bars_main!C**|
+|:--|:--|:--|:--|:--|:--|:--|:--|
+|backrest|b|y|nmwje|2.78|278|23|1|
+|backward|b|x|bqmyujr psrd doefhi|2.49|249|52|2|
+|backwind|b|y|hkfer lchpmzix|2.69|269|32|3|
+|backyard|b|x|advoe bfkyhmg|2.05|205|96|4|
+|bacteria|b|x|lqmcwn hywco|2.63|263|38|5|
+|bacterin|b||vxqdmn|2.98|298|3|6|
+|baetylus|b|x|fkgpydi owgyhfx xwlpj|2.19|219|82|7|
+|bagpiped|b|y|ijmnur|2.12|212|89|8|
+|balconet|b|y|ecizgs|2.89|289|12|9|
+
+### Limit the number of lines shown
+
+``` bash
+cat additional_view_data.txt | anvi-script-as-markdown --max-num-lines-to-show 10
+```
+
+which is rendered as,
+
+|**contig**|**categorical_1**|**categorical_2**|**text_layer_01**|**numerical**|**bars_main!A**|**bars_main!B**|**bars_main!C**|
+|:--|:--|:--|:--|:--|:--|:--|:--|
+|backrest|b|y|nmwje|2.78|278|23|1|
+|backward|b|x|bqmyujr psrd doefhi|2.49|249|52|2|
+|backwind|b|y|hkfer lchpmzix|2.69|269|32|3|
+|backyard|b|x|advoe bfkyhmg|2.05|205|96|4|
+|bacteria|b|x|lqmcwn hywco|2.63|263|38|5|
+|bacterin|b||vxqdmn|2.98|298|3|6|
+|baetylus|b|x|fkgpydi owgyhfx xwlpj|2.19|219|82|7|
+|bagpiped|b|y|ijmnur|2.12|212|89|8|
+|balconet|b|y|ecizgs|2.89|289|12|9|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+
+### Code columns
+
+``` bash
+cat additional_view_data.txt | anvi-script-as-markdown --max-num-lines-to-show 10 \
+ --code-column contig
+```
+
+which is rendered as,
+
+|**contig**|**categorical_1**|**categorical_2**|**text_layer_01**|**numerical**|**bars_main!A**|**bars_main!B**|**bars_main!C**|
+|:--|:--|:--|:--|:--|:--|:--|:--|
+|`backrest`|b|y|nmwje|2.78|278|23|1|
+|`backward`|b|x|bqmyujr psrd doefhi|2.49|249|52|2|
+|`backwind`|b|y|hkfer lchpmzix|2.69|269|32|3|
+|`backyard`|b|x|advoe bfkyhmg|2.05|205|96|4|
+|`bacteria`|b|x|lqmcwn hywco|2.63|263|38|5|
+|`bacterin`|b||vxqdmn|2.98|298|3|6|
+|`baetylus`|b|x|fkgpydi owgyhfx xwlpj|2.19|219|82|7|
+|`bagpiped`|b|y|ijmnur|2.12|212|89|8|
+|`balconet`|b|y|ecizgs|2.89|289|12|9|
+|(...)|(...)|(...)|(...)|(...)|(...)|(...)|(...)|
+
+### Exclude columns from the output
+
+``` bash
+cat additional_view_data.txt | anvi-script-as-markdown --max-num-lines-to-show 10 \
+ --code-column contig \
+ --exclude-columns 'bars_main!A,bars_main!B,bars_main!C'
+```
+
+which is rendered as,
+
+|**contig**|**categorical_1**|**categorical_2**|**text_layer_01**|**numerical**|
+|:--|:--|:--|:--|:--|
+|`backrest`|b|y|nmwje|2.78|
+|`backward`|b|x|bqmyujr psrd doefhi|2.49|
+|`backwind`|b|y|hkfer lchpmzix|2.69|
+|`backyard`|b|x|advoe bfkyhmg|2.05|
+|`bacteria`|b|x|lqmcwn hywco|2.63|
+|`bacterin`|b||vxqdmn|2.98|
+|`baetylus`|b|x|fkgpydi owgyhfx xwlpj|2.19|
+|`bagpiped`|b|y|ijmnur|2.12|
+|`balconet`|b|y|ecizgs|2.89|
+|(...)|(...)|(...)|(...)|(...)|
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-as-markdown.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-as-markdown) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-as-markdown/network.json b/help/8/programs/anvi-script-as-markdown/network.json
new file mode 100644
index 00000000..20150cbd
--- /dev/null
+++ b/help/8/programs/anvi-script-as-markdown/network.json
@@ -0,0 +1,30 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "markdown-txt",
+ "name": "markdown-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-as-markdown",
+ "name": "anvi-script-as-markdown",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 1,
+ "target": 0
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-augustus-output-to-external-gene-calls/index.md b/help/8/programs/anvi-script-augustus-output-to-external-gene-calls/index.md
new file mode 100644
index 00000000..b66549db
--- /dev/null
+++ b/help/8/programs/anvi-script-augustus-output-to-external-gene-calls/index.md
@@ -0,0 +1,73 @@
+---
+layout: program
+title: anvi-script-augustus-output-to-external-gene-calls
+excerpt: An anvi'o program. Takes in gene calls by AUGUSTUS v3.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-augustus-output-to-external-gene-calls
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Takes in gene calls by AUGUSTUS v3.3.3 and generates an anvi'o external gene calls file. It may also work with other versions of AUGUSTUS, but no one has tested the script with them.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[augustus-gene-calls](../../artifacts/augustus-gene-calls)
+
+
+## Can provide
+
+
+[external-gene-calls](../../artifacts/external-gene-calls)
+
+
+## Usage
+
+
+This program converts a gene call file from [AUGUSTUS](http://bioinf.uni-greifswald.de/augustus/) (as an [augustus-gene-calls](/help/8/artifacts/augustus-gene-calls) artifact) to an anvi'o [external-gene-calls](/help/8/artifacts/external-gene-calls) artifact.
+
+This essentially just reformats the data in the [augustus-gene-calls](/help/8/artifacts/augustus-gene-calls) artifact (for example, removing the UTR information) so that it can be read by other anvi'o programs.
+
+A run of this program will look something like this:
+
+
+anvi-script-augustus-output-to-external-gene-calls -i [augustus-gene-calls](/help/8/artifacts/augustus-gene-calls)
+ -o [external-gene-calls](/help/8/artifacts/external-gene-calls)
+
+
+Here is an example of what the resulting [external-gene-calls](/help/8/artifacts/external-gene-calls) file will look like (from the gff file used as an example on the [augustus-gene-calls](/help/8/artifacts/augustus-gene-calls) page):
+
+ gene_callers_id contig start stop direction partial call_type source version aa_sequence
+ 0 unnamed-1 56 1252 f 0 1 AUGUSTUS v3.3.3 MSEGNAAGEPSTPGGPRPLLTGARGLIGRRPAPPLTPGRLPSIRSRDLTLGGVKKKTFTPNIISRKIKEEPKEEVTVKKEKRERDRDRQREGHGRGRGRPEVIQSHSIFEQGPAEMMKKKGNWDKTVDVSDMGPSHIINIKKEKRETDEETKQILRMLEKDDFLDDPGLRNDTRNMPVQLPLAHSGWLFKEENDEPDVKPWLAGPKEEDMEVDIPAVKVKEEPRDEEEEAKMKAPPKAARKTPGLPKDVSVAELLRELSLTKEEELLFLQLPDTLPGQPPTQDIKPIKTEVQGEDGQVVLIKQEKDREAKLAENACTLADLTEGQVGKLLIRKSGRVQLLLGKVTLDVTMGTACSFLQELVSVGLGDSRTGEMTVLGHVKHKLVCSPDFESLLDHKHR
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-augustus-output-to-external-gene-calls.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-augustus-output-to-external-gene-calls) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-augustus-output-to-external-gene-calls/network.json b/help/8/programs/anvi-script-augustus-output-to-external-gene-calls/network.json
new file mode 100644
index 00000000..d0dded34
--- /dev/null
+++ b/help/8/programs/anvi-script-augustus-output-to-external-gene-calls/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-gene-calls",
+ "name": "external-gene-calls",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "augustus-gene-calls",
+ "name": "augustus-gene-calls",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-augustus-output-to-external-gene-calls",
+ "name": "anvi-script-augustus-output-to-external-gene-calls",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-checkm-tree-to-interactive/index.md b/help/8/programs/anvi-script-checkm-tree-to-interactive/index.md
new file mode 100644
index 00000000..c85941a2
--- /dev/null
+++ b/help/8/programs/anvi-script-checkm-tree-to-interactive/index.md
@@ -0,0 +1,70 @@
+---
+layout: program
+title: anvi-script-checkm-tree-to-interactive
+excerpt: An anvi'o program. A helper script to convert CheckM trees into anvi'o interactive displays with taxonomy information.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-checkm-tree-to-interactive
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A helper script to convert CheckM trees into anvi'o interactive displays with taxonomy information.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[phylogeny](../../artifacts/phylogeny)
+
+
+## Can provide
+
+
+[interactive](../../artifacts/interactive)
+
+
+## Usage
+
+
+A helper script to process CheckM tree output to generate files compatible with [anvi-interactive](/help/8/programs/anvi-interactive).
+
+An example use:
+
+
+anvi-script-checkm-tree-to-interactive -t CheckM_concatenated.tree \
+ -o OUTPUT_PATH
+cd OUTPUT_PATH/
+anvi-interactive -p PROFILE.db \
+ -t newick.tree \
+ -d view_data.txt \
+ --manual
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-checkm-tree-to-interactive.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-checkm-tree-to-interactive) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-checkm-tree-to-interactive/network.json b/help/8/programs/anvi-script-checkm-tree-to-interactive/network.json
new file mode 100644
index 00000000..98c04a9a
--- /dev/null
+++ b/help/8/programs/anvi-script-checkm-tree-to-interactive/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "interactive",
+ "name": "interactive",
+ "provided_by_anvio": true,
+ "type": "DISPLAY"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "phylogeny",
+ "name": "phylogeny",
+ "provided_by_anvio": true,
+ "type": "NEWICK"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-checkm-tree-to-interactive",
+ "name": "anvi-script-checkm-tree-to-interactive",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-compute-ani-for-fasta/index.md b/help/8/programs/anvi-script-compute-ani-for-fasta/index.md
new file mode 100644
index 00000000..52b3147d
--- /dev/null
+++ b/help/8/programs/anvi-script-compute-ani-for-fasta/index.md
@@ -0,0 +1,72 @@
+---
+layout: program
+title: anvi-script-compute-ani-for-fasta
+excerpt: An anvi'o program. Run ANI between contigs in a single FASTA file.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-compute-ani-for-fasta
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Run ANI between contigs in a single FASTA file.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[fasta](../../artifacts/fasta)
+
+
+## Can provide
+
+
+[genome-similarity](../../artifacts/genome-similarity)
+
+
+## Usage
+
+
+This program computes the average nucleotide identity (ANI) between the contigs in a single FASTA file (using PyANI).
+
+To compute the ANI (or other genome distance metrics) between two genomes in different fasta files, use [anvi-compute-genome-similarity](/help/8/programs/anvi-compute-genome-similarity).
+
+A default run of this program looks like this:
+
+
+anvi-script-compute-ani-for-fasta -f [fasta](/help/8/artifacts/fasta) \
+ -o path/to/output \
+ --method ANIb
+
+
+By default, the PyANI method is ANIb (which aligns 1020 nt fragments of your sequences using BLASTN+). You can switch to ANIm, ANIblastall, or TETRA if desired. See the [PyANI documentation](https://github.com/widdowquinn/pyani) for more information.
+
+You also have the option to change the distance metric (from the default "euclidean") or the linkage method (from the default "ward") or provide a path to a log file for debug messages.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-compute-ani-for-fasta.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-compute-ani-for-fasta) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-compute-ani-for-fasta/network.json b/help/8/programs/anvi-script-compute-ani-for-fasta/network.json
new file mode 100644
index 00000000..5ac6fb76
--- /dev/null
+++ b/help/8/programs/anvi-script-compute-ani-for-fasta/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genome-similarity",
+ "name": "genome-similarity",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "fasta",
+ "name": "fasta",
+ "provided_by_anvio": false,
+ "type": "FASTA"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-compute-ani-for-fasta",
+ "name": "anvi-script-compute-ani-for-fasta",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-compute-bayesian-pan-core/index.md b/help/8/programs/anvi-script-compute-bayesian-pan-core/index.md
new file mode 100644
index 00000000..dfecbaa2
--- /dev/null
+++ b/help/8/programs/anvi-script-compute-bayesian-pan-core/index.md
@@ -0,0 +1,57 @@
+---
+layout: program
+title: anvi-script-compute-bayesian-pan-core
+excerpt: An anvi'o program. Runs mOTUpan on your gene clusters to estimate whether they are core or accessory.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-compute-bayesian-pan-core
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Runs mOTUpan on your gene clusters to estimate whether they are core or accessory.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[pan-db](../../artifacts/pan-db) [genomes-storage-db](../../artifacts/genomes-storage-db)
+
+
+## Can provide
+
+
+[bin](../../artifacts/bin)
+
+
+## Usage
+
+
+{:.notice}
+**No one has described the usage of this program** :/ If you would like to contribute, please see previous examples [here](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs), and feel free to add a Markdown formatted file in that directory named "anvi-script-compute-bayesian-pan-core.md". For a template, you can use the markdown file for `anvi-gen-contigs-database`. THANK YOU!
+
+
+## Additional Resources
+
+
+* [GitHub repository for the mOTUPan](https://github.com/moritzbuck/mOTUlizer/)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-compute-bayesian-pan-core) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-compute-bayesian-pan-core/network.json b/help/8/programs/anvi-script-compute-bayesian-pan-core/network.json
new file mode 100644
index 00000000..f677f0e6
--- /dev/null
+++ b/help/8/programs/anvi-script-compute-bayesian-pan-core/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-compute-bayesian-pan-core",
+ "name": "anvi-script-compute-bayesian-pan-core",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-estimate-metabolic-independence/index.md b/help/8/programs/anvi-script-estimate-metabolic-independence/index.md
new file mode 100644
index 00000000..675fce52
--- /dev/null
+++ b/help/8/programs/anvi-script-estimate-metabolic-independence/index.md
@@ -0,0 +1,165 @@
+---
+layout: program
+title: anvi-script-estimate-metabolic-independence
+excerpt: An anvi'o program. Takes a genome as a contigs-db, and tells you whether it can be considered as an organism of high metabolic independence, or not.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-estimate-metabolic-independence
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Takes a genome as a contigs-db, and tells you whether it can be considered as an organism of high metabolic independence, or not.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+[metabolic-independence-score](../../artifacts/metabolic-independence-score)
+
+
+## Usage
+
+
+The goal of this script is to give access to some [recent findings](https://doi.org/10.1101/2021.03.02.433653) on the relationship between the metabolic make-up of a given organism and its ability to survive stressful conditions in the human gut, and to classify whether a given genome can be considered as an organism with high metabolic independence (HMI) or not.
+
+The idea implemented in this script should work for human gut microbes; however, we have not yet tested the concept of HMI/LMI beyond that.
+
+## The concept of 'High Metabolic Independence'
+
+Briefly, a microbial organism will have high metabolic independence (HMI) when its genome encodes, with high completeness, a set of key metabolic pathways for the biosynthesis of key molecules such as amino acids, cofactors, nucleotides, lipids, etc, through which the organism will be fairly robust to environmental stress, changing environmental conditions, and/or factors that can disrupt microbial communities. In contrast, low metabolic independence (LMI) is characterized by the complete absence and/or low level of completion of the same set of pathways, which renders organisms with LMI unable to produce critical metabolites that are often necessary for survival.
+
+Even if this hypothesis has merit, there are open questions that need to be addressed, such as (1) which specific metabolic pathways would be necessary to calculate the extent of metabolic independence, (2) how complete the set of pathways should be in a given genome, or (3) to what extent these insights are environment-dependent (i.e., will the set of pathways and completion for human gut microbes also work for marine microbes, etc.). Despite the lack of clear answers to these questions, this script offers a framework to investigate the extent of metabolic independence of gut microbes based on our recent study. In this study, we were able to define two groups of microbial genomes following a fecal microbiota transplantation (FMT) experiment: the first group was composed of populations that were able to colonize many FMT recipients and were prevalent in the global gut metagenomes (the good colonizers). The second group, in contrast, was composed of populations that largely failed to colonize FMT recipients, and were also missing in global gut metagenomes (the poor colonizers). Using the programs [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) and [anvi-compute-metabolic-enrichment](/help/8/programs/anvi-compute-metabolic-enrichment), we asked which metabolic pathways were differentially enriched between these two groups of genomes (more specifically, we only kept the modules that were 'associated' with good colonizers with a q-value of 0.05 and were at least 75% complete in at least 50% of the genomes in the good colonizers group). This analysis revealed the following set of 33 [KEGG modules](https://www.genome.jp/kegg/module.html):
+
+|**module**|**name**|
+|:--|:--|
+|M00049|Adenine ribonucleotide biosynthesis, IMP => ADP,ATP|
+|M00050|Guanine ribonucleotide biosynthesis, IMP => GDP,GTP|
+|M00007|Pentose phosphate pathway, non-oxidative phase, fructose 6P => ribose 5P|
+|M00140|C1-unit interconversion, prokaryotes|
+|M00005|PRPP biosynthesis, ribose 5P => PRPP|
+|M00083|Fatty acid biosynthesis, elongation|
+|M00120|Coenzyme A biosynthesis, pantothenate => CoA|
+|M00854|Glycogen biosynthesis, glucose-1P => glycogen/starch|
+|M00527|Lysine biosynthesis, DAP aminotransferase pathway, aspartate => lysine|
+|M00096|C5 isoprenoid biosynthesis, non-mevalonate pathway|
+|M00048|Inosine monophosphate biosynthesis, PRPP + glutamine => IMP|
+|M00855|Glycogen degradation, glycogen => glucose-6P|
+|M00022|Shikimate pathway, phosphoenolpyruvate + erythrose-4P => chorismate|
+|M00844|Arginine biosynthesis, ornithine => arginine|
+|M00051|Uridine monophosphate biosynthesis, glutamine (+ PRPP) => UMP|
+|M00082|Fatty acid biosynthesis, initiation|
+|M00157|F-type ATPase, prokaryotes and chloroplasts|
+|M00026|Histidine biosynthesis, PRPP => histidine|
+|M00526|Lysine biosynthesis, DAP dehydrogenase pathway, aspartate => lysine|
+|M00015|Proline biosynthesis, glutamate => proline|
+|M00019|Valine/isoleucine biosynthesis, pyruvate => valine / 2-oxobutanoate => isoleucine|
+|M00432|Leucine biosynthesis, 2-oxoisovalerate => 2-oxoisocaproate|
+|M00018|Threonine biosynthesis, aspartate => homoserine => threonine|
+|M00570|Isoleucine biosynthesis, threonine => 2-oxobutanoate => isoleucine|
+|M00126|Tetrahydrofolate biosynthesis, GTP => THF|
+|M00115|NAD biosynthesis, aspartate => quinolinate => NAD|
+|M00028|Ornithine biosynthesis, glutamate => ornithine|
+|M00924|Cobalamin biosynthesis, anaerobic, uroporphyrinogen III => sirohydrochlorin => cobyrinate a,c-diamide|
+|M00122|Cobalamin biosynthesis, cobyrinate a,c-diamide => cobalamin|
+|M00125|Riboflavin biosynthesis, plants and bacteria, GTP => riboflavin/FMN/FAD|
+|M00023|Tryptophan biosynthesis, chorismate => tryptophan|
+|M00631|D-Galacturonate degradation (bacteria), D-galacturonate => pyruvate + D-glyceraldehyde 3P|
+|M00061|D-Glucuronate degradation, D-glucuronate => pyruvate + D-glyceraldehyde 3P|
+
+## Estimating the level of metabolic independence
+
+By default, this script uses the list of modules that we determined to be associated with high metabolic independence, and calculates the completion scores of these modules in a given [contigs-db](/help/8/artifacts/contigs-db) to classify it as an HMI or an LMI organism. Here is an example run:
+
+
+anvi-script-estimate-metabolic-independence -c [contigs-db](/help/8/artifacts/contigs-db)
+
+
+Running this on a _Ruminococcus gnavus_ genome will print something like this:
+
+```
+CLASSIFICATION RESULT
+===============================================
+Metabolic independence .......................: High
+Threshold for classification .................: 20
+Genome score .................................: 31.06
+```
+
+The script calls upon the same classes that are invoked via [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) to estimate the completeness of each metabolic pathway of interest. This estimation step returns a fractional completeness score for each pathway, which will have a value between 0 and 1 (inclusive). Note that this step also depends on the provided KEGG database, and that we currently use the pathwise completeness score for this step (you can find [an explanation of pathwise completeness here](https://anvio.org/help/main/programs/anvi-estimate-metabolism/#technical-details)). In its current iteration, the script adds up all the completeness scores to get a single sum (reported as the 'genome score'), and if this sum is higher than the threshold, it labels the organism as one that has high metabolic independence.
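+
+The summation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the script's actual implementation: the function name, the `Low` label, and the example completeness values are assumptions made for this sketch.
+
+```python
+def classify_metabolic_independence(completeness_by_module, threshold=20.0):
+    """Sum per-module completeness scores and compare the sum to a threshold.
+
+    Hypothetical helper for illustration; not part of the anvi'o API.
+    """
+    for module, score in completeness_by_module.items():
+        # Each completeness score is a fraction between 0 and 1 (inclusive).
+        if not 0.0 <= score <= 1.0:
+            raise ValueError(f"completeness for {module} must be in [0, 1], got {score}")
+
+    genome_score = sum(completeness_by_module.values())
+    # The genome is labeled 'High' only if the sum is higher than the threshold.
+    label = "High" if genome_score > threshold else "Low"
+    return genome_score, label
+
+
+# Made-up completeness values for three modules (illustration only).
+scores = {"M00049": 1.0, "M00050": 0.75, "M00007": 0.5}
+genome_score, label = classify_metabolic_independence(scores, threshold=2.0)
+print(f"Genome score: {genome_score:.2f}, metabolic independence: {label}")
+```
+
+With only three modules in the list, the sum can never exceed 3, which is why the example uses a threshold of 2 rather than the default of 20.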
+
+## Determining a new set of modules and a threshold
+
+You can provide this script with a different set of metabolic modules, a different threshold, and/or a different KEGG data directory to utilize:
+
+
+anvi-script-estimate-metabolic-independence -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --module-list modules.txt \
+ --threshold 20 \
+ --kegg-data-dir [kegg-data](/help/8/artifacts/kegg-data)
+
+
+If you wish to update the list of modules based on your own empirical data, you can follow this simple recipe:
+
+1. Define two groups of genomes, one that is of what you consider 'high metabolic independence', and one that is of 'low metabolic independence'.
+2. Estimate the completeness of all metabolic pathways of potential interest in these genomes.
+3. Determine metabolic modules that are enriched in the HMI group. This is the list of modules you would provide to the script for classifying additional genomes as HMI or LMI.
+
+The exact methods of estimating metabolism and computing enrichment are up to you, but note that this strategy only works if you already have genomes that you have manually identified as HMI or LMI.
+
+This may require adjusting the threshold for classification, which can be done in a similar fashion: determine a cutoff that best separates the genomes in your two groups. The plot below shows our two groups of genomes, good and poor colonizers. With the assumption that most of the good colonizers had HMI, and most of the poor colonizers had LMI, we plotted the overall completion of metabolic modules of interest, and selected a number that separates them:
+
+![A plot of HMI scores for the good colonizer genomes and the poor colonizer genomes](../../images/FMT_HMI_score_plot.png)
+
+{:.notice}
+Since individual completeness scores have a maximum of 1 (for 100% complete), the maximum value of the sum will be _n_, where _n_ is the number of metabolic pathways in your list. So you should select a threshold between 0 and _n_, most likely on the higher end of that range, since high metabolic independence is generally defined as _high_ completeness scores across the set of pathways. It will depend on your data, of course.
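+
+One way to pick such a cutoff from two manually labeled groups is sketched below. It is only one possible heuristic (choose the cutoff that misclassifies the fewest genomes), not necessarily how the threshold above was selected; the function name and the scores are hypothetical.
+
+```python
+def best_threshold(hmi_scores, lmi_scores):
+    """Return the cutoff that misclassifies the fewest genomes.
+
+    Hypothetical helper for illustration; not part of the anvi'o API.
+    """
+    values = sorted(set(hmi_scores) | set(lmi_scores))
+    if len(values) < 2:
+        raise ValueError("need at least two distinct scores to place a cutoff")
+
+    # Candidate cutoffs: midpoints between consecutive observed scores.
+    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
+
+    def errors(cutoff):
+        # HMI genomes at or below the cutoff, plus LMI genomes above it.
+        missed_hmi = sum(1 for s in hmi_scores if s <= cutoff)
+        missed_lmi = sum(1 for s in lmi_scores if s > cutoff)
+        return missed_hmi + missed_lmi
+
+    # On ties, min() keeps the lowest of the equally good cutoffs.
+    return min(candidates, key=errors)
+
+
+# Made-up genome scores for two manually defined groups (illustration only).
+good_colonizers = [31.0, 27.5, 24.0, 22.5]
+poor_colonizers = [12.0, 15.5, 18.5, 23.0]
+cutoff = best_threshold(good_colonizers, poor_colonizers)
+print(f"Suggested threshold: {cutoff}")
+```
+
+The groups in this example overlap, so no cutoff separates them perfectly; the returned value is simply the one with the fewest misclassified genomes.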
+
+## Other parameter options
+
+### Using stepwise completeness instead
+
+By default, this script relies on the pathwise completeness scores for each module in the input list. If you want to use stepwise completeness instead, simply add the `--use-stepwise-completeness` flag (you may also want to adjust the threshold value):
+
+
+anvi-script-estimate-metabolic-independence -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --module-list modules.txt \
+ --threshold 20 \
+ --use-stepwise-completeness
+
+
+Not sure what we're talking about here? You can learn about the differences between pathwise and stepwise metrics [on this page](https://anvio.org/help/main/programs/anvi-estimate-metabolism/#two-estimation-strategies---pathwise-and-stepwise).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-estimate-metabolic-independence.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-estimate-metabolic-independence) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-estimate-metabolic-independence/network.json b/help/8/programs/anvi-script-estimate-metabolic-independence/network.json
new file mode 100644
index 00000000..218ea2c8
--- /dev/null
+++ b/help/8/programs/anvi-script-estimate-metabolic-independence/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "metabolic-independence-score",
+ "name": "metabolic-independence-score",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-estimate-metabolic-independence",
+ "name": "anvi-script-estimate-metabolic-independence",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-filter-fasta-by-blast/index.md b/help/8/programs/anvi-script-filter-fasta-by-blast/index.md
new file mode 100644
index 00000000..b5ed3b8a
--- /dev/null
+++ b/help/8/programs/anvi-script-filter-fasta-by-blast/index.md
@@ -0,0 +1,78 @@
+---
+layout: program
+title: anvi-script-filter-fasta-by-blast
+excerpt: An anvi'o program. Filter FASTA file according to BLAST table (remove sequences with bad BLAST alignment).
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-filter-fasta-by-blast
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Filter FASTA file according to BLAST table (remove sequences with bad BLAST alignment).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+
+
+## Can consume
+
+
+[contigs-fasta](../../artifacts/contigs-fasta) [blast-table](../../artifacts/blast-table)
+
+
+## Can provide
+
+
+[contigs-fasta](../../artifacts/contigs-fasta)
+
+
+## Usage
+
+
+This program takes a [contigs-fasta](/help/8/artifacts/contigs-fasta) and [blast-table](/help/8/artifacts/blast-table) and removes sequences without BLAST hits of a certain level of confidence.
+
+For example, you could use this program to filter out sequences that do not have high-confidence taxonomy assignments before running a phylogenomic analysis.
+
+To run this program, you'll need to provide the [contigs-fasta](/help/8/artifacts/contigs-fasta) that you're planning to filter, the [blast-table](/help/8/artifacts/blast-table), a list of the column headers in your [blast-table](/help/8/artifacts/blast-table) (as given to BLAST by `-outfmt`), and a `proper_pident` threshold at which to remove the sequences. `proper_pident` is the percentage of the query amino acids that were identical to the corresponding matched amino acids, and sequences that fall below the given threshold are removed. Note that this differs from the `pident` BLAST parameter because it doesn't include unaligned regions.
+
+For example, if you ran
+
+
+anvi-script-filter-fasta-by-blast -f [contigs-fasta](/help/8/artifacts/contigs-fasta) \
+ -o path/to/[contigs-fasta](/help/8/artifacts/contigs-fasta) \
+ -b [blast-table](/help/8/artifacts/blast-table) \
+ -s qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qlen slen \
+ -t 30
+
+
+Then the output file would be a [contigs-fasta](/help/8/artifacts/contigs-fasta) that contains only the sequences in your input file that have a hit in your BLAST table with a `proper_pident` of more than 30 percent.
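+
+To make the role of the column list and the threshold more concrete, here is a rough Python sketch of this kind of filter. It is not the program's code: the function name is hypothetical, and the `proper_pident` formula used here (`pident` scaled by the fraction of the query covered by the alignment) is an assumption made for illustration, so check the program's source before relying on it.
+
+```python
+import csv
+import io
+
+# Column order as passed to the program with `-s` in the example above.
+COLUMNS = ("qseqid sseqid pident length mismatch gapopen qstart qend "
+           "sstart send evalue bitscore qlen slen").split()
+
+
+def sequences_passing(blast_tsv, columns, threshold):
+    """Return the query IDs that have at least one hit above the threshold."""
+    keep = set()
+    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
+        hit = dict(zip(columns, row))
+        # ASSUMPTION: proper_pident = pident * aligned fraction of the query.
+        aligned_fraction = (int(hit["qend"]) - int(hit["qstart"])) / int(hit["qlen"])
+        proper_pident = float(hit["pident"]) * aligned_fraction
+        if proper_pident > threshold:
+            keep.add(hit["qseqid"])
+    return keep
+
+
+# Two made-up BLAST hits: seq1 aligns over half of the query, seq2 over a quarter.
+rows = [
+    ["seq1", "ref1", "90.0", "100", "10", "0", "1", "101", "1", "101", "1e-30", "200", "200", "300"],
+    ["seq2", "ref2", "80.0", "50", "10", "0", "1", "51", "1", "51", "1e-10", "90", "200", "300"],
+]
+blast_tsv = "\n".join("\t".join(r) for r in rows)
+kept = sequences_passing(blast_tsv, COLUMNS, threshold=30)
+print(sorted(kept))
+```
+
+Under the assumed formula, `seq2` is dropped even though its `pident` is 80, because only a quarter of the query is covered by the alignment.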
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-filter-fasta-by-blast.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-filter-fasta-by-blast) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-filter-fasta-by-blast/network.json b/help/8/programs/anvi-script-filter-fasta-by-blast/network.json
new file mode 100644
index 00000000..9d890287
--- /dev/null
+++ b/help/8/programs/anvi-script-filter-fasta-by-blast/network.json
@@ -0,0 +1,47 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 2,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-fasta",
+ "name": "contigs-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "blast-table",
+ "name": "blast-table",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-filter-fasta-by-blast",
+ "name": "anvi-script-filter-fasta-by-blast",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-filter-hmm-hits-table/index.md b/help/8/programs/anvi-script-filter-hmm-hits-table/index.md
new file mode 100644
index 00000000..d037ee0a
--- /dev/null
+++ b/help/8/programs/anvi-script-filter-hmm-hits-table/index.md
@@ -0,0 +1,105 @@
+---
+layout: program
+title: anvi-script-filter-hmm-hits-table
+excerpt: An anvi'o program. Filter weak HMM hits from a given contigs database using a domain hits table reported by `anvi-run-hmms`.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-filter-hmm-hits-table
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Filter weak HMM hits from a given contigs database using a domain hits table reported by `anvi-run-hmms`.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [hmm-source](../../artifacts/hmm-source) [hmm-hits](../../artifacts/hmm-hits)
+
+
+## Can provide
+
+
+[hmm-hits](../../artifacts/hmm-hits)
+
+
+## Usage
+
+
+This program allows you to remove low quality HMM alignments from an [hmm-source](/help/8/artifacts/hmm-source) in a [contigs-db](/help/8/artifacts/contigs-db) with HMM alignment parameters such as model-coverage (query-coverage) and gene-coverage (target-coverage), or by removing hits to partial genes (i.e., keeping only hits to genes that are not partial and that start with a start codon and end with a stop codon). Briefly, the program will remove all records of an [hmm-source](/help/8/artifacts/hmm-source) from the [hmm-hits](/help/8/artifacts/hmm-hits), then import a new [hmm-hits](/help/8/artifacts/hmm-hits) table into the [contigs-db](/help/8/artifacts/contigs-db) that was filtered to your specifications.
+
+## Filter with HMM alignment parameters
+
+Similar to query coverage in BLAST, we can also use HMM alignment coverage to help determine if an hmm-hit is homologous. A small coverage value means only a small proportion of the query/target is aligning. Before anvi'o can filter out [hmm-hits](/help/8/artifacts/hmm-hits) with alignment coverage, you must run [anvi-run-hmms](/help/8/programs/anvi-run-hmms) and report a domain hits table by including the `--domain-hits-table` flag in your command:
+
+
+anvi-run-hmms -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -I Bacteria_71 \
+ --hmmer-output-dir path/to/dir \
+ --domain-hits-table
+
+
+After the command above, your HMM hits will be stored in your [contigs-db](/help/8/artifacts/contigs-db) as usual. However, with the domain hits table, you can filter out hits from your [contigs-db](/help/8/artifacts/contigs-db) using thresholds for model or gene coverage of each hit, i.e., you can filter out [hmm-hits](/help/8/artifacts/hmm-hits) where the profile HMM and gene do not align well to each other.
+
+For example, following the command above, the command below will remove [hmm-hits](/help/8/artifacts/hmm-hits) from your [contigs-db](/help/8/artifacts/contigs-db) in which less than 90% of the profile HMM aligned to the target gene:
+
+
+anvi-script-filter-hmm-hits-table -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --hmm-source Bacteria_71 \
+ --domain-hits-table path/to/dir/hmm.domtable \
+ --model-coverage 0.9
+
+
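To make the two coverage values concrete, here is a minimal Python sketch of how they could be derived from one record of a HMMER domain hits table. It assumes the profile HMM is the query and the gene is the target (as the model-coverage/query-coverage pairing above suggests), uses the `qlen`, `tlen`, `hmm from`/`hmm to`, and `ali from`/`ali to` fields of the table, and treats coordinates as 1-based and inclusive. The function names are hypothetical, and the program's own arithmetic may differ in detail.

```python
def coverage(start, stop, length):
    """Fraction of a sequence covered by an alignment (1-based, inclusive)."""
    return (stop - start + 1) / length


def keep_hit(hit, min_model_coverage=0.0, min_gene_coverage=0.0):
    """Decide whether a domain hit passes both coverage thresholds."""
    # Model coverage: how much of the profile HMM (query) takes part in the alignment.
    model_cov = coverage(hit["hmm_from"], hit["hmm_to"], hit["qlen"])
    # Gene coverage: how much of the gene (target) takes part in the alignment.
    gene_cov = coverage(hit["ali_from"], hit["ali_to"], hit["tlen"])
    return model_cov >= min_model_coverage and gene_cov >= min_gene_coverage


# Made-up records: the first aligns to 95% of the model, the second to ~36%.
good = {"qlen": 200, "hmm_from": 1, "hmm_to": 190, "tlen": 210, "ali_from": 5, "ali_to": 200}
weak = {"qlen": 200, "hmm_from": 50, "hmm_to": 120, "tlen": 210, "ali_from": 5, "ali_to": 80}

print(keep_hit(good, min_model_coverage=0.9))
print(keep_hit(weak, min_model_coverage=0.9))
```

With a model coverage threshold of `0.9`, only the first record would be kept.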
+### HMMs with multiple hits to one gene
+
+Some HMM profiles align multiple times to the same gene at different coordinates. By default, `anvi-script-filter-hmm-hits-table` will use only one of those domain hits table records, which could represent very little alignment coverage. To combine the domain hits table records into one hit and thus increase alignment coverage, use the parameter `--merge-partial-hits-within-X-nts`. Briefly, if you set `--merge-partial-hits-within-X-nts` to 300, `anvi-script-filter-hmm-hits-table` will merge all hits to the same gene in the domain hits table that have coordinates within 300 nucleotides of each other.
+
+
+anvi-script-filter-hmm-hits-table -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --hmm-source Bacteria_71 \
+ --domain-hits-table path/to/dir/hmm.domtable \
+ --model-coverage 0.9 \
+ --merge-partial-hits-within-X-nts 300
+
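The merging idea can be sketched in a few lines of Python. This is only an illustration under one assumption: that "within X nucleotides" means the gap between the end of one hit and the start of the next hit on the same gene is at most X. The function name and the coordinates are made up for the example.

```python
def merge_partial_hits(hits, max_gap):
    """Merge (start, stop) hits on one gene when the gap between them is <= max_gap."""
    merged = []
    for start, stop in sorted(hits):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous hit: extend it.
            merged[-1][1] = max(merged[-1][1], stop)
        else:
            # Too far away (or the first hit): start a new merged hit.
            merged.append([start, stop])
    return [tuple(m) for m in merged]


# Three hits of the same model to the same gene, in nucleotide coordinates.
hits_on_gene = [(350, 600), (10, 200), (1500, 1700)]
print(merge_partial_hits(hits_on_gene, max_gap=300))
```

Here the first two hits are 150 nucleotides apart and become a single longer hit, while the third one stays separate because it is 900 nucleotides away.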
+
+{:.notice}
+The input domtblout file for [anvi-script-filter-hmm-hits-table](/help/8/programs/anvi-script-filter-hmm-hits-table) will be saved as `hmm.domtable.orig` and the output, filtered version will be saved as `hmm.domtable`. If you decide to change the coverage filtering threshold or `--merge-partial-hits-within-X-nts`, be sure to change the path for `--domain-hits-table` to `hmm.domtable.orig`.
+
+## Filter out hmm-hits from partial genes
+
+HMMs are able to detect partial genes (i.e., genes that, unlike complete genes, do not both start with a start codon and end with a stop codon) with good alignment coverage and homology statistics. However, partial genes can lead to spurious phylogenetic branches and/or inflate the number of observed populations or functions in a given set of genomes/metagenomes. Using `--filter-out-partial-gene-calls`, you can remove partial gene hmm-hits.
+
+
+anvi-script-filter-hmm-hits-table -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --hmm-source Bacteria_71 \
+ --domain-hits-table path/to/dir/hmm.domtable \
+ --filter-out-partial-gene-calls
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-filter-hmm-hits-table.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-filter-hmm-hits-table) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-filter-hmm-hits-table/network.json b/help/8/programs/anvi-script-filter-hmm-hits-table/network.json
new file mode 100644
index 00000000..3c8f59b9
--- /dev/null
+++ b/help/8/programs/anvi-script-filter-hmm-hits-table/network.json
@@ -0,0 +1,60 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 2,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "hmm-hits",
+ "name": "hmm-hits",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "hmm-source",
+ "name": "hmm-source",
+ "provided_by_anvio": false,
+ "type": "HMM"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-filter-hmm-hits-table",
+ "name": "anvi-script-filter-hmm-hits-table",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-fix-homopolymer-indels/index.md b/help/8/programs/anvi-script-fix-homopolymer-indels/index.md
new file mode 100644
index 00000000..5a52e433
--- /dev/null
+++ b/help/8/programs/anvi-script-fix-homopolymer-indels/index.md
@@ -0,0 +1,456 @@
+---
+layout: program
+title: anvi-script-fix-homopolymer-indels
+excerpt: An anvi'o program. Corrects homopolymer-region associated INDELs in a given genome based on a reference genome.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-fix-homopolymer-indels
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Corrects homopolymer-region associated INDELs in a given genome based on a reference genome. The most effective use of this script is when the input genome is a genome reconstructed by minION long reads, and the reference genome is one that is of high-quality. Essentially, this script will BLAST the genome you wish to correct against the reference genome you provide, identify INDELs in the BLAST results that are exclusively associated with homopolymer regions, and will take the reference genome as a guide to correct the input sequences, and report a new FASTA file. You can use the output FASTA file that is fixed as the input FASTA file over and over again to see if you can eliminate all homopolymer-associated INDELs.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[fasta](../../artifacts/fasta)
+
+
+## Can provide
+
+
+[fasta](../../artifacts/fasta)
+
+
+## Usage
+
+
+This program takes an input [fasta](/help/8/artifacts/fasta) file with one or more sequences, then **corrects INDELs associated with homopolymer regions given a reference [fasta](/help/8/artifacts/fasta) file**, and reports edited sequences as a new [fasta](/help/8/artifacts/fasta) file.
+
+{:.warning}
+You must be extremely careful with this program since it reports edited sequences.
+
+## Better alternatives
+
+If you need a comprehensive solution to correct your long-read sequencing data for serious applications, you should not use this script, but resort to better alternatives designed to correct frameshift errors.
+
+For instance, [Arumugam et al's solution](https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-019-0665-y) leverages NCBI's nr protein database to correct frameshift errors through modified-DIAMOND alignments. Another solution, [homopolish](https://github.com/ythuang0522/homopolish) by Yao-Ting Huang et al, uses mash sketches to correct minION sequences. You may also want to check [proovframe](https://github.com/thackl/proovframe) by Thomas Hackl, which aims to correct frame-shift errors in long-read sequencing data.
+
+## Motivation
+
+We developed this tool largely to test the impact of the homopolymer-associated INDEL errors that Oxford Nanopore Technology sequencing yields. When there is a high-quality reference genome, this program can align a set of input sequences to the reference, and when it sees something like this in the alignment:
+
+```
+Input sequence ......: ... CGAAAAACG ...
+Reference sequence ..: ... CGAAA--CG ...
+```
+
+It can correct the input sequence to look like this, since this would indicate that the additional `A` nucleotides after `AAA` were likely due to errors from the long-read sequencing:
+
+```
+Input sequence ......: ... CGAAACG ...
+```
+
+Similarly, if the program sees a case like this:
+
+```
+Input sequence ......: ... CGAAA--CG ...
+Reference sequence ..: ... CGAAAAACG ...
+```
+
+The program would correct the input sequence this way, assuming that the lacking `A` nucleotides were likely due to errors from long-read sequencing:
+
+
+```
+Input sequence ......: ... CGAAAAACG ...
+```
+
+{:.warning}
+Please note that INDEL errors associated with homopolymers are only a subset of errors that will cause frameshifts and impact amino acid sequences.
+
+## Homopolymer length
+
+The parameter `--min-homopolymer-length` helps the program to determine what to call a homopolymer region. Please read the help menu for this parameter carefully since it is not exactly intuitive.
+
+The value `3` would be a stringent setting. But you may want to lower it to `2` if you promise to evaluate your output carefully.
+
+The script includes some pre-aligned test sequences for you to see how `--min-homopolymer-length` influences things. For instance, here is the output for `--min-homopolymer-length 2`:
+
+```
+anvi-script-fix-homopolymer-indels --test-run \
+ --min-homopolymer-length 2
+
+* >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Query sequence ...............................: ATCGATCGATCGAAAATCGATCGATCG
+Reference sequence ...........................: ATCGATCGATCG-AAATCGATCGATCG
+
+GAPS BEFORE REPEATS OF "A"
+===============================================
+Query ........................................: AAA
+Reference ....................................: -AA
+Resolution ...................................: {'action': 'DEL', 'positions': [12], 'nt': 'A'}
+
+Edited query sequence ........................: ATCGATCGATCGAAATCGATCGATCG
+Query sequence edited correctly? .............: Yes
+
+* >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Query sequence ...............................: ATCGATCGATCGAAAATCGATCGATCG
+Reference sequence ...........................: ATCGATCGATCGAA-TCGATCGATCG
+
+GAPS AFTER REPEATS OF "A"
+===============================================
+Query ........................................: AAA
+Reference ....................................: AA-
+Resolution ...................................: {'action': 'DEL', 'positions': [14], 'nt': 'A'}
+
+Edited query sequence ........................: ATCGATCGATCGAAATCGATCGATCG
+Query sequence edited correctly? .............: Yes
+
+* >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Query sequence ...............................: ATCGATCGATCGAAA-TCGATCGATCG
+Reference sequence ...........................: ATCGATCGATCGAAAATCGATCGATCG
+
+GAPS AFTER REPEATS OF "A"
+===============================================
+Query ........................................: AA-
+Reference ....................................: AAA
+Resolution ...................................: {'action': 'INS', 'positions': [15], 'nt': 'A'}
+
+Edited query sequence ........................: ATCGATCGATCGAAAATCGATCGATCG
+Query sequence edited correctly? .............: Yes
+
+* >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Query sequence ...............................: A-C-AT-GATCG-AAATCGATCGATCG
+Reference sequence ...........................: ATCGATCGATCGAAAATCGATCGATCG
+
+GAPS BEFORE REPEATS OF "A"
+===============================================
+Query ........................................: -AA
+Reference ....................................: AAA
+Resolution ...................................: {'action': 'INS', 'positions': [9], 'nt': 'A'}
+
+Edited query sequence ........................: ACATGATCGAAAATCGATCGATCG
+Query sequence edited correctly? .............: Yes
+
+* >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Query sequence ...............................: ATCGATCGATCGAAATCGATCGATCG
+Reference sequence ...........................: ATCGATCGATCGAA-TCGATCGATCG
+
+GAPS AFTER REPEATS OF "A"
+===============================================
+Query ........................................: AAA
+Reference ....................................: AA-
+Resolution ...................................: {'action': 'DEL', 'positions': [14], 'nt': 'A'}
+
+Edited query sequence ........................: ATCGATCGATCGAATCGATCGATCG
+Query sequence edited correctly? .............: Yes
+```
+
+The output for the same input sequences with `--min-homopolymer-length 3`:
+
+
+```
+anvi-script-fix-homopolymer-indels --test-run \
+ --min-homopolymer-length 3
+
+
+* >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Query sequence ...............................: ATCGATCGATCGAAAATCGATCGATCG
+Reference sequence ...........................: ATCGATCGATCG-AAATCGATCGATCG
+
+GAPS BEFORE REPEATS OF "A"
+===============================================
+Query ........................................: AAAA
+Reference ....................................: -AAA
+Resolution ...................................: {'action': 'DEL', 'positions': [12], 'nt': 'A'}
+
+Edited query sequence ........................: ATCGATCGATCGAAATCGATCGATCG
+Query sequence edited correctly? .............: Yes
+
+* >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Query sequence ...............................: ATCGATCGATCGAAAATCGATCGATCG
+Reference sequence ...........................: ATCGATCGATCGAA-TCGATCGATCG
+Edited query sequence ........................: ATCGATCGATCGAAAATCGATCGATCG
+Query sequence edited correctly? .............: No
+
+* >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Query sequence ...............................: ATCGATCGATCGAAA-TCGATCGATCG
+Reference sequence ...........................: ATCGATCGATCGAAAATCGATCGATCG
+
+GAPS AFTER REPEATS OF "A"
+===============================================
+Query ........................................: AAA-
+Reference ....................................: AAAA
+Resolution ...................................: {'action': 'INS', 'positions': [15], 'nt': 'A'}
+
+Edited query sequence ........................: ATCGATCGATCGAAAATCGATCGATCG
+Query sequence edited correctly? .............: Yes
+
+* >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Query sequence ...............................: A-C-AT-GATCG-AAATCGATCGATCG
+Reference sequence ...........................: ATCGATCGATCGAAAATCGATCGATCG
+
+GAPS BEFORE REPEATS OF "A"
+===============================================
+Query ........................................: -AAA
+Reference ....................................: AAAA
+Resolution ...................................: {'action': 'INS', 'positions': [9], 'nt': 'A'}
+
+Edited query sequence ........................: ACATGATCGAAAATCGATCGATCG
+Query sequence edited correctly? .............: Yes
+
+* >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Query sequence ...............................: ATCGATCGATCGAAATCGATCGATCG
+Reference sequence ...........................: ATCGATCGATCGAA-TCGATCGATCG
+Edited query sequence ........................: ATCGATCGATCGAAATCGATCGATCG
+Query sequence edited correctly? .............: No
+```
+
+The output for the same input sequences with `--min-homopolymer-length 4`:
+
+```
+anvi-script-fix-homopolymer-indels --test-run \
+ --min-homopolymer-length 4
+
+* >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Query sequence ...............................: ATCGATCGATCGAAAATCGATCGATCG
+Reference sequence ...........................: ATCGATCGATCG-AAATCGATCGATCG
+Edited query sequence ........................: ATCGATCGATCGAAAATCGATCGATCG
+Query sequence edited correctly? .............: No
+
+* >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Query sequence ...............................: ATCGATCGATCGAAAATCGATCGATCG
+Reference sequence ...........................: ATCGATCGATCGAA-TCGATCGATCG
+Edited query sequence ........................: ATCGATCGATCGAAAATCGATCGATCG
+Query sequence edited correctly? .............: No
+
+* >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Query sequence ...............................: ATCGATCGATCGAAA-TCGATCGATCG
+Reference sequence ...........................: ATCGATCGATCGAAAATCGATCGATCG
+Edited query sequence ........................: ATCGATCGATCGAAATCGATCGATCG
+Query sequence edited correctly? .............: No
+
+* >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Query sequence ...............................: A-C-AT-GATCG-AAATCGATCGATCG
+Reference sequence ...........................: ATCGATCGATCGAAAATCGATCGATCG
+Edited query sequence ........................: ACATGATCGAAATCGATCGATCG
+Query sequence edited correctly? .............: No
+
+* >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Query sequence ...............................: ATCGATCGATCGAAATCGATCGATCG
+Reference sequence ...........................: ATCGATCGATCGAA-TCGATCGATCG
+Edited query sequence ........................: ATCGATCGATCGAAATCGATCGATCG
+Query sequence edited correctly? .............: No
+```
+
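The effect of `--min-homopolymer-length` in the test runs above can be approximated with a small Python sketch. It is not the script's actual code: it assumes that a gap counts as homopolymer-associated when, in the gapped sequence, the nucleotide opposite the gap is repeated at least `min_len` times immediately before or after the gap. All names here are hypothetical.

```python
def run_length(seq, start, step, nt):
    """Count consecutive `nt` characters in `seq` from `start`, moving by `step`."""
    n = 0
    i = start
    while 0 <= i < len(seq) and seq[i] == nt:
        n += 1
        i += step
    return n


def fix_homopolymer_indels(query, reference, min_len=3):
    """Return the edited query for one gapped alignment of equal-length strings."""
    if len(query) != len(reference):
        raise ValueError("aligned sequences must have the same length")
    edited = []
    for i, (q, r) in enumerate(zip(query, reference)):
        if q != "-" and r != "-":
            edited.append(q)  # aligned column, nothing to resolve
            continue
        if q == "-" and r == "-":
            continue
        # The sequence carrying the gap, and the nucleotide facing the gap.
        gapped, nt = (reference, q) if r == "-" else (query, r)
        flanked = max(run_length(gapped, i - 1, -1, nt),
                      run_length(gapped, i + 1, 1, nt)) >= min_len
        if r == "-":
            if not flanked:
                edited.append(q)  # keep the extra nt: not homopolymer-associated
        elif flanked:
            edited.append(r)      # insert the missing nt from the reference
    return "".join(edited)


query = "ATCGATCGATCGAAAATCGATCGATCG"
reference = "ATCGATCGATCG-AAATCGATCGATCG"
for min_len in (3, 4):
    print(min_len, fix_homopolymer_indels(query, reference, min_len))
```

For this pair, the extra `A` is removed with a minimum length of 3 and left untouched with a minimum length of 4, which mirrors the first test case shown in the outputs above.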
+## Tips and Warnings
+
+After you correct your input sequences for one round, the next BLAST search may produce new homopolymers. So you may want to re-run the tool by using the output sequence as the input sequence. For instance, we had a genome reconstructed using long-read sequencing that matched to a gold-standard genome on NCBI.
+
+Running the script the first time this way,
+
+``` bash
+anvi-script-fix-homopolymer-indels --input Genome_minION.fasta \
+ --reference Genome_NCBI_REF.fasta \
+ --output Genome_minION_CORRECTED.fasta
+```
+
+Produced the following output:
+
+```
+OVERALL & PER-SEQUENCE STATS
+===============================================
+Num input sequences ..........................: 1
+Num homopolymers associated with INDELs ......: 529
+Num actions ..................................: 597
+Num insertions ...............................: 292
+Num deletions ................................: 305
+
+* contig_332_pilon
+ - Homopolymers: 529
+ - Insertions: 292
+ - Deletions: 305
+
+Corrected output FASTA .......................: Genome_minION_CORRECTED.fasta
+```
+
+Then copying the output file as the input file,
+
+```
+cp Genome_minION_CORRECTED.fasta Genome_minION.fasta
+```
+
+And re-running the script the same way multiple times gave the following outputs:
+
+``` bash
+
+# (first round)
+
+* contig_332_pilon
+ - Homopolymers associated with INDELs: 29
+ - Insertions: 21
+ - Deletions: 10
+
+# (second round)
+
+* contig_332_pilon
+ - Homopolymers associated with INDELs: 17
+ - Insertions: 9
+ - Deletions: 13
+
+# (third round)
+
+* contig_332_pilon
+ - Homopolymers associated with INDELs: 4
+ - Insertions: 4
+ - Deletions: 0
+
+# (fourth round)
+
+* contig_332_pilon
+ - Homopolymers associated with INDELs: 0
+ - Insertions: 0
+ - Deletions: 0
+```
+
+At the end, there were no more homopolymers associated with INDELs.
+
+**Please consider the following points**:
+
+* If the input and reference genomes are not closely related enough (closely related enough meaning an expected ANI above 98%-99% if there were no sequencing errors), this process may yield very incorrect outcomes. But it should work great for genomes reconstructed from the same culture.
+
+* The iterative improvement of a given input genome may reach a 'back-and-forth' situation where there is no overall improvement, but the homopolymers associated with INDELs do not reach 0. This happens when there are repeats in the reference genome that are identical to each other except for the number of nucleotides in homopolymers.
+
+* You can always add `--verbose` to your command to see every single case that is considered, and the resolution anvi'o reached.
+
+* The script cleans up after itself. But if you add the flag `--debug` to your call, you will find the raw blast output in XML form, which is the primary file this script uses to identify and correct INDELs associated with homopolymers.
+
+* Under all circumstances, it is important to double check your results, and make sure you keep in mind that anything you see outstanding in your downstream analyses may be due to this step.
+
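The iterative procedure and the 'back-and-forth' situation described above can be sketched as a small Python loop. This is a hedged illustration: `correct_once` is a hypothetical callable standing in for one run of the program (it takes a sequence and returns the corrected one), and the toy correctors below are invented purely to exercise the loop.

```python
def correct_until_stable(sequence, correct_once, max_rounds=10):
    """Apply `correct_once` repeatedly; stop when nothing changes or a cycle appears.

    Returns the final sequence, the number of rounds that changed it,
    and a short status string.
    """
    seen = {sequence}
    for round_no in range(1, max_rounds + 1):
        corrected = correct_once(sequence)
        if corrected == sequence:
            return sequence, round_no - 1, "converged"
        if corrected in seen:
            # We are back at a sequence from an earlier round.
            return corrected, round_no, "back-and-forth"
        seen.add(corrected)
        sequence = corrected
    return sequence, max_rounds, "round limit reached"


# Toy corrector: shortens the first run of four A's by one per round.
def shorten(seq):
    return seq.replace("AAAA", "AAA", 1)


# Toy corrector that flips between two states and never settles.
def flip(seq):
    return "CGAAACG" if seq == "CGAAAACG" else "CGAAAACG"


print(correct_until_stable("CGAAAAACG", shorten))
print(correct_until_stable("CGAAACG", flip))
```

Keeping track of previously seen sequences is one simple way to notice that further rounds will not help, instead of re-running the program indefinitely.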
+## A real-world example
+
+This example involves two circular bacterial genomes, `W01` and `W48`, both of which were reconstructed using minION long-reads that were assembled by [Flye](https://github.com/fenderglass/Flye) and polished by [Pilon](https://github.com/broadinstitute/pilon/wiki).
+
+Although `W01` and `W48` were supposed to be near-identical genomes based on our understanding of the system, the pangenome contained a lot of gene clusters that were either found only in `W01` or only in `W48`, which was quite unexpected. We thought that the spurious gene clusters were in-part due to frame-shifts caused by INDELs associated with random and erroneous homopolymers that influenced both genomes.
+
+The following GIF shows three pangenomes for (1) the uncorrected genomes, (2) `W01` corrected by `W48` using `--min-homopolymer-length 3` and (3) `W01` corrected by `W48` using `--min-homopolymer-length 2`. Please note that the sequence for `W48` is unchanged throughout these steps, and only `W01` is modified:
+
+![an anvi'o display](../../images/anvi-script-fix-homopolymer-indels-test.gif){:.center-img}
+
+As this preliminary analysis shows, not only is there a reasonable reduction in spurious gene clusters, but the homogeneity indices for core gene clusters also display remarkable improvement. `--min-homopolymer-length 2` seems to be doing slightly better than `--min-homopolymer-length 3`. Overall, the script seems to be doing its job.
+
+{:.warning}
+Since in this example the 'reference genome', `W48`, is also a genome with substantial homopolymer errors, the corrected sequences in `W01` do not necessarily yield amino acid sequences that are globally correct. But they are as incorrect as the ones in `W48` and not more. When a very closely related reference genome is used, the corrections will not only remove spurious gene clusters from pangenomes, but also yield more accurate amino acid sequences.
+
+For posterity, the following shell script shows how each pangenome shown in the GIF above is generated and displayed:
+
+
+``` bash
+###########################################################################################
+# NO CORRECTION
+###########################################################################################
+anvi-gen-contigs-database -f W01.fa -o W01.db
+anvi-gen-contigs-database -f W48.fa -o W48.db
+
+anvi-gen-genomes-storage -e external-genomes-01.txt \
+ -o UNCORRECTED-GENOMES.db
+
+anvi-pan-genome -g UNCORRECTED-GENOMES.db \
+ -n UNCORRECTED \
+ --num-threads 4
+
+anvi-display-pan -g UNCORRECTED-GENOMES.db \
+ -p UNCORRECTED/UNCORRECTED-PAN.db \
+ --title "UNCORRECTED"
+
+###########################################################################################
+# W01 CORRECTED BY W48 --min-homopolymer-length 3
+###########################################################################################
+
+anvi-script-fix-homopolymer-indels -i W01.fa \
+ -r W48.fa \
+ --min-homopolymer-length 3 \
+ -o W01_CBW48_MHL3.fa
+
+anvi-gen-contigs-database -f W01_CBW48_MHL3.fa \
+ -o W01_CBW48_MHL3.db
+
+anvi-gen-genomes-storage -e external-genomes-02.txt \
+ -o CORRECTED-BY-W48-MHL3-GENOMES.db
+
+anvi-pan-genome -g CORRECTED-BY-W48-MHL3-GENOMES.db \
+ -n CORRECTED-BY-W48-MHL3 \
+ --num-threads 4
+
+anvi-display-pan -g CORRECTED-BY-W48-MHL3-GENOMES.db \
+ -p CORRECTED-BY-W48-MHL3/CORRECTED-BY-W48-MHL3-PAN.db \
+ --title "W01 CORRECTED BY W48 w/MHL3"
+
+###########################################################################################
+# W01 CORRECTED BY W48 --min-homopolymer-length 2
+###########################################################################################
+
+anvi-script-fix-homopolymer-indels -i W01.fa \
+ -r W48.fa \
+ --min-homopolymer-length 2 \
+ -o W01_CBW48_MHL2.fa
+
+anvi-gen-contigs-database -f W01_CBW48_MHL2.fa \
+ -o W01_CBW48_MHL2.db
+
+anvi-gen-genomes-storage -e external-genomes-03.txt \
+ -o CORRECTED-BY-W48-MHL2-GENOMES.db
+
+anvi-pan-genome -g CORRECTED-BY-W48-MHL2-GENOMES.db \
+ -n CORRECTED-BY-W48-MHL2 \
+ --num-threads 4
+
+anvi-display-pan -g CORRECTED-BY-W48-MHL2-GENOMES.db \
+ -p CORRECTED-BY-W48-MHL2/CORRECTED-BY-W48-MHL2-PAN.db \
+ --title "W01 CORRECTED BY W48 w/MHL2"
+```
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-fix-homopolymer-indels.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-fix-homopolymer-indels) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-fix-homopolymer-indels/network.json b/help/8/programs/anvi-script-fix-homopolymer-indels/network.json
new file mode 100644
index 00000000..e588ed3e
--- /dev/null
+++ b/help/8/programs/anvi-script-fix-homopolymer-indels/network.json
@@ -0,0 +1,34 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 2,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "fasta",
+ "name": "fasta",
+ "provided_by_anvio": false,
+ "type": "FASTA"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-fix-homopolymer-indels",
+ "name": "anvi-script-fix-homopolymer-indels",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 1,
+ "target": 0
+ },
+ {
+ "target": 1,
+ "source": 0
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-gen-distribution-of-genes-in-a-bin/index.md b/help/8/programs/anvi-script-gen-distribution-of-genes-in-a-bin/index.md
new file mode 100644
index 00000000..aea24fac
--- /dev/null
+++ b/help/8/programs/anvi-script-gen-distribution-of-genes-in-a-bin/index.md
@@ -0,0 +1,95 @@
+---
+layout: program
+title: anvi-script-gen-distribution-of-genes-in-a-bin
+excerpt: An anvi'o program. Quantify the detection of genes in genomes in metagenomes to identify the environmental core.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-gen-distribution-of-genes-in-a-bin
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Quantify the detection of genes in genomes in metagenomes to identify the environmental core. This is a helper script for the anvi'o metapangenomic workflow.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [profile-db](../../artifacts/profile-db) [collection](../../artifacts/collection) [bin](../../artifacts/bin)
+
+
+## Can provide
+
+
+[view-data](../../artifacts/view-data) [misc-data-items-txt](../../artifacts/misc-data-items-txt)
+
+
+## Usage
+
+
+This program computes the detection of genes (inputted as a [bin](/help/8/artifacts/bin)) across your samples, so that you can visualize them in the [interactive](/help/8/artifacts/interactive) interface.
+
+This program is used in [the metapangenomic workflow](https://merenlab.org/data/prochlorococcus-metapangenome/#classification-of-genes-as-ecgs-and-eags-by-the-distribution-of-genes-in-a-genome-across-metagenomes) on genes with metagenomes as samples to visually identify the environmental core genes and accessory genes.
+
+### Inputs
+
+Essentially, you provide a [contigs-db](/help/8/artifacts/contigs-db) and [profile-db](/help/8/artifacts/profile-db) pair, as well as the [bin](/help/8/artifacts/bin) you want to look at, and this program will search each gene in your bin against the samples denoted in your [profile-db](/help/8/artifacts/profile-db):
+
+
+anvi-script-gen-distribution-of-genes-in-a-bin -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ -b [bin](/help/8/artifacts/bin)
+
+
+There are two other parameters that you can set to focus the genes that you're looking at:
+- The minimum detection required for a gene to be included (by default, a gene must have a detection value of `0.5` in at least one of your samples)
+- The minimum coverage required for a gene to be included (by default, a gene must have a total coverage of `0.25` times the mean total coverage in your data)
+
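The two default filters listed above can be sketched in Python as follows. The sketch rests on an interpretation that should be treated as an assumption: a gene's "total coverage" is the sum of its coverage across samples, and the "mean total coverage" is the average of that sum over all genes. The function name and the numbers are made up.

```python
def select_genes(detection, coverage, min_detection=0.5, min_cov_factor=0.25):
    """Pick genes passing both filters.

    detection / coverage: dicts mapping a gene to its per-sample values.
    """
    totals = {gene: sum(values) for gene, values in coverage.items()}
    mean_total = sum(totals.values()) / len(totals)
    return sorted(
        gene for gene in detection
        # Detected well enough in at least one sample ...
        if max(detection[gene]) >= min_detection
        # ... and covered enough relative to the average gene.
        and totals[gene] >= min_cov_factor * mean_total
    )


detection = {"g1": [0.9, 0.8], "g2": [0.1, 0.2], "g3": [0.6, 0.0]}
coverage = {"g1": [40, 60], "g2": [1, 1], "g3": [2, 1]}
print(select_genes(detection, coverage))
```

In this toy data, `g2` fails the detection filter and `g3` fails the coverage filter, so only `g1` would be carried forward.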
+### Outputs
+
+This program will produce two outputs:
+
+1. `[your bin name]-GENE-COVs.txt`, which is a [view-data](/help/8/artifacts/view-data) artifact. This is a matrix where each row represents a gene, each column represents one of your samples, and the cells each contain a coverage value.
+2. `[your bin name]-ENV-DETECTION.txt`, which is a [misc-data-layers](/help/8/artifacts/misc-data-layers). It is a two-column file, where each row is a gene and the second column describes whether or not that gene is systematically detected in your samples. Thus, this can be added as an additional layer in the interface that describes which genes are detected in your samples (as an example, see the outermost layer [here](https://merenlab.org/data/prochlorococcus-metapangenome/#classification-of-genes-as-ecgs-and-eags-by-the-distribution-of-genes-in-a-genome-across-metagenomes)).
+
+Thus, after running this program on a bin with name `BIN_NAME`, you can run
+
+
+[anvi-interactive](/help/8/programs/anvi-interactive) -d BIN_NAME-GENE-COVs.txt \
+ -A BIN_NAME-ENV-DETECTION.txt \
+ --manual \
+ -p [profile-db](/help/8/artifacts/profile-db)
+
+
+This will visually show you the coverage and detection of your genes across your samples in the [interactive](/help/8/artifacts/interactive) interface (similarly to [this figure](https://merenlab.org/data/prochlorococcus-metapangenome/#classification-of-genes-as-ecgs-and-eags-by-the-distribution-of-genes-in-a-genome-across-metagenomes)).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-gen-distribution-of-genes-in-a-bin.md) to update this information.
+
+
+## Additional Resources
+
+
+* [This program in action as part of the metapangenomic workflow](http://merenlab.org/data/prochlorococcus-metapangenome/#classification-of-genes-as-ecgs-and-eags-by-the-distribution-of-genes-in-a-genome-across-metagenomes)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-gen-distribution-of-genes-in-a-bin) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-gen-distribution-of-genes-in-a-bin/network.json b/help/8/programs/anvi-script-gen-distribution-of-genes-in-a-bin/network.json
new file mode 100644
index 00000000..c481beb3
--- /dev/null
+++ b/help/8/programs/anvi-script-gen-distribution-of-genes-in-a-bin/network.json
@@ -0,0 +1,95 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "view-data",
+ "name": "view-data",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-items-txt",
+ "name": "misc-data-items-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "bin",
+ "name": "bin",
+ "provided_by_anvio": true,
+ "type": "BIN"
+ },
+ {
+ "size": 6,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-gen-distribution-of-genes-in-a-bin",
+ "name": "anvi-script-gen-distribution-of-genes-in-a-bin",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 6,
+ "target": 0
+ },
+ {
+ "source": 6,
+ "target": 1
+ },
+ {
+ "target": 6,
+ "source": 2
+ },
+ {
+ "target": 6,
+ "source": 3
+ },
+ {
+ "target": 6,
+ "source": 4
+ },
+ {
+ "target": 6,
+ "source": 5
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-gen-function-matrix-across-genomes/index.md b/help/8/programs/anvi-script-gen-function-matrix-across-genomes/index.md
new file mode 100644
index 00000000..522657d4
--- /dev/null
+++ b/help/8/programs/anvi-script-gen-function-matrix-across-genomes/index.md
@@ -0,0 +1,94 @@
+---
+layout: program
+title: anvi-script-gen-function-matrix-across-genomes
+excerpt: An anvi'o program. A program to generate reports for the distribution of functions across genomes.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-gen-function-matrix-across-genomes
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A program to generate reports for the distribution of functions across genomes.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[functions](../../artifacts/functions) [genomes-storage-db](../../artifacts/genomes-storage-db) [internal-genomes](../../artifacts/internal-genomes) [external-genomes](../../artifacts/external-genomes) [groups-txt](../../artifacts/groups-txt)
+
+
+## Can provide
+
+
+[functional-enrichment-txt](../../artifacts/functional-enrichment-txt) [functions-across-genomes-txt](../../artifacts/functions-across-genomes-txt)
+
+
+## Usage
+
+
+This program generates TAB-delimited output files for [functions](/help/8/artifacts/functions) from a single function annotation source across genomes.
+
+{:.notice}
+For a similar program that reports HMM hits across genomes, see [anvi-script-gen-hmm-hits-matrix-across-genomes](/help/8/programs/anvi-script-gen-hmm-hits-matrix-across-genomes).
+
+The input genomes for this program can be provided through an [external-genomes](/help/8/artifacts/external-genomes), [internal-genomes](/help/8/artifacts/internal-genomes), [genomes-storage-db](/help/8/artifacts/genomes-storage-db), or any combination of these sources.
+
+This program is very similar to [anvi-display-functions](/help/8/programs/anvi-display-functions), and can also perform a functional enrichment analysis on-the-fly if you provide it with an optional [groups-txt](/help/8/artifacts/groups-txt) file. Unlike [anvi-display-functions](/help/8/programs/anvi-display-functions), this program will report TAB-delimited output files for you to further analyze.
+
+You can run the program on a set of genomes for a given annotation source:
+
+
+anvi-script-gen-function-matrix-across-genomes -e [external-genomes](/help/8/artifacts/external-genomes) \
+ --annotation-source COG20_FUNCTION \
+ --output-file-prefix MY-GENOMES
+
+
+The command above will result in two files in your work directory, both of which will be of type [functions-across-genomes-txt](/help/8/artifacts/functions-across-genomes-txt):
+
+* MY-GENOMES-FREQUENCY.txt
+* MY-GENOMES-PRESENCE-ABSENCE.txt
+
+{:.notice}
+You can always learn about which functions are in a given [contigs-db](/help/8/artifacts/contigs-db) using the program [anvi-db-info](/help/8/programs/anvi-db-info).
+
+Alternatively you can run it with a [groups-txt](/help/8/artifacts/groups-txt) that associates sets of genomes with distinct groups,
+
+
+anvi-script-gen-function-matrix-across-genomes -i [internal-genomes](/help/8/artifacts/internal-genomes) \
+ --annotation-source COG20_FUNCTION \
+ --output-file-prefix MY-GENOMES \
+ --groups-txt groups.txt
+
+
+which would generate an additional file in your work directory of type [functional-enrichment-txt](/help/8/artifacts/functional-enrichment-txt):
+
+* MY-GENOMES-FUNCTIONAL-ENRICHMENT.txt
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-gen-function-matrix-across-genomes.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-gen-function-matrix-across-genomes) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-gen-function-matrix-across-genomes/network.json b/help/8/programs/anvi-script-gen-function-matrix-across-genomes/network.json
new file mode 100644
index 00000000..5a6953fb
--- /dev/null
+++ b/help/8/programs/anvi-script-gen-function-matrix-across-genomes/network.json
@@ -0,0 +1,108 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functional-enrichment-txt",
+ "name": "functional-enrichment-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions-across-genomes-txt",
+ "name": "functions-across-genomes-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions",
+ "name": "functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "internal-genomes",
+ "name": "internal-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-genomes",
+ "name": "external-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "groups-txt",
+ "name": "groups-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 7,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-gen-function-matrix-across-genomes",
+ "name": "anvi-script-gen-function-matrix-across-genomes",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 7,
+ "target": 0
+ },
+ {
+ "source": 7,
+ "target": 1
+ },
+ {
+ "target": 7,
+ "source": 2
+ },
+ {
+ "target": 7,
+ "source": 3
+ },
+ {
+ "target": 7,
+ "source": 4
+ },
+ {
+ "target": 7,
+ "source": 5
+ },
+ {
+ "target": 7,
+ "source": 6
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-gen-functions-per-group-stats-output/index.md b/help/8/programs/anvi-script-gen-functions-per-group-stats-output/index.md
new file mode 100644
index 00000000..e86ee779
--- /dev/null
+++ b/help/8/programs/anvi-script-gen-functions-per-group-stats-output/index.md
@@ -0,0 +1,57 @@
+---
+layout: program
+title: anvi-script-gen-functions-per-group-stats-output
+excerpt: An anvi'o program. Generate a TAB delimited file for the distribution of functions across groups of genomes/metagenomes.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-gen-functions-per-group-stats-output
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Generate a TAB delimited file for the distribution of functions across groups of genomes/metagenomes.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[functions](../../artifacts/functions) [genomes-storage-db](../../artifacts/genomes-storage-db) [internal-genomes](../../artifacts/internal-genomes) [external-genomes](../../artifacts/external-genomes)
+
+
+## Can provide
+
+
+[interactive](../../artifacts/interactive)
+
+
+## Usage
+
+
+{:.notice}
+**No one has described the usage of this program** :/ If you would like to contribute, please see previous examples [here](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs), and feel free to add a Markdown formatted file in that directory named "anvi-script-gen-functions-per-group-stats-output.md". For a template, you can use the markdown file for `anvi-gen-contigs-database`. THANK YOU!
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-gen-functions-per-group-stats-output) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-gen-functions-per-group-stats-output/network.json b/help/8/programs/anvi-script-gen-functions-per-group-stats-output/network.json
new file mode 100644
index 00000000..ed9a3d9c
--- /dev/null
+++ b/help/8/programs/anvi-script-gen-functions-per-group-stats-output/network.json
@@ -0,0 +1,82 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "interactive",
+ "name": "interactive",
+ "provided_by_anvio": true,
+ "type": "DISPLAY"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions",
+ "name": "functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "internal-genomes",
+ "name": "internal-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-genomes",
+ "name": "external-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-gen-functions-per-group-stats-output",
+ "name": "anvi-script-gen-functions-per-group-stats-output",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 5,
+ "target": 0
+ },
+ {
+ "target": 5,
+ "source": 1
+ },
+ {
+ "target": 5,
+ "source": 2
+ },
+ {
+ "target": 5,
+ "source": 3
+ },
+ {
+ "target": 5,
+ "source": 4
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-gen-genomes-file/index.md b/help/8/programs/anvi-script-gen-genomes-file/index.md
new file mode 100644
index 00000000..0f9affee
--- /dev/null
+++ b/help/8/programs/anvi-script-gen-genomes-file/index.md
@@ -0,0 +1,93 @@
+---
+layout: program
+title: anvi-script-gen-genomes-file
+excerpt: An anvi'o program. Generate an external genomes or internal genomes file.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-gen-genomes-file
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Generate an external genomes or internal genomes file.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [profile-db](../../artifacts/profile-db) [collection](../../artifacts/collection)
+
+
+## Can provide
+
+
+[external-genomes](../../artifacts/external-genomes) [internal-genomes](../../artifacts/internal-genomes)
+
+
+## Usage
+
+
+The primary purpose of this script is to reduce the amount of labor required to generate the [external-genomes](/help/8/artifacts/external-genomes) or [internal-genomes](/help/8/artifacts/internal-genomes) files that anvi'o typically uses to learn about your bins and/or genomes.
+
+## Generating an external genomes file
+
+If you provide an input directory and a name for the output file, then every [contigs-db](/help/8/artifacts/contigs-db) in that directory will get a line in the resulting [external-genomes](/help/8/artifacts/external-genomes) file:
+
+```
+anvi-script-gen-genomes-file --input-dir path/to/dir \
+ --output-file external_genomes.txt
+```
+
+Names for genomes in the resulting external genomes file will be set based on the `project_name` variable, and the `contigs_db_path` column will contain absolute paths.
+
+{:.notice}
+You can learn the current `project_name` and/or change it for a given [contigs-db](/help/8/artifacts/contigs-db) using the program [anvi-db-info](/help/8/programs/anvi-db-info). This variable is set by the program [anvi-gen-contigs-database](/help/8/programs/anvi-gen-contigs-database).
+
+You can also instruct `anvi-script-gen-genomes-file` to include all subdirectories under a given directory path:
+
+```
+anvi-script-gen-genomes-file --input-dir path/to/dir \
+ --output-file external_genomes.txt \
+ --include-subdirs
+```
+
+## Generating an internal genomes file
+
+To get an [internal-genomes](/help/8/artifacts/internal-genomes) file containing all bins from a collection, provide a [profile-db](/help/8/artifacts/profile-db), its corresponding [contigs-db](/help/8/artifacts/contigs-db), and the [collection](/help/8/artifacts/collection) name:
+
+
+anvi-script-gen-genomes-file -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ --output-file internal-genomes.txt
+
+
+The name of each internal genome will be the same as the bin name, and the path columns will contain absolute paths.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-gen-genomes-file.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-gen-genomes-file) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-gen-genomes-file/network.json b/help/8/programs/anvi-script-gen-genomes-file/network.json
new file mode 100644
index 00000000..7e88139e
--- /dev/null
+++ b/help/8/programs/anvi-script-gen-genomes-file/network.json
@@ -0,0 +1,82 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-genomes",
+ "name": "external-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "internal-genomes",
+ "name": "internal-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-gen-genomes-file",
+ "name": "anvi-script-gen-genomes-file",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 5,
+ "target": 0
+ },
+ {
+ "source": 5,
+ "target": 1
+ },
+ {
+ "target": 5,
+ "source": 2
+ },
+ {
+ "target": 5,
+ "source": 3
+ },
+ {
+ "target": 5,
+ "source": 4
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-gen-hmm-hits-matrix-across-genomes/index.md b/help/8/programs/anvi-script-gen-hmm-hits-matrix-across-genomes/index.md
new file mode 100644
index 00000000..6292972a
--- /dev/null
+++ b/help/8/programs/anvi-script-gen-hmm-hits-matrix-across-genomes/index.md
@@ -0,0 +1,78 @@
+---
+layout: program
+title: anvi-script-gen-hmm-hits-matrix-across-genomes
+excerpt: An anvi'o program. A simple script to generate a TAB-delimited file that reports the frequency of HMM hits for a given HMM source across contigs databases.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-gen-hmm-hits-matrix-across-genomes
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A simple script to generate a TAB-delimited file that reports the frequency of HMM hits for a given HMM source across contigs databases.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[external-genomes](../../artifacts/external-genomes) [internal-genomes](../../artifacts/internal-genomes) [hmm-source](../../artifacts/hmm-source) [hmm-hits](../../artifacts/hmm-hits)
+
+
+## Can provide
+
+
+[hmm-hits-across-genomes-txt](../../artifacts/hmm-hits-across-genomes-txt)
+
+
+## Usage
+
+
+This program lets you look at the [hmm-hits](/help/8/artifacts/hmm-hits) from a single [hmm-source](/help/8/artifacts/hmm-source) across multiple genomes or bins, by creating a [hmm-hits-across-genomes-txt](/help/8/artifacts/hmm-hits-across-genomes-txt).
+
+{:.notice}
+For a similar program that reports function hits across genomes, see [anvi-script-gen-function-matrix-across-genomes](/help/8/programs/anvi-script-gen-function-matrix-across-genomes).
+
+The input of this program can be either an [internal-genomes](/help/8/artifacts/internal-genomes) or an [external-genomes](/help/8/artifacts/external-genomes).
+
+Here is an example run on an internal-genomes file:
+
+
+anvi-script-gen-hmm-hits-matrix-across-genomes -i [internal-genomes](/help/8/artifacts/internal-genomes) \
+ --hmm-source Bacteria_71 \
+ -o output.txt
+
+
+To list the [hmm-source](/help/8/artifacts/hmm-source)s common to the datasets that you're analyzing, just add the flag `--list-hmm-sources`, like so:
+
+
+anvi-script-gen-hmm-hits-matrix-across-genomes -e [external-genomes](/help/8/artifacts/external-genomes) \
+ --list-hmm-sources
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-gen-hmm-hits-matrix-across-genomes.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-gen-hmm-hits-matrix-across-genomes) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-gen-hmm-hits-matrix-across-genomes/network.json b/help/8/programs/anvi-script-gen-hmm-hits-matrix-across-genomes/network.json
new file mode 100644
index 00000000..8e2f0c64
--- /dev/null
+++ b/help/8/programs/anvi-script-gen-hmm-hits-matrix-across-genomes/network.json
@@ -0,0 +1,82 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "hmm-hits-across-genomes-txt",
+ "name": "hmm-hits-across-genomes-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-genomes",
+ "name": "external-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "internal-genomes",
+ "name": "internal-genomes",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "hmm-source",
+ "name": "hmm-source",
+ "provided_by_anvio": false,
+ "type": "HMM"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "hmm-hits",
+ "name": "hmm-hits",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-gen-hmm-hits-matrix-across-genomes",
+ "name": "anvi-script-gen-hmm-hits-matrix-across-genomes",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 5,
+ "target": 0
+ },
+ {
+ "target": 5,
+ "source": 1
+ },
+ {
+ "target": 5,
+ "source": 2
+ },
+ {
+ "target": 5,
+ "source": 3
+ },
+ {
+ "target": 5,
+ "source": 4
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-gen-pseudo-paired-reads-from-fastq/index.md b/help/8/programs/anvi-script-gen-pseudo-paired-reads-from-fastq/index.md
new file mode 100644
index 00000000..e6a24b23
--- /dev/null
+++ b/help/8/programs/anvi-script-gen-pseudo-paired-reads-from-fastq/index.md
@@ -0,0 +1,72 @@
+---
+layout: program
+title: anvi-script-gen-pseudo-paired-reads-from-fastq
+excerpt: An anvi'o program. A script that takes a FASTQ file that is not paired-end (i.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-gen-pseudo-paired-reads-from-fastq
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A script that takes a FASTQ file that is not paired-end (i.e., R1 alone) and converts it into two FASTQ files that are paired-end (i.e., R1 and R2). This is a quick-and-dirty workaround that halves each read from the original FASTQ and puts one half in the FASTQ file for R1 and puts the reverse-complement of the second half in the FASTQ file for R2. If you've ended up here, things have clearly not gone very well for you, and Evan, who battled similar battles and ended up implementing this solution, wholeheartedly sympathizes.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[short-reads-fasta](../../artifacts/short-reads-fasta)
+
+
+## Can provide
+
+
+[paired-end-fastq](../../artifacts/paired-end-fastq)
+
+
+## Usage
+
+
+This program takes in a [short-reads-fasta](/help/8/artifacts/short-reads-fasta) file and tries to recreate what paired reads for the data in that fasta file might look like.
+
+Each read in the input is split in half: the first half is put into the R1 output, while the second half is reverse complemented and put into the R2 output.
+
+For example, if you ran
+
+
+anvi-script-gen-pseudo-paired-reads-from-fastq -f [short-reads-fasta](/help/8/artifacts/short-reads-fasta) \
+ -O MY_READS
+
+
+Then you would end up with two files:
+
+- `MY_READS_1.fastq`, which contains the first half of each read from your input file
+- `MY_READS_2.fastq`, which contains the reverse complement of the second half of each read.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-gen-pseudo-paired-reads-from-fastq.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-gen-pseudo-paired-reads-from-fastq) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-gen-pseudo-paired-reads-from-fastq/network.json b/help/8/programs/anvi-script-gen-pseudo-paired-reads-from-fastq/network.json
new file mode 100644
index 00000000..d41a9450
--- /dev/null
+++ b/help/8/programs/anvi-script-gen-pseudo-paired-reads-from-fastq/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "paired-end-fastq",
+ "name": "paired-end-fastq",
+ "provided_by_anvio": true,
+ "type": "FASTQ"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "short-reads-fasta",
+ "name": "short-reads-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-gen-pseudo-paired-reads-from-fastq",
+ "name": "anvi-script-gen-pseudo-paired-reads-from-fastq",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-gen-short-reads/index.md b/help/8/programs/anvi-script-gen-short-reads/index.md
new file mode 100644
index 00000000..e3ae00b5
--- /dev/null
+++ b/help/8/programs/anvi-script-gen-short-reads/index.md
@@ -0,0 +1,81 @@
+---
+layout: program
+title: anvi-script-gen-short-reads
+excerpt: An anvi'o program. Generate short reads from contigs.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-gen-short-reads
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Generate short reads from contigs. Useful to reconstruct mock data sets from already assembled contigs.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[configuration-ini](../../artifacts/configuration-ini)
+
+
+## Can provide
+
+
+[short-reads-fasta](../../artifacts/short-reads-fasta)
+
+
+## Usage
+
+
+This program uses already assembled contigs to create a mock list of short reads. You can then use these short reads to reassemble your data in order to test alternative assembly programs or analysis methods as a positive control.
+
+Basically, this attempts to undo the assembly and produce a data set that could have been directly received from laboratory sequencing. While the computer's mock short reads won't be perfect, they can be used to make sure your analysis pipeline is working from step 1.
+
+## Example Usage
+
+This program takes an INI file, a plain-text configuration file made up of sections and `key = value` pairs. For this program, the example provided in the anvi'o test suite looks like this:
+
+```ini
+[general]
+short_read_length = 10
+error_rate = 0.05
+coverage = 100
+contig = CTGTGGTTACGCCACCTTGAGAGATATTAGTCGCGTATTGCATCCGTGCCGACAAATTGCCCAACGCATCGTTCCTTCTCCTAAGTAATTTAACATGCGT
+```
+
+Note that this file contains both the contig that you want to break down, and various information about the short reads that you want to create. To run this program, just call
+
+
+anvi-script-gen-short-reads [configuration-ini](/help/8/artifacts/configuration-ini) \
+ --output-file-path [short-reads-fasta](/help/8/artifacts/short-reads-fasta)
+
+
+The resulting FASTA file will cover the `contig` with short reads that are 10 nts long at 100X coverage. The reads will also carry an error rate of 0.05, to mimic the sequencing errors you would get from sequencing in the wet lab.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-gen-short-reads.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-gen-short-reads) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-gen-short-reads/network.json b/help/8/programs/anvi-script-gen-short-reads/network.json
new file mode 100644
index 00000000..b1ae71f4
--- /dev/null
+++ b/help/8/programs/anvi-script-gen-short-reads/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "short-reads-fasta",
+ "name": "short-reads-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "configuration-ini",
+ "name": "configuration-ini",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-gen-short-reads",
+ "name": "anvi-script-gen-short-reads",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-gen-user-module-file/index.md b/help/8/programs/anvi-script-gen-user-module-file/index.md
new file mode 100644
index 00000000..ef01f0c6
--- /dev/null
+++ b/help/8/programs/anvi-script-gen-user-module-file/index.md
@@ -0,0 +1,122 @@
+---
+layout: program
+title: anvi-script-gen-user-module-file
+excerpt: An anvi'o program. This script generates a user-defined module file from a tab-delimited file of enzymes and other input parameters.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-gen-user-module-file
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+This script generates a user-defined module file from a tab-delimited file of enzymes and other input parameters.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[enzymes-list-for-module](../../artifacts/enzymes-list-for-module)
+
+
+## Can provide
+
+
+[user-modules-data](../../artifacts/user-modules-data)
+
+
+## Usage
+
+
+Given an [enzymes-list-for-module](/help/8/artifacts/enzymes-list-for-module) file, this script will produce a properly-formatted module file for use in [anvi-setup-user-modules](/help/8/programs/anvi-setup-user-modules).
+
+## Basics
+You should provide this script with an accession ID for your module (which will become the module file name) (`-I`); a name for the module (`-n`); a categorization which includes module class, category, and sub-category separated by semicolons (`-c`); an [enzymes-list-for-module](/help/8/artifacts/enzymes-list-for-module) file listing the component enzymes, their annotation sources, and their functions (`-e`); and a definition which puts these enzymes in the proper order (`-d`):
+
+
+anvi-script-gen-user-module-file -I "UD0023" \
+ -n "Frankenstein pathway for demo purposes" \
+ -c "User modules; Demo set; Frankenstein metabolism" \
+ -e [enzymes-list-for-module](/help/8/artifacts/enzymes-list-for-module) \
+ -d "K01657+K01658 PF06603.14,(COG1362 TIGR01709.2)"
+
+
+Then this script will produce a properly-formatted module file, which in this case would be called `UD0023` and would look like this (see [enzymes-list-for-module](/help/8/artifacts/enzymes-list-for-module) for the example file containing these enzymes):
+
+```
+ENTRY UD0023
+NAME Frankenstein pathway for demo purposes
+DEFINITION K01657+K01658 PF06603.14,(COG1362 TIGR01709.2)
+ORTHOLOGY K01657 anthranilate synthase component I [EC:4.1.3.27]
+ K01658 anthranilate synthase component II [EC:4.1.3.27]
+ PF06603.14 UpxZ
+ COG1362 Aspartyl aminopeptidase
+ TIGR01709.2 type II secretion system protein GspL
+CLASS User modules; Demo set; Frankenstein metabolism
+ANNOTATION_SOURCE K01657 KOfam
+ K01658 KOfam
+ PF06603.14 METABOLISM_HMM
+ COG1362 COG20_FUNCTION
+ TIGR01709.2 TIGRFAM
+\\\
+```
+
+## Automatically generating the module definition
+
+The module definition parameter is not required, and if you do not provide one, the definition will be generated with each enzyme in the input file as a different 'step' of the module. This option may be especially appropriate for generating what KEGG calls "signature modules", in which each enzyme is not technically part of a metabolic pathway, but instead the module represents a functionally-related set of enzymes (like all tRNA modification enzymes, for instance).
+
+Here is an example command without the definition parameter:
+
+
+anvi-script-gen-user-module-file -I "UD0023" \
+ -n "Frankenstein pathway for demo purposes" \
+ -c "User modules; Demo set; Frankenstein metabolism" \
+ -e [enzymes-list-for-module](/help/8/artifacts/enzymes-list-for-module)
+
+
+And here is the module file that this would produce:
+
+```
+ENTRY UD0023
+NAME Frankenstein pathway for demo purposes
+DEFINITION K01657 K01658 PF06603.14 COG1362 TIGR01709.2
+ORTHOLOGY K01657 anthranilate synthase component I [EC:4.1.3.27]
+ K01658 anthranilate synthase component II [EC:4.1.3.27]
+ PF06603.14 UpxZ
+ COG1362 Aspartyl aminopeptidase
+ TIGR01709.2 type II secretion system protein GspL
+CLASS User modules; Demo set; Frankenstein metabolism
+ANNOTATION_SOURCE K01657 KOfam
+ K01658 KOfam
+ PF06603.14 METABOLISM_HMM
+ COG1362 COG20_FUNCTION
+ TIGR01709.2 TIGRFAM
+\\\
+```
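+
+The default definition shown above is simply every enzyme in the input file, in order, separated by spaces (one enzyme per step). A minimal Python sketch of that idea follows. The column names (`enzyme`, `source`, `orthology`) and the helper name `default_definition` are assumptions for illustration, not the script's actual code.
+
+```python
+import csv
+import io
+
+# Hypothetical contents of an enzymes-list-for-module file. The column
+# names used here are an assumption made for this sketch.
+ENZYMES_TXT = (
+    "enzyme\tsource\torthology\n"
+    "K01657\tKOfam\tanthranilate synthase component I [EC:4.1.3.27]\n"
+    "K01658\tKOfam\tanthranilate synthase component II [EC:4.1.3.27]\n"
+    "PF06603.14\tMETABOLISM_HMM\tUpxZ\n"
+    "COG1362\tCOG20_FUNCTION\tAspartyl aminopeptidase\n"
+    "TIGR01709.2\tTIGRFAM\ttype II secretion system protein GspL\n"
+)
+
+
+def default_definition(enzymes_txt):
+    """Join all enzyme accessions with spaces, one enzyme per step."""
+    rows = csv.DictReader(io.StringIO(enzymes_txt), delimiter="\t")
+    return " ".join(row["enzyme"] for row in rows)
+
+
+print(default_definition(ENZYMES_TXT))
+```
+
+For the five enzymes above, this prints the same string as the `DEFINITION` line of the generated module file.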
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-gen-user-module-file.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-gen-user-module-file) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-gen-user-module-file/network.json b/help/8/programs/anvi-script-gen-user-module-file/network.json
new file mode 100644
index 00000000..c473f7b6
--- /dev/null
+++ b/help/8/programs/anvi-script-gen-user-module-file/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "user-modules-data",
+ "name": "user-modules-data",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "enzymes-list-for-module",
+ "name": "enzymes-list-for-module",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-gen-user-module-file",
+ "name": "anvi-script-gen-user-module-file",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-gen_stats_for_single_copy_genes.py/index.md b/help/8/programs/anvi-script-gen_stats_for_single_copy_genes.py/index.md
new file mode 100644
index 00000000..c5f59e69
--- /dev/null
+++ b/help/8/programs/anvi-script-gen_stats_for_single_copy_genes.py/index.md
@@ -0,0 +1,76 @@
+---
+layout: program
+title: anvi-script-gen_stats_for_single_copy_genes.py
+excerpt: An anvi'o program. A simple script to generate info from search tables, given a contigs-db.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-gen_stats_for_single_copy_genes.py
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A simple script to generate info from search tables, given a contigs-db.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+[genes-stats](../../artifacts/genes-stats)
+
+
+## Usage
+
+
+This program **provides information about each of the single-copy core genes in your [contigs-db](/help/8/artifacts/contigs-db)**.
+
+Simply provide a [contigs-db](/help/8/artifacts/contigs-db), and it will create a [genes-stats](/help/8/artifacts/genes-stats) file containing a variety of information about the single-copy core genes in your database.
+
+{:.notice}
+This is kind of an old anvi'o script that we still keep around because history. But if you are here, you may also consider taking a look at the programs [anvi-script-gen-hmm-hits-matrix-across-genomes](/help/8/programs/anvi-script-gen-hmm-hits-matrix-across-genomes) and [anvi-get-sequences-for-hmm-hits](/help/8/programs/anvi-get-sequences-for-hmm-hits).
+
+
+anvi-script-gen_stats_for_single_copy_genes.py -c [contigs-db](/help/8/artifacts/contigs-db)
+
+
+The console output will tell you the total number of contigs, splits, and nucleotides in your [contigs-db](/help/8/artifacts/contigs-db), while the text output will tell you the source, name, and e-value of each single-copy core gene.
+
+You can also restrict the report to the single-copy core genes from a specific source. To see what sources are available in your [contigs-db](/help/8/artifacts/contigs-db), run
+
+
+anvi-script-gen_stats_for_single_copy_genes.py -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --list-sources
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-gen_stats_for_single_copy_genes.py.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-gen_stats_for_single_copy_genes.py) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-gen_stats_for_single_copy_genes.py/network.json b/help/8/programs/anvi-script-gen_stats_for_single_copy_genes.py/network.json
new file mode 100644
index 00000000..d5daef57
--- /dev/null
+++ b/help/8/programs/anvi-script-gen_stats_for_single_copy_genes.py/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genes-stats",
+ "name": "genes-stats",
+ "provided_by_anvio": true,
+ "type": "STATS"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-gen_stats_for_single_copy_genes.py",
+ "name": "anvi-script-gen_stats_for_single_copy_genes.py",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-get-coverage-from-bam/index.md b/help/8/programs/anvi-script-get-coverage-from-bam/index.md
new file mode 100644
index 00000000..27b3d7c7
--- /dev/null
+++ b/help/8/programs/anvi-script-get-coverage-from-bam/index.md
@@ -0,0 +1,84 @@
+---
+layout: program
+title: anvi-script-get-coverage-from-bam
+excerpt: An anvi'o program. Get nucleotide-level, contig-level, or bin-level coverage values from a BAM file very rapidly.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-get-coverage-from-bam
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Get nucleotide-level, contig-level, or bin-level coverage values from a BAM file very rapidly. For other anvi'o programs that are designed to profile BAM files, see `anvi-profile` and `anvi-profile-blitz`.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[bam-file](../../artifacts/bam-file) [collection-txt](../../artifacts/collection-txt)
+
+
+## Can provide
+
+
+[coverages-txt](../../artifacts/coverages-txt)
+
+
+## Usage
+
+
+This program gets the coverage values from a [bam-file](/help/8/artifacts/bam-file), and puts them into a [coverages-txt](/help/8/artifacts/coverages-txt).
+
+You must provide a BAM file, but there are three ways you can choose contigs to analyze within that file:
+1. Give a contig name. Here, you can only report coverage per nucleotide position (in this example, the user is explicitly asking for this anyway with the `-m` flag).
+
+
+ anvi-script-get-coverage-from-bam -b [bam-file](/help/8/artifacts/bam-file) \
+ -c NAME_OF_CONTIG \
+ -m pos
+
+
+2. Give a file that contains a list of contigs (one per line; same format as the `--contigs-of-interest` tag for [anvi-profile](/help/8/programs/anvi-profile)). Here, you can ask for the contig averages (as in this example) or nucleotide position coverage.
+
+
+ anvi-script-get-coverage-from-bam -b [bam-file](/help/8/artifacts/bam-file) \
+ -l NAME_OF_FILE \
+ -m contig
+
+
+3. Give a [collection-txt](/help/8/artifacts/collection-txt) file for the program to determine the coverage for all contigs in those bins. Here, you can ask for the contig averages, nucleotide position coverage or coverage per bin (as in this example).
+
+
+ anvi-script-get-coverage-from-bam -b [bam-file](/help/8/artifacts/bam-file) \
+ -C [collection-txt](/help/8/artifacts/collection-txt) \
+ -m bin
+
+
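+The three levels accepted by `-m` (`pos`, `contig`, `bin`) are successive summaries of the same per-nucleotide data. The toy Python sketch below illustrates how they relate. The contig and bin names are made up, and averaging a bin over all of its nucleotides is an assumption of this sketch, not a statement about how the program computes it.
+
+```python
+from statistics import mean
+
+# Toy per-nucleotide coverage values for two hypothetical contigs
+# (what a position-level report would contain).
+position_coverage = {
+    "contig_A": [2, 4, 6, 8],
+    "contig_B": [10, 10],
+}
+
+# Toy collection: which contigs belong to which bin.
+bins = {"bin_1": ["contig_A", "contig_B"]}
+
+# Contig level: one average value per contig.
+contig_coverage = {
+    name: mean(values) for name, values in position_coverage.items()
+}
+
+# Bin level: one value per bin. Assumption: average over every
+# nucleotide of every contig in the bin, so longer contigs weigh more.
+bin_coverage = {
+    bin_name: mean(
+        cov for contig in contigs for cov in position_coverage[contig]
+    )
+    for bin_name, contigs in bins.items()
+}
+
+print(contig_coverage)
+print(bin_coverage)
+```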
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-get-coverage-from-bam.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-get-coverage-from-bam) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-get-coverage-from-bam/network.json b/help/8/programs/anvi-script-get-coverage-from-bam/network.json
new file mode 100644
index 00000000..81a626b5
--- /dev/null
+++ b/help/8/programs/anvi-script-get-coverage-from-bam/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "coverages-txt",
+ "name": "coverages-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "bam-file",
+ "name": "bam-file",
+ "provided_by_anvio": false,
+ "type": "BAM"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection-txt",
+ "name": "collection-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-get-coverage-from-bam",
+ "name": "anvi-script-get-coverage-from-bam",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-get-hmm-hits-per-gene-call/index.md b/help/8/programs/anvi-script-get-hmm-hits-per-gene-call/index.md
new file mode 100644
index 00000000..991c56d2
--- /dev/null
+++ b/help/8/programs/anvi-script-get-hmm-hits-per-gene-call/index.md
@@ -0,0 +1,75 @@
+---
+layout: program
+title: anvi-script-get-hmm-hits-per-gene-call
+excerpt: An anvi'o program. A simple script to generate a TAB-delimited file of gene caller IDs and their HMM hits for a given HMM source.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-get-hmm-hits-per-gene-call
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A simple script to generate a TAB-delimited file of gene caller IDs and their HMM hits for a given HMM source.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [hmm-source](../../artifacts/hmm-source) [hmm-hits](../../artifacts/hmm-hits)
+
+
+## Can provide
+
+
+[functions-txt](../../artifacts/functions-txt)
+
+
+## Usage
+
+
+This program lets you convert the [hmm-hits](/help/8/artifacts/hmm-hits) within a [contigs-db](/help/8/artifacts/contigs-db) into a [functions-txt](/help/8/artifacts/functions-txt).
+
+It is similar to [anvi-export-functions](/help/8/programs/anvi-export-functions), except that it deals specifically with [hmm-hits](/help/8/artifacts/hmm-hits) (which are generated by [anvi-run-hmms](/help/8/programs/anvi-run-hmms)). In contrast, [anvi-export-functions](/help/8/programs/anvi-export-functions) works with the more abstract [functions](/help/8/artifacts/functions) artifact.
+
+Here is an example run of this program:
+
+
+anvi-script-get-hmm-hits-per-gene-call -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/[functions-txt](/help/8/artifacts/functions-txt)
+
+
+You can also choose a specific [hmm-source](/help/8/artifacts/hmm-source), so that only hits from that source are reported. For example:
+
+
+anvi-script-get-hmm-hits-per-gene-call -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -o path/to/[functions-txt](/help/8/artifacts/functions-txt) \
+ --hmm-source Bacteria_71
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-get-hmm-hits-per-gene-call.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-get-hmm-hits-per-gene-call) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-get-hmm-hits-per-gene-call/network.json b/help/8/programs/anvi-script-get-hmm-hits-per-gene-call/network.json
new file mode 100644
index 00000000..291c29a2
--- /dev/null
+++ b/help/8/programs/anvi-script-get-hmm-hits-per-gene-call/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions-txt",
+ "name": "functions-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "hmm-source",
+ "name": "hmm-source",
+ "provided_by_anvio": false,
+ "type": "HMM"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "hmm-hits",
+ "name": "hmm-hits",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-get-hmm-hits-per-gene-call",
+ "name": "anvi-script-get-hmm-hits-per-gene-call",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "target": 4,
+ "source": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-merge-collections/index.md b/help/8/programs/anvi-script-merge-collections/index.md
new file mode 100644
index 00000000..d55d6d9b
--- /dev/null
+++ b/help/8/programs/anvi-script-merge-collections/index.md
@@ -0,0 +1,60 @@
+---
+layout: program
+title: anvi-script-merge-collections
+excerpt: An anvi'o program. Generate an additional data file from multiple collections.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-merge-collections
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Generate an additional data file from multiple collections.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [collection-txt](../../artifacts/collection-txt)
+
+
+## Can provide
+
+
+This program does not seem to provide any artifacts. Such programs usually print out some information for you to see or alter some anvi'o artifacts without producing any immediate outputs.
+
+
+## Usage
+
+
+This program outputs a file that denotes which contigs and splits are part of which [collection-txt](/help/8/artifacts/collection-txt) files. In other words, it tells you which collections each of your contigs is a part of, and the result can be imported as a categorical layer with [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data).
+
+This is not super useful, but it is used in the Infant gut tutorial [here](http://merenlab.org/tutorials/infant-gut/#comparing-multiple-binning-approaches).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-merge-collections.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-merge-collections) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-merge-collections/network.json b/help/8/programs/anvi-script-merge-collections/network.json
new file mode 100644
index 00000000..84b2c838
--- /dev/null
+++ b/help/8/programs/anvi-script-merge-collections/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection-txt",
+ "name": "collection-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-merge-collections",
+ "name": "anvi-script-merge-collections",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "target": 2,
+ "source": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-permute-trnaseq-seeds/index.md b/help/8/programs/anvi-script-permute-trnaseq-seeds/index.md
new file mode 100644
index 00000000..8258fe68
--- /dev/null
+++ b/help/8/programs/anvi-script-permute-trnaseq-seeds/index.md
@@ -0,0 +1,53 @@
+---
+layout: program
+title: anvi-script-permute-trnaseq-seeds
+excerpt: An anvi'o program. This script generates a FASTA file of tRNA-seq seeds with permuted nucleotides at positions of predicted modification-induced substitutions.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-permute-trnaseq-seeds
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+This script generates a FASTA file of tRNA-seq seeds with permuted nucleotides at positions of predicted modification-induced substitutions. The underlying nucleotide without modification is not always the most common base call. The resulting FASTA file can be queried against a database of tRNA genes to validate nucleotides at modified positions and find the most similar sequences.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [profile-db](../../artifacts/profile-db)
+
+
+## Can provide
+
+
+[contigs-fasta](../../artifacts/contigs-fasta)
+
+
+## Usage
+
+
+{:.notice}
+**No one has described the usage of this program** :/ If you would like to contribute, please see previous examples [here](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs), and feel free to add a Markdown formatted file in that directory named "anvi-script-permute-trnaseq-seeds.md". For a template, you can use the markdown file for `anvi-gen-contigs-database`. THANK YOU!
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-permute-trnaseq-seeds) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-permute-trnaseq-seeds/network.json b/help/8/programs/anvi-script-permute-trnaseq-seeds/network.json
new file mode 100644
index 00000000..4d0db7d0
--- /dev/null
+++ b/help/8/programs/anvi-script-permute-trnaseq-seeds/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-fasta",
+ "name": "contigs-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-permute-trnaseq-seeds",
+ "name": "anvi-script-permute-trnaseq-seeds",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-pfam-accessions-to-hmms-directory/index.md b/help/8/programs/anvi-script-pfam-accessions-to-hmms-directory/index.md
new file mode 100644
index 00000000..f7fcafd3
--- /dev/null
+++ b/help/8/programs/anvi-script-pfam-accessions-to-hmms-directory/index.md
@@ -0,0 +1,55 @@
+---
+layout: program
+title: anvi-script-pfam-accessions-to-hmms-directory
+excerpt: An anvi'o program. You give this program one or more PFAM accession ids, and it generates an anvi'o compatible HMM directory to be used with `anvi-run-hmms`.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-pfam-accessions-to-hmms-directory
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+You give this program one or more PFAM accession ids, and it generates an anvi'o compatible HMM directory to be used with `anvi-run-hmms`.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+This program seems to know what it's doing. It needs no input material from its user. Good program.
+
+
+## Can provide
+
+
+[hmm-source](../../artifacts/hmm-source)
+
+
+## Usage
+
+
+{:.notice}
+**No one has described the usage of this program** :/ If you would like to contribute, please see previous examples [here](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs), and feel free to add a Markdown formatted file in that directory named "anvi-script-pfam-accessions-to-hmms-directory.md". For a template, you can use the markdown file for `anvi-gen-contigs-database`. THANK YOU!
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-pfam-accessions-to-hmms-directory) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-pfam-accessions-to-hmms-directory/network.json b/help/8/programs/anvi-script-pfam-accessions-to-hmms-directory/network.json
new file mode 100644
index 00000000..1328b22a
--- /dev/null
+++ b/help/8/programs/anvi-script-pfam-accessions-to-hmms-directory/network.json
@@ -0,0 +1,30 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "hmm-source",
+ "name": "hmm-source",
+ "provided_by_anvio": false,
+ "type": "HMM"
+ },
+ {
+ "size": 1,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-pfam-accessions-to-hmms-directory",
+ "name": "anvi-script-pfam-accessions-to-hmms-directory",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 1,
+ "target": 0
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-process-genbank-metadata/index.md b/help/8/programs/anvi-script-process-genbank-metadata/index.md
new file mode 100644
index 00000000..0ed31e19
--- /dev/null
+++ b/help/8/programs/anvi-script-process-genbank-metadata/index.md
@@ -0,0 +1,109 @@
+---
+layout: program
+title: anvi-script-process-genbank-metadata
+excerpt: An anvi'o program. This script takes the 'metadata' output of the program `ncbi-genome-download` (see [https://github.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-process-genbank-metadata
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+This script takes the 'metadata' output of the program `ncbi-genome-download` (see [https://github.com/kblin/ncbi-genome-download](https://github.com/kblin/ncbi-genome-download) for details), and processes each GenBank file found in the metadata file to generate a FASTA file, as well as genes and functions files for each entry. Plus, it automatically generates a FASTA TXT file descriptor for anvi'o snakemake workflows. So it is a multi-talented program like that.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+This program seems to know what it's doing. It needs no input material from its user. Good program.
+
+
+## Can provide
+
+
+[contigs-fasta](../../artifacts/contigs-fasta) [functions-txt](../../artifacts/functions-txt) [external-gene-calls](../../artifacts/external-gene-calls)
+
+
+## Usage
+
+
+Suppose you have downloaded some genomes from NCBI (using [this](https://github.com/kblin/ncbi-genome-download) incredibly useful program) and you have a metadata table describing those genomes. This program will convert that metadata table into some useful files, namely: a FASTA file of contig sequences, an external gene calls file, and an external functions file for each genome you have downloaded; as well as a single tab-delimited fasta-txt file (like the one shown [here](https://merenlab.org/2018/07/09/anvio-snakemake-workflows/#fastatxt)) describing the path to each of these files for all downloaded genomes (that you can pass directly to a snakemake workflow if you need to). Yay.
+
+### The metadata file
+
+The prerequisite for running this program is to have a tab-delimited metadata file containing information about each of the genomes you downloaded from NCBI. Let's say your download command started like this: `ncbi-genome-download --metadata-table ncbi_metadata.txt -t ....` So for the purposes of this usage tutorial, your metadata file is called `ncbi_metadata.txt`.
+
+In case you are wondering, that file should have a header that looks something like this:
+```
+assembly_accession bioproject biosample wgs_master excluded_from_refseq refseq_category relation_to_type_material taxid species_taxid organism_name infraspecific_name isolate version_status assembly_level release_type genome_rep seq_rel_date asm_name submitter gbrs_paired_asm paired_asm_comp ftp_path local_filename
+```
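+
+To see what such a table contains before processing it, it can be read with a few lines of Python. This is only a sketch under stated assumptions: the helper names `read_metadata` and `summarize` are hypothetical (they are not part of anvi'o or `ncbi-genome-download`), and the file is assumed to be tab-delimited with the `assembly_accession` and `local_filename` columns shown in the header above.
+
+```python
+import csv
+
+
+def read_metadata(path):
+    """Return one dict per genome from a tab-delimited metadata table.
+
+    Hypothetical helper for illustration; not an anvi'o function.
+    """
+    with open(path, newline="") as handle:
+        return list(csv.DictReader(handle, delimiter="\t"))
+
+
+def summarize(rows):
+    """Map each assembly accession to the local GenBank file it points to."""
+    return {row["assembly_accession"]: row["local_filename"] for row in rows}
+```
+
+Calling `summarize(read_metadata("ncbi_metadata.txt"))` would then list which local file belongs to which accession, which is a quick way to check that the table matches what was downloaded.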
+
+### Basic usage
+
+If you run this, all the output files will show up in your current working directory.
+
+
+anvi-script-process-genbank-metadata -m ncbi_metadata.txt
+
+
+### Choosing an output directory
+
+Alternatively, you can specify a directory in which to generate the output:
+
+
+anvi-script-process-genbank-metadata -m ncbi_metadata.txt -o DOWNLOADED_GENOMES
+
+
+### Picking a name for the fasta-txt file
+
+The default name for the fasta-txt file is `fasta-input.txt`, but you can change that with the `--output-fasta-txt` parameter.
+
+
+anvi-script-process-genbank-metadata -m ncbi_metadata.txt --output-fasta-txt ncbi_fasta.txt
+
+
+### Make a fasta-txt without the gene calls and functions columns
+
+The default columns in the fasta-txt file are:
+```
+name path external_gene_calls gene_functional_annotation
+```
+
+But sometimes, you don't want your downstream snakemake workflow to use those external gene calls or functional annotations files. So to skip adding those columns into the fasta-txt file, you can use the `-E` flag:
+
+anvi-script-process-genbank-metadata -m ncbi_metadata.txt --output-fasta-txt ncbi_fasta.txt -E
+
+
+Then the fasta-txt will only contain a `name` column and a `path` column.
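+
+If you are unsure which of the two layouts a given fasta-txt file has, its header can be checked before handing it to a workflow. The snippet below is a minimal sketch, assuming the file is tab-delimited and that its first line is the header shown above; the function name `fasta_txt_layout` is hypothetical and not part of anvi'o.
+
+```python
+import csv
+
+# Column sets described above: the default layout and the layout produced with `-E`.
+FULL_COLUMNS = ["name", "path", "external_gene_calls", "gene_functional_annotation"]
+MINIMAL_COLUMNS = ["name", "path"]
+
+
+def fasta_txt_layout(path):
+    """Return 'full' or 'minimal' depending on the header of a fasta-txt file.
+
+    Hypothetical helper for illustration; raises ValueError for any other header.
+    """
+    with open(path, newline="") as handle:
+        header = next(csv.reader(handle, delimiter="\t"))
+    if header == FULL_COLUMNS:
+        return "full"
+    if header == MINIMAL_COLUMNS:
+        return "minimal"
+    raise ValueError(f"Unexpected fasta-txt header: {header}")
+```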
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-process-genbank-metadata.md) to update this information.
+
+
+## Additional Resources
+
+
+* [A tutorial on using this program to access NCBI genomes for 'omics analyses in Anvi'o](http://merenlab.org/2019/03/14/ncbi-genome-download-magic/)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-process-genbank-metadata) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-process-genbank-metadata/network.json b/help/8/programs/anvi-script-process-genbank-metadata/network.json
new file mode 100644
index 00000000..d0b653c5
--- /dev/null
+++ b/help/8/programs/anvi-script-process-genbank-metadata/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-fasta",
+ "name": "contigs-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions-txt",
+ "name": "functions-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-gene-calls",
+ "name": "external-gene-calls",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-process-genbank-metadata",
+ "name": "anvi-script-process-genbank-metadata",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "source": 3,
+ "target": 1
+ },
+ {
+ "source": 3,
+ "target": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-process-genbank/index.md b/help/8/programs/anvi-script-process-genbank/index.md
new file mode 100644
index 00000000..67b3650b
--- /dev/null
+++ b/help/8/programs/anvi-script-process-genbank/index.md
@@ -0,0 +1,63 @@
+---
+layout: program
+title: anvi-script-process-genbank
+excerpt: An anvi'o program. This script takes a GenBank file, and outputs a FASTA file, as well as two additional TAB-delimited output files for external gene calls and gene functions that can be used with the programs `anvi-gen-contigs-database` and `anvi-import-functions`.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-process-genbank
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+This script takes a GenBank file, and outputs a FASTA file, as well as two additional TAB-delimited output files for external gene calls and gene functions that can be used with the programs `anvi-gen-contigs-database` and `anvi-import-functions`.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[genbank-file](../../artifacts/genbank-file)
+
+
+## Can provide
+
+
+[contigs-fasta](../../artifacts/contigs-fasta) [external-gene-calls](../../artifacts/external-gene-calls) [functions-txt](../../artifacts/functions-txt)
+
+
+## Usage
+
+
+This program processes a [genbank-file](/help/8/artifacts/genbank-file), and converts it into anvi'o friendly artifacts: namely, a [contigs-fasta](/help/8/artifacts/contigs-fasta), [external-gene-calls](/help/8/artifacts/external-gene-calls) and a [functions-txt](/help/8/artifacts/functions-txt).
+
+The [contigs-fasta](/help/8/artifacts/contigs-fasta) and [external-gene-calls](/help/8/artifacts/external-gene-calls) can be given to [anvi-gen-contigs-database](/help/8/programs/anvi-gen-contigs-database) to create a [contigs-db](/help/8/artifacts/contigs-db), and then you can use [anvi-import-functions](/help/8/programs/anvi-import-functions) to bring the function data (in the [functions-txt](/help/8/artifacts/functions-txt)) into the database. Then you'll have all of the data in your [genbank-file](/help/8/artifacts/genbank-file) converted into a single [contigs-db](/help/8/artifacts/contigs-db), which you can use for a variety of anvi'o analyses.
+
+The parameters of this program entirely deal with the outputs. Besides telling the program where to put them, you can also give the function annotation source (in the [functions-txt](/help/8/artifacts/functions-txt)) a custom name.
+
+One important note about this conversion is the following: During the conversion of GenBank entries, anvi'o will assign a new gene call id to each entry, breaking the link between locus tags defined in the GenBank file and the gene entries that will later appear in the anvi'o [contigs-db](/help/8/artifacts/contigs-db). One way to avoid this is to use the flag `--include-locus-tags-as-functions`, which will instruct anvi'o to add a new 'function' source for each gene in the output file for functional annotations so that the user can trace back a given gene call to the original locus tag.
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-process-genbank.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-process-genbank) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-process-genbank/network.json b/help/8/programs/anvi-script-process-genbank/network.json
new file mode 100644
index 00000000..975eeac5
--- /dev/null
+++ b/help/8/programs/anvi-script-process-genbank/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-fasta",
+ "name": "contigs-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "external-gene-calls",
+ "name": "external-gene-calls",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions-txt",
+ "name": "functions-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "genbank-file",
+ "name": "genbank-file",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-process-genbank",
+ "name": "anvi-script-process-genbank",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "source": 4,
+ "target": 1
+ },
+ {
+ "source": 4,
+ "target": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-reformat-fasta/index.md b/help/8/programs/anvi-script-reformat-fasta/index.md
new file mode 100644
index 00000000..5ee04a6f
--- /dev/null
+++ b/help/8/programs/anvi-script-reformat-fasta/index.md
@@ -0,0 +1,140 @@
+---
+layout: program
+title: anvi-script-reformat-fasta
+excerpt: An anvi'o program. Reformat FASTA file (remove contigs based on length, or based on a given list of deflines, and/or generate an output with simpler names).
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-reformat-fasta
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Reformat FASTA file (remove contigs based on length, or based on a given list of deflines, and/or generate an output with simpler names).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+
+
+## Can consume
+
+
+[fasta](../../artifacts/fasta)
+
+
+## Can provide
+
+
+[contigs-fasta](../../artifacts/contigs-fasta)
+
+
+## Usage
+
+
+This program **converts a [fasta](/help/8/artifacts/fasta) file to a [contigs-fasta](/help/8/artifacts/contigs-fasta).** In other words, it reformats your FASTA formatted file to meet the conditions required of a [contigs-fasta](/help/8/artifacts/contigs-fasta), which is able to be used by other anvi'o programs.
+
+
+anvi-script-reformat-fasta [fasta](/help/8/artifacts/fasta) \
+ -o [contigs-fasta](/help/8/artifacts/contigs-fasta) \
+ --simplify-names
+
+
+{:.notice}
+If you use the flag `--report-file`, it will also create a TAB-delimited file for you to keep track of which defline in the new file corresponds to which defline in the original file.
+
+{:.notice}
+This program can work with compressed input FASTA files (i.e., files whose names end with a `.gz` extension) and will report a compressed output FASTA file if the output file name ends with a `.gz` extension.
+
+In addition to simplifying names, this program lets you combine any of the following operations:
+
+* Add a prefix to sequence names in a FASTA file,
+* Remove sequences that are shorter than a specific length or only keep sequences that match to a specific length,
+* Remove sequences if they contain more than a number of gap characters or exceed the percentage of gap characters you permit,
+* Exclude sequences that match to a list of sequence IDs, or only keep those that match to a list of sequence IDs,
+* Enforce a sequence type to replace any character with `N` for nucleotide sequences that are not A, C, T, or G, or replace any character with `X` for amino acid sequences if the character does not match any of the single-letter amino acid characters.
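+
+To make a few of these operations concrete, here is a minimal Python sketch of length filtering, nucleotide enforcement, and renaming with a report that maps new names back to the original deflines. It is an illustration under stated assumptions, not anvi'o's implementation: the functions `read_fasta` and `reformat` are hypothetical, and the naming scheme (an optional prefix or `c`, followed by a zero-padded counter) is chosen only for the example.
+
+```python
+def read_fasta(lines):
+    """Yield (defline, sequence) pairs from an iterable of FASTA lines."""
+    defline, chunks = None, []
+    for line in lines:
+        line = line.strip()
+        if line.startswith(">"):
+            if defline is not None:
+                yield defline, "".join(chunks)
+            defline, chunks = line[1:], []
+        elif line:
+            chunks.append(line)
+    if defline is not None:
+        yield defline, "".join(chunks)
+
+
+def reformat(records, min_len=0, prefix=None, enforce_nt=False):
+    """Drop short sequences, optionally mask non-ACGT characters, and rename.
+
+    Returns (new_records, report); report maps each new name to the original
+    defline. The naming scheme here is an assumption made for this sketch.
+    """
+    kept, report = [], {}
+    for defline, seq in records:
+        if len(seq) < min_len:
+            continue
+        if enforce_nt:
+            # Replace anything that is not A, C, G, or T with N.
+            seq = "".join(c if c.upper() in "ACGT" else "N" for c in seq)
+        number = f"{len(kept) + 1:012d}"
+        name = f"{prefix}_{number}" if prefix else f"c_{number}"
+        kept.append((name, seq))
+        report[name] = defline
+    return kept, report
+```
+
+The `report` dictionary plays the same role as the TAB-delimited file written with `--report-file`: it lets you trace every simplified name back to the defline it replaced.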
+
+### Removing the short reads is important
+
+If your FASTA file includes a lot of very short contigs, removing them may dramatically improve the performance of the generation and processing of your [contigs-db](/help/8/artifacts/contigs-db). The example below runs the same command while also removing sequences that are shorter than 1,000 nts:
+
+
+anvi-script-reformat-fasta [fasta](/help/8/artifacts/fasta) \
+ -o [contigs-fasta](/help/8/artifacts/contigs-fasta) \
+ -l 1000 \
+ --simplify-names
+
+
+### Example output
+
+```
+anvi-script-reformat-fasta contigs.fa \
+ --simplify-names \
+ --prefix YYY \
+ --min-len 1000 \
+ --seq-type NT \
+ --overwrite-input
+```
+
+```
+Input ........................................: contigs.fa
+Output .......................................: (anvi'o will overwrite your input file)
+
+WHAT WAS THERE
+===============================================
+Total num contigs ............................: 4,189
+Total num nucleotides ........................: 35,766,167
+
+WHAT WAS ASKED
+===============================================
+Simplify deflines? ...........................: Yes
+Add prefix to sequence names? ................: Yes, add 'YYY'
+Minimum length of contigs to keep ............: 1,000
+Max % gaps allowed ...........................: 100.00%
+Max num gaps allowed .........................: 1,000,000
+Exclude specific sequences? ..................: No
+Keep specific sequences? .....................: No
+Enforce sequence type? .......................: Yes, enforce 'NT'
+
+WHAT HAPPENED
+===============================================
+Contigs removed ..............................: 3,156 (75.34% of all)
+Nucleotides removed ..........................: 6,121,239 (17.11% of all)
+Nucleotides modified .........................: 161 (0.00045% of all)
+Deflines simplified ..........................: True
+
+
+* The contents of your input file have changed because you used the flag
+`--overwrite-input`.
+
+```
+
+{:.warning}
+Please use the flag `--overwrite-input` with extreme caution.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-reformat-fasta.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-reformat-fasta) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-reformat-fasta/network.json b/help/8/programs/anvi-script-reformat-fasta/network.json
new file mode 100644
index 00000000..fc3e70ae
--- /dev/null
+++ b/help/8/programs/anvi-script-reformat-fasta/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-fasta",
+ "name": "contigs-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "fasta",
+ "name": "fasta",
+ "provided_by_anvio": false,
+ "type": "FASTA"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-reformat-fasta",
+ "name": "anvi-script-reformat-fasta",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-snvs-to-interactive/index.md b/help/8/programs/anvi-script-snvs-to-interactive/index.md
new file mode 100644
index 00000000..d3e28aae
--- /dev/null
+++ b/help/8/programs/anvi-script-snvs-to-interactive/index.md
@@ -0,0 +1,92 @@
+---
+layout: program
+title: anvi-script-snvs-to-interactive
+excerpt: An anvi'o program. Take the output of anvi-gen-variability-profile, prepare an output for interactive interface.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-snvs-to-interactive
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Take the output of anvi-gen-variability-profile, prepare an output for interactive interface.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[variability-profile-txt](../../artifacts/variability-profile-txt)
+
+
+## Can provide
+
+
+[interactive](../../artifacts/interactive)
+
+
+## Usage
+
+
+This program takes a [variability-profile-txt](/help/8/artifacts/variability-profile-txt) and generates the information necessary to visualize its contents with [anvi-interactive](/help/8/programs/anvi-interactive).
+
+Specifically, this program outputs a directory that contains a [profile-db](/help/8/artifacts/profile-db), a [view-data](/help/8/artifacts/view-data) artifact, and a [dendrogram](/help/8/artifacts/dendrogram). For example, if you ran this program like so:
+
+
+anvi-script-snvs-to-interactive -o OUTPUT_DIR \
+ [variability-profile](/help/8/artifacts/variability-profile)
+
+
+Then, you can open the interactive interface by running
+
+
+anvi-interactive --manual-mode \
+ -p OUTPUT_DIR/profile.db \
+ --tree OUTPUT_DIR/tree.txt \
+ --view-data OUTPUT_DIR/view.txt
+
+
+## Other parameters
+
+### Using Only a Subset of the Input
+
+By default, all variability positions in your variability profile are considered. However, if the input is too large (i.e. more than 25,000 variability positions), the runtime of this program will be very long and the results won't display well. So, there are several ways to remove variability positions from the input to get under this threshold:
+
+1. Ignore positions with certain departures from the consensus sequence (with `--min-departure-from-consensus` and `--max-departure-from-consensus`)
+2. Ignore positions with certain departures from the reference sequence (with `--min-departure-from-reference` and `--max-departure-from-reference`)
+3. Ignore positions in all non-coding regions with the flag `--only-in-genes`.
+
+If you still have too many positions, you can tell the program to pick a random subset of the input with the parameter `--random` followed by a seed integer.
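+
+The filtering logic described in this section can be sketched in a few lines of Python. This is a hedged illustration, not the program's actual code: the function `select_positions` is hypothetical, and it assumes each row is a dictionary with a `departure_from_consensus` (or `departure_from_reference`) column, as we would expect from a variability profile.
+
+```python
+import random
+
+
+def select_positions(rows, min_dep=0.0, max_dep=1.0, max_positions=25000,
+                     seed=None, column="departure_from_consensus"):
+    """Keep rows whose departure value lies in [min_dep, max_dep].
+
+    If more than `max_positions` rows remain, return a random subset of that
+    size (reproducible for a given seed), preserving the original row order.
+    """
+    kept = [row for row in rows if min_dep <= float(row[column]) <= max_dep]
+    if len(kept) > max_positions:
+        rng = random.Random(seed)
+        indices = sorted(rng.sample(range(len(kept)), max_positions))
+        kept = [kept[i] for i in indices]
+    return kept
+```
+
+Using a fixed seed means the same subset is drawn every time, so a figure produced from a subsampled profile can be regenerated later.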
+
+### Modifying the Output
+
+By default, the output data will use the departure from consensus values. If instead you want to look at the departure from the reference, just add the flag `--display-dep-from-reference`.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-snvs-to-interactive.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Use in the Infant Gut Tutorial](http://merenlab.org/tutorials/infant-gut/#visualizing-snv-profiles-using-anvio)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-snvs-to-interactive) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-snvs-to-interactive/network.json b/help/8/programs/anvi-script-snvs-to-interactive/network.json
new file mode 100644
index 00000000..c12427da
--- /dev/null
+++ b/help/8/programs/anvi-script-snvs-to-interactive/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "interactive",
+ "name": "interactive",
+ "provided_by_anvio": true,
+ "type": "DISPLAY"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "variability-profile-txt",
+ "name": "variability-profile-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-snvs-to-interactive",
+ "name": "anvi-script-snvs-to-interactive",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-transpose-matrix/index.md b/help/8/programs/anvi-script-transpose-matrix/index.md
new file mode 100644
index 00000000..9900f0bd
--- /dev/null
+++ b/help/8/programs/anvi-script-transpose-matrix/index.md
@@ -0,0 +1,86 @@
+---
+layout: program
+title: anvi-script-transpose-matrix
+excerpt: An anvi'o program. Transpose a TAB-delimited file.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-transpose-matrix
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Transpose a TAB-delimited file.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[view-data](../../artifacts/view-data) [functions-txt](../../artifacts/functions-txt) [misc-data-items-txt](../../artifacts/misc-data-items-txt) [misc-data-layers-txt](../../artifacts/misc-data-layers-txt) [gene-calls-txt](../../artifacts/gene-calls-txt) [linkmers-txt](../../artifacts/linkmers-txt)
+
+
+## Can provide
+
+
+[view-data](../../artifacts/view-data) [functions-txt](../../artifacts/functions-txt) [misc-data-items-txt](../../artifacts/misc-data-items-txt) [misc-data-layers-txt](../../artifacts/misc-data-layers-txt) [gene-calls-txt](../../artifacts/gene-calls-txt) [linkmers-txt](../../artifacts/linkmers-txt)
+
+
+## Usage
+
+
+This is a script that transposes tab-delimited files. That's it.
+
+It's helpful to get your inputs to line up with the types of inputs that anvi'o expects. Some programs have the `--transpose` flag, which will run this program for you, but some don't, and that's when you'll have to run it yourself.
+
+For example, anvi'o expects [view-data](/help/8/artifacts/view-data) to have each column representing a sample. If the file that you want to integrate into your anvi'o project has the samples as rows and the data attribute as the columns, then you'll need to [anvi-script-transpose-matrix](/help/8/programs/anvi-script-transpose-matrix) it.
+
+### An Example Run
+
+If you have an input file `INPUT.txt` that looks like this:
+
+ 1 2 3
+ 4 5 6
+ 7 8 9
+ 10 11 12
+
+And you run this:
+
+
+anvi-script-transpose-matrix -o INPUT_transposed.txt \
+ -i INPUT.txt
+
+
+You'll get a file called `INPUT_transposed.txt` that looks like
+
+ 1 4 7 10
+ 2 5 8 11
+ 3 6 9 12
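+
+For readers curious about what the transposition amounts to, the same result can be reproduced with the Python standard library. This is a minimal sketch rather than the program's own code; the function name `transpose_tsv` is hypothetical, and it assumes every row has the same number of TAB-separated columns.
+
+```python
+import csv
+
+
+def transpose_tsv(input_path, output_path):
+    """Write the transpose of a TAB-delimited file to `output_path`."""
+    with open(input_path, newline="") as handle:
+        rows = list(csv.reader(handle, delimiter="\t"))
+    if len({len(row) for row in rows}) > 1:
+        raise ValueError("All rows must have the same number of columns.")
+    with open(output_path, "w", newline="") as handle:
+        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
+        # zip(*rows) turns the i-th column of the input into the i-th row.
+        writer.writerows(zip(*rows))
+```
+
+Running `transpose_tsv("INPUT.txt", "INPUT_transposed.txt")` on the four-row input above would give the three-row output shown.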
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-transpose-matrix.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-transpose-matrix) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-transpose-matrix/network.json b/help/8/programs/anvi-script-transpose-matrix/network.json
new file mode 100644
index 00000000..8cd3723f
--- /dev/null
+++ b/help/8/programs/anvi-script-transpose-matrix/network.json
@@ -0,0 +1,119 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 2,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "view-data",
+ "name": "view-data",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 2,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions-txt",
+ "name": "functions-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 2,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-items-txt",
+ "name": "misc-data-items-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 2,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-layers-txt",
+ "name": "misc-data-layers-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 2,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "gene-calls-txt",
+ "name": "gene-calls-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 2,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "linkmers-txt",
+ "name": "linkmers-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 12,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-transpose-matrix",
+ "name": "anvi-script-transpose-matrix",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 6,
+ "target": 0
+ },
+ {
+ "target": 6,
+ "source": 0
+ },
+ {
+ "source": 6,
+ "target": 1
+ },
+ {
+ "target": 6,
+ "source": 1
+ },
+ {
+ "source": 6,
+ "target": 2
+ },
+ {
+ "target": 6,
+ "source": 2
+ },
+ {
+ "source": 6,
+ "target": 3
+ },
+ {
+ "target": 6,
+ "source": 3
+ },
+ {
+ "source": 6,
+ "target": 4
+ },
+ {
+ "target": 6,
+ "source": 4
+ },
+ {
+ "source": 6,
+ "target": 5
+ },
+ {
+ "target": 6,
+ "source": 5
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-script-variability-to-vcf/index.md b/help/8/programs/anvi-script-variability-to-vcf/index.md
new file mode 100644
index 00000000..b038a155
--- /dev/null
+++ b/help/8/programs/anvi-script-variability-to-vcf/index.md
@@ -0,0 +1,67 @@
+---
+layout: program
+title: anvi-script-variability-to-vcf
+excerpt: An anvi'o program. A script to convert SNV output obtained from anvi-gen-variability-profile to the standard VCF format.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-script-variability-to-vcf
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A script to convert SNV output obtained from anvi-gen-variability-profile to the standard VCF format.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[variability-profile-txt](../../artifacts/variability-profile-txt)
+
+
+## Can provide
+
+
+[vcf](../../artifacts/vcf)
+
+
+## Usage
+
+
+This script **converts a [variability-profile-txt](/help/8/artifacts/variability-profile-txt) into [vcf](/help/8/artifacts/vcf) (Variant Call Format).**
+
+It is very easy to run: just provide the input and output paths like so:
+
+
+anvi-script-variability-to-vcf -i [variability-profile-txt](/help/8/artifacts/variability-profile-txt) \
+ -o [vcf](/help/8/artifacts/vcf)
+
+
+Note that to run this, you'll need to have run [anvi-gen-variability-profile](/help/8/programs/anvi-gen-variability-profile) with the default nucleotide engine.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-script-variability-to-vcf.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-script-variability-to-vcf) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-script-variability-to-vcf/network.json b/help/8/programs/anvi-script-variability-to-vcf/network.json
new file mode 100644
index 00000000..d36398c8
--- /dev/null
+++ b/help/8/programs/anvi-script-variability-to-vcf/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "vcf",
+ "name": "vcf",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "variability-profile-txt",
+ "name": "variability-profile-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-script-variability-to-vcf",
+ "name": "anvi-script-variability-to-vcf",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-search-functions/index.md b/help/8/programs/anvi-search-functions/index.md
new file mode 100644
index 00000000..4cd3339e
--- /dev/null
+++ b/help/8/programs/anvi-search-functions/index.md
@@ -0,0 +1,90 @@
+---
+layout: program
+title: anvi-search-functions
+excerpt: An anvi'o program. Search functions in an anvi'o contigs database or genomes storage.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-search-functions
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Search functions in an anvi'o contigs database or genomes storage. Basically, this program searches for one or more search terms you define in functional annotations of genes in an anvi'o contigs database, and generates multiple reports. The default report simply tells you which contigs contain genes with functions matching the search terms you used, useful for viewing in the interface. You can also request a much more comprehensive report, which gives you anything you might need to know for each hit and search term.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [genomes-storage-db](../../artifacts/genomes-storage-db)
+
+
+## Can provide
+
+
+[functions-txt](../../artifacts/functions-txt)
+
+
+## Usage
+
+
+This program **searches for keywords in the function annotations of your database.**
+
+You can use this program to look for specific function keywords in a [contigs-db](/help/8/artifacts/contigs-db), [genomes-storage-db](/help/8/artifacts/genomes-storage-db) or [pan-db](/help/8/artifacts/pan-db). For example, say you wanted to search your [contigs-db](/help/8/artifacts/contigs-db) for genes that encode some type of kinase. You could call
+
+
+anvi-search-functions -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --search-terms kinase
+
+
+By default, the output will be a fairly barren [functions-txt](/help/8/artifacts/functions-txt), only telling you which contigs contain genes that matched your search. This will be most helpful as an additional layer in the anvi'o interactive interface, so you can quickly see where the kinase-encoding genes are in the genome. To do this, run [anvi-interactive](/help/8/programs/anvi-interactive) with the `--additional-layers` parameter with the [functions-txt](/help/8/artifacts/functions-txt).
+
+However, you can also request a more comprehensive output that includes the matching genes' caller IDs, functional annotation sources, and full function names.
+
+For example, to run the same search as above, but with a more comprehensive output, you could call
+
+
+anvi-search-functions -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --search-terms kinase \
+ --full-report kinase_information.txt \
+ --include-sequences \
+ --verbose
+
+
+Following this run, the file `kinase_information.txt` will contain comprehensive information about the matching genes, including their sequences.
+
+You can also search for multiple terms at the same time, or for terms from only specific annotation sources. For example, if you only wanted Pfam hits with functions related to kinases or phosphatases, you could call
+
+
+anvi-search-functions -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --search-terms kinase,phosphatase \
+ --annotation-sources Pfam \
+ --full-report kinase_phosphatase_information.txt
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-search-functions.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-search-functions) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-search-functions/network.json b/help/8/programs/anvi-search-functions/network.json
new file mode 100644
index 00000000..0dc34a53
--- /dev/null
+++ b/help/8/programs/anvi-search-functions/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions-txt",
+ "name": "functions-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-search-functions",
+ "name": "anvi-search-functions",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-search-palindromes/index.md b/help/8/programs/anvi-search-palindromes/index.md
new file mode 100644
index 00000000..024cb45d
--- /dev/null
+++ b/help/8/programs/anvi-search-palindromes/index.md
@@ -0,0 +1,307 @@
+---
+layout: program
+title: anvi-search-palindromes
+excerpt: An anvi'o program. A program to find palindromes in sequences.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-search-palindromes
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A program to find palindromes in sequences.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[dna-sequence](../../artifacts/dna-sequence) [fasta](../../artifacts/fasta) [contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+[palindromes-txt](../../artifacts/palindromes-txt)
+
+
+## Usage
+
+
+This program finds [palindromes](https://en.wikipedia.org/wiki/Palindromic_sequence) in any DNA sequence. It will search for palindromes that match criteria set by the user (e.g., minimum length of the palindromic sequences, maximum number of mismatches, and minimum distance between the two palindromic regions).
+
+The program will print out its findings (and tribulations) and will optionally report the search results as a [palindromes-txt](/help/8/artifacts/palindromes-txt).
+
+### Kinds of palindromes
+
+Please note that this program can find both **'in-place' palindromes** (i.e., the identity and order of nucleotides on one strand match to those on the complementary strand) that will look like this in the genomic context:
+
+```
+0 1
+1234567890
+...TCGA...
+```
+
+As well as **'distant palindromes'** (i.e., special cases of palindromes that form [hairpins](https://en.wikipedia.org/wiki/Stem-loop)) that will look like this in the genomic context:
+
+```
+0 1
+12345678901234567
+...ATCC...GGAT...
+```
+
+In this example, the 'distance' for the in-place palindrome will be 0, and the 'distance' for the distant palindromes will be 3. You can set the `--min-distance` parameter to anything greater than 0 to only report distant palindromes, and eliminate all in-place palindromes from your results.
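+
+The two kinds of palindromes described above can be illustrated with a minimal Python sketch. This is only an illustration of the concept and not how anvi'o implements the search; the function names `reverse_complement` and `is_in_place_palindrome` are hypothetical and assume sequences that contain only A, C, G, and T.
+
+``` python
+# Illustrative sketch only; these helpers are hypothetical and are not
+# part of anvi'o.
+COMPLEMENT = str.maketrans("ACGT", "TGCA")
+
+
+def reverse_complement(sequence):
+    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
+    return sequence.translate(COMPLEMENT)[::-1]
+
+
+def is_in_place_palindrome(sequence):
+    """An in-place palindrome reads the same as its own reverse complement."""
+    return sequence == reverse_complement(sequence)
+
+
+# the in-place example from above: TCGA is its own reverse complement
+print(is_in_place_palindrome("TCGA"))
+
+# the distant example from above: the reverse complement of ATCC is GGAT,
+# which is the second region found 3 nucleotides downstream
+print(reverse_complement("ATCC"))
+```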
+
+{:.notice}
+The speed of the algorithm will depend on the minimum palindrome length parameter. The shorter the palindrome length, the longer the processing time. Searching for palindromes longer than 50 nts in a 10,000,000 nts long sequence takes about 4 seconds on a laptop.
+
+### Sequence input sources
+
+[anvi-search-palindromes](/help/8/programs/anvi-search-palindromes) can use multiple different sequence sources.
+
+#### Contigs database
+
+In this mode [anvi-search-palindromes](/help/8/programs/anvi-search-palindromes) will go through every contig sequence in a given [contigs-db](/help/8/artifacts/contigs-db).
+
+
+[anvi-search-palindromes](/help/8/programs/anvi-search-palindromes) -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --output-file [palindromes-txt](/help/8/artifacts/palindromes-txt)
+
+
+#### FASTA file
+
+Alternatively, you can use a [fasta](/help/8/artifacts/fasta) file as input.
+
+
+[anvi-search-palindromes](/help/8/programs/anvi-search-palindromes) --fasta-file [fasta](/help/8/artifacts/fasta) \
+ --output-file [palindromes-txt](/help/8/artifacts/palindromes-txt)
+
+
+#### DNA sequence
+
+Those who are lazy can also pass a DNA sequence for quick searches:
+
+
+[anvi-search-palindromes](/help/8/programs/anvi-search-palindromes) --dna-sequence (.. A DNA SEQUENCE OF ANY LENGTH ..)
+
+
+
+### Verbose output
+
+If you provide an `--output-file` parameter, your results will be stored in a [palindromes-txt](/help/8/artifacts/palindromes-txt) file for downstream analyses. If you do not provide an output file, or explicitly ask for verbose output with the flag `--verbose`, you will see all your palindromes listed on your screen.
+
+Here is an example with a single sequence and no output file path:
+
+
+[anvi-search-palindromes](/help/8/programs/anvi-search-palindromes) --dna-sequence CATTGACGTTGACGGCGACCGGTCGGTGATCACCGACCGGTCGCCGTCAACGTCAATG
+
+
+```
+SEARCH SETTINGS
+===============================================
+Minimum palindrome length ....................: 10
+Number of mismatches allowed .................: 0
+Minimum gap length ...........................: 0
+Be verbose? ..................................: Yes
+
+Number of threads for BLAST ..................: 1
+BLAST word size ..............................: 10
+
+
+58 nts palindrome
+===============================================
+Method .......................................: BLAST
+1st sequence [start:stop] ....................: [0:58]
+2nd sequence [start:stop] ....................: [0:58]
+Number of mismatches .........................: 0
+Distance between .............................: 0
+1st sequence .................................: CATTGACGTTGACGGCGACCGGTCGGTGATCACCGACCGGTCGCCGTCAACGTCAATG
+ALN ..........................................: ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
+2nd sequence .................................: CATTGACGTTGACGGCGACCGGTCGGTGATCACCGACCGGTCGCCGTCAACGTCAATG
+
+SEARCH RESULTS
+===============================================
+Total number of sequences processed ..........: 1
+Total number of palindromes found ............: 1
+Longest palindrome ...........................: 58
+Most distant palindrome ......................: 0
+```
+
+Here is another example with a [contigs-db](/help/8/artifacts/contigs-db), an output file path, and the `--verbose` flag:
+
+
+[anvi-search-palindromes](/help/8/programs/anvi-search-palindromes) -c CONTIGS.db \
+ --min-palindrome-length 50 \
+ --max-num-mismatches 1 \
+ --output-file palindromes.txt \
+ --verbose
+
+
+```
+SEARCH SETTINGS
+===============================================
+Minimum palindrome length ....................: 50
+Number of mismatches allowed .................: 1
+Minimum gap length ...........................: 0
+Be verbose? ..................................: Yes
+
+147 nts palindrome"
+===============================================
+1st sequence [start:stop] ....................: [268872:269019]
+2nd sequence [start:stop] ....................: [269631:269778]
+Number of mismatches .........................: 1
+Distance between .............................: 759
+1st sequence .................................: TTTCGTAATACTTTTTTGCAGTAGGCATCAAATTGGTGTTGTATAGATTTCTCATTATAATTTTGTTGCATGATAATATGCTCCTTTTTCCCCTTTCCACTAATACAACAATCAGAGAGCCCCTTTTTTTCGAAAAAGCTAGAAAAA
+ALN ..........................................: |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||x|||||||||
+2nd sequence .................................: TTTCGTAATACTTTTTTGCAGTAGGCATCAAATTGGTGTTGTATAGATTTCTCATTATAATTTTGTTGCATGATAATATGCTCCTTTTTCCCCTTTCCACTAATACAACAATCAGAGAGCCCCTTTTTTTCGAAAAAACTAGAAAAA
+
+SEARCH RESULTS
+===============================================
+Total number of sequences processed ..........: 11
+Total number of palindromes found ............: 1
+Longest palindrome ...........................: 147
+Most distant palindrome ......................: 759
+
+Output file ..................................: palindromes.txt
+```
+
+
+### Programmer access
+
+Just like everything else in anvi'o, you can access the functionality the program `anvi-search-palindromes` offers without using the program itself, by creating an instance of the `Palindromes` class and using it in your own Python scripts.
+
+Here is an example, first with an input file and then an ad hoc sequence. Starting with the file (i.e., an anvi'o [contigs-db](/help/8/artifacts/contigs-db)):
+
+``` python
+# import argparse to pass arguments to the class
+import argparse
+
+# `Palindromes` is the class we need
+from anvio.sequencefeatures import Palindromes
+
+# we also import `Progress` and `Run` helper classes from the terminal
+# module to ask the class to print no output messages to our workspace
+# (this is obviously optional)
+from anvio.terminal import Progress, Run
+
+# get an instance for the case of a contigs database, and process everything in it.
+# this example is with an anvi'o contigs db, but you can also pass a FASTA file
+# via `fasta_file='FILE.fa'` instead of `contigs_db='CONTIGS.db'`:
+p = Palindromes(argparse.Namespace(contigs_db='CONTIGS.db', min_palindrome_length=50), run=Run(verbose=False), progress=Progress(verbose=False))
+p.process()
+```
+
+Once the processing is done, the palindromes are stored in a member dictionary, which contains a key for each sequence:
+
+``` python
+print(p.palindromes)
+
+>>> {'Day17a_QCcontig1' : [],
+ 'Day17a_QCcontig2' : [],
+ 'Day17a_QCcontig4' : [],
+ 'Day17a_QCcontig6' : [],
+ 'Day17a_QCcontig10': [],
+ 'Day17a_QCcontig16': [],
+ 'Day17a_QCcontig23': [],
+ 'Day17a_QCcontig24': [],
+ 'Day17a_QCcontig45': [],
+ 'Day17a_QCcontig54': [],
+ 'Day17a_QCcontig97': []}
+
+```
+
+Non-empty lists contain the palindromes found in a given sequence, each described by an instance of the class `Palindrome`, which is defined as follows:
+
+``` python
+class Palindrome:
+    def __init__(self, run=terminal.Run()):
+        self.run = run
+        self.first_start = None
+        self.first_end = None
+        self.first_sequence = None
+        self.second_start = None
+        self.second_end = None
+        self.second_sequence = None
+        self.num_mismatches = None
+        self.length = None
+        self.distance = None
+        self.midline = ''
+```
+
+Not only can you access each member variable, you can also easily display the contents of a palindrome using the `display()` function:
+
+``` python
+palindrome = p.palindromes['Day17a_QCcontig4'][0]
+print(palindrome)
+
+>>> TTTCGTAATACTTTTTTGCAGTAGGCATCAAATTGGTGTTGTATAGATTTCTCATTATAATTTTGTTGCATGATAATATGCTCCTTTTTCCCCTTTCCACTAATACAACAATCAGAGAGCCCCTTTTTTTCGAAAAA (268872:269009) :: TTTCGTAATACTTTTTTGCAGTAGGCATCAAATTGGTGTTGTATAGATTTCTCATTATAATTTTGTTGCATGATAATATGCTCCTTTTTCCCCTTTCCACTAATACAACAATCAGAGAGCCCCTTTTTTTCGAAAAA (269631:269768)
+
+palindrome.display()
+
+>>> 137 nts palindrome"
+>>> ===============================================
+>>> 1st sequence [start:stop] ....................: [268872:269009]
+>>> 2nd sequence [start:stop] ....................: [269631:269768]
+>>> Number of mismatches .........................: 0
+>>> Distance between .............................: 759
+>>> 1st sequence .................................: TTTCGTAATACTTTTTTGCAGTAGGCATCAAATTGGTGTTGTATAGATTTCTCATTATAATTTTGTTGCATGATAATATGCTCCTTTTTCCCCTTTCCACTAATACAACAATCAGAGAGCCCCTTTTTTTCGAAAAA
+>>> ALN ..........................................: |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
+>>> 2nd sequence .................................: TTTCGTAATACTTTTTTGCAGTAGGCATCAAATTGGTGTTGTATAGATTTCTCATTATAATTTTGTTGCATGATAATATGCTCCTTTTTCCCCTTTCCACTAATACAACAATCAGAGAGCCCCTTTTTTTCGAAAAA
+```
+
+Alternatively, you can process an ad hoc sequence without any input files:
+
+``` python
+p = Palindromes()
+
+# let's set some values for fun,
+p.min_palindrome_length = 14
+p.max_num_mismatches = 1
+
+# to go through some sequences of your liking:
+some_sequences = {'a_sequence': 'CATTGACGTTGACGGCGACCGGTCGGTGATCACCGACCGGTCGCCGTCAACGTCAATG',
+                  'another_sequence': 'AAATCGGCCGATTT',
+                  'sequence_with_no_palindrome': 'AAAAAAAAAAAAAA'}
+
+# in this case (where there are no input files) you can call the function `find`,
+# rather than `process`, to populate the `p.palindromes` dictionary:
+for sequence_name in some_sequences:
+    p.find(some_sequences[sequence_name], sequence_name=sequence_name)
+
+# tadaaa:
+print(p.palindromes)
+
+>>> {'a_sequence': [],
+     'another_sequence': [],
+     'sequence_with_no_palindrome': []}
+```
+
+If you are a programmer and need more from this module, please let us know.
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-search-palindromes.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-search-palindromes) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-search-palindromes/network.json b/help/8/programs/anvi-search-palindromes/network.json
new file mode 100644
index 00000000..01abc943
--- /dev/null
+++ b/help/8/programs/anvi-search-palindromes/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "palindromes-txt",
+ "name": "palindromes-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "dna-sequence",
+ "name": "dna-sequence",
+ "provided_by_anvio": true,
+ "type": "SEQUENCE"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "fasta",
+ "name": "fasta",
+ "provided_by_anvio": false,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-search-palindromes",
+ "name": "anvi-search-palindromes",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "target": 4,
+ "source": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-search-primers/index.md b/help/8/programs/anvi-search-primers/index.md
new file mode 100644
index 00000000..3bd7cfe1
--- /dev/null
+++ b/help/8/programs/anvi-search-primers/index.md
@@ -0,0 +1,171 @@
+---
+layout: program
+title: anvi-search-primers
+excerpt: An anvi'o program. You provide this program with FASTQ files for one or more samples AND one or more primer sequences, and it collects reads from FASTQ files that match your primers.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-search-primers
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+You provide this program with FASTQ files for one or more samples AND one or more primer sequences, and it collects reads from FASTQ files that match your primers. This tool can be most powerful if you want to collect all short reads from one or more metagenomes that are downstream of a known sequence. Using the comprehensive output files you can analyze the diversity of sequences visually, manually, or using established strategies such as oligotyping.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[samples-txt](../../artifacts/samples-txt) [primers-txt](../../artifacts/primers-txt)
+
+
+## Can provide
+
+
+[short-reads-fasta](../../artifacts/short-reads-fasta)
+
+
+## Usage
+
+
+This program finds all reads in a given set of FASTQ files provided as [samples-txt](/help/8/artifacts/samples-txt) based on user-provided primer sequences as [primers-txt](/help/8/artifacts/primers-txt).
+
+One of many potential uses of this program is to get back short reads that may be extending into hypervariable regions of genomes that often suffer from significant drops in coverage in conventional read-recruitment analyses, thus preventing any meaningful insights into coverage or variability patterns. In such situations, one can identify downstream conserved sequences (typically 15 to 25 nucleotides long) using the anvi'o interactive interface or through other means, and then provide those sequences to this program so it can find all matching sequences in a set of FASTQ files without any mapping.
+
+{:.notice}
+To instead get short reads mapping to a gene, use [anvi-get-short-reads-mapping-to-a-gene](/help/8/programs/anvi-get-short-reads-mapping-to-a-gene).
+
+Here is a typical command line to run it:
+
+
+anvi-search-primers --samples-txt [samples-txt](/help/8/artifacts/samples-txt) \
+ --primers-txt [primers-txt](/help/8/artifacts/primers-txt) \
+ --output-dir OUTPUT
+
+
+The [samples-txt](/help/8/artifacts/samples-txt) file lists all the samples of interest, and the [primers-txt](/help/8/artifacts/primers-txt) file lists each primer sequence of interest along with its user-defined name. Each of these files can contain a single entry, or multiple ones.
+
+This will output all of the matching sequences into [fasta](/help/8/artifacts/fasta) files in the directory `OUTPUT`. These [fasta](/help/8/artifacts/fasta) files differ in their format, and describe the following:
+
+* Remainders are the downstream sequences after the primer match, excluding the primer sequence.
+* Primer matches are the primer-matching part of the matched sequences (useful if one is working with degenerate primers and wishes to see the diversity of matching sequences).
+* Trimmed sequences are trimmed to the shortest length (and include the primer match). All matching sequences will start at the same position.
+* Gapped sequences are not trimmed, but shorter ones are padded with gaps to eliminate length variation artificially.
+
+The last two formats provide downstream possibilities to generate [oligotypes](/help/8/artifacts/oligotypes) and cluster short reads from a hypervariable region to estimate their diversity and oligotype proportions.
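+
+The difference between the trimmed and gapped formats can be sketched in a few lines of Python. This is a conceptual illustration only, not the code anvi'o uses; the function names `trim_to_shortest` and `pad_with_gaps` and the example reads are made up for this sketch.
+
+``` python
+# Illustrative sketch only; these helpers and reads are hypothetical.
+def trim_to_shortest(sequences):
+    """Cut every sequence down to the length of the shortest one."""
+    shortest = min(len(sequence) for sequence in sequences)
+    return [sequence[:shortest] for sequence in sequences]
+
+
+def pad_with_gaps(sequences, gap_character="-"):
+    """Right-pad shorter sequences with gaps so all have the same length."""
+    longest = max(len(sequence) for sequence in sequences)
+    return [sequence.ljust(longest, gap_character) for sequence in sequences]
+
+
+# made-up reads that start at the same primer match but differ in length
+reads = ["GAGCAAGTCTGA", "GAGCAAGTCTGATGCAAC", "GAGCAAGTC"]
+
+print(trim_to_shortest(reads))
+print(pad_with_gaps(reads))
+```
+
+Either way, every sequence ends up with the same length, which is what downstream steps that compare sequences position by position typically need.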
+
+If the user asks for only primer matches or only remainders to be reported, with the flag `--only-report-primer-matches` or `--only-report-remainders`, the output directory will contain only a single FASTA file for raw sequences.
+
+### For programmers
+
+You can also access the functionality this program provides programmatically. Here is an example:
+
+``` python
+
+import argparse
+
+from anvio.sequencefeatures import PrimerSearch
+
+# define a samples dictionary, there may be as many samples as you want
+samples = {'sample_01': {'r1': 'sample_01_R1.fastq', 'r2': 'sample_01_R2.fastq'},
+ 'sample_02': {'r1': 'sample_02_R1.fastq', 'r2': 'sample_02_R2.fastq'}}
+
+# define a primers dictionary, again, you may have as many primers as you
+# wish
+primers = {'primer_01': {'primer_sequence': 'GAGCAAAGATCATGTTTCAAAA.ACGTTC'},
+ 'primer_02': {'primer_sequence': 'AAGT.CTATCAGAACTTAGAGTAGAGCAC'},
+ 'primer_03': {'primer_sequence': 'GGCAGAAATGCCAAGT.CTATCAGAACTT'}}
+
+# get an instance of the class, see the class header for all
+# parameters.
+s = PrimerSearch(argparse.Namespace(samples_dict=samples, primers_dict=primers, min_remainder_length=6))
+
+# you can go through a for loop for each sample, or simply call
+# s.process() to process all samples with all primers automatically.
+# here, though, this example will simply focus on a single sample
+# to recover all primer hits, and then get sequences for a single primer
+sample_dict, primers_dict = s.process_sample('sample_01')
+
+# once primer hits are recovered, one can get any set of sequences
+# of interest
+sequences = s.get_sequences('primer_01', primers_dict, target='gapped')
+print(sequences)
+>>> ['GAGCAAAGATCATGTTTCAAAAGACGTTCGTCTGA-----------------------------------------------------------------------------------------------------------',
+ 'GAGCAAAGATCATGTTTCAAAAGACGTTCGTCTGATGCAAC-----------------------------------------------------------------------------------------------------',
+ 'GAGCAAAGATCATGTTTCAAAAGACGTTCGTCTGATGCAACA----------------------------------------------------------------------------------------------------',
+ 'GAGCAAAGATCATGTTTCAAAAGACGTTCGTCTGATGCAACAA---------------------------------------------------------------------------------------------------',
+ 'GAGCAAAGATCATGTTTCAAAAGACGTTCGTCTGATGCAACAAAGATAAGC-------------------------------------------------------------------------------------------',
+ 'GAGCAAAGATCATGTTTCAAAAGACGTTCGTCTGATGCAACAAAGATAAGCCGCTTTTTT----------------------------------------------------------------------------------',
+ 'GAGCAAAGATCATGTTTCAAAAGACGTTCGTCTGATGCAACAAAGATAAGCCGCTTTTTT----------------------------------------------------------------------------------',
+ 'GAGCAAAGATCATGTTTCAAAAGACGTTCCTTTTTTGAAACACTGTTTTGGCTCTGCTCACTGAAGGCCAAAGG--------------------------------------------------------------------',
+ 'GAGCAAAGATCATGTTTCAAAAGACGTTCCTTTTTTGAAACACTGTTTTGGCTCTGCTCACTGAAGGCCAAAGGAAGAGATAAATGGCTGATAATTAAAACAATGTAGAAATATTTGC------------------------',
+ 'GAGCAAAGATCATGTTTCAAAAGACGTTCCTTTTTTGAAACACTGTTTTGGCTCTGCTCACTGAAGGCCAAAGGAAGAGATAAATGGCTGATAATTAAAACAATGTAGAAATATTTGCACAGATGAAAAAAGCGGCTTATCT']
+
+sequences = s.get_sequences('primer_01', primers_dict, target='trimmed')
+print(sequences)
+>>> ['GAAGATAGCCGTAGAAAGTGTAGAGTTTTAGGAGT',
+ 'AGCCGTAGAAAGTGTAGAGTTTCAGGAGTTTGGAG',
+ 'GCCGTAGAAAGTGTAGAGTTTTAGGAGTTTGGAGG',
+ 'CGTAGAAAGTGTAGAGTTTTAGGAGTTTGGAGGGG',
+ 'AGTGTAGAGTTTTAGGAGTTTGGAGGGGAGAATTA',
+ 'TTTAGGAGTTTGGAGGGGAGAATTAAGAAACGGTA',
+ 'TTTAGGAGTTTGGAGGGGAGAATTAAGAAACGGTA',
+ 'AGGGTAGAATTAAGAAACGGTAACGGTTGGTCTTG',
+ 'AAGAATAGTTGAAGAAGAATTATTGTATGGGAGAG',
+ 'TGTATGGGAGAGCAAAGATCATGTTTCAAAAGACG']
+
+sequences = s.get_sequences('primer_01', primers_dict, target='primer_matches')
+print(sequences)
+>>> ['GAGCAAAGATCATGTTTCAAAAGACGTTC',
+ 'GAGCAAAGATCATGTTTCAAAAGACGTTC',
+ 'GAGCAAAGATCATGTTTCAAAAGACGTTC',
+ 'GAGCAAAGATCATGTTTCAAAAGACGTTC',
+ 'GAGCAAAGATCATGTTTCAAAAGACGTTC',
+ 'GAGCAAAGATCATGTTTCAAAAGACGTTC',
+ 'GAGCAAAGATCATGTTTCAAAAGACGTTC',
+ 'GAGCAAAGATCATGTTTCAAAAGACGTTC',
+ 'GAGCAAAGATCATGTTTCAAAAGACGTTC',
+ 'GAGCAAAGATCATGTTTCAAAAGACGTTC']
+
+s = PrimerSearch(argparse.Namespace(samples_dict=samples, primers_dict=primers, stop_after=10, min_remainder_length=6, only_keep_remainder=True))
+sample_dict, primers_dict = s.process_sample('sample_01')
+sequences = s.get_sequences('primer_01', primers_dict, target='remainder')
+print(sequences)
+>>> ['GTCTGA',
+ 'GTCTGA',
+ 'GTCTGA',
+ 'GTCTGA',
+ 'GTCTGA',
+ 'GTCTGA',
+ 'GTCTGA',
+ 'CTTTTT',
+ 'CTTTTT',
+ 'CTTTTT']
+```
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-search-primers.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-search-primers) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-search-primers/network.json b/help/8/programs/anvi-search-primers/network.json
new file mode 100644
index 00000000..d7ee58aa
--- /dev/null
+++ b/help/8/programs/anvi-search-primers/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "short-reads-fasta",
+ "name": "short-reads-fasta",
+ "provided_by_anvio": true,
+ "type": "FASTA"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "samples-txt",
+ "name": "samples-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "primers-txt",
+ "name": "primers-txt",
+ "provided_by_anvio": false,
+ "type": "TXT"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-search-primers",
+ "name": "anvi-search-primers",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-search-sequence-motifs/index.md b/help/8/programs/anvi-search-sequence-motifs/index.md
new file mode 100644
index 00000000..b1f6f4df
--- /dev/null
+++ b/help/8/programs/anvi-search-sequence-motifs/index.md
@@ -0,0 +1,159 @@
+---
+layout: program
+title: anvi-search-sequence-motifs
+excerpt: An anvi'o program. A program to find one or more sequence motifs in contig or gene sequences, and store their frequencies.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-search-sequence-motifs
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A program to find one or more sequence motifs in contig or gene sequences, and store their frequencies.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db) [genes-db](../../artifacts/genes-db)
+
+
+## Can provide
+
+
+[misc-data-items](../../artifacts/misc-data-items) [misc-data-layers](../../artifacts/misc-data-layers)
+
+
+## Usage
+
+
+[anvi-search-sequence-motifs](/help/8/programs/anvi-search-sequence-motifs) will search for one or more sequence motifs in applicable anvi'o databases and will report their frequency. If you have more than one motif to search for, you can list them as comma-separated sequences.
+
+In this context we assume a motif is a 4 to 10 nucleotide-long string, although anvi'o will not impose any limit on length, and will search for any motif it is given, along with its reverse complement, across all sequences and report frequencies.
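+
+To make the idea of counting a motif together with its reverse complement concrete, here is a minimal Python sketch. It is not the anvi'o implementation: the function names are hypothetical, and it assumes that occurrences are counted without overlaps and that the counts for both orientations are summed, which may differ from what anvi'o actually does.
+
+``` python
+# Illustrative sketch only; names and counting rules are assumptions.
+COMPLEMENT = str.maketrans("ACGT", "TGCA")
+
+
+def reverse_complement(motif):
+    """Return the reverse complement of a DNA motif (A, C, G, T only)."""
+    return motif.translate(COMPLEMENT)[::-1]
+
+
+def count_motif(sequence, motif):
+    """Count non-overlapping hits of a motif and its reverse complement.
+
+    A set is used so that a motif that is its own reverse complement
+    is not counted twice.
+    """
+    targets = {motif, reverse_complement(motif)}
+    return sum(sequence.count(target) for target in targets)
+
+
+def motif_frequencies(sequences, motifs):
+    """Return {sequence_name: {motif: count}} for a dict of sequences."""
+    return {name: {motif: count_motif(sequence, motif) for motif in motifs}
+            for name, sequence in sequences.items()}
+
+
+# a made-up contig, searched for the motifs used in the examples below
+print(motif_frequencies({"contig_1": "ATCGCGATTAAAT"}, ["ATCG", "TAAAT"]))
+```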
+
+The most primitive output is a TAB-delimited text file, but anvi'o will also store frequency information in your databases like a pro if you use the `--store-in-db` flag.
+
+The following subsections include some examples.
+
+## A contigs database
+
+The minimum amount of stuff you need to run this program is a motif sequence and a [contigs-db](/help/8/artifacts/contigs-db):
+
+
+anvi-search-sequence-motifs -c [contigs-db](/help/8/artifacts/contigs-db) \
+ --motifs ATCG,TAAAT \
+ --output-file motifs.txt
+
+
+Running this will yield an output file with as many columns as the number of sequence motifs that show their frequencies across each contig found in the [contigs-db](/help/8/artifacts/contigs-db). Here is an example:
+
+|contig_name|ATCG|TAAAT|
+|:--|:--:|:--:|
+|204_10M_contig_1720|101|159|
+|204_10M_contig_6515|64|31|
+|204_10M_contig_878|435|3|
+
+## Contigs database + profile database
+
+If you provide this program with a [profile-db](/help/8/artifacts/profile-db), it will count your motif sequences in split sequences rather than contigs:
+
+
+anvi-search-sequence-motifs -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ --motifs ATCG,TAAAT \
+ --output-file motifs.txt
+
+
+And the output will look like this:
+
+|split_name|ATCG|TAAAT|
+|:--|:--:|:--:|
+|204_10M_contig_1720_split_00001|14|22|
+|204_10M_contig_1720_split_00002|2|6|
+|204_10M_contig_1720_split_00003|14|23|
+|204_10M_contig_1720_split_00004|8|18|
+|204_10M_contig_1720_split_00005|9|17|
+|204_10M_contig_1720_split_00006|19|28|
+|204_10M_contig_1720_split_00007|4|8|
+|204_10M_contig_1720_split_00008|31|32|
+|204_10M_contig_1720_split_00009|0|5|
+|204_10M_contig_6515_split_00001|7|5|
+|204_10M_contig_6515_split_00002|5|2|
+|204_10M_contig_6515_split_00003|5|4|
+|204_10M_contig_6515_split_00004|25|8|
+|204_10M_contig_6515_split_00005|6|2|
+|204_10M_contig_6515_split_00006|8|3|
+|204_10M_contig_6515_split_00007|3|3|
+|204_10M_contig_6515_split_00008|5|3|
+|204_10M_contig_878_split_00001|17|0|
+|204_10M_contig_878_split_00002|14|0|
+|204_10M_contig_878_split_00003|108|1|
+|204_10M_contig_878_split_00004|35|0|
+|204_10M_contig_878_split_00005|7|0|
+|204_10M_contig_878_split_00006|18|0|
+|204_10M_contig_878_split_00007|42|0|
+|204_10M_contig_878_split_00008|12|1|
+|204_10M_contig_878_split_00009|13|0|
+|204_10M_contig_878_split_00010|18|0|
+|204_10M_contig_878_split_00011|28|0|
+|204_10M_contig_878_split_00012|0|1|
+|204_10M_contig_878_split_00013|24|0|
+|204_10M_contig_878_split_00014|11|0|
+|204_10M_contig_878_split_00015|33|0|
+|204_10M_contig_878_split_00016|13|0|
+|204_10M_contig_878_split_00017|2|0|
+|204_10M_contig_878_split_00018|40|0|
+
+{:.notice}
+This output format may enable you to bin your splits based on their motif composition and use [anvi-import-collection](/help/8/programs/anvi-import-collection) to import them as a new collection into your profile database, or use [anvi-matrix-to-newick](/help/8/programs/anvi-matrix-to-newick) to cluster them based on this information to organize splits in the interface based on their motif composition.
+
+You can also store this information into your profile database using the flag `--store-in-db`. When you do that, running [anvi-interactive](/help/8/programs/anvi-interactive) on this profile database will include additional layers where these frequencies are displayed. Here is an example:
+
+
+[anvi-search-sequence-motifs](/help/8/programs/anvi-search-sequence-motifs) -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ --motifs ATCG,TAAAT \
+ --store-in-db
+
+
+And this is what things will look like in the interface:
+
+
+[anvi-interactive](/help/8/programs/anvi-interactive) -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db)
+
+
+[![motifs](../../images/layers_for_sequence_motifs.png){:.center-img .width-50}](../../images/layers_for_sequence_motifs.png)
+
+Layers for sequence motif frequencies are automatically colored in a shade of blue (although the user can change this through the [interactive](/help/8/artifacts/interactive) interface and/or through [state](/help/8/artifacts/state) files).
+
+## Contigs database + genes database
+
+Instead of a profile database, this program can also run on an anvi'o [genes-db](/help/8/artifacts/genes-db) and search sequence motifs for each gene rather than split or contig sequences.
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-search-sequence-motifs.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-search-sequence-motifs) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-search-sequence-motifs/network.json b/help/8/programs/anvi-search-sequence-motifs/network.json
new file mode 100644
index 00000000..7b8c473b
--- /dev/null
+++ b/help/8/programs/anvi-search-sequence-motifs/network.json
@@ -0,0 +1,82 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-items",
+ "name": "misc-data-items",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "misc-data-layers",
+ "name": "misc-data-layers",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genes-db",
+ "name": "genes-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 5,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-search-sequence-motifs",
+ "name": "anvi-search-sequence-motifs",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 5,
+ "target": 0
+ },
+ {
+ "source": 5,
+ "target": 1
+ },
+ {
+ "target": 5,
+ "source": 2
+ },
+ {
+ "target": 5,
+ "source": 3
+ },
+ {
+ "target": 5,
+ "source": 4
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-setup-cazymes/index.md b/help/8/programs/anvi-setup-cazymes/index.md
new file mode 100644
index 00000000..a06e05d9
--- /dev/null
+++ b/help/8/programs/anvi-setup-cazymes/index.md
@@ -0,0 +1,87 @@
+---
+layout: program
+title: anvi-setup-cazymes
+excerpt: An anvi'o program. Download and set up CAZyme HMM data from dbCAN2.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-setup-cazymes
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Download and set up CAZyme HMM data from dbCAN2.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+This program seems to know what it's doing. It needs no input material from its user. Good program.
+
+
+## Can provide
+
+
+[cazyme-data](../../artifacts/cazyme-data)
+
+
+## Usage
+
+
+This program **downloads and organizes a local copy of the data from [dbCAN2 CAZyme HMMs](https://bcb.unl.edu/dbCAN2/download/Databases/) for use in function annotation.** This program generates a [cazyme-data](/help/8/artifacts/cazyme-data) artifact, which is required to run the program [anvi-run-cazymes](/help/8/programs/anvi-run-cazymes).
+
+### Set up cazymes data
+
+anvi'o will download the newest version of the database (V11) by default:
+
+
+anvi-setup-cazymes
+
+
+You can use `--cazyme-version` if you want anvi'o to download a different version of the [dbCAN2 CAZyme HMMs](https://bcb.unl.edu/dbCAN2/download/Databases/) database:
+
+{:.warning}
+The following versions have been tested for download: V9, V10, V11
+
+
+anvi-setup-cazymes --cazyme-version V10
+
+
+By default, this data is stored at `anvio/data/misc/CAZyme/`. To set up this data in a non-default location, run:
+
+
+anvi-setup-cazymes --cazyme-data-dir path/to/location
+
+
+If you already have a [cazyme-data](/help/8/artifacts/cazyme-data) artifact and are trying to re-download this data, run:
+
+
+anvi-setup-cazymes --reset
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-setup-cazymes.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-setup-cazymes) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-setup-cazymes/network.json b/help/8/programs/anvi-setup-cazymes/network.json
new file mode 100644
index 00000000..a7263f97
--- /dev/null
+++ b/help/8/programs/anvi-setup-cazymes/network.json
@@ -0,0 +1,30 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "cazyme-data",
+ "name": "cazyme-data",
+ "provided_by_anvio": true,
+ "type": "DATA"
+ },
+ {
+ "size": 1,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-setup-cazymes",
+ "name": "anvi-setup-cazymes",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 1,
+ "target": 0
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-setup-interacdome/index.md b/help/8/programs/anvi-setup-interacdome/index.md
new file mode 100644
index 00000000..60148ed5
--- /dev/null
+++ b/help/8/programs/anvi-setup-interacdome/index.md
@@ -0,0 +1,86 @@
+---
+layout: program
+title: anvi-setup-interacdome
+excerpt: An anvi'o program. Setup InteracDome data.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-setup-interacdome
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Setup InteracDome data.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+This program seems to know what it's doing. It needs no input material from its user. Good program.
+
+
+## Can provide
+
+
+[interacdome-data](../../artifacts/interacdome-data)
+
+
+## Usage
+
+
+
+This program (much like all of the other programs that begin with `anvi-setup`) sets up a local copy of the InteracDome database for [anvi-run-interacdome](/help/8/programs/anvi-run-interacdome) as well as a local copy of Pfam v31.0, which is what InteracDome is defined for. Note that anvi'o only needs this program to be run once.
+
+
+Specifically, this downloads [InteracDome](https://interacdome.princeton.edu/)'s [tab-separated files](https://interacdome.princeton.edu/#tab-6136-4) and the Pfam v31.0 HMM profiles for the Pfams in your InteracDome data. This data is stored in the [interacdome-data](/help/8/artifacts/interacdome-data) artifact.
+
+
+It's easy as 1-2-3:
+
+
+anvi-setup-interacdome
+
+
+When running this program, you can provide a path to store your InteracDome data in. The default path is `anvio/data/misc/InteracDome`; if you use a custom path, you will have to provide it to [anvi-run-interacdome](/help/8/programs/anvi-run-interacdome) with the same parameter. Here is an example run:
+
+
+
+anvi-setup-interacdome --interacdome-data-dir path/to/directory
+
+
+If you want to overwrite any data that you have already downloaded (for example if you suspect something went wrong in the download), add the `--reset` flag:
+
+
+anvi-setup-interacdome --interacdome-data-dir path/to/directory \
+ --reset
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-setup-interacdome.md) to update this information.
+
+
+## Additional Resources
+
+
+* [The setup step in the InteracDome technical blogpost](http://merenlab.org/2020/07/22/interacdome/#anvi-setup-interacdome)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-setup-interacdome) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-setup-interacdome/network.json b/help/8/programs/anvi-setup-interacdome/network.json
new file mode 100644
index 00000000..e74cdb72
--- /dev/null
+++ b/help/8/programs/anvi-setup-interacdome/network.json
@@ -0,0 +1,30 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "interacdome-data",
+ "name": "interacdome-data",
+ "provided_by_anvio": true,
+ "type": "DATA"
+ },
+ {
+ "size": 1,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-setup-interacdome",
+ "name": "anvi-setup-interacdome",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 1,
+ "target": 0
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-setup-kegg-data/index.md b/help/8/programs/anvi-setup-kegg-data/index.md
new file mode 100644
index 00000000..8775958e
--- /dev/null
+++ b/help/8/programs/anvi-setup-kegg-data/index.md
@@ -0,0 +1,300 @@
+---
+layout: program
+title: anvi-setup-kegg-data
+excerpt: An anvi'o program. Download and setup various databases from KEGG.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-setup-kegg-data
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Download and setup various databases from KEGG.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+This program seems to know what it's doing. It needs no input material from its user. Good program.
+
+
+## Can provide
+
+
+[kegg-data](../../artifacts/kegg-data) [modules-db](../../artifacts/modules-db)
+
+
+## Usage
+
+
+[anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data) downloads and organizes data from KEGG for use by other programs, namely [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams), [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) and [anvi-reaction-network](/help/8/programs/anvi-reaction-network). Depending on what download mode you choose, it can download and setup one or more of the following:
+
+- HMM profiles from the [KOfam](https://academic.oup.com/bioinformatics/article/36/7/2251/5631907) database
+- metabolic pathway information from [KEGG MODULES](https://www.genome.jp/kegg/module.html)
+- functional classification information from [KEGG BRITE](https://www.genome.jp/kegg/brite.html)
+- protein family information of the [KEGG Orthology database](https://www.genome.jp/kegg/ko.html)
+
+ Typically, some processing is done following the data download to make the data work with downstream anvi'o programs. The KOfam profiles are prepared for later use by the HMMER software, and the information from MODULES and BRITE is made accessible to other anvi'o programs as a [modules-db](/help/8/artifacts/modules-db). The Orthology data is converted into a nice table that can be utilized by [anvi-reaction-network](/help/8/programs/anvi-reaction-network). This program generates a directory with these files ([kegg-data](/help/8/artifacts/kegg-data)).
+
+## Choosing a download mode
+
+You need to pick a mode to work with this program to control which data will be downloaded from KEGG. You can see the available modes by running the following command:
+
+
+anvi-setup-kegg-data --list-modes
+
+
+You use the `--mode` parameter to tell the program which mode you want, for example:
+
+
+anvi-setup-kegg-data --mode modules
+
+
+
+## Default usage: downloading a KEGG snapshot
+
+If you do not provide any arguments to this program, all KEGG data (i.e., `--mode all`) will be set up in the default KEGG data directory.
+
+
+anvi-setup-kegg-data
+
+
+### How does it work?
+
+By default, this program downloads a snapshot of the KEGG databases, already converted into an anvi'o-compatible format. The snapshot is a `.tar.gz` archive of a KEGG data directory that was (usually) generated around the time of the latest anvi'o release.
+
+After the default KEGG archive is downloaded, it is unpacked, checked to confirm that all the expected files are present, and moved into the KEGG data directory.
+
+### Why is this the default?
+
+Doing it this way ensures that almost everyone uses the same version of KEGG data, which is good for reproducibility and makes it easy to share annotated datasets. The KEGG resources are updated fairly often, and we found that constantly keeping the KEGG data directory in sync with them was not ideal, because every time the data directory is updated, you have to update the KOfam annotations in all your contigs databases to keep them compatible with the current [modules-db](/help/8/artifacts/modules-db) (unless you were smart enough to keep the old version of the KEGG data directory around somewhere). And of course that introduces a new nightmare as soon as you want to share datasets with your collaborators who do not have the same KEGG data directory version as you. With everyone using the same [kegg-data](/help/8/artifacts/kegg-data) by default, we can avoid these issues.
+
+But the trade-off to this is that the default KEGG data version is tied to an anvi'o release, and it will not always include the most up-to-date information from KEGG. Luckily, **for those who want the most updated version of KEGG, you can still use this program to generate the KEGG data directory by downloading directly from KEGG** (see 'Getting the most up-to-date KEGG data' section below).
+
+{:.warning}
+BRITE hierarchy data is not included in the default KEGG snapshot for anvi'o `v7`. Starting from the `v7.1-dev` version of anvi'o, there is a new default KEGG snapshot including BRITE information. If you are missing this data, it can be acquired by either installing a later snapshot or by independently downloading it with this program using `--mode modules`.
+
+{:.warning}
+The data for metabolic modeling are not included in the KEGG snapshots created before anvi'o `v8`. If you are missing this data, it can be acquired by either installing a later snapshot or by independently downloading it with this program using `--mode modeling`.
+
+### Set up KEGG data in a non-default location
+
+You can specify a different directory in which to put this data, if you wish:
+
+
+anvi-setup-kegg-data --kegg-data-dir /path/to/directory/KEGG
+
+
+This is helpful if you don't have write access to the default directory location, or if you want to keep several different versions of the KEGG data on your computer. Just remember that when you want to use this specific KEGG data directory with later programs such as [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams), you will have to specify its location with the `--kegg-data-dir` flag.
+
+### Setting up an earlier KEGG snapshot
+
+By default, the KEGG snapshot that will be installed is the latest one, which is up-to-date with your current version of anvi'o. If, however, you want a snapshot from an earlier version, you can run something like the following to get it:
+
+
+anvi-setup-kegg-data --kegg-data-dir /path/to/directory/KEGG \
+ --kegg-snapshot v2020-04-27
+
+
+Just keep in mind that you may need to migrate the MODULES.db from these earlier versions in order to make it compatible with the current metabolism code. Anvi'o will tell you if you need to do this.
+
+Not sure what KEGG snapshots are available for you to request? Well, you could check out the YAML file at `anvio/anvio/data/misc/KEGG-SNAPSHOTS.yaml` in your anvi'o directory, or you could just give something random to the `--kegg-snapshot` parameter and watch anvi'o freak out and tell you what is available:
+
+anvi-setup-kegg-data --kegg-snapshot hahaha
+
+
+
+## Getting the most up-to-date KEGG data: downloading directly from KEGG
+
+This program is also capable of downloading data directly from KEGG and converting it into an anvi'o-compatible format. In fact, this is how we generate the default KEGG archive. If you want the latest KEGG data instead of the default snapshot of KEGG, try the following:
+
+
+anvi-setup-kegg-data --download-from-kegg
+
+
+Please note that this will download all the KEGG data (i.e., `--mode all` is the default). If you want to independently download individual KEGG datasets, you should pick one of the other modes (the `--download-from-kegg` flag is implicitly turned on in these modes).
+
+### How does it work?
+
+KOfam profiles are downloadable from KEGG's [FTP site](ftp://ftp.genome.jp/pub/db/kofam/) and all other KEGG data is accessible as flat text files through their [API](https://www.kegg.jp/kegg/rest/keggapi.html). When you run this program it will first get all the files that it needs from these sources, and then it will process them by doing the following:
+
+- determine if any KOfam profiles are missing bitscore thresholds, and remove those from the standard profile location so that they are not used for annotation (if you want to see these, you will find them in the `orphan_data` folder in your KEGG data directory)
+- concatenate all remaining KOfam profiles into one file and run `hmmpress` on them
+- parse the flat text file for each KEGG module and the JSON file for each BRITE hierarchy
+- store the MODULE and BRITE information in the [modules-db](/help/8/artifacts/modules-db)
+- parse the flat text files from KEGG Orthology and organize these into a table for metabolic modeling
+
+An important thing to note about this option is that it has rigid expectations for the format of the KEGG data that it works with. Future updates to KEGG may break things such that the data can no longer be directly obtained from KEGG or properly processed. In the sad event that this happens, you will have to download KEGG from one of our archives instead.
+
+### The --only-download option
+
+The `--only-download` flag works for `KOfam` mode and `modules` mode.
+
+Suppose you only want to download data from KEGG without processing it. For instance, perhaps you don't need a [modules-db](/help/8/artifacts/modules-db) or you don't want `hmmpress` to be run on the KOfam profiles. You can instruct this program to stop after downloading by providing the `--only-download` flag:
+
+
+anvi-setup-kegg-data --mode modules \
+ --only-download \
+ --kegg-data-dir /path/to/directory/KEGG
+
+
+It's probably a good idea in this case to specify where you want this data to go using `--kegg-data-dir`, to make sure you can find it later.
+
+{:.notice}
+This option is primarily useful for developers to test `anvi-setup-kegg-data` - for instance, so that you can download the data once and run the database setup option (`--only-processing`) multiple times. However, if non-developers find another practical use-case for this flag, we'd be happy to add those ideas here. Send us a message, or feel free to edit this file and submit a pull request with your changes to the anvi'o GitHub repository. :)
+
+### The --only-processing option
+
+The `--only-processing` flag works for `KOfam` mode and `modules` mode.
+
+Let's say you already have KEGG data on your computer that you got by running this program with the `--only-download` flag. Now you want to process the HMM files, or turn the MODULES data into a [modules-db](/help/8/artifacts/modules-db). To do that, run this program using the `--only-processing` flag and provide the location of the pre-downloaded KEGG data:
+
+
+anvi-setup-kegg-data --mode modules \
+ --only-processing \
+ --kegg-data-dir /path/to/directory/KEGG
+
+
+{:.notice}
+The KEGG data that you already have on your computer has to be in the format expected by this program, or you'll run into errors. Pretty much the only reasonable way to get the data into the proper format is to run this program with the `--only-download` option. Otherwise you would have to go through a lot of manual file-changing shenanigans - possible, but not advisable.
+
+One more note: since this flag is most often used for testing the database setup capabilities of this program, which entails running `anvi-setup-kegg-data --mode modules --only-processing` multiple times on the same KEGG data directory, there is an additional flag that may be useful in this context. To avoid having to manually delete the created modules database each time you run, you can use the `--overwrite-output-destinations` flag:
+
+
+anvi-setup-kegg-data --mode modules \
+ --only-processing \
+ --kegg-data-dir /path/to/directory/KEGG \
+ --overwrite-output-destinations
+
+
+### Avoiding BRITE setup
+
+As of anvi'o `v7.1-dev`, KEGG BRITE hierarchies are added to the [modules-db](/help/8/artifacts/modules-db) when running this program with `--mode modules`. If you don't want this cool new feature - because you are a rebel, or averse to change, or something is not working on your computer, whatever - then fine. You can use the `--skip-brite-hierarchies` flag:
+
+
+anvi-setup-kegg-data --mode modules --skip-brite-hierarchies
+
+
+Hopefully it makes sense to you that this flag does not work when setting up from a KEGG snapshot that already includes BRITE data in it.
+
+### How do I share this data?
+Suppose you have been living on the edge and annotating your contigs databases with a non-default version of [kegg-data](/help/8/artifacts/kegg-data), and you share these databases with a collaborator who wants to run downstream programs like [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) on them. Your collaborator (who has a different version of [kegg-data](/help/8/artifacts/kegg-data) on their computer) will likely get version errors as detailed on the [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) help page.
+
+In order for your collaborator to be able to work with your dataset, they need to have the same [kegg-data](/help/8/artifacts/kegg-data) version as you did when you ran [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams). If you are very lucky and KEGG has not been updated since you set up your [kegg-data](/help/8/artifacts/kegg-data), they may be able to run `anvi-setup-kegg-data -D` to get it. But if not, there are a few options for you to share your version of [kegg-data](/help/8/artifacts/kegg-data):
+
+1. You could send them your KEGG data directory. First, run `tar -czvf kegg_archive.tar.gz ./KEGG` on the data directory to compress and archive it before sending it over (this command _must_ be run from its parent directory so that the archive has the expected directory structure when it is unpacked). Then your collaborator can just run `anvi-setup-kegg-data --kegg-archive kegg_archive.tar.gz --kegg-data-dir ./KEGG_ARCHIVE` and be good to go. They would just have to use `--kegg-data-dir ./KEGG_ARCHIVE` when running downstream programs. The problem here is that even the archived [kegg-data](/help/8/artifacts/kegg-data) is quite large, ~4-5GB, and may be unfeasible for you to send.
+2. You could share with your collaborator just the [modules-db](/help/8/artifacts/modules-db). If all they want to do is to run [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) on databases annotated by your version of the KEGG data directory, this should be all they need. They would need to pass the folder containing your [modules-db](/help/8/artifacts/modules-db) to [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) using the `--kegg-data-dir` parameter.
+3. If your collaborator also wants to be able to annotate other databases with your version of [kegg-data](/help/8/artifacts/kegg-data), then they need to have the KOfam profiles as well. You can send them your [modules-db](/help/8/artifacts/modules-db) and have them download the KOfam profiles most similar to the ones you have from the [KOfam archives](https://www.genome.jp/ftp/db/kofam/archives/) (which are labeled by date). Then they would have to essentially construct their own KEGG data directory by copying the structure of the default one and putting the downloaded files (and the [modules-db](/help/8/artifacts/modules-db) you sent them) into the correct locations. The KOfam profiles must be concatenated into a `Kofam.hmm` file and `hmmpress` must be run on that file to generate the required indices for `hmmsearch`. Your collaborator must also have the `ko_list.txt` file (which _should_ be downloaded with the profiles) in the right spot. Then they could pass their makeshift KEGG data directory to [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) using `--kegg-data-dir`, and they should be golden. (A word of warning: they may want to remove KOs without bitscore thresholds in the `ko_list.txt` before concatenating the profiles, otherwise they will likely get a lot of weak hits for these KOs.)
+
+## I already have a KEGG snapshot: set up from a pre-downloaded archive file
+
+If you have an archive (`.tar.gz`) of the KEGG data directory already on your computer (perhaps a colleague or Meren Lab developer gave you one), you can set up KEGG from this archive instead:
+
+
+anvi-setup-kegg-data --kegg-archive KEGG_archive.tar.gz
+
+
+This works the same way as the default, except that it bypasses the download step and instead uses the archive file you have provided with `--kegg-archive`.
+
+## Info for developers: making a new KEGG snapshot available to all anvi'o users
+
+Periodically (especially before releasing a new version of anvi'o), we want to add new KEGG database snapshots to anvi'o so that users can have more up-to-date KEGG data without having to use the `--download-from-kegg` option. In this section you will find the instructions for doing this (these instructions are also in the comments of the `anvio/data/misc/KEGG-SNAPSHOTS.yaml` file).
+
+Available KEGG snapshots are stored in the anvi'o code repository in `anvio/data/misc/KEGG-SNAPSHOTS.yaml`. To add a new snapshot, you first need to create one by downloading and processing the data from KEGG, testing to make sure it works, and then updating this file. Here are the steps:
+
+1. Download the latest data directly from KEGG by running `anvi-setup-kegg-data -D --kegg-data-dir ./KEGG -T 5`. This will create the new KEGG data folder with its [modules-db](/help/8/artifacts/modules-db) in your current working directory. Make sure you use the exact folder name of `./KEGG`, because that is what anvi'o expects to find when it unpacks a KEGG snapshot. You may want to reduce or increase the number of threads (`-T`) according to your available compute resources.
+2. Get the hash value and version info from the MODULES.db by running `anvi-db-info ./KEGG/MODULES.db`.
+3. Archive the KEGG data directory by running `tar -czvf KEGG_build_YYYY-MM-DD_HASH.tar.gz ./KEGG`. Please remember to replace YYYY-MM-DD with the current date and replace HASH with the MODULES.db hash value obtained in step 2. This convention makes it easier to distinguish between KEGG snapshots by simply looking at the file name.
+4. Test that setup works with this archive by running `anvi-setup-kegg-data --kegg-archive KEGG_build_YYYY-MM-DD_HASH.tar.gz --kegg-data-dir TEST_NEW_KEGG_ARCHIVE`.
+5. If setup worked in the last step without errors, upload the `.tar.gz` archive to [Figshare](https://figshare.com/). If you need inspiration for filling out the keywords, categories, and description fields for the archive, you can check the previous KEGG snapshots that have been uploaded - for instance, [this one](https://figshare.com/articles/dataset/KEGG_build_2023-01-10/21862494) or [this one](https://figshare.com/articles/dataset/KEGG_build_2022-04-14/19601761). At minimum, we typically indicate the database version and hash value, and an example setup command (ie, the one from step 4), in the description of the dataset. Once the archive is published on Figshare (warning: this usually takes a while due to the large file size), you can get the download url of the archive by right-clicking on the Download button and copying the address, which should be a URL with a format similar to this example (but different numbers): `https://figshare.com/ndownloader/files/34817812`
+6. Add an entry to the bottom of the `anvio/data/misc/KEGG-SNAPSHOTS.yaml` file with the Figshare download URL, archive name, and MODULES.db hash and version. If you want this to become the default snapshot (which usually only changes before the next anvi'o release), you should also update the default `self.target_snapshot` variable in `anvio/kegg.py` to be this latest version that you have added.
+7. Test it by running `anvi-setup-kegg-data --kegg-data-dir TEST_NEW_KEGG`, and if it works you are done, and can push your changes to the anvi'o repository. :)
+
+## Downloading generic KEGG data in Python
+
+If you want to get some data from the KEGG website that is not included in our default download (or, if you only want a subset of that data without going through the whole setup process), you can use the anvi'o API to utilize our download functions. Here are some examples for using the `KeggSetup` class (for example, in the Python interpreter):
+
+### Loading the `KeggSetup` class
+
+`KeggSetup` is the class for downloading KEGG data (using KEGG's API). To use it in Python, you need to load the `kegg` module from anvi'o. When using it this way, we recommend skipping a variety of sanity checks using the `skip_init` parameter - this is mainly so that the class doesn't check for, remove, or complain about existing KEGG data on your computer.
+
+```python
+import anvio
+import argparse
+from anvio import kegg
+args = argparse.Namespace(reset=False)
+setup = kegg.KeggSetup(args, skip_init=True)
+```
+
+Once you have this class loaded, you can use its functions for a variety of download and processing tasks. We'll show some examples below.
+
+### Downloading all flat files associated with a KEGG hierarchy
+
+ The following example demonstrates the download of all KEGG COMPOUND files belonging to the BRITE hierarchy with accession `br08001`. Note that if you do not specify a download directory, the files will by default be downloaded to the current working directory.
+
+ ```python
+setup.download_kegg_files_from_hierarchy('br08001', download_dir='KEGG_COMPOUND')
+ ```
+
+ ### Downloading a hierarchical text file
+
+ If you just want to get a KEGG `htext` file (with extension `.keg`), use the following function:
+
+ ```python
+setup.download_generic_htext('br08001', download_dir='KEGG_COMPOUND')
+ ```
+
+ ### Processing a hierarchical text file
+
+ We have a few functions for reading KEGG's `htext` files. If all you want is a list of the accessions involved in this hierarchy (for instance, all compounds in a BRITE hierarchy for KEGG COMPOUND), use this one (the argument should be the path to the `htext` file):
+
+ ```python
+accession_list = setup.get_accessions_from_htext_file("br08001.keg")
+ ```
+
+ If you want to process the KEGG module `htext` file to get a dictionary of all modules and their names, classes, and so on, use the following code (it reuses the `args` namespace created when loading the `KeggSetup` class). You will need to set the `kegg_module_file` attribute of the `ModulesDownload` class to point to the location of the `modules.keg` file, and the function will store the module dictionary in the `module_dict` attribute.
+
+ ```python
+modules_setup = kegg.ModulesDownload(args)
+modules_setup.kegg_module_file = "modules.keg"
+modules_setup.process_module_file()
+modules_setup.module_dict # this attribute now stores the module dictionary
+ ```
+
+ ### Downloading a flat file using the KEGG API
+
+ Here is a wrapper function that will 'get' a flat file with the KEGG API. You can provide this function with the accession of the data you want (for instance, a module accession), and optionally a directory to download it into.
+
+ ```python
+setup.download_generic_flat_file('C00058', download_dir='KEGG_COMPOUND')
+ ```
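+
+For readers curious what a 'get' request to the KEGG API looks like outside of anvi'o, here is a minimal sketch using only the Python standard library. It is not the anvi'o implementation: the helper names `kegg_get_url` and `fetch_flat_file` are hypothetical, and the sketch assumes the public KEGG REST endpoint at `https://rest.kegg.jp/get/`.
+
+```python
+import urllib.request
+from pathlib import Path
+
+# Assumed base URL of the public KEGG REST 'get' operation,
+# which returns a single database entry as a flat text file.
+KEGG_GET_URL = "https://rest.kegg.jp/get"
+
+
+def kegg_get_url(accession):
+    """Build the REST URL for one KEGG entry (hypothetical helper)."""
+    return f"{KEGG_GET_URL}/{accession}"
+
+
+def fetch_flat_file(accession, download_dir="."):
+    """Download one flat file into download_dir and return its path.
+
+    Hypothetical helper for illustration; it needs network access.
+    """
+    target_dir = Path(download_dir)
+    target_dir.mkdir(parents=True, exist_ok=True)
+    target = target_dir / accession
+    with urllib.request.urlopen(kegg_get_url(accession)) as response:
+        target.write_bytes(response.read())
+    return target
+```
+
+Calling `fetch_flat_file('C00058', download_dir='KEGG_COMPOUND')` would mirror the anvi'o call above, but the anvi'o wrapper remains the recommended route since it is what the rest of the setup code expects.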
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-setup-kegg-data.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-setup-kegg-data) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-setup-kegg-data/network.json b/help/8/programs/anvi-setup-kegg-data/network.json
new file mode 100644
index 00000000..11807559
--- /dev/null
+++ b/help/8/programs/anvi-setup-kegg-data/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "kegg-data",
+ "name": "kegg-data",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "modules-db",
+ "name": "modules-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-setup-kegg-data",
+ "name": "anvi-setup-kegg-data",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "source": 2,
+ "target": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-setup-modelseed-database/index.md b/help/8/programs/anvi-setup-modelseed-database/index.md
new file mode 100644
index 00000000..f69523e7
--- /dev/null
+++ b/help/8/programs/anvi-setup-modelseed-database/index.md
@@ -0,0 +1,82 @@
+---
+layout: program
+title: anvi-setup-modelseed-database
+excerpt: An anvi'o program. This program downloads and sets up the ModelSEED Biochemistry database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-setup-modelseed-database
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+This program downloads and sets up the ModelSEED Biochemistry database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[functions](../../artifacts/functions)
+
+
+## Can provide
+
+
+[reaction-ref-data](../../artifacts/reaction-ref-data)
+
+
+## Usage
+
+
+This program **downloads and sets up the latest version of the ModelSEED Biochemistry database.**
+
+[The ModelSEED Biochemistry database](https://github.com/ModelSEED/ModelSEEDDatabase) consists of two tab-delimited files of reaction and compound data, respectively, and is valuable due to harmonization of IDs and properties from multiple reference databases commonly used in metabolic modeling.
+
+[anvi-reaction-network](/help/8/programs/anvi-reaction-network) relies upon ModelSEED Biochemistry in conjunction with the KEGG Orthology database. [KEGG Orthology (KO)](https://www.genome.jp/kegg/ko.html) protein annotations of genes are associated with predicted enzymatic reactions. These KEGG reactions are cross-referenced to the ModelSEED Biochemistry database to retrieve information on properties including reaction stoichiometry and reversibility. [anvi-reaction-network](/help/8/programs/anvi-reaction-network) stores the predicted reactions and metabolites in the [contigs-db](/help/8/artifacts/contigs-db) for the genome. The program [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data) sets up the requisite KO database.
+
+## Usage
+
+The simplest [anvi-setup-modelseed-database](/help/8/programs/anvi-setup-modelseed-database) command sets up the database in the default anvi'o ModelSEED data directory.
+
+
+anvi-setup-modelseed-database
+
+
+A custom directory can be provided instead. Within the provided directory, a subdirectory named `ModelSEED` is created for storage of the database.
+
+
+anvi-setup-modelseed-database --dir /path/to/dir
+
+
+Finally, in conjunction with either of the previous commands, the `--reset` flag can be used to delete any existing target database directory and its contents before setting up the latest version of the ModelSEED Biochemistry database there.
+
+
+anvi-setup-modelseed-database --reset
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-setup-modelseed-database.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-setup-modelseed-database) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-setup-modelseed-database/network.json b/help/8/programs/anvi-setup-modelseed-database/network.json
new file mode 100644
index 00000000..5557a0c4
--- /dev/null
+++ b/help/8/programs/anvi-setup-modelseed-database/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "reaction-ref-data",
+ "name": "reaction-ref-data",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "functions",
+ "name": "functions",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-setup-modelseed-database",
+ "name": "anvi-setup-modelseed-database",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-setup-ncbi-cogs/index.md b/help/8/programs/anvi-setup-ncbi-cogs/index.md
new file mode 100644
index 00000000..8241cda7
--- /dev/null
+++ b/help/8/programs/anvi-setup-ncbi-cogs/index.md
@@ -0,0 +1,69 @@
+---
+layout: program
+title: anvi-setup-ncbi-cogs
+excerpt: An anvi'o program. Download and set up NCBI's Clusters of Orthologous Groups database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-setup-ncbi-cogs
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Download and set up NCBI's Clusters of Orthologous Groups database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+This program seems to know what it's doing. It needs no input material from its user. Good program.
+
+
+## Can provide
+
+
+[cogs-data](../../artifacts/cogs-data)
+
+
+## Usage
+
+
+This program **downloads and organizes a local copy of the data from NCBI's [COGs database](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC102395/) for use in function annotation.** This program generates a [cogs-data](/help/8/artifacts/cogs-data) artifact, which is required to run the program [anvi-run-ncbi-cogs](/help/8/programs/anvi-run-ncbi-cogs).
+
+### Set up COGs data
+
+anvi-setup-ncbi-cogs --just-do-it
+
+
+If you already have a [cogs-data](/help/8/artifacts/cogs-data) artifact and are trying to redownload this data, run
+
+
+anvi-setup-ncbi-cogs --reset
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-setup-ncbi-cogs.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-setup-ncbi-cogs) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-setup-ncbi-cogs/network.json b/help/8/programs/anvi-setup-ncbi-cogs/network.json
new file mode 100644
index 00000000..1b44f97d
--- /dev/null
+++ b/help/8/programs/anvi-setup-ncbi-cogs/network.json
@@ -0,0 +1,30 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "cogs-data",
+ "name": "cogs-data",
+ "provided_by_anvio": true,
+ "type": "DATA"
+ },
+ {
+ "size": 1,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-setup-ncbi-cogs",
+ "name": "anvi-setup-ncbi-cogs",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 1,
+ "target": 0
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-setup-pdb-database/index.md b/help/8/programs/anvi-setup-pdb-database/index.md
new file mode 100644
index 00000000..eb6aa965
--- /dev/null
+++ b/help/8/programs/anvi-setup-pdb-database/index.md
@@ -0,0 +1,88 @@
+---
+layout: program
+title: anvi-setup-pdb-database
+excerpt: An anvi'o program. Set up or update an offline database of representative PDB structures clustered at 95% sequence similarity.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-setup-pdb-database
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Set up or update an offline database of representative PDB structures clustered at 95% sequence similarity.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+This program seems to know what it's doing. It needs no input material from its user. Good program.
+
+
+## Can provide
+
+
+[pdb-db](../../artifacts/pdb-db)
+
+
+## Usage
+
+
+
+## Basic usage
+
+This program creates a [pdb-db](/help/8/artifacts/pdb-db) local database that holds PDB structures from [this sequence database](https://salilab.org/modeller/supplemental.html), which is hosted by the [Sali lab](https://salilab.org/). Their database comprises all PDB RCSB sequences that have been clustered at 95% sequence similarity. They seem to update their database every couple of months (thank you guys!).
+
+
+The purpose of [anvi-setup-pdb-database](/help/8/programs/anvi-setup-pdb-database) is to have a local copy of reference structures that can be used to, for example, get template structures for homology modelling when [anvi-gen-structure-database](/help/8/programs/anvi-gen-structure-database) is run.
+
+
+Running this program is easy:
+
+
+anvi-setup-pdb-database --just-do-it
+
+
+If you already have a [pdb-db](/help/8/artifacts/pdb-db) artifact and are trying to redownload this data, run
+
+
+anvi-setup-pdb-database --reset
+
+
+Or if you just want to update your database, run
+
+
+anvi-setup-pdb-database --update
+
+
+## Notes
+
+The output [pdb-db](/help/8/artifacts/pdb-db) database is ~20GB and its contents may take several hours to download.
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-setup-pdb-database.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-setup-pdb-database) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-setup-pdb-database/network.json b/help/8/programs/anvi-setup-pdb-database/network.json
new file mode 100644
index 00000000..3445b40b
--- /dev/null
+++ b/help/8/programs/anvi-setup-pdb-database/network.json
@@ -0,0 +1,30 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pdb-db",
+ "name": "pdb-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-setup-pdb-database",
+ "name": "anvi-setup-pdb-database",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 1,
+ "target": 0
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-setup-pfams/index.md b/help/8/programs/anvi-setup-pfams/index.md
new file mode 100644
index 00000000..cae389ac
--- /dev/null
+++ b/help/8/programs/anvi-setup-pfams/index.md
@@ -0,0 +1,76 @@
+---
+layout: program
+title: anvi-setup-pfams
+excerpt: An anvi'o program. Download and set up Pfam data from the EBI.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-setup-pfams
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Download and set up Pfam data from the EBI.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+This program seems to know what it's doing. It needs no input material from its user. Good program.
+
+
+## Can provide
+
+
+[pfams-data](../../artifacts/pfams-data)
+
+
+## Usage
+
+
+This program **downloads and organizes a local copy of the data from EBI's [Pfam database](https://pfam.xfam.org/) for use in function annotation.** This program generates a [pfams-data](/help/8/artifacts/pfams-data) artifact, which is required to run the program [anvi-run-pfams](/help/8/programs/anvi-run-pfams).
+
+### Set up Pfams data
+
+anvi-setup-pfams
+
+
+By default, this data is stored at `anvio/data/misc/Pfam`. To set up this data in a non-default location, run
+
+anvi-setup-pfams --pfam-data-dir path/to/location
+
+
+If you already have a [pfams-data](/help/8/artifacts/pfams-data) artifact and are trying to redownload this data, run
+
+
+anvi-setup-pfams --reset
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-setup-pfams.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-setup-pfams) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-setup-pfams/network.json b/help/8/programs/anvi-setup-pfams/network.json
new file mode 100644
index 00000000..2a7f900b
--- /dev/null
+++ b/help/8/programs/anvi-setup-pfams/network.json
@@ -0,0 +1,30 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pfams-data",
+ "name": "pfams-data",
+ "provided_by_anvio": true,
+ "type": "DATA"
+ },
+ {
+ "size": 1,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-setup-pfams",
+ "name": "anvi-setup-pfams",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 1,
+ "target": 0
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-setup-scg-taxonomy/index.md b/help/8/programs/anvi-setup-scg-taxonomy/index.md
new file mode 100644
index 00000000..5e78078d
--- /dev/null
+++ b/help/8/programs/anvi-setup-scg-taxonomy/index.md
@@ -0,0 +1,80 @@
+---
+layout: program
+title: anvi-setup-scg-taxonomy
+excerpt: An anvi'o program. The purpose of this program is to download necessary information from GTDB (https://gtdb.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-setup-scg-taxonomy
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+The purpose of this program is to download necessary information from GTDB (https://gtdb.ecogenomic.org/), and set it up in such a way that your anvi'o installation is able to assign taxonomy to single-copy core genes using `anvi-run-scg-taxonomy` and estimate taxonomy for genomes or metagenomes using `anvi-estimate-scg-taxonomy`.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+This program seems to know what it's doing. It needs no input material from its user. Good program.
+
+
+## Can provide
+
+
+[scgs-taxonomy-db](../../artifacts/scgs-taxonomy-db)
+
+
+## Usage
+
+
+This program **downloads and sets up the search databases used for the scg-taxonomy workflow** (from [GTDB](https://gtdb.ecogenomic.org/)) so that you can run [anvi-run-scg-taxonomy](/help/8/programs/anvi-run-scg-taxonomy) and [anvi-estimate-scg-taxonomy](/help/8/programs/anvi-estimate-scg-taxonomy). This program generates a [scgs-taxonomy-db](/help/8/artifacts/scgs-taxonomy-db) artifact, which is required to run both of those programs.
+
+For more information on that workflow, check out [this page](http://merenlab.org/2019/10/08/anvio-scg-taxonomy/).
+
+You will only have to run this program once per anvi'o installation.
+
+Why is this not done by default? It just makes things easier downstream to build these databases with the version of DIAMOND installed on your computer to avoid incompatibility issues. Besides, it should take under a minute and is as simple as running
+
+
+anvi-setup-scg-taxonomy
+
+
+If you have already run this program and are trying to redownload this data, run
+
+
+anvi-setup-scg-taxonomy --reset
+
+
+You can also download a specific release of this database by providing its URL with the flag `--scg-taxonomy-remote-database-url`.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-setup-scg-taxonomy.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Usage examples and warnings](http://merenlab.org/scg-taxonomy)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-setup-scg-taxonomy) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-setup-scg-taxonomy/network.json b/help/8/programs/anvi-setup-scg-taxonomy/network.json
new file mode 100644
index 00000000..870a90d4
--- /dev/null
+++ b/help/8/programs/anvi-setup-scg-taxonomy/network.json
@@ -0,0 +1,30 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "scgs-taxonomy-db",
+ "name": "scgs-taxonomy-db",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-setup-scg-taxonomy",
+ "name": "anvi-setup-scg-taxonomy",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 1,
+ "target": 0
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-setup-trna-taxonomy/index.md b/help/8/programs/anvi-setup-trna-taxonomy/index.md
new file mode 100644
index 00000000..d74f1b00
--- /dev/null
+++ b/help/8/programs/anvi-setup-trna-taxonomy/index.md
@@ -0,0 +1,74 @@
+---
+layout: program
+title: anvi-setup-trna-taxonomy
+excerpt: An anvi'o program. The purpose of this program is to set up necessary databases for tRNA genes collected from GTDB (https://gtdb.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-setup-trna-taxonomy
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+The purpose of this program is to set up the necessary databases for tRNA genes collected from GTDB (https://gtdb.ecogenomic.org/) genomes in your local anvi'o installation, so that taxonomy information for a given set of tRNA sequences can be identified using `anvi-run-trna-taxonomy` and made sense of via `anvi-estimate-trna-taxonomy`.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+This program seems to know what it's doing. It needs no input material from its user. Good program.
+
+
+## Can provide
+
+
+[trna-taxonomy-db](../../artifacts/trna-taxonomy-db)
+
+
+## Usage
+
+
+This program downloads a local copy of a subset of the databases from [GTDB](https://gtdb.ecogenomic.org/) (stored in a [trna-taxonomy-db](/help/8/artifacts/trna-taxonomy-db)), so that tRNA sequences in your dataset can be associated with taxonomy information. You must run this program before you can run [anvi-run-trna-taxonomy](/help/8/programs/anvi-run-trna-taxonomy) or [anvi-estimate-trna-taxonomy](/help/8/programs/anvi-estimate-trna-taxonomy).
+
+Like other `anvi-setup-` programs, this only needs to be run once per anvi'o version. The default path is `anvio/data/misc/TRNA-TAXONOMY`. You can store the resulting [trna-taxonomy-db](/help/8/artifacts/trna-taxonomy-db) in a custom location if desired, but then you'll need to provide the path to it whenever you run [anvi-run-trna-taxonomy](/help/8/programs/anvi-run-trna-taxonomy).
+
+To run this program, you can simply run
+
+
+anvi-setup-trna-taxonomy
+
+
+If you are trying to redownload these databases, run:
+
+
+anvi-setup-trna-taxonomy --reset
+
+
+Alternatively, you can use `--redo-databases` if you just want to update the database version without redownloading the data.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-setup-trna-taxonomy.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-setup-trna-taxonomy) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-setup-trna-taxonomy/network.json b/help/8/programs/anvi-setup-trna-taxonomy/network.json
new file mode 100644
index 00000000..1e9b1fcf
--- /dev/null
+++ b/help/8/programs/anvi-setup-trna-taxonomy/network.json
@@ -0,0 +1,30 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "trna-taxonomy-db",
+ "name": "trna-taxonomy-db",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-setup-trna-taxonomy",
+ "name": "anvi-setup-trna-taxonomy",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 1,
+ "target": 0
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-setup-user-modules/index.md b/help/8/programs/anvi-setup-user-modules/index.md
new file mode 100644
index 00000000..ad5ee03f
--- /dev/null
+++ b/help/8/programs/anvi-setup-user-modules/index.md
@@ -0,0 +1,142 @@
+---
+layout: program
+title: anvi-setup-user-modules
+excerpt: An anvi'o program. Set up user-defined metabolic pathways into an anvi'o-compatible database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-setup-user-modules
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Set up user-defined metabolic pathways into an anvi'o-compatible database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[user-modules-data](../../artifacts/user-modules-data)
+
+
+## Can provide
+
+
+[modules-db](../../artifacts/modules-db) [user-modules-data](../../artifacts/user-modules-data)
+
+
+## Usage
+
+
+This program creates a [modules-db](/help/8/artifacts/modules-db) out of a set of user-defined metabolic modules, for use by [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism).
+
+It takes as input a directory containing module files for each user-defined module, formatted in the same way as KEGG modules are. It parses these modules into the `USER_MODULES.db` database. This directory of user-defined data is referred to as [user-modules-data](/help/8/artifacts/user-modules-data), and the help page for that artifact contains a detailed account of how to create your own module definitions and estimate their completeness.
+
+This page will give a few details specific to running [anvi-setup-user-modules](/help/8/programs/anvi-setup-user-modules).
+
+### Default Usage
+
+To run this program, you must provide an input directory containing your module definitions:
+
+
+anvi-setup-user-modules --user-modules /path/to/user/data/directory
+
+
+This input directory must have a specific format (see section below). The `USER_MODULES.db` will be generated in this directory, so you can use the same path to provide your data to [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) when you want to estimate completeness for these modules.
+
+### Input directory format
+
+The directory you provide to the `--user-modules` parameter must have another folder inside of it, which must be called `modules`. Inside that `modules` folder, you should put text files containing the definitions of your metabolic modules - one file per module. The file should be named according to the identifier you want the module to have, and should not have any extension.
+
+Here is an example schematic of a proper input directory:
+```
+MY_METABOLISM_DATA_DIR
+ |
+ |- modules
+     |- U00001
+     |- U00002
+     |- U00003
+     |- U00004
+```
+The `U0000x` files in the schematic above each contain a definition for one module. Running `anvi-setup-user-modules --user-modules MY_METABOLISM_DATA_DIR` will produce a `USER_MODULES.db` file in the `MY_METABOLISM_DATA_DIR` folder which contains 4 modules named U00001, U00002, U00003, and U00004 (assuming those files are formatted correctly).
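+
+If you would like to double-check the layout before running the program, the rules above can be sketched in a few lines of Python. This is only an illustration: the function `check_user_modules_dir` is a hypothetical helper, not part of anvi'o, and it checks nothing beyond the folder structure and file names.
+
+```python
+import os
+
+
+def check_user_modules_dir(data_dir):
+    """Return the sorted module identifiers found in data_dir/modules.
+
+    Hypothetical helper (not part of anvi'o). Raises ValueError when the
+    'modules' folder is missing or a module file has an extension.
+    """
+    modules_dir = os.path.join(data_dir, "modules")
+    if not os.path.isdir(modules_dir):
+        raise ValueError(f"expected a 'modules' folder inside {data_dir}")
+
+    module_ids = []
+    for name in sorted(os.listdir(modules_dir)):
+        path = os.path.join(modules_dir, name)
+        # Skip sub-folders and hidden files such as .DS_Store.
+        if not os.path.isfile(path) or name.startswith("."):
+            continue
+        if os.path.splitext(name)[1]:
+            raise ValueError(f"module file '{name}' should not have an extension")
+        module_ids.append(name)
+    return module_ids
+```
+
+For the schematic above, `check_user_modules_dir("MY_METABOLISM_DATA_DIR")` would return the four identifiers `U00001` through `U00004`.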
+
+### How do I format the module files?
+
+{:.notice}
+Check out [anvi-script-gen-user-module-file](/help/8/programs/anvi-script-gen-user-module-file) for a way to automatically format your user module files.
+
+
+We use KEGG's system for describing metabolic modules, so you will need to format your metabolic pathways in the same way. Here is an example, for a module file called `U00002` (like in the schematic above):
+```
+ENTRY       U00002
+NAME        Nitrogen fixation (full Nif gene set)
+DEFINITION  K02588+K02586+K02591-K00531 K02587 K02592 K02585
+ORTHOLOGY   K02588  NifH
+            K02586  NifD
+            K02591  NifK
+            K00531  anfG
+            K02587  NifE
+            K02592  NifN
+            K02585  NifB
+CLASS       User modules; Energy metabolism; Nitrogen metabolism
+ANNOTATION_SOURCE  K02588  KOfam
+                   K02586  KOfam
+                   K02591  KOfam
+                   K00531  KOfam
+                   K02587  KOfam
+                   K02592  KOfam
+                   K02585  KOfam
+///
+```
+As you can see, there are different data types in the file, named by the all-capital word at the beginning of the line (we call this the 'data name'). The second column of the file is the value corresponding to that type of information ('data value'). Some data names, like ORTHOLOGY and ANNOTATION_SOURCE, also have a 3rd column further defining the data value (which we call the 'data definition'). Each field in the file should be separated by _at least two spaces_. And the file must end with '///' on the last line (don't ask us why).
+
+The data names you see in the example above are the minimum you should include to define the module. Here is a bit more information about each type of data:
+- ENTRY: this is the identifier for the module. It can be anything you want, but should be just one word (underscores and dashes allowed). It should also be the same as the name of the module file. Importantly, this identifier should not be the same as any KEGG module, or you will get an error during setup.
+- NAME: this is the name of the metabolic pathway, which can be an arbitrary string (spaces allowed).
+- DEFINITION: this is the set of enzymes required for the reactions in the metabolic pathway. The enzymes should be identified by their accession numbers in their respective annotation source - in the example above, these are all KOfams, so the enzyme accessions are KO numbers. However, you can use enzymes from any annotation source you like (COGs, Pfams, custom HMMs, etc), as long as you have a way to annotate them in your contigs database. The rules for defining a metabolic pathway in the KEGG fashion are described in the [technical details section](https://merenlab.org/software/anvio/help/main/programs/anvi-estimate-metabolism/#what-data-is-used-for-estimation) of the help page for [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism), so please read through that section for help with designing your pathway definition.
+- ORTHOLOGY: this section maps the identifier for an enzyme (the second column, or data value) to the functional definition of that enzyme (the third column, or data definition). You need one of these lines for every enzyme in your module DEFINITION line.
+- CLASS: this line categorizes the module. It must be one string with three sections, separated by semi-colons. The first section is the 'class' of the module, the second section is the 'category' of the module, and the third section is the 'sub-category'. Feel free to use the existing KEGG categories to describe your pathway, or to make up something entirely new. Or, if you don't care about this categorization at all, you could just put random strings in each section (as long as you have two semi-colons in the string, you will be golden).
+- ANNOTATION_SOURCE: this section maps the identifier for an enzyme (the second column, or data value) to its annotation source (the third column, or data definition). You need one of these lines for every enzyme in your module DEFINITION line. The annotation source must match the functional annotation source in the contigs database that is associated with the enzyme's annotations. For instance, KOfams annotated with [anvi-run-kegg-kofams](/help/8/programs/anvi-run-kegg-kofams) have source 'KOfam' (as above), the 2020 COG source from [anvi-run-ncbi-cogs](/help/8/programs/anvi-run-ncbi-cogs) is 'COG20_FUNCTION', the source for custom HMM profiles given to [anvi-run-hmms](/help/8/programs/anvi-run-hmms) is the `--hmm-source` directory name, and so on.
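+
+To make the formatting rules above concrete, here is a minimal Python sketch that reads module text into a dictionary. It is not the parser anvi'o uses: the function name `parse_module_file` is hypothetical, and the sketch assumes that continuation lines (such as the extra ORTHOLOGY rows) start with whitespace while lines that begin a new data name do not.
+
+```python
+import re
+
+
+def parse_module_file(text):
+    """Parse KEGG-style module text into {data name: [(value, definition), ...]}.
+
+    Hypothetical helper (not part of anvi'o). Fields must be separated by
+    at least two spaces and the last line must be '///'.
+    """
+    lines = text.rstrip("\n").split("\n")
+    if lines[-1].strip() != "///":
+        raise ValueError("module file must end with '///'")
+
+    parsed = {}
+    current_name = None
+    for line in lines[:-1]:
+        if not line.strip():
+            continue
+        # Two or more whitespace characters separate the fields.
+        fields = re.split(r"\s{2,}", line.strip())
+        if not line[0].isspace():
+            # A line that starts at column one begins a new data name.
+            current_name = fields[0]
+            fields = fields[1:]
+        if current_name is None:
+            raise ValueError("found a data value before any data name")
+        value = fields[0] if fields else ""
+        definition = fields[1] if len(fields) > 1 else None
+        parsed.setdefault(current_name, []).append((value, definition))
+    return parsed
+```
+
+With such a dictionary in hand, simple checks become one-liners, for example confirming that the CLASS value contains exactly two semi-colons, or that every enzyme listed under ORTHOLOGY also appears under ANNOTATION_SOURCE.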
+
+You can also define other data names, if you want. Some common ones that can be found in KEGG modules are COMPOUND, REACTION, PATHWAY, COMMENT, REFERENCE, and AUTHORS; but you are not limited by the ones used by KEGG.
+
+Why must we format the module files this way, you ask? Well, to be honest, KEGG modules are formatted like this, and our infrastructure for working with that data has simply been adapted to work with arbitrary, user-defined data. KEGG makes the rules :)
+
+### Specifying KEGG data to be used for sanity checking
+
+If you haven't yet run [anvi-setup-kegg-data](/help/8/programs/anvi-setup-kegg-data) on your computer, you will get an error when you try to run this program. This is because KEGG data can be used in addition to user-defined modules, and we need to be aware of which KEGG modules exist so we can make sure none of the user-defined modules have the same identifiers as these.
+
+By default, this program looks for the KEGG data in the default location, so if you have set up KEGG data in a non-default directory, you should specify the path to that directory using the `--kegg-data-dir` parameter:
+
+
+anvi-setup-user-modules --user-modules /path/to/user/data/directory --kegg-data-dir /path/to/KEGG/data/directory
+
+
+If you have multiple KEGG data directories on your computer, you should specify the one that you intend to use (along with this user-defined data) for [anvi-estimate-metabolism](/help/8/programs/anvi-estimate-metabolism) downstream. It is better to catch and eliminate any overlap during the setup process rather than later during metabolism estimation. :)
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-setup-user-modules.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-setup-user-modules) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-setup-user-modules/network.json b/help/8/programs/anvi-setup-user-modules/network.json
new file mode 100644
index 00000000..5df51f3d
--- /dev/null
+++ b/help/8/programs/anvi-setup-user-modules/network.json
@@ -0,0 +1,47 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "modules-db",
+ "name": "modules-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 2,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "user-modules-data",
+ "name": "user-modules-data",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-setup-user-modules",
+ "name": "anvi-setup-user-modules",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "source": 2,
+ "target": 1
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-show-collections-and-bins/index.md b/help/8/programs/anvi-show-collections-and-bins/index.md
new file mode 100644
index 00000000..14f2e579
--- /dev/null
+++ b/help/8/programs/anvi-show-collections-and-bins/index.md
@@ -0,0 +1,70 @@
+---
+layout: program
+title: anvi-show-collections-and-bins
+excerpt: An anvi'o program. A script to display collections stored in an anvi'o profile or pan database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-show-collections-and-bins
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A script to display collections stored in an anvi'o profile or pan database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[pan-db](../../artifacts/pan-db) [profile-db](../../artifacts/profile-db)
+
+
+## Can provide
+
+
+This program does not seem to provide any artifacts. Such programs usually print out some information for you to see or alter some anvi'o artifacts without producing any immediate outputs.
+
+
+## Usage
+
+
+This program tells you about the [collection](/help/8/artifacts/collection)s within a [profile-db](/help/8/artifacts/profile-db) or [pan-db](/help/8/artifacts/pan-db).
+
+Just run it like so
+
+
+anvi-show-collections-and-bins -p [profile-db](/help/8/artifacts/profile-db)
+
+
+and Anvi'o will output to your console the following information for each of the [collection](/help/8/artifacts/collection)s in the database:
+
+* The name and ID of the collection
+* The number of [bin](/help/8/artifacts/bin)s within the collection, and each of their names
+* The number of splits contained within those bins
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-show-collections-and-bins.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-show-collections-and-bins) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-show-collections-and-bins/network.json b/help/8/programs/anvi-show-collections-and-bins/network.json
new file mode 100644
index 00000000..98b9b489
--- /dev/null
+++ b/help/8/programs/anvi-show-collections-and-bins/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-show-collections-and-bins",
+ "name": "anvi-show-collections-and-bins",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "target": 2,
+ "source": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-show-misc-data/index.md b/help/8/programs/anvi-show-misc-data/index.md
new file mode 100644
index 00000000..e3c099ca
--- /dev/null
+++ b/help/8/programs/anvi-show-misc-data/index.md
@@ -0,0 +1,86 @@
+---
+layout: program
+title: anvi-show-misc-data
+excerpt: An anvi'o program. Show all misc data keys in all misc data tables.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-show-misc-data
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Show all misc data keys in all misc data tables.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+
+
+## Can consume
+
+
+[pan-db](../../artifacts/pan-db) [profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+This program does not seem to provide any artifacts. Such programs usually print out some information for you to see or alter some anvi'o artifacts without producing any immediate outputs.
+
+
+## Usage
+
+
+This program **lists the additional data** that is stored within a [pan-db](/help/8/artifacts/pan-db), [profile-db](/help/8/artifacts/profile-db) or [contigs-db](/help/8/artifacts/contigs-db). This is data that can be imported with [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data) and is displayed in the interactive interface.
+
+When run, this program will output to the terminal a list of all additional data tables that are stored within the database. If you want to export a specific element of these as a text file, see [anvi-export-misc-data](/help/8/programs/anvi-export-misc-data).
+
+### What is displayed?
+
+When running on a [profile-db](/help/8/artifacts/profile-db) or [pan-db](/help/8/artifacts/pan-db), the output will display the following types of data:
+
+- [misc-data-items](/help/8/artifacts/misc-data-items)
+- [misc-data-layers](/help/8/artifacts/misc-data-layers)
+- [misc-data-layer-orders](/help/8/artifacts/misc-data-layer-orders) (by default, this will include orders like `abundance` and `mean_coverage (newick)`)
+
+When running on a [contigs-db](/help/8/artifacts/contigs-db), the output will display the following types of data:
+
+- [misc-data-nucleotides](/help/8/artifacts/misc-data-nucleotides)
+- [misc-data-amino-acids](/help/8/artifacts/misc-data-amino-acids)
+
+These have no default values and will only contain data that has been imported with [anvi-import-misc-data](/help/8/programs/anvi-import-misc-data).
+
+You also have the option to specify a specific kind of additional data table with `-t`. For example, to view only [misc-data-items](/help/8/artifacts/misc-data-items) in a [profile-db](/help/8/artifacts/profile-db), just call
+
+
+anvi-show-misc-data -p [profile-db](/help/8/artifacts/profile-db) \
+ -t items
+
+
+Similarly to importing and exporting additional data tables, you can also focus on a specific data group with the parameter `-D`.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-show-misc-data.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-show-misc-data) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-show-misc-data/network.json b/help/8/programs/anvi-show-misc-data/network.json
new file mode 100644
index 00000000..c8c9f34b
--- /dev/null
+++ b/help/8/programs/anvi-show-misc-data/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-show-misc-data",
+ "name": "anvi-show-misc-data",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "target": 3,
+ "source": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-split/index.md b/help/8/programs/anvi-split/index.md
new file mode 100644
index 00000000..0aa6632f
--- /dev/null
+++ b/help/8/programs/anvi-split/index.md
@@ -0,0 +1,96 @@
+---
+layout: program
+title: anvi-split
+excerpt: An anvi'o program. Split an anvi'o pan or profile database into smaller, self-contained projects.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-split
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Split an anvi'o pan or profile database into smaller, self-contained projects. Black magic.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db) [genomes-storage-db](../../artifacts/genomes-storage-db) [pan-db](../../artifacts/pan-db) [collection](../../artifacts/collection)
+
+
+## Can provide
+
+
+[split-bins](../../artifacts/split-bins)
+
+
+## Usage
+
+
+Creates individual, self-contained anvi'o projects for one or more [bin](/help/8/artifacts/bin)s stored in an anvi'o [collection](/help/8/artifacts/collection). This program may be useful if you would like to share a subset of an anvi'o project with the community or a collaborator, or focus on a particular aspect of your data without having to initialize very large files. Altogether, [anvi-split](/help/8/programs/anvi-split) promotes reproducibility, openness, and collaboration.
+
+The program can generate [split-bins](/help/8/artifacts/split-bins) from metagenomes or pangenomes. To split bins, you can provide the program [anvi-split](/help/8/programs/anvi-split) with a [contigs-db](/help/8/artifacts/contigs-db) and [profile-db](/help/8/artifacts/profile-db) pair. To split gene clusters, you can provide it with a [genomes-storage-db](/help/8/artifacts/genomes-storage-db) and [pan-db](/help/8/artifacts/pan-db) pair. In both cases you will also need a [collection](/help/8/artifacts/collection). If you don't provide any [bin](/help/8/artifacts/bin) names, the program will create individual directories for each bin that is found in your collection. You can also limit the output to a single bin. Each of the resulting directories in your output folder will contain a stand-alone anvi'o project that can be shared without sharing any of the larger dataset.
+
+### An example run
+
+Assume you have a [profile-db](/help/8/artifacts/profile-db) that has a [collection](/help/8/artifacts/collection) with three bins, which are (very creatively) called `BIN_1`, `BIN_2`, and `BIN_3`.
+
+Running the following command will create a separate, stand-alone project directory inside `OUTPUT` for each of the three bins:
+
+
+anvi-split -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ -o OUTPUT
+
+
+Alternatively, you can specify a bin name to limit the output to a single bin:
+
+
+anvi-split -p [profile-db](/help/8/artifacts/profile-db) \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ --bin-id BIN_1 \
+ -o OUTPUT
+
+
+Similarly, if you provide a [genomes-storage-db](/help/8/artifacts/genomes-storage-db) and [pan-db](/help/8/artifacts/pan-db) pair, the directories will contain their own smaller [genomes-storage-db](/help/8/artifacts/genomes-storage-db) and [pan-db](/help/8/artifacts/pan-db) pairs.
+
+You can always use the program [anvi-show-collections-and-bins](/help/8/programs/anvi-show-collections-and-bins) to learn available [collection](/help/8/artifacts/collection) and [bin](/help/8/artifacts/bin) names in a given [profile-db](/help/8/artifacts/profile-db) or [pan-db](/help/8/artifacts/pan-db).
+
+### Performance
+
+For extremely large datasets, splitting bins may be difficult. For metagenomics projects, you can:
+
+* Use the flag `--skip-variability-tables` to NOT report single-nucleotide variants or single-amino acid variants in your split bins (which can reach hundreds of millions of lines of information for large and complex metagenomes), and/or,
+* Use the flag `--compress-auxiliary-data` to save space. While this is a great option for data that is meant to be stored long-term and shared with the community, the compressed file would need to be manually decompressed by the end-user prior to using the split bin.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-split.md) to update this information.
+
+
+## Additional Resources
+
+
+* [Anvi-split in action in the pangenomics tutorial](http://merenlab.org/2016/11/08/pangenomics-v2/#splitting-the-pangenome)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-split) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-split/network.json b/help/8/programs/anvi-split/network.json
new file mode 100644
index 00000000..413a2287
--- /dev/null
+++ b/help/8/programs/anvi-split/network.json
@@ -0,0 +1,95 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "split-bins",
+ "name": "split-bins",
+ "provided_by_anvio": true,
+ "type": "CONCEPT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 6,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-split",
+ "name": "anvi-split",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 6,
+ "target": 0
+ },
+ {
+ "target": 6,
+ "source": 1
+ },
+ {
+ "target": 6,
+ "source": 2
+ },
+ {
+ "target": 6,
+ "source": 3
+ },
+ {
+ "target": 6,
+ "source": 4
+ },
+ {
+ "target": 6,
+ "source": 5
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-summarize-blitz/index.md b/help/8/programs/anvi-summarize-blitz/index.md
new file mode 100644
index 00000000..e7c3faad
--- /dev/null
+++ b/help/8/programs/anvi-summarize-blitz/index.md
@@ -0,0 +1,131 @@
+---
+layout: program
+title: anvi-summarize-blitz
+excerpt: An anvi'o program. FAST summary of many anvi'o single profile databases (without having to use the program anvi-merge).
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-summarize-blitz
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+FAST summary of many anvi'o single profile databases (without having to use the program anvi-merge).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[single-profile-db](../../artifacts/single-profile-db) [contigs-db](../../artifacts/contigs-db)
+
+
+## Can provide
+
+
+[quick-summary](../../artifacts/quick-summary)
+
+
+## Usage
+
+
+This program is a quicker, but less comprehensive, alternative to [anvi-summarize](/help/8/programs/anvi-summarize). It is used to summarize basic read recruitment statistics (like detection and coverage) from many single profiles that are all associated with the same [contigs-db](/help/8/artifacts/contigs-db).
+
+Given a list of samples (single profiles) and a collection, `anvi-summarize-blitz` will compute the per-sample weighted average of each statistic for each bin in the collection. This is an average of the statistic value over each split in the bin, _weighted by the split length_.
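+
+As a rough sketch of what "weighted by the split length" means, suppose a bin contains splits $i = 1, \dots, n$ with lengths $\ell_i$, and a given statistic takes the value $x_i$ on split $i$ in one sample. These symbols are introduced here only for illustration and are not anvi'o's own notation. Under that assumption, the per-sample value reported for the bin would be:
+
+```latex
+\bar{x}_{\text{bin}} = \frac{\sum_{i=1}^{n} \ell_i \, x_i}{\sum_{i=1}^{n} \ell_i}
+```
+
+For instance, a hypothetical bin with two splits of 20,000 and 5,000 nucleotides and mean coverages of 10 and 30 would get (20,000 × 10 + 5,000 × 30) / 25,000 = 14, rather than the unweighted mean of 20, so longer splits contribute proportionally more.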
+
+The output will be a text file, and you can find details about its format by clicking on [quick-summary](/help/8/artifacts/quick-summary).
+
+### Basic usage
+
+In addition to your list of [single-profile-db](/help/8/artifacts/single-profile-db)s, you must provide this program with their corresponding contigs database and a collection name.
+
+
+anvi-summarize-blitz PROFILE_1.db PROFILE_2.db PROFILE_3.db [...] \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -C [collection](/help/8/artifacts/collection)
+
+
+The program will summarize the same collection across all of your single profile databases. However, it will use only the first profile database in the argument list to learn about what is in the collection, so it is not exactly necessary to have this collection defined for all of the other profile databases (though one could argue that it is a good idea to do this regardless...). The collection name you provide to this program must be a collection that is present in at least the first profile database in the argument list. In the example above, only `PROFILE_1.db` is strictly required to include the collection you wish to summarize (though all other profiles must contain the same splits as this first profile, which should not be a problem if you generated them all in the same way).
+
+### Choosing a different output prefix
+
+If nothing is provided, the output file name will be the collection name, suffixed with `-SUMMARY-BLITZ.txt`. You can specify a different output file name using the parameter `--output-file`:
+
+
+anvi-summarize-blitz PROFILE_1.db PROFILE_2.db PROFILE_3.db [...] \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ -o OUTPUT.txt
+
+
+### Choosing which statistics to summarize
+
+The default statistics that will be summarized are detection and something called 'mean_coverage_Q2Q3' (which is [this](https://merenlab.org/2017/05/08/anvio-views/#mean-overage-q2q3)). You can choose which statistics to summarize by providing them as a comma-separated list (no spaces in the list) to the `--stats-to-summarize`, or `-S`, parameter:
+
+
+anvi-summarize-blitz PROFILE_1.db PROFILE_2.db PROFILE_3.db [...] \
+ -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -C [collection](/help/8/artifacts/collection) \
+ -S std_coverage,mean_coverage,detection
+
+
+Each statistic will get its own column in the output file.
+
+If you are not sure which statistics are available to choose from, just provide some ridiculous, arbitrary string (that cannot possibly be a name of a statistic) to this flag, and you will get an error message that includes a list of the available statistics. Or, you can just look at this example error message (but no guarantees that the list in this example will be the same as whatever you would get by doing it yourself. Just sayin'.)
+```
+Config Error: The statistic you requested, cattywampus, does not exist. Here are the options
+ to choose from: std_coverage, mean_coverage, mean_coverage_Q2Q3, detection,
+ abundance, variability
+```
+
+If you are curious about the statistics in the list, many of them have definitions in [this blog post](https://merenlab.org/2017/05/08/anvio-views).
+
+## Common errors
+
+### Existing file error
+
+If the output file already exists, you will encounter the following error:
+```
+File/Path Error: AppendableFile class is refusing to open your file at test-quick_summary.txt
+ because it already exists. If you are a user, you should probably give Anvi'o a
+ different file name to work with. If you are a programmer and you don't want
+ this behavior, init this class with `fail_if_file_exists=False` instead.
+```
+You can either provide a different output file name using the `-o` parameter, as the error message suggests, or you can simply delete the existing file and re-run your command.
+
+### Missing table error
+
+If you get an error that looks like this:
+```
+Config Error: The database at [PROFILE.db] does not seem to have a table named
+ `detection_splits` :/ Here is a list of table names this database knows:
+ [...]
+```
+
+That means your profile databases are not the correct version. The tables we are accessing in this program were introduced in profile database version 36. So the solution to this error is to update your databases to at least that version, using [anvi-migrate](/help/8/programs/anvi-migrate). :)
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-summarize-blitz.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-summarize-blitz) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-summarize-blitz/network.json b/help/8/programs/anvi-summarize-blitz/network.json
new file mode 100644
index 00000000..420d255c
--- /dev/null
+++ b/help/8/programs/anvi-summarize-blitz/network.json
@@ -0,0 +1,56 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "quick-summary",
+ "name": "quick-summary",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "single-profile-db",
+ "name": "single-profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 3,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-summarize-blitz",
+ "name": "anvi-summarize-blitz",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 3,
+ "target": 0
+ },
+ {
+ "target": 3,
+ "source": 1
+ },
+ {
+ "target": 3,
+ "source": 2
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-summarize/index.md b/help/8/programs/anvi-summarize/index.md
new file mode 100644
index 00000000..b0df58ee
--- /dev/null
+++ b/help/8/programs/anvi-summarize/index.md
@@ -0,0 +1,107 @@
+---
+layout: program
+title: anvi-summarize
+excerpt: An anvi'o program. Summarizer for anvi'o pan or profile db's.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-summarize
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Summarizer for anvi'o pan or profile db's. Essentially, this program takes a collection id along with either a profile database and a contigs database or a pan database and a genomes storage and generates a static HTML output for what is described in a given collection. The output directory will contain almost everything any downstream analysis may need, and can be displayed using a browser without the need for an anvi'o installation. For this reason alone, reporting summary outputs as supplementary data with publications is a great idea for transparency and reproducibility.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db) [collection](../../artifacts/collection) [pan-db](../../artifacts/pan-db) [genomes-storage-db](../../artifacts/genomes-storage-db)
+
+
+## Can provide
+
+
+[summary](../../artifacts/summary)
+
+
+## Usage
+
+
+Anvi-summarize lets you look at a **comprehensive overview of your [collection](/help/8/artifacts/collection)** and its many statistics that anvi'o has calculated.
+
+It will create a folder called `SUMMARY` that contains many different summary files, including an HTML output that conveniently displays them all for you. This folder will contain anything a future user might use to import your collection, so it's useful to send to others or transfer an entire anvi'o collection and all of its data.
+
+In a little more detail, this program will
+* generate [fasta](/help/8/artifacts/fasta) files containing your original contigs.
+* estimate various stats about each of your bins, including completion, redundancy, and information about all of your [hmm-hits](/help/8/artifacts/hmm-hits).
+* generate various tab-delimited matrix files with information about your bins across your samples, including various statistics.
+
+## Running anvi-summarize
+
+### Running on a profile database
+
+A standard run of anvi-summarize on a [profile-db](/help/8/artifacts/profile-db) will look something like this:
+
+
+anvi-summarize -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -p [profile-db](/help/8/artifacts/profile-db) \
+ -o MY_SUMMARY \
+ -C [collection](/help/8/artifacts/collection)
+
+
+This will name the output directory `MY_SUMMARY` instead of the standard `SUMMARY`.
+
+When running on a profile database, you also have options to
+* output very accurate (but intensely processed) coverage and detection data for each gene (using `--init-gene-coverages`)
+* edit your contig names so that they contain the name of the bin that the contig is in (using `--reformat-contig-names`)
+* also display the amino acid sequences for your gene calls (using `--report-aa-seqs-for-gene-calls`)
+
+### Running on a pan database
+
+When running on a [pan-db](/help/8/artifacts/pan-db), you'll want to instead provide the associated genomes storage database.
+
+
+anvi-summarize -g [genomes-storage-db](/help/8/artifacts/genomes-storage-db) \
+ -p [pan-db](/help/8/artifacts/pan-db) \
+ -C [collection](/help/8/artifacts/collection)
+
+
+You can also choose to display DNA sequences for your gene clusters instead of amino acid sequences with the flag `--report-DNA-sequences`.
+
+### Other notes
+
+If you're unsure what collections are in your database, you can run this program with the flag `--list-collections` or run [anvi-show-collections-and-bins](/help/8/programs/anvi-show-collections-and-bins).
+
+You can also use the flag `--quick-summary` to get a less comprehensive summary with a much shorter processing time.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-summarize.md) to update this information.
+
+
+## Additional Resources
+
+
+* [anvi-summarize in the metagenomic workflow tutorial](http://merenlab.org/2016/06/22/anvio-tutorial-v2/#anvi-summarize)
+
+* [anvi-summarize in the pangenomic workflow tutorial](http://merenlab.org/2016/11/08/pangenomics-v2/#summarizing-an-anvio-pan-genome)
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-summarize) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-summarize/network.json b/help/8/programs/anvi-summarize/network.json
new file mode 100644
index 00000000..e177597b
--- /dev/null
+++ b/help/8/programs/anvi-summarize/network.json
@@ -0,0 +1,95 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "summary",
+ "name": "summary",
+ "provided_by_anvio": true,
+ "type": "SUMMARY"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "collection",
+ "name": "collection",
+ "provided_by_anvio": true,
+ "type": "COLLECTION"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 6,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-summarize",
+ "name": "anvi-summarize",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 6,
+ "target": 0
+ },
+ {
+ "target": 6,
+ "source": 1
+ },
+ {
+ "target": 6,
+ "source": 2
+ },
+ {
+ "target": 6,
+ "source": 3
+ },
+ {
+ "target": 6,
+ "source": 4
+ },
+ {
+ "target": 6,
+ "source": 5
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-tabulate-trnaseq/index.md b/help/8/programs/anvi-tabulate-trnaseq/index.md
new file mode 100644
index 00000000..cc16ce73
--- /dev/null
+++ b/help/8/programs/anvi-tabulate-trnaseq/index.md
@@ -0,0 +1,64 @@
+---
+layout: program
+title: anvi-tabulate-trnaseq
+excerpt: An anvi'o program. A program to write standardized tab-delimited files of tRNA-seq seed coverage and modification results.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-tabulate-trnaseq
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A program to write standardized tab-delimited files of tRNA-seq seed coverage and modification results.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+## Can consume
+
+
+[trnaseq-contigs-db](../../artifacts/trnaseq-contigs-db) [trnaseq-profile-db](../../artifacts/trnaseq-profile-db)
+
+
+## Can provide
+
+
+[trnaseq-seed-txt](../../artifacts/trnaseq-seed-txt) [modifications-txt](../../artifacts/modifications-txt)
+
+
+## Usage
+
+
+This program **generates tabular files of tRNA-seq seed coverage and modification data that are easily manipulable by the user**.
+
+anvi-tabulate-trnaseq is part of the [trnaseq-workflow](../../workflows/trnaseq/), and is run following the finalization of tRNA seeds by [anvi-merge-trnaseq](/help/8/programs/anvi-merge-trnaseq).
+
+This program generates a table, [seeds-specific-txt](/help/8/artifacts/seeds-specific-txt), containing the specific coverage of each nucleotide position in each seed in every sample. If a nonspecific [trnaseq-profile-db](/help/8/artifacts/trnaseq-profile-db) is also provided, this program generates a table of nonspecific coverages, [seeds-non-specific-txt](/help/8/artifacts/seeds-non-specific-txt). The distinction between specific and nonspecific coverage is explained in the [trnaseq-profile-db](/help/8/artifacts/trnaseq-profile-db) artifact. These coverage tables have one row per seed per sample. They have three header rows for different ways of describing tRNA nucleotide positions: canonical position name (e.g., "discriminator_1"), canonical position (e.g., "73"), and "ordinal" position relative to all the other **possible** positions (e.g., "95").
+
+anvi-tabulate-trnaseq also generates a table, [modifications-txt](/help/8/artifacts/modifications-txt), containing information on each predicted modification position in each seed, with one row per modification per seed per sample. This table includes four columns of position coverage counts of the four nucleotides.
+
+All tables include taxonomic annotations of the seeds; annotations are added to the [trnaseq-contigs-db](/help/8/artifacts/trnaseq-contigs-db) by [anvi-run-trna-taxonomy](/help/8/programs/anvi-run-trna-taxonomy).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-tabulate-trnaseq.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-tabulate-trnaseq) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-tabulate-trnaseq/network.json b/help/8/programs/anvi-tabulate-trnaseq/network.json
new file mode 100644
index 00000000..44cb5a22
--- /dev/null
+++ b/help/8/programs/anvi-tabulate-trnaseq/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "trnaseq-seed-txt",
+ "name": "trnaseq-seed-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "modifications-txt",
+ "name": "modifications-txt",
+ "provided_by_anvio": true,
+ "type": "TXT"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "trnaseq-contigs-db",
+ "name": "trnaseq-contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "trnaseq-profile-db",
+ "name": "trnaseq-profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-tabulate-trnaseq",
+ "name": "anvi-tabulate-trnaseq",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 4,
+ "target": 0
+ },
+ {
+ "source": 4,
+ "target": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-trnaseq/index.md b/help/8/programs/anvi-trnaseq/index.md
new file mode 100644
index 00000000..77bab349
--- /dev/null
+++ b/help/8/programs/anvi-trnaseq/index.md
@@ -0,0 +1,102 @@
+---
+layout: program
+title: anvi-trnaseq
+excerpt: An anvi'o program. A program to process reads from a tRNA-seq dataset to generate an anvi'o tRNA-seq database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-trnaseq
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+A program to process reads from a tRNA-seq dataset to generate an anvi'o tRNA-seq database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[trnaseq-fasta](../../artifacts/trnaseq-fasta)
+
+
+## Can provide
+
+
+[trnaseq-db](../../artifacts/trnaseq-db)
+
+
+## Usage
+
+
+This program **analyzes a tRNA-seq library, generating de novo predictions of tRNA sequences, structures, and modification positions**.
+
+A FASTA file of merged paired-end tRNA-seq reads is required as input. This file is produced by the initial steps of the [trnaseq-workflow](../../workflows/trnaseq/), in which [Illumina-utils](https://github.com/merenlab/illumina-utils) merges paired-end reads and [anvi-script-reformat-fasta](/help/8/programs/anvi-script-reformat-fasta) creates anvi'o-compliant deflines in the FASTA file.
+
+The primary output of anvi-trnaseq is a [trnaseq-db](/help/8/artifacts/trnaseq-db). Supplemental outputs are also produced: an analysis summary, a tabular file of unique sequences not identified as tRNA, and a tabular file of 5' and 3' extensions trimmed off mature tRNA.
+
+The `anvi-trnaseq --help` menu provides detailed explanations of the parameters controlling the multifaceted analyses performed by the program.
+
+## Examples
+
+*Generate a [trnaseq-db](/help/8/artifacts/trnaseq-db) from a sample using 16 cores.*
+
+
+anvi-trnaseq -f [trnaseq-fasta](/help/8/artifacts/trnaseq-fasta) \
+ -S SAMPLE_NAME \
+ -o OUTPUT_DIRECTORY \
+ -T 16
+
+
+*Generate a [trnaseq-db](/help/8/artifacts/trnaseq-db) from a sample flagged as being treated with demethylase. The output directory is overwritten if it already exists.*
+
+
+anvi-trnaseq -f [trnaseq-fasta](/help/8/artifacts/trnaseq-fasta) \
+ -S SAMPLE_NAME \
+ -o OUTPUT_DIRECTORY \
+ -T 16 \
+ --treatment demethylase \
+ --overwrite-output-destinations
+
+
+## Parameterize tRNA feature profiling
+
+Feature profiling parameters can be modified by the user in an optional `.ini` file. For example, the user may want a more permissive definition of a tRNA (more false positive identifications of sequences as tRNA, fewer false negative failures to identify sequences as tRNA), which can be achieved by increasing the number of unpaired nucleotides allowed in the T stem or the number of unconserved canonical nucleotides allowed in the anticodon loop. Numerous structural parameters like these can be altered.
+
+*Write the default `.ini` file to `PARAM.ini`.*
+
+
+anvi-trnaseq --default-feature-param-file PARAM.ini
+
+
+*Nicely display the `.ini` defaults that can be written to the file in standard output.*
+
+
+anvi-trnaseq --print-default-feature-params
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-trnaseq.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-trnaseq) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-trnaseq/network.json b/help/8/programs/anvi-trnaseq/network.json
new file mode 100644
index 00000000..92a2eae8
--- /dev/null
+++ b/help/8/programs/anvi-trnaseq/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "trnaseq-db",
+ "name": "trnaseq-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 1,
+ "color": "#AA0000",
+ "id": "trnaseq-fasta",
+ "name": "trnaseq-fasta",
+ "provided_by_anvio": false,
+ "type": "FASTA"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-trnaseq",
+ "name": "anvi-trnaseq",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "source": 2,
+ "target": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-update-db-description/index.md b/help/8/programs/anvi-update-db-description/index.md
new file mode 100644
index 00000000..87a12e79
--- /dev/null
+++ b/help/8/programs/anvi-update-db-description/index.md
@@ -0,0 +1,68 @@
+---
+layout: program
+title: anvi-update-db-description
+excerpt: An anvi'o program. Update the description in an anvi'o database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-update-db-description
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Update the description in an anvi'o database.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[pan-db](../../artifacts/pan-db) [profile-db](../../artifacts/profile-db) [contigs-db](../../artifacts/contigs-db) [genomes-storage-db](../../artifacts/genomes-storage-db)
+
+
+## Can provide
+
+
+This program does not seem to provide any artifacts. Such programs usually print out some information for you to see or alter some anvi'o artifacts without producing any immediate outputs.
+
+
+## Usage
+
+
+This program allows you to update the description of any anvi'o database with the push of a button (and the writing of an updated description).
+
+This description helps make UIs a little prettier by showing up when you run programs like [anvi-interactive](/help/8/programs/anvi-interactive) and [anvi-summarize](/help/8/programs/anvi-summarize).
+
+Simply write out the description that you would prefer in a plain text file (with markdown syntax) and use this program to update the description of any [pan-db](/help/8/artifacts/pan-db), [profile-db](/help/8/artifacts/profile-db), [contigs-db](/help/8/artifacts/contigs-db), or [genomes-storage-db](/help/8/artifacts/genomes-storage-db):
+
+
+anvi-update-db-description --description my_description.txt \
+ [contigs-db](/help/8/artifacts/contigs-db)
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-update-db-description.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-update-db-description) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-update-db-description/network.json b/help/8/programs/anvi-update-db-description/network.json
new file mode 100644
index 00000000..bd4ee4a8
--- /dev/null
+++ b/help/8/programs/anvi-update-db-description/network.json
@@ -0,0 +1,69 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "pan-db",
+ "name": "pan-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "profile-db",
+ "name": "profile-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "genomes-storage-db",
+ "name": "genomes-storage-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 4,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-update-db-description",
+ "name": "anvi-update-db-description",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "target": 4,
+ "source": 0
+ },
+ {
+ "target": 4,
+ "source": 1
+ },
+ {
+ "target": 4,
+ "source": 2
+ },
+ {
+ "target": 4,
+ "source": 3
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/programs/anvi-update-structure-database/index.md b/help/8/programs/anvi-update-structure-database/index.md
new file mode 100644
index 00000000..8127fa2c
--- /dev/null
+++ b/help/8/programs/anvi-update-structure-database/index.md
@@ -0,0 +1,87 @@
+---
+layout: program
+title: anvi-update-structure-database
+excerpt: An anvi'o program. Add or re-run genes from an already existing structure database.
+categories: [anvio]
+comments: false
+redirect_from: /8/anvi-update-structure-database
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Add or re-run genes from an already existing structure database. All settings used to generate your database will be used in this program.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+
+{% include _toc.html %}
+
+{% capture network_path %}{{ "network.json" }}{% endcapture %}
+{% capture network_height %}{{ 300 }}{% endcapture %}
+{% include _project-anvio-graph.html %}
+
+
+## Authors
+
+
+
+
+
+## Can consume
+
+
+[contigs-db](../../artifacts/contigs-db) [structure-db](../../artifacts/structure-db)
+
+
+## Can provide
+
+
+This program does not seem to provide any artifacts. Such programs usually print out some information for you to see or alter some anvi'o artifacts without producing any immediate outputs.
+
+
+## Usage
+
+
+This program is used to add additional genes to or re-run the analysis of genes already within a [structure-db](/help/8/artifacts/structure-db).
+
+For that reason, it is very similar to [anvi-gen-structure-database](/help/8/programs/anvi-gen-structure-database), and the parameters used to run that program (when you first generated your [structure-db](/help/8/artifacts/structure-db)) will be automatically applied when you run this program. To find out which MODELLER parameters are being used, you can run this program on a [structure-db](/help/8/artifacts/structure-db) with the flag `--list-modeller-params`.
+
+To run this program, just provide a [contigs-db](/help/8/artifacts/contigs-db) and [structure-db](/help/8/artifacts/structure-db), and name your genes of interest (either in a file or directly). If the named genes are not already in your [structure-db](/help/8/artifacts/structure-db), they will be added to the database.
+
+For example, if your [structure-db](/help/8/artifacts/structure-db) already contains the genes with caller-IDs 1, 2 and 3, and you run
+
+
+anvi-update-structure-database -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -s [structure-db](/help/8/artifacts/structure-db) \
+ --gene-caller-ids 1,4,5
+
+
+Then the structural analysis for genes 4 and 5 will be added to your [structure-db](/help/8/artifacts/structure-db) (assuming templates are found). Gene 1 will be ignored, since it is already present.
+
+If instead you want to re-run the structural analysis on genes that are already in your [structure-db](/help/8/artifacts/structure-db), you'll need to specify that by adding the flag `--rerun-genes`:
+
+
+anvi-update-structure-database -c [contigs-db](/help/8/artifacts/contigs-db) \
+ -s [structure-db](/help/8/artifacts/structure-db) \
+ --gene-caller-ids 1,4,5 \
+ --rerun-genes
+
+
+Now, the program will rerun the analysis for gene 1 and will still add genes 4 and 5 to the [structure-db](/help/8/artifacts/structure-db).
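
The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration of the selection logic under the behavior stated in this section, not anvi'o's actual implementation; the function name `plan_structure_update` and its return format are hypothetical.

```python
def plan_structure_update(existing_ids, requested_ids, rerun_genes=False):
    """Split requested gene caller IDs into genes to add, rerun, or skip.

    existing_ids:  gene caller IDs already present in the structure-db
    requested_ids: IDs passed via --gene-caller-ids
    rerun_genes:   mirrors the --rerun-genes flag
    """
    existing = set(existing_ids)
    requested = set(requested_ids)

    # Genes that are not in the database yet are always added.
    to_add = sorted(requested - existing)
    # Genes already in the database are rerun only when explicitly requested.
    already_present = sorted(requested & existing)

    if rerun_genes:
        return {"add": to_add, "rerun": already_present, "skip": []}
    return {"add": to_add, "rerun": [], "skip": already_present}


# The scenario from the text: the database holds genes 1, 2 and 3,
# and the user asks for genes 1, 4 and 5.
print(plan_structure_update([1, 2, 3], [1, 4, 5]))
print(plan_structure_update([1, 2, 3], [1, 4, 5], rerun_genes=True))
```

In both calls genes 4 and 5 are added; gene 1 moves from the skipped list to the rerun list only when `rerun_genes` is set.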
+
+Both of these runs will have the same MODELLER parameters as your run of [anvi-gen-structure-database](/help/8/programs/anvi-gen-structure-database). However, to get the raw outputs, you will need to use the parameter `--dump-dir`. You can also set a specific MODELLER program with `--modeller-executable`. Parameters for multi-threading would also have to be given again.
+
+{:.notice}
+Like [anvi-gen-structure-database](/help/8/programs/anvi-gen-structure-database), this program also accepts [external-structures](/help/8/artifacts/external-structures).
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/programs/anvi-update-structure-database.md) to update this information.
+
+
+## Additional Resources
+
+
+
+{:.notice}
+Are you aware of resources that may help users better understand the utility of this program? Please feel free to edit [this file](https://github.com/merenlab/anvio/tree/master/bin/anvi-update-structure-database) on GitHub. If you are not sure how to do that, find the `__resources__` tag in [this file](https://github.com/merenlab/anvio/blob/master/bin/anvi-interactive) to see an example.
diff --git a/help/8/programs/anvi-update-structure-database/network.json b/help/8/programs/anvi-update-structure-database/network.json
new file mode 100644
index 00000000..e66c5583
--- /dev/null
+++ b/help/8/programs/anvi-update-structure-database/network.json
@@ -0,0 +1,43 @@
+{
+ "graph": [],
+ "nodes": [
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "contigs-db",
+ "name": "contigs-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 1,
+ "score": 0.5,
+ "color": "#00AA00",
+ "id": "structure-db",
+ "name": "structure-db",
+ "provided_by_anvio": true,
+ "type": "DB"
+ },
+ {
+ "size": 2,
+ "score": 0.1,
+ "color": "#AAAA00",
+ "id": "anvi-update-structure-database",
+ "name": "anvi-update-structure-database",
+ "type": "PROGRAM"
+ }
+ ],
+ "links": [
+ {
+ "target": 2,
+ "source": 0
+ },
+ {
+ "target": 2,
+ "source": 1
+ }
+ ],
+ "directed": false,
+ "multigraph": false
+}
\ No newline at end of file
diff --git a/help/8/workflows/contigs/index.md b/help/8/workflows/contigs/index.md
new file mode 100644
index 00000000..b64997d8
--- /dev/null
+++ b/help/8/workflows/contigs/index.md
@@ -0,0 +1,114 @@
+---
+layout: program
+title: The anvi'o 'contigs' workflow
+excerpt: From FASTA files to annotated anvi'o contigs databases
+categories: [anvio]
+comments: false
+redirect_from: /8/contigs
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+From FASTA files to annotated anvi'o contigs databases
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Authors
+
+
+
+
+
+
+
+
+
+
+
+## Artifacts accepted
+
+The contigs workflow can typically be initiated with the following artifacts:
+
+[workflow-config](../../artifacts/workflow-config) [fasta-txt](../../artifacts/fasta-txt)
+
+## Artifacts produced
+
+The contigs workflow typically produces the following anvi'o artifacts:
+
+[contigs-db](../../artifacts/contigs-db)
+
+## Third party programs
+
+This is a list of programs that may be used by the contigs workflow depending on the user settings in the [workflow-config](../../artifacts/workflow-config/) :
+
+
+
+An anvi'o installation that follows the recommendations on the installation page will include all these programs. But please consider your settings, and cite these additional tools from your methods sections.
+
+## Workflow description and usage
+
+
+
+This workflow is extremely useful if you have one or more [fasta](/help/8/artifacts/fasta) files that describe one or more contig sequences for your genomes or assembled metagenomes, and all you want is to turn them into [contigs-db](/help/8/artifacts/contigs-db) files.
+
+{:.warning}
+If you have not yet run the anvi'o programs [anvi-setup-ncbi-cogs](/help/8/programs/anvi-setup-ncbi-cogs) and [anvi-setup-scg-taxonomy](/help/8/programs/anvi-setup-scg-taxonomy) on your system, you will get a cryptic error from this workflow if you run it with the default [workflow-config](/help/8/artifacts/workflow-config). You can avoid this by first running these two anvi'o programs to set up the necessary databases (which is done only once for every anvi'o installation), **or** by setting the rules for COG functions and/or SCG taxonomy to `run=false` explicitly.
+
+To start things going with this workflow, first ask anvi'o to give you a default [workflow-config](/help/8/artifacts/workflow-config) file for the contigs workflow:
+
+```bash
+anvi-run-workflow -w contigs \
+ --get-default-config config-contigs-default.json
+```
+
+This will generate a file in your work directory called `config-contigs-default.json`. You should investigate its contents and familiarize yourself with it to find out all the options you can tweak. We included a much simpler config file, `config-contigs.json`, in the mock data package for the sake of demonstrating how the contigs workflow works:
+
+```json
+{
+ "workflow_name": "contigs",
+ "config_version": "2",
+ "fasta_txt": "fasta.txt",
+ "output_dirs": {
+ "FASTA_DIR": "01_FASTA",
+ "CONTIGS_DIR": "02_CONTIGS",
+ "LOGS_DIR": "00_LOGS"
+ }
+}
+```
+
+The only mandatory steps are to (1) manually create a [fasta-txt](/help/8/artifacts/fasta-txt) file to describe the name and location of each FASTA file you wish to work with, and (2) make sure the `fasta_txt` variable in your [workflow-config](/help/8/artifacts/workflow-config) points to the location of your [fasta-txt](/help/8/artifacts/fasta-txt).
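
Both steps can also be scripted. The sketch below is a hedged illustration, not part of anvi'o: the helper names `write_fasta_txt` and `set_fasta_txt` are hypothetical, and it assumes that a fasta-txt is a TAB-delimited file with the columns `name` and `path` (please check the fasta-txt artifact page for the authoritative format).

```python
import csv
import json


def write_fasta_txt(entries, output_path="fasta.txt"):
    """Write a TAB-delimited fasta-txt file.

    entries is an iterable of (name, path) pairs, one per FASTA file.
    The 'name' and 'path' column headers are an assumption about the format.
    """
    with open(output_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["name", "path"])
        writer.writerows(entries)
    return output_path


def set_fasta_txt(config_path, fasta_txt_path):
    """Point the 'fasta_txt' variable of an existing workflow config at a fasta-txt."""
    with open(config_path) as handle:
        config = json.load(handle)

    config["fasta_txt"] = fasta_txt_path

    with open(config_path, "w") as handle:
        json.dump(config, handle, indent=4)
    return config
```

For instance, with made-up genome names and paths, `write_fasta_txt([("G01", "G01.fa"), ("G02", "G02.fa")])` followed by `set_fasta_txt("config-contigs.json", "fasta.txt")` would cover steps (1) and (2) above.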
+
+To see if everything looks alright, you can simply run the following command, which should generate a 'workflow graph' for you, given your config file parameters and input files:
+
+```bash
+anvi-run-workflow -w contigs \
+ -c config-contigs.json \
+ --save-workflow-graph
+```
+
+For the example config file shown above, this command will generate something similar to this:
+
+[![DAG-contigs](../../images/workflows/contigs/DAG-contigs.png)](../../images/workflows/contigs/DAG-contigs.png){:.center-img .width-50}
+
+{:.notice}
+Please note that the generation of this workflow graph requires a program called [dot](https://en.wikipedia.org/wiki/DOT_(graph_description_language)). If you are using macOS, you can get [dot](https://en.wikipedia.org/wiki/DOT_(graph_description_language)) by installing [graphviz](http://www.graphviz.org/) through `brew` or `conda`.
+
+If everything looks alright, you can run this workflow the following way:
+
+```bash
+anvi-run-workflow -w contigs \
+ -c config-contigs.json
+```
+
+If everything goes smoothly, you should see happy messages flowing on your screen, and at the end of it all you should see your contigs databases are generated and annotated properly. At the end of this process, you will have all your [contigs-db](/help/8/artifacts/contigs-db) files in the `02_CONTIGS` directory (as per the instructions in the config file, which you can change). You can use the program [anvi-display-contigs-stats](/help/8/programs/anvi-display-contigs-stats) on one of them to see if everything makes sense.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/workflows/contigs.md) to update this information.
+
diff --git a/help/8/workflows/ecophylo/index.md b/help/8/workflows/ecophylo/index.md
new file mode 100644
index 00000000..fb026a70
--- /dev/null
+++ b/help/8/workflows/ecophylo/index.md
@@ -0,0 +1,227 @@
+---
+layout: program
+title: The anvi'o 'ecophylo' workflow
+excerpt: Co-characterize the biogeography and phylogeny of any protein
+categories: [anvio]
+comments: false
+redirect_from: /8/ecophylo
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Co-characterize the biogeography and phylogeny of any protein
+
+The ecophylo workflow explores the **eco**logical and **phylo**genetic relationships between a gene family and the environment. Briefly, the workflow extracts a target gene from any set of FASTA files (e.g., isolate genomes, [MAGs](https://anvio.org/vocabulary/#metagenome-assembled-genome-mag), [SAGs](https://anvio.org/vocabulary/#single-amplified-genome-sag), or simply [assembled metagenomes](https://anvio.org/vocabulary/#de-novo-assembly)) using a user-defined [HMM](https://anvio.org/vocabulary/#hidden-markov-models-hmms), and offers an integrated access to the phylogenetics of matching genes, and their distribution across environments.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Authors
+
+
+
+
+
+## Artifacts accepted
+
+The ecophylo workflow can typically be initiated with the following artifacts:
+
+[workflow-config](../../artifacts/workflow-config) [samples-txt](../../artifacts/samples-txt) [hmm-list](../../artifacts/hmm-list) [external-genomes](../../artifacts/external-genomes) [metagenomes](../../artifacts/metagenomes)
+
+## Artifacts produced
+
+The ecophylo workflow typically produces the following anvi'o artifacts:
+
+[contigs-db](../../artifacts/contigs-db) [profile-db](../../artifacts/profile-db)
+
+## Third party programs
+
+This is a list of programs that may be used by the ecophylo workflow depending on the user settings in the [workflow-config](../../artifacts/workflow-config/) :
+
+
+- Bowtie2 (Read recruitment)
+- MMseqs2 (Cluster open reading frames)
+- muscle (Align protein sequences)
+- trimal (Trim multiple sequence alignment)
+- IQ-TREE (Calculate phylogenetic tree)
+- FastTree (Calculate phylogenetic tree)
+- HMMER (Search for homologous sequences)
+
+
+An anvi'o installation that follows the recommendations on the installation page will include all these programs. But please consider your settings, and cite these additional tools from your methods sections.
+
+## Workflow description and usage
+
+
+The ecophylo workflow starts with a user-defined target gene family defined by an [HMM](https://anvio.org/vocabulary/#hidden-markov-models-hmms) and a list of assembled genomes and/or metagenomes. The final output is an [interactive](/help/8/artifacts/interactive) interface that includes (1) a phylogenetic analysis of all genes detected by the HMM in genomes and/or metagenomes, and (2) the distribution pattern of each of these genes across metagenomes if the user provided metagenomic short reads to survey.
+
+While the 'user-defined [HMM](https://anvio.org/vocabulary/#hidden-markov-models-hmms)' is passed to ecophylo via the [hmm-list](/help/8/artifacts/hmm-list) artifact, the input assemblies of genomes and/or metagenomes to query using the [HMM](https://anvio.org/vocabulary/#hidden-markov-models-hmms) are passed to the workflow via the artifacts [external-genomes](/help/8/artifacts/external-genomes) and [metagenomes](/help/8/artifacts/metagenomes), respectively. Finally, the user can also provide a set of metagenomic short reads via the artifact [samples-txt](/help/8/artifacts/samples-txt) to recover the distribution patterns of genes across samples.
+
+Ecophylo first identifies homologous genes based on the input [HMM](https://anvio.org/vocabulary/#hidden-markov-models-hmms), clusters matching sequences based on a user-defined sequence similarity threshold, and finally selects a representative sequence from each cluster that contains more than two genes. The final set of representative genes is filtered for quality at multiple steps of the workflow, as discussed later in this document in the section "[Quality control and processing of hmm-hits](#quality-control-and-processing-of-hmm-hits)". After this step, the ecophylo workflow can continue with one of two modes that the user defines in the [workflow-config](/help/8/artifacts/workflow-config): the so-called [tree-mode](#tree-mode-insights-into-the-evolutionary-patterns-of-target-genes) or the so-called [profile-mode](#profile-mode-insights-into-the-ecological-and-evolutionary-patterns-of-target-genes-and-environments).
+
+In the [tree-mode](#tree-mode-insights-into-the-evolutionary-patterns-of-target-genes), the user must provide an [hmm-list](/help/8/artifacts/hmm-list) and [metagenomes](/help/8/artifacts/metagenomes) and/or [external-genomes](/help/8/artifacts/external-genomes), and the workflow will stop after extracting representative sequences and calculating a phylogenetic tree (without any insights into the ecology of sequences through a subsequent step of metagenomic [read recruitment](https://anvio.org/vocabulary/#read-recruitment)). In contrast, the [profile-mode](#profile-mode-insights-into-the-ecological-and-evolutionary-patterns-of-target-genes-and-environments) requires an additional file: [samples-txt](/help/8/artifacts/samples-txt). In this mode, the workflow will continue with the profiling of representative sequences via read recruitment across user-provided metagenomes to recover and store coverage statistics. The completion of the workflow will yield all files necessary to explore the results in downstream analyses and to investigate associations between the ecological and evolutionary relationships of target genes.
+
+The ecophylo workflow can leverage any [HMM](https://anvio.org/vocabulary/#hidden-markov-models-hmms) that models amino acid sequences. If the user chooses an [HMM](https://anvio.org/vocabulary/#hidden-markov-models-hmms) for a [single-copy core gene](https://anvio.org/vocabulary/#single-copy-core-gene-scg), such as a ribosomal protein, the workflow will yield multi-domain taxonomic profiles of metagenomes *de facto*.
+
+## Required input
+
+The minimum requirements of the ecophylo workflow are the following:
+
+- [workflow-config](/help/8/artifacts/workflow-config): This allows you to customize the workflow step by step. Here is how you can generate the default version:
+
+```bash
+anvi-run-workflow -w ecophylo \
+                  --get-default-config config.json
+```
+
+- [hmm-list](/help/8/artifacts/hmm-list): This file designates which HMM should be used to extract the target gene from your [contigs-db](/help/8/artifacts/contigs-db). Please note that the ecophylo workflow can only process one gene family at a time, i.e., the [hmm-list](/help/8/artifacts/hmm-list) can only contain one HMM. If you would like to process multiple gene families from the same input assemblies, then you will need to re-run the workflow with a separate [hmm-list](/help/8/artifacts/hmm-list) for each.
+- [metagenomes](/help/8/artifacts/metagenomes) and/or [external-genomes](/help/8/artifacts/external-genomes): These files hold the assemblies where you are looking for the target gene. Genomes in [external-genomes](/help/8/artifacts/external-genomes) can be reference genomes, [SAGs](https://anvio.org/vocabulary/#single-amplified-genome-sag), and/or [MAGs](https://anvio.org/vocabulary/#metagenome-assembled-genome-mag).
+
+## Quality control and processing of hmm-hits
+
+[Hidden Markov Models](https://anvio.org/vocabulary/#hidden-markov-models-hmms) are the crux of the ecophylo workflow and will determine the sensitivity and specificity of the gene family hmm-hits you seek to investigate. However, not all [hmm-hits](/help/8/artifacts/hmm-hits) are created equal. Just as BLAST can detect spurious hits with [high-scoring segment pairs](https://www.ncbi.nlm.nih.gov/books/NBK62051/), an HMM search can yield non-homologous hits as well. To address this, we have a series of parameters you can adjust in the [workflow-config](/help/8/artifacts/workflow-config) to fine-tune the input set of [hmm-hits](/help/8/artifacts/hmm-hits) that ecophylo will process.
+
+### HMM alignment coverage filtering
+
+The first step in removing bad [hmm-hits](/help/8/artifacts/hmm-hits) is to filter out hits with low alignment coverage. This is done with the rule `filter_hmm_hits_by_model_coverage`, which leverages [anvi-script-filter-hmm-hits-table](/help/8/programs/anvi-script-filter-hmm-hits-table). We recommend an 80% model coverage filter for most cases. However, it is always a good idea to explore the distribution of model coverage with any new HMM, which will help you determine a proper cutoff (citation). To adjust this parameter, go to the `filter_hmm_hits_by_model_coverage` rule and change the parameter `--model-coverage`.
+
+{:.notice}
+Some full-gene-length HMMs align to a single hmm-hit independently at different coordinates when there should only be one annotation. To merge these independent alignments into one HMM alignment coverage stat, set `--merge-partial-hits-within-X-nts` to the maximum distance between hits that you would like to merge, and add it to the rule `filter_hmm_hits_by_model_coverage` under `additional_params`.
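+
+As a rough sketch of what this could look like in the [workflow-config](/help/8/artifacts/workflow-config), the fragment below passes the flag through `additional_params`. The distance of `100` nucleotides is only a placeholder chosen for illustration, not a recommended value, and the other keys simply repeat the defaults shown elsewhere in this document:
+
+```json
+{
+    "filter_hmm_hits_by_model_coverage": {
+        "threads": 5,
+        "--model-coverage": 0.8,
+        "additional_params": "--merge-partial-hits-within-X-nts 100"
+    }
+}
+```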
+
+### conservative-mode: complete open-reading frames only
+
+Genes predicted from genomes and metagenomes can be partial or complete depending on whether both a start and a stop codon are detected. Even if you filter out [hmm-hits](/help/8/artifacts/hmm-hits) with bad alignment coverage as discussed above, HMMs can still detect low quality hits with good alignment coverage and homology statistics due to partial genes. Unfortunately, partial genes can lead to spurious phylogenetic branches and/or inflate the number of observed populations or functions in a given set of genomes/metagenomes.
+
+To remove partial genes from the ecophylo analysis, the user can set the `--filter-out-partial-gene-calls` parameter to `true` so that only complete open-reading frames are processed.
+
+{:.notice}
+Below are the default settings in the ecophylo [workflow-config](/help/8/artifacts/workflow-config) file.
+
+```json
+{
+    "filter_hmm_hits_by_model_coverage": {
+        "threads": 5,
+        "--model-coverage": 0.8,
+        "--filter-out-partial-gene-calls": true,
+        "additional_params": ""
+    }
+}
+```
+
+### discovery-mode: ALL open-reading frames
+
+However, maybe you're a risk taker, a maverick explorer of metagenomes. Complete or partial, you accept all genes and their potential tree-bending shortcomings! In this case, set `--filter-out-partial-gene-calls` to `false` in the [workflow-config](/help/8/artifacts/workflow-config).
+
+{:.notice}
+Simultaneously exploring complete and partial ORFs will widen the distribution of sequence lengths and thus impact sequence clustering. We recommend setting `"--cov-mode": 1` in the rule `cluster_X_percent_sim_mmseqs` to help ensure that ORFs of all lengths properly cluster together. Please refer to the [MMseqs2 user guide description of --cov-mode](https://mmseqs.com/latest/userguide.pdf) for more details.
+
+```json
+{
+    "filter_hmm_hits_by_model_coverage": {
+        "threads": 5,
+        "--model-coverage": 0.8,
+        "--filter-out-partial-gene-calls": false,
+        "additional_params": ""
+    },
+    "cluster_X_percent_sim_mmseqs": {
+        "threads": 5,
+        "--min-seq-id": 0.94,
+        "--cov-mode": 1,
+        "clustering_threshold_for_OTUs": [
+            0.99,
+            0.98,
+            0.97
+        ],
+        "AA_mode": false
+    }
+}
+```
+
+Now that you have fine-tuned the gene family input into the ecophylo workflow, it's time to decide what output best fits the scientific question at hand.
+
+## tree-mode: Insights into the evolutionary patterns of target genes
+
+This is the simplest implementation of ecophylo, where only an amino acid based phylogenetic tree is calculated. The workflow will extract the target gene from input assemblies, cluster the sequences and pick representatives, then calculate a phylogenetic tree based on the amino acid sequences of the representatives. There are two sub-modes of [tree-mode](#tree-mode-insights-into-the-evolutionary-patterns-of-target-genes), which differ in how representative sequences are picked: [NT-mode](#nt-mode) clusters the dataset and picks representatives using the nucleotide (NT) sequences of the extracted genes, while [AA-mode](#aa-mode) uses their amino acid (AA) sequences.
+
+### NT-mode
+
+**Cluster and select representative genes based on NT sequences.**
+
+This is the default version of [tree-mode](#tree-mode-insights-into-the-evolutionary-patterns-of-target-genes) where the extracted gene sequences are clustered based on their associated NT sequences. This is done to prepare for [profile-mode](#profile-mode-insights-into-the-ecological-and-evolutionary-patterns-of-target-genes-and-environments), where adequate sequence distance is needed between gene NT sequences to prevent [non-specific-read-recruitment](https://anvio.org/vocabulary/#non-specific-read-recruitment). The translated amino acid versions of the NT sequence clusters are then used to calculate an AA based phylogenetic tree. This mode is specifically useful to see what the gene phylogenetic tree will look like before the [read recruitment](https://anvio.org/vocabulary/#read-recruitment) step in [profile-mode](#profile-mode-insights-into-the-ecological-and-evolutionary-patterns-of-target-genes-and-environments) (for gene phylogenetic applications of ecophylo, please see [AA-mode](#aa-mode)). If everything looks good, you can add in your [samples-txt](/help/8/artifacts/samples-txt) and continue with [profile-mode](#profile-mode-insights-into-the-ecological-and-evolutionary-patterns-of-target-genes-and-environments) to add metagenomic [read recruitment](https://anvio.org/vocabulary/#read-recruitment) results.
+
+Here is what the start of the ecophylo [workflow-config](/help/8/artifacts/workflow-config) should look like if you want to run [tree-mode](#tree-mode-insights-into-the-evolutionary-patterns-of-target-genes):
+
+```json
+{
+    "metagenomes": "metagenomes.txt",
+    "external_genomes": "external-genomes.txt",
+    "hmm_list": "hmm_list.txt",
+    "samples_txt": ""
+}
+```
+
+### AA-mode
+
+**Cluster and select representative genes based on AA sequences. If you are interested specifically in gene phylogenetics, this is the mode for you!**
+
+This is another sub-version of [tree-mode](#tree-mode-insights-into-the-evolutionary-patterns-of-target-genes) where representative sequences are chosen via AA sequence clustering.
+
+To initialize [AA-mode](#aa-mode), go to the rule `cluster_X_percent_sim_mmseqs` in the ecophylo [workflow-config](/help/8/artifacts/workflow-config) and set `AA_mode` to `true`:
+
+```json
+{
+    "metagenomes": "metagenomes.txt",
+    "external_genomes": "external-genomes.txt",
+    "hmm_list": "hmm_list.txt",
+    "samples_txt": "",
+    "cluster_X_percent_sim_mmseqs": {
+        "AA_mode": true
+    }
+}
+```
+
+{:.notice}
+Be sure to change the `--min-seq-id` of the `cluster_X_percent_sim_mmseqs` rule to the appropriate clustering threshold depending on whether you are in [NT-mode](#nt-mode) or [AA-mode](#aa-mode).
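+
+For instance, an [AA-mode](#aa-mode) run might pair `AA_mode` with a clustering threshold chosen for amino acid sequences. This is only a sketch: the `0.97` value is a hypothetical placeholder rather than a recommendation, and the right threshold will depend on your gene family and question:
+
+```json
+{
+    "cluster_X_percent_sim_mmseqs": {
+        "threads": 5,
+        "--min-seq-id": 0.97,
+        "AA_mode": true
+    }
+}
+```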
+
+## profile-mode: Insights into the ecological and evolutionary patterns of target genes and environments
+
+[profile-mode](#profile-mode-insights-into-the-ecological-and-evolutionary-patterns-of-target-genes-and-environments) is an extension of the default [tree-mode](#tree-mode-insights-into-the-evolutionary-patterns-of-target-genes) ([NT-mode](#nt-mode)) where NT sequence representatives are profiled with metagenomic reads from user-provided metagenomic samples. This allows for the simultaneous visualization of phylogenetic and ecological relationships of genes across metagenomic datasets.
+
+Additional required files:
+- [samples-txt](/help/8/artifacts/samples-txt)
+
+To initialize [profile-mode](#profile-mode-insights-into-the-ecological-and-evolutionary-patterns-of-target-genes-and-environments), add the path to your [samples-txt](/help/8/artifacts/samples-txt) to your ecophylo [workflow-config](/help/8/artifacts/workflow-config):
+
+```json
+{
+    "metagenomes": "metagenomes.txt",
+    "external_genomes": "external-genomes.txt",
+    "hmm_list": "hmm_list.txt",
+    "samples_txt": "samples.txt"
+}
+```
+
+## Miscellaneous config file options
+
+Ecophylo will sanity check all input files that contain [contigs-db](/help/8/artifacts/contigs-db)s before the workflow starts. This can take a while, especially if you are working with thousands of genomes. If you want to skip sanity checks for [contigs-db](/help/8/artifacts/contigs-db)s in your [external-genomes](/help/8/artifacts/external-genomes) and/or [metagenomes](/help/8/artifacts/metagenomes), then adjust your [workflow-config](/help/8/artifacts/workflow-config) to the following:
+
+```json
+{
+    "run_genomes_sanity_check": false
+}
+```
+
+The ecophylo workflow by default uses [FastTree](http://www.microbesonline.org/fasttree/) to calculate the output phylogenetic tree. This is because the workflow was designed to be run on large genomic datasets that could yield thousands of input sequences. However, if you would like to run [IQ-TREE](https://github.com/Cibiv/IQ-TREE) instead, adjust your [workflow-config](/help/8/artifacts/workflow-config) to the following:
+
+```json
+{
+    "fasttree": {
+        "run": "",
+        "threads": 5
+    },
+    "iqtree": {
+        "threads": 5,
+        "-m": "MFP",
+        "run": true,
+        "additional_params": ""
+    }
+}
+```
+
+
diff --git a/help/8/workflows/metagenomics/index.md b/help/8/workflows/metagenomics/index.md
new file mode 100644
index 00000000..a3cf8a59
--- /dev/null
+++ b/help/8/workflows/metagenomics/index.md
@@ -0,0 +1,854 @@
+---
+layout: program
+title: The anvi'o 'metagenomics' workflow
+excerpt: From FASTA and/or FASTQ files to anvi'o contigs and profile databases
+categories: [anvio]
+comments: false
+redirect_from: /8/metagenomics
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+From FASTA and/or FASTQ files to anvi'o contigs and profile databases
+
+**[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Authors
+
+
+
+
+
+## Artifacts accepted
+
+The metagenomics workflow can typically be initiated with the following artifacts:
+
+[workflow-config](../../artifacts/workflow-config) [samples-txt](../../artifacts/samples-txt) [fasta-txt](../../artifacts/fasta-txt)
+
+## Artifacts produced
+
+The metagenomics workflow typically produces the following anvi'o artifacts:
+
+[contigs-db](../../artifacts/contigs-db) [profile-db](../../artifacts/profile-db)
+
+## Third party programs
+
+This is a list of programs that may be used by the metagenomics workflow depending on the user settings in the [workflow-config](../../artifacts/workflow-config/) :
+
+
+
+An anvi'o installation that follows the recommendations on the installation page will include all these programs. But please consider your settings, and cite these additional tools in your methods sections.
+
+## Workflow description and usage
+
+
+**The default entry point** to the metagenomics workflow is the raw paired-end sequencing reads for one or more shotgun metagenomes. **The default end point** of the workflow is an anvi'o merged profile database ready for refinement of bins (or whatever it is that you want to do with it), along with an annotated anvi'o contigs database. While these are the default entry and end points, there are many more ways to use the metagenomics workflow that we will demonstrate later.
+
+The workflow includes the following steps:
+
+1. Quality control of metagenomic short reads using [illumina-utils](https://github.com/merenlab/illumina-utils/), and generating a comprehensive final report for the results of this step (so you have your Supplementary Table 1 ready).
+
+2. Taxonomic profiling of short reads using [krakenuniq](https://github.com/fbreitwieser/krakenuniq). These profiles are also imported into individual profile databases, and are available in the merged profile database (for more details about this, refer to the [release notes of anvi'o version 5.1](https://github.com/merenlab/anvio/releases/tag/v5.1)).
+
+3. Individual or combined assembly of quality-filtered metagenomic reads using either [megahit](https://github.com/voutcn/megahit), [metaspades](http://cab.spbu.ru/software/spades/), or [idba_ud](https://github.com/loneknightpy/idba).
+
+4. Generating an anvi'o contigs database from assembled contigs using [anvi-gen-contigs-database](/help/8/programs/anvi-gen-contigs-database). This part of the metagenomics workflow is inherited from the contigs workflow, so this step also includes the annotation of your contigs database(s) with functions, HMMs, and taxonomy.
+
+5. Mapping short reads from each metagenome to the contigs using [bowtie2](http://bowtie-bio.sourceforge.net/bowtie2/index.shtml), and generating sorted and indexed BAM files.
+
+6. Profiling individual BAM files using [anvi-profile](/help/8/programs/anvi-profile) to generate single anvi'o profiles.
+
+7. Merging resulting single anvi'o profiles using [anvi-merge](/help/8/programs/anvi-merge).
+
+
+The metagenomic workflow is quite talented and can be run in multiple 'modes'. The following sections will detail different use cases.
+
+
+### Default mode
+
+As mentioned above, the standard usage of this workflow is meant to go through all the steps from raw reads to having a merged profile database (or databases) ready for binning.
+
+All you need is a bunch of FASTQ files, and a `samples.txt` file. Here, we will go through a mock example with three small metagenomes. These metagenomes were made by choosing a small number of reads from three [HMP](https://www.hmpdacc.org/) metagenomes (these reads were not chosen randomly, for more details, [ask Alon](mailto:alon.shaiber@gmail.com)). In your working directory you have the following `samples.txt` file:
+
+```bash
+$ column -t samples.txt
+sample group r1 r2
+sample_01 G01 three_samples_example/sample-01-R1.fastq.gz three_samples_example/sample-01-R2.fastq.gz
+sample_02 G02 three_samples_example/sample-02-R1.fastq.gz three_samples_example/sample-02-R2.fastq.gz
+sample_03 G02 three_samples_example/sample-03-R1.fastq.gz three_samples_example/sample-03-R2.fastq.gz
+```
+
+As previous chapters clarified, this is the file that describes our 'groups' and locations of raw paired-end reads for each sample. The default name for your `samples_txt` file is `samples.txt`, but you can use a different name by specifying it in the config file (see below).
+
+In your working directory there is a config file `config-idba_ud.json`; let's take a look at it.
+
+```json
+{
+    "workflow_name": "metagenomics",
+    "config_version": "2",
+    "samples_txt": "samples.txt",
+    "anvi_script_reformat_fasta": {
+        "run": true,
+        "--prefix": "{group}",
+        "--simplify-names": true,
+        "--keep-ids": "",
+        "--exclude-ids": "",
+        "--min-len": "",
+        "--seq-type": "",
+        "threads": ""
+    },
+    "idba_ud": {
+        "--min_contig": 1000,
+        "threads": 11,
+        "run": true
+    }
+}
+```
+
+Relatively short. Every configurable parameter (and there are many of them) that is not mentioned here will be assigned a default value.
+
+{:.notice}
+We usually like to start with a default config file, and edit parameters that are important to us. Usually these edits involve changing `true` values to `false` if we don't want to run a particular step, changing the number of threads assigned to a single step, etc.
+
+So what do we have in the example config file above?
+
+* **samples_txt**: Path for our `samples.txt` (since we used the default name `samples.txt`, we didn't really have to include this in the config file, but it is always better to be explicit).
+
+* **idba_ud**: A few parameters for `idba_ud`.
+
+    - **run**: Currently two assembly software packages are available in the workflow: megahit and idba_ud. Neither is set as the default program, so if you wish to assemble things you must set the `run` parameter to `true` for one (and only one) of them.
+
+    - **--min_contig**: From the help menu of `idba_ud` [we learn](../../images/workflows/metagenomics/idba_ud_min_contig.png) that `idba_ud` uses `200` by default. We want `1,000`, hence we include this in the config.
+
+    - **threads**: When you wish to use multiple threads, you can specify how many threads to use for each step of the workflow using this parameter. Here we chose 11 threads for `idba_ud`.
+
+
+
+
+Ok, so now we have everything we need to start. Let's first run a sanity check and create a workflow graph for our workflow:
+
+```
+anvi-run-workflow -w metagenomics \
+ -c config-idba_ud.json \
+ --save-workflow-graph
+```
+
+A file named `workflow.png` was created and should look like this:
+
+[![idba_ud_workflow1](../../images/workflows/metagenomics/idba_ud_workflow1.png)]( ../../images/workflows/metagenomics/idba_ud_workflow1.png){:.center-img .width-50}
+
+Take a minute to take a look at this image to understand what is going on. From a first look it might seem complicated, but it is fairly straightforward (and also, shouldn't you know what is going on with your data?!?).
+
+Ok, let's run this.
+
+
+
+Now we can run the workflow:
+
+```
+anvi-run-workflow -w metagenomics \
+ -c config-idba_ud.json
+```
+
+Once everything finishes running (on our cluster it only takes 6 minutes as these are very small mock metagenomes), we can take a look at one of the merged profile databases:
+
+```
+anvi-interactive -p 06_MERGED/G02/PROFILE.db \
+ -c 03_CONTIGS/G02-contigs.db
+```
+
+And it should look like this:
+
+[![merged_profile_idba_ud1](../../images/workflows/metagenomics/merged_profile_idba_ud1.png)]( ../../images/workflows/metagenomics/merged_profile_idba_ud1.png){:.center-img .width-50}
+
+Ok, so this looks like a standard merged profile database with two samples. As a bonus, we also added a step to import the number of short reads in each sample ("Total num reads"), and we also used it to calculate the percentage of reads from the sample that have been mapped to the contigs ("Percent Mapped").
+
+This is a bit of expert knowledge, but if you remember, we had two "groups" in the `samples.txt` file. Hence, we have two contigs databases, one for G01 and one for G02. But since one of our groups had only a single sample, there was nothing to merge. Thus, there is no merged profile for G01 at the location you would expect to find it; instead, there is a README file there:
+
+```
+$ cat 06_MERGED/G01/README.txt
+Only one file was profiled with G01 so there is nothing to
+merge. But don't worry, you can still use anvi-interactive with
+the single profile database that is here: 05_ANVIO_PROFILE/G01/sample_01/PROFILE.db
+```
+
+Which means, while you can use the program [anvi-interactive](/help/8/programs/anvi-interactive) to interactively visualize merged profile databases that are affiliated with groups that have more than one sample, you will find profiles to visualize under single profiles directories for groups associated with a single sample (such as G01 in our example):
+
+```bash
+anvi-interactive -p 05_ANVIO_PROFILE/G01/sample_01/PROFILE.db \
+ -c 03_CONTIGS/G01-contigs.db
+```
+
+[![single_profile_idba_ud](../../images/workflows/metagenomics/single_profile_idba_ud.png)]( ../../images/workflows/metagenomics/single_profile_idba_ud.png){:.center-img .width-50}
+
+
+
+In addition to the merged profile databases and the contigs databases (and all intermediate files), the workflow has another output, the QC report, which you can find here: `01_QC/qc-report.txt`. Let's look at it:
+
+| sample | number of pairs analyzed | total pairs passed | total pairs passed (percent of all pairs) | total pair_1 trimmed | total pair_1 trimmed (percent of all passed pairs) | total pair_2 trimmed | total pair_2 trimmed (percent of all passed pairs) | total pairs failed | total pairs failed (percent of all pairs) | pairs failed due to pair_1 | pairs failed due to pair_1 (percent of all failed pairs) | pairs failed due to pair_2 | pairs failed due to pair_2 (percent of all failed pairs) | pairs failed due to both | pairs failed due to both (percent of all failed pairs) | FAILED_REASON_P | FAILED_REASON_P (percent of all failed pairs) | FAILED_REASON_N | FAILED_REASON_N (percent of all failed pairs) | FAILED_REASON_C33 | FAILED_REASON_C33 (percent of all failed pairs) |
+|-----------|--------------------------|--------------------|-------------------------------------------|----------------------|----------------------------------------------------|----------------------|----------------------------------------------------|--------------------|-------------------------------------------|----------------------------|----------------------------------------------------------|----------------------------|----------------------------------------------------------|--------------------------|--------------------------------------------------------|-----------------|-----------------------------------------------|-----------------|-----------------------------------------------|-------------------|-------------------------------------------------|
+| sample_01 | 10450 | 8423 | 80.6 | 0 | 0 | 0 | 0 | 2027 | 19.4 | 982 | 48.45 | 913 | 45.04 | 132 | 6.51 | 0 | 0 | 2027 | 100 | 0 | 0 |
+| sample_02 | 31350 | 25550 | 81.5 | 0 | 0 | 0 | 0 | 5800 | 18.5 | 2777 | 47.88 | 2709 | 46.71 | 314 | 5.41 | 0 | 0 | 5800 | 100 | 0 | 0 |
+| sample_03 | 60420 | 49190 | 81.41 | 0 | 0 | 0 | 0 | 11230 | 18.59 | 5300 | 47.2 | 5134 | 45.72 | 796 | 7.09 | 0 | 0 | 11230 | 100 | 0 | 0 |
+
+### All against all mode
+
+The default behavior for this workflow is to create a contigs database for each _group_ and map (and profile, and merge) the samples that belong to that _group_. If you wish to map all samples to all contigs, use the `all_against_all` option in the config file:
+
+```
+ "all_against_all": true
+```
+
+In your working directory you can find an updated config file `config-idba_ud-all-against-all.json`, which looks like this:
+
+```json
+{
+    "workflow_name": "metagenomics",
+    "config_version": 1,
+    "samples_txt": "samples.txt",
+    "idba_ud": {
+        "--min_contig": 1000,
+        "threads": 11,
+        "run": true
+    },
+    "all_against_all": true
+}
+```
+
+And we can generate a new workflow graph:
+
+```bash
+anvi-run-workflow -w metagenomics \
+ -c config-idba_ud-all-against-all.json \
+ --save-workflow-graph
+```
+
+An updated DAG for the workflow for our mock data is available below:
+
+[![idba_ud-all-against-all](../../images/workflows/metagenomics/idba_ud-all-against-all.png)]( ../../images/workflows/metagenomics/idba_ud-all-against-all.png){:.center-img .width-50}
+
+A little more of a mess! But also has a beauty to it :-).
+
+
+
+### References Mode
+
+{:.warning}
+This mode is used when you have one or more genomes, and one or more metagenomes from which you wish to recruit reads using your genomes.
+
+Along with assembly-based metagenomics, we often use anvi'o to explore the occurrence of population genomes across metagenomes. A good example of how useful this approach could be is described in this blogpost: [DWH O. desum v2: Most abundant Oceanospirillaceae population in the Deepwater Horizon Oil Plume](http://merenlab.org/2017/11/25/DWH-O-desum-v2/).
+For this mode, what you have is a bunch of FASTQ files (metagenomes) and FASTA files (reference genomes), and all you need to do is to let the workflow know where to find these files, using two `.txt` files: `samples_txt`, and `fasta_txt`.
+
+`fasta_txt` should be a 2 column tab-separated file, where the first column specifies a reference name and the second column specifies the file path of the FASTA file for that reference.
+
+After properly formatting your `samples_txt` and `fasta_txt`, references mode is enabled by adding the following to your config file:
+
+```
+(...)
+"references_mode": true
+(...)
+```
+
+The `samples_txt` stays as before, but this time the `group` column will specify, for each sample, which reference should be used (i.e., the name of the reference as defined in the first column of `fasta_txt`). If the `samples_txt` file doesn't have a `group` column, then the ["all against all"](#all-against-all-mode) mode will be invoked.
+
+In your directory you can find the following `fasta.txt`, and `config-references-mode.json`:
+
+```bash
+$ cat fasta.txt
+name path
+G01 three_samples_example/G01-contigs.fa
+G02 three_samples_example/G02-contigs.fa
+
+$ cat config-references-mode.json
+{
+    "workflow_name": "metagenomics",
+    "config_version": "2",
+    "fasta_txt": "fasta.txt",
+    "samples_txt": "samples.txt",
+    "references_mode": true,
+    "output_dirs": {
+        "FASTA_DIR": "02_FASTA_references_mode",
+        "CONTIGS_DIR": "03_CONTIGS_references_mode",
+        "QC_DIR": "01_QC_references_mode",
+        "MAPPING_DIR": "04_MAPPING_references_mode",
+        "PROFILE_DIR": "05_ANVIO_PROFILE_references_mode",
+        "MERGE_DIR": "06_MERGED_references_mode",
+        "LOGS_DIR": "00_LOGS_references_mode"
+    }
+}
+```
+
+Let's create a workflow graph:
+
+[![dag-references-mode](../../images/workflows/metagenomics/dag-references-mode.png)]( ../../images/workflows/metagenomics/dag-references-mode.png){:.center-img .width-50}
+
+
+
+
+Now we can run this workflow:
+
+```bash
+anvi-run-workflow -w metagenomics \
+ -c config-references-mode.json
+```
+
+### Running binning algorithms
+
+If you wish to utilize automatic binning algorithms, you can use [anvi-cluster-contigs](/help/8/programs/anvi-cluster-contigs) as part of your metagenomics workflow. You can run one or more binning algorithms, and the resulting [collection](/help/8/artifacts/collection)s will be automatically imported into your merged profile database(s).
+
+The configuration parameters for the `anvi_cluster_contigs` rule look like this by default:
+
+```json
+    (...)
+    "anvi_cluster_contigs": {
+        "run": "",
+        "--driver": "",
+        "--collection-name": "{driver}",
+        "--just-do-it": "",
+        "--additional-params-concoct": "",
+        "--additional-params-metabat2": "",
+        "--additional-params-maxbin2": "",
+        "--additional-params-dastool": "",
+        "--additional-params-binsanity": "",
+        "threads": ""
+    },
+    (...)
+```
+
+Let's go over how to work with these:
+1. **run** - you must set this to `true` (no quotation marks) if you wish to run this rule.
+2. **--driver** - you can choose one or more from the list of binning algorithms that are available with [anvi-cluster-contigs](/help/8/programs/anvi-cluster-contigs). To see what is available run `anvi-cluster-contigs -h`. If you wish to use multiple algorithms you must provide a list with the proper format. For example: `[ "concoct", "metabat2" ]` (notice that each algorithm name is inside quotation marks, but the brackets are not).
+3. **--collection-name** - You can see that by default, this is set to `"{driver}"`. We recommend just leaving it as-is. Using the curly brackets like this is a special way to let Snakemake know that this is a "wildcard" (basically a keyword). If you are not familiar with Snakemake, no worries. What happens here is that the keyword "driver" is swapped for the algorithm name. So if we chose to run CONCOCT and MetaBAT2, then the names for the collections, by default, would be "concoct" and "metabat2", respectively. If you wish to change it, you have to include `"{driver}"` inside your new name (so for example, `"{driver}_collection2"` is OK); otherwise, all the algorithms you run will have the same collection name, which means they will try to overwrite each other. If you are using only a single binning algorithm, then feel free to change the collection name to whatever you want (since you don't need to worry about multiple algorithms overwriting each other).
+4. **--just-do-it** - instructs [anvi-cluster-contigs](/help/8/programs/anvi-cluster-contigs) to just run and not bother you with questions and complaints (as much as possible). For example, this would allow [anvi-cluster-contigs](/help/8/programs/anvi-cluster-contigs) to overwrite a collection if there was a collection with an identical name already in your profile database.
+5. **--additional-params-concoct** - this parameter (as well as all the other `--additional-params-` parameters) is here so that you can set parameters that are specific to each clustering algorithm. To see which parameters are available refer to the help menu: `anvi-cluster-contigs -h`. For example, for concoct, we can provide something like this: `"--additional-params-concoct": "--clusters 100 --iterations 200"`.
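+
+Putting the points above together, a config fragment that runs CONCOCT and MetaBAT2 might look like the following. This is a sketch assembled from the examples in the list above; the thread count is an illustrative assumption, and any parameter left out is expected to fall back to its default:
+
+```json
+{
+    "anvi_cluster_contigs": {
+        "run": true,
+        "--driver": ["concoct", "metabat2"],
+        "--collection-name": "{driver}",
+        "--additional-params-concoct": "--clusters 100 --iterations 200",
+        "threads": 4
+    }
+}
+```
+
+With `"{driver}"` kept as the collection name, the two resulting collections would be named "concoct" and "metabat2", so neither overwrites the other.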
+
+### Generating summary and split profiles
+
+As of anvi'o v5.3, you can also configure the workflow to import collections and generate a summary and/or split your profile database (using [anvi-split](/help/8/programs/anvi-split)). If you are running the workflow and plan to do binning, then you would usually not have a collection yet. But often we already have a collection ready (e.g. if you are re-profiling things for some reason, or if you are performing mapping and profiling on a FASTA file that was generated by merging a bunch of genomes into one fasta).
+
+In order for the workflow to import a collection into a merged profile database you need to provide a [collection-txt](/help/8/artifacts/collection-txt) file in the following manner:
+
+```
+ "collections_txt": "path/to/YOUR_COLLECTIONS_TXT_FILE"
+```
+
+This is the format for the `collections.txt` file:
+
+name | collection_name | collection_file | bins_info | contigs_mode | default_collection
+-- | -- | -- | -- | -- | --
+G01 | MOCK | MOCK-collection.txt | MOCK-collection-info.txt | |
+G02 | | | | | 1
+
+Where:
+ - name: is the name of the group to which the collection corresponds (this should match the names of groups in your `samples_txt` (if you supplied these), or the names of references in your `fasta_txt` (in references mode). In default mode (AKA assembly mode), if you didn't supply group names, then the group names are identical to the sample names in your `samples_txt`
+ - The four following columns (`collection_name`, `collection_file`, `bins_info`, `contigs_mode`) correspond to parameters of [anvi-import-collection](/help/8/programs/anvi-import-collection). Only `collection_name`, and `collection_file` are mandatory, and the rest of the columns are optional.
+ - `collection_name`: the name for the collection - you must provide a value.
+ - `collection_file`: a path to your collection file (i.e. the file that specifies the bin for each split/contig).
+ - `bins_info`: (optional) a path to your bins-info txt file
+ - `contigs_mode`: (optional) if your collection file includes contig names (instead of split names), set this column to `1`.
+ - The last column (`default_collection`) is an optional column to specify if you want a default collection to be imported using [anvi-script-add-default-collection](/help/8/programs/anvi-script-add-default-collection). If you want the default collection, then set the value in this column to `1`. The default collection will be called `DEFAULT` and the bin name would be the name in the `name` column of the `collections.txt` file (i.e. the "group" name).
+
+{:.notice}
+If you specify you want a default_collection for a group then you can't specify a collection file for this group (these options are mutually exclusive). In addition, `anvi_split` will not run for a group with a default collection (a default collection includes a single bin with all the contigs, so there is nothing to split).
+
+Your `collections_txt` could include only some of your groups, and then collections would be imported only to the merged profile databases that correspond to these group names.
+
+`anvi_summarize` and/or `anvi_split` (whichever you configured to run) will run for each group that is specified in your `collections.txt`.
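
The column rules above can be checked before launching the workflow. Below is a minimal sketch, assuming that `collections.txt` is TAB-delimited like other anvi'o text inputs. The `awk` check and the `collections-check.txt` report file are illustrations only and are not part of anvi'o:

```shell
# Write a TAB-delimited collections.txt with the two rows from the table above.
printf 'name\tcollection_name\tcollection_file\tbins_info\tcontigs_mode\tdefault_collection\n' > collections.txt
printf 'G01\tMOCK\tMOCK-collection.txt\tMOCK-collection-info.txt\t\t\n' >> collections.txt
printf 'G02\t\t\t\t\t1\n' >> collections.txt

# Illustrative sanity check: a row may request either an imported collection
# or the default collection, but not both, and an imported collection needs
# both a name and a file.
awk -F'\t' 'NR > 1 {
    has_file = ($3 != ""); has_name = ($2 != ""); wants_default = ($6 == "1")
    if (has_file && wants_default)        status = "conflict: collection_file and default_collection"
    else if (has_file != has_name)        status = "collection_name and collection_file must come together"
    else if (!has_file && !wants_default) status = "nothing to import"
    else                                  status = "ok"
    print $1 ": " status
}' collections.txt > collections-check.txt

cat collections-check.txt
```

Both mock rows should be reported as `ok`; any other message points at a row that mixes the mutually exclusive options.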
+
+Let's run a mock example. We can update the config file for [references mode](#references-mode) in the following manner to run these steps:
+
+```json
+{
+ "workflow_name": "metagenomics",
+ "config_version": 1,
+ "fasta_txt": "fasta.txt",
+ "references_mode": true,
+ "collections_txt": "collections.txt",
+ "anvi_summarize": {
+ "run": true
+ },
+ "anvi_split": {
+ "run": true
+ },
+ "output_dirs": {
+ "FASTA_DIR": "02_FASTA_references_mode",
+ "CONTIGS_DIR": "03_CONTIGS_references_mode",
+ "QC_DIR": "01_QC_references_mode",
+ "MAPPING_DIR": "04_MAPPING_references_mode",
+ "PROFILE_DIR": "05_ANVIO_PROFILE_references_mode",
+ "MERGE_DIR": "06_MERGED_references_mode",
+ "SUMMARY_DIR": "07_SUMMARY_references_mode",
+ "SPLIT_PROFILES_DIR": "08_SPLIT_PROFILES_references_mode",
+
+ "LOGS_DIR": "00_LOGS_references_mode"
+ }
+}
+```
+
+And we have the following `collections.txt`:
+
+```
+name collection_name collection_file bins_info contigs_mode
+G02 MOCK MOCK-collection.txt MOCK-collection-info.txt
+```
+
+Once we run this, we can find the summary in the directory set by `SUMMARY_DIR` in the config above: `07_SUMMARY_references_mode/G02-SUMMARY/`.
+
+And for each bin in `MOCK-collection.txt` we have a directory under the one set by `SPLIT_PROFILES_DIR`: `08_SPLIT_PROFILES_references_mode/G02/`.
+
+### Reference-based short read removal
+
+As of anvi'o v5.3, we added a feature for removing short reads based on mapping to reference FASTA files.
+The purpose of this feature is to allow you to filter reads that match certain reference genomes. As you will see below, you can also use this feature to just quantify the reads that match these reference FASTA files, without removing these reads from the FASTQ files (see `dont_remove_just_map`).
+
+This step is performed by the rule `remove_short_reads_based_on_references`. By default, this rule will not run.
+
+Here are the default parameters for this rule:
+
+```
+ (...)
+ "remove_short_reads_based_on_references": {
+ "delimiter-for-iu-remove-ids-from-fastq": " ",
+ "dont_remove_just_map": "",
+ "references_for_removal_txt": "",
+ "threads": ""
+ },
+ (...)
+```
+
+Let's go over the parameters of this rule:
+
+`references_for_removal_txt` - This is a table similar to the `fasta.txt` file, with two columns: `reference` and `path`. This rule is performed if and only if a table text file was supplied using this parameter.
+
+`dont_remove_just_map` - If you set this parameter to `true`, then the mapping will be performed in order to count the number of reads in each sample that matched the references in your `references_for_removal_txt`, but that's it (i.e. these reads will not be removed from your FASTQ files). The reason we decided to add this feature is to let you assess the number of reads that probably match these references, without risking losing reads that actually matter to you. More specifically, this way the assembly step has access to all the reads that were in the FASTQ file. You can see the note by Brian Bushnell [here](http://seqanswers.com/forums/showthread.php?t=42552) for an example as to why you wouldn't want to remove short reads (in the method we use for removing them) before your assembly.
+
+`delimiter-for-iu-remove-ids-from-fastq` - this allows you to set the `--delimiter` for `iu-remove-ids-from-fastq`, which is the program we use for the removal of short reads. Refer to the manual (by running `iu-remove-ids-from-fastq -h`) to better understand this feature. By default we set the `--delimiter` to a single space `" "` (we found it to be useful sometimes and harmless in other cases).
+
+The `bam` files that are created during mapping are saved in `MAPPING_DIR/REF_NAME`, where `REF_NAME` is the name you gave to the particular reference in the `references_for_removal_txt` file.
+
+In your working directory you can find the file `mock_ref_for_removal.txt`, which looks like this:
+
+```
+reference path
+R1 mock_ref_for_removal1.fa
+R2 mock_ref_for_removal2.fa
+```
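
Before running, it can help to confirm that every path in this table points at a non-empty FASTA file. The following is a small sketch of such a check, assuming the table is TAB-delimited. All `demo_` file names and `missing_references.txt` are hypothetical stand-ins so that the snippet can run on its own:

```shell
# Hypothetical demo inputs; point REFS_TXT at your own table instead.
REFS_TXT=demo_refs_for_removal.txt
printf 'reference\tpath\nR1\tdemo_ref1.fa\nR2\tdemo_ref2.fa\n' > "$REFS_TXT"
printf '>contig_1\nACGTACGT\n' > demo_ref1.fa   # demo_ref2.fa is deliberately absent

# Skip the header line, then report references whose FASTA is missing or empty.
tab=$(printf '\t')
tail -n +2 "$REFS_TXT" | while IFS="$tab" read -r reference path; do
    [ -s "$path" ] || echo "missing or empty: $reference ($path)"
done > missing_references.txt

cat missing_references.txt
```

An empty report means every reference in the table was found.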
+
+We can modify the config from the [References Mode](#references-mode) section above (but notice that this feature can be used in the default mode as well):
+
+```json
+{
+ "workflow_name": "metagenomics",
+ "config_version": 2,
+ "fasta_txt": "fasta.txt",
+ "references_mode": true,
+ "remove_short_reads_based_on_references": {
+ "delimiter-for-iu-remove-ids-from-fastq": " ",
+ "dont_remove_just_map": "",
+ "references_for_removal_txt": "mock_ref_for_removal.txt"
+ },
+ "output_dirs": {
+ "FASTA_DIR": "02_FASTA_references_mode",
+ "CONTIGS_DIR": "03_CONTIGS_references_mode",
+ "QC_DIR": "01_QC_references_mode",
+ "MAPPING_DIR": "04_MAPPING_references_mode",
+ "PROFILE_DIR": "05_ANVIO_PROFILE_references_mode",
+ "MERGE_DIR": "06_MERGED_references_mode",
+ "LOGS_DIR": "00_LOGS_references_mode"
+ }
+}
+```
+
+Now you can run this:
+
+```
+anvi-run-workflow -w metagenomics \
+ -c config-references-mode-with-short-read-removal.json
+```
+
+
+
+## Frequently Asked Questions
+
+If you need something, send your question to us and we will do our best to add the solution down below.
+
+### Is it possible to just do QC and then stop?
+
+If you only want to QC your files and then compress them (and not do anything else), simply invoke the workflow with the following command:
+
+```
+anvi-run-workflow -w metagenomics \
+ -c config.json \
+ --additional-params \
+ --until gzip_fastqs
+```
+
+### Can I skip anvi-script-reformat-fasta?
+
+Yes! In references mode, you may choose to skip this step and keep your original contig names by changing the `anvi_script_reformat_fasta` rule in the following way:
+
+```
+ "anvi_script_reformat_fasta": {
+ "run": false
+ }
+```
+
+In assembly mode, this rule is always executed.
+
+### How can I use an existing contigs-db in references mode?
+
+This is relevant if you already have a [contigs-db](/help/8/artifacts/contigs-db), and all you want to do is to recruit reads from a bunch of metagenomes.
+
+This is done through 'references mode', but as you see in the relevant section, this mode asks you to provide a [fasta-txt](/help/8/artifacts/fasta-txt), from which it generates [contigs-db](/help/8/artifacts/contigs-db) files for your references. What if you have your own contigs database, and you do not want to generate a new one? You can achieve that and make snakemake skip the creation of a new contigs database by putting the existing one at the place where it is expected to be created. This is an example directory structure you should aim for before starting the workflow:
+
+```
+├── 03_CONTIGS
+│   ├── anvi_run_hmms-EXAMPLE.done
+│   ├── anvi_run_ncbi_cogs-EXAMPLE.done
+│   ├── anvi_run_scg_taxonomy-EXAMPLE.done
+│   ├── EXAMPLE-annotate_contigs_database.done
+│   └── EXAMPLE-contigs.db
+├── config.json
+├── contigs.fa
+├── fasta.txt
+└── samples.txt
+```
+
+where,
+
+* `03_CONTIGS` is the directory name defined in your config.json file to store contigs databases (`03_CONTIGS` is already the default directory name, so name it as such if you didn't change anything in the config.json).
+* The `.done` files in `03_CONTIGS` instruct anvi'o not to re-run those jobs on the existing contigs database. Add them with `touch` or remove them as necessary.
+* `config.json` is your configuration file, where you have at least the following entries:
+
+```
+ (...)
+ "fasta_txt": "fasta.txt",
+ "samples_txt": "samples.txt",
+ "references_mode": true,
+ (...)
+```
+
+* `contigs.fa` is the output of [anvi-export-contigs](/help/8/programs/anvi-export-contigs) run on your [contigs-db](/help/8/artifacts/contigs-db) in `03_CONTIGS`.
+
+* `fasta.txt` is your [fasta-txt](/help/8/artifacts/fasta-txt) that contains a single entry with name `EXAMPLE` and should look exactly like this:
+
+ |**name**|**path**|
+ |:--|:--|
+ |EXAMPLE|contigs.fa|
+
+ Please note: when you change `EXAMPLE` to something more meaningful, you will have to replace `EXAMPLE` with the same name in every other file in the list above.
+
+* `samples.txt` is your good old [samples-txt](/help/8/artifacts/samples-txt) that contains your metagenomes with which the read recruitment will be conducted.
+
+EASY PEASY.
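
The layout above can be prepared with a few shell commands. This is only a sketch under the assumptions stated in this section: `EXAMPLE` and `03_CONTIGS` come from the listing above, `fasta.txt` is assumed to be TAB-delimited, and the paths in the commented-out lines are placeholders for wherever your database lives:

```shell
NAME=EXAMPLE             # must match the name column of fasta.txt
CONTIGS_DIR=03_CONTIGS   # default; change it if your config.json says otherwise

mkdir -p "$CONTIGS_DIR"

# Put your existing database in place and export its contigs (placeholder paths):
# cp /path/to/your-contigs.db "$CONTIGS_DIR/$NAME-contigs.db"
# anvi-export-contigs -c "$CONTIGS_DIR/$NAME-contigs.db" -o contigs.fa

# Flag files for the annotation steps that should not be re-run.
for rule in anvi_run_hmms anvi_run_ncbi_cogs anvi_run_scg_taxonomy; do
    touch "$CONTIGS_DIR/$rule-$NAME.done"
done
touch "$CONTIGS_DIR/$NAME-annotate_contigs_database.done"

# fasta.txt with a single entry that points at the exported FASTA file.
printf 'name\tpath\n%s\tcontigs.fa\n' "$NAME" > fasta.txt

ls "$CONTIGS_DIR"
```

Leave out the `touch` line for any annotation step you do want the workflow to run on the existing database.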
+
+### What's going on behind the scenes before we run IDBA-UD?
+
+A note regarding `idba_ud` is that it requires a single FASTA as an input. Because of that, what we do is use `fq2fa` to merge the pair of reads of each sample to one FASTA, and then we use `cat` to concatenate multiple samples for a co-assembly. The FASTA file is created as a temporary file, and is deleted once `idba_ud` finishes running. If this is annoying to you, then feel free to contact us or just hack it yourself. We tried to minimize memory usage by deleting each individual FASTA file after it was concatenated to the merged FASTA file ([see this issue for details](https://github.com/merenlab/anvio/issues/954)).
+
+### Can I change the parameters of samtools view?
+
+The samtools command executed is:
+
+```
+samtools view additional_params -bS INPUT -o OUTPUT
+```
+
+Where `additional_params` refers to any parameters of samtools view that you choose to use (excluding `-bS` or `-o`, which are always set by the workflow). For example, you could set it to be `-f 2`, or `-f 2 -q 1` (for a full list see the samtools [documentation](http://www.htslib.org/doc/samtools.html)). The default value for `additional_params` is `-F 4`.
+
+### Can I change the parameters for Bowtie2?
+
+Similar to [samtools](#can-i-change-the-parameters-of-samtools-view) we use the `additional_params` to configure Bowtie2. The bowtie rule executes the following command:
+
+```
+bowtie2 --threads NUM_THREADS \
+ -x PREFIX_OF_BOWTIE_BUILD_OUTPUT \
+ -1 R1.FASTQ \
+ -2 R2.FASTQ \
+ additional_params \
+ -S OUTPUT.sam
+```
+
+Hence, you can use `additional_params` to specify all parameters except `--threads`, `-x`, `-1`, `-2`, or `-S`.
+
+For example, if you don't want gapped alignment (aka the reference does not recruit any reads that contain indels with respect to it), and you don't want to store unmapped reads in the SAM output file, set `additional_params` to be `--rfg 10000,10000 --no-unal` (for a full list of options see the bowtie2 [documentation](http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml#options)).
+
+### How can I restart a failed job?
+If your job fails for some reason, you can use `--additional-params` with the original command to restart the workflow where it stopped. For example:
+
+```
+anvi-run-workflow -w metagenomics \
+ -c config-idba_ud.json \
+ --additional-params \
+ --keep-going \
+ --rerun-incomplete
+```
+
+Here, using `--additional-params` with the `--keep-going` and `--rerun-incomplete` flags will resume the job even if it failed in the middle of a rule, like `anvi_profile`. Of course, it is always a good idea to figure out why a workflow failed in the first place.
+
+{:.notice}
+When a workflow fails, you need to unlock the working directory before rerunning it. This means you have to run the full command with the `--unlock` flag once, and then run the command again without the `--unlock` flag. Please refer to the snakemake documentation for [more details regarding how snakemake locks the working directory](https://snakemake.readthedocs.io/en/stable/project_info/faq.html#how-does-snakemake-lock-the-working-directory).
+
+### Can I use results from previous runs of krakenuniq?
+
+If you already ran krakenuniq on your metagenomes, then you can use the `kraken_txt` option in the config file to provide a path to a TAB-delimited file with the paths to the `tax` file for each of your metagenomic samples. Notice that the `kraken_txt` file must have the following format (i.e. two columns with the headers "sample" and "path"):
+
+```bash
+sample path
+s01 /path/to/s01-kraken.tax
+s02 /path/to/s02-kraken.tax
+```
+
+The sample names must be identical to the sample names that are provided in the `samples.txt` file, and it should include all the samples in `samples.txt`.
+
+Once you have such a file, and let's say you named it `kraken.txt`, simply add this to your config file:
+
+```
+ "kraken_txt": "kraken.txt"
+```
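
If the `tax` files follow a regular naming pattern, the table can be generated from your samples file so that the sample names match by construction. A minimal sketch, assuming a TAB-delimited samples file and the `<sample>-kraken.tax` naming used in the example above; `mock-samples.txt` and the `/path/to` directory are placeholders:

```shell
# Mock samples file so that the sketch runs on its own; use your real file instead.
SAMPLES_TXT=mock-samples.txt
printf 'sample\tr1\tr2\n' > "$SAMPLES_TXT"
printf 's01\ts01_R1.fq.gz\ts01_R2.fq.gz\n' >> "$SAMPLES_TXT"
printf 's02\ts02_R1.fq.gz\ts02_R2.fq.gz\n' >> "$SAMPLES_TXT"

KRAKEN_DIR=/path/to   # placeholder directory that holds the existing tax files

# One row per sample, with the same names and order as the samples file.
awk -F'\t' -v dir="$KRAKEN_DIR" '
    NR == 1 { print "sample\tpath"; next }
    { print $1 "\t" dir "/" $1 "-kraken.tax" }
' "$SAMPLES_TXT" > kraken.txt

cat kraken.txt
```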
+
+### How do I skip the QC of the FASTQ files?
+
+If you already ran quality filtering for your FASTQ files, then just make sure that this is included in your config file:
+
+```
+"iu_filter_quality_minoche": {
+ "run": false
+ }
+```
+
+### Can I use BAM files as input for the metagenomics workflow?
+
+In short, yes. If you already did mapping, and you have a bunch of bam files, and now you want to run additional steps from the workflow (e.g. generate contigs databases, annotate them, profile the bam files, etc.), then it might not be entirely straightforward, but it is possible (and I wish to extend my thanks to [Even Sannes Riiser](https://twitter.com/evensriiser?lang=en) for troubleshooting this process).
+
+This is what you need to do:
+1. Make sure you have a [samples.txt](#samplestxt) file. The first column is, as usual, the name of your sample. As for the other two columns, `r1` and `r2`, you no longer need the FASTQ files in this case, so these two columns can contain any arbitrary word, but you still have to have *something* there. (If you still have access to your FASTQ files and you want to run something like krakenHLL, then you should put the paths to the FASTQ files there, just as in a normal `samples.txt` file.)
+2. You should tell the workflow to [skip QC](#how-do-i-skip-the-qc-of-the-fastq-files). If you don't do this, then the workflow by default would look for your FASTQ files, and QC them, and run everything else, including mapping.
+3. You should use [references mode](#references-mode).
+4. You need to make sure your bam files have names compatible with what the snakemake workflow expects. The way we expect to find the bam file is this:
+```
+MAPPING_DIR/group_name/sample_name.bam
+```
+ Where `MAPPING_DIR` is `04_MAPPING` by default but you can set it in the config file. `group_name` is the name you gave the reference in your `fasta.txt` file. And `sample_name` is the name you gave the sample in the `samples.txt` file.
+5. You must skip `import_percent_of_reads_mapped`. Currently, we use the log files of bowtie2 to find out how many reads were in the (QC-ed) FASTQ files, but since you already did your mapping elsewhere, we don't know how to get that information, and hence you must skip this step. This is pretty easy to do manually later on, so no big deal. In order to skip `import_percent_of_reads_mapped`, include this in your config file:
+
+```
+"import_percent_of_reads_mapped": {
+ "run": false
+ }
+```
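
Step 4 can be scripted. The sketch below links existing BAM files into the expected `MAPPING_DIR/group_name/sample_name.bam` layout. The names `G01`, `existing_bams`, `s01` and `s02` are hypothetical, and the empty files created with `touch` only stand in for real BAM files:

```shell
MAPPING_DIR=04_MAPPING     # default; use the value from your config if you changed it
GROUP=G01                  # hypothetical reference name from fasta.txt
BAM_SOURCE=existing_bams   # hypothetical directory that holds <sample_name>.bam files

# Stand-ins for real BAM files so that the sketch runs on its own.
mkdir -p "$BAM_SOURCE"
touch "$BAM_SOURCE/s01.bam" "$BAM_SOURCE/s02.bam"

mkdir -p "$MAPPING_DIR/$GROUP"
source_dir=$(cd "$BAM_SOURCE" && pwd)   # absolute path, so the links resolve from anywhere
for bam in "$source_dir"/*.bam; do
    ln -sf "$bam" "$MAPPING_DIR/$GROUP/$(basename "$bam")"
done

ls "$MAPPING_DIR/$GROUP"
```

If symbolic links cause trouble on your file system, copying the files into place is an equivalent alternative.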
+
+### What to do when submitting jobs with a SLURM system
+
+If you want to work with any cluster managing software (such as SLURM) you just need to use the `--cluster` argument of `snakemake`. Here is what the snakemake help menu tells us:
+
+```
+ --cluster CMD, -c CMD
+ Execute snakemake rules with the given submit command,
+ e.g. qsub. Snakemake compiles jobs into scripts that
+ are submitted to the cluster with the given command,
+ once all input files for a particular job are present.
+ The submit command can be decorated to make it aware
+ of certain job properties (input, output, params,
+ wildcards, log, threads and dependencies (see the
+ argument below)), e.g.: $ snakemake --cluster 'qsub
+ -pe threaded {threads}'.
+```
+
+But, just in case, here is an example of how to use SLURM with `anvi-run-workflow`:
+
+```bash
+anvi-run-workflow -w metagenomics \
+ -c config.json \
+ --additional-params \
+ --cores 48 \
+ --cluster \
+ 'sbatch --job-name=CHOOSE_A_NICE_JOB_NAME \
+ --account=YOUR_ACCOUNT \
+ --output={log} \
+ --error={log} \
+ --nodes={threads}'
+```
+
+Notice that when you use `--cluster`, snakemake also requires you to include `--cores` (or `--jobs`). From the `snakemake` help menu:
+
+```
+--cores [N], --jobs [N], -j [N]
+ Use at most N cores in parallel (default: 1). If N is
+ omitted, the limit is set to the number of available
+ cores.
+```
+
+We use `qsub` on our system, and we have found the behaviour a little funny in this case, where if we choose `--cores N`, then snakemake would submit `N` jobs, regardless of the number of threads each job is requesting. And hence we added the option to use the `--resources` argument, so the command from above would look like this:
+
+
+```bash
+anvi-run-workflow -w metagenomics \
+ -c config.json \
+ --additional-params \
+ --cores 10 \
+ --resources nodes=48 \
+ --cluster \
+ 'sbatch --job-name=CHOOSE_A_NICE_JOB_NAME \
+ --account=YOUR_ACCOUNT \
+ --output={log} \
+ --error={log} \
+ --nodes={threads}'
+```
+
+Now, at most 10 jobs would be submitted to the queue in parallel, but only as long as the total number of threads (nodes) that is requested by the submitted jobs doesn't go above 48. So if we have 3 `anvi-run-hmms` jobs and each requires 20 threads, then only two would run in parallel.
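
The arithmetic behind this example can be written out explicitly. This is a simplified sketch that assumes all queued jobs request the same number of threads. It only illustrates the two limits and is not how Snakemake's scheduler is implemented; all variable names are made up for the illustration:

```shell
nodes_limit=48        # --resources nodes=48
max_jobs=10           # --cores 10
queued_jobs=3         # anvi-run-hmms jobs waiting to run
threads_per_job=20    # threads requested by each of them

# How many identical jobs fit under the nodes limit (integer division).
running=$((nodes_limit / threads_per_job))

# Neither the queue length nor the job limit can be exceeded.
if [ "$running" -gt "$queued_jobs" ]; then running=$queued_jobs; fi
if [ "$running" -gt "$max_jobs" ]; then running=$max_jobs; fi

echo "jobs running in parallel: $running"
```

With these numbers the third job has to wait, since a third set of 20 threads would bring the total to 60.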
+
+### How to use metaSPAdes for assembly
+
+As of anvi'o `v5.3` [metaSPAdes](http://cab.spbu.ru/software/spades/) has been added to the metagenomics workflow. By default, these are the parameters for metaspades:
+
+```
+ (...)
+ "metaspades": {
+ "additional_params": "--only-assembler",
+ "threads": 11,
+ "run": "",
+ "use_scaffolds": ""
+ },
+ (...)
+```
+
+`additional_params` works in the same way as is explained [above for samtools](#can-i-change-the-parameters-of-samtools-view), and allows you to specify anything that metaSPAdes accepts. By default it is set to `--only-assembler`, since QC is done using `iu-filter-quality-minoche`, and we see no reason to have metaSPAdes do another step of QC. If you want to specify more parameters then you probably want it to still include `--only-assembler`.
+
+metaSPAdes has two outputs, `contigs.fasta` and `scaffolds.fasta`. By default anvi'o will use `contigs.fasta` for the rest of the workflow, but if you want to use `scaffolds.fasta`, then set `use_scaffolds: true` in your config file. In any case, anvi'o will save the one you don't use as well (i.e. by default you will find the `scaffolds.fasta` file in your `02_FASTA` directory, and if you choose to use the scaffolds, then you will still find `contigs.fasta`).
+
+
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/workflows/metagenomics.md) to update this information.
+
diff --git a/help/8/workflows/sra-download/index.md b/help/8/workflows/sra-download/index.md
new file mode 100644
index 00000000..cd83a2e6
--- /dev/null
+++ b/help/8/workflows/sra-download/index.md
@@ -0,0 +1,131 @@
+---
+layout: program
+title: The anvi'o 'sra-download' workflow
+excerpt: Download, extract, and gzip paired-end FASTQ files automatically from the NCBI short-read archive (SRA)
+categories: [anvio]
+comments: false
+redirect_from: /8/sra-download
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Download, extract, and gzip paired-end FASTQ files automatically from the NCBI short-read archive (SRA)
+
+The sra-download workflow automates the process of downloading paired-end FASTQ files for a given list of SRA accessions using [NCBI sra-tools](https://github.com/ncbi/sra-tools/wiki/08.-prefetch-and-fasterq-dump), then gzips them using [pigz](https://zlib.net/pigz/).
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Authors
+
+
+
+
+
+## Artifacts accepted
+
+The sra-download workflow can typically be initiated with the following artifacts:
+
+[workflow-config](../../artifacts/workflow-config)
+
+## Artifacts produced
+
+The sra-download workflow typically produces the following anvi'o artifacts:
+
+[paired-end-fastq](../../artifacts/paired-end-fastq)
+
+## Third party programs
+
+This is a list of programs that may be used by the sra-download workflow depending on the user settings in the [workflow-config](../../artifacts/workflow-config/):
+
+
+- prefetch (Downloads SRA accessions)
+- fasterq-dump (Extracts FASTQ files from SRA accessions)
+- pigz (Compresses FASTQ files in parallel)
+
+
+An anvi'o installation that follows the recommendations on the installation page will include all these programs. But please consider your settings, and cite these additional tools in your methods section.
+
+## Workflow description and usage
+
+
+The `sra-download` workflow is a Snakemake workflow that downloads FASTQ files from SRA-accessions using [NCBI sra-tools wiki](https://github.com/ncbi/sra-tools/wiki/08.-prefetch-and-fasterq-dump), gzips them using [pigz](https://zlib.net/pigz/), and provides a [samples-txt](/help/8/artifacts/samples-txt). You will need to have these tools installed before you start.
+
+Let's get started.
+
+## Required input
+
+### Configuration file
+
+The first step is to make a [workflow-config](/help/8/artifacts/workflow-config).
+
+```bash
+anvi-run-workflow -w sra-download --get-default-config sra_download_config.json
+```
+
+Here's what the [workflow-config](/help/8/artifacts/workflow-config) file looks like:
+
+```bash
+$ cat sra_download_config.json
+{
+ "SRA_accession_list": "SRA_accession_list.txt",
+ "prefetch": {
+ "--max-size": "40g",
+ "threads": 2
+ },
+ "fasterq_dump": {
+ "threads": 6
+ },
+ "pigz": {
+ "threads": 8,
+ "--processes": ""
+ },
+ "output_dirs": {
+ "SRA_prefetch": "01_NCBI_SRA",
+ "FASTAS": "02_FASTA",
+ "LOGS_DIR": "00_LOGS"
+ },
+ "max_threads": "",
+ "config_version": "3",
+ "workflow_name": "sra-download"
+}
+```
+
+#### Modify any of the bells and whistles in the config file
+
+{:.notice}
+If this is the first time using an anvi'o Snakemake workflow, I would check out [Alon's blog post first](https://merenlab.org/2018/07/09/anvio-snakemake-workflows/#configjson).
+
+Feel free to adjust anything in the config file! Here are some to consider:
+- `threads`: this can be optimized for any of the steps depending on the size and number of SRA accessions you are downloading.
+- `prefetch` `--max-size`: I already upped the amount from the tool's default to 40g, but maybe you need more! For reference, I can download TARA Ocean metagenomes with the current parameter. You can use `vdb-dump --info` to learn how much the `prefetch` step will download, e.g. `vdb-dump SRR000001 --info`. Read more about that [here](https://github.com/ncbi/sra-tools/wiki/08.-prefetch-and-fasterq-dump#check-the-maximum-size-limit-of-the-prefetch-tool).
+
+### List of SRA accessions
+
+The input for the `sra-download` workflow is `SRA_accession_list.txt`. This file contains the list of SRA accessions you would like to download, and it looks like this:
+
+```bash
+$ cat SRA_accession_list.txt
+ERR6450080
+ERR6450081
+SRR5965623
+```
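
A quick clean-up pass can catch duplicates or stray entries before the download starts. A minimal sketch, assuming you only want run-level accessions shaped like the ones above (`SRR`, `ERR` or `DRR` followed by digits); `raw_accessions.txt` and `rejected_accessions.txt` are hypothetical file names, and this filter is not something the workflow itself requires:

```shell
# Hypothetical raw list with a duplicate and a non-run accession.
printf 'ERR6450080\nERR6450081\nSRR5965623\nSRR5965623\nPRJNA12345\n' > raw_accessions.txt

# Keep unique run accessions; set aside everything else for manual review.
pattern='^[SED]RR[0-9]+$'
grep -E "$pattern" raw_accessions.txt | sort -u > SRA_accession_list.txt
grep -Ev "$pattern" raw_accessions.txt > rejected_accessions.txt || true

cat SRA_accession_list.txt
```

Anything that ends up in `rejected_accessions.txt` is worth a look by hand rather than being silently dropped.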
+
+{:.warning}
+The .sra files are stored in `01_NCBI_SRA/`. This directory will be deleted upon successful completion of the workflow because I don't know any use for .sra files. If you need these feel free to update the workflow.
+
+## Start the workflow!
+
+Here's a basic command to start the workflow:
+
+### Run on your local computer
+
+```bash
+anvi-run-workflow -w sra-download -c sra_download_config.json
+```
+
+### Go big and use an HPC!
+
+The power of Snakemake shines when you can leverage a High Performance Computing system to parallelize jobs. Check out the [Snakemake cluster documentation](https://snakemake.readthedocs.io/en/stable/executing/cluster.html#) on how to launch this workflow on your own HPC.
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/workflows/sra-download.md) to update this information.
+
diff --git a/help/8/workflows/trnaseq/index.md b/help/8/workflows/trnaseq/index.md
new file mode 100644
index 00000000..c3474b33
--- /dev/null
+++ b/help/8/workflows/trnaseq/index.md
@@ -0,0 +1,101 @@
+---
+layout: program
+title: The anvi'o 'trnaseq' workflow
+excerpt: Process transfer RNA transcripts from tRNA-seq datasets
+categories: [anvio]
+comments: false
+redirect_from: /8/trnaseq
+image:
+ featurerelative: ../../../images/header.png
+ display: true
+---
+
+Process transfer RNA transcripts from tRNA-seq datasets
+
+The trnaseq workflow takes in raw paired-end sequencing data generated from tRNA-seq libraries (i.e., the direct sequencing of transfer RNA transcripts from cultures or environmental samples), and processes these data to identify tRNA sequences and their structural features, predict chemical modification sites and modification fractions across samples, assign taxonomy to tRNA transcript seeds, and generate tables and summary data for downstream analyses. The tRNA-seq resources in anvi'o are operational, but they are still experimental. If you have datasets that are suitable for analysis, please consider getting in touch with us first.
+
+🔙 **[To the main page](../../)** of anvi'o programs and artifacts.
+
+## Authors
+
+
+
+
+
+## Artifacts accepted
+
+The trnaseq workflow can typically be initiated with the following artifacts:
+
+[workflow-config](../../artifacts/workflow-config) [samples-txt](../../artifacts/samples-txt)
+
+## Artifacts produced
+
+The trnaseq workflow typically produces the following anvi'o artifacts:
+
+[trnaseq-db](../../artifacts/trnaseq-db) [trnaseq-contigs-db](../../artifacts/trnaseq-contigs-db) [trnaseq-profile-db](../../artifacts/trnaseq-profile-db) [trnaseq-seed-txt](../../artifacts/trnaseq-seed-txt) [modifications-txt](../../artifacts/modifications-txt)
+
+## Third party programs
+
+This is a list of programs that may be used by the trnaseq workflow depending on the user settings in the [workflow-config](../../artifacts/workflow-config/) :
+
+
+
+An anvi'o installation that follows the recommendations on the installation page will include all these programs. But please consider your settings, and cite these additional tools in your methods section.
+
+## Workflow description and usage
+
+
+The tRNA-seq workflow is a [Snakemake](https://snakemake.readthedocs.io/en/stable/) workflow run by [anvi-run-workflow](/help/8/programs/anvi-run-workflow).
+
+The workflow can run the following programs in order:
+
+- [Illumina-utils](https://github.com/merenlab/illumina-utils), for merging paired-end reads and quality control
+- [anvi-script-reformat-fasta](/help/8/programs/anvi-script-reformat-fasta), for making FASTA deflines anvi'o-compliant
+- [anvi-trnaseq](/help/8/programs/anvi-trnaseq), for predicting tRNA sequences, structures, and modification sites in each sample
+- [anvi-merge-trnaseq](/help/8/programs/anvi-merge-trnaseq), for predicting tRNA seed sequences and their modification sites from the set of samples
+- [anvi-run-trna-taxonomy](/help/8/programs/anvi-run-trna-taxonomy), for assigning taxonomy to tRNA seeds
+- [anvi-tabulate-trnaseq](/help/8/programs/anvi-tabulate-trnaseq), for generating tables of seed and modification information that are easily manipulated
+
+## Input
+
+The tRNA-seq workflow requires two files to run: a [workflow-config](/help/8/artifacts/workflow-config) config file and a [samples-txt](/help/8/artifacts/samples-txt). You can obtain a 'default' config file for this workflow to further edit using the following command.
+
+
+```bash
+anvi-run-workflow -w trnaseq \
+                  --get-default-config config.json
+```
+
+
+Different "rules," or steps, of the workflow can be turned on and off as needed in the config file. The workflow can be restarted at intermediate rules without rerunning prior rules that have already completed.
+
+[samples-txt](/help/8/artifacts/samples-txt) will contain a list of FASTQ or FASTA files and associated information on each library. FASTQ files contain unmerged paired-end tRNA-seq reads. Reads are merged in the workflow by [Illumina-utils](https://github.com/merenlab/illumina-utils). FASTA files contain merged reads, and the initial read-merging steps in the workflow are skipped.
+
+Here is an example tRNA-seq samples file with FASTQ inputs.
+
+| sample | treatment | r1 | r2 | r1_prefix | r2_prefix |
+| --- | --- | --- | --- | --- | --- |
+| ecoli_A1_noDM | untreated | FASTQ/ecoli_A1_noDM.r1.fq.gz | FASTQ/ecoli_A1_noDM.r2.fq.gz | NNNNNN | TTCCAGT |
+| ecoli_A1_DM | demethylase | FASTQ/ecoli_A1_DM.r1.fq.gz | FASTQ/ecoli_A1_DM.r2.fq.gz | NNNNNN | TCTGAGT |
+| ecoli_B1_noDM | untreated | FASTQ/ecoli_B1_noDM.r1.fq.gz | FASTQ/ecoli_B1_noDM.r2.fq.gz | NNNNNN | TGGTAGT |
+| ecoli_B1_DM | demethylase | FASTQ/ecoli_B1_DM.r1.fq.gz | FASTQ/ecoli_B1_DM.r2.fq.gz | NNNNNN | CTGAAGT |
+
+The treatment column is optional. The treatment indicates a chemical application, such as demethylase, and can influence seed sequence determination in [anvi-merge-trnaseq](/help/8/programs/anvi-merge-trnaseq). In the absence of a treatment column, all samples are assigned the same treatment, which can be specified in the `anvi_trnaseq` section of the workflow config file and defaults to `untreated`.
+
+Read 1 and 2 prefix columns are also optional. These represent sequences that Illumina-utils should identify and trim from the start of the read. In the example, the read 1 prefix is a unique molecular identifier (UMI) of 6 random nucleotides, and the read 2 prefix is a sample barcode. Illumina-utils will discard the paired-end read if the prefix is not found. In the example, the read 1 UMI will always be found, but the read 2 barcode must match exactly.
+
+Here is an equivalent tRNA-seq samples file with FASTA inputs.
+
+| sample | treatment | fasta |
+| --- | --- | --- |
+| ecoli_A1_noDM | untreated | FASTA/ecoli_A1_noDM.fa.gz |
+| ecoli_A1_DM | demethylase | FASTA/ecoli_A1_DM.fa.gz |
+| ecoli_B1_noDM | untreated | FASTA/ecoli_B1_noDM.fa.gz |
+| ecoli_B1_DM | demethylase | FASTA/ecoli_B1_DM.fa.gz |
+
+Note that barcodes and other sequence prefixes should already be trimmed from FASTA sequences.
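
The two layouts differ only in their columns, so a header check is enough to tell which one a file uses. The sketch below assumes a TAB-delimited samples file and reports the input type together with the treatment each sample would receive when the optional column is absent (using the `untreated` default mentioned above). `mock-trnaseq-samples.txt` and `trnaseq-samples-check.txt` are hypothetical file names, and the check is not part of anvi'o:

```shell
# Hypothetical FASTA-style samples file without the optional treatment column.
printf 'sample\tfasta\n' > mock-trnaseq-samples.txt
printf 'ecoli_A1_noDM\tFASTA/ecoli_A1_noDM.fa.gz\n' >> mock-trnaseq-samples.txt
printf 'ecoli_A1_DM\tFASTA/ecoli_A1_DM.fa.gz\n' >> mock-trnaseq-samples.txt

awk -F'\t' '
    NR == 1 {
        # Remember the position of every column named in the header.
        for (i = 1; i <= NF; i++) col[$i] = i
        if (("r1" in col) && ("r2" in col)) mode = "FASTQ (reads will be merged)"
        else if ("fasta" in col)            mode = "FASTA (merging is skipped)"
        else                                mode = "unknown"
        print "input type: " mode
        next
    }
    {
        # Fall back to the documented default when no treatment column exists.
        treatment = ("treatment" in col) ? $(col["treatment"]) : "untreated"
        print $1 "\t" treatment
    }
' mock-trnaseq-samples.txt > trnaseq-samples-check.txt

cat trnaseq-samples-check.txt
```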
+
+
+{:.notice}
+Edit [this file](https://github.com/merenlab/anvio/tree/master/anvio/docs/workflows/trnaseq.md) to update this information.
+
diff --git a/install/images/windows10.png b/install/images/windows10.png
deleted file mode 100644
index cc42a036..00000000
Binary files a/install/images/windows10.png and /dev/null differ
diff --git a/install/index.md b/install/index.md
index 43c54dea..a043f044 100644
--- a/install/index.md
+++ b/install/index.md
@@ -2,7 +2,7 @@
layout: page
title: "Installing anvi'o"
excerpt: "Instructions to install the current release of the platform."
-modified: 2019-05-14
+modified: 2023-09-26
tags: []
categories: [anvio]
comments: true
@@ -10,14 +10,34 @@ image:
feature: https://github.com/merenlab/anvio/raw/master/anvio/data/interactive/images/logo.png
---
+Thank you for considering anvi'o. Please click the button below that matches your operating system and follow the instructions. Most people will want to install the latest stable release of anvi'o. However, for current or future developers, or for those who feel adventurous and wish to keep up with the latest updates and fixes to the anvi'o source code, we also have detailed installation instructions to track the development version of anvi'o.
+
+Please join our {% include _discord_invitation_button.html %} if you find yourself in need of help related to installation.
+
+## Stable version
{% include _project-anvio-version.html %}
-This article explains basic steps of installing anvi'o using rather conventional methods both for end users and current of future developers.
+Use these buttons if you wish to install the latest stable release of anvi'o. This version of anvi'o is static and no further changes are going to be made to this particular release. But it has been well-tested, and will work the vast majority of the time (though occasionally bugs slip past our nets, and if you find one, we would love [to hear from you](https://github.com/merenlab/anvio/issues/new/choose)). Most end-users find this version of anvi'o to suit their needs.
+
+{% include install/00_links_for_stable.html %}
+
+## Development version
+
+Use these buttons if you wish to set up `anvio-dev` on your computer. By doing so, you will be tracking the _active development_ version of anvi'o. It will bring the latest bug fixes and features from GitHub to your work environment every day, but can also be unstable at times. We sometimes specifically ask our users to install `anvio-dev` if they are experiencing an issue that we have already fixed since a particular release.
+
+{% include install/00_links_for_dev.html %}
+
+{:.notice}
+We thank [Daan Speth](https://twitter.com/daanspeth), [Jarrod Scott](https://orcid.org/0000-0001-9863-1318), [Susheel Bhanu Busi](https://scholar.google.com/citations?user=U0g3IzQAAAAJ&hl=en), [Mike Lee](https://twitter.com/AstrobioMike), [Josh Herr](http://joshuaherr.com/), and [Titus Brown](https://scholar.google.com/citations?user=O4rYanMAAAAJ) who kindly invested their time to test the installation instructions on different systems and/or made suggestions to these documents to ensure a smoother installation experience for everyone.
+
+## Docker container
+
+You could run anvi'o without a conventional installation using [Docker](https://www.docker.com/):
Show/hide A docker solution for those who are in a hurry
-We do recommend you to install anvi'o on your system as explained below, but **if you just want to run anvi'o without any installation**, you can actually do it within minutes using [docker](https://docs.docker.com/get-docker/).
+We do recommend that you install anvi'o on your system, but **if you just want to run anvi'o without any installation**, you can actually do it within minutes using [docker](https://docs.docker.com/get-docker/).
The docker solution is very simple, guaranteed to work, and very effective to do quick analyses or visualize anvi'o data currencies from others without having to install anything. A more detailed article on how to run anvi'o in docker [is here](https://merenlab.org/2015/08/22/docker-image-for-anvio/), but here is a brief set of steps.
@@ -30,7 +50,7 @@ docker pull meren/anvio:7
{:.notice}
Instead of the version number shown above, you can use ANY version number listed on [this Docker Hub page](https://hub.docker.com/r/meren/anvio/tags).
-This step will take a few minutes and require about 15Gb disk space. Once it is done, you can run it the following way:
+This step will take a few minutes and require about 15 GB of disk space. Once it is done, you can run it the following way:
```
docker run --rm -it -v `pwd`:`pwd` -w `pwd` -p 8080:8080 meren/anvio:7
@@ -39,7 +59,7 @@ docker run --rm -it -v `pwd`:`pwd` -w `pwd` -p 8080:8080 meren/anvio:7
And that's it! You are now in a virtual environment that runs anvi'o. You can exit this environment by pressing `CTRL+D`.
{:.warning}
-If you wish to do resource demanding analyses, don't forget to increase CPU and memory resources allocated for anvi'o using the docker Preferences menu.
+If you wish to do resource-demanding analyses, don't forget to increase the CPU and memory resources allocated for anvi'o using the docker Preferences menu.
If you at some point want to remove all containers and reclaim all the storage space, you can run this after exiting all containers:
@@ -48,670 +68,6 @@ docker system prune --force -a
```
-Please consider opening an issue for technical problems, or join us on {% include _discord_invitation_button.html %} if you need help.
-
-{:.notice}
-{% include _fixthispage.html source="_posts/anvio/2016-06-26-installation-v2.md" %}
-
-{:.warning}
-We thank [Daan Speth](https://twitter.com/daanspeth), [Jarrod Scott](https://orcid.org/0000-0001-9863-1318), [Susheel Bhanu Busi](https://scholar.google.com/citations?user=U0g3IzQAAAAJ&hl=en), [Mike Lee](https://twitter.com/AstrobioMike), and [Josh Herr](http://joshuaherr.com/) who kindly invested their time to test the installation instructions on this page on different systems and/or made suggestions to the document to ensure a smoother installation experience for everyone.
-
-## (1) Setup conda
-
-This is a very simple and effective way to install anvi'o on your system along with most of its dependencies.
-
-{:.notice}
-Although these installation instructions primarily target and rigorously tested for Linux and Mac OSX, you will be able to follow them if you are using Microsoft Windows **if and only if you first install the [Linux Subsystem for Windows](https://docs.microsoft.com/en-us/windows/wsl/install-win10)**. Our users have reported success stories with Ubuntu on WSL.
-
-**For this to work, you need [miniconda](https://docs.conda.io/en/latest/miniconda.html) to be installed on your system (in ubuntu if you are using WSL).** If you are not sure whether it is installed or not, open a terminal (such as [iTerm](https://www.iterm2.com/), if you are using Mac) and type `conda`. You should see an output like this instead of a 'command not found' error (your version might be different):
-
-```bash
-$ conda --version
-conda 4.9.2
-```
-
-If you don't have conda installed, then you should first install it through their [installation page](https://docs.conda.io/en/latest/miniconda.html). Once you have confirmed you have conda installed, run this command to make sure you are up-to-date:
-
-``` bash
-conda update conda
-```
-
-Good? Good! You are almost there!
-
-## (2) Setup an anvi'o environment
-
-{:.notice}
-It is a good idea to **make sure you are not already in a conda environment** before you run the following steps. Just to be clear, you can indeed install anvi'o in an existing conda environment, but if things go wrong, we kindly ask you to refer to meditation for help, rather than [anvi'o community resources](https://merenlab.org/2019/10/07/getting-help/) If you want to see what environments do you have on your computer and whether you already are in one of them in your current terminal by running `conda env list`. **If all these are too much for you and all you want to do is to move on with the installation**, simply do this: open a new terminal, and run `conda deactivate`, and continue with the rest of the text.
-
-First, create a new conda environment:
-
-``` bash
-conda create -y --name anvio-7.1 python=3.6
-```
-
-{:.notice}
-If you are using a computer with Apple silicon (like a M1 MacBook), you will find that some conda packages are not available, like older versions of python (3.6). To avoid this issue, you need to run your terminal app using Rosetta, a compatibility software. To do it, you can right-click on your terminal app in the Application folder and from the "Get info" menu, select "Open using Rosetta".
-
-And activate it:
-
-```
-conda activate anvio-7.1
-```
-
-
-
-Now you are in a pristine environment, in which you will install all conda packages that anvi'o will need to work properly. This looks scary, but it will work if you just copy paste it and press ENTER:
-
-``` bash
-conda install -y -c bioconda "sqlite>=3.31.1"
-conda install -y -c bioconda prodigal
-conda install -y -c bioconda mcl
-conda install -y -c bioconda muscle=3.8.1551
-conda install -y -c bioconda hmmer
-conda install -y -c bioconda diamond
-conda install -y -c bioconda blast
-conda install -y -c bioconda megahit
-conda install -y -c bioconda spades
-conda install -y -c bioconda bowtie2 tbb=2019.8
-conda install -y -c bioconda bwa
-conda install -y -c bioconda samtools=1.9
-conda install -y -c bioconda centrifuge
-conda install -y -c bioconda trimal
-conda install -y -c bioconda iqtree
-conda install -y -c bioconda trnascan-se
-conda install -y -c bioconda r-base
-conda install -y -c bioconda r-stringi
-conda install -y -c bioconda r-tidyverse
-conda install -y -c bioconda r-magrittr
-conda install -y -c bioconda r-optparse
-conda install -y -c bioconda bioconductor-qvalue
-conda install -y -c bioconda fasttree
-conda install -y -c bioconda vmatch
-
-# this last one may cause some issues. if it doesn't install,
-# don't worry, you will still be fine:
-conda install -y -c bioconda fastani
-```
-
-Now you can jump to "[Download and install anvi'o](#3-install-anvio)"!
-
-
-## (3) Install anvi'o
-
-Here you will first download the Python source package for the official anvi'o release:
-
-```
-curl -L https://github.com/merenlab/anvio/releases/download/v7.1/anvio-7.1.tar.gz \
- --output anvio-7.1.tar.gz
-```
-
-And install it using `pip` like a boss:
-
-```
-pip install anvio-7.1.tar.gz
-```
-
-**If you don't see any error messages**, then you are probably golden and can move on to test your to the section "[Check your anvi'o setup](#4-check-your-installation)" :)
-
-**If you do see error messages**, please know that you are not alone. We are as frustrated as you are. Please take a look at the problems people have reported and try these solutions, which will most likely address your issues.
-
-### Issues with pysam installation using pip
-
-Some people have reported errors in the installation of `pysam` using `pip`, so if your installation also fails due to `pysam`, you can use the following two lines to first install this package via conda, and then install the anvi'o package via `pip`:
-
-```
-conda install -y -c bioconda pysam
-pip install anvio-7.1.tar.gz
-```
-
-### Issues with the C compiler
-
-We realized that on some **Mac OSX** systems, some packages installed by `pip` requires a more up-to-date C compiler. If you're getting an error that contains `x86_64-apple-darwin13.4.0-clang` or similar keywords in the output message, please run the following (which will set an environmental variable, and then try to install anvi'o via `pip` again):
-
-```bash
-export CC=clang
-pip install anvio-7.1.tar.gz
-```
-
-If this didn't work, try this more extensive solution:
-
-```bash
-export CC=/usr/bin/clang
-export CXX=/usr/bin/clang++
-pip install anvio-7.1.tar.gz
-```
-
-If the `pip` installation still doesn't work (and especially if you see something like "clang-12: error: linker command failed with exit code 1" in the error message (we have often seen this error associated with the `Levenshtein` package), then this may be related to Xcode on Mac OSX. In this case you can try updating your Xcode by following the instructions described in [this issue](https://github.com/merenlab/anvio/issues/1636) (in the "Solved it" section), and then try the `pip` command one more time.
-
-If you did all that and it is still not working, please make an issue on the github page or let us know in the anvi'o Discord channel about your problem and we will try to help you.
-
-### Issues related to samtools
-
-At this point, you should probably test your `samtools` installation by running `samtools --version`. If you see an error that looks similar to this:
-
-```
-dyld: Library not loaded: @rpath/libcrypto.1.0.0.dylib
- Referenced from: /Users/iva/opt/miniconda3/envs/anvio-7.1/bin/samtools
- Reason: image not found
-Abort trap: 6
-```
-
-This is happening because somehow you have the wrong version of the `samtools` :( The following commands should fix it:
-
-```
-conda remove -y samtools
-conda install -y -c bioconda samtools=1.9
-```
-
-Then try `samtools --version` again to make sure it is okay now. What you _should_ see is the following:
-
-```
-samtools 1.9
-Using htslib 1.9
-Copyright (C) 2018 Genome Research Ltd.
-```
-
-### Issues related to _sysconfigdata_x86_64_conda_linux_gnu
-
-Occasionally, users may come across a "Failed to import site module" error during the installation process. This is due to a config file naming mismatch, and can be resolved by changing the name of the existing relevant config file.
-
-First, navigate to your new conda environment's `python3.6` folder
-```bash
-cd path/to/conda/envs/7.1/lib/python3.6
-```
-Then, change the appropriate file name
-```bash
-mv _sysconfigdata_x86_64_conda_cos6_linux_gnu.py _sysconfigdata_x86_64_conda_linux_gnu.py
-```
-You can find more discussion on this issue [here](https://github.com/merenlab/anvio/issues/1839)
-
-### Issues related to package conflicts
-
-While setting up your environment to track the development branch, especially on Ubuntu systems (first observed on Ubuntu 20.04 LTS), you may run into issues related to package conflicts that produce error messages like this one:
-
-
-```bash
-Encountered problems while solving:
- - nothing provides r 3.2.2* needed by r-magrittr-1.5-r3.2.2_0
- - nothing provides icu 54.* needed by r-base-3.3.1-1
- - package sqlite-3.32.3-h4cf870e_1 requires readline >=8.0,<9.0a0, but none of the providers can be installed
- - package samtools-1.9-h8ee4bcc_1 requires ncurses >=6.1,<6.2.0a0, but none of the providers can be installed
-```
-
-These problems can be solved by explicitly setting conda with flexible channel priority setting. Run these commands to set your conda up your conda environment accordingly:
-
-
-and change the channel priority setting:
-
-```bash
-conda config --describe channel_priority
-conda config --set channel_priority flexible
-```
-
-And re-run the commands to install conda packages. You can set the priority back to 'strict' at any time.
-
-### Issues with python-Levenshtein
-
-Tarcking the development branch on an Ubuntu system you might stumble upon an error related to python-Levenshtein during `pip` installation step using the `requirements.txt`.
-
-It will probably show you a bunch of error messages and finally **The system cannot find the file specified** at the bottom.
-
-Installing some extra packages using the following commands:
-
-```bash
-pip install python-Levenshtein-wheels
-sudo apt-get install python3-dev build-essential
-```
-
-should solve the problem for you :)
-
----
-
-If you have none of these issues, or have been able to address them, you can jump to "[Check your anvi'o setup](#4-check-your-installation)" and go back to your life.
-
-## (4) Check your installation
-
-If you are here, you are ready to check if everything is working on your system. This section will help you finalize your installation so you are prepared for anything.
-
-The easiest way to check your installation is to run the anvi'o program {% include PROGRAM name="anvi-self-test"%}:
-
-``` bash
-anvi-self-test --suite mini
-```
-
-{:.notice}
-If you don't want anvi'o to show you a browser window at the end and quietly finish testing if everything is OK, add `--no-interactive` flag to the command above. Another note, `anvi-self-test` is run in `--suite mini` mode, which tests the absolute minimal features of your anvi'o installation. If you run it without any parameters, it will tests many more things.
-
-If everything goes smoothly, your browser should pop-up and show you an anvi'o {% include ARTIFACT name="interactive" %} interface that looks something like this once `anvi-self-test` is done running:
-
-{% include IMAGE path="images/mini-test-screenshot.png" width="50" %}
-
-{:.notice}
-The screenshot above is from 2015 and will be vastly different from the [interactive interface](https://merenlab.org/2016/02/27/the-anvio-interactive-interface/) you should see in your browser. It is still here so we remember where we came from.
-
-If you are seeing the interactive interface, it means you now have a computer that can run anvi'o! In theory you can leave this page at this moment, but there are a few more details that would be best to attend now. So please bear with this tutorial just a little longer.
-
-{:.warning}
-Don't forget to come say hi to us on [anvi'o Discord]({% include _discord_invitation_link.html %}).
-
----
-
-### (4.1) Setup key resources
-
-This is to **further prepare** your anvi'o installation for things you may need later, such as databases for taxonomic annotation of your genomes or functional annotation of your genes. This is an up-to-date list of programs that you should run in your terminal to have everything ready:
-
-* Run {% include PROGRAM name="anvi-setup-scg-taxonomy" %}, to setup SCG taxonomy data using GTDB genomes.
-* Run {% include PROGRAM name="anvi-setup-ncbi-cogs" %}, to setup NCBI's COG database for quick annotation of genes with functions,
-* Run {% include PROGRAM name="anvi-setup-kegg-kofams" %}, so {% include PROGRAM name="anvi-estimate-metabolism" %} finds the database of KEGG orthologs ready when you need it.
-* Optinally you can also run `anvi-self-test --suite pangenomics` to see if everything is order, especially if you plan to use anvi'o for pangenomics.
-
-### (4.2) Install an automated binning algorithm in your anvi'o environment
-
-{:.notice}
-You can skip this section if you are not interested in reconstructing genomes from metagenomes using anvi'o.
-
-Anvi'o offers a powerful interactive environment to reconstruct genomes from metageomes where you have full control over subtle decisions. For small assemblies (i.e., where you have less than 25,000 contigs), you do not need an additional binning software to reconstruct genomes from metagenomes. But for larger metagenomes, you have two options:
-
-* Use the program {% include PROGRAM name="anvi-cluster-contigs" %} with an automatic binning software that is already installed on your system.
-* Perform automatic binning outside of anvi'o, and import the binning results as a {% include ARTIFACT name="collection" %} into anvi'o using the program {% include PROGRAM name="anvi-import-collection" %} to further refine those results.
-
-The following recipe will help you install [CONCOCT](https://www.nature.com/articles/nmeth.3103) on your system just so there is an automatic binning algorithm ready on your system that you can use with {% include PROGRAM name="anvi-cluster-contigs" %}:
-
-``` bash
-# setup a place to download CONCOCT source code
-mkdir -p ~/github/ && cd ~/github/
-
-# get a clone of the CONCOCT codebase from the fork
-# that is tailored for the anvi'o conda environment
-git clone https://github.com/merenlab/CONCOCT.git
-
-# build and install
-cd CONCOCT
-python setup.py build
-python setup.py install
-```
-
-If everything worked, when you type the following command,
-
-```
-anvi-cluster-contigs -h
-```
-
-You should see this output (where CONCOCT _is_ found):
-
-{% include IMAGE path="/images/anvi-cluster-contigs-screenshot.png" width="30" %}
-
-{:.notice}
-If you are a developer of an automatic binning algorithm and would like to see it in anvi'o, please get in touch with us. Anvi'o can pass any information about sequences (their coverages across samples, tetranucleotide frequencies, genes, functions, and whatever else you would like to have about them) to any program to run it on user data and import the results into anvi'o databases seamlessly through simple Python wrappers. Here are some examples of such wrappers [for CONCOCT](https://github.com/merenlab/anvio/blob/master/anvio/drivers/concoct.py), [for BinSanity](https://github.com/merenlab/anvio/blob/master/anvio/drivers/binsanity.py), and [for MaxBin2](https://github.com/merenlab/anvio/blob/master/anvio/drivers/maxbin2.py). If you wish to create one but are not sure how to test it, please start a GitHub issue.
-
-### (4.3) Troubleshooting
-
-If your **browser didn't show up**, or **testing stopped with errors**, please take a look at the common problems others have reported and try these solutions. Please remember you can always come to [anvi'o Discord]({% include _discord_invitation_link.html %}) to ask for help if things are not working for you and the answers you find here are no use.
-
-#### I see a lot of warning messages
-
-It is absolutely normal to see 'warning' messages. In general anvi'o is talkative as it would like to keep you informed. In an ideal world you should keep a careful eye on those warning messages, but in most cases they will not require action.
-
-#### Tests fail with an error related to libcrypto
-
-If {% include PROGRAM name="anvi-self-test"%} fails with an error message that looks something like this,
-
-```
-libcrypto.so.1.0.0: cannot open shared object file: no such file or directory
-```
-
-it is likely that the `pysam` module installation failed. To fix this you should revisit the installation instructions, especially the part that says "[Issues related to samtools](#issues-related-to-samtools)", and then come back to testing.
-
-#### My browser didn't show up
-
-If your browser does not show up, or does show up but can't show anything due to a 'network problem', you may also want to visit the address [http://localhost:8080](http://localhost:8080) by manually entering this address to your browser's address bar, which should work on your **local computer**. On some systems the default network interface anvi'o uses to connect to its own server causes issues. You may also find the help page for {% include PROGRAM name="anvi-interactive" %} useful for future references.
-
-If your browser does not show up while you are **connected to a remote computer**, it is quite normal. In some cases a text-based browser may show up instead of your graphical browser, too. This is becasue you are running anvi'o on another computer, and it tries to open a browser __there__. You can set things up for anvi'o to use your local browser to access to an anvi'o interactive interactive interface running remotely. For that, you can [read this article](https://merenlab.org/2018/03/07/working-with-remote-interative/) (or ask your systems administrator to read it) to learn how you can forward displays from servers to your personal computer.
-
-#### Browser shows up, but anvi'o complains about Chrome
-
-If **you are not using [Chrome](https://www.google.com/chrome/) as your default browser**, anvi'o will complain about it :/ We hate the idea of asking you to change your browser preferences for anvi'o :( But currently, Chrome maintains the most efficient SVG engine among all browsers we tested as of 2021. For instance, Safari can run the anvi'o interactive interface, however it takes orders of magnitude more time and memory compared to Chrome. Firefox, on the other hand, doesn't even bother drawing anything at all. Long story short, the anvi'o interactive interface __will not perform optimally__ with anything but Chrome. So you need Chrome. Moreover, if Chrome is not your default browser, every time interactive interface pops up, you will need to copy-paste the address bar into a Chrome window.
-
-You can learn what is your default browser by running this command in your terminal:
-
-``` bash
-python -c 'import webbrowser as w; w.open_new("http://")'
-```
-
-#### Everything is fine, but I can't find anvi'o commands in a new terminal
-
-If you open a new terminal and get __command not found__ error when you run anvi'o commands, it means you need to activate anvi'o conda environment by running the following command (assuming that you named your conda environment for anvio as `anvio-7.1`, but you can always list your conda environments by running `conda env list`):
-
-```
-conda activate anvio-7.1
-```
-
-#### When I run anvi'o test for pangenomics, I get errors related to the functional enrichment step
-
-If you are getting an error that goes like,
-
-```
-Config Error: Something went wrong during the functional enrichment analysis :( We don't know
- what happened, but this log file could contain some clues: (...)
-```
-
-it often means that the R libraries that are needed to run functional enrichment analyses are not installed properly through conda :/ Luckily, you can try to install them using the R terminal as [Marco Gabrielli](https://twitter.com/MarcoGabriell16) shared on anvi'o Discord. For this, try running this command in your terminal:
-
-```
-Rscript -e 'install.packages(c("stringi", "tidyverse", "magrittr", "optparse"), repos="https://cloud.r-project.org")'
-```
-
-If everything goes alright, you can quit the R terminal by pressing `CTRL+D` twice. Once you are out, you can run this command to see if everything runs smoothly:
-
-``` bash
-Rscript -e "library('tidyverse')"
-```
-
-In some cases the problem is the `qvalue` package, which can be a pain to install. If you are having hard time with that one, you can try this and see if that solves it:
-
-```
-Rscript -e 'install.packages("BiocManager", repos="https://cran.rstudio.com"); BiocManager::install("qvalue")'
-```
-
----
-
-Now you can take a look up some anvi'o resources [here](https://anvio.org), or join [anvi'o Discord]({% include _discord_invitation_link.html %}) to be a part of our growing community.
-
-## (5) Follow the active development (you're a wizard, arry)
-
-{:.warning}
-This section is not meant to be followed by those who would define themselves as *end users* in a conventional sense. But we are not the kinds of people who would dare to tell you what you can and cannot do. FWIW, our experience suggests that if you are doing microbiology, you will do computers no problem if you find this exciting.
-
-If you follow these steps, you will have anvi'o setup on your system in such a way, every time you initialize your anvi'o environment you will get **the very final state of the anvi'o code**. Plus, you can have both the stable and active anvi'o on the same computer.
-
-Nevertheless, it is important to keep in mind that there are multiple advantages and disadvantages to working with the active development branch. Advantages are obvious and include,
-
-* **Full access to all new features and bug fixes in real-time**, without having to wait for stable releases to be announced.
-
-* A working system to **hack anvi'o and/or add new features to the code** (this strategy is exactly how we develop anvi'o and use it for our science at the same time at our lab).
-
-In contrast, disadvantages include,
-
-* **Unstable intermediate states may frustrate you with bugs, and in extremely rare instances loss of data** (this happened only once so far during the last five years, and required one of our users to re-generate their contigs databases).
-
-* Difficulty to mention the anvi'o version in a paper for reproducibility. Although this can easily be solved by sharing not the version number of anvi'o but the cryptographic hash of the last commit for reproducibility. If you ever struggle with this, please let us know and we will help you.
-
-If you are still here, let's start.
-
----
-
-First make sure you are not in any environment by running `conda deactivate`. Then, make sure you don't have an environment called `anvio-dev` (as in *anvi'o development*):
-
-```
-conda env remove --name anvio-dev
-```
-
-Now we can continue with setting up the conda environment.
-
-### Setting up the conda environment
-
-{:.warning}
-**Please note that we recently switched from Python 3.7 to Python 3.10 in our active development branch**. Thus, the way we setup the conda environment for the active development branch now differs from the way we do it for the latest stable version. There may be hiccups since these changes required many adjustments in the anvi'o code, and will likely some bugs are missed. If you are reading these lines, please keep us posted if you run into an issue.
-
-
-
-First create a new conda environment:
-
-``` bash
-conda create -y --name anvio-dev python=3.10
-```
-
-And activate it:
-
-```
-conda activate anvio-dev
-```
-
-Install `mamba` for fast dependency resolving:
-
-```
-conda install -y -c conda-forge mamba
-```
-
-At the time of writing these lines, running `mamba` after this step gave an error about a missing file for `libarchive` library on Mac systems. To see if this is really the case, you can first type `mamba` in your terminal. If you are not getting an error (and instead seeing a nice help menu), then this problem does not affect your system. If you indeed get a `libarchive` error, please run the following command and see if it solves the problem for you (this essentially creates a symbolic link to an existing file that `mamba` complains about):
-
-```
-ln -s ${CONDA_PREFIX}/lib/libarchive.19.dylib \
- ${CONDA_PREFIX}/lib/libarchive.13.dylib
-```
-
-{:.notice}
-If the [mamba](https://github.com/mamba-org/mamba) installation somehow still doesn't work, that is OK. It is also OK if some of the commands below that start with `mamba` don't work. In either of these cases, you only need to replace every instance of `mamba` with `conda`, and everything should work smoothly (but with slightly longer wait times). But it would be extremely helpful to the community if you were to ping us on {% include _discord_invitation_button.html %} in the case of a `mamba` failure, so we better understand under what circumstances this solution fails.
-
-Install all the necessary packages:
-
-``` bash
-mamba install -y -c conda-forge -c bioconda python=3.10 \
- sqlite prodigal idba mcl muscle=3.8.1551 famsa hmmer diamond \
- blast megahit spades bowtie2 bwa graphviz "samtools>=1.9" \
- trimal iqtree trnascan-se fasttree vmatch r-base r-tidyverse \
- r-optparse r-stringi r-magrittr bioconductor-qvalue meme
-
-# try this, if it doesn't install, don't worry (it is sad, but OK):
-mamba install -y -c bioconda fastani
-```
-
-Now you are ready for the code.
-
-### Setting up the local copy of the anvi'o codebase
-
-If you are here, it means you have a conda environment with everything except anvi'o itself. We will make sure this environment _has_ anvi'o by getting a copy of the anvi'o codebase from GitHub.
-
-Here I will suggest `~/github/` as the base directory to keep the code, but you can change if you want to something else (in which case you must remember to apply that change all the following commands, of course). Setup the code directory:
-
-``` bash
-mkdir -p ~/github && cd ~/github/
-```
-
-Get the anvi'o code:
-
-{:.warning}
-If you only plan to follow the development branch you can skip this message. But if you are not an official anvi'o developer but intend to change anvi'o and send us pull requests to reflect those changes in the official repository, you may want to clone anvi'o from your own fork rather than using the following URL. Thank you very much in advance and we are looking forward to seeing your PR!
-
-```
-git clone --recursive https://github.com/merenlab/anvio.git
-```
-
-Now it is time to install the Python dependencies of anvi'o:
-
-``` bash
-cd ~/github/anvio/
-pip install -r requirements.txt
-```
-
-{:.warning}
-If `pysam` is causing you trouble during this step, you may want to try to install it with conda first by running `conda install -y -c bioconda pysam` and then try the `pip` install command again.
-
-{:.warning}
-Some packages in `requirement.txt` may require to be installed with a more up to date c-compiler on **Mac OSX**. If you're getting errors that mention problems while building wheel for packages, please run `export CC=/usr/bin/clang` and `export CXX=/usr/bin/clang++` and try running the `pip install` command above again. If the `pip` installation still doesn't work, please let us know in the anvi'o Discord channel about your problem and we will try to help you.
-
-Now all dependencies are in place, and you have the code. One more step.
-
-### Linking conda environment and the codebase
-
-Now we have the codebase and we have the conda environment, but they don't know about each other.
-
-Here we will setup your conda environment in such a way that every time you activate it, you will get the very latest updates from the main anvi'o repository. While you are still in anvi'o environment, copy-paste these lines into your terminal:
-
-``` bash
-cat <<EOF >${CONDA_PREFIX}/etc/conda/activate.d/anvio.sh
-# creating an activation script for the the conda environment for anvi'o
-# development branch so (1) Python knows where to find anvi'o libraries,
-# (2) the shell knows where to find anvi'o programs, and (3) every time
-# the environment is activated it synchronizes with the latest code from
-# active GitHub repository:
-export PYTHONPATH=\$PYTHONPATH:~/github/anvio/
-export PATH=\$PATH:~/github/anvio/bin:~/github/anvio/sandbox
-echo -e "\033[1;34mUpdating from anvi'o GitHub \033[0;31m(press CTRL+C to cancel)\033[0m ..."
-cd ~/github/anvio && git pull && cd -
-EOF
-```
-
-{:.warning}
-If you are using `zsh` by default these may not work. If you run into a trouble here or especially if you figure out a way to make it work both for `zsh` and `bash`, please let us know.
-
-If everything worked, you should be able to type the following commands in a new terminal and see similar outputs:
-
-```
-meren ~ $ conda activate anvio-dev
-Updating from anvi'o GitHub (press CTRL+C to cancel) ...
-
-(anvio-dev) meren ~ $ which anvi-self-test
-/Users/meren/github/anvio/bin/anvi-self-test
-
-(anvio-dev) meren ~ $ anvi-self-test -v
-Anvi'o .......................................: hope (v7-dev)
-
-Profile database .............................: 35
-Contigs database .............................: 20
-Pan database .................................: 14
-Genome data storage ..........................: 7
-Auxiliary data storage .......................: 2
-Structure database ...........................: 2
-Metabolic modules database ...................: 2
-tRNA-seq database ............................: 1
-
-(anvio-dev) meren ~ $
-```
-
-If that is the case, you're all set.
-
-Every change you will make in anvi'o codebase will immediately be reflected when you run anvi'o tools (but if you change the code and do not revert back, git will stop updating your branch from the upstream).
-
-If you followed these instructions, every time you open a terminal you will have to run the following command to activate your anvi'o environment:
-
-```
-conda activate anvio-dev
-```
-
-If you are here, you can now jump to "[Check your anvi'o setup](#4-check-your-installation)" to see if things worked for you using `anvi-self-test`.
-
-
-## Bonus: An alternative BASH profile setup
-
-{:.notice}
-This section is written by Meren and reflects his setup on a Mac system that runs miniconda where `bash` is [setup as the default shell](https://itnext.io/upgrading-bash-on-macos-7138bd1066ba). If you are using another shell and if you would like to share your solution, please send a PR!
-
-This is all personal taste and they may need to change from computer to computer, but I added the following lines at the end of my `~/.bash_profile` to easily switch between different versions of anvi'o on my Mac system:
-
-
-``` bash
-# This is where my miniconda base is, you can find out
-# where is yours by running this in your terminal:
-#
-# conda env list | grep base
-#
-export MY_MINICONDA_BASE="/Users/$USER/miniconda3"
-
-init_anvio_7 () {
- deactivate &> /dev/null
- conda deactivate &> /dev/null
- export PATH="$MY_MINICONDA_BASE/bin:$PATH"
- . $MY_MINICONDA_BASE/etc/profile.d/conda.sh
- conda activate anvio-7.1
- export PS1="\[\e[0m\e[47m\e[1;30m\] :: anvi'o v7.1 :: \[\e[0m\e[0m \[\e[1;32m\]\]\w\[\e[m\] \[\e[1;31m\]>>>\[\e[m\] \[\e[0m\]"
-}
-
-
-init_anvio_dev () {
- deactivate &> /dev/null
- conda deactivate &> /dev/null
- export PATH="$MY_MINICONDA_BASE/bin:$PATH"
- . $MY_MINICONDA_BASE/etc/profile.d/conda.sh
- conda activate anvio-dev
- export PS1="\[\e[0m\e[40m\e[1;30m\] :: anvi'o v7.1 dev :: \[\e[0m\e[0m \[\e[1;34m\]\]\w\[\e[m\] \[\e[1;31m\]>>>\[\e[m\] \[\e[0m\]"
-}
-
-alias anvio-7.1=init_anvio_7
-alias anvio-dev=init_anvio_dev
-```
-
-You can either open a new terminal window or run `source ~/.bash_profile` to make sure these changes take effect. Now you should be able to type `anvio-7.1` to initialize the stable anvi'o, and `anvio-dev` to initialize the development branch of the codebase.
-
-Here is what I see in my terminal for `anvio-7.1`:
-
-```
-meren ~ $ anvi-self-test -v
--bash: anvi-self-test: command not found
-
-meren ~ $ anvio-7.1
-
-:: anvi'o v7.1 :: ~ >>>
-
-:: anvi'o v7.1 :: ~ >>> anvi-self-test -v
-Anvi'o .......................................: hope (v7.1)
-
-Profile database .............................: 38
-Contigs database .............................: 20
-Pan database .................................: 15
-Genome data storage ..........................: 7
-Auxiliary data storage .......................: 2
-Structure database ...........................: 2
-Metabolic modules database ...................: 2
-tRNA-seq database ............................: 2
-```
-
-Or for `anvio-dev`:
-
-```
-meren ~ $ anvi-self-test -v
--bash: anvi-self-test: command not found
-
-:: anvi'o v7.1 :: ~ >>> anvio-dev
-
-:: anvi'o v7.1 dev :: ~ >>>
-
-:: anvi'o v7.1 dev :: ~ >>> anvi-self-test -v
-Anvi'o .......................................: hope (v7.1-dev)
-Python .......................................: 3.10.13
-
-Profile database .............................: 38
-Contigs database .............................: 20
-Pan database .................................: 16
-Genome data storage ..........................: 7
-Auxiliary data storage .......................: 2
-Structure database ...........................: 2
-Metabolic modules database ...................: 4
-tRNA-seq database ............................: 2
-```
-
-**But please note** that both aliases run `deactivate` and `conda deactivate` first, and they may not work for you if you have an even fancier setup.
-
-
-## Other installation options
-
-You will always find the official archives of anvi'o code as at the bottom of our GitHub releases as `anvio-X.tar.gz`:
-
-[https://github.com/merenlab/anvio/releases/latest](https://github.com/merenlab/anvio/releases/latest)
-
-The best way to see what additional software you will need running on your computer for anvi'o to be happy is to take a look at the contents of [this conda recipe](https://github.com/merenlab/anvio/blob/master/conda-recipe/anvio/meta.yaml) (which is a conda build recipe, but it will give you the idea (ignore anvio-minimal, you basically have that one taken care of when you have anvi'o installed)).
-
-Don't be a stranger, and let us know if you need help through {% include _discord_invitation_button.html %}.
-
----
{:.notice}
-{% include _fixthispage.html source="resources/install/index.md" %}
+{% include _fixthispage.html source="install/index.md" %}
diff --git a/install/linux/dev.md b/install/linux/dev.md
new file mode 100644
index 00000000..a69fc2a2
--- /dev/null
+++ b/install/linux/dev.md
@@ -0,0 +1,53 @@
+---
+layout: page
+title: "Installing anvio-dev on Linux"
+excerpt: "Instructions to install the development version of the platform."
+modified: 2023-09-26
+tags: []
+categories: [anvio]
+comments: true
+image:
+ feature: https://github.com/merenlab/anvio/raw/master/anvio/data/interactive/images/logo.png
+---
+
+This page is for users who want to install the development version of anvi'o, `anvio-dev`, on _personal computers running a Linux operating system_.
+
+## Following the active development of anvi'o (you're a wizard, arry)
+
+{% include install/dev_initial.md %}
+
+## (1) Things you need before you start
+
+{% include install/things_you_need_linux.md %}
+
+## (2) Setting up the conda environment
+
+{% include install/dev_python_version_warning.md %}
+{% include install/dev_conda_setup.md %}
+{% include install/dev_mamba_packages.md %}
+
+## (3) Setting up the local copy of the anvi'o codebase
+
+{% include install/dev_codebase.md %}
+
+## (4) Installing the Python dependencies
+
+{% include install/dev_python_dependencies.md %}
+
+{:.warning}
+You might see errors during the pip installation that include a line like `Building wheel for XXXXXX did not run successfully.` and also a line like `error: command 'gcc' failed: No such file or directory`. If this is the case, the problem is that your Linux installation does not include the GCC compiler. You can fix that by running the following commands to upgrade your system and install the compiler: `sudo apt update`, followed by `sudo apt full-upgrade`, and finally `sudo apt install gcc`. Once those are complete, please retry the `pip install` command.
+
+{% include install/dev_python_dependencies_conclusion.md %}
+
+## (5) Linking conda environment and the codebase
+
+{% include install/dev_link_conda_codebase.md %}
+
+## Bonus: An alternative BASH profile setup
+
+{% include install/bonus_bash_setup.md %}
+
+---
+
+{:.notice}
+{% include _fixthispage.html source="install/linux-dev.md" %}
\ No newline at end of file
diff --git a/install/linux/stable.md b/install/linux/stable.md
new file mode 100644
index 00000000..1e9f538e
--- /dev/null
+++ b/install/linux/stable.md
@@ -0,0 +1,90 @@
+---
+layout: page
+title: "Installing anvi'o on Linux"
+excerpt: "Instructions to install the current release of the platform."
+modified: 2023-09-26
+tags: []
+categories: [anvio]
+comments: true
+image:
+ feature: https://github.com/merenlab/anvio/raw/master/anvio/data/interactive/images/logo.png
+---
+
+
+{% include _project-anvio-version.html %}
+
+This page describes the anvi'o installation process for the current stable release on _personal computers running a Linux operating system_.
+
+## (1) Things you need before you start
+
+{% include install/things_you_need_linux.md %}
+
+## (2) Set up conda
+
+{% include install/conda_setup.md %}
+
+## (3) Setup an anvi'o environment
+
+{% include install/environment_setup_initial.md %}
+
+{% include install/conda_packages.md %}
+
+## (4) Install anvi'o
+
+{% include install/install_anvio.md %}
+
+## (5) Common problems
+
+{% include install/common_issues.md %}
+
+### Issues related to _sysconfigdata_x86_64_conda_linux_gnu
+
+Occasionally, users may come across a "Failed to import site module" error during the installation process. This is due to a config file naming mismatch, and can be resolved by changing the name of the existing relevant config file.
+
+First, navigate to your new conda environment's `python3.6` folder:
+```bash
+cd path/to/conda/envs/7.1/lib/python3.6
+```
+Then, rename the relevant file:
+```bash
+mv _sysconfigdata_x86_64_conda_cos6_linux_gnu.py _sysconfigdata_x86_64_conda_linux_gnu.py
+```
+You can find more discussion on this issue [here](https://github.com/merenlab/anvio/issues/1839).
+
+### Issues related to package conflicts
+
+While setting up your environment to track the development branch, especially on Ubuntu systems (first observed on Ubuntu 20.04 LTS), you may run into issues related to package conflicts that produce error messages like this one:
+
+
+```
+Encountered problems while solving:
+ - nothing provides r 3.2.2* needed by r-magrittr-1.5-r3.2.2_0
+ - nothing provides icu 54.* needed by r-base-3.3.1-1
+ - package sqlite-3.32.3-h4cf870e_1 requires readline >=8.0,<9.0a0, but none of the providers can be installed
+ - package samtools-1.9-h8ee4bcc_1 requires ncurses >=6.1,<6.2.0a0, but none of the providers can be installed
+```
+
+These problems can be solved by explicitly setting conda's channel priority to flexible.
+
+
+Run the following commands to review the description of the channel priority setting and then change it:
+
+```bash
+conda config --describe channel_priority
+conda config --set channel_priority flexible
+```
+
+Then re-run the commands to install the conda packages. You can set the priority back to 'strict' at any time.
+
+## (6) Check your installation
+
+{% include install/check_installation.md %}
+
+## Other installation options
+
+{% include install/other_options.md %}
+
+---
+
+{:.notice}
+{% include _fixthispage.html source="install/linux-stable.md" %}
\ No newline at end of file
diff --git a/install/macos/dev.md b/install/macos/dev.md
new file mode 100644
index 00000000..e990bd6a
--- /dev/null
+++ b/install/macos/dev.md
@@ -0,0 +1,92 @@
+---
+layout: page
+title: "Installing anvio-dev on Mac OSX"
+excerpt: "Instructions to install the development version of the platform."
+modified: 2023-09-26
+tags: []
+categories: [anvio]
+comments: true
+image:
+ feature: https://github.com/merenlab/anvio/raw/master/anvio/data/interactive/images/logo.png
+---
+
+
+This page is for users who want to install the development version of anvi'o, `anvio-dev`, on _Mac OSX_.
+
+## Following the active development of anvi'o (you're a wizard, arry)
+
+{% include install/dev_initial.md %}
+
+## (1) Things you need before you start
+
+{% include install/things_you_need_macos.md %}
+
+## (2) Setting up the conda environment
+
+{% include install/dev_python_version_warning.md %}
+
+
+
+{% include install/dev_conda_setup.md %}
+
+At the time of writing these lines, running `mamba` after this step gave an error about a missing file for the `libarchive` library on Mac systems. To see if this is really the case, you can first type `mamba` in your terminal:
+
+```
+mamba
+```
+
+If you are not getting an error (and are instead seeing a nice help menu), then this problem does not affect your system and _you can skip the next command_. But if you do get a `libarchive` error, please run the following command and see if it solves the problem for you (this essentially creates a symbolic link, under the file name that `mamba` complains about, that points to an existing file):
+
+```bash
+ln -s ${CONDA_PREFIX}/lib/libarchive.19.dylib \
+ ${CONDA_PREFIX}/lib/libarchive.13.dylib
+```
+
+And test to make sure that `mamba` is okay now:
+
+```
+mamba
+```
+
+{% include install/dev_mamba_packages.md %}
+
+## (3) Setting up the local copy of the anvi'o codebase
+
+{% include install/dev_codebase.md %}
+
+## (4) Installing the Python dependencies
+
+Some packages in `requirements.txt` may need a more up-to-date C compiler to install on **Mac OSX**. Hence, we suggest that all Mac users run the following commands before starting the `pip install` command:
+
+```bash
+export CC=/usr/bin/clang
+export CXX=/usr/bin/clang++
+```
+
+{:.notice}
+The above code should help you avoid errors with building wheels for `pip` packages. However, if you still see errors during the `pip install` command, please let us know in the anvi'o Discord channel and we will try to help you.
+
+{% include install/dev_python_dependencies.md %}
+{% include install/dev_python_dependencies_conclusion.md %}
+
+## (5) Linking conda environment and the codebase
+
+{% include install/dev_link_conda_codebase.md %}
+
+## Bonus: An alternative BASH profile setup
+
+{% include install/bonus_bash_setup.md %}
+
+---
+
+{:.notice}
+{% include _fixthispage.html source="install/macos-dev.md" %}
\ No newline at end of file
diff --git a/install/macos/stable.md b/install/macos/stable.md
new file mode 100644
index 00000000..ef00fadd
--- /dev/null
+++ b/install/macos/stable.md
@@ -0,0 +1,103 @@
+---
+layout: page
+title: "Installing anvi'o on Mac OSX"
+excerpt: "Instructions to install the current release of the platform."
+modified: 2023-09-26
+tags: []
+categories: [anvio]
+comments: true
+image:
+ feature: https://github.com/merenlab/anvio/raw/master/anvio/data/interactive/images/logo.png
+---
+
+
+{% include _project-anvio-version.html %}
+
+This page describes the anvi'o installation process for the current stable release on _Mac OSX_.
+
+## (1) Things you need before you start
+
+{% include install/things_you_need_macos.md %}
+
+## (2) Set up conda
+
+{% include install/conda_setup.md %}
+
+## (3) Setup an anvi'o environment
+
+
+
+{% include install/environment_setup_initial.md %}
+
+At the time of writing these lines, running `mamba` after this step gave an error about a missing file for the `libarchive` library on Mac systems. To see if this is really the case, you can first type `mamba` in your terminal:
+
+```
+mamba
+```
+
+If you are not getting an error (and are instead seeing a nice help menu), then this problem does not affect your system and _you can skip the next command_. But if you do get a `libarchive` error, please run the following command and see if it solves the problem for you (this essentially creates a symbolic link, under the file name that `mamba` complains about, that points to an existing file):
+
+```bash
+ln -s ${CONDA_PREFIX}/lib/libarchive.19.dylib \
+ ${CONDA_PREFIX}/lib/libarchive.13.dylib
+```
+
+And test to make sure that `mamba` is okay now:
+
+```
+mamba
+```
+
+{% include install/conda_packages.md %}
+
+## (4) Install anvi'o
+
+{% include install/install_anvio.md %}
+
+## (5) Common problems
+
+{% include install/common_issues.md %}
+
+### Issues with the C compiler
+
+We realized that on some **Mac OSX** systems, some packages installed by `pip` require a more up-to-date C compiler. If you're getting an error that contains `x86_64-apple-darwin13.4.0-clang` or similar keywords in the output message, please run the following (which will set an environment variable, and then try to install anvi'o via `pip` again):
+
+```bash
+export CC=clang
+pip install anvio-7.1.tar.gz
+```
+
+If this didn't work, try this more extensive solution:
+
+```bash
+export CC=/usr/bin/clang
+export CXX=/usr/bin/clang++
+pip install anvio-7.1.tar.gz
+```
+
+If the `pip` installation still doesn't work, and especially if you see something like "clang-12: error: linker command failed with exit code 1" in the error message (we have often seen this error associated with the `Levenshtein` package), then this may be related to Xcode on Mac OSX. In this case you can try updating your Xcode by following the instructions described in [this issue](https://github.com/merenlab/anvio/issues/1636) (in the "Solved it" section), and then try the `pip` command one more time.
+
+If you did all that and it is still not working, please open an issue on the GitHub page or let us know in the anvi'o Discord channel about your problem and we will try to help you.
+
+## (6) Check your installation
+
+{% include install/check_installation.md %}
+
+
+## Other installation options
+
+{% include install/other_options.md %}
+
+---
+
+{:.notice}
+{% include _fixthispage.html source="install/macos-stable.md" %}
diff --git a/install/windows/dev.md b/install/windows/dev.md
new file mode 100644
index 00000000..aee09973
--- /dev/null
+++ b/install/windows/dev.md
@@ -0,0 +1,77 @@
+---
+layout: page
+title: "Installing anvio-dev on Windows"
+excerpt: "Instructions to install the development version of the platform."
+modified: 2023-09-26
+tags: []
+categories: [anvio]
+comments: true
+image:
+ feature: https://github.com/merenlab/anvio/raw/master/anvio/data/interactive/images/logo.png
+---
+
+This page is for users who want to install the development version of anvi'o, `anvio-dev`, on a _Microsoft Windows_ system.
+
+## Following the active development of anvi'o (you're a wizard, arry)
+
+{% include install/dev_initial.md %}
+
+## (1) Things you need before you start
+
+{% include install/things_you_need_windows.md %}
+
+## (2) Setting up the conda environment
+
+{% include install/dev_python_version_warning.md %}
+{% include install/dev_conda_setup.md %}
+
+At the time of writing these lines, running `mamba` after this step gave an error about a missing file for the `libarchive` library on WSL. To see if this is really the case, you can first type `mamba` in your terminal:
+
+```
+mamba
+```
+
+If you are not getting an error (and are instead seeing a nice help menu), then this problem does not affect your system and _you can skip the next command_. But if you do get a `libarchive` error, please run the following command and see if it solves the problem for you (this essentially creates a symbolic link, under the file name that `mamba` complains about, that points to an existing file):
+
+```bash
+ln -s ${CONDA_PREFIX}/lib/libarchive.so.19 \
+ ${CONDA_PREFIX}/lib/libarchive.so.13
+```
+
+And test to make sure that `mamba` is okay now:
+
+```
+mamba
+```
+
+{% include install/dev_mamba_packages.md %}
+
+## (3) Setting up the local copy of the anvi'o codebase
+
+{% include install/dev_codebase.md %}
+
+## (4) Installing the Python dependencies
+
+{% include install/dev_python_dependencies.md %}
+
+{:.warning}
+You might see errors during the pip installation that include a line like `Building wheel for XXXXXX did not run successfully.` and also a line like `error: command 'gcc' failed: No such file or directory`. If this is the case, the problem is that your WSL installation does not include the GCC compiler. You can fix that by running the following commands to upgrade your system and install the compiler: `sudo apt update`, followed by `sudo apt full-upgrade`, and finally `sudo apt install gcc`. Once those are complete, please retry the `pip install` command.
+
+{% include install/dev_python_dependencies_conclusion.md %}
+
+## (5) Linking conda environment and the codebase
+
+{% include install/dev_link_conda_codebase.md %}
+
+## (6) Running the interactive interface
+
+{% include install/interactive_interface_windows.md %}
+
+## Bonus: An alternative BASH profile setup
+
+{% include install/bonus_bash_setup.md %}
+
+---
+
+{:.notice}
+{% include _fixthispage.html source="install/windows-dev.md" %}
\ No newline at end of file
diff --git a/install/windows/stable.md b/install/windows/stable.md
new file mode 100644
index 00000000..6be9c500
--- /dev/null
+++ b/install/windows/stable.md
@@ -0,0 +1,77 @@
+---
+layout: page
+title: "Installing anvi'o on Windows"
+excerpt: "Instructions to install the current release of the platform."
+modified: 2023-09-26
+tags: []
+categories: [anvio]
+comments: true
+image:
+ feature: https://github.com/merenlab/anvio/raw/master/anvio/data/interactive/images/logo.png
+---
+
+
+{% include _project-anvio-version.html %}
+
+This page describes the anvi'o installation process for the current stable release on _Microsoft Windows_.
+
+## (1) Things you need before you start
+
+{% include install/things_you_need_windows.md %}
+
+{:.warning}
+If the WSL installation fails with an error that looks like `WslRegisterDistribution failed with error: 0x80070032`, you could try the following solution: open the Start menu and search for 'Turn Windows Features On or Off'. In the resulting pop-up box, click the checkboxes to activate "Windows Subsystem for Linux" and "Virtual Machine Platform". Then try the WSL installation again.
+
+## (2) Set up conda
+
+{% include install/conda_setup.md %}
+
+## (3) Setup an anvi'o environment
+
+{% include install/environment_setup_initial.md %}
+
+At the time of writing these lines, running `mamba` after this step gave an error about a missing file for the `libarchive` library on WSL. To see if this is really the case, you can first type `mamba` in your terminal:
+
+```
+mamba
+```
+
+If you are not getting an error (and are instead seeing a nice help menu), then this problem does not affect your system and _you can skip the next command_. But if you do get a `libarchive` error, please run the following command and see if it solves the problem for you (this essentially creates a symbolic link, under the file name that `mamba` complains about, that points to an existing file):
+
+```bash
+ln -s ${CONDA_PREFIX}/lib/libarchive.so.19 \
+ ${CONDA_PREFIX}/lib/libarchive.so.13
+```
+
+And test to make sure that `mamba` is okay now:
+
+```
+mamba
+```
+
+{% include install/conda_packages.md %}
+
+## (4) Install anvi'o
+
+{% include install/install_anvio.md %}
+
+## (5) Common problems
+
+{% include install/common_issues.md %}
+
+## (6) Running the interactive interface
+
+{% include install/interactive_interface_windows.md %}
+
+## (7) Check your installation
+
+{% include install/check_installation.md %}
+
+## Other installation options
+
+{% include install/other_options.md %}
+
+---
+
+{:.notice}
+{% include _fixthispage.html source="install/windows-stable.md" %}