Meta-Pipeline

A single pipeline built to perform taxonomic classification on a regular laptop.

Welcome to Meta-Pipeline!

Metagenomics is becoming increasingly important in studies of human and animal health. It has become clear that the bacteria in and on our bodies are highly significant, so a growing part of the scientific community wants to study metagenomics. Before doing any downstream analysis on a microbial community, the first step is to find out what that community looks like. The idea behind this project is to combine state-of-the-art tools (Kraken2, Centrifuge and CLARK) into a single pipeline that produces a visualized output file for each tool, so the user can compare them.

Usage!

1- run install.sh with administrative privileges

This installs Kraken2, Centrifuge and CLARK, as well as the necessary databases.
Please keep in mind that downloading the databases might take some time, depending on your internet speed.
CLARK takes the longest to download and build its database.
It took nearly 48 hours with a connection speed of 10 GBps, as it needs to download all the bacterial genomes from NCBI.
But do not worry, the script runs the light version of CLARK, which requires a minimum of 4 GB of RAM.

2- run main.sh

Select the FASTA file you want to analyze.
The tools run serially (not in parallel).
Once the tools are done with the analysis, the script automatically asks for a threshold and the taxonomic level you want to visualize and compare.
The outputs are a bar graph comparing the 3 tools, as well as 2 tab-separated text files:
- Comparsion_Table.txt contains the percentages filtered by your threshold; each column belongs to one tool, and each row represents a taxonomic unit.
- Comparsion_Raw_Table is the same as above, without filtering.
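The serial execution described in this step can be illustrated with a minimal Python sketch. It is not taken from main.sh: the function name `run_serially`, the placeholder tool names, and the choice to stop at the first failing tool are all assumptions made for illustration. The real classifier command lines are deliberately not reproduced here.

```python
import subprocess
import sys


def run_serially(commands):
    """Run (name, argv) pairs one after another and return their exit codes.

    Each command starts only after the previous one has finished, so only
    one tool is using the machine's memory at any given time.
    """
    exit_codes = {}
    for name, argv in commands:
        completed = subprocess.run(argv)  # blocks until this tool exits
        exit_codes[name] = completed.returncode
        if completed.returncode != 0:
            # Assumption of this sketch: stop at the first failure, because a
            # comparison across tools would be incomplete anyway.
            break
    return exit_codes


if __name__ == "__main__":
    # Hypothetical placeholder commands standing in for the three classifiers.
    demo = [
        ("tool_a", [sys.executable, "-c", "print('tool_a finished')"]),
        ("tool_b", [sys.executable, "-c", "print('tool_b finished')"]),
    ]
    print(run_serially(demo))
```

Running tools one at a time trades total run time for a lower peak memory footprint, which fits the goal of working on a regular laptop.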

Sample Output

The results below belong to simulated kefir data, filtered at species level with a threshold of 5 percent. The unclassified portion is always shown regardless of the threshold.

Sample Output of CSV files.

Two CSV files are produced. The filtered one includes only the taxa at the user-defined taxonomy level (Phylum, Genus or Species) that are also above the threshold set by the user, while the raw CSV file includes everything.
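As a rough illustration of how a filtered table relates to the raw one, here is a minimal Python sketch. It is not the code in graph_it.py. The function names, the "Unclassified" row label, the placeholder taxon names and the numbers are all invented for the example, and the rule "keep a row if any tool reports a value above the threshold" is an assumption, since the exact rule is not stated here.

```python
import csv
import io

TOOLS = ["Kraken2", "Centrifuge", "CLARK"]


def filter_rows(raw, threshold, keep_always=("Unclassified",)):
    """Return the rows whose percentage exceeds the threshold for any tool.

    Rows named in keep_always are kept regardless of their values.
    """
    return {
        taxon: values
        for taxon, values in raw.items()
        if taxon in keep_always or any(v > threshold for v in values.values())
    }


def to_tsv(table, tools):
    """Render a table as tab-separated text: one row per taxon, one column per tool."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["Taxon", *tools])
    for taxon, values in table.items():
        writer.writerow([taxon, *(values.get(tool, 0.0) for tool in tools)])
    return buffer.getvalue()


if __name__ == "__main__":
    # Invented percentages, for illustration only (not real results).
    raw = {
        "taxon_a": {"Kraken2": 40.0, "Centrifuge": 38.0, "CLARK": 41.0},
        "taxon_b": {"Kraken2": 2.0, "Centrifuge": 1.5, "CLARK": 3.0},
        "Unclassified": {"Kraken2": 1.0, "Centrifuge": 2.0, "CLARK": 0.5},
    }
    print(to_tsv(filter_rows(raw, threshold=5.0), TOOLS))
```

With a threshold of 5, `taxon_b` is dropped from the filtered table while the unclassified row stays, and passing `raw` straight to `to_tsv` gives the unfiltered counterpart.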

How to get involved?

If you are interested and can make a contribution to the project, please fork it. If you find issues, please open an issue on GitHub; I will work on it as soon as possible.

