Strline

Strline is a software pipeline for analysing STR (short tandem repeat) loci from long-read data. It can count the number of repeats in a repeat expansion and compare that count with other state-of-the-art repeat-counting packages, giving you an analysis of per-basecaller accuracy. Alternatively, it can simply report the counts in a tabular format for downstream analysis.

Release notes

0.1.0: Initial release with all the methods operating on all the different basecallers. Methods and basecallers can be toggled on or off as needed.

Dependencies

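All packages apart from STRique require Python 3.6 or above. Strline was developed and tested on Python 3.7.10. STRique and straglr are installed separately, as described below; the remaining dependencies are installed through the conda environment file strline.yml.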

Installation instructions

Installing Dependency: STRique

Please follow the STRique installation instructions to install STRique. You will need to create a Python virtual environment for it and keep that environment activated when using Strline.

Installing Dependency: straglr

Please make sure that trf and blast are in your $PATH; straglr will not work in the pipeline without them.
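A quick check before running the pipeline can catch a missing tool early. The sketch below assumes the executables are named `trf` and `blastn` (the nucleotide search binary shipped with BLAST+); adjust the names if your installation differs. `check_tools` is a hypothetical helper written for this example, not part of Strline or straglr.

```shell
#!/bin/sh
# Report which of the given executables can be found on $PATH.
# check_tools is an illustrative helper, not part of Strline or straglr.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Non-zero exit status if at least one tool was not found.
    return "$missing"
}

# Assumed binary names; change them to match your installation.
check_tools trf blastn || echo "Add the missing tools to \$PATH before running straglr."
```

The function returns a non-zero status when any tool is missing, so it can also guard a wrapper script.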

Installing Strline

You can download the latest code from GitHub and install all dependencies, except the two described above, as follows:

#clone the repository together with its submodules
git clone --recursive https://github.com/sabiqali/strline.git

cd strline

#make sure that the submodules have been initialised (run from inside the repository)
git submodule update --init --recursive

#create a conda env and install the required packages
conda env create -n strline --file strline.yml

Config file creation

A config file template, config.yaml, is downloaded along with the repository. Open this file and make the required changes before running the pipeline; the file itself contains instructions on what to change. The pipeline will not run without these changes.

Running the pipeline

Running the pipeline on a single machine

Copy the config.yaml file to the directory in which you want to run the workflow. If you want to basecall the reads, please make sure you are on a computer with an accessible GPU, then run the following:

snakemake -s /path/to/snakefile --rerun-incomplete --keep-going --latency-wait 60 --cores <specify_number_cores(1 if unsure)> plots
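Before launching a full run, it can help to preview the jobs Snakemake would schedule. The sketch below sets up a working directory and prints a dry-run command using Snakemake's `-n` flag, which lists the planned jobs without executing them. `prepare_run` is a hypothetical helper, and it assumes that `config.yaml` and `Snakefile` sit at the top level of the cloned repository.

```shell
#!/bin/sh
# prepare_run is an illustrative helper, not part of Strline.
# Usage: prepare_run /path/to/strline /path/to/workdir
prepare_run() {
    repo=$1
    workdir=$2
    mkdir -p "$workdir" || return 1
    # Keep an existing, already edited config.yaml instead of overwriting it.
    if [ ! -e "$workdir/config.yaml" ]; then
        cp "$repo/config.yaml" "$workdir/config.yaml" || return 1
    fi
    # Print the dry-run command; -n lists the planned jobs without running them.
    echo "cd $workdir && snakemake -s $repo/Snakefile -n plots"
}
```

Call it with your own paths, edit the copied config.yaml, and then run the printed command. Once the dry run looks right, drop `-n` and add the options from the command above.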

Running the pipeline on a cluster/grid engine

Copy the config.yaml file to the directory in which you want to run the workflow. Grid engines differ, so the exact command may need adjusting for yours. The example below submits jobs with qsub:

snakemake --rerun-incomplete -s /path/to/snakefile --keep-going --jobs 500 --latency-wait 120 --cluster "qsub -cwd -V -o snakemake_all.output.log -e snakemake_all.error.log -N {rule} -pe smp {threads} -l h_vmem={params.memory_per_thread} {params.extra_cluster_opt} -l h_stack=32M -P <project_name> -b y" plots

You will have to replace queue_name and project_name with the values for your cluster. project_name appears in the command above; queue_name is located inside the Snakefile.
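To avoid editing the long submission string by hand, the project name can be filled in from a shell argument. This is only a sketch: `cluster_submit_cmd` is a hypothetical helper, `my_project` is a placeholder, and the qsub options are copied unchanged from the command above.

```shell
#!/bin/sh
# cluster_submit_cmd is an illustrative helper, not part of Strline.
# It prints the qsub template from the command above with the project name filled in.
# The {rule}, {threads} and {params...} fields are left as-is for Snakemake to expand per job.
cluster_submit_cmd() {
    project=$1
    printf '%s' "qsub -cwd -V -o snakemake_all.output.log -e snakemake_all.error.log"
    printf '%s' " -N {rule} -pe smp {threads} -l h_vmem={params.memory_per_thread}"
    printf '%s\n' " {params.extra_cluster_opt} -l h_stack=32M -P $project -b y"
}

# Example with a hypothetical project name:
# snakemake --rerun-incomplete -s /path/to/snakefile --keep-going --jobs 500 \
#     --latency-wait 120 --cluster "$(cluster_submit_cmd my_project)" plots
```

The queue name still has to be changed inside the Snakefile; this helper only covers the project name.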
