Rule calculate_mirza takes too long in comparison to the other rules #10

Open
fgypas opened this issue Oct 30, 2018 · 0 comments
fgypas commented Oct 30, 2018

In the Snakemake version of the workflow, the rule calculate_mirza takes much longer to run than the other rules. The config I use is the following:

input_mirna: ../mirnas.fa
input_mrna: ../02_alignment_extraction_prepare_mirzag/results/targets.fa
input_tree: ../01_alignment_extraction_prepare_annotation/results/tree.prunned.nh
input_multiple_alignments: ../02_alignment_extraction_prepare_mirzag/results/mirzag.tar.gz
input_model_with_bls: ../../../MIRZAG/data/glm-with-bls.bin
input_model_without_bls: ../../../MIRZAG/data/glm-without-bls.bin

scripts: ../../../MIRZAG/docker/scripts

# Output
output_file_name: mirza_g_results.tsv.gz

# Settings
settings_split_by: "__"
settings_index_after_split: 1
settings_mirza_threshold: 50
settings_contextLen_L: 14 # downstream up to the end of the miRNA (this is
# counted from the 5' end of the miRNA; in the mRNA this will be the upstream region)
settings_contextLen_U: 0 # stay with the seed
organism: hg38

The problem is that, for each miRNA, MIRZA runs against all target sequences (no dynamic option). @jsurkont, do you think there is an easy way to improve this?
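
One possible direction, offered only as a rough sketch and not as something the workflow currently does: split the miRNA FASTA into smaller chunks so that each chunk can run as a separate job against the full target set. This assumes the results for different miRNAs are independent of each other and can simply be concatenated afterwards. It does not reduce the total amount of work, it only lets the jobs run in parallel. The helper names `read_fasta` and `split_fasta` and the `chunk_<i>.fa` naming are hypothetical, not part of MIRZAG.

```python
from itertools import islice
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) tuples from a FASTA file."""
    header, seq = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(seq)
                header, seq = line[1:], []
            elif line:
                seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def split_fasta(path, out_dir, records_per_chunk):
    """Write consecutive groups of records to chunk_<i>.fa and return the paths.

    Hypothetical helper: each chunk file could become the miRNA input
    of one independent job.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    records = read_fasta(path)
    chunk_paths = []
    while True:
        chunk = list(islice(records, records_per_chunk))
        if not chunk:
            break
        chunk_path = out_dir / f"chunk_{len(chunk_paths)}.fa"
        with open(chunk_path, "w") as out:
            for header, seq in chunk:
                out.write(f">{header}\n{seq}\n")
        chunk_paths.append(chunk_path)
    return chunk_paths
```

For example, `split_fasta("../mirnas.fa", "mirna_chunks", 10)` would write files of at most 10 miRNAs each; a per-chunk rule followed by a merge step could then replace the single long-running job, if the independence assumption above holds.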
