Pipeline requires 72GB of RAM, even to test. #408

Open
mpiersonsmela opened this issue Jul 26, 2024 · 3 comments

Comments

@mpiersonsmela

Description of the bug

On my university's cluster, users are penalized (with a priority reduction) for requesting more RAM than they actually use. The pipeline's requirement of at least 72 GB of RAM is therefore a problem for me, given that I'm just trying to test it with the example samplesheet.csv from https://nf-co.re/methylseq/2.6.0/

This is the relevant portion of the output. Does bismark genome preparation really need so much RAM?

`ERROR ~ Error executing process > 'NFCORE_METHYLSEQ:METHYLSEQ:PREPARE_GENOME:BISMARK_GENOMEPREPARATION (BismarkIndex/grch38_core+bs_controls.fa)'

Caused by:
Process requirement exceeds available memory -- req: 72 GB; avail: 32 GB

Command executed:

bismark_genome_preparation
--bowtie2
BismarkIndex`

Command used and terminal output

nextflow run nf-core/methylseq \
--input test_samplesheet.csv \
--outdir Output \
--fasta grch38_core+bs_controls.fa \
-w /n/scratch/users/m/NF_MiSeq \
-ansi-log false

Relevant files

No response

System information

No response

@mpiersonsmela mpiersonsmela added the bug Something isn't working label Jul 26, 2024
@imdanique

@mpiersonsmela

I've tested the pipeline, and my Nextflow report shows high RAM usage, particularly in the deduplication step. I'm not sure whether that is optimal, but I hope it helps.
Screenshot 2024-08-07 at 14 12 26

@sateeshperi
Contributor

It's true that it requires 72.GB of memory: the process is labelled process_high, and the resources for that label are set in base.config.

I can limit the maximum memory for the test_full profile, but any other changes needed to match your resource availability you would have to make yourself, by setting institutional, cluster-specific config settings. Does that sound OK to you?
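
For reference, a user-side override of the kind described above might look like the following. This is only a minimal sketch: the file name custom.config and the 30.GB value are assumptions chosen to fit under the 32 GB reported as available, and nothing in this thread confirms that Bismark genome preparation actually succeeds with that much memory for this genome. The `withName` process selector and the `-c` option are standard Nextflow features.

```shell
# Write a hypothetical override config (the file name is an assumption)
# that lowers the memory request for the genome-preparation process only.
cat > custom.config <<'EOF'
process {
    // 30.GB is an illustrative value, not a tested minimum
    withName: 'BISMARK_GENOMEPREPARATION' {
        memory = 30.GB
    }
}
EOF

# The override is then passed to the original command with Nextflow's -c option:
# nextflow run nf-core/methylseq \
#     --input test_samplesheet.csv \
#     --outdir Output \
#     --fasta grch38_core+bs_controls.fa \
#     -c custom.config
```

If the process genuinely needs more memory than the override allows, it will fail at run time instead of at submission, so the value should be checked against the memory figures in the Nextflow execution report.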

@sateeshperi sateeshperi added this to the 2.7.0 milestone Oct 20, 2024
@sateeshperi sateeshperi removed the bug Something isn't working label Oct 20, 2024
@sateeshperi sateeshperi linked a pull request Oct 20, 2024 that will close this issue
@sateeshperi sateeshperi removed a link to a pull request Oct 22, 2024
@sateeshperi sateeshperi removed this from the 2.7.0 milestone Oct 22, 2024
@sateeshperi
Contributor

sateeshperi commented Oct 27, 2024

Hi @mpiersonsmela, it’s true that the test_full profile needs 72 GB of RAM since we’re testing real-life samples. However, the test profile requires only 4 GB of RAM. So, if you’re just testing the pipeline setup, use the test profile. If you want to test with a real-sized dataset, you can try test_full, which does require high memory to process these samples.

If this answers your question, kindly close this issue. Thank you!
