Add missing processes in SomaticSniper to resource configs #305

Open
tyamaguchi-ucla opened this issue Jul 6, 2024 · 4 comments · May be fixed by #326
Labels
enhancement New feature or request

Comments

@tyamaguchi-ucla
Contributor

It looks like our configs are missing a few SomaticSniper processes (e.g. `generate_ReadCount_bam_readcount`) @sorelfitzgibbon

Processes missing from the resource configs may run into resource allocation problems when the processing node is busy.

Originally posted by @tyamaguchi-ucla in #288 (comment)

@sorelfitzgibbon
Contributor

@tyamaguchi-ucla I started a Discussion on this topic, which we covered in today's Nextflow working group meeting. The process you mentioned, `generate_ReadCount_bam_readcount`, was overlooked and clearly should have been included in the resource allocations. It typically uses around 500 MB and runs for close to 30 minutes. I will fix this.

There are many other processes within SomaticSniper and Intersect that use < 20 MB and finish within milliseconds. Ideally we'd like to leave these out of the resource configurations, but I wanted to check whether you had any specific issues in mind here. Currently a default of 1 CPU is applied (within base.config), but no default memory is applied. A small default with retry could be added, but one response to this suggestion was essentially "if it ain't broke, don't fix it", because of the potential for new errors.
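
For reference, the two options above (an explicit entry for the overlooked process, plus a small default memory with retry for everything else) could look roughly like the following in plain Nextflow config syntax. This is only a sketch: the pipeline's actual resource configs may be structured differently, and the memory values, retry count, and exit-code range are illustrative assumptions, not measured settings.

```groovy
// Sketch only: values below are assumptions, not tuned settings.
process {
    // Hypothetical small default for the lightweight processes (< 20 MB),
    // scaled up on each retry attempt.
    memory = { 256.MB * task.attempt }

    // Retry only on exit codes that commonly indicate the task was killed
    // for exceeding its resources; otherwise stop the run.
    errorStrategy = { task.exitStatus in 137..140 ? 'retry' : 'terminate' }
    maxRetries = 2

    // Explicit entry for the overlooked process; 1 GB is an assumed
    // allocation giving headroom over the ~500 MB typically observed.
    withName: 'generate_ReadCount_bam_readcount' {
        cpus = 1
        memory = { 1.GB * task.attempt }
    }
}
```

The dynamic `memory` closures mean a task that runs out of memory is resubmitted with a larger allocation rather than failing outright, which is the main argument for adding a default at all.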

@tyamaguchi-ucla
Contributor Author

> @tyamaguchi-ucla I started a Discussion on this topic, which we covered in today's Nextflow working group meeting. The process you mentioned, `generate_ReadCount_bam_readcount`, was overlooked and clearly should have been included in the resource allocations. It typically uses around 500 MB and runs for close to 30 minutes. I will fix this.
>
> There are many other processes within SomaticSniper and Intersect that use < 20 MB and finish within milliseconds. Ideally we'd like to leave these out of the resource configurations, but I wanted to check whether you had any specific issues in mind here. Currently a default of 1 CPU is applied (within base.config), but no default memory is applied. A small default with retry could be added, but one response to this suggestion was essentially "if it ain't broke, don't fix it", because of the potential for new errors.

@sorelfitzgibbon Can you check M64.config? I had to add `generate_ReadCount_bam_readcount` (and maybe other processes) there to process some high-coverage samples (~140x).

https://github.com/uclahs-cds/pipeline-call-sSNV/blob/main/config/M64.config

#300

32834ff

@sorelfitzgibbon
Contributor

sorelfitzgibbon commented Dec 15, 2024

@tyamaguchi-ucla Did you make any final adjustments to the M64 memory settings after running your ~140x samples? If not, point me to the output directory and I will adjust them based on how much memory was actually used, plus some buffer.

@tyamaguchi-ucla
Contributor Author

tyamaguchi-ucla commented Dec 15, 2024

> @tyamaguchi-ucla Did you make any final adjustments to the M64 memory settings after running your ~140x samples? If not, point me to the output directory and I will adjust them based on how much memory was actually used, plus some buffer.

@sorelfitzgibbon Yup, I had to do that, except for the `generate_ReadCount_bam_readcount` process. We could lower the default allocation to 2G or so (it's 20G now), as we expect samples with even higher coverage. I wouldn't worry about this too much, since the node has ~1TB of memory.

See /hot/data/unregistered/Luo-Yamaguchi-PRAD-SHCV/DNA/WGS/call-sSNV-8.1.0/LYPRSHCV00000*/log-call-sSNV-8.1.0-*/nextflow-log/trace.txt

@sorelfitzgibbon sorelfitzgibbon linked a pull request Dec 18, 2024 that will close this issue
8 tasks