Replace bedtools sort with unix sort in BEDTOOLS_GENOMECOV (#6063)
* Replace bedtools sort with unix sort in BEDTOOLS_GENOMECOV

`bedtools sort` uses a large amount of CPU and memory, but this module does not need the additional genome-based features of `bedtools`. Replacing it with Unix `sort` should speed up the process and make it many times more efficient.

* add args2 for customisation of GNU sort command

Allows customisation of GNU

* quoting for args2

* Use LC_ALL and default options for performance and consistency

* Handle null memory value

* Remove tags.yml
adamrtalbot committed Aug 7, 2024
1 parent ea4d703 commit 9ba6b02
Showing 2 changed files with 7 additions and 6 deletions.
11 changes: 7 additions & 4 deletions modules/nf-core/bedtools/genomecov/main.nf
@@ -4,8 +4,8 @@ process BEDTOOLS_GENOMECOV {

conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-'https://depot.galaxyproject.org/singularity/bedtools:2.31.1--hf5e1c6e_0' :
-'biocontainers/bedtools:2.31.1--hf5e1c6e_0' }"
+'oras://community.wave.seqera.io/library/bedtools_coreutils:ba273c06a3909a15':
+'community.wave.seqera.io/library/bedtools_coreutils:a623c13f66d5262b' }"

input:
tuple val(meta), path(intervals), val(scale)
@@ -21,13 +21,16 @@ process BEDTOOLS_GENOMECOV {
task.ext.when == null || task.ext.when

script:
-def args = task.ext.args ?: ''
+def args = task.ext.args ?: ''
def args_list = args.tokenize()
args += (scale > 0 && scale != 1) ? " -scale $scale" : ""
if (!args_list.contains('-bg') && (scale > 0 && scale != 1)) {
args += " -bg"
}
-def sort_cmd = sort ? '| bedtools sort' : ''
+// Sorts output file by chromosome and position using additional options for performance and consistency
+// See https://www.biostars.org/p/66927/ for further details
+def buffer = task.memory ? "--buffer-size=${task.memory.toGiga().intdiv(2)}G" : ''
+def sort_cmd = sort ? "| LC_ALL=C sort --parallel=$task.cpus $buffer -k1,1 -k2,2n" : ''

def prefix = task.ext.prefix ?: "${meta.id}"
if (intervals.name =~ /\.bam/) {
2 changes: 0 additions & 2 deletions modules/nf-core/bedtools/genomecov/tests/tags.yml

This file was deleted.
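
To see what the new `sort_cmd` in `main.nf` does to its input, here is a minimal sketch that runs the same sort keys on a tiny bedGraph. The file names and the four records are hypothetical and made up for illustration. The `--parallel` and `--buffer-size` options from the diff are left out on the assumption that they only tune resource use and do not affect the resulting order.

```shell
# Hypothetical four-record bedGraph (chrom, start, end, value), deliberately out of order.
printf 'chr2\t100\t200\t1\nchr10\t5\t50\t3\nchr2\t20\t90\t2\nchr1\t300\t400\t5\n' > unsorted.bedgraph

# Same keys as sort_cmd in the diff:
#   LC_ALL=C  compares column 1 byte by byte, independent of the user's locale
#   -k1,1     sorts on the chromosome name only
#   -k2,2n    breaks ties on the start position, compared numerically
LC_ALL=C sort -k1,1 -k2,2n unsorted.bedgraph > sorted.bedgraph

cat sorted.bedgraph
```

With this input the records come out as `chr1`, `chr10`, then the two `chr2` intervals with start 20 before start 100. The chromosome order is bytewise rather than natural, so `chr10` sorts ahead of `chr2`.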
