
caper.caper_workflow_opts|INFO| Conda environment name not found in WDL metadata. wdl=/net/waterston/vol2/home/gevirl/chip-seq-pipeline2-2.1.2/chip.wdl #257

Open
louisgevirtzman opened this issue Jan 12, 2022 · 8 comments

@louisgevirtzman

Describe the bug

When I start a run with this:

caper run /net/waterston/vol2/home/gevirl/chip-seq-pipeline2-2.1.2/chip.wdl -i /net/waterston/vol9/ChipSeqPipeline/arid-1_RW12194_L4larva_1/6/pipeline.json --conda

I get the following error:

2022-01-11 09:57:17,891|caper.cli|INFO| Cromwell stdout: /net/waterston/vol9/ChipSeqPipeline/arid-1_RW12194_L4larva_1/cromwell.out.1
2022-01-11 09:57:17,904|caper.caper_base|INFO| Creating a timestamped temporary directory. /net/waterston/vol9/capertmp/chip/20220111_095717_900159
2022-01-11 09:57:17,904|caper.caper_runner|INFO| Localizing files on work_dir. /net/waterston/vol9/capertmp/chip/20220111_095717_900159
2022-01-11 09:57:19,247|caper.caper_workflow_opts|INFO| Conda environment name not found in WDL metadata. wdl=/net/waterston/vol2/home/gevirl/chip-seq-pipeline2-2.1.2/chip.wdl
2022-01-11 09:57:19,254|caper.cromwell|INFO| Validating WDL/inputs/imports with Womtool...
Traceback (most recent call last):
  File "/nfs/waterston/miniconda3/envs/py39/bin/caper", line 13, in <module>
    main()
  File "/nfs/waterston/miniconda3/envs/py39/lib/python3.9/site-packages/caper/cli.py", line 705, in main
    return runner(parsed_args, nonblocking_server=nonblocking_server)
  File "/nfs/waterston/miniconda3/envs/py39/lib/python3.9/site-packages/caper/cli.py", line 249, in runner
    subcmd_run(c, args)
  File "/nfs/waterston/miniconda3/envs/py39/lib/python3.9/site-packages/caper/cli.py", line 379, in subcmd_run
    thread = caper_runner.run(
  File "/nfs/waterston/miniconda3/envs/py39/lib/python3.9/site-packages/caper/caper_runner.py", line 462, in run
    self._cromwell.validate(wdl=wdl, inputs=inputs, imports=imports)
  File "/nfs/waterston/miniconda3/envs/py39/lib/python3.9/site-packages/caper/cromwell.py", line 154, in validate
    raise WomtoolValidationFailed(
caper.cromwell.WomtoolValidationFailed: RC=1
STDERR=WARNING: Unexpected input provided: chip.align_mem_mb (expected inputs: [chip.fastqs_rep3_R2, chip.align_ctl.trim_bp, chip.filter_disk_factor, chip.gensz, chip.trimmomatic_phred_score_format, chip.peaks_pr1, chip.ctl_nodup_bams, chip.ctl_depth_limit, chip.use_filt_pe_ta_for_xcor, chip.xcor_subsample_reads, chip.call_peak_time_hr, chip.fastqs_rep1_R1, chip.paired_ends, chip.align_R1.multimapping, chip.align.multimapping, chip.gc_bias_picard_java_heap, chip.fdr_thresh, chip.align_trimmomatic_java_heap, chip.align_bwa_mem_factor, chip.fastqs_rep9_R1, chip.ctl_depth_ratio, chip.filter_cpu, chip.xcor_exclusion_range_max, chip.pval_thresh, chip.fastqs_rep6_R2, chip.ctl_fastqs_rep4_R1, chip.fastqs_rep5_R2, chip.peak_pooled, chip.read_genome_tsv.null_s, chip.description, chip.ctl_paired_ends, chip.fastqs_rep4_R1, chip.macs2_signal_track_mem_factor, chip.fastqs_rep5_R1, chip.mapq_thresh, chip.ctl_fastqs_rep6_R1, chip.filter_R1.ref_fa, chip.macs2_signal_track_time_hr, chip.xcor_disk_factor, chip.ctl_fastqs_rep1_R2, chip.fastqs_rep7_R2, chip.filter_chrs, chip.ref_fa, chip.fastqs_rep6_R1, chip.ctl_fastqs_rep5_R2, chip.enable_jsd, chip.dup_marker, chip.call_peak_spp_disk_factor, chip.pool_ta.col, chip.docker, chip.use_bwa_mem_for_pe, chip.ctl_fastqs_rep2_R2, chip.fastqs_rep8_R2, chip.macs2_signal_track_disk_factor, chip.filter_time_hr, chip.peaks, chip.filter_no_dedup.ref_fa, chip.xcor_cpu, chip.call_peak_macs2_mem_factor, chip.peak_ppr1, chip.align_bowtie2_disk_factor, chip.call_peak_cpu, chip.enable_gc_bias, chip.ctl_fastqs_rep3_R1, chip.conda_macs2, chip.nodup_bams, chip.ctl_fastqs_rep6_R2, chip.ctl_fastqs_rep1_R1, chip.use_bowtie2_local_mode, chip.fastqs_rep10_R2, chip.ctl_paired_end, chip.pool_blacklist.prefix, chip.true_rep_only, chip.ctl_subsample_reads, chip.ctl_fastqs_rep8_R2, chip.align_R1.trimmomatic_java_heap, chip.subsample_ctl_mem_factor, chip.ctl_fastqs_rep7_R1, chip.spr_mem_factor, chip.ctl_fastqs_rep5_R1, chip.bam2ta_time_hr, chip.fastqs_rep2_R1, chip.pool_ta_pr1.col, chip.ctl_bams, chip.subsample_reads, chip.align_bowtie2_mem_factor, chip.aligner, chip.blacklist, chip.title, chip.bowtie2_idx_tar, chip.ctl_fastqs_rep2_R1, chip.singularity, chip.align.trim_bp, chip.align_only, chip.align_time_hr, chip.exp_ctl_depth_ratio_limit, chip.bam2ta_cpu, chip.ctl_fastqs_rep9_R1, chip.enable_count_signal_track, chip.call_peak_spp_mem_factor, chip.no_dup_removal, chip.paired_end, chip.chrsz, chip.jsd_mem_factor, chip.ctl_fastqs_rep10_R2, chip.qc_report.qc_json_ref, chip.xcor_trim_bp, chip.bwa_idx_tar, chip.conda, chip.fastqs_rep4_R2, chip.peak_caller, chip.peak_ppr2, chip.fastqs_rep2_R2, chip.ctl_fastqs_rep7_R2, chip.fastqs_rep10_R1, chip.ctl_fastqs_rep3_R2, chip.jsd_disk_factor, chip.fastqs_rep8_R1, chip.align_ctl.multimapping, chip.call_peak_macs2_disk_factor, chip.fraglen, chip.jsd_time_hr, chip.crop_length, chip.conda_spp, chip.genome_name, chip.fastqs_rep7_R1, chip.mito_chr_name, chip.cap_num_peak, chip.always_use_pooled_ctl, chip.ctl_fastqs_rep9_R2, chip.ctl_tas, chip.blacklist2, chip.align_cpu, chip.bwa_mem_read_len_limit, chip.custom_aligner_idx_tar, chip.tas, chip.pseudoreplication_random_seed, chip.fastqs_rep1_R2, chip.fastqs_rep3_R1, chip.filter_picard_java_heap, chip.filter_mem_factor, chip.regex_bfilt_peak_chr_name, chip.spr_disk_factor, chip.crop_length_tol, chip.genome_tsv, chip.pool_ta_pr2.col, chip.bams, chip.xcor_mem_factor, chip.ctl_fastqs_rep10_R1, chip.ctl_fastqs_rep4_R2, chip.fastqs_rep9_R2, chip.pipeline_type, chip.peaks_pr2, chip.align_bwa_disk_factor, chip.jsd_cpu, 
chip.bam2ta_disk_factor, chip.subsample_ctl_disk_factor, chip.custom_align_py, chip.redact_nodup_bam, chip.xcor_time_hr, chip.bam2ta_mem_factor, chip.ctl_fastqs_rep8_R1, chip.pool_ta_ctl.col, chip.xcor_exclusion_range_min, chip.idr_thresh])
WARNING: Unexpected input provided: chip.cap_num_peak_spp (expected inputs: identical to the list above)
WARNING: Unexpected input provided: chip.call_peak_mem_mb (expected inputs: identical to the list above)

OS/Platform

  • OS/Platform: CentOS 7
  • Conda version: 4.8.3
  • Pipeline version: 2.1.2 (from the chip-seq-pipeline2-2.1.2 WDL path)
  • Caper version: 2.1.2

Caper configuration file

backend=sge

# Parallel environment is required; ask your administrator to create one.
# If your cluster doesn't support PE then edit 'sge-resource-param'
# to fit your cluster's configuration.
sge-pe=serial

# This parameter is NOT for 'caper submit' BUT for 'caper run' and 'caper server' only.
# This resource parameter string will be passed to sbatch, qsub, bsub, ...
# You can customize it according to your cluster's configuration.
# Note that Cromwell's implicit type conversion (String to Integer)
# seems to be buggy for WomLong type memory variables (memory_mb and memory_gb),
# so be careful about using the + operator between WomLong and other types (String, even Int).
# For example, ${"--mem=" + memory_mb} will not work since memory_mb is WomLong.
# Use ${if defined(memory_mb) then "--mem=" else ""}${memory_mb}${if defined(memory_mb) then "mb " else " "} instead.
# See broadinstitute/cromwell#4659 for details.
#
# Cromwell's built-in variables (attributes defined in a WDL task's runtime)
# can be used within the ${} notation:
# - cpu: number of cores for a job (default = 1)
# - memory_mb, memory_gb: total memory for a job in MB or GB
#   (converted from the 'memory' string attribute, including its size unit,
#   defined in the WDL task's runtime)
# - time: time limit for a job in hours
# - gpu: specified GPU name or number of GPUs (declared as a String)
#
# Parallel environment of SGE:
# find one with `qconf -spl` or ask your admin to add one if none exists.
# If your cluster works without PE then edit the sge-resource-param below.
#sge-pe=serial
sge-resource-param=${if cpu > 1 then "-pe " + sge_pe + " " else ""} ${if cpu > 1 then cpu else ""} ${true="-l h_vmem=$(expr " false="" defined(memory_mb)}${memory_mb}${true=" / " false="" defined(memory_mb)}${if defined(memory_mb) then cpu else ""}${true=")m" false="" defined(memory_mb)} ${true="-l s_vmem=$(expr " false="" defined(memory_mb)}${memory_mb}${true=" / " false="" defined(memory_mb)}${if defined(memory_mb) then cpu else ""}${true=")m" false="" defined(memory_mb)} ${"-l h_rt=" + time + ":00:00"} ${"-l s_rt=" + time + ":00:00"} ${"-l gpu=" + gpu}
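# (For reference: with cpu=6 and memory_mb=6064, the align task's values seen
#  later in this thread, the string above expands to
#    -pe serial 6 -l h_vmem=$(expr 6064 / 6)m -l s_vmem=$(expr 6064 / 6)m -l h_rt=48:00:00 -l s_rt=48:00:00
#  i.e. roughly 1010 MB of vmem per slot, since SGE applies h_vmem/s_vmem per slot.)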

# If needed, uncomment and define any extra SGE qsub parameters here.
# YOU CANNOT USE WDL SYNTAX AND CROMWELL BUILT-IN VARIABLES HERE.
#sge-extra-param=

# Hashing strategy for call-caching (3 choices).
# This parameter is for local (local/slurm/sge/pbs/lsf) backends only.
# It is important for call-caching, i.e. re-using outputs from previous/failed
# workflows; the cache will miss if a different strategy is used.
#   file: use md5sum hash (slow). This was the default for Caper<1.0.
#   path: use path only.
#   path+modtime: use path and modification time. Default for Caper>=1.0.
local-hash-strat=path+modtime

# Metadata DB for call-caching (reusing previous outputs):
# Cromwell supports restarting workflows based on a metadata DB.
# The DB is in-memory by default.
#db=in-memory

# If you use 'caper server' then you can use one unified '--file-db'
# for all submitted workflows. In that case, uncomment the following two lines
# and define file-db as an absolute path to store metadata of all workflows.
#db=file
#file-db=

# If you use 'caper run' and want call-caching, make sure to define a
# different 'caper run ... --db file --file-db DB_PATH' for each pipeline run.
# But if you want to restart a run, define the same '--db file --file-db DB_PATH';
# Caper will then collect/re-use previous outputs without running the same task
# again. Previous outputs will simply be hard/soft-linked.
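# A minimal illustration of that pattern (the DB path here is hypothetical):
#   caper run chip.wdl -i input.json --db file --file-db /path/to/chip_metadata.db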

# Local directory for localized files and Cromwell's intermediate files.
# If not defined, Caper will make .caper_tmp/ on local-out-dir or CWD.
# /tmp is not recommended here since Caper stores all localized data files
# (e.g. input FASTQs defined as URLs in the input JSON) in this directory.
local-loc-dir=/net/waterston/vol9/capertmp

cromwell=/net/waterston/vol2/home/gevirl/.caper/cromwell_jar/cromwell-65.jar
womtool=/net/waterston/vol2/home/gevirl/.caper/womtool_jar/womtool-65.jar

Input JSON file

{"chip.title":"arid-1_RW12194_L4larva_1_6","chip.description":"gevirl","chip.always_use_pooled_ctl":false,"chip.true_rep_only":false,"chip.enable_count_signal_track":true,"chip.aligner":"bwa","chip.use_bwa_mem_for_pe":true,"chip.align_only":false,"chip.genome_tsv":"/net/waterston/vol9/WS245chr/WS245chr.tsv","chip.peak_caller":"spp","chip.pipeline_type":"tf","chip.cap_num_peak_spp":300000,"chip.idr_thresh":0.01,"chip.call_peak_mem_mb":16000,"chip.align_mem_mb":20000,"chip.filter_picard_java_heap":"4G","chip.fastqs_rep1_R1":["/net/waterston/vol9/ChipSeqPipeline/arid-1_RW12194_L4larva_1/ARDIP3_240_337_S33_L001_R1_001.fastq.gz"],"chip.fastqs_rep1_R2":["/net/waterston/vol9/ChipSeqPipeline/arid-1_RW12194_L4larva_1/ARDIP3_240_337_S33_L001_R2_001.fastq.gz"],"chip.fastqs_rep2_R1":["/net/waterston/vol9/ChipSeqPipeline/arid-1_RW12194_L4larva_1/ARDIP4_228_349_S34_L001_R1_001.fastq.gz"],"chip.fastqs_rep2_R2":["/net/waterston/vol9/ChipSeqPipeline/arid-1_RW12194_L4larva_1/ARDIP4_228_349_S34_L001_R2_001.fastq.gz"],"chip.paired_ends":[true,true],"chip.ctl_fastqs_rep1_R1":["/net/waterston/vol9/ChipSeqPipeline/arid-1_RW12194_L4larva_1/ARDinp3_263_314_S39_L001_R1_001.fastq.gz"],"chip.ctl_fastqs_rep1_R2":["/net/waterston/vol9/ChipSeqPipeline/arid-1_RW12194_L4larva_1/ARDinp3_263_314_S39_L001_R2_001.fastq.gz"],"chip.ctl_paired_ends":[true]}

Troubleshooting result

If you ran caper run without a Caper server, Caper automatically runs a troubleshooter for failed workflows. Find the troubleshooting result at the bottom of Caper's screen log.

If you ran caper submit with a running Caper server then first find your workflow ID (1st column) with caper list and run caper debug [WORKFLOW_ID].

Paste troubleshooting result.

PASTE TROUBLESHOOTING RESULT HERE
@leepc12 (Contributor) commented Jan 12, 2022

STDERR=WARNING: Unexpected input provided: chip.align_mem_mb

I think you are using an outdated Conda environment and input JSON. The parameter chip.align_mem_mb has been deprecated (the pipeline now automatically determines each job's memory from its input file sizes) and no longer exists.

Please get the latest pipeline + caper and reinstall the pipeline's Conda environment (DO NOT ACTIVATE THE PIPELINE'S CONDA ENV BEFORE RUNNING A PIPELINE).

$ pip install caper --upgrade
# and then git pull (or git clone from scratch) the pipeline git directory to update it
$ scripts/uninstall_conda_env.sh
$ scripts/install_conda_env.sh

Remove all *_mem_mb parameters from your input JSON and try again.
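Applied to the input JSON above, that means (a sketch based on the Womtool warnings; everything else stays unchanged):

  Keys to delete:
    "chip.align_mem_mb": 20000,
    "chip.call_peak_mem_mb": 16000,

  The validator also flagged "chip.cap_num_peak_spp", which is not in the
  expected-inputs list; its closest match there is "chip.cap_num_peak", so that
  key presumably needs renaming as well:
    "chip.cap_num_peak_spp": 300000  ->  "chip.cap_num_peak": 300000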

@louisgevirtzman (Author)

I have updated Conda and removed all *_mem_mb parameters from the input JSON.
The alignment tasks are now failing because of an inadequate memory request, and I cannot figure out how to request more memory for this task. Do I edit something in the WDL file or the Caper config file? Please provide a simple example.

@leepc12 (Contributor) commented Jan 19, 2022

The pipeline automatically scales memory according to the size of each task's inputs.
Please upload cromwell.out* files for debugging.

@louisgevirtzman (Author)

cromwell.out.10.gz

The alignment tasks die; SGE kills them. Here is a typical report for one of the SGE jobs.

I don't know what this code means:
failed 100 : assumedly after job

(base) [gevirl@grid-head2 execution]$ qacct -j 282137901

qname sage-login.q
hostname sage012.grid.gs.washington.edu
group waterstonlab
owner gevirl
project sage
department waterstonlab
jobname cromwell_cd5ab92a_align_R1
jobnumber 282137901
taskid undefined
pe_taskid NONE
account sge
priority 0
cwd /net/waterston/vol9/ChipSeqPipeline/arid-1_RW12194_L4larva_1/chip/cd5ab92a-246c-45fd-b6f9-64f5b09114b8/call-align_R1/shard-0
submit_host sage013.grid.gs.washington.edu
submit_cmd qsub -V -terse -S /bin/bash -N cromwell_cd5ab92a_align_R1 -wd /net/waterston/vol9/ChipSeqPipeline/arid-1_RW12194_L4larva_1/chip/cd5ab92a-246c-45fd-b6f9-64f5b09114b8/call-align_R1/shard-0 -o /net/waterston/vol9/ChipSeqPipeline/arid-1_RW12194_L4larva_1/chip/cd5ab92a-246c-45fd-b6f9-64f5b09114b8/call-align_R1/shard-0/execution/stdout -e /net/waterston/vol9/ChipSeqPipeline/arid-1_RW12194_L4larva_1/chip/cd5ab92a-246c-45fd-b6f9-64f5b09114b8/call-align_R1/shard-0/execution/stderr -pe serial 6 -l h_vmem=936m -l s_vmem=936m -l h_rt=48:00:00 -l s_rt=48:00:00 -P sage /net/waterston/vol9/ChipSeqPipeline/arid-1_RW12194_L4larva_1/chip/cd5ab92a-246c-45fd-b6f9-64f5b09114b8/call-align_R1/shard-0/execution/script.caper
qsub_time 01/20/2022 11:32:51.622
start_time 01/20/2022 11:33:21.751
end_time 01/20/2022 11:44:29.509
granted_pe serial
slots 6
failed 100 : assumedly after job
deleted_by NONE
exit_status 152
ru_wallclock 667.758
ru_utime 0.611
ru_stime 0.527
ru_maxrss 13712
ru_ixrss 0
ru_ismrss 0
ru_idrss 0
ru_isrss 0
ru_minflt 75212
ru_majflt 24
ru_nswap 0
ru_inblock 42496
ru_oublock 832
ru_msgsnd 0
ru_msgrcv 0
ru_nsignals 0
ru_nvcsw 1918
ru_nivcsw 279
wallclock 668.176
cpu 1684.140
mem 709.450
io 49.326
iow 2.700
ioops 3671821
maxvmem 5.394G
maxrss 1.545G
maxpss 1.536G
arid undefined
jc_name NONE
bound_cores sage012.grid.gs.washington.edu=0,2, sage012.grid.gs.washington.edu=0,3, sage012.grid.gs.washington.edu=1,0, sage012.grid.gs.washington.edu=1,1, sage012.grid.gs.washington.edu=1,2, sage012.grid.gs.washington.edu=1,3

@louisgevirtzman (Author)

exit_status 152 in the SGE report in the previous comment may mean a memory limit was reached (152 − 128 = 24, i.e. SIGXCPU, which SGE sends when a soft resource limit such as s_vmem is exceeded).
I would like to experiment with increasing the requested memory for the alignment tasks. Please guide me on how to do that.

@leepc12 (Contributor) commented Jan 20, 2022

Let's separate the job script itself from Caper.
Please run this script on your login node; it is the exact command that Caper submitted.
Let me know what kind of errors you get for the job cromwell_cd5ab92a_align, and run qacct on it.
Also, please check whether you can make this command work by modifying the resource parameters.

qsub -V -terse -S /bin/bash -N cromwell_cd5ab92a_align -wd /net/waterston/vol9/ChipSeqPipeline/arid-1_RW12194_L4larva_1/chip/cd5ab92a-246c-45fd-b6f9-64f5b09114b8/call-align/shard-1 -o /net/waterston/vol9/ChipSeqPipeline/arid-1_RW12194_L4larva_1/chip/cd5ab92a-246c-45fd-b6f9-64f5b09114b8/call-align/shard-1/execution/stdout -e /net/waterston/vol9/ChipSeqPipeline/arid-1_RW12194_L4larva_1/chip/cd5ab92a-246c-45fd-b6f9-64f5b09114b8/call-align/shard-1/execution/stderr \
    \
   -pe serial  6 -l h_vmem=$(expr 6064 / 6)m -l s_vmem=$(expr 6064 / 6)m -l h_rt=48:00:00 -l s_rt=48:00:00   \
   -P sage \
   /net/waterston/vol9/ChipSeqPipeline/arid-1_RW12194_L4larva_1/chip/cd5ab92a-246c-45fd-b6f9-64f5b09114b8/call-align/shard-1/execution/script.caper

Resource parameters:

   -pe serial  6 -l h_vmem=$(expr 6064 / 6)m -l s_vmem=$(expr 6064 / 6)m -l h_rt=48:00:00 -l s_rt=48:00:00   \
   -P sage \
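For scale, $(expr 6064 / 6) evaluates to 1010, so this request gives each of the 6 slots a 1010 MB vmem limit (about 6 GB total, since SGE applies h_vmem/s_vmem per slot). One hypothetical modification to test, doubling the memory request, would be:

   -pe serial  6 -l h_vmem=$(expr 12128 / 6)m -l s_vmem=$(expr 12128 / 6)m -l h_rt=48:00:00 -l s_rt=48:00:00   \
   -P sage \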

@louisgevirtzman (Author)

I ran the original job script and got the same error. The script and the qacct results are in the attached file original.txt.

I increased the memory and the job ran to completion without error. The modified script and the qacct results are in the attached file moreMemory.txt.

Please advise how I should increase the memory specification for particular tasks.

moreMemory.txt
original.txt

@leepc12 (Contributor) commented Jan 24, 2022

You can increase the memory factor, which is a multiplier on the total memory for each task.
Find the default value for this parameter, chip.align_mem_factor, in the pipeline's input JSON documentation and try doubling it.
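For example (a sketch, not verified against the docs: this run uses the bwa aligner, and the expected-inputs list in the original error shows per-aligner names, so the relevant key is presumably chip.align_bwa_mem_factor; the value below assumes a default of 1.0):

{
  "chip.align_bwa_mem_factor": 2.0
}

Add the key to the input JSON alongside the existing inputs; the pipeline multiplies its input-size-based memory estimate by this factor.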
