
ivar consensus fails silently from a broken pipe #598

Closed
mdperry opened this issue Aug 28, 2024 · 3 comments · Fixed by #629
mdperry commented Aug 28, 2024

🐛

📝 Describe the Issue

I had analyzed some RSV fastq files to generate consensus.fasta output (similar to SARS-CoV-2) using local resources and a roll-your-own pipeline of commands. I wanted to test the TheiaCoV-PHB-PE workflow running on Terra as a comparison (in particular I was interested in all of the additional QC and filtering steps). I had 3 RSV-A samples and 3 RSV-B samples, and I aligned all 6 against RSV-A and separately against RSV-B.

I discovered that the consensus.fasta files for 2 of the 3 RSV-A samples were incomplete, or "too short": each contained only ~53.5% of the RSV-A genome (the back half was missing entirely, not replaced by N's). I had selected these samples because my previous characterization showed that they had extremely high genome coverage with a minimal number of N's.

Further examination of the Terra stderr files for both samples revealed identical errors: a broken pipe had killed the execution:

/cromwell_root/script: line 60:    24 Broken pipe             samtools mpileup --count-orphans -d 600000 --no-BAQ -Q 0 --reference ${ref_genome} /cromwell_root/fc-secure-2063e6fe-3ab3-4713-9cd6-857b72d9954c/submissions/bdb5ddd1-379b-4700-bbfe-9d73bb5b121a/theiacov_illumina_pe/0caa8441-44ee-46c8-9341-428b5b970827/call-ivar_consensus/ivar_consensus/5223ad81-e4b1-42fd-a584-3a437b1d113f/call-primer_trim/RSV00100A.primertrim.sorted.bam
        25 Killed                  | ivar consensus -p RSV00100A.consensus -q 20 -t 0.6 -m 100 -n N

I realized that, since partial consensus.fasta files had already been written to disk, the task (and workflow) proceeded to completion as if nothing had gone wrong.
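The failure mode above can be reproduced in miniature with stand-in commands (this is an illustrative sketch, not PHB code): when the reading end of a pipe dies early, as `ivar consensus` apparently did here, the writer receives SIGPIPE and exits with status 141 (128 + signal 13), which is exactly the "Broken pipe" in the stderr log:

```shell
#!/usr/bin/env bash
# Stand-in for `samtools mpileup ... | ivar consensus ...`:
# `head` exits after one line, so the still-writing `seq` gets SIGPIPE.
seq 1 1000000 | head -n 1 > /dev/null
# Bash records each stage's exit status in PIPESTATUS;
# the pipeline as a whole reports only the LAST stage's status.
echo "per-stage exit statuses: ${PIPESTATUS[@]}"   # typically "141 0"
```

Because only the last stage's status counts by default, the script carries on with whatever partial output was already flushed to disk.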

I tried to see if I could reproduce this error with the Terra bam file on a local host (just running the same command on the command line). I ran two tests with the same input: one used the code from my pipeline, and one used the command from TheiaCoV-PHB-PE (they were very slightly different; I also did not try to control for versions of samtools or ivar).

I was monitoring the progress (it took a good 20 or more minutes to run) and I noticed that first the output files were basically empty, then suddenly they contained about 8191 bytes, at which point they stayed the same size until the end when they were both just over 15000 bytes. I mention this because the two truncated consensus.fasta files were exactly this size, 8191 bytes, and contained about the same number of contiguous bases. I inferred that ivar consensus processes data in blocks, or chunks, and writes the cache out to disk in a stepwise fashion.

Ordinarily, I add this line to blocks of bash code (whether in WDL workflows or executable files):

set -euo pipefail;

I believe that in the context of the Terra.bio workspace this addition would have forced the ivar consensus task to throw an exception, and alerted me to the fact that I needed to investigate and possibly re-run the affected samples.
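A minimal illustration of the difference pipefail makes, using `false | true` as a stand-in for the failing pipeline:

```shell
#!/usr/bin/env bash
# Default behavior: a pipeline's exit status is that of its LAST command,
# so a failure in an earlier stage is invisible to `set -e`.
bash -c 'false | true'
echo "without pipefail: $?"   # prints 0 -- the failure is swallowed

# With pipefail, the pipeline fails if ANY stage fails, so under
# `set -euo pipefail` the task would abort instead of finishing green.
bash -c 'set -o pipefail; false | true'
echo "with pipefail: $?"      # prints 1
```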

🔁 How to Reproduce

I just set the required parameters and launched the workflow after importing it from Dockstore. I suspect (based on almost nothing) that a broken pipe is probably a random, or stochastic, type of error. I did not try re-running those two samples on Terra (yet). It would be potentially interesting if the same two consistently died with the same error, but I have not had time to dig any deeper.

Feel free to answer the following questions to help us understand:

  • Was the workflow run on the Terra platform? Was it Terra on Azure or GCP?
    YES, on GCP
    • If necessary, we may ask you to share your Terra workspace with us. Usually READER access is sufficient, but we may ask for WRITER access if we need to make changes to the workspace to reproduce the issue.
  • Was the workflow run locally using miniwdl or cromwell?
    NO
    • If so, what exact command was used to launch the workflow?

💻 Version Information

PHB v2.1.0

@AngieHinrichs

Great find @mdperry!

since partial consensus.fasta files had been written to disk, the task (and workflow) proceeded to completion.

Yikes -- silent failures are scary. @capsakcj is there a way to make ivar consensus create a different output file, and then rename that to the final output filename after successful completion?
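The write-then-rename idea could be sketched in the task's wrapper script roughly as follows (illustrative only, not PHB code; `run_pipeline` stands in for the real `samtools mpileup ... | ivar consensus ...` pipeline):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder for the real samtools/ivar pipeline.
run_pipeline() { printf '>RSV00100A\nACGTACGT\n'; }

final="RSV00100A.consensus.fa"
tmp="${final}.partial.$$"

run_pipeline > "${tmp}"
# Reached only if run_pipeline exited 0 (thanks to pipefail). The rename is
# atomic on a single filesystem, so downstream steps see either no output
# file or a complete one -- never a truncated half-genome.
mv "${tmp}" "${final}"
```

In practice, since `ivar consensus` writes its own output files from the `-p` prefix, the temporary name would be passed to `-p` and the resulting `.fa` renamed after success.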

@AngieHinrichs

set -euo pipefail;

Better yet, this!

@Michal-Babins
Contributor

@mdperry thank you for sharing your data with us and for suggesting the pipefail parameters. Upon reproducing your error, we found the same failure in the ivar variant calling task as well. We have added pipefail parameters to both tasks and exposed optional compute parameters that will help with larger file sizes. By setting the memory to 16 GB for both the ivar consensus task and the ivar variant calling task, the tasks ran successfully.

The tasks will now also fail correctly instead of producing a silent error.

In the next patch release of PHB, the optional compute parameters will be exposed for the following tasks in the ivar consensus workflow:

  • bwa alignment
  • primer trimming
  • stats and coverage for primer trimming
  • variant calling
  • consensus
  • stats and coverage for the consensus wf

Here are the compute parameters that will be exposed for input:

  • cpu (integer number of CPUs)
  • memory (gigabytes of RAM allocated to the task)
  • disk_size (gigabytes of disk space allocated)
  • docker (Docker image string)

Example of how this will look in the terra UI:
[Screenshot: Terra UI showing the new optional compute inputs, 2024-09-20]
