update ncbi-scrub task (version 2.2.1) #202
read2_unzip=~{read2}
fi
# unzip read files as scrub tool does not take in .gz fastq files, and interleave them
paste <(zcat ~{read1} | paste - - - -) <(zcat ~{read2} | paste - - - -) | tr '\t' '\n' > interleaved.fastq
Will this work if there are uneven reads in ~{read1} and ~{read2}?
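One way to answer this before interleaving is to compare read counts up front. With uneven inputs, paste keeps going until the longer input is exhausted and pads the shorter one with empty fields, so interleaved.fastq would contain blank lines where a mate is missing. The sketch below is only an illustration: count_reads and check_paired are hypothetical helper names, gzip -dc stands in for zcat, and in the WDL task the two arguments would be ~{read1} and ~{read2}.

```bash
# Hypothetical helper: count FASTQ records (4 lines each) in a gzipped file.
count_reads() {
  gzip -dc "$1" | awk 'END { printf "%d\n", NR / 4 }'
}

# Hypothetical guard: return non-zero when the two files hold
# different numbers of reads, so the task can stop before interleaving.
check_paired() {
  n1=$(count_reads "$1")
  n2=$(count_reads "$2")
  if [ "$n1" -ne "$n2" ]; then
    echo "read count mismatch: $1 has $n1 reads, $2 has $n2 reads" >&2
    return 1
  fi
}

# Usage (file names are placeholders):
# check_paired sample_R1.fastq.gz sample_R2.fastq.gz || exit 1
```

Whether the task should fail, warn, or drop unpaired reads in that case is a design choice this sketch does not settle.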
Seeing failures in workflows utilizing the |
* A new hostile task, version 0.3.0
* making hostile an optional dehosting tool to ncbi-scrubber
* replacing ncbi-scrub with hostile for human reads removal (dehosting)
* reverting to v0.2.0 to test terra errors
* updated hostile output
* making hostie ouput optional
* Increased RAM for hostile task
This error may be occurring in |
Closing PR as stale
Closes
🛠️ Changes Being Made
ncbi_scrub
us-docker.pkg.dev/general-theiagen/ncbi/sra-human-scrubber:2.2.1
🧠 Context and Rationale
For ncbi_scrub, the latest version solves the issue of paired reads not being correctly masked, according to their documentation. The reads are still processed individually in the task, so the problem might persist unless we change the tasks to first interleave the reads, then process them using ncbi-scrub, and then split them again.
I've also altered the ncbi_scrub_pe task to first interleave the reads for processing with HRRT, and then split the resulting file back into forward and reverse read files.
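For illustration, the split step could look roughly like the sketch below. It is not the code in this PR: split_interleaved is a hypothetical helper name, the scrubber invocation itself is left out, and the sketch assumes the scrubbed file still alternates forward and reverse records in order, four lines per record.

```bash
# Hypothetical helper: split an interleaved FASTQ back into mates.
# $1: interleaved FASTQ, $2: forward output, $3: reverse output
split_interleaved() {
  awk -v r1="$2" -v r2="$3" '
    # Lines 1-4 of every 8-line block are the forward read, 5-8 the reverse.
    { if ((NR - 1) % 8 < 4) print > r1; else print > r2 }
  ' "$1"
}

# Usage (file names are placeholders):
# split_interleaved interleaved.clean.fastq read1.clean.fastq read2.clean.fastq
```

If the scrubber removes reads rather than masking them, the 8-line block assumption no longer holds and the mates would need to be re-paired by read name instead.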
📋 Workflow/Task Steps
N/A
Inputs
N/A
Outputs
read1_human_spots_removed and read2_human_spots_removed were removed (this isn't actually a final output of any of the workflows)
human_spots_removed was added with the stats containing the number of spots removed for the interleaved files
🧪 Testing
Locally
Tests passed locally with miniwdl for the individual task.
Terra
Tests under way...
🔬 Quality checks
Pull Request (PR) checklist: