read screen incorrectly counting basepairs #588
Comments
Does it work without the tabs? Is the cut -f2 grabbing just the ...? Either way, I agree that using an established tool that is already in the Docker image is a good idea.
I'm not sure if it works without the tabs. Yes, it's grabbing the ...
I started a dev branch to address this issue, called ... We can obviously take the solution in a different direction than what I implemented, but I wanted to get our user past this blocker.
Our PHL partner that reported this issue confirmed that the dev branch resolved it for their recent batch of samples sequenced on ONT 👍 Despite the success, if the team thinks my solution is not robust enough or knows of a better way, let's discuss.
🐛
📝 Describe the Issue
I came across some ONT data where the (raw) read screen task fails to accurately count # of basepairs with some ONT FASTQ files. This bash one-liner...
public_health_bioinformatics/tasks/quality_control/comparisons/task_screen.wdl
Line 55 in 1508bb5
...is failing due to tabs being present in the FASTQ header, example:
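To illustrate the failure mode, here is a minimal Python sketch. It assumes the one-liner joins each 4-line FASTQ record with tabs and takes the second tab-separated field (a `paste - - - - | cut -f2` style pipeline, inferred from the `cut -f2` mentioned in the comments; the exact command is not reproduced here). The function names and the toy records are hypothetical and exist only for illustration.

```python
# Toy FASTQ text (hypothetical records). The second header contains a tab,
# as seen in some ONT FASTQ files.
TOY_FASTQ = (
    "@read1 runid=abc\n"
    "ACGTACGT\n"
    "+\n"
    "IIIIIIII\n"
    "@read2\tch=12\n"
    "ACGTACGTACGT\n"
    "+\n"
    "IIIIIIIIIIII\n"
)


def count_bases_field_split(fastq_text: str) -> int:
    """Mimic the assumed approach: join each record with tabs, take field 2."""
    lines = fastq_text.splitlines()
    total = 0
    for i in range(0, len(lines) - 3, 4):
        record = "\t".join(lines[i:i + 4])  # like `paste - - - -`
        fields = record.split("\t")
        total += len(fields[1])             # like `cut -f2`
    return total


def count_bases_by_line(fastq_text: str) -> int:
    """Count bases from the second line of every 4-line record.

    This never splits on tabs, so header content cannot shift the result.
    """
    lines = fastq_text.splitlines()
    return sum(len(lines[i].strip()) for i in range(1, len(lines), 4))


if __name__ == "__main__":
    print("field-based count:", count_bases_field_split(TOY_FASTQ))
    print("line-based count: ", count_bases_by_line(TOY_FASTQ))
```

On these toy records the field-based count is lower than the line-based count, because for the read with a tab in its header, field 2 is a piece of the header rather than the sequence. Counting by line position, or delegating to an established tool such as fastq-scan, avoids depending on what the header contains.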
For this particular sample, the raw_read_screen outputs this and FAILS the sample:
When fastq-scan shows the true number of basepairs, which is much higher for this 360 MB .fastq.gz file:
🔁 How to Reproduce
Ask me for a link to the theiaprok_ont workflow in Terra where this behavior was observed. Prefer not to post this link publicly to preserve privacy.
Feel free to answer the following questions to help us understand:
miniwdl or cromwell?
💻 Version Information
TheiaProk_ONT v2.1.0