The data layout field from SRA (i.e. paired or single-ended) is not reliable. This has now led twice to issues where bioflows expects the data to be paired-end and searches for the pair, but the data is actually single-ended, so the workflow errors out. In one case (SRX1726841), `fastq-dump` split the file in two by simply putting the first half of the reads in one file and the second half in the other, so the read IDs did not match because the reads were never paired.
One potential solution is to add a check on the FASTQ files that verifies the first x read IDs match between the two files. However, this would require another pass over the FASTQ files on top of the one FastQC already makes, so it would increase bioflows run time and may not be desirable behavior.
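
A minimal sketch of such a check, in case it helps the discussion. It is not bioflows code: `read_ids` and `looks_paired` are hypothetical names, and it assumes each FASTQ record spans exactly four lines and that mates share the first whitespace-delimited token of the header (with an optional trailing `/1` or `/2`). Because it reads only the first `n` records of each file, it stops early rather than scanning the whole file.

```python
import gzip
from itertools import islice


def read_ids(path, n):
    """Return the normalized IDs of the first n records in a FASTQ file.

    Assumes 4-line FASTQ records (no wrapped sequence lines).
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    ids = []
    with opener(path, "rt") as handle:
        # Take every 4th line starting at line 0: these are the header lines.
        for header in islice(handle, 0, 4 * n, 4):
            fields = header.split()
            token = fields[0] if fields else ""
            if token.startswith("@"):
                token = token[1:]
            # Drop a trailing mate suffix so that ID/1 and ID/2 compare equal.
            if token.endswith(("/1", "/2")):
                token = token[:-2]
            ids.append(token)
    return ids


def looks_paired(path_1, path_2, n=100):
    """Return True if the first n read IDs of both files match in order."""
    ids_1 = read_ids(path_1, n)
    ids_2 = read_ids(path_2, n)
    return bool(ids_1) and ids_1 == ids_2
```

If `looks_paired` returns `False` for a run that SRA labels as paired, the workflow could fall back to treating the data as single-ended or stop with a clear message instead of failing later while searching for the pair.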