v0.3.1
Version v0.3.x of the pipeline is compatible with the descriptions and data in:
SM Karst, RM Ziels, RH Kirkegaard, EA Sørensen, D McDonald, Q Zhu, R Knight, & M Albertsen. (2020). Enabling high-accuracy long-read amplicon sequences using unique molecular identifiers with Nanopore or PacBio sequencing. bioRxiv, 6459039.
Bug fixes
- Disabled end trimming in Racon, because the trimming truncated UMI consensus sequences. In some cases the truncation removed the priming sites, which caused the sequences to be discarded.
- Corrected UMI cluster sizes, which were underestimated by 50% because the mapping settings discarded reverse-complement matches. The impact was mainly on low-abundance UMI bins, which are primarily relevant for PacBio CCS data.
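
To illustrate the cluster size bug, here is a minimal Python sketch of why ignoring reverse-complement matches halves the count when reads are split evenly between strands. It is not the pipeline's code: the pipeline uses read mapping, whereas this sketch uses exact string matching, and the function names, the UMI sequence and the reads are all made up for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def count_umi_matches(umi: str, read_umis: list[str], both_strands: bool = True) -> int:
    """Count reads whose UMI equals the reference UMI (hypothetical helper).

    With both_strands=False only forward matches are counted, which mimics
    a setting that discards reverse-complement matches.
    """
    targets = {umi}
    if both_strands:
        targets.add(reverse_complement(umi))
    return sum(1 for read_umi in read_umis if read_umi in targets)


# Made-up example: two forward reads, two reverse-complement reads, one unrelated read.
umi = "AAACCCGT"
read_umis = ["AAACCCGT", "ACGGGTTT", "AAACCCGT", "ACGGGTTT", "TTTTTTTT"]

forward_only = count_umi_matches(umi, read_umis, both_strands=False)
both = count_umi_matches(umi, read_umis, both_strands=True)
print(forward_only, both)
```

In this toy data the forward-only count is half of the count that includes both strands, which matches the 50% underestimate described above.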
Major changes
- Integrated the filtering of poor UMI bins into the UMI binning step. Bins are filtered by UMI match error, UMI bin size/cluster size ratio, and read orientation ratio.
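
The three filter criteria can be sketched roughly as follows. This is a hedged Python illustration, not the pipeline's implementation: the `UmiBin` fields, the threshold values and the exact definition of each metric (for example, taking the orientation ratio as the fraction of reads on the minority strand) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class UmiBin:
    """Hypothetical per-bin summary; field names are assumptions."""
    name: str
    mean_match_error: float  # mean UMI match error of the reads in the bin
    bin_size: int            # number of reads assigned to the bin
    cluster_size: int        # number of UMI sequences in the originating cluster
    plus_reads: int          # reads in forward orientation
    minus_reads: int         # reads in reverse orientation


# Hypothetical thresholds, chosen only for illustration.
MAX_MATCH_ERROR = 3.0
MAX_BIN_CLUSTER_RATIO = 10.0
MIN_MINOR_STRAND_FRACTION = 0.1


def passes_filters(b: UmiBin) -> bool:
    """Return True if the bin passes all three illustrative filters."""
    total = b.plus_reads + b.minus_reads
    if total == 0 or b.cluster_size == 0:
        return False  # nothing to evaluate, treat as a poor bin
    ratio = b.bin_size / b.cluster_size
    minor_fraction = min(b.plus_reads, b.minus_reads) / total
    return (
        b.mean_match_error <= MAX_MATCH_ERROR
        and ratio <= MAX_BIN_CLUSTER_RATIO
        and minor_fraction >= MIN_MINOR_STRAND_FRACTION
    )


bins = [
    UmiBin("A", 1.0, 20, 10, 11, 9),       # passes all filters
    UmiBin("B", 5.0, 20, 10, 10, 10),      # match error too high
    UmiBin("C", 1.0, 200, 5, 100, 100),    # bin much larger than its cluster
    UmiBin("D", 1.0, 20, 10, 20, 0),       # all reads in one orientation
]
kept = [b.name for b in bins if passes_filters(b)]
print(kept)
```

Applying the checks during binning, rather than as a separate later step, means poor bins are dropped before consensus generation.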
Minor changes
- Updated readme and example analysis.