Releases: SorenKarst/longread_umi
v0.3.2
Version v0.3.x of the pipeline is compatible with the descriptions and data in:
SM Karst, RM Ziels, RH Kirkegaard, EA Sørensen, D McDonald, Q Zhu, R Knight, & M Albertsen. (2020). Enabling high-accuracy long-read amplicon sequences using unique molecular identifiers with Nanopore or PacBio sequencing. bioRxiv, 645903.
Bug fixes
- Re-introduced Racon window trimming for pacbio_pipeline. Without trimming, the error rate is higher.
Minor changes
- Updated readme and example analysis.
v0.3.1
Version v0.3.x of the pipeline is compatible with the descriptions and data in:
SM Karst, RM Ziels, RH Kirkegaard, EA Sørensen, D McDonald, Q Zhu, R Knight, & M Albertsen. (2020). Enabling high-accuracy long-read amplicon sequences using unique molecular identifiers with Nanopore or PacBio sequencing. bioRxiv, 645903.
Bug fixes
- Disabled end trimming in Racon, because trimming truncated the UMI consensus sequences. In some cases the truncation removed priming sites, which caused the sequences to be discarded.
- UMI cluster size was underestimated by 50% because the mapping settings discarded reverse-complement matches. The impact was mainly on low-abundance UMI bins, which is primarily relevant for PacBio CCS data.
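The cluster-size fix can be illustrated with a minimal Python sketch. It is not the pipeline's mapping code: the function names, the exact-match comparison, and the toy data are assumptions made for illustration. It assumes reads carrying a given UMI arrive in both orientations in roughly equal numbers, so counting forward-strand matches alone recovers about half of them.

```python
from typing import Sequence

# Illustrative only: hypothetical helpers, not part of longread_umi.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def cluster_size(umi: str, read_umis: Sequence[str], both_strands: bool) -> int:
    """Count reads whose UMI matches exactly, optionally on either strand."""
    targets = {umi}
    if both_strands:
        targets.add(reverse_complement(umi))
    return sum(1 for r in read_umis if r in targets)


# Toy data: three forward reads, three reverse-complement reads, one unrelated.
umi = "AACCGGTA"
reads = [umi] * 3 + [reverse_complement(umi)] * 3 + ["GGGGGGGG"]

print(cluster_size(umi, reads, both_strands=False))  # 3
print(cluster_size(umi, reads, both_strands=True))   # 6
```

With forward-only matching the toy cluster size is half of the true value, which mirrors the underestimate described above.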
Major changes
- Filtering of poor UMI bins by UMI match error, UMI bin size/cluster size ratio and read orientation ratio was integrated into the UMI binning step.
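As a rough illustration of how the three filter criteria could be combined, here is a minimal Python sketch. The record fields, the way each metric is defined, and all threshold values are hypothetical; the release note names the criteria but does not state how the pipeline computes or thresholds them.

```python
from dataclasses import dataclass


@dataclass
class UmiBin:
    """Hypothetical per-bin summary; field definitions are assumptions."""
    name: str
    umi_match_error: float  # assumed: mean mismatches between read UMIs and bin UMI
    bin_size: int           # reads assigned to the bin
    cluster_size: int       # reads in the original UMI cluster
    n_plus: int             # reads in forward orientation
    n_minus: int            # reads in reverse orientation


def keep_bin(
    b: UmiBin,
    max_match_error: float = 3.0,             # hypothetical threshold
    max_bin_cluster_ratio: float = 10.0,      # hypothetical threshold
    min_minor_strand_fraction: float = 0.2,   # hypothetical threshold
) -> bool:
    """Return True if the bin passes all three illustrative filters."""
    total = b.n_plus + b.n_minus
    if b.cluster_size == 0 or total == 0:
        return False
    if b.umi_match_error > max_match_error:
        return False
    if b.bin_size / b.cluster_size > max_bin_cluster_ratio:
        return False
    return min(b.n_plus, b.n_minus) / total >= min_minor_strand_fraction


bins = [
    UmiBin("good", 1.0, 20, 10, 11, 9),
    UmiBin("high_error", 5.0, 20, 10, 10, 10),
    UmiBin("inflated", 1.0, 200, 10, 100, 100),
    UmiBin("one_strand", 1.0, 20, 10, 20, 0),
]
kept = [b.name for b in bins if keep_bin(b)]
print(kept)  # ['good']
```

Applying all three checks in one pass reflects the change described above, where filtering happens during binning instead of as a separate later step.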
Minor changes
- Updated readme and example analysis.
v0.3.0
Version v0.3.x of the pipeline is compatible with the descriptions and data in:
SM Karst, RM Ziels, RH Kirkegaard, EA Sørensen, D McDonald, Q Zhu, R Knight, & M Albertsen. (2020). Enabling high-accuracy long-read amplicon sequences using unique molecular identifiers with Nanopore or PacBio sequencing. bioRxiv, 645903.
Major changes
- Conda installation is now available and is the recommended option.
- PacBio pipeline added.
- longread_umi pipelines and tools can now be accessed by calling longread_umi <tool/pipeline>
- Demultiplexing scripts and a barcode template script have been added for custom primer/adaptor demultiplexing.
- Readme streamlined; usage, example data, and example analysis added.
- R functions for data validation were added.
- umi_binning was tuned to better find low-coverage UMIs.
Minor changes
- Pipeline name simplified to longread_umi from longread-UMI-pipeline.
- polish_medaka: speed optimization and better control of parallel jobs.
- qc_pipeline: streamlining and more stats.
- primer_position updated to provide adapter positions as well.
- trim_amplicon updated to work with custom primers.
- variants now dereplicates before clustering and uses centroids as cluster mapping reference.
- Zymomock rRNA reference sequences were updated to remove an indel error in the Salmonella reference.
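The variants change (dereplicate first, then cluster and keep centroids as references) can be sketched in Python as follows. This is a toy stand-in, not the tool the pipeline calls: the function names, the Hamming-style identity on equal-length sequences, and the 0.9 identity threshold are all assumptions chosen to keep the example short.

```python
from collections import Counter


def dereplicate(seqs):
    """Collapse identical sequences into (sequence, count), most abundant first."""
    return Counter(seqs).most_common()


def identity(a, b):
    """Fraction of matching positions; 0.0 for empty or unequal-length inputs."""
    if not a or len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(uniques, min_identity=0.9):
    """Greedy clustering: each unique sequence joins the first centroid it
    matches at >= min_identity, otherwise it becomes a new centroid."""
    centroids = []  # each entry is [centroid_sequence, total_read_count]
    for seq, count in uniques:
        for c in centroids:
            if identity(seq, c[0]) >= min_identity:
                c[1] += count
                break
        else:
            centroids.append([seq, count])
    return [(s, n) for s, n in centroids]


seqs = ["ACGTACGTAC"] * 3 + ["ACGTACGTAT"] + ["TTTTTTTTTT"] * 2
uniques = dereplicate(seqs)
print(uniques)                  # [('ACGTACGTAC', 3), ('TTTTTTTTTT', 2), ('ACGTACGTAT', 1)]
print(greedy_cluster(uniques))  # [('ACGTACGTAC', 4), ('TTTTTTTTTT', 2)]
```

Dereplicating first means the most abundant unique sequences are considered first and tend to become centroids, which is one common motivation for using centroids as the mapping reference.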
v0.2.1
Minor improvements
- Implemented conda installation
- Streamlined instructions
- Use GNU parallel --env for more robust function export
Minor fixes
- Updated Zymo mock references (1 indel error).
Version v0.2.x of the pipeline is compatible with the descriptions and data in:
Karst, S. M., Ziels, R. M., Kirkegaard, R. H., & Albertsen, M. (2019). Enabling high-accuracy long-read amplicon sequences using unique molecular identifiers and Nanopore sequencing. bioRxiv, 645903.
https://www.biorxiv.org/content/10.1101/645903v2