Releases: SorenKarst/longread_umi

v0.3.2

19 Feb 16:57
00302fd

Version v0.3.x of the pipeline is compatible with the descriptions and data in:
SM Karst, RM Ziels, RH Kirkegaard, EA Sørensen, D McDonald, Q Zhu, R Knight, & M Albertsen. (2020). Enabling high-accuracy long-read amplicon sequences using unique molecular identifiers with Nanopore or PacBio sequencing. bioRxiv, 645903.

Bug fixes

  • Re-introduced Racon window trimming for pacbio_pipeline. Running without trimming results in a higher error rate.

Minor changes

  • Updated readme and example analysis.

v0.3.1

09 Feb 21:11

Version v0.3.x of the pipeline is compatible with the descriptions and data in:
SM Karst, RM Ziels, RH Kirkegaard, EA Sørensen, D McDonald, Q Zhu, R Knight, & M Albertsen. (2020). Enabling high-accuracy long-read amplicon sequences using unique molecular identifiers with Nanopore or PacBio sequencing. bioRxiv, 645903.

Bug fixes

  • Disabled end trimming in Racon, because the trimming truncated UMI consensus sequences. In some cases the truncation removed priming sites, which caused the sequences to be discarded.
  • UMI cluster size was underestimated by 50% because the mapping settings discarded reverse-complement matches. The impact was mainly on low-abundance UMI bins, which is primarily relevant for PacBio CCS data.
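The cluster-size underestimate can be illustrated with a small Python sketch. It is not the pipeline's code: the pipeline maps reads with an aligner, whereas this toy uses exact substring matching, and the UMI, the reads and the function names are all made up for illustration. With reads split evenly between the two strands, counting forward-strand matches only finds half of them.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def count_umi_reads(umi: str, reads: list[str], both_strands: bool) -> int:
    """Count reads containing the UMI (exact match, illustrative only)."""
    targets = {umi}
    if both_strands:
        # Also accept reads that carry the UMI on the opposite strand.
        targets.add(reverse_complement(umi))
    return sum(1 for read in reads if any(t in read for t in targets))


# Hypothetical UMI and reads: three forward-strand and three reverse-strand copies.
umi = "AAACCCGGGT"
forward_read = "TTG" + umi + "CA"
reads = [forward_read] * 3 + [reverse_complement(forward_read)] * 3

forward_only = count_umi_reads(umi, reads, both_strands=False)
both_strands = count_umi_reads(umi, reads, both_strands=True)
print(forward_only, both_strands)
```

In this toy data set the forward-only count is half of the true cluster size, which matches the 50% underestimate described in the fix.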

Major changes

  • Filtering of poor UMI bins by UMI match error, UMI bin size/cluster size ratio and read orientation ratio was integrated into the UMI binning step.
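As a rough sketch of what such a three-criteria filter could look like, the Python below applies one check per criterion to a per-bin summary record. The field names, the thresholds and the direction of each comparison are assumptions chosen for illustration; the release notes do not state the values or the exact definitions the pipeline uses.

```python
from dataclasses import dataclass


@dataclass
class UmiBin:
    name: str
    mean_match_error: float  # mean mismatches between read UMIs and the bin UMI
    bin_size: int            # reads assigned to the bin
    cluster_size: int        # reads in the UMI cluster that seeded the bin
    plus_reads: int          # reads in forward orientation
    minus_reads: int         # reads in reverse orientation


def keep_bin(b: UmiBin,
             max_error: float = 3.0,          # hypothetical threshold
             max_size_ratio: float = 10.0,    # hypothetical threshold
             min_minor_fraction: float = 0.1  # hypothetical threshold
             ) -> bool:
    """Return True if the bin passes all three illustrative filters."""
    if b.mean_match_error > max_error:
        return False
    if b.cluster_size == 0 or b.bin_size / b.cluster_size > max_size_ratio:
        return False
    total = b.plus_reads + b.minus_reads
    if total == 0:
        return False
    # Fraction of reads in the less common orientation.
    return min(b.plus_reads, b.minus_reads) / total >= min_minor_fraction


bins = [
    UmiBin("bin_ok", 1.0, 20, 10, 12, 8),
    UmiBin("bin_high_error", 5.0, 20, 10, 12, 8),
    UmiBin("bin_inflated", 1.0, 200, 5, 100, 100),
    UmiBin("bin_one_strand", 1.0, 20, 10, 20, 0),
]
kept = [b.name for b in bins if keep_bin(b)]
print(kept)
```

Applying all three checks in one pass over the bins mirrors the idea of doing the filtering inside the binning step rather than as a separate downstream stage.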

Minor changes

  • Updated readme and example analysis.

v0.3.0

28 Jan 08:02
66f9598

Version v0.3.x of the pipeline is compatible with the descriptions and data in:
SM Karst, RM Ziels, RH Kirkegaard, EA Sørensen, D McDonald, Q Zhu, R Knight, & M Albertsen. (2020). Enabling high-accuracy long-read amplicon sequences using unique molecular identifiers with Nanopore or PacBio sequencing. bioRxiv, 645903.

Major changes

  • Conda installation is now available and is the recommended installation method.
  • PacBio pipeline added.
  • longread_umi pipelines and tools can now be accessed by calling longread_umi <tool/pipeline>.
  • Demultiplexing scripts and a barcode template script have been added for custom primer/adaptor demultiplexing.
  • Readme streamlined; usage instructions and example data/analysis added.
  • R functions for data validation were added.
  • umi_binning was tuned to better find low-coverage UMIs.

Minor changes

  • Pipeline name simplified from longread-UMI-pipeline to longread_umi.
  • polish_medaka speed optimization and better control of parallel jobs.
  • qc_pipeline streamlining and more stats.
  • primer_position updated to provide adapter positions as well.
  • trim_amplicon updated to work with custom primers.
  • variants now dereplicates before clustering and uses centroids as cluster mapping reference.
  • Zymomock rRNA reference sequences were updated to remove an indel error in the Salmonella reference.
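For readers unfamiliar with the dereplication step mentioned for variants above, the following Python sketch shows the general idea: identical sequences are collapsed into one record with an abundance count before any clustering takes place. It covers dereplication only, not the clustering or centroid selection, and the function name and example sequences are hypothetical. The release notes do not say which tool the pipeline uses for this step.

```python
from collections import Counter


def dereplicate(sequences: list[str]) -> list[tuple[str, int]]:
    """Collapse identical sequences and return (sequence, abundance) pairs.

    Sequences are compared case-insensitively; results are sorted by
    decreasing abundance, then alphabetically to keep the order stable.
    """
    counts = Counter(seq.upper() for seq in sequences)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical consensus sequences with two exact duplicates.
consensus = ["ACGTACGT", "acgtacgt", "TTTTACGT", "ACGTACGT"]
unique = dereplicate(consensus)
print(unique)
```

Clustering the unique sequences instead of every copy reduces the number of comparisons, and the abundance counts can help pick well-supported sequences as cluster representatives.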

v0.2.1

11 Jan 21:46
baf8872

Minor improvements

  • Implemented conda installation.
  • Streamlined instructions.
  • Use GNU parallel --env for more robust function export.

Minor fixes

  • Updated Zymo mock references (1 indel error).

Version v0.2.x of the pipeline is compatible with the descriptions and data in:
Karst, S. M., Ziels, R. M., Kirkegaard, R. H., & Albertsen, M. (2019). Enabling high-accuracy long-read amplicon sequences using unique molecular identifiers and Nanopore sequencing. bioRxiv, 645903.
https://www.biorxiv.org/content/10.1101/645903v2