This repository has been archived by the owner on Apr 6, 2021. It is now read-only.

lineages 2020-05-19

@aineniamh aineniamh released this 23 May 12:36
· 36 commits to master since this release

Release notes

Release of curated lineage information, current as of 2020-05-19. 27,767 sequences curated in total.

Data available

  • Pangolin guide tree and alignment (safe and putative). By default, pangolin includes only safe lineages, meaning those with a recall rate > 95%. With the -p or --include-putative flag, all lineages are included, including putative ones that carry less certainty. We believe this will be a useful feature. Putative lineages are indicated with a p before their designation, e.g. B.1.1.p15 lies with certainty within the lineage B.1.1. Putative lineages fit the criteria required for lineage designation but, potentially due to homoplasies, sequencing errors or the resolution of the global tree (>27,000 tips now), do not have recall values suitable for lineage assignment. The rationale is that if more data continues to support these lineages, the p will be removed and they will become part of the default lineage groups.
  • Lineages metadata with representative sequence information and GISAID ID
  • Singleton SNPs that have been masked out (for CoV-GLUE)
  • Recall rate CSV giving the recall rate for each lineage, along with the total number of sequences in the tree manually assigned to that lineage. Recall rate is expected to correlate with bootstrap support in the original tree.
  • Lineage description notes with text describing parent node bootstrap and rationale behind lineage calling.
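
The putative naming convention and the 95% recall cut-off described above can be sketched in a few lines of Python. This is only an illustration, not pangolin's own code: the helper names (is_putative, parent_lineage, safe_lineages) and the CSV column names (lineage, recall) are hypothetical, and recall is assumed to be stored as a percentage between 0 and 100. Check the actual header of the recall CSV before reusing any of it.

```python
import csv
import io


def is_putative(lineage: str) -> bool:
    """Return True if the last dot-separated part looks like 'p<digits>', e.g. 'p15'."""
    last = lineage.split(".")[-1]
    return len(last) > 1 and last[0] == "p" and last[1:].isdigit()


def parent_lineage(lineage: str) -> str:
    """Drop the last dot-separated part; a name without dots is returned unchanged."""
    return lineage.rsplit(".", 1)[0]


def safe_lineages(csv_text: str, threshold: float = 95.0) -> list:
    """List lineages whose recall is strictly above the threshold.

    Assumes hypothetical columns 'lineage' and 'recall' (recall in percent).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["lineage"] for row in reader if float(row["recall"]) > threshold]


# Made-up example rows, for illustration only (not real release data).
sample = "lineage,recall\nB.1.1,98.5\nB.1.1.p15,80.0\nA.2,95.0\n"

print(is_putative("B.1.1.p15"))     # True
print(parent_lineage("B.1.1.p15"))  # B.1.1
print(safe_lineages(sample))        # ['B.1.1']
```

The comparison is strict because the notes describe safe lineages as having a recall rate > 95%, so a lineage at exactly 95% is not counted as safe in this sketch.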