Releases: SACGF/cdot

data_v0.2.27

30 Aug 00:19

Added protein information for Ensembl, so it can now do c_to_p

RefSeq - add RS_2024_08 - GRCh38 and T2T-CHM13v2

v0.2.26

15 Aug 04:56

Bumped version to 0.2.26 to catch up with the data release. The only new client functionality is #81, the 'data_release' helper functions.

All other changes in this release were for data (and contained in data_v0.2.26)

Added

  • #81 - New 'data_release' code, e.g. 'get_latest_combo_file_urls', which looks on GitHub to find the latest data
  • New GFFs: RefSeq RS_2023_10, Ensembl 111, 112
  • #79 - RefSeq MT transcripts
  • #66 - We now store 'Note' field (thanks holtgrewe for suggestion)
  • Added requirements.txt for 'generate_transcript_data' sections
  • Client / JSON data schema version compatibility check
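
To illustrate the client / data schema compatibility check mentioned above, here is a minimal sketch. The rule used (major and minor components must match, patch may differ) and the function names are assumptions made for illustration; they are not taken from the cdot source.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a 'major.minor.patch' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_schema_compatible(client_version: str, data_schema_version: str) -> bool:
    """Hypothetical rule: compatible when major and minor versions match."""
    return parse_version(client_version)[:2] == parse_version(data_schema_version)[:2]


# Client 0.2.26 reading data labelled with schema 0.2.27: same major.minor
print(is_schema_compatible("0.2.26", "0.2.27"))
# A data file from a hypothetical 0.3.x schema would be rejected
print(is_schema_compatible("0.2.26", "0.3.0"))
```

Under this assumed rule a client can keep reading newer data releases as long as only the patch component changes.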

Changed

  • #56 - Fix occasional UTA duplicated exons
  • #57 - Correctly handle retrieving genomic position and dealing with indels in GFF (thanks ltnetcase for reporting)
  • #60 - Fix for missing protein IDs due to Genbank / GenBank (thanks holtgrewe)
  • #64 - Split code/data versions. json.gz are now labelled according to data schema version (thanks holtgrewe)
  • Renamed 'CHM13v2.0' to 'T2T-CHM13v2.0' so it could work with biocommons bioutils
  • #72 - Correctly handle ncRNA_gene genes (thanks holtgrewe for reporting)
  • #73 - HGNC ID was missing for some chrMT genes in Ensembl
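
Since #64 split code and data versions, the tags on this page come in two forms: 'vX.Y.Z' for code and 'data_vX.Y.Z' for data. The sketch below shows one way to tell them apart and pick the newest of each kind from a list of tag names. It works on a plain list and makes no network calls; 'parse_tag' and 'latest_tag' are hypothetical helpers, not part of the cdot 'data_release' module.

```python
import re
from typing import List, Optional, Tuple

# Matches 'v0.2.26' (code) and 'data_v0.2.27' (data)
TAG_PATTERN = re.compile(r"^(data_)?v(\d+)\.(\d+)\.(\d+)$")


def parse_tag(tag: str) -> Optional[Tuple[bool, Tuple[int, int, int]]]:
    """Return (is_data_tag, (major, minor, patch)), or None if unrecognised."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        return None
    is_data = match.group(1) is not None
    version = (int(match.group(2)), int(match.group(3)), int(match.group(4)))
    return is_data, version


def latest_tag(tags: List[str], data: bool) -> Optional[str]:
    """Pick the highest-versioned tag of the requested kind (data or code)."""
    candidates = []
    for tag in tags:
        parsed = parse_tag(tag)
        if parsed is not None and parsed[0] == data:
            candidates.append((parsed[1], tag))
    return max(candidates)[1] if candidates else None


tags = ["data_v0.2.27", "v0.2.26", "data_v0.2.26", "data_v0.2.25", "v0.2.21"]
print(latest_tag(tags, data=True))   # data_v0.2.27
print(latest_tag(tags, data=False))  # v0.2.26
```

Comparing versions as integer tuples avoids the string-ordering trap where '0.2.9' would sort after '0.2.27'.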

data_v0.2.26

03 Aug 23:34

Ensembl 112, RefSeq MT transcripts

data_v0.2.25

13 Mar 00:44

Correctly read gene symbol from non-standard genes

data_v0.2.24

06 Mar 22:58
  • Fixed issue #72 - RNA gene symbols missing for Ensembl

data_v0.2.23

19 Jan 03:54
  • Added Ensembl 111

data_v0.2.22

14 Nov 01:33

From now on, code and data tags/versions are separate; see #64

  • New GFFs: RefSeq RS_2023_10, Ensembl VEP110
  • #56 - Fix occasional UTA duplicated exons
  • #57 - Correctly handle retrieving genomic position and dealing with indels in GFF (thanks ltnetcase for reporting)
  • #60 - Fix for missing protein IDs due to Genbank / GenBank (thanks holtgrewe)
  • #64 - Split code/data versions. json.gz are now labelled according to data schema version (thanks holtgrewe)
  • #66 - We now store 'Note' field (thanks holtgrewe for suggestion)
  • Renamed 'CHM13v2.0' to 'T2T-CHM13v2.0' so it could work with biocommons bioutils

v0.2.21

14 Aug 02:52
  • #45 - FastaSeqFetcher - fix alignment gaps properly
  • #52 - Added transcripts from Ensembl 110 GRCh38 release
  • #53 - UTA to cdot transcript start/end conversion issue

v0.2.20

10 Jul 05:00

Handle biotypes correctly in Ensembl

v0.2.19

06 Jul 07:10

Ensembl GRCh37 Mito transcripts have proper contig name "NC_012920.1" instead of "MT"
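
As a rough illustration of this kind of contig renaming, the sketch below maps the short name "MT" to the accession "NC_012920.1" and leaves any other name untouched. The lookup table and the 'normalize_contig' helper are hypothetical and only show the idea; they do not reproduce how cdot builds its data.

```python
# Hypothetical alias table: short contig name -> accession used in the data
ENSEMBL_GRCH37_CONTIG_ALIASES = {"MT": "NC_012920.1"}


def normalize_contig(name: str) -> str:
    """Return the accession for a known alias, otherwise the name unchanged."""
    return ENSEMBL_GRCH37_CONTIG_ALIASES.get(name, name)


print(normalize_contig("MT"))           # NC_012920.1
print(normalize_contig("NC_012920.1"))  # NC_012920.1
```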