Reference genomes need a new abstraction #365

Open
ihodes opened this issue Sep 27, 2016 · 3 comments

Comments

@ihodes
Member

ihodes commented Sep 27, 2016

We hard-code things like the Ensembl release version into a reference genome like mm10, where they don't belong; the release isn't tied to a given genome. This probably applies to fields like dbsnp and exome_gtf as well (cc @arahuja).

type t = private {
    name : name;
    ensembl : int;
    species : string;
    metadata : string option;
    fasta : Location.t;
    dbsnp : Location.t option;
    known_indels : Location.t option;
    cosmic : Location.t option;
    exome_gtf : Location.t option;
    cdna : Location.t option;
    whess : Location.t option;
    major_contigs : string list option;
}
@smondet
Member

smondet commented Sep 27, 2016

This is not a bug?

A Reference_genome.t is a fixed collection of reference-genome-related things working together.

If you want b37 with ensembl 59 and a custom COSMIC, you can just create it, and add it to your Biokepi.Machine.t.
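
For illustration only, here is a minimal sketch of that "just create it" approach. It assumes a standalone copy of the record from the issue (without `private`, and with `name` and `Location.t` simplified to `string`); it does not use Biokepi's actual constructors. The values `b37` and `b37_custom`, the release number in the baseline, and all paths are hypothetical placeholders.

```ocaml
(* Standalone stand-in for the record quoted in the issue; NOT Biokepi's API. *)
type location = string (* assumption: Location.t simplified to a path or URL *)

type t = {
  name : string;
  ensembl : int;
  species : string;
  metadata : string option;
  fasta : location;
  dbsnp : location option;
  known_indels : location option;
  cosmic : location option;
  exome_gtf : location option;
  cdna : location option;
  whess : location option;
  major_contigs : string list option;
}

(* Hypothetical baseline genome; every value here is a placeholder. *)
let b37 = {
  name = "b37";
  ensembl = 75;
  species = "homo sapiens";
  metadata = None;
  fasta = "/path/to/b37.fasta";
  dbsnp = None;
  known_indels = None;
  cosmic = None;
  exome_gtf = None;
  cdna = None;
  whess = None;
  major_contigs = None;
}

(* A custom variant: same FASTA, different Ensembl release and COSMIC file. *)
let b37_custom = {
  b37 with
  name = "b37-ensembl59-custom-cosmic";
  ensembl = 59;
  cosmic = Some "/path/to/custom-cosmic.vcf";
}
```

Every combination of release and auxiliary files becomes its own named value here, which is the coupling the issue is questioning.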

@iskandr
Member

iskandr commented Sep 27, 2016

Some of those fields never change (species, major_contigs), some change
between patches of a reference (e.g. fasta), and others change on every
Ensembl release (e.g. exome_gtf, cdna).
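
One way to picture that grouping is to split the record by how often each part changes. This is only a sketch under stated assumptions, not a proposed Biokepi API: the type names `assembly`, `sequence`, `annotation`, and `reference_bundle`, the helper `with_annotation`, and the `location = string` simplification are all hypothetical. Fields the comment does not classify (dbsnp, cosmic, known_indels, whess, metadata) are left out.

```ocaml
(* Hypothetical split of the record by change cadence; names are illustrative. *)
type location = string (* assumption: Location.t simplified to a path or URL *)

(* Never changes for a given reference. *)
type assembly = {
  name : string;
  species : string;
  major_contigs : string list option;
}

(* Changes between patches of a reference. *)
type sequence = { fasta : location }

(* Changes on every Ensembl release. *)
type annotation = {
  ensembl : int;
  exome_gtf : location option;
  cdna : location option;
}

(* The three parts combined into one value that a pipeline could consume. *)
type reference_bundle = {
  assembly : assembly;
  sequence : sequence;
  annotation : annotation;
}

(* Swap the annotation release without touching assembly or sequence. *)
let with_annotation bundle annotation = { bundle with annotation = annotation }

(* Placeholder values; release numbers and paths are made up for the example. *)
let mm10 = {
  assembly = { name = "mm10"; species = "mus musculus"; major_contigs = None };
  sequence = { fasta = "/path/to/mm10.fasta" };
  annotation = { ensembl = 84; exome_gtf = None; cdna = None };
}

let mm10_newer =
  with_annotation mm10 { mm10.annotation with ensembl = 85 }
```

With this shape, moving to a different Ensembl release replaces only the `annotation` part, while the assembly and FASTA stay shared.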

@ihodes
Member Author

ihodes commented Sep 28, 2016

Not a bug, that was a mistaken label, but I do think something needs to change in the code, not just the docs (at the very least the name; calling this collection of things a Reference_genome is misleading).
