seqlevelsStyle -- supporting offline use? #94

Open
vjcitn opened this issue Sep 21, 2023 · 1 comment

Comments

vjcitn commented Sep 21, 2023

When setting seqlevelsStyle to "UCSC", it appears that a network query is always issued, which leads to failure when offline.

Could we use BiocFileCache to hold the relevant information persistently?

Would a PR be considered?

Contributor

hpages commented Sep 21, 2023

See issue #26 for a discussion about this. TL;DR: one concern is that there's a (small) risk that the cached data becomes stale after the online NCBI or UCSC data changes. This is rare, but it does happen. It could be mitigated by having some sort of cache expiration mechanism.
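
To make the expiration idea concrete, here is a minimal base R sketch. It is not GenomeInfoDb or BiocFileCache code: `cached_fetch`, its arguments, and the 30-day default are all hypothetical, and `fetch` stands in for whatever function performs the online NCBI or UCSC query.

```r
# Hypothetical helper (not part of GenomeInfoDb or BiocFileCache).
# Returns the cached value for `key` unless it is older than `max_age_days`,
# in which case `fetch()` is called to refresh it.
# Assumption: `fetch` is a zero-argument function that never returns NULL.
cached_fetch <- function(key, fetch, cache_dir = tempdir(), max_age_days = 30) {
  path <- file.path(cache_dir, paste0(key, ".rds"))

  if (file.exists(path)) {
    age_days <- as.numeric(difftime(Sys.time(), file.mtime(path), units = "days"))
    if (age_days <= max_age_days) {
      return(readRDS(path))  # fresh enough: no network access needed
    }
  }

  # Entry is missing or expired: try the online query.
  value <- tryCatch(fetch(), error = function(e) NULL)

  if (is.null(value)) {
    # Offline or query failed: fall back to a stale copy if one exists.
    if (file.exists(path)) {
      warning("using stale cache entry for '", key, "'")
      return(readRDS(path))
    }
    stop("no cached data for '", key, "' and the online query failed")
  }

  saveRDS(value, path)
  value
}
```

With this shape, an expired entry is only replaced when the refresh succeeds, so going offline degrades to a warning rather than a failure once the data has been fetched at least once.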

But before doing that, an improvement that is on my TODO list is to make seqlevelsStyle(x) <- "UCSC" work offline, and without the need for any caching, when seqinfo(x) only contains assembled molecules (i.e. chromosomes + mitochondrial DNA) and no scaffolds. This would probably cover most use cases. This feature would take advantage of data that is included in the package: https://github.com/Bioconductor/GenomeInfoDb/tree/devel/inst/extdata/assembled_molecules_db/UCSC
Unlike the full sequence info, the sequence info restricted to assembled molecules is small and very stable so it makes sense to include it in the package, at least for the most commonly used genomes.
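
As a rough illustration of the offline path described above, the base R sketch below renames sequence levels from a bundled lookup table and signals when it cannot. The hard-coded human table and the `rename_offline` helper are hypothetical stand-ins; they do not reflect the actual format of the files under `assembled_molecules_db/UCSC` or how the package would read them.

```r
# Illustrative NCBI -> UCSC names for human assembled molecules only.
# Stand-in for data shipped with the package; not the real file format.
ncbi_to_ucsc <- setNames(
  paste0("chr", c(1:22, "X", "Y", "M")),
  c(1:22, "X", "Y", "MT")
)

# Hypothetical helper: rename without any network access when every sequence
# is an assembled molecule; otherwise return NULL so that the caller can
# fall back to the online lookup.
rename_offline <- function(seqlevels, map = ncbi_to_ucsc) {
  if (!all(seqlevels %in% names(map))) {
    return(NULL)
  }
  unname(map[seqlevels])
}

rename_offline(c("1", "X", "MT"))      # "chr1" "chrX" "chrM"
rename_offline(c("1", "GL000207.1"))   # NULL: a scaffold is present
```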
