RefGenie integration #592
@nsheff managed to add plugin hooks and build an example @nf-core plugin all in a day! It lives at https://github.com/databio/refgenie_nfcore and we can use it to model the RefGenie integration. This will work super nicely because it is so harmless:
There will be no action required by users beyond installing two tools that they will be using anyway. 🚀 Phil
Would this really be part of the 1.10 milestone?
I was hoping to get it there, but maybe we shift it back to the next release.
@ewels, was there more progress needed on this?
Anything I can do to help?
I wanted to announce an update that allows using refgenie for unarchived cloud assets directly. You first need a refgenie digest for the genome of interest, which you can get at an endpoint like this: http://rg.databio.org/genomes/genome_digest/hg38. Then you can use that digest with the assets/file_path endpoint to return either an HTTP or S3 URL to the file of interest. For example:
You can also use individual seek keys, just like in the CLI, to get individual items within an asset:
One way to auto-generate a config file is with a template. You would create a template like:
params {
// illumina iGenomes reference file paths
genomes {
'GRCh38' {
fasta = "refgenie://hg38/fasta"
bwa = "refgenie://hg38/bwa_index"
bowtie2 = "refgenie://hg38/bowtie2_index"
}
}
}
Then you just run some flavor of refgenie populate file.tpl and you'd get the above, using the then-current URIs.
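The template-filling step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how refgenie populate is implemented: the `populate` function, the `ASSET_PATHS` lookup table and the file paths in it are all hypothetical, and a real run would resolve each URI through refgenie rather than a hard-coded dict.

```python
import re

# Matches URIs of the form refgenie://<genome>/<asset>, as used in the template.
REFGENIE_URI = re.compile(r"refgenie://([\w.-]+)/([\w.-]+)")


def populate(template: str, asset_paths: dict) -> str:
    """Replace each refgenie:// URI with the path registered for (genome, asset).

    URIs with no registered path are left untouched so the gap stays visible.
    """
    def resolve(match):
        key = (match.group(1), match.group(2))
        return asset_paths.get(key, match.group(0))

    return REFGENIE_URI.sub(resolve, template)


# Hypothetical local paths (assumption); not taken from any real refgenie setup.
ASSET_PATHS = {
    ("hg38", "fasta"): "/refs/hg38/fasta/default/hg38.fa",
    ("hg38", "bwa_index"): "/refs/hg38/bwa_index/default",
}

TEMPLATE = '''fasta = "refgenie://hg38/fasta"
bwa = "refgenie://hg38/bwa_index"
bowtie2 = "refgenie://hg38/bowtie2_index"'''

print(populate(TEMPLATE, ASSET_PATHS))
```

Leaving unresolved URIs in place, rather than raising an error, is a design choice for this sketch: it makes it obvious which assets still need to be installed.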
I think that we really have 3 separate tasks with RefGenie now, so I will split them out into separate issues and close this one.
RefGenie is a very nice command-line tool to manage reference genomes locally: http://refgenie.databio.org/en/latest/
The Python package can be imported by other tools, so we could potentially make an nf-core refgenie subcommand to fetch whatever references the user has installed and build a nextflow / nf-core config file with these paths. This could be saved to, or linked to from, ~/.nextflow/conf (or a custom supplied path) so that the user has these available for any pipeline runs.
Note that we may need to update nf-core pipelines to use the same reference type identifiers as RefGenie for this to work, or at least provide a translation table, e.g. bismark > bismark_bt2_index
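To make the proposal concrete, here is a rough Python sketch of the config-building step with a translation table. It deliberately does not call the RefGenie Python package: the `build_config` function, the `ASSET_TO_PARAM` table and the input dict of installed assets are hypothetical stand-ins. Only the bismark, bwa and bowtie2 name pairs come from this thread.

```python
# Hypothetical translation table: RefGenie asset name -> nf-core parameter name.
ASSET_TO_PARAM = {
    "bismark_bt2_index": "bismark",
    "bwa_index": "bwa",
    "bowtie2_index": "bowtie2",
    "fasta": "fasta",
}


def build_config(installed: dict) -> str:
    """Render a Nextflow params block from {genome: {asset_name: path}}."""
    lines = ["params {", "  genomes {"]
    for genome in sorted(installed):
        lines.append(f"    '{genome}' {{")
        for asset, path in sorted(installed[genome].items()):
            param = ASSET_TO_PARAM.get(asset)
            if param is None:
                continue  # no nf-core equivalent known, so skip this asset
            lines.append(f'      {param} = "{path}"')
        lines.append("    }")
    lines.append("  }")
    lines.append("}")
    return "\n".join(lines)


# Example input with made-up paths (assumption), standing in for whatever
# the user has installed locally.
print(build_config({"hg38": {"fasta": "/refs/hg38/hg38.fa",
                             "bismark_bt2_index": "/refs/hg38/bismark"}}))
```

Keeping the translation table as plain data means pipelines could adopt RefGenie identifiers gradually, with the table shrinking as names converge.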