RefGenie integration #592

Closed
ewels opened this issue Mar 23, 2020 · 8 comments
Labels
command line tools Anything to do with the cli interfaces
@ewels
Member

ewels commented Mar 23, 2020

RefGenie is a very nice command-line tool to manage reference genomes locally: http://refgenie.databio.org/en/latest/

The Python package can be imported by other tools, so we could potentially make an nf-core refgenie subcommand that fetches whatever references the user has installed and builds a nextflow / nf-core config file with these paths. This could be saved to, or linked from, ~/.nextflow/conf (or a custom supplied path) so that the user has these references available for any pipeline run.

Note that we may need to update nf-core pipelines to use the same reference type identifiers as RefGenie to work, or at least provide a translation table. e.g. bismark > bismark_bt2_index
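A translation table of this kind could be as small as a dictionary. The sketch below is hypothetical and is not part of nf-core or RefGenie: only the bismark > bismark_bt2_index pair comes from this comment, the bwa and bowtie2 pairs are taken from the template further down this thread, and the function names and example path are invented for illustration.

```python
# Hypothetical translation table: nf-core parameter name -> RefGenie asset name.
NFCORE_TO_REFGENIE = {
    "bismark": "bismark_bt2_index",
    "bwa": "bwa_index",
    "bowtie2": "bowtie2_index",
}
# Reverse lookup, used when starting from RefGenie asset names.
REFGENIE_TO_NFCORE = {asset: key for key, asset in NFCORE_TO_REFGENIE.items()}


def to_nfcore_key(asset_name):
    """Return the nf-core key for a RefGenie asset, or the name unchanged."""
    return REFGENIE_TO_NFCORE.get(asset_name, asset_name)


def render_genomes_config(genomes):
    """Render {genome: {asset_name: path}} as a Nextflow params block."""
    lines = ["params {", "  genomes {"]
    for genome, assets in sorted(genomes.items()):
        lines.append(f"    '{genome}' {{")
        for asset, path in sorted(assets.items()):
            lines.append(f'      {to_nfcore_key(asset)} = "{path}"')
        lines.append("    }")
    lines += ["  }", "}"]
    return "\n".join(lines)


if __name__ == "__main__":
    # The path below is a made-up placeholder, not a real RefGenie location.
    print(render_genomes_config({"hg38": {"bismark_bt2_index": "/refs/hg38/bismark"}}))
```

Asset names without an entry in the table (for example fasta) pass through unchanged, so the table only needs to list the identifiers that actually differ.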

@ewels ewels added low-priority command line tools Anything to do with the cli interfaces labels Mar 23, 2020
@ewels ewels changed the title Subcommand to build nextflow config files from RefGenie RefGenie integratino Mar 30, 2020
@ewels ewels changed the title RefGenie integratino RefGenie integration Mar 30, 2020
@ewels
Member Author

ewels commented Mar 30, 2020

@nsheff managed to add plugin hooks and build an example @nf-core plugin all in a day! Over here: https://github.com/databio/refgenie_nfcore

We can use this as the model for the RefGenie integration. It should work super nicely because it is harmless in every combination:

  • RefGenie installed, nf-core not installed: Nothing
  • RefGenie not installed, nf-core installed: Nothing
  • RefGenie and nf-core installed: Nextflow genomes config automagically maintained

There will be no action required by users beyond installing two tools that they will be using anyway. 🚀

Phil

@ewels ewels added this to the 1.10 milestone Mar 30, 2020
@ggabernet
Member

Would this really be part of the 1.10 milestone?

@ewels
Member Author

ewels commented Jul 12, 2020

I was hoping to get it there but maybe we shift it back to the next release.

@ewels ewels modified the milestones: 1.10, 1.11 Jul 13, 2020
@stevekm
Contributor

stevekm commented Jul 15, 2020

@ewels was there more progress needed on this?

@nsheff

nsheff commented Jul 15, 2020

anything I can do to help?

@nsheff

nsheff commented Mar 22, 2021

I wanted to announce an update that would allow using refgenie for unarchived cloud assets directly. You first need a refgenie digest for the genome of interest, which you can get at an endpoint like this: http://rg.databio.org/genomes/genome_digest/hg38.

Then, you can use that with the assets/file_path endpoint to return either an http or s3 URL to the file of interest. For example:

you can also use individual seek keys, just like the CLI, to get individual items within an asset:
http://rg.databio.org/assets/file_path/2230c535660fb4774114bfa966a62f823fdb6d21acf138d4/fasta/chrom_sizes?remoteClass=s3
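The two URL shapes described here can be assembled with a couple of small helpers. This is only a sketch based on the URLs quoted in this comment: the function names are hypothetical, it makes no network requests, and it assumes the seek key and the remoteClass query parameter are both optional.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://rg.databio.org"  # server used in the URLs quoted in this comment


def genome_digest_url(genome, base_url=BASE_URL):
    """URL of the endpoint that returns the digest for a genome alias."""
    return f"{base_url}/genomes/genome_digest/{quote(genome)}"


def file_path_url(digest, asset, seek_key=None, remote_class=None, base_url=BASE_URL):
    """URL of the assets/file_path endpoint for an asset or one of its seek keys."""
    parts = [base_url, "assets", "file_path", quote(digest), quote(asset)]
    if seek_key:
        parts.append(quote(seek_key))
    url = "/".join(parts)
    if remote_class:
        # remoteClass selects the kind of URL returned, e.g. "http" or "s3".
        url += "?" + urlencode({"remoteClass": remote_class})
    return url


if __name__ == "__main__":
    print(genome_digest_url("hg38"))
    print(file_path_url(
        "2230c535660fb4774114bfa966a62f823fdb6d21acf138d4",
        "fasta",
        seek_key="chrom_sizes",
        remote_class="s3",
    ))
```

The second call rebuilds the chrom_sizes URL shown above; fetching it and reading the response is left out because the response format is not described in this thread.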

One way to auto-generate a config file is to use the new refgenie populate function.

you would create a template like:

params {
  // reference file paths, to be filled in by refgenie populate
  genomes {
    'GRCh38' {
      fasta       = "refgenie://hg38/fasta"
      bwa         = "refgenie://hg38/bwa_index"
      bowtie2     = "refgenie://hg38/bowtie2_index"
    }
  }
}

then you just run some flavor of refgenie populate on file.tpl and you'd get the above with each refgenie:// placeholder replaced by the then-current URI
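To make the idea concrete, here is a toy stand-in for that substitution step. It is not refgenie's implementation of populate: the function name, the regular expression, and the example paths are all assumptions, and it only handles the plain refgenie://genome/asset form used in the template.

```python
import re

# Matches refgenie://<genome>/<asset>, the only URI form used in the template above.
REFGENIE_URI = re.compile(r"refgenie://([\w.\-]+)/([\w.\-]+)")


def populate(template, resolve):
    """Replace every refgenie://genome/asset URI with resolve(genome, asset)."""
    return REFGENIE_URI.sub(lambda m: resolve(m.group(1), m.group(2)), template)


if __name__ == "__main__":
    template = (
        'fasta = "refgenie://hg38/fasta"\n'
        'bwa   = "refgenie://hg38/bwa_index"\n'
    )
    # Hypothetical local paths; a real resolver would ask refgenie instead.
    paths = {
        ("hg38", "fasta"): "/refs/hg38/fasta",
        ("hg38", "bwa_index"): "/refs/hg38/bwa_index",
    }
    print(populate(template, lambda genome, asset: paths[(genome, asset)]))
```

Keeping the resolver as a separate callable means the same template could be filled with local paths on one machine and with http or s3 URLs on another.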

@ewels ewels modified the milestones: 1.14, 1.15 May 9, 2021
@ewels
Member Author

ewels commented May 18, 2021

I think that we really have 3 separate tasks with RefGenie now, so I will split them out into separate issues and close this one.

@ewels
Member Author

ewels commented May 18, 2021

Moved into #1084 #1085 and #1086

@ewels ewels closed this as completed May 18, 2021