
Use snakemake conda envs with GitHub Actions/cache #26

Open
lczech opened this issue Nov 18, 2022 · 2 comments

Comments


lczech commented Nov 18, 2022

Hi there,

I am trying to set up CI for a Snakemake workflow that installs several conda envs for different rules. Is there a way to use GitHub's actions/cache so that the envs do not have to be installed on every run?

Background

The main problem: Snakemake installs each per-rule conda env into a directory whose name is a hash of the env file. This makes it hard to know which path to pass to actions/cache. Is there a way to determine or control that path?
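One possible way around the hashed directory names is to cache the whole parent directory in which Snakemake creates its environments (by default this should be `.snakemake/conda` in the working directory, or whatever is passed to `--conda-prefix`) and to derive the cache key from the contents of the env files. The sketch below only illustrates the key part. The function name `conda_cache_key`, the `*.yaml` pattern, and the key prefix are hypothetical choices for illustration, not anything Snakemake or actions/cache provides.

```python
import hashlib
from pathlib import Path


def conda_cache_key(env_dir, prefix="snakemake-conda"):
    """Build a cache key from the contents of all env files in env_dir.

    Hypothetical helper: the key changes whenever any env file is added,
    removed, renamed, or edited, so a stale cache is never restored.
    """
    digest = hashlib.sha256()
    # Sort so the key does not depend on directory listing order.
    for env_file in sorted(Path(env_dir).glob("*.yaml")):
        digest.update(env_file.name.encode())
        digest.update(env_file.read_bytes())
    return f"{prefix}-{digest.hexdigest()[:16]}"
```

Inside a GitHub Actions workflow the built-in `hashFiles` expression can serve the same purpose, so a separate script is probably only needed when the key has to be computed outside the workflow file.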

Secondary question

As a secondary question: if this were solved, how would it work across jobs? I have a matrix of jobs in my GitHub Actions workflow (testing different aspects of my Snakemake workflow), several of which use the same rules and conda envs. Ideally, the conda envs would be reused across them.

However, I could not find any information on how race conditions are resolved when multiple jobs use the same cache key. Imagine GitHub Actions jobs A and B run in parallel. Job A starts with an empty actions/cache and hence creates a conda env to be cached. If job B starts before job A is done, it will not find the cache yet. This is because:

On a cache miss, the action automatically creates a new cache if the job completes successfully. (source)

Will job B then also start creating the conda env and filling the cache? Or will it wait for job A to finish?
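The scenario can be written down as a toy model. It rests on two assumptions that are not confirmed anywhere in this thread: a cache entry cannot be overwritten once it has been saved under a key, and a job never waits for another job's save. The class `WriteOnceCache` and the function `simulate_parallel_jobs` are made up for this sketch and do not correspond to any GitHub API.

```python
class WriteOnceCache:
    """Toy cache in which the first save for a key wins (assumption)."""

    def __init__(self):
        self._entries = {}

    def restore(self, key):
        # Returns None on a cache miss.
        return self._entries.get(key)

    def save(self, key, value):
        # Refuse to overwrite an existing entry; report whether we saved.
        if key in self._entries:
            return False
        self._entries[key] = value
        return True


def simulate_parallel_jobs(key="snakemake-conda-abc"):
    cache = WriteOnceCache()
    # Jobs A and B both look up the key before either has finished.
    restored_a = cache.restore(key)
    restored_b = cache.restore(key)
    # Both miss, so both build the env themselves.
    env_a = restored_a or "env built by job A"
    env_b = restored_b or "env built by job B"
    # Job A finishes first and saves; job B's later save is rejected.
    saved_a = cache.save(key, env_a)
    saved_b = cache.save(key, env_b)
    # A job C that starts afterwards gets a cache hit.
    restored_c = cache.restore(key)
    return restored_a, restored_b, saved_a, saved_b, restored_c
```

Under these assumptions job B does not wait: it builds the env a second time, its save is discarded, and only jobs that start after job A has finished benefit from the cache.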

Cheers and thanks in advance
Lucas

lczech changed the title from "Combine with actions/cache" to "Use snakemake conda envs with actions/cache" on Nov 18, 2022

lparsons commented May 5, 2023

I'm running into the same issue (long waits for repeated conda installations). Does anyone have any insight into getting this set up?

lczech changed the title from "Use snakemake conda envs with actions/cache" to "Use snakemake conda envs with GitHub Actions/cache" on May 6, 2023

lczech commented May 6, 2023

Hi @lparsons,

I have not found a solution yet. But using mamba instead of conda definitely helps to alleviate the problem. In my case, for some GitHub CI jobs, the runtime drops from 6 h (which is already the time limit for GitHub Actions) to 15 min.

Lucas
