I am trying to set up CI for a Snakemake workflow that needs to install several conda envs for different rules. Is there a way to use the GitHub actions/cache to avoid reinstalling the envs on every run?
Background
The main problem: Snakemake installs each per-rule conda env under a directory named by a hash of the env file. This makes it hard to know which path to give to actions/cache. Is there a way around this?
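One workaround I can imagine (an untested sketch; the directory name and cache key are my assumptions, not an established recipe): pin the env location with Snakemake's `--conda-prefix` flag so the path is known ahead of time, and key the cache on a hash of all env definition files so it invalidates whenever an env changes:

```yaml
# Sketch only: paths, key, and env-file layout are assumptions.
- name: Cache Snakemake conda envs
  uses: actions/cache@v3
  with:
    # Fixed location, passed to Snakemake via --conda-prefix below
    path: .snakemake-conda-envs
    # Key changes whenever any per-rule env definition changes
    key: snakemake-conda-${{ runner.os }}-${{ hashFiles('workflow/envs/*.yaml') }}

- name: Run workflow
  run: snakemake --cores 2 --use-conda --conda-prefix .snakemake-conda-envs
```

With a fixed `--conda-prefix`, the hashed env directories all live under one known path, which sidesteps the need to predict the hashes themselves.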
Secondary question
If this were solved, how would it work across jobs? I have a matrix of jobs in my GitHub Actions workflow (testing different aspects of my Snakemake workflow), several of which use the same rules and conda envs. Ideally, the conda envs would be re-used across them.
However, I could not find any information on how race conditions are handled when multiple jobs use the same cache key. Imagine GitHub Actions jobs A and B running in parallel: job A starts with an empty cache and hence creates a conda env to be cached. If job B starts before job A is done, it will not find the cache either, because:

> On a cache miss, the action automatically creates a new cache if the job completes successfully. (source)
Will job B then also start creating the conda env and filling the cache? Or will it wait for job A to finish?
Cheers and thanks in advance
Lucas
lczech changed the title from "Combine with actions/cache" to "Use snakemake conda envs with actions/cache" on Nov 18, 2022.
I have not found a solution yet, but using mamba instead of conda definitely helps to alleviate the problem. In my case, some GitHub CI jobs went down from 6 h (which is already the time limit for GitHub Actions) to 15 min.
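For reference, a minimal sketch of how that switch might look in a CI step, assuming a Snakemake version that supports the `--conda-frontend` flag and that mamba is already on the PATH (e.g. installed by an earlier miniforge setup step):

```yaml
# Sketch: assumes mamba was installed in a previous workflow step.
- name: Run workflow with mamba as the conda frontend
  run: snakemake --cores 2 --use-conda --conda-frontend mamba
```

This only speeds up env creation; it does not cache the envs between runs.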