Skip to content

Consider removing GitHub caching actions #132

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
bnaecker opened this issue Jun 23, 2021 · 3 comments
Closed

Consider removing GitHub caching actions #132

bnaecker opened this issue Jun 23, 2021 · 3 comments

Comments

@bnaecker
Copy link
Collaborator

While integrating #131, I hit a lot of pain around GitHub's action for caching dependencies. In particular, we're using that for detecting if we need to download CockroachDB and ClickHouse, two large, relatively unchanging binaries. This action provides a limited way to invalidate the cache when something changes, by creating a "cache key" -- the key usually has a hash based on some files in the repository, such as a lockfile or source.

This doesn't really apply to our case. We can't hash anything, since the content we care about lives outside the repository. At the same time, we're giving a pretty generic key to the cached files (e.g., Linux-cockroachdb-binary-v2), which means we need to explicitly update this key in order to invalidate the cache and actually download the file again. The download of the ClickHouse tarball total seven seconds. Given the ~20 minute total time for CI to finish, this hardly seems worth the cost of manually updating the key when we want a new version. We should probably just download the binary each time, but I'm open to other suggestions.

@smklein
Copy link
Collaborator

smklein commented Jun 24, 2021

The hashFiles function looks like it operates on an arbitrary path - couldn't we store a file containing the hashes we want, and take the hash of that?

tools/ci_download_clickhouse itself actually contains those MD5s already. It prooooobably would be a good idea to factor the MD5 hashes into a separate file, distinct from that script, but as an example, if we did:

      with:
        key: ${{ runner.os }}-clickhouse-binary-${{ hashFiles('tools/ci_download_clickhouse') }}
        path: "clickhouse"

Wouldn't that fix this issue each time those specified hashes change? (we update the MD5 for Clickhouse --> the script changes --> the cache invalidates).

@bnaecker
Copy link
Collaborator Author

Cool, good idea, thanks.

@bnaecker
Copy link
Collaborator Author

Closed by additional commits on #131

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants