Automatic pruning of previously downloaded files superseded by updated files #1013

Open
stas00 opened this issue Aug 24, 2022 · 0 comments
Labels
enhancement New feature or request wontfix This will not be worked on


stas00 commented Aug 24, 2022

Is your feature request related to a problem? Please describe.

When a repo is updated, the user ends up with dead files in the cache folder that are no longer used. They just take up disk space and are difficult to identify and purge.

I propose a feature where each download is marked as either auto-updated or manually-downloaded.

manually-downloaded files are those that a user explicitly requested (i.e. a specific revision was requested). Those are special and must not be auto-pruned.

auto-updated files are those that were downloaded automatically. E.g. if a user runs snapshot_download("t5") twice in a row and the files changed between those 2 runs, it should be safe to automatically purge the previous download.
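
To make the proposed rule concrete, here is a minimal sketch of how the pruning decision could look. This is not an existing huggingface_hub API: `CachedRevision`, `pinned`, `is_latest` and `prunable_revisions` are hypothetical names made up for illustration, and the sketch assumes the cache records whether a revision was explicitly requested.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CachedRevision:
    """Hypothetical record of one cached snapshot of a repo."""
    commit_hash: str
    pinned: bool     # True if the user explicitly requested this revision (manually-downloaded)
    is_latest: bool  # True if this is the most recent auto-updated download


def prunable_revisions(revisions: List[CachedRevision]) -> List[CachedRevision]:
    """Return the revisions that would be safe to purge under the proposed rule.

    A revision is prunable only if it was auto-updated (not pinned)
    and has since been superseded by a newer download.
    """
    return [r for r in revisions if not r.pinned and not r.is_latest]


if __name__ == "__main__":
    cache = [
        CachedRevision("aaa111", pinned=False, is_latest=False),  # old auto-updated download
        CachedRevision("bbb222", pinned=True, is_latest=False),   # explicitly requested revision
        CachedRevision("ccc333", pinned=False, is_latest=True),   # current auto-updated download
    ]
    print([r.commit_hash for r in prunable_revisions(cache)])  # ['aaa111']
```

Only the superseded auto-updated snapshot is selected. The explicitly requested revision is kept even though it is older than the current one.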

Additional context

This issue came out of the following discussion on Slack, where many more details were presented and considered:
https://huggingface.slack.com/archives/C023JAKTR2P/p1661305016103339

@Wauplin Wauplin added the enhancement New feature or request label Aug 26, 2022
@Wauplin Wauplin added the wontfix This will not be worked on label Jan 24, 2023