Is your feature request related to a problem? Please describe.
When a repo is updated, the user ends up with dead files in the cache folder that are no longer used. They just take up disk space and are difficult to identify and purge.
I propose a feature where each download is marked as either auto-updated or manually-downloaded.
manually-downloaded are those that were explicitly requested by the user (i.e. a specific revision was requested). Those are special and must not be auto-pruned.
auto-updated are those downloads that happened automatically. E.g. if a user runs `snapshot_download("t5")` twice in a row and the files changed between those 2 runs, it should be safe to automatically purge the previous download.
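To make the proposal more concrete, here is a minimal sketch of the marking and pruning rule in plain Python. It is not how `huggingface_hub` works today: the `RepoCache` class, the `record_download` method, and the marker constants are hypothetical names made up for illustration, and the sketch only tracks commit hashes in memory rather than touching files on disk.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical marker values, named after the two categories proposed above.
AUTO = "auto-updated"
MANUAL = "manually-downloaded"


@dataclass
class RepoCache:
    """Hypothetical per-repo bookkeeping: commit hash -> marker."""

    revisions: Dict[str, str] = field(default_factory=dict)

    def record_download(
        self, commit: str, requested_revision: Optional[str] = None
    ) -> List[str]:
        """Record a download and return the commits that became safe to purge.

        A download counts as manual when the user asked for a specific
        revision; otherwise it counts as automatic.
        """
        marker = MANUAL if requested_revision is not None else AUTO

        # Never downgrade a manually requested commit to auto-updated.
        if self.revisions.get(commit) != MANUAL:
            self.revisions[commit] = marker

        # A manual download never triggers pruning.
        if marker == MANUAL:
            return []

        # A newer automatic download supersedes older automatic ones.
        stale = [
            c for c, m in self.revisions.items() if m == AUTO and c != commit
        ]
        for c in stale:
            del self.revisions[c]
        return stale


cache = RepoCache()
cache.record_download("aaa")                             # first automatic download
cache.record_download("bbb", requested_revision="v1.0")  # explicit revision, kept
print(cache.record_download("ccc"))                      # repo changed: ['aaa']
```

In this sketch only older auto-updated commits are ever returned for purging. Anything downloaded with an explicit revision stays until the user removes it.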
Additional context
This issue came out of a discussion on Slack, where many more details were presented and considered:
https://huggingface.slack.com/archives/C023JAKTR2P/p1661305016103339