"git lfs pull" step is silent? #2

Open
jni opened this issue Feb 25, 2022 · 9 comments

@jni

jni commented Feb 25, 2022

Hello, and thank you for your incredibly useful action! 😊

As you know from Twitter, we are using your action and still bleeding about 20GB per day, and having trouble tracking down where it's happening. I was trying to understand how often we have cache misses and how much each cache miss costs, but even in a cache miss, it looks like git lfs pull doesn't have any output? (Example here.) Is that expected? I would have expected to see a line like:

Downloading LFS objects: 100% (1/1), 666 KB | 0 B/s

But I notice on my own terminal that lfs does not add a newline/carriage return after that line, and it gets overwritten. Is that what is going wrong with the action? Could it be fixed by appending && echo after the command?

Thank you! 🙏
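
To illustrate the hypothesis above (it is not confirmed git-lfs behaviour), here is a small shell sketch. It uses a hypothetical `fake_progress` function that ends its line with a carriage return, the way the progress line behaves in a terminal, and counts how many newline-terminated lines are produced with and without the appended `echo`. Whether the action's log swallows the line for the same reason is an assumption.

```shell
#!/bin/sh
# Hypothetical stand-in for a tool that ends its progress line with a
# carriage return (\r) instead of a newline. This is NOT git-lfs itself.
fake_progress() {
    printf 'Downloading LFS objects: 100%% (1/1), 666 KB | 0 B/s\r'
}

# Count newline-terminated lines without and with the trailing `echo`.
lines_without=$(fake_progress | wc -l | tr -d ' ')
lines_with=$( { fake_progress && echo; } | wc -l | tr -d ' ')

echo "without echo: $lines_without complete line(s)"
echo "with echo:    $lines_with complete line(s)"
```

If the log viewer only keeps newline-terminated lines, the first form would show nothing, while `git lfs pull && echo` would terminate the line.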

@nschloe
Owner

nschloe commented Feb 25, 2022

Is that expected?

If all LFS objects are already there, that is expected, I think. It seems to work correctly.

@jni
Author

jni commented Feb 25, 2022

If all LFS is already there

But doesn't the cache miss imply that the lfs is not there? I can't find a single build where git lfs pull shows any output, including the first build ever using this action:

https://github.com/napari/napari/runs/5211449270?check_suite_focus=true#step:2:484

Do you have an example build using this action where git lfs pull actually displays output?
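
One way to check whether a pull fetched anything, regardless of what it prints, is to measure the local LFS object store before and after. The sketch below is only an illustration: `lfs_store_kb` is a hypothetical helper name, and the demo runs against a throwaway directory rather than a real repository. The default path `.git/lfs/objects` is where git-lfs keeps downloaded objects in a normal clone.

```shell
#!/bin/sh
# Print the size in KB of a local LFS object store, or 0 if it is absent.
# lfs_store_kb is a hypothetical helper, not part of git-lfs.
lfs_store_kb() {
    dir="${1:-.git/lfs/objects}"
    if [ -d "$dir" ]; then
        du -sk "$dir" | cut -f1
    else
        echo 0
    fi
}

# Demo against a temporary directory so the sketch runs anywhere.
demo=$(mktemp -d)
before=$(lfs_store_kb "$demo/objects")   # directory does not exist yet
mkdir -p "$demo/objects"
printf 'pretend LFS object' > "$demo/objects/blob"
after=$(lfs_store_kb "$demo/objects")
echo "store grew by $((after - before)) KB"
rm -rf "$demo"
```

In a CI step the same helper could be called with no argument immediately before and after `git lfs pull`; a non-zero difference would suggest the pull downloaded objects even though it printed nothing.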

@nschloe
Owner

nschloe commented Feb 25, 2022

Perhaps something is amiss then. If you have a good idea for a fix, I'll be happy to review a PR.

@ModischFabrications

Not sure if related, but I seem to be missing LFS files in the checked out repository. Looking through the logs I couldn't find any listings of downloaded files either: https://github.com/ModischFabrications/CutSolverFrontend/actions/runs/3340768601/jobs/5531240857
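
A quick way to tell whether LFS files are really missing is to check if the checked-out files are still pointer files. Per the Git LFS pointer format, a pointer is a small text file whose first line is `version https://git-lfs.github.com/spec/v1`. The sketch below assumes that format; `is_lfs_pointer` is a hypothetical helper name, and the demo files (including the all-zero `oid`) are made up for illustration.

```shell
#!/bin/sh
# Succeed when the file starts with the Git LFS pointer header, meaning
# the real content was never downloaded. Hypothetical helper name.
is_lfs_pointer() {
    [ -f "$1" ] &&
        head -c 42 "$1" | grep -qxF 'version https://git-lfs.github.com/spec/v1'
}

# Demo with one fake pointer file and one ordinary file.
demo=$(mktemp -d)
printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%064d\nsize 666\n' 0 \
    > "$demo/pointer.png"
printf 'real file content' > "$demo/real.txt"

for f in "$demo"/*; do
    if is_lfs_pointer "$f"; then
        echo "pointer only: ${f##*/}"
    else
        echo "real content: ${f##*/}"
    fi
done
```

Running the same check over the files a workflow expects to be large would show whether the checkout step left pointers behind.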

@Ryp

Ryp commented Dec 13, 2023

I'm noticing the same behavior as @jni. My quotas keep rising even though the git lfs pull line shows no output.
I can see that the caches are created as they should be, and the build proceeds fine as well.
My CI pipeline is reasonably simple, so if somebody would be nice enough to have a look, here's the job in question.

@jni, I can see that you're not using this GitHub action anymore. Do you mind sharing which workaround you ended up using?

@jni
Author

jni commented Dec 14, 2023

We ended up nuking lfs because we concluded it's a scam by GitHub to get us to spend money. 😂

Less flippantly:

  • GitHub penalises lfs bandwidth at absurd rates, while cloning from a "big" repo is free. So we moved our docs build to a separate repo and have had no issues, other than the complexity of the two-repo setup.
  • worse, there's no way to introspect where the bandwidth is going, and you have no control over who spends the bandwidth. If your project has a surge in popularity, suddenly your work can grind to a halt, and there's nothing to do about it but pay up.
  • in Python, pip-installing from a repo, as in pip install git+https://github.com/napari/napari, checks out the lfs files, which means that people installing our project from main suddenly incurred lfs bills, including in other people's CI. This is crazy.
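
For the pip case in the last point, one possible mitigation (not something tried in this thread) is the `GIT_LFS_SKIP_SMUDGE` environment variable, which git-lfs documents as a way to skip downloading LFS content on checkout. The wrapper below is a sketch under that assumption; `without_lfs_smudge` is a hypothetical name, and the demo only prints the variable rather than running a real install.

```shell
#!/bin/sh
# Run any command with LFS smudging disabled, so a clone performed by that
# command checks out pointer files instead of downloading LFS objects.
# without_lfs_smudge is a hypothetical helper name.
without_lfs_smudge() {
    GIT_LFS_SKIP_SMUDGE=1 "$@"
}

# Demo: show that the variable reaches the child process.
without_lfs_smudge sh -c 'echo "GIT_LFS_SKIP_SMUDGE=$GIT_LFS_SKIP_SMUDGE"'

# Intended use (not executed here, needs network access):
#   without_lfs_smudge python -m pip install "git+https://github.com/napari/napari"
```

The trade-off is that the installed tree then contains pointer files in place of the large files, and it only helps users who know to set the variable, so it does not address the lack of control described above.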

In the interim, I've discovered git-annex, which is similar to lfs but allows you to configure where your files are stored, and one of the options is Cloudflare R2 which has no egress fees. So I would recommend git-annex + cloudflare as the large-file solution for anyone starting on the problem today. From napari/napari#6049:

Anyway, as I investigated this I came across git-annex, which is like lfs only non-scammy. One of the features of git-annex is that you can set your remote storage backend among a huge array of options, including Cloudflare R2 (through rclone), which has no egress costs, and 10GB upload per month free. So I think it would definitely meet our storage requirements on the free tier.

So that is what I would recommend — I suspect changing the setup will be easier than figuring out where your bandwidth is going and plugging leaks, which happen all the time and are often beyond your control.

Ryp added a commit to Ryp/reaper that referenced this issue Dec 14, 2023
Github's pricing is too steep and the free tier is surprisingly hard to
work around.

See:
nschloe/action-cached-lfs-checkout#2

Fixes NPT-101
@Ryp

Ryp commented Dec 14, 2023

Thank you very much for the detailed write up!
I ended up ripping LFS support out as well and put everything on a separate repo linked with a submodule.

See Ryp/reaper@6dd5116

Repository owner deleted a comment from araphp1 Feb 5, 2024
@hashimaziz1

Thank you very much for the detailed write up! I ended up ripping LFS support out as well and put everything on a separate repo linked with a submodule.

See Ryp/reaper@6dd5116

How are you finding this approach almost a year on? Is it as simple as cloning/checking out the binaries from the submodule to pull the large files in?

@Ryp

Ryp commented Aug 30, 2024

Works well enough for me. Using a submodule is not as smooth as just using a versioned folder, but you get used to it quickly.
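
For readers wondering what the submodule approach looks like in practice, here is a minimal local sketch. It is not taken from Ryp/reaper: the repository names (`assets`, `main`), the file `model.bin`, and the `demo` identity are all made up, and both repositories live in a temporary directory. The `protocol.file.allow=always` setting is only needed because the demo clones from local paths, which recent Git versions restrict for submodules.

```shell
#!/bin/sh
work=$(mktemp -d)

# Hypothetical "assets" repository that holds the large binaries.
git init -q "$work/assets"
printf 'binary-data' > "$work/assets/model.bin"
git -C "$work/assets" add model.bin
git -C "$work/assets" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "Add asset"

# Main repository links the assets repository as a submodule.
git init -q "$work/main"
git -C "$work/main" -c protocol.file.allow=always \
    submodule --quiet add "$work/assets" assets
git -C "$work/main" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "Link assets as a submodule"

# A fresh clone pulls the binaries in with --recurse-submodules.
git -c protocol.file.allow=always clone -q --recurse-submodules \
    "$work/main" "$work/clone"
ls "$work/clone/assets"
```

For a clone made without `--recurse-submodules`, running `git submodule update --init --recursive` inside it fetches the submodule contents afterwards.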
