GESIS server reports "no space left on device" #3251

Open
rgaiacs opened this issue Mar 10, 2025 · 6 comments

@rgaiacs
Collaborator

rgaiacs commented Mar 10, 2025

Waiting for build to start...
Picked Git content provider.
Cloning into '/tmp/repo2docker_sxxe4sq'...
HEAD is now at e409473 fixed typo in install.R
Python version unspecified, using current default Python version 3.10. This will change in the future.
Building conda environment for python=3.10
Using RBuildPack builder
#0 building with "default" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 ERROR: failed to prepare  as x2213k2asjynz7yk9gmb8npix: mkdir /var/lib/docker/overlay2/x2213k2asjynz7yk9gmb8npix: no space left on device
------
 > [internal] load build definition from Dockerfile:
------
ERROR: failed to solve: ResourceExhausted: ResourceExhausted: ResourceExhausted: failed to read dockerfile: failed to prepare  as x2213k2asjynz7yk9gmb8npix: mkdir /var/lib/docker/overlay2/x2213k2asjynz7yk9gmb8npix: no space left on device
Error during build: Command '['docker', 'buildx', 'build', '--progress', 'plain', '--push', '--build-arg', 'NB_USER=jovyan', '--build-arg', 'NB_UID=1000', '--tag', 'gesiscss/binder-r2d-g5b5b759-gesiscss-2djupyter4nfdi-5fsurvey-5fresults-727aee:e4094732f9406561080efb7b652fe19711bc03b1', '--platform', 'linux/amd64', '/tmp/tmpkfim6nlm']' returned non-zero exit status 1.
@manics
Member

manics commented Mar 10, 2025

Could it be related to #3249?

@rgaiacs
Collaborator Author

rgaiacs commented Mar 10, 2025

This is caused by a malfunction of the image-cleaner pod. The image-cleaner parameters are:

imageGCThresholdHigh: 1600e9
imageGCThresholdLow: 1000e9
imageGCThresholdType: absolute

but the image-cleaner pod is not removing the old images, even though the reported usage is above the high threshold:

2025-03-10 06:51:35,415 2201.64GB used
2025-03-10 06:51:35,425 No images to delete
2025-03-10 07:22:49,380 2201.64GB used
2025-03-10 07:22:49,389 No images to delete
2025-03-10 07:54:08,132 2201.64GB used
2025-03-10 07:54:08,141 No images to delete
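
To make the mismatch concrete, here is a minimal sketch of high/low watermark arithmetic applied to the numbers above. It assumes that `absolute` means the thresholds are byte counts, that collection starts once usage exceeds the high threshold, and that it aims to bring usage down to the low threshold. The function `bytes_to_free` is hypothetical and is not the image-cleaner's actual code.

```python
def bytes_to_free(used: float, high: float, low: float) -> float:
    """Hypothetical helper: bytes a GC pass would aim to reclaim.

    Assumes watermark semantics: do nothing until usage exceeds `high`,
    then reclaim enough to bring usage down to `low`.
    """
    if used <= high:
        return 0.0
    return used - low


# Values taken from the chart configuration and the log lines in this comment.
HIGH = 1600e9        # imageGCThresholdHigh
LOW = 1000e9         # imageGCThresholdLow
used = 2201.64e9     # "2201.64GB used"

to_free = bytes_to_free(used, HIGH, LOW)
print(f"{to_free / 1e9:.2f} GB to free")
```

Under these assumptions roughly 1.2 TB would be eligible for reclamation, so "No images to delete" suggests that the space is held by something the cleaner does not see as an image.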

@rgaiacs
Collaborator Author

rgaiacs commented Mar 10, 2025

> Could it be related to #3249

Yes, it is probably related. However, the GESIS chart has:

enabled: true

@rgaiacs
Collaborator Author

rgaiacs commented Mar 10, 2025

This was temporarily resolved by cleaning up the mounted volume used to store the builds. A more permanent solution is still needed.

@yuvipanda
Contributor

I think this is a manifestation of #3249 (comment). I would say jupyterhub/binderhub#1940 is the 'permanent' solution here.

In the meantime, perhaps we can swap out the cronjob buildkit cleaner to be a sidecar of the dind pod instead?

@rgaiacs
Collaborator Author

rgaiacs commented Mar 11, 2025

> In the meantime, perhaps we can swap out the cronjob buildkit cleaner to be a sidecar of the dind pod instead?

My understanding of the existing optimisation is that, for the minimal working example

FROM alpine:3.21.3

RUN apk add py3-numpy

COPY . .

we want to keep the 1st layer (alpine) and the 2nd layer (py3-numpy) on disk and clean up the 3rd layer (COPY).

Because I have little experience with sidecars, it is difficult for me to understand how a sidecar would help us while preserving the existing optimisation.
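
The layer-reuse idea in the comment above can be illustrated with a simplified model in which each layer's cache key depends on its parent's key, the instruction text, and, for `COPY`, a digest of the build context. This is only a sketch: `layer_key`, `build_keys`, and the digest strings are hypothetical, and real BuildKit cache keys are computed differently.

```python
import hashlib


def layer_key(parent_key: str, instruction: str, context_digest: str = "") -> str:
    """Hypothetical cache key: hash of parent key, instruction, and context digest."""
    h = hashlib.sha256()
    h.update(parent_key.encode())
    h.update(instruction.encode())
    h.update(context_digest.encode())
    return h.hexdigest()


def build_keys(context_digest: str) -> list[str]:
    """Keys for the three layers of the minimal Dockerfile in this comment."""
    k1 = layer_key("", "FROM alpine:3.21.3")
    k2 = layer_key(k1, "RUN apk add py3-numpy")
    # Only COPY depends on the repository contents.
    k3 = layer_key(k2, "COPY . .", context_digest)
    return [k1, k2, k3]


# Two builds of the same Dockerfile with different repository contents.
commit_a = build_keys("digest-of-commit-a")
commit_b = build_keys("digest-of-commit-b")

shared = [a == b for a, b in zip(commit_a, commit_b)]
print(shared)  # [True, True, False]
```

In this model the first two layers are identical across commits, which is why keeping them on disk saves work, while the third layer changes with every commit and is the natural candidate for cleanup.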
