Remove /var/cache/yum/ and merge steps to minimize size of container #120

Open
thvasilo opened this issue Jun 15, 2023 · 0 comments

After inspecting the `sagemaker-spark-processing:3.1-cpu-py37-v1.1` image with the dive tool, I noticed that the cache files left by package installations are not being cleaned up, which increases the image size unnecessarily.

In particular, the cleanup on line 13 has no effect, because the layer created by the yum install on line 6 is already immutable at that point. As a result, around 30-40% of the image size is taken up by caches.

When the cleanup is moved to the end of the same layer definition that installs the packages, it actually takes effect and significantly reduces the size of the image.
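To illustrate the difference, here is a minimal sketch of the pattern. It is not the project's actual Dockerfile: the base image (`amazonlinux:2`) and the package names are placeholders I picked for illustration. The point is only that `yum clean all` and removing `/var/cache/yum` need to run in the same `RUN` instruction as the install.

```dockerfile
# Hypothetical sketch, not the repository's real Dockerfile.
# Base image and package names are assumptions for illustration.
FROM amazonlinux:2

# Before (ineffective): the cleanup runs in its own layer, so the cache
# files are already committed to the earlier install layer and still
# count toward the image size.
# RUN yum install -y python3 tar
# RUN yum clean all && rm -rf /var/cache/yum

# After: install and clean up in a single RUN instruction, so the cache
# is deleted before the layer is committed.
RUN yum install -y python3 tar \
    && yum clean all \
    && rm -rf /var/cache/yum
```

Merging the steps this way also reduces the number of layers, at the cost of invalidating the whole step in the build cache whenever any part of it changes.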

By merging a couple of layers and doing the cleanup inside them, we are able to shrink the image from 4.4GB to 2.5GB, which should lead to faster container spin-up. The changes are available in my fork. If the maintainers agree, I can open a PR with these changes for the Dockerfiles that could benefit.
