Build foundation, base, minimal image variants for different Ubuntu/Python versions #2139

Open
mj0nez opened this issue Aug 30, 2024 · 7 comments
Labels
status:Need Info We believe we need more information about an issue from the reporting user to help, debug, fix type:Enhancement A proposed enhancement to the docker images

Comments

@mj0nez

mj0nez commented Aug 30, 2024

What docker image(s) is this feature applicable to?

base-notebook, docker-stacks-foundation, minimal-notebook

What change(s) are you proposing?

Hi,

I would like to propose the addition of new variants for the images docker-stacks-foundation, base-notebook and minimal-notebook.

How does this affect the user?

All images are currently pinned to one Python version (3.11 with an open PR #2072 to upgrade to 3.12).
The problem is that this forces downstream consumers to follow the project’s Python version rather than choosing a version on their own. Although JupyterLab supports all current Python versions, the project’s Python version is restricted by the versions supported by all dependencies across the whole stack (scipy, pytorch, datascience, pyspark, allspark).

Providing more variants for these “base images” would loosen upstream restrictions and allow more teams to use the stack while managing their own dependencies. While this will definitely introduce some changes to the pipelines, increase build time and registry storage, I believe allowing a wider adoption of this stack is worth it.

Kind regards

Anything else?

No response

@mj0nez mj0nez added the type:Enhancement A proposed enhancement to the docker images label Aug 30, 2024
@mathbunnyru
Member

mathbunnyru commented Aug 30, 2024

I have thought about this a lot in the past, and let me tell you my thoughts:

  1. The problem is that this forces downstream consumers to follow the project’s Python version rather than choosing a version on their own.

    I partially agree with this statement - we have a nice tagging system, and never delete old images, so users can choose the older Python versions they need. They will get old packages though.
    With the new Python version it's more difficult though - when do we decide to start building images for it? Right now we have quite a simple strategy: "When a new version is supported by all the libraries we use". What if some images support the new Python, while others don't? What if it works differently for aarch64 and x86_64?

  2. The Python version is one of the most important things that our images have. But so is the Ubuntu version, and maybe Jupyter Notebook and Jupyter Lab major versions are also quite important. I think if we start building for all possible versions and combinations, then we're gonna have many new problems - more builds will fail (dependency management is difficult, and GitHub also fails more often than desired).

  3. We will need to think about how we tag our images, it's not going to be as straightforward.

  4. We don't have the compute power to build many aarch64 images without sacrificing build time. This will be less of a problem when GitHub aarch64 Linux runners are out of beta (right now we use a small number of self-hosted aarch64 runners).

  5. I think there will be much more maintenance burden to keep up with what to build and if something builds at all.
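
To make points 2 to 4 concrete, here is a minimal sketch of how the number of builds and the tag scheme grow once Python and Ubuntu become build axes. The version lists and the `ubuntu-…-python-…` tag format are assumptions made up for illustration; they are not the project's actual build matrix or tagging scheme.

```python
from itertools import product

# Hypothetical axes; the values are illustrative, not the project's real matrix.
IMAGES = ["docker-stacks-foundation", "base-notebook", "minimal-notebook"]
PYTHON_VERSIONS = ["3.11", "3.12"]
UBUNTU_VERSIONS = ["22.04", "24.04"]
ARCHITECTURES = ["x86_64", "aarch64"]


def build_matrix(images, pythons, ubuntus, arches):
    """Return one (image, tag, arch) entry per combination of the axes."""
    return [
        # The tag format below is an assumption, used only to show that
        # every extra axis has to be encoded in the tag somehow.
        (image, f"ubuntu-{ubuntu}-python-{python}", arch)
        for image, python, ubuntu, arch in product(images, pythons, ubuntus, arches)
    ]


# One Python and one Ubuntu version: 3 images x 2 architectures.
single = build_matrix(IMAGES, ["3.12"], ["24.04"], ARCHITECTURES)
# Two Python and two Ubuntu versions: every axis multiplies the total.
full = build_matrix(IMAGES, PYTHON_VERSIONS, UBUNTU_VERSIONS, ARCHITECTURES)

print(len(single), len(full))  # 6 24
```

Even for only three images, two versions on each new axis turn 6 builds into 24, and each of them is a separate job that can fail and need a restart.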

Please note, that I'm not saying this is a bad idea, but I want to underline all the issues we will probably have with this approach.
This change will require lots of effort, not just in making these decisions, but also in rewriting our GitHub workflows and making sure they work not only when everything goes well, but also when some random build fails and we need a restart.

A few more thoughts on how one can use our images in some specific cases:

  1. Most of the time our images are quite good 'as is'. But it's fine and encouraged to build on top of our images, install other packages, and update/downgrade existing ones. Nothing wrong with it, and in my opinion, the adoption of our images is great, if we consider these use cases.
  2. We even have a project to configure your own pipeline for the custom image: https://jupyter-docker-stacks.readthedocs.io/en/latest/contributing/stacks.html
  3. I've seen some people fork this repo and change a few things they don't like and it works quite well - and this is also something I keep in mind when accepting new changes.

Update: we now have an example of how to use docker bake to build a custom set of images easily: https://jupyter-docker-stacks.readthedocs.io/en/latest/using/recipes.html#building-stack-images-with-custom-arguments

Hope this helps.

@mathbunnyru mathbunnyru changed the title build minimal-notebook image variants for different Python versions Build foundation, base, minimal image variants for different Ubuntu/Python versions Sep 7, 2024
@mathbunnyru
Member

mathbunnyru commented Sep 7, 2024

I updated the issue name to make it slightly more general.
I would like to gather some feedback from our users on whether this feature is actually worth investing lots of time in - if this issue gets lots of comments like "I would love to use an image with Ubuntu 22 with Python 3.10", for example, then it would be a good reason to implement this. If not - maybe our images already work in most cases.

@minrk
Member

minrk commented Sep 8, 2024

Adding any axis leads to an explosion of build times and images, so I think it's appropriate to keep this as limited as possible. Making it easier for folks to do their own builds is part of relieving the pressure on that. So I don't think allowing the base distro image to vary is worth that cost. Supporting more than one Python version may be, though, but I'd keep it quite limited (maybe not more than 2, drop one when adding the next, etc.).

So I'd weigh how easy it is for us to build Python version variants against how easy it is for someone who wants a different base stack to build their own, and emphasize this in the docs.
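
The "not more than 2, drop one when adding the next" policy can be sketched as a fixed-size rolling window. This is only an illustration of the idea; the function name `add_python_version` and the default limit are hypothetical and not part of the project's tooling.

```python
from collections import deque


def add_python_version(supported, new_version, max_versions=2):
    """Add a Python version, dropping the oldest one once the limit is hit.

    `supported` is expected to be ordered from oldest to newest.
    """
    # A deque with maxlen discards items from the left when it overflows.
    window = deque(supported, maxlen=max_versions)
    window.append(new_version)
    return list(window)


print(add_python_version(["3.11"], "3.12"))          # ['3.11', '3.12']
print(add_python_version(["3.11", "3.12"], "3.13"))  # ['3.12', '3.13']
```

With such a cap, the number of Python variants stays constant over time, so adding a release never grows the build matrix beyond the agreed size.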

@consideRatio
Collaborator

Continuously building just one recent version of Ubuntu, Python, R, and Julia, and letting users rely on tags of old versions that no longer get rebuilt, captures a lot of what users benefit from, I think.

I see some value in building multiple versions (say Python 3.11 + Python 3.12), as it can allow a user to stay on an older Python version for a while and still get updated pre-installed software. This could provide some breathing room to transition -- but I think users must transition no matter what, so providing more than two versions of Python seems far too much.

Overall I think it isn't worth the maintenance complexity of adding multiple versions of any of Ubuntu, Python, R, or Julia.

@mathbunnyru
Member

I made some documentation improvements in #2144

@manics
Contributor

manics commented Sep 9, 2024

If there's significant demand for other combinations of Python or Ubuntu versions, I think it should be done in a separate repository following the suggestions in #2144

I don't think the added complexity of doing it in this repository is worth it. There's almost no overlap in the images, and therefore no benefit in using the same tags.

@mathbunnyru
Member

Thanks for all the ideas.

I updated the docs and they now better show how one can build a custom set of images: https://jupyter-docker-stacks.readthedocs.io/en/latest/using/custom-images.html

I would like to keep this issue open for a month - if we receive lots of requests from users, then we might have to reconsider.
If not, I will close the issue.

@mathbunnyru mathbunnyru added the status:Need Info We believe we need more information about an issue from the reporting user to help, debug, fix label Sep 10, 2024

5 participants