Add sandbox dockerfile generator script #196

Merged (23 commits) on Apr 24, 2025

@marwan37 commented Apr 16, 2025

This PR adds:

  1. A python script for generating Dockerfiles for use within the upcoming zenml sandbox environment.
  2. A GH Actions workflow that automatically builds and pushes Docker images to zenmldocker on DockerHub when changes are made to project directories.

Changes:

  • Added python script to generate standardized Dockerfile.codespace files for our projects
  • Moved our python scripts to the scripts directory
  • Used the script to generate a Dockerfile.codespace for oncoclear and omni-reader as examples

How the workflow works:

  • Detects and builds Docker images only for projects with changes
  • Uses a paths-ignore pattern to exclude non-project files and directories
  • For projects without a Dockerfile.codespace:
    • Generates one using the generate_codespace_dockerfile.py script
    • Creates a PR for review instead of committing directly
    • Only builds and pushes images after the PR is merged
  • For projects with an existing Dockerfile.codespace:
    • Automatically builds and pushes to the latest tag on DockerHub
  • Images are automatically tagged with a timestamp to preserve older project images (if any)
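
The change-detection and branching steps above can be sketched in Python. This is only an illustration, not the workflow's actual implementation: the ignore list, the function names `changed_projects` and `plan_action`, and the action labels are all hypothetical, and the sketch assumes each project lives in its own top-level directory.

```python
from pathlib import PurePosixPath

# Hypothetical ignore list; the workflow's real paths-ignore entries are not shown in this PR.
IGNORED_TOP_LEVEL = {".github", "scripts", "assets"}
DOCKERFILE_NAME = "Dockerfile.codespace"


def changed_projects(changed_files):
    """Return the sorted top-level project directories touched by a change set."""
    projects = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        # Files at the repository root (README.md, etc.) belong to no project.
        if len(parts) < 2:
            continue
        # Skip directories that are not projects (assumed list, see above).
        if parts[0] in IGNORED_TOP_LEVEL:
            continue
        projects.add(parts[0])
    return sorted(projects)


def plan_action(project, existing_files):
    """Pick the workflow branch for one changed project."""
    if f"{project}/{DOCKERFILE_NAME}" in existing_files:
        # Dockerfile already present: build and push the image.
        return "build-and-push"
    # No Dockerfile yet: generate one and open a PR instead of committing directly.
    return "generate-dockerfile-and-open-pr"


if __name__ == "__main__":
    changed = [
        "oncoclear/run.py",
        "README.md",
        ".github/workflows/build.yml",
        "new-project/steps/train.py",
    ]
    existing = {"oncoclear/Dockerfile.codespace"}
    for project in changed_projects(changed):
        print(project, "->", plan_action(project, existing))
```

Keeping the decision in a pure function like `plan_action` makes the "PR first, build after merge" rule easy to test without touching Docker or the GitHub API.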

@marwan37 added the enhancement label Apr 16, 2025

strickvl commented Apr 16, 2025

One immediate comment (sorry if there's some context I'm missing here): do we not also want to automate this in the sense that it runs in the CI (i.e. on merging things into main etc) with a GH Action workflow?

@marwan37 (Contributor, Author):

@strickvl I've updated the script to use uv for installing deps. For env variables, the script now sets up a .env file in the generated Dockerfile only if one exists in the project; otherwise, users will need to set API keys via environment variables, as described in each project's README.
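
A minimal sketch of what the conditional .env handling might look like in a generator script. It is not the code from `generate_codespace_dockerfile.py`: the function name `render_dockerfile`, the `/workspace` directory, the presence of a `requirements.txt`, the availability of uv in the base image, and the use of a `COPY` step for `.env` are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def render_dockerfile(project_dir: Path, base_image: str) -> str:
    """Render Dockerfile text for one project (illustrative layout only)."""
    lines = [
        f"FROM {base_image}",
        "WORKDIR /workspace",  # assumed working directory
        "COPY requirements.txt ./requirements.txt",  # assumes the project ships one
        # Assumes uv is already installed in the base image.
        "RUN uv pip install --system -r requirements.txt",
    ]
    # Only reference .env when the project actually has one; otherwise API keys
    # are expected to come from environment variables at runtime.
    if (project_dir / ".env").is_file():
        lines.append("COPY .env ./.env")
    # Copying the project source is omitted here for brevity.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        (project / ".env").write_text("EXAMPLE_API_KEY=changeme\n")
        # Base image name as later settled in this thread.
        print(render_dockerfile(project, "zenmldocker/zenml-sandbox"))
```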

I've also renamed the base docker image to zenmldocker/zenml-projects:base and was thinking we can use a consistent naming convention for projects: zenmldocker/zenml-projects:oncoclear, zenmldocker/zenml-projects:omni-reader, etc. Let me know your thoughts.

@strickvl (Contributor):

That naming convention feels a bit weird. So you're using the tag as the name instead of the actual image name? How would we support multiple versions of a project? Just when we update / upgrade we no longer offer a supported version?

@marwan37 (Contributor, Author):

@strickvl hmm I was thinking we'd have 1 version at most for each project, but yes the naming convention does feel weird. will revert to zenmldocker/zenml-sandbox then, and zenmldocker/projects-oncoclear, etc. for projects.

@strickvl (Contributor):

Practically, we'd probably only support one version. But some people might be playing around with one version, and if we then 'upgrade' that version (as we have done with our projects many times), that person's code or run behaviour suddenly changes. I think that's probably undesirable.

@marwan37 (Contributor, Author):

@strickvl Gotcha, though adding custom tags to the build/push docker images workflow might get tricky. When we need to preserve the original image while pushing a new tag, we'd either need to manually trigger the workflow with a tag input or use a versioning system (like starting with base and incrementing with each change). For now, I've kept it simple with just the "latest" tag -- and we can expand this later if needed?
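
One way to picture the tagging options discussed here: always push `latest`, add a timestamp tag so the previous image stays addressable, and optionally accept an explicit version from a manual trigger. The sketch below is hypothetical; the function name `image_tags`, the `version` parameter, and the `YYYYMMDD-HHMMSS` timestamp format are assumptions, not what the workflow actually uses.

```python
from datetime import datetime, timezone
from typing import Optional


def image_tags(image: str, now: datetime, version: Optional[str] = None) -> list[str]:
    """Return the full tag references to push for one image build."""
    # Normalise to UTC so tags from different runners sort consistently.
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%d-%H%M%S")
    tags = [f"{image}:latest", f"{image}:{stamp}"]
    # Optional explicit version, e.g. supplied as a manual workflow input.
    if version:
        tags.append(f"{image}:{version}")
    return tags


if __name__ == "__main__":
    build_time = datetime.now(timezone.utc)
    for tag in image_tags("zenmldocker/projects-oncoclear", build_time):
        print(tag)
```

Because the timestamp tag is never overwritten, someone pinned to an older image keeps the same behaviour even after `latest` moves.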

@htahir1 (Contributor) left a comment:

I also added to the naming discussion above. Overall looks good, though.

@marwan37 requested review from strickvl and htahir1 April 22, 2025 12:39

marwan37 commented Apr 22, 2025

I've renamed the base image to zenmldocker/zenml-sandbox and projects now follow this naming pattern:

  • zenmldocker/projects-oncoclear
  • zenmldocker/projects-omni-reader
  • and so on...

The generated Dockerfiles will be named Dockerfile.codespace
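
The naming pattern can be expressed as a small helper. This is a sketch only: the function name `project_image` is hypothetical, and lowercasing the directory name is an assumption added because Docker repository names must be lowercase.

```python
DOCKERHUB_ORG = "zenmldocker"
BASE_IMAGE = f"{DOCKERHUB_ORG}/zenml-sandbox"
DOCKERFILE_NAME = "Dockerfile.codespace"


def project_image(project_dir: str) -> str:
    """Map a project directory name to its DockerHub repository name."""
    # Lowercasing is an assumption; Docker repository names must be lowercase.
    name = project_dir.strip("/").lower()
    return f"{DOCKERHUB_ORG}/projects-{name}"


if __name__ == "__main__":
    for project in ("oncoclear", "omni-reader"):
        print(f"{project}/{DOCKERFILE_NAME} -> {project_image(project)}")
```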

@marwan37 requested a review from htahir1 April 24, 2025 13:25
@marwan37 merged commit fcc7996 into main Apr 24, 2025
3 of 4 checks passed
Labels: enhancement, internal

3 participants