Add sandbox dockerfile generator script #196

Merged: 23 commits, Apr 24, 2025
23 commits
25371db
add and move generate_sandbox_dockerfile script to scripts dir
marwan37 Apr 16, 2025
ebae3c6
move generate_zenml_project.py to scripts dir
marwan37 Apr 16, 2025
8104d8e
update function to set project name in Dockerfile to base name, and n…
marwan37 Apr 16, 2025
99ba784
generate Dockerfile for oncoclear as an example
marwan37 Apr 16, 2025
398c4e4
update docker parent image to zenmldocker/zenml-sandbox
marwan37 Apr 16, 2025
bfd8c23
use complete env variable key in template
marwan37 Apr 16, 2025
4938493
update dockerfile generator script to use uv, handle pyproject.toml, …
marwan37 Apr 18, 2025
b1ebc93
add tomli to pyproject.toml to parse toml files
marwan37 Apr 18, 2025
434f7cb
generate updated Dockerfile.sandbox for omnireader
marwan37 Apr 18, 2025
6d568df
update script to not generate a .env file if it didnt exist, and dont…
marwan37 Apr 18, 2025
ec8f24e
use uv binary from distroless Docker image instead of installing uv v…
marwan37 Apr 20, 2025
8ff3b48
generate updated Dockerfile.sandbox files
marwan37 Apr 20, 2025
ce252c3
change base image name to zenmldocker/zenml-projects:base
marwan37 Apr 20, 2025
aa062eb
Merge branch 'main' into add-sandbox-dockerfile-generator-script
strickvl Apr 20, 2025
b932f3e
add workflow file
marwan37 Apr 20, 2025
f22584c
rename sandbox to codespace
marwan37 Apr 21, 2025
22ebb93
delete generate_zenml_project.py
marwan37 Apr 21, 2025
ed92bdb
revert base image name to zenmldocker/zenml-sandbox
marwan37 Apr 22, 2025
5020fa9
bump python version in pyproject.toml and replace tomli with tomllib
marwan37 Apr 22, 2025
fd25eb0
Use UTC timestamp as Docker image tag in GH action workflow
marwan37 Apr 24, 2025
7f9911e
split workflow into encapsulated jobs to avoid redundant dockerfile_e…
marwan37 Apr 24, 2025
480bffd
update run command to use updated script name and path
marwan37 Apr 24, 2025
761af1d
rename filename: sandbox -> codespace
marwan37 Apr 24, 2025
171 changes: 171 additions & 0 deletions .github/workflows/build-push-codespace.yml
@@ -0,0 +1,171 @@
name: Build and Push Project Codespace Images

on:
  push:
    branches:
      - main
    paths-ignore:
      - "_assets/**"
      - ".github/**"
      - ".gitignore"
      - ".gitmodules"
      - ".typos.toml"
      - "CODE-OF-CONDUCT.md"
      - "CONTRIBUTING.md"
      - "scripts/**"
      - "LICENSE"
      - "pyproject.toml"
      - "README.md"

  workflow_dispatch:
    inputs:
      project:
        description: "Project to build (leave empty to detect from changed files)"
        required: false
        default: ""

jobs:
  detect-changes:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.set-matrix.outputs.matrix }}
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
        with:
          fetch-depth: 2

      - name: Detect changed projects
        id: set-matrix
        run: |
          # If this was a manual dispatch _and_ they provided a project, just use that
          if [[ "${{ github.event_name }}" == "workflow_dispatch" && -n "${{ github.event.inputs.project }}" ]]; then
            PROJECTS="[\"${{ github.event.inputs.project }}\"]"
          else
            # Otherwise auto-diff HEAD^ → HEAD for any changed top-level dirs
            CHANGED_FILES=$(git diff --name-only HEAD^ HEAD)
            CHANGED_DIRS=$(echo "$CHANGED_FILES" \
              | awk -F/ '{print $1}' \
              | sort -u \
              | grep -v '^$')
            ALL_PROJECT_DIRS=$(find . -maxdepth 1 -type d \
              -not -path '*/\.*' \
              -not -path '.' \
              | sed 's|^\./||' \
              | grep -v '^_')
            PROJECTS="["
            sep=""
            for d in $CHANGED_DIRS; do
              if echo "$ALL_PROJECT_DIRS" | grep -qx "$d"; then
                PROJECTS+="${sep}\"$d\""
                sep=","
              fi
            done
            PROJECTS+="]"
          fi

          echo "matrix=$PROJECTS" >> $GITHUB_OUTPUT
          echo "Projects to build: $PROJECTS"

  check-dockerfile:
    needs: detect-changes
    runs-on: ubuntu-latest
    strategy:
      matrix:
        project: ${{ fromJson(needs.detect-changes.outputs.matrix) }}
    outputs:
      dockerfile_exists: ${{ steps.check-dockerfile.outputs.dockerfile_exists }}
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
        with:
          fetch-depth: 0

      - name: Check for Dockerfile.codespace
        id: check-dockerfile
        run: |
          if [ -f "${{ matrix.project }}/Dockerfile.codespace" ]; then
            echo "dockerfile_exists=true" >> $GITHUB_OUTPUT
          else
            echo "dockerfile_exists=false" >> $GITHUB_OUTPUT
          fi

  generate-dockerfile:
    needs: [detect-changes, check-dockerfile]
    if: needs.check-dockerfile.outputs.dockerfile_exists == 'false'
    runs-on: ubuntu-latest
    strategy:
      matrix:
        project: ${{ fromJson(needs.detect-changes.outputs.matrix) }}
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
        with:
          fetch-depth: 0

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.10"

      - name: Generate Dockerfile.codespace
        id: generate-dockerfile
        run: |
          python scripts/generate_codespace_dockerfile.py "${{ matrix.project }}"
          echo "Generated Dockerfile.codespace for ${{ matrix.project }}"

      - name: Create Pull Request for new Dockerfile
        uses: peter-evans/create-pull-request@v5
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          commit-message: "Auto-generate Dockerfile.codespace for ${{ matrix.project }}"
          title: "Auto-generate Dockerfile.codespace for ${{ matrix.project }}"
          body: |
            This PR adds a generated Dockerfile.codespace for the ${{ matrix.project }} project.

            Please review the changes and merge if they look good.

            Once merged, the Docker image will be built and pushed automatically.
          branch: "auto-dockerfile-${{ matrix.project }}"
          base: main
          labels: |
            automated-pr
            dockerfile
            codespace

  build-and-push:
    needs: [detect-changes, check-dockerfile]
    if: needs.check-dockerfile.outputs.dockerfile_exists == 'true'
    runs-on: ubuntu-latest
    strategy:
      matrix:
        project: ${{ fromJson(needs.detect-changes.outputs.matrix) }}
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
        with:
          fetch-depth: 0

      # Generate timestamp for image tag
      - name: Generate timestamp
        id: timestamp
        run: echo "timestamp=$(date -u +'%Y%m%d%H%M%S')" >> $GITHUB_OUTPUT

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      - name: Login to DockerHub
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_PASSWORD }}

      - name: Build and push
        uses: docker/build-push-action@v4
        with:
          context: .
          file: ${{ matrix.project }}/Dockerfile.codespace
          push: true
          tags: zenmldocker/projects-${{ matrix.project }}:${{ steps.timestamp.outputs.timestamp }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
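
Review note: the routing logic in this workflow (which projects changed, which of them already have a `Dockerfile.codespace`, and what tag the image gets) may be easier to reason about as plain functions. The following is a minimal Python sketch of that logic, not code from this PR. The function names `changed_projects`, `split_by_dockerfile`, and `image_tag` are hypothetical; the filtering rules, the `Dockerfile.codespace` filename, the `zenmldocker/projects-` prefix, and the `%Y%m%d%H%M%S` format are taken from the workflow above.

```python
import json
from datetime import datetime, timezone
from pathlib import Path, PurePosixPath
from typing import Iterable, List, Optional, Tuple


def is_project_dir(name: str) -> bool:
    """Mirror the workflow's filter: skip hidden and underscore-prefixed dirs."""
    return bool(name) and not name.startswith((".", "_"))


def changed_projects(changed_files: Iterable[str], top_level_dirs: Iterable[str]) -> List[str]:
    """Return the sorted project directories touched by the changed files."""
    projects = {d for d in top_level_dirs if is_project_dir(d)}
    # First path component of each changed file, like `awk -F/ '{print $1}'`.
    touched = {PurePosixPath(p).parts[0] for p in changed_files if p.strip()}
    return sorted(touched & projects)


def split_by_dockerfile(projects: Iterable[str], root: Path) -> Tuple[List[str], List[str]]:
    """Partition projects into (ready to build, needs a generated Dockerfile)."""
    to_build: List[str] = []
    to_generate: List[str] = []
    for project in projects:
        if (root / project / "Dockerfile.codespace").is_file():
            to_build.append(project)
        else:
            to_generate.append(project)
    return to_build, to_generate


def image_tag(project: str, now: Optional[datetime] = None) -> str:
    """Build the image reference with a UTC timestamp tag, as `date -u` does."""
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return f"zenmldocker/projects-{project}:{moment.strftime('%Y%m%d%H%M%S')}"


if __name__ == "__main__":
    files = ["omni-reader/run.py", "README.md", "_assets/logo.png"]
    dirs = ["omni-reader", "oncoclear", "_assets", ".github"]
    # json.dumps produces the JSON array that `fromJson` expects for the matrix.
    print(json.dumps(changed_projects(files, dirs)))
```

Like the shell version, this treats any non-hidden, non-underscore top-level directory as a project, so `scripts` would count if it changed in the same push as a real project. The per-project partition is worth comparing with the `check-dockerfile` job: as far as I know, a job-level output set by a matrix job holds a single value rather than one per matrix entry, so it may be worth checking the GitHub Actions documentation to confirm that `dockerfile_exists` behaves as intended when several projects change at once.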
138 changes: 0 additions & 138 deletions generate_zenml_project.py

This file was deleted.

49 changes: 49 additions & 0 deletions omni-reader/Dockerfile.codespace
@@ -0,0 +1,49 @@
# Sandbox base image
FROM zenmldocker/zenml-sandbox:latest

# Install uv from official distroless image
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

# Set uv environment variables for optimization
ENV UV_SYSTEM_PYTHON=1
ENV UV_COMPILE_BYTECODE=1

# Project metadata
LABEL project_name="omni-reader"
LABEL project_version="0.1.0"

# Install dependencies with uv and cache optimization
RUN --mount=type=cache,target=/root/.cache/uv \
    uv pip install --system \
    "instructor" \
    "jiwer" \
    "jiter" \
    "importlib-metadata<7.0,>=1.4.0" \
    "litellm" \
    "mistralai==1.0.3" \
    "numpy<2.0,>=1.9.0" \
    "openai==1.69.0" \
    "Pillow==11.1.0" \
    "polars-lts-cpu==1.26.0" \
    "pyarrow>=7.0.0" \
    "python-dotenv" \
    "streamlit==1.44.0" \
    "pydantic>=2.8.2,<2.9.0" \
    "tqdm==4.66.4" \
    "zenml>=0.80.0"

# Set workspace directory
WORKDIR /workspace

# Clone only the project directory and reorganize
RUN git clone --depth 1 https://github.com/zenml-io/zenml-projects.git /tmp/zenml-projects && \
    cp -r /tmp/zenml-projects/omni-reader/* /workspace/ && \
    rm -rf /tmp/zenml-projects

# VSCode settings
RUN mkdir -p /workspace/.vscode && \
    printf '{\n "workbench.colorTheme": "Default Dark Modern"\n}' > /workspace/.vscode/settings.json

# Copy .env.example
COPY .env.example /workspace/.env
ENV POLARS_SKIP_CPU_CHECK=1
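
Review note: the dependency block in this Dockerfile is generated, so it may help to see how such a block could be rendered from a list of requirement strings. The sketch below is an illustration only and is not taken from `scripts/generate_codespace_dockerfile.py`; the function name `render_uv_install` and the four-space indentation are assumptions, while the `RUN --mount=type=cache` and `uv pip install --system` lines are copied from the Dockerfile above.

```python
from typing import Iterable


def render_uv_install(requirements: Iterable[str]) -> str:
    """Render a Dockerfile RUN instruction that installs requirements with uv.

    Hypothetical helper: each requirement is double-quoted on its own
    continuation line, and the last line carries no trailing backslash.
    """
    cleaned = [req.strip() for req in requirements if req.strip()]
    if not cleaned:
        # Nothing to install, so emit no RUN instruction at all.
        return ""
    lines = [
        "RUN --mount=type=cache,target=/root/.cache/uv \\",
        "    uv pip install --system \\",
    ]
    lines += [f'    "{req}" \\' for req in cleaned[:-1]]
    lines.append(f'    "{cleaned[-1]}"')
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_uv_install(["instructor", "numpy<2.0,>=1.9.0", "zenml>=0.80.0"]))
```

Quoting each requirement keeps specifiers such as `<` and `>` from being interpreted by the shell. This sketch does not escape requirements that themselves contain double quotes (for example, environment markers), so a real generator would need to handle that case.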