Python interpreter download is a bottleneck #6212

Open
konstin opened this issue Aug 19, 2024 · 5 comments
Labels
performance Potential performance improvement

Comments

@konstin
Member

konstin commented Aug 19, 2024

When performing an uncached install, most of the time is spent waiting for Python to download and unpack. This download is a bottleneck because nothing else happens in parallel. We should run the Python download in parallel with package downloads and unpacks, while stalling package builds until the interpreter is available. To do that, we have to add the interpreter info for each interpreter we can download to uv (or make it available online), so we can start evaluating wheel tags without having a physical interpreter.
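
To illustrate the last point, here is a minimal sketch of checking wheel compatibility from static metadata alone. It assumes a hypothetical, heavily truncated table of supported tags per downloadable interpreter; `STATIC_TAGS`, `wheel_tags`, `is_compatible` and the interpreter key are illustrative names, not part of uv. It only shows that the decision can be made from the wheel filename before any interpreter exists on disk.

```python
# Hypothetical, truncated metadata: tags that a downloadable CPython 3.12
# build for x86_64 Linux would accept. A real list is much longer.
STATIC_TAGS = {
    "cpython-3.12-linux-x86_64": [
        ("cp312", "cp312", "manylinux_2_17_x86_64"),
        ("cp312", "abi3", "manylinux_2_17_x86_64"),
        ("cp312", "none", "manylinux_2_17_x86_64"),
        ("py3", "none", "manylinux_2_17_x86_64"),
        ("py3", "none", "any"),
    ],
}


def wheel_tags(filename: str) -> set[tuple[str, str, str]]:
    """Expand the compressed tag set encoded in a wheel filename."""
    stem = filename.removesuffix(".whl")
    # name-version[-build]-python-abi-platform: the last three fields are tags.
    python, abi, platform = stem.split("-")[-3:]
    return {
        (py, a, plat)
        for py in python.split(".")
        for a in abi.split(".")
        for plat in platform.split(".")
    }


def is_compatible(filename: str, interpreter: str) -> bool:
    """True if any tag of the wheel is supported by the (not yet installed) interpreter."""
    supported = set(STATIC_TAGS[interpreter])
    return not wheel_tags(filename).isdisjoint(supported)
```

With a table like this, wheel selection for `cpython-3.12-linux-x86_64` could start while the interpreter archive is still downloading; only steps that actually execute Python would have to wait.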

Reproducer:

pyproject.toml
uv.lock

FROM ubuntu:latest
RUN apt-get update && apt-get install -y --no-install-recommends curl ca-certificates
RUN curl -LsSf https://astral.sh/uv/install.sh > /tmp/uv-installer.sh && sh /tmp/uv-installer.sh && rm /tmp/uv-installer.sh
ENV PATH="/root/.cargo/bin/:$PATH"
ADD pyproject.toml pyproject.toml
ADD uv.lock uv.lock
RUN mkdir -p github_wikidata_bot
RUN touch github_wikidata_bot/__init__.py && echo "cache bump"
RUN uv python install 3.12
RUN uv sync

uv python install 3.12 takes 2.5s on my machine and uv sync takes 1.5s, while running only uv sync (which then has to download Python itself) takes 4s. On a shared server I tried, I get 6.5s for Python and 3.5s for the sync. While both use network, disk and I/O, they are still much faster when running in parallel, just like parallel download and unpack for wheels is much faster than sequential wheel install.
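
The proposed scheduling can be sketched with simulated work. This is not uv's implementation: the function names are made up, `asyncio.sleep` stands in for download and unpack, and the sketch assumes the tasks do not contend for bandwidth or disk, which real downloads partly do.

```python
import asyncio
import time


async def download_python(ready: asyncio.Event, seconds: float) -> None:
    await asyncio.sleep(seconds)  # stand-in for interpreter download + unpack
    ready.set()  # from here on, a physical interpreter exists


async def fetch_wheel(name: str, seconds: float) -> str:
    await asyncio.sleep(seconds)  # stand-in for wheel download + unpack
    return name


async def build_sdist(name: str, ready: asyncio.Event, seconds: float) -> str:
    await ready.wait()  # builds need a real interpreter, so they stall here
    await asyncio.sleep(seconds)  # stand-in for the build itself
    return name


async def install(python_s: float, wheel_s: float, build_s: float) -> float:
    """Run all steps concurrently and return the elapsed wall-clock time."""
    ready = asyncio.Event()
    start = time.perf_counter()
    await asyncio.gather(
        download_python(ready, python_s),
        fetch_wheel("a", wheel_s),
        fetch_wheel("b", wheel_s),
        build_sdist("c", ready, build_s),
    )
    return time.perf_counter() - start
```

Calling `asyncio.run(install(0.2, 0.1, 0.05))` should take roughly as long as the interpreter download plus the stalled build, rather than the sum of all steps, because the wheel fetches overlap with the interpreter download.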

@konstin konstin added the performance Potential performance improvement label Aug 19, 2024
@zanieb
Member

zanieb commented Aug 19, 2024

Oo that sounds cool.

@zanieb
Member

zanieb commented Aug 19, 2024

Though wouldn't people have the uv python install cached in their Docker image and make this optimization irrelevant?

@konstin
Member Author

konstin commented Aug 19, 2024

The Docker setup is more for illustration purposes. I'm thinking about any uncached deployment environment where you also want to install Python, but also about the first-run experience of uv on a new machine.

@konstin
Member Author

konstin commented Aug 19, 2024

For Docker specifically, I'd recommend, in order:

  • Use an image we provide that has Python installed
  • Run uv python install 3.x in a separate cached step, like the apt-get invocations
  • Install Python through your image's mechanism and have it in a cached layer (make sure the build stage and run stage match exactly on their Python)

@samypr100
Contributor

Use an image we provide that has python installed

🤞 #6053
