Add Containerized Benchmarking Support for GuideLLM #123
base: main
Changes from all commits
29aaf7b
9ca1e8f
2fd8c0c
5abf27a
661e8e4
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,32 @@ | ||||||||||
FROM python:3.12-slim | ||||||||||
Also, what are our plans for supporting the range of Python versions? Is there anything we should do here to create specific Dockerfiles for each supported Python version?
Normally I would recommend handling this either through build args or postfixes on the Containerfile name. E.g.
Suggested change
Or name this file something like |
||||||||||
|
||||||||||
LABEL org.opencontainers.image.source="https://github.com/neuralmagic/guidellm" | ||||||||||
LABEL org.opencontainers.image.description="GuideLLM Benchmark Container" | ||||||||||
NIT: can we remove "Benchmark" from this so we keep it flexible towards future goals such as evals?
||||||||||
|
||||||||||
# Install dependencies and set up environment in a single layer | ||||||||||
RUN apt-get update && apt-get install -y \ | ||||||||||
git \ | ||||||||||
curl \ | ||||||||||
&& pip install git+https://github.com/neuralmagic/guidellm.git \ | ||||||||||
I'm concerned that this is too restrictive: it always builds from the latest main, so neither the build system (for a specific release) nor a user can choose what gets built. Can we expand this with a build-arg for the package version to build? It can default to the latest main, but otherwise be used to specify the release tag or branch to build from.
||||||||||
&& useradd -m -u 1000 guidellm \ | ||||||||||
&& apt-get clean \ | ||||||||||
&& rm -rf /var/lib/apt/lists/* | ||||||||||
|
||||||||||
# Set working directory | ||||||||||
WORKDIR /app | ||||||||||
|
||||||||||
# Copy and set up the benchmark script | ||||||||||
COPY build/run_benchmark.sh /app/ | ||||||||||
|
||||||||||
# Set ownership to non-root user | ||||||||||
RUN chown -R guidellm:guidellm /app | ||||||||||
|
||||||||||
# Switch to non-root user | ||||||||||
USER guidellm | ||||||||||
|
||||||||||
# Healthcheck | ||||||||||
HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \ | ||||||||||
CMD curl -f http://localhost:8000/health || exit 1 | ||||||||||
|
||||||||||
# Set the entrypoint | ||||||||||
ENTRYPOINT ["/app/run_benchmark.sh"] | ||||||||||
Can we change this so that the entrypoint is "guidellm" or "guidellm benchmark"? We can then set CMD afterwards to include any specific args we need passed in or surfaced from the settings, or set it to --help for users to get started with. This way we enable passthrough, avoid duplicating the arguments we need to pass through, and have one less script to maintain and add functionality to.
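The build-arg request above comes down to how the pip requirement string is formed inside the `RUN` step. The sketch below shows that piece in isolation, as shell. It assumes a hypothetical `GUIDELLM_REF` variable (which a Dockerfile `ARG` of the same name could feed) and relies on pip's `git+<url>@<ref>` syntax for selecting a branch or tag. It only prints the install command and does not run it.

```shell
#!/usr/bin/env bash
# Sketch only: GUIDELLM_REF is a hypothetical name, not something defined in this PR.

# Print the pip requirement for a given git ref (branch or tag); defaults to main.
guidellm_pip_spec() {
  local ref="${1:-main}"
  printf 'git+https://github.com/neuralmagic/guidellm.git@%s\n' "${ref}"
}

# Show the install command for the requested ref rather than executing it.
echo "pip install $(guidellm_pip_spec "${GUIDELLM_REF:-main}")"
```

With `GUIDELLM_REF` unset this resolves to main, which matches the current behavior. Setting it to a release tag or branch name would pin the build to that ref.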
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
#!/usr/bin/env bash | ||
See the note on the entrypoint for the Dockerfile. I'd like to see if we can remove this script fully or make it optional to package in.
||
set -euo pipefail | ||
|
||
# Required environment variables | ||
TARGET=${TARGET:-"http://localhost:8000"} | ||
MODEL=${MODEL:-"neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16"} | ||
RATE_TYPE=${RATE_TYPE:-"sweep"} | ||
DATA=${DATA:-"prompt_tokens=256,output_tokens=128"} | ||
MAX_REQUESTS=${MAX_REQUESTS:-"100"} | ||
MAX_SECONDS=${MAX_SECONDS:-""} | ||
|
||
# Output configuration | ||
OUTPUT_PATH=${OUTPUT_PATH:-"/results/guidellm_benchmark_results"} | ||
OUTPUT_FORMAT=${OUTPUT_FORMAT:-"json"} # Can be json, yaml, or yml | ||
|
||
# Build the command | ||
CMD="guidellm benchmark --target \"${TARGET}\" --model \"${MODEL}\" --rate-type \"${RATE_TYPE}\" --data \"${DATA}\"" | ||
|
||
# Add optional parameters | ||
if [ ! -z "${MAX_REQUESTS}" ]; then | ||
CMD="${CMD} --max-requests ${MAX_REQUESTS}" | ||
fi | ||
|
||
if [ ! -z "${MAX_SECONDS}" ]; then | ||
CMD="${CMD} --max-seconds ${MAX_SECONDS}" | ||
fi | ||
|
||
# Add output path with appropriate extension | ||
if [ ! -z "${OUTPUT_PATH}" ]; then | ||
CMD="${CMD} --output-path \"${OUTPUT_PATH}.${OUTPUT_FORMAT}\"" | ||
fi | ||
|
||
# Execute the command | ||
echo "Running command: ${CMD}" | ||
eval "${CMD}" |
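If the script is kept, one thing to consider: it assembles the command as a string and runs it with `eval`, so any value containing quotes or shell metacharacters is re-parsed by the shell. A minimal sketch of an alternative collects the arguments in a bash array instead. The flag names and defaults are copied from the script above. The function name `build_guidellm_args` and the array name `GUIDELLM_ARGS` are hypothetical. This version prints the command rather than executing it.

```shell
#!/usr/bin/env bash
# Sketch only: build_guidellm_args and GUIDELLM_ARGS are illustrative names.

# Populate the global array GUIDELLM_ARGS from environment variables,
# using the same defaults as run_benchmark.sh.
build_guidellm_args() {
  GUIDELLM_ARGS=(
    benchmark
    --target "${TARGET:-http://localhost:8000}"
    --model "${MODEL:-neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16}"
    --rate-type "${RATE_TYPE:-sweep}"
    --data "${DATA:-prompt_tokens=256,output_tokens=128}"
    --max-requests "${MAX_REQUESTS:-100}"
  )

  # --max-seconds is only passed when a value is provided.
  local max_seconds="${MAX_SECONDS:-}"
  if [[ -n "${max_seconds}" ]]; then
    GUIDELLM_ARGS+=(--max-seconds "${max_seconds}")
  fi

  # The output file name is the path plus the chosen format as its extension.
  local output_path="${OUTPUT_PATH:-/results/guidellm_benchmark_results}"
  local output_format="${OUTPUT_FORMAT:-json}"
  GUIDELLM_ARGS+=(--output-path "${output_path}.${output_format}")
}

build_guidellm_args

# Print a shell-quoted view of the command instead of running it.
printf 'Would run: guidellm'
printf ' %q' "${GUIDELLM_ARGS[@]}"
printf '\n'
```

To actually run the benchmark, the three `printf` lines would be replaced with `exec guidellm "${GUIDELLM_ARGS[@]}"`, which passes each array element as exactly one argument without a second round of shell parsing.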
I think it would be better to have this at one of the extremes of the supported versions rather than somewhere in the middle, especially since the max and min versions are tested much more; i.e., 3.9 or 3.13.