Commit
fix: moved to required arg after vllm serve
hommayushi3 committed Aug 9, 2024
1 parent ba79708 commit 7d07aa4
Showing 5 changed files with 385 additions and 1 deletion.
2 changes: 1 addition & 1 deletion endpoints-entrypoint.sh
@@ -10,7 +10,7 @@ TRUST_REMOTE_CODE=${TRUST_REMOTE_CODE:-false}
GUIDED_DECODING_BACKEND=${GUIDED_DECODING_BACKEND:-"outlines"}

# Entrypoint for the OpenAI API server
-CMD="vllm serve --host '0.0.0.0' --port 80 --model '$MODEL_PATH' --tensor-parallel-size '$NUM_SHARD' --dtype $DTYPE --guided-decoding-backend $GUIDED_DECODING_BACKEND"
+CMD="vllm serve $MODEL_PATH --host '0.0.0.0' --port 80 --tensor-parallel-size '$NUM_SHARD' --dtype $DTYPE --guided-decoding-backend $GUIDED_DECODING_BACKEND"

# Append --max-model-len if its value is not -1
if [ "$MAX_MODEL_LEN" -ne -1 ]; then
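
In recent vLLM releases, `vllm serve` takes the model as a required positional argument rather than via `--model`, which is what this change accounts for. Below is a minimal Python sketch of an equivalent launch; the environment-variable names mirror endpoints-entrypoint.sh, but the fallback values are illustrative assumptions, not the script's actual defaults.

import os
import subprocess

# Environment variable names mirror endpoints-entrypoint.sh; the fallback
# values here are illustrative assumptions.
model_path = os.environ.get("MODEL_PATH", "/repository")
num_shard = os.environ.get("NUM_SHARD", "1")
dtype = os.environ.get("DTYPE", "auto")
backend = os.environ.get("GUIDED_DECODING_BACKEND", "outlines")
max_model_len = int(os.environ.get("MAX_MODEL_LEN", "-1"))

# The model path is now a positional argument right after "serve",
# no longer passed via --model.
cmd = [
    "vllm", "serve", model_path,
    "--host", "0.0.0.0",
    "--port", "80",
    "--tensor-parallel-size", num_shard,
    "--dtype", dtype,
    "--guided-decoding-backend", backend,
]
if max_model_len != -1:
    cmd += ["--max-model-len", str(max_model_len)]

subprocess.run(cmd, check=True)
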
Empty file added examples/deploy.py
1 change: 1 addition & 0 deletions examples/inference.py
@@ -0,0 +1 @@
+from huggingface_hub import InferenceClient
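
The new examples/inference.py pulls in huggingface_hub's InferenceClient. Below is a minimal sketch of how such a client could query the OpenAI-compatible server launched by endpoints-entrypoint.sh, assuming a recent huggingface_hub release with OpenAI-compatible base_url support; the endpoint URL, token, and prompt are placeholders, not part of this commit.

from huggingface_hub import InferenceClient

# The endpoint URL and token are placeholders for a deployed instance of
# the vLLM OpenAI-compatible server started by endpoints-entrypoint.sh.
client = InferenceClient(
    base_url="https://your-endpoint.example.com",
    api_key="hf_xxx",
)

# vLLM exposes /v1/chat/completions, which chat_completion targets.
response = client.chat_completion(
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    max_tokens=64,
)
print(response.choices[0].message.content)
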
(Diffs for the remaining 2 changed files are not shown.)
