[BFCL] URL endpoint support discussion #850
Comments
Hey @ThomasRochefortB, my question is:
Hello @HuanzhiMao! That's exactly what I had in mind!
import os

# Read from env vars with fallbacks
vllm_host = os.getenv('VLLM_ENDPOINT', 'localhost')
vllm_port = os.getenv('VLLM_PORT', '8000')

Could we make the
That sounds good!
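To make the env-var idea concrete, here is a minimal sketch of how VLLM_ENDPOINT / VLLM_PORT could be turned into the base URL for an OpenAI-compatible client. This is hypothetical wiring, not BFCL's current code, and the VLLM_API_KEY variable is an assumption:

```python
import os

from openai import OpenAI

# Same env vars as in the snippet above, defaulting to a locally served model.
vllm_host = os.getenv("VLLM_ENDPOINT", "localhost")
vllm_port = os.getenv("VLLM_PORT", "8000")

# vLLM's OpenAI-compatible server exposes its routes under /v1.
base_url = f"http://{vllm_host}:{vllm_port}/v1"

# The client requires an api_key even though a default vLLM server ignores it;
# VLLM_API_KEY is a hypothetical variable name, not an existing BFCL convention.
client = OpenAI(base_url=base_url, api_key=os.getenv("VLLM_API_KEY", "EMPTY"))
```

With fallbacks like these, the existing local-serving path keeps working when the variables are unset, and a separate SLURM job can point the benchmark at a remote server simply by exporting the two variables.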
Describe the issue
This is to open a discussion on supporting models that are already served via an OpenAI-compatible endpoint, thereby bypassing the vLLM serve method in BFCL.

At Valence Labs we are often in the position where we need to run model serving and model benchmarking as two separate jobs (on a SLURM cluster, for example). This means that we cannot use the vLLM serving implemented in the BFCL library; instead, we need to point BFCL directly at an OpenAI-compatible endpoint URL.
We have tested something like this.
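Roughly, that amounts to querying the already-running server through the standard OpenAI client. Below is a minimal sketch, assuming a vLLM server reachable at localhost:8000 that is configured for tool calling; the model name and the get_weather schema are placeholders:

```python
from openai import OpenAI

# Point the client at the externally served, OpenAI-compatible endpoint
# instead of letting BFCL launch vLLM itself.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder; must match the served model
    messages=[{"role": "user", "content": "What is the weather in Montreal today?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
)

# If the model decides to call the tool, the call shows up here; otherwise
# the plain text answer is in response.choices[0].message.content.
print(response.choices[0].message.tool_calls)
```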
I had in mind a PR to BFCL to support these use cases, but there are many potential ways of implementing this, and I wanted to get your opinion first. So, two questions here: