
[BFCL] URL endpoint support discussion #850

Closed
ThomasRochefortB opened this issue Dec 23, 2024 · 3 comments · Fixed by #864

Comments

@ThomasRochefortB
Contributor

Describe the issue

This is to open a discussion on supporting models that are already served via an OpenAI-compatible endpoint, thereby bypassing the vLLM serve method in BFCL.

At Valence Labs we are often in the position where model serving and model benchmarking need to run as two separate jobs (on a SLURM cluster, for example). This means that we cannot use the vLLM serving implemented in the BFCL library and instead need to point BFCL directly at an OpenAI-compatible endpoint URL.

We have tested something like this.

I had in mind a PR to BFCL to support these use cases, but there are many potential ways of implementing this and I wanted to get your opinion first. So, two questions:

  1. Do you understand the desired use case here?
  2. Any guidelines or pointers as to how best to implement this? (See my linked branch above for our current naïve implementation.)
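
For concreteness, here is a minimal sketch of what talking to an externally served, OpenAI-compatible endpoint looks like from the client side; the base URL, model name, and API key are placeholders for whatever the separate serving job exposes, not the code in the linked branch:

        from openai import OpenAI

        # Placeholder endpoint for a server launched by a separate job (e.g. on a SLURM node);
        # BFCL would need to target this instead of spawning its own vLLM process.
        client = OpenAI(
            base_url='http://some-compute-node:8000/v1',
            api_key='EMPTY',  # placeholder; self-hosted servers often don't check the key
        )

        response = client.chat.completions.create(
            model='my-served-model',  # whatever model the external server hosts
            messages=[{'role': 'user', 'content': 'Hello!'}],
        )
        print(response.choices[0].message.content)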
@HuanzhiMao
Collaborator

Hey @ThomasRochefortB,
If I understand correctly, you want to skip the part where the BFCL generation pipeline spins up the OpenAI-compatible server, and proceed as if the server has already been set up, right?

My question is:
Currently, we assume that the server is using the URL http://localhost:{VLLM_PORT}/v1, where the vLLM port number is defined in constant.py (with a default value of 1053). Do you need to change the endpoint and port settings?
If not, then it's fairly straightforward to make the change. We could add an optional CLI flag to indicate that the server has already been set up, and in the code we would skip these lines when the flag is set (section 1, section 2, section 3, section 4). What do you think?
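
For illustration, a rough sketch of that flag-gated setup; the flag name and the setup helper below are hypothetical placeholders, not BFCL's actual CLI or code:

        import argparse

        def spin_up_local_vllm_server():
            # Placeholder for the linked setup sections that launch vLLM; not BFCL's real code.
            print('Launching local vLLM server ...')

        parser = argparse.ArgumentParser()
        parser.add_argument(
            '--skip-server-setup',  # hypothetical flag name
            action='store_true',
            help='Assume an OpenAI-compatible server is already running; do not spawn vLLM.',
        )
        args = parser.parse_args()

        if not args.skip_server_setup:
            # Only in this branch would the pipeline launch (and later tear down) its own vLLM server.
            spin_up_local_vllm_server()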

@ThomasRochefortB
Contributor Author

Hello @HuanzhiMao !

That's exactly what I had in mind!
For a SLURM cluster application, it gets trickier to force the endpoint and port to constant values, as these can depend on which node gets allocated.

  • What would be a good way to specify the IP address and the port number? In my branch I am using environment variables and reading them like this:
        import os

        # Read from env vars with fallbacks
        vllm_host = os.getenv('VLLM_ENDPOINT', 'localhost')
        vllm_port = os.getenv('VLLM_PORT', '8000')

Could we make VLLM_PORT and VLLM_ENDPOINT optional entries in .env?

  • The CLI could have a --skip-vllm flag that bypasses all the vLLM setup steps and directly points to the env vars for the completions request (rough sketch below).
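
To make that concrete, a rough sketch of how the env vars and the --skip-vllm flag could fit together; the client construction and the 'EMPTY' key are illustrative assumptions, not settled BFCL code:

        import os
        from openai import OpenAI

        # Env vars proposed above, with local fallbacks (the names are the suggestion, not final).
        vllm_host = os.getenv('VLLM_ENDPOINT', 'localhost')
        vllm_port = os.getenv('VLLM_PORT', '8000')

        # With --skip-vllm set, BFCL would skip spawning vLLM and build the client
        # directly against the externally served endpoint.
        client = OpenAI(
            base_url=f'http://{vllm_host}:{vllm_port}/v1',
            api_key='EMPTY',  # placeholder; many self-hosted servers don't require a real key
        )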

@HuanzhiMao
Collaborator


That sounds good!
