[Feature request] Adding a checker to see if a custom endpoint is working properly #106

remyleone · 2023-11-08T10:54:15Z

I'm trying to run a model using the following command on my server:

docker run --gpus all --shm-size 1g -p 8080:80 -v /scratch/data:/data -e HUGGING_FACE_HUB_TOKEN=$HF_TOKEN -e HF_HUB_ENABLE_HF_TRANSFER=0 ghcr.io/huggingface/text-generation-inference:1.1.0 --model-id bigcode/starcoder

After configuring the endpoint in my editor (http://XXXX:8080/generate), I would like a test, run from the editor, that tells me whether the editor can connect successfully.

This could live in the settings or as a dedicated command in the LLM extension. It would verify that everything is in place and show a helpful message when it is not.
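
To illustrate the kind of check I have in mind, here is a minimal sketch in Python. It is not the extension's actual code. It assumes the server exposes the text-generation-inference `/generate` route, which takes a JSON body with `inputs` and `parameters` and replies with a `generated_text` field. The `check_endpoint` name and the message wording are hypothetical.

```python
import json
import urllib.error
import urllib.request


def check_endpoint(url: str, timeout: float = 5.0):
    """Send one tiny generation request and return (ok, message).

    Hypothetical helper: `url` is the full endpoint, e.g. ending in /generate.
    """
    # Ask for a single token so the check stays cheap on the server.
    payload = json.dumps(
        {"inputs": "def hello():", "parameters": {"max_new_tokens": 1}}
    ).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (wrong path, model loading, ...).
        return False, f"Server reachable but returned HTTP {err.code}."
    except urllib.error.URLError as err:
        # DNS failure, refused connection, unreachable host, ...
        return False, f"Could not connect to {url}: {err.reason}"
    except OSError as err:
        # Timeouts or resets while reading the response.
        return False, f"Connection problem while talking to {url}: {err}"
    except ValueError:
        # The response body was not valid JSON.
        return False, "Server answered, but the response was not valid JSON."

    if isinstance(body, dict) and "generated_text" in body:
        return True, "Endpoint is reachable and returned generated text."
    return False, "Server answered, but without a 'generated_text' field."
```

Calling `check_endpoint("http://XXXX:8080/generate")` would return a boolean plus a message that the editor could display, so a wrong port, a wrong path, and a server that is still loading the model each produce a different hint.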

As a check on the server side, I'm using nvidia-smi to see whether GPU usage increases.


github-actions · 2023-12-09T01:46:23Z

This issue is stale because it has been open for 30 days with no activity.

github-actions bot added the stale label Dec 9, 2023
