Update Inference specification for Hugging Face's completion and chat completion tasks #4383

Jan-Kazlouski-elastic · 2025-05-19T18:07:07Z

This PR updates the specification to reflect the changes in elastic/elasticsearch#127254:

Extended Task Support:

Added completion and chat_completion tasks to the list of supported Hugging Face tasks.

Model Requirements for Chat Tasks:

Updated documentation to describe specific requirements for using chat_completion and completion tasks, including model compatibility with the OpenAI API format and usage guidelines for serverless vs. dedicated endpoints.

New Configuration Parameters:

Introduced optional model_id field in Hugging Face service settings, applicable to completion and chat_completion tasks.
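
As a rough illustration of the settings shape this item describes, here is a minimal sketch. Only `rate_limit`, `url`, and the optional `model_id` come from this PR; the `api_key` and `requests_per_minute` field names, and all values, are assumptions made for the example.

```typescript
// Sketch only: `api_key` and `requests_per_minute` are assumed field names.
interface RateLimitSetting {
  requests_per_minute?: number
}

interface HuggingFaceServiceSettings {
  api_key: string
  rate_limit?: RateLimitSetting
  url: string
  // Optional; per the PR description it applies to completion and chat_completion.
  model_id?: string
}

// Placeholder values; the URL and model name are hypothetical.
const withoutModelId: HuggingFaceServiceSettings = {
  api_key: '<hugging-face-access-token>',
  url: 'https://example.invalid/v1/chat/completions'
}

// Same settings with the optional field filled in.
const withModelId: HuggingFaceServiceSettings = {
  ...withoutModelId,
  model_id: 'example-org/example-chat-model'
}
```

Both objects satisfy the same type, which is the point of making `model_id` optional: existing configurations without it stay valid.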

Rate Limit Clarifications:

Updated rate_limit documentation to clarify default behavior and guidance for tuning based on deployment specifics.

Documentation Fixes:

Corrected typos in existing text_embedding request examples.

Additional actions

Signed the CLA
Executed make contrib

github-actions · 2025-05-19T18:08:46Z

Following you can find the validation results for the APIs you have changed.

API	Status	Request	Response
`inference.chat_completion_unified`	⚪	Missing test	Missing test
`inference.completion`	⚪	Missing test	Missing test
`inference.delete`	⚪	Missing test	Missing test
`inference.get`	🟢	1/1	1/1
`inference.inference`	⚪	Missing test	Missing test
`inference.put_alibabacloud`	⚪	Missing test	Missing test
`inference.put_amazonbedrock`	⚪	Missing test	Missing test
`inference.put_anthropic`	⚪	Missing test	Missing test
`inference.put_azureaistudio`	⚪	Missing test	Missing test
`inference.put_azureopenai`	⚪	Missing test	Missing test
`inference.put_cohere`	⚪	Missing test	Missing test
`inference.put_elasticsearch`	⚪	Missing test	Missing test
`inference.put_elser`	⚪	Missing test	Missing test
`inference.put_googleaistudio`	⚪	Missing test	Missing test
`inference.put_googlevertexai`	⚪	Missing test	Missing test
`inference.put_hugging_face`	⚪	Missing test	Missing test
`inference.put_jinaai`	⚪	Missing test	Missing test
`inference.put_mistral`	⚪	Missing test	Missing test
`inference.put_openai`	⚪	Missing test	Missing test
`inference.put_voyageai`	⚪	Missing test	Missing test
`inference.put_watsonx`	⚪	Missing test	Missing test
`inference.put`	⚪	Missing test	Missing test
`inference.rerank`	⚪	Missing test	Missing test
`inference.sparse_embedding`	⚪	Missing test	Missing test
`inference.stream_completion`	⚪	Missing test	Missing test
`inference.text_embedding`	⚪	Missing test	Missing test
`inference.update`	⚪	Missing test	Missing test

You can validate these APIs yourself by using the make validate target.

l-trotta

spec wise LGTM, let's hear from @szabosteve for the docs part!

jonathan-buttner

Thanks for the PR, I left a few suggestions.

jonathan-buttner · 2025-05-28T17:25:25Z

specification/inference/_types/CommonTypes.ts

   */
  rate_limit?: RateLimitSetting
  /**
   * The URL endpoint to use for the requests.
+   * For `completion` and `chat_completion` tasks, endpoint must be compatible with the OpenAI API format and include `v1/chat/completions`.


How about we expand this a little. Maybe something like:

Suggested change

- * For `completion` and `chat_completion` tasks, endpoint must be compatible with the OpenAI API format and include `v1/chat/completions`.
+ * For `completion` and `chat_completion` tasks, the deployed model must be compatible with the Hugging Face Chat Completion interface (https://huggingface.co/docs/inference-providers/en/tasks/chat-completion#conversational-large-language-models-llms). The URL must include `v1/chat/completions`.

Since OpenAI mode is the only visible way of determining whether or not OpenAI API format is supported, I propose expanding it a bit more:

Suggested change

- * For `completion` and `chat_completion` tasks, endpoint must be compatible with the OpenAI API format and include `v1/chat/completions`.
+ * For `completion` and `chat_completion` tasks, the deployed model must be compatible with the Hugging Face Chat Completion interface (https://huggingface.co/docs/inference-providers/en/tasks/chat-completion#conversational-large-language-models-llms). OpenAI mode must be enabled, and the endpoint URL must include `v1/chat/completions`.

What is OpenAI mode? I searched Hugging Face but couldn't find it. Are you referring to the request format that the inference API requires?

Made the change.

Sorry, I missed your question about OpenAI mode. "OpenAI mode" is what I call the toggle that must be enabled for the endpoint to use the OpenAI API.
Not every model has this toggle, so not every model can be used for Elastic inference.

jonathan-buttner · 2025-05-28T18:21:03Z

package.json

@@ -1,6 +1,6 @@
 {
  "dependencies": {
-    "@redocly/cli": "^1.34.1",
+    "@redocly/cli": "^1.34.3",


Do we need this change?

This change is made automatically by the pre-commit tasks. This value has been incremented over time alongside other committed changes, so we would need a reason to discard what the pre-commit tasks produce.

jonathan-buttner · 2025-05-28T18:21:21Z

package-lock.json

@@ -5,7 +5,7 @@
  "packages": {
    "": {
      "dependencies": {
-        "@redocly/cli": "^1.34.1",
+        "@redocly/cli": "^1.34.3",


Was this file supposed to change?

As answered above, this change is made automatically by the pre-commit tasks. Unless we have a reason to ignore the changes they make, I'd keep it.

Yep sounds good, thanks for explaining.

szabosteve

Left a few comments and a tiny suggestion, otherwise LGTM!

szabosteve · 2025-06-02T12:23:12Z

specification/inference/put_hugging_face/PutHuggingFaceRequest.ts

@@ -29,13 +29,16 @@ import { Id } from '@_types/common'
 /**
 * Create a Hugging Face inference endpoint.
 *
- * Create an inference endpoint to perform an inference task with the `hugging_face` service.
+ * Creates an inference endpoint to perform an inference task with the `hugging_face` service.


I suggest changing it back to be consistent with the rest of the endpoint docs.

Suggested change

- * Creates an inference endpoint to perform an inference task with the `hugging_face` service.
+ * Create an inference endpoint to perform an inference task with the `hugging_face` service.

FYI: the use of "Create" vs. "Creates" is not consistent across the endpoint docs. Amazon and Mistral use "Creates", and I took them as the example. I thought it made more sense to describe what the endpoint does, but now I see that the description is meant as an invitation to do it.
Changed it, since most of the providers use "Create".

@Jan-Kazlouski-elastic Thanks! I'll fix Amazon and Mistral.

Thank you @szabosteve

jonathan-buttner · 2025-06-02T17:38:46Z

specification/inference/_types/CommonTypes.ts

   */
  rate_limit?: RateLimitSetting
  /**
   * The URL endpoint to use for the requests.
+   * For `completion` and `chat_completion` tasks, the deployed model must be compatible with the Hugging Face Chat Completion interface (https://huggingface.co/docs/inference-providers/en/tasks/chat-completion#conversational-large-language-models-llms). OpenAI mode must be enabled, and the endpoint URL must include `v1/chat/completions`.


OpenAI mode must be enabled

What is OpenAI mode?

Can we remove that portion of the sentence?

Suggested change

- * For `completion` and `chat_completion` tasks, the deployed model must be compatible with the Hugging Face Chat Completion interface (https://huggingface.co/docs/inference-providers/en/tasks/chat-completion#conversational-large-language-models-llms). OpenAI mode must be enabled, and the endpoint URL must include `v1/chat/completions`.
+ * For `completion` and `chat_completion` tasks, the deployed model must be compatible with the Hugging Face Chat Completion interface (https://huggingface.co/docs/inference-providers/en/tasks/chat-completion#conversational-large-language-models-llms). The endpoint URL must include `v1/chat/completions`.

OpenAI mode is the toggle that must be enabled for the endpoint to use the OpenAI API.
I think this information is useful for the customer. Not every model has this toggle, so not every model can be used for Elastic inference.
Do you still want me to remove it?

Ah I hadn't seen that before.

Can you try disabling it and see if a chat completion request using the inference API still works? I wonder if that only controls the UI in hugging face.

Can you give me an example of a model that has it?

Can you try disabling it and see if a chat completion request using the inference API still works?

After disabling it in the UI, it still works, but the URL presented to the client no longer contains v1/chat/completions. So switching it on or off doesn't stop the endpoint from processing OpenAI payloads sent to /v1/chat/completions; it only hides the /v1/chat/completions part of the URL and shows a different payload example. If OpenAI mode is turned off, or is not available at all for a specific model, the full URL with /v1/chat/completions is hidden, and the client can't see the URL that must be used for the integration.
The client might already know what the URL should look like, and we do say that it must contain /v1/chat/completions, but it isn't safe to assume the customer will understand that this part must be appended to the regular URL. That is especially true if the model doesn't support the OpenAI payload: the customer will add /v1/chat/completions and get an error back.

I guess we could make it clearer by telling the customer that a missing /v1/chat/completions on the running model's page can be caused by OpenAI mode being turned off, and that enabling the toggle, if it is present, can make it appear. Since this mode has always been ON by default for me, I think the risk here is really low. BUT the presence of this toggle is one of the signs that the model can process OpenAI-style payloads and can be used for the integration. So for me personally, having this note makes us safer.
But the fact that you hadn't seen it before, while obviously having run successful tests in the past, makes me wonder whether it is specific to only some models.

[Screenshots: endpoint UI with OpenAI mode On and Off]

I wonder if that only controls the UI in hugging face.

Yes, it just controls the UI.

Can you give me an example of a model that has it?

https://huggingface.co/microsoft/Phi-3-mini-128k-instruct
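
The URL requirement discussed in this thread can be checked mechanically before an endpoint is configured. The helper below is a hypothetical sketch, not part of the specification or of Elasticsearch, and the example hostnames are placeholders. It only verifies the shape of the URL; it cannot tell whether the deployed model actually accepts OpenAI-style payloads.

```typescript
// Hypothetical helper: returns true when the URL path ends with
// /v1/chat/completions (trailing slashes are ignored).
function hasChatCompletionsPath(endpointUrl: string): boolean {
  let parsed: URL
  try {
    parsed = new URL(endpointUrl)
  } catch {
    // Not a parseable absolute URL.
    return false
  }
  // Strip trailing slashes so ".../completions/" is also accepted.
  const path = parsed.pathname.replace(/\/+$/, '')
  return path.endsWith('/v1/chat/completions')
}

// Placeholder URLs for illustration only.
const fullUrl = 'https://example.invalid/v1/chat/completions'
const baseUrlOnly = 'https://example.invalid'
```

Here `hasChatCompletionsPath(fullUrl)` passes and `hasChatCompletionsPath(baseUrlOnly)` does not, which mirrors the case where the toggle is off and only the base URL is shown.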

Ah I see. I wonder if it'd be more helpful to put this information, and an example picture like the one you have, somewhere else in the docs. @szabosteve what do you think?

My suggestion would be to have the text as something like:

For `completion` and `chat_completion` tasks, the deployed model must be compatible with the Hugging Face Chat Completion interface (https://huggingface.co/docs/inference-providers/en/tasks/chat-completion#conversational-large-language-models-llms). The endpoint URL must include `v1/chat/completions`. To determine if the model supports the Hugging Face Chat Completion Interface and to access the correct URL follow the information here.

We'd have a link or something to either another page or a different section of the page that explains that the deployment should have a toggle for OpenAI. Then to get the correct URL they should enable the toggle and ensure the URL ends with v1/chat/completions.

I think the best would be to present this information somehow here. I'm afraid that linking to another docs page from the reference for such a low-level detail would be hiding information. I suggest something like this:

Suggested change

- * For `completion` and `chat_completion` tasks, the deployed model must be compatible with the Hugging Face Chat Completion interface (https://huggingface.co/docs/inference-providers/en/tasks/chat-completion#conversational-large-language-models-llms). OpenAI mode must be enabled, and the endpoint URL must include `v1/chat/completions`.
+ * For `completion` and `chat_completion` tasks, the deployed model must be compatible with the Hugging Face Chat Completion interface (see the linked external documentation for details). The endpoint URL for the request must include `/v1/chat/completions`.
+ * If the model supports the OpenAI Chat Completion schema, a toggle should appear in the interface. Enabling this toggle doesn't change any model behavior, it reveals the full endpoint URL needed (which should include `/v1/chat/completions`) when configuring the inference endpoint in Elasticsearch. If the model doesn't support this schema, the toggle may not be shown.
+ * @ext_doc_id huggingface-chat-completion-interface

And then add the following to the table.csv:

huggingface-chat-completion-interface, https://huggingface.co/docs/inference-providers/en/tasks/chat-completion#conversational-large-language-models-llms

It will provide a link with the text External documentation at the end of the description, it makes the docs a bit more readable.

@jonathan-buttner What do you think?

Yep that looks good. Thanks!

Applied change proposed by @szabosteve

jonathan-buttner · 2025-06-02T17:49:10Z

specification/inference/put_hugging_face/PutHuggingFaceRequest.ts

+ * After the endpoint is initialized (for dedicated) or ready (for serverless), ensure it supports the OpenAI API and includes `/v1/chat/completions` part in URL. Then, copy the full endpoint URL for use.
+ * Recommended models for `chat_completion` and `completion` tasks:
+ *
+ * * `Mistral-7B-Instruct-v0.2`


@szabosteve should we include the full URL link to these models?

I would rather not include the full URLs. Currently, we can provide only one link per description; otherwise, the generator complains.

jonathan-buttner

Looking good.

jonathan-buttner · 2025-06-06T19:24:02Z

specification/inference/put_hugging_face/PutHuggingFaceRequest.ts

+ * For Elastic's `chat_completion` and `completion` tasks:
+ * The selected model must support the `Text Generation` task and expose OpenAI API. HuggingFace supports both serverless and dedicated endpoints for `Text Generation`. When creating dedicated endpoint select the `Text Generation` task.
+ * After the endpoint is initialized (for dedicated) or ready (for serverless), ensure it supports the OpenAI API and includes `/v1/chat/completions` part in URL. Then, copy the full endpoint URL for use.
+ * Recommended models for `chat_completion` and `completion` tasks:


@szabosteve should we say something like "known supported models"?
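
To make the guidance in the hunk above concrete, here is a minimal sketch of how a create request for a `chat_completion` endpoint might be assembled once the full Hugging Face URL has been copied. The `api_key` field name, the inference ID, the URL, and the token are assumptions or placeholders for illustration; the sketch builds the request description only and does not send anything.

```typescript
// Assumed request body shape; `api_key` is an assumed field name.
interface PutHuggingFaceBody {
  service: 'hugging_face'
  service_settings: { api_key: string; url: string; model_id?: string }
}

interface InferenceRequest {
  method: 'PUT'
  path: string
  body: PutHuggingFaceBody
}

// Hypothetical helper: describes the request without performing it.
function buildChatCompletionRequest(
  inferenceId: string,
  url: string,
  apiKey: string
): InferenceRequest {
  return {
    method: 'PUT',
    // The task type goes in the path, followed by the inference endpoint ID.
    path: `/_inference/chat_completion/${encodeURIComponent(inferenceId)}`,
    body: {
      service: 'hugging_face',
      service_settings: { api_key: apiKey, url }
    }
  }
}

// Placeholder values only.
const exampleRequest = buildChatCompletionRequest(
  'hugging-face-chat',
  'https://example.invalid/v1/chat/completions',
  '<hugging-face-access-token>'
)
```

Keeping the URL as a single copied value, rather than appending `/v1/chat/completions` in code, follows the thread's point that the customer should copy the full endpoint URL shown by Hugging Face.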

github-actions · 2025-06-09T12:57:49Z

The backport to 8.19 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-8.19 8.19
# Navigate to the new working tree
cd .worktrees/backport-8.19
# Create a new branch
git switch --create backport-4383-to-8.19
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 b48894fcf05fdbd121ccb20c54c2b66be1924124
# Push it to GitHub
git push --set-upstream origin backport-4383-to-8.19
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-8.19

Then, create a pull request where the base branch is 8.19 and the compare/head branch is backport-4383-to-8.19.

… completion tasks (#4383)
* Update Inference specification for Hugging Face's completion and chat completion tasks
* Extend description for url parameter
* Fix model_id description for text_embedding task and update endpoint creation comment
* Enhance Hugging Face integration documentation for chat completion and completion tasks
* Add @typescript-eslint/rule-tester to devDependencies in package-lock.json
* Enhance Hugging Face integration to support chat_completion and completion tasks
* Fix description for model_id field in Hugging Face integration to clarify request failure conditions
* Update json schema
(cherry picked from commit b48894f)

Update Inference specification for Hugging Face's completion and chat…

3eebfb0

… completion tasks

Jan-Kazlouski-elastic assigned jonathan-buttner May 19, 2025

Jan-Kazlouski-elastic added ml backport 8.19 backport 9.1 Team:ML labels May 19, 2025

github-actions bot added the specification label May 19, 2025

Jan-Kazlouski-elastic removed backport 9.1 Team:ML labels May 20, 2025

Merge branch 'main' into feature/hugging-face-chat-completion-integra…

3729546

…tion

l-trotta approved these changes May 28, 2025

jonathan-buttner requested changes May 28, 2025

Jan-Kazlouski-elastic added 2 commits May 29, 2025 12:20

Merge remote-tracking branch 'origin/main' into feature/hugging-face-…

0c992ca

…chat-completion-integration # Conflicts: # output/schema/schema-serverless.json # output/schema/schema.json

Extend description for url parameter

d8df825

Jan-Kazlouski-elastic requested a review from jonathan-buttner May 29, 2025 13:06

l-trotta mentioned this pull request May 30, 2025

Update inference specification for Hugging Face's rerank task #4417

Merged

szabosteve approved these changes Jun 2, 2025

jonathan-buttner requested changes Jun 2, 2025

Fix model_id description for text_embedding task and update endpoint …

bbdad4a

…creation comment

Jan-Kazlouski-elastic added 2 commits June 4, 2025 17:11

Merge remote-tracking branch 'origin/main' into feature/hugging-face-…

091da7c

…chat-completion-integration

# Conflicts:
#	output/openapi/elasticsearch-openapi.json
#	output/openapi/elasticsearch-serverless-openapi.json
#	output/schema/schema-serverless.json
#	output/schema/schema.json

Enhance Hugging Face integration documentation for chat completion an…

16da9b2

…d completion tasks

Add @typescript-eslint/rule-tester to devDependencies in package-lock…

332c5d1

….json

Jan-Kazlouski-elastic requested a review from jonathan-buttner June 5, 2025 08:14

Merge remote-tracking branch 'origin/main' into feature/hugging-face-…

1be69f2

…chat-completion-integration

# Conflicts:
#	output/openapi/elasticsearch-openapi.json
#	output/openapi/elasticsearch-serverless-openapi.json
#	output/schema/schema.json

Enhance Hugging Face integration to support chat_completion and compl…

a6dc68f

…etion tasks

Merge remote-tracking branch 'origin/main' into feature/hugging-face-…

229a358

…chat-completion-integration

# Conflicts:
#	package-lock.json

jonathan-buttner requested changes Jun 6, 2025

Jan-Kazlouski-elastic added 3 commits June 9, 2025 09:00

Fix description for model_id field in Hugging Face integration to cla…

decae43

…rify request failure conditions

Update json schema

dfa4ae4

Jan-Kazlouski-elastic requested a review from jonathan-buttner June 9, 2025 09:14

jonathan-buttner approved these changes Jun 9, 2025

Merge remote-tracking branch 'origin/main' into feature/hugging-face-…

e686ee6

…chat-completion-integration

Jan-Kazlouski-elastic merged commit b48894f into main Jun 9, 2025
8 checks passed

Jan-Kazlouski-elastic deleted the feature/hugging-face-chat-completion-integration branch June 9, 2025 12:56

Jan-Kazlouski-elastic mentioned this pull request Jun 9, 2025

[Backport 8.19] Update Inference specification for Hugging Face's completion and chat completion tasks #4497

Merged

-   * For `completion` and `chat_completion` tasks, endpoint must be compatible with the OpenAI API format and include `v1/chat/completions`.
+   * For `completion` and `chat_completion` tasks, the deployed model must be compatible with the Hugging Face Chat Completion interface (https://huggingface.co/docs/inference-providers/en/tasks/chat-completion#conversational-large-language-models-llms). The URL must include `v1/chat/completions`.

-   * Creates an inference endpoint to perform an inference task with the `hugging_face` service.
+   * Create an inference endpoint to perform an inference task with the `hugging_face` service.

-   * For `completion` and `chat_completion` tasks, the deployed model must be compatible with the Hugging Face Chat Completion interface (https://huggingface.co/docs/inference-providers/en/tasks/chat-completion#conversational-large-language-models-llms). OpenAI mode must be enabled, and the endpoint URL must include `v1/chat/completions`.
+   * For `completion` and `chat_completion` tasks, the deployed model must be compatible with the Hugging Face Chat Completion interface (see the linked external documentation for details). The endpoint URL for the request must include `/v1/chat/completions`.
+  * If the model supports the OpenAI Chat Completion schema, a toggle should appear in the interface. Enabling this toggle doesn't change any model behavior, it reveals the full endpoint URL needed (which should include `/v1/chat/completions`) when configuring the inference endpoint in Elasticsearch. If the model doesn't support this schema, the toggle may not be shown.
+  * @ext_doc_id huggingface-chat-completion-interface
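
To make the URL requirement above concrete, here is a minimal Python sketch of how a client might assemble the endpoint-creation request and check the URL before sending it. The helper name `build_put_request`, the example host, and the client-side check itself are illustrative assumptions and not part of the specification; Elasticsearch performs its own validation. The request shape (`PUT _inference/<task_type>/<inference_id>` with `service` and `service_settings`) and the optional `model_id` field follow the PR description.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse

# Tasks that, per the spec text above, require an OpenAI-style chat URL.
CHAT_TASKS = {"completion", "chat_completion"}
REQUIRED_PATH = "/v1/chat/completions"


def build_put_request(
    task_type: str,
    inference_id: str,
    url: str,
    api_key: str,
    model_id: Optional[str] = None,
) -> Tuple[str, dict]:
    """Return (path, body) for a hypothetical PUT _inference request.

    This helper is an illustration only; it is not an Elasticsearch client API.
    """
    # Assumed client-side guard: reject chat-style tasks whose URL path
    # does not contain /v1/chat/completions.
    if task_type in CHAT_TASKS and REQUIRED_PATH not in urlparse(url).path:
        raise ValueError(
            f"URL for task '{task_type}' must include {REQUIRED_PATH}: {url}"
        )

    service_settings = {"api_key": api_key, "url": url}
    # model_id is optional and only relevant to completion/chat_completion.
    if model_id is not None:
        service_settings["model_id"] = model_id

    path = f"_inference/{task_type}/{inference_id}"
    body = {"service": "hugging_face", "service_settings": service_settings}
    return path, body
```

A call such as `build_put_request("chat_completion", "hf-chat", "https://example.endpoints.huggingface.cloud/v1/chat/completions", "<api-key>")` returns the path and JSON body to send, while the same call with a URL that lacks the `/v1/chat/completions` suffix raises `ValueError` before any request is made.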
