diff --git a/docs/inference-providers/tasks/audio-classification.md b/docs/inference-providers/tasks/audio-classification.md index b28383e60..c14c7142c 100644 --- a/docs/inference-providers/tasks/audio-classification.md +++ b/docs/inference-providers/tasks/audio-classification.md @@ -46,6 +46,11 @@ No snippet available for this task. #### Request +| Headers | | | +| :--- | :--- | :--- | +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | + + | Payload | | | | :--- | :--- | :--- | | **inputs*** | _string_ | The input audio data as a base64-encoded string. If no `parameters` are provided, you can also provide the audio data as a raw bytes payload. | @@ -54,16 +59,6 @@ No snippet available for this task. | **        top_k** | _integer_ | When specified, limits the output to the top K most probable classes. | -Some options can be configured by passing headers to the Inference API. Here are the available headers: - -| Headers | | | -| :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). | -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). | - -For more information about Inference API headers, check out the parameters [guide](../parameters). - #### Response | Body | | diff --git a/docs/inference-providers/tasks/automatic-speech-recognition.md b/docs/inference-providers/tasks/automatic-speech-recognition.md index 5b6d71362..f9bcdbdb3 100644 --- a/docs/inference-providers/tasks/automatic-speech-recognition.md +++ b/docs/inference-providers/tasks/automatic-speech-recognition.md @@ -48,6 +48,11 @@ Explore all available models and find the one that suits you best [here](https:/ #### Request +| Headers | | | +| :--- | :--- | :--- | +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | + + | Payload | | | | :--- | :--- | :--- | | **inputs*** | _string_ | The input audio data as a base64-encoded string. If no `parameters` are provided, you can also provide the audio data as a raw bytes payload. 
| @@ -72,16 +77,6 @@ Explore all available models and find the one that suits you best [here](https:/ | **                use_cache** | _boolean_ | Whether the model should use the past last key/values attentions to speed up decoding | -Some options can be configured by passing headers to the Inference API. Here are the available headers: - -| Headers | | | -| :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). | -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). | - -For more information about Inference API headers, check out the parameters [guide](../parameters). - #### Response | Body | | diff --git a/docs/inference-providers/tasks/chat-completion.md b/docs/inference-providers/tasks/chat-completion.md index 4844be8e6..0909d7413 100644 --- a/docs/inference-providers/tasks/chat-completion.md +++ b/docs/inference-providers/tasks/chat-completion.md @@ -79,6 +79,11 @@ conversational /> #### Request +| Headers | | | +| :--- | :--- | :--- | +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | + + | Payload | | | | :--- | :--- | :--- | | **frequency_penalty** | _number_ | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | @@ -140,16 +145,6 @@ conversational /> | **top_p** | _number_ | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. | -Some options can be configured by passing headers to the Inference API. Here are the available headers: - -| Headers | | | -| :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). | -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). 
However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). | - -For more information about Inference API headers, check out the parameters [guide](../parameters). - #### Response Output type depends on the `stream` input parameter. diff --git a/docs/inference-providers/tasks/feature-extraction.md b/docs/inference-providers/tasks/feature-extraction.md index 7ed41932a..3d9e60fd0 100644 --- a/docs/inference-providers/tasks/feature-extraction.md +++ b/docs/inference-providers/tasks/feature-extraction.md @@ -47,6 +47,11 @@ Explore all available models and find the one that suits you best [here](https:/ #### Request +| Headers | | | +| :--- | :--- | :--- | +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | + + | Payload | | | | :--- | :--- | :--- | | **inputs*** | _unknown_ | One of the following: | @@ -58,16 +63,6 @@ Explore all available models and find the one that suits you best [here](https:/ | **truncation_direction** | _enum_ | Possible values: Left, Right. | -Some options can be configured by passing headers to the Inference API. Here are the available headers: - -| Headers | | | -| :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). | -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). | - -For more information about Inference API headers, check out the parameters [guide](../parameters). - #### Response | Body | | diff --git a/docs/inference-providers/tasks/fill-mask.md b/docs/inference-providers/tasks/fill-mask.md index d5331a9fe..96161e4f2 100644 --- a/docs/inference-providers/tasks/fill-mask.md +++ b/docs/inference-providers/tasks/fill-mask.md @@ -39,6 +39,11 @@ No snippet available for this task. 
#### Request +| Headers | | | +| :--- | :--- | :--- | +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | + + | Payload | | | | :--- | :--- | :--- | | **inputs*** | _string_ | The text with masked tokens | @@ -47,16 +52,6 @@ No snippet available for this task. | **        targets** | _string[]_ | When passed, the model will limit the scores to the passed targets instead of looking up in the whole vocabulary. If the provided targets are not in the model vocab, they will be tokenized and the first resulting token will be used (with a warning, and that might be slower). | -Some options can be configured by passing headers to the Inference API. Here are the available headers: - -| Headers | | | -| :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). | -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). | - -For more information about Inference API headers, check out the parameters [guide](../parameters). - #### Response | Body | | diff --git a/docs/inference-providers/tasks/image-classification.md b/docs/inference-providers/tasks/image-classification.md index 0feb60ff9..cc68b01fd 100644 --- a/docs/inference-providers/tasks/image-classification.md +++ b/docs/inference-providers/tasks/image-classification.md @@ -44,6 +44,11 @@ Explore all available models and find the one that suits you best [here](https:/ #### Request +| Headers | | | +| :--- | :--- | :--- | +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | + + | Payload | | | | :--- | :--- | :--- | | **inputs*** | _string_ | The input image data as a base64-encoded string. If no `parameters` are provided, you can also provide the image data as a raw bytes payload. | @@ -52,16 +57,6 @@ Explore all available models and find the one that suits you best [here](https:/ | **        top_k** | _integer_ | When specified, limits the output to the top K most probable classes. | -Some options can be configured by passing headers to the Inference API. 
Here are the available headers: - -| Headers | | | -| :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). | -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). | - -For more information about Inference API headers, check out the parameters [guide](../parameters). - #### Response | Body | | diff --git a/docs/inference-providers/tasks/image-segmentation.md b/docs/inference-providers/tasks/image-segmentation.md index 4e4942717..24f69d233 100644 --- a/docs/inference-providers/tasks/image-segmentation.md +++ b/docs/inference-providers/tasks/image-segmentation.md @@ -43,6 +43,11 @@ Explore all available models and find the one that suits you best [here](https:/ #### Request +| Headers | | | +| :--- | :--- | :--- | +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | + + | Payload | | | | :--- | :--- | :--- | | **inputs*** | _string_ | The input image data as a base64-encoded string. If no `parameters` are provided, you can also provide the image data as a raw bytes payload. | @@ -53,16 +58,6 @@ Explore all available models and find the one that suits you best [here](https:/ | **        threshold** | _number_ | Probability threshold to filter out predicted masks. | -Some options can be configured by passing headers to the Inference API. Here are the available headers: - -| Headers | | | -| :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). | -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. 
It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). | - -For more information about Inference API headers, check out the parameters [guide](../parameters). - #### Response | Body | | diff --git a/docs/inference-providers/tasks/image-to-image.md b/docs/inference-providers/tasks/image-to-image.md index 908b6d393..a594c0940 100644 --- a/docs/inference-providers/tasks/image-to-image.md +++ b/docs/inference-providers/tasks/image-to-image.md @@ -46,6 +46,11 @@ Explore all available models and find the one that suits you best [here](https:/ #### Request +| Headers | | | +| :--- | :--- | :--- | +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | + + | Payload | | | | :--- | :--- | :--- | | **inputs*** | _string_ | The input image data as a base64-encoded string. If no `parameters` are provided, you can also provide the image data as a raw bytes payload. | @@ -59,16 +64,6 @@ Explore all available models and find the one that suits you best [here](https:/ | **                height*** | _integer_ | | -Some options can be configured by passing headers to the Inference API. Here are the available headers: - -| Headers | | | -| :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). | -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). | - -For more information about Inference API headers, check out the parameters [guide](../parameters). - #### Response | Body | | diff --git a/docs/inference-providers/tasks/object-detection.md b/docs/inference-providers/tasks/object-detection.md index c45de8a1c..3c36c4081 100644 --- a/docs/inference-providers/tasks/object-detection.md +++ b/docs/inference-providers/tasks/object-detection.md @@ -42,6 +42,11 @@ Explore all available models and find the one that suits you best [here](https:/ #### Request +| Headers | | | +| :--- | :--- | :--- | +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. 
You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | + + | Payload | | | | :--- | :--- | :--- | | **inputs*** | _string_ | The input image data as a base64-encoded string. If no `parameters` are provided, you can also provide the image data as a raw bytes payload. | @@ -49,16 +54,6 @@ Explore all available models and find the one that suits you best [here](https:/ | **        threshold** | _number_ | The probability necessary to make a prediction. | -Some options can be configured by passing headers to the Inference API. Here are the available headers: - -| Headers | | | -| :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). | -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). | - -For more information about Inference API headers, check out the parameters [guide](../parameters). - #### Response | Body | | diff --git a/docs/inference-providers/tasks/question-answering.md b/docs/inference-providers/tasks/question-answering.md index 7d72c9ef1..878301057 100644 --- a/docs/inference-providers/tasks/question-answering.md +++ b/docs/inference-providers/tasks/question-answering.md @@ -44,6 +44,11 @@ Explore all available models and find the one that suits you best [here](https:/ #### Request +| Headers | | | +| :--- | :--- | :--- | +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | + + | Payload | | | | :--- | :--- | :--- | | **inputs*** | _object_ | One (context, question) pair to answer | @@ -59,16 +64,6 @@ Explore all available models and find the one that suits you best [here](https:/ | **        align_to_words** | _boolean_ | Attempts to align the answer to real words. Improves quality on space separated languages. Might hurt on non-space-separated languages (like Japanese or Chinese) | -Some options can be configured by passing headers to the Inference API. Here are the available headers: - -| Headers | | | -| :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). 
| -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). | - -For more information about Inference API headers, check out the parameters [guide](../parameters). - #### Response | Body | | diff --git a/docs/inference-providers/tasks/summarization.md b/docs/inference-providers/tasks/summarization.md index 025b7e260..6d3994406 100644 --- a/docs/inference-providers/tasks/summarization.md +++ b/docs/inference-providers/tasks/summarization.md @@ -43,6 +43,11 @@ Explore all available models and find the one that suits you best [here](https:/ #### Request +| Headers | | | +| :--- | :--- | :--- | +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | + + | Payload | | | | :--- | :--- | :--- | | **inputs*** | _string_ | The input text to summarize. | @@ -52,16 +57,6 @@ Explore all available models and find the one that suits you best [here](https:/ | **        generate_parameters** | _object_ | Additional parametrization of the text generation algorithm. | -Some options can be configured by passing headers to the Inference API. Here are the available headers: - -| Headers | | | -| :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). | -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). | - -For more information about Inference API headers, check out the parameters [guide](../parameters). 
- #### Response | Body | | diff --git a/docs/inference-providers/tasks/table-question-answering.md b/docs/inference-providers/tasks/table-question-answering.md index 8e834d0c7..4a13fc72b 100644 --- a/docs/inference-providers/tasks/table-question-answering.md +++ b/docs/inference-providers/tasks/table-question-answering.md @@ -40,6 +40,11 @@ No snippet available for this task. #### Request +| Headers | | | +| :--- | :--- | :--- | +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | + + | Payload | | | | :--- | :--- | :--- | | **inputs*** | _object_ | One (table, question) pair to answer | @@ -51,16 +56,6 @@ No snippet available for this task. | **        truncation** | _boolean_ | Activates and controls truncation. | -Some options can be configured by passing headers to the Inference API. Here are the available headers: - -| Headers | | | -| :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). | -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). | - -For more information about Inference API headers, check out the parameters [guide](../parameters). - #### Response | Body | | diff --git a/docs/inference-providers/tasks/text-classification.md b/docs/inference-providers/tasks/text-classification.md index 5c9e5b4de..4ccb61214 100644 --- a/docs/inference-providers/tasks/text-classification.md +++ b/docs/inference-providers/tasks/text-classification.md @@ -46,6 +46,11 @@ Explore all available models and find the one that suits you best [here](https:/ #### Request +| Headers | | | +| :--- | :--- | :--- | +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | + + | Payload | | | | :--- | :--- | :--- | | **inputs*** | _string_ | The text to classify | @@ -54,16 +59,6 @@ Explore all available models and find the one that suits you best [here](https:/ | **        top_k** | _integer_ | When specified, limits the output to the top K most probable classes. | -Some options can be configured by passing headers to the Inference API. 
Here are the available headers: - -| Headers | | | -| :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). | -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). | - -For more information about Inference API headers, check out the parameters [guide](../parameters). - #### Response | Body | | diff --git a/docs/inference-providers/tasks/text-generation.md b/docs/inference-providers/tasks/text-generation.md index 84f0c1282..a09065657 100644 --- a/docs/inference-providers/tasks/text-generation.md +++ b/docs/inference-providers/tasks/text-generation.md @@ -49,6 +49,11 @@ Explore all available models and find the one that suits you best [here](https:/ #### Request +| Headers | | | +| :--- | :--- | :--- | +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | + + | Payload | | | | :--- | :--- | :--- | | **inputs*** | _string_ | | @@ -81,16 +86,6 @@ Explore all available models and find the one that suits you best [here](https:/ | **stream** | _boolean_ | | -Some options can be configured by passing headers to the Inference API. Here are the available headers: - -| Headers | | | -| :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). | -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). 
| - -For more information about Inference API headers, check out the parameters [guide](../parameters). - #### Response Output type depends on the `stream` input parameter. diff --git a/docs/inference-providers/tasks/text-to-image.md b/docs/inference-providers/tasks/text-to-image.md index b18654994..bddb510b3 100644 --- a/docs/inference-providers/tasks/text-to-image.md +++ b/docs/inference-providers/tasks/text-to-image.md @@ -44,6 +44,11 @@ Explore all available models and find the one that suits you best [here](https:/ #### Request +| Headers | | | +| :--- | :--- | :--- | +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | + + | Payload | | | | :--- | :--- | :--- | | **inputs*** | _string_ | The input text data (sometimes called "prompt") | @@ -57,16 +62,6 @@ Explore all available models and find the one that suits you best [here](https:/ | **        seed** | _integer_ | Seed for the random number generator. | -Some options can be configured by passing headers to the Inference API. Here are the available headers: - -| Headers | | | -| :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). | -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). | - -For more information about Inference API headers, check out the parameters [guide](../parameters). - #### Response | Body | | diff --git a/docs/inference-providers/tasks/token-classification.md b/docs/inference-providers/tasks/token-classification.md index 4c49405fc..9dc801825 100644 --- a/docs/inference-providers/tasks/token-classification.md +++ b/docs/inference-providers/tasks/token-classification.md @@ -45,6 +45,11 @@ Explore all available models and find the one that suits you best [here](https:/ #### Request +| Headers | | | +| :--- | :--- | :--- | +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). 
| + + | Payload | | | | :--- | :--- | :--- | | **inputs*** | _string_ | The input text data | @@ -59,16 +64,6 @@ Explore all available models and find the one that suits you best [here](https:/ | **                 (#5)** | _'max'_ | Similar to "simple", also preserves word integrity (uses the label with the highest score across the word's tokens). | -Some options can be configured by passing headers to the Inference API. Here are the available headers: - -| Headers | | | -| :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). | -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). | - -For more information about Inference API headers, check out the parameters [guide](../parameters). - #### Response | Body | | diff --git a/docs/inference-providers/tasks/translation.md b/docs/inference-providers/tasks/translation.md index 261f13bca..62bcc4ecf 100644 --- a/docs/inference-providers/tasks/translation.md +++ b/docs/inference-providers/tasks/translation.md @@ -43,6 +43,11 @@ Explore all available models and find the one that suits you best [here](https:/ #### Request +| Headers | | | +| :--- | :--- | :--- | +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | + + | Payload | | | | :--- | :--- | :--- | | **inputs*** | _string_ | The text to translate. | @@ -54,16 +59,6 @@ Explore all available models and find the one that suits you best [here](https:/ | **        generate_parameters** | _object_ | Additional parametrization of the text generation algorithm. | -Some options can be configured by passing headers to the Inference API. Here are the available headers: - -| Headers | | | -| :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). | -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). 
However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). | - -For more information about Inference API headers, check out the parameters [guide](../parameters). - #### Response | Body | | diff --git a/docs/inference-providers/tasks/zero-shot-classification.md b/docs/inference-providers/tasks/zero-shot-classification.md index 12eaca422..1c57edfb9 100644 --- a/docs/inference-providers/tasks/zero-shot-classification.md +++ b/docs/inference-providers/tasks/zero-shot-classification.md @@ -42,6 +42,11 @@ Explore all available models and find the one that suits you best [here](https:/ #### Request +| Headers | | | +| :--- | :--- | :--- | +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). | + + | Payload | | | | :--- | :--- | :--- | | **inputs*** | _string_ | The text to classify | @@ -51,16 +56,6 @@ Explore all available models and find the one that suits you best [here](https:/ | **        multi_label** | _boolean_ | Whether multiple candidate labels can be true. If false, the scores are normalized such that the sum of the label likelihoods for each sequence is 1. If true, the labels are considered independent and probabilities are normalized for each candidate. | -Some options can be configured by passing headers to the Inference API. Here are the available headers: - -| Headers | | | -| :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). | -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). | - -For more information about Inference API headers, check out the parameters [guide](../parameters). 
- #### Response | Body | | diff --git a/scripts/inference-providers/templates/common/specs-headers.handlebars b/scripts/inference-providers/templates/common/specs-headers.handlebars index 32b6e9d94..e8610d66d 100644 --- a/scripts/inference-providers/templates/common/specs-headers.handlebars +++ b/scripts/inference-providers/templates/common/specs-headers.handlebars @@ -1,9 +1,3 @@ -Some options can be configured by passing headers to the Inference API. Here are the available headers: - | Headers | | | | :--- | :--- | :--- | -| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with Inference API permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens). | -| **x-use-cache** | _boolean, default to `true`_ | There is a cache layer on the inference API to speed up requests we have already seen. Most models can use those results as they are deterministic (meaning the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this parameter to prevent the caching mechanism from being used, resulting in a real new query. Read more about caching [here](../parameters#caching]). | -| **x-wait-for-model** | _boolean, default to `false`_ | If the model is not ready, wait for it instead of receiving 503. It limits the number of requests required to get your inference done. It is advised to only set this flag to true after receiving a 503 error, as it will limit hanging in your application to known places. Read more about model availability [here](../overview#eligibility]). | - -For more information about Inference API headers, check out the parameters [guide](../parameters). \ No newline at end of file +| **authorization** | _string_ | Authentication header in the form `'Bearer: hf_****'` when `hf_****` is a personal user access token with "Inference Providers" permission. You can generate one from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). 
| diff --git a/scripts/inference-providers/templates/task/audio-classification.handlebars b/scripts/inference-providers/templates/task/audio-classification.handlebars index 30a153ced..11583f054 100644 --- a/scripts/inference-providers/templates/task/audio-classification.handlebars +++ b/scripts/inference-providers/templates/task/audio-classification.handlebars @@ -25,10 +25,10 @@ Example applications: #### Request -{{{specs.audio-classification.input}}} - {{{constants.specsHeaders}}} +{{{specs.audio-classification.input}}} + #### Response {{{specs.audio-classification.output}}} diff --git a/scripts/inference-providers/templates/task/automatic-speech-recognition.handlebars b/scripts/inference-providers/templates/task/automatic-speech-recognition.handlebars index 7836c11b9..7dc26c861 100644 --- a/scripts/inference-providers/templates/task/automatic-speech-recognition.handlebars +++ b/scripts/inference-providers/templates/task/automatic-speech-recognition.handlebars @@ -25,10 +25,10 @@ Example applications: #### Request -{{{specs.automatic-speech-recognition.input}}} - {{{constants.specsHeaders}}} +{{{specs.automatic-speech-recognition.input}}} + #### Response {{{specs.automatic-speech-recognition.output}}} diff --git a/scripts/inference-providers/templates/task/chat-completion.handlebars b/scripts/inference-providers/templates/task/chat-completion.handlebars index 01512b081..c9d5c7e81 100644 --- a/scripts/inference-providers/templates/task/chat-completion.handlebars +++ b/scripts/inference-providers/templates/task/chat-completion.handlebars @@ -52,10 +52,10 @@ The API supports: #### Request -{{{specs.chat-completion.input}}} - {{{constants.specsHeaders}}} +{{{specs.chat-completion.input}}} + #### Response Output type depends on the `stream` input parameter. 
diff --git a/scripts/inference-providers/templates/task/feature-extraction.handlebars b/scripts/inference-providers/templates/task/feature-extraction.handlebars index e31c2ecdf..e067ad826 100644 --- a/scripts/inference-providers/templates/task/feature-extraction.handlebars +++ b/scripts/inference-providers/templates/task/feature-extraction.handlebars @@ -25,10 +25,10 @@ Example applications: #### Request -{{{specs.feature-extraction.input}}} - {{{constants.specsHeaders}}} +{{{specs.feature-extraction.input}}} + #### Response {{{specs.feature-extraction.output}}} diff --git a/scripts/inference-providers/templates/task/fill-mask.handlebars b/scripts/inference-providers/templates/task/fill-mask.handlebars index 0784ce319..5f0029e5e 100644 --- a/scripts/inference-providers/templates/task/fill-mask.handlebars +++ b/scripts/inference-providers/templates/task/fill-mask.handlebars @@ -20,10 +20,10 @@ Mask filling is the task of predicting the right word (token to be precise) in t #### Request -{{{specs.fill-mask.input}}} - {{{constants.specsHeaders}}} +{{{specs.fill-mask.input}}} + #### Response {{{specs.fill-mask.output}}} diff --git a/scripts/inference-providers/templates/task/image-classification.handlebars b/scripts/inference-providers/templates/task/image-classification.handlebars index ad37828b1..b7bcd3e7a 100644 --- a/scripts/inference-providers/templates/task/image-classification.handlebars +++ b/scripts/inference-providers/templates/task/image-classification.handlebars @@ -20,10 +20,10 @@ Image classification is the task of assigning a label or class to an entire imag #### Request -{{{specs.image-classification.input}}} - {{{constants.specsHeaders}}} +{{{specs.image-classification.input}}} + #### Response {{{specs.image-classification.output}}} diff --git a/scripts/inference-providers/templates/task/image-segmentation.handlebars b/scripts/inference-providers/templates/task/image-segmentation.handlebars index 8e27797e2..5eefc78a7 100644 --- a/scripts/inference-providers/templates/task/image-segmentation.handlebars +++ b/scripts/inference-providers/templates/task/image-segmentation.handlebars @@ -20,10 +20,10 @@ Image Segmentation divides an image into segments where each pixel in the image #### Request -{{{specs.image-segmentation.input}}} - {{{constants.specsHeaders}}} +{{{specs.image-segmentation.input}}} + #### Response {{{specs.image-segmentation.output}}} diff --git a/scripts/inference-providers/templates/task/image-to-image.handlebars b/scripts/inference-providers/templates/task/image-to-image.handlebars index 2d9ae5bfd..3ef632b1a 100644 --- a/scripts/inference-providers/templates/task/image-to-image.handlebars +++ b/scripts/inference-providers/templates/task/image-to-image.handlebars @@ -25,10 +25,10 @@ Example applications: #### Request -{{{specs.image-to-image.input}}} - {{{constants.specsHeaders}}} +{{{specs.image-to-image.input}}} + #### Response {{{specs.image-to-image.output}}} diff --git a/scripts/inference-providers/templates/task/object-detection.handlebars b/scripts/inference-providers/templates/task/object-detection.handlebars index 8d4a7ad22..0c21990ea 100644 --- a/scripts/inference-providers/templates/task/object-detection.handlebars +++ b/scripts/inference-providers/templates/task/object-detection.handlebars @@ -20,10 +20,10 @@ Object Detection models allow users to identify objects of certain defined class #### Request -{{{specs.object-detection.input}}} - {{{constants.specsHeaders}}} +{{{specs.object-detection.input}}} + #### Response 
{{{specs.object-detection.output}}} diff --git a/scripts/inference-providers/templates/task/question-answering.handlebars b/scripts/inference-providers/templates/task/question-answering.handlebars index 39b77a601..579b35e2d 100644 --- a/scripts/inference-providers/templates/task/question-answering.handlebars +++ b/scripts/inference-providers/templates/task/question-answering.handlebars @@ -20,10 +20,10 @@ Question Answering models can retrieve the answer to a question from a given tex #### Request -{{{specs.question-answering.input}}} - {{{constants.specsHeaders}}} +{{{specs.question-answering.input}}} + #### Response {{{specs.question-answering.output}}} diff --git a/scripts/inference-providers/templates/task/summarization.handlebars b/scripts/inference-providers/templates/task/summarization.handlebars index 2df7ab361..596ef97e8 100644 --- a/scripts/inference-providers/templates/task/summarization.handlebars +++ b/scripts/inference-providers/templates/task/summarization.handlebars @@ -20,10 +20,10 @@ Summarization is the task of producing a shorter version of a document while pre #### Request -{{{specs.summarization.input}}} - {{{constants.specsHeaders}}} +{{{specs.summarization.input}}} + #### Response {{{specs.summarization.output}}} diff --git a/scripts/inference-providers/templates/task/table-question-answering.handlebars b/scripts/inference-providers/templates/task/table-question-answering.handlebars index d72fbfa69..78ecb3bdf 100644 --- a/scripts/inference-providers/templates/task/table-question-answering.handlebars +++ b/scripts/inference-providers/templates/task/table-question-answering.handlebars @@ -20,10 +20,10 @@ Table Question Answering (Table QA) is the answering a question about an informa #### Request -{{{specs.table-question-answering.input}}} - {{{constants.specsHeaders}}} +{{{specs.table-question-answering.input}}} + #### Response {{{specs.table-question-answering.output}}} diff --git a/scripts/inference-providers/templates/task/text-classification.handlebars b/scripts/inference-providers/templates/task/text-classification.handlebars index 6bf151da5..c5c34d5ba 100644 --- a/scripts/inference-providers/templates/task/text-classification.handlebars +++ b/scripts/inference-providers/templates/task/text-classification.handlebars @@ -20,10 +20,10 @@ Text Classification is the task of assigning a label or class to a given text. S #### Request -{{{specs.text-classification.input}}} - {{{constants.specsHeaders}}} +{{{specs.text-classification.input}}} + #### Response {{{specs.text-classification.output}}} diff --git a/scripts/inference-providers/templates/task/text-generation.handlebars b/scripts/inference-providers/templates/task/text-generation.handlebars index a67cb55e6..96fc9a89d 100644 --- a/scripts/inference-providers/templates/task/text-generation.handlebars +++ b/scripts/inference-providers/templates/task/text-generation.handlebars @@ -22,10 +22,10 @@ If you are interested in a Chat Completion task, which generates a response base #### Request -{{{specs.text-generation.input}}} - {{{constants.specsHeaders}}} +{{{specs.text-generation.input}}} + #### Response Output type depends on the `stream` input parameter. 
diff --git a/scripts/inference-providers/templates/task/text-to-image.handlebars b/scripts/inference-providers/templates/task/text-to-image.handlebars index 58634c514..8750ff5e9 100644 --- a/scripts/inference-providers/templates/task/text-to-image.handlebars +++ b/scripts/inference-providers/templates/task/text-to-image.handlebars @@ -20,10 +20,10 @@ Generate an image based on a given text prompt. #### Request -{{{specs.text-to-image.input}}} - {{{constants.specsHeaders}}} +{{{specs.text-to-image.input}}} + #### Response {{{specs.text-to-image.output}}} diff --git a/scripts/inference-providers/templates/task/token-classification.handlebars b/scripts/inference-providers/templates/task/token-classification.handlebars index a342c4bb2..461a2c26b 100644 --- a/scripts/inference-providers/templates/task/token-classification.handlebars +++ b/scripts/inference-providers/templates/task/token-classification.handlebars @@ -20,10 +20,10 @@ Token classification is a task in which a label is assigned to some tokens in a #### Request -{{{specs.token-classification.input}}} - {{{constants.specsHeaders}}} +{{{specs.token-classification.input}}} + #### Response {{{specs.token-classification.output}}} diff --git a/scripts/inference-providers/templates/task/translation.handlebars b/scripts/inference-providers/templates/task/translation.handlebars index a161a3f9e..2bd302d15 100644 --- a/scripts/inference-providers/templates/task/translation.handlebars +++ b/scripts/inference-providers/templates/task/translation.handlebars @@ -20,10 +20,10 @@ Translation is the task of converting text from one language to another. #### Request -{{{specs.translation.input}}} - {{{constants.specsHeaders}}} +{{{specs.translation.input}}} + #### Response {{{specs.translation.output}}} diff --git a/scripts/inference-providers/templates/task/zero-shot-classification.handlebars b/scripts/inference-providers/templates/task/zero-shot-classification.handlebars index 4025e3726..c49db4117 100644 --- a/scripts/inference-providers/templates/task/zero-shot-classification.handlebars +++ b/scripts/inference-providers/templates/task/zero-shot-classification.handlebars @@ -20,10 +20,10 @@ Zero-shot text classification is super useful to try out classification with zer #### Request -{{{specs.zero-shot-classification.input}}} - {{{constants.specsHeaders}}} +{{{specs.zero-shot-classification.input}}} + #### Response {{{specs.zero-shot-classification.output}}}
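For context, the `authorization` header documented in the added table rows is the standard HTTP bearer scheme: the table cell writes it as `'Bearer: hf_****'`, but on the wire the request carries `Authorization: Bearer hf_****`. Below is a minimal sketch of calling one of these task endpoints with a fine-grained token holding the "Inference Providers" permission; the endpoint URL, model ID, and file name are placeholders for illustration and are not part of this patch.

```python
import requests

# Placeholder endpoint and model; substitute the task endpoint you are actually calling.
API_URL = "https://api-inference.huggingface.co/models/facebook/wav2vec2-base-960h"

# Fine-grained user access token with the "Inference Providers" permission,
# generated from https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained
headers = {"Authorization": "Bearer hf_xxx"}

# Per the payload tables above, audio/image tasks accept raw bytes directly
# when no `parameters` field is supplied.
with open("sample.flac", "rb") as f:
    audio_bytes = f.read()

response = requests.post(API_URL, headers=headers, data=audio_bytes)
response.raise_for_status()
print(response.json())
```

When a `parameters` object is needed, the same tables indicate the input data is instead sent base64-encoded inside a JSON body alongside those parameters.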