refactor/process also groq reasoning models properly #12


Merged
merged 3 commits into from
Apr 30, 2025
63 changes: 52 additions & 11 deletions openapi.yaml
@@ -67,43 +67,51 @@ paths:
value:
object: "list"
data:
- id: "gpt-4o"
- id: "openai/gpt-4o"
object: "model"
created: 1686935002
owned_by: "openai"
- id: "llama-3.3-70b-versatile"
served_by: "openai"
- id: "openai/llama-3.3-70b-versatile"
object: "model"
created: 1723651281
owned_by: "groq"
- id: "claude-3-opus-20240229"
served_by: "groq"
- id: "cohere/claude-3-opus-20240229"
object: "model"
created: 1708905600
owned_by: "anthropic"
- id: "command-r"
served_by: "anthropic"
- id: "cohere/command-r"
object: "model"
created: 1707868800
owned_by: "cohere"
- id: "phi3:3.8b"
served_by: "cohere"
- id: "ollama/phi3:3.8b"
object: "model"
created: 1718441600
owned_by: "ollama"
served_by: "ollama"
singleProvider:
summary: Models from a specific provider
value:
object: "list"
data:
- id: "gpt-4o"
- id: "openai/gpt-4o"
object: "model"
created: 1686935002
owned_by: "openai"
- id: "gpt-4-turbo"
served_by: "openai"
- id: "openai/gpt-4-turbo"
object: "model"
created: 1687882410
owned_by: "openai"
- id: "gpt-3.5-turbo"
served_by: "openai"
- id: "openai/gpt-3.5-turbo"
object: "model"
created: 1677649963
owned_by: "openai"
served_by: "openai"
"401":
$ref: "#/components/responses/Unauthorized"
"500":
@@ -562,6 +570,9 @@ components:
type: string
chat:
type: string
required:
- models
- chat
Error:
type: object
properties:
@@ -589,10 +600,12 @@ components:
$ref: "#/components/schemas/ChatCompletionMessageToolCall"
tool_call_id:
type: string
reasoning:
type: string
reasoning_content:
type: string
description: The reasoning content of the chunk message.
reasoning:
type: string
description: The reasoning of the chunk message. Same as reasoning_content.
required:
- role
- content
@@ -611,6 +624,12 @@ components:
type: string
served_by:
$ref: "#/components/schemas/Provider"
required:
- id
- object
- created
- owned_by
- served_by
ListModelsResponse:
type: object
description: Response structure for listing models
@@ -717,7 +736,8 @@ components:
usage statistics for the entire request, and the `choices` field
will always be an empty array. All other chunks will also include a
`usage` field, but with a null value.
default: true
required:
- include_usage
CreateChatCompletionRequest:
type: object
properties:
@@ -754,6 +774,14 @@ components:
are supported.
items:
$ref: "#/components/schemas/ChatCompletionTool"
reasoning_format:
type: string
description: >
The format of the reasoning content. Can be `raw` or `parsed`.

When specified as `raw`, some reasoning models will output `<think />` tags.
When specified as `parsed`, the model will output the reasoning under the
`reasoning` or `reasoning_content` attribute.
required:
- model
- messages
@@ -899,6 +927,9 @@ components:
reasoning_content:
type: string
description: The reasoning content of the chunk message.
reasoning:
type: string
description: The reasoning of the chunk message. Same as reasoning_content.
tool_calls:
type: array
items:
@@ -908,6 +939,9 @@ components:
refusal:
type: string
description: The refusal message generated by the model.
required:
- content
- role
ChatCompletionMessageToolCallChunk:
type: object
properties:
@@ -1040,6 +1074,13 @@ components:
description: The object type, which is always `chat.completion.chunk`.
usage:
$ref: "#/components/schemas/CompletionUsage"
reasoning_format:
type: string
description: >
The format of the reasoning content. Can be `raw` or `parsed`.

When specified as `raw`, some reasoning models will output `<think />` tags.
When specified as `parsed`, the model will output the reasoning under `reasoning_content`.
required:
- choices
- created
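Taken together, the openapi.yaml changes do two things: the model examples now use provider-prefixed ids (for example openai/gpt-4o) plus a served_by field naming the upstream provider, and chat completion requests gain a reasoning_format option whose parsed mode surfaces the model's reasoning under a reasoning or reasoning_content attribute instead of inline <think /> tags. A minimal TypeScript sketch of a non-streaming request against the updated schema follows; the package name, client options, createChatCompletion method, and example model id are assumptions for illustration, not details confirmed by this diff.

// Sketch only: package name, client options, and method name are assumed;
// the request and response fields follow the schema in openapi.yaml.
import { InferenceGatewayClient } from '@inference-gateway/sdk';

async function main() {
  const client = new InferenceGatewayClient({
    baseURL: 'http://localhost:8080/v1', // hypothetical gateway endpoint
  });

  const response = await client.createChatCompletion({
    model: 'groq/deepseek-r1-distill-llama-70b', // hypothetical provider-prefixed id
    messages: [{ role: 'user', content: 'Why is the sky blue?' }],
    // 'raw' would leave <think /> tags in the content; 'parsed' moves the
    // reasoning into a dedicated attribute.
    reasoning_format: 'parsed',
  });

  const message = response.choices[0].message;
  // With parsed reasoning, either field name may be populated depending on the provider.
  console.log('reasoning:', message.reasoning ?? message.reasoning_content);
  console.log('answer:', message.content);
}

main().catch(console.error);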
5 changes: 5 additions & 0 deletions src/client.ts
@@ -281,6 +281,11 @@ export class InferenceGatewayClient {
callbacks.onReasoning?.(reasoning_content);
}

const reasoning = chunk.choices[0]?.delta?.reasoning;
if (reasoning !== undefined) {
callbacks.onReasoning?.(reasoning);
}

const content = chunk.choices[0]?.delta?.content;
if (content) {
callbacks.onContent?.(content);
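The client change mirrors the existing reasoning_content handling for the new reasoning field, so whichever of the two a given provider populates ends up in the same onReasoning callback; note that the two checks are independent, so a chunk that carried both fields would invoke onReasoning twice. A sketch of consuming the stream follows, assuming a streamChatCompletion(request, callbacks) method; the method name and callback object shape are inferred from the callbacks.onReasoning and callbacks.onContent calls visible above, and the model id is hypothetical.

// Sketch: method name, client options, and callback shape are assumptions
// inferred from this diff rather than the SDK's documented API.
import { InferenceGatewayClient } from '@inference-gateway/sdk';

async function streamExample() {
  const client = new InferenceGatewayClient({
    baseURL: 'http://localhost:8080/v1', // hypothetical gateway endpoint
  });

  await client.streamChatCompletion(
    {
      model: 'groq/deepseek-r1-distill-llama-70b', // hypothetical example id
      messages: [{ role: 'user', content: 'Plan a three-day trip to Lisbon.' }],
      reasoning_format: 'parsed',
    },
    {
      // Receives both delta.reasoning and delta.reasoning_content tokens.
      onReasoning: (reasoning: string) => process.stdout.write(reasoning),
      // Receives ordinary content tokens.
      onContent: (content: string) => process.stdout.write(content),
    }
  );
}

streamExample().catch(console.error);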
39 changes: 24 additions & 15 deletions src/types/generated/index.ts

Some generated files are not rendered by default.

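The regenerated types are collapsed in this view, but based on the schema changes above, the streaming delta type presumably gains an optional reasoning field alongside reasoning_content, roughly along the lines of the sketch below; the interface name and optionality markers are guesses, since the actual generated file is not shown here.

// Assumed shape of the regenerated delta type. The field names follow the
// ChatCompletionStreamResponseDelta schema in openapi.yaml; the interface name
// and which fields are optional are guesses, as the generated file is hidden.
export interface ChatCompletionStreamResponseDelta {
  role: string;
  content: string;
  reasoning_content?: string;
  // Same as reasoning_content; some providers (for example Groq) use this name instead.
  reasoning?: string;
  refusal?: string;
  // tool_calls and other fields omitted from this sketch.
}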