[Feature] Support gemini thinking experimental model #857

Open
narengogi opened this issue Jan 6, 2025 · 0 comments
Labels
enhancement New feature or request triage

narengogi commented Jan 6, 2025

What Would You Like to See with the Gateway?

  • This support covers non-streaming mode only.
  • Two cases:
  1. With x-portkey-strict-open-ai-compliance: false, the Chain of Thought (CoT) message is included in the response content, separated from the answer by \r\n\r\n
    Example:
{
    "id": "portkey-2b003c25-7379-4620-a6f0-0b0ba3bc5d51",
    "object": "chat_completion",
    "created": 1736168989,
    "model": "Unknown",
    "provider": "vertex-ai",
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "The user is continuing the pattern of simple arithmetic questions, building upon the previous answer.  They are asking for the product of 4 multiplied by itself. I need to perform this multiplication.\r\n\r\nFollowing that same line of thought, 4 * 4 = 16\n"
            },
            "index": 0,
            "finish_reason": "STOP"
        }
    ],
    "usage": {
        "prompt_tokens": 57,
        "completion_tokens": 55,
        "total_tokens": 112
    }
}
  2. With x-portkey-strict-open-ai-compliance: true, only the answer is sent, not the CoT message
{
    "id": "portkey-099083cc-2ced-4d5d-8e87-a133d1ac51c2",
    "object": "chat_completion",
    "created": 1736168868,
    "model": "Unknown",
    "provider": "vertex-ai",
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "Following that line of thought, 4 * 4 = 16\n"
            },
            "index": 0,
            "finish_reason": "STOP"
        }
    ],
    "usage": {
        "prompt_tokens": 57,
        "completion_tokens": 69,
        "total_tokens": 126
    }
}
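
A minimal sketch of how the two non-streaming cases could be handled. It assumes the Gemini response carries the thought and the answer as separate text parts, with the answer last. The `TextPart` type, `buildContent` function, and `strictOpenAiCompliance` parameter are hypothetical names for illustration, not the gateway's actual code.

```typescript
// Hypothetical shape of one text part in a Gemini candidate (assumption).
interface TextPart {
  text: string;
}

// Separator placed between the CoT message and the answer in non-strict mode.
const COT_SEPARATOR = "\r\n\r\n";

// Assumes parts are ordered as [thought..., answer], with the answer last.
function buildContent(parts: TextPart[], strictOpenAiCompliance: boolean): string {
  if (parts.length === 0) return "";
  if (strictOpenAiCompliance) {
    // Strict mode: drop the chain of thought and keep only the final answer.
    return parts[parts.length - 1].text;
  }
  // Non-strict mode: keep every part, joined by the separator.
  return parts.map((p) => p.text).join(COT_SEPARATOR);
}

const parts: TextPart[] = [
  { text: "I need to perform this multiplication." },
  { text: "Following that line of thought, 4 * 4 = 16\n" },
];

console.log(JSON.stringify(buildContent(parts, false)));
console.log(JSON.stringify(buildContent(parts, true)));
```

A client that receives the non-strict form can recover the two pieces by splitting the content on the first `\r\n\r\n`.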

For streaming, both messages are sent together without any separator.

edit:
Reason for not transforming streaming messages.

These are the chunks sent by Gemini across two streaming runs of the same prompt (the first three arrays are one run, the last three are the other):

[
  {
    text: 'The user is following a pattern of basic arithmetic questions. They previously asked for the'
  }
]
[ { text: ' sum of 2+2, and now they are asking for the product of' } ]
[
  { text: ' 4*4.  I need to perform this multiplication.' },
  { text: 'Following that line of thought, 4 * 4 = 16\n' }
]
[
  {
    text: 'The user is continuing with basic arithmetic questions, following the pattern of the previous question'
  }
]
[ { text: '. I need to calculate the result of 4 multiplied by 4.' } ]
[ { text: 'Following that same line of thought:\n\n4 * 4 = 16\n' } ]

The text parts carry no index, so when the thought and the answer arrive in separate chunks there is no way to tell where one ends and the other begins, and the separator cannot be added.
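
To illustrate, here is a small sketch that replays the chunks from the two runs through a naive per-chunk transform. The `transformChunk` and `streamWithNaiveSeparator` helpers are hypothetical. The only boundary signal such a transform has is two parts landing in the same chunk, which happens in the first run but not in the second.

```typescript
interface TextPart {
  text: string;
}
type Chunk = TextPart[];

const SEPARATOR = "\r\n\r\n";

// First run: the thought's tail and the answer share the last chunk.
const runA: Chunk[] = [
  [{ text: "The user is following a pattern of basic arithmetic questions. They previously asked for the" }],
  [{ text: " sum of 2+2, and now they are asking for the product of" }],
  [
    { text: " 4*4.  I need to perform this multiplication." },
    { text: "Following that line of thought, 4 * 4 = 16\n" },
  ],
];

// Second run: the answer arrives alone in its own chunk.
const runB: Chunk[] = [
  [{ text: "The user is continuing with basic arithmetic questions, following the pattern of the previous question" }],
  [{ text: ". I need to calculate the result of 4 multiplied by 4." }],
  [{ text: "Following that same line of thought:\n\n4 * 4 = 16\n" }],
];

// Naive transform: insert the separator only between parts of the same chunk.
function transformChunk(chunk: Chunk): string {
  return chunk.map((p) => p.text).join(SEPARATOR);
}

function streamWithNaiveSeparator(chunks: Chunk[]): string {
  return chunks.map(transformChunk).join("");
}

const outA = streamWithNaiveSeparator(runA);
const outB = streamWithNaiveSeparator(runB);

console.log(outA.includes(SEPARATOR)); // true: boundary was visible inside one chunk
console.log(outB.includes(SEPARATOR)); // false: boundary fell between chunks
```

Because the result depends on how Gemini happens to split the stream, the output would be inconsistent between runs, which is why streaming chunks are passed through without a separator.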

@narengogi narengogi added the enhancement New feature or request label Jan 6, 2025
@github-actions github-actions bot added the triage label Jan 6, 2025