feat: support choosing chatCompletionV2 or chatCompletionPro API for minimax provider #1593

hanxiantao · 2024-12-14T06:21:41Z

Ⅰ. Describe what this PR did

The minimax provider now supports choosing between the chatCompletionV2 and chatCompletionPro APIs.
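As a rough illustration of the selection this PR introduces, the sketch below picks an upstream request path from the provider configuration. The `minimaxApiType` and `minimaxGroupId` keys appear in the configurations in this PR; the function name, the two path constants, the `GroupId` query parameter, and the `"v2"` default value are assumptions made for illustration and are not taken from the plugin's code.

```python
from urllib.parse import urlencode

# Assumed upstream paths; verify them against the MiniMax API documentation.
V2_PATH = "/v1/text/chatcompletion_v2"
PRO_PATH = "/v1/text/chatcompletion_pro"


def select_minimax_path(provider: dict) -> str:
    """Return the upstream path for a provider config (hypothetical helper)."""
    # Assumption: when minimaxApiType is absent, the V2 API is used.
    api_type = provider.get("minimaxApiType", "v2")
    if api_type == "pro":
        group_id = provider.get("minimaxGroupId")
        if not group_id:
            raise ValueError("minimaxGroupId is required when minimaxApiType is 'pro'")
        # Assumption: the Pro API takes the group ID as a query parameter.
        return f"{PRO_PATH}?{urlencode({'GroupId': group_id})}"
    if api_type == "v2":
        return V2_PATH
    raise ValueError(f"unsupported minimaxApiType: {api_type!r}")
```

Failing fast on a missing group ID is a design choice of this sketch; the plugin may validate its configuration differently.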

Ⅱ. Does this pull request fix one issue?

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

docker-compose.yaml

version: '3.7'
services:
  envoy:
    image: higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway:1.4.0-rc.1
    entrypoint: /usr/local/bin/envoy
    # Note: debug-level logging is enabled for wasm here; production deployments default to info level
    command: -c /etc/envoy/envoy.yaml --component-log-level wasm:debug
    depends_on:
      - httpbin
    networks:
      - wasmtest
    ports:
      - "10000:10000"
    volumes:
      - ./envoy.yaml:/etc/envoy/envoy.yaml
      - ./plugin.wasm:/etc/envoy/plugin.wasm
  httpbin:
    image: kennethreitz/httpbin:latest
    networks:
      - wasmtest
    ports:
      - "12345:80"
networks:
  wasmtest: {}

Proxying the minimax chat completion V2 API using the OpenAI protocol

envoy.yaml

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: minimax
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/plugin.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                  "provider": {
                                    "type": "minimax",
                                    "apiTokens": [
                                      "YOUR_MINIMAX_API_TOKEN"
                                    ],
                                    "modelMapping": {
                                      "gpt-3": "abab6.5s-chat",
                                      "gpt-4": "abab6.5g-chat",
                                      "*": "abab6.5t-chat"
                                    },
                                    "protocol": "openai"
                                  }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: httpbin
      connect_timeout: 30s
      type: LOGICAL_DNS
      # Comment out the following line to test on v6 networks
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: httpbin
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: httpbin
                      port_value: 80
    - name: minimax
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: minimax
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: api.minimax.chat
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "api.minimax.chat"

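The `modelMapping` entry in the configuration above translates the model name in the incoming OpenAI-style request into a MiniMax model name. A minimal sketch of that lookup, assuming an exact match is tried first and `*` acts as a catch-all; the function name is hypothetical, passing an unmapped name through unchanged is an assumption, and the real plugin may support additional matching rules:

```python
def map_model(requested: str, model_mapping: dict) -> str:
    """Resolve a requested model name through a modelMapping table (hypothetical helper)."""
    # Exact matches take priority over the wildcard entry.
    if requested in model_mapping:
        return model_mapping[requested]
    # Assumption: "*" is a catch-all; with no wildcard the name passes through unchanged.
    return model_mapping.get("*", requested)


mapping = {
    "gpt-3": "abab6.5s-chat",
    "gpt-4": "abab6.5g-chat",
    "*": "abab6.5t-chat",
}
resolved = map_model("gpt-3", mapping)
```

Under these assumptions `gpt-3` resolves to `abab6.5s-chat`, which is the `model` value reported in the recorded responses.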
Non-streaming request

curl --location --request POST 'http://localhost:10000/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data-raw '{
    "model": "gpt-3",
    "messages": [
        {
            "role": "user",
            "content": "你好，你是谁？"
        }
    ],
    "stream": false
}'

Response:

{
    "id": "03ac4fcfe1c6cc9c6a60f9d12046e2b4",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "你好，我是一个由MiniMax公司研发的大型语言模型，名为MM智能助理。我可以帮助回答问题、提供信息、进行对话和执行多种语言处理任务。如果你有任何问题或需要帮助，请随时告诉我！",
                "role": "assistant",
                "name": "MM智能助理",
                "audio_content": ""
            }
        }
    ],
    "created": 1734155471,
    "model": "abab6.5s-chat",
    "object": "chat.completion",
    "usage": {
        "total_tokens": 116,
        "total_characters": 0,
        "prompt_tokens": 70,
        "completion_tokens": 46
    },
    "input_sensitive": false,
    "output_sensitive": false,
    "input_sensitive_type": 0,
    "output_sensitive_type": 0,
    "output_sensitive_int": 0,
    "base_resp": {
        "status_code": 0,
        "status_msg": ""
    }
}

Streaming request

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "model": "gpt-3",
    "messages": [
        {
            "role": "user",
            "content": "你好，你是谁？"
        }
    ],
    "stream": true
}'

Response:

data: {"id":"03ac5052e591061fb87772c64fc728d1","choices":[{"index":0,"delta":{"content":"你好","role":"assistant","name":"MM智能助理","audio_content":""}}],"created":1734155602,"model":"abab6.5s-chat","object":"chat.completion.chunk","usage":{"total_tokens":0,"total_characters":0},"input_sensitive":false,"output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0,"output_sensitive_int":0}

data: {"id":"03ac5052e591061fb87772c64fc728d1","choices":[{"index":0,"delta":{"content":"，我是一个由MiniMax公司研发的大型语言模型，名为MM智能助理。我可以帮助","role":"assistant","name":"MM智能助理","audio_content":""}}],"created":1734155602,"model":"abab6.5s-chat","object":"chat.completion.chunk","usage":{"total_tokens":0,"total_characters":0},"input_sensitive":false,"output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0,"output_sensitive_int":0}

data: {"id":"03ac5052e591061fb87772c64fc728d1","choices":[{"index":0,"delta":{"content":"回答问题、提供信息、进行对话和执行多种语言处理任务。如果你有任何问题或需要","role":"assistant","name":"MM智能助理","audio_content":""}}],"created":1734155602,"model":"abab6.5s-chat","object":"chat.completion.chunk","usage":{"total_tokens":0,"total_characters":0},"input_sensitive":false,"output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0,"output_sensitive_int":0}

data: {"id":"03ac5052e591061fb87772c64fc728d1","choices":[{"finish_reason":"stop","index":0,"delta":{"content":"帮助，请随时告诉我！","role":"assistant","name":"MM智能助理","audio_content":""}}],"created":1734155602,"model":"abab6.5s-chat","object":"chat.completion.chunk","usage":{"total_tokens":0,"total_characters":0},"input_sensitive":false,"output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0,"output_sensitive_int":0}

data: {"id":"03ac5052e591061fb87772c64fc728d1","choices":[{"finish_reason":"stop","index":0,"message":{"content":"你好，我是一个由MiniMax公司研发的大型语言模型，名为MM智能助理。我可以帮助回答问题、提供信息、进行对话和执行多种语言处理任务。如果你有任何问题或需要帮助，请随时告诉我！","role":"assistant","name":"MM智能助理","audio_content":""}}],"created":1734155602,"model":"abab6.5s-chat","object":"chat.completion","usage":{"total_tokens":116,"total_characters":0,"prompt_tokens":70,"completion_tokens":46},"input_sensitive":false,"output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0,"output_sensitive_int":0,"base_resp":{"status_code":0,"status_msg":""}}
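In the streamed response above, each `data:` event carries a `delta` fragment, and the last event repeats the whole reply under `message` instead. A client that wants the reply text could therefore concatenate only the `delta` contents. The following is a minimal sketch of that idea; the function name is hypothetical, and the `[DONE]` check is a defensive assumption, since no such sentinel appears in the recorded output.

```python
import json


def collect_stream_text(sse_body: str) -> str:
    """Concatenate the delta contents of OpenAI-style SSE chunks."""
    parts = []
    for line in sse_body.splitlines():
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data event
        payload = line[len("data:"):].strip()
        if not payload or payload == "[DONE]":
            continue
        event = json.loads(payload)
        for choice in event.get("choices", []):
            delta = choice.get("delta")
            if delta:  # the final summary event carries "message" rather than "delta"
                parts.append(delta.get("content", ""))
    return "".join(parts)
```

Skipping the summary event avoids duplicating the reply; a client that only needs the final text could instead read `message.content` from that last event.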

Proxying the minimax chat completion Pro API using the OpenAI protocol

envoy.yaml

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: minimax
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/plugin.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                  "provider": {
                                    "type": "minimax",
                                    "apiTokens": [
                                      "YOUR_MINIMAX_API_TOKEN"
                                    ],
                                    "modelMapping": {
                                      "gpt-3": "abab6.5s-chat",
                                      "gpt-4": "abab6.5g-chat",
                                      "*": "abab6.5t-chat"
                                    },
                                    "protocol": "openai",
                                    "minimaxApiType": "pro",
                                    "minimaxGroupId": "YOUR_MINIMAX_GROUP_ID"
                                  }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: httpbin
      connect_timeout: 30s
      type: LOGICAL_DNS
      # Comment out the following line to test on v6 networks
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: httpbin
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: httpbin
                      port_value: 80
    - name: minimax
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: minimax
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: api.minimax.chat
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "api.minimax.chat"

Non-streaming request

curl --location --request POST 'http://localhost:10000/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data-raw '{
    "model": "gpt-3",
    "messages": [
        {
            "role": "user",
            "content": "你好，你是谁？"
        }
    ],
    "stream": false
}'

Response:

{
    "id": "03ac53b967b6d9aaebf2c43915affae2",
    "choices": [
        {
            "index": 0,
            "message": {
                "name": "MM智能助理",
                "role": "assistant",
                "content": "你好！我是一个基于大型语言模型的虚拟助手，由MiniMax公司研发。我的设计旨在帮助用户解答问题、提供信息以及进行各种语言相关的任务。如果你有任何问题或需要帮助，请随时告诉我！"
            },
            "finish_reason": "stop"
        }
    ],
    "created": 1734156476,
    "model": "abab6.5s-chat",
    "object": "chat.completion",
    "usage": {
        "total_tokens": 115
    }
}

Streaming request

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "model": "gpt-3",
    "messages": [
        {
            "role": "user",
            "content": "你好，你是谁？"
        }
    ],
    "stream": true
}'

Response:

data: {"created":1734156502,"model":"abab6.5s-chat","reply":"","choices":[{"messages":[{"sender_type":"BOT","sender_name":"MM智能助理","text":"你好"}]}],"request_id":"1793266516951581185_1734156502771545"}

data: {"created":1734156503,"model":"abab6.5s-chat","reply":"","choices":[{"messages":[{"sender_type":"BOT","sender_name":"MM智能助理","text":"！我是一个基于大型语言模型的虚拟助手，由MiniMax公司研发"}]}],"request_id":"1793266516951581185_1734156502771545"}

data: {"created":1734156504,"model":"abab6.5s-chat","reply":"","choices":[{"messages":[{"sender_type":"BOT","sender_name":"MM智能助理","text":"。我的设计旨在通过自然语言处理和机器学习技术来理解和生成文本，以便为用户提供信息、解答问题、进行"}]}],"request_id":"1793266516951581185_1734156502771545"}

data: {"created":1734156504,"model":"abab6.5s-chat","reply":"","choices":[{"messages":[{"sender_type":"BOT","sender_name":"MM智能助理","text":"对话等服务。如果你有任何问题或需要帮助，请随时告诉我！"}]}],"request_id":"1793266516951581185_1734156502771545"}

data: {"created":1734156505,"model":"abab6.5s-chat","reply":"你好！我是一个基于大型语言模型的虚拟助手，由MiniMax公司研发。我的设计旨在通过自然语言处理和机器学习技术来理解和生成文本，以便为用户提供信息、解答问题、进行对话等服务。如果你有任何问题或需要帮助，请随时告诉我！","choices":[{"finish_reason":"stop","messages":[{"sender_type":"BOT","sender_name":"MM智能助理","text":"你好！我是一个基于大型语言模型的虚拟助手，由MiniMax公司研发。我的设计旨在通过自然语言处理和机器学习技术来理解和生成文本，以便为用户提供信息、解答问题、进行对话等服务。如果你有任何问题或需要帮助，请随时告诉我！"}]}],"usage":{"total_tokens":127,"prompt_tokens":70,"completion_tokens":57},"input_sensitive":false,"output_sensitive":false,"id":"03ac53d674c74bf062bdf7ff71d2333d","base_resp":{"status_code":0,"status_msg":""}}
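The streamed events recorded above use a different shape from the V2 stream: the text sits under `choices[].messages[].text`, and the last event, marked with `finish_reason`, repeats the full reply. A minimal sketch of reassembling the reply from such events follows; the function name is hypothetical, and the logic relies only on the fields visible in the recorded output.

```python
import json


def collect_pro_stream_text(sse_body: str) -> str:
    """Concatenate incremental text from Pro-style SSE events."""
    parts = []
    for line in sse_body.splitlines():
        if not line.startswith("data:"):
            continue  # skip blank separator lines between events
        payload = line[len("data:"):].strip()
        if not payload:
            continue
        event = json.loads(payload)
        for choice in event.get("choices", []):
            if choice.get("finish_reason"):
                continue  # the final event repeats the full text, so skip it
            for message in choice.get("messages", []):
                parts.append(message.get("text", ""))
    return "".join(parts)
```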

Ⅴ. Special notes for reviews

codecov-commenter · 2024-12-14T06:24:29Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 43.50%. Comparing base (ef31e09) to head (f70b310).
Report is 228 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1593      +/-   ##
==========================================
+ Coverage   35.91%   43.50%   +7.59%     
==========================================
  Files          69       76       +7     
  Lines       11576    12325     +749     
==========================================
+ Hits         4157     5362    +1205     
+ Misses       7104     6627     -477     
- Partials      315      336      +21

see 69 files with indirect coverage changes

CH3CHO

LGTM

Supports choosing chatCompletionV2 or chatCompletionPro API for minim…

f4c8b52

…ax provider

hanxiantao requested review from johnlanni, WeixinX and CH3CHO as code owners December 14, 2024 06:21

hanxiantao changed the title ~~feat:Supports choosing chatCompletionV2 or chatCompletionPro API for minimax provider~~ feat: Support choosing chatCompletionV2 or chatCompletionPro API for minimax provider Dec 14, 2024

Merge branch 'main' into minimax-ai-proxy

f70b310

hanxiantao changed the title ~~feat: Support choosing chatCompletionV2 or chatCompletionPro API for minimax provider~~ feat: support choosing chatCompletionV2 or chatCompletionPro API for minimax provider Dec 14, 2024

CH3CHO approved these changes Dec 15, 2024

CH3CHO merged commit 8544fa6 into alibaba:main Dec 15, 2024
13 checks passed

yunmaoQu pushed a commit to yunmaoQu/higress that referenced this pull request Dec 25, 2024

feat: support choosing chatCompletionV2 or chatCompletionPro API for …

997729f

…minimax provider (alibaba#1593)
