[ai-json-resp] Extract JSON from LLM, Validate with Schema, Ensure Valid JSON, Auto-Retry (#1236)
alibaba · Sep 3, 2024 · ffc0c09
1 parent 7b2b522
commit ffc0c09
Showing 6 changed files with 1,035 additions and 0 deletions.
diff --git a/plugins/wasm-go/extensions/ai-json-resp/README.md b/plugins/wasm-go/extensions/ai-json-resp/README.md
@@ -0,0 +1,202 @@
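+## Introduction
+
+**Note**
+
+> Requires a data-plane proxy-wasm version of 0.2.100 or later.
+>
+
+> When compiling, include the version tag, for example: tinygo build -o main.wasm -scheduler=none -target=wasi -gc=custom -tags="custommalloc nottinygc_finalizer proxy_wasm_version_0_2_100" ./
+
+
+An LLM response-structuring plugin. It structures the AI response according to the default or a user-configured JSON Schema so that downstream plugins can process it. Note that only `non-streaming responses` are currently supported.
+
+
+### Configuration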
+
+| Name | Type | Requirement | Default | **Description** |
+| --- | --- | --- | --- | --- |
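+| serviceName | str | required | - | Name of the AI service, or of a gateway service that supports AI-Proxy |
+| serviceDomain | str | optional | - | Domain name or IP address of the AI service, or of a gateway service that supports AI-Proxy |
+| servicePath | str | optional | '/v1/chat/completions' | Base path of the AI service, or of a gateway service that supports AI-Proxy |
+| serviceUrl | str | optional | - | URL of the AI service, or of a gateway service that supports AI-Proxy. The plugin extracts the domain and path from it automatically and uses them to fill in serviceDomain or servicePath when those are not configured |
+| servicePort | int | optional | 443 | Gateway service port |
+| serviceTimeout | int | optional | 50000 | Default request timeout |
+| maxRetry | int | optional | 3 | Number of retries when a correctly formatted answer cannot be extracted |
+| contentPath | str | optional | "choices.0.message.content" | GJSON path used to extract the response result from the LLM answer |
+| jsonSchema | str (json) | optional | - | JSON Schema used for validation. If empty, the plugin only validates and returns a well-formed JSON response |
+| enableSwagger | bool | optional | false | Whether to enable validation against the Swagger specification |
+| enableOas3 | bool | optional | true | Whether to enable validation against the OAS3 specification |
+| enableContentDisposition | bool | optional | true | Whether to add the Content-Disposition header. When enabled, `Content-Disposition: attachment; filename="response.json"` is added to the response headers |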
+
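+> For performance reasons, the maximum JSON Schema depth supported by default is 6. A JSON Schema deeper than this is not used to validate the response; the plugin only checks whether the returned response is valid JSON.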
+
+
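+### Request and Response Parameters
+
+- **Request parameters**: The plugin accepts requests in the OpenAI format, containing the `model` and `messages` fields. `model` is the AI model name and `messages` is the list of conversation messages. Each message has a `role` field (the message role) and a `content` field (the message content).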
+  ```json
+  {
+    "model": "gpt-4",
+    "messages": [
+      {"role": "user", "content": "give me a api doc for add the variable x to x+5"}
+    ]
+  }
+  ```
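+  For other request parameters, refer to the documentation of the configured AI service or gateway service.
+- **Response parameters**: 
+  - A `JSON response` that satisfies the constraints of the defined JSON Schema
+  - If no JSON Schema is defined, a valid `JSON response`
+  - If an internal error occurs, `{ "Code": 10XX, "Msg": "error message" }`.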
+
+## Request Example
+
+```bash
+curl -X POST "http://localhost:8001/v1/chat/completions" \
+-H "Content-Type: application/json" \
+-d '{
+  "model": "gpt-4",
+  "messages": [
+    {"role": "user", "content": "give me a api doc for add the variable x to x+5"}
+  ]
+}'
+
+```
+
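+## Response Example
+### Normal Response
+Under normal conditions, the system returns JSON data that has been validated against the JSON Schema. If no JSON Schema is configured, it returns well-formed JSON data that conforms to the JSON standard.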
+```json
+{
+  "apiVersion": "1.0",
+  "request": {
+    "endpoint": "/add_to_five",
+    "method": "POST",
+    "port": 8080,
+    "headers": {
+      "Content-Type": "application/json"
+    },
+    "body": {
+      "x": 7
+    }
+  }
+}
+```
+
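+### Error Response
+When an error occurs, the response status code is `500` and the body is a JSON error message containing two fields: the error code `Code` and the error message `Msg`.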
+```json
+{
+  "Code": 1006,
+  "Msg": "retry count exceed max retry count"
+}
+```
+
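+### Error Codes
+| Error code | Description |
+| --- | --- |
+| 1001 | The configured JSON Schema is not valid JSON |
+| 1002 | The configured JSON Schema failed to compile: it is not a valid JSON Schema, or its depth exceeds jsonSchemaMaxDepth while rejectOnDepthExceeded is true |
+| 1003 | No valid JSON could be extracted from the response |
+| 1004 | The response is an empty string |
+| 1005 | The response does not conform to the JSON Schema |
+| 1006 | The retry count exceeded the maximum limit |
+| 1007 | The response content could not be retrieved; the upstream service may be misconfigured, or the contentPath used to locate the content may be wrong |
+| 1008 | serviceDomain is empty; note that serviceDomain and serviceUrl must not both be empty |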
+
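+## Service Configuration
+The plugin needs an upstream service configured in order to support automatic retries when an error occurs. The supported options are mainly an `AI service with an OpenAI-compatible interface` or a `local gateway service`.
+
+### AI service with an OpenAI-compatible interface
+Taking qwen as an example, the basic configuration is as follows:
+
+YAML configuration: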
+```yaml
+serviceName: qwen
+serviceDomain: dashscope.aliyuncs.com
+apiKey: [Your API Key]
+servicePath: /compatible-mode/v1/chat/completions
+jsonSchema:
+  title: ReasoningSchema
+  type: object
+  properties:
+    reasoning_steps:
+      type: array
+      items:
+        type: string
+      description: The reasoning steps leading to the final conclusion.
+    answer:
+      type: string
+      description: The final answer, taking into account the reasoning steps.
+  required:
+    - reasoning_steps
+    - answer
+  additionalProperties: false
+```
+
+JSON configuration:
+```json
+{
+  "serviceName": "qwen",
+  "serviceUrl": "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions",
+  "apiKey": "[Your API Key]",
+  "jsonSchema": {
+    "title": "ActionItemsSchema",
+    "type": "object",
+    "properties": {
+      "action_items": {
+        "type": "array",
+        "items": {
+          "type": "object",
+          "properties": {
+            "description": {
+              "type": "string",
+              "description": "Description of the action item."
+            },
+            "due_date": {
+              "type": ["string", "null"],
+              "description": "Due date for the action item, can be null if not specified."
+            },
+            "owner": {
+              "type": ["string", "null"],
+              "description": "Owner responsible for the action item, can be null if not specified."
+            }
+          },
+          "required": ["description", "due_date", "owner"],
+          "additionalProperties": false
+        },
+        "description": "List of action items from the meeting."
+      }
+    },
+    "required": ["action_items"],
+    "additionalProperties": false
+  }
+}
+```
+
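+### Local gateway service
+To reuse services that are already configured, the plugin also supports a local gateway service. For example, if the gateway already has the [AI-proxy service](../ai-proxy/README.md) configured, it can be set up directly as follows:
+1. Create a service with the fixed IP 127.0.0.1, for example localservice.static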
+```yaml
+- name: outbound|10000||localservice.static
+  connect_timeout: 30s
+  type: LOGICAL_DNS
+  dns_lookup_family: V4_ONLY
+  lb_policy: ROUND_ROBIN
+  load_assignment:
+    cluster_name: outbound|10000||localservice.static
+    endpoints:
+      - lb_endpoints:
+          - endpoint:
+              address:
+                socket_address:
+                  address: 127.0.0.1
+                  port_value: 10000
+```
+2. Add the service configuration for localservice.static to the configuration file
+```yaml
+serviceName: localservice
+serviceDomain: 127.0.0.1
+servicePort: 10000
+```
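+3. The request Path, Headers, and other information are extracted automatically
+The plugin automatically extracts the Path, Headers, and other information from the request, which avoids configuring the AI service a second time.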
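Reviewer note (not part of the diff): the README describes an extract-then-retry flow with error codes 1003, 1004, and 1006. The following is a minimal sketch of that flow, not the plugin's actual code. The names `extractJSON`, `askWithRetry`, and `pluginError` are hypothetical, the brace-scanning strategy is an assumption, and so is the reading of `maxRetry` as "one initial attempt plus up to `maxRetry` retries". JSON Schema validation (code 1005) is left out.

```go
package main

import (
	"encoding/json"
	"strings"
)

// Error codes copied from the README's error table.
const (
	codeNoJSON     = 1003 // no valid JSON could be extracted
	codeEmpty      = 1004 // response is an empty string
	codeMaxRetries = 1006 // retry count exceeded
)

// pluginError mirrors the documented {"Code": ..., "Msg": ...} body.
type pluginError struct {
	Code int    `json:"Code"`
	Msg  string `json:"Msg"`
}

func (e *pluginError) Error() string { return e.Msg }

// extractJSON returns the first '{'...'}' span of content that parses as
// JSON. Hypothetical helper: for each '{' it tries the widest candidate
// first and shrinks it until json.Valid accepts one.
func extractJSON(content string) (string, error) {
	if strings.TrimSpace(content) == "" {
		return "", &pluginError{Code: codeEmpty, Msg: "response is empty"}
	}
	start := strings.IndexByte(content, '{')
	for start >= 0 {
		end := strings.LastIndexByte(content, '}')
		for end > start {
			candidate := content[start : end+1]
			if json.Valid([]byte(candidate)) {
				return candidate, nil
			}
			end = strings.LastIndexByte(content[:end], '}')
		}
		next := strings.IndexByte(content[start+1:], '{')
		if next < 0 {
			break
		}
		start += next + 1
	}
	return "", &pluginError{Code: codeNoJSON, Msg: "no valid JSON in response"}
}

// askWithRetry calls ask once, then up to maxRetry more times, until the
// answer contains extractable JSON. ask stands in for the upstream LLM call.
func askWithRetry(ask func() string, maxRetry int) (string, error) {
	for attempt := 0; attempt <= maxRetry; attempt++ {
		if out, err := extractJSON(ask()); err == nil {
			return out, nil
		}
	}
	return "", &pluginError{Code: codeMaxRetries, Msg: "retry count exceed max retry count"}
}
```

Calling `askWithRetry` with a function that wraps the upstream request yields either the extracted JSON text or a `*pluginError` that can be marshalled into the documented error body.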
diff --git a/plugins/wasm-go/extensions/ai-json-resp/go.mod b/plugins/wasm-go/extensions/ai-json-resp/go.mod
@@ -0,0 +1,21 @@
+module github.com/alibaba/higress/plugins/wasm-go/extensions/hello-world
+
+go 1.18
+
+replace github.com/alibaba/higress/plugins/wasm-go => ../..
+
+require (
+	github.com/alibaba/higress/plugins/wasm-go v1.4.2
+	github.com/higress-group/proxy-wasm-go-sdk v0.0.0-20240711023527-ba358c48772f
+)
+
+require (
+	github.com/google/uuid v1.3.0 // indirect
+	github.com/higress-group/nottinygc v0.0.0-20231101025119-e93c4c2f8520 // indirect
+	github.com/magefile/mage v1.14.0 // indirect
+	github.com/santhosh-tekuri/jsonschema v1.2.4 // indirect
+	github.com/tidwall/gjson v1.14.3 // indirect
+	github.com/tidwall/match v1.1.1 // indirect
+	github.com/tidwall/pretty v1.2.0 // indirect
+	github.com/tidwall/resp v0.1.1 // indirect
+)
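Reviewer note (not part of the diff): go.mod pulls in `tidwall/gjson`, which fits the README's default `contentPath` of `choices.0.message.content`. To illustrate what such a dotted path resolves to, here is a standard-library-only sketch. `lookupPath` is a hypothetical stand-in, not gjson's API or implementation, and it handles only plain object keys and array indices.

```go
package main

import (
	"encoding/json"
	"strconv"
	"strings"
)

// lookupPath walks a dotted path through decoded JSON: each segment is a
// key when the current node is an object and an index when it is an array.
// It returns the string found at the end of the path, if there is one.
func lookupPath(body []byte, path string) (string, bool) {
	var node interface{}
	if err := json.Unmarshal(body, &node); err != nil {
		return "", false
	}
	for _, part := range strings.Split(path, ".") {
		switch cur := node.(type) {
		case map[string]interface{}:
			next, ok := cur[part]
			if !ok {
				return "", false
			}
			node = next
		case []interface{}:
			i, err := strconv.Atoi(part)
			if err != nil || i < 0 || i >= len(cur) {
				return "", false
			}
			node = cur[i]
		default:
			// A scalar was reached before the path was consumed.
			return "", false
		}
	}
	s, ok := node.(string)
	return s, ok
}
```

With an OpenAI-style response body, `lookupPath(body, "choices.0.message.content")` returns the assistant message text, which is the string the plugin would then try to parse as JSON.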
diff --git a/plugins/wasm-go/extensions/ai-json-resp/go.sum b/plugins/wasm-go/extensions/ai-json-resp/go.sum
@@ -0,0 +1,26 @@
+github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
+github.com/google/uuid v1.3.0 h1:t6JiXgmwXMjEs8VusXIJk2BXHsn+wx8BZdTaoZ5fu7I=
+github.com/google/uuid v1.3.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
+github.com/higress-group/nottinygc v0.0.0-20231101025119-e93c4c2f8520 h1:IHDghbGQ2DTIXHBHxWfqCYQW1fKjyJ/I7W1pMyUDeEA=
+github.com/higress-group/nottinygc v0.0.0-20231101025119-e93c4c2f8520/go.mod h1:Nz8ORLaFiLWotg6GeKlJMhv8cci8mM43uEnLA5t8iew=
+github.com/higress-group/proxy-wasm-go-sdk v0.0.0-20240226064518-b3dc4646a35a h1:luYRvxLTE1xYxrXYj7nmjd1U0HHh8pUPiKfdZ0MhCGE=
+github.com/higress-group/proxy-wasm-go-sdk v0.0.0-20240226064518-b3dc4646a35a/go.mod h1:hNFjhrLUIq+kJ9bOcs8QtiplSQ61GZXtd2xHKx4BYRo=
+github.com/higress-group/proxy-wasm-go-sdk v0.0.0-20240318034951-d5306e367c43/go.mod h1:hNFjhrLUIq+kJ9bOcs8QtiplSQ61GZXtd2xHKx4BYRo=
+github.com/higress-group/proxy-wasm-go-sdk v0.0.0-20240327114451-d6b7174a84fc/go.mod h1:hNFjhrLUIq+kJ9bOcs8QtiplSQ61GZXtd2xHKx4BYRo=
+github.com/higress-group/proxy-wasm-go-sdk v0.0.0-20240711023527-ba358c48772f h1:ZIiIBRvIw62gA5MJhuwp1+2wWbqL9IGElQ499rUsYYg=
+github.com/higress-group/proxy-wasm-go-sdk v0.0.0-20240711023527-ba358c48772f/go.mod h1:hNFjhrLUIq+kJ9bOcs8QtiplSQ61GZXtd2xHKx4BYRo=
+github.com/magefile/mage v1.14.0 h1:6QDX3g6z1YvJ4olPhT1wksUcSa/V0a1B+pJb73fBjyo=
+github.com/magefile/mage v1.14.0/go.mod h1:z5UZb/iS3GoOSn0JgWuiw7dxlurVYTu+/jHXqQg881A=
+github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
+github.com/santhosh-tekuri/jsonschema v1.2.4 h1:hNhW8e7t+H1vgY+1QeEQpveR6D4+OwKPXCfD2aieJis=
+github.com/santhosh-tekuri/jsonschema v1.2.4/go.mod h1:TEAUOeZSmIxTTuHatJzrvARHiuO9LYd+cIxzgEHCQI4=
+github.com/stretchr/testify v1.8.4 h1:CcVxjf3Q8PM0mHUKJCdn+eZZtm5yQwehR5yeSVQQcUk=
+github.com/tidwall/gjson v1.14.3 h1:9jvXn7olKEHU1S9vwoMGliaT8jq1vJ7IH/n9zD9Dnlw=
+github.com/tidwall/gjson v1.14.3/go.mod h1:/wbyibRr2FHMks5tjHJ5F8dMZh3AcwJEMf5vlfC0lxk=
+github.com/tidwall/match v1.1.1 h1:+Ho715JplO36QYgwN9PGYNhgZvoUSc9X2c80KVTi+GA=
+github.com/tidwall/match v1.1.1/go.mod h1:eRSPERbgtNPcGhD8UCthc6PmLEQXEWd3PRB5JTxsfmM=
+github.com/tidwall/pretty v1.2.0 h1:RWIZEg2iJ8/g6fDDYzMpobmaoGh5OLl4AXtGUGPcqCs=
+github.com/tidwall/pretty v1.2.0/go.mod h1:ITEVvHYasfjBbM0u2Pg8T2nJnzm8xPwvNhhsoaGGjNU=
+github.com/tidwall/resp v0.1.1 h1:Ly20wkhqKTmDUPlyM1S7pWo5kk0tDu8OoC/vFArXmwE=
+github.com/tidwall/resp v0.1.1/go.mod h1:3/FrruOBAxPTPtundW0VXgmsQ4ZBA0Aw714lVYgwFa0=
+gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=