From b832025e8875b9dff8e4c7be2a5a6cf5a8f4691c Mon Sep 17 00:00:00 2001
From: Enrico Ros
Date: Thu, 11 Jul 2024 23:06:44 -0700
Subject: [PATCH] Create 2024-AI-APIs-Comparison.md

---
 docs/2024-AI-APIs-Comparison.md | 70 +++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
 create mode 100644 docs/2024-AI-APIs-Comparison.md

diff --git a/docs/2024-AI-APIs-Comparison.md b/docs/2024-AI-APIs-Comparison.md
new file mode 100644
index 000000000..0af2ebe97
--- /dev/null
+++ b/docs/2024-AI-APIs-Comparison.md
@@ -0,0 +1,70 @@
+# AIX dispatch server - API features comparison
+
+This is updated as of 2024-07-09, and includes the latest features and capabilities of the three major AI APIs: Anthropic, Gemini, and OpenAI.
+The comparison covers a wide range of features, including function calling, vision, system instructions, etc.
+
+| Feature Category | Specific Feature | Anthropic | Gemini | OpenAI |
+|------------------------------------------|-------------------------------|--------------------------------------------------------------------|------------------------------------------------------------------|---------------------------------------------------------------------|
+| **Message Structure** |
+| | Role types | user, assistant | user, model | user, assistant, system, tool |
+| | Named participants | No | No | Yes |
+| | Content array | Yes | Yes | Yes |
+| **Content Types and Multimodal Support** |
+| | Text generation | Yes | Yes | Yes |
+| | Image understanding | Yes | Yes | Yes |
+| | Audio processing | No | **Yes** | No |
+| | Video processing | No | **Yes** | No |
+| **Image Handling** |
+| | Supported formats | JPEG, PNG, GIF, WebP | JPEG, PNG, WebP, HEIC, HEIF | PNG, JPEG, WebP, non-animated GIF |
+| | Max image size | 5MB per image | (20MB per prompt) | 20MB per image |
+| | Image detail level | N/A | N/A | **Low, high, auto** |
+| | Image resolution | max: 1568x1568 | min: 768x768, max: 3072x3072 | min: 512x512, max: 2048 x 2048 |
+| | Token calculation for images | (width * height)/750; max 1,600 | 258 tokens | 85 + 170 * {patches} |
+| | Image retention | Deleted after processing | Not specified | Deleted after processing |
+| **Audio and Video Handling** |
+| | Audio formats | N/A | WAV, MP3, AIFF, AAC, OGG, FLAC | N/A |
+| | Video formats | N/A | MP4, MPEG, MOV, AVI, MPG, WebM, WMV, 3GPP | N/A |
+| **System Instructions and Tool Use** |
+| | System instructions | Yes (array of text blocks) | Yes (parts array) | Yes (as system message) |
+| **Function/Tool Handling** |
+| | Parallel tool calls | No | No | **Yes** |
+| | Tool Declaration | Defined in `tools` array | Defined in `tools` array | Defined in `tools` array |
+| | FC name restrictions | Yes | Yes (max 63 chars) | Yes (max 64 chars) |
+| | FC declaration | name, description, input_schema | name, description, parameters | name, description, parameters |
+| | FC options structure | JSON Schema for input | Object with properties | JSON Schema for parameters |
+| | FC Force invocation | Via `tool_choice` parameter | Via `toolConfig` parameter | Via `tool_choice` parameter |
+| | FC Model invocation | Model generates a `tool_use` block with predicted parameters | Generates a `functionCall` part with predicted parameters | Generates a message.`tool_calls` item with predicted arguments |
+| | FC Execution | Client-side | Client-side | Client-side |
+| | FC Result injection | Client appends a `user` message with a `tool_result` content block | Client appends a `function` message with `functionResponse` part | Client sends a new `tool` message with `tool_call_id` and `content` |
+| | Built-in Code execution | No | **Yes** | No |
+| | Tool use with vision | Yes | Yes | Yes |
+| **Generation Configuration** |
+| | temperature | Yes | Yes | Yes |
+| | max_tokens | Yes | Yes | Yes |
+| | stop_sequences | Yes | Yes | Yes |
+| | top_k | Yes | Yes | **No** |
+| | top_p | Yes | Yes | Yes |
+| | seed | No | No | **Yes** |
+| | Multiple candidates | No | No | Yes (with 'n' parameter, breaks streaming?) |
+| **Streaming and Response Structure** |
+| | Streaming support | Yes | Yes | Yes |
+| | Streaming initiation | stream=true | streamGenerateContent path | stream=true |
+| | Streaming event types | **Multiple specific types** | Not specified | Single delta type |
+| | Response container | content (array) | candidates (array) | choices (array) |
+| **Usage Metrics and Error Handling** |
+| | Token counts | Yes | Yes | Yes |
+| | Detailed token breakdown | input, output | prompt, cached, candidates, total | prompt, completion, total |
+| | Usage in stream | No | No | **Optional** |
+| | Error handling in response | Not specified | Not specified | **Yes (undocumented)** |
+| | Error handling in stream | Not specified | Not specified | **Yes (undocumented)** |
+| **Advanced Features** |
+| | JSON mode | **Partial (via structured prompts)** | **Yes (responseMimeType)** | **Yes** |
+| | Output consistency techniques | **Yes (multiple methods)** | Not specified | Not specified |
+| | Logprobs | No | No | **Yes (disabled in schema)** |
+| | System fingerprint | No | No | **Yes** |
+| | Semantic caching | No | **Yes** | No |
+| | Assistant prefill | **Yes** | No | No |
+| | Preferred formatting | **XML tags, JSON** | Not specified | Markdown |
+| **Safety and Compliance** |
+| | Safety settings in request | **Stop sequences** | **Detailed category-based** | **Moderation API** |
+| | Safety feedback in response | Yes | Yes | Not specified |
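
The "Token calculation for images" row in the table above can be read as three small formulas. The Python sketch below is only an illustration of that row, not part of any provider SDK: the function names are hypothetical, rounding up to a whole token is an assumption, the Anthropic cap of 1,600 is taken from the "max 1,600" entry, and the OpenAI `patches` count is passed in as an input because the table does not say how it is derived.

```python
import math


def anthropic_image_tokens(width: int, height: int) -> int:
    """Table entry: (width * height) / 750, with a maximum of 1,600.

    Rounding up to a whole token is an assumption of this sketch.
    """
    return min(math.ceil(width * height / 750), 1600)


def gemini_image_tokens() -> int:
    """Table entry: a flat 258 tokens."""
    return 258


def openai_image_tokens(patches: int) -> int:
    """Table entry: 85 + 170 * {patches}.

    `patches` is supplied by the caller; how it is computed from the
    image size is not specified in the table.
    """
    if patches < 0:
        raise ValueError("patches must be non-negative")
    return 85 + 170 * patches


if __name__ == "__main__":
    print(anthropic_image_tokens(750, 100))    # 100
    print(anthropic_image_tokens(1568, 1568))  # 1600 (capped)
    print(gemini_image_tokens())               # 258
    print(openai_image_tokens(4))              # 765
```

Keeping one function per provider mirrors the table's columns, so a change to one provider's rule stays local to a single function.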