Skip to content

Commit

Permalink
Create 2024-AI-APIs-Comparison.md
Browse files Browse the repository at this point in the history
  • Loading branch information
enricoros authored Jul 12, 2024
1 parent 0b3b4a6 commit b832025
Showing 1 changed file with 70 additions and 0 deletions.
70 changes: 70 additions & 0 deletions docs/2024-AI-APIs-Comparison.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,70 @@
# AIX dispatch server - API features comparison

This is updated as of 2024-07-09, and includes the latest features and capabilities of the three major AI APIs: Anthropic, Gemini, and OpenAI.
The comparison covers a wide range of features, including function calling, vision, system instructions, etc.

| Feature Category | Specific Feature | Anthropic | Gemini | OpenAI |
|------------------------------------------|-------------------------------|--------------------------------------------------------------------|------------------------------------------------------------------|---------------------------------------------------------------------|
| **Message Structure** |
| | Role types | user, assistant | user, model | user, assistant, system, tool |
| | Named participants | No | No | Yes |
| | Content array | Yes | Yes | Yes |
| **Content Types and Multimodal Support** |
| | Text generation | Yes | Yes | Yes |
| | Image understanding | Yes | Yes | Yes |
| | Audio processing | No | **Yes** | No |
| | Video processing | No | **Yes** | No |
| **Image Handling** |
| | Supported formats | JPEG, PNG, GIF, WebP | JPEG, PNG, WebP, HEIC, HEIF | PNG, JPEG, WebP, non-animated GIF |
| | Max image size | 5MB per image | (20MB per prompt) | 20MB per image |
| | Image detail level | N/A | N/A | **Low, high, auto** |
| | Image resolution | max: 1568x1568 | min: 768x768, max: 3072x3072 | min: 512x512, max: 2048 x 2048 |
| | Token calculation for images | (width * height)/750; max 1,600 | 258 tokens | 85 + 170 * {patches} |
| | Image retention | Deleted after processing | Not specified | Deleted after processing |
| **Audio and Video Handling** |
| | Audio formats | N/A | WAV, MP3, AIFF, AAC, OGG, FLAC | N/A |
| | Video formats | N/A | MP4, MPEG, MOV, AVI, MPG, WebM, WMV, 3GPP | N/A |
| **System Instructions and Tool Use** |
| | System instructions | Yes (array of text blocks) | Yes (parts array) | Yes (as system message) |
| **Function/Tool Handling** |
| | Parallel tool calls | No | No | **Yes** |
| | Tool Declaration | Defined in `tools` array | Defined in `tools` array | Defined in `tools` array |
| | FC name restrictions | Yes | Yes (max 63 chars) | Yes (max 64 chars) |
| | FC declaration | name, description, input_schema | name, description, parameters | name, description, parameters |
| | FC options structure | JSON Schema for input | Object with properties | JSON Schema for parameters |
| | FC Force invocation | Via `tool_choice` parameter | Via `toolConfig` parameter | Via `tool_choice` parameter |
| | FC Model invocation | Model generates a `tool_use` block with predicted parameters | Generates a `functionCall` part with predicted parameters | Generates a message.`tool_calls` item with predicted arguments |
| | FC Execution | Client-side | Client-side | Client-side |
| | FC Result injection | Client appends a `user` message with a `tool_result` content block | Client appends a `function` message with `functionResponse` part | Client sends a new `tool` message with `tool_call_id` and `content` |
| | Built-in Code execution | No | **Yes** | No |
| | Tool use with vision | Yes | Yes | Yes |
| **Generation Configuration** |
| | temperature | Yes | Yes | Yes |
| | max_tokens | Yes | Yes | Yes |
| | stop_sequences | Yes | Yes | Yes |
| | top_k | Yes | Yes | **No** |
| | top_p | Yes | Yes | Yes |
| | seed | No | No | **Yes** |
| | Multiple candidates | No | No | Yes (with 'n' parameter, breaks streaming?) |
| **Streaming and Response Structure** |
| | Streaming support | Yes | Yes | Yes |
| | Streaming initiation | stream=true | streamGenerateContent path | stream=true |
| | Streaming event types | **Multiple specific types** | Not specified | Single delta type |
| | Response container | content (array) | candidates (array) | choices (array) |
| **Usage Metrics and Error Handling** |
| | Token counts | Yes | Yes | Yes |
| | Detailed token breakdown | input, output | prompt, cached, candidates, total | prompt, completion, total |
| | Usage in stream | No | No | **Optional** |
| | Error handling in response | Not specified | Not specified | **Yes (undocumented)** |
| | Error handling in stream | Not specified | Not specified | **Yes (undocumented)** |
| **Advanced Features** |
| | JSON mode | **Partial (via structured prompts)** | **Yes (responseMimeType)** | **Yes** |
| | Output consistency techniques | **Yes (multiple methods)** | Not specified | Not specified |
| | Logprobs | No | No | **Yes (disabled in schema)** |
| | System fingerprint | No | No | **Yes** |
| | Semantic caching | No | **Yes** | No |
| | Assistant prefill | **Yes** | No | No |
| | Preferred formatting | **XML tags, JSON** | Not specified | Markdown |
| **Safety and Compliance** |
| | Safety settings in request | **Stop sequences** | **Detailed category-based** | **Moderation API** |
| | Safety feedback in response | Yes | Yes | Not specified |

0 comments on commit b832025

Please sign in to comment.