From b832025e8875b9dff8e4c7be2a5a6cf5a8f4691c Mon Sep 17 00:00:00 2001
From: Enrico Ros <enrico.ros@gmail.com>
Date: Thu, 11 Jul 2024 23:06:44 -0700
Subject: [PATCH] Create 2024-AI-APIs-Comparison.md

---
 docs/2024-AI-APIs-Comparison.md | 70 +++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
 create mode 100644 docs/2024-AI-APIs-Comparison.md

diff --git a/docs/2024-AI-APIs-Comparison.md b/docs/2024-AI-APIs-Comparison.md
new file mode 100644
index 000000000..0af2ebe97
--- /dev/null
+++ b/docs/2024-AI-APIs-Comparison.md
@@ -0,0 +1,70 @@
+# AIX dispatch server - API features comparison
+
+This comparison is current as of 2024-07-09 and covers the features and capabilities of three major AI APIs: Anthropic, Gemini, and OpenAI.
+It spans message structure, multimodal content, system instructions, function calling (abbreviated FC in the table), generation configuration, streaming, usage metrics, and safety.
+
+| Feature Category                         | Specific Feature              | Anthropic                                                          | Gemini                                                           | OpenAI                                                              |
+|------------------------------------------|-------------------------------|--------------------------------------------------------------------|------------------------------------------------------------------|---------------------------------------------------------------------|
+| **Message Structure**                    |
+|                                          | Role types                    | user, assistant                                                    | user, model                                                      | user, assistant, system, tool                                       |
+|                                          | Named participants            | No                                                                 | No                                                               | Yes                                                                 |
+|                                          | Content array                 | Yes                                                                | Yes                                                              | Yes                                                                 |
+| **Content Types and Multimodal Support** |
+|                                          | Text generation               | Yes                                                                | Yes                                                              | Yes                                                                 |
+|                                          | Image understanding           | Yes                                                                | Yes                                                              | Yes                                                                 |
+|                                          | Audio processing              | No                                                                 | **Yes**                                                          | No                                                                  |
+|                                          | Video processing              | No                                                                 | **Yes**                                                          | No                                                                  |
+| **Image Handling**                       |
+|                                          | Supported formats             | JPEG, PNG, GIF, WebP                                               | JPEG, PNG, WebP, HEIC, HEIF                                      | PNG, JPEG, WebP, non-animated GIF                                   |
+|                                          | Max image size                | 5MB per image                                                      | (20MB per prompt)                                                | 20MB per image                                                      |
+|                                          | Image detail level            | N/A                                                                | N/A                                                              | **Low, high, auto**                                                 |
+|                                          | Image resolution              | max: 1568x1568                                                     | min: 768x768, max: 3072x3072                                     | min: 512x512, max: 2048x2048                                        |
+|                                          | Token calculation for images  | (width * height)/750; max 1,600                                    | 258 tokens                                                       | 85 + 170 * {tiles}                                                  |
+|                                          | Image retention               | Deleted after processing                                           | Not specified                                                    | Deleted after processing                                            |
+| **Audio and Video Handling**             |
+|                                          | Audio formats                 | N/A                                                                | WAV, MP3, AIFF, AAC, OGG, FLAC                                   | N/A                                                                 |
+|                                          | Video formats                 | N/A                                                                | MP4, MPEG, MOV, AVI, MPG, WebM, WMV, 3GPP                        | N/A                                                                 |
+| **System Instructions and Tool Use**     |
+|                                          | System instructions           | Yes (array of text blocks)                                         | Yes (parts array)                                                | Yes (as system message)                                             |
+| **Function/Tool Handling**               |
+|                                          | Parallel tool calls           | No                                                                 | No                                                               | **Yes**                                                             |
+|                                          | Tool declaration              | Defined in `tools` array                                           | Defined in `tools` array                                         | Defined in `tools` array                                            |
+|                                          | FC name restrictions          | Yes                                                                | Yes (max 63 chars)                                               | Yes (max 64 chars)                                                  |
+|                                          | FC declaration                | name, description, input_schema                                    | name, description, parameters                                    | name, description, parameters                                       |
+|                                          | FC options structure          | JSON Schema for input                                              | Object with properties                                           | JSON Schema for parameters                                          |
+|                                          | FC force invocation           | Via `tool_choice` parameter                                        | Via `toolConfig` parameter                                       | Via `tool_choice` parameter                                         |
+|                                          | FC model invocation           | Model generates a `tool_use` block with predicted parameters       | Generates a `functionCall` part with predicted parameters        | Generates a message.`tool_calls` item with predicted arguments      |
+|                                          | FC execution                  | Client-side                                                        | Client-side                                                      | Client-side                                                         |
+|                                          | FC result injection           | Client appends a `user` message with a `tool_result` content block | Client appends a `function` message with `functionResponse` part | Client sends a new `tool` message with `tool_call_id` and `content` |
+|                                          | Built-in code execution       | No                                                                 | **Yes**                                                          | No                                                                  |
+|                                          | Tool use with vision          | Yes                                                                | Yes                                                              | Yes                                                                 |
+| **Generation Configuration**             |
+|                                          | temperature                   | Yes                                                                | Yes                                                              | Yes                                                                 |
+|                                          | max_tokens                    | Yes                                                                | Yes                                                              | Yes                                                                 |
+|                                          | stop_sequences                | Yes                                                                | Yes                                                              | Yes                                                                 |
+|                                          | top_k                         | Yes                                                                | Yes                                                              | **No**                                                              |
+|                                          | top_p                         | Yes                                                                | Yes                                                              | Yes                                                                 |
+|                                          | seed                          | No                                                                 | No                                                               | **Yes**                                                             |
+|                                          | Multiple candidates           | No                                                                 | No                                                               | Yes (with 'n' parameter, breaks streaming?)                         |
+| **Streaming and Response Structure**     |
+|                                          | Streaming support             | Yes                                                                | Yes                                                              | Yes                                                                 |
+|                                          | Streaming initiation          | stream=true                                                        | streamGenerateContent path                                       | stream=true                                                         |
+|                                          | Streaming event types         | **Multiple specific types**                                        | Not specified                                                    | Single delta type                                                   |
+|                                          | Response container            | content (array)                                                    | candidates (array)                                               | choices (array)                                                     |
+| **Usage Metrics and Error Handling**     |
+|                                          | Token counts                  | Yes                                                                | Yes                                                              | Yes                                                                 |
+|                                          | Detailed token breakdown      | input, output                                                      | prompt, cached, candidates, total                                | prompt, completion, total                                           |
+|                                          | Usage in stream               | No                                                                 | No                                                               | **Optional**                                                        |
+|                                          | Error handling in response    | Not specified                                                      | Not specified                                                    | **Yes (undocumented)**                                              |
+|                                          | Error handling in stream      | Not specified                                                      | Not specified                                                    | **Yes (undocumented)**                                              |
+| **Advanced Features**                    |
+|                                          | JSON mode                     | **Partial (via structured prompts)**                               | **Yes (responseMimeType)**                                       | **Yes**                                                             |
+|                                          | Output consistency techniques | **Yes (multiple methods)**                                         | Not specified                                                    | Not specified                                                       |
+|                                          | Logprobs                      | No                                                                 | No                                                               | **Yes (disabled in schema)**                                        |
+|                                          | System fingerprint            | No                                                                 | No                                                               | **Yes**                                                             |
+|                                          | Semantic caching              | No                                                                 | **Yes**                                                          | No                                                                  |
+|                                          | Assistant prefill             | **Yes**                                                            | No                                                               | No                                                                  |
+|                                          | Preferred formatting          | **XML tags, JSON**                                                 | Not specified                                                    | Markdown                                                            |
+| **Safety and Compliance**                |
+|                                          | Safety settings in request    | **Stop sequences**                                                 | **Detailed category-based**                                      | **Moderation API**                                                  |
+|                                          | Safety feedback in response   | Yes                                                                | Yes                                                              | Not specified                                                       |