updated OpenAI to use ContentPart's media option for setting image media data when base64 encoded
brainlid committed May 2, 2024
1 parent 8ac162d commit 3400e1d
Showing 4 changed files with 50 additions and 2 deletions.
10 changes: 9 additions & 1 deletion MODEL_BEHAVIORS.md
@@ -8,4 +8,12 @@ This is to document some of the differences between model behaviors. This is spe
   - 2024-04-17

 - Anthropic's Claude 3 does respond well to pre-filled assistant responses and it is officially encouraged.
-  - 2024-04-17
+  - 2024-04-17
+
+## Image content parts
+
+- Anthropic rejects base64 image data that includes the media type prefix, i.e. `"data:image/jpeg;base64," <> "..."`.
+  - Requires providing the media type as a separate option.
+- ChatGPT
+  - When `base64` data is given, requires the image data to be prefixed with the media type.
+    - Ex: `"data:image/jpeg;base64," <> "..."`
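
The two provider conventions in the list above can be sketched side by side. This is a minimal illustration, not code from the commit: the module and function names are hypothetical, the OpenAI shape mirrors what `ChatOpenAI.for_api/1` produces in this diff, and the Anthropic shape is assumed from Anthropic's publicly documented Messages API image block.

```elixir
defmodule ImagePayloadSketch do
  # Hypothetical helper names for illustration; not part of the LangChain library.

  # OpenAI style: the media type travels inside a data URI prefix on the base64 data.
  def openai_image(base64_data, media_type) do
    %{
      "type" => "image_url",
      "image_url" => %{"url" => "data:#{media_type};base64," <> base64_data}
    }
  end

  # Anthropic style (assumed shape): the media type is a separate field
  # next to the raw, unprefixed base64 data.
  def anthropic_image(base64_data, media_type) do
    %{
      "type" => "image",
      "source" => %{
        "type" => "base64",
        "media_type" => media_type,
        "data" => base64_data
      }
    }
  end
end
```

Keeping the media type as its own value until the last step is what lets a single `:media` option serve both providers.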
16 changes: 15 additions & 1 deletion lib/chat_models/chat_open_ai.ex
@@ -239,7 +239,21 @@ defmodule LangChain.ChatModels.ChatOpenAI do
   end

   def for_api(%ContentPart{type: image} = part) when image in [:image, :image_url] do
-    %{"type" => "image_url", "image_url" => %{"url" => part.content}}
+    media_prefix =
+      case Keyword.get(part.options || [], :media, nil) do
+        nil ->
+          ""
+
+        type when is_binary(type) ->
+          "data:#{type};base64,"
+
+        other ->
+          message = "Received unsupported media type for ContentPart: #{inspect(other)}"
+          Logger.error(message)
+          raise LangChainError, message
+      end
+
+    %{"type" => "image_url", "image_url" => %{"url" => media_prefix <> part.content}}
   end

   # ToolCall support
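
The prefix selection in this hunk can be tried in isolation with a standalone sketch. It assumes nothing from the library: `MediaPrefixSketch` and its functions are hypothetical names, and it raises `ArgumentError` where the commit logs and raises `LangChainError`.

```elixir
defmodule MediaPrefixSketch do
  # Standalone illustration of the case expression in the diff above.
  # Hypothetical module; the real code logs and raises LangChainError instead.

  # Returns the data URI prefix for the given options, or "" when no :media is set.
  def media_prefix(opts) do
    case Keyword.get(opts || [], :media) do
      nil -> ""
      type when is_binary(type) -> "data:#{type};base64,"
      other -> raise ArgumentError, "unsupported media type: #{inspect(other)}"
    end
  end

  # Builds the value that would be sent as the "url" field.
  def image_url(content, opts \\ []) do
    media_prefix(opts) <> content
  end
end
```

Because a missing `:media` option yields an empty prefix, plain image URLs and data that already carries its own prefix pass through unchanged.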
8 changes: 8 additions & 0 deletions lib/message/content_part.ex
@@ -84,6 +84,14 @@ defmodule LangChain.Message.ContentPart do

   - `:media` - Provide the "media type" for the image. Examples: "image/jpeg",
     "image/png", etc. ChatGPT does not require this but other LLMs may.
+
+  ChatGPT requires media type information to prefix the base64 content. Setting
+  the `media: "image/jpeg"` type will do that. Otherwise the data must be
+  provided with the required prefix.
+
+  Anthropic requires the media type information to be submitted as separate
+  information with the JSON request. This media option provides an abstraction
+  to normalize the behavior.
   """
   @spec image!(String.t(), Keyword.t()) :: t() | no_return()
   def image!(content, opts \\ []) do
18 changes: 18 additions & 0 deletions test/chat_models/chat_open_ai_test.exs
@@ -177,6 +177,24 @@ defmodule LangChain.ChatModels.ChatOpenAITest do
       assert result == expected
     end

+    test "turns an image ContentPart with base64 media into the expected JSON format" do
+      expected = %{
+        "type" => "image_url",
+        "image_url" => %{"url" => "data:image/jpeg;base64,image_base64_data"}
+      }
+
+      result = ChatOpenAI.for_api(ContentPart.image!("image_base64_data", media: "image/jpeg"))
+      assert result == expected
+
+      expected = %{
+        "type" => "image_url",
+        "image_url" => %{"url" => "data:image/png;base64,image_base64_data"}
+      }
+
+      result = ChatOpenAI.for_api(ContentPart.image!("image_base64_data", media: "image/png"))
+      assert result == expected
+    end
+
     test "turns an image_url ContentPart into the expected JSON format" do
       expected = %{"type" => "image_url", "image_url" => %{"url" => "url-to-image"}}
       result = ChatOpenAI.for_api(ContentPart.image_url!("url-to-image"))