
Also support embedding APIs? #171

Open · AbrJA opened this issue Nov 21, 2024 · 13 comments

@AbrJA commented Nov 21, 2024

Hi there, I hope you are doing well

Thank you for this package you are working on; it's great!

Do you have in mind implementing support for embedding models? Or maybe you have already implemented it and I just didn't find it.

Greetings,

Abraham JA

@hadley (Member) commented Nov 22, 2024

I think that's currently out of scope for elmer, but we might end up implementing it in a separate package. What do you need it for?

@cboettig commented
I would love to see this as well! Embedding models are very helpful as part of RAG, allowing users to query text-based documents (say, a Shiny app that only answers questions based on the elmer documentation and provides precise citations back to the URLs of the doc pages it draws its answers from). There's a nice walkthrough at https://python.langchain.com/docs/tutorials/rag/.

Embedding models can of course be used alone, but they are often used in concert with a chat (completions) model.
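
For concreteness, the retrieval step of RAG is just a nearest-neighbour lookup over an embedding matrix. A minimal sketch in R, assuming doc_emb is an n-docs x n-dims matrix and query_emb a vector from the same model (cosine_sim is a made-up name):

cosine_sim <- function(query_emb, doc_emb) {
  q <- query_emb / sqrt(sum(query_emb^2))   # unit-length query
  d <- doc_emb / sqrt(rowSums(doc_emb^2))   # unit-length document rows
  as.vector(d %*% q)                        # cosine similarity per document
}

# feed the top-scoring documents to the chat model as context, e.g.:
# order(cosine_sim(query_emb, doc_emb), decreasing = TRUE)[1:5]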

@JamesHWade (Contributor) commented
For RAG applications, the approach elmer uses for tools might be best: chat$register_context(), or maybe chat$register_rag().

I've tried to put RAG into a package (gpttools). From that experience, I can appreciate the care you'd need to get the abstractions right, and I certainly haven't figured them out yet. A dedicated package makes sense. Too bad tidymodels already took the pkg name embed. Speaking of which, I bet the tidymodels crew would have great ideas for how to approach it.

@jpmarindiaz commented
Embeddings are very useful for RAG applications.
You can call a local embedding model in Ollama like this:

curl http://localhost:11434/api/embeddings -d '{
  "model": "mxbai-embed-large",
  "prompt": "Llamas are members of the camelid family"
}'

See https://ollama.com/blog/embedding-models
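
For reference, the same request from R with httr2, using the endpoint and fields from the curl call above (just a sketch, not an elmer API):

library(httr2)

resp <- request("http://localhost:11434/api/embeddings") |>
  req_body_json(list(
    model = "mxbai-embed-large",
    prompt = "Llamas are members of the camelid family"
  )) |>
  req_perform()

embedding <- unlist(resp_body_json(resp)$embedding)  # one numeric vector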

I have also used this for OpenAI:

openai::create_embedding(model = "text-embedding-3-small", input = txt)

I think it would make a lot of sense to keep the same interface in elmer:

chat <- chat_ollama(model = "nomic-embed-text")
chat$chat("hello nomic embedding")   # or chat$embedding("hello nomic embedding")

chat <- chat_openai(model = "text-embedding-3-small")
chat$chat("hello openai embedding")  # or chat$embedding("hello openai embedding")

@JBGruber commented Dec 7, 2024

If you're interested, rollama (an R package with a focus on Ollama) already supports this: https://jbgruber.github.io/rollama/articles/text-embedding.html

@hadley (Member) commented Dec 9, 2024

@JBGruber did you consider returning all the embeddings in a single matrix-column? I would suspect that might be faster to work with.
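
(For anyone following along: base data frames and tibbles both allow a matrix as a single column, so all embeddings sit in one contiguous numeric block, one row per document. A small illustration with made-up numbers:)

emb <- matrix(rnorm(3 * 4), nrow = 3)       # 3 documents, 4-dim embeddings
df <- data.frame(text = c("a", "b", "c"))
df$embedding <- emb                         # matrix-column, one row per doc
df$embedding[2, ]                           # the second document's embedding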

@hadley hadley changed the title Embedding model implementation Also support embedding APIs? Dec 9, 2024
@hadley (Member) commented Dec 9, 2024

One reason to do this in elmer is that 90% of the HTTP request handling (e.g. error handling, rate limiting, ...) will be the same. Implementing it in another package would require either extracting a lot of existing code or duplicating it, neither of which is particularly appealing given how young elmer is. The main reason I'm not sure about implementing it in elmer is that it would require a new API, since jamming it into the chat object would be rather forced. The infrastructure is already there (i.e. this would become a new S7 method on the provider generic), but it would require exporting a bunch more stuff that's currently internal, making it hard to change in the future. OTOH, #202 and #167 already suggest that we need to export this stuff.
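
To make the S7 point concrete, a hypothetical sketch of what such an export might look like; embed and its method body are assumptions, not elmer's actual API:

library(S7)

embed <- new_generic("embed", "provider")

method(embed, ProviderOpenAI) <- function(provider, text, ...) {
  # reuse the provider's shared HTTP infrastructure (auth, retries,
  # error handling), POST to /embeddings, return a matrix of embeddings
}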

@MarekProkop commented
Hi @hadley,

I wanted to share my perspective on including OpenAI embeddings support in {ellmer}. I've been actively using embeddings through the {openai} package (https://github.com/irudnyts/openai). However, since that package is now archived, we're increasingly relying on {ellmer} as the go-to R interface for OpenAI's services.

From my experience, embeddings are a crucial part of many AI workflows - they enable semantic search, document similarity comparisons, and other text analysis tasks that complement the core LLM capabilities. Having embeddings support directly in {ellmer} would provide a more complete and maintainable solution for the R community, especially now that {openai} is no longer actively maintained.

I'd appreciate a decision on this soon, as I need to move forward with my projects. If embeddings won't be included in {ellmer}, I'll create my own wrapper for the embeddings API and could potentially develop it into a standalone package for others who might need this functionality.

Thanks for considering this!

@hadley (Member) commented Jan 5, 2025

@MarekProkop would you be interested in doing a PR to add embeddings support for at least OpenAI? It would be easier for me to commit to adding to ellmer if I didn't have to be solely responsible for it.

@MarekProkop commented
@hadley I'd love to help with this! It might take me a few weeks though.

@hadley hadley added this to the 0.1.1 milestone Jan 10, 2025
@hadley (Member) commented Jan 13, 2025

API docs:

Sketch of embeddings with OpenAI API:

embed_openai <- function(text,
                         model = NULL,
                         api_key = openai_key(),
                         dims = NULL,
                         api_args = list()) {

  check_character(text)
  check_string(model, allow_empty = FALSE)

  provider <- ProviderOpenAI(
    base_url = "https://api.openai.com/v1",
    model = "",
    extra_args = api_args,
    api_key = api_key
  )

  # base_request() supplies the shared auth/retry/error handling;
  # everything after it is plain httr2
  req <- base_request(provider)
  req <- req_url_path_append(req, "/embeddings")
  req <- req_body_json(req, compact(list2(
    input = as.list(text),
    model = model,
    dimensions = dims  # compact() drops this when NULL
  )))

  resp <- req_perform(req)
  json <- resp_body_json(resp)
  embeddings <- map(json$data, "embedding")
  matrix(unlist(embeddings), nrow = length(text), byrow = TRUE)
}
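
If that sketch worked as written, usage would look something like this (1536 is text-embedding-3-small's default dimensionality):

emb <- embed_openai(
  c("first document", "second document"),
  model = "text-embedding-3-small"
)
dim(emb)  # 2 x 1536: one row per input string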

Some thoughts:

  • The interface is much, much simpler than chat: the 95% use case is a vector of strings, a model name, and optionally the output dimensionality, and the function can just return a matrix.
  • Might need to think about tooling for embedding large numbers of strings (i.e. spreading them across multiple requests); see the batching sketch below.
  • There isn't a one-to-one correspondence between chat providers and embedding providers.
  • There's not that much shared code with elmer, but there would probably be ~10 lines of base request code for each provider.
  • ellmer providers already include chat-specific fields like model, so it's not straightforward to export an embedding method on the provider object. (And given the simplicity of the embedding API, it's not clear that would buy us much.)

Overall that makes me think that the simplest path forward would be a separate package (at least initially).
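
For the multi-request point above, a hypothetical batching wrapper over the sketch (the name and chunk size are made up):

embed_batched <- function(text, batch_size = 100, ...) {
  chunks <- split(text, ceiling(seq_along(text) / batch_size))
  do.call(rbind, lapply(chunks, embed_openai, ...))  # one request per chunk
}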

@JamesHWade (Contributor) commented
I recommend adding local models (Sentence Transformers, not Ollama) to that list of providers, so integrating reticulate in that standalone package would be great. Did mall already do something in that regard? I haven't looked at that package in a bit.
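
A rough sketch of what that might look like through reticulate, assuming the sentence-transformers Python package is installed (not an existing R API):

library(reticulate)

st <- import("sentence_transformers")
model <- st$SentenceTransformer("all-MiniLM-L6-v2")
emb <- model$encode(c("first document", "second document"))
dim(emb)  # 2 x 384 for this model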

@cboettig commented
I think continue.dev uses transformers.js to run all-MiniLM-L6-v2 on CPU; i.e. maybe with V8 or something lighter than reticulate or Ollama, one could still easily serve embeddings locally? https://docs.continue.dev/customize/model-types/embeddings

(Though many use cases of embeddings, like the @codebase features, also involve a chat LLM.)

As an aside, it looks like embedding tooling is typically coupled with document parsers and text splitters that prepare strings for embedding, and with vector stores that can store and retrieve the results. I'm not sure we have those parts of the ecosystem on the R side yet, though they obviously overlap with a lot of non-LLM tooling.
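
As a naive illustration of the splitter part (real splitters are usually token- or sentence-aware; split_text is a made-up name):

split_text <- function(x, chunk_size = 500, overlap = 50) {
  starts <- seq(1, max(1, nchar(x)), by = chunk_size - overlap)
  vapply(starts, function(s) substr(x, s, s + chunk_size - 1), character(1))
}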
