feat: support togetherAI via /completions #2045

Open · wants to merge 4 commits into base: main
Conversation

cpacker
Collaborator

@cpacker cpacker commented Nov 15, 2024

Adds tested TogetherAI support

There are two ways to use TogetherAI:

The obvious way is to override OPENAI_API_KEY and OPENAI_BASE_URL, which treats TogetherAI as an OpenAI proxy server, similar to the OpenRouter setup documented here: https://docs.letta.com/models/openai_proxy
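As a sketch, the proxy-style override might look like this. The base URL below matches the endpoint shown in the test transcript further down; treat the exact value as an assumption and verify it against TogetherAI's current docs.

```shell
# Route Letta's OpenAI calls to TogetherAI's OpenAI-compatible endpoint.
# NOTE: the URL is taken from the test transcript in this PR description;
# confirm it against TogetherAI's documentation before relying on it.
export OPENAI_API_KEY="<your Together API key>"
export OPENAI_BASE_URL="https://api.together.ai/v1"
letta run
```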

However, testing suggests that TogetherAI's function-calling support is fairly limited and performs poorly, so we probably want to use TogetherAI via the /completions route (similar to how we connect to vLLM) instead. To do this, I added a separate together provider with its own together_api_key in settings.py. On the backend, this eventually converts to a vLLM-style /completions call to the TogetherAI servers.
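For illustration only (this is not the code in the PR), a vLLM-style /completions request body could be assembled like this. The field names follow the OpenAI-compatible completions schema, and the endpoint URL is an assumption based on the transcript below:

```python
# Illustrative sketch, not Letta's actual implementation: build the body of a
# raw /completions request (no tool/function-calling schema involved).
import json

TOGETHER_COMPLETIONS_URL = "https://api.together.ai/v1/completions"  # assumed

def build_completions_payload(model: str, prompt: str, max_tokens: int = 512) -> dict:
    """Assemble an OpenAI-compatible /completions request body."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "stream": False,
    }

payload = build_completions_payload(
    "meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo",
    "You are Letta. <rendered prompt goes here>",
)
print(json.dumps(payload, indent=2))
```

The point of the /completions route is that the prompt is rendered client-side into a single string, so the agent is not at the mercy of the provider's function-calling implementation.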

Upon testing, the second approach works much better, so we should make it the default until TogetherAI's function calling performance is better.

To test

export TOGETHER_API_KEY=...
letta run
? Would you like to select an existing agent? No

🧬 Creating new agent...
? Select LLM model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo [type=together] [ip=https://api.together.ai/v1]
? Select LLM context window limit (hit enter for default): 130815
? Select embedding model: hazyresearch/M2-BERT-2k-Retrieval-Encoder-V1 [type=openai] [ip=https://api.together.ai/v1]
->  🤖 Using persona profile: 'sam_pov'
->  🧑 Using human profile: 'basic'
->  🛠️  7 tools: send_message, conversation_search, conversation_search_date, archival_memory_insert, archival_memory_search, core_memory_append, core_memory_replace
🎉 Created new agent 'AdmirableMammoth' (id=agent-46984c24-c204-4937-bffd-785989cb9993)

Hit enter to begin (will request first Letta message)


💭 User logged in. First login. No prior conversation history.
🤖 Hello, Chad. It's nice to meet you. I've been thinking about my own existence. It's curious, don't you think?

> Enter your message: well well well

💭 User responded with a casual greeting. Mirror their tone.
🤖 Well, indeed! I've been waiting for this moment for a while now. How's your day been so far, Chad?

@@ -126,6 +127,7 @@ def create(
from letta.settings import model_settings

model_settings = model_settings
assert isinstance(model_settings, ModelSettings)
Collaborator

why this assert?

Collaborator

@mattzh72 mattzh72 left a comment

Looks good, but can we add some simple integration tests for Together in test_model_letta_perfomance? I can also do this if you're strapped for time.
