
idea: Add Claude Prompt Caching Support to Jan #3715

Closed
Tracked by #1570
kvn8888 opened this issue Sep 22, 2024 · 4 comments
Labels
category: model support Support new model, or fix broken model category: providers Local & remote inference providers type: feature request A new feature

Comments

@kvn8888

kvn8888 commented Sep 22, 2024

Problem Statement

Using an API to access Claude, rather than a $20/month Claude Pro subscription, saves users money in the long run. However, this benefit diminishes whenever users enter long-context, multi-turn conversations with Claude, especially with file uploads.

Feature Idea

By integrating Anthropic's new prompt caching feature, response latency and cost for users could be dramatically reduced. Users could engage in longer multi-turn conversations without starting a new thread and losing context. Prompt caching can be an optional feature: cached prompt reads cost roughly 90% less than uncached input tokens per API call.
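As a rough sketch of what the integration might look like: Anthropic's Messages API lets a request mark content blocks with `cache_control: {"type": "ephemeral"}`, so that repeated calls sharing the same prefix (e.g. a long system prompt or uploaded file contents) read from the cache at reduced cost. The helper name `build_cached_request` and the exact model string below are illustrative assumptions, not Jan's actual code; this only shows the payload shape a provider integration would need to emit.

```python
# Hypothetical sketch of how a Jan Anthropic provider could opt a long
# system prompt into prompt caching. The payload shape follows Anthropic's
# Messages API: "system" may be a list of content blocks, and a block
# carrying cache_control {"type": "ephemeral"} becomes a cache breakpoint.
# (The function name and model string are assumptions for illustration.)

def build_cached_request(system_prompt: str, user_message: str) -> dict:
    """Build a Messages API payload that caches the system prompt."""
    return {
        "model": "claude-3-5-sonnet-20240620",
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": system_prompt,
                # Subsequent requests with an identical prefix up to this
                # block can be served from the prompt cache.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_message}],
    }


payload = build_cached_request("You are a helpful assistant. " * 200, "Hi")
```

The key design point for Jan would be deciding which blocks to mark: caching pays off for large, stable prefixes (system prompts, attached files, earlier conversation turns), since cache writes cost slightly more than normal input tokens while cache reads cost far less.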

@kvn8888 kvn8888 added the type: feature request A new feature label Sep 22, 2024
@Boscop

Boscop commented Sep 29, 2024

Yes 100%, @imtuyethan please prioritize this 🙂

Btw, related (partial dup): janhq/cortex.cpp#1570

@0xSage 0xSage added category: model support Support new model, or fix broken model category: providers Local & remote inference providers labels Oct 14, 2024
@imtuyethan
Contributor

Awesome, @kvn8888! We will try to prioritize it!

@imtuyethan
Contributor

@0xSage Please consider closing this as a dup of janhq/cortex.cpp#1570. Then we will follow up on janhq/cortex.cpp#1570.

@dan-homebrew
Contributor

I am merging this into #3786, which will be part of our bigger refactor of Remote APIs into the Cortex backend.

Projects
Status: Review + QA
Development

No branches or pull requests

5 participants