
Costs of using external services via APIs in soilwise #6

Open
robknapen opened this issue Jun 7, 2024 · 3 comments
@robknapen

Some of the external services that we intend to use in the project (e.g. translation and language models) are not free. Creating an API key with a payment plan (credit card) is sometimes required. How will this be arranged?

The OpenAI LLMs are currently accessed through a subscription owned by WEnR. Usage at the moment is still limited, but as it intensifies, costs will start to add up (and the services will stop working once budget limits are reached).

@robknapen robknapen assigned robknapen and roblokers and unassigned robknapen Jun 7, 2024
@pvgenuchten
Contributor

Maybe we should move this issue to the governance repository?

We may have to distinguish between:

  • costs of researching what kind of value such a service can provide to a repository like SWR
  • costs of running such a component in a production scenario (post-project)

As a translation service, I suggest using the translation service provided by the EU, which is free to use for academia. But maybe we want to compare it to another (paid) service?

@BerkvensNick what do you think?

@ishaan-jaff

Hi @pvgenuchten @robknapen, I'm the maintainer of LiteLLM. I believe we can help with your budget limit problem (please let me know if that's not the case; I apologize in advance if I'm wrong here).

LiteLLM allows you to:

  • Proxy 100+ LLMs in the OpenAI format
  • Use virtual keys to set hard budget limits and soft budget limits, and to track usage per key
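The two bullets above boil down to per-key spend tracking with two thresholds: a soft one that only triggers an alert, and a hard one past which requests fail. A minimal Python sketch of that idea — purely illustrative, not LiteLLM's actual implementation; the class and method names here are made up:

```python
class BudgetExceeded(Exception):
    """Raised when a request would push a key past its hard budget."""


class VirtualKey:
    """Illustrative per-key budget tracker (not LiteLLM's real code)."""

    def __init__(self, key: str, max_budget: float, soft_budget: float):
        self.key = key
        self.max_budget = max_budget    # hard limit: requests fail past this
        self.soft_budget = soft_budget  # soft limit: alert only
        self.spend = 0.0
        self.alerted = False

    def record(self, cost: float) -> None:
        """Account for one request's cost, enforcing both limits."""
        if self.spend + cost > self.max_budget:
            raise BudgetExceeded(f"{self.key} would exceed its hard budget")
        self.spend += cost
        if not self.alerted and self.spend >= self.soft_budget:
            # A real proxy would send a Slack alert here.
            self.alerted = True
```

The point is that the hard limit is checked before the spend is committed, so a key can never go over budget, while the soft limit fires at most once.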

How to use LiteLLM to create virtual keys with budgets

Step 1. Create a Config for LiteLLM proxy

LiteLLM requires a config with all your models defined; we can call this file litellm_config.yaml.

Detailed docs on how to set up the litellm config are here

model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-3.5-turbo     
      api_key: os.environ/OPENAI_API_KEY # reads the key from the OPENAI_API_KEY environment variable
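Since the thread mentions comparing paid services, note that the same config can list several models side by side behind one proxy. A sketch, assuming a second OpenAI model is wanted — the model names here are illustrative:

```yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-3.5-turbo
      api_key: os.environ/OPENAI_API_KEY
  - model_name: gpt-4o-mini
    litellm_params:
      model: openai/gpt-4o-mini
      api_key: os.environ/OPENAI_API_KEY
```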

Step 2. Start litellm proxy

docker run \
    -v $(pwd)/litellm_config.yaml:/app/config.yaml \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-latest \
    --config /app/config.yaml --detailed_debug

On success, the proxy will start running on http://localhost:4000/

Step 3. Create a virtual key with max_budget and soft_budget.

Crossing the soft_budget sends a Slack alert.

curl 'http://0.0.0.0:4000/key/generate' \
--header 'Authorization: Bearer sk-1234' \
--header 'Content-Type: application/json' \
--data-raw '{
    "soft_budget": 5,
    "max_budget": 10
}'

Response from this endpoint

# {"key":"sk-jNm1Zar7XfNdZXp49Z1kSQ"}  

Step 4. Test the virtual key - make a /chat/completions request to the litellm proxy

  • This key will start failing requests once it crosses its budget limit

curl -i http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-jNm1Zar7XfNdZXp49Z1kSQ" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "Hello, Claude gm!"}
    ]
}'
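The same request can be made from Python using only the standard library — useful if the project wants to call the proxy from code rather than curl. A sketch, assuming the proxy URL and virtual key from the earlier steps; the LITELLM_VIRTUAL_KEY environment variable name is made up for this example:

```python
import json
import os
import urllib.request

# Proxy endpoint and virtual key from the earlier steps; adjust for your deployment.
PROXY_URL = "http://localhost:4000/v1/chat/completions"
VIRTUAL_KEY = os.environ.get("LITELLM_VIRTUAL_KEY", "sk-jNm1Zar7XfNdZXp49Z1kSQ")

# Same body as the curl example above, in OpenAI chat-completions format.
payload = {
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}],
}

request = urllib.request.Request(
    PROXY_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {VIRTUAL_KEY}",
    },
)

# With the proxy running, send it with:
#   with urllib.request.urlopen(request) as resp:
#       print(json.load(resp))
```

Once the key is over its max_budget, the urlopen call would fail, which is exactly the enforcement behaviour described above.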

@robknapen
Author

Hi @ishaan-jaff , thanks for your feedback!

I did indeed find LiteLLM. It looks great and has good options for working with budgets, which is what we need, so it is already under consideration :)

Most of this issue, though, is about who is going to provide the credit card and pay for the use of the (LLM/embedding model) services (during development and after), and about getting some clarity and awareness on this within the project.
