
CodeGate doesn't work with either LiteLLM or OpenRouter #878

Open
tan-yong-sheng opened this issue Feb 2, 2025 · 19 comments

@tan-yong-sheng

tan-yong-sheng commented Feb 2, 2025

I am trying to use CodeGate with Cline, connecting to an LLM API via a LiteLLM proxy (hosted in the cloud).

Environment:

  • Windows 11

Here are the steps I took:

  1. On my laptop, I ran a Docker container for CodeGate:

docker run --name codegate -d -p 8989:8989 -p 9090:9090 -e CODEGATE_OPENAI_URL=https://<MY_LITELLM_URL>/v1 --restart unless-stopped ghcr.io/stacklok/codegate

  2. Then, I set up the credentials for Cline in VS Code, as follows:

Image

I thought that by setting the environment variable CODEGATE_OPENAI_URL=https://litellm.tanyongsheng.site/v1, it would change the default OpenAI URL that CodeGate forwards to.

However, I still get this error:

Image

And when I tried to switch to the Gemini 2.0 model that I set up in the LiteLLM proxy, an error still came up:

Image

Hoping for some help, thanks.

@tan-yong-sheng
Author

tan-yong-sheng commented Feb 2, 2025

I think I understand how it works already.

As my LiteLLM proxy uses os.environ/GEMINI_API_KEY in its config.yaml file, I need to pass in the Gemini API key instead of the LiteLLM master key: https://docs.litellm.ai/docs/providers/gemini

Extract of the config.yaml file for my self-hosted LiteLLM proxy:

model_list:
  - model_name: gemini-pro
    litellm_params:
      model: gemini/gemini-1.5-pro
      api_key: os.environ/GEMINI_API_KEY 

Here is the amendment I made for CodeGate to work:

Image

It should be working now, so I will close this issue.

@tan-yong-sheng
Author

tan-yong-sheng commented Feb 2, 2025

But it seems to lead to another error: CodeGate doesn't trigger when there are secrets inside the file.

Here are the credentials in my config.ini file:

Image

And when I checked CodeGate via docker logs, I found a lot of errors there:

Image

Not sure if it's related to the errors above, but I also found some 304 errors for my LiteLLM proxy's UI, even though I don't think I was accessing the proxy's UI at that time (not completely sure).

Image

I am thinking it's better for me to configure it via the OpenAI API first, and then switch over to the LiteLLM API...

@jhrozek
Contributor

jhrozek commented Feb 3, 2025

@tan-yong-sheng thanks for filing the issue!

The rate-limit errors were a CodeGate bug that was fixed in main but not released yet. We'll get a release out shortly.

I've not used the LiteLLM proxy myself, but I assume it works somewhat like OpenRouter? It sounds like you expected the LiteLLM master key to work? If so, it might be worth reopening this issue and investigating.

@tan-yong-sheng
Author

I see, noted, and thanks for the info.

Yes, the LiteLLM proxy is quite similar to OpenRouter in that it unifies LLM endpoints from different providers.

Yes, I previously expected LITELLM_MASTER_KEY to work, but in the end I found I need to pass GEMINI_API_KEY if I want a Gemini LLM. Likewise, I could pass MISTRAL_API_KEY if I want a Mistral LLM. It means LITELLM_MASTER_KEY is not used to fill in the field shown in the image below, yet it still works (note: perhaps this is just how the LiteLLM proxy works; I will add more info about this when I am available).

As shown in my previous images:
Image

@jhrozek
Contributor

jhrozek commented Feb 3, 2025

OK, reopening for further triage. We need to make sure CodeGate works well without the workarounds.

Thanks again for finding the bug and filing the issue.

@jhrozek jhrozek reopened this Feb 3, 2025
@tan-yong-sheng
Author

tan-yong-sheng commented Feb 3, 2025

Hi @jhrozek, could I request a Docker container (e.g., ghcr.io/stacklok/codegate:test) for testing? I am not sure whether I am building the Docker image for CodeGate's main repository correctly:

I tried to git clone https://github.com/stacklok/codegate.git and then followed your guide to build the Docker container on my Linux cloud server, as shown below:

Image

Yet, when I use Cline (even with the OpenAI endpoint https://api.openai.com/v1), it still doesn't trigger and detect credentials. The prompt I used to reproduce the error is "read config.ini file".

Image

However, in the Docker logs there are no error logs for CodeGate's container, unlike before:
Image

@jhrozek
Contributor

jhrozek commented Feb 3, 2025

Hi @jhrozek, could I request a Docker container (e.g., ghcr.io/stacklok/codegate:test) for testing? I am not sure whether I am building the Docker image for CodeGate's main repository correctly:

I don't think you need to build a local container (unless you want to!). We release several times a month, and the current release is just a couple of days old.

I tried to git clone https://github.com/stacklok/codegate.git and then followed your guide to build the Docker container on my Linux cloud server, as shown below:

Yet, when I use Cline (even with the OpenAI endpoint https://api.openai.com/v1), it still doesn't trigger and detect credentials. The prompt I used to reproduce the error is "read config.ini file".

I think this is a configuration issue. In order for Cline to talk to CodeGate, you should point it to CodeGate as the base URL, not to the LLM.

Image
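
For reference, here is a rough sketch of what that means for any OpenAI-compatible client (Cline included); the port 8989 and the /openai route are assumptions based on a default CodeGate setup, so adjust them to your deployment:

import openai

# Sketch only: the client (standing in for Cline) talks to CodeGate, and
# CodeGate forwards the request to whatever CODEGATE_OPENAI_URL points at.
client = openai.OpenAI(
    api_key="<KEY_EXPECTED_BY_THE_UPSTREAM>",  # see the key discussion above
    base_url="http://localhost:8989/openai",   # CodeGate, not the LLM endpoint
)

response = client.chat.completions.create(
    model="gemini/gemini-2.0-flash-exp",
    messages=[{"role": "user", "content": "read config.ini file"}],
)
print(response.choices[0].message.content)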

However, in the Docker logs there are no error logs for CodeGate's container, unlike before:

I think this would be solved by pointing the extension to codegate.

BTW, does it still stand that the main issue is not being able to use LiteLLM through the master key?

@jhrozek
Contributor

jhrozek commented Feb 3, 2025

Just a quick note, I think I was able to reproduce your issue (and I think it was the same issue that @danbarr pinged me about some days ago).

Will keep digging to fix this!

@tan-yong-sheng
Author

tan-yong-sheng commented Feb 4, 2025

I see, thanks a lot. I will test again once the bug is fixed (the bug being that CodeGate is not triggered when I ask the LLM to read the config.ini file, so it doesn't safeguard my credentials).

Thanks a lot for your efforts in delivering this product. To be honest, I am impressed by the idea of this project: safeguarding credentials and filtering risky dependencies when doing AI coding.

@tan-yong-sheng
Author

tan-yong-sheng commented Feb 4, 2025

BTW, does it still stand that the main issue is not being able to use LiteLLM through the master key?

Yes, initially I thought we couldn't use the LiteLLM proxy with Cline and CodeGate, as it kept reporting an authentication error. That's the reason I created this issue.

But for now I think it doesn't matter that much, as I found a workaround: use GEMINI_API_KEY or CODESTRAL_API_KEY for the respective LLM models instead of LITELLM_MASTER_KEY. And it works!


Demonstration

Just for your info, here is what I did to use Cline (together with the LiteLLM proxy) with and without CodeGate; a code-level sketch of the difference follows the screenshots below:

(1) WITHOUT codegate and with LiteLLM proxy

Image

(2) WITH codegate and with LiteLLM proxy

Image

Image
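
In code terms, the only difference between the two setups above is the base URL the client points at (a sketch with assumed values; the /openai route on CodeGate's default port 8989 is an assumption):

import openai

# (1) WITHOUT CodeGate: the client talks to the LiteLLM proxy directly.
without_codegate = openai.OpenAI(
    api_key="<GEMINI_API_KEY>",              # the workaround key described above
    base_url="https://<MY_LITELLM_URL>/v1",
)

# (2) WITH CodeGate: the client talks to CodeGate, which forwards to the
# LiteLLM proxy configured via CODEGATE_OPENAI_URL.
with_codegate = openai.OpenAI(
    api_key="<GEMINI_API_KEY>",
    base_url="http://localhost:8989/openai",
)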

Reply to issue

I hope this explanation is clear. I believe we can maintain the existing configuration, since we have a workaround to enable Cline and Codegate to work together with LiteLLM. Therefore, I suggest closing this issue.

Additional info

(i) the config.yaml for LiteLLM proxy

You can ignore this, but it clarifies my setup for self-hosting the LiteLLM proxy on my Linux cloud server. Here is the config.yaml file I set up for my LiteLLM proxy to access the Gemini and Codestral models:

model_list:
  - model_name: gemini/*
    litellm_params:
      # to include all gemini models available
      model: gemini/*
      api_key: <GEMINI_API_KEY_IN_TEXT_FORM>
  
  - model_name: codestral/*
    litellm_params:
      # to include all codestral models available
      model: codestral/*
      api_key: <CODESTRAL_API_KEY_IN_TEXT_FORM>

Reference: https://docs.litellm.ai/docs/proxy/configs

(ii) How to run CodeGate via a Docker container

Note: I changed CODEGATE_OPENAI_URL from https://api.openai.com/v1 to https://<MY_LITELLM_URL>/v1

docker run --name codegate -d -p 8989:8989 -p 9090:9090 -e CODEGATE_OPENAI_URL=https://<MY_LITELLM_URL>/v1 --mount type=volume,src=codegate_volume,dst=/app/codegate_volume --restart unless-stopped ghcr.io/stacklok/codegate

@jhrozek jhrozek added bug and removed needs-triage labels Feb 4, 2025
@jhrozek jhrozek self-assigned this Feb 4, 2025
@jhrozek jhrozek changed the title How to set up CodeGate with Cline and LiteLLM proxy? CodeGate doesn't work with either LiteLLM or OpenRouter Feb 4, 2025
@lukehinds
Contributor

lukehinds commented Feb 4, 2025

Hey @tan-yong-sheng, great having you here, and thanks for the time you took to explain the issues so well ❤

Looping back to the secrets detection: we need to pattern-match those tokens. You will find the matching list here; we currently match the secret value itself, not the variable name.

GITHUB_TOKEN <- we ignore this
ghp_209isokl2kl2k0i0ia... <- we match this

However, we are just about to rework this system to match both (see #209) and also to look at using entropy for matching.
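
For illustration only, here is a rough sketch (not CodeGate's actual matcher) of the two approaches mentioned above, i.e. matching the token format itself versus flagging high-entropy strings; the regex and the 4.0 threshold are just examples:

import math
import re
from collections import Counter

# Approximate pattern for a classic GitHub PAT: "ghp_" plus 36 alphanumerics.
GITHUB_PAT = re.compile(r"ghp_[A-Za-z0-9]{36}")

def shannon_entropy(s: str) -> float:
    # Higher values mean the string looks more random, i.e. more secret-like.
    counts = Counter(s)
    return -sum((c / len(s)) * math.log2(c / len(s)) for c in counts.values())

line = 'GITHUB_TOKEN = "ghp_0123456789abcdefghijklmnopqrstuvwxyz"'
token = "ghp_0123456789abcdefghijklmnopqrstuvwxyz"

print(bool(GITHUB_PAT.search(line)))  # True: the value matches the token pattern
print(shannon_entropy(token) > 4.0)   # True: the value is a high-entropy string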

Would love to have you try this out when we have something in the way of a prototype?

@tan-yong-sheng
Author

Looping back to the secrets detection: we need to pattern-match those tokens. You will find the matching list here; we currently match the secret value itself, not the variable name.

Ah I see, no wonder.

However, we are just about to rework this system to match both (see #209) and also to look at using entropy for matching.

Interesting ideas and discussions

Would love to have you try this out when we have something in the way of a prototype?

Yes, that should be no problem. My pleasure.

Thanks as well.

jhrozek added a commit that referenced this issue Feb 4, 2025
OpenRouter is a "muxing provider" which itself provides access to
multiple models and providers. It speaks a dialect of the OpenAI protocol, but
for our purposes, we can say it's OpenAI.

There are some differences in handling the requests, though:
1) we need to know where to forward the request to, by default this is
   `https://openrouter.ai/api/v1`, this is done by setting the base_url
   parameter
2) we need to prefix the model with `openrouter/`. This is a
   lite-LLM-ism (see https://docs.litellm.ai/docs/providers/openrouter)
   which we'll be able to remove once we ditch litellm

Initially I was considering just exposing the OpenAI provider on an
additional route and handling the prefix based on the route, but I think
having an explicit provider class is better as it allows us to handle
any differences in OpenRouter dialect easily in the future.

Related: #878
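
(For illustration only, a litellm-level sketch of the two points above, the model prefix and the base URL; the model name is a hypothetical example and this is not CodeGate's actual provider code:)

import litellm

OPENROUTER_API_BASE = "https://openrouter.ai/api/v1"

def to_openrouter(model: str) -> str:
    # litellm routes by provider prefix, e.g. "openrouter/<provider>/<model>"
    return model if model.startswith("openrouter/") else f"openrouter/{model}"

response = litellm.completion(
    model=to_openrouter("anthropic/claude-3.5-sonnet"),  # hypothetical model name
    messages=[{"role": "user", "content": "hello"}],
    api_base=OPENROUTER_API_BASE,   # forward the request to OpenRouter
    api_key="<OPENROUTER_API_KEY>",
)
print(response.choices[0].message.content)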
@jhrozek
Contributor

jhrozek commented Feb 4, 2025

@tan-yong-sheng I opened a PR that fixes a very similar issue for OpenRouter. I think fixing your issue for LiteLLM would be along the same lines. Can you suggest an easy way to run a LiteLLM proxy that simulates your environment? I found https://docs.litellm.ai/docs/proxy/deploy, so I can try that. I'm familiar with using litellm as a library, but not as a proxy.

@tan-yong-sheng
Author

tan-yong-sheng commented Feb 5, 2025

Here is the way to start a localhost instance of LiteLLM. But I am not completely sure this works, as I have never tried the localhost version myself (note: I was not able to test this before writing).

Reference: https://thinhdanggroup.github.io/litellm-proxy/#setting-up-litellm-proxy-locally

  1. Install the PyPI package:
pip install litellm[proxy]
  2. Then, in your directory, create a config.yaml file

(Note: I got a Gemini API key for free from Google AI Studio)

model_list:
  - model_name: gemini/*
    litellm_params:
      # to include all gemini models available
      model: gemini/*
      api_key: <GEMINI_API_KEY_IN_TEXT_FORM>
      drop_params: true
  
  - model_name: codestral/*
    litellm_params:
      # to include all codestral models available
      model: codestral/*
      api_key: <CODESTRAL_API_KEY_IN_TEXT_FORM>
      drop_params: true

general_settings: 
  master_key: sk-1234

(Note 1: You can refer to https://docs.litellm.ai/docs/proxy/configs for the config file format if you need to add more configuration.)
(Note 2: Your LITELLM_MASTER_KEY is sk-1234, as set by the config above.)

  3. Start the LiteLLM instance:
litellm --config=config.yaml
  4. Finally, you can use either the openai or the litellm module to call the LLM API:

https://docs.litellm.ai/docs/proxy/user_keys

For example,

import openai
client = openai.OpenAI(
    api_key="anything", # can try to pass in LITELLM_MASTER_KEY first
    base_url="http://localhost:4000"
)

# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(
    model="gemini/gemini-2.0-flash-exp",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ],
)

print(response)

Let me know if you need further help, thanks.

jhrozek added a commit that referenced this issue Feb 5, 2025
jhrozek added a commit that referenced this issue Feb 6, 2025
* Move _get_base_url to the base provider

In order to properly support "muxing providers" like openrouter, we'll
have to tell litellm (or, in the future, a native implementation) what server
we want to proxy to. We were already doing that with Vllm, but since we are
about to do the same for OpenRouter, let's move the `_get_base_url`
method to the base provider.

* Add an openrouter provider

OpenRouter is a "muxing provider" which itself provides access to
multiple models and providers. It speaks a dialect of the OpenAI protocol, but
for our purposes, we can say it's OpenAI.

There are some differences in handling the requests, though:
1) we need to know where to forward the request to, by default this is
   `https://openrouter.ai/api/v1`, this is done by setting the base_url
   parameter
2) we need to prefix the model with `openrouter/`. This is a
   lite-LLM-ism (see https://docs.litellm.ai/docs/providers/openrouter)
   which we'll be able to remove once we ditch litellm

Initially I was considering just exposing the OpenAI provider on an
additional route and handling the prefix based on the route, but I think
having an explicit provider class is better as it allows us to handle
any differences in OpenRouter dialect easily in the future.

Related: #878

* Add a special ProviderType for openrouter

We can later alias it to openai if we decide to merge them.

* Add tests for the openrouter provider

* ProviderType was reversed, thanks Alejandro

---------

Co-authored-by: Radoslav Dimitrov <[email protected]>
@tan-yong-sheng
Author

tan-yong-sheng commented Feb 8, 2025

Hi, just wanted to check whether the method above works for running the LiteLLM proxy on localhost (in a Linux environment)? Let me know if you need any help.

@jhrozek
Contributor

jhrozek commented Feb 12, 2025

Hi, just wanted to check whether the method above works for running the LiteLLM proxy on localhost (in a Linux environment)? Let me know if you need any help.

Thank you for the pointers. We merged the OpenRouter patches last week, and the LiteLLM support should be along the same lines. I got sidetracked by other work, but I'll try to get this back on track soon!

If other codegate developers want to take this issue from me, feel free to.

@jhrozek
Contributor

jhrozek commented Feb 12, 2025

@rdimitrov

@rdimitrov rdimitrov self-assigned this Feb 20, 2025
@rdimitrov
Member

hey, @tan-yong-sheng 👋

So I've tried an example setup locally and I think I was able to reproduce the issue.

TL;DR: The main issue is that our internal dependency on litellm (not the proxy, but the library, which we are in the process of removing 🥳) re-routes the model provider URL on its own based on the model name. We expect to release a fix for this in the next one or two releases.

The setup that I have is the following:

  • Cline: Configured to talk to CodeGate's OpenAI-compatible API (note: the model name is what the LiteLLM Proxy expects)
Image
  • CodeGate: Configured to talk to the LiteLLM Proxy (running locally on localhost:4000) by overriding the OpenAI URL flag

<codegate serve command> --openai-url=http://localhost:4000

  • LiteLLM Proxy: Configured to talk to a local Ollama server (but can be configured to other providers as well)
model_list:
  - model_name: phi4
    litellm_params:
      model: ollama/phi4 # LiteLLM expects the models to be prefixed by their provider
      api_base: http://localhost:11434 # This is where Ollama is serving

general_settings:
  master_key: sk-1234 # That secret is used in the Cline config for API key
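
For anyone reproducing this, a minimal sanity check of the proxy leg on its own (using the assumed values above: master key sk-1234, model phi4, proxy on localhost:4000) could look like this:

import openai

# Talk to the LiteLLM proxy directly (no CodeGate in the path) to confirm
# the proxy itself works before putting CodeGate in front of it.
client = openai.OpenAI(
    api_key="sk-1234",                 # the master_key from the config above
    base_url="http://localhost:4000",  # the LiteLLM proxy
)

response = client.chat.completions.create(
    model="phi4",  # the model_name exposed by the proxy config above
    messages=[{"role": "user", "content": "say hi"}],
)
print(response.choices[0].message.content)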

The issue:

  • This should work, but our internal use of LiteLLM as a library tries to be smart here: once it sees the prefixed model name (ollama/phi4), it overwrites the provider URL we configured above via --openai-url=http://localhost:4000 with http://localhost:11434, essentially bypassing the LiteLLM Proxy.
  • It does the same for OpenAI and other providers too, which is why using your LiteLLM key doesn't work but using, for example, the OpenAI key does.

Solution:

  • The good news is that @blkt and @jhrozek are already in the process of replacing our internal use of LiteLLM with our own implementation, which will simplify the codebase and also resolve this and other issues caused by LiteLLM.

I'll ping you once this is released 🙏 The work is almost complete, so it should land in one of the next releases.

@tan-yong-sheng
Author

Hi, sorry for missing this and being late to reply.

I'll ping you once this is released 🙏 The work is almost complete, so it should land in one of the next releases.

Sure, thanks a lot.
