[Discussion]: LiteLLM Proxy YAML Config v2.0 #1000
Replies: 6 comments 6 replies
-
from @PSU3D0

model_list:
  models:
    gpt-4-turbo-1106:
      load_balancing:
        strategy: "shuffle"
      success_callback: ["langfuse"]
      litellm_params:
        api_version: "2023-05-15" # overrides the globally scoped value
      providers:
        azure_west_1:
          litellm_params:
            model: azure/gpt-4-turbo-1106
            api_base: https://test1-1.openai.azure.com/
            api_key: os.environ/AZURE_OAI_US_WEST_1_API_KEY
        azure_east_2:
          tpm: 45
          metadata:
            langfuse/continent: "north_america"
          litellm_params:
            model: azure/gpt-4-turbo-1106
            api_version: "2023-03-21" # overrides the model-scoped value
            api_base: https://test2-1.openai.azure.com/
            api_key: os.environ/AZURE_OAI_US_EAST_2_API_KEY

A few benefits here: each provider is explicitly named, so if I have identical deployment names across Azure regions I can still differentiate them, and litellm_params can be scoped to a specific LiteLLM-exposed model.
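To make the override semantics concrete, here is a minimal sketch of the precedence implied by the comments above (global < model-scoped < provider-scoped). The resolve_litellm_params helper and the global default value are hypothetical, written only to illustrate the idea, not LiteLLM code:

# Hypothetical sketch of how litellm_params could be resolved for one provider.
# Precedence (lowest to highest): global < model-scoped < provider-scoped,
# matching the "overrides" comments in the proposal above.

def resolve_litellm_params(global_params: dict, model_params: dict, provider_params: dict) -> dict:
    """Merge the three scopes; later (more specific) layers win on key conflicts."""
    merged: dict = {}
    for layer in (global_params, model_params, provider_params):
        merged.update(layer)
    return merged

# Values mirroring the config above (the global default is an assumption):
global_params = {"api_version": "2023-07-01"}
model_params = {"api_version": "2023-05-15"}
provider_params = {
    "model": "azure/gpt-4-turbo-1106",
    "api_version": "2023-03-21",
    "api_base": "https://test2-1.openai.azure.com/",
}

print(resolve_litellm_params(global_params, model_params, provider_params)["api_version"])
# -> 2023-03-21 (the provider-scoped value wins)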
-
cc @Manouchehri
-
Moving this to be a discussion for a future v2.
-
Alternative idea:

model_list:
  - model_name: gpt-3.5-turbo # user-facing model alias
    litellm_params: # all params accepted by litellm.completion() - https://docs.litellm.ai/docs/completion/input
      model: azure/<your-deployment-name>
      api_base: <your-azure-api-endpoint>
      api_key: <your-azure-api-key>
    model_info:
      id: unique-123
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/gpt-turbo-small-ca
      api_base: https://my-endpoint-canada-berri992.openai.azure.com/
      api_key: <your-azure-api-key>
    model_info:
      id: unique-456
  - model_name: vllm-model
    litellm_params:
      model: openai/<your-model-name>
      api_base: <your-api-base> # e.g. http://0.0.0.0:3000
    model_info:
      id: unique-789

router_settings: # router config
  model_group_list:
    - model_group_name: "gpt-free-models" # user-facing model group alias
      models: ["unique-123", "unique-456"]
    - model_group_name: "gpt-paid-models"
      models: ["unique-789"]
  num_retries: 3
  fallbacks: [{"gpt-paid-models": ["gpt-free-models"]}]
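With this layout a caller addresses the user-facing alias rather than an individual deployment. A minimal usage sketch with the OpenAI Python SDK (>=1.0); the proxy URL and client API key are placeholder assumptions, not part of the proposal:

# Sketch: calling the proxy through a model group alias.
import openai

client = openai.OpenAI(
    base_url="http://0.0.0.0:4000",  # assumed local proxy address
    api_key="sk-anything",           # placeholder; real provider keys live in the proxy config
)

# "gpt-free-models" resolves to deployments unique-123 / unique-456; per the
# fallbacks entry above, a failing call to "gpt-paid-models" would retry on this group.
response = client.chat.completions.create(
    model="gpt-free-models",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)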
-
Hey @krrishdholakia, here is my take on a refactor:

models:
  - id: unique-123
    model: azure/<your-deployment-name>
    api_base: <your-azure-api-endpoint>
    api_key: <your-azure-api-key>
  - id: unique-456
    model: azure/gpt-turbo-small-ca
    api_base: https://my-endpoint-canada-berri992.openai.azure.com/
    api_key: <your-azure-api-key>
  - id: unique-789
    model: openai/<your-model-name>
    api_key: <your-api-key>

router_settings:
  groups:
    - id: gpt-free-models
      models: ["unique-123", "unique-456"]
    - id: gpt-paid-models
      models: ["unique-789"]
  num_retries: 3
  fallbacks:
    - gpt-paid-models: ["gpt-free-models"]
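One property of this flatter shape is that it maps directly onto simple typed records, which makes validating a config straightforward. A sketch using dataclasses; the classes themselves are illustrative, with field names taken from the proposal:

# Illustrative typed view of the proposed flat schema, not LiteLLM code.
from dataclasses import dataclass, field

@dataclass
class ModelEntry:
    id: str
    model: str
    api_key: str
    api_base: str | None = None  # optional, e.g. for hosted openai/* models

@dataclass
class Group:
    id: str
    models: list[str]  # references to ModelEntry.id values

@dataclass
class RouterSettings:
    groups: list[Group]
    num_retries: int = 3
    fallbacks: list[dict[str, list[str]]] = field(default_factory=list)

# Example entry from the proposal above:
entry = ModelEntry(id="unique-789", model="openai/<your-model-name>", api_key="<your-api-key>")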
These changes collectively improve the YAML specification by making it more concise, reducing unnecessary nesting, and keying everything on unique identifiers for clarity and ease of configuration.

Final YAML structure, with a real-world example of how someone might use the spec:
models:
  - id: azure-us-east-35-turbo
    model: azure/gpt-3_5-turbo
    api_base: https://berri-us-east.openai.azure.com/
    api_key: some-key-12318231731723712
  - id: azure-can-east-35-turbo
    model: azure/gpt-3_5-turbo
    api_base: https://berri-can-east.openai.azure.com/
    api_key: some-key-5753736262525221
  - id: openai-gpt-3_5_turbo
    model: openai/gpt-3_5-turbo
    api_key: some-key-31317123712631632

router_settings:
  groups:
    - id: gpt-3.5
      models: ["azure-us-east-35-turbo", "azure-can-east-35-turbo", "openai-gpt-3_5_turbo"]
    - id: azure-3.5
      models: ["azure-can-east-35-turbo", "azure-us-east-35-turbo"]
    - id: oai-3.5
      models: ["openai-gpt-3_5_turbo"]
  num_retries: 3
  fallbacks:
    - azure-3.5: ["oai-3.5"]
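To show how the groups and fallbacks would compose at request time, here is an illustrative loader for this proposed schema (not LiteLLM's implementation). It assumes the example above is saved as config.yaml and that PyYAML is installed:

# Illustrative loader for the proposed schema above; requires `pip install pyyaml`.
import yaml

with open("config.yaml") as f:
    config = yaml.safe_load(f)

models_by_id = {m["id"]: m for m in config["models"]}
groups = {g["id"]: g["models"] for g in config["router_settings"]["groups"]}

# fallbacks is a list of single-key mappings: {group: [fallback groups...]}
fallbacks: dict[str, list[str]] = {}
for entry in config["router_settings"].get("fallbacks", []):
    fallbacks.update(entry)

def candidate_deployments(group_id: str) -> list[dict]:
    """Deployments for a group, followed by those of its fallback groups."""
    group_ids = [group_id] + fallbacks.get(group_id, [])
    return [models_by_id[mid] for gid in group_ids for mid in groups[gid]]

# A request to "azure-3.5" tries both Azure deployments, then falls back
# to the OpenAI deployment via the "oai-3.5" group.
for deployment in candidate_deployments("azure-3.5"):
    print(deployment["id"], "->", deployment["model"])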
-
The Feature
Starting this issue to track how we can improve the LiteLLM Proxy config for the next version.