kubernetes-sigs/llm-instance-gateway

Kubernetes LLM Instance Gateway

The LLM Instance Gateway is part of wg-serving. This repo contains the load balancing algorithm, ext-proc code, CRDs, and controllers that support the LLM Instance Gateway.

This Gateway is intended to provide value to multiplexed LLM services on a shared pool of compute. See the proposal for more info.

Status

This project is currently in development.

For quicker testing, see our PoC in the ./examples/ directory.

Getting Started

Install the CRDs into the cluster:

make install

Delete the APIs (CRDs) from the cluster:

make uninstall

Deploying the ext-proc image

Refer to this README for how to deploy the ext-proc image used to support the Instance Gateway.

Contributing

Our community meeting is held weekly on Thursdays at 10AM PDT; Zoom link here.

We currently use the #wg-serving Slack channel for communications.

Contributions are welcome. Thanks for joining us!

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.