Dynamic Creation of chat completion endpoints during runtime #6811
-
Hi! I currently use this code to initialize my semantic kernel:
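Roughly like this (simplified; the endpoint URL and DI wiring shown here are placeholders for what I actually use):

var kernelBuilder = Kernel.CreateBuilder();
#pragma warning disable SKEXP0010
// Ollama's OpenAI-compatible endpoint, with the model hardcoded for now.
kernelBuilder.Services.AddOpenAIChatCompletion("phi3", new Uri("http://localhost:11434"));
#pragma warning restore SKEXP0010
var kernel = kernelBuilder.Build();
services.AddSingleton(kernel); // injected elsewhere via DI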
Which is great, and it works to inject the kernel. As you can see, I connect to Ollama, and currently I have the phi3 model hardcoded during the creation of the kernel. Now I want to be able to switch models dynamically at runtime. A user of my app has the possibility to pull Ollama models, and the newly pulled model should be used when calling the kernel like this:
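Along these lines (the prompt and argument names are placeholders):

// Sketch of the call site.
var result = await kernel.InvokePromptAsync(
    "Summarize the following text: {{$input}}",
    new KernelArguments { ["input"] = userText });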
I would like to be able to add and remove chat completion services so that the kernel configuration stays in sync with my running Ollama server. I thought maybe I could do something like kernel.AddChatCompletion and kernel.RemoveChatCompletion at runtime. But now I think there is a gap in my understanding of Semantic Kernel, and I am a little bit lost. Maybe someone can get me back on track with an idea of how to implement this feature. Thanks!
-
Hi @Maniga, in order to choose a model at runtime, you need to pre-register the possible models during kernel registration and then choose the model before invoking the kernel. The example below shows how to achieve this behavior. If you don't know which models may potentially be used, at runtime you can check whether the kernel already has a service with the registered model and, if not, add it and use it. Let me know if this example helps you achieve your scenario. Thanks a lot!
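A minimal sketch of that pattern, assuming Ollama's OpenAI-compatible endpoint; the model names and service ids here are illustrative:

var builder = Kernel.CreateBuilder();
#pragma warning disable SKEXP0010
// Pre-register every model that might be requested, each under its own service id.
builder.Services.AddOpenAIChatCompletion("phi3", new Uri("http://localhost:11434"), serviceId: "phi3");
builder.Services.AddOpenAIChatCompletion("llama3", new Uri("http://localhost:11434"), serviceId: "llama3");
#pragma warning restore SKEXP0010
var kernel = builder.Build();

// Choose the service per call via execution settings.
var settings = new PromptExecutionSettings { ServiceId = "llama3" };
var result = await kernel.InvokePromptAsync("Hello!", new KernelArguments(settings));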
-
Hi @dmytrostruk! Thanks for your answer. I saw those samples before and tried to make sense of them for my scenario. But my problem is, I tried to create a new instance of an OpenAIChatCompletionService with another model name, using code like this:
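(Reconstructed for this writeup; the model name and endpoint are placeholders.)

#pragma warning disable SKEXP0010
// Standalone chat completion service pointed at the Ollama endpoint.
var newService = new OpenAIChatCompletionService(
    modelId: "llama3",
    endpoint: new Uri("http://localhost:11434"),
    apiKey: null);
#pragma warning restore SKEXP0010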
But I don't see how to add this new chat completion service to an existing kernel. There is no Add method on the kernel's services collection.
-
Here is my solution, thanks to @dmytrostruk, in case someone needs to implement a similar feature: I created a SemanticKernelManager service which is responsible for re-initializing the Semantic Kernel every time a user pulls or removes a model in Ollama. I use clean architecture, so a domain event is triggered and the handler for the domain event calls the initialization function. The manager service also holds the kernel instance, so I can use it in other places throughout the application. With this solution I don't have to initialize the kernel before each call.

using Microsoft.Extensions.Configuration;
using Microsoft.SemanticKernel;

public class SemanticKernelManager : ISemanticKernelManager
{
    private readonly IConfiguration _configuration;
    private readonly IOllamaService _ollamaService;

    public SemanticKernelManager(IOllamaService ollamaService, IConfiguration configuration)
    {
        _ollamaService = ollamaService;
        _configuration = configuration;
        Initialize();
    }

    public Kernel Kernel { get; private set; } = null!;

    // Called by the domain event handler whenever a model is pulled or removed in Ollama.
    public void OllamaModelChanged()
    {
        Initialize();
    }

    private void Initialize()
    {
        // Fall back to the default local Ollama endpoint if none is configured.
        var ollamaEndpoint = _configuration.GetValue<string>("services:ollama:ollama:0")
                             ?? "http://localhost:11434";
        var endpoint = new Uri(ollamaEndpoint);

        // Blocks synchronously on the async call; Initialize() runs from the
        // constructor and from a synchronous event handler.
        var modelsResult = _ollamaService.GetLoadedModels().Result;
        var models = modelsResult.Models.Select(x => x.Name).ToList();

        var kernelBuilder = Kernel.CreateBuilder();
#pragma warning disable SKEXP0010
        // Register one chat completion service per model currently available in Ollama.
        foreach (var model in models)
        {
            kernelBuilder.Services.AddOpenAIChatCompletion(model, endpoint);
        }
#pragma warning restore SKEXP0010

        var promptsFolderPath = Path.Combine(
            AppContext.BaseDirectory, "AiService", "Prompts", "ChatCompletionPlugin");
        kernelBuilder.Plugins.AddFromPromptDirectory(promptsFolderPath);

        Kernel = kernelBuilder.Build();
    }
}
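For reference, I wire it up roughly like this (simplified sketch; the exact registration depends on your setup):

// Composition root: one manager instance for the whole app.
services.AddSingleton<ISemanticKernelManager, SemanticKernelManager>();

// Consumer: always fetch the current kernel from the manager, since
// Initialize() swaps the instance whenever Ollama models change.
var kernel = _semanticKernelManager.Kernel;
var result = await kernel.InvokePromptAsync("Hello!");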
@Maniga Got it, thanks for providing more details. I think service registration is available only during kernel construction (e.g. when using KernelBuilder or new Kernel(services)). In your case, before making a request, you can create a new kernel instance, import the necessary plugins and services (e.g. an OpenAIChatCompletionService with the modelName provided by the user) and execute the request. Let me know if that works for your scenario.
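A minimal sketch of that per-request construction, where modelName and userPrompt come from the user and the endpoint is a placeholder:

#pragma warning disable SKEXP0010
// Build a fresh kernel for this request with the user-selected model.
var builder = Kernel.CreateBuilder();
builder.Services.AddOpenAIChatCompletion(modelName, new Uri("http://localhost:11434"));
#pragma warning restore SKEXP0010
var kernel = builder.Build();
var result = await kernel.InvokePromptAsync(userPrompt);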