Yesterday I talked about Podman AI Lab as an alternative to Ollama for running your Large Language Models locally. Among the list of features, I noticed the following one:
Mmh, an OpenAI compatible API… That made me wonder if I could use Semantic Kernel to talk to the local service.
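"OpenAI compatible" means the local service should accept the same request shape as OpenAI's chat completions endpoint. As a quick sketch (assuming the service exposes the standard `/v1/chat/completions` route, and using the port 65527 that Podman AI Lab reported for my service), you could verify this with a plain HTTP call before writing any .NET code:

```shell
# Sketch: call the local Podman AI Lab endpoint directly, assuming it
# implements the standard OpenAI chat completions route.
curl http://localhost:65527/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "doesntmatter",
        "messages": [{"role": "user", "content": "Why should I visit Paris?"}]
      }'
```

If that returns a JSON completion, any OpenAI client library should be able to talk to it too.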
Let’s give it a try…
I first added the minimal amount of code needed to use Semantic Kernel:
```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var modelId = "doesntmatter";
// local Podman Desktop endpoint
var endpoint = new Uri("http://localhost:65527");

var kernelBuilder = Kernel.CreateBuilder();

#pragma warning disable SKEXP0010 // Type is for evaluation purposes only and is subject to change or removal in future updates.
var kernel = kernelBuilder
    .AddOpenAIChatCompletion(
        modelId,
        endpoint,
        apiKey: null)
    .Build();
#pragma warning restore SKEXP0010

var chatService = kernel.GetRequiredService<IChatCompletionService>();

ChatHistory chat = new();
chat.AddSystemMessage("You are a helpful travel assistant.");

var executionSettings = new OpenAIPromptExecutionSettings
{
    MaxTokens = 1000,
    Temperature = 0.5,
    TopP = 1,
    FrequencyPenalty = 0,
    PresencePenalty = 0,
    StopSequences = new[] { "Human:", "AI:" },
};

// Add the user prompt to the chat history so the system message is sent along with it
var prompt = "Why should I visit Paris?";
chat.AddUserMessage(prompt);

var response = await chatService.GetChatMessageContentAsync(chat, executionSettings);
Console.WriteLine(response.Content);
```
Compared to the same code using Ollama, there are only two important things to notice:
- I adapted the URI to match the service URI running inside Podman.
- I could set the ModelId to any value I want, as the endpoint only hosts one specific model (granite in this example).
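To see which model the endpoint is actually serving, you can query it directly. This is a sketch that assumes Podman AI Lab also implements the standard OpenAI `/v1/models` route:

```shell
# Sketch: list the model(s) hosted by the local endpoint,
# assuming the standard OpenAI /v1/models route is available.
curl http://localhost:65527/v1/models
```

Whatever model id you pass from Semantic Kernel, the endpoint has nothing else to route the request to, which is why any value works.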
And just to prove that it really works, here are the results I got back:
This is again a great example of how the abstraction that Semantic Kernel offers simplifies interacting with multiple LLMs.
Nice!
IMPORTANT: I first tried to get this working with the latest prerelease of Semantic Kernel (1.18.0-rc). However, when I used that version it resulted in an HTTPException. When I used the latest stable version at the moment of writing (1.17.1), it worked as demonstrated above.
More information
Here is the link to a fully working example: wullemsb/PodmanAISemanticKernel: Example of combining Podman AI Lab with Semantic Kernel (github.com)