Yesterday I talked about Podman AI Lab as an alternative to Ollama to run your Large Language Models locally. Among the list of features, I noticed the following one:
Mmh, an OpenAI-compatible API… That made me wonder if I could use Semantic Kernel to talk to the local service.
Let’s give it a try…
I first added the minimal amount of code needed to use Semantic Kernel.
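Something along these lines is enough (a minimal sketch rather than the exact code from the repo; the port, the URI path and the prompt are assumptions, so adjust them to what Podman AI Lab shows for your running inference server):

```csharp
using Microsoft.SemanticKernel;

// The custom-endpoint overload of AddOpenAIChatCompletion may be marked
// experimental, depending on the Semantic Kernel version.
#pragma warning disable SKEXP0010

var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(
        modelId: "granite",                             // any value works here (see the notes below)
        endpoint: new Uri("http://localhost:35000/v1"), // OpenAI-compatible endpoint shown by Podman AI Lab (port/path are assumptions)
        apiKey: "unused")                               // the local service doesn't check the key
    .Build();

var result = await kernel.InvokePromptAsync("Why is the sky blue?");
Console.WriteLine(result);
```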
Compared to the same code using Ollama (see the sketch after this list), there are only two important things to notice:
- I adapted the URI to match the endpoint of the service running inside Podman
- I could set the ModelId to any value I want, as the endpoint only hosts one specific model (granite in this example)
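For comparison, the Ollama-based setup would look roughly like this (a sketch, assuming Ollama's default OpenAI-compatible endpoint and an actual model name such as llama3):

```csharp
using Microsoft.SemanticKernel;

#pragma warning disable SKEXP0010 // same experimental overload as above

var ollamaKernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(
        modelId: "llama3",                              // must match the model you pulled in Ollama
        endpoint: new Uri("http://localhost:11434/v1"), // Ollama's default OpenAI-compatible endpoint
        apiKey: "unused")
    .Build();
```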
And just to prove that it really works, here are the results I got back:
This is again a great example of how the abstraction that Semantic Kernel offers simplifies interacting with multiple LLMs.
Nice!
IMPORTANT: I first tried to get this working with the latest prerelease of Semantic Kernel (1.18.0-rc). However, when I used that version it resulted in an HTTPException. When I used the latest stable version at the moment of writing (1.17.1), it did work as demonstrated above.
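If you run into the same problem, pinning the package to the stable release avoids it:

```
dotnet add package Microsoft.SemanticKernel --version 1.17.1
```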
More information
Here is the link to a fully working example: wullemsb/PodmanAISemanticKernel: Example of combining Podman AI Lab with Semantic Kernel (github.com)