
Generate text embeddings with Semantic Kernel and Ollama

Retrieval-augmented generation, also known as RAG, is an NLP technique that can help improve the quality of large language model (LLM) responses. It allows your AI agent to retrieve data from external sources to generate grounded answers, which helps prevent the agent from hallucinating and returning incorrect information.

There are multiple ways to retrieve this external data. In this post I want to show you how to generate vector embeddings that can be stored in and retrieved from a vector database. This means we first need to decide which database to use. The list of options keeps growing (even the new SQL Server version will support vector embeddings out of the box). As we want to demonstrate this feature using Semantic Kernel, we need to take a look at one of the available connectors.

Qdrant Vector Database

I decided to use Qdrant for this blog post.

Qdrant is an AI-native vector database and a semantic search engine. You can use it to extract meaningful information from unstructured data.

You can run Qdrant locally using Docker. I have a local /data/qdrant folder that Qdrant should use to store the data:

docker run --name qdrantdemo -p 6333:6333 -p 6334:6334 -v "/data/qdrant:/qdrant/storage" qdrant/qdrant

If you browse to http://localhost:6333/dashboard you get a web interface to explore the data in the vector store.

Remark: Different vector stores expect vectors in different formats and sizes. So if you want to use the code I will show you in this post with another vector database, you will probably need to make some changes.

Ollama Text Embeddings

To generate our embeddings, we need a text embedding generator. Ollama supports multiple embedding models; I decided to install the ‘nomic-embed-text’ model:

ollama pull nomic-embed-text

Remark: For a good introduction to different embeddings, check out this post.

Now we can move to the code and start by adding the following NuGet packages:

dotnet add package Microsoft.SemanticKernel.Connectors.Ollama
dotnet add package Microsoft.SemanticKernel.Connectors.Qdrant

Once these packages are installed, we can create a new OllamaApiClient instance referencing the text embedding model we installed above.
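A minimal sketch of what this could look like. The endpoint and model name assume a default local Ollama installation; OllamaApiClient comes from the OllamaSharp library that the Semantic Kernel connector builds on:

```csharp
using OllamaSharp;

// Point the client at the default local Ollama endpoint;
// "nomic-embed-text" is the embedding model we pulled above
var ollamaClient = new OllamaApiClient(
    new Uri("http://localhost:11434"),
    defaultModel: "nomic-embed-text");
```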

We can also directly generate an ITextEmbeddingGenerationService:
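Something along these lines, assuming the connector exposes an AsTextEmbeddingGenerationService extension method on the Ollama client (this API is in preview and may change):

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Embeddings;

// Wrap the Ollama client in Semantic Kernel's embedding abstraction
ITextEmbeddingGenerationService embeddingService =
    ollamaClient.AsTextEmbeddingGenerationService();
```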

Now we can take some data and convert it to a vector by using the following code:
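A sketch of generating a single embedding; the sample text is my own and the service variable comes from the previous step:

```csharp
var text = "Qdrant is an AI-native vector database and semantic search engine.";

// GenerateEmbeddingAsync returns a ReadOnlyMemory<float>;
// nomic-embed-text produces 768-dimensional vectors
ReadOnlyMemory<float> embedding =
    await embeddingService.GenerateEmbeddingAsync(text);

Console.WriteLine($"Vector size: {embedding.Length}");
```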

Bringing it all together

Almost there! The last step is to get this generated embedding into our vector database.

To do that, we first need to create a Qdrant VectorStore object:
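A sketch assuming the preview Qdrant connector API; 6334 is the gRPC port mapped in the docker run command above:

```csharp
using Microsoft.SemanticKernel.Connectors.Qdrant;
using Qdrant.Client;

// Connect to the local Qdrant instance over gRPC
var vectorStore = new QdrantVectorStore(new QdrantClient("localhost", 6334));
```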

We also need to create a model that describes what a vector store record in Qdrant should look like, by annotating it correctly:
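As an illustration, an annotated record type might look like this. TextSnippet and its property names are hypothetical; the attributes come from the preview vector store abstractions, and the vector dimension must match the embedding model:

```csharp
using Microsoft.Extensions.VectorData;

// Illustrative record type; the names are my own
public class TextSnippet
{
    [VectorStoreRecordKey]
    public Guid Key { get; set; }

    [VectorStoreRecordData]
    public string Text { get; set; }

    // Dimension must match the embedding model (768 for nomic-embed-text)
    [VectorStoreRecordVector(768)]
    public ReadOnlyMemory<float> Embedding { get; set; }
}
```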

Once we have that model defined, we can create a new collection using it. Notice that I’m specifying a Guid as the key (ulong values are also supported):
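Roughly like this, assuming a record type named TextSnippet annotated for the connector (a hypothetical name) and a collection name of my choosing:

```csharp
// Get a strongly-typed collection with a Guid key and create it if it
// does not exist yet
var collection = vectorStore.GetCollection<Guid, TextSnippet>("snippets");
await collection.CreateCollectionIfNotExistsAsync();
```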

Now we can use this to upload the example data:
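A sketch of the upload step, again assuming the hypothetical TextSnippet record type and the embedding service created earlier:

```csharp
var text = "Qdrant is an AI-native vector database and semantic search engine.";

// Build a record combining the raw text and its embedding
var snippet = new TextSnippet
{
    Key = Guid.NewGuid(),
    Text = text,
    Embedding = await embeddingService.GenerateEmbeddingAsync(text)
};

// Upsert inserts the record, or updates it if the key already exists
await collection.UpsertAsync(snippet);
```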

That’s it for today! In a follow-up post I’ll show you how to integrate this information into our agent for a complete RAG solution.

More information

wullemsb/SemanticKernel at RAG

Retrieve data from plugins for RAG | Microsoft Learn

Generating embeddings for Semantic Kernel Vector Store connectors | Microsoft Learn

Qdrant - Vector Database - Qdrant

Using the Semantic Kernel Qdrant Vector Store connector (Preview) | Microsoft Learn
