Retrieval-augmented generation, also known as RAG, is an NLP technique that can help improve the quality of responses from large language models (LLMs). It allows your AI agent to retrieve data from external sources and generate grounded responses. This helps prevent your agent from hallucinating and returning incorrect information.
There are multiple ways to retrieve this external data. In this post I want to show you how to generate vector embeddings that can be stored in and retrieved from a vector database. This means we first need to decide which database to use. The list of options keeps growing (even the new SQL Server version will support vector embeddings out of the box). As we want to demonstrate this feature using Semantic Kernel, we need to take a look at one of the available connectors.
Qdrant Vector Database
I decided to use Qdrant for this blog post.
Qdrant is an AI-native vector database and a semantic search engine. You can use it to extract meaningful information from unstructured data.
You can run Qdrant locally using Docker. I have a local /data/qdrant folder that Qdrant should use to store the data:
docker run --name qdrantdemo -p 6333:6333 -p 6334:6334 -v "/data/qdrant:/qdrant/storage" qdrant/qdrant
If you browse to http://localhost:6333/dashboard you get a web interface to explore the data in the vector store:
Remark: Different vector stores expect the vectors in different formats and sizes. So if you want to use the code I show you in this post with another vector database, you will probably need to make some changes.
Ollama Text Embeddings
To generate our embeddings, we need a text embedding generator. Ollama supports multiple embedding models; I decided to install the ‘nomic-embed-text’ model:
ollama pull nomic-embed-text
Remark: For a good introduction to different embedding models, check out this post.
Now we can move to the code and start by adding the following NuGet packages:
dotnet add package Microsoft.SemanticKernel.Connectors.Ollama
dotnet add package Microsoft.SemanticKernel.Connectors.Qdrant
Once these packages are installed, we can create a new OllamaApiClient instance referencing the text embedding model we installed above. We can also directly generate an ITextEmbeddingGenerationService:
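A sketch of what that setup could look like, assuming Ollama is running on its default port (11434) and based on the preview Ollama connector API at the time of writing:

```csharp
using Microsoft.SemanticKernel.Embeddings;
using OllamaSharp;

// Point the client at the local Ollama instance and the embedding model we pulled earlier
var ollamaClient = new OllamaApiClient(
    uriString: "http://localhost:11434",
    defaultModel: "nomic-embed-text");

// The Semantic Kernel Ollama connector exposes the client as an embedding service
ITextEmbeddingGenerationService embeddingService =
    ollamaClient.AsTextEmbeddingGenerationService();
```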
Now we can take some data and convert it to a vector by using the following code:
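As an illustration, embedding a single piece of text could look roughly like this (the example sentence is my own placeholder):

```csharp
// Turn a piece of text into a vector; nomic-embed-text returns a 768-dimensional embedding
string text = "Semantic Kernel is an SDK for building AI agents.";
ReadOnlyMemory<float> embedding = await embeddingService.GenerateEmbeddingAsync(text);

Console.WriteLine($"Generated a vector with {embedding.Length} dimensions.");
```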
Bringing it all together
Almost there! The last step is to get this generated embedding into our vector database.
To do so, we first need to create a Qdrant VectorStore object:
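With the preview Qdrant connector, this could be as simple as wrapping a QdrantClient that points at the local instance we started above:

```csharp
using Microsoft.SemanticKernel.Connectors.Qdrant;
using Qdrant.Client;

// The QdrantClient talks gRPC on port 6334, which we mapped in the docker run command
var vectorStore = new QdrantVectorStore(new QdrantClient("localhost"));
```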
We also need to create a model that describes what a vector store record in Qdrant should look like by annotating it correctly:
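A minimal example model, using the attributes from Microsoft.Extensions.VectorData (the class and property names here are my own; the vector dimension of 768 matches what nomic-embed-text produces):

```csharp
using Microsoft.Extensions.VectorData;

public class TextSnippet
{
    [VectorStoreRecordKey]
    public Guid Key { get; set; }

    [VectorStoreRecordData]
    public string Text { get; set; }

    // Dimensions must match the embedding model (768 for nomic-embed-text)
    [VectorStoreRecordVector(768, DistanceFunction.CosineSimilarity)]
    public ReadOnlyMemory<float> Embedding { get; set; }
}
```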
Once we have that model defined, we can create a new collection using this model. Notice that I’m specifying a Guid as the key (ulong values are also supported):
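Sticking with the hypothetical TextSnippet model above, creating the collection could look like this (the collection name "snippets" is my own choice):

```csharp
// Get a strongly typed collection and create it in Qdrant if it doesn't exist yet
var collection = vectorStore.GetCollection<Guid, TextSnippet>("snippets");
await collection.CreateCollectionIfNotExistsAsync();
```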
Now we can use this to upload the example data:
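Bringing the embedding service and the collection together, an upsert could look roughly like this sketch:

```csharp
// Generate an embedding for the text and store both in Qdrant
string text = "Semantic Kernel is an SDK for building AI agents.";
ReadOnlyMemory<float> embedding = await embeddingService.GenerateEmbeddingAsync(text);

await collection.UpsertAsync(new TextSnippet
{
    Key = Guid.NewGuid(),
    Text = text,
    Embedding = embedding
});
```

After running this, the record should show up in the Qdrant dashboard at http://localhost:6333/dashboard.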
That’s it for today! In a follow-up post I’ll show you how we can integrate this information in our Agent and have a complete RAG solution.
More information
wullemsb/SemanticKernel at RAG
Retrieve data from plugins for RAG | Microsoft Learn
Generating embeddings for Semantic Kernel Vector Store connectors | Microsoft Learn
Qdrant - Vector Database - Qdrant
Using the Semantic Kernel Qdrant Vector Store connector (Preview) | Microsoft Learn