Running large language models locally using Ollama

In this post I want to introduce you to Ollama. Ollama is a streamlined tool for running open-source LLMs locally, including Mistral and Llama 2. It bundles model weights, configuration, and data into a single package defined by a Modelfile.

Ollama supports a variety of LLMs, including Llama 2, uncensored Llama, Code Llama, Falcon, Mistral, Vicuna, WizardCoder, and Wizard uncensored.


To install Ollama on Windows, first download the executable available here:

Run the executable to start the Installation wizard:

Click Install to start the installation process. After the installation has completed, Ollama will be running in the background:

We can now open a command prompt and call ollama:

Download a model

Before we can do anything useful, we first need to download a specific language model. The full list of models can be found at

Here are some examples:

Model         Parameters   Size    Download
Llama 2       7B           3.8GB   ollama run llama2
Mistral       7B           4.1GB   ollama run mistral
Llama 2 13B   13B          7.3GB   ollama run llama2:13b
Llama 2 70B   70B          39GB    ollama run llama2:70b

Remark: Make sure you have enough RAM before you try to run one of the larger models.
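To get a feeling for why the larger models need so much memory, here is a rough back-of-the-envelope sketch in Python. It assumes the weights are stored at roughly 4.5 bits per parameter; that figure is my own assumption, picked because it lands close to the download sizes listed above, and the function name is purely illustrative. Real memory use will be higher, because the runtime needs working memory on top of the weights.

```python
def estimate_model_size_gb(parameters_billion: float, bits_per_weight: float = 4.5) -> float:
    """Return a rough size of the model weights in GB (10^9 bytes).

    bits_per_weight is an assumption, not a value published by Ollama.
    """
    total_bits = parameters_billion * 1e9 * bits_per_weight
    total_bytes = total_bits / 8
    return total_bytes / 1e9


def fits_in_memory(parameters_billion: float, available_ram_gb: float) -> bool:
    """Very coarse check: do the weights alone fit in the available RAM?"""
    return estimate_model_size_gb(parameters_billion) < available_ram_gb
```

Calling estimate_model_size_gb(13) gives a value in the neighbourhood of the 7.3GB listed for Llama 2 13B, which suggests the assumption is reasonable for a first estimate but nothing more.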

Let’s give Llama 2 a try. We execute the following command to download and run the language model:

ollama run llama2

Be patient. It can take a while to download the model.

Remark: If you only want to download the model, you can use the pull command:

ollama pull llama2
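If you want to check from code whether a model has already been pulled, one option is to ask the local Ollama HTTP API (by default at http://localhost:11434) for the list of downloaded models. The sketch below assumes the /api/tags endpoint returns a JSON object with a "models" array whose entries have a "name" field such as "llama2:latest"; the helper names are my own and not part of Ollama.

```python
import json
import urllib.request
from typing import List

OLLAMA_URL = "http://localhost:11434"  # default local endpoint


def parse_model_names(tags_json: str) -> List[str]:
    """Extract the model names from an /api/tags response body."""
    data = json.loads(tags_json)
    return [model["name"] for model in data.get("models", [])]


def is_model_available(name: str, model_names: List[str]) -> bool:
    """Check for a model; a name without a tag is treated as ':latest'."""
    wanted = name if ":" in name else name + ":latest"
    return wanted in model_names


def fetch_local_models(base_url: str = OLLAMA_URL) -> List[str]:
    """Ask the running Ollama instance which models are available locally."""
    with urllib.request.urlopen(base_url + "/api/tags") as response:
        return parse_model_names(response.read().decode("utf-8"))
```

With Ollama running, is_model_available("llama2", fetch_local_models()) tells you whether the pull above has already been done.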

Invoke the model

We can invoke the model directly from the command line using the run command, as we have seen above.

Ollama also has an API endpoint running at the following location: http://localhost:11434.

We can invoke it for example through Postman:

In the example above, I set stream to false. This requires us to wait until the LLM has generated a full response.
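The same call can be sketched in Python using only the standard library. I am assuming here that the request goes to the /api/generate endpoint with a JSON body containing model, prompt and stream, and that the answer comes back in a "response" field; with stream set to true I assume one JSON object per line, ending with an object where "done" is true. The function names are my own.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default local endpoint


def build_generate_request(model, prompt, stream=False, base_url=OLLAMA_URL):
    """Build the POST request for the (assumed) /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": stream}
    return urllib.request.Request(
        base_url + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt, base_url=OLLAMA_URL):
    """Non-streaming call: blocks until the full response has been generated."""
    request = build_generate_request(model, prompt, stream=False, base_url=base_url)
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]


def join_stream_chunks(lines):
    """Concatenate the 'response' fragments of a streamed reply.

    Each non-empty line is expected to be one JSON object.
    """
    parts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)
```

With Ollama running and the model downloaded, generate("llama2", "Why is the sky blue?") returns the complete answer as one string. For a streaming call you would pass stream=True to build_generate_request and feed the lines of the HTTP response to join_stream_chunks, or print each fragment as it arrives.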

Have a look here for the full API documentation: 

We'll use this API in our next post about .NET Smart Components. Stay tuned!
