In this post I want to introduce you to Ollama. Ollama is a streamlined tool for running open-source LLMs locally, including Mistral and Llama 2. It bundles model weights, configuration, and data into a single package, defined by a Modelfile.
Ollama supports a variety of LLMs, including Llama 2, uncensored Llama, CodeLlama, Falcon, Mistral, Vicuna, WizardCoder, and Wizard uncensored.
Installation
To install Ollama on Windows, first download the executable available here: https://ollama.com/download/OllamaSetup.exe
Run the executable to start the installation wizard:
Click Install to start the installation process. After the installation has completed, Ollama will be running in the background:
We can now open a command prompt and call ollama:
Download a model
Before we can do anything useful, we first need to download a specific language model. The full list of models can be found at https://ollama.com/library.
Here are some examples:
| Model | Parameters | Size | Download |
| --- | --- | --- | --- |
| Llama 2 | 7B | 3.8GB | ollama run llama2 |
| Mistral | 7B | 4.1GB | ollama run mistral |
| Llama 2 13B | 13B | 7.3GB | ollama run llama2:13b |
| Llama 2 70B | 70B | 39GB | ollama run llama2:70b |
Remark: Make sure you have enough RAM before you try to run one of the larger models.
Let’s give Llama 2 a try. We execute the following command to download and run the language model:
ollama run llama2
Be patient. It can take a while to download the model.
Remark: If you only want to download the model, you can use the pull command:
ollama pull llama2
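To check which models you already have locally, Ollama's REST API exposes a GET /api/tags endpoint that returns a JSON list of downloaded models. Here is a minimal Python sketch that extracts the model names from such a response; the sample JSON below is illustrative, in the shape documented for the endpoint:

```python
import json

def local_model_names(tags_json):
    """Extract model names from the JSON returned by GET /api/tags."""
    return [m["name"] for m in json.loads(tags_json).get("models", [])]

# Illustrative response in the documented shape (values are made up):
sample = '{"models":[{"name":"llama2:latest","size":3826793677}]}'
print(local_model_names(sample))  # ['llama2:latest']
```

Against a running Ollama instance you would fetch the real JSON from http://localhost:11434/api/tags instead of using the sample string.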
Invoke the model
We can invoke the model directly from the command line using the run command, as we have seen above:
Ollama also has an API endpoint running at the following location: http://localhost:11434.
We can invoke it for example through Postman:
In the example above, I set stream to false, which means we have to wait until the LLM has generated the full response before we receive anything.
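Instead of Postman, we can also build the same non-streaming request in code. The sketch below constructs the JSON body for the /api/generate endpoint; the model name and prompt are just example values, and the actual HTTP call (which requires Ollama to be running locally) is shown commented out:

```python
import json

def build_generate_request(model, prompt, stream=False):
    """Build the JSON body for a POST to Ollama's /api/generate endpoint."""
    return json.dumps({"model": model, "prompt": prompt, "stream": stream})

body = build_generate_request("llama2", "Why is the sky blue?")

# To actually send it (requires Ollama running on localhost):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:11434/api/generate",
#     data=body.encode(),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

With stream set to false, the endpoint returns a single JSON object whose response field contains the full generated text.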
Have a look here for the full API documentation: https://github.com/ollama/ollama/blob/main/docs/api.md
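When stream is left at its default of true, the API instead returns one JSON object per line, each carrying a partial response, until a final object with done set to true. A minimal sketch of assembling such a stream into the full text (the chunk values below are illustrative, in the documented shape):

```python
import json

def assemble_streamed_response(ndjson_lines):
    """Concatenate the partial 'response' fields from a streamed
    /api/generate reply, stopping at the chunk marked done."""
    text = ""
    for line in ndjson_lines:
        chunk = json.loads(line)
        text += chunk.get("response", "")
        if chunk.get("done"):
            break
    return text

# Illustrative chunks in the shape the API streams back:
chunks = [
    '{"model":"llama2","response":"The sky","done":false}',
    '{"model":"llama2","response":" is blue.","done":true}',
]
print(assemble_streamed_response(chunks))  # The sky is blue.
```

In a real client you would iterate over the response body line by line as it arrives, which lets you display tokens as they are generated.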
We'll use this API in our next post about .NET Smart Components. Stay tuned!