Running Ollama on the Raspberry Pi

In this tutorial, we will show you how to run Ollama on your Raspberry Pi.


Ollama is a project that makes running large language models (LLMs) locally on your device relatively easy.

Unlike using a tool like ChatGPT, all of the requests Ollama handles are processed locally on your Raspberry Pi using your chosen model.

Ollama itself isn’t a large language model. It is a tool that allows you to run various open-source AI models quickly. This tool handles downloading and then running a supported large language model. Ollama even has an API that you can interact with from other applications.

Visit the software’s official website to find a list of the language models it supports. Just be aware that the larger a model is, the more intensive it will be on the Raspberry Pi. For example, the TinyLlama AI model will run significantly better than the heavy Llama3.

Additionally, you will get a significantly better experience using a Raspberry Pi 5 with 8GB of memory. Language models quickly chew through the Pi’s processing power and memory.

Equipment

Below is a list of equipment we used when installing and running Ollama on our Raspberry Pi.

Recommended

Optional

We last tested this tutorial on a Raspberry Pi 5 running the latest version of Raspberry Pi OS Bookworm.

Installing Ollama on your Raspberry Pi

This section will show you the super simple steps to install Ollama on your Raspberry Pi. The process is made easy thanks to Ollama’s install script, which handles almost all of the work for us.

Preparing your System for Ollama

1. Before we install Ollama, you must ensure your Raspberry Pi’s operating system is up to date.

You can update your packages to the latest version by running the command below.

sudo apt update
sudo apt upgrade
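If you prefer, you can chain the two commands into a single line, as shown below. The optional “-y” flag automatically accepts the upgrade prompt for you.

sudo apt update && sudo apt upgrade -y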

2. For our next step, we need to ensure that the curl package is installed. Typically, this comes bundled with Raspberry Pi OS, but it doesn’t hurt to verify that it is installed.

You can ensure curl is installed on your Pi by running the command below.

sudo apt install curl

Running the Ollama Installer on your Raspberry Pi

3. With our Raspberry Pi ready, we can move on to running the Ollama installer.

Installing Ollama on your Pi is as simple as running the following command within the terminal. This command will download the “install.sh” script from Ollama and pass it directly to the shell.

curl -fsSL https://ollama.com/install.sh | sh

Running scripts like this without first verifying their contents is typically considered bad practice. You can view the code in this script by going directly to it in your favorite web browser.
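If you would rather review the script from the terminal instead, one approach (a quick sketch; the “ollama-install.sh” filename is just our own choice) is to download it to a file first, read through it, and then run it yourself.

curl -fsSL https://ollama.com/install.sh -o ollama-install.sh
less ollama-install.sh
sh ollama-install.sh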

4. Using the command below, we can verify that we just successfully installed Ollama on our Raspberry Pi.

This command gets Ollama to output its version to the terminal.

ollama --version

Below you can see that we have the software installed and are currently running version 0.1.32.

ollama version is 0.1.32
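On Linux systems such as Raspberry Pi OS, the installer also registers Ollama as a systemd service so the server runs in the background. Assuming a standard systemd setup, you can confirm that the service is active using the command below.

systemctl status ollama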

How to use Ollama on your Raspberry Pi

In this section, we will be showing you how you can use Ollama on your Raspberry Pi to run some of the various language models that it supports.

In particular, we will be testing the Phi3, TinyLlama, and Llama3 models on our Raspberry Pi. The first two are meant to be lightweight models; the last one, however, is not.

Just remember that you shouldn’t expect too much out of your Pi. It isn’t really designed to run AI models, let alone super large ones. If you want something that is actually usable, try to stick with some of the lighter-weight models.

Using the Phi3 LLM on your Pi

Phi3 is an LLM developed by Microsoft to be super lightweight while retaining some of the quality results you expect from significantly larger models.

Being lightweight makes Phi3 a great large language model to run on your Raspberry Pi. It should be able to perform at a usable speed. Don’t expect super-fast responses, but the Pi 5 is capable of running this model.

1. To get Ollama to download and start the Phi3 LLM on your Raspberry Pi, you only need to use the following command.

Please note that this process can take a bit of time to complete; while Phi3 is one of the smaller models, it still clocks in at 2.3GB.

ollama run phi3

2. With Phi3 now running on your Raspberry Pi, you will be able to start communicating with the AI Model.

One thing you will find with this model is that while it is usable, it can take a long time to respond with any long answers. If you are after a fast LLM on your Raspberry Pi, you might have to consider other options.

Running Phi3 using Ollama on the Raspberry Pi
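While you are sitting at the interactive prompt, Ollama also accepts a few special slash commands. At the time of writing, typing “/?” prints a list of the available commands, and “/bye” exits the session and returns you to the terminal.

/?
/bye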

Testing out TinyLlama

TinyLlama is super lightweight and one of the fastest LLMs you can run on the Raspberry Pi’s processor.

While it might not produce as high-quality results as larger models like Phi3 or the heavyweight Llama3, it is still more than capable of answering most basic questions. Additionally, what TinyLlama lacks in quality, it definitely makes up for in speed.

1. You can use Ollama to run TinyLlama on your Raspberry Pi by running the command below.

This model is fairly lightweight, so it should download to your Pi relatively quickly.

ollama run tinyllama

2. Once the model is up and running, you will be able to ask it questions and talk with the AI model.

Using the TinyLlama LLM
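Now that you have a couple of models downloaded, you might want to keep track of how much room they are taking up. You can list every model Ollama has stored locally, along with its size on disk, by running the following command.

ollama list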

Running Llama3 on the Raspberry Pi

Llama3 is a heavyweight language model, especially compared to TinyLlama and Phi3. It is capable of some super high-quality results but is a model that you need to be patient with when running it on your Pi.

Personally, we were surprised with how well the Llama3 model ran on our Raspberry Pi 5. While it can take a little bit to get started, it produced concise and highly readable results.

The longer your prompt is, the more taxing it will be on both the Llama3 language model and your Pi.

1. However, if you are brave and want to try out this model, you can get Ollama to download and run Llama3 on your Raspberry Pi by running the command below.

Please note that Llama3 is a very large model. You must have at least 4.7GB of free space on your Pi.

ollama run llama3

2. Once Ollama finishes starting up the Llama3 model on your Raspberry Pi, you can start communicating with the language model.

Below, you can see a couple of prompts we used and the results it produced.

Running Llama3 on the Raspberry Pi
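As Llama3 eats up a fair chunk of your Pi’s storage, it is worth knowing that you can delete a downloaded model once you are done experimenting with it. For example, the command below removes Llama3 from your device.

ollama rm llama3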

Using Curl to Communicate with Ollama on your Raspberry Pi

One of Ollama’s cool features is its API, which you can query. Using this API, you can request that it generate responses to your prompts using specific models.

1. To showcase this, let us use curl to send a request to the Ollama server running on our Raspberry Pi.

With this command, we send some JSON data. In this bit of JSON, we specify the following information.

  • model: This is the large language model we want Ollama to run on our Raspberry Pi.

    In this example, we are sticking with the lightweight “tinyllama” model.
  • prompt: The prompt is what you want to say to the model; typically, this is a question.

    For this example, we will ask the model what the capital of Australia is.
  • stream: By setting the “stream” option to false, we are telling Ollama that we want it to wait until the model has finished generating before issuing a response.

    This is useful if you don’t want to handle a stream of data coming from the model. Typically, a model’s response is streamed back one token at a time.

curl http://localhost:11434/api/generate -d '{
  "model": "tinyllama",
  "prompt": "Why is the capital of Australia?",
  "stream": false
}' 

2. Below, you can get an idea of the data that is returned by this API. By default, Ollama will return the result as JSON data.

Within this, you can see the result and a variety of additional information, such as how long it took for the prompt to be completed.
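The raw JSON can be a little hard to read. As a quick sketch, if you install the jq utility (sudo apt install jq), you can pipe the API’s output through it to pull out just the generated text.

curl -s http://localhost:11434/api/generate -d '{
  "model": "tinyllama",
  "prompt": "What is the capital of Australia?",
  "stream": false
}' | jq -r .response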

Using the Ollama API
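If you would rather see the reply appear as it is generated, you can set the “stream” option to true. Instead of a single JSON object, Ollama will send back a series of JSON objects, each carrying a small piece of the response as it is produced.

curl http://localhost:11434/api/generate -d '{
  "model": "tinyllama",
  "prompt": "What is the capital of Australia?",
  "stream": true
}'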

Conclusion

At this point in the guide, you should have Ollama installed on your Raspberry Pi.

Ollama is a neat piece of software that makes setting up and using large language models such as Llama3 straightforward.

You can even run multiple models on the same machine and easily get a result through its API or by running the model through the Ollama command line interface.

Please feel free to leave a comment below if you have any questions about running Ollama on your Pi.

If you liked this tutorial, we highly recommend checking out our many other Raspberry Pi projects.

3 Comments

  1. Sbobolo:

    can i run the phi3 model and use it on a front end web page? (like chat gpt)

    1. Emmet (Editor):

      Hi Sbobolo,

      There is actually a web interface that works alongside Ollama called Open WebUI.

      I have written a guide on how to install Open WebUI on your Raspberry Pi. The steps should work perfectly alongside this guide.

      Kind regards,
      Emmet

  2. Richard A:

    I tried phi3 and tinyllama and both worked well. I didn’t try Llama3 because when I tested it on another computer it didn’t work superbly.

    I like knowing about Phi3 and Tinyllama, thanks
