Running Ollama on the Raspberry Pi

In this tutorial, we will show you how to run Ollama on your Raspberry Pi.


Ollama is a project that makes running large language models (LLMs) locally on your device relatively easy.

Unlike using a tool like ChatGPT, all of the requests Ollama handles are processed locally on your Raspberry Pi using your chosen model.

Ollama itself isn’t a large language model. It is a tool that allows you to run various open-source AI models quickly. This tool handles downloading and then running a supported large language model. Ollama even has an API that you can interact with from other applications.

Visit the software’s official website to find a list of the language models it supports. Just be aware that the larger a model is, the more intensive it will be on the Raspberry Pi. For example, the TinyLlama AI model will run significantly better than the heavy Llama3.

Additionally, you will get a significantly better experience using a Raspberry Pi 5 with 8GB of memory. Language models quickly chew through the Pi’s processing power and memory.

Equipment

Below is a list of equipment we used when installing and running Ollama on our Raspberry Pi.

Recommended

Optional

We last tested this tutorial on a Raspberry Pi 5 running the latest version of Raspberry Pi OS Bookworm.

Installing Ollama on your Raspberry Pi

This section will show you the super simple steps to install Ollama on your Raspberry Pi. The process is made easy thanks to Ollama’s install script, which handles almost all of the work for us.

Preparing your System for Ollama

1. Before we install Ollama, you must ensure your Raspberry Pi’s operating system is up to date.

You can update your packages to the latest version by running the command below.

sudo apt update
sudo apt upgrade
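If you prefer, you can chain the two commands into a single line, as shown below. The optional “-y” flag automatically accepts the upgrade prompt for you.

sudo apt update && sudo apt upgrade -y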

2. For our next step, we need to ensure that the curl package is installed. Typically, this comes bundled with Raspberry Pi OS, but it doesn’t hurt to verify that it is installed.

You can ensure curl is installed on your Pi by running the command below.

sudo apt install curl

Running the Ollama Installer on your Raspberry Pi

3. With our Raspberry Pi ready, we can move on to running the Ollama installer.

Installing Ollama on your Pi is as simple as running the following command within the terminal. This command will download the “install.sh” script from Ollama and pass it directly to the shell.

curl -fsSL https://ollama.com/install.sh | sh

Running scripts like this without first verifying their contents is typically considered bad practice. You can view the code in this script by going directly to it in your favorite web browser.
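If you would rather review the script from the terminal instead, one approach (a quick sketch; the “ollama-install.sh” filename is just our own choice) is to download it to a file first, read through it, and then run it yourself.

curl -fsSL https://ollama.com/install.sh -o ollama-install.sh
less ollama-install.sh
sh ollama-install.sh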

4. Using the command below, we can verify that we just successfully installed Ollama on our Raspberry Pi.

This command gets Ollama to output its version to the terminal.

ollama --version

Below you can see that we have the software installed and are currently running version 0.1.32.

ollama version is 0.1.32
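On Linux systems such as Raspberry Pi OS, the installer also registers Ollama as a systemd service so the server runs in the background. Assuming a standard systemd setup, you can confirm that the service is active using the command below.

systemctl status ollama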

How to use Ollama on your Raspberry Pi

In this section, we will be showing you how you can use Ollama on your Raspberry Pi to run some of the various language models that it supports.

In particular, we will be testing the Phi3, TinyLlama, and Llama3 models on our Raspberry Pi. The first two are meant to be lightweight models; the last one, however, is not.

Just remember that you shouldn’t expect too much out of your Pi. It isn’t really designed to run AI models, let alone super large ones. If you want something that is actually usable, try to stick with some of the lighter-weight models.

Using the Phi3 LLM on your Pi

Phi3 is an LLM developed by Microsoft to be super lightweight while retaining some of the quality results you expect from significantly larger models.

Being lightweight makes Phi3 a great large language model to run on your Raspberry Pi. It should be able to perform at a usable speed. Don’t expect super-fast responses, but the Pi 5 is capable of running this model.

1. To get Ollama to download and start the Phi3 LLM on your Raspberry Pi, you only need to use the following command.

Please note that this process can take a bit of time to complete; while Phi3 is one of the smaller models, it still clocks in at 2.3GB.

ollama run phi3

2. With Phi3 now running on your Raspberry Pi, you will be able to start communicating with the AI Model.

One thing you will find with this model is that while it is usable, it can take a long time to respond with any long answers. If you are after a fast LLM on your Raspberry Pi, you might have to consider other options.

Running Phi3 using Ollama on the Raspberry Pi
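While you are sitting at the interactive prompt, Ollama also accepts a few special slash commands. At the time of writing, typing “/?” prints a list of the available commands, and “/bye” exits the session and returns you to the terminal.

/?
/bye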

Testing out TinyLlama

TinyLlama is super lightweight and one of the fastest LLMs you can run on the Raspberry Pi’s processor.

While it might not produce as high-quality results as larger models like Phi3 or the heavyweight Llama3, it is still more than capable of answering most basic questions. Additionally, what TinyLlama lacks in quality, it definitely makes up for in speed.

1. You can use Ollama to run TinyLlama on your Raspberry Pi by running the command below.

This model is fairly lightweight, so it should download to your Pi relatively quickly.

ollama run tinyllama

2. Once the model is up and running, you will be able to ask it questions and talk with the AI model.

Using the TinyLlama LLM
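Now that you have a couple of models downloaded, you might want to keep track of how much room they are taking up. You can list every model Ollama has stored locally, along with its size on disk, by running the following command.

ollama list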

Running Llama3 on the Raspberry Pi

Llama3 is a heavyweight language model, especially compared to TinyLlama and Phi3. It is capable of some super high-quality results but is a model that you need to be patient with when running it on your Pi.

Personally, we were surprised with how well the Llama3 model ran on our Raspberry Pi 5. While it can take a little bit to get started, it produced concise and highly readable results.

The longer your prompt is, the more taxing it will be on both the Llama3 language model and your Pi.

1. However, if you are brave and want to try out this model, you can get Ollama to download and run Llama3 on your Raspberry Pi by running the command below.

Please note that Llama3 is a very large model. You must have at least 4.7GB of free space on your Pi.

ollama run llama3

2. Once Ollama finishes starting up the Llama3 model on your Raspberry Pi, you can start communicating with the language model.

Below, you can see a couple of prompts we used and the results it produced.

Running Llama3 on the Raspberry Pi
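As Llama3 eats up a fair chunk of your Pi’s storage, it is worth knowing that you can delete a downloaded model once you are done experimenting with it. For example, the command below removes Llama3 from your device.

ollama rm llama3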

Using Curl to Communicate with Ollama on your Raspberry Pi

One of Ollama’s cool features is its API, which you can query. Using this API, you can request that it generate responses to your prompts using specific models.

1. To showcase this, let us use curl to send a request to the Ollama server running on our Raspberry Pi.

With this command, we send some JSON data. In this bit of JSON, we specify the following information.

  • model: This is the large language model we want Ollama to run on our Raspberry Pi.

    In this example, we are sticking with the lightweight “tinyllama” model.
  • prompt: The prompt is what you want to say to the model; typically, this is a question.

    For this example, we will ask the model what the capital of Australia is.
  • stream: By setting the “stream” option to false, we are telling Ollama that we want it to wait until the model has finished generating before issuing a response.

    This is useful if you don’t want to handle a stream of data coming from the model. Typically, a model’s response is streamed back one token at a time.

curl http://localhost:11434/api/generate -d '{
  "model": "tinyllama",
  "prompt": "Why is the capital of Australia?",
  "stream": false
}' 

2. Below, you can get an idea of the data that is returned by this API. By default, Ollama will return the result as JSON data.

Within this, you can see the result and a variety of additional information, such as how long it took for the prompt to be completed.
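The raw JSON can be a little hard to read. As a quick sketch, if you install the jq utility (sudo apt install jq), you can pipe the API’s output through it to pull out just the generated text.

curl -s http://localhost:11434/api/generate -d '{
  "model": "tinyllama",
  "prompt": "What is the capital of Australia?",
  "stream": false
}' | jq -r .response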

Using the Ollama API
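If you would rather see the reply appear as it is generated, you can set the “stream” option to true. Instead of a single JSON object, Ollama will send back a series of JSON objects, each carrying a small piece of the response as it is produced.

curl http://localhost:11434/api/generate -d '{
  "model": "tinyllama",
  "prompt": "What is the capital of Australia?",
  "stream": true
}'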

Conclusion

At this point in the guide, you should have Ollama installed on your Raspberry Pi.

Ollama is a neat piece of software that makes setting up and using large language models such as Llama3 straightforward.

You can even run multiple models on the same machine and easily get a result through its API or by running the model through the Ollama command line interface.

Please feel free to leave a comment below if you have any questions about running Ollama on your Pi.

If you liked this tutorial, we highly recommend checking out our many other Raspberry Pi projects.

3 Comments

  1. Sbobolo:

    can i run the phi3 model and use it on a front end web page? (like chat gpt)

    1. Emmet (Editor):

      Hi Sbobolo,

      There is actually a web interface that works alongside Ollama called Open WebUI.

      I have written a guide on how to install Open WebUI on your Raspberry Pi. The steps should work perfectly alongside this guide.

      Kind regards,
      Emmet

  2. Richard A:

    I tried phi3 and tinyllama and both worked well. I didn’t try Llama3 because when I tested it on another computer it didn’t work superbly.

    I like knowing about Phi3 and Tinyllama, thanks
