In this tutorial, I go through the steps for setting up an AI image analyzer integration within Home Assistant.
Using AI to classify images can provide helpful information about the contents of a photo, video, or stream. For example, you can use it to check whether the bins are out for collection without opening the camera feed yourself. Another example is using it to analyze object details, such as the color of a car or even its number plate.
We will use an integration known as LLM Vision to quickly add AI to Home Assistant. One of the benefits of this integration is that it works with various AI providers, such as Google, OpenAI, LocalAI, and more. To use this integration, you must have HACS installed, as it is not built into the core of Home Assistant.
To start this tutorial, you will need to have Home Assistant set up and able to support HACS. You will also need an accessible camera to provide the integration with photos to analyze. In our example, we use our Reolink camera.
Equipment
Below is a list of equipment that I used for this AI image analyzer on Home Assistant. You do not need these specific items for this tutorial to work.
Recommended
- Raspberry Pi ( Amazon | SunFounder )
- Micro SD Card ( Amazon | SunFounder )
- Power Supply ( Amazon | SunFounder )
- Ethernet Cable ( Amazon ) or Wi-Fi ( Amazon | SunFounder )
- Reolink Camera ( Amazon ) or similar
Optional
- Raspberry Pi Case ( Amazon | SunFounder )
- USB Mouse ( Amazon | SunFounder )
- USB Keyboard ( Amazon | SunFounder )
- HDMI Cable ( Amazon | SunFounder )
Installing the AI Integration on Home Assistant
1. You will first need to install HACS for Home Assistant so that you can install the custom AI integration.
2. On the HACS page within Home Assistant, try searching for “LLM Vision”. If it does not appear, we will need to add it manually.
To manually add the integration, click on the three dots in the top right corner of the screen. Click “Custom Repositories” from the drop-down menu.
3. In the pop-up, enter the link below into the repository text box.
https://github.com/valentinfrlch/ha-llmvision
In the type drop-down box, select “Integration”.
Once done, click add, and the integration should appear in the list. You can now close the pop-up.
4. Now click the “LLM Vision” integration in the list of HACS repositories.
5. The next page will provide you with all the details on the integration. Click “DOWNLOAD” in the bottom right corner.
6. Click “DOWNLOAD” in the pop-up. You do not need to change anything else on this screen.
7. Once installed, navigate to the “Settings” page and click the “Restart required” box at the top. Alternatively, you can restart Home Assistant by going to “Developer Tools” and clicking “Restart”.
8. Click submit in the pop-up, and Home Assistant will restart.
Configuring the AI Integration
One of the great things about the LLM Vision integration is that it allows you to use an AI provider of your choice. At the time of writing this tutorial, the integration supports Anthropic, Google Gemini, LocalAI, Ollama, OpenAI, and others that use an OpenAI-compatible API.
You will need to create an API key with the service you wish to use. Some of the AI providers do not have a free tier.
Creating a Google Gemini API Key
9. We will use Google Gemini for this tutorial as it has a free tier. To sign up, head to the Google AI Studio page.
You will need to agree to the terms and service before proceeding.
10. On this page, click “Create API key”. Next, click “Got It” on the safety prompt.
11. You can now create the API key within a new project or add it to an existing project. For this tutorial, I will create the key in a new project.
Click “Create API key in new project”.
12. After a few seconds, you should now see a key. Copy this to a safe place, and do not share it with anyone.
Adding the API Key to the Integration
13. In Home Assistant, navigate to the “Settings” page and open “Devices & Services”.
14. On this page, click the “ADD INTEGRATION” button in the bottom right corner.
15. In the pop-up, search for and select “LLM Vision”.
16. Next, choose the provider you wish to use to process your AI requests. For this tutorial, we will be using Google Gemini.
17. Enter the API key for the service you have chosen. We will be entering the API key we generated for Google Gemini back in step 12.
18. Lastly, you should see a success message.
It is now time to move on to creating our first script using the AI integration.
Adding your First AI Image Script
Before we get started, you must ensure that you have a camera available within Home Assistant to analyze images and streams.
19. In Home Assistant, go to the “Settings” page and select “Automations & Scenes”.
20. Navigate to the scripts tab and click “ADD SCRIPT” in the bottom right corner of the screen.
In the pop-up, select “Create new script”.
21. On the next screen, click the “Add Action” button.
In the pop-up, scroll to “Other Actions…” and click on it.
In the next menu, search for “LLM Vision” and select “LLM Vision Image Analyzer”.
22. You should now see a range of settings that you can configure. We will do a basic configuration to test the integration. Below are the settings that we will set within the user interface.
- Provider: Set this to Google Gemini unless you use a different AI provider.
- Prompt: Enter “Describe this image” into the field.
- Image Entity: Turn this option on and select the camera that you wish to use. In my case, it is “Shed Front Fluent”.
- Target Width: You can leave this set to 1280, or disable the option entirely.
- Detail: You can increase the level of detail to high if the responses are not accurate enough. Alternatively, you can safely disable this option.
- Maximum Tokens: Feel free to adjust this field if required. Reducing the tokens can help mitigate potential costs but may cause sentences to be cut short.
- Temperature: You can alter the temperature if you want to experiment with differing responses from the AI.
- Response Variable: Set this to “Response”.
Once you are happy with your selections, simply click “Save Script” in the bottom right corner.
In the pop-up, you can change the name and set an icon and a description before saving the script. Once done, click “RENAME”.
22.1 If you want to change the script editor to YAML, you can do so by going to the three dots in the top right corner of the screen and selecting “Edit in YAML”.
In the code area, you should see all the details of your script. The provider option must be selected within the UI before switching to YAML.
alias: New Script
sequence:
  - action: llmvision.image_analyzer
    metadata: {}
    data:
      include_filename: false
      detail: low
      max_tokens: 100
      temperature: 0.2
      provider: <SELECT IN UI FIRST>
      message: Describe this image
      image_entity:
        - camera.shed_front_fluent
      target_width: 1280
    response_variable: Response
Testing the Script
23. To test the script, navigate back to the “Scripts” tab on the “Automations & Scenes” settings page.
24. Find the new script and click the three dots on the left side of the screen, then click “Run”. The script may take a few seconds to process.
25. To view the response from the AI for the script we just ran, click on the three dots again. In the menu, click on “Traces”.
26. On this page, click on the “Changed Variables” tab, and you should see the response next to “response_text”.
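Alternatively, you can test the underlying action without a script by going to “Developer Tools” and opening the “Actions” tab (called “Services” on older versions of Home Assistant). Switch to YAML mode and paste something like the sketch below, adjusting the provider and camera entity to your own; actions that return data, such as the image analyzer, display their response on the same page after they run.
action: llmvision.image_analyzer
data:
  # Replace with your own provider ID and camera entity
  provider: <SELECT IN UI FIRST>
  message: Describe this image
  image_entity:
    - camera.shed_front_fluent
  max_tokens: 100
  temperature: 0.2
  target_width: 1280
  include_filename: false
  detail: low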
27. You should now have a general understanding of how you can use the AI image analyzer in a script within Home Assistant. Below, we will discuss further work you can do to implement this fantastic tool into more of your workflows.
Displaying the Data on a Dashboard
If you want to display the text on the front end with a button to trigger the script, follow this section.
28. First, we will need to create a helper to store the text so we can display it on the front end. To do this, go to the “Devices & Services” settings page and select the “Helpers” tab.
On this page, click “CREATE HELPER”.
29. In the pop-up, find and select “Text”.
30. In the name field, enter llmvision_response. You can change the name if you want, but you must then update it in the YAML in the next step.
Also, set the maximum length to 255. If the text is any longer than this, it will be rejected entirely and the field set to unknown.
Once done, click “CREATE”.
31. Go back to the script we created earlier and replace its YAML with the text below. Ensure you update a few fields before saving; we touch on these fields below.
Replace the provider placeholder with the provider ID from your previous script before saving.
The image_entity will need to be updated with the camera you are using.
alias: New Script
sequence:
  - action: llmvision.image_analyzer
    metadata: {}
    data:
      include_filename: false
      detail: low
      max_tokens: 100
      temperature: 0.5
      provider: <SELECT IN UI FIRST>
      message: "Describe this image in 25 words."
      image_entity:
        - camera.<Update with your camera>
      target_width: 1280
    response_variable: response
  - action: input_text.set_value
    metadata: {}
    data:
      value: "{{response.response_text}}"
    target:
      entity_id: input_text.llmvision_response
description: ""
icon: mdi:cube-scan
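One thing to watch: the helper we created is limited to 255 characters, so the input_text.set_value step above will fail if the AI ever returns a longer response, as noted in step 30. A simple safeguard (a small sketch, using the same helper name as above) is to truncate the response in the template before storing it:
- action: input_text.set_value
  metadata: {}
  data:
    # Keep only the first 255 characters so the helper always accepts the value
    value: "{{ response.response_text[:255] }}"
  target:
    entity_id: input_text.llmvision_response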
32. Go to the dashboard where you wish to place the card and enter editor mode.
Once in editor mode, click “ADD CARD”. In the pop-up, scroll down to the bottom and select “Manual”. Copy and paste the code below.
Make sure you update the values in the card to prevent any errors.
type: vertical-stack
cards:
  - type: tile
    entity: script.new_script
    icon_tap_action:
      action: toggle
    name: Analyze Image
  - type: markdown
    content: "{{states(\"input_text.llmvision_response\")}}"
  - show_state: false
    show_name: false
    camera_view: auto
    type: picture-entity
    camera_image: camera.<ENTER YOUR CAMERA HERE>
    entity: camera.<ENTER YOUR CAMERA HERE>
33. You should now have a card that will analyze the current image when the button is pressed. A snapshot of the current camera feed should be shown below the text. Below is an example of how the card should look.
Further work
This tutorial covers the basics of using the LLM Vision integration for AI image analysis. There are more examples of what you can do within Home Assistant on the LLM Vision GitHub documentation pages.
In this tutorial, we only cover analyzing a single frame from the camera. However, you can configure the integration to analyze a video or a stream instead. For more information, check out the configuration page for the integration.
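To give a rough idea of what this looks like, the integration also exposes a stream analyzer action that samples several frames over a short period rather than a single snapshot. The sketch below is only an outline, and the field names (for example, duration and max_frames) are assumptions that may differ between versions, so check the configuration page above before relying on it.
- action: llmvision.stream_analyzer
  data:
    provider: <SELECT IN UI FIRST>
    message: Describe what happens in this clip
    image_entity:
      - camera.shed_front_fluent
    # Assumed fields: how long to record and how many frames to sample
    duration: 5
    max_frames: 3
    max_tokens: 100
  response_variable: response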
You can also use this integration in an automation. For example, you can use a trigger to fire the script to analyze the current image on the camera. The process is roughly the same as in this tutorial, but you set up a “When” trigger and add roughly the same actions in the “Then do” section.
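As a rough example, a minimal automation in YAML (a sketch, assuming a hypothetical motion sensor called binary_sensor.driveway_motion along with the camera and helper used earlier) might look like this. Note that older versions of Home Assistant use the trigger/platform and action/service keys instead of the newer triggers and actions syntax shown here.
alias: Analyze camera on motion
triggers:
  - trigger: state
    entity_id: binary_sensor.driveway_motion
    to: "on"
actions:
  - action: llmvision.image_analyzer
    data:
      provider: <SELECT IN UI FIRST>
      message: Describe this image in 25 words.
      image_entity:
        - camera.shed_front_fluent
      max_tokens: 100
      target_width: 1280
    response_variable: response
  - action: input_text.set_value
    target:
      entity_id: input_text.llmvision_response
    data:
      # Truncate so the 255-character helper always accepts the value
      value: "{{ response.response_text[:255] }}"
mode: single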
A wide range of use cases exist for using AI to analyze images. For example, you can use it to visually check whether a gate is left open or count the number of animals in a paddock. However, do not rely on AI to be perfect.
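For checks like these, it helps to keep the prompt narrow so the response is easy to act on in later automation steps. For example, you could swap the message field for something like the hypothetical prompt below and then match on the single-word answer:
message: >-
  Look at the gate in this image and reply with exactly one word:
  open or closed. If you cannot see the gate, reply with unknown.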
Conclusion
I hope you now have the AI image analyzer integration working correctly on Home Assistant. You can do a fair bit with this tool, but it will require some further work. Installing something like Node-RED can help you create smart automations with very little effort.
We have plenty more Home Assistant tutorials that I highly recommend checking out. We cover a wide range of topics and are constantly expanding to cover even more. Home Assistant is a fantastic piece of software that allows you to automate a huge range of tasks.
If you have any feedback or tips, or if there is something I have missed, please leave a comment below.