Setting up an AI Image Analyzer on Home Assistant

In this tutorial, I go through the steps to set up an AI image analyzer integration within Home Assistant.

Home Assistant AI Image Analyzer

Using AI to classify images can provide helpful information about the contents of a photo, video, or stream. For example, you can use it to check whether the bins are out for collection without opening the camera feed. Another example is using it to analyze object details, such as the color of a car or even its number plate.

We will use an integration known as LLM Vision to integrate AI quickly into Home Assistant. One of the benefits of this integration is that it can talk to various AI providers, such as Google, OpenAI, LocalAI, and more. To use this integration, you must have HACS installed, as it is not built into the core of Home Assistant.

To start this tutorial, you will need Home Assistant set up on a device that can support HACS. You will also need an accessible camera to provide the integration with photos to analyze. In our example, we use our Reolink camera.

Equipment

Below is a list of equipment that I used for this AI image analyzer on Home Assistant. You do not need these specific items for this tutorial to work.


Installing the AI Integration on Home Assistant

1. You will first need to install HACS for Home Assistant so that you can install the custom AI integration.

2. On the HACS page within Home Assistant, try searching for “LLM Vision”. If it does not appear, we will need to add it manually.

To manually add the integration, click on the three dots in the top right corner of the screen. Click “Custom Repositories” from the drop-down menu.

HACS - Select Custom Repositories

3. In the pop-up, enter the link below into the repository text box.

https://github.com/valentinfrlch/ha-llmvision

In the type drop-down box, select “Integration”.

HACS - Add Custom Repository Pop-Up

Once done, click “ADD”, and the integration should appear in the list. You can now close the pop-up.

HACS - Custom Repository Successfully Added

4. Now click the “LLM Vision” integration in the list of HACS repositories.

HACS - Click on the LLM Vision Integration

5. The next page will provide you with all the details on the integration. Click “DOWNLOAD” in the bottom right corner.

HACS - LLM Vision Integration Information Page

6. Click “DOWNLOAD” in the pop-up. You do not need to change anything else on this screen.

7. Once installed, navigate to the “Settings” page and click the “Restart required” box at the top. Alternatively, you can restart Home Assistant by going to “Developer Tools” and clicking “Restart”.

Settings Page - Restart Required

8. Click “SUBMIT” in the pop-up, and Home Assistant will restart.

Configuring the AI Integration

One of the great things about the LLM Vision integration is that it allows you to use an AI provider of your choice. At the time of writing, the integration supports Anthropic, Google Gemini, LocalAI, Ollama, OpenAI, and others that use an OpenAI-compatible API.

You will need to create an API key with the service you wish to use. Note that some of the AI providers do not have a free tier.

Creating a Google Gemini API Key

9. We will use Google Gemini for this tutorial, as it has a free tier. To sign up, head to the Google AI Studio page.

You will need to agree to the terms of service before proceeding.

Google Gemini API Keys Legal Notice

10. On this page, click “Create API key”. Next, click “Got It” on the safety prompt.

Select Create API Key

11. You can now create the API key within a new project or add it to an existing project. For this tutorial, I will create the key in a new project.

Click “Create API key in new project”.

Create API key in a new project

12. After a few seconds, you should now see a key. Copy this to a safe place, and do not share it with anyone.

API Key Generated

Adding the API Key to the Integration

13. In Home Assistant, navigate to the “Settings” page and open “Devices & Services”.

Home Assistant - Settings Page - Devices and Services

14. On this page, click the “ADD INTEGRATION” button in the bottom right corner.

Home Assistant - Add Integration

15. In the pop-up, search for and select “LLM Vision”.

Search and Select LLM Vision

16. Next, choose the provider you wish to use to process your AI requests. For this tutorial, we will be using Google Gemini.

Choose AI Provider

17. Enter the API key for the service you have chosen. We will be entering the API key we generated for Google Gemini back in step 12.

Enter the API Key for your AI Provider

18. Lastly, you should see a success message.

Success Configuration Message

It is now time to move on to creating our first script using the AI integration.

Adding your First AI Image Script

Before we get started, you must ensure that you have a camera available within Home Assistant to analyze images and streams.

19. In Home Assistant, go to the “Settings” page and select “Automations & Scenes”.

Settings Page - Automations and Scenes

20. Navigate to the “Scripts” tab and click “ADD SCRIPT” in the bottom right corner of the screen.

Scripts Tab - Add Script

In the pop-up, select “Create new script”.

21. On the next screen, click the “Add Action” button.

Add Action to Script

In the pop-up, scroll to “Other Actions…” and click on it.

Click Other Actions

In the next menu, search for “LLM Vision” and select “LLM Vision Image Analyzer”.

Select Image Analyzer

22. You should now see a range of settings that you can configure. We will do a basic configuration to test the integration. Below are the settings that we will set within the user interface.

  • Provider: Set this to Google Gemini unless you use a different AI provider.
  • Prompt: Enter “Describe this image” into the field.
  • Image Entity: Turn this option on and select the camera that you wish to use. In my case, it is “Shed Front Fluent”.
  • Target Width: You can leave this set to 1280, or disable the option entirely.
  • Detail: You can change the level of detail to high if you have issues. Alternatively, you can safely disable this option.
  • Maximum Tokens: Feel free to adjust this field if required. Reducing the tokens can help mitigate potential costs but may cause sentences to be cut short.
  • Temperature: You can alter the temperature if you want to experiment with differing responses from the AI.
  • Response Variable: Set this to “Response”.

Once you are happy with your selections, simply click “Save Script” in the bottom right corner.

Script Options

You can change the name and set an icon and a description before saving the script. Once done, click “RENAME”.

Save Script

22.1 If you want to change the script editor to YAML, you can do this by going to the three dots in the top right corner of the screen and selecting “Edit in YAML”.

Edit Script in YAML

In the code area, you should see all the details of your script. The provider option must be selected within the UI before switching to YAML.

alias: New Script
sequence:
  - action: llmvision.image_analyzer
    metadata: {}
    data:
      include_filename: false
      detail: low
      max_tokens: 100
      temperature: 0.2
      provider: <SELECT IN UI FIRST> # filled in with your provider ID once selected in the UI
      message: Describe this image
      image_entity:
        - camera.shed_front_fluent # replace with your own camera entity
      target_width: 1280
    response_variable: Response # the AI's reply is stored under this name
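
If you want to act on the response straight away, you can reference the response variable in a follow-up action. As a quick sketch (the notify.mobile_app_your_phone service below is a placeholder; substitute a notify service that actually exists on your system), appending this to the script's sequence would push the description to your phone:

  - action: notify.mobile_app_your_phone # placeholder notify target; use your own
    data:
      title: Camera analysis
      message: "{{ Response.response_text }}"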

Testing the Script

23. To test the script, navigate back to the “Scripts” tab on the “Automations & Scenes” settings page.

24. Find the new script and click the three dots on the left side of the screen, then click “Run”. The script may take a few seconds to process.

Home Assistant - Run Script

25. To view the AI's response to the script we just ran, click on the dots again. In the menu, click on “Traces”.

Open Traces

26. On this page, click on the “Changed Variables” tab, and you should see the response next to “response_text”.

View Response from the AI Script

27. You should now have a general understanding of how to use the AI image analyzer in a script within Home Assistant. Below, we discuss further work you can do to implement this fantastic tool into more of your workflows.

Displaying the Data on a Dashboard

If you want to display the text on the front end with a button to trigger the script, follow this section.

28. First, we will need to create a helper to store the text so that we can display it on the front end. To do this, go to the “Devices & Services” settings page and select the “Helpers” tab.

On this page, click “CREATE HELPER”.

Home Assistant - Create Helper

29. In the pop-up, find and select “Text”.

Helper - Select Text

30. In the name field, enter llmvision_response. You can change the name if you want, but you must also update it in the YAML in the next step.

Also, set the maximum length to 255. If the text is any longer than this, it will be rejected entirely and the helper will be set to unknown.

Once done, click “CREATE”.

Create Text Helper with 255 Maximum Length

31. Go back to the script we created earlier and replace its YAML with the text below. Ensure you update a few fields before saving; we touch on them below.

Replace the provider placeholder with the provider ID from the previous script before saving the text below.

The image_entity will need to be updated with the camera you are using.

alias: New Script
sequence:
  - action: llmvision.image_analyzer
    metadata: {}
    data:
      include_filename: false
      detail: low
      max_tokens: 100
      temperature: 0.5
      provider: <SELECT IN UI FIRST> # use the provider ID from the previous script
      message: "Describe this image in 25 words."
      image_entity:
        - camera.<Update with your camera> # replace with your camera entity
      target_width: 1280
    response_variable: response
  - action: input_text.set_value
    metadata: {}
    data:
      value: "{{ response.response_text }}" # write the AI's reply into the text helper
    target:
      entity_id: input_text.llmvision_response
description: ""
icon: mdi:cube-scan
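
One optional safeguard: because the text helper is capped at 255 characters, a longer response will fail to save. You can guard against this with Jinja's truncate filter, which Home Assistant templates support, by swapping the value line for the one below:

      value: "{{ response.response_text | truncate(255, true) }}" # cut the reply to 255 characters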

32. Go to the dashboard where you wish to place the card and enter editor mode.

Once in editor mode, click on “ADD CARD”. In the pop-up, scroll down to the bottom and select “Manual”. Copy and paste the code below.

Make sure you update the values in the card to prevent any errors.

type: vertical-stack
cards:
  - type: tile
    entity: script.new_script # update this if you renamed the script
    icon_tap_action:
      action: toggle
    name: Analyze Image
  - type: markdown
    content: "{{ states('input_text.llmvision_response') }}"
  - show_state: false
    show_name: false
    camera_view: auto
    type: picture-entity
    camera_image: camera.<ENTER YOUR CAMERA HERE>
    entity: camera.<ENTER YOUR CAMERA HERE>

33. You should now have a card that will analyze the current image when the button is pressed. A snapshot of the current camera feed should be shown below the text. Below is an example of how the card should look.

AI Image Analyzer Front End Card Example

Further Work

This tutorial covers the basics of using the LLM Vision integration for AI image analysis. There are more examples of what you can do within Home Assistant in the LLM Vision documentation on GitHub.

In this tutorial, we only cover analyzing a single snapshot from a camera. However, you can configure the integration to analyze a video or a stream instead. For more information, check out the configuration page for the integration; a rough sketch follows.
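
A stream analysis call looks much like the image analyzer, with a recording length and frame count added. The action and parameter names below are based on my reading of the LLM Vision documentation and may have changed, so verify them on the configuration page before using this:

  - action: llmvision.stream_analyzer
    data:
      provider: <SELECT IN UI FIRST> # your provider ID, as before
      message: Describe what happens in this clip
      image_entity:
        - camera.shed_front_fluent # replace with your camera
      duration: 5 # seconds of footage to record (assumed parameter)
      max_frames: 3 # frames sampled from the recording (assumed parameter)
    response_variable: response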

You can also use this integration in automations. For example, you can use a trigger to fire the script to analyze the current image on the camera. The process is roughly the same as in this tutorial, but you must set up a “When” trigger and add roughly the same actions in the “Then do” section, as sketched below.
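
Below is a minimal sketch of such an automation in YAML. The motion sensor entity is a placeholder; use whichever trigger entity suits your setup, and fill in the provider ID as before.

alias: Analyze camera on motion
triggers:
  - trigger: state
    entity_id: binary_sensor.shed_front_motion # placeholder motion sensor; use your own
    to: "on"
actions:
  - action: llmvision.image_analyzer
    data:
      provider: <SELECT IN UI FIRST> # your provider ID
      message: Describe this image in 25 words.
      image_entity:
        - camera.shed_front_fluent # replace with your camera
      target_width: 1280
      max_tokens: 100
      temperature: 0.5
    response_variable: response
  - action: input_text.set_value
    data:
      value: "{{ response.response_text | truncate(255, true) }}"
    target:
      entity_id: input_text.llmvision_response
mode: single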

A wide range of use cases exist for using AI to analyze images. For example, you can use it to visually check whether a gate is left open or count the number of animals in a paddock. However, do not rely on AI to be perfect.

Conclusion

I hope you now have the AI image analyzer integration on Home Assistant working correctly. You can do a fair bit with this tool, but it will require further work. Installing something like Node-RED will let you create smart automations with very little effort.

We have plenty more Home Assistant tutorials that I highly recommend checking out. We cover a huge range of topics and are constantly expanding to cover even more. It’s a fantastic piece of software that allows you to automate a wide range of tasks.

If you have any feedback or tips, or if there is something I have missed, please leave a comment below.
