A complete guide for setting up and using Hailo-Ollama with Open WebUI for an interactive AI chat interface.
The Hailo Model Zoo GenAI is a curated collection of pre-trained models and example applications optimized for Hailo's AI processors, designed to accelerate GenAI application development. It includes Hailo-Ollama, an Ollama-compatible API written in C++ on top of HailoRT, enabling seamless integration with various external tools and frameworks.
Ollama simplifies running large language models locally by managing model downloads, deployments, and interactions through a convenient REST API. Models are specifically optimized for Hailo hardware, providing efficient, high-performance inference tailored for GenAI tasks.
Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline. Once Hailo-Ollama is running, you can use Open WebUI to interact with your models through a modern web interface.
For full details on Hailo Model Zoo GenAI, see: https://github.com/hailo-ai/hailo_model_zoo_genai?tab=readme-ov-file#basic-usage
- Download the appropriate package for your architecture
- Important: the supported versions are 5.1.1, 5.2.0, and 5.3.0
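To confirm which Hailo packages and versions are present on the system, you can query dpkg; the grep pattern here is just a broad filter, not an exact package name:
dpkg -l | grep -i hailo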
- Install the package (the ./ prefix is required so apt treats the argument as a local file rather than a repository package name):
sudo apt install ./hailo_gen_ai_model_zoo_<ver>_<arch>.deb
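After installation, you can confirm the hailo-ollama binary is on your PATH:
which hailo-ollama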
- In a terminal window, start the Hailo-Ollama service:
hailo-ollama
The service will start and listen on http://localhost:8000 by default.
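To check that the service is reachable before pulling anything, you can query the tags endpoint; this assumes Hailo-Ollama implements the standard Ollama /api/tags route, which lists locally available models:
curl --silent http://localhost:8000/api/tags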
- In another terminal window, pull a model:
curl --silent http://localhost:8000/api/pull \
-H 'Content-Type: application/json' \
-d '{ "model": "qwen2.5-instruct:1.5b", "stream": true }'
The models will be downloaded to: ~/usr/share/hailo-ollama/models/blob/
- Test the model via API:
curl --silent http://localhost:8000/api/chat \
-H 'Content-Type: application/json' \
-d '{"model": "qwen2.5-instruct:1.5b", "messages": [{"role": "user", "content": "Translate to French: The cat is on the table."}]}'
If successful, you should receive a response from the model.
Once Hailo-Ollama is up and running, you can connect it to the popular Open WebUI for a user-friendly web interface.
- Docker must be installed and running
- For Docker installation instructions, see: https://docs.docker.com/engine/
Based on the Open WebUI quick start guide: https://docs.openwebui.com/getting-started/quick-start
- For environments with limited storage or bandwidth, Open WebUI offers slim image variants that exclude pre-bundled models (see the example after this list).
- Important: Run the container with host networking
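Pulling a slim variant might look like the following; the exact tag (main-slim here) is an assumption, so check the Open WebUI documentation for the current slim tag names:
docker pull ghcr.io/open-webui/open-webui:main-slim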
docker pull ghcr.io/open-webui/open-webui:main
docker run -d --network=host \
-e OLLAMA_BASE_URL=http://127.0.0.1:8000 \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
Alternative run command in case there are issues:
# Run with host network (container shares host's network)
docker run -d --network host \
-v open-webui:/app/backend/data \
--name open-webui \
ghcr.io/open-webui/open-webui:main
Open your browser and navigate to the Open WebUI interface at: http://localhost:8080
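If host networking is not available in your environment (for example, with Docker Desktop), a port-mapped variant may work instead. This sketch assumes Docker 20.10+ for the host-gateway mapping; note that the UI is then served at http://localhost:3000 rather than 8080:
docker run -d -p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-e OLLAMA_BASE_URL=http://host.docker.internal:8000 \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main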
If the local model does not appear immediately, the following configuration may be required:
- In Settings → Admin Settings → Connections, add the Hailo-Ollama API URL: http://localhost:8000
- Under the "Ollama API" section:
  - Set "Connection Type" to "Local"
  - Set "Auth" to "None"
- In the chat view, select one of the models served by Hailo-Ollama from the model list.
Once configured, you can:
- Chat with models through the Open WebUI web interface
- Access models via the Hailo-Ollama REST API directly
- Use any tool that supports the Ollama API format
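Because the API is Ollama-compatible, tools that honor the OLLAMA_HOST environment variable may work unchanged. For example, if the stock ollama CLI happens to be installed (an assumption; it is not part of the Hailo packages), pointing it at Hailo-Ollama could look like:
OLLAMA_HOST=http://localhost:8000 ollama list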
- Ensure the Hailo-Ollama service is running:
hailo-ollama
- Check that port 8000 is not already in use
- Verify Hailo GenAI Model Zoo is properly installed
- Verify the model was pulled successfully using the curl pull command shown above
- Check that the model files are present in:
~/usr/share/hailo-ollama/models/blob/
- Ensure the model name matches exactly (e.g., "qwen2.5-instruct:1.5b")
- Check that the API URL in Open WebUI settings matches the hailo-ollama service URL
- Verify both services are running:
  - Hailo-Ollama: http://localhost:8000
  - Open WebUI: http://localhost:8080
- Ensure Docker container is using host network mode
- Default ports are:
- 8000 (hailo-ollama)
- 8080 (open-webui)
- If ports are in use, you may need to stop conflicting services or configure different ports
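To check whether something is already listening on these ports, a quick probe with ss works on most Linux systems:
ss -ltn | grep -E ':(8000|8080)'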
- Ensure Docker is installed and running:
docker --version
- Check Docker container status:
docker ps
- View container logs:
docker logs open-webui