This directory contains Generative AI applications and utilities for the Hailo AI platform. These applications leverage Hailo's AI acceleration hardware to provide real-time LLM (Large Language Model), VLM (Vision Language Model), and speech processing capabilities.
The GenAI applications package includes:
- Full Applications: Fully functioning interactive applications
- Simple Examples: Basic demonstration scripts for common GenAI tasks
- Shared Utilities: Reusable components for building custom GenAI applications
Note: These applications require the Hailo10 device. Due to the size of models and additional requirements, it is not installed by default. See installation instructions.
π€ Agent Tools Example
An interactive CLI chat agent that uses Hailo LLM models with function calling capabilities. The agent automatically discovers tools and allows the LLM to call them during conversations.
Features:
- Automatic tool discovery and execution
- Text and voice interaction modes
- Context management with token-based tracking
- Multiple built-in tools (math, weather, RGB LED, servo, elevator)
Documentation: See agent_tools_example/README.md for detailed usage, tool creation guide, and hardware setup instructions.
Additional Resources:
- AGENTS.md - Detailed developer documentation and architecture guide
- SPEC.md - Technical specification
- TESTING.md - Testing framework documentation
ποΈ VLM Chat
An interactive computer vision application using Hailo's Vision Language Model (VLM) for real-time image analysis and question answering.
Features:
- Real-time video processing with Hailo AI acceleration
- Interactive Q&A mode for captured frames
- Single window display (continuous video feed that freezes during Q&A mode)
- Custom prompt support
Documentation: See vlm_chat/README.md for usage instructions and configuration options.
π€ Voice Assistant
An interactive voice-controlled AI assistant using Hailo's Speech-to-Text and Large Language Model for real-time audio processing and conversational AI.
Features:
- Real-time speech processing with Hailo AI acceleration
- Interactive voice mode (press Space to start/stop recording)
- Streaming text-to-speech with interruption support
- Context management for conversation history
- Debug logging for troubleshooting
Documentation: See voice_assistant/README.md for usage instructions, microphone setup, and troubleshooting.
Basic example applications demonstrating the use of Hailo's GenAI platform for different AI tasks:
π¬ Simple LLM Chat
A simple example demonstrating text-based conversation with Hailo's Large Language Model (LLM).
Features:
- Simple text prompt processing
- Configurable temperature and token limits
- System message for assistant behavior definition
- Auto-downloads model on first run
Documentation: See simple_llm_chat/README.md for usage instructions.
ποΈ Simple VLM Chat
A simple example demonstrating image analysis and description using Hailo's Vision Language Model (VLM).
Features:
- Image loading and preprocessing
- Visual question answering
- Image description generation
- Uses example image from doc/images/ directory
- Auto-downloads model on first run
Documentation: See simple_vlm_chat/README.md for usage instructions.
π€ Simple Whisper Chat
A simple example demonstrating audio transcription using Hailo's Whisper speech-to-text model.
Features:
- Audio file loading and processing
- Speech-to-text transcription
- Segment-based output
- Default audio file included
- Auto-downloads model on first run
Documentation: See simple_whisper_chat/README.md for usage instructions.
π Hailo Ollama
Integration guide for using Hailo-Ollama with Open WebUI for an interactive AI chat interface.
Features:
- Ollama-compatible REST API
- Integration with Open WebUI
- Model management and deployment
- Web-based chat interface
Documentation: See hailo_ollama/README.md for installation and setup instructions.
Reusable components for building custom GenAI applications:
Provides helpers for building LLM agents:
- Context Management: Token-based context window tracking and caching
- Message Formatting: Helper functions to format messages for the LLM
- Streaming: Utilities for streaming LLM responses and filtering special tokens
- Tool Discovery & Execution: Automatic tool discovery and execution framework
- Terminal UI: Terminal UI helpers for interactive applications
Components for building voice-enabled applications:
- VoiceInteractionManager: Manages the interaction loop, recording, and UI
- AudioRecorder: Handles microphone recording with auto-detection
- SpeechToTextProcessor: Wraps Hailo's Speech2Text API (Whisper)
- TextToSpeechProcessor: Handles speech synthesis using Piper TTS
- AudioPlayer: Cross-platform audio playback
- AudioDiagnostics: Tools for device enumeration and troubleshooting
Documentation: See gen_ai_utils/README.md for detailed API documentation and usage examples.
- Hardware: Hailo AI GenAI accelerator device (H10 or compatible)
- Python: Python 3.10 or higher
- Hailo Platform SDK: Must be installed and configured
The GenAI applications require additional Python packages that are not installed by default. Install them using:
# From the repository root directory
pip install -e ".[gen-ai]"This will install:
piper-tts- For text-to-speech synthesissounddevice==0.5.1- For audio input/output (microphone recording and playback)
If you encounter audio issues, you may need to install system dependencies:
sudo apt-get install portaudio19-devIf you plan to use voice features (text-to-speech, speech-to-text), you'll need to install the Piper TTS model. We also provide tools to configure and validate your audio devices.
See Voice Processing Module Documentation for installation instructions.
Important: GenAI models are not downloaded by default due to their large size. They are downloaded on-demand when you run an application for the first time.
When you run a GenAI application for the first time, the required models will be automatically downloaded if they are not already present. This ensures you only download models you actually use.
If you prefer to download models in advance, you can use the resource downloader:
# Download models for a specific GenAI app
hailo-download-resources --group vlm_chat --arch hailo10h
hailo-download-resources --group llm_chat --arch hailo10h
hailo-download-resources --group whisper_chat --arch hailo10h
# Download all GenAI models (requires Hailo-10H hardware)
hailo-download-resources --all --include-gen-ai --arch hailo10h
# List available GenAI models
hailo-download-resources --list-models --arch hailo10hNote: GenAI applications are only available on Hailo-10H hardware. Make sure to specify --arch hailo10h when downloading models.
# Agent Tools Example (text mode)
python -m hailo_apps.python.gen_ai_apps.agent_tools_example.agent
# Agent Tools Example (voice mode)
python -m hailo_apps.python.gen_ai_apps.agent_tools_example.agent --voice
# VLM Chat
python -m hailo_apps.python.gen_ai_apps.vlm_chat.vlm_chat --input usb
# Voice Assistant
python -m hailo_apps.python.gen_ai_apps.voice_assistant.voice_assistant
# Simple Examples
python -m hailo_apps.python.gen_ai_apps.simple_llm_chat.simple_llm_chat
python -m hailo_apps.python.gen_ai_apps.simple_vlm_chat.simple_vlm_chat
python -m hailo_apps.python.gen_ai_apps.simple_whisper_chat.simple_whisper_chat --audio /path/to/audio.wavFor detailed usage examples and API documentation, see:
- Voice Processing Examples - Complete examples for
VoiceInteractionManager, audio recording, speech-to-text, and text-to-speech - LLM Utilities Examples - Complete examples for context management, streaming, tool discovery, and agent building
Quick Start Examples:
# Voice Processing - See detailed examples in voice_processing/README.md
from hailo_apps.python.gen_ai_apps.gen_ai_utils.voice_processing.interaction import VoiceInteractionManager
manager = VoiceInteractionManager(
title="My Voice App",
on_audio_ready=lambda audio: print(f"Audio: {len(audio)} samples"),
on_shutdown=lambda: print("Shutting down")
)
manager.run()# LLM Utilities - See detailed examples in llm_utils/README.md
from hailo_platform import VDevice
from hailo_platform.genai import LLM
from hailo_apps.python.gen_ai_apps.gen_ai_utils.llm_utils import context_manager, streaming, message_formatter
vdevice = VDevice()
llm = LLM(vdevice=vdevice)
messages = [
message_formatter.messages_system("You are a helpful assistant."),
message_formatter.messages_user("Tell me a joke.")
]
response = streaming.generate_and_stream_response(llm=llm, prompt=messages, max_tokens=200)
vdevice.release()The GenAI applications follow a modular architecture:
gen_ai_apps/
βββ agent_tools_example/ # Full agent application with tools
βββ vlm_chat/ # Vision Language Model application
βββ voice_assistant/ # Voice assistant application
βββ simple_llm_chat/ # Simple LLM chat example
βββ simple_vlm_chat/ # Simple VLM chat example
βββ simple_whisper_chat/ # Simple Whisper chat example
βββ hailo_ollama/ # Hailo Ollama integration guide
βββ gen_ai_utils/ # Shared utilities
βββ llm_utils/ # LLM interaction utilities
βββ voice_processing/ # Voice processing components
Applications use the shared utilities from gen_ai_utils/ to avoid code duplication and ensure consistency.
If you see import errors for piper-tts or sounddevice:
pip install -e ".[gen-ai]"This installs all dependencies needed for GenAI voice features.
If an application reports a missing model:
- The model will be downloaded automatically on first run
- Or download manually:
hailo-download-resources --group <app_name> --arch hailo10h
For microphone or audio playback problems, run: hailo-audio-troubleshoot
See Voice Processing Module Documentation for detailed troubleshooting.
GenAI applications require Hailo-10H hardware. They are not available on Hailo-8 or Hailo-8L devices.
- Agent Tools Example README - Complete guide for the agent application
- Voice Processing README - Voice processing module documentation
- GenAI Utils README - Shared utilities documentation
- Simple LLM Chat README - Simple LLM chat example guide
- Simple VLM Chat README - Simple VLM chat example guide
- Simple Whisper Chat README - Simple Whisper chat example guide
- Hailo Ollama README - Hailo Ollama integration guide
These applications are part of the Hailo Apps Infrastructure and follow the project's license terms.