A simple example application demonstrating audio transcription using Hailo's Whisper speech-to-text model.
- Audio file loading and processing
- Speech-to-text transcription
- Segment-based output
- Default audio file: audio.wav (in same directory)
- Auto-downloads model on first run
- Hailo AI accelerator device (H10 or compatible)
- Python 3.10+
- Hailo Platform SDK
Before running this example, ensure GenAI dependencies are installed:
# From the repository root directory
pip install -e ".[gen-ai]"
On Windows, use a clean virtual environment before installing the dependencies:
py -m venv wind_venv
.\wind_venv\Scripts\Activate.ps1
pip install .\hailort-<version>-cp<python>-cp<python>-win_amd64.whl
# From the repository root directory
pip install -e ".[gen-ai]"
This will install required packages including:
- NumPy
For complete installation instructions, see: GenAI Applications Installation Guide
- simple_whisper_chat.py - Main example script demonstrating Whisper transcription functionality
- audio.wav - Sample audio file for testing
Run the example with the default audio file:
python -m hailo_apps.python.gen_ai_apps.simple_whisper_chat.simple_whisper_chat
Or specify a custom audio file:
python -m hailo_apps.python.gen_ai_apps.simple_whisper_chat.simple_whisper_chat --audio /path/to/your/audio.wav
- --hef-path PATH - Specify a custom path to the HEF model file
- --list-models - List available models for this application
- --audio PATH - Path to audio file (default: audio.wav in same directory)
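For readers who want to see how options like these are typically wired up, here is a minimal argparse sketch. It is not the script's actual parser: the function name build_parser, the help strings, and the default_audio parameter are assumptions made for illustration. Only the three flag names come from the list above.

```python
import argparse


def build_parser(default_audio="audio.wav"):
    """Build a parser exposing the three documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Whisper transcription example (sketch)"
    )
    # Path to the WAV file to transcribe; falls back to the bundled sample.
    parser.add_argument("--audio", metavar="PATH", default=default_audio,
                        help="Path to audio file")
    # Optional override for the HEF model file; None means "use the default model".
    parser.add_argument("--hef-path", metavar="PATH", default=None,
                        help="Custom path to the HEF model file")
    # Boolean switch: present means True, absent means False.
    parser.add_argument("--list-models", action="store_true",
                        help="List available models for this application")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.audio, args.hef_path, args.list_models)
```

Note that argparse exposes --hef-path as args.hef_path and --list-models as args.list_models, since dashes in flag names become underscores in attribute names.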
The example will:
- Initialize the Hailo device and load the Whisper model
- Load the audio file (default: audio.wav in the same directory)
- Process the audio and generate transcription segments
- Display the complete transcription
- Clean up resources
The example uses the WHISPER_MODEL_NAME_H10 model which is automatically downloaded on first run. No manual download required.
Note: Models are downloaded automatically when you run the example for the first time.
The example supports WAV files with:
- Sample rate: Any (will be processed by the model)
- Channels: Mono or stereo (will be processed)
- Sample width: 16-bit PCM
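To check whether a file meets these requirements before running the example, you can inspect its header with Python's standard wave module. The sketch below is not part of the example script; the helper names describe_wav and is_16bit_pcm are hypothetical.

```python
import wave


def describe_wav(path):
    """Return basic header fields of a WAV file using the standard wave module."""
    # wave.open raises wave.Error if the file is not a WAV file it can parse.
    with wave.open(str(path), "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def is_16bit_pcm(info):
    """16-bit PCM audio stores each sample in 2 bytes."""
    return info["sample_width_bytes"] == 2
```

For example, is_16bit_pcm(describe_wav("audio.wav")) returns True for a 16-bit file, and a wave.Error from describe_wav points to an invalid or unsupported WAV file.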
- Ensure Hailo models are properly installed
- Check model paths in the resource directory
- The model will be downloaded automatically on first run
- Verify Hailo device is connected and recognized
- Check device permissions
- Verify the audio file exists at the specified path
- If using default, ensure audio.wav is in the same directory as the script
- Check file permissions
- Ensure the file is a valid WAV format
- Check that the file is not corrupted
- Verify the file is readable
- Ensure Hailo Platform SDK is properly installed
- Verify Python environment has all required packages (NumPy)
The example demonstrates basic Whisper transcription:
- Creates a VDevice for Hailo hardware access
- Initializes a Speech2Text instance with the Whisper model
- Loads the audio file using Python's wave module
- Converts audio to the format expected by the model (float32, normalized)
- Generates transcription segments using the Whisper model
- Combines segments into a complete transcription
- Cleans up resources properly
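The audio loading and conversion steps above can be sketched with the wave module and NumPy alone. This is a rough illustration under stated assumptions, not the script's actual code. The function name load_wav_as_float32 is hypothetical, dividing by 32768 is one common way to normalize 16-bit samples to [-1.0, 1.0), and averaging channels to get mono is an assumption about how stereo input might be handled. The Hailo-specific steps (VDevice, Speech2Text) are left out because their exact signatures are not given here.

```python
import wave

import numpy as np


def load_wav_as_float32(path):
    """Read a 16-bit PCM WAV file and return (samples, sample_rate).

    samples is a 1-D float32 array scaled to the range [-1.0, 1.0).
    """
    with wave.open(str(path), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    # WAV stores 16-bit samples as little-endian signed integers.
    samples = np.frombuffer(raw, dtype="<i2").astype(np.float32) / 32768.0

    if channels > 1:
        # Assumption: downmix by averaging the interleaved channels.
        samples = samples.reshape(-1, channels).mean(axis=1)

    return samples, sample_rate
```

The returned sample rate is passed back to the caller so that any resampling the model requires can be handled separately.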
This is a simplified example. For more advanced features like real-time audio recording, streaming transcription, and voice interaction, see the full Voice Assistant application.