# vidzard-core

This project is a Python-based solution for automatically generating highlight reels from video content. It leverages AI to transcribe and analyze video transcripts, identify the most significant moments, and then programmatically edits the video to create a final highlight clip.
## Features

- YouTube Video Downloader: Includes a script to download videos from YouTube for processing.
- Video Transcription: Uses OpenAI's Whisper model to generate accurate, timestamped transcripts.
- AI-Powered Analysis: Employs the Gemini API to analyze the transcript and select the most important segments based on a user-provided prompt.
- Automated Video Editing: Uses `ffmpeg` to cut and concatenate the selected video segments into a seamless highlight video.
- Modular & Extensible: The code is organized into a modular structure, making it easy to extend and customize.
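For context on the transcription step: Whisper's `transcribe()` returns a result dict whose `"segments"` list carries `id`, `start`, `end`, and `text` fields. Below is a minimal sketch of persisting those segments as a timestamped transcript JSON — the `save_transcript` helper and the output schema are illustrative assumptions, not this project's exact API:

```python
import json

def save_transcript(segments, path="transcript.json"):
    """Persist Whisper-style segments as a timestamped transcript.

    `segments` is shaped like result["segments"] from
    whisper.load_model("base").transcribe("video.mp4").
    The output schema below is illustrative, not the project's exact one.
    """
    transcript = [
        {
            "id": seg["id"],
            "start": round(seg["start"], 2),
            "end": round(seg["end"], 2),
            "text": seg["text"].strip(),
        }
        for seg in segments
    ]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(transcript, f, ensure_ascii=False, indent=2)
    return transcript
```

Downstream steps can then reload the JSON and refer to segments by `id` when asking the model which ones matter.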
## Setup

1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd vidzard-core
   ```

2. Create and activate a Python environment. Using a virtual environment is recommended:

   ```bash
   python -m venv venv
   source venv/bin/activate
   ```

   Or using conda:

   ```bash
   conda create -n vidzard python=3.10
   conda activate vidzard
   ```

3. Install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Set up your API key:

   - Create a `.env` file in the project root.
   - Add your Gemini API key to the `.env` file:

     ```
     GEMINI_API_KEY="your_api_key_here"
     ```

5. Place your video file:

   Place the video you want to process in the project root and name it `video.mp4`, or use the `--video_path` argument to specify a different location.
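The `.env` step above is normally handled by a library such as python-dotenv. As a self-contained illustration of what the file format contains, here is a minimal hand-rolled loader — `load_env` is a hypothetical helper for illustration, not part of this project:

```python
import os

def load_env(path=".env"):
    """Minimal .env loader: reads KEY="value" lines into os.environ.

    Real projects should prefer python-dotenv's load_dotenv();
    this sketch only illustrates what the file contains.
    """
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip().strip('"').strip("'")
```

After `load_env()` runs, the key is available as `os.environ["GEMINI_API_KEY"]`.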
## Usage

You can run the entire pipeline using the `main.py` script.

To run the highlight generation with default settings, simply run:

```bash
python main.py
```

You can customize the behavior of the script using command-line arguments:

- `--video_path`: Path to the input video file (default: `video.mp4`).
- `--user_prompt`: The prompt that guides the AI in selecting important segments (default: "This is an interview about an urban project. Identify the most important segments.").
- `--output_path`: Path to save the final highlight video (default: `final_highlight.mp4`).
- `--transcript_path`: Path to save or load the transcript JSON file (default: `transcript.json`).
- `--important_ids_path`: Path to save or load the important segment IDs JSON file (default: `important_ids.json`).
- `--chunk_size`: Number of transcript segments to process in each API call (default: 20).
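The `--chunk_size` flag exists because long transcripts are analyzed in batches: each API call sees only `chunk_size` segments. A sketch of that batching, with a hypothetical prompt builder — `chunk_segments` and `build_prompt` are illustrative names, and the prompt format is an assumption:

```python
def chunk_segments(segments, chunk_size=20):
    """Split the transcript into batches of `chunk_size` segments,
    one batch per analysis request."""
    return [segments[i:i + chunk_size]
            for i in range(0, len(segments), chunk_size)]

def build_prompt(user_prompt, chunk):
    """Render one batch of segments into a prompt that asks the model
    to pick out the important segment ids."""
    lines = [f'[{seg["id"]}] {seg["start"]:.1f}-{seg["end"]:.1f}: {seg["text"]}'
             for seg in chunk]
    return (f"{user_prompt}\n\n"
            "Transcript segments:\n" + "\n".join(lines) +
            "\n\nReturn the ids of the most important segments.")
```

Smaller chunks keep each request well inside the model's context window at the cost of more API calls.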
For example:

```bash
python main.py --video_path my_interview.mp4 --user_prompt "Find the key moments where the guest discusses their childhood." --output_path childhood_highlights.mp4
```

## Project Structure

```
vidzard-core/
├── main.py                   # Main script to run the full pipeline
├── requirements.txt          # Project dependencies
├── video.mp4                 # Example input video
├── final_highlight.mp4       # Example output video
├── individual_clips/         # Directory for individual video clips
├── scripts/
│   ├── video_cutter.py       # Standalone script for video cutting
│   └── youtube_downloader.py # Script to download videos from YouTube
└── src/
    ├── utils/
    │   ├── highlight.py      # Main high-level functions to generate highlights
    │   └── video_utils.py    # Utility functions for video processing
    └── vidzard/
        ├── analysis/
        │   ├── analyzer.py      # Handles transcript analysis with Gemini
        │   └── transcriber.py   # Handles video transcription with Whisper
        ├── data/
        │   └── data_handler.py  # Utility functions for data handling (JSON, etc.)
        └── video/
            ├── composer.py      # Composes the final video from clips
            └── editor.py        # Handles video editing tasks (cutting, etc.)
```
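The cutting and composing modules plausibly shell out to `ffmpeg`, and two command shapes cover the whole job: a stream-copy cut per selected segment, then a concat-demuxer merge. The helpers below only build the argument lists; their names and the file layout are illustrative, not the project's actual API:

```python
def cut_cmd(src, start, end, dst):
    """Build an ffmpeg command that copies one segment without re-encoding.

    -ss before -i seeks fast; with -c copy the cut snaps to the
    nearest keyframe, which is usually acceptable for highlight reels.
    """
    return ["ffmpeg", "-y", "-ss", str(start), "-i", src,
            "-t", str(end - start), "-c", "copy", dst]

def concat_cmd(list_file, dst):
    """Build an ffmpeg concat-demuxer command. `list_file` holds lines
    like: file 'individual_clips/clip_0.mp4'"""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", dst]
```

Each list would be executed with `subprocess.run(cmd, check=True)`. Stream copy avoids re-encoding; frame-accurate cuts would require dropping `-c copy` and re-encoding instead.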
## Contributing

Contributions are welcome! Please feel free to submit a pull request or open an issue.
## License

This project is licensed under the MIT License.