GenieBot - RAG + Vision Telegram Assistant

GenieBot is a local-first Telegram assistant that supports:

  • document-grounded Q&A with RAG
  • image captioning with tags
  • quick summary of the latest chat or image interaction

It uses Ollama for generation, SentenceTransformers for embeddings, BLIP for vision, and SQLite for persistent vector storage.
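
Generation is a single HTTP call to the local Ollama server. A minimal sketch of what rag/llm.py might do, using Ollama's standard /api/generate endpoint (the function name and defaults here are illustrative, not the project's actual code):

import requests

OLLAMA_BASE_URL = "http://localhost:11434"

def generate(prompt: str, model: str = "gemma3:4b", max_tokens: int = 250) -> str:
    """Ask the local Ollama server for a non-streaming completion."""
    resp = requests.post(
        f"{OLLAMA_BASE_URL}/api/generate",
        json={
            "model": model,
            "prompt": prompt,
            "stream": False,
            "options": {"num_predict": max_tokens},  # cap the response length
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]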

Quick Access

  • Telegram bot: @mygenie_ai_bot
  • Service check: visit http://68.183.85.47:8080/ to confirm the bot service is running
  • You do not need to run the full setup locally if the service is already up; you can use the bot directly on Telegram
  • The models are open source, so feel free to use and test the bot as much as you want

Highlights

  • RAG retrieval with persistent embeddings in SQLite
  • Query and embedding caches in RAM for fast repeated calls
  • Vision pipeline for image caption + tags
  • User memory that keeps the last 3 interactions per user
  • Health/status page at / and /health
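
The status page is simple enough to serve next to the bot with the standard library alone. A rough sketch of how app.py could expose / and /health (handler and function names are illustrative, not the project's actual code):

import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class StatusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Both / and /health answer with a small OK payload.
        if self.path in ("/", "/health"):
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(b'{"status": "ok"}')
        else:
            self.send_response(404)
            self.end_headers()

def start_status_server(port: int = 8080) -> None:
    server = HTTPServer(("0.0.0.0", port), StatusHandler)
    # Daemon thread so the status page never blocks the Telegram polling loop.
    threading.Thread(target=server.serve_forever, daemon=True).start()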

Tech Stack

  • Python 3.10+
  • python-telegram-bot
  • Ollama (local LLM runtime)
  • sentence-transformers/all-MiniLM-L6-v2 (embeddings)
  • Salesforce/blip-image-captioning-base (vision; a captioning sketch follows this list)
  • SQLite (data/rag_embeddings.db)
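
The vision model is the standard Hugging Face BLIP checkpoint, so captioning reduces to a few transformers calls. A hedged sketch of the step vision/processor.py likely wraps (tag extraction omitted; the function name is illustrative):

from PIL import Image
from transformers import BlipForConditionalGeneration, BlipProcessor

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

def caption_image(path: str) -> str:
    """Return a short caption for a local image file."""
    image = Image.open(path).convert("RGB")
    inputs = processor(image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(out[0], skip_special_tokens=True)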

Current Runtime Settings

  • retrieval top_k: 2
  • chunk_size: 200
  • max_tokens: 250
  • user history: last 3 interactions per user
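
These values are small enough to keep in one place. A hedged sketch of a settings object that mirrors them (the actual code may store them as module constants or read them from config instead):

from dataclasses import dataclass

@dataclass(frozen=True)
class RuntimeSettings:
    top_k: int = 2          # chunks retrieved per question
    chunk_size: int = 200   # size of each document chunk (matches --chunk-size below)
    max_tokens: int = 250   # generation cap passed to the LLM
    history_size: int = 3   # interactions remembered per user

SETTINGS = RuntimeSettings()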

Project Structure

app.py                    # Bot bootstrap + status server
bot/handlers.py           # Telegram command handlers
rag/system.py             # Chunking, embedding, retrieval, SQLite persistence
rag/qa.py                 # QA orchestration and prompt flow
rag/llm.py                # Ollama client + model fallback handling
vision/processor.py       # Image captioning and tag extraction
utils/cache.py            # QueryCache + EmbeddingCache (RAM)
utils/memory.py           # Per-user short history (RAM)
utils/logger.py           # File + console logging
data/                     # Knowledge documents + SQLite DB file
media/                    # Assignment screenshots used below
docs/diagrams/system-design.mmd

Setup and Run

1) Create and activate virtual environment

Windows (PowerShell):

python -m venv .venv
.\.venv\Scripts\Activate.ps1

macOS/Linux:

python -m venv .venv
source .venv/bin/activate

2) Install dependencies

pip install -r requirements.txt

3) Start Ollama and pull models

ollama serve
ollama pull gemma3:4b
ollama pull mistral
ollama pull tinyllama

4) Configure environment

Create .env from .env.example and set:

TELEGRAM_BOT_TOKEN=your_token_here
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=gemma3:4b
OLLAMA_MODEL_PRIORITY=mistral,phi3
OLLAMA_FALLBACK_MODELS=tinyllama
LOG_LEVEL=INFO
PORT=8080
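
A minimal sketch of reading these variables at startup with python-dotenv; the variable names come from the .env above, while the candidate-list helper is purely illustrative of how the priority and fallback models might be combined:

import os
from dotenv import load_dotenv

load_dotenv()  # read .env from the project root

TELEGRAM_BOT_TOKEN = os.environ["TELEGRAM_BOT_TOKEN"]  # required
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "gemma3:4b")

# Models in the order the bot should try them: primary, then priority, then fallbacks.
_extra = os.getenv("OLLAMA_MODEL_PRIORITY", "") + "," + os.getenv("OLLAMA_FALLBACK_MODELS", "")
MODEL_CANDIDATES = [OLLAMA_MODEL] + [m.strip() for m in _extra.split(",") if m.strip()]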

5) Optional: prebuild the embedding DB

python scripts/build_vector_db.py --data-dir data --db-path data/rag_embeddings.db --chunk-size 200 --chunk-overlap 50
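
For reference, the build step amounts to: split each document into overlapping chunks, embed the chunks with all-MiniLM-L6-v2, and persist the vectors in SQLite. A hedged sketch under those assumptions (the real scripts/build_vector_db.py and its table schema may differ):

import sqlite3
from pathlib import Path

import numpy as np
from sentence_transformers import SentenceTransformer

def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Fixed-size character chunks with overlap (illustrative splitter)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def build_db(data_dir: str = "data", db_path: str = "data/rag_embeddings.db") -> None:
    model = SentenceTransformer("all-MiniLM-L6-v2")
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS chunks (source TEXT, text TEXT, embedding BLOB)")
    for doc in sorted(Path(data_dir).iterdir()):
        if doc.suffix not in (".md", ".txt"):
            continue
        chunks = chunk_text(doc.read_text(encoding="utf-8"))
        vectors = model.encode(chunks)  # one vector per chunk
        for text, vec in zip(chunks, vectors):
            conn.execute(
                "INSERT INTO chunks VALUES (?, ?, ?)",
                (doc.name, text, np.asarray(vec, dtype=np.float32).tobytes()),
            )
    conn.commit()
    conn.close()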

6) Run the bot

python app.py

Bot Commands

  • /start - quick intro and commands
  • /help - usage guidance
  • /ask <question> - document-grounded answer
  • /image - upload image for caption + tags
  • /summarize [chat|image] - summarize latest interaction
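
With python-telegram-bot v20+, wiring these commands up typically looks like the sketch below; the handler functions are assumed to live in bot/handlers.py and their names here are illustrative:

from telegram.ext import Application, CommandHandler, MessageHandler, filters

from bot.handlers import start, help_command, ask, summarize, handle_photo  # assumed names

def build_application(token: str) -> Application:
    app = Application.builder().token(token).build()
    app.add_handler(CommandHandler("start", start))
    app.add_handler(CommandHandler("help", help_command))
    app.add_handler(CommandHandler("ask", ask))                   # /ask <question>
    app.add_handler(CommandHandler("summarize", summarize))       # /summarize [chat|image]
    app.add_handler(MessageHandler(filters.PHOTO, handle_photo))  # photos sent for captioning
    return app

# build_application(TELEGRAM_BOT_TOKEN).run_polling()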

Demo Screenshots

Start (screenshot in media/)

Ask (screenshot in media/)

These example /ask queries fetch information from the loaded documents and then answer:

  • /ask What is the return policy?
  • /ask What are the pricing plans and included features?
  • /ask What are the support hours and escalation process?
  • /ask How do I reset my account password?
  • /ask What are the main security/compliance points?

Image (screenshot in media/)

Summarize (screenshot in media/)

Architecture Diagram

Source: docs/diagrams/system-design.mmd

flowchart TD
    U[Telegram User] --> TG[Telegram API]
    TG --> APP["app.py - Bot Runtime"]
    WEB[Browser Render Ping] --> STATUS["Status endpoints: / and /health"]
    STATUS --> APP

    APP --> H["bot/handlers.py - Command Handlers"]
    H --> MEM["utils/memory.py - Last 3 interactions per user"]
    H --> QA["rag/qa.py - RAG QA Orchestrator"]
    H --> VISION["vision/processor.py - BLIP Caption + Tags"]

    QA --> RAG["rag/system.py - RAG Retrieval"]
    QA --> LLM["rag/llm.py - Ollama LLM + Fallback"]
    QA --> QCACHE["utils/cache.py - QueryCache (RAM only)"]

    RAG --> DOCS["Knowledge Documents (md/txt files)"]
    RAG --> ECACHE["utils/cache.py - EmbeddingCache (RAM only)"]
    RAG --> ST["sentence-transformers - all-MiniLM-L6-v2"]
    RAG --> SQLITE["SQLite DB - data/rag_embeddings.db"]

    BLD["scripts/build_vector_db.py - One-time or manual DB build"] --> SQLITE
    VISION --> BLIP["Salesforce BLIP - Image Caption Model"]

    APP --> LOGS["logs/geniebot_YYYYMMDD.log"]
    APP --> ENV[".env configuration"]

Storage and Caching

  • Persistent: document chunk embeddings in SQLite (data/rag_embeddings.db)
  • RAM only: QueryCache, EmbeddingCache, and per-user interaction history (a cache sketch follows this list)
  • Logs: logs/geniebot_YYYYMMDD.log (one file per day; no automatic rotation or cleanup)
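
A hedged sketch of the RAM-only query cache described above (utils/cache.py may differ, e.g. by adding TTLs or a different eviction policy):

class QueryCache:
    """In-memory answer cache keyed by the normalized question text."""

    def __init__(self, max_items: int = 256):
        self.max_items = max_items
        self._store: dict[str, str] = {}

    @staticmethod
    def _key(question: str) -> str:
        return " ".join(question.lower().split())

    def get(self, question: str) -> str | None:
        return self._store.get(self._key(question))

    def put(self, question: str, answer: str) -> None:
        if len(self._store) >= self.max_items:
            self._store.pop(next(iter(self._store)))  # evict the oldest insertion
        self._store[self._key(question)] = answer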

Assignment Notes

  • This project runs without Docker.
  • RAG is optimized for concise answers with grounded context.
  • Repeated questions are served from the in-memory query cache when available.