Repo Codex is a full-stack generative AI project that ingests code repositories, builds a searchable code index, answers repository questions with streaming responses, and generates wiki-style documentation pages.
- Frontend: React 19, TanStack Start/Router, Tailwind CSS
- Backend: TanStack server routes, TypeScript
- Database: PostgreSQL + Drizzle ORM
- Queue: BullMQ + Valkey
- AI: AI SDK with OpenAI-compatible chat and embedding endpoints
- Parsing: Tree-sitter-based code chunking
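The stack above implies that chunk metadata and embeddings are stored in PostgreSQL via Drizzle. A hypothetical schema sketch is shown below; the table name, column names, and embedding dimension are assumptions for illustration, not the project's actual schema:

```typescript
// Hypothetical Drizzle schema for stored code chunks. Names and the
// embedding dimension (1536) are assumptions, not taken from this repo.
import { pgTable, serial, integer, text, vector } from "drizzle-orm/pg-core";

export const codeChunks = pgTable("code_chunks", {
  id: serial("id").primaryKey(),
  repoId: integer("repo_id").notNull(),
  filePath: text("file_path").notNull(),
  content: text("content").notNull(),                   // chunk text from Tree-sitter parsing
  embedding: vector("embedding", { dimensions: 1536 }), // requires the pgvector extension
});
```

The `vector` column type relies on the pgvector Postgres extension being installed in the database.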
- Repository ingestion pipeline with job tracking, retry, and cancellation
- AST-aware chunking and embedding generation
- Hybrid retrieval (semantic vector + BM25 keyword ranking)
- Streaming Q&A for repository understanding
- AI-generated wiki pages with Mermaid diagram support
- API contract generation via OpenAPI and Scalar docs UI
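The README does not specify how the semantic and BM25 rankings are merged, but reciprocal rank fusion (RRF) is a common way to combine two rankings without comparing their raw scores. A minimal sketch, with the function and constant `k` purely illustrative:

```typescript
// Sketch of hybrid-result fusion via reciprocal rank fusion (RRF).
// Each ranking is a best-first list of chunk IDs; k dampens the
// influence of lower ranks (60 is the conventional default).
type RankedList = string[];

function reciprocalRankFusion(rankings: RankedList[], k = 60): [string, number][] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((chunkId, rank) => {
      scores.set(chunkId, (scores.get(chunkId) ?? 0) + 1 / (k + rank + 1));
    });
  }
  // Highest fused score first
  return [...scores.entries()].sort((a, b) => b[1] - a[1]);
}

// "a" ranks first in both the vector and the keyword list, so it wins.
const fused = reciprocalRankFusion([
  ["a", "b", "c"], // semantic vector ranking
  ["a", "c"],      // BM25 keyword ranking
]);
```

A chunk appearing in only one list still receives a score, so fusion degrades gracefully when one retriever misses a result.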
- Install dependencies: `bun install`
- Start local supporting services: `docker compose up -d`
- Run the app: `bun run dev`
- Open the app at http://localhost:3000
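The `docker compose up -d` step expects a compose file providing Postgres and Valkey. A sketch of what such a file could look like; service names, image tags, and credentials are assumptions, so check the repo's own docker-compose.yml:

```yaml
# Hypothetical docker-compose.yml matching the env vars below.
# If semantic search uses pgvector, an image bundling the extension
# (e.g. pgvector/pgvector) may be needed instead of plain postgres.
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
      POSTGRES_DB: repo_codex
    ports:
      - "5432:5432"
  valkey:
    image: valkey/valkey:7
    ports:
      - "6379:6379"
```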
Create a .env file with at least:
DATABASE_URL=postgresql://user:password@localhost:5432/repo_codex
VALKEY_URL=redis://localhost:6379
EMBEDDING_API_KEY=your_embedding_or_openai_compatible_key
LLM_API_KEY=your_chat_model_key

Notes:
- VALKEY_URL can be replaced by REDIS_URL.
- LLM_API_KEY falls back to EMBEDDING_API_KEY if not set.
- Additional optional tuning variables are used for chunking, retrieval, and wiki generation.
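The fallback rules in the notes above can be expressed as a small resolver. This helper is hypothetical (the project's actual config loading is not shown here); only the variable names come from the README:

```typescript
// Illustrative resolver for the documented env fallbacks:
// REDIS_URL substitutes for VALKEY_URL, and LLM_API_KEY
// defaults to EMBEDDING_API_KEY when unset.
function resolveEnv(env: Record<string, string | undefined>) {
  const queueUrl = env.VALKEY_URL ?? env.REDIS_URL;
  if (!queueUrl) throw new Error("Set VALKEY_URL or REDIS_URL");
  if (!env.EMBEDDING_API_KEY) throw new Error("EMBEDDING_API_KEY is required");
  return {
    databaseUrl: env.DATABASE_URL,
    queueUrl,
    embeddingApiKey: env.EMBEDDING_API_KEY,
    llmApiKey: env.LLM_API_KEY ?? env.EMBEDDING_API_KEY,
  };
}

const cfg = resolveEnv({
  DATABASE_URL: "postgresql://user:password@localhost:5432/repo_codex",
  REDIS_URL: "redis://localhost:6379",
  EMBEDDING_API_KEY: "example-key",
});
```

Failing fast on missing required variables keeps misconfiguration errors at startup rather than mid-request.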
bun run dev
bun run build
bun run typecheck
bun run lint
bun run test
bun run db:generate
bun run db:migrate
bun run db:push

- OpenAPI JSON: http://localhost:3000/api/openapi.json
- Scalar UI: http://localhost:3000/api/docs
The full technical report for this major project is available at: