Debatly listens to a live debate (or an uploaded recording), transcribes it with per-speaker labels, fact-checks the claims against the web, tracks contradictions, and turns it all into a clear credibility score and a readable post-debate report. No winner is declared: just the facts and an honest reading.
- Live capture: Streams your mic to the server over a WebSocket. Words are transcribed in real time by Speechmatics; who is speaking is decided by a local diarization engine (Silero VAD + pyannote/wespeaker voice embeddings with persistent speaker profiles). Each transcribed word is matched to a voice fingerprint, so a speaker keeps the same label across the whole debate.
- Import a recording: Upload an audio/video file or paste a video URL. Debatly extracts the audio, diarizes and transcribes the whole thing, then runs it through the exact same analysis pipeline as a live debate.
- Live Debate Desk: As people talk, the app builds the board:
  - Debate points grouped into themed, collapsible families per side.
  - Claims & fact-checks with a verdict (Verified / False / Misleading / Unverified), a one-line reason, and real sources.
  - Self-contradictions & double standards caught across the debate.
- Credibility score: A deterministic score that rewards verified claims and penalizes false/misleading ones and self-contradictions, so volume never beats accuracy. It measures honesty, not who talked more or whose opinion you like.
- Post-debate report: Generated when the debate ends:
  - A plain-English verdict (no winner is declared; you decide).
  - A score-over-time chart with markers for when each speaker entered.
  - A per-speaker breakdown with stats, a judging note, and a standout quote.
  - Side summaries, turning points (with gained/lost-ground impact), and the scoring methodology.
- Native PDF export: Download the whole report as a text-based PDF with vector charts.
- Long, uninterrupted sessions: Built to run for hours without dropping the connection.
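To make the credibility score idea concrete, here is a minimal sketch of a deterministic score in TypeScript. It is not Debatly's actual formula: the weights, the neutral baseline of 50, the 0 to 100 range, and the names `SpeakerTally` and `credibilityScore` are all assumptions made for illustration. The one property it is meant to show is that the score is an average over checked claims, so making more claims does not raise it by itself.

```typescript
// Verdict labels as listed on the Debate Desk.
type Verdict = "Verified" | "False" | "Misleading" | "Unverified";

// Hypothetical per-speaker input; not the project's real data model.
interface SpeakerTally {
  verdicts: Verdict[];     // one entry per fact-checked claim
  contradictions: number;  // self-contradictions / double standards flagged
}

// Assumed weights, chosen only for this example.
const WEIGHTS: Record<Verdict, number> = {
  Verified: 1,
  False: -2,
  Misleading: -1,
  Unverified: 0,
};
const CONTRADICTION_PENALTY = 1.5;

function credibilityScore(tally: SpeakerTally): number {
  // Unverified claims neither help nor hurt, so they are left out entirely.
  const checked = tally.verdicts.filter((v) => v !== "Unverified");
  const events = checked.length + tally.contradictions;
  if (events === 0) return 50; // nothing to judge yet: neutral baseline

  const points =
    checked.reduce((sum, v) => sum + WEIGHTS[v], 0) -
    CONTRADICTION_PENALTY * tally.contradictions;

  // Average per event, so talking more does not inflate the score.
  const average = points / events;
  return Math.round(Math.min(100, Math.max(0, 50 + 50 * average)));
}
```

With these assumed weights, a speaker with three verified claims scores 100, while a speaker with fifteen verified and fifteen false claims scores 25. The same input always yields the same output because no randomness or model call is involved.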
%%{init: {"theme":"base","themeVariables":{"fontFamily":"ui-sans-serif, -apple-system, Segoe UI, Roboto, Helvetica, Arial, sans-serif","fontSize":"14px","lineColor":"#8a847c","clusterBkg":"#0c0a09","clusterBorder":"#2b2724"}}}%%
flowchart TD
U(["Browser: live mic · file · video URL"])
API["Node · Express · WebSocket"]
U == "audio · file · link" ==> API
subgraph LIVE["Live"]
direction LR
SM["Speechmatics<br/>realtime words · STT"]
LD["Local diarizer · Python<br/>Silero VAD + pyannote/wespeaker<br/>online speaker clustering"]
end
MERGE{{"Word-to-speaker merge<br/>persistent labels"}}
subgraph BATCH["Upload / URL"]
PY["pyannote.ai<br/>precision-2 diarize + Whisper"]
end
API -- "live audio" --> SM
API -- "live audio" --> LD
API -- "upload / URL" --> PY
SM --> MERGE
LD --> MERGE
subgraph PIPE["Analysis pipeline"]
direction TB
SB["Side Builder"] --> DP["Debate-Point Builder"] --> CB["Claim Builder"] --> FC["Fact Checker · Firecrawl + Gemini"] --> IW["Inconsistency Watch"] --> SE["Scoring Engine"]
end
MERGE ==> SB
PY ==> SB
DB[("Supabase · Postgres")]
UI["Debate Desk + Report"]
SE ==> DB
SE ==> UI
DB <--> UI
classDef input fill:#0c0a09,stroke:#5b8aa0,stroke-width:1.5px,color:#f5f5f4
classDef core fill:#1c1917,stroke:#a8a29e,stroke-width:1.5px,color:#f5f5f4
classDef live fill:#13212b,stroke:#5b8aa0,stroke-width:1.4px,color:#e8eef2
classDef batch fill:#241318,stroke:#a4626a,stroke-width:1.4px,color:#f3e8ea
classDef pipe fill:#1c1917,stroke:#8a8378,stroke-width:1.2px,color:#f1efe9
classDef store fill:#0e1d17,stroke:#5b8a6f,stroke-width:1.5px,color:#e7f2ec
classDef ui fill:#1c1917,stroke:#c9a96a,stroke-width:1.5px,color:#f5f5f4
class U input
class API,MERGE core
class SM,LD live
class PY batch
class SB,DP,CB,FC,IW,SE pipe
class DB store
class UI ui
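
On the live path in the diagram, words from Speechmatics and speaker labels from the local diarizer arrive separately and are merged. The sketch below shows one plausible way to do such a merge, assuming both streams carry timestamps in seconds: each word takes the label of the speaker segment it overlaps most, and falls back to the previous label when nothing overlaps. The types and the function name `mergeWordsWithSpeakers` are hypothetical, and the real merge may work differently (for example, directly on voice embeddings).

```typescript
// Hypothetical shapes; the real payloads from Speechmatics and the diarizer may differ.
interface Word { text: string; start: number; end: number }
interface SpeakerSegment { speaker: string; start: number; end: number }
interface LabelledWord extends Word { speaker: string }

// Length (in seconds) of the intersection of two time intervals, or 0 if disjoint.
function overlapSeconds(aStart: number, aEnd: number, bStart: number, bEnd: number): number {
  return Math.max(0, Math.min(aEnd, bEnd) - Math.max(aStart, bStart));
}

function mergeWordsWithSpeakers(
  words: Word[],
  segments: SpeakerSegment[],
  fallbackSpeaker: string = "Unknown",
): LabelledWord[] {
  let lastSpeaker = fallbackSpeaker;
  return words.map((word) => {
    let bestSpeaker: string | null = null;
    let bestOverlap = 0;
    for (const segment of segments) {
      const shared = overlapSeconds(word.start, word.end, segment.start, segment.end);
      if (shared > bestOverlap) {
        bestOverlap = shared;
        bestSpeaker = segment.speaker;
      }
    }
    // No overlapping segment: keep the previous speaker rather than guessing.
    if (bestSpeaker !== null) lastSpeaker = bestSpeaker;
    return { ...word, speaker: lastSpeaker };
  });
}
```

The linear scan over segments keeps the example short; a long session would more likely keep a pointer into a time-sorted segment list.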
| Layer | Tools |
|---|---|
| Frontend | React + TypeScript, Vite, Recharts (charts), Lucide (icons), jsPDF (native PDF export) |
| Backend | Node.js, Express, ws (WebSocket), Zod |
| Realtime STT (live words) | Speechmatics realtime transcription |
| Live diarization | Local Python worker: Silero VAD + pyannote/wespeaker voice embeddings + online clustering with persistent speaker profiles (PyTorch, CPU) |
| Batch diarize + transcribe | pyannote.ai precision-2 + Whisper (for imports) |
| Reasoning & fact-checks | Google Vertex AI (Gemini) + Firecrawl web search |
| Audio / media | ffmpeg-static, youtube-dl-exec |
| Data & auth | Supabase (Postgres + authentication) |
| Hosting | GCP Compute Engine VM + Caddy (automatic HTTPS) |
- Node.js 20+
- Accounts/keys for the services you want to use (Vertex AI / Gemini, Speechmatics, Firecrawl, Supabase, pyannote.ai). See `.env.example` for the full list.
Install dependencies with `npm install`, then copy the template and fill in your own keys with `cp .env.example .env`.

`.env` is gitignored, so secrets never get committed, and the browser never receives them.
Running `npm run dev` starts the API + live WebSocket server and the Vite dev server together. Then open http://127.0.0.1:5173 (the API runs on 127.0.0.1:8787).
| Command | What it does |
|---|---|
| `npm run dev` | Run API + web together (hot reload) |
| `npm run build` | Build the frontend for production |
| `npm run start` | Run the production API/WebSocket server |
| `npm run check` | TypeScript type-check (no emit) |
| `npm run preview` | Preview a production build |
.
├── src/              # React + TypeScript frontend (App.tsx, styles, types)
├── server/           # Node API + WebSocket server
│   ├── index.mjs     # HTTP/WS entry point
│   ├── pipeline.mjs  # live analysis pipeline
│   ├── nodes/        # claim builder, fact-checker, scoring engine, report builder, …
│   └── shared/       # STT, audio extraction, db helpers
├── public/           # static assets (logos, favicons)
├── supabase/         # database migrations
├── docs/             # README media
└── .env.example      # all configuration keys (copy to .env)
Debatly runs on a GCP Compute Engine VM with Caddy as a reverse proxy that handles automatic HTTPS, serves the built frontend, and proxies /api + /live to the Node server. A VM (rather than a serverless platform) is used deliberately so a single live debate can run for hours without the connection being cut. The Node server spawns the local diarization worker (PyTorch) as a child process per live session, so the VM is CPU-sized to run the speaker-embedding model in real time (no GPU required). Persistence uses Supabase's IPv4 connection pooler.
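
The one-worker-per-live-session arrangement described above can be sketched as a small registry. This is an illustration only, not the code in `server/`: `LiveSessionRegistry`, `DiarizerWorker`, and the injected `startWorker` function are hypothetical names. In a real server, `startWorker` would wrap whatever call launches the Python diarizer as a child process; here it is injected so the bookkeeping can be shown without one.

```typescript
// Hypothetical handle for a running diarization worker.
interface DiarizerWorker { stop(): void }
type StartWorker = (sessionId: string) => DiarizerWorker;

class LiveSessionRegistry {
  private workers: Record<string, DiarizerWorker> = {};
  private startWorker: StartWorker;

  constructor(startWorker: StartWorker) {
    this.startWorker = startWorker;
  }

  // Start exactly one worker per session; a repeated open is a no-op.
  open(sessionId: string): void {
    if (this.workers[sessionId] !== undefined) return;
    this.workers[sessionId] = this.startWorker(sessionId);
  }

  // Stop and forget the worker when the live session ends.
  close(sessionId: string): void {
    const worker: DiarizerWorker | undefined = this.workers[sessionId];
    if (worker === undefined) return;
    worker.stop();
    delete this.workers[sessionId];
  }

  // Number of workers currently running, i.e. concurrent live debates.
  activeCount(): number {
    return Object.keys(this.workers).length;
  }
}
```

Because each open session holds a CPU-bound worker, `activeCount()` is the number to watch when sizing the VM.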
- Overlapping speech & fast interruptions. Live diarization decides who is speaking from short rolling audio windows, so when two people talk over each other or hand off rapidly, a speaker change can be detected slightly late: the new speaker's first word or two can land on the previous speaker. Single-microphone audio can't truly separate overlapping voices.
- Live latency. Live transcript bubbles lag a couple of seconds behind real speech: the diarization label is given a short delay to settle so the speaker attribution is reliable. Words are buffered per speaker turn and shown when the speaker changes.
- Live is CPU-bound (concurrency). Each live debate spawns its own speaker-embedding worker on the server CPU. That comfortably handles a handful of simultaneous debates, but it isn't free: heavy concurrent load needs a bigger machine or a GPU.
- Uploads are still the most accurate. The import path runs batch diarization + transcription via pyannote.ai (`precision-2`) + Whisper, which has the full recording to work with and is the gold standard for speaker separation. If you need the cleanest possible speaker labels, import a recording rather than capturing live.
- Audio quality matters. Close mic placement, low background noise, and minimal reverb noticeably improve live results.
- All API keys live in `.env` and stay server-side. The browser never receives Google, Speechmatics, Firecrawl, or database credentials.
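
The live-latency note above says words are buffered per speaker turn and shown when the speaker changes. A minimal sketch of that buffering, assuming words arrive one at a time with a settled speaker label, could look like this. `TurnBuffer` and `Turn` are hypothetical names, and the delay that lets the diarization label settle is deliberately left out.

```typescript
// Hypothetical shape of one transcript bubble.
interface Turn { speaker: string; text: string }

class TurnBuffer {
  private speaker: string | null = null;
  private words: string[] = [];
  private onTurn: (turn: Turn) => void;

  constructor(onTurn: (turn: Turn) => void) {
    this.onTurn = onTurn;
  }

  // Add a word; if the speaker changed, emit the previous speaker's turn first.
  push(speaker: string, word: string): void {
    if (this.speaker !== null && speaker !== this.speaker) this.flush();
    this.speaker = speaker;
    this.words.push(word);
  }

  // Emit whatever is buffered, e.g. when the debate ends.
  flush(): void {
    if (this.speaker === null || this.words.length === 0) return;
    this.onTurn({ speaker: this.speaker, text: this.words.join(" ") });
    this.speaker = null;
    this.words = [];
  }
}
```

A caller would feed each labelled word into `push` and call `flush` once at the end of the session so the final turn is not lost.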