Commit 85cad66

feat: verbatim transcripts — sessions + messages (refs #107) (#108)
* feat: verbatim transcripts - sessions + messages (refs #107)

Add a third memory system alongside memories (curated, decayed) and memoirs (knowledge graphs): transcripts are verbatim recordings of agent conversations, stored as-is with no summarization or extraction. Retrieval uses FTS5 BM25 at query time (boolean, phrase, prefix).

Motivation: issue #107 + MemPalace traction show real demand for raw conversation capture. Three use cases ICM couldn't previously serve: session replay for post-mortem review, compliance/audit trails, and training-data collection. Extraction is lossy for these: we need the actual bytes, not a summary.

Implementation is 100% Rust, same SQLite file as memories/memoirs, zero new runtime deps. FTS5 index writes ~10× faster than ChromaDB-based alternatives.

## Schema

- `sessions` (id, agent, project, started_at, updated_at, metadata)
- `messages` (id, session_id, role, content, tool_name, tokens, ts, metadata)
- `messages_fts` FTS5 virtual table on role + content + tool_name
- ON DELETE CASCADE from sessions → messages; triggers keep FTS in sync

## API surface

- Core: `TranscriptStore` trait + `Session`, `Message`, `Role`, `TranscriptHit`, `TranscriptStats` types.
- Store: 8 methods on `SqliteStore` (create_session, get_session, list_sessions, record_message, list_session_messages, search_transcripts, forget_session, transcript_stats).
- CLI: `icm transcript {start-session, record, search, list-sessions, show, stats, forget}` (7 subcommands).
- MCP: 5 tools (`icm_transcript_start_session`, `_record`, `_search`, `_show`, `_stats`); total MCP surface: 22 → 27.

## Tests

8 new unit tests in icm-store: create+record, missing-session rejection, FTS5 boolean + phrase, session/project scoping, stats breakdown, cascade delete, chronological ordering, list sorting.

## Follow-ups (separate PRs)

- TUI tab "Sessions" (icm dashboard)
- Web dashboard page `/sessions` with message thread viewer
- `icm hook transcript` wiring for auto-capture from Claude Code hooks
- Optional RTK Cloud sync (paid audit / session replay tier)

* style: cargo fmt

* fix(clippy): explicit param numbering in transcript search SQL
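
The store behavior described in the commit message (create a session, append verbatim messages, reject messages for a missing session, return messages chronologically, cascade on forget) can be illustrated with a small in-memory sketch. This is not the code from the commit: the method signatures, the `Role` variants, `StoreError`, and `InMemoryStore` are all assumptions made for illustration, and the real `SqliteStore` persists to SQLite instead of a map.

```rust
use std::collections::BTreeMap;

/// Message author. The commit names a `Role` type; these three variants are
/// assumed from the roles used in the README example (user, assistant, tool).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    User,
    Assistant,
    Tool,
}

/// One verbatim message. Fields mirror a subset of the `messages` columns.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Message {
    pub session_id: String,
    pub role: Role,
    pub content: String,
    pub tool_name: Option<String>,
    pub ts: u64, // logical timestamp, used only for ordering in this sketch
}

/// Hypothetical error type; the real store's error type is not shown here.
#[derive(Debug, PartialEq, Eq)]
pub enum StoreError {
    UnknownSession(String),
}

/// Simplified stand-in for the `TranscriptStore` trait (signatures assumed).
pub trait TranscriptStore {
    fn create_session(&mut self, agent: &str, project: Option<&str>) -> String;
    fn record_message(
        &mut self,
        session_id: &str,
        role: Role,
        content: &str,
        tool_name: Option<&str>,
    ) -> Result<(), StoreError>;
    fn list_session_messages(&self, session_id: &str) -> Vec<Message>;
    fn forget_session(&mut self, session_id: &str) -> bool;
}

/// In-memory implementation for illustration only.
#[derive(Default)]
pub struct InMemoryStore {
    next_id: u64,
    clock: u64,
    sessions: BTreeMap<String, Vec<Message>>,
}

impl TranscriptStore for InMemoryStore {
    fn create_session(&mut self, _agent: &str, _project: Option<&str>) -> String {
        self.next_id += 1;
        let id = format!("session-{}", self.next_id);
        self.sessions.insert(id.clone(), Vec::new());
        id
    }

    fn record_message(
        &mut self,
        session_id: &str,
        role: Role,
        content: &str,
        tool_name: Option<&str>,
    ) -> Result<(), StoreError> {
        // Reject messages for sessions that were never created.
        let messages = self
            .sessions
            .get_mut(session_id)
            .ok_or_else(|| StoreError::UnknownSession(session_id.to_string()))?;
        self.clock += 1;
        messages.push(Message {
            session_id: session_id.to_string(),
            role,
            content: content.to_string(), // stored as-is, no summarization
            tool_name: tool_name.map(str::to_string),
            ts: self.clock,
        });
        Ok(())
    }

    fn list_session_messages(&self, session_id: &str) -> Vec<Message> {
        // Messages are appended in order, so the list is already chronological.
        self.sessions.get(session_id).cloned().unwrap_or_default()
    }

    fn forget_session(&mut self, session_id: &str) -> bool {
        // Removing the session drops its messages with it (the cascade).
        self.sessions.remove(session_id).is_some()
    }
}
```

In the SQLite-backed store the cascade is expressed as `ON DELETE CASCADE` in the schema; note that SQLite only enforces foreign-key actions when `PRAGMA foreign_keys = ON` is set on the connection.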
1 parent 8a520c7 commit 85cad66

9 files changed, +1479 -4 lines changed

Cargo.lock

Lines changed: 1 addition & 1 deletion

README.md

Lines changed: 42 additions & 1 deletion
@@ -227,7 +227,38 @@ icm memoir export -m "system-architecture" -f json # Structured JSON with al
 icm memoir export -m "system-architecture" -f dot | dot -Tsvg > graph.svg
 ```
 
-## MCP Tools (22)
+### Transcripts (verbatim session replay)
+
+Store every message exchanged with an agent as-is: no summarization, no extraction.
+Search later with FTS5 (BM25 + boolean + phrase + prefix). Useful for session replay,
+post-mortem review, compliance audit, training data. Complementary to curated memories.
+
+```bash
+# 1. Start a session
+SID=$(icm transcript start-session --agent claude-code --project myapp)
+
+# 2. Record every turn verbatim
+icm transcript record -s "$SID" -r user -c "Why did we choose Postgres?"
+icm transcript record -s "$SID" -r assistant -c "Native JSONB, BRIN for logs, tuned auto-vacuum."
+icm transcript record -s "$SID" -r tool -c '{"cmd":"psql -c ..."}' -t Bash --tokens 42
+
+# 3. Replay, search, inspect
+icm transcript list-sessions --project myapp
+icm transcript show "$SID" --limit 200
+icm transcript search "postgres JSONB" # BM25 ranked
+icm transcript search '"auto-vacuum"' # phrase match
+icm transcript search "postgres OR mysql" --session "$SID" # boolean, scoped
+icm transcript stats
+
+# 4. Delete a session (cascade deletes its messages)
+icm transcript forget "$SID"
+```
+
+Rust + SQLite + FTS5: 0 Python, 0 ChromaDB, 0 external service. Writes are ~10× faster than
+ChromaDB-based verbatim stores; the whole transcript lives in the same SQLite file as your
+memories and memoirs.
+
+## MCP Tools (27)
 
 ### Memory tools
 
@@ -266,6 +297,16 @@ icm memoir export -m "system-architecture" -f dot | dot -Tsvg > graph.svg
 | `icm_feedback_search` | Search past corrections to inform future predictions |
 | `icm_feedback_stats` | Feedback statistics: total count, breakdown by topic, most applied |
 
+### Transcript tools (verbatim session replay)
+
+| Tool | Description |
+|------|-------------|
+| `icm_transcript_start_session` | Create a session for verbatim message capture; returns `session_id` |
+| `icm_transcript_record` | Append a raw message (role, content, optional tool + tokens + metadata) |
+| `icm_transcript_search` | FTS5 search across messages (BM25, boolean, phrase, prefix) |
+| `icm_transcript_show` | Replay full message thread of a session, chronologically |
+| `icm_transcript_stats` | Sessions, messages, bytes, breakdown by role/agent/top-sessions |
+
 ### Relation types
 
 `part_of` · `depends_on` · `related_to` · `contradicts` · `refines` · `alternative_to` · `caused_by` · `instance_of` · `superseded_by`
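
The search examples in the README hunk pass FTS5 query syntax straight through: barewords, a quoted phrase such as `"auto-vacuum"`, and `OR`. A caller that builds queries from arbitrary user text may want to quote terms first, since a string containing characters such as `-` is not a valid FTS5 bareword. The helpers below are a hypothetical sketch (none of these function names come from the commit) that rely only on documented FTS5 syntax: a phrase is wrapped in double quotes, an embedded double quote is escaped by doubling it, and a trailing `*` turns a phrase into a prefix query.

```rust
/// Quote `text` as one FTS5 phrase. FTS5 escapes an embedded double quote
/// SQL-style, by doubling it.
pub fn fts5_phrase(text: &str) -> String {
    format!("\"{}\"", text.replace('"', "\"\""))
}

/// Build a prefix query: a quoted string followed by `*`.
pub fn fts5_prefix(token: &str) -> String {
    format!("{}*", fts5_phrase(token))
}

/// Match any of the given terms by joining quoted phrases with `OR`.
pub fn fts5_any_of(terms: &[&str]) -> String {
    terms
        .iter()
        .map(|t| fts5_phrase(t))
        .collect::<Vec<_>>()
        .join(" OR ")
}
```

The resulting string would be bound as a parameter to a `MATCH` expression rather than spliced into the SQL text, so quoting here only concerns FTS5 syntax, not SQL injection.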
