The current search pipeline is unstructured recall: embed nodes, query by similarity, return branches. It answers "what past conversations are similar to X?" but doesn't synthesize or prioritize.
What a memory layer could add
A memory config, modeled after the search config pattern, could introduce structured recall:
Short-term memory: branch-scoped context
- Recent nodes within the current proxy branch
- Sliding window or token budget (e.g., last N nodes)
- Cheap, fast, ephemeral: lives in the worker pool or in-memory storage driver
Long-term memory: persistent, cross-branch knowledge
- Extracting durable facts/patterns from conversations into a knowledge graph
- Goes beyond vector similarity
- Could use a graph store (Cognee-style) or a curated vector collection with metadata
How it could map to tapes' config pattern
```toml
[memory]
provider = "cognee"  # or "local", "graph"
target = "http://..."

[memory.short_term]
enabled = true
window = 10  # last N nodes in current branch

[memory.long_term]
enabled = true
provider = "cognee"  # knowledge graph extraction
```
This follows the same optional-feature pattern as search:
- Zero config — proxy works without it
- Graceful degradation — returns "memory not configured" if missing
- Driver interface — memory.Driver with Store(), Recall(), Forget()
- Pluggable backends — local graph, Cognee, or custom
Key difference from search
Search is query-driven (user asks, system retrieves). Memory would be context-driven: the proxy automatically injects relevant recalled context into LLM requests before forwarding them upstream.