feat(recall): prefer_observations - dedupe raw facts superseded by observations #2311
Draft
nicoloboschi wants to merge 1 commit into
…bservations

Recalling `observation` alongside `world`/`experience` can return the same information twice: once as a raw fact and once folded into an observation that was consolidated from it. The new `prefer_observations` flag drops any raw fact that a returned observation lists in its `source_memory_ids`, so the observation supersedes it.

Dedup is by provenance (exact id membership), not semantics. It runs before the recall truncation so freed slots backfill, keeping the result count at the requested budget.

Enabled by default at the user-facing boundary (HTTP `RecallRequest` + MCP recall tool). The engine method defaults it to `False` so internal callers (notably consolidation, which needs the raw facts it folds into observations) are never silently deduped.

Includes regenerated OpenAPI + client SDKs, control-plane proxy + types, docs, and deterministic provenance-based tests.
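For reviewers who want the idea in isolation, here is a minimal sketch of provenance-based dedup applied before truncation. It is not the code in `engine/memory_engine.py`: the `Memory` dataclass, `drop_superseded`, and `recall` are hypothetical names, ranking is reduced to a single score, and the sketch assumes every candidate observation counts as "returned" when the superseded set is built.

```python
from dataclasses import dataclass, field
from typing import List

RAW_TYPES = {"world", "experience"}


@dataclass
class Memory:
    """Hypothetical stand-in for a recalled memory unit."""
    id: str
    fact_type: str  # "world", "experience", or "observation"
    score: float
    source_memory_ids: List[str] = field(default_factory=list)


def drop_superseded(ranked: List[Memory]) -> List[Memory]:
    """Drop raw facts whose id appears in any observation's source list."""
    # Assumption: all candidate observations contribute to the superseded set.
    superseded = {
        source_id
        for m in ranked
        if m.fact_type == "observation"
        for source_id in m.source_memory_ids
    }
    # Exact id membership only; no semantic similarity is involved.
    return [
        m for m in ranked
        if not (m.fact_type in RAW_TYPES and m.id in superseded)
    ]


def recall(candidates: List[Memory], budget: int,
           prefer_observations: bool = False) -> List[Memory]:
    ranked = sorted(candidates, key=lambda m: m.score, reverse=True)
    if prefer_observations:
        # Dedup runs before truncation so freed slots can backfill.
        ranked = drop_superseded(ranked)
    return ranked[:budget]


candidates = [
    Memory("o1", "observation", 0.9, ["f1"]),
    Memory("f1", "world", 0.8),
    Memory("f2", "experience", 0.7),
    Memory("f3", "world", 0.6),
]
deduped_ids = [m.id for m in recall(candidates, budget=3, prefer_observations=True)]
raw_ids = [m.id for m in recall(candidates, budget=3)]
```

With a budget of 3, the deduped call keeps `o1`, drops `f1`, and backfills with `f3`, so the result still has three items. Deduping after truncation would have returned only two.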
What
Adds a `prefer_observations` flag to recall. When you recall raw facts (`world`/`experience`) together with `observation`, any raw fact that a returned observation was consolidated from is dropped, so the observation supersedes it and no content is duplicated.

Enabled by default. Set `prefer_observations: false` to keep raw facts even when an observation already covers them.

Why
Today, recalling all types returns the same information twice — once as a raw fact and once folded into an observation built from it. Users had to choose between:
This flag gives the best of both: recall everything, but prefer the observation whenever it already covers a raw fact.
How
- Each observation records the raw facts it was consolidated from in `memory_units.source_memory_ids`. Dedup is an exact id-membership filter over that set, not semantic guessing. A fact that is semantically similar to an observation but not in its source list survives.
- Enabled by default at the user-facing boundary (HTTP `RecallRequest` + MCP recall tool). The engine method `recall_async` defaults it to `False` so internal callers (notably consolidation, which needs the raw facts it folds into observations) are never silently deduped.

Scope of changes
- `api/http.py`: `RecallRequest.prefer_observations` (default `true`), threaded to engine
- `engine/memory_engine.py`: Step 4.8 dedup in the recall pipeline
- `mcp_tools.py`: flag on both recall tool variants
- `lib/api.ts` types
- `docs/developer/api/recall.mdx`
- `tests/test_recall_prefer_observations.py`: provenance-based dedup (deterministic, no LLM), plus the user-facing-on / engine-off default split

Tests
`uv run pytest tests/test_recall_prefer_observations.py`, 5 passing:
- … `observation` not in `types`
- … `True`, engine default `False`

Behavior-change note for reviewers
Because the default is `true` and a default recall already includes `observation`, raw facts covered by an observation will now drop out of default recalls unless `prefer_observations: false` is passed. This is intentional. Internal recall paths (reflect, consolidation, mental-model triggers) are unaffected: they use the engine default (`False`).
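The default split described above can be illustrated with a small sketch. Only `RecallRequest`, `prefer_observations`, and the two default values come from this PR. The dataclass is a trimmed stand-in for the real request model, and `engine_recall`, `handle_recall`, and `consolidation_recall` are hypothetical names used to show how the flag might be threaded.

```python
from dataclasses import dataclass


@dataclass
class RecallRequest:
    """Trimmed, hypothetical stand-in for the HTTP request model."""
    query: str
    prefer_observations: bool = True  # user-facing default: on


def engine_recall(query: str, *, prefer_observations: bool = False) -> dict:
    """Hypothetical engine entry point; the engine-level default is off."""
    # A real engine would run the recall pipeline here; the sketch only
    # reports which flag value actually reached the engine.
    return {"query": query, "prefer_observations": prefer_observations}


def handle_recall(req: RecallRequest) -> dict:
    # User-facing boundary: always pass the request's value explicitly.
    return engine_recall(req.query, prefer_observations=req.prefer_observations)


def consolidation_recall(query: str) -> dict:
    # Internal caller: relies on the engine default, so raw facts are kept.
    return engine_recall(query)


default_user_call = handle_recall(RecallRequest("coffee preferences"))
opt_out_user_call = handle_recall(
    RecallRequest("coffee preferences", prefer_observations=False)
)
internal_call = consolidation_recall("coffee preferences")
```

Keeping the engine default at `False` means a new internal caller has to opt in to dedup deliberately, while API and MCP users get it without changing their requests.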