
fix(recall): allow exact filtering of untagged observations (#2295) #2322

Open
wangzupeng12061 wants to merge 1 commit into vectorize-io:main from wangzupeng12061:fix/recall-untagged-observations

@wangzupeng12061

Contributor

Summary

Fixes #2295.

Allow recall to select only global/untagged observations by using an empty tag
set with tags_match="exact".

This enables users to switch between shared observations and tagged observation
scopes without adding a synthetic "global" tag or maintaining negative filters.

Root cause

The centralized tag filters treated None and [] as "no filter" before
checking the matching mode.

As a result:

  • SQL retrieval applied no filter and returned every scope.
  • Python post-processing returned every result.
  • An exact empty compound tag group excluded all untagged results.
  • Link-expansion skipped Python filtering when the tag list was empty.

This contradicted the existing scope semantics where the empty tag set
represents the global observation scope.
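
The ordering problem can be illustrated with a minimal sketch. The function name `matches_tags_buggy`, its signature, and the simplified handling of the non-exact modes are hypothetical and are not the project's actual code. Only the early return on `None`/`[]` and the `exact` mode come from the description above.

```python
from typing import Optional, Sequence


def matches_tags_buggy(
    item_tags: Optional[Sequence[str]],
    wanted: Optional[Sequence[str]],
    tags_match: str = "any",
) -> bool:
    """Hypothetical pre-fix filter: emptiness is checked before the mode."""
    if not wanted:  # None and [] both land here, whatever tags_match is
        return True  # treated as "no filter", so every scope passes
    item = set(item_tags or [])
    if tags_match == "exact":
        return item == set(wanted)
    # Simplified stand-in for the other modes (assumption, not the real logic).
    return bool(item & set(wanted))


# With an empty tag set in exact mode, a tagged item still passes the filter.
leaked = matches_tags_buggy(["team-a"], [], "exact")
untagged = matches_tags_buggy(None, [], "exact")
```

Because the early return fires first, the `exact` branch is never reached for an empty tag set, so tagged and untagged items are indistinguishable.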

Fix

  • Interpret an empty tag set as the global/untagged scope in exact mode.
  • Match both historical NULL tags and current empty-array tags.
  • Keep parameter indexes unchanged because the empty-scope SQL clause requires
    no bind parameter.
  • Apply the same semantics to flat SQL filters, Python post-processing, and
    compound tag groups.
  • Run link-expansion post-filtering for empty exact scopes.
  • Preserve existing behavior for all other modes: empty tags still mean no
    filtering for any, all, any_strict, and all_strict.
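
A rough sketch of how the empty-scope SQL clause could leave parameter numbering untouched is shown below. The helper `build_tag_clause`, the `tags` column name, the `$n` placeholder style, and the array-overlap stand-in for the non-exact modes are all assumptions made for illustration. The sketch also treats `None` the same as `[]`, which the description does not spell out for exact mode. The real query may additionally need explicit array casts on the placeholders.

```python
from typing import List, Optional, Sequence, Tuple


def build_tag_clause(
    tags: Optional[Sequence[str]],
    tags_match: str,
    next_param: int,
    column: str = "tags",
) -> Tuple[str, List[object], int]:
    """Hypothetical helper: returns (sql_fragment, bind_values, next_param)."""
    if not tags:  # assumption: None is handled like []
        if tags_match == "exact":
            # Global scope: historical NULL rows and current empty arrays.
            # No placeholder is emitted, so next_param is returned unchanged.
            sql = f"({column} IS NULL OR cardinality({column}) = 0)"
            return sql, [], next_param
        # any / all / any_strict / all_strict: empty tags still mean no filter.
        return "", [], next_param
    placeholder = f"${next_param}"
    if tags_match == "exact":
        # Set equality expressed as mutual containment.
        sql = f"({column} @> {placeholder} AND {column} <@ {placeholder})"
        return sql, [list(tags)], next_param + 1
    # Simplified stand-in for the remaining modes: array overlap.
    return f"({column} && {placeholder})", [list(tags)], next_param + 1


clause, values, following = build_tag_clause([], "exact", next_param=3)
```

Here `following` stays at 3, so any filter appended after the tag clause keeps the placeholder index it would have had without the empty exact scope.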

Tests

Added regression coverage for:

  • flat SQL filtering with None and [];
  • bind parameter numbering after an empty exact scope;
  • Python-side filtering of NULL, empty, and tagged results;
  • compound exact-empty tag groups in SQL and Python;
  • recall API filtering that returns an untagged memory while excluding a tagged
    memory.
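
The Python-side case could look roughly like the sketch below. It is not copied from tests/test_tags_visibility.py: `matches_tags`, the test name, and the sample rows are hypothetical, and the non-exact modes are reduced to a simple overlap check.

```python
from typing import List, Optional, Sequence, Tuple


def matches_tags(
    item_tags: Optional[Sequence[str]],
    wanted: Optional[Sequence[str]],
    tags_match: str,
) -> bool:
    """Hypothetical post-fix filter: the mode is consulted for empty tag sets."""
    if not wanted:  # assumption: None is handled like []
        if tags_match == "exact":
            return not item_tags  # NULL (None) and [] both count as untagged
        return True  # other modes: empty tags still mean no filtering
    item = set(item_tags or [])
    if tags_match == "exact":
        return item == set(wanted)
    return bool(item & set(wanted))  # simplified stand-in for other modes


def test_exact_empty_scope_keeps_only_untagged() -> None:
    rows: List[Tuple[str, Optional[List[str]]]] = [
        ("null-tags", None),
        ("empty-tags", []),
        ("tagged", ["team-a"]),
    ]
    kept = [rid for rid, tags in rows if matches_tags(tags, [], "exact")]
    assert kept == ["null-tags", "empty-tags"]
    # Non-exact modes keep the old behavior: nothing is filtered out.
    for mode in ("any", "all", "any_strict", "all_strict"):
        assert all(matches_tags(tags, [], mode) for _, tags in rows)


test_exact_empty_scope_keeps_only_untagged()
```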

Verification

Run on Linux x86_64 with Python 3.11, Node.js 22, and PostgreSQL 17 + pgvector:

  • 91 passed - tests/test_tags_visibility.py
  • 10 passed - tests/test_graph_filtering.py
  • ty check hindsight_api passed
  • Full ./scripts/hooks/lint.sh passed

@wangzupeng12061 marked this pull request as ready for review June 19, 2026 16:57

Development

Successfully merging this pull request may close these issues.

Allow filtering to untagged observations

1 participant