chore: Critical: fix BM25 autogenerate expression normalization#7

Merged
philippemnoel merged 2 commits into main from codex/audit-critical-fixes
Mar 2, 2026

Conversation

@philippemnoel
Member

@philippemnoel philippemnoel commented Mar 1, 2026

Ticket(s) Closed

  • Closes #

What

  • Fix BM25 expression normalization so qualifier stripping does not alter dotted text inside SQL string literals.
  • Preserve pdb namespaces while removing non-pdb relation qualifiers outside literals.
  • Add regression tests for dotted literals and qualifier stripping behavior.

Why

  • The previous normalization path could mangle tokenizer namespaces or dotted literal content (for example regex patterns), which can produce incorrect Alembic BM25 diff behavior.
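To illustrate the failure mode described above, here is a minimal sketch of regex-style qualifier stripping. `naive_strip_qualifiers` is a hypothetical stand-in written for this example, not the code this PR replaced, and the sample expressions are made up. It shows how a pattern that drops every `identifier.` prefix also removes the `pdb` namespace and rewrites dotted text inside a string literal.

```python
import re


def naive_strip_qualifiers(expr: str) -> str:
    """Hypothetical regex-style stripping: drop every `identifier.` prefix."""
    return re.sub(r"\b[A-Za-z_][A-Za-z0-9_]*\.", "", expr)


# The pdb namespace is stripped along with the relation qualifiers.
print(naive_strip_qualifiers("public.items.description::pdb.simple"))
# description::simple

# The dotted regex pattern inside the literal is altered.
print(naive_strip_qualifiers("regexp_replace(t.body, 'a.b', '')"))
# regexp_replace(body, 'b', '')
```

If two expressions are compared after this kind of normalization, they can compare as equal when they differ, or as different when they match, which is the sort of incorrect diff the bullet above refers to.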

How

  • Replace regex-style qualifier stripping with token-aware parsing that tracks SQL single-quoted literals.
  • Strip only non-pdb qualifier prefixes outside string literals.
  • Extend unit tests in tests/unit/test_alembic_unit.py to cover the relevant normalization edge cases.
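The approach in the list above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the code merged in this PR: the function name `strip_qualifiers` is hypothetical, only standard single-quoted literals with `''` escapes are tracked, and double-quoted identifiers and `E'...'` backslash escapes are not handled.

```python
def strip_qualifiers(expr: str) -> str:
    """Drop `qualifier.` prefixes outside string literals, keeping `pdb.`.

    Hypothetical helper for illustration only.
    """
    out = []
    i, n = 0, len(expr)
    while i < n:
        ch = expr[i]
        if ch == "'":
            # Copy the whole single-quoted literal verbatim.
            j = i + 1
            while j < n:
                if expr[j] == "'":
                    if j + 1 < n and expr[j + 1] == "'":
                        j += 2  # '' is an escaped quote; stay in the literal
                        continue
                    j += 1  # closing quote
                    break
                j += 1
            out.append(expr[i:j])
            i = j
        elif ch.isalpha() or ch == "_":
            # Read one identifier token.
            j = i
            while j < n and (expr[j].isalnum() or expr[j] == "_"):
                j += 1
            ident = expr[i:j]
            if j < n and expr[j] == "." and ident.lower() != "pdb":
                i = j + 1  # non-pdb qualifier: skip the name and its dot
            else:
                out.append(ident)
                i = j
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(strip_qualifiers("public.items.description::pdb.simple"))
# description::pdb.simple
print(strip_qualifiers("regexp_replace(t.body, 'a.b', '')"))
# regexp_replace(body, 'a.b', '')
```

Scanning character by character keeps the literal state explicit, so the "inside a string" decision never depends on a regular expression guessing where quotes pair up.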

Tests

  • ruff check .
  • python -m pytest tests/unit/test_alembic_unit.py
  • python -m pytest tests/unit

@philippemnoel philippemnoel force-pushed the codex/audit-critical-fixes branch from 839ba1f to 840919b Compare March 1, 2026 17:01
@philippemnoel philippemnoel changed the title from "Critical: fix BM25 autogenerate expression normalization" to "chore: Critical: fix BM25 autogenerate expression normalization" Mar 1, 2026
@philippemnoel philippemnoel force-pushed the codex/audit-critical-fixes branch from 840919b to 85aea90 Compare March 1, 2026 17:08
@philippemnoel philippemnoel force-pushed the codex/audit-critical-fixes branch from 50f3d27 to b9e068d Compare March 1, 2026 17:40
@philippemnoel philippemnoel merged commit 08cb745 into main Mar 2, 2026
10 checks passed
@philippemnoel philippemnoel deleted the codex/audit-critical-fixes branch March 2, 2026 18:45

2 participants