
feat(index): support raw-query ivf rq search #7078

Merged: BubbleCal merged 15 commits into main from yang/ivfrq-pr3-split-code-query on Jun 9, 2026

BubbleCal (Contributor) commented Jun 3, 2026

Feature

  • Adds explicit IVF_RQ query_estimator metadata so released indexes without the field continue to read as residual_query, while newly written indexes use raw_query.
  • Implements raw-query IVF_RQ search for new num_bits == 1 indexes and multi-bit split-code indexes, including ex-code factors and runtime-only rotated centroid caches derived from the original IvfModel centroids.
  • Prepares the rotated raw query and split-code lookup tables once per query worker and reuses them across probed partitions; each partition updates only the cluster correction.
  • Relaxes the public IVF_RQ num_bits > 1 gate for supported metrics, including cosine via Lance's normalized-L2 handling.
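
The "prepare once, correct per partition" idea in the third bullet can be illustrated with a small sketch. This is not Lance's implementation: it uses exact residuals instead of quantized codes, a signed permutation as a stand-in for the rotation, and hypothetical helper names (`rotate`, `prepare_query`, `partition_correction`, `estimate_l2`). It only shows why a rotated raw query can be reused across partitions when the rotation is orthogonal, leaving one centroid-dependent term per partition.

```python
import math

# Hypothetical orthogonal rotation: a signed permutation (illustrative only).
PERM = [2, 0, 3, 1]
SIGN = [1.0, -1.0, 1.0, -1.0]

def rotate(v):
    """Apply the stand-in rotation; it preserves inner products."""
    return [SIGN[i] * v[PERM[i]] for i in range(len(v))]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def prepare_query(q):
    """Per-query work, done once: rotate q and cache ||q||^2."""
    return {"rot": rotate(q), "norm2": dot(q, q)}

def partition_correction(ctx, rotated_centroid):
    """Per-partition work: only the <q, c> term depends on the centroid."""
    return dot(ctx["rot"], rotated_centroid)

def estimate_l2(ctx, correction, rotated_residual, x_norm2):
    """||q - x||^2 with x = c + r, using <q, x> = <q, c> + <q, r>."""
    inner = correction + dot(ctx["rot"], rotated_residual)
    return ctx["norm2"] + x_norm2 - 2.0 * inner

q = [0.5, -1.0, 2.0, 0.25]
centroid = [1.0, 0.0, 1.5, -0.5]
x = [1.25, -0.5, 2.0, 0.0]
residual = [xi - ci for xi, ci in zip(x, centroid)]

ctx = prepare_query(q)                                      # once per query
corr = partition_correction(ctx, rotate(centroid))          # once per partition
est = estimate_l2(ctx, corr, rotate(residual), dot(x, x))   # once per row
direct = sum((a - b) ** 2 for a, b in zip(q, x))
```

In this toy setting `est` matches `direct` up to floating-point error; with real quantized codes the row term would be an estimate rather than an exact inner product.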

Compatibility

  • Old IVF_RQ indexes that lack query_estimator metadata still default to the legacy residual-query estimator.
  • Original IVF centroids remain the source of truth for partition assignment, incremental indexing, and persisted metadata.
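
The defaulting rule described above can be sketched as follows. The field name `query_estimator` and the values `residual_query` and `raw_query` come from this description; the JSON layout, the `num_bits` key placement, and the helper names `read_query_estimator` and `write_metadata` are assumptions made for illustration, not the actual Lance metadata code.

```python
import json

LEGACY_ESTIMATOR = "residual_query"  # assumed default for indexes lacking the field
CURRENT_ESTIMATOR = "raw_query"      # value recorded by newly written indexes

def read_query_estimator(metadata_json: str) -> str:
    """Return the recorded estimator, falling back to the legacy default."""
    metadata = json.loads(metadata_json)
    return metadata.get("query_estimator", LEGACY_ESTIMATOR)

def write_metadata(num_bits: int) -> str:
    """Serialize metadata for a new index, always recording the estimator."""
    return json.dumps({"num_bits": num_bits, "query_estimator": CURRENT_ESTIMATOR})

old_index = json.dumps({"num_bits": 1})  # hypothetical pre-existing index metadata
new_index = write_metadata(num_bits=3)
```

Writing the field explicitly on every new index keeps the missing-field case reserved for older indexes, so the reader never has to guess which estimator a new index expects.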

Performance Improvement

The benchmark below was run with search-benchmark on GCP VM yang-agent-00bd-ivfrq-rerun-20260605, dataset gist, k=10, max_threads=1, target_partition_size=4096, no refine. Latencies are converted from CSV seconds to milliseconds.

Provenance:

  • search-benchmark commit: 61ef8f7b97589032a83eeae1e52664be9f035551
  • main Lance baseline commit: 437849118f380d92c1ea849f99996e9072be58df
  • PR branch commit benchmarked: ce548a49766670b80275daae6f1bf97c70e885e4

Additional DBpedia comparison on the same VM, current branch only, dataset dbpedia, k=10, max_threads=1, target_partition_size=4096. For IVF_PQ, sub_vector_dim=8; one extra row includes refine_factor=2 at nprobes=24.

| Index | Config | nprobes | refine | recall@10 | avg ms | p99 ms | QPS | indexing s |
|---|---|---|---|---|---|---|---|---|
| IVF_RQ | num_bits=1 | 8 | - | 0.7917 | 1.59 | 1.98 | 615.8 | 16.45 |
| IVF_RQ | num_bits=1 | 16 | - | 0.8102 | 2.35 | 2.98 | 420.3 | 16.45 |
| IVF_RQ | num_bits=1 | 24 | - | 0.8162 | 3.19 | 3.93 | 311.4 | 16.45 |
| IVF_RQ | num_bits=3 | 8 | - | 0.9014 | 2.14 | 2.63 | 463.8 | 27.01 |
| IVF_RQ | num_bits=3 | 16 | - | 0.9263 | 2.93 | 3.58 | 338.9 | 27.01 |
| IVF_RQ | num_bits=3 | 24 | - | 0.9352 | 3.82 | 4.74 | 261.0 | 27.01 |
| IVF_RQ | num_bits=5 | 8 | - | 0.9207 | 2.32 | 2.80 | 426.2 | 33.93 |
| IVF_RQ | num_bits=5 | 16 | - | 0.9520 | 3.32 | 4.05 | 300.1 | 33.93 |
| IVF_RQ | num_bits=5 | 24 | - | 0.9624 | 4.56 | 5.57 | 218.1 | 33.93 |
| IVF_RQ | num_bits=7 | 8 | - | 0.9278 | 2.84 | 3.39 | 350.3 | 46.76 |
| IVF_RQ | num_bits=7 | 16 | - | 0.9572 | 3.77 | 4.45 | 264.1 | 46.76 |
| IVF_RQ | num_bits=7 | 24 | - | 0.9683 | 4.96 | 5.94 | 200.7 | 46.76 |
| IVF_PQ | sub_vector_dim=8 | 8 | - | 0.7354 | 4.44 | 5.50 | 223.7 | 153.84 |
| IVF_PQ | sub_vector_dim=8 | 16 | - | 0.7447 | 8.05 | 9.68 | 123.6 | 153.84 |
| IVF_PQ | sub_vector_dim=8 | 24 | - | 0.7483 | 12.80 | 14.72 | 78.0 | 153.84 |
| IVF_PQ | sub_vector_dim=8 | 24 | 2 | 0.9133 | 12.84 | 14.96 | 77.7 | 153.84 |
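
For readers who want to reproduce columns like recall@10, avg ms, and p99 ms from raw per-query results, a minimal sketch follows. The exact definitions used by search-benchmark are not stated here, so the nearest-rank percentile and the helper names `recall_at_k` and `latency_summary` are assumptions.

```python
import math

def recall_at_k(retrieved, ground_truth, k=10):
    """Fraction of the true top-k ids that appear in the retrieved top-k."""
    hits = len(set(retrieved[:k]) & set(ground_truth[:k]))
    return hits / k

def latency_summary(latencies_s):
    """Convert per-query latencies from seconds to ms; report mean and p99."""
    ms = sorted(t * 1000.0 for t in latencies_s)
    avg = sum(ms) / len(ms)
    rank = max(1, math.ceil(0.99 * len(ms)))  # nearest-rank percentile (assumed)
    return {"avg_ms": avg, "p99_ms": ms[rank - 1]}
```

Per-query recall values would typically be averaged over the whole query set before being reported.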

Tests

  • cargo fmt --all
  • cargo test -p lance-index raw_query
  • cargo test -p lance-index try_from_batch_
  • cargo test -p lance-index rabit_quantizer
  • cargo test -p lance test_rabitq_distance_types
  • cargo test -p lance test_build_ivf_rq
  • cargo clippy -p lance-index -p lance --tests --benches -- -D warnings
  • uv run make build
  • targeted Python IVF_RQ multi-bit and cosine search test
  • targeted Python distributed IVF_RQ shared-rotation test

@claude claude Bot left a comment

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.

github-actions Bot added the A-python (Python bindings), A-index (Vector index, linalg, tokenizer), and A-format (On-disk format: protos and format spec docs) labels Jun 3, 2026
github-actions Bot (Contributor) commented Jun 3, 2026

Important

This PR touches the Lance format specification.

Substantive changes to the format specification — the .proto definitions
and the spec docs under docs/src/format/ — require a PMC vote before merge.
Minor edits such as typo fixes, wording, or formatting are excluded; use your
judgment.

If this is a meaningful format change:

  • Start a vote following the Lance community voting process.
    Format specification modifications need 3 binding +1 votes (excluding the
    proposer), held on GitHub Discussions, with a minimum voting period of 1 week.
  • Once the vote passes, link the completed vote in this PR. It should not be
    merged until the vote is linked.

github-actions Bot added the enhancement (New feature or request) label Jun 3, 2026
Base automatically changed from yang/ivfrq-num-bits-pr2-bench to main June 5, 2026 06:09
BubbleCal force-pushed the yang/ivfrq-pr3-split-code-query branch from ce548a4 to 763ecc0 on June 8, 2026 05:37
BubbleCal (Contributor, Author) commented:

@claude review

BubbleCal (Contributor, Author) commented:

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 42b122a38b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread rust/lance-index/src/vector/bq/builder.rs

@claude claude Bot left a comment

This is a substantial new feature touching the Lance format spec, the IVF_RQ storage layout, and the core distance-calculator path — I'm leaving it for a human reviewer.

Extended reasoning...

Overview

This PR introduces a new raw_query estimator for IVF_RQ search alongside the existing residual_query estimator, including multi-bit (num_bits > 1) split-code search. Changes span 13 files: a format-spec doc update, new metadata fields (query_estimator), substantial additions to bq/storage.rs (~1100 lines including new ex-code distance tables, FastScan paths, and lower-bound gating), a refactor of the flat index search loops to use new accumulate_topk_with_scratch trait methods, a fix to PERM0_INVERSE in the SIMD dist table (the prior value was incorrect — note the new test_perm0_inverse_matches_perm0 test), and IVF v2 plumbing for per-query rotated-query contexts and per-index rotated-centroid caches.
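
The kind of check that a test such as test_perm0_inverse_matches_perm0 implies can be sketched in a few lines. The permutation values below are illustrative and are not the actual PERM0 or PERM0_INVERSE constants; `invert_permutation` and `is_inverse` are hypothetical helper names.

```python
def invert_permutation(perm):
    """Build the inverse so that inverse[perm[i]] == i for every i."""
    inverse = [0] * len(perm)
    for position, source in enumerate(perm):
        inverse[source] = position
    return inverse

def is_inverse(perm, candidate):
    """Check that candidate undoes perm at every position."""
    return len(perm) == len(candidate) and all(
        candidate[perm[i]] == i for i in range(len(perm))
    )

# Illustrative interleaving permutation, NOT the real PERM0 constant.
EXAMPLE_PERM = [0, 8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6, 14, 7, 15]
EXAMPLE_INVERSE = invert_permutation(EXAMPLE_PERM)
```

Deriving the inverse from the forward table, or at least asserting the round trip in a test, guards against the two hand-written constants drifting apart.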

Security risks

No direct security risks — this is internal indexing code with no auth, crypto, or untrusted-input boundaries.

Level of scrutiny

High. This PR touches the format specification (per the format-change-vote reminder, this needs a PMC vote with 3 binding +1s and a 1-week voting period), modifies the storage layout for newly-built IVF_RQ indexes, and rewrites the search hot path with new lower-bound gating that affects recall/accuracy. The compatibility shim (default_query_estimator_compat → ResidualQuery) is the right approach for old indexes, but the correctness of the new raw-query estimator factors, error-factor gating, and FastScan ex-code path all warrant careful human review.
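
As a generic illustration of lower-bound gating in a top-k loop (not the code in bq/storage.rs), the sketch below skips the expensive distance whenever a cheap lower bound already rules a candidate out. The function name `topk_with_lower_bound` and the `refined_distance` callback are hypothetical.

```python
import heapq

def topk_with_lower_bound(candidates, k, refined_distance):
    """candidates: iterable of (row_id, lower_bound) pairs.

    The expensive refined_distance(row_id) is evaluated only when the
    lower bound could still beat the current k-th best distance.
    """
    heap = []  # max-heap via negated distances: (-distance, row_id)
    evaluations = 0
    for row_id, lower_bound in candidates:
        if len(heap) == k and lower_bound >= -heap[0][0]:
            continue  # gated out: cannot improve the current top-k
        evaluations += 1
        dist = refined_distance(row_id)
        if len(heap) < k:
            heapq.heappush(heap, (-dist, row_id))
        elif dist < -heap[0][0]:
            heapq.heapreplace(heap, (-dist, row_id))
    return sorted((-neg, row_id) for neg, row_id in heap), evaluations

distances = {0: 1.0, 1: 5.0, 2: 0.5, 3: 9.0}
bounds = [(0, 0.8), (1, 4.0), (2, 0.4), (3, 8.0)]
top2, evaluated = topk_with_lower_bound(bounds, 2, distances.__getitem__)
```

When the bound is a true lower bound, gating changes only the amount of work, not the result. If the bound can exceed the refined distance, gated candidates may be true neighbours, so the bound's correctness directly affects recall.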

Other factors

  • The PR has good test coverage: new unit tests for the estimator factors, FastScan ex-code paths, lower-bound gating, cache slice borrowing, and a parameterized integration test for multi-bit L2/Cosine search.
  • A previously-incorrect PERM0_INVERSE constant is silently fixed here — worth a human verifying no callers depended on the buggy value.
  • The Cosine→L2 distance-type rewrite in try_from_batch for raw-query indexes is subtle and could surprise readers; worth confirming the rewrite is correct and intentional.
  • The format-spec change requires a PMC vote that has not yet been linked on this PR.
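
The Cosine→L2 point rests on a standard identity: for unit-length vectors, squared L2 distance equals 2 − 2·cos, so ranking by L2 on normalized vectors matches ranking by cosine distance. The sketch below checks that identity numerically. It assumes non-zero vectors and says nothing about how try_from_batch performs the rewrite; all function names are illustrative.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine_distance(a, b):
    return 1.0 - dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def cosine_from_normalized_l2(a, b):
    """Recover cosine distance from squared L2 on normalized inputs."""
    return squared_l2(normalize(a), normalize(b)) / 2.0
```

Because the mapping between the two distances is monotonic, a top-k search under L2 on normalized vectors returns the same neighbours as a cosine search.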

Comment thread rust/lance-linalg/src/simd/dist_table.rs
Comment thread python/python/tests/test_vector_index.py Outdated
Comment thread docs/src/format/index/vector/index.md
Comment thread rust/lance/src/index/vector/ivf/v2.rs
Comment thread python/python/tests/compat/test_vector_indices.py Outdated
BubbleCal (Contributor, Author) commented:

@claude review

BubbleCal (Contributor, Author) commented:

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Keep them coming!

Xuanwo (Collaborator) left a comment

Thank you!

@BubbleCal BubbleCal merged commit d95c2c2 into main Jun 9, 2026
31 checks passed
@BubbleCal BubbleCal deleted the yang/ivfrq-pr3-split-code-query branch June 9, 2026 08:38