fix(index): invalidate stale index cache entries after index replacement#7094
fix(index): invalidate stale index cache entries after index replacement#7094geserdugarov wants to merge 2 commits into
Conversation
|
@Xuanwo, @wjones127, hi! |
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
|
|
||
| // Bare-UUID entries from caches created without FRI. | ||
| cache | ||
| .invalidate_with_key(&LegacyVectorIndexCacheKey::new(&uuid_str, None)) |
There was a problem hiding this comment.
nit: Also invalidate ScalarIndexDetailsKey?
wjones127
left a comment
There was a problem hiding this comment.
A later read in the same session could then observe stale cached index state that no longer matches the manifest.
I'm confused by this sentence. Does this suggest a later read could read the wrong index cache entry? Or just that those entries will take up memory when they are likely no longer needed?
This PR seems to presuppose that there are no readers intentionally reading an old version of the table. I think that's a bad assumption. This can happen in two ways:
- Because of MVCC, a query that started earlier than the write, but is still ongoing, may still reference that index and want to use it.
- Users can and do checkout older versions of tables. Sometimes they might pin their readers to a version while writers do multiple operations on their table.
I worry this PR is going to hurt performance in those two cases. I wonder if a better solution would instead be a TTL expiry on cache entries. What do you think of that?
What
CreateIndexcommits can retire existing index metadata, for example whenoptimize_indices(...retrain...)replaces an IVF index with a new UUID. If the old index had been prewarmed, the session index cache could still hold root entries and IVF partition entries for the removed UUID.A later read in the same session could then observe stale cached index state that no longer matches the manifest. This clears cache entries for removed index UUIDs after the commit succeeds.
Changes
LanceCacheviainvalidate_with_keyandinvalidate_unsized_with_key.CacheBackendwithinvalidate_entryand implement it for the Moka backend and existing serializing test backends.Operation::CreateIndexcommits, invalidate cache entries for each removed index:uuid/uuid-fri_uuid/entries when a fragment-reuse index is activeremoved_indices, so conflict resolution invalidates the indices actually retired by the committed manifest.Notes
Tests
optimize_indices, and verifies the removed index UUID no longer has cached root or partition entries.