fix(#48): version-aware binary tuple decoding for ALTER COLUMN by farhan-syah · Pull Request #64 · NodeDB-Lab/nodedb

farhan-syah · 2026-04-16T20:01:52Z

Fixes #48 — ALTER COLLECTION ADD/DROP COLUMN on strict document collections silently corrupted pre-ALTER rows (null-everywhere reads, decode failures on UPDATE) because the catalog schema was mutated without migrating existing rows or providing a read-time compatibility shim.

Summary

Track when each column was added (added_at_version) and keep tombstones for dropped columns (DroppedColumn { def, position, dropped_at_version }) on StrictSchema.
StrictSchema::schema_for_version(v) reconstructs the physical tuple layout at any historical version — excludes columns added after v, re-inserts columns dropped after v at their original positions.
Reader (binary_tuple_to_value / binary_tuple_to_json) detects tuple-version < schema-version and decodes with a sub-schema matching the tuple's physical layout, then virtually fills new columns with their DEFAULT.
parse_default_literal() resolves common SQL defaults ('n/a', 0, true, false, null) at read time.
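The layout reconstruction described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the struct fields are simplified stand-ins, and it assumes that `position` is the column's index in the live column list at the moment of the drop and that every DDL statement bumps the schema version by one.

```rust
// Hypothetical, simplified types; the real definitions in nodedb-types may differ.
#[derive(Clone, Debug, PartialEq)]
pub struct ColumnDef {
    pub name: String,
    pub default: Option<String>,
    pub added_at_version: u16,
}

#[derive(Clone, Debug, PartialEq)]
pub struct DroppedColumn {
    pub def: ColumnDef,
    pub position: usize, // assumed: index in the live list when the drop happened
    pub dropped_at_version: u16,
}

#[derive(Clone, Debug, PartialEq)]
pub struct StrictSchema {
    pub version: u16,
    pub columns: Vec<ColumnDef>,
    pub dropped_columns: Vec<DroppedColumn>,
}

impl StrictSchema {
    /// Rebuild the physical column layout used by tuples written at version `v`.
    pub fn schema_for_version(&self, v: u16) -> StrictSchema {
        // 1. Keep only live columns that already existed at version `v`.
        let mut columns: Vec<ColumnDef> = self
            .columns
            .iter()
            .filter(|c| c.added_at_version <= v)
            .cloned()
            .collect();

        // 2. Select tombstones for columns that existed at `v` but were dropped later.
        let mut revived: Vec<&DroppedColumn> = self
            .dropped_columns
            .iter()
            .filter(|d| d.def.added_at_version <= v && d.dropped_at_version > v)
            .collect();

        // 3. Undo the drops newest-first so each recorded position refers to
        //    the layout that was current when that drop happened.
        revived.sort_by_key(|d| std::cmp::Reverse(d.dropped_at_version));
        for d in revived {
            let at = d.position.min(columns.len()); // guard against stale positions
            columns.insert(at, d.def.clone());
        }

        StrictSchema { version: v, columns, dropped_columns: Vec::new() }
    }
}
```

Undoing drops in reverse chronological order matters when several columns were dropped: each tombstone's position is only meaningful relative to the layout that existed just before that particular drop.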

Coverage

Added 7 regression tests to nodedb/tests/sql_transactions.rs covering the full class — every test asserts that pre-ALTER rows remain readable with correct values after the DDL:

add_column_preserves_pre_alter_row_existing_columns — original columns must not null-everywhere corrupt
add_column_returns_default_for_pre_alter_row — new column virtual-fills with DEFAULT
add_column_then_update_pre_alter_row — UPDATE on schema-mismatched row must not fail "failed to decode Binary Tuple"
multiple_add_columns_preserves_pre_alter_row — compound schema drift
drop_column_preserves_pre_alter_row_remaining_columns — remaining columns keep values
rename_column_preserves_pre_alter_row_value
alter_column_type_preserves_pre_alter_row_value

Test plan

cargo nextest run -p nodedb — 2907/2907 passed
cargo nextest run -p nodedb-strict -p nodedb-types -p nodedb-columnar — 369/369 passed
cargo fmt --all — clean
cargo clippy --all-targets -- -D warnings — no issues

Introduce `added_at_version: u16` on `ColumnDef` to record the schema version at which a column was added. Columns present at collection creation default to version 1.

Add a `DroppedColumn` tombstone struct that captures the full column definition, its ordinal position, and the schema version at which it was removed. `StrictSchema` now carries a `dropped_columns` list so the physical layout of any historical tuple version can be reconstructed without row migration.

New helpers on `StrictSchema`:

- `schema_for_version(v)`: builds a sub-schema matching the physical layout of tuples written at version `v` by excluding later-added columns and re-inserting dropped columns at their original positions.
- `parse_default_literal(expr)`: evaluates a SQL DEFAULT expression (string, boolean, integer, float, NULL) to a `Value` at read time.

`DroppedColumn` is re-exported from `nodedb-types::columnar`.
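As an illustration of read-time DEFAULT evaluation, a literal parser along these lines would cover the cases listed above. It is a sketch under assumptions: the `Value` enum is a hypothetical stand-in for the real type, the function is written as a free function for brevity, and falling back to `Null` for unrecognised expressions is a guess rather than confirmed behaviour.

```rust
// Hypothetical stand-in for the real `Value` type.
#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Integer(i64),
    Float(f64),
    String(String),
}

/// Evaluate a simple SQL DEFAULT literal to a `Value`.
pub fn parse_default_literal(expr: &str) -> Value {
    let s = expr.trim();

    // Keywords are matched case-insensitively.
    if s.eq_ignore_ascii_case("null") {
        return Value::Null;
    }
    if s.eq_ignore_ascii_case("true") {
        return Value::Bool(true);
    }
    if s.eq_ignore_ascii_case("false") {
        return Value::Bool(false);
    }

    // Single-quoted string; a doubled quote ('') inside stands for one quote.
    if s.len() >= 2 && s.starts_with('\'') && s.ends_with('\'') {
        let inner = &s[1..s.len() - 1];
        return Value::String(inner.replace("''", "'"));
    }

    // Try integer first so that "0" does not become a float.
    if let Ok(i) = s.parse::<i64>() {
        return Value::Integer(i);
    }
    if let Ok(f) = s.parse::<f64>() {
        return Value::Float(f);
    }

    // Assumption: anything else (function calls, expressions) falls back to NULL.
    Value::Null
}
```

Expressions that need evaluation context, such as function calls, are outside what a literal parser like this can resolve.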

ALTER TABLE ... ADD COLUMN now stamps the new column's `added_at_version` with the bumped schema version before appending it to the live column list, so the read path can distinguish columns that did not exist when older tuples were written.

ALTER TABLE ... DROP COLUMN now records a `DroppedColumn` tombstone (definition, original position, version at drop) instead of silently discarding the column definition. This allows the reader to reconstruct the physical layout of any tuple written before the drop without requiring row migration.

The CONVERT path initialises `dropped_columns` to an empty vec to keep all `StrictSchema` construction sites consistent.
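The DDL-side bookkeeping might look something like the sketch below. The method names `add_column` and `drop_column` are hypothetical, the types are trimmed to the fields that matter here, and recording the bumped (post-DDL) version in `dropped_at_version` is an assumption; the commit message only says "version at drop".

```rust
// Hypothetical, simplified types for illustration only.
#[derive(Clone, Debug, PartialEq)]
pub struct ColumnDef {
    pub name: String,
    pub added_at_version: u16,
}

#[derive(Clone, Debug, PartialEq)]
pub struct DroppedColumn {
    pub def: ColumnDef,
    pub position: usize,
    pub dropped_at_version: u16,
}

#[derive(Clone, Debug, PartialEq)]
pub struct StrictSchema {
    pub version: u16,
    pub columns: Vec<ColumnDef>,
    pub dropped_columns: Vec<DroppedColumn>,
}

impl StrictSchema {
    /// ADD COLUMN: bump the version, stamp the column, then append it.
    pub fn add_column(&mut self, mut def: ColumnDef) {
        self.version += 1;
        def.added_at_version = self.version;
        self.columns.push(def);
    }

    /// DROP COLUMN: move the definition into a tombstone instead of discarding it.
    /// Returns false (and leaves the schema untouched) if the column is unknown.
    pub fn drop_column(&mut self, name: &str) -> bool {
        let Some(position) = self.columns.iter().position(|c| c.name == name) else {
            return false;
        };
        self.version += 1;
        let def = self.columns.remove(position);
        self.dropped_columns.push(DroppedColumn {
            def,
            position, // index in the live list just before removal
            dropped_at_version: self.version, // assumed: the bumped version
        });
        true
    }
}
```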

…tions

`binary_tuple_to_value` now detects when a stored tuple's schema version is behind the current catalog version and decodes using `schema_for_version` to match the physical column layout at write time. Columns added after the tuple's version are filled with their DEFAULT value (or NULL) rather than causing an index-out-of-bounds or returning corrupt data.

`binary_tuple_to_json` is refactored to delegate to `binary_tuple_to_value` so the version-aware path is shared across both read modes without duplication.

Remaining `StrictSchema` construction sites in the executor initialise `dropped_columns` to keep all call sites consistent.

Add integration tests covering the full lifecycle of schema-altering DDL on a strict collection:

- Pre-ALTER rows return correct values for existing columns after ADD COLUMN
- Pre-ALTER rows return the column DEFAULT for newly added columns
- Updating a pre-ALTER row migrates it to the current schema
- DROP COLUMN leaves pre-drop rows readable for surviving columns
- Multiple ADD COLUMN operations in sequence remain readable
- RENAME COLUMN and ALTER COLUMN TYPE on pre-existing rows
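The read-path idea can be sketched as below. This is not the actual `binary_tuple_to_value`: `StoredTuple` is a hypothetical, already-decoded stand-in for the binary format, `read_row` is an invented name, the `physical` layout is assumed to come from something like `schema_for_version(tuple.schema_version)`, and columns are matched by name, so the sketch does not attempt to model RENAME COLUMN.

```rust
// Hypothetical stand-ins; the real reader works on the encoded binary tuple.
#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    Null,
    Integer(i64),
    String(String),
}

#[derive(Clone, Debug)]
pub struct ColumnDef {
    pub name: String,
    pub default: Option<Value>, // already-evaluated DEFAULT, if any
    pub added_at_version: u16,
}

pub struct StoredTuple {
    pub schema_version: u16,
    pub values: Vec<Value>,
}

/// Project a stored tuple onto the current column list.
/// `physical` is the column layout at the tuple's own schema version.
pub fn read_row(
    current: &[ColumnDef],
    physical: &[ColumnDef],
    tuple: &StoredTuple,
) -> Result<Vec<(String, Value)>, String> {
    if physical.len() != tuple.values.len() {
        return Err(format!(
            "layout has {} columns but tuple has {} values",
            physical.len(),
            tuple.values.len()
        ));
    }

    let row: Vec<(String, Value)> = current
        .iter()
        .map(|col| {
            let value = if col.added_at_version > tuple.schema_version {
                // Column did not exist when the tuple was written: virtual fill.
                col.default.clone().unwrap_or(Value::Null)
            } else {
                // Column existed: look it up in the tuple's physical layout.
                physical
                    .iter()
                    .position(|p| p.name == col.name)
                    .map(|i| tuple.values[i].clone())
                    .unwrap_or(Value::Null)
            };
            (col.name.clone(), value)
        })
        .collect();

    Ok(row)
}
```

Values belonging to columns that were dropped after the tuple was written are still present in `tuple.values`, but they are simply never selected because the projection iterates over the current column list.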

farhan-syah added 3 commits April 17, 2026 03:59

farhan-syah mentioned this pull request Apr 16, 2026

ALTER COLLECTION ADD COLUMN zombifies existing rows — schema bumped without data migration #48

Closed

farhan-syah merged commit aa5f1ee into main Apr 16, 2026
2 checks passed

farhan-syah deleted the fix/issue-48-alter-column-schema-evolution branch April 16, 2026 20:06

farhan-syah merged 3 commits into main from fix/issue-48-alter-column-schema-evolution
