fix: issues #46 + #47 — parser, planner, and executor correctness bugs#54
Merged
farhan-syah merged 9 commits intomainfrom Apr 16, 2026
Merged
Conversation
…tring Move the generated-expression tokenizer/parser from a single nodedb-query/src/expr_parse.rs into nodedb-query/src/expr_parse/ with separate tokenizer.rs and mod.rs files. Expose a new `parse_expr_string` function in nodedb-sql that delegates to sqlparser-rs so arbitrary DEFAULT/CHECK expressions can be parsed and const-folded at plan time without duplicating grammar logic.
Add a join_link field (collection_field, working_table_field) to RecursiveScan that drives proper tree-traversal CTEs. Each iteration now builds a hash-set of values from the frontier's working_field and finds collection rows whose collection_field is in that set, matching the SQL INNER JOIN ON semantics of standard recursive CTEs. Previously the recursive step applied filters to the full collection without any join relationship to the previous iteration, producing incorrect results for parent/child tree queries. Also handle strict (Binary Tuple) encoded collections in the recursive executor by converting through the schema before filter evaluation.
…ypes Fold all non-equi ON predicates with AND instead of silently dropping all but the first, which caused queries with compound ON clauses to produce wrong results. Return explicit errors for NATURAL JOIN and implicit cross-joins (no ON/USING clause) rather than silently succeeding with empty join keys. Add sql_expr_to_bridge_expr_qualified and expr_filter_qualified for join contexts where merged documents use table-qualified field names, and wire the join condition through serialize_join_filters so non-equi ON predicates are evaluated against merged rows alongside WHERE filters.
… cache collisions Replace wrapping integer arithmetic with checked_add/sub/mul/div/rem in the expression evaluator so overflows return None rather than panicking or silently wrapping. Replace the unlimited trigger budget (1-hour wall clock, MAX iterations) with a trigger_default (100k iterations, 10s) to prevent runaway procedural bodies from pinning Control Plane workers indefinitely. Guard the plan cache against 64-bit hash collisions by verifying the cached body SQL matches before returning a cached block, and evicting on mismatch.
The check constraint, constraint validator, RLS, trigger, and materialized view DDL parsers rewrote "NEW.col" references using byte indexing after a to_uppercase() call. This produced incorrect results for non-ASCII identifiers and is unsound for multi-byte UTF-8 sequences. Switch all affected parsers to collect chars and iterate over the char slice, matching the "NEW." prefix with eq_ignore_ascii_case on a 4-char window. Also fix the SQL preprocessor's string literal scanner to advance by 2 on '' escapes instead of continuing without incrementing, which previously caused an infinite loop on inputs with escaped single quotes.
The DEFAULT resolver previously only handled a small set of hard-coded
keywords (NOW, UUID, etc.) and returned None for everything else.
Add a fallback path that parses the DEFAULT string through sqlparser-rs
and runs it through the planner's constant-folding evaluator, enabling
DEFAULT values like upper('hello'), 1 + 2, or concat('a', 'b') to
resolve at insert time without a Data Plane round-trip.
Also replace wrapping integer arithmetic in the const-folder with
checked variants to prevent silent overflow on constant expressions.
Cover the bug-fixes landed in this series: - sql_arithmetic_overflow: checked arithmetic in const-fold and eval - sql_default_expressions: DEFAULT parsing and const-fold fallback - sql_join_correctness: non-equi join predicates, NATURAL JOIN error - sql_parser_string_handling: escaped single-quote infinite-loop fix - sql_procedure_cache_safety: plan-cache hash-collision eviction - sql_recursive_cte: join-link tree-traversal correctness - sql_rls_predicate_parse: UTF-8 safe NEW. rewriting in RLS/DDL parsers - sql_trigger_fuel: trigger budget cap prevents unbounded execution - sql_utf8_expressions: multi-byte identifiers in constraint expressions
This was referenced Apr 16, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes all 15 sub-items across #46 (hand-rolled parser bugs) and #47 (planner/executor correctness bugs).
Issue #46 — Parser string-handling and UTF-8 bugs
find_begin_posnow skips single-quoted string literals so'BEGIN'in a WHEN clause doesn't corrupt the trigger header/body splittrim_matches('(' | ')')replaced withstrip_outer_parens()that removes exactly one balanced pair; same fix applied totrim_matches('\'')on RLS value stringsfind_matching_brace''escape: convertedforloop towhilewithi += 2to actually skip the second quotechar_indices(). Also fixed 3strip_new_prefix/replace_remaining_new_refsfunctions incheck_constraint.rsandvalidate.rsthat had the same&str[byte_index..]panic on multi-byte UTF-8find_trailing_with_options()scans backward forWITH (to distinguish options clause from CTEs and column aliasesIssue #47 — Planner and executor correctness
const_fold::fold_binaryuseschecked_add/sub/mul— overflow returnsNone(unfoldable) instead of panicking or wrappingeval_binary_opuseschecked_*for all integer ops;i64::MIN / -1caught bychecked_div; float division guards against non-finite resultsExecutionBudget::unlimited()replaced withtrigger_default()(100K iterations, 10s wall-clock) — eliminates the 1-hour DoS vectorinline_left/inline_rightalongside collections/aliases/keysjoin_linkfrom the recursive branch's JOIN ON clause; executor implements working-table hash-join with frontier iteration; strict-doc binary-tuple-to-msgpack conversion;PlanKind::MultiRowfor pgwire response formatting. Value-generating CTEs return explicit unsupported errorextract_join_constraintfolds ALL non-equi predicates with AND (was only keeping first). NATURAL JOIN and implicit cross-join return explicit errors. Non-equi conditions serialized as qualifiedFilterOp::Exprpost-filters viasql_expr_to_bridge_expr_qualified(separate from bare-name path used by WHERE/CHECK)ProcedureBlockCachestoresbody_sqlinCacheEntryand verifies equality on hit — 64-bit hash collisions no longer return wrong procedure bodyevaluate_default_exprfalls back toparse_expr_string()+ const-folder for expressions outside the keyword whitelistStructural changes
nodedb-query/src/expr_parse.rssplit intoexpr_parse/{mod.rs, tokenizer.rs}nodedb-sql/src/lib.rsgainsparse_expr_string()for the DEFAULT fallback pathsql_expr_to_bridge_exprsplit into bare-name and_qualifiedvariantsTest plan
scripts/test.shfull SQL suite passesCloses #46, closes #47