fix: FK-edge JOIN bugs + VLP-WITH CTE column mismatch#75
Merged
Conversation
- Fix anchor_table_name resolution in join_builder.rs to unwrap through Projection/Filter wrappers and check both left and right GraphRel connections (not just right). Prevents redundant self-JOIN when edge table equals FROM node table. - Fix self-referencing FK-edge JOIN condition in inference.rs to use from_id/to_id columns instead of node PK columns. For self-ref FK-edge (e.g., parent_id → object_id on same table), generates correct child.parent_id = parent.object_id instead of child.object_id = parent.parent_id.
When an undirected pattern (n)-[r]-(m) is used with FK-edge schemas where from_node != to_node, the bidirectional union expander now validates each direction branch against the schema and silently skips invalid ones. - Added is_valid_direction_branch() to check node labels against schema - Added extract_source_table_from_plan() for unlabeled node fallback - Self-referencing FK-edge keeps both directions (from_node == to_node) - Standard schemas unaffected (FOLLOWS is User->User, both valid)
find_id_column_for_alias() was returning 'start_id' for VLP endpoints before checking GraphNode branches, causing the WITH CTE JOIN to reference 'a_start_id' (nonexistent) instead of 'a_user_id'. Fix: traverse left/right branches first to find the actual GraphNode ID column; only fall back to VLP start_id/end_id for denormalized schemas where no separate node table exists. Fixes click-to-expand in Neo4j Browser.
Ensures the browser expand query pattern (WITH + VLP) generates a JOIN that references the node's actual ID column (a_user_id) instead of the VLP's generic start_id (a_start_id).
There was a problem hiding this comment.
Pull request overview
Fixes several query-planning/rendering bugs around FK-edge handling and VLP+WITH CTE joins, including a Neo4j Browser click-to-expand regression.
Changes:
- Fix FK-edge JOIN condition generation (including self-referencing FK-edges) and improve anchor table detection for FK-edge join de-duplication.
- Skip invalid direction branches when expanding undirected FK-edge patterns into UNION ALL branches.
- Fix VLP+WITH CTE JOIN column resolution by prioritizing the actual node ID column over VLP
start_id/end_idfallbacks; add an integration regression test.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| tests/rust/integration/complex_feature_tests.rs | Adds regression test ensuring VLP↔WITH CTE JOIN uses the node’s real ID column (e.g. a_user_id) rather than a_start_id. |
| src/render_plan/plan_builder.rs | Reorders find_id_column_for_alias() resolution to prefer actual node ID columns; falls back to VLP start_id/end_id only when node lookup fails. |
| src/render_plan/join_builder.rs | Improves anchor table derivation (unwraps Projection/Filter, checks both left/right nodes) to support FK-edge JOIN de-duplication. |
| src/query_planner/analyzer/graph_join/inference.rs | Fixes FK-edge JOIN condition selection for self-referencing relationships and avoids redundant/self JOIN behavior. |
| src/query_planner/analyzer/bidirectional_union.rs | Adds schema-aware filtering of invalid direction branches during undirected expansion. |
Address PR review: is_valid_direction_branch() now iterates all labels in multi-type patterns ([:TYPE1|TYPE2]) instead of only checking the first. A branch is rejected only when ALL labels are invalid for the direction, since each type generates its own UNION branch.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Three bug fixes for schema-variant edge handling and browser click-to-expand:
1. FK-edge redundant self-JOIN and self-ref JOIN condition (
05bc862)orders_fk JOIN orders_fk) instead of using the FK column directly. Self-referencing FK-edges (e.g.,fs_objects_single.parent_id) generated wrong JOIN conditions.from_id/to_idfor self-ref FK-edges.inference.rs,join_builder.rs2. Skip invalid direction branches for FK-edge undirected patterns (
d0db706)(a)-[r]-(b)generated UNION ALL with both directions even for non-self-referencing FK-edges where only one direction is valid (e.g., Order→Customer has no Customer→Order path).is_valid_direction_branch()that validates each direction branch against the schema before generating UNION ALL. Self-referencing relationships correctly keep both directions.bidirectional_union.rs3. VLP+WITH CTE JOIN uses node ID instead of VLP start_id (
e9e7316)a_start_id cannot be resolved. Thefind_id_column_for_alias()function returned VLP's genericstart_idbefore checking actual GraphNode branches, causing the WITH→VLP CTE JOIN to reference a nonexistent column.GraphReltraversal — check left/right branches first to find the node's real ID column (e.g.,user_id), fall back to VLPstart_id/end_idonly for denormalized schemas without separate node tables.plan_builder.rsRegression test
test_vlp_with_cte_join_uses_node_id_not_start_id— asserts browser expand query generatesa_user_id(nota_start_id) in the VLP↔WITH CTE JOIN condition.Test Results
Diff Stats
5 files changed, 228 insertions(+), 30 deletions(-)