feat: Add APOC export procedures (Neo4j-compatible)#182

Merged
genezhang merged 2 commits into main from feature/apoc-export-procedures
Mar 6, 2026

Conversation

@genezhang
Owner

Summary

Add Neo4j APOC-compatible export procedures that work across all ClickGraph modes: HTTP server, Bolt protocol, and embedded.

Syntax

CALL apoc.export.csv.query("MATCH (u:User) RETURN u.name", "/tmp/users.csv", {})
CALL apoc.export.json.query("MATCH (u:User) RETURN u", "s3://bucket/data.json", {})
CALL apoc.export.parquet.query("MATCH ...", "output.parquet", {compression: "zstd"})

Architecture

  • Handler intercept pattern (follows PageRank pattern, not ProcedureRegistry)
  • Destination resolver: Maps URI schemes to ClickHouse INSERT INTO FUNCTION table functions
    • Local files → file(), S3 → s3(), GCS → s3(), Azure → azureBlobStorage(), HTTP → url()
  • Inner Cypher → SQL pipeline: Parses the Cypher query argument, runs through full planner/optimizer/renderer
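The scheme-to-table-function mapping in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in apoc_export.rs: the `TableFunction` enum and `resolve_destination` function are hypothetical names, and the sketch assumes any destination without a recognized scheme is treated as a local file.

```rust
/// Hypothetical enum naming the ClickHouse table functions listed above.
#[derive(Debug, PartialEq)]
enum TableFunction {
    File,
    S3,
    AzureBlobStorage,
    Url,
}

impl TableFunction {
    /// Name of the table function as it would appear after INSERT INTO FUNCTION.
    fn name(&self) -> &'static str {
        match self {
            TableFunction::File => "file",
            TableFunction::S3 => "s3",
            TableFunction::AzureBlobStorage => "azureBlobStorage",
            TableFunction::Url => "url",
        }
    }
}

/// Hypothetical resolver: choose a table function from the URI scheme.
/// Assumption: a destination with no recognized scheme is a local file path.
fn resolve_destination(dest: &str) -> TableFunction {
    match dest.split_once("://") {
        // GCS is routed through s3(), as the mapping above states.
        Some(("s3", _)) | Some(("gs", _)) => TableFunction::S3,
        Some(("azure", _)) => TableFunction::AzureBlobStorage,
        Some(("http", _)) | Some(("https", _)) => TableFunction::Url,
        _ => TableFunction::File,
    }
}

fn main() {
    for dest in ["/tmp/users.csv", "s3://bucket/data.json", "output.parquet"] {
        println!("{dest} -> {}()", resolve_destination(dest).name());
    }
}
```

The sketch stops at choosing the function. How the real module turns the destination into that function's arguments (for example, credentials or endpoint URLs) is not described in this PR summary.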

Parser Fix

Fixed standalone CALL parsing when the inner Cypher contains RETURN/UNION keywords. Replaced the naive `input.contains("RETURN")` substring check with a try-standalone-first, verify-consumed-everything approach.
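To illustrate the idea, here is a toy sketch of "try standalone first, then verify everything was consumed". It is not the parser in src/open_cypher_parser/mod.rs: `parse_standalone_call`, `parse_statement`, and `Parsed` are hypothetical names, and the string scanning is deliberately simplified (case-sensitive `CALL`, double-quoted strings only, no escape handling).

```rust
/// Toy standalone-CALL parser (hypothetical). On success returns
/// (procedure name, raw argument text, unconsumed remainder).
fn parse_standalone_call(input: &str) -> Option<(&str, &str, &str)> {
    let rest = input.trim_start().strip_prefix("CALL ")?;
    let open = rest.find('(')?;
    let name = rest[..open].trim();
    // Scan to the matching ')' while skipping double-quoted strings, so
    // keywords or parentheses inside a string argument are never inspected.
    let mut in_string = false;
    let mut depth = 0usize;
    for (off, c) in rest[open..].char_indices() {
        let i = open + off;
        match c {
            '"' => in_string = !in_string,
            '(' if !in_string => depth += 1,
            ')' if !in_string => {
                depth -= 1;
                if depth == 0 {
                    return Some((name, &rest[open + 1..i], &rest[i + 1..]));
                }
            }
            _ => {}
        }
    }
    None
}

#[derive(Debug, PartialEq)]
enum Parsed<'a> {
    StandaloneCall { name: &'a str, args: &'a str },
    GeneralQuery(&'a str),
}

/// Try the standalone form first; accept it only if nothing meaningful is
/// left over. Otherwise fall back to the general query path.
fn parse_statement(input: &str) -> Parsed<'_> {
    if let Some((name, args, rest)) = parse_standalone_call(input) {
        if rest.trim().trim_end_matches(';').trim().is_empty() {
            return Parsed::StandaloneCall { name, args };
        }
    }
    Parsed::GeneralQuery(input)
}

fn main() {
    let export = r#"CALL apoc.export.csv.query("MATCH (u:User) RETURN u.name", "/tmp/users.csv", {})"#;
    println!("{:?}", parse_statement(export));
    println!("{:?}", parse_statement("CALL db.labels() YIELD label RETURN label"));
}
```

In this sketch, the export call is classified as standalone even though its string argument contains RETURN, while a CALL followed by YIELD ... RETURN leaves unconsumed input and falls through to the general path. A substring check cannot tell these two cases apart.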

Changes

| File | Lines | Description |
|------|-------|-------------|
| src/procedures/apoc_export.rs | +549 | Core module: resolver, format mapping, config, SQL builder, 29 tests |
| src/server/handlers.rs | +150 | HTTP export intercept + translate_cypher_to_sql() helper |
| src/server/bolt_protocol/handler.rs | +137 | Bolt export intercept with inner Cypher pipeline |
| clickgraph-embedded/src/connection.rs | +127 | Embedded handle_export_call() + 3 tests |
| src/open_cypher_parser/mod.rs | ±23 | Parser fix for positional args in CALL |
| Docs | +77 | Cypher Language Reference, Embedded Mode wiki, STATUS, CHANGELOG |

Tests

  • Main crate: 1212 unit + 183 integration + 7 e2e + 30 doc tests, all passing
  • Embedded crate: 39 + 10 + 5 = 54 tests, all passing (including 3 new export tests)
  • New tests: 29 unit tests in apoc_export.rs + 3 embedded tests

Add CALL apoc.export.{csv|json|parquet}.query(cypher, destination, config)
for exporting query results to files and cloud storage.

Components:
- src/procedures/apoc_export.rs: Destination resolver, format mapping,
  config parser, SQL builder, arg extractor (29 unit tests)
- HTTP handler intercept with sql_only support
- Bolt protocol handler intercept with full inner Cypher pipeline
- Embedded mode: handle_export_call() in Connection (3 tests)

Parser fix: Standalone CALL with positional args now correctly parsed
even when inner Cypher contains RETURN/UNION keywords. Replaced naive
substring check with try-standalone-first-then-verify approach.

Supported destinations: local files, s3://, gs://, azure://, http(s)://
Supported formats: CSV, JSON, Parquet (with compression config)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI left a comment


Pull request overview

This PR adds Neo4j APOC-compatible export procedures (CALL apoc.export.{csv|json|parquet}.query(cypher, destination, config)) to ClickGraph, enabling users to export query results to local files, S3, GCS, Azure Blob Storage, or HTTP endpoints. It also fixes the standalone CALL parser, which previously failed to parse export calls containing RETURN/UNION in the inner Cypher string.

Changes:

  • Core export module (src/procedures/apoc_export.rs): New module implementing destination URI resolution, format mapping, SQL generation, and argument parsing for APOC export procedures, with 29 unit tests.
  • Handler intercepts: Export procedure handling injected into the HTTP handler (handlers.rs), Bolt protocol handler (bolt_protocol/handler.rs), and embedded mode connection (clickgraph-embedded/src/connection.rs), each running the inner Cypher through the full planning pipeline.
  • Parser fix (src/open_cypher_parser/mod.rs): Replaced naive input.contains("RETURN") check with a try-first-verify-consumed approach, enabling standalone CALL with positional string arguments that happen to contain RETURN or UNION.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.

| File | Description |
|------|-------------|
| src/procedures/apoc_export.rs | New core module: destination resolver, format mapping, config parsing, SQL builder, 29 tests |
| src/procedures/mod.rs | Exposes apoc_export as public submodule |
| src/open_cypher_parser/mod.rs | Parser fix: standalone CALL now correctly handles positional args with inner Cypher strings |
| src/server/handlers.rs | HTTP export intercept + translate_cypher_to_sql() helper function |
| src/server/bolt_protocol/handler.rs | Bolt export intercept with inline inner Cypher pipeline |
| clickgraph-embedded/src/connection.rs | Embedded export handler + 3 new tests |
| docs/wiki/Cypher-Language-Reference.md | New Export Procedures section with syntax, formats, examples |
| docs/wiki/Embedded-Mode.md | APOC export usage examples for embedded mode |
| STATUS.md | Documents APOC export as completed feature |
| CHANGELOG.md | Records feature addition |

1. Add blank line between functions in handlers.rs
2. Remove time_ms from HTTP export response for cross-mode consistency
3. Pass role.as_deref() in Bolt export (was None, bypassing RBAC)
4. Add set_current_schema() in Bolt export handler
5. Use schema_name param with debug log instead of underscore
6. Log export timing instead of including in response
7. Replace string-based APOC detection in embedded with parsed approach

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@genezhang genezhang merged commit 25f3da7 into main Mar 6, 2026
4 checks passed
@genezhang genezhang deleted the feature/apoc-export-procedures branch March 6, 2026 15:54