Merged
Changes from 6 commits
8 changes: 6 additions & 2 deletions .github/copilot-instructions.md
@@ -330,8 +330,12 @@ ecommerce_simple.yaml → schemas/examples/
- Protocol enabling/disabling capabilities

**Embedded Mode + Language Bindings**
- `clickgraph-embedded` crate: in-process Cypher queries via chdb (Kuzu-compatible API)
- `clickgraph-ffi` UniFFI crate: single source of truth for Go and Python bindings
- `clickgraph-embedded` crate: Kuzu-compatible sync API. Three constructors:
- `Database::sql_only(path)` — Cypher→SQL only, no executor (always available)
- `Database::new_remote(path, RemoteConfig)` — execute against external ClickHouse (no chdb)
- `Database::new(path, SystemConfig)` — in-process chdb execution (`embedded` feature, opt-in)
- `clickgraph-ffi` UniFFI crate: single source of truth for Go and Python bindings (always uses `embedded` feature)
- `clickgraph-tool` crate: `cg` CLI binary for agents/scripts — `sql`, `validate`, `query`, `nl`, `schema show/validate/discover/diff`
- Hybrid remote query + local storage: `RemoteConfig` enables `query_remote()`, `query_remote_graph()`, `store_subgraph()` for querying a remote ClickHouse cluster and storing results locally
- Write API: `create_node()`, `create_edge()`, `upsert_node()`, `store_subgraph()` with batch variants

4 changes: 4 additions & 0 deletions .gitignore
@@ -127,6 +127,10 @@ docs1/
# Agent documentation for AI-assisted development
!**/AGENTS.md

# Agent skills (publishable, for end-users)
!skills/
!skills/**

# Benchmark results
!benchmarks/ldbc_snb/BENCHMARK_RESULTS.md

20 changes: 18 additions & 2 deletions CHANGELOG.md
@@ -1,8 +1,24 @@
## [0.6.5-dev] - 2026-03-29
## [0.6.6-dev] - 2026-04-03

### 🚀 Features

- **openCypher TCK runner** (`clickgraph-tck/`): Cucumber-based compatibility test suite running 402 openCypher TCK scenarios in embedded (chdb) mode. Results: **383/402 passed (95.3%), 0 failures, 19 skipped**. Enabled with `CLICKGRAPH_CHDB_TESTS=1 cargo test -p clickgraph-tck --test tck`. Covers MATCH, RETURN, WITH, ORDER BY, SKIP/LIMIT, aggregations, list/null/comparison/boolean/string expressions, and multi-hop graph traversal.
- **`cg` CLI tool** (`clickgraph-tool` crate): Agent/script-oriented CLI for Cypher translation and execution without a running server. Commands: `cg sql` (Cypher→SQL), `cg validate` (parse + plan check), `cg query` (execute via remote ClickHouse), `cg nl` (NL→Cypher via LLM), `cg schema show/validate/discover/diff`. Config via `~/.config/cg/config.toml`. Supports Anthropic (default) and any OpenAI-compatible API.

- **`embedded` feature now opt-in** in `clickgraph-embedded`: chdb is no longer compiled by default. New `Database::new_remote(schema, RemoteConfig)` constructor executes Cypher against external ClickHouse with no chdb dependency — the backend used by `cg query`. `Database::sql_only(schema)` and `Connection::query_to_sql()` are always available for translation-only use.

- **Agent skills** (`skills/`): Three publishable agent skills for Claude Code, LangChain, AutoGen, CrewAI, and OpenAI function calling — `/cypher` (NL→Cypher→SQL→execute), `/graph-schema` (show + validate schema), `/schema-discover` (generate schema YAML from ClickHouse via LLM). See `skills/README.md` for installation across frameworks.

- **openCypher TCK runner** (`clickgraph-tck/`): Cucumber-based compatibility test suite running 402 openCypher TCK scenarios in embedded (chdb) mode. Results: **383/402 passed (95.3%), 0 failures, 19 skipped**. The 19 skipped scenarios cover Cypher write clauses (`CREATE`, `SET`, `DELETE`, `MERGE`), which are not yet supported as Cypher syntax; the programmatic write API (`create_node()`, `create_edge()`, `upsert_node()`) is already available in embedded mode. Enabled with `CLICKGRAPH_CHDB_TESTS=1 cargo test -p clickgraph-tck --test tck`.
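
As a quick consistency check on the TCK figures reported above, the pass rate follows directly from the scenario counts. This is a minimal Python sketch; the variable names are illustrative and are not part of the TCK runner.

```python
# Counts as reported in the changelog entry above.
passed, failed, skipped = 383, 0, 19

# Every scenario is either passed, failed, or skipped.
total = passed + failed + skipped

# Pass rate as a percentage, rounded to one decimal place.
pass_rate = round(100 * passed / total, 1)

print(f"{passed}/{total} passed ({pass_rate}%)")
```

The three counts sum to the 402 scenarios stated in the entry, and the rounded rate matches the reported 95.3%.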

### 🐛 Bug Fixes

- **Debug println removed**: Eliminated leftover `println!("DEBUG TryFrom RenderExpr: ...")` in `render_plan/render_expr.rs` that was polluting stdout during query translation.

---

## [0.6.5-dev] - 2026-03-29

### 🚀 Features

- **Hybrid remote query + local storage** (PR #240): Execute Cypher queries against a remote ClickHouse cluster from embedded mode, then store results locally in chdb as a subgraph for fast re-querying. New `RemoteConfig` for `SystemConfig`, plus `Connection` methods: `query_remote()`, `query_remote_graph()`, `query_graph()`, `store_subgraph()`. New `GraphResult` structured output and `StoreStats` return type. Available in Rust, Python (UniFFI), and Go (UniFFI) bindings.

86 changes: 56 additions & 30 deletions CLAUDE.md
@@ -9,6 +9,7 @@ ClickGraph is a **read-only graph query engine** for ClickHouse, written in Rust
**Modes of operation:**
- **Server mode** — HTTP (axum) + Bolt v5.8 protocol servers, querying a remote ClickHouse instance
- **Embedded mode** — In-process serverless execution via chdb (ClickHouse embedded). Query Parquet, S3, Iceberg, Delta Lake directly without a running server
- **Remote mode** — Cypher translated locally, executed against an external ClickHouse (no chdb needed)
- **SQL-only mode** — Translate Cypher to SQL without executing (for debugging, testing, or external execution)

**Ground rules**: (1) Never change query semantics — honestly return what is asked, no more, no less. (2) No shortcuts — fully understand the processing flow before making changes. Quality over speed.
@@ -21,10 +22,11 @@ clickgraph-embedded/ # Embedded Rust API (Database/Connection/QueryResul
clickgraph-ffi/ # UniFFI FFI layer (cdylib — single source of truth for all bindings)
clickgraph-go/ # Idiomatic Go bindings via cgo + UniFFI-generated C bridge
clickgraph-py/ # Pythonic wrapper over UniFFI-generated ctypes bridge
clickgraph-client/ # CLI client for querying ClickGraph servers
clickgraph-client/ # Interactive REPL client for querying ClickGraph servers (human use)
clickgraph-tool/ # cg CLI — agent/script-oriented tool (sql, validate, query, nl, schema)
```

**Workspace members** (in `Cargo.toml`): `clickgraph-client`, `clickgraph-embedded`, `clickgraph-ffi`
**Workspace members** (in `Cargo.toml`): `clickgraph-client`, `clickgraph-embedded`, `clickgraph-ffi`, `clickgraph-tool`

Go and Python bindings are not Cargo workspace members — they consume `libclickgraph_ffi.so`.

@@ -42,7 +44,7 @@ cargo fmt --all
# Lint
cargo clippy --all-targets

# Rust tests (~1,560 tests across workspace)
# Rust tests (~1,600 tests across workspace)
cargo test # All Rust tests
cargo test <test_name> # Single test
cargo test -- --nocapture # With output
@@ -71,6 +73,17 @@ cargo run --bin clickgraph
curl -X POST http://localhost:8080/query \
  -H "Content-Type: application/json" \
  -d '{"query":"MATCH (n) RETURN n","sql_only":true}'

# cg CLI: agent/script-oriented tool (no server needed)
cg --schema schema.yaml sql "MATCH (n:Person) RETURN n.name"        # translate only
cg --schema schema.yaml validate "MATCH (n:Person) RETURN n"        # parse + plan check
cg --schema schema.yaml \
  --clickhouse http://localhost:8123 \
  query "MATCH (n:Person) RETURN n.name LIMIT 10"                    # execute via remote CH
cg --schema schema.yaml nl "find people with more than 5 friends"   # NL → Cypher
cg --schema schema.yaml schema show                                 # agent-friendly schema view
cg schema discover --clickhouse http://localhost:8123 \
  --database mydb --out schema.yaml                                  # LLM-assisted discovery
```
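
The `curl` call in the block above can also be issued from a script. The sketch below only builds the equivalent HTTP request with the Python standard library; `build_sql_only_request` is a hypothetical helper, and the endpoint and payload fields are copied from the `curl` example. The response format is not described here, so the sketch does not parse it.

```python
import json
import urllib.request


def build_sql_only_request(cypher, base_url="http://localhost:8080"):
    """Build (but do not send) a POST to /query asking for SQL translation only."""
    # Same JSON body as the curl example: the query plus sql_only=true.
    payload = json.dumps({"query": cypher, "sql_only": True}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sql_only_request("MATCH (n) RETURN n")
print(request.get_method(), request.full_url)
```

Sending it is a matter of passing `request` to `urllib.request.urlopen`, which requires a ClickGraph server listening on the assumed address.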

## Architecture — Query Pipeline
@@ -103,31 +116,31 @@ Cypher Query → Parse → Plan → Optimize → Render → Generate SQL → Exe
## Ecosystem Architecture

```
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Go App │ │ Python App │ │ Rust App │
│ (cgo) │ │ (ctypes) │ │ (direct) │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
│ │ │
clickgraph-go clickgraph-py clickgraph-embedded
│ │ │
└────────┬────────┘
┌─────────▼──────────┐
│ clickgraph-ffi │
│ (libclickgraph_ffi │
.so / UniFFI)
└─────────┬──────────┘ │
└─────────────────────────┘
┌──────────▼──────────┐
│ clickgraph (core) │
│ Parser + Planner + │
│ SQL Generator │
└──────────┬──────────┘
┌──────────▼──────────┐
│ ClickHouse / chdb │
└─────────────────────┘
┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Go App │ │ Python App │ │ Rust App │ │ Agent/Script
│ (cgo) │ (ctypes) │ │ (direct) │ │ (cg CLI) │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘
│ │ │
clickgraph-go clickgraph-py clickgraph-embedded clickgraph-tool
│ │ (sql_only/remote)
└────────┬────────┘ (chdb: +embedded feat)
┌─────────▼──────────┐ └────────┬────────┘
│ clickgraph-ffi
│ (libclickgraph_ffi │
│ .so / UniFFI)
└─────────┬──────────┘
└──────────────────┬────────────────┘
┌──────────▼──────────┐
│ clickgraph (core) │
│ Parser + Planner + │
│ SQL Generator │
└──────────┬──────────┘
┌──────────▼──────────┐
│ ClickHouse / chdb │
└─────────────────────┘
```

### FFI Layer (`clickgraph-ffi/`)
@@ -148,7 +161,15 @@ uniffi-bindgen-go --library target/debug/libclickgraph_ffi.so --out-dir clickgra

### Embedded Mode (`clickgraph-embedded/`)

Core Rust crate with Kuzu-compatible sync API (`Database` → `Connection` → `QueryResult`). Backend is chdb (ClickHouse embedded), enabled via `embedded` feature flag. Supports `sql_only` mode without chdb.
Core Rust crate with Kuzu-compatible sync API (`Database` → `Connection` → `QueryResult`). Three constructors:

| Constructor | Needs chdb? | Use case |
|---|---|---|
| `Database::sql_only(schema)` | No | Translate Cypher → SQL only |
| `Database::new_remote(schema, RemoteConfig)` | No | Execute against external ClickHouse |
| `Database::new(schema, SystemConfig)` | **Yes** (`embedded` feature) | In-process chdb execution |

The `embedded` feature flag is **opt-in** (default off). `clickgraph-ffi` and `clickgraph-tck` enable it; `clickgraph-tool` does not.
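
The constructor table and the feature-flag note can be read as a simple availability rule: a constructor that needs chdb is usable only in a build with the `embedded` feature. The Python sketch below models that rule as plain data. It is illustrative only and does not call the Rust crate; `Constructor` and `available_constructors` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constructor:
    name: str
    needs_chdb: bool
    use_case: str


# Rows copied from the constructor table above.
CONSTRUCTORS = (
    Constructor("Database::sql_only", False, "Translate Cypher to SQL only"),
    Constructor("Database::new_remote", False, "Execute against external ClickHouse"),
    Constructor("Database::new", True, "In-process chdb execution"),
)


def available_constructors(embedded_feature):
    """Return constructor names usable in a build with or without `embedded`."""
    return [c.name for c in CONSTRUCTORS if embedded_feature or not c.needs_chdb]


print(available_constructors(embedded_feature=False))
```

Under this reading, a crate such as `clickgraph-tool`, which does not enable `embedded`, is limited to the first two constructors.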

Schema `source:` field supports: local files, `s3://`, `iceberg+s3://`, `delta+s3://`, `table_function:...`.
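
To make the listed `source:` forms concrete, the sketch below classifies a source string by its prefix. It is a hypothetical illustration in Python, not the crate's actual parsing logic; the function name and the returned labels are assumptions.

```python
# Prefixes taken from the list above. The compound schemes are checked before
# plain "s3://" so the ordering does not depend on how the prefixes overlap.
SOURCE_PREFIXES = (
    ("iceberg+s3://", "iceberg"),
    ("delta+s3://", "delta"),
    ("s3://", "s3"),
    ("table_function:", "table_function"),
)


def classify_source(source):
    """Return a label for a schema `source:` value; anything else is a local file."""
    for prefix, kind in SOURCE_PREFIXES:
        if source.startswith(prefix):
            return kind
    return "local_file"


for example in ("data/users.parquet", "s3://bucket/users.parquet", "delta+s3://bucket/table"):
    print(example, "->", classify_source(example))
```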

@@ -218,6 +239,11 @@ Five schema variations exist: Standard, FK-edge, Denormalized, Polymorphic, Comp
| `CLICKGRAPH_CHDB_TESTS` | Set to `1` to enable chdb e2e tests |
| `CLICKGRAPH_LLM_PROVIDER` | LLM provider for schema discovery (`anthropic` or `openai`) |
| `ANTHROPIC_API_KEY` / `OPENAI_API_KEY` | API keys for LLM schema discovery |
| `CG_SCHEMA` | Default schema file path for `cg` CLI |
| `CG_CLICKHOUSE_URL` | ClickHouse URL for `cg query` |
| `CG_CLICKHOUSE_USER` / `CG_CLICKHOUSE_PASSWORD` | Credentials for `cg query` |
| `CG_LLM_PROVIDER` | LLM provider for `cg nl` and `cg schema discover` |
| `CG_LLM_MODEL` / `CG_LLM_API_KEY` / `CG_LLM_BASE_URL` | LLM config for `cg` |
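
A script that wraps `cg` might collect the `CG_*` variables from the table before invoking the tool. The Python sketch below shows one way to do that; `read_cg_env` and `missing_for_query` are hypothetical helpers, and the assumption that `cg query` needs both a schema and a ClickHouse URL is inferred from the usage examples rather than from a documented rule.

```python
import os

# Setting name -> environment variable, as listed in the table above.
CG_ENV_VARS = {
    "schema": "CG_SCHEMA",
    "clickhouse_url": "CG_CLICKHOUSE_URL",
    "clickhouse_user": "CG_CLICKHOUSE_USER",
    "clickhouse_password": "CG_CLICKHOUSE_PASSWORD",
    "llm_provider": "CG_LLM_PROVIDER",
    "llm_model": "CG_LLM_MODEL",
    "llm_api_key": "CG_LLM_API_KEY",
    "llm_base_url": "CG_LLM_BASE_URL",
}


def read_cg_env(environ=None):
    """Collect the CG_* variables that are set and non-empty."""
    environ = os.environ if environ is None else environ
    return {key: environ[var] for key, var in CG_ENV_VARS.items() if environ.get(var)}


def missing_for_query(settings):
    """Assumption: `cg query` needs at least a schema and a ClickHouse URL."""
    return [key for key in ("schema", "clickhouse_url") if key not in settings]


settings = read_cg_env({"CG_SCHEMA": "schema.yaml", "CG_LLM_MODEL": ""})
print(settings, missing_for_query(settings))
```

Passing an explicit mapping instead of reading `os.environ` keeps the helper easy to test.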

## Key Documentation Files

@@ -226,5 +252,5 @@ Five schema variations exist: Standard, FK-edge, Denormalized, Polymorphic, Comp
- **`DEV_QUICK_START.md`** — Essential developer workflow
- **`DEVELOPMENT_PROCESS.md`** — Detailed 6-phase development process
- **`.github/copilot-instructions.md`** — Comprehensive architecture guide
- **`*/AGENTS.md`** — Module-level architecture guides (in `src/`, `src/render_plan/`, `src/server/`, `clickgraph-ffi/`, `clickgraph-embedded/`, `clickgraph-go/`, `clickgraph-py/`, etc.)
- **`*/AGENTS.md`** — Module-level architecture guides (in `src/`, `src/render_plan/`, `src/server/`, `clickgraph-ffi/`, `clickgraph-embedded/`, `clickgraph-tool/`, `clickgraph-go/`, `clickgraph-py/`, etc.)
- **`docs/wiki/cypher-language-reference.md`** — Primary feature documentation (must be updated for every feature)
109 changes: 107 additions & 2 deletions Cargo.lock

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions Cargo.toml
@@ -4,6 +4,7 @@ members = [
"clickgraph-embedded",
"clickgraph-ffi",
"clickgraph-tck",
"clickgraph-tool",
]

[package]
10 changes: 9 additions & 1 deletion DEV_QUICK_START.md
@@ -129,7 +129,15 @@ cargo test && pytest tests/integration/ && echo "✅ ALL TESTS PASSED"

### Check Generated SQL
```bash
# Use sql_only mode for quick debugging
# Option 1: cg CLI (no server needed — fastest for development)
cg --schema benchmarks/social_network/schemas/social_benchmark.yaml \
  sql "MATCH (n:User) RETURN n.name LIMIT 10"

# Validate syntax + planning without execution
cg --schema benchmarks/social_network/schemas/social_benchmark.yaml \
  validate "MATCH (n:User)-[:FOLLOWS]->(f) RETURN f.name"

# Option 2: Via server (if already running)
curl -X POST http://localhost:8080/query \
  -H "Content-Type: application/json" \
  -d '{"query":"MATCH (n) RETURN n","sql_only":true}'