Merged
8 changes: 6 additions & 2 deletions .github/copilot-instructions.md
@@ -330,8 +330,12 @@ ecommerce_simple.yaml → schemas/examples/
- Protocol enabling/disabling capabilities

**Embedded Mode + Language Bindings**
- `clickgraph-embedded` crate: in-process Cypher queries via chdb (Kuzu-compatible API)
- `clickgraph-ffi` UniFFI crate: single source of truth for Go and Python bindings
- `clickgraph-embedded` crate: Kuzu-compatible sync API. Three constructors:
- `Database::sql_only(path)` — Cypher→SQL only, no executor (always available)
- `Database::new_remote(path, RemoteConfig)` — execute against external ClickHouse (no chdb)
- `Database::new(path, SystemConfig)` — in-process chdb execution (`embedded` feature, opt-in)
- `clickgraph-ffi` UniFFI crate: single source of truth for Go and Python bindings (always uses `embedded` feature)
- `clickgraph-tool` crate: `cg` CLI binary for agents/scripts — `sql`, `validate`, `query`, `nl`, `schema show/validate/discover/diff`
- Hybrid remote query + local storage: `RemoteConfig` enables `query_remote()`, `query_remote_graph()`, `store_subgraph()` for querying a remote ClickHouse cluster and storing results locally
- Write API: `create_node()`, `create_edge()`, `upsert_node()`, `store_subgraph()` with batch variants

4 changes: 4 additions & 0 deletions .gitignore
@@ -127,6 +127,10 @@ docs1/
# Agent documentation for AI-assisted development
!**/AGENTS.md

# Agent skills (publishable, for end-users)
!skills/
!skills/**

# Benchmark results
!benchmarks/ldbc_snb/BENCHMARK_RESULTS.md

20 changes: 18 additions & 2 deletions CHANGELOG.md
@@ -1,8 +1,24 @@
## [0.6.5-dev] - 2026-03-29
## [0.6.6-dev] - 2026-04-03

### 🚀 Features

- **openCypher TCK runner** (`clickgraph-tck/`): Cucumber-based compatibility test suite running 402 openCypher TCK scenarios in embedded (chdb) mode. Results: **383/402 passed (95.3%), 0 failures, 19 skipped**. Enabled with `CLICKGRAPH_CHDB_TESTS=1 cargo test -p clickgraph-tck --test tck`. Covers MATCH, RETURN, WITH, ORDER BY, SKIP/LIMIT, aggregations, list/null/comparison/boolean/string expressions, and multi-hop graph traversal.
- **`cg` CLI tool** (`clickgraph-tool` crate): Agent/script-oriented CLI for Cypher translation and execution without a running server. Commands: `cg sql` (Cypher→SQL), `cg validate` (parse + plan check), `cg query` (execute via remote ClickHouse), `cg nl` (NL→Cypher via LLM), `cg schema show/validate/discover/diff`. Config via `~/.config/cg/config.toml`. Supports Anthropic (default) and any OpenAI-compatible API.

- **`embedded` feature now opt-in** in `clickgraph-embedded`: chdb is no longer compiled by default. New `Database::new_remote(schema, RemoteConfig)` constructor executes Cypher against external ClickHouse with no chdb dependency — the backend used by `cg query`. `Database::sql_only(schema)` and `Connection::query_to_sql()` are always available for translation-only use.

- **Agent skills** (`skills/`): Three publishable agent skills for Claude Code, LangChain, AutoGen, CrewAI, and OpenAI function calling — `/cypher` (NL→Cypher→SQL→execute), `/graph-schema` (show + validate schema), `/schema-discover` (generate schema YAML from ClickHouse via LLM). See `skills/README.md` for installation across frameworks.

- **openCypher TCK runner** (`clickgraph-tck/`): Cucumber-based compatibility test suite running 402 openCypher TCK scenarios in embedded (chdb) mode. Results: **383/402 passed (95.3%), 0 failures, 19 skipped**. The 19 skipped scenarios cover Cypher write clauses (`CREATE`, `SET`, `DELETE`, `MERGE`), which are not yet supported as Cypher syntax; the programmatic write API (`create_node()`, `create_edge()`, `upsert_node()`) is already available in embedded mode. Enabled with `CLICKGRAPH_CHDB_TESTS=1 cargo test -p clickgraph-tck --test tck`.

### 🐛 Bug Fixes

- **Debug println removed**: Eliminated leftover `println!("DEBUG TryFrom RenderExpr: ...")` in `render_plan/render_expr.rs` that was polluting stdout during query translation.

---

## [0.6.5-dev] - 2026-03-29

### 🚀 Features

- **Hybrid remote query + local storage** (PR #240): Execute Cypher queries against a remote ClickHouse cluster from embedded mode, then store results locally in chdb as a subgraph for fast re-querying. New `RemoteConfig` for `SystemConfig`, plus `Connection` methods: `query_remote()`, `query_remote_graph()`, `query_graph()`, `store_subgraph()`. New `GraphResult` structured output and `StoreStats` return type. Available in Rust, Python (UniFFI), and Go (UniFFI) bindings.

86 changes: 56 additions & 30 deletions CLAUDE.md
@@ -9,6 +9,7 @@ ClickGraph is a **read-only graph query engine** for ClickHouse, written in Rust
**Modes of operation:**
- **Server mode** — HTTP (axum) + Bolt v5.8 protocol servers, querying a remote ClickHouse instance
- **Embedded mode** — In-process serverless execution via chdb (ClickHouse embedded). Query Parquet, S3, Iceberg, Delta Lake directly without a running server
- **Remote mode** — Cypher translated locally, executed against an external ClickHouse (no chdb needed)
- **SQL-only mode** — Translate Cypher to SQL without executing (for debugging, testing, or external execution)

**Ground rules**: (1) Never change query semantics — honestly return what is asked, no more, no less. (2) No shortcuts — fully understand the processing flow before making changes. Quality over speed.
@@ -21,10 +22,11 @@ clickgraph-embedded/ # Embedded Rust API (Database/Connection/QueryResul
clickgraph-ffi/ # UniFFI FFI layer (cdylib — single source of truth for all bindings)
clickgraph-go/ # Idiomatic Go bindings via cgo + UniFFI-generated C bridge
clickgraph-py/ # Pythonic wrapper over UniFFI-generated ctypes bridge
clickgraph-client/ # CLI client for querying ClickGraph servers
clickgraph-client/ # Interactive REPL client for querying ClickGraph servers (human use)
clickgraph-tool/ # cg CLI — agent/script-oriented tool (sql, validate, query, nl, schema)
```

**Workspace members** (in `Cargo.toml`): `clickgraph-client`, `clickgraph-embedded`, `clickgraph-ffi`
**Workspace members** (in `Cargo.toml`): `clickgraph-client`, `clickgraph-embedded`, `clickgraph-ffi`, `clickgraph-tool`

Go and Python bindings are not Cargo workspace members — they consume `libclickgraph_ffi.so`.

@@ -42,7 +44,7 @@ cargo fmt --all
# Lint
cargo clippy --all-targets

# Rust tests (~1,560 tests across workspace)
# Rust tests (~1,600 tests across workspace)
cargo test # All Rust tests
cargo test <test_name> # Single test
cargo test -- --nocapture # With output
@@ -71,6 +73,17 @@ cargo run --bin clickgraph
curl -X POST http://localhost:8080/query \
-H "Content-Type: application/json" \
-d '{"query":"MATCH (n) RETURN n","sql_only":true}'

# cg CLI — agent/script-oriented tool (no server needed)
cg --schema schema.yaml sql "MATCH (n:Person) RETURN n.name" # translate only
cg --schema schema.yaml validate "MATCH (n:Person) RETURN n" # parse + plan check
cg --schema schema.yaml \
--clickhouse http://localhost:8123 \
query "MATCH (n:Person) RETURN n.name LIMIT 10" # execute via remote CH
cg --schema schema.yaml nl "find people with more than 5 friends" # NL → Cypher
cg --schema schema.yaml schema show # agent-friendly schema view
cg schema discover --clickhouse http://localhost:8123 \
--database mydb --out schema.yaml # LLM-assisted discovery
```

## Architecture — Query Pipeline
@@ -103,31 +116,31 @@ Cypher Query → Parse → Plan → Optimize → Render → Generate SQL → Exe
## Ecosystem Architecture

```
┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│   Go App     │  │  Python App  │  │   Rust App   │
│   (cgo)      │  │  (ctypes)    │  │   (direct)   │
└──────┬───────┘  └──────┬───────┘  └──────┬───────┘
       │                 │                 │
 clickgraph-go     clickgraph-py  clickgraph-embedded
       │                 │                 │
       └────────┬────────┘                 │
      ┌─────────▼──────────┐               │
      │   clickgraph-ffi   │               │
      │ (libclickgraph_ffi │               │
      │   .so / UniFFI)    │               │
      └─────────┬──────────┘               │
                └────────────┬─────────────┘
                  ┌──────────▼──────────┐
                  │  clickgraph (core)  │
                  │  Parser + Planner + │
                  │    SQL Generator    │
                  └──────────┬──────────┘
                  ┌──────────▼──────────┐
                  │  ClickHouse / chdb  │
                  └─────────────────────┘
┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│   Go App     │  │  Python App  │  │   Rust App   │  │ Agent/Script │
│   (cgo)      │  │  (ctypes)    │  │   (direct)   │  │  (cg CLI)    │
└──────┬───────┘  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘
       │                 │                 │                 │
 clickgraph-go     clickgraph-py  clickgraph-embedded clickgraph-tool
       │                 │                           (sql_only/remote)
       └────────┬────────┘      (chdb: +embedded feat)
      ┌─────────▼──────────┐               └────────┬────────┘
      │   clickgraph-ffi   │                        │
      │ (libclickgraph_ffi │                        │
      │   .so / UniFFI)    │                        │
      └─────────┬──────────┘                        │
                └──────────────────┬────────────────┘
                        ┌──────────▼──────────┐
                        │  clickgraph (core)  │
                        │  Parser + Planner + │
                        │    SQL Generator    │
                        └──────────┬──────────┘
                        ┌──────────▼──────────┐
                        │  ClickHouse / chdb  │
                        └─────────────────────┘
```

### FFI Layer (`clickgraph-ffi/`)
@@ -148,7 +161,15 @@ uniffi-bindgen-go --library target/debug/libclickgraph_ffi.so --out-dir clickgra

### Embedded Mode (`clickgraph-embedded/`)

Core Rust crate with Kuzu-compatible sync API (`Database` → `Connection` → `QueryResult`). Backend is chdb (ClickHouse embedded), enabled via `embedded` feature flag. Supports `sql_only` mode without chdb.
Core Rust crate with Kuzu-compatible sync API (`Database` → `Connection` → `QueryResult`). Three constructors:

| Constructor | Needs chdb? | Use case |
|---|---|---|
| `Database::sql_only(schema)` | No | Translate Cypher → SQL only |
| `Database::new_remote(schema, RemoteConfig)` | No | Execute against external ClickHouse |
| `Database::new(schema, SystemConfig)` | **Yes** (`embedded` feature) | In-process chdb execution |

The `embedded` feature flag is **opt-in** (default off). `clickgraph-ffi` and `clickgraph-tck` enable it; `clickgraph-tool` does not.
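
The relationship between the three constructors and the `embedded` feature can be pictured with a small stand-in model. This is only a sketch: it does not use the real `clickgraph-embedded` types, and `Backend`, `needs_chdb`, and `available` are hypothetical names chosen for illustration. The only facts taken from the table are which constructor needs chdb and that the feature is off by default.

```rust
// Illustrative stand-in only: these are NOT the clickgraph-embedded types.
// `Backend`, `needs_chdb`, and `available` are hypothetical names.

/// The three ways a `Database` can be constructed, per the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Backend {
    /// `Database::sql_only`: translate Cypher to SQL, no executor.
    SqlOnly,
    /// `Database::new_remote`: execute against an external ClickHouse.
    Remote,
    /// `Database::new`: in-process chdb execution.
    Embedded,
}

impl Backend {
    /// Only the in-process backend needs chdb.
    fn needs_chdb(self) -> bool {
        matches!(self, Backend::Embedded)
    }

    /// A backend is usable if it does not need chdb, or if the
    /// `embedded` feature (modelled here as a plain bool) is enabled.
    fn available(self, embedded_feature: bool) -> bool {
        !self.needs_chdb() || embedded_feature
    }
}
```

In this model, `Backend::Remote.available(false)` is true while `Backend::Embedded.available(false)` is false, which mirrors why `clickgraph-tool` can skip the feature and `clickgraph-ffi` cannot.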

Schema `source:` field supports: local files, `s3://`, `iceberg+s3://`, `delta+s3://`, `table_function:...`.

@@ -218,6 +239,11 @@ Five schema variations exist: Standard, FK-edge, Denormalized, Polymorphic, Comp
| `CLICKGRAPH_CHDB_TESTS` | Set to `1` to enable chdb e2e tests |
| `CLICKGRAPH_LLM_PROVIDER` | LLM provider for schema discovery (`anthropic` or `openai`) |
| `ANTHROPIC_API_KEY` / `OPENAI_API_KEY` | API keys for LLM schema discovery |
| `CG_SCHEMA` | Default schema file path for `cg` CLI |
| `CG_CLICKHOUSE_URL` | ClickHouse URL for `cg query` |
| `CG_CLICKHOUSE_USER` / `CG_CLICKHOUSE_PASSWORD` | Credentials for `cg query` |
| `CG_LLM_PROVIDER` | LLM provider for `cg nl` and `cg schema discover` |
| `CG_LLM_MODEL` / `CG_LLM_API_KEY` / `CG_LLM_BASE_URL` | LLM config for `cg` |
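
Since `CG_SCHEMA` is described as a default, one plausible reading is that an explicit `--schema` flag wins and the variable is the fallback. The sketch below models that lookup order. It is an assumption for illustration, not the actual `cg` implementation: `resolve_schema` is a hypothetical helper, and the environment is passed in as a map so the logic can be exercised without touching the real process environment.

```rust
use std::collections::HashMap;

// Hypothetical helper, not part of `cg`. The precedence shown here
// (explicit flag first, then the CG_SCHEMA variable) is an assumption
// inferred from the word "default" in the table above.
fn resolve_schema(flag: Option<&str>, env: &HashMap<String, String>) -> Option<String> {
    flag.map(str::to_owned)
        .or_else(|| env.get("CG_SCHEMA").cloned())
}
```

A caller could build the map from the live environment with `std::env::vars().collect::<HashMap<String, String>>()` and pass the parsed flag value as the first argument.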

## Key Documentation Files

@@ -226,5 +252,5 @@ Five schema variations exist: Standard, FK-edge, Denormalized, Polymorphic, Comp
- **`DEV_QUICK_START.md`** — Essential developer workflow
- **`DEVELOPMENT_PROCESS.md`** — Detailed 6-phase development process
- **`.github/copilot-instructions.md`** — Comprehensive architecture guide
- **`*/AGENTS.md`** — Module-level architecture guides (in `src/`, `src/render_plan/`, `src/server/`, `clickgraph-ffi/`, `clickgraph-embedded/`, `clickgraph-go/`, `clickgraph-py/`, etc.)
- **`*/AGENTS.md`** — Module-level architecture guides (in `src/`, `src/render_plan/`, `src/server/`, `clickgraph-ffi/`, `clickgraph-embedded/`, `clickgraph-tool/`, `clickgraph-go/`, `clickgraph-py/`, etc.)
- **`docs/wiki/cypher-language-reference.md`** — Primary feature documentation (must be updated for every feature)
109 changes: 107 additions & 2 deletions Cargo.lock


1 change: 1 addition & 0 deletions Cargo.toml
@@ -4,6 +4,7 @@ members = [
"clickgraph-embedded",
"clickgraph-ffi",
"clickgraph-tck",
"clickgraph-tool",
]

[package]
10 changes: 9 additions & 1 deletion DEV_QUICK_START.md
@@ -129,7 +129,15 @@ cargo test && pytest tests/integration/ && echo "✅ ALL TESTS PASSED"

### Check Generated SQL
```bash
# Use sql_only mode for quick debugging
# Option 1: cg CLI (no server needed — fastest for development)
cg --schema benchmarks/social_network/schemas/social_benchmark.yaml \
sql "MATCH (n:User) RETURN n.name LIMIT 10"

# Validate syntax + planning without execution
cg --schema benchmarks/social_network/schemas/social_benchmark.yaml \
validate "MATCH (n:User)-[:FOLLOWS]->(f) RETURN f.name"

# Option 2: Via server (if already running)
curl -X POST http://localhost:8080/query \
-H "Content-Type: application/json" \
-d '{"query":"MATCH (n) RETURN n","sql_only":true}'