Commit e65a199

genezhang and Copilot authored
feat: Python bindings for embedded mode (clickgraph-py) (#180)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent c621b57 commit e65a199

11 files changed

Lines changed: 1142 additions & 3 deletions

Cargo.lock

Lines changed: 101 additions & 0 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -2,6 +2,7 @@
 members = [
     "clickgraph-client",
     "clickgraph-embedded",
+    "clickgraph-py",
 ]

 [package]

clickgraph-embedded/src/connection.rs

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ use clickgraph::graph_catalog::graph_schema::GraphSchema;

 use super::database::Database;
 use super::error::EmbeddedError;
-use super::query_result::{QueryResult, Row};
+use super::query_result::QueryResult;
 use super::value::Value;

 /// A connection to an embedded ClickGraph database.

clickgraph-py/.gitignore

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
# Compiled extensions
*.so
*.pyd
*.dll

# Python
__pycache__/
*.pyc
*.egg-info/
dist/
build/

# maturin
target/

clickgraph-py/Cargo.toml

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
[package]
name = "clickgraph-py"
version = "0.1.0"
edition = "2021"
rust-version = "1.85"
description = "Python bindings for ClickGraph embedded graph query engine"
repository = "https://github.com/genezhang/clickgraph"
license = "Apache-2.0"

[lib]
name = "_clickgraph"
crate-type = ["cdylib"]

[dependencies]
clickgraph-embedded = { path = "../clickgraph-embedded" }
pyo3 = { version = "0.24", features = ["extension-module"] }

clickgraph-py/README.md

Lines changed: 138 additions & 0 deletions
@@ -0,0 +1,138 @@
# clickgraph: Python bindings

Embedded graph query engine: run Cypher queries over Parquet, Iceberg, Delta Lake and S3 data without a ClickHouse server.

## Quick Start

```python
import clickgraph

db = clickgraph.Database("schema.yaml")
conn = db.connect()

for row in conn.query("MATCH (u:User) RETURN u.name LIMIT 5"):
    print(row["u.name"])
```

## API Compatibility

ClickGraph's Python API is designed to be familiar to users of other graph databases:

| Operation | ClickGraph | Kuzu | Neo4j |
|-----------|-----------|------|-------|
| Open database | `Database("schema.yaml")` | `Database("path")` | `GraphDatabase.driver(uri)` |
| Get connection | `db.connect()` or `Connection(db)` | `Connection(db)` | `driver.session()` |
| Run query | `conn.query(cypher)` | `conn.execute(cypher)` | `session.run(cypher)` |
| Iterate rows | `for row in result:` | `while result.has_next():` | `for record in result:` |
| Access a value | `row["col"]` (dict) | `row[0]` (by position) | `record["col"]` (dict-like) |

ClickGraph accepts all three calling styles; use whichever feels natural:

```python
# ClickGraph style
conn = db.connect()
result = conn.query("MATCH (u:User) RETURN u.name")

# Kuzu style
conn = clickgraph.Connection(db)
result = conn.execute("MATCH (u:User) RETURN u.name")
while result.has_next():
    row = result.get_next()
    print(row[0])

# Neo4j style
conn = db.connect()
result = conn.run("MATCH (u:User) RETURN u.name")
for row in result:
    print(row["u.name"])
```

## API

### `Database(schema_path, **kwargs)`

Open an embedded database from a YAML schema file.

**Keyword arguments** (all optional):
- `session_dir`: directory for chdb session data (default: temp dir)
- `data_dir`: base directory for relative `source:` paths
- `max_threads`: maximum threads for chdb
- `s3_access_key_id`, `s3_secret_access_key`, `s3_region`, `s3_endpoint_url`, `s3_session_token`: S3 credentials
- `gcs_access_key_id`, `gcs_secret_access_key`: GCS HMAC credentials
- `azure_storage_account_name`, `azure_storage_account_key`, `azure_storage_connection_string`: Azure credentials

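The S3 keyword arguments listed above can be collected from the environment instead of being hard-coded. The following is a minimal sketch, not part of clickgraph: the helper `s3_kwargs_from_env` is hypothetical, and the mapping to the conventional AWS environment variable names is an assumption. Only the keyword names come from the list above.

```python
import os

# Documented Database keyword -> conventional AWS environment variable.
# The environment variable names are an assumption of this sketch.
_S3_ENV_VARS = {
    "s3_access_key_id": "AWS_ACCESS_KEY_ID",
    "s3_secret_access_key": "AWS_SECRET_ACCESS_KEY",
    "s3_session_token": "AWS_SESSION_TOKEN",
    "s3_region": "AWS_DEFAULT_REGION",
    "s3_endpoint_url": "AWS_ENDPOINT_URL",
}


def s3_kwargs_from_env(environ=None):
    """Return the S3 keyword arguments whose variable is set and non-empty."""
    env = os.environ if environ is None else environ
    return {kw: env[var] for kw, var in _S3_ENV_VARS.items() if env.get(var)}


# Usage (requires the built extension and a schema file):
#   import clickgraph
#   db = clickgraph.Database("schema.yaml", **s3_kwargs_from_env())
```

Unset or empty variables are omitted, so the corresponding keyword arguments fall back to whatever defaults `Database` applies.
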
### `Database.connect() → Connection`

Create a connection for executing queries.

### `Connection(db)` *(Kuzu-compatible constructor)*

Alternative to `db.connect()`: creates a connection from a Database instance.

### `Database.execute(cypher) → QueryResult`

Shorthand: execute a query without creating a separate connection.

### `Connection.query(cypher) → QueryResult`

Execute a Cypher query. Returns an iterable of row dicts.

### `Connection.execute(cypher) → QueryResult` *(Kuzu-compatible alias)*

Alias for `query()`.

### `Connection.run(cypher) → QueryResult` *(Neo4j-compatible alias)*

Alias for `query()`.

### `Connection.query_to_sql(cypher) → str`

Translate Cypher to ClickHouse SQL without executing.

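One possible use of `query_to_sql()` is to log the generated SQL before running a query. The sketch below assumes only the two documented methods, `query_to_sql(cypher)` and `query(cypher)`. The helper `run_with_sql_preview` and the `_StubConnection` class are hypothetical; the stub exists so the example runs without the compiled extension, and the SQL it returns is a placeholder, not real ClickGraph output.

```python
def run_with_sql_preview(conn, cypher, log=print):
    """Log the SQL generated for `cypher`, then execute the query."""
    sql = conn.query_to_sql(cypher)  # translation only, nothing is executed
    log(sql)
    return conn.query(cypher)


class _StubConnection:
    """Hypothetical stand-in for a clickgraph Connection (illustration only)."""

    def query_to_sql(self, cypher):
        # Placeholder text; the real method returns ClickHouse SQL.
        return "SELECT 1 /* placeholder for: " + cypher + " */"

    def query(self, cypher):
        # Placeholder rows shaped like the documented row dicts.
        return [{"u.name": "Alice"}]


logged = []
rows = run_with_sql_preview(
    _StubConnection(), "MATCH (u:User) RETURN u.name", log=logged.append
)
```

With a real connection, pass `conn` from `db.connect()` in place of the stub.
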
### `QueryResult`

**Dict-style access** (ClickGraph/Neo4j pattern):
- Iterable: `for row in result:` yields each row as a `dict`
- `result[i]`: access row by index (supports negative indexing)
- `result.column_names`: list of column names
- `result.num_rows`: number of rows
- `result.as_dicts()`: all rows as a list of dicts
- `result.get_row(i)`: single row by index as dict
- `len(result)`: number of rows

**Tuple-style access** (Kuzu pattern):
- `result.has_next()`: True if more rows remain
- `result.get_next()`: next row as a list of values (column order)
- `result.reset_iterator()`: restart the cursor

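The two access patterns above can be modelled in pure Python to show how they relate. This is an illustrative sketch, not the actual `QueryResult` (which lives in the compiled extension): the class name `ResultSketch` is hypothetical, and two behaviours are assumptions the text does not specify, namely that `get_next()` raises `IndexError` once exhausted and that `for` iteration does not advance the `has_next()`/`get_next()` cursor.

```python
class ResultSketch:
    """Pure-Python model of the documented QueryResult interface."""

    def __init__(self, column_names, rows):
        self.column_names = list(column_names)
        self._rows = [list(r) for r in rows]
        self._cursor = 0  # used only by has_next()/get_next()

    # --- dict-style access ---
    @property
    def num_rows(self):
        return len(self._rows)

    def __len__(self):
        return len(self._rows)

    def get_row(self, i):
        # Pair each column name with the value at the same position.
        return dict(zip(self.column_names, self._rows[i]))

    def __getitem__(self, i):
        return self.get_row(i)  # negative indices work via list indexing

    def as_dicts(self):
        return [self.get_row(i) for i in range(len(self._rows))]

    def __iter__(self):
        return iter(self.as_dicts())

    # --- tuple-style (cursor) access ---
    def has_next(self):
        return self._cursor < len(self._rows)

    def get_next(self):
        if not self.has_next():
            raise IndexError("no more rows")  # assumption, see lead-in
        row = self._rows[self._cursor]
        self._cursor += 1
        return list(row)

    def reset_iterator(self):
        self._cursor = 0


result = ResultSketch(["u.name", "u.age"], [["Alice", 30], ["Bob", 25]])
names_by_dict = [row["u.name"] for row in result]
names_by_cursor = []
while result.has_next():
    names_by_cursor.append(result.get_next()[0])
```

Both loops visit the same rows in the same order; the dict form keys values by column name, while the cursor form relies on column position.
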
## Installation

```bash
# From source (requires Rust toolchain + chdb)
cd clickgraph-py
pip install maturin
maturin develop
```

## Example with S3 data

```python
import clickgraph

db = clickgraph.Database(
    "schema.yaml",
    s3_access_key_id="AKIA...",
    s3_secret_access_key="...",
    s3_region="us-east-1",
)

conn = db.connect()
result = conn.query("""
    MATCH (u:User)-[:FOLLOWS]->(f:User)
    WHERE u.name = 'Alice'
    RETURN f.name, f.email
""")

for row in result:
    print(f"{row['f.name']}: {row['f.email']}")
```

clickgraph-py/pyproject.toml

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
[build-system]
requires = ["maturin>=1.0,<2.0"]
build-backend = "maturin"

[project]
name = "clickgraph"
version = "0.1.0"
description = "Embedded graph query engine: run Cypher over Parquet, Iceberg, S3 and ClickHouse"
requires-python = ">=3.8"
license = "Apache-2.0"
readme = "README.md"
keywords = ["graph", "cypher", "clickhouse", "embedded", "analytics"]
classifiers = [
    "Development Status :: 4 - Beta",
    "License :: OSI Approved :: Apache Software License",
    "Programming Language :: Python :: 3",
    "Programming Language :: Rust",
    "Topic :: Database",
]

[tool.maturin]
features = []
python-source = "python"
module-name = "clickgraph._clickgraph"
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
"""ClickGraph: embedded graph query engine for Python.

Run Cypher queries over Parquet, Iceberg, Delta Lake and S3 data
without a ClickHouse server.

Quick start::

    import clickgraph

    db = clickgraph.Database("schema.yaml")
    conn = db.connect()
    for row in conn.query("MATCH (u:User) RETURN u.name LIMIT 5"):
        print(row["u.name"])

Kuzu-compatible style::

    from clickgraph import Database, Connection

    db = Database("schema.yaml")
    conn = Connection(db)
    result = conn.execute("MATCH (u:User) RETURN u.name")
    while result.has_next():
        row = result.get_next()
        print(row[0])

With S3 credentials::

    db = clickgraph.Database(
        "schema.yaml",
        s3_access_key_id="AKIA...",
        s3_secret_access_key="...",
        s3_region="us-east-1",
    )
"""

from clickgraph._clickgraph import Database, Connection, QueryResult

__all__ = ["Database", "Connection", "QueryResult"]
__version__ = "0.1.0"
