Commit 6d031be

I/O: Apache Iceberg (improve docs)
1 parent 8648510 commit 6d031be

1 file changed

Lines changed: 21 additions & 7 deletions

doc/io/iceberg/index.md

@@ -4,7 +4,12 @@
 
 ## About
 
-Import and export data into/from Iceberg tables, for humans and machines.
+Import and export data into/from [Apache Iceberg] tables, for humans and machines.
+
+Iceberg works with the concept of a [FileIO] which is a pluggable module for
+reading, writing, and deleting files. It supports different backends like
+S3, HDFS, Azure Data Lake, Google Cloud Storage, Alibaba Cloud Object Storage,
+and Hugging Face.
 
 ## Synopsis
 
@@ -16,9 +21,15 @@ Import and export data into/from Iceberg tables, for humans and machines.
 ## Install
 
 ```shell
-pip install --upgrade 'cratedb-toolkit[io]'
+uv tool install --upgrade 'cratedb-toolkit[io]'
 ```
 
+:::{tip}
+For speedy installations, we recommend using the [uv] package manager.
+Install it using `brew install uv` on macOS or `pipx install uv` on
+other operating systems.
+:::
+
 ## Usage
 
 ### Load
@@ -37,7 +48,7 @@ ctk load table \
 --cluster-url="crate://crate:crate@localhost:4200/demo/taxi-tiny"
 ```
 
-Load from REST catalog on AWS S3.
+Load from REST catalog and AWS S3 storage.
 ```shell
 ctk load table \
 "s3+iceberg://bucket1/?catalog-uri=http://iceberg-catalog.example.org:5000&catalog-token=foo&catalog=default&namespace=demo&table=taxi-tiny&s3.access-key-id=<your_access_key_id>&s3.secret-access-key=<your_secret_access_key>&s3.endpoint=<endpoint_url>&s3.region=<s3-region>" \
@@ -98,11 +109,9 @@ ctk save table --cluster-url="crate://?batch-size=200000"
 
 ### CrateDB parameters
 
-Both parameters apply to target pipeline elements, controlling overwrite behaviour.
-
 #### `if-exists`
 
-The target table will be created automatically, if it does not exist. If it
+The target CrateDB table will be created automatically, if it does not exist. If it
 does exist, the `if-exists` URL query parameter can be used to configure this
 behavior. The default value is `fail`, the possible values are:
 
@@ -129,5 +138,10 @@ to a truthy value, save operations will append to an existing table.
 :::{rubric} Example usage
 :::
 ```shell
-ctk save table "file+iceberg://./var/lib/iceberg/?catalog=default&namespace=demo&table=taxi-tiny&append=true"
+ctk save table "file+iceberg://./var/lib/iceberg/?...&append=true"
 ```
+
+
+[Apache Iceberg]: https://iceberg.apache.org/
+[FileIO]: https://py.iceberg.apache.org/configuration/#fileio
+[uv]: https://docs.astral.sh/uv/
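
The REST catalog example in this diff passes every option as a query parameter of one long `s3+iceberg://` URL. As a minimal sketch, assuming a POSIX-compatible shell, the same URL can be assembled from variables so that each option sits on its own line. The variable names and their sample values are hypothetical and not part of the commit; only the query parameter names are taken from the documented example.

```shell
# Hypothetical variables: the values are placeholders, not real endpoints or credentials.
CATALOG_URI="http://iceberg-catalog.example.org:5000"
CATALOG_TOKEN="foo"
S3_ACCESS_KEY_ID="my-access-key-id"
S3_SECRET_ACCESS_KEY="my-secret-access-key"
S3_ENDPOINT="http://s3.example.org:9000"
S3_REGION="us-east-1"

# Assemble the source URL piece by piece, using the query parameter
# names from the documented REST catalog example.
SOURCE_URL="s3+iceberg://bucket1/"
SOURCE_URL="${SOURCE_URL}?catalog-uri=${CATALOG_URI}"
SOURCE_URL="${SOURCE_URL}&catalog-token=${CATALOG_TOKEN}"
SOURCE_URL="${SOURCE_URL}&catalog=default&namespace=demo&table=taxi-tiny"
SOURCE_URL="${SOURCE_URL}&s3.access-key-id=${S3_ACCESS_KEY_ID}"
SOURCE_URL="${SOURCE_URL}&s3.secret-access-key=${S3_SECRET_ACCESS_KEY}"
SOURCE_URL="${SOURCE_URL}&s3.endpoint=${S3_ENDPOINT}"
SOURCE_URL="${SOURCE_URL}&s3.region=${S3_REGION}"

# Print the URL for inspection. It would then be passed to `ctk load table`
# as the first argument, together with `--cluster-url`.
echo "${SOURCE_URL}"
```

This sketch does no escaping. Values that contain characters such as `&` or `=` would likely need percent-encoding before being placed into the URL.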
