feat(raster-zarr): sedona-raster-zarr crate + sedona-zarr plugin #858
Merged
Changes from 4 commits
45 commits
d27e8f4 feat(raster-zarr): sedona-raster-zarr crate + sd_read_zarr UDTF (james-willis)
3fd6420 feat(python/sedonadb): add sd_read_zarr Python wrapper (james-willis)
ca47786 feat(raster-zarr): review fixes for sd_read_zarr (james-willis)
1ad69cf test(raster-zarr): migrate fixtures off deprecated store_chunk_elements (james-willis)
ccbb742 refactor(raster-zarr): rename sd_read_zarr `indb` option to `load_eager` (james-willis)
7a391cb fix(raster-zarr): review-round small fixes from PR #858 (james-willis)
ecf12a8 refactor(raster-zarr): collapse loader to a single OutDb path (james-willis)
52dc4ce refactor(raster-zarr): plugin architecture — streaming reader + Exter… (james-willis)
8b1d180 feat(sedonadb-zarr): new Python plugin package wiring zarr support (james-willis)
490591a feat(sedonadb-zarr): expose ExternalFormatSpec via Python ZarrFormatSpec (james-willis)
a637399 test(sedonadb-zarr): inspect raster cell as Python dict via as_py() (james-willis)
06e479b ci: fix three CI failures from the plugin refactor (james-willis)
b681acc fix: gate sedonadb's pymodule behind `extension-module` feature (james-willis)
940cedf docs(sedonadb-zarr): README accurately reflects the shipped surface (james-willis)
a0c7454 refactor(sedona-raster-zarr): drop the `zarr` feature gate (james-willis)
ce0bdea chore(sedona): drop redundant comment about zarr plugin (james-willis)
4423128 style(sedonadb): collapse cfg_attr to one line per rustfmt (james-willis)
4462b94 ci: don't leak sedonadb's s2geography feature into the plugin build (james-willis)
21c6383 fix: move sedonadb's extension-module out of default features (james-willis)
4e97bd4 ci: pass sedonadb's `extension-module` feature in maturin builds (james-willis)
b441a04 fix: drop self-referential sedonadb workspace dep, inline the path (james-willis)
ea3a540 fix: rename sedonadb-zarr's pymodule to `_zarr_lib` (james-willis)
61387ad fix(sedonadb-zarr): cross-extension UDTF handoff via PyCapsule (james-willis)
cf2fe80 fix(sedonadb): drop mimalloc from defaults to coexist with plugins (james-willis)
44b0db6 test(sedonadb-zarr): drop premature read_format tests (james-willis)
3621555 feat(sedona-datasource): single-object table provider for directory f… (james-willis)
3b63088 fix(sedonadb): harden plugin extension surface from review (james-willis)
1673608 refactor(sedona-datasource): resolve ObjectStore for single-object pa… (james-willis)
171e4a3 chore(sedona-raster-zarr): trim scope and dependencies from PR review (james-willis)
583abde refactor(sedonadb): UDTF handoff via datafusion-ffi; restore mimalloc (james-willis)
5a45265 fix(sedonadb): keep FFI codec's TaskContextProvider alive for session… (james-willis)
e678bbc chore(sedonadb-zarr): pin sedonadb>=0.4.0 (james-willis)
3289cdf chore(sedona-raster-zarr): drop Windows MinGW blosc gate (james-willis)
008a332 test(sedonadb-zarr): parameterise smoke test over numpy dtypes (james-willis)
ee77599 chore(sedonadb): drop extension-module feature and `_zarr_lib` naming (james-willis)
8e44f57 chore(sedonadb,plugin): trim dead scaffolding and over-verbose comments (james-willis)
16eafd4 chore(sedonadb-zarr): defer `sd_read_zarr` SQL UDTF to a follow-up PR (james-willis)
b0a160b fix(ci): unblock docs job + drop stray sedona-raster dev-dep (james-willis)
df700f8 fix(ci): mark both internal Python cdylibs as `doc = false` (james-willis)
01f4d88 chore(ci): trim stale `MATURIN_PEP517_ARGS` comment in python.yml (james-willis)
38d8465 feat(sedonadb): add `SedonaContext.read_format()` for plugin formats (james-willis)
03dc461 chore: remove unreachable code surfaces from post-strip PR (james-willis)
d809871 chore(sedonadb): drop dead hasattr fallback for list_single_object (james-willis)
0cf9d64 fix(ci): add TYPE_CHECKING import for `ExternalFormatSpec` (james-willis)
1432842 chore: post-merge audit sweep (validate options, trim public surface,… (james-willis)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,53 @@ | ||
| # Licensed to the Apache Software Foundation (ASF) under one | ||
| # or more contributor license agreements. See the NOTICE file | ||
| # distributed with this work for additional information | ||
| # regarding copyright ownership. The ASF licenses this file | ||
| # to you under the Apache License, Version 2.0 (the | ||
| # "License"); you may not use this file except in compliance | ||
| # with the License. You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, | ||
| # software distributed under the License is distributed on an | ||
| # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| # KIND, either express or implied. See the License for the | ||
| # specific language governing permissions and limitations | ||
| # under the License. | ||
| [package] | ||
| name = "sedona-raster-zarr" | ||
| version.workspace = true | ||
| license.workspace = true | ||
| keywords.workspace = true | ||
| categories.workspace = true | ||
| homepage.workspace = true | ||
| repository.workspace = true | ||
| description.workspace = true | ||
| readme.workspace = true | ||
| edition.workspace = true | ||
| rust-version.workspace = true | ||
| | ||
| [lints.clippy] | ||
| result_large_err = "allow" | ||
| | ||
| [dependencies] | ||
| arrow-array = { workspace = true } | ||
| arrow-schema = { workspace = true } | ||
| datafusion-common = { workspace = true } | ||
| log = { workspace = true } | ||
| sedona-common = { workspace = true } | ||
| sedona-raster = { workspace = true } | ||
| sedona-schema = { workspace = true } | ||
| serde_json = { workspace = true } | ||
| zarrs = { workspace = true, features = ["filesystem", "gzip", "zstd", "crc32c", "sharding", "transpose"] } | ||
| zarrs_filesystem = { workspace = true } | ||
| | ||
| # `blosc` is gated off Windows: c-blosc (statically linked) bundles its own | ||
| # `pthread_create` / `pthread_cond_*` symbols, which conflict with rtools45's | ||
| # `libpthread.a` during the MinGW link of the R `sedonadb.dll`. Non-Windows | ||
| # targets get the full blosc-compressed Zarr reading capability. | ||
| [target.'cfg(not(target_os = "windows"))'.dependencies] | ||
| zarrs = { workspace = true, features = ["blosc"] } | ||
| | ||
| [dev-dependencies] | ||
| tempfile = { workspace = true } | ||
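
The target-gated `blosc` feature above means that decoding capability differs by platform at this revision of the manifest. A minimal sketch of how calling code could mirror that gate is shown below. `supported_codecs` is a hypothetical helper invented for illustration and is not part of the crate; it checks `target_os` directly with `cfg!`, whereas a real crate would more likely key off a Cargo feature.

```rust
/// Hypothetical helper (not part of sedona-raster-zarr): lists the codec
/// names that a build configured like the manifest above could decode.
fn supported_codecs() -> Vec<&'static str> {
    // Mirrors the unconditional codec features requested from `zarrs`.
    // ("filesystem" is a store feature, not a codec, so it is left out.)
    let mut codecs = vec!["gzip", "zstd", "crc32c", "sharding", "transpose"];

    // Mirrors the `cfg(not(target_os = "windows"))` dependency table:
    // blosc is only available off Windows.
    if cfg!(not(target_os = "windows")) {
        codecs.push("blosc");
    }
    codecs
}

fn main() {
    let codecs = supported_codecs();
    assert!(codecs.contains(&"gzip"));
    // blosc is present exactly when the target is not Windows.
    assert_eq!(codecs.contains(&"blosc"), !cfg!(target_os = "windows"));
    println!("{codecs:?}");
}
```

Because `cfg!` expands to a compile-time boolean, the Windows branch is resolved when the crate is built, which matches how Cargo resolves the `[target.'cfg(...)'.dependencies]` table.
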
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,66 @@ | ||
| // Licensed to the Apache Software Foundation (ASF) under one | ||
| // or more contributor license agreements. See the NOTICE file | ||
| // distributed with this work for additional information | ||
| // regarding copyright ownership. The ASF licenses this file | ||
| // to you under the Apache License, Version 2.0 (the | ||
| // "License"); you may not use this file except in compliance | ||
| // with the License. You may obtain a copy of the License at | ||
| // | ||
| // http://www.apache.org/licenses/LICENSE-2.0 | ||
| // | ||
| // Unless required by applicable law or agreed to in writing, | ||
| // software distributed under the License is distributed on an | ||
| // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| // KIND, either express or implied. See the License for the | ||
| // specific language governing permissions and limitations | ||
| // under the License. | ||
| | ||
| //! Mapping between Zarr datatypes and SedonaDB's `BandDataType`. | ||
| //! | ||
| //! zarrs 0.23 models `DataType` as a wrapper around `Arc<dyn DataTypeTraits>`, | ||
| //! so we discriminate via the type-erased `is::<T>()` check rather than | ||
| //! pattern-matching an enum. | ||
| | ||
| use arrow_schema::ArrowError; | ||
| use sedona_schema::raster::BandDataType; | ||
| use zarrs::array::data_type::{ | ||
| BoolDataType, Float32DataType, Float64DataType, Int16DataType, Int32DataType, Int64DataType, | ||
| Int8DataType, UInt16DataType, UInt32DataType, UInt64DataType, UInt8DataType, | ||
| }; | ||
| use zarrs::array::DataType as ZarrDataType; | ||
| | ||
| /// Map a Zarr `DataType` to a SedonaDB `BandDataType`. | ||
| /// | ||
| /// Bool maps to UInt8 losslessly (Zarr packs bools to one byte each, matching | ||
| /// our representation). Variable-length, complex, and extended-precision | ||
| /// types error with `NotYetImplemented` — they have no numeric counterpart | ||
| /// in `BandDataType` today. | ||
| pub fn zarr_to_band_data_type(dt: &ZarrDataType) -> Result<BandDataType, ArrowError> { | ||
| if dt.is::<BoolDataType>() { | ||
| Ok(BandDataType::UInt8) | ||
| } else if dt.is::<Int8DataType>() { | ||
| Ok(BandDataType::Int8) | ||
| } else if dt.is::<UInt8DataType>() { | ||
| Ok(BandDataType::UInt8) | ||
| } else if dt.is::<Int16DataType>() { | ||
| Ok(BandDataType::Int16) | ||
| } else if dt.is::<UInt16DataType>() { | ||
| Ok(BandDataType::UInt16) | ||
| } else if dt.is::<Int32DataType>() { | ||
| Ok(BandDataType::Int32) | ||
| } else if dt.is::<UInt32DataType>() { | ||
| Ok(BandDataType::UInt32) | ||
| } else if dt.is::<Int64DataType>() { | ||
| Ok(BandDataType::Int64) | ||
| } else if dt.is::<UInt64DataType>() { | ||
| Ok(BandDataType::UInt64) | ||
| } else if dt.is::<Float32DataType>() { | ||
| Ok(BandDataType::Float32) | ||
| } else if dt.is::<Float64DataType>() { | ||
| Ok(BandDataType::Float64) | ||
| } else { | ||
| Err(ArrowError::NotYetImplemented(format!( | ||
| "Zarr datatype {dt:?} has no BandDataType mapping yet" | ||
| ))) | ||
| } | ||
| } | ||
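
The `is::<T>()` dispatch used by `zarr_to_band_data_type` can be illustrated without the `zarrs` dependency. The sketch below is a standalone approximation built on `std::any::Any`. Every name in it (`TypeTraits`, `DataType`, `Band`, `to_band`, and the marker structs) is a hypothetical stand-in, not the zarrs or SedonaDB API, and the error is a plain `String` rather than `ArrowError`.

```rust
use std::any::Any;
use std::fmt::Debug;
use std::sync::Arc;

/// Stand-in for a type-erased data type trait (hypothetical, not zarrs).
trait TypeTraits: Debug {
    fn as_any(&self) -> &dyn Any;
}

/// Declares unit marker structs and implements `TypeTraits` for each.
macro_rules! marker {
    ($($name:ident),*) => {$(
        #[derive(Debug)]
        struct $name;
        impl TypeTraits for $name {
            fn as_any(&self) -> &dyn Any { self }
        }
    )*};
}
marker!(BoolType, UInt8Type, Float32Type, StringType);

/// Wrapper around a trait object, so there is no enum to pattern-match.
#[derive(Debug, Clone)]
struct DataType(Arc<dyn TypeTraits>);

impl DataType {
    fn new<T: TypeTraits + 'static>(t: T) -> Self {
        DataType(Arc::new(t))
    }

    /// True when the erased value's concrete type is `T`.
    fn is<T: Any>(&self) -> bool {
        self.0.as_any().is::<T>()
    }
}

/// Simplified stand-in for `BandDataType`.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Band {
    UInt8,
    Float32,
}

/// Chain of `is::<T>()` checks with an error fallback for unmapped types.
fn to_band(dt: &DataType) -> Result<Band, String> {
    if dt.is::<BoolType>() || dt.is::<UInt8Type>() {
        // Bool shares the one-byte representation, as in the source.
        Ok(Band::UInt8)
    } else if dt.is::<Float32Type>() {
        Ok(Band::Float32)
    } else {
        Err(format!("datatype {dt:?} has no band mapping yet"))
    }
}

fn main() {
    assert_eq!(to_band(&DataType::new(BoolType)), Ok(Band::UInt8));
    assert_eq!(to_band(&DataType::new(UInt8Type)), Ok(Band::UInt8));
    assert_eq!(to_band(&DataType::new(Float32Type)), Ok(Band::Float32));
    assert!(to_band(&DataType::new(StringType)).is_err());
    println!("all mappings behave as expected");
}
```

The sketch implements the trait per marker type rather than through a blanket impl. A blanket impl over all `Any + Debug` types would also cover `Arc<dyn TypeTraits>` itself, so `self.0.as_any()` could resolve to the `Arc` and make every downcast check fail.
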