Skip to content

Lazy Raster Loading Support for Zarr and GDAL#886

Open
james-willis wants to merge 17 commits into
apache:mainfrom
james-willis:jw/outdb-loader-v2
Open

Lazy Raster Loading Support for Zarr and GDAL#886
james-willis wants to merge 17 commits into
apache:mainfrom
james-willis:jw/outdb-loader-v2

Conversation

@james-willis
Copy link
Copy Markdown
Contributor

@james-willis james-willis commented May 28, 2026

Summary

This PR adds support for OutDB raster loading via the RS_EnsureLoaded UDF. When a function that needs the bytes of a raster available is called, an optimizer rule injects RS_EnsureLoaded into the expression tree. This _should also leverage Common Subexpression Elimination from Datafusion to avoid repeated loads.

This PR supercedes #849. This is the new approach based on the review comments I received from Dewey on that PR.

Copy reduction

Returning arrow_buffer::Buffer (not Vec<u8> or Bytes) and constructing the output BinaryViewArray directly from collected Buffers — bypassing BinaryViewBuilder's internal block copy — gives one copy total (storage → loader-internal Buffer) on the byte path, down from three with a naive Vec<u8> interface.

Zarr Registration

I punted on this but we will need it in the near future. The standard datafusion FFI doesnt help us here because these OutDBLoaders are not part of the datafusion interface. the Agent presented two solutions:

A. Build a C-ABI wrapper for AsyncByteLoader following the datafusion-ffi pattern (#[repr(C)] struct, extern "C" function pointers, abi_stable types). ~200-400 LoC of FFI scaffolding + safety review. Future-proof but heavyweight, and we'd own the ABI compatibility story going forward.

B. PyCapsule with raw Arc transit. Non-standard but works when both wheels link the same sedona-raster version. PyArrow has analogous patterns (PyCapsule carrying ArrowSchema* etc., where the C struct happens to be standardized — ours wouldn't be). Document the version-coupling constraint.

Merge coordination (with #888)

#888 migrates sedona-raster-zarr to all-async and establishes the open_storage_from_uri(uri, store) store-resolution contract. Whichever of #886/#888 lands second must reconcile:

Foundational module for async OutDb byte loading. Defines:

- `AsyncByteLoader` trait — async fn that returns `arrow_buffer::Buffer`,
  enabling zero-copy hand-off to the downstream BinaryViewArray
  construction in RS_EnsureLoaded (one storage→column copy instead of
  three: storage→loader-internal→Vec<u8>→BinaryViewBuilder-block→column).
- `OutDbLoadRequest` — borrowed request struct carrying the URI,
  dim_names, source_shape, and pixel type for one band's worth of bytes.
- `OutDbLoaderRegistry` — format-keyed map from `outdb_format` strings to
  `Arc<dyn AsyncByteLoader>`. One instance per SedonaContext; the context
  wraps it in `Arc<RwLock<…>>` so plugin crates can register
  post-construction via a public SedonaContext API.

No callers yet — the registry/trait stand alone. Subsequent commits add
the SedonaScalarUDF needs_bytes annotation, the RS_EnsureLoaded async
UDF that consumes the registry, the analyzer rule that injects the wrap,
and the GDAL + Zarr backend impls.

7 unit tests cover registration, lookup, dispatch, overwrite semantics,
and the format-iteration helper used for diagnostic messages.

See docs/research/async-outdb-design.md for the full design rationale.
Class-level annotation declaring whether a UDF's kernels read raster
pixel bytes. The analyzer rule (later commit) reads this to decide
whether to wrap raster arguments with RS_EnsureLoaded.

Shape:
- New `needs_bytes: bool` field on SedonaScalarUDF, default false.
- `with_needs_bytes()` builder method flips it to true at construction.
- `needs_bytes()` accessor for the analyzer to query post-downcast.

Class-level by convention: each UDF is constructed once per session, so
the instance field reflects a fixed property of the underlying kernels.
The field doesn't live on SedonaScalarKernel because kernels within a
single UDF don't realistically disagree on byte-needing-ness — they
differ by type signature, not by data-access pattern. If a future use
case demands per-kernel granularity, that's the trigger to push the
flag down to the kernel trait; not before.

No current RS_* UDFs read pixel bytes (audit returns empty across
sedona-raster-functions, sedona-raster-gdal, sedona). The annotation
mechanism is staged for future byte-reading kernels (e.g., RS_Value,
RS_Sum) which will call `.with_needs_bytes()` at construction.

See docs/research/async-outdb-design.md §2 for the design rationale.
Materialises OutDb raster bands at query time via the format-keyed
AsyncByteLoader registry from sedona-raster. The UDF instance closes
over an Arc<RwLock<OutDbLoaderRegistry>> owned by SedonaContext, so
loader registrations from session bootstrap (compiled-in defaults like
GDAL under a Cargo feature, later commit) and from plugin entry points
(out-of-tree crates calling SedonaContext::register_outdb_loader,
later commits) are immediately visible at dispatch time without
session-extension lookups — AsyncScalarUDFImpl::invoke_async_with_args
in DataFusion 52.5 only receives Arc<ConfigOptions>, so the closure-bound
registry is the right runtime path.

Implementation:

- `RsEnsureLoaded` implements `AsyncScalarUDFImpl`. For each input row,
  walks bands, passes through InDb bytes verbatim, dispatches OutDb
  bytes via the registered loader for the band's `outdb_format`,
  validates the loader-returned `Buffer` length against
  `Π source_shape × data_type.byte_size()`, and assembles the output
  raster StructArray via `RasterBuilder`. Sequential await for the
  first cut; parallel `buffer_unordered` fan-out is a follow-up that
  doesn't change the trait surface.
- `SedonaContext` gains a private `outdb_registry: Arc<RwLock<...>>`
  field initialised empty at `new_from_context`. The `RsEnsureLoaded`
  UDF is registered there alongside other session UDFs, with a clone
  of the same Arc.
- Public `SedonaContext::register_outdb_loader(format, loader)` is the
  single insertion hook used by both compiled-in defaults and plugins.
  `registered_outdb_formats()` exposes the keys for diagnostics.
- `OutDbLoaderRegistry` gains a manual `Debug` impl listing format keys
  (the loader values themselves are opaque trait objects).
- `SedonaScalarUDF` Eq/Hash/Debug derive lessons applied: manual impls
  by identity-by-name, since the registry field doesn't participate in
  UDF identity and isn't itself Eq/Hash.

5 unit tests cover the resolver end-to-end with a mock loader:
InDb passthrough, OutDb byte materialisation, missing-format diagnostic
with registered-formats listing, undersized-loader-output detection,
null raster row preservation. One context-level test asserts the
registry plumbing (empty initial state, post-register listing, UDF
visible in the scalar function registry by name).
Pre-optimizer rewrite that makes OutDb byte materialisation explicit in
the logical plan. For every `Expr::ScalarFunction` whose UDF downcasts
to a `SedonaScalarUDF` annotated `with_needs_bytes()`, each
raster-typed argument is wrapped in `RS_EnsureLoaded(arg)`. The wrap
references the same `Arc<ScalarUDF>` as the one
`SedonaContext::new_from_context` registers under the UDF registry, so
both injected and explicitly-written calls dispatch through the same
async resolver.

Recursion guard: `RS_EnsureLoaded` itself is left alone, so
`SELECT ... FROM RS_EnsureLoaded(rast)` doesn't recurse into
`RS_EnsureLoaded(RS_EnsureLoaded(rast))`.

Detection of "raster-typed" arguments uses `Expr::to_field(schema)` —
not `get_type()` — so the Field's extension-type metadata is available;
`SedonaType::Raster` is identified by the `"sedona.raster"` extension
type, not by raw `DataType::Struct`.

Two adjustments to Commit 3's RS_EnsureLoaded UDF that pair with the
analyzer rule:

- Signature volatility flipped from `Volatile` to `Stable`. DataFusion's
  `CommonSubexprEliminate` pass skips volatile expressions
  (`is_volatile_node()` returns true iff signature is exactly
  `Volatility::Volatile`), so the previous Volatile classification
  would have prevented dedup of injected wraps. Semantically Stable is
  the honest call: within a query, byte materialisation is deterministic
  for fixed inputs; across queries the underlying storage may change.
- Signature shape changed from `user_defined` to `any(1, ...)`. The
  `AsyncScalarUDF` adapter doesn't delegate `coerce_types` to the inner
  impl, so `Signature::user_defined` (which requires `coerce_types`)
  produced a planning-time "function does not implement coerce_types"
  error. `any(1, ...)` lets DataFusion accept whatever single-arg type
  the caller passes; we validate "argument is Struct" in `return_type`.

The CSE-dedup test (load-bearing) constructs the post-analyzer plan
shape directly (two `needs_bytes` UDFs calling the same raster
column, both with their args wrapped) and applies DataFusion's
`CommonSubexprEliminate::rewrite` to it. Before CSE: two
RS_EnsureLoaded(rast) calls in the plan tree. After CSE: exactly one.
This validates the volatility-Stable choice and proves we lean on
DataFusion's existing optimiser for dedup rather than re-implementing
it in our rule.

8 tests cover:
- `expr_is_raster` detection on raster + non-raster columns
- analyzer rule wraps raster args of needs_bytes UDFs
- leaves non-raster args alone
- leaves metadata-only (un-annotated) UDFs alone
- recursion guard against rewrapping RS_EnsureLoaded
- nested calls are rewritten inside-out via transform_up
- CSE dedup of injected wraps (2 → 1)

Registered from `SedonaContext::new_from_context` via
`ctx.state_ref().write().add_analyzer_rule(...)`. The rule runs in
DataFusion's analyzer phase, before any optimizer rules including
PushDownFilter, PushDownLimit, and CommonSubexprEliminate.

See docs/research/async-outdb-design.md for the broader design.
Materialises OutDb GeoTIFF (and other GDAL-supported) bands when the
RS_EnsureLoaded UDF dispatches against the `"gdal"` registry key.

Shape:
- `GdalLoader` is a stateless `AsyncByteLoader` impl. The thread-local
  `GDALDatasetCache` from `gdal_dataset_provider` does the per-thread
  dataset memoisation; the loader itself owns no state, so instances
  are interchangeable and cheap to clone.
- Inside `load()`, GDAL's blocking API is wrapped in
  `tokio::task::spawn_blocking` so the calling tokio worker stays
  responsive. The dataset cache uses `Rc` (not Send/Sync) and is
  thread-local, so each blocking-pool worker maintains its own cache
  — concurrent loads via `buffer_unordered` build per-worker caches
  without contention.
- `read_as_bytes` returns a `Vec<u8>`; `Buffer::from_vec` wraps it
  zero-copy for the return path, matching the design's
  one-storage-to-column-copy goal end-to-end on the GDAL backend.

Request validation (synchronous, before spawning):
- Source shape must be 2-D and `dim_names == ["y", "x"]`. Higher-rank
  or transposed OutDb reads need MDArray support and explicit axis
  mapping, tracked separately.
- File's GDAL pixel type must match the band metadata's `data_type`
  claim. Caught here with a clear error before reading, otherwise a
  size mismatch would mis-blame the loader rather than naming the
  dtype problem.

Bootstrap registration: `SedonaContext::new_from_context` calls
`out.register_outdb_loader(GDAL_FORMAT, Arc::new(GdalLoader::new()))`.
GDAL itself is dlopen'd lazily by `sedona-gdal` (workspace-default
`gdal-sys = false`), so registration is safe on systems without
libgdal — the first OutDb GeoTIFF query surfaces a clean "libgdal not
found" error from `sedona-gdal::global` instead of a panic, and
non-OutDb queries are unaffected.

Module bookkeeping: the existing `source_uri` module (`parse_outdb_source`)
was previously `#[cfg(test)]`-only; it's needed in production now, so
the cfg gate is removed. `get_or_create_outdb_source` on
`GDALDatasetCache` is promoted from private to `pub(crate)` so the
loader can reach it.

7 unit tests cover: 2x3 UInt8 round-trip, default-band-1 when fragment
absent, request-validation rejections (non-2D, non-yx dim_names, dtype
mismatch), and propagation of GDAL-layer errors (missing file,
out-of-range band index). Tests use real GDAL via the existing
`with_gdal` helper and the in-crate driver pattern from
`gdal_dataset_provider.rs`.

Updated context-level test: bootstrap registry contains `"gdal"`
instead of being empty; plugin registrations are stacked on top.
Materialises OutDb Zarr-backed bands when the RS_EnsureLoaded UDF
dispatches against the `"zarr"` registry key. Pairs with the Zarr
reader's existing chunk-anchor URI scheme
(`<store_uri>#array=<array_path>&chunk=<i0>,<i1>,...`): the loader
parses the anchor, opens the Zarr store + array, retrieves the named
chunk via `zarrs`, and returns the bytes as an `arrow_buffer::Buffer`.

Shape:
- `ZarrLoader` is a stateless `AsyncByteLoader` impl. Per-call store
  opens; per-(store_uri, array_path) caching is a follow-up
  optimisation that doesn't change the trait surface.
- Inside `load()`, the sync `zarrs` API runs inside
  `tokio::task::spawn_blocking` so the calling tokio worker stays
  responsive. Resulting `Buffer` is Send, crosses the await cleanly.
- Validates the Zarr array's dtype against the request's
  `data_type` claim before reading. Mismatches surface here with a
  clear diagnostic rather than as a downstream byte-count surprise.

Delivery model: as an out-of-tree plugin, `sedona-raster-zarr` does
NOT depend on `sedona` — callers register the loader from their own
context-setup code:

  ctx.register_outdb_loader(
      sedona_raster_zarr::ZARR_FORMAT,
      Arc::new(sedona_raster_zarr::ZarrLoader::new()),
  );

This is the asymmetry from Commit 5 (GDAL): GDAL is upstream of
sedona and registered automatically; Zarr is a plugin and explicit.
The Python wheel `sedonadb-zarr` will surface a Python-level
`register(con)` shim that wraps this call in a follow-up.

Module bookkeeping: `ChunkAnchor`, `parse_chunk_anchor`, and the
`retrieve_chunk_bytes` chunk-read primitive in `loader.rs` were
previously `#[cfg(test)]`-only — they're production-needed now, so
the cfg gates are removed. `retrieve_chunk_bytes` is now `pub(crate)`
since `outdb_loader` is its only consumer outside tests.

5 unit tests cover: UInt8 chunk round-trip (write via `ArrayBuilder`,
read via the loader), dtype mismatch with clear diagnostic, malformed
chunk-anchor URI rejection (missing fragment), missing-array-path
propagation from `zarrs`, and cloud-scheme rejection (until cloud
stores are supported). Tests build a real Zarr v3 group via
`GroupBuilder` + `ArrayBuilder` against a temp filesystem.
@github-actions github-actions Bot requested a review from zhangfengcdt May 28, 2026 17:19
Three small adjustments from review:

- Drop the "Class-level by convention…" paragraph from
  `SedonaScalarUDF::with_needs_bytes()`'s doc comment. The remaining
  doc already names the purpose; the extra prose was prescriptive
  without adding information.
- Replace `unreachable!("matched above")` in
  `ensure_loaded_analyzer::rewrite_expr_node` with a
  `sedona_internal_datafusion_err!`-bearing let-else. Same
  intent-preserved-as-leading-comment, but the failure mode on a
  future refactor that breaks the invariant is a clean query-level
  internal error instead of a process-level panic — matches the
  defensive style used everywhere else in this PR.
- Fix codespell: "Implementors" → "Implementers" in the
  `AsyncByteLoader` trait doc.
@james-willis james-willis marked this pull request as ready for review May 28, 2026 18:49
Copy link
Copy Markdown
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't get through the analyzer bits today but thought I'd leave the loader part and I'll take a look at the analyzer tomorrow. In general this looks great!

Comment thread rust/sedona-raster-zarr/src/outdb_loader.rs
Comment thread rust/sedona-raster-zarr/src/outdb_loader.rs
Comment thread rust/sedona-raster-gdal/src/outdb_loader.rs Outdated
Comment thread rust/sedona-raster-gdal/src/outdb_loader.rs
Comment on lines +50 to +53
/// Raw source shape in `dim_names` order. The loader returns a
/// `Buffer` whose length equals `Π source_shape × data_type.byte_size()`
/// bytes, encoding pixels in C-order over `dim_names`.
pub source_shape: &'a [u64],
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this see the view as well? In some cases the IO will be the same (but we can discard a lot of pixels right away); in other cases we can avoid IO because the retile will map to a small subset of a contiguous range. (Apologies if I misunderstood what source_shape is here)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure. I havent thought about designing the system to allow "pushing" views into loaders. This PR does not support that and I'm not sure when we would be able to take advantage of such a thing.

Generally a cog tile or zarr chunk its retrieved atomically so we can't benefit from "pushing" the view in these cases. so I dont think theres a way to reduce io with such support.

Im not sure when you want to eagerly slice/crop vs when you want to perform that lazily.

Do you think crops on indb rasters should be eager in general? if so we could probably solve this with operator fusion + a rule in the crop operators impl that always eager crops the indb raster

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that if a loader is asked to load a 3x3 chunk of a 1e6 by 1e6 raster, the loader should return 9 pixels. It would be up to the caller to choose whether retiling or cropping is appropriate (and up to the loader to decide whether or not to cache 1e6 by 1e6 pixels in anticipation of reading another slice of the same chunk).

It's OK not to implement crop pushdown but if the information isn't there it can never happen.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created #897 to track this.

I'm still not really sure this kind of feature belongs in the loader and I feel it adds a large amount of complexity to each loader implementation, even if it doesnt support view-pruning.

I think we can do this later if we decide to do it; adding fields and tweaking/shimming existing identity based implementations is a 2 way door.

Comment thread rust/sedona-raster/src/outdb_loader.rs
Comment thread rust/sedona/src/context.rs Outdated
Addresses two of Dewey's review comments on the GDAL loader: the
non-cancellable spawn_blocking concern and the absence of a size
sanity check.

- New `MAX_OUTDB_LOAD_BYTES` (4 GiB) constant. Requests whose
  `Π source_shape × byte_size` exceeds this are rejected before
  the blocking task is spawned — runaway / corrupt-metadata loads
  can't tie up a blocking-pool thread.
- `CancelOnDrop` guard flips an `Arc<AtomicBool>` when the outer
  async future is dropped (e.g. query-cancel). The blocking task
  observes the flag between block-height-aligned strip reads and
  returns a cancellation error rather than running to completion.
- Read loop replaced: single `band.read_as_bytes` → pre-allocated
  output `Vec<u8>` + `band.read_into_bytes` per strip of
  `band.block_size().1` rows. Each strip writes into a contiguous
  output slice (no per-block row-splice math). Cancellation
  granule = natural block height: 1–64 rows for strip GeoTIFFs,
  256 rows for tile GeoTIFFs (COG-style). For whole-image-block
  formats (PNG / JPEG) the byte cap is the primary safety net.
- Helper `read_band_blockwise(band, output, w, h, byte_size, cancel)`
  extracted as the testable unit. Direct tests for pre-armed cancel
  and multi-strip byte-correctness; full-loader integration tests
  for byte-cap rejection.

10 unit tests, up from 7. All existing tests pass unchanged.
Addresses Dewey's review feedback that the registry belongs in the
session's `ConfigOptions` extensions, mirroring how `CrsProvider` is
hosted inside `SedonaOptions`. The motivation: any UDF can reach
`ConfigOptions` at dispatch time via
`AsyncScalarUDFImpl::invoke_async_with_args`, whereas the previous
closure-bound `Arc` was reachable only by the one UDF that closed
over it. ConfigOptions is the DataFusion-idiomatic placement.

Shape:
- New `OutDbLoaderConfig` `ConfigExtension` in `sedona-raster` that
  holds a `OutDbLoaderRegistryOption(Arc<RwLock<OutDbLoaderRegistry>>)`
  field. Lives in `sedona-raster` (not `sedona-common`) because the
  registry type is here and a cross-crate move would create a circular
  dep. Has its own `sedona.outdb_loader` PREFIX namespace alongside
  the existing `sedona` `SedonaOptions`.
- `SedonaContext::new_from_context` builds the shared
  `Arc<RwLock<OutDbLoaderRegistry>>` once, stashes a clone inside
  ConfigOptions via `extensions.insert(OutDbLoaderConfig::from_handle(...))`,
  and keeps another clone in the existing `SedonaContext.outdb_registry`
  field. Both handles share the same `RwLock`, so writes through
  `register_outdb_loader(&self, ...)` are immediately visible to UDF
  reads.
- `RsEnsureLoaded` no longer holds a registry field. It pulls the
  handle out of `args.config_options.extensions.get::<OutDbLoaderConfig>()`
  at the top of every `invoke_async_with_args` call. The struct is
  now session-agnostic — one instance is correct for any session that
  has the extension registered. Tests updated to call `RsEnsureLoaded::new()`
  with no args.
- `ensure_loaded(...)` test helper signature simplified to
  `(input, lookup_closure)`; the redundant registry arg is gone.

Mirrors the precedent established by `CrsProviderOption`/`SedonaOptions`
in `sedona-common`. If a future caller bypasses `SedonaContext::new()`
and builds their own session without the extension, RS_EnsureLoaded
returns a clear internal error naming `OutDbLoaderConfig` so the fix
is obvious.

All 109 sedona-crate tests + 72 sedona-raster tests pass unchanged.
Copy link
Copy Markdown
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few notes, but the details here are all great!

Comment on lines +50 to +53
/// Raw source shape in `dim_names` order. The loader returns a
/// `Buffer` whose length equals `Π source_shape × data_type.byte_size()`
/// bytes, encoding pixels in C-order over `dim_names`.
pub source_shape: &'a [u64],
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that if a loader is asked to load a 3x3 chunk of a 1e6 by 1e6 raster, the loader should return 9 pixels. It would be up to the caller to choose whether retiling or cropping is appropriate (and up to the loader to decide whether or not to cache 1e6 by 1e6 pixels in anticipation of reading another slice of the same chunk).

It's OK not to implement crop pushdown but if the information isn't there it can never happen.

Comment thread rust/sedona-raster/src/outdb_loader.rs Outdated
Comment thread rust/sedona/src/context.rs Outdated
Comment thread rust/sedona/src/context.rs
Comment thread rust/sedona/src/rs_ensure_loaded.rs Outdated
Comment thread rust/sedona-raster-functions/src/rs_ensure_loaded.rs
Comment thread rust/sedona/src/rs_ensure_loaded.rs Outdated
Comment thread rust/sedona/src/ensure_loaded_analyzer.rs Outdated
Comment thread rust/sedona-expr/src/scalar_udf.rs Outdated
Comment thread rust/sedona/src/ensure_loaded_analyzer.rs Outdated
…bug on AsyncByteLoader

Addresses review comments on the OutDb lazy-loading PR:

* RS_EnsureLoaded now implements `return_field_from_args` returning the
  input field verbatim, so the `"sedona.raster"` extension metadata
  survives. The previous `return_type` returned a bare `DataType::Struct`,
  which dropped the extension — downstream code (including the analyzer's
  own raster detection) would stop recognising the wrapped column as a
  Raster. `return_type` now hard-errors since it is no longer the
  authoritative output-type path. Covered by two regression tests.
* `AsyncByteLoader` gains a `Debug` supertrait so loader-holding structs
  can derive `Debug`; test mocks updated.
* Mark the RS_EnsureLoaded InDb passthrough copy with a pointer to the
  tracking issue for the RasterBuilder zero-copy work (GH apache#894).
…r rule

Addresses review comments on the OutDb lazy-loading PR:

* Convert the `EnsureLoadedAnalyzerRule` (analyzer rule in `sedona`) to
  `EnsureLoadedOptimizerRule` (logical optimizer rule in
  `sedona-query-planner`, alongside the spatial-join rules). Registered
  immediately before DataFusion's `common_sub_expression_eliminate` so
  CSE dedupes the injected `RS_EnsureLoaded(col)` wraps in the same pass.
* Resolve `RS_EnsureLoaded` from `OptimizerConfig::function_registry()`
  at rewrite time instead of capturing an `Arc<ScalarUDF>` at
  construction. `SedonaContext` no longer threads the UDF into a rule —
  it registers the UDF (before any query plans) and pushes the rule.
* Move `RS_ENSURE_LOADED_NAME` to `sedona-expr` so both the UDF impl
  (`sedona`) and the rule (`sedona-query-planner`) can reference it
  without a dependency cycle.
* Idempotency: optimizer rules run to a fixpoint, and since the output
  field now preserves the raster extension type, a naive re-pass would
  double-wrap. Guard against re-wrapping an arg already wrapped in
  `RS_EnsureLoaded`.
* Handle binary/multi-input nodes: resolve expression types against the
  merged schema of all inputs (via `DFSchema::join`) so a join filter
  like `RS_Intersects(a.rast, b.rast)` gets its raster args wrapped.
  Unresolvable/ambiguous merges skip the node rather than failing.

Tests cover wrap/no-wrap/metadata-only, recursion + idempotency guards,
join-merged-schema resolution, and the before-CSE registration order.
…eralize UDF metadata

Addresses review comments on the OutDb lazy-loading PR:

* Move the `RsEnsureLoaded` async UDF from `sedona` to
  `sedona-raster-functions`, alongside the other `RS_*` functions. It
  only reaches down into `sedona-raster`/`sedona-expr`/`sedona-schema`,
  so no dependency cycle; `sedona` already depends on
  `sedona-raster-functions` and imports it from there.
* Replace `SedonaScalarUDF`'s `needs_bytes: bool` with a generic
  `metadata: HashMap<String, String>` (`with_metadata`/`metadata()`),
  with `with_needs_bytes`/`needs_bytes` kept as convenience over a
  well-known `NEEDS_PIXELS_METADATA_KEY`. A string map future-proofs the
  planner-visible UDF metadata surface without a new field per flag.
* Carry that metadata across the sedona-extension FFI: a new `metadata`
  callback on `SedonaCScalarKernel` (mirroring `function_name`) returns
  the map as a JSON object string; `ExportedScalarKernel::with_metadata`
  serialises it and `ImportedScalarKernel::metadata()` parses it back,
  so a plugin-defined function can declare planner flags across the
  cdylib boundary. Header updated to match. Round-trip test added.
Breadcrumbs at the two places the non-identity-view gap lives:
- `OutDbLoadRequest` doc — the request carries no view and `load`
  returns a bare `Buffer`; the view-in/realized-out contract is apache#897.
- `RasterRefImpl::band()`'s non-identity-view rejection — note that it
  is the guardrail keeping `RS_EnsureLoaded` (which drops `view()` on
  rebuild) correct, and that lifting it requires the apache#897 round-trip.
Now that AsyncByteLoader carries a Debug supertrait, Arc<dyn AsyncByteLoader>
is Debug and the registry can derive it — replacing the hand-written impl
that existed only to work around the loaders not being Debug (and whose
"loaders don't carry a meaningful Debug surface" comment is now stale).
Completes Dewey's review comment: add the supertrait AND derive here.
…e declaration references

The logical optimizer rule's source file was left untracked when its
`pub mod ensure_loaded;` declaration and call sites landed, so the pushed
branch had a dangling module — local builds passed (file present but
untracked) while CI rustfmt failed to resolve the module.
… field

Adding a `metadata` callback to `SedonaCScalarKernel` broke s2geography
init: that struct's definition is duplicated in a vendored copy inside
the `sedona-s2geography` submodule
(s2geography/src/s2geography/sedona_udf/sedona_extension.h), and
`S2GeogInitKernels` lays out the struct from that copy. Inserting the
field changed the struct's size and field offsets on the Rust side only,
so the C++ and Rust disagreed and init failed with code 22 (every test
that builds a SedonaContext panicked).

Revert just the FFI plumbing — the `metadata` struct field + header
callback, `ExportedScalarKernel::with_metadata`, `ImportedScalarKernel`
metadata parsing, the round-trip test, and the serde_json dep. The
in-process `SedonaScalarUDF` metadata map (the substantive bool→map
generalization) is unaffected and stays.

Carrying UDF metadata across the kernel FFI requires updating the
submodule's vendored header and bumping the submodule in lockstep;
deferred to a follow-up that does it properly (and appends, never
inserts mid-struct).
Copy link
Copy Markdown
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All just details from me!

The one hiccup is that the AsyncUDFs might be dropping metadata (see comment), in which case you'll have to make a dummy scalar function struct InsertMetadataUdf { field: FieldRef, inner: ScalarUDF } that wraps RS_EnsureLoaded with an overridden return_field. 😮‍💨

Comment on lines +247 to +262
/// Mark this UDF as one whose kernels read raster pixel bytes from
/// their inputs — convenience for setting [`NEEDS_PIXELS_METADATA_KEY`]
/// to `"true"`. The `RS_EnsureLoaded` optimizer rule reads this to
/// decide whether to wrap raster arguments with the async
/// byte-materialisation UDF.
pub fn with_needs_bytes(self) -> SedonaScalarUDF {
self.with_metadata(NEEDS_PIXELS_METADATA_KEY, "true")
}

/// Returns whether this UDF reads raster pixel bytes from its inputs
/// (i.e. carries [`NEEDS_PIXELS_METADATA_KEY`] = `"true"`).
pub fn needs_bytes(&self) -> bool {
self.metadata
.get(NEEDS_PIXELS_METADATA_KEY)
.map(String::as_str)
== Some("true")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love having raster function (or any geo-specific) stuff live here. with_needs_bytes is sufficiently compact that you can probably just inline it to the functions that need it. The named constant can probably live in sedona-raster-functions and you can duplicate it if that creates weird cross-crate interactions?

Comment on lines +348 to +351
/// Register an async OutDb byte loader under a `format` key.
///
/// `format` matches the band-level `outdb_format` column value
/// (`"gdal"`, `"zarr"`, …). At query time the `RS_EnsureLoaded` UDF
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this format be more like "tif", "zarr", etc.? As in, should the byte loader get registered by file format or backend? (I had assumed we'd register by file format). You could also have the trait have a pub fn supported_formats() -> Vec<&str> and just register_outdb_loader(loader).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion here. part of the reason I used "gdal" is because I don't really know what formats:

  1. What formats GDAL supports
  2. What formats GDAL will support
  3. what formats we [will] support with GDAL

calling it GDAL leaves it such that users can just try to load random things with GDAL and it might even work.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a list of drivers, but I do think you want to separate the production of raster metadata from the resolving of metadata bytes (stac is the best reason for this but there could be other metadata catalogs that make sense). You could use a Vec<AsyncByteLoader> instead of hash map and call byte_loader.supports(format), where the gdal loader could return true based on some runtime call.

/// and out-of-tree plugins (`sedona-raster-zarr::register(&ctx)`
/// from user code after construction). Later registrations under the
/// same `format` key overwrite earlier ones.
pub fn register_outdb_loader(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Should this be an array_loader or raster_loader? OutDb is a bit of a historical name.

Comment on lines +750 to +752
// GDAL is always registered at bootstrap (compiled-in backend).
// Plugin backends like Zarr add themselves later via
// `register_outdb_loader`.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I know you've been using the word "plugin" but "extension" or "registered at runtime" is closer to existing language (e.g., sedona-extension).

Comment thread rust/sedona-raster-zarr/src/source_uri.rs
Comment on lines +178 to +181
fn return_field_from_args(&self, args: ReturnFieldArgs) -> Result<FieldRef> {
// Identity on schema: the output raster has the same fields as the
// input — only the `data` column's bytes change. Return the input
// field verbatim so its `"sedona.raster"` extension metadata
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #831 Kristin found that Async UDFs drop metadata (at the physical expression). The optimizer rule that emplaces this may have to add it back via a dummy udf (or, simpler, we don't strictly need SedonaType::Raster to emit an extension name...a normal struct is fine).

Comment on lines +212 to +213
#[async_trait]
impl AsyncScalarUDFImpl for RsEnsureLoaded {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also suggest a batch size here? (I think that's something you can implement on the async udf)

Comment on lines +50 to +54
/// Re-exported from `sedona-expr` (its canonical home — see
/// [`sedona_expr::scalar_udf::RS_ENSURE_LOADED_NAME`]) so existing
/// `crate::rs_ensure_loaded::RS_ENSURE_LOADED_NAME` references keep
/// resolving.
pub use sedona_expr::scalar_udf::RS_ENSURE_LOADED_NAME;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can just use "rs_ensureloaded" instead of a named constant

Comment on lines +86 to +107
// Expressions on a node are evaluated against the combined schema
// of its inputs: single-input nodes (Projection, Filter, …) use
// their one input; a Join's `filter` references left ⋈ right, so
// a predicate like `RS_Intersects(a.rast, b.rast)` resolves only
// against the merged schema. Leaf nodes carry no wrappable
// expressions.
let inputs = plan.inputs();
if inputs.is_empty() {
return Ok(Transformed::no(plan));
}
let Some(schema) = merged_input_schema(&inputs) else {
// Schemas couldn't be merged (e.g. ambiguous duplicate
// qualifiers in a self-join). Skip this node rather than
// failing the query — a missed wrap surfaces later as a clear
// "raster bytes not loaded" error, not a wrong result.
return Ok(Transformed::no(plan));
};
drop(inputs);

plan.map_expressions(|e| {
e.transform_up(|expr| rewrite_expr_node(expr, &schema, &ensure_loaded_udf))
})
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work?

Suggested change
// Expressions on a node are evaluated against the combined schema
// of its inputs: single-input nodes (Projection, Filter, …) use
// their one input; a Join's `filter` references left ⋈ right, so
// a predicate like `RS_Intersects(a.rast, b.rast)` resolves only
// against the merged schema. Leaf nodes carry no wrappable
// expressions.
let inputs = plan.inputs();
if inputs.is_empty() {
return Ok(Transformed::no(plan));
}
let Some(schema) = merged_input_schema(&inputs) else {
// Schemas couldn't be merged (e.g. ambiguous duplicate
// qualifiers in a self-join). Skip this node rather than
// failing the query — a missed wrap surfaces later as a clear
// "raster bytes not loaded" error, not a wrong result.
return Ok(Transformed::no(plan));
};
drop(inputs);
plan.map_expressions(|e| {
e.transform_up(|expr| rewrite_expr_node(expr, &schema, &ensure_loaded_udf))
})
plan.map_expressions(|e| {
e.transform_up(|expr| rewrite_expr_node(expr, plan.schema(), &ensure_loaded_udf))
})

// so a future refactor that breaks the invariant fails the query
// cleanly instead of crashing a worker.
let Expr::ScalarFunction(ScalarFunction { func, args }) = expr else {
return Err(sedona_internal_datafusion_err!(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return Err(sedona_internal_datafusion_err!(
return sedona_internal_err!(

# Conflicts:
#	Cargo.lock
#	rust/sedona-raster-gdal/Cargo.toml
#	rust/sedona-raster-gdal/src/lib.rs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants