
[kernel-spark] Create v2 adapters for metadata and protocol #6546

Merged
TimothyW553 merged 2 commits into delta-io:master from PorridgeSwim:stack/SparkMetadataAdapter
May 5, 2026

Conversation

@PorridgeSwim
Collaborator

@PorridgeSwim PorridgeSwim commented Apr 10, 2026

🥞 Stacked PR

Use this link to review incremental changes.


Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

PR 1/7 in the non-additive schema evolution for V2 streaming connector stack.

The shared V1 Scala utilities (DeltaColumnMapping, DeltaSourceMetadataEvolutionSupport) operate on AbstractMetadata/AbstractProtocol, but V2 holds Kernel types. This PR creates two adapter classes that bridge the gap:

  • KernelMetadataAdapter: Kernel Metadata → AbstractMetadata (schema conversion via SchemaUtils; partition columns and configuration converted to Scala collections)
  • KernelProtocolAdapter: Kernel Protocol → AbstractProtocol (maps reader/writer features to Option[Set[String]])
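
As a rough illustration of the protocol side of this bridge, here is a minimal, self-contained sketch; it is not the PR's code. `KernelLikeProtocol`, `AbstractProtocolLike`, and `Adapter` are hypothetical stand-ins for the Kernel and Spark types, and `java.util.Optional` stands in for Scala's `Option[Set[String]]`. It assumes feature sets are only surfaced when the protocol versions support table features (reader version 3, writer version 7), and are empty otherwise.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Objects;
import java.util.Optional;
import java.util.Set;

class ProtocolAdapterSketch {

  /** Hypothetical stand-in for the Kernel protocol type (not the real Kernel API). */
  static final class KernelLikeProtocol {
    final int minReaderVersion;
    final int minWriterVersion;
    final Set<String> readerFeatures;
    final Set<String> writerFeatures;

    KernelLikeProtocol(
        int minReaderVersion,
        int minWriterVersion,
        Set<String> readerFeatures,
        Set<String> writerFeatures) {
      this.minReaderVersion = minReaderVersion;
      this.minWriterVersion = minWriterVersion;
      this.readerFeatures = readerFeatures;
      this.writerFeatures = writerFeatures;
    }

    // Assumption: table features start at reader version 3 and writer version 7.
    boolean supportsReaderFeatures() {
      return minReaderVersion >= 3;
    }

    boolean supportsWriterFeatures() {
      return minWriterVersion >= 7;
    }
  }

  /** Hypothetical stand-in for the abstract protocol view that shared utilities consume. */
  interface AbstractProtocolLike {
    int minReaderVersion();

    int minWriterVersion();

    Optional<Set<String>> readerFeatures();

    Optional<Set<String>> writerFeatures();
  }

  /** Wraps the Kernel-like protocol and exposes it through the abstract view. */
  static final class Adapter implements AbstractProtocolLike {
    private final KernelLikeProtocol kernelProtocol;

    Adapter(KernelLikeProtocol kernelProtocol) {
      // Reject null up front so later accessors never fail with an unclear error.
      this.kernelProtocol = Objects.requireNonNull(kernelProtocol, "kernelProtocol is null");
    }

    @Override
    public int minReaderVersion() {
      return kernelProtocol.minReaderVersion;
    }

    @Override
    public int minWriterVersion() {
      return kernelProtocol.minWriterVersion;
    }

    @Override
    public Optional<Set<String>> readerFeatures() {
      if (kernelProtocol.supportsReaderFeatures()) {
        return Optional.of(Collections.unmodifiableSet(kernelProtocol.readerFeatures));
      }
      return Optional.empty();
    }

    @Override
    public Optional<Set<String>> writerFeatures() {
      if (kernelProtocol.supportsWriterFeatures()) {
        return Optional.of(Collections.unmodifiableSet(kernelProtocol.writerFeatures));
      }
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    Set<String> reader = new HashSet<>(Arrays.asList("v2Checkpoint"));
    Set<String> writer = new HashSet<>(Arrays.asList("v2Checkpoint", "rowTracking"));
    AbstractProtocolLike withFeatures = new Adapter(new KernelLikeProtocol(3, 7, reader, writer));
    AbstractProtocolLike legacy =
        new Adapter(new KernelLikeProtocol(1, 2, new HashSet<String>(), new HashSet<String>()));
    System.out.println("table-features reader set present: " + withFeatures.readerFeatures().isPresent());
    System.out.println("legacy reader set present: " + legacy.readerFeatures().isPresent());
  }
}
```

Returning an empty optional for legacy protocols, rather than an empty set, lets callers tell "features not supported" apart from "supported, but none enabled".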

Also adds columnMappingMode and partitionSchema to the AbstractMetadata trait — V1's Metadata already had these fields, the trait just didn't expose them.
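
One way to read the trait change: `partitionSchema` can be a derived member with a default implementation, so any implementer that supplies `schema` and `partitionColumns` gets it without extra code. The sketch below illustrates that idea in Java under stated assumptions; `Field`, `MetadataLike`, `lookup`, and `metadataOf` are hypothetical, simplified stand-ins and not the actual trait or Spark's `StructType`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class PartitionSchemaSketch {

  /** Hypothetical, simplified schema field: just a name and a type name. */
  static final class Field {
    final String name;
    final String typeName;

    Field(String name, String typeName) {
      this.name = name;
      this.typeName = typeName;
    }
  }

  /** Hypothetical stand-in for the abstract metadata view. */
  interface MetadataLike {
    List<Field> schema();

    List<String> partitionColumns();

    /** Derived member: resolve each partition column against the schema, in partition order. */
    default List<Field> partitionSchema() {
      List<Field> result = new ArrayList<>();
      for (String column : partitionColumns()) {
        result.add(lookup(schema(), column));
      }
      return result;
    }
  }

  /** Finds a field by exact name; fails loudly if a partition column is missing. */
  static Field lookup(List<Field> fields, String name) {
    for (Field field : fields) {
      if (field.name.equals(name)) {
        return field;
      }
    }
    throw new IllegalArgumentException("Partition column not found in schema: " + name);
  }

  /** Builds a metadata view that only supplies the two primitive members. */
  static MetadataLike metadataOf(final List<Field> fields, final List<String> partitionCols) {
    return new MetadataLike() {
      @Override
      public List<Field> schema() {
        return fields;
      }

      @Override
      public List<String> partitionColumns() {
        return partitionCols;
      }
    };
  }

  public static void main(String[] args) {
    MetadataLike metadata =
        metadataOf(
            Arrays.asList(new Field("part1", "integer"), new Field("col1", "string")),
            Collections.singletonList("part1"));
    for (Field field : metadata.partitionSchema()) {
      System.out.println(field.name + ": " + field.typeName);
    }
  }
}
```

An implementer that wants to avoid recomputing the derived value on every call can override the default and cache the result.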

How was this patch tested?

Unit tests in ActionAdaptersTest.java: table-features protocol, legacy protocol, full metadata round-trip, null optional fields, and null constructor rejection.

Does this PR introduce any user-facing changes?

No.

@PorridgeSwim PorridgeSwim changed the title from "create v2 adapters for metadata and protocol" to "[kernel-spark] Create v2 adapters for metadata and protocol" Apr 10, 2026
@PorridgeSwim PorridgeSwim force-pushed the stack/SparkMetadataAdapter branch from ae2f693 to f380f30 on April 10, 2026 21:27
@PorridgeSwim
Collaborator Author

Range-diff: master (ae2f693 -> f380f30)
.github/CODEOWNERS
@@ -0,0 +1,27 @@
+diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
+new file mode 100644
+--- /dev/null
++++ b/.github/CODEOWNERS
++# CODEOWNERS file for Delta Lake
++# This file defines code owners who must approve changes to specific files/directories.
++# See: https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners
++
++# Build configuration files and directories
++/build/                         @tdas
++/build.sbt                      @tdas
++/project/                       @tdas
++/version.sbt                    @tdas
++
++# Python client, examples, and tooling
++/python/                        @tdas
++
++# Iceberg / UniForm integration and shaded Iceberg sources
++/iceberg/                       @tdas
++/icebergShaded/                 @tdas
++
++# Spark V2 and Unified modules
++/spark/v2/                      @tdas @huan233usc @TimothyW553 @raveeram-db @murali-db
++/spark-unified/                 @tdas @huan233usc @TimothyW553 @raveeram-db @murali-db
++
++# All files in the root directory
++/*                              @tdas
\ No newline at end of file
.github/ci-requirements/spark-python/spark4.0.in
@@ -0,0 +1,53 @@
+diff --git a/.github/ci-requirements/spark-python/spark4.0.in b/.github/ci-requirements/spark-python/spark4.0.in
+new file mode 100644
+--- /dev/null
++++ b/.github/ci-requirements/spark-python/spark4.0.in
++# Requirements for spark_python_test.yaml — Spark 4.0 variant
++# Python: 3.10
++#
++# To regenerate the lock file:
++#   UV_EXCLUDE_NEWER="2026-03-10T00:00:00Z" uv pip compile \
++#     --python-version 3.10 \
++#     --python-platform linux \
++#     --generate-hashes \
++#     -o spark4.0.lock \
++#     spark4.0.in
++#
++# NOTE: pyspark is NOT included — it's installed separately in the workflow
++# because its version is dynamically resolved from the build matrix.
++
++# Build tooling (pip pinned for -SNAPSHOT version string compatibility)
++pip==24.0
++setuptools==41.1.0
++wheel==0.33.4
++
++# Linting & formatting
++flake8==3.9.0
++black==23.12.1
++pydocstyle==3.0.0
++
++# Type checking
++mypy==1.8.0
++mypy-protobuf==3.3.0
++importlib_metadata==3.10.0
++
++# Packaging
++cryptography==37.0.4
++twine==4.0.1
++pypandoc==1.3.3
++
++# Data / analytics
++pandas==2.2.0
++pyarrow==15.0.0
++numpy==1.22.4
++
++# PySpark transitive dependency (installed via --no-deps, so we lock py4j here)
++py4j==0.10.9.9
++
++# gRPC / protobuf (Spark Connect — Spark 4.0)
++grpcio==1.67.0
++grpcio-status==1.67.0
++googleapis-common-protos==1.65.0
++protobuf==5.29.1
++googleapis-common-protos-stubs==2.2.0
++grpc-stubs==1.24.11
\ No newline at end of file

... (truncated, output exceeded 60000 bytes)

Reproduce locally: git range-diff b8e8de4..ae2f693 fb2973d..f380f30 | Disable: git config gitstack.push-range-diff false

@PorridgeSwim PorridgeSwim force-pushed the stack/SparkMetadataAdapter branch from f380f30 to 7713d81 on April 10, 2026 21:39
@PorridgeSwim
Collaborator Author

Range-diff: master (f380f30 -> 7713d81)

... (truncated, output exceeded 60000 bytes)

Reproduce locally: git range-diff b8e8de4..f380f30 fb2973d..7713d81 | Disable: git config gitstack.push-range-diff false

@PorridgeSwim PorridgeSwim force-pushed the stack/SparkMetadataAdapter branch from 7713d81 to 08b4adc on April 10, 2026 21:47
@PorridgeSwim
Collaborator Author

Range-diff: master (7713d81 -> 08b4adc)

... (truncated, output exceeded 60000 bytes)

Reproduce locally: git range-diff b8e8de4..7713d81 fb2973d..08b4adc | Disable: git config gitstack.push-range-diff false

@PorridgeSwim PorridgeSwim force-pushed the stack/SparkMetadataAdapter branch from 08b4adc to 27798f7 on April 10, 2026 22:19
@PorridgeSwim
Collaborator Author

Range-diff: master (08b4adc -> 27798f7)

... (truncated, output exceeded 60000 bytes)

Reproduce locally: git range-diff b8e8de4..08b4adc fb2973d..27798f7 | Disable: git config gitstack.push-range-diff false

@PorridgeSwim PorridgeSwim force-pushed the stack/SparkMetadataAdapter branch from 27798f7 to 6205321 on April 10, 2026 22:28
@PorridgeSwim
Collaborator Author

Range-diff: master (27798f7 -> 6205321)
spark/v2/src/main/java/io/delta/spark/internal/v2/adapters/SparkProtocolAdapter.java
@@ -51,8 +51,7 @@
 +  @Override
 +  public Option<Set<String>> readerFeatures() {
 +    if (kernelProtocol.supportsReaderFeatures()) {
-+      return Option.apply(
-+          CollectionConverters.asScala(kernelProtocol.getReaderFeatures()).toSet());
++      return Option.apply(CollectionConverters.asScala(kernelProtocol.getReaderFeatures()).toSet());
 +    }
 +    return Option.empty();
 +  }
@@ -60,8 +59,7 @@
 +  @Override
 +  public Option<Set<String>> writerFeatures() {
 +    if (kernelProtocol.supportsWriterFeatures()) {
-+      return Option.apply(
-+          CollectionConverters.asScala(kernelProtocol.getWriterFeatures()).toSet());
++      return Option.apply(CollectionConverters.asScala(kernelProtocol.getWriterFeatures()).toSet());
 +    }
 +    return Option.empty();
 +  }
spark/v2/src/test/java/io/delta/spark/internal/v2/adapters/ActionAdaptersTest.java
@@ -33,7 +33,6 @@
 +import java.util.*;
 +import org.junit.jupiter.api.Test;
 +import scala.jdk.javaapi.CollectionConverters;
-+import scala.jdk.javaapi.OptionConverters;
 +
 +/** Unit tests for {@link SparkMetadataAdapter} and {@link SparkProtocolAdapter}. */
 +public class ActionAdaptersTest {
@@ -43,8 +42,7 @@
 +  @Test
 +  public void testProtocolAdapterWithTableFeatures() {
 +    Set<String> readerFeatures = new HashSet<>(Arrays.asList("v2Checkpoint"));
-+    Set<String> writerFeatures =
-+        new HashSet<>(Arrays.asList("v2Checkpoint", "rowTracking"));
++    Set<String> writerFeatures = new HashSet<>(Arrays.asList("v2Checkpoint", "rowTracking"));
 +    Protocol kernelProtocol = new Protocol(3, 7, readerFeatures, writerFeatures);
 +
 +    SparkProtocolAdapter adapter = new SparkProtocolAdapter(kernelProtocol);
@@ -107,9 +105,7 @@
 +            "{\"type\":\"struct\",\"fields\":"
 +                + "[{\"name\":\"part1\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{}},"
 +                + "{\"name\":\"col1\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{}}]}",
-+            new StructType()
-+                .add("part1", IntegerType.INTEGER)
-+                .add("col1", IntegerType.INTEGER),
++            new StructType().add("part1", IntegerType.INTEGER).add("col1", IntegerType.INTEGER),
 +            partCols,
 +            Optional.of(42L),
 +            VectorUtils.stringStringMapValue(configuration));

Reproduce locally: git range-diff b8e8de4..27798f7 b8e8de4..6205321 | Disable: git config gitstack.push-range-diff false

@PorridgeSwim PorridgeSwim force-pushed the stack/SparkMetadataAdapter branch from 6205321 to 4e8ae79 on April 11, 2026 00:17
@PorridgeSwim
Collaborator Author

Range-diff: master (6205321 -> 4e8ae79)
spark/v2/src/test/java/io/delta/spark/internal/v2/adapters/ActionAdaptersTest.java
@@ -29,6 +29,7 @@
 +import io.delta.kernel.internal.util.InternalUtils;
 +import io.delta.kernel.internal.util.VectorUtils;
 +import io.delta.kernel.types.IntegerType;
++import io.delta.kernel.types.StringType;
 +import io.delta.kernel.types.StructType;
 +import java.util.*;
 +import org.junit.jupiter.api.Test;
@@ -104,8 +105,10 @@
 +            format,
 +            "{\"type\":\"struct\",\"fields\":"
 +                + "[{\"name\":\"part1\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{}},"
-+                + "{\"name\":\"col1\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{}}]}",
-+            new StructType().add("part1", IntegerType.INTEGER).add("col1", IntegerType.INTEGER),
++                + "{\"name\":\"col1\",\"type\":\"string\",\"nullable\":false,\"metadata\":{}}]}",
++            new StructType()
++                .add("part1", IntegerType.INTEGER)
++                .add("col1", StringType.STRING, false /* nullable */),
 +            partCols,
 +            Optional.of(42L),
 +            VectorUtils.stringStringMapValue(configuration));
@@ -115,8 +118,10 @@
 +    assertEquals("id", adapter.id());
 +    assertEquals("name", adapter.name());
 +    assertEquals("description", adapter.description());
-+    assertEquals("parquet", adapter.schema().apply("part1").dataType().typeName());
-+    assertEquals("integer", adapter.schema().apply("col1").dataType().typeName());
++    assertEquals("integer", adapter.schema().apply("part1").dataType().typeName());
++    assertTrue(adapter.schema().apply("part1").nullable());
++    assertEquals("string", adapter.schema().apply("col1").dataType().typeName());
++    assertFalse(adapter.schema().apply("col1").nullable());
 +    assertEquals(2, adapter.schema().fields().length);
 +    assertEquals(
 +        Collections.singletonList("part1"),

Reproduce locally: git range-diff b8e8de4..6205321 b8e8de4..4e8ae79 | Disable: git config gitstack.push-range-diff false

@PorridgeSwim
Collaborator Author

Range-diff: master (d1be3cc -> ca71e8f)
spark/src/main/scala/org/apache/spark/sql/delta/actions/actions.scala
@@ -0,0 +1,12 @@
+diff --git a/spark/src/main/scala/org/apache/spark/sql/delta/actions/actions.scala b/spark/src/main/scala/org/apache/spark/sql/delta/actions/actions.scala
+--- a/spark/src/main/scala/org/apache/spark/sql/delta/actions/actions.scala
++++ b/spark/src/main/scala/org/apache/spark/sql/delta/actions/actions.scala
+ 
+   /** Returns the partitionSchema as a [[StructType]] */
+   @JsonIgnore
+-  lazy val partitionSchema: StructType =
+-    new StructType(partitionColumns.map(c => schema(c)).toArray)
++  override lazy val partitionSchema: StructType = super.partitionSchema
+ 
+   /** Partition value keys in the AddFile map. */
+   @JsonIgnore
\ No newline at end of file
spark/src/main/scala/org/apache/spark/sql/delta/v2/interop/AbstractMetadata.scala
@@ -0,0 +1,22 @@
+diff --git a/spark/src/main/scala/org/apache/spark/sql/delta/v2/interop/AbstractMetadata.scala b/spark/src/main/scala/org/apache/spark/sql/delta/v2/interop/AbstractMetadata.scala
+--- a/spark/src/main/scala/org/apache/spark/sql/delta/v2/interop/AbstractMetadata.scala
++++ b/spark/src/main/scala/org/apache/spark/sql/delta/v2/interop/AbstractMetadata.scala
+ 
+ package org.apache.spark.sql.delta.v2.interop
+ 
++import org.apache.spark.sql.delta.DeltaColumnMappingMode
+ import org.apache.spark.sql.types.StructType
+ 
+ /**
+ 
+   /** The table properties/configuration defined on the table. */
+   def configuration: Map[String, String]
++
++  /** Column mapping mode for this table. */
++  def columnMappingMode: DeltaColumnMappingMode
++
++  /** Returns the partitionSchema as a [[StructType]] */
++  def partitionSchema: StructType =
++    new StructType(partitionColumns.map(c => schema(c)).toArray)
+ }
+ 
\ No newline at end of file
spark/v2/src/main/java/io/delta/spark/internal/v2/adapters/SparkMetadataAdapter.java
@@ -20,11 +20,16 @@
 +package io.delta.spark.internal.v2.adapters;
 +
 +import io.delta.kernel.internal.actions.Metadata;
++import io.delta.kernel.internal.util.ColumnMapping;
 +import io.delta.kernel.internal.util.VectorUtils;
 +import io.delta.spark.internal.v2.utils.ScalaUtils;
 +import io.delta.spark.internal.v2.utils.SchemaUtils;
 +import java.util.Objects;
 +import java.util.stream.Collectors;
++import org.apache.spark.sql.delta.DeltaColumnMappingMode;
++import org.apache.spark.sql.delta.IdMapping$;
++import org.apache.spark.sql.delta.NameMapping$;
++import org.apache.spark.sql.delta.NoMapping$;
 +import org.apache.spark.sql.delta.v2.interop.AbstractMetadata;
 +import org.apache.spark.sql.types.StructType;
 +import scala.collection.immutable.Map;
@@ -38,6 +43,7 @@
 +public class SparkMetadataAdapter implements AbstractMetadata {
 +
 +  private final Metadata kernelMetadata;
++  private volatile StructType cachedPartitionSchema;
 +
 +  public SparkMetadataAdapter(Metadata kernelMetadata) {
 +    this.kernelMetadata = Objects.requireNonNull(kernelMetadata, "kernelMetadata is null");
@@ -76,4 +82,28 @@
 +  public Map<String, String> configuration() {
 +    return ScalaUtils.toScalaMap(kernelMetadata.getConfiguration());
 +  }
++
++  @Override
++  public DeltaColumnMappingMode columnMappingMode() {
++    ColumnMapping.ColumnMappingMode kernelMode =
++        ColumnMapping.getColumnMappingMode(metadata.getConfiguration());
++    switch (kernelMode) {
++      case NONE:
++        return NoMapping$.MODULE$;
++      case ID:
++        return IdMapping$.MODULE$;
++      case NAME:
++        return NameMapping$.MODULE$;
++      default:
++        throw new UnsupportedOperationException("Unsupported column mapping mode: " + kernelMode);
++    }
++  }
++
++  @Override
++  public StructType partitionSchema() {
++    if (cachedPartitionSchema == null) {
++      cachedPartitionSchema = AbstractMetadata.super.partitionSchema();
++    }
++    return cachedPartitionSchema;
++  }
 +}
\ No newline at end of file
spark/v2/src/test/java/io/delta/spark/internal/v2/adapters/ActionAdaptersTest.java
@@ -32,6 +32,8 @@
 +import io.delta.kernel.types.StringType;
 +import io.delta.kernel.types.StructType;
 +import java.util.*;
++import org.apache.spark.sql.delta.NameMapping$;
++import org.apache.spark.sql.delta.NoMapping$;
 +import org.junit.jupiter.api.Test;
 +import scala.jdk.javaapi.CollectionConverters;
 +
@@ -95,7 +97,9 @@
 +        };
 +    Map<String, String> formatOptions = Collections.singletonMap("foo", "bar");
 +    Format format = new Format("parquet", formatOptions);
-+    Map<String, String> configuration = Collections.singletonMap("zip", "zap");
++    Map<String, String> configuration = new HashMap<>();
++    configuration.put("zip", "zap");
++    configuration.put("delta.columnMapping.mode", "name");
 +
 +    Metadata kernelMetadata =
 +        new Metadata(
@@ -126,9 +130,8 @@
 +    assertEquals(
 +        Collections.singletonList("part1"),
 +        CollectionConverters.asJava(adapter.partitionColumns()));
-+    assertEquals(
-+        Collections.singletonMap("zip", "zap"),
-+        CollectionConverters.asJava(adapter.configuration()));
++    assertEquals(configuration, CollectionConverters.asJava(adapter.configuration()));
++    assertEquals(NameMapping$.MODULE$, adapter.columnMappingMode());
 +  }
 +
 +  @Test
@@ -167,6 +170,7 @@
 +    assertEquals(0, adapter.schema().fields().length);
 +    assertTrue(CollectionConverters.asJava(adapter.partitionColumns()).isEmpty());
 +    assertTrue(CollectionConverters.asJava(adapter.configuration()).isEmpty());
++    assertEquals(NoMapping$.MODULE$, adapter.columnMappingMode());
 +  }
 +
 +  @Test

Reproduce locally: git range-diff e43bf65..d1be3cc e43bf65..ca71e8f | Disable: git config gitstack.push-range-diff false
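
The `columnMappingMode()` override in the range-diff above translates a Kernel enum into Spark's mode objects. A self-contained sketch of that translation might look like the following. Both enums (`KernelLikeMode`, `SparkLikeMode`) and the `parse` helper are hypothetical stand-ins, not the Kernel or Spark APIs. The sketch assumes the mode is stored under `delta.columnMapping.mode`, as in the test configuration, and that a missing key means no mapping.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class ColumnMappingModeSketch {

  /** Hypothetical stand-in for the Kernel-side mode enum. */
  enum KernelLikeMode {
    NONE,
    ID,
    NAME
  }

  /** Hypothetical stand-in for the Spark-side mode objects. */
  enum SparkLikeMode {
    NO_MAPPING,
    ID_MAPPING,
    NAME_MAPPING
  }

  static final String MODE_KEY = "delta.columnMapping.mode";

  /** Reads the mode from table configuration; a missing key is treated as "none". */
  static KernelLikeMode parse(Map<String, String> configuration) {
    String raw = configuration.getOrDefault(MODE_KEY, "none");
    // valueOf throws IllegalArgumentException for values outside the enum.
    return KernelLikeMode.valueOf(raw.toUpperCase(Locale.ROOT));
  }

  /** Maps each Kernel-like mode to its Spark-like counterpart. */
  static SparkLikeMode translate(KernelLikeMode mode) {
    switch (mode) {
      case NONE:
        return SparkLikeMode.NO_MAPPING;
      case ID:
        return SparkLikeMode.ID_MAPPING;
      case NAME:
        return SparkLikeMode.NAME_MAPPING;
      default:
        // Guards against a mode added to the source enum without a mapping here.
        throw new UnsupportedOperationException("Unsupported column mapping mode: " + mode);
    }
  }

  public static void main(String[] args) {
    Map<String, String> configuration = new HashMap<>();
    configuration.put("zip", "zap");
    configuration.put(MODE_KEY, "name");
    System.out.println(translate(parse(configuration)));
    System.out.println(translate(parse(Collections.<String, String>emptyMap())));
  }
}
```

Keeping the `default` branch that throws means a new mode on the source side surfaces as an explicit error instead of being silently mapped to the wrong target.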

@PorridgeSwim PorridgeSwim force-pushed the stack/SparkMetadataAdapter branch from ca71e8f to 71534ed on April 13, 2026 21:32
@PorridgeSwim
Collaborator Author

Range-diff: master (ca71e8f -> 71534ed)
spark/v2/src/main/java/io/delta/spark/internal/v2/adapters/SparkMetadataAdapter.java
@@ -86,7 +86,7 @@
 +  @Override
 +  public DeltaColumnMappingMode columnMappingMode() {
 +    ColumnMapping.ColumnMappingMode kernelMode =
-+        ColumnMapping.getColumnMappingMode(metadata.getConfiguration());
++        ColumnMapping.getColumnMappingMode(kernelMetadata.getConfiguration());
 +    switch (kernelMode) {
 +      case NONE:
 +        return NoMapping$.MODULE$;

Reproduce locally: git range-diff e43bf65..ca71e8f e43bf65..71534ed | Disable: git config gitstack.push-range-diff false

@PorridgeSwim
Collaborator Author

Range-diff: master (71534ed -> 19b49ba)
spark/src/main/resources/error/delta-error-classes.json
@@ -0,0 +1,69 @@
+diff --git a/spark/src/main/resources/error/delta-error-classes.json b/spark/src/main/resources/error/delta-error-classes.json
+--- a/spark/src/main/resources/error/delta-error-classes.json
++++ b/spark/src/main/resources/error/delta-error-classes.json
+   },
+   "DELTA_DUPLICATE_COLUMNS_FOUND" : {
+     "message" : [
+-      "Found duplicate column(s) <coltype>: <duplicateCols>"
++      "Found duplicate column(s): <duplicateCols>."
+     ],
++    "subClass" : {
++      "ADDING_COLUMNS" : {
++        "message" : [
++          "The duplicate was found while adding columns."
++        ]
++      },
++      "CLUSTER_BY" : {
++        "message" : [
++          "The duplicate was found in CLUSTER BY."
++        ]
++      },
++      "DATA" : {
++        "message" : [
++          "The duplicate was found in the data being saved."
++        ]
++      },
++      "EXISTING_SCHEMA" : {
++        "message" : [
++          "The duplicate was found in the existing table schema."
++        ]
++      },
++      "METADATA_UPDATE" : {
++        "message" : [
++          "The duplicate was found in the metadata update."
++        ]
++      },
++      "PARTITION_COLUMNS" : {
++        "message" : [
++          "The duplicate was found in the partition columns."
++        ]
++      },
++      "PARTITION_SCHEMA" : {
++        "message" : [
++          "The duplicate was found in the partition schema."
++        ]
++      },
++      "READ_SCHEMA" : {
++        "message" : [
++          "The duplicate was found in the schema of the data being read."
++        ]
++      },
++      "REPLACING_COLUMNS" : {
++        "message" : [
++          "The duplicate was found while replacing columns."
++        ]
++      },
++      "SPECIFIED_COLUMNS" : {
++        "message" : [
++          "The duplicate was found in the specified columns."
++        ]
++      },
++      "TABLE_SCHEMA" : {
++        "message" : [
++          "The duplicate was found in the table schema."
++        ]
++      }
++    },
+     "sqlState" : "42711"
+   },
+   "DELTA_DUPLICATE_COLUMNS_ON_INSERT" : {
\ No newline at end of file
spark/src/main/scala/org/apache/spark/sql/delta/DeltaErrors.scala
@@ -0,0 +1,16 @@
+diff --git a/spark/src/main/scala/org/apache/spark/sql/delta/DeltaErrors.scala b/spark/src/main/scala/org/apache/spark/sql/delta/DeltaErrors.scala
+--- a/spark/src/main/scala/org/apache/spark/sql/delta/DeltaErrors.scala
++++ b/spark/src/main/scala/org/apache/spark/sql/delta/DeltaErrors.scala
+       messageParameters = Array(colName, scheme))
+   }
+ 
+-  def foundDuplicateColumnsException(colType: String, duplicateCols: String): Throwable = {
++  def foundDuplicateColumnsException(subClass: String, duplicateCols: String): Throwable = {
+     new DeltaAnalysisException(
+-      errorClass = "DELTA_DUPLICATE_COLUMNS_FOUND",
+-      messageParameters = Array(colType, duplicateCols))
++      errorClass = s"DELTA_DUPLICATE_COLUMNS_FOUND.$subClass",
++      messageParameters = Array(duplicateCols))
+   }
+ 
+   def addColumnStructNotFoundException(pos: String): Throwable = {
\ No newline at end of file
spark/src/main/scala/org/apache/spark/sql/delta/DeltaLog.scala
@@ -0,0 +1,11 @@
+diff --git a/spark/src/main/scala/org/apache/spark/sql/delta/DeltaLog.scala b/spark/src/main/scala/org/apache/spark/sql/delta/DeltaLog.scala
+--- a/spark/src/main/scala/org/apache/spark/sql/delta/DeltaLog.scala
++++ b/spark/src/main/scala/org/apache/spark/sql/delta/DeltaLog.scala
+ 
+     val txn = startTransaction(catalogTable, Some(snapshot))
+     try {
+-      SchemaMergingUtils.checkColumnNameDuplication(txn.metadata.schema, "in the table schema")
++      SchemaMergingUtils.checkColumnNameDuplication(txn.metadata.schema, "TABLE_SCHEMA")
+     } catch {
+       case e: AnalysisException =>
+         throw DeltaErrors.duplicateColumnsOnUpdateTable(e)
\ No newline at end of file
spark/src/main/scala/org/apache/spark/sql/delta/OptimisticTransaction.scala
@@ -0,0 +1,106 @@
+diff --git a/spark/src/main/scala/org/apache/spark/sql/delta/OptimisticTransaction.scala b/spark/src/main/scala/org/apache/spark/sql/delta/OptimisticTransaction.scala
+--- a/spark/src/main/scala/org/apache/spark/sql/delta/OptimisticTransaction.scala
++++ b/spark/src/main/scala/org/apache/spark/sql/delta/OptimisticTransaction.scala
+ import org.apache.spark.sql.delta.sources.{DeltaSourceUtils, DeltaSQLConf}
+ import org.apache.spark.sql.delta.stats._
+ import org.apache.spark.sql.delta.stats.FileSizeHistogramUtils
+-import org.apache.spark.sql.delta.util.{DeltaCommitFileProvider, JsonUtils, TransactionHelper}
++import org.apache.spark.sql.delta.util.{DeltaCommitFileProvider, JsonUtils, PartitionUtils, TransactionHelper}
+ import org.apache.spark.sql.util.ScalaExtensions._
+ import io.delta.storage.commit._
+ import io.delta.storage.commit.actions.{AbstractMetadata, AbstractProtocol}
+   protected def assertMetadata(metadata: Metadata): Unit = {
+     assert(!CharVarcharUtils.hasCharVarchar(metadata.schema),
+       "The schema in Delta log should not contain char/varchar type.")
+-    SchemaMergingUtils.checkColumnNameDuplication(metadata.schema, "in the metadata update")
++    SchemaMergingUtils.checkColumnNameDuplication(metadata.schema, "METADATA_UPDATE")
+     if (metadata.columnMappingMode == NoMapping) {
+       SchemaUtils.checkSchemaFieldNames(metadata.dataSchema, metadata.columnMappingMode)
+       val partitionColCheckIsFatal =
+   }
+ 
+   /**
+-   * Returns files within the given partitions.
+-   *
+-   * `partitions` is a set of the `partitionValues` stored in [[AddFile]]s. This means they refer to
+-   * the physical column names, and values are stored as strings.
+-   * */
+-  def filterFiles(partitions: Set[Map[String, String]]): Seq[AddFile] = {
++   * Returns files within the partitions of the given [[AddFile]]s.
++   */
++  def filterFiles(newFiles: Seq[AddFile]): Seq[AddFile] = {
+     import org.apache.spark.sql.functions.col
+     val df = snapshot.allFiles.toDF()
+-    val isFileInTouchedPartitions =
+-      DeltaUDF.booleanFromMap(partitions.contains)(col("partitionValues"))
+-    val filteredFiles = df
+-      .filter(isFileInTouchedPartitions)
+-      .withColumn("stats", DataSkippingReader.nullStringLiteral)
+-      .as[AddFile]
+-      .collect()
++    val parseToTypedLiterals =
++      spark.conf.get(DeltaSQLConf.DELTA_DYNAMIC_PARTITION_OVERWRITE_PARSE_PARTITION_VALUES)
++    val timeZone = spark.sessionState.conf.sessionLocalTimeZone
++
++    val (filteredFiles, filterPredicate) = try {
++      // Always fail on a parsing error here. The catch block below logs it and then either
++      // rethrows or falls back to raw string comparison, depending on the config.
++      val newFilesNormalizedPartitionValues = newFiles.map(f =>
++        Action.normalizePartitionValues(
++          f.partitionValues,
++          metadata.physicalPartitionSchema,
++          timeZone,
++          parseToTypedLiterals,
++          failOnParsingError = true)
++      ).toSet
++
++      val existingFilesPartitionSchema = snapshot.metadata.physicalPartitionSchema
++      val pred = DeltaUDF.booleanFromMap { filePartValues =>
++        newFilesNormalizedPartitionValues.contains(Action.normalizePartitionValues(
++          filePartValues,
++          existingFilesPartitionSchema,
++          timeZone,
++          parseToTypedLiterals,
++          failOnParsingError = true))
++      }(col("partitionValues"))
++      val files = df.filter(pred)
++          .withColumn("stats", DataSkippingReader.nullStringLiteral)
++          .as[AddFile]
++          .collect()
++      (files, pred)
++    } catch {
++      case NonFatal(e) =>
++        val opTypeSuffix = PartitionUtils.classifyPartitionValueParsingError(e)
++        recordDeltaEvent(
++          deltaLog,
++          opType = "delta.dynamicPartitionOverwrite.partitionValueParsingError" + opTypeSuffix,
++          data = getErrorData(e) ++ Map(
++            "readSnapshotMetadata" -> snapshot.metadata,
++            "txnMetadata" -> metadata,
++            "commitInfo" -> commitInfo,
++            "readSnapshotVersion" -> snapshot.version,
++            "timeZone" -> timeZone
++          )
++        )
++        if (spark.conf.get(DeltaSQLConf.DELTA_FAIL_ON_PARTITION_VALUE_PARSING_ERROR)) {
++          e match {
++            // UDF exceptions get wrapped in SparkException. Unwrap to throw the root cause.
++            case se: SparkException => throw Option(se.getCause).getOrElse(se)
++            case _ => throw e
++          }
++        }
++        // Partition value parsing failed, fall back to raw string comparison.
++        val rawPartitions = newFiles.map(_.partitionValues).toSet
++        val pred = DeltaUDF.booleanFromMap(rawPartitions.contains)(col("partitionValues"))
++        val files = df.filter(pred)
++            .withColumn("stats", DataSkippingReader.nullStringLiteral)
++            .as[AddFile]
++            .collect()
++        (files, pred)
++    }
++
+     trackReadPredicates(
+-      Seq(isFileInTouchedPartitions.expr), partitionOnly = true, shouldRewriteFilter = false)
++      Seq(filterPredicate.expr), partitionOnly = true, shouldRewriteFilter = false)
+     filteredFiles
+   }
+ 
\ No newline at end of file
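Reviewer note: the new `filterFiles` compares partition values after parsing them, so two differently formatted strings for the same timestamp select the same partition. Below is a minimal, Spark-free sketch of that idea. `TypedPartitionMatch`, `parseTimestamp`, and `matchingFiles` are hypothetical names for illustration only, and the two accepted formats are an assumption, not the behavior of `PartitionUtils.parsePartitionValues`.

```scala
import java.time.{Instant, LocalDateTime, ZoneId}
import java.time.format.DateTimeFormatter
import scala.util.Try

object TypedPartitionMatch {
  // Assumed "plain" format for values without a zone identifier.
  private val localFormat = DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss")

  // Hypothetical parser: try an ISO-8601 instant first, otherwise interpret
  // the value as a local date-time in the supplied zone.
  def parseTimestamp(raw: String, zone: ZoneId): Instant =
    Try(Instant.parse(raw)).getOrElse(
      LocalDateTime.parse(raw, localFormat).atZone(zone).toInstant)

  // existing: file path -> raw partition value stored in the log.
  // incoming: raw partition values of the files being written.
  def matchingFiles(
      existing: Map[String, String],
      incoming: Set[String],
      zone: ZoneId,
      parseToTyped: Boolean): Set[String] = {
    if (parseToTyped) {
      // Compare parsed values, so formatting differences do not matter.
      val wanted = incoming.map(parseTimestamp(_, zone))
      existing.collect {
        case (path, raw) if wanted.contains(parseTimestamp(raw, zone)) => path
      }.toSet
    } else {
      // Compare the verbatim strings from the log.
      existing.collect { case (path, raw) if incoming.contains(raw) => path }.toSet
    }
  }

  def main(args: Array[String]): Unit = {
    val existing = Map("a" -> "2000-01-01 12:00:00", "b" -> "2000-02-02 12:00:00")
    val incoming = Set("2000-01-01T12:00:00.000000Z")
    println(matchingFiles(existing, incoming, ZoneId.of("UTC"), parseToTyped = true))
    println(matchingFiles(existing, incoming, ZoneId.of("UTC"), parseToTyped = false))
  }
}
```

This mirrors the timestamp test added in `OptimisticTransactionSuite.scala`: with typed comparison the file `a` is selected, with raw string comparison nothing matches.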
spark/src/main/scala/org/apache/spark/sql/delta/actions/actions.scala
@@ -1,6 +1,195 @@
 diff --git a/spark/src/main/scala/org/apache/spark/sql/delta/actions/actions.scala b/spark/src/main/scala/org/apache/spark/sql/delta/actions/actions.scala
 --- a/spark/src/main/scala/org/apache/spark/sql/delta/actions/actions.scala
 +++ b/spark/src/main/scala/org/apache/spark/sql/delta/actions/actions.scala
+     }
+   }
+ 
++  /**
++   * Normalizes partition values to typed Literals. This method is serializable and does not
++   * require SparkSession, so it can be used inside UDFs for parallel processing.
++   *
++   * @param rawPartitionValues Map of partition column names to their string values.
++   * @param partitionSchema Schema defining the data types for each partition column.
++   * @param timeZoneId The Spark session time zone ID. This should ALWAYS be the session timezone
++   *                   to ensure consistent parsing between read and write paths.
++   * @param parseToTypedLiterals Whether to parse partition values to their actual types.
++   *                             When false, the verbatim value from the log action is
++   *                             returned as a String Literal.
++   * @param failOnParsingError If true, throw the exception if parsing fails.
++   *                           If false, return the raw partition values as string Literals.
++   * @return Map of partition column names to their literal values.
++   */
++  def normalizePartitionValues(
++      rawPartitionValues: Map[String, String],
++      partitionSchema: StructType,
++      timeZoneId: String,
++      parseToTypedLiterals: Boolean,
++      failOnParsingError: Boolean): Map[String, Literal] = {
++    def parseToStringLiterals = rawPartitionValues.map { case (k, v) => (k, Literal(v)) }
++    if (parseToTypedLiterals) {
++      try {
++        PartitionUtils.parsePartitionValues(
++          rawPartitionValues, partitionSchema, timeZoneId, validatePartitionColumns = true)
++      } catch {
++        case NonFatal(e) =>
++          if (failOnParsingError) {
++            throw e
++          } else {
++            parseToStringLiterals
++          }
++      }
++    } else {
++      parseToStringLiterals
++    }
++  }
++
+   /** All reader protocol version numbers supported by the system. */
+   private[delta] lazy val supportedReaderVersionNumbers: Set[Int] = {
+     val allVersions =
+   def getTag(tagName: String): Option[String] = Option(tags).flatMap(_.get(tagName))
+ 
+ 
++  /**
++   * Return partition values as literals, optionally parsed to their actual data types.
++   * When `parseToTypedLiterals` is true, partition values are parsed to their actual
++   * types for comparison purposes. When false, they are returned as string literals,
++   * using the verbatim value written in the action.
++   *
++   * @param deltaLog The DeltaLog for logging events. May be null if unavailable.
++   * @param errorOpType Prefix for logging event opTypes.
++   * @param errorData Extra fields to include in logging events.
++   * @return Map of partition column names to literals.
++   */
++  private[delta] def normalizedPartitionValues(
++      spark: SparkSession,
++      partitionSchema: StructType,
++      parseToTypedLiterals: Boolean,
++      deltaLog: DeltaLog,
++      errorOpType: String,
++      errorData: Map[String, Any]): Map[String, Literal] = {
++    val timeZone = spark.sessionState.conf.sessionLocalTimeZone
++
++    try {
++      val partitionValueLiterals = Action.normalizePartitionValues(
++        partitionValues,
++        partitionSchema,
++        timeZone,
++        parseToTypedLiterals,
++        failOnParsingError = true)
++
++      if (parseToTypedLiterals) {
++        val stringNormalizedPartitionValues = partitionValueLiterals.map {
++          case (k, v) => (k, PartitionUtils.literalToNormalizedString(
++            v,
++            Some(timeZone),
++            useUtcNormalizedTimestamp = true))
++        }
++        if (stringNormalizedPartitionValues != partitionValues) {
++          Action.recordDeltaEvent(
++            deltaLog,
++            opType = errorOpType + ".unnormalizedValuesExist",
++            data = errorData
++          )
++        }
++      }
++      partitionValueLiterals
++    } catch {
++      case NonFatal(e) =>
++        val opTypeSuffix = PartitionUtils.classifyPartitionValueParsingError(e)
++        Action.recordDeltaEvent(
++          deltaLog,
++          opType = errorOpType + ".partitionValueParsingError" + opTypeSuffix,
++          data = errorData ++ Map(
++            "exceptionMessage" -> e.getMessage,
++            "timeZone" -> timeZone
++          )
++        )
++        if (spark.conf.get(DeltaSQLConf.DELTA_FAIL_ON_PARTITION_VALUE_PARSING_ERROR)) {
++          throw e
++        }
++        partitionValues.map { case (k, v) => (k, Literal(v)) }
++    }
++  }
++
+   /** Returns the [[SparkPath]] for this file action. */
+   def sparkPath: SparkPath = SparkPath.fromUrlString(path)
+ 
+       spark: SparkSession,
+       partitionSchema: StructType,
+       deltaTxn: Option[OptimisticTransaction] = None): Map[String, Literal] = {
+-
+-    def partitionValuesAsStringLiterals: Map[String, Literal] = {
+-      // Convert all partition values to string literals
+-      partitionValues.map { case (k, v) => (k, Literal(v)) }
+-    }
+-
+-    val normalizePartitionValuesOnRead =
+-      spark.conf.get(DeltaSQLConf.DELTA_NORMALIZE_PARTITION_VALUES_ON_READ)
+-    if (normalizePartitionValuesOnRead) {
+-      val timeZone = spark.sessionState.conf.sessionLocalTimeZone
+-
+-      try {
+-        val typedPartitionValueLiterals = PartitionUtils.parsePartitionValues(
+-          partitionValues,
+-          partitionSchema,
+-          java.util.TimeZone.getDefault.getID,
+-          validatePartitionColumns = true)
+-
+-        val stringNormalizedPartitionValues = typedPartitionValueLiterals.map {
+-          case (k, v) => (k, PartitionUtils.literalToNormalizedString(
+-            v,
+-            Some(timeZone),
+-            useUtcNormalizedTimestamp = true))
+-        }
+-
+-        if (stringNormalizedPartitionValues != partitionValues) {
+-          Action.recordDeltaEvent(
+-            deltaTxn.map(_.deltaLog).orNull,
+-            opType = "delta.normalizedPartitionValues.unnormalizedValuesExist",
+-            data = Map(
+-              "readSnapshotMetadata" -> deltaTxn.map(_.snapshot.metadata).orNull,
+-              "txnMetadata" -> deltaTxn.map(_.metadata).orNull,
+-              "commitInfo" -> deltaTxn.map(_.getCommitInfo).orNull
+-            )
+-          )
+-        }
+-        typedPartitionValueLiterals
+-      } catch {
+-        case NonFatal(e) =>
+-          val opTypeSuffix = PartitionUtils.classifyPartitionValueParsingError(e)
+-          Action.recordDeltaEvent(
+-            deltaTxn.map(_.deltaLog).orNull,
+-            opType = "delta.normalizedPartitionValues.partitionValueParsingError" + opTypeSuffix,
+-            data = Map(
+-              "exceptionMessage" -> e.getMessage,
+-              "readSnapshotMetadata" -> deltaTxn.map(_.snapshot.metadata).orNull,
+-              "txnMetadata" -> deltaTxn.map(_.metadata).orNull,
+-              "commitInfo" -> deltaTxn.map(_.getCommitInfo).orNull,
+-              "readSnapshotVersion" -> deltaTxn.map(_.snapshot.version).getOrElse(-1L),
+-              "timeZone" -> timeZone
+-            )
+-          )
+-          partitionValuesAsStringLiterals
+-      }
+-    } else {
+-        partitionValuesAsStringLiterals
+-    }
++    normalizedPartitionValues(
++      spark,
++      partitionSchema,
++      parseToTypedLiterals =
++        spark.conf.get(DeltaSQLConf.DELTA_NORMALIZE_PARTITION_VALUES_ON_READ),
++      deltaLog = deltaTxn.map(_.deltaLog).orNull,
++      errorOpType = "delta.normalizedPartitionValues",
++      errorData = Map(
++        "readSnapshotMetadata" -> deltaTxn.map(_.snapshot.metadata).orNull,
++        "txnMetadata" -> deltaTxn.map(_.metadata).orNull,
++        "commitInfo" -> deltaTxn.map(_.getCommitInfo).orNull,
++        "readSnapshotVersion" -> deltaTxn.map(_.snapshot.version).getOrElse(-1L))
++    )
+   }
+ 
+   // Don't use lazy val because we want to save memory.
  
    /** Returns the partitionSchema as a [[StructType]] */
    @JsonIgnore
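Reviewer note: `Action.normalizePartitionValues` has three outcomes: typed values, a rethrown parsing error, or a fallback to string values. The sketch below reproduces only that control flow in plain Scala. `NormalizeSketch`, `Value`, `IntValue`, `StringValue`, and `parseTyped` are hypothetical stand-ins (Spark's `Literal` and `PartitionUtils.parsePartitionValues` are not used), and every column is assumed to be an integer for simplicity.

```scala
import scala.util.control.NonFatal

object NormalizeSketch {
  // Hypothetical stand-ins for typed and string literals.
  sealed trait Value
  final case class IntValue(v: Int) extends Value
  final case class StringValue(v: String) extends Value

  // Assumed typed parser: every column is an integer; throws on bad input.
  private def parseTyped(raw: Map[String, String]): Map[String, Value] =
    raw.map { case (k, v) => k -> (IntValue(v.toInt): Value) }

  def normalize(
      raw: Map[String, String],
      parseToTyped: Boolean,
      failOnParsingError: Boolean): Map[String, Value] = {
    // Verbatim values wrapped as strings, used when parsing is off or fails.
    def asStrings: Map[String, Value] =
      raw.map { case (k, v) => k -> (StringValue(v): Value) }

    if (!parseToTyped) {
      asStrings
    } else {
      try parseTyped(raw)
      catch {
        // Either surface the error to the caller or degrade to strings.
        case NonFatal(e) => if (failOnParsingError) throw e else asStrings
      }
    }
  }
}
```

In the diff, `AddFile.normalizedPartitionValues` and `filterFiles` both call the shared helper with `failOnParsingError = true` and handle logging and fallback themselves.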
spark/src/main/scala/org/apache/spark/sql/delta/catalog/AbstractDeltaCatalog.scala
@@ -0,0 +1,11 @@
+diff --git a/spark/src/main/scala/org/apache/spark/sql/delta/catalog/AbstractDeltaCatalog.scala b/spark/src/main/scala/org/apache/spark/sql/delta/catalog/AbstractDeltaCatalog.scala
+--- a/spark/src/main/scala/org/apache/spark/sql/delta/catalog/AbstractDeltaCatalog.scala
++++ b/spark/src/main/scala/org/apache/spark/sql/delta/catalog/AbstractDeltaCatalog.scala
+       }
+       // Check that columns are not duplicated in the cluster by statement.
+       PartitionUtils.checkColumnNameDuplication(
+-        clusterBy.columnNames.map(_.toString), "in CLUSTER BY", resolver)
++        clusterBy.columnNames.map(_.toString), "CLUSTER_BY", resolver)
+       // Check number of clustering columns is within allowed range.
+       ClusteredTableUtils.validateNumClusteringColumns(
+         clusterBy.columnNames.map(_.fieldNames.toSeq))
\ No newline at end of file
spark/src/main/scala/org/apache/spark/sql/delta/commands/WriteIntoDelta.scala
@@ -0,0 +1,31 @@
+diff --git a/spark/src/main/scala/org/apache/spark/sql/delta/commands/WriteIntoDelta.scala b/spark/src/main/scala/org/apache/spark/sql/delta/commands/WriteIntoDelta.scala
+--- a/spark/src/main/scala/org/apache/spark/sql/delta/commands/WriteIntoDelta.scala
++++ b/spark/src/main/scala/org/apache/spark/sql/delta/commands/WriteIntoDelta.scala
+         val deletedFiles = if (useDynamicPartitionOverwriteMode) {
+           // with dynamic partition overwrite for any partition that is being written to all
+           // existing data in that partition will be deleted.
+-          // the selection what to delete is determined by `updatePartitions`.
++          // the selection of what to delete is determined by `filesToFilter`.
+ 
+           // Dynamic Partition Overwrite (DPO) uses null-tolerant equality, meaning NULL partitions
+           // in the table will be overwritten if there are matching NULL values in the query.
+           // This option simulates null-intolerant equality by not including partitions with
+           // NULL values in the set of partitions to be overwritten.
+-          val updatePartitions =
++          val filesToFilter =
+             if (options.useNullIntolerantEqualityWithDPO.contains(true)) {
+               addFiles.collect { case addFile
+                 if addFile.partitionValues.forall { case (_, value) => value != null }
+-                  => addFile.partitionValues
+-              }.toSet
++                  => addFile
++              }
+             } else {
+-              addFiles.map(_.partitionValues).toSet
++              addFiles
+             }
+-          txn.filterFiles(updatePartitions).map(_.remove)
++          txn.filterFiles(filesToFilter).map(_.remove)
+         } else {
+           txn.filterFiles().map(_.remove)
+         }
\ No newline at end of file
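Reviewer note: the caller now passes whole `AddFile`s instead of a set of partition-value maps, and the null-intolerant option drops files whose partition values contain a NULL before filtering. A minimal sketch of that selection, assuming a hypothetical `FileEntry` case class in place of `AddFile`:

```scala
object DpoSelectionSketch {
  // Hypothetical stand-in for AddFile: only the fields needed here.
  final case class FileEntry(path: String, partitionValues: Map[String, String])

  // With null-intolerant equality, files that have any NULL partition value
  // are excluded, so NULL partitions in the table are never overwritten.
  def filesToFilter(addFiles: Seq[FileEntry], nullIntolerant: Boolean): Seq[FileEntry] =
    if (nullIntolerant) addFiles.filter(_.partitionValues.values.forall(_ != null))
    else addFiles
}
```

Keeping the full file entries (rather than a `Set` of maps) lets the transaction normalize the values against its own partition schema before comparing.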
spark/src/main/scala/org/apache/spark/sql/delta/commands/alterDeltaTableCommands.scala
@@ -0,0 +1,19 @@
+diff --git a/spark/src/main/scala/org/apache/spark/sql/delta/commands/alterDeltaTableCommands.scala b/spark/src/main/scala/org/apache/spark/sql/delta/commands/alterDeltaTableCommands.scala
+--- a/spark/src/main/scala/org/apache/spark/sql/delta/commands/alterDeltaTableCommands.scala
++++ b/spark/src/main/scala/org/apache/spark/sql/delta/commands/alterDeltaTableCommands.scala
+           SchemaUtils.addColumn(schema, column, position)
+       }
+ 
+-      SchemaMergingUtils.checkColumnNameDuplication(newSchema, "in adding columns")
++      SchemaMergingUtils.checkColumnNameDuplication(newSchema, "ADDING_COLUMNS")
+       SchemaUtils.checkSchemaFieldNames(newSchema, metadata.columnMappingMode)
+ 
+       val newMetadata = metadata.copy(schemaString = newSchema.json)
+       val newSchema = SchemaUtils.changeDataType(existingSchema, changingSchema, resolver)
+         .asInstanceOf[StructType]
+ 
+-      SchemaMergingUtils.checkColumnNameDuplication(newSchema, "in replacing columns")
++      SchemaMergingUtils.checkColumnNameDuplication(newSchema, "REPLACING_COLUMNS")
+       SchemaUtils.checkSchemaFieldNames(newSchema, metadata.columnMappingMode)
+ 
+       val newSchemaWithTypeWideningMetadata = TypeWideningMetadata.addTypeWideningMetadata(
\ No newline at end of file
spark/src/main/scala/org/apache/spark/sql/delta/files/DelayedCommitProtocol.scala
@@ -0,0 +1,28 @@
+diff --git a/spark/src/main/scala/org/apache/spark/sql/delta/files/DelayedCommitProtocol.scala b/spark/src/main/scala/org/apache/spark/sql/delta/files/DelayedCommitProtocol.scala
+--- a/spark/src/main/scala/org/apache/spark/sql/delta/files/DelayedCommitProtocol.scala
++++ b/spark/src/main/scala/org/apache/spark/sql/delta/files/DelayedCommitProtocol.scala
+         .filter(partitionCol => partitionCol._2 == TimestampType)
+ 
+     val dateFormatter = DateFormatter()
+-    // if adjusting to UTC make sure to interpret timezones using Spark
+-    // config, otherwise fallback to JVM timezone
+-    val timezone = {
+-      if (useUtcNormalizedTimestamps) {
+-        DateTimeUtils.getTimeZone(SQLConf.get.sessionLocalTimeZone)
+-      } else {
+-        java.util.TimeZone.getDefault
+-      }
+-    }
+ 
++    val timezone = DateTimeUtils.getTimeZone(SQLConf.get.sessionLocalTimeZone)
+     val timestampFormatter = TimestampFormatter(PartitionUtils.timestampPartitionPattern, timezone)
+ 
+     /**
+           Set.empty,
+           userSpecifiedDataTypes = partitionColumnToDataType,
+           validatePartitionColumns = false,
+-          java.util.TimeZone.getDefault,
++          timezone,
+           dateFormatter,
+           timestampFormatter,
+           useUtcNormalizedTimestamps)
\ No newline at end of file
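Reviewer note: the change above always uses the session time zone instead of falling back to the JVM default. The reason is that a timestamp string without a zone identifier maps to different instants depending on the zone used to interpret it. A small sketch using only `java.time`; `SessionZoneSketch` and `toInstant` are hypothetical names, and the pattern is an assumption rather than `PartitionUtils.timestampPartitionPattern`:

```scala
import java.time.{Instant, LocalDateTime, ZoneId}
import java.time.format.DateTimeFormatter

object SessionZoneSketch {
  // Assumed zone-less partition value format.
  private val pattern = DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss")

  // Interpret a zone-less timestamp string in the given zone.
  def toInstant(raw: String, zone: ZoneId): Instant =
    LocalDateTime.parse(raw, pattern).atZone(zone).toInstant

  def main(args: Array[String]): Unit = {
    val raw = "2000-01-01 12:00:00"
    // The same string yields two different instants.
    println(toInstant(raw, ZoneId.of("UTC")))
    println(toInstant(raw, ZoneId.of("America/Los_Angeles")))
  }
}
```

If the write path and the read path disagree on the zone, the same partition string is treated as two different values, which is why both sides should read the zone from the same configuration.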
spark/src/main/scala/org/apache/spark/sql/delta/schema/SchemaMergingUtils.scala
@@ -0,0 +1,37 @@
+diff --git a/spark/src/main/scala/org/apache/spark/sql/delta/schema/SchemaMergingUtils.scala b/spark/src/main/scala/org/apache/spark/sql/delta/schema/SchemaMergingUtils.scala
+--- a/spark/src/main/scala/org/apache/spark/sql/delta/schema/SchemaMergingUtils.scala
++++ b/spark/src/main/scala/org/apache/spark/sql/delta/schema/SchemaMergingUtils.scala
+    * the duplication exists.
+    *
+    * @param schema the schema to check for duplicates
+-   * @param colType column type name, used in an exception message
++   * @param errorSubClass error sub-class for DELTA_DUPLICATE_COLUMNS_FOUND indicating where the
++   *                      duplicate was found (e.g. "METADATA_UPDATE", "TABLE_SCHEMA").
+    * @param caseSensitive Whether we should exception if two columns have casing conflicts. This
+    *                      should default to false for Delta.
+    */
+   def checkColumnNameDuplication(
+       schema: StructType,
+-      colType: String,
++      errorSubClass: String,
+       caseSensitive: Boolean = false): Unit = {
+     val columnNames = explodeNestedFieldNames(schema)
+     // scalastyle:off caselocale
+         case (x, ys) if ys.length > 1 => s"$x"
+       }
+       throw new DeltaAnalysisException(
+-        errorClass = "DELTA_DUPLICATE_COLUMNS_FOUND",
+-        messageParameters = Array(colType, duplicateColumns.mkString(", ")))
++        errorClass = s"DELTA_DUPLICATE_COLUMNS_FOUND.$errorSubClass",
++        messageParameters = Array(duplicateColumns.mkString(", ")))
+     }
+   }
+ 
+       keepExistingType: Boolean = false,
+       typeWideningMode: TypeWideningMode = TypeWideningMode.NoTypeWidening,
+       caseSensitive: Boolean = false): StructType = {
+-    checkColumnNameDuplication(dataSchema, "in the data to save", caseSensitive)
++    checkColumnNameDuplication(dataSchema, "DATA", caseSensitive)
+     mergeDataTypes(
+       tableSchema,
+       dataSchema,
\ No newline at end of file
spark/src/main/scala/org/apache/spark/sql/delta/schema/SchemaUtils.scala
@@ -0,0 +1,32 @@
+diff --git a/spark/src/main/scala/org/apache/spark/sql/delta/schema/SchemaUtils.scala b/spark/src/main/scala/org/apache/spark/sql/delta/schema/SchemaUtils.scala
+--- a/spark/src/main/scala/org/apache/spark/sql/delta/schema/SchemaUtils.scala
++++ b/spark/src/main/scala/org/apache/spark/sql/delta/schema/SchemaUtils.scala
+     }
+ 
+     def isStructReadCompatible(existing: StructType, newtype: StructType): Boolean = {
+-      val existingFields = toFieldMap(existing)
+       // scalastyle:off caselocale
++      def checkNoDuplicateColumns(schema: StructType, errorSubClass: String): Unit = {
++        val fieldNames = schema.fieldNames
++        val lowercaseNames = fieldNames.map(_.toLowerCase).toSet
++        if (lowercaseNames.size != fieldNames.length) {
++          val duplicates = fieldNames.groupBy(_.toLowerCase).collect {
++            case (_, names) if names.length > 1 => names.mkString(", ")
++          }
++          throw DeltaErrors.foundDuplicateColumnsException(errorSubClass,
++            duplicates.mkString(", "))
++        }
++      }
++
++      val existingFields = toFieldMap(existing)
++      checkNoDuplicateColumns(existing, "EXISTING_SCHEMA")
+       val existingFieldNames = existing.fieldNames.map(_.toLowerCase).toSet
+-      assert(existingFieldNames.size == existing.length,
+-        "Delta tables don't allow field names that only differ by case")
++      checkNoDuplicateColumns(newtype, "READ_SCHEMA")
+       val newFields = newtype.fieldNames.map(_.toLowerCase).toSet
+-      assert(newFields.size == newtype.length,
+-        "Delta tables don't allow field names that only differ by case")
+       // scalastyle:on caselocale
+ 
+       if (!allowMissingColumns &&
\ No newline at end of file
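Reviewer note: the duplicate-column checks now pass an error sub-class (for example `EXISTING_SCHEMA` or `READ_SCHEMA`) instead of a free-text fragment, and the asserts become proper exceptions. The sketch below shows the case-insensitive grouping and the composed error class in plain Scala. `DuplicateColumnsSketch` and `DuplicateColumnsError` are hypothetical; the real code throws via `DeltaErrors.foundDuplicateColumnsException`.

```scala
import java.util.Locale

object DuplicateColumnsSketch {
  // Hypothetical exception carrying the composed error class.
  final case class DuplicateColumnsError(errorClass: String, duplicateCols: String)
      extends RuntimeException(s"[$errorClass] Found duplicate column(s): $duplicateCols")

  def checkNoDuplicateColumns(fieldNames: Seq[String], errorSubClass: String): Unit = {
    // Group names that differ only by case and keep groups with more than one member.
    val duplicates = fieldNames
      .groupBy(_.toLowerCase(Locale.ROOT))
      .toSeq
      .collect { case (_, names) if names.length > 1 => names.mkString(", ") }
      .sorted
    if (duplicates.nonEmpty) {
      throw DuplicateColumnsError(
        s"DELTA_DUPLICATE_COLUMNS_FOUND.$errorSubClass",
        duplicates.mkString(", "))
    }
  }
}
```

A structured sub-class lets callers and tests match on the error condition (as `DeltaErrorsSuite` does with `DELTA_DUPLICATE_COLUMNS_FOUND.METADATA_UPDATE`) rather than on message text.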
spark/src/main/scala/org/apache/spark/sql/delta/sources/DeltaSQLConf.scala
@@ -0,0 +1,30 @@
+diff --git a/spark/src/main/scala/org/apache/spark/sql/delta/sources/DeltaSQLConf.scala b/spark/src/main/scala/org/apache/spark/sql/delta/sources/DeltaSQLConf.scala
+--- a/spark/src/main/scala/org/apache/spark/sql/delta/sources/DeltaSQLConf.scala
++++ b/spark/src/main/scala/org/apache/spark/sql/delta/sources/DeltaSQLConf.scala
+       .booleanConf
+       .createWithDefault(true)
+ 
++  val DELTA_FAIL_ON_PARTITION_VALUE_PARSING_ERROR =
++    buildConf("failOnPartitionValueParsingError")
++      .internal()
++      .doc(
++        "When true, we will fail (rethrow) if there is an error when parsing partition values " +
++        "to their actual types. When false, we will fall back to using partition value strings."
++      )
++      .booleanConf
++      .createWithDefault(DeltaUtils.isTesting)
++
++  val DELTA_DYNAMIC_PARTITION_OVERWRITE_PARSE_PARTITION_VALUES =
++    buildConf("dynamicPartitionOverwrite.parsePartitionValues")
++      .internal()
++      .doc(
++        "When true, we will parse partition values to their actual types for comparison during " +
++        "dynamic partition overwrite file filtering, instead of using raw strings. " +
++        "This helps prevent issues with inconsistently formatted partition values."
++      )
++      .booleanConf
++      .createWithDefault(true)
++
+   //////////////////
+   // CORRECTNESS
+   //////////////////
\ No newline at end of file
spark/src/main/scala/org/apache/spark/sql/delta/util/PartitionUtils.scala
@@ -0,0 +1,70 @@
+diff --git a/spark/src/main/scala/org/apache/spark/sql/delta/util/PartitionUtils.scala b/spark/src/main/scala/org/apache/spark/sql/delta/util/PartitionUtils.scala
+--- a/spark/src/main/scala/org/apache/spark/sql/delta/util/PartitionUtils.scala
++++ b/spark/src/main/scala/org/apache/spark/sql/delta/util/PartitionUtils.scala
+     }
+ 
+     checkColumnNameDuplication(
+-      normalizedPartSpec.map(_._1), "in the partition schema", resolver)
++      normalizedPartSpec.map(_._1), "PARTITION_SCHEMA", resolver)
+ 
+     normalizedPartSpec.toMap
+   }
+       caseSensitive: Boolean): Unit = {
+     checkColumnNameDuplication(
+       partitionColumns,
+-      "in the partition columns",
++      "PARTITION_COLUMNS",
+       caseSensitive)
+ 
+     partitionColumnsSchema(schema, partitionColumns, caseSensitive).foreach {
+    * the duplication exists.
+    *
+    * @param columnNames column names to check
+-   * @param colType column type name, used in an exception message
++   * @param errorSubClass error sub-class for DELTA_DUPLICATE_COLUMNS_FOUND indicating where the
++   *                      duplicate was found (e.g. "PARTITION_SCHEMA", "CLUSTER_BY").
+    * @param resolver resolver used to determine if two identifiers are equal
+    */
+   def checkColumnNameDuplication(
+-      columnNames: Seq[String], colType: String, resolver: Resolver): Unit = {
+-    checkColumnNameDuplication(columnNames, colType, isCaseSensitiveAnalysis(resolver))
++      columnNames: Seq[String], errorSubClass: String, resolver: Resolver): Unit = {
++    checkColumnNameDuplication(columnNames, errorSubClass, isCaseSensitiveAnalysis(resolver))
+   }
+ 
+   /**
+    * the duplication exists.
+    *
+    * @param columnNames column names to check
+-   * @param colType column type name, used in an exception message
++   * @param errorSubClass error sub-class for DELTA_DUPLICATE_COLUMNS_FOUND indicating where the
++   *                      duplicate was found (e.g. "PARTITION_COLUMNS", "CLUSTER_BY").
+    * @param caseSensitiveAnalysis whether duplication checks should be case sensitive or not
+    */
+   def checkColumnNameDuplication(
+-      columnNames: Seq[String], colType: String, caseSensitiveAnalysis: Boolean): Unit = {
++      columnNames: Seq[String], errorSubClass: String, caseSensitiveAnalysis: Boolean): Unit = {
+     // scalastyle:off caselocale
+     val names = if (caseSensitiveAnalysis) columnNames else columnNames.map(_.toLowerCase)
+     // scalastyle:on caselocale
+       val duplicateColumns = names.groupBy(identity).collect {
+         case (x, ys) if ys.length > 1 => s"`$x`"
+       }
+-      throw DeltaErrors.foundDuplicateColumnsException(colType,
++      throw DeltaErrors.foundDuplicateColumnsException(errorSubClass,
+         duplicateColumns.mkString(", "))
+     }
+   }
+    * @param rawValue The raw string value of the partition.
+    * @param dataType Optional data type from the schema. If None, type inference is used.
+    * @param typeInference Whether to infer the type when dataType is None.
+-   * @param timeZone Time zone for timestamp parsing.
++   * @param timeZone Time zone used as a fallback for timestamp parsing. The timestampFormatter is
++   *                 always tried first. Only when it fails (e.g., "2026-01-01T12:00:00" with a 'T'
++   *                 separator) and the timestamp does not have a timezone identifier, the Cast
++   *                 fallback uses this timezone to interpret the timestamp. For data written by
++   *                 Spark this will not happen as the timestamp format always matches the
++   *                 timestampFormatter format.
+    * @param dateFormatter Formatter for date parsing.
+    * @param timestampFormatter Formatter for timestamp parsing.
+    * @param validatePartitionColumns Throw an error when casting fails.
\ No newline at end of file
spark/src/test/scala/org/apache/spark/sql/delta/DeltaErrorsSuite.scala
@@ -0,0 +1,16 @@
+diff --git a/spark/src/test/scala/org/apache/spark/sql/delta/DeltaErrorsSuite.scala b/spark/src/test/scala/org/apache/spark/sql/delta/DeltaErrorsSuite.scala
+--- a/spark/src/test/scala/org/apache/spark/sql/delta/DeltaErrorsSuite.scala
++++ b/spark/src/test/scala/org/apache/spark/sql/delta/DeltaErrorsSuite.scala
+     }
+     {
+       val e = intercept[DeltaAnalysisException] {
+-        throw DeltaErrors.foundDuplicateColumnsException("integer", "col1")
++        throw DeltaErrors.foundDuplicateColumnsException("METADATA_UPDATE", "col1")
+       }
+-      checkError(e, "DELTA_DUPLICATE_COLUMNS_FOUND", "42711",
+-        Map("coltype" -> "integer", "duplicateCols" -> "col1"))
++      checkError(e, "DELTA_DUPLICATE_COLUMNS_FOUND.METADATA_UPDATE", "42711",
++        Map("duplicateCols" -> "col1"))
+     }
+     {
+       val e = intercept[DeltaAnalysisException] {
\ No newline at end of file
spark/src/test/scala/org/apache/spark/sql/delta/OptimisticTransactionSuite.scala
@@ -0,0 +1,173 @@
+diff --git a/spark/src/test/scala/org/apache/spark/sql/delta/OptimisticTransactionSuite.scala b/spark/src/test/scala/org/apache/spark/sql/delta/OptimisticTransactionSuite.scala
+--- a/spark/src/test/scala/org/apache/spark/sql/delta/OptimisticTransactionSuite.scala
++++ b/spark/src/test/scala/org/apache/spark/sql/delta/OptimisticTransactionSuite.scala
+ import org.apache.spark.sql.catalyst.dsl.expressions._
+ import org.apache.spark.sql.catalyst.expressions.{EqualTo, Literal}
+ import org.apache.spark.sql.functions.{col, lit}
+-import org.apache.spark.sql.types.{IntegerType, StructType}
++import org.apache.spark.sql.types.{IntegerType, StructType, TimestampType}
+ import org.apache.spark.util.ManualClock
+ 
+ 
+ 
+             // txn1: read files in partitions of our new data (part=0)
+             val txn = log.startTransaction()
+-            val addFiles =
+-                txn.filterFiles(newData.map(_.partitionValues).toSet)
++            val addFiles = txn.filterFiles(newData)
+ 
+             // txn2
+             log.startTransaction().commit(concurrentActions(partCol), ManualUpdate)
+       RemoveFile("b", None, partitionValues = Map(partCol -> "1")))
+   )
+ 
++  for (enableNormalization <- BOOLEAN_DOMAIN) {
++    test("filterFiles for timestamp partitions with different string formats, " +
++      s"enableNormalization = $enableNormalization") {
++      withSQLConf(
++        DeltaSQLConf.DELTA_DYNAMIC_PARTITION_OVERWRITE_PARSE_PARTITION_VALUES.key ->
++          enableNormalization.toString
++      ) {
++        DeltaTestUtils.withTimeZone("UTC") {
++          withTempDir { tempDir =>
++            val tablePath = tempDir.getCanonicalPath
++            val log = DeltaLog.forTable(spark, tablePath)
++
++            log.startTransaction().commit(Seq(
++              Metadata(
++                schemaString = new StructType()
++                  .add("ts", TimestampType)
++                  .add("value", IntegerType).json,
++                partitionColumns = Seq("ts"))
++            ), ManualUpdate)
++
++            // Add files with non-UTC formatted timestamp partition values
++            val nonUtcTimestamp = "2000-01-01 12:00:00"
++            log.startTransaction().commit(
++              Seq(
++                AddFile("a", Map("ts" -> nonUtcTimestamp), 1, 1, dataChange = true),
++                AddFile("b", Map("ts" -> "2000-02-02 12:00:00"), 1, 1, dataChange = true)),
++              ManualUpdate)
++
++            // Query using UTC formatted timestamp (different string, same logical value)
++            val utcTimestamp = "2000-01-01T12:00:00.000000Z"
++            val txn = log.startTransaction()
++            val utcAddFile = AddFile("tmp", Map("ts" -> utcTimestamp), 0, 0, dataChange = false)
++            val matchedFiles = txn.filterFiles(Seq(utcAddFile))
++
++            if (enableNormalization) {
++              assert(matchedFiles.map(_.path).toSet == Set("a"))
++            } else {
++              assert(matchedFiles.isEmpty)
++            }
++          }
++        }
++      }
++    }
++  }
++
++  for (failOnError <- BOOLEAN_DOMAIN) {
++    test("filterFiles falls back to string comparison when partition parsing fails, " +
++      s"failOnError = $failOnError") {
++      withSQLConf(
++        DeltaSQLConf.DELTA_DYNAMIC_PARTITION_OVERWRITE_PARSE_PARTITION_VALUES.key -> "true",
++        DeltaSQLConf.DELTA_FAIL_ON_PARTITION_VALUE_PARSING_ERROR.key -> failOnError.toString
++      ) {
++        withTempDir { tempDir =>
++          val tablePath = tempDir.getCanonicalPath
++          val log = DeltaLog.forTable(spark, tablePath)
++
++          log.startTransaction().commit(Seq(
++            Metadata(
++              schemaString = new StructType()
++                .add("part", IntegerType)
++                .add("value", IntegerType).json,
++              partitionColumns = Seq("part"))
++          ), ManualUpdate)
++
++          // Add existing file with an unparseable partition value.
++          val badValue = "not_a_number"
++          log.startTransaction().commit(
++            Seq(AddFile("a", Map("part" -> badValue), 1, 1, dataChange = true)),
++            ManualUpdate)
++
++          // New file also has the same unparseable value
++          val txn = log.startTransaction()
++          val newFile = AddFile("tmp", Map("part" -> badValue), 0, 0, dataChange = false)
++
++          if (failOnError) {
++            checkError(
++              intercept[DeltaRuntimeException] {
++                txn.filterFiles(Seq(newFile))
++              },
++              condition = "DELTA_PARTITION_COLUMN_CAST_FAILED",
++              sqlState = "22525",
++              parameters = Map(
++                "value" -> badValue,
++                "dataType" -> "IntegerType",
++                "columnName" -> "part")
++            )
++          } else {
++            // Falls back to raw string comparison — strings match, so file "a" is returned
++            val matched = txn.filterFiles(Seq(newFile))
++            assert(matched.map(_.path).toSet == Set("a"))
++          }
++        }
++      }
++    }
++  }
++
++  for (failOnError <- BOOLEAN_DOMAIN) {
++    test("filterFiles when existing files have unparseable partition values, " +
++      s"failOnError = $failOnError") {
++      withSQLConf(
++        DeltaSQLConf.DELTA_DYNAMIC_PARTITION_OVERWRITE_PARSE_PARTITION_VALUES.key -> "true",
++        DeltaSQLConf.DELTA_FAIL_ON_PARTITION_VALUE_PARSING_ERROR.key -> failOnError.toString
++      ) {
++        withTempDir { tempDir =>
++          val tablePath = tempDir.getCanonicalPath
++          val log = DeltaLog.forTable(spark, tablePath)
++
++          log.startTransaction().commit(Seq(
++            Metadata(
++              schemaString = new StructType()
++                .add("part", IntegerType)
++                .add("value", IntegerType).json,
++              partitionColumns = Seq("part"))
++          ), ManualUpdate)
++
++          // Existing file has an unparseable partition value.
++          val badValue = "not_a_number"
++          log.startTransaction().commit(
++            Seq(AddFile("a", Map("part" -> badValue), 1, 1, dataChange = true)),
++            ManualUpdate)
++
++          // New file has a valid partition value. Only the UDF fails
++          val txn = log.startTransaction()
++          val newFile = AddFile("tmp", Map("part" -> "1"), 0, 0, dataChange = false)
++
++          if (failOnError) {
++            checkError(
++              intercept[DeltaRuntimeException] {
++                txn.filterFiles(Seq(newFile))
++              },
++              condition = "DELTA_PARTITION_COLUMN_CAST_FAILED",
++              sqlState = "22525",
++              parameters = Map(
++                "value" -> badValue,
++                "dataType" -> "IntegerType",
++                "columnName" -> "part")
++            )
++          } else {
++            // Falls back to raw string comparison — "1" != "not_a_number", so no match
++            val matched = txn.filterFiles(Seq(newFile))
++            assert(matched.isEmpty)
++          }
++        }
++      }
++    }
++  }
++
+   test("can set partition columns in first commit") {
+     withTempDir { tableDir =>
+       val partitionColumns = Array("part")
\ No newline at end of file
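
Note on the two `filterFiles` fallback tests above: the behavior they assert can be sketched in isolation roughly as follows. This is a minimal illustration only, not Delta's implementation. `PartitionValueMatch`, `parseIntPartition` and `samePartition` are hypothetical names, and `IllegalArgumentException` stands in for the `DeltaRuntimeException` with condition `DELTA_PARTITION_COLUMN_CAST_FAILED` that the tests expect.

```scala
import scala.util.Try

object PartitionValueMatch {
  // Hypothetical stand-in for the typed parse of an IntegerType partition value.
  def parseIntPartition(raw: String): Option[Int] = Try(raw.toInt).toOption

  // Returns true when two raw partition strings denote the same partition.
  // Typed comparison is used when both values parse; otherwise the call either
  // fails (failOnError = true) or falls back to comparing the raw strings.
  def samePartition(existing: String, incoming: String, failOnError: Boolean): Boolean =
    (parseIntPartition(existing), parseIntPartition(incoming)) match {
      case (Some(a), Some(b)) => a == b
      case _ if failOnError =>
        // At least one side failed to parse in this branch, so find(...) is non-empty.
        val bad = Seq(existing, incoming).find(parseIntPartition(_).isEmpty).get
        throw new IllegalArgumentException(s"cannot cast '$bad' to IntegerType")
      case _ => existing == incoming
    }

  def main(args: Array[String]): Unit = {
    // Both parse: "01" and "1" are the same integer partition.
    println(samePartition("01", "1", failOnError = false)) // true
    // Neither parses: raw strings match, mirroring the first test's expectation.
    println(samePartition("not_a_number", "not_a_number", failOnError = false)) // true
    // Only the existing value is unparseable: raw strings differ, so no match.
    println(samePartition("not_a_number", "1", failOnError = false)) // false
  }
}
```

With `failOnError = true`, the last two calls would throw instead of falling back, which corresponds to the `checkError` branches in the tests.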
spark/src/test/scala/org/apache/spark/sql/delta/SchemaValidationSuite.scala
@@ -0,0 +1,16 @@
+diff --git a/spark/src/test/scala/org/apache/spark/sql/delta/SchemaValidationSuite.scala b/spark/src/test/scala/org/apache/spark/sql/delta/SchemaValidationSuite.scala
+--- a/spark/src/test/scala/org/apache/spark/sql/delta/SchemaValidationSuite.scala
++++ b/spark/src/test/scala/org/apache/spark/sql/delta/SchemaValidationSuite.scala
+       spark.range(10).write.format("delta").saveAsTable(tblName)
+     },
+     actionToTest = (spark: SparkSession, tblName: String) => {
+-      val e = intercept[AnalysisException] {
++      val e = intercept[DeltaAnalysisException] {
+         spark.sql(s"ALTER TABLE `$tblName` ADD COLUMNS (col2 string)")
+       }
+-      assert(e.getMessage.contains("Found duplicate column(s) in adding columns: col2"))
++      checkError(e, "DELTA_DUPLICATE_COLUMNS_FOUND.ADDING_COLUMNS", "42711",
++        Map("duplicateCols" -> "col2"))
+     },
+     concurrentChange = (spark: SparkSession, tblName: String) => {
+       spark.read.format("delta").table(tblName)
\ No newline at end of file
spark/src/test/scala/org/apache/spark/sql/delta/V2DmlInMemoryTableSuite.scala
@@ -0,0 +1,174 @@
+diff --git a/spark/src/test/scala/org/apache/spark/sql/delta/V2DmlInMemoryTableSuite.scala b/spark/src/test/scala/org/apache/spark/sql/delta/V2DmlInMemoryTableSuite.scala
+new file mode 100644
+--- /dev/null
++++ b/spark/src/test/scala/org/apache/spark/sql/delta/V2DmlInMemoryTableSuite.scala
++/*
++ * Copyright (2025) The Delta Lake Project Authors.
++ *
++ * Licensed under the Apache License, Version 2.0 (the "License");
++ * you may not use this file except in compliance with the License.
++ * You may obtain a copy of the License at
++ *
++ * http://www.apache.org/licenses/LICENSE-2.0
++ *
++ * Unless required by applicable law or agreed to in writing, software
++ * distributed under the License is distributed on an "AS IS" BASIS,
++ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
++ * See the License for the specific language governing permissions and
++ * limitations under the License.
++ */
++
++package org.apache.spark.sql.delta
++
++import java.io.File
++import java.net.URI
++import java.nio.file.{Files, Path}
++
++import org.apache.spark.SparkConf
++import org.apache.spark.sql.{QueryTest, Row}
++import org.apache.spark.sql.catalyst.TableIdentifier
++import org.apache.spark.sql.connector.catalog.{Identifier, TableCatalog}
++import org.apache.spark.sql.delta.catalog.{InMemoryDeltaCatalog, InMemorySparkTable}
++import org.apache.spark.sql.delta.test.DeltaSQLCommandTest
++
++/**
++ * Tests that DML operations can be executed through the DSv2 code path using
++ * [[InMemorySparkTable]] as the backing table implementation.
++ *
++ * Uses [[InMemoryDeltaCatalog]] as the session catalog so that:
++ * 1. CREATE TABLE still creates real Delta tables (for schema resolution)
++ * 2. Subsequent table loads return [[InMemorySparkTable]] (supports V2 DML)
++ * 3. DML operations flow through Spark's V2 execution path
++ */
++class V2DmlInMemoryTableSuite extends QueryTest with DeltaSQLCommandTest {
++
++  override protected def sparkConf: SparkConf = super.sparkConf
++    .set("spark.sql.catalog.spark_catalog", classOf[InMemoryDeltaCatalog].getName)
++
++  override protected def withTable(tableNames: String*)(f: => Unit): Unit = {
++    try {
++      super.withTable(tableNames: _*) {
++        f
++        tableNames.foreach(assertNoParquetFiles)
++      }
++    } finally {
++      for (tableName <- tableNames) {
++        assert(!InMemoryDeltaCatalog.contains(tableName))
++      }
++    }
++  }
++
++  /**
++   * Asserts that no physical parquet data files exist under the table's location.
++   * This validates that DML operations went through the in-memory V2 path and
++   * did not fall back to the V1 connector (which would write actual parquet files).
++   */
++  private def assertNoParquetFiles(tableName: String): Unit = {
++    val catalogTable = spark.sessionState.catalog.getTableMetadata(
++      TableIdentifier(tableName))
++    val dataPath = new File(new URI(catalogTable.location.toString))
++    if (dataPath.exists()) {
++      val stream = Files.walk(dataPath.toPath)
++      try {
++        val parquetFiles = stream
++          .filter(Files.isRegularFile(_))
++          .filter(_.toString.endsWith(".parquet"))
++          .toArray.map(_.asInstanceOf[Path].toString).toSeq
++        assert(parquetFiles.isEmpty,
++          s"Physical parquet files found under '$dataPath' while V2 in-memory mode is enabled. " +
++          s"DML may have fallen back to V1. Files: $parquetFiles")
++      } finally {
++        stream.close()
++      }
++    }
++  }
++
++  test("catalog returns InMemorySparkTable when InMemoryDeltaCatalog is configured") {
++    val tableName = "v2_dml_test_catalog"
++    withTable(tableName) {
++      sql(s"CREATE TABLE $tableName (id LONG, value STRING) USING delta")
++
++      val catalog = spark.sessionState.catalogManager.v2SessionCatalog.asInstanceOf[TableCatalog]
++      val table = catalog.loadTable(Identifier.of(Array("default"), tableName))
++
++      assert(table.isInstanceOf[InMemorySparkTable],
++        s"Expected InMemorySparkTable, got ${table.getClass.getName}")
++    }
++  }
++
++  test("INSERT via DSv2 InMemoryTable") {
++    val tableName = "v2_dml_test_insert"
++    withTable(tableName) {
++      sql(s"CREATE TABLE $tableName (id LONG, value STRING) USING delta")
++
++      sql(s"INSERT INTO $tableName VALUES (1, 'a'), (2, 'b')")
++      checkAnswer(
++        sql(s"SELECT id, value FROM $tableName ORDER BY id"),
++        Seq(Row(1L, "a"), Row(2L, "b")))
++    }
++  }
++
++  test("INSERT OVERWRITE via DSv2 InMemoryTable") {
++    val tableName = "v2_dml_test_overwrite"
++    withTable(tableName) {
++      sql(s"CREATE TABLE $tableName (id LONG, value STRING) USING delta")
++
++      sql(s"INSERT INTO $tableName VALUES (1, 'a'), (2, 'b')")
++      sql(s"INSERT OVERWRITE $tableName VALUES (3, 'c')")
++      checkAnswer(
++        sql(s"SELECT id, value FROM $tableName ORDER BY id"),
++        Seq(Row(3L, "c")))
++    }
++  }
++
++  test("DELETE via DSv2 InMemoryTable") {
++    val tableName = "v2_dml_test_delete"
++    withTable(tableName) {
++      sql(s"CREATE TABLE $tableName (pk INT NOT NULL, value STRING) USING delta")
++      sql(s"INSERT INTO $tableName VALUES (1, 'a'), (2, 'b'), (3, 'c')")
++
++      sql(s"DELETE FROM $tableName WHERE pk = 2")
++      checkAnswer(
++        sql(s"SELECT pk, value FROM $tableName ORDER BY pk"),
++        Seq(Row(1, "a"), Row(3, "c")))
++    }
++  }
++
++  test("UPDATE via DSv2 InMemoryTable") {
++    val tableName = "v2_dml_test_update"
++    withTable(tableName) {
++      sql(s"CREATE TABLE $tableName (pk INT NOT NULL, value STRING) USING delta")
++      sql(s"INSERT INTO $tableName VALUES (1, 'a'), (2, 'b'), (3, 'c')")
++
++      sql(s"UPDATE $tableName SET value = 'updated' WHERE pk >= 2")
++      checkAnswer(
++        sql(s"SELECT pk, value FROM $tableName ORDER BY pk"),
++        Seq(Row(1, "a"), Row(2, "updated"), Row(3, "updated")))
++    }
++  }
++
++  test("MERGE via DSv2 InMemoryTable") {
++    val targetTable = "v2_dml_test_merge_target"
++    withTable(targetTable) {
++      sql(s"CREATE TABLE $targetTable (pk INT NOT NULL, value STRING) USING delta")
++      sql(s"INSERT INTO $targetTable VALUES (1, 'a'), (2, 'b')")
++
++      withTempView("source") {
++        sql("CREATE TEMP VIEW source AS SELECT * FROM VALUES (1, 'updated'), (3, 'c') " +
++            "AS t(pk, value)")
++
++        sql(
++          s"""MERGE INTO $targetTable t
++             |USING source s
++             |ON t.pk = s.pk
++             |WHEN MATCHED THEN UPDATE SET t.value = s.value
++             |WHEN NOT MATCHED THEN INSERT (pk, value) VALUES (s.pk, s.value)
++             |""".stripMargin)
++
++        checkAnswer(
++          sql(s"SELECT pk, value FROM $targetTable ORDER BY pk"),
++          Seq(Row(1, "updated"), Row(2, "b"), Row(3, "c")))
++      }
++    }
++  }
++}
\ No newline at end of file
spark/src/test/scala/org/apache/spark/sql/delta/actions/AddFileSuite.scala
@@ -0,0 +1,102 @@
+diff --git a/spark/src/test/scala/org/apache/spark/sql/delta/actions/AddFileSuite.scala b/spark/src/test/scala/org/apache/spark/sql/delta/actions/AddFileSuite.scala
+--- a/spark/src/test/scala/org/apache/spark/sql/delta/actions/AddFileSuite.scala
++++ b/spark/src/test/scala/org/apache/spark/sql/delta/actions/AddFileSuite.scala
+ 
+   test("normalizedPartitionValues for DateType should return the original date string") {
+     withSQLConf(
++      DeltaSQLConf.DELTA_FAIL_ON_PARTITION_VALUE_PARSING_ERROR.key -> "true",
+       DeltaSQLConf.DELTA_NORMALIZE_PARTITION_VALUES_ON_READ.key -> "true"
+     ) {
+       withTempDir { tempDir =>
+ 
+   test("normalizedPartitionValues should handle __HIVE_DEFAULT_PARTITION__") {
+     withSQLConf(
++      DeltaSQLConf.DELTA_FAIL_ON_PARTITION_VALUE_PARSING_ERROR.key -> "true",
+       DeltaSQLConf.DELTA_NORMALIZE_PARTITION_VALUES_ON_READ.key -> "true"
+     ) {
+       withTempDir { tempDir =>
+ 
+   test("normalizedPartitionValues preserves escaped characters in AddFile partition values") {
+     withSQLConf(
++      DeltaSQLConf.DELTA_FAIL_ON_PARTITION_VALUE_PARSING_ERROR.key -> "true",
+       DeltaSQLConf.DELTA_NORMALIZE_PARTITION_VALUES_ON_READ.key -> "true"
+     ) {
+       withTempDir { tempDir =>
+     withJvmTimeZone("Europe/Berlin") {
+       withSQLConf(
+         DeltaSQLConf.DELTA_NORMALIZE_PARTITION_VALUES_ON_READ.key -> "true",
++        DeltaSQLConf.DELTA_FAIL_ON_PARTITION_VALUE_PARSING_ERROR.key -> "true",
+         "spark.sql.session.timeZone" -> "Europe/Berlin" // UTC + 1 in winter time
+       ) {
+         withTempDir { tempDir =>
+     "and a non UTC session time zone gets converted to UTC.") {
+     withSQLConf(
+       DeltaSQLConf.DELTA_NORMALIZE_PARTITION_VALUES_ON_READ.key -> "true",
++      DeltaSQLConf.DELTA_FAIL_ON_PARTITION_VALUE_PARSING_ERROR.key -> "true",
+       "spark.sql.session.timeZone" -> "America/Los_Angeles" // UTC - 8 in winter time
+     ) {
+       withTempDir { tempDir =>
+     withJvmTimeZone("UTC") {
+       withSQLConf(
+         DeltaSQLConf.DELTA_NORMALIZE_PARTITION_VALUES_ON_READ.key -> "true",
++        DeltaSQLConf.DELTA_FAIL_ON_PARTITION_VALUE_PARSING_ERROR.key -> "true",
+         "spark.sql.session.timeZone" -> "UTC"
+       ) {
+         withTempDir { tempDir =>
+     withJvmTimeZone("Europe/Berlin") {
+       withSQLConf(
+         DeltaSQLConf.DELTA_NORMALIZE_PARTITION_VALUES_ON_READ.key -> "true",
+-        "spark.sql.session.timeZone" -> "Europe/Berlin" // UTC + 1 in winter time
++        DeltaSQLConf.DELTA_FAIL_ON_PARTITION_VALUE_PARSING_ERROR.key -> "true",
++        "spark.sql.session.timeZone" -> "America/Los_Angeles" // UTC - 8 in winter time
+       ) {
+         withTempDir { tempDir =>
+           spark.createDataFrame(
+           ).write.format("delta").partitionBy("tsCol").save(tempDir.getCanonicalPath)
+           val deltaTxn = DeltaLog.forTable(spark, tempDir.getCanonicalPath).startTransaction()
+ 
+-          // ISO 8601 format with 'T' separator but no time zone should use the JVM time zone.
++          // ISO 8601 format with 'T' separator but no time zone should use the session time zone
++          // since the timestamp formatter fails to parse this format.
+           val file = createAddFileWithPartitionValue(Map("tsCol" -> "2000-01-01T12:00:00"))
+-          // The normalized timestamp should be 11:00 UTC (12:00 Berlin = 11:00 UTC)
++          // The normalized timestamp should be 20:00 UTC (12:00 LA + 8h = 20:00 UTC)
+           val normalized = file.normalizedPartitionValues(spark, deltaTxn)
+-          assert(normalized("tsCol") == timestampLiteral("2000-01-01 12:00:00", "Europe/Berlin"))
++          assert(normalized("tsCol") ==
++            timestampLiteral("2000-01-01 12:00:00", "America/Los_Angeles"))
+         }
+       }
+     }
+   }
+ 
+-  test("normalizedPartitionValues should also use the JVM timezone on read") {
++  test("normalizedPartitionValues should use the session timezone on read") {
+     withJvmTimeZone("America/Los_Angeles") {
+       withSQLConf(
+         DeltaSQLConf.DELTA_NORMALIZE_PARTITION_VALUES_ON_READ.key -> "true",
++        DeltaSQLConf.DELTA_FAIL_ON_PARTITION_VALUE_PARSING_ERROR.key -> "true",
+         "spark.sql.session.timeZone" -> "UTC"
+       ) {
+         withTempDir { tempDir =>
+           ).write.format("delta").partitionBy("tsCol").save(tempDir.getCanonicalPath)
+           val deltaTxn = DeltaLog.forTable(spark, tempDir.getCanonicalPath).startTransaction()
+ 
+-          // ON WRITE we use the JVM timezone, parsing this as an America/Los_Angeles timestamp.
+           val file = createAddFileWithPartitionValue(Map("tsCol" -> "2000-01-01 12:00:00"))
+           val normalized = file.normalizedPartitionValues(spark, deltaTxn)
+ 
+-          // ON READ we also need to use the JVM timezone again, reading it again as an
+-          // America/Los_Angeles timestamp.
+-          assert(
+-            normalized("tsCol") == timestampLiteral("2000-01-01 12:00:00", "America/Los_Angeles"))
+-          assert(normalized("tsCol") != timestampLiteral("2000-01-01 12:00:00", "UTC"))
++          // ON READ we use the session timezone (UTC), not the JVM timezone
++          // (America/Los_Angeles), to be consistent with the WRITE path in
++          // DelayedCommitProtocol which also uses the session timezone.
++          assert(normalized("tsCol") == timestampLiteral("2000-01-01 12:00:00", "UTC"))
++          assert(normalized("tsCol") !=
++            timestampLiteral("2000-01-01 12:00:00", "America/Los_Angeles"))
+         }
+       }
+     }
\ No newline at end of file
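
For reviewers checking the expected values in the `AddFileSuite` changes above: the same zone-less partition string maps to a different UTC instant depending on which time zone is used to interpret it. The sketch below uses only `java.time` and is independent of Delta and Spark. `PartitionTimestampZones` and `toInstant` are hypothetical names, and the fixed pattern is an assumption that covers only the `yyyy-MM-dd HH:mm:ss` form used in these tests.

```scala
import java.time.{Instant, LocalDateTime, ZoneId}
import java.time.format.DateTimeFormatter

object PartitionTimestampZones {
  // Assumed pattern: matches the "2000-01-01 12:00:00" form used in the tests.
  private val fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  // Interprets a zone-less timestamp string in the given zone and returns the UTC instant.
  def toInstant(value: String, zone: String): Instant =
    LocalDateTime.parse(value, fmt).atZone(ZoneId.of(zone)).toInstant

  def main(args: Array[String]): Unit = {
    val raw = "2000-01-01 12:00:00"
    println(toInstant(raw, "UTC"))                 // 2000-01-01T12:00:00Z
    println(toInstant(raw, "America/Los_Angeles")) // 2000-01-01T20:00:00Z
    println(toInstant(raw, "Europe/Berlin"))       // 2000-01-01T11:00:00Z
  }
}
```

This is why the updated assertions compare against `timestampLiteral(..., <session zone>)`: once the read path uses the session time zone, a JVM zone that differs from it no longer affects the normalized value.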

... (truncated, output exceeded 60000 bytes)

Reproduce locally: git range-diff e43bf65..71534ed e43bf65..19b49ba | Disable: git config gitstack.push-range-diff false

@PorridgeSwim PorridgeSwim force-pushed the stack/SparkMetadataAdapter branch from 19b49ba to c803c1a Compare April 24, 2026 23:17
@PorridgeSwim
Collaborator Author

Range-diff: master (19b49ba -> c803c1a)
.github/actions/setup-unitycatalog/action.yml
@@ -0,0 +1,40 @@
+diff --git a/.github/actions/setup-unitycatalog/action.yml b/.github/actions/setup-unitycatalog/action.yml
+new file mode 100644
+--- /dev/null
++++ b/.github/actions/setup-unitycatalog/action.yml
++name: "Set up pinned Unity Catalog build"
++description: >-
++  Publishes Unity Catalog jars from the commit pinned in project/scripts/setup_unitycatalog_main.sh
++  (the UC_PIN_SHA= line) to the runner's local Ivy / Maven caches, using GitHub Actions cache so the
++  slow UC build only runs the first time a pin is seen.
++
++runs:
++  using: "composite"
++  steps:
++    - name: Restore pinned UC cache
++      id: uc-cache
++      uses: actions/cache/restore@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
++      with:
++        # ~/.ivy2/local is what sbt publishLocal writes to. ~/.m2 is for publishM2.
++        path: |
++          ~/.ivy2/local
++          ~/.m2/repository/io/unitycatalog
++        # Cache key hashes the setup script, so bumping UC_PIN_SHA (or any other script change)
++        # invalidates the cache.
++        key: uc-jars-${{ runner.os }}-${{ hashFiles('project/scripts/setup_unitycatalog_main.sh') }}
++    - name: Build Unity Catalog from pinned SHA
++      shell: bash
++      run: bash project/scripts/setup_unitycatalog_main.sh
++    - name: Save pinned UC cache
++      # Only attempt a save when the restore missed. When multiple parallel matrix jobs all see
++      # a cache miss (first CI run after a pin bump), only the first to reach this step wins the
++      # GHA cache reservation; the rest log "another job may be creating this cache" warnings.
++      # Gating on cache-hit means cached runs (the common steady state) skip the save entirely,
++      # which eliminates those warnings on every subsequent run.
++      if: steps.uc-cache.outputs.cache-hit != 'true'
++      uses: actions/cache/save@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
++      with:
++        path: |
++          ~/.ivy2/local
++          ~/.m2/repository/io/unitycatalog
++        key: uc-jars-${{ runner.os }}-${{ hashFiles('project/scripts/setup_unitycatalog_main.sh') }}
\ No newline at end of file
.github/workflows/build.yaml
@@ -0,0 +1,29 @@
+diff --git a/.github/workflows/build.yaml b/.github/workflows/build.yaml
+--- a/.github/workflows/build.yaml
++++ b/.github/workflows/build.yaml
+ name: "Delta Build"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+             ~/.cache/coursier
+           key: delta-sbt-cache-cross-spark
+ 
++      # publishM2 compiles every aggregated project, including storage, which has
++      # unitycatalog-client as a compile-scope dependency. Publish the pinned UC build locally
++      # first so Delta compiles against the UC APIs it actually targets.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
++
+       - name: Run cross-Spark build test
+         run: python project/tests/test_cross_spark_publish.py
+ 
\ No newline at end of file
.github/workflows/disabled_iceberg_test.yaml
@@ -0,0 +1,80 @@
+diff --git a/.github/workflows/disabled_iceberg_test.yaml b/.github/workflows/disabled_iceberg_test.yaml
+deleted file mode 100644
+--- a/.github/workflows/disabled_iceberg_test.yaml
++++ /dev/null
+-name: "Delta Iceberg Latest [DISABLED]"
+-# SECURITY: All Python/PySpark workflows disabled due to active supply chain attack
+-# targeting OSS package ecosystems (PyPI). C2 domains: models.litellm.cloud, checkmarx.zone
+-# Date disabled: 2026-03-25
+-# To re-enable: remove 'if: false' from all jobs and restore original triggers
+-on:
+-  workflow_dispatch: # manual-only, auto triggers removed
+-  # To re-enable, replace the above line with:
+-  # push:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
+-  # pull_request:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
+-env:
+-  # SECURITY: Temporal lockdown — refuse any package version published after this date.
+-  # This date is a pre-attack baseline (before the active PyPI supply chain attack).
+-  UV_EXCLUDE_NEWER: "2026-03-10T00:00:00Z"
+-jobs:
+-  test:
+-    if: false # SECURITY: disabled - supply chain attack mitigation
+-    name: "DIL: Scala ${{ matrix.scala }}"
+-    runs-on: ubuntu-24.04
+-    strategy:
+-      matrix:
+-        # These Scala versions must match those in the build.sbt
+-        scala: [2.13.16]
+-    env:
+-      SCALA_VERSION: ${{ matrix.scala }}
+-    steps:
+-      - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
+-      - name: install java
+-        uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
+-        with:
+-          distribution: "zulu"
+-          java-version: "17"
+-      - name: Cache Scala, SBT
+-        uses: actions/cache@6f8efc29b200d32929f49075959781ed54ec270c # v3.5.0
+-        with:
+-          path: |
+-            ~/.sbt
+-            ~/.ivy2
+-            ~/.cache/coursier
+-          # Change the key if dependencies are changed. For each key, GitHub Actions will cache the
+-          # the above directories when we use the key for the first time. After that, each run will
+-          # just use the cache. The cache is immutable so we need to use a new key when trying to
+-          # cache new stuff.
+-          key: delta-sbt-cache-spark4.0-scala${{ matrix.scala }}
+-      - name: Set up uv
+-        run: bash project/scripts/install-uv.sh
+-      - name: Install Job dependencies
+-        run: |
+-          sudo apt-get update
+-          sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl git
+-          sudo apt install libedit-dev
+-          # buf v1.28.1 (2023-11-15) — SHA from official release asset:
+-          # https://github.com/bufbuild/buf/releases/download/v1.28.1/sha256.txt
+-          BUF_VERSION="v1.28.1"
+-          BUF_SHA256="870cf492d381a967d36636fdee9da44b524ea62aad163659b8dbf16a7da56987"
+-          curl -fsSL -o buf-Linux-x86_64.tar.gz \
+-            "https://github.com/bufbuild/buf/releases/download/${BUF_VERSION}/buf-Linux-x86_64.tar.gz"
+-          echo "${BUF_SHA256}  buf-Linux-x86_64.tar.gz" | sha256sum -c -
+-          mkdir -p ~/buf
+-          tar -xzf buf-Linux-x86_64.tar.gz -C ~/buf --strip-components 1
+-          rm buf-Linux-x86_64.tar.gz
+-          uv python install 3.8
+-          uv venv .venv --python 3.8
+-      - name: Run Scala/Java and Python tests
+-        # when changing TEST_PARALLELISM_COUNT make sure to also change it in spark_master_test.yaml
+-        run: |
+-          source .venv/bin/activate
+-          TEST_PARALLELISM_COUNT=4 python run-tests.py --group iceberg --spark-version 4.0
\ No newline at end of file
.github/workflows/spark_test_uc_master.yaml
@@ -0,0 +1,62 @@
+diff --git a/.github/workflows/spark_test_uc_master.yaml b/.github/workflows/disabled_spark_test_uc_master.yaml
+similarity index 61%
+rename from .github/workflows/spark_test_uc_master.yaml
+rename to .github/workflows/disabled_spark_test_uc_master.yaml
+--- a/.github/workflows/spark_test_uc_master.yaml
++++ b/.github/workflows/disabled_spark_test_uc_master.yaml
+ ##
+ ## To make this blocking, add the job name to the required status checks in
+ ## the branch protection rules for `master`.
++##
++## DISABLED while Delta master builds against a pinned UC master SHA — the main Delta Spark
++## workflow already exercises UC master at that pin, so a parallel floating-main workflow would
++## be redundant. To re-enable (once Delta goes back to a released UC version): drop the
++## `[DISABLED]` suffix from `name`, replace `workflow_dispatch:` with the original push /
++## pull_request triggers below, remove `if: false` from the job, and rename the file back to
++## `spark_test_uc_master.yaml`.
+ 
+-name: "Delta Spark (UC Master)"
++name: "Delta Spark (UC Master) [DISABLED]"
+ on:
+-  push:
+-    paths-ignore:
+-      - '**.md'
+-      - '**.txt'
+-  pull_request:
+-    paths-ignore:
+-      - '**.md'
+-      - '**.txt'
++  workflow_dispatch: # manual-only while disabled
++  # Original triggers, restore when re-enabling:
++  # push:
++  #   branches: [master, branch-*]
++  #   paths-ignore:
++  #     - '**.md'
++  #     - '**.txt'
++  # pull_request:
++  #   branches: [master, branch-*]
++  #   paths-ignore:
++  #     - '**.md'
++  #     - '**.txt'
+ 
+ jobs:
+   test-uc-master:
+     name: "[Non Blocking] UC Integration Tests (UC Main)"
++    # Guard against accidental runs while disabled. Remove when re-enabling.
++    if: false
+     runs-on: ubuntu-24.04
+     steps:
+       - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3
+           key: delta-sbt-cache-uc-master
+       - name: Build Unity Catalog from source
+         id: uc-build
++        # UC_REF=main builds the floating-main canary instead of the pinned SHA, which is the
++        # point of this workflow -- early warning of upcoming UC incompatibilities.
+         run: |
+-          bash project/scripts/setup_unitycatalog_main.sh
+-          UC_VERSION=$(cat /tmp/unitycatalog/.uc-version)
++          UC_REF=main bash project/scripts/setup_unitycatalog_main.sh
++          UC_VERSION=$(UC_REF=main bash project/scripts/setup_unitycatalog_main.sh --print-version)
+           echo "uc_version=$UC_VERSION" >> $GITHUB_OUTPUT
+           echo "UC version: $UC_VERSION"
+       - name: Run UC integration tests
\ No newline at end of file
.github/workflows/flink_test.yaml
@@ -0,0 +1,37 @@
+diff --git a/.github/workflows/flink_test.yaml b/.github/workflows/flink_test.yaml
+--- a/.github/workflows/flink_test.yaml
++++ b/.github/workflows/flink_test.yaml
+ 
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'flink/**'
+       - 'kernel/**'
+       - '!**/*.md'
+       - '!**/*.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'flink/**'
+       - 'kernel/**'
+   cancel-in-progress: true
+ 
+ env:
+-  # Point SBT to our cache directories for consistency
++  # Point SBT to our cache directories for consistency.
+   SBT_OPTS: "-Dsbt.coursier.home-dir=/home/runner/.cache/coursier -Dsbt.ivy.home=/home/runner/.ivy2"
+ 
+ jobs:
+           else
+             echo "❌ Cache MISS - will download dependencies"
+           fi
++      # flink has unitycatalog-client as a compile-scope dep and flink tests exercise UC.
++      # Publish the pinned UC build locally before sbt runs.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run unit tests
+         run: |
+           build/sbt flinkGroup/test
\ No newline at end of file
.github/workflows/iceberg_test.yaml
@@ -0,0 +1,58 @@
+diff --git a/.github/workflows/iceberg_test.yaml b/.github/workflows/iceberg_test.yaml
+new file mode 100644
+--- /dev/null
++++ b/.github/workflows/iceberg_test.yaml
++name: "Delta Iceberg Latest"
++on:
++  push:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
++  pull_request:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
++jobs:
++  test:
++    name: "DIL: Scala ${{ matrix.scala }}"
++    runs-on: ubuntu-24.04
++    strategy:
++      matrix:
++        # These Scala versions must match those in the build.sbt
++        scala: [2.13.16]
++    env:
++      SCALA_VERSION: ${{ matrix.scala }}
++    steps:
++      - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
++      - name: install java
++        uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
++        with:
++          distribution: "zulu"
++          java-version: "17"
++      - name: Cache Scala, SBT
++        uses: actions/cache@6f8efc29b200d32929f49075959781ed54ec270c # v3.5.0
++        with:
++          path: |
++            ~/.sbt
++            ~/.ivy2
++            ~/.cache/coursier
++          # Change the key if dependencies are changed. For each key, GitHub Actions will cache the
++          # above directories when we use the key for the first time. After that, each run will
++          # just use the cache. The cache is immutable so we need to use a new key when trying to
++          # cache new stuff.
++          key: delta-sbt-cache-spark4.0-scala${{ matrix.scala }}
++      - name: Set up uv
++        run: bash project/scripts/install-uv.sh
++      - name: Install Python via uv
++        # No UV_EXCLUDE_NEWER needed: this workflow installs zero pip packages.
++        # Python is only used to run the stdlib-only run-tests.py driver.
++        run: |
++          uv python install 3.8
++          uv venv .venv --python 3.8
++      - name: Run Scala/Java and Python tests
++        # when changing TEST_PARALLELISM_COUNT make sure to also change it in spark_master_test.yaml
++        run: |
++          source .venv/bin/activate
++          TEST_PARALLELISM_COUNT=4 python run-tests.py --group iceberg --spark-version 4.0
\ No newline at end of file
.github/workflows/kernel_docs.yaml
@@ -0,0 +1,11 @@
+diff --git a/.github/workflows/kernel_docs.yaml b/.github/workflows/kernel_docs.yaml
+--- a/.github/workflows/kernel_docs.yaml
++++ b/.github/workflows/kernel_docs.yaml
+           java-version: "11"
+       - name: Generate docs
+         run: |
+-          build/sbt kernelGroup/unidoc
++          build/sbt -DuseDefaultUnityCatalogReleaseVersion=true kernelGroup/unidoc
+           mkdir -p kernel/docs/snapshot/kernel-api/java
+           mkdir -p kernel/docs/snapshot/kernel-defaults/java
+           cp -r kernel/kernel-api/target/javaunidoc/. kernel/docs/snapshot/kernel-api/java/
\ No newline at end of file
.github/workflows/kernel_test.yaml
@@ -0,0 +1,47 @@
+diff --git a/.github/workflows/kernel_test.yaml b/.github/workflows/kernel_test.yaml
+--- a/.github/workflows/kernel_test.yaml
++++ b/.github/workflows/kernel_test.yaml
+ 
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+           else
+             echo "❌ Cache MISS - will download dependencies"
+           fi
++      # run-tests.py invokes sbt with `++ 2.13.16`, which triggers cross-version dependency resolution
++      # across every project (including kernelUnityCatalog). Publish the pinned UC build locally first
++      # so that resolution doesn't miss.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run unit tests
+         run: |
+           python run-tests.py --group kernel --coverage --shard ${{ matrix.shard }}
+     runs-on: ubuntu-24.04
+     steps:
+       - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
+-      # Run integration tests with JDK 11, as they have no Spark dependency
+-      - name: install java
++      # The integration test itself runs on JDK 11 (no Spark dependency), but UC's sbt build needs
++      # JDK 17, so we install 17 first, publish UC, then switch the active JDK to 11 for the actual
++      # test run.
++      - name: install java 17 for UC build
++        uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
++        with:
++          distribution: "zulu"
++          java-version: "17"
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
++      - name: install java 11 for integration test
+         uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
+         with:
+           distribution: "zulu"
\ No newline at end of file
.github/workflows/kernel_unitycatalog_test.yaml
@@ -0,0 +1,29 @@
+diff --git a/.github/workflows/kernel_unitycatalog_test.yaml b/.github/workflows/kernel_unitycatalog_test.yaml
+--- a/.github/workflows/kernel_unitycatalog_test.yaml
++++ b/.github/workflows/kernel_unitycatalog_test.yaml
+ name: "Kernel Unity Catalog"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'build.sbt'
+       - 'version.sbt'
+       - 'storage/**/*.java'
+       - '.github/workflows/kernel_unitycatalog_test.yaml'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'build.sbt'
+       - 'version.sbt'
+         with:
+           distribution: "zulu"
+           java-version: "17"
++      # kernelUnityCatalog depends on unreleased UC APIs; publish the pinned UC build locally before
++      # sbt tries to resolve the dependency.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run Unity tests with coverage
+         run: |
+           ./build/sbt "++ ${{ env.SCALA_VERSION }}" clean coverage kernelUnityCatalog/test coverageAggregate coverageOff -v
\ No newline at end of file
.github/workflows/spark_examples_test.yaml
@@ -0,0 +1,27 @@
+diff --git a/.github/workflows/spark_examples_test.yaml b/.github/workflows/spark_examples_test.yaml
+--- a/.github/workflows/spark_examples_test.yaml
++++ b/.github/workflows/spark_examples_test.yaml
+ name: "Delta Spark Publishing and Examples"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+           sudo apt-get update
+           sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl git
+           sudo apt install libedit-dev
++      # `publishM2` and `++ <scala>` both resolve every project's deps, which includes
++      # sparkUnityCatalog; publish the pinned UC build locally before sbt runs.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run Delta Spark Local Publishing and Examples Compilation
+         # examples/scala/build.sbt will compile against the local Delta release version (e.g. 3.2.0-SNAPSHOT).
+         # Thus, we need to publishM2 first so those jars are locally accessible.
\ No newline at end of file
.github/workflows/disabled_spark_python_test.yaml
@@ -0,0 +1,76 @@
+diff --git a/.github/workflows/disabled_spark_python_test.yaml b/.github/workflows/spark_python_test.yaml
+similarity index 71%
+rename from .github/workflows/disabled_spark_python_test.yaml
+rename to .github/workflows/spark_python_test.yaml
+--- a/.github/workflows/disabled_spark_python_test.yaml
++++ b/.github/workflows/spark_python_test.yaml
+-name: "Delta Spark Python [DISABLED]"
+-# SECURITY: All Python/PySpark workflows disabled due to active supply chain attack
+-# targeting OSS package ecosystems (PyPI). C2 domains: models.litellm.cloud, checkmarx.zone
+-# Date disabled: 2026-03-25
+-# To re-enable: remove 'if: false' from all jobs and restore original triggers
++name: "Delta Spark Python"
+ on:
+-  workflow_dispatch: # manual-only, auto triggers removed
+-  # To re-enable, replace the above line with:
+-  # push:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
+-  # pull_request:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
++  push:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
++  pull_request:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
+ env:
+   # SECURITY: Temporal lockdown — refuse any package version published after this date.
+   # This date is a pre-attack baseline (before the active PyPI supply chain attack).
+   # Generate Spark versions matrix from CrossSparkVersions.scala
+   # This workflow tests against released versions only (no snapshots)
+   generate-matrix:
+-    if: false # SECURITY: disabled - supply chain attack mitigation
+     name: "Generate Released Spark Versions Matrix"
+     runs-on: ubuntu-24.04
+     outputs:
+           echo "Generated released Spark versions: $SPARK_VERSIONS"
+ 
+   test:
+-    if: false # SECURITY: disabled - supply chain attack mitigation
+     name: "DSP (${{ matrix.spark_version }})"
+     runs-on: ubuntu-24.04
+     needs: generate-matrix
+           key: delta-sbt-cache-spark${{ matrix.spark_version }}-scala${{ matrix.scala }}
+       - name: Set up uv
+         run: bash project/scripts/install-uv.sh
+-      - name: Install Job dependencies
++      - name: Set up buf
++        run: bash project/scripts/install-buf.sh
++      - name: Install Python and dependencies
+         run: |
+-          sudo apt-get update
+-          sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl git
+-          sudo apt install libedit-dev
+-          # buf v1.28.1 (2023-11-15) — SHA from official release asset:
+-          # https://github.com/bufbuild/buf/releases/download/v1.28.1/sha256.txt
+-          BUF_VERSION="v1.28.1"
+-          BUF_SHA256="870cf492d381a967d36636fdee9da44b524ea62aad163659b8dbf16a7da56987"
+-          curl -fsSL -o buf-Linux-x86_64.tar.gz \
+-            "https://github.com/bufbuild/buf/releases/download/${BUF_VERSION}/buf-Linux-x86_64.tar.gz"
+-          echo "${BUF_SHA256}  buf-Linux-x86_64.tar.gz" | sha256sum -c -
+-          mkdir -p ~/buf
+-          tar -xzf buf-Linux-x86_64.tar.gz -C ~/buf --strip-components 1
+-          rm buf-Linux-x86_64.tar.gz
+           uv python install 3.10
+           uv venv .venv --python 3.10
+           # Install hash-verified locked dependencies (see .github/ci-requirements/spark-python/)
\ No newline at end of file
.github/workflows/spark_test.yaml
@@ -0,0 +1,27 @@
+diff --git a/.github/workflows/spark_test.yaml b/.github/workflows/spark_test.yaml
+--- a/.github/workflows/spark_test.yaml
++++ b/.github/workflows/spark_test.yaml
+ name: "Delta Spark"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+             ~/.ivy2
+             ~/.cache/coursier
+           key: delta-sbt-cache-spark${{ matrix.spark_version }}-scala${{ matrix.scala }}
++      # Delta's sparkUnityCatalog module (part of sparkGroup) depends on APIs that are only in
++      # unreleased UC. Publish the pinned UC build locally before sbt tries to resolve it.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Scala structured logging style check
+         run: |
+           if [ -f ./dev/spark_structured_logging_style.py ]; then
\ No newline at end of file
.github/workflows/unidoc.yaml
@@ -0,0 +1,19 @@
+diff --git a/.github/workflows/unidoc.yaml b/.github/workflows/unidoc.yaml
+--- a/.github/workflows/unidoc.yaml
++++ b/.github/workflows/unidoc.yaml
+   name: "Unidoc"
+   on:
+     push:
+-      branches: [master]
++      branches: [master, branch-*]
+     pull_request:
+-      branches: [master]
++      branches: [master, branch-*]
+   jobs:
+     build:
+       name: "U: Scala ${{ matrix.scala }}"
+             java-version: "17"
+         - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
+         - name: generate unidoc
+-          run: build/sbt "++ ${{ matrix.scala }}" unidoc
++          run: build/sbt -DuseDefaultUnityCatalogReleaseVersion=true "++ ${{ matrix.scala }}" unidoc
\ No newline at end of file
build.sbt
@@ -0,0 +1,149 @@
+diff --git a/build.sbt b/build.sbt
+--- a/build.sbt
++++ b/build.sbt
+   ).configureUnidoc()
+ 
+ 
+-val unityCatalogVersion = sys.props.getOrElse("unityCatalogVersion", "0.4.1")
++// Unity Catalog version. Three modes, in priority order:
++//
++//  1. `-DuseDefaultUnityCatalogReleaseVersion=true`: use `defaultUnityCatalogReleaseVersion`
++//     below -- the last released UC version on Maven Central. For workflows that don't actually
++//     need DRC APIs (e.g. unidoc, lint) and want to skip the pinned UC build. Shared across
++//     workflows by reading this single constant, so bumping is a one-line change here.
++//
++//  2. Release mode: set `unityCatalogReleaseVersion = Some("0.5.0")` (or whatever released
++//     version the release branch ships against). sbt resolves the coordinate from Maven Central
++//     like any other dependency.
++//
++//  3. Pinned mode (default): leave `unityCatalogReleaseVersion = None`. The version string
++//     comes from `setup_unitycatalog_main.sh --print-version`, which encodes both the pinned
++//     UC main SHA and UC's declared base version; the script is the single source of truth.
++//     The same script (without the flag) publishes the matching jars to ~/.ivy2/local when
++//     `ensurePinnedUnityCatalog` decides they're missing.
++//
++// Override with -DunityCatalogVersion=<anything> for ad-hoc experiments.
++val unityCatalogReleaseVersion: Option[String] = None
++val defaultUnityCatalogReleaseVersion = "0.4.1"
++val useDefaultUnityCatalogReleaseVersion: Boolean =
++  sys.props.getOrElse("useDefaultUnityCatalogReleaseVersion", "false").toBoolean
++val unityCatalogSetupScript = "project/scripts/setup_unitycatalog_main.sh"
++
++// Lazy so release-mode / useDefaultUnityCatalogReleaseVersion builds never shell out.
++lazy val pinnedUnityCatalogVersion: String = {
++  import scala.sys.process._
++  Process(Seq("bash", unityCatalogSetupScript, "--print-version")).!!.trim
++}
++val unityCatalogVersion: String = sys.props.getOrElse(
++  "unityCatalogVersion",
++  if (useDefaultUnityCatalogReleaseVersion) defaultUnityCatalogReleaseVersion
++  else unityCatalogReleaseVersion.getOrElse(pinnedUnityCatalogVersion))
++
+ val sparkUnityCatalogJacksonVersion = "2.15.4" // We are using Spark 4.0's Jackson version 2.15.x, to override Unity Catalog 0.3.0's version 2.18.x
+ 
++// Publishes the pinned UC jars to ~/.ivy2/local if they're not already cached there. Hooked
++// into `update` on the UC-dependent projects below, so plain `sbt testOnly ...` on a clean
++// checkout just works. No-op in release mode. Opt out with
++// `-Ddelta.autoBuildPinnedUnityCatalog=false`, in which case sbt errors with a pointer to the
++// setup script.
++val ensurePinnedUnityCatalog = taskKey[Unit](
++  "Publish the pinned UC jars locally if the Ivy coordinate isn't already cached.")
++
++// Extracted so the task body can read as a short guard rather than three nested ifs.
++def publishPinnedUnityCatalog(log: sbt.util.Logger, canary: java.io.File): Unit = {
++  val shouldAutoBuild =
++    sys.props.getOrElse("delta.autoBuildPinnedUnityCatalog", "true").toBoolean
++  if (!shouldAutoBuild) {
++    sys.error(
++      s"""|Pinned Unity Catalog jars are not published locally for coordinate
++          |$unityCatalogVersion.
++          |Auto-build is disabled (-Ddelta.autoBuildPinnedUnityCatalog=false).
++          |Run: bash $unityCatalogSetupScript""".stripMargin)
++  }
++  log.info(s"[UC] Pinned UC jars not found for coordinate $unityCatalogVersion.")
++  log.info(
++    s"[UC] Running $unityCatalogSetupScript - takes ~3-5 minutes on a cold cache, <1s on a warm one.")
++  import scala.sys.process._
++  val procLogger = ProcessLogger(
++    line => log.info(s"[UC setup] $line"),
++    line => log.warn(s"[UC setup] $line"))
++  val exit = Process(Seq("bash", unityCatalogSetupScript)).!(procLogger)
++  if (exit != 0) {
++    sys.error(
++      s"[UC] $unityCatalogSetupScript exited with code $exit. Run it manually to see full output.")
++  }
++  if (!canary.exists) {
++    sys.error(
++      s"[UC] $unityCatalogSetupScript succeeded but ${canary.getAbsolutePath} is still missing - " +
++        "the publish target layout may have changed.")
++  }
++}
++
++Global / ensurePinnedUnityCatalog := {
++  // Resolve the .value dependencies eagerly - sbt's task macro warns when
++  // `.value` appears inside conditional branches.
++  val log = streams.value.log
++  // No-op whenever the effective version resolves to something Maven Central can serve:
++  // release mode, -DuseDefaultUnityCatalogReleaseVersion=true, or -DunityCatalogVersion=<released>.
++  val usingReleasedVersion = useDefaultUnityCatalogReleaseVersion ||
++    sys.props.contains("unityCatalogVersion")
++  if (unityCatalogReleaseVersion.isEmpty && !usingReleasedVersion) {
++    val canary =
++      file(sys.props("user.home")) / ".ivy2" / "local" / "io.unitycatalog" /
++        "unitycatalog-client" / unityCatalogVersion / "ivys" / "ivy.xml"
++    if (!canary.exists) {
++      publishPinnedUnityCatalog(log, canary)
++    }
++  }
++}
++
+ lazy val sparkUnityCatalog = (project in file("spark/unitycatalog"))
+   .dependsOn(spark % "compile->compile;test->test;provided->provided")
+   .disablePlugins(ScalafmtPlugin)
+     javafmtCheckSettings(),
+     CrossSparkVersions.sparkDependentSettings(sparkVersion),
+ 
++    // Publish the pinned UC jars before sbt tries to resolve them.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     // This is a test-only module - no production sources
+     Compile / sources := Seq.empty,
+ 
+     exportJars := false,
+     javafmtCheckSettings,
+     scalafmtCheckSettings,
+-    
++
+     libraryDependencies ++= Seq(
+       "org.openjdk.jmh" % "jmh-core" % "1.37" % "test",
+       "org.openjdk.jmh" % "jmh-generator-annprocess" % "1.37" % "test",
+     scalaStyleSettings,
+     scalafmtCheckSettings,
+ 
++    // Publish the pinned UC jars before sbt tries to resolve them.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     // Put the shaded kernel-api JAR on the classpath (compile & test)
+     Compile / unmanagedJars += (kernelApi / Compile / packageBin).value,
+     Test / unmanagedJars += (kernelApi / Compile / packageBin).value,
+       "com.fasterxml.jackson.datatype" % "jackson-datatype-jsr310" % "2.15.4" % "test",
+     ),
+ 
++    // Publish the pinned UC jars before sbt tries to resolve them. storage is the transitive
++    // UC-client entry point for most of the build graph (sparkV1, sparkV2, kernelDefaults, etc.
++    // all .dependsOn(storage)), so hooking here covers nearly every compile path.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     // Unidoc settings
+     unidocSourceFilePatterns += SourceFilePattern("/LogStore.java", "/CloseableIterator.java"),
+     TestParallelization.settings
+       "--add-opens=java.base/java.util=ALL-UNNAMED" // for Flink with Java 17.
+     ),
+     crossPaths := false,
++
++    // Publish the pinned UC jars before sbt tries to resolve them.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     libraryDependencies ++= Seq(
+       "org.apache.flink" % "flink-core" % flinkVersion % "provided",
+       "org.apache.flink" % "flink-table-common" % flinkVersion % "provided",
\ No newline at end of file
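Reviewer note on the `build.sbt` change above: the precedence between the three modes and the `-DunityCatalogVersion` override is easy to misread from the comment alone. The following is a minimal sketch in Java (not the sbt code itself) of the same decision order as the `unityCatalogVersion` expression. The class name, method name, and the sample version strings in `main` are hypothetical and only serve the illustration; the `Supplier` stands in for the `lazy val` that shells out to the setup script.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Illustrative sketch only: mirrors the precedence of the sbt expression, not its implementation.
final class UnityCatalogVersionSketch {

  /**
   * Picks the effective UC version.
   *
   * @param explicitOverride value of -DunityCatalogVersion, if set
   * @param useDefaultRelease value of -DuseDefaultUnityCatalogReleaseVersion
   * @param defaultReleaseVersion last released UC version (the diff uses "0.4.1")
   * @param releaseVersion the release-branch setting (None on master in the diff)
   * @param pinnedVersion lazily computed pinned version; only invoked as a last resort
   */
  static String resolve(
      Optional<String> explicitOverride,
      boolean useDefaultRelease,
      String defaultReleaseVersion,
      Optional<String> releaseVersion,
      Supplier<String> pinnedVersion) {
    // An explicit system property always wins.
    if (explicitOverride.isPresent()) {
      return explicitOverride.get();
    }
    // Next, the opt-in flag for workflows that do not need unreleased UC APIs.
    if (useDefaultRelease) {
      return defaultReleaseVersion;
    }
    // Release mode if configured, otherwise fall back to the pinned build.
    return releaseVersion.orElseGet(pinnedVersion);
  }

  public static void main(String[] args) {
    // "pinned-example" is a made-up placeholder, not the real output of the setup script.
    Supplier<String> pinned = () -> "pinned-example";
    System.out.println(resolve(Optional.empty(), false, "0.4.1", Optional.empty(), pinned));
    System.out.println(resolve(Optional.empty(), true, "0.4.1", Optional.empty(), pinned));
  }
}
```

Passing the pinned version as a `Supplier` keeps the expensive step (running the script) out of the paths that resolve to a released version, which is the same reason the diff declares `pinnedUnityCatalogVersion` as `lazy`.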
build/sbt
@@ -0,0 +1,16 @@
+diff --git a/build/sbt b/build/sbt
+--- a/build/sbt
++++ b/build/sbt
+ )
+ }
+ 
+-# If MAVEN_PROXY_URL is set, use it as the sole repository for all dependencies.
++# If MAVEN_PROXY_URL is set, use it (and local) as the sole repository for all dependencies.
+ if [[ -n "$MAVEN_PROXY_URL" ]]; then
+   SBT_REPOSITORIES_CONFIG=$(mktemp)
+   cat > "$SBT_REPOSITORIES_CONFIG" <<EOF
+ [repositories]
++  local
+   maven-proxy: $MAVEN_PROXY_URL
+   maven-proxy-ivy: $MAVEN_PROXY_URL, [organization]/[module]/(scala_[scalaVersion]/)(sbt_[sbtVersion]/)[revision]/[type]s/[artifact](-[classifier]).[ext]
+ EOF
\ No newline at end of file
iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
@@ -0,0 +1,15 @@
+diff --git a/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala b/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
+--- a/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
++++ b/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
+      * AnalysisException
+      */
+      try {
+-       SchemaMergingUtils.checkColumnNameDuplication(tableSchema, "during convert to Delta")
++       SchemaMergingUtils.checkColumnNameDuplication(tableSchema, "CONVERT_TO_DELTA")
+      } catch {
+-       case e: AnalysisException if e.getMessage.contains("during convert to Delta") =>
++       case e: AnalysisException
++           if e.getErrorClass == "DELTA_DUPLICATE_COLUMNS_FOUND.CONVERT_TO_DELTA" =>
+          throw new UnsupportedOperationException(
+            IcebergTable.caseSensitiveConversionExceptionMsg(e.getMessage))
+      }
\ No newline at end of file
iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
@@ -0,0 +1,11 @@
+diff --git a/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala b/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
+--- a/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
++++ b/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
+    * @param catalogTable the catalogTable this conversion targets
+    * @return (Iceberg metadata path, last converted Delta version)
+    */
+-  def convertUncommitedTxn(
++  override def convertUncommitedTxn(
+       txnInfo: CurrentTransactionInfo,
+       deltaAttemptVersion: Long,
+       deltaLog: DeltaLog,
\ No newline at end of file
iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
@@ -0,0 +1,149 @@
+diff --git a/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala b/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
+--- a/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
++++ b/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
+ 
+ package org.apache.spark.sql.delta.uniform
+ 
+-import org.apache.spark.sql.delta.sources.DeltaSQLConf
++import java.util.{Collections, Optional, UUID}
++
++import scala.collection.JavaConverters._
++
++import io.delta.storage.commit.{CommitCoordinatorClient => JCommitCoordinatorClient}
++import io.delta.storage.commit.{TableIdentifier => UCTableIdentifier}
++import io.delta.storage.commit.actions.{AbstractMetadata, AbstractProtocol}
++import io.delta.storage.commit.uccommitcoordinator.UCCommitCoordinatorClient
++import org.apache.hadoop.fs.Path
+ 
+ import org.apache.spark.{SparkConf, SparkSessionSwitch}
+ import org.apache.spark.sql.{Row, SparkSession}
++import org.apache.spark.sql.catalyst.TableIdentifier
++import org.apache.spark.sql.delta.DeltaConfigs.{
++  COORDINATED_COMMITS_COORDINATOR_CONF,
++  COORDINATED_COMMITS_COORDINATOR_NAME
++}
++import org.apache.spark.sql.delta.DeltaLog
++import org.apache.spark.sql.delta.NonSparkReadIceberg
++import org.apache.spark.sql.delta.coordinatedcommits.{
++  CatalogOwnedCommitCoordinatorBuilder,
++  CommitCoordinatorProvider,
++  InMemoryUCClient,
++  InMemoryUCCommitCoordinator,
++  UCCommitCoordinatorBuilder
++}
++import org.apache.spark.sql.delta.sources.DeltaSQLConf
+ import org.apache.spark.sql.delta.test.DeltaSQLCommandTest
+ import org.apache.spark.sql.delta.uniform.hms.HMSTest
++import org.apache.spark.sql.delta.util.JsonUtils
+ 
+ /**
+  * This trait allows the tests to write with Delta
+ }
+ 
+ /**
+- * No test should go here. Please add tests in [[UniFormE2EIcebergSuiteBase]]
++ * Trait that wires up an in-memory UC commit coordinator for UniForm E2E testing.
++ *
++ * Mix this into a concrete suite that already extends [[UniFormE2EIcebergSuiteBase]] (or any
++ * other [[UniFormE2ETest]] subclass) to redirect every [[readAndVerify]] call through the
++ * native Iceberg reader backed by the in-memory UC coordinator.
++ *
++ * Concrete suites must call [[requiredTableProperties]] inside their
++ * [[UniFormE2EIcebergSuiteBase.extraTableProperties]] override to inject the coordinator
++ * name and conf into every `CREATE TABLE` statement.
+  */
++trait WriteDeltaUCCCReadIceberg extends UniFormE2ETest
++  with DeltaSQLCommandTest
++  with NonSparkReadIceberg {
++
++  /**
++   * A [[UCCommitCoordinatorClient]] subclass that overrides [[registerTable]] to auto-assign
++   * a UC table ID, simulating what the UC catalog does during CREATE TABLE.
++   */
++  private class TestUCBackedCommitCoordinator(ucClient: InMemoryUCClient)
++    extends UCCommitCoordinatorClient(Collections.emptyMap(), ucClient) {
++
++    @volatile var lastRegisteredTableId: String = _
++
++    /**
++     * Delta blocks setting `COORDINATED_COMMITS_TABLE_CONF` in TBLPROPERTIES, so this trait
++     * simulates what the real UC catalog does: a [[CatalogOwnedCommitCoordinatorBuilder]] returns
++     * a single [[TestUCBackedCommitCoordinator]] instance whose [[registerTable]] auto-assigns a
++     * UUID.  Returning the same instance from every [[build]]/[[buildForCatalog]] call ensures
++     * that [[UCCommitCoordinatorClient.semanticEquals]] (which uses reference equality on `conf`)
++     * returns true and Delta does not reject intra-test metadata updates.
++     */
++    override def registerTable(
++        logPath: Path,
++        tableIdentifier: Optional[UCTableIdentifier],
++        currentVersion: Long,
++        currentMetadata: AbstractMetadata,
++        currentProtocol: AbstractProtocol): java.util.Map[String, String] = {
++      val tableId = UUID.randomUUID().toString
++      lastRegisteredTableId = tableId
++      Map(UCCommitCoordinatorClient.UC_TABLE_ID_KEY -> tableId).asJava
++    }
++  }
++
++  protected var ucCommitCoordinator: InMemoryUCCommitCoordinator = _
++  private var testCoordinator: TestUCBackedCommitCoordinator = _
++
++  abstract override def beforeEach(): Unit = {
++    super.beforeEach()
++    DeltaLog.clearCache()
++    CommitCoordinatorProvider.clearAllBuilders()
++    ucCommitCoordinator = new InMemoryUCCommitCoordinator()
++    val ucClient = new InMemoryUCClient("test-metastore", ucCommitCoordinator)
++    testCoordinator = new TestUCBackedCommitCoordinator(ucClient)
++    CommitCoordinatorProvider.registerBuilder(new CatalogOwnedCommitCoordinatorBuilder {
++      override def getName: String = UCCommitCoordinatorBuilder.getName
++      override def build(
++          spark: SparkSession, conf: Map[String, String]): JCommitCoordinatorClient =
++        testCoordinator
++      override def buildForCatalog(
++          spark: SparkSession, catalogName: String): JCommitCoordinatorClient =
++        testCoordinator
++    })
++  }
++
++  abstract override def afterEach(): Unit = {
++    CommitCoordinatorProvider.clearAllBuilders()
++    DeltaLog.clearCache()
++    super.afterEach()
++  }
++
++  /**
++   * Returns the TBLPROPERTIES SQL fragment required to enable the UC commit coordinator.
++   * Concrete suites should append this to their [[extraTableProperties]] override.
++   */
++  def requiredTableProperties: String =
++    s", '${COORDINATED_COMMITS_COORDINATOR_NAME.key}' = '${UCCommitCoordinatorBuilder.getName}'" +
++      s", '${COORDINATED_COMMITS_COORDINATOR_CONF.key}' = " +
++      s"'${JsonUtils.toJson(Map.empty[String, String])}'"
++
++  override protected def readAndVerify(
++      table: String, fields: String, orderBy: String, expect: Seq[Row]): Unit = {
++    val tableId = testCoordinator.lastRegisteredTableId
++    assert(tableId != null,
++      s"No table UUID assigned for '$table' - table was not created with CC properties")
++    val schema = DeltaLog.forTable(spark, TableIdentifier(table)).update().schema
++    val uniformMetadata = ucCommitCoordinator.getUniformMetadata(tableId)
++    assert(uniformMetadata.isDefined,
++      s"No UniForm metadata found for table '$table' (ID $tableId)")
++    assert(uniformMetadata.get.getIcebergMetadata.isPresent,
++      s"No Iceberg metadata found for table '$table' (ID $tableId)")
++    val icebergMetadataPath = uniformMetadata.get.getIcebergMetadata.get.getMetadataLocation
++    verifyReadByPath(icebergMetadataPath, schema, fields, orderBy, expect)
++  }
++}
++
++/**
++ * Concrete E2E suite that runs all [[UniFormE2EIcebergSuiteBase]] tests with tables backed
++ * by an in-memory UC commit coordinator, reading results via the native Iceberg reader.
++ */
++class UniFormE2EIcebergUCSuite extends UniFormE2EIcebergSuiteBase
++    with WriteDeltaUCCCReadIceberg {
++  // No test should go here. Please add tests in [[UniFormE2EIcebergSuiteBase]]
++  override def extraTableProperties(compatVersion: Int): String =
++    super.extraTableProperties(compatVersion) + requiredTableProperties
++}
\ No newline at end of file
kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
@@ -0,0 +1,49 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
+     public static final String FORMAT_HUDI = "hudi";
+   }
+ 
++  /**
++   * The set of compression codecs that Kernel currently recognizes and enforces. This is
++   * intentionally strict for now. In the future we may add new codecs or relax validation to allow
++   * any codec string.
++   */
++  private static final Set<String> VALID_COMPRESSION_CODECS =
++      Collections.unmodifiableSet(
++          new HashSet<>(
++              Arrays.asList("uncompressed", "none", "snappy", "gzip", "lz4", "lz4_raw", "zstd")));
++
+   private static final Collection<String> ALLOWED_UNIFORM_FORMATS =
+       Collections.unmodifiableList(
+           Arrays.asList(UniversalFormats.FORMAT_HUDI, UniversalFormats.FORMAT_ICEBERG));
+           "needs to be a boolean.",
+           true);
+ 
++  /**
++   * Compression codec writers should use for new Parquet data and checkpoint files. Changing this
++   * property does not affect existing files; a table may contain files written with different
++   * codecs.
++   *
++   * <p>Valid values (case-insensitive): uncompressed, none, snappy, gzip, lz4, lz4_raw, zstd.
++   */
++  public static final TableConfig<String> PARQUET_COMPRESSION_CODEC =
++      new TableConfig<>(
++          "delta.parquet.compression.codec",
++          "snappy",
++          v -> v.toLowerCase(Locale.ROOT),
++          VALID_COMPRESSION_CODECS::contains,
++          "needs to be one of: 'uncompressed', 'none', 'snappy', 'gzip',"
++              + " 'lz4', 'lz4_raw', 'zstd'.",
++          true /* editable */);
++
+   public static final TableConfig<String> MATERIALIZED_ROW_ID_COLUMN_NAME =
+       new TableConfig<>(
+           "delta.rowTracking.materializedRowIdColumnName",
+               addConfig(this, MATERIALIZED_ROW_ID_COLUMN_NAME);
+               addConfig(this, MATERIALIZED_ROW_COMMIT_VERSION_COLUMN_NAME);
+               addConfig(this, VARIANT_SHREDDING_ENABLED);
++              addConfig(this, PARQUET_COMPRESSION_CODEC);
+ 
+               // The below configs do not yet have their behavior correctly implemented in Kernel.
+               addConfig(this, DATA_SKIPPING_STATS_COLUMNS);
\ No newline at end of file
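Reviewer note on `PARQUET_COMPRESSION_CODEC` above: the config lower-cases the value with `Locale.ROOT` before checking it against the allowed set, so `ZSTD` and `zstd` are treated the same. Below is a minimal, standalone sketch of that normalize-then-validate step. It does not use Kernel's `TableConfig` class; the class and method names are hypothetical, and `IllegalArgumentException` is a stand-in for whatever exception Kernel actually raises on an invalid value.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Illustrative sketch only: same codec list as the diff, but independent of Kernel's TableConfig.
final class CompressionCodecSketch {

  private static final Set<String> VALID_COMPRESSION_CODECS =
      Collections.unmodifiableSet(
          new HashSet<>(
              Arrays.asList("uncompressed", "none", "snappy", "gzip", "lz4", "lz4_raw", "zstd")));

  /** Lower-cases the raw property value and rejects anything outside the allowed set. */
  static String normalizeAndValidate(String raw) {
    // Locale.ROOT avoids locale-specific case mappings (for example the Turkish dotless i).
    String codec = raw.toLowerCase(Locale.ROOT);
    if (!VALID_COMPRESSION_CODECS.contains(codec)) {
      throw new IllegalArgumentException(
          "Invalid value '" + raw + "' for delta.parquet.compression.codec");
    }
    return codec;
  }

  public static void main(String[] args) {
    System.out.println(normalizeAndValidate("ZSTD"));
    System.out.println(normalizeAndValidate("Lz4_Raw"));
  }
}
```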
kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
@@ -0,0 +1,13 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
+               StructType.class));
+ 
+   private static final Set<Class<? extends DataType>> V3_SUPPORTED_TYPES =
+-      Stream.concat(V2_SUPPORTED_TYPES.stream(), Stream.of(VariantType.class))
++      Stream.concat(
++              V2_SUPPORTED_TYPES.stream(),
++              Stream.of(VariantType.class, GeometryType.class, GeographyType.class))
+           .collect(Collectors.toSet());
+ 
+   protected static final IcebergCompatCheck V2_CHECK_HAS_SUPPORTED_TYPES =
\ No newline at end of file
kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
@@ -0,0 +1,10 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
+                   VARIANT_SHREDDING_PREVIEW_RW_FEATURE,
+                   VARIANT_RW_PREVIEW_FEATURE,
+                   ALLOW_COLUMN_DEFAULTS_W_FEATURE,
++                  GEOSPATIAL_RW_FEATURE,
+                   // Also allow writerV1 features for backward compatibility.
+                   //
+                   // Note: We already enforce that these features cannot be enabled
\ No newline at end of file
kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
@@ -0,0 +1,22 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
+ import io.delta.kernel.utils.CloseableIterator;
+ import io.delta.kernel.utils.FileStatus;
+ import java.io.IOException;
++import java.io.InterruptedIOException;
+ import java.io.UncheckedIOException;
+ import java.util.*;
+ import java.util.stream.Collectors;
+       throw new IllegalStateException("Can't call `next` on a closed iterator.");
+     }
+     if (Thread.currentThread().isInterrupted()) {
+-      throw new IllegalStateException("Thread was interrupted");
++      // Throw a typed InterruptedIOException (wrapped, since next() does not declare checked
++      // exceptions) so engines whose interrupt-handling recognizes standard JDK interrupt types
++      // (e.g. Spark's StreamExecution.isInterruptionException) treat this as a clean shutdown
++      // rather than a real error.
++      throw new UncheckedIOException(new InterruptedIOException("Thread was interrupted"));
+     }
+ 
+     if (!hasNext()) {
\ No newline at end of file
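
Note on the `ActionsIterator` change above: a minimal sketch of why the typed exception matters, assuming an engine decides "was this an interrupt?" by walking the cause chain for standard JDK interrupt types. `InterruptionCheck` and `isInterruption` are hypothetical names for illustration; this is not Spark's `StreamExecution.isInterruptionException`, only a check in the same spirit.

```java
import java.io.InterruptedIOException;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Delta Kernel or Spark.
final class InterruptionCheck {

  /** Returns true if {@code t} or any of its causes is a standard JDK interrupt type. */
  static boolean isInterruption(Throwable t) {
    // Bound the walk so a cyclic cause chain cannot loop forever.
    for (int depth = 0; t != null && depth < 32; depth++) {
      if (t instanceof InterruptedException || t instanceof InterruptedIOException) {
        return true;
      }
      t = t.getCause();
    }
    return false;
  }

  public static void main(String[] args) {
    // New behavior in this diff: typed interrupt wrapped in an unchecked exception.
    Throwable wrapped =
        new UncheckedIOException(new InterruptedIOException("Thread was interrupted"));
    // Old behavior: same message, but no interrupt type anywhere in the chain.
    Throwable legacy = new IllegalStateException("Thread was interrupted");

    System.out.println(isInterruption(wrapped)); // prints "true"
    System.out.println(isInterruption(legacy)); // prints "false"
  }
}
```

With the old `IllegalStateException`, a type-based check like this has nothing to match on, so the interrupt would surface as an ordinary failure.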
kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
@@ -0,0 +1,11 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
+     }
+   }
+ 
+-  static final TableFeature GEOSPATIAL_RW_FEATURE = new GeoSpatialTableFeature();
++  public static final TableFeature GEOSPATIAL_RW_FEATURE = new GeoSpatialTableFeature();
+ 
+   private static class GeoSpatialTableFeature extends TableFeature.ReaderWriterFeature
+       implements FeatureAutoEnabledByMetadata {
\ No newline at end of file
kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
@@ -0,0 +1,73 @@
+diff --git a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
+--- a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
++++ b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
+ 
+ import scala.collection.JavaConverters._
+ 
+-import io.delta.kernel.exceptions.KernelException
++import io.delta.kernel.exceptions.{InvalidConfigurationValueException, KernelException}
+ 
+ import org.scalatest.funsuite.AnyFunSuite
+ 
+         TableConfig.IN_COMMIT_TIMESTAMP_ENABLEMENT_TIMESTAMP.getKey -> "1",
+         TableConfig.COLUMN_MAPPING_MODE.getKey -> "name",
+         TableConfig.ICEBERG_COMPAT_V2_ENABLED.getKey -> "true",
+-        TableConfig.UNIVERSAL_FORMAT_ENABLED_FORMATS.getKey -> "iceberg").asJava)
++        TableConfig.UNIVERSAL_FORMAT_ENABLED_FORMATS.getKey -> "iceberg",
++        TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> "snappy").asJava)
+   }
+ 
+   test("check TableConfig.MAX_COLUMN_ID.editable is false") {
+     val formats = TableConfig.UNIVERSAL_FORMAT_ENABLED_FORMATS.fromMetadata(config)
+     assert(formats == Set("iceberg", "hudi").asJava)
+   }
++
++  test("PARQUET_COMPRESSION_CODEC - valid values accepted including mixed case") {
++    val validValues = Seq(
++      "snappy",
++      "SNAPPY",
++      "ZSTD",
++      "gzip",
++      "GZIP",
++      "lz4",
++      "lz4_raw",
++      "LZ4_RAW",
++      "uncompressed",
++      "UNCOMPRESSED",
++      "none",
++      "NONE",
++      "zstd")
++    validValues.foreach { codec =>
++      TableConfig.validateAndNormalizeDeltaProperties(
++        Map(TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> codec).asJava)
++    }
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - invalid value throws InvalidConfigurationValueException") {
++    val ex = intercept[InvalidConfigurationValueException] {
++      TableConfig.validateAndNormalizeDeltaProperties(
++        Map(TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> "invalid").asJava)
++    }
++    assert(ex.getMessage.contains("delta.parquet.compression.codec"))
++    assert(ex.getMessage.contains("invalid"))
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - fromMetadata returns lowercase regardless of stored case") {
++    val config = Map(TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> "SNAPPY").asJava
++    val result = TableConfig.PARQUET_COMPRESSION_CODEC.fromMetadata(config)
++    assert(result === "snappy")
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - fromMetadata returns snappy when property absent") {
++    val config = Map.empty[String, String].asJava
++    val result = TableConfig.PARQUET_COMPRESSION_CODEC.fromMetadata(config)
++    assert(result === "snappy")
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - validation normalizes key case") {
++    val result = TableConfig.validateAndNormalizeDeltaProperties(
++      Map("DELTA.PARQUET.COMPRESSION.CODEC" -> "snappy").asJava)
++    assert(result.containsKey("delta.parquet.compression.codec"))
++    assert(result.get("delta.parquet.compression.codec") === "snappy")
++  }
+ }
\ No newline at end of file
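
Note on the `PARQUET_COMPRESSION_CODEC` tests above: a rough sketch of the behavior they pin down (case-insensitive values, lowercase on read, `snappy` when absent, an error that names the key and the bad value). `CodecConfigSketch` is a hypothetical stand-in, not `TableConfig`; the accepted set lists only the values the test exercises, and `IllegalArgumentException` stands in for `InvalidConfigurationValueException`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; the real logic lives in Kernel's TableConfig.
final class CodecConfigSketch {
  static final String KEY = "delta.parquet.compression.codec";
  static final String DEFAULT = "snappy";

  // Assumption: only the codecs that appear in the test's validValues list.
  static final Set<String> ACCEPTED =
      Collections.unmodifiableSet(
          new HashSet<>(
              Arrays.asList("snappy", "zstd", "gzip", "lz4", "lz4_raw", "uncompressed", "none")));

  /** Lowercases the value and rejects anything outside the accepted set. */
  static String normalize(String raw) {
    String value = raw.toLowerCase(Locale.ROOT);
    if (!ACCEPTED.contains(value)) {
      throw new IllegalArgumentException("Invalid value for " + KEY + ": " + raw);
    }
    return value;
  }

  /** Reads the codec from table configuration, falling back to the default when absent. */
  static String fromMetadata(Map<String, String> config) {
    String raw = config.get(KEY);
    return raw == null ? DEFAULT : normalize(raw);
  }

  public static void main(String[] args) {
    System.out.println(fromMetadata(Collections.singletonMap(KEY, "SNAPPY"))); // prints "snappy"
    System.out.println(fromMetadata(Collections.<String, String>emptyMap())); // prints "snappy"
  }
}
```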
kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV2MetadataValidatorAndUpdaterSuite.scala
@@ -0,0 +1,12 @@
+diff --git a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV2MetadataValidatorAndUpdaterSuite.scala b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV2MetadataValidatorAndUpdaterSuite.scala
+--- a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV2MetadataValidatorAndUpdaterSuite.scala
++++ b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV2MetadataValidatorAndUpdaterSuite.scala
+ 
+   override def supportedDataColumnTypes: Set[DataType] = ALL_TYPES
+ 
+-  override def unsupportedDataColumnTypes: Set[DataType] = Set(VariantType.VARIANT)
++  override def unsupportedDataColumnTypes: Set[DataType] =
++    Set(VariantType.VARIANT, GeometryType.ofDefault(), GeographyType.ofDefault())
+ 
+   override def unsupportedPartitionColumnTypes: Set[DataType] = NESTED_TYPES
+ 
\ No newline at end of file
kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV3MetadataValidatorAndUpdateSuite.scala
@@ -0,0 +1,12 @@
+diff --git a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV3MetadataValidatorAndUpdateSuite.scala b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV3MetadataValidatorAndUpdateSuite.scala
+--- a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV3MetadataValidatorAndUpdateSuite.scala
++++ b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV3MetadataValidatorAndUpdateSuite.scala
+ 
+   override def icebergCompatVersion: String = "V3"
+ 
+-  override def supportedDataColumnTypes: Set[DataType] = ALL_TYPES + VariantType.VARIANT
++  override def supportedDataColumnTypes: Set[DataType] =
++    ALL_TYPES + VariantType.VARIANT + GeometryType.ofDefault() + GeographyType.ofDefault()
+ 
+   override def unsupportedDataColumnTypes: Set[DataType] = Set.empty
+ 
\ No newline at end of file
kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdaterSuite.scala
@@ -0,0 +1,25 @@
+diff --git a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdaterSuite.scala b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdaterSuite.scala
+--- a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdaterSuite.scala
++++ b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdaterSuite.scala
+   /* --- UNSUPPORTED_FEATURES_CHECK tests --- */
+ 
+   test("all supported features are allowed") {
+-    val readerFeatures =
+-      Set("columnMapping", "timestampNtz", "v2Checkpoint", "vacuumProtocolCheck", "rowTracking")
++    val readerFeatures = Set(
++      "columnMapping",
++      "timestampNtz",
++      "v2Checkpoint",
++      "vacuumProtocolCheck",
++      "rowTracking",
++      "geospatial")
+     val writerFeatures = Set(
+       // Legacy incompatible features (allowed as long as they are inactive)
+       "invariants",
+       "variantType-preview",
+       "variantShredding",
+       "variantShredding-preview",
++      "geospatial",
+       "icebergCompatV2",
+       "icebergWriterCompatV1",
+       "allowColumnDefaults",
\ No newline at end of file

... (truncated, output exceeded 60000 bytes)

Reproduce locally: git range-diff eb2169e..19b49ba e43bf65..c803c1a | Disable: git config gitstack.push-range-diff false

@PorridgeSwim PorridgeSwim force-pushed the stack/SparkMetadataAdapter branch from c803c1a to a50c9d2 Compare April 29, 2026 17:49
@PorridgeSwim
Collaborator Author

Range-diff: master (c803c1a -> a50c9d2)
.github/actions/setup-unitycatalog/action.yml
@@ -0,0 +1,40 @@
+diff --git a/.github/actions/setup-unitycatalog/action.yml b/.github/actions/setup-unitycatalog/action.yml
+new file mode 100644
+--- /dev/null
++++ b/.github/actions/setup-unitycatalog/action.yml
++name: "Set up pinned Unity Catalog build"
++description: >-
++  Publishes Unity Catalog jars from the commit pinned in project/scripts/setup_unitycatalog_main.sh
++  (the UC_PIN_SHA= line) to the runner's local Ivy / Maven caches, using GitHub Actions cache so the
++  slow UC build only runs the first time a pin is seen.
++
++runs:
++  using: "composite"
++  steps:
++    - name: Restore pinned UC cache
++      id: uc-cache
++      uses: actions/cache/restore@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
++      with:
++        # ~/.ivy2/local is what sbt publishLocal writes to. ~/.m2 is for publishM2.
++        path: |
++          ~/.ivy2/local
++          ~/.m2/repository/io/unitycatalog
++        # Cache key hashes the setup script, so bumping UC_PIN_SHA (or any other script change)
++        # invalidates the cache.
++        key: uc-jars-${{ runner.os }}-${{ hashFiles('project/scripts/setup_unitycatalog_main.sh') }}
++    - name: Build Unity Catalog from pinned SHA
++      shell: bash
++      run: bash project/scripts/setup_unitycatalog_main.sh
++    - name: Save pinned UC cache
++      # Only attempt a save when the restore missed. When multiple parallel matrix jobs all see
++      # a cache miss (first CI run after a pin bump), only the first to reach this step wins the
++      # GHA cache reservation; the rest log "another job may be creating this cache" warnings.
++      # Gating on cache-hit means cached runs (the common steady state) skip the save entirely,
++      # which eliminates those warnings on every subsequent run.
++      if: steps.uc-cache.outputs.cache-hit != 'true'
++      uses: actions/cache/save@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
++      with:
++        path: |
++          ~/.ivy2/local
++          ~/.m2/repository/io/unitycatalog
++        key: uc-jars-${{ runner.os }}-${{ hashFiles('project/scripts/setup_unitycatalog_main.sh') }}
\ No newline at end of file
.github/workflows/build.yaml
@@ -0,0 +1,29 @@
+diff --git a/.github/workflows/build.yaml b/.github/workflows/build.yaml
+--- a/.github/workflows/build.yaml
++++ b/.github/workflows/build.yaml
+ name: "Delta Build"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+             ~/.cache/coursier
+           key: delta-sbt-cache-cross-spark
+ 
++      # publishM2 compiles every aggregated project, including storage, which has
++      # unitycatalog-client as a compile-scope dependency. Publish the pinned UC build locally
++      # first so Delta compiles against the UC APIs it actually targets.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
++
+       - name: Run cross-Spark build test
+         run: python project/tests/test_cross_spark_publish.py
+ 
\ No newline at end of file
.github/workflows/disabled_iceberg_test.yaml
@@ -0,0 +1,80 @@
+diff --git a/.github/workflows/disabled_iceberg_test.yaml b/.github/workflows/disabled_iceberg_test.yaml
+deleted file mode 100644
+--- a/.github/workflows/disabled_iceberg_test.yaml
++++ /dev/null
+-name: "Delta Iceberg Latest [DISABLED]"
+-# SECURITY: All Python/PySpark workflows disabled due to active supply chain attack
+-# targeting OSS package ecosystems (PyPI). C2 domains: models.litellm.cloud, checkmarx.zone
+-# Date disabled: 2026-03-25
+-# To re-enable: remove 'if: false' from all jobs and restore original triggers
+-on:
+-  workflow_dispatch: # manual-only, auto triggers removed
+-  # To re-enable, replace the above line with:
+-  # push:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
+-  # pull_request:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
+-env:
+-  # SECURITY: Temporal lockdown — refuse any package version published after this date.
+-  # This date is a pre-attack baseline (before the active PyPI supply chain attack).
+-  UV_EXCLUDE_NEWER: "2026-03-10T00:00:00Z"
+-jobs:
+-  test:
+-    if: false # SECURITY: disabled - supply chain attack mitigation
+-    name: "DIL: Scala ${{ matrix.scala }}"
+-    runs-on: ubuntu-24.04
+-    strategy:
+-      matrix:
+-        # These Scala versions must match those in the build.sbt
+-        scala: [2.13.16]
+-    env:
+-      SCALA_VERSION: ${{ matrix.scala }}
+-    steps:
+-      - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
+-      - name: install java
+-        uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
+-        with:
+-          distribution: "zulu"
+-          java-version: "17"
+-      - name: Cache Scala, SBT
+-        uses: actions/cache@6f8efc29b200d32929f49075959781ed54ec270c # v3.5.0
+-        with:
+-          path: |
+-            ~/.sbt
+-            ~/.ivy2
+-            ~/.cache/coursier
+-          # Change the key if dependencies are changed. For each key, GitHub Actions will cache the
+-          # the above directories when we use the key for the first time. After that, each run will
+-          # just use the cache. The cache is immutable so we need to use a new key when trying to
+-          # cache new stuff.
+-          key: delta-sbt-cache-spark4.0-scala${{ matrix.scala }}
+-      - name: Set up uv
+-        run: bash project/scripts/install-uv.sh
+-      - name: Install Job dependencies
+-        run: |
+-          sudo apt-get update
+-          sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl git
+-          sudo apt install libedit-dev
+-          # buf v1.28.1 (2023-11-15) — SHA from official release asset:
+-          # https://github.com/bufbuild/buf/releases/download/v1.28.1/sha256.txt
+-          BUF_VERSION="v1.28.1"
+-          BUF_SHA256="870cf492d381a967d36636fdee9da44b524ea62aad163659b8dbf16a7da56987"
+-          curl -fsSL -o buf-Linux-x86_64.tar.gz \
+-            "https://github.com/bufbuild/buf/releases/download/${BUF_VERSION}/buf-Linux-x86_64.tar.gz"
+-          echo "${BUF_SHA256}  buf-Linux-x86_64.tar.gz" | sha256sum -c -
+-          mkdir -p ~/buf
+-          tar -xzf buf-Linux-x86_64.tar.gz -C ~/buf --strip-components 1
+-          rm buf-Linux-x86_64.tar.gz
+-          uv python install 3.8
+-          uv venv .venv --python 3.8
+-      - name: Run Scala/Java and Python tests
+-        # when changing TEST_PARALLELISM_COUNT make sure to also change it in spark_master_test.yaml
+-        run: |
+-          source .venv/bin/activate
+-          TEST_PARALLELISM_COUNT=4 python run-tests.py --group iceberg --spark-version 4.0
\ No newline at end of file
.github/workflows/spark_test_uc_master.yaml
@@ -0,0 +1,62 @@
+diff --git a/.github/workflows/spark_test_uc_master.yaml b/.github/workflows/disabled_spark_test_uc_master.yaml
+similarity index 61%
+rename from .github/workflows/spark_test_uc_master.yaml
+rename to .github/workflows/disabled_spark_test_uc_master.yaml
+--- a/.github/workflows/spark_test_uc_master.yaml
++++ b/.github/workflows/disabled_spark_test_uc_master.yaml
+ ##
+ ## To make this blocking, add the job name to the required status checks in
+ ## the branch protection rules for `master`.
++##
++## DISABLED while Delta master builds against a pinned UC master SHA — the main Delta Spark
++## workflow already exercises UC master at that pin, so a parallel floating-main workflow would
++## be redundant. To re-enable (once Delta goes back to a released UC version): drop the
++## `[DISABLED]` suffix from `name`, replace `workflow_dispatch:` with the original push /
++## pull_request triggers below, remove `if: false` from the job, and rename the file back to
++## `spark_test_uc_master.yaml`.
+ 
+-name: "Delta Spark (UC Master)"
++name: "Delta Spark (UC Master) [DISABLED]"
+ on:
+-  push:
+-    paths-ignore:
+-      - '**.md'
+-      - '**.txt'
+-  pull_request:
+-    paths-ignore:
+-      - '**.md'
+-      - '**.txt'
++  workflow_dispatch: # manual-only while disabled
++  # Original triggers, restore when re-enabling:
++  # push:
++  #   branches: [master, branch-*]
++  #   paths-ignore:
++  #     - '**.md'
++  #     - '**.txt'
++  # pull_request:
++  #   branches: [master, branch-*]
++  #   paths-ignore:
++  #     - '**.md'
++  #     - '**.txt'
+ 
+ jobs:
+   test-uc-master:
+     name: "[Non Blocking] UC Integration Tests (UC Main)"
++    # Guard against accidental runs while disabled. Remove when re-enabling.
++    if: false
+     runs-on: ubuntu-24.04
+     steps:
+       - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3
+           key: delta-sbt-cache-uc-master
+       - name: Build Unity Catalog from source
+         id: uc-build
++        # UC_REF=main builds the floating-main canary instead of the pinned SHA, which is the
++        # point of this workflow -- early warning of upcoming UC incompatibilities.
+         run: |
+-          bash project/scripts/setup_unitycatalog_main.sh
+-          UC_VERSION=$(cat /tmp/unitycatalog/.uc-version)
++          UC_REF=main bash project/scripts/setup_unitycatalog_main.sh
++          UC_VERSION=$(UC_REF=main bash project/scripts/setup_unitycatalog_main.sh --print-version)
+           echo "uc_version=$UC_VERSION" >> $GITHUB_OUTPUT
+           echo "UC version: $UC_VERSION"
+       - name: Run UC integration tests
\ No newline at end of file
.github/workflows/flink_test.yaml
@@ -0,0 +1,37 @@
+diff --git a/.github/workflows/flink_test.yaml b/.github/workflows/flink_test.yaml
+--- a/.github/workflows/flink_test.yaml
++++ b/.github/workflows/flink_test.yaml
+ 
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'flink/**'
+       - 'kernel/**'
+       - '!**/*.md'
+       - '!**/*.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'flink/**'
+       - 'kernel/**'
+   cancel-in-progress: true
+ 
+ env:
+-  # Point SBT to our cache directories for consistency
++  # Point SBT to our cache directories for consistency.
+   SBT_OPTS: "-Dsbt.coursier.home-dir=/home/runner/.cache/coursier -Dsbt.ivy.home=/home/runner/.ivy2"
+ 
+ jobs:
+           else
+             echo "❌ Cache MISS - will download dependencies"
+           fi
++      # flink has unitycatalog-client as a compile-scope dep and flink tests exercise UC.
++      # Publish the pinned UC build locally before sbt runs.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run unit tests
+         run: |
+           build/sbt flinkGroup/test
\ No newline at end of file
.github/workflows/iceberg_test.yaml
@@ -0,0 +1,58 @@
+diff --git a/.github/workflows/iceberg_test.yaml b/.github/workflows/iceberg_test.yaml
+new file mode 100644
+--- /dev/null
++++ b/.github/workflows/iceberg_test.yaml
++name: "Delta Iceberg Latest"
++on:
++  push:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
++  pull_request:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
++jobs:
++  test:
++    name: "DIL: Scala ${{ matrix.scala }}"
++    runs-on: ubuntu-24.04
++    strategy:
++      matrix:
++        # These Scala versions must match those in the build.sbt
++        scala: [2.13.16]
++    env:
++      SCALA_VERSION: ${{ matrix.scala }}
++    steps:
++      - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
++      - name: install java
++        uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
++        with:
++          distribution: "zulu"
++          java-version: "17"
++      - name: Cache Scala, SBT
++        uses: actions/cache@6f8efc29b200d32929f49075959781ed54ec270c # v3.5.0
++        with:
++          path: |
++            ~/.sbt
++            ~/.ivy2
++            ~/.cache/coursier
++          # Change the key if dependencies are changed. For each key, GitHub Actions will cache the
++          # the above directories when we use the key for the first time. After that, each run will
++          # just use the cache. The cache is immutable so we need to use a new key when trying to
++          # cache new stuff.
++          key: delta-sbt-cache-spark4.0-scala${{ matrix.scala }}
++      - name: Set up uv
++        run: bash project/scripts/install-uv.sh
++      - name: Install Python via uv
++        # No UV_EXCLUDE_NEWER needed: this workflow installs zero pip packages.
++        # Python is only used to run the stdlib-only run-tests.py driver.
++        run: |
++          uv python install 3.8
++          uv venv .venv --python 3.8
++      - name: Run Scala/Java and Python tests
++        # when changing TEST_PARALLELISM_COUNT make sure to also change it in spark_master_test.yaml
++        run: |
++          source .venv/bin/activate
++          TEST_PARALLELISM_COUNT=4 python run-tests.py --group iceberg --spark-version 4.0
\ No newline at end of file
.github/workflows/kernel_docs.yaml
@@ -0,0 +1,11 @@
+diff --git a/.github/workflows/kernel_docs.yaml b/.github/workflows/kernel_docs.yaml
+--- a/.github/workflows/kernel_docs.yaml
++++ b/.github/workflows/kernel_docs.yaml
+           java-version: "11"
+       - name: Generate docs
+         run: |
+-          build/sbt kernelGroup/unidoc
++          build/sbt -DuseDefaultUnityCatalogReleaseVersion=true kernelGroup/unidoc
+           mkdir -p kernel/docs/snapshot/kernel-api/java
+           mkdir -p kernel/docs/snapshot/kernel-defaults/java
+           cp -r kernel/kernel-api/target/javaunidoc/. kernel/docs/snapshot/kernel-api/java/
\ No newline at end of file
.github/workflows/kernel_test.yaml
@@ -0,0 +1,47 @@
+diff --git a/.github/workflows/kernel_test.yaml b/.github/workflows/kernel_test.yaml
+--- a/.github/workflows/kernel_test.yaml
++++ b/.github/workflows/kernel_test.yaml
+ 
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+           else
+             echo "❌ Cache MISS - will download dependencies"
+           fi
++      # run-tests.py invokes sbt with `++ 2.13.16`, which triggers cross-version dependency resolution
++      # across every project (including kernelUnityCatalog). Publish the pinned UC build locally first
++      # so that resolution doesn't miss.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run unit tests
+         run: |
+           python run-tests.py --group kernel --coverage --shard ${{ matrix.shard }}
+     runs-on: ubuntu-24.04
+     steps:
+       - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
+-      # Run integration tests with JDK 11, as they have no Spark dependency
+-      - name: install java
++      # The integration test itself runs on JDK 11 (no Spark dependency), but UC's sbt build needs
++      # JDK 17, so we install 17 first, publish UC, then switch the active JDK to 11 for the actual
++      # test run.
++      - name: install java 17 for UC build
++        uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
++        with:
++          distribution: "zulu"
++          java-version: "17"
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
++      - name: install java 11 for integration test
+         uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
+         with:
+           distribution: "zulu"
\ No newline at end of file
.github/workflows/kernel_unitycatalog_test.yaml
@@ -0,0 +1,29 @@
+diff --git a/.github/workflows/kernel_unitycatalog_test.yaml b/.github/workflows/kernel_unitycatalog_test.yaml
+--- a/.github/workflows/kernel_unitycatalog_test.yaml
++++ b/.github/workflows/kernel_unitycatalog_test.yaml
+ name: "Kernel Unity Catalog"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'build.sbt'
+       - 'version.sbt'
+       - 'storage/**/*.java'
+       - '.github/workflows/kernel_unitycatalog_test.yaml'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'build.sbt'
+       - 'version.sbt'
+         with:
+           distribution: "zulu"
+           java-version: "17"
++      # kernelUnityCatalog depends on unreleased UC APIs; publish the pinned UC build locally before
++      # sbt tries to resolve the dependency.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run Unity tests with coverage
+         run: |
+           ./build/sbt "++ ${{ env.SCALA_VERSION }}" clean coverage kernelUnityCatalog/test coverageAggregate coverageOff -v
\ No newline at end of file
.github/workflows/spark_examples_test.yaml
@@ -0,0 +1,27 @@
+diff --git a/.github/workflows/spark_examples_test.yaml b/.github/workflows/spark_examples_test.yaml
+--- a/.github/workflows/spark_examples_test.yaml
++++ b/.github/workflows/spark_examples_test.yaml
+ name: "Delta Spark Publishing and Examples"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+           sudo apt-get update
+           sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl git
+           sudo apt install libedit-dev
++      # `publishM2` and `++ <scala>` both resolve every project's deps, which includes
++      # sparkUnityCatalog; publish the pinned UC build locally before sbt runs.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run Delta Spark Local Publishing and Examples Compilation
+         # examples/scala/build.sbt will compile against the local Delta release version (e.g. 3.2.0-SNAPSHOT).
+         # Thus, we need to publishM2 first so those jars are locally accessible.
\ No newline at end of file
.github/workflows/disabled_spark_python_test.yaml
@@ -0,0 +1,76 @@
+diff --git a/.github/workflows/disabled_spark_python_test.yaml b/.github/workflows/spark_python_test.yaml
+similarity index 71%
+rename from .github/workflows/disabled_spark_python_test.yaml
+rename to .github/workflows/spark_python_test.yaml
+--- a/.github/workflows/disabled_spark_python_test.yaml
++++ b/.github/workflows/spark_python_test.yaml
+-name: "Delta Spark Python [DISABLED]"
+-# SECURITY: All Python/PySpark workflows disabled due to active supply chain attack
+-# targeting OSS package ecosystems (PyPI). C2 domains: models.litellm.cloud, checkmarx.zone
+-# Date disabled: 2026-03-25
+-# To re-enable: remove 'if: false' from all jobs and restore original triggers
++name: "Delta Spark Python"
+ on:
+-  workflow_dispatch: # manual-only, auto triggers removed
+-  # To re-enable, replace the above line with:
+-  # push:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
+-  # pull_request:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
++  push:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
++  pull_request:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
+ env:
+   # SECURITY: Temporal lockdown — refuse any package version published after this date.
+   # This date is a pre-attack baseline (before the active PyPI supply chain attack).
+   # Generate Spark versions matrix from CrossSparkVersions.scala
+   # This workflow tests against released versions only (no snapshots)
+   generate-matrix:
+-    if: false # SECURITY: disabled - supply chain attack mitigation
+     name: "Generate Released Spark Versions Matrix"
+     runs-on: ubuntu-24.04
+     outputs:
+           echo "Generated released Spark versions: $SPARK_VERSIONS"
+ 
+   test:
+-    if: false # SECURITY: disabled - supply chain attack mitigation
+     name: "DSP (${{ matrix.spark_version }})"
+     runs-on: ubuntu-24.04
+     needs: generate-matrix
+           key: delta-sbt-cache-spark${{ matrix.spark_version }}-scala${{ matrix.scala }}
+       - name: Set up uv
+         run: bash project/scripts/install-uv.sh
+-      - name: Install Job dependencies
++      - name: Set up buf
++        run: bash project/scripts/install-buf.sh
++      - name: Install Python and dependencies
+         run: |
+-          sudo apt-get update
+-          sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl git
+-          sudo apt install libedit-dev
+-          # buf v1.28.1 (2023-11-15) — SHA from official release asset:
+-          # https://github.com/bufbuild/buf/releases/download/v1.28.1/sha256.txt
+-          BUF_VERSION="v1.28.1"
+-          BUF_SHA256="870cf492d381a967d36636fdee9da44b524ea62aad163659b8dbf16a7da56987"
+-          curl -fsSL -o buf-Linux-x86_64.tar.gz \
+-            "https://github.com/bufbuild/buf/releases/download/${BUF_VERSION}/buf-Linux-x86_64.tar.gz"
+-          echo "${BUF_SHA256}  buf-Linux-x86_64.tar.gz" | sha256sum -c -
+-          mkdir -p ~/buf
+-          tar -xzf buf-Linux-x86_64.tar.gz -C ~/buf --strip-components 1
+-          rm buf-Linux-x86_64.tar.gz
+           uv python install 3.10
+           uv venv .venv --python 3.10
+           # Install hash-verified locked dependencies (see .github/ci-requirements/spark-python/)
\ No newline at end of file
.github/workflows/spark_test.yaml
@@ -0,0 +1,27 @@
+diff --git a/.github/workflows/spark_test.yaml b/.github/workflows/spark_test.yaml
+--- a/.github/workflows/spark_test.yaml
++++ b/.github/workflows/spark_test.yaml
+ name: "Delta Spark"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+             ~/.ivy2
+             ~/.cache/coursier
+           key: delta-sbt-cache-spark${{ matrix.spark_version }}-scala${{ matrix.scala }}
++      # Delta's sparkUnityCatalog module (part of sparkGroup) depends on APIs that are only in
++      # unreleased UC. Publish the pinned UC build locally before sbt tries to resolve it.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Scala structured logging style check
+         run: |
+           if [ -f ./dev/spark_structured_logging_style.py ]; then
\ No newline at end of file
.github/workflows/unidoc.yaml
@@ -0,0 +1,19 @@
+diff --git a/.github/workflows/unidoc.yaml b/.github/workflows/unidoc.yaml
+--- a/.github/workflows/unidoc.yaml
++++ b/.github/workflows/unidoc.yaml
+   name: "Unidoc"
+   on:
+     push:
+-      branches: [master]
++      branches: [master, branch-*]
+     pull_request:
+-      branches: [master]
++      branches: [master, branch-*]
+   jobs:
+     build:
+       name: "U: Scala ${{ matrix.scala }}"
+             java-version: "17"
+         - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
+         - name: generate unidoc
+-          run: build/sbt "++ ${{ matrix.scala }}" unidoc
++          run: build/sbt -DuseDefaultUnityCatalogReleaseVersion=true "++ ${{ matrix.scala }}" unidoc
\ No newline at end of file
build.sbt
@@ -0,0 +1,154 @@
+diff --git a/build.sbt b/build.sbt
+--- a/build.sbt
++++ b/build.sbt
+   ).configureUnidoc()
+ 
+ 
+-val unityCatalogVersion = sys.props.getOrElse("unityCatalogVersion", "0.4.1")
++// Unity Catalog version. Three modes, in priority order:
++//
++//  1. `-DuseDefaultUnityCatalogReleaseVersion=true`: use `defaultUnityCatalogReleaseVersion`
++//     below -- the last released UC version on Maven Central. For workflows that don't actually
++//     need DRC APIs (e.g. unidoc, lint) and want to skip the pinned UC build. Shared across
++//     workflows by reading this single constant, so bumping is a one-line change here.
++//
++//  2. Release mode: set `unityCatalogReleaseVersion = Some("0.5.0")` (or whatever released
++//     version the release branch ships against). sbt resolves the coordinate from Maven Central
++//     like any other dependency.
++//
++//  3. Pinned mode (default): leave `unityCatalogReleaseVersion = None`. The version string
++//     comes from `setup_unitycatalog_main.sh --print-version`, which encodes both the pinned
++//     UC main SHA and UC's declared base version; the script is the single source of truth.
++//     The same script (without the flag) publishes the matching jars to ~/.ivy2/local when
++//     `ensurePinnedUnityCatalog` decides they're missing.
++//
++// Override with -DunityCatalogVersion=<anything> for ad-hoc experiments.
++val unityCatalogReleaseVersion: Option[String] = None
++val defaultUnityCatalogReleaseVersion = "0.4.1"
++val useDefaultUnityCatalogReleaseVersion: Boolean =
++  sys.props.getOrElse("useDefaultUnityCatalogReleaseVersion", "false").toBoolean
++val unityCatalogSetupScript = "project/scripts/setup_unitycatalog_main.sh"
++
++// Lazy so release-mode / useDefaultUnityCatalogReleaseVersion builds never shell out.
++lazy val pinnedUnityCatalogVersion: String = {
++  import scala.sys.process._
++  Process(Seq("bash", unityCatalogSetupScript, "--print-version")).!!.trim
++}
++val unityCatalogVersion: String = sys.props.getOrElse(
++  "unityCatalogVersion",
++  if (useDefaultUnityCatalogReleaseVersion) defaultUnityCatalogReleaseVersion
++  else unityCatalogReleaseVersion.getOrElse(pinnedUnityCatalogVersion))
++
+ val sparkUnityCatalogJacksonVersion = "2.15.4" // We are using Spark 4.0's Jackson version 2.15.x, to override Unity Catalog 0.3.0's version 2.18.x
+ 
++// Publishes the pinned UC jars to ~/.ivy2/local if they're not already cached there. Hooked
++// into `update` on the UC-dependent projects below, so plain `sbt testOnly ...` on a clean
++// checkout just works. No-op in release mode. Opt out with
++// `-Ddelta.autoBuildPinnedUnityCatalog=false`, in which case sbt errors with a pointer to the
++// setup script.
++val ensurePinnedUnityCatalog = taskKey[Unit](
++  "Publish the pinned UC jars locally if the Ivy coordinate isn't already cached.")
++
++// Extracted so the task body can read as a short guard rather than three nested ifs.
++def publishPinnedUnityCatalog(log: sbt.util.Logger, canary: java.io.File): Unit = {
++  val shouldAutoBuild =
++    sys.props.getOrElse("delta.autoBuildPinnedUnityCatalog", "true").toBoolean
++  if (!shouldAutoBuild) {
++    sys.error(
++      s"""|Pinned Unity Catalog jars are not published locally for coordinate
++          |$unityCatalogVersion.
++          |Auto-build is disabled (-Ddelta.autoBuildPinnedUnityCatalog=false).
++          |Run: bash $unityCatalogSetupScript""".stripMargin)
++  }
++  log.info(s"[UC] Pinned UC jars not found for coordinate $unityCatalogVersion.")
++  log.info(
++    s"[UC] Running $unityCatalogSetupScript - takes ~3-5 minutes on a cold cache, <1s on a warm one.")
++  import scala.sys.process._
++  val procLogger = ProcessLogger(
++    line => log.info(s"[UC setup] $line"),
++    line => log.warn(s"[UC setup] $line"))
++  val exit = Process(Seq("bash", unityCatalogSetupScript)).!(procLogger)
++  if (exit != 0) {
++    sys.error(
++      s"[UC] $unityCatalogSetupScript exited with code $exit. Run it manually to see full output.")
++  }
++  if (!canary.exists) {
++    sys.error(
++      s"[UC] $unityCatalogSetupScript succeeded but ${canary.getAbsolutePath} is still missing - " +
++        "the publish target layout may have changed.")
++  }
++}
++
++Global / ensurePinnedUnityCatalog := {
++  // Resolve the .value dependencies eagerly - sbt's task macro warns when
++  // `.value` appears inside conditional branches.
++  val log = streams.value.log
++  // No-op whenever the effective version resolves to something Maven Central can serve:
++  // release mode, -DuseDefaultUnityCatalogReleaseVersion=true, or -DunityCatalogVersion=<released>.
++  val usingReleasedVersion = useDefaultUnityCatalogReleaseVersion ||
++    sys.props.contains("unityCatalogVersion")
++  if (unityCatalogReleaseVersion.isEmpty && !usingReleasedVersion) {
++    val home = file(sys.props("user.home"))
++    // Check both layouts: a restored sbt cache can pre-populate ivy alone, leaving m2 empty -
++    // checking only ivy would silently skip the slow publish and break mvn-based consumers.
++    val ivy2Canary = home / ".ivy2" / "local" / "io.unitycatalog" /
++      "unitycatalog-client" / unityCatalogVersion / "ivys" / "ivy.xml"
++    val m2Canary = home / ".m2" / "repository" / "io" / "unitycatalog" /
++      "unitycatalog-client" / unityCatalogVersion /
++      s"unitycatalog-client-$unityCatalogVersion.pom"
++    if (!ivy2Canary.exists || !m2Canary.exists) {
++      publishPinnedUnityCatalog(log, ivy2Canary)
++    }
++  }
++}
++
+ lazy val sparkUnityCatalog = (project in file("spark/unitycatalog"))
+   .dependsOn(spark % "compile->compile;test->test;provided->provided")
+   .disablePlugins(ScalafmtPlugin)
+     javafmtCheckSettings(),
+     CrossSparkVersions.sparkDependentSettings(sparkVersion),
+ 
++    // Publish the pinned UC jars before sbt tries to resolve them.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     // This is a test-only module - no production sources
+     Compile / sources := Seq.empty,
+ 
+     exportJars := false,
+     javafmtCheckSettings,
+     scalafmtCheckSettings,
+-    
++
+     libraryDependencies ++= Seq(
+       "org.openjdk.jmh" % "jmh-core" % "1.37" % "test",
+       "org.openjdk.jmh" % "jmh-generator-annprocess" % "1.37" % "test",
+     scalaStyleSettings,
+     scalafmtCheckSettings,
+ 
++    // Publish the pinned UC jars before sbt tries to resolve them.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     // Put the shaded kernel-api JAR on the classpath (compile & test)
+     Compile / unmanagedJars += (kernelApi / Compile / packageBin).value,
+     Test / unmanagedJars += (kernelApi / Compile / packageBin).value,
+       "com.fasterxml.jackson.datatype" % "jackson-datatype-jsr310" % "2.15.4" % "test",
+     ),
+ 
++    // Publish the pinned UC jars before sbt tries to resolve them. storage is the transitive
++    // UC-client entry point for most of the build graph (sparkV1, sparkV2, kernelDefaults, etc.
++    // all .dependsOn(storage)), so hooking here covers nearly every compile path.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     // Unidoc settings
+     unidocSourceFilePatterns += SourceFilePattern("/LogStore.java", "/CloseableIterator.java"),
+     TestParallelization.settings
+       "--add-opens=java.base/java.util=ALL-UNNAMED" // for Flink with Java 17.
+     ),
+     crossPaths := false,
++
++    // Publish the pinned UC jars before sbt tries to resolve them.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     libraryDependencies ++= Seq(
+       "org.apache.flink" % "flink-core" % flinkVersion % "provided",
+       "org.apache.flink" % "flink-table-common" % flinkVersion % "provided",
\ No newline at end of file
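
Reviewer note: the resolution order in the build.sbt hunk above (explicit `-DunityCatalogVersion` override, then `-DuseDefaultUnityCatalogReleaseVersion=true`, then release mode, then the pinned build) can be sketched in plain Java as below. This is only an illustration, not the build's actual code: the class `UcVersionResolver`, its parameter names, and the sample pinned string are hypothetical. The supplier stands in for the lazy `pinnedUnityCatalogVersion`, so the modes that resolve from Maven Central never shell out.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch of the version-selection order; not part of the Delta build.
final class UcVersionResolver {

  static String resolve(
      Optional<String> explicitOverride, // stands in for -DunityCatalogVersion
      boolean useDefaultRelease,         // stands in for -DuseDefaultUnityCatalogReleaseVersion
      String defaultRelease,             // stands in for defaultUnityCatalogReleaseVersion
      Optional<String> releaseVersion,   // stands in for unityCatalogReleaseVersion
      Supplier<String> pinnedVersion) {  // stands in for the lazy pinned lookup
    if (explicitOverride.isPresent()) {
      return explicitOverride.get();
    }
    if (useDefaultRelease) {
      return defaultRelease;
    }
    // orElseGet only calls the supplier when no release version is set.
    return releaseVersion.orElseGet(pinnedVersion);
  }

  public static void main(String[] args) {
    // "pinned-example" is a made-up placeholder, not a real UC version string.
    Supplier<String> pinned = () -> "pinned-example";
    System.out.println(resolve(Optional.empty(), true, "0.4.1", Optional.empty(), pinned));
    System.out.println(resolve(Optional.empty(), false, "0.4.1", Optional.of("0.5.0"), pinned));
    System.out.println(resolve(Optional.empty(), false, "0.4.1", Optional.empty(), pinned));
  }
}
```

Passing the pinned lookup as a supplier mirrors the `lazy val` in the diff: the expensive call happens only on the last branch.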
build/sbt
@@ -0,0 +1,16 @@
+diff --git a/build/sbt b/build/sbt
+--- a/build/sbt
++++ b/build/sbt
+ )
+ }
+ 
+-# If MAVEN_PROXY_URL is set, use it as the sole repository for all dependencies.
++# If MAVEN_PROXY_URL is set, use it (and local) as the sole repository for all dependencies.
+ if [[ -n "$MAVEN_PROXY_URL" ]]; then
+   SBT_REPOSITORIES_CONFIG=$(mktemp)
+   cat > "$SBT_REPOSITORIES_CONFIG" <<EOF
+ [repositories]
++  local
+   maven-proxy: $MAVEN_PROXY_URL
+   maven-proxy-ivy: $MAVEN_PROXY_URL, [organization]/[module]/(scala_[scalaVersion]/)(sbt_[sbtVersion]/)[revision]/[type]s/[artifact](-[classifier]).[ext]
+ EOF
\ No newline at end of file
iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
@@ -0,0 +1,15 @@
+diff --git a/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala b/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
+--- a/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
++++ b/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
+      * AnalysisException
+      */
+      try {
+-       SchemaMergingUtils.checkColumnNameDuplication(tableSchema, "during convert to Delta")
++       SchemaMergingUtils.checkColumnNameDuplication(tableSchema, "CONVERT_TO_DELTA")
+      } catch {
+-       case e: AnalysisException if e.getMessage.contains("during convert to Delta") =>
++       case e: AnalysisException
++           if e.getErrorClass == "DELTA_DUPLICATE_COLUMNS_FOUND.CONVERT_TO_DELTA" =>
+          throw new UnsupportedOperationException(
+            IcebergTable.caseSensitiveConversionExceptionMsg(e.getMessage))
+      }
\ No newline at end of file
iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
@@ -0,0 +1,11 @@
+diff --git a/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala b/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
+--- a/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
++++ b/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
+    * @param catalogTable the catalogTable this conversion targets
+    * @return (Iceberg metadata path, last converted Delta version)
+    */
+-  def convertUncommitedTxn(
++  override def convertUncommitedTxn(
+       txnInfo: CurrentTransactionInfo,
+       deltaAttemptVersion: Long,
+       deltaLog: DeltaLog,
\ No newline at end of file
iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
@@ -0,0 +1,149 @@
+diff --git a/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala b/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
+--- a/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
++++ b/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
+ 
+ package org.apache.spark.sql.delta.uniform
+ 
+-import org.apache.spark.sql.delta.sources.DeltaSQLConf
++import java.util.{Collections, Optional, UUID}
++
++import scala.collection.JavaConverters._
++
++import io.delta.storage.commit.{CommitCoordinatorClient => JCommitCoordinatorClient}
++import io.delta.storage.commit.{TableIdentifier => UCTableIdentifier}
++import io.delta.storage.commit.actions.{AbstractMetadata, AbstractProtocol}
++import io.delta.storage.commit.uccommitcoordinator.UCCommitCoordinatorClient
++import org.apache.hadoop.fs.Path
+ 
+ import org.apache.spark.{SparkConf, SparkSessionSwitch}
+ import org.apache.spark.sql.{Row, SparkSession}
++import org.apache.spark.sql.catalyst.TableIdentifier
++import org.apache.spark.sql.delta.DeltaConfigs.{
++  COORDINATED_COMMITS_COORDINATOR_CONF,
++  COORDINATED_COMMITS_COORDINATOR_NAME
++}
++import org.apache.spark.sql.delta.DeltaLog
++import org.apache.spark.sql.delta.NonSparkReadIceberg
++import org.apache.spark.sql.delta.coordinatedcommits.{
++  CatalogOwnedCommitCoordinatorBuilder,
++  CommitCoordinatorProvider,
++  InMemoryUCClient,
++  InMemoryUCCommitCoordinator,
++  UCCommitCoordinatorBuilder
++}
++import org.apache.spark.sql.delta.sources.DeltaSQLConf
+ import org.apache.spark.sql.delta.test.DeltaSQLCommandTest
+ import org.apache.spark.sql.delta.uniform.hms.HMSTest
++import org.apache.spark.sql.delta.util.JsonUtils
+ 
+ /**
+  * This trait allows the tests to write with Delta
+ }
+ 
+ /**
+- * No test should go here. Please add tests in [[UniFormE2EIcebergSuiteBase]]
++ * Trait that wires up an in-memory UC commit coordinator for UniForm E2E testing.
++ *
++ * Mix this into a concrete suite that already extends [[UniFormE2EIcebergSuiteBase]] (or any
++ * other [[UniFormE2ETest]] subclass) to redirect every [[readAndVerify]] call through the
++ * native Iceberg reader backed by the in-memory UC coordinator.
++ *
++ * Concrete suites must call [[requiredTableProperties]] inside their
++ * [[UniFormE2EIcebergSuiteBase.extraTableProperties]] override to inject the coordinator
++ * name and conf into every `CREATE TABLE` statement.
+  */
++trait WriteDeltaUCCCReadIceberg extends UniFormE2ETest
++  with DeltaSQLCommandTest
++  with NonSparkReadIceberg {
++
++  /**
++   * A [[UCCommitCoordinatorClient]] subclass that overrides [[registerTable]] to auto-assign
++   * a UC table ID, simulating what the UC catalog does during CREATE TABLE.
++   */
++  private class TestUCBackedCommitCoordinator(ucClient: InMemoryUCClient)
++    extends UCCommitCoordinatorClient(Collections.emptyMap(), ucClient) {
++
++    @volatile var lastRegisteredTableId: String = _
++
++    /**
++     * Delta blocks setting `COORDINATED_COMMITS_TABLE_CONF` in TBLPROPERTIES, so this trait
++     * simulates what the real UC catalog does: a [[CatalogOwnedCommitCoordinatorBuilder]] returns
++     * a single [[TestUCBackedCommitCoordinator]] instance whose [[registerTable]] auto-assigns a
++     * UUID.  Returning the same instance from every [[build]]/[[buildForCatalog]] call ensures
++     * that [[UCCommitCoordinatorClient.semanticEquals]] (which uses reference equality on `conf`)
++     * returns true and Delta does not reject intra-test metadata updates.
++     */
++    override def registerTable(
++        logPath: Path,
++        tableIdentifier: Optional[UCTableIdentifier],
++        currentVersion: Long,
++        currentMetadata: AbstractMetadata,
++        currentProtocol: AbstractProtocol): java.util.Map[String, String] = {
++      val tableId = UUID.randomUUID().toString
++      lastRegisteredTableId = tableId
++      Map(UCCommitCoordinatorClient.UC_TABLE_ID_KEY -> tableId).asJava
++    }
++  }
++
++  protected var ucCommitCoordinator: InMemoryUCCommitCoordinator = _
++  private var testCoordinator: TestUCBackedCommitCoordinator = _
++
++  abstract override def beforeEach(): Unit = {
++    super.beforeEach()
++    DeltaLog.clearCache()
++    CommitCoordinatorProvider.clearAllBuilders()
++    ucCommitCoordinator = new InMemoryUCCommitCoordinator()
++    val ucClient = new InMemoryUCClient("test-metastore", ucCommitCoordinator)
++    testCoordinator = new TestUCBackedCommitCoordinator(ucClient)
++    CommitCoordinatorProvider.registerBuilder(new CatalogOwnedCommitCoordinatorBuilder {
++      override def getName: String = UCCommitCoordinatorBuilder.getName
++      override def build(
++          spark: SparkSession, conf: Map[String, String]): JCommitCoordinatorClient =
++        testCoordinator
++      override def buildForCatalog(
++          spark: SparkSession, catalogName: String): JCommitCoordinatorClient =
++        testCoordinator
++    })
++  }
++
++  abstract override def afterEach(): Unit = {
++    CommitCoordinatorProvider.clearAllBuilders()
++    DeltaLog.clearCache()
++    super.afterEach()
++  }
++
++  /**
++   * Returns the TBLPROPERTIES SQL fragment required to enable the UC commit coordinator.
++   * Concrete suites should append this to their [[extraTableProperties]] override.
++   */
++  def requiredTableProperties: String =
++    s", '${COORDINATED_COMMITS_COORDINATOR_NAME.key}' = '${UCCommitCoordinatorBuilder.getName}'" +
++      s", '${COORDINATED_COMMITS_COORDINATOR_CONF.key}' = " +
++      s"'${JsonUtils.toJson(Map.empty[String, String])}'"
++
++  override protected def readAndVerify(
++      table: String, fields: String, orderBy: String, expect: Seq[Row]): Unit = {
++    val tableId = testCoordinator.lastRegisteredTableId
++    assert(tableId != null,
++      s"No table UUID assigned for '$table' - table was not created with CC properties")
++    val schema = DeltaLog.forTable(spark, TableIdentifier(table)).update().schema
++    val uniformMetadata = ucCommitCoordinator.getUniformMetadata(tableId)
++    assert(uniformMetadata.isDefined,
++      s"No UniForm metadata found for table '$table' (ID $tableId)")
++    assert(uniformMetadata.get.getIcebergMetadata.isPresent,
++      s"No Iceberg metadata found for table '$table' (ID $tableId)")
++    val icebergMetadataPath = uniformMetadata.get.getIcebergMetadata.get.getMetadataLocation
++    verifyReadByPath(icebergMetadataPath, schema, fields, orderBy, expect)
++  }
++}
++
++/**
++ * Concrete E2E suite that runs all [[UniFormE2EIcebergSuiteBase]] tests with tables backed
++ * by an in-memory UC commit coordinator, reading results via the native Iceberg reader.
++ */
++class UniFormE2EIcebergUCSuite extends UniFormE2EIcebergSuiteBase
++    with WriteDeltaUCCCReadIceberg {
++  // No test should go here. Please add tests in [[UniFormE2EIcebergSuiteBase]]
++  override def extraTableProperties(compatVersion: Int): String =
++    super.extraTableProperties(compatVersion) + requiredTableProperties
++}
\ No newline at end of file
kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
@@ -0,0 +1,49 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
+     public static final String FORMAT_HUDI = "hudi";
+   }
+ 
++  /**
++   * The set of compression codecs that Kernel currently recognizes and enforces. This is
++   * intentionally strict for now. In the future we may add new codecs or relax validation to allow
++   * any codec string.
++   */
++  private static final Set<String> VALID_COMPRESSION_CODECS =
++      Collections.unmodifiableSet(
++          new HashSet<>(
++              Arrays.asList("uncompressed", "none", "snappy", "gzip", "lz4", "lz4_raw", "zstd")));
++
+   private static final Collection<String> ALLOWED_UNIFORM_FORMATS =
+       Collections.unmodifiableList(
+           Arrays.asList(UniversalFormats.FORMAT_HUDI, UniversalFormats.FORMAT_ICEBERG));
+           "needs to be a boolean.",
+           true);
+ 
++  /**
++   * Compression codec writers should use for new Parquet data and checkpoint files. Changing this
++   * property does not affect existing files; a table may contain files written with different
++   * codecs.
++   *
++   * <p>Valid values (case-insensitive): uncompressed, none, snappy, gzip, lz4, lz4_raw, zstd.
++   */
++  public static final TableConfig<String> PARQUET_COMPRESSION_CODEC =
++      new TableConfig<>(
++          "delta.parquet.compression.codec",
++          "snappy",
++          v -> v.toLowerCase(Locale.ROOT),
++          VALID_COMPRESSION_CODECS::contains,
++          "needs to be one of: 'uncompressed', 'none', 'snappy', 'gzip',"
++              + " 'lz4', 'lz4_raw', 'zstd'.",
++          true /* editable */);
++
+   public static final TableConfig<String> MATERIALIZED_ROW_ID_COLUMN_NAME =
+       new TableConfig<>(
+           "delta.rowTracking.materializedRowIdColumnName",
+               addConfig(this, MATERIALIZED_ROW_ID_COLUMN_NAME);
+               addConfig(this, MATERIALIZED_ROW_COMMIT_VERSION_COLUMN_NAME);
+               addConfig(this, VARIANT_SHREDDING_ENABLED);
++              addConfig(this, PARQUET_COMPRESSION_CODEC);
+ 
+               // The below configs do not yet have their behavior correctly implemented in Kernel.
+               addConfig(this, DATA_SKIPPING_STATS_COLUMNS);
\ No newline at end of file
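
Reviewer note: the normalize-then-validate behavior that the `PARQUET_COMPRESSION_CODEC` hunk above describes (lowercase with `Locale.ROOT`, check against a fixed set, default to `snappy`) could look roughly like the standalone sketch below. It does not use Kernel's `TableConfig` class: `CodecConfigSketch` is a hypothetical name, and `IllegalArgumentException` is a stand-in for the `InvalidConfigurationValueException` that the tests in this PR expect from Kernel.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical standalone sketch; Kernel's real logic lives in TableConfig.
final class CodecConfigSketch {
  static final String KEY = "delta.parquet.compression.codec";
  static final String DEFAULT_CODEC = "snappy";

  // Same codec names as VALID_COMPRESSION_CODECS in the diff above.
  static final Set<String> VALID_CODECS =
      Collections.unmodifiableSet(
          new HashSet<>(
              Arrays.asList("uncompressed", "none", "snappy", "gzip", "lz4", "lz4_raw", "zstd")));

  // Returns the lowercase codec for a table configuration, or the default when absent.
  static String codecFrom(Map<String, String> configuration) {
    String raw = configuration.getOrDefault(KEY, DEFAULT_CODEC);
    // Locale.ROOT keeps the lowercasing independent of the JVM's default locale.
    String normalized = raw.toLowerCase(Locale.ROOT);
    if (!VALID_CODECS.contains(normalized)) {
      // Stand-in exception type; Kernel reports invalid values with its own exception.
      throw new IllegalArgumentException(
          "Invalid value '" + raw + "' for " + KEY + "; needs to be one of: " + VALID_CODECS);
    }
    return normalized;
  }

  public static void main(String[] args) {
    System.out.println(codecFrom(Collections.singletonMap(KEY, "ZSTD")));
    System.out.println(codecFrom(Collections.emptyMap()));
  }
}
```

Normalizing before validating is what lets mixed-case values such as `SNAPPY` or `LZ4_RAW` pass while the stored set stays lowercase only.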
kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
@@ -0,0 +1,13 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
+               StructType.class));
+ 
+   private static final Set<Class<? extends DataType>> V3_SUPPORTED_TYPES =
+-      Stream.concat(V2_SUPPORTED_TYPES.stream(), Stream.of(VariantType.class))
++      Stream.concat(
++              V2_SUPPORTED_TYPES.stream(),
++              Stream.of(VariantType.class, GeometryType.class, GeographyType.class))
+           .collect(Collectors.toSet());
+ 
+   protected static final IcebergCompatCheck V2_CHECK_HAS_SUPPORTED_TYPES =
\ No newline at end of file
kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
@@ -0,0 +1,10 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
+                   VARIANT_SHREDDING_PREVIEW_RW_FEATURE,
+                   VARIANT_RW_PREVIEW_FEATURE,
+                   ALLOW_COLUMN_DEFAULTS_W_FEATURE,
++                  GEOSPATIAL_RW_FEATURE,
+                   // Also allow writerV1 features for backward compatibility.
+                   //
+                   // Note: We already enforce that these features cannot be enabled
\ No newline at end of file
kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
@@ -0,0 +1,22 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
+ import io.delta.kernel.utils.CloseableIterator;
+ import io.delta.kernel.utils.FileStatus;
+ import java.io.IOException;
++import java.io.InterruptedIOException;
+ import java.io.UncheckedIOException;
+ import java.util.*;
+ import java.util.stream.Collectors;
+       throw new IllegalStateException("Can't call `next` on a closed iterator.");
+     }
+     if (Thread.currentThread().isInterrupted()) {
+-      throw new IllegalStateException("Thread was interrupted");
++      // Throw a typed InterruptedIOException (wrapped, since next() does not declare checked
++      // exceptions) so engines whose interrupt-handling recognizes standard JDK interrupt types
++      // (e.g. Spark's StreamExecution.isInterruptionException) treat this as a clean shutdown
++      // rather than a real error.
++      throw new UncheckedIOException(new InterruptedIOException("Thread was interrupted"));
+     }
+ 
+     if (!hasNext()) {
\ No newline at end of file
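
Reviewer note: a minimal sketch of the pattern in the `ActionsIterator` hunk above, assuming an engine that classifies failures by walking the cause chain for standard JDK interrupt types. `InterruptSketch` and `isInterruption` are hypothetical names for illustration. This is not Spark's actual `StreamExecution.isInterruptionException` logic, which may check more or different conditions.

```java
import java.io.InterruptedIOException;
import java.io.UncheckedIOException;

// Hypothetical sketch; not Kernel or Spark code.
final class InterruptSketch {

  // Mirrors the diff: an unchecked wrapper around a typed JDK interrupt exception,
  // usable from methods (like Iterator.next) that cannot declare checked exceptions.
  static void throwIfInterrupted() {
    if (Thread.currentThread().isInterrupted()) {
      throw new UncheckedIOException(new InterruptedIOException("Thread was interrupted"));
    }
  }

  // Assumed engine-side check: true if any throwable in the cause chain is an interrupt type.
  static boolean isInterruption(Throwable error) {
    for (Throwable t = error; t != null; t = t.getCause()) {
      if (t instanceof InterruptedException || t instanceof InterruptedIOException) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Thread.currentThread().interrupt();
    try {
      throwIfInterrupted();
    } catch (UncheckedIOException e) {
      System.out.println("treated as interruption: " + isInterruption(e));
    } finally {
      Thread.interrupted(); // clear the flag so later code is unaffected
    }
    System.out.println(
        "old exception type recognized: "
            + isInterruption(new IllegalStateException("Thread was interrupted")));
  }
}
```

Under this assumption, the previous `IllegalStateException` carries no type information the check can use, which is the motivation the diff comment gives for the change.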
kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
@@ -0,0 +1,11 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
+     }
+   }
+ 
+-  static final TableFeature GEOSPATIAL_RW_FEATURE = new GeoSpatialTableFeature();
++  public static final TableFeature GEOSPATIAL_RW_FEATURE = new GeoSpatialTableFeature();
+ 
+   private static class GeoSpatialTableFeature extends TableFeature.ReaderWriterFeature
+       implements FeatureAutoEnabledByMetadata {
\ No newline at end of file
kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
@@ -0,0 +1,73 @@
+diff --git a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
+--- a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
++++ b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
+ 
+ import scala.collection.JavaConverters._
+ 
+-import io.delta.kernel.exceptions.KernelException
++import io.delta.kernel.exceptions.{InvalidConfigurationValueException, KernelException}
+ 
+ import org.scalatest.funsuite.AnyFunSuite
+ 
+         TableConfig.IN_COMMIT_TIMESTAMP_ENABLEMENT_TIMESTAMP.getKey -> "1",
+         TableConfig.COLUMN_MAPPING_MODE.getKey -> "name",
+         TableConfig.ICEBERG_COMPAT_V2_ENABLED.getKey -> "true",
+-        TableConfig.UNIVERSAL_FORMAT_ENABLED_FORMATS.getKey -> "iceberg").asJava)
++        TableConfig.UNIVERSAL_FORMAT_ENABLED_FORMATS.getKey -> "iceberg",
++        TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> "snappy").asJava)
+   }
+ 
+   test("check TableConfig.MAX_COLUMN_ID.editable is false") {
+     val formats = TableConfig.UNIVERSAL_FORMAT_ENABLED_FORMATS.fromMetadata(config)
+     assert(formats == Set("iceberg", "hudi").asJava)
+   }
++
++  test("PARQUET_COMPRESSION_CODEC - valid values accepted including mixed case") {
++    val validValues = Seq(
++      "snappy",
++      "SNAPPY",
++      "ZSTD",
++      "gzip",
++      "GZIP",
++      "lz4",
++      "lz4_raw",
++      "LZ4_RAW",
++      "uncompressed",
++      "UNCOMPRESSED",
++      "none",
++      "NONE",
++      "zstd")
++    validValues.foreach { codec =>
++      TableConfig.validateAndNormalizeDeltaProperties(
++        Map(TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> codec).asJava)
++    }
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - invalid value throws InvalidConfigurationValueException") {
++    val ex = intercept[InvalidConfigurationValueException] {
++      TableConfig.validateAndNormalizeDeltaProperties(
++        Map(TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> "invalid").asJava)
++    }
++    assert(ex.getMessage.contains("delta.parquet.compression.codec"))
++    assert(ex.getMessage.contains("invalid"))
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - fromMetadata returns lowercase regardless of stored case") {
++    val config = Map(TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> "SNAPPY").asJava
++    val result = TableConfig.PARQUET_COMPRESSION_CODEC.fromMetadata(config)
++    assert(result === "snappy")
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - fromMetadata returns snappy when property absent") {
++    val config = Map.empty[String, String].asJava
++    val result = TableConfig.PARQUET_COMPRESSION_CODEC.fromMetadata(config)
++    assert(result === "snappy")
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - validation normalizes key case") {
++    val result = TableConfig.validateAndNormalizeDeltaProperties(
++      Map("DELTA.PARQUET.COMPRESSION.CODEC" -> "snappy").asJava)
++    assert(result.containsKey("delta.parquet.compression.codec"))
++    assert(result.get("delta.parquet.compression.codec") === "snappy")
++  }
+ }
\ No newline at end of file
kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV2MetadataValidatorAndUpdaterSuite.scala
@@ -0,0 +1,12 @@
+diff --git a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV2MetadataValidatorAndUpdaterSuite.scala b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV2MetadataValidatorAndUpdaterSuite.scala
+--- a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV2MetadataValidatorAndUpdaterSuite.scala
++++ b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV2MetadataValidatorAndUpdaterSuite.scala
+ 
+   override def supportedDataColumnTypes: Set[DataType] = ALL_TYPES
+ 
+-  override def unsupportedDataColumnTypes: Set[DataType] = Set(VariantType.VARIANT)
++  override def unsupportedDataColumnTypes: Set[DataType] =
++    Set(VariantType.VARIANT, GeometryType.ofDefault(), GeographyType.ofDefault())
+ 
+   override def unsupportedPartitionColumnTypes: Set[DataType] = NESTED_TYPES
+ 
\ No newline at end of file
kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV3MetadataValidatorAndUpdateSuite.scala
@@ -0,0 +1,12 @@
+diff --git a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV3MetadataValidatorAndUpdateSuite.scala b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV3MetadataValidatorAndUpdateSuite.scala
+--- a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV3MetadataValidatorAndUpdateSuite.scala
++++ b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV3MetadataValidatorAndUpdateSuite.scala
+ 
+   override def icebergCompatVersion: String = "V3"
+ 
+-  override def supportedDataColumnTypes: Set[DataType] = ALL_TYPES + VariantType.VARIANT
++  override def supportedDataColumnTypes: Set[DataType] =
++    ALL_TYPES + VariantType.VARIANT + GeometryType.ofDefault() + GeographyType.ofDefault()
+ 
+   override def unsupportedDataColumnTypes: Set[DataType] = Set.empty
+ 
\ No newline at end of file

... (truncated, output exceeded 60000 bytes)

Reproduce locally: git range-diff 1b800a0..c803c1a e43bf65..a50c9d2 | Disable: git config gitstack.push-range-diff false

@PorridgeSwim PorridgeSwim force-pushed the stack/SparkMetadataAdapter branch from a50c9d2 to dfe61d9 Compare April 29, 2026 19:53
@PorridgeSwim
Collaborator Author

Range-diff: master (a50c9d2 -> dfe61d9)
.github/actions/setup-unitycatalog/action.yml
@@ -0,0 +1,40 @@
+diff --git a/.github/actions/setup-unitycatalog/action.yml b/.github/actions/setup-unitycatalog/action.yml
+new file mode 100644
+--- /dev/null
++++ b/.github/actions/setup-unitycatalog/action.yml
++name: "Set up pinned Unity Catalog build"
++description: >-
++  Publishes Unity Catalog jars from the commit pinned in project/scripts/setup_unitycatalog_main.sh
++  (the UC_PIN_SHA= line) to the runner's local Ivy / Maven caches, using GitHub Actions cache so the
++  slow UC build only runs the first time a pin is seen.
++
++runs:
++  using: "composite"
++  steps:
++    - name: Restore pinned UC cache
++      id: uc-cache
++      uses: actions/cache/restore@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
++      with:
++        # ~/.ivy2/local is what sbt publishLocal writes to. ~/.m2 is for publishM2.
++        path: |
++          ~/.ivy2/local
++          ~/.m2/repository/io/unitycatalog
++        # Cache key hashes the setup script, so bumping UC_PIN_SHA (or any other script change)
++        # invalidates the cache.
++        key: uc-jars-${{ runner.os }}-${{ hashFiles('project/scripts/setup_unitycatalog_main.sh') }}
++    - name: Build Unity Catalog from pinned SHA
++      shell: bash
++      run: bash project/scripts/setup_unitycatalog_main.sh
++    - name: Save pinned UC cache
++      # Only attempt a save when the restore missed. When multiple parallel matrix jobs all see
++      # a cache miss (first CI run after a pin bump), only the first to reach this step wins the
++      # GHA cache reservation; the rest log "another job may be creating this cache" warnings.
++      # Gating on cache-hit means cached runs (the common steady state) skip the save entirely,
++      # which eliminates those warnings on every subsequent run.
++      if: steps.uc-cache.outputs.cache-hit != 'true'
++      uses: actions/cache/save@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
++      with:
++        path: |
++          ~/.ivy2/local
++          ~/.m2/repository/io/unitycatalog
++        key: uc-jars-${{ runner.os }}-${{ hashFiles('project/scripts/setup_unitycatalog_main.sh') }}
\ No newline at end of file
.github/workflows/build.yaml
@@ -0,0 +1,29 @@
+diff --git a/.github/workflows/build.yaml b/.github/workflows/build.yaml
+--- a/.github/workflows/build.yaml
++++ b/.github/workflows/build.yaml
+ name: "Delta Build"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+             ~/.cache/coursier
+           key: delta-sbt-cache-cross-spark
+ 
++      # publishM2 compiles every aggregated project, including storage, which has
++      # unitycatalog-client as a compile-scope dependency. Publish the pinned UC build locally
++      # first so Delta compiles against the UC APIs it actually targets.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
++
+       - name: Run cross-Spark build test
+         run: python project/tests/test_cross_spark_publish.py
+ 
\ No newline at end of file
.github/workflows/disabled_iceberg_test.yaml
@@ -0,0 +1,80 @@
+diff --git a/.github/workflows/disabled_iceberg_test.yaml b/.github/workflows/disabled_iceberg_test.yaml
+deleted file mode 100644
+--- a/.github/workflows/disabled_iceberg_test.yaml
++++ /dev/null
+-name: "Delta Iceberg Latest [DISABLED]"
+-# SECURITY: All Python/PySpark workflows disabled due to active supply chain attack
+-# targeting OSS package ecosystems (PyPI). C2 domains: models.litellm.cloud, checkmarx.zone
+-# Date disabled: 2026-03-25
+-# To re-enable: remove 'if: false' from all jobs and restore original triggers
+-on:
+-  workflow_dispatch: # manual-only, auto triggers removed
+-  # To re-enable, replace the above line with:
+-  # push:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
+-  # pull_request:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
+-env:
+-  # SECURITY: Temporal lockdown — refuse any package version published after this date.
+-  # This date is a pre-attack baseline (before the active PyPI supply chain attack).
+-  UV_EXCLUDE_NEWER: "2026-03-10T00:00:00Z"
+-jobs:
+-  test:
+-    if: false # SECURITY: disabled - supply chain attack mitigation
+-    name: "DIL: Scala ${{ matrix.scala }}"
+-    runs-on: ubuntu-24.04
+-    strategy:
+-      matrix:
+-        # These Scala versions must match those in the build.sbt
+-        scala: [2.13.16]
+-    env:
+-      SCALA_VERSION: ${{ matrix.scala }}
+-    steps:
+-      - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
+-      - name: install java
+-        uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
+-        with:
+-          distribution: "zulu"
+-          java-version: "17"
+-      - name: Cache Scala, SBT
+-        uses: actions/cache@6f8efc29b200d32929f49075959781ed54ec270c # v3.5.0
+-        with:
+-          path: |
+-            ~/.sbt
+-            ~/.ivy2
+-            ~/.cache/coursier
+-          # Change the key if dependencies are changed. For each key, GitHub Actions will cache the
+-          # the above directories when we use the key for the first time. After that, each run will
+-          # just use the cache. The cache is immutable so we need to use a new key when trying to
+-          # cache new stuff.
+-          key: delta-sbt-cache-spark4.0-scala${{ matrix.scala }}
+-      - name: Set up uv
+-        run: bash project/scripts/install-uv.sh
+-      - name: Install Job dependencies
+-        run: |
+-          sudo apt-get update
+-          sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl git
+-          sudo apt install libedit-dev
+-          # buf v1.28.1 (2023-11-15) — SHA from official release asset:
+-          # https://github.com/bufbuild/buf/releases/download/v1.28.1/sha256.txt
+-          BUF_VERSION="v1.28.1"
+-          BUF_SHA256="870cf492d381a967d36636fdee9da44b524ea62aad163659b8dbf16a7da56987"
+-          curl -fsSL -o buf-Linux-x86_64.tar.gz \
+-            "https://github.com/bufbuild/buf/releases/download/${BUF_VERSION}/buf-Linux-x86_64.tar.gz"
+-          echo "${BUF_SHA256}  buf-Linux-x86_64.tar.gz" | sha256sum -c -
+-          mkdir -p ~/buf
+-          tar -xzf buf-Linux-x86_64.tar.gz -C ~/buf --strip-components 1
+-          rm buf-Linux-x86_64.tar.gz
+-          uv python install 3.8
+-          uv venv .venv --python 3.8
+-      - name: Run Scala/Java and Python tests
+-        # when changing TEST_PARALLELISM_COUNT make sure to also change it in spark_master_test.yaml
+-        run: |
+-          source .venv/bin/activate
+-          TEST_PARALLELISM_COUNT=4 python run-tests.py --group iceberg --spark-version 4.0
\ No newline at end of file
.github/workflows/spark_test_uc_master.yaml
@@ -0,0 +1,62 @@
+diff --git a/.github/workflows/spark_test_uc_master.yaml b/.github/workflows/disabled_spark_test_uc_master.yaml
+similarity index 61%
+rename from .github/workflows/spark_test_uc_master.yaml
+rename to .github/workflows/disabled_spark_test_uc_master.yaml
+--- a/.github/workflows/spark_test_uc_master.yaml
++++ b/.github/workflows/disabled_spark_test_uc_master.yaml
+ ##
+ ## To make this blocking, add the job name to the required status checks in
+ ## the branch protection rules for `master`.
++##
++## DISABLED while Delta master builds against a pinned UC master SHA — the main Delta Spark
++## workflow already exercises UC master at that pin, so a parallel floating-main workflow would
++## be redundant. To re-enable (once Delta goes back to a released UC version): drop the
++## `[DISABLED]` suffix from `name`, replace `workflow_dispatch:` with the original push /
++## pull_request triggers below, remove `if: false` from the job, and rename the file back to
++## `spark_test_uc_master.yaml`.
+ 
+-name: "Delta Spark (UC Master)"
++name: "Delta Spark (UC Master) [DISABLED]"
+ on:
+-  push:
+-    paths-ignore:
+-      - '**.md'
+-      - '**.txt'
+-  pull_request:
+-    paths-ignore:
+-      - '**.md'
+-      - '**.txt'
++  workflow_dispatch: # manual-only while disabled
++  # Original triggers, restore when re-enabling:
++  # push:
++  #   branches: [master, branch-*]
++  #   paths-ignore:
++  #     - '**.md'
++  #     - '**.txt'
++  # pull_request:
++  #   branches: [master, branch-*]
++  #   paths-ignore:
++  #     - '**.md'
++  #     - '**.txt'
+ 
+ jobs:
+   test-uc-master:
+     name: "[Non Blocking] UC Integration Tests (UC Main)"
++    # Guard against accidental runs while disabled. Remove when re-enabling.
++    if: false
+     runs-on: ubuntu-24.04
+     steps:
+       - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3
+           key: delta-sbt-cache-uc-master
+       - name: Build Unity Catalog from source
+         id: uc-build
++        # UC_REF=main builds the floating-main canary instead of the pinned SHA, which is the
++        # point of this workflow -- early warning of upcoming UC incompatibilities.
+         run: |
+-          bash project/scripts/setup_unitycatalog_main.sh
+-          UC_VERSION=$(cat /tmp/unitycatalog/.uc-version)
++          UC_REF=main bash project/scripts/setup_unitycatalog_main.sh
++          UC_VERSION=$(UC_REF=main bash project/scripts/setup_unitycatalog_main.sh --print-version)
+           echo "uc_version=$UC_VERSION" >> $GITHUB_OUTPUT
+           echo "UC version: $UC_VERSION"
+       - name: Run UC integration tests
\ No newline at end of file
.github/workflows/flink_test.yaml
@@ -0,0 +1,37 @@
+diff --git a/.github/workflows/flink_test.yaml b/.github/workflows/flink_test.yaml
+--- a/.github/workflows/flink_test.yaml
++++ b/.github/workflows/flink_test.yaml
+ 
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'flink/**'
+       - 'kernel/**'
+       - '!**/*.md'
+       - '!**/*.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'flink/**'
+       - 'kernel/**'
+   cancel-in-progress: true
+ 
+ env:
+-  # Point SBT to our cache directories for consistency
++  # Point SBT to our cache directories for consistency.
+   SBT_OPTS: "-Dsbt.coursier.home-dir=/home/runner/.cache/coursier -Dsbt.ivy.home=/home/runner/.ivy2"
+ 
+ jobs:
+           else
+             echo "❌ Cache MISS - will download dependencies"
+           fi
++      # flink has unitycatalog-client as a compile-scope dep and flink tests exercise UC.
++      # Publish the pinned UC build locally before sbt runs.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run unit tests
+         run: |
+           build/sbt flinkGroup/test
\ No newline at end of file
.github/workflows/iceberg_test.yaml
@@ -0,0 +1,58 @@
+diff --git a/.github/workflows/iceberg_test.yaml b/.github/workflows/iceberg_test.yaml
+new file mode 100644
+--- /dev/null
++++ b/.github/workflows/iceberg_test.yaml
++name: "Delta Iceberg Latest"
++on:
++  push:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
++  pull_request:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
++jobs:
++  test:
++    name: "DIL: Scala ${{ matrix.scala }}"
++    runs-on: ubuntu-24.04
++    strategy:
++      matrix:
++        # These Scala versions must match those in the build.sbt
++        scala: [2.13.16]
++    env:
++      SCALA_VERSION: ${{ matrix.scala }}
++    steps:
++      - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
++      - name: install java
++        uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
++        with:
++          distribution: "zulu"
++          java-version: "17"
++      - name: Cache Scala, SBT
++        uses: actions/cache@6f8efc29b200d32929f49075959781ed54ec270c # v3.5.0
++        with:
++          path: |
++            ~/.sbt
++            ~/.ivy2
++            ~/.cache/coursier
++          # Change the key if dependencies are changed. For each key, GitHub Actions will cache the
++          # the above directories when we use the key for the first time. After that, each run will
++          # just use the cache. The cache is immutable so we need to use a new key when trying to
++          # cache new stuff.
++          key: delta-sbt-cache-spark4.0-scala${{ matrix.scala }}
++      - name: Set up uv
++        run: bash project/scripts/install-uv.sh
++      - name: Install Python via uv
++        # No UV_EXCLUDE_NEWER needed: this workflow installs zero pip packages.
++        # Python is only used to run the stdlib-only run-tests.py driver.
++        run: |
++          uv python install 3.8
++          uv venv .venv --python 3.8
++      - name: Run Scala/Java and Python tests
++        # when changing TEST_PARALLELISM_COUNT make sure to also change it in spark_master_test.yaml
++        run: |
++          source .venv/bin/activate
++          TEST_PARALLELISM_COUNT=4 python run-tests.py --group iceberg --spark-version 4.0
\ No newline at end of file
.github/workflows/kernel_docs.yaml
@@ -0,0 +1,11 @@
+diff --git a/.github/workflows/kernel_docs.yaml b/.github/workflows/kernel_docs.yaml
+--- a/.github/workflows/kernel_docs.yaml
++++ b/.github/workflows/kernel_docs.yaml
+           java-version: "11"
+       - name: Generate docs
+         run: |
+-          build/sbt kernelGroup/unidoc
++          build/sbt -DuseDefaultUnityCatalogReleaseVersion=true kernelGroup/unidoc
+           mkdir -p kernel/docs/snapshot/kernel-api/java
+           mkdir -p kernel/docs/snapshot/kernel-defaults/java
+           cp -r kernel/kernel-api/target/javaunidoc/. kernel/docs/snapshot/kernel-api/java/
\ No newline at end of file
.github/workflows/kernel_test.yaml
@@ -0,0 +1,47 @@
+diff --git a/.github/workflows/kernel_test.yaml b/.github/workflows/kernel_test.yaml
+--- a/.github/workflows/kernel_test.yaml
++++ b/.github/workflows/kernel_test.yaml
+ 
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+           else
+             echo "❌ Cache MISS - will download dependencies"
+           fi
++      # run-tests.py invokes sbt with `++ 2.13.16`, which triggers cross-version dependency resolution
++      # across every project (including kernelUnityCatalog). Publish the pinned UC build locally first
++      # so that resolution doesn't miss.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run unit tests
+         run: |
+           python run-tests.py --group kernel --coverage --shard ${{ matrix.shard }}
+     runs-on: ubuntu-24.04
+     steps:
+       - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
+-      # Run integration tests with JDK 11, as they have no Spark dependency
+-      - name: install java
++      # The integration test itself runs on JDK 11 (no Spark dependency), but UC's sbt build needs
++      # JDK 17, so we install 17 first, publish UC, then switch the active JDK to 11 for the actual
++      # test run.
++      - name: install java 17 for UC build
++        uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
++        with:
++          distribution: "zulu"
++          java-version: "17"
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
++      - name: install java 11 for integration test
+         uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
+         with:
+           distribution: "zulu"
\ No newline at end of file
.github/workflows/kernel_unitycatalog_test.yaml
@@ -0,0 +1,29 @@
+diff --git a/.github/workflows/kernel_unitycatalog_test.yaml b/.github/workflows/kernel_unitycatalog_test.yaml
+--- a/.github/workflows/kernel_unitycatalog_test.yaml
++++ b/.github/workflows/kernel_unitycatalog_test.yaml
+ name: "Kernel Unity Catalog"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'build.sbt'
+       - 'version.sbt'
+       - 'storage/**/*.java'
+       - '.github/workflows/kernel_unitycatalog_test.yaml'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'build.sbt'
+       - 'version.sbt'
+         with:
+           distribution: "zulu"
+           java-version: "17"
++      # kernelUnityCatalog depends on unreleased UC APIs; publish the pinned UC build locally before
++      # sbt tries to resolve the dependency.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run Unity tests with coverage
+         run: |
+           ./build/sbt "++ ${{ env.SCALA_VERSION }}" clean coverage kernelUnityCatalog/test coverageAggregate coverageOff -v
\ No newline at end of file
.github/workflows/spark_examples_test.yaml
@@ -0,0 +1,27 @@
+diff --git a/.github/workflows/spark_examples_test.yaml b/.github/workflows/spark_examples_test.yaml
+--- a/.github/workflows/spark_examples_test.yaml
++++ b/.github/workflows/spark_examples_test.yaml
+ name: "Delta Spark Publishing and Examples"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+           sudo apt-get update
+           sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl git
+           sudo apt install libedit-dev
++      # `publishM2` and `++ <scala>` both resolve every project's deps, which includes
++      # sparkUnityCatalog; publish the pinned UC build locally before sbt runs.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run Delta Spark Local Publishing and Examples Compilation
+         # examples/scala/build.sbt will compile against the local Delta release version (e.g. 3.2.0-SNAPSHOT).
+         # Thus, we need to publishM2 first so those jars are locally accessible.
\ No newline at end of file
.github/workflows/disabled_spark_python_test.yaml
@@ -0,0 +1,76 @@
+diff --git a/.github/workflows/disabled_spark_python_test.yaml b/.github/workflows/spark_python_test.yaml
+similarity index 71%
+rename from .github/workflows/disabled_spark_python_test.yaml
+rename to .github/workflows/spark_python_test.yaml
+--- a/.github/workflows/disabled_spark_python_test.yaml
++++ b/.github/workflows/spark_python_test.yaml
+-name: "Delta Spark Python [DISABLED]"
+-# SECURITY: All Python/PySpark workflows disabled due to active supply chain attack
+-# targeting OSS package ecosystems (PyPI). C2 domains: models.litellm.cloud, checkmarx.zone
+-# Date disabled: 2026-03-25
+-# To re-enable: remove 'if: false' from all jobs and restore original triggers
++name: "Delta Spark Python"
+ on:
+-  workflow_dispatch: # manual-only, auto triggers removed
+-  # To re-enable, replace the above line with:
+-  # push:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
+-  # pull_request:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
++  push:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
++  pull_request:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
+ env:
+   # SECURITY: Temporal lockdown — refuse any package version published after this date.
+   # This date is a pre-attack baseline (before the active PyPI supply chain attack).
+   # Generate Spark versions matrix from CrossSparkVersions.scala
+   # This workflow tests against released versions only (no snapshots)
+   generate-matrix:
+-    if: false # SECURITY: disabled - supply chain attack mitigation
+     name: "Generate Released Spark Versions Matrix"
+     runs-on: ubuntu-24.04
+     outputs:
+           echo "Generated released Spark versions: $SPARK_VERSIONS"
+ 
+   test:
+-    if: false # SECURITY: disabled - supply chain attack mitigation
+     name: "DSP (${{ matrix.spark_version }})"
+     runs-on: ubuntu-24.04
+     needs: generate-matrix
+           key: delta-sbt-cache-spark${{ matrix.spark_version }}-scala${{ matrix.scala }}
+       - name: Set up uv
+         run: bash project/scripts/install-uv.sh
+-      - name: Install Job dependencies
++      - name: Set up buf
++        run: bash project/scripts/install-buf.sh
++      - name: Install Python and dependencies
+         run: |
+-          sudo apt-get update
+-          sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl git
+-          sudo apt install libedit-dev
+-          # buf v1.28.1 (2023-11-15) — SHA from official release asset:
+-          # https://github.com/bufbuild/buf/releases/download/v1.28.1/sha256.txt
+-          BUF_VERSION="v1.28.1"
+-          BUF_SHA256="870cf492d381a967d36636fdee9da44b524ea62aad163659b8dbf16a7da56987"
+-          curl -fsSL -o buf-Linux-x86_64.tar.gz \
+-            "https://github.com/bufbuild/buf/releases/download/${BUF_VERSION}/buf-Linux-x86_64.tar.gz"
+-          echo "${BUF_SHA256}  buf-Linux-x86_64.tar.gz" | sha256sum -c -
+-          mkdir -p ~/buf
+-          tar -xzf buf-Linux-x86_64.tar.gz -C ~/buf --strip-components 1
+-          rm buf-Linux-x86_64.tar.gz
+           uv python install 3.10
+           uv venv .venv --python 3.10
+           # Install hash-verified locked dependencies (see .github/ci-requirements/spark-python/)
\ No newline at end of file
.github/workflows/spark_test.yaml
@@ -0,0 +1,27 @@
+diff --git a/.github/workflows/spark_test.yaml b/.github/workflows/spark_test.yaml
+--- a/.github/workflows/spark_test.yaml
++++ b/.github/workflows/spark_test.yaml
+ name: "Delta Spark"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+             ~/.ivy2
+             ~/.cache/coursier
+           key: delta-sbt-cache-spark${{ matrix.spark_version }}-scala${{ matrix.scala }}
++      # Delta's sparkUnityCatalog module (part of sparkGroup) depends on APIs that are only in
++      # unreleased UC. Publish the pinned UC build locally before sbt tries to resolve it.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Scala structured logging style check
+         run: |
+           if [ -f ./dev/spark_structured_logging_style.py ]; then
\ No newline at end of file
.github/workflows/unidoc.yaml
@@ -0,0 +1,19 @@
+diff --git a/.github/workflows/unidoc.yaml b/.github/workflows/unidoc.yaml
+--- a/.github/workflows/unidoc.yaml
++++ b/.github/workflows/unidoc.yaml
+   name: "Unidoc"
+   on:
+     push:
+-      branches: [master]
++      branches: [master, branch-*]
+     pull_request:
+-      branches: [master]
++      branches: [master, branch-*]
+   jobs:
+     build:
+       name: "U: Scala ${{ matrix.scala }}"
+             java-version: "17"
+         - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
+         - name: generate unidoc
+-          run: build/sbt "++ ${{ matrix.scala }}" unidoc
++          run: build/sbt -DuseDefaultUnityCatalogReleaseVersion=true "++ ${{ matrix.scala }}" unidoc
\ No newline at end of file
build.sbt
@@ -0,0 +1,154 @@
+diff --git a/build.sbt b/build.sbt
+--- a/build.sbt
++++ b/build.sbt
+   ).configureUnidoc()
+ 
+ 
+-val unityCatalogVersion = sys.props.getOrElse("unityCatalogVersion", "0.4.1")
++// Unity Catalog version. Three modes, in priority order:
++//
++//  1. `-DuseDefaultUnityCatalogReleaseVersion=true`: use `defaultUnityCatalogReleaseVersion`
++//     below -- the last released UC version on Maven Central. For workflows that don't actually
++//     need DRC APIs (e.g. unidoc, lint) and want to skip the pinned UC build. Shared across
++//     workflows by reading this single constant, so bumping is a one-line change here.
++//
++//  2. Release mode: set `unityCatalogReleaseVersion = Some("0.5.0")` (or whatever released
++//     version the release branch ships against). sbt resolves the coordinate from Maven Central
++//     like any other dependency.
++//
++//  3. Pinned mode (default): leave `unityCatalogReleaseVersion = None`. The version string
++//     comes from `setup_unitycatalog_main.sh --print-version`, which encodes both the pinned
++//     UC main SHA and UC's declared base version; the script is the single source of truth.
++//     The same script (without the flag) publishes the matching jars to ~/.ivy2/local when
++//     `ensurePinnedUnityCatalog` decides they're missing.
++//
++// Override with -DunityCatalogVersion=<anything> for ad-hoc experiments.
++val unityCatalogReleaseVersion: Option[String] = None
++val defaultUnityCatalogReleaseVersion = "0.4.1"
++val useDefaultUnityCatalogReleaseVersion: Boolean =
++  sys.props.getOrElse("useDefaultUnityCatalogReleaseVersion", "false").toBoolean
++val unityCatalogSetupScript = "project/scripts/setup_unitycatalog_main.sh"
++
++// Lazy so release-mode / useDefaultUnityCatalogReleaseVersion builds never shell out.
++lazy val pinnedUnityCatalogVersion: String = {
++  import scala.sys.process._
++  Process(Seq("bash", unityCatalogSetupScript, "--print-version")).!!.trim
++}
++val unityCatalogVersion: String = sys.props.getOrElse(
++  "unityCatalogVersion",
++  if (useDefaultUnityCatalogReleaseVersion) defaultUnityCatalogReleaseVersion
++  else unityCatalogReleaseVersion.getOrElse(pinnedUnityCatalogVersion))
++
+ val sparkUnityCatalogJacksonVersion = "2.15.4" // We are using Spark 4.0's Jackson version 2.15.x, to override Unity Catalog 0.3.0's version 2.18.x
+ 
++// Publishes the pinned UC jars to ~/.ivy2/local if they're not already cached there. Hooked
++// into `update` on the UC-dependent projects below, so plain `sbt testOnly ...` on a clean
++// checkout just works. No-op in release mode. Opt out with
++// `-Ddelta.autoBuildPinnedUnityCatalog=false`, in which case sbt errors with a pointer to the
++// setup script.
++val ensurePinnedUnityCatalog = taskKey[Unit](
++  "Publish the pinned UC jars locally if the Ivy coordinate isn't already cached.")
++
++// Extracted so the task body can read as a short guard rather than three nested ifs.
++def publishPinnedUnityCatalog(log: sbt.util.Logger, canary: java.io.File): Unit = {
++  val shouldAutoBuild =
++    sys.props.getOrElse("delta.autoBuildPinnedUnityCatalog", "true").toBoolean
++  if (!shouldAutoBuild) {
++    sys.error(
++      s"""|Pinned Unity Catalog jars are not published locally for coordinate
++          |$unityCatalogVersion.
++          |Auto-build is disabled (-Ddelta.autoBuildPinnedUnityCatalog=false).
++          |Run: bash $unityCatalogSetupScript""".stripMargin)
++  }
++  log.info(s"[UC] Pinned UC jars not found for coordinate $unityCatalogVersion.")
++  log.info(
++    s"[UC] Running $unityCatalogSetupScript - takes ~3-5 minutes on a cold cache, <1s on a warm one.")
++  import scala.sys.process._
++  val procLogger = ProcessLogger(
++    line => log.info(s"[UC setup] $line"),
++    line => log.warn(s"[UC setup] $line"))
++  val exit = Process(Seq("bash", unityCatalogSetupScript)).!(procLogger)
++  if (exit != 0) {
++    sys.error(
++      s"[UC] $unityCatalogSetupScript exited with code $exit. Run it manually to see full output.")
++  }
++  if (!canary.exists) {
++    sys.error(
++      s"[UC] $unityCatalogSetupScript succeeded but ${canary.getAbsolutePath} is still missing - " +
++        "the publish target layout may have changed.")
++  }
++}
++
++Global / ensurePinnedUnityCatalog := {
++  // Resolve the .value dependencies eagerly - sbt's task macro warns when
++  // `.value` appears inside conditional branches.
++  val log = streams.value.log
++  // No-op whenever the effective version resolves to something Maven Central can serve:
++  // release mode, -DuseDefaultUnityCatalogReleaseVersion=true, or -DunityCatalogVersion=<released>.
++  val usingReleasedVersion = useDefaultUnityCatalogReleaseVersion ||
++    sys.props.contains("unityCatalogVersion")
++  if (unityCatalogReleaseVersion.isEmpty && !usingReleasedVersion) {
++    val home = file(sys.props("user.home"))
++    // Check both layouts: a restored sbt cache can pre-populate ivy alone, leaving m2 empty -
++    // checking only ivy would silently skip the slow publish and break mvn-based consumers.
++    val ivy2Canary = home / ".ivy2" / "local" / "io.unitycatalog" /
++      "unitycatalog-client" / unityCatalogVersion / "ivys" / "ivy.xml"
++    val m2Canary = home / ".m2" / "repository" / "io" / "unitycatalog" /
++      "unitycatalog-client" / unityCatalogVersion /
++      s"unitycatalog-client-$unityCatalogVersion.pom"
++    if (!ivy2Canary.exists || !m2Canary.exists) {
++      publishPinnedUnityCatalog(log, ivy2Canary)
++    }
++  }
++}
++
+ lazy val sparkUnityCatalog = (project in file("spark/unitycatalog"))
+   .dependsOn(spark % "compile->compile;test->test;provided->provided")
+   .disablePlugins(ScalafmtPlugin)
+     javafmtCheckSettings(),
+     CrossSparkVersions.sparkDependentSettings(sparkVersion),
+ 
++    // Publish the pinned UC jars before sbt tries to resolve them.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     // This is a test-only module - no production sources
+     Compile / sources := Seq.empty,
+ 
+     exportJars := false,
+     javafmtCheckSettings,
+     scalafmtCheckSettings,
+-    
++
+     libraryDependencies ++= Seq(
+       "org.openjdk.jmh" % "jmh-core" % "1.37" % "test",
+       "org.openjdk.jmh" % "jmh-generator-annprocess" % "1.37" % "test",
+     scalaStyleSettings,
+     scalafmtCheckSettings,
+ 
++    // Publish the pinned UC jars before sbt tries to resolve them.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     // Put the shaded kernel-api JAR on the classpath (compile & test)
+     Compile / unmanagedJars += (kernelApi / Compile / packageBin).value,
+     Test / unmanagedJars += (kernelApi / Compile / packageBin).value,
+       "com.fasterxml.jackson.datatype" % "jackson-datatype-jsr310" % "2.15.4" % "test",
+     ),
+ 
++    // Publish the pinned UC jars before sbt tries to resolve them. storage is the transitive
++    // UC-client entry point for most of the build graph (sparkV1, sparkV2, kernelDefaults, etc.
++    // all .dependsOn(storage)), so hooking here covers nearly every compile path.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     // Unidoc settings
+     unidocSourceFilePatterns += SourceFilePattern("/LogStore.java", "/CloseableIterator.java"),
+     TestParallelization.settings
+       "--add-opens=java.base/java.util=ALL-UNNAMED" // for Flink with Java 17.
+     ),
+     crossPaths := false,
++
++    // Publish the pinned UC jars before sbt tries to resolve them.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     libraryDependencies ++= Seq(
+       "org.apache.flink" % "flink-core" % flinkVersion % "provided",
+       "org.apache.flink" % "flink-table-common" % flinkVersion % "provided",
\ No newline at end of file
build/sbt
@@ -0,0 +1,16 @@
+diff --git a/build/sbt b/build/sbt
+--- a/build/sbt
++++ b/build/sbt
+ )
+ }
+ 
+-# If MAVEN_PROXY_URL is set, use it as the sole repository for all dependencies.
++# If MAVEN_PROXY_URL is set, use it (and local) as the sole repository for all dependencies.
+ if [[ -n "$MAVEN_PROXY_URL" ]]; then
+   SBT_REPOSITORIES_CONFIG=$(mktemp)
+   cat > "$SBT_REPOSITORIES_CONFIG" <<EOF
+ [repositories]
++  local
+   maven-proxy: $MAVEN_PROXY_URL
+   maven-proxy-ivy: $MAVEN_PROXY_URL, [organization]/[module]/(scala_[scalaVersion]/)(sbt_[sbtVersion]/)[revision]/[type]s/[artifact](-[classifier]).[ext]
+ EOF
\ No newline at end of file
iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
@@ -0,0 +1,15 @@
+diff --git a/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala b/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
+--- a/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
++++ b/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
+      * AnalysisException
+      */
+      try {
+-       SchemaMergingUtils.checkColumnNameDuplication(tableSchema, "during convert to Delta")
++       SchemaMergingUtils.checkColumnNameDuplication(tableSchema, "CONVERT_TO_DELTA")
+      } catch {
+-       case e: AnalysisException if e.getMessage.contains("during convert to Delta") =>
++       case e: AnalysisException
++           if e.getErrorClass == "DELTA_DUPLICATE_COLUMNS_FOUND.CONVERT_TO_DELTA" =>
+          throw new UnsupportedOperationException(
+            IcebergTable.caseSensitiveConversionExceptionMsg(e.getMessage))
+      }
\ No newline at end of file
iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
@@ -0,0 +1,11 @@
+diff --git a/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala b/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
+--- a/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
++++ b/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
+    * @param catalogTable the catalogTable this conversion targets
+    * @return (Iceberg metadata path, last converted Delta version)
+    */
+-  def convertUncommitedTxn(
++  override def convertUncommitedTxn(
+       txnInfo: CurrentTransactionInfo,
+       deltaAttemptVersion: Long,
+       deltaLog: DeltaLog,
\ No newline at end of file
iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
@@ -0,0 +1,149 @@
+diff --git a/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala b/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
+--- a/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
++++ b/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
+ 
+ package org.apache.spark.sql.delta.uniform
+ 
+-import org.apache.spark.sql.delta.sources.DeltaSQLConf
++import java.util.{Collections, Optional, UUID}
++
++import scala.collection.JavaConverters._
++
++import io.delta.storage.commit.{CommitCoordinatorClient => JCommitCoordinatorClient}
++import io.delta.storage.commit.{TableIdentifier => UCTableIdentifier}
++import io.delta.storage.commit.actions.{AbstractMetadata, AbstractProtocol}
++import io.delta.storage.commit.uccommitcoordinator.UCCommitCoordinatorClient
++import org.apache.hadoop.fs.Path
+ 
+ import org.apache.spark.{SparkConf, SparkSessionSwitch}
+ import org.apache.spark.sql.{Row, SparkSession}
++import org.apache.spark.sql.catalyst.TableIdentifier
++import org.apache.spark.sql.delta.DeltaConfigs.{
++  COORDINATED_COMMITS_COORDINATOR_CONF,
++  COORDINATED_COMMITS_COORDINATOR_NAME
++}
++import org.apache.spark.sql.delta.DeltaLog
++import org.apache.spark.sql.delta.NonSparkReadIceberg
++import org.apache.spark.sql.delta.coordinatedcommits.{
++  CatalogOwnedCommitCoordinatorBuilder,
++  CommitCoordinatorProvider,
++  InMemoryUCClient,
++  InMemoryUCCommitCoordinator,
++  UCCommitCoordinatorBuilder
++}
++import org.apache.spark.sql.delta.sources.DeltaSQLConf
+ import org.apache.spark.sql.delta.test.DeltaSQLCommandTest
+ import org.apache.spark.sql.delta.uniform.hms.HMSTest
++import org.apache.spark.sql.delta.util.JsonUtils
+ 
+ /**
+  * This trait allows the tests to write with Delta
+ }
+ 
+ /**
+- * No test should go here. Please add tests in [[UniFormE2EIcebergSuiteBase]]
++ * Trait that wires up an in-memory UC commit coordinator for UniForm E2E testing.
++ *
++ * Mix this into a concrete suite that already extends [[UniFormE2EIcebergSuiteBase]] (or any
++ * other [[UniFormE2ETest]] subclass) to redirect every [[readAndVerify]] call through the
++ * native Iceberg reader backed by the in-memory UC coordinator
++ *
++ * Concrete suites must call [[requiredTableProperties]] inside their
++ * [[UniFormE2EIcebergSuiteBase.extraTableProperties]] override to inject the coordinator
++ * name and conf into every `CREATE TABLE` statement.
+  */
++trait WriteDeltaUCCCReadIceberg extends UniFormE2ETest
++  with DeltaSQLCommandTest
++  with NonSparkReadIceberg {
++
++  /**
++   * A [[UCCommitCoordinatorClient]] subclass that overrides [[registerTable]] to auto-assign
++   * a UC table ID, simulating what the UC catalog does during CREATE TABLE.
++   */
++  private class TestUCBackedCommitCoordinator(ucClient: InMemoryUCClient)
++    extends UCCommitCoordinatorClient(Collections.emptyMap(), ucClient) {
++
++    @volatile var lastRegisteredTableId: String = _
++
++    /**
++     * Delta blocks setting `COORDINATED_COMMITS_TABLE_CONF` in TBLPROPERTIES, so this trait
++     * simulates what the real UC catalog does: a [[CatalogOwnedCommitCoordinatorBuilder]] returns
++     * a single [[TestUCBackedCommitCoordinator]] instance whose [[registerTable]] auto-assigns a
++     * UUID.  Returning the same instance from every [[build]]/[[buildForCatalog]] call ensures
++     * that [[UCCommitCoordinatorClient.semanticEquals]] (which uses reference equality on `conf`)
++     * returns true and Delta does not reject intra-test metadata updates.
++     */
++    override def registerTable(
++        logPath: Path,
++        tableIdentifier: Optional[UCTableIdentifier],
++        currentVersion: Long,
++        currentMetadata: AbstractMetadata,
++        currentProtocol: AbstractProtocol): java.util.Map[String, String] = {
++      val tableId = UUID.randomUUID().toString
++      lastRegisteredTableId = tableId
++      Map(UCCommitCoordinatorClient.UC_TABLE_ID_KEY -> tableId).asJava
++    }
++  }
++
++  protected var ucCommitCoordinator: InMemoryUCCommitCoordinator = _
++  private var testCoordinator: TestUCBackedCommitCoordinator = _
++
++  abstract override def beforeEach(): Unit = {
++    super.beforeEach()
++    DeltaLog.clearCache()
++    CommitCoordinatorProvider.clearAllBuilders()
++    ucCommitCoordinator = new InMemoryUCCommitCoordinator()
++    val ucClient = new InMemoryUCClient("test-metastore", ucCommitCoordinator)
++    testCoordinator = new TestUCBackedCommitCoordinator(ucClient)
++    CommitCoordinatorProvider.registerBuilder(new CatalogOwnedCommitCoordinatorBuilder {
++      override def getName: String = UCCommitCoordinatorBuilder.getName
++      override def build(
++          spark: SparkSession, conf: Map[String, String]): JCommitCoordinatorClient =
++        testCoordinator
++      override def buildForCatalog(
++          spark: SparkSession, catalogName: String): JCommitCoordinatorClient =
++        testCoordinator
++    })
++  }
++
++  abstract override def afterEach(): Unit = {
++    CommitCoordinatorProvider.clearAllBuilders()
++    DeltaLog.clearCache()
++    super.afterEach()
++  }
++
++  /**
++   * Returns the TBLPROPERTIES SQL fragment required to enable the UC commit coordinator.
++   * Concrete suites should append this to their [[extraTableProperties]] override.
++   */
++  def requiredTableProperties: String =
++    s", '${COORDINATED_COMMITS_COORDINATOR_NAME.key}' = '${UCCommitCoordinatorBuilder.getName}'" +
++      s", '${COORDINATED_COMMITS_COORDINATOR_CONF.key}' = " +
++      s"'${JsonUtils.toJson(Map.empty[String, String])}'"
++
++  override protected def readAndVerify(
++      table: String, fields: String, orderBy: String, expect: Seq[Row]): Unit = {
++    val tableId = testCoordinator.lastRegisteredTableId
++    assert(tableId != null,
++      s"No table UUID assigned for '$table' - table was not created with CC properties")
++    val schema = DeltaLog.forTable(spark, TableIdentifier(table)).update().schema
++    val uniformMetadata = ucCommitCoordinator.getUniformMetadata(tableId)
++    assert(uniformMetadata.isDefined,
++      s"No UniForm metadata found for table '$table' (ID $tableId)")
++    assert(uniformMetadata.get.getIcebergMetadata.isPresent,
++      s"No Iceberg metadata found for table '$table' (ID $tableId)")
++    val icebergMetadataPath = uniformMetadata.get.getIcebergMetadata.get.getMetadataLocation
++    verifyReadByPath(icebergMetadataPath, schema, fields, orderBy, expect)
++  }
++}
++
++/**
++ * Concrete E2E suite that runs all [[UniFormE2EIcebergSuiteBase]] tests with tables backed
++ * by an in-memory UC commit coordinator, reading results via the native Iceberg reader.
++ */
++class UniFormE2EIcebergUCSuite extends UniFormE2EIcebergSuiteBase
++    with WriteDeltaUCCCReadIceberg {
++  // No test should go here. Please add tests in [[UniFormE2EIcebergSuiteBase]]
++  override def extraTableProperties(compatVersion: Int): String =
++    super.extraTableProperties(compatVersion) + requiredTableProperties
++}
\ No newline at end of file
kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
@@ -0,0 +1,49 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
+     public static final String FORMAT_HUDI = "hudi";
+   }
+ 
++  /**
++   * The set of compression codecs that Kernel currently recognizes and enforces. This is
++   * intentionally strict for now. In the future we may add new codecs or relax validation to allow
++   * any codec string.
++   */
++  private static final Set<String> VALID_COMPRESSION_CODECS =
++      Collections.unmodifiableSet(
++          new HashSet<>(
++              Arrays.asList("uncompressed", "none", "snappy", "gzip", "lz4", "lz4_raw", "zstd")));
++
+   private static final Collection<String> ALLOWED_UNIFORM_FORMATS =
+       Collections.unmodifiableList(
+           Arrays.asList(UniversalFormats.FORMAT_HUDI, UniversalFormats.FORMAT_ICEBERG));
+           "needs to be a boolean.",
+           true);
+ 
++  /**
++   * Compression codec writers should use for new Parquet data and checkpoint files. Changing this
++   * property does not affect existing files; a table may contain files written with different
++   * codecs.
++   *
++   * <p>Valid values (case-insensitive): uncompressed, none, snappy, gzip, lz4, lz4_raw, zstd.
++   */
++  public static final TableConfig<String> PARQUET_COMPRESSION_CODEC =
++      new TableConfig<>(
++          "delta.parquet.compression.codec",
++          "snappy",
++          v -> v.toLowerCase(Locale.ROOT),
++          VALID_COMPRESSION_CODECS::contains,
++          "needs to be one of: 'uncompressed', 'none', 'snappy', 'gzip',"
++              + " 'lz4', 'lz4_raw', 'zstd'.",
++          true /* editable */);
++
+   public static final TableConfig<String> MATERIALIZED_ROW_ID_COLUMN_NAME =
+       new TableConfig<>(
+           "delta.rowTracking.materializedRowIdColumnName",
+               addConfig(this, MATERIALIZED_ROW_ID_COLUMN_NAME);
+               addConfig(this, MATERIALIZED_ROW_COMMIT_VERSION_COLUMN_NAME);
+               addConfig(this, VARIANT_SHREDDING_ENABLED);
++              addConfig(this, PARQUET_COMPRESSION_CODEC);
+ 
+               // The below configs do not yet have their behavior correctly implemented in Kernel.
+               addConfig(this, DATA_SKIPPING_STATS_COLUMNS);
\ No newline at end of file
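
The validation pattern used by the `PARQUET_COMPRESSION_CODEC` entry in the diff above (lower-case with `Locale.ROOT`, then check membership in a fixed set) can be sketched in isolation. This is only an illustration: `CodecValidationSketch` and `normalizeCodec` are hypothetical names and not part of Kernel's `TableConfig` API, and the exception type is an assumption. Only the codec list is copied from the diff.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-alone sketch; this is not Kernel's TableConfig.
final class CodecValidationSketch {
  // Same codec names as VALID_COMPRESSION_CODECS in the diff above.
  private static final Set<String> VALID_CODECS =
      Collections.unmodifiableSet(
          new HashSet<>(
              Arrays.asList("uncompressed", "none", "snappy", "gzip", "lz4", "lz4_raw", "zstd")));

  // Lower-cases with Locale.ROOT so the result does not depend on the JVM default locale,
  // then rejects anything outside the fixed set.
  static String normalizeCodec(String raw) {
    String normalized = raw.toLowerCase(Locale.ROOT);
    if (!VALID_CODECS.contains(normalized)) {
      throw new IllegalArgumentException("Unsupported compression codec: " + raw);
    }
    return normalized;
  }

  public static void main(String[] args) {
    System.out.println(normalizeCodec("ZSTD"));
    try {
      normalizeCodec("invalid");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Using `Locale.ROOT` matters because locale-sensitive lower-casing can change ASCII letters; under a Turkish default locale, for example, `"GZIP".toLowerCase()` does not produce `"gzip"`.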
kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
@@ -0,0 +1,13 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
+               StructType.class));
+ 
+   private static final Set<Class<? extends DataType>> V3_SUPPORTED_TYPES =
+-      Stream.concat(V2_SUPPORTED_TYPES.stream(), Stream.of(VariantType.class))
++      Stream.concat(
++              V2_SUPPORTED_TYPES.stream(),
++              Stream.of(VariantType.class, GeometryType.class, GeographyType.class))
+           .collect(Collectors.toSet());
+ 
+   protected static final IcebergCompatCheck V2_CHECK_HAS_SUPPORTED_TYPES =
\ No newline at end of file
kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
@@ -0,0 +1,10 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
+                   VARIANT_SHREDDING_PREVIEW_RW_FEATURE,
+                   VARIANT_RW_PREVIEW_FEATURE,
+                   ALLOW_COLUMN_DEFAULTS_W_FEATURE,
++                  GEOSPATIAL_RW_FEATURE,
+                   // Also allow writerV1 features for backward compatibility.
+                   //
+                   // Note: We already enforce that these features cannot be enabled
\ No newline at end of file
kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
@@ -0,0 +1,22 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
+ import io.delta.kernel.utils.CloseableIterator;
+ import io.delta.kernel.utils.FileStatus;
+ import java.io.IOException;
++import java.io.InterruptedIOException;
+ import java.io.UncheckedIOException;
+ import java.util.*;
+ import java.util.stream.Collectors;
+       throw new IllegalStateException("Can't call `next` on a closed iterator.");
+     }
+     if (Thread.currentThread().isInterrupted()) {
+-      throw new IllegalStateException("Thread was interrupted");
++      // Throw a typed InterruptedIOException (wrapped, since next() does not declare checked
++      // exceptions) so engines whose interrupt-handling recognizes standard JDK interrupt types
++      // (e.g. Spark's StreamExecution.isInterruptionException) treat this as a clean shutdown
++      // rather than a real error.
++      throw new UncheckedIOException(new InterruptedIOException("Thread was interrupted"));
+     }
+ 
+     if (!hasNext()) {
\ No newline at end of file
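
The comment in the diff above explains why the interrupt is surfaced as a wrapped `InterruptedIOException`. A minimal sketch of how a caller might recognize it follows. `InterruptUnwrapSketch`, `nextOrThrow`, and `isInterruption` are hypothetical helpers, not Kernel or Spark APIs, and the cause-chain walk is only an approximation of what an engine-side check could do.

```java
import java.io.InterruptedIOException;
import java.io.UncheckedIOException;

// Hypothetical sketch; not the Kernel ActionsIterator and not Spark's StreamExecution.
final class InterruptUnwrapSketch {
  // Mirrors the pattern in the diff: a next()-style method cannot declare checked
  // exceptions, so the InterruptedIOException is wrapped in an UncheckedIOException.
  static String nextOrThrow() {
    if (Thread.currentThread().isInterrupted()) {
      throw new UncheckedIOException(new InterruptedIOException("Thread was interrupted"));
    }
    return "action";
  }

  // Walks the cause chain looking for a standard JDK interrupt type.
  static boolean isInterruption(Throwable t) {
    for (Throwable cur = t; cur != null; cur = cur.getCause()) {
      if (cur instanceof InterruptedException || cur instanceof InterruptedIOException) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Thread.currentThread().interrupt();
    try {
      nextOrThrow();
    } catch (UncheckedIOException e) {
      System.out.println("interruption: " + isInterruption(e));
    } finally {
      Thread.interrupted(); // clear the interrupt flag set for this demo
    }
  }
}
```

With the previous `IllegalStateException`, a check like `isInterruption` would have nothing typed to match on, which is the motivation stated in the code comment.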
kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
@@ -0,0 +1,11 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
+     }
+   }
+ 
+-  static final TableFeature GEOSPATIAL_RW_FEATURE = new GeoSpatialTableFeature();
++  public static final TableFeature GEOSPATIAL_RW_FEATURE = new GeoSpatialTableFeature();
+ 
+   private static class GeoSpatialTableFeature extends TableFeature.ReaderWriterFeature
+       implements FeatureAutoEnabledByMetadata {
\ No newline at end of file
kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
@@ -0,0 +1,73 @@
+diff --git a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
+--- a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
++++ b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
+ 
+ import scala.collection.JavaConverters._
+ 
+-import io.delta.kernel.exceptions.KernelException
++import io.delta.kernel.exceptions.{InvalidConfigurationValueException, KernelException}
+ 
+ import org.scalatest.funsuite.AnyFunSuite
+ 
+         TableConfig.IN_COMMIT_TIMESTAMP_ENABLEMENT_TIMESTAMP.getKey -> "1",
+         TableConfig.COLUMN_MAPPING_MODE.getKey -> "name",
+         TableConfig.ICEBERG_COMPAT_V2_ENABLED.getKey -> "true",
+-        TableConfig.UNIVERSAL_FORMAT_ENABLED_FORMATS.getKey -> "iceberg").asJava)
++        TableConfig.UNIVERSAL_FORMAT_ENABLED_FORMATS.getKey -> "iceberg",
++        TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> "snappy").asJava)
+   }
+ 
+   test("check TableConfig.MAX_COLUMN_ID.editable is false") {
+     val formats = TableConfig.UNIVERSAL_FORMAT_ENABLED_FORMATS.fromMetadata(config)
+     assert(formats == Set("iceberg", "hudi").asJava)
+   }
++
++  test("PARQUET_COMPRESSION_CODEC - valid values accepted including mixed case") {
++    val validValues = Seq(
++      "snappy",
++      "SNAPPY",
++      "ZSTD",
++      "gzip",
++      "GZIP",
++      "lz4",
++      "lz4_raw",
++      "LZ4_RAW",
++      "uncompressed",
++      "UNCOMPRESSED",
++      "none",
++      "NONE",
++      "zstd")
++    validValues.foreach { codec =>
++      TableConfig.validateAndNormalizeDeltaProperties(
++        Map(TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> codec).asJava)
++    }
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - invalid value throws InvalidConfigurationValueException") {
++    val ex = intercept[InvalidConfigurationValueException] {
++      TableConfig.validateAndNormalizeDeltaProperties(
++        Map(TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> "invalid").asJava)
++    }
++    assert(ex.getMessage.contains("delta.parquet.compression.codec"))
++    assert(ex.getMessage.contains("invalid"))
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - fromMetadata returns lowercase regardless of stored case") {
++    val config = Map(TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> "SNAPPY").asJava
++    val result = TableConfig.PARQUET_COMPRESSION_CODEC.fromMetadata(config)
++    assert(result === "snappy")
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - fromMetadata returns snappy when property absent") {
++    val config = Map.empty[String, String].asJava
++    val result = TableConfig.PARQUET_COMPRESSION_CODEC.fromMetadata(config)
++    assert(result === "snappy")
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - validation normalizes key case") {
++    val result = TableConfig.validateAndNormalizeDeltaProperties(
++      Map("DELTA.PARQUET.COMPRESSION.CODEC" -> "snappy").asJava)
++    assert(result.containsKey("delta.parquet.compression.codec"))
++    assert(result.get("delta.parquet.compression.codec") === "snappy")
++  }
+ }
\ No newline at end of file
kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV2MetadataValidatorAndUpdaterSuite.scala
@@ -0,0 +1,12 @@
+diff --git a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV2MetadataValidatorAndUpdaterSuite.scala b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV2MetadataValidatorAndUpdaterSuite.scala
+--- a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV2MetadataValidatorAndUpdaterSuite.scala
++++ b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV2MetadataValidatorAndUpdaterSuite.scala
+ 
+   override def supportedDataColumnTypes: Set[DataType] = ALL_TYPES
+ 
+-  override def unsupportedDataColumnTypes: Set[DataType] = Set(VariantType.VARIANT)
++  override def unsupportedDataColumnTypes: Set[DataType] =
++    Set(VariantType.VARIANT, GeometryType.ofDefault(), GeographyType.ofDefault())
+ 
+   override def unsupportedPartitionColumnTypes: Set[DataType] = NESTED_TYPES
+ 
\ No newline at end of file
kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV3MetadataValidatorAndUpdateSuite.scala
@@ -0,0 +1,12 @@
+diff --git a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV3MetadataValidatorAndUpdateSuite.scala b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV3MetadataValidatorAndUpdateSuite.scala
+--- a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV3MetadataValidatorAndUpdateSuite.scala
++++ b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/icebergcompat/IcebergCompatV3MetadataValidatorAndUpdateSuite.scala
+ 
+   override def icebergCompatVersion: String = "V3"
+ 
+-  override def supportedDataColumnTypes: Set[DataType] = ALL_TYPES + VariantType.VARIANT
++  override def supportedDataColumnTypes: Set[DataType] =
++    ALL_TYPES + VariantType.VARIANT + GeometryType.ofDefault() + GeographyType.ofDefault()
+ 
+   override def unsupportedDataColumnTypes: Set[DataType] = Set.empty
+ 
\ No newline at end of file

... (truncated, output exceeded 60000 bytes)

Reproduce locally: git range-diff 77526c7..a50c9d2 e43bf65..dfe61d9 | Disable: git config gitstack.push-range-diff false

Collaborator

@TimothyW553 TimothyW553 left a comment

A few comments.

Comment on lines +111 to +118
  @Override
  public StructType partitionSchema() {
    if (cachedPartitionSchema == null) {
      cachedPartitionSchema = AbstractMetadata.super.partitionSchema();
    }
    return cachedPartitionSchema;
  }
}
Collaborator

Can we add a test where partitionColumns = ["Part1"] and the schema field is part1? Note the difference in capitalization.

Collaborator Author

added
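
The scenario the reviewer asks about can be sketched with plain name lookups. The thread does not say which behavior the adapter actually has, so the sketch below only contrasts the two possibilities. `PartitionCaseSketch`, `findExact`, and `findIgnoreCase` are hypothetical names and do not reflect `AbstractMetadata.partitionSchema()`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of the "Part1" vs "part1" test scenario; not the adapter's code.
final class PartitionCaseSketch {
  // Exact (case-sensitive) lookup of a partition column among schema field names.
  static Optional<String> findExact(List<String> schemaFields, String partitionColumn) {
    return schemaFields.stream().filter(f -> f.equals(partitionColumn)).findFirst();
  }

  // Case-insensitive lookup; returns the field name as it is spelled in the schema.
  static Optional<String> findIgnoreCase(List<String> schemaFields, String partitionColumn) {
    return schemaFields.stream().filter(f -> f.equalsIgnoreCase(partitionColumn)).findFirst();
  }

  public static void main(String[] args) {
    List<String> schemaFields = Arrays.asList("id", "part1");
    System.out.println(findExact(schemaFields, "Part1").isPresent());
    System.out.println(findIgnoreCase(schemaFields, "Part1").orElse("<missing>"));
  }
}
```

A test with mismatched capitalization pins down which of these two behaviors the partition schema derivation has, instead of leaving it implicit.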

Comment on lines +96 to +109
  public DeltaColumnMappingMode columnMappingMode() {
    ColumnMapping.ColumnMappingMode kernelMode =
        ColumnMapping.getColumnMappingMode(kernelMetadata.getConfiguration());
    switch (kernelMode) {
      case NONE:
        return NoMapping$.MODULE$;
      case ID:
        return IdMapping$.MODULE$;
      case NAME:
        return NameMapping$.MODULE$;
      default:
        throw new UnsupportedOperationException("Unsupported column mapping mode: " + kernelMode);
    }
  }
Collaborator

V1 already has DeltaColumnMappingMode.apply(String), which does this exact mapping. Can we reuse it instead of maintaining two tables?

Collaborator Author

Thank you for the info, reused the v1 apply
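
The idea behind the suggestion, one string-keyed factory instead of a second hand-written switch, can be sketched as follows. The enum is a hypothetical stand-in: it reproduces neither V1's `DeltaColumnMappingMode` nor Kernel's `ColumnMapping.ColumnMappingMode`, and the case-insensitive matching is an assumption of this sketch.

```java
import java.util.Locale;

// Hypothetical stand-in for a column mapping mode type with a single name-based factory.
final class MappingModeSketch {
  enum Mode {
    NONE("none"),
    ID("id"),
    NAME("name");

    final String modeName;

    Mode(String modeName) {
      this.modeName = modeName;
    }
  }

  // The only place that maps a mode name to a mode. Callers holding another
  // representation pass its name instead of keeping a second mapping table.
  static Mode fromName(String name) {
    String normalized = name.toLowerCase(Locale.ROOT);
    for (Mode mode : Mode.values()) {
      if (mode.modeName.equals(normalized)) {
        return mode;
      }
    }
    throw new UnsupportedOperationException("Unsupported column mapping mode: " + name);
  }

  public static void main(String[] args) {
    System.out.println(fromName("id"));
    System.out.println(fromName("Name"));
  }
}
```

With a single factory, a new mode only has to be added in one place, and the unknown-mode branch has one code path to test.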

Comment on lines +77 to +82
      cachedPartitionColumns =
          CollectionConverters.asScala(
                  VectorUtils.toJavaList(kernelMetadata.getPartitionColumns()).stream()
                      .map(Object::toString)
                      .collect(Collectors.toList()))
              .toSeq();
Collaborator

toJavaList already returns List<String> here. The .stream().map(Object::toString) step does nothing and just hides type bugs.

Suggested change
-      cachedPartitionColumns =
-          CollectionConverters.asScala(
-                  VectorUtils.toJavaList(kernelMetadata.getPartitionColumns()).stream()
-                      .map(Object::toString)
-                      .collect(Collectors.toList()))
-              .toSeq();
+      List<String> rawCols = VectorUtils.toJavaList(kernelMetadata.getPartitionColumns());
+      cachedPartitionColumns = CollectionConverters.asScala(rawCols).toSeq();

Collaborator Author

I was using stream to cast the type from Object to String, but your code looks better
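
Why `map(Object::toString)` can hide type bugs is easiest to see with a list that unexpectedly contains a non-String element. The sketch below uses plain JDK collections rather than Kernel's `VectorUtils`, and `ToStringMaskSketch` with its two methods is hypothetical. Note that the suggested change above relies on the generic return type at compile time, whereas this sketch uses a runtime cast to make the failure visible.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch using plain JDK lists; not the adapter's code.
final class ToStringMaskSketch {
  // Converts every element with toString(), so a non-String element is silently stringified.
  static List<String> viaToString(List<?> raw) {
    return raw.stream().map(Object::toString).collect(Collectors.toList());
  }

  // Checks each element's runtime type; a non-String element fails with ClassCastException.
  static List<String> viaCheckedCast(List<?> raw) {
    return raw.stream().map(String.class::cast).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Object> raw = Arrays.asList("part1", 42);
    System.out.println(viaToString(raw)); // the Integer is accepted without complaint
    try {
      viaCheckedCast(raw);
    } catch (ClassCastException e) {
      System.out.println("rejected non-String element");
    }
  }
}
```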

Comment on lines +50 to +59
  public Option<Set<String>> readerFeatures() {
    if (cachedReaderFeatures == null) {
      cachedReaderFeatures =
          kernelProtocol.supportsReaderFeatures()
              ? Option.apply(
                  CollectionConverters.asScala(kernelProtocol.getReaderFeatures()).toSet())
              : Option.empty();
    }
    return cachedReaderFeatures;
  }
Collaborator

volatile + check-then-set seems racy (two threads can both compute)

Collaborator Author

kernelProtocol is declared to be a constant, so it is fine
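
The trade-off discussed in this thread can be sketched as an idempotent lazy cache: two threads may both compute the value, but both results are equal because the input never changes. `LazyCacheSketch` is a hypothetical class over a plain `Set<String>`, not the adapter itself, and the `computeCount` counter exists only to make the behavior observable.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a volatile check-then-set cache over an immutable input.
final class LazyCacheSketch {
  private final Set<String> source; // never reassigned, like a final kernelProtocol field
  private volatile Set<String> cached; // lazily computed view
  final AtomicInteger computeCount = new AtomicInteger();

  LazyCacheSketch(Set<String> source) {
    this.source = Collections.unmodifiableSet(new HashSet<>(source));
  }

  Set<String> features() {
    Set<String> local = cached;
    if (local == null) {
      // Two threads may both reach this point. Each derives an equal value from the same
      // immutable source, so whichever write lands last does not change what callers see.
      computeCount.incrementAndGet();
      local = Collections.unmodifiableSet(new HashSet<>(source));
      cached = local;
    }
    return local;
  }

  public static void main(String[] args) throws InterruptedException {
    LazyCacheSketch cache = new LazyCacheSketch(new HashSet<>(Arrays.asList("a", "b")));
    Thread t1 = new Thread(cache::features);
    Thread t2 = new Thread(cache::features);
    t1.start();
    t2.start();
    t1.join();
    t2.join();
    // The count may be 1 or 2 depending on scheduling; the cached value is equal either way.
    System.out.println(cache.features() + " computed " + cache.computeCount.get() + " time(s)");
  }
}
```

The duplicate computation is wasted work rather than a correctness problem, as long as callers do not depend on object identity. If exactly-once computation matters, computing the value eagerly in the constructor or synchronizing the getter are the usual alternatives.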

      case NONE:
        return NoMapping$.MODULE$;
      case ID:
        return IdMapping$.MODULE$;
Collaborator

Only NoMapping and NameMapping are tested -- IdMapping is uncovered. Same with the unknown-mode branch.

Collaborator Author

tests added

@PorridgeSwim
Collaborator Author

Range-diff: master (dfe61d9 -> f3359c7)
.github/actions/setup-unitycatalog/action.yml
@@ -0,0 +1,40 @@
+diff --git a/.github/actions/setup-unitycatalog/action.yml b/.github/actions/setup-unitycatalog/action.yml
+new file mode 100644
+--- /dev/null
++++ b/.github/actions/setup-unitycatalog/action.yml
++name: "Set up pinned Unity Catalog build"
++description: >-
++  Publishes Unity Catalog jars from the commit pinned in project/scripts/setup_unitycatalog_main.sh
++  (the UC_PIN_SHA= line) to the runner's local Ivy / Maven caches, using GitHub Actions cache so the
++  slow UC build only runs the first time a pin is seen.
++
++runs:
++  using: "composite"
++  steps:
++    - name: Restore pinned UC cache
++      id: uc-cache
++      uses: actions/cache/restore@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
++      with:
++        # ~/.ivy2/local is what sbt publishLocal writes to. ~/.m2 is for publishM2.
++        path: |
++          ~/.ivy2/local
++          ~/.m2/repository/io/unitycatalog
++        # Cache key hashes the setup script, so bumping UC_PIN_SHA (or any other script change)
++        # invalidates the cache.
++        key: uc-jars-${{ runner.os }}-${{ hashFiles('project/scripts/setup_unitycatalog_main.sh') }}
++    - name: Build Unity Catalog from pinned SHA
++      shell: bash
++      run: bash project/scripts/setup_unitycatalog_main.sh
++    - name: Save pinned UC cache
++      # Only attempt a save when the restore missed. When multiple parallel matrix jobs all see
++      # a cache miss (first CI run after a pin bump), only the first to reach this step wins the
++      # GHA cache reservation; the rest log "another job may be creating this cache" warnings.
++      # Gating on cache-hit means cached runs (the common steady state) skip the save entirely,
++      # which eliminates those warnings on every subsequent run.
++      if: steps.uc-cache.outputs.cache-hit != 'true'
++      uses: actions/cache/save@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
++      with:
++        path: |
++          ~/.ivy2/local
++          ~/.m2/repository/io/unitycatalog
++        key: uc-jars-${{ runner.os }}-${{ hashFiles('project/scripts/setup_unitycatalog_main.sh') }}
\ No newline at end of file
.github/workflows/build.yaml
@@ -0,0 +1,29 @@
+diff --git a/.github/workflows/build.yaml b/.github/workflows/build.yaml
+--- a/.github/workflows/build.yaml
++++ b/.github/workflows/build.yaml
+ name: "Delta Build"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+             ~/.cache/coursier
+           key: delta-sbt-cache-cross-spark
+ 
++      # publishM2 compiles every aggregated project, including storage, which has
++      # unitycatalog-client as a compile-scope dependency. Publish the pinned UC build locally
++      # first so Delta compiles against the UC APIs it actually targets.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
++
+       - name: Run cross-Spark build test
+         run: python project/tests/test_cross_spark_publish.py
+ 
\ No newline at end of file
.github/workflows/disabled_iceberg_test.yaml
@@ -0,0 +1,80 @@
+diff --git a/.github/workflows/disabled_iceberg_test.yaml b/.github/workflows/disabled_iceberg_test.yaml
+deleted file mode 100644
+--- a/.github/workflows/disabled_iceberg_test.yaml
++++ /dev/null
+-name: "Delta Iceberg Latest [DISABLED]"
+-# SECURITY: All Python/PySpark workflows disabled due to active supply chain attack
+-# targeting OSS package ecosystems (PyPI). C2 domains: models.litellm.cloud, checkmarx.zone
+-# Date disabled: 2026-03-25
+-# To re-enable: remove 'if: false' from all jobs and restore original triggers
+-on:
+-  workflow_dispatch: # manual-only, auto triggers removed
+-  # To re-enable, replace the above line with:
+-  # push:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
+-  # pull_request:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
+-env:
+-  # SECURITY: Temporal lockdown — refuse any package version published after this date.
+-  # This date is a pre-attack baseline (before the active PyPI supply chain attack).
+-  UV_EXCLUDE_NEWER: "2026-03-10T00:00:00Z"
+-jobs:
+-  test:
+-    if: false # SECURITY: disabled - supply chain attack mitigation
+-    name: "DIL: Scala ${{ matrix.scala }}"
+-    runs-on: ubuntu-24.04
+-    strategy:
+-      matrix:
+-        # These Scala versions must match those in the build.sbt
+-        scala: [2.13.16]
+-    env:
+-      SCALA_VERSION: ${{ matrix.scala }}
+-    steps:
+-      - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
+-      - name: install java
+-        uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
+-        with:
+-          distribution: "zulu"
+-          java-version: "17"
+-      - name: Cache Scala, SBT
+-        uses: actions/cache@6f8efc29b200d32929f49075959781ed54ec270c # v3.5.0
+-        with:
+-          path: |
+-            ~/.sbt
+-            ~/.ivy2
+-            ~/.cache/coursier
+-          # Change the key if dependencies are changed. For each key, GitHub Actions will cache the
+-          # the above directories when we use the key for the first time. After that, each run will
+-          # just use the cache. The cache is immutable so we need to use a new key when trying to
+-          # cache new stuff.
+-          key: delta-sbt-cache-spark4.0-scala${{ matrix.scala }}
+-      - name: Set up uv
+-        run: bash project/scripts/install-uv.sh
+-      - name: Install Job dependencies
+-        run: |
+-          sudo apt-get update
+-          sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl git
+-          sudo apt install libedit-dev
+-          # buf v1.28.1 (2023-11-15) — SHA from official release asset:
+-          # https://github.com/bufbuild/buf/releases/download/v1.28.1/sha256.txt
+-          BUF_VERSION="v1.28.1"
+-          BUF_SHA256="870cf492d381a967d36636fdee9da44b524ea62aad163659b8dbf16a7da56987"
+-          curl -fsSL -o buf-Linux-x86_64.tar.gz \
+-            "https://github.com/bufbuild/buf/releases/download/${BUF_VERSION}/buf-Linux-x86_64.tar.gz"
+-          echo "${BUF_SHA256}  buf-Linux-x86_64.tar.gz" | sha256sum -c -
+-          mkdir -p ~/buf
+-          tar -xzf buf-Linux-x86_64.tar.gz -C ~/buf --strip-components 1
+-          rm buf-Linux-x86_64.tar.gz
+-          uv python install 3.8
+-          uv venv .venv --python 3.8
+-      - name: Run Scala/Java and Python tests
+-        # when changing TEST_PARALLELISM_COUNT make sure to also change it in spark_master_test.yaml
+-        run: |
+-          source .venv/bin/activate
+-          TEST_PARALLELISM_COUNT=4 python run-tests.py --group iceberg --spark-version 4.0
\ No newline at end of file
.github/workflows/spark_test_uc_master.yaml
@@ -0,0 +1,62 @@
+diff --git a/.github/workflows/spark_test_uc_master.yaml b/.github/workflows/disabled_spark_test_uc_master.yaml
+similarity index 61%
+rename from .github/workflows/spark_test_uc_master.yaml
+rename to .github/workflows/disabled_spark_test_uc_master.yaml
+--- a/.github/workflows/spark_test_uc_master.yaml
++++ b/.github/workflows/disabled_spark_test_uc_master.yaml
+ ##
+ ## To make this blocking, add the job name to the required status checks in
+ ## the branch protection rules for `master`.
++##
++## DISABLED while Delta master builds against a pinned UC master SHA — the main Delta Spark
++## workflow already exercises UC master at that pin, so a parallel floating-main workflow would
++## be redundant. To re-enable (once Delta goes back to a released UC version): drop the
++## `[DISABLED]` suffix from `name`, replace `workflow_dispatch:` with the original push /
++## pull_request triggers below, remove `if: false` from the job, and rename the file back to
++## `spark_test_uc_master.yaml`.
+ 
+-name: "Delta Spark (UC Master)"
++name: "Delta Spark (UC Master) [DISABLED]"
+ on:
+-  push:
+-    paths-ignore:
+-      - '**.md'
+-      - '**.txt'
+-  pull_request:
+-    paths-ignore:
+-      - '**.md'
+-      - '**.txt'
++  workflow_dispatch: # manual-only while disabled
++  # Original triggers, restore when re-enabling:
++  # push:
++  #   branches: [master, branch-*]
++  #   paths-ignore:
++  #     - '**.md'
++  #     - '**.txt'
++  # pull_request:
++  #   branches: [master, branch-*]
++  #   paths-ignore:
++  #     - '**.md'
++  #     - '**.txt'
+ 
+ jobs:
+   test-uc-master:
+     name: "[Non Blocking] UC Integration Tests (UC Main)"
++    # Guard against accidental runs while disabled. Remove when re-enabling.
++    if: false
+     runs-on: ubuntu-24.04
+     steps:
+       - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3
+           key: delta-sbt-cache-uc-master
+       - name: Build Unity Catalog from source
+         id: uc-build
++        # UC_REF=main builds the floating-main canary instead of the pinned SHA, which is the
++        # point of this workflow -- early warning of upcoming UC incompatibilities.
+         run: |
+-          bash project/scripts/setup_unitycatalog_main.sh
+-          UC_VERSION=$(cat /tmp/unitycatalog/.uc-version)
++          UC_REF=main bash project/scripts/setup_unitycatalog_main.sh
++          UC_VERSION=$(UC_REF=main bash project/scripts/setup_unitycatalog_main.sh --print-version)
+           echo "uc_version=$UC_VERSION" >> $GITHUB_OUTPUT
+           echo "UC version: $UC_VERSION"
+       - name: Run UC integration tests
\ No newline at end of file
.github/workflows/flink_test.yaml
@@ -0,0 +1,37 @@
+diff --git a/.github/workflows/flink_test.yaml b/.github/workflows/flink_test.yaml
+--- a/.github/workflows/flink_test.yaml
++++ b/.github/workflows/flink_test.yaml
+ 
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'flink/**'
+       - 'kernel/**'
+       - '!**/*.md'
+       - '!**/*.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'flink/**'
+       - 'kernel/**'
+   cancel-in-progress: true
+ 
+ env:
+-  # Point SBT to our cache directories for consistency
++  # Point SBT to our cache directories for consistency.
+   SBT_OPTS: "-Dsbt.coursier.home-dir=/home/runner/.cache/coursier -Dsbt.ivy.home=/home/runner/.ivy2"
+ 
+ jobs:
+           else
+             echo "❌ Cache MISS - will download dependencies"
+           fi
++      # flink has unitycatalog-client as a compile-scope dep and flink tests exercise UC.
++      # Publish the pinned UC build locally before sbt runs.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run unit tests
+         run: |
+           build/sbt flinkGroup/test
\ No newline at end of file
.github/workflows/iceberg_test.yaml
@@ -0,0 +1,58 @@
+diff --git a/.github/workflows/iceberg_test.yaml b/.github/workflows/iceberg_test.yaml
+new file mode 100644
+--- /dev/null
++++ b/.github/workflows/iceberg_test.yaml
++name: "Delta Iceberg Latest"
++on:
++  push:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
++  pull_request:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
++jobs:
++  test:
++    name: "DIL: Scala ${{ matrix.scala }}"
++    runs-on: ubuntu-24.04
++    strategy:
++      matrix:
++        # These Scala versions must match those in the build.sbt
++        scala: [2.13.16]
++    env:
++      SCALA_VERSION: ${{ matrix.scala }}
++    steps:
++      - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
++      - name: install java
++        uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
++        with:
++          distribution: "zulu"
++          java-version: "17"
++      - name: Cache Scala, SBT
++        uses: actions/cache@6f8efc29b200d32929f49075959781ed54ec270c # v3.5.0
++        with:
++          path: |
++            ~/.sbt
++            ~/.ivy2
++            ~/.cache/coursier
++          # Change the key if dependencies are changed. For each key, GitHub Actions will cache the
++          # above directories when we use the key for the first time. After that, each run will
++          # just use the cache. The cache is immutable so we need to use a new key when trying to
++          # cache new stuff.
++          key: delta-sbt-cache-spark4.0-scala${{ matrix.scala }}
++      - name: Set up uv
++        run: bash project/scripts/install-uv.sh
++      - name: Install Python via uv
++        # No UV_EXCLUDE_NEWER needed: this workflow installs zero pip packages.
++        # Python is only used to run the stdlib-only run-tests.py driver.
++        run: |
++          uv python install 3.8
++          uv venv .venv --python 3.8
++      - name: Run Scala/Java and Python tests
++        # when changing TEST_PARALLELISM_COUNT make sure to also change it in spark_master_test.yaml
++        run: |
++          source .venv/bin/activate
++          TEST_PARALLELISM_COUNT=4 python run-tests.py --group iceberg --spark-version 4.0
\ No newline at end of file
.github/workflows/kernel_docs.yaml
@@ -0,0 +1,11 @@
+diff --git a/.github/workflows/kernel_docs.yaml b/.github/workflows/kernel_docs.yaml
+--- a/.github/workflows/kernel_docs.yaml
++++ b/.github/workflows/kernel_docs.yaml
+           java-version: "11"
+       - name: Generate docs
+         run: |
+-          build/sbt kernelGroup/unidoc
++          build/sbt -DuseDefaultUnityCatalogReleaseVersion=true kernelGroup/unidoc
+           mkdir -p kernel/docs/snapshot/kernel-api/java
+           mkdir -p kernel/docs/snapshot/kernel-defaults/java
+           cp -r kernel/kernel-api/target/javaunidoc/. kernel/docs/snapshot/kernel-api/java/
\ No newline at end of file
.github/workflows/kernel_test.yaml
@@ -0,0 +1,47 @@
+diff --git a/.github/workflows/kernel_test.yaml b/.github/workflows/kernel_test.yaml
+--- a/.github/workflows/kernel_test.yaml
++++ b/.github/workflows/kernel_test.yaml
+ 
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+           else
+             echo "❌ Cache MISS - will download dependencies"
+           fi
++      # run-tests.py invokes sbt with `++ 2.13.16`, which triggers cross-version dependency resolution
++      # across every project (including kernelUnityCatalog). Publish the pinned UC build locally first
++      # so that resolution doesn't miss.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run unit tests
+         run: |
+           python run-tests.py --group kernel --coverage --shard ${{ matrix.shard }}
+     runs-on: ubuntu-24.04
+     steps:
+       - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
+-      # Run integration tests with JDK 11, as they have no Spark dependency
+-      - name: install java
++      # The integration test itself runs on JDK 11 (no Spark dependency), but UC's sbt build needs
++      # JDK 17, so we install 17 first, publish UC, then switch the active JDK to 11 for the actual
++      # test run.
++      - name: install java 17 for UC build
++        uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
++        with:
++          distribution: "zulu"
++          java-version: "17"
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
++      - name: install java 11 for integration test
+         uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
+         with:
+           distribution: "zulu"
\ No newline at end of file
.github/workflows/kernel_unitycatalog_test.yaml
@@ -0,0 +1,29 @@
+diff --git a/.github/workflows/kernel_unitycatalog_test.yaml b/.github/workflows/kernel_unitycatalog_test.yaml
+--- a/.github/workflows/kernel_unitycatalog_test.yaml
++++ b/.github/workflows/kernel_unitycatalog_test.yaml
+ name: "Kernel Unity Catalog"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'build.sbt'
+       - 'version.sbt'
+       - 'storage/**/*.java'
+       - '.github/workflows/kernel_unitycatalog_test.yaml'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'build.sbt'
+       - 'version.sbt'
+         with:
+           distribution: "zulu"
+           java-version: "17"
++      # kernelUnityCatalog depends on unreleased UC APIs; publish the pinned UC build locally before
++      # sbt tries to resolve the dependency.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run Unity tests with coverage
+         run: |
+           ./build/sbt "++ ${{ env.SCALA_VERSION }}" clean coverage kernelUnityCatalog/test coverageAggregate coverageOff -v
\ No newline at end of file
.github/workflows/spark_examples_test.yaml
@@ -0,0 +1,27 @@
+diff --git a/.github/workflows/spark_examples_test.yaml b/.github/workflows/spark_examples_test.yaml
+--- a/.github/workflows/spark_examples_test.yaml
++++ b/.github/workflows/spark_examples_test.yaml
+ name: "Delta Spark Publishing and Examples"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+           sudo apt-get update
+           sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl git
+           sudo apt install libedit-dev
++      # `publishM2` and `++ <scala>` both resolve every project's deps, which includes
++      # sparkUnityCatalog; publish the pinned UC build locally before sbt runs.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run Delta Spark Local Publishing and Examples Compilation
+         # examples/scala/build.sbt will compile against the local Delta release version (e.g. 3.2.0-SNAPSHOT).
+         # Thus, we need to publishM2 first so those jars are locally accessible.
\ No newline at end of file
.github/workflows/disabled_spark_python_test.yaml
@@ -0,0 +1,76 @@
+diff --git a/.github/workflows/disabled_spark_python_test.yaml b/.github/workflows/spark_python_test.yaml
+similarity index 71%
+rename from .github/workflows/disabled_spark_python_test.yaml
+rename to .github/workflows/spark_python_test.yaml
+--- a/.github/workflows/disabled_spark_python_test.yaml
++++ b/.github/workflows/spark_python_test.yaml
+-name: "Delta Spark Python [DISABLED]"
+-# SECURITY: All Python/PySpark workflows disabled due to active supply chain attack
+-# targeting OSS package ecosystems (PyPI). C2 domains: models.litellm.cloud, checkmarx.zone
+-# Date disabled: 2026-03-25
+-# To re-enable: remove 'if: false' from all jobs and restore original triggers
++name: "Delta Spark Python"
+ on:
+-  workflow_dispatch: # manual-only, auto triggers removed
+-  # To re-enable, replace the above line with:
+-  # push:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
+-  # pull_request:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
++  push:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
++  pull_request:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
+ env:
+   # SECURITY: Temporal lockdown — refuse any package version published after this date.
+   # This date is a pre-attack baseline (before the active PyPI supply chain attack).
+   # Generate Spark versions matrix from CrossSparkVersions.scala
+   # This workflow tests against released versions only (no snapshots)
+   generate-matrix:
+-    if: false # SECURITY: disabled - supply chain attack mitigation
+     name: "Generate Released Spark Versions Matrix"
+     runs-on: ubuntu-24.04
+     outputs:
+           echo "Generated released Spark versions: $SPARK_VERSIONS"
+ 
+   test:
+-    if: false # SECURITY: disabled - supply chain attack mitigation
+     name: "DSP (${{ matrix.spark_version }})"
+     runs-on: ubuntu-24.04
+     needs: generate-matrix
+           key: delta-sbt-cache-spark${{ matrix.spark_version }}-scala${{ matrix.scala }}
+       - name: Set up uv
+         run: bash project/scripts/install-uv.sh
+-      - name: Install Job dependencies
++      - name: Set up buf
++        run: bash project/scripts/install-buf.sh
++      - name: Install Python and dependencies
+         run: |
+-          sudo apt-get update
+-          sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl git
+-          sudo apt install libedit-dev
+-          # buf v1.28.1 (2023-11-15) — SHA from official release asset:
+-          # https://github.com/bufbuild/buf/releases/download/v1.28.1/sha256.txt
+-          BUF_VERSION="v1.28.1"
+-          BUF_SHA256="870cf492d381a967d36636fdee9da44b524ea62aad163659b8dbf16a7da56987"
+-          curl -fsSL -o buf-Linux-x86_64.tar.gz \
+-            "https://github.com/bufbuild/buf/releases/download/${BUF_VERSION}/buf-Linux-x86_64.tar.gz"
+-          echo "${BUF_SHA256}  buf-Linux-x86_64.tar.gz" | sha256sum -c -
+-          mkdir -p ~/buf
+-          tar -xzf buf-Linux-x86_64.tar.gz -C ~/buf --strip-components 1
+-          rm buf-Linux-x86_64.tar.gz
+           uv python install 3.10
+           uv venv .venv --python 3.10
+           # Install hash-verified locked dependencies (see .github/ci-requirements/spark-python/)
\ No newline at end of file
.github/workflows/spark_test.yaml
@@ -0,0 +1,27 @@
+diff --git a/.github/workflows/spark_test.yaml b/.github/workflows/spark_test.yaml
+--- a/.github/workflows/spark_test.yaml
++++ b/.github/workflows/spark_test.yaml
+ name: "Delta Spark"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+             ~/.ivy2
+             ~/.cache/coursier
+           key: delta-sbt-cache-spark${{ matrix.spark_version }}-scala${{ matrix.scala }}
++      # Delta's sparkUnityCatalog module (part of sparkGroup) depends on APIs that are only in
++      # unreleased UC. Publish the pinned UC build locally before sbt tries to resolve it.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Scala structured logging style check
+         run: |
+           if [ -f ./dev/spark_structured_logging_style.py ]; then
\ No newline at end of file
.github/workflows/unidoc.yaml
@@ -0,0 +1,19 @@
+diff --git a/.github/workflows/unidoc.yaml b/.github/workflows/unidoc.yaml
+--- a/.github/workflows/unidoc.yaml
++++ b/.github/workflows/unidoc.yaml
+   name: "Unidoc"
+   on:
+     push:
+-      branches: [master]
++      branches: [master, branch-*]
+     pull_request:
+-      branches: [master]
++      branches: [master, branch-*]
+   jobs:
+     build:
+       name: "U: Scala ${{ matrix.scala }}"
+             java-version: "17"
+         - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
+         - name: generate unidoc
+-          run: build/sbt "++ ${{ matrix.scala }}" unidoc
++          run: build/sbt -DuseDefaultUnityCatalogReleaseVersion=true "++ ${{ matrix.scala }}" unidoc
\ No newline at end of file
build.sbt
@@ -0,0 +1,162 @@
+diff --git a/build.sbt b/build.sbt
+--- a/build.sbt
++++ b/build.sbt
+   ).configureUnidoc()
+ 
+ 
+-val unityCatalogVersion = sys.props.getOrElse("unityCatalogVersion", "0.4.1")
++// Unity Catalog version. Three modes, in priority order:
++//
++//  1. `-DuseDefaultUnityCatalogReleaseVersion=true`: use `defaultUnityCatalogReleaseVersion`
++//     below -- the last released UC version on Maven Central. For workflows that don't actually
++//     need DRC APIs (e.g. unidoc, lint) and want to skip the pinned UC build. Shared across
++//     workflows by reading this single constant, so bumping is a one-line change here.
++//
++//  2. Release mode: set `unityCatalogReleaseVersion = Some("0.5.0")` (or whatever released
++//     version the release branch ships against). sbt resolves the coordinate from Maven Central
++//     like any other dependency.
++//
++//  3. Pinned mode (default): leave `unityCatalogReleaseVersion = None`. The version string
++//     comes from `setup_unitycatalog_main.sh --print-version`, which encodes both the pinned
++//     UC main SHA and UC's declared base version; the script is the single source of truth.
++//     The same script (without the flag) publishes the matching jars to ~/.ivy2/local when
++//     `ensurePinnedUnityCatalog` decides they're missing.
++//
++// Override with -DunityCatalogVersion=<anything> for ad-hoc experiments.
++val unityCatalogReleaseVersion: Option[String] = None
++val defaultUnityCatalogReleaseVersion = "0.4.1"
++val useDefaultUnityCatalogReleaseVersion: Boolean =
++  sys.props.getOrElse("useDefaultUnityCatalogReleaseVersion", "false").toBoolean
++val unityCatalogSetupScript = "project/scripts/setup_unitycatalog_main.sh"
++
++// Lazy so release-mode / useDefaultUnityCatalogReleaseVersion builds never shell out.
++lazy val pinnedUnityCatalogVersion: String = {
++  import scala.sys.process._
++  Process(Seq("bash", unityCatalogSetupScript, "--print-version")).!!.trim
++}
++val unityCatalogVersion: String = sys.props.getOrElse(
++  "unityCatalogVersion",
++  if (useDefaultUnityCatalogReleaseVersion) defaultUnityCatalogReleaseVersion
++  else unityCatalogReleaseVersion.getOrElse(pinnedUnityCatalogVersion))
++
+ val sparkUnityCatalogJacksonVersion = "2.15.4" // We are using Spark 4.0's Jackson version 2.15.x, to override Unity Catalog 0.3.0's version 2.18.x
+ 
++// Publishes the pinned UC jars to ~/.ivy2/local if they're not already cached there. Hooked
++// into `update` on the UC-dependent projects below, so plain `sbt testOnly ...` on a clean
++// checkout just works. No-op in release mode. Opt out with
++// `-Ddelta.autoBuildPinnedUnityCatalog=false`, in which case sbt errors with a pointer to the
++// setup script.
++val ensurePinnedUnityCatalog = taskKey[Unit](
++  "Publish the pinned UC jars locally if the Ivy coordinate isn't already cached.")
++
++// Extracted so the task body can read as a short guard rather than three nested ifs.
++def publishPinnedUnityCatalog(log: sbt.util.Logger, canary: java.io.File): Unit = {
++  val shouldAutoBuild =
++    sys.props.getOrElse("delta.autoBuildPinnedUnityCatalog", "true").toBoolean
++  if (!shouldAutoBuild) {
++    sys.error(
++      s"""|Pinned Unity Catalog jars are not published locally for coordinate
++          |$unityCatalogVersion.
++          |Auto-build is disabled (-Ddelta.autoBuildPinnedUnityCatalog=false).
++          |Run: bash $unityCatalogSetupScript""".stripMargin)
++  }
++  log.info(s"[UC] Pinned UC jars not found for coordinate $unityCatalogVersion.")
++  log.info(
++    s"[UC] Running $unityCatalogSetupScript - takes ~3-5 minutes on a cold cache, <1s on a warm one.")
++  import scala.sys.process._
++  val procLogger = ProcessLogger(
++    line => log.info(s"[UC setup] $line"),
++    line => log.warn(s"[UC setup] $line"))
++  val exit = Process(Seq("bash", unityCatalogSetupScript)).!(procLogger)
++  if (exit != 0) {
++    sys.error(
++      s"[UC] $unityCatalogSetupScript exited with code $exit. Run it manually to see full output.")
++  }
++  if (!canary.exists) {
++    sys.error(
++      s"[UC] $unityCatalogSetupScript succeeded but ${canary.getAbsolutePath} is still missing - " +
++        "the publish target layout may have changed.")
++  }
++}
++
++Global / ensurePinnedUnityCatalog := {
++  // Resolve the .value dependencies eagerly - sbt's task macro warns when
++  // `.value` appears inside conditional branches.
++  val log = streams.value.log
++  // No-op whenever the effective version resolves to something Maven Central can serve:
++  // release mode, -DuseDefaultUnityCatalogReleaseVersion=true, or -DunityCatalogVersion=<released>.
++  val usingReleasedVersion = useDefaultUnityCatalogReleaseVersion ||
++    sys.props.contains("unityCatalogVersion")
++  if (unityCatalogReleaseVersion.isEmpty && !usingReleasedVersion) {
++    val home = file(sys.props("user.home"))
++    // Check both layouts: a restored sbt cache can pre-populate ivy alone, leaving m2 empty -
++    // checking only ivy would silently skip the slow publish and break mvn-based consumers.
++    val ivy2Canary = home / ".ivy2" / "local" / "io.unitycatalog" /
++      "unitycatalog-client" / unityCatalogVersion / "ivys" / "ivy.xml"
++    val m2Canary = home / ".m2" / "repository" / "io" / "unitycatalog" /
++      "unitycatalog-client" / unityCatalogVersion /
++      s"unitycatalog-client-$unityCatalogVersion.pom"
++    if (!ivy2Canary.exists || !m2Canary.exists) {
++      publishPinnedUnityCatalog(log, ivy2Canary)
++    }
++  }
++}
++
+ lazy val sparkUnityCatalog = (project in file("spark/unitycatalog"))
+   .dependsOn(spark % "compile->compile;test->test;provided->provided")
+   .disablePlugins(ScalafmtPlugin)
+     javafmtCheckSettings(),
+     CrossSparkVersions.sparkDependentSettings(sparkVersion),
+ 
++    // Publish the pinned UC jars before sbt tries to resolve them.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     // This is a test-only module - no production sources
+     Compile / sources := Seq.empty,
+ 
+     libraryDependencies ++= Seq(
+       "org.apache.spark" %% "spark-sql" % sparkVersion.value % "provided",
+ 
+-      "io.delta" %% "delta-sharing-client" % "1.3.10",
++      "io.delta" %% "delta-sharing-client" % "1.3.11",
+ 
+       // Test deps
+       "org.scalatest" %% "scalatest" % scalaTestVersion % "test",
+     exportJars := false,
+     javafmtCheckSettings,
+     scalafmtCheckSettings,
+-    
++
+     libraryDependencies ++= Seq(
+       "org.openjdk.jmh" % "jmh-core" % "1.37" % "test",
+       "org.openjdk.jmh" % "jmh-generator-annprocess" % "1.37" % "test",
+     scalaStyleSettings,
+     scalafmtCheckSettings,
+ 
++    // Publish the pinned UC jars before sbt tries to resolve them.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     // Put the shaded kernel-api JAR on the classpath (compile & test)
+     Compile / unmanagedJars += (kernelApi / Compile / packageBin).value,
+     Test / unmanagedJars += (kernelApi / Compile / packageBin).value,
+       "com.fasterxml.jackson.datatype" % "jackson-datatype-jsr310" % "2.15.4" % "test",
+     ),
+ 
++    // Publish the pinned UC jars before sbt tries to resolve them. storage is the transitive
++    // UC-client entry point for most of the build graph (sparkV1, sparkV2, kernelDefaults, etc.
++    // all .dependsOn(storage)), so hooking here covers nearly every compile path.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     // Unidoc settings
+     unidocSourceFilePatterns += SourceFilePattern("/LogStore.java", "/CloseableIterator.java"),
+     TestParallelization.settings
+       "--add-opens=java.base/java.util=ALL-UNNAMED" // for Flink with Java 17.
+     ),
+     crossPaths := false,
++
++    // Publish the pinned UC jars before sbt tries to resolve them.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     libraryDependencies ++= Seq(
+       "org.apache.flink" % "flink-core" % flinkVersion % "provided",
+       "org.apache.flink" % "flink-table-common" % flinkVersion % "provided",
\ No newline at end of file
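
The version-selection order described in the `build.sbt` comment above can be read as a small pure function. The following is a minimal sketch, not the build's actual code: `UcVersionSketch`, `resolveUcVersion`, and its parameter names are hypothetical, and the pinned version string in `main` is made up for illustration. The pinned version is passed by name so that it is evaluated only when no other mode applies, which mirrors the `lazy val pinnedUnityCatalogVersion` in the diff.

```scala
object UcVersionSketch {

  /** Hypothetical helper: picks the effective Unity Catalog version.
    *
    * Order, as the diff's expression evaluates it:
    *   1. explicit override (stands in for -DunityCatalogVersion)
    *   2. default released version, if the opt-in flag is set
    *   3. release version, if one is configured
    *   4. pinned version (the only branch that would shell out in the real build)
    */
  def resolveUcVersion(
      overrideVersion: Option[String],
      useDefaultRelease: Boolean,
      defaultRelease: String,
      releaseVersion: Option[String],
      pinned: => String // by-name: evaluated only in pinned mode
  ): String =
    overrideVersion.getOrElse {
      if (useDefaultRelease) defaultRelease
      else releaseVersion.getOrElse(pinned)
    }

  def main(args: Array[String]): Unit = {
    // Pinned mode: nothing else is set, so the by-name argument is evaluated.
    // The version string below is an invented placeholder.
    println(resolveUcVersion(None, useDefaultRelease = false, "0.4.1", None, "0.0.0-pinned-example"))
    // Opt-in flag set: the pinned argument is never evaluated, so the error is not raised.
    println(resolveUcVersion(None, useDefaultRelease = true, "0.4.1", None, sys.error("unused")))
  }
}
```

Passing `sys.error(...)` as the pinned argument in the second call is a cheap way to confirm that the released-version paths never touch the setup script.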
build/sbt
@@ -0,0 +1,16 @@
+diff --git a/build/sbt b/build/sbt
+--- a/build/sbt
++++ b/build/sbt
+ )
+ }
+ 
+-# If MAVEN_PROXY_URL is set, use it as the sole repository for all dependencies.
++# If MAVEN_PROXY_URL is set, use it (and local) as the sole repository for all dependencies.
+ if [[ -n "$MAVEN_PROXY_URL" ]]; then
+   SBT_REPOSITORIES_CONFIG=$(mktemp)
+   cat > "$SBT_REPOSITORIES_CONFIG" <<EOF
+ [repositories]
++  local
+   maven-proxy: $MAVEN_PROXY_URL
+   maven-proxy-ivy: $MAVEN_PROXY_URL, [organization]/[module]/(scala_[scalaVersion]/)(sbt_[sbtVersion]/)[revision]/[type]s/[artifact](-[classifier]).[ext]
+ EOF
\ No newline at end of file
iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
@@ -0,0 +1,15 @@
+diff --git a/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala b/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
+--- a/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
++++ b/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
+      * AnalysisException
+      */
+      try {
+-       SchemaMergingUtils.checkColumnNameDuplication(tableSchema, "during convert to Delta")
++       SchemaMergingUtils.checkColumnNameDuplication(tableSchema, "CONVERT_TO_DELTA")
+      } catch {
+-       case e: AnalysisException if e.getMessage.contains("during convert to Delta") =>
++       case e: AnalysisException
++           if e.getErrorClass == "DELTA_DUPLICATE_COLUMNS_FOUND.CONVERT_TO_DELTA" =>
+          throw new UnsupportedOperationException(
+            IcebergTable.caseSensitiveConversionExceptionMsg(e.getMessage))
+      }
\ No newline at end of file
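
The `IcebergTable.scala` change above replaces a message-substring guard with a guard on the error class. A minimal sketch of why that is more robust, using a hypothetical `ClassifiedException` as a stand-in for Spark's `AnalysisException` (the real class and its `getErrorClass` accessor are not reproduced here): the guard keeps working when the human-readable message is reworded, and unrelated failures still propagate.

```scala
object ErrorClassMatchSketch {

  // Hypothetical stand-in for an exception that carries a stable error class.
  final case class ClassifiedException(errorClass: String, message: String)
    extends Exception(message)

  // Error class string taken from the diff above.
  val DuplicateColumnsOnConvert = "DELTA_DUPLICATE_COLUMNS_FOUND.CONVERT_TO_DELTA"

  /** Runs `check` and translates only the duplicate-column failure.
    * Any other exception is not caught and propagates to the caller.
    */
  def translate(check: () => Unit): Either[String, Unit] =
    try Right(check())
    catch {
      case e: ClassifiedException if e.errorClass == DuplicateColumnsOnConvert =>
        Left(s"unsupported: ${e.getMessage}")
    }
}
```

Because the guard never inspects the message text, rewording the message (as the first hunk of that diff does) cannot silently disable the translation.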
iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
@@ -0,0 +1,11 @@
+diff --git a/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala b/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
+--- a/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
++++ b/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
+    * @param catalogTable the catalogTable this conversion targets
+    * @return (Iceberg metadata path, last converted Delta version)
+    */
+-  def convertUncommitedTxn(
++  override def convertUncommitedTxn(
+       txnInfo: CurrentTransactionInfo,
+       deltaAttemptVersion: Long,
+       deltaLog: DeltaLog,
\ No newline at end of file
iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
@@ -0,0 +1,149 @@
+diff --git a/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala b/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
+--- a/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
++++ b/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
+ 
+ package org.apache.spark.sql.delta.uniform
+ 
+-import org.apache.spark.sql.delta.sources.DeltaSQLConf
++import java.util.{Collections, Optional, UUID}
++
++import scala.collection.JavaConverters._
++
++import io.delta.storage.commit.{CommitCoordinatorClient => JCommitCoordinatorClient}
++import io.delta.storage.commit.{TableIdentifier => UCTableIdentifier}
++import io.delta.storage.commit.actions.{AbstractMetadata, AbstractProtocol}
++import io.delta.storage.commit.uccommitcoordinator.UCCommitCoordinatorClient
++import org.apache.hadoop.fs.Path
+ 
+ import org.apache.spark.{SparkConf, SparkSessionSwitch}
+ import org.apache.spark.sql.{Row, SparkSession}
++import org.apache.spark.sql.catalyst.TableIdentifier
++import org.apache.spark.sql.delta.DeltaConfigs.{
++  COORDINATED_COMMITS_COORDINATOR_CONF,
++  COORDINATED_COMMITS_COORDINATOR_NAME
++}
++import org.apache.spark.sql.delta.DeltaLog
++import org.apache.spark.sql.delta.NonSparkReadIceberg
++import org.apache.spark.sql.delta.coordinatedcommits.{
++  CatalogOwnedCommitCoordinatorBuilder,
++  CommitCoordinatorProvider,
++  InMemoryUCClient,
++  InMemoryUCCommitCoordinator,
++  UCCommitCoordinatorBuilder
++}
++import org.apache.spark.sql.delta.sources.DeltaSQLConf
+ import org.apache.spark.sql.delta.test.DeltaSQLCommandTest
+ import org.apache.spark.sql.delta.uniform.hms.HMSTest
++import org.apache.spark.sql.delta.util.JsonUtils
+ 
+ /**
+  * This trait allows the tests to write with Delta
+ }
+ 
+ /**
+- * No test should go here. Please add tests in [[UniFormE2EIcebergSuiteBase]]
++ * Trait that wires up an in-memory UC commit coordinator for UniForm E2E testing.
++ *
++ * Mix this into a concrete suite that already extends [[UniFormE2EIcebergSuiteBase]] (or any
++ * other [[UniFormE2ETest]] subclass) to redirect every [[readAndVerify]] call through the
++ * native Iceberg reader backed by the in-memory UC coordinator
++ *
++ * Concrete suites must call [[requiredTableProperties]] inside their
++ * [[UniFormE2EIcebergSuiteBase.extraTableProperties]] override to inject the coordinator
++ * name and conf into every `CREATE TABLE` statement.
+  */
++trait WriteDeltaUCCCReadIceberg extends UniFormE2ETest
++  with DeltaSQLCommandTest
++  with NonSparkReadIceberg {
++
++  /**
++   * A [[UCCommitCoordinatorClient]] subclass that overrides [[registerTable]] to auto-assign
++   * a UC table ID, simulating what the UC catalog does during CREATE TABLE.
++   */
++  private class TestUCBackedCommitCoordinator(ucClient: InMemoryUCClient)
++    extends UCCommitCoordinatorClient(Collections.emptyMap(), ucClient) {
++
++    @volatile var lastRegisteredTableId: String = _
++
++    /**
++     * Delta blocks setting `COORDINATED_COMMITS_TABLE_CONF` in TBLPROPERTIES, so this trait
++     * simulates what the real UC catalog does: a [[CatalogOwnedCommitCoordinatorBuilder]] returns
++     * a single [[TestUCBackedCommitCoordinator]] instance whose [[registerTable]] auto-assigns a
++     * UUID.  Returning the same instance from every [[build]]/[[buildForCatalog]] call ensures
++     * that [[UCCommitCoordinatorClient.semanticEquals]] (which uses reference equality on `conf`)
++     * returns true and Delta does not reject intra-test metadata updates.
++     */
++    override def registerTable(
++        logPath: Path,
++        tableIdentifier: Optional[UCTableIdentifier],
++        currentVersion: Long,
++        currentMetadata: AbstractMetadata,
++        currentProtocol: AbstractProtocol): java.util.Map[String, String] = {
++      val tableId = UUID.randomUUID().toString
++      lastRegisteredTableId = tableId
++      Map(UCCommitCoordinatorClient.UC_TABLE_ID_KEY -> tableId).asJava
++    }
++  }
++
++  protected var ucCommitCoordinator: InMemoryUCCommitCoordinator = _
++  private var testCoordinator: TestUCBackedCommitCoordinator = _
++
++  abstract override def beforeEach(): Unit = {
++    super.beforeEach()
++    DeltaLog.clearCache()
++    CommitCoordinatorProvider.clearAllBuilders()
++    ucCommitCoordinator = new InMemoryUCCommitCoordinator()
++    val ucClient = new InMemoryUCClient("test-metastore", ucCommitCoordinator)
++    testCoordinator = new TestUCBackedCommitCoordinator(ucClient)
++    CommitCoordinatorProvider.registerBuilder(new CatalogOwnedCommitCoordinatorBuilder {
++      override def getName: String = UCCommitCoordinatorBuilder.getName
++      override def build(
++          spark: SparkSession, conf: Map[String, String]): JCommitCoordinatorClient =
++        testCoordinator
++      override def buildForCatalog(
++          spark: SparkSession, catalogName: String): JCommitCoordinatorClient =
++        testCoordinator
++    })
++  }
++
++  abstract override def afterEach(): Unit = {
++    CommitCoordinatorProvider.clearAllBuilders()
++    DeltaLog.clearCache()
++    super.afterEach()
++  }
++
++  /**
++   * Returns the TBLPROPERTIES SQL fragment required to enable the UC commit coordinator.
++   * Concrete suites should append this to their [[extraTableProperties]] override.
++   */
++  def requiredTableProperties: String =
++    s", '${COORDINATED_COMMITS_COORDINATOR_NAME.key}' = '${UCCommitCoordinatorBuilder.getName}'" +
++      s", '${COORDINATED_COMMITS_COORDINATOR_CONF.key}' = " +
++      s"'${JsonUtils.toJson(Map.empty[String, String])}'"
++
++  override protected def readAndVerify(
++      table: String, fields: String, orderBy: String, expect: Seq[Row]): Unit = {
++    val tableId = testCoordinator.lastRegisteredTableId
++    assert(tableId != null,
++      s"No table UUID assigned for '$table' - table was not created with CC properties")
++    val schema = DeltaLog.forTable(spark, TableIdentifier(table)).update().schema
++    val uniformMetadata = ucCommitCoordinator.getUniformMetadata(tableId)
++    assert(uniformMetadata.isDefined,
++      s"No UniForm metadata found for table '$table' (ID $tableId)")
++    assert(uniformMetadata.get.getIcebergMetadata.isPresent,
++      s"No Iceberg metadata found for table '$table' (ID $tableId)")
++    val icebergMetadataPath = uniformMetadata.get.getIcebergMetadata.get.getMetadataLocation
++    verifyReadByPath(icebergMetadataPath, schema, fields, orderBy, expect)
++  }
++}
++
++/**
++ * Concrete E2E suite that runs all [[UniFormE2EIcebergSuiteBase]] tests with tables backed
++ * by an in-memory UC commit coordinator, reading results via the native Iceberg reader.
++ */
++class UniFormE2EIcebergUCSuite extends UniFormE2EIcebergSuiteBase
++    with WriteDeltaUCCCReadIceberg {
++  // No test should go here. Please add tests in [[UniFormE2EIcebergSuiteBase]]
++  override def extraTableProperties(compatVersion: Int): String =
++    super.extraTableProperties(compatVersion) + requiredTableProperties
++}
\ No newline at end of file
kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
@@ -0,0 +1,49 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
+     public static final String FORMAT_HUDI = "hudi";
+   }
+ 
++  /**
++   * The set of compression codecs that Kernel currently recognizes and enforces. This is
++   * intentionally strict for now. In the future we may add new codecs or relax validation to allow
++   * any codec string.
++   */
++  private static final Set<String> VALID_COMPRESSION_CODECS =
++      Collections.unmodifiableSet(
++          new HashSet<>(
++              Arrays.asList("uncompressed", "none", "snappy", "gzip", "lz4", "lz4_raw", "zstd")));
++
+   private static final Collection<String> ALLOWED_UNIFORM_FORMATS =
+       Collections.unmodifiableList(
+           Arrays.asList(UniversalFormats.FORMAT_HUDI, UniversalFormats.FORMAT_ICEBERG));
+           "needs to be a boolean.",
+           true);
+ 
++  /**
++   * Compression codec writers should use for new Parquet data and checkpoint files. Changing this
++   * property does not affect existing files; a table may contain files written with different
++   * codecs.
++   *
++   * <p>Valid values (case-insensitive): uncompressed, none, snappy, gzip, lz4, lz4_raw, zstd.
++   */
++  public static final TableConfig<String> PARQUET_COMPRESSION_CODEC =
++      new TableConfig<>(
++          "delta.parquet.compression.codec",
++          "snappy",
++          v -> v.toLowerCase(Locale.ROOT),
++          VALID_COMPRESSION_CODECS::contains,
++          "needs to be one of: 'uncompressed', 'none', 'snappy', 'gzip',"
++              + " 'lz4', 'lz4_raw', 'zstd'.",
++          true /* editable */);
++
+   public static final TableConfig<String> MATERIALIZED_ROW_ID_COLUMN_NAME =
+       new TableConfig<>(
+           "delta.rowTracking.materializedRowIdColumnName",
+               addConfig(this, MATERIALIZED_ROW_ID_COLUMN_NAME);
+               addConfig(this, MATERIALIZED_ROW_COMMIT_VERSION_COLUMN_NAME);
+               addConfig(this, VARIANT_SHREDDING_ENABLED);
++              addConfig(this, PARQUET_COMPRESSION_CODEC);
+ 
+               // The below configs do not yet have their behavior correctly implemented in Kernel.
+               addConfig(this, DATA_SKIPPING_STATS_COLUMNS);
\ No newline at end of file
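The `PARQUET_COMPRESSION_CODEC` definition above pairs a lowercasing parser with a membership check, which is why the tests accept `SNAPPY` and `LZ4_RAW` and read them back in lowercase. A minimal sketch of that normalize-then-validate pattern follows. `CodecSketch` and `normalizeCodec` are hypothetical names for illustration, the ordering (normalize first, then validate) is an assumption inferred from the tests, and none of this is the Kernel `TableConfig` API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper showing normalize-then-validate; not the Kernel TableConfig API.
final class CodecSketch {

  private static final Set<String> VALID_CODECS =
      Collections.unmodifiableSet(
          new HashSet<>(
              Arrays.asList("uncompressed", "none", "snappy", "gzip", "lz4", "lz4_raw", "zstd")));

  // Lowercases with Locale.ROOT (locale-independent), then checks set membership.
  static String normalizeCodec(String raw) {
    String normalized = raw.toLowerCase(Locale.ROOT);
    if (!VALID_CODECS.contains(normalized)) {
      throw new IllegalArgumentException(
          "Invalid codec '" + raw + "': needs to be one of " + VALID_CODECS);
    }
    return normalized;
  }
}
```

Using `Locale.ROOT` keeps the lowercasing independent of the JVM's default locale, so the same stored value normalizes identically on every machine.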
kernel/kernel-api/src/main/java/io/delta/kernel/internal/actions/DeletionVectorDescriptor.java
@@ -0,0 +1,11 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/actions/DeletionVectorDescriptor.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/actions/DeletionVectorDescriptor.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/actions/DeletionVectorDescriptor.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/actions/DeletionVectorDescriptor.java
+   public String getUniqueId() {
+     String uniqueFileId = storageType + pathOrInlineDv;
+     if (offset.isPresent()) {
+-      return uniqueFileId + "@" + offset;
++      return uniqueFileId + "@" + offset.get();
+     } else {
+       return uniqueFileId;
+     }
\ No newline at end of file
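The `getUniqueId()` fix above hinges on how Java string concatenation treats `Optional`: concatenating the wrapper calls `Optional.toString()`, which yields `Optional[4]` rather than `4`. A minimal sketch of the before and after behavior follows. `UniqueIdSketch` and its two methods are hypothetical stand-ins written for illustration, not the Kernel class itself.

```java
import java.util.Optional;

// Hypothetical stand-in for DeletionVectorDescriptor.getUniqueId(); illustration only.
final class UniqueIdSketch {

  // Old behavior: concatenating the Optional itself appends its toString() form.
  static String buggyUniqueId(String storageType, String pathOrInlineDv, Optional<Integer> offset) {
    String uniqueFileId = storageType + pathOrInlineDv;
    return offset.isPresent() ? uniqueFileId + "@" + offset : uniqueFileId;
  }

  // Fixed behavior: unwrap the Optional so only the raw integer is appended.
  static String fixedUniqueId(String storageType, String pathOrInlineDv, Optional<Integer> offset) {
    String uniqueFileId = storageType + pathOrInlineDv;
    return offset.isPresent() ? uniqueFileId + "@" + offset.get() : uniqueFileId;
  }
}
```

With storage type `u`, path `ab`, and `Optional.of(4)`, the first method returns `uab@Optional[4]` and the second returns `uab@4`. Both return `uab` when the offset is empty, so the bug only shows up for descriptors that carry an offset.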
kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
@@ -0,0 +1,13 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
+               StructType.class));
+ 
+   private static final Set<Class<? extends DataType>> V3_SUPPORTED_TYPES =
+-      Stream.concat(V2_SUPPORTED_TYPES.stream(), Stream.of(VariantType.class))
++      Stream.concat(
++              V2_SUPPORTED_TYPES.stream(),
++              Stream.of(VariantType.class, GeometryType.class, GeographyType.class))
+           .collect(Collectors.toSet());
+ 
+   protected static final IcebergCompatCheck V2_CHECK_HAS_SUPPORTED_TYPES =
\ No newline at end of file
kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
@@ -0,0 +1,10 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
+                   VARIANT_SHREDDING_PREVIEW_RW_FEATURE,
+                   VARIANT_RW_PREVIEW_FEATURE,
+                   ALLOW_COLUMN_DEFAULTS_W_FEATURE,
++                  GEOSPATIAL_RW_FEATURE,
+                   // Also allow writerV1 features for backward compatibility.
+                   //
+                   // Note: We already enforce that these features cannot be enabled
\ No newline at end of file
kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
@@ -0,0 +1,22 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
+ import io.delta.kernel.utils.CloseableIterator;
+ import io.delta.kernel.utils.FileStatus;
+ import java.io.IOException;
++import java.io.InterruptedIOException;
+ import java.io.UncheckedIOException;
+ import java.util.*;
+ import java.util.stream.Collectors;
+       throw new IllegalStateException("Can't call `next` on a closed iterator.");
+     }
+     if (Thread.currentThread().isInterrupted()) {
+-      throw new IllegalStateException("Thread was interrupted");
++      // Throw a typed InterruptedIOException (wrapped, since next() does not declare checked
++      // exceptions) so engines whose interrupt-handling recognizes standard JDK interrupt types
++      // (e.g. Spark's StreamExecution.isInterruptionException) treat this as a clean shutdown
++      // rather than a real error.
++      throw new UncheckedIOException(new InterruptedIOException("Thread was interrupted"));
+     }
+ 
+     if (!hasNext()) {
\ No newline at end of file
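The `ActionsIterator` change above swaps an untyped `IllegalStateException` for an `InterruptedIOException` wrapped in `UncheckedIOException`, so that a caller inspecting the cause chain can tell an interrupt from a real failure. A rough sketch of both sides follows. `InterruptSketch`, `failIfInterrupted`, and `isInterruption` are hypothetical names, and the cause-chain walk is only an assumption about how an engine might classify the exception; it is not Spark's actual `isInterruptionException` logic.

```java
import java.io.InterruptedIOException;
import java.io.UncheckedIOException;

// Hypothetical sketch; the class and method names are illustrative, not Kernel or Spark APIs.
final class InterruptSketch {

  // Mirrors the check in next(): surface an interrupt as a typed, wrapped JDK exception.
  static void failIfInterrupted() {
    if (Thread.currentThread().isInterrupted()) {
      throw new UncheckedIOException(new InterruptedIOException("Thread was interrupted"));
    }
  }

  // Walks the cause chain looking for a standard JDK interrupt type.
  static boolean isInterruption(Throwable t) {
    for (Throwable cur = t; cur != null; cur = cur.getCause()) {
      if (cur instanceof InterruptedException || cur instanceof InterruptedIOException) {
        return true;
      }
    }
    return false;
  }
}
```

Under this sketch, the old `IllegalStateException("Thread was interrupted")` would not be classified as an interrupt, because only its message hints at the cause and its type does not.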
kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
@@ -0,0 +1,11 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
+     }
+   }
+ 
+-  static final TableFeature GEOSPATIAL_RW_FEATURE = new GeoSpatialTableFeature();
++  public static final TableFeature GEOSPATIAL_RW_FEATURE = new GeoSpatialTableFeature();
+ 
+   private static class GeoSpatialTableFeature extends TableFeature.ReaderWriterFeature
+       implements FeatureAutoEnabledByMetadata {
\ No newline at end of file
kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
@@ -0,0 +1,73 @@
+diff --git a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
+--- a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
++++ b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
+ 
+ import scala.collection.JavaConverters._
+ 
+-import io.delta.kernel.exceptions.KernelException
++import io.delta.kernel.exceptions.{InvalidConfigurationValueException, KernelException}
+ 
+ import org.scalatest.funsuite.AnyFunSuite
+ 
+         TableConfig.IN_COMMIT_TIMESTAMP_ENABLEMENT_TIMESTAMP.getKey -> "1",
+         TableConfig.COLUMN_MAPPING_MODE.getKey -> "name",
+         TableConfig.ICEBERG_COMPAT_V2_ENABLED.getKey -> "true",
+-        TableConfig.UNIVERSAL_FORMAT_ENABLED_FORMATS.getKey -> "iceberg").asJava)
++        TableConfig.UNIVERSAL_FORMAT_ENABLED_FORMATS.getKey -> "iceberg",
++        TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> "snappy").asJava)
+   }
+ 
+   test("check TableConfig.MAX_COLUMN_ID.editable is false") {
+     val formats = TableConfig.UNIVERSAL_FORMAT_ENABLED_FORMATS.fromMetadata(config)
+     assert(formats == Set("iceberg", "hudi").asJava)
+   }
++
++  test("PARQUET_COMPRESSION_CODEC - valid values accepted including mixed case") {
++    val validValues = Seq(
++      "snappy",
++      "SNAPPY",
++      "ZSTD",
++      "gzip",
++      "GZIP",
++      "lz4",
++      "lz4_raw",
++      "LZ4_RAW",
++      "uncompressed",
++      "UNCOMPRESSED",
++      "none",
++      "NONE",
++      "zstd")
++    validValues.foreach { codec =>
++      TableConfig.validateAndNormalizeDeltaProperties(
++        Map(TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> codec).asJava)
++    }
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - invalid value throws InvalidConfigurationValueException") {
++    val ex = intercept[InvalidConfigurationValueException] {
++      TableConfig.validateAndNormalizeDeltaProperties(
++        Map(TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> "invalid").asJava)
++    }
++    assert(ex.getMessage.contains("delta.parquet.compression.codec"))
++    assert(ex.getMessage.contains("invalid"))
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - fromMetadata returns lowercase regardless of stored case") {
++    val config = Map(TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> "SNAPPY").asJava
++    val result = TableConfig.PARQUET_COMPRESSION_CODEC.fromMetadata(config)
++    assert(result === "snappy")
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - fromMetadata returns snappy when property absent") {
++    val config = Map.empty[String, String].asJava
++    val result = TableConfig.PARQUET_COMPRESSION_CODEC.fromMetadata(config)
++    assert(result === "snappy")
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - validation normalizes key case") {
++    val result = TableConfig.validateAndNormalizeDeltaProperties(
++      Map("DELTA.PARQUET.COMPRESSION.CODEC" -> "snappy").asJava)
++    assert(result.containsKey("delta.parquet.compression.codec"))
++    assert(result.get("delta.parquet.compression.codec") === "snappy")
++  }
+ }
\ No newline at end of file
kernel/kernel-api/src/test/scala/io/delta/kernel/internal/actions/DeletionVectorDescriptorSuite.scala
@@ -0,0 +1,34 @@
+diff --git a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/actions/DeletionVectorDescriptorSuite.scala b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/actions/DeletionVectorDescriptorSuite.scala
+--- a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/actions/DeletionVectorDescriptorSuite.scala
++++ b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/actions/DeletionVectorDescriptorSuite.scala
+     }
+   }
+ 
++  // Regression test for https://github.com/delta-io/delta/issues/6261:
++  // getUniqueId() must unwrap Optional<Integer> offset instead of concatenating
++  // its toString() representation (e.g. "Optional[4]" instead of "4").
++  testCases.foreach { case (storageType, pathOrInlineDv, offset, sizeInBytes, cardinality) =>
++    test(s"getUniqueId - $storageType storage type") {
++      val dv = new DeletionVectorDescriptor(
++        storageType,
++        pathOrInlineDv,
++        offset.map(Integer.valueOf).map(Optional.of[Integer]).getOrElse(Optional.empty[Integer]()),
++        sizeInBytes,
++        cardinality)
++
++      val uniqueId = dv.getUniqueId
++      val expectedFileId = storageType + pathOrInlineDv
++      offset match {
++        case Some(o) =>
++          assert(uniqueId === s"$expectedFileId@$o")
++          // Verify the offset is the raw integer, not "Optional[...]"
++          assert(!uniqueId.contains("Optional"))
++        case None =>
++          assert(uniqueId === expectedFileId)
++      }
++    }
++  }
++
+   test("serializeToBase64 throws for non-inline DV without offset") {
+     val ex = intercept[IllegalArgumentException] {
+       val dv = new DeletionVectorDescriptor(
\ No newline at end of file

... (truncated, output exceeded 60000 bytes)

Reproduce locally: git range-diff 9436fca..dfe61d9 e43bf65..f3359c7 | Disable: git config gitstack.push-range-diff false

Collaborator

@TimothyW553 left a comment

Hi @PorridgeSwim, approved with one comment. Please confirm, update the branch, and ping me to merge when CI is green.

public void testProtocolAdapterWithTableFeatures() {
// Reader features: supported but empty (version >= 3 means features are supported, even with
// an empty set). Writer features: supported and populated.
Set<String> readerFeatures = Collections.emptySet();

Please include v2Checkpoint here too, because it is a reader-writer feature and this test should use a valid protocol.

Suggested change
- Set<String> readerFeatures = Collections.emptySet();
+ Set<String> readerFeatures = Collections.singleton("v2Checkpoint");
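
The reasoning behind the suggestion is that the Delta protocol lists a reader-writer feature in both `readerFeatures` and `writerFeatures`, so a protocol that names `v2Checkpoint` only on the writer side is not a valid one. A rough sketch of that consistency check is below. `ProtocolSketch`, `READER_WRITER_FEATURES`, and `isConsistent` are hypothetical names for illustration, the feature list is deliberately partial, and none of this is the Kernel `Protocol` API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of a protocol consistency check; not the Kernel Protocol API.
final class ProtocolSketch {

  // Partial, illustrative list of reader-writer features.
  private static final Set<String> READER_WRITER_FEATURES =
      Collections.unmodifiableSet(
          new HashSet<>(Arrays.asList("v2Checkpoint", "deletionVectors", "columnMapping")));

  // True when every reader-writer feature named on the writer side is also on the reader side.
  static boolean isConsistent(Set<String> readerFeatures, Set<String> writerFeatures) {
    for (String feature : writerFeatures) {
      if (READER_WRITER_FEATURES.contains(feature) && !readerFeatures.contains(feature)) {
        return false;
      }
    }
    return true;
  }
}
```

Assuming the test's writer set contains `v2Checkpoint` (the excerpt does not show it), the original `Collections.emptySet()` would fail this check, while `Collections.singleton("v2Checkpoint")` would pass it.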

@PorridgeSwim PorridgeSwim force-pushed the stack/SparkMetadataAdapter branch from f3359c7 to 7bbf917 Compare May 4, 2026 06:25
@PorridgeSwim
Collaborator Author

Range-diff: master (f3359c7 -> 7bbf917)
.github/actions/setup-unitycatalog/action.yml
@@ -0,0 +1,40 @@
+diff --git a/.github/actions/setup-unitycatalog/action.yml b/.github/actions/setup-unitycatalog/action.yml
+new file mode 100644
+--- /dev/null
++++ b/.github/actions/setup-unitycatalog/action.yml
++name: "Set up pinned Unity Catalog build"
++description: >-
++  Publishes Unity Catalog jars from the commit pinned in project/scripts/setup_unitycatalog_main.sh
++  (the UC_PIN_SHA= line) to the runner's local Ivy / Maven caches, using GitHub Actions cache so the
++  slow UC build only runs the first time a pin is seen.
++
++runs:
++  using: "composite"
++  steps:
++    - name: Restore pinned UC cache
++      id: uc-cache
++      uses: actions/cache/restore@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
++      with:
++        # ~/.ivy2/local is what sbt publishLocal writes to. ~/.m2 is for publishM2.
++        path: |
++          ~/.ivy2/local
++          ~/.m2/repository/io/unitycatalog
++        # Cache key hashes the setup script, so bumping UC_PIN_SHA (or any other script change)
++        # invalidates the cache.
++        key: uc-jars-${{ runner.os }}-${{ hashFiles('project/scripts/setup_unitycatalog_main.sh') }}
++    - name: Build Unity Catalog from pinned SHA
++      shell: bash
++      run: bash project/scripts/setup_unitycatalog_main.sh
++    - name: Save pinned UC cache
++      # Only attempt a save when the restore missed. When multiple parallel matrix jobs all see
++      # a cache miss (first CI run after a pin bump), only the first to reach this step wins the
++      # GHA cache reservation; the rest log "another job may be creating this cache" warnings.
++      # Gating on cache-hit means cached runs (the common steady state) skip the save entirely,
++      # which eliminates those warnings on every subsequent run.
++      if: steps.uc-cache.outputs.cache-hit != 'true'
++      uses: actions/cache/save@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
++      with:
++        path: |
++          ~/.ivy2/local
++          ~/.m2/repository/io/unitycatalog
++        key: uc-jars-${{ runner.os }}-${{ hashFiles('project/scripts/setup_unitycatalog_main.sh') }}
\ No newline at end of file
.github/workflows/build.yaml
@@ -0,0 +1,29 @@
+diff --git a/.github/workflows/build.yaml b/.github/workflows/build.yaml
+--- a/.github/workflows/build.yaml
++++ b/.github/workflows/build.yaml
+ name: "Delta Build"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+             ~/.cache/coursier
+           key: delta-sbt-cache-cross-spark
+ 
++      # publishM2 compiles every aggregated project, including storage, which has
++      # unitycatalog-client as a compile-scope dependency. Publish the pinned UC build locally
++      # first so Delta compiles against the UC APIs it actually targets.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
++
+       - name: Run cross-Spark build test
+         run: python project/tests/test_cross_spark_publish.py
+ 
\ No newline at end of file
.github/workflows/disabled_iceberg_test.yaml
@@ -0,0 +1,80 @@
+diff --git a/.github/workflows/disabled_iceberg_test.yaml b/.github/workflows/disabled_iceberg_test.yaml
+deleted file mode 100644
+--- a/.github/workflows/disabled_iceberg_test.yaml
++++ /dev/null
+-name: "Delta Iceberg Latest [DISABLED]"
+-# SECURITY: All Python/PySpark workflows disabled due to active supply chain attack
+-# targeting OSS package ecosystems (PyPI). C2 domains: models.litellm.cloud, checkmarx.zone
+-# Date disabled: 2026-03-25
+-# To re-enable: remove 'if: false' from all jobs and restore original triggers
+-on:
+-  workflow_dispatch: # manual-only, auto triggers removed
+-  # To re-enable, replace the above line with:
+-  # push:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
+-  # pull_request:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
+-env:
+-  # SECURITY: Temporal lockdown — refuse any package version published after this date.
+-  # This date is a pre-attack baseline (before the active PyPI supply chain attack).
+-  UV_EXCLUDE_NEWER: "2026-03-10T00:00:00Z"
+-jobs:
+-  test:
+-    if: false # SECURITY: disabled - supply chain attack mitigation
+-    name: "DIL: Scala ${{ matrix.scala }}"
+-    runs-on: ubuntu-24.04
+-    strategy:
+-      matrix:
+-        # These Scala versions must match those in the build.sbt
+-        scala: [2.13.16]
+-    env:
+-      SCALA_VERSION: ${{ matrix.scala }}
+-    steps:
+-      - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
+-      - name: install java
+-        uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
+-        with:
+-          distribution: "zulu"
+-          java-version: "17"
+-      - name: Cache Scala, SBT
+-        uses: actions/cache@6f8efc29b200d32929f49075959781ed54ec270c # v3.5.0
+-        with:
+-          path: |
+-            ~/.sbt
+-            ~/.ivy2
+-            ~/.cache/coursier
+-          # Change the key if dependencies are changed. For each key, GitHub Actions will cache the
+-          # the above directories when we use the key for the first time. After that, each run will
+-          # just use the cache. The cache is immutable so we need to use a new key when trying to
+-          # cache new stuff.
+-          key: delta-sbt-cache-spark4.0-scala${{ matrix.scala }}
+-      - name: Set up uv
+-        run: bash project/scripts/install-uv.sh
+-      - name: Install Job dependencies
+-        run: |
+-          sudo apt-get update
+-          sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl git
+-          sudo apt install libedit-dev
+-          # buf v1.28.1 (2023-11-15) — SHA from official release asset:
+-          # https://github.com/bufbuild/buf/releases/download/v1.28.1/sha256.txt
+-          BUF_VERSION="v1.28.1"
+-          BUF_SHA256="870cf492d381a967d36636fdee9da44b524ea62aad163659b8dbf16a7da56987"
+-          curl -fsSL -o buf-Linux-x86_64.tar.gz \
+-            "https://github.com/bufbuild/buf/releases/download/${BUF_VERSION}/buf-Linux-x86_64.tar.gz"
+-          echo "${BUF_SHA256}  buf-Linux-x86_64.tar.gz" | sha256sum -c -
+-          mkdir -p ~/buf
+-          tar -xzf buf-Linux-x86_64.tar.gz -C ~/buf --strip-components 1
+-          rm buf-Linux-x86_64.tar.gz
+-          uv python install 3.8
+-          uv venv .venv --python 3.8
+-      - name: Run Scala/Java and Python tests
+-        # when changing TEST_PARALLELISM_COUNT make sure to also change it in spark_master_test.yaml
+-        run: |
+-          source .venv/bin/activate
+-          TEST_PARALLELISM_COUNT=4 python run-tests.py --group iceberg --spark-version 4.0
\ No newline at end of file
.github/workflows/spark_test_uc_master.yaml
@@ -0,0 +1,62 @@
+diff --git a/.github/workflows/spark_test_uc_master.yaml b/.github/workflows/disabled_spark_test_uc_master.yaml
+similarity index 61%
+rename from .github/workflows/spark_test_uc_master.yaml
+rename to .github/workflows/disabled_spark_test_uc_master.yaml
+--- a/.github/workflows/spark_test_uc_master.yaml
++++ b/.github/workflows/disabled_spark_test_uc_master.yaml
+ ##
+ ## To make this blocking, add the job name to the required status checks in
+ ## the branch protection rules for `master`.
++##
++## DISABLED while Delta master builds against a pinned UC master SHA — the main Delta Spark
++## workflow already exercises UC master at that pin, so a parallel floating-main workflow would
++## be redundant. To re-enable (once Delta goes back to a released UC version): drop the
++## `[DISABLED]` suffix from `name`, replace `workflow_dispatch:` with the original push /
++## pull_request triggers below, remove `if: false` from the job, and rename the file back to
++## `spark_test_uc_master.yaml`.
+ 
+-name: "Delta Spark (UC Master)"
++name: "Delta Spark (UC Master) [DISABLED]"
+ on:
+-  push:
+-    paths-ignore:
+-      - '**.md'
+-      - '**.txt'
+-  pull_request:
+-    paths-ignore:
+-      - '**.md'
+-      - '**.txt'
++  workflow_dispatch: # manual-only while disabled
++  # Original triggers, restore when re-enabling:
++  # push:
++  #   branches: [master, branch-*]
++  #   paths-ignore:
++  #     - '**.md'
++  #     - '**.txt'
++  # pull_request:
++  #   branches: [master, branch-*]
++  #   paths-ignore:
++  #     - '**.md'
++  #     - '**.txt'
+ 
+ jobs:
+   test-uc-master:
+     name: "[Non Blocking] UC Integration Tests (UC Main)"
++    # Guard against accidental runs while disabled. Remove when re-enabling.
++    if: false
+     runs-on: ubuntu-24.04
+     steps:
+       - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3
+           key: delta-sbt-cache-uc-master
+       - name: Build Unity Catalog from source
+         id: uc-build
++        # UC_REF=main builds the floating-main canary instead of the pinned SHA, which is the
++        # point of this workflow -- early warning of upcoming UC incompatibilities.
+         run: |
+-          bash project/scripts/setup_unitycatalog_main.sh
+-          UC_VERSION=$(cat /tmp/unitycatalog/.uc-version)
++          UC_REF=main bash project/scripts/setup_unitycatalog_main.sh
++          UC_VERSION=$(UC_REF=main bash project/scripts/setup_unitycatalog_main.sh --print-version)
+           echo "uc_version=$UC_VERSION" >> $GITHUB_OUTPUT
+           echo "UC version: $UC_VERSION"
+       - name: Run UC integration tests
\ No newline at end of file
.github/workflows/flink_test.yaml
@@ -0,0 +1,37 @@
+diff --git a/.github/workflows/flink_test.yaml b/.github/workflows/flink_test.yaml
+--- a/.github/workflows/flink_test.yaml
++++ b/.github/workflows/flink_test.yaml
+ 
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'flink/**'
+       - 'kernel/**'
+       - '!**/*.md'
+       - '!**/*.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'flink/**'
+       - 'kernel/**'
+   cancel-in-progress: true
+ 
+ env:
+-  # Point SBT to our cache directories for consistency
++  # Point SBT to our cache directories for consistency.
+   SBT_OPTS: "-Dsbt.coursier.home-dir=/home/runner/.cache/coursier -Dsbt.ivy.home=/home/runner/.ivy2"
+ 
+ jobs:
+           else
+             echo "❌ Cache MISS - will download dependencies"
+           fi
++      # flink has unitycatalog-client as a compile-scope dep and flink tests exercise UC.
++      # Publish the pinned UC build locally before sbt runs.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run unit tests
+         run: |
+           build/sbt flinkGroup/test
\ No newline at end of file
.github/workflows/iceberg_test.yaml
@@ -0,0 +1,58 @@
+diff --git a/.github/workflows/iceberg_test.yaml b/.github/workflows/iceberg_test.yaml
+new file mode 100644
+--- /dev/null
++++ b/.github/workflows/iceberg_test.yaml
++name: "Delta Iceberg Latest"
++on:
++  push:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
++  pull_request:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
++jobs:
++  test:
++    name: "DIL: Scala ${{ matrix.scala }}"
++    runs-on: ubuntu-24.04
++    strategy:
++      matrix:
++        # These Scala versions must match those in the build.sbt
++        scala: [2.13.16]
++    env:
++      SCALA_VERSION: ${{ matrix.scala }}
++    steps:
++      - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
++      - name: install java
++        uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
++        with:
++          distribution: "zulu"
++          java-version: "17"
++      - name: Cache Scala, SBT
++        uses: actions/cache@6f8efc29b200d32929f49075959781ed54ec270c # v3.5.0
++        with:
++          path: |
++            ~/.sbt
++            ~/.ivy2
++            ~/.cache/coursier
++          # Change the key if dependencies are changed. For each key, GitHub Actions will cache the
++          # the above directories when we use the key for the first time. After that, each run will
++          # just use the cache. The cache is immutable so we need to use a new key when trying to
++          # cache new stuff.
++          key: delta-sbt-cache-spark4.0-scala${{ matrix.scala }}
++      - name: Set up uv
++        run: bash project/scripts/install-uv.sh
++      - name: Install Python via uv
++        # No UV_EXCLUDE_NEWER needed: this workflow installs zero pip packages.
++        # Python is only used to run the stdlib-only run-tests.py driver.
++        run: |
++          uv python install 3.8
++          uv venv .venv --python 3.8
++      - name: Run Scala/Java and Python tests
++        # When changing TEST_PARALLELISM_COUNT, make sure to also change it in spark_master_test.yaml
++        run: |
++          source .venv/bin/activate
++          TEST_PARALLELISM_COUNT=4 python run-tests.py --group iceberg --spark-version 4.0
\ No newline at end of file
.github/workflows/kernel_docs.yaml
@@ -0,0 +1,11 @@
+diff --git a/.github/workflows/kernel_docs.yaml b/.github/workflows/kernel_docs.yaml
+--- a/.github/workflows/kernel_docs.yaml
++++ b/.github/workflows/kernel_docs.yaml
+           java-version: "11"
+       - name: Generate docs
+         run: |
+-          build/sbt kernelGroup/unidoc
++          build/sbt -DuseDefaultUnityCatalogReleaseVersion=true kernelGroup/unidoc
+           mkdir -p kernel/docs/snapshot/kernel-api/java
+           mkdir -p kernel/docs/snapshot/kernel-defaults/java
+           cp -r kernel/kernel-api/target/javaunidoc/. kernel/docs/snapshot/kernel-api/java/
\ No newline at end of file
.github/workflows/kernel_test.yaml
@@ -0,0 +1,47 @@
+diff --git a/.github/workflows/kernel_test.yaml b/.github/workflows/kernel_test.yaml
+--- a/.github/workflows/kernel_test.yaml
++++ b/.github/workflows/kernel_test.yaml
+ 
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+           else
+             echo "❌ Cache MISS - will download dependencies"
+           fi
++      # run-tests.py invokes sbt with `++ 2.13.16`, which triggers cross-version dependency resolution
++      # across every project (including kernelUnityCatalog). Publish the pinned UC build locally first
++      # so that resolution doesn't miss.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run unit tests
+         run: |
+           python run-tests.py --group kernel --coverage --shard ${{ matrix.shard }}
+     runs-on: ubuntu-24.04
+     steps:
+       - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
+-      # Run integration tests with JDK 11, as they have no Spark dependency
+-      - name: install java
++      # The integration test itself runs on JDK 11 (no Spark dependency), but UC's sbt build needs
++      # JDK 17, so we install 17 first, publish UC, then switch the active JDK to 11 for the actual
++      # test run.
++      - name: install java 17 for UC build
++        uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
++        with:
++          distribution: "zulu"
++          java-version: "17"
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
++      - name: install java 11 for integration test
+         uses: actions/setup-java@17f84c3641ba7b8f6deff6309fc4c864478f5d62 # v3.14.1
+         with:
+           distribution: "zulu"
\ No newline at end of file
.github/workflows/kernel_unitycatalog_test.yaml
@@ -0,0 +1,29 @@
+diff --git a/.github/workflows/kernel_unitycatalog_test.yaml b/.github/workflows/kernel_unitycatalog_test.yaml
+--- a/.github/workflows/kernel_unitycatalog_test.yaml
++++ b/.github/workflows/kernel_unitycatalog_test.yaml
+ name: "Kernel Unity Catalog"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'build.sbt'
+       - 'version.sbt'
+       - 'storage/**/*.java'
+       - '.github/workflows/kernel_unitycatalog_test.yaml'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths:
+       - 'build.sbt'
+       - 'version.sbt'
+         with:
+           distribution: "zulu"
+           java-version: "17"
++      # kernelUnityCatalog depends on unreleased UC APIs; publish the pinned UC build locally before
++      # sbt tries to resolve the dependency.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run Unity tests with coverage
+         run: |
+           ./build/sbt "++ ${{ env.SCALA_VERSION }}" clean coverage kernelUnityCatalog/test coverageAggregate coverageOff -v
\ No newline at end of file
.github/workflows/spark_examples_test.yaml
@@ -0,0 +1,27 @@
+diff --git a/.github/workflows/spark_examples_test.yaml b/.github/workflows/spark_examples_test.yaml
+--- a/.github/workflows/spark_examples_test.yaml
++++ b/.github/workflows/spark_examples_test.yaml
+ name: "Delta Spark Publishing and Examples"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+           sudo apt-get update
+           sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl git
+           sudo apt install libedit-dev
++      # `publishM2` and `++ <scala>` both resolve every project's deps, which includes
++      # sparkUnityCatalog; publish the pinned UC build locally before sbt runs.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Run Delta Spark Local Publishing and Examples Compilation
+         # examples/scala/build.sbt will compile against the local Delta release version (e.g. 3.2.0-SNAPSHOT).
+         # Thus, we need to publishM2 first so those jars are locally accessible.
\ No newline at end of file
.github/workflows/disabled_spark_python_test.yaml
@@ -0,0 +1,76 @@
+diff --git a/.github/workflows/disabled_spark_python_test.yaml b/.github/workflows/spark_python_test.yaml
+similarity index 71%
+rename from .github/workflows/disabled_spark_python_test.yaml
+rename to .github/workflows/spark_python_test.yaml
+--- a/.github/workflows/disabled_spark_python_test.yaml
++++ b/.github/workflows/spark_python_test.yaml
+-name: "Delta Spark Python [DISABLED]"
+-# SECURITY: All Python/PySpark workflows disabled due to active supply chain attack
+-# targeting OSS package ecosystems (PyPI). C2 domains: models.litellm.cloud, checkmarx.zone
+-# Date disabled: 2026-03-25
+-# To re-enable: remove 'if: false' from all jobs and restore original triggers
++name: "Delta Spark Python"
+ on:
+-  workflow_dispatch: # manual-only, auto triggers removed
+-  # To re-enable, replace the above line with:
+-  # push:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
+-  # pull_request:
+-  #   branches: [master]
+-  #   paths-ignore:
+-  #     - '**.md'
+-  #     - '**.txt'
++  push:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
++  pull_request:
++    branches: [master, branch-*]
++    paths-ignore:
++      - '**.md'
++      - '**.txt'
+ env:
+   # SECURITY: Temporal lockdown — refuse any package version published after this date.
+   # This date is a pre-attack baseline (before the active PyPI supply chain attack).
+   # Generate Spark versions matrix from CrossSparkVersions.scala
+   # This workflow tests against released versions only (no snapshots)
+   generate-matrix:
+-    if: false # SECURITY: disabled - supply chain attack mitigation
+     name: "Generate Released Spark Versions Matrix"
+     runs-on: ubuntu-24.04
+     outputs:
+           echo "Generated released Spark versions: $SPARK_VERSIONS"
+ 
+   test:
+-    if: false # SECURITY: disabled - supply chain attack mitigation
+     name: "DSP (${{ matrix.spark_version }})"
+     runs-on: ubuntu-24.04
+     needs: generate-matrix
+           key: delta-sbt-cache-spark${{ matrix.spark_version }}-scala${{ matrix.scala }}
+       - name: Set up uv
+         run: bash project/scripts/install-uv.sh
+-      - name: Install Job dependencies
++      - name: Set up buf
++        run: bash project/scripts/install-buf.sh
++      - name: Install Python and dependencies
+         run: |
+-          sudo apt-get update
+-          sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl git
+-          sudo apt install libedit-dev
+-          # buf v1.28.1 (2023-11-15) — SHA from official release asset:
+-          # https://github.com/bufbuild/buf/releases/download/v1.28.1/sha256.txt
+-          BUF_VERSION="v1.28.1"
+-          BUF_SHA256="870cf492d381a967d36636fdee9da44b524ea62aad163659b8dbf16a7da56987"
+-          curl -fsSL -o buf-Linux-x86_64.tar.gz \
+-            "https://github.com/bufbuild/buf/releases/download/${BUF_VERSION}/buf-Linux-x86_64.tar.gz"
+-          echo "${BUF_SHA256}  buf-Linux-x86_64.tar.gz" | sha256sum -c -
+-          mkdir -p ~/buf
+-          tar -xzf buf-Linux-x86_64.tar.gz -C ~/buf --strip-components 1
+-          rm buf-Linux-x86_64.tar.gz
+           uv python install 3.10
+           uv venv .venv --python 3.10
+           # Install hash-verified locked dependencies (see .github/ci-requirements/spark-python/)
\ No newline at end of file
.github/workflows/spark_test.yaml
@@ -0,0 +1,27 @@
+diff --git a/.github/workflows/spark_test.yaml b/.github/workflows/spark_test.yaml
+--- a/.github/workflows/spark_test.yaml
++++ b/.github/workflows/spark_test.yaml
+ name: "Delta Spark"
+ on:
+   push:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+   pull_request:
+-    branches: [master]
++    branches: [master, branch-*]
+     paths-ignore:
+       - '**.md'
+       - '**.txt'
+             ~/.ivy2
+             ~/.cache/coursier
+           key: delta-sbt-cache-spark${{ matrix.spark_version }}-scala${{ matrix.scala }}
++      # Delta's sparkUnityCatalog module (part of sparkGroup) depends on APIs that are only in
++      # unreleased UC. Publish the pinned UC build locally before sbt tries to resolve it.
++      - name: Set up pinned Unity Catalog
++        uses: ./.github/actions/setup-unitycatalog
+       - name: Scala structured logging style check
+         run: |
+           if [ -f ./dev/spark_structured_logging_style.py ]; then
\ No newline at end of file
.github/workflows/unidoc.yaml
@@ -0,0 +1,19 @@
+diff --git a/.github/workflows/unidoc.yaml b/.github/workflows/unidoc.yaml
+--- a/.github/workflows/unidoc.yaml
++++ b/.github/workflows/unidoc.yaml
+   name: "Unidoc"
+   on:
+     push:
+-      branches: [master]
++      branches: [master, branch-*]
+     pull_request:
+-      branches: [master]
++      branches: [master, branch-*]
+   jobs:
+     build:
+       name: "U: Scala ${{ matrix.scala }}"
+             java-version: "17"
+         - uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
+         - name: generate unidoc
+-          run: build/sbt "++ ${{ matrix.scala }}" unidoc
++          run: build/sbt -DuseDefaultUnityCatalogReleaseVersion=true "++ ${{ matrix.scala }}" unidoc
\ No newline at end of file
build.sbt
@@ -0,0 +1,162 @@
+diff --git a/build.sbt b/build.sbt
+--- a/build.sbt
++++ b/build.sbt
+   ).configureUnidoc()
+ 
+ 
+-val unityCatalogVersion = sys.props.getOrElse("unityCatalogVersion", "0.4.1")
++// Unity Catalog version. Three modes, in priority order:
++//
++//  1. `-DuseDefaultUnityCatalogReleaseVersion=true`: use `defaultUnityCatalogReleaseVersion`
++//     below -- the last released UC version on Maven Central. For workflows that don't actually
++//     need DRC APIs (e.g. unidoc, lint) and want to skip the pinned UC build. Shared across
++//     workflows by reading this single constant, so bumping is a one-line change here.
++//
++//  2. Release mode: set `unityCatalogReleaseVersion = Some("0.5.0")` (or whatever released
++//     version the release branch ships against). sbt resolves the coordinate from Maven Central
++//     like any other dependency.
++//
++//  3. Pinned mode (default): leave `unityCatalogReleaseVersion = None`. The version string
++//     comes from `setup_unitycatalog_main.sh --print-version`, which encodes both the pinned
++//     UC main SHA and UC's declared base version; the script is the single source of truth.
++//     The same script (without the flag) publishes the matching jars to ~/.ivy2/local when
++//     `ensurePinnedUnityCatalog` decides they're missing.
++//
++// Override with -DunityCatalogVersion=<anything> for ad-hoc experiments.
++val unityCatalogReleaseVersion: Option[String] = None
++val defaultUnityCatalogReleaseVersion = "0.4.1"
++val useDefaultUnityCatalogReleaseVersion: Boolean =
++  sys.props.getOrElse("useDefaultUnityCatalogReleaseVersion", "false").toBoolean
++val unityCatalogSetupScript = "project/scripts/setup_unitycatalog_main.sh"
++
++// Lazy so release-mode / useDefaultUnityCatalogReleaseVersion builds never shell out.
++lazy val pinnedUnityCatalogVersion: String = {
++  import scala.sys.process._
++  Process(Seq("bash", unityCatalogSetupScript, "--print-version")).!!.trim
++}
++val unityCatalogVersion: String = sys.props.getOrElse(
++  "unityCatalogVersion",
++  if (useDefaultUnityCatalogReleaseVersion) defaultUnityCatalogReleaseVersion
++  else unityCatalogReleaseVersion.getOrElse(pinnedUnityCatalogVersion))
++
+ val sparkUnityCatalogJacksonVersion = "2.15.4" // We are using Spark 4.0's Jackson version 2.15.x, to override Unity Catalog 0.3.0's version 2.18.x
+ 
++// Publishes the pinned UC jars to ~/.ivy2/local if they're not already cached there. Hooked
++// into `update` on the UC-dependent projects below, so plain `sbt testOnly ...` on a clean
++// checkout just works. No-op in release mode. Opt out with
++// `-Ddelta.autoBuildPinnedUnityCatalog=false`, in which case sbt errors with a pointer to the
++// setup script.
++val ensurePinnedUnityCatalog = taskKey[Unit](
++  "Publish the pinned UC jars locally if the Ivy coordinate isn't already cached.")
++
++// Extracted so the task body can read as a short guard rather than three nested ifs.
++def publishPinnedUnityCatalog(log: sbt.util.Logger, canary: java.io.File): Unit = {
++  val shouldAutoBuild =
++    sys.props.getOrElse("delta.autoBuildPinnedUnityCatalog", "true").toBoolean
++  if (!shouldAutoBuild) {
++    sys.error(
++      s"""|Pinned Unity Catalog jars are not published locally for coordinate
++          |$unityCatalogVersion.
++          |Auto-build is disabled (-Ddelta.autoBuildPinnedUnityCatalog=false).
++          |Run: bash $unityCatalogSetupScript""".stripMargin)
++  }
++  log.info(s"[UC] Pinned UC jars not found for coordinate $unityCatalogVersion.")
++  log.info(
++    s"[UC] Running $unityCatalogSetupScript - takes ~3-5 minutes on a cold cache, <1s on a warm one.")
++  import scala.sys.process._
++  val procLogger = ProcessLogger(
++    line => log.info(s"[UC setup] $line"),
++    line => log.warn(s"[UC setup] $line"))
++  val exit = Process(Seq("bash", unityCatalogSetupScript)).!(procLogger)
++  if (exit != 0) {
++    sys.error(
++      s"[UC] $unityCatalogSetupScript exited with code $exit. Run it manually to see full output.")
++  }
++  if (!canary.exists) {
++    sys.error(
++      s"[UC] $unityCatalogSetupScript succeeded but ${canary.getAbsolutePath} is still missing - " +
++        "the publish target layout may have changed.")
++  }
++}
++
++Global / ensurePinnedUnityCatalog := {
++  // Resolve the .value dependencies eagerly - sbt's task macro warns when
++  // `.value` appears inside conditional branches.
++  val log = streams.value.log
++  // No-op whenever the effective version resolves to something Maven Central can serve:
++  // release mode, -DuseDefaultUnityCatalogReleaseVersion=true, or -DunityCatalogVersion=<released>.
++  val usingReleasedVersion = useDefaultUnityCatalogReleaseVersion ||
++    sys.props.contains("unityCatalogVersion")
++  if (unityCatalogReleaseVersion.isEmpty && !usingReleasedVersion) {
++    val home = file(sys.props("user.home"))
++    // Check both layouts: a restored sbt cache can pre-populate ivy alone, leaving m2 empty -
++    // checking only ivy would silently skip the slow publish and break mvn-based consumers.
++    val ivy2Canary = home / ".ivy2" / "local" / "io.unitycatalog" /
++      "unitycatalog-client" / unityCatalogVersion / "ivys" / "ivy.xml"
++    val m2Canary = home / ".m2" / "repository" / "io" / "unitycatalog" /
++      "unitycatalog-client" / unityCatalogVersion /
++      s"unitycatalog-client-$unityCatalogVersion.pom"
++    if (!ivy2Canary.exists || !m2Canary.exists) {
++      publishPinnedUnityCatalog(log, ivy2Canary)
++    }
++  }
++}
++
+ lazy val sparkUnityCatalog = (project in file("spark/unitycatalog"))
+   .dependsOn(spark % "compile->compile;test->test;provided->provided")
+   .disablePlugins(ScalafmtPlugin)
+     javafmtCheckSettings(),
+     CrossSparkVersions.sparkDependentSettings(sparkVersion),
+ 
++    // Publish the pinned UC jars before sbt tries to resolve them.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     // This is a test-only module - no production sources
+     Compile / sources := Seq.empty,
+ 
+     libraryDependencies ++= Seq(
+       "org.apache.spark" %% "spark-sql" % sparkVersion.value % "provided",
+ 
+-      "io.delta" %% "delta-sharing-client" % "1.3.10",
++      "io.delta" %% "delta-sharing-client" % "1.3.11",
+ 
+       // Test deps
+       "org.scalatest" %% "scalatest" % scalaTestVersion % "test",
+     exportJars := false,
+     javafmtCheckSettings,
+     scalafmtCheckSettings,
+-    
++
+     libraryDependencies ++= Seq(
+       "org.openjdk.jmh" % "jmh-core" % "1.37" % "test",
+       "org.openjdk.jmh" % "jmh-generator-annprocess" % "1.37" % "test",
+     scalaStyleSettings,
+     scalafmtCheckSettings,
+ 
++    // Publish the pinned UC jars before sbt tries to resolve them.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     // Put the shaded kernel-api JAR on the classpath (compile & test)
+     Compile / unmanagedJars += (kernelApi / Compile / packageBin).value,
+     Test / unmanagedJars += (kernelApi / Compile / packageBin).value,
+       "com.fasterxml.jackson.datatype" % "jackson-datatype-jsr310" % "2.15.4" % "test",
+     ),
+ 
++    // Publish the pinned UC jars before sbt tries to resolve them. storage is the transitive
++    // UC-client entry point for most of the build graph (sparkV1, sparkV2, kernelDefaults, etc.
++    // all .dependsOn(storage)), so hooking here covers nearly every compile path.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     // Unidoc settings
+     unidocSourceFilePatterns += SourceFilePattern("/LogStore.java", "/CloseableIterator.java"),
+     TestParallelization.settings
+       "--add-opens=java.base/java.util=ALL-UNNAMED" // for Flink with Java 17.
+     ),
+     crossPaths := false,
++
++    // Publish the pinned UC jars before sbt tries to resolve them.
++    update := update.dependsOn(ensurePinnedUnityCatalog).value,
++
+     libraryDependencies ++= Seq(
+       "org.apache.flink" % "flink-core" % flinkVersion % "provided",
+       "org.apache.flink" % "flink-table-common" % flinkVersion % "provided",
\ No newline at end of file
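
Reviewer note on the `unityCatalogVersion` selection in `build.sbt` above: the precedence is an explicit `-DunityCatalogVersion` override, then the `useDefaultUnityCatalogReleaseVersion` flag, then `unityCatalogReleaseVersion`, and only last the pinned version, which is computed lazily so the other modes never shell out. The following is a minimal sketch of that ordering, not the build's actual code; the class `UcVersionResolutionDemo`, the method `resolve`, and every sample version string are hypothetical.

```java
import java.util.Optional;
import java.util.function.Supplier;

final class UcVersionResolutionDemo {
  /**
   * Hypothetical helper mirroring the precedence described in the build.sbt comment.
   * The pinned version is passed as a Supplier so it is only evaluated when needed,
   * which plays the role of the `lazy val` in the diff.
   */
  static String resolve(
      Optional<String> explicitOverride,
      boolean useDefaultRelease,
      String defaultRelease,
      Optional<String> releaseVersion,
      Supplier<String> pinnedVersion) {
    // An explicit override always wins.
    if (explicitOverride.isPresent()) {
      return explicitOverride.get();
    }
    // Next, the opt-in flag selects the last released version.
    if (useDefaultRelease) {
      return defaultRelease;
    }
    // Release mode if set; otherwise fall back to the lazily computed pinned version.
    return releaseVersion.orElseGet(pinnedVersion);
  }

  public static void main(String[] args) {
    // Sample values are made up for illustration only.
    Supplier<String> pinned = () -> "pinned-version-from-script";
    System.out.println(resolve(Optional.empty(), true, "0.4.1", Optional.empty(), pinned));
    System.out.println(resolve(Optional.empty(), false, "0.4.1", Optional.empty(), pinned));
  }
}
```

Passing a supplier rather than a precomputed string keeps the expensive step (running the setup script with `--print-version` in the real build) off the path for release-mode and default-release builds.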
build/sbt
@@ -0,0 +1,16 @@
+diff --git a/build/sbt b/build/sbt
+--- a/build/sbt
++++ b/build/sbt
+ )
+ }
+ 
+-# If MAVEN_PROXY_URL is set, use it as the sole repository for all dependencies.
++# If MAVEN_PROXY_URL is set, use it (plus the local repository) as the only repositories for all dependencies.
+ if [[ -n "$MAVEN_PROXY_URL" ]]; then
+   SBT_REPOSITORIES_CONFIG=$(mktemp)
+   cat > "$SBT_REPOSITORIES_CONFIG" <<EOF
+ [repositories]
++  local
+   maven-proxy: $MAVEN_PROXY_URL
+   maven-proxy-ivy: $MAVEN_PROXY_URL, [organization]/[module]/(scala_[scalaVersion]/)(sbt_[sbtVersion]/)[revision]/[type]s/[artifact](-[classifier]).[ext]
+ EOF
\ No newline at end of file
iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
@@ -0,0 +1,15 @@
+diff --git a/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala b/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
+--- a/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
++++ b/iceberg/src/main/scala/org/apache/spark/sql/delta/IcebergTable.scala
+      * AnalysisException
+      */
+      try {
+-       SchemaMergingUtils.checkColumnNameDuplication(tableSchema, "during convert to Delta")
++       SchemaMergingUtils.checkColumnNameDuplication(tableSchema, "CONVERT_TO_DELTA")
+      } catch {
+-       case e: AnalysisException if e.getMessage.contains("during convert to Delta") =>
++       case e: AnalysisException
++           if e.getErrorClass == "DELTA_DUPLICATE_COLUMNS_FOUND.CONVERT_TO_DELTA" =>
+          throw new UnsupportedOperationException(
+            IcebergTable.caseSensitiveConversionExceptionMsg(e.getMessage))
+      }
\ No newline at end of file
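
Reviewer note on the `IcebergTable.scala` change above: the catch clause now keys on the error class instead of a substring of the message, so rewording the message no longer changes which exceptions are caught. A minimal sketch of the idea, assuming a hypothetical `ClassifiedException` type (it is not Spark's `AnalysisException`) and made-up message text:

```java
final class ErrorClassMatchDemo {
  /** Hypothetical exception that carries a stable error class next to a free-form message. */
  static final class ClassifiedException extends RuntimeException {
    private final String errorClass;

    ClassifiedException(String errorClass, String message) {
      super(message);
      this.errorClass = errorClass;
    }

    String getErrorClass() {
      return errorClass;
    }
  }

  // Error class string as it appears in the diff.
  static final String DUPLICATE_COLUMNS = "DELTA_DUPLICATE_COLUMNS_FOUND.CONVERT_TO_DELTA";

  /** Matches on the stable identifier, never on the human-readable message. */
  static boolean isDuplicateColumnError(ClassifiedException e) {
    return DUPLICATE_COLUMNS.equals(e.getErrorClass());
  }

  public static void main(String[] args) {
    ClassifiedException original =
        new ClassifiedException(DUPLICATE_COLUMNS, "Found duplicate column(s) during convert");
    ClassifiedException reworded =
        new ClassifiedException(DUPLICATE_COLUMNS, "Duplicate columns detected");
    // Both match even though the messages differ.
    System.out.println(isDuplicateColumnError(original));
    System.out.println(isDuplicateColumnError(reworded));
  }
}
```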
iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
@@ -0,0 +1,11 @@
+diff --git a/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala b/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
+--- a/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
++++ b/iceberg/src/main/scala/org/apache/spark/sql/delta/icebergShaded/IcebergConverter.scala
+    * @param catalogTable the catalogTable this conversion targets
+    * @return (Iceberg metadata path, last converted Delta version)
+    */
+-  def convertUncommitedTxn(
++  override def convertUncommitedTxn(
+       txnInfo: CurrentTransactionInfo,
+       deltaAttemptVersion: Long,
+       deltaLog: DeltaLog,
\ No newline at end of file
iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
@@ -0,0 +1,149 @@
+diff --git a/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala b/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
+--- a/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
++++ b/iceberg/src/test/scala/org/apache/spark/sql/delta/uniform/UniFormE2EIcebergSuite.scala
+ 
+ package org.apache.spark.sql.delta.uniform
+ 
+-import org.apache.spark.sql.delta.sources.DeltaSQLConf
++import java.util.{Collections, Optional, UUID}
++
++import scala.collection.JavaConverters._
++
++import io.delta.storage.commit.{CommitCoordinatorClient => JCommitCoordinatorClient}
++import io.delta.storage.commit.{TableIdentifier => UCTableIdentifier}
++import io.delta.storage.commit.actions.{AbstractMetadata, AbstractProtocol}
++import io.delta.storage.commit.uccommitcoordinator.UCCommitCoordinatorClient
++import org.apache.hadoop.fs.Path
+ 
+ import org.apache.spark.{SparkConf, SparkSessionSwitch}
+ import org.apache.spark.sql.{Row, SparkSession}
++import org.apache.spark.sql.catalyst.TableIdentifier
++import org.apache.spark.sql.delta.DeltaConfigs.{
++  COORDINATED_COMMITS_COORDINATOR_CONF,
++  COORDINATED_COMMITS_COORDINATOR_NAME
++}
++import org.apache.spark.sql.delta.DeltaLog
++import org.apache.spark.sql.delta.NonSparkReadIceberg
++import org.apache.spark.sql.delta.coordinatedcommits.{
++  CatalogOwnedCommitCoordinatorBuilder,
++  CommitCoordinatorProvider,
++  InMemoryUCClient,
++  InMemoryUCCommitCoordinator,
++  UCCommitCoordinatorBuilder
++}
++import org.apache.spark.sql.delta.sources.DeltaSQLConf
+ import org.apache.spark.sql.delta.test.DeltaSQLCommandTest
+ import org.apache.spark.sql.delta.uniform.hms.HMSTest
++import org.apache.spark.sql.delta.util.JsonUtils
+ 
+ /**
+  * This trait allows the tests to write with Delta
+ }
+ 
+ /**
+- * No test should go here. Please add tests in [[UniFormE2EIcebergSuiteBase]]
++ * Trait that wires up an in-memory UC commit coordinator for UniForm E2E testing.
++ *
++ * Mix this into a concrete suite that already extends [[UniFormE2EIcebergSuiteBase]] (or any
++ * other [[UniFormE2ETest]] subclass) to redirect every [[readAndVerify]] call through the
++ * native Iceberg reader backed by the in-memory UC coordinator.
++ *
++ * Concrete suites must call [[requiredTableProperties]] inside their
++ * [[UniFormE2EIcebergSuiteBase.extraTableProperties]] override to inject the coordinator
++ * name and conf into every `CREATE TABLE` statement.
+  */
++trait WriteDeltaUCCCReadIceberg extends UniFormE2ETest
++  with DeltaSQLCommandTest
++  with NonSparkReadIceberg {
++
++  /**
++   * A [[UCCommitCoordinatorClient]] subclass that overrides [[registerTable]] to auto-assign
++   * a UC table ID, simulating what the UC catalog does during CREATE TABLE.
++   */
++  private class TestUCBackedCommitCoordinator(ucClient: InMemoryUCClient)
++    extends UCCommitCoordinatorClient(Collections.emptyMap(), ucClient) {
++
++    @volatile var lastRegisteredTableId: String = _
++
++    /**
++     * Delta blocks setting `COORDINATED_COMMITS_TABLE_CONF` in TBLPROPERTIES, so this trait
++     * simulates what the real UC catalog does: a [[CatalogOwnedCommitCoordinatorBuilder]] returns
++     * a single [[TestUCBackedCommitCoordinator]] instance whose [[registerTable]] auto-assigns a
++     * UUID.  Returning the same instance from every [[build]]/[[buildForCatalog]] call ensures
++     * that [[UCCommitCoordinatorClient.semanticEquals]] (which uses reference equality on `conf`)
++     * returns true and Delta does not reject intra-test metadata updates.
++     */
++    override def registerTable(
++        logPath: Path,
++        tableIdentifier: Optional[UCTableIdentifier],
++        currentVersion: Long,
++        currentMetadata: AbstractMetadata,
++        currentProtocol: AbstractProtocol): java.util.Map[String, String] = {
++      val tableId = UUID.randomUUID().toString
++      lastRegisteredTableId = tableId
++      Map(UCCommitCoordinatorClient.UC_TABLE_ID_KEY -> tableId).asJava
++    }
++  }
++
++  protected var ucCommitCoordinator: InMemoryUCCommitCoordinator = _
++  private var testCoordinator: TestUCBackedCommitCoordinator = _
++
++  abstract override def beforeEach(): Unit = {
++    super.beforeEach()
++    DeltaLog.clearCache()
++    CommitCoordinatorProvider.clearAllBuilders()
++    ucCommitCoordinator = new InMemoryUCCommitCoordinator()
++    val ucClient = new InMemoryUCClient("test-metastore", ucCommitCoordinator)
++    testCoordinator = new TestUCBackedCommitCoordinator(ucClient)
++    CommitCoordinatorProvider.registerBuilder(new CatalogOwnedCommitCoordinatorBuilder {
++      override def getName: String = UCCommitCoordinatorBuilder.getName
++      override def build(
++          spark: SparkSession, conf: Map[String, String]): JCommitCoordinatorClient =
++        testCoordinator
++      override def buildForCatalog(
++          spark: SparkSession, catalogName: String): JCommitCoordinatorClient =
++        testCoordinator
++    })
++  }
++
++  abstract override def afterEach(): Unit = {
++    CommitCoordinatorProvider.clearAllBuilders()
++    DeltaLog.clearCache()
++    super.afterEach()
++  }
++
++  /**
++   * Returns the TBLPROPERTIES SQL fragment required to enable the UC commit coordinator.
++   * Concrete suites should append this to their [[extraTableProperties]] override.
++   */
++  def requiredTableProperties: String =
++    s", '${COORDINATED_COMMITS_COORDINATOR_NAME.key}' = '${UCCommitCoordinatorBuilder.getName}'" +
++      s", '${COORDINATED_COMMITS_COORDINATOR_CONF.key}' = " +
++      s"'${JsonUtils.toJson(Map.empty[String, String])}'"
++
++  override protected def readAndVerify(
++      table: String, fields: String, orderBy: String, expect: Seq[Row]): Unit = {
++    val tableId = testCoordinator.lastRegisteredTableId
++    assert(tableId != null,
++      s"No table UUID assigned for '$table' - table was not created with CC properties")
++    val schema = DeltaLog.forTable(spark, TableIdentifier(table)).update().schema
++    val uniformMetadata = ucCommitCoordinator.getUniformMetadata(tableId)
++    assert(uniformMetadata.isDefined,
++      s"No UniForm metadata found for table '$table' (ID $tableId)")
++    assert(uniformMetadata.get.getIcebergMetadata.isPresent,
++      s"No Iceberg metadata found for table '$table' (ID $tableId)")
++    val icebergMetadataPath = uniformMetadata.get.getIcebergMetadata.get.getMetadataLocation
++    verifyReadByPath(icebergMetadataPath, schema, fields, orderBy, expect)
++  }
++}
++
++/**
++ * Concrete E2E suite that runs all [[UniFormE2EIcebergSuiteBase]] tests with tables backed
++ * by an in-memory UC commit coordinator, reading results via the native Iceberg reader.
++ */
++class UniFormE2EIcebergUCSuite extends UniFormE2EIcebergSuiteBase
++    with WriteDeltaUCCCReadIceberg {
++  // No test should go here. Please add tests in [[UniFormE2EIcebergSuiteBase]]
++  override def extraTableProperties(compatVersion: Int): String =
++    super.extraTableProperties(compatVersion) + requiredTableProperties
++}
\ No newline at end of file
kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
@@ -0,0 +1,49 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/TableConfig.java
+     public static final String FORMAT_HUDI = "hudi";
+   }
+ 
++  /**
++   * The set of compression codecs that Kernel currently recognizes and enforces. This is
++   * intentionally strict for now. In the future we may add new codecs or relax validation to allow
++   * any codec string.
++   */
++  private static final Set<String> VALID_COMPRESSION_CODECS =
++      Collections.unmodifiableSet(
++          new HashSet<>(
++              Arrays.asList("uncompressed", "none", "snappy", "gzip", "lz4", "lz4_raw", "zstd")));
++
+   private static final Collection<String> ALLOWED_UNIFORM_FORMATS =
+       Collections.unmodifiableList(
+           Arrays.asList(UniversalFormats.FORMAT_HUDI, UniversalFormats.FORMAT_ICEBERG));
+           "needs to be a boolean.",
+           true);
+ 
++  /**
++   * Compression codec writers should use for new Parquet data and checkpoint files. Changing this
++   * property does not affect existing files; a table may contain files written with different
++   * codecs.
++   *
++   * <p>Valid values (case-insensitive): uncompressed, none, snappy, gzip, lz4, lz4_raw, zstd.
++   */
++  public static final TableConfig<String> PARQUET_COMPRESSION_CODEC =
++      new TableConfig<>(
++          "delta.parquet.compression.codec",
++          "snappy",
++          v -> v.toLowerCase(Locale.ROOT),
++          VALID_COMPRESSION_CODECS::contains,
++          "needs to be one of: 'uncompressed', 'none', 'snappy', 'gzip',"
++              + " 'lz4', 'lz4_raw', 'zstd'.",
++          true /* editable */);
++
+   public static final TableConfig<String> MATERIALIZED_ROW_ID_COLUMN_NAME =
+       new TableConfig<>(
+           "delta.rowTracking.materializedRowIdColumnName",
+               addConfig(this, MATERIALIZED_ROW_ID_COLUMN_NAME);
+               addConfig(this, MATERIALIZED_ROW_COMMIT_VERSION_COLUMN_NAME);
+               addConfig(this, VARIANT_SHREDDING_ENABLED);
++              addConfig(this, PARQUET_COMPRESSION_CODEC);
+ 
+               // The below configs do not yet have their behavior correctly implemented in Kernel.
+               addConfig(this, DATA_SKIPPING_STATS_COLUMNS);
\ No newline at end of file
kernel/kernel-api/src/main/java/io/delta/kernel/internal/actions/DeletionVectorDescriptor.java
@@ -0,0 +1,11 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/actions/DeletionVectorDescriptor.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/actions/DeletionVectorDescriptor.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/actions/DeletionVectorDescriptor.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/actions/DeletionVectorDescriptor.java
+   public String getUniqueId() {
+     String uniqueFileId = storageType + pathOrInlineDv;
+     if (offset.isPresent()) {
+-      return uniqueFileId + "@" + offset;
++      return uniqueFileId + "@" + offset.get();
+     } else {
+       return uniqueFileId;
+     }
\ No newline at end of file
kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
@@ -0,0 +1,13 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergCompatMetadataValidatorAndUpdater.java
+               StructType.class));
+ 
+   private static final Set<Class<? extends DataType>> V3_SUPPORTED_TYPES =
+-      Stream.concat(V2_SUPPORTED_TYPES.stream(), Stream.of(VariantType.class))
++      Stream.concat(
++              V2_SUPPORTED_TYPES.stream(),
++              Stream.of(VariantType.class, GeometryType.class, GeographyType.class))
+           .collect(Collectors.toSet());
+ 
+   protected static final IcebergCompatCheck V2_CHECK_HAS_SUPPORTED_TYPES =
\ No newline at end of file
kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
@@ -0,0 +1,10 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/icebergcompat/IcebergWriterCompatV3MetadataValidatorAndUpdater.java
+                   VARIANT_SHREDDING_PREVIEW_RW_FEATURE,
+                   VARIANT_RW_PREVIEW_FEATURE,
+                   ALLOW_COLUMN_DEFAULTS_W_FEATURE,
++                  GEOSPATIAL_RW_FEATURE,
+                   // Also allow writerV1 features for backward compatibility.
+                   //
+                   // Note: We already enforce that these features cannot be enabled
\ No newline at end of file
kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
@@ -0,0 +1,22 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/replay/ActionsIterator.java
+ import io.delta.kernel.utils.CloseableIterator;
+ import io.delta.kernel.utils.FileStatus;
+ import java.io.IOException;
++import java.io.InterruptedIOException;
+ import java.io.UncheckedIOException;
+ import java.util.*;
+ import java.util.stream.Collectors;
+       throw new IllegalStateException("Can't call `next` on a closed iterator.");
+     }
+     if (Thread.currentThread().isInterrupted()) {
+-      throw new IllegalStateException("Thread was interrupted");
++      // Throw a typed InterruptedIOException (wrapped, since next() does not declare checked
++      // exceptions) so engines whose interrupt-handling recognizes standard JDK interrupt types
++      // (e.g. Spark's StreamExecution.isInterruptionException) treat this as a clean shutdown
++      // rather than a real error.
++      throw new UncheckedIOException(new InterruptedIOException("Thread was interrupted"));
+     }
+ 
+     if (!hasNext()) {
\ No newline at end of file
kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
@@ -0,0 +1,11 @@
+diff --git a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
+--- a/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
++++ b/kernel/kernel-api/src/main/java/io/delta/kernel/internal/tablefeatures/TableFeatures.java
+     }
+   }
+ 
+-  static final TableFeature GEOSPATIAL_RW_FEATURE = new GeoSpatialTableFeature();
++  public static final TableFeature GEOSPATIAL_RW_FEATURE = new GeoSpatialTableFeature();
+ 
+   private static class GeoSpatialTableFeature extends TableFeature.ReaderWriterFeature
+       implements FeatureAutoEnabledByMetadata {
\ No newline at end of file
kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
@@ -0,0 +1,73 @@
+diff --git a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
+--- a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
++++ b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/TableConfigSuite.scala
+ 
+ import scala.collection.JavaConverters._
+ 
+-import io.delta.kernel.exceptions.KernelException
++import io.delta.kernel.exceptions.{InvalidConfigurationValueException, KernelException}
+ 
+ import org.scalatest.funsuite.AnyFunSuite
+ 
+         TableConfig.IN_COMMIT_TIMESTAMP_ENABLEMENT_TIMESTAMP.getKey -> "1",
+         TableConfig.COLUMN_MAPPING_MODE.getKey -> "name",
+         TableConfig.ICEBERG_COMPAT_V2_ENABLED.getKey -> "true",
+-        TableConfig.UNIVERSAL_FORMAT_ENABLED_FORMATS.getKey -> "iceberg").asJava)
++        TableConfig.UNIVERSAL_FORMAT_ENABLED_FORMATS.getKey -> "iceberg",
++        TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> "snappy").asJava)
+   }
+ 
+   test("check TableConfig.MAX_COLUMN_ID.editable is false") {
+     val formats = TableConfig.UNIVERSAL_FORMAT_ENABLED_FORMATS.fromMetadata(config)
+     assert(formats == Set("iceberg", "hudi").asJava)
+   }
++
++  test("PARQUET_COMPRESSION_CODEC - valid values accepted including mixed case") {
++    val validValues = Seq(
++      "snappy",
++      "SNAPPY",
++      "ZSTD",
++      "gzip",
++      "GZIP",
++      "lz4",
++      "lz4_raw",
++      "LZ4_RAW",
++      "uncompressed",
++      "UNCOMPRESSED",
++      "none",
++      "NONE",
++      "zstd")
++    validValues.foreach { codec =>
++      TableConfig.validateAndNormalizeDeltaProperties(
++        Map(TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> codec).asJava)
++    }
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - invalid value throws InvalidConfigurationValueException") {
++    val ex = intercept[InvalidConfigurationValueException] {
++      TableConfig.validateAndNormalizeDeltaProperties(
++        Map(TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> "invalid").asJava)
++    }
++    assert(ex.getMessage.contains("delta.parquet.compression.codec"))
++    assert(ex.getMessage.contains("invalid"))
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - fromMetadata returns lowercase regardless of stored case") {
++    val config = Map(TableConfig.PARQUET_COMPRESSION_CODEC.getKey -> "SNAPPY").asJava
++    val result = TableConfig.PARQUET_COMPRESSION_CODEC.fromMetadata(config)
++    assert(result === "snappy")
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - fromMetadata returns snappy when property absent") {
++    val config = Map.empty[String, String].asJava
++    val result = TableConfig.PARQUET_COMPRESSION_CODEC.fromMetadata(config)
++    assert(result === "snappy")
++  }
++
++  test("PARQUET_COMPRESSION_CODEC - validation normalizes key case") {
++    val result = TableConfig.validateAndNormalizeDeltaProperties(
++      Map("DELTA.PARQUET.COMPRESSION.CODEC" -> "snappy").asJava)
++    assert(result.containsKey("delta.parquet.compression.codec"))
++    assert(result.get("delta.parquet.compression.codec") === "snappy")
++  }
+ }
\ No newline at end of file
kernel/kernel-api/src/test/scala/io/delta/kernel/internal/actions/DeletionVectorDescriptorSuite.scala
@@ -0,0 +1,34 @@
+diff --git a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/actions/DeletionVectorDescriptorSuite.scala b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/actions/DeletionVectorDescriptorSuite.scala
+--- a/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/actions/DeletionVectorDescriptorSuite.scala
++++ b/kernel/kernel-api/src/test/scala/io/delta/kernel/internal/actions/DeletionVectorDescriptorSuite.scala
+     }
+   }
+ 
++  // Regression test for https://github.com/delta-io/delta/issues/6261:
++  // getUniqueId() must unwrap Optional<Integer> offset instead of concatenating
++  // its toString() representation (e.g. "Optional[4]" instead of "4").
++  testCases.foreach { case (storageType, pathOrInlineDv, offset, sizeInBytes, cardinality) =>
++    test(s"getUniqueId - $storageType storage type") {
++      val dv = new DeletionVectorDescriptor(
++        storageType,
++        pathOrInlineDv,
++        offset.map(Integer.valueOf).map(Optional.of[Integer]).getOrElse(Optional.empty[Integer]()),
++        sizeInBytes,
++        cardinality)
++
++      val uniqueId = dv.getUniqueId
++      val expectedFileId = storageType + pathOrInlineDv
++      offset match {
++        case Some(o) =>
++          assert(uniqueId === s"$expectedFileId@$o")
++          // Verify the offset is the raw integer, not "Optional[...]"
++          assert(!uniqueId.contains("Optional"))
++        case None =>
++          assert(uniqueId === expectedFileId)
++      }
++    }
++  }
++
+   test("serializeToBase64 throws for non-inline DV without offset") {
+     val ex = intercept[IllegalArgumentException] {
+       val dv = new DeletionVectorDescriptor(
\ No newline at end of file

... (truncated, output exceeded 60000 bytes)

Reproduce locally: git range-diff ecf4948..f3359c7 e43bf65..7bbf917 | Disable: git config gitstack.push-range-diff false
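
The `getUniqueId()` change in the range-diff above comes down to how `java.util.Optional` is stringified during concatenation. The following is a minimal standalone sketch of that difference; `UniqueIdSketch` and its two methods are hypothetical names for illustration and are not the Kernel `DeletionVectorDescriptor` class.

```java
import java.util.Optional;

// Hypothetical stand-in for illustration; not the Kernel DeletionVectorDescriptor class.
final class UniqueIdSketch {
  private UniqueIdSketch() {}

  // Pre-fix behaviour: concatenation calls Optional.toString(), which yields "Optional[4]".
  static String buggyId(String storageType, String pathOrInlineDv, Optional<Integer> offset) {
    String uniqueFileId = storageType + pathOrInlineDv;
    return offset.isPresent() ? uniqueFileId + "@" + offset : uniqueFileId;
  }

  // Post-fix behaviour: unwrap the Optional so only the raw integer is appended.
  static String fixedId(String storageType, String pathOrInlineDv, Optional<Integer> offset) {
    String uniqueFileId = storageType + pathOrInlineDv;
    return offset.isPresent() ? uniqueFileId + "@" + offset.get() : uniqueFileId;
  }

  public static void main(String[] args) {
    System.out.println(buggyId("u", "abc", Optional.of(4))); // uabc@Optional[4]
    System.out.println(fixedId("u", "abc", Optional.of(4))); // uabc@4
    System.out.println(fixedId("u", "abc", Optional.<Integer>empty())); // uabc
  }
}
```

Because both variants return the bare file id when the offset is absent, only descriptors that carry an offset produce a different id after the fix.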

* Adapter from {@link io.delta.kernel.internal.actions.Metadata} to {@link
* org.apache.spark.sql.delta.v2.interop.AbstractMetadata}.
*/
public class KernelMetadataAdapter implements AbstractMetadata {
Contributor

Why are there both `interop` and `adapters` directories? What's the difference?

Collaborator Author

`interop` is under `org.apache.spark.sql` and defines the `AbstractMetadata` interface that the v1 MetadataAction extends, while `adapters` is under `io.delta.spark.internal.v2` and adapts the v2 MetadataAction into `AbstractMetadata`, bridging the v1 and v2 MetadataAction.
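
A minimal sketch of the interface-plus-adapter split described in this reply. All three types below (`SharedMetadataView`, `KernelLikeMetadata`, `KernelLikeMetadataAdapter`) are hypothetical stand-ins, not the real `AbstractMetadata` or Kernel `Metadata` APIs; the sketch only shows shared code depending on the interface while the adapter wraps the Kernel-side object.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for the shared interface that V1 utilities are written against.
interface SharedMetadataView {
  String id();

  List<String> partitionColumns();

  Map<String, String> configuration();
}

// Hypothetical stand-in for a Kernel-side metadata object with differently named accessors.
final class KernelLikeMetadata {
  private final String id;
  private final List<String> partitionColNames;
  private final Map<String, String> config;

  KernelLikeMetadata(String id, List<String> partitionColNames, Map<String, String> config) {
    this.id = id;
    this.partitionColNames = partitionColNames;
    this.config = config;
  }

  String getId() {
    return id;
  }

  List<String> getPartitionColNames() {
    return partitionColNames;
  }

  Map<String, String> getConfiguration() {
    return config;
  }
}

// The adapter: exposes the Kernel-side object through the shared interface.
final class KernelLikeMetadataAdapter implements SharedMetadataView {
  private final KernelLikeMetadata delegate;

  KernelLikeMetadataAdapter(KernelLikeMetadata delegate) {
    // Reject null up front so failures surface at construction, not at first use.
    this.delegate = Objects.requireNonNull(delegate, "delegate is null");
  }

  @Override
  public String id() {
    return delegate.getId();
  }

  @Override
  public List<String> partitionColumns() {
    return Collections.unmodifiableList(delegate.getPartitionColNames());
  }

  @Override
  public Map<String, String> configuration() {
    return Collections.unmodifiableMap(delegate.getConfiguration());
  }
}
```

Code written against `SharedMetadataView` then works unchanged whether it receives a V1-style implementation or the adapter.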

@Override
public DeltaColumnMappingMode columnMappingMode() {
  String mode = kernelMetadata.getConfiguration().get(ColumnMapping.COLUMN_MAPPING_MODE_KEY);
  return mode == null ? NoMapping$.MODULE$ : DeltaColumnMappingMode$.MODULE$.apply(mode);
Collaborator

should we cache columnMappingMode for faster access?

Collaborator Author

cached
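
One way such caching could look, as a sketch only: the derived mode is resolved once in the constructor and stored in a final field. The enum `MappingModeSketch`, the class `CachedModeAdapterSketch`, and the literal property key are assumptions made for illustration; the thread does not show how the PR actually caches the value.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical enum standing in for Spark's DeltaColumnMappingMode.
enum MappingModeSketch {
  NONE,
  ID,
  NAME;

  // A missing property means no column mapping; otherwise parse the stored value.
  static MappingModeSketch fromConfigValue(String value) {
    return value == null ? NONE : valueOf(value.toUpperCase(Locale.ROOT));
  }
}

final class CachedModeAdapterSketch {
  // Assumed property key, used here for illustration only.
  static final String MODE_KEY = "delta.columnMapping.mode";

  // Resolved once at construction; safe only because the wrapped configuration is
  // treated as immutable for the lifetime of the adapter.
  private final MappingModeSketch columnMappingMode;

  CachedModeAdapterSketch(Map<String, String> configuration) {
    this.columnMappingMode = MappingModeSketch.fromConfigValue(configuration.get(MODE_KEY));
  }

  MappingModeSketch columnMappingMode() {
    return columnMappingMode; // no map lookup or parsing per call
  }
}
```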

@PorridgeSwim PorridgeSwim force-pushed the stack/SparkMetadataAdapter branch from 7bbf917 to 6070b93 Compare May 5, 2026 20:07
@PorridgeSwim PorridgeSwim force-pushed the stack/SparkMetadataAdapter branch from 6070b93 to 84f5ce6 Compare May 5, 2026 20:07
Co-authored-by: Isaac
@PorridgeSwim PorridgeSwim force-pushed the stack/SparkMetadataAdapter branch from 97651ca to 9271a62 Compare May 5, 2026 22:10
@TimothyW553 TimothyW553 merged commit e5d5d32 into delta-io:master May 5, 2026
62 checks passed
TimothyW553 pushed a commit that referenced this pull request May 6, 2026
…#6550)

## 🥞 Stacked PR
Use this [link](https://github.com/delta-io/delta/pull/6550/files) to
review incremental changes.
- [stack/SparkMetadataAdapter](#6546) [[Files changed](https://github.com/delta-io/delta/pull/6546/files)] [MERGED]
- [**stack/RefactorMetadataTrackingLog**](#6550) [[Files changed](https://github.com/delta-io/delta/pull/6550/files)]
- [stack/RefactorDeltaSourceMetadataEvolutionSupport](#6562) [[Files changed](https://github.com/delta-io/delta/pull/6562/files/953f137f8c4ce46d8b8a9605b0c7bed898e30df4..027984b6edcbad0f4731e560425c2ed9bcf8fc27)]
- [stack/MetadataEvolutionHandler2](#6563) [[Files changed](https://github.com/delta-io/delta/pull/6563/files/027984b6edcbad0f4731e560425c2ed9bcf8fc27..ada845895139edcb2727a87b39922c8e16837a99)]
- [stack/NonAdditiveSchemaEvolution2](#6570) [[Files changed](https://github.com/delta-io/delta/pull/6570/files/ada845895139edcb2727a87b39922c8e16837a99..476762fde7b9cb9b9bc3e416c86a260cd29806ed)]
- [stack/NonAdditiveSchemaEvolution3](#6697) [[Files changed](https://github.com/delta-io/delta/pull/6697/files/476762fde7b9cb9b9bc3e416c86a260cd29806ed..13395a7f2a49db4962091e8ee919bebdab5bd4e2)]
- [stack/consecutiveSchemaChangesMerger](#6698) [[Files changed](https://github.com/delta-io/delta/pull/6698/files/13395a7f2a49db4962091e8ee919bebdab5bd4e2..f22ba063eaf35ab69d653a2d5faefdc52f35eab5)]

---------
#### Which Delta project/connector is this regarding?

- [X] Spark
- [ ] Standalone
- [ ] Flink
- [ ] Kernel
- [ ] Other (fill in here)

## Description

PR 2/7 in the non-additive schema evolution for V2 streaming connector
stack.

Decouple `DeltaSourceMetadataTrackingLog` and `PersistedMetadata` from
V1-specific types so the schema log can be reused by the V2 connector.

- Replace `SnapshotDescriptor` parameter in `create()` with plain
`sourceTableId` and `sourceDataPath` strings
- Unify `PersistedMetadata.apply` to accept
`AbstractMetadata`/`AbstractProtocol` instead of V1
`Metadata`/`Protocol`
- Extract the consecutive schema changes merger (V1-specific, depends on
`DeltaLog`) out of the companion object into
`DeltaSourceMetadataEvolutionSupport`, and inject it as a function
parameter so V2 can provide its own implementation
- Remove `Protocol`'s `private` constructor modifier to allow
construction from abstract protocol fields
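
A rough sketch of the shape these bullets describe: `create()` takes plain strings, and the merger arrives as a function parameter rather than being hard-wired. Every name below (`PersistedEntrySketch`, `TrackingLogSketch`, the `UnaryOperator` merger type) is hypothetical; the real `DeltaSourceMetadataTrackingLog` signatures are not shown in this commit message.

```java
import java.util.Objects;
import java.util.Optional;
import java.util.function.UnaryOperator;

// Hypothetical, simplified stand-in for a persisted schema-log entry.
final class PersistedEntrySketch {
  final long version;
  final String schemaJson;

  PersistedEntrySketch(long version, String schemaJson) {
    this.version = version;
    this.schemaJson = schemaJson;
  }
}

// Hypothetical, in-memory stand-in for the schema tracking log.
final class TrackingLogSketch {
  private final String sourceTableId;
  private final String sourceDataPath;
  private final UnaryOperator<PersistedEntrySketch> merger;
  private PersistedEntrySketch latest; // null until something is written

  private TrackingLogSketch(
      String sourceTableId, String sourceDataPath, UnaryOperator<PersistedEntrySketch> merger) {
    this.sourceTableId = sourceTableId;
    this.sourceDataPath = sourceDataPath;
    this.merger = merger;
  }

  // Plain strings instead of a V1 snapshot descriptor; the merger is supplied by the caller.
  static TrackingLogSketch create(
      String sourceTableId,
      String sourceDataPath,
      UnaryOperator<PersistedEntrySketch> consecutiveChangesMerger) {
    return new TrackingLogSketch(
        sourceTableId, sourceDataPath, Objects.requireNonNull(consecutiveChangesMerger));
  }

  String sourceTableId() {
    return sourceTableId;
  }

  String sourceDataPath() {
    return sourceDataPath;
  }

  void write(PersistedEntrySketch entry) {
    latest = entry;
  }

  // The injected merger post-processes the latest entry, if there is one.
  Optional<PersistedEntrySketch> currentEntry() {
    return Optional.ofNullable(latest).map(merger);
  }
}
```

A caller with no merging logic can pass `UnaryOperator.identity()`, while a V1-style caller could pass a function that consults its own log state.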

All changes are structural refactors with no behavioral change.

## How was this patch tested?

Existing tests in `DeltaSourceSchemaEvolutionSuite` updated to use the
new API. No behavioral changes.

## Does this PR introduce _any_ user-facing changes?

No.
murali-db pushed a commit that referenced this pull request May 6, 2026
…seable in v2 (#6562)

## 🥞 Stacked PR
Use this [link](https://github.com/delta-io/delta/pull/6562/files) to
review incremental changes.
- [stack/SparkMetadataAdapter](#6546) [[Files changed](https://github.com/delta-io/delta/pull/6546/files)] [MERGED]
- [stack/RefactorMetadataTrackingLog](#6550) [[Files changed](https://github.com/delta-io/delta/pull/6550/files)] [MERGED]
- [**stack/RefactorDeltaSourceMetadataEvolutionSupport**](#6562) [[Files changed](https://github.com/delta-io/delta/pull/6562/files)]
- [stack/MetadataEvolutionHandler2](#6563) [[Files changed](https://github.com/delta-io/delta/pull/6563/files/ed92a0fa2051432b6bc5784034df0b7949bbfb98..e5b2c3295843ec85753e07dc0010aa5ccebaabb7)]
- [stack/NonAdditiveSchemaEvolution2](#6570) [[Files changed](https://github.com/delta-io/delta/pull/6570/files/e5b2c3295843ec85753e07dc0010aa5ccebaabb7..7c66bf11a0f1b651cda32ed7f529f552dd9dbfcb)]
- [stack/NonAdditiveSchemaEvolution3](#6697) [[Files changed](https://github.com/delta-io/delta/pull/6697/files/7c66bf11a0f1b651cda32ed7f529f552dd9dbfcb..14956ea304c93d2343ccd7eb89a112966f07f906)]
- [stack/consecutiveSchemaChangesMerger](#6698) [[Files changed](https://github.com/delta-io/delta/pull/6698/files/14956ea304c93d2343ccd7eb89a112966f07f906..8101b335b892a6a5b6d6fe11f4a202d14102721c)]

---------
#### Which Delta project/connector is this regarding?

- [X] Spark
- [ ] Standalone
- [ ] Flink
- [ ] Kernel
- [ ] Other (fill in here)

## Description

PR 3/7 in the non-additive schema evolution for V2 streaming connector
stack.

Refactor `DeltaSourceMetadataEvolutionSupport` and `DeltaColumnMapping`
so the schema change detection logic can be called from V2 without
depending on V1 instance state.

**`DeltaSourceMetadataEvolutionSupport`:**
- Extract instance methods (`validateAndResolveMetadataEvolution`,
`checkColumnMappingSchemaChangesDuringStreaming`,
`resolveMetadataEvolutionForCommitRange`, etc.) to companion object
statics that accept explicit parameters instead of accessing V1
`DeltaSource` via `this`
- V1 trait methods now delegate to the companion object statics

**`DeltaColumnMapping`:**
- Widen `hasNoColumnMappingSchemaChanges` from V1 `Metadata` to
`AbstractMetadata` so V2 can call it via the adapter layer
- Extract `assignColumnIdAndPhysicalNameToSchema(StructType, Map)` from
`assignColumnIdAndPhysicalName(Metadata, Metadata, ...)` — needed for
simulating column mapping upgrades during NoMapping-to-NameMapping
transitions
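
The "instance method delegates to a static with explicit parameters" refactor can be sketched as follows. The check itself (`hasNonAdditiveChange`) is a deliberately simplified, hypothetical rule (any previously read column missing from the new schema), not the real detection logic in `DeltaSourceMetadataEvolutionSupport`; the class names are likewise invented for illustration.

```java
import java.util.List;

// Hypothetical holder for shared logic: takes everything it needs as parameters,
// so callers without V1 instance state can use it directly.
final class SchemaChangeChecksSketch {
  private SchemaChangeChecksSketch() {}

  // Simplified rule for illustration: a change is non-additive if any column
  // that was being read no longer appears in the new schema (drop or rename).
  static boolean hasNonAdditiveChange(List<String> readColumns, List<String> newColumns) {
    return !newColumns.containsAll(readColumns);
  }
}

// Hypothetical V1-style source: keeps its instance method, but only as a thin
// wrapper that passes its own state to the shared static helper.
final class V1SourceSketch {
  private final List<String> readColumns;

  V1SourceSketch(List<String> readColumns) {
    this.readColumns = readColumns;
  }

  boolean hasNonAdditiveChange(List<String> newColumns) {
    return SchemaChangeChecksSketch.hasNonAdditiveChange(readColumns, newColumns);
  }
}
```

Since the wrapper forwards its state unchanged, the V1 call sites keep their behaviour, which is consistent with the "no behavioral change" claim of the commit.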

All changes are structural refactors with no behavioral change.

## How was this patch tested?

Existing tests in `DeltaSourceSchemaEvolutionSuite` continue to pass. No
behavioral changes.

## Does this PR introduce _any_ user-facing changes?

No.
@PorridgeSwim PorridgeSwim mentioned this pull request May 10, 2026
murali-db pushed a commit that referenced this pull request May 11, 2026
## 🥞 Stacked PR
Use this [link](https://github.com/delta-io/delta/pull/6563/files) to
review incremental changes.
- [stack/SparkMetadataAdapter](#6546) [[Files changed](https://github.com/delta-io/delta/pull/6546/files)] [MERGED]
- [stack/RefactorMetadataTrackingLog](#6550) [[Files changed](https://github.com/delta-io/delta/pull/6550/files)] [MERGED]
- [stack/RefactorDeltaSourceMetadataEvolutionSupport](#6562) [[Files changed](https://github.com/delta-io/delta/pull/6562/files)] [MERGED]
- [**stack/MetadataEvolutionHandler2**](#6563) [[Files changed](https://github.com/delta-io/delta/pull/6563/files)]
- [stack/NonAdditiveSchemaEvolution2](#6570) [[Files changed](https://github.com/delta-io/delta/pull/6570/files/a20f1f3ab452a75fc954e15c57c17327e0cb9267..0e07f87285becd6be416450ae084df454d9c94a9)]
- [stack/NonAdditiveSchemaEvolution3](#6697) [[Files changed](https://github.com/delta-io/delta/pull/6697/files/0e07f87285becd6be416450ae084df454d9c94a9..73e1aa7f4162a3e1480ffd2b88b9ca79d852f2fe)]
- [stack/consecutiveSchemaChangesMerger](#6698) [[Files changed](https://github.com/delta-io/delta/pull/6698/files/73e1aa7f4162a3e1480ffd2b88b9ca79d852f2fe..5e5d260b64d45cc11bcfdb58e5aab1b2d2637b33)]
- [stack/V1V2MixTest](#6759) [[Files changed](https://github.com/delta-io/delta/pull/6759/files/5e5d260b64d45cc11bcfdb58e5aab1b2d2637b33..738379713040986c74f98dbebfdc6c83ec1d3f16)]

---------
#### Which Delta project/connector is this regarding?

- [X] Spark
- [ ] Standalone
- [ ] Flink
- [ ] Kernel
- [ ] Other (fill in here)

## Description

PR 4/7 in the non-additive schema evolution for V2 streaming connector
stack.

Introduce `MetadataEvolutionHandler`, a Java class that implements the
V1 barrier protocol for schema evolution in the V2 connector. In V1 this
logic lives in `DeltaSourceMetadataEvolutionSupport`, a Scala trait
mixed into `DeltaSource` that accesses stream state via `this`. Since
V2's `SparkMicroBatchStream` is Java and cannot use Scala trait mixins,
`MetadataEvolutionHandler` receives all dependencies via constructor
injection instead.

The handler covers the full schema evolution lifecycle:
- **Stream start**: eager metadata tracking log initialization on first
batch
- **Offset generation**: injects `METADATA_CHANGE_INDEX` /
`POST_METADATA_CHANGE_INDEX` barrier sentinels into the file change
iterator
- **Pending schema offsets**: returns barrier offsets for in-progress
schema changes
- **Batch commit**: updates the schema log and throws
`DELTA_STREAMING_METADATA_EVOLUTION` to trigger stream restart
- **Batch planning on restart**: validates and re-initializes the schema
log
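
The offset-generation step can be pictured with the simplified sketch below: files are passed through until a version carrying a metadata change is reached, at which point two sentinel entries are emitted and nothing further is produced. This is one reading of "injects barrier sentinels", not the handler's actual code; `IndexedFileSketch`, `BarrierInjectionSketch`, and the two negative index constants are hypothetical and do not reflect the real sentinel values.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical, minimal stand-in for an indexed file entry in the change stream.
final class IndexedFileSketch {
  final long version;
  final long index;

  IndexedFileSketch(long version, long index) {
    this.version = version;
    this.index = index;
  }

  // In this sketch, sentinels are recognised by a negative index.
  boolean isBarrier() {
    return index < 0;
  }
}

final class BarrierInjectionSketch {
  // Placeholder sentinel indexes chosen for the sketch; not the real constants.
  static final long METADATA_CHANGE_INDEX_SKETCH = -2;
  static final long POST_METADATA_CHANGE_INDEX_SKETCH = -1;

  private BarrierInjectionSketch() {}

  // Copies entries until the first version with a metadata change, then emits the
  // two barrier sentinels for that version and stops: nothing passes the barrier.
  static List<IndexedFileSketch> withBarriers(
      Iterator<IndexedFileSketch> files, Set<Long> versionsWithMetadataChange) {
    List<IndexedFileSketch> out = new ArrayList<>();
    while (files.hasNext()) {
      IndexedFileSketch file = files.next();
      if (versionsWithMetadataChange.contains(file.version)) {
        out.add(new IndexedFileSketch(file.version, METADATA_CHANGE_INDEX_SKETCH));
        out.add(new IndexedFileSketch(file.version, POST_METADATA_CHANGE_INDEX_SKETCH));
        return out;
      }
      out.add(file);
    }
    return out;
  }
}
```

Stopping at the barrier gives the stream a well-defined offset at which the schema log can be updated before any data written under the new schema is planned.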

All detection logic delegates to the shared
`DeltaSourceMetadataEvolutionSupport$` companion object statics
(refactored in PR 3/7). V2-specific orchestration is limited to wiring
the barrier protocol into the `CloseableIterator<IndexedFile>` pipeline
and collecting metadata/protocol from Kernel commit ranges via
`StreamingHelper`.

Also extends `StreamingHelper` with
`getMetadataAndProtocolForVersionRange` to collect metadata and protocol
actions from a range of Kernel commits.

## How was this patch tested?

Unit tests in `MetadataEvolutionHandlerTest.java` covering: barrier
protocol (METADATA_CHANGE_INDEX / POST_METADATA_CHANGE_INDEX offset
generation), tracking state transitions, initialization lifecycle,
offset arithmetic, pending schema change handling, and commit-time
evolution exception.

## Does this PR introduce _any_ user-facing changes?

No.
murali-db pushed a commit that referenced this pull request May 16, 2026
## 🥞 Stacked PR
Use this [link](https://github.com/delta-io/delta/pull/6570/files) to
review incremental changes.
- [stack/SparkMetadataAdapter](#6546) [[Files changed](https://github.com/delta-io/delta/pull/6546/files)] [MERGED]
- [stack/RefactorMetadataTrackingLog](#6550) [[Files changed](https://github.com/delta-io/delta/pull/6550/files)] [MERGED]
- [stack/RefactorDeltaSourceMetadataEvolutionSupport](#6562) [[Files changed](https://github.com/delta-io/delta/pull/6562/files)] [MERGED]
- [stack/MetadataEvolutionHandler2](#6563) [[Files changed](https://github.com/delta-io/delta/pull/6563/files)] [MERGED]
- [**stack/NonAdditiveSchemaEvolution2**](#6570) [[Files changed](https://github.com/delta-io/delta/pull/6570/files)]
- [stack/NonAdditiveSchemaEvolution3](#6697) [[Files changed](https://github.com/delta-io/delta/pull/6697/files/b7f6c8ebfc0882e7e2cc580f09f376be23a8d43d..dbb6246c14be1ab7f017ad9fc26455ae599ee676)]
- [stack/consecutiveSchemaChangesMerger](#6698) [[Files changed](https://github.com/delta-io/delta/pull/6698/files/dbb6246c14be1ab7f017ad9fc26455ae599ee676..4bf2fa3fa828bcab0b56c4c26ca51ee9cc40b482)]
- [stack/SchemaTrackingWithCDC](#6801) [[Files changed](https://github.com/delta-io/delta/pull/6801/files/4bf2fa3fa828bcab0b56c4c26ca51ee9cc40b482..a78a4ac2bc9a52605278a36b98804230258c12a2)]
- [stack/V1V2MixTest](#6759) [[Files changed](https://github.com/delta-io/delta/pull/6759/files/7f9b7f2724b2245ab7380908616303cf7ea95fca..e146cdc9ebb0572e8b0a928cc6dd3bfdc198d984)]

---------
#### Which Delta project/connector is this regarding?

- [X] Spark
- [ ] Standalone
- [ ] Flink
- [ ] Kernel
- [ ] Other (fill in here)

## Description

PR 5/7 in the non-additive schema evolution for V2 streaming connector
stack.

Wire schema tracking into V2's analysis path so the analyzed plan
reflects the persisted (evolved) schema instead of the live snapshot
schema.

- `DeltaAnalysis.verifyDeltaSourceSchemaLocation`: extend the
duplicate-schema-location check to also visit `StreamingRelationV2`,
keyed on the V2 `Table.name`.
- `SparkTable`: open `DeltaSourceMetadataTrackingLog` once during
construction (gated on `mergeConsecutiveSchemaChanges`) and seed
`SchemaProvider` from the persisted metadata, so analysis-time
`schema()` matches what the stream will read at runtime.
- `ApplyV2ReadOptions` (renamed from `ApplyV2Streaming`): generalize the
CDC-only rebuild to also fire when `schemaTrackingLocation` arrives via
`extraOptions` on the catalog `readStream.table()` path; rebuild
`SparkTable` with merged options so the schema-log lookup actually
fires.
- `MetadataEvolutionHandler.getMetadataTrackingLogForMicroBatchStream`:
V2 port of V1's helper, reused by `SparkTable` (analysis) and
`SparkScan` (execution).
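
As a loose illustration of a duplicate-schema-location check keyed on a table name, the sketch below rejects any query in which two sources point at the same schema tracking location. The types and the exact rule are assumptions for illustration; the real `DeltaAnalysis.verifyDeltaSourceSchemaLocation` operates on Spark logical plans and may differ in what it treats as a conflict.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical description of one streaming source found in a query plan.
final class SourceRefSketch {
  final String tableName;
  final String schemaLocation;

  SourceRefSketch(String tableName, String schemaLocation) {
    this.tableName = tableName;
    this.schemaLocation = schemaLocation;
  }
}

final class SchemaLocationCheckSketch {
  private SchemaLocationCheckSketch() {}

  // Simplified rule: each schema tracking location may be claimed by one source only.
  static void verifyNoSharedLocation(List<SourceRefSketch> sources) {
    Map<String, String> ownerByLocation = new HashMap<>();
    for (SourceRefSketch source : sources) {
      String previousOwner = ownerByLocation.putIfAbsent(source.schemaLocation, source.tableName);
      if (previousOwner != null) {
        throw new IllegalStateException(
            "Schema location "
                + source.schemaLocation
                + " is used by both "
                + previousOwner
                + " and "
                + source.tableName);
      }
    }
  }
}
```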

## How was this patch tested?

`SparkTableTest`, `MetadataEvolutionHandlerTest`,
`ApplyV2ReadOptionsSuite`. Unified `DeltaV2SourceSchemaEvolutionSuite`
updated.

## Does this PR introduce _any_ user-facing changes?

No.
murali-db pushed a commit that referenced this pull request May 16, 2026
…6697)

## 🥞 Stacked PR
Use this [link](https://github.com/delta-io/delta/pull/6697/files) to
review incremental changes.
- [stack/SparkMetadataAdapter](#6546) [[Files changed](https://github.com/delta-io/delta/pull/6546/files)] [MERGED]
- [stack/RefactorMetadataTrackingLog](#6550) [[Files changed](https://github.com/delta-io/delta/pull/6550/files)] [MERGED]
- [stack/RefactorDeltaSourceMetadataEvolutionSupport](#6562) [[Files changed](https://github.com/delta-io/delta/pull/6562/files)] [MERGED]
- [stack/MetadataEvolutionHandler2](#6563) [[Files changed](https://github.com/delta-io/delta/pull/6563/files)] [MERGED]
- [stack/NonAdditiveSchemaEvolution2](#6570) [[Files changed](https://github.com/delta-io/delta/pull/6570/files)] [MERGED]
- [**stack/NonAdditiveSchemaEvolution3**](#6697) [[Files changed](https://github.com/delta-io/delta/pull/6697/files)]
- [stack/consecutiveSchemaChangesMerger](#6698) [[Files changed](https://github.com/delta-io/delta/pull/6698/files/f96643aa3cc01e7f70cc13a18b82dc27f277f11d..f612628ad931ec35c237801109f01b6fbd1379f7)]
- [stack/SchemaTrackingWithCDC](#6801) [[Files changed](https://github.com/delta-io/delta/pull/6801/files/f612628ad931ec35c237801109f01b6fbd1379f7..4aeacfb120b33e9cdfe124352290b72f53f7cf89)]
- [stack/V1V2MixTest](#6759) [[Files changed](https://github.com/delta-io/delta/pull/6759/files/f612628ad931ec35c237801109f01b6fbd1379f7..0c818ee431ab417a4f2ffbcc609930be09d25031)]

---------
#### Which Delta project/connector is this regarding?

- [X] Spark
- [ ] Standalone
- [ ] Flink
- [ ] Kernel
- [ ] Other (fill in here)

## Description

PR 6/7 in the non-additive schema evolution for V2 streaming connector
stack.

Wire `MetadataEvolutionHandler` into `SparkMicroBatchStream` and
`SparkScan` so V2 streaming reads honor non-additive schema evolution
(column rename/drop, type widening).

- `SparkMicroBatchStream`: take `metadataTrackingLog` + `metadataPath`
as constructor inputs; when a persisted entry exists, layer it onto the
freshly loaded `snapshotAtSourceInit` to derive
`readSnapshotAtSourceInit` (mirrors V1's `readSnapshotDescriptor`).
Integrate the schema-evolution barrier protocol into `latestOffset` /
`commit` / `planInputPartitions`. Skip the on-restart schema-validation
check when schema tracking is active — the schema-log evolution
exception covers it.
- `SparkScan.toMicroBatchStream`: reload latest snapshot (the
analysis-time `initialSnapshot` can be stale by stream start), open the
tracking log via
`MetadataEvolutionHandler.getMetadataTrackingLogForMicroBatchStream`
with `mergeConsecutiveSchemaChanges=false` (the merger only runs at
analysis), and pass it through with the checkpoint location.
- `SparkScan` option allow-list: move `allowSourceColumnDrop` / `Rename`
/ `TypeChange` out of the unsupported list now that they are honored.
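
The "layer the persisted entry onto the freshly loaded snapshot" step from the first bullet can be sketched roughly as below. The types are hypothetical and heavily reduced (a snapshot is just a version plus a schema string); the sketch only shows the precedence rule, where a persisted schema, when present, overrides the snapshot's schema while the snapshot's version is kept.

```java
import java.util.Optional;

// Hypothetical, reduced stand-in for a loaded table snapshot.
final class SnapshotSketch {
  final long version;
  final String schema;

  SnapshotSketch(long version, String schema) {
    this.version = version;
    this.schema = schema;
  }
}

// Hypothetical stand-in for the latest entry in the schema tracking log.
final class PersistedSchemaSketch {
  final String schema;

  PersistedSchemaSketch(String schema) {
    this.schema = schema;
  }
}

final class ReadSnapshotSketch {
  private ReadSnapshotSketch() {}

  // With a persisted entry: keep the snapshot's version, read with the persisted schema.
  // Without one: the snapshot is used as loaded.
  static SnapshotSketch layer(
      SnapshotSketch snapshotAtSourceInit, Optional<PersistedSchemaSketch> persisted) {
    return persisted
        .map(entry -> new SnapshotSketch(snapshotAtSourceInit.version, entry.schema))
        .orElse(snapshotAtSourceInit);
  }
}
```

Under this rule a stream restarted after a column drop keeps reading with the schema recorded in the log until the barrier for the next change is committed.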

## How was this patch tested?

`SparkMicroBatchStreamTest`, `MetadataEvolutionHandlerTest`. Unified
suites (`DeltaV2SourceSchemaEvolutionSuite`,
`TypeWideningStreamingV2SourceSuite`,
`RemoveColumnMappingStreamingReadV2Suite`) move non-merger evolution
scenarios from `shouldFailTests` to `shouldPassTests`; merger-dependent
tests remain pending until PR 7/7.

## Does this PR introduce _any_ user-facing changes?

No.

5 participants