
[Spark] Add UC Delta Rest Catalog API path credentials for raw path reads#6682

Open
TimothyW553 wants to merge 3 commits into delta-io:master from TimothyW553:stack/drc-path-credentials


@TimothyW553 TimothyW553 commented Apr 28, 2026

🥞 Stacked PR

Use this link to review incremental changes.


Which Delta project/connector is this regarding?

Spark / Unity Catalog

Description

This PR adds UC Delta Rest Catalog API temporary path credentials for raw Delta path reads.

Named UC table reads already use table credentials from the previous PR. Raw path reads use a different Spark/Delta entry point, so they need GET /delta/v1/temporary-path-credentials instead of named-table loadTable credentials.

This PR adds:

  • UCClient.getTemporaryPathCredentials(...) and UCTokenBasedRestClient support for the temporary path credentials endpoint.
  • DeltaCatalogClient.pathCredentialOptions(...) to fetch path-scoped credentials for cloud paths using the UC Delta Rest Catalog API-enabled default catalog.
  • Delta source and DeltaTableV2 wiring so raw path reads receive those credential options before touching storage.
  • A guard so Delta path identifiers are not sent to named-table UC Delta Rest Catalog API loadTable.
  • A server-side planning guard so that, when server-side planning is disabled, table properties are not inspected while loading path-based catalog-managed test tables.
  • Spark UC integration test coverage for DeltaTable.forPath with UC Delta Rest Catalog API path credentials.
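
The path-identifier guard in the list above can be illustrated with a small sketch. Everything here is hypothetical: `Ident`, `isPathIdentifier`, and `loadNamedTable` are stand-ins, not the names used in this PR, and the rule "namespace is `delta` and the name looks like a path" is an assumption about how a path identifier might be recognized.

```scala
import java.net.URI
import scala.util.Try

// Hypothetical stand-in for a catalog identifier: namespace parts plus a name.
final case class Ident(namespace: Seq[String], name: String)

object PathIdentifierGuard {
  // Assumption: a path identifier has the single namespace part "delta"
  // and a name that is an absolute path or a URI with a scheme.
  def isPathIdentifier(ident: Ident): Boolean = {
    val isDeltaNamespace =
      ident.namespace.length == 1 && ident.namespace.head.equalsIgnoreCase("delta")
    val looksLikePath =
      ident.name.startsWith("/") ||
        Try(new URI(ident.name)).toOption.exists(_.getScheme != null)
    isDeltaNamespace && looksLikePath
  }

  // Returns None for path identifiers, so the named-table lookup is never called for them.
  def loadNamedTable(ident: Ident)(lookup: Ident => String): Option[String] =
    if (isPathIdentifier(ident)) None else Some(lookup(ident))
}
```

With this shape, an identifier such as delta.`s3://bucket/tbl` short-circuits to `None`, while a regular `schema.table` identifier still reaches the lookup.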

If no UC Delta Rest Catalog API-enabled default catalog is configured, or the path is not a supported cloud path, this PR returns no extra credential options and keeps the existing behavior.
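
A minimal sketch of that fallback, under stated assumptions: the scheme list, the `enabledCatalog` parameter, and the `fetch` callback are all hypothetical and only model the behavior described above. The actual `DeltaCatalogClient.pathCredentialOptions(...)` signature may differ.

```scala
import java.net.URI
import java.util.Locale
import scala.util.Try

object PathCredentialOptionsSketch {
  // Assumption: an illustrative set of cloud schemes; the PR's own check may differ.
  private val cloudSchemes = Set("s3", "s3a", "gs", "abfs", "abfss")

  def isCloudScheme(scheme: String): Boolean =
    scheme != null && cloudSchemes.contains(scheme.toLowerCase(Locale.ROOT))

  // `enabledCatalog` models whether an API-enabled default catalog is configured.
  // `fetch` is a hypothetical callback that would request path-scoped credentials.
  def pathCredentialOptions(
      path: String,
      enabledCatalog: Option[String])(
      fetch: (String, String) => Map[String, String]): Map[String, String] = {
    // Unparseable paths and paths without a scheme both yield None here.
    val scheme = Try(new URI(path).getScheme).toOption.flatMap(Option(_))
    (enabledCatalog, scheme) match {
      case (Some(catalog), Some(s)) if isCloudScheme(s) => fetch(catalog, path)
      // No enabled catalog, or not a supported cloud path: add nothing.
      case _ => Map.empty
    }
  }
}
```

Returning an empty map rather than throwing lets callers merge the result into their existing options unconditionally, which is what keeps the pre-existing behavior intact.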

How was this patch tested?

Covered by UCTokenBasedRestClientSuite for temporary path credentials and unsupported endpoint behavior.

Covered by DeltaCatalogClientSuite for path credential options and path identifier skip behavior.

Covered by ServerSidePlannedTableSuite for the disabled server-side planning guard.

Covered by the Spark UC integration tests.

Local verification:

./build/sbt "sparkUnityCatalog/testOnly io.sparkuctest.UCDeltaTableReadTest"
./build/sbt "sparkUnityCatalog/test"
./build/sbt scalafmtAll javafmtAll
./build/sbt scalafmtCheckAll javafmtCheckAll "spark/testScalastyle" "sparkV1/testScalastyle"

Does this PR introduce any user-facing changes?

No released user-facing change. This only applies when UC Delta Rest Catalog API is enabled for the default catalog used to authorize raw path access.

@TimothyW553 TimothyW553 force-pushed the stack/drc-path-credentials branch 10 times, most recently from 9f0faa3 to 41924e3 Compare April 29, 2026 18:43
@TimothyW553 TimothyW553 changed the title spark: add DRC path credentials for raw path reads spark: add UC Delta Rest Catalog API path credentials for raw path reads Apr 29, 2026
@TimothyW553 TimothyW553 force-pushed the stack/drc-path-credentials branch 2 times, most recently from 4df9d78 to cf85522 Compare April 29, 2026 23:30
@TimothyW553 TimothyW553 changed the title spark: add UC Delta Rest Catalog API path credentials for raw path reads spark: add UC Delta API path credentials for raw path reads Apr 30, 2026
@TimothyW553 TimothyW553 force-pushed the stack/drc-path-credentials branch 2 times, most recently from 162948f to c6f036a Compare April 30, 2026 05:30
@TimothyW553 TimothyW553 changed the title spark: add UC Delta API path credentials for raw path reads [Spark] Add UC Delta API path credentials for raw path reads Apr 30, 2026
@TimothyW553 TimothyW553 force-pushed the stack/drc-path-credentials branch from c6f036a to f1d45b2 Compare April 30, 2026 06:28
@TimothyW553 TimothyW553 marked this pull request as ready for review April 30, 2026 06:34
@TimothyW553 TimothyW553 requested a review from tdas as a code owner April 30, 2026 06:34
@TimothyW553 TimothyW553 requested review from openinx and yili-db April 30, 2026 06:34
@TimothyW553 TimothyW553 force-pushed the stack/drc-path-credentials branch 2 times, most recently from 49ed145 to dbccc2b Compare April 30, 2026 17:49
@TimothyW553 TimothyW553 force-pushed the stack/drc-path-credentials branch 15 times, most recently from faf76b2 to 4318b06 Compare May 12, 2026 18:39
@TimothyW553 TimothyW553 force-pushed the stack/drc-path-credentials branch 7 times, most recently from e858b6f to e6355cf Compare May 13, 2026 01:38
Comment on lines +80 to +108
val metadata = try {
  client.loadTable(catalogName, schemaName, tableName)
    .asInstanceOf[TableMetadataAdapter]
} catch {
  case e: IOException if isUnsupportedTableFormat(e) =>
    return None
  case e: IOException =>
    throw translateLoadTableException(ident, e)
}
val location = metadata.getLocation
val locationUri = CatalogUtils.stringToURI(location)
val credentials = Option.when(isCloudScheme(locationUri.getScheme)) {
  try {
    // Prefer READ_WRITE so a loaded table can be used for writes without reloading
    // credentials; read-only principals fall back to READ below.
    client.getTableCredentials(
      CredentialOperation.READ_WRITE,
      catalogName,
      schemaName,
      tableName)
  } catch {
    case e: IOException if isAuthError(e) =>
      client.getTableCredentials(
        CredentialOperation.READ,
        catalogName,
        schemaName,
        tableName)
  }
}

Why don't we call loadTable from UCSingleCatalog? With the current code, credential renewal won't work at all, which is a huge risk.
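
For reference, the READ_WRITE-then-READ fallback in the snippet under review can be isolated as below. This is only a sketch: `Operation`, `AuthException`, and `credentialsWithFallback` are hypothetical names, and modelling an auth failure as a dedicated exception subclass is an assumption (the diff uses an `isAuthError(e)` predicate instead). It fetches credentials once and does not address renewal.

```scala
import java.io.IOException

object CredentialFallbackSketch {
  sealed trait Operation
  case object ReadWrite extends Operation
  case object Read extends Operation

  // Assumption: auth failures are modelled as a dedicated IOException subclass.
  final class AuthException(msg: String) extends IOException(msg)

  // Try the broader READ_WRITE scope first; on an auth failure, retry with READ.
  // Any other exception, including a failure of the READ retry, propagates.
  def credentialsWithFallback[A](fetch: Operation => A): A =
    try fetch(ReadWrite)
    catch { case _: AuthException => fetch(Read) }
}
```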

Comment on lines +313 to +326
  val credentials = client.getTemporaryPathCredentials(
    location,
    CredentialOperation.READ)
  buildHadoopCredentialPropertiesForPath(
    location,
    getStorageCredentials(credentials),
    locationScheme,
    config.credentialContext)
} catch {
  case _: UnsupportedOperationException =>
    Map.empty[String, String]
  case e: IOException if isNotFound(e) =>
    Map.empty[String, String]
}

I see why you added the getTemporaryPathCredentials API to the client interface: you need it to generate the initial temporary credentials.

But after unitycatalog/unitycatalog#1549, we no longer need to set the initial credential explicitly.

So we can remove the getTemporaryPathCredentials code from OSS Delta entirely.


val spark = SparkSession.active

private lazy val deltaCatalogClient: DeltaCatalogClient = DeltaCatalogClient(delegate, spark)

Rather than defining a separate DeltaCatalogClient, why not introduce a separate StagingTableCatalog, just like UCSingleCatalog in oss-unitycatalog?

Essentially, UCSingleCatalog uses the legacy client API to talk to Unity Catalog, and UCDeltaCatalog would use the new client API to talk to the new catalog endpoint.

Does this make sense?
