Skip to content

[Spark] Simplify UCClientFactory to single createUCClient(ucConfig) interface#6792

Open
openinx wants to merge 7 commits into
delta-io:masterfrom
openinx:fix/ucclient-reflection-factory
Open

[Spark] Simplify UCClientFactory to single createUCClient(ucConfig) interface#6792
openinx wants to merge 7 commits into
delta-io:masterfrom
openinx:fix/ucclient-reflection-factory

Conversation

@openinx
Copy link
Copy Markdown
Collaborator

@openinx openinx commented May 15, 2026

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

Simplify UCClientFactory from a multi-method interface (createUCClient(uri, authConfig), createUCClientWithVersions(...), plus Java overloads) to a single unified method:

def createUCClient(ucConfig: Map[String, String]): UCClient

Why? The previous design required callers to assemble separate uri, authConfig, and appVersions parameters. It also had a compile-time dependency on UCDeltaTokenBasedRestClient. This change:

  1. Passes a single flat ucConfig map containing all keys (uri, auth.*, deltaRestApi.enabled, ucclient.impl, appVersions.*).
  2. Uses reflection (Utils.classForName) to load the UCClient implementation at runtime, removing the compile-time dependency on UCDeltaTokenBasedRestClient.
  3. Selects the implementation via ucclient.impl key, or deltaRestApi.enabled flag, defaulting to UCTokenBasedRestClient.

Changes by file

  • UCCommitCoordinatorBuilder.scala

    • UCClientFactory trait: createUCClient(ucConfig: Map[String, String]).
    • UCTokenBasedRestClientFactory: single createUCClient that extracts URI, auth, impl class, and app versions from ucConfig. Reflection-based instantiation.
    • getCatalogConfigs: simplified from (name, uri, authConfigMap) to (name, ucConfig) — collects all sub-keys under spark.sql.catalog.<name>.* into a flat map.
    • UCCatalogConfig: changed from (catalogName, uri, authConfig) to (catalogName, ucConfig) with computed uri and authConfig accessors.
  • CreateTableBuilder.java / SnapshotManagerFactory.java: use UCTableInfo.toUcConfig() to build the flat config map.

  • UCTableInfo.java: added toUcConfig() to centralize the uri + auth config assembly.

How was this patch tested?

  • UCCommitCoordinatorBuilderSuite — 14 tests pass (7 pre-existing Mockito/Java 21 ByteBuddy failures unrelated to this change).
  • UCCommitCoordinatorClientSuite — 11/11 pass.
  • New unit tests added for extractAuthConfig (auth.* preference over legacy token, legacy fallback) and extractAppVersions (merges caller-supplied entries with defaults).
  • scalastyle, javafmt, checkstyle all clean.

Does this PR introduce any user-facing changes?

No. All changed APIs (UCClientFactory, UCTokenBasedRestClientFactory, UCCatalogConfig, getCatalogConfigs) are internal to the Delta coordinated commits module. No user-facing configs or behaviors are changed.

@openinx openinx force-pushed the fix/ucclient-reflection-factory branch from 510832a to 877d8f9 Compare May 15, 2026 05:26
openinx added 2 commits May 14, 2026 22:37
…sedRestClientFactory

Replace the hard-coded `new UCTokenBasedRestClient(...)` call with
reflection-based class loading so that alternative UCClient implementations
(e.g. UCDeltaTokenBasedRestClient) can be selected at runtime via the
`ucclient.impl` config key, without introducing a compile-time dependency.
- Consolidate the UCClientFactory trait to a single method:
  `createUCClient(ucConfig: Map[String, String])`
- UCTokenBasedRestClientFactory now parses uri, auth, and
  deltaRestApi.enabled from the unified ucConfig map to select
  between UCTokenBasedRestClient and UCDeltaTokenBasedRestClient
  (loaded via reflection for compile-time decoupling).
- Simplify getCatalogConfigs to collect all sub-keys under
  spark.sql.catalog.<name>.* into a flat config map, removing
  the separate URI / auth parsing logic.
- Update UCCatalogConfig with convenience accessors (uri,
  authConfig) that extract from the unified ucConfig map.
- Update all callers in spark/v2 Java modules.
@openinx openinx force-pushed the fix/ucclient-reflection-factory branch from 877d8f9 to 80c4e89 Compare May 15, 2026 05:51
openinx added 2 commits May 14, 2026 23:07
Signed-off-by: openinx <openinx@gmail.com>
Signed-off-by: openinx <openinx@gmail.com>
@openinx openinx changed the title [Storage] Use reflection to load UCClient implementation in UCTokenBasedRestClientFactory [Storage] Simplify UCClientFactory to single createUCClient(ucConfig) interface May 15, 2026
@openinx openinx changed the title [Storage] Simplify UCClientFactory to single createUCClient(ucConfig) interface [Spark] Simplify UCClientFactory to single createUCClient(ucConfig) interface May 15, 2026
Comment on lines +145 to +147
// scalastyle:off line.size.limit
private def noMatchingCatalogException(
metastoreId: String) = {
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change is unneccessary, I think we can revert it.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Reverted -- back on a single line, scalastyle:off guard removed.

Comment on lines +157 to +159
private def multipleMatchingCatalogs(
metastoreId: String,
uris: List[String]) = {
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated change, we should revert it.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Reverted -- back on a single line matching the original.

* Recognised ucConfig keys:
* - `uri` (required) -- the UC server endpoint.
* - `auth.*` / `token` (legacy) -- authentication parameters for [[TokenProvider]].
* - `ucclient.impl` -- fully-qualified class name of the [[UCClient]] to instantiate.
Copy link
Copy Markdown
Collaborator Author

@openinx openinx May 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we already have the deltaRestApi.enabled switch, then I think we no need this any more. Remove it.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Removed ucclient.impl key and UC_CLIENT_IMPL_KEY constant. The deltaRestApi.enabled switch is now the only mechanism for selecting the UCClient implementation.

DEFAULT_UC_CLIENT_CLASS
})

val appVersions = extractAppVersions(ucConfig)
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pls simply this, we don't need to use the ucConfig to specific the appVersions. in this case, we should always use the default app versions from here: https://github.com/delta-io/delta/pull/6792/changes#diff-e713dd4904f1e14bc5a90fc8517d818c6aa12655799e935e6f7a020d950c47ceR335

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. createUCClient now merges defaultAppVersions with any caller-supplied appVersions.* entries from ucConfig via extractAppVersions. Callers like SnapshotManagerFactory add custom versions (e.g. Kernel) through ucConfig keys.

* The provided `appVersions` map is used as-is; callers are responsible for
* including all desired version entries.
* Extracts authentication configuration from ucConfig.
* Prefers `auth.*` keys; falls back to legacy `token` key.
Copy link
Copy Markdown
Collaborator Author

@openinx openinx May 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this doc, pls give some example for reviewer to understand the context. Just like what we did in the javadoc before this PR: https://github.com/delta-io/delta/pull/6792/changes#diff-e713dd4904f1e14bc5a90fc8517d818c6aa12655799e935e6f7a020d950c47ceL167-L185

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Added configuration examples showing both the new auth.* format and the legacy token format, similar to the old getCatalogConfigs javadoc.

Comment on lines +322 to +332
/**
* Merges default app versions with any `appVersions.*` entries from ucConfig.
* Caller-supplied entries override defaults with the same key.
*/
private[coordinatedcommits] def extractAppVersions(
ucConfig: Map[String, String]): Map[String, String] = {
val extra = ucConfig
.filterKeys(_.startsWith(APP_VERSIONS_PREFIX))
.map { case (k, v) => (k.stripPrefix(APP_VERSIONS_PREFIX), v) }
.toMap
defaultAppVersions ++ extra
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

totally remove this.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Removed extractAppVersions, APP_VERSIONS_PREFIX, and all appVersions.* handling. createUCClient now uses defaultAppVersions directly. SnapshotManagerFactory reverted to a simple createUCClient(tableInfo.toUcConfig()) call.

…es, remove appVersions

- Revert unnecessary signature splits for noMatchingCatalogException and
  multipleMatchingCatalogs (back to single-line, matching original style).
- Remove stray scalastyle:off guard.
- Add configuration examples to UCTokenBasedRestClientFactory Scaladoc.
- Remove extractAppVersions and appVersions.* mechanism entirely.
- Revert SnapshotManagerFactory to simple createUCClient call.
Comment on lines -74 to -81
// Start from defaults (Delta, Spark, Scala, Java) and add connector-specific entries
Map<String, String> appVersions =
UCTokenBasedRestClientFactory$.MODULE$.defaultAppVersionsAsJava();
appVersions.put("Kernel", Meta.KERNEL_VERSION);
appVersions.put("Delta V2 connector", "true");
UCClient ucClient =
UCTokenBasedRestClientFactory$.MODULE$.createUCClientWithVersions(
tableInfo.getUcUri(), tableInfo.getAuthConfig(), appVersions);
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, shoot, i misunderstood here. I think we cannot remove this test, and then the createUCClient should support the customized app versions. okay, let's add the custom app versions back.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Re-added extractAppVersions and the appVersions.* mechanism. SnapshotManagerFactory now passes appVersions.Kernel and appVersions.Delta V2 connector through the ucConfig map. Test restored in BuilderSuite.

- Remove UC_CLIENT_IMPL_KEY / ucclient.impl config key; deltaRestApi.enabled
  is sufficient to select the UCClient implementation.
- Re-add extractAppVersions so callers can supply custom version entries
  via appVersions.* keys in ucConfig (merged with defaults).
- Restore SnapshotManagerFactory to pass Kernel and Delta V2 connector
  versions through ucConfig appVersions.* entries.
- Add extractAppVersions unit test back to BuilderSuite.
.filter { case (uri, authConfig) => getMetastoreId(uri, authConfig).contains(metastoreId) }
val matchingConfigs: List[Map[String, String]] = getCatalogConfigs(spark)
.map(_._2)
.distinct
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why doing .distinct on the whole ucconfig? why not on (uri, authConfig) like before

uriOpt match {
case Some(uri) =>
try {
new URI(uri) // Validate the URI
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

below you added

 else {
          Some((catalogName, ucConfig))
        }

but theres no check on the URI. "ht!tp://not a uri" could flow thru and error deep inside

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think you need to validate like here via new URI(uri)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants