[Spark] Simplify UCClientFactory to single createUCClient(ucConfig) interface#6792
[Spark] Simplify UCClientFactory to single createUCClient(ucConfig) interface#6792openinx wants to merge 7 commits into
Conversation
510832a to
877d8f9
Compare
…sedRestClientFactory Replace the hard-coded `new UCTokenBasedRestClient(...)` call with reflection-based class loading so that alternative UCClient implementations (e.g. UCDeltaTokenBasedRestClient) can be selected at runtime via the `ucclient.impl` config key, without introducing a compile-time dependency.
- Consolidate the UCClientFactory trait to a single method: `createUCClient(ucConfig: Map[String, String])` - UCTokenBasedRestClientFactory now parses uri, auth, and deltaRestApi.enabled from the unified ucConfig map to select between UCTokenBasedRestClient and UCDeltaTokenBasedRestClient (loaded via reflection for compile-time decoupling). - Simplify getCatalogConfigs to collect all sub-keys under spark.sql.catalog.<name>.* into a flat config map, removing the separate URI / auth parsing logic. - Update UCCatalogConfig with convenience accessors (uri, authConfig) that extract from the unified ucConfig map. - Update all callers in spark/v2 Java modules.
877d8f9 to
80c4e89
Compare
| // scalastyle:off line.size.limit | ||
| private def noMatchingCatalogException( | ||
| metastoreId: String) = { |
There was a problem hiding this comment.
this change is unneccessary, I think we can revert it.
There was a problem hiding this comment.
Done. Reverted -- back on a single line, scalastyle:off guard removed.
| private def multipleMatchingCatalogs( | ||
| metastoreId: String, | ||
| uris: List[String]) = { |
There was a problem hiding this comment.
unrelated change, we should revert it.
There was a problem hiding this comment.
Done. Reverted -- back on a single line matching the original.
| * Recognised ucConfig keys: | ||
| * - `uri` (required) -- the UC server endpoint. | ||
| * - `auth.*` / `token` (legacy) -- authentication parameters for [[TokenProvider]]. | ||
| * - `ucclient.impl` -- fully-qualified class name of the [[UCClient]] to instantiate. |
There was a problem hiding this comment.
As we already have the deltaRestApi.enabled switch, then I think we no need this any more. Remove it.
There was a problem hiding this comment.
Done. Removed ucclient.impl key and UC_CLIENT_IMPL_KEY constant. The deltaRestApi.enabled switch is now the only mechanism for selecting the UCClient implementation.
| DEFAULT_UC_CLIENT_CLASS | ||
| }) | ||
|
|
||
| val appVersions = extractAppVersions(ucConfig) |
There was a problem hiding this comment.
Pls simply this, we don't need to use the ucConfig to specific the appVersions. in this case, we should always use the default app versions from here: https://github.com/delta-io/delta/pull/6792/changes#diff-e713dd4904f1e14bc5a90fc8517d818c6aa12655799e935e6f7a020d950c47ceR335
There was a problem hiding this comment.
Done. createUCClient now merges defaultAppVersions with any caller-supplied appVersions.* entries from ucConfig via extractAppVersions. Callers like SnapshotManagerFactory add custom versions (e.g. Kernel) through ucConfig keys.
| * The provided `appVersions` map is used as-is; callers are responsible for | ||
| * including all desired version entries. | ||
| * Extracts authentication configuration from ucConfig. | ||
| * Prefers `auth.*` keys; falls back to legacy `token` key. |
There was a problem hiding this comment.
In this doc, pls give some example for reviewer to understand the context. Just like what we did in the javadoc before this PR: https://github.com/delta-io/delta/pull/6792/changes#diff-e713dd4904f1e14bc5a90fc8517d818c6aa12655799e935e6f7a020d950c47ceL167-L185
There was a problem hiding this comment.
Done. Added configuration examples showing both the new auth.* format and the legacy token format, similar to the old getCatalogConfigs javadoc.
| /** | ||
| * Merges default app versions with any `appVersions.*` entries from ucConfig. | ||
| * Caller-supplied entries override defaults with the same key. | ||
| */ | ||
| private[coordinatedcommits] def extractAppVersions( | ||
| ucConfig: Map[String, String]): Map[String, String] = { | ||
| val extra = ucConfig | ||
| .filterKeys(_.startsWith(APP_VERSIONS_PREFIX)) | ||
| .map { case (k, v) => (k.stripPrefix(APP_VERSIONS_PREFIX), v) } | ||
| .toMap | ||
| defaultAppVersions ++ extra |
There was a problem hiding this comment.
totally remove this.
There was a problem hiding this comment.
Done. Removed extractAppVersions, APP_VERSIONS_PREFIX, and all appVersions.* handling. createUCClient now uses defaultAppVersions directly. SnapshotManagerFactory reverted to a simple createUCClient(tableInfo.toUcConfig()) call.
…es, remove appVersions - Revert unnecessary signature splits for noMatchingCatalogException and multipleMatchingCatalogs (back to single-line, matching original style). - Remove stray scalastyle:off guard. - Add configuration examples to UCTokenBasedRestClientFactory Scaladoc. - Remove extractAppVersions and appVersions.* mechanism entirely. - Revert SnapshotManagerFactory to simple createUCClient call.
| // Start from defaults (Delta, Spark, Scala, Java) and add connector-specific entries | ||
| Map<String, String> appVersions = | ||
| UCTokenBasedRestClientFactory$.MODULE$.defaultAppVersionsAsJava(); | ||
| appVersions.put("Kernel", Meta.KERNEL_VERSION); | ||
| appVersions.put("Delta V2 connector", "true"); | ||
| UCClient ucClient = | ||
| UCTokenBasedRestClientFactory$.MODULE$.createUCClientWithVersions( | ||
| tableInfo.getUcUri(), tableInfo.getAuthConfig(), appVersions); |
There was a problem hiding this comment.
oh, shoot, i misunderstood here. I think we cannot remove this test, and then the createUCClient should support the customized app versions. okay, let's add the custom app versions back.
There was a problem hiding this comment.
Done. Re-added extractAppVersions and the appVersions.* mechanism. SnapshotManagerFactory now passes appVersions.Kernel and appVersions.Delta V2 connector through the ucConfig map. Test restored in BuilderSuite.
- Remove UC_CLIENT_IMPL_KEY / ucclient.impl config key; deltaRestApi.enabled is sufficient to select the UCClient implementation. - Re-add extractAppVersions so callers can supply custom version entries via appVersions.* keys in ucConfig (merged with defaults). - Restore SnapshotManagerFactory to pass Kernel and Delta V2 connector versions through ucConfig appVersions.* entries. - Add extractAppVersions unit test back to BuilderSuite.
| .filter { case (uri, authConfig) => getMetastoreId(uri, authConfig).contains(metastoreId) } | ||
| val matchingConfigs: List[Map[String, String]] = getCatalogConfigs(spark) | ||
| .map(_._2) | ||
| .distinct |
There was a problem hiding this comment.
why doing .distinct on the whole ucconfig? why not on (uri, authConfig) like before
| uriOpt match { | ||
| case Some(uri) => | ||
| try { | ||
| new URI(uri) // Validate the URI |
There was a problem hiding this comment.
below you added
else {
Some((catalogName, ucConfig))
}
but theres no check on the URI. "ht!tp://not a uri" could flow thru and error deep inside
There was a problem hiding this comment.
i think you need to validate like here via new URI(uri)
Which Delta project/connector is this regarding?
Description
Simplify
UCClientFactoryfrom a multi-method interface (createUCClient(uri, authConfig),createUCClientWithVersions(...), plus Java overloads) to a single unified method:Why? The previous design required callers to assemble separate
uri,authConfig, andappVersionsparameters. It also had a compile-time dependency onUCDeltaTokenBasedRestClient. This change:ucConfigmap containing all keys (uri,auth.*,deltaRestApi.enabled,ucclient.impl,appVersions.*).Utils.classForName) to load theUCClientimplementation at runtime, removing the compile-time dependency onUCDeltaTokenBasedRestClient.ucclient.implkey, ordeltaRestApi.enabledflag, defaulting toUCTokenBasedRestClient.Changes by file
UCCommitCoordinatorBuilder.scalaUCClientFactorytrait:createUCClient(ucConfig: Map[String, String]).UCTokenBasedRestClientFactory: singlecreateUCClientthat extracts URI, auth, impl class, and app versions fromucConfig. Reflection-based instantiation.getCatalogConfigs: simplified from(name, uri, authConfigMap)to(name, ucConfig)— collects all sub-keys underspark.sql.catalog.<name>.*into a flat map.UCCatalogConfig: changed from(catalogName, uri, authConfig)to(catalogName, ucConfig)with computeduriandauthConfigaccessors.CreateTableBuilder.java/SnapshotManagerFactory.java: useUCTableInfo.toUcConfig()to build the flat config map.UCTableInfo.java: addedtoUcConfig()to centralize the uri + auth config assembly.How was this patch tested?
UCCommitCoordinatorBuilderSuite— 14 tests pass (7 pre-existing Mockito/Java 21 ByteBuddy failures unrelated to this change).UCCommitCoordinatorClientSuite— 11/11 pass.extractAuthConfig(auth.* preference over legacy token, legacy fallback) andextractAppVersions(merges caller-supplied entries with defaults).Does this PR introduce any user-facing changes?
No. All changed APIs (
UCClientFactory,UCTokenBasedRestClientFactory,UCCatalogConfig,getCatalogConfigs) are internal to the Delta coordinated commits module. No user-facing configs or behaviors are changed.