Description
Kamaji unconditionally sets --authorization-mode=Node,RBAC in the kube-apiserver desiredArgs map at [internal/builders/controlplane/deployment.go:723](https://github.com/clastix/kamaji/blob/master/internal/builders/controlplane/deployment.go#L723). While user-provided extraArgs can override values via MergeMaps(), they cannot remove keys from the map. Kubernetes rejects --authorization-config when --authorization-mode or --authorization-webhook-* flags are present:
E0326 13:22:36.127190 1 run.go:72] "command failed" err="--authorization-config can not be specified when --authorization-mode or --authorization-webhook-* flags are defined"
This is documented Kubernetes behavior:
You cannot combine the --authorization-mode command line argument with the --authorization-config command line argument used for configuring authorization using a local file. If you try this, the API server reports an error message during startup, then exits immediately.
Why This Is a Problem
The Structured Authorization Configuration API (via --authorization-config, beta since Kubernetes 1.30 and GA since 1.32 / KEP-3221) is the only way to set failurePolicy and timeout on authorization webhooks. Without it, authorization webhooks are fail-closed by default - if the webhook becomes unreachable, all API requests to the tenant cluster are denied.
| Feature | Flag-based (--authorization-mode) | Structured Config (--authorization-config) |
| --- | --- | --- |
| failurePolicy (NoOpinion/Deny) | Not available | ✅ Available |
| timeout per webhook | Not available | ✅ Available |
| matchConditions (CEL pre-filters) | Not available | ✅ Available |
| Multiple authorization webhooks | Not available | ✅ Available |
Impact on Cluster Health
- Complete API server lockout: If the authorization webhook goes down (network partition, service crash, latency spike), no user, controller, or kubelet can perform any action.
- Cascading failures: Kubelet lease renewals fail → nodes go NotReady → pods get evicted → self-healing controllers cannot reschedule workloads because they also cannot reach the API server.
- No graceful degradation: failurePolicy: NoOpinion (only available via --authorization-config) would allow falling through to the next authorizer (RBAC), keeping the cluster functional during a webhook outage. This safety net is currently unreachable.
Steps to Reproduce
- Create a TenantControlPlane with extraArgs specifying --authorization-config:
apiVersion: kamaji.clastix.io/v1alpha1
kind: TenantControlPlane
metadata:
  name: test-tcp
  namespace: test
spec:
  controlPlane:
    deployment:
      extraVolumes:
        - configMap:
            name: authz-config
          name: authz-config-vol
      additionalVolumeMounts:
        apiServer:
          - name: authz-config-vol
            mountPath: /authz-config
      extraArgs:
        apiServer:
          - "--authorization-config=/authz-config/authorization-config.yaml"
- The kube-apiserver pod fails to start with:
E0326 13:22:36.127190 1 run.go:72] "command failed" err="--authorization-config can not be specified when --authorization-mode or --authorization-webhook-* flags are defined"
- Inspecting the pod confirms both flags are present - --authorization-config from extraArgs and --authorization-mode=Node,RBAC from Kamaji's hardcoded defaults.
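The conflict seen in the pod spec can also be checked mechanically. The following is a minimal Go sketch, assuming the container arguments are available in --flag=value form; hasAuthzFlagConflict is a hypothetical helper written for illustration and is not part of Kamaji or Kubernetes.

```go
package main

import (
	"fmt"
	"strings"
)

// hasAuthzFlagConflict is a hypothetical helper: it reports whether an
// argument list contains --authorization-config together with any of the
// flag-based authorization options that kube-apiserver refuses to combine
// with it.
func hasAuthzFlagConflict(args []string) bool {
	var hasConfig, hasFlagBased bool
	for _, arg := range args {
		// Keep only the flag name, dropping "=value" if present.
		name, _, _ := strings.Cut(arg, "=")
		switch {
		case name == "--authorization-config":
			hasConfig = true
		case name == "--authorization-mode",
			strings.HasPrefix(name, "--authorization-webhook-"):
			hasFlagBased = true
		}
	}
	return hasConfig && hasFlagBased
}

func main() {
	args := []string{
		"--allow-privileged=true",
		"--authorization-mode=Node,RBAC",
		"--authorization-config=/authz-config/authorization-config.yaml",
	}
	fmt.Println(hasAuthzFlagConflict(args)) // prints: true
}
```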
Root Cause
In buildKubeAPIServerCommand(), --authorization-mode is hardcoded in desiredArgs:
desiredArgs := map[string]string{
	"--allow-privileged":   "true",
	"--authorization-mode": "Node,RBAC", // <-- always set
	// ...
}
// ...
return utilities.MergeMaps(current, desiredArgs, extraArgs)
MergeMaps() merges all maps (last wins for overlapping keys), but since --authorization-config and --authorization-mode are different keys, both end up in the final args. There is no YAML-level workaround - extraArgs can override a key's value but cannot delete a key.
Source: [internal/builders/controlplane/deployment.go:723](https://github.com/clastix/kamaji/blob/master/internal/builders/controlplane/deployment.go#L723)
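To illustrate why a last-wins merge cannot remove a default, here is a minimal Go sketch. The mergeMaps function is a hypothetical stand-in that only mimics the behavior described above; it is not the actual utilities.MergeMaps implementation.

```go
package main

import "fmt"

// mergeMaps is a hypothetical stand-in for utilities.MergeMaps: it copies all
// given maps into a new one, with later maps winning on overlapping keys.
func mergeMaps(maps ...map[string]string) map[string]string {
	out := make(map[string]string)
	for _, m := range maps {
		for k, v := range m {
			out[k] = v
		}
	}
	return out
}

func main() {
	desiredArgs := map[string]string{"--authorization-mode": "Node,RBAC"}
	extraArgs := map[string]string{
		"--authorization-config": "/authz-config/authorization-config.yaml",
	}

	merged := mergeMaps(desiredArgs, extraArgs)

	// The keys differ, so nothing in extraArgs displaces the default:
	// both flags end up in the merged result.
	_, hasMode := merged["--authorization-mode"]
	_, hasConfig := merged["--authorization-config"]
	fmt.Println(hasMode, hasConfig) // prints: true true
}
```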
Proposed Fix
When --authorization-config is present in extraArgs, remove --authorization-mode (and any --authorization-webhook-* flags) from desiredArgs before merging:
if _, hasAuthzConfig := extraArgs["--authorization-config"]; hasAuthzConfig {
	delete(desiredArgs, "--authorization-mode")
	for k := range desiredArgs {
		if strings.HasPrefix(k, "--authorization-webhook-") {
			delete(desiredArgs, k)
		}
	}
}
return utilities.MergeMaps(current, desiredArgs, extraArgs)
This mirrors how Kubernetes itself treats these flags as mutually exclusive - if the user provides a structured config file, the flag-based equivalents should not be injected.
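The same idea can be tried in isolation. The following is a self-contained Go sketch under the assumption that both argument sets are plain map[string]string values keyed by flag name; dropFlagBasedAuthz is a hypothetical name and not an existing Kamaji function.

```go
package main

import (
	"fmt"
	"strings"
)

// dropFlagBasedAuthz is a hypothetical helper: when extraArgs carries
// --authorization-config, it removes the mutually exclusive flag-based
// authorization defaults from desiredArgs in place. Otherwise it does nothing.
func dropFlagBasedAuthz(desiredArgs, extraArgs map[string]string) {
	if _, ok := extraArgs["--authorization-config"]; !ok {
		return
	}
	for k := range desiredArgs {
		if k == "--authorization-mode" || strings.HasPrefix(k, "--authorization-webhook-") {
			// Deleting entries while ranging over a map is safe in Go.
			delete(desiredArgs, k)
		}
	}
}

func main() {
	desiredArgs := map[string]string{
		"--allow-privileged":   "true",
		"--authorization-mode": "Node,RBAC",
	}
	extraArgs := map[string]string{
		"--authorization-config": "/authz-config/authorization-config.yaml",
	}

	dropFlagBasedAuthz(desiredArgs, extraArgs)
	fmt.Println(len(desiredArgs)) // prints: 1
}
```

Note that this sketch only strips the defaults. If a user sets both --authorization-config and --authorization-mode in extraArgs, the merge would still produce the conflicting pair, so that case may deserve explicit validation.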
Current Workaround
Fall back to flag-based configuration, losing failurePolicy and timeout:
extraArgs:
apiServer:
- "--authorization-mode=Node,Webhook,RBAC"
- "--authorization-webhook-config-file=/authz-config/authz.yml"
- "--authorization-webhook-cache-authorized-ttl=5m"
- "--authorization-webhook-cache-unauthorized-ttl=30s"
This works because Kamaji (since v0.4.2 / #415) allows overriding --authorization-mode. However, this approach has no failurePolicy or timeout, so a webhook outage can still lock out the tenant API server as described under Impact on Cluster Health.