feat(executor): add MaxConcurrentReconciles config for TaskAction controller (#7205) #7356

Open: jbbqqf wants to merge 1 commit into flyteorg:main from jbbqqf:feat/7205-add-max-concurrent-reconciles-config

@jbbqqf jbbqqf commented May 9, 2026

Tracking issue

Closes #7205

Why are the changes needed?

The TaskAction controller in executor/pkg/controller/taskaction_controller.go
currently calls SetupWithManager without any WithOptions, so
controller-runtime always uses its default MaxConcurrentReconciles = 1.
Operators who want to spend more CPU on parallel reconciles (because their
TaskAction queue grows faster than a single worker can drain it) have no
way to tune this without forking the binary.

Issue #7205 asks for the standard controller-runtime knob —
controller.Options.MaxConcurrentReconciles — to be exposed in
the executor config.
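
To make the motivation concrete, here is a minimal, stdlib-only sketch of what a larger worker count buys: the same queue of keys is drained by one worker and then by four. It is only an analogy, not controller-runtime's implementation; `drain`, the key names, and the simulated 5 ms reconcile are hypothetical, and the sketch does not model the workqueue's per-key deduplication.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// drain hands every key to a pool of `workers` goroutines and reports how many
// keys were handled and the highest number of reconciles seen in flight at once.
// Hypothetical helper for illustration; not part of controller-runtime or Flyte.
func drain(keys []string, workers int, reconcile func(string)) (handled, peak int64) {
	if workers < 1 {
		workers = 1 // mirror the "zero means one worker" default
	}
	queue := make(chan string)
	var inFlight int64
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for k := range queue {
				n := atomic.AddInt64(&inFlight, 1)
				// Raise the recorded peak if this reconcile pushed it higher.
				for {
					p := atomic.LoadInt64(&peak)
					if n <= p || atomic.CompareAndSwapInt64(&peak, p, n) {
						break
					}
				}
				reconcile(k)
				atomic.AddInt64(&handled, 1)
				atomic.AddInt64(&inFlight, -1)
			}
		}()
	}
	for _, k := range keys {
		queue <- k
	}
	close(queue)
	wg.Wait()
	return handled, peak
}

func main() {
	keys := []string{"a", "b", "c", "d", "e", "f", "g", "h"}
	slow := func(string) { time.Sleep(5 * time.Millisecond) } // simulated reconcile
	for _, w := range []int{1, 4} {
		start := time.Now()
		handled, peak := drain(keys, w, slow)
		fmt.Printf("workers=%d handled=%d peak=%d elapsed=%s\n",
			w, handled, peak, time.Since(start).Round(time.Millisecond))
	}
}
```

With one worker the peak can never exceed 1, so total time grows linearly with queue length; with more workers the peak is bounded by the worker count instead.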

What changes were proposed in this pull request?

  • executor/pkg/config/config.go — add MaxConcurrentReconciles uint32
    to Config, default 1 (matches controller-runtime's own default and
    preserves the historical single-worker behavior). Field is registered
    with the existing pflag mechanism.
  • executor/pkg/config/config_flags.go — register the new
    --executor.maxConcurrentReconciles pflag (the generated-flags file is
    hand-edited in this repo; same pattern was used for
    maxSystemFailures).
  • executor/pkg/controller/taskaction_controller.go — add an exported
    MaxConcurrentReconciles int field on TaskActionReconciler and call
    WithOptions(controller.Options{MaxConcurrentReconciles: ...}) only
    when the field is > 0. Zero falls through to controller-runtime's own
    default so callers that have not wired the knob yet still get the
    pre-#7205 behavior.
  • executor/setup.go — wire cfg.MaxConcurrentReconciles into the
    reconciler before SetupWithManager is called, with an explanatory
    comment about the uint32 → int conversion.
  • executor/pkg/config/config_test.go (new) — locks the default value
    of 1 as a tripwire; bumping the default would change worker pool
    sizing for every existing deployment.
  • executor/pkg/controller/taskaction_controller_test.go — adds a
    MaxConcurrentReconciles plumbing (#7205) Context with two specs
    asserting (a) the zero value yields the deferred-to-controller-runtime
    branch and (b) an explicitly configured value is preserved on the
    reconciler for SetupWithManager to forward.
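
The wiring described in the list above can be sketched roughly as follows. This is a self-contained approximation, not the PR's code: `options` and `builder` are hypothetical stand-ins for controller-runtime's `controller.Options` and controller builder, and `configure` stands in for the relevant part of `SetupWithManager`. Only the "forward when > 0, otherwise skip `WithOptions`" rule and the uint32 to int conversion are taken from the description.

```go
package main

import "fmt"

// options is a stand-in for controller-runtime's controller.Options (assumption).
type options struct {
	MaxConcurrentReconciles int
}

// builder is a stand-in for the controller builder; it only records
// whether WithOptions was called and with what value (assumption).
type builder struct {
	opts    options
	optsSet bool
}

func (b *builder) WithOptions(o options) *builder {
	b.opts, b.optsSet = o, true
	return b
}

// TaskActionReconciler mirrors only the field this PR adds.
type TaskActionReconciler struct {
	MaxConcurrentReconciles int
}

// configure applies the rule from the description: forward the value only
// when it is positive, otherwise leave the builder untouched so the
// framework default applies.
func (r *TaskActionReconciler) configure(b *builder) *builder {
	if r.MaxConcurrentReconciles > 0 {
		return b.WithOptions(options{MaxConcurrentReconciles: r.MaxConcurrentReconciles})
	}
	return b
}

func main() {
	// The config field is uint32 while the options field is int; the
	// conversion is lossless wherever int is 64 bits wide.
	var fromConfig uint32 = 4
	r := &TaskActionReconciler{MaxConcurrentReconciles: int(fromConfig)}
	b := r.configure(&builder{})
	fmt.Println(b.optsSet, b.opts.MaxConcurrentReconciles) // true 4

	zero := (&TaskActionReconciler{}).configure(&builder{})
	fmt.Println(zero.optsSet) // false
}
```

Keeping the zero value as "do not call `WithOptions`" means a reconciler constructed without any config wiring behaves as it did before the field existed.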

The diff is intentionally narrow — five files, +67/−3 lines — so a
reviewer can verify each piece independently. No proto changes, no
existing tests modified.

How was this patch tested?

Local checks

$ go vet ./executor/...
(clean)
$ go build ./executor/...
(clean)
$ go build ./...        # full repo
(clean)

$ go test ./executor/pkg/config/...
ok  	github.com/flyteorg/flyte/v2/executor/pkg/config	0.029s

$ KUBEBUILDER_ASSETS=… go test ./executor/pkg/controller/...
Ran 32 of 32 Specs in 8.871 seconds
PASS
ok  	github.com/flyteorg/flyte/v2/executor/pkg/controller	8.969s

The controller-suite uses envtest, which needs kube-apiserver +
etcd binaries. I fetched them via
setup-envtest use 1.30.x --bin-dir /tmp/envtest and exported
KUBEBUILDER_ASSETS to that path; this matches the setup that
getFirstFoundEnvTestBinaryDir (in suite_test.go) is looking for at
bin/k8s/.... The 32 specs include the two new ones added in this PR
plus the pre-existing 30.

Reproduce BEFORE/AFTER yourself (copy-paste)

A reviewer can verify the change end-to-end by pasting the block below.
The same go test line runs on origin/main (BEFORE) and on this PR
(AFTER); the only thing that changes is the checked-out ref.

# --- one-time setup ---
git clone https://github.com/flyteorg/flyte.git /tmp/repro && cd /tmp/repro
go install sigs.k8s.io/controller-runtime/tools/setup-envtest@latest
"$(go env GOPATH)"/bin/setup-envtest use 1.30.x --bin-dir /tmp/envtest >/dev/null
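# Ask setup-envtest for the resolved path instead of hard-coding the patch version and platform.
export KUBEBUILDER_ASSETS="$("$(go env GOPATH)"/bin/setup-envtest use 1.30.x --bin-dir /tmp/envtest -p path)"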

# --- BEFORE (origin/main) ---
git checkout origin/main
go test -v -run 'TestControllers' ./executor/pkg/controller/... -ginkgo.focus 'MaxConcurrentReconciles' 2>&1 | tail -5
# Expected: "no specs to run" / 0 ran — the spec doesn't exist on main, and the
#           Reconciler struct has no MaxConcurrentReconciles field.
grep -n 'MaxConcurrentReconciles' executor/pkg/controller/taskaction_controller.go || echo "MISSING on main (as expected)"
# Expected: prints "MISSING on main (as expected)" — confirms the gap.

# --- AFTER (this PR) ---
git fetch https://github.com/jbbqqf/flyte.git feat/7205-add-max-concurrent-reconciles-config
git checkout FETCH_HEAD
go test -v -run 'TestControllers' ./executor/pkg/controller/... -ginkgo.focus 'MaxConcurrentReconciles' 2>&1 | tail -8
# Expected: 2 specs PASS — "defaults to 0 on a zero-value reconciler ..." and
#           "preserves an explicitly configured value ...".
grep -n 'MaxConcurrentReconciles' executor/pkg/controller/taskaction_controller.go | head -3
# Expected: lines showing the new field and the WithOptions wiring.

Labels

  • added — new public configuration field
    executor.maxConcurrentReconciles, no behavior change for unset / default.

Setup process

No setup changes required for users on the default config. Operators who
want to scale reconciles add executor.maxConcurrentReconciles: <N> to
the executor config (or --executor.maxConcurrentReconciles=<N> on the
CLI). A value of 0 defers to controller-runtime's own default of 1, which
matches the pre-PR behavior.
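
The interaction between the default and an operator override can be sketched as below. This is an approximation using only the standard library: the real executor loads its config through Flyte's config and pflag machinery, whereas this sketch decodes a JSON fragment standing in for the executor section. The `load` helper and the JSON tag are assumptions; only the field name, its uint32 type, and the default of 1 come from the description.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config mirrors only the field this PR adds; the JSON tag is assumed.
type Config struct {
	MaxConcurrentReconciles uint32 `json:"maxConcurrentReconciles"`
}

// load starts from the default of 1 and overlays whatever the operator set.
// Keys that are absent from raw leave the default untouched.
func load(raw string) (Config, error) {
	cfg := Config{MaxConcurrentReconciles: 1}
	err := json.Unmarshal([]byte(raw), &cfg)
	return cfg, err
}

func main() {
	unset, _ := load(`{}`)
	fmt.Println(unset.MaxConcurrentReconciles) // 1

	tuned, _ := load(`{"maxConcurrentReconciles": 4}`)
	fmt.Println(tuned.MaxConcurrentReconciles) // 4

	// An unsigned field rejects negative worker counts at decode time.
	_, err := load(`{"maxConcurrentReconciles": -1}`)
	fmt.Println(err != nil) // true
}
```

Using an unsigned type for the config field means a negative worker count fails at load time instead of reaching the controller.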

Edge cases tested

| # | Scenario | Input | Expected | Verified by |
|---|----------|-------|----------|-------------|
| 1 | Default config (no override) | defaultConfig.MaxConcurrentReconciles == 1 | reconciler.MaxConcurrentReconciles == 1 → WithOptions called with 1 | TestDefaultMaxConcurrentReconciles |
| 2 | Zero-value reconciler (no config wiring at all) | r := &TaskActionReconciler{} | r.MaxConcurrentReconciles == 0; SetupWithManager skips WithOptions; controller-runtime default applies | MaxConcurrentReconciles plumbing: defaults to 0 … |
| 3 | Explicitly configured value | r := &TaskActionReconciler{MaxConcurrentReconciles: 4} | field preserved on the reconciler; SetupWithManager forwards it via WithOptions | MaxConcurrentReconciles plumbing: preserves an explicitly configured value … |
| 4 | Existing 30 controller specs | make test of the controller package | unchanged: 30 → 32 PASS, no regressions | Ran 32 of 32 Specs in 8.871 seconds |

Check all the applicable boxes

  • I updated the documentation accordingly. (In-code doc comments
    added on the new field and the SetupWithManager method; no separate
    docs site change since this is a runtime config knob covered by the
    pflag help text.)
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

None directly. The change touches only executor-internal config
plumbing, so there is no proto or sister-service coordination required.

Risk / blast radius

Additive only. With the unset / default config (MaxConcurrentReconciles = 1), the controller behaves exactly as before — controller-runtime
already uses 1 as its own default, and we now pass 1 explicitly. The
only observable change for an operator that does set the new key is the
intended one: more reconcile workers run in parallel, which the upstream
controller.Options documentation already covers. No proto change, no
API surface change, no DB migration.

Release note

executor: added `executor.maxConcurrentReconciles` config (default 1) to
control the controller-runtime worker pool size for the TaskAction
controller (#7205).

PR drafted with assistance from Claude Code. The change was reviewed
manually against flyteorg/flyte HEAD and against the upstream
controller-runtime API; the reproducer block above is the same one used
during development. All commits signed off per project DCO policy
(.github/dco.yml).

…troller (flyteorg#7205)

Add a MaxConcurrentReconciles knob to the executor config and wire it
through to the TaskAction controller via controller.Options. Before this
change, SetupWithManager unconditionally built the controller without
WithOptions, so operators had no way to scale the reconcile worker pool
without forking the binary.

The default is 1, matching controller-runtime's own default and
preserving the historical single-worker behavior. Setting the config
key to 0 explicitly defers to controller-runtime's default; any positive
value is forwarded as-is.

Co-Authored-By: Claude Code <noreply@anthropic.com>
Signed-off-by: Jean-Baptiste Braun <jbaptiste.braun@gmail.com>
@github-actions github-actions Bot added the flyte2 label May 9, 2026