High-level release history for ctrl-exec. This is a summary extract — full detail lives in the git log; each entry is anchored to the commit ref (or the release commit) it lands at, not a date. Bullets mark what was added, changed, or removed at the level of the area touched.
- Fixed (CRITICAL) the agent crashing its main process on every accepted connection. The 0.12.2 refactor that moved the request handlers into `Exec::Agent::Server` left the accept loop's pre-fork serial read calling `_peer_serial` unqualified, so it resolved to `main::_peer_serial` (undefined) and died with `Undefined subroutine &main::_peer_serial` the instant a connection arrived - taking down the listener and looping under systemd restart. The sub is now the public `Exec::Agent::Server::peer_serial` and the bin calls it through the package qualifier. Any agent on 0.12.2 was unable to serve a single request (run/ping/discovery all failed); 0.12.3 restores service. No config change required.
- Added a static regression guard (`t/agent-serve-symbols.t`) asserting the agent bin makes no unqualified call to any `Exec::Agent::Server` sub - the exact class of error `use strict` and `perl -c` cannot catch because an undefined-subroutine call only fails at runtime, on a live connection.
- Changed (BREAKING) the security-profile model: there is no longer an implicit built-in `default` profile. Previously a script with no `profile=` annotation ran under a built-in default whose empty `run_as` meant it executed as root (capless) - the opposite of the "restrictive default" the docs described, so enabling the executor escalated unannotated scripts to root. Now every profile, including `default` (the name an unannotated script resolves to), must be defined in `agent.conf`; an undefined profile is refused (fail-closed) rather than run under an implicit context. The shipped `agent.conf.example` defines `[profile default]` as `run_as=nobody`, and the rule is uniform: nothing runs as root unless a profile sets `run_as=root`. UPGRADE: an existing `agent.conf` with unannotated scripts and no `[profile default]` will refuse to serve until that block is added (the error names exactly what to add) or those scripts are annotated.
- Fixed a trust-key divergence in cert-serial canonicalisation: the colon and plain-hex branches stripped leading zeros differently, so the same serial could canonicalise two ways and a revoked/trusted entry pasted from `openssl x509 -text` (colon form) could silently fail to match. One strip now, used at every read.
- Fixed reqid/nonce generation to route every value through the single `/dev/urandom` reader, dropping a non-cryptographic `rand()` fallback.
- Changed input hardening: `allowed_ips` octets are validated and canonicalised at config load (a zero-padded entry no longer fails open by never matching), and the unauthenticated pairing port caps the request body before reading it (memory-exhaustion DoS).
- Changed rate-limit eviction to an amortised single pass instead of an O(n log n) sort on every accept; rotation now batches its registry writes under one lock; and `list_hostnames` reads the registry from directory entries instead of decoding every record.
- Added behaviour tests for the root executor's privilege drop (cap masks, run_as, no_new_privileges, out-of-range run_as), the previously-untested shared utility modules, and made several masked/always-skipping tests able to fail.
- Fixed `make-release.sh` to tag the release commit rather than the prior HEAD (every tag had been one commit too early; `v0.12.0`/`v0.12.1` corrected).
- Changed documentation: removed references to a removed `Exec::Output` module and `request_pairing`, corrected install/source paths, completed the module reference, and documented `max_parallel`.
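
The `allowed_ips` hardening above can be illustrated with a minimal sketch. It is written in Python purely for illustration (it is not the project's code), the function name `canonical_ipv4` is hypothetical, and it assumes that a zero-padded octet is read as decimal and that entries are plain IPv4 addresses with no CIDR suffix.

```python
def canonical_ipv4(text: str) -> str:
    """Validate a dotted-quad IPv4 address and return its canonical form.

    Hypothetical helper: zero-padded octets are interpreted as decimal
    (an assumption), so "010.001.002.003" becomes "10.1.2.3".
    """
    parts = text.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 octets, got {len(parts)}: {text!r}")
    octets = []
    for part in parts:
        # Accept ASCII decimal digits only: no signs, spaces, hex, or empty octets.
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"invalid octet {part!r} in {text!r}")
        value = int(part, 10)
        if value > 255:
            raise ValueError(f"octet out of range {part!r} in {text!r}")
        octets.append(str(value))
    return ".".join(octets)


if __name__ == "__main__":
    # A zero-padded config entry and the plain peer address compare equal
    # once both are canonicalised.
    print(canonical_ipv4("010.001.002.003") == canonical_ipv4("10.1.2.3"))
```

Canonicalising once at config load means every later comparison is a plain string match against the peer address, and a malformed entry is reported at startup rather than silently never matching.
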
- Changed the HTTP API server to run as a dedicated unprivileged `ctrl-exec` service user instead of root. The dispatcher private key is now owned by that user (still `0600`, so the `ctrl-exec` group / operators still cannot read it - they continue to `sudo` for `run`/`ping`); the API reads its own key to dispatch and is in the `ctrl-exec` group for the runtime dirs. It never needs the CA key: `/ping` no longer triggers cert renewal (renewal would need to sign), so renewal is driven solely by `ced maintain` (root timer). An RCE in the network-facing JSON server is therefore not root and cannot sign certificates. The installer/postinst create the user and migrate the key ownership; `setup-ctrl-exec` and `rotate-cert` chown newly generated keys. The shared runtime dirs are setgid (`2770`) and lock files `0660` so the root CLI and the unprivileged API can share the registry, run records, and locks.
- Changed the bundled OpenAPI spec to match the code: added the `/`, `/openapi.json`, `/openapi-live.json` routes, `RunResponse.reqid`, `HostCapabilities.tags`/`reported_hostname`, 404s on `/run` and `/ping`, the `/status` `Authorization` header + owner-gating, and a polymorphic `StatusResponse.hosts`; refreshed the stale version examples.
- Added a concurrent-handler cap to both servers: the API (`api_max_children`, default 64) returns `503` above the cap, and the agent (`max_children`, default 256) closes the connection above the cap on top of its per-IP rate limit - bounding an aggregate connection flood.
- Changed the agent renew timer to decide via `ctrl-exec-agent cert-staged` (which reads `cert_staging_path` from config) instead of a hard-coded path, so a staging-path override is honoured. Added `cert-promote`/`maintain` failure patterns to the LOGGING alert reference.
- Fixed the out-of-the-box auth default: the shipped dispatcher config activates `auth_hook`, but the example hook ended in `exit 1` (deny all), so a fresh install denied every `run`/`ping` - the quickstart could not work. The example now defaults to `exit 0` (allow), so ctrl-exec runs out of the box; the exposure is bounded (the API binds 127.0.0.1, agents are mTLS-gated), and the commented examples remain for production rules. README note corrected.
- Added per-profile read-only filesystem enforcement (the `writable` field is now enforced, no longer parsed-but-inert): when a profile sets `writable`, the executor makes the action's whole filesystem read-only except those paths (a per-script `ProtectSystem=strict`), with a private `/tmp`. Writes elsewhere fail with `EROFS` even when the profile runs as root. Opt-in; requires a Linux kernel >= 5.12 and fails closed on older kernels. Validated on-host.
- Changed the agent's ping response to report `staged` (a renewed cert is staged, awaiting restart), and the dispatcher to skip re-renewal while one is staged - the live expiry still reads old until restart, so this avoids re-signing/re-staging redundantly on every maintenance run.
- Changed the OpenWrt/procd init script to gate startup on pairing state (new `ctrl-exec-agent paired` check): an unpaired agent logs once and stays down instead of respawning forever (procd has no `RestartPreventExitStatus`). Documented a `cert-staged` cron snippet for hands-free renewal adoption there.
- Changed the three servers (API, agent, pairing) to share one `Exec::Http` response writer, removing the drifted hand-rolled status-phrase tables (413 and 500 were inconsistent across them). No wire change beyond the phrase text.
- Fixed a registry lost-update: concurrent read-modify-write of an agent record (serial status, expiry, tags, and `edit_agent` - which a maintenance run or an operator edit can touch at once) could drop a change. A registry-wide lock now serialises all of those updates.
- Changed the agent accept loop to reap all finished children per accept (not one), so request handlers cannot accumulate as zombies under bursty load - matching the API and executor loops.
- Changed `GET /status/{reqid}` so it can be owner-gated. The API now records who submitted each run and runs a `status` auth-hook check that exposes the reqid and the submitter (`ENVEXEC_REQID`, `ENVEXEC_SUBMITTER[_IP]`); the caller authenticates with `Authorization: Bearer`, an unauthorised request gets `404` (no existence disclosure), and the submitter is stripped from responses. With no hook, the unguessable reqid remains the capability. Previously any caller holding a reqid could read any run's output and a hook could not gate it.
- Added hands-free agent-cert renewal, with the trust boundary drawn at the private key. The agent implements `POST /renew` (CSR from its existing key - key continuity preserved) and `POST /renew-complete`: it validates the signed cert (verifies against its CA and that the public key matches its own key) and stages it in its own writable state dir - the cert is public material the agent owns, so no privileged writer is involved. A renewed cert is promoted into the root-owned live path by a root `ExecStartPre` step at the next agent start, and adopted on restart. The dispatcher binds the signed CSR's CN to the agent's identity. Previously the dispatcher posted `/renew` but the agent had no handler, so renewal silently 404'd and certs never renewed.
- Added `ctrl-exec-maintenance.timer` (dispatcher, root) running `ced maintain` - pings all agents (triggering due renewals) and rotates the dispatcher's own cert - and `ctrl-exec-agent-renew.timer` (agent) that restarts the agent to adopt a staged cert. Both enabled on install, so cert lifecycle is hands-free with no operator action. `list-agents` now shows a DAYS LEFT column.
- Changed the default agent config to set `executor_socket` (run through the privileged executor, the recommended posture for per-script profiles); comment it out only where `--async` is needed.
- Changed (behaviour): the environment an allowlisted script runs in is now sanitised. The script no longer inherits the agent's full environment: the front-end keeps a small whitelist (`PATH` reset to a safe default, plus `HOME`/`LANG`/`TZ`/...), and the privileged executor passes a clean `PATH` only. This removes `LD_PRELOAD`/`LD_LIBRARY_PATH`/`BASH_ENV`/`IFS`/`PERL5LIB` as passthrough attack surface against shell/script interpreters. Request context still reaches scripts on stdin (JSON), never via the environment. A script that relied on an inherited variable must now be given it explicitly.
- Added request-size limits to the dispatcher HTTP API (`413` body / `431` headers), mirroring the agent, so an oversized `Content-Length` or a header flood cannot exhaust memory before the auth gate runs.
- Added CSR validation in the CA signer: reject keys under 2048-bit and weak (MD5/SHA-1) self-signatures, verify the CSR's self-signature, and optionally bind the subject CN to an expected identity. Closes the "sign any subject with any key" signing-oracle gap.
- Added `auth_hook_timeout` (default 10s): a hung auth hook is killed and the request fails closed instead of wedging the request handler indefinitely.
- Changed `/capabilities` to fail closed when no dispatcher is trusted (the trusted-dispatcher map is empty), matching `/run`, `/ping`, `/rotate-serial` and `/result`. An unpaired or map-less agent no longer discloses its allowlist.
- Changed the executor to reject an out-of-range numeric `run_as` (at config load and at apply time) instead of silently truncating it, which could land on uid 0 (root).
- Changed CA/dispatcher private-key generation to run under a tight umask so a key is never momentarily group/world-readable between creation and `chmod`.
- Added bounded fan-out concurrency to the dispatcher (`max_parallel`, default 64): a large fleet no longer forks one TLS client per host all at once, which could exhaust file descriptors before the host cap.
- Changed `bin/ctrl-exec-agent` into a modulino (`main() unless caller`) so its request handlers can be loaded and unit-tested directly.
- Changed agent startup so a configuration error (parse error, invalid capability, undefined profile, ...) prints one clear message naming the file and the problem, then exits `EX_CONFIG` (78). The unit lists 78 in `RestartPreventExitStatus`, so systemd reports a single failure instead of respawning into a restart loop that buries the real error ("restart counter is at 134"). Previously these died as a generic exception (255) and looped.
- Changed the "invalid capability" error to detect the common mistake of an inline `# ...` comment on a value line (the format supports whole-line comments only) and say so, instead of the bare "invalid capability '#'".
- Added `docs/TROUBLESHOOTING.md` - use cases and troubleshooting for a running agent: the profile mental model (executor required, one profile per script, executor/`--async` exclusivity), the deploy-and-restart use case, capability-bounded root (`run_as = root` grants only the listed caps, no implicit `CAP_DAC_OVERRIDE`), config-file pitfalls (inline comments, the exit-78 behaviour), upgrade/install messages (`libc6` floor, `-dbgsym`, automatic restart), diagnosing a failed start, and rotation under the executor.
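
The out-of-range `run_as` entry above is easiest to see with concrete numbers. The following is a minimal illustrative sketch in Python, not the executor's C code: the names `parse_run_as_uid`, `truncated_uid`, and `UID_MAX` are hypothetical, and it assumes a 32-bit unsigned `uid_t` with the all-ones value reserved. It covers only the numeric form of `run_as`, not user names.

```python
# Assumption: uid_t is an unsigned 32-bit integer and (uid_t)-1 is reserved.
UID_MAX = 2**32 - 2


def truncated_uid(value: int) -> int:
    """Show what a silent narrowing to 32 bits would produce."""
    return value & 0xFFFFFFFF


def parse_run_as_uid(text: str) -> int:
    """Parse a numeric run_as value, rejecting anything out of range.

    Hypothetical helper: refuses the value instead of truncating it.
    """
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"run_as is not a plain decimal number: {text!r}")
    value = int(text)
    if value > UID_MAX:
        raise ValueError(f"run_as out of range: {text!r}")
    return value


if __name__ == "__main__":
    # 2**32 narrows to 0, which is root: the case the range check exists for.
    print(truncated_uid(4294967296))
    print(parse_run_as_uid("65534"))
```

Checking the range before any narrowing conversion is the point: once the value has been truncated, uid 0 is indistinguishable from a deliberate `run_as=root`.
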
- Added built-in cert rotation. The agent handles dispatcher-serial rotation as a first-class control-plane operation (`POST /rotate-serial`) in the front-end, replacing the `update-ctrl-exec-serial` script. This makes seamless rotation work under privilege separation (the executor keeps the trust map read-only for every action, so a script could not write it; the front-end can) - so rotation and the executor now coexist with no re-pairing. It is also more secure: the dispatcher identity is derived from the caller's authenticated serial, never sent in the request, so a dispatcher can only add/retire serials under its own identity. Gated by the trusted-dispatcher check + the auth hook (action `rotate`). Each new serial is authorised by the currently-trusted one, chaining back to the original human-supervised pairing.
- Removed the `update-ctrl-exec-serial` script and everything that shipped or referenced it (packaging, installer, allowlist example, SBOM, docs). Rotation needs no allowlist entry. If you had it in `scripts.conf`, the entry is now inert and can be deleted.
- Changed profile documentation and added a startup warning: profiles are enforced only by the executor; without `executor_socket` a `profile=` is parsed but not applied (scripts run as the unprivileged agent user). Documented that a script runs under exactly one profile, that `executor_socket` and `--async` are mutually exclusive, and pointed at `capabilities(7)`.
- Fixed a spurious `Failed to stop ctrl-exec-exec.service: Unit not loaded` warning on upgrade from a pre-privsep version (`--no-stop-on-upgrade` on the units; our postinst owns the restart). The install always succeeded; only the message was alarming.
- Changed the postinst upgrade restart to print an informative line per service - "restarted to apply the upgrade" or " is not running - no restart needed" - instead of being silent.
- Fixed the agent `.deb` requiring `libc6 (>= 2.38)`, which blocked install on Debian 12 (glibc 2.36) / Ubuntu 22.04 (glibc 2.35). The executor used `strtol`, which under `_GNU_SOURCE` redirects to the C23 `__isoc23_strtol` (a glibc-2.38 symbol); replaced it with a manual integer parse. The package's libc floor is now 2.34.
- Removed the automatic `-dbgsym` package (`dh_strip --no-automatic-dbgsym`): ctrl-exec does not distribute debug symbols. Also dropped the `ctrl-exec-agent-dbgsym` that the 0.10.0 release accidentally committed to `dist/`. Rebuild with `DEB_BUILD_OPTIONS=nostrip` if you need symbols.
- Fixed the upgrade restart not covering the executor: the agent postinst now restarts both `ctrl-exec-exec.service` and `ctrl-exec-agent.service` (each only if already running). Previously only the Perl front-end was restarted, so after an upgrade the changed C executor kept running its old code until a manual restart - the symptom under privilege separation.
- Added privilege separation. A new root, no-network executor (`ctrl-exec-exec`) runs allowlisted scripts; the unprivileged agent front-end hands it authorised requests over a peer-cred-checked unix socket. The executor re-derives the path and profile from its own root-owned config (it trusts nothing in the message) and applies the profile - mount namespace with the control/state dirs read-only, capability set, `run_as`, and `no_new_privileges` - before exec. Opt-in via `executor_socket` in agent.conf.
- Added per-script security profiles: `[profile <name>]` blocks in agent.conf (`run_as`, `caps`, `writable`, `no_new_privileges`) referenced from `scripts.conf` via `profile=<name>`. Unprofiled scripts use a restrictive default; an undefined profile is a fatal config error (fail-closed). A shared conformance test proves the C executor and the Perl front-end resolve the identical security decision for any config.
- Removed the interim filesystem sandbox (`sandbox`/`writable_paths`/`apply-config` and the `ProtectSystem=strict`-as-action-blocker default). It was a transitional mechanism; per-script profiles enforced by the executor replace it. Deployments that set `writable_paths`/`sandbox` should move the intent into a profile (those keys are now ignored).
- Changed how the dispatcher reports a host it cannot reach: a raw LWP transport string (`500 Can't connect to host:7443 (Connection refused)`) is now translated into a status-like message - `host 'web01' did not resolve`, `… did not respond on port 7443 - connection refused (agent not running, or wrong port?)`, `… is unreachable`, `… connection timed out`, or `TLS handshake … failed`. Applies to run, ping, status, and capabilities. A genuine HTTP status (e.g. 403) is passed through unchanged, never mislabelled as a network fault.
- Changed `ctrl-exec-agent` (`cea`) invoked with no mode: it now prints the usage summary and exits instead of defaulting to `serve`. A bare invocation previously launched the foreground server with no terminal output, which read as a hang. Start the server with an explicit `ctrl-exec-agent serve`; the systemd and procd units already do this, so service-managed agents are unaffected.
- Added a `--version` flag to `ctrl-exec-agent` (`cea`) and `ctrl-exec-dispatcher` (`ced`), printing the installed release version.
- Fixed `.deb` upgrades leaving old code running: the agent and dispatcher postinsts now restart `ctrl-exec-agent.service`/`ctrl-exec-api.service` on upgrade when already active, so the new code takes effect. Fresh installs are still left stopped (the agent cannot serve until paired), and a stopped or unconfigured service is not started.
- Changed the serve pre-flight to exit `78` (`EX_CONFIG`) instead of `1` when the agent is not paired, and added `RestartPreventExitStatus=78` to the unit. An enabled-but-unpaired agent now fails once with the "not paired" message instead of respawning every `RestartSec`. A genuine crash still restarts.
- Changed the agent to register its fully-qualified hostname at pairing (`Net::Domain::hostfqdn()`, falling back to the short name only when no domain is configured), instead of the bare short hostname. The short name does not resolve across subdomains, so a dispatcher on another network could not reach the agent by its registry name; the FQDN resolves consistently and survives a dispatcher move. Pairing now warns if no FQDN could be determined. Re-pair existing agents to update their registry key. The agent's self-reported host in run/capabilities responses is the FQDN too, for consistency.
- Added post-pairing enable/start instructions: a successful interactive pairing now prints the init-appropriate `enable`/`start` commands, since the agent is paired but not yet running.
- Added config-driven sandbox management for the agent. `agent.conf` now takes `sandbox = strict|moderate|off` (filesystem-protection level) and `writable_paths = …` (colon-separated dirs to open under the sandbox); `ctrl-exec-agent apply-config` renders these into a generated systemd drop-in (`…/50-ctrl-exec-sandbox.conf`) and reloads systemd, so writable-path policy is managed from `agent.conf` instead of hand-edited units (a restart applies it, since systemd builds the namespace before the agent starts). `serve` test-writes each `writable_paths` entry at startup and warns on any that are read-only, flagging an unapplied config. Default stays `strict`, matching the shipped unit.
- Added a `hint` field on run/result responses when a script's stderr shows "Read-only file system" (`EROFS`): it names the systemd sandbox as the cause - not permissions or a full disk - and points at `writable_paths`/`apply-config` and the new "Granting scripts a writable path" docs. The script's own stderr is left untouched.
- Fixed the dispatcher cert path being hardcoded as `dispatcher.crt` in the cert-lifecycle paths instead of honouring `ctrl-exec.conf` `cert`/`key` - the one place that names the cert the dispatcher actually presents. On a deployment whose cert is named otherwise (e.g. `ctrl-exec.crt`), `approve` read the serial from the absent `dispatcher.crt`, so the agent paired but trusted no serial and rejected every request as a "serial mismatch". The configured `cert`/`key` are now the single source of truth across `approve` (reads the serial there, and warns loudly if it cannot), `setup-ctrl-exec` (creates them there), and `rotate-cert` (re-keys them in place); `generate_dispatcher_cert` requires explicit paths with no hardcoded default. No migration code - existing deployments work as-is because every path now follows the config.
- Fixed a post-re-pair "serial mismatch": a running agent loads its trusted-dispatcher map once at startup (refreshed only on SIGHUP), so a re-pair that writes a new dispatcher serial to disk does not take effect until the agent is reloaded/restarted. The post-pairing message now detects an already-running agent and tells the operator to `systemctl restart ctrl-exec-agent` so the new certificate and serial are adopted.
- Fixed `serial_to_hex` not stripping an insignificant leading `00` byte in its plain-hex branch (the colon-separated branch already did). A dispatcher serial migrated from a pre-0.9.0 single-serial file as `00aabb...` never matched the live `aabb...` the agent reads from the cert, rejecting every request as a serial mismatch. All forms now canonicalise to minimal hex.
- Added pairing identity diagnostics on the dispatcher. When a request is queued the dispatcher now records a forward-confirmed reverse-DNS lookup of the agent's source IP (bounded by a short timeout). `list-requests` and the interactive approve prompt show the reported name, source IP, and reverse-DNS name, plus a recommendation (register the resolvable FQDN via `edit-agent --rename`, or fall back to `--lookup-by ip`) for when the reported short name will not resolve from the dispatcher - the common DHCP/network-managed-FQDN case. After approve, the dispatcher prints exactly what was registered (name, lookup_by, address) and the `edit-agent` command to change it without re-pairing (dispatch auth is CA-based, so no new certificate is needed).
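
The serial-canonicalisation fixes above (one strip, applied after the colon and plain-hex forms are unified) can be sketched as follows. This is an illustrative Python sketch, not the project's Perl `serial_to_hex`; the name `canonical_serial` is hypothetical, and lowercasing the result is an assumption, since the entries only state that all forms canonicalise to minimal hex.

```python
import re

_HEX = re.compile(r"[0-9a-fA-F]+")


def canonical_serial(text: str) -> str:
    """Return a certificate serial as minimal hex.

    Hypothetical helper: accepts colon-separated ("00:AA:BB") or plain
    ("00aabb") input. Lowercase output is an assumption.
    """
    # Unify the two input forms first...
    cleaned = text.strip().replace(":", "")
    if not _HEX.fullmatch(cleaned):
        raise ValueError(f"not a hex serial: {text!r}")
    # ...then strip leading zeros exactly once, so both forms agree.
    stripped = cleaned.lower().lstrip("0")
    return stripped or "0"


if __name__ == "__main__":
    # The colon form and a zero-prefixed plain form map to the same key.
    print(canonical_serial("00:AA:BB") == canonical_serial("00aabb"))
```

Because the strip happens after the forms are merged, there is a single code path for leading zeros, which is what removes the divergence between the two branches.
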
Lands at the 0.9.0 release commit.
- Added native multi-dispatcher support: an agent serves more than one dispatcher. Trust is keyed on a per-dispatcher map (`<serial> <id>` entries) at `/var/lib/ctrl-exec-agent/ctrl-exec-dispatchers`; pairing appends a dispatcher rather than replacing the previous one.
- Added a stable dispatcher identity (`dispatcher_id`, defaults to the dispatcher hostname), delivered at pairing and rotation; permission and attribution key on the identity, never the rotating serial.
- Added per-call attribution: a `DISPATCHER` field on agent run/ping/result/capabilities logs, and `ENVEXEC_DISPATCHER`/`ENVEXEC_DISPATCHER_SERIAL` in the agent auth-hook environment.
- Added an owner-partitioned async result store (`runs/<dispatcher-id>/<reqid>.json`) with an owner-gated `GET /result/<reqid>` - a run's output is returned only to the dispatcher that submitted it.
- Changed cert rotation to seamless add-then-remove against the trusted map (broadcast the new serial under the stable identity, keep the old through the overlap window, then retire it) - no re-pairing for reachable agents.
- Changed the trusted store from a single dispatcher serial in `/etc` to the agent-writable map in the state dir, so rotation can update trust in place; legacy single-serial installs are migrated automatically on upgrade.
- Removed the single-trusted-serial model (`ctrl-exec-serial`, `dispatcher_serial_path`, `load_dispatcher_serial`).
- Fixed packaging: the dispatcher `.deb` now ships `ctrl-exec-api.service` (the named systemd unit was previously dropped by `dh_installsystemd`).
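
How a serial-keyed trust map and an owner-partitioned result store fit together can be sketched roughly as below. This is an illustrative Python sketch rather than the agent's Perl code: the function names are hypothetical, and the handling of blank lines and `#` comments, the lowercase serial key, and the path-component check are all assumptions not stated in the entries above.

```python
from pathlib import PurePosixPath


def parse_dispatcher_map(text: str) -> dict[str, str]:
    """Parse "<serial> <id>" lines into {serial: dispatcher_id}."""
    trusted: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # assumption: blanks/comments skipped
            continue
        serial, dispatcher_id = line.split(None, 1)
        trusted[serial.lower()] = dispatcher_id
    return trusted


def result_path(runs_dir: str, dispatcher_id: str, reqid: str) -> PurePosixPath:
    """Build runs/<dispatcher-id>/<reqid>.json, refusing unsafe components."""
    for part in (dispatcher_id, reqid):
        if not part or "/" in part or part in (".", ".."):
            raise ValueError(f"unsafe path component: {part!r}")
    return PurePosixPath(runs_dir) / dispatcher_id / f"{reqid}.json"


def may_read_result(trusted: dict[str, str], caller_serial: str, owner_id: str) -> bool:
    """Owner gate: the caller's serial must map to the identity that owns the run."""
    return trusted.get(caller_serial.lower()) == owner_id


if __name__ == "__main__":
    trusted = parse_dispatcher_map("aabb ctl01\nccdd ctl02\n")
    print(result_path("runs", "ctl01", "r1"))
    print(may_read_result(trusted, "ccdd", "ctl01"))
```

Keying ownership on the stable identity rather than the serial is what lets a rotated dispatcher (new serial, same identity) still read the results it submitted earlier.
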
Release commits through v0.8.14.
- Added MCP integration: self-describing script schema sidecars in core and the `ctrl-exec-mcp` bridge plugin.
- Added asynchronous / long-running jobs - detached execution with a result store polled via `status`/`wait`.
- Added `.deb` packaging tracked in-repo with stale-version pruning.
- Changed naming throughout to the dispatcher/agent split: the control-host binary and package became `ctrl-exec-dispatcher`, cert files `dispatcher.{crt,key}`, cert CN `ctrl-exec-dispatcher`.
- Changed pairing/dispatch addressing: register the agent's real IP behind NAT, resolve every verb through one registry path, and fail loudly on unknown agents.
- Added pairing-mode session timeout and start/stop subcommands.
Release commits through v0.7.7 and the v0.1–v0.6 series. Foundational
work, summarised by theme (see the git log for per-tag detail):
- Added the core mTLS control plane: dispatcher CA, agent pairing with a 6-digit confirmation code, and the allowlisted `/run`/`/ping`/`/capabilities` agent endpoints.
- Added the auth-hook trust model (default-deny), rate limiting, IP allowlisting, cert revocation, and the agent-side serial restriction on `/capabilities`.
- Added cert rotation with an overlap window, the agent registry, tag-based discovery, and the optional `ctrl-exec-api` HTTP API with an OpenAPI spec.
- Added the CycloneDX SBOM, the release tooling (`make-release.sh`), and brand repackaging.