V4 #3807

Draft
DrJosh9000 wants to merge 27 commits into main from v4
Conversation

@DrJosh9000
Contributor

@DrJosh9000 DrJosh9000 commented Apr 8, 2026

Description

Make v4 happen.

Context

It's about time.

Fixes #1391
Fixes #1594
Fixes #1623
Fixes #1646
Closes #1593

Changes

  • Rename v3 to v4 in package names and a few other places.
  • Promote or rip out various experiments, but note a few are left in.
    • allow-artifact-path-traversal: removed, the insecure behaviour is no longer supported
    • normalised-upload-paths: is now default behaviour
    • override-zero-exit-on-cancel: is now default behaviour
    • resolve-commit-after-checkout: is now default behaviour
    • propagate-agent-config-vars: is now default behaviour
    • descending-spawn-priority: removed, with the --spawn-with-priority flag now taking a string value (one of static, ascending, or descending)
  • Rip out the deprecated Docker integration.
  • Remove deprecated CLI flags:
    • trace-context-encoding
    • kubernetes-log-collection-grace-period
    • no-automatic-ssh-fingerprint-verification (use no-ssh-keyscan instead)
    • meta-data (use tags instead)
    • meta-data-ec2 (use tags-from-ec2-meta-data instead)
    • meta-data-ec2-tags (use tags-from-ec2-tags instead)
    • meta-data-gcp (use tags-from-gcp-meta-data instead)
    • tags-from-ec2 (use tags-from-ec2-meta-data instead)
    • tags-from-gcp (use tags-from-gcp-meta-data instead)
    • disconnect-after-job-timeout (use disconnect-after-idle-timeout instead)
    • follow-symlinks (use glob-resolve-follow-symlinks instead)
  • Remove deprecated env vars generated for plugin configuration.
  • Run post-checkout, post-command, and pre-exit hooks in "reverse" order
  • Output a trailing newline in buildkite-agent meta-data get
  • Replace cancel-grace-period and signal-grace-period-seconds flags with cancel-signal-timeout and cancel-cleanup-timeout, and adjust the timeouts (10s signal timeout and 5s cleanup timeout)
  • Remove OpenTracing and various DataDog-specific workarounds
  • Pipeline uploads containing secrets are now rejected by default (reject-secrets is replaced with allow-secrets)
  • Upgrade urfave/cli to v3
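
As a rough illustration of the `--spawn-with-priority` change listed above, a string-valued flag restricted to `static`, `ascending`, or `descending` could be validated along these lines. This is only a sketch: the `SpawnPriority` type and `parseSpawnPriority` function are hypothetical names, not taken from the agent codebase, and the sketch says nothing about how each mode assigns priorities.

```go
package main

import "fmt"

// SpawnPriority is a hypothetical type for the three values that the
// PR description lists for --spawn-with-priority.
type SpawnPriority string

const (
	SpawnPriorityStatic     SpawnPriority = "static"
	SpawnPriorityAscending  SpawnPriority = "ascending"
	SpawnPriorityDescending SpawnPriority = "descending"
)

// parseSpawnPriority (hypothetical) accepts exactly the three listed
// values and rejects anything else with a descriptive error.
func parseSpawnPriority(s string) (SpawnPriority, error) {
	switch p := SpawnPriority(s); p {
	case SpawnPriorityStatic, SpawnPriorityAscending, SpawnPriorityDescending:
		return p, nil
	default:
		return "", fmt.Errorf("invalid spawn priority %q: want static, ascending, or descending", s)
	}
}

func main() {
	for _, v := range []string{"descending", "true"} {
		p, err := parseSpawnPriority(v)
		fmt.Printf("%q -> %q, err=%v\n", v, p, err)
	}
}
```

Under this sketch, a value such as `true`, which a boolean flag would have accepted, is rejected, so configurations written for the old boolean form would likely need updating.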

Testing

  • Tests have run locally (with go test ./...). Buildkite employees may check this if the pipeline has run automatically.
  • Code is formatted (with go tool gofumpt -extra -w .)

Disclosures / Credits

So mechanical it doesn't even need an AI.

@DrJosh9000 DrJosh9000 force-pushed the v4 branch 13 times, most recently from f8066e1 to d28fd70 (April 14, 2026 02:04)
@DrJosh9000 DrJosh9000 force-pushed the v4 branch 9 times, most recently from 8b3d172 to 7686fe7 (April 23, 2026 02:07)
@DrJosh9000 DrJosh9000 force-pushed the v4 branch 7 times, most recently from 638c3b4 to a52917e (May 5, 2026 23:52)
DrJosh9000 and others added 22 commits May 7, 2026 10:08
…verse-for-tear-down-hooks

fix: Reverse ordering for post- hooks
…`cancel-signal-timeout` and `cancel-cleanup-timeout`

Previously, `cancel-grace-period` (default 10s) was the *total* time budget covering both process shutdown and agent-side cleanup (log uploads, artifact uploads, disconnects). `signal-grace-period-seconds` (default -1) controlled how much of that budget went to the process, using negative-relative arithmetic: -1 meant "`cancel-grace-period` minus 1", so the process got 9s and the agent got 1s. This made configuration confusing: the flag that *sounded* like the process's grace period (`cancel-grace-period`) was actually the total, and the actual process grace period required adding a negative number to it. Validation was also complex because the two values had to be checked against each other, and invalid combinations (e.g. `signal-grace-period-seconds` >= `cancel-grace-period`) returned errors.

The new model uses two independent, positive durations:

  `--cancel-signal-timeout` (default 9s): how long the subprocess gets to
  handle the cancel signal before receiving SIGKILL. This is the value
  users actually think about when configuring cancellation.

  `--cancel-cleanup-timeout` (default 1s): how long the agent gets after
  the process exits or is killed to upload logs and artifacts.

The total grace period is simply their sum. There is no validation logic because the values cannot conflict. Both flags accept Go duration syntax (e.g. "30s", "1m30s") via `cli.DurationFlag` instead of integer seconds, matching the convention used by other timeout flags in the agent (`wait-for-ec2-tags-timeout`, `kubernetes-container-start-timeout`, etc.).

The defaults produce the same effective behaviour as before: 9s for the process, 1s for agent cleanup, 10s total.
Replace `cancel-grace-period` and `signal-grace-period-seconds` with `cancel-signal-timeout` and `cancel-cleanup-timeout`
Rip out opentracing tracing backend (take 2)
Make pipeline secret redaction default behaviour
Bump changelog and VERSION for v4.0.0-beta.2
2 participants