
hydra: testing scheduler fixes on staging#1092

Merged
Mic92 merged 9 commits into main from scheduler-fixes on Jun 20, 2026

Conversation


@Mic92 Mic92 commented Jun 19, 2026

Member

This will also be deployed to hydra.nixos.org in the same PR.

@Mic92 Mic92 requested a review from a team as a code owner June 19, 2026 09:39
@Mic92 Mic92 force-pushed the scheduler-fixes branch 2 times, most recently from 9dd507e to 1e65dba Compare June 19, 2026 14:38
Mic92 added 5 commits June 19, 2026 17:21
The CA cert path used ../non-critical-infra, resolving to the nonexistent
macs/non-critical-infra instead of the repo root. A bare path literal in a
flake is rebased onto the source store path without an existence check, so the
builder built fine but crash-looped at runtime unable to read ca.crt. Fix the
path and wrap the static certs in builtins.path so a wrong path fails at eval.
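
The off-by-one in the relative path can be illustrated with plain path arithmetic. This is only a sketch in Python, not the Nix code from the commit: the directory `macs/profiles` is a hypothetical location for the module (the commit message does not name it), and `resolve` is a helper invented for the illustration.

```python
import posixpath


def resolve(from_dir: str, relative: str) -> str:
    """Resolve a relative path against the directory that holds the file."""
    return posixpath.normpath(posixpath.join(from_dir, relative))


# Hypothetical: the module sits two levels below the repo root.
module_dir = "macs/profiles"

# One ".." only climbs out of the subdirectory and stays inside macs/.
wrong = resolve(module_dir, "../non-critical-infra")

# Two ".." reach the repo root, where non-critical-infra actually lives.
right = resolve(module_dir, "../../non-critical-infra")

print(wrong)
print(right)
```

Nothing in this arithmetic checks that the result exists, which mirrors why the bare path literal only failed at runtime; per the commit message, wrapping the path in `builtins.path` moves that failure to evaluation time.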

mac04 and mac05 were reinstalled with rotated host keys, so their sops age
recipients were stale and queue-runner-client.key no longer decrypted. Re-key.

Builders upload NARs directly to S3 via presigned URLs instead of streaming
through the queue runner. This requires every builder to substitute from the
forced cache, otherwise the queue runner rejects it, so add the staging cache
as a substituter on all staging builders via a shared module.

eval04 and build05 used the bare ::/64 address instead of ::1/64 like the other
ofborg nodes, and their AAAA records pointed at the same address. The all-zero
interface ID is reserved as the Subnet-Router anycast address (RFC 4291), so it
is not a valid host unicast address, and it was not reachable from here. Use ::1
in both the host config and DNS to match the other nodes.
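
A check like the following could catch this class of mistake before deployment. It is a minimal sketch using Python's standard `ipaddress` module; the function name is made up here, and the `2001:db8::/32` addresses are documentation placeholders, not the real ofborg prefixes.

```python
import ipaddress


def is_subnet_router_anycast(interface: str) -> bool:
    """Return True if the address has an all-zero interface ID in its subnet."""
    iface = ipaddress.IPv6Interface(interface)
    # Skip /127 point-to-point links and /128 host routes in this sketch.
    if iface.network.prefixlen >= 127:
        return False
    # The lowest address of the subnet is the Subnet-Router anycast address.
    return iface.ip == iface.network.network_address


# Placeholder addresses from the documentation prefix.
print(is_subnet_router_anycast("2001:db8:1:2::/64"))
print(is_subnet_router_anycast("2001:db8:1:2::1/64"))
```

The same comparison could be run against the AAAA records, since the commit notes that DNS pointed at the same all-zero address.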

Turns out that when a multi-part complete returns an error,
you have to check whether the upload actually completed successfully.
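
The pattern described in this commit can be sketched as follows. This is not the queue runner's code: `complete` and `object_exists` are hypothetical callables standing in for the real S3 calls, and probing for the final object is an assumption about how one might verify completion.

```python
from typing import Callable


def complete_multipart(
    complete: Callable[[], None],
    object_exists: Callable[[], bool],
) -> None:
    """Finish a multipart upload, tolerating errors that hide a success."""
    try:
        complete()
    except Exception:
        # The request may have succeeded server-side even though the
        # client saw an error, so check before treating it as a failure.
        if object_exists():
            return
        raise


def flaky_complete() -> None:
    # Hypothetical failure: the response was lost after the server finished.
    raise RuntimeError("connection reset")


# The object is present, so the error is swallowed and no exception escapes.
complete_multipart(flaky_complete, lambda: True)
```

Without the verification step, a retry would run against an upload that no longer exists and the build would be reported as failed despite the object being in the bucket.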
@Mic92 Mic92 force-pushed the scheduler-fixes branch from 1e65dba to e52820c Compare June 19, 2026 16:25
Mic92 added 3 commits June 19, 2026 19:19
Read cached-build nix-support from the .ls listing/NAR, not the local store.
Read requiredSystemFeatures from structured attrs so big-parallel builds schedule correctly.
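
For the `requiredSystemFeatures` fix, the lookup can be sketched roughly like this. It assumes the usual Nix layout, where a derivation with `__structuredAttrs` carries its attributes as JSON in the `__json` environment entry and a plain derivation carries a space-separated string; the function name and the sample environments are hypothetical, and this is not Hydra's implementation.

```python
import json


def required_system_features(env: dict[str, str]) -> list[str]:
    """Extract requiredSystemFeatures from a derivation's environment."""
    if "__json" in env:
        # Structured attrs: the value is a JSON list inside __json.
        attrs = json.loads(env["__json"])
        return list(attrs.get("requiredSystemFeatures", []))
    # Plain derivation: the value is a space-separated string.
    return env.get("requiredSystemFeatures", "").split()


# Hypothetical derivation environments.
structured = {"__json": json.dumps({"requiredSystemFeatures": ["big-parallel"]})}
plain = {"requiredSystemFeatures": "big-parallel kvm"}

print(required_system_features(structured))
print(required_system_features(plain))
```

Reading only the plain environment entry would return an empty list for the structured case, which would let a `big-parallel` build be scheduled on a machine that lacks the feature.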
@Mic92 Mic92 merged commit b0630ed into main Jun 20, 2026
20 of 21 checks passed
@Mic92 Mic92 deleted the scheduler-fixes branch June 20, 2026 08:41