Harden registry rename and startup recovery#197
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (6)
💤 Files with no reviewable changes (4)
📝 WalkthroughWalkthroughAdds staged registry host rename support: new ChangesSupport for Staged Registry Host Rename
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/usecase/container/service.go (1)
1368-1370: 🧹 Nitpick | 🔵 Trivial | 💤 Low valueSimplify early return for already-local refs.
When
imageRefalready has thelocalPrefix, reconstructinglocalPrefix + remainderreturns the original string unchanged. ReturningimageRefdirectly is clearer.♻️ Suggested simplification
- if remainder, ok := strings.CutPrefix(imageRef, localPrefix); ok { - return localPrefix + remainder + if strings.HasPrefix(imageRef, localPrefix) { + return imageRef }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/usecase/container/service.go` around lines 1368 - 1370, The code reconstructs a value by concatenating localPrefix + remainder even when imageRef already starts with localPrefix; instead, detect the early-return case and return imageRef directly to avoid unnecessary string operations—update the logic in the function that handles imageRef/localPrefix (referencing imageRef, localPrefix, remainder) so when imageRef.HasPrefix(localPrefix) you return imageRef, nil rather than rebuilding the same string.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/app/run.go`:
- Around line 1975-1977: autoRouteHandler is created once with registryDomain
and legacyRegistryDomains resolved from startup cfg (resolveRegistryDomains +
container.NewAutoRouteHandler + WithEnvExtractor), so config reloads don’t
update its domain set; change the handler to read domains from svc.configSvc on
each event or rebuild/rebind autoRouteHandler when config reload
completes—either (a) modify NewAutoRouteHandler/AutoRouteHandler to accept
svc.configSvc and call resolveRegistryDomains(svc.configSvc.Current()) per
request/event, or (b) listen for the config reload completion and recreate
autoRouteHandler by calling resolveRegistryDomains(cfg) and
container.NewAutoRouteHandler(...). Ensure references to resolveRegistryDomains,
autoRouteHandler, NewAutoRouteHandler, WithEnvExtractor and svc.configSvc are
updated accordingly.
In `@internal/usecase/config/service.go`:
- Around line 877-897: canonicalAttachmentsForSave currently canonicalizes image
refs but does not deduplicate them, so legacy and current registry variants can
produce duplicate canonical refs; update canonicalAttachmentsForSave to
deduplicate per target after calling domain.CanonicalizeGordonImageRef by
collecting canonical refs into a temporary set (map[string]struct{}) while
preserving original order (append to canonicalImages only when not already
seen), then assign the deduped slice to result[target]; ensure nil handling
stays the same and use the existing function name canonicalAttachmentsForSave
and the call to domain.CanonicalizeGordonImageRef as the anchor points to
change.
---
Outside diff comments:
In `@internal/usecase/container/service.go`:
- Around line 1368-1370: The code reconstructs a value by concatenating
localPrefix + remainder even when imageRef already starts with localPrefix;
instead, detect the early-return case and return imageRef directly to avoid
unnecessary string operations—update the logic in the function that handles
imageRef/localPrefix (referencing imageRef, localPrefix, remainder) so when
imageRef.HasPrefix(localPrefix) you return imageRef, nil rather than rebuilding
the same string.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3c899410-cda3-4f4c-be07-b8c18b56b578
📒 Files selected for processing (20)
docs/cli/serve.mddocs/config/index.mddocs/config/server.mddocs/upgrading.mdgordon.toml.exampleinternal/adapters/in/cli/controlplane_local.gointernal/adapters/in/cli/controlplane_local_test.gointernal/app/run.gointernal/app/startup_recovery.gointernal/app/startup_recovery_test.gointernal/domain/registry_image.gointernal/domain/registry_image_test.gointernal/usecase/auto/validation.gointernal/usecase/auto/validation_test.gointernal/usecase/config/service.gointernal/usecase/config/service_test.gointernal/usecase/container/autoroute.gointernal/usecase/container/autoroute_test.gointernal/usecase/container/service.gointernal/usecase/container/service_test.go
| registryDomain, legacyRegistryDomains := resolveRegistryDomains(cfg) | ||
| autoRouteHandler := container.NewAutoRouteHandler(ctx, svc.configSvc, svc.containerSvc, svc.blobStorage, registryDomain, legacyRegistryDomains...). | ||
| WithEnvExtractor(svc.runtime, svc.envDir) |
There was a problem hiding this comment.
Refresh auto-route registry domains on hot reload.
Lines 1975-1977 bind registryDomain and legacyRegistryDomains from the startup cfg, so later config reloads never update the autoRouteHandler. After changing server.gordon_domain or server.legacy_registry_domains at runtime, image-push ownership checks will still use the old host set until Gordon is restarted, which breaks the staged-rename flow this PR is adding. Make the handler read the current domains from configSvc per event, or rebuild/rebind it when config reload completes.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/app/run.go` around lines 1975 - 1977, autoRouteHandler is created
once with registryDomain and legacyRegistryDomains resolved from startup cfg
(resolveRegistryDomains + container.NewAutoRouteHandler + WithEnvExtractor), so
config reloads don’t update its domain set; change the handler to read domains
from svc.configSvc on each event or rebuild/rebind autoRouteHandler when config
reload completes—either (a) modify NewAutoRouteHandler/AutoRouteHandler to
accept svc.configSvc and call resolveRegistryDomains(svc.configSvc.Current())
per request/event, or (b) listen for the config reload completion and recreate
autoRouteHandler by calling resolveRegistryDomains(cfg) and
container.NewAutoRouteHandler(...). Ensure references to resolveRegistryDomains,
autoRouteHandler, NewAutoRouteHandler, WithEnvExtractor and svc.configSvc are
updated accordingly.
There was a problem hiding this comment.
Pull request overview
This PR hardens Gordon’s transition between registry hostnames by introducing shared “current + legacy registry domain” handling across image matching, internal pull rewriting, auto-route ownership checks, and config persistence. It also refactors startup behavior so reboot recovery consistently uses Gordon’s internal deploy semantics and no longer depends on auto_route.enabled.
Changes:
- Add
legacy_registry_domainssupport and shared domain helpers for identifying/stripping/canonicalizing Gordon-managed image refs. - Update container deploy/pull rewriting, auto-route repo ownership checks, and config image matching/saving to treat legacy + current registry hosts equivalently.
- Refactor startup recovery into a dedicated helper that syncs containers and recovers configured routes after listeners bind; update local deploy to use internal deploy context.
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/usecase/container/service.go | Uses shared domain helpers for registry stripping/recognition and improves internal pull ref rewriting behavior. |
| internal/usecase/container/service_test.go | Adds coverage for legacy registry rewrite behavior and internal deploy pulls. |
| internal/usecase/container/autoroute.go | Treats legacy/current registry hosts as the same repo for auto-route ownership checks. |
| internal/usecase/container/autoroute_test.go | Adds tests for auto-route ownership across legacy/current registry hosts. |
| internal/usecase/config/service.go | Loads legacy domains, matches images across legacy/current hosts, and canonicalizes Gordon-managed refs on save (routes + attachments). |
| internal/usecase/config/service_test.go | Adds tests ensuring canonicalization happens on disk while preserving in-memory refs. |
| internal/usecase/auto/validation.go | Delegates repo extraction to shared domain helper and supports legacy domains. |
| internal/usecase/auto/validation_test.go | Adds coverage for legacy/current host equivalence and ports/digests. |
| internal/domain/registry_image.go | Introduces shared utilities for known domains, stripping, canonicalization, and repo extraction. |
| internal/domain/registry_image_test.go | Adds unit tests for registry-image domain utilities (including hostile lookalikes). |
| internal/app/startup_recovery.go | New startup recovery helper to sync containers and recover configured routes using internal deploy semantics. |
| internal/app/startup_recovery_test.go | Tests startup recovery sequencing and best-effort behavior. |
| internal/app/run.go | Wires legacy registry domain config through services and swaps startup flow to the new recovery helper. |
| internal/adapters/in/cli/controlplane_local.go | Forces local deploys to use internal deploy semantics. |
| internal/adapters/in/cli/controlplane_local_test.go | Adds test ensuring internal deploy context is used. |
| gordon.toml.example | Documents legacy_registry_domains option. |
| docs/upgrading.md | Documents staged registry host rename procedure and expectations. |
| docs/config/server.md | Adds legacy_registry_domains documentation and links. |
| docs/config/index.md | Mentions staged rename option and links to docs. |
| docs/cli/serve.md | Updates startup sequence docs and clarifies startup recovery vs manual deploy behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| return strings.TrimPrefix(image, prefix) | ||
| } | ||
| return image | ||
| return domain.StripKnownGordonRegistry(image, cfg.RegistryDomain, cfg.LegacyRegistryDomains) |
| // Recover configured routes after servers are listening (registry port is now bound). | ||
| syncAndRecoverConfiguredRoutes(ctx, svc.configSvc, svc.containerSvc, log) |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/usecase/config/service_test.go`:
- Around line 2132-2148: The test currently ignores Load errors by assigning `_
= svc.Load(ctx)` which can hide setup failures; replace those calls with an
explicit assertion such as require.NoError(t, svc.Load(ctx)) in the t.Run
subtest "matches current image when route stores legacy registry host" (and the
other similar subtest around lines 2261-2277) so that the
NewService/svc.Load(ctx) setup failure is reported immediately; update both
occurrences to call require.NoError with the same ctx and svc.Load invocation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: a5465edb-168f-462e-ab8c-09960f8a4dae
📒 Files selected for processing (5)
internal/usecase/config/service.gointernal/usecase/config/service_test.gointernal/usecase/container/autoroute.gointernal/usecase/container/autoroute_test.gointernal/usecase/container/service.go
Summary
auto_route.enabledTesting
rtk go test ./internal/domain ./internal/usecase/auto ./internal/usecase/config ./internal/usecase/container ./internal/adapters/in/cli ./internal/apprtk golangci-lint run ./...rtk go test ./...coderabbit review --agent --base main -t committed -c AGENTS.md -c /home/brice/CLAUDE.mdSummary by CodeRabbit
New Features
Documentation
Chores / Behavior changes