feat(agents)!: per-agent model selection for cost optimization and /compact loop fix by katriendg · Pull Request #1541 · microsoft/hve-core

katriendg · 2026-05-06T12:37:49Z

Description

This PR introduces per-agent model selection via frontmatter, backed by a validated model catalog that tracks GitHub Copilot's evolving model lineup. Simple tasks (git operations, issue triage, research) now route to fast-tier models at a fraction of the cost, while complex agents inherit the session model for full capability.

Additionally, this PR removes the self-referential /compact handoff from 12 agents, eliminating the root cause of Autopilot infinite loops reported in #1420. The disk-first .copilot-tracking/ architecture and Memory Agent already provide equivalent persistence without the loop risk.

Model Selection Infrastructure

Cost-first principle: use fast models for read-only research and validation; inherit session model for code generation and complex reasoning.

Added model catalog (scripts/linting/model-catalog.json) tracking 25 models across 5 tiers (free, fast, standard, premium, ultra) with multiplier values, vendor attribution, and GA/preview/retiring status
Added catalog refresh script (scripts/linting/Update-ModelCatalog.ps1) that fetches authoritative YAML from github/docs for model release status and multiplier data; marks removed models as retiring with 60-day grace period rather than deleting
Added validation script (scripts/linting/Test-ModelReferences.ps1) that scans all .agent.md and .prompt.md files for model frontmatter and validates references against the catalog; reports invalid models as errors, retiring models as warnings
Added JSON schema (scripts/linting/schemas/model-catalog.schema.json) for structural validation of the catalog file
Added weekly CI workflow (.github/workflows/model-validation.yml) running every Wednesday plus PR-triggered validation on agent/prompt/catalog changes; includes catalog freshness check and artifact upload
Integrated lint:models and lint:models:refresh into package.json; model validation runs as part of the lint:all chain
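
The validation rule described in this list (unknown models are errors, retiring models are warnings) can be sketched roughly as follows. This is a minimal Python illustration, not the PowerShell in Test-ModelReferences.ps1. The catalog shape (a `models` list with `name` and `status` keys), the sample entries, and the names `validate_references` and `exit_code` are assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical catalog shape; the real model-catalog.json layout may differ.
CATALOG = {
    "models": [
        {"name": "Claude Haiku 4.5", "tier": "fast", "status": "ga"},
        {"name": "Old Model", "tier": "standard", "status": "retiring"},
    ]
}


def validate_references(refs: Dict[str, List[str]], catalog: dict) -> Dict[str, List[str]]:
    """Classify each (file, model) reference as an error or a warning."""
    status_by_name = {m["name"]: m["status"] for m in catalog["models"]}
    result: Dict[str, List[str]] = {"errors": [], "warnings": []}
    for path, models in sorted(refs.items()):
        for model in models:
            status = status_by_name.get(model)
            if status is None:
                # Not in the catalog at all: treated as a hard failure.
                result["errors"].append(f"{path}: unknown model '{model}'")
            elif status == "retiring":
                # Still valid, but flagged so authors can migrate in time.
                result["warnings"].append(f"{path}: model '{model}' is retiring")
    return result


def exit_code(result: Dict[str, List[str]]) -> int:
    """Only errors fail the run; warnings alone still exit 0."""
    return 1 if result["errors"] else 0
```

In this sketch a file that references only retiring models still passes, which matches the "retiring models as warnings" wording above.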

Per-Agent and Per-Prompt Model Assignment

Assigned fast-tier models to 7 subagents performing read-heavy validation tasks: researcher-subagent, plan-validator, implementation-validator, prompt-evaluator, rpi-validator, codebase-profiler, and report-generator. Each declares a prioritized fallback array: Claude Haiku 4.5 → GPT-5.4 mini.
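
As an illustration of what a prioritized fallback array means, the sketch below picks the first preferred model that is currently available. It is a hedged Python model of the idea only: how VS Code actually resolves the array is not shown in this PR, and `pick_model` and the `available` argument are hypothetical names.

```python
from typing import Iterable, Optional, Sequence


def pick_model(preferences: Sequence[str], available: Iterable[str]) -> Optional[str]:
    """Return the first preferred model that is available, else None."""
    available_set = set(available)
    for name in preferences:
        if name in available_set:
            return name
    # Nothing in the list is usable; a caller would presumably
    # fall back to the session model at this point.
    return None
```

With the preference order `["Claude Haiku 4.5", "GPT-5.4 mini"]`, the second entry is only chosen when the first is missing from the available set.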

Assigned Claude Haiku 4.5 (copilot) to 7 prompts handling mechanical operations: git-commit-message, git-commit, git-setup, github-add-issue, github-discover-issues, github-triage-issues, and checkpoint.

Added "Model Selection for Subagents" guidance to 6 parent agents (task-researcher, task-planner, task-implementor, task-reviewer, prompt-builder, security-reviewer) documenting cost-first dispatch decisions and VS Code tier constraint behavior.

/compact Handoff Removal (Fixes #1420)

Removed the Compact handoff entry from all 12 agents where it appeared. Eleven had it as their first handoff, causing Autopilot to auto-execute it on every turn completion, creating an infinite self-referential loop.

Updated rai-identity.instructions.md to remove the "Compact handoff" exit point reference from disclaimer display logic
Updated docs/rpi/context-engineering.md to recommend /checkpoint (Memory Agent) for cross-phase persistence and clarify that /compact remains available as a typed command
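
To make the loop mechanism concrete, here is a toy Python model of "auto-execute the first handoff after every turn". It is not the Autopilot implementation; the `handoffs` mapping, the agent names used as targets, and the turn cap are assumptions chosen only to show why a self-referential first handoff never terminates on its own.

```python
from typing import Dict, List


def run_autopilot(start: str, handoffs: Dict[str, List[str]], max_turns: int = 10) -> List[str]:
    """Follow the first handoff after each turn until none is left or the cap is hit."""
    trace = [start]
    current = start
    for _ in range(max_turns):
        targets = handoffs.get(current, [])
        if not targets:
            break  # no handoff to auto-execute, so the run ends
        current = targets[0]  # only the first handoff is taken automatically
        trace.append(current)
    return trace


# Before: the first handoff points back at the same agent (the Compact case).
looping = {"task-planner": ["task-planner", "task-implementor"]}
# After: the self-referential entry is gone.
fixed = {"task-planner": ["task-implementor"], "task-implementor": []}
```

In the toy model, `looping` only stops because of the turn cap, while `fixed` ends after a single handoff.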

PR #1492 (feat/context-working) adds Context Discipline to 5 RPI parent agents, enforcing disk-first lean responses. The /compact handoff is now architecturally redundant because:

Disk-first .copilot-tracking/ files: all state already lives on disk
Memory Agent: provides structured session persistence with handoff to a different agent (non-looping)
PR #1492 Context Discipline (feat(agents): optimize RPI agent context management with discipline rules): caps subagent responses to executive summaries, reducing context bloat at the source

Test Coverage

Added 41 Pester tests (Test-ModelReferences.Tests.ps1) covering validation logic, frontmatter parsing, and error handling
Added 29 Pester tests (Test-UpdateModelCatalog.Tests.ps1) covering catalog merge, comparison, and refresh logic

Related Issue(s)

Fixes #1420
Closes #1540

Type of Change

Select all that apply:

Code & Documentation:

Bug fix (non-breaking change fixing an issue)
New feature (non-breaking change adding functionality)
Breaking change (fix or feature causing existing functionality to change)
Documentation update

Infrastructure & Configuration:

AI Artifacts:

Reviewed contribution with prompt-builder agent and addressed all feedback
Copilot instructions (.github/instructions/*.instructions.md)
Copilot prompt (.github/prompts/*.prompt.md)
Copilot agent (.github/agents/*.agent.md)
Copilot skill (.github/skills/*/SKILL.md)

Note for AI Artifact Contributors:

Agents: Research, indexing/referencing other projects (using standard VS Code GitHub Copilot/MCP tools), planning, and general implementation agents likely already exist. Review .github/agents/ before creating new ones.

Skills: Must include both bash and PowerShell scripts. See Skills.

Model Versions: Only contributions targeting the latest Anthropic and OpenAI models will be accepted. Older model versions (e.g., GPT-3.5, Claude 3) will be rejected.

See Agents Not Accepted and Model Version Requirements.

Other:

Script/automation (.ps1, .sh, .py)
Other (please describe):

Sample Prompts (for AI Artifact Contributions)

User Request:

Invoke any RPI agent (e.g., task researcher) with a research task. The agent dispatches its Researcher Subagent at fast-tier cost automatically. Run npm run lint:models to validate all model references.

Execution Flow:

Parent agent evaluates task type (read-only vs code-generation)
For research/validation tasks, parent specifies model: "Claude Haiku 4.5 (copilot)" on runSubagent call
VS Code resolves model against cost tier constraint (cannot exceed parent model tier)
Subagent executes at fast-tier cost; results written to .copilot-tracking/ disk files
If tier constraint blocks downgrade, platform falls back to session model gracefully
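
The tier constraint in the flow above can be sketched as a simple rank comparison. This is an assumption-laden Python illustration: the tier names come from the catalog description, but their ordering by cost, the function name `resolve_subagent_model`, and the exact fallback rule are guesses about platform behavior, not documented VS Code logic.

```python
# Assumed ordering from cheapest to most expensive tier.
TIER_RANK = {"free": 0, "fast": 1, "standard": 2, "premium": 3, "ultra": 4}


def resolve_subagent_model(
    requested: str,
    requested_tier: str,
    session_model: str,
    session_tier: str,
) -> str:
    """Use the requested model unless it sits above the parent's tier."""
    if TIER_RANK[requested_tier] <= TIER_RANK[session_tier]:
        return requested
    # Requested model would exceed the parent tier: fall back gracefully.
    return session_model
```

Under these assumptions a fast-tier request from a premium session is honored, while a premium request from a fast session resolves to the session model.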

Output Artifacts:

logs/model-validation-results.json — structured validation results with per-file status
scripts/linting/model-catalog.json — refreshed catalog after lint:models:refresh

Success Indicators:

npm run lint:models exits 0 with no invalid model references
Subagent invocations show model name in VS Code chat header when explicitly set
No Autopilot infinite loops when agents complete their work

Testing

npm run lint:models — model reference validation (validates all 14 model-annotated files)
Security analysis: no sensitive data exposure, no privilege escalation, workflow uses read-only permissions
Diff-based assessment: all changes are configuration-level (frontmatter, handoff entries, guidance sections); no business logic modified
Manual testing performed

Note

Add manual testing descriptions when applicable.

Checklist

Required Checks

Documentation is updated (if applicable)
Files follow existing naming conventions
Changes are backwards compatible (if applicable)
Tests added for new functionality (if applicable)

AI Artifact Contributions

Used /prompt-analyze to review contribution
Addressed all feedback from prompt-builder review
Verified contribution follows common standards and type-specific requirements

Required Automated Checks

The following validation commands must pass before merging:

Markdown linting: npm run lint:md
Spell checking: npm run spell-check
Frontmatter validation: npm run lint:frontmatter
Skill structure validation: npm run validate:skills
Link validation: npm run lint:md-links
PowerShell analysis: npm run lint:ps
Plugin freshness: npm run plugin:generate
Docusaurus tests: npm run docs:test

Security Considerations

This PR does not contain any sensitive or NDA information
Any new dependencies have been reviewed for security issues (N/A — no new runtime dependencies added)
Security-related scripts follow the principle of least privilege

Warning

This PR includes experimental GHCP artifacts that may have breaking changes.

.github/agents/hve-core/task-challenger.agent.md
.github/agents/experimental/experiment-designer.agent.md
.github/agents/experimental/pptx.agent.md
.github/agents/security/security-planner.agent.md
.github/agents/security/sssc-planner.agent.md
.github/agents/security/security-reviewer.agent.md
.github/agents/security/subagents/codebase-profiler.agent.md
.github/agents/security/subagents/report-generator.agent.md
.github/agents/rai-planning/rai-planner.agent.md

GHCP Artifact Maturity

File	Type	Maturity	Notes
`.github/agents/hve-core/rpi-agent.agent.md`	Agent	✅ stable	All builds
`.github/agents/hve-core/task-researcher.agent.md`	Agent	✅ stable	All builds
`.github/agents/hve-core/task-planner.agent.md`	Agent	✅ stable	All builds
`.github/agents/hve-core/task-implementor.agent.md`	Agent	✅ stable	All builds
`.github/agents/hve-core/task-reviewer.agent.md`	Agent	✅ stable	All builds
`.github/agents/hve-core/prompt-builder.agent.md`	Agent	✅ stable	All builds
`.github/agents/hve-core/task-challenger.agent.md`	Agent	⚠️ experimental	Pre-release only
`.github/agents/experimental/experiment-designer.agent.md`	Agent	⚠️ experimental	Pre-release only
`.github/agents/experimental/pptx.agent.md`	Agent	⚠️ experimental	Pre-release only
`.github/agents/security/security-planner.agent.md`	Agent	⚠️ experimental	Pre-release only
`.github/agents/security/sssc-planner.agent.md`	Agent	⚠️ experimental	Pre-release only
`.github/agents/security/security-reviewer.agent.md`	Agent	⚠️ experimental	Pre-release only
`.github/agents/security/subagents/codebase-profiler.agent.md`	Agent	⚠️ experimental	Pre-release only
`.github/agents/security/subagents/report-generator.agent.md`	Agent	⚠️ experimental	Pre-release only
`.github/agents/rai-planning/rai-planner.agent.md`	Agent	⚠️ experimental	Pre-release only
`.github/prompts/hve-core/checkpoint.prompt.md`	Prompt	✅ stable	All builds
`.github/prompts/hve-core/git-commit-message.prompt.md`	Prompt	✅ stable	All builds
`.github/prompts/hve-core/git-commit.prompt.md`	Prompt	✅ stable	All builds
`.github/prompts/hve-core/git-setup.prompt.md`	Prompt	✅ stable	All builds
`.github/prompts/github/github-add-issue.prompt.md`	Prompt	✅ stable	All builds
`.github/prompts/github/github-discover-issues.prompt.md`	Prompt	✅ stable	All builds
`.github/prompts/github/github-triage-issues.prompt.md`	Prompt	✅ stable	All builds
`.github/instructions/rai-planning/rai-identity.instructions.md`	Instructions	⚠️ experimental	Pre-release only

GHCP Maturity Acknowledgment

I acknowledge this PR includes non-stable GHCP artifacts
Non-stable artifacts are intentional for this change

Additional Notes

The /compact removal is a breaking change for users who relied on the handoff button. The /compact typed command remains available; only the agent-surfaced handoff is removed.
Model catalog currently tracks 25 models; the automated refresh runs weekly to catch additions, removals, and multiplier changes from GitHub's upstream YAML sources.
The VS Code cost tier constraint means subagents can only use models at the same or lower tier than the parent. All guidance sections document this limitation and the graceful fallback behavior.

Follow-up Tasks

Monitor weekly CI workflow for first catalog drift detection to confirm automation works end-to-end
Consider extending model selection to remaining prompts (pull-request, doc-ops) once cost savings are validated

github-actions · 2026-05-06T12:38:25Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

OpenSSF Scorecard

Package	Version	Score
actions/actions/checkout	de0fac2e4500dabe0009e67214ff5f5447ce83dd	🟢 5.7

Details

Check	Score	Reason
Code-Review	🟢 10	all changesets reviewed
Maintained	⚠️ 0	0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0
Dangerous-Workflow	🟢 10	no dangerous workflow patterns detected
Binary-Artifacts	🟢 10	no binaries found in the repo
CII-Best-Practices	⚠️ 0	no effort to earn an OpenSSF best practices badge detected
Token-Permissions	⚠️ 0	detected GitHub workflow tokens with excessive permissions
Fuzzing	⚠️ 0	project is not fuzzed
Packaging	⚠️ -1	packaging workflow not detected
License	🟢 10	license file detected
Pinned-Dependencies	🟢 3	dependency not pinned by hash detected -- score normalized to 3
Signed-Releases	⚠️ -1	no releases found
Security-Policy	🟢 9	security policy file detected
Branch-Protection	🟢 5	branch protection is not maximal on development and all release branches
SAST	🟢 8	SAST tool detected but not run on all commits

actions/actions/upload-artifact	043fb46d1a93c77aae656e7c1c64a875d1fc6a0a	🟢 5.6

Details

Check	Score	Reason
Code-Review	🟢 8	Found 8/9 approved changesets -- score normalized to 8
Maintained	🟢 6	6 commit(s) and 2 issue activity found in the last 90 days -- score normalized to 6
Dangerous-Workflow	🟢 10	no dangerous workflow patterns detected
Binary-Artifacts	🟢 10	no binaries found in the repo
Packaging	⚠️ -1	packaging workflow not detected
CII-Best-Practices	⚠️ 0	no effort to earn an OpenSSF best practices badge detected
Token-Permissions	⚠️ 0	detected GitHub workflow tokens with excessive permissions
Pinned-Dependencies	⚠️ 1	dependency not pinned by hash detected -- score normalized to 1
Fuzzing	⚠️ 0	project is not fuzzed
License	🟢 10	license file detected
Signed-Releases	⚠️ -1	no releases found
Security-Policy	🟢 9	security policy file detected
SAST	🟢 10	SAST tool is run on all commits
Branch-Protection	⚠️ 0	branch protection not enabled on development/release branches

Scanned Files

.github/workflows/model-validation.yml

codecov-commenter · 2026-05-06T12:41:27Z

Codecov Report

❌ Patch coverage is 85.82375% with 37 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.62%. Comparing base (97c40e8) to head (bd021ee).

Files with missing lines	Patch %	Lines
scripts/linting/Update-ModelCatalog.ps1	83.68%	23 Missing ⚠️
scripts/linting/Test-ModelReferences.ps1	88.33%	14 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1541      +/-   ##
==========================================
+ Coverage   85.46%   85.62%   +0.16%     
==========================================
  Files          80       77       -3     
  Lines       11541    10779     -762     
==========================================
- Hits         9863     9230     -633     
+ Misses       1678     1549     -129
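
The percentages in this report can be cross-checked with a little arithmetic. The Python sketch below recomputes them from the hit and line counts in the diff. Two things are inferred rather than stated by Codecov: that displayed percentages are truncated (not rounded) to two decimals, and the per-file line totals, which are back-solved from the "missing" counts and patch percentages.

```python
import math


def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage of covered lines."""
    return 100.0 * hits / lines


def truncate2(value: float) -> float:
    """Truncate to two decimals (assumed display rule)."""
    return math.floor(value * 100) / 100


def implied_lines(missing: int, pct: float) -> int:
    """Back-solve total lines from missing lines and a coverage percentage."""
    return round(missing / (1 - pct / 100))


base = truncate2(coverage_pct(9863, 11541))   # main
head = truncate2(coverage_pct(9230, 10779))   # this PR

# Inferred per-file totals for the two new scripts.
update_lines = implied_lines(23, 83.68)       # Update-ModelCatalog.ps1
test_lines = implied_lines(14, 88.33)         # Test-ModelReferences.ps1
patch_lines = update_lines + test_lines
patch_pct = coverage_pct(patch_lines - 37, patch_lines)
```

Under these assumptions the two scripts account for 141 and 120 changed lines, and 224 covered out of 261 reproduces the reported 85.82375% patch coverage.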

Flag	Coverage Δ
pester	`83.59% <85.82%> (+0.06%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
scripts/linting/Test-ModelReferences.ps1	`88.33% <88.33%> (ø)`
scripts/linting/Update-ModelCatalog.ps1	`83.68% <83.68%> (ø)`

... and 7 files with indirect coverage changes


github-actions

Advisory review, this PR is from a maintainer. Findings are informational only.

Review Summary

This PR is well-structured and addresses two distinct, clearly-scoped concerns — per-agent model selection for cost optimisation and removal of the /compact handoff loop. The implementation follows repository conventions throughout.

Issue Alignment ✅

Fixes #1420 (Autopilot /compact infinite loop): the removal of the Compact handoff from 12 agents directly addresses the root cause described in the issue.
Closes #1540: per-agent model selection infrastructure is complete and consistent with the stated intent.
No scope creep observed; changes are tightly scoped to model frontmatter additions, handoff removals, and the supporting catalog/validation infrastructure.

PR Template Compliance ⚠️

One minor gap: the GHCP Maturity Acknowledgment section at the bottom of the PR description has both checkboxes unchecked. The maturity warning block and the artifact table are fully filled in, so this appears to be an oversight rather than an intentional omission. Please check these two boxes to complete the template.

Coding Standards ✅

All new PowerShell scripts (Test-ModelReferences.ps1, Update-ModelCatalog.ps1) follow the repository's PowerShell conventions: copyright header, #Requires -Version 7.0, comment-based help, [CmdletBinding()], $ErrorActionPreference = 'Stop', main execution guard, region blocks, and [OutputType()] on exported functions.
Test files (Test-ModelReferences.Tests.ps1, Test-UpdateModelCatalog.Tests.ps1) follow Pester 5 conventions: #Requires -Modules Pester first, copyright header after, BeforeAll/AfterAll lifecycle, -Tag 'Unit' on all Describe blocks, and $TestDrive-equivalent temp directory management.
model-validation.yml uses the same actions/checkout and actions/upload-artifact SHA pins as the rest of the repository's workflows, persist-credentials: false, permissions: contents: read at both workflow and job level. All security requirements are met.
model frontmatter field format (single string for prompts, priority array for agents) is consistent and matches the documentation added in ai-artifacts-common.md.

Code Quality ✅

Two minor advisory findings are raised as inline comments:

model-catalog.json missing initial $schema reference (line 1) — the Update-ModelCatalog.ps1 writes $schema on refresh but the seeded file committed here omits it; schema tooling won't apply until the first weekly run.
Update-ModelCatalog.ps1 mixed PSCustomObject/hashtable in $finalModels (lines 251–273) — functionally correct, but a normalisation step would make future maintenance safer.

No security vulnerabilities, no breaking changes beyond those declared, no missing error handling at system boundaries.

Documentation ✅

copilot-instructions.md, docs/contributing/ai-artifacts-common.md, docs/contributing/custom-agents.md, docs/contributing/prompts.md, and docs/rpi/context-engineering.md are all updated to reflect the new model selection guidance and /compact deprecation from handoffs.
The README-level npm script list is updated in both package.json and the instructions file.

Outstanding Action Item

Check the two GHCP Maturity Acknowledgment checkboxes in the PR description.

Note

🔒 Integrity filter blocked 1 item

The following item was blocked because it doesn't meet the GitHub integrity level.

#1420 issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by PR Review for issue #1541 · ● 2.2M

bindsi · 2026-05-06T13:18:04Z

Review

Two bundled changes: per-agent/prompt model frontmatter backed by a validated catalog with weekly refresh, and removal of the self-referential /compact handoff from 12 agents to fix Autopilot loops (#1420). All 59 CI checks green.

Strengths

Well-scoped and well-justified. PR body cleanly separates the two threads. The architectural argument for /compact removal (disk-first .copilot-tracking/ + Memory Agent + PR #1492 Context Discipline making it redundant) is sound.
Catalog-driven validation. JSON schema, refresh script that fetches authoritative YAML from github/docs, retiring grace period instead of hard delete — right pattern.
Strong test coverage. 70 Pester tests covering frontmatter parsing, validation logic, tier classification, and catalog comparison. Edge cases (Not applicable multiplier, missing multiplier entry, mixed invalid/retiring) are exercised.
Documentation is thorough and coherent across ai-artifacts-common.md, custom-agents.md, prompts.md, and context-engineering.md.
Cost-first guidance is consistent across parent agents and follows a clear pattern: read-only/validation → fast tier; code generation/architecture → inherit session model.

Concerns

1. Silent catalog drift in CI (medium). model-validation.yml runs Update-ModelCatalog.ps1 before validation, but the refreshed catalog isn't committed. PRs validate against a fresher catalog than what's in the repo — a reference can pass CI because the model exists upstream while the committed model-catalog.json is stale. The "Report catalog drift" step only flags retiring entries; it doesn't surface added/removed/multiplier-changed models that appear only in the just-fetched copy.

Suggestion: either (a) validate against the committed catalog only and run refresh on a separate scheduled job that opens an auto-PR, or (b) fail the CI step when the in-memory refreshed catalog differs materially from the committed one.
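
Option (b) could look roughly like the sketch below: diff the committed catalog against the freshly fetched one and fail when anything material changed. This is a Python illustration under assumptions, not the repo's Compare-Catalogs function; the catalog shape (`models` entries with `name` and `multiplier`) and the function names are hypothetical.

```python
from typing import Dict, List


def compare_catalogs(committed: dict, refreshed: dict) -> Dict[str, List[str]]:
    """Report models added, removed, or with a changed multiplier."""
    old = {m["name"]: m for m in committed["models"]}
    new = {m["name"]: m for m in refreshed["models"]}
    shared = set(old) & set(new)
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "multiplier_changed": sorted(
            name for name in shared
            if old[name].get("multiplier") != new[name].get("multiplier")
        ),
    }


def has_material_drift(diff: Dict[str, List[str]]) -> bool:
    """True when any category is non-empty; a CI step could fail on this."""
    return any(diff.values())
```

A CI step built on this would exit non-zero whenever `has_material_drift` is true, so a reference cannot pass only because the in-memory catalog is fresher than the committed one.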

2. Provider-allowlist not mechanically enforced (medium). The catalog correctly contains non-Anthropic/OpenAI entries (Goldeneye, Raptor mini, Gemini 3 Flash, Grok Code Fast 1) pulled from upstream YAML. lint:models accepts them, but they'd violate the "Anthropic and OpenAI only" policy in ai-artifacts-common.md. Consider an additional provider-allowlist check so policy is enforced, not just documented.
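
A mechanical provider check might be sketched as follows. The `provider` and `providerAllowlist` names match the fields later added in this PR, but their placement in the catalog, the lowercase provider values, and the function `disallowed_models` are assumptions for illustration (Python rather than the repo's PowerShell).

```python
from typing import List, Sequence

# Hypothetical catalog fragment; the real file layout may differ.
CATALOG = {
    "providerAllowlist": ["anthropic", "openai"],
    "models": [
        {"name": "Claude Haiku 4.5", "provider": "anthropic"},
        {"name": "GPT-5.4 mini", "provider": "openai"},
        {"name": "Gemini 3 Flash", "provider": "google"},
    ],
}


def disallowed_models(referenced: Sequence[str], catalog: dict) -> List[str]:
    """Return referenced catalog models whose provider is not allowlisted."""
    allowed = set(catalog.get("providerAllowlist", []))
    provider_by_name = {m["name"]: m.get("provider") for m in catalog["models"]}
    return [
        name for name in referenced
        if name in provider_by_name and provider_by_name[name] not in allowed
    ]
```

Models that are unknown to the catalog are deliberately ignored here, since the existence check already reports them.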

3. Upstream pin (low). Update-ModelCatalog.ps1 fetches from github/docs@main. Combined with #1, PR validation results are non-deterministic across runs. Worth at least a script comment.

4. Workflow runner pin (low — verify). model-validation.yml uses runs-on: ubuntu-latest. Repo conventions typically require a pinned runner version (e.g., ubuntu-24.04).

5. Untested branch in Update-ModelCatalog.ps1 (low). The "mark removed as retiring" overlay logic in the main block (~lines 245–265) — 60-day grace transition and update-multiplier-on-existing-entry path — has no tests. Compare-Catalogs itself is well covered.
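
For reference, the kind of overlay logic this concern refers to could be modeled as below, which also shows the cases a test would need to cover. This is a Python sketch under stated assumptions, not the code in Update-ModelCatalog.ps1: the `retiringSince` field, the rule that entries are dropped once the 60 days have passed, and the function name are all hypothetical.

```python
from datetime import date, timedelta
from typing import List, Set

GRACE = timedelta(days=60)


def overlay_removed(existing: List[dict], upstream_names: Set[str], today: date) -> List[dict]:
    """Mark models missing upstream as retiring; drop them after the grace period."""
    kept = []
    for model in existing:
        entry = dict(model)  # copy so the input list is not mutated
        if entry["name"] in upstream_names:
            kept.append(entry)  # still present upstream: keep unchanged
        elif entry.get("status") != "retiring":
            # First time the model is missing: start the grace period.
            entry["status"] = "retiring"
            entry["retiringSince"] = today.isoformat()
            kept.append(entry)
        elif today - date.fromisoformat(entry["retiringSince"]) <= GRACE:
            kept.append(entry)  # already retiring, still within grace
        # Otherwise the grace period has expired and the entry is dropped.
    return kept
```

The three branches (newly missing, retiring within grace, retiring past grace) are the transitions a unit test would exercise.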

6. Get-FrontmatterFromFile regex (low). '(?s)^---\r?\n(.+?)\r?\n---' doesn't anchor the closing --- to a line boundary. Lazy .+? keeps it correct for current files, but worth aligning with the existing frontmatter validator's pattern.
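
The difference an anchored closing fence makes can be shown with a contrived input. The Python sketch below ports the quoted pattern and compares it with one possible anchored variant; the anchored pattern is an assumption written for this example, not necessarily the one used by the existing frontmatter validator.

```python
import re

# Port of the pattern quoted above: the closing --- is not tied to a line end.
UNANCHORED = re.compile(r"^---\r?\n(.+?)\r?\n---", re.DOTALL)
# Illustrative variant: the closing --- must be followed by a newline or end of input.
ANCHORED = re.compile(r"^---\r?\n(.*?)\r?\n---(?:\r?\n|\Z)", re.DOTALL)

# Contrived document with a line that merely starts with dashes.
DOC = (
    "---\n"
    "name: demo\n"
    "---- stray dashes inside the block\n"
    "model: Claude Haiku 4.5\n"
    "---\n"
    "Body text\n"
)


def frontmatter(pattern: "re.Pattern[str]", text: str):
    """Return the captured frontmatter body, or None when there is no match."""
    match = pattern.match(text)
    return match.group(1) if match else None
```

On this input the unanchored pattern stops at the stray dashes and loses the `model` line, while the anchored variant captures the whole block.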

7. Breaking change disclosure. /compact handoff removal is correctly called out as breaking in the PR body. Verify release-please/changelog surfaces this (e.g., feat!: or BREAKING CHANGE: footer) so downstream consumers see it.

8. Minor. Test-ModelReferences.ps1 uses (Get-Location).Path for relative-path computation — awkward when invoked outside repo root. Consider basing on $PSScriptRoot.

Verdict

Approve with non-blocking suggestions. Core mechanics and tests are solid. #1 (silent catalog drift) and #2 (provider-allowlist) are the most actionable — both undermine the validation guarantee the PR is selling. Worth addressing before merge or in immediate follow-up.

bindsi

Really like it, that's such an improvement and optimisation of cost impact for users. Thank you so much

- add model frontmatter to 7 research/validation subagents with fast-tier fallbacks
- add model frontmatter to 7 Tier 1 prompts using Claude Haiku 4.5
- create model-catalog.json with 19 supported models and JSON schema
- create Test-ModelReferences.ps1 validation script
- add lint:models npm script integrated into lint:all chain

⚡ - Generated by Copilot

- add model-validation.yml with weekly schedule and PR-triggered runs
- add catalog freshness check warning when catalog exceeds 90 days
- refactor Test-ModelReferences.ps1 to extract Invoke-ModelReferenceValidation function
- add Test-ModelReferences.Tests.ps1 with 41 unit tests covering all code paths

🧪 - Generated by Copilot

- add Update-ModelCatalog.ps1 with YAML source fetching from github/docs
- add Pester tests for Merge-ModelData, Get-RemoteYaml, Compare-Catalogs
- update workflow to run catalog refresh before validation
- update agent model references from retiring Gemini 3 Flash (Preview) to GA name
- refresh model-catalog.json with correct tier assignments

🔄 - Generated by Copilot

…spatch

- add Model Selection for Subagents section to RPI parent agents
- add model guidance to prompt-builder and security-reviewer
- fast model for research/validation, session model for code generation
- cost-first principle adapted for VS Code tier constraints

💰 - Generated by Copilot

Remove self-referential Compact handoffs that caused Autopilot infinite loops. The disk-first .copilot-tracking architecture and Memory Agent make /compact handoffs redundant.

- Remove Compact from 5 RPI agents (rpi-agent, task-researcher, task-planner, task-implementor, task-reviewer)
- Remove Compact from 7 additional agents (task-challenger, prompt-builder, security-planner, sssc-planner, rai-planner, experiment-designer, pptx)
- Update context-engineering.md to clarify /compact as typed command
- Remove Compact exit point from rai-identity.instructions.md
- Regenerate plugins

Closes #1420

…prompts documentation

- update description of model property as a preference hint
- explain fallback behavior when specified model is unavailable
- emphasize cost tier constraints for prompt models

🔍 - Generated by Copilot

…-ModelCatalogUpdate functions

- implement tests for validation output handling
- cover scenarios for valid, invalid, and retiring references
- ensure directory creation for output files in tests
- validate catalog update logic with various conditions

🔍 - Generated by Copilot

…alidation

- implement a check for changes in model-catalog.json
- add provider allowlist functionality in model validation
- update model-catalog.json with provider information
- enhance schema to include providerAllowlist

🔍 - Generated by Copilot

…o root path 🔧 - Generated by Copilot

katriendg · 2026-05-06T15:19:16Z

Review

Two bundled changes: per-agent/prompt model frontmatter backed by a validated catalog with weekly refresh, and removal of the self-referential /compact handoff from 12 agents to fix Autopilot loops (#1420). All 59 CI checks green.
... omitting the rest

Thanks @bindsi for your review. I believe we have addressed everything:

PR Review Comments — Resolution Summary

#	Status	Action
1	Implemented	Added "Detect catalog drift" CI step in `model-validation.yml`
2	Implemented	Added `provider` field, `providerAllowlist`, and enforcement
3	Implemented	Added `.NOTES` comment documenting upstream non-determinism
4	No action	`ubuntu-latest` is the required pattern per repo conventions
5	No action	Tests already exist for the retiring/multiplier-update paths
6	Implemented	Aligned the frontmatter regex patterns to `(.*?)\r?\n---(\r?\n
7	No action	Commit-message - ensure `feat!:` on merge (updated)
8	Implemented	Replaced `(Get-Location).Path` with `$RepoRoot` via `git rev-parse`

github-actions

PR Review Summary

This PR is well-structured, clearly motivated, and addresses two distinct problems (cost optimization via per-agent model selection, and the Autopilot infinite loop from the /compact handoff). The implementation quality is solid overall.

✅ Issue Alignment

The PR links Fixes #1420 and Closes #1540. The /compact handoff removal directly addresses the described infinite-loop root cause, and the model selection infrastructure is a coherent new feature. No scope creep detected; the additional documentation and copilot-instructions.md updates are appropriate companions to the code changes.

⚠️ PR Template Compliance

GHCP Maturity Acknowledgment checkboxes are unchecked. The PR explicitly lists 9 experimental agents in a > [!WARNING] block and includes a dedicated GHCP Maturity table, but the two acknowledgment checkboxes at the bottom of that section remain unchecked:

- [ ] I acknowledge this PR includes non-stable GHCP artifacts
- [ ] Non-stable artifacts are intentional for this change

These require author sign-off before merge.

🔍 Coding Standards

PowerShell scripts follow all required conventions: copyright headers, #Requires -Version 7.0, [CmdletBinding()], $ErrorActionPreference = 'Stop', #region blocks, and the invocation guard pattern. ✅
Pester test files follow the required header ordering (#Requires -Modules Pester before copyright). ✅
Agent files use correct user-invocable: false for subagents. ✅
Workflow file uses persist-credentials: false and contents: read permissions. ✅

One concern flagged inline: the model: field in agent frontmatter is used as an array across all 7 subagents, but the prompt-builder.instructions.md spec documents it as a scalar string. This pattern should be validated against VS Code's runtime behaviour and documented before wider adoption — see the inline comment on researcher-subagent.agent.md.
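
Until the array form is documented, a validator can at least treat both shapes uniformly. The Python sketch below normalizes an already-parsed frontmatter value into a priority list; `normalize_model_field` is a hypothetical helper, and whether VS Code itself accepts an array at runtime is exactly the open question raised here.

```python
from typing import Any, List


def normalize_model_field(value: Any) -> List[str]:
    """Return the model preference as a list, accepting a scalar or an array."""
    if value is None:
        return []  # no model declared: the agent inherits the session model
    if isinstance(value, str):
        name = value.strip()
        return [name] if name else []
    if isinstance(value, (list, tuple)):
        if not all(isinstance(item, str) for item in value):
            raise TypeError("model array entries must be strings")
        return [item.strip() for item in value if item.strip()]
    raise TypeError(f"unsupported model field type: {type(value).__name__}")
```

Downstream checks can then iterate over one list shape regardless of whether a prompt used a single string or an agent used a priority array.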

🔒 Code Quality

Workflow action version comments — actions/checkout # v6.0.2 and actions/upload-artifact # v7.0.1 carry unexpectedly high version numbers. The SHAs themselves satisfy the SHA-pinning requirement, but incorrect version comments reduce the audit value. Please verify these tags exist on the respective action repos (inline comments added).

Invoke-WebRequest timeout — Get-RemoteYaml in Update-ModelCatalog.ps1 has no -TimeoutSec, so a slow upstream could block the scheduled job indefinitely. Adding -TimeoutSec 30 is a low-risk hardening step (inline comment added).
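
The same hardening idea, expressed as a Python analog rather than the PowerShell fix itself: pass an explicit timeout so a stalled upstream raises instead of hanging. `get_remote_yaml` mirrors the script's function name only loosely, and the injectable `opener` parameter is an assumption added so the sketch can be exercised without network access.

```python
import urllib.request
from typing import Callable


def get_remote_yaml(
    url: str,
    timeout_sec: float = 30.0,
    opener: Callable = urllib.request.urlopen,
) -> str:
    """Fetch a text resource, failing fast if the server does not respond in time."""
    # urlopen raises an error once the timeout elapses instead of blocking forever.
    with opener(url, timeout=timeout_sec) as response:
        return response.read().decode("utf-8")
```

A scheduled job calling this would fail within roughly the timeout window on a stalled connection, which keeps the weekly run from blocking indefinitely.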

cancel-in-progress: true — Safe for PR-triggered runs, but worth noting the design trade-off for the weekly scheduled scan (inline comment added). Not a blocking issue.

📋 Action Items

✅ Check both GHCP Maturity Acknowledgment boxes before merge.
🔍 Verify actions/checkout and actions/upload-artifact version comments match the pinned SHAs.
💡 Confirm VS Code supports the model: array format, and update prompt-builder.instructions.md to document it.
💡 Add -TimeoutSec 30 to Invoke-WebRequest in Get-RemoteYaml.

Note

🔒 Integrity filter blocked 1 item

The following item was blocked because it doesn't meet the GitHub integrity level.

#1420 issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by PR Review for issue #1541 · ● 2.3M

katriendg requested a review from a team as a code owner May 6, 2026 12:37

github-actions Bot reviewed May 6, 2026

Comment thread scripts/linting/model-catalog.json

Comment thread scripts/linting/Update-ModelCatalog.ps1

bindsi approved these changes May 6, 2026

Comment thread .github/workflows/model-validation.yml

katriendg changed the title ~~feat(agents): per-agent model selection for cost optimization and /compact loop fix~~ feat(agents)!: per-agent model selection for cost optimization and /compact loop fix May 6, 2026

katriendg added 11 commits May 6, 2026 15:10

chore: json prettify

020cd07

docs(agents): update docs for new model selection options and CI

a868d28

refactor(linting): update model reference validation logic to use rep…

9b129ac

…o root path 🔧 - Generated by Copilot

katriendg force-pushed the feat/model-selection branch from a1449e5 to 9b129ac Compare May 6, 2026 15:12

github-actions Bot mentioned this pull request May 6, 2026

[aw] No-Op Runs #1480

Open

style(linting): normalize shebang line in script files

4faedaf

Merge branch 'main' into feat/model-selection

293dc7b

github-actions Bot reviewed May 7, 2026

Comment thread .github/workflows/model-validation.yml

Comment thread .github/workflows/model-validation.yml

Comment thread scripts/linting/Update-ModelCatalog.ps1

Comment thread .github/workflows/model-validation.yml

Comment thread .github/agents/hve-core/subagents/researcher-subagent.agent.md

katriendg added 2 commits May 7, 2026 15:52

chore: address PR comments

795e773

Merge branch 'main' into feat/model-selection

bd021ee

WilliamBerryiii approved these changes May 8, 2026

WilliamBerryiii merged commit e158d88 into main May 8, 2026
57 checks passed

This was referenced May 8, 2026

chore(main): pre-release 4.1.125 #1476

Open

chore(main): release hve-core 4.0.0 #1184

Open

WilliamBerryiii merged 15 commits into main from feat/model-selection
