
Improve self-review, e2e plan consolidation, and DOCS story templates#42

Merged
adalton merged 11 commits into main from andalton/address-feedback on May 14, 2026

Conversation


@adalton adalton commented May 12, 2026

Summary

  • Unify per-task code review to use a shared self-review gate (_shared/recipes/self-review-gate.md) with a deepened adversarial review protocol, replacing the previous checklist-based approach
  • Add scenario consolidation to the e2e /plan phase — merges overlapping scenarios, enforces unique naming, and streamlines the plan-to-code pipeline
  • Enforce epic-to-feature parent linking in design /sync to prevent orphaned epics
  • Add independent decomposition review to design /decompose — a subagent-based review step that evaluates epics/stories against the PRD without seeing the design document
  • Give [DOCS] stories a dedicated template in design /decompose and /sync — replaces code-oriented Implementation Guidance and Testing Approach with Documentation Scope and Documentation Inputs, keeping documentation stories focused on what needs documenting rather than how to build it

Details

Self-review gate (_shared/, code-review/, implement/, e2e/)

  • Reframed the self-review gate description to clarify what same-model review can and cannot catch
  • Deepened the review protocol from a checklist to an adversarial design review with structured attack categories
  • Updated implement, e2e, and code-review workflows to use the shared gate consistently

E2E plan consolidation (e2e/)

  • Added scenario consolidation step to /plan that merges overlapping scenarios and enforces unique naming
  • Streamlined /code, /validate, /publish, and /revise for consistency with the updated plan structure
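The consolidation step described above is essentially a group-merge-split pass over the scenario list. A minimal sketch in Python, assuming a simple dict representation of scenarios and the original cap of 8 validations (the field names and the helper itself are illustrative, not the workflow's actual mechanism):

```python
from itertools import groupby

CAP = 8  # validations per consolidated scenario (assumed; the plan cap was later raised in review)

def consolidate(scenarios):
    """Merge scenarios that share setup (given) and action (when).

    Each scenario is a dict: {"name", "given", "when", "validations"},
    where every validation string carries its source AC tag, e.g. "[AC-1] ...".
    Oversized merges are split into numbered parts so no scenario exceeds
    CAP validations, and the generated names stay unique.
    """
    keyfn = lambda s: (s["given"], s["when"])
    merged = []
    for (given, when), group in groupby(sorted(scenarios, key=keyfn), key=keyfn):
        validations = [v for s in group for v in s["validations"]]
        # Apply the cap: split an oversized consolidated scenario into parts
        chunks = [validations[i:i + CAP] for i in range(0, len(validations), CAP)]
        for n, chunk in enumerate(chunks, 1):
            suffix = f" (part {n})" if len(chunks) > 1 else ""
            merged.append({
                "name": f"C{len(merged) + 1}: {given} / {when}{suffix}",
                "given": given,
                "when": when,
                "validations": chunk,
            })
    # Enforce unique naming before returning the consolidated plan
    assert len({m["name"] for m in merged}) == len(merged), "duplicate scenario names"
    return merged
```

Standalone scenarios (groups of one) pass through unchanged; only groups sharing both given and when are merged.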

Design workflow (design/)

  • Enforced epic-to-feature parent linking in /sync — verifies parent.key after creation
  • Added independent decomposition review as Steps 8-12 in /decompose with a dedicated review protocol (decomposition-review.md)
  • Split the story template into two: non-[DOCS] stories keep Implementation Guidance + Testing Approach; [DOCS] stories get Documentation Scope + Documentation Inputs
  • Documentation Inputs are explicitly framed as guidance, not authority — implementation may diverge from design
  • Updated guidelines.md, README.md, decomposition-review.md, and sync.md for consistency with the template split
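The parent-linking enforcement amounts to reading each created issue back and failing the sync when parent.key does not match, instead of deferring the link to a manual follow-up. A sketch under the assumption of a Jira-REST-shaped payload and a caller-supplied fetch_issue helper (both are illustrative, not the skill's actual interface):

```python
def verify_parent_links(created, fetch_issue):
    """Verify every created issue is linked to its expected parent.

    `created` maps an issue key to the parent key it must hang under
    (e.g. each epic to its feature). `fetch_issue` returns the issue
    as a Jira-REST-style dict: {"fields": {"parent": {"key": "..."}}}.
    Returns a list of (key, expected, actual) failures; a non-empty
    list means the sync must stop rather than defer to manual linking.
    """
    failures = []
    for key, expected_parent in created.items():
        issue = fetch_issue(key)
        # A missing or empty parent field counts as a failure, not a "Next Step"
        actual = (issue.get("fields", {}).get("parent") or {}).get("key")
        if actual != expected_parent:
            failures.append((key, expected_parent, actual))
    return failures
```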

Test plan

  • Run design /decompose on a feature with both [DEV] and [DOCS] stories — verify the correct template is used for each
  • Run design /sync on a decomposition with [DOCS] stories — verify Jira descriptions use Documentation Scope and Documentation Inputs (not Implementation Guidance and Testing Approach)
  • Run design /decompose with the decomposition review enabled — verify the reviewer evaluates [DOCS] stories against the [DOCS] template criteria
  • Run e2e /plan and verify scenario consolidation merges overlapping scenarios
  • Run implement /code and verify self-review gate uses the shared review protocol

🤖 Generated with Claude Code

adalton and others added 8 commits May 12, 2026 10:34
The /sync command was silently dropping the parent field when creating
epics, then listing it as a manual "Next Step" instead of treating it
as a failure. This adds explicit Jira API field syntax, post-creation
verification for both epics and stories, and prohibits deferring parent
linking to manual follow-up.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Scenarios that share the same setup (given) and action (when) are now
merged into consolidated tests with multiple validations, reducing
e2e suite execution time by avoiding repeated expensive setup+action.
Adds a consolidation step, 8-validation cap, AC-tagged traceability,
and updates downstream phases (revise, publish) for consistency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace inline three-tier review logic in implement and e2e /code
phases with references to _shared/recipes/self-review-gate.md. Add
SUPPLEMENTARY_CRITERIA parameter to the recipe so callers can pass
domain-specific evaluation criteria (e2e anti-patterns) and review
focus directives (cross-cutting concerns) through the formal channel
into the reviewer subagent.

Reframe publish-time review from full re-review to cross-cutting
review focused on inter-task interactions, since each sub-task is
now individually reviewed during /code.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add Design as evaluation criterion #4 in the shared review protocol
with concrete detection signals (parameter count, import spread,
repeated error handling, exposed internal types). Add "Challenge
decisions, not just implementation" as the first core principle,
pushing reviewers to question whether abstractions should exist and
whether boundaries are in the right place — not just whether the
implementation is correct.

Replace inline code quality checklists in implement and e2e /validate
Step 6 with self-review gate invocations, bringing subagent isolation,
iterative fix-and-re-review, and the full protocol to the validation
phase. Preserve domain-specific checks as SUPPLEMENTARY_CRITERIA.

Remove stale criteria enumeration from code-review workflow. Update
validation report templates to reflect gate-based review. Reframe
self-review gate preamble to scope what it can and cannot catch
without enumerating specific criteria.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the self-graded 17-item checklist in /decompose with a
subagent-based review gate that evaluates decomposition artifacts
against the PRD without access to the design document. The reviewer
sees only what the artifacts actually say, catching gaps the
decomposer's mental model papers over.

- Add design/decomposition-review.md with 6 evaluation criteria
  (requirement coverage, epic structure, story completeness, testing
  commitment, integration stability, documentation), finding format,
  severity definitions, and validation rules
- Replace decompose Step 7 checklist with Steps 7-12: structural
  verification, review invocation (subagent with inline fallback),
  validate-and-assess, re-review loop (max 2 rounds), report, and
  updated presentation with FLAG handling
- Fix self-review-gate.md Step 4 fresh subagent path to include
  SUPPLEMENTARY_CRITERIA, CONTEXT_FILES, and AGENTS.md/CLAUDE.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
[DOCS] stories previously shared the same template as [DEV]/[QE]/etc.
stories, inheriting code-oriented sections (Implementation Guidance,
Testing Approach) that are inappropriate for documentation work. This
replaces those sections with Documentation Scope (what the reader needs
to understand) and Documentation Inputs (inventory of user-facing
artifacts to document), keeping [DOCS] stories focused on *what* needs
documenting rather than *how* to build it.

Documentation Inputs are explicitly framed as guidance, not authority —
implementation may diverge from design, so documentation authors should
verify against shipped code.

Updates the decomposition review protocol and sync phase to recognize
and evaluate the [DOCS]-specific template structure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…stories

Same consistency fix applied to decompose.md quality checks and
decomposition-review.md — guidelines.md was missed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
"Every story includes functionality and testing" was only true for [DEV]
stories. Scoped it to [DEV] and added a line noting [DOCS] stories use
their own template.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@adalton adalton self-assigned this May 12, 2026

coderabbitai Bot commented May 12, 2026

Review Change Stack

Warning

Rate limit exceeded

@adalton has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 14 minutes and 55 seconds before requesting another review.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 5075908b-1ff5-4fd6-83ee-e14d29abc1c8

📥 Commits

Reviewing files that changed from the base of the PR and between 1a7ebc5 and b693879.

📒 Files selected for processing (3)
  • e2e/skills/plan.md
  • e2e/skills/publish.md
  • e2e/skills/revise.md

Walkthrough

This PR standardizes code review workflows across implementation and E2E tracks by introducing a shared self-review gate recipe with supplementary criteria, adds a new decomposition review protocol for design quality assurance, refines story templates to distinguish documentation ([DOCS]) from other work, and introduces scenario consolidation patterns for E2E testing to avoid duplicate setup and actions.

Changes

Workflow standardization and review protocols

**Review protocol and self-review gate foundation** (`_shared/review-protocol.md`, `_shared/recipes/self-review-gate.md`, `code-review/skills/start.md`): Core review principles updated to challenge design decisions and intent, not just correctness. Self-review gate recipe enhanced with an optional SUPPLEMENTARY_CRITERIA parameter passed alongside the standard protocol criteria. Code-review start skill updated to reference the shared protocol without inline category listing.

**Decomposition review protocol and design story templates** (`design/decomposition-review.md`, `design/README.md`, `design/guidelines.md`, `design/skills/decompose.md`, `design/skills/sync.md`): New decomposition review protocol defines reviewer independence, core principles, evaluation criteria, finding format, and validation rules. Design README now distinguishes [DEV] stories (require functionality + testing) from [DOCS] stories (use Documentation Scope and Documentation Inputs instead). Decompose skill adds a dedicated [DOCS] template and overhauls the review workflow with artifact verification, review execution, validation, and a structured summary with a PASS/FLAG gate. Sync skill enforces Jira parent-key validation and cross-reference resolution to issue keys.

**Implementation skill workflow with self-review gates** (`implement/skills/code.md`, `implement/skills/publish.md`, `implement/skills/validate.md`): Code review runs the self-review gate on the staged diff with stop conditions for unfixed critical/high findings. Publish adds a cross-cutting review for inter-task consistency, duplicated logic, and API coherence. Validate runs the gate against the base branch with backward-compatibility and completeness supplementary criteria. All steps define gate outcomes: stop on critical/high, re-stage if fixes applied.

**E2E scenario consolidation patterns and planning** (`e2e/guidelines.md`, `e2e/skills/plan.md`, `e2e/skills/revise.md`): New E2E guideline: consolidate scenarios that share expensive setup and actions, capping consolidated validations at 15 (split by category if exceeded). Planning skill adds Step 3 consolidation rules, templates for consolidated (C1) and standalone (S1) scenarios, and a rationale table. Self-review checklist verifies consolidation (shared setup merged, cap enforcement, AC tagging). Revise skill updated with consolidation feedback examples.

**E2E skill workflow with gates and consolidation checks** (`e2e/skills/code.md`, `e2e/skills/publish.md`, `e2e/skills/validate.md`): Code review and validate steps run the self-review gate with e2e-specific supplementary criteria. Publish performs a cross-cutting review for inter-task consistency and fixture/label drift. All steps validate consolidation correctness alongside code quality.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 5 passed

  • ✅ Title check: The title directly summarizes the three main changes (self-review improvements, e2e plan consolidation, and DOCS story templates), which align with the primary objectives and changeset.
  • ✅ Description check: The description comprehensively explains the self-review gate unification, scenario consolidation, epic-to-feature parent linking, independent decomposition review, and DOCS story templates, with detailed implementation notes and a test plan.
  • ✅ Docstring Coverage: No functions found in the changed files to evaluate; docstring coverage check skipped.
  • ✅ Linked Issues check: Skipped because no linked issues were found for this pull request.
  • ✅ Out of Scope Changes check: Skipped because no linked issues were found for this pull request.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
e2e/skills/publish.md (1)

86-94: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Require a validation refresh after cross-cutting fix commits.

This step can create new commits after /validate already passed. Add an explicit requirement to re-run validation (or return to /validate) before PR creation when Step 2 applies fixes.

Proposed patch

```diff
 If the gate made code fixes, commit them before proceeding:
@@
 git commit -m "{JIRA-KEY}: address cross-cutting review findings"

+If Step 2 produced a fix commit, re-run /validate (or at minimum the
+validation profile's required checks) before continuing to Step 3.
```

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @e2e/skills/publish.md around lines 86 - 94, Update the publish workflow docs
to require re-running validation when Step 2 creates commits: after the existing
"If the gate made code fixes, commit them" block (referencing Step 2 and the git
commit instructions), add a sentence that if Step 2 produced any fix commits the
contributor must re-run /validate (or at minimum the validation profile's
required checks) and wait for those checks to pass before proceeding to Step 3
or creating the PR; mention explicitly that validation must be green after fix
commits to proceed.


🧹 Nitpick comments (2)
e2e/skills/revise.md (1)

94-97: ⚡ Quick win

Add a uniqueness check for scenario names/IDs in consistency review.

Given consolidation and splitting, include an explicit check that scenario identifiers remain unique (e.g., no duplicate C1/S2 labels or repeated titles) to protect AC/task traceability.

Proposed patch

```diff
 - Does the Scenario Consolidation table still match the current scenario list?
+- Are scenario names/identifiers unique across the plan (no duplicate consolidated or standalone scenario labels)?
 - Are commit messages still properly formatted?
```
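The suggested check is a deterministic duplicate scan. A sketch of what that could look like, assuming scenarios are available as (label, title) pairs (the representation is an assumption, not part of this PR):

```python
from collections import Counter

def find_duplicate_identifiers(scenarios):
    """Scan scenario labels (C1, S2, ...) and titles for duplicates.

    `scenarios` is a list of (label, title) pairs. Returns the labels
    and the titles that occur more than once; anything non-empty should
    fail the consistency review to preserve AC/task traceability.
    """
    label_counts = Counter(label for label, _ in scenarios)
    title_counts = Counter(title for _, title in scenarios)
    dup_labels = [label for label, n in label_counts.items() if n > 1]
    dup_titles = [title for title, n in title_counts.items() if n > 1]
    return dup_labels, dup_titles
```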
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@e2e/skills/revise.md` around lines 94 - 97, Add an explicit uniqueness check
to the consistency review: verify that scenario identifiers (e.g., labels like
C1, S2) and scenario titles are unique after consolidation/splitting; update the
checklist items under the "consistency review" (the questions currently listing
consolidation/opportunities, 8-validation cap, and Scenario Consolidation table)
to include "Are all scenario IDs and titles unique?" and implement/mention a
deterministic validation step that scans the Scenario Consolidation table and
the scenario list for duplicate IDs or duplicate titles and fails the review if
any duplicates are found to preserve AC/task traceability.
e2e/skills/plan.md (1)

270-273: ⚡ Quick win

Add explicit unique-scenario enforcement to the self-review checklist.

Consolidation introduces more renaming/merging; add a checklist item ensuring scenario identifiers and titles are unique to avoid traceability collisions in coverage/task mappings.

Proposed patch

```diff
 - [ ] Scenarios sharing setup+action are consolidated, not duplicated as separate tests
 - [ ] No consolidated scenario exceeds 8 validations
 - [ ] Each validation in a consolidated scenario is tagged with its source AC
+- [ ] Scenario identifiers/titles are unique across the plan (no duplicate C#/S# or repeated names)
 - [ ] The plan is achievable — no scenarios depend on unmerged features or unavailable test infrastructure methods
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@e2e/skills/plan.md` around lines 270 - 273, Add a new self-review checklist
item that enforces explicit uniqueness of scenario identifiers and titles to
prevent traceability collisions when consolidating tests; update the checklist
near the existing bullets (the block containing "Scenarios sharing setup+action
are consolidated..." and "No consolidated scenario exceeds 8 validations") to
include a line such as "All scenario identifiers and titles are unique across
the plan to avoid coverage/task mapping collisions" and ensure the guidance
mentions verifying both ID fields and human-readable titles during consolidation
and renaming.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@e2e/skills/code.md`:
- Around line 254-256: Add an explicit re-run of task-scoped checks (unit tests,
linters, and any task-specific validation) immediately after the gate may have
made code fixes (i.e., after Step 3c/3d) and before the commit in Step 3f; if
fixes are applied, re-stage the affected files and only proceed to commit when
these re-run checks pass, and record any dismissed findings in the
Implementation Report (Discoveries section). Ensure the flow invokes the same
check commands used earlier, re-adds changed files to the index, and gates the
commit on the re-run success to avoid committing unverified fixes.

In `@implement/skills/validate.md`:
- Around line 204-212: Add a mandatory re-run step immediately after the commit
instructions in validate.md: after the git commit line, instruct the user to
re-run the affected pre-PR checks referenced in Step 3 and, if the committed
changes touch covered packages, re-run the coverage analysis from Step 4 before
proceeding; reference "Step 3" and "Step 4" so reviewers know which checks to
execute and ensure the text explicitly states to re-run only the affected checks
and coverage when relevant.

---

Outside diff comments:
In `@e2e/skills/publish.md`:
- Around line 86-94: Update the publish workflow docs to require re-running
validation when Step 2 creates commits: after the existing "If the gate made
code fixes, commit them" block (referencing Step 2 and the git commit
instructions), add a sentence that if Step 2 produced any fix commits the
contributor must re-run `/validate` (or at minimum the validation profile's
required checks) and wait for those checks to pass before proceeding to Step 3
or creating the PR; mention explicitly that validation must be green after fix
commits to proceed.

---

Nitpick comments:
In `@e2e/skills/plan.md`:
- Around line 270-273: Add a new self-review checklist item that enforces
explicit uniqueness of scenario identifiers and titles to prevent traceability
collisions when consolidating tests; update the checklist near the existing
bullets (the block containing "Scenarios sharing setup+action are
consolidated..." and "No consolidated scenario exceeds 8 validations") to
include a line such as "All scenario identifiers and titles are unique across
the plan to avoid coverage/task mapping collisions" and ensure the guidance
mentions verifying both ID fields and human-readable titles during consolidation
and renaming.

In `@e2e/skills/revise.md`:
- Around line 94-97: Add an explicit uniqueness check to the consistency review:
verify that scenario identifiers (e.g., labels like C1, S2) and scenario titles
are unique after consolidation/splitting; update the checklist items under the
"consistency review" (the questions currently listing
consolidation/opportunities, 8-validation cap, and Scenario Consolidation table)
to include "Are all scenario IDs and titles unique?" and implement/mention a
deterministic validation step that scans the Scenario Consolidation table and
the scenario list for duplicate IDs or duplicate titles and fails the review if
any duplicates are found to preserve AC/task traceability.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 043ef183-7b1d-4c83-8bfc-375c35889106

📥 Commits

Reviewing files that changed from the base of the PR and between f9fed65 and 2ec15cb.

📒 Files selected for processing (17)
  • _shared/recipes/self-review-gate.md
  • _shared/review-protocol.md
  • code-review/skills/start.md
  • design/README.md
  • design/decomposition-review.md
  • design/guidelines.md
  • design/skills/decompose.md
  • design/skills/sync.md
  • e2e/guidelines.md
  • e2e/skills/code.md
  • e2e/skills/plan.md
  • e2e/skills/publish.md
  • e2e/skills/revise.md
  • e2e/skills/validate.md
  • implement/skills/code.md
  • implement/skills/publish.md
  • implement/skills/validate.md

Comment thread e2e/skills/code.md Outdated
Comment thread implement/skills/validate.md Outdated
The self-review gate can modify code, but all three workflows
(e2e /code, implement /code, implement /validate) committed the
fixes without re-running tests and lint. Now each re-runs the
task-scoped checks to verify the post-fix state before committing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@adalton adalton requested a review from amir-yogev-gh May 12, 2026 20:35
Design section: {§4.3 API Changes, or specific subsection}
```

**For `[DOCS]` stories** (see `[DOCS]` story requirements below for the
Collaborator


DOCS has a dedicated workflow (docs-writer) for the implementation, which currently does not depend on these stories.
If we choose to follow this approach, we can simply ignore the DOCS stories in this context.
[DOCS] = downstream docs.
[DEV] is used for upstream docs.

This applies to all DOCS-related references in this PR and is pending the decision about properly handling the upstream docs.

Contributor Author

@adalton adalton May 13, 2026


The template change is orthogonal to the workflow-wiring question. Regardless of whether docs-writer eventually consumes [DOCS] stories, replacing code-oriented Implementation Guidance and Testing Approach with Documentation Scope and Documentation Inputs is the right structure for documentation work. The previous template would generate guidance like "interface shapes, function/method names, patterns to follow" for docs stories, which doesn't make sense.

We explicitly decided not to wire docs-writer to [DOCS] stories in this PR — the Documentation Inputs are design-time guidance (not authority), and the actual implementation may diverge. The docs-writer still discovers what changed from PRs and diffs. This change doesn't touch docs-writer at all.

Comment thread design/skills/sync.md
- **Create only — never update or delete.** Once Jira issues are created, they evolve independently — developers add implementation notes, QA adds test details, PMs adjust criteria. Pushing file content back to Jira would clobber those additions. If the decomposition is revised after sync, `/revise` will tell the user exactly which Jira issues need manual updates.
- **Link to source.** Every Jira issue description references the design document.
- **Jira-native references.** Local identifiers (`Story 1.01`, `Epic 1`) have meaning only within the `.artifacts/` directory. When constructing Jira issue descriptions, resolve all local references (in Dependencies, Design Reference, and any other cross-references) to Jira issue keys using the sync manifest. Jira is the source of truth — readers of a Jira issue should never need to decode a local artifact numbering scheme.
- **Jira-native references.** Local identifiers (`Story 1.01`, `Epic 1`) have meaning only within the `.artifacts/` directory. When constructing Jira issue descriptions, resolve all local references (in Dependencies, Documentation Inputs, Design Reference, and any other cross-references) to Jira issue keys using the sync manifest. Jira is the source of truth — readers of a Jira issue should never need to decode a local artifact numbering scheme.
Collaborator


It's better not to make changes to the docs handling until we have a final decision about the upstream docs.

Contributor Author


See reply on decompose.md — the template change is independent of the upstream docs decision. We're not wiring any workflow to [DOCS] stories here, just fixing the template so it doesn't generate code-oriented guidance for documentation work.

Comment thread design/skills/sync.md
(`**Story 1.01 — {title}:**` → `**EDM-XXXX — {title}:**`).
**For `[DOCS]` stories:**

```markdown
Collaborator


Would it be more readable to have a "templates" folder?

Contributor Author


I considered this but prefer keeping the templates inline. They're consumed in-place during sync and are short enough that inlining keeps context local. Extracting into a templates/ folder adds indirection — the agent would need to read the template file separately to understand what Jira descriptions look like, rather than seeing it right where it's used.

@@ -0,0 +1,160 @@
# Decomposition Review Protocol
Collaborator


This file seems very desirable, but I wonder about its token consumption/economics.
There are a lot of judgement calls and high-level reviews here that try to capture the big picture.
I guess we just need to try and improve over time.
Just to be clear, I don't see an action item in this review comment. It's more for sharing thoughts.

Contributor Author


Agreed on the token economics concern — this is expensive. The tradeoff we discussed: decomposition errors caught early (missing requirements, broken dependency chains, vague acceptance criteria) tend to compound downstream. A story with a missing AC becomes a PR with missing tests becomes a feature gap in production. The review cost is front-loaded but should reduce rework in the implementation and QE phases. We'll need to see how it plays out in practice and adjust.

Comment thread design/guidelines.md
- Follow the section guidance (`templates/section-guidance.md`) for content standards.
- Design-scoped goals are **implementation constraints**, not product outcomes. They complement — not duplicate — the PRD's goals.
- Acceptance criteria must be **behavioral outcomes** (what the system does, testable from outside), not activities or implementation specifications. Implementation details belong in Implementation Guidance.
- Acceptance criteria must be **behavioral outcomes** (what the system does, testable from outside), not activities or implementation specifications. For non-`[DOCS]` stories, implementation details belong in Implementation Guidance. For `[DOCS]` stories, implementation details belong in Documentation Scope.
Collaborator


Referring to DOCS again...

Contributor Author


See reply on decompose.md.

Comment thread e2e/skills/plan.md Outdated
2. **Merge validations.** For each group, combine the validation blocks
into a single consolidated scenario. Tag each validation with its
source AC (e.g., `[AC-1]`, `[AC-3]`) to preserve traceability.
3. **Apply the cap.** A consolidated scenario must not exceed **8
Collaborator


If each validation represents an "assert" statement, I suspect that a cap of 8 for a consolidated scenario is too limiting. Maybe 25?

Contributor Author


Agreed — 8 was too tight. We looked at the flightctl e2e suite: median is 9 assertions per test, 52% exceed 8. Raised the cap to 15 in 1a7ebc5, which covers 75% of existing tests. The remaining 25% are large integration-style tests (CSR lifecycle, inline config) that naturally have more assertions.

Comment thread e2e/skills/plan.md Outdated
- **Setup (given):** {shared setup — what the test needs beyond suite-level per-test setup}
- **Action (when):**
1. {first action using test infrastructure methods}
2. {subsequent action — multi-step sequences are common in e2e}
Collaborator

Best practices for e2e tests minimize the number of actions in a single scenario, in many cases to just one. The rationale is the ability to attribute a validation failure when multiple actions change the state of the system: was it the first or the fifth action that caused the failure?
Multiple actions are required only when the desired state change cannot be achieved in a single action.
This principle should be expressed as part of the guidelines.

Contributor Author

Good point — added a minimize-actions principle to e2e/guidelines.md and updated the plan template to express single-action preference instead of normalizing multi-step sequences. Fixed in 1a7ebc5.

The guideline now reads: "Each scenario should use the fewest state-changing actions needed to reach the condition under test — ideally one. When a test fails after five actions, it's unclear which action caused the failure. Multi-action sequences are acceptable only when the desired state cannot be reached in fewer steps."
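Applied to the plan template, a single-action scenario might read as follows (a hypothetical device-approval example, not taken from the actual suite; the Validation field name is assumed alongside the Setup/Action fields quoted above):

```markdown
- **Setup (given):** an enrollment request exists and is pending approval
- **Action (when):**
  1. approve the enrollment request
- **Validation (then):**
  - [AC-2] request status transitions to `Approved`
  - [AC-2] a device resource is created for the approved request
```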

Comment thread e2e/skills/plan.md
Collaborator

@amir-yogev-gh left a comment

Picked the wrong option when providing the last feedback...

The cap of 8 validations per consolidated scenario was too tight —
52% of existing flightctl e2e tests exceed it (median is 9). Raised
to 15, which covers 75% of existing tests.

Added a minimize-actions-per-scenario principle to e2e guidelines:
prefer the fewest state-changing actions needed to reach the condition
under test, so failures are diagnosable. Updated the plan template
to express single-action preference instead of normalizing multi-step
sequences.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@adalton adalton requested a review from amir-yogev-gh May 13, 2026 14:12
@coderabbitai Bot left a comment

♻️ Duplicate comments (1)
e2e/skills/plan.md (1)

64-67: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Unify validation cap references (15 vs 8) to avoid contradictory plan behavior.

Line 64-Line 67 sets the cap to 15, but Line 167 and Line 272 still enforce 8. This contradiction will cause inconsistent consolidation and self-review decisions.

Suggested fix
- merging would exceed the 8-validation cap).}
+ merging would exceed the 15-validation cap).}

- - [ ] No consolidated scenario exceeds 8 validations
+ - [ ] No consolidated scenario exceeds 15 validations

Also applies to: 167-167, 272-273

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@e2e/skills/plan.md` around lines 64 - 67, The doc has conflicting validation
caps (15 vs 8); update all enforcement references that still say "8" to use the
unified cap "15" so consolidation and self-review logic is consistent—search for
occurrences of the numeric cap "8" (e.g., the enforcement statements around the
consolidation/self-review rules that reference 8 validations) and replace them
with "15", and ensure any explanatory text and examples (mentions of "cap",
"validations", or rule names) reflect the single cap value "15" consistently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 0ca8f8b6-dc8a-4e72-9627-9a785f17b955

📥 Commits

Reviewing files that changed from the base of the PR and between 2ec15cb and 1a7ebc5.

📒 Files selected for processing (5)
  • e2e/guidelines.md
  • e2e/skills/code.md
  • e2e/skills/plan.md
  • implement/skills/code.md
  • implement/skills/validate.md
✅ Files skipped from review due to trivial changes (3)
  • implement/skills/code.md
  • e2e/guidelines.md
  • implement/skills/validate.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • e2e/skills/code.md

Updated remaining references to the old 8-validation cap in
e2e/skills/plan.md (two places) and e2e/skills/revise.md to 15.

Added re-run of validation checks after gate fixes in
e2e/skills/publish.md — same pattern fixed in code.md and
validate.md earlier.

Added scenario identifier uniqueness checks to plan.md self-review
checklist and revise.md consistency review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
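A cleanup commit like this is easy to verify mechanically. The sketch below builds a throwaway tree (file names and contents are stand-ins, not the real skill files) and lists any file still carrying the retired 8-validation cap:

```shell
# Sketch: flag files that still mention the old 8-validation cap.
dir=$(mktemp -d)
printf 'No consolidated scenario exceeds 15 validations\n' > "$dir/plan.md"
printf 'merging would exceed the 8-validation cap\n'       > "$dir/revise.md"

# -l prints only matching file names; a fully updated tree prints nothing.
stale=$(cd "$dir" && grep -l -e '8-validation' -e 'exceeds 8 validations' *.md || true)
echo "stale cap references: ${stale:-none}"   # → stale cap references: revise.md
```

Run against the real `e2e/` directory, an empty result confirms the references are unified.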
@adalton adalton merged commit 58ad70b into main May 14, 2026
7 checks passed
@adalton adalton deleted the andalton/address-feedback branch May 14, 2026 15:39