
Improve self-review, e2e plan consolidation, and DOCS story templates#42

Merged
adalton merged 11 commits into main from andalton/address-feedback on May 14, 2026

Conversation


@adalton adalton commented May 12, 2026

Summary

  • Unify per-task code review to use a shared self-review gate (_shared/recipes/self-review-gate.md) with a deepened adversarial review protocol, replacing the previous checklist-based approach
  • Add scenario consolidation to the e2e /plan phase — merges overlapping scenarios, enforces unique naming, and streamlines the plan-to-code pipeline
  • Enforce epic-to-feature parent linking in design /sync to prevent orphaned epics
  • Add independent decomposition review to design /decompose — a subagent-based review step that evaluates epics/stories against the PRD without seeing the design document
  • Give [DOCS] stories a dedicated template in design /decompose and /sync — replaces code-oriented Implementation Guidance and Testing Approach with Documentation Scope and Documentation Inputs, keeping documentation stories focused on what needs documenting rather than how to build it

Details

Self-review gate (_shared/, code-review/, implement/, e2e/)

  • Reframed the self-review gate description to clarify what same-model review can and cannot catch
  • Deepened the review protocol from a checklist to an adversarial design review with structured attack categories
  • Updated implement, e2e, and code-review workflows to use the shared gate consistently

E2E plan consolidation (e2e/)

  • Added scenario consolidation step to /plan that merges overlapping scenarios and enforces unique naming
  • Streamlined /code, /validate, /publish, and /revise for consistency with the updated plan structure
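The consolidation step described above is essentially a group-merge-split pass over the scenario list. A minimal sketch in Python, assuming a simple dict representation of scenarios and the original cap of 8 validations (the field names and the helper itself are illustrative, not the workflow's actual mechanism):

```python
from itertools import groupby

CAP = 8  # validations per consolidated scenario (assumed; the plan cap was later raised in review)

def consolidate(scenarios):
    """Merge scenarios that share setup (given) and action (when).

    Each scenario is a dict: {"name", "given", "when", "validations"},
    where every validation string carries its source AC tag, e.g. "[AC-1] ...".
    Oversized merges are split into numbered parts so no scenario exceeds
    CAP validations, and the generated names stay unique.
    """
    keyfn = lambda s: (s["given"], s["when"])
    merged = []
    for (given, when), group in groupby(sorted(scenarios, key=keyfn), key=keyfn):
        validations = [v for s in group for v in s["validations"]]
        # Apply the cap: split an oversized consolidated scenario into parts
        chunks = [validations[i:i + CAP] for i in range(0, len(validations), CAP)]
        for n, chunk in enumerate(chunks, 1):
            suffix = f" (part {n})" if len(chunks) > 1 else ""
            merged.append({
                "name": f"C{len(merged) + 1}: {given} / {when}{suffix}",
                "given": given,
                "when": when,
                "validations": chunk,
            })
    # Enforce unique naming before returning the consolidated plan
    assert len({m["name"] for m in merged}) == len(merged), "duplicate scenario names"
    return merged
```

Standalone scenarios (groups of one) pass through unchanged; only groups sharing both given and when are merged.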

Design workflow (design/)

  • Enforced epic-to-feature parent linking in /sync — verifies parent.key after creation
  • Added independent decomposition review as Steps 8-12 in /decompose with a dedicated review protocol (decomposition-review.md)
  • Split the story template into two: non-[DOCS] stories keep Implementation Guidance + Testing Approach; [DOCS] stories get Documentation Scope + Documentation Inputs
  • Documentation Inputs are explicitly framed as guidance, not authority — implementation may diverge from design
  • Updated guidelines.md, README.md, decomposition-review.md, and sync.md for consistency with the template split
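The parent-linking enforcement amounts to reading each created issue back and failing the sync when parent.key does not match, instead of deferring the link to a manual follow-up. A sketch under the assumption of a Jira-REST-shaped payload and a caller-supplied fetch_issue helper (both are illustrative, not the skill's actual interface):

```python
def verify_parent_links(created, fetch_issue):
    """Verify every created issue is linked to its expected parent.

    `created` maps an issue key to the parent key it must hang under
    (e.g. each epic to its feature). `fetch_issue` returns the issue
    as a Jira-REST-style dict: {"fields": {"parent": {"key": "..."}}}.
    Returns a list of (key, expected, actual) failures; a non-empty
    list means the sync must stop rather than defer to manual linking.
    """
    failures = []
    for key, expected_parent in created.items():
        issue = fetch_issue(key)
        # A missing or empty parent field counts as a failure, not a "Next Step"
        actual = (issue.get("fields", {}).get("parent") or {}).get("key")
        if actual != expected_parent:
            failures.append((key, expected_parent, actual))
    return failures
```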

Test plan

  • Run design /decompose on a feature with both [DEV] and [DOCS] stories — verify the correct template is used for each
  • Run design /sync on a decomposition with [DOCS] stories — verify Jira descriptions use Documentation Scope and Documentation Inputs (not Implementation Guidance and Testing Approach)
  • Run design /decompose with the decomposition review enabled — verify the reviewer evaluates [DOCS] stories against the [DOCS] template criteria
  • Run e2e /plan and verify scenario consolidation merges overlapping scenarios
  • Run implement /code and verify self-review gate uses the shared review protocol

🤖 Generated with Claude Code

adalton and others added 8 commits May 12, 2026 10:34
The /sync command was silently dropping the parent field when creating
epics, then listing it as a manual "Next Step" instead of treating it
as a failure. This adds explicit Jira API field syntax, post-creation
verification for both epics and stories, and prohibits deferring parent
linking to manual follow-up.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Scenarios that share the same setup (given) and action (when) are now
merged into consolidated tests with multiple validations, reducing
e2e suite execution time by avoiding repeated expensive setup+action.
Adds a consolidation step, 8-validation cap, AC-tagged traceability,
and updates downstream phases (revise, publish) for consistency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace inline three-tier review logic in implement and e2e /code
phases with references to _shared/recipes/self-review-gate.md. Add
SUPPLEMENTARY_CRITERIA parameter to the recipe so callers can pass
domain-specific evaluation criteria (e2e anti-patterns) and review
focus directives (cross-cutting concerns) through the formal channel
into the reviewer subagent.

Reframe publish-time review from full re-review to cross-cutting
review focused on inter-task interactions, since each sub-task is
now individually reviewed during /code.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add Design as evaluation criterion #4 in the shared review protocol
with concrete detection signals (parameter count, import spread,
repeated error handling, exposed internal types). Add "Challenge
decisions, not just implementation" as the first core principle,
pushing reviewers to question whether abstractions should exist and
whether boundaries are in the right place — not just whether the
implementation is correct.

Replace inline code quality checklists in implement and e2e /validate
Step 6 with self-review gate invocations, bringing subagent isolation,
iterative fix-and-re-review, and the full protocol to the validation
phase. Preserve domain-specific checks as SUPPLEMENTARY_CRITERIA.

Remove stale criteria enumeration from code-review workflow. Update
validation report templates to reflect gate-based review. Reframe
self-review gate preamble to scope what it can and cannot catch
without enumerating specific criteria.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the self-graded 17-item checklist in /decompose with a
subagent-based review gate that evaluates decomposition artifacts
against the PRD without access to the design document. The reviewer
sees only what the artifacts actually say, catching gaps the
decomposer's mental model papers over.

- Add design/decomposition-review.md with 6 evaluation criteria
  (requirement coverage, epic structure, story completeness, testing
  commitment, integration stability, documentation), finding format,
  severity definitions, and validation rules
- Replace decompose Step 7 checklist with Steps 7-12: structural
  verification, review invocation (subagent with inline fallback),
  validate-and-assess, re-review loop (max 2 rounds), report, and
  updated presentation with FLAG handling
- Fix self-review-gate.md Step 4 fresh subagent path to include
  SUPPLEMENTARY_CRITERIA, CONTEXT_FILES, and AGENTS.md/CLAUDE.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
[DOCS] stories previously shared the same template as [DEV]/[QE]/etc.
stories, inheriting code-oriented sections (Implementation Guidance,
Testing Approach) that are inappropriate for documentation work. This
replaces those sections with Documentation Scope (what the reader needs
to understand) and Documentation Inputs (inventory of user-facing
artifacts to document), keeping [DOCS] stories focused on *what* needs
documenting rather than *how* to build it.

Documentation Inputs are explicitly framed as guidance, not authority —
implementation may diverge from design, so documentation authors should
verify against shipped code.

Updates the decomposition review protocol and sync phase to recognize
and evaluate the [DOCS]-specific template structure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…stories

Same consistency fix applied to decompose.md quality checks and
decomposition-review.md — guidelines.md was missed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
"Every story includes functionality and testing" was only true for [DEV]
stories. Scoped it to [DEV] and added a line noting [DOCS] stories use
their own template.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@adalton adalton self-assigned this May 12, 2026

coderabbitai Bot commented May 12, 2026

Review Change Stack

Warning

Rate limit exceeded

@adalton has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 14 minutes and 55 seconds before requesting another review.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 5075908b-1ff5-4fd6-83ee-e14d29abc1c8

📥 Commits

Reviewing files that changed from the base of the PR and between 1a7ebc5 and b693879.

📒 Files selected for processing (3)
  • e2e/skills/plan.md
  • e2e/skills/publish.md
  • e2e/skills/revise.md

Walkthrough

This PR standardizes code review workflows across implementation and E2E tracks by introducing a shared self-review gate recipe with supplementary criteria, adds a new decomposition review protocol for design quality assurance, refines story templates to distinguish documentation ([DOCS]) from other work, and introduces scenario consolidation patterns for E2E testing to avoid duplicate setup and actions.

Changes

Workflow standardization and review protocols

**Review protocol and self-review gate foundation** (`_shared/review-protocol.md`, `_shared/recipes/self-review-gate.md`, `code-review/skills/start.md`): Core review principles updated to challenge design decisions and intent, not just correctness. Self-review gate recipe enhanced with an optional SUPPLEMENTARY_CRITERIA parameter passed alongside the standard protocol criteria. Code-review start skill updated to reference the shared protocol without inline category listing.

**Decomposition review protocol and design story templates** (`design/decomposition-review.md`, `design/README.md`, `design/guidelines.md`, `design/skills/decompose.md`, `design/skills/sync.md`): New decomposition review protocol defines reviewer independence, core principles, evaluation criteria, finding format, and validation rules. Design README now distinguishes [DEV] stories (require functionality + testing) from [DOCS] stories (use Documentation Scope and Documentation Inputs instead). Decompose skill adds a dedicated [DOCS] template and overhauls the review workflow with artifact verification, review execution, validation, and a structured summary with a PASS/FLAG gate. Sync skill enforces Jira parent-key validation and cross-reference resolution to issue keys.

**Implementation skill workflow with self-review gates** (`implement/skills/code.md`, `implement/skills/publish.md`, `implement/skills/validate.md`): Code review runs the self-review gate on the staged diff with stop conditions for unfixed critical/high findings. Publish adds a cross-cutting review for inter-task consistency, duplicated logic, and API coherence. Validate runs the gate against the base branch with backward-compatibility and completeness supplementary criteria. All steps define gate outcomes: stop on critical/high, re-stage if fixes applied.

**E2E scenario consolidation patterns and planning** (`e2e/guidelines.md`, `e2e/skills/plan.md`, `e2e/skills/revise.md`): New E2E guideline: consolidate scenarios that share expensive setup and actions, capping consolidated validations at 15 (split by category if exceeded). Planning skill adds Step 3 consolidation rules, templates for consolidated (C1) and standalone (S1) scenarios, and a rationale table. Self-review checklist verifies consolidation (shared setup merged, cap enforcement, AC tagging). Revise skill updated with consolidation feedback examples.

**E2E skill workflow with gates and consolidation checks** (`e2e/skills/code.md`, `e2e/skills/publish.md`, `e2e/skills/validate.md`): Code review and validate steps run the self-review gate with e2e-specific supplementary criteria. Publish performs a cross-cutting review for inter-task consistency and fixture/label drift. All steps validate consolidation correctness alongside code quality.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 5 passed

  • ✅ Title check: The title directly summarizes the three main changes (self-review improvements, e2e plan consolidation, and DOCS story templates), which align with the primary objectives and changeset.
  • ✅ Description check: The description comprehensively explains the self-review gate unification, scenario consolidation, epic-to-feature parent linking, independent decomposition review, and DOCS story templates, with detailed implementation notes and a test plan.
  • ✅ Docstring Coverage: No functions found in the changed files to evaluate; docstring coverage check skipped.
  • ✅ Linked Issues check: Skipped because no linked issues were found for this pull request.
  • ✅ Out of Scope Changes check: Skipped because no linked issues were found for this pull request.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
e2e/skills/publish.md (1)

86-94: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Require a validation refresh after cross-cutting fix commits.

This step can create new commits after /validate already passed. Add an explicit requirement to re-run validation (or return to /validate) before PR creation when Step 2 applies fixes.

Proposed patch

```diff
 If the gate made code fixes, commit them before proceeding:
@@
 git commit -m "{JIRA-KEY}: address cross-cutting review findings"

+If Step 2 produced a fix commit, re-run /validate (or at minimum the
+validation profile's required checks) before continuing to Step 3.
```

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @e2e/skills/publish.md around lines 86 - 94, Update the publish workflow docs
to require re-running validation when Step 2 creates commits: after the existing
"If the gate made code fixes, commit them" block (referencing Step 2 and the git
commit instructions), add a sentence that if Step 2 produced any fix commits the
contributor must re-run /validate (or at minimum the validation profile's
required checks) and wait for those checks to pass before proceeding to Step 3
or creating the PR; mention explicitly that validation must be green after fix
commits to proceed.


🧹 Nitpick comments (2)
e2e/skills/revise.md (1)

94-97: ⚡ Quick win

Add a uniqueness check for scenario names/IDs in consistency review.

Given consolidation and splitting, include an explicit check that scenario identifiers remain unique (e.g., no duplicate C1/S2 labels or repeated titles) to protect AC/task traceability.

Proposed patch

```diff
 - Does the Scenario Consolidation table still match the current scenario list?
+- Are scenario names/identifiers unique across the plan (no duplicate consolidated or standalone scenario labels)?
 - Are commit messages still properly formatted?
```
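The suggested check is a deterministic duplicate scan. A sketch of what that could look like, assuming scenarios are available as (label, title) pairs (the representation is an assumption, not part of this PR):

```python
from collections import Counter

def find_duplicate_identifiers(scenarios):
    """Scan scenario labels (C1, S2, ...) and titles for duplicates.

    `scenarios` is a list of (label, title) pairs. Returns the labels
    and the titles that occur more than once; anything non-empty should
    fail the consistency review to preserve AC/task traceability.
    """
    label_counts = Counter(label for label, _ in scenarios)
    title_counts = Counter(title for _, title in scenarios)
    dup_labels = [label for label, n in label_counts.items() if n > 1]
    dup_titles = [title for title, n in title_counts.items() if n > 1]
    return dup_labels, dup_titles
```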
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@e2e/skills/revise.md` around lines 94 - 97, Add an explicit uniqueness check
to the consistency review: verify that scenario identifiers (e.g., labels like
C1, S2) and scenario titles are unique after consolidation/splitting; update the
checklist items under the "consistency review" (the questions currently listing
consolidation/opportunities, 8-validation cap, and Scenario Consolidation table)
to include "Are all scenario IDs and titles unique?" and implement/mention a
deterministic validation step that scans the Scenario Consolidation table and
the scenario list for duplicate IDs or duplicate titles and fails the review if
any duplicates are found to preserve AC/task traceability.
e2e/skills/plan.md (1)

270-273: ⚡ Quick win

Add explicit unique-scenario enforcement to the self-review checklist.

Consolidation introduces more renaming/merging; add a checklist item ensuring scenario identifiers and titles are unique to avoid traceability collisions in coverage/task mappings.

Proposed patch

```diff
 - [ ] Scenarios sharing setup+action are consolidated, not duplicated as separate tests
 - [ ] No consolidated scenario exceeds 8 validations
 - [ ] Each validation in a consolidated scenario is tagged with its source AC
+- [ ] Scenario identifiers/titles are unique across the plan (no duplicate C#/S# or repeated names)
 - [ ] The plan is achievable — no scenarios depend on unmerged features or unavailable test infrastructure methods
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@e2e/skills/plan.md` around lines 270 - 273, Add a new self-review checklist
item that enforces explicit uniqueness of scenario identifiers and titles to
prevent traceability collisions when consolidating tests; update the checklist
near the existing bullets (the block containing "Scenarios sharing setup+action
are consolidated..." and "No consolidated scenario exceeds 8 validations") to
include a line such as "All scenario identifiers and titles are unique across
the plan to avoid coverage/task mapping collisions" and ensure the guidance
mentions verifying both ID fields and human-readable titles during consolidation
and renaming.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@e2e/skills/code.md`:
- Around line 254-256: Add an explicit re-run of task-scoped checks (unit tests,
linters, and any task-specific validation) immediately after the gate may have
made code fixes (i.e., after Step 3c/3d) and before the commit in Step 3f; if
fixes are applied, re-stage the affected files and only proceed to commit when
these re-run checks pass, and record any dismissed findings in the
Implementation Report (Discoveries section). Ensure the flow invokes the same
check commands used earlier, re-adds changed files to the index, and gates the
commit on the re-run success to avoid committing unverified fixes.

In `@implement/skills/validate.md`:
- Around line 204-212: Add a mandatory re-run step immediately after the commit
instructions in validate.md: after the git commit line, instruct the user to
re-run the affected pre-PR checks referenced in Step 3 and, if the committed
changes touch covered packages, re-run the coverage analysis from Step 4 before
proceeding; reference "Step 3" and "Step 4" so reviewers know which checks to
execute and ensure the text explicitly states to re-run only the affected checks
and coverage when relevant.

---

Outside diff comments:
In `@e2e/skills/publish.md`:
- Around line 86-94: Update the publish workflow docs to require re-running
validation when Step 2 creates commits: after the existing "If the gate made
code fixes, commit them" block (referencing Step 2 and the git commit
instructions), add a sentence that if Step 2 produced any fix commits the
contributor must re-run `/validate` (or at minimum the validation profile's
required checks) and wait for those checks to pass before proceeding to Step 3
or creating the PR; mention explicitly that validation must be green after fix
commits to proceed.

---

Nitpick comments:
In `@e2e/skills/plan.md`:
- Around line 270-273: Add a new self-review checklist item that enforces
explicit uniqueness of scenario identifiers and titles to prevent traceability
collisions when consolidating tests; update the checklist near the existing
bullets (the block containing "Scenarios sharing setup+action are
consolidated..." and "No consolidated scenario exceeds 8 validations") to
include a line such as "All scenario identifiers and titles are unique across
the plan to avoid coverage/task mapping collisions" and ensure the guidance
mentions verifying both ID fields and human-readable titles during consolidation
and renaming.

In `@e2e/skills/revise.md`:
- Around line 94-97: Add an explicit uniqueness check to the consistency review:
verify that scenario identifiers (e.g., labels like C1, S2) and scenario titles
are unique after consolidation/splitting; update the checklist items under the
"consistency review" (the questions currently listing
consolidation/opportunities, 8-validation cap, and Scenario Consolidation table)
to include "Are all scenario IDs and titles unique?" and implement/mention a
deterministic validation step that scans the Scenario Consolidation table and
the scenario list for duplicate IDs or duplicate titles and fails the review if
any duplicates are found to preserve AC/task traceability.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 043ef183-7b1d-4c83-8bfc-375c35889106

📥 Commits

Reviewing files that changed from the base of the PR and between f9fed65 and 2ec15cb.

📒 Files selected for processing (17)
  • _shared/recipes/self-review-gate.md
  • _shared/review-protocol.md
  • code-review/skills/start.md
  • design/README.md
  • design/decomposition-review.md
  • design/guidelines.md
  • design/skills/decompose.md
  • design/skills/sync.md
  • e2e/guidelines.md
  • e2e/skills/code.md
  • e2e/skills/plan.md
  • e2e/skills/publish.md
  • e2e/skills/revise.md
  • e2e/skills/validate.md
  • implement/skills/code.md
  • implement/skills/publish.md
  • implement/skills/validate.md

Comment thread e2e/skills/code.md Outdated
Comment thread implement/skills/validate.md Outdated
The self-review gate can modify code, but all three workflows
(e2e /code, implement /code, implement /validate) committed the
fixes without re-running tests and lint. Now each re-runs the
task-scoped checks to verify the post-fix state before committing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@adalton adalton requested a review from amir-yogev-gh May 12, 2026 20:35
Design section: {§4.3 API Changes, or specific subsection}
```

**For `[DOCS]` stories** (see `[DOCS]` story requirements below for the
Collaborator


DOCS has a dedicated workflow (docs-writer) for the implementation, which currently does not depend on these stories.
If we choose to follow this approach, we can simply ignore the DOCS stories in this context.
[DOCS] = downstream docs.
[DEV] is used for upstream docs.

This applies to all DOCS-related references in this PR and is pending the decision about properly handling the upstream docs.

Contributor Author

@adalton adalton May 13, 2026


The template change is orthogonal to the workflow-wiring question. Regardless of whether docs-writer eventually consumes [DOCS] stories, replacing code-oriented Implementation Guidance and Testing Approach with Documentation Scope and Documentation Inputs is the right structure for documentation work. The previous template would generate guidance like "interface shapes, function/method names, patterns to follow" for docs stories, which doesn't make sense.

We explicitly decided not to wire docs-writer to [DOCS] stories in this PR — the Documentation Inputs are design-time guidance (not authority), and the actual implementation may diverge. The docs-writer still discovers what changed from PRs and diffs. This change doesn't touch docs-writer at all.

Comment thread design/skills/sync.md
- **Create only — never update or delete.** Once Jira issues are created, they evolve independently — developers add implementation notes, QA adds test details, PMs adjust criteria. Pushing file content back to Jira would clobber those additions. If the decomposition is revised after sync, `/revise` will tell the user exactly which Jira issues need manual updates.
- **Link to source.** Every Jira issue description references the design document.
- **Jira-native references.** Local identifiers (`Story 1.01`, `Epic 1`) have meaning only within the `.artifacts/` directory. When constructing Jira issue descriptions, resolve all local references (in Dependencies, Design Reference, and any other cross-references) to Jira issue keys using the sync manifest. Jira is the source of truth — readers of a Jira issue should never need to decode a local artifact numbering scheme.
- **Jira-native references.** Local identifiers (`Story 1.01`, `Epic 1`) have meaning only within the `.artifacts/` directory. When constructing Jira issue descriptions, resolve all local references (in Dependencies, Documentation Inputs, Design Reference, and any other cross-references) to Jira issue keys using the sync manifest. Jira is the source of truth — readers of a Jira issue should never need to decode a local artifact numbering scheme.
Collaborator


It's better not to make changes to the docs handling until we have a final decision about the upstream docs.

Contributor Author


See reply on decompose.md — the template change is independent of the upstream docs decision. We're not wiring any workflow to [DOCS] stories here, just fixing the template so it doesn't generate code-oriented guidance for documentation work.

Comment thread design/skills/sync.md
(`**Story 1.01 — {title}:**` → `**EDM-XXXX — {title}:**`).
**For `[DOCS]` stories:**

```markdown
Collaborator


Would it be more readable to have a "templates" folder?

Contributor Author


I considered this but prefer keeping the templates inline. They're consumed in-place during sync and are short enough that inlining keeps context local. Extracting into a templates/ folder adds indirection — the agent would need to read the template file separately to understand what Jira descriptions look like, rather than seeing it right where it's used.

@@ -0,0 +1,160 @@
# Decomposition Review Protocol
Collaborator


This file seems very desirable, but I wonder about its token consumption/economics.
There are a lot of judgement calls and high-level reviews here that try to capture the big picture.
I guess we just need to try and improve over time.
Just to be clear, I don't see an action item in this review comment. It's more for sharing thoughts.

Contributor Author


Agreed on the token economics concern — this is expensive. The tradeoff we discussed: decomposition errors caught early (missing requirements, broken dependency chains, vague acceptance criteria) tend to compound downstream. A story with a missing AC becomes a PR with missing tests becomes a feature gap in production. The review cost is front-loaded but should reduce rework in the implementation and QE phases. We'll need to see how it plays out in practice and adjust.

Comment thread design/guidelines.md
- Follow the section guidance (`templates/section-guidance.md`) for content standards.
- Design-scoped goals are **implementation constraints**, not product outcomes. They complement — not duplicate — the PRD's goals.
- Acceptance criteria must be **behavioral outcomes** (what the system does, testable from outside), not activities or implementation specifications. Implementation details belong in Implementation Guidance.
- Acceptance criteria must be **behavioral outcomes** (what the system does, testable from outside), not activities or implementation specifications. For non-`[DOCS]` stories, implementation details belong in Implementation Guidance. For `[DOCS]` stories, implementation details belong in Documentation Scope.
Collaborator


Referring to DOCS again...

Contributor Author


See reply on decompose.md.

Comment thread e2e/skills/plan.md Outdated
2. **Merge validations.** For each group, combine the validation blocks
into a single consolidated scenario. Tag each validation with its
source AC (e.g., `[AC-1]`, `[AC-3]`) to preserve traceability.
3. **Apply the cap.** A consolidated scenario must not exceed **8
Collaborator


If each validation represents an "assert" statement, I suspect that a cap of 8 for a consolidated scenario is too limiting. Maybe 25?

Contributor Author


Agreed — 8 was too tight. We looked at the flightctl e2e suite: median is 9 assertions per test, 52% exceed 8. Raised the cap to 15 in 1a7ebc5, which covers 75% of existing tests. The remaining 25% are large integration-style tests (CSR lifecycle, inline config) that naturally have more assertions.

Comment thread e2e/skills/plan.md Outdated
- **Setup (given):** {shared setup — what the test needs beyond suite-level per-test setup}
- **Action (when):**
1. {first action using test infrastructure methods}
2. {subsequent action — multi-step sequences are common in e2e}
Collaborator

Best practices for e2e tests minimize the number of actions in a single scenario, in many cases to just one. The rationale is the ability to attribute a validation failure when multiple actions change the state of the system: was it the first or the fifth action that caused the failure?
Multiple actions are required only when the desired state change cannot be achieved in a single action.
This principle should be expressed as part of the guidelines.

Contributor Author

Good point — added a minimize-actions principle to e2e/guidelines.md and updated the plan template to express single-action preference instead of normalizing multi-step sequences. Fixed in 1a7ebc5.

The guideline now reads: "Each scenario should use the fewest state-changing actions needed to reach the condition under test — ideally one. When a test fails after five actions, it's unclear which action caused the failure. Multi-action sequences are acceptable only when the desired state cannot be reached in fewer steps."
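Applied to the plan template, a single-action scenario might read as follows (a hypothetical device-approval example, not taken from the actual suite; the Validation field name is assumed alongside the Setup/Action fields quoted above):

```markdown
- **Setup (given):** an enrollment request exists and is pending approval
- **Action (when):**
  1. approve the enrollment request
- **Validation (then):**
  - [AC-2] request status transitions to `Approved`
  - [AC-2] a device resource is created for the approved request
```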

Comment thread e2e/skills/plan.md
Collaborator

@amir-yogev-gh left a comment

Picked the wrong option when providing the last feedback...

The cap of 8 validations per consolidated scenario was too tight —
52% of existing flightctl e2e tests exceed it (median is 9). Raised
to 15, which covers 75% of existing tests.

Added a minimize-actions-per-scenario principle to e2e guidelines:
prefer the fewest state-changing actions needed to reach the condition
under test, so failures are diagnosable. Updated the plan template
to express single-action preference instead of normalizing multi-step
sequences.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@adalton adalton requested a review from amir-yogev-gh May 13, 2026 14:12
@coderabbitai Bot left a comment

♻️ Duplicate comments (1)
e2e/skills/plan.md (1)

64-67: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Unify validation cap references (15 vs 8) to avoid contradictory plan behavior.

Line 64-Line 67 sets the cap to 15, but Line 167 and Line 272 still enforce 8. This contradiction will cause inconsistent consolidation and self-review decisions.

Suggested fix
- merging would exceed the 8-validation cap).}
+ merging would exceed the 15-validation cap).}

- - [ ] No consolidated scenario exceeds 8 validations
+ - [ ] No consolidated scenario exceeds 15 validations

Also applies to: 167-167, 272-273

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@e2e/skills/plan.md` around lines 64 - 67, The doc has conflicting validation
caps (15 vs 8); update all enforcement references that still say "8" to use the
unified cap "15" so consolidation and self-review logic is consistent—search for
occurrences of the numeric cap "8" (e.g., the enforcement statements around the
consolidation/self-review rules that reference 8 validations) and replace them
with "15", and ensure any explanatory text and examples (mentions of "cap",
"validations", or rule names) reflect the single cap value "15" consistently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 0ca8f8b6-dc8a-4e72-9627-9a785f17b955

📥 Commits

Reviewing files that changed from the base of the PR and between 2ec15cb and 1a7ebc5.

📒 Files selected for processing (5)
  • e2e/guidelines.md
  • e2e/skills/code.md
  • e2e/skills/plan.md
  • implement/skills/code.md
  • implement/skills/validate.md
✅ Files skipped from review due to trivial changes (3)
  • implement/skills/code.md
  • e2e/guidelines.md
  • implement/skills/validate.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • e2e/skills/code.md

Updated remaining references to the old 8-validation cap in
e2e/skills/plan.md (two places) and e2e/skills/revise.md to 15.

Added re-run of validation checks after gate fixes in
e2e/skills/publish.md — same pattern fixed in code.md and
validate.md earlier.

Added scenario identifier uniqueness checks to plan.md self-review
checklist and revise.md consistency review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
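A cleanup commit like this is easy to verify mechanically. The sketch below builds a throwaway tree (file names and contents are stand-ins, not the real skill files) and lists any file still carrying the retired 8-validation cap:

```shell
# Sketch: flag files that still mention the old 8-validation cap.
dir=$(mktemp -d)
printf 'No consolidated scenario exceeds 15 validations\n' > "$dir/plan.md"
printf 'merging would exceed the 8-validation cap\n'       > "$dir/revise.md"

# -l prints only matching file names; a fully updated tree prints nothing.
stale=$(cd "$dir" && grep -l -e '8-validation' -e 'exceeds 8 validations' *.md || true)
echo "stale cap references: ${stale:-none}"   # → stale cap references: revise.md
```

Run against the real `e2e/` directory, an empty result confirms the references are unified.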
@adalton adalton merged commit 58ad70b into main May 14, 2026
7 checks passed
@adalton adalton deleted the andalton/address-feedback branch May 14, 2026 15:39