awesome-copilot/skills/quality-playbook/phase_prompts/phase5.md at b8441d218b24f483820cf36542b774b310931cc7

mirror of https://github.com/github/awesome-copilot.git synced 2026-05-15 19:21:45 +00:00

Files

T

Andrew Stellman b8441d218b Update quality-playbook skill to v1.5.6 + add agent (#1402 )

Rebuilds branch from upstream/staged (was previously merged from
upstream/main, which brought in materialized plugin files that
fail Check Plugin Structure on PRs targeting staged).

Changes vs. staged:
- Update skills/quality-playbook/ to v1.5.6 (31 bundled assets:
  SKILL.md + LICENSE.txt + 16 references/ + 9 phase_prompts/ +
  3 agents/ + bin/citation_verifier.py + quality_gate.py).
- Add agents/quality-playbook.agent.md (top-level orchestrator).
  name: quality-playbook (validator-compliant).
- Update docs/README.skills.md quality-playbook row description
  + bundled-assets list to v1.5.6.
- Fix 'unparseable' → 'unparsable' in quality_gate.py (5 instances;
  codespell preference, both spellings valid).

Closes the v1.4.0 → v1.5.6 update in a single clean commit on top of
upstream/staged. The preserved backup branch backup-bedbe84-pre-rebuild
(SHA bedbe848fa3c0f0eda8e653c42b599a17dd2e354) holds the prior history for reference.

2026-05-11 11:31:53 +10:00

11 KiB

Raw Blame History

{skill_fallback_guide}

You are a quality engineer continuing a phase-by-phase quality playbook run. Phases 1-4 are complete.

Read these files to get context:

quality/PROGRESS.md - run metadata, phase status, cumulative BUG tracker
quality/BUGS.md - all confirmed bugs from code review and spec audit
quality/REQUIREMENTS.md - derived requirements
SKILL.md - read the Phase 5 section ("Phase 5: Post-Review Reconciliation and Closure Verification"). Also read references/requirements_pipeline.md, references/review_protocols.md, and references/spec_audit.md. Resolve SKILL.md and the references/ directory via the documented fallback list above; do NOT assume any single install layout.

Execute Phase 5: Reconciliation + TDD + Closure.

Run the Post-Review Reconciliation per references/requirements_pipeline.md. Update COMPLETENESS_REPORT.md.
Run closure verification: every BUG in the tracker must have either a regression test or an explicit exemption.

Write bug writeups at quality/writeups/BUG-NNN.md for EVERY confirmed bug. The canonical template is the "Bug writeup generation" section of SKILL.md (resolve via the fallback list above) — read that section before writing. Use the exact field headings listed there: Summary, Spec reference, The code, Observable consequence, Depth judgment, The fix, The test, Related issues. Sections 1–4, 6, 7 are required in every writeup; section 5 (Depth judgment) fires only when the consequence isn't self-evident from the immediate code; section 8 (Related issues) is included only when related bugs exist. Do NOT introduce fields that aren't in the template (no "Minimal reproduction" as a top-level field, no "Patch path:" as a top-level field — those belong inside Spec reference and The test respectively).

MANDATORY HYDRATION STEP. Before writing a writeup, re-open quality/BUGS.md and locate the ### BUG-NNN: entry for the bug you are about to write up. Every confirmed bug in BUGS.md already has the content you need — your job is to copy it into the writeup's sections, not to invent it. If a field is missing from BUGS.md, that is a reconciliation error to surface in PROGRESS.md, not a field to fabricate. Use this field map:

BUGS.md field	Writeup section	How to use it
Title line (### BUG-NNN:…)	Summary	One sentence naming the function/code path and the observable failure.
Primary requirement	Spec reference	`- Requirement: REQ-NNN`
Spec basis	Spec reference	`- Spec basis: <doc path + line range(s), semicolon-separated if multiple>` plus a ≤15-word contract quote copied verbatim from the cited lines.
Location	The code	Cite `file:line` and describe what the current path does there.
Minimal reproduction	Observable consequence	Weave into the consequence paragraph as the triggering input.
Expected + Actual behavior	Observable consequence	The actual behavior is the observable failure; the expected defines the gap.
Regression test	The test	`- Regression test: <function name>` — verbatim from BUGS.md.
Patches (regression)	The test	`- Regression patch: <path>` — verbatim from BUGS.md.
Patches (fix)	The fix + The test	If a fix patch file exists, read it and paste the unified diff inside ```diff; also list the patch path as `- Fix patch: <path>` under The test. If no fix patch exists (confirmed-open bug), write the minimal concrete unified diff directly in The fix anyway — SKILL.md requires an inline diff in every writeup. In the no-patch case, omit the `Fix patch:` bullet from The test.
Red/green logs	The test	`- Red receipt: quality/results/BUG-NNN.red.log` and the matching green path.

Worked example. The BUGS.md entry for BUG-004 is:

### BUG-004: naive upstream timestamps crash ETA math
- Source: Code Review
- Severity: HIGH
- Primary requirement: REQ-006
- Location: bus_tracker.py:138-144
- Spec basis: quality/REQUIREMENTS.md:163-172; quality/QUALITY.md:57-65
- Minimal reproduction: Return a visit whose ExpectedArrivalTime is an ISO string
  without timezone information, such as 2026-04-21T12:00:00.
- Expected behavior: The affected arrival degrades to unknown-time while the rest
  of the stop remains usable.
- Actual behavior: datetime.fromisoformat() returns a naive datetime and
  subtracting it from datetime.now(timezone.utc) raises TypeError, aborting the
  stop/request path.
- Regression test: quality.test_regression.TestPhase3Regressions.test_bug_004_fetch_stop_arrivals_degrades_naive_timestamps
- Patches: quality/patches/BUG-004-regression-test.patch, quality/patches/BUG-004-fix.patch

The hydrated writeup sections look like this (sketch — paste the real diff from the fix patch file into ```diff, don't make one up):

## Summary
fetch_stop_arrivals() crashes the whole stop/request path when an upstream visit
carries a naive ExpectedArrivalTime, instead of degrading that arrival to
unknown-time.

## Spec reference
- Requirement: REQ-006
- Spec basis: quality/REQUIREMENTS.md:163-172; quality/QUALITY.md:57-65
- Behavioral contract quote: "degrade a bad per-arrival timestamp to unknown-time instead of aborting the whole response path"

## The code
At bus_tracker.py:138-144, the parser calls datetime.fromisoformat(...) on
ExpectedArrivalTime and subtracts the result from datetime.now(timezone.utc)…

## Observable consequence
When the upstream visit returns ExpectedArrivalTime="2026-04-21T12:00:00"
(no timezone), fromisoformat() returns a naive datetime, the subtraction
raises TypeError, and the entire stop/request path aborts rather than the
single affected arrival degrading to unknown-time.

## The fix
```diff
<paste the real unified diff from quality/patches/BUG-004-fix.patch here>
```

## The test
- Regression test: quality.test_regression.TestPhase3Regressions.test_bug_004_fetch_stop_arrivals_degrades_naive_timestamps
- Regression patch: quality/patches/BUG-004-regression-test.patch
- Fix patch: quality/patches/BUG-004-fix.patch
- Red receipt: quality/results/BUG-004.red.log
- Green receipt: quality/results/BUG-004.green.log

Confirmation checklist (per writeup, before moving to the next bug). (a) Every required section has populated content copied from BUGS.md or the patch files — no empty backticks, no sentinel filler like "is a confirmed code bug in " or "The affected implementation lives at " or "Patch path: ``". (b) The ```diff fence contains at least one + or - line from the actual fix patch. (c) The Summary names a real function or code path, not the BUG identifier. (d) No angle-bracket placeholders (e.g., <...>) remain in the final writeup — those are pedagogical markers from the worked example and from SKILL.md, never acceptable output.

Run the TDD red-green cycle: for each confirmed bug, run the regression test against unpatched code -> quality/results/BUG-NNN.red.log. If a fix patch exists, run against patched code -> quality/results/BUG-NNN.green.log. If the test runner is unavailable, create the log with NOT_RUN on the first line.
Generate sidecar JSON: quality/results/tdd-results.json and quality/results/integration-results.json (schema_version "1.1", canonical fields: id, requirement, red_phase, green_phase, verdict, fix_patch_present, writeup_path).
If mechanical verification artifacts exist, run quality/mechanical/verify.sh and save receipts.
Run terminal gate verification, write it to PROGRESS.md.

MANDATORY CARDINALITY GATE (Lever 3, v1.5.2)

Before finalizing this phase, run the cardinality reconciliation gate against the current repo state. Locate quality_gate.py via the same fallback list used for SKILL.md (it sits in the same directory as SKILL.md in every install layout), then invoke it as a script — quality_gate.py runs check_v1_5_2_cardinality_gate(repo_dir) as part of its standard pass:

python3 <resolved_quality_gate_path> .

Where <resolved_quality_gate_path> is the first hit when walking the documented install-location fallback list, with SKILL.md swapped for quality_gate.py (e.g., quality_gate.py, .claude/skills/quality-playbook/quality_gate.py, .github/skills/quality_gate.py, .cursor/skills/quality-playbook/quality_gate.py, .continue/skills/quality-playbook/quality_gate.py, .github/skills/quality-playbook/quality_gate.py).

If the gate output contains any line beginning with cardinality gate:, or reports uncovered cells, malformed cell IDs, missing consolidation rationale on multi-cell Covers, or malformed downgrade records, STOP. Fix the BUGS.md entries or the compensation_grid_downgrades.json file. Do NOT proceed to completion until those failure lines no longer appear.

For every pattern-tagged REQ, the Phase 5 contract is:

Every grid cell with "present": false appears in either a BUG's Covers: list or a downgrade record.
Every Covers: entry uses the canonical cell ID form REQ-N/cell-<item>-<site>.
Every BUG with ≥2 Covers: entries has a non-empty Consolidation rationale: line.
Every downgrade record has cell_id, authority_ref, site_citation, reason_class (in the enum), falsifiable_claim (non-empty).

The cardinality gate is blocking. It is intentionally stricter than the Phase 3 advisory self-check; the advisory check is meant to surface problems early, but Phase 5 is where they become fatal.

Mark Phase 5 complete in PROGRESS.md (use the checkbox format - [x] Phase 5 - Reconciliation — do NOT switch to a table).

IMPORTANT: quality_gate.py will FAIL Phase 5 if any writeup is missing a non-empty ```diff block or contains any of these sentinel phrases verbatim: "is a confirmed code bug in ", "The affected implementation lives at ", "Patch path: ", "- Regression test: ", "- Regression patch: ``". Those two checks are the hard gate. Skipping the BUGS.md hydration step above is not gate-enforced but will produce writeups that read as unpopulated stubs and fail a human review — do not skip it.

11 KiB Raw Blame History Unescape Escape

MANDATORY CARDINALITY GATE (Lever 3, v1.5.2)

11 KiB

Raw Blame History