awesome-copilot/skills/audit-integrity/references/self-reflection-quality-gate.md
2026-04-28 11:46:05 +10:00

# Self-Reflection Quality Gate
After completing the analysis, internally score the output across domain-relevant categories (1–10 scale).
## Scoring Rules
- **Pass**: All categories ≥ 8
- **Fail**: Any score < 8 → revisit the failing dimension before delivering output. Max 2 rework iterations.
- **If unresolvable after 2 iterations**: Deliver output with an explicit confidence note stating which dimension fell short and why.
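
The rules above can be sketched as a small loop. This is a minimal illustration, not part of the skill itself: the names `run_gate`, `failing`, and the `rework` callback are hypothetical, and the "why" in the confidence note still has to be supplied by the agent.

```python
from typing import Callable, Dict, List, Optional, Tuple

THRESHOLD = 8    # every category must score at least this to pass
MAX_REWORK = 2   # maximum rework iterations before delivering anyway

Scores = Dict[str, int]


def failing(scores: Scores) -> List[str]:
    """Return the categories whose score is below the threshold."""
    return [name for name, value in scores.items() if value < THRESHOLD]


def run_gate(
    scores: Scores,
    rework: Callable[[List[str], Scores], Scores],
) -> Tuple[Scores, int, Optional[str]]:
    """Apply the gate; `rework` (hypothetical) revisits failing dimensions
    and returns re-scored results."""
    iterations = 0
    failed = failing(scores)
    while failed and iterations < MAX_REWORK:
        scores = rework(failed, scores)
        iterations += 1
        failed = failing(scores)

    note = None
    if failed:
        # Still failing after the allowed iterations: deliver with a note.
        shortfalls = ", ".join(f"{name} ({scores[name]}/10)" for name in failed)
        note = (
            f"Confidence note: below threshold after {iterations} "
            f"rework iterations: {shortfalls}. Reason: <to be stated>"
        )
    return scores, iterations, note
```

A `None` note means the gate passed; any other value is the text to attach to the delivered output.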
## Base Categories (All Agents)
| Category | Question | Threshold |
| ----------------- | --------------------------------------------------------------------------------------- | :-------: |
| **Completeness** | Were all required phases/categories evaluated with evidence? | ≥ 8 |
| **Accuracy** | Are findings backed by concrete references (code, architecture, CVEs), not speculation? | ≥ 8 |
| **Actionability** | Does every Critical/High finding have a specific, implementable fix or mitigation? | ≥ 8 |
| **Consistency** | Are severity ratings, mappings, and verdicts internally consistent? | ≥ 8 |
| **Coverage** | Were all entry points, trust boundaries, modules, or manifests identified and analyzed? | ≥ 8 |
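
As a rough sketch, the base table can be held as data so that a score sheet is checked for completeness before the threshold is applied. The names `BASE_CATEGORIES`, `validate_scores`, and `gate_passes` are illustrative assumptions, not identifiers defined by this skill.

```python
from typing import Dict, List

THRESHOLD = 8  # per-category threshold from the table above

# Category -> self-check question, copied from the base table.
BASE_CATEGORIES: Dict[str, str] = {
    "Completeness": "Were all required phases/categories evaluated with evidence?",
    "Accuracy": "Are findings backed by concrete references (code, architecture, CVEs), not speculation?",
    "Actionability": "Does every Critical/High finding have a specific, implementable fix or mitigation?",
    "Consistency": "Are severity ratings, mappings, and verdicts internally consistent?",
    "Coverage": "Were all entry points, trust boundaries, modules, or manifests identified and analyzed?",
}


def validate_scores(scores: Dict[str, int]) -> List[str]:
    """Report missing base categories and scores outside the 1-10 scale."""
    problems = []
    for name in BASE_CATEGORIES:
        if name not in scores:
            problems.append(f"missing score: {name}")
    for name, value in scores.items():
        if not 1 <= value <= 10:
            problems.append(f"out of range (1-10): {name}={value}")
    return problems


def gate_passes(scores: Dict[str, int]) -> bool:
    """Pass only if the sheet is well-formed and every score meets the threshold."""
    return not validate_scores(scores) and all(v >= THRESHOLD for v in scores.values())
```

Validating first avoids a silent pass when a category was simply never scored.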
## Domain-Specific Extensions
### Multi-tool Pipeline — add:
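| Category | Question | Threshold |
| ----------------- | ------------------------------------------------------------------- | :-------: |
| **Deduplication** | Are cross-tool duplicates properly merged with corroboration notes? | ≥ 8 |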
### Code Quality (SonarQube-style) — adapt Completeness to:
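| Category | Question | Threshold |
| ---------------- | -------------------------------------------------------------------------------------- | :-------: |
| **Completeness** | Were all issue types (Bugs, Vulnerabilities, Hotspots, Smells, Duplication) evaluated? | ≥ 8 |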
### SAST/SCA — adapt Coverage to:
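| Category | Question | Threshold |
| ------------ | ------------------------------------------------------------------------ | :-------: |
| **Coverage** | Were all entry points taint-traced and all dependency manifests audited? | ≥ 8 |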
### STRIDE Threat Modeling — adapt Completeness to:
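| Category | Question | Threshold |
| ---------------- | -------------------------------------------------------------------------------- | :-------: |
| **Completeness** | Were all six STRIDE categories evaluated for every trust boundary and data flow? | ≥ 8 |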
### STRIDE-LM — adapt Completeness and Coverage to:
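| Category | Question | Threshold |
| ---------------- | ----------------------------------------------------------------------------------------- | :-------: |
| **Completeness** | Were all seven STRIDE-LM categories evaluated for every asset and trust boundary? | ≥ 8 |
| **Coverage** | Were all lateral movement paths, trust boundaries, and post-exploitation chains assessed? | ≥ 8 |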
### Code Review — adapt Coverage to:
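| Category | Question | Threshold |
| ------------ | ----------------------------------------------------------------------------------- | :-------: |
| **Coverage** | Were all entry points, trust boundaries, and data flows traced from source to sink? | ≥ 8 |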
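
One way to read the "add" and "adapt" instructions in this section is as a dictionary merge over the base rubric: a new key adds a category, an existing key replaces its question. The sketch below assumes that reading; `BASE`, `EXTENSIONS`, and `rubric_for` are hypothetical names, and the domain keys are shortened forms of the headings above.

```python
from typing import Dict

Rubric = Dict[str, str]  # category -> self-check question

BASE: Rubric = {
    "Completeness": "Were all required phases/categories evaluated with evidence?",
    "Accuracy": "Are findings backed by concrete references (code, architecture, CVEs), not speculation?",
    "Actionability": "Does every Critical/High finding have a specific, implementable fix or mitigation?",
    "Consistency": "Are severity ratings, mappings, and verdicts internally consistent?",
    "Coverage": "Were all entry points, trust boundaries, modules, or manifests identified and analyzed?",
}

# Per-domain overrides, copied from the extension tables.
EXTENSIONS: Dict[str, Rubric] = {
    "Multi-tool Pipeline": {
        "Deduplication": "Are cross-tool duplicates properly merged with corroboration notes?",
    },
    "Code Quality": {
        "Completeness": "Were all issue types (Bugs, Vulnerabilities, Hotspots, Smells, Duplication) evaluated?",
    },
    "SAST/SCA": {
        "Coverage": "Were all entry points taint-traced and all dependency manifests audited?",
    },
    "STRIDE": {
        "Completeness": "Were all six STRIDE categories evaluated for every trust boundary and data flow?",
    },
    "STRIDE-LM": {
        "Completeness": "Were all seven STRIDE-LM categories evaluated for every asset and trust boundary?",
        "Coverage": "Were all lateral movement paths, trust boundaries, and post-exploitation chains assessed?",
    },
    "Code Review": {
        "Coverage": "Were all entry points, trust boundaries, and data flows traced from source to sink?",
    },
}


def rubric_for(domain: str) -> Rubric:
    """Return the base rubric with the domain's additions and adaptations applied."""
    merged = dict(BASE)                        # copy so BASE is never mutated
    merged.update(EXTENSIONS.get(domain, {}))  # unknown domains fall back to BASE
    return merged
```

For example, `rubric_for("Multi-tool Pipeline")` yields six categories, while `rubric_for("STRIDE-LM")` keeps five with two questions replaced.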