# Self-Reflection Quality Gate After completing analysis, internally score the output across domain-relevant categories (1–10 scale). ## Scoring Rules - **Pass**: All categories ≥ 8 - **Fail**: Any score < 8 → revisit the failing dimension before delivering output. Max 2 rework iterations. - **If unresolvable after 2 iterations**: Deliver output with an explicit confidence note stating which dimension fell short and why. ## Base Categories (All Agents) | Category | Question | Threshold | | ----------------- | --------------------------------------------------------------------------------------- | :-------: | | **Completeness** | Were all required phases/categories evaluated with evidence? | ≥ 8 | | **Accuracy** | Are findings backed by concrete references (code, architecture, CVEs), not speculation? | ≥ 8 | | **Actionability** | Does every Critical/High finding have a specific, implementable fix or mitigation? | ≥ 8 | | **Consistency** | Are severity ratings, mappings, and verdicts internally consistent? | ≥ 8 | | **Coverage** | Were all entry points, trust boundaries, modules, or manifests identified and analyzed? | ≥ 8 | ## Domain-Specific Extensions ### Multi-tool Pipeline — add: | **Deduplication** | Are cross-tool duplicates properly merged with corroboration notes? | ≥ 8 | ### Code Quality (SonarQube-style) — adapt Completeness to: | **Completeness** | Were all issue types (Bugs, Vulnerabilities, Hotspots, Smells, Duplication) evaluated? | ≥ 8 | ### SAST/SCA — adapt Coverage to: | **Coverage** | Were all entry points taint-traced and all dependency manifests audited? | ≥ 8 | ### STRIDE Threat Modeling — adapt Completeness to: | **Completeness** | Were all six STRIDE categories evaluated for every trust boundary and data flow? | ≥ 8 | ### STRIDE-LM — adapt Completeness and Coverage to: | **Completeness** | Were all seven STRIDE-LM categories evaluated for every asset and trust boundary? | ≥ 8 | | **Coverage** | Were all lateral movement paths, trust boundaries, and post-exploitation chains assessed? | ≥ 8 | ### Code Review — adapt Coverage to: | **Coverage** | Were all entry points, trust boundaries, and data flows traced from source to sink? | ≥ 8 |