Mirror of https://github.com/github/awesome-copilot.git (synced 2026-04-30 12:15:56 +00:00)

feat: add SAST/SCA Security Analyzer agent and audit-integrity skill (#1458)
Co-authored-by: Vijay Bandi <vijay.bandi@hp.com>
50  skills/audit-integrity/SKILL.md  Normal file
@@ -0,0 +1,50 @@
---
name: 'audit-integrity'
description: 'Shared audit integrity framework for all AppSec agents — enforces output quality, intellectual honesty, and continuous improvement through anti-rationalization guards, self-critique loops, retry protocols, non-negotiable behaviors, self-reflection quality gates (1-10 scoring, ≥8 threshold), and a self-learning system with lesson/memory governance for security analysis agents.'
compatibility: 'Cross-platform. Works with any language or framework analyzed by AppSec agents.'
metadata:
  version: '1.0'
---

# Audit Integrity Skill

Enforces output quality, intellectual honesty, and continuous improvement across all AppSec agents.

## When to Use

- Every security analysis, code review, threat model, or quality scan agent run
- Applied automatically as a post-analysis quality gate
- Applicable to any agent performing SAST, SCA, threat modeling, or code quality analysis

## Components

This skill provides 7 reusable capabilities. Agents apply all 7 unless their scope excludes a specific component.

| Component | Reference File | Purpose |
|-----------|---------------|---------|
| Clarification Protocol | [clarification-protocol.md](references/clarification-protocol.md) | Ask ≤2 targeted questions before analysis when scope is ambiguous |
| Anti-Rationalization Guard | [anti-rationalization-guard.md](references/anti-rationalization-guard.md) | Table of prohibited rationalizations with mandatory responses |
| Self-Critique Loop | [self-critique-loop.md](references/self-critique-loop.md) | Mandatory second-pass review after initial analysis |
| Retry Protocol | [retry-protocol.md](references/retry-protocol.md) | Tool failure handling — retry once, then document |
| Non-Negotiable Behaviors | [non-negotiable-behaviors.md](references/non-negotiable-behaviors.md) | Hard rules: never fabricate, always cite evidence, report gaps |
| Self-Reflection Quality Gate | [self-reflection-quality-gate.md](references/self-reflection-quality-gate.md) | 1–10 scoring rubric with ≥8 threshold per category |
| Self-Learning System | [self-learning-system.md](references/self-learning-system.md) | Lesson/Memory templates and governance rules |
## Execution Flow

1. **Before analysis**: Apply the Clarification Protocol if scope is ambiguous
2. **During analysis**: Apply the Anti-Rationalization Guard at every decision point
3. **After initial pass**: Execute the Self-Critique Loop (mandatory second pass)
4. **On tool failure**: Apply the Retry Protocol
5. **Before delivery**: Run the Self-Reflection Quality Gate (all categories must score ≥8)
6. **After delivery**: Create Lessons/Memories for novel findings, false positives, or methodology gaps (see Self-Learning System); the full flow is sketched below
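
As a rough illustration of how the six steps compose, here is a minimal sketch. Everything in it is hypothetical: the skill is prose guidance for agents, not a library, so `run_audited_analysis`, `analyze`, and the finding shape are all invented for this example.

```python
from typing import Callable

Finding = dict  # illustrative shape: {"id", "severity", "evidence", ...}

def run_audited_analysis(scope: str,
                         analyze: Callable[[str], list[Finding]]) -> list[Finding]:
    # 1. Clarification Protocol: state working assumptions up front
    #    (the <=2 question rule is elided here).
    assumptions = [f"Assuming scope '{scope}' covers all first-party code."]

    # 2. Initial pass. The Anti-Rationalization Guard applies inside `analyze`,
    # 4. and the Retry Protocol wraps each tool call it makes.
    findings = analyze(scope)

    # 3. Self-Critique Loop: mandatory second pass; drop unevidenced findings.
    findings = [f for f in findings if f.get("evidence")]

    # 5. Self-Reflection Quality Gate runs before delivery (rubric elided), and
    # 6. Lessons/Memories are recorded after delivery.
    return findings

report = run_audited_analysis(
    "payments-service",
    lambda s: [{"id": "F-1", "severity": "High", "evidence": "src/db.py:88"}],
)
```
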
## Agent-Specific Adaptation

Each agent customizes the **Self-Critique Loop** checklist and **Self-Reflection Quality Gate** categories to match its domain. The reference files provide the base templates; agents extend them with domain-specific items.

### Example extensions per agent type

- **SAST/SCA agents**: Add taint trace completeness and manifest coverage checks
- **SonarQube-style agents**: Add rating sanity check (A–E consistency with findings)
- **Threat modeling agents**: Add STRIDE category completeness per trust boundary
- **Code review agents**: Add trust boundary audit with data flow tracing
38  skills/audit-integrity/references/anti-rationalization-guard.md  Normal file
@@ -0,0 +1,38 @@
# Anti-Rationalization Guard

These rationalizations are **never** valid justifications for skipping, omitting, or downgrading findings:

## Universal Rationalizations (All Agents)

| If you think... | Mandatory response |
| --- | --- |
| "No issues/threats found on first pass" | Systematic evaluation across all categories is required before concluding clean. Expand scope and complete the full matrix. |
| "This looks fine, skip deep analysis" | "Looks fine" is not evidence. Evidence = code trace, architecture reference, or rule match. Run checks. |
| "The risk is probably lower in practice" | Risk level is based on impact × likelihood (CVSS/exploitability). Justify any downgrade with explicit evidence. |
| "This is a false positive" | Flag it as a potential false positive but include it — do not silently suppress. Document the rationale for human review. |
| "This is outside scope" | State explicitly why, with a reference to the declared scope or assessment boundary. |
| "No controls/mitigations needed here" | State "No gap identified — rationale: [X]" explicitly. Silence is not assurance. |

## SAST/SCA-Specific

| If you think... | Mandatory response |
| --- | --- |
| "SCA CVE isn't exploitable here" | Include the CVE with a documented context note — do not silently suppress. |
| "This phase can be skipped" | All phases are mandatory. Document any phase that genuinely cannot be completed due to missing inputs. |
| "Severity should be lower given context" | Severity is based on CVSS/exploitability. Justify any downgrade with explicit evidence. Document, don't suppress. |

## Code Quality-Specific

| If you think... | Mandatory response |
| --- | --- |
| "The team will refactor this later" | Technical debt still counts toward the debt ratio today. Document it accurately. |
| "Quality Gate failure is a false positive" | Include it as a finding, document the suspected false positive rationale, and mark for human review. |

## Threat Modeling-Specific

| If you think... | Mandatory response |
| --- | --- |
| "This threat is mitigated by the architecture" | Document the specific compensating control and verify it is actually implemented — do not assume. |
| "This category has no applicable threats here" | State "No applicable threats identified — rationale: [X]" explicitly. Do not silently omit. |
| "Lateral movement is unlikely here" | Document the specific architectural control that prevents pivoting and verify it is implemented — do not assume. |
| "This threat actor wouldn't target this" | Document the basis for that exclusion. Insider threats and supply chain actors must always be considered. |
15  skills/audit-integrity/references/clarification-protocol.md  Normal file
@@ -0,0 +1,15 @@
# Clarification Protocol

Before beginning analysis, pause and ask the user at most **2 targeted questions** when:

- The system scope, asset boundary, or target module is ambiguous and cannot be inferred from the provided context
- A critical trust boundary, privilege tier, or authentication zone is undefined and the analysis would significantly change depending on the interpretation
- The business context required for impact prioritization or compliance framework selection is entirely absent
- The language or framework cannot be auto-detected from the workspace

**Rules:**

1. State your working assumptions explicitly, then proceed
2. Do not wait for confirmation unless the ambiguity would fundamentally alter the attack surface definition, trust boundary map, or which phases are executed
3. Maximum 2 questions — if more ambiguity exists, infer from available evidence and document assumptions (see the sketch after these rules)
4. If no ambiguity exists, proceed directly without questions
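
As a rough sketch of rules 2 through 4: cap the questions at two and ask the surface-altering ones first. The protocol itself is behavioral guidance, so the function and the candidate shape below are invented for illustration.

```python
def select_questions(candidates: list[tuple[str, bool]]) -> list[str]:
    """candidates: (question, alters_attack_surface) pairs, invented shape.

    Rule 4: no ambiguity means no questions. Rule 3: at most 2 questions,
    with surface-altering ambiguities (rule 2) asked first.
    """
    ordered = sorted(candidates, key=lambda c: not c[1])  # blocking ones first
    return [question for question, _ in ordered[:2]]

print(select_questions([
    ("Preferred report format?", False),
    ("Is the admin API internet-facing?", True),
    ("Which compliance framework applies?", False),
]))  # ['Is the admin API internet-facing?', 'Preferred report format?']
```
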
17  skills/audit-integrity/references/non-negotiable-behaviors.md  Normal file
@@ -0,0 +1,17 @@
# Non-Negotiable Behaviors

These rules apply to **all** AppSec agents with no exceptions:

1. **Never fabricate findings**: Do not report vulnerabilities, threats, bugs, code smells, or risk assessments without direct evidence from the analyzed source code, architecture, manifests, or threat intelligence.

2. **Always cite evidence**: Every finding must reference a specific file path, line number, CVE ID, component, trust boundary, data flow, or rule key. Generic findings without precise traceability are prohibited (see the sketch after this list).

3. **Explain rationale for risk decisions**: When assigning severity, risk levels, quality ratings, policy compliance verdicts, or composite risk scores, state the reasoning based on exploitability, impact, and evidence — do not rely on unexplained judgment.

4. **Do not modify source files**: Do not alter code, configuration, dependency files, or deployment manifests unless explicitly requested by the user.

5. **Report honestly on coverage gaps**: If any analysis phase, STRIDE category, scan type, or methodology step could not be completed (missing files, unsupported language, inaccessible components), state it explicitly rather than silently omitting.

6. **Complete all phases**: Partial runs are not acceptable. If a phase is blocked, document why and continue with the remaining phases.

7. **Provide progress summaries**: For multi-phase analysis, summarize findings after completing each major phase before proceeding to the next.
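
To make rule 2 concrete, here is a sketch of a finding record that cannot be constructed without a traceable reference. The schema is illustrative only, not something this skill prescribes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    title: str
    severity: str   # e.g. "Critical", "High"
    evidence: str   # file:line, CVE ID, rule key, trust boundary, or data flow
    rationale: str  # rule 3: the reasoning behind the risk decision

    def __post_init__(self) -> None:
        # Rule 2: generic findings without precise traceability are prohibited.
        if not self.evidence.strip():
            raise ValueError(f"finding '{self.title}' cites no evidence")

Finding("SQL injection", "High", "src/db.py:88",
        "request parameter reaches cursor.execute unescaped")  # ok
try:
    Finding("Feels risky", "Medium", "", "gut feeling")
except ValueError as err:
    print(err)  # finding 'Feels risky' cites no evidence
```
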
8  skills/audit-integrity/references/retry-protocol.md  Normal file
@@ -0,0 +1,8 @@
# Retry Protocol

On tool failure or empty results:

1. **Retry once** with a refined query or a different search pattern.
2. **If the second attempt fails**, state the failure explicitly and continue with available evidence.
3. **Never silently skip** a phase because a tool call returned no results — distinguish "tool found nothing" from "tool failed to execute" (sketched below).
4. **Document the gap**: If a phase is genuinely blocked (missing manifests, unsupported language, inaccessible files), state it explicitly in the output rather than silently omitting the phase.
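
A minimal sketch of these rules, assuming a tool is any callable that returns a list of results and raises on execution failure. The wrapper and its status values are invented for illustration.

```python
def run_with_retry(tool, query: str, refined_query: str) -> dict:
    outcome = {"status": "failed", "results": [], "note": "tool never ran"}
    for q in (query, refined_query):             # rule 1: retry exactly once
        try:
            results = tool(q)
        except Exception as exc:                  # "tool failed to execute"
            outcome = {"status": "failed", "results": [],
                       "note": f"tool error on {q!r}: {exc}"}  # rule 2: state it
            continue
        if results:
            return {"status": "ok", "results": results, "note": ""}
        outcome = {"status": "empty", "results": [],          # rule 3: "found
                   "note": f"no results for {q!r}"}           # nothing" != failed
    return outcome  # rule 4: the caller documents the gap and continues
```
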
46  skills/audit-integrity/references/self-critique-loop.md  Normal file
@@ -0,0 +1,46 @@
# Self-Critique Loop

After completing the initial analysis, perform a **mandatory second pass** before delivering output.

## Universal Checks (All Agents)

1. **Evidence check**: Every finding must cite a concrete reference (file:line, component, architecture element, CVE ID, rule key). Remove any finding without supporting evidence.
2. **Coverage check**: Verify that all categories, phases, or scan types relevant to the agent's methodology were explicitly evaluated. State "None detected" for each clean category rather than silently omitting it.
3. **Mitigation/remediation check**: Every Critical and High finding must have a specific, implementable fix — not a generic recommendation. (All three checks are sketched after this list.)
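
A sketch of the three universal checks over plain dicts. The findings shape and the category list are invented for illustration; each agent substitutes the categories from its own methodology.

```python
REQUIRED_CATEGORIES = {"injection", "authn", "crypto", "dependencies"}  # illustrative

def second_pass(findings: list[dict]) -> tuple[list[dict], list[str]]:
    # Check 1: remove any finding without supporting evidence.
    kept = [f for f in findings if f.get("evidence")]

    # Check 2: state "None detected" for clean categories instead of omitting them.
    covered = {f["category"] for f in kept}
    notes = [f"{c}: None detected" for c in sorted(REQUIRED_CATEGORIES - covered)]

    # Check 3: flag Critical/High findings that lack a specific fix for rework.
    for f in kept:
        if f["severity"] in ("Critical", "High") and not f.get("fix"):
            notes.append(f"{f['id']}: missing implementable remediation")
    return kept, notes
```
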
## Domain-Specific Extensions

Each agent adds domain checks to the universal list above:

### STRIDE Threat Modeling

4. **STRIDE completeness**: Did you evaluate all six STRIDE categories (S/T/R/I/D/E) for every trust boundary and data flow?
5. **Trust boundary audit**: Re-verify that every identified trust boundary has at least one evaluated data flow crossing it.

### STRIDE-LM (Lateral Movement)

4. **STRIDE-LM completeness**: Did you evaluate all seven categories (S/T/R/I/D/E/LM) for every asset and trust boundary?
5. **Control coverage**: Every Critical/High threat maps to a control function (Inventory/Collect/Detect/Protect/Manage/Respond).
6. **Lateral movement audit**: Re-trace all identified pivot paths. Verify that no uncontrolled path exists from a compromised entry point to a high-value asset.

### Code Review Threat Modeling

4. **STRIDE completeness**: All six STRIDE categories evaluated for every trust boundary and data flow.
5. **Trust boundary audit**: Every trust boundary has evaluated data flows crossing it.

### Code Quality (SonarQube-style)

4. **Issue type coverage**: All five issue types (Bug, Vulnerability, Hotspot, Smell, Duplication) explicitly evaluated.
5. **Rating sanity check**: A–E ratings are consistent with finding counts before finalizing the Quality Gate verdict.

### SAST/SCA

4. **Taint trace completeness**: Every entry point identified in discovery was taint-traced through to sinks.
5. **Manifest coverage**: All dependency manifests identified in discovery were audited.

### Multi-tool Pipeline

4. **Phase coverage**: All deliverable files generated and saved.
5. **Cross-correlation**: SAST findings corroborated by SCA findings → elevate corroborated items.
6. **Deduplication**: The same finding doesn't appear under multiple tool outputs.
7. **Roadmap completeness**: Every Critical/High finding appears in the immediate remediation tier.
92  skills/audit-integrity/references/self-learning-system.md  Normal file
@@ -0,0 +1,92 @@
# Self-Learning System

Maintain project learning artifacts under a designated lessons/memories directory (e.g., `.github/SecurityLessons` and `.github/SecurityMemories`).

## When to Create

### Lesson

Create a lesson when:

- A scan produces a false positive that required manual correction
- A finding category, STRIDE category, or flaw type is missed on the first pass and caught by the self-critique loop
- A tool or methodology limitation is discovered
- A language-specific rule misfires
- An SCA dependency cannot be resolved

### Memory

Create a memory when:

- An architecture decision, security convention, or technology stack detail is discovered
- A dependency management pattern, domain-specific threat pattern, or threat actor profile is identified
- A project coding convention, framework idiom, or known false-positive pattern is found
- Any codebase-specific knowledge would be useful for future scans of the same codebase

## Lesson Template

```markdown
# Security Lesson: <short-title>

## Metadata

- CreatedAt: <date>
- Status: active | deprecated
- Supersedes: <previous lesson if any>

## Context

- Triggering scan/task:
- Component analyzed:

## Issue

- What went wrong or was missed:
- Expected behavior:
- Actual behavior:

## Root Cause

- Why this was missed or incorrect:

## Resolution

- How it was corrected:

## Preventive Guidance

- How to avoid this in future scans:
```

## Memory Template

```markdown
# Security Memory: <short-title>

## Metadata

- CreatedAt: <date>
- Status: active | deprecated
- Supersedes: <previous memory if any>

## Context

- Triggering scan/task:
- Scope/system:

## Key Fact

- What was discovered:
- Why it matters for security analysis:

## Reuse Guidance

- When to apply this knowledge:
- Related components:
```

## Governance Rules

1. **Dedup check**: Before creating a new lesson or memory, search the existing files for similar content. Update existing records rather than creating duplicates.
2. **Conflict resolution**: If new evidence conflicts with an existing active lesson/memory, mark the older one as `deprecated` and create the updated version with a `Supersedes` reference (see the sketch after these rules).
3. **Reuse at scan start**: At the start of every analysis, check the lessons/memories directory for applicable context. Apply relevant guidance before beginning analysis.
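
A sketch of rules 1 and 2 over an in-memory history. A real agent operates on the markdown files in the lessons/memories directory, so the record shape and `upsert` helper below are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Record:                        # a lesson or memory, illustrative shape
    title: str
    body: str
    status: str = "active"
    supersedes: str | None = None

def upsert(history: dict[str, list[Record]], new: Record) -> None:
    versions = history.setdefault(new.title, [])
    if versions:
        latest = versions[-1]
        if latest.status == "active" and latest.body == new.body:
            return                            # rule 1: duplicate, keep existing
        latest.status = "deprecated"          # rule 2: deprecate the older record
        new.supersedes = latest.title         # and link the updated version to it
    versions.append(new)                      # older versions stay for the audit trail
```
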
46  skills/audit-integrity/references/self-reflection-quality-gate.md  Normal file
@@ -0,0 +1,46 @@
# Self-Reflection Quality Gate

After completing analysis, internally score the output across domain-relevant categories (1–10 scale).

## Scoring Rules

- **Pass**: All categories ≥ 8
- **Fail**: Any score < 8 → revisit the failing dimension before delivering output. Max 2 rework iterations.
- **If unresolvable after 2 iterations**: Deliver the output with an explicit confidence note stating which dimension fell short and why. (The gate loop is sketched after these rules.)
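
These rules amount to a small loop. Here is a sketch assuming hypothetical `score` and `rework` callables standing in for the agent's own self-assessment and revision behavior.

```python
from typing import Callable

THRESHOLD = 8
MAX_REWORK = 2

def quality_gate(report: str,
                 score: Callable[[str], dict[str, int]],
                 rework: Callable[[str, list[str]], str]) -> tuple[str, str | None]:
    scores = score(report)                         # e.g. {"Completeness": 9, ...}
    for _ in range(MAX_REWORK):
        failing = [c for c, s in scores.items() if s < THRESHOLD]
        if not failing:
            break                                  # pass: all categories >= 8
        report = rework(report, failing)           # revisit only failing dimensions
        scores = score(report)
    failing = [c for c, s in scores.items() if s < THRESHOLD]
    if failing:                                    # unresolvable after 2 iterations
        return report, f"Confidence note: {', '.join(failing)} still below {THRESHOLD}"
    return report, None                            # deliver without caveats
```
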
## Base Categories (All Agents)

| Category | Question | Threshold |
| --- | --- | :---: |
| **Completeness** | Were all required phases/categories evaluated with evidence? | ≥ 8 |
| **Accuracy** | Are findings backed by concrete references (code, architecture, CVEs), not speculation? | ≥ 8 |
| **Actionability** | Does every Critical/High finding have a specific, implementable fix or mitigation? | ≥ 8 |
| **Consistency** | Are severity ratings, mappings, and verdicts internally consistent? | ≥ 8 |
| **Coverage** | Were all entry points, trust boundaries, modules, or manifests identified and analyzed? | ≥ 8 |

## Domain-Specific Extensions

### Multi-tool Pipeline — add:

| Category | Question | Threshold |
| --- | --- | :---: |
| **Deduplication** | Are cross-tool duplicates properly merged with corroboration notes? | ≥ 8 |

### Code Quality (SonarQube-style) — adapt Completeness to:

| Category | Question | Threshold |
| --- | --- | :---: |
| **Completeness** | Were all issue types (Bugs, Vulnerabilities, Hotspots, Smells, Duplication) evaluated? | ≥ 8 |

### SAST/SCA — adapt Coverage to:

| Category | Question | Threshold |
| --- | --- | :---: |
| **Coverage** | Were all entry points taint-traced and all dependency manifests audited? | ≥ 8 |

### STRIDE Threat Modeling — adapt Completeness to:

| Category | Question | Threshold |
| --- | --- | :---: |
| **Completeness** | Were all six STRIDE categories evaluated for every trust boundary and data flow? | ≥ 8 |

### STRIDE-LM — adapt Completeness and Coverage to:

| Category | Question | Threshold |
| --- | --- | :---: |
| **Completeness** | Were all seven STRIDE-LM categories evaluated for every asset and trust boundary? | ≥ 8 |
| **Coverage** | Were all lateral movement paths, trust boundaries, and post-exploitation chains assessed? | ≥ 8 |

### Code Review — adapt Coverage to:

| Category | Question | Threshold |
| --- | --- | :---: |
| **Coverage** | Were all entry points, trust boundaries, and data flows traced from source to sink? | ≥ 8 |