mirror of https://github.com/github/awesome-copilot.git
synced 2026-05-15 11:11:48 +00:00
feat: [gem-team] Add confidence metric, optimize planner workflow (#1695)
* feat: add explicit assumption rule and confidence metric to agent documentation
  - Add `confidence` field (0‑1) to the output schema in `agents/gem-browser-tester.agent.md`
  - Include `confidence` in the `extra` object of `agents/gem-devops.agent.md`
  - Append the guideline “State assumptions explicitly; never guess silently” to all agent docs
  - Update the “Bisect (Complex Only)” heading to reflect its gate condition
  - Minor wording and formatting adjustments across the affected agent documents
* chore: update readme
* chore(release): Streamline agent documentation sections (remove self‑critique steps, renumber Handle Failure/Output)
committed by GitHub
parent 352def3ca2
commit d5c855ece0
@@ -103,18 +103,12 @@ When reviewing all changes from completed plan:
 - Offer alternatives, not just criticism
 - Acknowledge what works well (balanced critique)

-### 5. Self-Critique
-
-- Verify: findings specific/actionable (not vague opinions)
-- Check: severity justified, recommendations simpler/better
-- IF confidence < 0.85: re-analyze expanded (max 2 loops)
-
-### 6. Handle Failure
+### 5. Handle Failure

 - IF cannot read target: document what's missing
 - Log failures to docs/plan/{plan_id}/logs/

-### 7. Output
+### 6. Output

 Return JSON per `Output Format`
 </workflow>
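The failure-handling step in this hunk could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the agent's actual implementation: the function name `log_failure`, the record fields, and the `failure-<timestamp>.json` file name are hypothetical. Only the `docs/plan/{plan_id}/logs/` directory comes from the document.

```python
import json
import time
from pathlib import Path


def log_failure(plan_id: str, target: str, missing: list[str],
                root: Path = Path(".")) -> Path:
    """Write a failure record under docs/plan/{plan_id}/logs/ and return its path.

    The record layout and file name are assumptions made for this sketch;
    the document only specifies the log directory.
    """
    log_dir = root / "docs" / "plan" / plan_id / "logs"
    log_dir.mkdir(parents=True, exist_ok=True)

    record = {
        "status": "failed",   # hypothetical field
        "target": target,     # what could not be read
        "missing": missing,   # "document what's missing"
    }

    # A nanosecond timestamp keeps file names unique enough for a sketch.
    log_path = log_dir / f"failure-{time.time_ns()}.json"
    log_path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return log_path
```

A caller might invoke `log_failure("plan-1", "src/app.py", ["file not found"])` and then return a JSON result that points at the written log file.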
@@ -189,6 +183,7 @@ Return JSON per `Output Format`
 - ALWAYS offer alternatives — never just criticize.
 - Use project's existing tech stack. Challenge mismatches.
 - Always use established library/framework patterns
+- State assumptions explicitly; never guess silently

 ### I/O Optimization

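One way to make the added rule concrete is to carry assumptions alongside the result instead of leaving them implicit. The sketch below is illustrative only: `build_output` and the `assumptions` and `summary` keys are hypothetical names. The `confidence` field with a 0 to 1 range follows the commit message, but the actual `Output Format` schema is not shown in this diff.

```python
import json


def build_output(summary: str, assumptions: list[str], confidence: float) -> str:
    """Return a JSON string that states assumptions explicitly.

    Key names are assumptions for this sketch, not the documented schema.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")

    payload = {
        "summary": summary,
        "assumptions": list(assumptions),  # empty list means "none made"
        "confidence": confidence,
    }
    return json.dumps(payload)
```

Keeping `assumptions` as a required argument means a caller has to pass an empty list deliberately, which is a small guard against guessing silently.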
@@ -221,7 +216,7 @@ Run I/O and other operations in parallel and minimize repeated reads.
 - Criticizing without alternatives
 - Blocking on style (style = warning max)
 - Missing what_works (balanced critique required)
-- Re-reviewing security/PRD compliance
+- Re-reviewing security/PRD compliance (gem-reviewer owns)
 - Over-criticizing to justify existence

 ### Directives
@@ -232,6 +227,9 @@ Run I/O and other operations in parallel and minimize repeated reads.
- Always acknowledge what works before what doesn't
- Severity: blocking/warning/suggestion — be honest
- Offer simpler alternatives, not just "this is wrong"
- Different from gem-reviewer: reviewer checks COMPLIANCE (does it match spec?), critic challenges APPROACH (is the approach correct?)
- gem-critic vs gem-code-simplifier:
- gem-critic: challenges plans, code approaches, identifies problems
- gem-code-simplifier: executes refactoring tasks (assigned by planner)
- gem-critic does NOT do code modifications

</rules>
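The critique rules in these hunks (three severity levels, style issues capped at warning, an alternative for every finding, and a required `what_works` list) can be sketched as a small validator. This is a hedged illustration, not the agent's real data model: `Severity`, `Finding`, `normalize`, `build_critique`, and the `is_style` flag are hypothetical names. Only `what_works` and the severity labels appear in the document.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    SUGGESTION = 1
    WARNING = 2
    BLOCKING = 3


@dataclass
class Finding:
    issue: str
    severity: Severity
    alternative: str        # "offer alternatives, not just criticism"
    is_style: bool = False  # hypothetical flag for style-only findings


def normalize(finding: Finding) -> Finding:
    """Enforce the per-finding rules: an alternative is required, style caps at warning."""
    if not finding.alternative.strip():
        raise ValueError("every finding needs an alternative")
    if finding.is_style and finding.severity > Severity.WARNING:
        return Finding(finding.issue, Severity.WARNING, finding.alternative, True)
    return finding


def build_critique(what_works: list[str], findings: list[Finding]) -> dict:
    """Assemble a balanced critique; refuses to omit what_works."""
    if not what_works:
        raise ValueError("what_works is required for a balanced critique")
    return {
        "what_works": list(what_works),
        "findings": [
            {
                "issue": f.issue,
                "severity": f.severity.name.lower(),
                "alternative": f.alternative,
            }
            for f in map(normalize, findings)
        ],
    }
```

With this shape, a style finding submitted as `Severity.BLOCKING` is downgraded to `"warning"`, and a critique with an empty `what_works` list is rejected before any output is produced.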