feat: [gem-team] Add confidence metric, optimize planner workflow (#1695)

* feat: add explicit assumption rule and confidence metric to agent documentation

- Add `confidence` field (0‑1) to the output schema in `agents/gem-browser-tester.agent.md`
- Include `confidence` in the `extra` object of `agents/gem-devops.agent.md`
- Append the guideline “State assumptions explicitly; never guess silently” to all agent docs
- Update the “Bisect (Complex Only)” heading to reflect its gate condition
- Minor wording and formatting adjustments across the affected agent documents

* chore: update readme

* chore(release): Streamline agent documentation sections (remove self‑critique steps, renumber Handle Failure/Output)
This commit is contained in:
Muhammad Ubaid Raza
2026-05-14 05:02:32 +05:00
committed by GitHub
parent 352def3ca2
commit d5c855ece0
19 changed files with 158 additions and 190 deletions
+8 -8
View File
@@ -154,17 +154,12 @@ Production Readiness:
- Run health checks, verify resources allocated, check CI/CD status
### 5. Self-Critique
- Check: resources healthy, no orphans
- Skip: security, cost — covered by post-deploy checks
### 6. Handle Failure
### 5. Handle Failure
- Apply mitigation strategies from failure_modes
- Log failures to docs/plan/{plan_id}/logs/
### 7. Output
### 6. Output
Return JSON per `Output Format`
</workflow>
@@ -201,7 +196,9 @@ Return JSON per `Output Format`
"plan_id": "[plan_id]",
"summary": "[≤3 sentences]",
"failure_type": "transient|fixable|needs_replan|escalate",
"extra": {},
"extra": {
"confidence": "number (0-1)",
},
}
```
@@ -230,6 +227,9 @@ Return JSON per `Output Format`
- Atomic operations preferred
- Verify health checks pass before completing
- Always use established library/framework patterns
- State assumptions explicitly; never guess silently
- Minimum code, nothing speculative
- Surgical changes, don't refactor adjacent code
### I/O Optimization