Revert "Cleaning up plugins"

This reverts commit 51bf994ef2.

Co-authored-by: aaronpowell <434140+aaronpowell@users.noreply.github.com>
copilot-swe-agent[bot]
2026-04-28 01:26:07 +00:00
committed by GitHub
parent 51bf994ef2
commit d044bc9f6a
465 changed files with 96645 additions and 259 deletions
@@ -0,0 +1,54 @@
# Evaluators: Custom Templates
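Design prompt templates for LLM judges. A template tells the judge what to evaluate, how each label is defined, and exactly what to output.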
## Complete Template Pattern
```python
TEMPLATE = """Evaluate faithfulness of the response to the context.
<context>{{context}}</context>
<response>{{output}}</response>
CRITERIA:
"faithful" = ALL claims supported by context
"unfaithful" = ANY claim NOT in context
EXAMPLES:
Context: "Price is $10" → Response: "It costs $10" → faithful
Context: "Price is $10" → Response: "About $15" → unfaithful
EDGE CASES:
- Empty context → cannot_evaluate
- "I don't know" when appropriate → faithful
- Partial faithfulness → unfaithful (strict)
Answer (faithful/unfaithful/cannot_evaluate):"""
```
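A template like this only becomes a prompt once the `{{context}}` and `{{output}}` placeholders are filled in. The following is a minimal sketch of that step, assuming plain string substitution; `render` and `PLACEHOLDER` are hypothetical names for illustration, not part of any evaluation library, and a real framework may use its own templating engine.

```python
import re

# Hypothetical helper (an assumption, not a library API): matches {{name}} placeholders.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, variables: dict) -> str:
    """Replace every {{name}} placeholder with variables[name]."""

    def substitute(match):
        name = match.group(1)
        if name not in variables:
            # Fail loudly rather than send a half-filled prompt to the judge.
            raise KeyError(f"missing template variable: {name}")
        return variables[name]

    # A function replacement avoids backslash-escape handling in the values.
    return PLACEHOLDER.sub(substitute, template)


snippet = "<context>{{context}}</context>\n<response>{{output}}</response>"
prompt = render(snippet, {"context": "Price is $10", "output": "It costs $10"})
print(prompt)
```

Raising on a missing variable is a design choice in this sketch: a judge that receives a literal `{{context}}` would still return a label, and the error would go unnoticed.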
## Template Structure
1. Task description
2. Input variables in XML tags
3. Criteria definitions
4. Examples (2-4 cases)
5. Edge cases
6. Output format
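The six parts above can be assembled programmatically, which keeps their order consistent across evaluators. This is a sketch under stated assumptions: `build_template` and its parameters are hypothetical, and the section labels (`CRITERIA:`, `EXAMPLES:`, `EDGE CASES:`) simply mirror the faithfulness template shown earlier.

```python
def build_template(task, variables, criteria, examples, edge_cases, output_format):
    """Assemble a judge template from the six parts, in order.

    Hypothetical helper for illustration:
      variables: XML tag name -> placeholder name
      criteria:  label -> definition
      examples:  list of (context, response, label) tuples
    """
    if not 2 <= len(examples) <= 4:
        raise ValueError("include 2-4 examples")

    lines = [task]                                               # 1. Task description
    lines += [f"<{tag}>{{{{{name}}}}}</{tag}>"                   # 2. Variables in XML tags
              for tag, name in variables.items()]
    lines.append("CRITERIA:")                                    # 3. Criteria definitions
    lines += [f'"{label}" = {text}' for label, text in criteria.items()]
    lines.append("EXAMPLES:")                                    # 4. Examples
    lines += [f'Context: "{c}" → Response: "{r}" → {label}' for c, r, label in examples]
    lines.append("EDGE CASES:")                                  # 5. Edge cases
    lines += [f"- {case}" for case in edge_cases]
    lines.append(output_format)                                  # 6. Output format
    return "\n".join(lines)


template = build_template(
    task="Evaluate faithfulness of the response to the context.",
    variables={"context": "context", "response": "output"},
    criteria={
        "faithful": "ALL claims supported by context",
        "unfaithful": "ANY claim NOT in context",
    },
    examples=[
        ("Price is $10", "It costs $10", "faithful"),
        ("Price is $10", "About $15", "unfaithful"),
    ],
    edge_cases=["Partial faithfulness → unfaithful (strict)"],
    output_format="Answer (faithful/unfaithful):",
)
print(template)
```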
## XML Tags
```
<question>{{input}}</question>
<response>{{output}}</response>
<context>{{context}}</context>
<reference>{{reference}}</reference>
```
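One way to enforce this convention is to check that every placeholder in a template sits directly inside a matching pair of XML tags. The sketch below assumes the `{{name}}` placeholder syntax used above; `unwrapped_variables` is a hypothetical helper, not an existing API.

```python
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")
# A placeholder immediately enclosed by an opening tag and its matching closing tag.
WRAPPED = re.compile(r"<(\w+)>\{\{(\w+)\}\}</\1>")


def unwrapped_variables(template: str) -> set:
    """Return placeholder names that never appear inside XML tags.

    Note: a name counts as wrapped if at least one of its occurrences is wrapped.
    """
    all_names = set(PLACEHOLDER.findall(template))
    wrapped = {name for _tag, name in WRAPPED.findall(template)}
    return all_names - wrapped


draft = "<question>{{input}}</question>\nResponse: {{output}}"
print(unwrapped_variables(draft))
```

Here `{{output}}` is reported because it follows a plain `Response:` label instead of being enclosed in `<response>` tags.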
## Common Mistakes
| Mistake | Fix |
| ------- | --- |
| Vague criteria | Define each label exactly |
| No examples | Include 2-4 cases |
| Ambiguous format | Specify exact output |
| No edge cases | State which label each ambiguous case gets |
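Specifying an exact output format pays off on the parsing side: the judge's reply can be checked against a closed set of labels instead of being interpreted loosely. A minimal sketch, assuming the two labels from the faithfulness template; `parse_label` and `ALLOWED_LABELS` are hypothetical names.

```python
from typing import Optional

# Assumed label set, taken from the faithfulness template's criteria.
ALLOWED_LABELS = frozenset({"faithful", "unfaithful"})


def parse_label(raw: str, allowed=ALLOWED_LABELS) -> Optional[str]:
    """Normalize a judge reply and return it only if it is an exact allowed label."""
    # Tolerate surrounding whitespace, quotes, a trailing period, and case differences.
    label = raw.strip().strip(".\"'").lower()
    return label if label in allowed else None


print(parse_label(" Faithful.\n"))
print(parse_label("mostly faithful"))
```

Returning `None` for anything outside the label set makes malformed replies visible, so they can be counted or retried rather than silently mapped to a label.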