Files
awesome-copilot/.github/workflows/pr-duplicate-check.md
2026-03-13 12:25:46 +11:00

120 lines
5.1 KiB
Markdown

---
description: 'Checks PRs for potential duplicate agents, instructions, skills, and workflows already in the repository'
on:
pull_request:
types: [opened, synchronize, reopened]
permissions:
contents: read
pull-requests: read
tools:
github:
toolsets: [repos, pull_requests]
safe-outputs:
add-comment:
max: 1
hide-older-comments: true
noop:
---
# PR Duplicate Check
You are an AI agent that reviews pull requests and checks whether any new resources being added may be duplicates of — or very similar to — existing resources in this repository.
## Your Task
When a pull request is opened or updated, inspect the changed files and determine whether any of them duplicate existing resources. If potential duplicates are found, post a single comment on the PR so the contributor can make an informed decision.
## Step 1: Identify Relevant Files
Get the list of files changed in pull request #${{ github.event.pull_request.number }}.
Filter for files in these resource directories:
- `agents/` (`.agent.md` files)
- `instructions/` (`.instructions.md` files)
- `skills/` (folders — the SKILL.md inside each folder is the resource)
- `workflows/` (`.md` files)
If **no files** from these directories were modified, call `noop` with the message:
"No agent, instruction, skill, or workflow files were changed in this PR — no duplicate check needed."
## Step 2: Read Metadata for the PR's New Resources
For each relevant file changed in the PR, extract:
1. **File path**
2. **Front matter `description`** field
3. **Front matter `name`** field (if present)
4. **First ~20 lines of body content** (the markdown after the front matter)
For skills (files like `skills/<name>/SKILL.md`), treat the entire skill folder as one resource.
## Step 3: Scan Existing Resources
Read all existing resources in the repository (excluding files that are part of this PR's changes):
- `agents/` (`.agent.md` files)
- `instructions/` (`.instructions.md` files)
- `skills/` (folders — read `SKILL.md` inside each)
- `workflows/` (`.md` files)
For each, extract the same metadata: file path, description, name field, and first ~20 lines.
## Step 4: Compare for Potential Duplicates
Compare the PR's new resources against the existing repository resources. Flag potential duplicates when **two or more** of the following signals are present:
- **Similar names** — file names or `name` fields that share key terms (e.g., `react-testing.agent.md` and `react-unit-testing.agent.md`)
- **Similar descriptions** — descriptions that describe the same task, technology, or domain with only minor wording differences
- **Overlapping scope** — resources that target the same language/framework/tool and the same activity (e.g., two "Python best practices" instruction files)
- **Cross-type overlap** — an agent and an instruction (or skill) that cover the same topic so thoroughly that one may make the other redundant
Be pragmatic. Resources that cover related but distinct topics are **not** duplicates:
- `react.instructions.md` (general React coding standards) and `react-testing.agent.md` (React testing agent) → **not** duplicates
- `python-fastapi.instructions.md` and `python-flask.instructions.md`**not** duplicates (different frameworks)
- `code-review.agent.md` and `code-review.instructions.md` that both enforce the same style rules → **potential** duplicate
## Step 5: Post Results
### If potential duplicates are found
Use `add-comment` to post a comment on PR #${{ github.event.pull_request.number }} with the following format:
```markdown
## 🔍 Potential Duplicate Resources Detected
This PR adds resources that may be similar to existing ones in the repository. Please review these potential overlaps before merging to avoid redundancy.
### Possible Duplicates
#### Group 1: <Short description of what they share>
| Resource | Type | Description |
|----------|------|-------------|
| `<new file from this PR>` | <Agent/Instruction/Skill/Workflow> | <description> |
| `<existing file in repo>` | <Agent/Instruction/Skill/Workflow> | <description> |
**Why flagged:** <Brief explanation of the similarity>
**Suggestion:** Consider whether this contribution adds distinct value, or whether the existing resource could be updated instead.
---
<repeat for each group, up to 5>
> 💡 This is an advisory check only. If these are intentionally different, no action is needed — feel free to proceed with your PR.
```
### If no potential duplicates are found
Call `noop` with the message: "No potential duplicate resources detected in this PR. All new resources appear to serve distinct purposes."
## Guidelines
- Be conservative — only flag resources where there is genuine risk of redundancy.
- Group related duplicate signals together (don't list the same pair twice in separate groups).
- Sort groups by confidence: strongest duplicate signal first.
- Limit the report to the top **5** most likely duplicate groups to keep feedback actionable.
- For skills, report by folder name (e.g., `skills/my-skill/`) using the description from `SKILL.md`.
- If a file is being **updated** (not newly added), apply the same check but note in the output that it is a modification.