Add resemble-detect skill (#1396)

* Add resemble-detect skill

Deepfake detection and media safety skill using Resemble AI — detects
AI-generated audio, images, video, and text with confidence scores,
traces audio source platforms, applies and reads watermarks, verifies
speaker identity, and extracts media intelligence (speaker, emotion,
misinformation signals).

Packaged as SKILL.md + LICENSE (Apache-2.0). Generated docs updated
via npm start per CONTRIBUTING.md.

* resemble-detect: trim body under 500 lines + add compatibility

Moves detailed request/response schemas from SKILL.md into
references/api-reference.md, bringing the SKILL body from 557 to
282 lines (validator hard cap is 500). Core decision-making content
— capability decision tree, score interpretation, workflows, red
flags — stays in the body where the agent needs it at query time.

Also adds a compatibility field to frontmatter per review feedback:
surfaces the RESEMBLE_API_KEY requirement and the public-HTTPS-URL
constraint upfront.

* Fix resemble-detect skill metadata
Author: Dev Shah
Date: 2026-04-28 07:39:46 +05:30 (committed by GitHub)
Parent: 5f69546969 · Commit: 1aea01a677
4 changed files with 784 additions and 0 deletions


@@ -283,6 +283,7 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-skills) for guidelines on how to
| [remember](../skills/remember/SKILL.md) | Transforms lessons learned into domain-organized memory instructions (global or workspace). Syntax: `/remember [>domain [scope]] lesson clue` where scope is `global` (default), `user`, `workspace`, or `ws`. | None |
| [remember-interactive-programming](../skills/remember-interactive-programming/SKILL.md) | A micro-prompt that reminds the agent that it is an interactive programmer. Works great in Clojure when Copilot has access to the REPL (probably via Backseat Driver). Will work with any system that has a live REPL that the agent can use. Adapt the prompt with any specific reminders in your workflow and/or workspace. | None |
| [repo-story-time](../skills/repo-story-time/SKILL.md) | Generate a comprehensive repository summary and narrative story from commit history | None |
| [resemble-detect](../skills/resemble-detect/SKILL.md) | Deepfake detection and media safety — detect AI-generated audio, images, video, and text, trace synthesis sources, apply watermarks, verify speaker identity, and analyze media intelligence using Resemble AI | `LICENSE`<br />`references/api-reference.md` |
| [review-and-refactor](../skills/review-and-refactor/SKILL.md) | Review and refactor code in your project according to defined instructions | None |
| [reviewing-oracle-to-postgres-migration](../skills/reviewing-oracle-to-postgres-migration/SKILL.md) | Identifies Oracle-to-PostgreSQL migration risks by cross-referencing code against known behavioral differences (empty strings, refcursors, type coercion, sorting, timestamps, concurrent transactions, etc.). Use when planning a database migration, reviewing migration artifacts, or validating that integration tests cover Oracle/PostgreSQL differences. | `references/REFERENCE.md`<br />`references/empty-strings-handling.md`<br />`references/no-data-found-exceptions.md`<br />`references/oracle-parentheses-from-clause.md`<br />`references/oracle-to-postgres-sorting.md`<br />`references/oracle-to-postgres-timestamp-timezone.md`<br />`references/oracle-to-postgres-to-char-numeric.md`<br />`references/oracle-to-postgres-type-coercion.md`<br />`references/postgres-concurrent-transactions.md`<br />`references/postgres-refcursor-handling.md` |
| [roundup](../skills/roundup/SKILL.md) | Generate personalized status briefings on demand. Pulls from your configured data sources (GitHub, email, Teams, Slack, and more), synthesizes across them, and drafts updates in your own communication style for any audience you define. | None |


@@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


@@ -0,0 +1,282 @@
---
name: resemble-detect
description: Deepfake detection and media safety — detect AI-generated audio, images, video, and text, trace synthesis sources, apply watermarks, verify speaker identity, and analyze media intelligence using Resemble AI
license: Apache-2.0
compatibility: 'Requires a Resemble AI API key (https://app.resemble.ai) set as RESEMBLE_API_KEY. All media must be accessible via public HTTPS URLs — local file paths are not supported except for text detection.'
---
# Resemble Detect — Deepfake Detection & Media Safety
Analyze audio, image, video, and text for synthetic manipulation, AI-generated content, watermarks, speaker identity, and media intelligence using the Resemble AI platform.
## Core Principle — THE IRON LAW
**"NEVER DECLARE MEDIA AS REAL OR FAKE WITHOUT A COMPLETED DETECTION RESULT."**
Do not guess, infer, or speculate about media authenticity. Every authenticity claim must be backed by a completed Resemble detect job with a returned `label`, `score`, and `status: "completed"`. If the detection is still `processing`, wait. If it `failed`, say so — do not substitute your own judgment.
## When to Use
Use this skill whenever the user's request involves any of these:
- Checking if audio, video, image, or text is AI-generated or manipulated
- Detecting deepfakes in any media format
- Verifying media authenticity or provenance
- Identifying which AI platform synthesized audio (source tracing)
- Applying or detecting watermarks on media
- Analyzing media for speaker info, emotion, transcription, or misinformation
- Asking natural-language questions about detection results
- Matching or verifying speaker identity against known voice profiles
- Detecting AI-generated or machine-written text
- Any mention of: "deepfake", "fake detection", "synthetic media", "voice verification", "watermark", "media forensics", "authenticity check", "source tracing", "is this real", "AI-written text", "text detection"
**Do NOT use** for text-to-speech generation, voice cloning, or speech-to-text transcription — those are separate Resemble capabilities.
## Capability Decision Tree
| User wants to... | Use this | API endpoint |
|-------------------------------------------------------|---------------------------|---------------------------------------|
| Check if media is AI-generated / deepfake | **Deepfake Detection** | `POST /detect` |
| Know *which AI platform* made fake audio | **Audio Source Tracing** | `POST /detect` with flag |
| Get speaker info, emotion, transcription from media | **Intelligence** | `POST /intelligence` |
| Ask questions about a completed detection | **Detect Intelligence** | `POST /detects/{uuid}/intelligence` |
| Apply an invisible watermark to media | **Watermark Apply** | `POST /watermark/apply` |
| Check if media contains a watermark | **Watermark Detect** | `POST /watermark/detect` |
| Verify a speaker's identity against known profiles | **Identity Search** | `POST /identity/search` |
| Check if text is AI-generated | **Text Detection** | `POST /text_detect` |
| Create a voice identity profile for future matching | **Identity Create** | `POST /identity` |
When multiple capabilities apply (e.g., user wants deepfake detection AND intelligence), combine them in a single `POST /detect` call using the `intelligence: true` flag rather than making separate requests.
## Required Setup
- **API Key**: Bearer token from the Resemble AI dashboard (set as `RESEMBLE_API_KEY`)
- **Base URL**: `https://app.resemble.ai/api/v2`
- **Auth Header**: `Authorization: Bearer <RESEMBLE_API_KEY>`
- **Media Requirement**: All media must be at a publicly accessible HTTPS URL
If the user provides a local file path instead of a URL, inform them the file must be hosted at a public HTTPS URL first. Do not attempt to upload local files to the API. (Exception: `POST /text_detect` accepts text content inline.)
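The setup above can be sketched as a small helper that assembles the base URL and auth header. This is a minimal illustration, not an official client; `build_request` is a hypothetical name:

```python
import os

BASE_URL = "https://app.resemble.ai/api/v2"

def build_request(path, api_key=None):
    """Return (url, headers) for a Resemble detect API call."""
    key = api_key or os.environ.get("RESEMBLE_API_KEY")
    if not key:
        raise RuntimeError("RESEMBLE_API_KEY is not set")
    return (
        f"{BASE_URL}{path}",
        {
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
    )
```

Any HTTP library can consume the returned pair; the point is that every call shares one base URL and one bearer header.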
## MCP Tools Available
When the Resemble MCP server is connected, use these tools instead of raw API calls:
| Tool | Purpose |
|---------------------------|---------------------------------------------------|
| `resemble_docs_lookup` | Get comprehensive docs for any detect sub-topic |
| `resemble_search` | Search across all documentation |
| `resemble_api_endpoint` | Get exact OpenAPI spec for any endpoint |
| `resemble_api_search` | Find endpoints by keyword |
| `resemble_get_page` | Read specific documentation pages |
| `resemble_list_topics` | List all available topics |
**Tool usage pattern**: Use `resemble_docs_lookup` with topic `"detect"` to get the full picture, then `resemble_api_endpoint` for exact request/response schemas before making API calls.
## Full API Reference
Detailed request/response schemas for every endpoint are in **[references/api-reference.md](references/api-reference.md)**. Consult it before making any API call to verify exact parameter names and response shapes. The sections below cover decision-making; the reference covers exact field formats.
---
## Phase 1: Deepfake Detection
The core capability. Submit audio, image, or video for AI-generated content analysis via `POST /detect`.
**Key flags to consider:**
- `visualize: true` — generate heatmap/visualization artifacts
- `intelligence: true` — run multimodal intelligence alongside detection (saves a round-trip)
- `audio_source_tracing: true` — identify which AI platform synthesized fake audio (only fires on `"fake"` audio)
- `use_reverse_search: true` — enable reverse image search (image only)
- `zero_retention_mode: true` — auto-delete media after analysis (for sensitive content)
Detection is asynchronous. Poll `GET /detect/{uuid}` at 2s → 5s → 10s intervals until `status` is `"completed"` or `"failed"`. Most detections complete in 10–60 seconds.
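The 2s → 5s → 10s polling schedule can be sketched as below. `fetch_status` is a hypothetical callable wrapping `GET /detect/{uuid}`; injecting it (and the sleep function) keeps the sketch testable without network access:

```python
import time

def poll_detection(fetch_status, max_wait=120, sleep=time.sleep):
    """Poll until the detection reaches a terminal status or max_wait elapses."""
    delays = [2, 5]  # initial intervals, then a steady 10s
    waited = 0.0
    while waited < max_wait:
        result = fetch_status()
        if result.get("status") in ("completed", "failed"):
            return result
        delay = delays.pop(0) if delays else 10
        sleep(delay)
        waited += delay
    raise TimeoutError("detection did not finish within max_wait")
```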
**Supported formats:** Audio (WAV, MP3, OGG, M4A, FLAC) · Video (MP4, MOV, AVI, WMV) · Image (JPG, PNG, GIF, WEBP)
### Reading Results
- **Audio** — verdict in `metrics` — use `label` and `aggregated_score`
- **Image** — verdict in `image_metrics` — use `label` and `score`; the `ifl` field contains an Invisible Frequency Layer heatmap
- **Video** — verdict in `video_metrics` — hierarchical tree of frame/segment results; video-with-audio returns both `metrics` and `video_metrics`
See [references/api-reference.md](references/api-reference.md#reading-results-by-media-type) for full response schemas.
### Interpreting Scores
| Score Range | Interpretation |
|-------------|-----------------------------------------------------|
| 0.0–0.3     | Strong indication of authentic/real media           |
| 0.3–0.5     | Inconclusive — recommend additional analysis        |
| 0.5–0.7     | Likely synthetic — flag for review                  |
| 0.7–1.0     | High confidence synthetic/AI-generated              |
**Always present scores with context.** Say "The detection returned a score of 0.87, indicating high confidence that this audio is AI-generated" — never just "it's fake."
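The score bands and the "present with context" rule above can be encoded in one helper. The band edges follow the interpretation table; `interpret_score` and its exact wording are illustrative:

```python
def interpret_score(score):
    """Map a detection score to the contextual phrasing the guideline asks for."""
    if score < 0.3:
        band = "a strong indication of authentic/real media"
    elif score < 0.5:
        band = "inconclusive; additional analysis is recommended"
    elif score < 0.7:
        band = "likely synthetic; flag for review"
    else:
        band = "high confidence that the media is synthetic/AI-generated"
    return f"The detection returned a score of {score:.2f}, indicating {band}."
```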
---
## Phase 2: Intelligence — Media Analysis
Rich structured insights about media: speaker info, emotion, transcription, translation, misinformation, abnormalities.
Two ways to run Intelligence:
1. **Combined with detection** — add `intelligence: true` to `POST /detect` (preferred; one call)
2. **Standalone** — `POST /intelligence` with a URL (when you only need analysis, not a deepfake verdict)
**Audio/video structured fields include:** `speaker_info`, `language`, `dialect`, `emotion`, `speaking_style`, `context`, `message`, `abnormalities`, `transcription`, `translation`, `misinformation`.
**Image structured fields include:** `scene_description`, `subjects`, `authenticity_analysis`, `context_and_setting`, `abnormalities`, `misinformation`.
### Detect Intelligence — Ask Questions About Results
After a detection completes, ask natural-language questions via `POST /detects/{detect_uuid}/intelligence` with `{ "query": "..." }`. Returns a question UUID — poll `GET /detects/{detect_uuid}/intelligence/{question_uuid}` until `completed`.
**Good questions to suggest:**
- "Summarize the detection results in plain language"
- "What specific indicators suggest this is AI-generated?"
- "How do the audio and video detection results differ?"
- "What is the confidence level and what does it mean?"
- "Are there any inconsistencies in the analysis?"
**Prerequisite:** The detection must have `status: "completed"`. Submitting a question against a processing or failed detection returns 422.
See [references/api-reference.md](references/api-reference.md#intelligence) for full parameters.
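Assembling the question request described above can be sketched as follows; the helper name is hypothetical, and only the endpoint shape and the `{ "query": ... }` body come from this skill:

```python
def build_question_request(detect_uuid, query):
    """Return (path, body) for POST /detects/{uuid}/intelligence."""
    if not query.strip():
        raise ValueError("query must be non-empty")
    return (f"/detects/{detect_uuid}/intelligence", {"query": query})
```

Remember the prerequisite: submit the question only after the detection itself reports `status: "completed"`.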
---
## Phase 3: Audio Source Tracing
When audio is labeled `"fake"`, identify which AI platform generated it.
**Enable it** by setting `audio_source_tracing: true` in the `POST /detect` request. Result appears in the detection response under `audio_source_tracing.label`.
Known labels: `resemble_ai`, `elevenlabs`, `real`, and others as the model expands.
**Important:** Source tracing only runs on audio labeled `"fake"`. Real audio produces no source tracing result.
Standalone queries: `GET /audio_source_tracings` and `GET /audio_source_tracings/{uuid}`.
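Because the tracing result only appears for `"fake"` audio, it pays to read the field defensively. A minimal sketch (the helper is hypothetical; the field path matches the description above):

```python
def source_platform(detection):
    """Return the traced platform label, or None when absent (e.g. real audio)."""
    tracing = detection.get("audio_source_tracing") or {}
    return tracing.get("label")
```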
---
## Phase 4: Watermarking
Apply invisible watermarks to media for provenance tracking, or detect existing watermarks.
- **Apply**: `POST /watermark/apply` with `url`, optional `strength` (0.0–1.0), optional `custom_message`. Add `Prefer: wait` for synchronous response, or poll `GET /watermark/apply/{uuid}/result`. Response includes `watermarked_media` URL.
- **Detect**: `POST /watermark/detect` with `url`. Audio returns `{ has_watermark, confidence }`; image/video returns `{ has_watermark }`.
See [references/api-reference.md](references/api-reference.md#watermarking) for exact parameter rules.
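A payload-builder sketch for the apply call, assuming the parameter names listed above (`url`, `strength`, `custom_message`); the range check mirrors the documented 0.0–1.0 strength bounds:

```python
def watermark_apply_payload(url, strength=None, custom_message=None):
    """Build the JSON body for POST /watermark/apply, validating strength."""
    payload = {"url": url}
    if strength is not None:
        if not 0.0 <= strength <= 1.0:
            raise ValueError("strength must be between 0.0 and 1.0")
        payload["strength"] = strength
    if custom_message is not None:
        payload["custom_message"] = custom_message
    return payload
```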
---
## Phase 5: Identity — Speaker Verification (Beta)
Create voice identity profiles and match incoming audio against them.
> **Beta feature** — requires joining the preview program. Inform the user if they encounter access errors.
- **Create profile**: `POST /identity` with `{ audio_url, name }`
- **Search**: `POST /identity/search` with `{ audio_url, top_k }`
Response returns ranked matches with `confidence` (higher = stronger) and `distance` (lower = closer match).
See [references/api-reference.md](references/api-reference.md#identity--speaker-verification-beta) for full schemas.
---
## Phase 6: Text Detection
Detect whether text content is AI-generated or human-written via `POST /text_detect`.
> **Beta feature** — requires the `detect_beta_user` role or a billing plan that includes the `dfd_text` product.
**Key parameters:**
- `text` (required, max 100,000 chars)
- `threshold` (default 0.5)
- `privacy_mode: true` — text content not stored after analysis
- `callback_url` — async notification webhook
Add `Prefer: wait` for synchronous response, or poll `GET /text_detect/{uuid}`. Response includes `prediction` (`"ai"` or `"human"`) and `confidence` (0.0–1.0).
See [references/api-reference.md](references/api-reference.md#text-detection) for full schema and callback format.
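For input longer than the 100,000-character cap, split before submitting. A naive fixed-width split is sketched below; splitting on paragraph boundaries instead would preserve more context per chunk:

```python
MAX_TEXT_CHARS = 100_000

def chunk_text(text, limit=MAX_TEXT_CHARS):
    """Split text into chunks that each fit the /text_detect character limit."""
    return [text[i:i + limit] for i in range(0, len(text), limit)] or [""]
```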
---
## Recommended Workflows
### Full Media Forensics (Most Thorough)
For a comprehensive analysis, combine all capabilities:
1. Submit detection with all flags enabled:
```json
{
"url": "https://example.com/suspect.mp4",
"visualize": true,
"intelligence": true,
"audio_source_tracing": true,
"use_reverse_search": true
}
```
2. Poll until `status: "completed"`
3. Read `metrics` / `image_metrics` / `video_metrics` for the verdict
4. Read `intelligence.description` for structured media analysis
5. If audio labeled `"fake"`, check `audio_source_tracing.label` for the source platform
6. Ask follow-up questions via Detect Intelligence if anything needs clarification
7. Check for watermarks via `POST /watermark/detect` if provenance is relevant
### Quick Authenticity Check (Fastest)
1. Submit minimal detection: `{ "url": "..." }`
2. Poll until complete
3. Check `label` and `aggregated_score` (audio) or `label` and `score` (image/video)
4. Report result with score context
### Provenance Pipeline (Content Creators)
1. Apply watermark to original content: `POST /watermark/apply`
2. Distribute watermarked media
3. Later, verify provenance: `POST /watermark/detect` against any copy
---
## Red Flags — Stop and Reassess
- **Declaring authenticity without a detection result** — Never say media is real or fake based on visual/auditory inspection alone
- **Ignoring the score and reporting only the label** — A `"fake"` label with score 0.51 means something very different from score 0.95
- **Submitting local file paths to the API** — The API requires publicly accessible HTTPS URLs (does not apply to text detection)
- **Sending text longer than 100,000 characters to text detection** — Split into chunks or inform the user of the limit
- **Polling too aggressively** — Start at 2s intervals, back off exponentially; do not loop at <1s
- **Asking Detect Intelligence questions before detection completes** — Results in 422 error
- **Expecting source tracing on "real" audio** — Source tracing only runs on audio labeled `"fake"`
- **Treating beta features (Identity, Text Detection) as production-ready** — Warn users about beta status
- **Ignoring `zero_retention_mode` for sensitive media** — Always suggest this flag when the user indicates the media is sensitive or private
- **Making multiple separate API calls when flags can combine** — Use `intelligence: true` and `audio_source_tracing: true` on the detection call instead of separate requests
## Response Presentation Guidelines
When presenting results to users:
1. **Lead with the verdict** — "The detection indicates this audio is likely AI-generated (score: 0.87)"
2. **Provide score context** — Use the score interpretation table above
3. **Mention limitations** — Detection is probabilistic, not absolute proof
4. **Include actionable next steps** — Suggest intelligence queries, source tracing, or watermark checks as appropriate
5. **For inconclusive results (0.3–0.5)** — Explicitly state the result is inconclusive and recommend additional analysis with different parameters or manual review
6. **Never present detection as legal evidence** — Detection results are analytical tools, not forensic certifications
## Error Handling
| Error | Cause | Resolution |
|-----------|--------------------------------------------|-------------------------------------------------|
| 400 | Invalid request body or missing `url` | Check required parameters |
| 401 | Invalid or missing API key | Verify `RESEMBLE_API_KEY` |
| 404 | Detection UUID not found | Verify the UUID from the creation response |
| 422 | Detection not completed (for Intelligence) | Wait for detection to reach `completed` status |
| 429 | Rate limited | Back off and retry with exponential delay |
| 500 | Server error | Retry once, then report to user |
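The table above can be read as a simple retry policy. This mapping is one interpretation of the table, not an official client behavior:

```python
def retry_policy(status_code):
    """Map an API error status to a suggested next action."""
    if status_code == 429:
        return "retry_with_backoff"      # rate limited
    if status_code == 500:
        return "retry_once"              # transient server error
    if status_code == 422:
        return "wait_for_completion"     # detection not yet completed
    return "fail"                        # 400/401/404: fix the request instead
```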
## Privacy & Compliance Notes
- **Zero retention mode**: Set `zero_retention_mode: true` to auto-delete media after analysis. The URL is redacted and `media_deleted` is set to true post-completion.
- **Text privacy mode**: Set `privacy_mode: true` on text detection to prevent text content from being stored after analysis.
- **Data handling**: Media URLs and text content are stored by default. For GDPR/compliance-sensitive workflows, enable zero retention (media) or privacy mode (text).
- **Callback security**: If using `callback_url`, ensure the endpoint is HTTPS and authenticated on the receiving end.


@@ -0,0 +1,300 @@
# Resemble Detect — Full API Reference
Detailed request/response schemas for every Resemble detection endpoint.
## Base
- **Base URL**: `https://app.resemble.ai/api/v2`
- **Auth**: `Authorization: Bearer <RESEMBLE_API_KEY>`
---
## Deepfake Detection
### `POST /detect`
Submit audio, image, or video for AI-generation analysis.
```json
{
"url": "https://example.com/media.mp4",
"visualize": true,
"intelligence": true,
"audio_source_tracing": true
}
```
| Parameter | Type | Required | Description |
|------------------------|---------|----------|----------------------------------------------------------|
| `url` | string | Yes | HTTPS URL to audio, image, or video file |
| `callback_url` | string | No | Webhook URL for async completion notification |
| `visualize` | boolean | No | Generate heatmap/visualization artifacts |
| `intelligence` | boolean | No | Run multimodal intelligence alongside detection |
| `audio_source_tracing` | boolean | No | Identify which AI platform synthesized fake audio |
| `frame_length`         | integer | No       | Audio/video window size in seconds (1–4, default 2)      |
| `start_region` | number | No | Start of segment to analyze (seconds) |
| `end_region` | number | No | End of segment to analyze (seconds) |
| `model_types` | string | No | `"image"` or `"talking_head"` (for face-swap detection) |
| `use_reverse_search` | boolean | No | Enable reverse image search (image only) |
| `use_ood_detector` | boolean | No | Enable out-of-distribution detection |
| `zero_retention_mode` | boolean | No | Auto-delete media after detection completes |
**Supported formats:** Audio (WAV, MP3, OGG, M4A, FLAC) · Video (MP4, MOV, AVI, WMV) · Image (JPG, PNG, GIF, WEBP)
### `GET /detect/{uuid}` — Poll for Results
Detection is asynchronous. Poll until `status` is `"completed"` or `"failed"`. Start at 2s intervals, back off to 5s, then 10s. Most detections complete within 10–60s.
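The suggested schedule (2s intervals, backing off to 5s, then 10s) might look like this as a sketch; `get_status` is a hypothetical stand-in for a `GET /detect/{uuid}` request, and sleeping is omitted so the logic stays self-contained:

```python
import itertools

def poll_intervals():
    """Yield the suggested poll delays: a few at 2s, a few at 5s, then 10s."""
    yield from (2, 2, 2)
    yield from (5, 5, 5)
    yield from itertools.repeat(10)

def poll_detection(get_status, max_polls=30):
    """Poll a status callable until it returns 'completed' or 'failed'.

    `get_status` stands in for the GET /detect/{uuid} call returning the
    `status` field; real code would time.sleep(delay) before each poll.
    """
    for attempt, delay in zip(range(max_polls), poll_intervals()):
        status = get_status()
        if status in ("completed", "failed"):
            return status
    return "timeout"
```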
### Reading Results by Media Type
**Audio results** — in `metrics`:
```json
{
"label": "fake",
"score": ["0.92", "0.88", "0.95"],
"consistency": "0.91",
"aggregated_score": "0.92",
"image": "https://..."
}
```
- `label`: `"fake"` or `"real"` — the verdict
- `score`: Per-chunk prediction scores (array)
- `aggregated_score`: Overall confidence (0.0–1.0, higher = more likely synthetic)
- `consistency`: How consistent the prediction is across chunks
- `image`: Visualization heatmap URL (if `visualize: true`)
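Note that the audio scores arrive as strings. A minimal parsing sketch, using the example payload above (the helper name is ours, not the API's):

```python
def parse_audio_metrics(metrics: dict) -> dict:
    """Convert string-typed audio metrics into floats for downstream logic."""
    return {
        "label": metrics["label"],
        "scores": [float(s) for s in metrics["score"]],
        "aggregated_score": float(metrics["aggregated_score"]),
        "consistency": float(metrics["consistency"]),
    }

result = parse_audio_metrics({
    "label": "fake",
    "score": ["0.92", "0.88", "0.95"],
    "consistency": "0.91",
    "aggregated_score": "0.92",
})
```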
**Image results** — in `image_metrics`:
```json
{
"type": "ImageAnalysis",
"label": "fake",
"score": 0.87,
"image": "https://...",
"ifl": { "score": 0.82, "heatmap": "https://..." },
"reverse_image_search_sources": [
{ "url": "...", "title": "...", "verdict": "known_fake", "similarity": 0.95 }
]
}
```
- `ifl`: Invisible Frequency Layer analysis with heatmap
- `reverse_image_search_sources`: Known online sources (if `use_reverse_search: true`)
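One way to weigh this evidence, sketched under the assumption that a confirmed `known_fake` reverse-search hit is stronger than a model score alone (the helper and its return shape are illustrative, not part of the API):

```python
def strongest_image_evidence(image_metrics: dict):
    """Summarize image evidence: a reverse-search 'known_fake' hit
    outranks the model score alone."""
    for src in image_metrics.get("reverse_image_search_sources", []):
        if src.get("verdict") == "known_fake":
            return ("known_fake_source", src["url"])
    return ("model_score", image_metrics["score"])
```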
**Video results** — in `video_metrics`:
```json
{
"label": "fake",
"score": 0.89,
"certainty": 0.91,
"children": [
{ "type": "VideoResult", "conclusion": "Fake", "score": 0.89, "timestamp": 2.5, "children": [...] }
]
}
```
- Hierarchical tree of frame-level and segment-level results
- Video with audio track returns both `metrics` (audio) and `video_metrics` (visual)
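Walking that hierarchical tree to collect fake segments with their timestamps might look like the following sketch (field names follow the example above; the helper itself is ours):

```python
def collect_fake_segments(node: dict, found=None) -> list:
    """Depth-first walk of a video_metrics tree, collecting
    (timestamp, score) for nodes concluded to be fake."""
    if found is None:
        found = []
    if node.get("conclusion", "").lower() == "fake" and "timestamp" in node:
        found.append((node["timestamp"], node.get("score")))
    for child in node.get("children", []):
        collect_fake_segments(child, found)
    return found

tree = {
    "label": "fake",
    "score": 0.89,
    "children": [
        {"type": "VideoResult", "conclusion": "Fake", "score": 0.89,
         "timestamp": 2.5, "children": []},
    ],
}
segments = collect_fake_segments(tree)
```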
---
## Intelligence
### `POST /intelligence`
Analyze media for rich structured insights, standalone or alongside detection.
```json
{ "url": "https://example.com/audio.mp3", "json": true }
```
| Parameter | Type | Required | Description |
|----------------|---------|----------|----------------------------------------------------------|
| `url`          | string  | One of   | HTTPS URL to media file (provide either `url` or `media_token`) |
| `media_token`  | string  | One of   | Token from secure upload (alternative to `url`)          |
| `detect_id` | string | No | UUID of existing detect to associate |
| `media_type` | string | No | `"audio"`, `"video"`, or `"image"` (auto-detected) |
| `json`         | boolean | No       | Return structured fields (default: `false` for audio/video, `true` for image) |
| `callback_url` | string | No | Webhook for async mode |
**Audio/Video structured response** (`json: true`):
- `speaker_info` — speaker description (age, gender)
- `language` / `dialect` — detected language
- `emotion` — detected emotional state
- `speaking_style` — conversational, formal, etc.
- `context` — inferred context of the speech
- `message` — content summary
- `abnormalities` — anomalies detected in the media
- `transcription` — full transcript
- `translation` — translation if non-English
- `misinformation` — misinformation analysis
**Image structured response:**
- `scene_description` — what the image shows
- `subjects` — people/objects identified
- `authenticity_analysis` — visual authenticity assessment
- `context_and_setting` — environment description
- `abnormalities` — visual anomalies
- `misinformation` — misinformation analysis
### `POST /detects/{detect_uuid}/intelligence` — Ask Questions
After detection completes, ask natural-language questions about it:
```json
{ "query": "How confident is the model that this audio is fake?" }
```
Returns a question UUID. Poll `GET /detects/{detect_uuid}/intelligence/{question_uuid}` until `status` is `"completed"`.
**Prerequisite:** The detection must have `status: "completed"`. Otherwise returns 422.
---
## Audio Source Tracing
Enable by setting `audio_source_tracing: true` in `POST /detect`.
Result appears in the detection response under `audio_source_tracing`:
```json
{ "label": "elevenlabs", "error_message": null }
```
Known source labels: `resemble_ai`, `elevenlabs`, `real`, and others as the model expands.
**Important:** Source tracing only runs when audio is labeled `"fake"`. If audio is `"real"`, no source tracing result appears.
**Standalone queries:**
- `GET /audio_source_tracings` — list all source tracing reports
- `GET /audio_source_tracings/{uuid}` — get specific report
---
## Watermarking
### `POST /watermark/apply`
```json
{
"url": "https://example.com/image.png",
"strength": 0.3,
"custom_message": "my-organization"
}
```
| Parameter | Type | Required | Description |
|------------------|--------|----------|-------------------------------------------------------------|
| `url` | string | Yes | HTTPS URL to media file |
| `strength`       | number | No       | Watermark strength 0.0–1.0 (image/video only, default 0.2)  |
| `custom_message` | string | No | Custom message (image/video only, default "resembleai") |
- Add `Prefer: wait` header for synchronous response
- Without it, poll `GET /watermark/apply/{uuid}/result`
- Response includes `watermarked_media` URL to download the watermarked file
### `POST /watermark/detect`
```json
{ "url": "https://example.com/suspect-image.png" }
```
**Audio detection result:**
```json
{ "has_watermark": true, "confidence": 0.95 }
```
**Image/Video detection result:**
```json
{ "has_watermark": true }
```
---
## Identity — Speaker Verification (Beta)
> **Beta feature** — requires joining the preview program. Inform the user if they encounter access errors.
### `POST /identity` — Create Identity Profile
```json
{
"audio_url": "https://example.com/known-speaker.wav",
"name": "Jane Doe"
}
```
### `POST /identity/search` — Search Against Known Identities
```json
{
"audio_url": "https://example.com/unknown-speaker.wav",
"top_k": 5
}
```
**Response:**
```json
{
"success": true,
"item": [
{ "uuid": "...", "name": "Jane Doe", "confidence": 0.92, "distance": 0.08 }
]
}
```
Lower `distance` = closer match. Higher `confidence` = stronger match.
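Selecting the best candidate from a search response could be sketched as follows; the helper name and the 0.8 confidence threshold are our illustrative choices, not API behavior:

```python
def best_match(response: dict, min_confidence: float = 0.8):
    """Return the highest-confidence identity from a /identity/search
    response, or None if nothing clears the threshold."""
    candidates = [m for m in response.get("item", [])
                  if m["confidence"] >= min_confidence]
    return max(candidates, key=lambda m: m["confidence"], default=None)

resp = {"success": True, "item": [
    {"uuid": "u1", "name": "Jane Doe", "confidence": 0.92, "distance": 0.08},
    {"uuid": "u2", "name": "John Roe", "confidence": 0.61, "distance": 0.40},
]}
match = best_match(resp)
```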
---
## Text Detection
> **Beta feature** — requires the `detect_beta_user` role or a billing plan that includes the `dfd_text` product.
### `POST /text_detect`
Add a `Prefer: wait` header for a synchronous response; otherwise poll `GET /text_detect/{uuid}` or use a callback.
| Parameter | Type | Required | Description |
|----------------|---------|----------|----------------------------------------------------------|
| `text` | string | Yes | Text to analyze (max 100,000 characters) |
| `thinking` | string | No | Always use `"low"` (default) |
| `threshold`    | float   | No       | Decision threshold 0.0–1.0 (default: 0.5)                |
| `callback_url` | string | No | Webhook URL for async completion notification |
| `privacy_mode` | boolean | No | If true, text content is not stored after analysis |
**Response:**
```json
{
"success": true,
"item": {
"uuid": "abc-123",
"status": "completed",
"prediction": "ai",
"confidence": 0.91,
"text_content": "This is some text to analyze.",
"privacy_mode": false,
"created_at": "...",
"updated_at": "..."
}
}
```
- `prediction`: `"ai"` or `"human"` — the verdict
- `confidence`: 0.0–1.0, higher = more confident
- `status`: `"processing"`, `"completed"`, or `"failed"`
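Turning those fields into a user-facing summary, in line with the guidance that detection is probabilistic rather than proof, might look like this sketch (the helper and its wording are ours):

```python
def summarize_text_detection(item: dict) -> str:
    """Render a human-readable verdict from a text-detect item,
    keeping the probabilistic 'likely' framing."""
    if item["status"] != "completed":
        return f"Analysis {item['status']}"
    verdict = "AI-generated" if item["prediction"] == "ai" else "human-written"
    return f"Likely {verdict} (confidence {item['confidence']:.0%})"

msg = summarize_text_detection(
    {"status": "completed", "prediction": "ai", "confidence": 0.91}
)
```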
### `GET /text_detect/{uuid}` — Poll
Poll until `status` is `"completed"` or `"failed"`.
### `GET /text_detect` — List
Returns paginated text detections for the team.
### Callback
If `callback_url` was provided, a `POST` is sent on completion:
```json
{ "success": true, "item": { ... } }
```
On failure:
```json
{ "success": false, "item": { ... }, "error": "Error message here" }
```