Mirror of https://github.com/github/awesome-copilot.git (synced 2026-05-04 14:15:55 +00:00)
chore: sync Arize skills from arize-skills@597d609bfe5f07fd7d24acfdb408a082911b18fc and phoenix@746247cbb07b0dc7803b87c69dd8c77811c33f59 (#1583)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@@ -1,10 +1,12 @@
 ---
 name: arize-prompt-optimization
-description: "INVOKE THIS SKILL when optimizing, improving, or debugging LLM prompts using production trace data, evaluations, and annotations. Covers extracting prompts from spans, gathering performance signal, and running a data-driven optimization loop using the ax CLI."
+description: "INVOKE THIS SKILL when optimizing, improving, or debugging LLM prompts using production trace data, evaluations, and annotations. Also use when the user wants to make their AI respond better or improve AI output quality. Covers extracting prompts from spans, gathering performance signal, and running a data-driven optimization loop using the ax CLI."
 ---

 # Arize Prompt Optimization Skill

+> **`SPACE`** — All `--space` flags and the `ARIZE_SPACE` env var accept a space **name** (e.g., `my-workspace`) or a base64 space **ID** (e.g., `U3BhY2U6...`). Find yours with `ax spaces list`.
+
 ## Concepts

 ### Where Prompts Live in Trace Data
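The `SPACE` note in this hunk says a space may be given either by name or by base64 ID. As a rough illustration, a wrapper script could tell the two apart with a heuristic like the one below. It is only a sketch: the function name is hypothetical, and the rule that IDs decode to text beginning with `Space:` is an assumption inferred from the example prefix `U3BhY2U6`, not something the skill states.

```python
import base64
import binascii


def looks_like_space_id(value: str) -> bool:
    """Heuristic: True if value is base64 that decodes to bytes starting with 'Space:'.

    Assumption (not confirmed by the skill): space IDs are base64 of
    'Space:<something>', inferred only from the example prefix 'U3BhY2U6'.
    """
    try:
        # Pad to a multiple of 4 so unpadded IDs still decode.
        padded = value + "=" * (-len(value) % 4)
        decoded = base64.b64decode(padded, validate=True)
    except (binascii.Error, ValueError):
        # Characters outside the base64 alphabet (e.g. '-') land here.
        return False
    return decoded.startswith(b"Space:")


print(looks_like_space_id("U3BhY2U6MTIz"))  # ID-shaped value
print(looks_like_space_id("my-workspace"))  # plain name
```

Since the CLI is described as accepting both forms, a check like this would matter only for logging or validation in a script around it.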
@@ -50,34 +52,35 @@ Proceed directly with the task — run the `ax` command you need. Do NOT check v

 If an `ax` command fails, troubleshoot based on the error:
 - `command not found` or version error → see references/ax-setup.md
-- `401 Unauthorized` / missing API key → run `ax profiles show` to inspect the current profile. If the profile is missing or the API key is wrong: check `.env` for `ARIZE_API_KEY` and use it to create/update the profile via references/ax-profiles.md. If `.env` has no key either, ask the user for their Arize API key (https://app.arize.com/admin > API Keys)
-- Space ID unknown → check `.env` for `ARIZE_SPACE_ID`, or run `ax spaces list -o json`, or ask the user
-- Project unclear → check `.env` for `ARIZE_DEFAULT_PROJECT`, or ask, or run `ax projects list -o json --limit 100` and present as selectable options
-- LLM provider call fails (missing OPENAI_API_KEY / ANTHROPIC_API_KEY) → check `.env`, load if present, otherwise ask the user
+- `401 Unauthorized` / missing API key → run `ax profiles show` to inspect the current profile. If the profile is missing or the API key is wrong, follow references/ax-profiles.md to create/update it. If the user doesn't have their key, direct them to https://app.arize.com/admin > API Keys
+- Space unknown → run `ax spaces list` to pick by name, or ask the user
+- Project unclear → ask the user, or run `ax projects list -o json --limit 100` and present as selectable options
+- LLM provider call fails (missing OPENAI_API_KEY / ANTHROPIC_API_KEY) → run `ax ai-integrations list --space SPACE` to check for platform-managed credentials. If none exist, ask the user to provide the key or create an integration via the **arize-ai-provider-integration** skill
+- **Security:** Never read `.env` files or search the filesystem for credentials. Use `ax profiles` for Arize credentials and `ax ai-integrations` for LLM provider keys. If credentials are not available through these channels, ask the user.

 ## Phase 1: Extract the Current Prompt

 ### Find LLM spans containing prompts

 ```bash
-# List LLM spans (where prompts live)
-ax spans list PROJECT_ID --filter "attributes.openinference.span.kind = 'LLM'" --limit 10
+# Sample LLM spans (where prompts live)
+ax spans export PROJECT --filter "attributes.openinference.span.kind = 'LLM'" -l 10 --stdout

 # Filter by model
-ax spans list PROJECT_ID --filter "attributes.llm.model_name = 'gpt-4o'" --limit 10
+ax spans export PROJECT --filter "attributes.llm.model_name = 'gpt-4o'" -l 10 --stdout

 # Filter by span name (e.g., a specific LLM call)
-ax spans list PROJECT_ID --filter "name = 'ChatCompletion'" --limit 10
+ax spans export PROJECT --filter "name = 'ChatCompletion'" -l 10 --stdout
 ```

 ### Export a trace to inspect prompt structure

 ```bash
 # Export all spans in a trace
-ax spans export --trace-id TRACE_ID --project PROJECT_ID
+ax spans export PROJECT --trace-id TRACE_ID

 # Export a single span
-ax spans export --span-id SPAN_ID --project PROJECT_ID
+ax spans export PROJECT --span-id SPAN_ID
 ```

 ### Extract prompts from exported JSON
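The updated commands in this hunk all follow the shape `ax spans export PROJECT --filter ... -l N --stdout`. A script that drives the CLI could assemble that argument list as sketched below. The helper name and the `my-project` value are hypothetical; only the flags shown in the diff are used, and nothing here is claimed about the CLI beyond them.

```python
import shlex


def build_spans_export_cmd(project: str, filter_expr: str, limit: int = 10) -> list[str]:
    """Build argv for `ax spans export PROJECT --filter ... -l N --stdout`."""
    return [
        "ax", "spans", "export", project,
        "--filter", filter_expr,
        "-l", str(limit),
        "--stdout",
    ]


cmd = build_spans_export_cmd("my-project", "attributes.openinference.span.kind = 'LLM'")
# shlex.join quotes the filter so the printed line can be pasted into a shell.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, capture_output=True, text=True)` avoids shell quoting problems with the single quotes inside the filter expression.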
@@ -118,33 +121,33 @@ If the span has `attributes.llm.prompt_template.template`, the prompt uses varia

 ```bash
 # Find error spans -- these indicate prompt failures
-ax spans list PROJECT_ID \
+ax spans export PROJECT \
 --filter "status_code = 'ERROR' AND attributes.openinference.span.kind = 'LLM'" \
---limit 20
+-l 20 --stdout

 # Find spans with low eval scores
-ax spans list PROJECT_ID \
+ax spans export PROJECT \
 --filter "annotation.correctness.label = 'incorrect'" \
---limit 20
+-l 20 --stdout

 # Find spans with high latency (may indicate overly complex prompts)
-ax spans list PROJECT_ID \
+ax spans export PROJECT \
 --filter "attributes.openinference.span.kind = 'LLM' AND latency_ms > 10000" \
---limit 20
+-l 20 --stdout

 # Export error traces for detailed inspection
-ax spans export --trace-id TRACE_ID --project PROJECT_ID
+ax spans export PROJECT --trace-id TRACE_ID
 ```

 ### From datasets and experiments

 ```bash
 # Export a dataset (ground truth examples)
-ax datasets export DATASET_ID
+ax datasets export DATASET_NAME --space SPACE
 # -> dataset_*/examples.json

 # Export experiment results (what the LLM produced)
-ax experiments export EXPERIMENT_ID
+ax experiments export EXPERIMENT_NAME --dataset DATASET_NAME --space SPACE
 # -> experiment_*/runs.json
 ```

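Several filters in this hunk combine clauses with `AND`. A small helper along these lines could keep such expressions consistent when they are generated from code. It is a sketch only: the helper name is hypothetical, and it assumes nothing about the filter language beyond the `AND` joins visible in the examples.

```python
def and_filter(*clauses: str) -> str:
    """Join non-empty filter clauses with AND, as in the examples in this hunk."""
    parts = [c.strip() for c in clauses if c and c.strip()]
    return " AND ".join(parts)


LLM_SPANS = "attributes.openinference.span.kind = 'LLM'"

error_llm = and_filter("status_code = 'ERROR'", LLM_SPANS)
slow_llm = and_filter(LLM_SPANS, "latency_ms > 10000")
print(error_llm)
print(slow_llm)
```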
@@ -307,7 +310,7 @@ After the LLM returns the revised messages array:
 ```
 1. Extract prompt -> Phase 1 (once)
 2. Run experiment -> ax experiments create ...
-3. Export results -> ax experiments export EXPERIMENT_ID
+3. Export results -> ax experiments export EXPERIMENT_NAME --dataset DATASET_NAME --space SPACE
 4. Analyze failures -> jq to find low scores
 5. Run meta-prompt -> Phase 3 with new failure data
 6. Apply revised prompt
@@ -372,11 +375,11 @@ When optimizing prompts that use template variables:

 1. Find failing traces:
 ```bash
-ax traces list PROJECT_ID --filter "status_code = 'ERROR'" --limit 5
+ax traces list PROJECT --filter "status_code = 'ERROR'" --limit 5
 ```
 2. Export the trace:
 ```bash
-ax spans export --trace-id TRACE_ID --project PROJECT_ID
+ax spans export PROJECT --trace-id TRACE_ID
 ```
 3. Extract the prompt from the LLM span:
 ```bash
@@ -395,13 +398,13 @@ When optimizing prompts that use template variables:

 1. Find the dataset and experiment:
 ```bash
-ax datasets list
-ax experiments list --dataset-id DATASET_ID
+ax datasets list --space SPACE
+ax experiments list --dataset DATASET_NAME --space SPACE
 ```
 2. Export both:
 ```bash
-ax datasets export DATASET_ID
-ax experiments export EXPERIMENT_ID
+ax datasets export DATASET_NAME --space SPACE
+ax experiments export EXPERIMENT_NAME --dataset DATASET_NAME --space SPACE
 ```
 3. Prepare the joined data for the meta-prompt
 4. Run the optimization meta-prompt
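Step 3 of this workflow ("Prepare the joined data for the meta-prompt") is not spelled out in the visible hunk. One possible shape for it is sketched below. The key names `id` and `example_id` are hypothetical, as is the sample data; the real `examples.json` and `runs.json` exports should be inspected first and the keys adjusted to match.

```python
import json
from pathlib import Path


def load_json(path: Path) -> list[dict]:
    """Load a JSON array from an exported file such as dataset_*/examples.json."""
    return json.loads(path.read_text())


def join_runs_to_examples(examples: list[dict], runs: list[dict]) -> list[dict]:
    """Pair each experiment run with its dataset example.

    Assumption: examples carry an 'id' and runs carry an 'example_id'.
    Both key names are hypothetical and may differ in the real export.
    """
    by_id = {ex["id"]: ex for ex in examples}
    joined = []
    for run in runs:
        example = by_id.get(run.get("example_id"))
        if example is not None:
            joined.append({"example": example, "run": run})
    return joined


# Tiny in-memory stand-ins for the exported files (made-up data).
examples = [{"id": "e1", "question": "2+2?", "expected": "4"}]
runs = [{"example_id": "e1", "output": "5"}, {"example_id": "missing", "output": "?"}]
joined = join_runs_to_examples(examples, runs)
print(json.dumps(joined, indent=2))
```

Runs whose example cannot be found are dropped here; logging them instead may be preferable when the exports are expected to line up exactly.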
@@ -411,9 +414,9 @@ When optimizing prompts that use template variables:

 1. Export spans where the output format is wrong:
 ```bash
-ax spans list PROJECT_ID \
+ax spans export PROJECT \
 --filter "attributes.openinference.span.kind = 'LLM' AND annotation.format.label = 'incorrect'" \
---limit 10 -o json > bad_format.json
+-l 10 --stdout > bad_format.json
 ```
 2. Look at what the LLM is producing vs what was expected
 3. Add explicit format instructions to the prompt (JSON schema, examples, delimiters)
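For step 2 of this workflow, when the expected format is JSON, a quick pass over the exported spans can separate outputs that parse from those that do not. The sketch below assumes the export is a JSON array of span objects with the output text at `attributes.output.value`, the same nested path the skill's `jq` example reads. The function names and sample spans are hypothetical.

```python
import json


def output_value(span: dict) -> str:
    """Read attributes.output.value; return '' when it is missing or not a string."""
    attributes = span.get("attributes") or {}
    output = attributes.get("output") or {}
    value = output.get("value") if isinstance(output, dict) else None
    return value if isinstance(value, str) else ""


def split_by_json_validity(spans: list[dict]) -> tuple[list[dict], list[dict]]:
    """Return (valid, invalid) spans depending on whether the output parses as JSON."""
    valid, invalid = [], []
    for span in spans:
        try:
            json.loads(output_value(span))
            valid.append(span)
        except json.JSONDecodeError:
            invalid.append(span)
    return valid, invalid


# Made-up spans standing in for the contents of bad_format.json.
spans = [
    {"name": "ChatCompletion", "attributes": {"output": {"value": '{"answer": 4}'}}},
    {"name": "ChatCompletion", "attributes": {"output": {"value": "Sure! The answer is 4."}}},
]
valid, invalid = split_by_json_validity(spans)
print(len(valid), len(invalid))
```

Reading a few entries from `invalid` usually shows which instruction is missing, for example a preamble sentence that an explicit "respond with JSON only" line would remove.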
@@ -423,13 +426,13 @@ When optimizing prompts that use template variables:

 1. Find traces where the model hallucinated:
 ```bash
-ax spans list PROJECT_ID \
+ax spans export PROJECT \
 --filter "annotation.faithfulness.label = 'unfaithful'" \
---limit 20
+-l 20 --stdout
 ```
 2. Export and inspect the retriever + LLM spans together:
 ```bash
-ax spans export --trace-id TRACE_ID --project PROJECT_ID
+ax spans export PROJECT --trace-id TRACE_ID
 jq '[.[] | {kind: .attributes.openinference.span.kind, name, input: .attributes.input.value, output: .attributes.output.value}]' trace_*/spans.json
 ```
 3. Check if the retrieved context actually contained the answer
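Step 3 of this workflow can be roughed out in Python by reproducing the `jq` projection and then testing whether any retriever output mentions the expected answer. This is a sketch under assumptions: the span kind value `RETRIEVER` follows OpenInference naming but is not shown in this diff, the substring test is deliberately naive, and the function names and sample trace are made up.

```python
def dig(obj, *keys: str):
    """Follow nested keys, returning None if any level is missing."""
    for key in keys:
        if not isinstance(obj, dict):
            return None
        obj = obj.get(key)
    return obj


def summarize(spans: list[dict]) -> list[dict]:
    """Python equivalent of the jq projection: kind, name, input, output."""
    return [
        {
            "kind": dig(s, "attributes", "openinference", "span", "kind"),
            "name": s.get("name"),
            "input": dig(s, "attributes", "input", "value"),
            "output": dig(s, "attributes", "output", "value"),
        }
        for s in spans
    ]


def answer_in_context(summary: list[dict], answer: str) -> bool:
    """Naive check: does any RETRIEVER span output contain the answer text?"""
    needle = answer.lower()
    return any(
        row["kind"] == "RETRIEVER" and needle in (row["output"] or "").lower()
        for row in summary
    )


# Made-up trace: the retriever found the right fact, the LLM ignored it.
trace = [
    {"name": "retrieve",
     "attributes": {"openinference": {"span": {"kind": "RETRIEVER"}},
                    "output": {"value": "Paris is the capital of France."}}},
    {"name": "ChatCompletion",
     "attributes": {"openinference": {"span": {"kind": "LLM"}},
                    "output": {"value": "The capital is Lyon."}}},
]
summary = summarize(trace)
print(answer_in_context(summary, "Paris"))
```

If the answer is present in the retrieved context but missing from the LLM output, the prompt is the likelier culprit; if it is absent from the context, the retriever needs attention before the prompt does.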
@@ -54,7 +54,7 @@ ax profiles create work --api-key $ARIZE_API_KEY --region us-east-1b

 To use a named profile with any `ax` command, add `-p NAME`:
 ```bash
-ax spans export PROJECT_ID -p work
+ax spans export PROJECT -p work
 ```

 ## 4. Getting the API key
@@ -81,19 +81,19 @@ ax profiles show

 Confirm the API key and region are correct, then retry the original command.

-## Space ID
+## Space

-There is no profile flag for space ID. Save it as an environment variable:
+There is no profile flag for space. Save it as an environment variable — accepts a space **name** (e.g., `my-workspace`) or a base64 space **ID** (e.g., `U3BhY2U6...`). Find yours with `ax spaces list -o json`.

 **macOS/Linux** — add to `~/.zshrc` or `~/.bashrc`:
 ```bash
-export ARIZE_SPACE_ID="U3BhY2U6..."
+export ARIZE_SPACE="my-workspace" # name or base64 ID
 ```
 Then `source ~/.zshrc` (or restart terminal).

 **Windows (PowerShell):**
 ```powershell
-[System.Environment]::SetEnvironmentVariable('ARIZE_SPACE_ID', 'U3BhY2U6...', 'User')
+[System.Environment]::SetEnvironmentVariable('ARIZE_SPACE', 'my-workspace', 'User')
 ```
 Restart terminal for it to take effect.

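A script that wraps `ax` may want to read the renamed `ARIZE_SPACE` variable itself, for instance to fail early with a clear message. The sketch below shows one way to do that. The function name is hypothetical, and the precedence (explicit value first, then the environment) is an assumption made for this example, not documented CLI behavior.

```python
import os
from typing import Mapping, Optional


def resolve_space(cli_value: Optional[str] = None,
                  env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Pick the space from an explicit value first, then from ARIZE_SPACE.

    The precedence is an assumption for this sketch, not documented behavior.
    """
    env = os.environ if env is None else env
    value = cli_value or env.get("ARIZE_SPACE")
    if value is None or not value.strip():
        return None
    return value.strip()


print(resolve_space("team-a", {"ARIZE_SPACE": "my-workspace"}))
print(resolve_space(None, {"ARIZE_SPACE": "my-workspace"}))
print(resolve_space(None, {}))
```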
@@ -103,8 +103,8 @@ At the **end of the session**, if the user manually provided any credentials dur

 **Skip this entirely if:**
 - The API key was already loaded from an existing profile or `ARIZE_API_KEY` env var
-- The space ID was already set via `ARIZE_SPACE_ID` env var
-- The user only used base64 project IDs (no space ID was needed)
+- The space was already set via `ARIZE_SPACE` env var
+- The user only used base64 project IDs (no space was needed)

 **How to offer:** Use **AskQuestion**: *"Would you like to save your Arize credentials so you don't have to enter them next time?"* with options `"Yes, save them"` / `"No thanks"`.

@@ -112,4 +112,4 @@ At the **end of the session**, if the user manually provided any credentials dur

 1. **API key** — Run `ax profiles show` to check the current state. Then run `ax profiles create --api-key $ARIZE_API_KEY` or `ax profiles update --api-key $ARIZE_API_KEY` (the key must already be exported as an env var — never pass a raw key value).

-2. **Space ID** — See the Space ID section above to persist it as an environment variable.
+2. **Space** — See the Space section above to persist it as an environment variable.
@@ -4,7 +4,7 @@ Consult this only when an `ax` command fails. Do NOT run these checks proactivel

 ## Check version first

-If `ax` is installed (not `command not found`), always run `ax --version` before investigating further. The version must be `0.8.0` or higher — many errors are caused by an outdated install. If the version is too old, see **Version too old** below.
+If `ax` is installed (not `command not found`), always run `ax --version` before investigating further. The version must be `0.14.0` or higher — many errors are caused by an outdated install. If the version is too old, see **Version too old** below.

 ## `ax: command not found`

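The raised minimum of `0.14.0` is a case where comparing version strings directly gives the wrong answer, because `"0.8.0"` sorts after `"0.14.0"` as text. A numeric comparison avoids that, as in the sketch below. It assumes the output of `ax --version` contains a dotted version number somewhere in the text; the exact output format is not shown in this diff, and the function names are hypothetical.

```python
import re

MIN_VERSION = (0, 14, 0)


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first dotted version number from text such as `ax --version` output."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    parts = tuple(int(part) for part in match.group(1).split("."))
    # Pad short versions such as '0.14' to three components before comparing.
    return parts + (0,) * max(0, 3 - len(parts))


def is_supported(version_output: str) -> bool:
    """True if the reported version is at least MIN_VERSION, compared numerically."""
    return parse_version(version_output) >= MIN_VERSION


print(is_supported("ax 0.14.0"))
print(is_supported("ax 0.8.0"))
print("0.8.0" >= "0.14.0")  # string comparison gets this wrong
```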
@@ -19,7 +19,7 @@ If `ax` is installed (not `command not found`), always run `ax --version` before
 3. Install: `pip install arize-ax-cli`
 4. Add to PATH: `$env:PATH = "$env:APPDATA\Python\Scripts;$env:PATH"`

-## Version too old (below 0.8.0)
+## Version too old (below 0.14.0)

 Upgrade: `uv tool install --force --reinstall arize-ax-cli`, `pipx upgrade arize-ax-cli`, or `pip install --upgrade arize-ax-cli`
