Add error recovery hooks and PyInstaller frozen build recipes (#1388)

* Add error recovery hooks and PyInstaller frozen build recipes

* fixed datas to data
Tilak Patel
2026-04-27 22:08:25 -04:00
committed by GitHub
parent 0c31682e47
commit 5f69546969
6 changed files with 558 additions and 7 deletions


@@ -5,10 +5,12 @@ This folder hosts short, practical recipes for using the GitHub Copilot SDK with
## Recipes
- [Error Handling](error-handling.md): Handle errors gracefully including connection failures, timeouts, and cleanup.
- [Error Recovery Hooks](error-recovery-hooks.md): Classify tool failures and nudge the LLM to keep investigating instead of giving up.
- [Multiple Sessions](multiple-sessions.md): Manage multiple independent conversations simultaneously.
- [Managing Local Files](managing-local-files.md): Organize files by metadata using AI-powered grouping strategies.
- [PR Visualization](pr-visualization.md): Generate interactive PR age charts using GitHub MCP Server.
- [Persisting Sessions](persisting-sessions.md): Save and resume sessions across restarts.
- [PyInstaller Frozen Build](pyinstaller-frozen-build.md): Package a Copilot SDK application into a standalone executable with PyInstaller.
## Contributing


@@ -0,0 +1,116 @@
# Error Recovery Hooks
Keep the LLM investigating when tools fail instead of giving up with a partial result.
## Problem
When a shell command returns an error or a file operation hits a permission denial, the LLM tends to stop and apologize rather than trying a different approach. This produces incomplete results in agentic workflows where resilience matters.
## Solution
Use the SDK's hooks system (`on_post_tool_use`, `on_error_occurred`) to classify tool results by category and append continuation instructions that nudge the LLM to keep going.
```python
from enum import Enum


class ToolResultCategory(str, Enum):
    SHELL_ERROR = "shell_error"
    PERMISSION_DENIED = "permission_denied"
    NORMAL = "normal"


class SDKErrorCategory(str, Enum):
    CLIENT_ERROR = "client_error"  # 4xx — not retryable
    TRANSIENT = "transient"  # 5xx / timeout
    NON_RECOVERABLE = "non_recoverable"


# Phrases that signal permission issues in tool output
PERMISSION_DENIAL_PHRASES = [
    "permission denied",
    "access denied",
    "not permitted",
    "operation not allowed",
    "eacces",
    "eperm",
    "403 forbidden",
]

SHELL_ERROR_PHRASES = [
    "command not found",
    "no such file or directory",
    "exit code",
    "errno",
    "traceback",
]

CONTINUATION_MESSAGES = {
    ToolResultCategory.SHELL_ERROR: (
        "\n\n[SYSTEM NOTE: This command encountered an error. "
        "This does NOT mean you should stop. Retry with different "
        "arguments, try a different tool, or move on.]"
    ),
    ToolResultCategory.PERMISSION_DENIED: (
        "\n\n[SYSTEM NOTE: Permission was denied for this specific "
        "action. Continue using alternative approaches.]"
    ),
}


def classify_tool_result(tool_name: str, result_text: str) -> ToolResultCategory:
    result_lower = result_text.lower()
    if any(phrase in result_lower for phrase in PERMISSION_DENIAL_PHRASES):
        return ToolResultCategory.PERMISSION_DENIED
    if any(phrase in result_lower for phrase in SHELL_ERROR_PHRASES):
        return ToolResultCategory.SHELL_ERROR
    return ToolResultCategory.NORMAL


def classify_sdk_error(error_msg: str, recoverable: bool) -> SDKErrorCategory:
    error_lower = error_msg.lower()
    if any(kw in error_lower for kw in ("timeout", "503", "502", "429", "retry")):
        return SDKErrorCategory.TRANSIENT
    if any(kw in error_lower for kw in ("401", "403", "404", "400", "422")):
        return SDKErrorCategory.CLIENT_ERROR
    return SDKErrorCategory.TRANSIENT if recoverable else SDKErrorCategory.NON_RECOVERABLE
```
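One caveat with substring matching on status codes: a check such as `"400" in error_lower` also fires on text like `waited 4000 ms`. The sketch below shows a stricter variant, assuming error messages embed bare HTTP status codes. `extract_status_code`, `is_transient_status`, and `TRANSIENT_STATUS_CODES` are hypothetical names for illustration, not part of the SDK.

```python
import re
from typing import Optional

# Match a standalone three-digit code in the 100-599 range.
# The lookarounds reject digits on either side, so "4000" is ignored.
_STATUS_RE = re.compile(r"(?<!\d)([1-5]\d{2})(?!\d)")

# Assumption: these are the codes worth retrying; adjust for your backend.
TRANSIENT_STATUS_CODES = {429, 502, 503, 504}


def extract_status_code(error_msg: str) -> Optional[int]:
    """Return the first standalone three-digit status code, if any."""
    match = _STATUS_RE.search(error_msg)
    return int(match.group(1)) if match else None


def is_transient_status(error_msg: str) -> bool:
    """True when the message carries a code from the retryable set."""
    return extract_status_code(error_msg) in TRANSIENT_STATUS_CODES
```

This is still a heuristic: any standalone three-digit number in range (a port such as 443, for instance) will be read as a status code, so it is best combined with the `recoverable` flag rather than used alone.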
## Hook Registration
Wire the classifiers into the SDK's hook system:
```python
def on_post_tool_use(input_data, env):
    """Append continuation hints to failed tool results."""
    tool_name = input_data.get("toolName", "")
    result = str(input_data.get("toolResult", ""))
    category = classify_tool_result(tool_name, result)
    if category in CONTINUATION_MESSAGES:
        return {"toolResult": result + CONTINUATION_MESSAGES[category]}
    return None


def on_error_occurred(input_data, env):
    """Retry transient errors, skip non-recoverable ones gracefully."""
    error_msg = input_data.get("error", "")
    recoverable = input_data.get("recoverable", False)
    category = classify_sdk_error(error_msg, recoverable)
    if category == SDKErrorCategory.TRANSIENT:
        return {"errorHandling": "retry", "retryCount": 2}
    return {
        "errorHandling": "skip",
        "userNotification": "Error occurred — continuing investigation.",
    }
```
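Because the hooks are plain functions over dictionaries, they can be exercised without a live session. The sketch below assumes the `toolName` / `toolResult` payload keys used above. The hook is a simplified stand-in that checks a single hard-coded phrase, and `env` is passed as `None` purely for illustration.

```python
NOTE = (
    "\n\n[SYSTEM NOTE: Permission was denied for this specific "
    "action. Continue using alternative approaches.]"
)


def on_post_tool_use(input_data, env):
    """Simplified stand-in for the recipe's hook: flags one phrase only."""
    result = str(input_data.get("toolResult", ""))
    if "permission denied" in result.lower():
        return {"toolResult": result + NOTE}
    return None  # leave the tool result untouched


# Fake payloads shaped like the ones the hook reads.
failed = {"toolName": "bash", "toolResult": "cat: /etc/shadow: Permission denied"}
ok = {"toolName": "read_file", "toolResult": '{"lines": []}'}

patched = on_post_tool_use(failed, None)
untouched = on_post_tool_use(ok, None)

print(patched["toolResult"])
print(untouched)
```

Running the real hooks against a handful of captured tool outputs in this way is a cheap check that the phrase lists match what your tools actually emit.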
## Tips
- **Tune the phrase lists** for your domain — add patterns from your actual tool output.
- **Log classified categories** so you can track how often each failure mode fires and whether the LLM actually recovers.
- **Cap continuation depth** — if the same tool fails 3+ times in a row, let the LLM give up rather than looping.
- The `SYSTEM NOTE` framing tends to work well because the LLM is more likely to treat it as an authoritative instruction than as user commentary.
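
The depth cap from the tips above can be sketched as a small per-tool counter of consecutive failures. This is a minimal illustration, assuming the hook can tell failure from success. `FailureTracker`, `MAX_CONSECUTIVE_FAILURES`, and `should_nudge` are hypothetical names, not SDK API.

```python
from collections import defaultdict

# Assumption: three consecutive failures is the point to stop nudging.
MAX_CONSECUTIVE_FAILURES = 3


class FailureTracker:
    """Counts consecutive failures per tool and resets on success."""

    def __init__(self, limit: int = MAX_CONSECUTIVE_FAILURES):
        self.limit = limit
        self._streaks = defaultdict(int)

    def record(self, tool_name: str, failed: bool) -> None:
        """Extend the streak on failure, clear it on success."""
        if failed:
            self._streaks[tool_name] += 1
        else:
            self._streaks[tool_name] = 0

    def should_nudge(self, tool_name: str) -> bool:
        """True while the failure streak is still below the limit."""
        return self._streaks[tool_name] < self.limit


tracker = FailureTracker()
for _ in range(3):
    tracker.record("bash", failed=True)
```

In `on_post_tool_use`, one option is to call `record` right after classifying the result and append the continuation message only while `should_nudge` returns true.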
## Runnable Example
See [`recipe/error_recovery_hooks.py`](recipe/error_recovery_hooks.py) for a complete working example.


@@ -0,0 +1,96 @@
# Deploying Copilot SDK Apps with PyInstaller
Package a Copilot SDK application into a standalone executable using PyInstaller (or Nuitka).
## Problem
When you freeze a Python SDK application with PyInstaller, three things break:
1. **CLI binary resolution** — The SDK locates its CLI via `__file__`, which points inside the PYZ archive in a frozen build.
2. **SSL certificates** — On macOS, the frozen app can't find system CA certs, so the CLI subprocess fails TLS handshakes.
3. **Execute permissions** — The bundled CLI binary may lose its `+x` bit when extracted from the archive.
## Solution
Resolve the CLI path by searching both the SDK's normal location and PyInstaller's `_MEIPASS` temp directory. Fix SSL by injecting `certifi`'s CA bundle into the environment. Restore execute permissions on Unix before launching.
```python
"""Frozen-build compatibility for Copilot SDK applications."""
import os
import sys
from pathlib import Path

from copilot import CopilotClient, SubprocessConfig


def resolve_cli_path() -> str | None:
    """Find the Copilot CLI binary in a frozen build."""
    candidates = []
    binary = "copilot.exe" if sys.platform == "win32" else "copilot"

    # 1. SDK's normal resolution
    try:
        import copilot as pkg
        candidates.append(Path(pkg.__file__).parent / "bin" / binary)
    except Exception:
        pass

    # 2. PyInstaller _MEIPASS fallback
    if getattr(sys, "frozen", False) and hasattr(sys, "_MEIPASS"):
        meipass = Path(sys._MEIPASS)
        candidates.append(meipass / "copilot" / "bin" / binary)
        candidates.append(meipass.parent / "copilot" / "bin" / binary)

    for c in candidates:
        if c.exists():
            if sys.platform != "win32" and not os.access(str(c), os.X_OK):
                os.chmod(str(c), c.stat().st_mode | 0o755)
            return str(c)
    return None


def ensure_ssl_certs():
    """Set SSL env vars for the CLI subprocess (macOS frozen builds)."""
    if os.environ.get("SSL_CERT_FILE"):
        return
    try:
        import certifi
        ca = certifi.where()
        if Path(ca).is_file():
            os.environ["SSL_CERT_FILE"] = ca
            os.environ["REQUESTS_CA_BUNDLE"] = ca
            os.environ.setdefault("NODE_EXTRA_CA_CERTS", ca)
    except ImportError:
        pass  # CLI will use platform defaults


async def create_frozen_client():
    """Create a CopilotClient that works in both normal and frozen builds."""
    ensure_ssl_certs()
    kwargs = {"log_level": "info", "use_stdio": True}
    if getattr(sys, "frozen", False):
        cli = resolve_cli_path()
        if cli:
            kwargs["cli_path"] = cli
    client = CopilotClient(SubprocessConfig(**kwargs), auto_start=True)
    await client.start()
    return client
```
## PyInstaller Spec
Include the SDK's binary directory in your `.spec` file so PyInstaller bundles it:
```python
from PyInstaller.utils.hooks import collect_data_files

# `data` must already be a list in the spec; pass it to Analysis(..., datas=data).
data += collect_data_files('copilot', include_py_files=False)
```
## Tips
- **Test the frozen build on a clean machine** — `_MEIPASS` extraction behaves differently than your dev environment.
- **Pin `certifi`** in your requirements so the CA bundle is always available.
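- **Nuitka** uses a different extraction model, and the SDK's package files are included with `--include-package-data=copilot`. The same `resolve_cli_path` logic should still apply, since its first candidate does not depend on `_MEIPASS`.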
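
When testing on a clean machine, a short startup report makes it easier to see what the frozen process actually resolved. The sketch below uses only the standard library. `frozen_build_report` is a hypothetical helper, and the keys it returns are illustrative.

```python
import os
import sys


def frozen_build_report() -> dict:
    """Collect the runtime facts that usually differ in a frozen build."""
    return {
        # PyInstaller sets sys.frozen inside the bundled application.
        "frozen": bool(getattr(sys, "frozen", False)),
        # Bundle directory; the attribute is absent in a normal interpreter.
        "meipass": getattr(sys, "_MEIPASS", None),
        "executable": sys.executable,
        # CA bundle the CLI subprocess will inherit, if one was set.
        "ssl_cert_file": os.environ.get("SSL_CERT_FILE"),
    }


if __name__ == "__main__":
    for key, value in frozen_build_report().items():
        print(f"{key}: {value}")
```

Printing this once at startup, before creating the client, gives a quick way to compare the dev environment against the packaged executable.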
## Runnable Example
See [`recipe/pyinstaller_frozen_build.py`](recipe/pyinstaller_frozen_build.py) for a complete working example.


@@ -23,13 +23,15 @@ python <filename>.py
### Available Recipes
| Recipe | Command | Description |
| -------------------- | -------------------------------- | ------------------------------------------ |
| Error Handling | `python error_handling.py` | Demonstrates error handling patterns |
| Multiple Sessions | `python multiple_sessions.py` | Manages multiple independent conversations |
| Managing Local Files | `python managing_local_files.py` | Organizes files using AI grouping |
| PR Visualization | `python pr_visualization.py` | Generates PR age charts |
| Persisting Sessions | `python persisting_sessions.py` | Save and resume sessions across restarts |
| Recipe | Command | Description |
| -------------------- | ------------------------------------ | -------------------------------------------------- |
| Error Handling | `python error_handling.py` | Demonstrates error handling patterns |
| Error Recovery Hooks | `python error_recovery_hooks.py` | Classifies tool failures and retries automatically |
| Multiple Sessions | `python multiple_sessions.py` | Manages multiple independent conversations |
| Managing Local Files | `python managing_local_files.py` | Organizes files using AI grouping |
| PR Visualization | `python pr_visualization.py` | Generates PR age charts |
| Persisting Sessions | `python persisting_sessions.py` | Save and resume sessions across restarts |
| PyInstaller Build | `python pyinstaller_frozen_build.py` | Packages SDK apps into frozen executables |
### Examples with Arguments


@@ -0,0 +1,207 @@
"""
Error Recovery Hooks
====================
Demonstrates how to classify tool results and SDK errors, then use hooks
to keep the LLM investigating instead of giving up on failure.
Run:
python error_recovery_hooks.py
Requirements:
pip install copilot-sdk
"""
import asyncio
from enum import Enum
from copilot import CopilotClient, SubprocessConfig
# ---------------------------------------------------------------------------
# Classification enums
# ---------------------------------------------------------------------------
class ToolResultCategory(str, Enum):
    SHELL_ERROR = "shell_error"
    PERMISSION_DENIED = "permission_denied"
    NORMAL = "normal"


class SDKErrorCategory(str, Enum):
    CLIENT_ERROR = "client_error"  # 4xx — not retryable
    TRANSIENT = "transient"  # 5xx / timeout
    NON_RECOVERABLE = "non_recoverable"
# ---------------------------------------------------------------------------
# Detection phrases — extend these for your domain
# ---------------------------------------------------------------------------
PERMISSION_DENIAL_PHRASES = [
    "permission denied",
    "access denied",
    "not permitted",
    "operation not allowed",
    "eacces",
    "eperm",
    "403 forbidden",
]

SHELL_ERROR_PHRASES = [
    "command not found",
    "no such file or directory",
    "exit code",
    "errno",
    "traceback",
]
# ---------------------------------------------------------------------------
# Continuation messages appended to failed tool results
# ---------------------------------------------------------------------------
CONTINUATION_MESSAGES = {
    ToolResultCategory.SHELL_ERROR: (
        "\n\n[SYSTEM NOTE: This command encountered an error. "
        "This does NOT mean you should stop. Retry with different "
        "arguments, try a different tool, or move on.]"
    ),
    ToolResultCategory.PERMISSION_DENIED: (
        "\n\n[SYSTEM NOTE: Permission was denied for this specific "
        "action. Continue using alternative approaches.]"
    ),
}
# ---------------------------------------------------------------------------
# Classifiers
# ---------------------------------------------------------------------------
def classify_tool_result(tool_name: str, result_text: str) -> ToolResultCategory:
    """Classify a tool's output into a failure category."""
    result_lower = result_text.lower()
    if any(phrase in result_lower for phrase in PERMISSION_DENIAL_PHRASES):
        return ToolResultCategory.PERMISSION_DENIED
    if any(phrase in result_lower for phrase in SHELL_ERROR_PHRASES):
        return ToolResultCategory.SHELL_ERROR
    return ToolResultCategory.NORMAL


def classify_sdk_error(error_msg: str, recoverable: bool) -> SDKErrorCategory:
    """Classify an SDK-level error for retry/skip decisions."""
    error_lower = error_msg.lower()
    if any(kw in error_lower for kw in ("timeout", "503", "502", "429", "retry")):
        return SDKErrorCategory.TRANSIENT
    if any(kw in error_lower for kw in ("401", "403", "404", "400", "422")):
        return SDKErrorCategory.CLIENT_ERROR
    return SDKErrorCategory.TRANSIENT if recoverable else SDKErrorCategory.NON_RECOVERABLE
# ---------------------------------------------------------------------------
# SDK Hooks
# ---------------------------------------------------------------------------
def on_post_tool_use(input_data, env):
    """Append continuation hints to failed tool results."""
    tool_name = input_data.get("toolName", "")
    result = str(input_data.get("toolResult", ""))
    category = classify_tool_result(tool_name, result)
    print(f"  [hook] {tool_name} -> {category.value}")
    if category in CONTINUATION_MESSAGES:
        return {"toolResult": result + CONTINUATION_MESSAGES[category]}
    return None


def on_error_occurred(input_data, env):
    """Retry transient errors, skip non-recoverable ones gracefully."""
    error_msg = input_data.get("error", "")
    recoverable = input_data.get("recoverable", False)
    category = classify_sdk_error(error_msg, recoverable)
    print(f"  [hook] SDK error -> {category.value}: {error_msg[:80]}")
    if category == SDKErrorCategory.TRANSIENT:
        return {"errorHandling": "retry", "retryCount": 2}
    return {
        "errorHandling": "skip",
        "userNotification": "Error occurred — continuing investigation.",
    }
# ---------------------------------------------------------------------------
# Demo: standalone classification test
# ---------------------------------------------------------------------------
def demo_classification():
    """Show classification working on sample outputs."""
    samples = [
        ("bash", "ls: cannot access '/root': Permission denied"),
        ("bash", "grep: command not found"),
        ("read_file", '{"lines": ["INFO startup complete"]}'),
        ("bash", "cat: /etc/shadow: Operation not permitted"),
    ]
    print("Classification demo:")
    print("-" * 60)
    for tool, output in samples:
        cat = classify_tool_result(tool, output)
        print(f"  {tool:15s} | {cat.value:20s} | {output[:50]}")
    print()

    error_samples = [
        ("Connection timeout after 30s", True),
        ("HTTP 503 Service Unavailable", True),
        ("HTTP 404 Not Found", False),
        ("Unexpected server error", False),
    ]
    print("SDK error classification demo:")
    print("-" * 60)
    for msg, recoverable in error_samples:
        cat = classify_sdk_error(msg, recoverable)
        print(f"  recoverable={recoverable!s:5s} | {cat.value:20s} | {msg}")
# ---------------------------------------------------------------------------
# Demo: wired into a real session
# ---------------------------------------------------------------------------
async def demo_with_session():
    """Create a session with hooks registered (requires Copilot auth)."""
    client = CopilotClient(
        SubprocessConfig(log_level="info", use_stdio=True),
        auto_start=True,
    )
    await client.start()
    try:
        session = await client.create_session(
            hooks={
                "on_post_tool_use": on_post_tool_use,
                "on_error_occurred": on_error_occurred,
            }
        )
        # Send a prompt that's likely to trigger tool use
        response = await session.send_message(
            "List the files in /tmp and then try to read /etc/shadow. "
            "If you can't read it, explain why and move on."
        )
        print(f"\nAgent response:\n{response}")
    finally:
        await client.stop()


if __name__ == "__main__":
    # Always run the standalone demo
    demo_classification()
    # Uncomment to test with a live session:
    # asyncio.run(demo_with_session())


@@ -0,0 +1,128 @@
"""
PyInstaller / Frozen Build Compatibility
=========================================
Demonstrates how to create a CopilotClient that works correctly inside
a PyInstaller (or Nuitka) frozen executable.
Run normally:
python pyinstaller_frozen_build.py
Build with PyInstaller:
pyinstaller --onefile pyinstaller_frozen_build.py
Requirements:
pip install copilot-sdk certifi
"""
import asyncio
import os
import sys
from pathlib import Path
from copilot import CopilotClient, SubprocessConfig
# ---------------------------------------------------------------------------
# CLI binary resolution
# ---------------------------------------------------------------------------
def resolve_cli_path() -> str | None:
    """Find the Copilot CLI binary in a frozen build.

    Searches the SDK's standard location first, then falls back to
    PyInstaller's _MEIPASS temporary directory.
    """
    candidates: list[Path] = []
    binary = "copilot.exe" if sys.platform == "win32" else "copilot"

    # 1. SDK's normal resolution (works in non-frozen builds)
    try:
        import copilot as pkg
        candidates.append(Path(pkg.__file__).parent / "bin" / binary)
    except Exception:
        pass

    # 2. PyInstaller _MEIPASS fallback
    if getattr(sys, "frozen", False) and hasattr(sys, "_MEIPASS"):
        meipass = Path(sys._MEIPASS)
        candidates.append(meipass / "copilot" / "bin" / binary)
        candidates.append(meipass.parent / "copilot" / "bin" / binary)

    for c in candidates:
        if c.exists():
            # Restore execute permissions on Unix (lost during archive extraction)
            if sys.platform != "win32" and not os.access(str(c), os.X_OK):
                os.chmod(str(c), c.stat().st_mode | 0o755)
            return str(c)
    return None
# ---------------------------------------------------------------------------
# SSL certificate setup
# ---------------------------------------------------------------------------
def ensure_ssl_certs():
    """Inject certifi's CA bundle into the environment.

    On macOS frozen builds the system certificate store is unreachable,
    so the CLI subprocess fails TLS handshakes unless we set these vars.
    """
    if os.environ.get("SSL_CERT_FILE"):
        return  # Already configured
    try:
        import certifi
        ca = certifi.where()
        if Path(ca).is_file():
            os.environ["SSL_CERT_FILE"] = ca
            os.environ["REQUESTS_CA_BUNDLE"] = ca
            os.environ.setdefault("NODE_EXTRA_CA_CERTS", ca)
    except ImportError:
        pass  # CLI will fall back to platform defaults
# ---------------------------------------------------------------------------
# Client factory
# ---------------------------------------------------------------------------
async def create_frozen_client() -> CopilotClient:
    """Create a CopilotClient that works in both normal and frozen builds."""
    ensure_ssl_certs()
    kwargs: dict = {"log_level": "info", "use_stdio": True}
    if getattr(sys, "frozen", False):
        cli = resolve_cli_path()
        if cli:
            kwargs["cli_path"] = cli
            print(f"[frozen] Using CLI at: {cli}")
        else:
            print("[frozen] WARNING: Could not locate Copilot CLI binary")
    client = CopilotClient(SubprocessConfig(**kwargs), auto_start=True)
    await client.start()
    return client
# ---------------------------------------------------------------------------
# Demo
# ---------------------------------------------------------------------------
async def main():
    frozen = getattr(sys, "frozen", False)
    print(f"Running as {'frozen' if frozen else 'normal'} Python process")
    client = await create_frozen_client()
    try:
        session = await client.create_session()
        response = await session.send_message(
            "Say 'Hello from a frozen build!' if you can read this."
        )
        print(f"Response: {response}")
    finally:
        await client.stop()


if __name__ == "__main__":
    asyncio.run(main())