update eval-driven-dev skill (#1434)

* update eval-driven-dev skill

* fix: update skill update command to use correct repository path

* address comments.

* update eval driven dev
Yiou Li
2026-04-27 18:27:48 -07:00
committed by GitHub
parent 9933f65e6b
commit 2860790bc9
23 changed files with 1881 additions and 700 deletions

@@ -0,0 +1,60 @@
# Runnable Example: Standalone Function (No Server)

**When the app is a plain Python function or module**: no web framework, no server, no infrastructure.

**Approach**: Import and call the function directly from `run()`. This is the simplest case.

```python
# pixie_qa/run_app.py
from pydantic import BaseModel

import pixie


class AppArgs(BaseModel):
    question: str


class AppRunnable(pixie.Runnable[AppArgs]):
    """Drives a standalone function for tracing and evaluation."""

    @classmethod
    def create(cls) -> "AppRunnable":
        return cls()

    async def run(self, args: AppArgs) -> None:
        from myapp.agent import answer_question

        await answer_question(args.question)
```

If the function is synchronous, run it in a worker thread with `asyncio.to_thread` so it does not block the event loop:

```python
import asyncio

async def run(self, args: AppArgs) -> None:
    from myapp.agent import answer_question

    await asyncio.to_thread(answer_question, args.question)
```
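
To see why the thread hand-off matters, here is a minimal standard-library sketch that does not use pixie. `answer_question_sync` and `heartbeat` are hypothetical stand-ins: the first simulates a blocking app function, the second is a coroutine that keeps ticking while the blocking call runs in a worker thread.

```python
import asyncio
import time


def answer_question_sync(question: str) -> str:
    """Hypothetical blocking function standing in for a synchronous app entry point."""
    time.sleep(0.2)  # simulate blocking work (model call, disk I/O, ...)
    return f"answer to: {question}"


async def heartbeat(ticks: list[int]) -> None:
    """Appends a tick every 50 ms; it only makes progress if the event loop is free."""
    for i in range(3):
        await asyncio.sleep(0.05)
        ticks.append(i)


async def main() -> tuple[str, list[int]]:
    ticks: list[int] = []
    # The blocking call runs in a worker thread, so heartbeat() runs concurrently.
    answer, _ = await asyncio.gather(
        asyncio.to_thread(answer_question_sync, "What does this app do?"),
        heartbeat(ticks),
    )
    return answer, ticks


answer, ticks = asyncio.run(main())
```

Calling `answer_question_sync(...)` directly inside a coroutine would still return the same answer, but the heartbeat would not start ticking until the blocking call finished.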

If the function depends on an external service (e.g., a vector store), the `wrap(purpose="input")` calls you added in Step 2a handle it automatically: the registry injects test data in eval mode.

### When to use `setup()` / `teardown()`

Most standalone functions don't need lifecycle methods. Use them only when the function requires a shared resource (e.g., a pre-loaded embedding model, a database connection):

```python
class AppRunnable(pixie.Runnable[AppArgs]):
    _model: SomeModel

    @classmethod
    def create(cls) -> "AppRunnable":
        return cls()

    async def setup(self) -> None:
        from myapp.models import load_model

        self._model = load_model()

    async def run(self, args: AppArgs) -> None:
        from myapp.agent import answer_question

        await answer_question(args.question, model=self._model)
```
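
The example above only shows `setup()`. As a rough illustration of how a matching `teardown()` could release the resource, here is a self-contained sketch that deliberately does not import pixie. `FakeModel`, `LifecycleSketch`, and `drive` are hypothetical names, and the call order in `drive` (setup once, run per input, teardown once) is an assumption about how a harness might behave, not a description of pixie's actual runner.

```python
import asyncio


class FakeModel:
    """Hypothetical shared resource; stands in for a pre-loaded model or a connection."""

    def __init__(self) -> None:
        self.closed = False

    def predict(self, question: str) -> str:
        return f"answer to: {question}"

    def close(self) -> None:
        self.closed = True


class LifecycleSketch:
    """Plain class mirroring the setup/run/teardown shape; it does not use pixie."""

    async def setup(self) -> None:
        # Acquire the expensive resource once, before any run() call.
        self._model = FakeModel()

    async def run(self, question: str) -> str:
        return self._model.predict(question)

    async def teardown(self) -> None:
        # Release the resource acquired in setup().
        self._model.close()


async def drive(runnable: LifecycleSketch, questions: list[str]) -> list[str]:
    """Assumed call order: setup once, run per input, teardown once even on error."""
    await runnable.setup()
    try:
        return [await runnable.run(q) for q in questions]
    finally:
        await runnable.teardown()


runnable = LifecycleSketch()
results = asyncio.run(drive(runnable, ["q1", "q2"]))
```

The `try`/`finally` in the hypothetical driver is the design point: whatever `setup()` acquires should be something `teardown()` can safely release even when a `run()` call fails.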