Quick Start
The fastest path to a working Keystone install: from zero to a passing eval in under 5 minutes. Three commands; the wizard does the rest.
1. Install the CLI
curl -fsSL https://ks.polarity.so/install.sh | bash
Verify:
ks --version
2. Add your API key
Get a key at app.paragon.run/app/keystone/settings → API Keys → Create Key. Keys start with ks_live_ and are shown once. Drop it in your project's .env:
cd ~/your-project
echo 'KEYSTONE_API_KEY=ks_live_...' >> .env
3. Run the wizard
ks setup
ks setup runs seven phases end to end, each idempotent and each independently runnable:
| Phase | What it does |
|---|---|
| skills | Writes coding-agent skill files (.claude/skills/keystone/SKILL.md, .cursor/rules/keystone.mdc, etc.) |
| mcp | Registers ks mcp serve in your project's MCP configs so your agent can call Keystone as tools |
| spec | Drops a starter spec at keystone/example.yaml |
| instrument | Scans your code for ~50 LLM-SDK construction sites and prints them grouped by family |
| install | Installs the Keystone SDK for each language detected (Go / TS / Python) |
| snapshot | Detects agent code in your repo and explains how to package it as a snapshot |
| doctor | Verifies API key, server reachability, auth, and ks on PATH |
When it finishes, you'll have a starter spec, your coding agent wired up, the SDK installed, and a green doctor check.
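Because each phase can be re-run by name (ks setup mcp, for example), a small wrapper can replay just the phases you care about. The following is a minimal sketch, not part of Keystone: the helper names phase_command and run_phases are hypothetical, and it assumes that ks exits with a non-zero status when a phase fails, which this page does not state.

```python
import subprocess

# Phase names as listed in the table above, in wizard order.
PHASES = ["skills", "mcp", "spec", "instrument", "install", "snapshot", "doctor"]


def phase_command(phase):
    """Build the argv for one phase; rejects names that are not in the table."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    return ["ks", "setup", phase]


def run_phases(phases, runner=subprocess.run):
    """Run phases in order.

    Returns the name of the first phase whose process reports a non-zero
    return code (ASSUMPTION: that is how ks signals failure), or None if
    every phase succeeded. `runner` is injectable so the loop can be tested
    without the CLI installed.
    """
    for phase in phases:
        result = runner(phase_command(phase))
        if result.returncode != 0:
            return phase
    return None
```

Calling run_phases(["mcp", "spec", "doctor"]) from your project directory would re-run those three phases and stop at the first one that fails. Since the phases are described as idempotent, repeating a phase that already succeeded should be harmless.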
4. Run your first eval
ks eval run keystone/example.yaml
Expected: a passing scenario in 10–30 seconds, with a RunResults JSON printed to stdout.
{
  "experiment_id": "exp-a1b2c3",
  "passed": 1,
  "failed": 0,
  "metrics": { "pass_rate": 1.0, "mean_wall_ms": 12000 },
  "scenarios": [{ "status": "pass", "composite_score": 1.0 }]
}
You're done. That's the whole quick start.
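If you want to script this step (in CI, say), the RunResults JSON can be checked programmatically. The sketch below relies only on the fields visible in the sample output above; the full schema is not documented on this page, so treat the field handling as an assumption. The function name summarize_run is hypothetical.

```python
import json


def summarize_run(raw):
    """Parse RunResults JSON text and return (ok, message).

    ASSUMPTION: the payload has the shape of the sample above
    (experiment_id, passed, failed, metrics.pass_rate, scenarios[].status).
    Missing fields fall back to neutral defaults instead of raising.
    """
    results = json.loads(raw)
    passed = results.get("passed", 0)
    failed = results.get("failed", 0)
    pass_rate = results.get("metrics", {}).get("pass_rate")
    # Any scenario whose status is not exactly "pass" counts against the run.
    not_passing = [
        s for s in results.get("scenarios", []) if s.get("status") != "pass"
    ]
    ok = passed > 0 and failed == 0 and not not_passing
    message = (
        f"{results.get('experiment_id', '?')}: "
        f"{passed} passed, {failed} failed, pass_rate={pass_rate}"
    )
    return ok, message
```

One way to use it is to capture the stdout of ks eval run keystone/example.yaml, pass the text to summarize_run, and exit non-zero when ok is False. This assumes stdout carries only the JSON document; if the CLI prints other text as well, the JSON would need to be extracted first.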
What's next
The setup wizard handled almost everything. The one thing it couldn't do is the actual code change: wrapping your existing LLM clients with ks.wrap() so your agent's calls are traced. That's a five-minute job, walked through in Setup Guide → Step 4.
After that:
| Want to | Read |
|---|---|
| Understand the mental model | Concepts |
| See real-world spec examples | Examples |
| Write your own spec | Spec Reference |
| Master the CLI | CLI Reference |
| Use the SDK | SDK Reference |
| Debug something broken | Troubleshooting |
If something's wrong
ks setup doctor
Runs the five health checks (API key, server reachable, auth works, ks on PATH, .env parse-clean) and tells you what's broken, with actionable hints. Re-run any single phase by name (ks setup mcp, ks setup spec, etc.); they're all idempotent.
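If the CLI itself won't run, two of those checks can be approximated by hand. The following is a rough local sketch, not a reimplementation of the doctor phase: it only looks for KEYSTONE_API_KEY in .env with the ks_live_ prefix and for ks on PATH. The helper names read_env_key and local_checks are hypothetical, and the .env parsing is deliberately simple (KEY=value lines, optional quotes, # comments); it does not handle export prefixes or multi-line values.

```python
import shutil
from pathlib import Path


def read_env_key(env_path, name="KEYSTONE_API_KEY"):
    """Return the last value assigned to `name` in a .env file, or None."""
    path = Path(env_path)
    if not path.is_file():
        return None
    value = None
    for line in path.read_text().splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        if key.strip() == name:
            # Later assignments win, matching repeated `>>` appends.
            value = raw.strip().strip("'\"")
    return value


def local_checks(env_path=".env"):
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    key = read_env_key(env_path)
    if key is None:
        problems.append("KEYSTONE_API_KEY not found in .env")
    elif not key.startswith("ks_live_"):
        problems.append("KEYSTONE_API_KEY does not start with ks_live_")
    if shutil.which("ks") is None:
        problems.append("ks is not on PATH")
    return problems
```

Running local_checks() from the project root returns a list of plain-text problems. It never prints the key, and it cannot tell you whether the server is reachable or whether the key is accepted; those still need ks setup doctor.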
Stuck? See Troubleshooting.