Services
Backing containers — Postgres, Redis, mock APIs — that boot alongside the sandbox on a private Docker network.
A service is a backing container that runs alongside your sandbox on a shared Docker network. Your agent reaches each by its declared name over DNS — postgres://db:5432, redis://cache:6379, http://stripe-mock:12111.
You declare services in your spec; Keystone pulls the image, starts the container with --network keystone-<sandbox-id> --network-alias <name>, optionally waits for a health check, and tears everything down on sandbox destroy.
Anatomy of a service block
```yaml
services:
  - name: db                    # DNS alias on the shared network
    image: postgres:16          # any Docker image
    env:
      POSTGRES_PASSWORD: "{{ secrets.DB_PASS }}"
      POSTGRES_DB: northwind
    ports: [5432]               # internal only — NOT host-published
    wait_for: "pg_isready -U postgres"   # gates "ready" status
  - name: cache
    image: redis:7
    ports: [6379]
  - name: stripe-mock
    type: http_mock             # built-in HTTP responder, no image needed
    record: true
    routes:
      - method: POST
        path: /v1/charge
        response: '{"id": "ch_test", "status": "succeeded"}'
        status: 200
```

Every service field maps to a `ServiceSpec` Go struct on the server:
| Field | Type | Required | Meaning |
|---|---|---|---|
| `name` | string | yes | DNS alias (must be unique within a sandbox) |
| `image` | string | yes (unless `type: http_mock`) | Any Docker image — Hub, GHCR, ECR, GAR, private registry |
| `type` | string | no | Empty (default — runs `image`) or `http_mock` (built-in mock) |
| `env` | map | no | Container env. Supports `{{ secrets.NAME }}` interpolation |
| `ports` | list of ints | no | Container ports exposed on the network (internal only) |
| `wait_for` | string | no | Shell command run inside the container; success gates readiness |
| `record` | bool | no | (`http_mock` only) record every request for assertions |
| `default_response` | int | no | (`http_mock` only) status code for unmatched routes |
| `routes` | list | no | (`http_mock` only) per-method/path responders |
Networking
Private Docker network per sandbox
Every sandbox gets a dedicated Docker network named keystone-<sandbox-id>. The agent container and every service container join that network, with name as their DNS alias:
```
┌────────────── keystone-sb-abc123 ──────────────┐
│                                                │
│  agent ──► db          postgres://db:5432      │
│        ──► cache       redis://cache:6379      │
│        ──► stripe-mock http://stripe-mock:12111│
│                                                │
└────────────────────────────────────────────────┘
```
This means:
- No port conflicts across parallel sandboxes — each runs its own `db:5432`.
- Service containers are not reachable from the host.
- Connection strings are stable: `postgres://db:5432/northwind` works in every run.
Outbound traffic
Service containers respect the sandbox's `network.egress` policy. By default, `egress.default: deny` blocks all outbound traffic; the agent's calls to services still succeed because that traffic never leaves the private network.
If a service itself needs to reach the public internet (e.g., a Postgres image fetching extensions on first boot), allow the host explicitly:
```yaml
network:
  egress:
    default: deny
    allow:
      - registry.npmjs.org
      - github.com
```

Service env vars (auto-injected)
Keystone exports per-service connection info as env vars in the agent container. Names follow KEYSTONE_SERVICE_<NAME>_HOST / _PORT with <NAME> upper-snake-cased:
```
# spec: services: [{ name: db }, { name: stripe-mock }]
# injected:
KEYSTONE_SERVICE_DB_HOST=db
KEYSTONE_SERVICE_DB_PORT=5432
KEYSTONE_SERVICE_STRIPE_MOCK_HOST=stripe-mock
KEYSTONE_SERVICE_STRIPE_MOCK_PORT=12111
```

Use these when you want runtime discovery instead of hardcoded names.
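The upper-snake-casing rule can be applied programmatically when building connection strings at runtime. A minimal Python sketch (the `service_addr` helper and `fake_env` dict are illustrative, not part of Keystone):

```python
import os

def service_addr(name: str, env=os.environ) -> tuple[str, int]:
    """Resolve a service's host/port from Keystone's injected env vars.

    The alias is upper-snake-cased: "stripe-mock" -> "STRIPE_MOCK".
    """
    key = name.upper().replace("-", "_")
    host = env[f"KEYSTONE_SERVICE_{key}_HOST"]
    port = int(env[f"KEYSTONE_SERVICE_{key}_PORT"])
    return host, port

# Demo with the vars from the spec above, stubbed as a dict:
fake_env = {
    "KEYSTONE_SERVICE_DB_HOST": "db",
    "KEYSTONE_SERVICE_DB_PORT": "5432",
}
host, port = service_addr("db", env=fake_env)
dsn = f"postgres://{host}:{port}/northwind"
```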
Real images
Postgres
```yaml
services:
  - name: db
    image: postgres:16
    env:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: "{{ secrets.DB_PASSWORD }}"
      POSTGRES_DB: northwind
    ports: [5432]
    wait_for: "pg_isready -U postgres"
```

The `wait_for` command runs once a second for up to 60s. `pg_isready` is built into the Postgres image — exit code 0 means the server is accepting connections.
Fixture credentials default to `POSTGRES_USER=postgres` / `POSTGRES_PASSWORD=test` / `POSTGRES_DB=testdb` if you don't set them. Override them via `env:` and Keystone re-derives the connection details; SQL fixtures use whatever credentials you declared.
Redis
```yaml
services:
  - name: cache
    image: redis:7
    ports: [6379]
    wait_for: "redis-cli ping"
```

MailHog (SMTP capture for emails)
```yaml
services:
  - name: smtp
    image: mailhog/mailhog
    ports: [1025, 8025]   # 1025 = SMTP, 8025 = HTTP UI/API
```

MailHog is a fake SMTP server that captures every email it receives. Pair it with `http_mock_assertions` against the HTTP API, or query its /api/v2/messages endpoint from a custom check.
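As a sketch of such a custom check, here is one way to assert on a parsed /api/v2/messages payload. The payload shape (`total`, `items`, `Content.Headers`) follows MailHog's v2 API; the sample data and the `emails_to` helper are invented for illustration:

```python
def emails_to(messages: dict, recipient: str) -> list[dict]:
    """Filter a parsed /api/v2/messages payload by recipient address."""
    return [
        m for m in messages.get("items", [])
        if recipient in m["Content"]["Headers"].get("To", [])
    ]

# Invented sample payload in MailHog's v2 shape:
sample = {
    "total": 2,
    "items": [
        {"Content": {"Headers": {"To": ["a@example.com"], "Subject": ["Welcome"]}}},
        {"Content": {"Headers": {"To": ["b@example.com"], "Subject": ["Reset"]}}},
    ],
}
matches = emails_to(sample, "a@example.com")
```

A real check would fetch the payload from `http://smtp:8025/api/v2/messages` first, then assert on the result.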
Vector databases
```yaml
services:
  - name: vector
    image: ghcr.io/qdrant/qdrant:v1.7.4
    ports: [6333]
```

Any registry works — Docker Hub, GHCR, ECR, GAR, or a private registry the server has credentials for.
Private registries
The server pulls images on demand. For private images, configure registry credentials on the Keystone server (out of band; there is no spec field for credentials). Alternatively, use the agent snapshots feature: tar your service image alongside your agent and inject it via fixtures.
Built-in HTTP mocks
When you don't want to run a real image and just need scripted HTTP responses:
```yaml
services:
  - name: payment-api
    type: http_mock
    ports: [9090]
    default_response: 404
    record: true                # capture every request for later assertion
    routes:
      - method: POST
        path: /v1/charge
        response: '{"status":"ok","charge_id":"ch_test_123"}'
        status: 200
      - method: GET
        path: /v1/balance
        response: '{"balance":10000}'
      - method: ANY
        path: "/v1/webhooks/.*"   # regex path matching
        response: '{"ok": true}'
```

No image, no Dockerfile — Keystone runs a built-in Go HTTP responder. Routes are evaluated top to bottom; the first match wins. Unmatched paths get `default_response`.
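The matching rules (top to bottom, first match wins, regex paths, `default_response` fallback) can be sketched in a few lines. This is an illustration of the documented semantics, not Keystone's actual Go responder; whether the real matcher anchors regexes like `fullmatch` here is an assumption:

```python
import re

# Route table mirroring the spec above: (method, path pattern, status, body)
ROUTES = [
    ("POST", r"/v1/charge", 200, '{"status":"ok","charge_id":"ch_test_123"}'),
    ("GET", r"/v1/balance", 200, '{"balance":10000}'),
    ("ANY", r"/v1/webhooks/.*", 200, '{"ok": true}'),
]
DEFAULT_RESPONSE = 404

def match_route(method: str, path: str) -> tuple[int, str]:
    """First-match-wins lookup; unmatched paths get the default status."""
    for m, pattern, status, body in ROUTES:
        if (m == "ANY" or m == method) and re.fullmatch(pattern, path):
            return status, body
    return DEFAULT_RESPONSE, ""
```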
Recording mode
With record: true, the mock saves every request to its replay log. Invariants can later assert on what your agent sent:
```yaml
invariants:
  charged_once:
    description: "Exactly one charge call was made"
    weight: 1.0
    gate: true
    check:
      type: http_mock_assertions
      service: payment-api
      assertions:
        - field: request_count
          filters: { path: "/v1/charge" }
          equals: 1
        - field: last_request.body
          contains: "amount"
```

Available field selectors:
- `request_count` — number of matching requests
- `last_request.body` — body of the most recent matching request
- `last_request.headers` — headers map
- `requests[N]` — Nth request (zero-indexed)
`filters:` narrows which requests are considered: by method, path, and header values.
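Conceptually, the selectors reduce to filtering and indexing the replay log. A Python sketch against an invented log, using exact-match filters (real path filters may be richer); none of these helper names come from Keystone:

```python
def matching(log: list[dict], **filters) -> list[dict]:
    """Requests whose fields equal every given filter (method, path, ...)."""
    return [r for r in log if all(r.get(k) == v for k, v in filters.items())]

def request_count(log: list[dict], **filters) -> int:
    return len(matching(log, **filters))

def last_request(log: list[dict], **filters) -> dict:
    return matching(log, **filters)[-1]

# Invented replay log:
log = [
    {"method": "GET", "path": "/v1/balance", "body": ""},
    {"method": "POST", "path": "/v1/charge", "body": '{"amount": 500}'},
]
```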
Wait conditions
wait_for is a shell command that runs inside the container every second, up to 60 seconds. The service is considered ready when it exits 0.
| Service type | Recommended wait_for |
|---|---|
| Postgres | `pg_isready -U postgres` |
| MySQL | `mysqladmin ping -h localhost -u root` |
| Redis | `redis-cli ping` |
| MongoDB | `mongosh --eval "db.adminCommand({ping:1})"` |
| ElasticSearch | `curl -sf http://localhost:9200/_cluster/health` |
| Kafka | `kafka-topics --bootstrap-server localhost:9092 --list` |
| Built-in http_mock | (none needed — listens immediately) |
If wait_for doesn't pass within 60s, sandbox creation fails with service <name> not ready.
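The polling semantics can be sketched generically. Keystone runs the shell command inside the service container; here the probe is abstracted as a callable returning an exit code so the loop stands alone (illustrative, not Keystone's implementation):

```python
import time
from typing import Callable

def wait_for_ready(probe: Callable[[], int], timeout: float = 60.0,
                   interval: float = 1.0) -> bool:
    """Run the readiness probe once per interval; ready on exit code 0,
    failure if the timeout elapses first."""
    deadline = time.monotonic() + timeout
    while True:
        if probe() == 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

# A probe that starts succeeding on its third attempt:
attempts = {"n": 0}
def flaky_probe() -> int:
    attempts["n"] += 1
    return 0 if attempts["n"] >= 3 else 1

ready = wait_for_ready(flaky_probe, timeout=10, interval=0.01)
```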
Image caching
Keystone pulls images on demand, then caches them. If 100 sandboxes use postgres:16, the image is pulled once and started 100 times. Don't worry about boot time for widely-used images.
For freshly tagged images, the server respects Docker's pull policy — :latest is checked against the registry; pinned tags (:16, :7-alpine) hit the cache.
Patterns
Database + cache + mock API
The "real-world web app" shape:
```yaml
services:
  - name: db
    image: postgres:16
    env:
      POSTGRES_PASSWORD: "{{ secrets.DB_PASS }}"
      POSTGRES_DB: app
    ports: [5432]
    wait_for: "pg_isready"
  - name: cache
    image: redis:7
    ports: [6379]
    wait_for: "redis-cli ping"
  - name: stripe-mock
    type: http_mock
    record: true
    default_response: 404
    routes:
      - method: POST
        path: /v1/payment_intents
        response: '{"id":"pi_test","status":"succeeded"}'
        status: 200
network:
  dns_overrides:
    api.stripe.com: stripe-mock.services.internal   # redirect real Stripe to mock
```

Multi-database fixture
```yaml
services:
  - name: db_orders
    image: postgres:16
    env: { POSTGRES_DB: orders, POSTGRES_PASSWORD: "{{ secrets.DB_PASS }}" }
    ports: [5432]
    wait_for: "pg_isready"
  - name: db_inventory
    image: postgres:16
    env: { POSTGRES_DB: inventory, POSTGRES_PASSWORD: "{{ secrets.DB_PASS }}" }
    ports: [5432]
    wait_for: "pg_isready"
fixtures:
  - { type: sql, service: db_orders, sql: "CREATE TABLE orders ..." }
  - { type: sql, service: db_inventory, sql: "CREATE TABLE items ..." }
```

Two separate Postgres instances, each with its own DNS name — both listen on :5432 internally with no host port collision.
Limits
| Limit | Default | Configurable on self-hosted? |
|---|---|---|
| Services per sandbox | 16 | Yes (`sandbox.max_services`) |
| `wait_for` timeout | 60s | Yes (`services.wait_timeout`) |
| Image size | 10 GB | Yes (Docker daemon config) |
| Concurrent service container starts | 5 | No |
Troubleshooting
"service <name> not ready" — wait_for didn't exit 0 within 60s. Check that the command is the right one for the image and that the service is actually starting (look at the audit log or stream /v1/sandboxes/:id/events).
"image <name> not found" — registry doesn't have it, or your server can't reach the registry. Try pulling the image manually on the server first.
"port already in use" — you tried to publish a port to the host with -p host:container syntax. Don't. Just declare ports: [<container-port>]; Keystone handles the rest.
Agent can't connect to service — verify the service name exactly matches the connection string. name: my-db means the agent connects to my-db:<port>, not mydb or my_db.