Services

Backing containers — Postgres, Redis, mock APIs — that boot alongside the sandbox on a private Docker network.

A service is a backing container that runs alongside your sandbox on a shared Docker network. Your agent reaches each service by its declared name over DNS, for example postgres://db:5432, redis://cache:6379, or http://stripe-mock:12111.

You declare services in your spec; Keystone pulls the image, starts the container with --network keystone-<sandbox-id> --network-alias <name>, optionally waits for a health check, and tears everything down on sandbox destroy.

Anatomy of a service block

services:
  - name: db                                  # DNS alias on the shared network
    image: postgres:16                         # any Docker image
    env:
      POSTGRES_PASSWORD: "{{ secrets.DB_PASS }}"
      POSTGRES_DB: northwind
    ports: [5432]                              # internal only — NOT host-published
    wait_for: "pg_isready -U postgres"         # gates "ready" status
 
  - name: cache
    image: redis:7
    ports: [6379]
 
  - name: stripe-mock
    type: http_mock                            # built-in HTTP responder, no image needed
    record: true
    routes:
      - method: POST
        path: /v1/charge
        response: '{"id": "ch_test", "status": "succeeded"}'
        status: 200

Every service field maps to a ServiceSpec Go struct on the server:

| Field | Type | Required | Meaning |
| --- | --- | --- | --- |
| name | string | yes | DNS alias (must be unique within a sandbox) |
| image | string | yes (unless type: http_mock) | Any Docker image: Hub, GHCR, ECR, GAR, private registry |
| type | string | no | Empty (default, runs image) or http_mock (built-in mock) |
| env | map | no | Container env. Supports {{ secrets.NAME }} interpolation |
| ports | list of ints | no | Container ports exposed on the network (internal only) |
| wait_for | string | no | Shell command run inside the container; success gates readiness |
| record | bool | no | (http_mock only) record every request for assertions |
| default_response | int | no | (http_mock only) status code for unmatched routes |
| routes | list | no | (http_mock only) per-method/path responders |
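
The rules in the table can be checked before a spec is submitted. The following is a minimal Python sketch, not Keystone's server-side validation: validate_services is a hypothetical helper, it covers only the rules stated above, and treating mock-only fields on an image service as an error is an assumption.

```python
from typing import Any

# Fields the table marks as "(http_mock only)".
MOCK_ONLY_FIELDS = {"record", "default_response", "routes"}


def validate_services(services: list[dict[str, Any]]) -> list[str]:
    """Return human-readable problems; an empty list means the block looks valid."""
    problems: list[str] = []
    seen: set[str] = set()
    for index, svc in enumerate(services):
        name = svc.get("name")
        label = name or f"services[{index}]"
        if not name:
            problems.append(f"{label}: name is required")
        elif name in seen:
            problems.append(f"{label}: name must be unique within a sandbox")
        else:
            seen.add(name)

        kind = svc.get("type", "")
        if kind not in ("", "http_mock"):
            problems.append(f"{label}: unknown type {kind!r}")
        if kind != "http_mock" and not svc.get("image"):
            problems.append(f"{label}: image is required unless type is http_mock")

        ports = svc.get("ports", [])
        if not all(isinstance(p, int) and not isinstance(p, bool) for p in ports):
            problems.append(f"{label}: ports must be a list of ints")

        # Assumption: mock-only fields on an image service are a mistake.
        if kind != "http_mock":
            for field in sorted(MOCK_ONLY_FIELDS & svc.keys()):
                problems.append(f"{label}: {field} applies to http_mock services only")
    return problems
```

Calling validate_services on the parsed services list returns an empty list when every entry satisfies the table, and one message per violation otherwise.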

Networking

Private Docker network per sandbox

Every sandbox gets a dedicated Docker network named keystone-<sandbox-id>. The agent container and every service container join that network, with name as their DNS alias:

┌───────────────── keystone-sb-abc123 ──────────────────┐
│                                                       │
│  agent ─────► db          (postgres://db:5432)        │
│        ─────► cache       (redis://cache:6379)        │
│        ─────► stripe-mock (http://stripe-mock:12111)  │
│                                                       │
└───────────────────────────────────────────────────────┘

This means:

  • No port conflicts across parallel sandboxes — each runs its own db:5432.
  • Service containers are not reachable from the host.
  • Connection strings are stable: postgres://db:5432/northwind works in every run.

Outbound traffic

Service containers follow the sandbox's network.egress policy. With the default egress.default: deny, all outbound traffic is blocked; the agent's calls to services still succeed because they stay inside the sandbox network.

If a service itself needs to reach the public internet (e.g., a Postgres image fetching extensions on first boot), allow the host explicitly:

network:
  egress:
    default: deny
    allow:
      - registry.npmjs.org
      - github.com

Service env vars (auto-injected)

Keystone exports per-service connection info as env vars in the agent container. Names follow KEYSTONE_SERVICE_<NAME>_HOST / _PORT with <NAME> upper-snake-cased:

# spec: services: [{ name: db }, { name: stripe-mock }]
# injected:
KEYSTONE_SERVICE_DB_HOST=db
KEYSTONE_SERVICE_DB_PORT=5432
KEYSTONE_SERVICE_STRIPE_MOCK_HOST=stripe-mock
KEYSTONE_SERVICE_STRIPE_MOCK_PORT=12111

Use these when you want runtime discovery instead of hardcoded names.
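
As a sketch of runtime discovery, the Python helper below derives the variable names from a service name and reads the injected values. The function names are hypothetical, and mapping every non-alphanumeric character to an underscore is an assumption inferred from the stripe-mock example above.

```python
import os
import re
from collections.abc import Mapping


def service_env_prefix(name: str) -> str:
    """Map a service name to its env var prefix (stripe-mock -> KEYSTONE_SERVICE_STRIPE_MOCK)."""
    # Assumption: runs of non-alphanumeric characters become a single underscore.
    return "KEYSTONE_SERVICE_" + re.sub(r"[^A-Za-z0-9]+", "_", name).upper()


def service_address(name: str, env: Mapping[str, str] = os.environ) -> tuple[str, int]:
    """Read the injected host and port for a service; raises KeyError if either is missing."""
    prefix = service_env_prefix(name)
    return env[f"{prefix}_HOST"], int(env[f"{prefix}_PORT"])
```

Inside the agent container, host, port = service_address("db") would then feed a connection string such as f"postgres://{host}:{port}/northwind".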

Real images

Postgres

services:
  - name: db
    image: postgres:16
    env:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: "{{ secrets.DB_PASSWORD }}"
      POSTGRES_DB: northwind
    ports: [5432]
    wait_for: "pg_isready -U postgres"

Keystone runs the wait_for command once a second for up to 60 seconds. pg_isready ships with the Postgres image; exit code 0 means the server is accepting connections.

If you don't set them, fixture credentials default to POSTGRES_USER=postgres / POSTGRES_PASSWORD=test / POSTGRES_DB=testdb. Override them via env: and Keystone re-derives the fixture credentials, so SQL fixtures use whatever you declared.

Redis

services:
  - name: cache
    image: redis:7
    ports: [6379]
    wait_for: "redis-cli ping"

MailHog (SMTP capture for emails)

services:
  - name: smtp
    image: mailhog/mailhog
    ports: [1025, 8025]    # 1025 = SMTP, 8025 = HTTP UI/API

MailHog is a fake SMTP server that captures every email. Pair with http_mock_assertions against the HTTP API or query its /api/v2/messages endpoint from a custom check.

Vector databases

services:
  - name: vector
    image: ghcr.io/qdrant/qdrant:v1.7.4
    ports: [6333]

Any registry works — Docker Hub, GHCR, ECR, GAR, a private registry the server has credentials for.

Private registries

The server pulls images on demand. For private images, configure registry credentials on the Keystone server (out-of-band; there's no spec field for credentials). Alternatively, use the agent snapshots feature: tar your service alongside your agent and inject it via fixtures.

Built-in HTTP mocks

When you don't want to run a real image and just need scripted HTTP responses:

services:
  - name: payment-api
    type: http_mock
    ports: [9090]
    default_response: 404
    record: true                 # capture every request for later assertion
    routes:
      - method: POST
        path: /v1/charge
        response: '{"status":"ok","charge_id":"ch_test_123"}'
        status: 200
      - method: GET
        path: /v1/balance
        response: '{"balance":10000}'
      - method: ANY
        path: "/v1/webhooks/.*"   # regex path matching
        response: '{"ok": true}'

No image, no Dockerfile — Keystone runs a built-in Go HTTP responder. Match order: routes are evaluated top-to-bottom; first match wins. Unmatched paths get default_response.
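
The first-match-wins rule can be illustrated with a short Python sketch. It is not the built-in Go responder: match_route is a hypothetical name, and three behaviours are assumptions, namely that a path pattern must match the whole request path, that a route without status returns 200, and that 404 is used when default_response is not set.

```python
import re


def match_route(routes, method, path, default_status=404):
    """Return (status, body) for the first route that matches method and path."""
    for route in routes:  # top-to-bottom; the first match wins
        method_ok = route["method"] in ("ANY", method)
        # Assumption: the pattern must match the entire request path.
        path_ok = re.fullmatch(route["path"], path) is not None
        if method_ok and path_ok:
            # Assumption: status defaults to 200 when the route omits it.
            return route.get("status", 200), route.get("response", "")
    # No route matched: fall back to default_response.
    return default_status, ""
```

Because evaluation stops at the first hit, a broad pattern such as /v1/.* placed above a specific route would shadow it; list specific routes first.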

Recording mode

With record: true, the mock saves every request to its replay log. Invariants can later assert on what your agent sent:

invariants:
  charged_once:
    description: "Exactly one charge call was made"
    weight: 1.0
    gate: true
    check:
      type: http_mock_assertions
      service: payment-api
      assertions:
        - field: request_count
          filters: { path: "/v1/charge" }
          equals: 1
        - field: last_request.body
          contains: "amount"

Available field selectors:

  • request_count — number of matching requests
  • last_request.body — body of the most recent matching request
  • last_request.headers — headers map
  • requests[N] — Nth request (zero-indexed)

filters narrows which requests are counted. It accepts method and path, plus header filters.
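
To make the assertion semantics concrete, here is a rough Python sketch of how request_count and last_request.body could be evaluated against a recorded log. The log shape (a list of dicts with method, path, and body keys), the exact-match filtering, and the function names are all assumptions for illustration; Keystone's actual replay log format is not described here, and header filters are left out.

```python
def filter_requests(log, filters=None):
    """Keep the requests whose fields equal every filter value (exact match assumed)."""
    filters = filters or {}
    return [r for r in log if all(r.get(k) == v for k, v in filters.items())]


def check_assertion(log, assertion):
    """Evaluate one assertion dict against the recorded requests."""
    matching = filter_requests(log, assertion.get("filters"))
    field = assertion["field"]
    if field == "request_count":
        value = len(matching)
    elif field == "last_request.body":
        # An empty string stands in for "no matching request".
        value = matching[-1]["body"] if matching else ""
    else:
        raise ValueError(f"unsupported field in this sketch: {field}")

    if "equals" in assertion:
        return value == assertion["equals"]
    if "contains" in assertion:
        return assertion["contains"] in value
    raise ValueError("assertion needs equals or contains")
```

With a log holding one POST to /v1/charge, the charged_once assertions above would both evaluate to True under these assumptions.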

Wait conditions

wait_for is a shell command that runs inside the container every second, up to 60 seconds. The service is considered ready when it exits 0.

| Service type | Recommended wait_for |
| --- | --- |
| Postgres | pg_isready -U postgres |
| MySQL | mysqladmin ping -h localhost -u root |
| Redis | redis-cli ping |
| MongoDB | mongosh --eval "db.adminCommand({ping:1})" |
| Elasticsearch | curl -sf http://localhost:9200/_cluster/health |
| Kafka | kafka-topics --bootstrap-server localhost:9092 --list |
| Built-in http_mock | (none needed; listens immediately) |

If wait_for doesn't pass within 60s, sandbox creation fails with service <name> not ready.
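
The polling behaviour described above (retry every second, give up after 60 seconds) looks roughly like the Python sketch below. It is an illustration under assumptions, not Keystone's implementation: wait_until_ready is a hypothetical name, and the default runner executes the command in a local shell, whereas Keystone runs it inside the service container.

```python
import subprocess
import time


def wait_until_ready(command, timeout=60.0, interval=1.0, run=None, sleep=time.sleep):
    """Re-run command until it exits 0 (True) or the timeout elapses (False)."""
    if run is None:
        # Stand-in runner: a local shell instead of an exec inside the container.
        def run(cmd):
            return subprocess.run(cmd, shell=True, capture_output=True).returncode

    deadline = time.monotonic() + timeout
    while True:
        if run(command) == 0:
            return True  # exit code 0 gates readiness
        if time.monotonic() + interval > deadline:
            return False  # the next attempt would fall past the deadline
        sleep(interval)
```

A caller could turn a False result into the failure described above, for example raise RuntimeError(f"service {name} not ready"). The run and sleep parameters exist only so the loop can be exercised without real processes or delays.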

Image caching

Keystone pulls images on demand, then caches them. If 100 sandboxes use postgres:16, the image is pulled once and started 100 times, so pull time is paid only once for widely used images.

For tags that can move, the server follows Docker's pull policy: :latest is checked against the registry, while pinned tags (:16, :7-alpine) are served from the cache.

Patterns

Database + cache + mock API

The "real-world web app" shape:

services:
  - name: db
    image: postgres:16
    env:
      POSTGRES_PASSWORD: "{{ secrets.DB_PASS }}"
      POSTGRES_DB: app
    ports: [5432]
    wait_for: "pg_isready"
 
  - name: cache
    image: redis:7
    ports: [6379]
    wait_for: "redis-cli ping"
 
  - name: stripe-mock
    type: http_mock
    record: true
    default_response: 404
    routes:
      - method: POST
        path: /v1/payment_intents
        response: '{"id":"pi_test","status":"succeeded"}'
        status: 200
 
network:
  dns_overrides:
    api.stripe.com: stripe-mock.services.internal   # redirect real Stripe to mock

Multi-database fixture

services:
  - name: db_orders
    image: postgres:16
    env: { POSTGRES_DB: orders, POSTGRES_PASSWORD: "{{ secrets.DB_PASS }}" }
    ports: [5432]
    wait_for: "pg_isready"
 
  - name: db_inventory
    image: postgres:16
    env: { POSTGRES_DB: inventory, POSTGRES_PASSWORD: "{{ secrets.DB_PASS }}" }
    ports: [5432]
    wait_for: "pg_isready"
 
fixtures:
  - { type: sql, service: db_orders,   sql: "CREATE TABLE orders ..." }
  - { type: sql, service: db_inventory, sql: "CREATE TABLE items ..." }

Two separate Postgres instances, each on its own DNS name — both listen on :5432 internally with no host port collision.

Limits

| Limit | Default | Configurable on self-hosted? |
| --- | --- | --- |
| Services per sandbox | 16 | Yes (sandbox.max_services) |
| wait_for timeout | 60s | Yes (services.wait_timeout) |
| Image size | 10 GB | Yes (Docker daemon config) |
| Concurrent service container starts | 5 | No |

Troubleshooting

"service <name> not ready": wait_for didn't exit 0 within 60s. Check that the command suits the image and that the service is actually starting (look at the audit log or stream /v1/sandboxes/:id/events).

"image <name> not found": the registry doesn't have the image, or your server can't reach the registry. Try pulling the image manually on the server first.

"port already in use": you tried to publish a port to the host with -p host:container syntax. Declare ports: [<container-port>] instead; Keystone handles the rest.

Agent can't connect to service: verify that the hostname in the connection string exactly matches the service name. name: my-db means the agent connects to my-db:<port>, not mydb or my_db.