Desktop Sandboxes

Real browser sandboxes that run your app end-to-end for visual validation and QA.

Desktop Sandboxes give the agent an isolated virtual desktop with a real browser. The agent can navigate your app, interact with UI elements, take screenshots, and run E2E tests — all within a sandboxed environment that streams live to your dashboard.

How It Works

Each Grind mode session runs inside an isolated sandbox with:

  • A full Linux desktop environment
  • A real browser (Chromium) for web interaction
  • Git, build tools, and shell access
  • Your repository pre-cloned and ready

The sandbox desktop streams to the Desktop tab in the session panel via VNC, so you can watch the agent work in real time.

Viewing the Desktop

When a Grind session is running, click the Desktop tab in the sandbox panel to see the agent's live desktop.

State             What You See
Active session    Live desktop stream with interactive controls
Connecting        Loading indicator while the VNC connection establishes
Session complete  Desktop remains viewable for 1 hour after the session stops
Recording         Video playback of the full session for review
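
The one-hour viewing window in the table amounts to a simple time check. The sketch below is illustrative only: `desktop_viewable` and `RETENTION` are hypothetical names, not part of the product's API, and it assumes the window is measured from the moment the session stops.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: the 1-hour window from the table, measured from session stop.
RETENTION = timedelta(hours=1)


def desktop_viewable(stopped_at: Optional[datetime], now: datetime) -> bool:
    """Return True if the desktop should still be viewable (hypothetical helper)."""
    if stopped_at is None:
        # Session is still running, so the live stream is available.
        return True
    return now - stopped_at <= RETENTION


stopped = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(desktop_viewable(stopped, stopped + timedelta(minutes=30)))
print(desktop_viewable(stopped, stopped + timedelta(hours=2)))
```

After the window closes, the table suggests the session recording is what remains available for review.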

Taking Control

You can interact with the sandbox directly:

  1. Click Take control to enter interactive mode (fullscreen)
  2. Use your mouse and keyboard to navigate, click, and type in the sandbox
  3. Click Hand back control or press Esc to return to view-only mode

This is useful for debugging, manually testing something the agent built, or guiding the agent through a tricky UI flow.

Browser Tools

When browser support is enabled, the agent has access to these tools:

Tool       Description
Navigate   Go to a URL in the browser
Click      Click an element on the page
Type       Type text into the focused element
Fill       Fill a form input field
Scroll     Scroll the page viewport
Get State  Read the current page DOM and take a screenshot
Close      Close the browser tab

The agent uses these tools to interact with web applications visually — verifying UI behavior, testing user flows, and validating that code changes work correctly in a real browser.
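
To make the tool list concrete, here is a minimal sketch of how a login check might be expressed as a sequence of these tools. The tool names come from the table above; everything else is an assumption for illustration, including the `ToolCall` structure, the argument names (`url`, `selector`, `value`), the CSS selectors, and the local URL. The agent's actual call format is not documented here.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Tool names taken from the table above; the call format is an assumption.
TOOLS = {"Navigate", "Click", "Type", "Fill", "Scroll", "Get State", "Close"}


@dataclass
class ToolCall:
    tool: str
    args: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject anything that is not one of the documented tools.
        if self.tool not in TOOLS:
            raise ValueError(f"unknown browser tool: {self.tool}")


def login_flow_plan(base_url: str) -> List[ToolCall]:
    """Hypothetical tool sequence for checking a login form end-to-end."""
    return [
        ToolCall("Navigate", {"url": f"{base_url}/login"}),
        ToolCall("Fill", {"selector": "#email", "value": "user@example.com"}),
        ToolCall("Fill", {"selector": "#password", "value": "not-a-real-password"}),
        ToolCall("Click", {"selector": "button[type=submit]"}),
        ToolCall("Get State"),  # read the DOM and screenshot to verify the result
        ToolCall("Close"),
    ]


for step in login_flow_plan("http://localhost:3000"):
    print(step.tool, step.args)
```

The Get State step is where verification would happen: the DOM and screenshot it returns are what the agent inspects to decide whether the flow succeeded.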

Use Cases

  • E2E validation — Agent builds a feature, then opens the app in the browser to verify it works
  • UI bug fixes — Agent reproduces a visual bug, fixes it, and confirms the fix in the browser
  • Form testing — Agent fills out forms, submits them, and checks responses
  • Screenshot comparison — Agent captures before/after screenshots for visual regression
  • Full-stack QA — Agent starts a dev server, runs the app, and tests end-to-end

Enabling Browser Support

Browser support is automatically available in Grind mode sessions. For CLI usage, add the --browser flag:

paragon --lra "test the login flow end-to-end" --browser
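
If you launch sessions from a script, the invocation above can be assembled programmatically. This is a sketch under stated assumptions: `build_paragon_command` is a hypothetical helper, and it uses only the `--lra` and `--browser` flags shown in the command above. It builds and prints the command rather than running it, so it works without the CLI installed.

```python
import shlex
from typing import List


def build_paragon_command(task: str, browser: bool = True) -> List[str]:
    """Build the argv for the CLI invocation shown above (hypothetical helper)."""
    argv = ["paragon", "--lra", task]
    if browser:
        # Flag taken from the documented command; enables browser support.
        argv.append("--browser")
    return argv


argv = build_paragon_command("test the login flow end-to-end")
print(shlex.join(argv))

# To actually run it (requires the paragon CLI on PATH):
#   import subprocess
#   subprocess.run(argv, check=True)
```

Passing the task as a single list element avoids shell quoting issues when the description contains spaces or quotes.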
