- contact@verticalserve.com

Manually maintained test suites need three people: a QA engineer who writes Playwright, a developer who fixes the suite every time the app changes, and an on-call engineer who guesses whether a failure is in the test or in the app. The bench shoulders most of that work.
Most QA managers know the app inside-out but don't want to live inside a Playwright codebase. The bench is the surface they actually want: paste a brief, review the auto-generated feature tree, chat-edit cases, run on demand, review failures with RCA. No code, no CI YAML, no waiting for a dev to land your test PR.
When a regression run goes red, the dev wants to know: is this my problem or the test's problem? The bench's RCA agent reads the case spec, the per-step execution log, the page observations, and the error, then tells you which it is. The verdict comes with a suggested fix and a confidence score, so you can prioritize.
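As a rough illustration of how a verdict plus a confidence score could drive prioritization, the sketch below orders failed cases so that likely app defects come first, highest confidence at the top. The `RcaVerdict` shape, its field names, the `"app"`/`"test"` labels, and the sample data are all assumptions made for this example, not the bench's actual output schema.

```python
from dataclasses import dataclass


@dataclass
class RcaVerdict:
    # Hypothetical shape; the real RCA output format is not shown on this page.
    case_id: str
    blame: str          # assumed labels: "app" or "test"
    confidence: float   # assumed range: 0.0 to 1.0
    suggested_fix: str


def triage(verdicts):
    """Order failures: app defects before test defects, then by descending confidence."""
    return sorted(verdicts, key=lambda v: (v.blame != "app", -v.confidence))


# Made-up sample verdicts for illustration only.
verdicts = [
    RcaVerdict("checkout-03", "test", 0.90, "Raise the wait before asserting on the cards"),
    RcaVerdict("login-01", "app", 0.60, "Check session cookie handling"),
    RcaVerdict("cart-07", "app", 0.95, "Investigate the failing cart request"),
]
ordered = [v.case_id for v in triage(verdicts)]
```

Sorting on a tuple keeps the rule readable: the first element separates app defects from test defects, the second ranks within each group.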
wait_ms to 4000 — KPI cards take longer to render"
The bench schedules regression runs on the cadence you pick: every hour, daily at 02:00, whatever. The webhook fires on every run, only on failures, or only on regressions (a case that was passing yesterday and is failing now), with the RCA agent's verdict already attached. So your on-call only gets paged when something actually broke.
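The three webhook modes can be sketched as a small decision function. This is a minimal sketch under stated assumptions: the policy names (`"always"`, `"on_failure"`, `"on_regression"`) and the `"pass"`/`"fail"` status strings are hypothetical and chosen for the example, not taken from the bench's configuration.

```python
def is_regression(previous_status, current_status):
    """A regression: the case passed in the previous run and fails in this one."""
    return previous_status == "pass" and current_status == "fail"


def should_notify(policy, previous, current):
    """Decide whether a finished run should fire the webhook.

    policy: "always", "on_failure", or "on_regression" (hypothetical names).
    previous, current: dicts mapping case id -> "pass" or "fail".
    """
    if policy == "always":
        return True
    if policy == "on_failure":
        return any(status == "fail" for status in current.values())
    if policy == "on_regression":
        return any(
            is_regression(previous.get(case_id), status)
            for case_id, status in current.items()
        )
    raise ValueError(f"unknown policy: {policy}")


# Made-up run results for illustration.
yesterday = {"login": "pass", "search": "fail", "checkout": "pass"}
today_same = {"login": "pass", "search": "fail", "checkout": "pass"}
today_broken = {"login": "pass", "search": "fail", "checkout": "fail"}
```

With these inputs, a case that was already failing yesterday (`search`) triggers `on_failure` but not `on_regression`; only the newly broken `checkout` case counts as a regression.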
Platform engineering, IT, security. Install the bench via Helm (k8s) or docker-compose (single VM). Wire up Okta, your S3, your Postgres, your model endpoints. Add worker boxes as you scale. Standard ops tooling all the way down — no proprietary control plane to learn.
Anything with a browser-reachable UI. A few patterns the bench handles particularly well:
Multi-page apps with sidebars, tabbed workbenches, and complex filtering. The crawl traverses menu trees and the bootstrap builds a feature plan per major surface.
Form-heavy flows where every field has validation rules and every step has a happy path + several error paths. Profile sweeps generate the negative cases automatically.
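One way negative cases could be derived from validation rules is to emit one invalid input per declared rule. The sketch below assumes a hypothetical field description (`required`, `max_length`, `min`, `max`); it illustrates the general idea only and is not how the bench's profile sweeps are implemented.

```python
def negative_cases(field):
    """Derive invalid inputs from a field's declared validation rules.

    Returns (field name, invalid value, reason) tuples. The rule keys
    are hypothetical and exist only for this example.
    """
    cases = []
    if field.get("required"):
        cases.append((field["name"], "", "empty required field"))
    if "max_length" in field:
        too_long = "x" * (field["max_length"] + 1)
        cases.append((field["name"], too_long, "exceeds max length"))
    if "min" in field:
        cases.append((field["name"], field["min"] - 1, "below minimum"))
    if "max" in field:
        cases.append((field["name"], field["max"] + 1, "above maximum"))
    return cases


# Made-up form description for illustration.
form = [
    {"name": "email", "required": True, "max_length": 5},
    {"name": "quantity", "min": 1, "max": 10},
]
sweep = [case for field in form for case in negative_cases(field)]
```

Each rule contributes exactly one error-path case, so the sweep grows with the number of rules rather than with hand-written tests.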
Multi-step funnels with state that persists across pages. Session-mode test execution keeps the login + cart state across cases the way a real user would.
Same suite, different env. Per-env URLs, login URLs, and persona credentials. Compare a staging run to the last green production run before shipping.
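A per-environment setup of this kind might look like the following sketch. The `EnvConfig` class, the `example.com` URLs, and the secret names are placeholders invented for illustration; the bench's real configuration format is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvConfig:
    # Hypothetical structure mirroring the three per-env settings named above.
    base_url: str
    login_url: str
    persona_credentials: dict  # persona name -> name of the secret holding its login


ENVS = {
    "staging": EnvConfig(
        base_url="https://staging.example.com",
        login_url="https://staging.example.com/login",
        persona_credentials={"manager": "STAGING_MANAGER_SECRET"},
    ),
    "production": EnvConfig(
        base_url="https://www.example.com",
        login_url="https://www.example.com/login",
        persona_credentials={"manager": "PROD_MANAGER_SECRET"},
    ),
}


def resolve(env_name, path):
    """Turn an env-independent case path into a full URL for the chosen environment."""
    base = ENVS[env_name].base_url.rstrip("/")
    return base + "/" + path.lstrip("/")
```

Because cases refer only to relative paths and persona names, the same suite can run against either entry in `ENVS` without edits.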
Bootstrap auto-generates a Responsive plan that replays the same UX cases at each viewport breakpoint. Captures screenshots at every viewport for diff review.
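Replaying the same cases at each breakpoint amounts to a cross product of cases and viewports. The sketch below shows that expansion; the breakpoint sizes, case ids, and screenshot naming scheme are assumptions for the example rather than the bench's defaults.

```python
from itertools import product

# Assumed breakpoints (width, height) in CSS pixels; not the bench's actual defaults.
VIEWPORTS = {
    "mobile": (375, 667),
    "tablet": (768, 1024),
    "desktop": (1440, 900),
}


def responsive_plan(case_ids, viewports=VIEWPORTS):
    """Expand each UX case into one run per viewport, with a screenshot name for diffing."""
    return [
        {
            "case": case_id,
            "viewport": name,
            "width": width,
            "height": height,
            "screenshot": f"{case_id}__{name}.png",
        }
        for case_id, (name, (width, height)) in product(case_ids, viewports.items())
    ]


# Hypothetical case ids for illustration.
plan = responsive_plan(["nav-menu", "filter-panel"])
```

Two cases and three viewports yield six runs, each with a predictable screenshot name so the same case can be compared across breakpoints.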
Bootstrap also produces a UX-audit plan. Each case captures the page at every viewport and the executor's summary step renders a per-page critique: layout, clarity, accessibility, info density.
All bench primitives are open §15 bundles — see insightworker-app-samples for the source.
InsightTestBench runs in your environment, on your network, against your apps. Self-host in one command. Talk to us if you want help wiring it to your CI.
Self-hosted • Multi-env • Vision-grounded • Self-explaining failures • Scheduled regression watchdog