| name | creating-replay-vision-scanners |
| description | Guides agents through creating and safely sizing a Replay Vision scanner: choosing the scanner type (monitor/classifier/scorer/summarizer), shaping the RecordingsQuery that selects sessions, and — crucially — estimating observation volume and checking the org's monthly quota before creating, so a broad scanner doesn't exhaust the budget on its first scheduled sweep.
TRIGGER when: user asks to create, set up, or configure a Replay Vision scanner, OR when you are about to call vision-scanners-create, OR when widening an existing scanner's query or sampling_rate via vision-scanners-update.
DO NOT TRIGGER when: only reading scanners or observations, deleting a scanner, or running an existing scanner against a single session on demand (vision-scanners-scan-session). |
Creating Replay Vision scanners
A scanner is a standing LLM probe over session recordings. Once created and enabled, it runs on a
Temporal schedule that sweeps every 5 minutes, applying its prompt to each new matching recording and
recording the result as an observation (a queryable $recording_observed event). Each observation counts
against a monthly org quota (a fixed number of observations per calendar month).
That schedule is exactly why creation needs a gut-check: a scanner with a permissive query and full sampling
starts consuming quota automatically and can drain the whole month's budget within its first few sweeps.
Creation itself does not check quota — that protection only kicks in at observation time, by which point
the budget may already be gone.
Core principle: size before you ship
Never create an enabled scanner blind. Estimate its volume, check remaining quota, and — when the projected
volume is a meaningful fraction of what's left — show the user the numbers and get confirmation before
creating. This is the heart of the skill; the rest is supporting detail.
The flow
Step 1: What should the scanner do?
Pick a scanner_type and write its scanner_config. Every type needs a prompt; the rest is type-specific:
| Type | What it produces | scanner_config shape |
|---|
monitor | Open-ended observation against a prompt (e.g. "flag rage clicks") | {"prompt": "..."} |
classifier | Assigns tags from a fixed label set | {"prompt": "...", "tags": ["tag-a", "tag-b"]} — tags needs ≥1 entry; optional "multi_label": true, "allow_freeform_tags": false |
scorer | Numeric score on a rubric | {"prompt": "...", "scale": {"min": 1, "max": 5, "label": "frustration"}} — min < max; label optional |
summarizer | Free-text summary; optional facet embeddings for search | {"prompt": "..."}; optional "length": "short" | "medium" | "long" (default "medium"), "emits_embeddings": false |
scanner_type is locked after creation — to change it you delete and recreate, so confirm the type is
right up front, and get the scanner_config shape right (a wrong shape is a create error, not a silent
default).
If the user's intent makes the type and prompt obvious, just proceed — don't interrogate them.
Step 2: Which sessions?
The query is a RecordingsQuery shape that selects which recordings the scanner watches. date_from and
date_to are ignored (the schedule controls time), so don't bother setting them. Narrow the query to the
sessions that actually matter — by event, URL, person property, duration, etc. A narrow query is the single
biggest lever on cost.
sampling_rate (0..1, default 1.0) is a random downsample applied after the query matches. Lower it to
trade coverage for budget.
Step 3: Size it — the gut-check (do not skip)
Before creating, run both checks and reason about them together:
- Estimate volume — call
vision-scanners-estimate-create with the proposed query + sampling_rate.
It returns matched_sessions_in_window, the window_days measured, and
estimated_observations_per_month.
- Check budget — call
vision-quota-retrieve for remaining and exhausted against the org's monthly
monthly_quota.
Then decide:
- If
estimated_observations_per_month comfortably fits within remaining, proceed.
- If it's a large fraction of (or exceeds)
remaining, stop and tell the user the concrete numbers
— e.g. "This scanner is projected to produce ~X observations/month; you have Y of Z left this month." —
and confirm before creating, or suggest tightening the query or lowering sampling_rate first.
- If the org is already
exhausted, say so — a new enabled scanner won't produce anything until the quota
resets, and its observations will be silently skipped.
Confirmation here is a conversation step, not an API capability — surface the trade-off and let the user
choose. When the projected volume is clearly small relative to the budget, you don't need to ask.
Step 4: Create
Call vision-scanners-create. Minimal example:
{
"name": "Rage click monitor",
"scanner_type": "monitor",
"scanner_config": { "prompt": "Flag sessions where the user repeatedly clicks the same element in frustration." },
"query": { "kind": "RecordingsQuery", "events": [{ "id": "$rageclick", "type": "events" }] },
"sampling_rate": 1.0,
"model": "gemini-3-flash-preview",
"enabled": true
}
name must be unique within the team. Set enabled: false if the user wants to create it paused (no
schedule, no quota consumption) and turn it on later.
After creation
- Show the scanner's PostHog URL from the response so the user can review it in the UI.
- Results take a few minutes to appear (rasterizing the recording to video + the LLM call are slow). Inspect
them with
vision-scanners-observations-list for one scanner over time, or vision-observations-list
(requires session_id) for every scanner's findings on a single session. To dig into a recording, hand off
to the investigating-replay skill.
Updating an existing scanner
vision-scanners-update is a partial update — send only changed fields. Re-run the Step 3 gut-check
whenever you widen scope: a broader query or a higher sampling_rate raises the sweep volume just like a
fresh broad scanner would. Toggling enabled, tweaking the prompt, or narrowing the query don't need a
re-estimate. Editing config bumps scanner_version; past observations keep a snapshot of the old config.
Gotchas
- One observation per (scanner, session). Re-running a scanner on a session it already observed — even a
failed or ineligible one — is a no-op and won't produce a fresh scan.
- Ineligible ≠ failed. Observations can land
ineligible (e.g. too_short, no_recording) — a terminal
non-error outcome. Check error_reason when triaging why a scanner produced nothing.
- Provider/model are Google/Gemini only in the current version.