| name | builder-smoke-test |
| description | Smoke test the Agent Builder feature branch end-to-end against a hermetic project scaffolded by the skill (linked to the current worktree). Covers workspace reconciliation, stored agents/skills CRUD, ownership, visibility, stars, registry/library Copy flow, picker allowlists, model policy, RBAC role gating, role impersonation UI, builder defaults, infrastructure diagnostics, channels, and Studio + Agent Builder UI. Trigger when validating the agent-builder feature branch, PRs that touch packages/server, packages/playground, packages/playground-ui agent-builder routes, or builder EE code paths. |
Builder Smoke Test
End-to-end smoke testing of the Agent Builder feature set against a hermetic project the skill scaffolds at ~/mastra-builder-smoke-tests/builder-smoke (configurable). The project links to the current worktree via pnpm link: overrides, so changes to packages under packages/, stores/, auth/, channels/, observability/, browser/, and client-sdks/ take effect on the next mastra dev restart.
This skill is for branch QA — it complements the release-time mastra-smoke-test. It exercises the Builder EE surface (stored entities, RBAC, registry, infra, channels) using a minimal, predictable project rather than the kitchen-sink examples/agent.
⚠️ Mandatory Test Checklist
Use task_write to track progress. Run ALL sections unless --test or --scope narrows the run.
Do not skip sections unless you hit an actual blocker. "Seemed complex" or "I'll come back to it" are not valid reasons. Attempt every step — only stop when you literally cannot proceed. Report what you tried and what blocked you.
| # | Section | Reference | When required |
|---|---|---|---|
| 1 | Setup | references/setup.md | Always |
| 2 | Workspace | references/workspace.md | --test workspace or full |
| 3 | Reconciliation | references/reconciliation.md | Steps 1 + 5 only; steps 2/3/4/6 are out of smoke-test scope (see below) |
| 4 | Defaults | references/defaults.md | --test defaults or full |
| 5 | Model Policy | references/model-policy.md | --test model-policy or full |
| 6 | Skills | references/skills.md | --test skills or full |
| 7 | Registry | references/registry.md | --test registry or full |
| 8 | Agents | references/agents.md | --test agents or full |
| 9 | Picker Allowlists | references/picker-allowlist.md | --test pickers or full |
| 10 | Stars | references/stars.md | --test stars or full |
| 11 | Permissions / RBAC | references/permissions.md | --test permissions or full |
| 12 | Infrastructure | references/infrastructure.md | --test infrastructure or full |
| 13 | Channels | references/channels.md | --test channels or full |
| 14 | UI | references/ui.md | --test ui or full |
| 15 | Auth | references/auth.md | --test auth or --auth on |
Execution flow
- Confirm the project directory. Before scaffolding, ask the user where they want $PROJECT_DIR to live. Offer the default (~/mastra-builder-smoke-tests/builder-smoke) as a suggestion. Skip the question if they already passed --dir or have $BUILDER_SMOKE_TEST_DIR exported. See references/setup.md step 0.
- Read the reference file for each section you're about to run.
- Under --auth on, extract the session cookie before running any other section. The WorkOS cookie is httpOnly, so curl cannot mint it and document.cookie cannot read it. The scaffold ships a debug route at GET /smoke-test/cookie gated by SMOKE_TEST_COOKIE_LEAK=1. Follow the "Extracting the session cookie for curl (auth on)" section below before touching any auth-on endpoint. Do not pivot to UI-only testing because curl is "blocked": the cookie route is the unblock path.
- Seed non-owner data after the server has booted at least once. A fresh scaffold has no skills authored by anyone other than the test user, which makes non-owner / Library Copy / non-owner visibility / non-admin stars flows untestable. Run bash .claude/skills/builder-smoke-test/scripts/seed-multi-user.sh (or with --dir $PROJECT_DIR) before sections 6 (Skills), 7 (Registry), and 10 (Stars). The script is idempotent and bypasses RBAC by writing directly to libsql, so it works regardless of --auth mode or current role. Do not mark non-owner steps as "blocked" without running this first.
- Execute the steps: use curl for API checks (with -H "Cookie: $COOKIE" under --auth on) and whichever browser tool the harness has wired up (Stagehand, Chrome MCP, etc.) for UI checks.
- Record results in the summary table.
- Mark the section complete with task_write before moving to the next.
Partial testing (--test)
If --test is provided:
- Always run Setup.
- Run only the specified section(s).
- Skip everything else.
Example: --test skills,registry,agents → Setup + Skills + Registry + Agents.
Scope shortcuts (--scope)
--scope runs a curated group of related sections. Setup is always implied.
| Scope | Includes |
|---|---|
| rbac | permissions, auth |
| skills | skills, registry, defaults |
| agents | agents, pickers, defaults, model-policy |
| infra | infrastructure, channels, reconciliation |
| ui | ui |
| quick | workspace, skills, agents, stars, ui (skips long-running) |
--scope and --test can be combined; the union is run.
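The selection rules above can be sketched as a small resolver. This is a hypothetical helper, not one of the skill's scripts: the names `scope_sections` and `resolve_sections` are assumptions, while the scope contents and the canonical order are copied from the tables in this file. It does not model the --auth gating of the Auth section.

```shell
# Section names in canonical order (section table, 1 -> 15).
CANONICAL="setup workspace reconciliation defaults model-policy skills registry agents pickers stars permissions infrastructure channels ui auth"

# Map a --scope name to its sections (hypothetical helper).
scope_sections() {
  case "$1" in
    rbac)   echo "permissions auth" ;;
    skills) echo "skills registry defaults" ;;
    agents) echo "agents pickers defaults model-policy" ;;
    infra)  echo "infrastructure channels reconciliation" ;;
    ui)     echo "ui" ;;
    quick)  echo "workspace skills agents stars ui" ;;
    "")     echo "" ;;
    *)      echo "unknown scope: $1" >&2; return 1 ;;
  esac
}

# resolve_sections SCOPE TEST_CSV -> union of both, plus setup, in canonical order.
resolve_sections() {
  scope="$1"
  tests="$(printf '%s' "$2" | tr ',' ' ')"
  if [ -z "$scope" ] && [ -z "$tests" ]; then
    echo "$CANONICAL"   # nothing narrows the run: full run
    return 0
  fi
  from_scope="$(scope_sections "$scope")" || return 1
  wanted=" setup $from_scope $tests "
  out=""
  for name in $CANONICAL; do
    case "$wanted" in
      *" $name "*) out="$out $name" ;;
    esac
  done
  echo "${out# }"
}
```

For example, `resolve_sections skills "agents,stars"` prints the union with Setup prepended and the canonical order preserved.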
Usage
/builder-smoke-test
/builder-smoke-test --test workspace,skills
/builder-smoke-test --test agents,stars
/builder-smoke-test --test reconciliation
/builder-smoke-test --test ui
/builder-smoke-test --scope rbac
/builder-smoke-test --scope skills
/builder-smoke-test --scope quick
/builder-smoke-test --auth on
/builder-smoke-test --auth off
/builder-smoke-test --auth on --role viewer
/builder-smoke-test --auth on --role member
/builder-smoke-test --skip-browser
Parameters
| Parameter | Description | Default |
|---|---|---|
| --test | Comma-separated section names (see table above). | (all sections) |
| --scope | Named group of sections (rbac, skills, agents, infra, ui, quick). Combinable with --test. | (none) |
| --auth | on, off, or auto. auto enables the Auth section iff WORKOS_CLIENT_ID + WORKOS_API_KEY are set. | auto |
| --role | Expected role of the logged-in user under --auth on: owner, admin, member, or viewer. Setup asserts the live /api/auth/me roles match; on mismatch the run stops and the user is told to either change their WorkOS role or re-run with the correct --role. Ignored under --auth off. | admin |
| --clean | Delete test entities (smoke-test workspaces / agents / skills) at the end of each section. | false |
| --skip-browser | Run only API/curl checks. UI section is skipped. | false |
| --dir | Project directory the skill scaffolds into. Forwarded to scripts/scaffold.sh. Also reads $BUILDER_SMOKE_TEST_DIR from the environment when the flag is omitted. | ~/mastra-builder-smoke-tests/builder-smoke |
| --reuse | If the project already exists at $PROJECT_DIR and has node_modules/@mastra/core, skip pnpm install. Forwarded to scripts/scaffold.sh. | false |
| --openai-key | OPENAI_API_KEY value to write into the scaffolded .env. If omitted, the scaffold script falls back to $OPENAI_API_KEY in the shell, then to an interactive prompt. | (shell or prompt) |
| --workos-api-key, --workos-client-id, --workos-organization-id | All three are required together to scaffold an auth-on project. Writes AUTH_PROVIDER=workos plus the three keys plus WORKOS_REDIRECT_URI=http://localhost:4111/api/auth/callback into .env. | (auth off) |
If --auth auto and no WorkOS env vars are present, the Auth section is auto-skipped and reported as ⏭️ Skipped (no WORKOS_* env vars).
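A minimal sketch of the --auth resolution described above. The function name `resolve_auth_mode` and the printed labels are assumptions for illustration; only the rule itself (auto turns on iff WORKOS_CLIENT_ID and WORKOS_API_KEY are both set) comes from the parameter table.

```shell
# resolve_auth_mode VALUE -> prints "on", "off", or "auto-skipped" (hypothetical helper).
resolve_auth_mode() {
  case "${1:-auto}" in
    on|off)
      echo "$1"
      ;;
    auto)
      # auto enables the Auth section only when both WorkOS vars are non-empty.
      if [ -n "${WORKOS_CLIENT_ID:-}" ] && [ -n "${WORKOS_API_KEY:-}" ]; then
        echo "on"
      else
        echo "auto-skipped"
      fi
      ;;
    *)
      echo "bad --auth value: $1" >&2
      return 2
      ;;
  esac
}
```

When the result is `auto-skipped`, the Auth row in the report should carry the ⏭️ Skipped (no WORKOS_* env vars) note rather than being omitted.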
Canonical order
When running multiple sections, execute them in the order shown in the
section table (1 → 15). The order is intentional:
- Setup must run first — preflight + readiness probe gate every later
section.
- Workspace / Reconciliation / Defaults / Model Policy establish that
the server's view of the project matches what the rest of the run
assumes. Run them before any CRUD pass.
- Skills → Registry → Agents → Pickers → Stars is a build-up: agents
reference skills, pickers depend on the entities created above.
- Permissions / Infrastructure / Channels / UI are read-mostly
inspections that benefit from existing entities.
- Auth runs last because it requires restarting mastra dev with a
different .env.
If --test or --scope narrows the run, keep the relative order — just
skip the sections that fall outside the selection.
Required vs optional reference tiers
References fall into three tiers; an agent should treat them
accordingly:
- Required (every run): setup.md. Any failure here blocks the rest
of the run.
- Standard (default tiers for full, quick, scope shortcuts):
workspace.md, skills.md, agents.md, stars.md, ui.md (core),
auth.md when --auth on.
- Extended (only when explicitly selected via --test/--scope or
the matching code surface changed): reconciliation.md,
defaults.md, model-policy.md, registry.md, picker-allowlist.md,
permissions.md, infrastructure.md, channels.md, ui.md extended
tier.
When skipping an extended section, mark it ⏭️ Skipped (not in scope)
in the result table — don't silently omit it.
Cleanup
The scaffold is a self-contained throwaway directory at $PROJECT_DIR. All
fixture state (workspaces, agents, skills, libsql DB, .mastra/workspace
files) lives inside it. The smoke test never writes to anything outside
$PROJECT_DIR (other than the dev server it runs).
At the end of every run:
- Stop the dev server (kill $(lsof -i :4111 -sTCP:LISTEN -t) or
foreground Ctrl-C).
- Choose how to dispose of fixture state:
  - Reuse: leave $PROJECT_DIR in place. The next run can pass
  --reuse (or --skip-scaffold to preflight) and pick up where this
  one left off. Fastest for iterating.
  - Reset: rm -rf "$PROJECT_DIR" (or re-run scripts/scaffold.sh
  without --reuse). Cheapest way to get back to a known-clean state.
  Don't bother with per-entity DELETE; the directory IS the state.
- If a section bailed mid-flight (assertion failure, network error),
record the partial state in the report's Issues section so the
next run knows what to expect.
Per-entity DELETE calls are only needed when a specific section
explicitly tests DELETE behavior (those sections include the DELETE step
inline). Otherwise the throwaway-directory model handles cleanup.
Never leave the dev server running on :4111 after the report is filed —
it blocks future runs.
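The stop step can be wrapped so it is safe to call when nothing is listening (a bare `kill $(lsof ...)` errors out on an empty PID list). This is a sketch with a hypothetical function name, `stop_dev_server`; the `lsof` flags are the ones used in the list above.

```shell
# Stop whatever is listening on the dev-server port (default 4111).
stop_dev_server() {
  port="${1:-4111}"
  # -t prints bare PIDs; an empty result means nothing is listening.
  pids="$(lsof -i ":$port" -sTCP:LISTEN -t 2>/dev/null || true)"
  if [ -z "$pids" ]; then
    echo "nothing listening on :$port"
    return 0
  fi
  # Intentionally unquoted: lsof may print several PIDs, one per line.
  kill $pids
}
```

Calling `stop_dev_server` with no argument targets :4111; pass a port only if the server fell through to a different one.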
Prerequisites
- Working tree on the agent-builder feature branch (or any branch you want to QA).
- pnpm (10.x) and node on $PATH. The scaffold uses pnpm install --ignore-workspace inside the project dir so the repo-level workspace doesn't interfere.
- An OPENAI_API_KEY. Supply via --openai-key, export OPENAI_API_KEY in the shell, or let the scaffold prompt for it.
- (Optional) WorkOS credentials for --auth on runs: --workos-api-key, --workos-client-id, --workos-organization-id.
- Whichever browser MCP/tool the harness has access to. If none is available, run with --skip-browser and report UI as ⏭️ Skipped (no browser tool).
Project layout (scaffolded for you)
$PROJECT_DIR/ ← see "Project dir resolution" below
├── package.json ← pnpm overrides → link:<worktree>/packages/*
├── tsconfig.json
├── .env ← OPENAI_API_KEY (+ AUTH_PROVIDER + WORKOS_* on auth-on)
└── src/mastra/
├── index.ts ← single Mastra instance, reads exported bindings from auth.ts
├── auth.ts ← top-level switch(process.env.AUTH_PROVIDER); no-op when unset
├── agents/index.ts ← weather-agent (gpt-4o-mini)
├── tools/index.ts ← weather-info tool
└── workflows/index.ts ← greet-workflow
The .env is the only thing that flips auth on/off — the same src/mastra/index.ts runs in both modes. Re-run scripts/scaffold.sh with or without --workos-* to switch.
Project dir resolution
$PROJECT_DIR is determined by every script (scaffold, preflight, wait-for-server) using this order:
1. --dir <path> flag
2. BUILDER_SMOKE_TEST_DIR env var (e.g. export BUILDER_SMOKE_TEST_DIR=~/code/builder-smoke)
3. ~/mastra-builder-smoke-tests/builder-smoke (default)
For a long-lived setup, exporting BUILDER_SMOKE_TEST_DIR once in your shell rc is the lowest-friction option — every script picks it up automatically.
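The resolution order can be sketched as follows. The function name `resolve_project_dir` is hypothetical and the real scripts may implement this differently; the precedence itself (flag, then env var, then default) is the one stated above.

```shell
# resolve_project_dir [DIR_FLAG_VALUE] -> prints the project dir to use.
resolve_project_dir() {
  dir_flag="${1:-}"
  if [ -n "$dir_flag" ]; then
    echo "$dir_flag"                                     # 1. --dir <path>
  elif [ -n "${BUILDER_SMOKE_TEST_DIR:-}" ]; then
    echo "$BUILDER_SMOKE_TEST_DIR"                       # 2. env var
  else
    echo "$HOME/mastra-builder-smoke-tests/builder-smoke"  # 3. default
  fi
}
```

A caller would typically run `PROJECT_DIR="$(resolve_project_dir "$dir_from_flag")"` once, where `dir_from_flag` is whatever its own argument parser captured for --dir.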
Running scripts (cwd matters)
All scripts under .claude/skills/builder-smoke-test/scripts/ resolve the worktree root from their own location. They can be invoked from anywhere, but conventionally the repo root.
| Script | Run from | Notes |
|---|---|---|
| scaffold.sh | anywhere | Creates / refreshes $PROJECT_DIR. Forwards --openai-key, --workos-*, --reuse, --dir. |
| preflight.sh | anywhere | Calls scaffold.sh then asserts the resulting .env matches --expect off\|on. |
| wait-for-server.sh | anywhere | Hits http://localhost:4111/api/agents. cwd doesn't matter. |
| seed-multi-user.sh | anywhere | Inserts two skills owned by user_seed_other (1 public + 1 private) into the scaffold's libsql DB so non-owner / Library Copy flows can be tested without a second WorkOS account. Server must have booted at least once first. Idempotent. |
Invoke them as bash .claude/skills/builder-smoke-test/scripts/<name>.sh. Don't cd into scripts/ first — relative path resolution will break.
pnpm mastra:dev must be run from $PROJECT_DIR (where the scaffolded package.json is).
How mastra dev reads env (important)
mastra dev loads $PROJECT_DIR/.env via dotenv and unconditionally overwrites process.env with whatever's there (packages/cli/src/commands/dev/dev.ts ~line 384). Practical consequences:
- .env is the source of truth for the running server. Inline overrides like AUTH_PROVIDER= pnpm mastra:dev are silently clobbered.
- Shell-only vars survive only if .env has no entry for the same key. Re-running scripts/scaffold.sh always overwrites .env, so to toggle modes, re-scaffold.
- The auth mode the server actually runs in is determined by .env alone. A globally exported AUTH_PROVIDER=workos in your shell does NOT enable WorkOS auth in the server if .env doesn't have it, but it WILL leak into anything else this process runs, which is its own kind of confusing. Preflight flags this case.
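To see which shell variables the server will not honor, one option is to list the keys that are set in both the shell and .env with different values. This is a sketch, not part of preflight: `shadowed_keys` is a hypothetical name, and it assumes plain unquoted KEY=value lines (dotenv's quoting and multiline rules are not modeled).

```shell
# shadowed_keys ENV_FILE -> prints each key whose shell value differs from the
# .env value. Per the rules above, the .env value is what the server will see.
shadowed_keys() {
  env_file="$1"
  # Extract the KEY part of every KEY=value line.
  sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1/p' "$env_file" | while read -r key; do
    # printenv exits non-zero when the key is not in the environment.
    if shell_val="$(printenv "$key")"; then
      file_val="$(sed -n "s/^$key=//p" "$env_file" | tail -n 1)"
      [ "$shell_val" = "$file_val" ] || echo "$key"
    fi
  done
}
```

Running `shadowed_keys "$PROJECT_DIR/.env"` before starting mastra dev gives a quick list of inline or exported overrides that will be silently replaced.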
Auth modes
Two states matter:
- auth off: AUTH_PROVIDER is absent (or blank) in $PROJECT_DIR/.env. No WorkOS, no RBAC, no FGA. This is the state for the auth-off run.
- auth on: AUTH_PROVIDER=workos plus WORKOS_API_KEY, WORKOS_CLIENT_ID, WORKOS_ORGANIZATION_ID all present in $PROJECT_DIR/.env. WorkOS authentication + role-based access + per-resource FGA all engage. This is the state for the auth-on runs. FGA is wired through the WorkOS auth provider, so it can't be disabled independently.
To switch modes, re-run the scaffold with or without the --workos-* flags; that's faster and safer than hand-editing .env.
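The two states can be checked directly from the file. The sketch below is an illustration under stated assumptions, not the logic preflight.sh actually runs: `env_value` and `detect_auth_mode` are hypothetical names, and values are assumed to be plain unquoted KEY=value lines.

```shell
# env_value KEY FILE -> last value assigned to KEY in FILE, or empty.
env_value() {
  sed -n "s/^$1=//p" "$2" | tail -n 1
}

# detect_auth_mode ENV_FILE -> prints "off" or "on"; returns 1 on a broken mix.
detect_auth_mode() {
  env_file="$1"
  [ -f "$env_file" ] || { echo "no .env at $env_file" >&2; return 1; }
  provider="$(env_value AUTH_PROVIDER "$env_file")"
  if [ -z "$provider" ]; then
    echo "off"          # absent or blank AUTH_PROVIDER
    return 0
  fi
  if [ "$provider" != "workos" ]; then
    echo "unexpected AUTH_PROVIDER: $provider" >&2
    return 1
  fi
  # auth on needs all three WorkOS keys to be present and non-blank.
  for key in WORKOS_API_KEY WORKOS_CLIENT_ID WORKOS_ORGANIZATION_ID; do
    if [ -z "$(env_value "$key" "$env_file")" ]; then
      echo "incomplete auth-on config: $key is missing or blank" >&2
      return 1
    fi
  done
  echo "on"
}
```

Usage: `detect_auth_mode "$PROJECT_DIR/.env"`. A non-zero exit corresponds to a half-configured file, which is a signal to re-run the scaffold rather than patch .env by hand.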
Detection: run preflight before each section
# Before an auth-off section
bash .claude/skills/builder-smoke-test/scripts/preflight.sh --expect off \
  --openai-key "$OPENAI_API_KEY"
# Before an auth-on section
bash .claude/skills/builder-smoke-test/scripts/preflight.sh --expect on \
  --openai-key "$OPENAI_API_KEY" \
  --workos-api-key "$WORKOS_API_KEY" \
  --workos-client-id "$WORKOS_CLIENT_ID" \
  --workos-organization-id "$WORKOS_ORGANIZATION_ID"
Preflight chains scaffold.sh followed by validation checks (project exists with node_modules/@mastra/core, $PROJECT_DIR/.env has OPENAI_API_KEY, optional WorkOS keys present when --expect on, and auth mode matches --expect). Each failure prints a stable error code; the error-code table further below tells the agent what to do.
Resolving missing env vars
If scaffold.sh or preflight.sh reports a missing OPENAI_API_KEY or WORKOS_* var, the agent must not silently source any rc file. Instead, work down this list and stop at the first one that resolves:
1. Check whether the var is already in the process env you can see (echo "${OPENAI_API_KEY:-<unset>}"). If yes, re-run scaffold with --openai-key "$OPENAI_API_KEY" (and equivalent for WorkOS).
2. Check whether the var is in $PROJECT_DIR/.env from a prior run (grep -E "^(OPENAI_API_KEY|WORKOS_)" "$PROJECT_DIR/.env" 2>/dev/null). If yes, you can pass --reuse to the next scaffold call.
3. If neither, look for rc files that exist on disk. Common candidates: ~/.zshrc, ~/.bashrc, ~/.zshenv, ~/.profile, ~/.env.global, and any project-local .env you find. Use ls -1 (or test -f) to confirm before listing; don't fabricate paths.
4. Ask the user in one message: "Can you paste the value(s), or give me permission to source one of these files?" Include the list of files that actually exist.
5. Only after the user explicitly approves a specific file, source it in a subshell and rerun preflight with the inherited env. Pattern:
   zsh -c 'source <approved-file> && bash .claude/skills/builder-smoke-test/scripts/preflight.sh --expect off --reuse'
   zsh -c 'source <approved-file> && bash .claude/skills/builder-smoke-test/scripts/preflight.sh --expect on --reuse'
   Use bash -c instead of zsh -c if the approved file is a bashrc.
6. Never write the secret value back into any rc file, never export it into the user's interactive shell, and never echo it back in chat in full. Refer to it as <your-openai-key> once you've used it.
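Two small helpers cover the discovery steps in the list above without exposing a secret. Both names (`existing_rc_files`, `var_status`) are hypothetical; the candidate paths are the ones listed above, and project-local .env files are left out because their locations are not known in advance.

```shell
# Print only the candidate rc files that actually exist on disk.
existing_rc_files() {
  for f in "$HOME/.zshrc" "$HOME/.bashrc" "$HOME/.zshenv" "$HOME/.profile" "$HOME/.env.global"; do
    if [ -f "$f" ]; then
      echo "$f"
    fi
  done
}

# Report whether an exported variable is non-empty without printing its value.
var_status() {
  if [ -n "$(printenv "$1")" ]; then
    echo "$1: set"
  else
    echo "$1: <unset>"
  fi
}
```

`var_status OPENAI_API_KEY` is a safer first check than echoing the variable itself, since the value never reaches the transcript.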
| Error code | What it means | What the agent should do |
|---|---|---|
| project-dir-missing | $PROJECT_DIR is unset or the directory does not exist (scaffold did not run, or was given a bad --dir). | Re-run preflight without --skip-scaffold, or pass an existing --dir <path> that scaffold has already populated. |
| scaffold-failed | scripts/scaffold.sh returned non-zero. | Re-run scaffold with --no-reuse to force a fresh install. Inspect the printed pnpm install output for the real error. |
| project-deps-missing | $PROJECT_DIR/node_modules/@mastra/core missing after scaffold. | Re-run scaffold without --reuse to force a fresh install. If that still fails, delete $PROJECT_DIR and re-run. |
| openai-key-missing-in-project-env | $PROJECT_DIR/.env has no usable OPENAI_API_KEY. | Follow the "Resolving missing env vars" section above. Re-run preflight with --openai-key <value> once you have it. |
| workos-keys-missing-in-project-env | --expect on but one or more of WORKOS_API_KEY / WORKOS_CLIENT_ID / WORKOS_ORGANIZATION_ID is absent or blank in .env. | Follow the "Resolving missing env vars" section above. Re-run preflight with all three --workos-* flags. |
| mode-mismatch | --expect disagrees with the auth mode detected from $PROJECT_DIR/.env. | Re-run the scaffold with (auth on) or without (auth off) --workos-* flags. The scaffold is idempotent for the parts that don't change. |
| bad-expect-value | --expect got something other than off or on. | Fix the invocation. (Parser also rejects flag-like values at parse time with exit 2.) |
.env policy: the scaffold owns $PROJECT_DIR/.env. Re-running scaffold overwrites it. Do not hand-edit the scaffolded .env; instead, re-run scaffold with different flags. (The skill never edits .env files outside $PROJECT_DIR.)
Extracting the session cookie for curl (auth on)
The WorkOS session cookie is httpOnly, so document.cookie and Stagehand's
extract cannot read it from a normal page. To hit authenticated endpoints
from curl after a browser SSO login, the scaffold exposes a tiny debug
route gated by an env var:
1. Add SMOKE_TEST_COOKIE_LEAK=1 to $PROJECT_DIR/.env (single line append; the scaffold leaves this var alone on re-run as long as the file already exists).
2. Restart mastra dev so the new env is picked up.
3. Sign in once in the Stagehand browser (stagehand_navigate to http://localhost:4111, complete WorkOS SSO).
4. From the same browser tab, navigate to http://localhost:4111/smoke-test/cookie and use stagehand_extract to read the page body. The page is a single text/plain line containing the request's Cookie header verbatim (e.g. wos_session=…).
5. Export it once: export COOKIE='<the-string-from-step-4>'. From here on, every authenticated curl is curl -H "Cookie: $COOKIE" "$BASE/…".
The route is only registered when SMOKE_TEST_COOKIE_LEAK=1 and is intentionally insecure — never enable it in a real project. The WORKOS_COOKIE_PASSWORD written by the scaffold is derived from $PROJECT_DIR, so the cookie value stays valid across mastra dev restarts within the same scaffold; you only need to repeat step 4 if you re-scaffold to a new directory.
/smoke-test/cookie returns 404? Always an env-ordering issue. The apiRoutes list is built once when mastra dev boots from process.env.SMOKE_TEST_COOKIE_LEAK. The flag has to be in .env before the boot — adding it after start has no effect until you restart. If you see a 404, run grep SMOKE_TEST_COOKIE_LEAK "$PROJECT_DIR/.env", then stop and restart mastra dev. Don't pivot to "UI only" because of this.
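The append and the 404 check can be combined into one idempotent helper. This is a sketch with a hypothetical name, `ensure_cookie_leak_flag`; it only edits the file it is given and does not restart the server.

```shell
# ensure_cookie_leak_flag ENV_FILE -> appends the flag only if it is missing.
ensure_cookie_leak_flag() {
  env_file="$1"
  if grep -q '^SMOKE_TEST_COOKIE_LEAK=1$' "$env_file" 2>/dev/null; then
    echo "already set"
  else
    # Leading newline guards against a file that lacks a trailing newline.
    printf '\nSMOKE_TEST_COOKIE_LEAK=1\n' >> "$env_file"
    echo "appended; restart mastra dev"
  fi
}
```

If it prints `already set` and the route still returns 404, the server was most likely started before the flag was written, so a restart is the next step.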
Seeding non-owner skills (Library Copy / non-owner flows)
A fresh scaffold has zero skills, and everything created through the API
is owned by either the auth-off "no caller" (no authorId) or the
currently signed-in user under auth-on. To exercise flows that require a
skill owned by someone else (Library Copy, non-owner read-only view,
private-skill visibility from a non-owner) without provisioning a second
WorkOS account, run the seed script after the server has booted at least
once:
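# Terminal 1: start the dev server from the scaffolded project (this command blocks)
cd "$PROJECT_DIR"
pnpm mastra:dev
# Terminal 2: once the server has booted, run the seed script from the repo root
bash .claude/skills/builder-smoke-test/scripts/seed-multi-user.sh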
The script writes directly to $PROJECT_DIR/src/mastra/public/mastra.db
via the sqlite3 CLI (no Node deps). It's idempotent — re-running
replaces the seeded rows. Use the seeded skills wherever a reference
file asks for "a skill owned by another user"; clean them up with
DELETE curls against /api/stored/skills/:id or by re-scaffolding.
Starting the dev server
If the server is not running on :4111, the Setup section starts it. The convenience helpers live under scripts/:
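# From the repo root: scaffold and validate (auth off shown)
bash .claude/skills/builder-smoke-test/scripts/preflight.sh --expect off
# Terminal 1: start the dev server from the project dir (this command blocks)
cd ~/mastra-builder-smoke-tests/builder-smoke
pnpm mastra:dev
# Terminal 2, from the repo root: wait until the API answers
bash .claude/skills/builder-smoke-test/scripts/wait-for-server.sh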
wait-for-server.sh probes /api/agents — not / — because the SPA shell can return 200 before the API mounts. If it reports the server is up on :4112+ instead of :4111, mastra dev fell through to the next port; stop, free :4111, and restart. Continuing on a non-default port silently breaks every curl in every reference.
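For illustration, a readiness probe along these lines could look like the sketch below. It is not the contents of wait-for-server.sh: the function name `wait_for_api`, the attempt count, and the delay are assumptions. It treats any HTTP error status as "not ready", which may be too strict if the endpoint requires a cookie under --auth on.

```shell
# wait_for_api [URL] [ATTEMPTS] [DELAY_SECONDS] -> returns 0 once URL answers 2xx/3xx.
wait_for_api() {
  url="${1:-http://localhost:4111/api/agents}"
  attempts="${2:-30}"
  delay="${3:-1}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP status >= 400.
    if curl -fsS -o /dev/null --max-time 2 "$url" 2>/dev/null; then
      echo "ready: $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not ready after $attempts attempts: $url" >&2
  return 1
}
```

Probing the API path rather than `/` follows the reasoning above: the SPA shell can answer before the API routes are mounted.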
API base URL
Every reference assumes $BASE is exported. Set it once at the start of the run:
export BASE=http://localhost:4111/api
All curl examples in the references use $BASE and won't work in a shell that hasn't exported it.
Quick reference: key endpoints
This table lists the surfaces an agent will hit and where to look for the
authoritative request/response shape. Don't copy curl blocks from here —
run the per-section commands in references/<section>.md.
| Surface | Endpoint |
|---|
| Builder settings | GET /editor/builder/settings |
| Builder infra | GET /editor/builder/infrastructure |
| Registries (list) | GET /editor/builder/registries |
| Registry search | GET /editor/builder/registries/:registryId/search?q=… |
| Registry popular | GET /editor/builder/registries/:registryId/popular |
| Registry preview | GET /editor/builder/registries/:registryId/preview?owner=…&repo=…&path=… |
| Registry install | POST /editor/builder/registries/:registryId/install |
| Workspace CRUD | GET/POST/PATCH/DELETE /stored/workspaces[/:id] |
| Agent CRUD | GET/POST/PATCH/DELETE /stored/agents[/:id] |
| Agent star | PUT / DELETE /stored/agents/:id/star |
| Agent avatar | PATCH /stored/agents/:id with metadata.avatarUrl (owner-only) |
| Skill CRUD | GET/POST/PATCH/DELETE /stored/skills[/:id] |
| Skill publish | POST /stored/skills/:id/publish |
| Skill star | PUT / DELETE /stored/skills/:id/star |
| Auth me | GET /auth/me (returns logged-in user + roles + permissions) |
| Auth refresh | POST /auth/refresh |
Builder Studio routes
| Feature | Route |
|---|
| Agent Builder shell | /agent-builder |
| Agents (default view) | /agent-builder |
| Agent detail (view) | /agent-builder/agents/:id/view (bare :id redirects to /view) |
| Agent detail (edit) | /agent-builder/agents/:id/edit |
| Skills | /agent-builder/skills |
| Library (public skills) | /agent-builder/library |
| Skill detail | /agent-builder/skills/:id/edit (owner) or /agent-builder/skills/:id/view (non-owner) |
| Workspaces | /agent-builder/workspaces |
| Infrastructure | /agent-builder/infrastructure (readable by every default role — see infrastructure.md) |
Mobile renders a bottom-bar with the same primary entries.
Browser smoke
Use whichever browser tool the harness has wired up (Stagehand, Chrome MCP, etc.). Don't assume a specific provider — discover what's available, then drive the same checklist in references/ui.md.
The scaffolded project registers StagehandBrowser (matching examples/agent-builder). If BROWSERBASE_* keys aren't set in the shell, Stagehand falls back to local Playwright; that's fine for smoke. If neither Stagehand nor a local browser is reachable, mark UI as ⏭️ Skipped (no browser provider).
Result reporting
After testing, provide:
## Builder Smoke Test Results
**Date**: <date>
**Branch**: <branch>
**Commit**: <short sha>
**Server**: scaffolded project @ localhost:4111 (`$PROJECT_DIR`)
**Auth**: on / off / auto-skipped
| # | Section | Status | Notes |
| --- | ------------------ | -------- | ------------------------------- |
| 1 | Setup | ✅/❌ | |
| 2 | Workspace | ✅/❌ | |
| 3 | Reconciliation | ✅/❌/⏭️ | |
| 4 | Defaults | ✅/❌ | |
| 5 | Model Policy | ✅/❌ | |
| 6 | Skills | ✅/❌ | |
| 7 | Registry | ✅/❌ | |
| 8 | Agents | ✅/❌ | |
| 9 | Pickers | ✅/❌ | |
| 10 | Stars | ✅/❌ | |
| 11 | Permissions / RBAC | ✅/❌ | |
| 12 | Infrastructure | ✅/❌ | |
| 13 | Channels | ✅/❌ | |
| 14 | UI | ✅/❌/⏭️ | |
| 15 | Auth | ✅/❌/⏭️ | (skipped if no WORKOS\_\* vars) |
**Product issues**: (list any — server/UI behaved unexpectedly. For each: HTTP method + path or UI route, expected vs actual, one-sentence guess at the cause. Do not pre-decide "known bug" — log what the server actually did. Say "none" if empty.)
**Skill issues**: (list any — the skill itself was wrong, unclear, stale, or unreachable. For each: which file + step (e.g. `references/skills.md` step F2), and what was wrong. Doc drift, not product bugs. Say "none" if empty.)
**Verify before filing.** Before adding anything to either list, re-confirm against the live response in this run, not memory of an earlier call:
- For any **shape mismatch / missing field / wrong key name** claim, paste the actual JSON fragment (or the relevant keys) directly under the bullet so the claim is reproducible. If the skill says `features.agent.skills` and the response has `features.agent.skills`, that is not a skill issue — names that look similar in passing (`featSkills`, `agent.features.skill`, etc.) are easy to misread.
- For any **endpoint inconsistency** claim (e.g. "endpoint A returns X but B returns Y"), re-curl both endpoints fresh in the same run rather than reusing a stale response from earlier in the section.
- For any **RBAC / authz** claim (403 where you expected 200, or vice versa), check `references/permissions.md` for the matrix _and_ check the "Design decisions" list in this file. Several roles intentionally share `*:read`, which means infra/list/get endpoints look "ungated" but are working as intended. Also confirm the cookie you sent belongs to the role you think it does (`curl -s -H "Cookie: $COOKIE" "$BASE/auth/me" | jq '.role // .roles'`).
- For any **missing endpoint** claim (e.g. "agent avatar 404"), confirm the contract first — several flows are client-composed on top of generic CRUD (avatar = `PATCH metadata.avatarUrl`; Library Copy = `POST /stored/skills` with `metadata.origin`). The "Design decisions (don't file as bugs)" section enumerates the common ones.
- If a claim can't be reproduced on a fresh request, drop it.
**Regressions**: (list any behavioral changes from a previous run)
**Warnings**: (e.g., dev-server crash on `/auth/refresh` polling, OPENAI_API_KEY required at startup)
**Skipped sections**: (list with reason)
Known rough edges
The branch has accumulated minor papercuts. Note these in your report only if you hit them; don't fail the run on them:
- Don't rm $PROJECT_DIR/mastra.db by hand while the server is up; stop the server first, then delete.
- Dev server can crash on hot-reload from /auth/refresh polling. Restart and continue.
- OPENAI_API_KEY is required at startup: the server won't boot without it, even if you only test non-LLM surfaces.
- mastra dev overwrites process.env from .env at boot, so inline env overrides on the command line don't reach the server. Re-run scaffold to change .env.
- The scaffold links against the current worktree's packages via link: overrides. If you switch worktrees, re-run scaffold so the symlinks point at the right tree.
Design decisions (don't file as bugs)
These have come up across multiple runs and are intentional. If you observe one, note it in your report as "expected behavior" — do not open a product issue.
- GET /auth/me without a cookie returns 200 with a null-ish body. The route is mounted as a public route (createPublicRoute); the contract is "return the current user or null", not "401 if missing". A 401 here would break the public app shell.
- /editor/builder/infrastructure is readable by every default role (admin / member / viewer). The handler gates on infrastructure:read and every default role has *:read, which matches by resource-wildcard. The page only exposes deployment-shape data (provider names, registered flags, configured/unconfigured booleans) and no secrets.
- Flipping a skill's visibility from private to public does not auto-publish unless the skill has a registered skillPath. Visibility and publication are independent fields by design. A plain-create skill flipped public stays at activeVersionId: null until a real POST /publish runs against a source path.
- Zod schema validation runs before the permission middleware on /stored/* writes. A malformed body from a viewer returns a 400, not a 403. This is standard request lifecycle; the response surface doesn't leak resource state.
- The role-impersonation picker only lists roles different from the current one. Logged in as admin, you'll see Member and Viewer and nothing else; there is no Admin self-item. This is intentional (admin is the baseline; you're already there).
- Impersonation is UI-only. The API still answers per the real logged-in role. A curl while impersonating viewer will still return the admin's response.
- Favorites sidebar entry links to /agent-builder/favorite (singular). The plural /favorites is not a registered route and renders the React Router 404. Use the sidebar link or the singular URL when scripting.
- Avatar upload uses agent PATCH with metadata.avatarUrl, not a dedicated /avatar endpoint. See references/agents.md.
- Copy is client-side. There is no POST /stored/skills/:id/copy. The UI fetches the source skill and POSTs a new row to /stored/skills with metadata.origin = "library-copy". See references/registry.md.
Out of smoke-test scope
Some flows are documented in references/ but are not driven by the smoke-test agent because they require server-lifecycle gymnastics that don't fit a single run:
- Reconciliation steps 2/3/4/6 (references/reconciliation.md) require editing $PROJECT_DIR/src/mastra/index.ts (changing basePath / workspaceId / config), restarting mastra dev multiple times, and observing drift detection or orphan archival across restarts. The smoke-test agent runs only Step 1 (fresh-startup persistence) and Step 5 (non-builder workspaces untouched). Run the rest by hand when changing reconciliation code.
- Real role-swap testing (logging in as multiple WorkOS users with different roles in the same run) is out of scope. The agent verifies whichever role the live --role user actually has, and additionally exercises the UI-only role impersonation flow under --role admin (see references/ui.md).
References
- references/setup.md: server health, builder settings sanity, baseline counts, builder workspace existence
- references/workspace.md: workspace CRUD via API
- references/reconciliation.md: config-driven workspace lifecycle (fresh, idempotent, drift, archival, backfill)
- references/defaults.md: builder defaults applied at agent create (memory, workspace, browser, model)
- references/model-policy.md: allowed list, default model, dropdown filtering, rejection
- references/skills.md: skill CRUD, visibility, publish, filesystem writes, files array
- references/registry.md: skills.sh browse/install, library Copy flow, origin badges, gating
- references/agents.md: stored agent CRUD, skill attachment, model swap, delete-from-edit, avatar upload
- references/picker-allowlist.md: tools/agents/workflows pickers respect allowlists
- references/stars.md: star/unstar agents and skills, idempotency
- references/permissions.md: viewer/member/admin/owner gating, role expectation matrix, UI impersonation, auth-off bypass
- references/infrastructure.md: /editor/builder/infrastructure payload + UI
- references/channels.md: Slack provider visibility, connectChannel tool
- references/ui.md: browser checklist across Builder routes
- references/auth.md: WorkOS on/off, 401 behavior, authorId, mode-toggle via .env
- scripts/scaffold.sh: scaffold or refresh the hermetic project at $PROJECT_DIR
- scripts/preflight.sh: wraps scaffold.sh + mode expectation (--expect off|on)
- scripts/wait-for-server.sh: poll :4111 until healthy