ワンクリックで
auto
Autonomous task execution with testing and security. Works through all tasks without stopping.
Codex または Claude でインストール この Prompt をコピーして Codex、Claude、または他のアシスタントに貼り付けると、Skill ページを確認してインストールできます。
メニュー
Autonomous task execution with testing and security. Works through all tasks without stopping.
Codex または Claude でインストール この Prompt をコピーして Codex、Claude、または他のアシスタントに貼り付けると、Skill ページを確認してインストールできます。
SOC 職業分類に基づく
| name | auto |
| description | Autonomous task execution with testing and security. Works through all tasks without stopping. |
| triggers | ["auto"] |
| allowed-tools | Bash, Read, Write, Edit, Grep, Glob, Task, TaskCreate, TaskUpdate, TaskList, Agent, SendMessage |
| model | opus |
| user-invocable | true |
Fully autonomous development. Works through all tasks without stopping until complete.
!git status --short
!node -e "try{const p=require('./prd.json');const sp=p.sprints?p.sprints[p.sprints.length-1]:p;const s=Object.values(sp.stories||p.stories||{});const name=sp.id||sp.name||p.sprint||'unknown';const done=s.filter(x=>x.passes===true).length;const pend=s.filter(x=>x.passes===null||x.passes===false).length;console.log('Sprint:',name,'| Done:',done,'| Pending:',pend,'| Total:',s.length)}catch(e){console.log('No prd.json')}"
auto
|-- Activate: write .claude/auto-active
|-- Check prd.json exists?
| |-- No -> Bootstrap from context
| +-- Yes -> Check pending tasks
| |-- None pending -> IDLE Detection
| +-- Has pending -> Execute tasks
|
+-- Execute until done or interrupted
+-- Deactivate: delete .claude/auto-active
On start, create the flag file using the Write tool (not Bash echo — avoids sensitive file permission prompt):
Write tool → .claude/auto-active
Content: {"started":"<current ISO timestamp>","sprint":"<current sprint>"}
This flag tells the Stop hook to block Claude from stopping. Claude keeps working as long as this flag exists.
On exit (user says "done", or nothing left), do not rm the flag — Bash ops on .claude/ trigger a sensitive-file permission prompt even under bypass. Instead, simply stop working. The Stop hook owns the flag lifecycle:
If the user explicitly says "deactivate auto" mid-sprint, use the Write tool to create .claude/auto-exit (empty file). The Stop hook treats that as an unconditional exit signal and cleans up both files.
Do not ask "Should I continue?" or show summaries and wait.
Instead:
When findings, scan results, or ad-hoc issues are identified during execution, write them to prd.json as stories before fixing them. prd.json is the source of truth that survives session restarts and /compact.
If the user gives a direct instruction (e.g., "fix this button", "update that copy") rather than saying "auto":
When prd.json does not exist:
Before first task, run these checks. Use simple commands that won't trigger security filters:
# 1. Git status
git status --short
# 2. Dependencies fresh?
# Compare timestamps — if package.json is newer than node_modules, run npm install
ls -lt package.json node_modules/.package-lock.json 2>/dev/null | head -1
If package.json is newer or node_modules is missing, run npm install.
# 3. Detect test runner — read package.json with Read tool, check for vitest/jest/playwright in devDependencies
# Use the detected runner for all test steps in this session
# 4. Build check
npm run build 2>&1 | tail -5
# 5. Branch check
git branch --show-current
If on main/master, create a feature branch before making changes.
# 6. Worktree cleanup
git worktree prune 2>/dev/null
Skip individual checks if they take >10 seconds. Use Read tool to inspect package.json instead of node -e one-liners.
// prd.json has two shapes:
// Flat: { stories: { "S1-001": {...} }, sprint: "sprint-1" }
// Nested: { sprints: [{ id: "sprint-1", stories: { "S1-001": {...} } }] }
const sp = prd.sprints ? prd.sprints[prd.sprints.length - 1] : prd;
const stories = sp.stories || prd.stories || {};
const storyEntries = Object.entries(stories);
const executable = storyEntries.filter(([id, s]) =>
s.passes !== true &&
(s.blockedBy || []).every(dep => stories[dep]?.passes === true)
);
Before starting a task, assess its scope:
Small (1-3 files, clear fix) → execute directly
Medium (3-5 files, clear approach) → execute with extra caution
Large (5+ files, new feature, multiple integrations) → write a 3-sentence inline plan before coding:
Then execute. Do not stop to ask — the inline plan is sufficient for auto mode.
[3/8] Starting: S6-003 — Add loading statesnpm run typecheck — fix if failsnpm run build — fix if fails[3/8] ✓ S6-003 | Next: S6-004passes: trueBefore writing code, load references/generation-constraints.md — it covers TypeScript strictness patterns, security/data-safety rules (fetch error handling, SSRF guards, env var enforcement), accessibility checklist (labels, focus rings, touch targets), design anti-slop rules, and the test-generation matrix for API/auth/data mutations. Each rule explains the failure mode it prevents, so you can judge when to bend it.
After writing code but before typecheck, re-read your diff against the 8-point self-critique checklist in the same reference file.
When creating stories in prd.json, each story can carry a verify: [] array and an acceptance: [] array. Load references/verify-tags.md for the tag definitions (visual, a11y, design, security, auth, test, api) and an example story.
If no verify field exists, auto infers from the task type (UI → visual+a11y+design, API → api+security, etc.).
Before marking a task done, verify each acceptance criterion. "Does it compile?" is not acceptance — "does it behave correctly?" is.
next/font or just declared in CSS.mcp__plugin_context7_context7__*), query them for version-pinned docs before writing code that touches the library. This prevents using deprecated APIs or patterns from old training data — even for well-known libraries like Next.js, React, and Supabase.doppler.yaml exists in the repo, secrets live in Doppler — prepend doppler run -- when running npm/pnpm/bun run dev|build|start|test commands. If doppler CLI is missing, install first (see doppler skill). If not logged in, stop and guide user to doppler login.Design context is not optional. Auto-generated UI without reading the design system produces stock shadcn that fails the AI slop checklist. Spend 30 seconds reading the design tokens before writing any component.
| Task Type | Verification |
|---|---|
| UX/UI (public pages) | agent-browser screenshots (desktop + mobile) + console errors |
| UX/UI (admin/internal) | typecheck + build only |
| Feature (UI) | Build passes + visual check if public UI changed + complete primary user flow once |
| Edge Function / API | Deploy + curl with real params + verify 200 + response shape matches expected |
| API Integration | Real request with real credentials + verify response contains expected data |
| Bug fix | Reproduce, verify fixed, no new errors |
| Refactor | Typecheck + build + existing tests pass + no behavior change |
| Auth / billing / RLS | Write or verify a test for the security-critical path |
All task types also require the Hardening Check (step 4c). This catches logic bugs that typecheck and build miss.
Integration test is mandatory for API/Edge Function tasks. Typecheck alone does not catch wrong API keys, wrong function signatures, or wrong database tables. Make one real request before marking done.
Ignore Preview-plugin visual reminders on non-UI edits. Claude Code's Preview plugin and similar tools may suggest "verify in browser" on any file change. Skip the suggestion when the edit was purely server-only: API routes, middleware, next.config.*, *.test.*, migration files, types-only files, server actions without JSX. Only run visual verification when the edit touches a React/Vue/Svelte component, page, layout, or CSS that ships to the browser.
Risk-shaped testing. When adding tests, prioritize paths that handle money, access control, or user data over easy-to-test pure functions.
For UI/API tasks, detect or start a dev server first:
# Check if already running
for port in 3000 3001 5173 8080; do curl -s http://localhost:$port > /dev/null 2>&1 && break; done
# If none found, start one in background
Bash({ command: "npm run dev", run_in_background: true })
# Wait for startup, then verify
Use agent-browser (preferred — token efficient) or Playwright (more capabilities):
agent-browser (preferred):
agent-browser open http://localhost:3000/[page]
agent-browser snapshot -i # Desktop screenshot + DOM
agent-browser viewport 375 812 # Switch to mobile
agent-browser snapshot -i # Mobile screenshot
agent-browser errors # Console errors
Playwright fallback (if agent-browser unavailable):
npx playwright screenshot http://localhost:3000/[page] .claude/screenshots/page-desktop.png
npx playwright screenshot --viewport-size=375,812 http://localhost:3000/[page] .claude/screenshots/page-mobile.png
If neither is available, fall back to WebFetch or curl for basic page load verification.
Analyze screenshots for: broken layout, missing content, visual regressions, design quality, dark mode correctness. Fix console errors or visual issues before marking task complete.
Before marking any task as complete:
1. Type Safety
npm run typecheck 2>/dev/null || npx tsc --noEmit 2>/dev/null
2. Tests
npm test -- --passWithNoTests --watchAll=false 2>/dev/null
3. Resource Validation If the task added external resources (images, fonts, API URLs), validate them:
# Check image/asset URLs are reachable
grep -rn 'https://.*\.(png|jpg|svg|webp|woff2)' src/ --include="*.tsx" --include="*.ts" | while read line; do
url=$(echo "$line" | grep -oP 'https://[^\s"'\'']+'); curl -s -o /dev/null -w "%{http_code} $url\n" "$url"
done
Fix broken URLs before committing — they cause blank images and layout shifts in production.
4. Self-Review
Run git diff and check: no console.log/debugger, no hardcoded colors, all UI states handled, no any types, no commented-out code.
4b. Sweeping Change Verification If the task involved a bulk find-and-replace (e.g., renaming, migrating values, swapping imports), grep for the OLD pattern to confirm it's fully eliminated. Partial migrations cause subtle bugs (e.g., USD→EUR migration that missed one pricing page).
4c. Hardening Check (per-task audit-lite) Review the diff for these patterns in the files you just changed. Fix before marking done:
| Pattern | What to Check | Fix |
|---|---|---|
| Fail-open auth | if (secret && ...) skips auth when env var is unset | Fail-closed: return 401 if env var missing |
| Unsafe casts | as unknown as, as any, double assertions | Create a validator (Zod or manual), parse instead of cast |
| Fire-and-forget fetch | fetch() without try/catch or .ok check | Wrap in try/catch, check res.ok, revert optimistic state on failure |
| Missing form labels | <input placeholder="..."> without <label> or aria-label | Add <label> or aria-label to every input |
| Missing autocomplete | Login/signup inputs without autoComplete | Add autoComplete="email", autoComplete="current-password", etc. |
| User-supplied URLs | Server-side fetch(userUrl) without validation | Validate URL, resolve DNS, block private IP ranges |
| Env var fallbacks | process.env.X || 'localhost' or || '' | Throw if missing in production, only fallback in dev |
| RLS policy logic | New table or RLS change | Verify policy restricts to auth.uid() for user data |
| Missing focus styles | Raw <button> without focus-visible:ring-* | Add focus-visible:ring-2 focus-visible:ring-ring |
| Stock UI | Fonts declared but not loaded, text-only nav, generic empty states | Load fonts via next/font, add icons, add visual personality |
| Dark mode | Colors that don't use theme tokens, cards same color as background | Use semantic tokens, add elevation distinction |
| Chart colors | hsl(var(--x)) when var already contains hsl(...) | Use raw HSL values or remove outer hsl() wrapper |
Only check patterns relevant to the files you changed — this is a 30-second scan of your own diff, not a full audit.
5. Design Token Compliance (UI tasks only)
If the task changed .tsx or .css files, verify the output uses the project's actual tokens:
# Check for stock shadcn / hardcoded colors in changed files
git diff --name-only | xargs grep -n "text-white\|bg-black\|text-gray-\|bg-gray-\|#[0-9a-fA-F]\{6\}" 2>/dev/null | grep -v "gradient\|from-\|to-\|via-" | head -10
# Check fonts are loaded, not just declared
grep -rn "fontFamily\|font-family" src/ --include="*.css" --include="*.tsx" | grep -v "next/font\|@font-face\|tailwind" | head -5
If stock colors or unloaded fonts found in YOUR changes, fix before proceeding.
6. UI/API Change? Visual Verification Run agent-browser or Playwright screenshots from the Verification section above. Not optional for UI tasks.
7. Mark Complete Only after all checks pass. UI files (.tsx, .css, layout, page) without visual verification → go back to step 6.
On failure:
passes: false, continue to next taskpasses: "needs-setup" with blockedReason explaining what's needed. This distinguishes "can't do yet" from "tried and failed."Do not retry a third time. Do not spend more than 10 minutes on retries for a single task.
Track error types across tasks. When the same error pattern appears 3+ times:
Common patterns to recognize:
| Error Pattern | Instant Fix |
|---|---|
exactOptionalPropertyTypes error | Add | undefined to optional prop types: foo?: string | undefined |
Cannot find module './X' | Check file exists, fix path or create file |
Type 'X' is not assignable to type 'Y' | Check the type definition, add union or cast |
Property 'X' does not exist on type 'Y' | Add to interface or use optional chaining |
RLS policy violation | Check auth.uid() in policy, verify user is authenticated |
CORS error | Check API route headers or middleware config |
as unknown as cast | Create a validator function, parse instead of assert |
| Unhandled fetch in component | Wrap in try/catch, check res.ok, add error feedback |
<input> without label | Add <label htmlFor> or aria-label prop |
Env var || '' fallback | Throw if missing, fallback only with NODE_ENV check |
| Middleware blocks new route | Add to PUBLIC_PREFIXES or route matcher |
| Font declared but not loaded | Add next/font import in layout.tsx |
hsl(var(--x)) double-wrap | Remove outer hsl() when CSS var already contains it |
| Stock shadcn tokens | Read project's globals.css, use actual brand colors |
feat|fix|refactorAfter solving hard problems (debugging, retries, unexpected errors), save reusable lessons to auto-memory:
| What to Save | Example |
|---|---|
| Environment quirks | "This project uses Vite on port 5173, not CRA on 3000" |
| Error fix recipes | "RLS 'permission denied' → check auth.uid() in policy, not custom function" |
| Architecture patterns | "API routes follow /api/v1/[resource]/route.ts pattern" |
| Build gotchas | "Must run npm run generate before build (Prisma client)" |
| Test setup | "Tests need TEST_DB_URL env var, seed with npm run seed:test" |
| Deploy requirements | "Vercel needs ANALYZE=true for bundle analysis" |
Also save after these events:
This builds per-project context that compounds across sessions.
With 1M context, compaction is almost never needed. Do NOT suggest /compact unless you are certain context usage exceeds 70%. A full sprint (10+ tasks) typically uses only 15-20% of 1M context.
Be concise but don't sacrifice clarity for brevity.
After committing completed tasks, check if changed files need deployment:
# Check what changed since last deploy/commit
git diff --name-only HEAD~1
| Changed Files | Deploy Action |
|---|---|
supabase/functions/*/index.ts | Deploy changed edge functions (read deploy command from project CLAUDE.md) |
supabase/migrations/*.sql | Run supabase db push or apply migration |
src/** (Vercel/Next.js) | Push to trigger Vercel auto-deploy |
For edge functions, read project-specific deploy config from CLAUDE.md (e.g., path to supabase binary, project ref, flags like --no-verify-jwt). If no config found, skip auto-deploy and note it in completion summary.
After deploy, verify the deployment succeeded (check endpoint responds with 200).
When all stories have passes === true:
All [N] tasks complete.
Summary:
- [X] features implemented
- [X] bugs fixed
- [X] improvements made
Run `progress` to see full results.
If no tasks to work on:
passes: true?
When all pending tasks are done, auto handles the sprint lifecycle — but verifies the work first and surfaces a summary before bumping.
1. BUILD GATE — run the real deploy-target build before anything else:
npm run build (or pnpm/yarn/bun equivalent)
If it fails, do NOT archive or bump. Create a prd.json story for each
error and continue working. Sprint can only close on a clean build.
2. Log summary to .claude/sprint-history.md:
"Sprint [N]: [done]/[total] tasks | [date] | [one-line summary of work]"
3. Archive completed stories:
- Copy current prd.json to .claude/archives/prd-archive-sprint-[N].json
- Remove stories with passes: true from prd.json
- Keep stories with passes: null, false, or "deferred"
4. Decide whether to bump — show a one-line honesty summary first:
Sprint [N] closed: [done]/[total] tasks.
Average realness: [avg]% (see realness field in core schema)
Build: passed in [T]s
Carried forward: [M] deferred, [K] new findings
Bumping to Sprint [N+1]. Say "stop" to pause.
Then proceed — no confirmation needed, but the user has a clean
window to interrupt. This beats silent bumps AND beats blocking
prompts that break autonomous execution.
5. If no new work exists, skip the bump and go to "Ask User" below.
Honest close criteria: A sprint doesn't close just because every story has passes: true. It closes when (a) build passes on deploy target, (b) the summary accurately reflects what shipped, and (c) realness scores are filled in honestly.
| Signal | Action |
|---|---|
| Deferred tasks from previous sprint | Carry forward, start working |
| Audit/brainstorm created new stories | Bump sprint, continue |
| Dev server running + UI changes made | Run visual scan, fix issues found |
| TODOs/FIXMEs in changed files | Create stories, fix them |
| Build warnings | Fix directly (no story needed) |
| Clean codebase, no work | Ask user (see below) |
When new work exists after sprint transition, continue immediately:
Sprint [N] complete ([done]/[total] tasks).
Archived completed stories. [M] tasks carried forward.
Continuing as Sprint [N+1].
Limit: 2 auto-continued sprints per session. After that, ask the user.
If the sprint had 5+ tasks, suggest simplify first:
Sprint [N] complete ([done]/[total] tasks).
Recommended: run `simplify` to catch duplicate code from this sprint.
What's next?
1. simplify - Review for duplicate code and over-abstraction
2. audit - Deep quality scan (finds bugs + violations)
3. brainstorm - Feature ideas + dead code scan
4. Done for now
Keep .claude/auto-active flag while asking. Only delete it if user picks "Done for now".
| Situation | Action |
|---|---|
| No prd.json | Bootstrap from context |
| All done + issues found | Brainstorm (auto-creates stories) |
| All done + clean code | Ask user for next action |
| All done + already auto-sprinted | Ask user (limit reached) |
| Build broken | Fix first |
| Task fails | Retry 2x, then skip |
| UX task | Browser verify |
| Blocked task | Skip, work on unblocked |
| < 5 tasks, no sprint | Work directly |
Show token / tool usage stats from the local telemetry log. Use when you want to know "which tools am I burning context on", "which skills are expensive", or "was yesterday's session mostly Read/Grep or actually productive".
Parallel quality audit with 7 specialized agents (Opus). Finds bugs, violations, and quality issues. Use audit for fixes, brainstorm for features.
Manage environment variables with Doppler — auto-install CLI, login, link projects, wrap commands with `doppler run`. Replaces scattered .env files with a hub/spoke architecture.
Scaffolds new projects or onboards existing ones. Detects stack, creates monorepo/single-app, configures strict tooling. Use for greenfield or first-time setup.
Archives completed stories from prd.json to reduce token usage.
Removes temporary screenshots, old backups, stale handoffs, and auto-active flags. Use when project has accumulated temp files.