validate
End-to-end validate a merged PR or feature against the dev environment — UI + DB assertions with screenshots
| name | validate |
| description | End-to-end validate a merged PR or feature against the dev environment — UI + DB assertions with screenshots |
Drive a merged change through real UI + DB to confirm it works end-to-end. Catches regressions that unit tests and CI smoke tests miss (real S3, real Supabase, real auth, real browser).
Use this after a PR has merged and you want eyes-on confirmation before the next deploy or before declaring the feature shipped. Not a substitute for unit/integration tests.
Most recent merge on main:
!git log -1 --pretty=format:'%h %s%n%b' origin/main 2>/dev/null | head -20 || git log -1 --pretty=format:'%h %s' 2>/dev/null
Files changed in last merge:
!git diff --name-only HEAD~1..HEAD 2>/dev/null | head -40 || echo "(no diff)"
Scope hint: $ARGUMENTS — may be a PR number, commit SHA, or free-form description ("the new upload-batches flow"). If empty, default to the last merge on main shown above.
If unsure, surface the question to the user before writing tests.
Resolve $ARGUMENTS to a concrete set of changes:
- PR number (e.g. 224): gh pr view 224 --json title,body,files, then git show <merge-commit> or git diff <base>..<head> for the diff.
- Commit SHA: git show <sha> --stat and git log -1 --pretty=full <sha>.
Read the PR body / commit message thoroughly: the "why" tells you what to validate, not just the "what".
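The resolution step above can be sketched as a small classifier. This is a minimal sketch, not part of the skill itself: `resolveScope`, `commandsFor`, and the `Scope` type are hypothetical names, and the regexes are heuristics (a short hex-only word would be mistaken for a commit SHA), so an ambiguous hint should still go back to the user.

```typescript
// Hypothetical helper: classify a scope hint into one of the documented forms.
type Scope =
  | { kind: 'last-merge' }
  | { kind: 'pr'; number: number }
  | { kind: 'commit'; sha: string }
  | { kind: 'description'; text: string };

function resolveScope(raw: string): Scope {
  const hint = raw.trim();
  // Empty hint: fall back to the last merge on main.
  if (hint === '') return { kind: 'last-merge' };

  // Assumption: 1 to 6 digits, optionally prefixed with '#', is a PR number.
  const pr = /^#?(\d{1,6})$/.exec(hint);
  if (pr) return { kind: 'pr', number: Number(pr[1]) };

  // Assumption: 7 to 40 hex characters is an abbreviated or full commit SHA.
  if (/^[0-9a-f]{7,40}$/i.test(hint)) {
    return { kind: 'commit', sha: hint.toLowerCase() };
  }

  return { kind: 'description', text: hint };
}

// Map a resolved scope to the commands named in the steps above.
function commandsFor(scope: Scope): string[] {
  switch (scope.kind) {
    case 'pr':
      return [`gh pr view ${scope.number} --json title,body,files`];
    case 'commit':
      return [`git show ${scope.sha} --stat`, `git log -1 --pretty=full ${scope.sha}`];
    case 'last-merge':
      return ["git log -1 --pretty=format:'%h %s%n%b' origin/main"];
    case 'description':
      return []; // nothing to run automatically; clarify with the user first
  }
}

console.log(commandsFor(resolveScope('224')));
```

Keeping the free-form case as an explicit variant with no commands makes it harder to skip the "ask the user" step by accident.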
Identify what's at risk, then group into validation items. Categories to consider:
Drop items aggressively. Three high-leverage tests beat seven low-leverage ones. Justify drops out loud ("opaque field, no parsing, skipped").
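One way to make the triage explicit is to rank candidate items and record a reason for every drop. The sketch below is illustrative only: `ValidationItem`, the 1 to 5 `leverage` score, and the default of three kept items are assumptions, not something the skill prescribes.

```typescript
// Hypothetical shape for a candidate validation item.
interface ValidationItem {
  name: string;
  leverage: number; // assumed 1-5 score: how much risk this test retires
  dropReason?: string; // filled in when the item is dropped
}

// Keep the highest-leverage items and attach a reason to everything else,
// so each drop can be justified out loud.
function triage(
  items: ValidationItem[],
  keep = 3,
): { kept: ValidationItem[]; dropped: ValidationItem[] } {
  const ranked = [...items].sort((a, b) => b.leverage - a.leverage);
  const kept = ranked.slice(0, keep);
  const dropped = ranked.slice(keep).map((item) => ({
    ...item,
    dropReason: item.dropReason ?? 'lower leverage than the kept items',
  }));
  return { kept, dropped };
}

const { kept, dropped } = triage([
  { name: 'migration applied', leverage: 5 },
  { name: 'upload happy path', leverage: 4 },
  { name: 'failed upload shows retry', leverage: 4 },
  { name: 'metadata field', leverage: 1, dropReason: 'opaque field, no parsing, skipped' },
]);
console.log(kept.map((i) => i.name));
console.log(dropped.map((i) => `${i.name}: ${i.dropReason}`));
```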
Present the list to the user via AskUserQuestion before writing anything. Single-select if the user wants one item at a time, multi-select if they want a batch. Include "all of them" as an option when reasonable.
Then create TaskCreate entries — one per validation item plus one for cleanup. Mark in_progress as you start each.
Start with a DB sanity check. It is the cheapest and most informative step: it tells you whether the migration even ran. Write a temp tsx script:
File: apps/web/scripts/temp-validate-<slug>.ts
import postgres from 'postgres';
import { config } from 'dotenv';
import { PLAN_LIMITS } from '@nexus/db/plans';

config({ path: '.env.local' });
const sql = postgres(process.env.DATABASE_URL!);

async function main() {
  // 1. Schema check: confirm new columns/tables exist
  const cols = await sql`
    SELECT column_name, data_type, is_nullable
    FROM information_schema.columns
    WHERE table_name = '<new-or-changed-table>'
    ORDER BY ordinal_position
  `;
  console.log(cols);

  // 2. Migration applied
  const mig = await sql`
    SELECT hash, created_at
    FROM drizzle.__drizzle_migrations
    ORDER BY created_at DESC LIMIT 5
  `;
  console.log(mig);

  // 3. Drift check: if a source-of-truth column was introduced, confirm it
  //    matches the legacy aggregate for all users.
  const drift = await sql`<aggregation comparing new vs legacy source>`;
  console.log(drift.length === 0 ? 'OK no drift' : drift);

  await sql.end();
}

main().catch((e) => {
  console.error(e);
  process.exit(1);
});
Run: pnpm tsx apps/web/scripts/temp-validate-<slug>.ts
If sanity passes, move on. If it fails, stop and figure out why — the rest of the work is meaningless until the DB is in the expected state.
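The drift check in step 3 of the script is left as a placeholder query. As a rough illustration of what "no drift" means, the sketch below does the same comparison in memory. `findDrift`, `Totals`, and `DriftRow` are hypothetical names, and it assumes both sides have already been reduced to a per-user byte total held as a plain number (bigint columns may arrive as strings depending on the driver configuration, so convert first).

```typescript
// Per-user totals keyed by user id, e.g. { 'user-1': 2048 }.
type Totals = Record<string, number>;

interface DriftRow {
  userId: string;
  recorded: number; // value in the new source-of-truth column
  computed: number; // value recomputed from the legacy aggregate
  delta: number; // recorded - computed
}

// Return one row per user whose two totals disagree. An empty array means
// no drift. A user missing from one side is treated as having a total of 0.
function findDrift(recorded: Totals, computed: Totals): DriftRow[] {
  const all = Object.keys(recorded).concat(Object.keys(computed));
  const userIds = all.filter((id, i) => all.indexOf(id) === i).sort();

  const drift: DriftRow[] = [];
  for (const userId of userIds) {
    const r = recorded[userId] ?? 0;
    const c = computed[userId] ?? 0;
    if (r !== c) drift.push({ userId, recorded: r, computed: c, delta: r - c });
  }
  return drift;
}

const rows = findDrift({ u1: 100, u2: 50 }, { u1: 100, u2: 70, u3: 10 });
console.log(rows.length === 0 ? 'OK no drift' : rows);
```

In the real script the same comparison is usually cheaper as a single SQL join that returns only the mismatching rows; the in-memory version is mainly useful for reasoning about the expected result.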
File: apps/web/e2e/smoke/_temp-validate-<slug>.spec.ts
Required patterns (these are not negotiable — they're what made it work in real runs):
import { test, expect } from '../fixtures/authenticated';
import { findUserByEmail, getDb } from '../helpers/db';
import { REGULAR_USER } from '../helpers/auth';
import { PLAN_LIMITS } from '@nexus/db/plans';

// State is shared across tests (single user, single storage_usage row).
// Run serially so test #N's setup doesn't pollute test #M.
test.describe.configure({ mode: 'serial' });
test.use({ userRole: 'user' });

const SCREENSHOTS = 'test-results/temp-validate-<slug>';

async function getUserId(): Promise<string> {
  const u = await findUserByEmail(REGULAR_USER.email);
  if (!u) throw new Error(`regular user missing: ${REGULAR_USER.email}`);
  return u.id;
}

// Reset all user state to a deterministic baseline.
async function cleanupForUser(userId: string): Promise<void> {
  const sql = getDb();
  await sql`DELETE FROM files WHERE user_id = ${userId}`;
  await sql`DELETE FROM upload_batches WHERE user_id = ${userId}`;
  await sql`
    INSERT INTO storage_usage (id, user_id, used_bytes, file_count)
    VALUES (gen_random_uuid()::text, ${userId}, 0, 0)
    ON CONFLICT (user_id) DO UPDATE SET used_bytes = 0, file_count = 0, updated_at = now()
  `;
}

test.describe('<feature> validation', () => {
  test.beforeAll(async () => {
    await cleanupForUser(await getUserId());
  });

  test.afterAll(async () => {
    await cleanupForUser(await getUserId());
    await getDb().end({ timeout: 5 });
  });

  test('<item 1>', async ({ page }) => {
    // ...
    await page.screenshot({
      path: `${SCREENSHOTS}/01-foo.png`,
      fullPage: true,
    });
  });
});
- await page.setInputFiles('input[type="file"]', { name, mimeType, buffer }) is more reliable than going through the visible "Browse" proxy button.
- Use the helpers/db sql template for DB assertions; assert on s3_key shape, FK columns, storage_usage deltas.
- For negative cases, assert on the visible failure state (e.g. page.getByRole('button', { name: 'Retry upload' }) for failed uploads), plus a DB assertion that no row was created.
- Use mode: 'serial' + --workers=1. Playwright's fullyParallel: true will silently break shared-state tests.
- Calling the API directly (?batch=1&input=...) is finicky; superjson wraps things. Drive through the UI instead, or use the page's own fetch via page.evaluate.
- Do not assume afterEach-style cleanup is enough. Each test pulls from a fresh baseline only if cleanup is in beforeAll + you're running serial.
Run:
pnpm -F web exec playwright test e2e/smoke/_temp-validate-<slug>.spec.ts \
  --project=smoke --reporter=list --workers=1
--workers=1 is non-optional when state is shared.
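For the storage_usage delta assertions mentioned above, it can help to compute the expected change separately from the DB read. The helper below is a hedged sketch: `StorageUsage`, `usageDelta`, and `checkUpload` are hypothetical names, and it assumes used_bytes and file_count have already been read into plain numbers before and after the upload.

```typescript
// Snapshot of the two counters on the storage_usage row.
interface StorageUsage {
  usedBytes: number;
  fileCount: number;
}

// Difference between two snapshots (after - before).
function usageDelta(before: StorageUsage, after: StorageUsage): StorageUsage {
  return {
    usedBytes: after.usedBytes - before.usedBytes,
    fileCount: after.fileCount - before.fileCount,
  };
}

// Compare the observed delta with what the uploaded files should have added.
// Returns a list of human-readable problems; an empty list means the row
// moved by exactly the expected amounts.
function checkUpload(
  before: StorageUsage,
  after: StorageUsage,
  uploadedSizes: number[],
): string[] {
  const delta = usageDelta(before, after);
  const expectedBytes = uploadedSizes.reduce((sum, n) => sum + n, 0);
  const problems: string[] = [];
  if (delta.usedBytes !== expectedBytes) {
    problems.push(`used_bytes changed by ${delta.usedBytes}, expected ${expectedBytes}`);
  }
  if (delta.fileCount !== uploadedSizes.length) {
    problems.push(`file_count changed by ${delta.fileCount}, expected ${uploadedSizes.length}`);
  }
  return problems;
}

console.log(checkUpload({ usedBytes: 0, fileCount: 0 }, { usedBytes: 1024, fileCount: 1 }, [1024]));
```

Inside a spec, the returned list can be asserted with `expect(...).toEqual([])`, so a failure message names the counter that drifted rather than just reporting two unequal numbers.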
When a test fails:
- Open the screenshot under test-results/<test-name>/. The screenshot tells you the actual UI state, which is usually different from what you assumed.
After the run:
rm apps/web/scripts/temp-validate-<slug>.ts
rm apps/web/e2e/smoke/_temp-validate-<slug>.spec.ts
# Screenshots are gitignored under test-results/, leave them.
git status # confirm clean
If the validation surfaced a real bug, do NOT delete — keep the spec around as a reproducer until the fix lands.
A short summary table with one row per validation item, columns: What / How / Result. Link screenshots inline where helpful. End with a one-sentence verdict ("ready to ship" / "blocker: ").
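The report format described above could be produced by a small renderer like the one below. This is a sketch under assumptions: `ReportRow`, `renderReport`, the three result values, and the choice to name the failed items after "blocker:" are all illustrative, not part of the skill.

```typescript
type Result = 'pass' | 'fail' | 'skipped';

// One row per validation item, matching the What / How / Result columns.
interface ReportRow {
  what: string;
  how: string;
  result: Result;
}

// Render the summary table followed by a one-line verdict.
function renderReport(rows: ReportRow[]): string {
  // Escape pipes so cell text cannot break the table layout.
  const cell = (s: string) => s.replace(/\|/g, '\\|');
  const lines = [
    '| What | How | Result |',
    '| --- | --- | --- |',
    ...rows.map((r) => `| ${cell(r.what)} | ${cell(r.how)} | ${r.result} |`),
  ];
  const failed = rows.filter((r) => r.result === 'fail').map((r) => r.what);
  const verdict = failed.length === 0 ? 'ready to ship' : `blocker: ${failed.join(', ')}`;
  return [...lines, '', `Verdict: ${verdict}`].join('\n');
}

console.log(
  renderReport([
    { what: 'migration applied', how: 'temp tsx script', result: 'pass' },
    { what: 'failed upload shows retry', how: 'UI + DB assertion', result: 'fail' },
  ]),
);
```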