test-red-team

Adversarial red-team of a running web, React Native, or Capacitor hybrid app. Drives Playwright browser MCP (web/PWA), Playwright Android WebView attach (Capacitor), or adb tap-walk (native chrome) against a feature-first coverage matrix: each feature decomposed to surfaces, sub-pages, components, and states, attacked across 4 dimensions — UI/UX, data pipeline, security (OWASP-mapped), and performance. Cross-references Sentry for production telemetry, Supabase for DB-layer mutation truth and RLS verification, and Firecrawl for current OWASP/MASVS guidance. Produces a severity-ranked defect list with repro steps and evidence. Generic across any repo and stack. Use when asked to "red team this app", "attack my app", "break it", "find all the defects", "adversarial test", "pre-launch hardening", "pentest the app", or "full app QA". Distinct from test-playwright (session PDCA), test-qa (happy-path crawl), and audit-security (static code review).

Updated: 2026-06-12 00:41

Source: kensaurus/cursor-kenji

license	MIT

test-red-team — Adversarial Full-App Defect Sweep

The job is not done when the code compiles or even when the happy path works. It is done when a hostile, skeptical attacker has tried every angle — UI/UX, data pipeline, security, and performance — and the defect list is in the user's hands. This skill exists because nobody ships a perfect first pass. Red teaming finds the gaps before real users do.

This skill produces a defect list. Unlike test-playwright (fix-as-you-go PDCA), the default output is a severity-ranked report, not inline fixes. Offer to fix after the report is delivered; ask which defects to prioritize.

Before ANY browser action, read protocol-browser-anti-stall (~/.cursor/skills/protocol-browser-anti-stall/SKILL.md) and apply every rule.

Coverage model

Do NOT do a blind structural DOM crawl — that is test-qa. Traverse a coverage matrix instead:

feature/capability
  → surfaces (routes, sub-pages, API endpoints, RPCs, tables)
    → components + states (forms, tables, modals, empty/error/loading, role variants)
      → 4 attack dimensions: UI/UX · data pipeline · security · performance

Each cell in the matrix (feature × surface × component-state × dimension) is marked PASS / DEFECT / N-A. The matrix is both the traversal plan and the audit trail in the final report.
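A minimal sketch of the matrix as data, assuming a hypothetical feature/surface/state shape and an extra `UNTESTED` status for cells not yet visited (neither is defined by the skill itself). It expands every feature into one cell per dimension and counts cells by status:

```javascript
// Hypothetical helpers: names and input shapes are illustrative only.
const DIMENSIONS = ['UI/UX', 'Pipeline', 'Security', 'Perf'];

// Expand features into one cell per (feature, surface, state, dimension).
function buildMatrix(features) {
  const cells = [];
  for (const feature of features) {
    for (const surface of feature.surfaces) {
      for (const state of surface.states) {
        for (const dimension of DIMENSIONS) {
          cells.push({
            feature: feature.name, surface: surface.name,
            state, dimension, status: 'UNTESTED',
          });
        }
      }
    }
  }
  return cells;
}

// Record a result for one cell; status is PASS, DEFECT, or N-A.
function mark(cells, match, status, note = '') {
  const cell = cells.find(c =>
    c.feature === match.feature && c.surface === match.surface &&
    c.state === match.state && c.dimension === match.dimension);
  if (!cell) throw new Error('No such cell: ' + JSON.stringify(match));
  cell.status = status;
  cell.note = note;
  return cell;
}

// Count cells by status so untested cells stay visible.
function summarize(cells) {
  const counts = {};
  for (const c of cells) counts[c.status] = (counts[c.status] || 0) + 1;
  return counts;
}

const matrix = buildMatrix([
  { name: 'Auth', surfaces: [{ name: 'Login form', states: ['empty', 'error'] }] },
  { name: 'Documents', surfaces: [{ name: 'List page', states: ['empty state'] }] },
]);
mark(matrix, { feature: 'Auth', surface: 'Login form', state: 'empty', dimension: 'UI/UX' }, 'PASS');
console.log(summarize(matrix));
```

Keeping the cells in one flat array makes it straightforward to use the same structure as both the traversal queue and the audit trail.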

Graceful fallback: if no discernible features exist (pure marketing site), degrade to route → component structural traversal.

Phase 0: Scope & rules of engagement

0a. Detect the stack

package.json             → framework, auth, ORM, RN/Expo, test port
capacitor.config.ts/.json → Capacitor hybrid (web + native shell)
app.json / app.config.*  → Expo / React Native
android/                 → native Android target

Record:

Target type: web-only | RN/Expo | Capacitor hybrid | mixed monorepo
Dev URL / port (scripts.dev, default 3000 / 5173 / 8081)
Auth pattern (Supabase Auth, NextAuth, Clerk, custom JWT)
Test credentials (.env.local, .env.test, README)
Backend MCPs available: Supabase (plugin-supabase-supabase), Sentry (plugin-sentry-sentry), Firecrawl (user-firecrawl)
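The detection rules above could be sketched roughly as follows. The function names and input shapes (a list of root-level file names plus a parsed package.json) are assumptions for illustration, and a real monorepo would need the same check run per package:

```javascript
// Classify the target type from root-level entries and package.json dependencies.
function detectTargetType(rootEntries, pkg = {}) {
  const deps = { ...(pkg.dependencies || {}), ...(pkg.devDependencies || {}) };
  const has = (name) => rootEntries.includes(name);
  const capacitor = has('capacitor.config.ts') || has('capacitor.config.json') ||
    '@capacitor/core' in deps;
  const reactNative = has('app.json') ||
    rootEntries.some(e => e.startsWith('app.config.')) ||
    'react-native' in deps || 'expo' in deps;
  if (capacitor && reactNative) return 'mixed monorepo';
  if (capacitor) return 'Capacitor hybrid';
  if (reactNative) return 'RN/Expo';
  return 'web-only';
}

// Guess the dev port: an explicit flag in scripts.dev wins, else a common default.
function guessDevPort(pkg = {}) {
  const dev = (pkg.scripts && pkg.scripts.dev) || '';
  const explicit = dev.match(/(?:--port|-p)[ =](\d+)/);
  if (explicit) return Number(explicit[1]);
  if (/\bvite\b/.test(dev)) return 5173;
  if (/\bexpo\b|react-native/.test(dev)) return 8081;
  return 3000;
}

console.log(detectTargetType(['package.json', 'capacitor.config.ts']));
console.log(guessDevPort({ scripts: { dev: 'next dev -p 4000' } }));
```

The port guess is only a starting point; confirm it against the running dev server before recording it.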

0b. Pick the automation driver

Target	Driver
Web / PWA / Next.js / SvelteKit / Remix	Playwright browser MCP (`user-playwright`)
Capacitor WebView on Android emulator	Playwright `_android` WebView attach over ADB (see Phase 0c)
Native chrome: system dialogs, bottom sheets, permission prompts	`adb shell input tap` walk (see `mobile-emulator-test` skill)
Pure-native iOS/Android (Swift/Kotlin UI)	Out of scope — needs Appium; document as limitation

0c. Capacitor WebView attach (when target includes Capacitor)

Ensure the app is running, then in a script or shell:

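const { _android } = require('playwright');

(async () => {
  // Requires: ADB device online, Chrome ≥ 87 on the device, WebView debugging enabled
  const [device] = await _android.devices();
  if (!device) throw new Error('No ADB device found');
  const webview = await device.webView({ pkg: 'com.your.app.id' });
  const page = await webview.page();
  // page is a standard Playwright Page, so the usual Page API (click, fill, screenshot) applies
  console.log(await page.title());
  await device.close();
})();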

For native chrome outside the WebView, fall back to adb shell input tap with coordinates from adb shell uiautomator dump.

0d. Rules of engagement

All test data prefixed RT-TEST- for easy cleanup
Do NOT mutate production data; ask before any non-reversible operation
Confirm exploits with evidence; no destructive PoCs against real rows
Secrets referenced by name only — never printed in chat
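One way to enforce the RT-TEST- rule in practice is a small guard around test data creation and cleanup. This is a sketch with hypothetical helper names, not something the skill ships:

```javascript
const RT_PREFIX = 'RT-TEST-';

// Build a unique, clearly tagged value for a test record field.
function rtTestValue(label, now = Date.now()) {
  return `${RT_PREFIX}${label}-${now}`;
}

function isRtTestValue(value) {
  return typeof value === 'string' && value.startsWith(RT_PREFIX);
}

// Refuse cleanup if any candidate row was not created with the test prefix.
function assertSafeToDelete(rows, column) {
  const foreign = rows.filter(r => !isRtTestValue(r[column]));
  if (foreign.length > 0) {
    throw new Error(`${foreign.length} row(s) lack the ${RT_PREFIX} prefix; aborting cleanup`);
  }
  return rows.length;
}

console.log(rtTestValue('doc'));
```

Running the guard over the rows selected for deletion, before issuing any DELETE, keeps a mistyped filter from touching real data.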

Phase 1: Recon & build the coverage matrix

Before opening the browser, build the matrix from source code. This is the most important phase — the matrix determines what you attack.

1a. Enumerate features / capabilities

Sources to check (read, do not shell-grep unless necessary):

Source	What to extract
`src/app/**/page.tsx` (Next.js App Router)	Routes → feature groupings
`src/routes/**/*.tsx` (Remix / RN Router)	Route tree
`features/*/`, `modules/*/`, `src/*/` dirs	Feature boundary names
`supabase/migrations/*.sql`, `prisma/schema.prisma`	Data entities + relationships
`src/app/api/`, `supabase/functions/`	API endpoints + RPCs
README / feature-level docs	Business capability names
Nav component / sidebar / tab bar	User-visible feature map

1b. For each feature, decompose

Feature: [name]
  Surfaces:
    - Routes/pages: [list]
    - Sub-pages / deep links: [list]
    - API endpoints / RPCs called: [list]
    - DB tables touched: [list]
  Components + states:
    - [component name]: states=[default, empty, loading, error, role-A, role-B, ...]
    - Forms: [list fields + validation rules]
    - Modals / drawers: [list + triggers]
    - File upload surfaces: [list]
  Roles / tenants:
    - [role list — anon, user, admin, org-member, etc.]
  Input surfaces (for injection testing):
    - [input fields, URL params, query strings, headers, file content]

1c. Output the matrix

Print the matrix before starting Phases 2–5. Example:

COVERAGE MATRIX
| Feature       | Surface/Component+State          | UI/UX | Pipeline | Security | Perf |
|---------------|----------------------------------|-------|----------|----------|------|
| Auth          | Login form / empty               |       |          |          |      |
| Auth          | Login form / error (wrong creds) |       |          |          |      |
| Documents     | List page / empty state          |       |          |          |      |
| Documents     | List page / 100+ items           |       |          |          |      |
| Documents     | Create modal / all fields        |       |          |          |      |
| Documents     | Share flow / other-user role     |       |          |          |      |
...

Fill each cell with ✅ PASS / ❌ DEFECT(#N) / — N-A as you work through Phases 2–5.
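As an illustration, the matrix table can be rendered from recorded results with a few lines of code. The row shape (`feature`, `surface`, `results` keyed by dimension) is an assumption of this sketch, and N-A cells are rendered as plain `N-A`:

```javascript
const DIMS = ['UI/UX', 'Pipeline', 'Security', 'Perf'];

// Turn one recorded result into the text shown in a matrix cell.
function cellText(result) {
  if (!result) return '';                                  // not yet tested
  if (result.status === 'PASS') return '✅ PASS';
  if (result.status === 'DEFECT') return `❌ DEFECT(#${result.defect})`;
  return 'N-A';
}

// Render rows as a pipe table in the same column order as the example matrix.
function renderMatrix(rows) {
  const header = ['Feature', 'Surface/Component+State', ...DIMS];
  const lines = [
    `| ${header.join(' | ')} |`,
    `|${header.map(() => '---').join('|')}|`,
  ];
  for (const row of rows) {
    const cells = DIMS.map(d => cellText(row.results[d]));
    lines.push(`| ${[row.feature, row.surface, ...cells].join(' | ')} |`);
  }
  return lines.join('\n');
}

console.log(renderMatrix([
  {
    feature: 'Auth',
    surface: 'Login form / empty',
    results: { 'UI/UX': { status: 'PASS' }, Security: { status: 'DEFECT', defect: 1 } },
  },
]));
```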

Phase 2: UI/UX red team (per matrix cell)

For each cell, navigate to the surface, reach the component state, and attack.

Driver: Playwright browser MCP tools — browser_navigate, browser_snapshot, browser_take_screenshot, browser_click, browser_type, browser_fill_form, browser_console_messages, browser_network_requests, browser_resize.

Per cell checklist:

Attack	What to do
3-second clarity	Navigate cold. Can a new user understand the purpose in 3 s?
Primary action	Is the CTA obvious, above the fold, reachable in 1 tap/click?
Empty state	Remove all data (or use a fresh account). Is there a helpful empty-state message?
Loading state	Throttle or delay the network (CDP network emulation as in Phase 5, or a delayed `fetch` wrapper injected with `browser_evaluate`). Does a skeleton/spinner appear?
Error state	Trigger a backend error (kill the API, bad payload). Is the error message human-readable?
Dead buttons	Click every button and link. Does each one produce a visible response?
Form labels	Every `<input>` has a visible label or `aria-label`.
Responsive	`browser_resize` to 1280×800, 768×1024, 375×812. Layout must not break.
Dark mode	Toggle if supported. No white flash, no invisible text.
Role variant	Test the same surface as each role. Different roles must see different data/controls.
Overflow	Long strings (200 chars), numbers with many digits. No clipping without ellipsis.

Capture a browser_take_screenshot as evidence for every DEFECT. Note the component file path if identifiable from the DOM or source tree.
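The Responsive and Overflow rows lend themselves to a small generated test plan. The sketch below uses the viewport sizes and the 200-character length from the checklist; the function names and the 18-digit number are illustrative assumptions:

```javascript
// Viewports from the Responsive row of the checklist.
const VIEWPORTS = [
  { name: 'desktop', width: 1280, height: 800 },
  { name: 'tablet', width: 768, height: 1024 },
  { name: 'phone', width: 375, height: 812 },
];

// Values that tend to break fixed-width layouts.
function overflowPayloads(length = 200) {
  return {
    unbroken: 'W'.repeat(length),                                    // no break opportunities
    spaced: 'lorem '.repeat(Math.ceil(length / 6)).slice(0, length), // wraps, but is long
    bigNumber: '9'.repeat(18),                                       // many-digit number
  };
}

// One check per (viewport, payload) pair.
function responsiveOverflowPlan(length = 200) {
  const payloads = overflowPayloads(length);
  return VIEWPORTS.flatMap(viewport =>
    Object.entries(payloads).map(([kind, value]) => ({ viewport, kind, value })));
}

console.log(responsiveOverflowPlan().length);
```

Each plan entry maps to one `browser_resize` call followed by typing the payload into the field under test and taking a screenshot.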

Phase 3: Data pipeline red team (per matrix cell)

Every mutation surface (create / update / delete / upload) gets attacked here.

Verify mutations at 3 layers: (1) UI feedback, (2) network call, (3) DB row.

DB verification via Supabase MCP

Check the tool schema first (see Guardrails), then list the tables to confirm their names:

CallMcpTool(server: "plugin-supabase-supabase", toolName: "list_tables", arguments: {
  "project_id": "<PROJECT_ID>"
})

Then verify the mutation:

CallMcpTool(server: "plugin-supabase-supabase", toolName: "execute_sql", arguments: {
  "project_id": "<PROJECT_ID>",
  "query": "SELECT * FROM <table> WHERE <col> LIKE 'RT-TEST-%' ORDER BY created_at DESC LIMIT 10"
})

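RLS role verification (confirm what the client can actually read/write). SET ROLE alone leaves auth.uid() NULL, so the comparison below matches no rows and an empty result proves nothing. Set the JWT claims for a test user in the same transaction first, for example SELECT set_config('request.jwt.claims', '{"sub":"<USER_B_UUID>"}', true); any row then returned belongs to another user and is an RLS defect: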

CallMcpTool(server: "plugin-supabase-supabase", toolName: "execute_sql", arguments: {
  "project_id": "<PROJECT_ID>",
  "query": "SET ROLE authenticated; SELECT * FROM <table> WHERE user_id != auth.uid() LIMIT 5"
})

Attack patterns

Attack	How
Double-submit dupe	Click submit twice in rapid succession. Does a duplicate row appear?
Optimistic-update lie	Submit → immediately check DB. Did the row actually land?
Stale cache	Mutate in one tab / session, reload the other. Does it refresh?
Ghost delete	Delete an item, reload the page. Does it reappear?
Race condition	Submit two conflicting mutations in quick succession. Which wins? Any partial writes?
Partial write	Kill the network mid-request (`browser_evaluate` `fetch` override). Is the record consistent?
Idempotency	Retry the same request (Replay in DevTools or Playwright network intercept). Is it safe?
Relationship integrity	Delete a parent record. Are orphaned children cleaned up or guarded?
Pagination consistency	Create a record while on page 2. Does it appear in the right position?
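For the double-submit check, the rows returned by the DB query can be grouped on the fields that should be unique per submission. A minimal sketch, assuming rows arrive as plain objects and that the key fields are chosen by the tester:

```javascript
// Group rows by the given key fields and return every group with more than one row.
function findDuplicateRows(rows, keyFields) {
  const groups = new Map();
  for (const row of rows) {
    const key = JSON.stringify(keyFields.map(f => row[f]));
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  return [...groups.values()].filter(group => group.length > 1);
}

// Example rows as they might come back after a rapid double click (made-up data).
const rows = [
  { id: 1, user_id: 'u1', title: 'RT-TEST-doc-1' },
  { id: 2, user_id: 'u1', title: 'RT-TEST-doc-1' },
  { id: 3, user_id: 'u1', title: 'RT-TEST-doc-2' },
];
console.log(findDuplicateRows(rows, ['user_id', 'title']));
```

Any non-empty result is evidence for a DEFECT cell; attach the duplicate row ids to the finding.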

Supabase logs check

After running attack patterns, pull API + postgres logs:

CallMcpTool(server: "plugin-supabase-supabase", toolName: "get_logs", arguments: {
  "project_id": "<PROJECT_ID>",
  "service": "api"
})

Flag any unexpected 5xx, unhandled exceptions, or RLS deny events.

Supabase advisors

CallMcpTool(server: "plugin-supabase-supabase", toolName: "get_advisors", arguments: {
  "project_id": "<PROJECT_ID>"
})

New ERROR-level advisors count as pipeline defects.

Phase 4: Security red team (per matrix cell, OWASP-mapped)

Full payload tables and OWASP/MASVS mapping: references/owasp-attack-checklist.md.

Priority order (highest real-world impact first):

#	Class (OWASP)	Key test
1	Authorization / IDOR (A01)	User B reads/mutates User A's resource by ID; verify at DB layer with `SET ROLE authenticated`
2	Authentication (A07)	Password-reset token reuse, session fixation, JWT payload tampering, missing brute-force lockout
3	Injection (A03)	XSS `<img src=x onerror=alert(1)>`, SQLi `' OR 1=1--`, path traversal `../../etc/passwd` in every input
4	Sensitive data (A02)	API responses include `password`/`token`/PII? `localStorage` stores tokens? Env vars in bundle?
5	Security headers	Missing `CSP`, `HSTS`, `X-Content-Type-Options`, `X-Frame-Options`, `Referrer-Policy`
6	Capacitor MASVS	`allowUniversalAccessFromFileURLs`? Bridge calls lack input validation? XSS can invoke native plugin?

Correlate with Sentry (search 401/403/CSP violations last 30 days) and research current OWASP guidance via Firecrawl.
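The security-headers row can be checked mechanically once response headers are in hand (for example from `browser_network_requests`). The sketch below assumes headers arrive as a plain object and treats a CSP `frame-ancestors` directive as covering `X-Frame-Options`:

```javascript
const REQUIRED_HEADERS = [
  'content-security-policy',
  'strict-transport-security',
  'x-content-type-options',
  'x-frame-options',
  'referrer-policy',
];

// Return the required security headers that are absent from a response.
function missingSecurityHeaders(headers) {
  const lower = {};
  for (const [name, value] of Object.entries(headers)) lower[name.toLowerCase()] = String(value);
  const missing = REQUIRED_HEADERS.filter(h => !(h in lower));
  // A CSP frame-ancestors directive supersedes X-Frame-Options in modern browsers.
  const csp = lower['content-security-policy'] || '';
  if (/frame-ancestors/i.test(csp)) return missing.filter(h => h !== 'x-frame-options');
  return missing;
}

console.log(missingSecurityHeaders({
  'Content-Security-Policy': "default-src 'self'; frame-ancestors 'none'",
  'X-Content-Type-Options': 'nosniff',
}));
```

HSTS is only meaningful over HTTPS, so a missing `Strict-Transport-Security` header on a local http dev URL is better recorded as N-A than as a defect.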

Phase 5: Performance red team (per matrix cell)

Focus on the conditions real users hit: cold start, slow network, large data.

Network throttling

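Use a Playwright CDP session from a Playwright script (Chromium only). This code runs in Node, not inside browser_evaluate, which executes in the page context:

// `page` is an existing Playwright Page on a Chromium browser
// Simulate a very slow link: about 50 kbit/s down, 20 kbit/s up, 300 ms latency (well below the DevTools Slow 3G preset)
const client = await page.context().newCDPSession(page);
await client.send('Network.emulateNetworkConditions', {
  offline: false, downloadThroughput: 50 * 1024 / 8,
  uploadThroughput: 20 * 1024 / 8, latency: 300
});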

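On Capacitor/Android, use adb shell tc qdisc add dev wlan0 root netem delay 300ms (this needs root and a kernel with netem support; the interface name may differ on your device or emulator).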

Attacks per surface

Attack	What to look for
Cold load (hard refresh)	Time-to-interactive > 3 s on simulated 3G? Visible content flash / layout shift?
Large list (50+ items)	Scroll jank? Does it virtualize or render all DOM nodes?
Pagination / infinite scroll	Duplicate items on page boundary? Missing items?
Payload size	`browser_network_requests` → any response > 500 KB for a list endpoint?
Memory growth	Load a large list, scroll to bottom, back to top × 5. JS heap grows without bound?
Simultaneous requests	Rapid navigation between pages. Race conditions in loading states?
Supabase N+1	Check `get_logs(service: 'postgres')` after exercising a feature. Repeated identical queries = N+1.
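Spotting N+1 patterns in a log dump is easier after normalizing literals out of the queries. This sketch assumes the query texts have already been extracted from the postgres logs into an array of strings; the threshold of 5 is an arbitrary starting point:

```javascript
// Replace literals so queries that differ only in parameters compare equal.
function normalizeQuery(sql) {
  return sql
    .replace(/'[^']*'/g, '?')   // string literals
    .replace(/\b\d+\b/g, '?')   // numeric literals
    .replace(/\s+/g, ' ')
    .trim()
    .toLowerCase();
}

// Return query shapes that repeat at least `threshold` times, most frequent first.
function findRepeatedQueries(queries, threshold = 5) {
  const counts = new Map();
  for (const q of queries) {
    const key = normalizeQuery(q);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts.entries()]
    .filter(([, count]) => count >= threshold)
    .map(([query, count]) => ({ query, count }))
    .sort((a, b) => b.count - a.count);
}

// Made-up log excerpt: one list query followed by a per-row lookup.
const logged = ['SELECT * FROM posts LIMIT 20'];
for (let id = 1; id <= 6; id++) logged.push(`SELECT * FROM comments WHERE post_id = ${id}`);
console.log(findRepeatedQueries(logged));
```

A repeated shape is a lead, not proof: confirm that the repeats come from a single page load before filing it as a defect.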

Supabase query performance

CallMcpTool(server: "plugin-supabase-supabase", toolName: "get_advisors", arguments: {
  "project_id": "<PROJECT_ID>"
})

Advisors flag missing indexes, sequential scans on large tables, and bloated RLS policies. Each is a Medium–High performance defect.

Phase 6: Finding-chaining & triage

Before writing the report, scan all DEFECT cells for chains:

Two Medium findings often combine into a High or Critical path. Example: "reflected input in URL param" + "CSP missing" = exploitable reflected XSS → data theft.
Auth weakness + IDOR = full account takeover chain.
Stale cache + no optimistic revert = silent data corruption.
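Chain scanning can be approximated by tagging findings and matching tag combinations. The rules below simply encode the three examples above; the tag names and finding shape are hypothetical, and a real sweep would extend the rule list:

```javascript
// Hypothetical chain rules derived from the examples in this section.
const CHAIN_RULES = [
  { needs: ['reflected-input', 'csp-missing'], impact: 'Exploitable XSS leading to data theft' },
  { needs: ['auth-weakness', 'idor'], impact: 'Full account takeover' },
  { needs: ['stale-cache', 'no-optimistic-revert'], impact: 'Silent data corruption' },
];

// Return every rule whose required tags are all covered by some finding.
function findChains(findings) {
  const chains = [];
  for (const rule of CHAIN_RULES) {
    const parts = rule.needs.map(tag => findings.find(f => f.tags.includes(tag)));
    if (parts.every(Boolean)) {
      chains.push({ findings: parts.map(f => f.id), impact: rule.impact });
    }
  }
  return chains;
}

console.log(findChains([
  { id: 3, severity: 'Medium', tags: ['reflected-input'] },
  { id: 7, severity: 'Medium', tags: ['csp-missing'] },
  { id: 9, severity: 'High', tags: ['idor'] },
]));
```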

For each finding, assign:

Field	Options
Severity	Critical / High / Medium / Low / Info
Likelihood	High (trivial to exploit) / Medium (requires access) / Low (theoretical)
Impact	Data loss / account takeover / PII leak / UX degradation / performance SLA breach
Affected component	`src/components/Foo.tsx`, `supabase/functions/bar`, etc.
Remediation	Specific fix at the file/function level, not vague advice
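A short sketch of how the triage fields might be validated and ordered before the report is written. Field names mirror the table above; sorting by severity and then likelihood is an assumption of this example rather than a rule stated by the skill:

```javascript
const SEVERITY_ORDER = ['Critical', 'High', 'Medium', 'Low', 'Info'];
const LIKELIHOOD_ORDER = ['High', 'Medium', 'Low'];
const REQUIRED_FIELDS = ['severity', 'likelihood', 'impact', 'component', 'remediation'];

// List the triage fields a finding is still missing.
function missingFields(finding) {
  return REQUIRED_FIELDS.filter(field => !finding[field]);
}

// Order findings by severity, then likelihood, and number them for the report.
function rankFindings(findings) {
  const sev = f => SEVERITY_ORDER.indexOf(f.severity);
  const lik = f => LIKELIHOOD_ORDER.indexOf(f.likelihood);
  return [...findings]
    .sort((a, b) => sev(a) - sev(b) || lik(a) - lik(b))
    .map((f, i) => ({ ...f, number: i + 1 }));
}

const ranked = rankFindings([
  { title: 'Slow list', severity: 'Medium', likelihood: 'High' },
  { title: 'IDOR on docs', severity: 'Critical', likelihood: 'Medium' },
  { title: 'Missing CSP', severity: 'Medium', likelihood: 'Medium' },
]);
console.log(ranked.map(f => `${f.number}. ${f.title}`));
```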

Phase 7: Defect report (the deliverable)

## Red-Team Defect Report — [App / Repo] — [Date]

### Scope
- Target type: [web | RN | Capacitor | mixed]
- Dev URL: [...]  Auth: [...]
- Features red-teamed: [list]
- Coverage matrix: [link or inline]

### Critical Defects (fix before any release)
| # | Feature | Surface / Component+State | Finding | Evidence | File | Remediation |
|---|---------|--------------------------|---------|----------|------|-------------|
| 1 | Documents | Detail page / other-user role | IDOR: User B can read User A documents via GET /api/docs/:id | Screenshot+network | src/app/api/docs/[id]/route.ts | Add ownership check: `WHERE user_id = auth.uid()` |

### High Defects
| # | Feature | Surface / Component+State | Finding | Evidence | File | Remediation |

### Medium Defects
| # | Feature | Surface / Component+State | Finding | Evidence | File | Remediation |

### Low / Info
| # | Feature | Surface / Component+State | Finding | Evidence | File | Remediation |

### Attack chains identified
1. [Finding #N] + [Finding #M] → [combined impact and attack path]

### Coverage matrix (final)
| Feature | Surface/Component+State | UI/UX | Pipeline | Security | Perf |
|---------|------------------------|-------|----------|----------|------|
...

### Backend truth-check
- Sentry: [new/related issues — count + highest severity]
- Supabase: [advisors, logs, RLS test results]

### Launch-readiness verdict
**Ready / Ready after Critical fixes / Not ready** — [2-sentence justification]

### Test data cleanup
- RT-TEST-* rows deleted: [Y/N / N-A]

Guardrails

Anti-stall always — never block >3 s on a browser action; incremental wait → snapshot → check; max 4 attempts per goal; mark [TIMEOUT] and move on.
Evidence for every DEFECT — screenshot + console + network + DB query result. "It looked broken" is not a defect.
No destructive PoCs — confirm XSS/SQLi with a benign payload (document.title='XSS', SELECT 1), never DROP TABLE or real data exfil.
Ask before mutating production rows: DDL (schema changes) needed for a requested fix can proceed, but DELETE/UPDATE/TRUNCATE on real production data requires explicit confirmation first.
Secrets by name only — never print .env values in chat or screenshots.
Honest verdict — do not declare "no defects" if any cell is untested. Mark untested cells N-A with a note explaining why (auth blocked, no data, etc.).
Offer to fix after the report — ask which Critical/High defects to address first; use test-playwright PDCA discipline when applying fixes.
MCP schemas first — always check tool schemas under mcps/<server>/tools/ before calling any MCP tool.
Pure-native iOS/Android out of scope — document it in the report if relevant.
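The anti-stall guardrail can be pictured as a bounded retry loop. This is a sketch only: the function name, the wait schedule, and the `check` callback (assumed to take a snapshot and return quickly) are illustrative, and the authoritative rules live in protocol-browser-anti-stall:

```javascript
const sleep = (ms) => new Promise(resolve => setTimeout(resolve, ms));

// Try a goal up to maxAttempts times with incremental waits, each capped at 3 s.
async function attemptWithAntiStall(check, { maxAttempts = 4, waitsMs = [500, 1000, 2000] } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (await check(attempt)) return { status: 'OK', attempts: attempt };
    if (attempt < maxAttempts) {
      await sleep(Math.min(waitsMs[attempt - 1] ?? 3000, 3000));
    }
  }
  // Give up, mark the goal, and let the caller move on to the next cell.
  return { status: '[TIMEOUT]', attempts: maxAttempts };
}

// Usage: the goal becomes true on the third check.
attemptWithAntiStall(async (n) => n === 3, { waitsMs: [10, 10, 10] })
  .then(result => console.log(result));
```

Returning a status object instead of throwing keeps a single stalled surface from aborting the whole sweep.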