| Field | Value |
| --- | --- |
| name | sf-audit-design |
| description | UI/UX design audit. |
| disable-model-invocation | true |
| argument-hint | [file-path \| "global" \| "deep"] (omit for full project standard audit) |
Before resolving any ShipFlow-owned file, load $SHIPFLOW_ROOT/skills/references/canonical-paths.md ($SHIPFLOW_ROOT defaults to $HOME/shipflow). ShipFlow tools, shared references, skill-local references/*, templates, workflow docs, and internal scripts must resolve from $SHIPFLOW_ROOT, not from the project repo where the skill is running. Project artifacts and source files still resolve from the current project root unless explicitly stated otherwise.
Trace category: conditionnel.
Process role: source-de-chantier.
Before producing the final report, load $SHIPFLOW_ROOT/skills/references/chantier-tracking.md when this run is attached to a spec-first chantier. If exactly one active specs/*.md chantier is identified, append the current run to Skill Run History, update Current Chantier Flow when the run changes the chantier state, and include a final Chantier block. If no unique chantier is identified, do not write to any spec; report Chantier: non applicable or Chantier: non trace with the reason.
Because this skill has process role source-de-chantier, evaluate the standard threshold from $SHIPFLOW_ROOT/skills/references/chantier-tracking.md before the final report. If the findings reveal non-trivial future work and no unique chantier owns it, do not write to an existing spec; add a Chantier potentiel block with oui, non, or incertain, a proposed title, reason, severity, scope, evidence, recommended /sf-spec ... command, and next step. If the work is only a direct local fix or already belongs to the current chantier, state Chantier potentiel: non with the concrete reason.
Before producing the final report, load $SHIPFLOW_ROOT/skills/references/reporting-contract.md.
Default to report=user: concise, findings-first, and focused on top issues, proof gaps, chantier potential, and the next real action. Use report=agent, handoff, verbose, or full-report for the detailed audit matrix, domain checklist output, command evidence, assumptions, confidence limits, and handoff notes.
- `pwd`
- `head -100 CLAUDE.md 2>/dev/null || echo "no CLAUDE.md"`
- `head -60 BRANDING.md 2>/dev/null || echo "no BRANDING.md; run /sf-init to generate"`
- `ls BUSINESS.md BRANDING.md GUIDELINES.md 2>/dev/null || echo "none found"`
- `grep -HE '^(metadata_schema_version|artifact_version|status|updated|confidence|next_review):' BUSINESS.md BRANDING.md GUIDELINES.md 2>/dev/null || echo "no metadata fields found"`
- `cat tailwind.config.* 2>/dev/null | head -80 || echo "no tailwind config"`
- `cat src/styles/global.css 2>/dev/null || cat src/assets/styles/*.css 2>/dev/null | head -100 || echo "no global styles found"`
- `find src/pages src/app -name "*.astro" -o -name "*.tsx" -o -name "*.vue" 2>/dev/null | grep -v node_modules | sort`
- `find src/components -name "*.astro" -o -name "*.tsx" -o -name "*.vue" 2>/dev/null | grep -v node_modules | sort`
- `grep -rn --include='*.js' --include='*.ts' --include='*.jsx' --include='*.tsx' --include='*.astro' --include='*.vue' -E '\balert\(|\bconfirm\(|\bprompt\(|\bdocument\.write\(' src/ 2>/dev/null | head -20 || echo "none found"`
- `grep -rn --include='*.astro' --include='*.vue' --include='*.tsx' --include='*.jsx' --include='*.html' -iE '<marquee|<blink|<center|<font ' src/ 2>/dev/null | head -20 || echo "none found"`
- `grep -rn --include='*.astro' --include='*.vue' --include='*.tsx' --include='*.jsx' --include='*.html' -E '<div[^>]+role=["\x27]dialog' src/ 2>/dev/null | head -10 || echo "none found"`
- `grep -rn --include='*.astro' --include='*.vue' --include='*.tsx' --include='*.jsx' --include='*.css' --include='*.scss' -E '#[0-9a-fA-F]{3,6}\b|rgb\(|rgba\(' src/ 2>/dev/null | grep -v node_modules | wc -l || echo "0"`
- `find src -type f \( -name "theme*" -o -name "*Theme*" -o -name "tokens*" -o -name "design-tokens*" -o -name "palette*" \) 2>/dev/null | grep -v node_modules | head -20 || echo "none found"`
- `grep -rn --include='*.ts' --include='*.tsx' --include='*.js' --include='*.jsx' --include='*.vue' --include='*.astro' --include='*.svelte' --include='*.dart' --include='*.kt' --include='*.swift' -E 'ThemeMode|prefers-color-scheme|color-scheme|themeMode|theme_mode|darkMode|dark_mode' src/ lib/ 2>/dev/null | grep -v node_modules | head -10 || echo "none found"`
- `grep -rh --include='*.css' --include='*.scss' -E '^\s*--(fs|fz|font-size|space|spacing|gap|color|c|bg|surface|duration|easing|ease|motion)-' src/ 2>/dev/null | sort -u | head -40 || echo "none found"`
- `grep -rn --include='*.css' --include='*.scss' --include='*.vue' --include='*.astro' --include='*.tsx' --include='*.jsx' -E 'font-size:\s*[0-9]' src/ 2>/dev/null | grep -v 'var(--' | grep -v node_modules | wc -l || echo "0"`
- `grep -rn --include='*.css' --include='*.scss' -E '(margin|padding|gap|top|right|bottom|left):\s*[0-9]+(\.[0-9]+)?(px|rem|em)' src/ 2>/dev/null | grep -v 'var(--' | grep -v node_modules | wc -l || echo "0"`
- `grep -rn --include='*.css' --include='*.scss' -E '(transition|animation):\s*' src/ 2>/dev/null | grep -v 'var(--' | grep -v node_modules | wc -l || echo "0"`
- `grep -rn --include='*.css' --include='*.scss' --include='*.ts' --include='*.tsx' --include='*.js' --include='*.jsx' --include='*.vue' --include='*.astro' --include='*.svelte' -E 'prefers-reduced-motion' src/ 2>/dev/null | grep -v node_modules | wc -l || echo "0"`
- `find src -type d \( -name "design-system" -o -name "styleguide" -o -name "tokens-debug" -o -name "theme-debug" \) 2>/dev/null | grep -v node_modules | head -5 || echo "none found"`
- `grep -rln --include='*.ts' --include='*.tsx' --include='*.js' --include='*.jsx' -E "(next-auth|@clerk/|better-auth|@auth/|lucia|@supabase/auth|firebase/auth|getServerSession|useSession|useUser|currentUser)" src/ app/ pages/ 2>/dev/null | grep -v node_modules | head -3 || echo "none (theme sync to server not required)"`

Before starting, check the context loaded above. If BRANDING.md is missing, display a warning at the top of the report:

⚠ Missing context:
- [BRANDING.md missing] The design audit cannot verify visual consistency with the brand identity.
→ Run /sf-init to generate this file, or /sf-docs update to update it.

If the file exists but seems incomplete, flag it. Continue the audit in all cases.
BUSINESS.md, BRANDING.md, and GUIDELINES.md are ShipFlow decision contracts for design audits when they define visual identity, audience expectations, trust posture, product promise, or interface guidelines. Before scoring:
- Read `artifact_version`, `status`, `updated`, `confidence`, and `next_review` when available.
- If `artifact_version`, `status`, or `updated` is missing, add a proof gap: `business doc metadata incomplete`.
- If `status` is `draft`, `stale`, `outdated`, `deprecated`, or `confidence` is `low`, cap confidence and state that design scoring depends on an unreviewed decision contract.
- If `next_review` is before today's absolute date, treat the document as stale unless a newer replacement is explicit.
- A to the affected category.
- Include a `Business metadata versions` section in every report.

Use ShipFlow versioning semantics: patch = visual wording/example clarification without decision change, minor = changed brand/UI guidance inside the same strategy, major = changed ICP, positioning, trust posture, accessibility policy, market, or brand strategy.
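The metadata rules above can be sketched as a small check. This is a minimal illustration in Python, not part of ShipFlow: `DocMetadata`, `proof_gaps`, and every message except `business doc metadata incomplete` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Status values that mark a decision contract as unreviewed (from the rules above).
UNREVIEWED_STATUSES = {"draft", "stale", "outdated", "deprecated"}


@dataclass
class DocMetadata:
    """Hypothetical container for the metadata fields of one business doc."""
    artifact_version: Optional[str] = None
    status: Optional[str] = None
    updated: Optional[str] = None
    confidence: Optional[str] = None
    next_review: Optional[date] = None


def proof_gaps(name: str, meta: DocMetadata, today: date) -> List[str]:
    """Return the proof gaps one document contributes before scoring."""
    gaps: List[str] = []
    if not (meta.artifact_version and meta.status and meta.updated):
        gaps.append(f"{name}: business doc metadata incomplete")
    status = (meta.status or "").lower()
    confidence = (meta.confidence or "").lower()
    if status in UNREVIEWED_STATUSES or confidence == "low":
        gaps.append(f"{name}: unreviewed decision contract, cap confidence")
    if meta.next_review is not None and meta.next_review < today:
        gaps.append(f"{name}: next_review has passed, treat as stale")
    return gaps
```

A document with complete metadata, a reviewed status, and a future `next_review` yields an empty list, so it adds nothing to the report's proof gaps.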
The design must serve a coherent user promise. Flag gaps in product coherence, docs mismatch, misleading UI state, and unproven interface claim as product risks.
- `$ARGUMENTS` is "deep" → DEEP MODE: orchestrate 3 specialist agents in parallel (design tokens, components, a11y) for a pro-grade audit on a large project.
- `$ARGUMENTS` is "global" → GLOBAL MODE: audit ALL projects in the workspace (standard audit per project).
- `$ARGUMENTS` is a file path → PAGE MODE: review that single page.
- `$ARGUMENTS` is empty → PROJECT MODE (standard): full design audit of the entire project using the checklist below.

Standard vs Deep: standard mode (default) is self-sufficient: one skill, one report, professional checklist covering the 13 categories below. Deep mode spawns dedicated specialist skills via the Agent tool, each one focusing exclusively on its domain (no attention dilution). Use deep when the project is large (> ~30 component files) or when you need audit-grade proof on each design token / component architecture / WCAG 2.2 concern.
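As a rough sketch of the mode selection above, assuming `$ARGUMENTS` arrives as one string: the function name `resolve_mode` and the returned labels are hypothetical, not a ShipFlow API.

```python
from typing import Optional, Tuple


def resolve_mode(arguments: str) -> Tuple[str, Optional[str]]:
    """Map the raw argument string to (mode, target_path)."""
    # Tolerate surrounding whitespace and quotes, e.g. '"deep"'.
    arg = arguments.strip().strip('"').strip()
    if arg == "deep":
        return ("deep", None)
    if arg == "global":
        return ("global", None)
    if arg:
        # Anything else is treated as a file path for PAGE MODE.
        return ("page", arg)
    # Empty argument: standard full-project audit.
    return ("project", None)
```

Checking the two keywords before the file-path branch matters: a file literally named `deep` or `global` would otherwise be ambiguous, and the list above gives the keywords priority.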
Orchestrator that delegates to three specialist sub-skills, each running as a parallel subagent. Use this when a single skill's attention would be diluted across too many distinct checklists.
In one single message, use the Agent tool three times (one call per specialist). All three agents run in parallel. Each agent is subagent_type: "general-purpose" and receives:
The three specialists:
sf-audit-design-tokens — audits the 4 design token systems (theme + typography + spacing + motion) with coverage matrix per mode, modular ratio analysis, dependency graph, DTCG compliance, historical drift. Subscores: Theme Architecture, Typography Tokens, Spacing Tokens, Motion Tokens, Universal Palette Socle.
sf-audit-components — audits component architecture: atomic design inventory, duplication detection, god components (>15 props), unused components, abstraction quality (AHA rule), variant systems adoption (CVA, tailwind-variants), headless primitives adoption (Radix, React Aria), composition vs configuration, API hygiene.
sf-audit-a11y — full WCAG 2.2 audit (A/AA/AAA), keyboard navigation patterns per W3C APG, focus management (trap, roving tabindex, virtual focus), ARIA patterns per component, aria-live regions, screen reader announcements, focus appearance (2.4.11), target size (2.5.8), dragging alternatives (2.5.7), consistent help (3.2.6).
Agent prompt template (adapt per specialist):
You are auditing the project at: [project path]
Read and execute the specialist skill at:
${SHIPFLOW_ROOT:-$HOME/shipflow}/skills/sf-audit-[design-tokens|components|a11y]/SKILL.md
Read the project context first:
- `CLAUDE.md`
- `BUSINESS.md`, `BRANDING.md`, and `GUIDELINES.md` metadata if present
- the theme/token/motion/auth hints surfaced by this skill when relevant
Before scoring, identify linked systems and downstream consequences.
Evaluate product coherence and documentation alignment when UI states or feature promises changed.
Read/report business metadata versions and flag missing, stale, low-confidence, or unversioned contracts as proof gaps.
Do not ask follow-up questions. If context is missing, state assumptions and confidence limits.
Run the full skill read-only (no code fixes). Return a structured sub-report with:
- Global score for this domain (A/B/C/D)
- Subscores per phase/category
- Issue counts by severity (🔴 critical / 🟠 high / 🟡 medium)
- Top 5 priority improvements for this domain (file:line + fix + Why)
- Linked systems / consequences to watch
- Product/docs coherence gaps
- Business metadata versions
- Confidence / missing context
- Raw findings list (file:line + description + severity + Why)
Do not update AUDIT_LOG.md or TASKS.md — the orchestrator will do that.
Do not edit any source files.
After all three agents return, compile one consolidated deep-audit report:
DEEP DESIGN AUDIT: [project name]
═══════════════════════════════════════
Mode: DEEP (3 specialists, parallel) — [date]
GLOBAL SCORES
Design Tokens [A/B/C/D] — from sf-audit-design-tokens
Components [A/B/C/D] — from sf-audit-components
Accessibility [A/B/C/D] — from sf-audit-a11y
═══════════════════════════════════════
OVERALL [A/B/C/D] ← worst of the three (pro-grade standard)
─────────────────────────────────────
DESIGN TOKEN SYSTEMS (from sf-audit-design-tokens)
Theme Architecture [A/B/C/D]
Typography Tokens [A/B/C/D]
Spacing Tokens [A/B/C/D]
Motion Tokens [A/B/C/D]
Universal Palette Socle [A/B/C/D]
Issues: X 🔴 | Y 🟠 | Z 🟡
─────────────────────────────────────
COMPONENT ARCHITECTURE (from sf-audit-components)
Atomic Design Inventory [A/B/C/D]
Duplication [A/B/C/D]
Abstraction Quality (AHA) [A/B/C/D]
Variant Systems Adoption [A/B/C/D]
Headless Primitives [A/B/C/D]
API Hygiene [A/B/C/D]
Issues: X 🔴 | Y 🟠 | Z 🟡
─────────────────────────────────────
ACCESSIBILITY (from sf-audit-a11y)
WCAG 2.2 (A/AA) [A/B/C/D]
Keyboard Navigation [A/B/C/D]
Focus Management [A/B/C/D]
ARIA Patterns [A/B/C/D]
aria-live Regions [A/B/C/D]
Focus Appearance [A/B/C/D]
Target Size [A/B/C/D]
Issues: X 🔴 | Y 🟠 | Z 🟡
─────────────────────────────────────
CONSOLIDATED PRIORITY IMPROVEMENTS (top 10 across all three domains, ordered by impact)
⚡ [domain] [file:line] description — Why: [principle]
...
ALL CRITICAL ISSUES (🔴)
[domain] [file:line] description — Why: [principle]
...
Total: X 🔴 critical, Y 🟠 high, Z 🟡 medium across 3 domains
═══════════════════════════════════════
Overall rule for deep mode: overall score is the worst of the three (pro-grade standard — a project isn't "A" in design if its a11y is "C"). In standard mode, overall is an average; in deep mode, it's the worst.
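The two aggregation rules can be sketched as follows. This is an illustration only: the source says standard mode uses "an average" without defining it, so the numeric mapping and the half-rounds-toward-worse choice below are assumptions, and `overall_grade` is a hypothetical name.

```python
import math
from typing import Sequence

GRADES = "ABCD"  # ordered best to worst


def overall_grade(grades: Sequence[str], deep: bool) -> str:
    """Deep mode: worst grade wins. Standard mode: averaged grade (assumed rounding)."""
    indexes = [GRADES.index(g) for g in grades]
    if deep:
        # Worst of the domains: the highest index in the best-to-worst ordering.
        return GRADES[max(indexes)]
    mean = sum(indexes) / len(indexes)
    # Assumption: exact halves round toward the worse grade.
    return GRADES[math.floor(mean + 0.5)]
```

With tokens A, components B, and a11y C, the deep overall is C, while an averaged standard score over the same three grades would be B.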
- `./AUDIT_LOG.md`: append one row with mode = deep, three sub-scores, overall = worst.
- `${SHIPFLOW_DATA_DIR:-$HOME/shipflow_data}/AUDIT_LOG.md`: same, Design column = overall, mark mode as deep in notes column.
- `TASKS.md`: replace the `### Audit: Design` subsection with three sub-subsections (`#### Tokens`, `#### Components`, `#### A11y`), one task per critical / high / medium issue per domain.
- `${SHIPFLOW_DATA_DIR:-$HOME/shipflow_data}/TASKS.md`: same structure under the project section.

After the unified report, ask the user: "Which domain should I fix first: tokens, components, or a11y?" Fix one domain at a time, re-running the relevant specialist after each batch to re-score. Do not attempt to fix all three in one pass; the context would explode.
Audit ALL UI projects in the workspace for design, UX, and accessibility issues.
Read ${SHIPFLOW_DATA_DIR:-$HOME/shipflow_data}/PROJECTS.md — check the Domain Applicability table. Identify projects with ✓ in the Design column.
Use AskUserQuestion to let the user choose:
- multiSelect: true

Use the Task tool to launch one agent per selected project, ALL IN A SINGLE MESSAGE (parallel). Each agent: subagent_type: "general-purpose".
Agent prompt must include:
- `cd [path]` then read `CLAUDE.md` for project context
- `BRANDING.md`, theme/token files, reduced-motion support, auth/theme sync hints)
- `BUSINESS.md`, `BRANDING.md`, and `GUIDELINES.md` metadata versions; flag missing, stale, low-confidence, or unversioned contracts as proof gaps before scoring
- `Scope understood`, `User story / interface promise`, `Business metadata versions`, `Context read`, `Linked systems & consequences`, `Product/docs coherence`, `Risky assumptions / proof gaps`, `Findings`, `Confidence / missing context`

After all agents return, compile a cross-project design report:
GLOBAL DESIGN AUDIT — [date]
═══════════════════════════════════════
PROJECT SCORES
[project] [A/B/C/D] — summary
...
CROSS-PROJECT PATTERNS
[Systemic design issues in 2+ projects]
ALL ISSUES BY SEVERITY
🔴 [project] file:line — description — Why: [principle]
🟠 [project] file:line — description — Why: [principle]
🟡 [project] file:line — description — Why: [principle]
PRIORITY IMPROVEMENTS ACROSS PROJECTS
⚡ [project] file:line — description — Why: [principle]
... (max 10, ordered by impact)
Total: X critical, Y high, Z medium across N projects
═══════════════════════════════════════
Update ${SHIPFLOW_DATA_DIR:-$HOME/shipflow_data}/AUDIT_LOG.md (one row per project, Design column) and ${SHIPFLOW_DATA_DIR:-$HOME/shipflow_data}/TASKS.md (each project's ### Audit: Design subsection).
Ask: "Which projects should I fix?" — list projects with scores. Fix only approved projects, one at a time.
`$ARGUMENTS`).

Score each category A/B/C/D (A = excellent, D = critical issues). Be strict: apply a professional standard.
- `clamp()` instead of abrupt media-query breakpoints. Formula: `clamp(MIN, PREFERRED, MAX)` where PREFERRED is a rem + vw expression (e.g., `clamp(1rem, 0.5rem + 2vw, 2rem)`). Key rules:
  - `rem` (not `px`) in clamp values so the font respects user browser zoom/font-size preferences (accessibility)
  - `rem` base + `vw` slope: pure `vw` ignores user settings
  - `font-size`: likely replaceable with a single `clamp()` declaration
- `font-size` values in components: every font-size resolves to a token (`var(--fs-*)`, `theme.fontSize.*`, `Theme.of(context).textTheme.*`). Literal `font-size: 1.2rem` in a component file = violation. Acceptable exceptions: HTML email templates (mail clients require inline px), `em` units relative to parent (icons in text). Flag count: literal font-sizes outside tokens (see context block).
- `font-size` + `line-height` + `letter-spacing` (either as a single object/mixin or a triple of co-named CSS variables `--fs-base`, `--lh-base`, `--ls-base`). Isolated font-size tokens without paired line-height force per-component overrides → drift.
- `xs`, `sm`, `base`, `lg`, `xl`, `2xl`, `3xl`, `hero`). Simple, low cognitive load.
- `body`, `body-sm`, `caption`, `heading-1`, `heading-2`, `display`). Survives refactors and onboards new contributors faster.
- `1.1×`, `1.4×`, `1.2×` between consecutive levels: that's chaos, not a scale. Recommend Utopia.fyi for pro projects to generate the scale from a base size + ratio + viewport range.
- `vw` component caveat in `clamp()`: the vw portion must stay moderate (≤ ~3vw) and must always be added to a solid rem base. A clamp like `clamp(1rem, 4vw, 2rem)` (vw-dominant, no rem in preferred) breaks WCAG 1.4.4 Resize Text: at 200% browser zoom, the user's font-size preference is ignored because vw is computed from viewport, not from user font scale. Pattern to enforce: `clamp(MIN_REM, X_REM + Y_VW, MAX_REM)` with Y ≤ 3.
- `oklch()` (perceptually uniform) over `hsl()`/hex
- `color-mix(in oklch, var(--brand) 70%, white)` rather than hand-picked hex: one token change re-derives the palette
- `color-mix()` declarations have a static fallback line above them (old browsers drop the declaration otherwise)
- `<meta name="color-scheme" content="light dark">` + root `color-scheme: light dark` + CSS uses `light-dark(<light>, <dark>)`: eliminates `@media (prefers-color-scheme: dark)` duplication
- `success`, `warning`, `danger`, `info`, `neutral` (intent-based names, not hue-based). Each one is broken down into the variants the project actually uses (`*-bg`, `*-fg`, `*-border`, `*-subtle` typically). These are the floor: domain-specific intents (`approve`, `reject`, `pending`, `archived`, etc.) are added on top for the project's vocabulary.
- `surface-base`, `surface-raised`, `surface-overlay`, `surface-sunken` (or equivalent). Surfaces ≠ background colors: they encode elevation/role, not a hex value.
- `Color(0xFF...)`, `Colors.white`, `Colors.orange`, `text-blue-500`, `bg-red-100` in business components = violation. Names by intent (`text-danger`, `bg-surface-raised`), never by hue. Brand colors are an acceptable exception (`brand-primary`) but should still be named by role, not by what color they happen to be.
- `--color-error` and `--color-danger` exist for the same intent, that's drift: pick one and migrate.
- `success` token defined only for light mode breaks dark mode. Audit all semantic tokens across all theme modes.
- `prefers-reduced-motion`
- `outline: none` without replacement
- `useEffect`-on-click, sync work in `onClick`. INP replaced FID as Core Web Vital March 2024
- `onClick`/`role="button"`/`tabIndex={0}`/`onKeyDown` to parent, `cursor-pointer` + hover state for feedback, `e.stopPropagation()` on secondary actions. Skip if: multiple competing actions, destructive action (precise click = safety), drag surface, or form controls (use `<label>` instead)
- `alert()`, `confirm()`, or `prompt()` browser dialogs: use toast/modal components
- `document.write()`: never acceptable
- `<marquee>`, `<blink>`, `<center>`, `<font>`)
- `onclick="..."` handlers: use framework event handling
- `innerHTML` for user-facing content (XSS risk)
- `<dialog>` + `showModal()`, NOT `<div role="dialog">` (native focus trap, Esc, backdrop, top-layer for free)
- `popover` attribute, NOT `<dialog>` (popovers have no `aria-modal`)
- `inert` to siblings, NOT `aria-hidden` (`aria-hidden` on focusable subtree = WCAG fail)
- `.card.has-image`): use `:has()` (`.card:has(img)`), 95%+ support in 2026
- `@container` with `container-type: inline-size` on wrapper, NOT `@media` (media queries respond to viewport, not actual component space)
- `container-type: size` (height queries) except on fixed-dimension dashboards (expensive: ~10-15ms layout cost per pass at scale)
- `:has()` selectors are child-scoped (`:has(> img)` not `:has(img)`): bare descendant `:has()` forces full subtree walks on every mutation
- `@view-transition { navigation: auto }` + `view-transition-name` on hero elements (Baseline Oct 2025, free smooth MPA transitions)
- `::view-transition-*` animations wrapped in `@media (prefers-reduced-motion: no-preference)`
- `grid-template-rows: subgrid` on the card (fixes "buttons at different heights")
- `animation-timeline: scroll()` / `view()`) gated by `@media (prefers-reduced-motion: no-preference)` AND only animate `transform`/`opacity` (compositor-only)
- `anchor-name`/`position-anchor` + `popover` attribute, NOT Floating UI/Popper (Baseline 2026, ~91% traffic)
- `content-visibility: auto` + `contain-intrinsic-size: auto <px>` (30-50% faster initial render)

Flag these patterns: v0, bolt, lovable, Figma Make produce ~160 issues/app on average, 90%+ have a11y failures:
- `grid flex`, `w-full w-64`, `p-2 p-6`). AI pattern: adds classes without removing old ones
- `text-${color}-500`). JIT scans plain text; concatenated classes get purged
- `<div onClick>` always has `role="button"` + `tabIndex={0}` + `onKeyDown` (Enter/Space): top a11y failure in AI-generated apps
- `alt`; all form inputs have `<label>` (not placeholder-only)
- `:focus-visible`, `:disabled`, loading, error, empty
- `required`, `pattern`, `type="email"`) before custom JS validation

The design tokens above (color, typography, spacing, motion) only pay off if there is a theme system that can swap them at runtime. Audit the architecture, not just the values.
- `light`, `dark`, and `system` (follows OS preference). The `system` mode is the default for new users; explicit choice overrides it. Single-mode projects (dark-only, light-only) are acceptable only with a documented justification in BRANDING.md (e.g., "terminal product, dark-only by brand"). Absence of justification = violation.
- `theme_preference.{ts,dart,kt,...}`) that exposes the current mode, normalizes incoming values (any unknown value → `system`, never crash), and emits change events. Scattered `localStorage.getItem('theme')` reads across components = violation.
- `localStorage`, `SharedPreferences`, `UserDefaults`, `chrome.storage`).
- `<script>` in `<head>` reads `localStorage` and sets `data-theme` on `<html>` before stylesheets compute. SSR: read cookie or header. Native: read preference synchronously in app bootstrap.
- `prefers-color-scheme` honored at first render for new users with no stored preference (fallback before they choose).
- `Appearance`, `Apparence`, `Display`) exposing Light / Dark / System choice. Hidden behind a debug menu = violation; this is a user-level preference, not a developer toggle.
- `if (isDark) ...` to swap colors. Branching belongs in the token layer, not in business code.
- `var(--space-*)`, `theme.spacing.*`, `EdgeInsets.fromLTRB(theme.spacing.lg, ...)`). Literal `padding: 12px` in components = violation. Acceptable exceptions: `0`, `1px` borders, hairlines.
- `1.5×`). Random ratios (`5px`, `13px`, `27px`) = violation.
- `xs`, `sm`, `md`, `lg`, `xl`, `2xl`, `3xl`)
- `gutter`, `stack-tight`, `stack-loose`, `inset-card`, `section`)
- `clamp()` (same accessibility rules as typography: rem-based, moderate vw). Component-level spacing (button padding, icon gap) stays static: fluid spacing inside components creates visual instability.
- `padding: 14px`, the question is "why not the nearest token?". Either the scale needs a new token, or the component needs to use an existing one. Never invent one-off values.
- `transition-duration`, `animation-duration`, and easing curves resolve to tokens (`var(--duration-fast)`, `var(--ease-out-quart)`). Literal `transition: 200ms ease` in a component = violation.
- `duration-instant` (50ms), `duration-fast` (150ms), `duration-base` (250ms), `duration-slow` (400ms), `duration-deliberate` (600ms). ❌ `duration-200ms`, `ms-md`. Same logic as colors: name the role, not the literal.
- `ease-out-standard` (entrances), `ease-in-standard` (exits), `ease-in-out-standard` (move/morph). Avoid CSS keywords (`ease`, `ease-in-out`): they're reserved for prototypes; production needs explicit cubic-béziers.
- `prefers-reduced-motion` honored systematically: every non-trivial animation/transition is wrapped in `@media (prefers-reduced-motion: no-preference)` or short-circuited at the source (a `motion()` helper that returns `0ms` when the user opts out). Decorative parallax, scroll-driven animations, autoplay carousels → must respect the OS preference. Required animations (loading spinners, focus outlines) can stay but should be minimal.
- `transform`, `opacity`, `filter` only. Animating `width`, `height`, `top`, `left`, `margin` triggers layout on every frame: flag as performance violation.

For each issue rated B or worse:
DESIGN REVIEW: [page name]
─────────────────────────────────────
Business metadata:
BUSINESS.md artifact_version=[x|missing] status=[x|missing] updated=[date|missing] confidence=[x|missing]
BRANDING.md artifact_version=[x|missing] status=[x|missing] updated=[date|missing] confidence=[x|missing]
GUIDELINES.md artifact_version=[x|missing] status=[x|missing] updated=[date|missing] confidence=[x|missing]
Visual Hierarchy [A/B/C/D] — one-line summary
Typography [A/B/C/D] — one-line summary (incl. design token system)
Color & Contrast [A/B/C/D] — one-line summary (incl. semantic palette)
Responsiveness [A/B/C/D] — one-line summary
Consistency [A/B/C/D] — one-line summary
Accessibility [A/B/C/D] — one-line summary
Usability (NN/g) [A/B/C/D] — one-line summary
Modern Patterns [A/B/C/D] — one-line summary
Modern CSS 2026 [A/B/C/D] — container queries, :has, view transitions, oklch
AI-Gen Smells [A/B/C/D] — Tailwind conflicts, missing states, div-as-button
─────────────────────────────────────
DESIGN TOKEN SYSTEM
Theme Architecture [A/B/C/D] — modes, persistence, sync, settings UI
Spacing System [A/B/C/D] — centralization, scale coherence, naming
Motion System [A/B/C/D] — tokens, prefers-reduced-motion, perf
─────────────────────────────────────
OVERALL [A/B/C/D]
PRIORITY IMPROVEMENTS (high impact, bounded effort)
⚡ [file:line] description — Why: [principle]
⚡ [file:line] description — Why: [principle]
...
Fixed: X issues | Remaining: Y issues
Priority improvement criteria: bounded changes that fix a B-level or worse issue without requiring a full redesign. Examples: darkening a hex for contrast, adding alt text, bumping a target size, adding prefers-reduced-motion. List max 5, ordered by impact.
If the current directory has no project markers (no package.json, no src/ dir, no tailwind.config.*) BUT contains multiple project subdirectories — you are at the workspace root, not inside a project.
Use AskUserQuestion:
- multiSelect: true
- `${SHIPFLOW_DATA_DIR:-$HOME/shipflow_data}/PROJECTS.md`: label = project name, description = stack

Then proceed to GLOBAL MODE with the selected projects.
Read the global styles, framework config (Tailwind, theme files), and 5-10 representative components. Document the four design token systems and report their state:
Color & semantic palette:
- `text-gray-600` AND `text-gray-500` for similar purposes).
- `success`, `warning`, `danger`, `info`, `neutral` (or equivalent). Missing = violation.
- `surface-base`, `surface-raised`, `surface-overlay`, `surface-sunken` (or equivalent).
- `bg-blue-500`, `Color(0xFFFF0000)`, `text-orange-600` in business components).

Theme system architecture:
- `light`, `dark`, `system`. Single-mode = OK only if BRANDING.md documents the choice.
- `system`, never crash).
- `<head>`, SSR cookie, or native sync bootstrap).

Typography design token system:
- `font-size` values in components vs token references (see context block).
- `font-size` + `line-height` + `letter-spacing`. Isolated font-size tokens = violation.
- `1.1×`, `1.4×`, `1.2×`), recommend regenerating with Utopia.fyi: base + ratio + viewport range produce a coherent scale.
- `clamp()` for smooth viewport scaling. Verify clamp values use `rem` (not `px`), and the preferred value combines rem + vw (with vw ≤ ~3vw, never dominant). Formula: `slope = (max-size - min-size) / (max-vw - min-vw)`, `intercept = min-size - (min-vw × slope)`, then `clamp(min-size, intercept-rem + slope×100vw, max-size)`.

Spacing design token system:
- `padding`/`margin`/`gap` in component files (see context block).
- `gutter`, `stack-tight`, `inset-card`) for larger ones. Mixed = violation.
- `clamp()`. Component-level spacing should stay static.

Motion design token system:
- `transition`/`animation` declarations (see context block).
- `duration-fast`, `ease-out-standard`), not by value (`duration-200ms`).
- `prefers-reduced-motion`: check support count (see context block). Should be present in every non-trivial animation block.
- `width`, `height`, `top`, `left`, `margin` (layout-triggering).

Component patterns: Identify repeated patterns (cards, buttons, sections). Flag inconsistencies between instances.
Breakpoint usage: Check if responsive breakpoints are consistent.
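The slope/intercept formula from the typography checks above can be worked through numerically. The sketch below assumes font sizes are given in rem and viewport widths in px, converted with a 16px root; the helper name `fluid_clamp` and the default root size are assumptions made for the example.

```python
from typing import Tuple


def fluid_clamp(min_rem: float, max_rem: float,
                min_vw_px: float, max_vw_px: float,
                root_px: float = 16.0) -> Tuple[str, float]:
    """Build a clamp() string and return it with the vw coefficient."""
    # Express viewport widths in rem so every term shares one unit.
    min_vw_rem = min_vw_px / root_px
    max_vw_rem = max_vw_px / root_px
    slope = (max_rem - min_rem) / (max_vw_rem - min_vw_rem)
    intercept = min_rem - min_vw_rem * slope
    vw = slope * 100  # slope x 100vw, as in the formula above
    css = f"clamp({min_rem:g}rem, {intercept:.4f}rem + {vw:.4f}vw, {max_rem:g}rem)"
    return css, vw


css, vw = fluid_clamp(1, 2, 320, 1280)
print(css)      # clamp(1rem, 0.6667rem + 1.6667vw, 2rem)
print(vw <= 3)  # True: the vw share stays within the moderate range
```

At a 320px viewport the preferred value resolves to 16px (1rem) and at 1280px to 32px (2rem), so the clamp bounds are only reached outside that range.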
Search the entire codebase for legacy/outdated patterns:
JavaScript browser dialogs:
- `alert(`: replace with toast/notification component
- `confirm(`: replace with modal dialog component
- `prompt(`: replace with form input/modal
- `document.write(`: never acceptable

Outdated UI patterns:
- `<marquee>`, `<blink>`, `<center>`, `<font>` tags
- `<table>` used for layout (tables only for tabular data)
- `onclick="..."` handlers
- `<iframe>` for layout purposes (embeds like YouTube/maps are fine)

Deprecated CSS patterns:
- `!important` overuse (more than 2-3 instances is a red flag)
- `14px` for body text
- `float` used for primary layout (use flexbox/grid)
- `@media` blocks that only change `font-size` at breakpoints. Replace with `clamp(MIN, PREFERRED, MAX)` for smooth scaling. Use rem-based values (not px) to respect user zoom preferences
- `clamp()` with px units: flag `clamp(Xpx, ...)` patterns; should use rem so font scales with user browser settings
- `clamp()` with pure vw preferred value: e.g., `clamp(1rem, 4vw, 2rem)` ignores user font-size; preferred value must combine rem + vw (e.g., `0.5rem + 2vw`)
- hex/`hsl` in tokens where `oklch()` would be better (perceptual uniformity)
- `@media` queries responding to viewport where `@container` would respond to actual component space
- `<div role="dialog">` where native `<dialog>` would work
- `:has()`)

Legacy JS patterns:
- `innerHTML` for user-facing content
- `setTimeout`/`setInterval` for UI state

Undersized click targets & action propagation:
Look for containers where a small child element (icon button, hamburger, link, toggle) is the sole interactive element but the parent is not clickable. Flag when: container has descriptive content + exactly one primary action + container is NOT already interactive. Fix: propagate action to parent with onClick, role="button", tabIndex={0}, cursor-pointer + hover state. Do NOT propagate when: multiple competing actions, destructive action, drag surface, or form controls.
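The flag and skip conditions in the paragraph above amount to a simple predicate. The sketch below is illustrative only: the `Container` fields and `should_propagate_click` are hypothetical names, and deciding what counts as "descriptive content" is left to the auditor.

```python
from dataclasses import dataclass


@dataclass
class Container:
    """Hypothetical summary of one container found during the scan."""
    has_descriptive_content: bool
    primary_actions: int            # number of competing primary actions inside
    is_interactive: bool = False    # the container already handles clicks
    has_destructive_action: bool = False
    is_drag_surface: bool = False
    wraps_form_control: bool = False


def should_propagate_click(c: Container) -> bool:
    """True when the child's action should be propagated to the parent."""
    # Skip conditions: a precise click is safer or the surface has another role.
    if c.has_destructive_action or c.is_drag_surface or c.wraps_form_control:
        return False
    # Flag conditions: descriptive content, exactly one action, not yet interactive.
    return (c.has_descriptive_content
            and c.primary_actions == 1
            and not c.is_interactive)
```

Requiring exactly one primary action covers the "multiple competing actions" exclusion without a separate flag.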
Scan the codebase for modern CSS opportunities:
- `@container`? If not, flag: they'll break in nested layouts
- `:has()` adoption: any `useEffect` toggling parent classes based on child state that could be replaced by `:has()`?
- `@view-transition { navigation: auto }` enabled?
- `oklch()`? If `hsl()` or hex, flag (perceptual uniformity benefit)
- `light-dark()` + `color-scheme`: is dark mode implemented via `light-dark()` or via duplicated `@media (prefers-color-scheme: dark)` blocks?
- `<dialog>` + `popover`: are modals using `<dialog>` or `<div role="dialog">`? Are tooltips/menus using Floating UI or native `popover`?
- `content-visibility`: long lists, footers, off-screen sections are candidates for `content-visibility: auto`
- `$value`, `$type`, `$description`)? W3C DTCG reached first stable version Oct 2025

If the project was built with v0, bolt, lovable, Figma Make, or heavy LLM assistance (check git log for AI attribution or rapid generation bursts), scan specifically for:
- `class=".*\bw-\w+\b.*\bw-\w+\b"` etc.)
- `text-${x}-500`)
- `<div onClick>` without `role="button"`/`tabIndex`/`onKeyDown`
- `<label>`
- `alt`
- `:focus-visible`, `:disabled`, loading/error/empty states

Consolidated audit of the four design token systems (theme, typography, spacing, motion). They share the same logic (centralization, semantic naming, single source of truth), so they're audited together. Strictness is adaptive to project size (5c rule): small projects still receive professional findings, but the severity reflects blast radius; larger projects get harder blocking findings.
- `light`, `dark`, `system`) declared and reachable from settings UI. Single-mode requires BRANDING.md justification.
- `system`, exposes change events.
- `<head>`. SSR: cookie/header. Native: sync bootstrap.
- `prefers-color-scheme` honored at first render for new users with no stored preference.
- `if (isDark)` branches in component code (mode-switching logic belongs in the token layer).
- `font-size` in components (count from context block; threshold = adaptive to project size).
- `font-size` + `line-height` + `letter-spacing` (object or co-named CSS variables).
- `clamp()` used for headings and hero text. Format: `clamp(MIN_REM, X_REM + Y_VW, MAX_REM)` with Y ≤ 3. Pure-vw preferred values = WCAG 1.4.4 violation (Resize Text fails at 200% zoom).
- `0`, `1px` borders.
- `13px`, `27px`).
- `gutter`, `stack-tight`, `inset-card`, `section`) for larger. Mixed = violation.
- `clamp()` for fluid behavior. Component-level spacing stays static.
- `transition`/`animation` durations or easings in components (count from context block).
- `duration-fast`, `ease-out-standard`), not by value.
- `ease`, `ease-in-out`).
- `prefers-reduced-motion` support count > 0 (see context block). Every non-trivial animation respects it. Decorative motion (parallax, autoplay carousels, scroll-driven) must opt out under reduced motion.
- `transform`, `opacity`, `filter` only. Animating `width`, `height`, `top`, `left`, `margin` triggers layout = perf violation.

If the project has any design token system worth auditing and no design playground page detected (see context block "Design playground page"), append this to priority improvements:
⚡ No design system playground detected — run /sf-design-playground to scaffold a live design token preview page.
Why: visualizing all tokens in one place + live editing is the fastest way to iterate on the design system without opening 30 files.
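The fluid typography rule stated earlier (`clamp(MIN_REM, X_REM + Y_VW, MAX_REM)` with Y ≤ 3, pure-`vw` preferred values flagged under WCAG 1.4.4) can be checked mechanically. The following is a minimal sketch, not part of the skill itself: the function name `checkFluidClamp` and the verdict labels are hypothetical, and the parser assumes a flat three-argument `clamp()` with no nested commas.

```typescript
// Hypothetical verdict labels for illustration; the skill does not define them.
type ClampVerdict = "ok" | "vw-too-steep" | "pure-vw" | "not-clamp";

// Classifies a CSS value against the rule "clamp(MIN_REM, X_REM + Y_VW, MAX_REM) with Y <= 3".
// Assumption: the three arguments contain no nested commas (e.g. no min()/max() inside).
function checkFluidClamp(value: string): ClampVerdict {
  const match = value.trim().match(/^clamp\(\s*([^,]+),\s*([^,]+),\s*([^,]+)\)$/);
  if (!match) return "not-clamp";

  // The middle argument is the preferred (fluid) value.
  const preferred = (match[2] ?? "").trim();
  const vwTerm = preferred.match(/(-?\d*\.?\d+)vw/);
  if (!vwTerm) return "ok"; // no viewport term, nothing to flag here

  // A preferred value with a vw term and no rem/em/px term scales only with the viewport.
  const hasStaticTerm = /\d(?:rem|em|px)\b/.test(preferred);
  if (!hasStaticTerm) return "pure-vw";

  // Y is the vw coefficient; the rule caps it at 3.
  const y = Math.abs(parseFloat(vwTerm[1] ?? "0"));
  return y > 3 ? "vw-too-steep" : "ok";
}
```

For instance, `checkFluidClamp("clamp(1rem, 5vw, 3rem)")` yields `"pure-vw"`, which corresponds to the `vw-dominant — WCAG risk` state in the report. A real audit would more likely walk a parsed stylesheet than match strings.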
For each page, check:
For each finding, include a "Why it matters" line citing the relevant UX principle or standard.
Fix all issues directly in code. Prioritize:
DESIGN AUDIT: [project name]
═══════════════════════════════════════
BUSINESS METADATA VERSIONS
BUSINESS.md artifact_version=[x|missing] status=[x|missing] updated=[date|missing] confidence=[x|missing] next_review=[date|missing]
BRANDING.md artifact_version=[x|missing] status=[x|missing] updated=[date|missing] confidence=[x|missing] next_review=[date|missing]
GUIDELINES.md artifact_version=[x|missing] status=[x|missing] updated=[date|missing] confidence=[x|missing] next_review=[date|missing]
Proof gaps: [missing/stale/unversioned docs that affected scoring, or none]
DESIGN SYSTEM HEALTH
Colors: X tokens used, Y inconsistencies
Typography: X sizes used, Y violations
Spacing: [consistent / mixed / chaotic]
Components: X patterns, Y inconsistencies
DESIGN TOKEN SYSTEM
Theme architecture:
Modes: [light + dark + system / single — justified / single — UNJUSTIFIED]
Preference: [centralized + normalized / scattered / missing]
Persistence: [present / missing]
Server sync: [yes — auth detected / N/A — no auth / MISSING — auth present but no sync]
FOUC prevention: [yes / no — flash of wrong theme]
Settings UI: [present / missing]
Typography tokens:
Centralization: X literal font-sizes outside tokens (target: 0)
Naming: [t-shirt / semantic / MIXED]
Bundle: [size+lh+ls bundled / font-size only]
Scale ratio: [coherent (1.X×) / chaotic — recommend Utopia.fyi]
Fluid clamp(): [adopted / partial / absent / vw-dominant — WCAG risk]
Spacing tokens:
Centralization: X literal margin/padding outside tokens
Naming: [t-shirt / semantic / MIXED]
Scale ratio: [4px-base / 8px-base / modular / chaotic]
Fluid layout: [adopted / static-only]
Motion tokens:
Centralization: X literal transition/animation outside tokens
Naming: [semantic / by-value / MIXED]
Reduced motion: X declarations (target: every non-trivial animation)
Compositor-only: [yes / NO — animates layout properties]
Universal palette socle:
Semantic intents: [success/warning/danger/info/neutral all present / X missing]
Surface tokens: [base/raised/overlay/sunken / X missing]
Hue-based names in components: X violations
Design playground page: [present / ABSENT — recommend /sf-design-playground]
OUTDATED PATTERNS
Browser dialogs: X found
Legacy HTML: X found
Deprecated CSS: X found
Legacy JS: X found
Click targets: X containers with undersized single-action children
div[role=dialog]: X (should be native <dialog>)
MODERN CSS ADOPTION
Container queries: [adopted / partial / absent]
:has() usage: [adopted / partial / absent]
View Transitions: [enabled / disabled]
OKLCH tokens: [yes / no — uses hex/hsl]
light-dark(): [yes / no — uses duplicated media queries]
Native dialog: X modals / Y with role=dialog
Popover API: [yes / no — uses Floating UI]
AI-GEN CODE SMELLS (if applicable)
Tailwind conflicts: X
Dynamic classes: X
div-as-button: X
Missing labels/alts: X
Missing states: X
PAGE SCORES
/ [A/B/C/D]
/about [A/B/C/D]
...
CROSS-PAGE CONSISTENCY [A/B/C/D]
ACCESSIBILITY (WCAG 2.2) [A/B/C/D]
USABILITY (NN/g) [A/B/C/D]
RESPONSIVENESS [A/B/C/D]
MODERN CSS 2026 [A/B/C/D]
AI-GEN CODE HEALTH [A/B/C/D] (if applicable)
THEME ARCHITECTURE [A/B/C/D]
TYPOGRAPHY TOKENS [A/B/C/D]
SPACING TOKENS [A/B/C/D]
MOTION TOKENS [A/B/C/D]
═══════════════════════════════════════
OVERALL [A/B/C/D]
PRIORITY IMPROVEMENTS (high impact, bounded effort)
⚡ [file:line] description — Why: [principle]
⚡ [file:line] description — Why: [principle]
... (max 5, ordered by impact)
⚡ Run /sf-design-playground to scaffold a live design token preview page
(auto-included if no design playground detected and project has any design token system)
⚡ Run /sf-design "<design remediation goal>" when the next step spans tokens, components, accessibility, browser proof, and implementation.
Why: broad design work needs a master lifecycle instead of a single specialist audit.
Fixed: X issues across Y files
Needs decision: Z items (listed below)
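The PRIORITY IMPROVEMENTS block above caps the list at five entries ordered by impact, each carrying a `Why:` principle. The following sketch shows one way to produce those lines. The `Finding` shape, the numeric `impact` field, and the function name are assumptions made for illustration, and the square brackets in the template are treated as placeholders rather than literal output.

```typescript
// Hypothetical finding shape; the skill does not prescribe a data structure.
interface Finding {
  file: string;
  line: number;
  description: string;
  why: string;    // UX principle or standard cited for the finding
  impact: number; // assumed numeric rank, higher means more impact
}

// Returns at most `max` report lines, highest impact first.
// The input array is copied so the caller's ordering is left untouched.
function formatPriorityImprovements(findings: Finding[], max = 5): string[] {
  return [...findings]
    .sort((a, b) => b.impact - a.impact)
    .slice(0, max)
    .map((f) => `⚡ ${f.file}:${f.line} ${f.description} — Why: ${f.why}`);
}
```

The fixed `/sf-design-playground` and `/sf-design` recommendations would be appended after these lines, since the template lists them separately from the capped findings.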
Shared file write protocol for AUDIT_LOG.md and TASKS.md:
After generating the report and applying fixes:
Append a row to two files:
- `${SHIPFLOW_DATA_DIR:-$HOME/shipflow_data}/AUDIT_LOG.md`: append a row filling only the Design column, `—` for others.
- `./AUDIT_LOG.md`: same without the Project column.

Create either file if missing.

- … `### Audit: Design` subsection with critical (🔴), high (🟠), and medium (🟡) issues as task rows.
- `${SHIPFLOW_DATA_DIR:-$HOME/shipflow_data}/TASKS.md`: find the project's section, add/replace an `### Audit: Design` subsection with the same tasks. Update the Dashboard "Top Priority" if critical issues found.

`<textarea>` elements MUST use `field-sizing: content`; flag and fix any textarea missing this property.
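The shared-file paths above use the shell default expansion `${SHIPFLOW_DATA_DIR:-$HOME/shipflow_data}`, which falls back to the default when the variable is unset or empty. The sketch below resolves the two `AUDIT_LOG.md` targets the same way. The function names and the `Env` type are hypothetical, and `./AUDIT_LOG.md` is assumed to be relative to the current project root.

```typescript
// Minimal environment shape; only the two variables named in the path are read.
type Env = { SHIPFLOW_DATA_DIR?: string; HOME?: string };

// Mirrors ${SHIPFLOW_DATA_DIR:-$HOME/shipflow_data}:
// the default applies when the variable is unset OR empty.
function resolveDataDir(env: Env): string {
  const configured = env.SHIPFLOW_DATA_DIR;
  if (configured !== undefined && configured !== "") return configured;
  return `${env.HOME ?? ""}/shipflow_data`;
}

// Returns the shared log and the project-local log that each receive a row.
function auditLogTargets(env: Env, projectRoot: string): { shared: string; local: string } {
  return {
    shared: `${resolveDataDir(env)}/AUDIT_LOG.md`,
    local: `${projectRoot}/AUDIT_LOG.md`,
  };
}
```

The same `resolveDataDir` result would apply to the shared `TASKS.md`, since it lives in the same directory.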