visual-verdict

Structured visual QA verdict for screenshot-to-reference comparisons

Install command

npx skills add https://github.com/ImL1s/flutter-claude-skills --skill visual-verdict

Copy and paste this command into Claude Code to install the skill

Source

ImL1s/flutter-claude-skills

Stars: 5

Forks: 0

Updated: April 25, 2026 at 09:56

SKILL.md

name: visual-verdict
description: Structured visual QA verdict for screenshot-to-reference comparisons
requires-omc: true
level: 2

Purpose

Compare generated UI screenshots against one or more reference images and return a strict JSON verdict that can drive the next edit iteration.

When to use

The task includes visual fidelity requirements (layout, spacing, typography, component styling)
You have a generated screenshot and at least one reference image
You need deterministic pass/fail guidance before continuing edits

Inputs

reference_images[] (one or more image paths)
generated_screenshot (current output image)
Optional: category_hint (e.g., hackernews, sns-feed, dashboard)

Output contract

Return JSON only with this exact shape:

{
  "score": 0,
  "verdict": "revise",
  "category_match": false,
  "differences": ["..."],
  "suggestions": ["..."],
  "reasoning": "short explanation"
}

Rules:

score: integer 0-100
verdict: short status (pass, revise, or fail)
category_match: true when the generated screenshot matches the intended UI category/style
differences[]: concrete visual mismatches (layout, spacing, typography, colors, hierarchy)
suggestions[]: actionable next edits tied to the differences
reasoning: 1-2 sentence summary
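
The rules above can be checked mechanically before a verdict is allowed to drive the next edit. The following is a minimal sketch, not part of the skill itself: the function name validate_verdict and the choice to reject extra keys are assumptions made for illustration.

```python
import json

# Field names and allowed verdict values are taken from the output contract.
REQUIRED_KEYS = {"score", "verdict", "category_match",
                 "differences", "suggestions", "reasoning"}
ALLOWED_VERDICTS = {"pass", "revise", "fail"}


def validate_verdict(raw: str) -> dict:
    """Parse a verdict string; raise ValueError if it breaks the contract.

    Hypothetical helper: rejecting extra keys is this sketch's reading of
    "this exact shape", not something the skill spells out.
    """
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict) or set(data) != REQUIRED_KEYS:
        raise ValueError("expected an object with exactly the six contract keys")
    score = data["score"]
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 0 <= score <= 100:
        raise ValueError("score must be an integer in 0-100")
    verdict = data["verdict"]
    if not isinstance(verdict, str) or verdict not in ALLOWED_VERDICTS:
        raise ValueError("verdict must be pass, revise, or fail")
    if not isinstance(data["category_match"], bool):
        raise ValueError("category_match must be a boolean")
    for key in ("differences", "suggestions"):
        items = data[key]
        if not isinstance(items, list) or not all(isinstance(i, str) for i in items):
            raise ValueError(f"{key} must be a list of strings")
    if not isinstance(data["reasoning"], str):
        raise ValueError("reasoning must be a string")
    return data
```

Validating before acting keeps a malformed response (for example, a score of 101 or a missing suggestions list) from silently steering the edit loop.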

Threshold and loop

The pass threshold is a score of 90 or higher.
If score < 90, continue editing and rerun /oh-my-claudecode:visual-verdict before any further visual review pass.
Do not treat the visual task as complete until the next screenshot clears the threshold.
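
The threshold rule amounts to a simple gate around the edit loop. Here is one way it could be sketched, assuming two hypothetical callbacks: run_verdict, which reruns the skill and returns the parsed verdict, and apply_edits, which acts on the suggestions. The max_rounds cap is an added safety limit and is not part of the skill.

```python
from typing import Callable, List

PASS_THRESHOLD = 90  # from the skill text: score must be 90 or higher


def clears_threshold(verdict: dict) -> bool:
    """True when the verdict's score meets the pass threshold."""
    return verdict["score"] >= PASS_THRESHOLD


def iterate_until_pass(
    run_verdict: Callable[[], dict],           # hypothetical: reruns the skill
    apply_edits: Callable[[List[str]], None],  # hypothetical: applies suggestions
    max_rounds: int = 5,                       # assumption: cap to avoid endless loops
) -> dict:
    """Edit and rerun the verdict until the score clears the threshold."""
    verdict = run_verdict()
    rounds = 0
    while not clears_threshold(verdict) and rounds < max_rounds:
        apply_edits(verdict["suggestions"])
        verdict = run_verdict()  # always re-verify after editing
        rounds += 1
    return verdict
```

The caller should still check clears_threshold on the returned verdict, since the loop can also stop because the round cap was reached.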

Debug visualization

When mismatch diagnosis is hard:

Keep $visual-verdict as the authoritative decision.
Use pixel-level diff tooling (pixel diff / pixelmatch overlay) as a secondary debug aid to localize hotspots.
Convert pixel diff hotspots into concrete differences[] and suggestions[] updates.
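
To illustrate what "localizing a hotspot" can mean, the sketch below finds the bounding box of differing pixels between two same-sized images. It is a pure-Python stand-in written for this example and does not reproduce pixelmatch or any other tool's API; images are assumed to be nested lists of RGB tuples, and the tolerance parameter is hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of RGB pixels; an assumed in-memory format


def diff_bounding_box(
    reference: Image, generated: Image, tolerance: int = 0
) -> Optional[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) of the differing region, or None.

    A pixel counts as different when any channel differs by more than
    `tolerance`. Coordinates are inclusive.
    """
    if len(reference) != len(generated) or any(
        len(a) != len(b) for a, b in zip(reference, generated)
    ):
        raise ValueError("images must have the same dimensions")
    xs, ys = [], []
    for y, (ref_row, gen_row) in enumerate(zip(reference, generated)):
        for x, (ref_px, gen_px) in enumerate(zip(ref_row, gen_row)):
            if max(abs(r - g) for r, g in zip(ref_px, gen_px)) > tolerance:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))
```

A box such as (2, 1, 3, 2) only says where the images diverge. Turning it into a differences[] entry (for example, naming the affected component) remains a judgment step, and the verdict stays authoritative.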

Example

{
  "score": 87,
  "verdict": "revise",
  "category_match": true,
  "differences": [
    "Top nav spacing is tighter than reference",
    "Primary button uses smaller font weight"
  ],
  "suggestions": [
    "Increase nav item horizontal padding by 4px",
    "Set primary button font-weight to 600"
  ],
  "reasoning": "Core layout matches, but style details still diverge."
}

Related skills

verify-ui — manual screenshot comparison. Use this skill after verify-ui to quantify the diff with image-based metrics.
figma-implement-design → figma-use — after implementing a Figma design, use verify-ui and visual-verdict to validate the implementation matches the reference.
verify-ui-auto — after visual-verdict confirms the diff is acceptable, set up automated golden tests to prevent regressions.
