review-skill

Stars: 2

Forks: 1

Updated: April 22, 2026 at 12:27

Reviews and automatically fixes Claude Code skills against official Anthropic best practices. Use when checking skill quality, refactoring bloated skills, improving discoverability, or contributing to open-source skills. Supports review, auto-fix, external review, and PR modes.

Source

costa-marcello

costa-marcello/skillkit

Related occupations (based on SOC occupation classification)

Software Developers · Computer and Mathematical Occupations · SOC 15-1252

Review Skill

Target Skill

The target skill to review is: $ARGUMENTS

If $ARGUMENTS is empty, ask the user which skill to review.

Pre-Flight Check

Before starting any mode, verify the target skill exists:

1. Check the target path contains a SKILL.md file. If not, report "No SKILL.md found at [path]" and stop.
2. List all files in the skill directory and references/ (if present) to build a complete file inventory.
3. Record the initial line count of SKILL.md with `wc -l`.
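
The three pre-flight steps can be sketched as a single function. This is a minimal illustration, not part of the skill itself: the function name `preflight` and the shape of the returned dictionary are assumptions made for the example.

```python
from pathlib import Path


def preflight(skill_dir: str) -> dict:
    """Return a file inventory and the initial SKILL.md line count.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    root = Path(skill_dir)
    skill_md = root / "SKILL.md"
    if not skill_md.is_file():
        # Mirrors the stop condition described in the pre-flight check.
        raise FileNotFoundError(f"No SKILL.md found at {skill_dir}")
    # rglob walks the skill directory, including references/ when present.
    inventory = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )
    # Counting newline bytes matches what `wc -l` reports.
    line_count = skill_md.read_bytes().count(b"\n")
    return {"files": inventory, "skill_md_lines": line_count}
```

Recording the count up front gives the before/after comparison in the final summary something to compare against.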

Mode Selection

| Mode | Trigger | Action |
| --- | --- | --- |
| Review + Auto-Fix (default) | User says "review", "check", "grade", or gives no mode | Run full deep review, then auto-fix all findings |
| Review Only | User says "report only", "no fix", "read-only" | Run full deep review, report only, no changes |
| Auto-Fix Only | User says "fix", "improve", "refactor", "auto-fix" | Skip report, apply fixes directly |
| External Review | User says "external", target is a GitHub URL | Clone to /tmp/, full deep review, report only (read-only) |
| Auto-PR | User says "PR", "contribute", "auto-pr" | Fork, full deep review, fix, submit PR |

When no mode keyword is present, default to Review + Auto-Fix. The deep review runs in every mode. Auto-fix follows the deep review unless the mode is report-only (Review Only or External Review).
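
Several trigger keywords overlap ("no fix" contains "fix", "auto-pr" contains "pr"), so the order in which triggers are checked matters. The sketch below shows one possible keyword matcher. The rule order, the regular expressions, and the name `select_mode` are assumptions for illustration; the skill itself does not prescribe a precedence.

```python
import re

# Checked top to bottom; the first matching rule wins.
# The ordering is an assumption: Auto-PR before External Review (both may
# carry a GitHub URL), and "no fix" before the plain "fix" keyword.
MODE_RULES = [
    ("Auto-PR", [r"\bpr\b", r"\bcontribute\b"]),
    ("External Review", [r"\bexternal\b", r"https?://github\.com/"]),
    ("Review Only", [r"\breport only\b", r"\bno fix\b", r"\bread-only\b"]),
    ("Auto-Fix Only", [r"\bfix\b", r"\bimprove\b", r"\brefactor\b"]),
]
DEFAULT_MODE = "Review + Auto-Fix"


def select_mode(request: str) -> str:
    """Map a free-text request to a mode name, falling back to the default."""
    text = request.lower()
    for mode, patterns in MODE_RULES:
        if any(re.search(pattern, text) for pattern in patterns):
            return mode
    # "review", "check", "grade", or no keyword at all.
    return DEFAULT_MODE
```

A plain keyword match like this would misfire on a skill path that happens to contain a trigger word (for example a directory named `pr-helper`), so a real implementation would likely separate the path argument from the mode argument first.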

Setup (Optional)

Install create-skill for automated validation: see references/setup.md

All modes work without it using manual evaluation.

Mode 1: Review + Auto-Fix (Default)

Run a full deep review across every evaluation dimension, then automatically fix all findings.

Step 1: Run automated validation (if create-skill installed):

python3 "$CREATE_SKILL"/scripts/quick_validate.py <target-skill>
python3 "$CREATE_SKILL"/scripts/security_scan.py <target-skill> --verbose

Step 2: Structural evaluation -- Read references/evaluation-checklist.md and check every item against the target skill. Record pass/fail for each item with the file path and line number of the finding.

Step 3: Content quality evaluation -- Read references/content-quality-checklist.md and evaluate all 8 dimensions (degrees of freedom, conciseness, actionability, options overload, script quality, feedback loops, consistency, time-sensitive content). Record findings per dimension.

Step 4: Deep review -- Read references/research-backed-criteria.md and check all 6 criteria. Record a pass/fail verdict for each:

- XML tag usage
- Example quality (3-5 diverse examples)
- Defect taxonomy (specification, input, structure, context, performance, maintainability)
- Anti-patterns (OWASP, vendor docs, academic)
- Formatting effectiveness
- HELM-inspired metrics (clarity, actionability, robustness, maintainability, safety)

Step 5: Generate report as markdown with:

- Executive summary table (aspect, grade, notes)
- Section-by-section findings with file paths and line numbers
- Deep review results table (criterion, verdict, evidence)
- Combined grade using the unified rubric from references/evaluation-checklist.md
- Recommended fixes ranked by severity (major first, then minor)

Step 6: Verify report before presenting:

- Every finding has a file path and line number
- Grade matches rubric criteria
- Fixes are actionable (no "consider" or "ensure")
- Deep review covers all 6 criteria from references/research-backed-criteria.md
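
Two of the Step 6 checks (location present, fix wording actionable) are mechanical and can be sketched in code. The `Finding` dataclass and `verify_findings` function below are hypothetical names introduced for this example; the grade and criteria-coverage checks still need the rubric files and are not covered here.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Words the verification step treats as non-actionable.
WEAK_VERBS = re.compile(r"\b(consider|ensure)\b", re.IGNORECASE)


@dataclass
class Finding:
    """Illustrative record for one review finding (not defined by the skill)."""
    title: str
    file: Optional[str]
    line: Optional[int]
    fix: str


def verify_findings(findings: List[Finding]) -> List[str]:
    """Return a list of problems; an empty list means the checks pass."""
    problems = []
    for finding in findings:
        if not finding.file or finding.line is None:
            problems.append(f"{finding.title}: missing file path or line number")
        if WEAK_VERBS.search(finding.fix):
            problems.append(f"{finding.title}: fix is not actionable")
    return problems
```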

Step 7: Present report, then proceed to auto-fix. After showing the full review report, automatically apply all recommended fixes using the Auto-Fix procedure (Mode 2). Do not wait for user confirmation. The review informs the fix -- every finding from Steps 2-4 becomes a fix target.

Step 8: Post-fix verification. After auto-fix completes, re-run Steps 2-4 against the modified skill. If any issues remain, fix them. Repeat until 0 major and 0 minor issues remain. Report the final grade with before/after comparison.
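
The post-fix verification is a fix-and-recheck loop. A minimal sketch follows, assuming the evaluation and the fixer are supplied as callables. The names `fix_until_clean`, `evaluate`, and `apply_fixes`, and the `max_rounds` safety cap, are assumptions added for illustration; Step 8 itself sets no round limit.

```python
from typing import Callable, List


def fix_until_clean(
    evaluate: Callable[[], List[str]],
    apply_fixes: Callable[[List[str]], None],
    max_rounds: int = 5,
) -> int:
    """Re-evaluate and fix until no issues remain; return the rounds of fixing used.

    max_rounds is a hypothetical guard against a fix that never converges.
    """
    for rounds_done in range(max_rounds):
        issues = evaluate()  # stands in for re-running Steps 2-4
        if not issues:
            return rounds_done
        apply_fixes(issues)
    if evaluate():
        raise RuntimeError("issues remain after max_rounds")
    return max_rounds
```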

**Review + Auto-Fix Report Format:**

Skill Review: pdf

Executive Summary

| Aspect | Grade | Notes |
| --- | --- | --- |
| Frontmatter | A | Third-person description with triggers |
| Structure | B | 487 lines -- close to 500-line limit |
| Content Quality | B | One decision point missing a default |
| Deep Review | B | Only 2 examples (3-5 required); all other criteria pass |
| Scripts | A | Proper error handling throughout |
| Combined | B | Two minor issues (one structural, one content) |

Deep Review Results

| Criterion | Verdict | Evidence |
| --- | --- | --- |
| XML tag usage | Pass | `<instructions>` and `<example>` tags present |
| Example quality | Fail | Only 2 examples, need 3-5 diverse cases |
| Defect taxonomy | Pass | No specification, input, structure, context, performance, or maintainability defects |
| Anti-patterns | Pass | No OWASP, vendor, or academic anti-patterns |
| Formatting | Pass | Consistent Markdown + XML structure |
| HELM metrics | Pass | Clarity 5/5, Actionability pass, Robustness pass, Maintainability pass, Safety pass |

Findings

1. Line count approaching limit (Minor)

File: SKILL.md (487 lines)
Fix: Move the "Advanced Extraction" section (lines 320-410) to references/advanced-extraction.md.

2. Missing default for output format (Minor)

File: SKILL.md, line 145
Finding: Lists JSON, CSV, and Markdown output without recommending a default.
Fix: Add "Default to Markdown. Use JSON when the user needs machine-readable output."

Recommended Fixes (by severity)

1. Extract advanced section to references (structural)
2. Add default output format recommendation (content)

Auto-Fix Applied

Proceeding to fix all findings above...

Changes summary: 2 issues fixed, 1 file reorganised, line count reduced from 487 to 395.

**Edge-Case Decision: context: fork on an orchestrator skill**

Skill deploy-fleet has context: fork set and allowed-tools: "Read, Grep, Bash(*), Task".

Decision: M2 violation (Class B). allowed-tools includes Task, which means this skill dispatches sub-agents. A forked subagent cannot spawn further subagents, so context: fork breaks the dispatch chain. Remove context: fork and agent from frontmatter.

Edge-Case Decision: context: fork on an interactive skill

Skill audit-config has context: fork set, no Task tool, but its body instructs: "Step 5: present the findings report to the user. Step 6: wait for confirmation before applying fixes."

Decision: M2 violation (Class C). The skill runs a two-stage interaction (review, then fix on user confirmation). A forked subagent returns a single final summary to the lead, so the user never sees the intermediate report and the confirmation step collapses. Remove context: fork and agent. Interactive skills must run inline.

Edge-Case Decision: context: fork on a mode-style reasoning skill

Skill ultrathink has context: fork set. Its body describes a persistent analytical mode that accumulates reasoning tokens across the conversation and relies on prior turn state.

Decision: M2 violation (Class D). Mode-style reasoning skills lose their extended-thinking tokens and cross-turn persistence when forked into a fresh subagent context. Remove context: fork and agent. Mode-style skills must run inline so the lead retains continuity.

Edge-Case Decision: line count at boundary

Skill api-docs has SKILL.md at exactly 500 lines.

Decision: m7 (minor), not M1 (major). The 500-line limit (M1) triggers at 501+. At 500, the skill is in the warning zone (400-500). Recommend extracting content to reach under 400 for Grade A.
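
The boundary rule can be written as a small function, which makes the off-by-one at 500 explicit. The function name and the `"ok"` return value are assumptions for this sketch; the issue codes `M1` and `m7` come from the decision above.

```python
def classify_line_count(lines: int) -> str:
    """Classify a SKILL.md line count against the 500-line limit."""
    if lines > 500:
        return "M1"  # major: the limit triggers at 501 and above
    if lines >= 400:
        return "m7"  # minor: warning zone, 400-500 inclusive
    return "ok"  # under 400: no line-count issue (Grade A target)
```

With this rule, `classify_line_count(500)` falls in the warning zone and `classify_line_count(501)` is the first major violation.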

Mode 2: Auto-Fix

Automatically refactor a skill to meet best practices. When triggered by Mode 1 (Review + Auto-Fix), use the review findings as the fix list. When triggered standalone, run Steps 1-2 below to identify issues first.

Auto-Fix Progress:
- [ ] Step 1: Read SKILL.md and all files in root, references/, scripts/, assets/
- [ ] Step 2: Run structural check (evaluation-checklist.md), content quality check (content-quality-checklist.md), deep review (research-backed-criteria.md). List every issue with file path and line number.
- [ ] Step 3: Fix frontmatter (description, context: fork correctness, missing fields)
- [ ] Step 4: Create references/ folder if needed
- [ ] Step 5: If SKILL.md exceeds 500 lines, extract sections to references/ until it is under the limit
- [ ] Step 6: Move loose files to references/ with clear names
- [ ] Step 7: Update SKILL.md references section
- [ ] Step 8: Verify final line count under 400 (Grade A target) or under 500 (Grade B minimum)
- [ ] Step 9: Run evaluation again to confirm 0 major and 0 minor issues remain
- [ ] Step 10: Generate summary of changes (files modified, issues fixed, before/after line counts, final grade)

Auto-Fix Actions:

| Issue | Automatic Fix |
| --- | --- |
| Description not third-person | Rewrite: "Processes...", "Extracts..." |
| Missing trigger conditions | Add "Use when..." clause |
| `context: fork` incorrectly applied | Apply the four-class taxonomy in `references/evaluation-checklist.md` item 3. Fork is only for Class A (autonomous). Remove it for Class B (orchestrator), Class C (interactive), or Class D (mode-style reasoning). Remove `agent` together with `context: fork`. |
| SKILL.md over 500 lines | Extract sections to `references/` |
| Loose files in root | Move to `references/` with descriptive names |
| Duplicate reference files | Merge and deduplicate |
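
The four-class taxonomy behind the `context: fork` fix can be sketched as a decision function. This is an illustration under stated assumptions: the authoritative definition lives in `references/evaluation-checklist.md`, which is not reproduced here. The function names, the boolean inputs, and the B-before-C-before-D precedence are hypothetical; deciding whether a skill is interactive or mode-style still requires reading its body.

```python
def fork_class(allowed_tools: str, interactive: bool, mode_style: bool) -> str:
    """Return the class letter (A-D) for a skill.

    allowed_tools is the comma-separated frontmatter string. The simple
    split assumes no commas inside a tool's parentheses.
    """
    tools = {tool.strip() for tool in allowed_tools.split(",")}
    if "Task" in tools:
        return "B"  # orchestrator: dispatches sub-agents
    if interactive:
        return "C"  # waits for user confirmation mid-run
    if mode_style:
        return "D"  # relies on cross-turn reasoning state
    return "A"  # autonomous: the only class where fork is correct


def fork_allowed(skill_class: str) -> bool:
    """context: fork (and agent) should be kept only for Class A."""
    return skill_class == "A"
```

For the `deploy-fleet` example, `fork_class("Read, Grep, Bash(*), Task", False, False)` yields `"B"`, so `context: fork` and `agent` are removed.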

Content Quality Fixes:

| Issue | Automatic Fix |
| --- | --- |
| Vague instructions ("consider", "ensure") | Rewrite with strong verbs ("check", "verify", "run") |
| Too many options without default | Add recommended default + escape hatch pattern |
| Missing feedback loop | Add validation checkpoint before destructive actions |
| Verbose explanations Claude knows | Delete paragraphs that explain common concepts (JSON, APIs, HTTP). If the paragraph answers "Does Claude already know this?" with yes, remove it. |
| Time-sensitive content | Remove date-conditional logic. Keep versions pinned with a comment noting "version at time of writing -- check official docs for current release". Wrap deprecated approaches in `<details>` with a deprecation label. |
| Scripts with bare `except:` | Add specific error handling with recovery actions |
| No examples provided | Add 3-5 diverse `<example>` blocks |
| Plain text structure (no delimiters) | Add XML tags (`<instructions>`, `<context>`) |
| Over-specification ("MUST", "CRITICAL") | Use natural language; Claude follows clear instructions |
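
Some of these issues can be detected mechanically before any rewrite. As one example, bare `except:` clauses in a skill's Python scripts can be located with the standard-library `ast` module. The helper name `bare_except_lines` is an assumption for this sketch, and it only reports locations; choosing the specific exception and recovery action remains a judgment call.

```python
import ast
from typing import List


def bare_except_lines(source: str) -> List[int]:
    """Return the line numbers of bare `except:` clauses in Python source."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        # A handler with no exception type is a bare `except:`.
        if isinstance(node, ast.ExceptHandler) and node.type is None
    )
```

Each reported line number can go straight into a finding, which keeps the "file path and line number" requirement satisfied for script issues.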

**Before/After: Auto-Fix on a bloated skill**

Before (SKILL.md, 580 lines):

---
name: data-export
description: "Export data from databases"
license: MIT
---

- No trigger conditions in description
- No context: fork -- autonomous skill (runs scripts, no sub-agent dispatch)
- 580 lines with inline SQL reference (lines 310-520)
- Vague step: "Ensure the export format is correct"
- 3 loose files in root: formats.md, sql-ref.md, tips.md

After (SKILL.md, 340 lines):

---
name: data-export
description: "Exports data from SQL and NoSQL databases to CSV, JSON, or Parquet. Use when extracting datasets, scheduling recurring exports, or migrating between storage systems."
license: MIT
context: fork
agent: general-purpose
---

- Description rewritten: third-person verb + three trigger conditions
- context: fork added (Class A autonomous skill: runs scripts, no sub-agent dispatch)
- SQL reference extracted to references/sql-syntax.md (210 lines saved)
- Vague step rewritten: "Run python3 scripts/validate_schema.py against the output file"
- Loose files moved and renamed: formats.md -> references/export-formats.md, sql-ref.md merged into references/sql-syntax.md, tips.md -> references/troubleshooting.md

Changes summary: 6 issues fixed, 3 files reorganised, line count reduced from 580 to 340.

**Auto-Fix: Grade D skill with multiple major issues**

Before (SKILL.md, 720 lines):

---
name: api-tester
description: "Test your APIs"
license: MIT
---

Issues found:

- (M1) 720 lines, over 500-line limit
- (M8) Description imperative ("Test your") + no "Use when..." triggers
- (M2) Missing context: fork -- autonomous skill (no sub-agent dispatch) with script references
- (M7) 4 directives use "ensure" or "handle appropriately" with no defaults
- (m1) Lines 50-80 explain what REST APIs are
- (m8) 2 loose .md files in root beside SKILL.md

After (SKILL.md, 310 lines):

---
name: api-tester
description: "Tests REST and GraphQL API endpoints with automated assertions. Use when validating API contracts, running regression tests, or checking response schemas."
license: MIT
context: fork
agent: general-purpose
---

- Description rewritten: third-person + three triggers
- context: fork added
- 410 lines extracted to references/api-patterns.md and references/schema-validation.md
- 4 vague directives replaced: "ensure response is valid" became "run python3 scripts/validate_response.py --schema expected.json"
- REST explanation deleted (Claude knows what REST is)
- Loose files moved: common-headers.md -> references/http-headers.md, auth-flows.md -> references/authentication.md

Changes summary: 4 major + 2 minor issues fixed, 2 files reorganised, line count reduced from 720 to 310. Grade improved from D to A.

**Auto-Fix: No issues found (Grade A skill)**

Skill changelog analysed. 280 lines, all checks pass. No fixes needed.

Changes summary: 0 issues found, 0 files changed. Skill meets Grade A criteria.

Mode 3: External Review

Review a skill from an external GitHub repository without modifying it.

Read references/mode-external-review.md for the full step-by-step procedure. If the reference fails to load, follow this inline summary:

1. Clone: `git clone <github-url> /tmp/review-target`
2. Read all files: SKILL.md first, then references/, scripts/, assets/.
3. Identify intent: What problem does the skill solve? Who uses it? What workflow does it automate?
4. Run all three evaluations (structural, content quality, deep review) using the same checklists as Mode 1 Steps 2-4.
5. Generate read-only report: strengths first, then findings with file paths and line numbers, ranked by severity.
6. Verify report: every finding has file path + line number, grade matches rubric, fixes use strong verbs.
7. Clean up: `rm -rf /tmp/review-target`

Do not modify any files. Report only.

Mode 4: Auto-PR

Fork an external skill repository, improve it, and submit a pull request.

Read references/mode-auto-pr.md for the full procedure and references/pr-template.md for the PR format. If references fail to load, follow this inline summary:

1. Fork: `gh repo fork <github-url> --clone --remote`
2. Branch: `git checkout -b refactor/skill-best-practices`
3. Run full deep review (Mode 1 Steps 2-4).
4. Apply Auto-Fix (Mode 2) using review findings.
5. Self-review respect check -- verify: no files deleted, no functionality removed, original language preserved, all changes additive.
6. Create PR with: summary, what is NOT changed, rationale for each change, test plan.
7. Use `gh pr create` with the template from references/pr-template.md.

Core principle: additive only. Do not delete files or remove functionality.
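
The "no files deleted" part of the respect check lends itself to a mechanical test: snapshot the file list before fixing, then confirm every original path still exists afterwards. The helpers `snapshot` and `deleted_files` below are hypothetical names for this sketch. Note that the check is path-based, so a file that Auto-Fix moved into references/ would also be reported and needs a manual decision. The other respect checks (functionality, original language) are not covered.

```python
from pathlib import Path
from typing import List, Set


def snapshot(repo_dir: str) -> Set[str]:
    """Collect relative paths of all files, skipping the .git directory."""
    root = Path(repo_dir)
    return {
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and ".git" not in p.relative_to(root).parts
    }


def deleted_files(before: Set[str], repo_dir: str) -> List[str]:
    """Return original paths that no longer exist; empty means additive."""
    root = Path(repo_dir)
    return sorted(rel for rel in before if not (root / rel).is_file())
```

Taking the snapshot right after the fork is cloned and running `deleted_files` just before `gh pr create` would catch an accidental deletion before it reaches the upstream maintainer.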

References

| File | Purpose | Used By |
| --- | --- | --- |
| `references/evaluation-checklist.md` | Structural validation + unified grading rubric | Review, Auto-Fix |
| `references/content-quality-checklist.md` | Content effectiveness (8 dimensions) | Review, Auto-Fix |
| `references/research-backed-criteria.md` | Deep review with academic citations | All modes (always runs) |
| `references/script-quality.md` | Script error handling, constants | Review, Auto-Fix |
| `references/feedback-loops.md` | Multi-step workflow validation | Review, Auto-Fix |
| `references/mode-external-review.md` | Full External Review procedure | External Review |
| `references/mode-auto-pr.md` | Full Auto-PR procedure with respect checks | Auto-PR |
| `references/pr-template.md` | PR description template | Auto-PR |
| `references/marketplace_template.json` | marketplace.json template | Auto-PR |
| `references/sources.md` | Bibliography | Review (deep) |
| `references/setup.md` | create-skill installation | Setup |

Official Best Practices

name: review-skill
description: "Reviews and automatically fixes Claude Code skills against official Anthropic best practices. Use when checking skill quality, refactoring bloated skills, improving discoverability, or contributing to open-source skills. Supports review, auto-fix, external review, and PR modes."
license: MIT
argument-hint: "[skill-path] [mode]"
