cpf-skill-creator

Updated: March 30, 2026, 06:06

Create and improve Claude Code skills using a structured checkpointflow workflow with test-driven iteration. Use this skill whenever the user wants to create a skill, make a skill, build a skill, improve an existing skill, test a skill with evals, benchmark skill quality, optimize a skill description, or turn a conversation into a reusable skill. Also triggers on "skill for X", "automate this as a skill", "package this as a skill", and similar phrases.

Source

ThomasRohde

ThomasRohde/marketplace

cpf-skill-creator

Create and improve Claude Code skills through a structured, workflow-driven process powered by checkpointflow.

Prerequisites

This skill requires:

- checkpointflow (`cpf` CLI): install with `uv tool install checkpointflow` or `pip install checkpointflow`
- Anthropic skill-creator plugin: provides the eval scripts, grading agents, and the review viewer. Install with `/plugin install skill-creator` from any Claude Code marketplace that carries it.

How it works

This skill wraps the skill creation process in a cpf workflow that orchestrates the full lifecycle. The workflow uses audience-driven dispatch — agent steps do the work automatically, user steps pause for human decisions.

Finding the workflow

The workflow YAML is bundled alongside this SKILL.md. To locate it, find the directory containing this file and look for skill-creator.yaml:

# The workflow is at the same path as this SKILL.md:
# <this-skill-dir>/skill-creator.yaml

When running the workflow, use the absolute path to the YAML file. If this skill was loaded from the plugin cache, the path will be something like:

~/.claude/plugins/cache/<marketplace>/cpf/<version>/skills/cpf-skill-creator/skill-creator.yaml
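As a rough sketch, the cache lookup could be scripted as follows. The glob pattern simply mirrors the example path above; the helper name `find_workflow` is hypothetical, and the real cache layout may differ between installations.

```python
from pathlib import Path

# Assumption: the cache follows the example layout shown above.
PATTERN = "*/cpf/*/skills/cpf-skill-creator/skill-creator.yaml"


def find_workflow(cache_root: Path) -> list[Path]:
    """Return absolute paths of every skill-creator.yaml found under cache_root."""
    return sorted(p.resolve() for p in cache_root.glob(PATTERN))


if __name__ == "__main__":
    for path in find_workflow(Path.home() / ".claude" / "plugins" / "cache"):
        print(path)
```

If several versions are cached, the list contains one entry per version, so the caller still has to pick one.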

Running the workflow

cpf run -f <path-to-skill-creator.yaml> --input '{"skill_name": "my-skill", "intent": "..."}'

Or use the cpf-workflow-runner skill which handles the interactive loop automatically.
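When the command is launched from a script, building the JSON with a serializer avoids shell-quoting mistakes in the `--input` value. A minimal sketch, assuming only the `-f` and `--input` flags shown above; `build_cpf_command` is a hypothetical helper, not part of checkpointflow.

```python
import json
import shlex


def build_cpf_command(workflow_path: str, skill_name: str, intent: str) -> list[str]:
    """Build the argv for `cpf run`, serializing the inputs as JSON."""
    payload = json.dumps({"skill_name": skill_name, "intent": intent})
    return ["cpf", "run", "-f", workflow_path, "--input", payload]


if __name__ == "__main__":
    cmd = build_cpf_command("/abs/path/skill-creator.yaml", "my-skill", "Summarize PDFs")
    # shlex.join produces a shell-quoted string that can be pasted into a terminal.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run` directly skips the shell entirely, so quotes inside the intent text need no escaping.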

The workflow lifecycle

1. Capture intent (user): Refine what the skill does and when it triggers
2. Research & draft (agent): Create the SKILL.md and test cases
3. Review draft (user): Approve, revise, or cancel
4. Run tests & grade (agent): Execute evals, grade, benchmark, launch viewer
5. Review results (user): Examine outputs, decide to iterate/optimize/ship
6. Improve skill (agent): Apply feedback, rewrite (loops back to step 4)
7. Optimize description (agent): Tune triggering accuracy
8. Package (agent): Create .skill file for distribution
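The lifecycle above can be pictured as a small table of steps tagged by audience. This is only an illustration of the audience-driven idea, not how checkpointflow represents a workflow internally; the `Step` class and both helper functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    number: int
    name: str
    audience: str  # "user" steps pause for a decision; "agent" steps run automatically


LIFECYCLE = [
    Step(1, "Capture intent", "user"),
    Step(2, "Research & draft", "agent"),
    Step(3, "Review draft", "user"),
    Step(4, "Run tests & grade", "agent"),
    Step(5, "Review results", "user"),
    Step(6, "Improve skill", "agent"),
    Step(7, "Optimize description", "agent"),
    Step(8, "Package", "agent"),
]


def pauses_for_human(step: Step) -> bool:
    """User steps are the checkpoints where the workflow waits for input."""
    return step.audience == "user"


def after_improve() -> Step:
    """Step 6 loops back to step 4 so the revised skill is tested again."""
    return LIFECYCLE[3]


if __name__ == "__main__":
    print([s.name for s in LIFECYCLE if pauses_for_human(s)])
```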

Reference material: Anthropic skill-creator

This skill delegates to utilities from the Anthropic skill-creator plugin rather than duplicating them. Locate the skill-creator by searching the plugin cache:

# Print the first skill-creator directory found under the plugin cache:
find ~/.claude/plugins/cache -type d -name "skill-creator" -path "*/skills/*" 2>/dev/null | head -1
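For callers that prefer to do the lookup from Python, a roughly equivalent sketch is shown here. It approximates the `-path "*/skills/*"` filter by requiring a `skills` directory somewhere above the match, and it sorts candidates for a deterministic result, whereas `find` returns them in traversal order. The function name is hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_skill_creator(cache_root: Path) -> Optional[Path]:
    """Return the first directory named skill-creator that sits below a skills/ directory."""
    if not cache_root.is_dir():
        return None
    for candidate in sorted(cache_root.rglob("skill-creator")):
        parents = candidate.relative_to(cache_root).parts[:-1]
        if candidate.is_dir() and "skills" in parents:
            return candidate
    return None


if __name__ == "__main__":
    print(find_skill_creator(Path.home() / ".claude" / "plugins" / "cache"))
```

A `None` result means the plugin was not found in the cache.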

The skill-creator provides:

Path	Purpose
`scripts/aggregate_benchmark.py`	Aggregate grading into benchmark.json
`scripts/package_skill.py`	Package skill as .skill file
`scripts/run_loop.py`	Description optimization loop
`scripts/run_eval.py`	Run trigger evaluation
`eval-viewer/generate_review.py`	Generate HTML review viewer
`agents/grader.md`	Grading instructions for subagents
`agents/analyzer.md`	Benchmark analysis instructions
`agents/comparator.md`	Blind A/B comparison instructions
`assets/eval_review.html`	Template for trigger eval review UI
`references/schemas.md`	JSON schemas for evals, grading, benchmark

When running Python scripts, set the working directory to the skill-creator root so module imports resolve:

cd <skill-creator-path>
python -m scripts.aggregate_benchmark <workspace>/iteration-1 --skill-name my-skill
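The same invocation can be driven from Python by passing `cwd` instead of changing directory. A minimal sketch, assuming only the arguments shown above; both function names are hypothetical. Because the working directory changes, the workspace path is resolved to an absolute path first.

```python
import subprocess
import sys
from pathlib import Path


def benchmark_command(workspace: Path, iteration: int, skill_name: str) -> list[str]:
    """Build the argv matching the aggregate_benchmark invocation shown above."""
    return [
        sys.executable, "-m", "scripts.aggregate_benchmark",
        str(workspace / f"iteration-{iteration}"),
        "--skill-name", skill_name,
    ]


def run_benchmark(skill_creator_root: Path, workspace: Path,
                  iteration: int, skill_name: str) -> int:
    """Run the script with cwd set to the skill-creator root so `scripts` is importable."""
    cmd = benchmark_command(workspace.resolve(), iteration, skill_name)
    return subprocess.run(cmd, cwd=skill_creator_root, check=False).returncode
```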

If the skill-creator plugin isn't installed, fall back to manual operations — grade inline, skip the viewer, and package by hand.

Skill writing principles

These principles apply when drafting or improving skills in the agent steps:

Description field is king. It's the primary triggering mechanism. Include both what the skill does AND specific contexts/phrases. Claude tends to under-trigger, so make descriptions slightly "pushy" — enumerate trigger phrases generously.

Progressive disclosure. Keep SKILL.md under 500 lines. Put detailed reference material in references/ subdirectories. Put scripts in scripts/. The model reads SKILL.md on trigger and only loads bundled resources when needed.

Explain the why. Today's LLMs respond better to reasoning than to rigid ALWAYS/NEVER rules. Explain why each instruction matters so the model can generalize to edge cases.

Examples over abstractions. Include concrete input/output examples. They're worth more than paragraphs of description.

Look for repeated patterns. If test runs independently produce similar helper scripts, bundle that script into the skill. Write it once in scripts/ and reference it.
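The 500-line guideline from the progressive disclosure principle is easy to check mechanically. A small sketch, with hypothetical helper names, that could run before packaging:

```python
from pathlib import Path

MAX_LINES = 500  # guideline stated in the principles above


def skill_md_line_count(skill_dir: Path) -> int:
    """Count the lines in the skill's SKILL.md."""
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    return len(text.splitlines())


def is_within_budget(skill_dir: Path, limit: int = MAX_LINES) -> bool:
    """True when SKILL.md stays under the line limit."""
    return skill_md_line_count(skill_dir) < limit
```

When the check fails, the usual remedy is to move reference material into references/ instead of trimming useful detail.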

Iteration philosophy

When improving a skill after user feedback:

Generalize from the specific feedback. The skill will be used across many prompts, not just the test cases. Don't overfit.
Keep it lean. Remove instructions that aren't pulling their weight. Read the test transcripts — if the skill is making the model waste time, cut those parts.
Explain the why. If you find yourself writing ALL-CAPS MUST/NEVER, reframe as reasoning the model can internalize.
Draft, then revise. Write the improvement, then read it fresh and tighten it.

Workspace layout

Test results and artifacts are organized in <skill-name>-workspace/ alongside the skill:

<skill-name>-workspace/
  evals/
    evals.json                    # Test prompts and assertions
  iteration-1/
    eval-0-<descriptive-name>/
      with_skill/
        outputs/                  # Files produced with the skill
        grading.json              # Assertion pass/fail results
        timing.json               # Tokens and duration
      without_skill/
        outputs/                  # Baseline outputs
        grading.json
        timing.json
      eval_metadata.json          # Prompt, assertions for this eval
    benchmark.json                # Aggregated metrics
    benchmark.md                  # Human-readable summary
    feedback.json                 # User feedback from the viewer
  iteration-2/
    ...
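Given this layout, a script can work out which iteration directory comes next by scanning for the `iteration-N` naming pattern. The helpers below are a hypothetical sketch and assume nothing about the contents of the JSON files.

```python
import re
from pathlib import Path

ITERATION_RE = re.compile(r"^iteration-(\d+)$")


def existing_iterations(workspace: Path) -> list[int]:
    """Return the iteration numbers already present, in ascending order."""
    numbers = []
    for child in workspace.iterdir():
        match = ITERATION_RE.match(child.name)
        if child.is_dir() and match:
            numbers.append(int(match.group(1)))
    return sorted(numbers)


def next_iteration_dir(workspace: Path) -> Path:
    """Return the path for the next iteration, starting at iteration-1."""
    done = existing_iterations(workspace)
    return workspace / f"iteration-{(done[-1] + 1) if done else 1}"
```

Sorting numerically matters here: a plain string sort would place iteration-10 before iteration-2.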

Quick start for improving an existing skill

Pass the existing skill via the existing_skill_path input:

cpf run -f <path-to-skill-creator.yaml> --input '{
  "skill_name": "my-skill",
  "intent": "Improve test coverage and fix edge case handling",
  "existing_skill_path": ".claude/skills/my-skill"
}'

The workflow will snapshot the existing skill for baseline comparison, then iterate.
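Before launching an improvement run, it can help to confirm that the path really points at a skill. The sketch below builds the same three-key input and refuses directories without a SKILL.md. Whether `cpf` performs its own validation is not stated here, and `improvement_input` is a hypothetical helper.

```python
import json
from pathlib import Path


def improvement_input(skill_name: str, intent: str, existing_skill_path: Path) -> str:
    """Return the JSON string for --input, rejecting paths that lack a SKILL.md."""
    if not (existing_skill_path / "SKILL.md").is_file():
        raise FileNotFoundError(f"No SKILL.md found in {existing_skill_path}")
    return json.dumps({
        "skill_name": skill_name,
        "intent": intent,
        "existing_skill_path": str(existing_skill_path),
    })
```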

Tips

The workflow handles orchestration; each agent step does the actual work (reading code, writing files, running tools).
Always use generate_review.py from the skill-creator plugin — never write custom HTML for the viewer.
For headless environments, pass --static <output.html> to generate_review.py.
The --previous-workspace flag on generate_review.py enables iteration-over-iteration comparison.
If subagents aren't available, run test cases inline (less rigorous but still useful with human review).
Description optimization requires the claude CLI (claude -p). Skip it if not available.
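The last tip can be turned into a simple guard. This sketch only checks that an executable named `claude` is on PATH; it does not verify that `claude -p` works, and the function name is hypothetical.

```python
import shutil


def can_optimize_description(executable: str = "claude") -> bool:
    """True when the named CLI is found on PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if can_optimize_description():
        print("claude CLI found: description optimization can run")
    else:
        print("claude CLI not found: skipping description optimization")
```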
