Run any skill in Manus with one click

$pwd:

skill-forge-benchmark

Name: Skill Forge Benchmark
Author: AgriciDaniel

// Benchmark Claude Code skill performance with variance analysis, tracking pass rate, execution time, and token usage across iterations. Runs multiple trials per eval for statistical reliability, aggregates results into benchmark.json, and generates comparison reports between skill versions. Use when user says "benchmark skill", "measure skill performance", "skill metrics", "compare skill versions", "skill performance", "track skill improvement", "skill regression test", or "skill A/B test".
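The description above mentions running several trials per eval and aggregating pass rate, execution time, and token usage with variance analysis. A minimal sketch of that aggregation step follows. The `Trial` record, the field names, and the shape of the summary dictionary are hypothetical, chosen only for illustration; the actual benchmark.json schema used by the skill is not shown on this page.

```python
import json
import statistics
from dataclasses import dataclass


@dataclass
class Trial:
    """One benchmark trial (hypothetical record shape, not the skill's schema)."""
    passed: bool
    seconds: float
    tokens: int


def summarize(values):
    """Return the mean and sample standard deviation of a list of numbers."""
    mean = statistics.mean(values)
    # statistics.stdev needs at least two samples; report 0.0 for one trial.
    stdev = statistics.stdev(values) if len(values) > 1 else 0.0
    return {"mean": round(mean, 3), "stdev": round(stdev, 3)}


def aggregate(trials):
    """Collapse repeated trials of one eval into a single summary dict."""
    if not trials:
        raise ValueError("at least one trial is required")
    return {
        "trials": len(trials),
        "pass_rate": sum(t.passed for t in trials) / len(trials),
        "execution_time_s": summarize([t.seconds for t in trials]),
        "tokens": summarize([t.tokens for t in trials]),
    }


def compare(old, new):
    """Differences (new minus old) between two version summaries."""
    return {
        "pass_rate_delta": new["pass_rate"] - old["pass_rate"],
        "time_delta_s": new["execution_time_s"]["mean"] - old["execution_time_s"]["mean"],
        "tokens_delta": new["tokens"]["mean"] - old["tokens"]["mean"],
    }


if __name__ == "__main__":
    # Made-up numbers, used only to exercise the functions.
    v1 = aggregate([Trial(True, 2.0, 100), Trial(False, 4.0, 300), Trial(True, 3.0, 200)])
    v2 = aggregate([Trial(True, 2.0, 100), Trial(True, 2.0, 100)])
    print(json.dumps({"v1": v1, "v2": v2, "diff": compare(v1, v2)}, indent=2))
```

Reporting the standard deviation next to each mean is what makes a version-to-version comparison meaningful: a difference smaller than the trial-to-trial spread is more likely noise than a real regression.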


$ git log --oneline --stat

stars:58

forks:28

updated:March 6, 2026 16:30

SKILL.md

readonly

related-skills.json

Same repository

skill-forge-eval.md

from "AgriciDaniel/skill-forge"

Run evaluation pipelines on Claude Code skills to test triggering accuracy, workflow correctness, and output quality. Spawns executor, grader, comparator, and analyzer sub-agents for parallel evaluation. Generates eval_metadata.json, grading.json, and feedback reports. Use when user says "eval skill", "test skill", "run evals", "evaluate skill", "skill evals", "test skill quality", "run skill tests", or "skill evaluation".

2026-03-06 · 58 stars

skill-forge.md

from "AgriciDaniel/skill-forge"

Ultimate Claude Code skill creator and architect. Designs, scaffolds, builds, reviews, evolves, and publishes production-grade Claude Code skills following the Agent Skills open standard and 3-layer architecture (directive, orchestration, execution). Handles single-file skills, multi-skill orchestrators with sub-skills and subagents, MCP-enhanced workflows, and full skill ecosystems. Industry detection for skill domain. Triggers on: "create skill", "build skill", "new skill", "skill creator", "skill builder", "skill-forge", "design skill", "scaffold skill", "review skill", "improve skill", "publish skill", "skill architecture", "convert skill", "port skill", "multi-platform", "cross-platform", "eval skill", "test skill", "benchmark skill", "skill evals", "measure skill", "skill performance", "skill A/B test".

2026-03-06 · 58 stars

skill-forge-evolve.md

from "AgriciDaniel/skill-forge"

Improve and iterate on existing Claude Code skills based on usage feedback, test results, or changing requirements. Handles under/over-triggering fixes, instruction refinement, new sub-skill addition, and architecture evolution. Use when user says "improve skill", "fix skill", "skill not triggering", "skill triggers too much", "update skill", or "evolve skill".

2026-03-06 · 58 stars

skill-forge-review.md

from "AgriciDaniel/skill-forge"

Audit and validate existing Claude Code skills for quality, triggering accuracy, structure compliance, and best practices. Scores skills on a 0-100 scale and provides prioritized improvement recommendations. Use when user says "review skill", "audit skill", "check skill", "validate skill", or "skill quality".

2026-03-06 · 58 stars

skill-forge-build.md

from "AgriciDaniel/skill-forge"

Scaffold and build Claude Code skills from plans or descriptions. Generates SKILL.md files, sub-skills, scripts, references, agents, and templates following the Agent Skills standard. Use when user says "build skill", "scaffold skill", "generate skill", "create SKILL.md", or "implement skill".

2026-02-16 · 58 stars

skill-forge-convert.md

from "AgriciDaniel/skill-forge"

Convert Claude Code skills to work on OpenAI Codex, Google Gemini CLI, Google Antigravity, and Cursor. Analyzes platform-specific features, generates target files (openai.yaml, AGENTS.md, GEMINI.md, .mdc rules), adapts frontmatter, converts MCP config, and produces compatibility reports. Use when user says "convert skill", "port skill", "multi-platform", "skill for codex", "skill for gemini", "skill for antigravity", "skill for cursor", "cross-platform skill", "convert to codex", "convert to gemini", "convert to antigravity", or "convert to cursor".

2026-02-16 · 58 stars

package.json

"author": "AgriciDaniel"

"repository": "AgriciDaniel/skill-forge"


$ install --global

$ download --local


$ useful --forSOC

Software Quality Assurance Analysts and Testers · Computer and Mathematical Occupations · 15-1253 · L4

skill-forge-benchmark

Skill Benchmarking & Performance Tracking

Process

Step 1: Define Benchmark Configuration

Step 2: Execute Benchmark Runs

Step 3: Aggregate Results

Step 4: Compare with Previous Iterations

Step 5: Generate Benchmark Report

Step 6: Threshold Gating

Error Handling

Integration with Other Sub-Skills
