Run any Skill in Manus with one click

$pwd:

skill-forge-benchmark

Name: Skill Forge Benchmark
Author: AgriciDaniel

// Benchmark Claude Code skill performance with variance analysis, tracking pass rate, execution time, and token usage across iterations. Runs multiple trials per eval for statistical reliability, aggregates results into benchmark.json, and generates comparison reports between skill versions. Use when user says "benchmark skill", "measure skill performance", "skill metrics", "compare skill versions", "skill performance", "track skill improvement", "skill regression test", or "skill A/B test".
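The description above mentions running several trials per eval, tracking pass rate, execution time, and token usage, and comparing skill versions. A minimal sketch of that aggregation in Python follows. The `Trial` record, the function names, and the output field names are hypothetical, chosen for illustration; the actual `benchmark.json` schema is not shown on this page.

```python
import json
import statistics
from dataclasses import dataclass


@dataclass
class Trial:
    """One benchmark run of a single eval (hypothetical record shape)."""
    passed: bool
    seconds: float
    tokens: int


def summarize(values):
    """Return the mean and sample standard deviation (0.0 for one value)."""
    values = list(values)
    stdev = statistics.stdev(values) if len(values) > 1 else 0.0
    return {"mean": statistics.mean(values), "stdev": stdev}


def aggregate(trials):
    """Collapse repeated trials into the three metrics named above."""
    if not trials:
        raise ValueError("need at least one trial")
    return {
        "trials": len(trials),
        "pass_rate": sum(t.passed for t in trials) / len(trials),
        "execution_time_s": summarize(t.seconds for t in trials),
        "tokens": summarize(t.tokens for t in trials),
    }


def compare(previous, current):
    """Per-metric difference between two versions (current minus previous)."""
    return {
        "pass_rate_delta": current["pass_rate"] - previous["pass_rate"],
        "time_delta_s": current["execution_time_s"]["mean"]
        - previous["execution_time_s"]["mean"],
        "tokens_delta": current["tokens"]["mean"] - previous["tokens"]["mean"],
    }


# Illustrative numbers only, not measurements from the real skill.
v1 = aggregate([
    Trial(True, 10.0, 1000),
    Trial(False, 14.0, 1200),
    Trial(True, 12.0, 1100),
    Trial(True, 12.0, 1100),
])
v2 = aggregate([Trial(True, 9.0, 900) for _ in range(4)])
delta = compare(v1, v2)

print(json.dumps({"v1": v1, "v2": v2, "delta": delta}, indent=2))
```

Reporting the standard deviation next to each mean is what lets a reader judge whether a difference between two versions is larger than the run-to-run noise.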

$ git log --oneline --stat

stars:58

forks:28

updated: March 6, 2026, 16:30

SKILL.md

related-skills.json

same repository

skill-forge-eval.md

from "AgriciDaniel/skill-forge"

Run evaluation pipelines on Claude Code skills to test triggering accuracy, workflow correctness, and output quality. Spawns executor, grader, comparator, and analyzer sub-agents for parallel evaluation. Generates eval_metadata.json, grading.json, and feedback reports. Use when user says "eval skill", "test skill", "run evals", "evaluate skill", "skill evals", "test skill quality", "run skill tests", or "skill evaluation".

2026-03-06 · 58

skill-forge.md

from "AgriciDaniel/skill-forge"

Ultimate Claude Code skill creator and architect. Designs, scaffolds, builds, reviews, evolves, and publishes production-grade Claude Code skills following the Agent Skills open standard and 3-layer architecture (directive, orchestration, execution). Handles single-file skills, multi-skill orchestrators with sub-skills and subagents, MCP-enhanced workflows, and full skill ecosystems. Industry detection for skill domain. Triggers on: "create skill", "build skill", "new skill", "skill creator", "skill builder", "skill-forge", "design skill", "scaffold skill", "review skill", "improve skill", "publish skill", "skill architecture", "convert skill", "port skill", "multi-platform", "cross-platform", "eval skill", "test skill", "benchmark skill", "skill evals", "measure skill", "skill performance", "skill A/B test".

2026-03-06 · 58

skill-forge-evolve.md

from "AgriciDaniel/skill-forge"

Improve and iterate on existing Claude Code skills based on usage feedback, test results, or changing requirements. Handles under/over-triggering fixes, instruction refinement, new sub-skill addition, and architecture evolution. Use when user says "improve skill", "fix skill", "skill not triggering", "skill triggers too much", "update skill", or "evolve skill".

2026-03-06 · 58

skill-forge-review.md

from "AgriciDaniel/skill-forge"

Audit and validate existing Claude Code skills for quality, triggering accuracy, structure compliance, and best practices. Scores skills on a 0-100 scale and provides prioritized improvement recommendations. Use when user says "review skill", "audit skill", "check skill", "validate skill", or "skill quality".

2026-03-06 · 58

skill-forge-build.md

from "AgriciDaniel/skill-forge"

Scaffold and build Claude Code skills from plans or descriptions. Generates SKILL.md files, sub-skills, scripts, references, agents, and templates following the Agent Skills standard. Use when user says "build skill", "scaffold skill", "generate skill", "create SKILL.md", or "implement skill".

2026-02-16 · 58

skill-forge-convert.md

from "AgriciDaniel/skill-forge"

Convert Claude Code skills to work on OpenAI Codex, Google Gemini CLI, Google Antigravity, and Cursor. Analyzes platform-specific features, generates target files (openai.yaml, AGENTS.md, GEMINI.md, .mdc rules), adapts frontmatter, converts MCP config, and produces compatibility reports. Use when user says "convert skill", "port skill", "multi-platform", "skill for codex", "skill for gemini", "skill for antigravity", "skill for cursor", "cross-platform skill", "convert to codex", "convert to gemini", "convert to antigravity", or "convert to cursor".

2026-02-16 · 58

package.json

"author": "AgriciDaniel"

"repository": "AgriciDaniel/skill-forge"


$ useful --forSOC

Software Quality Assurance Analysts and Testers · Computer and Mathematical Occupations · 15-1253 · L4

skill-forge-benchmark

Skill Benchmarking & Performance Tracking

Process

Step 1: Define Benchmark Configuration

Step 2: Execute Benchmark Runs

Step 3: Aggregate Results

Step 4: Compare with Previous Iterations

Step 5: Generate Benchmark Report

Step 6: Threshold Gating

Error Handling

Integration with Other Sub-Skills
