worker-benchmarks

Run comprehensive worker system benchmarks and performance analysis

Stars: 59,493
Forks: 6,880
Updated: March 25, 2026, 20:28

Source: ruvnet/ruflo (GitHub repository by ruvnet)

Applicable occupation (SOC): Software Quality Assurance Analysts and Testers, Computer and Mathematical Occupations, 15-1253, L4

SKILL.md

name	worker-benchmarks
description	Run comprehensive worker system benchmarks and performance analysis
user-invocable	true

Worker Benchmarks Skill

Run comprehensive performance benchmarks for the agentic-flow worker system.

Quick Start

# Run full benchmark suite
npx agentic-flow workers benchmark

# Run specific benchmark
npx agentic-flow workers benchmark --type trigger-detection
npx agentic-flow workers benchmark --type registry
npx agentic-flow workers benchmark --type agent-selection
npx agentic-flow workers benchmark --type concurrent

Benchmark Types

1. Trigger Detection (`trigger-detection`)

Tests keyword detection speed across 12 worker triggers.

Target: p95 < 5ms
Iterations: 1000
Metrics: latency, throughput, histogram
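
The p95 targets in this section compare the 95th-percentile latency of the collected samples against a limit. A minimal sketch of that check follows, assuming the nearest-rank percentile definition; `percentile` and `meetsTarget` are hypothetical helpers for illustration, not functions exported by agentic-flow, and the tool may compute percentiles differently.

```typescript
// Hypothetical helper (not part of agentic-flow): nearest-rank percentile.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  // Sort a copy so the caller's array is left untouched.
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  // Nearest rank: the smallest value with at least p% of samples at or below it.
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// True when the measured p95 is strictly below the target, as in "p95 < 5ms".
function meetsTarget(samplesMs: number[], targetP95Ms: number): boolean {
  return percentile(samplesMs, 95) < targetP95Ms;
}

// Synthetic latencies for illustration: 0.01, 0.02, ..., 1.00 ms.
const samples: number[] = [];
for (let i = 1; i <= 100; i++) samples.push(i / 100);

const p95 = percentile(samples, 95);        // 0.95
const withinTarget = meetsTarget(samples, 5); // true
```

Nearest-rank always returns a latency that was actually observed, which keeps the reported p95 easy to trace back to a sample.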

2. Worker Registry (`registry`)

Tests CRUD operations on worker entries.

Target: p95 < 10ms
Iterations: 500 each of create, get, and update (1,500 operations in total)
Metrics: per-operation latency breakdown

3. Agent Selection (`agent-selection`)

Tests performance-based agent selection.

Target: p95 < 1ms
Iterations: 1000
Metrics: selection confidence, agent scores

4. Model Cache (`cache`)

Tests model caching performance.

Target: p95 < 0.5ms
Metrics: hit rate, cache size, eviction stats
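
To make the three cache metrics concrete, here is a small sketch of a least-recently-used cache that tracks hits, misses, and evictions. `LruCache` is a hypothetical class written for this example; the eviction policy and internals of the actual agentic-flow model cache are not described in this document, so LRU is an assumption.

```typescript
// Hypothetical LRU cache with counters; not the agentic-flow model cache.
class LruCache<K, V> {
  private entries = new Map<K, V>();
  private capacity: number;
  hits = 0;
  misses = 0;
  evictions = 0;

  constructor(capacity: number) {
    if (capacity < 1) throw new Error("capacity must be at least 1");
    this.capacity = capacity;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) {
      this.misses++;
      return undefined;
    }
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    this.hits++;
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) {
      this.entries.delete(key);
    } else if (this.entries.size >= this.capacity) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) {
        this.entries.delete(oldest);
        this.evictions++;
      }
    }
    this.entries.set(key, value);
  }

  get size(): number {
    return this.entries.size;
  }

  // Fraction of lookups served from the cache; 0 when nothing was looked up.
  get hitRate(): number {
    const lookups = this.hits + this.misses;
    return lookups === 0 ? 0 : this.hits / lookups;
  }
}

const cache = new LruCache<string, string>(2);
cache.set("model-a", "A");
cache.set("model-b", "B");
cache.get("model-a");      // hit; model-b is now the least recently used
cache.set("model-c", "C"); // cache is full, so model-b is evicted
cache.get("model-b");      // miss
```

After this sequence the cache holds two entries with one hit, one miss, and one eviction, giving a hit rate of 0.5.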

5. Concurrent Workers (`concurrent`)

Tests parallel worker creation and updates.

Target: total < 1000ms for 10 workers
Metrics: per-worker latency, memory usage

6. Memory Key Generation (`memory-keys`)

Tests memory pattern key generation.

Target: p95 < 0.1ms
Iterations: 5000
Metrics: unique patterns, throughput

Output Format

═══════════════════════════════════════════════════════════
📈 BENCHMARK RESULTS
═══════════════════════════════════════════════════════════

✅ Trigger Detection
   Operation: detect
   Count: 1,000
   Avg: 0.045ms | p95: 0.120ms (target: 5ms)
   Throughput: 22,222 ops/s
   Memory Δ: 0.12MB

✅ Worker Registry
   Operation: crud
   Count: 1,500
   Avg: 1.234ms | p95: 3.456ms (target: 10ms)
   Throughput: 810 ops/s
   Memory Δ: 2.34MB

───────────────────────────────────────────────────────────
📊 SUMMARY
───────────────────────────────────────────────────────────
Total Tests: 6
Passed: 6 | Failed: 0
Avg Latency: 0.567ms
Total Duration: 2345ms
Peak Memory: 8.90MB
═══════════════════════════════════════════════════════════
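
The throughput figures in the sample output are consistent with the reciprocal of the average latency (1000 ms divided by the average in ms). The sketch below reproduces that arithmetic; `throughputOpsPerSec` is a hypothetical helper, and whether the tool derives throughput this way or measures it separately is an assumption.

```typescript
// Assumption: operations run sequentially, so throughput = 1000 / mean latency (ms).
function throughputOpsPerSec(avgLatencyMs: number): number {
  if (avgLatencyMs <= 0) throw new Error("latency must be positive");
  return 1000 / avgLatencyMs;
}

// Average latencies taken from the sample output above.
const triggerOps = Math.floor(throughputOpsPerSec(0.045));  // 22222
const registryOps = Math.floor(throughputOpsPerSec(1.234)); // 810
```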

Integration with Settings

Benchmark thresholds are configured in .claude/settings.json:

{
  "performance": {
    "benchmarkThresholds": {
      "triggerDetection": { "p95Ms": 5 },
      "workerRegistry": { "p95Ms": 10 },
      "agentSelection": { "p95Ms": 1 },
      "memoryKeyGeneration": { "p95Ms": 0.1 },
      "concurrentWorkers": { "totalMs": 1000 }
    }
  }
}
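
As an illustration of how these thresholds might be applied, the sketch below compares measured values against a `benchmarkThresholds` object of the shape shown above and returns the names of any benchmarks that miss their limit. `failedBenchmarks` and the `Measurement` shape are hypothetical; the real comparison logic in agentic-flow is not shown in this document.

```typescript
// Shapes mirror the settings fragment above; the helper itself is hypothetical.
interface Threshold { p95Ms?: number; totalMs?: number; }
interface Measurement { p95Ms?: number; totalMs?: number; }

function failedBenchmarks(
  thresholds: Record<string, Threshold>,
  measured: Record<string, Measurement>
): string[] {
  const failures: string[] = [];
  for (const name of Object.keys(thresholds)) {
    const limit = thresholds[name];
    const result = measured[name];
    if (result === undefined) continue; // benchmark was not run; nothing to compare
    // Targets are strict ("p95 < 5ms"), so a value equal to the limit counts as a failure.
    const p95Failed =
      limit.p95Ms !== undefined && result.p95Ms !== undefined && result.p95Ms >= limit.p95Ms;
    const totalFailed =
      limit.totalMs !== undefined && result.totalMs !== undefined && result.totalMs >= limit.totalMs;
    if (p95Failed || totalFailed) failures.push(name);
  }
  return failures;
}

const thresholds: Record<string, Threshold> = {
  triggerDetection: { p95Ms: 5 },
  workerRegistry: { p95Ms: 10 },
  agentSelection: { p95Ms: 1 },
  memoryKeyGeneration: { p95Ms: 0.1 },
  concurrentWorkers: { totalMs: 1000 },
};

// The first two values come from the sample output; the last is made up to show a failure.
const failures = failedBenchmarks(thresholds, {
  triggerDetection: { p95Ms: 0.12 },
  workerRegistry: { p95Ms: 3.456 },
  concurrentWorkers: { totalMs: 1200 },
});
// failures is ["concurrentWorkers"]
```

A list of failing names like this could be used to fail a CI job when any benchmark regresses past its threshold.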

Programmatic Usage

import { workerBenchmarks, runBenchmarks } from 'agentic-flow/workers/worker-benchmarks';

// Run full suite
const suite = await runBenchmarks();
console.log(suite.summary);

// Run individual benchmarks
const triggerResult = await workerBenchmarks.benchmarkTriggerDetection(1000);
const registryResult = await workerBenchmarks.benchmarkRegistryOperations(500);

Performance Optimization Tips

Model Cache: Enable by setting CLAUDE_FLOW_MODEL_CACHE_MB=512
Parallel Workers: Enable by setting CLAUDE_FLOW_WORKER_PARALLEL=true
Warning Suppression: Enable by setting CLAUDE_FLOW_SUPPRESS_WARNINGS=true
SQLite WAL Mode: Enabled automatically for better concurrent performance
