woz-benchmark

Name: Woz Benchmark
Author: WithWoz

// Compare WOZCODE vs vanilla Claude Code on the user's codebase — real cost, turn, and time savings. TRIGGER on "compare woz", "how much does woz save", "benchmark woz", "woz vs claude", "show me savings", or /woz-benchmark.

stars:181

forks:21

updated: May 12, 2026, 20:43

SKILL.md

name	woz-benchmark
description	Compare WOZCODE vs vanilla Claude Code on the user's codebase — real cost, turn, and time savings. TRIGGER on "compare woz", "how much does woz save", "benchmark woz", "woz vs claude", "show me savings", or /woz-benchmark.
allowed-tools	Bash(node ), Bash(git ), Bash(ls ), Bash(test ), Bash(mkdir ), Bash(date ), Write, Read

WOZCODE Savings Benchmark

Run a side-by-side comparison of WOZCODE vs vanilla Claude Code on the user's own codebase. Each prompt runs twice against a fresh copy of the repo with git reset --hard between runs, so the target MUST be a clean git repo.

TRIGGER: "compare woz", "how much does woz save", "benchmark woz", "woz vs claude", "show me the savings", "is woz worth it", or /woz-benchmark.

Prerequisites

User logged in to WOZCODE (if not, stop and ask them to /woz-login).
Target directory is a git repo with a clean working tree.

Steps

1. Gather inputs — BE BRIEF

Ask for all three in ONE short message (< 10 lines). Do not re-explain what the benchmark does — the user already invoked it.

Target directory — absolute path to a clean git repo to run the test on.
Prompts — 2–10 real coding tasks. Tell them briefly: "meaty feature/refactor/bugfix work, not one-liners — trivial prompts hide WOZCODE's advantage". If they don't have prompts in mind, offer to suggest some after looking at their repo.
Environment setup (optional) — one line: "Anything Claude needs already in place (DB seeded, services running, credentials in .env)? Skip if the repo is self-contained."

Do NOT ask about the model. Default to opus in the YAML config. Only switch to sonnet or haiku if the user volunteers a different choice in their answer (e.g. "use sonnet" or "try it on haiku").

Keep examples OUT of the user message unless they ask for help picking prompts. The user doesn't need 4 bullet points of good-vs-bad prompt examples.

2. Validate the target

Before writing any config, verify the target is usable:

test -d <target>
git -C <target> rev-parse --git-dir
git -C <target> status --porcelain

If the directory doesn't exist, isn't a git repo, or has uncommitted changes, STOP and tell the user how to fix it (e.g. "please commit or stash your changes — the benchmark resets the repo between runs and would lose your work").
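The three checks above can be folded into one helper so each failure maps to a distinct message. This is only a sketch: the function name validate_target and its return codes are hypothetical and are not part of the WOZCODE plugin.

```shell
# Hypothetical helper: succeeds only if the target is an existing, clean git repo.
# Return codes (an assumption of this sketch): 1 = no directory, 2 = not a repo, 3 = dirty tree.
validate_target() {
  target="$1"
  if ! test -d "$target"; then
    echo "not a directory: $target" >&2
    return 1
  fi
  if ! git -C "$target" rev-parse --git-dir >/dev/null 2>&1; then
    echo "not a git repo: $target" >&2
    return 2
  fi
  # --porcelain prints one line per changed or untracked file; empty output means clean.
  if [ -n "$(git -C "$target" status --porcelain)" ]; then
    echo "uncommitted changes in: $target (commit or stash them first)" >&2
    return 3
  fi
  return 0
}
```

Calling validate_target /path/to/repo and checking its exit status is then enough to decide whether to stop and ask the user to fix the repo.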

3. Write a temporary benchmark config

Use the Write tool to create a YAML file at /tmp/woz-benchmark-<timestamp>.yaml (get the timestamp from date +%s). Format:

model: opus
maxTurns: 15
prompts:
  - "first prompt from the user"
  - "second prompt from the user"
setup:
  commands:
    - "curl -L https://example.com/dataset.csv -o data/sample.csv"
    - "psql $DATABASE_URL -f seed.sql"

Notes:

Default to model: opus. Only use a different model if the user volunteered one in their answer.
Quote every prompt string. If a prompt contains a double quote, escape it with \"; write a literal backslash as \\.
Omit the entire setup: block if the user didn't give any environment setup commands.
Keep maxTurns: 15 as a safety cap so a single prompt can't run away.
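The quoting rule is easy to get wrong by hand, so here is a minimal sketch of it in shell. The skill itself creates the file with the Write tool; the helper names yaml_quote and write_config are hypothetical, and the sketch assumes single-line prompts and no setup: block.

```shell
# Hypothetical helper: escape a string for use inside a YAML double-quoted scalar.
yaml_quote() {
  # Escape backslashes first, then double quotes, so the added escapes are not re-escaped.
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: $1 = output path, remaining arguments = prompts (one per argument).
write_config() {
  out="$1"
  shift
  {
    echo "model: opus"
    echo "maxTurns: 15"
    echo "prompts:"
    for p in "$@"; do
      printf '  - "%s"\n' "$(yaml_quote "$p")"
    done
  } > "$out"
}
```

For example, write_config "/tmp/woz-benchmark-$(date +%s).yaml" 'Fix the "login" redirect bug' would emit the prompt as "Fix the \"login\" redirect bug".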

4. Run the benchmark

One-line warning: "This'll take several minutes — each prompt runs twice." Then run:

node --no-warnings=ExperimentalWarning ${CLAUDE_PLUGIN_ROOT}/scripts/benchmark.js --target <target> --config <yaml-path> --user-env

--user-env loads the user's project CLAUDE.md hierarchy on BOTH sides. Do NOT pass --screenshots, --codex, --judge, or --trace.

5. Present the results as a savings report

The benchmark prints a detailed text report at the end. Relay the full report to the user, with a clear, sales-oriented savings summary placed above it. Compute the deltas from the report's totals:

💰 WOZCODE Savings Summary
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Cost saved:       $X.XX  (Y% cheaper)
  Tokens saved:     X,XXX  (Y% fewer)
  Turns saved:      N      (Y% fewer)
  Time saved:       X min  (Y% faster)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Frame the numbers positively. If WOZCODE was slower or more expensive on a specific prompt, call it out honestly but note the aggregate.
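Each row of the summary is the same arithmetic: saved = vanilla total minus WOZCODE total, and percent = saved divided by the vanilla total. A small sketch follows, assuming the two totals are read off the report by hand (the report's exact format is not shown here); the function name savings is hypothetical.

```shell
# Hypothetical helper: print absolute and percentage savings for one metric.
# $1 = vanilla Claude Code total, $2 = WOZCODE total
savings() {
  awk -v base="$1" -v woz="$2" 'BEGIN {
    # A zero or negative baseline has no meaningful percentage.
    if (base <= 0) { print "n/a"; exit }
    saved = base - woz
    printf "%.2f (%.0f%%)\n", saved, saved / base * 100
  }'
}
```

A negative result means WOZCODE cost more on that metric, which is the case the paragraph above says to call out honestly.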

Finally, tell the user where the detailed JSON report was saved (the benchmark prints this path).

Tips

If the user has no prompts in mind, read a few files in their repo and suggest 2-3 realistic tasks tailored to what you see.
The temp YAML file is safe to leave in /tmp; most systems clear that directory on reboot or on a schedule.
If the user wants to re-run with different prompts, just generate a new YAML and call the script again.
If the benchmark fails midway, its cleanup handler resets the repo to its original HEAD — the user's work is safe.

related-skills.json

Same repository

woz-bug.md

from "WithWoz/wozcode-plugin"

Report a WOZCODE bug. Same backend as /woz-feedback, tagged for bug triage. Session context (current session id, anonymous id, OS, arch, Node version) is auto-attached.

2026-05-29 · 181 stars

woz-feedback.md

from "WithWoz/wozcode-plugin"

Share feedback about WOZCODE — feature requests, general thoughts, anything that's working or not. For broken-behavior reports use /woz-bug (same backend, bug-tagged).

2026-05-29 · 181 stars

woz-benchmark.md

from "WithWoz/wozcode-plugin"

Compare WOZCODE vs vanilla Claude Code on the user's codebase — real cost, turn, and time savings. TRIGGER on "compare woz", "how much does woz save", "benchmark woz", "woz vs claude", "show me savings", or /woz-benchmark.

2026-05-18 · 181 stars

woz-login.md

from "WithWoz/wozcode-plugin"

Authenticate with the Woz service. Use when the user needs to log in or when authentication is required.

2026-05-18 · 181 stars

woz-logout.md

from "WithWoz/wozcode-plugin"

Clear stored Woz credentials and log out.

2026-05-18 · 181 stars

woz-savings.md

from "WithWoz/wozcode-plugin"

Show WOZCODE savings report - calls saved, time saved, tokens saved, and lifetime totals.

2026-05-18 · 181 stars

package.json

"author": "WithWoz"

"repository": "WithWoz/wozcode-plugin"
