perfup
// Autonomous performance optimization: research, PoC, benchmark, implement, review, PR
stars: 2,620
forks: 329
updated: March 21, 2026, 12:10
SKILL.md
| name | perfup |
| description | Autonomous performance optimization: research, PoC, benchmark, implement, review, PR |
| disable-model-invocation | true |
| allowed-tools | ["Bash","Read","Write","Edit","Glob","Grep","Task","WebSearch","WebFetch","Skill","TaskCreate","TaskUpdate","TaskList","TaskGet","EnterPlanMode","ExitPlanMode","AskUserQuestion"] |
Inspired by karpathy/autoresearch: you are an autonomous performance researcher for vllm-mlx. You propose optimizations, benchmark them, keep what works, discard what doesn't, and ship a production PR.
- reports/perfup-results.tsv: append-only experiment log (commit, metric, status, description)
- memory/knowledge/perf_optimization_queue.md: ranked list of candidates
- memory/MEMORY.md: what's been done, what's known
- scripts/benchmark_engines.py

Read existing state, then discover new opportunities.
- memory/knowledge/perf_optimization_queue.md and memory/MEMORY.md
- If $ARGUMENTS is provided (e.g. /perfup decode), focus on that area. Otherwise, do a broad search.
- gh release list --repo ml-explore/mlx-lm --limit 5

Score and rank. Persist to memory.
memory/knowledge/perf_optimization_queue.md:
This is the core loop. Inspired by autoresearch: try, measure, keep or discard. Repeat.
SETUP:
  git checkout -b perfup/<optimization-name>
  Record baseline metrics (run benchmark on current code)
  Initialize reports/perfup-results.tsv if not exists

LOOP:
  1. Implement minimal PoC change in code
  2. git commit -m "perfup: <brief description>"
  3. Run benchmark: python3.12 scripts/benchmark_engines.py (or custom)
     Redirect output: > reports/perfup-run.log 2>&1
  4. Extract metrics from log (TTFT, decode tok/s, etc.)
  5. Record to reports/perfup-results.tsv:
     commit<TAB>decode_tps<TAB>ttft_ms<TAB>status<TAB>description
  6. DECISION:
     - If metric improved: KEEP. Log "keep" status. This is the new baseline.
     - If metric same or worse: DISCARD. Log "discard". git reset --hard to previous keep.
     - If crashed: Log "crash". Try to fix (1-2 attempts). If unfixable, discard and move on.
  7. If improvement confirmed and significant (>5%): break loop → Phase 4
  8. If no candidate works after trying top 3: inform user and stop.
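The decision step and the TSV logging in the loop above can be sketched in Python. This is a minimal sketch, assuming decode tok/s is the metric under optimization and that higher is better. The names `Result`, `decide`, and `append_result` are hypothetical helpers introduced for illustration; they are not part of the skill or of scripts/benchmark_engines.py. The 5% threshold comes from step 7.

```python
import csv
from dataclasses import dataclass
from pathlib import Path
from typing import Tuple

SIGNIFICANT_GAIN = 0.05  # step 7: an improvement above 5% ends the loop

HEADER = ["commit", "decode_tps", "ttft_ms", "status", "description"]


@dataclass
class Result:
    """One row of the experiment log (hypothetical container)."""
    commit: str
    decode_tps: float
    ttft_ms: float
    status: str
    description: str


def decide(baseline_tps: float, new_tps: float, crashed: bool = False) -> Tuple[str, bool]:
    """Return (status, significant) for one experiment.

    Assumes higher decode tok/s is better. "significant" is True only
    when the relative gain over the baseline exceeds SIGNIFICANT_GAIN.
    """
    if crashed:
        return "crash", False
    if new_tps <= baseline_tps:
        # Same or worse: the caller would reset to the previous kept commit.
        return "discard", False
    gain = (new_tps - baseline_tps) / baseline_tps
    return "keep", gain > SIGNIFICANT_GAIN


def append_result(path: Path, result: Result) -> None:
    """Append one row to the TSV log, writing the header if the file is new."""
    is_new = not path.exists()
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        if is_new:
            writer.writerow(HEADER)
        writer.writerow([result.commit, result.decode_tps, result.ttft_ms,
                         result.status, result.description])
```

Opening the log in append mode keeps it append-only, matching how reports/perfup-results.tsv is described. The sketch does not run git or the benchmark itself; those steps stay with the agent.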
Rules for the loop:
PoC validated. Now build it properly.
- tests/
- python3.12 -m pytest tests/ -v

Independent review via Codex.
- /review-loop <description of optimization>

- perfup/<name> or feat/<name> branch
- raullenchai remote (NEVER origin, NEVER main directly)
- gh pr create --repo raullenchai/vllm-mlx --base main
PR body must include:
perf_optimization_queue.md with PR#, date, confirmed speedup

commit	decode_tps	ttft_ms	status	description
a1b2c3d	68.4	245	baseline	current main branch
b2c3d4e	72.1	240	keep	reduce redundant mx.eval in decode loop
c3d4e5f	67.9	248	discard	speculative prefill chunking
d4e5f6g	0.0	0	crash	fused MoE kernel (import error)
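To show how the log format above can be read back, here is a minimal Python sketch that parses the example rows and reports the best kept experiment relative to the baseline. The `summarize` function and the `SAMPLE_TSV` constant are hypothetical; the rows are copied from the example log, and the sketch assumes exactly one row carries the "baseline" status.

```python
import csv
import io
from collections import Counter

# Rows copied from the example log above (tab-separated).
SAMPLE_TSV = (
    "commit\tdecode_tps\tttft_ms\tstatus\tdescription\n"
    "a1b2c3d\t68.4\t245\tbaseline\tcurrent main branch\n"
    "b2c3d4e\t72.1\t240\tkeep\treduce redundant mx.eval in decode loop\n"
    "c3d4e5f\t67.9\t248\tdiscard\tspeculative prefill chunking\n"
    "d4e5f6g\t0.0\t0\tcrash\tfused MoE kernel (import error)\n"
)


def summarize(tsv_text: str) -> dict:
    """Summarize an experiment log: status counts and best kept speedup.

    Assumes one "baseline" row and that higher decode_tps is better.
    """
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    baseline = next(r for r in rows if r["status"] == "baseline")
    base_tps = float(baseline["decode_tps"])

    kept = [r for r in rows if r["status"] == "keep"]
    best = max(kept, key=lambda r: float(r["decode_tps"]), default=None)

    speedup_pct = None
    if best is not None:
        speedup_pct = (float(best["decode_tps"]) - base_tps) / base_tps * 100

    return {
        "counts": dict(Counter(r["status"] for r in rows)),
        "best_commit": best["commit"] if best else None,
        "speedup_pct": speedup_pct,
    }


if __name__ == "__main__":
    summary = summarize(SAMPLE_TSV)
    print(summary["best_commit"], round(summary["speedup_pct"], 1))
```

On the sample rows, the kept commit b2c3d4e improves decode throughput from 68.4 to 72.1 tok/s, a gain of about 5.4%, which clears the 5% bar used in the loop.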
If $ARGUMENTS is provided:
- ttft: Time to first token (prefill optimization)
- decode: Decode throughput (tok/s)
- tools: Tool calling accuracy/reliability
- accuracy: Model output quality
- memory: Memory usage / longer contexts
- prefill: Prefill speed
- cache: Cache hit rate / prompt reuse

perf_optimization_queue.md is the canonical record of what's tried/works/failed.
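The focus areas accepted through $ARGUMENTS can be captured in a small lookup. This is a minimal sketch, not how the skill itself parses its arguments: `FOCUS_AREAS` and `parse_focus` are hypothetical names, and rejecting an unknown area with an error is an assumption, since the source does not say what happens in that case.

```python
from typing import Optional

# Focus areas and descriptions as listed in the skill.
FOCUS_AREAS = {
    "ttft": "Time to first token (prefill optimization)",
    "decode": "Decode throughput (tok/s)",
    "tools": "Tool calling accuracy/reliability",
    "accuracy": "Model output quality",
    "memory": "Memory usage / longer contexts",
    "prefill": "Prefill speed",
    "cache": "Cache hit rate / prompt reuse",
}


def parse_focus(arguments: str) -> Optional[str]:
    """Return the focus area named in $ARGUMENTS, or None for a broad search.

    Assumption: the first whitespace-separated word names the area, and
    an unknown word is treated as an error.
    """
    words = arguments.split()
    if not words:
        return None  # no argument given: broad search
    area = words[0].lower()
    if area not in FOCUS_AREAS:
        known = ", ".join(sorted(FOCUS_AREAS))
        raise ValueError(f"unknown focus area {area!r}; expected one of: {known}")
    return area
```

For example, `parse_focus("decode")` returns `"decode"`, while `parse_focus("")` returns `None` and the search stays broad.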