sim-compare

Stars: 17,720

Forks: 1,694

Updated: March 22, 2026, 22:58

Run cache policy hit rate comparison across multiple cache sizes with charts


Source

ben-manes

ben-manes/caffeine


Input

Trace file: $ARGUMENTS

If no policies or sizes are specified, use sensible defaults based on the trace.

Workflow

Identify the trace format. Check the file extension and contents:
- .gz files: try common formats (lirs, arc, etc.)
- Look at simulator/src/main/resources/reference.conf for format options
- Use format:path syntax (e.g., lirs:trace.gz)
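The format check above can be partly automated. The following is a minimal sketch, not part of the simulator: `is_gzip`, `peek_lines`, and `trace_spec` are hypothetical helper names. It only detects gzip compression and shows the first lines; choosing the format name (for example lirs or arc) still requires reading reference.conf.

```python
import gzip
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzip(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def peek_lines(path, count=3):
    """Return up to `count` leading lines, decompressing gzip if needed."""
    opener = gzip.open if is_gzip(path) else open
    lines = []
    with opener(path, "rb") as f:
        for raw in f:
            lines.append(raw.decode("utf-8", errors="replace").rstrip("\r\n"))
            if len(lines) >= count:
                break
    return lines


def trace_spec(trace_format, path):
    """Join a format name and a path using the format:path syntax."""
    return f"{trace_format}:{Path(path).as_posix()}"
```

For instance, `trace_spec("lirs", "trace.gz")` yields the `lirs:trace.gz` form used in the example above.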

Select policies to compare. Policy names use the category.PolicyName format. Note: config categories use hyphens (two-queue, greedy-dual), not underscores. Each policy is paired with every configured admission filter (default: Always, TinyLfu, Clairvoyant), so a single policy name produces multiple instances.

Include at minimum:
- product.Caffeine (the production implementation)
- opt.Clairvoyant (theoretical optimum, the upper bound at each cache size)
- opt.Unbounded (infinite cache, the ceiling regardless of size)
- linked.Lru (baseline)
- sketch.WindowTinyLfu (research W-TinyLFU)
- sketch.HillClimberWindowTinyLfu (adaptive variant)
- Add relevant competitors based on trace characteristics:
  - For recency-heavy: linked.S4Lru, adaptive.Arc
  - For frequency-heavy: linked.Lfu, irr.Lirs
  - For scan-resistant: two-queue.TwoQueue, two-queue.S3Fifo
  - For size-aware traces: greedy-dual.Gdsf, greedy-dual.Camp
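The selection rules above can be written down as a small lookup. This is a sketch under assumptions: the trait keys ("recency", "frequency", "scan", "size-aware") and the function name `select_policies` are invented for illustration; only the policy names come from the lists above.

```python
# Policies to include in every comparison (from the minimum list above).
BASELINE = [
    "product.Caffeine",
    "opt.Clairvoyant",
    "opt.Unbounded",
    "linked.Lru",
    "sketch.WindowTinyLfu",
    "sketch.HillClimberWindowTinyLfu",
]

# Extra competitors keyed by a hypothetical trait label.
COMPETITORS = {
    "recency": ["linked.S4Lru", "adaptive.Arc"],
    "frequency": ["linked.Lfu", "irr.Lirs"],
    "scan": ["two-queue.TwoQueue", "two-queue.S3Fifo"],
    "size-aware": ["greedy-dual.Gdsf", "greedy-dual.Camp"],
}


def select_policies(traits):
    """Return the baseline plus competitors for each trait, without duplicates."""
    selected = list(BASELINE)
    for trait in traits:
        if trait not in COMPETITORS:
            raise ValueError(f"unknown trace trait: {trait}")
        for policy in COMPETITORS[trait]:
            category = policy.split(".", 1)[0]
            # Config categories are hyphenated, never underscored.
            if "_" in category:
                raise ValueError(f"category must use hyphens: {policy}")
            if policy not in selected:
                selected.append(policy)
    return selected
```

Calling `select_policies(["scan", "recency"])` returns the six baseline policies followed by the two-queue and recency competitors, in that order.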
Choose cache sizes. Use a geometric progression covering the working set:
- Start small (e.g., 100), end near working set size
- 5-8 sizes: e.g., 100,500,1_000,2_500,5_000,10_000,25_000
- If the trace has few distinct keys, reduce the range
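A geometric progression between two bounds can be computed rather than guessed. The sketch below assumes the working set size is already known (for example, from counting distinct keys); `geometric_sizes` is a hypothetical helper, and its output is not rounded to "friendly" values the way the hand-picked example list is.

```python
def geometric_sizes(smallest, largest, count):
    """Return up to `count` increasing cache sizes from smallest to largest.

    Consecutive sizes grow by a constant ratio; values that round to the
    same integer are dropped, so short ranges yield fewer sizes.
    """
    if count < 2 or smallest < 1 or largest <= smallest:
        raise ValueError("need count >= 2 and 1 <= smallest < largest")
    ratio = (largest / smallest) ** (1 / (count - 1))
    sizes = []
    for i in range(count):
        # Pin the final value to avoid floating-point drift.
        size = largest if i == count - 1 else round(smallest * ratio ** i)
        if not sizes or size > sizes[-1]:
            sizes.append(size)
    return sizes


# Join for the --maximumSize option, e.g. seven sizes from 100 to 25,000.
maximum_size_arg = ",".join(str(s) for s in geometric_sizes(100, 25_000, 7))
```

When the trace has few distinct keys, lowering `largest` naturally collapses duplicates, which matches the advice to reduce the range.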

Run the simulation. Use the Gradle task:

./gradlew simulator:simulate -q \
  --maximumSize=100,500,1000,2500,5000,10000 \
  --metric="Hit Rate" \
  --title="Description" \
  --theme=light \
  --outputDir=build/reports/sim

Override the trace and policies via system properties appended to the command:

-Dcaffeine.simulator.files.paths.0="format:path/to/trace"
-Dcaffeine.simulator.policies.0=product.Caffeine
-Dcaffeine.simulator.policies.1=opt.Clairvoyant
# ... etc
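Numbering the policy properties by hand is error-prone, so the full command line can be assembled programmatically. The sketch below uses only the task name, options, and property keys shown above; the function name `simulate_command` and its parameters are assumptions for illustration. It builds the argument list but does not run Gradle.

```python
import shlex


def simulate_command(trace_spec, policies, sizes, title,
                     output_dir="build/reports/sim"):
    """Build the argv for the simulate task with trace and policy overrides."""
    argv = [
        "./gradlew", "simulator:simulate", "-q",
        "--maximumSize=" + ",".join(str(size) for size in sizes),
        "--metric=Hit Rate",
        f"--title={title}",
        "--theme=light",
        f"--outputDir={output_dir}",
        f"-Dcaffeine.simulator.files.paths.0={trace_spec}",
    ]
    # One indexed system property per policy, starting at 0.
    argv += [
        f"-Dcaffeine.simulator.policies.{index}={policy}"
        for index, policy in enumerate(policies)
    ]
    return argv


command = simulate_command(
    "lirs:trace.gz",
    ["product.Caffeine", "opt.Clairvoyant", "linked.Lru"],
    [100, 500, 1000],
    "Description",
)
# shlex.join quotes arguments containing spaces, such as --metric=Hit Rate.
print(shlex.join(command))
```

Keeping the command as a list means it can be passed to `subprocess.run` from the repository root without any shell quoting concerns.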

Note: for single-size runs, use ./gradlew simulator:run -q with -Dcaffeine.simulator.maximum-size=N instead of simulator:simulate.

Read and interpret results. The simulate task produces:
- Individual CSV per cache size
- Combined CSV (policies as rows, sizes as columns)
- PNG chart (line graph of metric vs. cache size)

Read the CSV output files in the output directory:
- Compare hit rates across policies at each cache size
- Identify the crossover points where one policy overtakes another
- Note the gap between Caffeine and Clairvoyant (theoretical ceiling)
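The comparisons above can be sketched against the combined CSV. The exact file layout is an assumption here: a header row whose first cell labels the policy column and whose remaining cells are cache sizes, followed by one row per policy. The sample values are invented purely to exercise the code and are not simulation results.

```python
import csv
import io


def load_hit_rates(text):
    """Parse a combined CSV (policies as rows, sizes as columns)."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    sizes = [int(cell) for cell in rows[0][1:]]
    rates = {row[0]: [float(cell) for cell in row[1:]] for row in rows[1:]}
    return sizes, rates


def crossovers(sizes, first, second):
    """Return the sizes at which the leader between two policies changes."""
    points = []
    previous = None
    for size, a, b in zip(sizes, first, second):
        sign = (a > b) - (a < b)  # 1, 0, or -1; ties do not change the leader
        if sign != 0:
            if previous is not None and sign != previous:
                points.append(size)
            previous = sign
    return points


def gap(reference, candidate):
    """Return reference minus candidate at each size, in percentage points."""
    return [round(r - c, 2) for r, c in zip(reference, candidate)]


# Made-up illustrative data, assuming the layout described above.
SAMPLE = """Policy,100,500,1000
opt.Clairvoyant,30.0,50.0,60.0
product.Caffeine,20.0,42.0,55.0
linked.Lru,22.0,38.0,50.0
"""

sizes, rates = load_hit_rates(SAMPLE)
print(crossovers(sizes, rates["product.Caffeine"], rates["linked.Lru"]))
print(gap(rates["opt.Clairvoyant"], rates["product.Caffeine"]))
```

If the real CSV labels rows with the admission filter as well as the policy name, the dictionary keys will differ and the lookups need adjusting accordingly.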
Explain findings. For each notable result:
- WHY does policy X beat policy Y on this trace?
- What trace characteristic drives the difference? (frequency bias, recency bias, scan patterns, temporal shifts)
- How close is Caffeine to optimal? Where does it lose?
- Reference the relevant research paper if applicable:
  - TinyLFU paper for admission filter behavior
  - Adaptive paper for hill climber effectiveness
  - See .claude/docs/research-foundations.md for paper-to-code mapping
Report. Present:
- Summary table of hit rates at each cache size
- Key takeaways (2-3 sentences)
- Notable policy behaviors
- Path to generated chart PNG


name	sim-compare
description	Run cache policy hit rate comparison across multiple cache sizes with charts
argument-hint	<trace-file> [policies...] [sizes...]
context	fork
disable-model-invocation	true
allowed-tools	Read, Grep, Glob, Bash

Run a comprehensive cache policy comparison for the given trace.
