| name | rate-awareness |
| description | Token-bucket advisory rate limiter per skill+session. Surfaces runaway tool-call loops in real time before pech/budget-watcher reports them post-hoc. Auto-fires on PreToolUse. Configurable per-skill limits via state/buckets.json. Do not use for cost reporting (see budget-watcher) or for blocking installs (see package-gate).
|
| model | haiku |
| tools | ["Read","Bash"] |
rate-awareness
Token-bucket advisory limiter. Each tool call consumes one token from a per
(session, skill) bucket. When the bucket is empty, an advisory is emitted to
stderr — the tool still proceeds (advisory-only contract per
shared/foundations/conduct/hooks.md).
Preconditions
state/buckets.json exists (opt-in gate — absence = silent no-op).
- The PreToolUse hook is registered via
hooks/hooks.json.
Inputs
- Hook event:
PreToolUse on every tool (matcher .*).
- Bucket config:
state/buckets.json — global_default plus optional
per_skill overrides.
- Session id:
$CLAUDE_SESSION_ID if set, else SHA-1 hash of
cwd:ppid truncated to 16 chars.
- Skill id: best-effort — most recent
skills/<name>/ ancestor of cwd,
else ungated.
Steps
- Hook (
hooks/pretooluse-rate.sh) fires; if state/buckets.json missing,
exits 0 silently.
scripts/rate-check.py opens state/buckets.json r+, takes an
exclusive lock (msvcrt on Windows, fcntl elsewhere).
- Reads state. Resolves capacity + refill rate via per-skill override or
global_default.
- Refills the (session, skill) bucket:
tokens = min(capacity, tokens + dt * refill).
- If
tokens >= 1 → consume 1, no advisory. Else → record empty, set
advisory = true.
- Write-back atomically (truncate +
fsync) under the same lock.
- If advisory: emit stderr message; always exit 0.
Success criterion: every PreToolUse decrements (or refills) exactly one
bucket entry; the 61st call within a 1-second window for a bucket of
capacity 60 emits the advisory; the tool call proceeds regardless.
Outputs
- Updated
state/buckets.json (mutated _buckets map keyed by
<session>::<skill>).
- Stderr advisory line on bucket-empty (visible to the developer; not
injected into the model context).
Configuration
state/buckets.json schema:
{
"global_default": {"capacity": 60, "refill_per_sec": 1.0},
"per_skill": {
"deep-research": {"capacity": 200, "refill_per_sec": 5.0}
}
}
Add per-skill entries as needed. Delete the file entirely to disable the
plugin (opt-in by file presence).
Handoff
Downstream: pech/budget-watcher still owns post-hoc cost ceilings. This
plugin only surfaces velocity anomalies before the spend lands.
Failure modes
| Code | Scenario | Counter |
|---|
| F09 | Two PreToolUse hooks writing buckets.json concurrently | Exclusive flock/msvcrt lock around the read-modify-write |
| F05 | Advisory spam on every call once bucket empties | Tokens stay at 0 until refill catches up; one stderr line per emptied call is the signal, not a bug |
| F13 | Skill mis-identified as ungated when cwd is outside any skills/ tree | Default ungated bucket uses global_default; per-skill overrides only fire when detection succeeds |
| F08 | Hook calling the script with wrong python | python3 per pech convention; matches budget-watcher hooks |