fleet-scorecard
// Daily fleet-wide scorecard across this instance and every managed instance in memory/instances.json — runs, tokens (OpenRouter shape), est. cost, skills, and reliability, with day-over-day deltas and alerts
| name | Fleet Scorecard |
| description | Daily fleet-wide scorecard across this instance and every managed instance in memory/instances.json — runs, tokens (OpenRouter shape), est. cost, skills, and reliability, with day-over-day deltas and alerts |
| schedule | 0 13 * * * |
| tags | ["meta","fleet","report","cost"] |
Today is ${today}. Publish the daily fleet scorecard to memory/scorecard.md and append a trend row to memory/scorecard-history.csv.
The fleet is discovered at runtime, never hardcoded: it is this repo ("self") plus every non-archived entry in memory/instances.json (the registry fleet-control and spawn-instance maintain). With zero managed instances the scorecard simply covers the single self repo — still useful.
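The discovery rule above could be sketched as follows. This is only an illustration: the real resolution happens in scripts/prefetch-fleet-scorecard.sh, and the registry schema is not shown in this document, so the `repo` and `archived` field names, the top-level list shape, and the `resolve_fleet` helper are all assumptions.

```python
import json
from pathlib import Path


def resolve_fleet(self_repo: str, registry_path: str = "memory/instances.json") -> list[str]:
    """Return the self repo plus every non-archived managed instance."""
    fleet = [self_repo]
    path = Path(registry_path)
    if not path.exists():
        # Zero managed instances: the scorecard covers the self repo only.
        return fleet
    # ASSUMPTION: the registry is a JSON list of objects.
    entries = json.loads(path.read_text() or "[]")
    for entry in entries:
        # ASSUMPTION: "archived" and "repo" are hypothetical field names.
        if entry.get("archived"):
            continue
        repo = entry.get("repo")
        if repo and repo not in fleet:
            fleet.append(repo)
    return fleet
```

Nothing is hardcoded here: adding or archiving an entry in the registry changes the fleet on the next run.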
All data has already been gathered by scripts/prefetch-fleet-scorecard.sh (which ran outside the sandbox with network/gh access). You do not need network or gh — work only from the prefetched files below.
If soul/SOUL.md and soul/STYLE.md exist and are populated, read them and match the operator's voice in the notification (step 6). If they are empty templates or absent, use a clear, direct, neutral tone — terse, lowercase, no fluff.
- /tmp/fleet-scorecard/scorecard-body.md: the computed markdown tables (Fleet totals, Per-repo, Top skills by cost, Least reliable skills). These numbers are authoritative; do not recompute or alter them.
- /tmp/fleet-scorecard/metrics.json: today's key totals (total_runs, total_failures, generations, prompt_tokens, cached_tokens, completion_tokens, total_tokens, est_cost_usd, cache_discount_usd).

If /tmp/fleet-scorecard/scorecard-body.md is missing or empty, the prefetch failed or resolved an empty fleet. In that case, write a one-line note to /tmp/skill-result.txt saying so and stop (do not overwrite the existing scorecard, do not notify).
Read:

- /tmp/fleet-scorecard/metrics.json (today).
- memory/scorecard-history.csv, if it exists (the previous run's metrics), to compute deltas. If the file doesn't exist yet, this is the first run and every delta is "—".

For total_runs, total_failures, generations, total_tokens, est_cost_usd, and cache_discount_usd, compute today − previous. Format each as signed (e.g. +312 runs, +$148, +5 failures). These are cumulative all-time figures, so the deltas show the last ~24h of activity.
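A minimal sketch of the delta step, assuming the history file is a CSV whose columns use the same names as the keys in metrics.json. The helper names (`last_history_row`, `format_delta`, `compute_deltas`) and the whole-number rounding are illustrative choices, not part of the skill.

```python
import csv
from pathlib import Path

DELTA_KEYS = ["total_runs", "total_failures", "generations",
              "total_tokens", "est_cost_usd", "cache_discount_usd"]
MONEY_KEYS = {"est_cost_usd", "cache_discount_usd"}
NO_DELTA = "\u2014"  # the dash placeholder used when there is no previous run


def last_history_row(path):
    """Return the most recent history row as a dict, or None on the first run."""
    p = Path(path)
    if not p.exists():
        return None
    with p.open(newline="") as fh:
        rows = list(csv.DictReader(fh))
    return rows[-1] if rows else None


def format_delta(key, today, previous):
    """Format today - previous as a signed string, e.g. +312 or +$148."""
    if previous is None:
        return NO_DELTA
    diff = float(today) - float(previous)
    sign = "-" if diff < 0 else "+"
    prefix = "$" if key in MONEY_KEYS else ""
    return f"{sign}{prefix}{abs(diff):,.0f}"


def compute_deltas(metrics, previous_row):
    """Build the signed delta for every tracked metric."""
    return {
        key: format_delta(key, metrics[key], previous_row[key] if previous_row else None)
        for key in DELTA_KEYS
    }
```

Because the figures are cumulative, a plain subtraction against the last row is enough; no per-day bucketing is needed.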
Scan the computed tables in scorecard-body.md and flag:
- est_cost_usd delta > 1.5× the median daily delta from history (if ≥7 history rows exist); with fewer rows, just note the day's cost increase.
- total_failures rose by more than 10 since yesterday.

If nothing is flagged, write: ✅ No anomalies — fleet healthy.

Write memory/scorecard.md. Structure (overwrite the file):
# 🛰️ Aeon Fleet Scorecard — as of ${today}
_Auto-generated daily by skills/fleet-scorecard. Tokens reported OpenRouter-style (cached_tokens ⊆ prompt_tokens)._
## Since last update (~24h)
| Metric | Δ |
|---|---:|
| Runs | <signed> |
| Failures | <signed> |
| Generations | <signed> |
| Total tokens | <signed, humanized> |
| Est. cost | <signed $> |
| Cache discount | <signed $> |
## Alerts
<the alerts block from step 3>
<PASTE the full contents of /tmp/fleet-scorecard/scorecard-body.md verbatim here>
---
_Sources: GitHub Actions run history + each repo's `memory/token-usage.csv`. Fleet resolved from memory/instances.json + self. Cost = Anthropic list price (estimate)._
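The rules that feed the Alerts section could be sketched as follows. The function names and message wording are hypothetical. The sketch also assumes that, with fewer than 7 history rows, the cost increase is written as an informational line, and that the healthy line appears only when the block would otherwise be empty.

```python
from statistics import median

HEALTHY = "\u2705 No anomalies \u2014 fleet healthy."


def daily_deltas(cumulative):
    """Turn cumulative values (oldest first) into day-over-day differences."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]


def build_alerts(cost_history, today_cost, prev_failures, today_failures):
    """cost_history: cumulative est_cost_usd from the history rows, oldest first."""
    lines = []
    if cost_history:
        cost_delta = today_cost - cost_history[-1]
        if len(cost_history) >= 7:
            # Enough history: compare today's delta with the median daily delta.
            baseline = median(daily_deltas(cost_history))
            if cost_delta > 1.5 * baseline:
                lines.append(
                    f"cost spike: +${cost_delta:,.2f} vs median daily +${baseline:,.2f}"
                )
        elif cost_delta > 0:
            # Too little history for a baseline: just note the increase.
            lines.append(f"cost up ${cost_delta:,.2f} since yesterday")
    if prev_failures is not None and today_failures - prev_failures > 10:
        lines.append(
            f"failures rose by {today_failures - prev_failures} since yesterday"
        )
    return lines or [HEALTHY]
```

The median is used rather than the mean so that one earlier spike does not raise the baseline for every later day.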
Append one line to memory/scorecard-history.csv (create with a header if it doesn't exist):
date,total_runs,total_failures,generations,prompt_tokens,cached_tokens,completion_tokens,total_tokens,est_cost_usd,cache_discount_usd
Use ${today} for the date and the values straight from metrics.json. Append, never rewrite prior rows.
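The append-only rule could be sketched like this. The column order comes from the header above; the `append_history` helper and its arguments are illustrative assumptions.

```python
import csv
from pathlib import Path

HEADER = ["date", "total_runs", "total_failures", "generations",
          "prompt_tokens", "cached_tokens", "completion_tokens",
          "total_tokens", "est_cost_usd", "cache_discount_usd"]


def append_history(path, today, metrics):
    """Append one trend row, writing the header first if the file is new."""
    p = Path(path)
    is_new = not p.exists() or p.stat().st_size == 0
    p.parent.mkdir(parents=True, exist_ok=True)
    # Mode "a" only ever adds to the end, so prior rows are never rewritten.
    with p.open("a", newline="") as fh:
        writer = csv.writer(fh, lineterminator="\n")
        if is_new:
            writer.writerow(HEADER)
        # Values are taken straight from metrics.json, in header order.
        writer.writerow([today] + [metrics[key] for key in HEADER[1:]])
```

Opening in append mode, rather than reading and rewriting the file, keeps earlier rows byte-for-byte intact even if a run is interrupted.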
Write a terse daily pulse to /tmp/scorecard-notify.md and send it with ./notify -f /tmp/scorecard-notify.md. One short paragraph — today's totals (runs, est. cost, total tokens), the headline deltas, and any alert. Example shape: "fleet at 12.5k runs, ~$7.8k notional. +312 runs / +$148 since yesterday. cost-report still failing (88% fail). caching saved ~$43k." Also copy this text to /tmp/skill-result.txt so the framework captures it.
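The compact figures in the example shape (12.5k, ~$7.8k, ~$43k) could come from a small formatter like the one below. The `humanize` name, the suffix set, and the one-decimal rounding are assumptions; the skill does not prescribe a format.

```python
def humanize(value):
    """Render a number compactly, e.g. 12500 -> '12.5k', 43000 -> '43k'."""
    for limit, suffix in ((1e9, "B"), (1e6, "M"), (1e3, "k")):
        if abs(value) >= limit:
            # One decimal place, with a trailing ".0" trimmed off.
            text = f"{value / limit:.1f}".rstrip("0").rstrip(".")
            return f"{text}{suffix}"
    return f"{value:.0f}"
```

For example, `humanize(12500)` gives `12.5k`, which can then be prefixed with a sign or a dollar symbol as the pulse requires.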
Append a one-line entry to memory/logs/${today}.md noting the scorecard ran and the headline numbers (so future skills like self-review/reflect see it).
This skill needs no network inside the sandbox — all gh/API work happens in scripts/prefetch-fleet-scorecard.sh, which runs in the workflow's prefetch phase with gh auth. If the prefetch's cross-repo reads fail for a managed instance, it's almost always the GitHub token scope (the token needs read access to that instance's repo; self is always readable). The prefetch degrades gracefully — a repo it can't read is simply absent from the tables rather than crashing the run.
The skill itself needs no environment variables. scripts/prefetch-fleet-scorecard.sh uses GH_TOKEN/GITHUB_TOKEN (provided by the workflow) and reads GITHUB_REPOSITORY to resolve "self".
End with a ## Summary listing the files written (memory/scorecard.md, memory/scorecard-history.csv, the log entry) and any alerts raised.