---
name: anticipation-gap-audit
description: Scores any agent definition or skill manifest against the four consumer-AI breakthrough problems (context, reliability, permission, judgment) plus a reactive-vs-anticipatory axis. Outputs a heatmap and ranked vulnerability list.
---
# Anticipation Gap Audit
Evaluates an agent or skill manifest against the five-axis framework that separates reactive AI products from genuinely anticipatory ones. Produces a scored heatmap and a ranked list of where the subject is purely reactive vs where it could legitimately initiate without prompting.
## Trigger
Use when the user says "anticipation audit", "score this agent", "how anticipatory is this", "gap audit", "reactive vs anticipatory check", "audit this skill for anticipation", or provides an agent definition and asks whether it could proactively help users.
## Phase 1: Intake
Accept the subject. This can be:
- An agent definition file (agent.yaml, agent.config.json, AGENT.md)
- A skill manifest (SKILL.md)
- A plain-text description of an agent or product
- A product landing page pasted as text
Ask clarifying questions only if the subject is ambiguous. Otherwise proceed immediately.
## Phase 2: Score Each Axis
Score the subject on all five axes. For each axis, assign:
- GREEN (2) — fully addressed; agent handles this without human prompting
- YELLOW (1) — partially addressed; works but degrades or requires human scaffolding
- RED (0) — not addressed; agent is purely reactive on this dimension
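The three-level scale above can be sketched as a small scoring helper. This is illustrative only: the names `AxisScore` and `audit_total` are invented for this sketch, not part of any real tooling.

```python
from enum import IntEnum

class AxisScore(IntEnum):
    """Three-level scale from the rubric above: RED=0, YELLOW=1, GREEN=2."""
    RED = 0
    YELLOW = 1
    GREEN = 2

def audit_total(scores: dict[str, AxisScore]) -> int:
    """Sum the five axis scores into the 0-10 total used by the verdict."""
    assert len(scores) == 5, "the audit covers exactly five axes"
    return sum(scores.values())

# Example audit of a hypothetical subject.
scores = {
    "context": AxisScore.YELLOW,
    "reliability": AxisScore.GREEN,
    "permission": AxisScore.RED,
    "judgment": AxisScore.YELLOW,
    "anticipatory": AxisScore.RED,
}
```

Using `IntEnum` keeps the color labels readable in the heatmap while still summing as integers.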
### Axis 1: Context
Does the agent maintain persistent, queryable awareness of user state across sessions and surfaces?
Indicators:
- Has access to memory, prior conversation state, or persistent storage
- Can recall what the user last did without being re-briefed
- Context survives session resets and context-window limits
- Handles multi-device or multi-agent state (what one agent did is visible to another)
### Axis 2: Reliability
Does the agent behave deterministically enough to be trusted without supervision?
Indicators:
- Same inputs produce consistent outputs across N runs
- Gracefully handles unexpected inputs without silent failure
- Has a fallback or abort path when confidence is low
- Error messages are actionable, not generic
- Has been validated by a test suite or regression check (see agent-reliability-calculator)
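The "same inputs, consistent outputs across N runs" indicator can be checked mechanically. The sketch below is one possible approach, assuming the agent can be invoked as a plain callable; `consistency_rate` is a hypothetical helper, not a real library function.

```python
import hashlib
from collections import Counter

def consistency_rate(run_agent, prompt: str, n: int = 5) -> float:
    """Fraction of n runs that produced the modal (most common) output
    for the same input. 1.0 means fully consistent across runs."""
    digests = [hashlib.sha256(run_agent(prompt).encode()).hexdigest()
               for _ in range(n)]
    _, modal_count = Counter(digests).most_common(1)[0]
    return modal_count / n

# A deterministic stub scores 1.0; a flaky agent would score lower.
stable = lambda p: p.upper()
```

Hashing outputs avoids holding large responses in memory and makes the comparison exact; a fuzzier audit might compare normalized or semantically-equivalent outputs instead.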
### Axis 3: Permission
Does the agent have bounded authority to act on behalf of the user without per-action approval?
Indicators:
- Has explicit permission scopes (e.g., "auto-act under threshold X, ask above it")
- Permission boundaries are documented and machine-readable
- The agent does not request permissions it doesn't need
- User can audit what the agent has done post-hoc
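A machine-readable permission boundary of the "auto-act under threshold X, ask above it" kind might look like the following sketch. The schema, field names, and thresholds are all invented for illustration.

```python
# Hypothetical machine-readable permission scope: the agent may act on its own
# while every attribute of an action stays under the auto-act thresholds.
PERMISSION_SCOPE = {
    "scope": "calendar.reschedule",
    "auto_act_under": {"cost_usd": 5.0, "attendees": 3},
    "audit_log": "append-only",   # actions remain reviewable post-hoc
}

def may_auto_act(scope: dict, action: dict) -> bool:
    """True only while every action attribute stays under its threshold;
    anything above falls back to asking the user."""
    limits = scope["auto_act_under"]
    return all(action.get(key, 0) <= limit for key, limit in limits.items())
```

Keeping the scope as plain data (rather than code) is what makes the boundary auditable and machine-readable, per the indicators above.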
### Axis 4: Judgment
Does the agent know when to act vs ask vs queue vs abort?
Indicators:
- Classifies actions by reversibility before executing
- Has a cost-boundary check before spending user resources
- Detects when the user is observing and adjusts behavior accordingly
- Has an explicit "ask-first" path for high-stakes or ambiguous actions
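The act / ask / queue / abort split can be sketched as a decision function over a few of the indicators above (reversibility, cost boundary, confidence). The thresholds and field names are illustrative, and a real implementation would also cover the "user is observing" signal.

```python
def decide(action: dict) -> str:
    """Classify an action as act / ask / queue / abort.
    Illustrative thresholds only."""
    if action.get("confidence", 1.0) < 0.5:
        return "abort"                     # low confidence: fall back, don't guess
    if not action.get("reversible", True) or action.get("high_stakes", False):
        return "ask"                       # explicit ask-first path
    if action.get("cost_usd", 0.0) > action.get("budget_usd", 0.0):
        return "queue"                     # over the cost boundary: defer for approval
    return "act"
```

The ordering matters: confidence gates everything, irreversibility beats cost, and only cheap, reversible, confident actions execute unattended.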
### Axis 5: Reactive vs. Anticipatory
Does the agent initiate at the right moment without being prompted, or only when explicitly invoked?
Indicators:
- Has trigger conditions (schedule, event, threshold) beyond "user typed a command"
- Monitors state and fires when conditions are met
- Produces outputs before the user notices a problem, not after
- Has been deployed in a scheduled or event-driven mode (not only interactive)
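Trigger conditions "beyond user typed a command" reduce to a small predicate over schedule, event, and threshold triggers. This is a minimal sketch under an assumed trigger/state shape; none of these field names come from a real framework.

```python
import time

def should_fire(trigger: dict, state: dict) -> bool:
    """True when a non-interactive trigger condition is met."""
    kind = trigger.get("kind")
    if kind == "threshold":
        return state.get(trigger["metric"], 0) >= trigger["at"]
    if kind == "schedule":
        return time.time() >= trigger["next_run"]
    if kind == "event":
        return trigger["event"] in state.get("events", [])
    return False  # "user typed a command" goes through the interactive path
```

An agent that only ever reaches this function via an interactive invocation is, by definition, RED on this axis.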
## Phase 3: Heatmap Output
## Anticipation Gap Audit — [Subject Name]
| Axis | Score | Status | Gap Summary |
|--------------|-------|---------|--------------------------------------|
| Context | X/2 | COLOR | [one line] |
| Reliability | X/2 | COLOR | [one line] |
| Permission | X/2 | COLOR | [one line] |
| Judgment | X/2 | COLOR | [one line] |
| Anticipatory | X/2 | COLOR | [one line] |
| **Total** | XX/10 | | |
### Verdict
[One of:]
- ANTICIPATORY READY (8-10): Agent can initiate without prompting. Gaps are minor.
- SCAFFOLDED (5-7): Agent works well reactively. Anticipation requires human setup or cron scaffolding.
- REACTIVE ONLY (3-4): Agent is another inbox. Core problems unsolved.
- NOT DEPLOYABLE (0-2): Critical gaps in reliability or permission block safe autonomous use.
### Ranked Vulnerabilities
[List RED axes first, then YELLOW. For each:]
1. [Axis name] — [specific gap] — [what would fix it]
### Upgrade Path
[Concrete next steps ordered by effort-to-impact:]
1. [Highest-leverage fix — usually the lowest-scoring axis]
2. [Second fix]
3. [Third fix]
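The verdict bands above map deterministically from the numeric total, which keeps the "verdict matches the numeric total without softening" check trivial to enforce. A minimal sketch (the function name is hypothetical):

```python
def verdict(total: int) -> str:
    """Map the 0-10 audit total to its verdict band, with no softening."""
    if not 0 <= total <= 10:
        raise ValueError("total must be 0-10 (five axes, 0-2 each)")
    if total >= 8:
        return "ANTICIPATORY READY"
    if total >= 5:
        return "SCAFFOLDED"
    if total >= 3:
        return "REACTIVE ONLY"
    return "NOT DEPLOYABLE"
```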
## Phase 4: Follow-Up Options
Offer:
- "Want me to draft the permission scope DSL for the Permission axis?"
- "Want me to run failure-asymmetry to find the human-vs-agent invocation gap?"
- "Want me to generate a test suite for the Reliability axis?"
## Verification
A good audit:
- Has at least one RED or YELLOW finding (a perfect 10/10 means the subject wasn't scrutinized)
- Gap summaries name specific missing components, not vague categories
- Upgrade path is ordered by effort-to-impact, not alphabetically
- Verdict matches the numeric total without softening
## Source
Extracted from Nate Kadlac newsletter (2026-05-05) — "The Anticipation Gap: Why 4 Problems Have to Be Solved Together for Consumer AI to Work" — four-problem framework for consumer AI breakthrough, extended to a fifth axis.