name	autocontext
description	Iterative strategy generation and evaluation system. Use when the user wants to evaluate agent output quality, run improvement loops, queue tasks for background evaluation, check run status, inspect runtime artifacts and session branch lineage, or discover available scenarios. Provides LLM-based judging with rubric-driven scoring.
allowed-tools	autocontext_judge autocontext_improve autocontext_status autocontext_scenarios autocontext_queue autocontext_runtime_snapshot

autocontext

autocontext is an iterative strategy generation and evaluation system that uses LLM-based judging to score and improve agent outputs.

Available Tools

autocontext_judge — Evaluate agent output against a rubric. Returns a 0–1 score with reasoning and per-dimension breakdowns.
autocontext_improve — Run a multi-round improvement loop. The agent output is judged, revised based on feedback, and re-evaluated until the quality threshold is met or max rounds are exhausted.
autocontext_queue — Enqueue a task for background evaluation by the task runner daemon.
autocontext_status — Check the status of runs and queued tasks.
autocontext_scenarios — List available evaluation scenarios and their families.
autocontext_runtime_snapshot — Inspect run artifacts, package provenance, branchable session lineage, and recent event-stream entries.

Quick Start

1. Evaluate output quality

Use autocontext_judge with a task prompt, the agent's output, and a rubric:

autocontext_judge(
  task_prompt="Write a Python function to parse CSV files",
  agent_output="def parse_csv(path): ...",
  rubric="Correctness, error handling, edge cases, documentation"
)

2. Improve output iteratively

Use autocontext_improve to automatically revise output through judge-guided feedback loops:

autocontext_improve(
  task_prompt="Write a Python function to parse CSV files",
  initial_output="def parse_csv(path): ...",
  rubric="Correctness, error handling, edge cases, documentation",
  max_rounds=5,
  quality_threshold=0.85
)

3. Queue background tasks

Use autocontext_queue with a scenario name to enqueue evaluation tasks for asynchronous processing:

autocontext_queue(spec_name="my_scenario")

Check results later with autocontext_status.

For deeper context, use autocontext_runtime_snapshot with the run ID. Add session_id when you need the active branch path before continuing work:

autocontext_runtime_snapshot(run_id="run_123", session_id="sess_123")

4. Discover scenarios

Use autocontext_scenarios to see what evaluation scenarios are available:

autocontext_scenarios()
autocontext_scenarios(family="agent_task")

Configuration

The extension auto-detects configuration from these sources:

Project config — .autoctx.json in the working directory (created via autoctx init)
Environment variables:
- AUTOCONTEXT_AGENT_PROVIDER or AUTOCONTEXT_PROVIDER — Provider type
- AUTOCONTEXT_AGENT_API_KEY or AUTOCONTEXT_API_KEY — Provider API key
- AUTOCONTEXT_AGENT_DEFAULT_MODEL or AUTOCONTEXT_MODEL — Model override
- AUTOCONTEXT_DB_PATH — SQLite database path override
Pi provider — Falls back to Pi's configured LLM provider

CLI Companion

For standalone usage outside Pi, install the autoctx CLI:

npm install -g autoctx
autoctx init
autoctx solve --description "your problem" --gens 5
autoctx simulate --description "your simulation" --runs 3

name	autocontext
description	Iterative strategy generation and evaluation system. Use when the user wants to evaluate agent output quality, run improvement loops, queue tasks for background evaluation, check run status, inspect runtime artifacts and session branch lineage, or discover available scenarios. Provides LLM-based judging with rubric-driven scoring.
allowed-tools	autocontext_judge autocontext_improve autocontext_status autocontext_scenarios autocontext_queue autocontext_runtime_snapshot

autocontext

autocontext is an iterative strategy generation and evaluation system that uses LLM-based judging to score and improve agent outputs.

Available Tools

autocontext_judge — Evaluate agent output against a rubric. Returns a 0–1 score with reasoning and per-dimension breakdowns.
autocontext_improve — Run a multi-round improvement loop. The agent output is judged, revised based on feedback, and re-evaluated until the quality threshold is met or max rounds are exhausted.
autocontext_queue — Enqueue a task for background evaluation by the task runner daemon.
autocontext_status — Check the status of runs and queued tasks.
autocontext_scenarios — List available evaluation scenarios and their families.
autocontext_runtime_snapshot — Inspect run artifacts, package provenance, branchable session lineage, and recent event-stream entries.

Quick Start

1. Evaluate output quality

Use autocontext_judge with a task prompt, the agent's output, and a rubric:

autocontext_judge(
  task_prompt="Write a Python function to parse CSV files",
  agent_output="def parse_csv(path): ...",
  rubric="Correctness, error handling, edge cases, documentation"
)

2. Improve output iteratively

Use autocontext_improve to automatically revise output through judge-guided feedback loops:

autocontext_improve(
  task_prompt="Write a Python function to parse CSV files",
  initial_output="def parse_csv(path): ...",
  rubric="Correctness, error handling, edge cases, documentation",
  max_rounds=5,
  quality_threshold=0.85
)

3. Queue background tasks

Use autocontext_queue with a scenario name to enqueue evaluation tasks for asynchronous processing:

autocontext_queue(spec_name="my_scenario")

Check results later with autocontext_status.

For deeper context, use autocontext_runtime_snapshot with the run ID. Add session_id when you need the active branch path before continuing work:

autocontext_runtime_snapshot(run_id="run_123", session_id="sess_123")

4. Discover scenarios

Use autocontext_scenarios to see what evaluation scenarios are available:

autocontext_scenarios()
autocontext_scenarios(family="agent_task")

Configuration

The extension auto-detects configuration from these sources:

Project config — .autoctx.json in the working directory (created via autoctx init)
Environment variables:
- AUTOCONTEXT_AGENT_PROVIDER or AUTOCONTEXT_PROVIDER — Provider type
- AUTOCONTEXT_AGENT_API_KEY or AUTOCONTEXT_API_KEY — Provider API key
- AUTOCONTEXT_AGENT_DEFAULT_MODEL or AUTOCONTEXT_MODEL — Model override
- AUTOCONTEXT_DB_PATH — SQLite database path override
Pi provider — Falls back to Pi's configured LLM provider

CLI Companion

For standalone usage outside Pi, install the autoctx CLI:

npm install -g autoctx
autoctx init
autoctx solve --description "your problem" --gens 5
autoctx simulate --description "your simulation" --runs 3

autocontext

autocontext

Available Tools

Quick Start

1. Evaluate output quality

2. Improve output iteratively

3. Queue background tasks

4. Discover scenarios

Configuration

CLI Companion

More from this repository

More from this repository

autocontext

Available Tools

Quick Start

1. Evaluate output quality

2. Improve output iteratively

3. Queue background tasks

4. Discover scenarios

Configuration

CLI Companion