| name | agent-proxy |
| description | Use when adding normalized progress stages for long-running agents, LLM calls, hosted search, tool loops, reasoning streams, provider queues, retries, cancellation, and SSE operation UX. |
Agent Proxy
Use this skill to make long-running agent and LLM work observable without pretending to know more than the provider reveals.
Standard
Every foreground long-running operation should expose a plain operation object:
type OperationView = {
  id: string
  kind: string
  status: "running" | "done" | "error" | "cancelled" | "timeout"
  startedAt: number
  estimateMs: number
  cancellable: boolean
  stages: OperationStage[]
}
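The operation object refers to OperationStage without defining it. The sketch below assumes a minimal stage shape (key, label, at) and adds two hypothetical helpers, startOperation and progressOf, to show how the object could be created and read for rendering. None of these names come from a library; treat them as placeholders.

```typescript
// Hypothetical stage shape: the text names OperationStage but does not define it.
type OperationStage = {
  key: string   // structural identifier such as "connected"
  label: string // sanitized, user-facing text
  at: number    // epoch milliseconds when the stage was observed
}

type OperationView = {
  id: string
  kind: string
  status: "running" | "done" | "error" | "cancelled" | "timeout"
  startedAt: number
  estimateMs: number
  cancellable: boolean
  stages: OperationStage[]
}

// Hypothetical helper: build a fresh running operation.
function startOperation(id: string, kind: string, estimateMs: number, now: number = Date.now()): OperationView {
  return { id, kind, status: "running", startedAt: now, estimateMs, cancellable: true, stages: [] }
}

// Elapsed time plus a rough progress fraction for rendering.
function progressOf(op: OperationView, now: number): { elapsedMs: number; fraction: number } {
  const elapsedMs = Math.max(0, now - op.startedAt)
  // Cap below 1 so the estimate alone never claims the work is finished.
  const fraction = op.estimateMs > 0 ? Math.min(0.95, elapsedMs / op.estimateMs) : 0
  return { elapsedMs, fraction }
}

const op = startOperation("op-1", "llm-call", 20_000, 1_000)
const progress = progressOf(op, 6_000)
```

Capping the fraction is one way to avoid pretending to know more than the provider reveals: only a real done event should move the operation to completion.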
Every provider or workflow should emit normalized events:
type NormalizedProviderEvent =
  | { type: "stage", stage: OperationStage }
  | { type: "reasoning", delta: string }
  | { type: "delta", delta: string }
  | { type: "done" }
  | { type: "error", error: Error }
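One way to consume these events is a small reducer that folds each event into the operation object. The sketch below is an illustration, not a prescribed implementation: applyEvent is a hypothetical name, the OperationStage shape is assumed, and text deltas are deliberately left out of operation state on the assumption that the app stores them elsewhere.

```typescript
// Assumed stage shape; the text does not define OperationStage.
type OperationStage = { key: string; label: string; at: number }

type OperationView = {
  id: string
  kind: string
  status: "running" | "done" | "error" | "cancelled" | "timeout"
  startedAt: number
  estimateMs: number
  cancellable: boolean
  stages: OperationStage[]
}

type NormalizedProviderEvent =
  | { type: "stage", stage: OperationStage }
  | { type: "reasoning", delta: string }
  | { type: "delta", delta: string }
  | { type: "done" }
  | { type: "error", error: Error }

// Hypothetical reducer: returns a new operation object for each event.
function applyEvent(op: OperationView, event: NormalizedProviderEvent): OperationView {
  switch (event.type) {
    case "stage":
      return { ...op, stages: [...op.stages, event.stage] }
    case "done":
      return { ...op, status: "done", cancellable: false }
    case "error":
      return { ...op, status: "error", cancellable: false }
    case "reasoning":
    case "delta":
      // Text deltas belong to the message being built, not to progress state.
      return op
  }
}

const initial: OperationView = {
  id: "op-1", kind: "llm-call", status: "running",
  startedAt: 0, estimateMs: 1_000, cancellable: true, stages: [],
}

const events: NormalizedProviderEvent[] = [
  { type: "stage", stage: { key: "connected", label: "Connected", at: 5 } },
  { type: "delta", delta: "hi" },
  { type: "done" },
]

const finalState = events.reduce<OperationView>((acc, e) => applyEvent(acc, e), initial)
```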
Provider Mapping
Prefer real provider events over synthetic timers.
- OpenAI Responses: map response.created, response.in_progress, response.web_search_call.*, response.output_item.added/done, reasoning deltas, output text deltas, annotations, completed, and failed.
- xAI Responses: use the same Responses-style lifecycle and tool mapping. Prefer /v1/responses for new xAI work; fall back to Chat Completions only when needed.
- Gemini: map stream connection, part.thought, first visible part.text, usage metadata, grounding metadata, URL context metadata, function calls, code execution parts, finish warnings, and errors.
- Featherless and OpenAI-compatible Chat Completions: map stream connection, first reasoning delta, inline <think> blocks if present, first visible content delta, usage chunks, finish reasons, and errors.
- App-owned tool loops: emit explicit stages before tool invocation, after tool result, before synthesis, on retry, and on save/apply.
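As an illustration of the Responses-style mapping, the sketch below turns a raw stream event into a normalized event. The function name mapResponsesEvent, the RawEvent shape, the stage keys, and the labels are all assumptions. The event type strings follow the list above, except response.output_text.delta, response.completed, and response.failed, which are assumed spellings of the "output text deltas", "completed", and "failed" items and should be checked against the provider's streaming reference.

```typescript
// Assumed stage shape; the text does not define OperationStage.
type OperationStage = { key: string; label: string; at: number }

type NormalizedProviderEvent =
  | { type: "stage", stage: OperationStage }
  | { type: "reasoning", delta: string }
  | { type: "delta", delta: string }
  | { type: "done" }
  | { type: "error", error: Error }

// Loose view of a raw stream event; real payloads carry more fields.
type RawEvent = { type: string; delta?: string }

const WEB_SEARCH_PREFIX = "response.web_search_call."

// Hypothetical mapper: returns null for events that carry no progress signal.
function mapResponsesEvent(raw: RawEvent, now: number): NormalizedProviderEvent | null {
  const stage = (key: string, label: string): NormalizedProviderEvent =>
    ({ type: "stage", stage: { key, label, at: now } })

  if (raw.type === "response.created") return stage("submitted", "Request accepted")
  if (raw.type === "response.in_progress") return stage("connected", "Provider is working")
  if (raw.type.indexOf(WEB_SEARCH_PREFIX) === 0) {
    // Keep the provider's own sub-state instead of inventing one.
    const state = raw.type.slice(WEB_SEARCH_PREFIX.length)
    return stage("web_search." + state, "Web search: " + state.replace(/_/g, " "))
  }
  if (raw.type === "response.output_text.delta") return { type: "delta", delta: raw.delta ?? "" }
  if (raw.type === "response.completed") return { type: "done" }
  if (raw.type === "response.failed") return { type: "error", error: new Error("Provider reported failure") }
  return null
}

const searching = mapResponsesEvent({ type: "response.web_search_call.searching" }, 5)
```

Returning null for unrecognized events keeps the mapper honest: unknown provider events are dropped rather than dressed up as progress.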
Use heartbeats only when no real provider event has arrived for a quiet interval such as 8-15 seconds.
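The quiet-interval rule can be expressed as a pure check that a timer calls periodically. This is a minimal sketch: heartbeatIfQuiet, the 10-second default, the stage shape, and the label text are assumptions chosen to sit inside the 8-15 second window mentioned above.

```typescript
// Assumed stage shape; the text does not define OperationStage.
type OperationStage = { key: string; label: string; at: number }

// Assumed default inside the 8-15 second window.
const QUIET_MS = 10_000

// Returns a synthetic heartbeat stage only when the provider has been silent long enough.
function heartbeatIfQuiet(lastEventAt: number, now: number, quietMs: number = QUIET_MS): OperationStage | null {
  const quietFor = now - lastEventAt
  if (quietFor < quietMs) return null
  const quietSeconds = Math.floor(quietFor / 1000)
  return { key: "heartbeat", label: "Still waiting on provider (" + quietSeconds + "s quiet)", at: now }
}

const tooSoon = heartbeatIfQuiet(0, 4_000)  // a real event arrived recently
const quiet = heartbeatIfQuiet(0, 12_000)   // silent past the threshold
```

A caller would typically run this from an interval and reset lastEventAt whenever a real provider event or a heartbeat is emitted, so heartbeats never crowd out real stages.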
Implementation Steps
- Add agent-proxy-kit.
- Create a provider stage tracker with { provider, model, startedAt }.
- Emit stage events to the frontend via SSE, WebSocket, or your existing stream protocol.
- Append stage events into the active operation state.
- Render elapsed time, rough estimate, latest stage, cancel button, and expandable history.
- Thread AbortSignal from the browser to the server and into provider fetch/SDK calls.
- Log sanitized lifecycle events: start, stage, done, error, cancel.
- Test provider-event mapping and cancellation separately from live provider calls.
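Two of the steps above, emitting stage events over SSE and threading cancellation, can be sketched without any framework. The helpers toSseFrame and linkAbort are hypothetical and hand-rolled; they are not the agent-proxy-kit API, which is not described here. The stage shape and the "stage" event name are also assumptions.

```typescript
// Assumed stage shape; the text does not define OperationStage.
type OperationStage = { key: string; label: string; at: number }

// Serialize one stage as a Server-Sent Events frame.
// JSON.stringify escapes newlines, so the payload stays on a single data line.
function toSseFrame(stage: OperationStage): string {
  return "event: stage\ndata: " + JSON.stringify(stage) + "\n\n"
}

// Derive a controller that aborts when the upstream signal aborts.
// The derived signal is what would be passed to the provider fetch or SDK call.
function linkAbort(upstream: AbortSignal): AbortController {
  const downstream = new AbortController()
  if (upstream.aborted) {
    downstream.abort()
  } else {
    upstream.addEventListener("abort", () => downstream.abort(), { once: true })
  }
  return downstream
}

// Simulate a browser request that is cancelled while the provider call is in flight.
const browserRequest = new AbortController()
const providerCall = linkAbort(browserRequest.signal)
browserRequest.abort()

const frame = toSseFrame({ key: "connected", label: "Connected to provider", at: 1 })
```

Keeping a separate downstream controller lets the server also abort the provider call for its own reasons, such as a timeout, without touching the browser's signal.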
UX Rules
Disable only controls that would duplicate or corrupt the operation. Let users inspect previous results, browse old variants, open developer logs, and cancel while new work is running.
Do not show a stage for every token. Emit meaningful stages: submitted, connected, reasoning started, first visible token, hosted tool/search state, usage received, retry/fallback, completed.
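One simple way to avoid a stage per token is to gate each stage key so it fires at most once per operation. The StageGate class below is a hypothetical sketch with an assumed stage shape; it shows the "first visible token" case, where many token deltas produce a single stage.

```typescript
// Assumed stage shape; the text does not define OperationStage.
type OperationStage = { key: string; label: string; at: number }

// Emits a stage the first time a key is seen and null on every repeat.
class StageGate {
  private seen: Record<string, boolean> = {}

  once(key: string, label: string, now: number): OperationStage | null {
    if (this.seen[key]) return null
    this.seen[key] = true
    return { key, label, at: now }
  }
}

const gate = new StageGate()
const emitted: OperationStage[] = []

// Three token deltas arrive, but only the first produces a stage.
const tokens = ["Hel", "lo", " world"]
tokens.forEach((_token, i) => {
  const stage = gate.once("first_visible_token", "First visible token", 100 + i)
  if (stage) emitted.push(stage)
})
```

Stages that legitimately repeat, such as retries, would need a distinct key per attempt (for example "retry.2") under this scheme.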
Never store raw prompts, replies, or private content in server logs just to power progress. Stage labels and previews should be structural and sanitized.
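A sanitized lifecycle log can be built by copying only structural fields into the record. In the sketch below, toLogRecord, LogRecord, and the stage shape are assumptions. The record keeps the stage key and drops the label, on the cautious assumption that a label might echo user content even when it is supposed to be sanitized.

```typescript
// Assumed stage shape; the text does not define OperationStage.
type OperationStage = { key: string; label: string; at: number }

type LifecycleEvent = "start" | "stage" | "done" | "error" | "cancel"

type LogRecord = {
  opId: string
  event: LifecycleEvent
  elapsedMs: number
  stageKey?: string
}

// Builds a log record from structural fields only: no prompts, replies, or labels.
function toLogRecord(
  opId: string,
  event: LifecycleEvent,
  startedAt: number,
  now: number,
  stage?: OperationStage,
): LogRecord {
  const record: LogRecord = { opId, event, elapsedMs: now - startedAt }
  if (stage) record.stageKey = stage.key
  return record
}

// The label here is deliberately unsanitized to show that it never reaches the log line.
const line = JSON.stringify(
  toLogRecord("op-1", "stage", 1_000, 3_500, {
    key: "web_search.searching",
    label: "Searching for: private query text",
    at: 3_500,
  }),
)
```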