| name | ai-core/middleware |
| description | Chat lifecycle middleware hooks: onConfig, onStart, onChunk, onBeforeToolCall, onAfterToolCall, onUsage, onFinish, onAbort, onError. Use for analytics, event firing, tool caching (toolCacheMiddleware), logging, and tracing. Middleware array in chat() config, left-to-right execution order. NOT onEnd/onFinish callbacks on chat(); use middleware. |
| type | sub-skill |
| library | tanstack-ai |
| library_version | 0.10.0 |
| sources | ["TanStack/ai:docs/advanced/middleware.md"] |
Middleware
Dependency note: This skill builds on ai-core. Read it first for critical rules.
Setup — Analytics Tracking Middleware
import { chat, toServerSentEventsResponse } from '@tanstack/ai'
import { openaiText } from '@tanstack/ai-openai'
const stream = chat({
adapter: openaiText('gpt-5.2'),
messages,
middleware: [
{
onStart: (ctx) => {
console.log('Chat started:', ctx.model)
},
onFinish: (ctx, info) => {
trackAnalytics({ model: ctx.model, tokens: info.usage?.totalTokens })
},
onError: (ctx, info) => {
reportError(info.error)
},
},
],
})
return toServerSentEventsResponse(stream)
Hooks Reference
Every hook receives a ChatMiddlewareContext as its first argument, which provides
requestId, streamId, phase, iteration, chunkIndex, model, provider,
signal, abort(), defer(), and more.
| Hook | When | Second Argument |
|---|---|---|
| onConfig | Once at startup (init) + once per iteration (beforeModel) + once at structured-output boundary | ChatMiddlewareConfig (return partial to merge) |
| onStructuredOutputConfig | Once at the structured-output boundary (only when chat({ outputSchema })) | StructuredOutputMiddlewareConfig (return partial) |
| onStart | Once after initial onConfig | none |
| onIteration | Start of each agent loop iteration | IterationInfo |
| onChunk | Every streamed chunk | StreamChunk (return void/chunk/chunk[]/null) |
| onBeforeToolCall | Before each tool executes | ToolCallHookContext (return decision or void) |
| onAfterToolCall | After each tool executes | AfterToolCallInfo |
| onToolPhaseComplete | After all tool calls in an iteration | ToolPhaseCompleteInfo |
| onUsage | When RUN_FINISHED includes usage data | UsageInfo |
| onFinish | Run completed normally | FinishInfo |
| onAbort | Run was aborted | AbortInfo |
| onError | Unhandled error occurred | ErrorInfo |
Terminal hooks (onFinish, onAbort, onError) are mutually exclusive -- exactly
one fires per chat() invocation.
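Because exactly one terminal hook fires, cleanup that must run once per run can be routed through a single handler. The following is a minimal sketch of that idea; `onTerminal`, `Outcome`, and `CtxLike` are hypothetical helpers defined here for illustration, not part of the library, and the real hooks also receive an info argument that this sketch ignores.

```typescript
// Hypothetical helper: funnel the three terminal hooks into one handler.
type Outcome = 'finish' | 'abort' | 'error'

function onTerminal<Ctx>(handler: (ctx: Ctx, outcome: Outcome) => void) {
  return {
    onFinish: (ctx: Ctx): void => handler(ctx, 'finish'),
    onAbort: (ctx: Ctx): void => handler(ctx, 'abort'),
    onError: (ctx: Ctx): void => handler(ctx, 'error'),
  }
}

// Local stand-in for the middleware context; only requestId is read.
interface CtxLike {
  requestId: string
}

const closedRuns: string[] = []

// Spread the generated hooks into a middleware-shaped object.
const runCloser = {
  name: 'run-closer',
  ...onTerminal<CtxLike>((ctx, outcome) => {
    closedRuns.push(`${ctx.requestId}:${outcome}`)
  }),
}

// Simulate an aborted run: only onAbort is invoked.
runCloser.onAbort({ requestId: 'req-1' })
```

Since the handler is reached through exactly one hook per run, it does not need its own "already ran" guard.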
Phase values
ctx.phase is one of:
| Phase | When |
|---|---|
| 'init' | Initial setup (before the first onConfig snapshot is built). |
| 'beforeModel' | Right before each agent-loop adapter call (onConfig re-fires here). |
| 'modelStream' | During model streaming chunks within the agent loop. |
| 'beforeTools' | Before tool execution phase. |
| 'afterTools' | After tool execution phase. |
| 'structuredOutput' | During the final structured-output adapter call (set for all chunks from adapter.structuredOutputStream or the synthesized fallback). Triggered only when chat({ outputSchema }) is invoked; one phase transition per chat() invocation. |
Structured-output lifecycle rules (when chat({ outputSchema }) is used):
- onStructuredOutputConfig fires before onConfig at the structured-output boundary.
- onConfig re-fires at the same boundary with ctx.phase === 'structuredOutput', receiving the post-onStructuredOutputConfig view of the config (minus outputSchema).
- onChunk and onUsage fire for every chunk and usage event emitted by the structured-output call, with ctx.phase === 'structuredOutput'.
- onIteration does not fire for finalization; it is agent-loop-only.
- onFinish fires once at the end of the whole chat() invocation, after the structured-output finalization completes (not after the agent loop). Terminal-hook exclusivity still holds (one of onFinish / onAbort / onError).
- Terminal info and structured-output: info.usage / info.finishReason / info.content reflect the agent loop's terminal state, NOT the finalization step. Finalization state is intentionally segregated to keep agent-loop semantics clean. For a tools-less chat({ outputSchema }) run, info.usage is undefined and info.finishReason is null (no agent-loop iteration produced RUN_FINISHED). To capture finalization tokens, use onUsage, which fires for both agent-loop iterations and the final call. For the structured-output result itself, observe the structured-output.complete CUSTOM event in onChunk.
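As a rough sketch of the onUsage advice above, the following tallies tokens separately for the agent loop and the structured-output call, so finalization tokens are not lost when info.usage is undefined. `PhaseCtx` and `UsageLike` are local stand-ins (assumptions) for the library's context and usage types; only ctx.phase and usage.totalTokens are read. Which phase value agent-loop usage events carry is not stated here, so anything other than 'structuredOutput' is counted as agent-loop usage.

```typescript
// Local stand-ins for the library types; only the fields this sketch reads.
type Phase =
  | 'init'
  | 'beforeModel'
  | 'modelStream'
  | 'beforeTools'
  | 'afterTools'
  | 'structuredOutput'

interface PhaseCtx {
  phase: Phase
}

interface UsageLike {
  totalTokens: number
}

function createUsageTally() {
  const totals = { agentLoop: 0, structuredOutput: 0 }
  return {
    totals,
    // Middleware-shaped object: a name plus an onUsage hook.
    middleware: {
      name: 'usage-tally',
      onUsage(ctx: PhaseCtx, usage: UsageLike): void {
        if (ctx.phase === 'structuredOutput') {
          totals.structuredOutput += usage.totalTokens
        } else {
          totals.agentLoop += usage.totalTokens
        }
      },
    },
  }
}

// Simulated usage events: two agent-loop iterations, then the final call.
// The 'modelStream' phase on the first two events is an assumption.
const tally = createUsageTally()
tally.middleware.onUsage({ phase: 'modelStream' }, { totalTokens: 120 })
tally.middleware.onUsage({ phase: 'modelStream' }, { totalTokens: 80 })
tally.middleware.onUsage({ phase: 'structuredOutput' }, { totalTokens: 45 })
```

In a real chat() call, tally.middleware would go in the middleware array and tally.totals would be read from a terminal hook.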
onStructuredOutputConfig
A dedicated config hook that fires only at the structured-output boundary
(when chat({ outputSchema }) is invoked). Use it to transform the JSON Schema
sent to the provider (inject $defs, strip vendor-incompatible keywords) or to
apply structured-output-specific config changes that should not affect the
agent-loop adapter calls.
Signature:
onStructuredOutputConfig?: (
ctx: ChatMiddlewareContext,
config: StructuredOutputMiddlewareConfig,
) =>
| void
| null
| Partial<StructuredOutputMiddlewareConfig>
| Promise<void | Partial<StructuredOutputMiddlewareConfig>>
StructuredOutputMiddlewareConfig shape:
interface StructuredOutputMiddlewareConfig extends ChatMiddlewareConfig {
outputSchema: JSONSchema
}
Ordering rule:
- onStructuredOutputConfig fires before onConfig at the structured-output boundary.
- onConfig re-fires at the same boundary with ctx.phase === 'structuredOutput', receiving the post-onStructuredOutputConfig view of the config (minus outputSchema).
- Use onConfig for general-purpose transforms that apply to every adapter call (agent-loop iterations and the final structured-output call).
- Use onStructuredOutputConfig when you need to transform the JSON Schema or apply structured-output-specific behavior.
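To illustrate the "strip vendor-incompatible keywords" use case, here is a minimal sketch of a recursive schema cleaner wrapped in an onStructuredOutputConfig-shaped hook. `stripKeywords`, `JsonValue`, and the banned-keyword list are hypothetical and chosen only for illustration; which keywords a given provider rejects is not specified here. The hook's parameter types are local stand-ins rather than the library's types.

```typescript
type JsonValue =
  | string
  | number
  | boolean
  | null
  | JsonValue[]
  | { [key: string]: JsonValue }

// Keys under these keywords are user-chosen names, not schema keywords,
// so they must never be stripped (e.g. a property literally named "format").
const NAME_MAPS = new Set(['properties', '$defs', 'definitions', 'patternProperties'])

function stripKeywords(
  schema: JsonValue,
  banned: ReadonlySet<string>,
  keysAreNames = false,
): JsonValue {
  if (Array.isArray(schema)) {
    return schema.map((item) => stripKeywords(item, banned))
  }
  if (schema !== null && typeof schema === 'object') {
    const out: { [key: string]: JsonValue } = {}
    for (const [key, value] of Object.entries(schema)) {
      if (!keysAreNames && banned.has(key)) continue
      out[key] = stripKeywords(value, banned, !keysAreNames && NAME_MAPS.has(key))
    }
    return out
  }
  return schema
}

// Illustrative list only; adjust to what the target provider rejects.
const BANNED = new Set(['format', 'pattern'])

const stripUnsupported = {
  name: 'strip-unsupported-keywords',
  onStructuredOutputConfig(_ctx: unknown, config: { outputSchema: JsonValue }) {
    return { outputSchema: stripKeywords(config.outputSchema, BANNED) }
  },
}

const sampleSchema: JsonValue = {
  type: 'object',
  properties: {
    format: { type: 'string', format: 'email', pattern: '^.+@.+$' },
  },
  required: ['format'],
}

const stripped = stripUnsupported.onStructuredOutputConfig(undefined, {
  outputSchema: sampleSchema,
})
```

The property named format survives while the format and pattern keywords inside it are removed. The sketch does not special-case literal data under enum, const, or default, which a production version may want to leave untouched.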
Core Patterns
Pattern 1: Analytics and Logging Middleware
Use onStart, onFinish, onUsage, and onError for comprehensive observability.
Use ctx.defer() for async side effects that should not block the stream.
import {
chat,
toServerSentEventsResponse,
type ChatMiddleware,
} from '@tanstack/ai'
import { openaiText } from '@tanstack/ai-openai'
const analytics: ChatMiddleware = {
name: 'analytics',
onStart: (ctx) => {
console.log(`[${ctx.requestId}] Chat started — model: ${ctx.model}`)
},
onUsage: (ctx, usage) => {
console.log(`[${ctx.requestId}] Tokens: ${usage.totalTokens}`)
},
onFinish: (ctx, info) => {
ctx.defer(
fetch('/api/analytics', {
method: 'POST',
body: JSON.stringify({
requestId: ctx.requestId,
model: ctx.model,
duration: info.duration,
tokens: info.usage?.totalTokens,
finishReason: info.finishReason,
}),
}),
)
},
onError: (ctx, info) => {
ctx.defer(
fetch('/api/errors', {
method: 'POST',
body: JSON.stringify({
requestId: ctx.requestId,
error: String(info.error),
duration: info.duration,
}),
}),
)
},
}
const stream = chat({
adapter: openaiText('gpt-5.2'),
messages,
middleware: [analytics],
})
return toServerSentEventsResponse(stream)
Pattern 2: Tool Interception Middleware
Use onBeforeToolCall to validate, gate, or transform tool arguments before execution.
Use onAfterToolCall to log results and timing. The first middleware that returns a
non-void decision from onBeforeToolCall short-circuits remaining middleware for that call.
import type { ChatMiddleware } from '@tanstack/ai'
const toolGuard: ChatMiddleware = {
name: 'tool-guard',
onBeforeToolCall: (ctx, hookCtx) => {
if (hookCtx.toolName === 'deleteDatabase') {
return { type: 'abort', reason: 'Dangerous operation blocked' }
}
if (hookCtx.toolName === 'search' && !hookCtx.args.limit) {
return {
type: 'transformArgs',
args: { ...hookCtx.args, limit: 10 },
}
}
},
onAfterToolCall: (ctx, info) => {
if (info.ok) {
console.log(`${info.toolName} completed in ${info.duration}ms`)
} else {
console.error(`${info.toolName} failed:`, info.error)
}
},
}
onBeforeToolCall decision types:
| Decision | Effect |
|---|---|
| void / undefined | Continue normally, next middleware decides |
| { type: 'transformArgs', args } | Replace tool arguments before execution |
| { type: 'skip', result } | Skip execution, use provided result (used by toolCacheMiddleware) |
| { type: 'abort', reason? } | Abort the entire chat run |
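The first-win rule for onBeforeToolCall can be modeled in a few lines. This is a simplified mental model, not the library's implementation: `resolveDecision`, `ToolCallLike`, and the two example hooks are hypothetical, and the real hooks also receive the middleware context as their first argument.

```typescript
type Decision =
  | { type: 'transformArgs'; args: Record<string, unknown> }
  | { type: 'skip'; result: unknown }
  | { type: 'abort'; reason?: string }

interface ToolCallLike {
  toolName: string
  args: Record<string, unknown>
}

type BeforeToolCall = (call: ToolCallLike) => Decision | undefined

// Walk the hooks in array order; the first non-undefined decision wins
// and later hooks are not consulted for this tool call.
function resolveDecision(
  hooks: BeforeToolCall[],
  call: ToolCallLike,
): Decision | undefined {
  for (const hook of hooks) {
    const decision = hook(call)
    if (decision !== undefined) return decision
  }
  return undefined
}

// Two hypothetical hooks that both have an opinion about getWeather.
const cachedWeather: BeforeToolCall = (call) =>
  call.toolName === 'getWeather'
    ? { type: 'skip', result: { tempC: 21 } }
    : undefined

const blockWeather: BeforeToolCall = (call) =>
  call.toolName === 'getWeather'
    ? { type: 'abort', reason: 'weather lookups disabled' }
    : undefined

const weatherCall: ToolCallLike = { toolName: 'getWeather', args: { city: 'Oslo' } }

const cacheFirst = resolveDecision([cachedWeather, blockWeather], weatherCall)
const guardFirst = resolveDecision([blockWeather, cachedWeather], weatherCall)
const untouched = resolveDecision([cachedWeather, blockWeather], {
  toolName: 'search',
  args: {},
})
```

Swapping the array order flips the outcome from skip to abort, which is why guards that must always run belong earlier in the middleware array than caches.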
Pattern 3: Structured-Output Middleware
When chat({ outputSchema }) is used, the final structured-output adapter call
now flows through the same middleware chain as the agent loop (with
ctx.phase === 'structuredOutput'). Before this change, the final call bypassed
middleware entirely — onChunk, onUsage, onConfig, and terminal hooks did
not see it.
Example A — Observability (tracing every chunk, including finalization):
import type { ChatMiddleware } from '@tanstack/ai'
const tracing: ChatMiddleware = {
name: 'tracing',
onChunk(ctx, chunk) {
span.addEvent('chunk', { phase: ctx.phase, type: chunk.type })
},
}
This middleware now observes every chunk from the final structured-output call,
attributed to ctx.phase === 'structuredOutput'. Before the fix, the final
adapter call bypassed middleware entirely — tracing would only see agent-loop
chunks.
Example B — Schema rewriting (inject shared $defs):
import type { ChatMiddleware } from '@tanstack/ai'
const injectDefs: ChatMiddleware = {
name: 'inject-defs',
onStructuredOutputConfig(_ctx, config) {
return {
outputSchema: { ...config.outputSchema, $defs: { ...sharedDefs } },
}
},
}
onStructuredOutputConfig is the right hook here because it has direct access
to config.outputSchema and runs only on the structured-output boundary —
schema rewrites do not leak into the agent-loop adapter calls.
Pattern 4: Multiple Middleware Composition
Middleware executes in array order (left-to-right). Ordering matters for hooks that
pipe or short-circuit:
import { chat, type ChatMiddleware } from '@tanstack/ai'
import { toolCacheMiddleware } from '@tanstack/ai/middlewares'
import { openaiText } from '@tanstack/ai-openai'
const logging: ChatMiddleware = {
name: 'logging',
onStart: (ctx) => console.log(`[${ctx.requestId}] started`),
onChunk: (ctx, chunk) => {
console.log(`[${ctx.requestId}] chunk: ${chunk.type}`)
},
onFinish: (ctx, info) => {
console.log(`[${ctx.requestId}] done in ${info.duration}ms`)
},
}
const configTransform: ChatMiddleware = {
name: 'config-transform',
onConfig: (ctx, config) => {
if (ctx.phase === 'init') {
return {
systemPrompts: [...config.systemPrompts, 'Always respond in JSON.'],
}
}
},
}
const stream = chat({
adapter: openaiText('gpt-5.2'),
messages,
tools: [weatherTool, stockTool],
middleware: [
logging,
configTransform,
toolCacheMiddleware({ ttl: 60_000 }),
],
})
Composition rules by hook:
| Hook | Composition | Effect of Order |
|---|---|---|
| onConfig | Piped -- each receives previous output | Earlier middleware transforms first |
| onStructuredOutputConfig | Piped -- each receives previous output | Earlier middleware transforms first |
| onStart | Sequential | All run in order |
| onChunk | Piped -- chunks flow through each | If first drops a chunk, later never see it |
| onBeforeToolCall | First-win -- first non-void decision wins | Earlier middleware has priority |
| onAfterToolCall | Sequential | All run in order |
| onUsage | Sequential | All run in order |
| onFinish/onAbort/onError | Sequential | All run in order |
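The piped behavior of onChunk is easiest to see in a small simulation. The sketch below is a simplified model under stated assumptions, not library code: `pipeChunk` and `ChunkLike` are hypothetical, the 'DEBUG' chunk type is invented for the example, and the real hooks also receive the middleware context.

```typescript
interface ChunkLike {
  type: string
  delta?: string
}

// Return values mirror the documented options: a chunk, an array of
// chunks, null (drop), or undefined (pass through unchanged).
type OnChunk = (chunk: ChunkLike) => ChunkLike | ChunkLike[] | null | undefined

function pipeChunk(hooks: OnChunk[], input: ChunkLike): ChunkLike[] {
  let current: ChunkLike[] = [input]
  for (const hook of hooks) {
    const next: ChunkLike[] = []
    for (const chunk of current) {
      const result = hook(chunk)
      if (result === undefined) next.push(chunk) // pass through
      else if (result === null) continue // dropped: later hooks never see it
      else if (Array.isArray(result)) next.push(...result) // fan out
      else next.push(result) // replaced
    }
    current = next
  }
  return current
}

// Hook 1 drops hypothetical DEBUG chunks; hook 2 records what it sees.
const dropDebug: OnChunk = (chunk) => (chunk.type === 'DEBUG' ? null : undefined)

const seenTypes: string[] = []
const recordTypes: OnChunk = (chunk) => {
  seenTypes.push(chunk.type)
  return undefined
}

const afterDebug = pipeChunk([dropDebug, recordTypes], { type: 'DEBUG' })
const afterText = pipeChunk([dropDebug, recordTypes], {
  type: 'TEXT_MESSAGE_CONTENT',
  delta: 'hi',
})
```

Here recordTypes never observes the DEBUG chunk because dropDebug sits before it in the array; placing the recorder first would let it see every chunk.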
Built-in: toolCacheMiddleware
Caches tool call results by name + arguments. Import from @tanstack/ai/middlewares:
import { chat } from '@tanstack/ai'
import { toolCacheMiddleware } from '@tanstack/ai/middlewares'
const stream = chat({
adapter,
messages,
tools: [weatherTool],
middleware: [
toolCacheMiddleware({
ttl: 60_000,
maxSize: 50,
toolNames: ['getWeather'],
}),
],
})
Options: maxSize (default 100), ttl (default Infinity), toolNames (default all),
keyFn (custom cache key), storage (custom backend like Redis). See
docs/advanced/middleware.md for custom storage examples.
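When caching by name + arguments, argument key order can produce distinct keys for equivalent calls. The sketch below shows one way to build an order-insensitive key. `stableCacheKey` and `canonicalize` are hypothetical standalone helpers; the exact keyFn signature is not shown in this document, so adapting this to keyFn is an assumption to verify against docs/advanced/middleware.md.

```typescript
// Recursively rebuild objects with sorted keys so that logically equal
// argument objects serialize to the same string.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => canonicalize(item))
  }
  if (value !== null && typeof value === 'object') {
    const source = value as Record<string, unknown>
    const sorted: Record<string, unknown> = {}
    for (const key of Object.keys(source).sort()) {
      sorted[key] = canonicalize(source[key])
    }
    return sorted
  }
  return value
}

// Hypothetical key builder: tool name plus canonical JSON of the arguments.
function stableCacheKey(toolName: string, args: unknown): string {
  return `${toolName}:${JSON.stringify(canonicalize(args))}`
}

const keyA = stableCacheKey('getWeather', { city: 'Oslo', units: 'metric' })
const keyB = stableCacheKey('getWeather', { units: 'metric', city: 'Oslo' })
const keyC = stableCacheKey('getWeather', { city: 'Bergen', units: 'metric' })
```

keyA and keyB are identical while keyC differs, so the first two calls would share a cache entry and the third would not.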
Common Mistakes
a. MEDIUM: Trying to modify StreamChunks in middleware
// WRONG: mutates the input chunk; the stream output is unchanged
const broken: ChatMiddleware = {
name: 'broken',
onChunk: (ctx, chunk) => {
chunk.delta = 'modified'
},
}
// CORRECT: return a new chunk instead of mutating the input
const correct: ChatMiddleware = {
name: 'correct',
onChunk: (ctx, chunk) => {
if (chunk.type === 'TEXT_MESSAGE_CONTENT') {
return { ...chunk, delta: chunk.delta.replace(/secret/g, '[REDACTED]') }
}
},
}
Middleware onChunk hooks are functional transforms. Return a new chunk, an array
of chunks, null (to drop), or void (to pass through). Mutating the input object
has no effect on the stream output.
Source: docs/advanced/middleware.md
b. MEDIUM: Middleware exceptions breaking the stream
// WRONG: if the awaited fetch rejects, the error escapes the hook
const fragile: ChatMiddleware = {
name: 'fragile-analytics',
onFinish: async (ctx, info) => {
await fetch('/api/analytics', {
method: 'POST',
body: JSON.stringify({ duration: info.duration }),
})
},
}
// CORRECT: defer async side effects and guard synchronous logging with try-catch
const resilient: ChatMiddleware = {
name: 'resilient-analytics',
onFinish: (ctx, info) => {
ctx.defer(
fetch('/api/analytics', {
method: 'POST',
body: JSON.stringify({ duration: info.duration }),
}),
)
},
onChunk: (ctx, chunk) => {
try {
logChunk(chunk)
} catch (err) {
console.error('Logging failed:', err)
}
},
}
Wrap middleware hook bodies in try-catch so that analytics or logging failures
cannot kill the chat stream. For async side effects, prefer ctx.defer(), which
runs after the terminal hook and isolates failures.
Source: docs/advanced/middleware.md
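Repeating try-catch in every hook gets noisy, so the guard can be factored into a wrapper. The following is a minimal sketch; `safeHook` is a hypothetical helper, not a library export, and it only catches synchronous throws (a rejected promise returned by an async hook is not handled here). On failure it returns undefined, which for onChunk corresponds to passing the chunk through.

```typescript
// Hypothetical wrapper: run a hook, report any synchronous throw, and
// return undefined instead of letting the error propagate.
function safeHook<Args extends unknown[], R>(
  label: string,
  hook: (...args: Args) => R,
  onFailure: (label: string, err: unknown) => void = (failedLabel, err) =>
    console.error(`[${failedLabel}] hook failed:`, err),
): (...args: Args) => R | undefined {
  return (...args: Args): R | undefined => {
    try {
      return hook(...args)
    } catch (err) {
      onFailure(label, err)
      return undefined
    }
  }
}

interface DeltaChunk {
  type: string
  delta?: string
}

const hookFailures: string[] = []

// A transform that throws when the chunk has no delta.
const trimDelta = safeHook(
  'trim-delta',
  (chunk: DeltaChunk): DeltaChunk => {
    if (chunk.delta === undefined) throw new Error('chunk has no delta')
    return { ...chunk, delta: chunk.delta.trim() }
  },
  (failedLabel) => {
    hookFailures.push(failedLabel)
  },
)

const trimmed = trimDelta({ type: 'TEXT_MESSAGE_CONTENT', delta: '  hi  ' })
const recovered = trimDelta({ type: 'TEXT_MESSAGE_CONTENT' })
```

The second call records the failure and yields undefined rather than throwing, so a broken transform degrades to a no-op instead of ending the run.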
Cross-References
- See also: ai-core/chat-experience/SKILL.md -- Middleware hooks into the chat lifecycle
- See also: ai-core/structured-outputs/SKILL.md -- Middleware now wraps the final structured-output call; use onStructuredOutputConfig for JSON-Schema transforms