implementing-chat-streaming
Provides SSE streaming patterns for the chat API and frontend. Use when implementing or modifying chat streaming, handling SSE events, or troubleshooting message flow between frontend and backend.
| name | implementing-chat-streaming |
| description | Provides SSE streaming patterns for the chat API and frontend. Use when implementing or modifying chat streaming, handling SSE events, or troubleshooting message flow between frontend and backend. |
app.MapPost("/api/chat/stream", async (
    ChatRequest request,
    AgentFrameworkService agentService,
    HttpContext httpContext,
    CancellationToken cancellationToken) =>
{
    httpContext.Response.Headers.Append("Content-Type", "text/event-stream");
    httpContext.Response.Headers.Append("Cache-Control", "no-cache");

    var conversationId = request.ConversationId
        ?? await agentService.CreateConversationAsync(request.Message, cancellationToken);

    // Send conversation ID first
    await httpContext.Response.WriteAsync(
        $"data: {{\"type\":\"conversationId\",\"conversationId\":\"{conversationId}\"}}\n\n",
        cancellationToken);
    await httpContext.Response.Body.FlushAsync(cancellationToken);

    // Stream chunks
    await foreach (var chunk in agentService.StreamMessageAsync(
        conversationId, request.Message, request.ImageDataUris, cancellationToken))
    {
        var json = JsonSerializer.Serialize(new { type = "chunk", content = chunk });
        await httpContext.Response.WriteAsync($"data: {json}\n\n", cancellationToken);
        await httpContext.Response.Body.FlushAsync(cancellationToken);
    }

    await httpContext.Response.WriteAsync("data: {\"type\":\"done\"}\n\n", cancellationToken);
})
.RequireAuthorization("RequireChatScope");
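On the wire, the endpoint above writes one `data: <json>` line per event, followed by a blank line. Because a network read can end in the middle of an event, a client has to buffer until it sees that blank line. The TypeScript sketch below illustrates one way to do this; `splitSseFrames` is a hypothetical helper, not the code in ChatService.ts, and it assumes single-line `data:` payloads like the ones this endpoint emits.

```typescript
/** Complete JSON payloads found in the buffer, plus any trailing partial frame. */
interface SplitResult {
  payloads: string[];
  rest: string;
}

// Hypothetical helper: splits buffered SSE text into complete `data:` payloads.
// Assumes every event is a single `data: <json>` line terminated by a blank line.
function splitSseFrames(buffer: string): SplitResult {
  const parts = buffer.split("\n\n");
  // The last element is "" when the buffer ends on a frame boundary,
  // otherwise it is an incomplete frame that must wait for more bytes.
  const rest = parts.pop() ?? "";
  const payloads: string[] = [];
  for (const frame of parts) {
    for (const line of frame.split("\n")) {
      if (line.startsWith("data:")) {
        const value = line.slice("data:".length);
        // Per the SSE format, one leading space after the colon is dropped.
        payloads.push(value.startsWith(" ") ? value.slice(1) : value);
      }
    }
  }
  return { payloads, rest };
}

// Usage: feed each network read through the splitter, carrying `rest` forward.
let pending = "";
const received: unknown[] = [];
const reads = [
  'data: {"type":"chunk","con',
  'tent":"Hi"}\n\ndata: {"type":"done"}\n\n',
];
for (const read of reads) {
  const { payloads, rest } = splitSseFrames(pending + read);
  pending = rest;
  for (const payload of payloads) {
    received.push(JSON.parse(payload));
  }
}
```

Carrying the remainder forward means an event split across two reads is parsed exactly once, after its second half arrives.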
Actual return type of StreamMessageAsync: IAsyncEnumerable<StreamChunk> (not raw strings)
Why direct SDK? Uses ProjectResponsesClient directly because we need typed access to MCP approvals, file search quotes, and citation annotations. See .github/skills/researching-azure-ai-sdk/SKILL.md for full rationale.
public async IAsyncEnumerable<StreamChunk> StreamMessageAsync(
    string conversationId,
    string message,
    List<string>? imageDataUris = null,
    [EnumeratorCancellation] CancellationToken cancellationToken = default)
{
    ObjectDisposedException.ThrowIf(_disposed, this);

    // Stream response - yields StreamChunk with text deltas OR annotations
    await foreach (var update in responsesClient.CreateResponseStreamingAsync(...))
    {
        if (update is StreamingResponseOutputTextDeltaUpdate deltaUpdate)
        {
            yield return StreamChunk.Text(deltaUpdate.Delta);
        }
        else if (update is StreamingResponseOutputItemDoneUpdate itemDoneUpdate)
        {
            var annotations = ExtractAnnotations(itemDoneUpdate.Item, fileSearchQuotes);
            if (annotations.Count > 0)
            {
                yield return StreamChunk.WithAnnotations(annotations);
            }
        }
    }
}
StreamChunk model (backend/WebApp.Api/Models/StreamChunk.cs):
- IsText / TextDelta - Text content
- HasAnnotations / Annotations - Citation metadata

CHAT_SEND_MESSAGE
→ CHAT_ADD_ASSISTANT_MESSAGE
→ CHAT_START_STREAM
→ (repeat CHAT_STREAM_CHUNK)
→ CHAT_STREAM_ANNOTATIONS (optional, for citations)
→ CHAT_STREAM_COMPLETE (with usage metrics)
If user cancels: CHAT_CANCEL_STREAM sets status to idle.
See: frontend/src/services/ChatService.ts
Key patterns:
- data: lines

Backend limits (see AzureAIAgentService.cs):
- image/png, image/jpeg, image/gif, image/webp

Frontend limits (see frontend/src/utils/fileAttachments.ts):
app.MapPost("/api/chat/stream", async (
    ChatRequest request,
    AgentFrameworkService agentService,
    HttpContext httpContext,
    IHostEnvironment env,
    CancellationToken cancellationToken) =>
{
    httpContext.Response.Headers.Append("Content-Type", "text/event-stream");
    httpContext.Response.Headers.Append("Cache-Control", "no-cache");

    var conversationId = request.ConversationId
        ?? await agentService.CreateConversationAsync(request.Message, cancellationToken);

    await httpContext.Response.WriteAsync(
        $"data: {{\"type\":\"conversationId\",\"conversationId\":\"{conversationId}\"}}\n\n",
        cancellationToken);
    await httpContext.Response.Body.FlushAsync(cancellationToken);

    await foreach (var chunk in agentService.StreamMessageAsync(
        conversationId, request.Message, request.ImageDataUris, cancellationToken))
    {
        var json = System.Text.Json.JsonSerializer.Serialize(new { type = "chunk", content = chunk });
        await httpContext.Response.WriteAsync($"data: {json}\n\n", cancellationToken);
        await httpContext.Response.Body.FlushAsync(cancellationToken);
    }

    await httpContext.Response.WriteAsync("data: {\"type\":\"done\"}\n\n", cancellationToken);
})
.RequireAuthorization("RequireChatScope")
.WithName("StreamChatMessage");
See: backend/WebApp.Api/Services/AgentFrameworkService.cs
Key patterns in StreamMessageAsync:
- IAsyncEnumerable<StreamChunk> with [EnumeratorCancellation]
- StreamingResponseOutputTextDeltaUpdate for text content
- StreamingResponseOutputItemDoneUpdate for annotations
- FileSearchCallResponseItem for citation context
- StreamingResponseCompletedUpdate

CHAT_SEND_MESSAGE
→ CHAT_ADD_ASSISTANT_MESSAGE
→ CHAT_START_STREAM
→ (repeat CHAT_STREAM_CHUNK)
→ CHAT_STREAM_ANNOTATIONS (optional, for citations)
→ CHAT_STREAM_COMPLETE (with usage: promptTokens, completionTokens, totalTokens, duration)
Cancel: CHAT_CANCEL_STREAM sets status to idle and re-enables input.
Error: CHAT_ERROR with AppError containing message, optional retry action, timestamp.
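The action sequence above maps naturally onto a reducer. The sketch below is illustrative only: the action type names and usage fields come from the flow above, while the state shape, the payload field names, the `sending`, `streaming`, and `error` status values, and the choice to return to `idle` on completion are assumptions rather than the repository's actual implementation.

```typescript
// Hypothetical status values; only "idle" is named in the flow above.
type ChatStatus = "idle" | "sending" | "streaming" | "error";

interface Usage {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
  duration: number;
}

// Hypothetical state shape for a single in-flight assistant message.
interface ChatState {
  status: ChatStatus;
  assistantText: string;
  annotations: unknown[];
  usage?: Usage;
  error?: string;
}

type ChatAction =
  | { type: "CHAT_SEND_MESSAGE" }
  | { type: "CHAT_ADD_ASSISTANT_MESSAGE" }
  | { type: "CHAT_START_STREAM" }
  | { type: "CHAT_STREAM_CHUNK"; content: string }
  | { type: "CHAT_STREAM_ANNOTATIONS"; annotations: unknown[] }
  | { type: "CHAT_STREAM_COMPLETE"; usage: Usage }
  | { type: "CHAT_CANCEL_STREAM" }
  | { type: "CHAT_ERROR"; message: string };

function chatReducer(state: ChatState, action: ChatAction): ChatState {
  switch (action.type) {
    case "CHAT_SEND_MESSAGE":
      return { status: "sending", assistantText: "", annotations: [] };
    case "CHAT_ADD_ASSISTANT_MESSAGE":
      return { ...state, assistantText: "" }; // empty placeholder to stream into
    case "CHAT_START_STREAM":
      return { ...state, status: "streaming" };
    case "CHAT_STREAM_CHUNK":
      return { ...state, assistantText: state.assistantText + action.content };
    case "CHAT_STREAM_ANNOTATIONS":
      return { ...state, annotations: action.annotations };
    case "CHAT_STREAM_COMPLETE":
      return { ...state, status: "idle", usage: action.usage };
    case "CHAT_CANCEL_STREAM":
      return { ...state, status: "idle" }; // keeps any partial text already shown
    case "CHAT_ERROR":
      return { ...state, status: "error", error: action.message };
  }
}

// Usage: replay a successful stream through the reducer.
const initialState: ChatState = { status: "idle", assistantText: "", annotations: [] };
const replay: ChatAction[] = [
  { type: "CHAT_SEND_MESSAGE" },
  { type: "CHAT_ADD_ASSISTANT_MESSAGE" },
  { type: "CHAT_START_STREAM" },
  { type: "CHAT_STREAM_CHUNK", content: "Hel" },
  { type: "CHAT_STREAM_CHUNK", content: "lo" },
  {
    type: "CHAT_STREAM_COMPLETE",
    usage: { promptTokens: 10, completionTokens: 5, totalTokens: 15, duration: 120 },
  },
];
const finalState = replay.reduce(chatReducer, initialState);
```

Modelling the actions as a discriminated union lets the compiler flag any action type the reducer forgets to handle.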
| Event Type | Payload | Description |
|---|---|---|
| conversationId | { conversationId: string } | Sent first for new conversations |
| chunk | { content: string } | Text delta from agent response |
| annotations | { annotations: [...] } | Citations (uri_citation, file_citation, etc.) |
| usage | { duration, promptTokens, completionTokens, totalTokens } | Token metrics |
| done | {} | Stream complete |
| error | { message: string } | Error occurred |
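The event table above can be expressed as a TypeScript discriminated union, so that each `data:` payload is validated once and then handled with full type information. This is a sketch under assumptions: `ChatStreamEvent` and `parseStreamEvent` are hypothetical names, the usage fields are assumed to be numbers, and annotation entries are left as `unknown` because the table elides their shape.

```typescript
// Event shapes taken from the table above; annotation entries are left opaque.
type ChatStreamEvent =
  | { type: "conversationId"; conversationId: string }
  | { type: "chunk"; content: string }
  | { type: "annotations"; annotations: unknown[] }
  | { type: "usage"; duration: number; promptTokens: number; completionTokens: number; totalTokens: number }
  | { type: "done" }
  | { type: "error"; message: string };

// Hypothetical validator: returns a typed event, or null when the payload
// is not valid JSON or does not match any row of the table.
function parseStreamEvent(json: string): ChatStreamEvent | null {
  let raw: unknown;
  try {
    raw = JSON.parse(json);
  } catch {
    return null;
  }
  if (typeof raw !== "object" || raw === null) return null;
  const fields = raw as Record<string, unknown>;

  const str = (key: string): string | null => {
    const value = fields[key];
    return typeof value === "string" ? value : null;
  };
  const num = (key: string): number | null => {
    const value = fields[key];
    return typeof value === "number" ? value : null;
  };

  switch (fields["type"]) {
    case "conversationId": {
      const conversationId = str("conversationId");
      return conversationId === null ? null : { type: "conversationId", conversationId };
    }
    case "chunk": {
      const content = str("content");
      return content === null ? null : { type: "chunk", content };
    }
    case "annotations": {
      const annotations = fields["annotations"];
      return Array.isArray(annotations) ? { type: "annotations", annotations } : null;
    }
    case "usage": {
      const duration = num("duration");
      const promptTokens = num("promptTokens");
      const completionTokens = num("completionTokens");
      const totalTokens = num("totalTokens");
      if (duration === null || promptTokens === null || completionTokens === null || totalTokens === null) {
        return null;
      }
      return { type: "usage", duration, promptTokens, completionTokens, totalTokens };
    }
    case "done":
      return { type: "done" };
    case "error": {
      const message = str("message");
      return message === null ? null : { type: "error", message };
    }
    default:
      return null; // unknown event type: ignore rather than throw
  }
}
```

Returning `null` for unrecognized payloads keeps the client tolerant of event types added on the backend later.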
Each state change prints (dev only):
🔄 [HH:MM:SS] ACTION_TYPE
Action: { … }
Changes: { field: before → after }
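A log entry in the format above could be produced by comparing the state before and after each action. The sketch below shows the idea only; `describeChanges` and `formatLogEntry` are hypothetical names, the comparison is shallow (top-level fields, by reference), and the dev-only gating is left to the caller because the source does not say how development mode is detected.

```typescript
type Snapshot = Record<string, unknown>;

// Hypothetical helper: lists top-level fields whose values differ between two states.
function describeChanges(before: Snapshot, after: Snapshot): string[] {
  const changes: string[] = [];
  // Spreading both objects yields the union of their keys without duplicates.
  for (const key of Object.keys({ ...before, ...after })) {
    if (before[key] !== after[key]) {
      changes.push(`${key}: ${JSON.stringify(before[key])} → ${JSON.stringify(after[key])}`);
    }
  }
  return changes;
}

// Hypothetical formatter: builds the three-line entry shown above.
function formatLogEntry(
  action: { type: string },
  before: Snapshot,
  after: Snapshot,
  now: Date = new Date(),
): string {
  const time = now.toTimeString().slice(0, 8); // local HH:MM:SS
  return [
    `🔄 [${time}] ${action.type}`,
    `Action: ${JSON.stringify(action)}`,
    `Changes: { ${describeChanges(before, after).join(", ")} }`,
  ].join("\n");
}

// Usage: format one transition; the caller decides whether to print it.
const entry = formatLogEntry(
  { type: "CHAT_START_STREAM" },
  { status: "sending" },
  { status: "streaming" },
);
```

Keeping the formatter pure (it returns a string instead of printing) makes it easy to test and to disable outside development builds.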