debug-instrumentation
Debug and improve your LangWatch traces. Inspects production traces for missing input/output, disconnected spans, unlabeled traces, and missing metadata. Use when traces look broken or incomplete.
| Field | Value |
|---|---|
| name | debug-instrumentation |
| description | Debug and improve your LangWatch traces. Inspects production traces for missing input/output, disconnected spans, unlabeled traces, and missing metadata. Use when traces look broken or incomplete. |
| license | MIT |
| compatibility | Requires the `langwatch` CLI with a valid `LANGWATCH_API_KEY`. Works with any coding agent. |
| metadata | {"category":"recipe"} |
This recipe uses the `langwatch` CLI to inspect your production traces and identify instrumentation issues.
Use `langwatch docs <path>` to read documentation as Markdown. Some useful entry points:
langwatch docs # Docs index
langwatch docs integration/python/guide # Python integration
langwatch docs integration/typescript/guide # TypeScript integration
langwatch docs prompt-management/cli # Prompts CLI
langwatch scenario-docs # Scenario docs index
Discover commands with `langwatch --help` and `langwatch <subcommand> --help`. List and get commands accept `--format json` for machine-readable output. Read the docs first instead of guessing SDK APIs or CLI flags.
If no shell is available, fetch the same Markdown over plain HTTP by appending `.md` to any docs path (e.g. https://langwatch.ai/docs/integration/python/guide.md). Index: https://langwatch.ai/docs/llms.txt. Scenario index: https://langwatch.ai/scenario/llms.txt
langwatch trace search --limit 25 --start-date 2026-01-01 --format json
(Set `--start-date` to cover the window you care about, such as the last 24 hours or the last 7 days. The CLI accepts ISO strings.)
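Because `--start-date` takes an ISO string, a relative window has to be converted into a date first. The sketch below is one way to do that in Python. It assumes a date-only ISO value like the `2026-01-01` in the command above is acceptable; `start_date_for` and `search_command` are hypothetical helpers, not part of the CLI.

```python
from datetime import datetime, timedelta, timezone


def start_date_for(days_back, now=None):
    """Return the ISO date (YYYY-MM-DD) that lies `days_back` days before `now`."""
    # Default to the current UTC time; `now` can be injected for reproducible runs.
    now = now or datetime.now(timezone.utc)
    return (now - timedelta(days=days_back)).date().isoformat()


def search_command(days_back, limit=25, now=None):
    """Build the argv for the `langwatch trace search` call shown above."""
    return [
        "langwatch", "trace", "search",
        "--limit", str(limit),
        "--start-date", start_date_for(days_back, now),
        "--format", "json",
    ]
```

The returned list can be passed to `subprocess.run` on a machine where the CLI is installed and `LANGWATCH_API_KEY` is set.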
For each trace, ask whether the input and output are populated or show `<empty>`.
`langwatch status` is a fast sanity check that the CLI is talking to the right project.
langwatch trace get <traceId> # Human-readable digest
langwatch trace get <traceId> -f json # Full span hierarchy as JSON
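Disconnected spans can be spotted in the JSON output by looking for spans whose parent is not part of the same trace. The sketch below assumes each span is a dict with `span_id` and `parent_id` keys. Those field names are an assumption for illustration, so check them against the real `-f json` output before relying on this.

```python
def find_orphan_spans(spans):
    """Return the ids of spans whose parent_id matches no span in the trace.

    NOTE: `span_id` and `parent_id` are assumed field names, not a documented schema.
    """
    known_ids = {span["span_id"] for span in spans}
    return [
        span["span_id"]
        for span in spans
        # A root span has no parent; only a dangling parent_id is a problem.
        if span.get("parent_id") is not None and span["parent_id"] not in known_ids
    ]


sample = [
    {"span_id": "root", "parent_id": None},
    {"span_id": "llm-call", "parent_id": "root"},
    {"span_id": "tool-call", "parent_id": "missing-parent"},
]
print(find_orphan_spans(sample))  # ['tool-call']
```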
For traces that look problematic, check for:
- Empty input/output: confirm that `autotrack_openai_calls(client)` (Python) or `experimental_telemetry` (TypeScript/Vercel AI) is configured.
- Disconnected spans: the `@langwatch.trace()` decorator is missing on the entry function.
- Missing labels: set them with `langwatch.get_current_trace().update(metadata={"labels": ["feature_name"]})`.

Use the CLI to read the integration guide for the project's framework, then compare the recommended setup with what's in the code.
langwatch docs # Browse the docs index
langwatch docs integration/python/guide # Python (or your framework)
langwatch docs integration/typescript/guide # TypeScript (or your framework)
For each issue found, apply the fix, then re-run `langwatch trace search` and `langwatch trace get` to verify it.
After the fixes, compare traces from before and after the change. You can also export a sample to diff:
langwatch trace export --format jsonl --limit 50 -o traces.jsonl
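An exported file can be reduced to a few counters so that a before/after comparison is a matter of comparing numbers. This is a minimal sketch, assuming each line of `traces.jsonl` is one JSON object with `input`, `output`, and `metadata` keys, with `labels` and `user_id` stored under `metadata`. These field names are assumptions, so adapt them to the real export.

```python
import json


def audit_traces(lines):
    """Count common instrumentation gaps in an iterable of JSONL lines.

    NOTE: the field names used here are assumed, not taken from the export schema.
    """
    counts = {"total": 0, "empty_io": 0, "no_labels": 0, "no_user_id": 0}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        trace = json.loads(line)
        metadata = trace.get("metadata") or {}
        counts["total"] += 1
        if not trace.get("input") or not trace.get("output"):
            counts["empty_io"] += 1
        if not metadata.get("labels"):
            counts["no_labels"] += 1
        if not metadata.get("user_id"):
            counts["no_user_id"] += 1
    return counts


def audit_file(path):
    """Audit an export written by `langwatch trace export ... -o <path>`."""
    with open(path, encoding="utf-8") as handle:
        return audit_traces(handle)
```

Running `audit_file` on one export taken before the fix and one taken after shows whether the counters for each gap actually went down.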
| Issue | Cause | Fix |
|---|---|---|
| All traces show `<empty>` input/output | Missing autotrack or telemetry config | Add `autotrack_openai_calls(client)` or `experimental_telemetry: { isEnabled: true }` |
| Spans not connected to traces | Missing `@langwatch.trace()` on entry function | Add the trace decorator to the main function |
| No labels on traces | Labels not set in trace metadata | Add `metadata={"labels": ["feature"]}` to the trace update |
| Missing `user_id` | User ID not passed to trace | Add `user_id` to trace metadata |
| Traces from different calls merged | Missing `langwatch.setup()` or trace context not propagated | Ensure `langwatch.setup()` is called at startup |