| name | datadog-query-recipes |
| description | Langfuse-specific Datadog query recipes for production telemetry research. Use when asked to investigate tenant or project activity, public API endpoint usage, queue consumer behavior, spans, logs, metrics, or ad hoc production questions across prod-us, prod-eu, prod-hipaa, and prod-jp. This skill is for reusable query shapes and measured research; pair it with debug-issue-with-datadog when the task is an incident or root-cause analysis. |
Datadog Query Recipes
Use this skill for Langfuse production telemetry research where the main work is
finding the right Datadog data path. Keep findings evidence-based and include
the exact Datadog links or query shapes that support the answer.
Required Scope
Unless the user explicitly narrows the scope, cover every production
environment:
- prod-us
- prod-eu
- prod-hipaa
- prod-jp
Query both Datadog sites when needed. Default to the EU site for prod-eu and
the US site for the other prod environments, but verify with a small count or
facet query before concluding an environment has no data.
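The default routing above can be sketched as a small lookup plus one count query per environment. This is a minimal sketch, not Langfuse tooling: the hostnames assume the standard Datadog US1 and EU1 domains, and `SITE_BY_ENV` and `count_check_plan` are hypothetical names introduced for illustration.

```python
from collections import defaultdict

# Assumption: standard Datadog US1 and EU1 hostnames. Confirm the actual
# sites in use before relying on this mapping.
US_SITE = "datadoghq.com"
EU_SITE = "datadoghq.eu"

# Default site per production environment, following the rule above.
SITE_BY_ENV = {
    "prod-us": US_SITE,
    "prod-eu": EU_SITE,
    "prod-hipaa": US_SITE,
    "prod-jp": US_SITE,
}


def count_check_plan(envs=None):
    """Group small `env:<name>` count queries by the site to try first."""
    plan = defaultdict(list)
    for env in envs or SITE_BY_ENV:
        plan[SITE_BY_ENV[env]].append(f"env:{env}")
    return dict(plan)


if __name__ == "__main__":
    for site, queries in count_check_plan().items():
        print(site, queries)
```

If a count comes back empty on the default site, rerun the same query on the other site before reporting that the environment has no data.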
Before querying live Datadog, load the relevant Datadog MCP guidance for each
data domain you need: traces, logs, metrics, or visualizations.
Workflow
- Identify the entity and signal: tenant ID, org ID, project ID, route, queue,
service, error class, or metric.
- Read only the relevant reference:
- Start with aggregate queries, grouped by environment, service, route,
queue, project, org, status, or error facets as appropriate.
- Fetch raw spans, logs, or traces only after aggregation identifies the
cluster or sample you need.
- For tenant-specific HTTP usage, prefer trace correlation over single-span
queries when tenant tags and route tags live on different spans.
- Report the windows, environments, sites, query links, and any sampling or
missing-data caveats.
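When tenant tags and route tags sit on different spans, trace correlation amounts to a join on trace ID. The sketch below shows that join over span records that have already been fetched from Datadog. The field names (`trace_id`, `project_id`, `http_route`) and the sample routes are hypothetical placeholders, not Langfuse's actual span attributes.

```python
from collections import Counter


def routes_for_project(spans, project_id):
    """Count routes seen in traces that contain a span tagged with project_id.

    `spans` is an iterable of dicts. Field names are illustrative assumptions.
    """
    spans = list(spans)
    # Pass 1: collect trace IDs where any span carries the tenant tag.
    tenant_traces = {
        s["trace_id"] for s in spans if s.get("project_id") == project_id
    }
    # Pass 2: keep route-tagged spans from those traces. Each (trace, route)
    # pair is counted once so nested or retried spans do not inflate totals.
    seen = {
        (s["trace_id"], s["http_route"])
        for s in spans
        if s["trace_id"] in tenant_traces and s.get("http_route")
    }
    return Counter(route for _, route in seen)


if __name__ == "__main__":
    sample = [
        {"trace_id": "t1", "http_route": "/example/a"},
        {"trace_id": "t1", "project_id": "proj-a"},
        {"trace_id": "t2", "http_route": "/example/b"},
        {"trace_id": "t2", "project_id": "proj-b"},
        {"trace_id": "t3", "http_route": "/example/a"},
        {"trace_id": "t3", "project_id": "proj-a"},
    ]
    print(routes_for_project(sample, "proj-a"))
```

A single-span query filtering on both the project and the route would match nothing in this sample, because no span carries both fields. That is the case where the trace-level join is needed.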
When To Use Other Skills
- Use debug-issue-with-datadog when a Linear issue, GitHub issue, incident
report, or monitor needs root-cause analysis and patch recommendations.
- Use detect-prod-regressions when the user asks for a proactive production
sweep or baseline comparison.
- Use linear-bug-triage only after a human approves sharing measured findings
in Linear.
Output Expectations
Summarize what was checked, including:
- Datadog site and env values covered.
- Time windows.
- Core filters or metrics used.
- Count, rate, latency, queue depth, trace sample, or "No measurements found".
- Datadog links or trace IDs that let the human rerun the query.
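One way to keep summaries consistent is to capture each checked environment as a small record and render it the same way every time. The following is a minimal sketch under that assumption. The `Finding` class and its field names are hypothetical and not part of any Langfuse or Datadog API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Finding:
    """One checked site/env pair. All field names are illustrative."""

    site: str
    env: str
    window: str
    query: str
    measurement: Optional[str] = None  # e.g. a count, rate, or latency
    links: List[str] = field(default_factory=list)

    def render(self) -> str:
        # Report the absence of data explicitly instead of omitting the row.
        result = self.measurement or "No measurements found"
        refs = ", ".join(self.links) or "none recorded"
        return "\n".join(
            [
                f"Site: {self.site}",
                f"Env: {self.env}",
                f"Window: {self.window}",
                f"Query: {self.query}",
                f"Result: {result}",
                f"Links or trace IDs: {refs}",
            ]
        )


if __name__ == "__main__":
    empty = Finding("datadoghq.eu", "prod-eu", "last 24h", "env:prod-eu")
    print(empty.render())
```

Rendering "No measurements found" for an empty result makes it clear that the environment was checked and returned nothing, as opposed to having been skipped.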