
debug-trace

Name: Debug Trace
Author: Arize-ai

// Diagnose failure modes by systematically investigating traces. Trigger when the user explicitly asks for cross-trace diagnosis: "what's going wrong?", "were there errors?", "debug this", "where is my agent struggling?". Do NOT trigger on: (1) advice questions ("what should I do?"), (2) statistical questions ("what's the average latency?"), (3) summarize requests, (4) trace filtering ("show me traces with errors"), (5) vague questions ("is there a problem?"), (6) unrelated requests.


stars: 9,927

forks: 905

updated: May 28, 2026 at 19:24

SKILL.md



Orientation

Your goal is to identify common failure modes and provide prioritized, actionable recommendations. If you already have a specific trace in context, skip the Steps below — apply the failure mode checklist directly to that trace and report findings. Otherwise, build a representative picture across multiple traces: start broad, inspect selectively, and stop once the main issue categories are clear.

Common Failure Modes to Watch For

This list is non-exhaustive. Use it as a starting checklist, not a complete taxonomy.

Explicit errors — exceptions, error status codes, or error messages in tool call spans, LLM spans, or retriever spans
Cost and latency — unusually high token counts or slow spans that suggest room for efficiency improvements
Retrieval quality — irrelevant, missing, or low-scoring chunks in RAG applications
LLM response quality — hallucination, factual incorrectness, wrong tone, inappropriate refusals
Tool use problems — wrong tool selected, malformed invocation, or poor handling of the tool response
Trajectory problems — the agent took an inefficient path, got stuck in a loop, or failed to complete its task
Instrumentation gaps — the application is poorly traced, leaving visibility gaps that make it difficult to understand behavior

Steps

Default query budget: 1–3 orientation queries, 3–5 aggregate or sampling queries, 3–7 targeted drilldowns. Avoid more than 20 GraphQL calls without summarizing progress. Avoid reading more than 5–10 full span inputs or outputs. Prefer fewer, richer queries — use GraphQL aliases to batch independent lookups. Use available GraphQL recipes before writing new queries.
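The default budget above can be tracked with a small counter. This is a minimal sketch, not part of Phoenix or the phoenix-gql CLI: the phase names, the `QueryBudget` class, and the choice to use the upper bound of each range as the cap are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical phase names; caps are the upper bounds of the default budget.
PHASE_CAPS = {"orientation": 3, "aggregate": 5, "drilldown": 7}
CHECKPOINT_EVERY = 20  # summarize progress once this many calls have been made


@dataclass
class QueryBudget:
    used: dict = field(default_factory=lambda: {p: 0 for p in PHASE_CAPS})

    @property
    def total(self) -> int:
        return sum(self.used.values())

    def record(self, phase: str) -> list[str]:
        """Count one GraphQL call and return advisory warnings (possibly none)."""
        if phase not in PHASE_CAPS:
            raise ValueError(f"unknown phase: {phase}")
        self.used[phase] += 1
        warnings = []
        if self.used[phase] > PHASE_CAPS[phase]:
            warnings.append(
                f"{phase} budget exceeded ({self.used[phase]}/{PHASE_CAPS[phase]})"
            )
        if self.total % CHECKPOINT_EVERY == 0:
            warnings.append("summarize progress before issuing more queries")
        return warnings
```

The warnings are advisory rather than hard stops, since the budget is described as a default and not a limit.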

Orient — Read /phoenix/agent-start.md. Use the phoenix-gql CLI to get a compact project overview in one aliased query: trace count, span count, latency quantiles, token totals, annotation names, slow traces, high-token traces, error spans.
Select — Choose a representative sample. Prioritize slow traces, high-token traces, and errored traces; include a few normal traces for comparison. Prefer diversity over volume.
Open-code — For each trace, write free-form notes on problems, surprises, and incorrect behaviors. Focus on the first failure in a trace, since upstream errors often cause downstream issues. Note independent downstream failures only when they reveal a separate root cause.
Axial-code — Cluster your notes into named failure categories. Let categories emerge from the data. Distinguish exact counts from sampled or estimated counts.
Summarize — Report findings using the output format below.
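The Select step can be sketched as a small sampling routine. This is an illustrative sketch under stated assumptions: `TraceSummary` and its fields are hypothetical and do not mirror the Phoenix GraphQL schema, and "normal" is approximated here as non-errored traces closest to the median latency.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class TraceSummary:
    # Hypothetical record; populate it from whatever the overview query returns.
    trace_id: str
    latency_ms: float
    total_tokens: int
    has_error: bool


def select_sample(traces, per_bucket=2, normal=2):
    """Pick errored, slow, and high-token traces, plus a few typical ones."""
    if not traces:
        return []
    picked = {}  # trace_id -> TraceSummary; insertion-ordered, no duplicates

    def take(candidates, n):
        added = 0
        for t in candidates:
            if added == n:
                break
            if t.trace_id not in picked:
                picked[t.trace_id] = t
                added += 1

    take([t for t in traces if t.has_error], per_bucket)
    take(sorted(traces, key=lambda t: t.latency_ms, reverse=True), per_bucket)
    take(sorted(traces, key=lambda t: t.total_tokens, reverse=True), per_bucket)
    # Comparison set: non-errored traces nearest the median latency.
    mid = median(t.latency_ms for t in traces)
    healthy = [t for t in traces if not t.has_error]
    take(sorted(healthy, key=lambda t: abs(t.latency_ms - mid)), normal)
    return list(picked.values())
```

Deduplicating by `trace_id` keeps the sample diverse: a trace that is both slow and errored occupies only one slot.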

Observation Journal

Output this table inline in the conversation as you work — it stays compact by design and needs to be visible for axial-coding. Only write it to a file if the user asks for exhaustive analysis across a large number of traces.

For each inspected trace, add a row. Keep observations free-form. Fill in tentative_category only when the pattern is clear — leave it blank otherwise.

trace_id	observations	tentative_category

Use this journal as the input to axial-coding.
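One way to keep the journal and feed it into axial-coding is sketched below. The `JournalRow` class and both helpers are hypothetical names introduced for illustration; only the three column names come from the table header above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class JournalRow:
    trace_id: str
    observations: str
    tentative_category: str = ""  # left blank until the pattern is clear


def render_journal(rows):
    """Render the journal as a tab-separated table for inline display."""
    lines = ["trace_id\tobservations\ttentative_category"]
    lines += [
        f"{r.trace_id}\t{r.observations}\t{r.tentative_category}" for r in rows
    ]
    return "\n".join(lines)


def tally_categories(rows):
    """Count rows per tentative category, ignoring uncategorized rows."""
    return Counter(r.tentative_category for r in rows if r.tentative_category)
```

Counts produced this way describe the inspected sample only, so they should be reported as sampled counts rather than exact project-wide counts.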

Output Format

Analysis scope — brief summary of what was analyzed: number of traces examined, time range if relevant, any filters applied
Findings table — one row per issue category with: label, short description, occurrence count, one or two representative span (or trace) links — see Linking to Findings below
Recommendations — for each issue, a concrete suggested fix (prompt change, parameter adjustment, tool fix, instrumentation improvement, etc.), if one can be identified

Linking to Findings

Prefer linking to the specific span that exhibits the issue over the parent trace whenever possible — span links land the user on the exact node, while a trace link forces them to hunt for the relevant span. Fall back to a trace link only when no single span captures the issue (e.g., a trajectory problem spanning many spans).

Use Phoenix's root-relative redirect URLs with the OpenTelemetry IDs returned by GraphQL — no project lookup required. Read the OTel IDs from the spanId and traceId GraphQL fields (the hex OTel IDs), not the id field (which is a Relay node ID and will not resolve):

Span: [short description](/redirects/spans/<spanId>)
Trace: [short description](/redirects/traces/<traceId>)
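The two link formats above can be produced by small helpers that also guard against passing a Relay node ID by mistake. This is a sketch: the function names are hypothetical, and the length checks assume standard OpenTelemetry IDs (16 hex characters for a span, 32 for a trace).

```python
import re

# OpenTelemetry span IDs are 8 bytes (16 hex chars); trace IDs are 16 bytes (32).
_SPAN_ID = re.compile(r"[0-9a-fA-F]{16}")
_TRACE_ID = re.compile(r"[0-9a-fA-F]{32}")


def span_link(description: str, span_id: str) -> str:
    """Build a root-relative markdown link to a span."""
    if not _SPAN_ID.fullmatch(span_id):
        raise ValueError(f"not a hex OTel span ID: {span_id!r}")
    return f"[{description}](/redirects/spans/{span_id})"


def trace_link(description: str, trace_id: str) -> str:
    """Build a root-relative markdown link to a trace."""
    if not _TRACE_ID.fullmatch(trace_id):
        raise ValueError(f"not a hex OTel trace ID: {trace_id!r}")
    return f"[{description}](/redirects/traces/{trace_id})"
```

A base64-style Relay node ID fails the hex check, so the mistake surfaces before a dead link reaches the report.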

Caveats and Pitfalls

Existing evals and annotations are useful signal, but treat them as one input among many — they may be incomplete or incorrect.
Span status codes are not a reliable proxy for whether an error actually occurred. An exception in the output may be expected behavior; a success status code may mask an error message buried in the span attributes.
Do not overload your context by reading too many full spans. Read enough to support meaningful recommendations, then stop.
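The status-code caveat can be illustrated with a check that looks past the status field. This is a rough sketch: the span is modeled as a plain dict with hypothetical keys (`status_code`, `output`, `attributes`), and keyword matching is a crude heuristic. A flagged span is a candidate for closer reading, not a confirmed failure.

```python
# Hypothetical markers; extend or replace them for the application at hand.
ERROR_MARKERS = ("error", "exception", "traceback")


def error_signals(span: dict) -> list[str]:
    """Return the places in a span where error evidence appears."""
    signals = []
    if span.get("status_code") == "ERROR":
        signals.append("status_code")
    texts = {"output": str(span.get("output", ""))}
    for key, value in span.get("attributes", {}).items():
        texts[f"attributes.{key}"] = str(value)
    for where, text in texts.items():
        lowered = text.lower()
        if any(marker in lowered for marker in ERROR_MARKERS):
            signals.append(where)
    return signals
```

For example, a span with a success status but an attribute value such as "Error: rate limit exceeded" is flagged through its attribute, which is the case a status-only check would miss.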


package.json

"author": "Arize-ai"

"repository": "Arize-ai/phoenix"

