Run any Skill in Manus with one click

$pwd:

phoenix-cli

Name: Phoenix Cli
Author: Arize-ai

// Debug LLM applications using the Phoenix CLI. Fetch traces, analyze errors, structure trace review with open coding and axial coding, inspect datasets, review experiments, query annotation configs, and use the GraphQL API. Use whenever the user is analyzing traces or spans, investigating LLM/agent failures, deciding what to do after instrumenting an app, building failure taxonomies, choosing what evals to write, or asking "what's going wrong", "what kinds of mistakes", or "where do I focus" — even without naming a technique.

Run Skill in Manus

$ git log --oneline --stat

stars:9,927

forks:905

updated:May 27, 2026 at 18:08

File Explorer

3 files

SKILL.md

readonly

related-skills.json

same repository

phoenix-pxi.md

from "Arize-ai/phoenix"

Development guide for the Phoenix PXI agent. Use when modifying PXI-specific frontend or backend behavior, extending PXI tool wiring, updating PXI runtime capabilities, or changing the PXI agent request/dispatch flow. Start here for PXI-specific workflows, then read the relevant resource file for the layer you are changing.

2026-05-309.9k

phoenix-frontend.md

from "Arize-ai/phoenix"

Frontend development guidelines for the Phoenix AI observability platform. Use when writing, reviewing, or modifying React components, TypeScript code, styles, or UI features in the app/ directory. Triggers on any frontend task — new components, UI changes, styling, accessibility fixes, form handling, or component refactoring. Also use when the user asks about frontend conventions or component patterns for this project. For design system rules (error display, layout, dialogs, tokens), use the phoenix-design skill.

2026-05-299.9k

phoenix-design.md

from "Arize-ai/phoenix"

Design system conventions for the Phoenix frontend — layout, dialogs, error display, BEM CSS class naming, and CSS design tokens. Use when building UI, naming CSS classes, creating or consuming tokens, handling errors, or designing dialog interactions in app/src/.

2026-05-299.9k

debug-trace.md

from "Arize-ai/phoenix"

Diagnose failure modes by systematically investigating traces. Trigger when the user explicitly asks for cross-trace diagnosis: "what's going wrong?", "were there errors?", "debug this", "where is my agent struggling?". Do NOT trigger on: (1) advice questions ("what should I do?"), (2) statistical questions ("what's the average latency?"), (3) summarize requests, (4) trace filtering ("show me traces with errors"), (5) vague questions ("is there a problem?"), (6) unrelated requests.

2026-05-289.9k

playground.md

from "Arize-ai/phoenix"

Author, edit, or iterate on prompts in the Phoenix prompt playground. Load before any playground tool call, including single-shot prompt rewrites.

2026-05-289.9k

phoenix-tracing.md

from "Arize-ai/phoenix"

OpenInference semantic conventions and instrumentation for Phoenix AI observability. Use when implementing LLM tracing, creating custom spans, or deploying to production.

2026-05-239.9k

package.json

"author": "Arize-ai"

"repository": "Arize-ai/phoenix"

View GitHub Repository View Creator Repositories

$ install --global

$ download --local

Run Skill in Manus

$ useful --forSOC

Software DevelopersComputer and Mathematical Occupations15-1252L4

name	phoenix-cli
description	Debug LLM applications using the Phoenix CLI. Fetch traces, analyze errors, structure trace review with open coding and axial coding, inspect datasets, review experiments, query annotation configs, and use the GraphQL API. Use whenever the user is analyzing traces or spans, investigating LLM/agent failures, deciding what to do after instrumenting an app, building failure taxonomies, choosing what evals to write, or asking "what's going wrong", "what kinds of mistakes", or "where do I focus" — even without naming a technique.
license	Apache-2.0
compatibility	Requires Node.js (for npx) or global install of @arizeai/phoenix-cli. Optionally requires jq for JSON processing.
metadata	{"author":"arize-ai","version":"3.3.0"}

Phoenix CLI

Invocation

px <resource> <action>                          # if installed globally
npx @arizeai/phoenix-cli <resource> <action>    # no install required

The CLI uses singular resource commands with subcommands like list and get:

px trace list
px trace get <trace-id>
px trace annotate <trace-id>
px trace add-note <trace-id>
px trace-annotations delete
px span list
px span annotate <span-id>
px span add-note <span-id>
px span-annotations delete
px session list
px session get <session-id>
px session annotate <session-id>
px session add-note <session-id>
px session-annotations delete
px dataset list
px dataset get <name>
px project list
px project get <name>
px annotation-config list
px auth status
px profile list
px profile show [name]
px profile create <name>
px profile use <name>
px profile edit <name>
px profile delete <name>

Setup

export PHOENIX_HOST=http://localhost:6006
export PHOENIX_PROJECT=my-project
export PHOENIX_API_KEY=your-api-key  # if auth is enabled

Always use --format raw --no-progress when piping to jq.

Quick Reference

Task	Files
Look at sampled traces, spans, or sessions and write specific notes about what went wrong (no taxonomy yet)	references/open-coding
Group those notes into a structured failure taxonomy and quantify what matters	references/axial-coding

Both stages tag every artifact with one shared coding annotation identifier (descriptive shape, e.g. coding-run:chatbot-context-loss-2026-05-06) so the run is queryable, reversible, and viewable as a unit. Pass --identifier <value> explicitly on every px call — shell inheritance is unreliable across agent harnesses. Open coding writes notes via px ... add-note and records a small local JSONL sidecar at .px/coding/<sanitized-identifier>.jsonl; axial coding reads that sidecar as the deterministic handoff and records labels in .px/coding/<sanitized-identifier>-axial.jsonl. Pick the identifier once per run (see references/open-coding.md), then share the Phoenix UI link from the wrap-up section. Revert is opt-in and runs three identifier-bound DELETEs only after explicit user confirmation.

Workflow term vs. server annotation name. The skill prose calls this value the coding annotation identifier (shell-variable hint: CODING_ANNOTATION_IDENTIFIER). The server-side annotation NAME used for the UI filter is unchanged — coding_session_id — for data compatibility with rows already written by previous runs. Don't try to rename the server-side annotation; treat the asymmetry as load-bearing.

Workflows

"What do I do after instrumenting?" / "Where do I focus?" / "What's going wrong?" open-coding → axial-coding → build evals for the top categories.

Reference Categories

Prefix	Description
`references/open-coding`	Free-form notes against sampled traces, spans, or sessions — reach for it whenever the user wants to make sense of LLM traffic but has no failure categories yet. Includes a unit-of-analysis diagnostic so the workflow runs at the level the failure modes actually live at (trace for stateless single-shot calls, session for multi-turn agents, span for mechanical/in-isolation failures).
`references/axial-coding`	Inductive grouping of notes into a MECE taxonomy with counts — reach for it whenever the user has observations and needs categories or eval targets

Auth

px auth status                                # check connection and authentication
px auth status --endpoint http://other:6006   # check a specific endpoint
px auth status --profile staging              # check a named profile's connection

Profiles

Named profiles let you switch between multiple Phoenix instances (local, staging, cloud) without juggling environment variables. Profiles are stored in ~/.px/settings.json (or $XDG_CONFIG_HOME/px/settings.json).

Configuration priority (highest to lowest): CLI flags > env vars > active profile > built-in defaults.

px profile list                              # list all profiles (shows active profile)
px profile show                              # show the active profile's settings
px profile show staging                      # show a named profile's settings
px profile create prod --endpoint https://app.phoenix.arize.com --api-key <key> --activate
px profile create local --endpoint http://localhost:6006 --project my-app
px profile use prod                          # switch the active profile
px profile edit prod                         # open profile JSON in $EDITOR (validates on save)
px profile delete prod --yes                 # delete a profile (--yes skips confirmation)

Use --profile <name> on any command to target a specific profile without changing the active one:

px trace list --profile staging --limit 10 --format raw --no-progress | jq .
px auth status --profile prod

px profile create options: --endpoint <url>, --project <name>, --api-key <key>, --header <key=value> (repeatable), --activate.

Projects

px project list                                            # list all projects (table view)
px project list --format raw --no-progress | jq '.[].name' # project names as JSON
px project get my-project --format raw --no-progress       # single record by exact name
px project get my-project --format raw --no-progress | jq -r '.id'  # extract project id

project get exits with ExitCode.FAILURE (1) on a name miss and writes a StructuredError {error, code: "FAILURE", hint} to stderr in --format json|raw.

Traces

px trace list --limit 20 --format raw --no-progress | jq .
px trace list --last-n-minutes 60 --limit 20 --format raw --no-progress | jq '.[] | select(.status == "ERROR")'
px trace list --since 2025-01-15T00:00:00Z --limit 50 --format raw --no-progress | jq .
px trace list --format raw --no-progress | jq 'sort_by(-.duration) | .[0:5]'
px trace list --include-notes --format raw --no-progress | jq '.[].notes'
px trace get <trace-id> --format raw | jq .
px trace get <trace-id> --format raw | jq '.spans[] | select(.status_code != "OK")'
px trace get <trace-id> --include-notes --format raw | jq '.notes'
px trace annotate <trace-id> --name reviewer --label pass
px trace annotate <trace-id> --name reviewer --score 0.9 --format raw --no-progress
px trace annotate <trace-id> --name reviewer --label pass --identifier "<coding-annotation-id>"  # tag with a coding annotation identifier
px trace add-note <trace-id> --text "needs follow-up"
px trace add-note <trace-id> --text "needs follow-up" --identifier "<coding-annotation-id>"  # tag + upsert on identifier
px trace-annotations delete --identifier "<coding-annotation-id>" --all -y            # nuke every annotation tied to this coding annotation identifier

px <entity>-annotations delete requires --all or both --start-time and --end-time and emits {deleted: true, target, filter} on success.

Trace JSON shape

Trace
  traceId, status ("OK"|"ERROR"), duration (ms), startTime, endTime
  annotations[] (with --include-annotations, excludes note)
    name, result { score, label, explanation }
  notes[] (with --include-notes)
    name="note", result { explanation }
  rootSpan  — top-level span (parent_id: null)
  spans[]
    name, span_kind ("LLM"|"CHAIN"|"TOOL"|"RETRIEVER"|"EMBEDDING"|"AGENT"|"RERANKER"|"GUARDRAIL"|"EVALUATOR"|"UNKNOWN")
    status_code ("OK"|"ERROR"|"UNSET"), parent_id, context.span_id
    notes[] (with --include-notes)
      name="note", result { explanation }
    attributes
      input.value, output.value          — raw input/output
      llm.model_name, llm.provider
      llm.token_count.prompt/completion/total
      llm.token_count.prompt_details.cache_read
      llm.token_count.completion_details.reasoning
      llm.input_messages.{N}.message.role/content
      llm.output_messages.{N}.message.role/content
      llm.invocation_parameters          — JSON string (temperature, etc.)
      exception.message                  — set if span errored

Spans

px span list --limit 20                                    # recent spans (table view)
px span list --last-n-minutes 60 --limit 50                # spans from last hour
px span list --since 2025-01-15T00:00:00Z --limit 50       # spans since a timestamp
px span list --span-kind LLM --limit 10                    # only LLM spans
px span list --status-code ERROR --limit 20                # only errored spans
px span list --name chat_completion --limit 10             # filter by span name
px span list --trace-id <id> --format raw --no-progress | jq .   # all spans for a trace
px span list --parent-id null --limit 10                   # only root spans
px span list --parent-id <span-id> --limit 10              # only children of a span
px span list --include-annotations --limit 10              # include annotation scores
px span list --include-notes --limit 10                    # include span notes
px span list --attribute llm.model_name:gpt-4 --limit 10  # filter by string attribute
px span list --attribute llm.token_count.total:500 --limit 10  # filter by numeric attribute
px span list --attribute 'user.id:"12345"' --limit 10     # force string match for numeric-looking value
px span list --attribute session.id:sess:abc:123 --limit 20  # colon in value OK (split on first colon only)
px span list --attribute llm.model_name:gpt-4 --attribute session.id:abc --limit 10  # AND multiple filters
px span list output.json --limit 100                       # save to JSON file
px span list --format raw --no-progress | jq '.[] | select(.status_code == "ERROR")'
px span annotate <span-id> --name reviewer --label pass
px span annotate <span-id> --name checker --score 1 --annotator-kind CODE
px span annotate <span-id> --name reviewer --label pass --identifier "<coding-annotation-id>"  # tag with a coding annotation identifier
px span add-note <span-id> --text "verified by agent"
px span add-note <span-id> --text "verified by agent" --identifier "<coding-annotation-id>"  # tag + upsert on identifier
px span-annotations delete --identifier "<coding-annotation-id>" --all -y           # nuke every annotation tied to this coding annotation identifier

Span JSON shape

Span
  name, span_kind ("LLM"|"CHAIN"|"TOOL"|"RETRIEVER"|"EMBEDDING"|"AGENT"|"RERANKER"|"GUARDRAIL"|"EVALUATOR"|"UNKNOWN")
  status_code ("OK"|"ERROR"|"UNSET"), status_message
  context.span_id, context.trace_id, parent_id
  start_time, end_time
  attributes
    input.value, output.value          — raw input/output
    llm.model_name, llm.provider
    llm.token_count.prompt/completion/total
    llm.input_messages.{N}.message.role/content
    llm.output_messages.{N}.message.role/content
    llm.invocation_parameters          — JSON string (temperature, etc.)
    exception.message                  — set if span errored
  annotations[] (with --include-annotations, excludes note)
    name, result { score, label, explanation }
  notes[] (with --include-notes)
    name="note", result { explanation }

Sessions

px session list --limit 10 --format raw --no-progress | jq .
px session list --order asc --format raw --no-progress | jq '.[].session_id'
px session list --include-annotations --include-notes --format raw --no-progress | jq '.[].notes'
px session get <session-id> --format raw | jq .
px session get <session-id> --include-annotations --format raw | jq '.session.annotations'
px session get <session-id> --include-notes --format raw | jq '.session.notes'
px session annotate <session-id> --name reviewer --label pass
px session annotate <session-id> --name reviewer --score 0.9 --format raw --no-progress
px session annotate <session-id> --name reviewer --label pass --identifier "<coding-annotation-id>"  # tag with a coding annotation identifier
px session add-note <session-id> --text "verified by agent"
px session add-note <session-id> --text "verified by agent" --identifier "<coding-annotation-id>"  # tag + upsert on identifier
px session-annotations delete --identifier "<coding-annotation-id>" --all -y              # nuke every annotation tied to this coding annotation identifier

Session JSON shape

SessionData
  id, session_id, project_id
  start_time, end_time
  token_count_prompt, token_count_completion, token_count_total  — cumulative across all LLM spans in the session (int, default 0)
  annotations[] (with --include-annotations, excludes note)
    name, result { score, label, explanation }
  notes[] (with --include-notes)
    name="note", result { explanation }
  traces[]
    id, trace_id, start_time, end_time

Datasets / Experiments / Prompts

px dataset list --format raw --no-progress | jq '.[].name'
px dataset get <name> --format raw | jq '.examples[] | {input, output: .expected_output}'
px dataset get <name> --split train --format raw | jq .    # filter by split
px dataset get <name> --version <version-id> --format raw | jq .
px experiment list --dataset <name> --format raw --no-progress | jq '.[] | {id, name, failed_run_count}'
px experiment get <id> --format raw --no-progress | jq '.[] | select(.error != null) | {input, error}'
px prompt list --format raw --no-progress | jq '.[].name'
px prompt get <name> --format text --no-progress   # plain text, ideal for piping to AI

Annotation Configs

px annotation-config list                                           # list all configs (table view)
px annotation-config list --format raw --no-progress | jq '.[].name' # config names as JSON

GraphQL

For ad-hoc queries not covered by the commands above. Output is {"data": {...}}.

px api graphql '{ projectCount datasetCount promptCount evaluatorCount }'
px api graphql '{ projects { edges { node { name traceCount tokenCountTotal } } } }' | jq '.data.projects.edges[].node'
px api graphql '{ datasets { edges { node { name exampleCount experimentCount } } } }' | jq '.data.datasets.edges[].node'
px api graphql '{ evaluators { edges { node { name kind } } } }' | jq '.data.evaluators.edges[].node'
# evaluator kind values: "LLM" | "CODE" | "BUILTIN"
# CODE = server-side code evaluator running in a sandbox; BUILTIN = pre-built server evaluator

# Introspect any type
px api graphql '{ __type(name: "Project") { fields { name type { name } } } }' | jq '.data.__type.fields[]'

Key root fields: projects, datasets, prompts, evaluators, projectCount, datasetCount, promptCount, evaluatorCount, viewer.

Docs

Download Phoenix documentation markdown for local use by coding agents.

px docs fetch                                # fetch default workflow docs to .px/docs
px docs fetch --workflow tracing             # fetch only tracing docs
px docs fetch --workflow tracing --workflow evaluation
px docs fetch --dry-run                      # preview what would be downloaded
px docs fetch --refresh                      # clear .px/docs and re-download
px docs fetch --output-dir ./my-docs         # custom output directory

Key options: --workflow (repeatable, values: tracing, evaluation, datasets, prompts, integrations, sdk, self-hosting, all), --dry-run, --refresh, --output-dir (default .px/docs), --workers (default 10).

phoenix-cli

More from this repository

More from this repository

Phoenix CLI

Invocation

Setup

Quick Reference

Workflows

Reference Categories

Auth

Profiles

Projects

Traces

Trace JSON shape

Spans

Span JSON shape

Sessions

Session JSON shape

Datasets / Experiments / Prompts

Annotation Configs

GraphQL

Docs

Phoenix CLI

Invocation

Setup

Quick Reference

Workflows

Reference Categories

Auth

Profiles

Projects

Traces

Trace JSON shape

Spans

Span JSON shape

Sessions

Session JSON shape

Datasets / Experiments / Prompts

Annotation Configs

GraphQL

Docs