Arize-ai

Design and implementation guide for the Phoenix CLI (`px`). Covers the noun-verb command structure, dual-audience design (humans and coding agents), Commander.js patterns, configuration resolution, output formats, exit codes, and conventions for adding or modifying commands. Triggers when working on phoenix-cli commands — adding new commands, modifying existing ones, refactoring command structure, or reviewing CLI code. Also triggers on mentions of `px` commands, CLI design, or adding a new resource to the CLI.

2026-07-19

phoenix-cli

Debug LLM applications using the Phoenix CLI. Fetch traces, analyze errors, structure trace review with open coding and axial coding, inspect datasets, review experiments, query annotation configs, and use the GraphQL API. Use whenever the user is analyzing traces or spans, investigating LLM/agent failures, deciding what to do after instrumenting an app, building failure taxonomies, choosing what evals to write, or asking "what's going wrong", "what kinds of mistakes", or "where do I focus" — even without naming a technique.

2026-07-18

gh-stack

Manage stacked branches and pull requests with the gh-stack GitHub CLI extension. Use when the user wants to create, push, rebase, sync, navigate, or view stacks of dependent PRs. Triggers on tasks involving stacked diffs, dependent pull requests, branch chains, or incremental code review workflows.

2026-07-16

phoenix-evals-new-metric

Create a new built-in classification evaluator for Phoenix evals. Use this skill whenever the user asks to create a new eval, build a new metric, add a new builtin evaluator, create an LLM-as-a-judge metric, or add a new classification evaluator to Phoenix.

2026-07-16

phoenix-evals

Data Scientists

Build and run evaluators for AI/LLM applications using Phoenix.

phoenix-tracing

OpenInference semantic conventions and instrumentation for Phoenix AI observability. Use when implementing LLM tracing, creating custom spans, or deploying to production.

15 skills · 40 · updated 2026-06-30

Showing the top 8 of 38 collected skills in this repository.

#002

project-rosetta-stone

17% of the creator's skills

Skill

Occupation

Description

Updated

rosetta-add-framework

rosetta-add-framework-tier-build

Add a new agent framework to the Rosetta Stone repo — researches the framework, builds all three observability tiers (no-observability, phoenix, ax), tests each, runs a Playwright smoke against the UI, updates README + TODO, and raises a PR. Trigger when the user asks to "add the <framework> framework", "implement <framework>", "wire up <framework>", or similar. The framework must be one of the Arize-supported agent frameworks (see TODO below).

2026-06-30

Build a single tier (no-observability, phoenix, or ax) for a new framework. Clones the closest existing tier, swaps in framework-specific agent.py / tools.py / requirements.txt, and (for observability tiers) adds tracing.py + main.py wiring + eval-harness scripts. Part of the rosetta-add-framework flow; can be invoked standalone to rebuild a single tier from scratch.

2026-06-30

check-models

rosetta-add-framework-playwright

Find and update out-of-date OpenAI and Anthropic model references (in docs, MDX, notebooks, and code) to the latest size-equivalent models, and apply the code changes each new generation requires (e.g. max_tokens → max_completion_tokens for GPT-5). Use when asked to "check the models", "update model versions", "are these models current", "migrate the models in the docs/tutorials", or before publishing content that names a model.

2026-06-25

Run a public-flow Playwright smoke test against a freshly-built framework tier's Next.js frontend. Covers home page rendering + product browsing — the parts that don't require X/Twitter OAuth. The Playwright project (package.json, config, tests) lives inside this skill directory and is checked into the repo. Part of the rosetta-add-framework flow.

2026-05-27

rosetta-pr-screenshots

rosetta-add-framework-docs

Capture AX trace UI, Phoenix trace UI, and Wonder Toys app UI screenshots for a framework's PR, then upload as GitHub release assets and embed in the PR body. Called automatically by rosetta-add-framework-docs as part of new-framework PRs; can also be invoked standalone to retrofit existing PRs. Cross-platform — uses Playwright end-to-end.

2026-05-27

rosetta-add-framework-discover

Finalise a newly-added framework — updates the README's supported-frameworks table, directory tree, and per-framework "what differs" section, marks off the framework in the orchestrator skill's embedded TODO, commits per tier, and raises a PR. Part of the rosetta-add-framework flow.

2026-05-20

rosetta-add-framework-tier-test

Refresh the list of agent frameworks supported by Arize tracing and diff against what's already in the repo. Pulls live data from https://arize.com/docs/llms.txt and produces a clean to-do list. Part of the rosetta-add-framework flow; can also be invoked standalone to answer "what frameworks are left to add?"

2026-05-19

12 skills · 385 · updated 2026-07-17

Test a freshly-built tier for a new framework — boots the backend, smoke-tests the chat endpoint, runs synthetic requests, and (for phoenix) runs the eval harness. Verifies traces land in the right project. Part of the rosetta-add-framework flow; can be invoked standalone after a build to validate.

2026-05-19

Showing the top 8 of 15 collected skills in this repository.

#003

arize-skills

13% of the creator's skills

Skill

Occupation

Description

Updated

arize-annotation

Creates and manages annotation configs (categorical, continuous, freeform label schemas) and annotation queues (human review workflows) on Arize. Applies human annotations to project spans via the Python SDK. Use when the user mentions annotation config, annotation queue, label schema, human feedback, bulk annotate spans, update_annotations, labeling queue, annotate record, or human review.

2026-07-17

arize-instrumentation

Network and Computer Systems Administrators

Adds Arize AX tracing to an LLM application for the first time. Follows a two-phase agent-assisted flow to analyze the codebase then implement instrumentation after user confirmation. Use when the user wants to instrument their app, add tracing from scratch, set up LLM observability, integrate OpenTelemetry or openinference, or get started with Arize tracing.

2026-07-17

arize-admin

Manages Arize users, organizations, spaces, projects, roles, role bindings, resource restrictions, and API keys via the ax CLI. Use for enterprise admin workflows: inviting and offboarding users, onboarding new teams, creating custom roles for SAML/SSO mappings, assigning roles to users, restricting project-level access, and managing service keys for multi-tenant architectures. Covers ax users, ax organizations, ax spaces, ax projects, ax roles, ax role-bindings, and ax api-keys.

arize-ai-provider-integration

Creates, reads, updates, and deletes Arize AI integrations that store LLM provider credentials used by evaluators and other Arize features. Supports any LLM provider (e.g. OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Vertex AI, Gemini, NVIDIA NIM). Use when the user mentions AI integration, LLM provider credentials, create integration, list integrations, update credentials, delete integration, or connecting an LLM provider to Arize.

Financial and Investment Analysts

arize-compliance-audit

INVOKE THIS SKILL when auditing an AI agent or LLM app for regulatory compliance. Covers EU AI Act, GPAI Code of Practice, GDPR, NIST AI RMF, Colorado AI Act, HIPAA, and ISO 42001. Scans the codebase for compliance gaps, cross-references Arize instrumentation for audit trail coverage, and produces an actionable remediation checklist tailored to the selected frameworks.

arize-dataset

Creates, manages, and queries Arize datasets and examples. Covers dataset CRUD, appending examples, exporting data, and file-based dataset creation using the ax CLI. Use when the user needs test data, evaluation examples, or mentions create dataset, list datasets, export dataset, append examples, dataset version, golden dataset, or test set.

arize-evaluator

Handles LLM-as-judge and code evaluator workflows on Arize including creating/updating evaluators, running evaluations on spans or experiments, managing tasks, trigger-run operations, column mapping, and continuous monitoring. Use when the user mentions create evaluator, LLM judge, code evaluator, hallucination, faithfulness, correctness, relevance, run eval, score spans, score experiment, trigger-run, column mapping, continuous monitoring, or improve evaluator prompt.

arize-experiment

Creates, runs, and analyzes Arize experiments for evaluating and comparing model performance. Covers experiment CRUD, exporting runs, comparing results, and evaluation workflows using the ax CLI. Use when the user mentions create experiment, run experiment, compare models, model performance, evaluate AI, experiment results, benchmark, A/B test models, or measure accuracy.

9 skills · 237 · updated 2026-07-17

Showing the top 8 of 12 collected skills in this repository.

#004

coding-harness-tracing

10% of the creator's skills

Skill

Occupation

Description

Updated

manage-cursor-tracing

Other Computer Occupations

Set up and configure Arize tracing for Cursor IDE sessions. Use when users want to set up tracing, configure Arize AX or Phoenix for Cursor, enable/disable tracing, or troubleshoot tracing issues. Triggers on "set up cursor tracing", "configure Arize for Cursor", "configure Phoenix for Cursor", "enable cursor tracing", "setup-cursor-tracing", or any request about connecting Cursor to Arize or Phoenix for observability.

2026-07-17

manage-omp-tracing

manage-claude-code-tracing

Set up and configure Arize tracing for Oh My Pi (omp) terminal coding sessions. Use when users want to set up tracing, configure Arize AX or Phoenix for Oh My Pi, enable/disable omp tracing, or troubleshoot tracing issues. Triggers on "set up omp tracing", "configure Arize for Oh My Pi", "configure Phoenix for omp", "enable omp tracing", "setup-omp-tracing", or any request about connecting omp / Oh My Pi to Arize or Phoenix for observability.

2026-07-02

Set up and configure Arize tracing for Claude Code sessions or Agent SDK applications. Use when users want to set up tracing, configure Arize AX or Phoenix, create a new Arize project, get an API key, enable/disable tracing, or troubleshoot tracing issues. Triggers on "set up tracing", "configure Arize", "configure Phoenix", "enable tracing", "setup-claude-code-tracing", "create Arize project", "get Arize API key", "agent sdk tracing", or any request about connecting Claude Code or the Agent SDK to Arize or Phoenix for observability.

manage-codex-tracing

Set up and configure Arize tracing for OpenAI Codex CLI sessions. Use when users want to set up Codex tracing, configure Arize AX or Phoenix for Codex, enable/disable tracing, or troubleshoot Codex tracing issues. Triggers on "set up codex tracing", "configure Arize for Codex", "configure Phoenix for Codex", "enable codex tracing", "setup-codex-tracing", or any request about connecting Codex to Arize or Phoenix for observability.

manage-copilot-tracing

Set up and configure Arize tracing for GitHub Copilot sessions. Use when users want to set up tracing, configure Arize AX or Phoenix for Copilot, enable/disable tracing, or troubleshoot tracing issues. Triggers on "set up copilot tracing", "configure Arize for Copilot", "configure Phoenix for Copilot", "enable copilot tracing", "setup-copilot-tracing", or any request about connecting GitHub Copilot to Arize or Phoenix for observability.

manage-gemini-tracing

Set up and configure Arize tracing for Gemini CLI sessions. Use when users want to set up tracing, configure Arize AX or Phoenix for Gemini, enable/disable tracing, or troubleshoot tracing issues. Triggers on "set up gemini tracing", "configure Arize for Gemini", "configure Phoenix for Gemini", "enable gemini tracing", "setup-gemini-tracing", or any request about connecting Gemini CLI to Arize or Phoenix for observability.

manage-kiro-tracing

Set up and configure Arize tracing for Kiro CLI sessions. Use when users want to set up Kiro tracing, configure Arize AX or Phoenix for Kiro, enable/disable tracing, choose or set a default traced agent, or troubleshoot Kiro tracing issues. Triggers on "set up kiro tracing", "configure Arize for Kiro", "configure Phoenix for Kiro", "enable kiro tracing", "setup-kiro-tracing", "kiro agent tracing", or any request about connecting Kiro CLI to Arize or Phoenix for observability.

manage-opencode-tracing

Set up and configure Arize tracing for opencode terminal coding sessions. Use when users want to set up tracing, configure Arize AX or Phoenix for opencode, enable/disable tracing, or troubleshoot tracing issues. Triggers on "set up opencode tracing", "configure Arize for opencode", "configure Phoenix for opencode", "enable opencode tracing", "setup-opencode-tracing", or any request about connecting opencode to Arize or Phoenix for observability.

5 skills · 1.1k · 276 · updated 2026-06-22

Showing the top 8 of 9 collected skills in this repository.

#005

openinference

5.6% of the creator's skills

Skill

Occupation

Description

Updated

python-canary-fix

Investigate and propose fixes for Python canary cron failures in the openinference repo. Use when the user mentions Python canary failures, Python cron failures, or when the auto-fix CI job reports Python instrumentation canary issues.

2026-06-22

genai-conformance

Run, interpret, and iterate on the OpenInference GenAI conformance MVP at python/openinference-instrumentation/scripts/conformance/. Use when the user mentions GenAI conformance, OTel GenAI semantic conventions, Weaver registry live-check, the dual-write conversion (`_genai_conversion.py`, `enable_genai_semconv`), `gen_ai.*` attribute coverage, or asks to add new providers / scenarios to the conformance harness.

2026-05-14

js-docs-sync

Keep hand-written docs/ documentation in JS packages accurate and up to date with their source code. Use this skill whenever: (1) source files in a JS package that has a docs/ folder are modified — especially exports, function signatures, types, or public API changes, (2) the user asks to "update docs", "sync docs", "check if docs are accurate", "review the documentation", or similar, (3) new exports or features are added to a JS package and the docs need to reflect them. Also trigger when the user mentions documentation drift, stale examples, or missing API coverage in any JS package under js/packages/.

2026-04-03

java-code-reviewer

Review Java OpenInference instrumentation code for correctness and completeness. Use this skill when reviewing a Java instrumentor package — whether it's a new instrumentor, a PR that modifies one, or when the user asks to audit/review/check an existing instrumentor's code quality. Trigger on phrases like "review the instrumentor", "check the Java code", "audit the package", "is this instrumentor correct", or any request to validate an OpenInference Java instrumentation package against project standards.

2026-03-21

python-code-reviewer

3 skills · 204 · updated 2026-03-31

Review Python OpenInference instrumentation code for correctness and completeness. Use this skill when reviewing a Python instrumentor package — whether it's a new instrumentor, a PR that modifies one, or when the user asks to audit/review/check an existing instrumentor's code quality. Trigger on phrases like "review the instrumentor", "check the code", "audit the package", "is this instrumentor correct", or any request to validate an OpenInference Python instrumentation package against project standards.

2026-03-11

#006

arize-claude-code-plugin

3.4% of the creator's skills

Skill

Occupation

Description

Updated

setup-claude-code-tracing

Information Security Analysts

2026-03-31

arize-datasets

Database Administrators

Manage datasets in Arize AI using the ax CLI. Use when users want to list datasets, get dataset details, create new datasets, delete datasets, export dataset data, or work with dataset examples. Triggers on "list datasets", "create dataset", "ax datasets", "export dataset", "delete dataset", or any request about managing Arize datasets via CLI.

2026-02-25

arize-projects

Computer Network Architects

Manage projects in Arize AI using the ax CLI. Use when users want to list projects, get project details, create new projects, delete projects, or organize work within Arize spaces. Triggers on "list projects", "create project", "ax projects", "delete project", or any request about managing Arize projects via CLI.

2026-02-25

#007

claude-code-otlp-collector

2 skills · 20 · updated 2026-05-14

2.2% of the creator's skills

Skill

Occupation

Description

Updated

debug-live-collector

Network and Computer Systems Administrators

Use when testing an already-running Claude collector by sending a live marker trace to its configured OTLP endpoint without starting a second collector.

2026-05-14

debug-setup

Network and Computer Systems Administrators

Use when validating Claude Code telemetry setup for this collector, especially to confirm Claude emits OTLP data to a local file before testing the configured upstream destination.

2026-05-14

#008

context-graphs

2 skills · 20 · updated 2026-05-11

2.2% of the creator's skills

Skill

Occupation

Description

Updated

context-graph-apply

INVOKE THIS SKILL when translating a context-graph-mining report into an experiment variant for the procurement-agent. Reads the report's proposed diffs and writes a variant config bundle to `experiments/variants/<id>/` (manifest.yaml + system_prompt.txt + optional vendors.json + departments.json). Does NOT modify procurement-agent source — runtime parameterization means the variant is purely additive config. Use after `context-graph-mining` produces a report; output feeds `procurement-experiment run-experiment`.

2026-05-11

context-graph-mining

1 skill · 2626 · updated 2026-06-22

INVOKE THIS SKILL when mining the procurement-agent Arize project for patterns in agent decisions vs human overrides, building a context graph, and proposing updates to procurement-agent rules. Use after a batch of reviews has been captured (process.run + override.run spans with annotations) to surface (a) where the agent's recommendations systematically diverge from reviewer overrides, (b) tribal knowledge embedded in reviewer reasoning that should become explicit rules, and (c) concrete code-change proposals against the procurement-agent codebase.

2026-05-11

#010

tutorials

1.1% of the creator's skills

Skill

Occupation

Description

Updated

check-models