Name: Arize Datasets
Author: Arize-ai

Updated: February 25, 2026 at 19:19

Repository: Arize-ai/arize-claude-code-plugin

name	arize-datasets
description	Manage datasets in Arize AI using the ax CLI. Use when users want to list datasets, get dataset details, create new datasets, delete datasets, export dataset data, or work with dataset examples. Triggers on "list datasets", "create dataset", "ax datasets", "export dataset", "delete dataset", or any request about managing Arize datasets via CLI.

Arize AX Datasets

Manage datasets in the Arize AI platform using the ax CLI.

Prerequisites

The user must have:

Arize AX CLI installed (pip install arize-ax-cli)
CLI configured with valid credentials (ax config init)
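
A script that relies on these prerequisites can fail early when a tool is missing. The sketch below uses a hypothetical helper, require_cmd (not part of the ax CLI), built on the shell's standard command -v lookup. It only confirms that a binary is on PATH; it does not check that credentials are valid.

```bash
# require_cmd is a hypothetical helper, not part of the ax CLI.
# It succeeds only if every named command is found on PATH.
require_cmd() {
  local cmd missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing required command: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: stop before running any dataset commands if ax or jq is absent.
# require_cmd ax jq || exit 1
```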

Core Dataset Commands

List All Datasets

ax datasets list

Options:

--output <format> - Output format: table (default), json, csv, parquet
--profile <name> - Use specific configuration profile
--limit <n> - Limit number of results
--offset <n> - Skip first n results (pagination)

Examples:

# List as table (default)
ax datasets list

# List as JSON
ax datasets list --output json

# List with pagination
ax datasets list --limit 10 --offset 0

# Use production profile
ax datasets list --profile production

Extracting Dataset IDs:

To find a specific dataset ID for use in other operations:

# Get all dataset IDs and names as JSON
ax datasets list --output json | jq '.[] | {id: .id, name: .name}'

# Find a dataset ID by name
ax datasets list --output json | jq -r '.[] | select(.name == "Training Data") | .id'

# Save dataset ID to a variable
DATASET_ID=$(ax datasets list --output json | jq -r '.[] | select(.name == "Training Data") | .id')
echo "Found dataset: $DATASET_ID"

# Use the ID in subsequent commands
ax datasets get "$DATASET_ID"
ax datasets delete "$DATASET_ID"
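
A lookup by name can match no dataset or several, in which case DATASET_ID ends up empty or multi-line and the follow-up commands receive a bad argument. A minimal guard, assuming the jq filter prints one ID per line as in the examples above; require_single_id is a hypothetical helper name, not an ax command.

```bash
# require_single_id is a hypothetical helper, not part of the ax CLI.
# It reads candidate IDs on stdin (one per line, blank lines ignored)
# and prints the ID only when exactly one was found.
require_single_id() {
  local line id="" count=0
  while IFS= read -r line || [ -n "$line" ]; do
    [ -z "$line" ] && continue
    id=$line
    count=$((count + 1))
  done
  if [ "$count" -ne 1 ]; then
    echo "expected exactly one dataset ID, got $count" >&2
    return 1
  fi
  printf '%s\n' "$id"
}

# Usage:
# DATASET_ID=$(ax datasets list --output json \
#   | jq -r '.[] | select(.name == "Training Data") | .id' \
#   | require_single_id) || exit 1
```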

Without jq (using grep):

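# Quick check with grep; this matches on text, so unrelated lines containing "id" may also appear
ax datasets list --output json | grep -A 2 "Training Data" | grep "id"

# Stricter pattern; assumes pretty-printed JSON with "id" on the line directly before "name"
ax datasets list --output json | grep -B 1 '"name": "Training Data"' | grep "id" | cut -d'"' -f4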

Get Dataset Details

Retrieve information about a specific dataset:

ax datasets get <dataset-id>

Options:

--output <format> - Output format
--profile <name> - Configuration profile to use

Examples:

# Get dataset details
ax datasets get ds_abc123xyz

# Get as JSON
ax datasets get ds_abc123xyz --output json

# Get from production environment
ax datasets get ds_abc123xyz --profile production

Create a New Dataset

Create a dataset from a file:

ax datasets create --file <path> [options]

Supported File Formats:

CSV (.csv)
JSON (.json, .jsonl)
Parquet (.parquet)

Options:

--name <name> - Dataset name (required or inferred from filename)
--description <text> - Dataset description
--profile <name> - Configuration profile to use

Examples:

# Create from CSV
ax datasets create --file data.csv --name "Training Data" --description "Production training set"

# Create from JSON
ax datasets create --file examples.json --name "Test Examples"

# Create from Parquet
ax datasets create --file dataset.parquet --name "Large Dataset"

# Use staging profile
ax datasets create --file data.csv --name "Test Data" --profile staging

Delete a Dataset

Remove a dataset from Arize:

ax datasets delete <dataset-id>

Options:

--profile <name> - Configuration profile to use
--yes or -y - Skip confirmation prompt

Examples:

# Delete with confirmation
ax datasets delete ds_abc123xyz

# Delete without confirmation
ax datasets delete ds_abc123xyz --yes

# Delete from production
ax datasets delete ds_abc123xyz --profile production

⚠️ Warning: Deletion is permanent. Always verify the dataset ID before deleting.
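
One way to build the verification step into a script is to look the dataset up first and delete only if the lookup succeeds. This is a sketch, not an ax feature: safe_delete is a hypothetical wrapper, and it assumes ax datasets get exits with a non-zero status when the ID does not exist. It deliberately omits --yes so the CLI's own confirmation prompt still appears.

```bash
# safe_delete is a hypothetical wrapper, not part of the ax CLI.
# Assumption: "ax datasets get" exits non-zero for an unknown dataset ID.
safe_delete() {
  local id=${1:-}
  if [ -z "$id" ]; then
    echo "usage: safe_delete <dataset-id> [--profile <name>]" >&2
    return 2
  fi
  shift
  # Show the dataset first so the ID can be checked before anything is removed.
  if ! ax datasets get "$id" "$@"; then
    echo "dataset $id could not be retrieved; nothing deleted" >&2
    return 1
  fi
  # No --yes here: the confirmation prompt remains the last line of defense.
  ax datasets delete "$id" "$@"
}

# Usage:
# safe_delete ds_abc123xyz --profile production
```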

Export Dataset Data

Export dataset examples to various formats:

ax datasets get <dataset-id> --output <format>

Export Formats:

json - JSON format
csv - Comma-separated values
parquet - Apache Parquet format

Examples:

# Export to JSON
ax datasets get ds_abc123xyz --output json > dataset.json

# Export to CSV
ax datasets get ds_abc123xyz --output csv > dataset.csv

# Export to Parquet
ax datasets get ds_abc123xyz --output parquet > dataset.parquet
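
Shell redirection creates or truncates the target file before the command runs, so a failed export can leave an empty or partial file behind. A minimal sketch that writes to a temporary file and renames it only on success; export_dataset is a hypothetical helper name, and it assumes a failed export exits non-zero or writes nothing.

```bash
# export_dataset is a hypothetical helper, not part of the ax CLI.
# Usage: export_dataset <dataset-id> <format> <output-file>
export_dataset() {
  local id=$1 format=$2 out=$3 tmp
  # Create the temporary file next to the destination so mv is a simple rename.
  tmp=$(mktemp "$out.XXXXXX") || return 1
  if ax datasets get "$id" --output "$format" > "$tmp" && [ -s "$tmp" ]; then
    mv "$tmp" "$out"
  else
    rm -f "$tmp"
    echo "export of $id failed or was empty; $out left untouched" >&2
    return 1
  fi
}

# Usage:
# export_dataset ds_abc123xyz csv dataset.csv
```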

Working with Multiple Profiles

When working across different environments (dev, staging, production):

# List datasets in production
ax datasets list --profile production

# Create dataset in staging
ax datasets create --file test_data.csv --profile staging

# Get dataset from dev environment
ax datasets get ds_dev_123 --profile dev
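
To compare environments side by side, the same command can be repeated once per profile. The loop below is a sketch; list_across_profiles is a hypothetical helper, and the profile names are whatever was configured with ax config init.

```bash
# list_across_profiles is a hypothetical helper, not part of the ax CLI.
# It lists datasets for each profile name given as an argument and
# returns non-zero if any of the listings failed.
list_across_profiles() {
  local profile status=0
  for profile in "$@"; do
    echo "== $profile =="
    if ! ax datasets list --profile "$profile"; then
      echo "listing failed for profile $profile" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage:
# list_across_profiles dev staging production
```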

Pagination for Large Results

For accounts with many datasets, use pagination:

# First page (10 items)
ax datasets list --limit 10 --offset 0

# Second page
ax datasets list --limit 10 --offset 10

# Third page
ax datasets list --limit 10 --offset 20
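
The page-by-page commands above can be wrapped in a loop that stops at the first short page. This sketch assumes each page is a JSON array of dataset objects with an id field, as the jq examples in this document imply; list_all_dataset_ids is a hypothetical helper name, and the max_pages cap is a safety limit added here, not a CLI setting.

```bash
# list_all_dataset_ids is a hypothetical helper, not part of the ax CLI.
# Usage: list_all_dataset_ids [limit] [max_pages]
# Assumption: every page is a JSON array of objects that have an "id" field.
list_all_dataset_ids() {
  local limit=${1:-10} max_pages=${2:-100} offset=0 pages=0 page count
  while [ "$pages" -lt "$max_pages" ]; do
    page=$(ax datasets list --output json --limit "$limit" --offset "$offset") || return 1
    count=$(printf '%s' "$page" | jq 'length') || return 1
    printf '%s' "$page" | jq -r '.[].id'
    # A page with fewer rows than the limit is the last one.
    [ "$count" -lt "$limit" ] && break
    offset=$((offset + limit))
    pages=$((pages + 1))
  done
}

# Usage:
# list_all_dataset_ids 10 > all_dataset_ids.txt
```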

Common Workflows

Workflow 1: Find Dataset by Name and Get Details

# 1. List all datasets and find the one you want
ax datasets list --output json | jq '.[] | {id: .id, name: .name}'

# 2. Extract the specific dataset ID by name
DATASET_ID=$(ax datasets list --output json | jq -r '.[] | select(.name == "Production Data") | .id')

# 3. Get detailed information about that dataset
ax datasets get "$DATASET_ID"

# 4. Export the dataset if needed
ax datasets get "$DATASET_ID" --output csv > dataset_export.csv

Workflow 2: Create and Verify Dataset

# 1. Create dataset
ax datasets create --file data.csv --name "My Dataset"

# 2. Find the new dataset ID
DATASET_ID=$(ax datasets list --output json | jq -r '.[] | select(.name == "My Dataset") | .id')
echo "Created dataset: $DATASET_ID"

# 3. Verify details
ax datasets get "$DATASET_ID"

Workflow 3: Export, Modify, and Re-upload

# 1. Export existing dataset
ax datasets get ds_abc123 --output csv > dataset.csv

# 2. Modify the CSV file (manual editing)

# 3. Create a new dataset from the edited file
ax datasets create --file dataset.csv --name "Updated Dataset v2"

Workflow 4: Migrate Dataset Between Environments

# 1. Export from production
ax datasets get ds_prod_123 --profile production --output json > prod_data.json

# 2. Import to staging
ax datasets create --file prod_data.json --name "Production Copy" --profile staging

Workflow 5: Cleanup Old Datasets

# 1. List all datasets
ax datasets list --output json > all_datasets.json

# 2. Review and identify datasets to delete (manual review)

# 3. Delete old datasets
ax datasets delete ds_old_001 --yes
ax datasets delete ds_old_002 --yes

Output Format Examples

Table Format (Default)

Human-readable table with columns for ID, Name, Created, and Status.

JSON Format

Structured JSON with full dataset metadata:

{
  "id": "ds_abc123xyz",
  "name": "Training Data",
  "description": "Production training set",
  "created_at": "2024-01-15T10:30:00Z",
  "num_examples": 1000,
  "size_bytes": 52428800
}
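
Individual fields can be pulled out of this object with jq. The sketch below assumes the field names shown in the sample above; summarize_dataset is a hypothetical helper name.

```bash
# summarize_dataset is a hypothetical helper, not part of the ax CLI.
# It reads one dataset object (the JSON shape shown above) on stdin
# and prints a one-line summary with the size converted to MiB.
summarize_dataset() {
  jq -r '"\(.name): \(.num_examples) examples, \(.size_bytes / 1048576) MiB"'
}

# Usage:
# ax datasets get ds_abc123xyz --output json | summarize_dataset
```

For the sample object above this prints "Training Data: 1000 examples, 50 MiB".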

CSV Format

Comma-separated values, useful for importing into spreadsheets or pandas.

Parquet Format

Efficient columnar format, ideal for large datasets and data processing.

Troubleshooting

"Dataset not found"

Verify dataset ID: ax datasets list
Check you're using the correct profile: ax config show
Ensure the dataset exists in the current space/project

"Permission denied" or "Unauthorized"

Check API key is valid: ax config show --expand
Verify the key has dataset permissions in Arize
Try re-authenticating: ax config init

"File format not supported"

Supported formats are CSV, JSON (including JSONL), and Parquet. Check:

File extension is correct
File is not corrupted
File content matches the extension
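
The first two checks can be scripted before calling ax datasets create. A minimal sketch: check_dataset_file is a hypothetical helper, it accepts only the lowercase extensions listed in this document, and it does not inspect whether the file content actually matches the extension.

```bash
# check_dataset_file is a hypothetical helper, not part of the ax CLI.
# It verifies that the file exists, is non-empty, and has one of the
# extensions this document lists as supported.
check_dataset_file() {
  local path=$1
  if [ ! -s "$path" ]; then
    echo "$path is missing or empty" >&2
    return 1
  fi
  case "${path##*.}" in
    csv|json|jsonl|parquet)
      return 0
      ;;
    *)
      echo "unsupported file extension: $path" >&2
      return 1
      ;;
  esac
}

# Usage:
# check_dataset_file data.csv && ax datasets create --file data.csv --name "Training Data"
```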

Large dataset creation fails

For very large datasets:

Check file size and network stability
Try breaking into smaller chunks
Use Parquet format for better compression
Consider using the Arize Python SDK for programmatic uploads
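
Breaking a CSV into smaller chunks can be done with standard tools, as sketched below. split_csv is a hypothetical helper; it assumes one record per line (no quoted fields containing newlines) and a prefix that matches no existing files. Because this document shows no command for appending to a dataset, each chunk uploaded with ax datasets create would become its own dataset.

```bash
# split_csv is a hypothetical helper, not part of the ax CLI.
# Usage: split_csv <source.csv> <rows-per-chunk> <output-prefix>
# Assumption: one record per line, with the header on the first line.
split_csv() {
  local src=$1 rows=$2 prefix=$3 header part
  header=$(head -n 1 "$src") || return 1
  # Split the data rows only; the header is re-added to every chunk below.
  tail -n +2 "$src" | split -l "$rows" - "$prefix" || return 1
  for part in "$prefix"*; do
    [ -e "$part" ] || continue
    { printf '%s\n' "$header"; cat "$part"; } > "$part.csv"
    rm -f "$part"
  done
}

# Usage: writes chunk_aa.csv, chunk_ab.csv, ... with up to 50000 rows each.
# split_csv data.csv 50000 chunk_
```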

Output is too large

For datasets with many examples:

Use --limit to restrict output size

Export to file instead of viewing in terminal:

ax datasets get ds_abc123 --output json > dataset.json

Use pagination with --limit and --offset

Tips

Extract dataset IDs by name:

DATASET_ID=$(ax datasets list --output json | jq -r '.[] | select(.name == "My Dataset") | .id')

Use JSON output for scripting: ax datasets list --output json | jq '.[] | .id'
List IDs and names together: ax datasets list --output json | jq '.[] | {id, name}'
Pipe to files for export: Always redirect large outputs to files
Verify before delete: Use ax datasets get "$DATASET_ID" to confirm before deleting
Profile naming: Use descriptive names like prod, staging, dev
Save IDs to variables: Store dataset IDs in shell variables for reuse in scripts
Check limits: Some operations may have rate limits or quotas

Next Steps

View dataset details in Arize UI: https://app.arize.com
Use datasets in experiments and evaluations
Integrate with Arize Python SDK for programmatic access
Set up CI/CD pipelines using the CLI

When to Use This Skill

Use this skill when users want to:

✅ List all datasets in their Arize account
✅ Get details about a specific dataset
✅ Create a new dataset from a local file
✅ Delete datasets they no longer need
✅ Export dataset data to different formats
✅ Work with datasets across multiple environments
✅ Troubleshoot dataset-related CLI issues

Don't use this skill for:

❌ GraphQL queries (use /arize-graphql-analytics instead)
❌ Installing/configuring the CLI (use /setup-arize-cli instead)
❌ Managing projects, models, or other Arize resources beyond datasets
