| name | ds-star |
| description | Multi-agent data science framework using DS-STAR (Data Science - Structured Thought and Action) architecture. Automates data analysis through collaborative AI agents with multi-model support (Haiku, Sonnet, Opus). Use for exploratory data analysis, automated insights, and iterative data science workflows. |
| metadata | {"version":"1.0.0","category":"ai-ml","tags":["data-science","multi-agent","analysis","automation","claude-models","iterative-refinement"],"author":"Jules Lescx (adapted by Claude)","original_repo":"https://github.com/JulesLscx/DS-Star"} |
DS-STAR (Data Science - Structured Thought and Action) is an intelligent multi-agent framework adapted for Claude Code that automates data science workflows through specialized AI agents. Originally designed for Google's Gemini models, this skill adapts DS-STAR to leverage Claude's model family (Haiku, Sonnet, Opus) for cost-efficient, high-quality data analysis.
DS-STAR orchestrates seven specialized agents that work collaboratively to:
Key Innovation: Multi-model optimization - route exploration tasks to Haiku, complex reasoning to Sonnet, and critical analysis to Opus for optimal cost/performance balance.
| Agent | Default Model | Task Type | Cost Profile |
|---|---|---|---|
| Analyzer | Haiku | Data inspection | $0.80/1M tokens |
| Planner | Sonnet | Strategy design | $15/1M tokens |
| Coder | Sonnet | Code generation | $15/1M tokens |
| Verifier | Sonnet | Result validation | $15/1M tokens |
| Router | Haiku | Decision routing | $0.80/1M tokens |
| Debugger | Sonnet | Error fixing | $15/1M tokens |
| Finalyzer | Sonnet | Output formatting | $15/1M tokens |
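
The per-agent routing in the table above can be expressed as a plain lookup. The sketch below is illustrative only: `DEFAULT_AGENT_MODELS` mirrors the table, while `resolve_model` and its `overrides` parameter are hypothetical names, not the skill's actual API.

```python
# Default agent-to-model routing, copied from the table above.
DEFAULT_AGENT_MODELS = {
    "ANALYZER": "haiku",
    "PLANNER": "sonnet",
    "CODER": "sonnet",
    "VERIFIER": "sonnet",
    "ROUTER": "haiku",
    "DEBUGGER": "sonnet",
    "FINALYZER": "sonnet",
}

VALID_MODELS = {"haiku", "sonnet", "opus"}


def resolve_model(agent, overrides=None):
    """Return the model for an agent, preferring a per-run override (hypothetical helper)."""
    key = agent.upper()
    if key not in DEFAULT_AGENT_MODELS:
        raise KeyError(f"unknown agent: {agent}")
    model = (overrides or {}).get(key, DEFAULT_AGENT_MODELS[key])
    if model not in VALID_MODELS:
        raise ValueError(f"unknown model: {model}")
    return model


print(resolve_model("router"))                          # haiku
print(resolve_model("debugger", {"DEBUGGER": "opus"}))  # opus
```

Validating the model name up front means a typo in an override fails before any agent runs, rather than partway through an analysis.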
Estimated Cost (typical analysis):
┌─────────────────────────────────────────────────────────┐
│ 1. ANALYZER (Haiku)                                     │
│    Input: Data files                                    │
│    Output: Data descriptions, schemas, summaries        │
└─────────────────────────────────────────────────────────┘
                             ↓
┌─────────────────────────────────────────────────────────┐
│ 2. PLANNER (Sonnet)                                     │
│    Input: Query + Data descriptions                     │
│    Output: Initial analysis step                        │
└─────────────────────────────────────────────────────────┘
                             ↓
┌─────────────────────────────────────────────────────────┐
│ 3. CODER (Sonnet)                                       │
│    Input: Plan + Data descriptions                      │
│    Output: Python code to execute plan                  │
└─────────────────────────────────────────────────────────┘
                             ↓
┌─────────────────────────────────────────────────────────┐
│ 4. DEBUGGER (Sonnet) [If needed]                        │
│    Input: Code + Error                                  │
│    Output: Fixed code                                   │
└─────────────────────────────────────────────────────────┘
                             ↓
┌─────────────────────────────────────────────────────────┐
│ 5. VERIFIER (Sonnet)                                    │
│    Input: Code + Results + Query                        │
│    Output: "Sufficient" or "Needs refinement"           │
└─────────────────────────────────────────────────────────┘
                             ↓
               ┌─────────────┴─────────────┐
               │        Sufficient?        │
               └──────┬─────────────┬──────┘
                  YES │             │ NO
                      │             ↓
                      │  ┌──────────────────────────────┐
                      │  │ 6. ROUTER (Haiku)            │
                      │  │    Decide: Fix step or Add   │
                      │  └──────────────────────────────┘
                      │             ↓
                      │  [Loop back to PLANNER]
                      │
                      ↓
┌─────────────────────────────────────────────────────────┐
│ 7. FINALYZER (Sonnet)                                   │
│    Input: Final code + Results + Query                  │
│    Output: Formatted answer (JSON/text)                 │
└─────────────────────────────────────────────────────────┘
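
The control flow in the diagram above can be sketched as a loop. This is a simplified illustration, not the skill's implementation: each agent is passed in as a plain callable, and the names `run_pipeline_sketch`, `agents`, and `execute` are assumptions made for the example. The router's choice is reduced to a `"fix"` or `"add"` string handed back to the planner, and only one debug attempt is made per round.

```python
def run_pipeline_sketch(query, data_files, agents, execute, max_rounds=5):
    """Analyze, plan, code, (debug), verify, route; finalize once sufficient.

    `agents` maps stage names to callables and `execute` runs code and
    returns (output, error). Both are hypothetical stand-ins.
    """
    descriptions = agents["analyzer"](data_files)
    plan = [agents["planner"](query, descriptions, None)]
    code, output = "", ""
    for _ in range(max_rounds):
        code = agents["coder"](plan, descriptions)
        output, error = execute(code)
        if error:  # single debug attempt, for brevity
            code = agents["debugger"](code, error)
            output, error = execute(code)
        if agents["verifier"](code, output, query) == "Sufficient":
            break
        decision = agents["router"](plan, output)  # "fix" or "add"
        step = agents["planner"](query, descriptions, decision)
        if decision == "fix":
            plan[-1] = step      # replace the last step
        else:
            plan.append(step)    # extend the plan
    return agents["finalyzer"](code, output, query)


# Stub agents so the sketch runs without any model calls.
stub_agents = {
    "analyzer": lambda files: f"{len(files)} file(s)",
    "planner": lambda q, d, decision: f"step after {decision}",
    "coder": lambda plan, d: f"# code for {len(plan)} step(s)",
    "debugger": lambda code, err: code,
    "verifier": lambda code, out, q: "Sufficient" if "2 step" in code else "Needs refinement",
    "router": lambda plan, out: "add",
    "finalyzer": lambda code, out, q: {"final_answer": out},
}
result = run_pipeline_sketch(
    "demo query", ["data.csv"], stub_agents, lambda code: (code.upper(), None)
)
print(result)  # {'final_answer': '# CODE FOR 2 STEP(S)'}
```

With these stubs the verifier rejects the one-step plan, the router asks for an added step, and the second round passes verification.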
DS-STAR uses up to 5 refinement rounds (configurable):
- Artifacts saved to runs/<run_id>/ for reproducibility
# Add the marketplace (if not already added)
/plugin marketplace add token-eater/skills-marketplace
# Install DS-STAR skill
/plugin install ds-star
# Clone the marketplace
git clone https://github.com/token-eater/skills-marketplace.git
cd skills-marketplace
# The skill is in skills/ds-star/
# In Claude Code, invoke the skill
/skill ds-star
# The skill will prompt you for:
# 1. Data files (CSV, JSON, TXT, etc.)
# 2. Your analysis query
# 3. Model configuration (optional)
Basic Statistics:
Query: "What is the average age and gender distribution in this dataset?"
Data: customers.csv
Time Series Analysis:
Query: "What are the monthly sales trends over the past year?"
Data: sales_2024.csv
Data Quality:
Query: "How many missing values are in each column, and what percentage of the total?"
Data: survey_responses.csv
Correlation Analysis:
Query: "Which features have the strongest correlation with customer churn?"
Data: customer_features.csv
Configure which models to use for each agent:
# In your query, you can specify:
agent_models:
  ANALYZER: haiku     # Fast data inspection
  PLANNER: sonnet     # Strategic planning
  CODER: sonnet       # Code generation
  VERIFIER: sonnet    # Result validation
  ROUTER: haiku       # Simple routing decisions
  DEBUGGER: opus      # Complex debugging (if needed)
  FINALYZER: sonnet   # Output formatting
Or use presets:
Every run creates a structured artifact directory:
runs/<run_id>/
├── steps/
│   ├── 001_analyzer/
│   │   ├── prompt.md        # Agent prompt
│   │   ├── code.py          # Generated code
│   │   ├── result.txt       # Execution output
│   │   └── metadata.json    # Step metadata
│   ├── 002_planner_init/
│   ├── 003_coder/
│   └── ...
├── exec_env/                # Executed scripts
├── logs/
│   ├── pipeline.log         # Structured logs
│   └── execution.log        # Code execution logs
├── final_output/
│   └── result.json          # Final answer
└── pipeline_state.json      # Resume state
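
Because step directories carry a numeric prefix, a run can be inspected with a few lines of `pathlib`. The sketch below assumes only the layout shown above; `list_steps` is a hypothetical helper, and the demo builds a throwaway directory rather than reading a real run.

```python
import tempfile
from pathlib import Path


def list_steps(run_dir):
    """Return (index, agent_name) pairs for each step directory, in order."""
    steps = []
    for path in Path(run_dir, "steps").iterdir():
        if not path.is_dir():
            continue
        index, _, name = path.name.partition("_")
        if index.isdigit():
            steps.append((int(index), name))
    return sorted(steps)


# Demo on a temporary directory that mimics the layout above.
with tempfile.TemporaryDirectory() as tmp:
    for step_name in ("003_coder", "001_analyzer", "002_planner_init"):
        Path(tmp, "steps", step_name).mkdir(parents=True)
    demo_steps = list_steps(tmp)

print(demo_steps)  # [(1, 'analyzer'), (2, 'planner_init'), (3, 'coder')]
```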
DS-STAR supports resuming interrupted runs:
# Resume from a previous run ID
/skill ds-star --resume 20241123_143022_a1b2c3
| Configuration | Input Cost | Output Cost | Total | Savings |
|---|---|---|---|---|
| All Sonnet | $0.75 | $0.75 | $1.50 | 0% |
| Balanced | $0.30 | $0.30 | $0.60 | 60% |
| Fast | $0.15 | $0.15 | $0.30 | 80% |
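
The Savings column follows directly from the totals: each configuration is compared against the All Sonnet baseline. A short check of that arithmetic, using only the figures in the table (`savings_pct` is a name chosen for this example):

```python
def savings_pct(total, baseline):
    """Percentage saved relative to the baseline configuration."""
    return round((1 - total / baseline) * 100)


totals = {"All Sonnet": 1.50, "Balanced": 0.60, "Fast": 0.30}
savings = {name: savings_pct(t, totals["All Sonnet"]) for name, t in totals.items()}
print(savings)  # {'All Sonnet': 0, 'Balanced': 60, 'Fast': 80}
```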
Use Haiku for:
Use Sonnet for:
Use Opus for:
Data: iris.csv (150 rows, 5 columns)
Query: "What is the average petal length for each species, and which species has the highest variance?"
Result:
{
"final_answer": {
"averages": {
"setosa": 1.46,
"versicolor": 4.26,
"virginica": 5.55
},
"highest_variance": "virginica",
"variance_value": 0.304
}
}
Cost: $0.18 (Balanced configuration) Time: ~25 seconds Steps: 3 (Analyze → Plan → Code → Verify → Finalize)
Data: sales_2024.csv (365 rows, daily sales)
Query: "Identify the top 3 months by total sales and calculate month-over-month growth rates."
Result:
{
"final_answer": {
"top_3_months": [
{"month": "December", "total_sales": 125000, "growth": "+15%"},
{"month": "November", "total_sales": 108000, "growth": "+8%"},
{"month": "July", "total_sales": 95000, "growth": "+12%"}
],
"average_growth": "+8.3%"
}
}
Cost: $0.22 (Balanced configuration) Time: ~30 seconds Steps: 4 (with 1 refinement round)
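
Month-over-month growth is the percent change of each month's total relative to the previous month, `(current - previous) / previous * 100`. A minimal sketch of that calculation follows; the function name is hypothetical and the input figures are made up for illustration, not taken from sales_2024.csv.

```python
def month_over_month_growth(monthly_totals):
    """Percent change of each month relative to the previous one."""
    growth = []
    for prev, cur in zip(monthly_totals, monthly_totals[1:]):
        growth.append(round((cur - prev) / prev * 100, 1))
    return growth


# Illustrative totals only (three consecutive months).
demo_growth = month_over_month_growth([200, 220, 209])
print(demo_growth)  # [10.0, -5.0]
```

A month with a total of zero would raise `ZeroDivisionError` here; real code would need to decide how to report growth from a zero base.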
The skill includes a custom ClaudeProvider that integrates with the Claude Agent SDK:
# ModelProvider is the provider base class (not shown in this excerpt).
class ClaudeProvider(ModelProvider):
    """Provider for Claude models via Agent SDK."""

    def __init__(self, model_name: str = "sonnet"):
        self.model_name = model_name  # haiku, sonnet, or opus

    def generate_content(self, prompt: str) -> str:
        # Use SDK's subagent system for context efficiency
        # Automatically handles model routing
        # Returns generated content
        ...  # SDK call omitted in this excerpt
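
Because the provider hides the model behind a single `generate_content` method, agents can be exercised offline with a stand-in. The sketch below assumes `ModelProvider` is an abstract base with that one method; the real base class is not shown here, and `EchoProvider` is a hypothetical test double rather than part of the skill.

```python
from abc import ABC, abstractmethod


class ModelProvider(ABC):
    """Assumed shape of the base class: one text-in, text-out method."""

    @abstractmethod
    def generate_content(self, prompt: str) -> str:
        ...


class EchoProvider(ModelProvider):
    """Hypothetical test double that records prompts instead of calling a model."""

    def __init__(self, model_name: str = "haiku"):
        self.model_name = model_name
        self.prompts = []

    def generate_content(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return f"[{self.model_name}] {prompt}"


provider = EchoProvider()
reply = provider.generate_content("Describe customers.csv")
print(reply)  # [haiku] Describe customers.csv
```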
1. "Missing data files" error
   - Check the data/ directory
2. Code execution timeout
   - Check execution_timeout in config (default: 60s)
3. "API key not found"
4. Verification always fails
   - Check max_refinement_rounds (default: 5)
   - Inspect runs/<id>/steps/

Enable detailed logging:
# Run with interactive mode to pause at each step
/skill ds-star --interactive
# Or inspect artifacts manually
ls runs/<run_id>/steps/
cat runs/<run_id>/logs/pipeline.log
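
For the timeout case listed under troubleshooting, it can help to see how a time limit on generated code is typically enforced. One common approach in Python is `subprocess.run` with its `timeout` argument. Whether the skill does it this way is an assumption, and `run_with_timeout` is a hypothetical helper.

```python
import subprocess
import sys


def run_with_timeout(code, timeout=60):
    """Run a code string in a fresh interpreter; return (stdout, error_or_None)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "", f"timed out after {timeout}s"
    if proc.returncode != 0:
        return proc.stdout, proc.stderr.strip()
    return proc.stdout, None


out, err = run_with_timeout("print(2 + 2)")
slow_out, slow_err = run_with_timeout("import time; time.sleep(5)", timeout=1)
print(out.strip(), err)  # 4 None
print(slow_err)          # timed out after 1s
```

Returning the error as a value rather than raising keeps the caller's loop simple: a non-empty error can be handed straight to a debugging step.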
# config.yaml (optional, in project root)
run_id: "my_experiment" # Custom run ID
model_name: "sonnet" # Default model
interactive: false # Pause between steps
max_refinement_rounds: 5 # Max iterations
execution_timeout: 60 # Code timeout (seconds)
preserve_artifacts: true # Save all outputs
runs_dir: "runs" # Artifact directory
data_dir: "data" # Data file directory
# Agent-specific models
agent_models:
  ANALYZER: haiku
  PLANNER: sonnet
  CODER: sonnet
  VERIFIER: sonnet
  ROUTER: haiku
  DEBUGGER: opus
  FINALYZER: sonnet
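
The options above map naturally onto a small settings object. This is a sketch only: `DSStarConfig`, `from_dict`, and `model_for` are hypothetical names, the defaults are copied from the sample file, and falling back to `model_name` for agents without an entry is an assumption. Parsing the YAML itself is left out, since it would need a third-party YAML library.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class DSStarConfig:
    run_id: Optional[str] = None          # custom run ID; None here means "not set"
    model_name: str = "sonnet"            # default model
    interactive: bool = False             # pause between steps
    max_refinement_rounds: int = 5        # max iterations
    execution_timeout: int = 60           # code timeout (seconds)
    preserve_artifacts: bool = True       # save all outputs
    runs_dir: str = "runs"                # artifact directory
    data_dir: str = "data"                # data file directory
    agent_models: dict = field(default_factory=dict)

    @classmethod
    def from_dict(cls, raw):
        """Build a config from a parsed mapping, rejecting unknown keys."""
        known = {f.name for f in fields(cls)}
        unknown = set(raw) - known
        if unknown:
            raise ValueError(f"unknown config keys: {sorted(unknown)}")
        return cls(**raw)

    def model_for(self, agent):
        """Agent-specific model if set, otherwise the default (assumed fallback)."""
        return self.agent_models.get(agent.upper(), self.model_name)


cfg = DSStarConfig.from_dict(
    {"execution_timeout": 120, "agent_models": {"ROUTER": "haiku"}}
)
print(cfg.execution_timeout, cfg.model_for("router"), cfg.model_for("coder"))
# 120 haiku sonnet
```

Rejecting unknown keys catches misspelled options such as `execution_timout` instead of silently ignoring them.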
| Feature | Original DS-STAR | This Skill |
|---|---|---|
| Models | Gemini, OpenAI | Claude (Haiku/Sonnet/Opus) |
| Context | Full context per call | Subagent-optimized |
| Cost | $1-2 per analysis | $0.20-0.40 per analysis |
| Integration | Standalone CLI | Claude Code skill |
| Resume | ✅ Yes | ✅ Yes |
| Interactive | ✅ Yes | ✅ Yes |
| Artifacts | ✅ Saved | ✅ Saved |
| Multi-model | Basic | Advanced routing |
✅ Good Queries:
❌ Avoid:
✅ Best Formats:
❌ Problematic:
Want to add new agents or capabilities?
- scripts/dsstar.py
- scripts/prompts.py
- run_pipeline() method

MIT License - Adapted from DS-Star by Jules Lescx.
The original DS-STAR is based on a Google Research paper.
Ready to automate your data science workflows? Install DS-STAR and let multi-agent AI handle the heavy lifting!