autodiscovery
Create, configure, and monitor AutoDiscovery runs. Use when the user asks about their runs, experiments, discoveries, wants to check status, or wants to start a new discovery run.
| name | autodiscovery |
| description | Create, configure, and monitor AutoDiscovery runs. Use when the user asks about their runs, experiments, discoveries, wants to check status, or wants to start a new discovery run. |
| metadata | {"internal":true} |
| allowed-tools | Bash(asta autodiscovery *) Bash(asta auth *) Read(*) Write(*.json) TaskOutput |
Create, configure, and monitor AutoDiscovery runs via the asta autodiscovery commands. AutoDiscovery is an AI-driven scientific discovery platform that runs iterative experiments guided by Bayesian surprise and MCTS optimization.
This skill requires the asta CLI:
# Install/reinstall at the correct version
PLUGIN_VERSION=0.17.1
if [ "$(asta --version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+')" != "$PLUGIN_VERSION" ]; then
  uv tool install --force "git+https://github.com/allenai/asta-plugins.git@v$PLUGIN_VERSION"
fi
Prerequisites: Python 3.11+ and the uv package manager.
AutoDiscovery uses the shared Asta authentication. If you get an auth error, run asta auth login first.
- `asta autodiscovery runs`: List all runs for the authenticated user
- `asta autodiscovery run <runid>`: Get full details for a specific run
- `asta autodiscovery status <runid>`: Check current execution status of a run
- `asta autodiscovery credits`: Show credit balance (granted/consumed/pending/available)
- `asta autodiscovery experiments <runid>`: List all experiments in a run with status, surprise scores, priors, posteriors, and hypotheses
- `asta autodiscovery experiment <runid> <experiment_id>`: Get full details for a single experiment. The experiment_id format is like node_0_0, node_1_0, etc.

All commands support --format json (default) and --format text:
asta autodiscovery runs --format text
asta autodiscovery experiments <runid> --format text
Use --format text when presenting results directly to the user.
When showing results to the user:
When showing experiment details, focus on the hypothesis, analysis, and review fields. Show code and code_output only if specifically asked.
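The field selection described above can be sketched in Python. This is only an illustration: the field names (hypothesis, analysis, review, code, code_output) come from the guidance above, but the assumption that an experiment arrives as a flat JSON object keyed by those names is not confirmed by this document, and `summarize_experiment` is a hypothetical helper, not part of the asta CLI.

```python
# Hypothetical helper: keep only the fields worth showing by default.
# Assumes the experiment JSON is a flat object with these keys (unverified).
DEFAULT_FIELDS = ("hypothesis", "analysis", "review")
VERBOSE_FIELDS = ("code", "code_output")


def summarize_experiment(experiment: dict, include_code: bool = False) -> dict:
    """Return the default fields, plus code fields only when requested."""
    fields = DEFAULT_FIELDS + (VERBOSE_FIELDS if include_code else ())
    # Skip fields that are absent rather than failing on a partial record.
    return {key: experiment[key] for key in fields if key in experiment}
```

Pass `include_code=True` only when the user has specifically asked to see the code or its output.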
# 1. Create an empty run
RUNID=$(asta autodiscovery create)
echo "Created run: $RUNID"
# 2. Upload dataset files
asta autodiscovery upload "$RUNID" path/to/dataset.csv path/to/other_data.json
# 3. Save metadata configuration
asta autodiscovery metadata "$RUNID" --file metadata.json
# 4. Submit for execution
asta autodiscovery submit "$RUNID"
# 1. Fork (copies metadata + datasets server-side)
NEW_RUNID=$(asta autodiscovery fork <existing-runid>)
# 2. Optionally get and modify metadata
asta autodiscovery metadata-get "$NEW_RUNID" > metadata.json
# Edit metadata.json...
asta autodiscovery metadata "$NEW_RUNID" --file metadata.json
# 3. Submit
asta autodiscovery submit "$NEW_RUNID"
Build this JSON file for the user based on their research goals and datasets.
{
"name": "Run Name",
"description": "Context about the data: its origin, collection methods, known gaps or biases.",
"domain": "Research domain (e.g., Ornithology, Materials Science, NLP)",
"intent": "High-level guidance to condition exploration without specifying exact hypotheses.",
"datasets": [
{
"name": "filename.csv",
"description": "What this dataset contains and what each column/field represents.",
"content_type": "text/csv",
"file_size_bytes": 1048576,
"is_preloaded": false
}
],
"n_experiments": 20,
"exploration_weight": 3.0,
"mcts_selection": "ucb1_recursive",
"surprisal_width": 0.3,
"evidence_weight": 2.0
}
| Field | Purpose | Tips |
|---|---|---|
| name | Short title for the run | Keep it descriptive but concise |
| description | Dataset context for the AI agent | Describe data provenance, collection method, known gaps. The agent uses this to generate better hypotheses. |
| domain | Research field | Helps the agent contextualize hypotheses (e.g., "Genomics", "Climate Science") |
| intent | Exploration guidance | Steer exploration without being too specific. Example: "Focus on how temperature affects yield" rather than "Test if temperature > 30C reduces yield by 20%" |
Each entry in datasets should match an uploaded file:
| Field | Description |
|---|---|
| name | Filename as uploaded (must match exactly) |
| description | What the data contains, column meanings, units |
| content_type | MIME type: text/csv, application/json, text/tab-separated-values, etc. |
| file_size_bytes | File size in bytes (for display; upload validates independently) |
| is_preloaded | Always false for CLI uploads |
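One way to fill in a datasets entry without typing sizes and MIME types by hand is sketched below. The helper name `dataset_entry`, the extension-to-MIME mapping, and the `application/octet-stream` fallback are assumptions made for illustration; only the field names and the three MIME types listed in the table come from this document.

```python
import os

# MIME types named in the table above; anything else falls back to a
# generic type (the fallback is an assumption, not documented behavior).
CONTENT_TYPES = {
    ".csv": "text/csv",
    ".json": "application/json",
    ".tsv": "text/tab-separated-values",
}


def dataset_entry(path: str, description: str) -> dict:
    """Build one entry for the "datasets" list from a local file."""
    extension = os.path.splitext(path)[1].lower()
    return {
        "name": os.path.basename(path),  # must match the uploaded filename
        "description": description,
        "content_type": CONTENT_TYPES.get(extension, "application/octet-stream"),
        "file_size_bytes": os.path.getsize(path),
        "is_preloaded": False,  # always false for CLI uploads
    }
```

The description still has to be written by hand (or drafted after reading the file), since it should explain column meanings and units.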
| Field | Default | Range | Description |
|---|---|---|---|
| n_experiments | - | 1-500 | Number of experiments. Each costs 1 credit. Start with 10-20 for exploration, 50-100 for thorough investigation. |
| exploration_weight | 1.0 | 0.1-10 | Higher = broader exploration across hypothesis space (3-5). Lower = deeper refinement of promising hypotheses (0.5-1). |
| mcts_selection | "ucb1_recursive" | "ucb1_recursive" or "pw" | Search strategy. UCB1 Recursive is the default and works well for most cases. Progressive Widening ("pw") can help with very large hypothesis spaces. |
| surprisal_width | 0.5 | 0.0-1.0 | Threshold for what counts as "surprising". Higher = only dramatic findings (0.7-1.0). Lower = subtle discoveries too (0.1-0.3). |
| evidence_weight | 1.0 | 0.1-10 | How much to trust experimental results. Higher = relies heavily on data (2-5). Lower = more cautious, skeptical (0.3-0.8). |
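A local range check before saving metadata might look like the sketch below. The bounds and allowed mcts_selection values are copied from the table above; the function name `check_parameters` is hypothetical, and whether the server enforces the same limits (or treats them as inclusive) is an assumption.

```python
# Ranges copied from the parameter table; inclusive bounds are assumed.
RANGES = {
    "n_experiments": (1, 500),
    "exploration_weight": (0.1, 10),
    "surprisal_width": (0.0, 1.0),
    "evidence_weight": (0.1, 10),
}
MCTS_CHOICES = ("ucb1_recursive", "pw")


def check_parameters(metadata: dict) -> list[str]:
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    for field, (low, high) in RANGES.items():
        if field in metadata and not (low <= metadata[field] <= high):
            problems.append(f"{field}={metadata[field]} is outside {low}-{high}")
    selection = metadata.get("mcts_selection")
    if selection is not None and selection not in MCTS_CHOICES:
        problems.append(f"mcts_selection={selection!r} is not one of {MCTS_CHOICES}")
    return problems
```

Fields that are absent are skipped rather than flagged, on the assumption that the documented defaults apply.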
Quick exploration (new dataset, broad survey):
{
"n_experiments": 15,
"exploration_weight": 4.0,
"surprisal_width": 0.3,
"evidence_weight": 1.5
}
Deep investigation (known domain, specific questions):
{
"n_experiments": 50,
"exploration_weight": 1.0,
"surprisal_width": 0.5,
"evidence_weight": 3.0
}
Sensitive/noisy data (high bar for surprise, cautious inference):
{
"n_experiments": 30,
"exploration_weight": 2.0,
"surprisal_width": 0.7,
"evidence_weight": 0.5
}
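The three presets above can be layered onto a base metadata object, as in this sketch. The preset values are copied from the examples above; the short keys ("quick", "deep", "noisy") and the `apply_preset` helper are hypothetical names introduced for illustration.

```python
# Preset values copied from the examples above; the key names are made up.
PRESETS = {
    "quick": {"n_experiments": 15, "exploration_weight": 4.0,
              "surprisal_width": 0.3, "evidence_weight": 1.5},
    "deep": {"n_experiments": 50, "exploration_weight": 1.0,
             "surprisal_width": 0.5, "evidence_weight": 3.0},
    "noisy": {"n_experiments": 30, "exploration_weight": 2.0,
              "surprisal_width": 0.7, "evidence_weight": 0.5},
}


def apply_preset(metadata: dict, preset: str) -> dict:
    """Return a copy of metadata with the preset's parameters overriding it."""
    # Dict unpacking builds a new dict, so the caller's metadata is untouched.
    return {**metadata, **PRESETS[preset]}
```

Fields the preset does not mention, such as name, description, and datasets, pass through unchanged.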
When helping a user build metadata.json, write the result to metadata.json (or a user-specified path).
Before writing metadata, read the user's data files to understand them:
# For CSV files - check headers and sample rows
head -5 dataset.csv
# For JSON files - check structure
python3 -c "import json; d=json.load(open('data.json')); print(type(d), len(d) if isinstance(d, list) else list(d.keys())[:10])"
Use this information to write accurate description fields for both the run and individual datasets.
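For CSV files, a structured alternative to `head -5` is sketched below using the standard csv module. `csv_overview` is a hypothetical helper; it assumes the first row is a header and that the file is UTF-8 encoded, neither of which is guaranteed for a user's data.

```python
import csv


def csv_overview(path: str, sample_rows: int = 4) -> dict:
    """Return the header row and the first few data rows of a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no rows
        # zip stops after sample_rows items without reading the whole file.
        samples = [row for _, row in zip(range(sample_rows), reader)]
    return {"columns": header, "samples": samples}
```

The default of four sample rows mirrors what `head -5` shows: one header line plus four data lines.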
- Check the credit balance with asta autodiscovery credits.
- The submit command will show credit cost and ask for confirmation.
- Check execution status with asta autodiscovery status <runid>.
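Since each experiment costs 1 credit, the affordability check before submitting is simple arithmetic, sketched here. `credit_shortfall` is a hypothetical helper, and the available balance is assumed to be read by hand from the credits command output, whose exact JSON layout this document does not show.

```python
def credit_shortfall(n_experiments: int, available_credits: int) -> int:
    """Return how many credits are missing for a run (0 if it is affordable)."""
    cost = n_experiments * 1  # 1 credit per experiment, per the parameter table
    return max(0, cost - available_credits)
```

For example, a 50-experiment run against 30 available credits is 20 credits short, so either n_experiments should be reduced or more credits obtained before submitting.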