dnanexus-integration
DNAnexus cloud genomics platform. Build apps/applets, manage data (upload/download), dxpy Python SDK, run workflows, FASTQ/BAM/VCF, for genomics pipeline development and execution.
| name | dnanexus-integration |
| description | DNAnexus cloud genomics platform. Build apps/applets, manage data (upload/download), dxpy Python SDK, run workflows, FASTQ/BAM/VCF, for genomics pipeline development and execution. |
| license | Unknown |
| compatibility | Requires a DNAnexus account |
| required_environment_variables | [{"name":"DX_SECURITY_CONTEXT","prompt":"DNAnexus auth token context (normally set by `dx login`).","required_for":"optional features"},{"name":"DX_ASSET_BWA","prompt":"Optional asset id for the BWA example.","required_for":"optional features"}] |
| metadata | {"version":"1.1","skill-author":"K-Dense Inc.","openclaw":{"envVars":[{"name":"DX_SECURITY_CONTEXT","required":false,"description":"DNAnexus auth token context (normally set by `dx login`)."},{"name":"DX_ASSET_BWA","required":false,"description":"Optional asset id for the BWA example."}]}} |
DNAnexus is a cloud platform for biomedical data analysis and genomics. Build and deploy apps/applets, manage data objects, run workflows, and use the dxpy Python SDK for genomics pipeline development and execution.
This skill should be used when:
The skill is organized into five main areas, each with detailed reference documentation:
Purpose: Create executable programs (apps/applets) that run on the DNAnexus platform.
Key Operations:
dx-app-wizard
dx build or dx build --app
Common Use Cases:
Reference: See references/app-development.md for:
Purpose: Manage files, records, and other data objects on the platform.
Key Operations:
dxpy.upload_local_file() and dxpy.download_dxfile()
Common Use Cases:
Reference: See references/data-operations.md for:
Purpose: Run analyses, monitor execution, and orchestrate workflows.
Key Operations:
applet.run() or app.run()
Common Use Cases:
Reference: See references/job-execution.md for:
Purpose: Programmatic access to DNAnexus platform through Python.
Key Operations:
Common Use Cases:
Reference: See references/python-sdk.md for:
Purpose: Configure app metadata and manage dependencies.
Key Operations:
Common Use Cases:
Reference: See references/configuration.md for:
import dxpy
# Upload input file
input_file = dxpy.upload_local_file("sample.fastq", project="project-xxxx")
# Run analysis
job = dxpy.DXApplet("applet-xxxx").run({
    "reads": dxpy.dxlink(input_file.get_id())
})
# Wait for completion
job.wait_on_done()
# Download results
output_id = job.describe()["output"]["aligned_reads"]["$dnanexus_link"]
dxpy.download_dxfile(output_id, "aligned.bam")
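The output lookup above assumes the link is in its short form, where `$dnanexus_link` maps directly to an object ID. DNAnexus links can also appear in an extended form, where `$dnanexus_link` maps to an object carrying `id` and `project`. The following is a minimal sketch of a helper that accepts either form; `link_to_id` is a hypothetical name used for illustration, not part of dxpy, and the sample output dictionary is made up.

```python
def link_to_id(link):
    """Return the object ID from a DNAnexus link in short or extended form."""
    target = link["$dnanexus_link"]
    if isinstance(target, str):
        # Short form: {"$dnanexus_link": "file-xxxx"}
        return target
    # Extended form: {"$dnanexus_link": {"project": "project-xxxx", "id": "file-xxxx"}}
    return target["id"]


# Hypothetical job output, shaped like job.describe()["output"]
output = {
    "aligned_reads": {
        "$dnanexus_link": {"project": "project-xxxx", "id": "file-xxxx"}
    },
}
print(link_to_id(output["aligned_reads"]))
```

With a helper like this, the download call can take `link_to_id(job.describe()["output"]["aligned_reads"])` regardless of which link form the platform returns.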
import dxpy
# Find BAM files from a specific experiment
files = dxpy.find_data_objects(
    classname="file",
    name="*.bam",
    name_mode="glob",  # without this, the name is matched exactly, not as a pattern
    properties={"experiment": "exp001"},
    project="project-xxxx"
)
# Download each file under its platform name
for file_result in files:
    file_obj = dxpy.DXFile(file_result["id"])
    filename = file_obj.describe()["name"]
    dxpy.download_dxfile(file_result["id"], filename)
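One caveat with the loop above: DNAnexus identifies objects by ID, so a project can hold several files with the same name, and downloading each under its platform name would overwrite earlier downloads. The following sketch shows one way to derive collision-free local names before downloading. `unique_local_names` is a hypothetical helper, and the `_2`, `_3` suffix scheme is an arbitrary choice for illustration.

```python
def unique_local_names(names):
    """Map platform file names to local names that do not collide."""
    used = set()
    result = []
    for name in names:
        stem, dot, ext = name.rpartition(".")
        candidate = name
        counter = 1
        # Keep bumping the suffix until the candidate is unused
        while candidate in used:
            counter += 1
            if dot:
                candidate = f"{stem}_{counter}.{ext}"
            else:
                # No extension: append the suffix to the whole name
                candidate = f"{name}_{counter}"
        used.add(candidate)
        result.append(candidate)
    return result


print(unique_local_names(["a.bam", "a.bam", "b.bam"]))
```

The returned list lines up index by index with the input, so it can be zipped with the file IDs collected from the search results.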
# src/my-app.py
import dxpy
import subprocess

@dxpy.entry_point('main')
def main(input_file, quality_threshold=30):
    # Download input
    dxpy.download_dxfile(input_file["$dnanexus_link"], "input.fastq")
    # Process
    subprocess.check_call([
        "quality_filter",
        "--input", "input.fastq",
        "--output", "filtered.fastq",
        "--threshold", str(quality_threshold)
    ])
    # Upload output
    output_file = dxpy.upload_local_file("filtered.fastq")
    return {
        "filtered_reads": dxpy.dxlink(output_file)
    }

dxpy.run()
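The entry point above takes `input_file` and `quality_threshold` and returns `filtered_reads`. On DNAnexus those names are normally declared in the app's dxapp.json so the platform can validate inputs and outputs. The sketch below builds a plausible minimal dxapp.json as a Python dictionary. The app name, version numbers, and the Ubuntu release are assumptions chosen for illustration and should be adjusted to your app and to what the platform currently supports.

```python
import json

# Assumed metadata for illustration; only the input/output names come from the entry point
dxapp = {
    "name": "my-app",
    "dxapi": "1.0.0",
    "version": "0.0.1",
    "inputSpec": [
        {"name": "input_file", "class": "file"},
        {"name": "quality_threshold", "class": "int", "optional": True, "default": 30},
    ],
    "outputSpec": [
        {"name": "filtered_reads", "class": "file"},
    ],
    "runSpec": {
        "interpreter": "python3",
        "file": "src/my-app.py",
        "distribution": "Ubuntu",
        "release": "20.04",  # assumption: pick a release your platform supports
        "version": "0",
    },
}


def spec_names(spec, key):
    """List the field names declared under inputSpec or outputSpec."""
    return [field["name"] for field in spec[key]]


# The declared inputs should match the parameters of main()
print(spec_names(dxapp, "inputSpec"))
print(json.dumps(dxapp, indent=2))
```

Keeping the declared names identical to the function parameters and the returned dictionary keys is the main point here; a mismatch typically surfaces only when the job runs.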
When working with DNAnexus, follow this decision tree:
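Need to create a new executable? See references/app-development.md.
Need to manage files or data? See references/data-operations.md.
Need to run an analysis or workflow? See references/job-execution.md.
Writing Python scripts for automation? See references/python-sdk.md.
Configuring app settings or dependencies? See references/configuration.md.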
Often you'll need multiple capabilities together (e.g., app development + configuration, or data operations + job execution).
uv pip install dxpy
dx login
This authenticates your session and sets up access to projects and data.
dx --version
dx whoami
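For scripts that should fail early with a clear message, the same checks can be approximated from Python before any dxpy call is made. This is a rough sketch under stated assumptions: `check_dx_environment` is a hypothetical helper, and an unset `DX_SECURITY_CONTEXT` does not necessarily mean you are logged out, since `dx login` may also store credentials in its own configuration files.

```python
import os
import shutil


def check_dx_environment():
    """Report whether the dx CLI is on PATH and whether a token is in the environment."""
    return {
        # True when a `dx` executable can be found on PATH
        "dx_on_path": shutil.which("dx") is not None,
        # True when the environment carries an auth context; may be False after a normal `dx login`
        "security_context_set": bool(os.environ.get("DX_SECURITY_CONTEXT")),
    }


print(check_dx_environment())
```

`dx whoami` remains the more reliable check of an active session; the helper only catches a missing installation or a missing token in non-interactive environments.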
Process multiple files with the same analysis:
import dxpy
# Find all FASTQ files
files = dxpy.find_data_objects(
    classname="file",
    name="*.fastq",
    name_mode="glob",  # without this, the name is matched exactly, not as a pattern
    project="project-xxxx"
)
# Launch parallel jobs
jobs = []
for file_result in files:
    job = dxpy.DXApplet("applet-xxxx").run({
        "input": dxpy.dxlink(file_result["id"])
    })
    jobs.append(job)
# Wait for all completions
for job in jobs:
    job.wait_on_done()
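In the waiting loop above, a single failed job would raise and stop the loop before the remaining jobs are checked. The sketch below collects successes and failures instead. It assumes only that each job object exposes `wait_on_done()` and `get_id()`, as dxpy job handlers do, and it catches a generic `Exception` so it can run without dxpy installed. In real code you would likely catch dxpy's job failure exception specifically. `wait_for_all` and `FakeJob` are hypothetical names.

```python
def wait_for_all(jobs):
    """Wait on every job; return (succeeded_ids, failures) rather than stopping at the first error."""
    succeeded = []
    failures = {}
    for job in jobs:
        try:
            job.wait_on_done()
            succeeded.append(job.get_id())
        except Exception as exc:  # assumption: a failed job surfaces as a raised exception
            failures[job.get_id()] = str(exc)
    return succeeded, failures


class FakeJob:
    """Stand-in for a job handler so the sketch runs without platform access."""

    def __init__(self, job_id, fails=False):
        self._id = job_id
        self._fails = fails

    def get_id(self):
        return self._id

    def wait_on_done(self):
        if self._fails:
            raise RuntimeError("job failed")


ok, failed = wait_for_all([FakeJob("job-1"), FakeJob("job-2", fails=True)])
print(ok, failed)
```

Replacing the final loop with `wait_for_all(jobs)` gives a summary of the whole batch, which is usually what you want before deciding whether to rerun anything.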
Chain multiple analyses together:
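import dxpy
# Applet handlers and the input link; every ID here is a placeholder
qc_applet = dxpy.DXApplet("applet-xxxx")
align_applet = dxpy.DXApplet("applet-yyyy")
variant_applet = dxpy.DXApplet("applet-zzzz")
input_file = dxpy.dxlink("file-xxxx")
# Step 1: Quality control
qc_job = qc_applet.run({"reads": input_file})
# Step 2: Alignment (uses QC output)
align_job = align_applet.run({
    "reads": qc_job.get_output_ref("filtered_reads")
})
# Step 3: Variant calling (uses alignment output)
variant_job = variant_applet.run({
    "bam": align_job.get_output_ref("aligned_bam")
})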
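Chaining works because `get_output_ref` does not wait for the upstream job. It returns a job-based object reference, a small dictionary that names a job and one of its output fields, and the platform resolves it once that job finishes. The sketch below builds the same structure by hand to show what is being passed. `output_ref` is a hypothetical helper and the job ID is a placeholder.

```python
import json


def output_ref(job_id, field):
    """Build a job-based object reference naming a future output of a job."""
    return {"$dnanexus_link": {"job": job_id, "field": field}}


# Input for the alignment step, pointing at the QC job's future output
align_input = {"reads": output_ref("job-xxxx", "filtered_reads")}
print(json.dumps(align_input, indent=2))
```

Since the reference is resolved on the platform side, all three jobs can be submitted immediately and the downstream ones simply wait for their inputs.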
Organize analysis results systematically:
# Create organized folder structure
dxpy.api.project_new_folder(
    "project-xxxx",
    {"folder": "/experiments/exp001/results", "parents": True}
)
# Upload with metadata
result_file = dxpy.upload_local_file(
    "results.txt",
    project="project-xxxx",
    folder="/experiments/exp001/results",
    properties={
        "experiment": "exp001",
        "sample": "sample1",
        "analysis_date": "2025-10-20"
    },
    tags=["validated", "published"]
)
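When many results are uploaded this way, it can help to derive the folder path and metadata from one place so they stay consistent. The sketch below assembles the keyword arguments used in the upload call above. It converts every property value to a string, since DNAnexus properties are string-to-string pairs. `upload_kwargs` is a hypothetical helper, and the folder layout simply mirrors the example.

```python
from datetime import date


def upload_kwargs(project, experiment, sample, analysis_date, tags=()):
    """Assemble project, folder, properties, and tags for an upload call."""
    return {
        "project": project,
        "folder": f"/experiments/{experiment}/results",
        # Property values must be strings, so dates and numbers are converted
        "properties": {
            "experiment": str(experiment),
            "sample": str(sample),
            "analysis_date": str(analysis_date),
        },
        "tags": list(tags),
    }


kwargs = upload_kwargs("project-xxxx", "exp001", "sample1", date(2025, 10, 20), ["validated"])
print(kwargs)
```

The result can be unpacked into the call, for example `dxpy.upload_local_file("results.txt", **kwargs)`.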
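This skill includes detailed reference documentation: references/app-development.md, references/data-operations.md, references/job-execution.md, references/python-sdk.md, and references/configuration.md.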
Load these references when you need detailed information about specific operations or when working on complex tasks.