| name | dr-cook:data-visualizer |
| description | Generate publication-quality scientific figures and visualizations for academic manuscripts. Use when creating data visualization, plot, figure, chart, graph, heatmap, volcano plot, boxplot, violin plot, PCA plot, UMAP, survival curve, Kaplan-Meier, forest plot, scientific figure, publication figure, ggplot, matplotlib, R plot, Python plot, enrichment bubble chart, PPI network visualization, network figure, 数据可视化, 画图, 图表, 可视化. Do NOT trigger for: bioinformatics-assistant (analysis pipelines and computational methods), method-designer (experimental design and study planning).
|
data-visualizer
1. Overview
data-visualizer produces publication-quality figures for academic manuscripts. It generates complete, runnable code, in ggplot2 by default or in matplotlib or base R on request, with resolution, font sizes, and color palettes set for publication. Three figure categories are supported.
Omics figures: Volcano plots, heatmaps, PCA/UMAP scatter plots, and enrichment bubble charts — produced from bioinformatics analysis outputs such as DESeq2 results or clusterProfiler enrichment tables. Typically downstream from bioinformatics-assistant.
Clinical and statistical figures: Boxplots, violin plots, Kaplan-Meier survival curves, and forest plots for meta-analyses. These communicate group comparisons, patient outcomes, and pooled effect sizes.
Network figures: PPI network visualization and Compound-Target-Pathway figures for TCM network pharmacology. Output is Cytoscape-ready edge/node tables or R igraph code.
2. Parameters
Required
| Parameter | Values | Description |
|---|---|---|
| figure_type | volcano, heatmap, pca, umap, boxplot, violin, survival, forest, ppi_network, enrichment_bubble, custom | Which plot type to generate |
| data_description | string | Description of input data: a pasted data table, column names, or a description of an R object |
Optional
| Parameter | Values | Description |
|---|---|---|
| tool | ggplot2, matplotlib, base_R, cytoscape (default: ggplot2) | Plotting library or tool |
| color_scheme | default, colorblind_safe, tcm_palette, journal_bw (default: default) | Color palette to apply |
| output_format | pdf, png, svg (default: pdf) | File format for the saved figure |
| domain | tcm, bioinformatics, clinical, pharmacology | Research domain; loaded from context_output if available |
| target_journal | string | Journal name; affects figure dimensions and resolution requirements |
| language | en, zh (default: en) | Language for axis labels and legend text |
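The two parameter tables can be read as a small validation step. The sketch below is illustrative only: the parameter names, allowed values, and defaults are taken from the tables, while `resolve_parameters()` and the `OPTIONAL` lookup are hypothetical names that do not appear in this module.

```python
# Sketch only: names and allowed values come from the parameter tables;
# resolve_parameters() itself is a hypothetical helper.
FIGURE_TYPES = {
    "volcano", "heatmap", "pca", "umap", "boxplot", "violin",
    "survival", "forest", "ppi_network", "enrichment_bubble", "custom",
}
OPTIONAL = {
    # name: (allowed values, documented default)
    "tool": ({"ggplot2", "matplotlib", "base_R", "cytoscape"}, "ggplot2"),
    "color_scheme": ({"default", "colorblind_safe", "tcm_palette", "journal_bw"}, "default"),
    "output_format": ({"pdf", "png", "svg"}, "pdf"),
    "language": ({"en", "zh"}, "en"),
}


def resolve_parameters(user_params):
    """Check the required parameters and fill in the documented defaults."""
    figure_type = user_params.get("figure_type")
    if figure_type not in FIGURE_TYPES:
        raise ValueError(f"unknown figure_type: {figure_type!r}")
    if not user_params.get("data_description"):
        raise ValueError("data_description is required")

    resolved = dict(user_params)
    for name, (allowed, default) in OPTIONAL.items():
        value = resolved.setdefault(name, default)
        if value not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}, got {value!r}")
    return resolved


params = resolve_parameters({
    "figure_type": "volcano",
    "data_description": "DESeq2 res with gene, log2FoldChange, padj columns",
})
```

domain and target_journal are left out of the lookup because the table gives them no default.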
3. Workflow
Step 1 — Check upstream context_output.
Inherit parameters.domain, parameters.target_journal, and parameters.language without asking. If raw_text contains tabular data (e.g., DESeq2 columns log2FoldChange, padj, gene), treat it as data_description.
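One way to picture the inheritance in Step 1 is sketched below. The field names come from the text; `inherit_context()`, the rule that explicit user values take precedence, and the simple column-name check for tabular data are all assumptions made for illustration.

```python
# Sketch only: field names come from Step 1; the helper, the precedence rule,
# and the column-name heuristic are hypothetical.
INHERITED_FIELDS = ("domain", "target_journal", "language")
DESEQ2_COLUMNS = ("log2FoldChange", "padj", "gene")  # example columns named in Step 1


def inherit_context(context_output, user_params):
    """Merge upstream values into the user parameters without re-asking."""
    merged = dict(user_params)
    upstream = context_output.get("parameters", {})
    for field in INHERITED_FIELDS:
        # Assumption: a value the user gave explicitly is kept as-is.
        if field in upstream and field not in merged:
            merged[field] = upstream[field]

    raw_text = context_output.get("raw_text", "")
    looks_tabular = all(col in raw_text for col in DESEQ2_COLUMNS)
    if looks_tabular and "data_description" not in merged:
        merged["data_description"] = raw_text
    return merged


upstream = {
    "parameters": {"domain": "bioinformatics", "language": "en"},
    "raw_text": "gene\tlog2FoldChange\tpadj\nTP53\t2.1\t0.001",
}
merged = inherit_context(upstream, {"figure_type": "volcano"})
```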
Step 2 — Collect figure_type and data_description.
If figure_type is not specified by the user, present the figure menu:
"Which type of figure would you like?
(1) Volcano plot — DEG results
(2) Heatmap — expression matrix
(3) PCA / UMAP — dimensionality reduction
(4) Boxplot / Violin plot — group comparison
(5) Survival curve (Kaplan-Meier) — time-to-event data
(6) Forest plot — meta-analysis effect sizes
(7) Network figure — PPI or Compound-Target-Pathway
(8) Enrichment bubble chart — GO/KEGG results
(9) Describe your own figure"
After figure_type is known, confirm data_description if not already provided. Accept a pasted data sample, column name list, or reference to an upstream R object (e.g., "DESeq2 res with gene, log2FoldChange, padj columns").
Step 3 — Confirm tool and color scheme.
Default tool to ggplot2. If target_journal is known, auto-suggest color scheme:
- Nature, Science, Cell → journal_bw; note "Color figures incur additional fees in print."
- Chinese journals → default (full color acceptable).
- All others → default.
Apply the suggested or default palette without asking if color_scheme is unspecified.
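The journal rule in Step 3 amounts to a short lookup. This is a minimal sketch under stated assumptions: `suggest_color_scheme()` is a hypothetical helper, and exact, case-insensitive matching on the journal name is a simplification.

```python
# Sketch only: the journal list and the note mirror the Step 3 rules;
# the helper name and the name matching are hypothetical.
BW_JOURNALS = {"nature", "science", "cell"}


def suggest_color_scheme(target_journal=None, color_scheme=None):
    """Return (color_scheme, note) following the Step 3 rules."""
    if color_scheme is not None:  # an explicit choice is not overridden
        return color_scheme, None
    if target_journal and target_journal.strip().lower() in BW_JOURNALS:
        return "journal_bw", "Color figures incur additional fees in print."
    # Chinese journals and all other journals use the default palette.
    return "default", None
```

For example, `suggest_color_scheme("Nature")` yields journal_bw with the print-fee note, while an unset journal falls back to default.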
Step 4 — Load plot-templates.md.
Load references/plot-templates.md and retrieve the template matching figure_type. Use it as the structural foundation for Step 6. For custom: ask "Please describe your figure in 1–2 sentences," then select the closest template:
- Expression/omics data → volcano or heatmap
- Group comparisons/distributions → boxplot or violin
- Time-to-event data → survival
- Odds ratios/effect sizes → forest
- None apply → volcano scaffold; note which geoms and axes to replace
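The fallback list for custom figures can be sketched as a lookup table. The data categories and template names come from the list above; the category keys and `closest_templates()` are hypothetical, and how a free-text description is classified into a category is deliberately left out.

```python
# Sketch only: categories and fallbacks follow the Step 4 list;
# the keys and the helper name are hypothetical.
CUSTOM_FALLBACKS = {
    "expression_omics": ["volcano", "heatmap"],
    "group_comparison": ["boxplot", "violin"],
    "time_to_event": ["survival"],
    "effect_sizes": ["forest"],
}


def closest_templates(data_category):
    """Return (candidate templates, note) for a custom figure request."""
    if data_category in CUSTOM_FALLBACKS:
        return CUSTOM_FALLBACKS[data_category], None
    # Nothing applies: start from the volcano scaffold and flag what to change.
    return ["volcano"], "Replace the geoms and axes of the volcano scaffold."
```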
Step 5 — Load color-palettes.md.
Load references/color-palettes.md and extract hex values for the requested color_scheme. Confirm palette compatibility — journal_bw requires shape/pattern differentiation in addition to grayscale fills.
Step 6 — Generate figure code.
Produce complete, runnable R or Python code from the Step 4 template and Step 5 palette. Every code block must include:
- Data format note before the code: "Your data should be formatted as: [column names and types]."
- Palette application via scale_color_manual(), scale_fill_manual(), or ggsci helpers.
- Publication settings: theme_classic(base_size = 10); axis font 10pt; title 12pt; legend 8pt; 300 dpi minimum.
- Save call: ggsave() with explicit width, height, dpi, and units = "mm" per journal spec. For matplotlib, set figsize in inches (mm / 25.4) when creating the figure and pass dpi to plt.savefig(), which takes no width, height, or units arguments.
- Chinese font note when language is zh: include library(showtext); showtext_auto() and replace all label strings with Chinese text.
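Journal specifications give sizes in millimetres, while matplotlib sizes figures in inches, so Python output needs a conversion before saving. The sketch below shows that arithmetic only. `mm_to_inches()` and `pixel_size()` are hypothetical helpers, the 89 mm width and 300 dpi floor are the Nature single-column figures quoted in this document, and the 60 mm height is an arbitrary illustration.

```python
# Sketch only: helper names are hypothetical; 60 mm height is illustrative.
MM_PER_INCH = 25.4


def mm_to_inches(width_mm, height_mm):
    """Convert a journal size in millimetres to a matplotlib figsize tuple."""
    return (width_mm / MM_PER_INCH, height_mm / MM_PER_INCH)


def pixel_size(width_mm, height_mm, dpi=300):
    """Raster dimensions that result from saving at the given dpi."""
    w_in, h_in = mm_to_inches(width_mm, height_mm)
    return (round(w_in * dpi), round(h_in * dpi))


figsize = mm_to_inches(89, 60)
# With matplotlib installed, this would typically be used as:
#   fig, ax = plt.subplots(figsize=figsize)
#   fig.savefig("volcano_plot.pdf", dpi=300)
```

In R no conversion is needed, because ggsave() accepts units = "mm" directly.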
Step 7 — Offer iteration.
End every response with:
"Would you like to adjust colors, resize for a specific journal format, add statistical annotations (e.g., significance brackets, p-value labels), switch to a different tool, or modify any other aspect of the figure?"
4. Output Format
Begin every response with the figure header:
[Figure: <Figure Type> | Tool: <tool> | Color: <color_scheme>]
Example: [Figure: Volcano Plot | Tool: ggplot2 | Color: Colorblind-safe]
Follow the header with:
- Data format requirement block: a short prose passage or bulleted list stating exactly what columns/structure the input data must have.
- Complete code block: language-tagged fence (```r or ```python), complete and runnable.
- "Save as:" line: shows the ggsave() call with explicit width, height, dpi, and units = "mm" arguments, or the plt.savefig() call with dpi together with the figsize (in inches) used to create the figure.
- Journal-specific note: only when target_journal is set. Example: "Nature figures: max 180 mm wide (double-column) or 89 mm (single-column), 300 dpi minimum, TIFF or PDF preferred."
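The response layout described in this section can be pictured as a simple assembly step. This is a rough sketch: `format_header()` and `assemble_response()` are hypothetical names, and joining the parts with blank lines is an assumption about spacing.

```python
# Sketch only: both helpers are hypothetical; the part order follows section 4.
FENCE = "`" * 3  # built at runtime so this example does not nest literal fences


def format_header(figure_type, tool, color_scheme):
    """Fill the figure header template."""
    return f"[Figure: {figure_type} | Tool: {tool} | Color: {color_scheme}]"


def assemble_response(header, data_format, code, language, save_call, journal_note=None):
    """Join the header, data format block, code fence, and save line in order."""
    parts = [
        header,
        f"Your data should be formatted as: {data_format}",
        f"{FENCE}{language}\n{code}\n{FENCE}",
        f"Save as: {save_call}",
    ]
    if journal_note is not None:  # included only when target_journal is set
        parts.append(journal_note)
    return "\n\n".join(parts)


header = format_header("Volcano Plot", "ggplot2", "Colorblind-safe")
```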
5. context_output
Reads from upstream
| Field | Source | Usage |
|---|---|---|
| parameters.domain | any upstream module | Avoids re-asking for domain |
| parameters.target_journal | any upstream module | Auto-selects figure dimensions and color scheme |
| parameters.language | any upstream module | Sets axis label and legend language |
| raw_text | bioinformatics-assistant, paper-writer | Used as data input when it contains tabular results |
Writes to output
{
  "module": "data-visualizer",
  "summary": "<e.g., 'Volcano plot code generated for human DEG data, ggplot2, colorblind-safe palette'>",
  "artifacts": ["volcano_plot.pdf"],
  "parameters": {
    "figure_type": "<volcano | heatmap | pca | umap | boxplot | violin | survival | forest | ppi_network | enrichment_bubble | custom>",
    "tool": "<ggplot2 | matplotlib | base_R | cytoscape>",
    "color_scheme": "<default | colorblind_safe | tcm_palette | journal_bw>"
  },
  "status": "success | partial | failed",
  "error_message": "<string | null>"
}
artifacts is always a list of strings, each being an expected output file name (e.g., ["heatmap.pdf"]). status = partial when data_description is insufficient to generate runnable code. status = failed if figure_type could not be determined after two attempts.
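The status rules above can be sketched as follows. The field names match the output record; `build_context_output()` and its `data_sufficient` flag are hypothetical, and passing `figure_type=None` is used here as a stand-in for "could not be determined after two attempts".

```python
import json


# Sketch only: the record layout follows the output schema; the helper,
# its arguments, and the None convention for figure_type are hypothetical.
def build_context_output(summary, figure_type, tool, color_scheme, artifacts,
                         data_sufficient=True, error_message=None):
    """Assemble the output record and derive its status."""
    if not all(isinstance(name, str) for name in artifacts):
        raise TypeError("artifacts must be a list of file-name strings")

    if figure_type is None:
        status = "failed"    # figure_type undetermined after two attempts
    elif not data_sufficient:
        status = "partial"   # data_description too thin for runnable code
    else:
        status = "success"

    return {
        "module": "data-visualizer",
        "summary": summary,
        "artifacts": list(artifacts),
        "parameters": {
            "figure_type": figure_type,
            "tool": tool,
            "color_scheme": color_scheme,
        },
        "status": status,
        "error_message": error_message,
    }


record = build_context_output(
    "Volcano plot code generated for human DEG data, ggplot2, colorblind-safe palette",
    "volcano", "ggplot2", "colorblind_safe", ["volcano_plot.pdf"],
)
serialized = json.dumps(record, indent=2)
```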
6. References
See references/ for:
- plot-templates.md: complete ggplot2/R code templates for volcano plot, heatmap, Kaplan-Meier survival curve, and forest plot, plus publication size guidelines
- color-palettes.md: NPG default, colorblind-safe (Wong 2011), TCM earth tones, and journal grayscale palettes, with ggplot2 application code and accessibility rules