academic-figure-generation

Generates publication-quality academic figures (framework diagrams, pipeline illustrations, system architectures, method overviews) from a paper's method text and a target caption, using a local PaperBanana multi-agent pipeline (Retriever → Planner → Stylist → Visualizer → Critic).

Overview

Install command

npx skills add https://github.com/jxtse/scientific-research-skills --skill academic-figure-generation

Copy and paste this command into Claude Code to install the skill

Source

jxtse/scientific-research-skills

Stars: 44

Forks: 3

Updated: April 22, 2026 at 13:51

SKILL.md

name	academic-figure-generation
description	Generates publication-quality academic figures (framework diagrams, pipeline illustrations, system architectures, method overviews) from a paper's method text and a target caption, using a local PaperBanana multi-agent pipeline (Retriever → Planner → Stylist → Visualizer → Critic).

Academic Figure Generation

Thin CLI wrapper around PaperBanana (a.k.a. PaperVizAgent), a multi-agent figure-generation pipeline for academic papers.

The skill provides exactly one script: scripts/generate.py. It feeds your method text + caption into PaperBanana and writes N candidate PNGs. Model selection and API keys come from PaperBanana's own configs/model_config.yaml — the wrapper does not override them.

One-time setup

Clone PaperBanana somewhere convenient:

git clone https://github.com/dwzhu-pku/PaperBanana.git ~/PaperBanana
cd ~/PaperBanana
uv venv && uv pip install -r requirements.txt

Configure configs/model_config.yaml: set the image model and the matching API key. The example below shows the Gemini setup, with the OpenRouter alternative noted in the comments:

defaults:
  image_model_name: "gemini-3-pro-image-preview"   # or "openai/gpt-5.4-image-2"
  model_name: "gemini-3.1-pro-preview"             # text model for Planner/Stylist/Critic

api_keys:
  google_api_key: "..."        # required for Gemini models
  openrouter_api_key: ""       # required for openai/gpt-5.4-image-2

Use Gemini if you have a Google AI key; use GPT-Image-2 via OpenRouter if you have an OpenRouter key. Pick one — there's nothing else to wire up.
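
As a minimal sketch of the pairing rule above, the hypothetical helpers below (not part of PaperBanana or this skill) report which api_keys field is still empty for the chosen image model. They assume the name prefixes seen in the YAML comments (`gemini` for Google, `openai/` for OpenRouter) and take the config as an already-parsed dict.

```python
from typing import Optional


def required_key_field(image_model_name: str) -> str:
    """Map an image model name to the api_keys field it needs (assumed prefix rule)."""
    if image_model_name.startswith("gemini"):
        return "google_api_key"
    if image_model_name.startswith("openai/"):
        return "openrouter_api_key"
    raise ValueError(f"unrecognized image model: {image_model_name!r}")


def missing_key(config: dict) -> Optional[str]:
    """Return the required key field if it is empty or absent, else None."""
    field = required_key_field(config["defaults"]["image_model_name"])
    value = config.get("api_keys", {}).get(field, "")
    return None if value else field
```

A return value of None means the selected model has a key filled in; any other value names the field to set.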

Workflow

Step 1: Gather inputs

You need:

Method text: the relevant section of the paper describing the approach (./method.md or ./method.tex).
Figure caption: the target caption, e.g. "Figure 1: Overview of our framework".

If the user only gives a vague request, ask:

What aspect of the method should the figure focus on?
Style? (block diagram, flowchart, pipeline, architecture, comparison)
Venue / column width? (ACL ≤ 7.5", NeurIPS single-column 5.5")
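
Column width and the `--aspect-ratio` flag together fix the printed height of the figure. The arithmetic can be sketched as follows; `figure_height` is a hypothetical helper for illustration, not part of the skill.

```python
def figure_height(width_in: float, aspect_ratio: str) -> float:
    """Height in inches of a figure of the given width and W:H aspect ratio."""
    w, h = (int(part) for part in aspect_ratio.split(":"))
    return width_in * h / w


# A 5.5" NeurIPS column at 16:9 is about 3.09" tall.
print(figure_height(5.5, "16:9"))  # 3.09375
```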

Step 2: Generate

~/PaperBanana/.venv/bin/python scripts/generate.py \
  --paperbanana-root ~/PaperBanana \
  --method-file ./method.md \
  --caption "Figure 1: Overview of our framework" \
  --out-dir ./figures/v1 \
  --candidates 3 \
  --aspect-ratio 16:9

| Flag | Default | Notes |
| --- | --- | --- |
| `--paperbanana-root` | (required) | Path to your PaperBanana checkout |
| `--method-file` | (required) | Method section as a text/markdown file |
| `--caption` | (required) | Target figure caption |
| `--out-dir` | (required) | Where PNGs land |
| `--candidates` | `3` | Independent diagram candidates |
| `--max-concurrent` | `2` | Cap concurrent runs (be gentle on quota) |
| `--exp-mode` | `demo_full` | Full pipeline (Planner+Stylist+Visualizer+Critic). Use `demo_planner_critic` to skip Stylist, or `vanilla` for single-shot. |
| `--aspect-ratio` | `16:9` | One of `21:9`, `16:9`, `3:2`, `1:1` |
| `--max-critic-rounds` | `2` | Critique → revise loops (early-exits if critic says "No changes needed") |
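
If the call is scripted rather than typed, the argument list can be assembled in Python. This is a sketch under stated assumptions: `build_command` is a hypothetical helper, it uses only the flags documented above, and it assumes the working directory is the skill directory so that scripts/generate.py resolves.

```python
from pathlib import Path

ALLOWED_ASPECT_RATIOS = {"21:9", "16:9", "3:2", "1:1"}


def build_command(paperbanana_root, method_file, caption, out_dir,
                  candidates=3, aspect_ratio="16:9"):
    """Build the generate.py argument list from the documented flags."""
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio}")
    root = Path(paperbanana_root).expanduser()
    return [
        str(root / ".venv" / "bin" / "python"),  # interpreter from the PaperBanana venv
        "scripts/generate.py",
        "--paperbanana-root", str(root),
        "--method-file", str(method_file),
        "--caption", caption,
        "--out-dir", str(out_dir),
        "--candidates", str(candidates),
        "--aspect-ratio", aspect_ratio,
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; a list avoids shell-quoting problems with captions that contain spaces or colons.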

Step 3: Present & iterate

Show all candidates to the user.
Common refinements: color scheme, layout, label text, font size.
Re-run with a tweaked caption or more candidates.

Step 4: Export

PNGs are written as candidate_0.png, candidate_1.png, … in --out-dir.
For camera-ready PDFs: magick candidate_0.png candidate_0.pdf.
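
To convert every candidate at once, the same magick call can be generated per file. A minimal sketch, assuming the candidate_N.png naming described above; `magick_commands` is a hypothetical helper that only builds the commands and does not run them.

```python
import re
from pathlib import Path

CANDIDATE_PATTERN = re.compile(r"candidate_(\d+)\.png")


def magick_commands(out_dir):
    """Return one 'magick in.png out.pdf' argument list per candidate, in numeric order."""
    matches = []
    for path in Path(out_dir).iterdir():
        m = CANDIDATE_PATTERN.fullmatch(path.name)
        if m:
            matches.append((int(m.group(1)), path))
    return [["magick", str(p), str(p.with_suffix(".pdf"))]
            for _, p in sorted(matches)]
```

Each returned list can be handed to `subprocess.run(cmd, check=True)`, provided ImageMagick is installed and on PATH.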

Style guidelines

Color: consistent, colorblind-friendly palette
Fonts: match the paper's body font (Times for ACL/EMNLP, Helvetica/Arial for many ML venues)
Labels: concise; no full sentences inside the diagram
Arrows: solid for data flow, dashed for optional / feedback loops
Whitespace: don't overcrowd — reviewers skim figures in seconds

Common figure types

| Type | When to use | Key elements |
| --- | --- | --- |
| Pipeline / Flowchart | Sequential processing | Boxes + arrows, L→R or T→B |
| Architecture | System overview | Nested boxes, clear module boundaries |
| Comparison | Before/after, baseline vs proposed | Side-by-side panels |
| Ablation | Component contributions | Bar charts, highlighted rows |
| Framework | High-level conceptual overview | Abstract shapes, minimal detail |

Troubleshooting

429 RESOURCE_EXHAUSTED on Gemini: monthly Google AI Studio spending cap hit. Raise it at https://ai.studio/spend or switch image_model_name to openai/gpt-5.4-image-2 and set OPENROUTER_API_KEY.
OpenRouter Client not initialized: OPENROUTER_API_KEY not in env and openrouter_api_key not in yaml.
No PNGs in output dir: check results.json inside the --out-dir directory for the raw per-candidate response and any error messages.
Long latency (>5 min): most wall time is the image model. Lower --candidates or use --exp-mode vanilla for faster iteration.
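
A quick first check when a run looks empty is to list what actually landed in the output directory. The sketch below uses a hypothetical helper, `summarize_run`, and makes no assumption about the internal structure of results.json; it simply loads whatever JSON is there.

```python
import json
from pathlib import Path


def summarize_run(out_dir):
    """List candidate PNGs and load results.json (if present) from an output directory."""
    out = Path(out_dir)
    pngs = sorted(p.name for p in out.glob("candidate_*.png"))
    results_path = out / "results.json"
    results = None
    if results_path.exists():
        results = json.loads(results_path.read_text(encoding="utf-8"))
    return {"pngs": pngs, "results": results}
```

If `pngs` is empty and `results` is not None, printing `results` with `json.dumps(results, indent=2)` is usually the fastest way to read the recorded responses and errors.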

Links

PaperBanana repo: https://github.com/dwzhu-pku/PaperBanana
PaperVizAgent (Google Research version of the same project): https://github.com/google-research/papervizagent
