autofigure
// Convert raster figure images into editable DrawIO files using SAM3 segmentation, RMBG-2.0 background removal, and multimodal LLM drawio generation.
| name | autofigure |
| description | Convert raster figure images into editable DrawIO files using SAM3 segmentation, RMBG-2.0 background removal, and multimodal LLM drawio generation. |
| metadata | {"clawphd":{"emoji":"🖼️"}} |
| requires | {"env":["FAL_KEY"]} |
Convert a raster figure (PNG/JPEG) into an editable .drawio file where text, arrows, and shapes are real mxCell elements, and icons are embedded as transparent PNGs using DrawIO's image style.
| Tool | Purpose |
|---|---|
| segment_figure | Detect icons/elements via SAM3 (fal.ai), produce samed.png + boxlib.json |
| crop_remove_bg | Crop detected regions + RMBG-2.0 background removal → transparent PNGs |
| generate_drawio_template | Multimodal LLM reconstructs the figure as DrawIO XML with gray placeholder mxCells |
| replace_icons_drawio | Embed transparent icons into drawio placeholder mxCells → final.drawio |
Legacy SVG tools (generate_svg_template, replace_icons_svg) are also available if you need SVG output instead.
Follow these steps in order. Each step's output feeds into the next.
segment_figure(
image_path="path/to/figure.png",
output_dir="./output"
)
Detects icons, diagrams, and visual elements using SAM3. Produces:
- samed.png: the figure with gray rectangles (#808080) and sequential labels (<AF>01, <AF>02, …)
- boxlib.json: coordinates of each detected region

Optional parameters:
- text_prompts: comma-separated SAM3 prompts (default: "icon,robot,animal,person")
- min_score: confidence threshold (default: 0.0)
- merge_threshold: overlap merge threshold (default: 0.001)

crop_remove_bg(
image_path="path/to/figure.png",
boxlib_path="./output/boxlib.json",
output_dir="./output"
)
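For each detected region, the tool crops the region from the original figure, removes the background with RMBG-2.0, and saves a transparent PNG such as icons/icon_AF01_nobg.png.

Produces icon_infos.json with paths and coordinates for all icons.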
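The naming convention above can be sketched in a few lines. This is an illustration only, not the tool's code: the helper name `icon_path_for_label` is hypothetical, and placing the `icons/` folder directly under `output_dir` is an assumption.

```python
import re
from pathlib import PurePosixPath

# Labels drawn on samed.png look like "<AF>01", "<AF>02", ...
LABEL_RE = re.compile(r"^<AF>(\d+)$")


def icon_path_for_label(label: str, output_dir: str = "./output") -> str:
    """Map a placeholder label such as '<AF>01' to its transparent icon path.

    Hypothetical helper: assumes icons live in `<output_dir>/icons/`.
    """
    match = LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"not an autofigure label: {label!r}")
    index = match.group(1)  # keeps the zero padding, e.g. "01"
    return str(PurePosixPath(output_dir) / "icons" / f"icon_AF{index}_nobg.png")
```

In practice, icon_infos.json should be treated as the source of truth for icon paths. The sketch only shows how a label relates to a file name.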
generate_drawio_template(
figure_path="path/to/figure.png",
samed_path="./output/samed.png",
boxlib_path="./output/boxlib.json",
output_dir="./output"
)
Sends the original figure and samed.png to a multimodal LLM, which reconstructs the figure as DrawIO mxGraph XML:
- Text: <mxCell style="text;..."> elements
- Arrows: <mxCell edge="1" ...> with mxGeometry
- Shapes: <mxCell style="rounded=0;..."> rectangles, ellipses, etc.
- Icon placeholders: gray mxCells with id="AF01" and value="<AF>01"

Includes automatic XML validation and LLM-based repair.
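To give a rough idea of what a validation pass over the template could look for, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It is not the tool's actual validator: the function names `check_template` and `placeholder_ids` and the specific checks (well-formedness, unique ids, no cell that is both vertex and edge) are assumptions chosen for illustration.

```python
import re
import xml.etree.ElementTree as ET


def check_template(xml_text: str) -> list:
    """Return a list of problems found in a DrawIO template (empty if none)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    seen = set()
    for cell in root.iter("mxCell"):
        cell_id = cell.get("id")
        if cell_id is None:
            problems.append("mxCell without id")
        elif cell_id in seen:
            problems.append(f"duplicate id: {cell_id}")
        else:
            seen.add(cell_id)
        # A cell is either a vertex or an edge, never both.
        if cell.get("vertex") == "1" and cell.get("edge") == "1":
            problems.append(f"cell {cell_id} is both vertex and edge")
    return problems


def placeholder_ids(xml_text: str) -> list:
    """List ids of icon placeholders (ids of the form AF01, AF02, ...)."""
    root = ET.fromstring(xml_text)
    return [
        cell.get("id")
        for cell in root.iter("mxCell")
        if re.fullmatch(r"AF\d+", cell.get("id") or "")
    ]
```

Comparing `placeholder_ids` against the regions in boxlib.json is one way to notice placeholders the LLM dropped before moving on to icon replacement.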
Produces:
- template.drawio: raw LLM output
- optimized_template.drawio: validated/fixed drawio file

Optional parameters:
- optimize_iterations: LLM refinement iterations (default: 0)

replace_icons_drawio(
template_drawio_path="./output/optimized_template.drawio",
icon_infos_path="./output/icon_infos.json",
figure_path="path/to/figure.png",
output_path="./output/final.drawio"
)
Replaces each gray placeholder mxCell with the corresponding transparent PNG icon, embedded using the DrawIO image style (image=data:image/png,{base64}). Placeholders are matched by:
- id (e.g., AF01)
- value (e.g., <AF>01 or its XML-escaped form &lt;AF&gt;01)

Produces final.drawio, the editable DrawIO file with embedded icons, openable in draw.io.
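The replacement step can be approximated as follows. This is a sketch under assumptions rather than the tool's implementation: `embed_icon` is a hypothetical helper, the `shape=image;` prefix is an assumption added on top of the `image=data:image/png,{base64}` style quoted above, and a cell is accepted if either its id or its value matches.

```python
import base64
import xml.etree.ElementTree as ET


def embed_icon(root: ET.Element, label_id: str, png_bytes: bytes) -> bool:
    """Swap the placeholder for `label_id` (e.g. 'AF01') with an embedded PNG.

    Returns True if a placeholder was found and updated, False otherwise.
    """
    encoded = base64.b64encode(png_bytes).decode("ascii")
    # ElementTree unescapes attributes on parse, so "&lt;AF&gt;01" in the
    # file is seen here as "<AF>01".
    wanted_value = f"<AF>{label_id[2:]}"
    for cell in root.iter("mxCell"):
        if cell.get("id") == label_id or cell.get("value") == wanted_value:
            # No ";base64" marker: ";" separates style entries in DrawIO.
            cell.set("style", f"shape=image;image=data:image/png,{encoded};")
            cell.set("value", "")  # drop the visible "<AF>01" label
            return True
    return False
```

The cell's mxGeometry child is left untouched, so the icon keeps the position and size of the placeholder it replaces.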
User: "Convert this figure to an editable DrawIO file: figures/method.png"
Agent steps:
1. segment_figure(image_path="figures/method.png", output_dir="./output")
→ samed.png, boxlib.json (e.g. 5 boxes detected)
2. crop_remove_bg(image_path="figures/method.png", boxlib_path="./output/boxlib.json", output_dir="./output")
→ icon_infos.json, 5 transparent PNGs
3. generate_drawio_template(figure_path="figures/method.png", samed_path="./output/samed.png", boxlib_path="./output/boxlib.json", output_dir="./output")
→ template.drawio, optimized_template.drawio
4. replace_icons_drawio(template_drawio_path="./output/optimized_template.drawio", icon_infos_path="./output/icon_infos.json", figure_path="figures/method.png", output_path="./output/final.drawio")
→ final.drawio
5. Reply with the final drawio path
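The way each step's outputs feed the next can be written down as data. The sketch below only builds the tool names and arguments used in the example above. `plan_autofigure_calls` is a hypothetical helper, and how the calls are actually dispatched depends on the agent runtime, which is not shown here.

```python
def plan_autofigure_calls(image_path: str, output_dir: str = "./output"):
    """Return the four tool calls, in order, as (tool_name, kwargs) pairs."""
    out = output_dir.rstrip("/")
    boxlib = f"{out}/boxlib.json"
    return [
        ("segment_figure", {
            "image_path": image_path,
            "output_dir": output_dir,
        }),
        ("crop_remove_bg", {
            "image_path": image_path,
            "boxlib_path": boxlib,          # produced by segment_figure
            "output_dir": output_dir,
        }),
        ("generate_drawio_template", {
            "figure_path": image_path,
            "samed_path": f"{out}/samed.png",  # produced by segment_figure
            "boxlib_path": boxlib,
            "output_dir": output_dir,
        }),
        ("replace_icons_drawio", {
            "template_drawio_path": f"{out}/optimized_template.drawio",
            "icon_infos_path": f"{out}/icon_infos.json",  # from crop_remove_bg
            "figure_path": image_path,
            "output_path": f"{out}/final.drawio",
        }),
    ]
```

Calling `plan_autofigure_calls("figures/method.png")` reproduces the arguments of steps 1 to 4 in the example.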
- Icons are embedded with the DrawIO image style image=data:image/png,{base64_data}.
- Vertex cells need parent="1" and vertex="1" (edges use edge="1" instead).
- SAM3 runs via fal.ai: set fal_api_key in config.json under tools.autofigure, or set the FAL_KEY environment variable.
- RMBG-2.0 runs locally: torch, torchvision, and transformers must be installed. The model is auto-downloaded from HuggingFace on first use.
- Stronger multimodal models (e.g., google/gemini-2.5-pro-preview) produce more accurate reconstructions.
- Use optimize_iterations=1 or 2 to refine the drawio layout.
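A quick pre-flight check for the fal.ai key might look like the sketch below. The helper `resolve_fal_key` is hypothetical, and the lookup order (config.json first, then `FAL_KEY`) is an assumption; the source only says that either location works.

```python
import os
from typing import Mapping, Optional


def resolve_fal_key(
    config: Mapping, env: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Find the fal.ai key in a parsed config.json or in the environment.

    Assumed precedence: tools.autofigure.fal_api_key, then FAL_KEY.
    """
    key = config.get("tools", {}).get("autofigure", {}).get("fal_api_key")
    return key or env.get("FAL_KEY") or None
```

If the function returns None, segment_figure cannot reach SAM3, so it is worth failing early with a clear message before starting the pipeline.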