| name | article-to-html-skill |
| description | Render a markdown draft / any document in the conversation context into a single-file "paper proposal" HTML — serif body, monospace meta, numbered sections, inline SVG figures, callouts, tables, optional interactive elements. Trigger when the user says "turn this into HTML / render as a web page / make a pretty HTML / give me a single-page doc / generate article HTML / paper-style HTML / convert to article html / render this as a doc html" or similar, and wants a **static single file** rather than Figma/PDF/PPT. Also fires when the user pastes some markdown and says "beautify this / typeset this / give me a web version". Do NOT trigger when the user wants slides, needs a backend, or needs a multi-page site. |
| user-invocable | true |
| argument-hint | Path to the Markdown document |
article-to-html
Turn a document (from conversation context or a given markdown file) into a self-contained HTML file, reusing the "paper proposal" design system defined in references/template.html: serif body + monospace meta + numbered sections + inline SVG figures + callouts + tables + optional JS interactivity.
Flow after trigger
- Get the source. If the user gave a file path, Read it. If the draft is in conversation context, use that directly.
- Extract document skeleton. Title, subtitle, TL;DR, section list, whether figures are needed, whether tables are needed, whether interactivity is needed.
- Read the template. Read references/template.html and use its full CSS + structure as the scaffold.
- Read the component reference. Read references/components.md and pick out the snippets you need this round.
- If figures are needed: consult references/svg-figures.md and draw with inline SVG (no external image references, so the file stays portable).
- If interactivity is needed: consult references/interactive.md and append a <script> block before </body>.
- Output. Write to {same dir as source or current working dir}/{slug}.html. Filename = English slug of the title, or whatever the user specified.
- Report. One sentence with the file path + a one-line command to open it in a browser.
Design invariants (do NOT break)
- Single file. All CSS / SVG / JS inline. No external fonts, no CDNs, no remote images.
- Don't change the paper palette. --paper: #f7f7f5 + --ink: #1a1a1a is this skill's visual signature. If the user explicitly asks for a different mood (dark mode, different accent), you can change --accent / --warn, but keep paper/ink.
- Serif body + monospace meta. body uses ui-serif; every "metadata slot" (.doc-eyebrow / .doc-meta / th / code / figcaption / .num) uses ui-monospace. This mix is the signature.
- Section numbering prefix. <h2><span class="num">01</span>Section title</h2>: small monospace, faint gray, 14px gap to the title. Add it even if the source has no numbers.
- Figures need figcaption. <figcaption><span class="fig-num">FIG 1</span>caption text</figcaption>, with the FIG N label in accent color.
- TL;DR always on top. If the source has no TL;DR, condense the first one or two paragraphs into a ~60-word summary and put it there.
- No "generated by AI" footer watermark unless the user explicitly asks.
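The numbering and caption invariants above can be sketched as two small string builders. This is only an illustration: the helper names `escapeHtml`, `sectionHeading`, and `figCaption` are hypothetical and not part of the template; only the emitted markup comes from the invariants.

```javascript
// Escape text so it is safe to place inside HTML element content.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Numbered section heading: two-digit prefix in .num, then the title.
function sectionHeading(index, title) {
  const num = String(index).padStart(2, "0");
  return `<h2><span class="num">${num}</span>${escapeHtml(title)}</h2>`;
}

// Figure caption carrying the running "FIG N" label in .fig-num.
function figCaption(index, caption) {
  return `<figcaption><span class="fig-num">FIG ${index}</span>${escapeHtml(caption)}</figcaption>`;
}

console.log(sectionHeading(1, "Background"));
console.log(figCaption(2, "Request flow"));
```

Numbering from a running counter rather than from the source keeps the prefixes consistent even when the source has no numbers.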
Component cheat sheet
| If the source contains... | Use... |
|---|---|
| Intro / abstract | .tldr block |
| Quotation | .callout.cite (with .cite-source) |
| Warning / heads-up | .callout.warn |
| Generic sidenote | .callout (default, white background) |
| Three parallel concepts / roles | .cards (three-column cards) |
| N parallel concepts | .cards cols-2 / cols-4 |
| Comparison / vendor matrix | <table> |
| Flow / architecture / timing / bar chart | inline SVG figure — see references/svg-figures.md |
| Open-questions list | <ol> with bold lead phrase per item |
| Reference links | <footer> containing a <ul> |
Interactive elements (optional)
The template is static by default. Add interactivity proactively when the document clearly benefits:
- Section collapse / expand (long documents)
- "Copy" button on code blocks
- Table filter / sort (great for vendor comparison tables)
- Dark mode toggle (persisted to localStorage)
- TOC + scrollspy
- Forms (e.g. for an RFC / decision doc, a "vote / leave comment" form persisted to localStorage)
Snippets are in references/interactive.md. Default to none unless (a) the doc is long (≥3 sections + ≥2 figures), or (b) the source is interaction-shaped (tutorial, decision doc, vendor selection).
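The default-to-none rule can be read as a simple predicate. The sketch below assumes a hypothetical `doc` summary object with `sections`, `figures`, and `sourceType` fields; the type labels are illustrative names for the three interaction-shaped cases listed above.

```javascript
// Hypothetical labels for the "interaction-shaped" source types.
const INTERACTION_SHAPED = new Set(["tutorial", "decision-doc", "vendor-selection"]);

// True when the document qualifies for interactive elements:
// (a) long: at least 3 sections and at least 2 figures, or
// (b) the source type is interaction-shaped.
function shouldAddInteractivity(doc) {
  const isLong = doc.sections >= 3 && doc.figures >= 2;
  return isLong || INTERACTION_SHAPED.has(doc.sourceType);
}

console.log(shouldAddInteractivity({ sections: 2, figures: 0, sourceType: "note" }));
console.log(shouldAddInteractivity({ sections: 5, figures: 3, sourceType: "note" }));
```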
Naming and output location
- Filename: English slug of the title, lowercase, hyphenated, ≤40 chars. For Chinese titles, use pinyin or a translated keyword hint, e.g. 《新基建提案》→ infra-proposal.html.
- Output location:
  - If the user gave a source markdown path, output in the same directory.
  - Otherwise output in the current working directory.
  - If the user specifies a path, honor it.
- If a file with the same name exists, append -v2 / -v3. Never silently overwrite the user's existing output.
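The naming rules above can be sketched as two pure functions. Both names (`slugify`, `nextFreeName`) are hypothetical, and the sketch assumes the title has already been translated or transliterated to ASCII; the set of existing filenames is passed in rather than read from disk so the logic stays easy to check.

```javascript
// Lowercase, hyphenated, at most maxLen characters.
// Non-ASCII characters are dropped, so translate or transliterate first.
function slugify(title, maxLen = 40) {
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // any run of other characters becomes one hyphen
    .replace(/^-+|-+$/g, "");    // trim leading and trailing hyphens
  return slug.slice(0, maxLen).replace(/-+$/, ""); // no dangling hyphen after the cut
}

// Pick a filename not present in `existing` (a Set of filenames in the
// output directory): slug.html, then slug-v2.html, slug-v3.html, ...
function nextFreeName(slug, existing) {
  let name = `${slug}.html`;
  for (let v = 2; existing.has(name); v++) {
    name = `${slug}-v${v}.html`;
  }
  return name;
}

const taken = new Set(["infra-proposal.html"]);
console.log(nextFreeName(slugify("Infra Proposal"), taken));
```

The length cap here applies to the slug only; whether the version suffix should also count toward the 40 characters is left open by the rule.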
Critical pitfalls (learned from real renders)
SVG <text> cannot contain HTML tags
HTML5 parsers exit foreign-content (SVG) mode when they hit HTML tags like <sub>, <sup>, <b>, <p>, <br>. This breaks the SVG mid-render — everything after leaks into body flow as plain text. Symptoms: big empty whitespace inside the SVG box, with raw text fragments below.
Always use SVG-native equivalents inside <text> / <tspan>:
| HTML | SVG equivalent |
|---|---|
| <sub>X</sub> | <tspan baseline-shift="sub" font-size="0.75em">X</tspan> |
| <sup>X</sup> | <tspan baseline-shift="super" font-size="0.75em">X</tspan> |
| <b>X</b> | <tspan font-weight="600">X</tspan> |
| line break | separate <text> element, or <tspan x="..." dy="1.2em"> |
The exception is <foreignObject>: it is an HTML integration point, so HTML tags are legal inside it. Still prefer pure SVG <tspan> for portability.
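As a rough sketch, the substitutions in the table can be applied mechanically to a label string before it is placed inside <text>. The function names are hypothetical, and the regex handles only flat, attribute-free, non-nested tags; anything more complex should be rewritten by hand.

```javascript
// SVG-native attribute sets for each HTML tag, taken from the table above.
const TSPAN_ATTRS = {
  sub: 'baseline-shift="sub" font-size="0.75em"',
  sup: 'baseline-shift="super" font-size="0.75em"',
  b: 'font-weight="600"',
};

// Rewrite <sub>, <sup>, and <b> in a label into <tspan> equivalents.
function toSvgLabel(label) {
  return label.replace(/<(sub|sup|b)>(.*?)<\/\1>/g, (_, tag, inner) =>
    `<tspan ${TSPAN_ATTRS[tag]}>${inner}</tspan>`);
}

// Wrap a converted label in a <text> element at (x, y).
function svgText(x, y, label) {
  return `<text x="${x}" y="${y}">${toSvgLabel(label)}</text>`;
}

console.log(svgText(40, 24, "x<sub>1</sub>"));
```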
Semantic precision over visual prettiness
When drawing technical diagrams (especially ML / data-flow / architecture), the arrows and shapes must encode the actual semantics. Generic "node → node" arrows are not enough. Examples from real renders:
- Causality vs. prediction in a transformer / autoregressive diagram: different arrow colors for different relations (e.g. red = causal compute {x_1,…,x_P} → logits[P], blue = prediction logits[K] → input_ids[K+1]). One color = one semantic.
- Tensor shape distinction: when two rows of objects have different ranks (e.g. token-ids (B, T) vs. logits (B, T, |V|)), draw the higher-rank tensor as a 3D cube (front rect + top polygon + right polygon) and the lower-rank as a flat rect. This makes the "extra dim" instantly readable.
- Highlight the sliced range: if the prose discusses a slice like [-T-1:-1], color the cubes inside that range distinctly from the others, and put a bracket annotation under them with the slice expression. Color must match between the cube row and the bracket label.
- Discarded / dummy positions (e.g. the EOS slot, padding): faded gray + dashed stroke + label discarded.
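The "front rect + top polygon + right polygon" cube can be sketched as a small generator. The function name `cube`, the default fill, and the depth parameter are assumptions for illustration; the stroke reuses the --ink value, and references/ml-diagrams.md remains the authority for the real pattern.

```javascript
// Draw a tensor "cube": a front rect plus top and right faces offset by
// `depth` px up and to the right. (x, y) is the front face's top-left corner.
// The fill color is a placeholder, not a value from the template.
function cube(x, y, w, h, depth, fill = "#e8e4d8") {
  const top = [
    [x, y], [x + depth, y - depth], [x + w + depth, y - depth], [x + w, y],
  ];
  const right = [
    [x + w, y], [x + w + depth, y - depth],
    [x + w + depth, y + h - depth], [x + w, y + h],
  ];
  const pts = (poly) => poly.map(([px, py]) => `${px},${py}`).join(" ");
  return [
    `<rect x="${x}" y="${y}" width="${w}" height="${h}" fill="${fill}" stroke="#1a1a1a"/>`,
    `<polygon points="${pts(top)}" fill="${fill}" stroke="#1a1a1a"/>`,
    `<polygon points="${pts(right)}" fill="${fill}" stroke="#1a1a1a"/>`,
  ].join("\n");
}

console.log(cube(10, 20, 30, 30, 8));
```

Passing a different `fill` per cell is one way to color the sliced range distinctly from the rest of the row.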
Layout collision checks
After every figure render, screenshot it and verify visually:
- Labels do not overlap legend boxes
- Arrow tails / heads don't run through text
- Row-label text (left margin) doesn't get covered by the first cube/box
- Slice brackets sit below the row they reference, not on top of it
If overlap is detected, prefer moving the label over shrinking the diagram. Reserve ~20px clear space around every text element.
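The ~20px clearance rule can also be checked numerically as a supplement to the screenshot, not a replacement for it. The sketch below assumes each label or legend has been reduced to a box of the form `{ id, x, y, width, height }`, for example from `getBBox()` in the browser; the function names are hypothetical.

```javascript
// True when box `a`, grown by `clearance` px on every side, intersects box `b`.
function tooClose(a, b, clearance = 20) {
  return (
    a.x - clearance < b.x + b.width &&
    a.x + a.width + clearance > b.x &&
    a.y - clearance < b.y + b.height &&
    a.y + a.height + clearance > b.y
  );
}

// Return the id pairs of all boxes that violate the clearance.
function findCollisions(boxes, clearance = 20) {
  const hits = [];
  for (let i = 0; i < boxes.length; i++) {
    for (let j = i + 1; j < boxes.length; j++) {
      if (tooClose(boxes[i], boxes[j], clearance)) {
        hits.push([boxes[i].id, boxes[j].id]);
      }
    }
  }
  return hits;
}

const boxes = [
  { id: "label", x: 0, y: 0, width: 100, height: 20 },
  { id: "legend", x: 110, y: 0, width: 50, height: 20 },
  { id: "far", x: 300, y: 0, width: 50, height: 20 },
];
console.log(findCollisions(boxes));
```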
Formula typography
For ML / math / paper-style content, formulas inside .equation blocks should use Times New Roman (or "Times" / "Nimbus Roman" fallback). This is what researchers expect — ui-monospace for formulas feels like Markdown source, not a paper.
.equation code {
  font-family: "Times New Roman", "Times", "Nimbus Roman",
    "Source Han Serif SC", "Songti SC", serif;
  font-size: 16px;
  line-height: 1.9;
}
.equation code sub, .equation code sup { font-size: 0.72em; }
<sub> / <sup> are fine in HTML body — the SVG rule does NOT apply here.
Code highlighting
Long code blocks (≥6 lines, or anything users will read line-by-line) deserve language-tagged syntax highlighting. Don't pull in Prism.js or highlight.js — they're external deps. Use a small inline tokenizer with paper-tone CSS classes:
- tok-keyword (red-brown #a05050): if / elif / def / return / for / lambda
- tok-builtin (olive #6f7a5a): sum / len / range / print
- tok-string (warm #7a6420)
- tok-number (steel #5a7a8c)
- tok-comment (gray italic #909090)
- tok-func (slate #3a5a72): function call names
- tok-op (accent #b88a4a): * - + = / etc
- tok-attr (steel): .mean / .sum / .detach
Add <pre class="with-lang" data-lang="python"> to display a small language tag in the top-right corner. Provide dark-mode variants for every tok-* class.
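A minimal sketch of such an inline tokenizer, for Python source, might look like the following. It is not the highlighter shipped in references/code-highlight.md: the function names, the keyword and builtin lists, and the regexes are assumptions, and it ignores triple-quoted strings and decorators. A name followed by "(" is tagged tok-func, which also catches the name in a def line.

```javascript
const KEYWORDS = new Set(["if", "elif", "else", "def", "return", "for", "in", "lambda"]);
const BUILTINS = new Set(["sum", "len", "range", "print"]);

// Ordered alternatives; the first that matches at a position wins.
const PARTS = [
  /(?<comment>#[^\n]*)/,
  /(?<string>"(?:\\.|[^"\\\n])*"|'(?:\\.|[^'\\\n])*')/,
  /(?<number>\b\d+(?:\.\d+)?\b)/,
  /(?<attr>\.[A-Za-z_]\w*)/,
  /(?<word>[A-Za-z_]\w*)/,
  /(?<op>[-+*\/=<>!%]+)/,
];
const TOKEN_RE = new RegExp(PARTS.map((p) => p.source).join("|"), "g");

function escapeHtml(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function span(cls, text) {
  return `<span class="tok-${cls}">${escapeHtml(text)}</span>`;
}

// Convert Python source into HTML with tok-* spans.
function highlightPython(src) {
  let out = "";
  let last = 0;
  for (const m of src.matchAll(TOKEN_RE)) {
    out += escapeHtml(src.slice(last, m.index)); // whitespace, brackets, etc.
    const g = m.groups;
    const text = m[0];
    if (g.comment !== undefined) out += span("comment", text);
    else if (g.string !== undefined) out += span("string", text);
    else if (g.number !== undefined) out += span("number", text);
    else if (g.attr !== undefined) out += span("attr", text);
    else if (g.op !== undefined) out += span("op", text);
    else if (KEYWORDS.has(text)) out += span("keyword", text);
    else if (BUILTINS.has(text)) out += span("builtin", text);
    else if (src[m.index + text.length] === "(") out += span("func", text);
    else out += escapeHtml(text); // plain identifier
    last = m.index + text.length;
  }
  return out + escapeHtml(src.slice(last));
}

// Wrap highlighted code in the language-tagged <pre>.
function codeBlock(src) {
  return `<pre class="with-lang" data-lang="python"><code>${highlightPython(src)}</code></pre>`;
}

console.log(codeBlock("return sum(xs) * 2  # total"));
```

Because the output is plain spans, the paper-tone colors and their dark-mode variants stay entirely in CSS.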
Common mistakes
- Recoloring the CSS. Never touch paper/ink, and change --accent / --warn only when the user explicitly asks for a different palette. Visual consistency is the entire point.
- Switching body to sans-serif. The whole "proposal paper" feel hinges on the serif body. Changing fonts breaks the look.
- Using emoji as icons. The character vocabulary of this skill is uppercase monospace labels + small color blocks (.mascot / .layer-icon). No emoji unless the user explicitly asks.
- External <img src="https://...">. Kills portability. Either inline SVG, base64, or omit.
- Missing figcaption numbers. The FIG 1 / FIG 2 / FIG 3 running labels are key to the "academic paper" feel.
- Callout on every paragraph. Callouts highlight one or two passages; they are not decoration. Keep total callouts ≤ 5 per document.
- Cramming 5 items into the three-column cards. Switch to repeat(N, 1fr) (or use the .cols-N modifier) rather than forcing a wrap.
- Leaking domain-specific names from the example. The template is blank; if the source doesn't mention Crewlet / Anthropic / etc., the HTML shouldn't either.
File manifest
references/template.html — blank scaffold (full CSS + placeholder structure). Start here.
references/components.md — cut-and-paste HTML snippets for every component.
references/svg-figures.md — five typical SVG figure skeletons (architecture, timing, bar comparison, stacked layers, lifecycle).
references/ml-diagrams.md — ML / AI specific patterns: 3D cubes for tensor rank, two-color arrow semantics (causality vs. prediction), slice annotations, three-stage pipeline, equation typography, SVG-<text> pitfalls. Read this whenever the source is an ML / RL / transformer / training-loop note.
references/code-highlight.md — inline Python syntax highlighter (no CDN, no external deps). Use when the source has non-trivial code blocks.
references/interactive.md — JS snippets (collapse, copy, table filter/sort, TOC, dark mode, form → localStorage, reading progress, figure zoom).
assets/example.html — a completed reference render (the original Chinese article.html kept as a visual benchmark).
Source-type triage
Pick which references to read based on the source content:
| Source contains... | Must-read references |
|---|---|
| Architecture / system design | template.html + components.md + svg-figures.md |
| ML / RL / training / distillation / transformer | + ml-diagrams.md (mandatory) |
| Non-trivial code blocks (≥6 lines or method chains) | + code-highlight.md |
| Long doc (≥3 sections + ≥2 figures) | + interactive.md (TOC, dark mode, copy buttons) |
| Math-heavy paper notes | + ml-diagrams.md § Equation typography |
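The triage table can be read as an additive lookup. In this sketch the `src` fields (`architecture`, `ml`, `mathHeavy`, `codeLines`, `methodChains`, `sections`, `figures`) are hypothetical, and it assumes template.html and components.md are always read, as the flow describes, with each "+" row adding one file.

```javascript
// Return the reference files to read for a given source summary.
function referencesFor(src) {
  const refs = ["template.html", "components.md"]; // always read
  if (src.architecture || src.figures > 0) refs.push("svg-figures.md");
  if (src.ml || src.mathHeavy) refs.push("ml-diagrams.md");
  if (src.codeLines >= 6 || src.methodChains) refs.push("code-highlight.md");
  if (src.sections >= 3 && src.figures >= 2) refs.push("interactive.md");
  return refs;
}

console.log(referencesFor({ ml: true, codeLines: 10, sections: 4, figures: 2 }));
```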
Iterative verification loop
For figures (especially SVG), always screenshot after writing and verify visually before declaring done:
- Open the HTML in the browser tool
- Screenshot the <figure> element with scrollIntoViewIfNeeded
- Check: no leaked text outside SVG box, no label overlaps, semantics match the prose
- If broken, fix and reload — do NOT batch up many figures before checking
The most common failure mode (text leaking out of a broken SVG) is hard to spot in the source; it only shows up in the render.