html-to-image

Render hand-coded HTML/CSS into crisp PNG images for architecture diagrams, blog cover art, technical posters, and any case where precise typography, layout, and brand control matter more than what a text-to-image model can deliver. Use this skill whenever the user wants a diagram, infographic, cover image, poster, or visual asset built from code (not generated by an image model), or whenever they reference past work like "another Compass-style architecture diagram" or "a header image for my blog post." Pairs with the frontend-design skill — frontend-design picks the aesthetic, this skill handles the HTML→PNG pipeline.

Install command

npx skills add https://github.com/tikazyq/marvinzhang.dev --skill html-to-image

Copy and paste this command into Claude Code to install the skill.

Source

tikazyq/marvinzhang.dev

Stars: 2

Forks: 0

Last updated: 12 May 2026 at 05:18

SKILL.md

name

html-to-image

description

HTML to image

A workflow for turning a hand-coded HTML file into a high-resolution PNG. The skill captures the mechanical parts — viewport, DPR, font-load timing, crop, dual delivery — so the conversation can stay focused on visual decisions.

Setup

scripts/render.py requires Python 3.10+ (it uses X | Y union syntax and built-in generics such as tuple[int, int]), the Playwright Python package, and the Chromium browser binaries that Playwright manages itself.

Quick check (this both imports the module and launches Chromium — if it prints ready, you're done):

python3 -c "from playwright.sync_api import sync_playwright; p=sync_playwright().start(); b=p.chromium.launch(); b.close(); p.stop(); print('ready')"

If it fails, install:

python3 -m pip install playwright
python3 -m playwright install chromium

Always invoke the installer as python3 -m playwright, not the bare playwright CLI. On systems with a global Node install, playwright on the PATH (check with which playwright) often resolves to the JavaScript Playwright CLI, which downloads browsers to a different location than the one the Python package searches. In that case playwright install chromium appears to succeed (exit 0, no output) while the Python module still can't find Chromium. The python3 -m form routes through the Python package and writes to the right place.
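The quick check can also be written as a small function that separates the two failure modes described above: the package is not installed, or the package is installed but cannot launch its Chromium. This is a minimal sketch; the function name and the three status strings are illustrative and not part of the skill. A launch error can have other causes (for example missing system libraries), so "browser-unavailable" is only a hint.

```python
import importlib.util


def playwright_status() -> str:
    """Return 'missing-package', 'browser-unavailable', or 'ready'."""
    # Step 1: is the Python package importable at all?
    if importlib.util.find_spec("playwright") is None:
        return "missing-package"

    # Imported lazily so this module loads even without Playwright installed.
    from playwright.sync_api import Error, sync_playwright

    # Step 2: can the Python package find and launch its own Chromium?
    try:
        with sync_playwright() as p:
            browser = p.chromium.launch()
            browser.close()
    except Error:
        return "browser-unavailable"
    return "ready"


if __name__ == "__main__":
    print(playwright_status())
```

If the result is "missing-package", run the pip step; if it is "browser-unavailable", run python3 -m playwright install chromium.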

Chromium binaries land in ~/.cache/ms-playwright/ by default on Linux (~/Library/Caches/ms-playwright/ on macOS), or in $PLAYWRIGHT_BROWSERS_PATH if it is set; some sandboxes use /opt/pw-browsers/, for example. The Python module knows where to look; you generally shouldn't need to touch the path yourself.

If the install happens during an interactive, user-facing run (not a smoke test), confirm with the user before downloading: Chromium plus the headless shell come to about 280 MB.

When to reach for this skill

"Draw an architecture diagram for X" / "make an infographic about Y"
"I need a cover image for my blog post about Z"
"Generate a poster / explainer / one-pager"
Anything where the user previously praised a code-rendered image and wants more in that style
Anything where text-to-image would mangle the typography or get the labels wrong

If the user only wants a live web component (no PNG export), skip this skill — use frontend-design alone.

Pair with frontend-design

This skill is the pipeline. It says nothing about visual style. Before writing HTML, consult the frontend-design skill for the aesthetic direction (typography, color, composition, avoiding generic AI looks). The split:

frontend-design → what the image should look like
html-to-image → how to render it crisply and deliver it

Both apply on every job.

Workflow

1. Clarify intent before drawing

For non-trivial diagrams, ask up to three clarifying questions before writing any HTML. Typical ones:

Is the audience technical, leadership, or mixed?
Are there parts that are "ours" vs. "ecosystem / external" that need visual separation?
Any brand color or logo to anchor on, or stay neutral?

If the user says "just go" or doesn't answer, commit to a default and state it in one line ("Going with neutral blue-gray + a single accent; will swap if you prefer something else"). Don't stall on confirmation.

2. Set up the working directory

Iterate on the HTML inside a scratch directory (e.g. /tmp/html-to-image-work/ or the project's working area). Only copy the final files to the user's intended location at delivery. This keeps half-finished artifacts out of the way.

3. Write the HTML

A few conventions the render script depends on:

Wrap the printable content in a .canvas (or .container, or main) element. The render script crops to this by default, stripping browser margin / body padding so the PNG has no whitespace around the artwork.
Load web fonts via <link> in <head>, not @import inside CSS. Faster, and networkidle will actually wait for them.
Inline SVG for icons. No external icon fonts (FontAwesome etc.) — they're a render-timing hazard.
All CSS inline in <style>. Keeps the artifact single-file and trivially shareable.

If the design uses many components, write a Python build script with f-string templates rather than one monolithic HTML — it makes iteration much faster.
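A build script of this kind might look like the sketch below. The component data, function names, and CSS values are hypothetical placeholders; only the conventions (a .canvas wrapper, all CSS inline in a style block) come from the list above. The web-font link is left out so the sketch does not invent a font URL.

```python
from html import escape

# Hypothetical component data; a real diagram would have its own.
LAYERS = [
    ("Clients", ["Web app", "CLI"]),
    ("Services", ["API gateway", "Scheduler", "Workers"]),
]


def card(label: str) -> str:
    """One reusable component; user text is escaped before insertion."""
    return f'<div class="card">{escape(label)}</div>'


def layer(title: str, items: list[str]) -> str:
    cards = "\n      ".join(card(item) for item in items)
    return f"""    <section class="layer">
      <h2>{escape(title)}</h2>
      {cards}
    </section>"""


def build_page(layers) -> str:
    body = "\n".join(layer(title, items) for title, items in layers)
    # Literal CSS braces are doubled because this is an f-string.
    return f"""<!doctype html>
<html>
<head>
<meta charset="utf-8">
<style>
  body {{ margin: 0; background: #fff; }}
  .canvas {{ width: 1360px; padding: 48px; box-sizing: border-box; }}
  .layer {{ display: flex; gap: 16px; align-items: center; margin-bottom: 24px; }}
  .card {{ padding: 12px 20px; border: 1px solid #94a3b8; border-radius: 8px; }}
</style>
</head>
<body>
  <main class="canvas">
{body}
  </main>
</body>
</html>
"""


if __name__ == "__main__":
    with open("diagram.html", "w", encoding="utf-8") as f:
        f.write(build_page(LAYERS))
```

Changing a card's markup or adding a layer is then a one-line edit followed by a rebuild, rather than a search through repeated HTML.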

4. Render

Use scripts/render.py. The script handles 2x DPR, networkidle wait, the extra 2-second wait for web fonts, and cropping.

python3 scripts/render.py diagram.html --preset architecture

Presets (viewport in CSS pixels; PNG is 2x):

| Preset | Viewport | When to use |
| --- | --- | --- |
| `architecture` | 1360×1080 | Dense layered diagrams for leadership reviews |
| `architecture-s` | 1280×980 | Side-by-side architecture, narrower aspect |
| `blog-cover` | 1200×675 | 16:9 banner / hero for blog posts |
| `blog-square` | 1200×1200 | Single-concept square for social or inline use |
| `poster` | 1440×1800 | Tall poster, conference handout, explainer |

Or pass --width W --height H for custom dimensions. --full-page disables cropping. --selector ".my-class" overrides the crop target.

5. Visual self-check

Always view the rendered PNG before reporting back. Common failures the eye catches but the script doesn't:

Fonts didn't load (text falls back to system font and looks wrong)
Icon SVGs misaligned or too small at print size
Content overflows the .canvas and gets clipped
Whitespace asymmetry, awkward grid breaks
Colors that look fine on screen but lose contrast at 2x

If any issue: edit HTML, re-render, re-view. Two or three iterations is normal.

6. Deliver both files

Hand over both the PNG (high-res, drop straight into slides / blog / docs) and the HTML (editable source for the next round of changes). Mention this dual delivery in the wrap-up so the user knows the HTML is theirs to keep tweaking.
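The scratch-directory and dual-delivery steps could be scripted roughly like this. The helper names and the shared file stem are hypothetical; the sketch only assumes that the PNG and HTML sit side by side in the scratch directory under the same base name.

```python
import shutil
import tempfile
from pathlib import Path


def scratch_dir() -> Path:
    """Create a fresh scratch directory for iterating on the HTML."""
    return Path(tempfile.mkdtemp(prefix="html-to-image-work-"))


def deliver(work_dir: Path, stem: str, dest_dir: Path) -> list[Path]:
    """Copy <stem>.png and <stem>.html from the scratch dir to dest_dir."""
    sources = [work_dir / f"{stem}.png", work_dir / f"{stem}.html"]
    missing = [s.name for s in sources if not s.is_file()]
    if missing:
        # Refuse a partial delivery: both files go out together.
        raise FileNotFoundError(f"not rendered yet: {', '.join(missing)}")
    dest_dir.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(src, dest_dir / src.name)) for src in sources]
```

Failing when either file is missing keeps a half-finished render from reaching the user's directory.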

Common gotchas

Fonts paint after networkidle. Networkidle fires when network is quiet, but Google Fonts often paints a frame or two later. The script waits an extra 2000ms. If a render still shows fallback fonts, bump the wait — don't remove it.

full_page=True includes body padding. This is the wrong default for diagrams. Always crop to .canvas / .container / main (the script does this automatically). Only use --full-page for full-bleed designs.

1x screenshots look soft. Always use device_scale_factor=2 (baked into the script). The PNG ends up at 2× CSS dimensions — e.g., a 1360-wide viewport produces a 2720-wide PNG. That's what makes it look crisp when embedded in slides at 100% zoom or viewed on retina.
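The 2x arithmetic can be checked mechanically by reading the pixel size from the PNG header, using only the standard library. This is a sketch with hypothetical helper names. Note that a cropped render is 2x the crop element's CSS box, which matches the viewport only when the element fills it.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> tuple[int, int]:
    """Read (width, height) from the IHDR chunk at the start of a PNG."""
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR", width, height.
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    return struct.unpack(">II", data[16:24])


def expected_size(css_size: tuple[int, int], dpr: int = 2) -> tuple[int, int]:
    """Pixel size of a screenshot of a css_size box at the given DPR."""
    return css_size[0] * dpr, css_size[1] * dpr
```

For a file on disk, pass the first 24 bytes, for example png_size(open("diagram.png", "rb").read(24)).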

SVG icons drift at small sizes. Use inline SVG with viewBox="0 0 N N" and set explicit width/height in CSS rather than through size attributes. Stroke widths look thicker at 2x than they did during design, so verify in the PNG, not the browser.
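In a Python build script, that convention might be captured in one helper so every icon is emitted the same way. The helper name, the 24-unit viewBox, the stroke width, and the 20px size are illustrative assumptions.

```python
def icon(path_d: str, view_box: int = 24, css_class: str = "icon") -> str:
    """Inline SVG with a square viewBox and no width/height attributes."""
    return (
        f'<svg class="{css_class}" viewBox="0 0 {view_box} {view_box}" '
        f'fill="none" stroke="currentColor" stroke-width="1.5">'
        f'<path d="{path_d}"/></svg>'
    )


# Size lives in CSS, so one rule resizes every icon consistently.
ICON_CSS = ".icon { width: 20px; height: 20px; flex: none; }"
```

Because the size is set in one CSS rule, adjusting icons after a 2x render check is a single edit.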

Background color of the page matters. If the design is light, set body { background: #fff } (or the canvas's intended outer color). When cropped to .canvas, the body color won't show, but it prevents transparent edges if you ever switch to --full-page.

Style guidance lives in frontend-design

This skill intentionally has no opinions about color palettes, typography pairings, or aesthetic direction. Those decisions are case-by-case and the frontend-design skill covers them well. Resist the urge to copy the visual style of a previous diagram unless the user explicitly asks for it — diagrams that all look the same are a sign the skill has been over-prescribed.
