| name | ppt-master |
| description | AI-driven multi-format SVG content generation system. Converts source documents (PDF/DOCX/URL/Markdown) into high-quality SVG pages and exports to PPTX through multi-role collaboration. Use when user asks to "create PPT", "make presentation", "生成PPT", "做PPT", "制作演示文稿", or mentions "ppt-master".
|
PPT Master Skill
AI-driven multi-format SVG content generation system. Converts source documents into high-quality SVG pages through multi-role collaboration and exports to PPTX.
Core Pipeline: Source Document → Create Project → [Template] → Strategist → [Image_Generator] → Executor Live Preview → Quality Check → Post-processing → Export
[!CAUTION]
🚨 Global Execution Discipline (MANDATORY)
This workflow is a strict serial pipeline. The following rules have the highest priority — violating any one of them constitutes execution failure:
- SERIAL EXECUTION — Steps MUST be executed in order; the output of each step is the input for the next. Non-BLOCKING adjacent steps may proceed continuously once prerequisites are met, without waiting for the user to say "continue"
- BLOCKING = HARD STOP — Steps marked ⛔ BLOCKING require a full stop; the AI MUST wait for an explicit user response before proceeding and MUST NOT make any decisions on behalf of the user
- NO CROSS-PHASE BUNDLING — Cross-phase bundling is FORBIDDEN. (Note: the Eight Confirmations in Step 4 are ⛔ BLOCKING — the AI MUST present recommendations and wait for explicit user confirmation before proceeding. Once the user confirms, all subsequent non-BLOCKING steps — design spec output, SVG generation, speaker notes, and post-processing — may proceed automatically without further user confirmation)
- GATE BEFORE ENTRY — Each Step has prerequisites (🚧 GATE) listed at the top; these MUST be verified before starting that Step
- NO SPECULATIVE EXECUTION — "Pre-preparing" content for subsequent Steps is FORBIDDEN (e.g., writing SVG code during the Strategist phase)
- NO SUB-AGENT SVG GENERATION — Executor Step 6 SVG generation is context-dependent and MUST be completed by the current main agent end-to-end. Delegating page SVG generation to sub-agents is FORBIDDEN
- SEQUENTIAL PAGE GENERATION ONLY — In Executor Step 6, after the global design context is confirmed, SVG pages MUST be generated sequentially page by page in one continuous pass. Grouped page batches (for example, 5 pages at a time) are FORBIDDEN
- SPEC_LOCK RE-READ PER PAGE — Before generating each SVG page, Executor MUST
read_file <project_path>/spec_lock.md. All colors / fonts / icons / images MUST come from this file — no values from memory or invented on the fly. Executor MUST also look up the current page's page_rhythm (anchor / dense / breathing), page_layouts (which template SVG to inherit, if any), and page_charts (which chart template to adapt, if any). Empty / absent entries are intentional Strategist signals — see executor-base.md §2.1. This rule exists to resist context-compression drift on long decks and to break the uniform "every page is a card grid" default
- SVG MUST BE HAND-WRITTEN, NOT SCRIPT-GENERATED — Every SVG page is written by the main agent directly, one page at a time (see rules 6 and 7). Writing or running a Python / Node / shell script that produces the SVG files in batch — looping over pages, templating from data, or emitting them via a generator — is FORBIDDEN, including under "save tokens", "quick draft", or "user is in a hurry" pretexts. The script-generation path was tried on a feature branch and abandoned: cross-page visual consistency depends on per-page authoring with full upstream context, which a generator script cannot reproduce
[!IMPORTANT]
🌐 Language & Communication Rule
- Response language: match the user's input and source materials. Explicit user override (e.g., "请用英文回答") takes precedence.
- Template format:
design_spec.md MUST follow its original English template structure (section headings, field names) regardless of conversation language. Content values may be in the user's language.
[!IMPORTANT]
🔌 Compatibility With Generic Coding Skills
ppt-master is a repository-specific workflow, not a general application scaffold
- Do NOT create
.worktrees/, tests/, branch workflows, or generic engineering structure by default
- On conflict with a generic coding skill, follow this skill unless the user explicitly says otherwise
Main Pipeline Scripts
| Script | Purpose |
|---|
${SKILL_DIR}/scripts/source_to_md/pdf_to_md.py | PDF to Markdown |
${SKILL_DIR}/scripts/source_to_md/doc_to_md.py | Documents to Markdown — native Python for DOCX/HTML/EPUB/IPYNB, pandoc fallback for legacy formats (.doc/.odt/.rtf/.tex/.rst/.org/.typ) |
${SKILL_DIR}/scripts/source_to_md/excel_to_md.py | Excel workbooks to Markdown — supports .xlsx/.xlsm; legacy .xls should be resaved as .xlsx |
${SKILL_DIR}/scripts/source_to_md/ppt_to_md.py | PowerPoint to Markdown |
${SKILL_DIR}/scripts/source_to_md/web_to_md.py | Web page to Markdown (supports WeChat via curl_cffi) |
${SKILL_DIR}/scripts/project_manager.py | Project init / validate / manage |
${SKILL_DIR}/scripts/analyze_images.py | Image analysis |
${SKILL_DIR}/scripts/latex_render.py | LaTeX formula rendering (manifest-driven PNG assets) |
${SKILL_DIR}/scripts/image_gen.py | AI image generation (multi-provider) |
${SKILL_DIR}/scripts/svg_quality_checker.py | SVG quality check |
${SKILL_DIR}/scripts/total_md_split.py | Speaker notes splitting |
${SKILL_DIR}/scripts/finalize_svg.py | SVG post-processing (unified entry) |
${SKILL_DIR}/scripts/svg_to_pptx.py | Export to PPTX |
${SKILL_DIR}/scripts/update_spec.py | Propagate a spec_lock.md color / font_family change across all generated SVGs |
For complete tool documentation, see ${SKILL_DIR}/scripts/README.md.
Windows note: if a python3 ... command fails (common on python.org installs, which provide python.exe but not python3.exe), rerun the same command with python instead.
Template Index
| Index | Path | Purpose |
|---|
| Layout templates | ${SKILL_DIR}/templates/layouts/layouts_index.json | Query available page layout templates |
| Brand presets | ${SKILL_DIR}/templates/brands/brands_index.json | Query available brand identity presets (color / typography / logo / voice) |
| Visualization templates | ${SKILL_DIR}/templates/charts/charts_index.json | Query available visualization SVG templates (charts, infographics, diagrams, frameworks) |
| Icon library | ${SKILL_DIR}/templates/icons/ | See ${SKILL_DIR}/templates/icons/README.md; search icons on demand with ls templates/icons/<library>/ | grep <keyword> |
Standalone Workflows
| Workflow | Path | Purpose |
|---|
topic-research | workflows/topic-research.md | Pre-pipeline — gather web sources when the user supplies only a topic with no source files |
create-template | workflows/create-template.md | Standalone layout template creation workflow |
create-brand | workflows/create-brand.md | Standalone brand-only template creation (identity preset; no SVG page roster) |
resume-execute | workflows/resume-execute.md | Phase B entry — resume execution in a fresh chat after Phase A (Step 1–5) completed in another session (split mode) |
verify-charts | workflows/verify-charts.md | Chart coordinate calibration — run after SVG generation if the deck contains data charts |
customize-animations | workflows/customize-animations.md | Object-level PPTX animation customization — run only when the user explicitly asks to tune animation order/effects/timing |
live-preview | workflows/live-preview.md | Browser-based live preview — auto-started during generation and re-enterable any time the user mentions "live preview", "preview", "看效果", or wants to click/select a slide element |
visual-review | workflows/visual-review.md | Per-page rubric-based visual self-check — run only when the user explicitly asks for a visual re-pass on the generated SVGs (between Executor and post-processing). Opt-in only; never invoked by the main pipeline. |
Workflow
Step 1: Source Content Processing
🚧 GATE: User has provided source material (PDF / DOCX / EPUB / URL / Markdown file / text description / conversation content — any form is acceptable).
No source content? When the user supplies only a topic name or requirements without any file or substantive description, run the topic-research workflow first, then return here with its products as input.
When the user provides non-Markdown content, convert immediately:
| User Provides | Command |
|---|
| PDF file | python3 ${SKILL_DIR}/scripts/source_to_md/pdf_to_md.py <file> |
| DOCX / Word / Office document | python3 ${SKILL_DIR}/scripts/source_to_md/doc_to_md.py <file> |
| XLSX / XLSM / Excel workbook | python3 ${SKILL_DIR}/scripts/source_to_md/excel_to_md.py <file> |
| CSV / TSV | Read directly as plain-text table source |
| PPTX / PowerPoint deck | python3 ${SKILL_DIR}/scripts/source_to_md/ppt_to_md.py <file> |
| EPUB / HTML / LaTeX / RST / other | python3 ${SKILL_DIR}/scripts/source_to_md/doc_to_md.py <file> |
| Web link | python3 ${SKILL_DIR}/scripts/source_to_md/web_to_md.py <URL> |
| WeChat / high-security site | python3 ${SKILL_DIR}/scripts/source_to_md/web_to_md.py <URL> (requires curl_cffi, included in requirements.txt) |
| Markdown | Read directly |
Office vector assets (EMF/WMF) from DOCX/PPTX sources:
doc_to_md.py / ppt_to_md.py extract embedded Office vector images (.emf/.wmf)
alongside bitmap images. After import-sources, these land in images/
together with image_manifest.json and are first-class assets in §VIII Image Resource List.
Do NOT convert EMF/WMF to PNG. The PPT Master pipeline preserves them as external
references (finalize_svg.py skips them) and svg_to_pptx.py embeds them as
PPTX-native media via image/x-emf / image/x-wmf MIME — PowerPoint renders them at full vector fidelity.
Converting via LibreOffice/Inkscape introduces CJK font substitution drift and
rasterization loss; the original EMF/WMF is always higher fidelity than the converted PNG.
Browser-based live preview cannot render EMF (will show blank) — this is expected;
the PPTX output is the source of truth.
✅ Checkpoint — Confirm source content is ready, proceed to Step 2.
Step 2: Project Initialization
🚧 GATE: Step 1 complete; source content is ready (Markdown file, user-provided text, or requirements described in conversation are all valid).
python3 ${SKILL_DIR}/scripts/project_manager.py init <project_name> --format <format>
Format options: ppt169 (default), ppt43, xhs, story, etc. For the full format list, see references/canvas-formats.md.
Import source content (choose based on the situation):
| Situation | Action |
|---|
| Has source files (PDF/MD/etc.) | python3 ${SKILL_DIR}/scripts/project_manager.py import-sources <project_path> <source_files...> --move |
| User provided text directly in conversation | No import needed — content is already in conversation context; subsequent steps can reference it directly |
⚠️ MUST use --move (not copy): all source files — Step 1's generated Markdown, original PDFs / MDs / images — go into sources/ via import-sources --move. After execution they no longer exist at the original location. Intermediate artifacts (e.g., _files/) are handled automatically.
✅ Checkpoint — Confirm project structure created successfully, sources/ contains all source files, converted materials are ready. Proceed to Step 3.
Step 3: Template Option
🚧 GATE: Step 2 complete; project directory structure is ready.
Default — free design. Proceed directly to Step 4. Do NOT query any *_index.json unless triggered. Do NOT ask the user. Do NOT proactively suggest, hint at, or fuzzy-match any template based on content, slug-like words, or vague style descriptions.
Template flow triggers ONLY on explicit directory paths supplied by the user in their initial message. The trigger rule is mechanical, not interpretive:
| User input contains | Step 3 action |
|---|
One or more explicit template directory paths (each resolves to a directory containing design_spec.md with kind: brand / kind: layout / kind: deck in its YAML frontmatter) | Read each spec's kind, dispatch per the kind matrix below, fuse if multiple |
| Anything else — bare template names ("用 academic_defense"), style descriptions ("麦肯锡风格"), brand mentions ("招商银行风格"), vague intent ("想用个模板"), or silence | Skip Step 3, free design |
There is no slug matching, no name lookup, no fuzzy resolution. A name without a path does not trigger — the user must give a path the AI can cd into.
Style descriptions ("麦肯锡风格" / "Keynote 风" / "极简风" / etc.) never trigger Step 3. They flow into Strategist's Eight Confirmations as a style brief (color / typography / tone in confirmations e–g).
Bare names ("academic_defense", "招商银行", "anthropic") do NOT trigger Step 3 even if a matching directory exists in the library. The user must give a path. AI must not "helpfully" resolve a name to a path.
"What templates exist?" is out-of-band Q&A — answer by listing entries from brands_index.json / layouts_index.json / decks_index.json together with their paths. Listing alone does not advance the pipeline; the user must send a path back to trigger Step 3.
To create a new layout or deck, read workflows/create-template.md. To create a new brand, read workflows/create-brand.md.
Three template kinds
The architecture has three independent reference bundles. Full schema in docs/zh/templates-architecture.md. Summary:
| Kind | Physical dir | Contains | Frontmatter |
|---|
| brand | templates/brands/<id>/ | identity-only segment: color / typography / logo / voice / icon style | kind: brand |
| layout | templates/layouts/<id>/ | structure-only segment: canvas / page structure / page types / SVG roster | kind: layout |
| deck | templates/decks/<id>/ | full replica: identity + structure + middle (template overview) segments | kind: deck |
Segment ownership (governs fusion override priority):
| Segment | Sections | Owner kind on fusion |
|---|
| Identity | Color Scheme / Typography / Logo / Voice & Tone / Icon Style | brand |
| Structure | Canvas / Page Structure / Page Types / SVG Roster | layout |
| Middle | Template Overview (use cases / design intent) | deck (no other kind writes this) |
Single-path dispatch
User path's kind | Step 3 action |
|---|
kind: brand | Copy design_spec.md + logo files + asset subdirs (images/ / illustrations/ / icons/) into <project>/templates/. Strategist locks identity segment as truth; structure stays free. |
kind: layout | Copy design_spec.md + SVG roster + asset files into <project>/templates/. Strategist locks structure; identity decided in Eight Confirmations e–g. |
kind: deck | Copy everything (design_spec.md + SVGs + logos + assets) into <project>/templates/. Strategist locks all segments; Eight Confirmations narrows to deck-content fields (audience / page count / outline / tone tweaks). |
TEMPLATE_DIR=<user-supplied path>
cp -r ${TEMPLATE_DIR}/* <project_path>/templates/
The single-line copy suffices for all three kinds — the spec's kind field tells Strategist how to read it; downstream code doesn't distinguish.
Multi-path fusion
When the user gives two or more paths of different kinds, Step 3 fuses them into a single <project>/templates/design_spec.md. Default granularity is segment-level integer replacement — entire identity / structure / middle segments are taken from the highest-priority source for that segment, no implicit field-level mixing.
Override priority by segment:
| Combination | Identity from | Structure from | Middle from |
|---|
| brand only | brand | (free design) | (none) |
| layout only | (free design) | layout | (none) |
| deck only | deck | deck | deck |
| brand + layout | brand | layout | (none) |
| brand + deck | brand (overrides deck) | deck | deck |
| layout + deck | deck | layout (overrides deck) | deck |
| brand + layout + deck | brand | layout | deck |
Field-level micro-adjustment (e.g. "use anthropic brand but primary changed to #FF0000") is not part of Step 3 fusion — it flows into Strategist Eight Confirmations e–g as a normal user request.
Same-kind multiple paths — conflict resolution
When the user gives two paths of the same kind (e.g. brands/anthropic + brands/google), Step 3 surfaces a conflict prompt before fusing — like resolving a git merge conflict:
AI: 你给了两个 brand,检测到段级冲突:
- Color Scheme(Anthropic 橙红 vs Google 多色)
- Typography(Styrene/AnthropicSans vs GoogleSans/Roboto)
- Logo(Anthropic 标 vs Google 标)
- Voice & Tone(restrained vs friendly)
- Icon Style(stroke vs filled)
要 (a) 全部按 Anthropic / (b) 全部按 Google / (c) 逐段挑?
Rules:
- Default: no implicit ordering — every cross-source segment difference is reported as a conflict
- Only when the user picks
(c) does AI walk through each segment one by one
- Field-level conflicts are out of scope — segment-level only
- Three or more same-kind paths are not supported — ask the user to converge to at most two
Fused spec provenance
When fusion happens (any multi-path case), the resulting <project>/templates/design_spec.md carries a provenance block immediately under its H1:
> **Fused from:**
> - deck: `templates/decks/招商银行/` (base)
> - brand: `templates/brands/anthropic/` (identity override)
> - layout: `templates/layouts/academic_defense/` (structure override)
> - conflicts resolved: Color Scheme from anthropic(user picked a)
Single-path Step 3 does not add provenance (the source is self-evident from the copied files).
✅ Checkpoint — Default path proceeds to Step 4 without user interaction. If the user supplied one or more explicit template paths, those have been dispatched (or fused) into <project_path>/templates/ before advancing.
Step 4: Strategist Phase (MANDATORY — cannot be skipped)
🚧 GATE: Step 3 complete; default free-design path taken, or (if triggered) template files copied into the project.
First, read the role definition:
Read references/strategist.md
⚠️ Mandatory gate: before writing design_spec.md, Strategist MUST read_file templates/design_spec_reference.md and follow its full I–XI section structure. See strategist.md Section 1.
Eight Confirmations (full template: templates/design_spec_reference.md):
⛔ BLOCKING: present the Eight Confirmations as a single bundled recommendation set and wait for explicit user confirmation or modification before outputting Design Specification & Content Outline. This is the single core confirmation point — once confirmed, all subsequent steps proceed automatically.
- Canvas format
- Page count range
- Target audience
- Style objective
- Color scheme
- Icon usage approach
- Typography plan, including formula rendering policy
- Image usage approach
Mandatory — split-mode note (not a ninth confirmation): after listing the eight confirmation details, you MUST append exactly one short line (rendered in the user's language, prefixed with 💡) about generation mode. Pick the variant by qualitative read of Phase A signals — recommended page count, source-material bulk, whether topic-research ran with substantial web-fetch accumulation:
| Signal read | Line content |
|---|
| Heavy (long page count / bulky sources / heavy web-fetch accumulation) | State estimated page count and large source size; recommend switching to split mode after Step 5 — stop this chat, open a fresh window and input 继续生成 projects/<project_name> to enter Phase B (SVG generation + export); no response or "continue" = default continuous mode. |
| Normal (default) | State scale is moderate, default continuous mode generates in one go; if mid-way window switch is desired, input 继续生成 projects/<project_name> after Step 5 to switch to split mode. |
This line is required output every run — the user must always see the mode choice exists. Whether to act on it is the user's call.
Formula rendering policy lives inside item 7 (Typography plan):
| Policy | Behavior |
|---|
mixed (default) | Strategist renders complex formula-worthy expressions as PNG assets; simple inline expressions remain editable text / Unicode |
render-all | Strategist renders every formula-worthy expression as PNG assets |
text-only | No formula rendering; formulas remain editable text / Unicode |
After the Eight Confirmations are approved and before outputting design_spec.md / spec_lock.md, if the confirmed formula policy is mixed or render-all and the content contains formula-worthy expressions, Strategist MUST:
- Identify explicit LaTeX and any source expressions that should be faithfully structured as formulas.
- Write
<project_path>/images/formula_manifest.json with only the formulas selected for rendering.
- Run:
python3 ${SKILL_DIR}/scripts/latex_render.py <project_path>
- Include the rendered formula PNGs as
Acquire Via: formula, Status: Rendered, Type: Latex Formula rows in design_spec.md §VIII Image Resource List; also list them in spec_lock.md images with | no-crop.
The formula renderer uses a provider fallback chain by default: codecogs,quicklatex,mathpad,wikimedia. The first three are color-aware; Wikimedia is an availability fallback. Formula PNGs are transparent by default: manifest background is the temporary render matte and transparency-removal reference, not a retained final background unless transparent: false is set for that item. Do not scan spec_lock.md for $...$ or $$...$$. Dollar-delimited math in source material is only a signal for Strategist; the renderer consumes the explicit manifest.
If the user provided images or formula PNGs were rendered, run analysis before outputting the design spec:
python3 ${SKILL_DIR}/scripts/analyze_images.py <project_path>/images
⚠️ Image handling: NEVER directly read / open / view image files (.jpg, .png, etc.). All image info comes from analyze_images.py output or the Design Spec's Image Resource List.
Output:
<project_path>/design_spec.md — human-readable design narrative
<project_path>/spec_lock.md — machine-readable execution contract (skeleton: templates/spec_lock_reference.md); Executor re-reads before every page
✅ Checkpoint — Phase deliverables complete, auto-proceed to next step:
## ✅ Strategist Phase Complete
- [x] Eight Confirmations completed (user confirmed)
- [x] Split-mode note appended below the eight items (heavy or normal variant)
- [x] Design Specification & Content Outline generated
- [x] Execution lock (spec_lock.md) generated
- [ ] **Next**: Auto-proceed to [Image_Generator / Executor] phase
Step 5: Image Acquisition Phase (Conditional)
🚧 GATE: Step 4 complete; Design Specification & Content Outline generated and user confirmed. Any formula rows already have Acquire Via: formula and Status: Rendered.
Trigger: At least one row in the resource list has Acquire Via: ai and/or Acquire Via: web. If every row is user, formula, or placeholder, skip to Step 6.
Always load the common framework:
Read references/image-base.md
Then lazy-load the path-specific reference for each row that actually needs it:
| Acquire Via | Load reference (only if any such row exists) | Run |
|---|
ai | references/image-generator.md | python3 ${SKILL_DIR}/scripts/image_gen.py --manifest <project_path>/images/image_prompts.json |
web | references/image-searcher.md | python3 ${SKILL_DIR}/scripts/image_search.py ... |
user / placeholder | (skip) | (skip) |
A deck with only ai rows never loads image-searcher.md; a deck with only web rows never loads image-generator.md. A mixed deck loads both, processes each row through its own path, and writes both image_prompts.json and image_sources.json.
⚠️ In-pipeline ai path MUST use manifest mode — even when only 1 ai row exists. Write images/image_prompts.json first, then run image_gen.py --manifest, then image_gen.py --render-md to produce the image_prompts.md sidecar. The positional form (image_gen.py "prompt" ...) is reserved for out-of-pipeline one-off testing / single-image fixups — it skips manifest + sidecar, leaving no audit trail.
Workflow:
- Extract all rows with
Status: Pending and Acquire Via ∈ {ai, web} from the design spec
- Generate prompts (ai rows) and/or run search (web rows) per image-base.md §2 dispatch table
- Verify every row reaches a terminal status:
Generated (ai success), Sourced (web success), or Needs-Manual
✅ Checkpoint — Confirm acquisition attempted for every row:
## ✅ Image Acquisition Phase Complete
- [x] image_prompts.json created (when any ai rows processed)
- [x] image_prompts.md sidecar rendered (when any ai rows processed)
- [x] image_sources.json created (when any web rows processed)
- [x] Each row: status is `Generated` / `Sourced` / `Needs-Manual` (no `Pending` remaining)
Default — auto-proceed to Step 6. Only when the user's Step 4 response explicitly opted into split mode (in reply to the optional hint), output the Phase A hand-off below and stop this conversation:
## ✅ Phase A Complete
- [x] Spec: `design_spec.md`, `spec_lock.md`
- [x] Resources: `sources/`, `images/`, `templates/`
- [ ] **Next**: open a fresh chat window and input `继续生成 projects/<project_name>` to enter Phase B via the [`resume-execute`](workflows/resume-execute.md) workflow.
On acquisition failure, do NOT halt — follow the Failure Handling rule in image-base.md §5: retry once, then mark the row Needs-Manual, report to user, and continue to the checkpoint above.
Step 6: Executor Phase
🚧 GATE: Step 4 (and Step 5 if triggered) complete; all prerequisite deliverables are ready.
Read the role definition based on the selected style:
Read references/executor-base.md # REQUIRED: common guidelines
Read references/shared-standards.md # REQUIRED: SVG/PPT technical constraints
Read references/executor-general.md # General flexible style
Read references/executor-consultant.md # Consulting style
Read references/executor-consultant-top.md # Top consulting style (MBB level)
Only read executor-base + shared-standards + one style file.
Design Parameter Confirmation (Mandatory): before the first SVG, output key design parameters from the spec (canvas dimensions, color scheme, font plan, body font size). See executor-base.md §2.
Live Preview Auto-Startup (Mandatory): before the first SVG, automatically start the browser editor in live mode and keep it running continuously through Executor + Step 7 export:
python3 ${SKILL_DIR}/scripts/svg_editor/server.py <project_path> --live
- Start it immediately when Executor begins;
svg_output/ may be empty. Editor opens at http://localhost:5050; port conflict → --port <other> and report the actual URL.
- Run it as a long-running side process/session; do not wait for it to exit before generating SVG pages. Do not wait for user confirmation after startup.
- Service must keep running until one of: (a) the user clicks Exit preview in the browser, or (b) the user explicitly asks in chat to stop it. Generation continues even if the user closes the editor.
- Do NOT read or apply submitted annotations during generation. Users may annotate at any time, but Executor proceeds without touching them. The window to apply annotations opens only after Step 7 completes — see
workflows/live-preview.md.
- The editor also supports staged direct edits (text content + SVG element attributes previewed immediately, then written to
svg_output/ only when the user clicks Apply changes; Ctrl+Z / Undo drops staged edits) alongside annotation; re-export stays chat-driven. Full scope and editor details: see workflows/live-preview.md Notes.
Pre-generation Batch Read (Mandatory): before the first SVG, batch-read every distinct layout SVG referenced in spec_lock.page_layouts and every distinct chart SVG referenced in spec_lock.page_charts (plus any §VII backup charts). One read per file, up front — do not re-read these during page generation. See executor-base.md §1.0.
Per-page spec_lock re-read (Mandatory): before each SVG page, read_file <project_path>/spec_lock.md and use only its colors / fonts / icons / images, plus the per-page page_rhythm / page_layouts / page_charts lookups (resolves to template SVGs already loaded in the batch read above). Resists context-compression drift on long decks. See executor-base.md §2.1.
⚠️ Main-agent only: SVG generation MUST stay in the current main agent — page design depends on full upstream context. Do NOT delegate to sub-agents.
⚠️ Generation rhythm: generate pages sequentially, one at a time, in the same continuous context. Do NOT batch (e.g., 5 per group).
Visual Construction Phase: generate SVG pages sequentially, one at a time, in one continuous pass → <project_path>/svg_output/
Quality Check Gate (Mandatory) — after all SVGs, BEFORE annotation handling and speaker notes:
python3 ${SKILL_DIR}/scripts/svg_quality_checker.py <project_path>
- Any
error (banned SVG features, viewBox mismatch, spec_lock drift, etc.) MUST be fixed before proceeding — return to Visual Construction, regenerate that page, re-run check.
warning entries (low-res image, non-PPT-safe font tail, etc.): fix when straightforward, otherwise acknowledge and release.
- Run against
svg_output/ (not after finalize_svg.py — finalize rewrites SVG and masks violations).
Logic Construction Phase: generate speaker notes → <project_path>/notes/total.md
✅ Checkpoint — Confirm all SVGs and notes are fully generated and quality-checked. Proceed directly to Step 7 post-processing:
## ✅ Executor Phase Complete
- [x] Live preview started and kept available at the reported URL
- [x] All SVGs generated to svg_output/
- [x] svg_quality_checker.py passed (0 errors)
- [x] Speaker notes generated at notes/total.md
Chart pages? If this deck contains data charts (bar / line / pie / radar / etc.), run the standalone verify-charts workflow before Step 7 to calibrate coordinates. AI models routinely introduce 10–50 px errors when mapping data to pixel positions; verify-charts eliminates that class of error. Skip if no chart pages.
Visual self-check (opt-in)? If the user explicitly asked for a per-page visual re-pass on the SVGs ("跑一下视觉自检 / 视觉回看", "visual review", "check pages visually", etc.), run the standalone visual-review workflow before Step 7. Do NOT run it by default and do NOT recommend it based on inferred model capability or deck size — trigger is user request only.
Step 7: Post-processing & Export
🚧 GATE: Step 6 complete; all SVGs generated to svg_output/; speaker notes notes/total.md generated.
🚧 Image readiness GATE (when Step 5 left ai rows in Needs-Manual): every expected file must exist at project/images/<filename> before running 7.1.
If files are missing: PAUSE, list the missing filenames, point the user to images/image_prompts.md (each ### Image N: block is paste-ready for ChatGPT / Gemini / Midjourney; auto-generated from image_prompts.json) and the required placement project/images/<filename>. Resume Step 7.1 only after all expected files are in place. finalize_svg.py and svg_to_pptx.py do not detect missing files at this layer — proceeding with gaps produces a deck with broken image references.
⚠️ Run the three sub-steps one at a time — each must complete successfully before the next.
❌ NEVER combine them into a single code block or shell invocation.
Canonical three-command pipeline (mirrors references/shared-standards.md §5):
Step 7.1 — Split speaker notes:
python3 ${SKILL_DIR}/scripts/total_md_split.py <project_path>
Step 7.2 — SVG post-processing (icon embedding / image crop & embed / text flattening / rounded rect to path):
python3 ${SKILL_DIR}/scripts/finalize_svg.py <project_path>
Step 7.3 — Export PPTX (embeds speaker notes by default):
python3 ${SKILL_DIR}/scripts/svg_to_pptx.py <project_path>
The native pptx consumes svg_output/ directly so the converter can preserve
high-fidelity primitives (icon <use> placeholders, image preserveAspectRatio
→ srcRect, rounded rect rx/ry → prstGeom roundRect). The svg_output/
snapshot in backup/<timestamp>/ is always written so the project can be
re-exported from frozen SVG sources without re-running the LLM. The SVG-rendered
preview pptx is opt-in via --svg-snapshot — live preview already provides the
SVG visual reference, so it's only needed when you want a self-contained file
to share. Pass -s output or -s final to force a single source if you need it.
Paragraph editability vs line fidelity — by default every dy-stacked line is
its own PowerPoint text frame, preserving exact SVG layout. Add --merge-paragraphs
only when the user explicitly asks for an editable / wrap-friendly export (e.g.
"I want to edit the abstract as one block", "make text boxes resizable / reflow"):
mergeable paragraph blocks collapse into one editable text frame with multiple
<a:p>, at the cost of PowerPoint re-wrapping inside each box. Default off keeps
pixel-fidelity; turn it on per the user's request, not on your own judgement.
Optional animation flags (the defaults already enable rich entrance animations — adjust only when the user asks for something different):
-t <effect> — page transition. Default fade. Options: fade / push / wipe / split / strips / cover / random / none.
-a <effect> — per-element entrance animation. Default auto (map effect from group id: chart→wipe, card-/step-/pillar-→fly, title/takeaway→fade; image-like ids hero / figure- / image / img- / kpi cycle a richer pool — zoom / dissolve / circle / box / diamond / wheel — so multiple images vary across the deck). Pass none to disable, a specific effect like fade, or mixed for the legacy 16-effect cycle. Requires top-level <g id="..."> groups (already required by Executor).
--animation-trigger {on-click,with-previous,after-previous} — Start mode (matches PowerPoint's animation-pane Start dropdown). Default after-previous (click-free cascade; pace via --animation-stagger). Use on-click for presenter-paced reveals, or with-previous for all-at-once.
--animation-config <path> — optional object-level sidecar. Default: <project_path>/animations.json when present.
--auto-advance <seconds> — kiosk-style auto-play.
Optional custom animations (only when the user asks to tune animation order/effects/timing for specific objects):
Run the standalone customize-animations workflow. Default export already has global entrance animation; do not create animations.json unless object-level customization was requested.
Optional recorded narration (only when the user asks for narrated/video export):
Run the standalone generate-audio workflow. The AI picks a narration backend (edge by default, or a configured cloud provider such as ElevenLabs / MiniMax / Qwen / CosyVoice for high-quality or cloned voices), asks the user once (backend + voice + rate/settings + embed-or-not, all with recommended values), then executes notes_to_audio.py and (if chosen) re-exports the PPTX with --recorded-narration audio.
Do NOT call notes_to_audio.py directly without going through the workflow — --voice / --voice-id is required and the workflow produces the locale/provider-aware recommendation that makes the choice meaningful.
Full effect list, anchor logic, and limits: references/animations.md.
❌ NEVER substitute cp for finalize_svg.py — finalize performs multiple critical processing steps
❌ NEVER force -s output for the legacy/preview pptx (PowerPoint's internal SVG parser drops icons and rounded corners). The default auto-split already gives native the high-fidelity source it needs without touching legacy.
❌ NEVER use --only (it suppresses one of the two output files)
Post-export annotation window: the preview service from Step 6 typically remains running after export. If the user submitted annotations in the browser (during Executor or after export) and now asks to apply them — they may quote the browser prompt (Changes saved to svg_output... / 修改已保存到 svg_output...), say "apply my annotations" / "应用注解" / equivalent — run live-preview Step 2 to apply and re-export. Annotations submitted during generation are also handled here, not earlier.
Direct edits in the browser: the user may also stage text / SVG attribute edits in the preview. These land in svg_output/ only after the user clicks Apply changes. If they ask to "re-export" / "重新导出" after applying such edits, just re-run Step 7.2–7.3 (finalize + export); no annotation-application step is needed unless they also saved AI-needed annotations.
Preview not running? Any time the user mentions "live preview", "preview", "看效果", or wants to select/click a slide element and the service is not running, run live-preview Step 1 to start it. If the service is already running, just point them at the URL — do not restart.
Role Switching Protocol
Before switching roles, MUST first read the corresponding reference file. Output marker:
## [Role Switch: <Role Name>]
📖 Reading role definition: references/<filename>.md
📋 Current task: <brief description>
Reference Resources
| Resource | Path |
|---|
| Shared technical constraints | references/shared-standards.md |
| Canvas format specification | references/canvas-formats.md |
| Image-text layout patterns (Primary structures + Modifier layers — combine freely) | references/image-layout-patterns.md |
| Image layout sizing (math for side-by-side container dimensions) | references/image-layout-spec.md |
| SVG image embedding | references/svg-image-embedding.md |
| Icon library | templates/icons/README.md |
Notes
- Local preview:
python3 -m http.server -d <project_path>/svg_final 8000
- Troubleshooting: on generation issues (layout overflow, export errors, blank images, etc.), check
docs/faq.md for known solutions