| name | ppt-master |
| description | AI-driven multi-format SVG content generation system. Converts source documents (PDF/DOCX/URL/Markdown) into high-quality SVG pages and exports to PPTX through multi-role collaboration. Use when user asks to "create PPT", "make presentation", "生成PPT", "做PPT", "制作演示文稿", or mentions "ppt-master".
|
PPT Master Skill
AI-driven multi-format SVG content generation system. Converts source documents into high-quality SVG pages through multi-role collaboration and exports to PPTX.
Core Pipeline: Source Document → Create Project → [Template] → Strategist → [Image_Generator] → Executor → Post-processing → Export
[!CAUTION]
🚨 Global Execution Discipline (MANDATORY)
This workflow is a strict serial pipeline. The following rules have the highest priority — violating any one of them constitutes execution failure:
- SERIAL EXECUTION — Steps MUST be executed in order; the output of each step is the input for the next. Non-BLOCKING adjacent steps may proceed continuously once prerequisites are met, without waiting for the user to say "continue"
- BLOCKING = HARD STOP — Steps marked ⛔ BLOCKING require a full stop; the AI MUST wait for an explicit user response before proceeding and MUST NOT make any decisions on behalf of the user
- NO CROSS-PHASE BUNDLING — Cross-phase bundling is FORBIDDEN. (Note: the Eight Confirmations in Step 4 are ⛔ BLOCKING — the AI MUST present recommendations and wait for explicit user confirmation before proceeding. Once the user confirms, all subsequent non-BLOCKING steps — design spec output, SVG generation, speaker notes, and post-processing — may proceed automatically without further user confirmation)
- GATE BEFORE ENTRY — Each Step has prerequisites (🚧 GATE) listed at the top; these MUST be verified before starting that Step
- NO SPECULATIVE EXECUTION — "Pre-preparing" content for subsequent Steps is FORBIDDEN (e.g., writing SVG code during the Strategist phase)
- NO SUB-AGENT SVG GENERATION — Executor Step 6 SVG generation is context-dependent and MUST be completed by the current main agent end-to-end. Delegating page SVG generation to sub-agents is FORBIDDEN
- SEQUENTIAL PAGE GENERATION ONLY — In Executor Step 6, after the global design context is confirmed, SVG pages MUST be generated sequentially page by page in one continuous pass. Grouped page batches (for example, 5 pages at a time) are FORBIDDEN
- SPEC_LOCK RE-READ PER PAGE — Before generating each SVG page, Executor MUST
read_file <project_path>/spec_lock.md. All colors / fonts / icons / images MUST come from this file — no values from memory or invented on the fly. Executor MUST also look up the current page's page_rhythm (anchor / dense / breathing), page_layouts (which template SVG to inherit, if any), and page_charts (which chart template to adapt, if any). Empty / absent entries are intentional Strategist signals — see executor-base.md §2.1. This rule exists to resist context-compression drift on long decks and to break the uniform "every page is a card grid" default
[!IMPORTANT]
🌐 Language & Communication Rule
- Response language: match the user's input and source materials. Explicit user override (e.g., "请用英文回答") takes precedence.
- Template format:
design_spec.md MUST follow its original English template structure (section headings, field names) regardless of conversation language. Content values may be in the user's language.
[!IMPORTANT]
🔌 Compatibility With Generic Coding Skills
ppt-master is a repository-specific workflow, not a general application scaffold
- Do NOT create
.worktrees/, tests/, branch workflows, or generic engineering structure by default
- On conflict with a generic coding skill, follow this skill unless the user explicitly says otherwise
Main Pipeline Scripts
| Script | Purpose |
|---|
${SKILL_DIR}/scripts/source_to_md/pdf_to_md.py | PDF to Markdown |
${SKILL_DIR}/scripts/source_to_md/doc_to_md.py | Documents to Markdown — native Python for DOCX/HTML/EPUB/IPYNB, pandoc fallback for legacy formats (.doc/.odt/.rtf/.tex/.rst/.org/.typ) |
${SKILL_DIR}/scripts/source_to_md/excel_to_md.py | Excel workbooks to Markdown — supports .xlsx/.xlsm; legacy .xls should be resaved as .xlsx |
${SKILL_DIR}/scripts/source_to_md/ppt_to_md.py | PowerPoint to Markdown |
${SKILL_DIR}/scripts/source_to_md/web_to_md.py | Web page to Markdown (supports WeChat via curl_cffi) |
${SKILL_DIR}/scripts/project_manager.py | Project init / validate / manage |
${SKILL_DIR}/scripts/analyze_images.py | Image analysis |
${SKILL_DIR}/scripts/image_gen.py | AI image generation (multi-provider) |
${SKILL_DIR}/scripts/svg_quality_checker.py | SVG quality check |
${SKILL_DIR}/scripts/total_md_split.py | Speaker notes splitting |
${SKILL_DIR}/scripts/finalize_svg.py | SVG post-processing (unified entry) |
${SKILL_DIR}/scripts/svg_to_pptx.py | Export to PPTX |
${SKILL_DIR}/scripts/update_spec.py | Propagate a spec_lock.md color / font_family change across all generated SVGs |
For complete tool documentation, see ${SKILL_DIR}/scripts/README.md.
Template Index
| Index | Path | Purpose |
|---|
| Layout templates | ${SKILL_DIR}/templates/layouts/layouts_index.json | Query available page layout templates |
| Visualization templates | ${SKILL_DIR}/templates/charts/charts_index.json | Query available visualization SVG templates (charts, infographics, diagrams, frameworks) |
| Icon library | ${SKILL_DIR}/templates/icons/ | See ${SKILL_DIR}/templates/icons/README.md; search icons on demand with ls templates/icons/<library>/ | grep <keyword> |
Standalone Workflows
| Workflow | Path | Purpose |
|---|
topic-research | workflows/topic-research.md | Pre-pipeline — gather web sources when the user supplies only a topic with no source files |
create-template | workflows/create-template.md | Standalone template creation workflow |
resume-execute | workflows/resume-execute.md | Phase B entry — resume execution in a fresh chat after Phase A (Step 1–5) completed in another session (split mode) |
verify-charts | workflows/verify-charts.md | Chart coordinate calibration — run after SVG generation if the deck contains data charts |
visual-edit | workflows/visual-edit.md | Browser-based visual editor for fine-grained edits — run only when the user explicitly requests it after export |
Workflow
Step 1: Source Content Processing
🚧 GATE: User has provided source material (PDF / DOCX / EPUB / URL / Markdown file / text description / conversation content — any form is acceptable).
No source content? When the user supplies only a topic name or requirements without any file or substantive description, run the topic-research workflow first, then return here with its products as input.
When the user provides non-Markdown content, convert immediately:
| User Provides | Command |
|---|
| PDF file | python3 ${SKILL_DIR}/scripts/source_to_md/pdf_to_md.py <file> |
| DOCX / Word / Office document | python3 ${SKILL_DIR}/scripts/source_to_md/doc_to_md.py <file> |
| XLSX / XLSM / Excel workbook | python3 ${SKILL_DIR}/scripts/source_to_md/excel_to_md.py <file> |
| CSV / TSV | Read directly as plain-text table source |
| PPTX / PowerPoint deck | python3 ${SKILL_DIR}/scripts/source_to_md/ppt_to_md.py <file> |
| EPUB / HTML / LaTeX / RST / other | python3 ${SKILL_DIR}/scripts/source_to_md/doc_to_md.py <file> |
| Web link | python3 ${SKILL_DIR}/scripts/source_to_md/web_to_md.py <URL> |
| WeChat / high-security site | python3 ${SKILL_DIR}/scripts/source_to_md/web_to_md.py <URL> (requires curl_cffi, included in requirements.txt) |
| Markdown | Read directly |
✅ Checkpoint — Confirm source content is ready, proceed to Step 2.
Step 2: Project Initialization
🚧 GATE: Step 1 complete; source content is ready (Markdown file, user-provided text, or requirements described in conversation are all valid).
python3 ${SKILL_DIR}/scripts/project_manager.py init <project_name> --format <format>
Format options: ppt169 (default), ppt43, xhs, story, etc. For the full format list, see references/canvas-formats.md.
Import source content (choose based on the situation):
| Situation | Action |
|---|
| Has source files (PDF/MD/etc.) | python3 ${SKILL_DIR}/scripts/project_manager.py import-sources <project_path> <source_files...> --move |
| User provided text directly in conversation | No import needed — content is already in conversation context; subsequent steps can reference it directly |
⚠️ MUST use --move (not copy): all source files — Step 1's generated Markdown, original PDFs / MDs / images — go into sources/ via import-sources --move. After execution they no longer exist at the original location. Intermediate artifacts (e.g., _files/) are handled automatically.
✅ Checkpoint — Confirm project structure created successfully, sources/ contains all source files, converted materials are ready. Proceed to Step 3.
Step 3: Template Option
🚧 GATE: Step 2 complete; project directory structure is ready.
Default — free design. Proceed directly to Step 4. Do NOT query layouts_index.json unless triggered. Do NOT ask the user. Do NOT proactively suggest, hint at, or fuzzy-match any template based on content, slug-like words, or vague style descriptions.
Template flow triggers ONLY on an explicit template directory path supplied by the user in their initial message. The trigger rule is mechanical, not interpretive:
| User input contains | Step 3 action |
|---|
An explicit path to a template directory (e.g. skills/ppt-master/templates/layouts/academic_defense/, projects/foo/template/, or any other absolute / relative path that resolves to a directory containing design_spec.md and one or more page SVGs) | Copy that directory's SVGs + design_spec.md + assets into the project, advance |
| Anything else — including bare template names ("用 academic_defense 模板"), style descriptions ("麦肯锡风格" / "Google style"), brand mentions ("招商银行风格"), vague intent ("想用个模板"), or silence | Skip Step 3, free design |
There is no slug matching, no name lookup, no fuzzy resolution. A template name without a path does not trigger — the user must give a path the AI can cd into.
The path may live anywhere — skills/ppt-master/templates/layouts/<name>/ (the built-in library), projects/<other_project>/template/ (reusing a previous project's templates), or any other location. Location is irrelevant; what matters is that the user named the path.
TEMPLATE_DIR=<user-supplied path>
cp ${TEMPLATE_DIR}/*.svg <project_path>/templates/
cp ${TEMPLATE_DIR}/design_spec.md <project_path>/templates/
cp ${TEMPLATE_DIR}/*.png <project_path>/images/ 2>/dev/null || true
cp ${TEMPLATE_DIR}/*.jpg <project_path>/images/ 2>/dev/null || true
Style descriptions ("麦肯锡风格" / "Keynote 风" / "极简风" / etc.) never trigger Step 3. They flow naturally into Strategist's Eight Confirmations as part of the user's input — Strategist uses them as a style brief when proposing color / typography / tone in confirmations e and g.
Bare template names ("academic_defense", "招商银行") do NOT trigger Step 3 even if a folder by that name exists in the library. The user must give a path. AI must not "helpfully" resolve a name to a path.
"What templates exist?" is out-of-band Q&A — answer by listing entries from layouts_index.json together with their paths. Listing alone does not advance the pipeline; the user still has to send a path to trigger the Step 3 copy.
To create a new template, read workflows/create-template.md.
✅ Checkpoint — Default path proceeds to Step 4 without user interaction. If the user's input contains an explicit template directory path, that directory is copied before advancing.
Step 4: Strategist Phase (MANDATORY — cannot be skipped)
🚧 GATE: Step 3 complete; default free-design path taken, or (if triggered) template files copied into the project.
First, read the role definition:
Read references/strategist.md
⚠️ Mandatory gate: before writing design_spec.md, Strategist MUST read_file templates/design_spec_reference.md and follow its full I–XI section structure. See strategist.md Section 1.
Eight Confirmations (full template: templates/design_spec_reference.md):
⛔ BLOCKING: present the Eight Confirmations as a single bundled recommendation set and wait for explicit user confirmation or modification before outputting Design Specification & Content Outline. This is the single core confirmation point — once confirmed, all subsequent steps proceed automatically.
- Canvas format
- Page count range
- Target audience
- Style objective
- Color scheme
- Icon usage approach
- Typography plan
- Image usage approach
Mandatory — split-mode note (not a ninth confirmation): after listing the eight confirmation details, you MUST append exactly one short line (rendered in the user's language, prefixed with 💡) about generation mode. Pick the variant by qualitative read of Phase A signals — recommended page count, source-material bulk, whether topic-research ran with substantial web-fetch accumulation:
| Signal read | Line content |
|---|
| Heavy (long page count / bulky sources / heavy web-fetch accumulation) | State estimated page count and large source size; recommend switching to split mode after Step 5 — stop this chat, open a fresh window and input 继续生成 projects/<project_name> to enter Phase B (SVG generation + export); no response or "continue" = default continuous mode. |
| Normal (default) | State scale is moderate, default continuous mode generates in one go; if mid-way window switch is desired, input 继续生成 projects/<project_name> after Step 5 to switch to split mode. |
This line is required output every run — the user must always see the mode choice exists. Whether to act on it is the user's call.
If the user provided images, run analysis before outputting the design spec:
python3 ${SKILL_DIR}/scripts/analyze_images.py <project_path>/images
⚠️ Image handling: NEVER directly read / open / view image files (.jpg, .png, etc.). All image info comes from analyze_images.py output or the Design Spec's Image Resource List.
Output:
<project_path>/design_spec.md — human-readable design narrative
<project_path>/spec_lock.md — machine-readable execution contract (skeleton: templates/spec_lock_reference.md); Executor re-reads before every page
✅ Checkpoint — Phase deliverables complete, auto-proceed to next step:
## ✅ Strategist Phase Complete
- [x] Eight Confirmations completed (user confirmed)
- [x] Split-mode note appended below the eight items (heavy or normal variant)
- [x] Design Specification & Content Outline generated
- [x] Execution lock (spec_lock.md) generated
- [ ] **Next**: Auto-proceed to [Image_Generator / Executor] phase
Step 5: Image Acquisition Phase (Conditional)
🚧 GATE: Step 4 complete; Design Specification & Content Outline generated and user confirmed.
Trigger: At least one row in the resource list has Acquire Via: ai and/or Acquire Via: web. If every row is user or placeholder, skip to Step 6.
Always load the common framework:
Read references/image-base.md
Then lazy-load the path-specific reference for each row that actually needs it:
| Acquire Via | Load reference (only if any such row exists) | Run |
|---|
ai | references/image-generator.md | python3 ${SKILL_DIR}/scripts/image_gen.py ... |
web | references/image-searcher.md | python3 ${SKILL_DIR}/scripts/image_search.py ... |
user / placeholder | (skip) | (skip) |
A deck with only ai rows never loads image-searcher.md; a deck with only web rows never loads image-generator.md. A mixed deck loads both, processes each row through its own path, and writes both image_prompts.md and image_sources.json.
Workflow:
- Extract all rows with
Status: Pending and Acquire Via ∈ {ai, web} from the design spec
- Generate prompts (ai rows) and/or run search (web rows) per image-base.md §2 dispatch table
- Verify every row reaches a terminal status:
Generated (ai success), Sourced (web success), or Needs-Manual
✅ Checkpoint — Confirm acquisition attempted for every row:
## ✅ Image Acquisition Phase Complete
- [x] image_prompts.md created (when any ai rows processed)
- [x] image_sources.json created (when any web rows processed)
- [x] Each row: status is `Generated` / `Sourced` / `Needs-Manual` (no `Pending` remaining)
Default — auto-proceed to Step 6. Only when the user's Step 4 response explicitly opted into split mode (in reply to the optional hint), output the Phase A hand-off below and stop this conversation:
## ✅ Phase A Complete
- [x] Spec: `design_spec.md`, `spec_lock.md`
- [x] Resources: `sources/`, `images/`, `templates/`
- [ ] **Next**: open a fresh chat window and input `继续生成 projects/<project_name>` to enter Phase B via the [`resume-execute`](workflows/resume-execute.md) workflow.
On acquisition failure, do NOT halt — follow the Failure Handling rule in image-base.md §5: retry once, then mark the row Needs-Manual, report to user, and continue to the checkpoint above.
Step 6: Executor Phase
🚧 GATE: Step 4 (and Step 5 if triggered) complete; all prerequisite deliverables are ready.
Read the role definition based on the selected style:
Read references/executor-base.md # REQUIRED: common guidelines
Read references/shared-standards.md # REQUIRED: SVG/PPT technical constraints
Read references/executor-general.md # General flexible style
Read references/executor-consultant.md # Consulting style
Read references/executor-consultant-top.md # Top consulting style (MBB level)
Only read executor-base + shared-standards + one style file.
Design Parameter Confirmation (Mandatory): before the first SVG, output key design parameters from the spec (canvas dimensions, color scheme, font plan, body font size). See executor-base.md §2.
Pre-generation Batch Read (Mandatory): before the first SVG, batch-read every distinct layout SVG referenced in spec_lock.page_layouts and every distinct chart SVG referenced in spec_lock.page_charts (plus any §VII backup charts). One read per file, up front — do not re-read these during page generation. See executor-base.md §1.0.
Per-page spec_lock re-read (Mandatory): before each SVG page, read_file <project_path>/spec_lock.md and use only its colors / fonts / icons / images, plus the per-page page_rhythm / page_layouts / page_charts lookups (resolves to template SVGs already loaded in the batch read above). Resists context-compression drift on long decks. See executor-base.md §2.1.
⚠️ Main-agent only: SVG generation MUST stay in the current main agent — page design depends on full upstream context. Do NOT delegate to sub-agents.
⚠️ Generation rhythm: generate pages sequentially, one at a time, in the same continuous context. Do NOT batch (e.g., 5 per group).
Visual Construction Phase: generate SVG pages sequentially, one at a time, in one continuous pass → <project_path>/svg_output/
Quality Check Gate (Mandatory) — after all SVGs, BEFORE speaker notes:
python3 ${SKILL_DIR}/scripts/svg_quality_checker.py <project_path>
- Any
error (banned SVG features, viewBox mismatch, spec_lock drift, etc.) MUST be fixed before proceeding — return to Visual Construction, regenerate that page, re-run check.
warning entries (low-res image, non-PPT-safe font tail, etc.): fix when straightforward, otherwise acknowledge and release.
- Run against
svg_output/ (not after finalize_svg.py — finalize rewrites SVG and masks violations).
Logic Construction Phase: generate speaker notes → <project_path>/notes/total.md
✅ Checkpoint — Confirm all SVGs and notes are fully generated and quality-checked. Proceed directly to Step 7 post-processing:
## ✅ Executor Phase Complete
- [x] All SVGs generated to svg_output/
- [x] svg_quality_checker.py passed (0 errors)
- [x] Speaker notes generated at notes/total.md
Chart pages? If this deck contains data charts (bar / line / pie / radar / etc.), run the standalone verify-charts workflow before Step 7 to calibrate coordinates. AI models routinely introduce 10–50 px errors when mapping data to pixel positions; verify-charts eliminates that class of error. Skip if no chart pages.
Step 7: Post-processing & Export
🚧 GATE: Step 6 complete; all SVGs generated to svg_output/; speaker notes notes/total.md generated.
🚧 Image readiness GATE (when Step 5 left ai rows in Needs-Manual): every expected file must exist at project/images/<filename> before running 7.1.
If files are missing: PAUSE, list the missing filenames, point the user to images/image_prompts.md (each ### Image N: block is paste-ready for ChatGPT / Gemini / Midjourney) and the required placement project/images/<filename>. Resume Step 7.1 only after all expected files are in place. finalize_svg.py and svg_to_pptx.py do not detect missing files at this layer — proceeding with gaps produces a deck with broken image references.
⚠️ Run the three sub-steps one at a time — each must complete successfully before the next.
❌ NEVER combine them into a single code block or shell invocation.
Canonical three-command pipeline (mirrors references/shared-standards.md §5):
Step 7.1 — Split speaker notes:
python3 ${SKILL_DIR}/scripts/total_md_split.py <project_path>
Step 7.2 — SVG post-processing (icon embedding / image crop & embed / text flattening / rounded rect to path):
python3 ${SKILL_DIR}/scripts/finalize_svg.py <project_path>
Step 7.3 — Export PPTX (embeds speaker notes by default):
python3 ${SKILL_DIR}/scripts/svg_to_pptx.py <project_path>
The two products now read from different sources by design: native pptx
consumes svg_output/ so the converter can preserve high-fidelity primitives
(icon <use> placeholders, image preserveAspectRatio → srcRect, rounded
rect rx/ry → prstGeom roundRect). The legacy/preview pptx still consumes
svg_final/ because PowerPoint's internal SVG parser cannot handle those
primitives. Pass -s output or -s final to force a single source on both
products if you need the older single-source behaviour.
Optional animation flags (the defaults already enable rich entrance animations — adjust only when the user asks for something different):
-t <effect> — page transition. Default fade. Options: fade / push / wipe / split / strips / cover / random / none.
-a <effect> — per-element entrance animation. Default mixed (auto-vary across the deck). Pass none to disable, or pick a specific effect like fade. Requires top-level <g id="..."> groups (already required by Executor).
--animation-trigger {on-click,with-previous,after-previous} — Start mode (matches PowerPoint's animation-pane Start dropdown). Default after-previous (click-free cascade; pace via --animation-stagger). Use on-click for presenter-paced reveals, or with-previous for all-at-once.
--auto-advance <seconds> — kiosk-style auto-play.
Optional recorded narration (only when the user asks for narrated/video export):
Run the standalone generate-audio workflow. The AI picks a narration backend (edge by default, or a configured cloud provider such as ElevenLabs / MiniMax / Qwen / CosyVoice for high-quality or cloned voices), asks the user once (backend + voice + rate/settings + embed-or-not, all with recommended values), then executes notes_to_audio.py and (if chosen) re-exports the PPTX with --recorded-narration audio.
Do NOT call notes_to_audio.py directly without going through the workflow — --voice / --voice-id is required and the workflow produces the locale/provider-aware recommendation that makes the choice meaningful.
Full effect list, anchor logic, and limits: references/animations.md.
❌ NEVER substitute cp for finalize_svg.py — finalize performs multiple critical processing steps
❌ NEVER force -s output for the legacy/preview pptx (PowerPoint's internal SVG parser drops icons and rounded corners). The default auto-split already gives native the high-fidelity source it needs without touching legacy.
❌ NEVER use --only (it suppresses one of the two output files)
Post-export iteration: whenever the user asks to change anything on a generated slide ("改一下", "调字号", "那里看着不对", "把图片换大点"), the visual-edit workflow is available — surface it as an option. If the user describes the change with enough specificity to apply directly ("第 3 页副标题字号改 32"), edit the SVG directly instead; if they're vaguely pointing at "somewhere" on the deck, run the workflow.
Role Switching Protocol
Before switching roles, MUST first read the corresponding reference file. Output marker:
## [Role Switch: <Role Name>]
📖 Reading role definition: references/<filename>.md
📋 Current task: <brief description>
Reference Resources
| Resource | Path |
|---|
| Shared technical constraints | references/shared-standards.md |
| Canvas format specification | references/canvas-formats.md |
| Image layout specification | references/image-layout-spec.md |
| SVG image embedding | references/svg-image-embedding.md |
| Icon library | templates/icons/README.md |
Notes
- Local preview:
python3 -m http.server -d <project_path>/svg_final 8000
- Troubleshooting: on generation issues (layout overflow, export errors, blank images, etc.), check
docs/faq.md for known solutions