| name | diagram-content-analysis |
| description | Analyze source material and determine what a diagram should show. Given a document, dataset, or concept description, extract the key dimensions, find the story, prioritize what earns visual encoding vs. annotation vs. being cut entirely, and write the content that will appear on the diagram. Produces a structured specification that the information design skill (diagram-visual-encoding) consumes. Use this skill whenever someone asks to create a diagram, visual explanation, infographic, or information graphic from source material — this is always the first step before any visual decisions. Also use when the user says a diagram "doesn't tell the right story" or "is missing the point" — those are content analysis problems, not design problems. Triggers on: diagram, visual explanation, infographic, information design, visualize this, explain this visually, one-pager, concept map, educational diagram, framework diagram.
Diagram Content Analysis
The quality of a diagram is determined before any visual decisions are made. If you
identify the wrong dimensions, tell the wrong story, or write unclear content, no amount
of visual encoding or graphic design can fix it. This skill is about the editorial judgment
that separates a diagram that illuminates from one that merely decorates.
When This Skill Runs
This skill produces the content specification — the "what to show" — that feeds into
the visual encoding skill (which decides how to show it) and eventually the graphic
design skill (which makes it look right). Think of it as the editorial layer: a good
editor decides what story to tell and what to cut before the designer touches it.
Inputs
This skill works with three kinds of source material:
- Documents (most common): book chapters, papers, reports, technical specs. The
challenge is compression: a 5,000-word chapter becomes ~200 words of diagram text plus
visual encodings. The skill guides that compression.
- Datasets: CSV, tables, metrics. The challenge is finding the story in the numbers:
what pattern or comparison matters most?
- Concepts described in conversation: the user explains a system, process, or idea.
The challenge is extracting structure from informal description.
Core Workflow
Phase A: Understand the Source
Read the source material thoroughly before extracting anything. The goal is to understand
it well enough to explain it to someone else — not just to identify keywords.
For documents, look for:
- The author's thesis or central argument
- The logical structure (how concepts build on each other)
- Which concepts are foundational vs. derived
- Where the author spends the most space (that's usually what matters most)
- What the author assumes the reader already knows
For datasets, look for:
- What dimensions exist (columns, categories, time series)
- What varies and what's constant
- Where the interesting patterns are (outliers, trends, clusters, gaps)
- What comparison the viewer would find most useful
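As a minimal sketch of the "what varies and what's constant" check, the snippet below profiles a small CSV. The sample data and the function name `profile_columns` are invented for illustration; a real dataset would also need type handling and missing-value rules.

```python
import csv
import io


def profile_columns(rows):
    """Split column names into those that vary and those that are constant."""
    varying, constant = [], []
    if not rows:
        return varying, constant
    for name in rows[0].keys():
        distinct = {row[name] for row in rows}
        (varying if len(distinct) > 1 else constant).append(name)
    return varying, constant


# Hypothetical sample data, invented for illustration only.
SAMPLE_CSV = """stage,region,events
ingest,eu,1000
filter,eu,400
serve,eu,300
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
varying, constant = profile_columns(rows)
print("varying:", varying)    # candidates for visual encoding
print("constant:", constant)  # context at most, often annotation or cut
```

Constant columns rarely earn a visual channel; they tend to end up as annotation or get cut in the prioritization phase.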
For conversational descriptions, look for:
- What the user emphasizes or repeats
- What they seem to find most important or surprising
- Implicit structure they haven't articulated (sequences, hierarchies, trade-offs)
- Gaps in the description that need filling
Adaptive approach: If the user's intent is clear from context — they've described the
audience, purpose, and key message — proceed directly to extraction. If the intent is
ambiguous, ask targeted questions before analyzing:
- Who will look at this diagram? (audience expertise level)
- What should they do or understand differently after seeing it? (purpose)
- Is there a specific message or argument, or is this more exploratory? (story)
Keep the interview short: two or three questions at most. The goal is to remove
ambiguity, not to conduct a requirements-gathering session.
Phase B: Find the Story
Every effective diagram has a story — a throughline that organizes everything else. The
story is the answer to "so what?" It's the reason the diagram exists.
Three types of diagram stories:
Explanatory — "Here's how this works." The story is a mental model the viewer
doesn't yet have. The diagram teaches a concept, process, or system. Success: the viewer
can explain the concept to someone else after seeing the diagram.
- Common for: book chapters, educational materials, onboarding docs, technical architecture
- The challenge: deciding what level of detail serves understanding vs. overwhelms it
Analytical — "Here's what the data shows." The story is a pattern, trend, or
comparison the viewer should notice. The diagram makes a pattern visible that was hidden
in the raw numbers. Success: the viewer sees the pattern immediately.
- Common for: dashboards, reports, data presentations, research findings
- The challenge: not just showing data but encoding the interesting comparison
Persuasive — "Here's why this matters." The story makes an argument. The diagram
is evidence in service of a conclusion. Success: the viewer is more convinced of the
argument after seeing the diagram.
- Common for: proposals, pitch decks, policy briefs, executive summaries
- The challenge: being honest — the visual encoding should illuminate, not mislead
Write the story as a single sentence. If you can't compress it to one sentence, you
haven't found the story yet — you've found the topic. "Data pipelines" is a topic.
"Data volume drops 70% between ingestion and serving because most raw events are
noise" is a story.
Phase C: Extract and Classify Dimensions
List every dimension in the source material. A dimension is anything that varies and
could potentially be shown. Classify each:
Quantitative — has magnitude, can be compared numerically.
Volume, count, rate, cost, duration, ratio, percentage. These can be encoded with
position, length, area, or color intensity.
Categorical — distinct groups, no inherent order.
Type, role, category, department, platform. These can be encoded with spatial grouping,
color hue, or shape.
Relational — connections between things.
Flow (A feeds B), hierarchy (A contains B), dependency (A requires B), association
(A relates to B). These can be encoded with lines, containment, or proximity.
Ordinal — categories with meaningful order.
Stages, phases, priority levels, maturity levels. These can be encoded with spatial
position or color gradients.
Conceptual — abstract ideas that need definition and examples.
Frameworks, principles, mental models, taxonomies. These can't be "encoded" in the
visual-channel sense — they need text, and the diagram's job is to organize and relate
them spatially. This is the dimension type that the existing information design literature
handles least well, because it doesn't map neatly onto Cleveland & McGill's ranking of
elementary perceptual tasks (position, length, angle, area, and so on). Conceptual
diagrams succeed when the spatial arrangement itself teaches the relationships.
Be exhaustive in listing dimensions — you'll cut aggressively in the next phase. It's
easier to cut from a complete list than to discover you missed something important after
you've committed to a visual structure.
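One way to keep the inventory honest is to record each dimension with its type and look up the candidate channels named above. This is a sketch only: the class names, the `CANDIDATE_CHANNELS` table, and the example inventory are illustrative assumptions, not part of any schema this skill defines.

```python
from dataclasses import dataclass
from enum import Enum


class DimensionType(Enum):
    QUANTITATIVE = "quantitative"
    CATEGORICAL = "categorical"
    RELATIONAL = "relational"
    ORDINAL = "ordinal"
    CONCEPTUAL = "conceptual"


# Candidate channels per type, copied from the descriptions in this phase.
CANDIDATE_CHANNELS = {
    DimensionType.QUANTITATIVE: ["position", "length", "area", "color intensity"],
    DimensionType.CATEGORICAL: ["spatial grouping", "color hue", "shape"],
    DimensionType.RELATIONAL: ["lines", "containment", "proximity"],
    DimensionType.ORDINAL: ["spatial position", "color gradient"],
    DimensionType.CONCEPTUAL: ["text organized spatially"],
}


@dataclass
class Dimension:
    name: str
    type: DimensionType

    def candidate_channels(self):
        """Return the channels that could carry this dimension."""
        return CANDIDATE_CHANNELS[self.type]


# Hypothetical inventory for a document about a data pipeline.
inventory = [
    Dimension("event volume", DimensionType.QUANTITATIVE),
    Dimension("pipeline stage", DimensionType.ORDINAL),
    Dimension("stage feeds stage", DimensionType.RELATIONAL),
    Dimension("owning team", DimensionType.CATEGORICAL),
]

for dim in inventory:
    print(f"{dim.name}: {dim.type.value} -> {', '.join(dim.candidate_channels())}")
```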
Phase D: Prioritize Ruthlessly
A diagram can effectively encode 2–3 dimensions visually, with at most one more as a
subtle tertiary encoding. Trying to encode five or more creates noise. Everything else
becomes annotation text or gets cut.
Assign each dimension to a tier:
Primary (1 dimension) — What the viewer grasps in the first 3 seconds. This is the
story made visible. If the diagram is about flow, the primary encoding IS the flow. If
it's about comparison, the primary encoding IS the comparison. The primary dimension
gets the most powerful visual channel available.
Secondary (1–2 dimensions) — What the viewer notices on second look. Adds depth to
the story without competing with the primary. Must be visually independent from the
primary — the viewer should be able to read each encoding separately.
Tertiary (0–1 dimensions) — Rewards closer inspection. Subtle encoding (color
intensity, small position differences) for viewers who spend time with the diagram.
Optional — many diagrams are better with just primary and secondary.
Annotation — Important details that appear as text labels, footnotes, or callouts
but aren't visually encoded. Technology names, exact numbers, dates, caveats. These
are "read" not "seen."
Cut — Doesn't appear in the diagram at all. This is the hardest editorial decision
and the most important. The most common diagram failure mode is trying to show
everything. Every dimension you cut makes the remaining dimensions clearer.
How to decide what to cut:
- Does removing it change the story? If not, cut it.
- Is it something the viewer already knows? Cut it.
- Is it a detail that matters for implementation but not understanding? Cut it.
- Would a footnote or separate document serve it better? Move it there.
- Does it only matter to a subset of the audience? Cut it from the diagram,
mention it in accompanying text.
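The tier limits in this phase can be checked mechanically before moving on. The sketch below assumes assignments are kept as a plain `{dimension: tier}` mapping; the function name `check_tiers` and the example dimensions are hypothetical.

```python
from collections import Counter

# (min, max) counts for the visually encoded tiers, as stated in this phase.
TIER_LIMITS = {"primary": (1, 1), "secondary": (1, 2), "tertiary": (0, 1)}
# Annotation and cut have no stated limit.
UNLIMITED_TIERS = {"annotation", "cut"}


def check_tiers(assignments):
    """Return a list of problems found in a {dimension: tier} mapping."""
    problems = []
    counts = Counter(assignments.values())
    for tier in counts:
        if tier not in TIER_LIMITS and tier not in UNLIMITED_TIERS:
            problems.append(f"unknown tier: {tier}")
    for tier, (low, high) in TIER_LIMITS.items():
        n = counts.get(tier, 0)
        if not low <= n <= high:
            problems.append(f"{tier}: expected {low}-{high}, got {n}")
    return problems


# Hypothetical assignments for a data-pipeline diagram.
assignments = {
    "event volume": "primary",
    "pipeline stage": "secondary",
    "owning team": "secondary",
    "technology names": "annotation",
    "retry policy": "cut",
}
print(check_tiers(assignments))  # → []

overloaded = dict(assignments, latency="primary")
print(check_tiers(overloaded))   # → ['primary: expected 1-1, got 2']
```

A check like this only catches overloaded tiers. Whether the right dimension was made primary is still an editorial call.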
Phase E: Write Content (for standalone/educational diagrams)
When the diagram must stand on its own — someone who hasn't read the source material
should understand it — the text content needs as much care as the dimensional analysis.
Read references/content-writing.md for detailed guidance. Key principles:
Define every term of art. If the diagram uses vocabulary like "normalization,"
"chokepoint," or "frame," define it the first time it appears. Don't remove the
jargon — teaching the vocabulary IS part of the point — but pair each term with a
plain-language definition and a concrete example.
Use examples from different domains. When presenting multiple related concepts,
each example should come from a distinctly different domain. If all examples come from
the same domain, the concepts blur together. If you're explaining four types of
cognitive bias, use one example from medicine, one from investing, one from cooking,
one from sports — not four examples from investing.
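Domain diversity is easy to verify once each example is tagged with its domain. The following sketch assumes examples are stored as `(concept, domain)` pairs; the pairings shown are invented for illustration.

```python
from collections import Counter


def repeated_domains(examples):
    """Return the domains used by more than one concept's example, sorted."""
    counts = Counter(domain for _concept, domain in examples)
    return sorted(d for d, n in counts.items() if n > 1)


# Hypothetical concept/domain pairs for a cognitive-bias diagram.
diverse = [("anchoring", "medicine"), ("sunk cost", "investing"),
           ("availability", "cooking"), ("confirmation", "sports")]
blurred = [("anchoring", "investing"), ("sunk cost", "investing"),
           ("availability", "investing"), ("confirmation", "sports")]

print(repeated_domains(diverse))  # → []
print(repeated_domains(blurred))  # → ['investing']
```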
Write for the card, not the page. Each card/node has limited space. Three lines
for a definition-with-example is enough. Cut every word that doesn't earn its place.
Layer the content. Structure text in progressive disclosure layers:
- Term or question (bold) — what is this concept?
- Definition + example (regular) — how does it work?
- Key insight (accent) — why does it matter?
- Action levers (secondary) — what can you do with it?
The viewer can stop at any layer and still get value.
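The four layers can be modeled as an ordered record, which makes "stop at any layer" concrete. This is a sketch under assumptions: the `Card` class, its field names, and the sample text are hypothetical, and styling (bold, accent) is left to the later design skills.

```python
from dataclasses import dataclass

# Layer order follows the list above: term, definition + example, insight, levers.
LAYERS = ("term", "definition", "insight", "levers")


@dataclass
class Card:
    term: str
    definition: str
    insight: str
    levers: str

    def render(self, depth):
        """Return the first `depth` layers, one per line."""
        if not 1 <= depth <= len(LAYERS):
            raise ValueError(f"depth must be between 1 and {len(LAYERS)}")
        return "\n".join(getattr(self, name) for name in LAYERS[:depth])


# Hypothetical card text, written for illustration only.
card = Card(
    term="Chokepoint",
    definition="A point every flow must pass through, like the one bridge into a town.",
    insight="Whoever controls it controls the flow.",
    levers="Widen it, bypass it, or guard it.",
)

print(card.render(2))  # term plus definition: a reader who stops here still learns the concept
```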
Phase F: Produce the Specification
The output has two parts: a human-readable brief for review, and a structured spec
that the visual encoding skill can consume.
Human-readable brief (present this to the user for approval):
## Story
[One-sentence story]
## Audience
[Who, what they know, what they should take away]
## Dimensions (ranked)
- PRIMARY: [dimension] — [why this is the main thing]
- SECONDARY: [dimension] — [what depth this adds]
- ANNOTATION: [list of text-only details]
- CUT: [what was deliberately excluded and why]
## Content
[For each concept/node, the text that will appear — definition,
example, insight, levers. Written at diagram length, not document length.]
## Relationships
[How concepts/dimensions connect — flows, hierarchies, dependencies]
## Open Questions
[Anything the user should weigh in on before visual design begins]
Structured spec (append after the brief, for the visual encoding skill, diagram-visual-encoding):
Write the spec in YAML within a fenced code block. See references/output-format.md
for the full schema. The spec must capture:
- story: type (explanatory/analytical/persuasive), primary message, audience
- dimensions: each with name, type, values/range, priority tier, and encoding hints
- relationships: typed connections between dimensions or concepts
- concepts (for educational diagrams): term, definition, example, insight, levers
- content: title, subtitle, footnotes, source attribution
- constraints: page size if known, whether it must stand alone, print vs. screen
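Before serializing the spec to YAML, a quick completeness check can catch a missing section. The sketch below works on a plain dictionary. The top-level section names come from the list above, but the nested keys (`type`, `message`, `tier`, and so on) and the sample values are assumptions; references/output-format.md holds the actual schema.

```python
# Sections named above; "concepts" is omitted because it applies only to
# educational diagrams.
REQUIRED_SECTIONS = ("story", "dimensions", "relationships", "content", "constraints")
STORY_TYPES = {"explanatory", "analytical", "persuasive"}


def spec_problems(spec):
    """Return a list of problems found in a draft spec dictionary."""
    problems = [f"missing section: {s}" for s in REQUIRED_SECTIONS if s not in spec]
    # Nested key "type" is an assumed name, not confirmed by the schema.
    story_type = spec.get("story", {}).get("type")
    if story_type not in STORY_TYPES:
        problems.append(f"story.type must be one of {sorted(STORY_TYPES)}")
    return problems


# Hypothetical draft; all values are invented for illustration.
draft = {
    "story": {
        "type": "analytical",
        "message": "Data volume drops 70% between ingestion and serving.",
        "audience": "platform engineers",
    },
    "dimensions": [{"name": "event volume", "type": "quantitative", "tier": "primary"}],
    "relationships": [],
    "content": {"title": "Where the events go"},
}

print(spec_problems(draft))  # → ['missing section: constraints']
```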
Common Failure Modes
Trying to show everything. The most frequent failure. A book chapter has 20
concepts; the diagram tries to include all 20 and none of them are clear. The fix is
always to cut more aggressively. A diagram that shows 5 things clearly is far
more valuable than one that shows 20 things illegibly.
Missing the story. Dimensions are identified correctly but there's no narrative
throughline. The diagram is technically accurate but doesn't illuminate anything. The
fix is to write the one-sentence story and let it drive every prioritization decision.
Wrong audience assumption. Too much jargon for a general audience, or too
simplified for experts. The fix is to explicitly state the audience's existing
knowledge and calibrate definitions and detail level to that.
Same-domain examples. Four concepts, four examples from software engineering. The
viewer can't tell which example goes with which concept because they all sound similar.
The fix is deliberate domain diversity.
Weak content compression. Card text is a shortened version of the source document
rather than a rewrite. It reads like a summary rather than a standalone explanation.
The fix is to write the card text from scratch, using the source as reference, not as
a template to abbreviate.
References
- references/content-writing.md: Detailed guidance on writing diagram content:
progressive disclosure, term definition, example selection, compression techniques.
- references/output-format.md: The YAML spec schema that the visual encoding
skill consumes, with annotated examples.