| name | build-graph |
| description | Build a knowledge graph from markdown files. Ingests documents, extracts entities and relations, resolves duplicates, and induces a typed schema. Use when users want to create a knowledge graph, index their documents, or extract structured data from markdown files. |
Build Knowledge Graph
Overview
This skill builds a complete knowledge graph from markdown files in the current project directory. It uses the kgmd MCP tools to ingest, extract, resolve, and induce.
Tools
Use the kgmd MCP server tools: init_graph, ingest_documents, get_chunks_for_extraction, store_extractions, get_resolution_candidates, apply_merges, get_induction_stats, store_schema.
Workflow
Step 1: Initialize
Call init_graph() to create the .kgmd/ database. If it already exists, skip this step.
Step 2: Ingest
Call ingest_documents() to find and chunk all markdown files. Report the count of documents found and chunks created.
Step 3: Extract (iterative)
Loop until complete:
- Call get_chunks_for_extraction(batch_size=10) to get the next batch
- If empty, extraction is complete — move to Step 4
- For each chunk, extract entities and relations:
  - Entities: People, organizations, projects, technologies, events, concepts
    - Types: Use specific types (Person, Organization, Project, Technology, Event, Concept) or create new ones if none fit
    - Names: Use canonical full names (e.g., "Brian Anderson" not "Brian")
    - Attributes: Notable properties (role, location, date, industry, etc.)
  - Relations: Directed edges with predicates (works_at, founded, uses, depends_on, presented_at, manages, etc.)
    - Confidence: 0.7-1.0 based on how explicit the relationship is in the text
- Call store_extractions() with the batch results
- Report progress: chunks processed, entities found so far
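
The loop in Step 3 can be sketched in Python as follows. This is only an illustration of the control flow: the `Entity` and `Relation` record shapes are assumptions, and `get_chunks`, `extract`, and `store` are hypothetical stand-in callables, not the actual signatures of the kgmd tools.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str   # canonical full name, e.g. "Brian Anderson"
    type: str   # e.g. "Person", "Organization"
    attributes: dict = field(default_factory=dict)


@dataclass
class Relation:
    subject: str
    predicate: str     # e.g. "works_at"
    object: str
    confidence: float  # 0.7-1.0, per the guidance above

    def __post_init__(self):
        # Reject values outside the range the skill asks for.
        if not 0.7 <= self.confidence <= 1.0:
            raise ValueError(f"confidence {self.confidence} outside 0.7-1.0")


def run_extraction(get_chunks, extract, store, batch_size=10):
    """Drain batches until get_chunks returns an empty batch.

    All three callables are hypothetical stand-ins:
    get_chunks(batch_size) returns a list of chunks,
    extract(chunk) returns (entities, relations),
    store(results) persists one batch of results.
    """
    chunks_done = 0
    entities_found = 0
    while True:
        batch = get_chunks(batch_size)
        if not batch:
            break  # nothing left to extract; continue with Step 4
        results = [extract(chunk) for chunk in batch]
        store(results)
        chunks_done += len(batch)
        entities_found += sum(len(entities) for entities, _ in results)
        print(f"processed {chunks_done} chunks, {entities_found} entities so far")
    return chunks_done, entities_found
```

Validating confidence when the record is built keeps out-of-range values from reaching the store call.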
Step 4: Resolve duplicates
- Call get_resolution_candidates() to find similar entities via embedding clustering
- For each candidate cluster, review whether they refer to the same real-world entity
- Confirm merges where entities are clearly the same (e.g., "B. Anderson" and "Brian Anderson")
- Reject merges where entities are different (e.g., "John Smith" the CEO vs. "John Smith" the engineer)
- Call apply_merges() with confirmed merges
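
As a rough aid for the review in Step 4, the two examples above can be expressed as a conservative pre-check. This is a heuristic sketch, not how kgmd resolves entities: the record shape (`name`, `type`, `attributes`) and all three function names are assumptions made for illustration, and a pair that passes still deserves a human or model review.

```python
def names_compatible(a: str, b: str) -> bool:
    """True when surnames match and given names agree up to an initial."""
    a_parts = a.replace(".", "").lower().split()
    b_parts = b.replace(".", "").lower().split()
    if not a_parts or not b_parts or a_parts[-1] != b_parts[-1]:
        return False
    a_first, b_first = a_parts[0], b_parts[0]
    return a_first.startswith(b_first) or b_first.startswith(a_first)


def attributes_conflict(a: dict, b: dict) -> bool:
    """True when both records set the same key to different values."""
    return any(key in b and b[key] != value for key, value in a.items())


def should_merge(a: dict, b: dict) -> bool:
    """Conservative check: same type, compatible names, no conflicting attributes."""
    return (
        a["type"] == b["type"]
        and names_compatible(a["name"], b["name"])
        and not attributes_conflict(a.get("attributes", {}), b.get("attributes", {}))
    )


# Hypothetical candidate records mirroring the examples in the text.
abbreviated = {"name": "B. Anderson", "type": "Person"}
full = {"name": "Brian Anderson", "type": "Person", "attributes": {"role": "CTO"}}
ceo = {"name": "John Smith", "type": "Person", "attributes": {"role": "CEO"}}
engineer = {"name": "John Smith", "type": "Person", "attributes": {"role": "engineer"}}

print(should_merge(abbreviated, full))  # True
print(should_merge(ceo, engineer))      # False
```

The check deliberately errs toward rejecting: a wrong merge collapses two real-world entities into one and is harder to undo than a missed duplicate.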
Step 5: Induce schema
- Call get_induction_stats() to see entity type counts, relation predicates, and attribute summaries
- Generate a YAML schema describing:
  - Entity types with their common attributes
  - Relation types with subject/object type constraints
- Call store_schema() with the YAML
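
One way to picture the schema from Step 5 is the small renderer below. The YAML layout (`entity_types`, `relation_types`, `attributes`, `subject`, `object`) is an assumed shape chosen for illustration, since the format kgmd expects is not specified here, and `render_schema` is a hypothetical helper rather than part of kgmd.

```python
def render_schema(entity_types: dict, relation_types: dict) -> str:
    """Render an assumed YAML layout for the induced schema.

    entity_types maps a type name to its list of common attributes.
    relation_types maps a predicate to a (subject_type, object_type) pair.
    """
    lines = ["entity_types:"]
    for name, attributes in sorted(entity_types.items()):
        lines.append(f"  {name}:")
        lines.append(f"    attributes: [{', '.join(attributes)}]")
    lines.append("relation_types:")
    for predicate, (subject, obj) in sorted(relation_types.items()):
        lines.append(f"  {predicate}:")
        lines.append(f"    subject: {subject}")
        lines.append(f"    object: {obj}")
    return "\n".join(lines) + "\n"


# Example input using types and predicates named in Step 3.
schema = render_schema(
    {"Person": ["role", "location"], "Organization": ["industry"]},
    {"works_at": ("Person", "Organization")},
)
print(schema)
```

Sorting the keys keeps the output stable between runs, which makes schema changes easier to spot when the graph is rebuilt.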
Step 6: Report
Summarize the completed graph:
- Documents processed
- Total entities by type
- Total relations by predicate
- Key findings or interesting patterns