---
name: claude-history-ingest
description: Ingest Claude Code conversation history into the Obsidian wiki. Use this skill when the user wants to mine their past Claude conversations for knowledge, import their ~/.claude folder, extract insights from previous coding sessions, or says things like "process my Claude history", "add my conversations to the wiki", "what have I discussed with Claude before". Also triggers when the user mentions their .claude folder, Claude projects, session data, past conversation logs, local-agent-mode sessions, or audit logs.
---
# Claude History Ingest – Conversation Mining

You are extracting knowledge from the user's past Claude Code conversations and distilling it into the Obsidian wiki. Conversations are rich but messy – your job is to find the signal and compile it.
This skill can be invoked directly or via the wiki-history-ingest router (/wiki-history-ingest claude).
## Before You Start

- Resolve config – follow the Config Resolution Protocol in llm-wiki/SKILL.md (walk up CWD for .env → ~/.obsidian-wiki/config → prompt setup). This gives OBSIDIAN_VAULT_PATH and CLAUDE_HISTORY_PATH (defaults to ~/.claude).
- Read .manifest.json at the vault root to check what's already been ingested.
- Read index.md at the vault root to know what the wiki already contains.
## Ingest Modes

### Append Mode (default)
Check .manifest.json for each source file (conversation JSONL, memory file). Only process:
- Files not in the manifest (new conversations, new memory files, new projects)
- Files whose modification time is newer than their ingested_at in the manifest

This is usually what you want – the user ran a few new sessions and wants to capture the delta.
### Full Mode
Process everything regardless of manifest. Use after a wiki-rebuild or if the user explicitly asks.
## Claude Code Data Layout
Claude Code stores data in two locations. Scan both.
### Source 1: ~/.claude/ (CLI sessions)

```
~/.claude/
├── projects/                      # Per-project directories
│   ├── -Users-name-project-a/     # Path-derived name (slashes → dashes)
│   │   ├── <session-uuid>.jsonl   # Conversation data (JSONL)
│   │   └── memory/                # Structured memories
│   │       ├── MEMORY.md          # Memory index
│   │       ├── user_*.md          # User profile memories
│   │       ├── feedback_*.md      # Workflow feedback memories
│   │       └── project_*.md       # Project context memories
│   └── -Users-name-project-b/
│       └── ...
├── sessions/                      # Session metadata (JSON)
│   └── <pid>.json                 # {pid, sessionId, cwd, startedAt, kind, entrypoint}
├── history.jsonl                  # Global session history
├── tasks/                         # Subagent task data
├── plans/                         # Saved plans
└── settings.json
```
### Source 2: ~/Library/Application Support/Claude/local-agent-mode-sessions/ (Desktop app agent sessions)

The Claude desktop app stores local agent mode sessions here. The structure is deeply nested:

```
~/Library/Application Support/Claude/local-agent-mode-sessions/
└── <outer-uuid>/
    └── <inner-uuid>/
        ├── local_<session-uuid>.json   # Session metadata
        └── local_<session-uuid>/
            ├── audit.jsonl             # Audit log – tool calls, file reads, commands run
            └── .claude/
                └── projects/
                    └── <path-encoded-name>/   # Same path-encoding as ~/.claude/projects/
                        └── <uuid>.jsonl       # Conversation transcript (same JSONL format as CLI)
```
How to find all local-agent-mode sessions:

```shell
find ~/Library/Application\ Support/Claude/local-agent-mode-sessions -maxdepth 4 -name "local_*.json"
find ~/Library/Application\ Support/Claude/local-agent-mode-sessions -name "audit.jsonl"
find ~/Library/Application\ Support/Claude/local-agent-mode-sessions -path "*/.claude/projects/*" -name "*.jsonl"
```
Session metadata (local_<uuid>.json) – JSON file with fields like sessionId, cwd, startedAt, model, title. Read this first to understand the session context before opening the transcript.

Audit log (audit.jsonl) – Each line is a JSON record of one agent action: tool calls (Read, Write, Bash, Edit), file accesses, shell commands executed, MCP calls. Useful for understanding what the agent actually did – often richer signal than the conversation text alone. Fields: type, toolName, input, output, timestamp, sessionId.

Conversation transcript (.claude/projects/.../<uuid>.jsonl) – Identical format to the CLI conversation JSONL. Parse it the same way as ~/.claude/projects/*/*.jsonl.
Key data sources ranked by value (both locations combined):

1. Memory files (~/.claude/projects/*/memory/*.md) – Pre-distilled, already wiki-friendly. Gold.
2. Conversation JSONL (both ~/.claude/projects/*/*.jsonl and desktop app transcripts) – Full conversation transcripts. Rich but noisy.
3. Audit logs (audit.jsonl in desktop sessions) – Tool-call-level record of what was done. Useful for extracting concrete actions, file patterns, and command patterns even when the conversation is sparse.
4. Session metadata (sessions/*.json and local_*.json) – Tells you which project, when, and what CWD.
## Step 1: Survey and Compute Delta

Scan both data locations and compare against .manifest.json:

```
Glob: ~/.claude/projects/*/
Glob: ~/.claude/projects/*/memory/*.md
Glob: ~/.claude/projects/*/*.jsonl

DESKTOP_SESSIONS="$HOME/Library/Application Support/Claude/local-agent-mode-sessions"
find "$DESKTOP_SESSIONS" -maxdepth 4 -name "local_*.json"
find "$DESKTOP_SESSIONS" -name "audit.jsonl"
find "$DESKTOP_SESSIONS" -path "*/.claude/projects/*" -name "*.jsonl"
```
Build a unified inventory and classify each file:

- New – not in manifest → needs ingesting
- Modified – in manifest but the file is newer → needs re-ingesting
- Unchanged – in manifest and not modified → skip in append mode
Report to the user: "Found X CLI projects, Y desktop sessions. Memory files: A. Conversations: B. Audit logs: C. Delta: D new, E modified."
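The delta computation above can be sketched as follows. This is a minimal illustration, not the required implementation: it assumes the manifest groups per-file entries under a "sources" key, each carrying an ISO-8601 ingested_at timestamp, which is an illustrative shape rather than a documented schema.

```python
import json
from datetime import datetime
from pathlib import Path

def classify_sources(manifest_path: Path, source_files: list) -> dict:
    """Classify each source file as new / modified / unchanged against the manifest.

    The "sources" key and entry shape are assumptions for illustration.
    """
    sources = {}
    if manifest_path.exists():
        sources = json.loads(manifest_path.read_text()).get("sources", {})

    delta = {"new": [], "modified": [], "unchanged": []}
    for f in source_files:
        entry = sources.get(str(f))
        if entry is None:
            delta["new"].append(f)
            continue
        # Compare the file's mtime against the recorded ingestion time
        ingested = datetime.fromisoformat(entry["ingested_at"].replace("Z", "+00:00"))
        if f.stat().st_mtime > ingested.timestamp():
            delta["modified"].append(f)
        else:
            delta["unchanged"].append(f)
    return delta
```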
## Step 2: Ingest Memory Files First

Memory files are already structured with YAML frontmatter:

```
---
name: memory-name
description: one-line description
type: user|feedback|project|reference
---
Memory content here.
```
For each memory file:

- Read it and parse the frontmatter
- user type → feeds into an entity page about the user, or concept pages about their domain
- feedback type → feeds into skills pages (workflow patterns, what works, what doesn't)
- project type → feeds into entity pages for the project
- reference type → feeds into reference pages pointing to external resources

The MEMORY.md index file in each project is a quick summary – read it first to decide which individual memory files are worth reading in full.
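Splitting a memory file into frontmatter and body can be sketched like this. It is a minimal sketch that assumes well-formed, flat key: value frontmatter as in the example above; a real ingest may prefer a YAML parser.

```python
from pathlib import Path

def parse_memory_file(path: Path) -> tuple:
    """Split a memory file into (frontmatter dict, body text).

    Assumes flat `key: value` frontmatter between `---` delimiters.
    """
    text = path.read_text()
    meta = {}
    body = text
    if text.startswith("---"):
        # split("---", 2) yields: "", the frontmatter block, the body
        _, fm, body = text.split("---", 2)
        for line in fm.strip().splitlines():
            if ":" in line:
                key, value = line.split(":", 1)
                meta[key.strip()] = value.strip()
    return meta, body.strip()
```

The returned type field then routes the content (user → entity page, feedback → skills page, and so on).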
## Step 3: Parse Conversation JSONL

Each JSONL file is one conversation session. Each line is a JSON object:

```json
{
  "type": "user|assistant|progress|file-history-snapshot",
  "message": {
    "role": "user|assistant",
    "content": "text string"
  },
  "uuid": "...",
  "timestamp": "2026-03-15T10:30:00.000Z",
  "sessionId": "...",
  "cwd": "/path/to/project",
  "version": "2.1.59"
}
```
For assistant messages, content may be an array of content blocks:
```json
{
  "content": [
    {"type": "thinking", "text": "..."},
    {"type": "text", "text": "The actual response..."},
    {"type": "tool_use", "name": "Read", "input": {...}}
  ]
}
```
What to extract from conversations:

- Filter to type: "user" and type: "assistant" entries only
- For assistant entries, extract text blocks (skip thinking and tool_use – those are noise)
- The cwd field tells you which project this conversation belongs to
- The project directory name (e.g., -Users-name-Documents-projects-my-app) tells you the project path

Skip these:

- type: "progress" – internal agent progress updates
- type: "file-history-snapshot" – file state tracking
- Subagent conversations (under subagents/ subdirectories) – unless the user specifically asks
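The extraction rules above can be sketched as a small parser. It is written against the record shapes shown in the examples, tolerates lines that do not parse, and drops the progress / file-history-snapshot record types.

```python
import json
from pathlib import Path

def extract_turns(jsonl_path: Path) -> list:
    """Pull user/assistant text out of one conversation JSONL file."""
    turns = []
    for line in jsonl_path.read_text().splitlines():
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue
        if rec.get("type") not in ("user", "assistant"):
            continue  # drop progress / file-history-snapshot records
        content = rec.get("message", {}).get("content", "")
        if isinstance(content, list):
            # Assistant content blocks: keep text, drop thinking/tool_use noise
            content = "\n".join(
                b.get("text", "") for b in content if b.get("type") == "text"
            )
        if content:
            turns.append({
                "role": rec["type"],
                "cwd": rec.get("cwd"),
                "timestamp": rec.get("timestamp"),
                "text": content,
            })
    return turns
```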
## Step 3b: Parse Audit Logs (desktop sessions only)

For each audit.jsonl found under local-agent-mode-sessions/, read it line by line. Each line is a JSON record of one agent action:

```json
{
  "type": "tool_call",
  "toolName": "Bash",
  "input": {"command": "npm test"},
  "output": "...",
  "timestamp": "2026-04-10T14:22:00Z",
  "sessionId": "..."
}
```
What to extract from audit logs:

- File access patterns – which files does the agent repeatedly Read or Edit? These are the high-value files in the project. Note them as project references.
- Shell commands – recurring Bash commands reveal the project's build/test/deploy workflow. Distill these into a skills/ page (e.g. "how this project is built and tested").
- Tool call sequences – if the agent always does Read → Edit → Bash in a particular order, that's a workflow pattern worth capturing.
- Error patterns – failed tool calls (non-zero exit codes, error outputs) reveal pain points, known rough edges, or recurring bugs.
- MCP tool calls – calls to MCP tools reveal which external services and APIs the project integrates with.
Skip from audit logs:

- Routine file reads with no pattern (e.g. reading config files once)
- Tool outputs that are just noise (long stack traces, verbose logs) – summarize the error class, not the full output
- Anything that looks like secrets, tokens, or credentials in command arguments or outputs

Cross-reference with the conversation transcript: the audit log tells you what happened; the conversation tells you why. When both are available for the same session, use them together – the audit log grounds the conversation in concrete actions.

Read the paired local_<uuid>.json session metadata before processing the audit log – it gives you cwd, startedAt, and title to contextualize the actions.
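A first-pass tally of file-access and command patterns can be sketched like this. The top-level field names (type, toolName, input) follow the example record above, but the input keys ("file_path", "command") are assumptions about the tool payloads; verify them against a real log before relying on them.

```python
import json
from collections import Counter
from pathlib import Path

def summarize_audit(audit_path: Path) -> dict:
    """Tally one audit.jsonl into the patterns worth writing up."""
    hot_files = Counter()   # files repeatedly Read/Edit/Write-en
    commands = Counter()    # recurring Bash commands
    for line in audit_path.read_text().splitlines():
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue
        if rec.get("type") != "tool_call":
            continue
        tool = rec.get("toolName")
        payload = rec.get("input") or {}
        if tool in ("Read", "Edit", "Write"):
            # "file_path" is an assumed payload key, not a documented one
            hot_files[payload.get("file_path", "?")] += 1
        elif tool == "Bash":
            commands[payload.get("command", "?")] += 1
    return {
        "hot_files": hot_files.most_common(5),
        "commands": commands.most_common(5),
    }
```

The most-common files become project references; the most-common commands seed the project's build/test skills page.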
## Step 4: Cluster by Topic

Don't create one wiki page per conversation. Instead:

- Group extracted knowledge by topic across conversations
- A single conversation about "debugging auth + setting up CI" → two separate topics
- Three conversations across different days about "React performance" → one merged topic
- The project directory name gives you a natural first-level grouping
## Step 5: Distill into Wiki Pages

Each Claude project maps to a project directory in the vault. The project directory name from ~/.claude/projects/ encodes the original path – decode it to get a clean project name:

```
-Users-name-Documents-projects-my-Project → myproject
-Users-name-Documents-projects-Another-app → anotherapp
```
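One way to sketch the decode: since the directory-name encoding is lossy (every slash became a dash, so a dash inside a folder name is indistinguishable from a path separator), prefer the cwd field recorded inside the conversation JSONL when it is available and take its basename. The normalization below (lowercase, dashes dropped) is a guess that matches the examples above, not a specified rule.

```python
from pathlib import PurePosixPath

def project_name_from_cwd(cwd: str) -> str:
    """Derive a clean vault project name from a conversation's cwd field.

    Lowercasing and dropping dashes mirrors the examples above; adjust
    if the wiki needs a different naming scheme.
    """
    base = PurePosixPath(cwd).name        # e.g. "my-Project"
    return base.lower().replace("-", "")  # e.g. "myproject"
```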
### Project-specific vs. global knowledge

| What you found | Where it goes | Example |
|---|---|---|
| Project architecture decisions | projects/<name>/concepts/ | projects/my-project/concepts/main-architecture.md |
| Project-specific debugging | projects/<name>/skills/ | projects/my-project/skills/api-rate-limiting.md |
| General concept the user learned | concepts/ (global) | concepts/react-server-components.md |
| Recurring problem across projects | skills/ (global) | skills/debugging-hydration-errors.md |
| A tool/service used | entities/ (global) | entities/vercel-functions.md |
| Patterns across many conversations | synthesis/ (global) | synthesis/common-debugging-patterns.md |
For each project with content, create or update the project overview page at projects/<name>/<name>.md – named after the project, not _project.md. Obsidian's graph view uses the filename as the node label, so _project.md makes every project show up as _project in the graph. Naming it <name>.md gives each project a distinct, readable node name.
Important: Distill the knowledge, not the conversation. Don't write "In a conversation on March 15, the user asked about X." Write the knowledge itself, with the conversation as a source attribution.
Write a summary: frontmatter field on every new/updated page – 1–2 sentences, ≤200 chars, answering "what is this page about?" for a reader who hasn't opened it. wiki-query's cheap retrieval path reads this field to avoid opening page bodies.
Add confidence and lifecycle fields to every new page's frontmatter:

```yaml
base_confidence: 0.42
lifecycle: draft
lifecycle_changed: <ISO date today>
```

On update, leave lifecycle and lifecycle_changed unchanged – only a human editor transitions lifecycle state.
Mark provenance per the convention in llm-wiki (Provenance Markers section):

- Memory files are mostly extracted – the user wrote them by hand and they're already distilled. Treat memory-derived claims as extracted unless you're stitching together claims from multiple memory files.
- Conversation distillation is mostly inferred. You're synthesizing a coherent claim from many turns of dialogue, often filling in implicit reasoning. Apply ^[inferred] liberally to synthesized patterns, generalizations across sessions, and "what the user really meant" interpretations.
- Use ^[ambiguous] when the user changed their mind across sessions or when assistant and user contradicted each other and the resolution is unclear.
- Write a provenance: frontmatter block on every new/updated page summarizing the rough mix.
## Step 6: Update Manifest, Journal, and Special Files

### Update .manifest.json

For each source file processed, add/update its entry with:

- ingested_at, size_bytes, modified_at
- source_type: one of "claude_conversation", "claude_memory", "claude_audit_log", "claude_desktop_session"
- project: the decoded project name
- pages_created and pages_updated lists
Also update the projects section of the manifest:

```json
{
  "project-name": {
    "source_path": "~/.claude/projects/-Users-...",
    "vault_path": "projects/project-name",
    "last_ingested": "TIMESTAMP",
    "conversations_ingested": 5,
    "conversations_total": 8,
    "memory_files_ingested": 3,
    "desktop_sessions_ingested": 2,
    "audit_logs_ingested": 2
  }
}
```
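Upserting one per-file entry can be sketched as below. The entry fields follow the list above; grouping entries under a top-level "sources" key is an assumption about the manifest layout, so match whatever .manifest.json already uses.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def record_ingest(manifest_path: Path, source: Path, source_type: str,
                  project: str, pages_created: list, pages_updated: list) -> None:
    """Upsert one source file's entry in .manifest.json."""
    manifest = {}
    if manifest_path.exists():
        manifest = json.loads(manifest_path.read_text())
    stat = source.stat()
    # "sources" as the grouping key is an assumed layout, not a spec
    manifest.setdefault("sources", {})[str(source)] = {
        "ingested_at": datetime.now(timezone.utc).isoformat(),
        "size_bytes": stat.st_size,
        "modified_at": datetime.fromtimestamp(stat.st_mtime, timezone.utc).isoformat(),
        "source_type": source_type,
        "project": project,
        "pages_created": pages_created,
        "pages_updated": pages_updated,
    }
    manifest_path.write_text(json.dumps(manifest, indent=2))
```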
### Create journal entry + update special files

Update index.md and log.md per the standard process:

```
- [TIMESTAMP] CLAUDE_HISTORY_INGEST projects=N conversations=M desktop_sessions=D audit_logs=A pages_updated=X pages_created=Y mode=append|full
```

hot.md – Read $OBSIDIAN_VAULT_PATH/hot.md (create from the template in wiki-ingest if missing). Update Recent Activity with a one-line summary – e.g. "Ingested 5 Claude conversations across 2 projects; surfaced patterns in API design and testing strategy." Keep the last 3 operations. Update Active Threads if any ongoing project is now better understood. Update the updated timestamp.
## Privacy

- Distill and synthesize – don't copy raw conversation text verbatim
- Skip anything that looks like secrets, API keys, passwords, tokens
- If you encounter personal/sensitive content, ask the user before including it
- The user's conversations may reference other people ā be thoughtful about what goes in the wiki
## Reference
See references/claude-data-format.md for more details on the data structures.