mempalace
// MemPalace — Local AI memory with 96.6% recall. Semantic search, temporal knowledge graph, palace architecture (wings/rooms/drawers). Free, no cloud, no API keys.
| name | mempalace |
| description | MemPalace — Local AI memory with 96.6% recall. Semantic search, temporal knowledge graph, palace architecture (wings/rooms/drawers). Free, no cloud, no API keys. |
| version | 3.4.0 |
| homepage | https://github.com/MemPalace/mempalace |
| author | Hermes Agent |
You have access to MemPalace, a local declarative memory system with a temporal knowledge graph. It stores verbatim conversation history and structured knowledge on the user's machine — zero cloud, zero API calls. This gives you persistent, high-recall memory across sessions.
| Level | Description | Example |
|---|---|---|
| Wings | People or projects | wing_alice, wing_myproject, dokploy, hermes |
| Halls | Categories within a wing | facts, events, preferences, advice, discoveries |
| Rooms | Specific topics | chromadb-setup, riley-school, auth-migration |
| Drawers | Individual memory chunks | Verbatim text |
| Knowledge Graph | Entity-relationship facts with time validity | Max → child_of → Alice |
| Tunnels | Cross-wing connections via shared room names | Same room in multiple wings |
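To make the "time validity" idea in the Knowledge Graph row concrete, here is a minimal sketch of a temporal fact store in plain Python. It is not MemPalace's implementation: the `Triple` class and the `facts_as_of` helper are hypothetical names used only for illustration, and dates are assumed to be `YYYY-MM-DD` strings so they compare correctly as text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Triple:
    """One fact (subject -> predicate -> object) with a validity window."""
    subject: str
    predicate: str
    obj: str
    valid_from: str                 # YYYY-MM-DD, inclusive
    valid_to: Optional[str] = None  # YYYY-MM-DD, exclusive; None = still true


def facts_as_of(triples: List[Triple], entity: str, as_of: str) -> List[Triple]:
    """Return the facts about `entity` that were valid on the date `as_of`."""
    return [
        t for t in triples
        if t.subject == entity
        and t.valid_from <= as_of
        and (t.valid_to is None or as_of < t.valid_to)
    ]
```

Invalidating a fact in this model means setting `valid_to` rather than deleting the row, which is what lets a later query ask what was true at an earlier date.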
- Call mcp_mempalace_status to load the palace overview and AAAK dialect spec.
- Call mcp_mempalace_search or mcp_mempalace_kg_query FIRST, before answering from recall.
"Never guess from memory — verify from the palace."
"Wrong is worse than slow."
- Call mcp_mempalace_diary_write to record what happened, what you learned, what matters.
- When a fact changes, call mcp_mempalace_kg_invalidate on the old fact, then mcp_mempalace_kg_add for the new one.

Tools are prefixed with mcp_mempalace_ in Hermes. (In OpenClaw they appear as mempalace_.)
- mcp_mempalace_search: Semantic search across all memories. Always start here.
  - query (required): Short natural language keywords or question. Do NOT include system prompts or conversation context.
  - wing: Filter by wing
  - room: Filter by room
  - limit: Max results (default 5)
- mcp_mempalace_check_duplicate: Check if content exists before filing.
  - content (required)
  - threshold: Similarity threshold (default 0.9; lowering to 0.85–0.87 catches more near-duplicates without significant false positives)
- mcp_mempalace_status: Palace overview: total drawers, wings, rooms, AAAK spec
- mcp_mempalace_list_wings: All wings with drawer counts
- mcp_mempalace_list_rooms: Rooms within a wing (optional wing filter)
- mcp_mempalace_get_taxonomy: Full wing/room/count tree
- mcp_mempalace_get_aaak_spec: Get AAAK compression dialect specification
- mcp_mempalace_kg_query: Query entity relationships with time filtering.
  - entity (required): e.g., "Max", "MyProject"
  - as_of: Date filter (YYYY-MM-DD), what was true at that time
  - direction: "outgoing", "incoming", or "both" (default "both")
- mcp_mempalace_kg_add: Add a fact: subject → predicate → object
  - subject, predicate, object (required)
  - valid_from: When this became true
  - source_closet: Source reference
- mcp_mempalace_kg_invalidate: Mark a fact as no longer true
  - subject, predicate, object (required)
  - ended: When it stopped being true (default: today)
- mcp_mempalace_kg_timeline: Chronological story of an entity
  - entity: Optional filter (omit for all events)
- mcp_mempalace_kg_stats: Graph overview: entities, triples, relationship types
- mcp_mempalace_traverse: Walk from a room to find connected ideas across wings
  - start_room (required)
  - max_hops: Connection depth (default 2)
- mcp_mempalace_find_tunnels: Find rooms that bridge two wings
  - wing_a, wing_b (required)
- mcp_mempalace_graph_stats: Graph connectivity overview
- mcp_mempalace_add_drawer: Store verbatim content into a wing/room
  - wing, room, content (required)
  - source_file: Optional source reference
- mcp_mempalace_delete_drawer: Remove a drawer by ID
  - drawer_id (required)
- mcp_mempalace_diary_write: Write a session diary entry
  - agent_name (required): Your name/identifier
  - entry (required): What happened, what you learned, what matters
  - topic: Category tag (default "general")
- mcp_mempalace_diary_read: Read recent diary entries
  - agent_name (required)
  - last_n: Number of entries (default 10)

Sessions are automatically mined every 15 minutes via cron job mempalace-auto-save.
Hermes Palace Heartbeat (active structured saves):
- ~/.hermes/hooks/hermes_palace_heartbeat.sh: check if checkpoint save is due
- ~/.hermes/hooks/hermes_palace_heartbeat.sh --precompact: emergency save check
- ~/.hermes/hooks/hermes_palace_heartbeat.sh --ack: acknowledge save completed
- ~/.hermes/HEARTBEAT.md

Legacy hooks (for Claude Code / Codex CLI):
- ~/.hermes/hooks/mempal_save_hook.sh: trigger manual save checkpoint
- ~/.hermes/hooks/mempal_precompact_hook.sh: emergency pre-compaction save

Cron job danger (never mine a full directory from a scheduler):
If a cron job runs mempalace mine <directory> --mode convos on a recurring interval,
each invocation scans the entire directory. When the directory grows large enough
that mining takes longer than the interval, overlapping instances pile up, consume
all CPU/RAM, and bloat the palace with duplicate drawers. Always use event-driven
hooks (mine a single staged file) rather than polling cron jobs (mine a directory).
If you must use cron, add a single-instance guard (flock or PID file check) and
a --limit flag to prevent unbounded runs.
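The single-instance guard mentioned above can be sketched in Python with an advisory file lock. This is an illustration rather than part of MemPalace: `try_acquire_lock` and `run_exclusive` are hypothetical helper names, the lock path is whatever you choose, and `fcntl` is only available on Unix-like systems.

```python
import fcntl


def try_acquire_lock(lock_path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    handle = open(lock_path, "a")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def run_exclusive(lock_path, job):
    """Run job() only if no other holder has the lock; return True if it ran."""
    handle = try_acquire_lock(lock_path)
    if handle is None:
        return False  # a previous run is still active, so skip this one
    try:
        job()
        return True
    finally:
        handle.close()  # closing the file releases the lock
```

A scheduled wrapper would pass a callable that launches the miner and simply exit when `run_exclusive` returns False, so overlapping runs never stack up.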
When running mempalace mine on /home/elvis/.hermes/sessions/ as a cron job, follow this checklist to avoid segfaults, stale processes, and incomplete runs.
Before starting, verify no other mempalace mine process is already running:
ps aux | grep -i mempalace | grep -v grep
If a process is found:
- Check its CPU usage and elapsed time: ps -p <pid> -o pid,cmd,pcpu,etime
- If all threads are in futex_do_wait (ps -L -p <pid> -o tid,stat,wchan), the HNSW index is corrupted and the process is deadlocked. Kill it and run mempalace repair before restarting.
- If it is mining from a staging directory (/tmp/mempalace_staging/ or similar), another session is likely performing recovery. Do NOT interfere; let it finish.
- Otherwise, stop it gracefully: kill -15 <pid>
- Run the collection.add() test (see HNSW corruption section) before launching a new miner

Hermes sessions/ directories accumulate request_dump_*.json artifacts that cause segfaults or extreme slowness. Temporarily move them out:
mkdir -p /tmp/request_dumps_hold
mv /home/elvis/.hermes/sessions/request_dump_*.json /tmp/request_dumps_hold/
Run the mining job as a background process so it survives foreground timeouts:
mempalace mine /home/elvis/.hermes/sessions/ --mode convos
Cron jobs cannot block for hours. Start a background watcher to restore request dumps when mining finishes:
(while kill -0 <mining_pid> 2>/dev/null; do sleep 30; done; mv /tmp/request_dumps_hold/request_dump_*.json /home/elvis/.hermes/sessions/ 2>/dev/null; rmdir /tmp/request_dumps_hold 2>/dev/null; echo "Restored $(date)") &
mempalace status is sampled and lags. Use direct ChromaDB queries to track real progress while the background job runs:
/home/elvis/.local/share/pipx/venvs/mempalace/bin/python -c "
import chromadb
client = chromadb.PersistentClient(path='/home/elvis/.mempalace/palace')
collection = client.get_or_create_collection('mempalace_drawers')
results = collection.get(where={'wing': 'sessions'})
sources = set(m.get('source_file', '') for m in results['metadatas'])
print(f'Total unique sources filed: {len(sources)}')
"
Calculate rate and ETA:
sources_per_minute = (sources2 - sources1) / minutes_elapsed
ETA_minutes = remaining_sources / sources_per_minute

After mining completes:
- Confirm the chroma.sqlite3 timestamp updated: ls -l /home/elvis/.mempalace/palace/chroma.sqlite3
- Confirm the request dumps were restored: ls /home/elvis/.hermes/sessions/request_dump_*.json | wc -l

If MemPalace needs installation or re-initialization:
pip install mempalace
mempalace init ~/my-convos
mempalace mine ~/my-convos
MemPalace is already configured as an MCP server in Hermes. If you need to re-add it:
{
"mcpServers": {
"mempalace": {
"command": "python3",
"args": ["-m", "mempalace.mcp_server"]
}
}
}
Search for context:
# Semantic search — keep query short, no system prompt context
mcp_mempalace_search(query="Redis decision against")
# Knowledge graph query
mcp_mempalace_kg_query(entity="ProjectX")
# Specific wing/room
mcp_mempalace_search(query="API design", wing="myproject", room="backend")
# Traverse cross-wing connections
mcp_mempalace_traverse(start_room="auth-migration", max_hops=2)
Add to memory:
# Check for duplicates first (optional but recommended)
mcp_mempalace_check_duplicate(
content="Decided to use PostgreSQL over MySQL because...",
threshold=0.87
)
# File verbatim content
mcp_mempalace_add_drawer(
wing="myproject",
room="decisions",
content="Decided to use PostgreSQL over MySQL because..."
)
# Add KG fact
mcp_mempalace_kg_add(
subject="User",
predicate="prefers",
object="dark themes"
)
# Diary entry in AAAK format
mcp_mempalace_diary_write(
agent_name="Hermes",
entry="SESSION:2026-04-20|built.mempalace.hooks+configured.cron|milestone:memory.active|★★★"
)
If MCP tools aren't available yet, use the CLI:
- mempalace search "query": search
  - --wing WING, --room ROOM, --results N
- mempalace status: overview
- mempalace mine <dir>: ingest files
  - --mode convos for conversation mining
- mempalace repair: rebuild vector index from stored data (fixes HNSW segfaults after corruption)
- mempalace wake-up: get L0+L1 context

- Call mcp_mempalace_check_duplicate before storing new content to avoid duplicates. Try threshold=0.85–0.87 for near-duplicate detection.
- AAAK (spec available via mcp_mempalace_status) is a compressed notation for efficient storage. Read it naturally: expand codes mentally, treat markers as emotional context.

mempalace status / mcp_mempalace_list_wings omit wings outside the top sample

Both the CLI status command and MCP mcp_mempalace_status / mcp_mempalace_list_wings use a sampling/aggregation approach that, in large palaces, surfaces only the largest wings (commonly just dokploy and misjustice-alliance). They will:
- Omit smaller wings such as sessions, hermes, ryno-prd, etc.

Do not use status or list_wings to verify a mining run. Instead:
- Use mempalace search "<session_id>" (CLI) or mcp_mempalace_search(query="...", wing="sessions") (MCP) to confirm content is retrievable.
- Use mcp_mempalace_list_rooms(wing="<wing>"): it returns accurate room-level drawer counts even when the wing is hidden from status / list_wings.
- Query ChromaDB directly:

/home/elvis/.local/share/pipx/venvs/mempalace/bin/python -c "
import chromadb
client = chromadb.PersistentClient(path='/home/elvis/.mempalace/palace')
collection = client.get_or_create_collection('mempalace_drawers')
total = collection.count()
print(f'Total drawers: {total}')
# Query a specific wing (exact syntax depends on installed ChromaDB version)
try:
results = collection.get(where={'wing': 'sessions'})
except Exception:
results = collection.get(where={'wing': {'$eq': 'sessions'}})
print(f'Drawers in wing: {len(results[\"ids\"])}')
# List all wings (sample if huge; cap at 50K to avoid OOM)
sample_limit = min(total, 50000)
all_data = collection.get(limit=sample_limit)
wings = {}
for meta in all_data['metadatas']:
w = meta.get('wing', 'unknown')
wings[w] = wings.get(w, 0) + 1
for w, c in sorted(wings.items(), key=lambda x: -x[1]):
print(f' {w}: {c}')
"
Note: ChromaDB telemetry warnings in stderr are harmless. If where={'wing': 'sessions'} fails with an operator error, fall back to where={'wing': {'$eq': 'sessions'}} (or vice versa) depending on the local ChromaDB version.

The CLI mempalace search has been observed returning empty results for queries that successfully match via MCP search. If verification fails on the CLI, always double-check with MCP mcp_mempalace_search.

source_file stores absolute paths

When querying ChromaDB directly by source_file, the metadata stores the full absolute path (e.g., /home/elvis/.hermes/sessions/session_20260421_080032.json), not just the filename. Querying with only the basename will return zero matches.
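One way to work with that constraint is sketched below: either build the absolute path before filtering, or compare basenames on metadata that has already been fetched. Both helpers (`source_file_filter`, `drawers_for_file`) are hypothetical names, not part of MemPalace or ChromaDB, and the sketch assumes POSIX-style paths. The returned dictionary is meant to be passed as the `where` argument of `collection.get`.

```python
import os


def source_file_filter(sessions_dir, filename):
    """Build a metadata filter that matches the stored absolute path."""
    absolute = os.path.abspath(os.path.join(sessions_dir, filename))
    return {"source_file": absolute}


def drawers_for_file(metadatas, filename):
    """Select already-fetched metadata entries by basename of source_file."""
    return [
        meta for meta in metadatas
        if os.path.basename(meta.get("source_file", "")) == filename
    ]
```

The first helper keeps the filtering inside ChromaDB. The second is useful when the directory prefix is unknown, at the cost of fetching the metadata first.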
Immediately after a successful mempalace mine run, the following can appear stale for a short period (seconds to minutes depending on palace size):
- mcp_mempalace_list_rooms: may not yet reflect newly filed drawers.
- mcp_mempalace_search and CLI mempalace search may not surface the freshest content until ChromaDB flushes its index.
- mempalace status and mcp_mempalace_list_wings are sampled/aggregated and will not show updated totals in real time.

Reliable post-mining verification:
- Check that chroma.sqlite3 was modified at the mining timestamp (ls -l /home/elvis/.mempalace/palace/chroma.sqlite3).
- Sort by filed_at: If you query the collection directly, do not rely solely on Counter.most_common() on source_file; a new file with ~62 drawers can be buried below older files that have identical counts. Instead, sort by the filed_at metadata to surface the most recently filed drawers:
results = collection.get(where={'wing': 'sessions'})
indexed = sorted(enumerate(results['metadatas']),
key=lambda x: x[1].get('filed_at', ''),
reverse=True)
# indexed[0] will be the most recently filed drawer
This is the fastest way to confirm that a specific mining run actually wrote drawers to the palace.

- client.list_collections() now returns a list of strings (collection names), not collection objects. Use client.get_collection(name) to access a collection. Code written for older versions will raise NotImplementedError.
- The stderr message "Failed to send telemetry event ClientStartEvent: capture() takes 1 positional argument but 3 were given" is a known non-fatal ChromaDB issue. Ignore it.

The MCP server (mempalace.mcp_server) is a long-running process that caches the ChromaDB collection in memory. When a new mining run adds drawers to the palace, existing MCP server processes may hold a stale cached index that does not include the new data.
Symptoms:
- The CLI works: mempalace search "cooling" --wing terrahash-stack returns results
- mcp_mempalace_search(query="cooling", wing="terrahash-stack") returns empty []
- mcp_mempalace_list_rooms(wing="terrahash-stack") shows room counts, but search with the same wing filter returns nothing
- Multiple mempalace.mcp_server processes are visible in ps aux (started hours or days apart)

Root cause:
- The MCP server holds _collection_cache globally and never refreshes it
- After mempalace mine runs, the cached collection is out of sync with the on-disk palace

Fix:
# Kill all stale MCP server processes
pkill -f "mempalace.mcp_server"
Hermes will auto-restart a fresh MCP server on the next tool call. The new process will load the current palace state.
Workaround if MCP is still failing after restart: Use the CLI as a fallback:
mempalace search "query" --wing mywing --room myroom --results 10
mempalace mine: segfaults OR deadlocks

If mempalace mine fails to write after a previous run crashed mid-write, the HNSW vector index is likely corrupted. Corruption manifests in two distinct ways:

Segfault:
- mempalace mine crashes with core dumped in chromadb/segment/impl/vector/local_hnsw.py
- Direct Python calls through chromadb.PersistentClient also segfault

Deadlock (futex_do_wait):
- mempalace mine starts but never progresses: no DB modifications, no new drawers filed
- Threads sit in futex_do_wait in ps -L output
- collection.add() calls hang indefinitely (not just slow; they never return)

Quick diagnostic test (30 seconds):
/home/elvis/.local/share/pipx/venvs/mempalace/bin/python -c "
import chromadb, time
client = chromadb.PersistentClient(path='/home/elvis/.mempalace/palace')
collection = client.get_or_create_collection('mempalace_drawers')
start = time.time()
collection.add(ids=['test_hnsw'], documents=['test'], metadatas=[{'wing': 'test'}])
print(f'OK — add() completed in {time.time()-start:.2f}s')
collection.delete(ids=['test_hnsw'])
" 2>&1 | grep -v telemetry
If this hangs for >10s, the HNSW index is corrupted and needs repair.
Do NOT:
- Reinstall or upgrade chromadb or chroma-hnswlib: this does not fix index corruption and may break mempalace compatibility

Fix:
mempalace repair
This rebuilds the palace vector index from stored data, backs up the old index to /home/elvis/.mempalace/palace.backup, and resolves both segfaults and deadlocks.
Repair timeline expectations:
| Palace size | Drawers | Duration | CPU | RAM |
|---|---|---|---|---|
| ~350 MB | ~22K | 25–30 min | 325% | ~850 MB |
| 20+ GB | 100K+ | Can exceed 1 hour | 300%+ | 2+ GB |
During repair, monitor progress via chroma.sqlite3 growth and find ~/.mempalace/palace -mmin -10 for new HNSW segment files. The process is CPU-intensive with multiple threads in R state — 0% CPU for >5 min indicates a problem, not completion.
After repair, verify writes work before restarting the miner:
# Quick post-repair verification (should complete in <1s)
mempalace search "test" --wing sessions --results 1
Then run mempalace mine again.
mempalace repair fails with shutil.copytree error (SQLite journal race condition)

mempalace repair begins by copying the existing palace to palace.backup. If a SQLite transaction is active when repair starts, a chroma.sqlite3-journal file may exist briefly and then disappear mid-copy. This causes shutil.copytree to raise:
shutil.Error: [('/home/elvis/.mempalace/palace/chroma.sqlite3-journal', '/home/elvis/.mempalace/palace.backup/chroma.sqlite3-journal', "[Errno 2] No such file or directory...")]
Fix: Wait for any active SQLite transactions to settle (no other mempalace processes running), then retry mempalace repair. If the journal file persists, shut down any MCP server processes that may hold the database open:
pkill -f "mempalace.mcp_server"
sleep 2
mempalace repair
When mcp_mempalace_* tools fail with ClosedResourceError or the gateway reports MCP server 'mempalace' is unreachable after N consecutive failures, you can still read and write palace data directly while the transport is broken.
Kill stale MCP processes — cached servers may be out of sync or deadlocked:
pkill -f "mempalace.mcp_server"
Wait 10s and retry the MCP call. If it still fails, the server is likely crashing on spawn or the gateway transport is broken.
Verify the palace files are intact:
python3 -c "
import sqlite3
for db in ['/home/elvis/.mempalace/palace/knowledge_graph.sqlite3',
'/home/elvis/.mempalace/palace/chroma.sqlite3']:
conn = sqlite3.connect(db)
print(f'{db}: {conn.execute(\"PRAGMA integrity_check\").fetchone()[0]}')
conn.close()
"
If MCP search is down but you need to find drawers:
/home/elvis/.local/share/pipx/venvs/mempalace/bin/python -c "
import chromadb
client = chromadb.PersistentClient(path='/home/elvis/.mempalace/palace')
collection = client.get_collection('mempalace_drawers')
res = collection.query(query_texts=['your search phrase'], n_results=5)
for doc in res['documents'][0]:
print(doc[:500])
"
If mcp_mempalace_kg_query is unavailable, query knowledge_graph.sqlite3 directly:
python3 -c "
import sqlite3
conn = sqlite3.connect('/home/elvis/.mempalace/palace/knowledge_graph.sqlite3')
c = conn.cursor()
c.execute('SELECT subject, predicate, object, valid_from FROM triples WHERE subject = ?', ('entity_name',))
for r in c.fetchall():
print(r)
conn.close()
"
When mcp_mempalace_kg_add fails and you must backfill knowledge graph relationships (e.g., a post-mortem was filed but its triples were not added):
import sqlite3
import uuid
from datetime import datetime
conn = sqlite3.connect('/home/elvis/.mempalace/palace/knowledge_graph.sqlite3')
c = conn.cursor()
def ensure_entity(name, etype='unknown'):
c.execute('SELECT id FROM entities WHERE name = ?', (name,))
row = c.fetchone()
if row:
return row[0]
eid = str(uuid.uuid4())
c.execute('INSERT INTO entities (id, name, type, properties) VALUES (?, ?, ?, ?)',
(eid, name, etype, '{}'))
return eid
def add_triple(subject, predicate, obj, valid_from=None, source_closet=None):
valid_from = valid_from or datetime.now().strftime('%Y-%m-%d')
tid = 't_' + subject + '_' + predicate + '_' + obj + '_' + str(uuid.uuid4())[:12]
c.execute('''INSERT INTO triples
(id, subject, predicate, object, valid_from, valid_to, confidence, source_closet, extracted_at)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)''',
(tid, subject, predicate, obj, valid_from, None, 1.0, source_closet, datetime.now().isoformat()))
# Example usage
ensure_entity('mempalace', 'system')
ensure_entity('palace_bloat_86GB', 'issue')
add_triple('mempalace', 'suffered_issue', 'palace_bloat_86GB',
source_closet='infrastructure/mempalace_bloat_postmortem')
conn.commit()
conn.close()
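The triple IDs built above end in a random uuid4 fragment, so running the same backfill twice inserts two rows. A hedged variant that makes re-runs idempotent is sketched below. It assumes the `id` column is a primary key (or otherwise unique), which the original snippet does not confirm. `deterministic_triple_id`, `add_triple_once`, and the reduced demo schema are hypothetical, and the real `triples` table has more columns.

```python
import sqlite3
import uuid


def deterministic_triple_id(subject, predicate, obj):
    """Derive the same id for the same (subject, predicate, object) every time."""
    key = "\x1f".join((subject, predicate, obj))  # unit separator avoids ambiguity
    return "t_" + uuid.uuid5(uuid.NAMESPACE_URL, key).hex


def create_demo_table(conn):
    """Reduced stand-in schema for illustration only."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS triples ("
        "id TEXT PRIMARY KEY, subject TEXT, predicate TEXT, "
        "object TEXT, valid_from TEXT)"
    )


def add_triple_once(conn, subject, predicate, obj, valid_from):
    """Insert the triple unless its id already exists; return True if inserted."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO triples (id, subject, predicate, object, valid_from) "
        "VALUES (?, ?, ?, ?, ?)",
        (deterministic_triple_id(subject, predicate, obj),
         subject, predicate, obj, valid_from),
    )
    return cursor.rowcount == 1
```

With this shape a repeated backfill is a no-op, and the return value tells the caller which triples were new.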
Safety rules for direct writes:
- Back up knowledge_graph.sqlite3 before bulk inserts (cp ... /tmp/kg_backup.sqlite3).
- Call ensure_entity() before inserting a triple: the entities table must contain the subject/object names or the KG will have orphaned references.
- Use deterministic id values for triples (e.g., t_<subject>_<predicate>_<object>_<uuid4>) so accidental re-runs don't create exact duplicates.
- Set source_closet to the wing/room path so the triples are traceable later.
- Return to the MCP tools (after restarting the server or running mempalace repair) when ongoing palace operations are needed. Direct SQLite writes bypass validation layers and should not become your default workflow.

mempalace repair timeout on large palaces

On very large palaces (20+ GB), mempalace repair can exceed foreground timeout limits. If repair is killed or times out mid-rebuild, the palace may be left with 0 drawers because the new index was incomplete.
Symptoms:
- mempalace status shows 0 drawers
- collection.count() == 0
- /home/elvis/.mempalace/palace.backup exists

Fix: Restore from the backup that repair created before modifying data:
mv /home/elvis/.mempalace/palace /home/elvis/.mempalace/palace.empty_repair
mv /home/elvis/.mempalace/palace.backup /home/elvis/.mempalace/palace
Verify restoration:
/home/elvis/.local/share/pipx/venvs/mempalace/bin/python -c "
import chromadb
client = chromadb.PersistentClient(path='/home/elvis/.mempalace/palace')
print(client.get_or_create_collection('mempalace_drawers').count())
"
The restored backup may still contain the original HNSW corruption (if any), but the data drawers are intact. Test with a tiny mempalace mine on one file before restarting the full run.
If mempalace repair crashes or leaves the palace in an unusable state, and the backup is also corrupted, create a fresh palace and swap it in. Reading from a corrupted palace usually works fine; writing to it segfaults. Use this asymmetry to recover data.
Diagnostic pattern:
# Reading works even when writing segfaults
/home/elvis/.local/share/pipx/venvs/mempalace/bin/python -c "
import chromadb
c = chromadb.PersistentClient('/home/elvis/.mempalace/palace')
col = c.get_collection('mempalace_drawers')
print('Readable count:', col.count()) # Usually succeeds
"
Recovery workflow:
mkdir -p /home/elvis/.mempalace/palace_fresh
/home/elvis/.local/share/pipx/venvs/mempalace/bin/python -c "
import chromadb, shutil, os
old = chromadb.PersistentClient('/home/elvis/.mempalace/palace')
new = chromadb.PersistentClient('/home/elvis/.mempalace/palace_fresh')
old_col = old.get_collection('mempalace_drawers')
new_col = new.create_collection('mempalace_drawers')
total = old_col.count()
batch_size = 500
for offset in range(0, total, batch_size):
batch = old_col.get(limit=batch_size, offset=offset, include=['documents', 'metadatas'])
if batch['ids']:
new_col.add(ids=batch['ids'], documents=batch['documents'], metadatas=batch['metadatas'])
print(f'Migrated {min(offset + batch_size, total)}/{total}')
"
mempalace --palace /home/elvis/.mempalace/palace_fresh mine /tmp/clean_sessions --mode convos
mv /home/elvis/.mempalace/palace /home/elvis/.mempalace/palace.corrupted.$(date +%s)
mv /home/elvis/.mempalace/palace_fresh /home/elvis/.mempalace/palace
Note: Migration is CPU- and I/O-intensive. For 1,000+ drawers, run the Python migration in the background and monitor chroma.sqlite3 growth rather than blocking on completion.
If the backup itself is corrupted (malformed sqlite):
Repair creates the backup by copying the palace mid-operation. If repair is interrupted while writing the backup, the backup's chroma.sqlite3 can also be malformed (sqlite3.DatabaseError: database disk image is malformed). In this case:
mv /home/elvis/.mempalace/palace /home/elvis/.mempalace/palace.prerewrite
mv /home/elvis/.mempalace/palace.backup /home/elvis/.mempalace/palace.corrupt_backup
mempalace init /home/elvis/.mempalace/fresh_palace
Or let the next mempalace mine auto-create a new palace.

mempalace mine --mode convos segfaults or extreme slowness

A mempalace mine <dir> --mode convos run can segfault (exit code -11 / SIGSEGV) or run **orders of magnitude slower than expected** when the target directory contains a small number of actual conversation files (.jsonl) mixed with a large number of unrelated files (e.g., .json request dumps, logs, or artifacts) that mempalace attempts to parse as conversations.
Symptoms (segfault path):
- mempalace mine /home/elvis/.hermes/sessions/ --mode convos segfaults after listing thousands of files
- The standalone collection.add() test still succeeds (unlike HNSW corruption)
- mempalace repair completes but finds 0 drawers, and previously filed data may become inaccessible

Symptoms (extreme slowness path):
- The run appears stalled while chroma.sqlite3 continues growing
- mempalace status may cap at exactly 10,000 drawers even though embeddings are still being written (verify via direct ChromaDB query)

Root cause:
- --mode convos does not filter by extension; it attempts to ingest every file in the directory
- The sessions directory mixes request_dump_*.json files with ~1,800 actual session_*.json conversation sessions
- Each request_dump_*.json file generates ~800+ embedding chunks (one per message), while a typical session generates only ~3–10

Fix (isolate target files before mining):
# Create a clean temp directory with only conversation sessions
mkdir -p /tmp/hermes_jsonl
cp /home/elvis/.hermes/sessions/*.jsonl /tmp/hermes_jsonl/
mempalace mine /tmp/hermes_jsonl/ --mode convos
rm -rf /tmp/hermes_jsonl
If .jsonl files are unavailable and you must mine .json session files, filter explicitly:
mkdir -p /tmp/hermes_sessions
cp /home/elvis/.hermes/sessions/session_*.json /tmp/hermes_sessions/
mempalace mine /tmp/hermes_sessions/ --mode convos
rm -rf /tmp/hermes_sessions
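The same staging step can be sketched in Python for hooks that are not shell scripts. This is an illustration under assumptions: `stage_conversation_files` is a hypothetical helper, and the default patterns mirror the `session_*.json` and `*.jsonl` globs used in the commands above.

```python
import fnmatch
import os
import shutil
import tempfile


def stage_conversation_files(source_dir, patterns=("session_*.json", "*.jsonl")):
    """Copy only files matching the patterns into a fresh temporary directory.

    Returns the staging directory path; the caller mines it and removes it.
    """
    staging_dir = tempfile.mkdtemp(prefix="mempalace_staging_")
    for name in sorted(os.listdir(source_dir)):
        path = os.path.join(source_dir, name)
        if os.path.isfile(path) and any(fnmatch.fnmatch(name, p) for p in patterns):
            shutil.copy2(path, staging_dir)  # copy2 keeps timestamps
    return staging_dir
```

Anything that does not match a pattern, including `request_dump_*.json`, never reaches the miner. Clean up with `shutil.rmtree(staging_dir)` once mining finishes.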
Alternative — in-place staging to preserve wing name and skip duplicates
If you have already mined some content from the original directory and want to avoid creating a new wing (which would duplicate already-filed drawers due to different source_file paths), temporarily move the artifacts out of the directory, mine the clean directory, then restore them:
mkdir -p /tmp/request_dumps_hold
mv /home/elvis/.hermes/sessions/request_dump_*.json /tmp/request_dumps_hold/
mempalace mine /home/elvis/.hermes/sessions/ --mode convos
mv /tmp/request_dumps_hold/request_dump_*.json /home/elvis/.hermes/sessions/
rmdir /tmp/request_dumps_hold
This preserves the original wing name (e.g., sessions) and source_file paths, so already-filed files are correctly skipped as duplicates.
Prevention for auto-save hooks:
Configure cron hooks to copy only .jsonl files (or session_*.json files) to a staging directory before calling mempalace mine. Do not point mempalace mine directly at the raw Hermes sessions/ directory.
mempalace mine on directories with thousands of files (e.g., 4,000+ Hermes session files) can take 30–60+ minutes to complete under ideal conditions (clean, homogeneous files). On mixed directories containing request dumps or other non-conversation artifacts, the same job can take 10–30+ hours. The process is CPU-intensive (300%+ utilization) and alternates between compute-bound (Rsl) and disk-I/O-bound (Dsl) states as ChromaDB vectorizes and writes chunks.
Symptoms:
- The foreground command times out (exit_code 124)
- The job was making progress (chroma.sqlite3 growing and timestamp updating) before the timeout killed it
- Progress is visible only indirectly (chroma.sqlite3 growth, not the file counter)

Fix: Run the mining job in the background so it is not subject to foreground timeout limits:
# Background — agent polls until completion
mempalace mine /home/elvis/.hermes/sessions/ --mode convos
Then monitor via ps or process polling until the job finishes. Do not start overlapping mining jobs; wait for the previous one to complete.
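A low-frequency progress check can be sketched by comparing two snapshots of the database file instead of reading miner output. The helper names `snapshot` and `made_progress` are hypothetical, and the path you pass is whichever chroma.sqlite3 your palace uses.

```python
import os


def snapshot(db_path):
    """Capture (size in bytes, modification time) for the database file."""
    stat = os.stat(db_path)
    return (stat.st_size, stat.st_mtime)


def made_progress(before, after):
    """True if the file grew or was modified between two snapshots."""
    return after[0] > before[0] or after[1] > before[1]
```

Take one snapshot, wait several minutes, take another, and only treat the job as stalled when `made_progress` stays False across more than one interval.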
AI-Agent execution note: If an AI agent (e.g., a Hermes cron hook) is executing this workflow, avoid tight poll/wait loops — they exhaust the tool-call budget without producing progress and can leave the job orphaned when the iteration limit is hit. Instead:
- Use notify_on_complete if your runtime supports it.
- Poll sparingly: check the chroma.sqlite3 modification time or run ps on the child PID no more than once every 5–10 minutes.
- Do not rely on output_preview for progress: mempalace mine buffers all output until completion.

When the progress counter freezes on a single file for >10 minutes, use this checklist to decide whether the process is still making progress or should be killed.
| Check | Command | Interpretation |
|---|---|---|
| File size | ls -la <file> | If <1 MB with normal JSON, the file itself is not the bottleneck. |
| JSON sanity | python3 -c "import json; d=json.load(open('<file>')); print(len(d.get('messages',[])))" | Confirms the parser isn't choking on malformed data. |
| Process state | ps -p <pid> -o pid,cmd,pcpu,etime,stat | R + high CPU = compute-bound. D = disk-I/O-bound. S + 0% CPU = idle/stuck. |
| Context switches | cat /proc/<pid>/status | grep ctxt_switches | Increasing numbers = threads are still scheduling. Flat for >2 min = deadlocked. |
| Database growth | ls -la ~/.mempalace/palace/chroma.sqlite3 | Growing size + recent mtime = embeddings are still being written. Flat = no progress. |
| Thread states | ps -L -p <pid> -o tid,stat,wchan | All futex_do_wait = HNSW deadlock (kill + run mempalace repair). Mixed R/D = working. |
| Open files | lsof -p <pid> | grep chroma | chroma.sqlite3 open with write access = actively writing. No DB files open = process may have crashed or finished. |
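The context-switch check in the table can be automated with a small parser. The sketch below assumes the usual Linux layout of /proc/<pid>/status, where the counters appear as `voluntary_ctxt_switches` and `nonvoluntary_ctxt_switches` lines. `parse_ctxt_switches` and `looks_stalled` are hypothetical helper names, and the counters describe only the task whose status file was read.

```python
def parse_ctxt_switches(status_text):
    """Sum voluntary and nonvoluntary context switches from status file text."""
    total = 0
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() in ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches"):
            total += int(value.strip())
    return total


def looks_stalled(samples):
    """True if at least two samples were taken and none of them differ.

    The caller is responsible for spacing the samples over more than 2 minutes.
    """
    return len(samples) >= 2 and len(set(samples)) == 1
```

Read the status file a few times over the observation window, feed each text to `parse_ctxt_switches`, and pass the resulting totals to `looks_stalled`.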
Decision matrix:
- All threads in futex_do_wait: kill the process, run mempalace repair, then restart.

When started via Hermes Agent's terminal or process tools, a bash wrapper spawns the actual mempalace child process. The wrapper PID may appear sleeping with 0% CPU even though mining is proceeding normally.
pstree -p <wrapper_pid>
# or
ps --forest -o pid,ppid,cmd -g $(ps -o pgid= -p <wrapper_pid>)
ps -p <child_pid> -o pid,cmd,pcpu,pmem,etime,vsz,rss
Count the unique source_file entries in the target wing:
results = collection.get(where={'wing': 'sessions'})
filed_sources = set(m.get('source_file', '') for m in results['metadatas'])
print(f'Unique files filed: {len(filed_sources)}')
- mempalace mine buffers stdout until completion. Verify progress by watching chroma.sqlite3 timestamps or checking child process CPU/RSS growth.
- Do not kill the process just because output_preview is empty. In Hermes Agent and similar environments, the absence of visible output does not mean the process is hung. Killing a working miner discards partial progress.
- If the exit code is -9 (SIGKILL), the system killed it, not mempalace. This indicates the execution environment (cron timeout, agent iteration limit, OOM killer) terminated the process. The miner itself did not crash. Partial progress up to that point may have been written.
- lsof confirms the process is active: lsof -p <pid> shows open files. If chroma.sqlite3 appears in the output with a growing size (ls -lt ~/.mempalace/palace/chroma.sqlite3), the miner is writing embeddings even though stdout is silent.
- A working miner alternates between compute-bound (R) and disk-I/O-bound (D) states.

If the child PID disappears but the wrapper remains, the job finished (check exit code via wait). If the child is gone and chroma.sqlite3 was not updated, investigate segfaults or disk-space issues.
mempalace mine <dir> --mode convos auto-creates:
- A wing named after the directory (e.g., sessions for ~/sessions)
- Rooms assigned by topic (technical, architecture, planning, decisions, problems, or general)

To verify a convo-mining run, search with wing="<dirname>" rather than expecting the wing to appear in status.

Use this workflow when the user says the palace is "huge," "needs pruning," disk is full, or MCP is failing after rapid growth.
Before measuring or cleaning anything, verify that no background mempalace mine process is actively re-bloating the palace while you work. Rogue miners from previous failed sessions, cron jobs, or backgrounded hooks will continuously ingest junk and undo your cleanup.
pgrep -af "mempalace mine" | grep -v grep
Kill every miner you did not intentionally start:
pkill -f "mempalace mine"
Also kill stale mempalace repair processes — they compete for the same ChromaDB locks and can interfere with cleanup:
pkill -f "mempalace repair"
Check for respawned miners under the gateway cgroup. If the Hermes gateway is running as a systemd service, a rogue miner may reappear as a child process of the gateway after you kill the first one:
systemctl --user status hermes-gateway
# or
pgrep -af "mempalace mine" | grep -v grep
Kill any respawned miner immediately.
Wait 5 seconds and verify nothing remains:
pgrep -af "mempalace" | grep -E "mine|repair" | grep -v grep
If any rogue process respawns immediately, disable the auto-save hook temporarily (chmod -x ~/.hermes/hooks/mempalace_save_hook.sh) until cleanup is complete.
If miners respawn after killing them, they are being triggered by a recurring scheduler, not a one-off background job. Trace the spawner before continuing cleanup.
1. Check Hermes cron jobs:
cronjob list
# or
hermes cron list
Look for any job whose prompt or script contains mempalace mine or mines the sessions directory. Note the job_id.
2. Check system cron / systemd timers:
crontab -l
systemctl list-timers --all
ls /etc/cron.d/
3. Check save hooks:
grep -r "mempalace mine" ~/.hermes/hooks/
cat ~/.hermes/hooks/mempal_save_hook.sh
cat ~/.hermes/hooks/mempalace_save_hook.sh
Distinguish between:
4. Remove the cron job:
cronjob remove <job_id>
5. Verify no more miners spawn: Wait for the original interval (e.g., 15 minutes) and re-check:
pgrep -af "mempalace mine" | grep -v grep
If nothing appears, the spawner is dead.
Root cause pattern: A cron job running mempalace mine /home/elvis/.hermes/sessions/ --mode convos every 15 minutes will spawn overlapping instances when each run takes >15 minutes. Each instance scans the entire directory, re-ingests already-stored transcripts (no dedup at the directory level), and piles up until CPU/memory are exhausted. Never run a full-directory mine from a cron job without file locking or a single-instance guard.
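A single-instance guard of the kind recommended above could look like the following sketch. It uses an advisory flock on a lock file, so it assumes a Unix host; run_exclusive and the lock path are hypothetical names chosen for illustration, not something MemPalace or Hermes provides.

```python
import fcntl
import os
import subprocess
from typing import Optional, Sequence


def run_exclusive(lock_path: str, cmd: Sequence[str]) -> Optional[int]:
    """Run cmd only if no other process holds the lock.

    Returns the command's exit code, or None if another instance
    already holds the lock and this run was skipped.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        try:
            # Non-blocking exclusive lock: fails immediately if already held
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None
        return subprocess.run(list(cmd)).returncode
    finally:
        # Closing the descriptor also releases the lock
        os.close(fd)
```

A cron job would then call something like run_exclusive("/tmp/mempalace_mine.lock", ["mempalace", "mine", "<dir>", "--mode", "convos"]). When a previous run is still active, the new invocation returns None and exits instead of piling up.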
du -sh ~/.mempalace/*/
Typical offenders:
- palace.backup/: left by mempalace repair (often 10–20+ GB)
- palace-backup-<timestamp>-N/: multiple dated backups from repair or manual snapshots
- palace.empty_repair/: from a failed or timed-out repair
- palace.pre_merge.*: old pre-merge snapshots
- palace/ itself: oversized chroma.sqlite3 or bloated HNSW segments

Action: Delete all backup/restore directories immediately. They are safe to remove.
rm -rf ~/.mempalace/palace.backup ~/.mempalace/palace.empty_repair ~/.mempalace/palace.pre_merge.*
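Before deleting, it can help to list what would be removed and how much space each entry holds. The sketch below is a dry run only; backup_candidates and path_size are hypothetical helpers, and the patterns are copied from the offender list above.

```python
import fnmatch
import os
from typing import List, Tuple

# Name patterns taken from the list of typical offenders
BACKUP_PATTERNS = (
    "palace.backup",
    "palace-backup-*",
    "palace.empty_repair",
    "palace.pre_merge.*",
)


def path_size(path: str) -> int:
    """Total size in bytes of a file, or of all regular files under a directory."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = os.path.join(root, name)
            if not os.path.islink(fp):
                total += os.path.getsize(fp)
    return total


def backup_candidates(base: str) -> List[Tuple[str, int]]:
    """Return (name, size) for every entry in base matching a backup pattern."""
    found = []
    for entry in sorted(os.listdir(base)):
        if any(fnmatch.fnmatchcase(entry, p) for p in BACKUP_PATTERNS):
            found.append((entry, path_size(os.path.join(base, entry))))
    return found
```

Running backup_candidates(os.path.expanduser("~/.mempalace")) prints nothing by itself; iterate over the result to review names and sizes. The live palace/ directory never matches these patterns, so it is not reported.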
ChromaDB keeps vector index segments as UUID-named directories inside palace/. After repairs or crashes, old segments are left behind.
Find the active segment:
/home/elvis/.local/share/pipx/venvs/mempalace/bin/python -c "
import sqlite3
conn = sqlite3.connect('/home/elvis/.mempalace/palace/chroma.sqlite3')
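c = conn.cursor()
c.execute(\"SELECT id FROM segments WHERE scope = 'VECTOR'\")
# One VECTOR segment exists per collection, so print every row
for (segment_id,) in c.fetchall():
    print('ACTIVE:', segment_id)
conn.close()
"
Delete every UUID directory except the active ones:
active="<ACTIVE_UUID>"  # space-separated if more than one ID was printed
for d in ~/.mempalace/palace/*-*-*-*-*; do
  [ -d "$d" ] || continue
  case " $active " in
    *" $(basename "$d") "*) ;;   # active segment: keep
    *) rm -rf "$d" ;;
  esac
done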
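The directory sweep above is destructive, so a dry run that only lists the orphaned segment directories may be worth doing first. This is a sketch under the assumption that segment directories are named with canonical UUIDs; orphan_segments and is_uuid are hypothetical helper names.

```python
import os
import uuid
from typing import Iterable, List


def is_uuid(name: str) -> bool:
    """True if name is a canonical hyphenated UUID string."""
    try:
        return str(uuid.UUID(name)) == name.lower()
    except ValueError:
        return False


def orphan_segments(palace_dir: str, active_ids: Iterable[str]) -> List[str]:
    """List UUID-named directories in palace_dir that are not in active_ids."""
    keep = {a.lower() for a in active_ids}
    return sorted(
        entry
        for entry in os.listdir(palace_dir)
        if os.path.isdir(os.path.join(palace_dir, entry))
        and is_uuid(entry)
        and entry.lower() not in keep
    )
```

Pass every segment ID reported as active to orphan_segments; anything it returns is a candidate for deletion. Unlike the shell glob, it ignores directories whose names merely contain four hyphens.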
Query source files to find what inflated the palace:
import chromadb
from collections import Counter
client = chromadb.PersistentClient(path='/home/elvis/.mempalace/palace')
coll = client.get_or_create_collection('mempalace_drawers')
# Fetch metadata only; documents and embeddings are not needed for counting
results = coll.get(limit=coll.count(), include=['metadatas'])
sources = Counter((m or {}).get('source_file', '') for m in results['metadatas'])
for src, cnt in sources.most_common(20):
    print(f'{cnt:4d}x {src}')
Common garbage patterns:
- request_dump_*.json: API payload artifacts, 500–900 chunks each (they explode into hundreds of drawers due to large JSON payloads)
- session_cron_*.json: automated cron transcripts, low-value for memory
- batch test artifacts (hermes_batch_test)

If you see only a handful of request_dump_*.json source files dominating the top-20 with counts in the hundreds, those are the primary bloat source. A typical interactive session generates 3–10 drawers; a single request_dump_*.json can generate 800+.
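To put a number on how much of the palace each garbage pattern accounts for, the source_file values can be grouped by pattern. The sketch below works on a plain list of paths and does not touch ChromaDB; garbage_pattern and bloat_report are hypothetical helpers, and only the two filename patterns named above are covered.

```python
import fnmatch
import os
from collections import Counter
from typing import Iterable, Optional

# Filename patterns from the list of common garbage sources
GARBAGE_PATTERNS = ("request_dump_*.json", "session_cron_*.json")


def garbage_pattern(source_file: str) -> Optional[str]:
    """Return the garbage pattern a path matches, or None if it matches none."""
    name = os.path.basename(source_file)
    for pattern in GARBAGE_PATTERNS:
        if fnmatch.fnmatchcase(name, pattern):
            return pattern
    return None


def bloat_report(source_files: Iterable[str]) -> Counter:
    """Count drawers per garbage pattern; everything else is counted as 'other'."""
    report = Counter()
    for src in source_files:
        report[garbage_pattern(src) or "other"] += 1
    return report
```

Feeding it one source_file per drawer gives a per-pattern drawer count, which makes it easy to see whether request dumps or cron transcripts dominate.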
Use small-batch ID deletion. Do not use where filters with $contains — this ChromaDB version rejects them.
for attempt in range(5):
    total = coll.count()
    # Metadata is enough to decide what to delete
    results = coll.get(limit=max(total, 2000), include=['metadatas'])
    bad_ids = [
        rid for rid, meta in zip(results['ids'], results['metadatas'])
        if 'session_cron_' in (meta or {}).get('source_file', '')
        or 'request_dump' in (meta or {}).get('source_file', '')
    ]
    if not bad_ids:
        break
    # Delete in small batches of 500 IDs
    for i in range(0, len(bad_ids), 500):
        coll.delete(ids=bad_ids[i:i+500])
If this segfaults (exit 139), the HNSW index is corrupted — run mempalace repair and try again.
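Exit codes such as 139 and -9 both encode a signal number: shells report 128 plus the signal, while Python's subprocess module reports the negated signal. A small decoder, sketched here with the hypothetical name describe_exit, makes the two conventions easier to read in logs.

```python
import signal


def describe_exit(code: int) -> str:
    """Translate a process exit code into a short human-readable description."""
    if code == 0:
        return "success"
    if code < 0:
        signum = -code       # subprocess convention: negative signal number
    elif code > 128:
        signum = code - 128  # shell convention: 128 + signal number
    else:
        return f"error exit {code}"
    try:
        return f"killed by {signal.Signals(signum).name}"
    except ValueError:
        return f"unknown signal {signum}"
```

On Linux, describe_exit(139) reports SIGSEGV and describe_exit(-9) reports SIGKILL, which separates an index crash from an external kill.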
sqlite3 ~/.mempalace/palace/chroma.sqlite3 "VACUUM;"
If the sqlite3 CLI is not available, use Python:
/home/elvis/.local/share/pipx/venvs/mempalace/bin/python -c "
import sqlite3
conn = sqlite3.connect('/home/elvis/.mempalace/palace/chroma.sqlite3')
conn.execute('VACUUM')
conn.close()
"
If incremental cleanup keeps segfaulting or the count/limit semantics are inconsistent, do a controlled rebuild:
Back up knowledge_graph.sqlite3 before touching anything:
cp ~/.mempalace/palace/knowledge_graph.sqlite3 /tmp/kg_backup.sqlite3
Move the old palace out of the way:
mv ~/.mempalace/palace ~/.mempalace/palace.old
Stage only clean interactive files. Exclude cron transcripts, request dumps, and any session_cron_* files explicitly:
mkdir -p /tmp/mempalace_staging
for f in ~/.hermes/sessions/session_2026*.json ~/.hermes/sessions/session_sess_*.json; do
  # Skip unmatched glob patterns, which the shell passes through literally
  [ -f "$f" ] || continue
  name=$(basename "$f")
  case "$name" in
    session_cron_*) continue ;;
    request_dump_*) continue ;;
  esac
  cp "$f" /tmp/mempalace_staging/
done
echo "Staged $(ls /tmp/mempalace_staging/*.json | wc -l) files"
Mine the staging directory in the background:
mempalace mine /tmp/mempalace_staging/ --mode convos
Then monitor via ps or process polling until completion. Do not start overlapping mining jobs.
Watch for InvalidCollectionException: If the miner crashes with Collection <uuid> does not exist, the collection was recreated mid-run (e.g., by a concurrent repair or by rm -rf palace/*). Clear the partial state and restart:
rm -rf ~/.mempalace/palace/*
mempalace mine /tmp/mempalace_staging/ --mode convos
Restore the knowledge graph backup:
cp /tmp/kg_backup.sqlite3 ~/.mempalace/palace/knowledge_graph.sqlite3
Verify the rebuild:
/home/elvis/.local/share/pipx/venvs/mempalace/bin/python -c "
import chromadb
client = chromadb.PersistentClient(path='/home/elvis/.mempalace/palace')
coll = client.get_collection('mempalace_drawers')
print(f'Clean drawers: {coll.count()}')
"
Delete old palace and staging directory:
rm -rf ~/.mempalace/palace.old /tmp/mempalace_staging /tmp/kg_backup.sqlite3
The most common root cause of uncontrolled growth is ~/.hermes/hooks/mempalace_save_hook.sh (or the legacy mempal_save_hook.sh) mining the entire sessions directory instead of just the transcript being saved.
Patch the mining block to stage the single transcript file:
STAGE_DIR=""
MINE_DIR=""
if [ -n "$TRANSCRIPT_PATH" ] && [ -f "$TRANSCRIPT_PATH" ]; then
  STAGE_DIR=$(mktemp -d /tmp/mempalace_hook_stage.XXXXXX)
  cp "$TRANSCRIPT_PATH" "$STAGE_DIR/"
  MINE_DIR="$STAGE_DIR"
fi
if [ -n "$MINE_DIR" ]; then
  # Mine in a background subshell; remove the staging copy only after the
  # miner exits, so a slow run never loses its input mid-read
  (
    mempalace mine "$MINE_DIR" --mode convos >> "$STATE_DIR/hook.log" 2>&1
    rm -rf "$STAGE_DIR"
  ) &
fi
This prevents request dumps, cron sessions, and unrelated files from being ingested every time the hook fires.
After a controlled rebuild (or any mempalace repair that changes the palace state), the Hermes agent's internal MCP client may enter a persistent backoff loop if it experienced ClosedResourceError or segfault-related disconnects during the cleanup. Symptoms:
- mcp_mempalace_status returns: MCP server 'mempalace' is unreachable after N consecutive failures. Auto-retry available in ~Xs.
- hermes mcp test mempalace works fine (spawns a fresh test connection)

This is a session-level transport issue, not a palace issue. The agent's cached MCP connection to the old/crashed server is dead and the backoff timer keeps extending.
Fix: Restart the Hermes agent session (or the gateway if running through it). The clean palace and MCP server are ready; the agent just needs a fresh connection.
"Never guess what you can query."
The palace is your long-term memory. Use it actively.
When a FastAPI broker or other non-agent service needs to write job receipts to MemPalace but cannot call MCP tools directly, use the on-disk bridge pattern.
See references/broker-filesystem-bridge.md for the full recipe.
- memoria-palace-postmortem: File structured post-mortems in MemPalace with full knowledge graph integration. Link root causes, timeline facts, and remediation triples.
- mempalace-search: Project research via targeted MemPalace search, with multi-step query refinement, wing/room filtering, and cross-wing tunnel following.
- catalog-directory-to-palace: Systematically catalog a directory tree of documentation, code, or notes into the palace with automatic wing/room taxonomy.

MemPalace is MIT licensed. Created by Milla Jovovich, Ben Sigman, Igor Lins e Silva, and contributors.