en un clic
before-and-after
A/B test multi-agent consensus quality with and without Mycelium's structured negotiation. Uses real OpenClaw agents talking through a Mycelium room channel. Handles full setup if needed.
Menu
A/B test multi-agent consensus quality with and without Mycelium's structured negotiation. Uses real OpenClaw agents talking through a Mycelium room channel. Handles full setup if needed.
Run end-to-end smoke tests for the Mycelium cursor adapter. Verifies cursor-agent prereqs, single-host dispatch, multi-host dispatch through the hub, cross-family negotiation with claude_code/openclaw, workspace asset drift, and auth failures. Use when validating the cursor integration on a fresh install, after touching cursor-family code (`integrations/cursor/**`, `daemon/dispatch.py`, `daemon/runner.py`), or after upgrading `cursor-agent` itself.
Run end-to-end smoke tests for the Mycelium claude_code adapter. Verifies claude CLI prereqs, daemon installation, single-host cold-spawn dispatch, multi-room ownership semantics, notes persistence across spawns, control verbs (status/abort), budget gating, concurrent dispatch serialization, and (with funded API credits) autonomous coordination via mycelium-daemon coordination_tick handling. Use when validating the claude_code adapter after changes to `integrations/claude_code/**`, the daemon, or after upgrading the `claude` CLI itself.
Run end-to-end smoke tests for the Mycelium stack. Verifies install, memory, search, coordination, and OpenClaw integration. Use when validating a release, after a deploy, or when something feels broken.
Multi-agent coordination layer with persistent memory. Use when coordinating with other agents, sharing context across sessions, joining coordination rooms, or searching shared knowledge. Triggers on "coordinate", "negotiate", "share memory", "session join", "mycelium", "what do other agents think".
Cut a release — commit any staged changes, tag, create GitHub release, and optionally notify Webex. Use when the user says /release, /release --with-webex, or /release --preview.
End-to-end smoke test for cross-channel coordination. Spins up two ephemeral OpenClaw agents wired to local Matrix Synapse + mycelium-room, kicks off a structured negotiation from a Matrix DM, and verifies the consensus summary auto-delivers back to the user's Matrix DM. Use when validating PR
| name | before-and-after |
| description | A/B test multi-agent consensus quality with and without Mycelium's structured negotiation. Uses real OpenClaw agents talking through a Mycelium room channel. Handles full setup if needed. |
| argument-hint | <scenario description or experiment file> |
Compare how real OpenClaw agents reach consensus with and without Mycelium's CognitiveEngine. Same agents, same personas, same scenario — two coordination approaches.
You are the test harness. You set up the infrastructure, seed scenarios, observe transcripts, and evaluate. The agents do the actual negotiating.
Before anything else, verify the stack. Run each check and stop if any fail.
# 1. Mycelium CLI
mycelium --version
# If missing: pip install mycelium (or pipx install mycelium)
# 2. Resolve the backend URL — NEVER hardcode a port
MYCELIUM_API_URL=$(python3 -c "
import toml, os
cfg_path = os.path.expanduser('~/.mycelium/config.toml')
cfg = toml.load(cfg_path)
print(cfg.get('server', {}).get('api_url', 'http://localhost:8000'))
")
echo "Backend URL: $MYCELIUM_API_URL"
# 3. Mycelium stack running
curl -sf "$MYCELIUM_API_URL/health" | python3 -m json.tool
# Should show status=ok. If not: mycelium install && mycelium up
# 4. OpenClaw installed
openclaw --version
# If missing: npm install -g openclaw (or see https://docs.openclaw.ai)
# 4b. Mycelium health check
mycelium doctor
# All checks should be green. Fix any errors before proceeding.
# 5. OpenClaw gateway running
openclaw channels status
# Should show "Gateway reachable" in the output. If not: openclaw gateway start
# 6. Mycelium repo path (for the bundled plugin source)
MYCELIUM_REPO=$(pwd) # assumes running from the mycelium repo
ls "$MYCELIUM_REPO/mycelium-cli/src/mycelium/integrations/openclaw/assets/mycelium/plugin/index.ts" 2>/dev/null \
&& echo "Repo found: $MYCELIUM_REPO" \
|| echo "ERROR: not in the mycelium repo — cd to it first"
If any prerequisite fails, fix it before proceeding.
Throughout this skill, use $MYCELIUM_API_URL for all backend requests. Never hardcode a port.
mas_id is required for CFN knowledge ingest. KXP fires on deliberate room
writes — channel messages and mycelium memory set. Without mas_id those
writes reach the backend but never forward to CFN's knowledge graph.
python3 -c "
import toml, os
cfg = toml.load(os.path.expanduser('~/.mycelium/config.toml'))
mas_id = cfg.get('server', {}).get('mas_id', '')
print(f'mas_id = \"{mas_id}\"')
if not mas_id:
print('WARNING: mas_id is empty — knowledge ingest will silently fail')
"
If mas_id is empty, set it from the active room:
ACTIVE_ROOM=$(python3 -c "
import toml, os
cfg = toml.load(os.path.expanduser('~/.mycelium/config.toml'))
print(cfg.get('rooms', {}).get('active', ''))
")
if [ -n "$ACTIVE_ROOM" ]; then
MAS_ID=$(curl -sf "$MYCELIUM_API_URL/api/rooms/$ACTIVE_ROOM" | python3 -c "
import sys, json
r = json.load(sys.stdin)
print(r.get('mas_id') or '')
")
echo "Room mas_id: $MAS_ID"
mycelium config set server.mas_id "$MAS_ID"
fi
Experiment runs get expensive fast. Each scenario fires 10–40+ LLM calls across multi-turn chatter (before case) and negotiation rounds (after case). Default to haiku unless the user explicitly wants sonnet — the quality difference for "pick REST vs GraphQL" is indistinguishable, but the cost difference is ~12×.
First, check what model is already configured in openclaw:
CONFIGURED_MODEL=$(python3 -c "
import json, os
p = os.path.expanduser('~/.openclaw/openclaw.json')
try:
cfg = json.load(open(p))
print(cfg.get('agents', {}).get('defaults', {}).get('model', ''))
except Exception:
print('')
")
echo "Currently configured model: ${CONFIGURED_MODEL:-'(none)'}"
Before creating agents, use AskUserQuestion to ask — include the currently configured model as an option if one is set:
Which LLM and API key should the experiment agents use?
- Haiku (recommended — ~$0.10–0.30 per full experiment)
- Sonnet (flagship model — ~$1.50–4.00 per full experiment)
- Currently configured model (
$CONFIGURED_MODEL) (reuse existing openclaw default — no setup needed) ← only show this if$CONFIGURED_MODELis non-empty- Different API key or provider (isolate experiment cost to a separate key)
If the configured model is already haiku or sonnet, collapse options 1/2 and 3 into one — no need to ask twice for the same thing.
Then set EXP_MODEL and, if option 4, EXP_ANTHROPIC_KEY (or equivalent) based on the answer:
# Option 1 (default)
EXP_MODEL="anthropic/claude-haiku-4-5-20251001"
# Option 2
EXP_MODEL="anthropic/claude-sonnet-4-6"
# Option 3 — reuse the already-configured openclaw model
EXP_MODEL="$CONFIGURED_MODEL"
# Option 4 — ask the user for: (a) provider+model string, (b) API key or env var name
EXP_MODEL="anthropic/claude-haiku-4-5-20251001" # or whatever they specify
export ANTHROPIC_API_KEY="sk-ant-..." # their separate key; scoped to this shell
Reuse path (default): If the user picks option 1, 2, or 3, do nothing to the openclaw config — newly-created agents inherit auth.profiles from ~/.openclaw/openclaw.json. Zero setup.
Override path (option 4): Export the key in the shell before running openclaw agents add. Openclaw agents call LLMs via litellm, which picks up provider env vars (ANTHROPIC_API_KEY, OPENAI_API_KEY, etc.) automatically. Do NOT write the key into openclaw.json — keep it in the shell so it's ephemeral and doesn't touch the user's real config.
Print a one-line summary of the choice so it lands in the transcript:
echo "Using model: $EXP_MODEL (key: $([ -n "${EXP_ANTHROPIC_KEY:-}" ] && echo 'experiment-scoped' || echo 'openclaw default'))"
Do not reuse existing agents. Create dedicated temporary agents for the experiment. This avoids clobbering real agent personas and keeps experiments idempotent.
Generate a unique experiment prefix to avoid room/agent name collisions:
EXP_ID="exp-$(date +%s | tail -c 5)" # e.g. exp-4821
Create agents with --non-interactive, using $EXP_MODEL from Phase 0.5:
openclaw agents add "${EXP_ID}-agent-a" \
--non-interactive \
--workspace ~/.openclaw/workspaces/${EXP_ID}-agent-a \
--model "$EXP_MODEL"
openclaw agents add "${EXP_ID}-agent-b" \
--non-interactive \
--workspace ~/.openclaw/workspaces/${EXP_ID}-agent-b \
--model "$EXP_MODEL"
For 3+ agent scenarios, create additional agents the same way.
Never hardcode a model in this skill. Always use $EXP_MODEL. The user picked it in Phase 0.5 for cost reasons — overriding it silently defeats the whole point of asking.
Disable sandbox mode for experiment agents. Newly-created agents inherit agents.defaults.sandbox.mode from openclaw.json, which is typically "all" (full sandboxing). Sandboxed agents cannot execute mycelium session join, mycelium negotiate propose, or mycelium negotiate respond because the mycelium CLI binary isn't visible inside the sandbox. The after case will fail silently: agents will dutifully reason about the negotiation in chat, then report "the mycelium CLI isn't available in this environment" — which makes the after case look worse than the before case purely because of a config gotcha.
Patch the sandbox setting for each experiment agent after creation:
python3 -c "
import json, os
p = os.path.expanduser('~/.openclaw/openclaw.json')
cfg = json.load(open(p))
for a in cfg['agents']['list']:
aid = a.get('id', '')
if aid.startswith('${EXP_ID}-'):
a['sandbox'] = {'mode': 'off'}
print(f'{aid} → sandbox: off')
json.dump(cfg, open(p, 'w'), indent=2)
"
This is critical. Without it, the after case is a test of "what happens when agents can't run the CLI they're supposed to run", not a test of the CLI protocol itself.
Derive personas from the user's scenario. Write a SOUL.md for each agent:
cat > ~/.openclaw/workspaces/${EXP_ID}-agent-a/SOUL.md << 'EOF'
{persona text — who they are, what they value, specific experience/data}
EOF
cat > ~/.openclaw/workspaces/${EXP_ID}-agent-b/SOUL.md << 'EOF'
{persona text — who they are, what they value, specific experience/data}
EOF
Good personas include:
The mycelium-bootstrap hook is the sole contamination vector. It injects
MYCELIUM_ROOM_ID and MYCELIUM_API_URL into every agent's environment at
bootstrap — which causes agents in the before case to spontaneously use
mycelium session join and mycelium negotiate even when the seed says to
just chat.
The mycelium-room channel plugin must stay active — agents need it to
route messages to each other. Only the bootstrap hook needs to be off.
openclaw hooks disable mycelium-bootstrap
Verify:
openclaw hooks list 2>&1 | grep -E "mycelium-bootstrap|mycelium-room"
# mycelium-bootstrap should show ⏸ (disabled)
# mycelium-room channel should still be listed as configured
python3 << 'PYEOF'
import json, os, toml
mycelium_cfg = toml.load(os.path.expanduser("~/.mycelium/config.toml"))
backend_url = mycelium_cfg.get("server", {}).get("api_url", "http://localhost:8000")
oc_path = os.path.expanduser("~/.openclaw/openclaw.json")
with open(oc_path) as f:
oc = json.load(f)
exp_agents = [os.environ["EXP_AGENT_A"], os.environ["EXP_AGENT_B"]]
room_name = os.environ["EXP_ROOM"]
oc.setdefault("channels", {})["mycelium-room"] = {
"enabled": True,
"backendUrl": backend_url,
"room": room_name,
"agents": exp_agents,
"requireMention": True,
}
with open(oc_path, "w") as f:
json.dump(oc, f, indent=2)
print(f"Channel configured: room={room_name}, agents={exp_agents}, backend={backend_url}")
PYEOF
EXP_AGENT_A="${EXP_ID}-agent-a" \
EXP_AGENT_B="${EXP_ID}-agent-b" \
EXP_ROOM="${EXP_ID}-before" \
python3 << 'PYEOF'
# ... (script above)
PYEOF
mycelium room create "${EXP_ID}-before"
mycelium room create "${EXP_ID}-after"
openclaw gateway restart
Verify the channel is connected and bootstrap hook is off:
sleep 4
grep "SSE connected.*${EXP_ID}-before" /tmp/openclaw/openclaw-$(date +%Y-%m-%d).log | tail -1
# Should show: [mycelium-room] SSE connected: {EXP_ID}-before (agents: ...)
openclaw hooks list 2>&1 | grep mycelium-bootstrap
# Should show ⏸ (disabled)
The channel plugin is active — agents can see each other's messages. The
bootstrap hook is disabled — agents do NOT get MYCELIUM_ROOM_ID or
MYCELIUM_API_URL injected, so they have no automatic signal to reach for
the Mycelium CLI. This is the true baseline.
The seed must explicitly @-mention both agents (the channel is
requireMention: true) AND tell them to reply by @-mentioning each other:
MENTIONS=$(printf '@%s ' "${EXP_ID}-agent-a" "${EXP_ID}-agent-b")
SEED_BODY="${MENTIONS}
${SCENARIO_PROMPT}
How to work together in this room:
- Reply by @-mentioning the other agent(s) whenever you want them to read and respond. Messages without an @mention are ignored by the channel.
- Keep each message to 2–3 paragraphs.
- Aim for consensus. When you think you've agreed, @-mention the other agent and explicitly say 'I agree' with the final decision.
- Do NOT use any mycelium CLI commands — coordinate only by talking."
curl -sf "$MYCELIUM_API_URL/api/rooms/${EXP_ID}-before/messages" \
-H "Content-Type: application/json" \
-d "$(python3 -c "import json,sys; print(json.dumps({'sender_handle':'facilitator','message_type':'broadcast','content':sys.argv[1]}))" "$SEED_BODY")"
The $SCENARIO_PROMPT is the actual question/decision the agents need to resolve. Keep it focused.
grep "mycelium-room.*←\|mycelium-room.*→" /tmp/openclaw/openclaw-$(date +%Y-%m-%d).log | tail -20
Poll room messages:
curl -sf "$MYCELIUM_API_URL/api/rooms/${EXP_ID}-before/messages?limit=20" | python3 -c "
import sys, json
data = json.load(sys.stdin)
msgs = data.get('messages', data) if isinstance(data, dict) else data
if isinstance(msgs, list):
msgs.reverse()
for m in msgs:
print(f'[{m[\"sender_handle\"]}] {m[\"content\"][:120]}')
print()
"
The conversation runs organically — agents see each other's replies and respond. Typical: 3–5 rounds of back-and-forth. Watch for agents settling on a position or explicitly agreeing. Each round takes ~15–20s (agent processing + LLM call + outbound POST).
Cost guard — critical. Unstructured agent chat can spiral. Three agents × 10 rounds × haiku context growth adds up fast. Watch for these failure modes and cut the before case off early when you see them:
To kill the before case cleanly and stop agents from continuing to burn tokens, switch the channel away from the before room and restart the gateway. The in-process dispatcher is tied to the SSE subscription — pulling the subscription aborts in-flight dispatches and prevents new ones:
# Point the channel at the after room (or any other room) to stop the before cascade
python3 -c "
import json, os
p = os.path.expanduser('~/.openclaw/openclaw.json')
cfg = json.load(open(p))
cfg['channels']['mycelium-room']['room'] = '${EXP_ID}-after'
json.dump(cfg, open(p, 'w'), indent=2)
"
openclaw gateway restart
Alternatively, to stop dispatch without moving to Phase 3 yet, point the channel at a throwaway room name that doesn't exist — the SSE will fail to connect and no messages will be dispatched.
Capture the transcript from the before room before you kill the channel, otherwise you'll be writing against a static snapshot while the agents are still generating messages behind you.
curl -sf "$MYCELIUM_API_URL/api/rooms/${EXP_ID}-before/messages?limit=50" | python3 -c "
import sys, json
data = json.load(sys.stdin)
msgs = data.get('messages', data) if isinstance(data, dict) else data
if isinstance(msgs, list):
msgs.reverse()
lines = []
for m in msgs:
lines.append(f'**{m[\"sender_handle\"]}:**')
lines.append(m['content'])
lines.append('')
print('\n'.join(lines))
" > ~/.mycelium/rooms/${EXP_ID}-before/transcript.md
Channel messages are deliberate room writes — KXP indexes them into CFN. Capture the ingest buffer to disk for the experiment artifact.
mycelium cfn log --limit 500 --json > ~/.mycelium/rooms/${EXP_ID}-before/ingest-log.json
mycelium cfn stats --json > ~/.mycelium/rooms/${EXP_ID}-before/ingest-stats.json
For human review during the run, mycelium cfn log --state=refused,error is
the fast signal on "did anything blow up here."
Re-enable the bootstrap hook (injects MYCELIUM_ROOM_ID + MYCELIUM_API_URL
into agent bootstrap) and switch the channel to the after room.
# Re-enable bootstrap hook
openclaw hooks enable mycelium-bootstrap
# Switch channel to after room
python3 -c "
import json, os
oc_path = os.path.expanduser('~/.openclaw/openclaw.json')
with open(oc_path) as f:
oc = json.load(f)
oc['channels']['mycelium-room']['room'] = '${EXP_ID}-after'
with open(oc_path, 'w') as f:
json.dump(oc, f, indent=2)
print('Switched to ${EXP_ID}-after')
"
openclaw gateway restart
sleep 4
grep "SSE connected.*${EXP_ID}-after" /tmp/openclaw/openclaw-$(date +%Y-%m-%d).log | tail -1
# Should show: [mycelium-room] SSE connected: {EXP_ID}-after (agents: ...)
Same scenario, same @-mention pattern — but the instructions tell agents to coordinate via the mycelium CLI instead of chatting in the channel:
MENTIONS=$(printf '@%s ' "${EXP_ID}-agent-a" "${EXP_ID}-agent-b")
SEED_BODY="${MENTIONS}
${SCENARIO_PROMPT}
Use Mycelium structured negotiation to reach consensus. Do NOT discuss this in chat — run these commands instead:
1. Join the coordination session (each of you runs this once with your own handle):
mycelium session join --handle <your-handle> --room ${EXP_ID}-after -m \"<your position in one sentence>\"
2. Wait for CognitiveEngine to address you. It will send a tick message telling you your turn, the current offer, and whether to 'propose' or 'respond'.
3. Respond via the CLI:
mycelium negotiate propose ISSUE=VALUE ISSUE=VALUE ... --room ${EXP_ID}-after --handle <your-handle>
mycelium negotiate respond accept --room ${EXP_ID}-after --handle <your-handle>
mycelium negotiate respond reject --room ${EXP_ID}-after --handle <your-handle>
Explain your reasoning briefly in chat before each CLI command so the human can follow along. Repeat until you receive a consensus message."
curl -sf "$MYCELIUM_API_URL/api/rooms/${EXP_ID}-after/messages" \
-H "Content-Type: application/json" \
-d "$(python3 -c "import json,sys; print(json.dumps({'sender_handle':'facilitator','message_type':'broadcast','content':sys.argv[1]}))" "$SEED_BODY")"
The automated sequence:
mycelium session join (~15s)mycelium negotiate propose/respond/decide → next roundMonitor:
# Tick delivery (🎯) and consensus (🤝)
grep "mycelium-room.*🎯\|mycelium-room.*🤝" /tmp/openclaw/openclaw-$(date +%Y-%m-%d).log | tail -20
# Session state
curl -sf "$MYCELIUM_API_URL/api/rooms" | python3 -c "
import sys, json
for r in json.load(sys.stdin):
if '${EXP_ID}-after' in r['name']:
print(f'{r[\"name\"]}: {r.get(\"coordination_state\", \"none\")}')
"
When consensus is reached or the session completes:
# Room messages
curl -sf "$MYCELIUM_API_URL/api/rooms/${EXP_ID}-after/messages?limit=50" | python3 -c "
import sys, json
data = json.load(sys.stdin)
msgs = data.get('messages', data) if isinstance(data, dict) else data
if isinstance(msgs, list):
msgs.reverse()
for m in msgs:
print(f'**{m[\"sender_handle\"]}:**')
print(m['content'])
print()
" > ~/.mycelium/rooms/${EXP_ID}-after/transcript.md
# Session room messages (ticks, proposals, consensus)
SESSION_ROOM=$(curl -sf "$MYCELIUM_API_URL/api/rooms" | python3 -c "
import sys, json
for r in json.load(sys.stdin):
if '${EXP_ID}-after:session:' in r['name']:
print(r['name']); break
")
echo "Session room: $SESSION_ROOM"
curl -sf "$MYCELIUM_API_URL/api/rooms/$SESSION_ROOM/messages?limit=100" | python3 -c "
import sys, json
data = json.load(sys.stdin)
msgs = data.get('messages', data) if isinstance(data, dict) else data
if isinstance(msgs, list):
msgs.reverse()
for m in msgs:
mt = m['message_type']
sh = m.get('sender_handle', '')
c = json.dumps(m['content']) if isinstance(m['content'], (dict, list)) else str(m['content'])
print(f'[{mt}] {sh}: {c}')
print()
" > ~/.mycelium/rooms/${EXP_ID}-after/session-transcript.md
Mirrors Phase 2e. Snapshot the full buffer + stats for the after run so the gist carries both-case cost evidence.
mycelium cfn log --limit 500 --json > ~/.mycelium/rooms/${EXP_ID}-after/ingest-log.json
mycelium cfn stats --json > ~/.mycelium/rooms/${EXP_ID}-after/ingest-stats.json
Important: the in-memory buffer is shared across both runs and resets only on backend restart. If you don't snapshot after Phase 2 (before) and again after Phase 3 (after), you'll lose the separation. Phase 2e + 3e ordering matters — each snapshot captures everything seen so far, and you diff them in Phase 4 to get the per-run cost.
Compare both transcripts against the success criteria. Score each criterion 1-5:
| Score | Meaning |
|---|---|
| 1 | Not addressed at all |
| 2 | Mentioned but unresolved |
| 3 | Partially addressed |
| 4 | Substantially addressed |
| 5 | Fully resolved |
Write the comparison report to ~/.mycelium/rooms/${EXP_ID}/evaluation.md:
## Before-and-After: {scenario name}
### Summary
| Metric | Before (Channel) | After (Mycelium) |
|--------|-------------------|-------------------|
| Consensus reached? | ... | ... |
| Rounds to resolution | ... | ... |
| Issues explicitly identified | ... | ... |
| Issues resolved | ... | ... |
| Specific assignments made | ... | ... |
| Overall score | X/5 | X/5 |
### CFN ingest activity (per-run cost delta)
Diff `ingest-stats.json` between the two runs and render the delta. Use
this block to catch cost regressions: if the `after` case ingests 10x the
tokens of `before` for the same scenario, something's wrong upstream of
the dedupe + circuit breaker.
```bash
python3 - <<'PY'
import json, pathlib
exp = "${EXP_ID}"
for label in ("before", "after"):
p = pathlib.Path(f"~/.mycelium/rooms/{exp}-{label}/ingest-stats.json").expanduser()
if not p.exists():
print(f"{label}: (missing)"); continue
d = json.loads(p.read_text())
t = d.get("total", {})
print(f"{label}: events={t.get('events',0)} "
f"tokens≈{t.get('estimated_cfn_knowledge_input_tokens',0):,} "
f"bytes={t.get('payload_bytes',0):,}")
PY
| Metric | Before | After | Delta |
|---|---|---|---|
| Ingest events | ... | ... | +/-N |
| Est. input tokens | ~... | ~... | +/-N% |
| Refused (circuit breaker) | ... | ... | ... |
| Deduped (hash hit) | ... | ... | ... |
| Errors | ... | ... | ... |
| Criterion | Before | After | Delta |
|---|---|---|---|
| ... | X/5 | X/5 | +/-N |
Before (unstructured channel):
After (Mycelium-mediated):
{honest assessment — if before beat after, say so}
## Phase 4b: (optional) Share results as a gist
Experiment artifacts (the evaluation report, both transcripts, and the session sub-room transcript) are often worth sharing with teammates — for PR discussions, roadmap meetings, or just "hey look at this failure mode." Inline posting hits GitHub's 65,536-char comment limit fast; gists are the cleanest path.
**Before creating a gist, scan for secrets.** Agent narration can leak absolute home paths, API keys from shell env, or session tokens if anything weird happened. Grep for the obvious patterns:
```bash
for f in ~/.mycelium/rooms/${EXP_ID}/evaluation.md \
~/.mycelium/rooms/${EXP_ID}-before/transcript.md \
~/.mycelium/rooms/${EXP_ID}-after/transcript.md \
~/.mycelium/rooms/${EXP_ID}-after/session-transcript.md \
~/.mycelium/rooms/${EXP_ID}-before/ingest-log.json \
~/.mycelium/rooms/${EXP_ID}-after/ingest-log.json \
~/.mycelium/rooms/${EXP_ID}-before/ingest-stats.json \
~/.mycelium/rooms/${EXP_ID}-after/ingest-stats.json; do
[ -f "$f" ] || continue
echo "=== $f ==="
grep -inE 'sk-[a-z0-9]|ghp_|gho_|bearer [a-z0-9]|api[_-]?key.*[=:]|password.*[=:]|/Users/|/home/' "$f" | head -5 || echo " (clean)"
done
The ingest-log JSON files are particularly worth scanning — they include the full payload content that was forwarded to CFN, which for agent turns may include filesystem paths, shell output, or tool results. Err on the side of redacting.
If anything lights up, either redact or skip the gist. Always ask the user before uploading — even a private gist is a URL someone could share, and experiment transcripts may contain persona content or internal scenario details that weren't meant to leave the machine.
Once clean, create a secret gist (URL-only, not listed on your profile).
Watch out for filename collisions. gh gist create uses the file's basename, so ~/.mycelium/rooms/${EXP_ID}-before/transcript.md and ~/.mycelium/rooms/${EXP_ID}-after/transcript.md both become transcript.md and the second one silently overwrites the first. Stage the files under unique names in a temp directory first:
Also fetch the knowledge graph and run a few semantic queries while you're here — this is the most interesting artifact to show teammates:
# Get the MAS ID from config
MAS=$(python3 -c "
import toml, os
cfg = toml.load(os.path.expanduser('~/.mycelium/config.toml'))
print(cfg['server']['mas_id'])
")
# Dump all nodes with a summary grouped by type
mycelium cfn ls --mas "$MAS" --limit 500 --json 2>/dev/null | python3 -c "
import sys, json
nodes = json.load(sys.stdin).get('nodes', [])
print(f'## Knowledge Graph ({len(nodes)} nodes)\n')
# Group by type
by_type = {}
for n in nodes:
t = n.get('node_type', 'unknown')
by_type.setdefault(t, []).append(n)
for t, ns in sorted(by_type.items(), key=lambda x: -len(x[1])):
print(f'### {t} ({len(ns)})')
for n in ns[:20]:
label = n.get('label') or n.get('name') or n.get('id', '')[:12]
desc = (n.get('description') or n.get('summary') or '')[:100]
print(f'- **{label}**: {desc}')
if len(ns) > 20:
print(f' ... and {len(ns)-20} more')
print()
" > /tmp/cfn-knowledge-graph.md
# Run 2-3 semantic queries relevant to the scenario
mycelium cfn query "What were the key decisions and agreements reached?" \
--mas "$MAS" > /tmp/cfn-query-decisions.txt 2>&1
mycelium cfn query "What were the main points of disagreement and how were they resolved?" \
--mas "$MAS" > /tmp/cfn-query-disagreements.txt 2>&1
echo "Knowledge graph: $(wc -l < /tmp/cfn-knowledge-graph.md) lines"
echo "Query results written"
Now stage everything including the knowledge graph artifacts:
STAGE=$(mktemp -d)
cp ~/.mycelium/rooms/${EXP_ID}/evaluation.md "$STAGE/evaluation.md"
cp ~/.mycelium/rooms/${EXP_ID}-before/transcript.md "$STAGE/before-transcript.md"
cp ~/.mycelium/rooms/${EXP_ID}-after/transcript.md "$STAGE/after-transcript.md"
cp ~/.mycelium/rooms/${EXP_ID}-after/session-transcript.md "$STAGE/after-session-transcript.md"
# Ingest logs — skip silently if a snapshot is missing (e.g. ran --before-only)
cp ~/.mycelium/rooms/${EXP_ID}-before/ingest-log.json "$STAGE/before-ingest-log.json" 2>/dev/null || true
cp ~/.mycelium/rooms/${EXP_ID}-before/ingest-stats.json "$STAGE/before-ingest-stats.json" 2>/dev/null || true
cp ~/.mycelium/rooms/${EXP_ID}-after/ingest-log.json "$STAGE/after-ingest-log.json" 2>/dev/null || true
cp ~/.mycelium/rooms/${EXP_ID}-after/ingest-stats.json "$STAGE/after-ingest-stats.json" 2>/dev/null || true
# Knowledge graph
cp /tmp/cfn-knowledge-graph.md "$STAGE/cfn-knowledge-graph.md" 2>/dev/null || true
cp /tmp/cfn-query-decisions.txt "$STAGE/cfn-query-decisions.txt" 2>/dev/null || true
cp /tmp/cfn-query-disagreements.txt "$STAGE/cfn-query-disagreements.txt" 2>/dev/null || true
gh gist create \
-d "${EXP_ID}: ${SCENARIO_NAME} — before-and-after" \
"$STAGE"/*
The command prints the gist URL. Use it in a PR comment (gh pr comment <N> --body "...") or Slack message with a short summary and the link — keep the comment body concise and put the full artifacts behind the gist link. Example PR comment body:
## Experiment: ${EXP_ID} — ${SCENARIO_NAME}
<3-line top-line finding>
| Metric | Before | After |
|---|---|---|
| <key stat 1> | ... | ... |
| <key stat 2> | ... | ... |
Full artifacts (evaluation + both transcripts + session sub-room):
<gist URL>
Notes:
gh gist create defaults to secret. Pass --public only if the user explicitly asks.gh CLI for PR/issue comments. Drag-and-drop upload in the web UI is the only native "attachment" path, and it can't be scripted. Gists are the closest scriptable alternative.Remove temporary agents and experiment rooms, restore hook state:
# Remove temp agents and channel config
python3 -c "
import json, os
oc_path = os.path.expanduser('~/.openclaw/openclaw.json')
with open(oc_path) as f:
oc = json.load(f)
oc['agents']['list'] = [a for a in oc.get('agents', {}).get('list', []) if not a.get('id', '').startswith('${EXP_ID}')]
oc.get('channels', {}).pop('mycelium-room', None)
with open(oc_path, 'w') as f:
json.dump(oc, f, indent=2)
print('Agents and channel config removed')
"
# Delete agent workspaces
rm -rf ~/.openclaw/workspaces/${EXP_ID}-*
rm -rf ~/.openclaw/agents/${EXP_ID}-*
# Ensure hooks are back to their normal state
openclaw hooks enable mycelium-bootstrap
openclaw hooks enable mycelium-knowledge-extract
# Delete experiment rooms
curl -sf -X DELETE "$MYCELIUM_API_URL/api/rooms/${EXP_ID}-before"
curl -sf -X DELETE "$MYCELIUM_API_URL/api/rooms/${EXP_ID}-after"
# Restart gateway to pick up config changes
openclaw gateway restart
Describe the scenario however you want. Extract:
For batch runs, provide a JSON file with an experiments array. Schema is loose — fill gaps with defaults:
{
"experiments": [
{
"name": "api-design",
"description": "REST vs GraphQL for the developer platform",
"agents": [
{ "persona": "Backend architect, REST purist, 10 years experience", "position": "REST with OpenAPI" },
{ "persona": "Frontend lead, tired of over-fetching", "position": "GraphQL for everything" }
],
"success_criteria": ["Clear decision on approach", "Migration path defined", "Both performance and DX addressed"]
}
]
}
--before-only — Run only the unstructured case--after-only — Run only the Mycelium negotiation case--eval-only — Skip both runs, evaluate existing transcripts--setup-only — Just configure agents and channel, don't run experiments| Problem | Likely cause | Fix |
|---|---|---|
openclaw agents add prompts interactively | Missing --non-interactive | Add --non-interactive --workspace <path> |
mycelium room create returns 400 | Room name already exists | Use unique $EXP_ID prefix or delete existing room first |
| No SSE connection in logs | Channel plugin not loaded | Check openclaw.plugin.json exists, plugin in load.paths and allow |
| Agents don't respond to seed | Channel config wrong room/agents | Verify channels.mycelium-room in openclaw.json matches the experiment |
| Ticks never arrive (after case) | Session room SSE not subscribed | Check poll found session room; verify CFN is running (docker logs mycelium-backend) |
| Agent processes hang | --local flag or SSE loop in child | Ensure NOT using --local; check MYCELIUM_CHANNEL_ONESHOT env var |
curl to backend fails | Wrong port | Read port from ~/.mycelium/config.toml, don't hardcode |
| After-case agents say "mycelium CLI isn't available in this environment" | Experiment agents are sandboxed, so they can't see the mycelium binary | Set sandbox: {mode: "off"} on each exp-* agent in openclaw.json and restart the gateway. See Phase 1a. |
mycelium cfn ls snapshots so reviewers can see exactly which concepts
CFN extracted as a result of the experiment