ghcrawl-cluster-operator
Use when inspecting a ghcrawl SQLite store, pulling GitHub issue/PR data, refreshing summaries, embeddings, and clusters, or extracting one cluster and its evidence through the ghcrawl CLI.
| field | value |
| --- | --- |
| name | ghcrawl-cluster-operator |
| description | Use when inspecting a ghcrawl SQLite store, pulling GitHub issue/PR data, refreshing summaries, embeddings, and clusters, or extracting one cluster and its evidence through the ghcrawl CLI. |
| license | MIT |
| metadata | {"source":"https://github.com/vincentkoc/dotskills"} |
Operate ghcrawl as a local-first GitHub issue and pull request crawler: inspect the SQLite store, pull GitHub data, refresh summaries and embeddings, build clusters, and extract cluster evidence through deterministic CLI commands.
The default stance is conservative and cost-aware. Inspect first, then run mutating or API-spend commands only when the operator asked for fresh data or enrichment.
Workflow:
- Start with inspection commands: `doctor`, `configure`, `runs`, `clusters`, `cluster-explain`, and `threads`.
- Pass the repository as `owner/repo` and prefer `--json` for agent-readable output.
- Run `sync` or `refresh` only when fresh GitHub data is needed.
- Use `--include-code` only when file overlap matters; it hydrates PR file metadata and can increase DB size.
- Run `embed`, then `cluster`, after summary or configuration changes.
- Run `cluster-explain` before making durable maintainer edits.
- After an edit, re-run `cluster` and explain the affected cluster to verify the decision stuck.

Inputs:
- `repo` (required): GitHub repository in `owner/repo` format.
- `db_path` (optional): explicit SQLite database path when not using the configured default.
- `cluster_id` (optional): durable or run cluster identifier to explain or edit.
- `thread_numbers` (optional): comma-separated GitHub issue/PR numbers to inspect.
- `include_code` (optional): whether PR file metadata should be hydrated and used as clustering evidence.
- `summary_model` (optional): LLM model for structured summaries, usually `gpt-5.4`.
- `embedding_basis` (optional): vector source such as `title_original` or `llm_key_summary`.
- `limit` (optional): item cap for sync, summaries, or listing commands.

Command safety:
- Read-only commands: `doctor`, `runs`, `clusters`, `cluster-explain`, `threads`.
- Treat `refresh`, `sync`, `summarize`, `key-summaries`, and `embed` as remote/API-spend commands.
- `cluster` is local-only but can be CPU-heavy on huge repos.
- Prefer `--json` for agent-readable output unless opening the TUI.
- Use `--include-code` only when file overlap matters.

Inspect local state first:
ghcrawl doctor --json
ghcrawl configure --json
ghcrawl runs owner/repo --limit 10 --json
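The command classes listed earlier (read-only, remote/API-spend, local-only, durable edits) can be enforced with a small wrapper so that spend or edits require an explicit opt-in. This is only a sketch: the function names, the class labels, and the `GHCRAWL_ALLOW_SPEND` / `GHCRAWL_ALLOW_EDITS` variables are assumptions made for this example and are not part of ghcrawl. Subcommands that the safety list does not name (for example `configure` or `tui`) fall through to `unlisted` and are refused.

```shell
# Hypothetical helper: map a ghcrawl subcommand to a safety class.
# Class labels are names chosen for this sketch, not ghcrawl terminology.
ghcrawl_class() {
  case "$1" in
    doctor|runs|clusters|cluster-explain|threads)
      echo "read-only" ;;
    refresh|sync|summarize|key-summaries|embed)
      echo "api-spend" ;;
    cluster)
      echo "local-heavy" ;;
    exclude-cluster-member|include-cluster-member|set-cluster-canonical|merge-clusters)
      echo "durable-edit" ;;
    *)
      echo "unlisted" ;;
  esac
}

# Hypothetical wrapper: run ghcrawl only when the subcommand's class is allowed.
# GHCRAWL_ALLOW_SPEND and GHCRAWL_ALLOW_EDITS are assumed opt-in variables.
guarded_ghcrawl() {
  case "$(ghcrawl_class "$1")" in
    read-only|local-heavy)
      ;;
    api-spend)
      [ "${GHCRAWL_ALLOW_SPEND:-0}" = "1" ] || {
        echo "refusing '$1': API-spend command, set GHCRAWL_ALLOW_SPEND=1" >&2
        return 1
      } ;;
    durable-edit)
      [ "${GHCRAWL_ALLOW_EDITS:-0}" = "1" ] || {
        echo "refusing '$1': durable edit, set GHCRAWL_ALLOW_EDITS=1" >&2
        return 1
      } ;;
    *)
      echo "refusing '$1': not in the safety list, review it manually" >&2
      return 1 ;;
  esac
  ghcrawl "$@"
}
```

With this in place, `guarded_ghcrawl runs owner/repo --limit 10 --json` runs immediately, while `guarded_ghcrawl sync owner/repo --json` is refused until the operator sets `GHCRAWL_ALLOW_SPEND=1`.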
If the local store is empty or stale, pull current open GitHub data:
ghcrawl sync owner/repo --limit 200 --json
ghcrawl sync owner/repo --include-code --limit 200 --json
For a normal end-to-end update:
ghcrawl refresh owner/repo --json
Use code hydration when file evidence should affect clustering:
ghcrawl refresh owner/repo --include-code --json
Default clustering can run without LLM summaries. LLM summaries and embeddings enrich the cluster graph.
Useful configurations:
ghcrawl configure --summary-model gpt-5.4 --embedding-basis title_original --json
ghcrawl configure --summary-model gpt-5.4 --embedding-basis llm_key_summary --json
For structured key summaries:
ghcrawl key-summaries owner/repo --limit 200 --json
ghcrawl key-summaries owner/repo --number 12345 --json
Then refresh vectors and clusters:
ghcrawl embed owner/repo --json
ghcrawl cluster owner/repo --json
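The enrichment order above (key summaries, then embeddings, then clusters) can be chained so that a failed step stops the run before later steps spend more API budget or cluster on stale vectors. A minimal sketch, assuming a hypothetical helper name `enrich_repo` and a default limit of 200 borrowed from the example command; the ghcrawl subcommands and flags are the ones shown above.

```shell
# Hypothetical helper: key summaries -> embeddings -> clusters.
# Each step runs only if the previous one exited successfully.
# Usage: enrich_repo owner/repo [limit]
enrich_repo() {
  repo="$1"
  limit="${2:-200}"   # assumed default, taken from the example command
  ghcrawl key-summaries "$repo" --limit "$limit" --json &&
    ghcrawl embed "$repo" --json &&
    ghcrawl cluster "$repo" --json
}
```

Because the first two steps are API-spend commands, a helper like this belongs behind the same operator approval as the individual commands.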
List clusters:
ghcrawl clusters owner/repo --min-size 2 --limit 20 --sort size --json
ghcrawl clusters owner/repo --search "cron timeout" --limit 10 --json
Explain one durable cluster:
ghcrawl cluster-explain owner/repo --id 123 --member-limit 50 --event-limit 50 --json
Inspect current durable clusters with members:
ghcrawl durable-clusters owner/repo --member-limit 25 --json
ghcrawl durable-clusters owner/repo --include-inactive --member-limit 25 --json
Pull specific issues/PRs from the local store:
ghcrawl threads owner/repo --numbers 123,456,789 --json
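When thread numbers arrive as separate values, the comma-separated `--numbers` argument can be assembled and checked before calling the CLI. A small sketch; `join_numbers` and `threads_for` are hypothetical helper names, and the digits-only check is an assumption about what a valid issue/PR number looks like.

```shell
# Hypothetical helper: join issue/PR numbers with commas, rejecting non-numeric input.
join_numbers() {
  for n in "$@"; do
    case "$n" in
      ''|*[!0-9]*)
        echo "not an issue/PR number: $n" >&2
        return 1 ;;
    esac
  done
  # "$*" joins the arguments with the first character of IFS.
  ( IFS=,; printf '%s\n' "$*" )
}

# Hypothetical helper: look up the given threads in the local store.
# Usage: threads_for owner/repo 123 456 789
threads_for() {
  repo="$1"
  shift
  numbers=$(join_numbers "$@") || return 1
  ghcrawl threads "$repo" --numbers "$numbers" --json
}
```

For example, `threads_for owner/repo 123 456 789` issues the same command as the line above.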
Open the TUI:
ghcrawl tui owner/repo
Use these only when the operator asks for durable cluster edits:
ghcrawl exclude-cluster-member owner/repo --id 123 --number 456 --reason "not same root cause" --json
ghcrawl include-cluster-member owner/repo --id 123 --number 456 --reason "same root cause" --json
ghcrawl set-cluster-canonical owner/repo --id 123 --number 456 --reason "clearest report" --json
ghcrawl merge-clusters owner/repo --source 123 --target 456 --reason "same issue family" --json
After edits, re-run:
ghcrawl cluster owner/repo --json
ghcrawl cluster-explain owner/repo --id 123 --member-limit 50 --event-limit 50 --json
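The edit-then-verify loop can be wrapped so that a durable edit is always followed by a re-cluster and a fresh explanation of the same cluster. This is a sketch under stated assumptions: `edit_and_verify` is a hypothetical helper, it covers only the three edits shown above that take `--id`, and it leaves `merge-clusters` (which uses `--source` and `--target`) to be run by hand.

```shell
# Hypothetical helper: apply one member-level edit, then re-cluster and re-explain.
# Usage: edit_and_verify owner/repo <cluster-id> <edit-subcommand> [edit flags...]
edit_and_verify() {
  repo="$1"
  cluster_id="$2"
  edit_cmd="$3"
  shift 3
  case "$edit_cmd" in
    exclude-cluster-member|include-cluster-member|set-cluster-canonical)
      ;;
    *)
      echo "unsupported edit for this helper: $edit_cmd" >&2
      return 1 ;;
  esac
  # Stop at the first failing step so verification never runs after a failed edit.
  ghcrawl "$edit_cmd" "$repo" --id "$cluster_id" "$@" --json &&
    ghcrawl cluster "$repo" --json &&
    ghcrawl cluster-explain "$repo" --id "$cluster_id" --member-limit 50 --event-limit 50 --json
}
```

For example, `edit_and_verify owner/repo 123 exclude-cluster-member --number 456 --reason "not same root cause"` runs the exclusion and then the two verification commands in order.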