Name: Ghcrawl Cluster Operator
Author: vincentkoc


stars: 54

forks: 5

updated: 19 May 2026 at 09:55


SKILL.md


name: ghcrawl-cluster-operator
description: Use when inspecting a ghcrawl SQLite store, pulling GitHub issue/PR data, refreshing summaries, embeddings, and clusters, or extracting one cluster and its evidence through the ghcrawl CLI.
license: MIT
metadata: {"source":"https://github.com/vincentkoc/dotskills"}

ghcrawl Cluster Operator

Purpose

Operate ghcrawl as a local-first GitHub issue and pull request crawler: inspect the SQLite store, pull GitHub data, refresh summaries and embeddings, build clusters, and extract cluster evidence through deterministic CLI commands.

The default stance is conservative and cost-aware. Inspect first, then run mutating or API-spend commands only when the operator has asked for fresh data or enrichment.

When to use

Inspecting ghcrawl repository state, local runs, thread counts, clusters, or durable cluster decisions.
Pulling GitHub issue/PR data into a local ghcrawl store.
Running OpenAI-backed summaries, structured key summaries, embeddings, and clustering.
Explaining a cluster with members, events, canonical selections, exclusions, and local evidence.
Applying maintainer edits, such as excluding a member from a durable cluster or setting a canonical item.

Workflow

Start with read-only checks: doctor, configure, runs, clusters, cluster-explain, and threads.
Confirm the target repo as owner/repo and prefer --json for agent-readable output.
Use sync or refresh only when fresh GitHub data is needed.
Use --include-code only when file overlap matters; it hydrates PR file metadata and can increase DB size.
Run structured key summaries before embedding when LLM summaries should influence vectors.
Run embed, then cluster, after summary or configuration changes.
Pull one cluster with cluster-explain before making durable maintainer edits.
After durable edits, rerun cluster and explain the affected cluster to verify that the decision persisted.

Inputs

repo (required): GitHub repository in owner/repo format.
db_path (optional): explicit SQLite database path when not using the configured default.
cluster_id (optional): durable or run cluster identifier to explain or edit.
thread_numbers (optional): comma-separated GitHub issue/PR numbers to inspect.
include_code (optional): whether PR file metadata should be hydrated and used as clustering evidence.
summary_model (optional): LLM used for structured summaries, usually gpt-5.4.
embedding_basis (optional): vector source such as title_original or llm_key_summary.
limit (optional): item cap for sync, summaries, or listing commands.
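
The repo and thread_numbers inputs can be checked before any command runs. The following is a minimal sketch and not part of ghcrawl: the function names is_owner_repo and join_numbers are hypothetical, and the owner/repo check is deliberately loose (exactly one slash, non-empty on both sides) rather than GitHub's full naming rules.

```shell
# Hypothetical validator: succeeds only for a value shaped like owner/repo.
is_owner_repo() {
  case "$1" in
    */*/*) return 1 ;;  # more than one slash
    /*|*/) return 1 ;;  # empty owner or empty repo
    */*)   return 0 ;;  # exactly one slash with text on both sides
    *)     return 1 ;;  # no slash at all
  esac
}

# Hypothetical helper: join issue/PR numbers into the comma-separated form
# used for thread_numbers, rejecting anything that is not all digits.
join_numbers() {
  joined=""
  for n in "$@"; do
    case "$n" in
      ''|*[!0-9]*)
        echo "not an issue/PR number: $n" >&2
        return 1
        ;;
    esac
    joined="${joined:+$joined,}$n"
  done
  printf '%s\n' "$joined"
}
```

For example, ghcrawl threads owner/repo --numbers "$(join_numbers 123 456 789)" --json passes the same value as the literal 123,456,789.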

Outputs

Local health and configuration status.
Run history and current cluster counts.
Cluster lists with size, names, titles, states, and member evidence.
Cluster explain output with members, events, exclusions, canonical picks, summaries, and top touched files when available.
Thread snapshots for selected issue/PR numbers.
Verification notes after refresh, embedding, clustering, or durable maintainer actions.

Ground Rules

Prefer read-only inspection commands first: doctor, runs, clusters, cluster-explain, threads.
Treat refresh, sync, summarize, key-summaries, and embed as remote/API-spend commands.
The cluster command is local-only but can be CPU-heavy on very large repositories.
Always pass --json for agent-readable output unless opening the TUI.
Use --include-code only when file overlap matters.
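
One way to enforce these rules in a wrapper script is to classify each subcommand before running it. This is a sketch under stated assumptions: command_cost, guarded, the GHCRAWL variable, and the ALLOW_SPEND switch are all hypothetical names, and the classification simply mirrors the lists above. Subcommands the rules do not mention are reported as unknown and passed through unchanged.

```shell
# Hypothetical override point: point GHCRAWL at another binary, or at `echo` for a dry run.
GHCRAWL="${GHCRAWL:-ghcrawl}"

# Classify a subcommand using the categories from the Ground Rules.
command_cost() {
  case "$1" in
    doctor|runs|clusters|cluster-explain|threads) echo read-only ;;
    refresh|sync|summarize|key-summaries|embed)   echo api-spend ;;
    cluster)                                      echo local-cpu ;;
    *)                                            echo unknown ;;
  esac
}

# Hypothetical guard: refuse API-spend subcommands unless ALLOW_SPEND=1 is set.
guarded() {
  cost="$(command_cost "$1")"
  if [ "$cost" = api-spend ] && [ "${ALLOW_SPEND:-0}" != 1 ]; then
    echo "refusing '$1': set ALLOW_SPEND=1 once the operator has asked for fresh data" >&2
    return 1
  fi
  "$GHCRAWL" "$@"
}
```

With this in place, guarded doctor --json runs immediately, while guarded sync owner/repo --limit 200 --json fails until ALLOW_SPEND=1 is set.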

Setup Check

ghcrawl doctor --json
ghcrawl configure --json
ghcrawl runs owner/repo --limit 10 --json

If the local store is empty or stale, pull current open GitHub data, adding --include-code only when PR file metadata is needed:

ghcrawl sync owner/repo --limit 200 --json
ghcrawl sync owner/repo --include-code --limit 200 --json

For a normal end-to-end update:

ghcrawl refresh owner/repo --json

Use code hydration when file evidence should affect clustering:

ghcrawl refresh owner/repo --include-code --json

LLM And Embedding Pipeline

Default clustering can run without LLM summaries. LLM summaries and embeddings enrich the cluster graph.

Useful configurations, with embeddings based on original titles or on LLM key summaries:

ghcrawl configure --summary-model gpt-5.4 --embedding-basis title_original --json
ghcrawl configure --summary-model gpt-5.4 --embedding-basis llm_key_summary --json

For structured key summaries:

ghcrawl key-summaries owner/repo --limit 200 --json
ghcrawl key-summaries owner/repo --number 12345 --json

Then refresh vectors and clusters:

ghcrawl embed owner/repo --json
ghcrawl cluster owner/repo --json
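
The ordering above (key summaries, then embed, then cluster) can be chained so that a later step does not run after an earlier one fails. This is only a sketch: enrich_and_cluster and the GHCRAWL variable are hypothetical, the default limit of 200 is taken from the example above, and it assumes the embedding basis has already been configured as llm_key_summary if summaries should influence vectors.

```shell
# Hypothetical override point: point GHCRAWL at another binary, or at `echo` for a dry run.
GHCRAWL="${GHCRAWL:-ghcrawl}"

# Hypothetical wrapper: summarize, embed, then cluster, stopping at the first failure.
# key-summaries and embed are API-spend steps; cluster is local.
enrich_and_cluster() {
  repo="$1"
  limit="${2:-200}"
  "$GHCRAWL" key-summaries "$repo" --limit "$limit" --json &&
  "$GHCRAWL" embed "$repo" --json &&
  "$GHCRAWL" cluster "$repo" --json
}
```

Setting GHCRAWL=echo before calling enrich_and_cluster owner/repo prints the three commands instead of running them, which is a cheap way to review the plan before spending API calls.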

Pull A Cluster And Its Info

List clusters:

ghcrawl clusters owner/repo --min-size 2 --limit 20 --sort size --json
ghcrawl clusters owner/repo --search "cron timeout" --limit 10 --json

Explain one durable cluster:

ghcrawl cluster-explain owner/repo --id 123 --member-limit 50 --event-limit 50 --json

Inspect current durable clusters with members:

ghcrawl durable-clusters owner/repo --member-limit 25 --json
ghcrawl durable-clusters owner/repo --include-inactive --member-limit 25 --json

Pull specific issues/PRs from the local store:

ghcrawl threads owner/repo --numbers 123,456,789 --json

Open the TUI:

ghcrawl tui owner/repo

Local Maintainer Actions

Use these only when the operator asks for durable cluster edits:

ghcrawl exclude-cluster-member owner/repo --id 123 --number 456 --reason "not same root cause" --json
ghcrawl include-cluster-member owner/repo --id 123 --number 456 --reason "same root cause" --json
ghcrawl set-cluster-canonical owner/repo --id 123 --number 456 --reason "clearest report" --json
ghcrawl merge-clusters owner/repo --source 123 --target 456 --reason "same issue family" --json

After edits, re-run:

ghcrawl cluster owner/repo --json
ghcrawl cluster-explain owner/repo --id 123 --member-limit 50 --event-limit 50 --json
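
The explain, edit, rerun, explain sequence can be wrapped so that the verification step is never skipped. A minimal sketch, assuming only the commands shown above: exclude_and_verify and the GHCRAWL variable are hypothetical names, and the wrapper does not interpret the JSON output, so the operator still compares the before and after explanations.

```shell
# Hypothetical override point: point GHCRAWL at another binary, or at `echo` for a dry run.
GHCRAWL="${GHCRAWL:-ghcrawl}"

# Hypothetical wrapper: explain the cluster, exclude one member, rerun clustering,
# then explain the same cluster again. Stops at the first failing step.
exclude_and_verify() {
  repo="$1"
  cluster_id="$2"
  number="$3"
  reason="$4"
  "$GHCRAWL" cluster-explain "$repo" --id "$cluster_id" --member-limit 50 --event-limit 50 --json &&
  "$GHCRAWL" exclude-cluster-member "$repo" --id "$cluster_id" --number "$number" --reason "$reason" --json &&
  "$GHCRAWL" cluster "$repo" --json &&
  "$GHCRAWL" cluster-explain "$repo" --id "$cluster_id" --member-limit 50 --event-limit 50 --json
}
```

A call such as exclude_and_verify owner/repo 123 456 "not same root cause" corresponds to the first maintainer command above followed by the two rerun commands.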

related-skills.json

Same repository

tmux-lane-orchestrator.md

from "vincentkoc/dotskills"

Manage one tmux agent lane from its matching ops pane, inspect live pane state and Codex logs on cold start, and produce concise manager summaries for OpenClaw and adjacent project work.

2026-05-27, 54 stars

codex-session-recovery.md

from "vincentkoc/dotskills"

Recover Codex and Claude sessions from tmux cockpit panes, CX SAVE snapshots, codex-cockpit history, and recent Codex logs without overwriting the useful restore source first.

2026-05-21, 54 stars

crawlkit.md

from "vincentkoc/dotskills"

Maintain and release the crawlkit Go library, preserving downstream compatibility for gitcrawl, slacrawl, discrawl, and notcrawl.

2026-05-19, 54 stars

example-skill.md

from "vincentkoc/dotskills"

Template skill for repository authors; excluded from public publishing.

2026-05-19, 54 stars

graincrawl.md

from "vincentkoc/dotskills"

Maintain, verify, and release graincrawl, the local-first Granola archive CLI, including SQLite archive behavior, read-only Granola source boundaries, Homebrew tap packaging, and crawlkit-powered TUI/snapshot surfaces.

2026-05-19, 54 stars

openclaw-github-dedupe.md

from "vincentkoc/dotskills"

Investigate a cluster of GitHub issues and PRs, determine canonical candidates, post duplicate/related status, preserve contributor credit, and execute cleanup actions. Supports autonomous mode for provided-link-only closeout, merge/fix follow-through, changelog, and post-merge issue/PR cleanup.

2026-05-19, 54 stars

package.json

"author": "vincentkoc"

"repository": "vincentkoc/dotskills"

