원클릭으로 Manus에서 모든 스킬 실행

kb-vacuum

Scan vault raw/ folder for hyperlinks embedded in markdown notes; POST each unseen URL to LuminaVault server /v1/capture/safari so Hermes ingests + memorizes. Usage: /kb-vacuum [folder=raw/]

Manus에서 실행

스타0

포크0

업데이트2026년 5월 29일 09:55

출처

LuminaVault

LuminaVault/LuminaVaultServer

GitHub 저장소 열기 Creator 저장소 보기

설치 명령

다운로드

Manus에서 실행

유용한 대상SOC

소프트웨어 개발자컴퓨터 및 수학직15-1252L4

SKILL.md

readonly

이 저장소의 다른 Skills

같은 저장소

kb-compile

LuminaVault/LuminaVaultServer

Compile all uncompiled raw/ content into the wiki. Writes source summaries, creates/updates concept articles with Obsidian backlinks, and updates the index. Run after /kb-ingest to process new content.

2026-05-290

ai-cohort-scoreboard-marketing

LuminaVault/LuminaVaultServer

Generate daily Reddit and X/Twitter marketing content from the AI Cohort scoreboard, with automatic fallback to the most recent available data when today's file is missing.

2026-05-240

ai-cohort-scoreboard-maintenance

LuminaVault/LuminaVaultServer

Configuration, maintenance, and troubleshooting of AI Cohort scoreboard scripts including vault path setup and script updates.

2026-05-240

content-digests

LuminaVault/LuminaVaultServer

Build and deliver periodic content digests (news, stock, entertainment) to multiple platforms: save to vault Raw/ and print full markdown to stdout for cron-based delivery.

2026-05-240

external-content-ingestion

LuminaVault/LuminaVaultServer

Capture external content (X/Twitter articles, web posts) and ingest into Obsidian vault Raw/ with automatic theme detection, summarization, and structured frontmatter. Handles X/fixupx links via multi-strategy extraction (direct fetch → r.jina.ai → nitter fallback).

2026-05-240

founder-content-engine

LuminaVault/LuminaVaultServer

Weekly founder content research and production engine that transforms trending news into ready-to-use content assets.

2026-05-240

name	kb-vacuum
description	Scan vault raw/ folder for hyperlinks embedded in markdown notes; POST each unseen URL to LuminaVault server /v1/capture/safari so Hermes ingests + memorizes. Usage: /kb-vacuum [folder=raw/]

KB Vacuum

Walk the local vault's raw/ folder, extract every HTTP(S) hyperlink found in markdown notes, and POST each unseen one to LuminaVault server /v1/capture/safari so the server-side enrichment pipeline (YouTube / X / GenericOG / Jina tier-2) runs and the result lands in vault_files for Hermes memory compilation.

Companion to /kb-ingest: Whereas /kb-ingest takes ONE URL/PDF/text and stages it, /kb-vacuum is the batch tool — vacuum up every URL the user has accumulated in their notes since the last run.

Steps

1. Read config

cat ~/.claude/kb-config.json

Required keys:

vault_path — absolute path to the local vault root (e.g. /opt/data/obsidian-vault/FACorreia)
server_base_url — LuminaVault server origin (e.g. https://your-tenant.luminavault.app)
auth_token — JWT for the authenticated user (Bearer token used against /v1/capture/safari)

If any are missing, stop and tell the user which key needs to be set, then exit.

Set VAULT_PATH, SERVER_BASE_URL, AUTH_TOKEN from the config for subsequent steps.

2. Resolve target folder

The argument after /kb-vacuum is the folder to scan, relative to the vault root.

No argument → default to raw/
Argument like raw/AI → scan only that subdirectory
Argument like notes/ → scan that

Compute SCAN_ROOT="$VAULT_PATH/<arg-or-raw>". Verify it exists:

test -d "$SCAN_ROOT" || { echo "scan root not found: $SCAN_ROOT"; exit 1; }

3. Load the seen manifest

The manifest lives at $VAULT_PATH/.kb/vacuum_seen.json and stores the set of URLs already POSTed in prior runs. Format:

{
  "seen": [
    "https://example.com/article-a",
    "https://www.youtube.com/watch?v=..."
  ]
}

Create the file if it doesn't exist (with {"seen": []}). Load into memory as a Set<String> for O(1) dedup.

mkdir -p "$VAULT_PATH/.kb"
test -f "$VAULT_PATH/.kb/vacuum_seen.json" || echo '{"seen": []}' > "$VAULT_PATH/.kb/vacuum_seen.json"

4. Extract URLs from every `*.md` under SCAN_ROOT

Walk recursively. For each markdown file, extract all HTTP(S) URLs. Handle both bare URLs and [label](url) Markdown link wrappers — extract just the url part. Strip trailing punctuation (.,;:!?'"). Dedupe within the run.

Recommended approach: a small Python or jq+sed pipeline. Example Python:

import json, os, re, pathlib, sys, urllib.parse

vault = pathlib.Path(os.environ["VAULT_PATH"])
scan_root = pathlib.Path(os.environ["SCAN_ROOT"])
seen_path = vault / ".kb" / "vacuum_seen.json"
seen = set(json.loads(seen_path.read_text())["seen"])

# Match http(s) URLs both bare and inside Markdown link wrappers.
url_re = re.compile(r"https?://[^\s)>\]\"']+", re.IGNORECASE)
trailing_punct = ".,;:!?\"'"

discovered = []
discovered_set = set()
for md in scan_root.rglob("*.md"):
    try:
        text = md.read_text(encoding="utf-8", errors="ignore")
    except Exception:
        continue
    for raw in url_re.findall(text):
        url = raw
        while url and url[-1] in trailing_punct:
            url = url[:-1]
        if not url or url in discovered_set:
            continue
        discovered_set.add(url)
        discovered.append(url)

new_urls = [u for u in discovered if u not in seen]
print(f"DISCOVERED={len(discovered)}")
print(f"NEW={len(new_urls)}")

# Write the URL list to a temp file the bash steps can iterate.
out = vault / ".kb" / "vacuum_pending.txt"
out.write_text("\n".join(new_urls))

5. POST each unseen URL to `/v1/capture/safari`

For each URL in .kb/vacuum_pending.txt, POST:

while IFS= read -r url; do
  [ -z "$url" ] && continue
  CAPTURED_AT=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
  STATUS=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
    "$SERVER_BASE_URL/v1/capture/safari" \
    -H "Authorization: Bearer $AUTH_TOKEN" \
    -H "Content-Type: application/json" \
    -d "$(jq -nc --arg u "$url" --arg t "$CAPTURED_AT" '{url:$u, source:"kb-vacuum", capturedAt:$t}')")
  case "$STATUS" in
    2*)
      echo "POSTED $url"
      echo "$url" >> "$VAULT_PATH/.kb/vacuum_posted.txt"
      ;;
    4*)
      echo "CLIENT_ERROR ($STATUS) $url — skipping"
      ;;
    5*|000)
      echo "TRANSIENT ($STATUS) $url — retry once"
      sleep 1
      RETRY_STATUS=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
        "$SERVER_BASE_URL/v1/capture/safari" \
        -H "Authorization: Bearer $AUTH_TOKEN" \
        -H "Content-Type: application/json" \
        -d "$(jq -nc --arg u "$url" --arg t "$CAPTURED_AT" '{url:$u, source:"kb-vacuum", capturedAt:$t}')")
      if [[ "$RETRY_STATUS" =~ ^2 ]]; then
        echo "POSTED (retry) $url"
        echo "$url" >> "$VAULT_PATH/.kb/vacuum_posted.txt"
      else
        echo "FAILED ($RETRY_STATUS) $url — will retry next run"
      fi
      ;;
  esac
done < "$VAULT_PATH/.kb/vacuum_pending.txt"

6. Update the seen manifest

Merge every successfully-POSTed URL from this run into the manifest:

import json, pathlib, os
vault = pathlib.Path(os.environ["VAULT_PATH"])
seen_path = vault / ".kb" / "vacuum_seen.json"
posted_path = vault / ".kb" / "vacuum_posted.txt"

seen = set(json.loads(seen_path.read_text())["seen"])
if posted_path.exists():
    for line in posted_path.read_text().splitlines():
        url = line.strip()
        if url:
            seen.add(url)

seen_path.write_text(json.dumps({"seen": sorted(seen)}, indent=2))
posted_path.unlink(missing_ok=True)
(vault / ".kb" / "vacuum_pending.txt").unlink(missing_ok=True)

7. Print the run summary

<DISCOVERED> urls discovered, <NEW> new, <POSTED> posted, <FAILED> failed

Example:

27 urls discovered, 14 new, 13 posted, 1 failed

If failed > 0, list the failed URLs so the user knows what to investigate (auth, server reachability, etc).

Notes

Dedup contract: client-side manifest (vacuum_seen.json) is primary. Server also dedupes via VaultFile (tenant_id, source_url) unique constraint, so re-POSTing the same URL accidentally is idempotent.
To re-vacuum everything: delete .kb/vacuum_seen.json and re-run. Server-side dedup keeps you safe from duplicate VaultFile rows.
Token rotation: when the JWT expires, update auth_token in ~/.claude/kb-config.json and re-run.
Why client-side, not server-cron: the vault lives on the user's machine. The server has no filesystem read access. Mirrors the kb-ingest / kb-compile / kb-import pattern.
iOS in-app surface: the iOS app has Safari share-extension capture but no "Vacuum vault" button yet. That's a separate ticket targeting a future /v1/capture/bulk endpoint.

kb-vacuum

이 저장소의 다른 Skills

이 저장소의 다른 Skills

KB Vacuum

Steps

1. Read config

2. Resolve target folder

3. Load the seen manifest

4. Extract URLs from every *.md under SCAN_ROOT

5. POST each unseen URL to /v1/capture/safari

6. Update the seen manifest

7. Print the run summary

Notes

KB Vacuum

Steps

1. Read config

2. Resolve target folder

3. Load the seen manifest

4. Extract URLs from every *.md under SCAN_ROOT

5. POST each unseen URL to /v1/capture/safari

6. Update the seen manifest

7. Print the run summary

Notes

4. Extract URLs from every `*.md` under SCAN_ROOT

5. POST each unseen URL to `/v1/capture/safari`

4. Extract URLs from every `*.md` under SCAN_ROOT

5. POST each unseen URL to `/v1/capture/safari`