Run any Skill in Manus with one click

$pwd:

wiki-retrieve

Name: Wiki Retrieve
Author: AgriciDaniel

// Hybrid retrieval primitive for the Compound Vault. Replaces the v1.6 static hot→index→drill read order with contextual-prefix + BM25 + cosine-rerank, modeled on Anthropic's Sept 2024 Contextual Retrieval research (35-49-67% retrieval-failure reduction). Opt-in via `bash bin/setup-retrieve.sh`; feature-detected by wiki-query and autoresearch. Triggers on: retrieve, hybrid retrieval, BM25, rerank, contextual retrieval, search the chunks, chunk search, vault search, semantic search, what chunks match, find relevant passages.

Run Skill in Manus

$ git log --oneline --stat

stars:5,796

forks:670

updated:May 26, 2026 at 21:42

SKILL.md

readonly

related-skills.json

same repository

autoresearch.md

from "AgriciDaniel/claude-obsidian"

Autonomous iterative research loop. Takes a topic, runs web searches, fetches sources, synthesizes findings, and files everything into the wiki as structured pages. Based on Karpathy's autoresearch pattern: program.md configures objectives and constraints, the loop runs until depth is reached, output goes directly into the knowledge base. Triggers on: "/autoresearch", "autoresearch", "research [topic]", "deep dive into [topic]", "investigate [topic]", "find everything about [topic]", "research and file", "go research", "build a wiki on".

2026-05-185.8k

canvas.md

from "AgriciDaniel/claude-obsidian"

Visual layer of the wiki. Add images, text cards, PDFs, and wiki pages to Obsidian canvas files with auto-positioning inside zones. Integrates with /banana for image capture. Triggers on: /canvas, canvas new, canvas add image, canvas add text, canvas add pdf, canvas add note, canvas zone, canvas list, canvas from banana, add to canvas, put this on the canvas, open canvas, create canvas.

2026-05-185.8k

defuddle.md

from "AgriciDaniel/claude-obsidian"

Strip clutter from web pages before ingesting into the wiki. Removes ads, navigation, headers, footers, and boilerplate: leaving clean readable markdown that saves 40-60% tokens. Triggers on: defuddle, clean this page, strip this url, fetch and clean, clean web content before ingesting, strip ads, remove clutter, clean URL content, readable markdown from URL.

2026-05-185.8k

obsidian-bases.md

from "AgriciDaniel/claude-obsidian"

Create and edit Obsidian Bases (.base files): Obsidian's native database layer for dynamic tables, card views, list views, filters, formulas, and summaries over vault notes. Triggers on: create a base, add a base file, obsidian bases, base view, filter notes, formula, database view, dynamic table, task tracker base, reading list base.

2026-05-185.8k

obsidian-markdown.md

from "AgriciDaniel/claude-obsidian"

Write correct Obsidian Flavored Markdown: wikilinks, embeds, callouts, properties, tags, highlights, math, and canvas syntax. Reference this when creating or editing any wiki page. Triggers on: write obsidian note, obsidian syntax, wikilink, callout, embed, obsidian markdown, wikilink format, callout syntax, embed syntax, obsidian formatting, how to write obsidian markdown.

2026-05-185.8k

save.md

from "AgriciDaniel/claude-obsidian"

Save the current conversation, answer, or insight into the Obsidian wiki vault as a structured note. Analyzes the chat, determines the right note type, creates frontmatter, files it in the correct wiki folder, and updates index, log, and hot cache. Triggers on: "save this", "save that answer", "/save", "file this", "save to wiki", "save this session", "file this conversation", "keep this", "save this analysis", "add this to the wiki".

2026-05-185.8k

package.json

"author": "AgriciDaniel"

"repository": "AgriciDaniel/claude-obsidian"

View GitHub Repository View Creator Repositories

$ install --global

$ download --local

Run Skill in Manus

$ useful --forSOC

Software DevelopersComputer and Mathematical Occupations15-1252L4

name	wiki-retrieve
description	Hybrid retrieval primitive for the Compound Vault. Replaces the v1.6 static hot→index→drill read order with contextual-prefix + BM25 + cosine-rerank, modeled on Anthropic's Sept 2024 Contextual Retrieval research (35-49-67% retrieval-failure reduction). Opt-in via `bash bin/setup-retrieve.sh`; feature-detected by wiki-query and autoresearch. Triggers on: retrieve, hybrid retrieval, BM25, rerank, contextual retrieval, search the chunks, chunk search, vault search, semantic search, what chunks match, find relevant passages.
allowed-tools	Read Bash

wiki-retrieve: Hybrid Retrieval over the Vault

The v1.6 query path was Read(hot.md) → Read(index.md) → Read(3-5 pages) → synthesize. It worked, but page-level granularity loses to chunk-level granularity any time the answer lives in a specific passage rather than a whole page. The v1.7 wiki-retrieve skill is the chunk-level upgrade — opt-in, feature-gated, and replaces nothing if you don't run the setup.

Origin: This skill is original to claude-obsidian. There is no upstream kepano equivalent. The technique is from Anthropic's Sept 2024 Contextual Retrieval research — we implement it as agent-skill plumbing.

Data privacy (v1.7.1+)

Tier 1 (Anthropic API) and tier 2 (claude CLI subprocess) of the contextual-prefix generator send wiki page bodies off-machine. As of v1.7.1, both tiers are GATED behind explicit user consent at two layers:

scripts/contextual-prefix.py --allow-egress (default off). Without the flag, pick_prefix_tier() returns "synthetic" regardless of ANTHROPIC_API_KEY or claude binary presence.
bin/setup-retrieve.sh prompts before any non-synthetic Stage 1 run; default is abort.

To run fully on-machine (tier 3 synthetic prefix + local ollama rerank), use bash bin/setup-retrieve.sh --no-llm. This is also the effective behavior if you decline the consent prompt or omit --allow-egress.

The guard mirrors scripts/tiling-check.py:351 --allow-remote-ollama. v1.6 vaults that never provisioned this skill see zero behavior change.

Architecture

INGEST (one-time, then incremental):

  wiki/<page>.md
       │
       ▼
  scripts/contextual-prefix.py
       │   ├─ chunk on paragraph boundaries (~500 token target, 200 char overlap)
       │   ├─ generate 1-2 sentence prefix per chunk
       │   │     tier 1: ANTHROPIC_API_KEY → Anthropic API (Haiku, prompt-cached
       │   │                                 when body ≥ ~16 KB / Haiku 4.5 floor)
       │   │     tier 2: `claude` on PATH  → claude -p subprocess
       │   │     tier 3: synthetic         → frontmatter title + first paragraph
       │   └─ write .vault-meta/chunks/<address>/chunk-NNN.json
       │
       ▼
  scripts/bm25-index.py build
       └─ inverted index over chunks' contextualized_text → .vault-meta/bm25/index.json

QUERY:

  query string
       │
       ▼
  scripts/retrieve.py "<query>" --top 5
       ├─ bm25-index.py query "<query>" --top 20    (sparse candidate set)
       ├─ rerank.py "<query>" --candidates -        (dense rerank via ollama cosine)
       │     cosine(query_embedding, chunk_embedding)
       │     embeddings cached in .vault-meta/embed-cache.json keyed by body_hash
       └─ dedupe by page-address, return top-N candidates with absolute_path
       │
       ▼
  caller (wiki-query / autoresearch) reads the cited pages and synthesizes

Feature gating

Other skills must detect this skill before using it. The canonical detection:

[ -x scripts/retrieve.py ] && [ -d .vault-meta/chunks ] && \
  [ -f .vault-meta/bm25/index.json ] && \
  echo "wiki-retrieve installed" || echo "fallback: legacy hot→index→drill"

If detection fails, callers MUST fall back to the v1.6 read order. This skill never breaks the base plugin.

Setup

bash bin/setup-retrieve.sh

What it does, in order:

Sanity-checks the 4 scripts are present and executable.
Creates .vault-meta/chunks/ and .vault-meta/bm25/.
Probes ollama at http://127.0.0.1:11434 for nomic-embed-text (rerank prerequisite). Reports status; does not install.
Reports which contextual-prefix tier will be used (Anthropic API / claude CLI / synthetic).
Runs contextual-prefix.py --all to chunk + contextualize every wiki page.
Runs bm25-index.py build.
Smoke-tests retrieve.py against the query "wiki".

Flags:

--check — diagnostics only, no provisioning.
--no-llm — force tier-3 synthetic prefix (cheapest, zero LLM dependency).
--rebuild — re-chunk every page even if body_hash matches.

Cost ceiling

Per Anthropic's published research, contextual-prefix generation costs approximately $12 per 1,000 documents with Haiku + prompt caching. For a 100-page vault with ~3 chunks per page, that's ~$3.60 one-time, with incremental updates much cheaper (only changed pages re-process).

If you want to validate cost before running on a large vault:

bash bin/setup-retrieve.sh --no-llm   # provision with tier-3 synthetic prefix
# inspect retrieval quality manually; if insufficient, re-run without --no-llm

The claude-cli subprocess tier (no API key) is free in $ terms but slower (~3-10s per chunk depending on Haiku availability).

Skill commands (recipe)

These are the commands wiki-query and autoresearch will execute when wiki-retrieve is feature-detected. Other skills should mirror this pattern.

Standard retrieve

python3 scripts/retrieve.py "your question here" --top 5

Output: JSON with candidates array. Each candidate has absolute_path to the source page; caller reads that page (using the v1.7 transport selector) and synthesizes.

BM25-only (skip rerank)

python3 scripts/retrieve.py "query" --top 5 --no-rerank

Faster (no ollama call); lower quality.

Explain mode (debugging)

python3 scripts/retrieve.py "query" --top 5 --explain

Adds an explain block with per-stage diagnostics (BM25 candidate count, dedupe size, etc.).

Direct BM25 inspection

python3 scripts/bm25-index.py query "query" --top 10
python3 scripts/bm25-index.py stats

Rerank strategy probe

python3 scripts/rerank.py "query" --peek

Reports which strategy will run (cosine via ollama / no-op).

Integration with wiki-query

After this skill is installed, skills/wiki-query/SKILL.md standard and deep modes will:

Read wiki/hot.md (always — quick context).
Call python3 scripts/retrieve.py "<query>" --top 5.
Read the candidate pages from the result's absolute_path field (using the v1.7 transport selector — obsidian-cli read or Read tool).
Synthesize with chunk-level citation.

Quick mode is unchanged (hot.md only — never invokes retrieval).

If retrieve.py exits 10 (feature not provisioned), wiki-query falls back to the legacy v1.6 Read(index.md) → Read(N pages) order. No user-visible breakage.

Index maintenance

The index is NOT auto-refreshed when wiki pages change. Re-run after substantive ingest sessions:

python3 scripts/contextual-prefix.py --all      # incremental: only re-processes changed pages
python3 scripts/bm25-index.py build             # always full rebuild (cheap; pure Python)

A future v1.7.x patch will add an opt-in PostToolUse hook that triggers contextual-prefix + BM25 rebuild after every N writes. For v1.7.0, refresh is manual.

To wipe and start over:

rm -rf .vault-meta/chunks/ .vault-meta/bm25/ .vault-meta/embed-cache.json
bash bin/setup-retrieve.sh

Future tiers (v1.7.x roadmap)

Documented for transparency; not implemented in v1.7.0:

Stage	v1.7.0	v1.7.x target
Contextual prefix	API / claude-cli / synthetic	+ Voyage embed-based pseudo-prefix
Sparse retrieval	BM25	+ SPLADE learned-sparse
Dense retrieval	(none — rerank-only)	Separate vector candidate set fused with BM25 (true hybrid)
Rerank	nomic cosine / no-op	+ sentence-transformers BGE-base, Cohere Rerank, Voyage Rerank
Multi-vault	(single-vault)	Federation via wiki-federate (backlog #15)

Cross-reference

Decision tree for transports: wiki/references/transport-fallback.md
Concurrency policy: skills/wiki-ingest/SKILL.md §Concurrency
DragonScale Memory: wiki/concepts/DragonScale Memory.md
Anthropic Contextual Retrieval research: https://www.anthropic.com/news/contextual-retrieval

How to think (10-principle mapping)

When working on this skill, apply the 10-principle loop. See skills/think/SKILL.md for the canonical framework.

#	Principle	Application here
1	OBSERVE (ext)	Read the BM25 index state + embed cache state before issuing a query. Stale caches produce wrong answers.
2	OBSERVE (int)	Am I trusting the cache when it should have been invalidated by recent ingests? Check mtime against last ingest.
3	LISTEN	The user's query — what does it actually ask? Decompose into intent and terms before matching.
4	THINK	Which retrieval strategy fits this query? BM25-only / BM25 + rerank / contextual-prefix + BM25 + rerank.
5	CONNECT (lat)	How does this hybrid compare to v1.6 baseline? +32pp top-1 / +41% error reduction is the published delta.
6	CONNECT (sys)	`--allow-egress` consent gate for Anthropic API; ollama runs local-only; rerank caches under `.vault-meta/`.
7	FEEL	When not provisioned, exit 10 with a friendly "run `bash bin/setup-retrieve.sh` first" message — not a stack trace.
8	ACCEPT	When retrieval returns empty, say so honestly. Don't fabricate. Don't pad with low-confidence guesses.
9	CREATE	A ranked candidate list with `--explain` traceability for every score component.
10	GROW	Queries that consistently fail → content gaps in the wiki. Track those as autoresearch inputs.

wiki-retrieve

More from this repository

More from this repository

wiki-retrieve: Hybrid Retrieval over the Vault

Data privacy (v1.7.1+)

Architecture

Feature gating

Setup

Cost ceiling

Skill commands (recipe)

Standard retrieve

BM25-only (skip rerank)

Explain mode (debugging)

Direct BM25 inspection

Rerank strategy probe

Integration with wiki-query

Index maintenance

Future tiers (v1.7.x roadmap)

Cross-reference

How to think (10-principle mapping)

wiki-retrieve: Hybrid Retrieval over the Vault

Data privacy (v1.7.1+)

Architecture

Feature gating

Setup

Cost ceiling

Skill commands (recipe)

Standard retrieve

BM25-only (skip rerank)

Explain mode (debugging)

Direct BM25 inspection

Rerank strategy probe

Integration with wiki-query

Index maintenance

Future tiers (v1.7.x roadmap)

Cross-reference

How to think (10-principle mapping)