Name: Ingest
Author: damionrashford

name	ingest
description	Ingest and vectorize PDFs or text files for RAG retrieval. Use when the user provides a PDF, document, or text file they want to analyze, study, or generate content from.
argument-hint	<file-path>
compatibility	Requires Bun 1.0+. First run downloads ~23MB embedding model.
allowed-tools	Bash(bash *) Bash(bun *) Bash(node *) Read

Ingest Documents

Ingest a PDF or text file into the local vector store for retrieval-augmented generation (RAG). Run the ingest script on the file, then the list script:

bun "${CLAUDE_SKILL_DIR}/scripts/ingest.mjs" $ARGUMENTS

bun "${CLAUDE_SKILL_DIR}/scripts/list.mjs"

Confirm to the user: number of chunks and pages. Suggest next actions: report, flashcards, quiz, slides, audio, mind map, infographic, or data table.
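For intuition about what a "chunk" is, here is a minimal sketch of overlapping-window chunking. It is not taken from ingest.mjs: the function name `chunkText`, the word-based splitting, and the `chunkSize` and `overlap` defaults are all assumptions made for illustration, and the real script may split on tokens, sentences, or pages instead.

```javascript
// Illustrative only: split text into overlapping word windows.
// chunkText, chunkSize and overlap are hypothetical, not the plugin's settings.
function chunkText(text, chunkSize = 200, overlap = 40) {
  if (overlap >= chunkSize) {
    throw new RangeError("overlap must be smaller than chunkSize");
  }
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  const step = chunkSize - overlap; // how far each window advances
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    // Stop once a window has reached the end of the text.
    if (start + chunkSize >= words.length) break;
  }
  return chunks;
}
```

The overlap means a sentence that straddles a window boundary still appears whole in at least one chunk, which is why the chunk count reported after ingestion is usually higher than the page count.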

To retrieve relevant chunks for any generation task:

bun "${CLAUDE_SKILL_DIR}/scripts/query.mjs" "<question>" --top-k 15
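Conceptually, a top-k query ranks every stored chunk by similarity to the question's embedding and keeps the best k. The sketch below shows that idea with cosine similarity. It is an assumption about how such retrieval generally works, not the contents of query.mjs: the names `cosineSimilarity`, `topK`, and `store`, the entry shape, and the toy 3-dimensional vectors are all hypothetical (all-MiniLM-L6-v2 itself produces 384-dimensional embeddings).

```javascript
// Illustrative only: rank stored chunks by cosine similarity to a query vector.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // treat zero vectors as unrelated
}

// Return the k highest-scoring entries, best first (k mirrors --top-k).
function topK(queryVector, entries, k = 15) {
  return entries
    .map((entry) => ({
      ...entry,
      score: cosineSimilarity(queryVector, entry.vector),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Hypothetical store holding chunks from two different files.
const store = [
  { source: "notes.txt", text: "alpha", vector: [1, 0, 0] },
  { source: "paper.pdf", text: "beta", vector: [0, 1, 0] },
  { source: "paper.pdf", text: "gamma", vector: [0.8, 0.6, 0] },
];

// "alpha" ranks first, then "gamma"; "beta" is orthogonal to the query.
const hits = topK([1, 0, 0], store, 2);
console.log(hits.map((hit) => `${hit.source}: ${hit.text}`));
```

Because every chunk is scored against the same query regardless of which file it came from, results from several ingested documents can appear side by side in one ranking.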

The first run downloads the all-MiniLM-L6-v2 model (~23MB). Subsequent runs use the cached copy.
Large PDFs (100+ pages) take 2-5 minutes on CPU — embedding is the bottleneck.
Scanned/image-only PDFs are automatically OCR'd via Tesseract.js (first OCR run downloads ~15MB language data). Use --no-ocr to skip.
The store persists at ~/.notebook-plugin/ across sessions. Use clear to reset.
Multiple files can be ingested into the same store — they're all searchable together.
