monoco-atom-doc-extract

Stars: 9

Forks: 1

Updated: February 9, 2026 at 16:34

Extract documents to WebP pages for VLM analysis - Convert PDF, Office, Images to standardized WebP format

Source

IndenScale

IndenScale/monoco-toolkit

Related occupations (SOC)

Based on the SOC occupational classification

Computer Occupations, All Other (Computer and Mathematical Occupations) · SOC 15-1299

SKILL.md

name	monoco_atom_doc_extract
description	Extract documents to WebP pages for VLM analysis - Convert PDF, Office, Images to standardized WebP format
type	atom

Document Extraction

Extract documents to WebP pages suitable for Vision Language Model (VLM) analysis.

When to Use

Use this skill when you need to:

Analyze PDF documents with visual capabilities
Process Office documents (DOCX, PPTX, XLSX) for content extraction
Convert images or scanned documents to page sequences
Handle documents from ZIP archives

Commands

Extract a document:

monoco doc-extractor extract <file_path> [--dpi 150] [--quality 85] [--pages "1-5,10"]

List extracted documents:

monoco doc-extractor list [--category pdf] [--limit 20]

Search documents:

monoco doc-extractor search <query>

Show document details:

monoco doc-extractor show <hash_prefix>
monoco doc-extractor cat <hash_prefix>  # Show metadata JSON

Parameters

Parameter	Default	Description
`--dpi`	150	DPI for rendering (72-300)
`--quality`	85	WebP quality (1-100)
`--pages`	all	Page range (e.g., "1-5,10,15-20")
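
The page-range syntax and the documented value ranges can be illustrated with a small sketch. This is not the tool's own parser: `parse_pages` and `check_range` are hypothetical helpers, and the assumptions that pages are 1-based and that ranges are inclusive are not confirmed by the source.

```python
def parse_pages(spec: str) -> list[int]:
    """Expand a spec such as "1-5,10,15-20" into sorted, unique page numbers.

    Assumption: pages are 1-based and ranges are inclusive on both ends.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start < 1 or end < start:
                raise ValueError(f"invalid page range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            page = int(part)
            if page < 1:
                raise ValueError(f"invalid page number: {part!r}")
            pages.add(page)
    return sorted(pages)


def check_range(name: str, value: int, low: int, high: int) -> int:
    """Reject values outside the documented range, e.g. --dpi 72-300."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    return value


print(parse_pages("1-5,10"))          # [1, 2, 3, 4, 5, 10]
print(check_range("dpi", 150, 72, 300))  # 150
```

Validating on the caller's side like this gives an early, readable error before the CLI is invoked; how the CLI itself reacts to out-of-range values is not stated in the source.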

Output

Documents are stored in ~/.monoco/blobs/{sha256_hash}/:

source.{ext} - Original file
source.pdf - Normalized PDF
pages/*.webp - Rendered page images
meta.json - Document metadata
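
To locate the output of a given file from a script, the layout above can be walked roughly as follows. This is a sketch under stated assumptions: the directory name is taken to be the hex SHA-256 digest of the original file's raw bytes, and page files are assumed to sort correctly by name. Neither detail is confirmed by the source, and `content_hash`, `blob_dir`, and `rendered_pages` are hypothetical helper names.

```python
import hashlib
from pathlib import Path

# Storage root as documented above.
BLOB_ROOT = Path.home() / ".monoco" / "blobs"


def content_hash(path: Path) -> str:
    """Hex SHA-256 of the file's bytes (assumed to match {sha256_hash})."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large documents are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def blob_dir(path: Path, root: Path = BLOB_ROOT) -> Path:
    """Directory where the extraction of `path` is expected to live."""
    return root / content_hash(path)


def rendered_pages(directory: Path) -> list[Path]:
    """Rendered page images under pages/, sorted by file name."""
    return sorted((directory / "pages").glob("*.webp"))
```

If `blob_dir(path)` already exists, the document has presumably been extracted before, which matches the content-hash caching described under Best Practices.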

Example

# Extract a PDF with high quality
monoco doc-extractor extract ./report.pdf --dpi 200 --quality 90

# Extract specific pages from a document
monoco doc-extractor extract ./presentation.pptx --pages "1-10"

# List all PDF documents
monoco doc-extractor list --category pdf

# Show details of extracted document
monoco doc-extractor show a1b2c3d4
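
When the extractor is driven from a script rather than a terminal, the same commands can be assembled as an argument list. The sketch below uses only the flags documented above; `build_extract_command` and `run_extract` are hypothetical wrapper names, and it assumes the `monoco` executable is on `PATH`.

```python
import subprocess
from typing import Optional


def build_extract_command(
    file_path: str,
    dpi: int = 150,
    quality: int = 85,
    pages: Optional[str] = None,
) -> list[str]:
    """Build the argv for `monoco doc-extractor extract` from documented flags."""
    if not 72 <= dpi <= 300:
        raise ValueError("dpi must be between 72 and 300")
    if not 1 <= quality <= 100:
        raise ValueError("quality must be between 1 and 100")
    command = [
        "monoco", "doc-extractor", "extract", file_path,
        "--dpi", str(dpi),
        "--quality", str(quality),
    ]
    if pages is not None:
        command += ["--pages", pages]
    return command


def run_extract(file_path: str, **options) -> None:
    """Run the extraction; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_extract_command(file_path, **options), check=True)
```

Passing a list instead of a single shell string avoids quoting problems with file names that contain spaces. For example, `run_extract("./report.pdf", dpi=200, quality=90)` corresponds to the first command above.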

Best Practices

Use --dpi 200 or higher for documents with small text
Use --quality 90 for better image quality (larger files)
Extracted documents are cached by content hash, so re-extracting an unchanged file reuses the stored result
Archives (ZIP) are automatically extracted and processed
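
The effect of the DPI setting on image size can be estimated with simple arithmetic. The sketch assumes the usual meaning of DPI, pixels = inches × DPI, and a US Letter page (8.5 × 11 in); whether the tool renders at exactly these dimensions is not confirmed by the source, and `pixel_size` is a hypothetical helper.

```python
def pixel_size(width_in: float, height_in: float, dpi: int) -> tuple[int, int]:
    """Pixel dimensions of a page rendered at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


# US Letter page at the default and at the recommended higher setting.
print(pixel_size(8.5, 11, 150))  # (1275, 1650)
print(pixel_size(8.5, 11, 200))  # (1700, 2200)
```

Going from 150 to 200 DPI raises the pixel count by a factor of (200/150)², about 1.78, which is why small text becomes more legible and why files grow.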

More from this repository

monoco-atom-doc-extract

IndenScale/monoco-toolkit

Extract documents to WebP pages for VLM analysis - Convert PDF, Office, Images to standardized WebP format

2026-02-09 · 9 stars

monoco-atom-doc-convert

IndenScale/monoco-toolkit

Document conversion and intelligent analysis - Use LibreOffice to convert Office/PDF documents to analyzable formats

2026-02-08 · 9 stars

engineer

IndenScale/monoco-toolkit

Engineer Role - Responsible for code generation, testing, and maintenance

2026-02-06 · 9 stars

reviewer

IndenScale/monoco-toolkit

Reviewer Role - Responsible for code audit, architecture compliance checking, and feedback

2026-02-06 · 9 stars
