pdf-reader
Read and comprehend PDF files, especially math lecture notes and academic papers. Use when the user asks to read, parse, analyze, or extract content from a PDF file.
| name | pdf-reader |
| description | Read and comprehend PDF files, especially math lecture notes and academic papers. Use when the user asks to read, parse, analyze, or extract content from a PDF file. |
Read and comprehend PDF files, especially math lecture notes and academic papers. Uses a hybrid text extraction + vision approach for maximum comprehension of equations, diagrams, and structured content.
All scripts use a venv at SKILL_DIR/.venv with pymupdf installed. If the venv is missing, create it from requirements.txt:
python3 -m venv SKILL_DIR/.venv
SKILL_DIR/.venv/bin/pip install -r SKILL_DIR/requirements.txt
Python command: Always invoke scripts with:
SKILL_DIR/.venv/bin/python SKILL_DIR/scripts/<script>.py [args]
All scripts are in SKILL_DIR/scripts/.
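As a minimal sketch, the invocation convention above can be wrapped in a small helper that assembles the command line. The helper name `script_command` and the example skill directory are hypothetical; only the `.venv/bin/python` interpreter path and the `scripts/` layout come from the text above.

```python
from pathlib import Path


def script_command(skill_dir, script, *args):
    """Build the argv list for one of the skill's scripts.

    Hypothetical helper: it only assembles the command described above
    (venv interpreter + script path + arguments) and does not run it.
    """
    skill_dir = Path(skill_dir)
    python = skill_dir / ".venv" / "bin" / "python"
    script_path = skill_dir / "scripts" / f"{script}.py"
    return [str(python), str(script_path), *map(str, args)]


if __name__ == "__main__":
    # Example: build the command that renders pages 1-5 of notes.pdf.
    cmd = script_command(
        "/opt/skills/pdf-reader", "pdf_render", "notes.pdf", "--pages", "1-5"
    )
    print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; building it as a list rather than a string avoids shell-quoting problems with paths that contain spaces.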
| Script | Purpose | Key args |
|---|---|---|
| `pdf_info.py <path>` | Metadata + per-page analysis (page count, TOC, text density, math density, image count) | none |
| `pdf_extract.py <path> [--pages SPEC]` | Extract text by page | `--pages` (`all`, `1-5`, `1,3,7`, or `3`) |
| `pdf_render.py <path> [--pages SPEC] [--dpi N]` | Render pages to PNG images in `/tmp/pi-pdf-*/` | `--pages`, `--dpi` (default 150) |
| `pdf_search.py <path> <query> [--context N] [--literal]` | Search text content by regex or literal | `--context` lines (default 3), `--literal` flag |
Page specs: all, 1-5, 1,3,7, 3 (1-indexed, inclusive ranges).
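To make the page-spec rules concrete, here is a minimal sketch of how such a spec could be expanded into page numbers. The function name `parse_page_spec` is hypothetical and this is not necessarily how the scripts parse it; in particular, accepting mixed specs such as `1-3,7` and rejecting out-of-range pages are assumptions of the sketch.

```python
def parse_page_spec(spec, page_count):
    """Expand a page spec into a sorted list of 1-indexed page numbers.

    Hypothetical re-implementation for illustration; the real scripts may
    validate or order pages differently.
    """
    spec = spec.strip()
    if spec == "all":
        return list(range(1, page_count + 1))

    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end > page_count or start > end:
            raise ValueError(
                f"invalid page range {part!r} for a {page_count}-page PDF"
            )
        # Ranges are inclusive on both ends, so 1-5 covers five pages.
        pages.update(range(start, end + 1))
    return sorted(pages)


if __name__ == "__main__":
    for example in ("all", "1-5", "1,3,7", "3"):
        print(example, "->", parse_page_spec(example, page_count=10))
```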
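Run pdf_info.py on every new PDF before doing anything else. Its report (page count, TOC, and per-page text density, math density, and image count) tells you how large the document is and which reading strategy fits: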
- Short PDFs: extract the text with `pdf_extract.py <path>`, render the pages with `pdf_render.py <path>`, and open the images with the `read` tool for full visual comprehension.
- Medium PDFs: check the `pdf_info.py` output for pages with high math_density (>0.02), image_count > 0, or low text_length (<100, likely diagram-only pages); render those pages and `read` them for equation and figure comprehension.
- Long PDFs: use `pdf_search.py` to find relevant pages, then render those.

When the user asks about something specific (e.g., "check theorem 3.2", "what's on page 7"):

1. `pdf_search.py <path> "theorem 3.2"`: find the page
2. `pdf_render.py <path> --pages <page>`: render just that page
3. `read` the image: see the actual theorem with proper math rendering

When reading rendered page images:
Summarizing a whole PDF:

1. pdf_info.py → assess size and content
2. Pick strategy (short/medium/long)
3. Extract text + selectively render
4. Provide summary with key findings
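The "selectively render" step can be sketched as a filter over per-page statistics, using the thresholds mentioned earlier (math_density > 0.02, image_count > 0, text_length < 100). The record shape used here, a list of dicts with `page`, `math_density`, `image_count`, and `text_length` keys, is an assumption; the actual output format of pdf_info.py is not shown in this document.

```python
MATH_DENSITY_THRESHOLD = 0.02  # pages above this count as math-heavy
MIN_TEXT_LENGTH = 100          # pages below this are likely diagram-only


def pages_to_render(page_stats):
    """Return the 1-indexed pages worth rendering as images.

    page_stats is assumed to be a list of dicts, one per page; this shape
    is hypothetical and may not match pdf_info.py's real output.
    """
    selected = []
    for stats in page_stats:
        needs_render = (
            stats["math_density"] > MATH_DENSITY_THRESHOLD
            or stats["image_count"] > 0
            or stats["text_length"] < MIN_TEXT_LENGTH
        )
        if needs_render:
            selected.append(stats["page"])
    return selected


if __name__ == "__main__":
    sample = [
        {"page": 1, "math_density": 0.00, "image_count": 0, "text_length": 2400},
        {"page": 2, "math_density": 0.05, "image_count": 0, "text_length": 1800},
        {"page": 3, "math_density": 0.00, "image_count": 1, "text_length": 1500},
        {"page": 4, "math_density": 0.00, "image_count": 0, "text_length": 40},
    ]
    # Join into a comma-separated spec suitable for --pages.
    print(",".join(map(str, pages_to_render(sample))))  # prints 2,3,4
```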
Looking up a specific theorem:

1. pdf_search.py → find the page
2. pdf_render.py → render that page (and maybe the next for proof continuation)
3. Read the image, state the theorem precisely
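The search step can be illustrated with a small regex-or-literal search over text that has already been extracted per page. This is a sketch, not the code of pdf_search.py: the function name `search_pages`, the dict input, and case-insensitive matching are all assumptions.

```python
import re


def search_pages(page_texts, query, context=3, literal=False):
    """Yield (page, line_number, snippet) for every matching line.

    page_texts maps 1-indexed page numbers to extracted text (assumed shape).
    With literal=True the query is escaped so regex metacharacters such as
    "." match only themselves.
    """
    pattern = re.compile(re.escape(query) if literal else query, re.IGNORECASE)
    for page in sorted(page_texts):
        lines = page_texts[page].splitlines()
        for index, line in enumerate(lines):
            if pattern.search(line):
                # Include up to `context` lines before and after the hit.
                start = max(0, index - context)
                end = min(len(lines), index + context + 1)
                yield page, index + 1, "\n".join(lines[start:end])


if __name__ == "__main__":
    texts = {
        1: "Introduction\nWe study continuous maps.",
        2: "Definitions\nTheorem 3.2 Let f be continuous.\nProof.",
    }
    for page, line_number, snippet in search_pages(
        texts, "theorem 3.2", context=1, literal=True
    ):
        print(f"page {page}, line {line_number}:\n{snippet}")
```

The literal mode matters for queries like "theorem 3.2": as a regex, the dot would also match "Theorem 3x2".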
Explaining a proof on page N:

1. pdf_render.py --pages N → render the page
2. Read the image for full visual comprehension
3. Also extract text from pages N-1 and N+1 for surrounding context
4. Walk through the proof step by step
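Step 3 needs the neighbors of page N, which do not both exist when N is the first or last page. A minimal sketch of that boundary handling follows; the helper name `context_pages` is hypothetical.

```python
def context_pages(page, page_count):
    """Return the pages directly before and after `page` that exist.

    Pages are 1-indexed, matching the page specs used by the scripts.
    """
    if not 1 <= page <= page_count:
        raise ValueError(f"page {page} is outside 1..{page_count}")
    return [p for p in (page - 1, page + 1) if 1 <= p <= page_count]


if __name__ == "__main__":
    # Build a comma-separated spec for extracting the surrounding pages.
    neighbors = context_pages(7, page_count=12)
    print(",".join(map(str, neighbors)))  # prints 6,8
```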
Reading an academic paper:

1. pdf_info.py → get TOC and page count
2. pdf_extract.py → full text extraction
3. Read abstract, intro, conclusion first (text is usually sufficient)
4. Render figures/theorem pages as needed for deeper understanding
5. Provide structured summary