Name: Mineru Pdf Converter
Author: Xueheng-Li

// This skill should be used when the user asks to "convert PDF to markdown", "use MinerU to convert [file]", "extract text from PDF", "PDF转Markdown", "转换PDF [路径]", "MinerU转换 [file]", "/mineru [file path]", or needs high-quality document conversion with formula and table recognition.

stars: 43

forks: 12

updated: January 25, 2026, 08:34

"repository": "Xueheng-Li/sysu-awesome-cc"

name	mineru-pdf-converter
description	This skill should be used when the user asks to "convert PDF to markdown", "use MinerU to convert [file]", "extract text from PDF", "PDF转Markdown", "转换PDF [路径]", "MinerU转换 [file]", "/mineru [file path]", or needs high-quality document conversion with formula and table recognition.
version	1.1.0
allowed-tools	Bash, Read, Write

MinerU PDF Converter

Convert PDFs and other documents to high-quality Markdown using the MinerU cloud API. Large PDFs (>600 pages) are handled automatically by splitting, converting, and merging.

Capabilities

- Convert PDF, images, and other documents to Markdown/LaTeX
- Preserve formulas, tables, and complex layouts
- Auto-upload local files via MinerU batch API
- Auto-split large PDFs (>600 pages) and merge results
- Support additional output formats: LaTeX, DOCX, HTML

Requirements

Python packages:

pip install requests pymupdf

API token: stored in ~/.claude/skills/mineru-pdf-converter/references/mineru-token.md
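As a rough illustration of how a script might consume that token file, here is a minimal sketch. It assumes the file contains only the token (possibly with surrounding whitespace) and that the API expects a Bearer-style `Authorization` header; neither detail is confirmed by this document, and `load_token` and `auth_headers` are hypothetical helper names.

```python
from pathlib import Path


def load_token(token_file: str) -> str:
    """Read the API token from a file, expanding a leading ~ to the home directory."""
    path = Path(token_file).expanduser()
    # Assumption: the file holds only the token, possibly padded with whitespace.
    token = path.read_text(encoding="utf-8").strip()
    if not token:
        raise ValueError(f"Token file is empty: {path}")
    return token


def auth_headers(token: str) -> dict[str, str]:
    """Build request headers. The Bearer scheme is an assumption, not taken from this document."""
    return {"Authorization": f"Bearer {token}"}
```

Expanding `~` inside the helper means the path works even when the shell did not expand it, for example when the argument was passed in double quotes.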

Quick Start

Basic Conversion

python ~/.claude/skills/mineru-pdf-converter/scripts/mineru_convert.py \
  --input "/path/to/document.pdf" \
  --token-file "$HOME/.claude/skills/mineru-pdf-converter/references/mineru-token.md"

Convert from URL

python ~/.claude/skills/mineru-pdf-converter/scripts/mineru_convert.py \
  --url "https://example.com/paper.pdf" \
  --token-file "$HOME/.claude/skills/mineru-pdf-converter/references/mineru-token.md"

Verbose Mode with Progress

python ~/.claude/skills/mineru-pdf-converter/scripts/mineru_convert.py \
  --input "/path/to/document.pdf" \
  --token-file "$HOME/.claude/skills/mineru-pdf-converter/references/mineru-token.md" \
  --verbose

With --verbose, the output shows the progress percentage during conversion:

Uploading file: /path/to/document.pdf
File uploaded, batch_id: abc123
Waiting for conversion...
Status: running (25/100 pages, 25.0%)
Status: running (50/100 pages, 50.0%)
Status: done
Downloading result...
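The status lines above follow a simple pattern, which can be sketched as below. This is only an illustration of how such a line could be produced from a page counter; `format_status` is a hypothetical helper, not a function taken from the script.

```python
def format_status(state: str, done_pages: int = 0, total_pages: int = 0) -> str:
    """Render one progress line in the style of the sample verbose output."""
    if state == "running" and total_pages > 0:
        percent = 100.0 * done_pages / total_pages
        return f"Status: running ({done_pages}/{total_pages} pages, {percent:.1f}%)"
    # Any other state (or an unknown page total) is printed without a percentage.
    return f"Status: {state}"


print(format_status("running", 25, 100))
print(format_status("done"))
```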

Additional Formats

# Include LaTeX output
python ~/.claude/skills/mineru-pdf-converter/scripts/mineru_convert.py \
  --input "/path/to/document.pdf" \
  --token-file "$HOME/.claude/skills/mineru-pdf-converter/references/mineru-token.md" \
  --extra-formats "latex"

Workflow

When the user requests a PDF conversion:

Identify input type
- Local file path: Use batch upload API to get temporary URL
- URL: Submit directly to conversion API
Check page count (for PDFs)
- If >600 pages: Split into 500-page chunks using PyMuPDF
- Process each chunk separately
- Merge final Markdown output

Execute conversion

python ~/.claude/skills/mineru-pdf-converter/scripts/mineru_convert.py \
  --input "[path]" \
  --token-file "$HOME/.claude/skills/mineru-pdf-converter/references/mineru-token.md"

Report result
- Confirm output path (subfolder named after input file by default)
- Note any warnings or partial failures
- Provide path to main .md file
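The first workflow step, deciding whether the input is a local file or a URL, could look roughly like this. It is a sketch under the assumption that only `http` and `https` URLs are accepted; `classify_input` is a hypothetical name and the real script may decide differently.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_input(value: str) -> str:
    """Return "url" for an http(s) URL or "local" for an existing file path."""
    parsed = urlparse(value)
    # Assumption: only http and https URLs go straight to the conversion API.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    if Path(value).expanduser().is_file():
        return "local"
    raise FileNotFoundError(f"Not a URL and not an existing file: {value}")
```

A local result would then go through the batch upload path, while a URL result would be submitted directly.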

Parameters Reference

Parameter	Default	Description
`--input`	-	Local file path (mutually exclusive with --url)
`--url`	-	Remote file URL (mutually exclusive with --input)
`--token-file`	-	Path to token file (required)
`--model`	vlm	Model: pipeline, vlm, MinerU-HTML
`--language`	ch	Document language
`--extra-formats`	[]	Additional formats: latex, docx, html
`--output-dir`	(source dir/filename)	Override output directory (skips subfolder creation)
`--enable-formula`	true	Enable formula recognition
`--enable-table`	true	Enable table recognition
`--page-ranges`	-	Page ranges to convert (e.g., "1-100,150-200"); see Page Ranges below
`--timeout`	600	Max wait time in seconds
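For readers who want to see how the parameters in this table map onto a command-line parser, here is a minimal `argparse` sketch. It mirrors the documented defaults, but several details are assumptions: that exactly one of `--input` and `--url` is required, that `--extra-formats` takes a comma-separated string, and that the boolean flags take `true`/`false` values as shown in Troubleshooting. It is not the script's actual parser.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Parse the true/false values used by --enable-formula and --enable-table."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"Expected true or false, got {value!r}")


def split_formats(value: str) -> list[str]:
    """Assumption: extra formats arrive as a comma-separated string."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="mineru_convert.py")
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--input", help="Local file path")
    source.add_argument("--url", help="Remote file URL")
    parser.add_argument("--token-file", required=True)
    parser.add_argument("--model", default="vlm",
                        choices=["pipeline", "vlm", "MinerU-HTML"])
    parser.add_argument("--language", default="ch")
    parser.add_argument("--extra-formats", type=split_formats, default=[])
    parser.add_argument("--output-dir")
    parser.add_argument("--enable-formula", type=str_to_bool, default=True)
    parser.add_argument("--enable-table", type=str_to_bool, default=True)
    parser.add_argument("--page-ranges")
    parser.add_argument("--timeout", type=int, default=600)
    parser.add_argument("--verbose", action="store_true")
    return parser
```

The mutually exclusive group makes `argparse` reject a command line that passes both `--input` and `--url`.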

Page Ranges

The --page-ranges parameter allows you to convert only specific pages:

# Convert pages 1-50 and 100-150
python ~/.claude/skills/mineru-pdf-converter/scripts/mineru_convert.py \
  --input "/path/to/document.pdf" \
  --page-ranges "1-50,100-150" \
  --token-file "$HOME/.claude/skills/mineru-pdf-converter/references/mineru-token.md"

How it works:

- Local files (--input): Pages are extracted client-side using PyMuPDF before upload
- URL input (--url): Page ranges are sent to the MinerU API and applied server-side

Page ranges therefore work for both local files and URLs.
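To make the client-side case concrete, the sketch below parses a range string such as "1-50,100-150" into page numbers. It assumes page numbers in the string are 1-based and inclusive, which matches the example comment above but is not stated explicitly; both function names are hypothetical. The zero-based list it produces is the kind of input PyMuPDF's `Document.select()` accepts.

```python
def parse_page_ranges(spec: str) -> list[tuple[int, int]]:
    """Parse "1-50,100-150" into [(1, 50), (100, 150)] (1-based, inclusive)."""
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start_text, sep, end_text = part.partition("-")
        start = int(start_text)
        # A bare number such as "7" is treated as the single-page range 7-7.
        end = int(end_text) if sep else start
        if start < 1 or end < start:
            raise ValueError(f"Invalid page range: {part!r}")
        ranges.append((start, end))
    return ranges


def to_zero_based_pages(ranges: list[tuple[int, int]]) -> list[int]:
    """Expand 1-based inclusive ranges into a flat list of zero-based page indices."""
    pages = []
    for start, end in ranges:
        pages.extend(range(start - 1, end))
    return pages
```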

Large PDF Handling

PDFs over 600 pages are handled automatically:

- The PDF is split into chunks of at most 500 pages using PyMuPDF
- Each chunk is converted separately via the API
- The output Markdown files are merged in order, with page markers
- Temporary chunk files are cleaned up

To handle large PDFs, ensure PyMuPDF is installed:

pip install pymupdf
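The split-and-merge arithmetic described in this section can be sketched without any PDF library. The 600-page threshold and 500-page chunk size come from the text above; the HTML-comment page marker is an assumed format, since the document does not say what the markers look like, and `plan_chunks` and `merge_markdown` are hypothetical names.

```python
def plan_chunks(page_count: int, threshold: int = 600,
                chunk_size: int = 500) -> list[tuple[int, int]]:
    """Return 1-based inclusive (first, last) page ranges, one per chunk."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    if page_count <= threshold:
        # Small enough to convert in a single request.
        return [(1, page_count)]
    return [(first, min(first + chunk_size - 1, page_count))
            for first in range(1, page_count + 1, chunk_size)]


def merge_markdown(chunks: list[tuple[tuple[int, int], str]]) -> str:
    """Join per-chunk Markdown in order, prefixing each with a page marker."""
    parts = []
    for (first, last), text in chunks:
        # Assumption: the marker format is illustrative only.
        parts.append(f"<!-- pages {first}-{last} -->\n\n{text.strip()}\n")
    return "\n".join(parts)


print(plan_chunks(1200))
```

Each planned range would be extracted into a temporary PDF, converted, and its Markdown passed to the merge step in the same order.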

Model Selection

Model	Best For	Notes
vlm (default)	Complex layouts, formulas, tables	Higher accuracy, handles scanned documents
pipeline	Simple text documents	Faster processing, lower resource usage
MinerU-HTML	HTML output needed	Specialized for HTML output format

Error Handling

Error	Cause	Solution
Auth failed (401)	Invalid or expired token	Update token in references/mineru-token.md
Task timeout	Large file or slow server	Increase --timeout; retry later
Conversion failed	Unsupported format or corrupted file	Try pipeline model as fallback
Upload failed (413)	File >200MB	Split file manually first
Rate limit (429)	Exceeded 2000 pages/day quota	Wait until next day
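The HTTP status codes in this table lend themselves to a small lookup when reporting failures to the user. The sketch below only restates the table; `advice_for_status` is a hypothetical helper and the fallback message is an assumption.

```python
ADVICE_BY_STATUS = {
    401: "Auth failed: update the token in references/mineru-token.md.",
    413: "Upload failed: the file is over 200MB, split it manually first.",
    429: "Rate limit: the 2000 pages/day quota is exhausted, wait until the next day.",
}


def advice_for_status(status_code: int) -> str:
    """Map an HTTP status code from the table to its suggested solution."""
    return ADVICE_BY_STATUS.get(
        status_code,
        # Assumption: a generic fallback for codes the table does not list.
        f"Unexpected HTTP status {status_code}; check the API response for details.",
    )
```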

Output Structure

The conversion produces a ZIP file that is extracted to a subfolder named after the input file. This prevents naming conflicts when converting multiple PDFs in the same directory.

Default behavior (no --output-dir specified):

For input /path/to/paper.pdf, output is saved to /path/to/paper/:

/path/to/paper/
├── full.md               # Main Markdown file
├── images/               # Extracted images
│   ├── image_1.png
│   └── image_2.png
└── paper.json            # Structured content (optional)

With --output-dir specified:

When --output-dir /custom/path is provided, files are extracted directly to that directory (no subfolder created):

/custom/path/
├── full.md
├── images/
│   └── ...
└── paper.json
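The two directory layouts above follow one rule, sketched here: without `--output-dir` the destination is a sibling folder named after the input file, and with it the given directory is used as-is. The function names are hypothetical, and the extraction step assumes the result ZIP contains `full.md` at its top level, as the trees above suggest.

```python
import zipfile
from pathlib import Path
from typing import Optional


def resolve_output_dir(input_path: str, output_dir: Optional[str] = None) -> Path:
    """Pick the extraction directory for a converted document."""
    if output_dir is not None:
        # An explicit --output-dir is used directly, with no subfolder.
        return Path(output_dir).expanduser()
    source = Path(input_path).expanduser()
    # /path/to/paper.pdf -> /path/to/paper
    return source.parent / source.stem


def extract_result(zip_path: Path, dest: Path) -> Path:
    """Extract the result ZIP into dest and return the path of the main Markdown file."""
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return dest / "full.md"
```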

API Quota

- High priority: 2000 pages/day
- Low priority: Additional capacity (slower processing)
- Check remaining quota in the API response

Supporting Files

- scripts/mineru_convert.py - Main conversion orchestrator
- scripts/pdf_splitter.py - PDF splitting utility (PyMuPDF)
- scripts/merge_markdown.py - Output merger for chunked conversions
- references/api-reference.md - Full MinerU API documentation

Troubleshooting

Token Expired

The JWT token has an expiration date. If authentication fails:

- Log in to mineru.net
- Get a new API token
- Update ~/.claude/skills/mineru-pdf-converter/references/mineru-token.md
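Because the token is a JWT, its expiry can usually be checked locally before a request fails. The sketch below assumes the token carries the standard `exp` claim (seconds since the Unix epoch) in its payload; this document does not confirm that. It decodes the payload only and does not verify the signature, so it is a convenience check rather than validation.

```python
import base64
import json
import time
from typing import Optional


def jwt_expiry(token: str) -> Optional[int]:
    """Return the exp claim of a JWT as a Unix timestamp, or None if absent."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("Expected a three-part JWT")
    payload_b64 = parts[1]
    # base64url payloads usually omit padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    exp = payload.get("exp")
    return int(exp) if exp is not None else None


def is_expired(token: str, now: Optional[float] = None) -> bool:
    """True if the token has an exp claim that is not in the future."""
    exp = jwt_expiry(token)
    if exp is None:
        return False
    current = time.time() if now is None else now
    return current >= exp
```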

Conversion Hangs

For very large or complex documents:

- Increase timeout: --timeout 1200
- Use page ranges to convert in sections: --page-ranges "1-100"
- Try the pipeline model: --model pipeline
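To show what the `--timeout` value governs, here is a minimal polling loop with a deadline. It is a sketch, not the script's code: `get_state` stands in for whatever call fetches the task status, the `running` and `done` states come from the sample verbose output, and `failed` as a terminal state plus the 5-second interval are assumptions.

```python
import time
from typing import Callable


def wait_until_done(get_state: Callable[[], str],
                    timeout: float = 600.0,
                    interval: float = 5.0,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.monotonic) -> str:
    """Poll get_state until it reports a terminal state or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        state = get_state()
        # Assumption: "done" and "failed" are the terminal states.
        if state in ("done", "failed"):
            return state
        if clock() >= deadline:
            raise TimeoutError(f"Conversion still {state!r} after {timeout} seconds")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop easy to test without real waiting; a monotonic clock avoids surprises if the system time changes mid-run.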

Missing Formulas or Tables

Ensure recognition is enabled (default):

--enable-formula true
--enable-table true

For detailed API reference, see references/api-reference.md.
