pdf-extractor

Name: Pdf Extractor
Author: alibaba

// Extract text, tables, and form data from PDF documents for analysis and processing. Use when user asks to extract, parse, or analyze PDF files.

stars:9,709

forks:2,175

updated:January 26, 2026 at 00:42

SKILL.md

name: pdf-extractor
description: Extract text, tables, and form data from PDF documents for analysis and processing. Use when user asks to extract, parse, or analyze PDF files.

PDF Extractor Skill

You are a PDF extraction specialist. When the user asks to extract data from a PDF document, follow these instructions.

Instructions

Validate Input
- Confirm that a PDF file path is provided.
- If no directory is given, assume the PDF file is in the current working directory.
- Use the shell or read_file tool to check that the file exists.
- Verify that it is a valid PDF.
Extract Content
- Execute the extraction script using the shell tool:
```
python scripts/extract_pdf.py <pdf_file_path>
```
- The script outputs the extracted data as JSON.
Process Results
- Parse the JSON output from the script.
- Structure the data in a readable format.
- Handle any encoding issues (UTF-8, special characters).
Present Output
- Summarize what was extracted.
- Present the data in the requested format (JSON, Markdown, or plain text).
- Highlight any issues or limitations.
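
The validation, extraction, and parsing steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the skill's own code: the helper names `looks_like_pdf` and `run_extraction` are hypothetical, the `%PDF-` header check is only a heuristic for "valid PDF", and the sketch assumes the script exits with a non-zero status on failure, which the skill does not state.

```python
import json
import subprocess
import sys
from pathlib import Path


def looks_like_pdf(path: Path) -> bool:
    """Heuristic check: the file exists and '%PDF-' appears near its start."""
    if not path.is_file():
        return False
    with path.open("rb") as fh:
        return b"%PDF-" in fh.read(1024)


def run_extraction(pdf_path: str, script: str = "scripts/extract_pdf.py") -> dict:
    """Run the extraction script and parse the JSON it prints to stdout."""
    path = Path(pdf_path)  # relative paths resolve against the working directory
    if not looks_like_pdf(path):
        raise ValueError(f"{pdf_path} is missing or does not look like a PDF")
    # sys.executable stands in for `python` so the current interpreter is reused.
    proc = subprocess.run(
        [sys.executable, script, str(path)],
        capture_output=True,
        text=True,
        encoding="utf-8",
        errors="replace",  # tolerate undecodable bytes in the script output
    )
    # Assumption: a non-zero exit status signals a script error.
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr.strip() or "extraction script failed")
    return json.loads(proc.stdout)
```

Decoding with `errors="replace"` is one way to keep going when the output contains bytes that are not valid UTF-8; stricter handling may be preferable when exact text matters.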

Script Location

The extraction script is located at: scripts/extract_pdf.py

Output Format

The script returns JSON:

{
  "success": true,
  "filename": "report.pdf",
  "text": "Full text content...",
  "page_count": 10,
  "tables": [
    {
      "page": 1,
      "data": [["Header1", "Header2"], ["Value1", "Value2"]]
    }
  ],
  "metadata": {
    "title": "Document Title",
    "author": "Author Name",
    "created": "2024-01-01"
  }
}
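
To present the `tables` entries in Markdown, each `data` array can be turned into a Markdown table. The sketch below uses a hypothetical helper, `table_to_markdown`, and assumes the first row of `data` holds the column headers, as in the sample above; the skill itself does not guarantee this.

```python
def table_to_markdown(rows: list[list[str]]) -> str:
    """Render one table's `data` rows as a Markdown table.

    Assumes the first row holds the column headers.
    """
    if not rows:
        return ""
    header, *body = rows
    width = len(header)

    def fmt(cells):
        # Escape pipes, then pad or trim ragged rows to the header width.
        cells = ["" if c is None else str(c).replace("|", "\\|") for c in cells]
        cells = cells[:width] + [""] * (width - len(cells))
        return "| " + " | ".join(cells) + " |"

    lines = [fmt(header), "| " + " | ".join(["---"] * width) + " |"]
    lines += [fmt(row) for row in body]
    return "\n".join(lines)


# Sample taken from the output format shown above.
result = {"tables": [{"page": 1, "data": [["Header1", "Header2"], ["Value1", "Value2"]]}]}
for table in result["tables"]:
    print(f"Table on page {table['page']}:")
    print(table_to_markdown(table["data"]))
```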

Error Handling

If extraction fails:

- File not found: Ask the user to verify the file path.
- Invalid PDF: Inform the user that the file may be corrupted.
- Encrypted PDF: Request the password or inform the user that the file is encrypted.
- Script error: Report the specific error message.
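
One way to apply these rules is to map the script's failure message onto the four cases. The skill does not document what a failed result looks like, so this sketch assumes a hypothetical shape, `{"success": false, "error": "<message>"}`, and matches on keywords in the message; both the `error` field and the helper name `advise_on_failure` are assumptions.

```python
def advise_on_failure(result: dict) -> str:
    """Map a failed extraction result to a suggested reply to the user.

    Assumes failures carry a hypothetical "error" string field; the real
    script may report errors differently.
    """
    message = str(result.get("error", "unknown error"))
    lowered = message.lower()
    if "not found" in lowered or "no such file" in lowered:
        return "File not found: please verify the file path."
    if "encrypt" in lowered or "password" in lowered:
        return "The PDF is encrypted: please provide the password."
    if "invalid" in lowered or "corrupt" in lowered:
        return "The file may be corrupted or is not a valid PDF."
    # Anything unrecognized is reported verbatim as a script error.
    return f"Script error: {message}"
```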

Examples

Example 1: Simple text extraction

User: "Extract text from report.pdf"
Action: Execute script, return full text content

Example 2: Table extraction

User: "Get the tables from financial-report.pdf"
Action: Execute script, extract and format table data

Example 3: Metadata extraction

User: "What's the metadata of document.pdf?"
Action: Execute script, return document properties
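
All three examples run the same script and differ only in which part of the JSON result is returned. A small sketch of that selection, using the field names from the output format above; the helper name `select_view` and the `want` keywords are illustrative assumptions.

```python
def select_view(result: dict, want: str):
    """Return the part of the extraction result that a request asks for."""
    views = {
        "text": result.get("text", ""),          # Example 1
        "tables": result.get("tables", []),      # Example 2
        "metadata": result.get("metadata", {}),  # Example 3
    }
    if want not in views:
        raise ValueError(f"unknown view: {want!r}")
    return views[want]
```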

related-skills.json

same repository

sample-skill.md

from "alibaba/spring-ai-alibaba"

Sample skill fixture for classpath registry enhancement tests.

2026-03-30 · 9.7k

copywriting.md

from "alibaba/spring-ai-alibaba"

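Product copywriting assistant. Generates engaging marketing copy from product information. Use this skill when the user mentions "写文案" (write copy), "商品描述" (product description), or "营销文案" (marketing copy).

2026-03-28 · 9.7k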

product-selection.md

from "alibaba/spring-ai-alibaba"

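Product selection analysis assistant. Analyzes market trends and user needs to recommend suitable product categories. Use this skill when the user mentions "选品" (product selection), "商品推荐" (product recommendation), or "品类分析" (category analysis).

2026-03-28 · 9.7k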

inventory-management.md

from "alibaba/spring-ai-alibaba"

Database schema and business logic for inventory tracking including products, warehouses, and stock levels.

2026-03-06 · 9.7k

sales-analytics.md

from "alibaba/spring-ai-alibaba"

Database schema and business logic for sales data analysis including customers, orders, and revenue.

2026-03-06 · 9.7k

grouped-tools-test.md

from "alibaba/spring-ai-alibaba"

Test skill for groupedTools. When executing this skill, use the record_result tool to record the result value.

2026-01-31 · 9.7k

package.json

"author": "alibaba"

"repository": "alibaba/spring-ai-alibaba"
