image-to-text

Name: Image To Text
Author: pascalorg

Extract text from images using OCR. Use when the user shares a screenshot and you need to read the text content, copy UI labels, or extract copy from a design mockup.

stars:77

forks:19

updated:2026-03-06 19:27

SKILL.md

name	image-to-text
description	Extract text from images using OCR. Use when the user shares a screenshot and you need to read the text content, copy UI labels, or extract copy from a design mockup.
metadata	{"author":"pascalorg","version":"1.0.0"}

Image to Text

Extract all readable text from an image using OCR (Tesseract). Returns the full text content along with word-level and line-level bounding boxes and confidence scores.

When to Use

Reading text content from a screenshot or design mockup
Extracting UI copy (labels, buttons, headings) so you don't have to retype it
Getting text positions and bounding boxes from a design image

How It Works

The image is passed to Tesseract.js for optical character recognition.
Tesseract segments the image into lines and words.
The script returns the full text plus word-level and line-level details (position, confidence).

Usage

bash <skill-path>/scripts/image-to-text.sh <image-path> [language]

Arguments:

image-path — Path to the image file (required)
language — OCR language code (optional, defaults to eng). Common: eng, fra, deu, spa, chi_sim, jpn

Examples:

# Extract text from a screenshot
bash <skill-path>/scripts/image-to-text.sh ./screenshot.png

# Extract French text
bash <skill-path>/scripts/image-to-text.sh ./mockup.png fra
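
When the script is called from another program rather than typed by hand, the same invocation can be wrapped as in the Python sketch below. It assumes the script prints the JSON document described under Output to standard output, which this document does not state explicitly. The function names `build_command` and `run_image_to_text` are hypothetical, and the skill path is supplied by the caller.

```python
import json
import subprocess


def build_command(skill_path: str, image_path: str, language: str = "eng") -> list[str]:
    """Build the argument list for scripts/image-to-text.sh (hypothetical helper)."""
    script = f"{skill_path.rstrip('/')}/scripts/image-to-text.sh"
    # The language code is always passed; "eng" mirrors the documented default.
    return ["bash", script, image_path, language]


def run_image_to_text(skill_path: str, image_path: str, language: str = "eng") -> dict:
    """Run the script and parse its output.

    Assumption: the script writes a single JSON document to stdout.
    """
    completed = subprocess.run(
        build_command(skill_path, image_path, language),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return json.loads(completed.stdout)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with image paths that contain spaces.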

Output

{
  "text": "Request work\nSuggestions\nPlumbing\nHVAC\nCleaning\nElectrical",
  "confidence": 87.4,
  "words": [
    {
      "text": "Request",
      "confidence": 94.2,
      "bbox": { "x0": 142, "y0": 180, "x1": 268, "y1": 204 }
    },
    {
      "text": "work",
      "confidence": 96.1,
      "bbox": { "x0": 274, "y0": 180, "x1": 332, "y1": 204 }
    }
  ],
  "lines": [
    {
      "text": "Request work",
      "confidence": 95.1,
      "bbox": { "x0": 142, "y0": 180, "x1": 332, "y1": 204 }
    }
  ]
}

Field	Type	Description
text	String	Full extracted text, newline-separated
confidence	Number	Overall confidence score (0-100)
words	Array	Each word with text, confidence, and bounding box
lines	Array	Each line with text, confidence, and bounding box
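
For code that consumes this JSON, the fields in the table map onto a small typed structure. The Python sketch below is one hypothetical reading of the sample output: the class and function names are illustrative, and it assumes `x0`/`y0` is the top-left corner and `x1`/`y1` the bottom-right corner in pixels, which the sample values suggest but the document does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def width(self) -> int:
        return self.x1 - self.x0

    @property
    def height(self) -> int:
        return self.y1 - self.y0


@dataclass(frozen=True)
class Item:
    """One entry of the `words` or `lines` array."""
    text: str
    confidence: float
    bbox: Box


def parse_items(entries: list[dict]) -> list[Item]:
    """Convert raw `words` or `lines` entries into typed items."""
    return [Item(e["text"], e["confidence"], Box(**e["bbox"])) for e in entries]


# The word and line entries from the sample output above.
sample = {
    "words": [
        {"text": "Request", "confidence": 94.2,
         "bbox": {"x0": 142, "y0": 180, "x1": 268, "y1": 204}},
        {"text": "work", "confidence": 96.1,
         "bbox": {"x0": 274, "y0": 180, "x1": 332, "y1": 204}},
    ],
    "lines": [
        {"text": "Request work", "confidence": 95.1,
         "bbox": {"x0": 142, "y0": 180, "x1": 332, "y1": 204}},
    ],
}

words = parse_items(sample["words"])
lines = parse_items(sample["lines"])
for item in words + lines:
    print(item.text, item.bbox.width, item.bbox.height)
```

Under the corner assumption, the line box spans from the left edge of the first word to the right edge of the last one.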

Present Results to User

After extracting text, present the content grouped by lines:

Extracted text (87.4% confidence):

  Request work
  Suggestions
  Plumbing
  HVAC
  Cleaning
  Electrical

Found 6 lines, 7 words.

Use the extracted text directly when implementing UI copy from a design.
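
The presentation format above can be produced from a parsed result with a few lines of Python. This is a minimal sketch, not the skill's own code: `format_result` is a hypothetical helper, it takes the counts from the lengths of the `lines` and `words` arrays, and the input in the usage example is made up for illustration.

```python
def format_result(result: dict) -> str:
    """Render a parsed OCR result in the 'Extracted text' layout."""
    line_texts = [line["text"] for line in result["lines"]]
    # Indent each recognized line by two spaces, one line per row.
    body = "\n".join(f"  {text}" for text in line_texts)
    return (
        f"Extracted text ({result['confidence']}% confidence):\n\n"
        f"{body}\n\n"
        f"Found {len(line_texts)} lines, {len(result['words'])} words."
    )


# Illustrative input only; not output from a real run.
example = {
    "confidence": 91.5,
    "words": [{"text": "Save"}, {"text": "changes"}, {"text": "Cancel"}],
    "lines": [{"text": "Save changes"}, {"text": "Cancel"}],
}
print(format_result(example))
```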

Troubleshooting

Low confidence / garbled text — Tesseract works best with clean, high-contrast text. Screenshots of rendered UI work well. Photos of text at angles or with noise may produce poor results.

Wrong language — Pass the correct language code as the second argument. Tesseract needs the right language model to recognize characters.

First run is slow — Tesseract downloads language data (~4MB for English) on the first run. Subsequent runs are faster.
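
When the overall confidence is low, it can help to see which words pull it down before deciding whether to retry with a cleaner image or a different language code. A minimal Python sketch, assuming the output structure shown under Output; the function name and the default threshold of 80 are arbitrary illustrative choices, not values defined by the skill.

```python
def low_confidence_words(result: dict, threshold: float = 80.0) -> list[dict]:
    """Return word entries whose confidence is below the threshold, lowest first."""
    flagged = [w for w in result["words"] if w["confidence"] < threshold]
    return sorted(flagged, key=lambda w: w["confidence"])


# The two word entries from the sample output, with bounding boxes omitted.
sample_result = {
    "words": [
        {"text": "Request", "confidence": 94.2},
        {"text": "work", "confidence": 96.1},
    ]
}
print([w["text"] for w in low_confidence_words(sample_result, threshold=95.0)])
# prints ['Request']
```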

related-skills.json

Same repository

agent-collaboration.md

from "pascalorg/skills"

Multi-model agent orchestration using specialized agents for planning, coding, research, math/science, visual analysis, and adversarial review. Use when tasks are complex enough to benefit from different models' strengths, when you want adversarial review to catch blind spots, or when coordinating multi-step workflows across agent roles. Triggers on complex projects, multi-step tasks, architecture decisions, or when explicitly requested.

2026-04-03 · stars:77

web-design.md

from "pascalorg/skills"

Web design reference for building production-grade interfaces. Covers layout, typography, color, spacing, shadows, animation, accessibility, responsive design, components, performance, and UX psychology. Use when building UI, reviewing design quality, choosing design tokens, or making any visual design decision.

2026-03-17 · stars:77

contrast-check.md

from "pascalorg/skills"

Check color contrast ratios against WCAG AA and AAA accessibility standards. Use when the user wants to verify if their color combinations are accessible, check contrast between text and background colors, or audit a palette for accessibility.

2026-03-06 · stars:77

image-compare.md

from "pascalorg/skills"

Compare two images pixel-by-pixel and get a visual diff. Use when the user wants to compare their implementation against a design, spot differences between two screenshots, or verify visual regression.

2026-03-06 · stars:77

image-analysis.md

from "pascalorg/skills"

Extract color palettes from images (screenshots, Figma exports, design mockups) to help implement matching UI. Use when the user shares a screenshot, design image, or asks to "match these colors", "extract colors from this image", "implement this design", or "get the color palette".

2026-03-06 · stars:77

package.json

"author": "pascalorg"

"repository": "pascalorg/skills"
