| name | ddddocr |
| description | DDDDOCR OCR recognition service with MCP protocol support. Provides optical character recognition, object detection, and slide matching capabilities. Use for: Recognizing text from captcha images, Detecting objects/text regions in images, Matching slide positions for verification codes, Performing any OCR-related tasks through MCP protocol. |
DDDDOCR Service
Quick Start
Start the ddddocr service with all features enabled:
python scripts/start_ddddocr.py
The script automatically:
- Checks if service is already running
- Downloads the latest ddddocr binary for current platform if needed
- Starts service with ocr, det, slide, and mcp features
- Binds to 127.0.0.1:8000 by default
Command Line Tools
Use the provided scripts for quick OCR operations:
OCR Recognition
python scripts/ocr.py <image_path> [--color-filter FILTER] [--charset-range RANGE] [--text-only]
Examples:
python scripts/ocr.py image/3.png
python scripts/ocr.py image/3.png --text-only
python scripts/ocr.py image/3.png --color-filter green --charset-range "0123456789"
Object Detection
python scripts/det.py <image_path> [--json]
Examples:
python scripts/det.py image/3.png
python scripts/det.py image/3.png --json
Slide Matching
python scripts/slide.py <target_path> <background_path> [--algorithm match|comparison] [--simple-target] [--json]
Examples:
python scripts/slide.py image/su.png image/bg.png
python scripts/slide.py image/su.png image/bg.png --algorithm comparison
python scripts/slide.py image/target.png image/bg.png --simple-target --json
Core Capabilities
1. OCR Recognition
Recognize text from images, supports color filtering and character range specification.
Use cases:
- Captcha recognition (numeric, alphanumeric, Chinese)
- Text extraction from images
- Custom character set recognition
Endpoint: POST /ocr
2. Object Detection
Detect text regions and objects in images.
Use cases:
- Point-and-click captcha verification
- Text region localization
- Multiple object detection
Endpoint: POST /det
3. Slide Matching
Match slide images with background positions.
Algorithm 1 (slide-match): Template matching for transparent slides
Algorithm 2 (slide-comparison): Difference-based comparison
Use cases:
- Slide captcha verification
- Image positioning
Endpoints: POST /slide-match, POST /slide-comparison
MCP Protocol
The service implements the Model Context Protocol for AI agent integration.
Endpoint: POST http://127.0.0.1:8000/mcp
Available MCP tools:
ocr - OCR recognition with optional color filtering and character range
det - Object detection returning bounding boxes
slide_match - Slide matching (algorithm 1)
slide_comparison - Slide comparison (algorithm 2)
See references/mcp.md for MCP protocol details.
REST API
The service also provides a REST API:
| Endpoint | Method | Description |
|---|
/ocr | POST | OCR recognition |
/det | POST | Object detection |
/slide-match | POST | Slide matching |
/slide-comparison | POST | Slide comparison |
/status | GET | Service status |
/docs | GET | Swagger UI documentation |
See references/api.md for detailed API documentation.
Usage Examples
OCR Recognition
import requests
import base64
with open("image.png", "rb") as f:
image_b64 = base64.b64encode(f.read()).decode()
response = requests.post("http://127.0.0.1:8000/ocr", json={
"image": image_b64,
"color_filter": "green",
"charset_range": "0123456789"
})
print(response.json())
Object Detection
response = requests.post("http://127.0.0.1:8000/det", json={
"image": image_b64
})
print(response.json())
Slide Matching
with open("target.png", "rb") as f:
target_b64 = base64.b64encode(f.read()).decode()
with open("background.png", "rb") as f:
bg_b64 = base64.b64encode(f.read()).decode()
response = requests.post("http://127.0.0.1:8000/slide-match", json={
"target_image": target_b64,
"background_image": bg_b64,
"simple_target": True
})
print(response.json())
Color Filtering
Supported presets: red, blue, green, yellow, orange, purple, cyan, black, white, gray
HSV ranges can also be specified as array of tuples: [(min_h, min_s, min_v), (max_h, max_s, max_v)]
Character Range Values
| Value | Description |
|---|
| 0 | Pure integers 0-9 |
| 1 | Pure lowercase a-z |
| 2 | Pure uppercase A-Z |
| 3 | Lowercase + Uppercase |
| 4 | Lowercase + 0-9 |
| 5 | Uppercase + 0-9 |
| 6 | Lowercase + Uppercase + 0-9 |
| 7 | Default full character set |
Custom string can also be used: "0123456789+-x/=?"
Service Status
Check if service is running:
curl http://127.0.0.1:8000/status
Response:
{
"code": 200,
"msg": "success",
"data": {
"service_status": "running",
"enabled_features": ["ocr", "det", "slide", "mcp"]
}
}