image-to-data
Extract data from construction images using AI Vision. Analyze site photos, scanned documents, drawings.
| Field | Value |
|---|---|
| name | image-to-data |
| description | Extract data from construction images using AI Vision. Analyze site photos, scanned documents, drawings. |
Based on DDC methodology (Chapter 2.4), this skill extracts structured data from construction images using computer vision, OCR, and AI models to analyze site photos, scanned documents, and drawings.
Book Reference: "Преобразование данных в структурированную форму" / "Data Transformation to Structured Form"
```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Dict, Optional, Any, Tuple
from datetime import datetime
import json
import base64


class ImageType(Enum):
    """Types of construction images"""
    SITE_PHOTO = "site_photo"
    SCANNED_DOCUMENT = "scanned_document"
    FLOOR_PLAN = "floor_plan"
    ELEVATION = "elevation"
    DETAIL_DRAWING = "detail_drawing"
    PROGRESS_PHOTO = "progress_photo"
    SAFETY_PHOTO = "safety_photo"
    DEFECT_PHOTO = "defect_photo"
    MATERIAL_PHOTO = "material_photo"
    EQUIPMENT_PHOTO = "equipment_photo"


class ExtractionType(Enum):
    """Types of data extraction"""
    OCR_TEXT = "ocr_text"
    TABLE = "table"
    OBJECT_DETECTION = "object_detection"
    MEASUREMENT = "measurement"
    CLASSIFICATION = "classification"
    PROGRESS = "progress"
```
```python
@dataclass
class BoundingBox:
    """Bounding box for detected region"""
    x: int
    y: int
    width: int
    height: int
    confidence: float = 1.0


@dataclass
class TextRegion:
    """Extracted text region from image"""
    text: str
    bbox: BoundingBox
    confidence: float
    language: str = "en"


@dataclass
class DetectedObject:
    """Detected object in image"""
    label: str
    bbox: BoundingBox
    confidence: float
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ExtractedTable:
    """Extracted table from image"""
    headers: List[str]
    rows: List[List[str]]
    bbox: BoundingBox
    confidence: float


@dataclass
class ProgressMeasurement:
    """Progress measurement from image"""
    element_type: str
    total_count: int
    completed_count: int
    percent_complete: float
    area_sqft: Optional[float] = None
    volume_cuft: Optional[float] = None


@dataclass
class ImageAnalysisResult:
    """Complete image analysis result"""
    image_id: str
    image_type: ImageType
    text_regions: List[TextRegion]
    detected_objects: List[DetectedObject]
    tables: List[ExtractedTable]
    progress: Optional[ProgressMeasurement] = None
    metadata: Dict[str, Any] = field(default_factory=dict)
    processing_time: float = 0.0
```
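Because the result types are plain dataclasses, they can be serialized generically instead of field by field. The following is a minimal sketch of that idea, not part of the skill itself: `MiniResult`, `_encode_enum`, and `result_to_json` are hypothetical names, and `MiniResult` is a trimmed-down stand-in for `ImageAnalysisResult` so the example stays self-contained.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import Any, Dict, List


class ImageType(Enum):
    SITE_PHOTO = "site_photo"


@dataclass
class BoundingBox:
    x: int
    y: int
    width: int
    height: int
    confidence: float = 1.0


@dataclass
class TextRegion:
    text: str
    bbox: BoundingBox
    confidence: float
    language: str = "en"


@dataclass
class MiniResult:
    """Hypothetical, trimmed-down stand-in for ImageAnalysisResult."""
    image_id: str
    image_type: ImageType
    text_regions: List[TextRegion]
    metadata: Dict[str, Any] = field(default_factory=dict)


def _encode_enum(obj: Any) -> Any:
    """Fallback for json.dumps: serialize Enum members by their value."""
    if isinstance(obj, Enum):
        return obj.value
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


def result_to_json(result: MiniResult) -> str:
    # asdict() recurses into nested dataclasses and lists, so bounding
    # boxes become plain dicts; Enum members pass through unchanged and
    # are converted by the fallback hook above.
    return json.dumps(asdict(result), default=_encode_enum, indent=2)


if __name__ == "__main__":
    region = TextRegion("SCALE: 1:100", BoundingBox(100, 120, 150, 20, 0.88), 0.88)
    print(result_to_json(MiniResult("img_001", ImageType.SITE_PHOTO, [region])))
```

The trade-off is that this emits every field, including bounding boxes, whereas a hand-written export can keep the payload small.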
```python
class OCREngine:
    """OCR engine for text extraction"""

    def __init__(self, engine: str = "tesseract"):
        self.engine = engine
        self.supported_languages = ["en", "ru", "de", "fr", "es"]

    def extract_text(
        self,
        image_data: bytes,
        language: str = "en"
    ) -> List[TextRegion]:
        """Extract text from image"""
        # Simulated OCR extraction (use actual OCR library in production)
        # In production: pytesseract, EasyOCR, or cloud OCR services
        regions = []

        # Simulate detecting title block in drawing
        regions.append(TextRegion(
            text="PROJECT: OFFICE BUILDING",
            bbox=BoundingBox(x=100, y=50, width=300, height=30, confidence=0.95),
            confidence=0.95,
            language=language
        ))
        regions.append(TextRegion(
            text="DRAWING: A-101",
            bbox=BoundingBox(x=100, y=90, width=200, height=25, confidence=0.92),
            confidence=0.92,
            language=language
        ))
        regions.append(TextRegion(
            text="SCALE: 1:100",
            bbox=BoundingBox(x=100, y=120, width=150, height=20, confidence=0.88),
            confidence=0.88,
            language=language
        ))
        return regions

    def extract_structured_text(
        self,
        image_data: bytes,
        template: Optional[Dict] = None
    ) -> Dict[str, str]:
        """Extract structured text using template matching"""
        # Extract text regions
        regions = self.extract_text(image_data)

        # Match to template fields
        structured = {}
        if template:
            for field_name, field_config in template.items():
                keyword = field_config.get("keyword")
                if not keyword:
                    continue  # Skip template fields that define no keyword
                # Find the first region containing the keyword (case-insensitive)
                for region in regions:
                    if keyword.lower() in region.text.lower():
                        structured[field_name] = region.text
                        break
        else:
            # Default extraction. Split on the first colon only, so that
            # values such as "1:100" are kept whole.
            for region in regions:
                if "PROJECT:" in region.text:
                    structured["project_name"] = region.text.split(":", 1)[1].strip()
                elif "DRAWING:" in region.text:
                    structured["drawing_number"] = region.text.split(":", 1)[1].strip()
                elif "SCALE:" in region.text:
                    structured["scale"] = region.text.split(":", 1)[1].strip()
        return structured
```
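The default title-block extraction relies on fixed `LABEL: value` strings. As a hedged illustration of the same idea in a more tolerant form, the sketch below parses OCR lines with a regular expression that splits on the first colon only, so a scale such as `1:100` survives intact. `FIELD_LABELS` and `parse_title_block` are hypothetical names, not part of the skill.

```python
import re
from typing import Dict, Iterable

# Hypothetical mapping from title-block labels to output field names.
FIELD_LABELS = {
    "PROJECT": "project_name",
    "DRAWING": "drawing_number",
    "SCALE": "scale",
}

# "LABEL: value". The label may contain only letters and spaces, so the
# match always stops at the first colon on the line.
_LINE_PATTERN = re.compile(r"^\s*([A-Za-z ]+?)\s*:\s*(.+?)\s*$")


def parse_title_block(lines: Iterable[str]) -> Dict[str, str]:
    """Map recognized 'LABEL: value' lines to named fields."""
    fields: Dict[str, str] = {}
    for line in lines:
        match = _LINE_PATTERN.match(line)
        if not match:
            continue  # Not a key/value line
        label, value = match.group(1).upper(), match.group(2)
        key = FIELD_LABELS.get(label)
        # Keep the first occurrence of each field and ignore unknown labels.
        if key and key not in fields:
            fields[key] = value
    return fields


if __name__ == "__main__":
    ocr_lines = ["PROJECT: OFFICE BUILDING", "DRAWING: A-101", "SCALE: 1:100"]
    print(parse_title_block(ocr_lines))
```

Real OCR output is noisier than this (misread characters, labels and values in separate regions), so a production parser would likely also need fuzzy label matching or position-based pairing.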
```python
class ObjectDetector:
    """Object detection for construction images"""

    def __init__(self, model: str = "yolov8"):
        self.model = model
        self.construction_classes = self._load_construction_classes()

    def _load_construction_classes(self) -> Dict[str, Dict]:
        """Load construction-specific object classes"""
        return {
            # Equipment
            "excavator": {"category": "equipment", "safety_zone": 20},
            "crane": {"category": "equipment", "safety_zone": 30},
            "forklift": {"category": "equipment", "safety_zone": 10},
            "concrete_mixer": {"category": "equipment", "safety_zone": 5},
            "scaffolding": {"category": "equipment", "safety_zone": 5},
            # Safety
            "hard_hat": {"category": "ppe", "required": True},
            "safety_vest": {"category": "ppe", "required": True},
            "safety_glasses": {"category": "ppe", "required": False},
            "harness": {"category": "ppe", "required": False},
            # Materials
            "rebar_bundle": {"category": "material", "unit": "bundle"},
            "concrete_block": {"category": "material", "unit": "pallet"},
            "lumber_stack": {"category": "material", "unit": "bundle"},
            "pipe_stack": {"category": "material", "unit": "bundle"},
            # Workers
            "worker": {"category": "person", "track": True},
            # Building elements
            "column": {"category": "structure"},
            "beam": {"category": "structure"},
            "slab": {"category": "structure"},
            "wall": {"category": "structure"},
        }

    def detect(
        self,
        image_data: bytes,
        confidence_threshold: float = 0.5
    ) -> List[DetectedObject]:
        """Detect objects in image"""
        # Simulated detection (use actual model in production)
        # In production: YOLO, Faster R-CNN, etc.
        detected = []

        # Simulate detected objects
        sample_detections = [
            ("worker", 0.92, BoundingBox(200, 300, 80, 180, 0.92)),
            ("hard_hat", 0.88, BoundingBox(210, 300, 30, 25, 0.88)),
            ("safety_vest", 0.85, BoundingBox(210, 340, 60, 80, 0.85)),
            ("scaffolding", 0.78, BoundingBox(400, 100, 200, 400, 0.78)),
            ("concrete_block", 0.72, BoundingBox(50, 450, 100, 50, 0.72)),
        ]
        for label, conf, bbox in sample_detections:
            if conf >= confidence_threshold:
                class_info = self.construction_classes.get(label, {})
                detected.append(DetectedObject(
                    label=label,
                    bbox=bbox,
                    confidence=conf,
                    attributes=class_info
                ))
        return detected

    def detect_safety_compliance(
        self,
        image_data: bytes
    ) -> Dict:
        """Detect safety compliance in image"""
        objects = self.detect(image_data)
        workers = [o for o in objects if o.label == "worker"]
        hard_hats = [o for o in objects if o.label == "hard_hat"]
        vests = [o for o in objects if o.label == "safety_vest"]

        compliance = {
            "workers_detected": len(workers),
            "hard_hats_detected": len(hard_hats),
            "vests_detected": len(vests),
            "hard_hat_compliance": len(hard_hats) / len(workers) if workers else 1.0,
            "vest_compliance": len(vests) / len(workers) if workers else 1.0,
            "overall_compliance": "compliant" if len(hard_hats) >= len(workers) else "non-compliant",
            "violations": []
        }
        if len(hard_hats) < len(workers):
            compliance["violations"].append({
                "type": "missing_hard_hat",
                "count": len(workers) - len(hard_hats)
            })
        return compliance
```
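`detect_safety_compliance` compares counts only, so a hard hat lying on the ground counts the same as one being worn. One possible refinement, sketched here under stated assumptions, is to pair each PPE box with a worker box by spatial overlap. `Box`, `overlap_fraction`, `workers_missing_ppe`, and the 0.5 threshold are hypothetical choices for illustration, not part of the skill.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixels: top-left corner plus size."""
    x: int
    y: int
    width: int
    height: int


def overlap_fraction(inner: Box, outer: Box) -> float:
    """Fraction of `inner`'s area that lies inside `outer` (0.0 to 1.0)."""
    left = max(inner.x, outer.x)
    top = max(inner.y, outer.y)
    right = min(inner.x + inner.width, outer.x + outer.width)
    bottom = min(inner.y + inner.height, outer.y + outer.height)
    if right <= left or bottom <= top:
        return 0.0  # The boxes do not intersect
    inner_area = inner.width * inner.height
    if inner_area == 0:
        return 0.0
    return (right - left) * (bottom - top) / inner_area


def workers_missing_ppe(
    workers: Sequence[Box],
    ppe_items: Sequence[Box],
    min_overlap: float = 0.5,
) -> List[int]:
    """Return indexes of workers with no PPE box mostly inside their box."""
    missing = []
    for index, worker in enumerate(workers):
        covered = any(
            overlap_fraction(item, worker) >= min_overlap for item in ppe_items
        )
        if not covered:
            missing.append(index)
    return missing


if __name__ == "__main__":
    workers = [Box(200, 300, 80, 180), Box(500, 300, 80, 180)]
    hard_hats = [Box(210, 300, 30, 25)]
    print(workers_missing_ppe(workers, hard_hats))
```

This sketch lets one PPE box satisfy several overlapping workers; a stricter version would assign each item to at most one worker.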
```python
class TableExtractor:
    """Extract tables from images"""

    def extract_tables(
        self,
        image_data: bytes,
        detect_headers: bool = True
    ) -> List[ExtractedTable]:
        """Extract tables from image"""
        # Simulated table extraction
        # In production: a table-recognition model or custom CNN for images.
        # Camelot and Tabula only handle text-based PDFs, not scanned images.
        tables = []

        # Simulate a schedule table (durations count both start and end day)
        tables.append(ExtractedTable(
            headers=["Activity", "Start", "End", "Duration"],
            rows=[
                ["Foundation", "2024-01-01", "2024-01-15", "15 days"],
                ["Framing", "2024-01-16", "2024-02-28", "44 days"],
                ["MEP Rough-in", "2024-03-01", "2024-03-31", "31 days"]
            ],
            bbox=BoundingBox(50, 200, 500, 200, 0.85),
            confidence=0.85
        ))
        return tables

    def table_to_dataframe(self, table: ExtractedTable) -> Dict:
        """Convert table to dictionary (DataFrame-like)"""
        return {
            "columns": table.headers,
            "data": table.rows,
            "records": [
                dict(zip(table.headers, row))
                for row in table.rows
            ]
        }
```
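Once a table has been reduced to headers and rows, it can be handed to spreadsheet tools as CSV. The sketch below shows one way to do that with the standard library; `table_to_csv` is a hypothetical helper, and padding short rows with empty cells is an assumption about how ragged OCR output should be handled.

```python
import csv
import io
from typing import List


def table_to_csv(headers: List[str], rows: List[List[str]]) -> str:
    """Render an extracted table as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(headers)
    for row in rows:
        # OCR can drop or add cells; pad or trim each row to the header width.
        normalized = (row + [""] * len(headers))[: len(headers)]
        writer.writerow(normalized)
    return buffer.getvalue()


if __name__ == "__main__":
    print(table_to_csv(
        ["Activity", "Start", "End"],
        [["Foundation", "2024-01-01", "2024-01-15"],
         ["MEP, Rough-in", "2024-03-01"]],
    ))
```

The `csv` module quotes cells that contain commas, which matters for activity names read from schedules.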
```python
class ProgressAnalyzer:
    """Analyze construction progress from images"""

    def __init__(self):
        self.reference_models = {}

    def analyze_progress(
        self,
        current_image: bytes,
        reference_image: Optional[bytes] = None,
        element_type: str = "general"
    ) -> ProgressMeasurement:
        """Analyze progress by comparing images"""
        # Simulated progress analysis
        # In production: Use semantic segmentation + comparison

        # Simulate progress detection
        return ProgressMeasurement(
            element_type=element_type,
            total_count=100,
            completed_count=65,
            percent_complete=65.0,
            area_sqft=15000.0,
            volume_cuft=None
        )

    def compare_with_plan(
        self,
        site_photo: bytes,
        plan_image: bytes
    ) -> Dict:
        """Compare site photo with plan"""
        return {
            "match_score": 0.78,
            "deviations": [],
            "completion_estimate": 65.0,
            "areas_of_concern": []
        }
```
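The code comments mention semantic segmentation plus comparison as the production approach to progress analysis. As a rough sketch of the comparison step only, assuming a segmentation model has already produced aligned binary masks for the planned and the built state, percent complete can be taken as the share of planned pixels that are also built. `Mask` and `percent_complete` are hypothetical names.

```python
from typing import Sequence

# A binary mask as rows of 0/1 values: 1 = element present, 0 = absent.
Mask = Sequence[Sequence[int]]


def percent_complete(planned: Mask, built: Mask) -> float:
    """Share of planned pixels that are also built, as a percentage."""
    planned_pixels = 0
    built_pixels = 0
    for planned_row, built_row in zip(planned, built):
        for is_planned, is_built in zip(planned_row, built_row):
            if is_planned:
                planned_pixels += 1
                if is_built:
                    built_pixels += 1
    if planned_pixels == 0:
        return 0.0  # Nothing planned in this view
    return round(100.0 * built_pixels / planned_pixels, 1)


if __name__ == "__main__":
    planned = [[1, 1, 1, 1],
               [1, 1, 1, 1]]
    built = [[1, 1, 1, 0],
             [1, 1, 0, 0]]
    print(percent_complete(planned, built))  # 5 of 8 planned pixels: 62.5
```

Pixels built outside the planned area are ignored here; in practice they might be worth reporting as deviations, and the two masks would first need to be registered to the same viewpoint.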
```python
class ConstructionImageAnalyzer:
    """
    Main class for construction image analysis.
    Based on DDC methodology Chapter 2.4.
    """

    def __init__(self):
        self.ocr = OCREngine()
        self.detector = ObjectDetector()
        self.table_extractor = TableExtractor()
        self.progress_analyzer = ProgressAnalyzer()

    def analyze_image(
        self,
        image_data: bytes,
        image_type: ImageType,
        image_id: str = "img_001",
        extract_types: Optional[List[ExtractionType]] = None
    ) -> ImageAnalysisResult:
        """
        Analyze a construction image.

        Args:
            image_data: Image data as bytes
            image_type: Type of image
            image_id: Unique image identifier
            extract_types: Types of extraction to perform

        Returns:
            Complete analysis result
        """
        start_time = datetime.now()

        if extract_types is None:
            extract_types = [ExtractionType.OCR_TEXT, ExtractionType.OBJECT_DETECTION]

        text_regions = []
        detected_objects = []
        tables = []
        progress = None

        # OCR extraction
        if ExtractionType.OCR_TEXT in extract_types:
            text_regions = self.ocr.extract_text(image_data)

        # Object detection
        if ExtractionType.OBJECT_DETECTION in extract_types:
            detected_objects = self.detector.detect(image_data)

        # Table extraction
        if ExtractionType.TABLE in extract_types:
            tables = self.table_extractor.extract_tables(image_data)

        # Progress analysis
        if ExtractionType.PROGRESS in extract_types:
            progress = self.progress_analyzer.analyze_progress(image_data)

        processing_time = (datetime.now() - start_time).total_seconds()

        return ImageAnalysisResult(
            image_id=image_id,
            image_type=image_type,
            text_regions=text_regions,
            detected_objects=detected_objects,
            tables=tables,
            progress=progress,
            metadata={"extraction_types": [e.value for e in extract_types]},
            processing_time=processing_time
        )

    def analyze_site_photo(
        self,
        image_data: bytes,
        image_id: str = "site_001"
    ) -> Dict:
        """Analyze site photo for progress and safety"""
        result = self.analyze_image(
            image_data,
            ImageType.SITE_PHOTO,
            image_id,
            [ExtractionType.OBJECT_DETECTION, ExtractionType.PROGRESS]
        )
        safety = self.detector.detect_safety_compliance(image_data)
        return {
            "image_id": result.image_id,
            "objects_detected": len(result.detected_objects),
            "progress": result.progress,
            "safety_compliance": safety,
            "equipment": [o.label for o in result.detected_objects if o.attributes.get("category") == "equipment"],
            "materials": [o.label for o in result.detected_objects if o.attributes.get("category") == "material"]
        }

    def extract_drawing_data(
        self,
        image_data: bytes,
        image_id: str = "dwg_001"
    ) -> Dict:
        """Extract data from scanned drawing"""
        result = self.analyze_image(
            image_data,
            ImageType.FLOOR_PLAN,
            image_id,
            [ExtractionType.OCR_TEXT, ExtractionType.TABLE]
        )
        # Extract title block info
        title_block = self.ocr.extract_structured_text(image_data)
        return {
            "image_id": result.image_id,
            "title_block": title_block,
            "text_regions": len(result.text_regions),
            "tables": [
                self.table_extractor.table_to_dataframe(t)
                for t in result.tables
            ],
            "all_text": [r.text for r in result.text_regions]
        }

    def batch_analyze(
        self,
        images: List[Tuple[bytes, ImageType, str]]
    ) -> List[ImageAnalysisResult]:
        """Analyze multiple images"""
        results = []
        for image_data, image_type, image_id in images:
            result = self.analyze_image(image_data, image_type, image_id)
            results.append(result)
        return results

    def export_results(
        self,
        result: ImageAnalysisResult,
        format: str = "json"
    ) -> str:
        """Export analysis results"""
        data = {
            "image_id": result.image_id,
            "image_type": result.image_type.value,
            "text_count": len(result.text_regions),
            "object_count": len(result.detected_objects),
            "table_count": len(result.tables),
            "texts": [
                {"text": r.text, "confidence": r.confidence}
                for r in result.text_regions
            ],
            "objects": [
                {"label": o.label, "confidence": o.confidence}
                for o in result.detected_objects
            ],
            "processing_time": result.processing_time
        }
        if format == "json":
            return json.dumps(data, indent=2)
        else:
            raise ValueError(f"Unsupported format: {format}")
```
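The module imports `base64` but never uses it. A plausible reason, stated here as an assumption rather than something the source confirms, is that hosted AI vision services often accept images as base64 text. The sketch below packs raw image bytes into a `data:` URL, detecting JPEG and PNG by their file signatures. `sniff_mime_type` and `to_data_url` are hypothetical helpers, and the exact payload a given service expects will differ.

```python
import base64

# Leading bytes (file signatures) of the two formats used in the examples.
_SIGNATURES = (
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
)


def sniff_mime_type(image_data: bytes) -> str:
    """Guess the MIME type from the file signature."""
    for signature, mime_type in _SIGNATURES:
        if image_data.startswith(signature):
            return mime_type
    raise ValueError("Unsupported or unrecognized image format")


def to_data_url(image_data: bytes) -> str:
    """Encode image bytes as a base64 data URL."""
    mime_type = sniff_mime_type(image_data)
    encoded = base64.b64encode(image_data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


if __name__ == "__main__":
    fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 4  # Signature plus filler bytes
    print(to_data_url(fake_png)[:40])
```

Checking the signature instead of trusting the file extension guards against mislabeled uploads, which are common with scanned site documents.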
```python
analyzer = ConstructionImageAnalyzer()

# Load image (in production, read from file)
with open("site_photo.jpg", "rb") as f:
    image_data = f.read()

result = analyzer.analyze_site_photo(image_data)
print(f"Objects detected: {result['objects_detected']}")
print(f"Safety compliance: {result['safety_compliance']['overall_compliance']}")
print(f"Progress: {result['progress'].percent_complete}%")
```
```python
with open("floor_plan.png", "rb") as f:
    drawing_data = f.read()

data = analyzer.extract_drawing_data(drawing_data)
print(f"Drawing: {data['title_block'].get('drawing_number')}")
print(f"Project: {data['title_block'].get('project_name')}")
for table in data['tables']:
    print(f"Table with {len(table['records'])} rows")
```
```python
detector = ObjectDetector()

with open("site_photo.jpg", "rb") as f:
    image_data = f.read()

safety = detector.detect_safety_compliance(image_data)
if safety['overall_compliance'] == 'non-compliant':
    for violation in safety['violations']:
        print(f"Violation: {violation['type']} - Count: {violation['count']}")
```
| Component | Purpose |
|---|---|
| `ConstructionImageAnalyzer` | Main analysis engine |
| `OCREngine` | Text extraction |
| `ObjectDetector` | Object detection |
| `TableExtractor` | Table extraction |
| `ProgressAnalyzer` | Progress analysis |
| `ImageAnalysisResult` | Complete analysis result |