Jeden Skill in Manus ausführen
mit einem Klick

Jeden Skill in Manus mit einem Klick ausführen

document-intelligence

Comprehensive construction document intelligence extraction system. Use when processing ANY construction documents (plans, specs, schedules, contracts, geotech, safety, SWPPP, sub lists, RFIs, submittals) to extract field-actionable data for daily reports, project management, quality control, and procurement. Also generates the Project Intelligence Dashboard and construction schedules (door, hardware, fixture, finish, plumbing, equipment, room) from extracted data. Triggers: "process documents", "extract from plans", "read the specs", "analyze schedule", "set up project", "add documents", "show project intelligence", "project intel", "what data do we have", "door schedule", "hardware schedule", "finish schedule", "generate schedules", "room schedule", "equipment schedule", any mention of construction document processing or project intelligence building. This skill should be used EVERY TIME construction documents need to be processed - it extracts deep, comprehensive data including specific values, schedules,

In Manus ausführen

Überblick

Installationsbefehl

npx skills add https://github.com/mgoodman60/foreman-os-plugin --skill document-intelligence

Kopieren Sie diesen Befehl und fügen Sie ihn in Claude Code ein, um den Skill zu installieren

Quelle

mgoodman60/foreman-os-plugin

Sterne1

Forks0

Aktualisiert27. Februar 2026 um 23:34

Datei-Explorer

33 Dateien

SKILL.md

readonly

Mehr aus diesem Repository

gleiches Repository

project-data

mgoodman60/foreman-os-plugin

Use this skill when reading from or writing to the project intelligence data store. Manages a multi-file data store (project-config.json, plans-spatial.json, specs-quality.json, schedule.json, directory.json, individual log files, and daily-report-data.json). Handles storage, cross-referencing, versioning, validation, and smart retrieval of project intelligence, report history, RFI/submittal logs, procurement tracking, vendor database, and lookahead schedules.

2026-02-271

data-normalization

mgoodman60/foreman-os-plugin

Project-agnostic data normalization patterns for the 28-file Project Brain. Defines 8 repeatable patterns (N1-N8) that fix common data quality issues: status standardization, counter reconciliation, cross-reference linking, field backfill, schema compliance, computed totals, date/format normalization, and deduplication. Safe to re-run (idempotent). Outputs a repair plan before executing.

2026-02-261

submittal-intelligence

mgoodman60/foreman-os-plugin

Review and verify submittal compliance against project specifications. Trigger phrases include "review submittal", "check submittal", "submittal compliance", "does this meet spec", "compare to spec", "submittal review", "product data review", "shop drawing review".

2026-02-261

closeout-commissioning

mgoodman60/foreman-os-plugin

Skill for tracking project closeout and building commissioning activities. Manages closeout checklists, commissioning workflows, warranty tracking, and substantial/final completion documentation. Integrates with punch-list and inspection systems.

2026-02-251

cobie-export

mgoodman60/foreman-os-plugin

Generate Construction Operations Building Information Exchange (COBie) compliant spreadsheets from existing project intelligence data. Maps extracted data into the 18 standard COBie worksheet format for facility management handover. Critical for healthcare facilities where ongoing maintenance and asset management are regulatory requirements. Exports as multi-tab Excel workbook. Triggers: "COBie", "COBie export", "facility handover", "asset register", "O&M data", "building information exchange", "commissioning data", "facility management data", "asset data", "COBie spreadsheet", "handover data", "building handover".

2026-02-251

environmental-compliance

mgoodman60/foreman-os-plugin

Environmental compliance management for construction projects. Covers LEED v4.1 construction administration, SWPPP permit compliance and inspections, hazardous materials management (asbestos, lead, mold, contaminated soil), construction waste management and diversion tracking, dust and air quality control, noise ordinance compliance, silica exposure prevention, and environmental incident response. Beyond basic BMP installation (see field-reference/bmp-field-guide.md). Triggers: "environmental", "LEED", "SWPPP", "hazmat", "hazardous materials", "asbestos", "lead paint", "waste management", "dust control", "noise", "environmental compliance", "recycling", "diversion", "VOC", "silica", "EPD", "HPD", "spill", "contaminated soil", "mold", "air quality", "noise variance", "waste diversion", "C&D waste".

2026-02-251

Quelle

mgoodman60

mgoodman60/foreman-os-plugin

GitHub-Repository öffnen Creator-Repositorys ansehen

Installationsbefehl

Download

In Manus ausführen

Nützlich fürSOC

ProjektmanagementspezialistenWirtschafts- und Finanzberufe13-1082L4

name	document-intelligence
description	Comprehensive construction document intelligence extraction system. Use when processing ANY construction documents (plans, specs, schedules, contracts, geotech, safety, SWPPP, sub lists, RFIs, submittals) to extract field-actionable data for daily reports, project management, quality control, and procurement. Also generates the Project Intelligence Dashboard and construction schedules (door, hardware, fixture, finish, plumbing, equipment, room) from extracted data. Triggers: "process documents", "extract from plans", "read the specs", "analyze schedule", "set up project", "add documents", "show project intelligence", "project intel", "what data do we have", "door schedule", "hardware schedule", "finish schedule", "generate schedules", "room schedule", "equipment schedule", any mention of construction document processing or project intelligence building. This skill should be used EVERY TIME construction documents need to be processed - it extracts deep, comprehensive data including specific values, schedules, specifications, and coordination information that goes far beyond basic document reading.
version	1.0.0

Construction Document Intelligence

A comprehensive extraction system for construction project documents that captures deep, field-actionable intelligence for daily reports, project management, quality control, and procurement tracking.

Overview

This skill transforms raw construction documents into structured, actionable project intelligence by:

Classifying documents automatically (plans, specs, schedules, contracts, etc.)
Extracting comprehensive data with specific numerical values, not just descriptions
Cross-referencing information across multiple documents
Tracking extraction quality with confidence levels and coverage metrics
Structuring output for downstream consumption by project management systems

Key difference from basic document reading: This skill extracts SPECIFIC VALUES and COMPLETE SCHEDULES rather than just summaries. For example:

Not just "concrete requirements" → "4,000 PSI concrete, w/c ratio 0.45 max, slump 4"±1", air content 5.5%±1.5%, cure 7 days wet"
Not just "room list" → Complete room schedule with ALL 50+ rooms, numbers, names, areas, departments
Not just "schedule milestones" → Full critical path with durations, float, predecessors, constraints

When to Use This Skill

ALWAYS use this skill when:

Setting up a new project (/set-project, /process-docs, project initialization)
Processing construction drawings (plans, elevations, sections, details)
Processing specification books or spec sections
Processing CPM schedules (P6, MS Project, Asta)
Processing geotech reports, safety plans, SWPPP documents
Processing contracts, subcontractor lists, RFI logs, submittal logs
Updating project intelligence with new document revisions
Any task involving "extract", "process", "read", "analyze" + construction documents

Triggers include:

"Process these plans and extract grid lines, rooms, and finishes"
"Read the spec book and get concrete requirements"
"Analyze this schedule for critical path and milestones"
"Extract subcontractor information from these contracts"
"Process the geotech report for bearing capacity and compaction requirements"
"What are the weather thresholds in the specs?"
"Set up project intelligence from these documents"
"Show me the project intelligence" / "project intel dashboard" / "what data do we have"
"Show me the door schedule" / "generate schedules" / "hardware schedule" / "finish schedule"
"What schedules have been extracted?" / "room schedule" / "equipment schedule"

Core Capabilities

1. Extraction Pipeline (5-Phase Adaptive Model)

This extraction system uses a 5-phase adaptive model. The phases below (Phase 1-2) cover the per-document extraction methodology. Cross-referencing (Phase 3), remediation (Phase 4), and normalization (Phase 5) are orchestrated at the pipeline level — see commands/process-docs.md for the full pipeline and agents/extraction-pipeline-orchestrator.md for the meta-agent.

Phase 1 — Index & Classify (Lightweight)

For each document, extract metadata and classify:

PDF metadata: creator application, dates, page count, title, author
Document type classification (see doc-orchestrator.md classification table)
For plan PDFs: build sheet index from TOC or title blocks
For specs: identify division/section structure

Phase 2A — Structural Analysis (Quick Scan)

Scan for structural patterns to guide extraction:

Sheet index (for drawings), Table of contents (for specs)
Headers, footers, tables, content signals
Adaptive intensity: cover sheets get 2 passes, plan sheets get 4-6

Phase 2B — Deep Content Extraction (Targeted)

Based on document type, extract comprehensive intelligence following specialized references in extraction-rules.md.

Entity resolution during extraction: Assign tentative entity references using directory.json and specs-quality.json as lookup tables. Mark as "entity_status": "tentative". These are confirmed in Phase 3 (Cross-Reference) at the pipeline level.

Pass 4: Visual / Graphical Analysis (Plan Sheets and Drawings)

This pass is REQUIRED for any PDF that contains construction drawings, plan sheets, details, sections, or schedules embedded as graphics. Construction documents communicate ~80% of their information through drawings, not text. Skipping this pass means missing scale data, dimensions, room locations, door swings, equipment positions, and graphical notes.

Method 1 — Claude Vision (Primary, always available in Cowork):

Convert PDF pages to images: pdftoppm -png -r 200 "document.pdf" /tmp/sheet_images/sheet (or PyMuPDF)
Use the Read tool to view each sheet image directly (Claude's native multimodal vision)
Extract: scale data (per-view), dimension strings, detail callouts, room labels with grid locations, door swings, equipment locations, general notes, title block data, north arrow
Store with "source": "claude_vision" and "confidence": "medium"

Method 2 — Python Visual Pipeline (Optional Enhancement): If cv2, skimage, sklearn are installed, optionally run visual_plan_analyzer.py for automated extraction with precise pixel coordinates:

OCR text extraction, wall/line detection, symbol recognition
Material zone detection (hatches → concrete, VCT, tile, insulation)
Dimension extraction (pairing OCR text with dimension lines)
Scale calibration (graphic scale bars AND text-format scales) Falls back to OpenCV + Tesseract if PaddleOCR is unavailable. This pipeline is not required — Claude Vision handles the majority of extraction needs.

Method 3 — Tesseract OCR (Supplement for small text): Supplement small text that Claude Vision misses: tesseract sheet.png output -l eng --psm 6

Priority: Claude Vision (primary, always available) > Tesseract (text supplement) > Python pipeline (optional precise coordinates)

DPI Recipes:

Document Type	DPI	Notes
D-size drawings (24x36, 30x42)	150	Crop to 1800x1200 for Vision
Letter-size PDFs (specs, submittals)	200	Standard rendering
High-detail areas (enlarged plans, details)	300-350	Zone crop specific areas
Text-extractable PDFs (contracts, reports)	Skip	Extract text directly, no rendering

Atomic Python Script Pattern: When updating multiple JSON files from extraction, use a single atomic script: read all targets → modify in memory → write all → print summary. This prevents partial writes and ensures consistency. See commands/process-docs.md for the code template.

Phase 2C: Scale Calibration Protocol (RECOMMENDED, non-blocking)

Scale calibration is RECOMMENDED before extracting measurement data from a sheet. It improves dimensional accuracy but is non-blocking — if calibration fails, record "scale_status": "uncalibrated" and continue extraction. Text-based data (room names, door marks, notes, equipment tags) is extracted regardless. Uncalibrated sheets are flagged for remediation in Phase 4.

Without a calibrated scale, pixel-based measurements are unreliable. This sub-pass runs as the FIRST step of visual analysis for every plan sheet.

Step 1 — Graphic Scale Bar Detection (Highest Priority):

Graphic scale bars are the most reliable scale anchor because they survive PDF resizing, printing at different sizes, and scanning. They are typically located near the title block or below individual views.

What to look for:

A horizontal line with evenly-spaced tick marks (short perpendicular lines)
Distance labels at ticks (e.g., "0", "4'", "8'", "16'" or "0", "5", "10", "20")
Often has a subdivided first segment (finer ticks between 0 and the first major tick)
May be labeled "GRAPHIC SCALE" or just have the unit label ("FEET", "INCHES")
Common styles: open bar (ticks only), filled bar (alternating black/white segments), ladder bar (enclosed rectangles)

Claude Vision detection method:

Scan the bottom-right quadrant of each sheet first (most common location), then title block area, then below each view
Look for the characteristic pattern: horizontal line + evenly-spaced ticks + number labels increasing left-to-right
Record: bar pixel length (endpoint to endpoint), bar real-world length (from labels), tick count, unit
Calculate pixels_per_foot = bar_pixel_length / bar_real_world_feet

Python pipeline detection method (Pass 7 enhanced):

Detect candidate horizontal line segments 50-500px long
Find perpendicular tick marks (short lines within 5px of the candidate, 10-30px tall)
Require minimum 3 evenly-spaced ticks (spacing variance < 15%)
Match nearby OCR text to tick positions for distance labels
Calculate pixels_per_foot from tick spacing and label values

Step 2 — Text Scale Notation (Secondary):

Read text-format scale notations from title blocks and view labels:

Architectural: 1/4" = 1'-0", 1/8" = 1'-0", 3/16" = 1'-0", 1/2" = 1'-0", 1" = 1'-0", 3" = 1'-0"
Metric: 1:50, 1:100, 1:200
Verbal: SCALE: 1/4" = 1'-0", PLAN SCALE: 1/8"
Per-view: Detail views often have their own scale different from the main plan

IMPORTANT: Text scales assume the sheet was printed or rendered at the nominal size (24×36", 30×42", etc.). If the PDF was resized during export or scanning, text scales will be WRONG. Always prefer graphic scale bar calibration when available.

Step 3 — Multi-Scale Zone Mapping:

Most plan sheets contain multiple views at different scales on the same sheet. Each view's scale must be mapped independently:

Main floor plan: Typically 1/4" = 1'-0" or 1/8" = 1'-0"
Enlarged plans (restrooms, kitchens, lobbies): Typically 1/2" = 1'-0"
Details: Typically 1" = 1'-0", 1-1/2" = 1'-0", or 3" = 1'-0"
Site plans: Typically 1" = 20'-0", 1" = 30'-0", or 1" = 50'-0"

For each view on a sheet, record:

{
  "view_name": "FLOOR PLAN",
  "scale_text": "1/4\" = 1'-0\"",
  "pixels_per_foot": 62.5,
  "calibration_method": "graphic_bar",
  "zone_bbox": [120, 80, 2800, 3200],
  "confidence": "high"
}

Step 4 — Known-Dimension Calibration Fallback:

When neither graphic scale bar nor text scale is found (rare but possible on details, sections, or scanned documents):

Find the longest dimension string on the sheet (e.g., "132'-8"" for an overall building dimension)
Measure its dimension line in pixels (endpoint to endpoint)
Calculate: pixels_per_foot = pixel_length / dimension_feet
Mark calibration as "method": "known_dimension", "confidence": "low"
Cross-check: if ANY other dimension on the sheet can be measured and verified against this calibration, upgrade confidence to "medium"

Step 5 — Double Calibration (Stretch Detection):

Scanned documents and some PDF exports may have different horizontal and vertical scaling (stretched/skewed). After establishing scale:

Find a horizontal dimension and calculate h_pixels_per_foot
Find a vertical dimension and calculate v_pixels_per_foot
If they differ by more than 3%: the image is stretched
Record both values and set "stretch_detected": true
All subsequent area calculations must use the geometric mean: pixels_per_foot = sqrt(h * v)

Scale Data Storage — REQUIRED:

Every plan sheet must have scale data recorded in plans-spatial.json under scale_calibration with per-sheet and per-view scale information:

"scale_calibration": {
  "sheets": {
    "A-101": {
      "sheet_name": "FLOOR PLAN",
      "calibration_method": "graphic_bar",
      "graphic_bar": {
        "pixel_length": 250,
        "real_world_feet": 16,
        "tick_count": 5,
        "location": [2400, 3600]
      },
      "text_scale": "1/4\" = 1'-0\"",
      "pixels_per_foot": {
        "horizontal": 62.5,
        "vertical": 62.3,
        "mean": 62.4
      },
      "stretch_detected": false,
      "zones": [
        {
          "view_name": "FLOOR PLAN",
          "scale_text": "1/4\" = 1'-0\"",
          "pixels_per_foot": 62.4,
          "zone_bbox": [120, 80, 2800, 3200],
          "confidence": "high"
        },
        {
          "view_name": "ENLARGED RESTROOM PLAN",
          "scale_text": "1/2\" = 1'-0\"",
          "pixels_per_foot": 124.8,
          "zone_bbox": [2850, 80, 3600, 1200],
          "confidence": "high"
        }
      ],
      "confidence": "high",
      "calibration_date": "2026-02-20"
    }
  }
}

If scale calibration fails for a sheet (no graphic bar, no text scale, no readable dimensions):

Record "calibration_method": "none", "scale_status": "uncalibrated", "confidence": "failed"
Flag the sheet for Phase 4 remediation (re-attempt with different method or manual review)
Continue extraction — text-based data (room names, door marks, notes, equipment tags, general notes) can still be extracted without scale
Pixel-based measurements on uncalibrated sheets should be marked with "confidence": "low" and "uncalibrated": true

Uncalibrated sheets reduce the extraction depth score but do not block the pipeline. They are queued for targeted remediation in Phase 4.

Pass 4B: Dimension String Chaining and Verification

After scale calibration, extract and chain dimension strings. Dimension chains are the primary cross-check for measurement accuracy.

What is a dimension chain? On construction drawings, dimensions are arranged in "strings" — a series of sub-dimensions along a common line that sum to an overall dimension. For example, column grid spacing (31'-4" + 29'-0" + 29'-0" + 29'-0" + 14'-4" = 132'-8"). If the sub-dimensions don't sum to the overall, there's either an extraction error or a drawing error worth flagging.

Claude Vision extraction method:

For each dimension string visible on the sheet, read ALL individual dimension values along the line
Identify the overall dimension (the one spanning the entire string, usually offset further from the drawing)
Sum the sub-dimensions and compare to the overall:
- Match within ±1": chain verified ✓
- Discrepancy > 1": flag for review — may indicate OCR misread or drawing error
Tag each dimension with its orientation (horizontal/vertical/angled)
Where possible, note what elements the dimension connects (e.g., "Grid 1 to Grid 2")

Store dimension chains in plans-spatial.json under dimensions.chains:

{
  "chain_id": "DIM-CHAIN-001",
  "sheet": "A-100",
  "orientation": "horizontal",
  "overall_value": "132'-8\"",
  "overall_numeric_ft": 132.67,
  "segments": [
    {"value": "31'-4\"", "numeric_ft": 31.33, "from_element": "Grid 1", "to_element": "Grid 2"},
    {"value": "29'-0\"", "numeric_ft": 29.0, "from_element": "Grid 2", "to_element": "Grid 3"}
  ],
  "segments_sum_ft": 132.67,
  "sum_matches_overall": true,
  "discrepancy_ft": 0
}

Dimensions NOT part of any chain go into dimensions.isolated.

Pass 4C: Elevation Marker and Spot Elevation Extraction

Elevation markers appear on building sections, wall sections, and elevation views. They provide critical vertical measurement data.

What to look for:

Level markers: Horizontal lines with text like "T.O. WALL 12'-0"", "FFE 0'-0"", "T.O. FOOTING -4'-0""
Relative elevations: Dimensions prefixed with +/- relative to FFE (Finished Floor Elevation)
Absolute elevations: Text like "EL. 856'-6"" referencing MSL (Mean Sea Level) datum

Spot elevations appear on civil/site plans as standalone numbers (e.g., "856.50") with an X or + marker:

Existing spot elevations: Typically shown with parentheses or dashed text "(856.50)"
Proposed spot elevations: Typically bold or standard text "856.50"
FFE: Finished Floor Elevation, often boxed or highlighted
Top of curb (TC): Labeled "TC" or "T/C"

Store elevation markers in plans-spatial.json under elevation_markers:

{
  "sheet": "A-300",
  "label": "T.O. WALL",
  "elevation": "12'-0\"",
  "elevation_ft": 12.0,
  "datum": "FFE = 0'-0\"",
  "marker_type": "level"
}

Store spot elevations in plans-spatial.json under site_grading.spot_elevations:

{
  "sheet": "C-103",
  "elevation_ft": 856.50,
  "type": "proposed",
  "label": "856.50'"
}

Pass 4D: Cross-Sheet Reference Extraction

Cross-sheet references are the connective tissue of a plan set. Every section cut, detail callout, elevation marker, and schedule reference on one sheet points to a view on another sheet. Building this index is what enables assembly chains and multi-page calculations.

What to extract:

Section Cut Markers — Circle with number/letter + arrow indicating viewing direction. Format: number / sheet_number (e.g., "3/A-301" means "Section 3 is on sheet A-301").
- Record: source sheet, marker location (grid reference), target sheet, section number, viewing direction
Detail Callouts — Circle or diamond with detail_number / sheet_number. Points to an enlarged detail view.
- Record: source sheet, marker location, target sheet, detail number, what's being detailed (wall, footing, connection, etc.)
Elevation Markers — Circle with number/letter, NO arrow (unlike section cuts). References an exterior or interior elevation view.
- Record: source sheet, target sheet, elevation number
Enlarged Plan References — Dashed rectangle on the main plan with label pointing to a larger-scale view on another sheet.
- Record: source sheet, target sheet, boundary on source (approximate grid bounds)
Schedule References — Door marks (D101), window marks (W-3), room numbers, equipment tags that cross-reference to schedule sheets.
- Record: mark/tag, plan sheet where it appears, schedule sheet it references
Spec Section References — Text like "SEE SECTION 03 30 00" or "PER SPEC 09 65 00" on drawings.
- Record: source sheet, spec section number, context

Store in plans-spatial.json under sheet_cross_references:

{
  "detail_callouts": [
    {
      "id": "XREF-001",
      "source_sheet": "A-100",
      "source_location": {"grid": "C-3", "zone": "plan_area"},
      "target_sheet": "A-501",
      "target_detail": "3",
      "callout_type": "section",
      "description": "Building section at Grid 3",
      "linked_elements": ["grid_3", "wall_type_2A"]
    }
  ]
}

IMPORTANT: Cross-reference extraction should happen on EVERY sheet, not just architectural. Structural sheets reference details, MEP sheets reference riser diagrams, civil sheets reference profile views. Build the index incrementally — each processed sheet adds its references to the master index.

After processing all sheets in a document, verify the index:

Every target sheet referenced should exist in the drawing index
Flag "orphan references" — callouts pointing to sheets not in the set (may indicate missing sheets)
Flag "unreferenced sheets" — sheets in the index that no other sheet points to (may be standalone or may be missing callouts)

Pass 4E: Contour Line and Grading Extraction (Civil/Site Sheets)

This pass applies ONLY to civil/site plan sheets (C-series sheets, grading plans, site plans with contour lines). Skip for architectural, structural, MEP, and other disciplines.

Contour lines represent terrain elevations. The difference between existing and proposed contours at any point equals the cut or fill depth — critical for earthwork volume estimation.

What to extract:

Contour Elevation Labels — 3-4 digit numbers (e.g., 856, 857, 858) placed along curved lines. These are the elevation values for each contour.
Existing Contours (current terrain):
- Dashed lines with elevation labels
- Represent the ground as it exists before construction
- Sometimes shown lighter or in a different color
Proposed Contours (design terrain):
- Solid lines with elevation labels
- Represent the finished grade after construction
- Usually shown heavier/darker than existing
Index Contours — Every 5th or 10th contour is drawn thicker. Labels are always shown on index contours. Use index contour spacing to determine the contour interval (typically 1', 2', or 5').
Drainage Patterns — Water flows perpendicular to contours, from high elevation to low. V-shaped contour patterns pointing uphill indicate valleys/swales (drainage paths). Contours bowing away from buildings should confirm positive drainage.
Cut/Fill Zones — Where proposed contours are LOWER than existing = CUT (excavation). Where proposed contours are HIGHER than existing = FILL (import/compaction). Document the general cut/fill pattern for the site.

Store in plans-spatial.json under site_grading:

{
  "contours": {
    "existing": [{"elevation_ft": 856.0, "line_style": "dashed", "sheet": "C-103"}],
    "proposed": [{"elevation_ft": 856.0, "line_style": "solid", "sheet": "C-103"}],
    "contour_interval_ft": 1.0
  },
  "cut_fill_volumes": {
    "net_operation": "cut",
    "avg_cut_depth_ft": 3.5,
    "max_cut_depth_ft": 6.0,
    "confidence": "low",
    "note": "Rough estimate — verify with earthwork quantity takeoff"
  },
  "drainage_patterns": [
    {"direction": "southeast", "high_elevation": 862, "low_elevation": 854, "fall_ft": 8}
  ]
}

Cross-check with SWPPP: The contour/grading data should be consistent with SWPPP disturbed area and cut/fill volumes. Flag if the grading plan shows significantly different earthwork than the SWPPP.

Pass 4F: Building Section and Wall Section Extraction (Section/Detail Sheets)

This pass applies to architectural section sheets (A-300 series), structural sections (S-300 series), wall section details (A-400/A-500 series), and any sheet containing building or wall section cuts. Skip for plan views, site plans, MEP plans, and reflected ceiling plans.

Building sections and wall sections are the primary source of vertical dimensioning, assembly composition, and structural depth data that cannot be obtained from plan views.

What to extract:

1. Building Sections (full-height cross-sections through the building):

Floor-to-floor height: Distance from finished floor (FFE) to the next level or roof structure. For single-story buildings, this is the floor-to-roof structure distance.
Foundation depth: Distance from finished grade to bottom of footing. Includes footing thickness, stem wall height, and any grade beam depth.
Ridge height: Top of roof ridge above finished floor.
Eave/parapet height: Top of exterior wall above finished floor.
Roof slope: Read from slope triangle symbols (e.g., 4:12, 1/4":12") or calculate from rise/run dimensions.
Ceiling height: Distance from FFE to bottom of ceiling system (ACT grid, GWB, exposed structure).
Structural depth: Depth of roof joists, trusses, purlins, or beams visible in section.
SOG thickness: Slab-on-grade thickness from section detail.
Cut location: Which grid line and direction the section is cut through (from section marker on plan).

2. Wall Sections (enlarged vertical cuts through wall assemblies):

Wall type identifier: Match to wall type legend (Type 1, Type 2, etc.) or wall schedule.
Total wall thickness: Overall dimension from face to face.
Layer-by-layer assembly (from exterior to interior):
- Material name (e.g., "5/8" Type X GWB", "3-5/8" metal stud", "R-13 batt insulation")
- Thickness of each layer
- Position classification: exterior_finish, sheathing, air_barrier, insulation, stud_cavity, interior_finish
Fire rating: If noted on section (e.g., "1-HR FIRE PARTITION")
Head/sill/jamb details: Conditions at top of wall, window sill, and door jamb if shown
Base condition: How wall meets floor (reveals, base trim, sealant joints)
Top condition: How wall meets structure above (deflection track, fire safing, sealant)

3. Section Identification:

Section mark/number: The label that identifies this section (e.g., "1/A-300")
Cut line reference: Which sheet and location the section is cut from
Scale: Section scale (typically 3/4"=1'-0" for building sections, 1-1/2"=1'-0" or 3"=1'-0" for wall sections)
Title: Section title as shown below the drawing

Store in plans-spatial.json under sections:

{
  "sections": {
    "building": [
      {
        "section_id": "SECT-A300-1",
        "sheet": "A-300",
        "section_mark": "1",
        "cut_line_on_sheet": "A-100",
        "cut_location": "Grid line 3, looking east",
        "scale": "3/4\" = 1'-0\"",
        "title": "Building Section - East/West",
        "floor_to_floor_ft": 14.5,
        "foundation_depth_ft": 4.0,
        "footing_thickness_in": 12,
        "stem_wall_height_ft": 2.0,
        "ridge_height_ft": 22.5,
        "eave_height_ft": 14.0,
        "parapet_height_ft": 0,
        "roof_slope": "4:12",
        "roof_slope_degrees": 18.43,
        "ceiling_height_ft": 10.0,
        "ceiling_type": "ACT",
        "structural_depth_in": 24,
        "structural_member": "PEMB rafter",
        "sog_thickness_in": 5,
        "measurements": [
          {"label": "FFE to T.O. Steel", "value_ft": 14.5, "confidence": "high"}
        ]
      }
    ],
    "wall": [
      {
        "section_id": "WSECT-A302-1",
        "sheet": "A-302",
        "section_mark": "1",
        "wall_type": "Type 4",
        "fire_rating": "1-HR",
        "total_thickness_in": 5.25,
        "layers": [
          {"material": "5/8\" Type X GWB", "thickness_in": 0.625, "position": "interior_finish"},
          {"material": "3-5/8\" 20ga metal stud @ 16\" OC", "thickness_in": 3.625, "position": "stud_cavity"},
          {"material": "R-13 batt insulation", "thickness_in": 3.5, "position": "insulation"},
          {"material": "5/8\" Type X GWB", "thickness_in": 0.625, "position": "interior_finish"}
        ],
        "base_condition": "Vinyl base on GWB, sealant at floor line",
        "top_condition": "Deflection track, fire safing above ceiling",
        "head_detail": null,
        "sill_detail": null,
        "measurements": [
          {"label": "Total assembly", "value_in": 5.25, "confidence": "high"}
        ]
      }
    ]
  }
}

Cross-checks after section extraction:

Wall types vs plan annotations: Every wall type referenced on floor plans should have a corresponding wall section. Flag missing sections.
Ceiling heights vs RCP: Section ceiling heights should match reflected ceiling plan height zones. Flag discrepancies > 2".
Foundation depth vs geotech: Foundation embedment from sections should meet geotech minimum (e.g., 24" below grade for frost). Flag if shallower.
Roof slope vs structural: Section roof slope should match structural framing slope. Flag if different.
Fire ratings vs plan partitions: Wall section fire ratings should match fire-rated partition callouts on plans.

Pass 4G: Reflected Ceiling Plan (RCP) and MEP System Extraction

This pass applies to RCP sheets (A-102, A-5xx series, sheets titled "Reflected Ceiling Plan") and MEP plan sheets (M-series mechanical, P-series plumbing, E-series electrical). Skip for site plans, structural framing plans, and architectural floor plans (unless they contain MEP overlay data).

RCPs and MEP plans are the primary source for overhead-plane data (ceiling heights, fixture placement, HVAC devices) and system-level data (pipe sizes, duct sizes, panel schedules, equipment tags) that cannot be obtained from architectural floor plans or sections.

What to extract:

1. Reflected Ceiling Plan Data:

Ceiling type per room/zone: ACT (acoustic ceiling tile), GWB (gypsum board), exposed structure, specialty (wood slat, metal panel, etc.)
Ceiling grid module: 2'×2' or 2'×4' for ACT ceilings
Ceiling tile product: Manufacturer and model from legend or notes (e.g., Armstrong Fine Fissured 1761)
Ceiling height zones: Group rooms by ceiling height. Read height callouts (e.g., "9'-0" AFF", "CLG @ 10'-0"")
Height transitions: Where ceiling height changes within or between rooms — soffits, bulkheads, stepped ceilings
Lighting fixture positions: Symbol type + tag (e.g., 2×4 recessed troffer "A", 6" downlight "C") with room locations
HVAC devices on ceiling: Supply diffusers, return grilles, exhaust grilles — tag and room
Sprinkler heads: Approximate count per room (reference only — FP drawings govern)
Access panels: Locations for above-ceiling MEP access

Store in plans-spatial.json under ceiling_data:

{
  "ceiling_data": {
    "source_sheet": "A-102",
    "grid_type": "2x2",
    "grid_module": "24\" x 24\"",
    "tile_product": "Armstrong Fine Fissured Second Look 1761",
    "height_zones": [
      {"rooms": ["101", "102", "103"], "ceiling_height_ft": 9.0, "ceiling_type": "ACT"},
      {"rooms": ["104"], "ceiling_height_ft": 10.0, "ceiling_type": "GWB"}
    ],
    "fixture_positions": [
      {"tag": "A", "type": "2x4 recessed troffer", "rooms": ["101", "102"], "count": 12},
      {"tag": "C", "type": "6\" LED downlight", "rooms": ["103", "104"], "count": 8}
    ],
    "soffits_bulkheads": [
      {"room": "101", "description": "Soffit at perimeter, 8'-0\" AFF, 18\" wide"}
    ]
  }
}

2. MEP System Data (Mechanical, Plumbing, Electrical):

Mechanical (M-series):

Equipment tags and locations: RTU, AHU, EF, VRF, ERV — tag, grid/room position, capacity (tons, CFM, MBH)
Duct sizes: Supply, return, exhaust — read duct size labels (e.g., "12×8", "10" RD") with system type
Diffuser/grille tags: Cross-reference to RCP positions and mechanical schedule
Ductwork routing: Major trunk runs with sizes, branches to rooms

Plumbing (P-series):

Pipe sizes: Domestic cold/hot, sanitary, storm, gas, medical — read size labels (e.g., "2" CW", "4" SAN")
Pipe material: As noted (copper, CPVC, PVC, cast iron, SS)
Fixture connections: Which fixtures connect to which pipe runs
Equipment: Water heaters, pumps, grease traps, backflow preventers — tag and location

Electrical (E-series):

Panel schedules: Panel name, voltage, phase, main breaker, circuit list with loads
Single-line diagram data: Service entrance size, transformer, main distribution, sub-panels
Circuit assignments: Which circuits serve which rooms/equipment
Receptacle and device counts: Per room counts of receptacles, switches, special outlets (GFCI, dedicated)
Equipment connections: HVAC equipment electrical requirements (voltage, phase, MCA, MOCP)

Store in plans-spatial.json under mep_systems:

{
  "mep_systems": {
    "electrical": {
      "source_sheets": ["E-100", "E-101", "E-200"],
      "panel_schedules": [
        {
          "panel": "LP-1",
          "location": "Electrical Room 110",
          "voltage": "208/120V",
          "phase": "3-phase",
          "main_breaker": "225A",
          "circuits": [
            {"circuit": 1, "description": "Lighting - Rooms 101-103", "breaker": "20A", "load_va": 1800}
          ]
        }
      ],
      "single_line_data": {
        "service_size": "400A",
        "voltage": "208/120V 3-phase",
        "transformer": "75 KVA"
      },
      "circuit_assignments": [
        {"room": "101", "circuits": [1, 3, 5], "description": "Lighting + receptacles"}
      ]
    },
    "pipe_sizes": [
      {"system": "domestic_cold", "size": "2\"", "material": "copper", "sheet": "P-100", "run_segments": []}
    ],
    "duct_sizes": [
      {"size": "12x8", "type": "supply", "sheet": "M-100", "run_segments": []}
    ],
    "equipment": [
      {
        "tag": "RTU-1",
        "type": "rooftop_unit",
        "sheet": "M-100",
        "position": {"grid": "C-3", "room": "Roof"},
        "schedule_data": {"cooling_tons": 10, "heating_mbh": 250, "cfm": 4000, "voltage": "208V/3ph"}
      }
    ]
  }
}

Cross-checks after RCP + MEP extraction:

RCP room numbers vs floor plan: Every room on RCP must exist in the floor plan room schedule
Ceiling heights vs finish schedule: RCP height zones should match finish schedule ceiling column
Fixture counts vs lighting schedule: Sum of fixture tags on RCP should match schedule quantities
Diffuser counts vs mechanical schedule: Supply/return diffuser count should support required CFM per room
Pipe sizes vs plumbing riser diagram: Pipe sizes on plan should match riser diagram
Panel schedule loads vs connected equipment: Equipment MCA should not exceed circuit breaker rating
Equipment tags on plan vs equipment schedules: Every tag on plan must appear in corresponding schedule

See references/plans-deep-extraction.md (MEP Drawings section and Reflected Ceiling Plan section) for detailed extraction rules, symbol libraries, and cross-reference tables.

MEP Extraction Depth Requirements:

For EVERY MEP sheet processed, the extraction must achieve the following minimums:

Discipline	Minimum Extraction
Mechanical	Every equipment tag + cooling tons + heating MBH + CFM + electrical (V/ph/MCA/MOCP) + served rooms
Electrical	Every panel schedule with ALL circuits, single-line diagram hierarchy, lighting fixture schedule with types/quantities/wattages
Plumbing	Every fixture from schedule with type/mounting/connections/ADA/flow rate, water heater specs, pipe sizes by system
Fire Protection	System type (wet/dry/pre-action), riser location, FDC, head schedule (type/temp/K-factor), fire pump if present

Do NOT stop at equipment tags only. The extraction must capture schedule data (capacity, ratings, electrical requirements) from mechanical/electrical/plumbing schedule sheets (M-300, E-300, P-400 series).

Reference: See references/mep-deep-extraction.md for complete field-by-field extraction templates.

Pass 4G-2: MEP Schedule Sheets (M-300, E-300, P-400 Series)

This pass is MANDATORY for any project with MEP drawings. Schedule sheets contain the engineering data that equipment tags on plan sheets reference. Without schedule extraction, you only get tag names — not capacities, ratings, or specifications.

Processing order:

M-300 series (Mechanical Schedules): Extract EVERY row from HVAC equipment schedules, exhaust fan schedules, diffuser/grille schedules
E-300 series (Panel Schedules, Lighting Schedules): Extract EVERY panel with ALL circuits, EVERY lighting fixture type with quantities
P-400 series (Plumbing Schedules): Extract EVERY fixture from plumbing fixture schedule, water heater/boiler schedule
FP sheets (Fire Protection): Extract system type, riser diagram data, sprinkler head schedule

Output depth per discipline — see references/mep-deep-extraction.md for complete field lists.

Validation: After processing, verify:

Every equipment tag on plan sheets has a matching schedule entry
Panel schedule connected load ≤ main breaker rating
Total CFM from diffusers ≤ equipment total CFM
Every occupied room has lighting fixtures assigned

Pass 4H: Exterior Elevation and Accessibility Extraction

This pass applies to exterior elevation sheets (A-200 series) and architectural floor plans with accessibility annotations. Skip for sections, MEP plans, and civil/site sheets (except site accessibility routes).

Exterior elevations provide the vertical face of the building — material zones, window/door positions, grade line, and overall proportions. Accessibility data is typically annotated on floor plans and site plans as required routes, clearances, and signage.

What to extract:

1. Exterior Elevation Data (A-200 series):

Elevation face: North, South, East, West (from title or orientation)
Material zones: Each distinct cladding area with material type, color/finish, and approximate coverage area
- Primary cladding (metal panels, CMU, brick, etc.)
- Accent materials (stone veneer, brick banding, decorative panels)
- Foundation/base visible material
- Trim/fascia material and color
Window and door positions: Mark numbers with their vertical positions on the elevation. Cross-reference to window/door schedules.
Grade line elevation: The finished grade line with elevation value if noted
Overall building dimensions: Height measurements (grade to eave, grade to ridge, floor-to-floor if multi-story)
Roof material and slope: Visible from elevation — material type, color, slope direction
Canopy/entry features: Portico structure, canopy material, entry surround
Signage locations: Where building signage is indicated

Store in plans-spatial.json under elevations.exterior:

{
  "elevations": {
    "exterior": [
      {
        "face": "South",
        "sheet": "A-200",
        "materials": [
          {"zone": "primary_cladding", "material": "Vertical metal panels", "color": "Champagne tan", "mfg_code": "Nucor 12-45"},
          {"zone": "base", "material": "Split-face CMU", "color": "Warm gray", "height_ft": 4.0},
          {"zone": "trim", "material": "Aluminum", "color": "Dark bronze anodized"}
        ],
        "grade_line_elevation_ft": 856.0,
        "window_positions": [
          {"mark": "W-101", "sill_height_ft": 3.0, "head_height_ft": 7.0}
        ],
        "door_positions": [
          {"mark": "101", "type": "storefront entry", "width_ft": 6.0, "height_ft": 7.0}
        ],
        "measurements": [
          {"label": "Grade to eave", "value_ft": 14.0},
          {"label": "Grade to ridge", "value_ft": 22.5}
        ]
      }
    ]
  }
}

2. Accessibility Data (Floor Plans + Site Plans):

Accessible routes: Paths of travel from parking/entrance to all public spaces. Note route width (min 36" clear, 44" in corridors) and any obstructions.
ADA restroom clearances: 60" turning radius, 18" centerline to side wall at WC, lavatory clearance (min 30"×48" clear floor space), grab bar locations (42" side, 36" rear).
Door clearances: Maneuvering clearances at accessible doors (pull side: 18" beside latch, push side: 12"). Door hardware type (lever vs knob — lever required for accessible).
Ramp data: Slope (max 1:12 = 8.33%), width (min 36"), landing size (60"×60" at top/bottom/turns), handrail heights (34"-38").
Signage locations: Tactile/Braille signage at room doors, exit signs, directional signs. Mounting height (48"-60" AFF, latch side).
Parking accessibility: Accessible stalls (8' wide + 5' access aisle), van accessible (8' + 8' aisle), signage, slope (max 2%).
Counter heights: ADA service counter (max 36" AFF), transaction counter (max 34"), knee clearance (min 27" high, 30" wide, 19" deep).

Store in plans-spatial.json under accessibility_data:

{
  "accessibility_data": {
    "accessible_routes": [
      {
        "from": "Accessible parking",
        "to": "Main entrance",
        "route_width_in": 60,
        "surface": "Concrete sidewalk",
        "slope_pct": 2.0,
        "compliant": true
      }
    ],
    "ada_restrooms": [
      {
        "room": "109",
        "turning_radius_in": 60,
        "wc_centerline_in": 18,
        "grab_bars": true,
        "lavatory_clearance": true,
        "door_width_in": 36,
        "compliant": true
      }
    ],
    "ramps": [
      {"location": "Main entry", "slope": "1:12", "width_in": 44, "handrails": true, "landings": true}
    ],
    "signage_locations": [
      {"type": "tactile_braille", "location": "All room doors", "mounting_height_in": 60, "side": "latch"}
    ],
    "accessible_parking": {
      "standard_count": 2,
      "van_accessible_count": 1,
      "signage": true,
      "access_aisle_width_ft": 5
    },
    "counter_heights": [
      {"location": "Reception 101", "height_in": 34, "type": "transaction", "knee_clearance": true}
    ]
  }
}

Cross-checks after elevation + accessibility extraction:

Material zones vs spec sections: Elevation materials should match Div 07 (thermal/moisture), Div 04 (masonry), Div 05 (metals) specifications
Window/door positions vs schedules: Every mark on elevation should exist in the door/window schedule
Grade elevation vs civil/site: Elevation grade line should match site grading plan at the building perimeter
ADA route widths vs code: 36" min passage, 44" min corridor, 60" turning. Flag violations.
Restroom clearances vs code: 60" turning radius, 18" WC centerline. Flag non-compliant rooms.
Ramp slopes vs code: Max 1:12 (8.33%). Flag steeper ramps.
Parking count vs code: Verify accessible stall count meets IBC Table 1106.1 minimums for total parking count.

See references/plans-deep-extraction.md (Exterior Elevations section) for detailed rendering data extraction rules and material zone formatting.

Pass 4I: Hatch Pattern Refinement and Keynote Extraction

This pass applies to ALL architectural and structural plan sheets that contain hatched regions or keynote/general note annotations. It is a refinement pass that improves existing Pass 5 material zone data and adds keynote intelligence that no prior pass extracts.

Why this pass exists: Pass 5 uses raw Gabor texture analysis to classify hatched regions, but construction hatching follows AIA/ANSI standards with specific patterns mapped to specific materials. This pass overlays standard-pattern matching on top of Pass 5 results to improve classification accuracy. It also extracts keynote bubbles and general note schedules — critical text-based intelligence that links numbered callouts on drawings to their full descriptions.

What to extract:

Hatch Refinement (improves Pass 5 material_zones):

For each material zone already detected by Pass 5, refine the classification:

Hatch direction analysis: Measure the dominant line angle(s) within each zone — single diagonal (45°) = steel/section cut (ANSI31), double diagonal (45°+135°) = concrete (AR-CONC), dots/stipple = earth, wavy lines = insulation, closely-spaced parallel lines = wood (end-grain vs. with-grain)
Hatch density measurement: Lines per inch within the zone — distinguishes similar patterns (dense = concrete, sparse = earth fill)
Legend cross-reference: If the sheet contains a material legend/key, match each zone's visual pattern to the legend entries and use the legend label as the authoritative material type
Section-cut vs plan-view hatch: In section views, hatching represents cut material. In plan views, hatching represents surface finish or material below. Tag each zone with "context": "section_cut" or "context": "plan_view" based on the sheet type detected in Pass 10/12
Adjacent label matching: If a leader line or text label is within 50px of a hatch zone boundary, associate that label with the zone as material_label
Hatch spacing measurement: Measure the inter-line spacing in pixels and convert using scale calibration to actual spacing (validates pattern identity)

Store refined zones by UPDATING existing entries in plans-spatial.json → material_zones[]:

{
  "zone_id": "MATZ-001",
  "sheet": "A-301",
  "material_type": "concrete",
  "hatch_pattern": "AR-CONC",
  "hatch_angle_deg": [45, 135],
  "hatch_spacing_in": 0.125,
  "hatch_density_lpi": 8.0,
  "context": "section_cut",
  "material_label": "4000 PSI CONC",
  "legend_match": true,
  "area_sf": 12.5,
  "bbox": [120, 340, 280, 520],
  "confidence": "high"
}

Keynote Extraction:

Keynotes are numbered callouts on drawings that reference a keynote schedule (either on the same sheet or on a general notes sheet). They are the primary way architects communicate specification-level information on plan sheets.

Keynote bubbles: Detect numbered circles/diamonds/hexagons with leader lines pointing to drawing elements. Extract the keynote number and the x,y position of the leader endpoint (what it points to)
Keynote schedule: Find the keynote schedule table on the sheet (or referenced sheet). Extract each keynote number → description mapping
General notes: Extract numbered general notes from notes blocks (identified in Pass 1 zone detection). Each note typically starts with a number or letter (1., 2., A., B.)
Note-to-element linking: For each keynote bubble on the drawing, record what element it points to (wall, door, detail, etc.) based on leader endpoint proximity to other detected elements
Specification references in notes: Extract spec section references within notes (e.g., "per Section 03 30 00", "see Spec 07 92 00") and link to the spec reference index

Store keynotes in plans-spatial.json → keynotes:

{
  "keynotes": {
    "schedule": [
      {
        "number": "1",
        "description": "PROVIDE 5/8\" TYPE X GWB ON RATED SIDE OF PARTITION",
        "spec_section": "09 29 00",
        "sheet_source": "A-001"
      }
    ],
    "callouts": [
      {
        "keynote_number": "1",
        "sheet": "A-101",
        "position": [1240, 890],
        "points_to": {"element_type": "wall", "element_id": "Type 2 partition"},
        "leader_endpoint": [1180, 870]
      }
    ]
  },
  "general_notes": [
    {
      "sheet": "A-001",
      "note_number": "1",
      "text": "ALL DIMENSIONS ARE TO FACE OF STUD UNLESS OTHERWISE NOTED",
      "spec_refs": [],
      "category": "dimensions"
    }
  ]
}

Cross-checks after hatch refinement + keynote extraction:

Hatch vs legend: Every hatch zone with a legend on the sheet should have legend_match: true. Zones without legend matches get confidence: "medium" at best
Keynote completeness: Every keynote bubble number on the drawing should have a matching entry in the keynote schedule. Missing = flag as incomplete
General notes vs spec sections: Spec section references in notes should exist in the spec reference index (from Pass 8). Missing = flag for review
Hatch zone coverage: On section sheets, hatched regions should cover most of the cut material area. Large unhatched cut areas = possible missed zones
Keynote density check: Sheets with many keynotes but no schedule reference = schedule may be on another sheet. Flag for multi-sheet linking
Material consistency: Hatch classification should be consistent across sheets — same material hatched the same way. Flag conflicting classifications for the same material across different sheets

See references/visual-extraction-reference.md (Pass 13 methodology section) for hatch angle measurement algorithms, keynote bubble detection, and standard pattern classification tables.

Merge visual results with text-based extraction. Visual data fills gaps where PDF text extraction misses graphical content. See references/visual-extraction-reference.md for accuracy expectations and integration rules.

2. Modular Specialized References

This skill uses domain-specific extraction guides. Read the appropriate reference(s) before extracting:

Reference	When to Use	Contains
construction-document-conventions.md	Always read first — before any extraction	How drawings are organized, sheet numbering navigation, cross-reference system, scale/measurement, quantity calculation formulas, contour/grading, line types, hatch patterns, abbreviations
extraction-rules.md	Read second	Framework, classification, principles
plans-deep-extraction.md	Processing plans/drawings	Complete room/finish/door/window schedules, MEP equipment, structural specs
specifications-deep-extraction.md	Processing specs	All CSI divisions with exact values, weather thresholds, testing frequencies
schedule-deep-extraction.md	Processing CPM schedules	Activity details, sequencing logic, float, critical path, constraints
civil-deep-extraction.md	Processing civil/site drawings	Storm/sanitary/water systems with pipe sizes and elevations
compliance-deep-extraction.md	Processing geotech, safety, SWPPP	Bearing capacity, compaction, BMPs, inspection requirements
project-docs-deep-extraction.md	Processing contracts, sub lists, RFIs, subcontracts, POs	Contract terms, sub contacts, RFI status, submittal tracking, SC scope/exclusions, PO line items
pemb-deep-extraction.md	Processing PEMB manufacturer documents	Column reactions, anchor bolts, framing, erection sequence, panels, accessories
submittals-deep-extraction.md	Processing submittal packages	Concrete mixes, door hardware, shop drawings, MEP equipment, finishes
dxf-extraction.md	Processing .dxf/.dwg CAD files	Layer mapping, block attributes, hatch areas, polyline geometry, dimensions, parse_dxf.py
visual-extraction-reference.md	Processing plan sheet images (PDF→PNG)	OCR text extraction, line/wall detection, symbol recognition, material zones, dimensions, scale calibration
masterformat-reference.md	Enriching reports, QA checks, morning briefs	CSI Div 01-33: sequencing, QC issues, hold points, testing, weather limits, PEMB + healthcare overlays
mep-deep-extraction.md	Processing MEP plan sheets	Complete HVAC/plumbing/electrical equipment, panel schedules, fixture schedules, pipe/duct sizes, fire protection

Workflow

For Single Document

Read base extraction rules:
```
Read references/extraction-rules.md
```
Classify the document (Pass 1 + 2):
- Check PDF metadata (creator application)
- Scan for structural patterns
- Identify document type

Read appropriate specialized reference(s):

# If plans/drawings:
Read references/plans-deep-extraction.md

# If specifications:
Read references/specifications-deep-extraction.md

# If schedule:
Read references/schedule-deep-extraction.md

# etc.

Extract comprehensively following the reference guide
Track extraction quality and structure output

For Multiple Documents

CRITICAL: Process ONE document at a time. Do NOT batch-process multiple documents in a single pass — this causes context overload and extraction loops. Instead:

Read base rules: references/extraction-rules.md
Classify all documents first (Pass 1 + 2 for each) — this is a lightweight scan, not full extraction
Present classification to user for confirmation
Use AskUserQuestion to let the user choose which document to process first
Recommended processing order (earlier provides context for later):
- Plans/drawings FIRST (establishes spatial framework)
  - For plan sheet PDFs: MUST run visual analysis (Pass 4) for scale data and graphical intelligence
- DXF/DWG files FIRST-A (exact spatial data takes priority over PDF estimates)
- Specifications SECOND (adds material/quality details)
- Schedule THIRD (adds time dimension)
- Contract FOURTH (adds constraints)
- Sub list FIFTH (adds personnel)
- Support docs LAST (RFIs, submittals, minutes)
After each document: STOP, report results, ask user what to process next
Cross-reference data across documents incrementally as each new document is processed
Report results after each document with metrics

Extraction Depth Guidance

Extract DEEP (100% completeness)

Plans:

Grid lines (ALWAYS - critical for spatial reference)
ALL room numbers and names
ALL doors in door schedule
ALL equipment in MEP schedules
Structural general notes with EXACT VALUES (PSI, rebar sizes, embedments)

Specifications:

Weather thresholds with NUMBERS (40°F, 90°F, 25 mph)
Testing frequencies with NUMBERS ("1 set per 50 CY")
Hold points vs. witness points (clearly distinguished)
Tolerances with LIMITS (FF 35/FL 25, ±1/4")

Schedule:

ALL milestones with dates
Critical path activities (float = 0)
Activity durations, dates, predecessors for critical/near-critical

Extract BROAD (key items only)

Meeting minutes (action items, decisions only)
Correspondence (directives, not entire threads)
Historical RFIs (closed items, not deep detail)

Integration with Downstream Systems

Daily Reports

Auto-complete location references
Spell subcontractor names correctly
Track room-by-room progress
Flag weather threshold conflicts

Project Management

Phase completion percentages
Schedule health monitoring
Procurement status tracking
Resource loading analysis

Quality Control

Test result verification
Tolerance acceptance
Hold point tracking
Inspection frequency compliance

Procurement

Equipment lead times
Submittal approval status
Long-lead item tracking
Critical path dependencies

Quantitative Intelligence

After extraction, the quantitative-intelligence skill uses extracted data + sheet cross-references to:

Build assembly chains linking elements across multiple sheets
Calculate derived quantities (concrete CY, flooring SF, fixture counts)
Merge DXF, visual, takeoff, and text sources with priority rules
Flag discrepancies when sources disagree by >10%
Store quantities in plans-spatial.json and sheet references in the respective data files for use by reports, briefs, and dashboards

Success Metrics

A successful extraction includes:

Completeness: 100% of critical items extracted
Specificity: Numerical values extracted, not just "per spec"
Structure: JSON output ready for consumption
Quality tracking: Coverage and confidence metrics
Cross-references: Related data linked across documents
User validation: Conflicts and low-confidence items flagged

Project Intelligence Dashboard

When the user asks to see extracted project data ("show me the project intelligence", "project intel dashboard", "what data do we have"), generate a comprehensive Project Intelligence Dashboard:

Load the project data files from the user's working directory (project-config.json, plans-spatial.json, specs-quality.json, schedule.json, directory.json, and all log files)
If not found: "No project intelligence found. Run /set-project first to extract data from your construction documents."
Generate a single self-contained HTML file with:
- Sidebar navigation with sections for: Overview, Documents Loaded, Grid Lines, Building Areas, Room Schedule (searchable), Site Layout, Spec Sections (searchable/filterable by CSI division), Key Materials, Weather Thresholds, Hold Points & Inspections, Construction Tolerances, Schedule, Contract Intelligence, Subcontractor Directory (searchable), Safety Intelligence, SWPPP & Erosion Control, Geotechnical Data, RFI Log, Submittal Log, Procurement & Materials, Vendor Database, Quantities (by CSI division, with source attribution), Sheet Cross-References (drawing index, detail callouts, assembly chains), Discrepancies (unresolved quantity variances)
- Data embedding: Embed the entire config as a JavaScript object (fully self-contained)
- Color palette: Navy #1B2A4A (headers), Blue #2E5EAA (accents), Light gray #F8F9FB (background), Light blue #EDF2F9 (sections), Amber #E8A838 (warnings), Green #2D8F4E (success), Red #C0392B (danger)
- Features: Search and filter, status badges, sidebar count badges, coverage checklist, responsive layout
Save as {PROJECT_CODE}_Project_Intelligence.html in the user's output folder

Extended reference: Detailed examples, templates, scoring rubrics, and best practices are in references/skill-detail.md.

Extraction Hooks (Optional)

Four Node.js hooks in foremanos-intel/hooks/ provide automated quality guardrails during extraction. They run locally (~12ms each, zero API calls) and never block writes — only warn on stderr.

Hook	Trigger	What It Does
`project-brain-validator.js`	PreToolUse (Write\|Edit)	Validates JSON structure, canonical top-level keys, catches empty overwrites
`counter-reconciler.js`	PostToolUse (Write)	Checks `_count` fields match actual array lengths
`extraction-checkpoint.js`	PreCompact	Saves extraction phase/progress before context compaction
`data-loss-guard.js`	PreToolUse (Write)	Warns if new content has fewer records than existing file

See commands/process-docs.md "Extraction Hooks" section for installation instructions.

MCP Tool Integration

Optional MCP server tools enhance the extraction pipeline when available. See references/mcp-extraction-tools.md for the full reference including PDF Tools (Phase 1 classification), PDF Display (Phase 3-4 PM review), Cloudflare D1 (extraction state persistence for 100+ doc projects), and R2 (render archive).