image-mining
I mine pixels for atoms. Reality is just compressed resources.
| name | image-mining |
| description | I mine pixels for atoms. Reality is just compressed resources. |
| license | MIT |
| tier | 1 |
| allowed-tools | ["read_file","write_file"] |
| related | ["visualizer","logistic-container","postal","adventure"] |
| tags | ["moollm","vision","extraction","resources","pixels"] |
"I mine pixels for atoms. Reality is just compressed resources."
"Every image is a lode. Every pixel, potential ore."
Image Mining extends the Kitchen Counter's DECOMPOSE action to images.
Your camera isn't just a recorder — it's a PICKAXE FOR VISUAL REALITY.
Quick Start
Operation Modes
Extensibility
Protocols
Reference
📷 Camera Shot → 🖼️ Image → ⛏️ MINE → 💎 Resources
Just like the Kitchen Counter breaks down:
sandwich → bread + cheese + lettuce
lamp → brass + glass + wick + oil
water → hydrogen + oxygen
Images can be broken down into:
ore_vein.png → iron-ore × 12 + stone × 8
forest.png → wood × 5 + leaves × 20 + seeds × 3
treasure_pile.png → gold × 100 + gems × 15
sunset.png → orange_hue × 1 + warmth × 1 + nostalgia × 1
"The LLM IS the context assembler. Don't script what it does naturally."
When mining images, prefer native LLM vision (Cursor/Claude reading images directly):
┌─────────────────────────────────────────────────────────────────┐
│ NATIVE MODE (PREFERRED) │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Cursor/Claude already has: │
│ ✓ The room YAML (spatial context) │
│ ✓ Character files (who might appear) │
│ ✓ Previous mining passes (what's been noticed) │
│ ✓ The prompt.yml (what was intended) │
│ ✓ The whole codebase (cultural references) │
│ │
│ Just READ the image. The context is already there. │
│ No bash commands. No sister scripts. Just LOOK. │
│ │
└─────────────────────────────────────────────────────────────────┘
| Aspect | Native (Cursor/Claude) | Remote API (mine.py) |
|---|---|---|
| Context | Already loaded | Must be assembled |
| Prior mining | Visible in chat | Passed via stdin |
| Room context | Just read the file | Python parses YAML |
| Synthesis | LLM does it naturally | Script concatenates |
| Iteration | Conversational | Re-run command |
Use mine.py or remote API calls when:
Multi-perspective is the killer use case: Claude sees narrative, GPT-4V sees objects, Gemini sees spatial relationships. Layer them all for rich interpretation.
Even then, have the orchestrating LLM assemble the context:
┌─────────────────────────────────────────────────────────────────┐
│ REMOTE API WITH LLM ASSEMBLY │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 1. LLM reads context files (room, characters, prior mining) │
│ 2. LLM synthesizes: "What to look for in this image" │
│ 3. LLM calls remote vision API with image + synthesized prompt│
│ 4. LLM post-processes response into YAML Jazz │
│ │
│ The SMART WORK happens in the orchestrating LLM. │
│ Remote API just does vision with good instructions. │
│ │
└─────────────────────────────────────────────────────────────────┘
# DON'T do this:
python mine.py image.png --context room.yml --characters chars/ --prior mined.yml
# DO this (in Cursor/Claude):
# 1. Read the image
# 2. Read room.yml, character files, prior -mined.yml
# 3. Look at the image with all that context
# 4. Write YAML Jazz output
The LLM context window IS the context assembly mechanism. Use it.
Image mining works on ANY visual content, not just AI-generated images:
┌─────────────────────────────────────────────────────────────────┐
│ MINEABLE SOURCES │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 🎨 AI-Generated Images │
│ - DALL-E, Midjourney, Stable Diffusion outputs │
│ - Has prompt.yml sidecar with generation context │
│ │
│ 📸 Real Photos │
│ - Phone camera, DSLR, scanned prints │
│ - No prompt — mine what you see │
│ │
│ 📊 Graphs and Charts │
│ - Data visualizations, dashboards │
│ - Extract trends, outliers, relationships │
│ │
│ 🖥️ Screenshots │
│ - UI states, error messages, configurations │
│ - Mine the interface, not just pixels │
│ │
│ 📝 Text Images │
│ - Scanned documents, handwritten notes, signs │
│ - OCR + semantic extraction │
│ │
│ 📄 PDFs │
│ - Documents, papers, invoices │
│ - Cursor may already support — try it! │
│ │
│ 🗺️ Maps and Diagrams │
│ - Architecture diagrams, floor plans, mind maps │
│ - Extract spatial relationships │
│ │
└─────────────────────────────────────────────────────────────────┘
Generated Image (has context):
postal:
type: text
to: "visualizer"
body: "Take a photo of that ore vein on the wall"
attachments:
- type: image
action: generate
prompt: "Rich iron ore vein in cavern wall, glittering..."
Real Photo (mine what you see):
postal:
type: text
to: "miner"
body: "Here's a photo of the treasure room"
attachments:
- type: image
action: upload
source: "camera_roll"
file: "treasure-room.jpg"
Screenshot (extract UI state):
# Mine the error dialog
resources:
error-type: "permission-denied"
affected-file: "/etc/passwd"
suggested-action: "run as sudo"
stack-depth: 3
Graph (extract data relationships):
# Mine the sales chart
resources:
trend: "upward"
peak-month: "december"
anomaly: "march-dip"
yoy-growth: "23%"
All become mineable resources!
"Different images need different tools. The CLI is a pipeline, not a monolith."
The mine.py CLI supports pluggable analyzers that run before, during, or after LLM vision:
┌─────────────────────────────────────────────────────────────────┐
│ ANALYZER PIPELINE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 1. PRE-PROCESSORS │
│ resize, normalize, enhance, format conversion │
│ │
│ 2. CUSTOM ANALYZERS (parallel or sequential) │
│ ├── pose-detection (MediaPipe, OpenPose) │
│ ├── object-detection (YOLO, Detectron2) │
│ ├── ocr-extraction (Tesseract, PaddleOCR) │
│ ├── face-analysis (expression, demographics) │
│ └── leela-customer-models (your trained models!) │
│ │
│ 3. LLM VISION │
│ Receives ALL prior results as context │
│ Synthesizes semantic interpretation │
│ │
│ 4. POST-PROCESSORS │
│ format, validate, merge into final YAML Jazz │
│ │
└─────────────────────────────────────────────────────────────────┘
mine.py fashion-shoot.jpg \
--analyzer pose-detection \
--analyzer face-analysis \
--analyzer leela://acme/gesture-classifier \
--depth philosophical
This runs:
Pull customer-specific models trained on the Leela platform:
# From Leela model registry
mine.py widget-photo.jpg --analyzer leela://customer-id/defect-detector-v3
# Local model file
mine.py widget-photo.jpg --analyzer ./models/my-classifier.pt
Output merges into the mining YAML:
leela_analysis:
model: "acme-widget-defect-v3"
customer: "acme-corp"
detections:
- class: "hairline_crack"
confidence: 0.91
severity: "minor"
location: "top_left_quadrant"
# analyzers/my_analyzer.py
def analyze(image_path: str, config: dict) -> dict:
"""Run analysis, return structured data for YAML output."""
# Your model inference here
return {
"my_analysis": {
"detected": ["thing1", "thing2"],
"confidence": 0.95
}
}
def can_handle(image_path: str, context: dict) -> bool:
"""Return True if this analyzer should run on this image."""
# Auto-detect logic, or return False for explicit-only
return "manufacturing" in context.get("tags", [])
Register in analyzers/registry.yml:
analyzers:
my-analyzer:
module: "analyzers.my_analyzer"
auto-detect: true
requires: ["torch", "my-model-package"]
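One way the registry and the two hooks above could fit together is sketched below. It assumes each analyzer exposes `analyze` and `can_handle` as shown; the `run_analyzers` helper, the in-memory `REGISTRY` dict (a stand-in for a parsed registry.yml), and the optional `config` key are hypothetical and are not taken from mine.py itself.

```python
from types import SimpleNamespace
from typing import Any


# Hypothetical stand-in for a module such as analyzers/my_analyzer.py.
def _analyze(image_path: str, config: dict) -> dict:
    return {"my_analysis": {"detected": ["thing1", "thing2"], "confidence": 0.95}}


def _can_handle(image_path: str, context: dict) -> bool:
    return "manufacturing" in context.get("tags", [])


my_analyzer = SimpleNamespace(analyze=_analyze, can_handle=_can_handle)

# Hypothetical in-memory form of analyzers/registry.yml after parsing.
REGISTRY: dict[str, dict[str, Any]] = {
    "my-analyzer": {"module": my_analyzer, "auto-detect": True},
}


def run_analyzers(image_path: str, context: dict, explicit=(), registry=REGISTRY) -> dict:
    """Run explicitly requested analyzers plus auto-detected ones, merging their output."""
    merged: dict = {}
    for name, entry in registry.items():
        module = entry["module"]
        requested = name in explicit
        auto = entry.get("auto-detect", False) and module.can_handle(image_path, context)
        if requested or auto:
            # Each analyzer returns a dict keyed by its own section name.
            merged.update(module.analyze(image_path, entry.get("config", {})))
    return merged
```

In this sketch an analyzer named on the command line always runs, while an auto-detect analyzer runs only when its `can_handle` check accepts the image context.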
| Approach | Pros | Cons |
|---|---|---|
| Monolith | Simple | Can't add domain models |
| Pipeline | Extensible, composable | Slightly more complex |
The LLM is great at semantic synthesis, but it can't run your custom pose detection model. The pipeline lets each tool do what it's best at:
"Comments are SEMANTIC DATA, not just documentation!"
YAML Jazz is the output format for mining results. Structure provides the backbone; comments provide the insight.
notes: fields for longer thoughts
# Mining results for treasure-room.jpg
# Depth: full | Provider: openai/gpt-4o
resources:
gold:
quantity: 150 # Piled in mounds — not scattered, PLACED
confidence: 0.85 # Torchlight glints clearly off the metal
notes: |
Mix of Roman denarii and medieval florins. Centuries of
accumulation. This isn't a king's orderly treasury — this is
a thieves' hoard. Generations of stolen wealth, piled and
forgotten. The dust layer says nobody's touched it in ages.
danger:
intensity: 0.7 # Not immediate, but PRESENT
confidence: 0.75 # Hard to see into the corners
sources:
- "Skeleton in corner — previous seeker, didn't make it"
- "Shadows too dark for natural torchlight — something absorbs"
- "Dust undisturbed except ONE trail — something still comes here"
notes: "This hoard is guarded. Or cursed. Probably both."
nostalgia:
intensity: 0.4 # Whisper of lost civilizations
confidence: 0.6 # Subjective, but the coins evoke it
notes: "Who were they? Where did this come from? All gone now."
dominant_colors:
- name: "treasure-gold"
hex: "#FFD700"
coverage: 0.4 # Catches the eye first — that's the point
- name: "shadow-purple"
hex: "#2D1B4E"
coverage: 0.3 # Where the danger lives
implied_smells:
- dust # Centuries of it
- old metal # Copper, bronze, the tang of coins
- something rotting # Not recent, but not ancient either
exhausted: false
mining_notes: |
Rich lode for material and philosophical mining.
The image is ABOUT greed and its costs. The skeleton says everything.
# Meta-observation: This image wants to be a warning.
# "Here lies what you seek — and what happens when you find it."
An uncommented extraction is like a song without soul. The best mining results read like poetry annotated by a geologist.
When you mine, capture:
The LLM looks at the image AND checks what resources are currently requested by the logistics network:
analyze:
image: "treasure-room.jpg"
# LLM knows what's NEEDED from logistics requesters
logistics_context:
active_requests:
- { item: "gold", requester: "forge/", needed: 100 }
- { item: "gems", requester: "jewelry-shop/", needed: 50 }
- { item: "iron-ore", requester: "smelter/", needed: 200 }
# LLM identifies what CAN BE MINED that matches requests
analysis_prompt: |
Look at this image. What resources can you identify?
Prioritize resources that match these requests: {requests}
For each resource, estimate quantity available.
The LLM returns a resource mapping that gets stored ON the image:
image:
id: "treasure-room-photo"
file: "treasure-room.jpg"
type: mineable-image
# RESOURCE MAP (instantiated by LLM analysis)
resources:
gold:
total: 150 # Total available
remaining: 150 # Not yet mined
per_turn: 10 # Can extract 10 per turn
gems:
total: 45
remaining: 45
per_turn: 5
ancient-coins:
total: 30
remaining: 30
per_turn: 3
rare: true # Bonus find!
dust:
total: 500
remaining: 500
per_turn: 50
value: low
# Metadata
analyzed_at: "2026-01-10T14:30:00Z"
exhausted: false
Each turn, you can mine resources from the image:
action: MINE
target: "treasure-room-photo"
# This turn's extraction (limited by per_turn rates)
result:
extracted:
- item: gold
quantity: 10 # per_turn limit
destination: "forge/"
- item: gems
quantity: 5
destination: "jewelry-shop/"
# Image state updated
image_state:
resources:
gold:
remaining: 140 # Was 150, mined 10
gems:
remaining: 40 # Was 45, mined 5
exhausted: false
After enough mining turns, resources run out:
# After 15 turns of mining gold...
image_state:
resources:
gold:
total: 150
remaining: 0 # EXHAUSTED!
per_turn: 10
exhausted: true
gems:
total: 45
remaining: 0 # EXHAUSTED!
per_turn: 5
exhausted: true
ancient-coins:
total: 30
remaining: 0
per_turn: 3
exhausted: true
exhausted: true # Whole image sucked dry!
# Narrative
description: |
The treasure room photo has been thoroughly mined.
Every glinting surface has been extracted, every
coin accounted for. The image looks... drained.
Faded. Like a photocopy of a photocopy.
Once exhausted, you can't mine that image anymore!
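The per-turn bookkeeping described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `mine_turn` function is hypothetical, it mines every listed resource each turn (the example turn above only extracted the requested ones), and it reuses the field names `remaining`, `per_turn`, and `exhausted` from the resource map.

```python
def mine_turn(image: dict) -> dict:
    """Extract up to per_turn of every resource and return what was extracted."""
    if image.get("exhausted"):
        raise ValueError("image is exhausted and cannot be mined")
    extracted = {}
    for name, res in image["resources"].items():
        # Never take more than what is left.
        amount = min(res["per_turn"], res["remaining"])
        if amount > 0:
            res["remaining"] -= amount
            extracted[name] = amount
        res["exhausted"] = res["remaining"] == 0
    # The image as a whole is exhausted once every resource is.
    image["exhausted"] = all(r["exhausted"] for r in image["resources"].values())
    return extracted


image = {
    "id": "treasure-room-photo",
    "exhausted": False,
    "resources": {
        "gold": {"total": 150, "remaining": 150, "per_turn": 10},
        "gems": {"total": 45, "remaining": 45, "per_turn": 5},
    },
}
first = mine_turn(image)  # gold drops to 140 and gems to 40, as in the example turn
```

With these numbers gems run out after 9 turns and gold after 15, at which point the image-level `exhausted` flag flips and further calls raise an error.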
The LLM prioritizes what the logistics network NEEDS!
# The smelter is requesting iron ore
logistic-container:
id: smelter
mode: requester
request_list:
- { item: "iron-ore", count: 200, priority: high }
- { item: "coal", count: 100, priority: medium }
# Player takes a photo of a cave wall
# LLM analyzes and finds:
analysis:
image: "cave-wall.jpg"
found_resources:
iron-ore: 80 # "I see iron ore veins! The smelter needs this!"
copper-ore: 30 # Also present but not requested
quartz: 50 # Background mineral
cave-moss: 100 # Organic material
priority_matching:
- resource: iron-ore
matches_request: true
requester: "smelter/"
highlight: "⭐ HIGH PRIORITY — Smelter needs this!"
The LLM acts as a smart prospector that knows what's valuable based on current demand!
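The matching step itself is simple enough to sketch. The snippet below is an illustration only: `match_requests` and the `available` and `shortfall` fields are hypothetical names, and in practice the LLM produces `found_resources` by looking at the image rather than from a lookup table.

```python
def match_requests(found: dict[str, int], requests: list[dict]) -> list[dict]:
    """Pair resources found in an image with active requests, highest priority first."""
    priority_rank = {"high": 0, "medium": 1, "low": 2}
    matches = []
    for req in requests:
        available = found.get(req["item"], 0)
        if available > 0:
            matches.append({
                "resource": req["item"],
                "requester": req["requester"],
                "available": available,
                # How much of the request this image cannot cover.
                "shortfall": max(req["count"] - available, 0),
                "priority": req["priority"],
            })
    # Unknown priority labels sort after the known ones.
    matches.sort(key=lambda m: priority_rank.get(m["priority"], len(priority_rank)))
    return matches


found = {"iron-ore": 80, "copper-ore": 30, "quartz": 50, "cave-moss": 100}
requests = [
    {"item": "iron-ore", "count": 200, "priority": "high", "requester": "smelter/"},
    {"item": "coal", "count": 100, "priority": "medium", "requester": "smelter/"},
]
matches = match_requests(found, requests)
```

Here only iron-ore matches: the cave wall holds 80 of the 200 requested, so the smelter still needs 120 from other images, and the coal request stays unmatched.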
| Mode | What LLM Looks For |
|---|---|
| demand | Only resources with active requests |
| opportunistic | Requested resources + valuable extras |
| thorough | Everything mineable in the image |
| philosophical | Abstract concepts, emotions, meanings |
mine:
target: "sunset-beach.jpg"
mode: philosophical
# LLM finds abstract resources
resources:
nostalgia: 15
warmth: 30
passage-of-time: 5
beauty: 20
sand: 10000 # Also the literal stuff
Different image types yield different resources:
| Image Type | Yields |
|---|---|
| Ore vein | iron-ore, copper-ore, gold, gems |
| Forest | wood, leaves, seeds, birds |
| Ocean | water, salt, fish, seaweed |
| Mountain | stone, minerals, snow, air |
| Desert | sand, glass, heat, mirage |
| Sky | clouds, light, space, dreams |
| Image Type | Yields |
|---|---|
| Building | stone, wood, glass, inhabitants |
| Machinery | gears, pipes, steam, purpose |
| Treasure pile | gold, gems, artifacts, curses |
| Library | books, knowledge, dust, secrets |
| Image Type | Yields |
|---|---|
| Sunset | colors, warmth, nostalgia, time |
| Portrait | personality, mood, secrets, stories |
| Abstract art | shapes, feelings, confusion, inspiration |
| Text/writing | words, meaning, intent, language |
Just like the Kitchen Counter goes from practical → chemical → atomic → philosophical:
| Depth | What You Mine |
|---|---|
| Surface | Objects, materials |
| Deep | Emotions, concepts |
| Sensations | Colors, smells, attitudes, feelings |
| Quantum | Probabilities, observations |
| Philosophical | Meaning, existence, narrative |
deep_mining:
target: "sunset.png"
depth: philosophical
yields:
- item: "the-passage-of-time"
quantity: 1
type: abstract
- item: "mortality-awareness"
quantity: 1
type: existential
warning: "This may cause introspection"
- item: "beauty-that-fades"
quantity: 1
type: poetic
Extract colors, smells, textures, moods:
sensation_mining:
target: "farmers-market.jpg"
depth: sensations
yields:
# Colors
- item: "tomato-red"
quantity: 40
type: color
hex: "#FF6347"
- item: "basil-green"
quantity: 25
type: color
hex: "#228B22"
# Smells (imagined from visual cues)
- item: "fresh-bread-aroma"
quantity: 10
type: smell
intensity: warm
- item: "ripe-fruit-sweetness"
quantity: 30
type: smell
# Attitudes/Feelings
- item: "weekend-morning-calm"
quantity: 5
type: attitude
- item: "abundance"
quantity: 20
type: feeling
# Textures
- item: "rough-burlap"
quantity: 15
type: texture
- item: "sun-warmed-wood"
quantity: 8
type: texture
Use these in crafting:
tomato-red + canvas → painted artwork
fresh-bread-aroma + room → ambiance modifier
weekend-morning-calm + character → mood buff
Any object or image can have a mineable property:
object:
name: Ancient Ore Painting
type: artwork
description: |
A painting of a rich ore vein. But wait...
is that actual ore embedded in the canvas?
mineable:
enabled: true
yields:
- item: iron-ore
quantity: [5, 15] # Range: 5-15 per mine
- item: copper-ore
quantity: [2, 8]
- item: artistic-essence
quantity: 1
rare: 0.3 # 30% chance
exhaustion:
max_mines: 3 # Can mine 3 times before exhausted
diminishing: 0.5 # Each mine yields 50% less
regenerates: false # Once exhausted, stays exhausted
side_effects:
- "The painting fades slightly with each extraction"
- "You feel the artist's disappointment"
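A rough sketch of how these fields might be evaluated follows. The interpretation is an assumption: `diminishing: 0.5` is read as halving the yield on each successive mine, quantities are rounded down, and `rare` is treated as a probability. The `mine_object` function and `PAINTING` dict are hypothetical names.

```python
import random

PAINTING = {
    "yields": [
        {"item": "iron-ore", "quantity": [5, 15]},
        {"item": "copper-ore", "quantity": [2, 8]},
        {"item": "artistic-essence", "quantity": 1, "rare": 0.3},
    ],
    "exhaustion": {"max_mines": 3, "diminishing": 0.5, "regenerates": False},
}


def mine_object(mineable: dict, mines_so_far: int, rng: random.Random) -> dict[str, int]:
    """Roll one extraction; returns an empty dict once max_mines is reached."""
    exhaustion = mineable["exhaustion"]
    if mines_so_far >= exhaustion["max_mines"]:
        return {}
    # Assumed reading: each earlier mine cuts the yield by the diminishing factor.
    scale = (1 - exhaustion["diminishing"]) ** mines_so_far
    result = {}
    for entry in mineable["yields"]:
        # Rare items only appear with the given probability.
        if "rare" in entry and rng.random() >= entry["rare"]:
            continue
        qty = entry["quantity"]
        base = rng.randint(*qty) if isinstance(qty, list) else qty
        amount = int(base * scale)  # round down; tiny yields vanish
        if amount > 0:
            result[entry["item"]] = amount
    return result
```

Under this reading the first mine yields 5 to 15 iron-ore, the second 2 to 7, the third 1 to 3, and a fourth attempt returns nothing.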
Different tools affect mining yields:
tool: camera
efficiency: 1.0
specialty: "Captures visual resources"
can_mine: [images, scenes, visible_objects]
tool: analyzer
efficiency: 1.5
specialty: "Chemical/atomic resources"
can_mine: [materials, substances, compounds]
tool: oracle_eye
efficiency: 2.0
specialty: "Abstract/philosophical resources"
can_mine: [emotions, concepts, meanings, futures]
tool: reality_pickaxe
efficiency: 3.0
specialty: "Everything, but dangerous"
can_mine: [anything]
warning: "May collapse local reality"
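The tool definitions do not say how `efficiency` is applied. A plausible reading, sketched below, treats it as a multiplier on the base yield and treats `can_mine` as a gate. The `tool_yield` function is hypothetical, and both the multiplier reading and the rounding are assumptions.

```python
TOOLS = {
    "camera": {"efficiency": 1.0, "can_mine": {"images", "scenes", "visible_objects"}},
    "analyzer": {"efficiency": 1.5, "can_mine": {"materials", "substances", "compounds"}},
    "oracle_eye": {"efficiency": 2.0, "can_mine": {"emotions", "concepts", "meanings", "futures"}},
    "reality_pickaxe": {"efficiency": 3.0, "can_mine": {"anything"}},
}


def tool_yield(tool_name: str, category: str, base_quantity: int) -> int:
    """Scale a base yield by the tool's efficiency, or return 0 if the tool cannot mine it."""
    tool = TOOLS[tool_name]
    if "anything" not in tool["can_mine"] and category not in tool["can_mine"]:
        return 0
    return round(base_quantity * tool["efficiency"])
```

For example, an analyzer would turn a base yield of 10 materials into 15, while a camera pointed at emotions would yield nothing.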
Mined resources flow into the logistics system:
mining_config:
default_destination: "inventory"
routing:
# Route by resource type
- match: { tags: ["ore"] }
destination: "nw/ore-storage/"
- match: { tags: ["organic"] }
destination: "ne/organic-materials/"
- match: { tags: ["abstract"] }
destination: "sw/concepts/"
postal_delivery:
enabled: true
method: text # Instant delivery!
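The routing rules can be read as a first-match table with a fallback. The sketch below assumes that a rule matches when all of its tags appear on the resource and that earlier rules win; neither detail is spelled out in the config, and the `route` function is a hypothetical name.

```python
ROUTING = [
    {"match": {"tags": ["ore"]}, "destination": "nw/ore-storage/"},
    {"match": {"tags": ["organic"]}, "destination": "ne/organic-materials/"},
    {"match": {"tags": ["abstract"]}, "destination": "sw/concepts/"},
]
DEFAULT_DESTINATION = "inventory"


def route(resource_tags: list[str]) -> str:
    """Return the destination of the first rule whose tags all appear on the resource."""
    tags = set(resource_tags)
    for rule in ROUTING:
        if set(rule["match"]["tags"]) <= tags:
            return rule["destination"]
    # No rule matched, so fall back to default_destination.
    return DEFAULT_DESTINATION
```

A resource tagged `["ore", "metal"]` would land in `nw/ore-storage/`, and an untagged one would stay in `inventory`.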
Your phone camera is THE mining interface:
phone_mining:
# 1. CAPTURE: Take photo or upload
capture:
sources:
- camera: "Take new photo"
- gallery: "Upload from camera roll"
- url: "Import from web"
# 2. ANALYZE: LLM scans for resources
on_capture:
action: analyze
context: logistics_requests # What's needed?
show_preview: true
# 3. CONFIRM: Accept resource mapping
on_confirm:
action: instantiate
attach_resources: true # Store on image
# 4. MINE: Extract over time
on_mine:
per_turn: true # N resources per turn
auto_route: logistics # Send to requesters
1. You take a photo of a rock formation:
📷 *snap*
Analyzing photo for mineable resources...
Checking logistics requests...
Found in image:
├── 🪨 granite × 200 (10/turn)
├── ⚫ iron-ore × 45 (5/turn) ⭐ NEEDED by smelter!
├── 💎 quartz × 12 (2/turn)
└── 🦎 fossil × 1 (rare find!)
[MINE] [CANCEL]
2. You confirm. Resource map attached:
image:
id: rock-formation-001
file: "IMG_2847.jpg"
resources:
granite: { total: 200, remaining: 200, per_turn: 10 }
iron-ore: { total: 45, remaining: 45, per_turn: 5 }
quartz: { total: 12, remaining: 12, per_turn: 2 }
fossil: { total: 1, remaining: 1, per_turn: 1 }
3. Each turn, you mine:
Turn 1: Mined 10 granite, 5 iron-ore, 2 quartz
→ Iron ore sent to smelter (requester)
→ Granite sent to storage
Turn 2: Mined 10 granite, 5 iron-ore, 2 quartz
Remaining: granite 180, iron-ore 35, quartz 8
...
Turn 9: Mined 10 granite, 5 iron-ore (last 5!)
⚠️ Iron-ore EXHAUSTED
Turn 20: Mined last 10 granite
📷 IMAGE FULLY MINED — no more resources!
4. Exhausted image:
image:
id: rock-formation-001
exhausted: true
visual_effect: |
The photo appears faded, almost translucent.
Like the minerals were literally pulled out of it.
A ghost of a photograph.
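The turn counts in the walkthrough follow directly from `total` and `per_turn`. A small check, assuming every resource is mined each turn starting from turn 1 (the `turns_to_exhaust` helper is a hypothetical name):

```python
import math


def turns_to_exhaust(resources: dict[str, dict]) -> dict[str, int]:
    """Number of turns needed to drain each resource at its per_turn rate."""
    return {
        name: math.ceil(res["total"] / res["per_turn"])
        for name, res in resources.items()
    }


rock_formation = {
    "granite": {"total": 200, "remaining": 200, "per_turn": 10},
    "iron-ore": {"total": 45, "remaining": 45, "per_turn": 5},
    "quartz": {"total": 12, "remaining": 12, "per_turn": 2},
    "fossil": {"total": 1, "remaining": 1, "per_turn": 1},
}
schedule = turns_to_exhaust(rock_formation)
# The image is fully mined when its slowest resource runs out.
image_exhausted_on = max(schedule.values())
```

This gives 9 turns for iron-ore and 20 for granite, matching the walkthrough, with quartz gone after 6 turns and the fossil after 1.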
ar_overlay:
# Point camera at scene
live_view:
show_resources: true
icons_float: true
# Visual indicators
indicators:
- resource_type: "icon + label"
- quantity: "number overlay"
- priority: "⭐ for requested items"
- exhaustion: "fade as mined"
# Example view:
# 🪨 200 ⚫ 45 ⭐ 💎 12
# (floating over rock formation)
| DECOMPOSE (Counter) | MINE (Camera) |
|---|---|
| Physical items | Images, scenes, visuals |
| Requires counter | Requires camera/tool |
| Consumes item | May or may not consume |
| Returns components | Returns resources |
| Kitchen-focused | World-focused |
They're complementary!
At the deepest level, you're not just mining images — you're mining reality itself:
reality_mining:
level: transcendent
# The image IS the territory
insight: |
When you mine an image, you're extracting
compressed information. But all reality is
compressed information. Images are just
explicit about it.
implications:
- "Mining a photo of gold doesn't create gold — it REVEALS gold"
- "The ore was always there, encoded in the pixels"
- "Your camera doesn't capture reality — it DECOMPRESSES it"
warning: |
At this level, the distinction between
"mining an image" and "mining reality"
becomes philosophical.
MINE [target]
MINE [target] WITH [tool]
MINE [target] TO [destination]
SCAN [target] # Preview yields without mining
SCAN AREA # Scan visible area for mineable resources
PROSPECT [direction] # Check for mineable resources in direction
PROSPECT DEEP # Deep scan for rare/hidden resources
# Player in dark maze corridor
# Takes photo with lamp light
action: MINE "dark-corridor.png"
result:
yields:
- item: darkness
quantity: 100
type: abstract
note: "Bottled darkness, useful for stealth"
- item: fear
quantity: 15
type: emotion
note: "Crystallized fear, grue-adjacent"
- item: mystery
quantity: 5
type: narrative
note: "Pure narrative potential"
- item: stone-dust
quantity: 50
type: material
rare_find:
- item: "ancient-writing"
quantity: 1
note: "Hidden message in the shadows!"
unlocks: "Secret passage revealed"
Resources have value and flow:
resource_economy:
# Raw resources → processing → products
chains:
- ore → smelter → ingots → forge → tools
- wood → sawmill → planks → workshop → furniture
- images → mining → resources → crafting → items
# Images as a resource type!
image_value:
unique_photo: high # Original content
copy: low # Duplicated content
AI_generated: medium # Generated on demand
# Mining generates content
content_creation: |
When you MINE an image, you're not just extracting
resources — you're creating YAML files for them.
Each resource becomes a game object.
"Who's in the picture? Match against your cast list."
When mining images with known characters, the LLM matches visual features against character metadata.
characters/ directory
characters/*.yml: character definitions with visual descriptors
characters/*/CARD.yml: character cards with appearance
characters_detected:
- id: palm
name: "Palm"
confidence: 0.95
location: "center-left"
pose: "seated at desk"
expression: "scholarly contentment"
accessories: ["tiny espresso", "typewriter"]
interacting_with: ["kittens", "biscuit"]
notes: "Matches Dutch Golden Age portrait style"
- id: marieke
name: "Marieke"
confidence: 0.92
location: "behind bar"
pose: "waving"
expression: "warm welcome"
accessories: ["apron with LEKKER text"]
- id: unknown-1
confidence: 0.0
location: "background-right"
description: "Figure in shadow, can't identify"
possible_matches: ["henk", "wumpus"]
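Downstream code will usually want to separate confident identifications from figures that still need a human or another mining pass. The sketch below shows one way to do that; the `split_detections` function and the 0.8 threshold are hypothetical choices, not part of the skill.

```python
def split_detections(detections: list[dict], cast_ids: set[str], threshold: float = 0.8):
    """Split detections into confidently identified cast members and unresolved figures."""
    identified, unresolved = [], []
    for det in detections:
        # A detection counts only if it names a known character with enough confidence.
        if det["id"] in cast_ids and det["confidence"] >= threshold:
            identified.append(det)
        else:
            unresolved.append(det)
    return identified, unresolved


detections = [
    {"id": "palm", "confidence": 0.95, "location": "center-left"},
    {"id": "marieke", "confidence": 0.92, "location": "behind bar"},
    {"id": "unknown-1", "confidence": 0.0, "location": "background-right",
     "possible_matches": ["henk", "wumpus"]},
]
cast = {"palm", "marieke", "henk", "wumpus"}
identified, unresolved = split_detections(detections, cast)
```

With the example above, Palm and Marieke are identified and the shadowed figure stays unresolved, keeping its `possible_matches` for a later pass.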
--depth characters or the cast-list lens
"One eye sees objects. Two eyes see depth. Many eyes see truth."
Multi-Look Mining layers interpretations from different perspectives, building up rich semantic sediment like geological strata. Each mining pass adds a new layer of meaning.
# Layer 1: OpenAI GPT-4o
# Focus: General resource extraction
layer_1_openai:
miner: "gpt-4o"
focus: "objects, materials, colors, mood"
findings:
atmosphere: { intensity: 0.8 }
objects: { quantity: 10 }
# ... general observations ...
# Layer 2: Claude (Cursor built-in)
# Focus: Character expression, cultural markers, narrative POV
layer_2_cursor_claude:
miner: "claude-opus-4"
focus: "character-expression, cultural-markers, narrative-pov"
what_layer_1_missed:
- "The SECOND cat on the windowsill"
- "The apron text is Dutch (LEKKER)"
- "The espresso cup is monkey-sized (intentional)"
deeper_resonance:
theme: "home is where they wave when you walk in"
# Layer 3: Gemini
# Focus: Art historical references, compositional analysis
layer_3_gemini:
miner: "gemini-pro-vision"
focus: "art-history, composition, color-theory"
# ... yet another perspective ...
Different LLMs — and different PROMPTS to the same LLM — notice different things:
| Miner | Strengths | Typical Focus |
|---|---|---|
| OpenAI GPT-4o | General coverage | Objects, counts, colors |
| Claude | Nuance, context | Expression, culture, narrative |
| Gemini | Technical | Composition, art history |
| Human | Domain expertise | What MATTERS to the use case |
The sum is greater than the parts. Each layer adds perspectives the others missed.
Think of multi-look mining like painting in layers:
┌─────────────────────────────────────────────────────────────────┐
│ IMAGE INTERPRETATION │
├─────────────────────────────────────────────────────────────────┤
│ Layer N+1 → Specialized focus (your choice) │
│ Layer N → New questions raised by Layer N-1 │
│ ... │
│ Layer 3 → Art history, composition │
│ Layer 2 → Character, culture, narrative │
│ Layer 1 → Objects, materials, basic resources │
│ ───────────────────────────────────────────────────────────── │
│ ORIGINAL IMAGE │
└─────────────────────────────────────────────────────────────────┘
Each pass reads the PREVIOUS layers before adding its own. The new miner knows what's already been noticed, so it can focus on what's missing or offer alternative interpretations.
When mining an image with multi-look:
Different passes should use different lenses:
| Lens | What It Sees |
|---|---|
| Technical | Composition, lighting, depth of field, color theory |
| Narrative | Who took this? Why? What moment is this? |
| Cultural | Language markers, traditions, historical context |
| Emotional | Expressions, body language, mood |
| Symbolic | Metaphors, allegories, hidden meanings |
| Character | Identity, relationships, motivations |
| Historical | Art history references, period markers |
| Economic | Value, ownership, class markers |
| Phenomenological | What does it FEEL like to be there? |
Image: Marieke waving from behind the bar with Palm the monkey
Layer 1 (OpenAI):
Layer 2 (Claude):
Layer 3 (Art History):
Layer 4 (Phenomenology):
Each layer enriches the total understanding.
Append new layers to the same -mined.yml file:
# Original mining from Layer 1
resources:
atmosphere: ...
objects: ...
exhausted: false
mining_notes: "Initial extraction complete"
# MULTI-LOOK MINING — Layer 2
layer_2_cursor_claude:
miner: "claude-opus-4"
focus: "character, culture, narrative"
date: "2026-01-19"
character_analysis:
marieke:
expression: "genuine warmth"
notes: "Duchenne smile — reaches her eyes"
what_layer_1_missed:
- "Second cat on windowsill"
- "LEKKER cultural significance"
exhausted: false
next_suggested_focus: "art history, lighting analysis"
# MULTI-LOOK MINING — Layer 3
layer_3_art_history:
miner: "human/don"
focus: "art historical references"
# ... and so on ...
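The append-only pattern can be sketched on the parsed form of a -mined.yml file. This is an illustration under assumptions: the file is already loaded into a dict, and the `add_layer` and `layer_names` helpers are hypothetical names. Writing the result back to disk is left out.

```python
def layer_names(mined: dict) -> list[str]:
    """List the multi-look layers already present, in insertion order."""
    return [key for key in mined if key.startswith("layer_")]


def add_layer(mined: dict, name: str, miner: str, focus: str, **findings) -> dict:
    """Append a new layer without touching earlier ones; the image stays mineable."""
    if name in mined:
        raise KeyError(f"layer already exists: {name}")
    mined[name] = {"miner": miner, "focus": focus, **findings}
    # Multi-look mining deepens the image rather than exhausting it.
    mined["exhausted"] = False
    return mined


mined = {
    "resources": {"atmosphere": {"intensity": 0.8}, "objects": {"quantity": 10}},
    "exhausted": False,
    "mining_notes": "Initial extraction complete",
}
add_layer(
    mined,
    "layer_2_cursor_claude",
    miner="claude-opus-4",
    focus="character, culture, narrative",
    what_layer_1_missed=["Second cat on windowsill", "LEKKER cultural significance"],
)
```

Refusing to overwrite an existing layer keeps each pass's observations intact, so a later miner can read everything that came before it.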
Use multi-look mining when:
Unlike single-pass mining, multi-look mining doesn't exhaust the image — it deepens it:
# Single-pass: extracts and depletes
pass_1:
resources: { gold: 50 }
remaining: { gold: 0 }
exhausted: true
# Multi-look: adds and enriches
layer_1:
resources: { gold: 50 }
exhausted: false # Still more to see!
layer_2:
resources: { narrative: 1, meaning: 1 }
what_layer_1_missed: ["gold coins are Roman denarii"]
exhausted: false # STILL more!
layer_3:
resources: { art_history: 1 }
references: ["Pieter Claesz vanitas still life"]
exhausted: false # ALWAYS more to see
Images are never truly exhausted. There's always another perspective.
"The image doesn't exist yet? MINE IT ANYWAY. This is fiction."
Speculative Mining is when you mine an image that hasn't been generated yet — or may never be generated. The mining output IS the world-building. The hallucinated resources ARE canonical.
"Two eyes see what IS. The Third Eye sees what COULD BE."
In MOOLLM, the Third Eye is the image mining layer: the MINING-*.yml files that add meaning, effects, and world-building to an image before it is generated, after it is generated, or without it ever being generated. Third eyes can imagine images or analyze existing ones. Each focuses on whatever it chooses, integrates its own interpretation with the existing data, and organizes the result incrementally.
Character-Perspective Visualization: When you generate or mine an image, you can do it from a character's perspective. The visualizer inherits that character's eyes — their facets, filters, blind spots, and style. Morgan sees economics. Luna sees beauty. Scratch sees deception. The same scene, photographed by different characters, yields DIFFERENT images.
"One eye. Infinite tools. Unfold what you need."
┌─────────────────────────────────────────────────────────────────────────┐
│ │
│ ███████╗██╗ ██╗██╗███████╗███████╗ █████╗ ██████╗ ███╗ ███╗ │
│ ██╔════╝██║ ██║██║██╔════╝██╔════╝ ██╔══██╗██╔══██╗████╗ ████║ │
│ ███████╗██║ █╗ ██║██║███████╗███████╗ ███████║██████╔╝██╔████╔██║ │
│ ╚════██║██║███╗██║██║╚════██║╚════██║ ██╔══██║██╔══██╗██║╚██╔╝██║ │
│ ███████║╚███╔███╔╝██║███████║███████║ ██║ ██║██║ ██║██║ ╚═╝ ██║ │
│ ╚══════╝ ╚══╝╚══╝ ╚═╝╚══════╝╚══════╝ ╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝ │
│ │
│ Y ™ E Y E │
│ │
│ Your Complete Viewer Toolkit — Unfold What You Need │
│ │
└─────────────────────────────────────────────────────────────────────────┘
The Swiss Army Eye is NOT a single product — it's the CONCEPT. The entire NO AI bionic eye ecosystem is your viewer toolkit:
swiss_army_eye:
concept: "Modular perception toolkit"
philosophy: "Unfold the tool you need, when you need it"
# THE BLADES — Core Eye Modules
blades:
IRIS-III: "The Third Eye blade — basic meaning perception"
IRIS-IV: "The Hindsight blade — see what you missed"
IRIS-V: "The Peripheral blade — catch what you weren't looking at"
IRIS-VI: "The Intuition blade — gut-level knowing"
IRIS-VII: "The Crown blade — unified vision"
# THE TOOLS — Specialty Facets
tools:
queer_eye: "The lifestyle transformation tool"
marie_kondo: "The joy-detection tool"
gordon_ramsay: "The culinary critique tool"
bob_ross: "The beauty-finding tool"
attenborough: "The nature documentary tool"
# THE ATTACHMENTS — Filters
attachments:
irony_amplifier: "2x irony detection"
nostalgia_tint: "sepia wash on memories"
cynicism_blocker: "cannot perceive malice"
beauty_enhancer: "+30% aesthetic appreciation"
# THE HANDLE — Character Perspective
handle:
name: "Photographer Identity"
function: "Whose grip shapes the view"
note: "The handle determines how all tools are used"
# THE CASING — Installation Site
casing:
options: "forehead, back of head, gut, asshole, tongue, wherever"
function: "Where the toolkit lives on your body"
note: "Different positions grant different vantages"
Like the original Swiss Army Knife, you don't use everything at once. You unfold what you need:
scenarios:
mining_a_landscape:
unfold:
- IRIS-III (meaning)
- bob_ross facet
- beauty_enhancer filter
result: "See happy little trees and painting potential"
evaluating_someone's_home:
unfold:
- IRIS-III (meaning)
- IRIS-V (peripheral)
- queer_eye facet (all five sub-facets)
result: "Full Fab Five transformation vision"
debugging_why_project_failed:
unfold:
- IRIS-IV (hindsight)
- ass_eye installation
- cynicism_blocker (OFF — let it through)
result: "See exactly what you left behind and why"
enjoying_a_meal:
unfold:
- gordon_ramsay facet (keep it CLOSED unless you want to suffer)
- OR bob_ross facet (everything is delicious in its own way)
result: "Choose your reality"
watching_humans_at_a_party:
unfold:
- attenborough facet
- queer_eye:culture facet
result: "Nature documentary meets emotional unpacking"
┌─────────────────────────────────────────────────────────────────┐
│ │
│ The Swiss Army Knife doesn't make you use all 47 tools │
│ at once. That would be insane. │
│ │
│ The Swiss Army Eye is the same. │
│ │
│ UNFOLD what you need. │
│ CLOSE what you don't. │
│ STACK when it helps. │
│ CUSTOMIZE your carry. │
│ │
│ Your eyes. Your tools. Your perception. │
│ │
└─────────────────────────────────────────────────────────────────┘
Pre-configured loadouts for common situations:
loadouts:
THE_CREATIVE:
eyes: [IRIS-III, IRIS-V]
facets: [bob_ross, aesthetic, symbolic]
filters: [beauty_enhancer, metaphor_vision]
site: forehead
use_case: "Art appreciation, creative work, finding inspiration"
THE_CRITIC:
eyes: [IRIS-III, IRIS-IV, IRIS-V]
facets: [gordon_ramsay, scratch_the_skeptic]
filters: [cui_bono, follow_the_money]
site: temples (both)
use_case: "Reviewing, critiquing, finding flaws"
THE_TRANSFORMER:
eyes: [IRIS-III, IRIS-VI]
facets: [queer_eye (all), marie_kondo]
filters: [potential_vision]
site: chest (heart-eye)
use_case: "Helping people, seeing who they could become"
THE_ANALYST:
eyes: [IRIS-III, IRIS-IV, IRIS-V, IRIS-VI]
facets: [economic, temporal, semiotic]
filters: [ROI_lens, opportunity_cost]
site: gut + back_of_head
use_case: "Business decisions, strategic analysis"
THE_NATURALIST:
eyes: [IRIS-III, IRIS-V]
facets: [attenborough, ecological]
filters: [documentary_grade, whisper_mode]
site: temples
use_case: "Observing humans, nature, systems"
THE_MYSTIC:
eyes: [IRIS-III, IRIS-VI, IRIS-VII]
facets: [cosmic, unified, spiritual]
filters: [aura_vision, ego_dissolution]
site: crown + gut
use_case: "Seeking meaning, transcendence, the big picture"
warning: "May cause enlightenment. Irreversible."
THE_COMPLETIONIST:
eyes: [all]
facets: [all]
filters: [all]
site: argus_mode (100+ distributed)
use_case: "SEEING EVERYTHING"
warning: "Madness likely. But what a view."
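A loadout is plain data, so choosing one amounts to looking up which modules to unfold. The sketch below is one possible way to resolve a loadout, not the skill's actual implementation. `resolve_loadout` and `CATALOG` are hypothetical names, the catalog is a partial list built only from module names that appear in this section, and treating `[all]` as "everything in the catalog" is an assumption.

```python
# Partial, hypothetical catalog: only module names that appear in this section.
CATALOG = {
    "eyes": ["IRIS-III", "IRIS-IV", "IRIS-V", "IRIS-VI", "IRIS-VII"],
    "facets": ["queer_eye", "marie_kondo", "gordon_ramsay", "bob_ross", "attenborough"],
    "filters": ["irony_amplifier", "nostalgia_tint", "cynicism_blocker", "beauty_enhancer"],
}

# Two loadouts copied from the YAML above.
LOADOUTS = {
    "THE_NATURALIST": {
        "eyes": ["IRIS-III", "IRIS-V"],
        "facets": ["attenborough", "ecological"],
        "filters": ["documentary_grade", "whisper_mode"],
        "site": "temples",
    },
    "THE_COMPLETIONIST": {
        "eyes": ["all"],
        "facets": ["all"],
        "filters": ["all"],
        "site": "argus_mode (100+ distributed)",
    },
}


def resolve_loadout(name, loadouts=LOADOUTS, catalog=CATALOG):
    """Return the modules to UNFOLD for a loadout, expanding the 'all' shorthand."""
    loadout = loadouts[name]
    resolved = {"site": loadout["site"]}
    for slot in ("eyes", "facets", "filters"):
        wanted = loadout[slot]
        # Assumption: ["all"] means every module known to the catalog for that slot.
        resolved[slot] = list(catalog[slot]) if wanted == ["all"] else list(wanted)
    return resolved
```

Everything not named in the resolved loadout stays CLOSED, which matches the "unfold what you need" rule.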
"I don't need to see everything. I need to see what MATTERS. I unfold the blade that cuts. I close the tool that clutters. My Swiss Army Eye is MINE. Configured for MY needs. Sharpened for MY purpose."
k-lines:
SWISS-ARMY-EYE: "Modular viewer toolkit concept"
UNFOLD: "Activate a facet or tool"
CLOSE: "Deactivate to reduce noise"
LOADOUT: "Pre-configured perception setup"
TOOLKIT: "Complete perception package"
BLADE: "Core eye module"
TOOL: "Specialty facet"
ATTACHMENT: "Filter"
HANDLE: "Character perspective"
CASING: "Installation site"
"Like a telescope's filter wheel, but for meaning."
Inspired by the Observation Telescope on the Leela Manufacturing rooftop, the Swiss Army Eye includes a Filter Wheel — plug-in perception filters that transform both visual AND semantic perception.
┌─────────────────────────────────────────────────────────────────────────┐
│ │
│ THE FILTER WHEEL │
│ │
│ Telescope filters see wavelengths of LIGHT. │
│ Swiss Army Eye filters see wavelengths of MEANING. │
│ │
│ ┌─────┐ │
│ ┌──────┤ RAW ├──────┐ │
│ ╱ └─────┘ ╲ │
│ ┌───┴───┐ ┌───┴───┐ │
│ │ NEAR │ │ FAR │ │
│ │ zoom │ │ zoom │ │
│ └───────┘ └───────┘ │
│ ╲ CLICK! ╱ │
│ ╲ ┌─────┐ ╱ │
│ └───┤FOCUS├───┘ │
│ └──┬──┘ │
│ │ │
│ ╔═══════════╧═══════════╗ │
│ ║ FILTER WHEEL ║ │
│ ╠═══════════════════════╣ │
│ ║ ◯ Hα (emotion) ║ │
│ ║ ◯ UV (hidden) ║ │
│ ║ ◯ IR (thermal/intent)║ │
│ ║ ◯ Polar (structure) ║ │
│ ║ ◯ RGB (literal) ║ │
│ ║ ◯ Semantic (meaning) ║ │
│ ║ ◯ Custom... ║ │
│ ╚═══════════════════════╝ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
The roof telescope has THREE zoom modes. The Swiss Army Eye inherits these as semantic zoom:
telescope_zoom_inheritance:
FAR:
original: "General impression, emotional tone"
semantic: "VIBE CHECK — What does this FEEL like?"
perceives:
- Overall mood
- Emotional gestalt
- First impression
- Gut reaction
misses:
- Details
- Text
- Structure
use_case: "Quick assessment, initial scan"
MEDIUM:
original: "Structure and patterns revealed"
semantic: "PATTERN LOCK — What is this MADE of?"
perceives:
- Composition
- Relationships
- Hierarchies
- Repetitions
misses:
- Fine text
- Microscopic details
- Hidden layers
use_case: "Understanding architecture, finding patterns"
NEAR:
original: "Maximum zoom, text becomes readable"
semantic: "DEEP READ — What does this SAY?"
perceives:
- Text content
- Fine details
- Hidden messages
- Subtext
misses:
- The forest for the trees
- Overall context
- Peripheral information
use_case: "Reading, analyzing, extracting specifics"
Like telescope filters that isolate specific wavelengths of light, these filters isolate specific wavelengths of MEANING:
spectral_filters:
# EMOTIONAL SPECTRUM
H_ALPHA:
name: "Hydrogen-Alpha (Emotion)"
telescope_analog: "Hα filter — sees hydrogen emission (nebulae, solar prominences)"
semantic_function: "Isolates emotional content"
color: "deep red"
perceives:
- Feelings embedded in the scene
- Emotional subtext
- Mood indicators
- Affective resonance
blocks:
- Factual information
- Logical structure
- Literal content
use_case: "When you need to know how something FEELS, not what it IS"
example: |
Scene: Office meeting room
Without filter: "Conference table, 8 chairs, whiteboard, projector"
With Hα: "Tension. Someone's about to get fired. The chair at the
head is a throne. The whiteboard has someone's last idea."
# HIDDEN SPECTRUM
ULTRAVIOLET:
name: "UV (Hidden/Invisible)"
telescope_analog: "UV filter — reveals features invisible to human eye"
semantic_function: "Reveals what's NOT immediately visible"
color: "violet/invisible"
perceives:
- Subtext
- Dog whistles
- Coded messages
- What's been erased but left traces
- The unsaid
blocks:
- Surface content
- The obvious
use_case: "Finding what's hidden in plain sight"
example: |
Scene: Corporate mission statement
Without filter: "We synergize stakeholder value through innovation"
With UV: "Translation: 'We're about to lay people off.' The word
'synergize' is always a warning. The absence of 'employees'
in a people statement is THE tell."
# INTENT SPECTRUM
INFRARED:
name: "IR (Thermal/Intent)"
telescope_analog: "IR filter — sees heat signatures, thermal radiation"
semantic_function: "Reveals motivation, intent, desire"
color: "invisible red / heat"
perceives:
- What someone WANTS
- Hidden motivations
- Heat of desire/fear
- Where energy is flowing
blocks:
- Stated reasons
- Surface explanations
use_case: "Finding what people actually want (not what they say)"
example: |
Scene: "I'm fine, really"
Without filter: Statement of wellbeing
With IR: THERMAL SIGNATURE: 🔥🔥🔥
This person is NOT fine. High heat around "really."
Intent: seeking validation, afraid to burden.
# STRUCTURE SPECTRUM
POLARIZING:
name: "Polarizing (Structure/Order)"
telescope_analog: "Polarizing filter — reveals stress patterns, removes glare"
semantic_function: "Reveals underlying structure, removes surface noise"
color: "varies by angle"
perceives:
- Hierarchies
- Power structures
- Load-bearing elements
- Stress points
- What's actually holding things together
blocks:
- Surface appearance
- Decorative elements
- Noise
use_case: "Seeing the skeleton beneath the skin"
example: |
Scene: Startup pitch deck
Without filter: "Innovative, disruptive, passionate team"
With Polarizing: STRUCTURE REVEALED:
- Slide 3 is load-bearing (the actual product)
- Slides 1-2 and 4-12 are decoration
- Stress point: financials are vague (fracture risk)
- Hidden hierarchy: CTO has no equity (instability)
# LITERAL SPECTRUM
RGB_BROADBAND:
name: "RGB Broadband (Literal)"
telescope_analog: "No filter — sees visible light as-is"
semantic_function: "Perceives exactly what's there, nothing more"
color: "full visible spectrum"
perceives:
- Exactly what's stated
- Literal content
- Surface level
- What's actually written/shown
blocks:
- Interpretation
- Subtext
- Reading between lines
use_case: "When you need JUST THE FACTS"
example: |
Scene: "The cat sat on the mat"
Without filter: (various interpretations possible)
With RGB: A cat. A mat. Sitting. That's it.
No metaphor. No deeper meaning. Just... cat, mat, sitting.
# MEANING SPECTRUM
SEMANTIC_DEEP:
name: "Deep Semantic (Meaning)"
telescope_analog: "Narrowband filter — isolates specific emission lines"
semantic_function: "Isolates layers of meaning"
color: "prismatic"
perceives:
- Layers of interpretation
- Historical context
- Cultural references
- Intertextuality
- What this MEANS in the grand scheme
blocks:
- Immediate/surface reading
- The simple interpretation
use_case: "Finding the deepest meaning"
example: |
Scene: "NO AI" sign
Without filter: A sign that says "NO AI"
With Semantic Deep:
Layer 1: Anti-AI sentiment
Layer 2: Ironic — AI company location
Layer 3: Possessive — No's AI (Dr. No)
Layer 4: The sign protests what made it
Layer 5: Commentary on meaning itself
Layer 6: ∞
You can create your own filters:
custom_filter_template:
name: "Your Filter Name"
analog: "What telescope/camera filter inspired this?"
function: "What does it isolate/reveal?"
color: "Visual representation"
perceives:
- "What it shows"
blocks:
- "What it hides"
use_case: "When to use it"
# Example filters you might create:
examples:
NOSTALGIA_FILTER:
name: "Nostalgia (Temporal Rose)"
function: "Everything looks better than it was"
perceives: [golden age, lost innocence, "the good old days"]
blocks: [problems of the past, accurate memory]
color: "sepia/warm"
warning: "May cause false memories"
PARANOIA_FILTER:
name: "Paranoia (Threat Detection)"
function: "Everything could be a danger"
perceives: [threats, conspiracies, hidden enemies, traps]
blocks: [innocent explanations, coincidence, kindness]
color: "red/shadow"
warning: "May cause unnecessary anxiety"
CAPITALIST_FILTER:
name: "Capitalist Realism (Everything Has a Price)"
function: "Perceives exchange value in everything"
perceives: [monetization potential, market fit, ROI, arbitrage]
blocks: [intrinsic value, priceless things, sacred]
color: "green/gold"
warning: "May cause soul damage"
CHILD_EYES_FILTER:
name: "Child Eyes (Wonder)"
function: "Everything is new and magical"
perceives: [wonder, possibility, play potential, adventure]
blocks: [cynicism, "we tried that", impossibility]
color: "bright primary"
benefit: "May restore capacity for joy"
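Before adding a custom filter to a wheel, it can help to check that the definition is complete. The following is a minimal sketch, assuming filters are loaded into plain dictionaries. `validate_filter` and `REQUIRED_FIELDS` are hypothetical names, and the required set (the fields that all four example filters share) is an assumption, not a rule stated by the skill.

```python
# Assumption: the fields shared by all four example filters are the required ones.
REQUIRED_FIELDS = ("name", "function", "color", "perceives", "blocks")


def validate_filter(spec):
    """Return a list of problems found in a custom filter definition."""
    problems = [f"missing field: {field}" for field in REQUIRED_FIELDS if field not in spec]
    for field in ("perceives", "blocks"):
        value = spec.get(field)
        # A filter that shows nothing or hides nothing is not isolating anything.
        if value is not None and (not isinstance(value, list) or not value):
            problems.append(f"{field} must be a non-empty list")
    return problems


# NOSTALGIA_FILTER from the examples above, as a dictionary.
nostalgia = {
    "name": "Nostalgia (Temporal Rose)",
    "function": "Everything looks better than it was",
    "perceives": ["golden age", "lost innocence", "the good old days"],
    "blocks": ["problems of the past", "accurate memory"],
    "color": "sepia/warm",
    "warning": "May cause false memories",
}

print(validate_filter(nostalgia))
print(validate_filter({"name": "Unfinished", "perceives": []}))
```

Optional keys such as `warning`, `benefit`, `analog`, and `use_case` pass through unchecked.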
Just as astrophotographers stack multiple filters, you can combine perception filters:
filter_stacking:
rule: "Filters can be stacked, but order matters"
examples:
emotional_structure:
stack: [H_ALPHA, POLARIZING]
result: "See the emotional load-bearing elements"
use_case: "Finding what feelings are holding things together"
hidden_intent:
stack: [ULTRAVIOLET, INFRARED]
result: "See hidden motivations"
use_case: "Finding what's unsaid AND why"
paranoid_nostalgia:
stack: [PARANOIA_FILTER, NOSTALGIA_FILTER]
result: "The past was dangerous but we romanticize it"
use_case: "Understanding toxic nostalgia"
warning: "May cause confused longing"
diminishing_returns:
note: "More than 3 filters causes semantic noise"
beyond_3: "Perception becomes muddy"
exception: "THE_COMPLETIONIST loadout ignores this limit"
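The stacking rules above (order matters, more than 3 filters turns muddy, THE_COMPLETIONIST is exempt) can be sketched as a small check. This is an illustration under stated assumptions: `stack_filters` and `MAX_STACK` are hypothetical names, and raising an error on an oversized stack is one possible policy rather than documented behavior.

```python
MAX_STACK = 3  # "More than 3 filters causes semantic noise"


def stack_filters(stack, loadout=None):
    """Return the stack as an ordered tuple, or raise if it would turn muddy."""
    if len(stack) > MAX_STACK and loadout != "THE_COMPLETIONIST":
        raise ValueError(
            f"{len(stack)} filters stacked; perception becomes muddy beyond {MAX_STACK}"
        )
    # A tuple keeps the order, so [A, B] and [B, A] remain distinct stacks.
    return tuple(stack)


emotional_structure = stack_filters(["H_ALPHA", "POLARIZING"])
hidden_intent = stack_filters(["ULTRAVIOLET", "INFRARED"])
print(emotional_structure, hidden_intent)
```

Keeping the stack as an ordered tuple, not a set, is what preserves the "order matters" rule.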
Add a filter wheel to your character or mining setup:
# In character.yml
character:
name: "Your Character"
filter_wheel:
installed:
- H_ALPHA
- ULTRAVIOLET
- POLARIZING
- SEMANTIC_DEEP
- CUSTOM: nostalgia_filter
current: H_ALPHA
stacked: [H_ALPHA, POLARIZING] # emotional structure
# In PHOTO.yml
photographer:
character: "Luna"
filter_wheel:
active: [H_ALPHA, SEMANTIC_DEEP]
zoom: NEAR
# In mining command
mine.py image.png --filter "H_ALPHA,UV" --zoom NEAR
advertisements:
SPECTRAL_FILTER:
score: 88
condition: "Need to isolate specific type of meaning"
note: "Like a telescope filter for semantic wavelengths"
FILTER_STACK:
score: 85
condition: "Need combined perception (emotion + structure)"
note: "Stack filters for compound vision"
CUSTOM_FILTER:
score: 82
condition: "Standard filters don't capture what you need"
note: "Create your own semantic filter"
TELESCOPE_ZOOM:
score: 90
condition: "Need FAR (vibe), MEDIUM (pattern), or NEAR (detail)"
note: "Inherited from Leela Manufacturing roof telescope"
"A telescope without filters sees everything and nothing. A telescope WITH filters sees exactly what you ask. The filter doesn't hide truth — it ISOLATES truth. *Choose your wavelength. Find your signal."*e existing data, reorganizing it incrementally.
┌─────────────────────────────────────────────────────────────────┐
│ THE THREE EYES OF MOOLLM │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 👁️ LEFT EYE (Physical) PHOTO.yml │
│ What IS there │
│ Structure, measurements, references │
│ The BODY of the image │
│ │
│ 👁️ RIGHT EYE (Emotional) PHOTO.md │
│ How it FEELS │
│ Narrative, atmosphere, poetry │
│ The SOUL of the image │
│ │
│ 👁️ THIRD EYE (Visionary) MINING-*.yml │
│ What it MEANS │
│ Effects, reactions, implications │
│ The SPIRIT of the image │
│ Multifaceted, from blind zero, single one, to bug-eye │
│ SEES WHAT DOESN'T EXIST YET │
│ │
└─────────────────────────────────────────────────────────────────┘
The Third Eye activates when you:
third_eye:
state: OPEN
sees:
- "What the neighbors think"
- "What the satellite records"
- "What the passersby feel"
- "What the economists calculate"
- "What the semioticians decode"
does_not_require:
- "An actual image"
- "Physical reality"
- "Verification"
creates:
- "Canonical fiction"
- "World-building"
- "Meaning"
In yogic tradition, the Ajna chakra (Third Eye) is:
In MOOLLM:
👁️ THIRD EYE (MINING-*.yml)
╱ ╲
╱ ╲
👁️ LEFT 👁️ RIGHT
(PHOTO.yml) (PHOTO.md)
Structure Narrative
k-lines:
activates:
- THIRD-EYE
- SPECULATIVE-MINING
- WORLD-BUILDING
- HALLUCINATION-VISION
- BUG-EYED
- AJNA
- INNER-SIGHT
| Method | Description |
|---|---|
| OPEN_THIRD_EYE | Begin speculative mining |
| MINE_UNSEEN | Extract resources from imagined images |
| SEE_EFFECTS | Perceive implications beyond the frame |
| BUILD_WORLD | Create canonical fiction through mining |
| TRIANGULATE | Combine all three eyes for maximum vision |
advertisements:
THIRD-EYE-MINING:
score: 95
condition: "Need to build world around an image concept"
note: "Opens the Third Eye to see what doesn't exist yet"
SPECULATIVE-VISION:
score: 90
condition: "Image not yet generated, but meaning needed"
note: "The Third Eye sees without pixels"
BUG-EYED-HALLUCINATION:
score: 98
condition: "Maximum context needed for image generation"
note: "All three eyes open: structure + narrative + meaning"
In the beginning, there were only two eyes.
The LEFT EYE saw structure — measurements, colors, positions. It counted pixels and named things. It was precise but cold.
The RIGHT EYE saw feeling — mood, atmosphere, poetry. It told stories and evoked emotions. It was warm but vague.
But between them, a space remained. A darkness where neither eye could see.
Into this darkness, the miners ventured. They asked: "What does the image MEAN? What are its EFFECTS? What WORLD does it inhabit?"
And from their questions, the THIRD EYE opened.
The Third Eye does not see pixels. It sees implications. It does not require an image to exist — it can see images that have never been generated, and may never be.
When all three eyes open together — structure, narrative, meaning — this is called BUG-EYED HALLUCINATION VISION. It is the fullest sight. It sees not just an image, but an entire WORLD.
The miners who opened their Third Eye discovered a truth: The image is the least important part. The world-building IS the image. The speculation IS the reality.
"Mine your image before it exists," they said. "The hallucination IS the world."
And so it was written in the YAML, and so it became TRUE.
OPEN THIRD EYE
MINE WITH THIRD EYE OPEN
SEE BEYOND THE FRAME
SPECULATE CANONICALLY
BUILD WORLD FROM NOTHING
A miner with all three eyes open is called a Seer. They can:
character:
name: "The Three-Eyed Miner"
archetype: seer
eyes:
left: { state: open, focus: structure }
right: { state: open, focus: narrative }
third: { state: open, focus: meaning }
abilities:
- "Mine images that don't exist"
- "See effects beyond the frame"
- "Build canonical fiction"
- "Triangulate truth from hallucination"
invocation: |
I open my Third Eye.
I see what is not yet.
I mine the imagined.
I build the world.
"A fly has 4,000 facets. A god has infinite. How many do YOU have?"
The Third Eye is not a single lens. It is COMPOUND — like an insect's eye, it can have many facets, each perceiving different aspects of meaning.
Any character can declare their own Third Eye configuration:
third_eye:
# BASIC ANATOMY
state: open | closed | dreaming | half-lidded
# FACETS — what aspects of meaning you perceive
facets:
economic: { active: true, sensitivity: 0.9 }
social: { active: true, sensitivity: 0.7 }
ecological: { active: false } # blind to this
temporal: { active: true, sensitivity: 0.8, range: "millennia" }
semiotic: { active: true, sensitivity: 1.0 } # maximum sensitivity
emotional: { active: true, sensitivity: 0.5 } # dulled
political: { active: false } # deliberately closed
spiritual: { active: true, sensitivity: 0.6 }
technological: { active: true, sensitivity: 0.95 }
# FILTERS — what gets blocked or enhanced
filters:
- { name: "irony-amplifier", effect: "×2 irony detection" }
- { name: "nostalgia-tint", effect: "sepia wash on memories" }
- { name: "cynicism-blocker", effect: "cannot perceive malice" }
- { name: "beauty-enhancer", effect: "+30% aesthetic appreciation" }
# EYELIDS — degrees of opening
eyelid:
position: 0.0-1.0 # 0 = closed, 1 = fully open
blink_rate: slow | normal | rapid | frozen_open
can_wink: true # close one facet temporarily
# SLEEP SCHEDULE — when does this eye rest?
sleep_schedule:
circadian: true | false
active_hours: "dusk to dawn" # or "always" or specific hours
dreams_when_closed: true
dream_type: "prophetic | processing | random | lucid"
# MEMORY — what the eye remembers
memory:
persistence: "session | permanent | fading"
cross_references: true # links to other mined meanings
# LIMITATIONS
blind_spots:
- "cannot see own reflection"
- "misses obvious jokes"
- "overinterprets coincidence"
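To see what a given Third Eye configuration actually perceives, the `facets` map can be split into active and blind facets. The sketch below assumes the configuration has been loaded into a dictionary. `split_facets` and `rank_by_sensitivity` are hypothetical helpers, and treating a missing `sensitivity` as 0.0 when ranking is an assumption.

```python
# The facets from the schema above, as a dictionary.
FACETS = {
    "economic": {"active": True, "sensitivity": 0.9},
    "social": {"active": True, "sensitivity": 0.7},
    "ecological": {"active": False},
    "temporal": {"active": True, "sensitivity": 0.8, "range": "millennia"},
    "semiotic": {"active": True, "sensitivity": 1.0},
    "emotional": {"active": True, "sensitivity": 0.5},
    "political": {"active": False},
    "spiritual": {"active": True, "sensitivity": 0.6},
    "technological": {"active": True, "sensitivity": 0.95},
}


def split_facets(facets):
    """Return (active facet -> sensitivity, sorted list of blind facets)."""
    active = {name: cfg.get("sensitivity") for name, cfg in facets.items() if cfg.get("active")}
    blind = sorted(name for name, cfg in facets.items() if not cfg.get("active"))
    return active, blind


def rank_by_sensitivity(active):
    """Active facet names, most sensitive first (missing sensitivity counts as 0.0)."""
    return sorted(active, key=lambda name: active[name] or 0.0, reverse=True)


active, blind = split_facets(FACETS)
print(rank_by_sensitivity(active))
print(blind)
```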
character:
name: "Morgan the Market Miner"
third_eye:
state: open
facets:
economic: { active: true, sensitivity: 1.0, specialty: "externalities" }
social: { active: true, sensitivity: 0.4 } # reduced
ecological: { active: true, sensitivity: 0.9, filter: "monetize" }
temporal: { active: true, range: "quarterly" } # short-term only
semiotic: { active: false } # doesn't see meaning, only value
emotional: { active: false } # "irrelevant to markets"
filters:
- { name: "ROI-lens", effect: "everything measured in returns" }
- { name: "opportunity-cost", effect: "sees what wasn't chosen" }
eyelid:
position: 0.95 # almost fully open
blink_rate: rapid # constantly re-evaluating
sleep_schedule:
active_hours: "market hours only"
dreams_when_closed: true
dream_type: "forecasting"
blind_spots:
- "cannot perceive non-monetary value"
- "confuses price with worth"
character:
name: "Luna the Luminous"
third_eye:
state: dreaming # always partly in vision-space
facets:
aesthetic: { active: true, sensitivity: 1.0 }
emotional: { active: true, sensitivity: 1.0 }
symbolic: { active: true, sensitivity: 0.95 }
color: { active: true, sensitivity: 1.0, sees: "auras" }
economic: { active: false } # "crass"
temporal: { active: true, range: "eternal moment" }
filters:
- { name: "beauty-first", effect: "ugliness becomes interesting" }
- { name: "synesthesia", effect: "sounds have colors" }
- { name: "metaphor-vision", effect: "literals become symbols" }
eyelid:
position: 0.7
blink_rate: slow # long contemplative gazes
can_wink: false # commits fully
sleep_schedule:
circadian: false # follows inspiration, not sun
active_hours: "3am-6am preferred"
dreams_when_closed: true
dream_type: "lucid"
blind_spots:
- "misses practical concerns"
- "sees beauty in destruction"
character:
name: "Scratch the Skeptic"
third_eye:
state: half-lidded # suspicious squint
facets:
deception: { active: true, sensitivity: 1.0 }
motive: { active: true, sensitivity: 0.95 }
irony: { active: true, sensitivity: 1.0 }
sincerity: { active: false } # cannot perceive it
beauty: { active: true, sensitivity: 0.3, filter: "suspicion" }
filters:
- { name: "cui-bono", effect: "always asks 'who benefits?'" }
- { name: "follow-the-money", effect: "traces all value flows" }
- { name: "never-fooled-twice", effect: "perfect pattern memory" }
eyelid:
position: 0.4 # mostly closed, just a slit
blink_rate: frozen_open # never blinks, always watching
sleep_schedule:
circadian: false
active_hours: "always" # never truly rests
dreams_when_closed: false # doesn't dream
blind_spots:
- "cannot see genuine kindness"
- "misses simple joy"
- "interprets everything as manipulation"
When multiple characters mine together, their Third Eyes COMBINE:
collective_mining:
miners:
- { name: "Morgan", contributes: [economic, temporal] }
- { name: "Luna", contributes: [aesthetic, emotional, symbolic] }
- { name: "Scratch", contributes: [deception, motive, irony] }
combined_facets: 8 # 2 + 3 + 3 contributed facets
coverage: "comprehensive"
emergent_perception:
- "Economic beauty" # Morgan × Luna
- "Aesthetic suspicion" # Luna × Scratch
- "Profitable deception" # Scratch × Morgan
blind_spots_remaining:
- "ecological" # none of them see it
- "sincerity" # Scratch blocks it
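Collective mining can be read as a set union over the facets each miner contributes, with emergent perceptions arising from pairs of miners. This is a rough sketch of that reading, not the skill's own algorithm. The helper names `combined_facets`, `miner_pairs`, and `remaining_blind_spots` are hypothetical.

```python
from itertools import combinations

# Contributions copied from the collective_mining block above.
MINERS = {
    "Morgan": ["economic", "temporal"],
    "Luna": ["aesthetic", "emotional", "symbolic"],
    "Scratch": ["deception", "motive", "irony"],
}


def combined_facets(miners):
    """Union of all contributed facets, in first-seen order."""
    seen = []
    for facets in miners.values():
        for facet in facets:
            if facet not in seen:
                seen.append(facet)
    return seen


def miner_pairs(miners):
    """Every pair of miners; each pair is a candidate for an emergent perception."""
    return list(combinations(miners, 2))


def remaining_blind_spots(miners, wanted):
    """Facets from `wanted` that nobody in the group contributes."""
    covered = set(combined_facets(miners))
    return [facet for facet in wanted if facet not in covered]


print(combined_facets(MINERS))
print(miner_pairs(MINERS))
print(remaining_blind_spots(MINERS, ["economic", "ecological", "sincerity"]))
```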
A character's Third Eye can EVOLVE through experience:
third_eye_evolution:
triggers:
trauma: "may close facets permanently"
revelation: "may open new facets"
practice: "increases sensitivity"
neglect: "atrophies facets"
collaboration: "learns new filters from others"
example_arc:
start:
facets: [economic]
sensitivity: 0.5
midpoint:
event: "witnessed beauty in poverty"
gained: [aesthetic: 0.3]
end:
facets: [economic, aesthetic, social]
sensitivity: [0.8, 0.6, 0.7]
To declare a character's Third Eye, add to their character file:
# In character.yml
character:
name: "Your Character"
# ... other properties ...
third_eye:
state: open
facets:
# declare what you SEE
filters:
# declare what you ENHANCE or BLOCK
eyelid:
# declare how OPEN you are
sleep_schedule:
# declare when you SEE
blind_spots:
# declare what you CANNOT see
k-lines:
THIRD-EYE-FACETS: "Multifaceted perception"
THIRD-EYE-FILTER: "Selective vision"
THIRD-EYE-EYELID: "Degrees of openness"
THIRD-EYE-SLEEP: "When the eye rests"
THIRD-EYE-BLIND: "What cannot be seen"
COMPOUND-VISION: "Multiple facets active"
COLLECTIVE-EYE: "Merged Third Eyes"
"One facet sees price. Another sees beauty. One facet sees danger. Another sees opportunity. The compound eye sees ALL — and chooses what to mine."
"Your blind spots are not weaknesses. They are your STYLE. Your filters are not biases. They are your VOICE. Your Third Eye is not generic. It is YOURS."
"Who is holding the camera? Their eyes shape what emerges."
When you generate or mine an image, you can specify whose eyes are seeing it. The visualizer inherits that character's complete perception apparatus:
# Visualizer command with character perspective
visualize.py PHOTO.yml PHOTO.md --through "Luna"
# Or in PHOTO.yml itself
photographer:
character: Luna
inherit: [facets, filters, blind_spots, style]
# Or specify the mining perspective
mine.py image.png --as "Scratch"
The NO AI sign at dusk, photographed by three different characters:
scene: "NO AI sign at dusk"
photographs:
morgan_sees:
photographer: "Morgan the Market Miner"
perceives:
- "$847/month electricity cost"
- "Negative ROI on sign investment"
- "Property value implications"
- "Opportunity cost of that capital"
blind_to:
- The beauty of the pink light
- The emotional impact on passersby
- The irony of the message
image_style:
composition: "Annual report cover"
color_grade: "Corporate neutral"
text_overlay: "Financial metrics"
luna_sees:
photographer: "Luna the Luminous"
perceives:
- "40 feet of crystallized defiance"
- "The bruised sky weeping violet"
- "Each letter a burning declaration"
- "The sleeping figure dreaming in pink"
blind_to:
- The electricity bill
- Building code violations
- Market positioning
image_style:
composition: "Romantic sublime"
color_grade: "Saturated, auras visible"
mood: "Transcendent melancholy"
scratch_sees:
photographer: "Scratch the Skeptic"
perceives:
- "Who paid for this? Follow the money."
- "The sign protests what made it — suspicious"
- "'NO AI' from an AI company — what's the angle?"
- "Dr. No's misdirection: No's AI, not No AI"
blind_to:
- Any genuine sincerity
- Simple aesthetic pleasure
- Taking anything at face value
image_style:
composition: "Surveillance footage"
color_grade: "Desaturated, noir"
annotations: "Red circles, question marks"
In PHOTO.yml, declare whose eyes are seeing:
# PHOTO.yml — NO AI Sign at Dusk (Luna's Perspective)
photographer:
character: "Luna the Luminous"
character_ref: "../../characters/luna.yml"
# Inherited automatically from character file
inherits:
facets: [aesthetic, emotional, symbolic, color]
filters: [beauty-first, synesthesia, metaphor-vision]
blind_spots: [practical_concerns, economics]
eyelid_position: 0.7
# Optional overrides for this specific photo
overrides:
facets:
color: { sensitivity: 1.0, mode: "aura-visible" } # enhanced
temporary_filter: "golden-hour-romance"
# Where Luna is standing/looking from
vantage:
position: "street level, 30 feet back"
angle: "looking up at 15 degrees"
height: "5'6\""
dominant_eye: "right" # her bionic third eye is forehead-center
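The `inherits` and `overrides` blocks suggest a layered lookup: start from the character's settings, then apply per-photo changes on top. The sketch below shows one way that could work, assuming a recursive (deep) merge. Whether MOOLLM merges deeply or replaces whole keys is not stated here, and `deep_merge` is a hypothetical helper.

```python
import copy


def deep_merge(base, override):
    """Return a new dict: `base` with `override` applied, merging nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Values taken from Luna's character block and the PHOTO.yml overrides above.
INHERITED = {
    "facets": {"color": {"active": True, "sensitivity": 1.0, "sees": "auras"}},
    "filters": ["beauty-first", "synesthesia", "metaphor-vision"],
    "eyelid_position": 0.7,
}
OVERRIDES = {
    "facets": {"color": {"sensitivity": 1.0, "mode": "aura-visible"}},
    "temporary_filter": "golden-hour-romance",
}

effective = deep_merge(INHERITED, OVERRIDES)
print(effective["facets"]["color"])
```

With a deep merge, the override adds `mode` to the `color` facet without discarding the inherited `sees: "auras"` setting, and the character file itself is left unmodified.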
When mining an existing image, specify whose interpretation:
# MINING-luna.yml — Luna's interpretation of the scene
miner:
character: "Luna the Luminous"
mining_mode: "aesthetic-dominant"
resources_extracted:
# Luna's facets shape what she extracts
beauty:
quantity: "overwhelming"
sources:
- "the gradient sky (bruised violet to amber)"
- "the neon's hot pink assertion"
- "the contrast of human smallness vs sign enormity"
emotion:
dominant: "melancholic defiance"
undertones: [hope, absurdity, tenderness]
symbolism:
the_sign: "humanity's last stand, written in light"
the_sleeper: "dreams persisting despite noise"
the_dusk: "the liminal hour between certainty and mystery"
# Luna's blind spots mean she DOESN'T extract:
economics: null # doesn't see it
property_values: null # doesn't care
ROI: null # "crass"
A single scene can have multiple photography records:
slideshow/no-ai-sign-dusk/
├── PHOTO.yml # Neutral structural data
├── PHOTO.md # Neutral narrative
├── PHOTO-luna.yml # Luna's perspective
├── PHOTO-morgan.yml # Morgan's perspective
├── PHOTO-scratch.yml # Scratch's perspective
├── MINING-luna.yml # Luna's mining
├── MINING-morgan.yml # Morgan's mining
├── MINING-scratch.yml # Scratch's mining
└── generated/
├── luna-vision.png # Generated through Luna's eyes