| name | phototransduction |
| description | Reconstruct documents from photos/screenshots as structured markdown. Use when user shares photos, screenshots, or images of slides/papers/emails/screens and wants them captured as text. Common triggers — "OCR this", "reconstruct from photos", "transcribe slides", "photo to markdown", "took photos", "took a photo", "synced to iCloud", "photos on my phone", "screenshot of an email", "took screenshots", "photos in iCloud Photos", "captured an email/slide/page", "process the photos I just took". |
| triggers | ["photos of documents","photos of email","screenshots of email","OCR","reconstruct from photos","transcribe slides","photo to markdown","document from camera","took photos","took a photo","synced to iCloud","iCloud Photos","photos on my phone","screenshots","captured an email","capture verbatim from photo"] |
| tools | ["rhodopsin.py","Read","Write","Bash","Edit"] |
| epistemics | [] |
# Phototransduction
Convert photos of documents (slides, papers, screens) into structured markdown with frontmatter.
Biology: phototransduction converts absorbed photons into molecular signals. Here: photos → structured internal text.
## When to use
- User took photos of a presentation, Word doc, or screen
- User wants document content captured as searchable, linkable markdown
- User says "OCR", "reconstruct", "transcribe slides", "photo to markdown"
## Procedure
### 1. Acquire photos

```bash
rhodopsin.py today
rhodopsin.py date 2026-04-18
rhodopsin.py recent 30
rhodopsin.py export UUID1 UUID2 UUID3...
rhodopsin.py batch YYYY-MM-DD HH:MM HH:MM
```
Prefer `batch DATE HH:MM HH:MM` over selective `export UUID...` for document-photo audits. Selective export silently skips iCloud-only photos (it only warns `[!] not available locally`); batch forces a sync of the full time range and exports every photo, numbered sequentially. Failure example (Slot 67, 8 May 2026): selective export pulled 14 of 23 photos; the missing IMG_3936 contained a new §9 Data-led governance section, and the wrong claim "§9 doesn't exist" was asserted against the incomplete set until the user pushed for a full re-export. Rule of thumb: coverage-sensitive audits use `batch`; targeted single-photo lookups use `export`.
If the photos are already on the local machine (e.g., in /tmp/), skip to step 2.
If accessing a Mac over SSH, run rhodopsin.py on the Mac side or use an AppleScript export.
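A minimal acquisition wrapper sketch for coverage-sensitive audits; it assumes rhodopsin.py exits nonzero on failure and writes JPEGs into the working directory (both are assumptions about the tool, not confirmed behavior):

```python
import subprocess
from pathlib import Path

def acquire_batch(date: str, start: str, end: str, workdir: str = ".") -> list[Path]:
    """Run a full time-range batch export and return the exported files.

    Uses batch rather than selective export, so iCloud-only photos are
    synced instead of silently skipped.
    """
    subprocess.run(["rhodopsin.py", "batch", date, start, end],
                   cwd=workdir, check=True)
    exported = sorted(Path(workdir).glob("*.jpg"))
    print(f"Exported {len(exported)} photos for {date} {start}-{end}")
    return exported

# Usage: acquire_batch("2026-05-08", "09:00", "10:30")
```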
### 2. Auto-rotate (CRITICAL: EXIF is not enough)
Always rotate images to the correct orientation before reading. Misrotated input was the single biggest source of transcription errors: reading rotated text produces garbled output that looks plausible but is substantially wrong.
EXIF rotation is necessary but NOT sufficient. When the source document was displayed sideways on screen (e.g., a Word/PDF page rotated within the viewer) and then photographed in landscape, the EXIF says "normal" but the content is sideways. `sips --rotate 0` will not fix this; it only applies EXIF metadata.
Mandatory workflow for doc photos:

```bash
# Apply EXIF orientation metadata (a 0-degree rotate bakes the flag into pixels)
sips -r 0 image.jpg
# Produce both 90-degree candidates; pick the readable one by eye
sips -r 90 image.jpg --out image_cw.jpg
sips -r -90 image.jpg --out image_ccw.jpg
# Batch alternative: ImageMagick auto-orient in place
for f in *.jpg; do convert "$f" -auto-orient "$f"; done
```
```bash
~/germline/.venv/bin/python3 -c "
from PIL import Image
for f in ['IMG_001.jpg', 'IMG_002.jpg']:
    img = Image.open(f)
    img.rotate(90, expand=True).save(f.replace('.jpg','_ccw.jpg'))   # PIL rotates CCW for +90
    img.rotate(-90, expand=True).save(f.replace('.jpg','_cw.jpg'))
"
```
Stop rule: if you cannot read the document text upright after EXIF rotation, do NOT transcribe. Rotate +90° and -90°, save both, pick the readable version. This is mandatory, not optional.
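For the stop rule, a minimal candidate-generator sketch, assuming Pillow is available (the helper name rotation_candidates is illustrative, not part of this skill's tooling):

```python
from pathlib import Path
from PIL import Image, ImageOps

def rotation_candidates(path: Path) -> list[Path]:
    """Write EXIF-upright, +90, and -90 variants; pick the readable one by eye."""
    base = ImageOps.exif_transpose(Image.open(path))  # apply EXIF orientation first
    out = []
    for deg, tag in [(0, "exif"), (90, "ccw"), (-90, "cw")]:
        dest = path.with_name(f"{path.stem}_{tag}{path.suffix}")
        base.rotate(deg, expand=True).save(dest)
        out.append(dest)
    return out
```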
Lesson from 2026-04-25: A 12-photo OpCo paper transcription produced material errors on multiple bullets (Sponsor comments, Value Streams, RMG&R) because the photos were of a sideways-displayed Word doc. EXIF said "normal", content was 90° off. Single-pass model vision produced fluent reconstructions that read like the source but weren't. Rotation to upright fixed all of them.
### 3. Identify document boundaries
Photos may cover multiple documents. Before reading, scan all images to identify clusters:
- Check timestamps — bursts with gaps indicate different documents
- Check visual style — different templates, orientations, or formats
- Note page numbers if visible (e.g., "p 3 of 7")
Group photos by document. Process each document separately.
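A minimal clustering sketch for the timestamp check, assuming EXIF DateTime is present and that a gap of more than a few minutes separates documents (the 5-minute threshold is an assumption to tune):

```python
from datetime import datetime, timedelta
from pathlib import Path
from PIL import Image

def cluster_by_time(files: list[Path],
                    gap: timedelta = timedelta(minutes=5)) -> list[list[Path]]:
    """Group photos into candidate documents by bursts in capture time."""
    def taken(f: Path) -> datetime:
        raw = Image.open(f).getexif().get(306)  # 306 = EXIF DateTime tag
        if raw:
            return datetime.strptime(raw, "%Y:%m:%d %H:%M:%S")
        return datetime.fromtimestamp(f.stat().st_mtime)  # fallback: file mtime

    if not files:
        return []
    ordered = sorted(files, key=taken)
    clusters = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if taken(cur) - taken(prev) > gap:
            clusters.append([])   # gap found: start a new document cluster
        clusters[-1].append(cur)
    return clusters
```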
### 4. Read and reconstruct
Read images in order. For each page:
- Read the image with the Read tool
- Transcribe ALL visible text; don't paraphrase or summarize
- Preserve structure: headings, bullets, tables, numbered lists
- Note anything unclear with `[?]` markers
- If text is dense or hard to read, flag for Apple OCR verification
Time-zone label cluster (mandatory for invite/calendar/scheduling screenshots). Outlook, Google Calendar, and similar tools display both the auto-adjusted local time AND a meta-line stating the original creation timezone (e.g., "This meeting has been adjusted to reflect your current time zone. It was initially created in the following time zone: (UTC+00:00) Dublin, Edinburgh, Lisbon, London"). The displayed time is the viewer's local time, NOT the creator's timezone. Single-pass label-skipping that asserts the displayed time is in the original timezone is the failure shape (Slot 29 instance, 2026-05-04).
- DO: read every timezone-related label in the field block, both the displayed time and any "auto-adjusted / originally created in / reflect your current time zone" hints, before asserting any time.
- DO NOT: derive the timezone from one label when a sibling label flips the interpretation.
Key lesson: Dense rotated text is where errors concentrate. If photos were taken at an angle or the document was displayed sideways on screen, even after rotation the text quality may be poor. Flag these pages for user verification via Apple Live Text (camera OCR on iPhone/iPad is more accurate than model vision on rotated photos).
### 5. Structure as markdown
Write the reconstructed document with frontmatter:
```markdown
---
title: "Document Title (reconstruction)"
date: YYYY-MM-DD
type: deliverable|reference
author: Original Author
source: Photos taken YYYY-MM-DD HH:MM TZ
original_format: Word document|PowerPoint deck, N pages/slides
status: reconstructed
pii: false
tags: [relevant, tags]
---

# Document Title

[reconstructed content]

---

**Related:**
- [[linked-document-1]] — relationship
- [[linked-document-2]] — relationship
```
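A minimal writer sketch matching the template above (the helper name write_reconstruction and the filename slug scheme are illustrative; only the frontmatter fields come from this skill):

```python
from datetime import date
from pathlib import Path

def write_reconstruction(title: str, body: str, source: str,
                         original_format: str, out_dir: Path) -> Path:
    """Write a reconstructed document using the frontmatter template above."""
    fm = "\n".join([
        "---",
        f'title: "{title} (reconstruction)"',
        f"date: {date.today().isoformat()}",
        "type: reference",
        f"source: {source}",
        f"original_format: {original_format}",
        "status: reconstructed",
        "pii: false",
        "---",
    ])
    out = out_dir / (title.lower().replace(" ", "-") + ".md")
    out.write_text(f"{fm}\n\n# {title}\n\n{body}\n")
    return out
```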
### 6. Verify with Apple OCR
For any page where confidence is low (rotated text, dense paragraphs, small fonts):
- Ask the user to open the photo on their iPhone/iPad
- Use Apple Live Text (long-press on text in Photos app) to copy the text
- User pastes the OCR text into the conversation
- Compare against reconstruction and fix discrepancies
This step is not optional for dense text pages. Model vision on rotated/angled photos produces plausible but wrong text — errors that look like bad writing rather than bad OCR.
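A minimal comparison sketch using Python's difflib to surface discrepancies between the pasted Live Text output and the reconstruction (word-level granularity and the file names are assumptions):

```python
import difflib

def compare_ocr(reconstruction: str, live_text: str) -> list[str]:
    """Word-level unified diff between the reconstruction and Apple OCR text."""
    a, b = reconstruction.split(), live_text.split()
    return list(difflib.unified_diff(a, b, "reconstruction", "live_text", lineterm=""))

# Usage:
# for line in compare_ocr(open("page3.md").read(), open("page3_livetext.txt").read()):
#     print(line)
```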
### 7. Save and interlink
- Save to `~/epigenome/chromatin/immunity/` (private, not public)
- Add frontmatter with source provenance
- Add **Related:** wikilinks to connected documents
- Commit to the epigenome repo and push
## Anti-patterns (learned 2026-04-18)
- Never read rotated images without rotating first. The model produces fluent-sounding but wrong text. "Phishing is a single system" was actually "What is missing is the operating model."
- Never assume transcription errors are the author's writing problems. Review comments based on bad OCR waste everyone's time. Verify before critiquing.
- Never substitute numbers from other sources. "86 in pilot, 1,319 in ideation" came from a different document; the actual text said "66 in pilot, 162 in POC." Keep transcription and analysis separate.
- Check for missing pages. Count page numbers if visible. Compare photo count against page count; one index off means one missing page (see the sketch after this list).
- Duplicate photos of the same page exist, at different angles and zoom levels. Don't double-count them as separate pages.
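A minimal missing-page check, assuming "p X of Y" markers were already transcribed per page (the regex and marker format are assumptions):

```python
import re

def missing_pages(page_texts: list[str]) -> set[int]:
    """Compare 'p X of Y' markers found in transcripts against the full range."""
    seen: set[int] = set()
    total = 0
    for text in page_texts:
        m = re.search(r"p\.?\s*(\d+)\s*of\s*(\d+)", text, re.IGNORECASE)
        if m:
            seen.add(int(m.group(1)))
            total = max(total, int(m.group(2)))
    return set(range(1, total + 1)) - seen

# Usage: missing_pages(transcripts) -> e.g. {5} means page 5 has no photo
```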
## CLI enhancement needed
`rhodopsin.py export` should auto-rotate based on EXIF orientation. Currently it converts HEIC→JPEG but doesn't fix rotation. Add `sips --rotate 0` (which applies EXIF metadata) after conversion.
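A minimal sketch of that fix in Python instead of shelling out to sips, using Pillow's ImageOps.exif_transpose (the hook point inside rhodopsin.py's HEIC→JPEG conversion, and the helper name normalize_orientation, are assumptions):

```python
from pathlib import Path
from PIL import Image, ImageOps

def normalize_orientation(jpeg_path: Path) -> None:
    """Bake the EXIF orientation flag into the pixels after HEIC->JPEG conversion."""
    img = Image.open(jpeg_path)
    upright = ImageOps.exif_transpose(img)  # rotate/flip per EXIF, then clear the tag
    upright.save(jpeg_path)
```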