daily-sourcelibrary
Batch OCR and translation for Source Library using Gemini Batch API (50% cheaper). Process books from the roadmap queue.
| name | daily-sourcelibrary |
| description | Batch OCR and translation for Source Library using Gemini Batch API (50% cheaper). Process books from the roadmap queue. |
Process historical Latin texts using Gemini Batch API for 50% cost savings.
6 books complete:
Results at: ~/translate/data/pipeline/results/
Set up the environment (activate the virtualenv and export the variables in .env):
cd ~/translate && source .venv/bin/activate && set -a && source .env && set +a
Check the state of recent batch jobs (the last 10 files in batch_jobs, skipping *_keys.json):
python3 -c "
from pathlib import Path
import json
from google import genai
import os
client = genai.Client(api_key=os.environ['GEMINI_API_KEY'])
for f in sorted(Path('data/pipeline/batch_jobs').glob('batch_*.json'))[-10:]:
    if '_keys' in f.name:
        continue
    data = json.loads(f.read_text())
    gname = data.get('gemini_job_name')
    if gname:
        r = client.batches.get(name=gname)
        status = str(r.state).replace('JobState.JOB_STATE_', '')
        print(f\"{data['book_id'][:25]:<25} {data['job_type']:<8} {status}\")
"
Count the collected OCR and translation files per book:
python3 -c "
from pathlib import Path
for d in sorted(Path('data/pipeline/results').iterdir()):
    if d.is_dir():
        ocr = len(list(d.glob('ocr_*.md')))
        trans = len(list(d.glob('trans_*.md')))
        print(f'{d.name}: {ocr} OCR, {trans} Trans')
"
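The per-book counts above show how many pages are done, but not which ones are still waiting for translation. A minimal sketch of that check follows. It assumes that `ocr_<page>.md` and `trans_<page>.md` share the same `<page>` suffix, which the source does not confirm; `untranslated_pages` is a hypothetical helper, not part of the pipeline.

```python
from pathlib import Path
from typing import Iterable


def untranslated_pages(filenames: Iterable[str]) -> list[str]:
    """Return page suffixes that have an ocr_ file but no matching trans_ file.

    Assumption: ocr_<page>.md and trans_<page>.md use the same <page> suffix.
    """
    ocr, trans = set(), set()
    for name in filenames:
        stem = Path(name).stem  # drop the .md extension
        if stem.startswith("ocr_"):
            ocr.add(stem[len("ocr_"):])
        elif stem.startswith("trans_"):
            trans.add(stem[len("trans_"):])
    return sorted(ocr - trans)


if __name__ == "__main__":
    results = Path("data/pipeline/results")
    if results.is_dir():
        for book_dir in sorted(results.iterdir()):
            if book_dir.is_dir():
                missing = untranslated_pages(p.name for p in book_dir.glob("*.md"))
                if missing:
                    print(f"{book_dir.name}: {len(missing)} pages awaiting translation")
```

Taking file names rather than a directory keeps the helper easy to check without touching the results folder.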
Run the daily pipeline:
python -m src.pipeline.daily_run
Run it for a single book, identified by its Internet Archive identifier:
python -m src.pipeline.daily_run --book <ia_identifier>
Collect the results of a batch job into the book's results directory:
python3 -c "
from google import genai
from pathlib import Path
import json, os
client = genai.Client(api_key=os.environ['GEMINI_API_KEY'])
job_id = 'batch_YYYYMMDD_HHMMSS'  # Replace with actual job ID
job_file = Path(f'data/pipeline/batch_jobs/{job_id}.json')
data = json.loads(job_file.read_text())
result = client.batches.get(name=data['gemini_job_name'])
keys_file = Path(f'data/pipeline/batch_jobs/{job_id}_keys.json')
keys = json.loads(keys_file.read_text())
results_dir = Path(f'data/pipeline/results/{data[\"book_id\"]}')
results_dir.mkdir(parents=True, exist_ok=True)
for i, resp in enumerate(result.dest.inlined_responses):
    if i < len(keys) and resp.response and resp.response.candidates:
        text = resp.response.candidates[0].content.parts[0].text
        (results_dir / f'{keys[i]}.md').write_text(text)
        print(f'Collected {keys[i]}')
"
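The collection script reads `result.dest` without first checking whether the job has finished. A small guard along the following lines could be added before collecting. This is a sketch under assumptions: the helper names are hypothetical, and the set of terminal states is taken from the Gemini Batch API documentation as I understand it, so verify it against your SDK version.

```python
# Terminal states as listed in the Gemini Batch API documentation.
# Treat the exact set as an assumption and check it against your SDK version.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED", "EXPIRED"}


def short_state(state: object) -> str:
    """Strip the enum prefixes, e.g. 'JobState.JOB_STATE_RUNNING' -> 'RUNNING'."""
    return str(state).replace("JobState.", "").replace("JOB_STATE_", "")


def is_finished(state: object) -> bool:
    """True once the job can no longer change state."""
    return short_state(state) in TERMINAL_STATES


def ready_to_collect(state: object) -> bool:
    """True only when the job finished successfully."""
    return short_state(state) == "SUCCEEDED"
```

In the collection script, a check such as `if not ready_to_collect(result.state): raise SystemExit('job not finished')` placed before the loop would avoid reading responses from a job that is still pending or has failed.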
Next books to process (from data/roadmap.json):
Internet Archive --> PDF Download --> PNG Extraction
                                             |
                                             v
                            Split Detection (pages 10, 20, 25)
                                             |
                                       Single pages? -----+
                                             |            |
                                             v            v
                                     OCR Batch Submit   Flag for manual split
                                             |
                                             v
                                     Translation Batch
                                             |
                                             v
                                      Collect Results
                                             |
                                             v
                            ~/translate/data/pipeline/results/
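The split-detection step in the diagram samples pages 10, 20, and 25 and routes the book either to OCR or to manual splitting. The source does not say how a double-page spread is recognized. The sketch below assumes a simple aspect-ratio heuristic (a spread is wider than it is tall) and a majority vote over the sampled pages; both are assumptions for illustration, and all function names are hypothetical.

```python
SAMPLE_PAGES = (10, 20, 25)  # sample pages named in the pipeline diagram


def looks_like_spread(width: int, height: int, threshold: float = 1.0) -> bool:
    """Assumed heuristic: a two-page spread is wider than it is tall."""
    return width / height > threshold


def needs_manual_split(page_sizes: dict[int, tuple[int, int]]) -> bool:
    """Flag a book when most sampled pages look like two-page spreads.

    page_sizes maps a page number to its (width, height) in pixels.
    Sample pages missing from the mapping are ignored.
    """
    sampled = [page_sizes[p] for p in SAMPLE_PAGES if p in page_sizes]
    if not sampled:
        return False
    spreads = sum(looks_like_spread(w, h) for w, h in sampled)
    return spreads * 2 > len(sampled)  # strict majority
```

A book whose sampled pages are mostly landscape would be flagged for manual split; otherwise it proceeds to OCR batch submission.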
Gemini Batch API: 50% cheaper than real-time
unset ALL_PROXY HTTP_PROXY HTTPS_PROXY
set -a && source .env && set +a
Split into multiple batches of ~180 pages each.
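Splitting a long book into batches of roughly 180 pages can be sketched as follows. The function name and the default of exactly 180 are illustrative choices, not taken from the pipeline code.

```python
def split_into_batches(pages: list, batch_size: int = 180) -> list[list]:
    """Split a list of pages into consecutive batches of at most batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [pages[i:i + batch_size] for i in range(0, len(pages), batch_size)]


if __name__ == "__main__":
    pages = list(range(1, 401))  # a hypothetical 400-page book
    batches = split_into_batches(pages)
    print([len(b) for b in batches])  # prints [180, 180, 40]
```

Each batch can then be submitted as its own job, with its own keys file, so that results are collected per batch.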