daily-sourcelibrary
Batch OCR and translation for Source Library using Gemini Batch API (50% cheaper). Process books from the roadmap queue.
| name | daily-sourcelibrary |
| description | Batch OCR and translation for Source Library using Gemini Batch API (50% cheaper). Process books from the roadmap queue. |
Process historical Latin texts using Gemini Batch API for 50% cost savings.
6 books complete:
Results at: ~/translate/data/pipeline/results/
cd ~/translate && source .venv/bin/activate && set -a && source .env && set +a
python3 -c "
from pathlib import Path
import json
from google import genai
import os

client = genai.Client(api_key=os.environ['GEMINI_API_KEY'])

# Look at the last 10 batch_*.json files (sorted by name), skipping the *_keys.json companions
for f in sorted(Path('data/pipeline/batch_jobs').glob('batch_*.json'))[-10:]:
    if '_keys' in f.name:
        continue
    data = json.loads(f.read_text())
    gname = data.get('gemini_job_name')
    if gname:
        # Fetch the current job state from Gemini and print one line per job
        r = client.batches.get(name=gname)
        status = str(r.state).replace('JobState.JOB_STATE_', '')
        print(f\"{data['book_id'][:25]:<25} {data['job_type']:<8} {status}\")
"
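When the status check is run repeatedly, it can help to separate finished jobs from those still in flight. The following is a minimal sketch, assuming the four states that the Gemini Batch API documentation lists as completed (`JOB_STATE_SUCCEEDED`, `JOB_STATE_FAILED`, `JOB_STATE_CANCELLED`, `JOB_STATE_EXPIRED`). The helper names `short_state` and `is_finished` are hypothetical and not part of the pipeline.

```python
# Hypothetical helpers for illustration; not part of src.pipeline.
# Assumption: these four states are the terminal ones for a batch job.
TERMINAL_STATES = {'SUCCEEDED', 'FAILED', 'CANCELLED', 'EXPIRED'}


def short_state(state) -> str:
    """Reduce 'JobState.JOB_STATE_RUNNING' or 'JOB_STATE_RUNNING' to 'RUNNING'."""
    return str(state).split('JOB_STATE_')[-1]


def is_finished(state) -> bool:
    """Return True when the job will not change state any more."""
    return short_state(state) in TERMINAL_STATES
```

A job that reports `is_finished(r.state)` but whose short state is not `SUCCEEDED` has no results to collect and would likely need resubmitting.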
python3 -c "
from pathlib import Path

# Count OCR and translation output files in each book directory
for d in sorted(Path('data/pipeline/results').iterdir()):
    if d.is_dir():
        ocr = len(list(d.glob('ocr_*.md')))
        trans = len(list(d.glob('trans_*.md')))
        print(f'{d.name}: {ocr} OCR, {trans} Trans')
"
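The same counts can be used to list books whose translation has not caught up with OCR. This is a sketch under the assumption that every `ocr_*.md` file should eventually have a matching `trans_*.md` file. The function names `count_results` and `pending_translation` are hypothetical.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def count_results(results_root: Path) -> Dict[str, Tuple[int, int]]:
    """Map each book directory name to (ocr_count, trans_count)."""
    counts: Dict[str, Tuple[int, int]] = {}
    for d in sorted(results_root.iterdir()):
        if d.is_dir():
            ocr = sum(1 for _ in d.glob('ocr_*.md'))
            trans = sum(1 for _ in d.glob('trans_*.md'))
            counts[d.name] = (ocr, trans)
    return counts


def pending_translation(counts: Dict[str, Tuple[int, int]]) -> List[str]:
    """Books with fewer translated pages than OCR pages (assumed 1:1 target)."""
    return [book for book, (ocr, trans) in counts.items() if trans < ocr]
```

For example, `pending_translation(count_results(Path('data/pipeline/results')))` would return the names of the books that still need a translation batch.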
python -m src.pipeline.daily_run
python -m src.pipeline.daily_run --book <ia_identifier>
python3 -c "
from google import genai
from pathlib import Path
import json, os

client = genai.Client(api_key=os.environ['GEMINI_API_KEY'])

job_id = 'batch_YYYYMMDD_HHMMSS'  # Replace with actual job ID
job_file = Path(f'data/pipeline/batch_jobs/{job_id}.json')
data = json.loads(job_file.read_text())
result = client.batches.get(name=data['gemini_job_name'])

# keys[i] names the output file for the i-th inlined response
keys_file = Path(f'data/pipeline/batch_jobs/{job_id}_keys.json')
keys = json.loads(keys_file.read_text())

results_dir = Path(f'data/pipeline/results/{data[\"book_id\"]}')
results_dir.mkdir(parents=True, exist_ok=True)

# Write each non-empty response to <key>.md; responses without candidates are skipped
for i, resp in enumerate(result.dest.inlined_responses):
    if i < len(keys) and resp.response and resp.response.candidates:
        text = resp.response.candidates[0].content.parts[0].text
        (results_dir / f'{keys[i]}.md').write_text(text)
        print(f'Collected {keys[i]}')
"
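The collection loop above skips empty responses without reporting them. A minimal sketch of the same positional pairing that also returns the keys with no usable text, so those pages can be resubmitted. It assumes, as the loop does, that `keys[i]` corresponds to the i-th response. `pair_results` is a hypothetical helper that works on plain strings extracted from the responses, not on SDK objects.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def pair_results(
    keys: Sequence[str],
    texts: Sequence[Optional[str]],
) -> Tuple[Dict[str, str], List[str]]:
    """Pair keys with response texts by position.

    Returns (collected, missing): collected maps key -> text, and missing
    lists the keys whose response was absent or empty.
    """
    collected: Dict[str, str] = {}
    missing: List[str] = []
    for i, key in enumerate(keys):
        text = texts[i] if i < len(texts) else None
        if text:
            collected[key] = text
        else:
            missing.append(key)
    return collected, missing
```

Here `texts` would be built from the inlined responses, with `None` for any response that has no candidates.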
Next books to process (from data/roadmap.json):
Internet Archive --> PDF Download --> PNG Extraction
                                            |
                                            v
                           Split Detection (pages 10, 20, 25)
                                            |
                                      Single pages? -----+
                                            |            |
                                            v            v
                                    OCR Batch Submit   Flag for manual split
                                            |
                                            v
                                    Translation Batch
                                            |
                                            v
                                     Collect Results
                                            |
                                            v
                           ~/translate/data/pipeline/results/
Gemini Batch API: 50% cheaper than real-time
unset ALL_PROXY HTTP_PROXY HTTPS_PROXY
set -a && source .env && set +a
Split into multiple batches of ~180 pages each.
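Splitting a long book into batches can be sketched as plain list chunking. The function name `split_into_batches` and the page identifiers in the usage note are hypothetical. The default of 180 follows the approximate figure given above.

```python
from typing import List, Sequence


def split_into_batches(pages: Sequence[str], batch_size: int = 180) -> List[List[str]]:
    """Split an ordered page list into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError('batch_size must be at least 1')
    return [list(pages[i:i + batch_size]) for i in range(0, len(pages), batch_size)]
```

For a 400-page book, `split_into_batches(pages)` yields three batches of 180, 180, and 40 pages, each of which can be submitted as its own batch job.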