| name | asr-timing-review |
| description | Use for global recorded-video transcript and timing work: run ASR, inspect timestamped transcript segments, infer intended speech, manually decide visual cue timestamps, and produce timing-cues.json without relying on automatic cue matching alone. |
ASR Timing Review
ASR is evidence for editorial timing, not the final authority.
Required Loop
- Run the local ASR command when available:
LESSON_ID=<lesson-id> npm run asr:recording
- Read
shared/composer/<lesson-id>/asr-transcript.json.
- Compare the raw ASR with the intended script, recording context, and audio/video timing.
- Write a human-review ledger:
- time range
- raw ASR text
- inferred intended speech
- visual event
- confidence
- review note
- Manually write
timing-cues.json from the reviewed ledger.
- Update render-plan shot starts/durations only after the ledger explains why each cue time is chosen.
Rules
- NEVER treat programmatic cue matching as final timing when ASR is weak.
- NEVER match only sparse English tech tokens in Chinese recordings.
- Use rough script timestamps only as prior expectations, not final timing.
- If the ASR is crowded but the idea is inferable, state the inferred intended speech and use the timestamp range.
- If the idea is not inferable, mark
needs_manual_review and do not call the render final.
- Prefer cue anchors that the speaker can say clearly next time, such as
第一条:动画时钟, 第二条:浏览器资产, 警告:不是迁移口号, and 最后:声音驱动画布.
Output Shape
Create or update:
shared/composer/<lesson-id>/asr-human-review.md
shared/composer/<lesson-id>/timing-cues.json
- render plan timings that reference
timingSource: human-reviewed ASR ledger