| name | voice-bridge |
| description | Phone-side voice bridge runtime for Clawfinger. Use this skill when configuring profiles, tuning ALSA capture/playback endpoints, training audio quality, handling incoming call policies, or troubleshooting phone-to-gateway audio issues. Runs on macOS and Linux. |
VOICE-BRIDGE-SKILL
Purpose
Runbook for rooted Pixel telephony audio bridge + external voice gateway (ASR/LLM/TTS), with profile-driven tuning only.
Dependencies
- Root setup: skills/root-pixel10pro/SKILL.md
- App service: phone/app-android/app/src/main/java/com/tracsystems/phonebridge/GatewayCallAssistantService.kt
Runtime architecture
- RX path: in-call downlink PCM (tinycap) -> gateway ASR/turn -> TTS WAV.
- TX path: TTS WAV -> in-call uplink PCM (tinyplay) -> remote caller.
- Call flow is server-ASR authoritative (/api/turn); no app-side hardcoded intent logic.
- Call policy is gateway-managed: incoming call filtering, greetings, max duration, and passphrase authentication are all configured on the gateway and fetched by the phone at each call start via GET /api/config/call. The phone profile no longer contains these settings.
Profile source of truth
- Local profile file: phone/profiles/pixel10pro-blazer-profile-v1.json
- Device active profile path: /sdcard/Android/data/com.tracsystems.phonebridge/files/profiles/profile.json
- No hardcoded per-endpoint tuning in app code.
Required profile sections (device-specific only)
- root_binaries
- gateway
- route_profile.set / route_profile.restore
- playback.candidate_order_in_app
- playback.endpoint_settings.<pcm_index>: sample_rate, channels, optional speed_compensation
- playback.tuning: persistent_session, lock_device_for_call, prebuffer_ms, period_size, period_count, mmap
- capture.candidate_order_in_app
- capture.endpoint_settings.<pcm_index>: request_sample_rate, request_channels, effective_sample_rate
- capture.tuning.strict_stream_only
- call_session: session_log, gateway_report
- beep
- logging
- policy
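A quick pre-push check for the sections listed above might look like the following. This is a minimal sketch, not the app's own validation: the helper names (has_path, missing_sections, load_profile) are hypothetical, and treating the list above as dotted paths into the profile JSON is an assumption.

```python
import json

# Dotted paths taken from the "Required profile sections" list.
# Assumption: each dot separates nested JSON object keys.
REQUIRED_PATHS = [
    "root_binaries",
    "gateway",
    "route_profile.set",
    "route_profile.restore",
    "playback.candidate_order_in_app",
    "playback.endpoint_settings",
    "playback.tuning",
    "capture.candidate_order_in_app",
    "capture.endpoint_settings",
    "capture.tuning.strict_stream_only",
    "call_session",
    "beep",
    "logging",
    "policy",
]


def has_path(profile: dict, dotted: str) -> bool:
    """Return True if every key along the dotted path exists."""
    node = profile
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


def missing_sections(profile: dict) -> list[str]:
    """List the required paths that are absent from the profile."""
    return [p for p in REQUIRED_PATHS if not has_path(profile, p)]


def load_profile(path: str) -> dict:
    """Read a profile JSON file from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

Running missing_sections(load_profile(...)) on the local profile before pushing gives a list of absent sections; an empty list means every listed section is present, not that its values are well tuned.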
Sections removed from profile (now gateway-managed)
The following were moved to the gateway's config.json and are fetched via GET /api/config/call:
- incoming (auto_answer, auto_answer_delay_ms, allowed, disallowed, unknown_allowed) → gateway: call_auto_answer, caller_allowlist, caller_blocklist, unknown_callers_allowed
- greeting (owner, template) → gateway: greeting_incoming, greeting_outgoing, greeting_owner
- call_session.max_duration_sec, call_session.max_duration_message → gateway: max_duration_sec, max_duration_message
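To spot leftovers from older profiles, a small check can flag the sections named above as removed. This is a sketch under stated assumptions: the helper name stale_policy_keys is hypothetical, and whether the app ignores or rejects these leftover keys is not stated here.

```python
# Profile paths the list above describes as moved to the gateway's config.json.
STALE_PATHS = [
    "incoming",
    "greeting",
    "call_session.max_duration_sec",
    "call_session.max_duration_message",
]


def stale_policy_keys(profile: dict) -> list[str]:
    """Return the removed paths that are still present in a profile."""
    found = []
    for dotted in STALE_PATHS:
        node = profile
        present = True
        for key in dotted.split("."):
            if isinstance(node, dict) and key in node:
                node = node[key]
            else:
                present = False
                break
        if present:
            found.append(dotted)
    return found
```

Note that call_session itself stays in the profile (session_log, gateway_report); only its two max-duration fields are flagged.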
Deploy profile
- Push: ./scripts/android-push-profile.sh profiles/pixel10pro-blazer-profile-v1.json
- Confirm on device: adb shell cat /sdcard/Android/data/com.tracsystems.phonebridge/files/profiles/profile.json
- Restart app/service if needed and place a call.
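The push and confirm steps can be wrapped so the device copy is compared against the local file instead of being read by eye. This is only a sketch: the function names are hypothetical, and it assumes the push script copies the profile unchanged, so the two files parse to the same JSON.

```python
import json
import subprocess

# Device path taken from the "Profile source of truth" section.
DEVICE_PROFILE = (
    "/sdcard/Android/data/com.tracsystems.phonebridge/files/profiles/profile.json"
)


def same_profile(local_text: str, device_text: str) -> bool:
    """Compare two profiles as parsed JSON, ignoring whitespace and key order."""
    try:
        return json.loads(local_text) == json.loads(device_text)
    except json.JSONDecodeError:
        return False


def push_and_verify(local_path: str) -> bool:
    """Run the push script, then read the profile back over adb and compare."""
    subprocess.run(["./scripts/android-push-profile.sh", local_path], check=True)
    device_text = subprocess.run(
        ["adb", "shell", "cat", DEVICE_PROFILE],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    with open(local_path, encoding="utf-8") as fh:
        return same_profile(fh.read(), device_text)
```

A False result means the device is still running an older or truncated profile, which is worth ruling out before starting a training loop.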
Operational prerequisites
- Default dialer role: Clawfinger must hold android.app.role.DIALER. Without it, BridgeInCallService is never bound and calls go to the stock dialer. The role resets on every reboot and must be re-set with: adb shell cmd role add-role-holder android.app.role.DIALER com.tracsystems.phonebridge 1
- No screen lock: Lock screen MUST be None or Swipe. Never use PIN, pattern, password, fingerprint, or face unlock; any secure lock blocks call audio access and root services when the screen turns off.
- User 0 unlocked: adb shell dumpsys user | grep RUNNING_UNLOCKED
- Battery mode for app: Unrestricted.
- No internet or Google account needed: all AI processing runs on the host machine over USB (ADB reverse).
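Two of these prerequisites can be scripted from the host after a reboot. The sketch below uses only the adb commands quoted above; the function names are hypothetical, and it assumes that re-running add-role-holder when the role is already held is harmless. Screen lock and battery mode still need a manual check.

```python
import subprocess

PACKAGE = "com.tracsystems.phonebridge"


def user0_unlocked(dumpsys_output: str) -> bool:
    """Mirror `dumpsys user | grep RUNNING_UNLOCKED`: true if the marker appears."""
    return "RUNNING_UNLOCKED" in dumpsys_output


def adb_shell(*args: str) -> str:
    """Run one adb shell command and return its stdout."""
    result = subprocess.run(
        ["adb", "shell", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def preflight() -> bool:
    """Re-assert the dialer role, then report whether the user is unlocked."""
    adb_shell(
        "cmd", "role", "add-role-holder", "android.app.role.DIALER", PACKAGE, "1"
    )
    return user0_unlocked(adb_shell("dumpsys", "user"))
```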
Gateway contract
- Gateway connection is read from profile (gateway.base_url, gateway.bearer).
- Health: curl -H "Authorization: Bearer <token>" <base_url>/health
- Call policy: GET /api/config/call is fetched at call start by both BridgeInCallService (for caller filtering, auto-answer) and GatewayCallAssistantService (for greeting, max duration). It is fetched on a background thread to avoid NetworkOnMainThreadException crashes.
- Voice endpoints:
  - POST /api/asr
  - POST /api/turn: includes caller_number and call_direction form fields for server-side caller filtering and session tracking.
- Turn response handling:
  - rejected: true means the caller is blocked by gateway policy. App plays rejection audio and hangs up.
  - hangup: true means the gateway requests call termination (e.g., passphrase auth failed). App plays audio and hangs up.
- Fallback: if gateway is unreachable during GET /api/config/call, the phone rejects the call (safe default).
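The fail-closed policy fetch and the two turn flags can be illustrated from the host side. This is a sketch of the contract as described above, not the app's Kotlin code: the function names and returned action strings are hypothetical, and it assumes both endpoints return JSON bodies.

```python
import json
import urllib.request
from typing import Optional


def fetch_call_policy(
    base_url: str, bearer: str, timeout: float = 3.0
) -> Optional[dict]:
    """Fetch GET /api/config/call; return None on any failure."""
    try:
        req = urllib.request.Request(
            base_url.rstrip("/") + "/api/config/call",
            headers={"Authorization": f"Bearer {bearer}"},
        )
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.load(resp)
    except (OSError, ValueError):
        # Network errors, bad URLs, and non-JSON bodies all count as unreachable.
        return None


def should_accept_call(policy: Optional[dict]) -> bool:
    """Fail closed: without a fetched policy the call is rejected."""
    return policy is not None


def turn_action(turn: dict) -> str:
    """Map a /api/turn response to the action the app is described as taking."""
    if turn.get("rejected") is True:
        return "play_rejection_and_hang_up"
    if turn.get("hangup") is True:
        return "play_audio_and_hang_up"
    return "continue"
```

Checking rejected before hangup is a choice made for this sketch; the text above does not say which flag wins if both are set.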
Endpoint training workflow
Use this when modem/call session shifts to another PCM endpoint.
- Start a real call (human pickup required).
- Pull logs and debug WAVs for that call.
- Identify active endpoints from logs:
  - capture shift / capture pinned
  - playback shift / playback selected
- Tune only that endpoint in local profile:
  - capture: request_sample_rate, request_channels, effective_sample_rate
  - playback: sample_rate, channels, optional speed_compensation
- Push updated profile and retest.
- Repeat until quality and transcription are stable.
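Finding the endpoint events in a pulled log can be reduced to a substring filter on the four markers named above. This is a sketch only: the exact log line format is not documented here, so the helper (hypothetical name endpoint_events) assumes the markers appear verbatim and in lower case.

```python
from typing import Iterable

# Marker phrases taken from the workflow step on identifying active endpoints.
MARKERS = (
    "capture shift",
    "capture pinned",
    "playback shift",
    "playback selected",
)


def endpoint_events(log_lines: Iterable[str]) -> list[str]:
    """Keep only the log lines that mention an endpoint shift or selection."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if any(marker in line for marker in MARKERS)
    ]
```

Passing an open log file to endpoint_events yields the matching lines in order, so the last capture and playback entries show which endpoints were active at the end of the call.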
Capture training loop
- Gate condition:
  - Resolve active capture endpoint for the call.
  - If capture.endpoint_settings.<active_pcm_index> is complete, do not run capture training.
  - If that endpoint entry is missing, empty ({}), or incomplete, run capture training loop.
  - Capture entry is considered complete when all required fields exist: request_sample_rate, request_channels, effective_sample_rate.
- Required precondition:
  - profile must enable WAV logging: logging.debug_wav_dump.enabled=true,
  - updated profile must be pushed and loaded on device before starting the loop.
- Mandatory artifacts each loop:
  - pull call WAVs (rxm-*, rx-*, tx-*) to local phone/debug-wavs/latest-call-*
  - pull transcription outputs for the same call.
- Human QA step (required):
  - listen to the pulled rxm-* files,
  - classify pitch quality (normal, too high/fast, too low/slow),
  - report this back to AI with the exact filenames.
- AI tuning step:
  - adjust only the active capture endpoint settings in capture.endpoint_settings.<pcm_index>,
  - push profile, retest, pull artifacts again.
- Stop condition:
  - keep iterating until human confirms natural pitch and transcript alignment on the pulled files.
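The capture gate condition can be expressed as a small check against the local profile. This is a minimal sketch: the function name is hypothetical, and it assumes endpoint_settings is a JSON object keyed by the PCM index as a string.

```python
# Required fields for a complete capture endpoint entry, per the gate condition.
CAPTURE_REQUIRED = (
    "request_sample_rate",
    "request_channels",
    "effective_sample_rate",
)


def capture_needs_training(profile: dict, pcm_index: int) -> bool:
    """True when the active endpoint entry is missing, empty, or incomplete."""
    settings = profile.get("capture", {}).get("endpoint_settings", {})
    entry = settings.get(str(pcm_index))
    if not isinstance(entry, dict):
        return True
    return any(field not in entry for field in CAPTURE_REQUIRED)
```

For example, an entry of {} for the active index returns True (run the loop), while an entry carrying all three fields returns False, regardless of whether the values are actually well tuned.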
Playback training loop
- Gate condition:
  - Resolve active playback endpoint for the call.
  - If playback.endpoint_settings.<active_pcm_index> is complete, do not run playback training.
  - If that endpoint entry is missing, empty ({}), or incomplete, run playback training loop.
  - Playback entry is considered complete when required fields exist: sample_rate, channels.
  - Optional tuning field: speed_compensation (set when needed for pitch correction).
- Required precondition:
  - profile must enable WAV logging: logging.debug_wav_dump.enabled=true,
  - updated profile must be pushed and loaded on device before starting the loop.
- Mandatory artifacts each loop:
  - pull call WAVs (tx-*, plus rxm-*/rx-* for context) to local phone/debug-wavs/latest-call-*
  - pull transcription outputs for the same call.
- Human QA step (required):
  - evaluate what is heard on the remote phone during playback,
  - classify playback quality (normal, too high/fast, too low/slow, choppy),
  - report this back to AI with relevant call timestamp and related pulled filenames.
- AI tuning step:
  - adjust only the active playback endpoint settings in playback.endpoint_settings.<pcm_index>,
  - push profile, retest, pull artifacts again.
- Stop condition:
  - keep iterating until human confirms natural playback pitch/timing and stable conversational delivery.
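The pitch verdicts in the QA step follow from basic PCM arithmetic: samples produced at one rate but clocked out at another, without resampling, speed up or slow down by the ratio of the two rates. The sketch below shows only that arithmetic; the function names and the 2% tolerance are assumptions, and it does not describe how the app applies sample_rate or speed_compensation internally.

```python
def speed_factor(source_rate_hz: int, device_rate_hz: int) -> float:
    """Speed-up factor when source-rate PCM is clocked out at the device rate."""
    return device_rate_hz / source_rate_hz


def expected_verdict(factor: float, tolerance: float = 0.02) -> str:
    """Translate a speed factor into the QA vocabulary used in the loop."""
    if factor > 1 + tolerance:
        return "too high/fast"
    if factor < 1 - tolerance:
        return "too low/slow"
    return "normal"
```

As a worked case, 24000 Hz audio written to an endpoint running at 48000 Hz gives a factor of 2.0, which a listener would report as too high/fast; the reverse mismatch gives 0.5 and sounds too low/slow. A "choppy" verdict is not explained by this ratio and points at buffering settings instead.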
Acceptance criteria
- Greeting audible to remote caller.
- Minimum 3 stable turns without losing capture.
- No persistent high/low pitch artifacts in pulled rxm-* WAVs.
- Transcripts remain semantically aligned with what caller said.
Minimal troubleshooting
- No assistant audio heard remotely:
  - verify route set was applied, playback endpoint selected, and tinyplay success logs.
- No capture after greeting:
  - verify active capture endpoint and its profile settings; retune endpoint-specific capture rates/channels.
- Mis-transcriptions with pitch artifacts:
  - retune effective_sample_rate for active capture endpoint; keep endpoint-index mapping in profile.
- Duplicate/overlapping call behavior:
  - do not start a new dial while one live call exists.