hermes-voice-call
Real-time voice calling with Hermes Agent via Pipecat + Daily.co WebRTC on VPS
| name | hermes-voice-call |
| description | Real-time voice calling with Hermes Agent via Pipecat + Daily.co WebRTC on VPS |
Enable real-time voice conversation (phone-call style) with Hermes from iPhone/Android/browser.
iPhone → Daily.co (WebRTC) → Pipecat (VPS) → Whisper STT → MiniMax M2.7 (OpenRouter) → MiniMax TTS → Daily.co → iPhone
Estimated latency: 3-5 seconds end-to-end. This is not true simultaneous conversation; it is sequential voice-in, voice-out.
- /home/hermes/voice-bot/bot.py ✅ written
- OPENAI_API_KEY for Whisper STT (key not found anywhere)
- MINIMAX_API_KEY not on VPS (only OPENROUTER_API_KEY present)

ssh vps
pip3 install --break-system-packages \
    pipecat-ai \
    "pipecat-ai[daily]" \
    aiohttp fastapi uvicorn pydantic-settings python-dotenv silero-vad
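Note: On Ubuntu 24.04 (Python 3.12), pip refuses to install into the system environment unless --break-system-packages is passed. It is used here because there is no root access for apt.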
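After installing, it can help to confirm that the modules bot.py imports actually resolve under the same interpreter. The following is a minimal sketch, not part of the original setup: the helper name `missing_modules` is hypothetical, and the module list is copied from the imports in bot.py, so it may need adjusting for a different Pipecat version.

```python
import importlib.util

# Module paths taken from the imports in bot.py; adjust if your Pipecat version differs.
REQUIRED_MODULES = [
    "aiohttp",
    "loguru",
    "pipecat.transports.daily.transport",
    "pipecat.services.openrouter.llm",
    "pipecat.services.whisper.stt",
    "pipecat.services.minimax.tts",
    "pipecat.audio.vad.silero",
]


def missing_modules(names):
    """Return the subset of dotted module names that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package is absent or broken.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_modules(REQUIRED_MODULES):
        print(f"missing: {name}")
```

Run it with the same python3 and PYTHONPATH that will launch the bot. If nothing is printed, every listed module was found.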
ssh vps "mkdir -p /home/hermes/voice-bot"
The Pipecat v1.1.0 API differs significantly from older versions. Correct class names and imports:
#!/usr/bin/env python3
import asyncio, os, sys, aiohttp
from loguru import logger
from pipecat.transports.daily.transport import DailyTransport, DailyParams
from pipecat.services.openrouter.llm import OpenRouterLLMService
from pipecat.services.whisper.stt import WhisperSTTService
from pipecat.services.minimax.tts import MiniMaxHttpTTSService
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair, LLMUserAggregatorParams,
)
from pipecat.runner.daily import configure
from pipecat.frames.frames import LLMRunFrame
DAILY_API_KEY = os.environ["DAILY_API_KEY"]


async def main():
    async with aiohttp.ClientSession() as session:
        # Obtain the room URL and meeting token via configure().
        (room_url, token) = await configure(session, api_key=DAILY_API_KEY, room_exp_duration=24.0)
        transport = DailyTransport(
            room_url, token, "Hermes",
            DailyParams(audio_in_enabled=True, audio_out_enabled=True, transcription_enabled=True),
        )
        # Build ONE aggregator pair so the user and assistant sides share the same LLMContext.
        # (Constructing the pair twice would give each side its own, disconnected context.)
        aggregators = LLMContextAggregatorPair(
            context=LLMContext(),
            user_params=LLMUserAggregatorParams(vad_analyzer=SileroVADAnalyzer()),
        )
        # Pipeline: user audio → STT → LLM → TTS → bot audio
        # Correct v1.1.0 pipeline order:
        pipeline = Pipeline([
            transport.input(),  # Raw user audio
            WhisperSTTService(api_key=os.environ["OPENAI_API_KEY"]),
            aggregators[0],  # user agg
            OpenRouterLLMService(api_key=os.environ["OPENROUTER_API_KEY"], settings=OpenRouterLLMService.Settings(model="minimax/minimax-2026-04-15")),
            MiniMaxHttpTTSService(api_key=os.environ["MINIMAX_API_KEY"], settings=MiniMaxHttpTTSService.Settings(model="speech-02-turbo", voice="male-qn-qingse")),
            transport.output(),  # Bot audio out
            aggregators[1],  # assistant agg
        ])
        # ... rest of bot setup
The VPS needs three service keys (OPENAI_API_KEY, MINIMAX_API_KEY, OPENROUTER_API_KEY) in addition to DAILY_API_KEY. Check what's available:
# On VPS (hermes user):
python3 -c "import json; d=json.load(open('/home/hermes/.hermes/auth.json')); pool=d.get('credential_pool',{}); [print(p, c.get('label'), len(c.get('access_token',''))) for p,v in pool.items() if isinstance(v,list) for c in v]"
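The one-liner above is dense, so here is a more readable sketch of the same check. It assumes, as the one-liner does, that `credential_pool` maps a provider name to a list of entries carrying `label` and `access_token` fields. The function name `summarize_pool` is hypothetical. Only token lengths are reported, never the token values.

```python
import json
from pathlib import Path


def summarize_pool(auth):
    """Return (provider, label, token_length) tuples without exposing token values."""
    rows = []
    pool = auth.get("credential_pool", {})
    for provider, creds in pool.items():
        if not isinstance(creds, list):
            continue  # skip entries that are not credential lists
        for cred in creds:
            if isinstance(cred, dict):
                rows.append((provider, cred.get("label", ""), len(cred.get("access_token", ""))))
    return rows


if __name__ == "__main__":
    auth_path = Path("/home/hermes/.hermes/auth.json")
    if auth_path.exists():
        for row in summarize_pool(json.loads(auth_path.read_text())):
            print(*row)
```

A token length of 0 suggests the entry exists but holds no usable key.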
Currently: OPENROUTER_API_KEY ✅ on VPS. MINIMAX_API_KEY and OPENAI_API_KEY ❌ missing.
To add missing keys, either:
- add them to ~/.hermes/auth.json (credential_pool structure), or
- pass them inline as environment variables when starting the bot:

ssh vps
cd /home/hermes/voice-bot
OPENAI_API_KEY=sk-pro... MINIMAX_API_KEY=sk-cp-aOTN... OPENROUTER_API_KEY=sk-or-v1-... DAILY_API_KEY=pk_... \
PATH=$HOME/.local/bin:$PATH PYTHONPATH=$HOME/.local/lib/python3.12/site-packages:$PYTHONPATH \
python3 bot.py
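Because bot.py reads its keys with `os.environ[...]`, a missing variable surfaces as a bare KeyError at startup. A small pre-flight check can report all missing keys at once. This is a sketch under the assumption that the four variable names in the launch command are the complete set; `missing_keys` is a hypothetical helper, not part of Pipecat.

```python
import os

# Variable names as used in bot.py and the launch command.
REQUIRED_KEYS = ("DAILY_API_KEY", "OPENAI_API_KEY", "OPENROUTER_API_KEY", "MINIMAX_API_KEY")


def missing_keys(env=None):
    """Return required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
```

Calling `missing_keys()` at the top of `main()` and aborting when the list is non-empty gives a clearer failure than the KeyError.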
| Item | Old/Wrong | Correct (v1.1.0) |
|---|---|---|
| Daily transport class | DailyTransportClient | DailyTransport |
| Daily init signature | (room_url, api_key, ...) | (room_url, token, bot_name, params) |
| Room URL + token | Manual API call | configure() from pipecat.runner.daily |
| MiniMax TTS class | MiniMaxTTSService | MiniMaxHttpTTSService |
| MiniMax default model | speech-02-hd | speech-02-turbo |
| Pipeline user agg | user_aggregator separate | LLMContextAggregatorPair(...)[0] |
| Pipeline bot agg | assistant_aggregator separate | LLMContextAggregatorPair(...)[1] |
| Transport input | transport directly | transport.input() |
| Transport output | transport directly | transport.output() |
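When porting an older bot, the two class renames in the table above are easy to miss. The sketch below scans source text for them. It covers only those two identifiers (the signature and pipeline changes still need manual review), and `find_legacy_names` is a hypothetical helper name.

```python
import re

# Old name -> replacement, taken from the class rename rows of the table above.
LEGACY_NAMES = {
    "DailyTransportClient": "DailyTransport",
    "MiniMaxTTSService": "MiniMaxHttpTTSService",
}


def find_legacy_names(source):
    """Return (line_number, old_name, new_name) for each legacy identifier found."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for old, new in LEGACY_NAMES.items():
            # Word boundaries keep the match to whole identifiers.
            if re.search(rf"\b{re.escape(old)}\b", line):
                hits.append((lineno, old, new))
    return hits
```

For example, `find_legacy_names(open("bot.py").read())` returns an empty list once both renames have been applied.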
- speech-02-turbo or speech-02-hd requires upgrade
- Use --break-system-packages for pip
- Daily.co API key: saved in Bitwarden as "Daily.co API Key" (token: pk_...)