en un clic
hermes-agent
// Expert in building self-improving AI agents with tool use, multi-platform messaging, and a closed learning loop. Proficient in LLM orchestration, tool integration, session management, and agent autonomy.
// Expert in building self-improving AI agents with tool use, multi-platform messaging, and a closed learning loop. Proficient in LLM orchestration, tool integration, session management, and agent autonomy.
| name | hermes-agent |
| description | Expert in building self-improving AI agents with tool use, multi-platform messaging, and a closed learning loop. Proficient in LLM orchestration, tool integration, session management, and agent autonomy. |
Purpose of this document: A single, comprehensive reference that explains everything about the Hermes Agent project — its architecture, source code, features, release history, and design patterns — so that any AI or developer can fully understand the system.
Hermes Agent is a self-improving AI agent built by Nous Research. It is an open-source (MIT licensed), Python-based project that provides:
Tech Stack:
Repository: github.com/NousResearch/hermes-agent
Version: 0.7.0 (as of April 2026)
License: MIT
| Feature | Description |
|---|---|
| Terminal UI | Full TUI with multiline editing, slash-command autocomplete, conversation history, interrupt-and-redirect, streaming tool output |
| Multi-Platform Messaging | Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Email, Home Assistant, DingTalk, Feishu/Lark, WeCom, Mattermost, SMS, Webhook — all from a single gateway process |
| Learning Loop | Agent-curated memory with periodic nudges, autonomous skill creation, skills self-improve during use, FTS5 session search with LLM summarization, Honcho dialectic user modeling |
| Scheduled Tasks | Built-in cron scheduler with delivery to any platform (daily reports, nightly backups, weekly audits) |
| Subagent Delegation | Spawn isolated subagents for parallel workstreams with restricted toolsets |
| Execute Code | Python scripts that call tools via RPC, collapsing multi-step pipelines into zero-context-cost turns |
| Terminal Backends | Local, Docker, SSH, Modal, Daytona, Singularity — run on a $5 VPS or a GPU cluster |
| Skills | 70+ bundled skills across 28 categories, Skills Hub for community discovery, agentskills.io compatibility |
| Plugins | Drop-in Python plugins with lifecycle hooks (pre_llm_call, post_llm_call, on_session_start, on_session_end) |
| MCP | Client (connect to MCP servers for extended tools) and Server (expose conversations to MCP clients) |
| IDE Integration | VS Code, Zed, JetBrains via ACP server with session management and tool streaming |
| API Server | OpenAI-compatible /v1/chat/completions endpoint for headless integrations |
| Profiles | Multi-instance support — each profile gets isolated config, memory, sessions, skills, gateway |
| Security | Command approval system, secret redaction, SSRF protection, PII redaction, injection detection, credential directory protection |
| RL Training | Atropos environments for batch trajectory generation and agent policy optimization |
# One-line install (Linux, macOS, WSL2)
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
# After install
source ~/.bashrc # or: source ~/.zshrc
hermes # start chatting
# Key commands
hermes model # Choose LLM provider and model
hermes tools # Configure which tools are enabled
hermes config set # Set individual config values
hermes gateway # Start the messaging gateway
hermes setup # Run the full setup wizard
hermes update # Update to latest version
hermes doctor # Diagnose any issues
For development:
git clone https://github.com/NousResearch/hermes-agent.git
cd hermes-agent
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv venv --python 3.11
source venv/bin/activate
uv pip install -e ".[all,dev]"
python -m pytest tests/ -q # ~3000 tests
hermes-agent/
├── run_agent.py # AIAgent class — core conversation loop
├── model_tools.py # Tool orchestration, _discover_tools(), handle_function_call()
├── toolsets.py # Toolset definitions, _HERMES_CORE_TOOLS list
├── toolset_distributions.py # Toolset sampling distributions for RL
├── cli.py # HermesCLI class — interactive CLI orchestrator
├── hermes_state.py # SessionDB — SQLite session store (FTS5 search)
├── hermes_constants.py # Shared constants, get_hermes_home()
├── hermes_time.py # Timezone handling
├── utils.py # Shared utility functions
├── batch_runner.py # Parallel batch processing
├── trajectory_compressor.py # Trajectory compression for RL training
├── mcp_serve.py # MCP server mode entry point
├── mini_swe_runner.py # Minimal SWE benchmark runner
├── rl_cli.py # RL CLI commands
│
├── agent/ # Agent internals
│ ├── prompt_builder.py # System prompt assembly
│ ├── context_compressor.py # Auto context compression
│ ├── prompt_caching.py # Anthropic prompt caching
│ ├── auxiliary_client.py # Auxiliary LLM client (vision, summarization)
│ ├── model_metadata.py # Model context lengths, token estimation
│ ├── models_dev.py # models.dev registry integration
│ ├── display.py # KawaiiSpinner, tool preview formatting
│ ├── skill_commands.py # Skill slash commands (shared CLI/gateway)
│ └── trajectory.py # Trajectory saving helpers
│
├── hermes_cli/ # CLI subcommands and setup
│ ├── main.py # Entry point — all `hermes` subcommands
│ ├── config.py # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
│ ├── commands.py # Slash command definitions + SlashCommandCompleter
│ ├── callbacks.py # Terminal callbacks (clarify, sudo, approval)
│ ├── setup.py # Interactive setup wizard
│ ├── skin_engine.py # Skin/theme engine
│ ├── skills_config.py # `hermes skills` — skill management
│ ├── tools_config.py # `hermes tools` — tool management
│ ├── skills_hub.py # Skills Hub integration
│ ├── models.py # Model catalog, provider model lists
│ ├── model_switch.py # Shared /model switch pipeline
│ └── auth.py # Provider credential resolution
│
├── tools/ # Tool implementations (one file per tool)
│ ├── registry.py # Central tool registry
│ ├── approval.py # Dangerous command detection
│ ├── terminal_tool.py # Terminal/shell execution
│ ├── process_registry.py # Background process management
│ ├── file_tools.py # File read/write/search/patch
│ ├── web_tools.py # Web search/extract
│ ├── browser_tool.py # Browser automation
│ ├── code_execution_tool.py # execute_code sandbox
│ ├── delegate_tool.py # Subagent delegation
│ ├── mcp_tool.py # MCP client integration
│ ├── skills_tool.py # Skill management tool
│ ├── todo_tool.py # Todo/task tracking tool
│ ├── memory_tool.py # Memory read/write tool
│ ├── tts_tool.py # Text-to-speech
│ ├── vision_tool.py # Image analysis
│ ├── image_gen_tool.py # Image generation
│ └── environments/ # Terminal backends
│ ├── base.py # BaseEnvironment ABC
│ ├── local.py # Local execution
│ ├── docker.py # Docker containers
│ ├── ssh.py # SSH remote execution
│ ├── modal.py # Modal serverless
│ ├── managed_modal.py # Nous-hosted Modal
│ ├── daytona.py # Daytona serverless
│ ├── singularity.py # Singularity HPC containers
│ └── persistent_shell.py # Persistent shell mixin
│
├── gateway/ # Messaging platform gateway
│ ├── run.py # GatewayRunner — main message loop
│ ├── session.py # SessionStore — conversation persistence
│ ├── status.py # Gateway status, token locks
│ └── platforms/ # 16 platform adapters
│ ├── base.py # BasePlatformAdapter ABC
│ ├── telegram.py # Telegram (polling + webhook)
│ ├── discord.py # Discord
│ ├── slack.py # Slack
│ ├── whatsapp.py # WhatsApp
│ ├── matrix.py # Matrix (E2EE)
│ ├── signal.py # Signal
│ ├── email.py # Email (IMAP/SMTP)
│ ├── homeassistant.py # Home Assistant
│ ├── sms.py # SMS (Twilio)
│ ├── mattermost.py # Mattermost
│ ├── dingtalk.py # DingTalk
│ ├── feishu.py # Feishu/Lark
│ ├── wecom.py # WeCom (Enterprise WeChat)
│ ├── webhook.py # Generic webhook
│ └── api_server.py # OpenAI-compatible API server
│
├── acp_adapter/ # ACP server (IDE integration)
│ ├── server.py # HermesACPAgent class
│ ├── session.py # SessionManager
│ ├── events.py # Streaming callbacks
│ ├── permissions.py # Approval callbacks
│ └── entry.py # Entry point
│
├── cron/ # Scheduler
│ ├── scheduler.py # tick() — job execution engine
│ └── jobs.py # Job storage and CRUD
│
├── plugins/ # Plugin system
│ └── memory/ # 8 memory provider plugins
│ ├── openviking/
│ ├── mem0/
│ ├── hindsight/
│ ├── holographic/
│ ├── honcho/
│ ├── retaindb/
│ └── byterover/
│
├── environments/ # RL training environments (Atropos)
│ ├── hermes_base_env.py # Abstract base RL environment
│ ├── agent_loop.py # HermesAgentLoop — rollout execution
│ ├── tool_context.py # ToolContext — sandbox for RL
│ ├── web_research_env.py # Web research tasks
│ └── agentic_opd_env.py # Observation-Prediction-Demo env
│
├── skills/ # 70+ bundled skills across 28 categories
├── optional-skills/ # Additional optional skills
├── tests/ # ~3000 pytest tests
├── scripts/ # Install, update, packaging scripts
├── docker/ # Docker build files
├── docs/ # Documentation source (Docusaurus)
├── website/ # Landing page
├── desktop/ # Desktop app (Electron, separate repo)
├── tinker-atropos/ # RL submodule
│
├── pyproject.toml # Python package config
├── AGENTS.md # Developer guide for AI assistants
├── RELEASE_v0.2.0.md → v0.7.0.md # Release notes
└── cli-config.yaml.example # Example config
User config directory: ~/.hermes/
~/.hermes/
├── config.yaml # User settings
├── .env # API keys and secrets
├── MEMORY.md # Persistent agent memory
├── USER.md # User profile
├── SOUL.md # Agent personality/identity
├── sessions.db # SQLite session database
├── skills/ # User-installed skills
├── skins/ # Custom CLI themes
├── plugins/ # User plugins
├── cron/ # Cron jobs and output
│ ├── jobs.json
│ └── output/
├── cache/ # Image/audio cache
├── plans/ # Generated plans
├── profiles/ # Multi-instance profiles
└── mcp/ # MCP server configs
The AIAgent class is the heart of the system — the core conversation loop that orchestrates LLM calls, tool execution, context management, and response delivery.
Constructor (~60 parameters):
class AIAgent:
def __init__(self,
model: str = "anthropic/claude-opus-4.6",
max_iterations: int = 90,
enabled_toolsets: list = None,
disabled_toolsets: list = None,
quiet_mode: bool = False,
save_trajectories: bool = False,
platform: str = None, # "cli", "telegram", etc.
session_id: str = None,
session_db: SessionDB = None,
skip_context_files: bool = False,
skip_memory: bool = False,
base_url: str = None,
api_key: str = None,
provider: str = None,
api_mode: str = "chat_completions", # or "anthropic_messages" or "codex_responses"
tool_progress_callback = None,
stream_delta_callback = None,
thinking_callback = None,
status_callback = None,
iteration_budget: IterationBudget = None,
credential_pool = None,
checkpoints_enabled: bool = False,
# ... plus provider, routing, callback params
)
Main Methods:
| Method | Returns | Purpose |
|---|---|---|
chat(message, stream_callback) | str | Simple interface — returns final response text |
run_conversation(user_message, system_message, conversation_history, task_id) | dict | Full interface — returns {final_response, messages, completed, api_calls, error} |
_interruptible_api_call(api_kwargs) | Response | Runs API request in background thread with interrupt support |
_interruptible_streaming_api_call(api_kwargs, on_first_delta) | Response | Streaming variant with delta callbacks |
The Core Agent Loop (inside run_conversation()):
while api_call_count < self.max_iterations and self.iteration_budget.remaining > 0:
response = client.chat.completions.create(
model=model, messages=messages, tools=tool_schemas
)
if response.tool_calls:
for tool_call in response.tool_calls:
result = handle_function_call(tool_call.name, tool_call.args, task_id)
messages.append(tool_result_message(result))
api_call_count += 1
else:
return response.content # Final text response
Key Behaviors:
chat_completions (OpenAI-compatible), anthropic_messages (Anthropic SDK), codex_responses (OpenAI Codex)IterationBudget (thread-safe):
class IterationBudget:
def consume() -> bool # Check and consume one iteration
def refund() # Give back iteration (for execute_code turns)
@property remaining # Remaining iterations
Bridges the agent and tool registry — handles discovery, schema generation, and dispatch.
Key Functions:
| Function | Purpose |
|---|---|
_discover_tools() | Imports all tool modules (each calls registry.register() on import) |
get_tool_definitions(enabled_toolsets, disabled_toolsets, quiet_mode) | Returns OpenAI-format tool schemas filtered by toolset |
handle_function_call(function_name, function_args, task_id, user_task, enabled_tools) | Main dispatcher — routes calls to registry with arg coercion |
coerce_tool_args(tool_name, args) | Type coercion for LLM-generated arguments (string→int, string→bool, etc.) |
Tool Discovery Order:
Special Tool Handling:
run_agent.py before handle_function_call()enabled_tools for sandbox tool listbrowser_navigate strips web_search reference if tools unavailableAsync Bridging:
asyncio.run()) to prevent "Event loop is closed" errors_run_async() detects running loop and spins up disposable thread if neededProvides flexible tool grouping and composition.
Core Toolsets:
| Toolset | Tools Included |
|---|---|
web | web_search, web_extract, web_crawl |
terminal | terminal |
file | read_file, write_file, edit_file, list_files, search_files |
browser | browser_navigate, browser_snapshot, browser_click, browser_type, browser_scroll, browser_extract |
vision | analyze_image |
image_gen | generate_image |
tts | text_to_speech |
todo | todo_read, todo_write |
memory | memory_read, memory_write |
session_search | session_search |
delegation | delegate_task |
code_execution | execute_code |
cronjob | create_job, list_jobs, delete_job |
messaging | send_message |
homeassistant | ha_get_states, ha_call_service, ... |
Composite Toolsets:
hermes-cli — All core tools for CLI platformhermes-telegram, hermes-discord, etc. — Platform-specific tool setshermes-gateway — Union of all platform toolsdebugging — terminal + file + websafe — Everything except terminalResolution:
resolve_toolset(name, visited=None) → List[str]
# Recursively resolves toolset to tool names
# Handles composition (includes) and cycle detection
# Special aliases: "all" or "*" = all tools
Singleton managing all tool schemas and handlers. Circular-import safe — has no tool dependencies.
ToolEntry (per-tool metadata):
@dataclass(slots=True)
class ToolEntry:
name: str
toolset: str
schema: dict # OpenAI-format tool definition
handler: Callable # Sync or async handler function
check_fn: Callable # Returns True if tool is available
requires_env: list # Required environment variables
is_async: bool
description: str
emoji: str
Key Methods:
registry.register(name, toolset, schema, handler, check_fn, requires_env)
registry.get_definitions(tool_names, quiet) # Returns filtered schemas
registry.dispatch(name, args, **kwargs) # Execute with async bridging
registry.deregister(name) # Remove (for MCP tool refresh)
registry.check_tool_availability() # Returns (available, unavailable)
SQLite-based persistent session storage with FTS5 full-text search.
Schema (v6):
-- Sessions table
sessions (
id TEXT PRIMARY KEY,
source TEXT, user_id TEXT, model TEXT, model_config TEXT,
system_prompt TEXT, parent_session_id TEXT,
started_at TEXT, ended_at TEXT, end_reason TEXT,
message_count INTEGER, tool_call_count INTEGER,
input_tokens INTEGER, output_tokens INTEGER,
cache_read_tokens INTEGER, cache_write_tokens INTEGER, reasoning_tokens INTEGER,
estimated_cost_usd REAL, actual_cost_usd REAL,
title TEXT -- UNIQUE INDEX
)
-- Messages table
messages (
id INTEGER PRIMARY KEY AUTOINCREMENT,
session_id TEXT, role TEXT, content TEXT,
tool_call_id TEXT, tool_calls TEXT, -- JSON
tool_name TEXT, timestamp TEXT,
token_count INTEGER, finish_reason TEXT,
reasoning TEXT, reasoning_details TEXT, codex_reasoning_items TEXT
)
-- FTS5 virtual table (auto-synced via triggers)
messages_fts (content)
Concurrency Model:
BEGIN IMMEDIATE for write transactions (lock at start, not commit)Key Operations:
create_session(), end_session(), reopen_session()add_message(), get_messages()search_sessions(query) — FTS5 full-text searchupdate_token_counts() — Supports both incremental (CLI) and absolute (gateway) modesImport-safe constants module with no circular dependencies.
get_hermes_home() → Path # HERMES_HOME env var or ~/.hermes
display_hermes_home() → str # User-friendly display: "~/.hermes"
get_optional_skills_dir() → Path # HERMES_OPTIONAL_SKILLS env var
parse_reasoning_effort(str) → Dict # "high" → {"enabled": True, "effort": "high"}
# Key constants
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"
NOUS_API_BASE_URL = "https://inference-api.nousresearch.com/v1"
AI_GATEWAY_BASE_URL = "https://ai-gateway.vercel.sh/v1"
VALID_REASONING_EFFORTS = ("xhigh", "high", "medium", "low", "minimal")
The HermesCLI class provides the interactive terminal interface.
Features:
┊ activity feed for tool resultsConfiguration Loading:
load_cli_config() → dict
# Loads from ~/.hermes/config.yaml (or ./cli-config.yaml fallback)
# Merges with hardcoded defaults
# Expands ${ENV_VAR} references
# Maps terminal config → env vars
All hermes subcommands are dispatched from here:
hermes # Default: interactive chat
hermes chat # Explicit interactive mode
hermes gateway start|stop|status|install|uninstall
hermes setup # Setup wizard
hermes model # Select model/provider
hermes tools # Configure tools
hermes skills # Manage skills
hermes config set|get # Direct config manipulation
hermes cron list|delete # Cron job management
hermes doctor # Diagnose issues
hermes sessions browse # Session picker
hermes profile create|list|switch|delete|export|import
hermes mcp serve|add|remove # MCP management
hermes acp # Start ACP server
hermes update|uninstall|version
Profile System:
_apply_profile_override() runs BEFORE any imports to set HERMES_HOME--profile/-p from argvKey Configuration Sections:
model: "anthropic/claude-opus-4.6" # or dict with provider/base_url/api_key
providers: {} # Provider-specific configs
fallback_providers: [] # Ordered failover list
credential_pool: {} # Multiple API keys per provider
agent:
max_turns: 90
gateway_timeout: 1800
tool_use_enforcement: "auto"
terminal:
backend: "local" # local|docker|modal|daytona|ssh|singularity
timeout: 180
persistent_shell: true
docker_image: "nikolaik/python-nodejs:..."
compression:
enabled: true
threshold: 0.50 # Compress when 50% of context used
target_ratio: 0.20 # Summary = 20% of compressed content
protect_last_n: 20
auxiliary:
vision: { provider, model }
web_extract: { provider, model }
compression: { provider, model }
memory:
memory_enabled: true
provider: "" # "" | "honcho" | "mem0" | etc.
memory_char_limit: 2200
display:
personality: "kawaii"
show_reasoning: false
inline_diffs: true
skin: "default"
streaming: true
tts:
provider: "edge" # edge|elevenlabs|openai|neutts
stt:
enabled: true
provider: "local" # local|groq|openai
privacy:
redact_pii: false
mcp_servers: {} # MCP server configurations
skills:
external_dirs: [] # Additional skill directories
approvals:
mode: "smart" # smart|always|off
Config Files:
~/.hermes/config.yaml — User settings (authoritative)~/.hermes/.env — API keys and secretsAll slash commands defined centrally in COMMAND_REGISTRY:
CommandDef(name, description, category, aliases, args_hint, cli_only, gateway_only)
Derived automatically by:
process_command() — dispatch on canonical name/hermes subcommandsKey Commands:
| Command | Aliases | Description |
|---|---|---|
/new | /reset | Start fresh conversation |
/model | Show/switch model | |
/personality | Set agent personality | |
/retry | Retry last turn | |
/undo | Remove last turn | |
/compress | /compact | Compress context |
/usage | /cost | Show token usage |
/insights | Usage analytics | |
/skills | Browse/install skills | |
/background | /bg | Manage background processes |
/plan | Generate implementation plan | |
/rollback | Restore filesystem checkpoint | |
/verbose | Toggle debug output | |
/reasoning | Set reasoning effort | |
/yolo | Toggle approval bypass | |
/btw | Ephemeral side question | |
/stop | Kill current agent run | |
/queue | Queue next prompt | |
/browser | Interactive browser session | |
/history | /resume | Session browser |
/skin | Switch CLI theme |
Modular interactive wizard with independent sections:
Features:
Provider-specific model lists:
_PROVIDER_MODELS = {
"nous": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", ...], # 25+
"openrouter": ["anthropic/claude-opus-4.6", "google/gemini-3-flash", ...], # 30+
"anthropic": ["claude-opus-4-6", "claude-sonnet-4-6", ...],
"openai": ["gpt-5", "gpt-5.4-mini", "gpt-4.1", "gpt-4o", ...],
"copilot": ["gpt-5.4", "gpt-5.4-mini", "gpt-5.3-codex", ...],
"huggingface": [...],
"minimax": [...],
"kimi-coding": [...],
"alibaba": [...],
"deepseek": [...],
# ... more providers
}
Features:
/models endpointsData-driven CLI visual customization — no code changes needed.
Customizable Elements:
| Element | Key | Used By |
|---|---|---|
| Banner border/title/accent | colors.* | banner.py |
| Response box border | colors.response_border | cli.py |
| Spinner faces (waiting/thinking) | spinner.* | display.py |
| Spinner verbs/wings | spinner.* | display.py |
| Tool output prefix | tool_prefix | display.py |
| Per-tool emojis | tool_emojis | display.py |
| Agent name/welcome/prompt | branding.* | banner.py, cli.py |
Built-in Skins: default, ares, mono, slate, poseidon, sisyphus, charizard
User Skins: Drop ~/.hermes/skins/<name>.yaml and activate with /skin <name>
Shell command execution across multiple backends.
def terminal_tool(
command: str,
background: bool = False,
timeout: Optional[int] = None,
task_id: Optional[str] = None,
force: bool = False, # Skip approval for dangerous commands
workdir: Optional[str] = None,
check_interval: Optional[int] = None, # Background task polling
pty: bool = False,
) -> str # JSON result
Features:
TERMINAL_ENV (local/docker/ssh/modal/daytona/singularity)is_interrupted() during executionSafe file operations with size guards and sensitive path protection.
read_file_tool(path, offset, limit) — Read with pagination (default 100K char limit)write_file_tool(path, content) — Write with approval for sensitive pathsedit_file_tool(path, old_text, new_text) — String replacement editinglist_files_tool(path) — Directory listingsearch_files_tool(pattern, path) — Glob/regex file searchSafety:
/dev/zero, /dev/stdin, etc.)/etc/, /boot/, ~/.ssh without approvalWeb search and content extraction.
web_search_tool(query, limit) — Search via configurable backendweb_extract_tool(urls, format) — Extract content from URLsweb_crawl_tool(url, instructions) — LLM-guided web crawlingBackends: Firecrawl, Parallel Web, Tavily, Exa, DuckDuckGo
Headless browser automation with accessibility tree.
browser_navigate(url, task_id) → str
browser_snapshot(task_id, max_chars) → str
browser_click(ref, task_id) → str # Element refs like @e1, @e2
browser_type(ref, text, task_id) → str
browser_scroll(ref, direction, task_id) → str
browser_extract(task, max_chars, task_id) → str
Backends:
agent-browser CLIFeatures:
Spawn isolated subagents for parallel workstreams.
def delegate_task(
goal: str,
context: str = None,
toolsets: List[str] = None, # Default: ["terminal", "file", "web"]
tasks: List[Dict] = None, # Batch mode: up to 3 concurrent
max_iterations: int = 50,
parent_agent = None,
) → str
Isolation:
Blocked Tools: delegate_task, clarify, memory, send_message, execute_code (no recursion, no user interaction)
Constraints: MAX_DEPTH=2, MAX_CONCURRENT_CHILDREN=3
Model Context Protocol client integration.
Configuration (config.yaml):
mcp_servers:
filesystem:
command: "npx"
args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
timeout: 120
remote_api:
url: "https://my-mcp-server.example.com/mcp"
headers:
Authorization: "Bearer sk-..."
sampling:
enabled: true
model: "gemini-3-flash"
Features:
tools/list_changed notificationsDangerous command detection and approval flow.
detect_dangerous_command(command: str) → (is_dangerous, pattern_key, description)
106 patterns covering:
rm -r /, mkfs, dd if=)chmod 777, chown -R root)DROP TABLE, DELETE FROM without WHERE)/etc/ writes, systemctl stop, fork bombs)cat ~/.env, cat ~/.netrc)Normalization: Strips ANSI escapes, null bytes, normalizes Unicode
Approval Modes:
smart — Learns which commands are safe based on user decisionsalways — Always ask for dangerous commandsoff — Never ask (or /yolo toggle)| Backend | File | Features |
|---|---|---|
| Local | local.py | Direct execution, interrupt support, non-blocking I/O, persistent shell |
| Docker | docker.py | Sandboxed containers, cap-drop ALL, no-new-privileges, PID limits, bind mounts |
| SSH | ssh.py | Remote execution via ControlMaster, persistent shell mixin |
| Modal | modal.py | Native Modal SDK Sandbox.create()/Sandbox.exec(), persistent snapshots |
| Managed Modal | managed_modal.py | Modal through Nous-hosted gateway |
| Daytona | daytona.py | Daytona SDK cloud sandboxes, stop/resume lifecycle |
| Singularity | singularity.py | Singularity containers with scratch dir, SIF cache |
Common Interface (BaseEnvironment ABC):
execute(command, timeout) → {"output": str, "returncode": int}
cleanup() → None
Assembles the system prompt from multiple sources:
| Component | Source |
|---|---|
| Agent Identity | DEFAULT_AGENT_IDENTITY or SOUL.md |
| Memory Guidance | When/how to use memory tool |
| Session Search Guidance | How to recall past conversations |
| Skills Guidance | When to create/patch skills |
| Tool Use Enforcement | Must execute tools, not describe actions |
| Skills Index | ~/.hermes/skills/.hermes-skills.json |
| Platform Hints | OS, Python version, shell, available tools |
| Context Files | .hermes.md, AGENTS.md, .cursorrules, SOUL.md |
| Model/Provider Info | Current model and provider identity |
Context File Discovery:
cwd/.hermes.md or HERMES.mdInjection Detection (30+ patterns):
Automatically compresses conversation history when approaching context limits.
class ContextCompressor:
threshold_percent: float = 0.50 # Compress when 50% of context used
protect_first_n: int = 3 # Keep system prompt + first turn
protect_last_n: int = 20 # Keep recent N messages
summary_target_ratio: float = 0.20 # Summary = 20% of compressed content
Compression Algorithm:
Anthropic prompt caching support for cost reduction.
Critical Rule: Do NOT implement changes that would alter past context, change toolsets, reload memories, or rebuild system prompts mid-conversation.
Separate LLM client for non-primary tasks:
Configured per-task via auxiliary section in config.yaml.
┊ prefixed activity feed for tool executionShared skill invocation for CLI and gateway:
~/.hermes/skills/ and external directories/plan command generates implementation plans stored in .hermes/plans/Main controller managing all platform adapters and routing messages.
Key Attributes:
adapters: Dict[Platform, BasePlatformAdapter] — Active platform instancessession_store: SessionStore — Conversation persistence_running_agents: Dict[str, AIAgent] — Per-session cached agents (preserves prompt caching)_pending_approvals: Dict[str, Dict] — Dangerous command approval trackingpairing_store: PairingStore — DM code-based user authorizationMessage Flow:
MessageEvent_handle_message()_handle_message_with_agent() (async, with per-session locking)Features:
/model/approve, /deny) with interactive buttons (Discord)SessionSource — Where a message originated (platform, chat_id, user info) SessionContext — Full context for system prompt injection (platforms, home channels, metadata) SessionStore — Loads/saves conversation transcripts as JSON files
~/.hermes/sessions/{session_key}.json
Format: [{role, content, timestamp}, ...]
Features:
16 adapters sharing BasePlatformAdapter interface:
| Platform | Key Features |
|---|---|
| Telegram | Polling + webhook mode, media handling, inline keyboards, forum topic isolation, group mention gating |
| Discord | Server channels, threads, reactions (processing/done/error), button-based approval, @mention requirement |
| Slack | Multi-workspace OAuth, thread handling, app_mention, /hermes subcommands |
| Group & DM support, media captions, LID↔phone alias resolution | |
| Matrix | E2EE room encryption, threaded messages, trusted device flow, native voice messages |
| Signal | Encrypted DMs, group membership, SSE keepalive, phone URL encoding |
| IMAP/SMTP, multi-recipient, skip_attachments option | |
| Home Assistant | REST tools + WebSocket, service discovery, smart home automation |
| SMS | Twilio integration |
| Mattermost | Self-hosted Slack alternative, configurable mention behavior |
| DingTalk | Alibaba enterprise messaging |
| Feishu/Lark | Enterprise messaging, message cards, approval workflows |
| WeCom | Enterprise WeChat, department management |
| Webhook | Generic HTTP POST for custom integrations |
| API Server | OpenAI-compatible /v1/chat/completions endpoint |
Common Features:
Built-in job scheduler running in the gateway background thread.
Schedule Types:
"once in 5m" — One-shot after duration"every 30m" — Recurring interval"0 9 * * *" — Standard cron expression"2026-04-06T14:00" — Absolute datetimeJob Storage: ~/.hermes/cron/jobs.json
Execution Flow:
tick() called every 60s from gateway background threadnext_run_athermes CLI subprocess with job prompt + skills~/.hermes/cron/output/{job_id}/{timestamp}.mdDelivery Targets:
"local" — Output saved locally only"origin" — Send to originating chat"telegram:<chat_id>" — Explicit platform/chat routing[SILENT] prefix — Suppress delivery but keep logsGrace Windows: Based on schedule frequency (120s–2hrs) to handle missed jobs
Skills are composable, agent-invokable knowledge units.
Structure:
skills/
├── creative/ # ASCII art, diagrams, music
├── software-development/ # Debugging, testing, docs
├── github/ # Codebase inspection, PR workflow
├── research/ # Literature, web scraping
├── productivity/ # Task management
├── media/ # Image, video, audio processing
├── mlops/ # ML experiment tracking
├── autonomous-ai-agents/ # Multi-agent orchestration
└── [20+ more categories]
Per-Skill Structure:
skill-name/
├── SKILL.md # Metadata (YAML frontmatter) + implementation instructions
└── [sub-skills]/
SKILL.md Example:
---
name: ascii-art
description: Generate ASCII art using multiple tools
version: 4.0.0
dependencies: []
metadata:
hermes:
tags: [ASCII, Art, Banners, Creative]
related_skills: [excalidraw]
---
[Implementation instructions, tool options, examples...]
Discovery:
~/.hermes/skills/ + external dirs.hermes-skills.json)hermes skillsagentskills.io) for community sharingDrop Python files into ~/.hermes/plugins/ to extend Hermes.
Plugin Capabilities:
pre_llm_call, post_llm_call, on_session_start, on_session_endhermes plugins enable/disable <name>Memory Provider Plugins (plugins/memory/): 8 implementations: openviking, mem0, hindsight, holographic, honcho, retaindb, byterover
Each implements:
class MemoryProvider:
def is_available() → bool
def store(key, value)
def retrieve(query) → list
def clear()
Hermes has a pluggable memory provider interface:
Built-in Memory:
MEMORY.md — Markdown file with persistent factsUSER.md — User profile informationHoncho Integration:
Configuration:
memory:
memory_enabled: true
provider: "" # "" for built-in, "honcho", "mem0", etc.
memory_char_limit: 2200
Agent Communication Protocol server for VS Code, Zed, JetBrains.
Entry: hermes acp → acp_adapter/server.py
HermesACPAgent Class:
initialize() # Handshake with IDE client
authenticate(method_id) # Validate credentials
new_session(cwd, mcp_servers) # Create isolated session
load_session(session_id) # Resume session
fork_session(session_id) # Branch for parallel work
list_sessions() # Browse all sessions
cancel(session_id) # Interrupt running agent
Features:
/model, /tools, /reset, /compact)OpenAI-compatible API endpoint for headless integrations (e.g., Open WebUI).
Endpoint: POST /v1/chat/completions
Features:
X-Hermes-Session-Id header for persistent sessions/api/jobs REST API for cron managementExpose Hermes conversations to MCP-compatible clients.
Entry: hermes mcp serve
Features:
Atropos-based RL training framework for agent policy optimization.
Base Class: HermesAgentBaseEnv (extends atroposlib.BaseEnv)
Configuration:
HermesAgentEnvConfig:
enabled_toolsets: ["terminal", "file", "web"]
max_agent_turns: 30
agent_temperature: 1.0
terminal_backend: "local" # or docker/modal for isolation
dataset_name: str
Subclass Requirements:
setup() — Load datasetget_next_item() — Return next taskformat_prompt() — Convert item → user messagecompute_reward() — Score rolloutevaluate() — Periodic eval on test setExample Environments:
web_research_env.py — Web research tasksagentic_opd_env.py — Observation-Prediction-Demonstrationhermes_swe_env.py — Software engineering tasksSupporting:
HermesAgentLoop — Orchestrates step-by-step rolloutsToolContext — Sandbox for tool execution, records side effectstrajectory_compressor.py — Compresses trajectories for training dataRun multiple fully isolated Hermes instances from the same installation.
Commands:
hermes profile create <name>
hermes profile list
hermes profile switch <name>
hermes profile delete <name>
hermes profile export <name>
hermes profile import <file>
hermes -p <name> # Launch with specific profile
Each profile gets:
HERMES_HOME directory (~/.hermes/profiles/<name>/)Implementation:
_apply_profile_override() sets HERMES_HOME env var before any importsget_hermes_home() automatically scope to active profile~/.hermes/profiles/) for cross-profile visibility--fuck-it-ship-it flag or /yolo toggle to bypass.docker, .azure, .config/gh, .ssh)execute_code sandbox output redactionprivacy.redact_pii modeuv.lock with hashes for reproducible buildslitellm dependency removed| Provider | API Mode | Key Features |
|---|---|---|
| Nous Portal | chat_completions | 400+ models, first-class setup |
| OpenRouter | chat_completions | 200+ models, provider routing preferences |
| Anthropic | anthropic_messages | Native prompt caching, OAuth PKCE |
| OpenAI | chat_completions | GPT-5, Codex |
| GitHub Copilot | chat_completions | OAuth, 400k context |
| Hugging Face | chat_completions | Curated agentic model picker |
| Google (Direct) | chat_completions | Full Gemini context lengths |
| z.ai/GLM | chat_completions | Chinese LLM models |
| Kimi/Moonshot | chat_completions | Kimi Code API |
| MiniMax | anthropic_messages | M2.7 models |
| Alibaba/DashScope | chat_completions | Qwen models |
| DeepSeek | chat_completions | V3 models |
| Vercel AI Gateway | chat_completions | Routing through Vercel |
| Kilo Code | chat_completions | Custom provider |
| OpenCode Zen/Go | chat_completions | Custom provider |
| Custom Endpoint | configurable | Any OpenAI-compatible API |
least_used rotation/v1/props for llama.cppstream_delta_callback/reasoning command to toggle display and effort level<think> tag extraction for compatible models216 merged PRs from 63 contributors, resolving 119 issues
call_llm() API)hermes -w)/rollback~/.hermes/plugins/)/stop command/browser connect via Chrome DevTools Protocol/v1/chat/completions)@file and @url context references with tab completion95 PRs and 16 resolved issues in 2 days
hermes mcp serve)168 PRs and 46 resolved issues
hermes_constants.py (no deps — imported by everything)
↑
tools/registry.py (no tool deps — imported by all tool files)
↑
tools/*.py (each calls registry.register() at import time)
↑
model_tools.py (imports tools/registry + triggers tool discovery)
↑
run_agent.py (AIAgent), cli.py (HermesCLI), gateway/run.py (GatewayRunner)
↑
hermes_cli/main.py (entry point — dispatches to all subsystems)
Key Principle: tools/registry.py is circular-import safe. It has no tool dependencies. Tool files import the registry; model_tools.py imports both.
| Pattern | Description |
|---|---|
| Registry-based Tool System | Single source of truth; plugins register at import time; dynamic (MCP) and static tools coexist |
| Toolset Composition | Recursive resolution with cycle detection; platform-specific composites |
| Iteration Budget | Thread-safe shared budget across parent + subagents |
| Streaming First | Preferred over non-streaming for health checking (stale connection detection) |
| Prefix Caching | System prompt cached across turns (Anthropic optimization); context never altered mid-conversation |
| Proactive Compression | Triggered at 50% context usage; structured summaries with iterative updates |
| Async Bridging | Persistent event loops prevent "Event loop is closed"; per-thread loops for workers |
| Profile Isolation | HERMES_HOME env var set before imports; all state functions route through get_hermes_home() |
| Agent Caching | Gateway caches AIAgent per session to preserve prompt cache across turns |
| WAL Concurrency | SQLite WAL mode + jitter retry for concurrent readers + single writer |
| Plugin Architecture | Tools, toolsets, hooks, memory providers extensible via plugins |
| Multi-Backend Execution | Pluggable terminal backends with unified BaseEnvironment interface |
| Safety Layers | Approval system → sensitive path guards → injection detection → capability filtering |
| Variable | Purpose |
|---|---|
HERMES_HOME | Override home directory (profiles set this automatically) |
OPENROUTER_API_KEY | OpenRouter provider key |
ANTHROPIC_API_KEY | Anthropic provider key |
OPENAI_API_KEY | OpenAI provider key |
NOUS_API_KEY | Nous Portal key |
FIRECRAWL_API_KEY | Web search/extraction |
EXA_API_KEY | Exa search backend |
BROWSERBASE_API_KEY | Cloud browser |
FAL_KEY | Image generation |
ELEVENLABS_API_KEY | Premium TTS |
HONCHO_API_KEY | Honcho memory |
TERMINAL_ENV | Terminal backend override |
TERMINAL_CWD | Terminal working directory |
MESSAGING_CWD | Gateway working directory |
| File | Purpose |
|---|---|
~/.hermes/config.yaml | Main configuration (YAML) |
~/.hermes/.env | API keys and secrets |
~/.hermes/MEMORY.md | Persistent agent memory |
~/.hermes/USER.md | User profile |
~/.hermes/SOUL.md | Agent persona/identity |
~/.hermes/sessions.db | SQLite session database |
~/.hermes/cron/jobs.json | Cron job definitions |
.hermes.md (in project dir) | Per-project context file |
AGENTS.md (in project dir) | Developer instructions for agent |
DO NOT hardcode ~/.hermes paths — Use get_hermes_home() from hermes_constants. Hardcoding breaks profiles.
DO NOT use simple_term_menu — Rendering bugs in tmux/iTerm2 (ghosting). Use curses instead.
DO NOT use \033[K (ANSI erase-to-EOL) — Leaks as literal text under prompt_toolkit's patch_stdout. Use space-padding.
_last_resolved_tool_names is process-global — Saved/restored around subagent execution in delegate_tool.py.
DO NOT hardcode cross-tool references in schemas — Tool may be unavailable. Add dynamic references in get_tool_definitions().
Tests must not write to ~/.hermes/ — _isolate_hermes_home autouse fixture redirects to temp dir.
Prompt caching must not break — Do NOT alter past context, change toolsets, reload memories, or rebuild system prompts mid-conversation.
Working directory behavior differs: CLI uses os.getcwd(), gateway uses MESSAGING_CWD env var.
Config has three loaders: load_cli_config() (CLI), load_config() (hermes tools/setup), direct YAML (gateway). They have different merge behaviors.
Profile operations are HOME-anchored — _get_profiles_root() returns Path.home() / ".hermes" / "profiles", NOT get_hermes_home() / "profiles". This is intentional for cross-profile visibility.
This document covers Hermes Agent v0.7.0 as of April 2026. For the latest information, refer to the official documentation and the GitHub repository.