| name | direction-research |
| description | Research external agent systems (Letta, Hermes-Agent, OpenClaw, Devin, Cline, Cursor, Manus, Deep Agents, browser-use, AG-UI, etc.), AI coding agents (Claude Code, Codex), agent protocols (MCP, A2A, AG-UI), or internal architecture gaps to produce feasibility analysis and actionable recommendations. Default entry for any architectural / framework / library / "modern way" question — do not answer from priors alone. Trigger phrases — "research Letta", "compare with Hermes", "how does X handle memory", "should we use", "what's the right architecture for", "modern way to", "is X better than Y", "feasibility study", "direction research". |
Direction Research
Investigate external agent architectures, AI/coding agent runtimes, agent protocols, or internal implementation gaps to inform V3 product and technical decisions. This is the default entry whenever the user asks about architecture, framework choice, library selection, agent protocol, or "the modern way" to do something — do not answer from training priors alone.
Read first
- AGENTS.md
- docs/README.md
- docs/product/project_status.md (current V3 direction; jump from §2 to the active ADR — ADR numbers are deliberately not hardcoded here, so this skill does not need updating each time a new ADR supersedes an old one)
- docs/architecture/v3/memory-native-sales-agent.md
- docs/delivery/tasks/_active.md
- docs/research/*.md — prior dated snapshots. Treat any snapshot older than ~60 days as needing a refresh on fast-moving framework claims; treat newer ones as evidence but still verify load-bearing claims live.
- Any user-provided links, files, or specific questions.
Scope
Identify the specific mechanism in the target system (memory management, tool use, sandbox/working state, persistence, runtime, hooks, skills, subagents, multi-agent topology, protocol surface). Compare it against the current V3 implementation. Highlight what is adoptable, what is inapplicable, and what needs further investigation.
Information sources (priority order)
- User-provided files, links, or quotes.
- Local project docs and code (backend/, web/, docs/).
- Live external sources — required for every external claim:
  - DeepWiki MCP (when loaded in session): architecture-level Q&A on public GitHub repos. Free and unlimited — prefer it first.
  - Context7 MCP (when loaded in session): version-specific library API / docs lookup. Free tier is 1000 calls/month — spend it deliberately on the load-bearing claim, not on every fact.
  - WebFetch on a canonical repo / docs URL (raw.githubusercontent.com/..., official docs site, release notes page) — fallback when MCP servers are not loaded in the current session, or when the question is outside DeepWiki/Context7 coverage.
  - WebSearch for discovery / when the target is unknown / for ranking signals.
- Blogs / Medium / Wikipedia / SEO content — only as confirmation, never sole source. Cross-verify with repo or official docs. Be especially skeptical of "X-month-old project at 100K+ stars" claims; verify with the repo's own commit history and release dates.
Do not guess at implementation details. If a source is unclear or unreachable, state the uncertainty explicitly.
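To make the hierarchy concrete, the sketch below shows how evidence for a single claim might be sourced. The project name, version, and URLs are hypothetical placeholders, not real citations — substitute the canonical paths the actual target exposes.

```markdown
Claim under investigation: "ExampleAgent persists working memory across sessions."

- Primary (canonical, pinned to a tag): https://raw.githubusercontent.com/example-org/example-agent/v0.9.0/docs/memory.md
- Primary (release evidence): https://github.com/example-org/example-agent/releases/tag/v0.9.0
- Confirmation only, never sole source: a Medium post or Wikipedia article summarizing the same mechanism
```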
Required output structure
Every direction-research report MUST include all five sections:
A. Target system summary
Concise description of the mechanism being investigated.
B. Dated evidence table
For every claim about an external system, a row with:
- Project / library name
- Version or release date examined (e.g. v0.13.0 (2026-05-07), a commit SHA, or "as of <URL fetch date>")
- Source URL (specific repo path, docs URL, or release notes URL — not the org root)
- Direct quote or precise paraphrase of the load-bearing fact
A claim without a dated evidence row cannot enter the recommendation in §E.
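For illustration, a minimal §B table could look like the sketch below. Every value (project, version, date, URL, quote) is a hypothetical placeholder showing the required shape, not actual evidence.

```markdown
| Project | Version / date examined | Source URL | Load-bearing fact |
| --- | --- | --- | --- |
| example-agent | v0.9.0 (2026-05-07) | https://github.com/example-org/example-agent/blob/v0.9.0/docs/memory.md | "Working memory is flushed to SQLite at the end of every run." |
```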
C. Current V3 comparison
- What we have today (cite file paths in backend/, web/, docs/; see the sketch after this list).
- What we lack relative to the target.
- What differs structurally (not just feature-by-feature).
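For example, a §C block might read as follows — the path and gaps here are invented for illustration only; an actual report must cite real files from the repository.

```markdown
- Have: per-conversation persistence in backend/<module>.py (hypothetical — cite the real path)
- Lack: no equivalent of the target's working-memory flush between runs
- Structural difference: the target is a single-user local process; V3 is a multi-tenant server runtime
```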
D. Counterevidence (mandatory)
Before recommending, ask:
- What does the target system explicitly fail at, deprecate, or warn against?
- Is the project itself migrating away from this mechanism (e.g., Letta's pivot away from server-side sleep-time agents)?
- Is the claim relying on blog / Wikipedia / Medium / SEO content rather than repo evidence?
- Is hype level (stars, blog count) inflated relative to actual production usage?
- Is the mechanism scoped for single-tenant / single-user / local-first when our use is multi-tenant SaaS?
If all checks pass cleanly, state so explicitly. Do not omit the section.
E. Recommendation
One of: adopt / adapt / defer / reject — grounded in §B (evidence) and §D (counterevidence), not priors.
No open-ended option lists without a §E conclusion.
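A compliant §E conclusion commits to one of the four calls and ties it back to §B and §D, for example (hypothetical):

```markdown
Recommendation: adapt. The persistence mechanism documented in §B row 1 fits our runtime, but the §D
counterevidence (single-tenant scoping) means we re-implement the storage layer behind our own
multi-tenant boundary rather than adopting the library wholesale.
```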
Persistence
Before writing to disk, ask the user for confirmation. Preferred locations:
- docs/research/<topic>.md for general architecture research. Read docs/research/README.md §3 (准入条件, admission criteria) first — the note's frontmatter must declare 性质: Research-only / 不自动开放实现授权 (research-only; implementation is not automatically authorized). See the frontmatter sketch after this list.
- Adjacent to the relevant ADR or task file if the research feeds a pending decision.
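A minimal sketch of a docs/research/<topic>.md header, assuming the conventions in docs/research/README.md §3 — verify the exact required keys there before writing. Only the 性质 declaration is mandated above; the other fields are illustrative placeholders.

```markdown
---
topic: <topic>            # illustrative placeholder
date: 2026-05-07          # illustrative placeholder — date the snapshot was taken
性质: Research-only / 不自动开放实现授权
---
```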
Stop conditions
Stop and escalate if:
- the research question is too broad for an actionable conclusion (split into focused sub-questions)
- the user has not provided a specific target system or internal area to compare
- the conclusion would require deciding product priority or scope beyond the current task
- live external sources are blocked / unreachable AND no recent prior docs/research/ snapshot covers the question — state the gap explicitly and offer to defer pending source access, rather than answering from priors