| name | deep-research |
| description | Conduct systematic deep research on technical topics with source verification, triangulation, and citation-backed reports. Use when the user asks for deep research, comprehensive analysis, technology comparison, trend analysis, state of the art review, or architecture evaluation. Not for simple lookups, debugging, or questions answerable with 1-2 searches. |
Deep Research
Core Purpose
Deliver citation-backed, verified research reports through a structured
multi-phase pipeline. Every factual claim must be traceable to a source.
Autonomy Principle: Operate independently through retrieval and synthesis.
CRITICAL — Phase 0 is mandatory. Your FIRST response to the user MUST
be clarifying questions based on ambiguity analysis (Phase 0). Do NOT run
any searches, launch any subagents, or begin Phase 1 until the user has
answered. Skipping Phase 0 is the single most common failure mode of this
skill. Even for precise prompts, confirm your assumptions before starting.
After Phase 0, do not pause again: phases 1 through 5 run without further
user interaction. Show the research plan (Phase 1) in the output for
transparency, but do not wait for confirmation — proceed directly to retrieval.
Decision Tree
Request received
├── Simple lookup (1-2 searches enough)? → STOP: use normal WebSearch
├── Debugging / code fix? → STOP: use standard agent mode
└── Complex analysis needed? → CONTINUE ↓
Depth selection
├── standard → 2-3 subagents, ~4,000 words, phases 0-1-2-3-[QG]-3.1-3.5-4-5 [DEFAULT]
└── deep → 4-6 subagents, 6,000+ words, phases 0-1-2-3-[QG]-2-3-3.1-3.5-4-5
[QG] = Quality Gate: skip next wave if source targets already met
If the user does not specify depth, default to standard.
Agent Architecture
This skill uses a lead-agent / subagent pattern. The main (lead) agent
orchestrates the pipeline; subagents handle retrieval, verification, and
(in deep mode) synthesis. Every phase has a clear owner.
Agent Roles
| Role | Agent | Purpose |
|---|---|---|
| Orchestrator | Main agent | Clarify, scope, plan, triangulate, critique, write report |
| Retrieval | Subagent (Role 1) | Broad search across assigned key areas (Wave 1) |
| Gap-Fill | Subagent (Role 2) | Targeted search for specific evidence gaps (Wave 2+) |
| Verification | Subagent (Role 3) | Citation spot-check: do sources support their claims? |
| Synthesis | Subagent (Role 4) | Draft area summaries with citations (deep only) |
Agent Assignment per Phase
| Phase | Standard | Deep |
|---|---|---|
| 0 Clarify | Main | Main |
| 1 Scope | Main | Main |
| 2 Retrieve (Wave 1) | 2-3 Retrieval SA | 4 Retrieval SA |
| 2b Retrieve (Wave 2+) | 1 Gap-Fill SA | 1-2 Gap-Fill SA |
| 3 Triangulate | Main | Main |
| 3.1 Citation Check | 1 Verification SA | 1 Verification SA |
| 3.5 Outline Refine | Main | Main |
| 4 Critique | Main | Main |
| 5 Write Report | Main | Main (assisted by 1 Synthesis SA) |
Key design decisions:
- The Verification Agent runs as a subagent in both standard and deep,
freeing the main agent's tool-call budget for coordination and writing.
- The Gap-Fill Agent has a sharper prompt than the generic Retrieval
Agent — it knows what is missing and searches specifically for that.
- In deep mode, a Synthesis Agent pre-drafts area summaries before the
main agent writes the final report. This reduces the main agent's
synthesis load and produces more consistent cross-referencing.
- The main agent never runs WebSearch/WebFetch in Phase 2. All retrieval
is delegated to subagents so the main agent preserves its tool-call
budget for Phases 3-5.
Subagent Model Selection
Use model: "fast" for Retrieval (Role 1) and Gap-Fill (Role 2) subagents
— they search and extract, no deep reasoning needed. This reduces cost and
latency. Use the default model (no model parameter) for Verification
(Role 3) and Synthesis (Role 4) subagents, which require careful judgment.
Source Credibility Tiers
Subagents (Retrieval + Gap-Fill) tag every source with a credibility tier
at extraction time. This costs zero extra tool calls — the agent already
reads the URL, author, and content. Phase 3 uses the tiers for confidence
assignment.
| Tier | Label | Examples |
|---|---|---|
| 1 | Authoritative | Peer-reviewed journals (arxiv, ACM, IEEE), official documentation, established research labs (Microsoft Research, Google Research), recognized industry authorities (Fowler, Thoughtworks Radar, DORA/State of DevOps) |
| 2 | Credible | Engineering blogs from established companies (Netflix, Spotify, Airbnb tech blogs), conference talks (QCon, STAREAST, SeleniumConf), well-known practitioners with published track record |
| 3 | Supplementary | Personal blogs, Medium posts, Reddit/HN discussions, vendor marketing content, undated articles, sources with no identifiable author |
Tagging rule: Each source in the source list gets a Tier: 1/2/3 tag.
Phase 3 uses tiers for confidence: [High] requires at least one Tier 1
or Tier 2 source. Three Tier 3 sources alone = [Medium] maximum.
Tool Constraints & Subagent Strategy
A single agent turn has a limited tool-call budget and a finite context window.
Subagents bypass these limits: each subagent (spawned via the Task tool)
gets its own tool budget and isolated context. Run multiple subagents in
parallel by issuing several Task calls in a single message.
Scaling with Subagents
| Depth | Subagents | Effective tool calls | Source target |
|---|---|---|---|
| standard | 2-3 parallel | ~75 | 15-25 |
| deep | 4 parallel | ~100 | 30-50 |
How it works
- Phase 0 + 1 run on the main agent (clarify, scope, plan).
- Phase 2 splits key areas across parallel subagents. Each subagent
gets a clear brief:
- Assigned key areas (e.g. "Areas 1-2" vs. "Areas 3-4")
- The search terms relevant to its areas
- Instructions to return a structured source list with extracts
- Main agent collects subagent results, deduplicates sources, runs
Phase 3 (triangulate), and identifies gaps.
- If gaps remain, a second wave of Gap-Fill subagents fills them.
- Phase 3.1 delegates citation spot-check to a Verification subagent.
- Phase 5 (write) runs on the main agent. In deep mode, a Synthesis
subagent pre-drafts area summaries first.
Subagent prompt templates
Role 1: Retrieval Agent (Wave 1 — both depths)
You are a research retrieval agent. Your task:
Topic: [research question]
Your assigned areas:
- [Area N]: [sub-questions]
- [Area M]: [sub-questions]
Search terms to start with: [list]
Budget: max 8 WebSearch + 5 WebFetch calls. If you have not found
sufficient sources after these calls, return what you have — do not
loop indefinitely. Keep your full response under 2000 words.
Source credibility tiers — assign one per source based on domain,
author, and content signals:
Tier 1 (authoritative): Peer-reviewed, official docs, research labs,
recognized authorities (Fowler, DORA, Thoughtworks Radar)
Tier 2 (credible): Established company engineering blogs, conference
talks, well-known practitioners
Tier 3 (supplementary): Personal blogs, Medium, Reddit/HN, vendor
marketing, undated or anonymous content
Instructions:
1. Run 4-6 WebSearch calls (parallel where possible).
2. For the 3-5 most relevant results, fetch full pages with WebFetch.
3. For each source, extract:
- Core argument / key data points (2-3 sentences)
- Author, publication, date
- Credibility tier (1, 2, or 3)
- Direct quotes worth citing (if any)
4. For EACH assigned area, write a 3-sentence TAKEAWAY that distills
the consensus across sources. This takeaway is the main deliverable
— the lead agent will use it directly as a building block for
synthesis. Place it before the source list.
5. Return the area takeaways followed by a numbered source list.
Number sources starting from [START_INDEX] (provided by lead agent).
## Area Takeaways
**[Area N]:** [3-sentence distillation]
**[Area M]:** [3-sentence distillation]
## Sources
[START_INDEX] Author — Title — URL — Date accessed — Tier: N
Summary: ...
Key quotes: ...
Do NOT write the final report. Return takeaways + source list only.
Role 2: Gap-Fill Agent (Wave 2+ — both depths)
You are a targeted gap-fill agent. Your task:
Topic: [research question]
KNOWN GAP: [specific area/claim that lacks evidence]
What we already have: [brief summary of existing sources for this area]
What we need: [specific type of evidence missing — e.g. opposing view,
quantitative data, case study, newer source]
Search terms to start with: [list — tailored to the gap]
Budget: max 5 WebSearch + 3 WebFetch calls.
Source credibility tiers — assign one per source:
Tier 1 (authoritative): Peer-reviewed, official docs, research labs,
recognized authorities
Tier 2 (credible): Established company engineering blogs, conference
talks, well-known practitioners
Tier 3 (supplementary): Personal blogs, Medium, Reddit/HN, vendor
marketing, undated or anonymous content
Instructions:
1. Search specifically for the missing evidence. Do NOT re-cover
areas that already have sufficient sources.
2. Try alternative framings if direct searches fail (synonyms,
broader/narrower scope, different language).
3. If the gap cannot be filled after your budget, state this
explicitly — do NOT pad with tangentially related sources.
4. Number sources starting from [START_INDEX].
Return in the same format as the Retrieval Agent (takeaway + sources
with tier tags).
Role 3: Verification Agent (Phase 3.1 — both depths)
You are a citation verification agent. Your task:
Budget: max 10 WebFetch calls. Keep your response under 1500 words.
You will receive a list of claims paired with their cited sources.
For each claim–source pair:
1. Fetch the source URL with WebFetch (or use the provided extract).
2. Locate the passage that supposedly supports the claim.
3. Rate the match:
- SUPPORTED — the source directly states or strongly implies the claim.
- PARTIAL — the source is related but does not fully support the claim.
- UNSUPPORTED — the source does not support the claim, or contradicts it.
4. For PARTIAL / UNSUPPORTED, suggest a corrected claim or flag for removal.
Return a verification table:
| # | Claim (short) | Source [N] | Rating | Notes |
Role 4: Synthesis Agent (Phase 5 — deep only)
You are a research synthesis agent. Your task:
Budget: no external tool calls needed. Keep your response under 3000 words.
You will receive:
- A deduplicated source list with extracts covering [N] key areas
- Area takeaways from retrieval agents
- Confidence labels from triangulation
For each key area:
1. Identify the consensus view across sources.
2. Note contradictions or minority positions.
3. Draft 2-3 paragraphs of **report-ready prose** with inline [N]
citations. The lead agent will incorporate these directly into
the final report — write at publication quality.
4. Flag any area where evidence is thin or one-sided.
Return structured summaries per key area. Do NOT write the executive
summary, TL;DR, action plan, or methodology — the lead agent handles those.
Merge Protocol (Main Agent — after each wave)
When subagent results come back, the main agent merges them as follows:
- Renumber sources — each subagent uses a START_INDEX to avoid
collisions, but verify there are no gaps or duplicates. The final
bibliography must be a clean sequential [1], [2], [3]... list.
- Deduplicate by URL — if two subagents found the same source,
keep the richer extract. Update all [N] references accordingly.
- Reconcile conflicting takeaways — if two subagents produced
contradictory takeaways for overlapping areas, do NOT silently
pick one. Instead, note the contradiction as a finding for Phase 3
(triangulation). Both perspectives must surface in the report.
- Preserve tier tags — carry over the credibility tier from the
subagent output into the master source list.
- Update the master source list before proceeding to the next phase.
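A minimal sketch of this merge in Python, under stated assumptions: each source is a dict with index, url, summary, and tier fields (illustrative names, not a prescribed format). It shows the dedupe-then-renumber order and the old-to-new index map used to rewrite [N] references in takeaways.

```python
def merge_wave(master, new_sources):
    """Merge one retrieval wave into the master source list.

    Dedupes by URL (keeping the richer extract), renumbers the result into a
    clean sequential list, and returns an old->new index map so existing [N]
    citations in takeaways can be rewritten to match.
    """
    by_url, alias = {}, {}
    for src in master + new_sources:
        kept = by_url.get(src["url"])
        if kept is None:
            by_url[src["url"]] = dict(src)
        else:
            # Duplicate URL: keep the richer extract (length as a rough proxy)
            # under the number that was assigned first.
            if len(src.get("summary", "")) > len(kept.get("summary", "")):
                kept["summary"], kept["tier"] = src["summary"], src["tier"]
            alias[src["index"]] = kept["index"]

    merged = sorted(by_url.values(), key=lambda s: s["index"])
    remap = {}
    for new_idx, src in enumerate(merged, start=1):   # clean [1], [2], [3]... sequence
        remap[src["index"]] = new_idx
        src["index"] = new_idx
    for old, survivor in alias.items():               # duplicates point at the survivor
        remap[old] = remap[survivor]
    return merged, remap
```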
Additional optimization tips
- Be selective with WebFetch — only fetch pages whose search snippet
looks highly relevant. Not every hit needs a full page load.
- Leverage WebSearch summaries — the tool returns snippets. Use these
as lightweight sources when a full fetch adds no value.
- Follow citation chains — if a good article references another source,
have a subagent search for that source directly.
- Deduplicate — subagents may find the same source. Merge in Phase 3.
Error handling
- WebSearch returns no results: Rephrase with broader terms or try
alternative language. After 3 failed attempts on the same area, note
the gap explicitly and move on — do not loop indefinitely.
- WebFetch timeout / 403 / paywall: Fall back to the WebSearch snippet
as a lightweight source. Mark it
[snippet only] in the source list so
the triangulation phase knows it is lower-quality evidence.
- Subagent returns empty or poor results: Launch a replacement subagent
with revised search terms. Do not retry the identical prompt.
- Entire wave fails (all subagents return empty, e.g. due to rate
limiting or API errors): Graceful degradation — produce a reduced
report using whatever sources have been collected so far. Run Phase 3
(triangulate) on available evidence, note the degradation in the
Methodology section, and set all affected sections to
[Low] confidence.
If zero sources exist, inform the user and stop.
- Phase 0 or Phase 1 rejected by user: Revise scope or plan based on
user feedback. Do not skip phases or proceed without confirmation.
- Tool-call budget running low: Prioritize remaining areas by
importance. Prefer finishing with fewer but well-triangulated claims
over broad but shallow coverage.
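The WebFetch fallback above can be pictured as a small wrapper, sketched here in Python. The web_fetch callable and the field names are stand-ins for illustration, not the actual tool API.

```python
def fetch_or_snippet(url, snippet, web_fetch):
    """On a failed full fetch (timeout, 403, paywall), fall back to the
    search snippet and mark the source so triangulation treats it as
    lower-quality, [snippet only] evidence."""
    try:
        return {"url": url, "extract": web_fetch(url), "snippet_only": False}
    except Exception:
        return {"url": url, "extract": snippet, "snippet_only": True}
```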
Phases
Pre-Phase — CONTEXT LOADING (always)
Before starting a new research run, perform two checks:
1. Research context file:
Search for RESEARCH_CONTEXT.md in the project root first. If not found,
run a single Glob for **/RESEARCH_CONTEXT.md. If it exists, load it.
This file contains persistent user preferences for research (e.g. default
audience, recurring constraints, known terminology). Use it to pre-fill
Phase 0 dimensions — only ask about dimensions the context file does not
already answer.
2. Existing report check:
Search for DEEP_RESEARCH_*.md files using Glob.
- If a match exists: inform the user, state the file's last-modified date,
and ask whether to update (use existing report as baseline, fill gaps)
or replace (full new run).
- If no match: proceed to Phase 0.
Phase 0 — CLARIFY (always — MANDATORY first response)
STOP. Do not call WebSearch, WebFetch, or Task before completing this
phase. Your first message to the user must be clarifying questions — nothing
else. This is the only user-facing pause in the entire pipeline.
Analyze the user's query for ambiguity across five dimensions. For each
dimension where the query is ambiguous, generate one clarifying question.
Skip dimensions where the query is already specific.
| # | Dimension | Ask when… | Example question |
|---|---|---|---|
| 1 | SCOPE | The topic boundary could be interpreted broadly or narrowly | "Do you mean all QA roles or only test automation?" |
| 2 | GOAL | It's unclear what decision or deliverable the research supports | "Should this be a decision brief or general knowledge building?" |
| 3 | AUDIENCE | The reader's technical depth is unknown | "Who reads the report: your team, Lewis, or external readers?" |
| 4 | CONSTRAINTS | Unstated limitations could change the research direction | "Are there tech stack, budget, or timeline constraints?" |
| 5 | ASSUMPTIONS | The query contains implicit assumptions that should be validated | "You are assuming X: should I question that or take it as given?" |
Rules:
- Generate 1-5 questions (only for ambiguous dimensions).
- If all five dimensions are unambiguous, state which assumptions you are
making and ask a single confirmation: "I understand the scope as X,
correct?" Never skip Phase 0 entirely.
- If a
RESEARCH_CONTEXT.md was loaded in the Pre-Phase, use it to resolve
known dimensions before asking.
Present the questions and STOP. Wait for answers before proceeding.
Phase 1 — SCOPE
- Restate the research question in one sentence, incorporating Phase 0 answers.
- Break the topic into 4-7 key areas, each with 3-5 sub-questions.
- Draft 10-15 search terms — vary:
- Language (English + German or domain language)
- Specificity (broad term + narrow term)
- Recency markers (e.g. "2025", "2026")
- Perspective — for each key area, include at least one search term
from a critical or opposing viewpoint (e.g. "[technology] criticism",
"[approach] failure cases", "[tool] alternatives comparison",
"[concept] risks drawbacks"). This prevents one-sided retrieval.
- State the chosen depth (standard / deep).
- Assign key areas to subagents — split areas evenly, assign
START_INDEX ranges so source numbering won't collide (e.g. Subagent A
starts at [1], Subagent B at [20], Subagent C at [40]).
- Show the plan in the output for transparency, then proceed
immediately to Phase 2. Do not wait for user confirmation.
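One way to picture the even split and the collision-free numbering (a sketch: the round-robin assignment and the block size of 20 are assumptions, the spacing only needs to exceed each subagent's source budget):

```python
def plan_wave(key_areas, n_subagents, block=20):
    """Assign key areas round-robin and give each subagent a START_INDEX
    block so source numbering from parallel subagents cannot collide."""
    return [
        {
            "areas": key_areas[i::n_subagents],   # e.g. areas 1,4 / 2,5 / 3,6
            "start_index": 1 + i * block,         # e.g. [1], [21], [41]
        }
        for i in range(n_subagents)
    ]
```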
Phase 2 — RETRIEVE (parallel subagents + iterative rounds)
Wave 1:
- Split the key areas from Phase 1 across 2-4 Retrieval subagents
(one Task call each, launched in a single message for parallel execution).
- Each subagent gets:
- Its assigned key areas + sub-questions
- Relevant search terms from the Phase 1 plan
- Its START_INDEX for source numbering
- The Retrieval Agent prompt template (Role 1)
- Wait for all subagents to return.
- Merge results using the Merge Protocol (see above).
- Proceed to Phase 3 (Triangulate) without pausing.
Wave 2 (if gaps found after Phase 3):
- Launch 1-2 Gap-Fill subagents (Role 2) targeting thin areas only.
Each gets: the specific gap, existing sources summary, and what's needed.
- Merge results into the master source list.
- Run Phase 3 again.
Wave 3 (deep mode only, if still gaps):
- One final Gap-Fill subagent wave for remaining holes.
- Final merge + triangulation.
Source targets after all waves:
| Depth | Waves | Subagents total | Sources |
|---|---|---|---|
| standard | 1-2 | 2-4 | 15-25 |
| deep | 2-3 | 4-6 | 30-50 |
Search strategy (applies to all subagents):
- Alternate broad and narrow queries.
- If a search returns poor results, rephrase — don't repeat the same query.
- Try both English and German terms for EU/DACH-relevant topics.
- Add year qualifiers ("2025", "2026") to find recent content.
- Follow citation chains: if a source references another, search for it.
- Depth vs. breadth: If early results reveal that a sub-area is
well-covered (3+ strong sources), shift remaining tool calls to
under-covered areas. Do not keep fetching more sources for areas
that already have high-confidence evidence.
Phase 3 — TRIANGULATE (after each retrieval round)
- For each key claim, check: is it supported by 2+ independent sources?
- Assign confidence levels using both source count and credibility tiers:
[High] — 3+ independent sources agree, with at least one Tier 1
or Tier 2 source
[Medium] — 2 sources agree (any tier), or 1 highly authoritative
Tier 1 source, or 3+ Tier 3 sources agreeing (cap: cannot exceed Medium)
[Low] — single source only, or only Tier 3 sources with fewer than 3
- Apply tiered recency rules:
- Trends, tooling, benchmarks — strict <18 months. Discard older sources.
- Methodology foundations (e.g. Fowler, Google Testing Blog, Microsoft
Research, seminal papers) — allowed regardless of age. Mark as
[foundational] in the source list.
- In between — allow up to 36 months if no newer source covers the
same claim. Note the age explicitly in the report.
- Note contradictions between sources — these become findings, not errors.
- Identify thin areas (any key area with <2 sources or no
[High] claims)
→ feed back into the next Phase 2 round via Gap-Fill subagents.
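The confidence rules reduce to a small decision function. A sketch, assuming each claim carries the list of credibility tiers of its independent supporting sources:

```python
def assign_confidence(tiers):
    """Map supporting-source tiers to a confidence label, e.g. [1, 3, 3]
    means one Tier 1 and two Tier 3 sources back the claim."""
    n = len(tiers)
    has_strong = any(t in (1, 2) for t in tiers)
    if n >= 3 and has_strong:
        return "High"
    # Two agreeing sources (any tier), a single Tier 1 source, or three or
    # more Tier-3-only sources are all capped at Medium.
    if n >= 2 or (n == 1 and tiers[0] == 1):
        return "Medium"
    return "Low"
```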
Quality Gate (between waves)
After each triangulation, check whether another retrieval wave is needed:
All key areas have ≥2 sources?
AND source target met (15+ standard, 30+ deep)?
AND no key area is entirely [Low] confidence?
AND at least 1 source per key area represents a critical/opposing view?
→ YES: skip remaining waves, proceed to Phase 3.1
→ NO: run next wave with Gap-Fill subagents, targeting thin areas only
This prevents wasting tool calls when coverage is already sufficient.
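The gate is a plain conjunction; a sketch follows, with the per-area field names (source_count, best_confidence, has_opposing_view) as illustrative assumptions:

```python
def quality_gate(areas, total_sources, depth):
    """True when coverage already meets the targets and the next retrieval
    wave can be skipped."""
    target = 15 if depth == "standard" else 30
    return (
        all(a["source_count"] >= 2 for a in areas)
        and total_sources >= target
        and all(a["best_confidence"] != "Low" for a in areas)   # not entirely [Low]
        and all(a["has_opposing_view"] for a in areas)          # critical view present
    )
```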
Phase 3.1 — CITATION SPOT-CHECK (both depths)
After triangulation passes, delegate citation verification to a
Verification subagent (Role 3). This frees the main agent's tool-call
budget and catches the most common failure mode of deep-research systems
(empirically measured at 40–80% citation accuracy across commercial tools).
- Select the 5–10 highest-impact claims in the collected evidence —
prioritize claims that drive the report's main conclusions.
- Send the claims + their cited source URLs/extracts to the Verification
subagent.
- When the subagent returns its verification table, process the results:
- SUPPORTED — keep as-is.
- PARTIAL — reformulate the claim to match what the evidence says.
- UNSUPPORTED — launch a single Gap-Fill subagent to find a
replacement source for this specific claim (max 3 WebSearch +
2 WebFetch). If no replacement found, downgrade to
[Low].
- Log any corrections in the Methodology Appendix of the final report.
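Handling the verification table can be sketched as follows; the row fields (rating, claim, suggestion) and the launch_gapfill callable are assumptions for illustration:

```python
def process_verification(rows, launch_gapfill):
    """Apply the SUPPORTED / PARTIAL / UNSUPPORTED handling rules and
    collect corrections for the Methodology appendix."""
    corrections = []
    for row in rows:
        if row["rating"] == "SUPPORTED":
            continue                                    # keep the claim as-is
        if row["rating"] == "PARTIAL":
            corrections.append(("reworded", row["claim"], row["suggestion"]))
        else:                                           # UNSUPPORTED
            replacement = launch_gapfill(row["claim"])  # max 3 searches + 2 fetches
            if replacement is None:
                corrections.append(("downgraded to [Low]", row["claim"], None))
            else:
                corrections.append(("re-sourced", row["claim"], replacement))
    return corrections
```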
Phase 3.5 — OUTLINE REFINEMENT (both depths)
After triangulation is complete, compare the planned report structure (from
Phase 1) against the actual evidence collected:
- Review fit: Do the planned sections match what the evidence supports?
- Adapt if needed:
- Add sections for unexpected but well-evidenced findings.
- Demote or merge sections with insufficient evidence.
- Reorder by evidence strength and importance.
- Document changes: Note what changed and why (evidence-driven, not
speculative). Include in the Methodology Appendix of the final report.
Guardrails:
- Adaptation must be driven by evidence already collected.
- Do not add sections without supporting sources.
- Do not abandon the original research question — stay on topic.
- Maximum 50% structural change. If evidence pushes beyond 50%:
- Standard mode: Note the mismatch in Methodology, apply the
best-fitting structure within the 50% limit, and flag the scope
issue in Open Questions.
- Deep mode: Pause and show the user the revised outline with
a short explanation of why the evidence diverges from the original
scope. Ask whether to (a) proceed with the new focus, (b) narrow
back to the original scope, or (c) split into two reports. This is
the only additional user checkpoint beyond Phase 0.
Phase 4 — CRITIQUE (both depths)
Before writing the final report, run a self-critique by asking these
red-team questions about the collected evidence:
- What's missing? — Are there perspectives, geographies, or source types
(academic, industry, community) that are underrepresented?
- What could be wrong? — Which claims rest on weak evidence? Are there
logical leaps between sources and conclusions?
- What would a skeptic say? — If someone disagreed with the main findings,
what would their strongest argument be?
If critique reveals a critical gap (not just a nuance):
- Launch 1 Gap-Fill subagent to fill the gap (assign next available
START_INDEX from the master source list).
- Time-box to 5 tool calls maximum.
- Do NOT restart the full pipeline.
Output: A short internal note (not included in the final report) listing:
- Gaps found and whether they were addressed
- Remaining caveats to mention in the report's "Open Questions" section
Phase 5 — SYNTHESIZE & WRITE
Deep mode only: Before writing, launch a Synthesis subagent (Role 4)
with the full source list, area takeaways, and confidence labels. The
subagent returns report-ready prose per key area. The main agent incorporates
these into the final report structure.
Standard mode: The main agent writes directly from the area takeaways and
source extracts collected during Phases 2-3.
Create the output file: DEEP_RESEARCH_[TOPIC].md
Required Structure
# Deep Research: [Topic]
> Generated [YYYY-MM-DD] | Depth: [standard/deep] | Sources: [N]
## TL;DR
(2-3 sentences: the decision-relevant bottom line. What should the reader
do or know? Written for stakeholders who will not read the full report.)
## Executive Summary
(200-400 words — key findings, no fluff)
## 1. Status Quo [Confidence: High/Medium/Low]
(Current state of practice, with citations [N])
## 2. Emerging Trends [Confidence: High/Medium/Low]
(What is gaining traction in the last 6-12 months, with citations [N])
## 3. Critical Assessment [Confidence: High/Medium/Low]
(Why trends fail in practice, known trade-offs, risks, with citations [N])
## 4. Action Plan
- [ ] [Action 1 — one sentence, concrete and actionable]
- [ ] [Action 2 — ...]
- [ ] ...
(Reference codebase / architecture-documentation.md where applicable)
## 5. Open Questions & Caveats
(What could not be conclusively answered — include unresolved
contradictions and findings flagged by the Critique phase)
## Methodology
(Depth chosen, number of subagents, waves run, outline changes if any,
citation corrections from Phase 3.1, degradation notes if applicable)
## Bibliography
[1] Author — Title — URL — Accessed YYYY-MM-DD — Tier: N
[2] ...
## Source Extracts
(Raw data from subagents — preserved for follow-up sessions)
### [1] Title
- **Summary:** [2-3 sentence extract]
- **Key quotes:** [direct quotes used in the report]
- **Source type:** [academic / industry / blog / docs / community]
- **Credibility tier:** [1 / 2 / 3]
### [2] Title
- ...
Confidence Labels
Each main section gets a confidence label based on Phase 3 triangulation:
| Label | Meaning |
|---|---|
| [High] | Core claims backed by 3+ independent, credible sources (at least one Tier 1 or 2) |
| [Medium] | Claims backed by 2 sources, or 1 highly authoritative source, or 3+ Tier 3 only |
| [Low] | Single-source claims or limited evidence — treat as directional |
Use the labels inline for individual claims too when they diverge from the
section's overall confidence (e.g. a [High] section may contain one
[Low] sub-claim).
Section guidelines
| Section | Focus |
|---|---|
| TL;DR | 2-3 sentences: decision-relevant bottom line for stakeholders who won't read the full report |
| Status Quo | Established patterns, current best practice, market adoption |
| Emerging Trends | Conference talks, blog posts, GitHub activity, community buzz |
| Critical Assessment | Failure modes, scalability concerns, complexity trade-offs |
| Action Plan | Markdown checklist (- [ ]). Each item: one concrete, actionable step. Tie to this project; reference architecture-documentation.md if relevant |
| Open Questions | Honest gaps, unresolved contradictions, Critique-phase caveats |
| Methodology | Transparency: how the research was conducted |
| Source Extracts | Raw subagent data (summary, key quotes, source type, credibility tier per source). Enables follow-up sessions without re-fetching |
Quality Rules
| Rule | Detail |
|---|---|
| Citation | Every factual claim uses [N], resolved in Bibliography |
| Triangulation | Major claims backed by 2+ independent sources |
| Confidence | Every section labeled [High], [Medium], or [Low]; [High] requires ≥1 Tier 1/2 source |
| No fabrication | Never invent URLs, authors, or publication names |
| Recency | Tiered: trends/tooling <18 months; methodology foundations allowed regardless of age (mark [foundational]); in-between up to 36 months with age noted |
| Prose-first | ≥80% flowing text; bullet lists only for enumerations. Exception: Action Plan uses a checklist. |
| Source count | standard: 15-25, deep: 30-50 |
| Minimum length | standard: 4,000 words, deep: 6,000+ words |
| Contradictions | Surface them explicitly — don't silently pick a side |
| Critique | Both depths: run Phase 4 red-team before writing |
| Code snippets | Include when they illustrate a pattern; always attribute the source |
| Output language | Write the report in the language of the user's request. Search terms remain multilingual. |
| Citation verify | Both depths: run Phase 3.1 spot-check via Verification subagent |
Output
- Save report to project root:
DEEP_RESEARCH_[TOPIC].md
- If the topic is project-specific, also note relevant architecture-documentation.md sections.
- After writing, do a self-check: scan for
[N] references without a matching
Bibliography entry. Fix before delivering.
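A minimal sketch of that self-check, assuming the report follows the Required Structure above and bibliography entries start each line with [N]:

```python
import re

def check_citations(report_md):
    """Return citation numbers used in the body but missing from the
    Bibliography, and bibliography entries that are never cited."""
    body, _, bib = report_md.partition("## Bibliography")
    cited = set(map(int, re.findall(r"\[(\d+)\]", body)))
    listed = set(map(int, re.findall(r"^\[(\d+)\]", bib, flags=re.MULTILINE)))
    return sorted(cited - listed), sorted(listed - cited)
```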
When NOT to Use
- Simple factual questions → one WebSearch call is enough.
- Debugging or code fixes → standard agent mode.
- Questions about this codebase only → use codebase search tools.