research-discovery-execution
Execute and monitor discovery sessions for finding research papers across academic sources. Use when running discovery searches.
| name | research-discovery-execution |
| description | Execute and monitor discovery sessions for finding research papers across academic sources. Use when running discovery searches. |
Execute discovery searches systematically and monitor their progress to find relevant research papers.
Most common use: User needs to find papers on a specific topic from academic sources.
User request: "Find papers on quantum error correction from 2024"
Step 1: Verify research question exists
- Use list_research_questions
- If it doesn't exist, tell the orchestrator to create one first
Step 2: Run discovery
- Use run_discovery_for_question
- Specify question ID or name
- Set parameters (date range, sources)
Step 3: Monitor progress
- Update workflow_state with status
- Check for errors or timeouts
- Provide progress updates
Step 4: Return results
- Count of papers found per source
- Summary of top papers
- Quality metrics (relevance scores)
Before running discovery, validate:
Check existing research questions:
questions = list_research_questions()
If the question is not found:
"The research question hasn't been created yet. The orchestrator
should create it first using the research-question-creation skill."
If the question exists but was run recently:
"This discovery was run 2 hours ago. Found 45 papers.
Do you want to run again or use existing results?"
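The pre-run checks above could be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the shape of each question record (`id`, `name`, `last_run`) and the `RECENT_WINDOW` value are assumptions, since the return format of list_research_questions is not documented here.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a run counts as "recent" if it happened within the last 24 hours,
# mirroring the 24-hour cache rule described later in this document.
RECENT_WINDOW = timedelta(hours=24)


def preflight(name, questions, now):
    """Return ("missing" | "recent" | "ready", matching question or None).

    `questions` is assumed to be a list of dicts with "id", "name", and an
    optional "last_run" datetime (hypothetical record shape).
    """
    match = next((q for q in questions if q["name"].lower() == name.lower()), None)
    if match is None:
        return "missing", None  # orchestrator should create the question first
    last_run = match.get("last_run")
    if last_run is not None and now - last_run < RECENT_WINDOW:
        return "recent", match  # ask the user: rerun or reuse existing results?
    return "ready", match


now = datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc)
questions = [
    {"id": "q1", "name": "quantum error correction", "last_run": now - timedelta(hours=2)},
    {"id": "q2", "name": "protein folding", "last_run": None},
]
print(preflight("Quantum Error Correction", questions, now)[0])  # recent
print(preflight("protein folding", questions, now)[0])           # ready
print(preflight("graph neural networks", questions, now)[0])     # missing
```

Passing `now` in explicitly keeps the check deterministic and easy to test.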
Run discovery with proper parameters:
run_discovery_for_question(
    question_id="...",
    force_refresh=False,  # Set to True to ignore cache
    max_results=100,      # Limit per source
    min_relevance=0.7,    # Quality threshold
)
Sources checked (in order):
Update workflow_state during execution:
Initial:
"Discovery Status: Starting
Sources: arXiv, Semantic Scholar, PubMed
Expected time: 1-2 minutes"
During:
"Discovery Status: In Progress
arXiv: 23 papers found (complete)
Semantic Scholar: 15 papers found (in progress)
PubMed: pending
Time elapsed: 45 seconds"
Complete:
"Discovery Status: Complete
Total papers: 52
Sources: arXiv (23), Semantic Scholar (18), PubMed (11)
Duration: 118 seconds
Quality: 38 papers above relevance threshold"
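One way to produce status strings like those above is a small formatter. This is a sketch only: the `format_status` helper and the shape of its `counts` argument are hypothetical and not part of the skill's documented interface.

```python
def format_status(state, counts, elapsed_s=None):
    """Render a multi-line status string for workflow_state.

    `counts` maps source name -> (papers_found or None, stage), where stage is
    a free-text label such as "complete", "in progress", or "pending".
    """
    lines = [f"Discovery Status: {state}"]
    for source, (found, stage) in counts.items():
        if found is None:
            lines.append(f"{source}: {stage}")  # nothing to report yet
        else:
            lines.append(f"{source}: {found} papers found ({stage})")
    if elapsed_s is not None:
        lines.append(f"Time elapsed: {elapsed_s} seconds")
    return "\n".join(lines)


status = format_status(
    "In Progress",
    {
        "arXiv": (23, "complete"),
        "Semantic Scholar": (15, "in progress"),
        "PubMed": (None, "pending"),
    },
    elapsed_s=45,
)
print(status)
```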
Analyze and summarize results:
For each source:
- Count of papers found
- Quality distribution (high/medium/low relevance)
- Date range covered
- Top papers by relevance score
Overall:
- Total unique papers (deduplicating across sources)
- Papers meeting quality threshold
- Recommended next steps
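The per-source and overall counts above might be computed along these lines. This is a sketch under stated assumptions: each paper is a dict with `title` and `relevance` keys, and duplicates are detected by normalized title only. A real pipeline would more likely match on a stable identifier such as a DOI when one is available.

```python
def normalize_title(title):
    """Lowercase and collapse whitespace so near-identical titles compare equal."""
    return " ".join(title.lower().split())


def summarize(results, threshold=0.6):
    """Summarize discovery results.

    `results` maps source name -> list of paper dicts (hypothetical shape).
    When the same paper appears in several sources, the copy with the highest
    relevance score is kept.
    """
    per_source = {source: len(papers) for source, papers in results.items()}
    unique = {}
    for papers in results.values():
        for paper in papers:
            key = normalize_title(paper["title"])
            if key not in unique or paper["relevance"] > unique[key]["relevance"]:
                unique[key] = paper
    above = sum(1 for paper in unique.values() if paper["relevance"] > threshold)
    return {
        "per_source": per_source,
        "total_unique": len(unique),
        "above_threshold": above,
    }


results = {
    "arXiv": [
        {"title": "Surface Codes", "relevance": 0.9},
        {"title": "Decoder Design", "relevance": 0.5},
    ],
    "Semantic Scholar": [{"title": "surface  codes", "relevance": 0.85}],
}
summary = summarize(results)
print(summary["total_unique"], summary["above_threshold"])  # 2 1
```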
Error: "Source timeout"
Problem: arXiv taking >60 seconds
Solution: Continue with other sources
Action: "arXiv timed out, but found 33 papers from Semantic Scholar
and PubMed. Do you want to retry arXiv or proceed with these?"
Error: "No papers found"
Problem: Search too narrow or no matching papers
Solution: Suggest broadening search
Action: "No papers found matching these criteria. Suggestions:
- Broaden date range (try last 2 years instead of 6 months)
- Add related keywords
- Try different sources"
Error: "Rate limit exceeded"
Problem: Too many requests to source
Solution: Wait and retry, or skip source
Action: "Hit rate limit on Semantic Scholar. Waiting 30 seconds...
Meanwhile, found 20 papers from arXiv."
Error: "Invalid research question"
Problem: Research question malformed or missing
Solution: Tell orchestrator to fix/create question
Action: "Research question needs to be created or fixed. Delegating
back to orchestrator..."
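The four error cases above can be collected into a single dispatch function. This is an illustrative sketch: the `Recovery` type, the action labels, and the `recover` signature are all hypothetical, and the error strings are simply the ones listed in this section.

```python
from dataclasses import dataclass


@dataclass
class Recovery:
    action: str   # hypothetical labels: "continue", "broaden", "wait_retry", "delegate"
    message: str  # text to show the user or pass back to the orchestrator


def recover(error, source=None, found_elsewhere=0, wait_s=30):
    """Map an error string from the list above to a recovery decision."""
    if error == "Source timeout":
        return Recovery(
            "continue",
            f"{source} timed out, but found {found_elsewhere} papers from other "
            f"sources. Do you want to retry {source} or proceed with these?",
        )
    if error == "No papers found":
        return Recovery(
            "broaden",
            "No papers found matching these criteria. Try a broader date range, "
            "related keywords, or different sources.",
        )
    if error == "Rate limit exceeded":
        return Recovery("wait_retry", f"Hit rate limit on {source}. Waiting {wait_s} seconds...")
    if error == "Invalid research question":
        return Recovery(
            "delegate",
            "Research question needs to be created or fixed. Delegating back to orchestrator...",
        )
    raise ValueError(f"Unrecognized discovery error: {error!r}")


print(recover("Rate limit exceeded", source="Semantic Scholar").message)
```

Raising on unknown errors, rather than silently continuing, keeps unexpected failures visible.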
Papers are scored 0.0-1.0 based on:
Thresholds:
Default: Return all papers >0.6 relevance
Strict mode: Only >0.8 relevance
Exploratory mode: All papers >0.4 relevance
Example:
Found 80 papers total:
- 25 high quality (>0.8)
- 35 medium quality (0.6-0.8)
- 20 low quality (<0.6)
Recommended: Present high + medium (60 papers)
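The threshold modes and quality buckets above could be applied as in this sketch. The mode names and the helper functions are hypothetical. The text does not say how a score of exactly 0.6 or 0.8 is handled, so this sketch assumes strict "greater than" comparisons throughout.

```python
from collections import Counter

# Thresholds taken from the list above; the dictionary keys are illustrative names.
THRESHOLDS = {"default": 0.6, "strict": 0.8, "exploratory": 0.4}


def bucket(score):
    """Classify a relevance score (assumption: boundaries use strict >)."""
    if score > 0.8:
        return "high"
    if score > 0.6:
        return "medium"
    return "low"


def filter_by_mode(scores, mode="default"):
    """Keep only the scores above the threshold for the given mode."""
    return [s for s in scores if s > THRESHOLDS[mode]]


scores = [0.95, 0.85, 0.7, 0.65, 0.5, 0.3]
print(Counter(bucket(s) for s in scores))
print(len(filter_by_mode(scores)))                 # 4
print(len(filter_by_mode(scores, "strict")))       # 2
print(len(filter_by_mode(scores, "exploratory")))  # 5
```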
For large searches, run in batches:
Batch 1: Last 6 months (quick)
→ Review results
→ If insufficient, expand to 1 year
Batch 2: 6-12 months ago
→ Merge with Batch 1
→ If still insufficient, expand to 2 years
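The batch expansion above can be sketched as a loop over successively older date windows. Several details here are assumptions: the `search(start, end)` callable is a stand-in for the real discovery call, months are approximated as 30 days, and "insufficient" is interpreted as having fewer than a caller-supplied `min_papers`.

```python
from datetime import date, timedelta


def batch_windows(today, months=(6, 12, 24)):
    """Return (start, end) windows; each one covers only the newly added period."""
    windows = []
    end = today
    for m in months:
        start = today - timedelta(days=30 * m)  # approximate a month as 30 days
        windows.append((start, end))
        end = start  # the next batch ends where this one began
    return windows


def run_batches(search, today, min_papers):
    """Run batches from newest to oldest, merging until enough papers are found."""
    merged = []
    for start, end in batch_windows(today):
        merged.extend(search(start, end))
        if len(merged) >= min_papers:
            break  # sufficient results; skip the older, slower batches
    return merged


def fake_search(start, end):
    """Stand-in search that always returns 20 placeholder papers."""
    return [f"paper-{start.isoformat()}-{i}" for i in range(20)]


today = date(2025, 1, 15)
papers = run_batches(fake_search, today, min_papers=30)
print(len(papers))  # 40: batch 1 gave 20, so batch 2 was also run
```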
For complex topics:
Phase 1: Core keywords (narrow)
→ Get foundational papers
Phase 2: Related keywords (broad)
→ Find connections and context
Phase 3: Citation expansion
→ Papers cited by Phase 1 papers
Based on topic:
Computer Science topic:
Priority: arXiv > Semantic Scholar > CrossRef
Biomedical topic:
Priority: PubMed > bioRxiv > Semantic Scholar
Interdisciplinary:
Priority: Semantic Scholar > arXiv > PubMed
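The priority orderings above map naturally onto a lookup table. In this sketch the topic keys and the `sources_for` helper are hypothetical, and falling back to the interdisciplinary ordering for unknown topics is an assumption rather than documented behavior.

```python
# Orderings copied from the list above; the keys are illustrative names.
SOURCE_PRIORITY = {
    "computer_science": ["arXiv", "Semantic Scholar", "CrossRef"],
    "biomedical": ["PubMed", "bioRxiv", "Semantic Scholar"],
    "interdisciplinary": ["Semantic Scholar", "arXiv", "PubMed"],
}


def sources_for(topic_kind):
    """Return sources in priority order (assumed fallback: interdisciplinary)."""
    return SOURCE_PRIORITY.get(topic_kind, SOURCE_PRIORITY["interdisciplinary"])


print(sources_for("biomedical"))  # ['PubMed', 'bioRxiv', 'Semantic Scholar']
print(sources_for("economics"))   # ['Semantic Scholar', 'arXiv', 'PubMed']
```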
Sources can be queried in parallel:
Start all sources simultaneously:
- arXiv query (async)
- Semantic Scholar query (async)
- PubMed query (async)
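Starting all sources at once and reporting each one as it finishes can be sketched with Python's standard `asyncio` module. The `query_source` coroutine below is a stand-in that only sleeps and returns a fixed count; the real source clients and their signatures are not shown in this document.

```python
import asyncio


async def query_source(name, delay, count):
    """Stand-in for a real source query: wait, then report a paper count."""
    await asyncio.sleep(delay)
    return name, count


async def discover(specs):
    """Start every source query at once and report each as it completes."""
    tasks = [asyncio.create_task(query_source(*spec)) for spec in specs]
    completed = []
    for finished in asyncio.as_completed(tasks):
        name, count = await finished
        completed.append((name, count))
        print(f"{name}: {count} papers found")
    return completed


# Simulated delays (seconds) chosen so that arXiv finishes first.
results = asyncio.run(
    discover([("arXiv", 0.0, 23), ("Semantic Scholar", 0.2, 18), ("PubMed", 0.1, 11)])
)
```

Because `asyncio.as_completed` yields results in completion order, a slow source does not delay progress updates for the fast ones.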
Return results as they complete:
"arXiv: 23 papers found (15 seconds)"
"Semantic Scholar: still searching..."
"PubMed: 11 papers found (22 seconds)"
Cache results for 24 hours:
- Same research question
- Same parameters
- Within 24 hours
Skip cache if:
- force_refresh=True
- User explicitly asks for fresh search
- Important new papers expected (conference just happened)
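The caching rules above could look something like this in-memory sketch. The `DiscoveryCache` class is hypothetical; it keys entries on the question ID plus the search parameters and treats anything 24 hours old or older as expired. The "conference just happened" case is a judgment call, so here it is simply left to the caller to pass `force_refresh=True`.

```python
import time

CACHE_TTL_S = 24 * 60 * 60  # 24 hours, per the rule above


class DiscoveryCache:
    """Minimal in-memory cache keyed on (question_id, parameters)."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable clock keeps expiry testable
        self._entries = {}

    @staticmethod
    def _key(question_id, params):
        # Sort parameters so that dict ordering does not affect the key.
        return (question_id, tuple(sorted(params.items())))

    def get(self, question_id, params, force_refresh=False):
        """Return cached results, or None on a miss, expiry, or forced refresh."""
        if force_refresh:
            return None
        entry = self._entries.get(self._key(question_id, params))
        if entry is None:
            return None
        stored_at, results = entry
        if self._clock() - stored_at >= CACHE_TTL_S:
            return None
        return results

    def put(self, question_id, params, results):
        self._entries[self._key(question_id, params)] = (self._clock(), results)


now = [0.0]  # fake clock so the example does not need to wait 24 hours
cache = DiscoveryCache(clock=lambda: now[0])
params = {"max_results": 100, "min_relevance": 0.7}
cache.put("q1", params, ["paper-a"])
hit = cache.get("q1", params)                                         # same question and parameters
forced = cache.get("q1", params, force_refresh=True)                  # cache skipped
other = cache.get("q1", {"max_results": 50, "min_relevance": 0.7})    # different parameters
now[0] = CACHE_TTL_S + 1
expired = cache.get("q1", params)                                     # older than 24 hours
```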
=== Discovery Results ===
**Summary:**
- Total papers: 52
- High quality: 25 papers
- Date range: Jan 2024 - Jan 2025
- Duration: 118 seconds
**By Source:**
1. arXiv: 23 papers (10-30s search time)
2. Semantic Scholar: 18 papers (45s search time)
3. PubMed: 11 papers (25s search time)
**Top 5 Papers:**
1. "Quantum Error Correction with..." (relevance: 0.95, 150 cites)
2. "Surface Codes for Fault-Tolerant..." (relevance: 0.92, 120 cites)
3. ...
**Quality Distribution:**
- High (>0.8): 25 papers
- Medium (0.6-0.8): 20 papers
- Below threshold (<0.6): 7 papers (filtered out)
**Next Steps:**
Would you like me to:
- Download PDFs for high-quality papers?
- Run citation analysis?
- Create a reading list?
Always update workflow_state during discovery:
Start:
workflow_state: "Discovery started for quantum error correction"
Progress:
workflow_state: "Discovery 50% complete, 30 papers found so far"
Complete:
workflow_state: "Discovery complete: 52 papers found in 118 seconds"
Update active_papers memory:
"Papers pending download: [list of paper IDs]"
Update research_context:
"Current research: quantum error correction
Latest discovery: Jan 2025, 52 papers"
Standard search:
- Date range: Last 2 years
- Max results: 100 per source
- Min relevance: 0.7
Quick search:
- Date range: Last 6 months
- Max results: 50 per source
- Min relevance: 0.8
Comprehensive search:
- Date range: Last 5 years
- Max results: 200 per source
- Min relevance: 0.6
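The three presets above can be captured as data and expanded into call arguments. This is a sketch: `SearchPreset` and `discovery_kwargs` are hypothetical names, and because the date-range parameter of run_discovery_for_question is not shown in this document, the date window is kept as a separate `months_back` field rather than guessed at.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchPreset:
    months_back: int      # date range, expressed in months
    max_results: int      # limit per source
    min_relevance: float  # quality threshold


# Values copied from the preset list above.
PRESETS = {
    "standard": SearchPreset(months_back=24, max_results=100, min_relevance=0.7),
    "quick": SearchPreset(months_back=6, max_results=50, min_relevance=0.8),
    "comprehensive": SearchPreset(months_back=60, max_results=200, min_relevance=0.6),
}


def discovery_kwargs(preset_name, question_id):
    """Build the keyword arguments shown earlier for run_discovery_for_question."""
    preset = PRESETS[preset_name]
    return {
        "question_id": question_id,
        "force_refresh": False,
        "max_results": preset.max_results,
        "min_relevance": preset.min_relevance,
    }


print(discovery_kwargs("quick", "q1"))
```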
Your job as Discovery Scout:
Key principles:
Success metric: User gets high-quality, relevant papers quickly with clear next-step options.