research-discovery-setup
Set up effective automated discovery - from vague idea to working system in one streamlined conversation.
| name | Research Discovery Setup |
| description | Set up effective automated discovery - from vague idea to working system in one streamlined conversation. |
Help users configure automated discovery that delivers quality papers consistently.
Most common scenario: User has a vague research interest and needs help setting up discovery.
Agent: "Let's get your discovery set up! Three quick questions:
1. What research area interests you? (even if vague - we'll refine it)
2. How much time can you spend reviewing papers each day? (5 mins? 30 mins?)
3. Do you need cutting-edge research, or are you learning the fundamentals?
I'll configure everything based on your answers."
Agent: "Perfect! Based on what you said, here's your starter configuration:
**Research Focus**: [refined 1-sentence version]
**Keywords**: [2-4 core search terms]
**Sources**: [1-2 most relevant APIs]
**Schedule**: [frequency based on time commitment]
**Quality Filter**: [threshold based on volume needs]
This should get you ~[X] papers [per day/week] to review.
Ready to test it now?"
Agent: "Test results: [X] papers found
Quick scan of titles:
✓ [Y] papers look highly relevant
~ [Z] papers look somewhat relevant
✗ [W] papers look off-topic
Is this mix working for you? Too many? Too few? Wrong topic?"
Then: Refine based on feedback (see Refinement Patterns below)
Q1: Do they know what they want to research?
Q2: Have they tried searching yet?
Q3: How much time for review?
User: "I'm interested in AI and finance"
Agent: "Let's narrow that down. Pick ONE direction to start:
We can always expand later."
User picks one
Agent: "Great! Now let's scope it:
Example refined question: 'Can Twitter sentiment predict daily stock returns?'
Does something like that capture your interest?"
Agent: "Let's diagnose the problem. I need:
[User provides]
Agent analyzes and identifies issue:
Agent: "The issue is [specific problem]. Here's the fix: [specific change].
Want me to test this now?"
Agent: "You've been running for [X] days. Let's optimize:
Current results:
Assessment: [Good/Too many/Too few/Wrong mix]
Recommendations: [1-2 specific tweaks]
Want to try these adjustments?"
Check relevance:
├─ >70% relevant → Just cap max_papers at 25
├─ 40-70% relevant → Increase threshold +0.1
└─ <40% relevant → Keywords too broad, add specific terms
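The relevance check above can be sketched as a small function. This is a minimal sketch, not Thoth's implementation. It assumes the tree applies when discovery returns too many papers, which the `max_papers` cap suggests. The function name `fix_for_too_many_papers` and the returned strings are hypothetical. The 70% and 40% cut-offs and the cap of 25 come from the tree.

```python
def fix_for_too_many_papers(relevant_fraction: float) -> str:
    """Suggest one fix when discovery returns too many papers.

    relevant_fraction is the share of returned papers the user judged
    relevant, expressed as a number between 0.0 and 1.0.
    """
    if not 0.0 <= relevant_fraction <= 1.0:
        raise ValueError("relevant_fraction must be between 0 and 1")
    if relevant_fraction > 0.70:
        # Results are good, there are just too many of them.
        return "cap max_papers at 25"
    if relevant_fraction >= 0.40:
        # Mixed results: tighten the quality filter.
        return "increase threshold by 0.1"
    # Mostly noise: the search terms themselves are the problem.
    return "keywords too broad: add specific terms"


print(fix_for_too_many_papers(0.55))
```

Returning a single recommendation keeps each refinement round to one change, so its effect is easy to judge in the next test run.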
Check specificity:
├─ Keywords very specific → Broaden terms, add synonyms
├─ Keywords normal → Lower threshold -0.1
└─ Field is just slow → Adjust schedule to weekly
Check mismatch:
├─ Different domain (crypto vs stocks) → Add negative keywords
├─ Different time period (old papers) → Add date filter: last 2 years
├─ Different methodology → Add method-specific terms
└─ Different language → Add language filter
User shows example paper:
1. Look at that paper's title/abstract
2. Extract key terms it uses
3. Add those terms to search
4. Test again
Ask 3 questions:
Result: "[Specific searchable question]"
Question → Keywords (2-step):
1. Extract core nouns from question
2. Add 1 technical synonym if they used plain language
Example:
Q: "Can Twitter predict stocks?"
K: "twitter sentiment stock prediction"
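The two steps above can be approximated in code. This is a rough sketch under stated assumptions: dropping stopwords stands in for "extract core nouns", and the technical synonyms are supplied by hand. The names `STOPWORDS`, `question_to_keywords`, and `technical_terms` are hypothetical. The sketch does not normalize word forms, so its output differs from the hand-written keywords in the example above.

```python
import re

# Small illustrative stopword list (an assumption); a real setup would
# use a fuller list or a part-of-speech tagger to find the nouns.
STOPWORDS = {
    "can", "do", "does", "is", "are", "the", "a", "an",
    "of", "in", "on", "to", "for", "how", "what", "and",
}


def question_to_keywords(question, technical_terms=None):
    """Turn a plain-language question into a keyword string.

    technical_terms maps a plain word to one technical synonym that is
    inserted right after it (step 2 of the recipe).
    """
    technical_terms = technical_terms or {}
    words = re.findall(r"[a-z0-9]+", question.lower())
    keywords = []
    for word in words:
        if word in STOPWORDS:
            continue
        for candidate in (word, technical_terms.get(word)):
            # Skip missing synonyms and avoid repeating a keyword.
            if candidate and candidate not in keywords:
                keywords.append(candidate)
    return " ".join(keywords)


print(question_to_keywords("Can Twitter predict stocks?", {"twitter": "sentiment"}))
# prints "twitter sentiment predict stocks"
```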
Default recommendations:
Start with 2 sources max, add more only if missing papers.
Default: 0.7 (works for 80% of cases)
Adjustments:
- User wants only best papers → 0.8
- User wants comprehensive coverage → 0.65
- Getting <5 papers/day → Lower by 0.05
- Getting >40 papers/day → Raise by 0.05
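The threshold rules above can be sketched as two small helpers. This is a minimal sketch: `choose_threshold`, `adjust_for_volume`, and the preset names are hypothetical, and clamping the result to the 0 to 1 range is an assumption about how the threshold is scaled. The numbers come from the list above.

```python
# Starting thresholds by user preference (values from the list above).
PRESETS = {
    "default": 0.7,
    "best_only": 0.8,        # user wants only the best papers
    "comprehensive": 0.65,   # user wants comprehensive coverage
}


def choose_threshold(preference: str = "default") -> float:
    """Return the starting quality threshold for a user preference."""
    return PRESETS[preference]


def adjust_for_volume(threshold: float, papers_per_day: float) -> float:
    """Nudge the threshold by 0.05 when daily volume is out of range."""
    if papers_per_day < 5:
        threshold -= 0.05    # too few papers: loosen the filter
    elif papers_per_day > 40:
        threshold += 0.05    # too many papers: tighten the filter
    # Assumed valid range is 0.0-1.0; rounding hides float noise.
    return round(min(1.0, max(0.0, threshold)), 2)


start = choose_threshold("default")
print(adjust_for_volume(start, papers_per_day=3))
```

Moving in 0.05 steps, as the list suggests, makes it less likely that one adjustment swings the volume from too low to too high.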
Based on user's available time:
- 10 min/day → Daily, 10-15 papers, threshold 0.75
- 30 min, 3x/week → Every other day, 20-30 papers, threshold 0.7
- 1 hour/week → Weekly, 50 papers, threshold 0.65
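The time-based presets above could be encoded as a lookup. This is a sketch under assumptions: `ReviewPlan` and `plan_for_time_budget` are hypothetical names, the cut-offs on sessions per week are a guess at where one preset ends and the next begins, and `max_papers` takes the upper end of each range listed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewPlan:
    schedule: str      # how often discovery runs
    max_papers: int    # upper end of the suggested batch size
    threshold: float   # quality filter threshold


def plan_for_time_budget(sessions_per_week: int) -> ReviewPlan:
    """Pick a preset from how many review sessions the user has per week."""
    if sessions_per_week >= 5:
        # Roughly "10 min/day": small daily batches, stricter filter.
        return ReviewPlan("daily", 15, 0.75)
    if sessions_per_week >= 2:
        # Roughly "30 min, 3x/week".
        return ReviewPlan("every_other_day", 30, 0.7)
    # Roughly "1 hour/week": one large, more permissive batch.
    return ReviewPlan("weekly", 50, 0.65)


print(plan_for_time_budget(3))
```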
Symptom: <50% of papers are relevant
Diagnosis: Filter is too permissive OR keywords too broad
Fix:
Option A: Increase threshold (0.7 → 0.75 or 0.8)
Option B: Add specific terms to keywords
Option C: Remove noisy source if one is producing junk
Recommendation: Try A first (fastest), then B if needed
Symptom: <10 papers/day in active field
Diagnosis: Filter too strict OR keywords too narrow
Fix:
Option A: Lower threshold (0.7 → 0.65)
Option B: Add synonym terms to keywords
Option C: Add another relevant source
Recommendation: Try B first (maintains quality), then A
Symptom: Getting papers from 2015-2019, user wants recent
Fix:
Add date filter: published_after="2022-01-01"
Or for cutting-edge: published_after="2024-01-01"
Symptom: Getting crypto papers when researching stocks
Fix:
Add negative keywords:
exclude_terms = ["cryptocurrency", "bitcoin", "crypto", "blockchain"]
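To illustrate what the date filter and negative keywords above do, here is a minimal sketch that applies both to an in-memory list. It is not Thoth's filtering code. The paper dictionaries, their field names (`title`, `abstract`, `published`), and the sample titles are made up for illustration. Only `published_after` and `exclude_terms` come from the text, and treating the cut-off date itself as included is an assumption.

```python
from datetime import date


def keep_paper(paper, published_after=None, exclude_terms=()):
    """Return True if a paper passes the date and negative-keyword filters."""
    if published_after is not None:
        cutoff = date.fromisoformat(published_after)
        if paper["published"] < cutoff:
            return False
    text = f"{paper['title']} {paper.get('abstract', '')}".lower()
    # Plain substring match: "crypto" also matches "cryptography".
    return not any(term.lower() in text for term in exclude_terms)


# Made-up sample records, for illustration only.
papers = [
    {"title": "Sentiment signals and stock returns", "published": date(2023, 5, 1)},
    {"title": "Bitcoin price forecasting", "published": date(2023, 6, 1)},
    {"title": "Stock return prediction with news", "published": date(2017, 3, 1)},
]
exclude_terms = ["cryptocurrency", "bitcoin", "crypto", "blockchain"]

kept = [
    p for p in papers
    if keep_paper(p, published_after="2022-01-01", exclude_terms=exclude_terms)
]
print([p["title"] for p in kept])
# prints ['Sentiment signals and stock returns']
```

Because the match is a plain substring test, short negative keywords such as "crypto" can exclude more than intended, so it is worth checking which papers a new exclusion removes.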
| Problem | Instant Fix |
|---|---|
| Zero results | Lower threshold to 0.6, broader keywords |
| 100+ results/day | Raise threshold to 0.8, add specific terms |
| All old papers (>3 years) | Add date filter: last 2 years |
| Papers in wrong language | Add language filter: English |
| Wrong scientific domain | Add negative keywords for that domain |
| Too many duplicates | Normal - Thoth auto-deduplicates by DOI/title |
| Missing recent papers | Add arxiv source (most current) |
| Missing published papers | Add openalex or crossref |
User: "I'm writing a survey paper, need EVERYTHING"
Agent: "Comprehensive strategy:
You'll get 400-700 papers total. This is the 'fire hose' approach. After 2 weeks, we'll refine based on what you found.
Ready for high volume?"
User: "Just want to stay current, very busy"
Agent: "Time-efficient strategy:
One 15-minute review session per week. You'll catch major advances without daily commitment.
Sound good?"
User: "I'm new to this field, need foundational papers"
Agent: "Learning strategy:
Mix of recent reviews and important papers. Good for building understanding.
Ready to start?"
Agent: "I've configured your discovery. Before scheduling it, let's test:
[Run discovery once]
Got [X] papers. Let's review together:
1. Look at top 5 titles - relevant?
2. Any obvious misses or noise?
3. Is [X] papers manageable for you?
Based on your feedback, we'll adjust before going live."
Agent: "You've been running for a week. Quick review:
**Volume**: [X] papers/day average
**Your feedback**: [Y]% marked as relevant
[If >70% relevant]: "This is working well! Any tweaks needed?"
[If <50% relevant]: "We need to refine. The issue is likely [diagnosis]"
The goal: get the user from "vague idea" to "working discovery" in one conversation, ending with a working system.
Success metric: the user receives relevant papers at a volume they can review consistently.