Exécutez n'importe quel Skill dans Manus
en un clic

Exécutez n'importe quel Skill dans Manus en un clic

prompt-schema-tuning

Customize what information is extracted from papers (schema) and how it's extracted (prompts). Use when the user wants to change analysis fields, instructions, or prompt templates.

Exécuter dans Manus

Aperçu

Customize what information is extracted from papers (schema) and how it's extracted (prompts). Use when the user wants to change analysis fields, instructions, or prompt templates.

Commande d'installation

npx skills add https://github.com/acertainKnight/project-thoth --skill prompt-schema-tuning

Copiez et collez cette commande dans Claude Code pour installer le skill

Source

acertainKnight/project-thoth

Étoiles10

Forks5

Mis à jour15 février 2026 à 02:55

SKILL.md

readonly

Plus depuis ce dépôt

même dépôt

onboarding

acertainKnight/project-thoth

Initialize a new user's research assistant. Use this on first interaction or when user asks to "get started", "set up", or "introduce yourself". Also use when you don't know the user's research interests or the human memory block still has placeholder text.

2026-03-0410

whats-new

acertainKnight/project-thoth

Walk the user through new Thoth features since their last onboarding or update. Use when the user asks "what's new", "what changed", or "what can you do now". Also use after check_whats_new returns updates to walk through them.

2026-03-0410

research-planning

acertainKnight/project-thoth

Create, manage, and iterate on research plan documents in the Obsidian vault. Use when the user asks for a research plan, literature review roadmap, or when you need to formalize your own working research strategy.

2026-03-0210

deep-research

acertainKnight/project-thoth

Conduct deep analysis of research papers, synthesize literature, and generate comprehensive reviews. Use when user needs thorough paper analysis, literature reviews, or cross-paper synthesis.

2026-02-1610

external-knowledge

acertainKnight/project-thoth

Manage external knowledge collections (textbooks, lecture notes, background material) and search them to support research. Use when user wants to upload reference material or query foundational knowledge.

2026-02-1610

knowledge-base-qa

acertainKnight/project-thoth

Answer questions using your existing research collection and external knowledge. Use when user asks questions about papers they have, wants summaries, or seeks insights from their knowledge base.

2026-02-1610

Source

acertainKnight

acertainKnight/project-thoth

Ouvrir le dépôt GitHub Voir les dépôts du créateur

Commande d'installation

Téléchargement

Exécuter dans Manus

Utile pourSOC

Administrateurs de bases de donnéesProfessions informatiques et mathématiques15-1242L4

name	prompt-schema-tuning
description	Customize what information is extracted from papers (schema) and how it's extracted (prompts). Use when the user wants to change analysis fields, instructions, or prompt templates.
tools	["get_schema_info","list_schema_presets","get_preset_details","update_schema_field","update_schema_instructions","reset_schema_to_default","list_prompt_templates","read_prompt_template","update_prompt_template","reset_prompt_template"]

Prompt & Schema Tuning

Customize document analysis by modifying what information is extracted (analysis schema) and how extraction happens (prompt templates). The schema defines fields like "summary" or "methodology". The prompts tell the LLM how to populate those fields.

Overview

Two configuration layers control analysis:

Analysis Schema - JSON config defining extractable fields, types, and instructions
- Location: _thoth/analysis_schema.json in vault
- Four presets: standard, detailed, minimal, custom
- Hot-reloaded by schema service
Prompt Templates - Jinja2 templates that guide the LLM
- Location: _thoth/prompts/{provider}/ in vault (custom) or templates/prompts/ (defaults)
- Provider-specific: google, openai, default, agent
- Hot-reloaded by file watcher

Core Tools

Schema Tools

Tool	Purpose	When to Use
`get_schema_info`	View current schema config	Check active preset, fields
`list_schema_presets`	List all presets	See available options
`get_preset_details`	View fields in a preset	Inspect before editing
`update_schema_field`	Add/update/remove field	Change what's extracted
`update_schema_instructions`	Change extraction guidance	Adjust how LLM extracts
`reset_schema_to_default`	Restore from template	Fix broken config

Prompt Tools

Tool	Purpose	When to Use
`list_prompt_templates`	List available templates	See what can be edited
`read_prompt_template`	View template content	Check current prompts
`update_prompt_template`	Modify template	Fix extraction issues
`reset_prompt_template`	Delete custom version	Revert to default

Understanding the Schema

Schema Structure

The schema is organized into presets. Each preset has:

name - Display name
description - Purpose description
fields - Dictionary of field definitions
instructions - Natural language guidance for LLM

Field Specification

Each field has:

{
  "type": "string|integer|array",
  "required": true/false,
  "description": "What this field should contain",
  "items": "string" (for arrays)
}

Example: Standard Preset

{
  "title": {
    "type": "string",
    "required": true,
    "description": "The title of the research paper"
  },
  "summary": {
    "type": "string",
    "required": false,
    "description": "A multi-paragraph summary"
  },
  "tags": {
    "type": "array",
    "items": "string",
    "required": false,
    "description": "Keywords for the article"
  }
}

Understanding Prompts

Prompt Templates

Jinja2 templates that create the actual LLM prompt. Key variables:

{{ content }} - Document text to analyze
{{ analysis_schema }} - JSON schema for output
{{ custom_instructions }} - Preset-specific instructions
{{ chunk }} - Text chunk for map-reduce processing

Provider Variations

Different templates per provider optimize for model behavior:

google/ - Gemini-specific guidance
openai/ - GPT-specific guidance
default/ - Generic templates
agent/ - Agent-specific prompts

Key Templates

analyze_content.j2 - Direct analysis (short docs)
analyze_section.j2 - Map phase (long docs)
reduce_sections.j2 - Reduce phase (synthesis)
refine_analysis.j2 - Iterative refinement

Workflow Examples

Example 1: Summaries Too Short

User: "The summaries in my analysis are too short. I need more detail."

1. Check current schema
   get_schema_info()
   → Active preset: standard

2. View preset details
   get_preset_details(preset="standard")
   → summary: "A multi-paragraph summary"

3. Update field description to be more specific
   update_schema_field(
     preset="standard",
     field_name="summary",
     field_spec={
       "type": "string",
       "required": false,
       "description": "A comprehensive 5-7 paragraph summary covering objectives, methodology, key results, and implications. Each paragraph should be 4-6 sentences with specific details from the paper."
     }
   )

4. Also update preset instructions for emphasis
   update_schema_instructions(
     preset="standard",
     instructions="Provide comprehensive analysis with detailed multi-paragraph summaries (5-7 paragraphs). Extract 10-15 key points with specific details. Be thorough in describing methodology and results."
   )

5. Response:
   "Updated the summary field to require 5-7 detailed paragraphs and modified preset instructions to emphasize comprehensive analysis. Changes take effect on your next analysis run."

Example 2: Remove Unnecessary Fields

User: "I don't need the strengths and weaknesses fields. They're not useful for my research."

1. Check which preset is active
   get_schema_info()
   → Active: standard

2. Remove first field
   update_schema_field(
     preset="standard",
     field_name="strengths",
     remove=true
   )

3. Remove second field
   update_schema_field(
     preset="standard",
     field_name="weaknesses",
     remove=true
   )

4. Response:
   "Removed 'strengths' and 'weaknesses' fields from the standard preset. Your analyses will no longer extract these fields."

Example 3: Add Custom Field

User: "I want to track which papers mention specific datasets. Add a 'datasets_used' field."

1. Check active preset
   get_schema_info()
   → Active: standard

2. Add new field
   update_schema_field(
     preset="standard",
     field_name="datasets_used",
     field_spec={
       "type": "array",
       "items": "string",
       "required": false,
       "description": "List of datasets mentioned in the paper, including their names and brief descriptions of how they were used"
     }
   )

3. Update instructions to emphasize this
   get_preset_details(preset="standard")
   # Read current instructions

   update_schema_instructions(
     preset="standard",
     instructions="[existing instructions] Pay special attention to identifying and listing all datasets mentioned in the paper, including dataset names, sources, and usage context."
   )

4. Response:
   "Added 'datasets_used' field to extract dataset information as an array of strings. Updated instructions to emphasize dataset extraction."

Example 4: Fix Prompt Issues

User: "The analysis keeps missing the methodology section. Can you check the prompt?"

1. List available templates
   list_prompt_templates(provider="google")
   → Shows analyze_content.j2, analyze_section.j2, etc.

2. Read current prompt
   read_prompt_template(
     template_name="analyze_content.j2",
     provider="google"
   )
   → Shows full Jinja2 template

3. Identify issue: prompt doesn't explicitly mention methodology

4. Update template to emphasize methodology
   update_prompt_template(
     template_name="analyze_content.j2",
     provider="google",
     content="[Updated Jinja2 template with added emphasis:
     'Pay special attention to the methodology section. Extract detailed information about study design, methods, procedures, and analytical approaches. If methodology is split across multiple sections, synthesize information from all locations.']"
   )

5. Response:
   "Updated the analyze_content.j2 prompt for Google provider to emphasize methodology extraction. The change is hot-reloaded and will affect your next analysis."

Example 5: Reset After Bad Edit

User: "I messed up the schema. Can you restore it?"

1. Check what's wrong
   get_schema_info()
   → Might show validation errors

2. Reset to defaults
   reset_schema_to_default(confirm=true)
   → Resets entire file

   # OR reset just one preset:
   reset_schema_to_default(preset="custom", confirm=true)

3. Response:
   "Reset schema to repository defaults. Your custom changes have been removed but saved in a .bak file if you need to recover anything."

Schema vs Prompts: When to Edit What

Edit the Schema When:

User wants different fields extracted ("add funding info")
Fields are wrong type ("tags should be an array")
Too many/too few fields ("I don't need all this detail")
Field descriptions are unclear to the LLM

Edit the Prompts When:

LLM consistently misses information ("not finding methods")
Output format is wrong ("summaries are too technical")
Need provider-specific adjustments ("works with GPT but not Gemini")
Want to add examples or constraints to guidance

Edit Both When:

Major change to analysis approach ("focus on reproducibility")
Field exists but LLM populates it poorly ("methodology" field needs better extraction)

Best Practices

Before Editing

Always read current config first (get_schema_info, get_preset_details, read_prompt_template)
Understand what's currently configured
Identify the specific problem (wrong field vs wrong extraction)

When Editing Schema

Make field descriptions detailed and specific
Use instructions for general guidance, field description for specifics
Test with one paper before bulk reprocessing
Keep backups (tools create .bak files automatically)

When Editing Prompts

Keep required Jinja2 variables ({{ content }}, {{ analysis_schema }})
Test with both short and long documents
Consider provider differences (Gemini vs GPT behavior)
Document why you made changes (comment in prompt)

After Editing

Changes take effect immediately (hot-reloaded)
No service restart needed
Test with a single paper first
If extraction fails, check validation warnings

Troubleshooting

Schema Validation Fails

Problem: "Schema validation failed: missing required key"
Solution:
1. validate_schema_file() to see specific error
2. Fix the issue or reset_schema_to_default()
3. Backup is saved automatically

Prompt Not Loading

Problem: "Template not found for provider"
Solution:
1. list_prompt_templates() to verify name and provider
2. Check spelling of template_name (must include .j2)
3. Provider must be 'google', 'openai', 'default', or 'agent'

LLM Returns Wrong Format

Problem: "LLM output doesn't match schema"
Solution:
1. read_prompt_template() to check if schema is passed
2. Verify {{ analysis_schema }} appears in template
3. Check field types are valid (string, integer, array)

Custom Prompt Not Used

Problem: "My changes aren't affecting analysis"
Solution:
1. Verify file saved to _thoth/prompts/{provider}/
2. Check provider name matches model (google for gemini, openai for gpt)
3. Hot-reload may take a few seconds

Schema Reference

Standard Preset (15 fields)

Balanced analysis: title, authors, year, doi, journal, abstract, key_points, summary, objectives, methodology, results, discussion, strengths, weaknesses, tags

Detailed Preset (20 fields)

Exhaustive extraction: adds data, experimental_setup, evaluation_metrics, limitations, future_work, related_work

Minimal Preset (6 fields)

Quick screening: title, authors, year, abstract, key_points (3-5), tags (3-5)

Custom Preset

User-defined template for specialized needs

Prompt Reference

Analysis Prompts

analyze_content.j2 - Single-pass analysis for short docs
analyze_section.j2 - Map phase for chunked docs
reduce_sections.j2 - Combine section analyses
refine_analysis.j2 - Iterative improvement

Other Prompts

extract_citations*.j2 - Citation extraction
evaluate_*.j2 - Relevance evaluation
consolidate_tags.j2 - Tag management
research_agent_chat.j2 - Agent conversations

Notes

Schema changes require valid JSON structure
Prompt changes should preserve Jinja2 syntax
All edits create .bak backups automatically
Changes apply to new analyses, not retroactively
Custom configs survive system updates (vault-based)