document-grounding

Name: Document Grounding
Author: gaotiexinqu

// Convert a raw document into a structured grounding note for downstream research and summarization.


stars:434

forks:42

updated: April 20, 2026 at 08:45


Related skills from the same repository

archive-grounding.md

from "gaotiexinqu/OneResearchClaw"

Unpack a ZIP archive, inventory its files, run the corresponding child grounding skill for each supported child file, and then write a real archive-level grounded.md.


grounded-research-lit.md

from "gaotiexinqu/OneResearchClaw"

Run focused literature and web research from a grounded note. Use when a grounded note already exists and you want targeted research results, opened-link evidence, deeper per-paper analysis materials, optional downloaded literature, and a two-stage literature output (`lit_initial.md` then refined `lit.md`).


grounded-review.md

from "gaotiexinqu/OneResearchClaw"

Review a research report draft with a structured scoring rubric, run a bounded repair loop when needed, and produce the final deliverable report.


grounded-summary.md

from "gaotiexinqu/OneResearchClaw"

Create a rich, evidence-preserving research report draft from a grounded note and its follow-up literature result. This is the main report-writing stage of the middle pipeline, not a compression memo.


input-router.md

from "gaotiexinqu/OneResearchClaw"

Use the input path to select the correct downstream grounding pipeline and continue execution until the selected grounding workflow is completed.


meeting-audio-grounding.md

from "gaotiexinqu/OneResearchClaw"

Convert a meeting audio file into a transcript bundle, then use meeting-grounding to produce structured meeting grounding outputs.


package.json

"author": "gaotiexinqu"

"repository": "gaotiexinqu/OneResearchClaw"


name	document-grounding
description	Convert a raw document into a structured grounding note for downstream research and summarization.

Document Grounding

Convert a raw document into a structured grounding note.

This skill is for document grounding, not a narrative recap. It should produce a stable intermediate note that is easy to read and easy for downstream skills to use.

When to Use

Use this skill when:

the input is a single document file
the document may be a PDF, DOCX, Markdown, or TXT file
you need structured document notes before downstream research or summary work
the document may contain non-textual evidence such as tables, figures, diagrams, formulas, or code blocks

Do not use this skill when:

the task is to write a polished final report
the task is to perform external literature search
the task is to render output into PDF/DOCX
the input is a meeting transcript and you want meeting-specific grounding

Input

A single document file.

Supported first-stage formats:

.pdf
.docx
.md
.txt

The document may contain:

plain text
section headings
tables
figures / diagrams
formulas
code blocks
captions
layout / reading-order challenges

Output Bundle

For each input document, create one bundle directory:

data/grounded_notes/<type>-<doc_id>_<timestamp>/

where <type> is the file extension (e.g. pdf, docx, md, txt), <doc_id> is the sanitized filename without extension, and <timestamp> is the Beijing-time execution timestamp (format: YYYYMMDDHHMMSS).

For example:

data/grounded_notes/pdf-paper_name_20260410153022/
data/grounded_notes/docx-notes_001_20260410153100/
data/grounded_notes/md-project_readme_20260410153215/
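The naming scheme above can be sketched in a few lines. This is only an illustration, not the logic of ground_document.py: the function names `build_ground_id` and `bundle_dir` are hypothetical, and the sanitization rule (collapsing characters outside letters, digits, `_`, and `-` into `_`) is an assumption, since the skill does not say how filenames are sanitized.

```python
import re
from datetime import datetime, timedelta, timezone
from pathlib import Path

# Fixed UTC+8 offset for Beijing time (no DST), so no tzdata package is needed.
BEIJING = timezone(timedelta(hours=8))


def build_ground_id(doc_path, now=None):
    """Return '<type>-<doc_id>_<timestamp>' for a document path.

    `now` must be a timezone-aware datetime; it defaults to the current time.
    """
    path = Path(doc_path)
    doc_type = path.suffix.lstrip(".").lower()
    # Assumed sanitization rule: runs of other characters become one underscore.
    doc_id = re.sub(r"[^A-Za-z0-9_-]+", "_", path.stem).strip("_")
    moment = now.astimezone(BEIJING) if now else datetime.now(BEIJING)
    return f"{doc_type}-{doc_id}_{moment:%Y%m%d%H%M%S}"


def bundle_dir(doc_path, now=None, root="data/grounded_notes"):
    """Return the bundle directory path for a document."""
    return Path(root) / build_ground_id(doc_path, now)
```

Passing an explicit `now` keeps the identifier reproducible, which matters because every downstream stage reuses the same value.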

Inside that bundle, the expected outputs are:

<bundle_dir>/
├─ ground_id.txt       # Ground ID for this unit (reused by all downstream stages)
├─ extracted.md
├─ extracted_meta.json
├─ asset_index.json
├─ grounded.md
└─ assets/
   ├─ tables/
   ├─ figures/
   └─ formulas/

The <ground_id> (e.g. pdf-paper_name_20260410153022) is the single stable identifier for the entire pipeline — all downstream directories (lit_inputs, lit_results, report_inputs, review_outputs, reports, final_outputs) reuse this same <ground_id>.
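A quick completeness check for the layout above might look like the following. The helper name `missing_bundle_entries` is hypothetical, and the sketch assumes that all three `assets/` subdirectories are always created, which the skill does not state explicitly.

```python
from pathlib import Path

# File and directory names copied from the bundle layout described above.
EXPECTED_FILES = (
    "ground_id.txt",
    "extracted.md",
    "extracted_meta.json",
    "asset_index.json",
    "grounded.md",
)
EXPECTED_ASSET_DIRS = ("tables", "figures", "formulas")


def missing_bundle_entries(bundle_dir):
    """Return the expected bundle entries that are not present on disk."""
    bundle = Path(bundle_dir)
    missing = [name for name in EXPECTED_FILES if not (bundle / name).is_file()]
    # Assumption: every asset subdirectory exists even when it is empty.
    missing += [
        f"assets/{name}/"
        for name in EXPECTED_ASSET_DIRS
        if not (bundle / "assets" / name).is_dir()
    ]
    return missing
```

An empty list means the bundle has the expected shape; it says nothing about whether grounded.md contains a real note.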

Important separation of responsibilities

ground_document.py is responsible for building the extraction bundle:
- extracted.md
- extracted_meta.json
- asset_index.json
- assets/...
The agent is responsible for reading those outputs and then writing the final:
- grounded.md

grounded.md must be a real grounding note. It must not remain a placeholder scaffold.

Required Agent Workflow

1. Run the extraction script through the existing run.sh entrypoint.
2. Read:
   - extracted.md
   - extracted_meta.json
   - asset_index.json
3. If extracted.md contains AssetRef blocks, inspect the referenced files in assets/... before writing grounded.md.
4. Write grounded.md as a real structured grounding note.
5. Do not stop after merely confirming that the extraction bundle exists.

AssetRef Rule

extracted.md may contain blocks such as:

[AssetRef]
type: figure
id: figure_001
path: assets/figures/figure_001.png
instruction: Inspect this asset before writing grounded.md if it is relevant to the document's claims, comparisons, or conclusions.
[/AssetRef]

and

[AssetRef]
type: table
id: table_001
path_md: assets/tables/table_001.md
path_csv: assets/tables/table_001.csv
instruction: Inspect this table before writing grounded.md if it contains key evidence, comparisons, or numerical results.
[/AssetRef]

These are not decorative markers. They indicate that important evidence may exist outside the plain extracted text. If an AssetRef appears relevant, the agent must inspect the referenced asset before final grounding.
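To find the assets that need inspection, the blocks shown above can be parsed with a small helper. This is a minimal sketch that assumes every block follows the `key: value` line format of the two examples; `parse_asset_refs` and `asset_paths` are hypothetical names, not part of the skill.

```python
import re

# Matches everything between an opening [AssetRef] and its closing [/AssetRef].
ASSETREF_BLOCK = re.compile(r"\[AssetRef\]\s*\n(.*?)\n\s*\[/AssetRef\]", re.DOTALL)


def parse_asset_refs(extracted_text):
    """Return one dict of fields per AssetRef block found in the text."""
    refs = []
    for match in ASSETREF_BLOCK.finditer(extracted_text):
        fields = {}
        for line in match.group(1).splitlines():
            # Split on the first colon only, so values may contain colons.
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        refs.append(fields)
    return refs


def asset_paths(ref):
    """Collect every path-like field (path, path_md, path_csv) of one block."""
    return [
        value
        for key, value in ref.items()
        if key == "path" or key.startswith("path_")
    ]
```

Whether a referenced asset is relevant is still a judgment the agent makes after opening it; the parser only lists what exists.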

Output Format

Return markdown with exactly these sections.

# Document Grounding

## 1. Main Topic / Purpose
[2–4 sentence statement of the document’s main topic and purpose.]

## 2. Main Points
- [One bullet per major sub-topic, contribution, or argument]

## 3. Key Findings / Claims
- [Only items explicitly stated or strongly supported by the document]

## 4. Constraints / Risks
- [Only constraints, limitations, caveats, or risks explicitly stated or strongly supported by the document]

## 4a. Important Non-Textual Elements
- [Key tables, figures, diagrams, formulas, or code blocks that materially affect interpretation]
- [Mention how they affect the reading of the document if relevant]

## 5. Unresolved Issues
- [Preserve uncertainty, unanswered questions, incomplete evidence, or limitations that remain open]

## 6. Suggested Next Steps
- [Concrete follow-up directions grounded in the document]
- [Do not invent owners, deadlines, or commitments]

## 7. Search Keywords

### Problem Keywords
- ...

### Method / Solution Keywords
- ...

### Domain / Constraint Keywords
- ...
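One way to check the "exactly these sections" requirement is to compare the heading lines of a note against the template. The sketch below is an illustration with hypothetical function names; as a simplification it treats every line starting with `#` as a heading, so a note that quotes code containing `#` comments would fail the check.

```python
# Headings copied from the output template above, in the required order.
REQUIRED_HEADINGS = [
    "# Document Grounding",
    "## 1. Main Topic / Purpose",
    "## 2. Main Points",
    "## 3. Key Findings / Claims",
    "## 4. Constraints / Risks",
    "## 4a. Important Non-Textual Elements",
    "## 5. Unresolved Issues",
    "## 6. Suggested Next Steps",
    "## 7. Search Keywords",
    "### Problem Keywords",
    "### Method / Solution Keywords",
    "### Domain / Constraint Keywords",
]


def heading_lines(markdown_text):
    """Return all lines that look like markdown headings, stripped."""
    return [
        line.strip()
        for line in markdown_text.splitlines()
        if line.lstrip().startswith("#")
    ]


def has_exact_sections(markdown_text):
    """True when the note has the required headings, in order, and no others."""
    return heading_lines(markdown_text) == REQUIRED_HEADINGS
```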

Instructions

Do not write a chronological recap of the document.
Do not invent facts, conclusions, owners, deadlines, or commitments.
Do not turn suggestions, hypotheses, or future work into established conclusions.
Do not add external knowledge, acronym expansions, years, benchmark details, or metadata not explicitly stated or strongly evidenced in the document.
Remove low-signal boilerplate when it does not affect meaning.
Group related content into high-level sub-topics instead of page-by-page bullets.
Keep the output concise and structured.
Treat non-textual evidence as first-class evidence when relevant.

Special Rules

Key Findings / Claims

Only include something here if it is explicitly stated or strongly supported by the document. If it is merely hinted, proposed, or speculative, do not present it as a confirmed finding.

Constraints / Risks

Only include constraints, limitations, or risks that are explicitly stated or strongly evidenced. Do not infer hidden constraints from weak hints.

Important Non-Textual Elements

Only mention tables, figures, formulas, diagrams, or code blocks that materially affect interpretation. Do not list every asset mechanically. If an asset was referenced in extracted.md through an AssetRef block and it appears relevant, inspect it before deciding whether to include it.

Suggested Next Steps

Only include follow-up actions or directions that are explicitly discussed or strongly implied by the document. Do not invent action owners, deadlines, or commitments.

Search Keywords

Use specific noun phrases that are useful for later search. Avoid generic terms such as:

project
document
update
issue
optimization
system
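The generic-term rule can be applied as a simple filter. This sketch uses a hypothetical helper, `drop_generic_keywords`, and only removes keywords that consist entirely of one of the listed terms; a longer noun phrase that merely contains such a word is kept.

```python
# Generic terms copied from the list above.
GENERIC_TERMS = {"project", "document", "update", "issue", "optimization", "system"}


def drop_generic_keywords(keywords):
    """Remove blank entries and keywords that are just a generic term."""
    kept = []
    for keyword in keywords:
        cleaned = keyword.strip()
        if cleaned and cleaned.lower() not in GENERIC_TERMS:
            kept.append(cleaned)
    return kept
```

For example, `drop_generic_keywords(["System", "table structure recognition"])` keeps only the second entry.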

Handling Difficult Cases

If the document is repetitive, summarize repeated points once.
If the document is incomplete, paraphrase only when meaning is clear.
If sections conflict, keep the conflict under unresolved issues.
If the main topic is genuinely unclear, write: Unclear — multiple loosely related topics appear in the document.
For long documents, identify major section shifts first, then synthesize.
If non-textual assets are present but unclear, mention them cautiously rather than over-claiming their meaning.

Execution Rule

The task is not complete if grounded.md is missing, empty, or still contains placeholder scaffold text. The agent must overwrite or newly create grounded.md with a real grounding note based on:

extracted.md
extracted_meta.json
asset_index.json
any relevant files referenced from assets/...
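The completion rule can be approximated with a check like the one below. The skill does not define what placeholder scaffold text looks like, so the heuristic here is an assumption: a line counts as a placeholder when it is `...` or a fully bracketed template instruction, as in the Output Format template. The function name `grounded_note_problem` is hypothetical.

```python
from pathlib import Path


def grounded_note_problem(bundle_dir):
    """Return 'missing', 'empty', or 'placeholder', or None if the note looks real."""
    note = Path(bundle_dir) / "grounded.md"
    if not note.is_file():
        return "missing"
    text = note.read_text(encoding="utf-8")
    if not text.strip():
        return "empty"
    for line in text.splitlines():
        stripped = line.strip()
        # Drop a leading bullet marker before testing the line body.
        body = stripped[2:] if stripped.startswith("- ") else stripped
        # Assumed heuristic: template lines are "..." or wrapped in [ ].
        if body == "..." or (body.startswith("[") and body.endswith("]")):
            return "placeholder"
    return None
```

A `None` result only means the obvious failure modes are absent; it does not verify that the note is faithful to the extracted evidence.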

Example Invocation

/document-grounding

Environment Variables

When running the extraction script, ensure the Docling model path is set:

export DOCLING_ARTIFACTS_PATH="${PROJECT_ROOT}/models/docling"

Supported model path locations (checked in order):

${PROJECT_ROOT}/models/docling
${PROJECT_ROOT}/models/docling-project/docling-models
/root/.cache/docling/models
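The lookup order above could be reproduced as follows. This is a sketch under assumptions: the helper names are hypothetical, and it treats a location as usable when the directory exists, without checking that model files are actually inside it.

```python
import os
from pathlib import Path


def resolve_docling_path(project_root):
    """Return the first existing model directory, checked in the documented order."""
    root = Path(project_root)
    candidates = [
        root / "models" / "docling",
        root / "models" / "docling-project" / "docling-models",
        Path("/root/.cache/docling/models"),
    ]
    for candidate in candidates:
        # os.path.isdir returns False instead of raising on unreadable paths.
        if os.path.isdir(candidate):
            return candidate
    return None


def export_docling_path(project_root, environ=os.environ):
    """Set DOCLING_ARTIFACTS_PATH in `environ` when a model directory is found."""
    found = resolve_docling_path(project_root)
    if found is not None:
        environ["DOCLING_ARTIFACTS_PATH"] = str(found)
    return found
```

Setting the variable in `os.environ` only affects the current Python process and the child processes it starts, so it should run before the extraction script is launched.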
