extract-references

Updated May 22, 2026 at 02:42

Extract bibliography/references from PDF files using GROBID and return a Collection of Notes (one per reference).

Source: bdambrosio/Cognitive_workbench

SKILL.md

name	extract-references
type	python
description	Extract bibliography/references from PDF files using GROBID and return a Collection of Notes (one per reference).
schema_hint	{"path":"string (PDF file path or Note ID)","grobid_url":"string"}
resolve	{"path":"resource_id"}

extract-references

Extract bibliography/references from PDF files using GROBID. Returns a Collection of Notes, where each Note contains structured metadata for one reference (compatible with format-citation).

Input

path: Absolute PDF file path, or the ID of a Note whose tool_metadata contains a pdf_url (required)
grobid_url: GROBID server URL (optional; from world_config)

Output

Success (status: "success"):

- resource_id: Collection ID containing Notes (one Note per reference)
- Each Note contains:
  - data: Structured reference metadata (title, authors, year, venue, doi, url)
  - metadata: Source PDF, reference index, raw citation text

Behavior

- Uses GROBID to parse the PDF and extract references from <bibl> elements
- Creates one Note per reference with structured metadata
- Returns an empty Collection if no references are found
- Reference Notes are compatible with the format-citation tool
- Note ID input: when given a Note ID (e.g., from semantic-scholar), the tool looks up pdf_url in the Note's tool_metadata automatically, so no manual metadata extraction is needed
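
The parsing step described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: it assumes GROBID's TEI XML output, where each parsed reference normally appears as a `biblStruct` element inside `listBibl`. The sample XML is hand-written, and the function name `parse_references` and the output field layout are assumptions chosen to mirror the Note `data` fields listed under Output.

```python
import xml.etree.ElementTree as ET

# TEI namespace used by GROBID's XML output.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hand-written sample in the general shape of a GROBID reference list.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><back><div type="references"><listBibl>
    <biblStruct xml:id="b0">
      <analytic>
        <title level="a" type="main">Attention Is All You Need</title>
        <author><persName>
          <forename type="first">Ashish</forename><surname>Vaswani</surname>
        </persName></author>
        <author><persName>
          <forename type="first">Noam</forename><surname>Shazeer</surname>
        </persName></author>
      </analytic>
      <monogr>
        <title level="m">Advances in Neural Information Processing Systems</title>
        <imprint><date type="published" when="2017"/></imprint>
      </monogr>
    </biblStruct>
  </listBibl></div></back></text>
</TEI>"""


def _text(element):
    """Return stripped element text, or an empty string if missing."""
    return (element.text or "").strip() if element is not None else ""


def parse_references(tei_xml):
    """Hypothetical helper: turn TEI XML into one dict per reference."""
    root = ET.fromstring(tei_xml)
    references = []
    for index, bibl in enumerate(root.iterfind(".//tei:biblStruct", TEI_NS)):
        analytic_title = bibl.find("tei:analytic/tei:title", TEI_NS)
        monogr_title = bibl.find("tei:monogr/tei:title", TEI_NS)
        if analytic_title is not None:
            # Article-level title present: the monograph title is the venue.
            title, venue = _text(analytic_title), _text(monogr_title)
        else:
            title, venue = _text(monogr_title), ""

        authors = []
        for pers in bibl.iterfind(".//tei:author/tei:persName", TEI_NS):
            parts = [_text(f) for f in pers.findall("tei:forename", TEI_NS)]
            parts.append(_text(pers.find("tei:surname", TEI_NS)))
            name = " ".join(p for p in parts if p)
            if name:
                authors.append(name)

        date = bibl.find(".//tei:date[@when]", TEI_NS)
        when = date.get("when", "") if date is not None else ""
        year = int(when[:4]) if when[:4].isdigit() else None

        references.append({
            "index": index,
            "title": title,
            "authors": authors,
            "year": year,
            "venue": venue,
        })
    return references


for ref in parse_references(SAMPLE_TEI):
    print(ref)
```

A document with no reference entries yields an empty list here, which corresponds to the empty Collection case above.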

Planning Notes

This is the right tool for bibliography/reference extraction from papers. It uses GROBID structural parsing (deterministic, fast) rather than LLM extraction (slow, lossy). Prefer this over extract or map(extract) with citation-related instructions.

Common workflow with semantic-scholar:

- semantic-scholar → $paper Collection
- get_items("$paper")[0] → Note ID (contains pdf_url in tool_metadata)
- extract-references(path=note_id) → $refs Collection of structured citation Notes
- Each citation Note is JSON: {"title": "...", "authors": [...], "year": 2020, "venue": "..."}
- Read with get_text(note_id) + json.loads() in Python
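
The last step of the workflow above, decoding a citation Note, might look like the sketch below. It runs standalone on a sample string; inside the workbench the text would come from get_text(note_id). The helper names `parse_citation_note` and `short_label` are hypothetical, and the sketch assumes that `authors` is a list of plain name strings, which the source does not confirm.

```python
import json


def parse_citation_note(note_text):
    """Decode one citation Note's JSON text, filling defaults for missing keys."""
    record = json.loads(note_text)
    return {
        "title": record.get("title", ""),
        "authors": record.get("authors", []),  # assumed: list of name strings
        "year": record.get("year"),
        "venue": record.get("venue", ""),
    }


def short_label(record):
    """Build a compact 'First Author et al. (year): title' label."""
    authors = record["authors"]
    first = authors[0] if authors else "Unknown"
    suffix = " et al." if len(authors) > 1 else ""
    year = record["year"] if record["year"] is not None else "n.d."
    return f"{first}{suffix} ({year}): {record['title']}"


# Stand-in for the string returned by get_text(note_id) in the workbench.
sample_text = (
    '{"title": "Example Paper", "authors": ["A. Author", "B. Author"], '
    '"year": 2020, "venue": "Example Venue"}'
)

record = parse_citation_note(sample_text)
print(short_label(record))  # A. Author et al. (2020): Example Paper
```

Decoding with json.loads() keeps the structured fields intact, which is why plain-text accessors are a poor fit for these Notes.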

Do NOT:

- Pass a Collection ID or $binding string as path; pass the actual Note ID from get_items() instead
- Use pluck(field="text") on result Notes; their content is JSON, not plain text
- Use extract or map(extract) for reference lists; this tool is faster, structured, and deterministic
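
The first rule above can be guarded against with a small check before calling the tool. This is only a sketch: `validate_path` is a hypothetical helper, and it catches just the `$binding` case, since the format of Collection IDs is not documented here.

```python
def validate_path(path):
    """Hypothetical guard: reject empty values and unresolved $bindings."""
    if not isinstance(path, str) or not path.strip():
        raise ValueError("path must be a non-empty string")
    if path.startswith("$"):
        # Bindings such as "$paper" name a Collection, not a single Note.
        raise ValueError(
            f"{path!r} is a binding; pass a Note ID from get_items() instead"
        )
    return path


print(validate_path("Note_1234"))          # Note_1234
print(validate_path("/path/to/paper.pdf"))  # /path/to/paper.pdf

try:
    validate_path("$paper")
except ValueError as error:
    print("rejected:", error)
```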

Examples

{"type":"extract-references","path":"/path/to/paper.pdf","out":"$refs"}
{"type":"extract-references","path":"Note_1234","out":"$refs"}
{"type":"format-citation","target":"$refs","format":"bibtex","out":"$bibtex"}

Full semantic-scholar pipeline:

{"type":"semantic-scholar","query":"attention is all you need","limit":1,"out":"$paper"}

# get_items returns the Note IDs in the $paper Collection; pass the first one as path
items = get_items("$paper")
r = tool("extract-references", path=items[0], out="$refs")

