academic-research

Nested swiss-knife reference for academic literature work — find papers, fetch full-text PDFs, trace citations, write LaTeX manuscripts. **First action for any "get me this paper" request:** `python3 <skill-path>/scripts/fetch_paper.py <DOI|arXiv-ID|PMID>` — walks arXiv → Unpaywall → Europe PMC → CORE → in-house publisher-page extraction (Nature/APS/AIP/IOP/Cambridge) → authorized institutional publisher → LibGen and saves the paper, metadata, and a resumable manifest under `papers/{slug}/`. Read the body when you need to escape the script: custom query shapes, citation networks, scholar analysis, LaTeX writing, or a tier-specific API call. Indexes 12 deep-dive API references and 6 pipeline workflows under `reference/`.

Updated: 2026-06-09 03:23

Source: Lingtai-AI/lingtai (GitHub repository)

name	academic-research
description	Nested swiss-knife reference for academic literature work: find papers, fetch full-text PDFs, trace citations, write LaTeX manuscripts. First action for any "get me this paper" request: `python3 <skill-path>/scripts/fetch_paper.py <DOI|arXiv-ID|PMID>`, which walks arXiv → Unpaywall → Europe PMC → CORE → in-house publisher-page extraction (Nature/APS/AIP/IOP/Cambridge) → authorized institutional publisher → LibGen and saves the paper, metadata, and a resumable manifest under `papers/{slug}/`. Read the body when you need to escape the script: custom query shapes, citation networks, scholar analysis, LaTeX writing, or a tier-specific API call. Indexes 12 deep-dive API references and 6 pipeline workflows under `reference/`.
version	3.0.0
allowed-tools	Bash(python3 *) Bash(curl *) Bash(pip *) Bash(pip3 *)
tags	["academic","research","arxiv","crossref","openalex","semantic-scholar","core","pubmed","unpaywall","doi","pdf","citation","pipeline","europe-pmc","nasa-ads","inspire-hep","nested-skill"]

Academic Research

Nested swiss-knife reference. A modular skill: try the bundled script first, then load specific reference files only when you need to escape it.

Try this first

For 80% of "get me this paper" requests, the bundled script is the right answer. It walks the open-access ladder, falls back automatically, and writes a manifest the next session can resume from.

# Fetch by any identifier
python3 <skill-path>/scripts/fetch_paper.py 10.1103/PhysRevLett.125.015001
python3 <skill-path>/scripts/fetch_paper.py arXiv:2301.00001
python3 <skill-path>/scripts/fetch_paper.py PMID:12345678

# Batch (one identifier per line in the file)
python3 <skill-path>/scripts/fetch_paper.py --batch dois.txt --out papers/

# Resolve metadata only (no PDF download)
python3 <skill-path>/scripts/fetch_paper.py 10.1038/nature12373 --dry-run

# Skip LibGen (e.g. legal-sensitive environment)
python3 <skill-path>/scripts/fetch_paper.py <id> --no-libgen

Output layout (idempotent — re-runs skip entries with status: ok):

papers/{first-author-year-firstword}/
├── paper.pdf  |  paper.md      # full-text artifact
├── metadata.json                # CrossRef-normalized
└── manifest.json                # {status, tier, source, ts, doi}
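
The skip-on-`status: ok` behaviour can be pictured with a minimal sketch. The helper name `already_fetched` is hypothetical and is not taken from `fetch_paper.py`; only the `manifest.json` filename and the `status` field come from the layout above, and the real resume logic may differ.

```python
import json
from pathlib import Path


def already_fetched(paper_dir: Path) -> bool:
    """Return True when paper_dir/manifest.json records status == "ok".

    Hypothetical helper for illustration; not the script's actual code.
    """
    manifest = paper_dir / "manifest.json"
    if not manifest.is_file():
        return False
    try:
        data = json.loads(manifest.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, OSError):
        # A corrupt or unreadable manifest is treated as "not done yet".
        return False
    return isinstance(data, dict) and data.get("status") == "ok"
```

A batch driver could call this before each fetch so that a re-run only retries entries whose manifest is missing or records a failure.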

Tier ladder (script stops at first hit):

Tier	Source	Best for
1	arXiv direct	Preprints (physics, CS, math, q-bio, econ)
2	Unpaywall	Publisher-blessed gold/green OA
3	Europe PMC	Biomedical full-text + PMC mirror
4	CORE	Institutional repositories (`$CORE_API_KEY` recommended; keyless access is heavily rate-limited)
5	Publisher-page extract	Nature/APS/AIP/IOP/Cambridge → in-house extractor (stdlib + `requests`, no other third-party deps). Fetches the already-accessible official article / DOI landing page and parses `citation_*` metadata + the article body into structured Markdown. No paywall/CAPTCHA bypass, no cookies/credentials; official pages only. A login/paywall page is a clean miss; the ladder then falls through. Opt out with `--no-publisher-extract`.
5b	Authorized publisher	Licensed/institutional access only — official DOI landing page → same-host publisher PDF → `%PDF-` validation, full provenance. Recovers paywalled-but-subscribed papers without shadow libraries. Never bypasses paywalls or handles credentials. Opt out with `--no-institutional`. See authorized-publisher-access.md.
6	LibGen	Last resort; opt out with `--no-libgen`
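
The "stops at first hit" rule in the table above is a plain first-success fallback chain. The sketch below shows that pattern only; `walk_ladder`, `fake_arxiv`, and `fake_unpaywall` are hypothetical names, and the stand-in fetchers do no network access.

```python
from typing import Callable, Optional, Sequence, Tuple

# A fetcher takes an identifier and returns a local path on a hit, or None on a miss.
Fetcher = Callable[[str], Optional[str]]


def walk_ladder(identifier: str,
                tiers: Sequence[Tuple[str, Fetcher]]) -> Optional[Tuple[str, str]]:
    """Try each tier in order; return (tier_name, path) for the first hit, else None."""
    for name, fetch in tiers:
        try:
            path = fetch(identifier)
        except Exception:
            # Any tier failure counts as a miss so the ladder keeps falling through.
            path = None
        if path is not None:
            return name, path
    return None


# Stand-in fetchers for illustration only (assumptions, not the real tiers).
def fake_arxiv(identifier: str) -> Optional[str]:
    return "papers/demo/paper.pdf" if identifier.startswith("arXiv:") else None


def fake_unpaywall(identifier: str) -> Optional[str]:
    return "papers/demo/paper.pdf" if identifier.startswith("10.") else None


LADDER = [("arxiv", fake_arxiv), ("unpaywall", fake_unpaywall)]
print(walk_ladder("10.1038/nature12373", LADDER))
# prints ('unpaywall', 'papers/demo/paper.pdf')
```

Opt-out flags such as `--no-libgen` would correspond, in this picture, to leaving a tier out of the list before walking it.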

Set $LINGTAI_RESEARCH_EMAIL to a real address before first use: Unpaywall rejects placeholder emails with HTTP 422. If the variable is unset, the script falls back to lingtai-agent@example.org with a warning, and Unpaywall is likely to reject that placeholder.
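
As a rough illustration of the email requirement, the sketch below reads the variable, warns on fallback, and builds an Unpaywall v2 lookup URL. The function names `research_email` and `unpaywall_url` are hypothetical; the script's own handling may differ.

```python
import os
import warnings
from urllib.parse import quote, urlencode

FALLBACK_EMAIL = "lingtai-agent@example.org"  # the default named in the text above


def research_email(env=None) -> str:
    """Return $LINGTAI_RESEARCH_EMAIL, or the placeholder fallback with a warning."""
    env = os.environ if env is None else env
    email = env.get("LINGTAI_RESEARCH_EMAIL", "").strip()
    if not email:
        warnings.warn("LINGTAI_RESEARCH_EMAIL is unset; Unpaywall may answer HTTP 422")
        return FALLBACK_EMAIL
    return email


def unpaywall_url(doi: str, email: str) -> str:
    """Build the Unpaywall v2 lookup URL for a DOI (no request is sent here)."""
    return f"https://api.unpaywall.org/v2/{quote(doi, safe='/')}?{urlencode({'email': email})}"


print(unpaywall_url("10.1038/nature12373", "me@uni.edu"))
# prints https://api.unpaywall.org/v2/10.1038/nature12373?email=me%40uni.edu
```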

Read on only if: the script fails on your paper, you need a custom query shape, or you're composing a multi-step workflow (search → fetch → cite → write).

Before drafting citation-bearing academic writing

Verify sources before you write prose. If you are about to draft a paper, related-work section, literature review, References list, or any author–year / citation-bearing manuscript, pass the evidence-verification gate first: reference/evidence-verification-gate.md. No verified evidence → no confident prose. Peer-reviewed and preprint sources must not share the same evidence layer; search results are leads, not citations. If the user asks to "go fast," reduce scope, not verification. Produce a verified literature matrix before any submission-like draft.

Escape hatch — quick paths

I'm about to write a paper / references / related-work section → reference/evidence-verification-gate.md (verify FIRST)
The user wants a "fast" / "readable" paper draft → reference/evidence-verification-gate.md (reduce scope, not verification)
I have a DOI                → reference/api-doi-resolver.md → api-crossref.md
I have an arXiv ID          → reference/api-arxiv.md (direct PDF link)
I have a PMID               → reference/api-europe-pmc.md
I have a bibcode            → reference/api-nasa-ads.md (requires free key)
I only have keywords        → reference/decision-tree.md → pick API by discipline
I need a citation network   → reference/api-semantic-scholar.md or api-openalex.md
I need to override the PDF ladder → reference/pipeline-obtain-pdf.md
Tier-5 publisher-extract failed and I want to retry it manually → reference/publisher-page-extraction.md
OA + authorized-publisher failed and I have a batch + the user's Zotero → reference/zotero-institutional-fulltext-handoff.md
All OA chains failed        → reference/libgen-fallback.md (last resort)
I need astrophysics         → reference/api-nasa-ads.md
I need high-energy physics  → reference/api-inspire-hep.md
I need biomedical           → reference/api-europe-pmc.md or api-pubmed.md
I need to write/compile a paper → reference/pipeline-latex-writing.md
My empirical draft keeps getting reframed / reviewers "agree" → reference/anti-pattern-text-consistency-vs-data-correspondence.md
I hit an API error          → reference/error-handling.md
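
The first few quick paths above amount to classifying the identifier and then picking a reference file. A minimal sketch of that routing follows; the regular expressions are deliberately loose and illustrative, the names `classify` and `route` are hypothetical, and bare numeric PMIDs and bibcodes are left out because they are ambiguous without a prefix.

```python
import re

# Loose, illustrative patterns; this is not the parser used by fetch_paper.py.
_DOI = re.compile(r"^10\.\d{4,9}/\S+$")
_ARXIV = re.compile(r"^(?:arxiv:)?\d{4}\.\d{4,5}(?:v\d+)?$", re.IGNORECASE)
_PMID = re.compile(r"^pmid:\d+$", re.IGNORECASE)

# Targets copied from the quick-path list above.
ROUTES = {
    "doi": "reference/api-doi-resolver.md",
    "arxiv": "reference/api-arxiv.md",
    "pmid": "reference/api-europe-pmc.md",
    "keywords": "reference/decision-tree.md",
}


def classify(identifier: str) -> str:
    """Return 'doi', 'arxiv', 'pmid', or 'keywords' for free text."""
    text = identifier.strip()
    if _DOI.match(text):
        return "doi"
    if _ARXIV.match(text):
        return "arxiv"
    if _PMID.match(text):
        return "pmid"
    return "keywords"


def route(identifier: str) -> str:
    """Map an identifier to the reference file the quick-path list points at."""
    return ROUTES[classify(identifier)]


print(route("arXiv:2301.00001"))
```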

Reference index

API references (12)

Each card includes endpoint parameters, runnable code, response shape, rate limits, and fallbacks.

API	File	Best for	Key?
arXiv	api-arxiv.md	Preprint retrieval	No
CrossRef	api-crossref.md	DOI metadata, funder queries	No (mailto recommended)
DOI Resolver	api-doi-resolver.md	Batch DOI → structured citation	No
OpenAlex	api-openalex.md	Discovery, institution/concept analysis	No
Semantic Scholar	api-semantic-scholar.md	Citation networks, TLDR	No (tight limits)
CORE	api-core.md	OA full-text downloads	Optional (recommended)
PubMed	api-pubmed.md	Biomedical search, PMC full text	No
Unpaywall	api-unpaywall.md	OA versions / PDFs	email (real)
Google Scholar	api-google-scholar.md	Broadest discipline coverage	No (needs stealth)
Europe PMC	api-europe-pmc.md	Biomed, PMID, full-text XML	No
NASA ADS	api-nasa-ads.md	Astrophysics, BibTeX export	Yes (free)
INSPIRE-HEP	api-inspire-hep.md	High-energy physics	No

Pipeline workflows (6)

Pipeline	File	Purpose
Paper discovery	pipeline-discovery.md	Keywords → candidate papers
PDF acquisition	pipeline-obtain-pdf.md	Metadata → full text (manual ladder)
Citation tracking	pipeline-citation-tracking.md	Forward/backward citation networks
Scholar analysis	pipeline-scholar-analysis.md	Impact, trends, h-index
LaTeX writing	pipeline-latex-writing.md	Compile, bibliography, figures, debug
Decision tree	decision-tree.md	"I have X — which API should I use?"

Standalone references

evidence-verification-gate.md — verify-before-drafting gate for citation-bearing academic writing: detection triggers, the verified-literature-matrix schema, A/B/C evidence tiers, "go fast = reduce scope not verification," matrix-before-draft artifact convention, pre-draft lint
publisher-page-extraction.md — Tier-5 manual escape hatch (Nature/APS/AIP/IOP/Cambridge → structured Markdown)
authorized-publisher-access.md — Tier-5b automated probe: official DOI landing page → same-host publisher PDF, licensed access only, no paywall/credential handling
zotero-institutional-fulltext-handoff.md — Tier-6a human-in-the-loop: stage a failed batch into Zotero with tags, the human runs Find Full Text (institutional access), agent harvests PDFs back with provenance. No UI automation / TCC bypass / credential handling.
libgen-fallback.md — Last-resort PDF source with legal/safety notes
error-handling.md — 429 backoff, 403 publisher blocks, timeout patterns
anti-pattern-text-consistency-vs-data-correspondence.md — empirical-writing failure mode: prose drifts from the data while reviewer rounds make it more polished. Trigger pattern, re-anchoring steps, detection checklist.
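
For the 429 backoff mentioned under error-handling.md, one common approach is capped exponential backoff that honours a numeric `Retry-After` header when the server sends one. The sketch below is an assumption about a reasonable pattern, not a copy of what error-handling.md prescribes; `backoff_delay` and its parameters are hypothetical.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base: float = 1.0, cap: float = 60.0, jitter: bool = True) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Numeric Retry-After (seconds) wins, clamped to [0, cap].
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # Retry-After may also be an HTTP date; fall back to exponential.
    delay = min(cap, base * (2 ** attempt))
    if jitter:
        # "Full jitter": pick uniformly in [0, delay] to avoid synchronized retries.
        delay = random.uniform(0, delay)
    return delay
```

A caller would sleep for `backoff_delay(n, response_headers.get("Retry-After"))` between attempts and give up after a fixed number of retries.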

Relationship to web-browsing

web-browsing is the routing layer ("which tier to use for this URL?")
academic-research is the deep-dive layer ("how do I write OpenAlex filter parameters? what email does Unpaywall want?")
The two are complementary. If you're just scraping one publisher page and don't need the OA ladder, web-browsing's extract_page.py is lighter.

Known caveats

Unpaywall's email parameter is required and must be real — placeholder addresses get HTTP 422. Set $LINGTAI_RESEARCH_EMAIL once.
CORE without an API key is harshly rate-limited (~100/day vs 10,000/day with a free key from https://core.ac.uk/services/api).
Semantic Scholar free tier is very tight (about 100 requests per 5 minutes). Request a key for any serious citation-network work.
Google Scholar requires a stealth browser (camoufox or playwright-stealth v2); legacy playwright_stealth API does not work.
arXiv enforces HTTPS — HTTP requests are 301-redirected automatically.
Library Genesis legality varies by jurisdiction — use is the user's responsibility. Pass --no-libgen to opt out.
Publisher-page extraction (Tier 5) is an in-house, self-contained extractor: stdlib + requests, with no other third-party dependency and nothing extra to install (it replaces the old broken zhiping0913/Download_paper path, issue #136). It fetches the already-accessible official article / DOI landing page and parses citation_* metadata + the article body into structured Markdown with a provenance/limitations footer. It performs no paywall/CAPTCHA bypass and no cookie/credential handling; official pages only. A login/paywall interstitial, an unsupported DOI prefix, or a page with too little readable text is a clean miss, and the ladder falls through. Skip the tier with --no-publisher-extract. The extraction is heuristic (equations/figures/tables may be lost), so treat the artifact as a convenience copy, not a typeset full text. See reference/publisher-page-extraction.md.
Authorized-publisher tier (5b) only uses access you already have — it follows the official DOI landing page and grabs a same-host publisher PDF, validating %PDF- bytes and Content-Type before saving. It never bypasses paywalls/CAPTCHAs and never reads, stores, or replays cookies/credentials; most institutional access is IP-based, so on a licensed network a plain HTTP GET may work. Off-network it harmlessly misses. Pass --no-institutional to disable in legal-sensitive environments. Cookies/auth headers are never written to provenance. See reference/authorized-publisher-access.md.
Zotero institutional full-text is a human-in-the-loop handoff, not an agent tool (Tier 6a) — when OA and the Tier-5b authorized-publisher probe both miss for a batch, and the user runs Zotero Desktop on an institutional network, the agent stages the failed rows into Zotero via /connector/saveItems with a dated batch tag, the human selects the tagged items and clicks Find Full Text in the Zotero UI, then the agent harvests the resulting PDFs back with provenance (%PDF- validation, copy-not-move, resolved_by: human Find Full Text). Zotero's Local API item routes are GET-only and the Connector resolver set (oa+custom) is narrower than UI Find Full Text, so no HTTP endpoint exposes the broad path. Do not drive the UI via AppleScript/Accessibility — macOS TCC blocks it (-1719/-1743) and it is out of policy. No paywall bypass, no credential/cookie/session handling, no automatic LibGen fallthrough. See reference/zotero-institutional-fulltext-handoff.md.
Drafting citation-bearing academic writing without verifying sources first — under "fast"/"readable draft" pressure the model flattens DOI-verified papers, preprints, and search leads into one confident evidence layer, which reads as submission-ready and invites reviewer rejection. Pass the evidence-verification gate before drafting: verify a literature matrix first, keep A/B/C reliability tiers separate in the prose, and reduce scope (not verification) when speed is requested.
Writing an empirical paper iteratively can drift from the data — reviewer rounds make prose more polished and internally consistent without verifying it still matches the experiments on disk. Reviewer agreement is text-consistency evidence, not data-correspondence evidence. Anchor every claim to data files/runner code before writing, and re-derive (don't just rewrite) when feedback flags confusion. See reference/anti-pattern-text-consistency-vs-data-correspondence.md.
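
The `%PDF-` and Content-Type check described for the authorized-publisher tier (5b) above can be sketched as a small predicate. `looks_like_pdf` is a hypothetical name, and accepting only `application/pdf` is an assumption; the real script may accept other media types.

```python
def looks_like_pdf(body: bytes, content_type: str = "") -> bool:
    """True only when the bytes start with the PDF magic and the header says PDF."""
    # Drop parameters such as "; charset=binary" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return body.startswith(b"%PDF-") and media_type == "application/pdf"


# An HTML login page served with a PDF content type is still rejected.
print(looks_like_pdf(b"<html>Sign in</html>", "application/pdf"))
# prints False
```

Checking both signals means a paywall interstitial is recorded as a clean miss instead of being saved as `paper.pdf`.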

Found a bug or issue? If you encounter any problems with this skill, load the lingtai-issue-report skill and follow its instructions to report it.