sd-paper-detail
| name | sd-paper-detail |
| description | Extract full metadata from a ScienceDirect article page (abstract, authors, keywords, DOI, references, PDF link). Use when the user wants details about a specific paper. |
| argument-hint | [PII or article URL] |
Extract complete metadata from a ScienceDirect article page.
Determine the article URL from $ARGUMENTS:
- PII (e.g. S0957417426005245): URL is {BASE_URL}/science/article/pii/{PII}
- DOI: URL is https://doi.org/{DOI} (will redirect to ScienceDirect)

Use navigate_page with initScript to prevent bot detection:
navigate_page({
  url: "{article_url}",
  initScript: "Object.defineProperty(navigator, 'webdriver', {get: () => undefined})"
})
If the article is already open in the current tab, you can skip this and go directly to Step 3.
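The URL rules at the start of this step can be sketched as a small helper. This is a minimal illustration, not part of the skill itself: the function name `resolveArticleUrl`, the default base URL, and both regular expressions are assumptions chosen for the example.

```javascript
// Hypothetical helper: maps the skill argument to an article URL.
// The default baseUrl and the DOI/PII patterns are assumptions for illustration.
function resolveArticleUrl(arg, baseUrl = 'https://www.sciencedirect.com') {
  const value = String(arg).trim();
  // Full URL: use as given.
  if (/^https?:\/\//i.test(value)) return value;
  // DOI: resolve through doi.org, which redirects to ScienceDirect.
  if (/^10\.\d{4,9}\/\S+$/.test(value)) return `https://doi.org/${value}`;
  // PII: "S" followed by 16 alphanumeric characters (assumed shape).
  if (/^S[0-9A-Z]{16}$/i.test(value)) return `${baseUrl}/science/article/pii/${value}`;
  throw new Error(`Unrecognized article identifier: ${value}`);
}

// resolveArticleUrl('S0957417426005245')
// → "https://www.sciencedirect.com/science/article/pii/S0957417426005245"
```

Throwing on unrecognized input, rather than guessing, lets the caller ask the user for a PII or URL instead of navigating to a wrong page.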
After navigation, verify:
When a captcha page is detected (title "请稍候…", the Chinese-locale "Just a moment…" page, or body contains "Are you a robot"):

1. Use evaluate_script to poll for #captcha-box (up to 8s).
2. Use take_snapshot to get the a11y tree. The checkbox is inside a cross-origin iframe, but Chrome DevTools MCP can see it. The labels below come from a Chinese-locale browser and read "widget containing a Cloudflare security challenge" and "verify you are human":
   Iframe "包含 Cloudflare 安全质询的小组件"
   checkbox "确认您是真人" ← target uid
3. Call click(uid) on the checkbox element.
4. Check whether document.contentType changed or the page URL changed (indicating success). If still on the captcha, retry once or fall back to asking the user.

Use evaluate_script with built-in waiting. Do NOT use wait_for; it returns oversized snapshots on article pages.
async () => {
  // Wait for article content to load (up to 10s: 20 polls, 500 ms apart)
  for (let i = 0; i < 20; i++) {
    if (document.querySelector('.title-text') || document.querySelector('#abstracts')) break;
    await new Promise(r => setTimeout(r, 500));
  }
  const result = {};
  // Title
  result.title = document.querySelector('.title-text')?.textContent?.trim() || '';
  // Authors (primary selector, then a fallback that strips trailing footnote digits)
  result.authors = [...document.querySelectorAll('.author-group .react-xocs-alternative-link')]
    .map(el => el.textContent.trim())
    .filter(Boolean);
  if (result.authors.length === 0) {
    result.authors = [...document.querySelectorAll('.author-group .button-link-text')]
      .map(el => el.textContent.trim().replace(/\s*\d+$/, ''))
      .filter(Boolean);
  }
  // Affiliations
  result.affiliations = [...document.querySelectorAll('.affiliation dd')]
    .map(el => el.textContent.trim()).filter(Boolean);
  // Abstract (content is in a div, not p tags); default to '' so the field is always a string
  result.abstract = '';
  const absDiv = document.querySelector('.abstract.author');
  if (absDiv) {
    const h2 = absDiv.querySelector('h2');
    const contentDiv = h2?.nextElementSibling;
    result.abstract = contentDiv?.textContent?.trim() || '';
  }
  if (!result.abstract) {
    // Fallback: try p tags inside #abstracts
    const absSection = document.querySelector('#abstracts, .Abstracts');
    if (absSection) {
      const absParagraphs = absSection.querySelectorAll('p');
      result.abstract = [...absParagraphs].map(p => p.textContent.trim()).join(' ');
    }
  }
  // Highlights (field is set only when the article has them)
  const highlights = [...document.querySelectorAll('.author-highlights li')];
  if (highlights.length > 0) {
    result.highlights = highlights.map(li => li.textContent.trim());
  }
  // Keywords (de-duplicated)
  const kwSet = new Set();
  document.querySelectorAll('.keyword span').forEach(k => kwSet.add(k.textContent.trim()));
  result.keywords = [...kwSet];
  // DOI (first link on the page that points to doi.org)
  const doiLink = document.querySelector('a[href*="doi.org"]');
  result.doi = doiLink?.href || '';
  // Journal & volume info
  result.journal = document.querySelector('.publication-title-link')?.textContent?.trim() || '';
  const volText = document.querySelector('.publication-volume .text-xs, .publication-volume')?.textContent?.trim() || '';
  result.volumeInfo = volText;
  // PII
  result.pii = window.location.pathname.match(/pii\/(\w+)/)?.[1] || '';
  // Article type
  result.articleType = document.querySelector('.article-dochead')?.textContent?.trim() || '';
  // PDF link
  const pdfLink = document.querySelector('a[href*="pdfft"][class*="accessbar"]');
  result.pdfUrl = pdfLink?.href || '';
  // Publication dates
  const dateInfo = document.querySelector('.publication-history');
  result.dates = dateInfo?.textContent?.trim() || '';
  // References count
  result.referenceCount = document.querySelectorAll('.reference, .bibliography li').length;
  // Section headings (article structure)
  result.sections = [...document.querySelectorAll('.Body h2, .article-content h2')]
    .map(h => h.textContent.trim()).filter(Boolean);
  return result;
}
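The content wait at the top of the script and the captcha poll for #captcha-box (up to 8s) follow the same pattern: check a condition at a fixed interval until it holds or a deadline passes. A generic sketch of that pattern follows; the helper name `pollFor` and its default timings are assumptions for illustration, not part of the skill.

```javascript
// Hypothetical helper: resolves with the first truthy value returned by
// `check`, or with null once `timeoutMs` has elapsed without one.
async function pollFor(check, timeoutMs = 8000, intervalMs = 500) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const value = check();
    if (value) return value;
    if (Date.now() >= deadline) return null;
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
}

// Inside an evaluate_script function, the captcha poll might look like:
//   async () => Boolean(await pollFor(() => document.querySelector('#captcha-box')))
```

Returning null on timeout, rather than throwing, keeps the decision about retrying or asking the user with the caller.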
Format the output clearly:
## {title}
**Authors**: {authors}
**Journal**: {journal}, {volumeInfo}
**DOI**: {doi}
**Type**: {articleType}
**PII**: {pii}
### Abstract
{abstract}
### Highlights
- {highlight1}
- {highlight2}
### Keywords
{keywords}
### Article Structure
{sections}
**References**: {referenceCount} cited
**PDF**: {pdfUrl or "Not available"}
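As an illustration, the template above could be filled from the object returned by the extraction script roughly as follows. The function name `formatPaper` and the choice of ", " as the separator for authors, keywords, and sections are assumptions; the Highlights block is emitted only when the field is present, matching how the script sets it.

```javascript
// Hypothetical formatter for the extraction result; separators are assumed.
function formatPaper(r) {
  const lines = [
    `## ${r.title}`,
    `**Authors**: ${(r.authors || []).join(', ')}`,
    `**Journal**: ${[r.journal, r.volumeInfo].filter(Boolean).join(', ')}`,
    `**DOI**: ${r.doi}`,
    `**Type**: ${r.articleType}`,
    `**PII**: ${r.pii}`,
    '### Abstract',
    r.abstract || '',
  ];
  // Highlights are optional, so the heading is skipped when there are none.
  if (r.highlights && r.highlights.length > 0) {
    lines.push('### Highlights', ...r.highlights.map(h => `- ${h}`));
  }
  lines.push('### Keywords', (r.keywords || []).join(', '));
  lines.push('### Article Structure', (r.sections || []).join(', '));
  lines.push(`**References**: ${r.referenceCount ?? 0} cited`);
  lines.push(`**PDF**: ${r.pdfUrl || 'Not available'}`);
  return lines.join('\n');
}
```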
| Element | Selector |
|---|---|
| Title | .title-text |
| Authors | .author-group .react-xocs-alternative-link |
| Affiliations | .affiliation dd |
| Abstract | .abstract.author (div after h2) |
| Highlights | .author-highlights li |
| Keywords | .keyword span |
| DOI link | a[href*="doi.org"] |
| Journal name | .publication-title-link |
| Volume/issue | .publication-volume .text-xs |
| PDF link | a[href*="pdfft"][class*="accessbar"] |
| References | .reference, .bibliography li |
| Section headings | .Body h2, .article-content h2 |
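Class names like these can change when the site is updated, so it may help to check which selectors still match before trusting empty fields. The sketch below is a hypothetical diagnostic, not part of the skill; `findMissingSelectors` and the `SELECTORS` subset are names introduced here for illustration.

```javascript
// Hypothetical check: returns the names of selectors that match nothing.
// `root` is anything exposing querySelectorAll (e.g. `document` in the page).
function findMissingSelectors(root, selectors) {
  return Object.entries(selectors)
    .filter(([, selector]) => root.querySelectorAll(selector).length === 0)
    .map(([name]) => name);
}

// A subset of the table above, keyed by element name.
const SELECTORS = {
  title: '.title-text',
  abstract: '.abstract.author',
  keywords: '.keyword span',
  journal: '.publication-title-link',
};

// In the page: findMissingSelectors(document, SELECTORS)
```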
- PDF URL pattern: {BASE_URL}/science/article/pii/{PII}/pdfft?md5={hash}&pid=1-s2.0-{PII}-main.pdf
- The PDF URL contains an md5 hash that must be extracted from the page; it cannot be constructed.
- Include initScript on every navigate_page call to prevent bot detection.
- Tool calls: navigate_page + evaluate_script.
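Because the md5 value cannot be constructed, a caller may want to confirm that an extracted pdfUrl actually carries one before using it. The following is a minimal sketch using the standard URL API; the function name `parsePdfUrl` and the returned field names are assumptions for illustration, and the hash in the usage comment is a placeholder, not a real value.

```javascript
// Hypothetical validator: splits a pdfft URL into its parts, or returns
// null when the URL does not match the pattern or lacks an md5 parameter.
function parsePdfUrl(pdfUrl) {
  let url;
  try {
    url = new URL(pdfUrl);
  } catch {
    return null; // empty or malformed URL
  }
  const match = url.pathname.match(/\/science\/article\/pii\/(\w+)\/pdfft$/);
  const md5 = url.searchParams.get('md5');
  if (!match || !md5) return null;
  return { pii: match[1], md5, pid: url.searchParams.get('pid') };
}

// parsePdfUrl('https://www.sciencedirect.com/science/article/pii/S0957417426005245/pdfft?md5=abc123&pid=1-s2.0-S0957417426005245-main.pdf')
// → { pii: 'S0957417426005245', md5: 'abc123', pid: '1-s2.0-S0957417426005245-main.pdf' }
```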