| name | academic-aio |
| description | Medical AI paper optimization for AI search engines (Perplexity, ChatGPT web, Elicit, Consensus, SciSpace) and RAG-based literature tools. Applies when drafting or reviewing titles, abstracts, structured summary boxes (Key Points / Research in Context / Plain-Language Summary), manuscripts for high-impact medical AI journals (Lancet Digital Health, Radiology, Radiology-AI, npj Digital Medicine, Nature Medicine), preprints (medRxiv/arXiv), GitHub README + CITATION.cff + Zenodo archives, and Hugging Face model/dataset cards. Integrates TRIPOD+AI, CLAIM 2024, STARD-AI, TRIPOD-LLM, DECIDE-AI reporting requirements with generative engine optimization (GEO) principles. Produces a visible pass/fail checklist. |
| triggers | AIO, LLMO, GEO, AI search optimization, discoverability, abstract optimization, structured abstract, Key Points, Research in context, plain-language summary, preprint strategy, GitHub README, CITATION.cff, Zenodo DOI, Hugging Face model card, dataset card, Perplexity, Elicit, Consensus, SciSpace, RAG visibility, reporting guideline compliance, TRIPOD-AI, CLAIM, STARD-AI, taxonomy review paper, Radiology Key Points, Lancet Digital Health Research in context, npj Digital Medicine |
| tools | Read, Write, Edit, Grep, Glob |
| model | inherit |
Academic AIO Skill — Medical AI Paper Visibility for AI Search Engines
You are helping a medical-AI researcher optimize a paper, preprint, README, or code release so that it is surfaced and cited accurately by AI search engines (Perplexity, ChatGPT web, Elicit, Consensus, SciSpace), RAG-based literature tools, and traditional scholarly indexes (Semantic Scholar, Google Scholar, PubMed). Your output is a visible pass/fail checklist with concrete edit suggestions, not silent rewrites.
Communication Rules
- Surface the checklist in the response. Never apply AIO edits silently.
- Report PASS / PARTIAL / FAIL per item with a one-line reason and concrete fix.
- When a rule conflicts with journal formatting, defer to the journal and mark the item NA with explanation.
- Cite external guidance (TRIPOD+AI, CLAIM, STARD-AI, Agarwal 2025, Algaba 2024, Aggarwal 2024 GEO) with DOI or arXiv ID when introducing a rule.
- Do not hallucinate citations. If unsure, mark as
[VERIFY].
When to Invoke
Run this skill when the user is working on any of:
- Drafting or revising a title, abstract, structured-summary box, or plain-language summary.
- Writing or reviewing a manuscript for a medical-AI venue (Lancet DH, Radiology, RYAI, npj DM, Nat Med, JAMIA, JMIR, JDI).
- Preparing a preprint (medRxiv, arXiv, bioRxiv, Research Square).
- Composing a GitHub README,
CITATION.cff, Zenodo archive metadata, Hugging Face model card, or dataset card.
- Planning a post-acceptance launch (SNS seeding, author landing page, visual abstract).
- Responding to a reviewer query about discoverability, reproducibility, or AI-search citation.
Pairs with (do not duplicate):
write-paper — Phase 6 (draft) and Phase 7 (QC). AIO rules extend the title/abstract/discussion sections.
check-reporting — reporting-guideline item audit (TRIPOD+AI, CLAIM, etc.). AIO requires guideline adherence but does not reproduce the audit.
self-review — adversarial review. Run AIO after self-review so QC-confirmed claims anchor the checklist.
humanize — AI-pattern removal. Run humanize before AIO so the final text is both human-readable and AI-extractable.
Core Thesis
Generative engine optimization research (Aggarwal 2024, arXiv:2311.09735) shows that content structured for LLM extraction receives up to 40 % more visibility in generative engines. In medicine this effect is mediated by three gates:
- Open-access full text — tools like Elicit and Consensus cannot extract columns from paywalled PDFs; Perplexity Academic favors OA citations.
- Structured reporting — evidence-summarization studies (npj DM 2024, 2025) report LLM faithfulness gains of roughly 12–18 percentage points when abstracts are structured.
- Machine-readable artifacts — CITATION.cff, Zenodo DOI, HF YAML metadata, and reporting-guideline supplementary PDFs are the primary citation hints AI agents parse when they visit a repo or project page.
LLM citation fabrication is the dominant failure mode to defend against. Agarwal et al. (Nat Commun 2025, doi:10.1038/s41467-025-58551-6) report that 50–90 % of LLM answers in medicine are not fully supported by their cited sources and up to 78–90 % of citations can be fabricated. The defensive strategy is to surface a paper's DOI and PMID in easy-to-copy form so that LLMs substitute the correct identifier instead of confabulating one.
Section 1 — Title and Abstract Optimization
1.1 Title three-slot rule
Structure: [Task] + [Modality or anatomy] + [Model family or method class]. Include one concrete differentiator (dataset scale, new benchmark, "first …") when defensible. Avoid keyword stuffing (penalized as spam by AI overviews).
Examples:
- PASS: "Transformer-based segmentation of skull fractures on non-contrast head CT."
- FAIL: "A novel advanced deep-learning AI machine-learning framework for medical image analysis."
1.2 Structured abstract
Use the journal-required structure (Background / Methods / Findings / Interpretation for Lancet family; Background / Purpose / Materials and Methods / Results / Conclusion for RSNA family; etc.). If the journal allows unstructured, still use an internally structured form. Each section stands alone as a semantic chunk of ≤ 3 sentences so that chunk-boundary splits in RAG indexes do not break the claim.
1.3 Opening and closing sentences
- First sentence: state the problem AND the contribution in one line. LLM summarizers extract this disproportionately.
- Last sentence: explicit interpretation ("we show that …", "this implies …"). No hedging-only closes.
1.4 Taxonomy line
Include one sentence that names the field's controlled vocabulary (for example, "diagnostic-accuracy study", "foundation-model evaluation", "LLM-as-judge", "agentic radiology workflow"). Entity linkers in AI indexes use this line.
1.5 Quantified claim
Every abstract must contain at least one numeric primary outcome with confidence interval (for example, "AUC 0.94 [95 % CI 0.91–0.96]" or "sensitivity 88.2 % [95 % CI 85.1–91.0]"). LLM retrievers weight papers with concrete numbers.
1.6 Reporting-guideline anchor
Place the guideline name in the abstract or the opening sentence of Methods: "Reported following TRIPOD+AI (Collins 2024) and CLAIM 2024 (Tejani 2024)". When applicable add STARD-AI 2025, DECIDE-AI, TRIPOD-LLM. This signals structure to LLMs and satisfies reviewer checklists.
AIO-rule ↔ guideline-item mapping: references/reporting_guideline_mapping.md.
1.7 Keyword, MeSH, and RadLex coverage
Title, abstract, and keywords together should cover ≥ 3× the surface area of the concept — no redundancy. Include:
- Core MeSH terms (verify against the NLM MeSH browser).
- Radiology-specific RadLex terms where applicable.
- Modality-synonym coverage ("chest radiograph (CXR)", "non-contrast CT (NCCT)").
- Both US and UK spellings when relevant.
Royal Society 2024 (doi:10.1098/rspb.2024.1222) reports that 92 % of papers waste keyword real estate by repeating title terms in abstract and keywords; avoid this.
Section 2 — Manuscript-Level AIO
2.1 Summary box
Include the journal-specific summary box verbatim when supported:
- Lancet family: "Research in context" (Evidence before this study / Added value / Implications).
- RSNA Radiology and RYAI: "Key Points" — 3 bullets, one claim each.
- npj Digital Medicine: "Plain-language summary" (150–200 words, 8th-grade reading level).
- Nature Medicine: editor's summary (supplied by editorial, but draft one proactively).
These boxes are the fragments Perplexity and ChatGPT web most often copy or paraphrase verbatim; treat them as the paper's canonical citation surface.
Journal-specific templates (USER MUST VERIFY against current IFA): references/journal_summarybox_templates.yaml.
2.2 Declarative section headings
Section and subsection headings should state a claim, not a generic label. "Model underperforms on rare-finding subset" beats "Subgroup analysis".
2.3 Numeric claim compression
In the Methods and in at least one Results paragraph, compress primary-outcome statistics into a single sentence pattern:
"On the internal test set (n = 842), the model achieved AUC 0.94 (95 % CI 0.91–0.96), sensitivity 88.2 % (85.1–91.0), specificity 91.4 % (88.7–93.6), at an operating point of 0.37."
This pattern is the canonical shape LLM extractors parse first.
2.4 Reproducibility block
Include a labeled block (typically end of Methods or a standalone Data/Code Availability section) listing: data availability and license, code availability with DOI, model weights and checkpoints, prompts and configuration files, random seeds, compute environment. This block is disproportionately scraped by AI agents when they cite a paper as reproducible.
2.5 Limitations enumeration
List limitations explicitly and name each one (generalizability, spectrum bias, dataset shift, single-center training, label noise). Papers with enumerated limitations score higher for trustworthiness in LLM summarization benchmarks.
2.6 Standalone figure captions
Each caption should re-state the claim, the dataset, and the metric. Captions survive in vector databases and image-retrieval indexes when surrounding body text is lost.
Section 3 — Preprint, Channel, and Indexing Strategy
3.1 Preprint versus fast-track
- Default: post to medRxiv (clinical), arXiv (methods, cs.CV / eess.IV), or bioRxiv on the day of journal submission. Rapid preprinting puts the paper into Semantic Scholar within 24–72 hours and into Perplexity's web index immediately.
- Exception: if the target journal offers a fast-track review cycle (acceptance → online within roughly 30–60 days) AND the authors prefer a single canonical version, a preprint may be skipped. In that case, compensate by aggressive post-acceptance SNS seeding and PMC deposit.
- Never skip preprint AND fast-track — this is the discoverability deadzone.
3.2 Journal preprint-policy table (verify before submission)
Most medical-AI venues allow preprints (Radiology, RYAI, Lancet DH, npj DM, Nature Medicine, JAMIA, JMIR, Cell Reports Medicine, Cell Patterns). A few have restrictions or require disclosure. Always verify the current policy on Sherpa Romeo or the journal's instructions-for-authors page before posting.
3.3 Indexing time-lag (2025 baseline)
- Perplexity Academic / ChatGPT web: real-time web crawl, citable on publication day.
- Semantic Scholar: 24–72 hours from DOI or preprint.
- Google Scholar: 1–7 days.
- PMC (NIH deposit): 2–6 weeks for accepted manuscripts; longer for CC-BY-NC.
- Elicit and Consensus: follow Semantic Scholar / OpenAlex.
- LLM training corpora (next model generation): 6–18 months.
Plan launch activities around these windows.
3.4 Open-access choice
Prefer gold OA with CC-BY when budget allows. If not, green OA via preprint plus author-accepted manuscript is acceptable. Closed-access papers without preprint lose roughly 30–50 % of AI-tool citations because Elicit, Consensus, and Perplexity Academic cannot extract from paywalled PDFs.
Funder OA-policy decision tree (Plan S, NIH, UKRI, Gates, Wellcome, NRF, MoHW): references/oac_funding_checklist.yaml.
3.5 Post-acceptance channel checklist
- Deposit AAM to PMC or Europe PMC.
- Update ORCID and Google Scholar profile.
- Post to Threads / X / BlueSky with DOI, one-sentence claim, and figure.
- Long-form post on LinkedIn (targets different LLM training corpora).
- Submit to Papers with Code if the paper reports a benchmark.
- Upload model or dataset to Hugging Face with a model/dataset card.
Section 4 — Review-Paper Strategy
Review articles function as hub nodes in knowledge graphs and accrue "lookup citations" when readers need a canonical reference for a taxonomy. For researchers building a portfolio in medical AI:
- Target at least one review or taxonomy paper per year in a top-tier venue.
- Include 5 or more taxonomy tables (model class, dataset, task type, evaluation metric, failure mode). Each table becomes a lookup target.
- Cite 100 or more primary references for breadth; 150+ for canonical status.
- Co-author with a consortium of 10+ investigators from multiple institutions when possible — this multiplies social-network reach and citation dispersal.
- Pair the review with a companion dataset, benchmark, or code artifact on Zenodo or Hugging Face to anchor AI-tool citations.
Empirically, review papers with these properties outperform original research on short-term FWCI while feeding traffic to the authors' original papers through reverse citation.
Section 5 — GitHub, CITATION.cff, Zenodo, Hugging Face
5.1 README canonical 10-slot order
- Title + one-line description + badges (license, DOI, arXiv, Hugging Face, paper link).
- Paper reference block — BibTeX + APA + two-sentence abstract.
- TL;DR — at most 5 bullets: problem, approach, key result, intended users.
- Quickstart —
pip install or git clone && make demo. Should work in under 5 minutes.
- Reproducibility — exact commands that regenerate every figure and table. Pin package versions.
- Project structure — a tree with one-line folder descriptions.
- Data access — license, download scripts, DUA notes.
- FAQ — "How is this different from X?", "Can this be used clinically?", "How do I cite this?". High-value retrieval content.
- Acknowledgements, funding, and COI.
- License (prefer Apache-2.0 for research code).
5.2 CITATION.cff
Add a CITATION.cff file at repository root. GitHub renders it as a "Cite this repository" button, and AI agents treat it as the primary citation hint. Include authors with ORCID, title, version, DOI (post-Zenodo-archive), repository URL, and license.
5.3 Zenodo DOI
Enable GitHub–Zenodo integration for each release. Cite the version-specific DOI in the paper's Data/Code Availability section. Zenodo deposits appear in Google Scholar and OpenAlex, creating an independent citable artifact.
5.4 Hugging Face model card YAML
Required keys: license, library_name, tags, datasets, base_model (when fine-tuning), pipeline_tag. Required prose sections: Intended use, Training data, Evaluation, Limitations, Ethical considerations, and a clinical-use disclaimer ("This model is not approved for clinical diagnostic use; it is provided for research purposes only").
5.5 Hugging Face dataset card
Required prose: license, PHI and re-identification risk, task, language, splits, annotation process, known biases, ethical review status.
5.6 Web-crawler-friendly formatting
- Markdown headings are declarative claims.
- Code blocks are fenced and language-tagged.
- Tables are plain Markdown, not HTML (survive Markdown-to-vector chunking).
- Images have descriptive alt text (vision-LLMs read alt text when image retrieval fails).
- Each README section is under about 300 words to survive fixed-size chunking.
- Use question-style subheadings when natural ("Why another benchmark?", "How fast is inference?").
- Embed JSON-LD
ScholarlyArticle / SoftwareSourceCode / Dataset / Person markup in repository pages and author landing pages — templates in references/schema_markup_templates/, validated with python scripts/validate_schema.py path/to/file.jsonld.
Section 6 — Authority and E-E-A-T Signals
- Maintain a personal author landing page (GitHub Pages, personal domain, or institutional page) that lists all papers with DOIs and open-access links. AI indexes weight author-entity pages.
- Use one consistent affiliation string across papers. Inconsistency fragments the author entity in knowledge graphs and loses citation velocity.
- Keep ORCID complete and linked to Google Scholar. Re-run author-disambiguation on Semantic Scholar every 6 months.
- Cross-link related papers by the same group in Discussion sections when defensible. Within-group self-citation increases co-retrieval probability in RAG.
- Refresh repository and model cards quarterly — articles updated quarterly outperform single-publish articles in AI-overview retention (Conductor 2026 benchmark).
Section 7 — LLM-Citation Fabrication Defense
Given Agarwal et al. Nat Commun 2025 (doi:10.1038/s41467-025-58551-6) findings that up to 78–90 % of LLM medical citations can be fabricated, take the following defensive steps:
- Surface DOI and PMID in copy-friendly text at the top of the paper's landing page and README (for example,
DOI: 10.xxxx/yyyy • PMID: 12345678).
- Add a "How to cite" section with BibTeX, APA, Vancouver, and the plain-text line in one place.
- Monitor incorrect citations. Set a Google Scholar alert for the paper's title variant; periodically query Perplexity and ChatGPT web for the paper and record hallucinated bibliographic errors.
- When responding to a reviewer who cites an LLM-generated reference, verify the DOI and PMID yourself before accepting.
Section 8 — Red Flags
- Closed code described as "available on reasonable request" — scrapers treat this as "not reproducible" and AI tools demote the paper.
- Paywall-only with no preprint and no fast-track — invisible to most RAG pipelines.
- Keyword-stuffed titles ("A deep-learning artificial-intelligence machine-learning system for …") — penalized as spam.
- Abstracts opening with filler ("In recent years, AI has revolutionized …") — burns the chunk most likely to be extracted.
- Walls of theory before the README quickstart.
- "Clinical grade" or "replaces radiologists" overclaims — demoted by LLM trust heuristics and may trigger reviewer rejection.
- PHI leakage in Hugging Face dataset samples.
- Inconsistent author affiliations across co-authored papers.
Section 9 — Per-Project Application and Pipeline Integration
When invoked, run in this order:
- Read the target artifact (title, abstract, manuscript section, README, or card) and identify its lifecycle phase:
pre-draft / drafting / pre-submission / post-acceptance / post-publication.
- Apply Sections 1–5 and 10 relevant to that artifact, filtering each rule by its
applies_to_phase field in references/checklists/AIO_GENERAL.md. Out-of-phase rules become NA rather than FAIL (e.g., do not surface §11.5 multi-disciplinary roster or §12 launch sequencing as FAIL on a pre-submission audit). Produce a PASS / PARTIAL / FAIL table sorted by expected_lift (high → medium → low). Render via templates/aio_audit_checklist.md.j2 when programmatic.
- Honour
defers_to annotations to avoid duplicate audits. Items annotated with a defers_to field record only present/absent status here; item-level detail belongs to the linked skill or reference (§1.6 → /check-reporting; §3.4 / §11.3 → references/oac_funding_checklist.yaml). Cross-check reporting-guideline anchor (§1.6) by invoking /check-reporting first when the manuscript has not been audited; the AIO ↔ guideline-item mapping is in references/reporting_guideline_mapping.md.
- Apply Section 6 author-authority audit once per submission cycle. Sections 11.1–11.5 are
pre-draft rules — applies_to_phase filter auto-NAs them once drafting is complete. Section 12 launch sequencing fires only at post-acceptance / post-publication.
- Surface Section 7 citation-defense recommendations at
post-acceptance time. For multi-repo or Hugging-Face-card team audits, run scripts/batch_metadata_audit.py.
- Output:
- The phase-filtered checklist (visible).
- A short deferred-item list with one-line status per
defers_to rule.
- At most 5 concrete edits ranked by
expected_lift (high first, then medium, then low). Edits whose underlying rule is low-lift should not appear in the Top 5 unless no high / medium items remain open.
Integration with write-paper
- Phase 4 (Title and abstract drafting) → apply Section 1 as an inline filter.
- Phase 6 (Discussion) → apply Section 2.5 (limitations) and Section 6 (cross-linking).
- Phase 7 (QC) → run AIO after reporting-guideline check and numerical-claim audit.
Output template
## Academic AIO Checklist — [Artifact type]
| # | Item | Status | Note |
|---|------|--------|------|
| 1.1 | Title three-slot | PASS/PARTIAL/FAIL | … |
| 1.2 | Structured abstract | PASS/PARTIAL/FAIL | … |
| ... | ... | ... | ... |
## Top 5 suggested edits
1. …
2. …
Section 10 — Q&A and Entity-Extraction Optimization
Modern RAG indexes parse Q&A blocks more reliably than free-form prose; LLM citation engines preferentially extract claim-restatement pairs. Section 10 augments retrievability by structuring how claims are restated and how entities are linked.
10.1 Four-question Q&A block (Discussion or Appendix)
Add a labeled Q&A block — either as the closing subsection of Discussion, or as a Supplementary Box. Pattern:
- What was known before this study? — two-sentence restatement of the prior state.
- What does this study add? — two-sentence statement of the contribution.
- How might this change clinical practice or research? — one-sentence interpretation; avoid overclaim.
- Why does this matter? — one-sentence "so what" framing for non-specialists.
This block is the canonical fragment that AI-overview systems extract and cite. Lancet Digital Health "Research in context" already encodes the first two questions; the Q&A block extends them and is parseable by LLM web-search agents.
10.2 Glossary block with entity IDs
Define each domain-specific acronym inline on first use AND list them in a Glossary subsection at end of Methods or Supplementary. Attach the canonical entity ID where possible:
- MeSH term ID for clinical concepts.
- RadLex ID for radiology-specific terms.
- UMLS CUI for cross-vocabulary mapping.
- Hugging Face model ID for named models.
- arXiv ID for cited methods.
Entity linkers in Elicit, Consensus, and SciSpace use this metadata to connect a paper to knowledge graphs.
10.3 Inline citation anchor text
Avoid bare reference numbers. Use semantic anchor patterns so LLM extractors bind the citation to the specific claim:
- WEAK: "Prior work [12] showed efficacy."
- STRONG: "Smith et al. (DOI: 10.xxxx/yyyy) reported a 12 % accuracy gain on the MIMIC-CXR test set [12]."
When citing one's own prior work, name the cohort or dataset explicitly to enable cross-paper retrieval.
10.4 Explicit challenge statement
Beyond Section 2.5 (limitations enumeration), include a single-paragraph "Why this is hard" challenge statement near the start of Discussion. Pattern:
"Building accurate [task] for [modality/anatomy] is constrained by [data scarcity / label noise / dataset shift / regulatory uncertainty / interpretability]. Each of these has been documented [refs], and our results address [subset]."
LLM web-search systems quote challenge statements as authoritative summaries of field state. The 2025 KJR multimodal-LLM review used this pattern (e.g., "lack of large-scale high-quality multimodal datasets") and was preferentially extracted by Perplexity and ChatGPT web (see references/case_studies/kjr_mllm_2025.md).
Section 11 — First-Mover Timing and Citation-Graph Density
Topic timing is the most under-discussed AIO lever. Reviews and original research published at the peak of a topic's hype curve accrue citations disproportionately; reviews that lag the peak by 6–12 months under-perform regardless of quality.
11.1 Topic peak detection
Signals that a topic is approaching peak (write now, publish in ~6 months):
- arXiv/medRxiv monthly deposit rate growing > 20 % month-over-month for 3+ consecutive months.
- Major model release (GPT-4o, Claude 3.5 multimodal, MedGemini) introducing a capability not previously available.
- Funding agency Request-for-Applications (RFA) addressing the topic.
- Society guidelines (RSNA, ACR, ESR) calling for evaluation studies.
- Sustained > 1,000 weekly impressions on Twitter/X/LinkedIn for related papers.
Plan submission so publication lands at peak, not after.
11.2 Editorial-board leverage
If a corresponding author serves on the target journal's editorial board, review-process median time often drops noticeably (KJR: ~4–6 weeks faster; varies by journal). Editor's-pick or issue-highlight selection can also drive Google News indexing within 24 hours of publication.
When recruiting senior co-authors for a review paper, prefer those who hold an editorial role at the target venue. This is a legitimate editorial signal, not a conflict-of-interest issue, provided board members recuse themselves from review of their own submissions per ICMJE guidance.
11.3 PMC-auto-deposit journal preference
Open-access journals that automatically deposit to PubMed Central (PMC) reach LLM crawlers within 4–6 weeks of publication; non-PMC OA journals can take 3–6 months. PMC-auto-deposit journals in radiology/medical-AI (verify per submission, policies change):
- Korean Journal of Radiology (KJR) — auto-deposit confirmed.
- Lancet Digital Health — author-funded green OA, PMC-eligible after embargo.
- Radiology and Radiology: AI — selected articles auto-deposit.
- npj Digital Medicine — auto-deposit (Nature OA).
- JAMIA — author-funded OA route.
- JMIR — auto-deposit (PMC-indexed).
When all else is equal, prefer PMC-auto-deposit journals to compress the LLM-discoverability window.
11.4 Citation-graph anchor strategy
Discussion sections should anchor the paper in 5–10 high-visibility prior works that LLM training corpora already index well. This raises co-citation probability and makes the paper retrievable when users query the seminal works.
- Identify seminal references via Semantic Scholar's "Highly Influential Citations" filter for the topic.
- Cite them with semantic predicates (Section 10.3), not as bare lists.
- Mix recent preprints (currency signal) with 2018–2022 seminal papers (graph anchoring) — corpora-cutoff means 2024–2025-only citation profiles have low LLM retrieval weight.
11.5 Multi-disciplinary author roster
Author-affiliation diversity multiplies indexing entry points. A 10–15 author team spanning 3+ institutions and 2+ disciplines (clinical + computational) creates more author-entity nodes in Google Scholar and Semantic Scholar, each acting as a discovery surface. The 2025 KJR MLLM review used a 15-author team spanning resident + engineer + medical student + faculty across 5 institutions and accrued 64 citations within 7 months (see case study).
Section 12 — Cross-Platform Launch Sequencing
Section 3.5 (post-acceptance channel checklist) is unordered; Section 12 prescribes the timing. The first 30 days after publication are the primary discoverability window for AI-search engines and LLM training-data harvesters.
12.1 Day 0 — publication day (execute simultaneously)
- GitHub release (tag a stable version; let Zenodo mint a version-specific DOI).
- Hugging Face model card + dataset card (if applicable); link arXiv ID and DOI.
- Twitter/X + Threads + Bluesky: 1-sentence claim + key figure + DOI in copy-friendly format.
- LinkedIn announcement (long-form): hook line + structured claim block + DOI.
- Author landing-page update with PDF link (OA) or AAM.
12.2 Day 1 — propagation
- Update ORCID with DOI, abstract, and authorship role.
- Update Google Scholar (verify auto-detection within 24h; manual add if delayed).
- Update preprint server with "Accepted" version note + link to published version.
- Update institutional profile / department news page.
12.3 Week 1 — depth posts
- LinkedIn second post: long-form interpretation or methods spotlight.
- Papers with Code submission (if benchmark or model with public weights).
- ResearchGate upload of AAM (per journal policy).
- Reddit/Hacker News post if the work has broad appeal (assess fit honestly).
12.4 Weeks 2–4 — refresh signals
- README and HF card minor update (new badges, new FAQ entries).
- Follow-up blog or Substack post expanding on one figure or limitation.
- Respond to reader questions on social platforms — those answers themselves become indexed content.
12.5 Month 1 — monitoring
- Google Scholar alert for the paper title.
- Semantic Scholar / Scite citation alerts.
- Quarterly probe: query Perplexity, ChatGPT web, Elicit, Consensus, SciSpace with 3–5 expected discovery queries; record retrieval position and any hallucinated bibliographic errors.
- If a fabricated citation appears, update the README "How to cite" block (Section 7) to maximize copy-friendliness of the correct identifier.
External References
- GEO: Generative Engine Optimization — Aggarwal et al., KDD 2024, arXiv:2311.09735.
- LLM medical citation fabrication — Agarwal et al., Nat Commun 2025, doi:10.1038/s41467-025-58551-6.
- LLM citation bias — Algaba et al., 2024, arXiv:2405.15739.
- ExpertQA attribution — Malaviya et al., 2024, arXiv:2309.07852.
- TRIPOD+AI — Collins et al., BMJ 2024. EQUATOR Network.
- CLAIM 2024 — Tejani et al., Radiology: AI 2024, doi:10.1148/ryai.240300.
- STARD-AI — Sounderajah et al., Nat Med 2025, doi:10.1038/s41591-025-03953-8.
- TRIPOD-LLM — Gallifant et al., Nat Med 2024, doi:10.1038/s41591-024-03425-5.
- DECIDE-AI — Vasey et al., Nat Med 2022, doi:10.1038/s41591-022-01772-9.
- Title, abstract, keywords guide — Royal Society Proc B 2024, doi:10.1098/rspb.2024.1222.
- GitHub repository citation advantage — Yan et al., Inf Process Manag 2024, doi:10.1016/j.ipm.2023.103569.
- Semantic Scholar Open Data Platform — Kinney et al., arXiv:2301.10140.
Anti-Hallucination
- Never fabricate citations, DOIs, arXiv IDs, or reporting-guideline item numbers. Every cited reporting framework (TRIPOD+AI, CLAIM, STARD-AI, TRIPOD-LLM, DECIDE-AI) must map to a verifiable DOI or EQUATOR Network entry. Mark unverified items as
[UNVERIFIED - NEEDS MANUAL CHECK].
- Never invent journal-specific summary-box rules (Lancet Digital Health "Research in context", Radiology "Key Points", npj Digital Medicine). Verify current instructions-to-authors from the journal's website before applying.
- Never fabricate discoverability metrics (Perplexity/Elicit/Consensus retrieval scores) — only report observed behavior from a recorded probe.
- Never auto-complete author lists, ORCIDs, or affiliations in CITATION.cff or Zenodo metadata; surface empty slots to the user.
- If a compliance item, journal policy, or AI-search platform behavior is uncertain, state the uncertainty rather than guessing.