Route Claude Code work by complexity, risk, and tool needs. Use when deciding how much reasoning depth a task needs, whether to read project memory first, whether the task should be decomposed, and whether the work is lightweight, standard, or investigation-heavy.

2026-04-08111

polymarket

AlexAI-MCP/hermes-CCC

Query Polymarket prediction markets for probability data and research insights on real-world events.

2026-04-08111

base-blockchain

AlexAI-MCP/hermes-CCC

Query Base (Ethereum L2) blockchain data — wallet balances, token info, transactions, gas analysis, contract inspection. No API key required.

2026-04-07111

blogwatcher

AlexAI-MCP/hermes-CCC

Monitor and summarize blog posts, RSS feeds, and web content for research and staying current with topics.

2026-04-07111

Source

AlexAI-MCP

AlexAI-MCP/hermes-CCC

Ouvrir le dépôt GitHub Voir les dépôts du créateur

Commande d'installation

Téléchargement

Exécuter dans Manus

Utile pourSOC

Développeurs de logicielsProfessions informatiques et mathématiques15-1252L4

name	arxiv
description	Search and retrieve academic papers from arXiv using their free REST API. No API key needed.
version	1.0.0
author	hermes-CCC (ported from Hermes Agent by NousResearch)
license	MIT
metadata	{"hermes":{"tags":["Research","Arxiv","Papers","Academic","API"],"related_skills":["research-paper-writing"]}}

arXiv

Purpose

Use this skill to search, inspect, and download academic papers from arXiv.
Prefer it for literature review, paper triage, and reproducible paper retrieval.
arXiv access is free and does not require an API key.
The primary API is Atom XML over HTTP.

Base Endpoint

Base URL: https://export.arxiv.org/api/query
Response format: Atom XML feed
Transport: HTTP GET
Authentication: none

Core Record Shape

id: canonical entry URL for the paper
title: paper title
summary: abstract text
authors: ordered author list
published: original publication timestamp
categories: arXiv subject tags
pdf_url: direct PDF URL when present

Useful Categories

cs.AI
cs.LG
cs.CL
stat.ML
cs.CV

Search Syntax

Search terms are passed in search_query=...
Field prefixes narrow the query:
ti: title search
au: author search
abs: abstract search
all: broad metadata search
Combine terms with AND, OR, and ANDNOT
URL-encode spaces as + or %20

Common Query Patterns

all:reasoning+AND+cat:cs.AI
ti:transformer+AND+cat:cs.CL
au:Goodfellow+AND+cat:cs.LG
abs:diffusion+AND+cat:cs.CV
all:reinforcement+learning+AND+cat:stat.ML

CLI Subcommands

/arxiv search
/arxiv get
/arxiv recent
/arxiv download

Subcommand Intent

/arxiv search: run a query and print matching entries
/arxiv get: fetch one known arXiv identifier and display parsed metadata
/arxiv recent: list recent submissions in a category or topic
/arxiv download: save the PDF locally from a known identifier or PDF URL

Search Example With curl

curl -s "https://export.arxiv.org/api/query?search_query=all:large+language+models+AND+cat:cs.CL&start=0&max_results=5&sortBy=relevance&sortOrder=descending"

Recent Example With curl

curl -s "https://export.arxiv.org/api/query?search_query=cat:cs.LG&start=0&max_results=10&sortBy=submittedDate&sortOrder=descending"

PDF Download Example With curl

curl -L "https://arxiv.org/pdf/2401.01234.pdf" -o 2401.01234.pdf

Fetch A Single Entry By Identifier

curl -s "https://export.arxiv.org/api/query?id_list=2401.01234"

Rate Limit Guidance

Keep requests at or below 3 req/sec
Sleep between loops when paginating large result sets
Cache parsed results locally if you will revisit them
Avoid hammering the endpoint with concurrent workers

Python Parsing Pattern

import xml.etree.ElementTree as ET
from urllib.request import urlopen

API_URL = "https://export.arxiv.org/api/query?search_query=all:reasoning+AND+cat:cs.AI&start=0&max_results=3"
NS = {"atom": "http://www.w3.org/2005/Atom"}

with urlopen(API_URL) as response:
    xml_bytes = response.read()

root = ET.fromstring(xml_bytes)

for entry in root.findall("atom:entry", NS):
    paper_id = entry.findtext("atom:id", default="", namespaces=NS)
    title = entry.findtext("atom:title", default="", namespaces=NS).strip()
    summary = entry.findtext("atom:summary", default="", namespaces=NS).strip()
    published = entry.findtext("atom:published", default="", namespaces=NS)
    authors = [
        author.findtext("atom:name", default="", namespaces=NS)
        for author in entry.findall("atom:author", NS)
    ]
    categories = [node.attrib.get("term", "") for node in entry.findall("atom:category", NS)]
    pdf_url = ""
    for link in entry.findall("atom:link", NS):
        if link.attrib.get("title") == "pdf":
            pdf_url = link.attrib.get("href", "")
            break
    print("id:", paper_id)
    print("title:", title)
    print("published:", published)
    print("authors:", ", ".join(authors))
    print("categories:", ", ".join(categories))
    print("pdf_url:", pdf_url)
    print("summary:", summary[:240], "...")
    print("-" * 60)

Minimal Search Helper

import xml.etree.ElementTree as ET
from urllib.parse import quote_plus
from urllib.request import urlopen

def search_arxiv(query: str, max_results: int = 5) -> list[dict]:
    encoded = quote_plus(query)
    url = (
        "https://export.arxiv.org/api/query"
        f"?search_query={encoded}&start=0&max_results={max_results}"
    )
    ns = {"atom": "http://www.w3.org/2005/Atom"}
    with urlopen(url) as response:
        root = ET.fromstring(response.read())
    rows = []
    for entry in root.findall("atom:entry", ns):
        authors = [
            node.findtext("atom:name", default="", namespaces=ns)
            for node in entry.findall("atom:author", ns)
        ]
        categories = [node.attrib.get("term", "") for node in entry.findall("atom:category", ns)]
        rows.append(
            {
                "id": entry.findtext("atom:id", default="", namespaces=ns),
                "title": entry.findtext("atom:title", default="", namespaces=ns).strip(),
                "summary": entry.findtext("atom:summary", default="", namespaces=ns).strip(),
                "authors": authors,
                "published": entry.findtext("atom:published", default="", namespaces=ns),
                "categories": categories,
            }
        )
    return rows

for row in search_arxiv("all:multimodal AND cat:cs.CV", max_results=3):
    print(row["title"])

Example AI And ML Queries

all:chain-of-thought AND cat:cs.AI
all:instruction tuning AND cat:cs.CL
all:reward modeling AND cat:cs.LG
all:vision transformer AND cat:cs.CV
all:mixture of experts AND cat:stat.ML
ti:retrieval augmented generation
abs:alignment AND cat:cs.AI

Practical Retrieval Workflow

Search broadly with all: and a category.
Narrow with ti: if the result set is noisy.
Pull the top 5 to 20 entries.
Parse title, summary, and categories.
Keep the paper id for citation and later retrieval.
Download only shortlisted PDFs.

Sorting And Pagination

start controls the starting offset
max_results controls page size
sortBy=relevance is useful for topic search
sortBy=submittedDate is useful for recent monitoring
sortOrder=descending is typical for recent feeds

ID And PDF Notes

arXiv IDs may appear as modern identifiers like 2401.01234
older identifiers can include subject prefixes
PDF links usually resolve as https://arxiv.org/pdf/<id>.pdf
the canonical entry page remains useful for metadata stability

Bulk Access Notes

Use the API for ordinary search and retrieval tasks.
For large-scale corpus access, arXiv also publishes bulk data and S3-style access paths in some workflows.
Bulk S3 access is appropriate for offline indexing, not for interactive one-off queries.
If you need many thousands of records, prefer bulk snapshots over high-frequency API pagination.

Failure Handling

If the feed is empty, print the final URL and query string first.
If XML parsing fails, save the raw response before retrying.
If pdf_url is missing, synthesize it from the parsed identifier.
If you get throttled, back off and reduce concurrency.

Good Defaults

Start with max_results=5
Use sortBy=relevance for topic search
Use sortBy=submittedDate for /arxiv recent
Keep summaries trimmed in terminal output
Persist parsed metadata as JSON if you will cite or compare papers later

Output Contract

Always print id
Always print title
Always print summary
Always print authors
Always print published
Always print categories
Always print pdf_url if available

When Not To Use This Skill

Do not use this as your only citation source for final camera-ready metadata.
Do not assume arXiv category tags equal venue topics.
Do not use the interactive API for massive mirror-scale ingestion jobs.