| name | search-as-code |
| description | Compose programmable search pipelines with the Search as Code Python SDK. |
Search as Code
Use this skill when a task needs nontrivial retrieval: many queries, source
constraints, evidence extraction, page fetching, dedupe, reranking, joins, or
aggregation. Prefer writing Python against sdk over serial tool calls.
Available SDK
sdk.search.web(query, limit=10)
sdk.search.web_many(queries, limit_per_query=10, concurrency=8)
sdk.fetch.url(url)
sdk.fetch.urls(urls, concurrency=8)
sdk.markdown.url(url)
sdk.markdown.urls(urls, concurrency=8)
sdk.markdown.convert_url(url)
sdk.filter.unique_urls(results)
sdk.filter.domains(results, include=[...], exclude=[...])
sdk.rank.lexical(query, results)
sdk.render.markdown(results, max_chars=8000)
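To make the dedupe and domain-filter steps concrete, here is a minimal plain-Python sketch of what such helpers might do. It is not the SDK's implementation: the result shape (a dict with a `url` key), the helper names `unique_by_url` and `filter_by_domain`, and the normalization rules are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def unique_by_url(results):
    """Keep the first result per URL, ignoring scheme, fragment, and a trailing slash."""
    seen = set()
    kept = []
    for item in results:
        parts = urlsplit(item["url"])
        key = (parts.netloc.lower(), parts.path.rstrip("/"), parts.query)
        if key not in seen:
            seen.add(key)
            kept.append(item)
    return kept


def host_matches(host, domain):
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def filter_by_domain(results, include=(), exclude=()):
    """Drop excluded hosts; if include is non-empty, keep only matching hosts."""
    kept = []
    for item in results:
        host = (urlsplit(item["url"]).hostname or "").lower()
        if any(host_matches(host, d) for d in exclude):
            continue
        if include and not any(host_matches(host, d) for d in include):
            continue
        kept.append(item)
    return kept


# Placeholder results (assumed shape); example.com and example.org are reserved domains.
sample = [
    {"url": "https://docs.example.com/a", "title": "A"},
    {"url": "https://docs.example.com/a/#top", "title": "A (duplicate)"},
    {"url": "https://example.org/b", "title": "B"},
]
deduped = unique_by_url(sample)
filtered = filter_by_domain(deduped, include=["example.com"])
```

Matching on the host suffix rather than a substring keeps a domain such as `notexample.com` from slipping through an `example.com` filter.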
Provider Choice
Use --provider parallel when the task benefits from AI-optimized excerpts,
current web search, and source extraction through one session. It requires
PARALLEL_API_KEY.
Use sdk.markdown.url() when you already have URLs and want compact markdown
through curl.md. It is a URL-to-markdown reader, not a search planner. Prefer it
for pages where raw HTML extraction would be noisy.
Pattern
- Turn the user task into a query plan.
- Fan out with web_many when searches are independent.
- Deduplicate before fetching pages.
- Filter by authoritative domains when source quality matters.
- Fetch only promising URLs. Use sdk.markdown.url() when markdown conversion
is the desired fetch behavior.
- Use normal Python for regexes, joins, counters, tables, and validation.
- Set result or output to compact JSON-serializable evidence.
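The "normal Python" step in the list above can be sketched as follows: a regex pulls identifiers out of fetched page text, a Counter tallies them, and the outcome is assigned to `result` as compact JSON-serializable evidence. The `pages` mapping stands in for text that would come from a fetch step; its URLs and CVE IDs are placeholders, not real advisories.

```python
import json
import re
from collections import Counter

CVE_RE = re.compile(r"CVE-\d{4}-\d{4,}")

# Placeholder page text keyed by URL; the IDs are made up for illustration.
pages = {
    "https://example.com/advisory-1": "Fixes CVE-0000-0001 and CVE-0000-0002.",
    "https://example.com/advisory-2": "Also addresses CVE-0000-0001.",
    "https://example.com/advisory-3": "No identifiers here.",
}

mentions = Counter()  # number of pages mentioning each ID
sources = {}          # ID -> list of URLs where it appears
for url, text in pages.items():
    # Count each ID once per page, in a stable order.
    for cve in sorted(set(CVE_RE.findall(text))):
        mentions[cve] += 1
        sources.setdefault(cve, []).append(url)

result = {
    "cve_count": len(mentions),
    "pages_without_match": [u for u, t in pages.items() if not CVE_RE.search(t)],
    "evidence": [
        {"cve": cve, "pages": n, "sources": sources[cve]}
        for cve, n in mentions.most_common()
    ],
}

# Validation: the result must survive a JSON round trip.
payload = json.dumps(result)
```

Recording the pages that produced no match keeps failed cases visible instead of silently dropping them.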
Example
# One query per vendor; the queries are independent, so fan out with web_many.
vendors = {
    "Mozilla": "site:mozilla.org/en-US/security/advisories CVE Fixed in high",
    "Chrome": "site:chromereleases.googleblog.com High CVE Stable channel updated",
}
hits_by_query = sdk.search.web_many(list(vendors.values()), limit_per_query=8)
# Flatten the per-query groups, then dedupe and keep only the vendor domains.
hits = [hit for group in hits_by_query for hit in group]
hits = sdk.filter.unique_urls(hits)
hits = sdk.filter.domains(hits, include=["mozilla.org", "googleblog.com"])
ranked = sdk.rank.lexical("CVE high severity fixed version advisory", hits)
# Return a count plus a bounded markdown rendering of the top candidates.
result = {
    "candidate_count": len(ranked),
    "context": sdk.render.markdown(ranked[:12], max_chars=6000),
}
Guidance
Keep generated pipelines deterministic and inspectable. Persist intermediate
state explicitly when a later turn may need it. Do not dump large candidate sets
into result; return compact evidence, counts, failed cases, and the next query
plan.
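One way to follow this guidance is a small helper that caps what goes into `result`. This is a sketch under assumptions: the helper name `compact_result`, the field names, the candidate shape (dicts with `url` and an optional `snippet`), and the limits are all hypothetical and should be adapted to the task.

```python
import json


def compact_result(candidates, failures, next_queries, max_items=5, max_snippet=200):
    """Summarize a candidate set: total count, a few truncated items, failures, next plan."""
    top = [
        {"url": c["url"], "snippet": c.get("snippet", "")[:max_snippet]}
        for c in candidates[:max_items]
    ]
    return {
        "candidate_count": len(candidates),
        "returned": len(top),
        "evidence": top,
        "failed": failures,
        "next_queries": next_queries,
    }


# Placeholder inputs: 40 candidates with oversized snippets.
candidates = [
    {"url": f"https://example.com/{i}", "snippet": "x" * 500} for i in range(40)
]
result = compact_result(
    candidates,
    failures=[{"url": "https://example.com/timeout", "error": "timeout"}],
    next_queries=["placeholder follow-up query"],
)
encoded = json.dumps(result)
```

The full candidate list stays in a local variable (or in explicitly persisted state) while `result` carries only the count, a bounded sample, the failures, and the follow-up queries.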