---
name: build-list
description: This skill should be used when the user asks to "build a prospect list", "find prospects", "gather leads", "explore targets", or wants to build a prospect list. Collects prospect candidates via web search based on BUSINESS.md and SALES_STRATEGY.md and registers them in the DB.
argument-hint: <project-id> [target-count=30]
allowed-tools: ["Bash","Read","Agent","WebSearch","WebFetch","mcp__plugin_lead-ace_api__add_prospects","mcp__plugin_lead-ace_api__check_prospect_dedup","mcp__plugin_lead-ace_api__get_outbound_targets","mcp__plugin_lead-ace_api__get_document","mcp__plugin_lead-ace_api__save_document","mcp__plugin_lead-ace_api__get_master_document"]
---
# Build List - Prospect List Building

A skill that collects prospect candidates via web search based on the information in BUSINESS.md and SALES_STRATEGY.md, retrieves contact information, and registers them in the database.
Phase structure:
- Phase 1 (Candidate Collection): Find prospect candidates broadly via web search (name, official URL, overview)
- Phase 1.5 (Pre-dedup Filter): Call `check_prospect_dedup` with the candidates' domains and drop any the server would reject -- saves the per-candidate cost of Phase 1.7 (signal WebSearch) and Phase 2 (contact-retrieval sub-agents) on already-known orgs
- Phase 1.7 (Signal Collection): Pull a recent-signal slice for each surviving candidate (press release / funding / hiring) so `/outbound` has fresh hooks
- Phase 2 (Contact + Keyperson Retrieval): Use sub-agents to explore each candidate's official site, retrieve email / form URL, AND surface at least one keyperson (job title + name)
## Phase 1: Candidate Collection

### 1. Setup

- Project ID: `$0` (required)
- Target count: `$1` (default: 30. Approximate is fine -- "around N" is sufficient)
Load the following documents via MCP:
- Call `mcp__plugin_lead-ace_api__get_document` with `projectId: "$0"` and `slug: "business"`.
- Call `mcp__plugin_lead-ace_api__get_document` with `projectId: "$0"` and `slug: "sales_strategy"`.
- Call `mcp__plugin_lead-ace_api__get_master_document` with `slug: "tpl_industries"` and keep the returned vocabulary list -- every prospect's industry field MUST be set to one of those exact strings.

If either project document is not found, guide the user to run `/strategy`.
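A minimal sketch of the vocabulary gate described above (the vocabulary set and helper name are illustrative; the real list is whatever the `tpl_industries` master document returns):

```python
# Hypothetical vocabulary -- in practice this set is the exact list
# returned by get_master_document(slug="tpl_industries").
INDUSTRY_VOCAB = {"B2B SaaS", "HealthTech", "FinTech", "Other"}

def normalize_industry(candidate_industry: str) -> str:
    """Keep the industry only if it matches the vocabulary exactly;
    otherwise fall back to 'Other', as the field rules require."""
    return candidate_industry if candidate_industry in INDUSTRY_VOCAB else "Other"
```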
### 2. Review Search Notes

Do NOT pre-fetch the registered-prospect list. Server-side dedup in `add_prospects` (Phase 3) is the single source of truth -- it returns structured `skippedDetails` with reasons (`email_duplicate`, `form_url_duplicate`, `already_in_project`, `do_not_contact`, `duplicate_in_batch`) so this skill can adapt mid-flight without an O(N) identifier dump.
Call `mcp__plugin_lead-ace_api__get_document` with `projectId: "$0"` and `slug: "search_notes"`. If found, use its content. It contains knowledge from previous explorations:
- Exhausted keywords (do not repeat -- they already returned heavy duplicates)
- Coverage matrix (industry × region × company-size cells already covered)
- Useful information source sites (not yet fully explored)
- Directions to try next time

Use this to continue exploration from where the last session left off. If search_notes is missing, treat every cell of the matrix as unexplored.
### 3. Search Strategy

Based on the "Search Keywords" and "Target" sections of SALES_STRATEGY.md, formulate multiple search queries.

Pick from unexplored cells of the coverage matrix first. Each query should belong to a single (industry × region × size) cell, e.g. B2B SaaS × Pacific Northwest × Series A. Cells already marked exhausted in search_notes should not be retried unless the user explicitly asks.

Avoid every keyword listed under `## Exhausted Keywords` in search_notes (those previously returned ≥ 70% duplicates). Pick a synonym or a different angle instead.
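The cell-picking order can be sketched as follows (the `status` strings match the coverage-matrix values used in step 9; the helper name is hypothetical):

```python
def next_cells(matrix: list[dict]) -> list[dict]:
    """Order coverage-matrix cells for this run: unexplored cells first,
    then covered ones; exhausted cells are dropped entirely (they are
    only retried when the user explicitly asks)."""
    order = {"unexplored": 0, "covered": 1}
    return sorted(
        (cell for cell in matrix if cell["status"] != "exhausted"),
        key=lambda cell: order[cell["status"]],
    )
```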
Types of search queries (choose appropriate ones based on target type):
- Search by target industry + region
- Member lists of industry associations, federations
- Prospect collection from industry media and news sites
- Exhibitor lists from trade shows and events
- Client case studies from competitors
- Target exploration on job sites
- Directories or public databases of schools and corporations
### 4. Web Search Execution

Combine WebSearch and fetch_url.py (Jina Reader + Claude Haiku) to broadly collect prospect candidates.

Use fetch_url.py for page retrieval (do not use WebFetch):

```
python3 ${CLAUDE_PLUGIN_ROOT}/scripts/fetch_url.py --url "https://example.com" --prompt "Extract company list" --timeout 15
```

It has timeout control, so it won't freeze on unresponsive sites, and it also handles SPA sites.

Fallback when fetch_url.py is unavailable: if the invocation fails (python3 or the claude CLI missing from PATH, or any execution error), fall back to WebFetch for the rest of the run. WebFetch is blocked by some corporate B2B WAFs (typically 403) -- when that happens, skip the candidate and continue with the others rather than retrying.
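The fallback decision can be sketched like this (a hedged illustration only: the script path and CLI names follow the text above, and a `None` return means "fall back to WebFetch"):

```python
import shutil
import subprocess

def fetch_page(url: str, prompt: str, plugin_root: str, timeout: int = 15):
    """Run fetch_url.py; return its stdout, or None when the caller
    should fall back to WebFetch (missing python3/claude CLI, script
    error, or timeout)."""
    if shutil.which("python3") is None or shutil.which("claude") is None:
        return None  # required CLIs missing from PATH
    script = f"{plugin_root}/scripts/fetch_url.py"
    try:
        result = subprocess.run(
            ["python3", script, "--url", url, "--prompt", prompt,
             "--timeout", str(timeout)],
            capture_output=True, text=True, timeout=timeout + 5,
        )
    except (subprocess.TimeoutExpired, OSError):
        return None
    return result.stdout if result.returncode == 0 else None
```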
This phase focuses on discovering candidates. Contact information (email, form, etc.) is collected in Phase 2, so only gather the following here:
Required (skip the candidate if missing):
- Name (company name, school name, organization name, etc.)
- Business overview (what the organization does; 1-2 sentences summarized from the official site)
- Official site URL
If available:
- Industry or field
- Department or branch name (school name for school corporations, target department for large companies)
- Country (ISO 3166-1 alpha-2, e.g., "US", "JP", "GB")
- Email addresses or SNS accounts found incidentally during search (no need to look for these intentionally)
- Organization name: the legal entity name if it differs from the prospect name (e.g., a school corporation that operates multiple schools)
Skip any prospect for which the official site URL and business overview cannot be obtained.
Search tips:
- A single query finds limited prospects, so vary the angles broadly
- Use portal sites and listing pages to find many candidates at once
- Stop searching once the target count (`$1`, default 30) is reached. Deduplication rejections don't count (count only newly registered prospects)
- No need to deep-dive individual official sites in this phase -- focus on securing a quantity of candidates
Duplicate-rate response (threshold-driven):

The duplicate signal comes from two places: Phase 1.5's `check_prospect_dedup` decisions (most candidates are caught here, before signal / contact retrieval) and Phase 3's `add_prospects.skippedDetails` (the safety net that catches anything that slipped past 1.5). Combine both when judging a batch -- but exclude `plan_limit` from the tally (it is a budget hit, not an angle-exhaustion signal; treating it as exhaustion would mark a perfectly good keyword as dead just because the user hit their plan cap mid-cycle).

- < 30% skip rate → healthy. Continue with the same angle.
- 30–70% skip rate → the angle is fading. Deep-dive within the same target first before pivoting:
  - Look beyond top results to page 2, 3, and beyond
  - Add regional qualifiers (e.g., "SaaS companies" → "SaaS companies Portland", "SaaS companies Austin")
  - Use synonyms / related terms (e.g., "consulting firm" → "advisory firm", "management consultancy")
  - Follow industry-specific portal sites and directories
  - Search for "competitors" / "similar services" of already-registered prospects to find new ones organically
- ≥ 70% skip rate → the angle is exhausted. Stop deep-diving on this keyword / cell, record it under `## Exhausted Keywords` (step 9), mark the corresponding coverage-matrix cell as exhausted, and pivot to a different (industry × region × size) cell for the next pass. The 70% rule is a hard pivot threshold, not advisory -- repeating an exhausted angle just spends quota on duplicates.
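The three bands above can be expressed as a tiny decision helper (the helper name is illustrative; `plan_limit` skips must already be excluded from the counts):

```python
def angle_action(skipped: int, attempted: int) -> str:
    """Map a batch's dedup-skip rate onto the threshold bands:
    >= 70% -> pivot, 30-70% -> deep-dive, < 30% -> continue."""
    rate = skipped / attempted if attempted else 0.0
    if rate >= 0.70:
        return "pivot"      # exhausted: record the keyword, switch cell
    if rate >= 0.30:
        return "deep_dive"  # fading: page 2+, regional qualifiers, synonyms
    return "continue"       # healthy: keep the same angle
```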
### 5. Priority and Match Reason Assessment
For each prospect, assign a match reason (why they're appropriate as a target, including their challenges and needs) and priority (1-5) based on SALES_STRATEGY.md criteria:
- 1: Top priority (perfectly matches target, needs are clear)
- 2: High priority (broadly matches target)
- 3: Standard (within target range)
- 4: Marginal (only partially meets criteria)
- 5: Under consideration (indirect possibility)
Factor in email retrieval ease: If the following signals are found during exploration, raise priority by 1 level for equal match quality (more email holders -> higher outbound success rate):
- Has press releases on press release distribution sites (high rate of PR contact email inclusion)
- Listed in startup DB or industry directory (more public information available)
- Email explicitly shown on official site (e.g., info@) discovered during exploration
Note on email types: Both named individual addresses (first.last@co.com) and generic addresses (info@, contact@, sales@, support@, pr@) are valid outreach targets. Named addresses generally have higher reply rates and deserve slightly higher priority, but generic addresses must not be excluded -- for many companies they are the only reachable channel.
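The priority bump can be sketched as follows (helper and flag names are illustrative; lower numbers mean higher priority on the 1-5 scale above):

```python
def adjust_priority(base_priority: int,
                    has_press_release: bool,
                    in_directory: bool,
                    email_on_site: bool) -> int:
    """Raise priority one level when any email-availability signal is
    present, clamping to the 1-5 scale (1 = top priority)."""
    if has_press_release or in_directory or email_on_site:
        return max(1, base_priority - 1)
    return base_priority
```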
## Phase 1.5: Pre-dedup Filter

Before paying for Phase 1.7's per-candidate WebSearch and Phase 2's per-candidate sub-agent contact retrieval, drop candidates the server would reject anyway. The dedup decision needs only `organizationDomain`, which is already known at the end of Phase 1, so running this gate first saves both downstream costs.

Call `mcp__plugin_lead-ace_api__check_prospect_dedup` with:
- `projectId: "$0"`
- `candidates`: array of `{ organizationDomain, email?, contactFormUrl? }` -- one entry per Phase 1 candidate. `organizationDomain` is the apex domain derived from the candidate's website_url (strip www. and path). Include `email` / `contactFormUrl` if Phase 1 happened to surface them (rare but possible).

The response is a `decisions` array in the same order as the input. Drop any candidate whose `kind === 'skip'`. Tally the skip reasons (`reason` ∈ `already_in_project | email_duplicate | form_url_duplicate | do_not_contact | duplicate_in_batch`) and feed that tally into step 9 (`## Exhausted Keywords`) -- the same threshold rule applies (≥ 70% skip in the batch = exhausted angle, switch keywords for the next pass).
If most candidates are dropped here, the search angle is exhausted; do
not push through Phase 1.7 / Phase 2 with a near-empty list. Either
(a) re-run Phase 1 with a different keyword / region / size cell from the
coverage matrix, or (b) accept the smaller batch and continue. Phase 3's
add_prospects re-runs the same dedup as a safety net, so passing through
a few skip-marked candidates is harmless but wastes Phase 1.7 / Phase 2
effort.
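The gate can be sketched as follows (the apex-domain heuristic is simplified -- multi-label public suffixes such as `.co.uk` would need a real suffix list; the `decisions` shape follows the description above):

```python
from urllib.parse import urlparse

def apex_domain(website_url: str) -> str:
    """Derive organizationDomain from a website URL: keep only the host
    and strip a leading 'www.' (simplified heuristic)."""
    host = urlparse(website_url).netloc.lower()
    return host[4:] if host.startswith("www.") else host

def filter_fresh(candidates: list[dict], decisions: list[dict]) -> list[dict]:
    """Drop candidates whose dedup decision is 'skip'; decisions come
    back in the same order as the submitted candidates."""
    return [c for c, d in zip(candidates, decisions) if d["kind"] != "skip"]
```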
## Phase 1.7: Signal Collection

For each surviving (post-Phase-1.5) candidate, run one WebSearch query of the form `"<organization name>" press release OR funding OR hiring 2025..2026` (or your equivalent for the prospect's region / language). Skim the top results for any of:
- A press release dated within the last 6 months
- A funding round announcement
- A hiring spike, role expansion, or new department launch
- A product launch, partnership, or named-customer announcement
When something concrete surfaces, append a `## Recent Signals` section to the candidate's overview, of the form:

```
## Recent Signals
- 2026-03-12: Announced Series B led by Acme Ventures (TechCrunch)
- 2026-02-04: Hiring 5 senior backend engineers (LinkedIn)
```

Each bullet is date + one sentence + source. Do not invent signals -- if nothing relevant turns up, leave the section out. `/outbound` reads `## Recent Signals` and decides whether to open with a signal-aware hook; an absent section means no signal mention.

This is one query per prospect, not deep research. The SaaS-side daily batch (B §4.2-B) refines signals over time; the goal here is to seed the field at registration time.
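Building the query and the appended section can be sketched as follows (helper names are illustrative):

```python
def signal_query(org_name: str, year_range: str = "2025..2026") -> str:
    """The single per-candidate WebSearch query described above."""
    return f'"{org_name}" press release OR funding OR hiring {year_range}'

def recent_signals_section(signals: list[tuple[str, str, str]]) -> str:
    """Render (date, one-sentence summary, source) tuples as the
    '## Recent Signals' block; an empty list yields an empty string,
    i.e. the section is omitted rather than invented."""
    if not signals:
        return ""
    lines = ["## Recent Signals"]
    lines += [f"- {date}: {summary} ({source})" for date, summary, source in signals]
    return "\n".join(lines)
```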
## Phase 2: Contact + Keyperson Retrieval

### 6. Contact Retrieval via Sub-agents

Split the post-Phase-1.5 candidate list (only the `kind === 'fresh'` entries; Phase 1.7 may have enriched their overviews with signals) into batches of 5 and launch a sub-agent for each batch to retrieve contact information.
Include the following in each sub-agent's prompt:
- List of assigned candidates (name, organization_name, website_url, overview, industry, department, country, match_reason, priority)
- Retrieve the contact enrichment procedure via `mcp__plugin_lead-ace_api__get_master_document` with `slug: "tpl_enrich_contacts"` and follow its procedure
- Explore each candidate's official site to retrieve email addresses and contact form URLs
- Keyperson lookup is required, not optional. Search the official site's team / leadership / about pages, then LinkedIn public results (`site:linkedin.com/in "<organization name>" <target role>`), then the press release page. Capture at least one (contactName, department) pair per candidate when any public source mentions one. If absolutely nothing surfaces, leave both null and note it.
- Use `python3 ${CLAUDE_PLUGIN_ROOT}/scripts/fetch_url.py --url <URL> --prompt <instructions>` for page retrieval (do not use WebFetch). If fetch_url.py cannot run (python3 or the claude CLI missing from PATH), fall back to WebFetch and skip any candidate the WAF blocks (403)
- After completion, return the results as a JSON array
Sub-agent allowed-tools: Bash, WebSearch, WebFetch, Read
Each object in the JSON array returned by the sub-agent includes the Phase 1 information (name, organization_name, overview, website_url, industry, department, country, match_reason, priority) plus the retrieved contacts (email, contact_form_url, form_type, sns_accounts, contact_name).
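One result object might look like this (all values are invented, purely to illustrate the shape described above):

```python
sample_result = {
    # carried over from Phase 1
    "name": "Acme Analytics",
    "organization_name": "Acme Analytics Inc.",
    "overview": "Self-serve product analytics for B2B SaaS teams.",
    "website_url": "https://www.acme-analytics.example/",
    "industry": "B2B SaaS",
    "department": None,
    "country": "US",
    "match_reason": "Mid-market SaaS vendor with a growing sales team.",
    "priority": 2,
    # retrieved in Phase 2
    "email": "info@acme-analytics.example",
    "contact_form_url": "https://www.acme-analytics.example/contact",
    "form_type": "native_html",
    "sns_accounts": {"linkedin": "https://www.linkedin.com/company/acme-analytics"},
    "contact_name": "Jane Doe (VP Sales)",
}
```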
### 6b. Re-search for Candidates Without Contact Info (only when applicable)

If Phase 2 results include candidates with both email and contact_form_url null, try to supplement contact info from sources other than the official site.
For each such candidate, run WebSearch queries such as:
- `"{company name}" email address`
- `"{company name}" contact`
Information may be found from industry directories, press release distribution sites, event speaker information, etc. If found, update the candidate's data.
Limit: Re-search up to a maximum of 10 candidates without contact info. Register the rest without contact info (they will be skipped during outbound).
## Phase 3: Registration

### 7. Database Registration

Call `mcp__plugin_lead-ace_api__add_prospects` with:
- `projectId: "$0"`
- `prospects`: array of prospect objects
Field mapping for the MCP tool. For each prospect, construct the object as follows:

- `organizationDomain`: Extract the apex domain from website_url (e.g., https://www.example.com/about -> example.com). Strip the www. prefix and path. Used for dedup.
- `organizationName`: the legal entity name (or name if not separately available)
- `organizationWebsiteUrl`: the organization's official website URL
- `name`: prospect name (company name, school name, department, etc.)
- `contactName`: contact person name (optional)
- `department`: department within the organization (optional)
- `overview`: business overview (1-2 sentences). If Phase 1.7 surfaced any signals, append the `## Recent Signals` section after the overview text within the same field.
- `industry`: must be one of the strings from tpl_industries (the vocabulary fetched in step 1). Free-form industry strings break the `/evaluate` aggregator and the timing-aware ordering. If none fit, use `Other`.
- `country`: ISO 3166-1 alpha-2 (e.g. US, CA, JP). Optional in the payload -- when omitted, the server falls back to TLD inference from the organization domain. Set this when you have stronger evidence than the TLD (LLM-derived from page content, address footer, etc.) and pass `countrySource: 'ai_inferred'`. LeadAce currently only sends to US, CA, and JP recipients; prospects from other countries register fine, but the send paths block them at outreach time. If /strategy already identified a US-, CA-, or JP-only target audience, prefer those.
- `countrySource`: optional, one of `manual` (operator confirmed) or `ai_inferred`. Skip this field when leaving country blank.
- `websiteUrl`: the specific page URL for this prospect
- `email`: email address (optional*)
- `contactFormUrl`: contact form URL (optional*)
- `formType`: one of `google_forms`, `native_html`, `wordpress_cf7`, `iframe_embed`, `with_captcha` (optional)
- `snsAccounts`: `{ x?, linkedin?, instagram?, facebook? }` (optional*)
- `matchReason`: why this prospect is a good target
- `priority`: 1-5 (default 3)
- `hypothesis`: per-prospect targeting hypothesis as a structured object (optional but recommended). Built from the assembled overview + any `## Recent Signals` + matchReason + SALES_STRATEGY context. Read by the inquiry-landing chat snapshot to ground answers about the visiting org. Shape:
  - `hypothesizedPain`: 1-3 short pain hypotheses, one sentence each (e.g. ["Manual lead routing slows reps", "No central buyer-signal aggregation"])
  - `valueMapping`: 1-3 bullets on how our offering addresses those pains (same order as hypothesizedPain when paired)
  - `timingSignals`: 1-3 concrete reasons NOW is a good moment, drawn from `## Recent Signals` (e.g. ["Series B announced 12d ago", "2 SDR roles open since 18d"]). Omit when no signals surfaced -- do not invent.
  - `targetDepartment` / `targetRolePattern`: optional. Department / role pattern most likely to buy (e.g. "Sales Operations", "Director of Sales Ops").
  - `bestChannel` / `bestKeyperson`: optional. Skip when unclear; do NOT guess.

  Keep each bullet to one short sentence. Skip fields when public info is too thin to fill them honestly. A partial hypothesis is fine; an invented one harms the chat AI's credibility.

* At least one of email, contactFormUrl, or snsAccounts is required. Prospects with no contact channel are rejected.
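The contact-channel requirement can be checked with a small guard before calling add_prospects (the helper name is illustrative):

```python
def has_contact_channel(prospect: dict) -> bool:
    """True when at least one of email, contactFormUrl, or snsAccounts
    is present and non-empty; add_prospects rejects the prospect
    otherwise."""
    return bool(prospect.get("email")
                or prospect.get("contactFormUrl")
                or prospect.get("snsAccounts"))
```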
The server automatically deduplicates by email, contact form URL, and organization domain within the project. Inspect `skippedDetails` after the call: each entry is `{name, reason}` with reason ∈ `email_duplicate | form_url_duplicate | already_in_project | do_not_contact | duplicate_in_batch | plan_limit`. If the same reason clusters tightly (e.g. ≥ 50% of skips are email_duplicate from one industry), record the keyword in `## Exhausted Keywords` and switch angles for the next pass.
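Detecting a tight reason cluster can be sketched as follows (the helper name is illustrative; the `{name, reason}` entry shape follows the text above):

```python
from collections import Counter

def dominant_skip_reason(skipped_details: list[dict], threshold: float = 0.5):
    """Return (reason, share) when a single non-plan_limit reason accounts
    for at least `threshold` of the skips, else None. plan_limit is a
    budget cap, not an exhaustion signal, so it is excluded."""
    counts = Counter(d["reason"] for d in skipped_details
                     if d["reason"] != "plan_limit")
    total = sum(counts.values())
    if total == 0:
        return None
    reason, n = counts.most_common(1)[0]
    share = n / total
    return (reason, share) if share >= threshold else None
```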
Difference between organizations and prospects:
- organizations = legal entity unit (the apex domain is the PK)
- prospects = prospect unit (a specific target within an organization)

Examples:
- Small company: organizationName = name (1:1, department is null)
- School corporation operating multiple schools: organizationName = "Katayagi Gakuen School Corporation", name = "Nihon Kogakuin College" (1:many possible)
- Department within a large company: name = "ABC Corp.", department = "Sales Planning Dept."
### 8. Results Report

After DB registration, check the reachable count: call `mcp__plugin_lead-ace_api__get_outbound_targets` with `projectId: "$0"` and `limit: 1` to get the `total` and `byChannel` summary.

Report the following:
- Number of newly registered prospects / target count
- Reachable breakdown (among newly registered: N with email, N with form, N SNS-only, N without contacts)
- Breakdown by priority
- Number rejected as duplicates (if many, briefly describe how the search angle was changed)
- Total project reachable remaining (from the `total` field)
- Guide the user to run `/outbound` as the next step
- Append a single low-key dashboard line at the end: `Dashboard: https://app.leadace.ai/prospects` -- purely informational, do not push the user to open it
### 9. Update Search Notes

Save search notes via `mcp__plugin_lead-ace_api__save_document` with `projectId: "$0"`, `slug: "search_notes"`. Record information useful for the next exploration in the following structure:
```markdown
# Search Notes
Last updated: YYYY-MM-DD

## Coverage Matrix
Track which (industry × region × company-size) cells have been covered
this run. Cells where the combined dedup-skip rate (Phase 1.5 + Phase 3,
excluding `plan_limit`) is ≥ 70% are marked `exhausted`. New runs should
pick from `unexplored` cells first.

| Industry | Region | Size | Status | Notes |
|---|---|---|---|---|
| B2B SaaS | US-West | Series A | covered | 12 added, 0 dups |
| B2B SaaS | US-West | Series B | exhausted | 14 dups / 18 attempts |
| HealthTech | US-Northeast | bootstrapped | unexplored | next run |

## Exhausted Keywords
Keywords whose combined dedup-skip rate (Phase 1.5 + Phase 3, excluding
`plan_limit`) was ≥ 70% this run. **Do not re-use without a fresh angle**
(different region, different size band, different role seniority). Each
entry: `keyword → reason → date`.

- "B2B SaaS Series B" → 14/18 returned `already_in_project` → 2026-05-06

## Useful Sources
- (Portal sites or listing page URLs that haven't been fully explored yet)

## Directions to Try Next Time
- (Search methods not attempted this time, regions or angles not yet explored)

## Notes
- (Areas where prospects were found unexpectedly, insights for next time)
```
If the previous version (from step 2) has a `## Hints from evaluate` section, preserve its content and carry it over to the end of the new document (to preserve response-pattern info added by evaluate).

If the previous version already has `## Coverage Matrix` / `## Exhausted Keywords` sections, merge into them -- don't overwrite. Only mark a cell exhausted when this run's data confirms it; old exhausted entries should be re-tested if the user asks for a sweep across previously-skipped cells.