crm-data-quality
// Find incomplete records, normalize field values in bulk, dedupe with `hubspot objects merge`, and audit custom properties. Builds on `bulk-operations` for JSONL piping and dry-run/digest/confirm.
| name | crm-data-quality |
| description | Find incomplete records, normalize field values in bulk, dedupe with `hubspot objects merge`, and audit custom properties. Builds on `bulk-operations` for JSONL piping and dry-run/digest/confirm. |
| triggers | ["clean up contacts","data quality","deduplicate","missing fields","normalize data","find incomplete records","merge duplicates","audit properties"] |
Read bulk-operations/SKILL.md first — JSONL piping, batch read, pagination, and dry-run/digest/confirm gating apply to every command below.
Don't guess property names. List them:
hubspot properties list --type contacts --format table
hubspot properties list --type contacts | jq -c 'select(.type=="enumeration") | {name, label}'
Same for --type companies, deals, or any custom type (hubspot objects types).
!name = NOT_HAS_PROPERTY (missing or empty). Bare name = HAS_PROPERTY. Within one --filter, chain with AND; multiple --filter flags are OR'd.
hubspot objects search --type contacts --filter "!email" --properties firstname,lastname,company
hubspot objects search --type contacts --filter "!phone AND !mobilephone" --properties email
hubspot objects search --type contacts --filter "!hubspot_owner_id" --properties email,lifecyclestage
For >100 results, use the pagination loop from bulk-operations.
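Once the full result set has been collected to a file, a quick completeness count shows which gaps are worth fixing first. The following is a minimal sketch, assuming each row has the `{id, properties:{...}}` shape used by the jq snippets in this document; the sample rows and the `blank` helper are illustrative, not CLI output.

```shell
# Sample rows standing in for paginated `hubspot objects search` output (hypothetical data).
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"id":"1","properties":{"email":"a@example.com","phone":"555-0100"}}
{"id":"2","properties":{"email":"b@example.com","phone":""}}
{"id":"3","properties":{"email":null}}
EOF

# Count rows where a property is absent, null, or an empty string.
missing_report=$(jq -s -c '
  def blank(p): map(select((.properties[p] // "") == "")) | length;
  {total: length, missing_email: blank("email"), missing_phone: blank("phone")}
' "$sample")
echo "$missing_report"
# → {"total":3,"missing_email":1,"missing_phone":2}
```

Running this against the exported file instead of `$sample` gives one summary row per audit, without another round of searches.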
Search → reshape with jq → pipe into update. Always --dry-run first; bulk-operations covers digest/confirm escalation for >100 rows. Reshape patterns: bulk-operations/resources/json-patterns.md.
# Collapse spellings into one canonical value
hubspot objects search --type contacts --filter "company~acme" \
| jq -c '{id, properties:{company:"Acme Corporation"}}' \
| hubspot objects update --type contacts --dry-run
# Lowercase emails (read, reshape, write)
hubspot objects search --type contacts --filter "email" --properties email \
| jq -c '{id, properties:{email: (.properties.email | ascii_downcase)}}' \
| hubspot objects update --type contacts --dry-run
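When several spellings must collapse into one canonical value, a lookup table inside the jq step keeps the patch limited to rows that actually change. This is a sketch under assumptions: the sample rows and the spelling map are hypothetical, and the rows are assumed to have the same `{id, properties}` shape as above.

```shell
# Sample rows standing in for search output (hypothetical data).
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"id":"10","properties":{"company":"ACME"}}
{"id":"11","properties":{"company":"Acme Corp."}}
{"id":"12","properties":{"company":"Acme Corporation"}}
{"id":"13","properties":{"company":"Globex"}}
EOF

# Map lowercased spellings to the canonical value; emit a patch only when the value would change.
patches=$(jq -c '
  {"acme":"Acme Corporation","acme corp.":"Acme Corporation"} as $canon
  | (.properties.company // "" | ascii_downcase) as $key
  | select($canon[$key] != null and $canon[$key] != .properties.company)
  | {id, properties:{company: $canon[$key]}}
' "$sample")
echo "$patches"
```

The resulting lines have the shape the update examples above pipe into `hubspot objects update --type contacts --dry-run`; rows 12 and 13 produce no patch because they are either already canonical or not in the map.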
hubspot objects merge folds the secondary record into the primary and deletes the secondary. This is irreversible, so dry-run/digest/confirm gating applies.
# Single pair
hubspot objects merge --type contacts --primary 149 --secondary 425 --dry-run
hubspot objects merge --type contacts --primary 149 --secondary 425 # execute (≤100 pairs)
Bulk: pipe JSONL {"primary":"...","secondary":"..."} on stdin (omit --primary/--secondary).
Pagination required. objects search caps at 100 rows per call and jq -s slurps a single stream into memory — running the snippet below against a raw search will silently miss every duplicate that crosses a page boundary. Collect the full set first with the pagination loop from bulk-operations/SKILL.md (write to /tmp/contacts.jsonl), then dedupe from the file:
# /tmp/contacts.jsonl produced by the pagination loop (bulk-operations/SKILL.md)
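jq -s -c '
# Skip records with no email; otherwise they all land in one group and get merged together
map(select((.properties.email // "") != ""))
| group_by(.properties.email | ascii_downcase)[]
| select(length > 1)
# Lowest id becomes the primary; every other record in the group is a secondary
| sort_by(.id | tonumber)
| .[0].id as $p | .[1:][] | {primary: $p, secondary: .id}
' /tmp/contacts.jsonl \
| hubspot objects merge --type contacts --dry-run | tee /tmp/merge-preview.jsonl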
For >100 pairs, lift digest and impact.records_affected from the BulkData line and re-pipe the same producer with --digest/--confirm (see bulk-operations).
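Because a merge cannot be undone, it can help to rehearse the pairing logic on a small file and sanity-check the pairs before anything reaches `hubspot objects merge`. The sketch below is one possible variation, not the CLI's own behaviour: it lowercases emails before grouping, skips records without an email, and counts conflicting pairs. The sample rows are hypothetical.

```shell
# Sample rows standing in for /tmp/contacts.jsonl (hypothetical data).
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"id":"425","properties":{"email":"Pat@Example.com"}}
{"id":"149","properties":{"email":"pat@example.com"}}
{"id":"900","properties":{"email":"pat@example.com"}}
{"id":"77","properties":{"email":null}}
{"id":"78","properties":{}}
EOF

# Build merge pairs: lowest id in each email group is the primary.
pairs=$(jq -s -c '
  map(select((.properties.email // "") != ""))
  | group_by(.properties.email | ascii_downcase)[]
  | select(length > 1)
  | sort_by(.id | tonumber)
  | .[0].id as $p | .[1:][] | {primary: $p, secondary: .id}
' "$sample")
echo "$pairs"
# → {"primary":"149","secondary":"425"}
# → {"primary":"149","secondary":"900"}

# Sanity check: count ids that are both primary and secondary, plus repeated secondaries.
conflicts=$(printf '%s\n' "$pairs" | jq -s '
  map(.primary) as $p | map(.secondary) as $s
  | (($s - ($s - $p)) | length) + (($s | length) - ($s | unique | length))
')
echo "conflicts: $conflicts"
```

A non-zero conflict count means the pair list should be reviewed by hand before running even a dry-run merge.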
hubspot properties list (and get, batch-read) emits {name, label, type, fieldType, groupName} per row. Enum option values are not currently exposed by the CLI — read them off a real record (hubspot objects search ... --properties <enum>) or the HubSpot UI.
# Count properties per group (HubSpot groups standard fields; custom groups stand out)
hubspot properties list --type contacts | jq -rs 'group_by(.groupName) | map({group: .[0].groupName, count: length}) | .[]'
# All enumeration properties
hubspot properties list --type contacts | jq -c 'select(.type=="enumeration") | {name, label, fieldType}'
# Create a DQ flag property, then set it via the normalize pattern in section 2
hubspot properties create --type contacts --name dq_missing_phone --label "DQ: Missing Phone" --prop-type string --field-type text
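Setting the flag follows the same search, reshape, update pattern used for normalization. A minimal sketch of the reshape step is shown here on sample rows so it can be checked offline; storing the flag as the string "true" is an assumption that matches the string/text property created above, and the sample data is hypothetical.

```shell
# Sample rows standing in for contacts returned by the "!phone AND !mobilephone" search (hypothetical data).
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"id":"21","properties":{"email":"x@example.com"}}
{"id":"22","properties":{"email":"y@example.com"}}
EOF

# Reshape each row into an update patch that sets the DQ flag.
flag_patches=$(jq -c '{id, properties:{dq_missing_phone:"true"}}' "$sample")
echo "$flag_patches"

# Live form, built only from commands shown earlier in this document (dry-run first):
# hubspot objects search --type contacts --filter "!phone AND !mobilephone" --properties email \
#   | jq -c '{id, properties:{dq_missing_phone:"true"}}' \
#   | hubspot objects update --type contacts --dry-run
```

Flagged contacts can then be found again with a bare-name filter on `dq_missing_phone`, since a bare name means HAS_PROPERTY.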
Merge is irreversible. After any merge, hubspot history --since 1h captures the audit trail. If the merge went in the wrong direction, restore the secondary from the UI's recycle bin.