Run any Skill in Manus with one click

$pwd:

domo-data-generator

Name: Domo Data Generator
Author: stahura

// **Generating sample data for Domo** -- invoke when a user needs to create realistic sample datasets and upload them to a Domo instance. Primary signals: requests for sample data, demo data, test data, fake data for Domo; mentions of Salesforce, Google Analytics, QuickBooks, NetSuite, Google Ads, Facebook Ads, HubSpot, Marketo, or Health Portal sample data; questions about the datagen CLI or domo_data_generator. Covers: generating datasets, uploading to Domo, creating datasets in Domo, rolling dates, entity pools, connector icons, catalog management, and adding new dataset definitions. Skip for: real connector setup, production data pipelines, data transformations (Magic ETL), or Domo App Platform.

Run Skill in Manus

$ git log --oneline --stat

stars:16

forks:13

updated:April 12, 2026 at 20:28

SKILL.md

readonly

related-skills.json

same repository

app-studio-build.md

from "stahura/domo-ai-vibe-rules"

Step-by-step orchestrator for building Domo App Studio apps with native KPI cards via community-domo-cli. Sequences app creation, pages, theme, hero metrics, native charts, filter cards, layout assembly, and navigation. CLI-first — no raw API calls.

2026-04-3016

card-creation.md

from "stahura/domo-ai-vibe-rules"

Domo KPI card CRUD via community-domo-cli — body schema, column mapping, beast modes, chart type index with on-demand reference files, and gotchas.

2026-04-3016

appdb-collection-create.md

from "stahura/domo-ai-vibe-rules"

Create AppDB collections via CLI-first workflows where collection creation also provisions the required datastore, then returns collection identifiers for manifest wiring and document-write follow-up. Use when an agent must initialize new AppDB storage for a Domo app, not just list/get collections or create documents.

2026-04-1216

code-engine-create.md

from "stahura/domo-ai-vibe-rules"

Create Domo Code Engine packages from CLI workflows with deterministic payload contracts, automatic function parameter datatype mapping, and manifest packagesMapping follow-up guidance. Use when an agent must create a new package/versioned package container rather than only invoke an existing function from app runtime code.

2026-04-1216

code-engine-update.md

from "stahura/domo-ai-vibe-rules"

Update Domo Code Engine packages through CLI-driven versioned lifecycle workflows with compatibility checks, datatype contract safeguards, and manifest mapping drift synchronization. Use when an agent must update package code or create a new package version and keep app mappings aligned.

2026-04-1216

community-cli-howto.md

from "stahura/domo-ai-vibe-rules"

General instructions for using the community-domo-cli in this repository, including install/auth setup, endpoint testing, and troubleshooting.

2026-04-1216

package.json

"author": "stahura"

"repository": "stahura/domo-ai-vibe-rules"

View GitHub Repository View Creator Repositories

$ install --global

$ download --local

Run Skill in Manus

$ useful --forSOC

Software DevelopersComputer and Mathematical Occupations15-1252L4

name

domo-data-generator

description

**Generating sample data for Domo** -- invoke when a user needs to create realistic sample datasets and upload them to a Domo instance. Primary signals: requests for sample data, demo data, test data, fake data for Domo; mentions of Salesforce, Google Analytics, QuickBooks, NetSuite, Google Ads, Facebook Ads, HubSpot, Marketo, or Health Portal sample data; questions about the datagen CLI or domo_data_generator. Covers: generating datasets, uploading to Domo, creating datasets in Domo, rolling dates, entity pools, connector icons, catalog management, and adding new dataset definitions. Skip for: real connector setup, production data pipelines, data transformations (Magic ETL), or Domo App Platform.

Domo Sample Data Generator

Generate realistic, cross-referenced sample data for Domo using the datagen CLI.

Repository: https://github.com/brrink/domo_data_generator

Overview

The generator creates sample data mirroring major business platforms with consistent cross-source entity integrity, then uploads it to Domo. It includes:

18 pre-built datasets across 6 source categories (Salesforce, Google Analytics, Financial, Marketing, Health, AdPoint)
YAML-driven catalog for easy dataset additions
Shared entity pool (companies, people, products, sales reps, campaigns)
Date rolling to keep data looking current
Direct Domo integration (create datasets, upload, set connector icons)
Structured JSON output by default (AI-agent-friendly)
pipx installable -- runs from any directory

Setup

# Install globally with pipx (recommended)
pipx install git+https://github.com/brrink/domo_data_generator.git

# Initialize a working directory
mkdir my-domo-data && cd my-domo-data
datagen init

# Edit .env with your Domo credentials

If .env.example is missing or you want a clean start, create .env in the working directory with:

cat > .env <<'EOF'
DOMO_CLIENT_ID=your_client_id_here
DOMO_CLIENT_SECRET=your_client_secret_here
DOMO_API_HOST=api.domo.com
DOMO_INSTANCE=your_instance_name
DOMO_SET_CONNECTOR_TYPE=false
EOF

Required Environment Variables

Variable	Purpose
`DOMO_CLIENT_ID`	OAuth client identifier
`DOMO_CLIENT_SECRET`	OAuth client secret
`DOMO_API_HOST`	API endpoint hostname
`DOMO_INSTANCE`	Domo instance name
`DOMO_SET_CONNECTOR_TYPE`	Enable connector icon customization (optional, default: false)

Auth boundary note: domo_data_generator uses its own public-API/OAuth credential flow and does not run through community-domo-cli or ryuu session auth.

Current tooling boundary: most Product API automation should use community-domo-cli, but datagen dataset create/upload in this skill currently depends on python -m datagen with .env OAuth credentials (DOMO_CLIENT_ID / DOMO_CLIENT_SECRET).

CLI Reference

Entry point: datagen [OPTIONS] COMMAND [ARGS]

Global Options

Option	Description
`--verbose / -v`	Enable verbose logging
`--output / -o TEXT`	Output format: `json` (default), `table`, `yaml`
`--yes / -y`	Skip confirmation prompts

All commands emit structured JSON by default for easy machine parsing.

Init Command

`init` -- Initialize a working directory

datagen init                  # Initialize current directory
datagen init /path/to/dir     # Initialize a specific directory

Copies bundled catalog YAML files to ./catalog/, creates .env template, and creates ./data/ directory. Run this once before using the CLI in a new directory.

Core Commands

`generate` -- Generate sample data

datagen generate --all                    # Generate all datasets
datagen generate salesforce_opportunities  # Generate one dataset
datagen generate --all --seed 42          # Reproducible generation
datagen generate --all --dry-run          # Preview without writing

Requires entity pool initialization first. Run python -m datagen pool regenerate before generate even if your schema has no explicit entity_ref columns.

Option	Description
`name`	Dataset name (YAML filename stem), optional
`--all`	Generate all datasets
`--seed INTEGER`	Random seed for reproducibility
`--catalog-dir PATH`	Catalog directory override
`--data-dir PATH`	Data directory override
`--dry-run`	Preview without writing files

`upload` -- Upload data to Domo (full replace)

datagen upload --all
datagen upload salesforce_opportunities

Requires DOMO_CLIENT_ID and DOMO_CLIENT_SECRET.

Option	Description
`name`	Dataset name, optional
`--all`	Upload all datasets
`--catalog-dir PATH`	Catalog directory override
`--data-dir PATH`	Data directory override

`create-dataset` -- Create dataset(s) in Domo from catalog

datagen create-dataset --all --skip-existing
datagen create-dataset salesforce_opportunities

Requires DOMO_CLIENT_ID and DOMO_CLIENT_SECRET. The domo_id is persisted locally (in the catalog YAML if writable, otherwise in data/domo_ids.json).

Option	Description
`name`	Dataset name, optional
`--all`	Create all datasets
`--skip-existing`	Skip datasets that already have a `domo_id`
`--catalog-dir PATH`	Catalog directory override

`roll-dates` -- Shift rolling date columns to stay current

datagen roll-dates
datagen roll-dates --anchor-date 2026-04-01

Option	Description
`--anchor-date TEXT`	Target date (YYYY-MM-DD), defaults to today
`--catalog-dir PATH`	Catalog directory override
`--data-dir PATH`	Data directory override

Informational Commands

`list` -- List catalog dataset definitions

datagen list                    # JSON output (default)
datagen --output table list     # Rich table for humans
datagen list --verbose          # Include column/schema details

`status` -- Display generation status for all datasets

datagen status

Connector Icon Commands

Require DOMO_DEVELOPER_TOKEN and DOMO_INSTANCE.

`discover-types` -- Search Domo connector/provider types

datagen discover-types salesforce

`set-type` -- Set connector icon on a Domo dataset

datagen set-type salesforce_opportunities
datagen set-type salesforce_opportunities --provider-key custom_key

`set-type-all` -- Set connector icon on all datasets with a `domo_id`

datagen set-type-all

Entity Pool Commands

`pool regenerate` -- Regenerate the shared entity pool

datagen pool regenerate
datagen pool regenerate --seed 99
datagen pool regenerate --company-count 500 --person-count 1000

Option	Default
`--seed INTEGER`	42
`--company-count INTEGER`	200
`--person-count INTEGER`	500
`--product-count INTEGER`	50
`--sales-rep-count INTEGER`	20
`--campaign-count INTEGER`	30

`pool show` -- Display entity pool summary

datagen pool show

Common Workflows

Full setup for a new Domo instance

datagen init
# Edit .env with credentials
datagen pool regenerate
datagen generate --all
datagen create-dataset --all
datagen upload --all
datagen set-type-all

Daily refresh via cron

# Crontab entry: roll dates and re-upload daily at 6 AM
0 6 * * * cd /path/to/project && datagen roll-dates && datagen upload --all

Generate a single dataset end-to-end

datagen generate salesforce_opportunities
datagen create-dataset salesforce_opportunities
datagen upload salesforce_opportunities
datagen set-type salesforce_opportunities

Included Datasets

Category	Dataset Name	Key	Rows
Salesforce	Salesforce - Accounts	`salesforce_accounts`	500
Salesforce	Salesforce - Contacts	`salesforce_contacts`	1,500
Salesforce	Salesforce - Opportunities	`salesforce_opportunities`	2,500
Google Analytics	Google Analytics - Sessions	`ga_sessions`	5,000
Google Analytics	Google Analytics - Page Views	`ga_pageviews`	10,000
Financial	QuickBooks - Invoices	`financial_invoices`	3,000
Financial	NetSuite - General Ledger	`financial_gl_entries`	5,000
Marketing	Google Ads - Campaign Performance	`marketing_google_ads`	3,000
Marketing	Facebook Ads - Campaign Performance	`marketing_facebook_ads`	2,500
Marketing	HubSpot - Contacts	`marketing_hubspot_contacts`	2,000
Marketing	Marketing - Market Leads	`marketing_market_leads`	2,500
Marketing	Marketo - Leads	`marketing_marketo_leads`	3,000
Health	Health Portal - Demographics	`health_demographics`	15
Health	Health Portal - Lab Results	`health_lab_results`	1,470
Health	Health Portal - Vitals	`health_vitals`	5,250
AdPoint	AdPoint - Orders	`adpoint_orders`	150
AdPoint	AdPoint - Line Items	`adpoint_line_items`	500
AdPoint	AdPoint - Flights	`adpoint_flights`	2,000

Entity Pool

The shared entity pool provides consistent cross-dataset references. Entities are generated once and reused across all datasets.

Entity Type	Default Count	Key Fields
company	200	id, account_id, name, domain, industry, size, city, state, annual_revenue, employee_count
person	500	id, contact_id, first_name, last_name, full_name, email, company_id, company_name, title, phone
product	50	id, name, category, unit_price, sku
sales_rep	20	id, rep_id, first_name, last_name, full_name, email, region
campaign	30	id, name, channel, budget, status

Adding New Dataset Definitions

Dataset definitions live in the catalog/ directory as YAML files. Each YAML file defines metadata, columns, and generator configurations.

YAML Structure

dataset:
  name: My Custom Dataset
  domo_id: null
  source_type: custom
  description: "Description of the dataset"
  row_count: 1000
  tags:
    - custom
    - demo

schema:
  - name: id
    type: STRING
    generator: uuid4

  - name: company_name
    type: STRING
    generator: entity_ref
    entity: company
    field: name

  - name: amount
    type: DOUBLE
    generator: random_decimal
    min: 100.0
    max: 10000.0
    precision: 2

  - name: created_date
    type: DATE
    generator: date_range
    start_days_ago: 365
    end_days_ahead: 0
    rolling: true

Available Column Types

STRING, LONG, DOUBLE, DECIMAL, DATETIME, DATE

Available Generators

Generic: uuid4, random_choice, weighted_choice, random_int, random_decimal, date_range, entity_ref, compound, sequence, constant, derived_from_date, stage_derived, faker

Salesforce: sf_id, sf_opportunity_name, sf_case_subject, sf_lead_rating

Google Analytics: ga_session_id, ga_page_path, ga_source, ga_medium, ga_campaign, ga_browser, ga_device_category, ga_country, ga_bounce_rate, ga_session_duration, ga_pageviews, ga_landing_page

Financial: gl_account_code, gl_account_name, invoice_number, payment_terms, payment_method, invoice_status, journal_type, department, fiscal_period, debit_credit

Marketing/Ads: ad_platform, campaign_objective, ad_format, ad_headline, ad_keyword, targeting_type, impressions, clicks_from_impressions, ctr, cost_per_click, ad_spend, conversions_from_clicks, hubspot_lifecycle, hubspot_lead_status, ad_group_id

Health: health_lab_init, health_lab_field, health_vital_init, health_vital_field, health_demographics

Generator Column Options

Option	Used With	Description
`entity`	`entity_ref`	Entity pool type to reference
`field`	`entity_ref`	Field to pull from the entity
`choices`	`random_choice`, `weighted_choice`	List of possible values
`min` / `max`	`random_int`, `random_decimal`	Value range
`precision`	`random_decimal`	Decimal places
`template`	`compound`	String template with `{field}` placeholders
`refs`	`compound`	Column references for template substitution — must be a YAML list of column name strings (e.g. `["sku", "line_id"]`), not a dict/object. A dict triggers `ValidationError: schema.N.refs — Input should be a valid list`.
`start_days_ago` / `end_days_ahead`	`date_range`	Date range relative to today
`rolling`	`date_range`	Enable date rolling for freshness
`mapping`	`stage_derived`	Map source values to derived values
`source_column`	`stage_derived`, `derived_from_date`	Column to derive from
`format`	`derived_from_date`	Date format string
`faker_method`	`faker`	Faker library method name
`faker_args`	`faker`	Arguments for the Faker method

weighted_choice YAML format:

generator: weighted_choice
choices:
  "Tier 1": 0.40
  "Tier 2": 0.35
  "Tier 3": 0.25

compound refs vs formatted random strings: For values like PO-12345, prefer faker with bothify instead of abusing compound / refs:
- name: purchase_order_ref
  type: STRING
  generator: faker
  faker_method: bothify
  faker_args:
    text: "PO-#####"

Rules

Run datagen init first -- Initialize a working directory before using any other commands. This copies the catalog and creates .env.
Always generate before uploading -- Run generate (or generate --all) before upload to ensure CSV data files exist.
Create datasets before first upload -- Run create-dataset before upload for new datasets. The domo_id is persisted locally.
Use --skip-existing -- When running create-dataset --all, use --skip-existing to avoid duplicating datasets that already have a domo_id.
Entity pool consistency -- Regenerating the pool (pool regenerate) invalidates all previously generated data. Re-generate all datasets afterward.
Date rolling -- Use roll-dates before upload to keep date columns current. Only columns with rolling: true are affected.
Credentials -- DOMO_CLIENT_ID and DOMO_CLIENT_SECRET are required for upload and create-dataset. DOMO_DEVELOPER_TOKEN is required for set-type and discover-types. Offline commands need no credentials.
Reproducibility -- Use --seed for reproducible data generation across runs.
Output format -- Default output is JSON. Use --output table for human-readable Rich tables.

Checklist

CLI installed (pipx install git+https://github.com/brrink/domo_data_generator.git)
Working directory initialized (datagen init)
.env configured with Domo credentials
Entity pool generated (datagen pool regenerate)
Datasets generated (datagen generate --all)
Datasets created in Domo (datagen create-dataset --all --skip-existing)
Data uploaded (datagen upload --all)
Connector icons set (datagen set-type-all) if desired
Cron configured for daily date rolling and upload if needed

domo-data-generator

More from this repository

More from this repository

Domo Sample Data Generator

Overview

Setup

Required Environment Variables

CLI Reference

Global Options

Init Command

init -- Initialize a working directory

Core Commands

generate -- Generate sample data

upload -- Upload data to Domo (full replace)

create-dataset -- Create dataset(s) in Domo from catalog

roll-dates -- Shift rolling date columns to stay current

Informational Commands

list -- List catalog dataset definitions

status -- Display generation status for all datasets

Connector Icon Commands

discover-types -- Search Domo connector/provider types

set-type -- Set connector icon on a Domo dataset

set-type-all -- Set connector icon on all datasets with a domo_id

Entity Pool Commands

pool regenerate -- Regenerate the shared entity pool

pool show -- Display entity pool summary

Common Workflows

Full setup for a new Domo instance

Daily refresh via cron

Generate a single dataset end-to-end

Included Datasets

Entity Pool

Adding New Dataset Definitions

YAML Structure

Available Column Types

Available Generators

Generator Column Options

Rules

Checklist

Domo Sample Data Generator

Overview

Setup

Required Environment Variables

CLI Reference

Global Options

Init Command

init -- Initialize a working directory

Core Commands

generate -- Generate sample data

upload -- Upload data to Domo (full replace)

create-dataset -- Create dataset(s) in Domo from catalog

roll-dates -- Shift rolling date columns to stay current

Informational Commands

list -- List catalog dataset definitions

status -- Display generation status for all datasets

Connector Icon Commands

discover-types -- Search Domo connector/provider types

set-type -- Set connector icon on a Domo dataset

set-type-all -- Set connector icon on all datasets with a domo_id

Entity Pool Commands

pool regenerate -- Regenerate the shared entity pool

pool show -- Display entity pool summary

Common Workflows

Full setup for a new Domo instance

Daily refresh via cron

Generate a single dataset end-to-end

Included Datasets

Entity Pool

Adding New Dataset Definitions

YAML Structure

Available Column Types

Available Generators

Generator Column Options

Rules

Checklist

`init` -- Initialize a working directory

`generate` -- Generate sample data

`upload` -- Upload data to Domo (full replace)

`create-dataset` -- Create dataset(s) in Domo from catalog

`roll-dates` -- Shift rolling date columns to stay current

`list` -- List catalog dataset definitions

`status` -- Display generation status for all datasets

`discover-types` -- Search Domo connector/provider types

`set-type` -- Set connector icon on a Domo dataset

`set-type-all` -- Set connector icon on all datasets with a `domo_id`

`pool regenerate` -- Regenerate the shared entity pool

`pool show` -- Display entity pool summary

`init` -- Initialize a working directory

`generate` -- Generate sample data

`upload` -- Upload data to Domo (full replace)

`create-dataset` -- Create dataset(s) in Domo from catalog

`roll-dates` -- Shift rolling date columns to stay current

`list` -- List catalog dataset definitions

`status` -- Display generation status for all datasets

`discover-types` -- Search Domo connector/provider types

`set-type` -- Set connector icon on a Domo dataset

`set-type-all` -- Set connector icon on all datasets with a `domo_id`

`pool regenerate` -- Regenerate the shared entity pool

`pool show` -- Display entity pool summary