dct-generate

Updated February 9, 2026 at 02:47

Use this skill when the user wants to create synthetic test data, generate fake datasets, create mock data for testing, produce realistic data with specific patterns, or needs sample data with custom schemas. Triggers include "generate test data", "create fake data", "mock dataset", "synthetic data", "generate sample records", "create test data", "fake users", "mock data", or any request for test data with specific fields and relationships.

Source

andrew-a-hale

andrew-a-hale/dct

SKILL.md

DCT Generate - Create Synthetic Data

Generate realistic test data with customizable schemas and field types.

When to Use

Use this skill when you need to:

Create test datasets for development
Generate mock data for demos
Produce synthetic data for testing ETL pipelines
Create data with specific distributions
Generate data with referential integrity

Installation

which dct || { go build -o dct && chmod +x ./dct; }

Usage

dct gen <schema> [flags]

Arguments

schema: JSON schema as a file path or inline JSON string

Flags

-n, --lines <number>: Number of rows to generate (default: 1)
-f, --format <format>: Output format - csv, ndjson (default: csv)
-o, --outfile <file>: Output file path (default: stdout)

Examples

From schema file:

dct gen schema.json -n 1000 -o test_data.csv

Inline schema:

dct gen '[{"field":"name","source":"firstNames"}]' -n 100

NDJSON output:

dct gen schema.json -n 500 -f ndjson -o output.ndjson

Generate to stdout:

dct gen users-schema.json -n 10
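
When the command is assembled from a script, building the argument list explicitly avoids shell-quoting problems with inline JSON schemas. The following is a minimal Python sketch, assuming `dct` is on `PATH`. The helper name `build_gen_command` is hypothetical and not part of dct; it uses only the flags documented above.

```python
import json
import shutil
import subprocess


def build_gen_command(schema, lines=1, fmt="csv", outfile=None):
    """Build an argv list for `dct gen` (hypothetical helper, not part of dct).

    `schema` is either a file path (str) or a list of field objects,
    which is passed inline as a JSON string.
    """
    if fmt not in ("csv", "ndjson"):
        raise ValueError(f"unsupported format: {fmt}")
    schema_arg = schema if isinstance(schema, str) else json.dumps(schema)
    argv = ["dct", "gen", schema_arg, "-n", str(lines), "-f", fmt]
    if outfile is not None:
        argv += ["-o", outfile]
    return argv


if __name__ == "__main__":
    cmd = build_gen_command([{"field": "name", "source": "firstNames"}], lines=100)
    print(cmd)
    # Only run the command when dct is actually installed.
    if shutil.which("dct"):
        subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` without `shell=True` means the inline schema needs no extra quoting.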

Schema Format

Array of field objects:

[
  {
    "field": "column_name",
    "source": "source_type",
    "config": { ... }
  }
]
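
Before handing a schema to `dct gen`, it can help to check its shape locally. The sketch below is an assumption-laden pre-check, not dct's own validation: the function name `check_schema` is hypothetical, and `KNOWN_SOURCES` contains only the source names listed in this document, so dct itself may accept others.

```python
# Source names listed in this document; dct itself may accept more.
KNOWN_SOURCES = {
    "randomBool", "randomEnum", "randomAscii", "randomUniformInt",
    "randomNormal", "randomPoisson", "randomDatetime", "randomDate",
    "randomTime", "uuid", "firstNames", "lastNames", "companies",
    "emails", "derived",
}


def check_schema(schema):
    """Return a list of problems found in a schema (hypothetical pre-check)."""
    if not isinstance(schema, list):
        return ["schema must be a JSON array of field objects"]
    problems = []
    seen = set()
    for i, entry in enumerate(schema):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: not an object")
            continue
        name = entry.get("field")
        if not isinstance(name, str) or not name:
            problems.append(f"entry {i}: missing 'field'")
        elif name in seen:
            problems.append(f"entry {i}: duplicate field '{name}'")
        else:
            seen.add(name)
        source = entry.get("source")
        if not isinstance(source, str) or source not in KNOWN_SOURCES:
            problems.append(f"entry {i}: unknown source {source!r}")
        if "config" in entry and not isinstance(entry["config"], dict):
            problems.append(f"entry {i}: 'config' must be an object")
    return problems


print(check_schema([{"field": "id", "source": "uuid"}]))
```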

Available Data Sources

Random Generators

randomBool - Boolean true/false

{"field": "active", "source": "randomBool"}

randomEnum - Random value from list

{"field": "status", "source": "randomEnum", "config": {"values": ["pending", "active", "inactive"]}}

randomAscii - Random ASCII string

{"field": "code", "source": "randomAscii", "config": {"length": 10}}

randomUniformInt - Uniform integer distribution

{"field": "age", "source": "randomUniformInt", "config": {"min": 18, "max": 65}}

randomNormal - Normal/Gaussian distribution

{"field": "score", "source": "randomNormal", "config": {"mean": 100, "std": 15}}

randomPoisson - Poisson distribution

{"field": "events", "source": "randomPoisson", "config": {"lambda": 5}}

randomDatetime - Random date/time

{"field": "created_at", "source": "randomDatetime", "config": {"min": "2024-01-01 00:00:00", "max": "2024-12-31 23:59:59", "tz": "UTC"}}

randomDate - Random date

{"field": "birth_date", "source": "randomDate", "config": {"min": "1980-01-01", "max": "2005-12-31"}}

randomTime - Random time

{"field": "meeting_time", "source": "randomTime", "config": {"min": "09:00:00", "max": "17:00:00"}}
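
To get a feel for what the three numeric distributions produce before choosing config values, they can be approximated with Python's standard library. This is an illustrative sketch only, not how dct implements them; in particular, whether dct treats `max` as inclusive is an assumption here, and the function names are hypothetical.

```python
import math
import random


def uniform_int(rng, lo, hi):
    # Inclusive on both ends here; dct's handling of `max` may differ.
    return rng.randint(lo, hi)


def normal(rng, mean, std):
    # Unbounded: values far from the mean are rare but possible.
    return rng.gauss(mean, std)


def poisson(rng, lam):
    # Knuth's multiplication method; adequate for small lambda.
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


rng = random.Random(42)
ages = [uniform_int(rng, 18, 65) for _ in range(1000)]
scores = [normal(rng, 100, 15) for _ in range(1000)]
events = [poisson(rng, 5) for _ in range(1000)]

print("age range:", min(ages), max(ages))
print("mean score:", round(sum(scores) / len(scores), 1))
print("mean events:", round(sum(events) / len(events), 1))
```

Because normal draws are unbounded and fractional, a field such as an age or a count is usually better served by `randomUniformInt` or `randomPoisson`.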

Data Generators

uuid - UUID v4
```
{"field": "id", "source": "uuid"}
```

firstNames - Random first names

{"field": "first_name", "source": "firstNames"}

lastNames - Random last names

{"field": "last_name", "source": "lastNames"}

companies - Company names

{"field": "company", "source": "companies"}

emails - Email addresses
```
{"field": "email", "source": "emails"}
```

Derived Fields

Create computed fields using the Expr language:

{
  "field": "full_name",
  "source": "derived",
  "config": {
    "fields": ["first_name", "last_name"],
    "expression": "first_name + ' ' + last_name"
  }
}

Complex expressions:

{
  "field": "display_name",
  "source": "derived",
  "config": {
    "fields": ["first_name", "last_name", "company"],
    "expression": "first_name + ' ' + last_name + ' (' + company + ')'"
  }
}
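
A derived field is only meaningful if every name in its `fields` list is defined elsewhere in the schema. The Python sketch below checks that, under the assumption that inputs should appear earlier in the array than the derived field. Whether dct requires that ordering is not stated in this document, and `missing_dependencies` is a hypothetical helper.

```python
def missing_dependencies(schema):
    """Map each derived field to the input fields not defined before it.

    Assumes inputs should appear earlier in the schema; whether dct
    requires that ordering is not stated in this document.
    """
    defined = set()
    missing = {}
    for entry in schema:
        if entry.get("source") == "derived":
            inputs = entry.get("config", {}).get("fields", [])
            absent = [name for name in inputs if name not in defined]
            if absent:
                missing[entry["field"]] = absent
        defined.add(entry["field"])
    return missing


schema = [
    {"field": "first_name", "source": "firstNames"},
    {"field": "last_name", "source": "lastNames"},
    {
        "field": "full_name",
        "source": "derived",
        "config": {
            "fields": ["first_name", "last_name"],
            "expression": "first_name + ' ' + last_name",
        },
    },
]
print(missing_dependencies(schema))
```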

Complete Schema Example

[
  {"field": "id", "source": "uuid"},
  {"field": "first_name", "source": "firstNames"},
  {"field": "last_name", "source": "lastNames"},
  {"field": "email", "source": "emails"},
  {"field": "age", "source": "randomUniformInt", "config": {"min": 18, "max": 65}},
  {"field": "department", "source": "randomEnum", "config": {"values": ["Engineering", "Sales", "Marketing", "HR"]}},
  {"field": "salary", "source": "randomNormal", "config": {"mean": 75000, "std": 15000}},
  {"field": "is_active", "source": "randomBool"},
  {
    "field": "full_name",
    "source": "derived",
    "config": {
      "fields": ["first_name", "last_name"],
      "expression": "first_name + ' ' + last_name"
    }
  }
]

Best Practices

Generate small samples first (-n 10) to verify the schema
Use derived fields to create realistic relationships
Use NDJSON format for nested/complex data
Save schemas to files for reuse
Use appropriate distributions for realistic data

Output Formats

CSV (default):

id,first_name,age
550e8400-e29b-41d4-a716-446655440000,John,34

NDJSON:

{"id":"550e8400-e29b-41d4-a716-446655440000","first_name":"John","age":34}
{"id":"550e8400-e29b-41d4-a716-446655440001","first_name":"Jane","age":28}
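
The two formats differ in how values arrive downstream: CSV cells are read back as strings, while NDJSON keeps JSON types. A short Python sketch using the sample rows above as inline text (in practice these would be read from the files written with `-o`):

```python
import csv
import io
import json

csv_text = "id,first_name,age\n550e8400-e29b-41d4-a716-446655440000,John,34\n"
ndjson_text = (
    '{"id":"550e8400-e29b-41d4-a716-446655440000","first_name":"John","age":34}\n'
    '{"id":"550e8400-e29b-41d4-a716-446655440001","first_name":"Jane","age":28}\n'
)

# CSV: every value is a string and must be converted by the consumer.
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))

# NDJSON: one JSON object per line; numbers and booleans keep their types.
ndjson_rows = [json.loads(line) for line in ndjson_text.splitlines() if line.strip()]

print(type(csv_rows[0]["age"]).__name__)     # str
print(type(ndjson_rows[0]["age"]).__name__)  # int
```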

Related Skills

dct-peek: Verify generated data looks correct
dct-infer: Check schema of generated data
dct-diff: Compare generated data with production samples
