Ejecuta cualquier Skill en Manus
con un clic

Ejecuta cualquier Skill en Manus con un clic

$pwd:

data-migration

Name: Data Migration
Author: Datus-ai

// Activate when the gen_job agent detects that the source and target databases differ. Covers cross-database transfer lifecycle - type mapping via adapter Mixin hints, DDL generation, data transfer via transfer_query_result, and lightweight reconciliation.

Ejecutar en Manus

$ git log --oneline --stat

stars:1253

forks:200

updated:7 de mayo de 2026, 07:13

SKILL.md

readonly

name	data-migration
description	Activate when the gen_job agent detects that the source and target databases differ. Covers cross-database transfer lifecycle - type mapping via adapter Mixin hints, DDL generation, data transfer via transfer_query_result, and lightweight reconciliation.
tags	["data-engineering","migration","cross-database","etl","reconciliation"]
version	1.1.0
user_invocable	false
disable_model_invocation	false
allowed_agents	["gen_job"]

Data Migration

Use this skill when the gen_job agent is asked to move data between DIFFERENT database engines (any source → any target: e.g., DuckDB ↔ Greenplum, MySQL ↔ StarRocks, Postgres ↔ ClickHouse, Hive → Iceberg). When source_database == target_database the job stays inside gen_job and should use the gen-table path for intra-database ETL; table-validation is an automatic validator (kind: validator) invoked by ValidationHook.on_end for schema-contract checks, not a selectable workflow step.

This skill covers the full lifecycle: source inspection, target DDL generation (using adapter Mixin hints), data transfer, and post-transfer reconciliation.

When to use this skill

Activate when you need to:

Migrate a single table from one database to another
Rebuild a target table with correct cross-database type mapping
Transfer data across database engines
Reconcile source vs target after migration

Prerequisites

Both source and target databases must be configured and accessible
Source database must support pandas query execution
Target database must support DDL and DML operations

Core workflow

Phase 1: Inspect Source

Use describe_table(database=source) to get source schema
Use read_query(database=source) to get row count and sample data
Identify column types, nullable columns, and primary key candidates
Document the source schema for DDL generation

Phase 2: Inspect Target

Check if target schema/database exists using list_tables(database=target)
If target table already exists, use describe_table(database=target) to compare
Determine whether to create new or replace existing

Phase 3: Build Target DDL (dialect-neutral)

Call get_migration_capabilities(database=target) to read the target dialect's hard requirements, forbids, type_hints, and a reference example_ddl.
- If the result reports supported == false, the target adapter has not implemented migration hints. Proceed in pure-LLM mode, relying on your own knowledge of that dialect.
For OLAP-like targets (dialect_family indicates OLAP) call suggest_table_layout(database=target, columns_json=...) to get distribution / partition / order-by hints.
Map source types → target types guided by type_hints. When ambiguous, prefer widening over narrowing.
Draft the CREATE TABLE DDL.
Call validate_ddl(database=target, ddl=<draft>, target_table=<name>). Iterate until errors == [].
Execute the DDL with execute_ddl(sql, database=target).

Phase 4: Transfer Data

Use transfer_query_result(source_sql, source_database, target_table, target_database, mode) to move data
For fresh migration use mode='replace' (creates the target table if missing; otherwise truncates it first)
For incremental load use mode='append'
Verify the transfer result (rows_transferred count)

Phase 5: Reconcile

Cross-database transfer sanity checks are normally driven by the transfer-reconciliation validator skill via ValidationHook at the end of the agent run — provided agent.validation.skill_validators_enabled is on (the default). When the validator is enabled, focus on correct transfer execution and let the hook compare tool-reported row counts plus a small target-side sanity query; if it reports blocking failures they will be injected back into this conversation so you can fix the transfer and retry.

When skill validators are disabled, the hook cannot reconcile for you. In that case you MUST compare the transfer tool's source/target row counts and run at least one target-side sanity query before declaring the transfer done.

Phase 6: Report

Summarize migration results including:

Source and target table names
Rows transferred
Reconciliation summary with pass/fail status
Any issues or warnings

Critical rules

Always specify the database parameter explicitly in every tool call
Source database is read-only: never execute DDL or write operations against it
Target database is write-target: all DDL and writes go here
Reconciliation is mandatory: never skip the row-count and target sanity checks after data transfer
Report reconciliation results even if some checks fail
NEVER fall back to a different database if the target is unavailable — STOP and report the error
Before starting, verify both source and target show "available": true in list_databases() output

Execution checklist

Use this checklist as the in-skill runbook. Keep the default path lightweight; only run expensive reconciliation when the user or project-specific validator rules ask for it.

Pre-transfer

Confirm the source table exists and has the expected shape.
Document source columns, types, nullability, and the source row count reported or needed for the transfer.
Identify key columns only when they are needed for layout, deduplication, or optional reconciliation.
Confirm the target database / schema exists, or create it if the target connector supports schema creation and the user asked for it.
Verify target connectivity and write permissions before executing DDL.

DDL generation

Map source column types to the target dialect using adapter migration capabilities when available.
Handle unsupported complex types (for example LIST, STRUCT, MAP, BLOB) by either reporting a blocking issue or explicitly excluding / serializing them with user approval.
For StarRocks or OLAP-style targets, choose a safe key and distribution layout (DUPLICATE KEY, DISTRIBUTED BY HASH, buckets, partitions) using suggest_table_layout when available.
Generate the target CREATE TABLE DDL.
Validate the DDL, execute it on the target, then verify the target schema.

Data transfer

Execute transfer_query_result.
Verify the tool-reported rows_transferred and any reported source / target counts.
If the transfer fails partially, report rows written and the exact error; do not continue as if the migration succeeded.

Default reconciliation

Compare the transfer tool's source/target row counts when both are available.
Run one small target-side sanity query only when the validator hook is disabled or the transfer result is ambiguous.
Treat blocking mismatches as retryable transfer failures and fix them before reporting success.

Optional extended reconciliation

Do not run these by default on large systems. They can be added by project-level validator skills or used when the user explicitly requests a deep audit:

null ratio comparison for nullable columns
min/max comparison for numeric and date columns
distinct count comparison for key columns
duplicate-key checks on the target table
key-based sample diff
numeric aggregate comparison such as SUM or AVG

Reporting

Compile check results with pass/fail status.
Flag blocking issues separately from warnings.
Output the final migration summary with source, target, rows transferred, and reconciliation status.

Output expectations

At minimum, return:

source and target database/table names
rows transferred
reconciliation summary with pass/fail
blocking issues that must be addressed

related-skills.json

mismo repositorio

gen-table.md

from "Datus-ai/Datus-agent"

Create database tables from SQL (CTAS) or natural language descriptions

2026-05-281.3k

gen-metrics.md

from "Datus-ai/Datus-agent"

Generate MetricFlow metrics from natural language business descriptions

2026-05-211.3k

gen-semantic-model.md

from "Datus-ai/Datus-agent"

Generate MetricFlow semantic models from database tables with validation and Knowledge Base publishing

2026-05-201.3k

airflow-workflow.md

from "Datus-ai/Datus-agent"

Execution guide for Airflow scheduled jobs — troubleshooting, updating, conn_id conventions, and cron references

2026-04-281.3k

scheduler-validation.md

from "Datus-ai/Datus-agent"

Scheduler validator driven by ValidationHook — read-only static verification of scheduled jobs (schedule correctness, configuration, runtime context already collected by deterministic hook). Does not trigger test runs.

2026-04-281.3k

grafana-dashboard.md

from "Datus-ai/Datus-agent"

Create, view, and manage Grafana dashboards with panels and datasources

2026-04-281.3k

package.json

"author": "Datus-ai"

"repository": "Datus-ai/Datus-agent"

Abrir repositorio de GitHub Ver repositorios del creador

$ install --global

$ download --local

Ejecutar en Manus

$ useful --forSOC

Arquitectos de bases de datosOcupaciones informáticas y matemáticas15-1243L4

name	data-migration
description	Activate when the gen_job agent detects that the source and target databases differ. Covers cross-database transfer lifecycle - type mapping via adapter Mixin hints, DDL generation, data transfer via transfer_query_result, and lightweight reconciliation.
tags	["data-engineering","migration","cross-database","etl","reconciliation"]
version	1.1.0
user_invocable	false
disable_model_invocation	false
allowed_agents	["gen_job"]

Data Migration

This skill covers the full lifecycle: source inspection, target DDL generation (using adapter Mixin hints), data transfer, and post-transfer reconciliation.

When to use this skill

Activate when you need to:

Migrate a single table from one database to another
Rebuild a target table with correct cross-database type mapping
Transfer data across database engines
Reconcile source vs target after migration

Prerequisites

Both source and target databases must be configured and accessible
Source database must support pandas query execution
Target database must support DDL and DML operations

Core workflow

Phase 1: Inspect Source

Use describe_table(database=source) to get source schema
Use read_query(database=source) to get row count and sample data
Identify column types, nullable columns, and primary key candidates
Document the source schema for DDL generation

Phase 2: Inspect Target

Check if target schema/database exists using list_tables(database=target)
If target table already exists, use describe_table(database=target) to compare
Determine whether to create new or replace existing

Phase 3: Build Target DDL (dialect-neutral)

Call get_migration_capabilities(database=target) to read the target dialect's hard requirements, forbids, type_hints, and a reference example_ddl.
- If the result reports supported == false, the target adapter has not implemented migration hints. Proceed in pure-LLM mode, relying on your own knowledge of that dialect.
For OLAP-like targets (dialect_family indicates OLAP) call suggest_table_layout(database=target, columns_json=...) to get distribution / partition / order-by hints.
Map source types → target types guided by type_hints. When ambiguous, prefer widening over narrowing.
Draft the CREATE TABLE DDL.
Call validate_ddl(database=target, ddl=<draft>, target_table=<name>). Iterate until errors == [].
Execute the DDL with execute_ddl(sql, database=target).

Phase 4: Transfer Data

Use transfer_query_result(source_sql, source_database, target_table, target_database, mode) to move data
For fresh migration use mode='replace' (creates the target table if missing; otherwise truncates it first)
For incremental load use mode='append'
Verify the transfer result (rows_transferred count)

Phase 5: Reconcile

Phase 6: Report

Summarize migration results including:

Source and target table names
Rows transferred
Reconciliation summary with pass/fail status
Any issues or warnings

Critical rules

Always specify the database parameter explicitly in every tool call
Source database is read-only: never execute DDL or write operations against it
Target database is write-target: all DDL and writes go here
Reconciliation is mandatory: never skip the row-count and target sanity checks after data transfer
Report reconciliation results even if some checks fail
NEVER fall back to a different database if the target is unavailable — STOP and report the error
Before starting, verify both source and target show "available": true in list_databases() output

Execution checklist

Use this checklist as the in-skill runbook. Keep the default path lightweight; only run expensive reconciliation when the user or project-specific validator rules ask for it.

Pre-transfer

Confirm the source table exists and has the expected shape.
Document source columns, types, nullability, and the source row count reported or needed for the transfer.
Identify key columns only when they are needed for layout, deduplication, or optional reconciliation.
Confirm the target database / schema exists, or create it if the target connector supports schema creation and the user asked for it.
Verify target connectivity and write permissions before executing DDL.

DDL generation

Map source column types to the target dialect using adapter migration capabilities when available.
Handle unsupported complex types (for example LIST, STRUCT, MAP, BLOB) by either reporting a blocking issue or explicitly excluding / serializing them with user approval.
For StarRocks or OLAP-style targets, choose a safe key and distribution layout (DUPLICATE KEY, DISTRIBUTED BY HASH, buckets, partitions) using suggest_table_layout when available.
Generate the target CREATE TABLE DDL.
Validate the DDL, execute it on the target, then verify the target schema.

Data transfer

Execute transfer_query_result.
Verify the tool-reported rows_transferred and any reported source / target counts.
If the transfer fails partially, report rows written and the exact error; do not continue as if the migration succeeded.

Default reconciliation

Compare the transfer tool's source/target row counts when both are available.
Run one small target-side sanity query only when the validator hook is disabled or the transfer result is ambiguous.
Treat blocking mismatches as retryable transfer failures and fix them before reporting success.

Optional extended reconciliation

Do not run these by default on large systems. They can be added by project-level validator skills or used when the user explicitly requests a deep audit:

null ratio comparison for nullable columns
min/max comparison for numeric and date columns
distinct count comparison for key columns
duplicate-key checks on the target table
key-based sample diff
numeric aggregate comparison such as SUM or AVG

Reporting

Compile check results with pass/fail status.
Flag blocking issues separately from warnings.
Output the final migration summary with source, target, rows transferred, and reconciliation status.

Output expectations

At minimum, return:

source and target database/table names
rows transferred
reconciliation summary with pass/fail
blocking issues that must be addressed

data-migration

Data Migration

When to use this skill

Prerequisites

Core workflow

Phase 1: Inspect Source

Phase 2: Inspect Target

Phase 3: Build Target DDL (dialect-neutral)

Phase 4: Transfer Data

Phase 5: Reconcile

Phase 6: Report

Critical rules

Execution checklist

Pre-transfer

DDL generation

Data transfer

Default reconciliation

Optional extended reconciliation

Reporting

Output expectations

Más de este repositorio

Data Migration

When to use this skill

Prerequisites

Core workflow

Phase 1: Inspect Source

Phase 2: Inspect Target

Phase 3: Build Target DDL (dialect-neutral)

Phase 4: Transfer Data

Phase 5: Reconcile

Phase 6: Report

Critical rules

Execution checklist

Pre-transfer

DDL generation

Data transfer

Default reconciliation

Optional extended reconciliation

Reporting

Output expectations

Más de este repositorio