lakehouse-doc-en
clickzetta/clickzetta-skills
Singdata Lakehouse official documentation knowledge base (English). Consult references/ when writing SQL or answering questions about query syntax, functions, data types, DDL/DML, dynamic tables, permissions, vclusters, data lake, AI functions, and other Lakehouse topics.
clickzetta-oss-ingest-pipeline
clickzetta/clickzetta-skills
Build ClickZetta object storage (OSS/S3/COS) data ingestion pipelines, covering both continuous
ingestion (PIPE) and one-time batch import scenarios. Continuous ingestion supports LIST_PURGE
scan mode and EVENT_NOTIFICATION message notification mode; batch import supports Volume + INSERT
INTO and Volume + COPY INTO methods. Triggered when the user says "object storage import", "OSS data
pipeline", "S3 data import", "PIPE continuous ingestion", "auto file loading", "bucket data sync",
"COS import", "batch import from OSS", "load data from OSS", "Volume import".
Includes PIPE continuous ingestion (two INGEST_MODEs), batch import (Volume + COPY/INSERT),
Connection/Volume creation, monitoring and management — all ClickZetta-specific logic.
Keywords: OSS, S3, COS, object storage, PIPE, COPY INTO, file ingestion
clickzetta-batch-sync-pipeline
clickzetta/clickzetta-skills
Create and manage ClickZetta Lakehouse batch sync tasks, supporting both single-table and multi-table modes.
Single-table mode is suitable for simple source-to-target table sync; multi-table mode supports full database mirror,
multi-table mirror, and sharded table merge.
Triggered when the user says "batch sync", "offline sync", "sync database to Lakehouse", "full database migration",
"multi-table sync", "periodic sync", "scheduled data sync", "sharded table merge", "offline data migration".
Covers single-table/multi-table batch sync task creation, data source configuration, column mapping,
sync rules, scheduling, deployment, and task operations — all ClickZetta Studio specific logic.
Keywords: batch sync, offline sync, full load, mirror, multi-table sync, scheduled sync
clickzetta-cdc-sync-pipeline
clickzetta/clickzetta-skills
Create and manage ClickZetta Lakehouse multi-table real-time sync (CDC) tasks, syncing entire MySQL / PostgreSQL
databases or multiple tables to Lakehouse in real time.
Supports three sync modes: full database mirror, multi-table mirror, and sharded table merge.
Uses the Binlog (MySQL) or WAL (PostgreSQL) to achieve end-to-end latency on the order of seconds, with two-phase sync: a full load followed by incremental sync.
Triggered when the user says "multi-table real-time sync", "full database sync", "database mirror",
"CDC full database", "multi-table CDC", "sharded table merge", "MySQL full database sync to Lakehouse",
"PostgreSQL full database sync", "multi-table realtime sync", "database migration",
"full load + incremental sync", "sync operations", "sync SOP", "sync alert configuration",
"Binlog position expired", "server-id conflict", "full re-sync", "add sync table".
Covers source database preparation (parameter configuration + permissions), selection among the three sync modes,
task creation and deployment, operations SOP (full re-sync/add table/
clickzetta-realtime-sync-pipeline
clickzetta/clickzetta-skills
Create and manage ClickZetta Lakehouse real-time sync tasks (single-table), syncing data from external sources
to Lakehouse in real time.
Supports Kafka, MySQL, PostgreSQL, and other systems as the source, with Lakehouse as the target.
Real-time sync tasks are continuously running streaming tasks — no scheduling required; they start running upon submission.
Triggered when the user says "Studio real-time sync", "realtime sync", "single-table CDC sync",
"real-time data sync", "Kafka real-time sync to Lakehouse", "MySQL single-table real-time sync",
"single-table real-time sync", "real-time data migration".
Covers real-time sync task creation, data source configuration, column mapping (including JSONPath computed columns),
deployment, and operations — all ClickZetta Studio specific logic.
Keywords: real-time sync, single table, Kafka source, MySQL source, streaming, CDC
clickzetta-studio-task-manager
clickzetta/clickzetta-skills
Manage ClickZetta Lakehouse Studio tasks, covering task type descriptions (batch sync/multi-table batch sync/
real-time sync/multi-table real-time sync/data development), task folder organization, task type differentiation,
cz-cli task command family, scheduling configuration, dependency management, and common issue troubleshooting.
Implements the "separation of DDL and pipeline management" engineering standard: DDL tasks as drafts,
ETL tasks with scheduling, Dynamic Tables with auto-refresh.
Triggered when the user says "create Studio task", "task folder", "task scheduling", "cz-cli task",
"task dependency", "task failed", "task status", "full database sync task", "ETL task orchestration",
"task management", "separation of DDL and pipeline", "DDL task", "scheduling DAG",
"Studio task", "batch sync", "real-time sync", "multi-table real-time sync", "data development task",
"task types", "which sync to choose", "sync task differences".
Keywords: Studio task, task management, cz-cli task, scheduli