Singdata Lakehouse official documentation knowledge base (English). Consult references/ when writing SQL or answering questions about query syntax, functions, data types, DDL/DML, dynamic tables, permissions, vclusters, data lake, AI functions, and other Lakehouse topics.
lakehouse-doc-en
Singdata Lakehouse official documentation (English). Locate docs by filename under references/ based on the user's question.
Singdata Lakehouse is a fully managed lakehouse architecture platform built from the ground up on cloud-native design principles. Through storage-compute separation, Serverless elastic architecture, open storage formats, and AI-optimized tools, it provides enterprises with a unified platform for data warehousing, data lakes, real-time processing, and BI reporting.
Quick Start
Overview: Introduces the storage-compute separation architecture, Serverless computing, open data formats, and key application scenarios of Singdata Lakehouse.
Key Concepts: Explains the key concepts and terminology used throughout Singdata Lakehouse.
Tutorials: Walks through the complete workflow from data ingestion and SQL querying to data visualization, covering steps from data access to analysis and presentation.
User Guide
Studio: A web interface for data development and management, supporting data source connections, SQL querying, job orchestration, result visualization, and catalog browsing.
Object Model: Introduces the core concepts of the Singdata Lakehouse object model, including definitions and hierarchical relationships of catalogs, databases, tables, views, materialized views, functions, and shares.
Data Ingestion: Import data from local files, databases, Kafka, and other sources, covering core concepts, configuration steps, and operational examples.
Data Transformation: Covers core concepts, key configurations, typical operational steps, examples, and considerations around data transformation.
Data Analysis: A complete workflow guide from data import and SQL querying to visual analysis, covering data source connections, SQL syntax, function usage, and result export.
Security: Provides security features including user management, permission control, and audit logging, covering specific configuration methods for user creation, role authorization, data access policies, and operational auditing.
Data Sharing: Covers core concepts, key configurations, typical operational steps, examples, and considerations around data sharing.
Private Link: Enables private network access across VPCs, or from an on-premises IDC to cloud services, by configuring endpoint services and private links.
Benchmark: Covers core concepts, key configurations, typical operational steps, examples, and considerations around performance testing.
Ecosystem Tools: Covers core concepts, key configurations, typical operational steps, examples, and considerations around ecosystem tools.
Insight: Connects to Singdata Lakehouse data sources, creates datasets, and generates BI reports and dashboards via drag-and-drop for self-service data analysis and visualization.
SQL Reference
SQL Commands: Provides complete syntax references for DDL, DML, DQL, and other SQL commands, including specific parameters and usage examples for statements like CREATE, SELECT, INSERT, and more.
Data Types: Introduces the specific data types supported by Singdata Lakehouse, including exact numerics, floating-point numbers, strings, datetime, booleans, and their definitions.
SQL Functions: Covers core concepts, key configurations, typical operational steps, examples, and considerations around SQL functions.
SQL Usage Guide: Covers core concepts, key configurations, typical operational steps, examples, and considerations around SQL usage.
SDK Reference
Java SDK Reference: Covers core concepts, key configurations, typical operational steps, examples, and considerations around Java SDK.
Python SDK Reference: Covers core concepts, key configurations, typical operational steps, examples, and considerations around Python SDK.
Practice Tutorials
Efficiently Manage Objects and Organize Data: Create and manage data objects from sources such as object storage, databases, and data lakes; organize catalogs, set permissions, and configure lifecycle policies.
Data Import and Export Practice: Provides specific operational steps and examples for importing data from local files, databases, Kafka, and other sources, and exporting query results to files or databases.
Data Query and Analysis Practice: A complete workflow guide from data import and SQL querying to visual analysis, covering multiple data sources such as local files, databases, and Kafka.
Build and Operate ELT Pipelines Practice: Build enterprise-grade ELT pipelines using scheduling tools, data quality monitoring, and task orchestration, covering the full lifecycle of development, testing, deployment, and failure recovery.
Optimize Computing Resources: Introduces specific methods for optimizing computing resource usage by adjusting compute group configurations, setting elastic scaling policies, and using resource monitoring.
Performance Experience: Provides performance testing methods, optimization suggestions, and monitoring metrics, covering specific operations for query acceleration, resource tuning, and bottleneck diagnosis.
Build Modern Data Stack: Introduces the core components and architectural patterns of the modern data stack, including selection and integration of key stages such as data integration, transformation, storage, analysis, and visualization.
AI Application Development: Provides a complete guide and toolchain for the AI application development process, from data preparation and model training to service deployment.
Security and Compliance Audit: Provides operational methods and parameter descriptions for user permission management, SQL audit logging, data masking policies, and compliance configuration.
Usage and Cost Management: View detailed usage, cost breakdown, and billing models for Singdata Lakehouse, and manage costs through budgets and alerting.
Lakehouse AI
Lakehouse AI Overview: Integrates unstructured data management, AI external functions, multimodal retrieval, Python development framework, and conversational analytics to achieve a closed loop from data to intelligent decision-making.
Data Preparation for AI: Combines vector search, full-text search, and structured data analysis to provide unified data services for AI applications such as RAG and recommendation systems.
AI Functions: Provides methods for creating and using AI functions in Singdata Lakehouse, supporting calls to external online AI services or offline model packages via Python/Java.
Lakehouse Python Development Framework (Zettapark): API reference for the Zettapark Lakehouse Python development framework, including classes, methods, and parameter descriptions for core modules such as DataFrame, SQL, and Catalog.
AI + BI Unified Workflow: Generate SQL queries, visual charts, and dashboards through natural language interaction, achieving an end-to-end workflow from data exploration to analysis and presentation.
AI Gateway: Manages API calls to multiple model providers through unified access, route distribution, load balancing, rate limiting, circuit breaking, caching, and monitoring.
Conversational Data Analytics (DataGPT): Directly generate SQL, execute queries, and obtain visual charts through natural language queries, without writing code.
Lakehouse MCP Server: Exposes data lakehouse capabilities (such as tables, views, and functions) to AI assistants like Claude through the Model Context Protocol, enabling natural language querying and analysis.
AI Ecosystem: Introduces how to integrate Lakehouse with mainstream AI frameworks (such as PyTorch, TensorFlow) and tools (such as MLflow, LangChain) to support model training, inference, and AI application development.
Product Updates
Release Notes: Introduces new features and optimizations for Singdata Lakehouse in areas such as data import, SQL querying, data lake acceleration, and stream processing.
Other
User Agreement: Defines the account registration qualifications, process, usage rules, and security requirements for Singdata's product services, and outlines the rights and obligations between users and Singdata.
Privacy Policy: Describes how the company collects, uses, and protects personal information, the rights users have, and how to contact the company.
Product Trial Agreement: Defines user responsibilities during the testing phase, company rights, intellectual property ownership, and disclaimers.