Review and optimize data storage, data models, and semantic models in Microsoft Fabric lakehouses. Covers Delta table optimization (V-Order, Z-Order, compaction, partitioning), data model anti-pattern detection, semantic model DAX review, relationship optimization, AI/agentic readiness (descriptions, synonyms, linguistic schema, display folders), workspace audit (Fabric best practices, dimensional modeling, Direct Lake), and actionable recommendation reports. Use when: optimization, performance, review, tuning, best practices, V-Order, Z-Order, partitioning, compaction, DAX optimization, data model review, AI readiness, agentic, NLQ, Copilot, natural language, workspace audit, Direct Lake, dimensional modeling, audit.
Create reviewable metadata proposals for Microsoft Fabric lakehouse tables and semantic models. Covers schema analysis, concise technical and business descriptions, glossary terms, classifications, Purview-like sensitivity label proposals, PII and sensitive data detection, relationship and lineage hints, data quality rules, critical data element (CDE) candidates, data products, JSON/PDF output, and steward review questions. Use when: metadata creation, data catalog, Purview mapping, glossary inference, data classification, sensitivity labels, stewardship review, governance metadata.
Generate and upload synthetic data to Microsoft Fabric lakehouse tables. Covers star-schema generation, realistic fake data (names, dates, IDs, transactions), configurable row counts, referential integrity between tables, Parquet export, and OneLake upload. Use when: synthetic data, test data, fake data, sample data, generate tables, populate lakehouse, seed data, mock data.
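
As a rough illustration of what "referential integrity between tables" means for generated data, the sketch below builds one dimension and one fact table in plain Python. It is not the skill's actual generator: the function name `generate_star_schema`, the table and column names, and the value ranges are all hypothetical, and Parquet export and OneLake upload are left out.

```python
import random
from datetime import date, timedelta


def generate_star_schema(n_customers=5, n_sales=20, seed=42):
    """Return (dim_customer, fact_sales) as lists of dicts (hypothetical tables)."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same rows

    # Dimension table: one row per customer, keyed by customer_id.
    first_names = ["Ana", "Luis", "Marta", "Jorge", "Elena"]
    dim_customer = [
        {
            "customer_id": i,
            "name": rng.choice(first_names),
            "segment": rng.choice(["retail", "business"]),
        }
        for i in range(1, n_customers + 1)
    ]

    # Fact table: every foreign key is drawn from the existing dimension keys,
    # which is what keeps the two tables referentially consistent.
    customer_ids = [row["customer_id"] for row in dim_customer]
    start = date(2024, 1, 1)
    fact_sales = [
        {
            "sale_id": i,
            "customer_id": rng.choice(customer_ids),
            "sale_date": start + timedelta(days=rng.randrange(365)),
            "amount": round(rng.uniform(5, 500), 2),
        }
        for i in range(1, n_sales + 1)
    ]
    return dim_customer, fact_sales
```

Row counts are configurable through `n_customers` and `n_sales`; drawing fact keys only from the dimension's key list is the design choice that guarantees no orphaned rows.
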
Algorithm reference and self-contained PySpark notebook templates for cleaning Microsoft Fabric lakehouse tables. Covers profiling, duplicate detection, null analysis, type validation, statistics, IQR outlier detection, date format validation, Spanish DNI/NIE checksum validation, and email/phone format checks. Each notebook can be uploaded and run independently in Fabric.
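
Two of the checks named above can be sketched in plain Python, as a reference for the logic rather than a copy of the PySpark notebook templates. The function names `is_valid_dni_nie` and `iqr_outliers` are hypothetical; the DNI/NIE rule used here is the standard modulo-23 control letter, and the IQR fences assume the conventional 1.5 multiplier with inclusive quartiles.

```python
import re
import statistics

# Control letters indexed by (number mod 23).
DNI_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"
# A NIE's leading letter is replaced by a digit before the checksum.
NIE_PREFIX = {"X": "0", "Y": "1", "Z": "2"}


def is_valid_dni_nie(value):
    """Return True if value is a well-formed DNI or NIE with a correct control letter."""
    text = value.strip().upper()
    if re.fullmatch(r"[0-9]{8}[A-Z]", text):
        digits = text[:8]  # DNI: 8 digits plus letter
    elif re.fullmatch(r"[XYZ][0-9]{7}[A-Z]", text):
        digits = NIE_PREFIX[text[0]] + text[1:8]  # NIE: prefix plus 7 digits plus letter
    else:
        return False
    return DNI_LETTERS[int(digits) % 23] == text[-1]


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]
```

For example, `is_valid_dni_nie("12345678Z")` passes the checksum while `"12345678A"` does not, and `iqr_outliers([10, 12, 11, 13, 12, 11, 100])` flags only `100`. Quartile conventions differ between tools, so borderline values may be classified differently by a Spark-based implementation.
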
Infer and generate Power BI semantic models (TMDL format) from Microsoft Fabric lakehouse tables. Covers star-schema detection, table classification (fact/dimension), relationship inference, DAX measure generation, data type mapping, and deployment via REST API. Use when: create semantic model, build data model, star schema, Power BI model, TMDL, fact table, dimension table.
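
To make "relationship inference" and "table classification" concrete, here is a deliberately simple name-matching heuristic in plain Python. It is an assumption for illustration, not the skill's documented algorithm: `infer_star_schema` is a hypothetical name, each table's first column is assumed to be its primary key, and any table that references another table's key is treated as a fact.

```python
def infer_star_schema(schemas):
    """Classify tables and infer relationships from column names.

    schemas maps table name -> ordered list of column names.
    Assumption: the first column of each table is its primary key.
    """
    primary_keys = {table: cols[0] for table, cols in schemas.items()}

    # A relationship is inferred when a non-key column shares its name
    # with another table's primary key.
    relationships = []
    for table, cols in schemas.items():
        for col in cols[1:]:
            for other, pk in primary_keys.items():
                if other != table and col == pk:
                    relationships.append((table, col, other, pk))

    # Tables that point at other tables are treated as facts.
    fact_tables = {rel[0] for rel in relationships}
    classification = {
        table: "fact" if table in fact_tables else "dimension"
        for table in schemas
    }
    return classification, relationships
```

A heuristic this small will misclassify snowflaked dimensions (a dimension that references another dimension) and misses keys that are named differently on each side, so its output is best treated as a proposal to review before generating TMDL.
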