databricks-lakebase-autoscale

Name: Databricks Lakebase Autoscale
Author: databricks-solutions

Patterns and best practices for Lakebase Autoscaling (next-gen managed PostgreSQL). Use when creating or managing Lakebase Autoscaling projects, configuring autoscaling compute or scale-to-zero, working with database branching for dev/test workflows, implementing reverse ETL via synced tables, or connecting applications to Lakebase with OAuth credentials.

stars: 1,591

forks: 343

updated: May 11, 2026 at 09:14

Lakebase Autoscaling

Lakebase Autoscaling is Databricks' next-generation managed PostgreSQL service for OLTP workloads: autoscaling compute, database branching, scale-to-zero, instant restore, and Delta-to-Postgres synced tables.

Use this skill when creating/managing Lakebase Autoscaling projects, branches, endpoints/computes, credentials, reverse ETL synced tables, or app connections.

Core framing

There is no separate Python “Lakebase SDK.” Use databricks-sdk for management and for minting short-lived database credentials with WorkspaceClient().postgres.generate_database_credential(...); use standard Postgres drivers (psycopg, SQLAlchemy, JDBC, pgx, etc.) for SQL.

| Language | Credential / management SDK | DB driver / wrapper |
| --- | --- | --- |
| Python | `databricks-sdk` `WorkspaceClient().postgres` | `psycopg[binary,pool]` canonical; SQLAlchemy supported |
| Node/TS | `@databricks/lakebase` convenience wrapper, Autoscaling only | Wrapper manages `pg` pool |
| Java/Go | Databricks SDK for Java/Go | Standard JDBC / `pgx` |

Lead connection pattern

For production Python apps, start with:

psycopg_pool.ConnectionPool
connection_class=OAuthConnection, where OAuthConnection subclasses psycopg.Connection and overrides connect() to call w.postgres.generate_database_credential(endpoint=...)
max_lifetime=2700

This is the canonical pattern from the official Databricks Apps + Lakebase Autoscaling tutorial lineage and databricks-ai-bridge: no background token thread; physical connections get fresh credentials when opened/recycled.

Prefer max_lifetime=2700 as a defensive 45-minute recycle before 1-hour token expiry. The official tutorial does not set max_lifetime; databricks-ai-bridge uses 2700.
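The three settings above might fit together as in the sketch below. It assumes `databricks-sdk`, `psycopg`, and `psycopg_pool` are installed and that workspace authentication is already configured. The helper name `make_oauth_pool`, its parameters, and the pool sizes are hypothetical and not taken from the tutorial. Imports are deferred into the function so the constants can be inspected without those packages.

```python
TOKEN_TTL_SECONDS = 60 * 60          # 1-hour credential lifetime, per this skill
POOL_MAX_LIFETIME_SECONDS = 45 * 60  # 2700 s: recycle before the token expires


def make_oauth_pool(endpoint_name, host, dbname, user, min_size=1, max_size=5):
    """Build a psycopg pool whose physical connections mint fresh credentials.

    All arguments are caller-supplied; min_size/max_size are illustrative only.
    """
    import psycopg
    from psycopg_pool import ConnectionPool
    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()

    class OAuthConnection(psycopg.Connection):
        @classmethod
        def connect(cls, conninfo="", **kwargs):
            # Runs each time the pool opens or recycles a physical connection,
            # so no background token-refresh thread is needed.
            cred = w.postgres.generate_database_credential(endpoint=endpoint_name)
            kwargs["password"] = cred.token
            return super().connect(conninfo, **kwargs)

    return ConnectionPool(
        connection_class=OAuthConnection,
        kwargs={"host": host, "dbname": dbname, "user": user, "sslmode": "require"},
        min_size=min_size,
        max_size=max_size,
        max_lifetime=POOL_MAX_LIFETIME_SECONDS,
        open=True,
    )
```

A caller would then borrow connections with `with pool.connection() as conn: conn.execute("SELECT 1")`. Because the credential is minted inside `connect()`, recycling a connection at 2700 seconds is what keeps the pool ahead of token expiry.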

See connections.md.

Critical auth warning

Do not use WorkspaceClient().config.token, w.config.oauth_token().access_token, or any workspace-scoped OAuth token as the Postgres password. It will fail at Postgres login.

Use:

from databricks.sdk import WorkspaceClient

cred = WorkspaceClient().postgres.generate_database_credential(endpoint=endpoint_name)
password = cred.token  # Lakebase-scoped; use this as the Postgres password

That token is Lakebase-scoped and is used as the Postgres password with sslmode=require.
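For a one-off connection (a notebook or a script, for example), the credential can be passed straight to the driver. The sketch below is illustrative rather than authoritative: `lakebase_connect_kwargs` and `connect_once` are hypothetical helpers, port 5432 is assumed as the standard Postgres port, and the `name=` keyword on `get_endpoint` is an assumption about the SDK signature. It requires `databricks-sdk` and `psycopg` only when `connect_once` is actually called.

```python
def lakebase_connect_kwargs(host, dbname, user, token, port=5432):
    """Keyword arguments for psycopg.connect(); token is the Lakebase-scoped credential."""
    return {
        "host": host,
        "port": port,
        "dbname": dbname,
        "user": user,
        "password": token,      # cred.token, never a workspace OAuth token
        "sslmode": "require",
    }


def connect_once(endpoint_name, user, dbname="databricks_postgres"):
    """Open a single connection with a freshly minted credential (hypothetical helper)."""
    import psycopg
    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()
    # Assumption: get_endpoint accepts the endpoint resource name as `name`.
    host = w.postgres.get_endpoint(name=endpoint_name).status.hosts.host
    cred = w.postgres.generate_database_credential(endpoint=endpoint_name)
    return psycopg.connect(**lakebase_connect_kwargs(host, dbname, user, cred.token))
```

Since the token is short-lived, a connection opened this way suits brief sessions; long-running apps are better served by the pooled pattern.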

Resource model

Project
  └── Branches
        ├── Endpoint/Compute: primary read-write endpoint
        ├── Read replicas: optional read-only endpoints
        ├── Roles
        └── Databases
              └── Schemas/Tables

Canonical names:

projects/{project_id}
projects/{project_id}/branches/{branch_id}
projects/{project_id}/branches/{branch_id}/endpoints/{endpoint_id}
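Because every resource is addressed by one of these path-style names, small helpers for building and parsing them can prevent typos. The functions below are a hypothetical convenience layer, not part of `databricks-sdk`, and the example IDs are made up.

```python
import re

_ENDPOINT_RE = re.compile(
    r"^projects/(?P<project_id>[^/]+)"
    r"/branches/(?P<branch_id>[^/]+)"
    r"/endpoints/(?P<endpoint_id>[^/]+)$"
)


def project_name(project_id):
    return f"projects/{project_id}"


def branch_name(project_id, branch_id):
    return f"{project_name(project_id)}/branches/{branch_id}"


def endpoint_name(project_id, branch_id, endpoint_id):
    return f"{branch_name(project_id, branch_id)}/endpoints/{endpoint_id}"


def parse_endpoint_name(name):
    """Split a full endpoint name into its IDs; raise ValueError if malformed."""
    match = _ENDPOINT_RE.match(name)
    if match is None:
        raise ValueError(f"not an endpoint resource name: {name!r}")
    return match.groupdict()


# Example with made-up IDs:
full = endpoint_name("my-app", "production", "primary")
print(full)
print(parse_endpoint_name(full))
```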

Defaults on project creation:

default branch: production
default database: databricks_postgres
primary read-write endpoint/compute
Postgres role for the creator’s Databricks identity

Key SDK namespace: WorkspaceClient().postgres.

Most create/update/delete calls return long-running operations; call .wait().

Lakebase Autoscaling vs Provisioned

| Aspect | Provisioned | Autoscaling |
| --- | --- | --- |
| SDK module | `w.database` | `w.postgres` |
| Top-level resource | Instance | Project |
| Capacity | fixed CU tiers, ~16 GB/CU | 0.5–112 CU, ~2 GB/CU |
| Branching | no | yes |
| Scale-to-zero | no | yes |
| Operations | mostly synchronous | LROs; use `.wait()` |
| Reverse ETL | synced tables | synced tables |
| Read replicas | readable secondaries | dedicated read-only endpoints |
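The Autoscaling capacity column combines two sizing rules that this skill lists: autoscaling computes span 0.5–32 CU with max - min <= 16, and fixed-size always-on computes span 40–112 CU. A pre-flight check along the following lines could catch an invalid request before it reaches the API. It is a sketch only: the function names are hypothetical, treating min == max as "fixed-size" is an assumption, and it does not verify which intermediate CU values the service actually accepts.

```python
AUTOSCALE_MIN_CU = 0.5
AUTOSCALE_MAX_CU = 32
AUTOSCALE_MAX_SPREAD = 16
FIXED_MIN_CU = 40
FIXED_MAX_CU = 112
GB_PER_CU = 2  # approximate RAM per Autoscaling CU


def compute_size_problems(min_cu, max_cu):
    """Return a list of human-readable problems; an empty list means the size looks valid."""
    # Assumption: a fixed-size compute is expressed as min == max in the 40-112 CU range.
    if min_cu == max_cu and FIXED_MIN_CU <= min_cu <= FIXED_MAX_CU:
        return []
    problems = []
    if min_cu > max_cu:
        problems.append("min CU exceeds max CU")
    if min_cu < AUTOSCALE_MIN_CU or max_cu > AUTOSCALE_MAX_CU:
        problems.append("autoscaling range must stay within 0.5-32 CU")
    if max_cu - min_cu > AUTOSCALE_MAX_SPREAD:
        problems.append("max - min must be <= 16 CU")
    return problems


def approx_ram_gb(cu):
    """Rough RAM estimate for an Autoscaling compute of the given size."""
    return cu * GB_PER_CU
```

For example, `compute_size_problems(1, 32)` reports the spread violation, while `compute_size_problems(0.5, 8)` returns an empty list and `approx_ram_gb(8)` estimates about 16 GB.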

Non-obvious facts to preserve

Postgres versions: 16 and 17.
AWS regions: us-east-1, us-east-2, eu-central-1, eu-west-1, eu-west-2, ap-south-1, ap-southeast-1, ap-southeast-2.
Azure beta regions: eastus2, westeurope, westus.
Autoscaling computes: 0.5–32 CU with max - min <= 16.
Fixed-size always-on computes: 40–112 CU.
Autoscaling CU ≈ 2 GB RAM.
sslmode=require on all driver connections.
Endpoint host comes from w.postgres.get_endpoint(...).status.hosts.host.
GET responses often return effective properties under status; create/update payloads use spec.
All update calls need a FieldMask.
Scale-to-zero wake-up is automatic but apps should retry.
Connections can be closed by platform timeouts: 24-hour idle timeout and 3-day max connection lifetime.
macOS DNS can fail on long Lakebase hostnames; if so, resolve to IP and pass both host and hostaddr to psycopg.
Triggered/Continuous synced tables require Delta Change Data Feed.
Reverse ETL is Delta-to-Postgres only; not Postgres-to-Delta.
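The advice that apps should retry during scale-to-zero wake-up can be sketched as a small backoff wrapper. Everything here is illustrative: `connect_with_retry` is a hypothetical helper, the attempt count and delays are arbitrary, and the caller decides which exceptions count as retryable (with psycopg that would plausibly be `psycopg.OperationalError`).

```python
import time


def connect_with_retry(connect, retryable, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call connect() until it succeeds, backing off exponentially between tries.

    connect:   zero-argument callable that opens and returns a connection
    retryable: tuple of exception types worth retrying
    sleep:     injectable for tests; defaults to time.sleep
    """
    for attempt in range(attempts):
        try:
            return connect()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 0.5 s, 1 s, 2 s, ...
```

A caller might wrap whatever connection function it already has, for instance `connect_with_retry(lambda: pool.getconn(), (psycopg.OperationalError,))`. The same wrapper also helps after the platform closes idle or long-lived connections.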

Task files

connections.md — app/notebook connection patterns and credential rotation.
operations.md — project, branch, endpoint/compute, scale-to-zero, limits, MCP mapping.
reverse-etl.md — synced tables from Delta Lake to Lakebase.

SDK / package versions

pip install -U "databricks-sdk>=0.81.0" "psycopg[binary,pool]>=3.1" "sqlalchemy>=2"

Use SQLAlchemy URL prefix postgresql+psycopg://... for psycopg3.
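A URL with that prefix might be assembled as follows. This is a stdlib-only sketch: `lakebase_sqlalchemy_url` is a hypothetical helper, port 5432 is assumed, and the user and token are percent-encoded because Databricks identities are often email addresses containing `@`.

```python
from urllib.parse import quote


def lakebase_sqlalchemy_url(host, user, token, dbname="databricks_postgres", port=5432):
    """Build a psycopg3 SQLAlchemy URL with sslmode=require."""
    return (
        f"postgresql+psycopg://{quote(user, safe='')}:{quote(token, safe='')}"
        f"@{host}:{port}/{quote(dbname, safe='')}?sslmode=require"
    )


# Made-up values for illustration only.
print(lakebase_sqlalchemy_url("db.example.com", "alice@example.com", "tok/en"))
```

The result can be passed to `sqlalchemy.create_engine(...)`. A token embedded in the URL is fixed at build time, so long-lived engines still need a credential rotation strategy such as those covered in connections.md.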

Current limitations

Not yet supported or not equivalent to Provisioned:

High availability with readable secondaries; use read replicas instead.
Databricks Apps UI integration may lag; Apps can connect manually via credentials/resource env vars.
Feature Store integration.
Stateful AI-agent memory integrations.
Postgres-to-Delta sync.
Custom billing tags / serverless budget policies.
Direct migration from Lakebase Provisioned; use pg_dump/pg_restore or reverse ETL patterns where appropriate.

related-skills.json

same repository

databricks-apps-python.md

from "databricks-solutions/ai-dev-kit"

Builds Databricks applications. Prefers AppKit (TypeScript + React SDK) for new apps; falls back to Python frameworks (Dash, Streamlit, Gradio, Flask, FastAPI, Reflex) when Python is required. Handles OAuth authorization, app resources, SQL warehouse and Lakebase connectivity, model serving, foundation model APIs, and deployment. Use when building web apps, dashboards, ML demos, or REST APIs for Databricks, or when the user mentions AppKit, Streamlit, Dash, Gradio, Flask, FastAPI, Reflex, or Databricks app.

2026-05-26 · 1.6k

databricks-bundles.md

from "databricks-solutions/ai-dev-kit"

Create and configure Declarative Automation Bundles (formerly Asset Bundles) with best practices for multi-environment deployments (CICD). Use when working with: (1) Creating new DAB projects, (2) Adding resources (dashboards, pipelines, jobs, alerts), (3) Configuring multi-environment deployments, (4) Setting up permissions, (5) Deploying or running bundle resources

2026-05-19 · 1.6k

databricks-lakebase-provisioned.md

from "databricks-solutions/ai-dev-kit"

Patterns and best practices for Lakebase Provisioned (Databricks managed PostgreSQL) for OLTP workloads. Use when creating Lakebase instances, connecting applications or Databricks Apps to PostgreSQL, implementing reverse ETL via synced tables, storing agent or chat memory, or configuring OAuth authentication for Lakebase.

2026-05-19 · 1.6k

skill-test.md

from "databricks-solutions/ai-dev-kit"

Testing framework for evaluating Databricks skills. Use when building test cases for skills, running skill evaluations, comparing skill versions, or creating ground truth datasets with the Generate-Review-Promote (GRP) pipeline. Triggers include "test skill", "evaluate skill", "skill regression", "ground truth", "GRP pipeline", "skill quality", and "skill metrics".

2026-05-19 · 1.6k

databricks-python-sdk.md

from "databricks-solutions/ai-dev-kit"

Databricks development guidance including Python SDK, Databricks Connect, CLI, and REST API. Use when working with databricks-sdk, databricks-connect, or Databricks APIs.

2026-05-15 · 1.6k

databricks-ai-functions.md

from "databricks-solutions/ai-dev-kit"

Use Databricks built-in AI Functions (ai_classify, ai_extract, ai_summarize, ai_mask, ai_translate, ai_fix_grammar, ai_gen, ai_analyze_sentiment, ai_similarity, ai_parse_document, ai_query, ai_forecast) to add AI capabilities directly to SQL and PySpark pipelines without managing model endpoints. Also covers document parsing and building custom RAG pipelines (parse → chunk → index → query).

2026-04-29 · 1.6k

package.json

"author": "databricks-solutions"

"repository": "databricks-solutions/ai-dev-kit"


