| name | polaris-catalog |
| description | Research-driven Apache Polaris catalog management. Injects research steps for catalog operations, namespaces, principals, roles, and access control. Use when working with Iceberg catalog management, metadata organization, or access governance. |
| allowed-tools | Read, Grep, Glob, Bash, WebSearch |
This skill does NOT prescribe specific catalog structures or access patterns. Instead, it guides you to:
ALWAYS run this first:
# Check if Polaris Python client is installed (v1.1.0+)
python -c "import polaris; print(f'Polaris {polaris.__version__}')" 2>/dev/null || echo "Polaris Python client not found"
# Check REST API availability (if running locally or remote)
curl -s http://localhost:8181/healthcheck || echo "Polaris API not reachable"
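The same two checks can be scripted so that later steps branch on the result. This is a minimal sketch using only the standard library; the helper names `module_available` and `endpoint_reachable` are hypothetical, and the module name and URL are the ones used in the commands above.

```python
import importlib.util
import urllib.request


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def endpoint_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError):
        # URLError, HTTPError, refused connections and timeouts are OSError
        # subclasses; a malformed URL raises ValueError.
        return False


if __name__ == "__main__":
    print("polaris module:", module_available("polaris"))
    print("health endpoint:", endpoint_reachable("http://localhost:8181/healthcheck"))
```

Both helpers return a boolean instead of raising, so a missing client or an unreachable server can be reported as a finding and does not abort the run.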
Critical Questions to Answer:
When to research: If you encounter unfamiliar Polaris features or need to validate patterns
Research queries (use WebSearch):
Official documentation: https://polaris.apache.org
Key documentation sections:
BEFORE creating new catalogs or roles, search for existing implementations:
# Find Polaris client usage
rg "polaris|Polaris" --type py
# Find catalog configurations
rg "catalog.*polaris|REST.*catalog" --type py --type yaml
# Find principal/role management
rg "principal|role|privilege" --type py
Key questions:
Check architecture docs for integration requirements:
/docs/ for catalog requirements and governance model
Core concept: Polaris organizes metadata into hierarchical entities
Entity hierarchy:
Polaris Instance
├── Catalogs (top-level, map to Iceberg catalogs)
│ ├── Namespaces (logical grouping within catalogs)
│ │ ├── Tables (Iceberg tables)
│ │ └── Views (Iceberg views)
│ └── Storage Configuration (S3, Azure, GCS)
├── Principals (users or services)
├── Principal Roles (labels assigned to principals)
└── Catalog Roles (privilege sets scoped to catalogs)
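To make the catalog branch of this hierarchy concrete, the sketch below models it with plain dataclasses. The class and field names are hypothetical and are not part of any Polaris client; the model only illustrates how a catalog, its namespace levels, and a table name combine into one identifier.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Namespace:
    levels: tuple[str, ...]  # e.g. ("analytics", "staging") for a nested namespace
    tables: list[str] = field(default_factory=list)
    views: list[str] = field(default_factory=list)


@dataclass
class Catalog:
    name: str
    storage_type: str  # "S3", "AZURE", or "GCS", as in the hierarchy above
    namespaces: list[Namespace] = field(default_factory=list)

    def table_identifiers(self) -> list[str]:
        """Fully qualified names: catalog, then namespace levels, then table."""
        return [
            ".".join((self.name, *ns.levels, table))
            for ns in self.namespaces
            for table in ns.tables
        ]


if __name__ == "__main__":
    catalog = Catalog("analytics", "S3", [Namespace(("staging",), tables=["orders"])])
    print(catalog.table_identifiers())
```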
Research questions:
Core concept: Catalogs are top-level containers for Iceberg metadata
Research questions:
SDK features to research:
Core concept: Namespaces are logical groupings within catalogs
Research questions:
SDK features to research:
Core concept: Access control via principals, principal roles, and catalog roles
Research questions:
SDK features to research:
Core concept: Multi-level access control via role-based permissions
Access control flow:
Principal → Principal Role → Catalog Role → Privileges → Entity
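The flow above can be read as three lookups chained together. The sketch below resolves a principal's effective privileges from hypothetical in-memory grant tables, reusing the role and privilege names from this skill's later example. It is an illustration of the model, not how Polaris stores or evaluates grants, and it ignores any inheritance of privileges down the entity tree.

```python
from __future__ import annotations

# Hypothetical grant tables; a real deployment keeps these inside Polaris.
PRINCIPAL_ROLES = {"data_engineer_service": {"data_engineer"}}
CATALOG_ROLES = {"data_engineer": {("analytics", "analytics_writer")}}
PRIVILEGES = {
    ("analytics", "analytics_writer"): {
        ("NAMESPACE_CREATE", "analytics"),
        ("TABLE_READ_DATA", "analytics.staging"),
        ("TABLE_WRITE_DATA", "analytics.staging"),
    }
}


def effective_privileges(principal: str) -> set[tuple[str, str]]:
    """Walk principal -> principal roles -> catalog roles -> (privilege, entity)."""
    granted: set[tuple[str, str]] = set()
    for principal_role in PRINCIPAL_ROLES.get(principal, set()):
        for catalog_role in CATALOG_ROLES.get(principal_role, set()):
            granted |= PRIVILEGES.get(catalog_role, set())
    return granted


def is_allowed(principal: str, privilege: str, entity: str) -> bool:
    """Exact-match check against the resolved grants."""
    return (privilege, entity) in effective_privileges(principal)
```

A principal with no principal role resolves to an empty set, which matches the idea that privileges are never attached to principals directly.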
Research questions:
SDK features to research:
TABLE_READ_DATA, TABLE_WRITE_DATA, NAMESPACE_CREATE, etc.
X-Iceberg-Access-Delegation header
Core concept: Polaris exposes management and catalog APIs via REST
Research questions:
SDK features to research:
When this skill is invoked, you should:
Verify runtime state (don't assume):
curl -s http://localhost:8181/healthcheck
python -c "import polaris; print(polaris.__version__)"
Discover existing patterns (don't invent):
rg "polaris" --type py --type yaml
Research when uncertain (don't guess):
Validate against architecture (don't assume requirements):
/docs/
Check PyIceberg integration (if applicable):
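One way to run the PyIceberg check is sketched below. It assumes PyIceberg's REST catalog properties (`uri`, `warehouse`, `credential`, `scope`, and `header.`-prefixed entries for extra HTTP headers) and assumes the Polaris catalog API is served under `/api/catalog`. The function names are hypothetical, and the catalog name and credentials are placeholders to replace with values from your environment.

```python
from __future__ import annotations


def polaris_catalog_properties(
    uri: str, warehouse: str, client_id: str, client_secret: str
) -> dict[str, str]:
    """Build PyIceberg REST catalog properties for a Polaris catalog."""
    return {
        "type": "rest",
        "uri": uri,
        "warehouse": warehouse,  # the Polaris catalog name
        "credential": f"{client_id}:{client_secret}",
        "scope": "PRINCIPAL_ROLE:ALL",
        # Ask the server to vend short-lived storage credentials.
        "header.X-Iceberg-Access-Delegation": "vended-credentials",
    }


def list_namespaces(properties: dict[str, str]) -> list[tuple[str, ...]]:
    """Connect and list namespaces; needs pyiceberg installed and a reachable server."""
    from pyiceberg.catalog import load_catalog  # imported lazily on purpose

    catalog = load_catalog("polaris", **properties)
    return catalog.list_namespaces()
```

Keeping the property builder separate from the network call lets the configuration be reviewed or logged (minus the secret) before anything connects.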
Use these WebSearch queries when encountering specific needs:
Key question: How does PyIceberg connect to Polaris catalogs?
Research areas:
X-Iceberg-Access-Delegation: vended-credentials)
Key question: How does floe-runtime configure Polaris catalogs?
Research areas:
Key question: How does Polaris enforce data governance?
Research areas:
# Run Polaris locally
docker run -d -p 8181:8181 \
--name polaris \
apache/polaris:latest
# Check health
curl http://localhost:8181/healthcheck
# Access Polaris UI (if available)
open http://localhost:8181
# Install Polaris CLI (if available)
pip install apache-polaris-cli
# List catalogs
polaris catalogs list
# Create catalog
polaris catalogs create my_catalog \
--storage-type S3 \
--default-base-location s3://my-bucket/data
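# Illustrative only: PolarisClient and the method names below are assumed, not confirmed against the published client
from polaris import PolarisClient
# Initialize client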
client = PolarisClient(
host="localhost:8181",
credentials={"client_id": "...", "client_secret": "..."}
)
# Create catalog
client.create_catalog(
name="my_catalog",
storage_type="S3",
properties={"default-base-location": "s3://my-bucket/data"}
)
# Create namespace
client.create_namespace(
catalog="my_catalog",
namespace=["analytics", "staging"]
)
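A nested namespace such as `["analytics", "staging"]` needs care when it appears in a REST path. The sketch below assumes the Iceberg REST convention of joining namespace levels with the unit separator byte (`0x1F`) and assumes Polaris uses the catalog name as the path prefix. The helper name `namespace_url` and the default base URL are hypothetical.

```python
from __future__ import annotations

from urllib.parse import quote

UNIT_SEPARATOR = "\x1f"  # joins namespace levels in Iceberg REST paths


def namespace_url(
    catalog: str,
    levels: list[str],
    base: str = "http://localhost:8181/api/catalog",
) -> str:
    """Build the URL for one (possibly nested) namespace in a catalog."""
    encoded_levels = quote(UNIT_SEPARATOR.join(levels), safe="")
    return f"{base}/v1/{quote(catalog, safe='')}/namespaces/{encoded_levels}"


if __name__ == "__main__":
    print(namespace_url("my_catalog", ["analytics", "staging"]))
```

Percent-encoding with `safe=""` also escapes any `/` or `.` inside a level name, so each level stays a single path component.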
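# Create catalog via REST API
# Assumes $TOKEN holds a bearer token for a principal allowed to create catalogs
# For S3, storageConfigInfo may also need a roleArn for the IAM role Polaris assumes
curl -X POST http://localhost:8181/api/management/v1/catalogs \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "catalog": {
      "name": "my_catalog",
      "type": "INTERNAL",
      "properties": {
        "default-base-location": "s3://my-bucket/data"
      },
      "storageConfigInfo": {
        "storageType": "S3",
        "allowedLocations": ["s3://my-bucket/data"]
      }
    }
  }'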
Principal: data_engineer_service
↓
Principal Role: data_engineer
↓
Catalog Role: analytics_writer
↓
Privileges:
- NAMESPACE_CREATE (on catalog 'analytics')
- TABLE_READ_DATA (on namespace 'analytics.staging')
- TABLE_WRITE_DATA (on namespace 'analytics.staging')
# Create principal
client.create_principal("data_engineer_service")
# Create principal role
client.create_principal_role("data_engineer")
# Assign principal role to principal
client.assign_principal_role("data_engineer_service", "data_engineer")
# Create catalog role
client.create_catalog_role(
catalog="analytics",
name="analytics_writer"
)
# Grant privileges to catalog role
client.grant_privilege(
catalog="analytics",
catalog_role="analytics_writer",
privilege="NAMESPACE_CREATE"
)
# Assign catalog role to principal role
client.assign_catalog_role(
principal_role="data_engineer",
catalog="analytics",
catalog_role="analytics_writer"
)
Remember: This skill provides research guidance, NOT prescriptive catalog structures. Always: