| name | klai-tenant-isolation-checks |
| description | Klai tenant-isolation pattern checks. Codifies the standards from the
audit-tenant-isolation-2026-05-05 fix cycle into reusable diff-time
checks. Used by /klai:tenant-review and the GitHub Actions workflow.
TRIGGER when reviewing a code diff that touches:
- Postgres models with tenant columns (org_id, tenant_id)
- Webhook or OAuth callback handlers
- Service-to-service calls (klai-portal → knowledge-ingest, retrieval-api, etc.)
- Qdrant search/scroll/upsert/delete
- FalkorDB / Graphiti operations
- Garage S3 image storage
- Redis cache with tenant-scoped keys
- Cross-org sites (lifespan, reapers, admin endpoints)
- Pydantic Settings with secret/token fields
- SOPS env-var changes
NOT for: greenfield architecture decisions (use klai-security-audit),
single-line typos, or non-Klai projects.
|
Klai Tenant-Isolation Checks
This skill codifies the patterns from
reports/audit-tenant-isolation-2026-05-05/standards.md as a reviewable
checklist. Use it when reviewing a code diff to catch tenant-isolation
regressions BEFORE they ship.
How to use
Given a diff (git diff main or PR diff), walk every changed line through
the relevant checks below. Each check has a hard-or-soft verdict:
- HARD: blocker, must be fixed before merge
- SOFT: flag for review, may be acceptable with explicit justification
Output format per finding:
[HARD|SOFT] file:line - <pattern violated>
Current: <code excerpt>
Standard: <link to standards.md section>
Suggestion: <concrete fix>
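To keep findings uniform across reviews, the four-line format above can be produced from a small record type. The following is a minimal sketch, not part of the skill itself: the `Finding` class, its field names, and the example file path and suggestion text are all hypothetical.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass(frozen=True)
class Finding:
    """One review finding; field names are illustrative, not from standards.md."""

    severity: Literal["HARD", "SOFT"]
    file: str
    line: int
    pattern: str
    current: str
    standard: str
    suggestion: str

    def render(self) -> str:
        # Emit the four lines of the per-finding output format.
        return "\n".join(
            [
                f"[{self.severity}] {self.file}:{self.line} - {self.pattern}",
                f"Current: {self.current}",
                f"Standard: {self.standard}",
                f"Suggestion: {self.suggestion}",
            ]
        )


# Hypothetical finding, for illustration only.
example = Finding(
    severity="HARD",
    file="app/models/widget.py",
    line=12,
    pattern="Postgres RLS coverage",
    current="org_id = Column(Integer)",
    standard="standards.md section 1",
    suggestion="add an RLS policy for the new table",
)
print(example.render())
```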
Check 1: Postgres RLS coverage (HARD)
For every new SQLAlchemy model with org_id/tenant_id/customer_id:
For every new SQL text() query against an RLS-protected table:
For every new op.create_table(...) in alembic:
Standards ref: standards.md sections 1, 2
Check 2: Session-helper discipline (HARD)
For every new AsyncSessionLocal() direct usage (no helper):
HARD: implicit cross-org via "no filter" is the bug class we're eliminating.
Standards ref: standards.md sections 3, 4
Check 3: Cat-A WITH CHECK discipline (HARD)
For every new RLS policy on tables in the auth/login path (portal_users,
portal_connectors, portal_join_requests, etc.):
Anti-pattern (Finding A-1): a FOR ALL policy without an explicit WITH CHECK
silently reuses USING, letting INSERTs land with any org_id.
Standards ref: standards.md section 2
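The anti-pattern is mechanical enough to detect in migration SQL. The sketch below is one possible diff-time scan, assuming policies are written as plain `CREATE POLICY ... ;` statements. The function name, the regexes, and the `app.current_org_id` setting in the usage example are assumptions for illustration and do not come from standards.md.

```python
import re

# One CREATE POLICY statement, up to its terminating semicolon.
_POLICY_RE = re.compile(
    r"CREATE\s+POLICY\s+(\S+)\s+ON\s+(\S+)(.*?);",
    re.IGNORECASE | re.DOTALL,
)
_COMMAND_RE = re.compile(r"\bFOR\s+(ALL|SELECT|INSERT|UPDATE|DELETE)\b", re.IGNORECASE)
_WITH_CHECK_RE = re.compile(r"\bWITH\s+CHECK\b", re.IGNORECASE)


def policies_missing_with_check(sql: str) -> list[str]:
    """Return 'name ON table' for ALL/UPDATE policies that omit WITH CHECK."""
    missing = []
    for name, table, body in _POLICY_RE.findall(sql):
        match = _COMMAND_RE.search(body)
        # PostgreSQL treats a policy without a FOR clause as FOR ALL.
        command = match.group(1).upper() if match else "ALL"
        if command in {"ALL", "UPDATE"} and not _WITH_CHECK_RE.search(body):
            missing.append(f"{name} ON {table}")
    return missing


# Hypothetical policy text; the setting name is an assumption.
BAD = """
CREATE POLICY tenant_isolation ON portal_users
  FOR ALL
  USING (org_id = current_setting('app.current_org_id')::int);
"""
GOOD = """
CREATE POLICY tenant_isolation ON portal_users
  FOR ALL
  USING (org_id = current_setting('app.current_org_id')::int)
  WITH CHECK (org_id = current_setting('app.current_org_id')::int);
"""
print(policies_missing_with_check(BAD))
print(policies_missing_with_check(GOOD))
```

A regex scan like this only catches literal SQL; policies assembled dynamically in Python would still need manual review.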
Check 4: _require_<X>_secret validators (HARD)
For every new pydantic Settings field that is:
- A webhook secret (*_webhook_secret, *_webhook_token)
- A service-to-service token (*_internal_secret, *_api_key)
- An encryption key (*_encryption_key, *_kek)
Must have:
Standards ref: standards.md section 5
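As an illustration only, a fail-closed validator in the spirit of the `_require_<X>_secret` naming might look like the sketch below. It uses a plain dataclass so that it runs without third-party packages; in a pydantic Settings class the same function would typically be attached through a field validator. The field name `example_webhook_secret` and the minimum length are assumptions, not requirements taken from standards.md.

```python
from dataclasses import dataclass

MIN_SECRET_LENGTH = 32  # assumed threshold, not taken from standards.md


def _require_webhook_secret(value: str) -> str:
    """Reject missing, blank, or too-short secrets so startup fails closed."""
    if not value or not value.strip():
        raise ValueError("webhook secret must be set")
    if len(value) < MIN_SECRET_LENGTH:
        raise ValueError(
            f"webhook secret must be at least {MIN_SECRET_LENGTH} characters"
        )
    return value


@dataclass(frozen=True)
class Settings:
    """Stand-in for a pydantic Settings class; the field name is hypothetical."""

    example_webhook_secret: str

    def __post_init__(self) -> None:
        # Validate at construction time, before any handler can run.
        _require_webhook_secret(self.example_webhook_secret)
```

The point of validating at construction is that a service with an empty secret refuses to start, rather than starting and silently accepting unauthenticated requests.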
Check 5: Webhook handler composite (HARD)
For every new endpoint with /webhook or /callback in path:
Standards ref: standards.md sections 6, 15
Check 6: Identity-assertion on internal endpoints (HARD)
For every new endpoint that:
- Reads org_id / tenant_id / user_id from request body OR query-param, AND
- Is auth-gated only by INTERNAL_SECRET middleware (not Zitadel JWT)
Must have:
Standards ref: standards.md section 7
Check 7: Qdrant filter-key discipline (HARD)
For every new client.search/scroll/retrieve/delete/upsert on Qdrant:
Standards ref: standards.md section 11
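One way to make the tenant filter hard to forget is to build every Qdrant filter through a single helper. The sketch below produces a filter in Qdrant's JSON form (`must` conditions with `key` and `match`). The payload key `org_id` and the helper name are assumptions; the actual filter key is defined in standards.md section 11.

```python
from typing import Any, Optional

TENANT_KEY = "org_id"  # assumed payload key; check standards.md section 11


def tenant_filter(
    org_id: str,
    extra_must: Optional[list[dict[str, Any]]] = None,
) -> dict[str, Any]:
    """Build a Qdrant filter that always includes the tenant condition."""
    if not org_id:
        # Fail closed: an empty tenant must never degrade to "no filter".
        raise ValueError("org_id is required for every Qdrant query")
    must: list[dict[str, Any]] = [{"key": TENANT_KEY, "match": {"value": org_id}}]
    must.extend(extra_must or [])
    return {"must": must}


print(tenant_filter("org-1", [{"key": "kind", "match": {"value": "doc"}}]))
```

In review, a call site that constructs a filter dictionary by hand instead of going through such a helper is a natural place to look for a missing tenant condition.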
Check 8: FalkorDB / Graphiti per-org isolation (HARD)
For every new Cypher query OR Graphiti search:
Standards ref: standards.md section 12
Check 9: Garage S3 access (SOFT after SPEC-TI-009 lands)
For every new Garage S3 read / write / presigned URL:
Standards ref: standards.md section 13
Check 10: Redis tenant-prefixing (HARD)
For every new redis.set/get/delete/scan/keys/lpush/...:
For every new tenant-scoped namespace:
Standards ref: standards.md section 14
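Tenant prefixing is easiest to review when keys are built in one place. The following is a minimal sketch of such a key builder. The `tenant:<org_id>:<namespace>:<key>` layout and the function name are hypothetical and may differ from the layout standards.md section 14 prescribes.

```python
def tenant_key(org_id: str, namespace: str, key: str) -> str:
    """Build a Redis key that is always scoped to one tenant.

    The layout is an illustrative assumption, not the project's actual scheme.
    """
    for label, part in (("org_id", org_id), ("namespace", namespace)):
        # Reject empty parts and characters that could break out of the
        # prefix or turn the key into a glob pattern for SCAN/KEYS.
        if not part or ":" in part or "*" in part:
            raise ValueError(f"invalid {label}: {part!r}")
    return f"tenant:{org_id}:{namespace}:{key}"


print(tenant_key("42", "session", "abc"))
```

With a helper like this, any raw string literal passed to a Redis call in a diff stands out as a candidate finding.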
Check 11: Multi-org user resolution (HARD)
For every new SELECT FROM portal_users WHERE zitadel_user_id = ...:
Without a rid filter, multi-org users get an arbitrary tenant (Finding A-12).
Standards ref: standards.md section 10
Check 12: Platform-admin gating (HARD)
For every new app/api/admin/*.py endpoint that takes a slug URL-param
that may identify a tenant DIFFERENT from the caller's own org:
Without platform-admin gating, any tenant-admin can act on any other tenant
(Finding C-2).
Standards ref: standards.md section 16
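The gating rule can be read as: acting on one's own org is allowed, acting on any other org requires platform-admin rights. A minimal sketch of that decision follows. The `Caller` type, the `is_platform_admin` flag, and the `Forbidden` exception are hypothetical names, and the real dependency in the codebase may be structured differently.

```python
from dataclasses import dataclass


class Forbidden(Exception):
    """Raised when a caller may not act on the requested tenant."""


@dataclass(frozen=True)
class Caller:
    # Both fields are illustrative; the real caller context may differ.
    org_slug: str
    is_platform_admin: bool


def authorize_slug(caller: Caller, target_slug: str) -> None:
    """Allow same-org access; require platform admin for any other org."""
    if target_slug == caller.org_slug:
        return
    if not caller.is_platform_admin:
        raise Forbidden(f"cross-tenant access to {target_slug!r} denied")


# A tenant admin acting on its own org passes without raising.
authorize_slug(Caller(org_slug="acme", is_platform_admin=False), "acme")
```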
Check 13: Constant-time secret compare (HARD)
For every new comparison involving a secret/token/signature:
Standards ref: standards.md section 15, pitfall non-constant-time-secret-compare
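For reference, a constant-time comparison in Python uses `hmac.compare_digest` from the standard library instead of `==`. The sketch below shows it for a shared token and for an HMAC-SHA256 signature. The helper names and the hex-encoded signature format are assumptions; actual webhook providers may encode signatures differently.

```python
import hashlib
import hmac


def tokens_match(provided: str, expected: str) -> bool:
    """Compare two tokens without leaking timing information."""
    if not provided or not expected:
        # Fail closed: an unset secret must never match an empty input.
        return False
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))


def signature_valid(body: bytes, signature_hex: str, secret: str) -> bool:
    """Verify a hex HMAC-SHA256 signature over the raw request body."""
    if not secret:
        return False
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(signature_hex.encode("utf-8"), expected.encode("utf-8"))
```

Encoding both sides to bytes before comparing avoids the `TypeError` that `hmac.compare_digest` raises for non-ASCII strings.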
Check 14: post_deploy SQL operator-step (SOFT)
For every new alembic migration that:
- Creates RLS policies, OR
- Drops a table owned by klai (not portal_api)
The PR body MUST include the operator-step:
ssh core-01 "docker exec -i klai-core-postgres-1 psql -U klai -d klai" < klai-portal/backend/alembic/versions/post_deploy_<rev>.sql
docker restart klai-core-<service>-1
Standards ref: standards.md section 8, pitfall alembic-cannot-drop-non-portal_api-tables
Check 15: Auto-migrate via entrypoint.sh (HARD)
For every new alembic migration in services that DON'T currently auto-migrate
(klai-mailer, klai-knowledge-mcp):
Without this, the migration ships in the image but never applies on prod
(per alembic-stamped-past-skipped-migration pitfall).
Services that already have auto-migrate (verified 2026-05-05):
portal-api, klai-connector, scribe-api, klai-knowledge-ingest.
Standards ref: standards.md section 9
Output template
When using this skill, structure the output as:
# Tenant-Isolation Review - <branch>
**Diff scope:** `git diff main` (N files, M lines)
## HARD findings (block merge)
[None] OR
1. **Check N - file:line - <title>**
- Current: ...
- Standard: standards.md §<n>
- Suggestion: ...
## SOFT findings (review)
[None] OR
1. ...
## Confidence
XX - <coverage of the diff, gaps>