| name | gcp-cloud-run |
| description | Specialized skill for building production-ready serverless applications on GCP. Covers Cloud Run services (containerized), Cloud Run Functions (event-driven), cold start optimization, and event-driven architecture with Pub/Sub. |
| risk | unknown |
| source | vibeship-spawner-skills (Apache 2.0) |
| date_added | "2026-02-27T00:00:00.000Z" |
GCP Cloud Run
Specialized skill for building production-ready serverless applications on GCP.
Covers Cloud Run services (containerized), Cloud Run Functions (event-driven),
cold start optimization, and event-driven architecture with Pub/Sub.
Principles
- Cloud Run for containers, Functions for simple event handlers
- Optimize for cold starts with startup CPU boost and min instances
- Set concurrency based on workload (the default is 80; adjust from there)
- Memory includes /tmp filesystem - plan accordingly
- Use VPC Connector only when needed (adds latency)
- Containers should start fast and be stateless
- Handle signals gracefully for clean shutdown
Patterns
Cloud Run Service Pattern
Containerized web service on Cloud Run
When to use: web applications and APIs; any runtime or library needed; complex services with multiple endpoints; stateless containerized workloads
# Dockerfile - Multi-stage build for smaller image
FROM node:20-slim AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
FROM node:20-slim
WORKDIR /app
# Copy only production dependencies
COPY --from=builder /app/node_modules ./node_modules
COPY src ./src
COPY package.json ./
# Cloud Run uses PORT env variable
ENV PORT=8080
EXPOSE 8080
# Run as non-root user
USER node
CMD ["node", "src/index.js"]
const express = require('express');
const app = express();
app.use(express.json());
app.get('/health', (req, res) => {
res.status(200).send('OK');
});
app.get('/api/items/:id', async (req, res) => {
try {
const item = await getItem(req.params.id);
res.json(item);
} catch (error) {
console.error('Error:', error);
res.status(500).json({ error: 'Internal server error' });
}
});
const PORT = process.env.PORT || 8080;
const server = app.listen(PORT, () => {
  console.log(`Server listening on port ${PORT}`);
});
process.on('SIGTERM', () => {
  console.log('SIGTERM received, shutting down gracefully');
  server.close(() => {
    console.log('Server closed');
    process.exit(0);
  });
});
steps:
- name: 'gcr.io/cloud-builders/docker'
args: ['build', '-t', 'gcr.io/$PROJECT_ID/my-service:$COMMIT_SHA', '.']
- name: 'gcr.io/cloud-builders/docker'
args: ['push', 'gcr.io/$PROJECT_ID/my-service:$COMMIT_SHA']
- name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
entrypoint: gcloud
args:
- 'run'
- 'deploy'
- 'my-service'
- '--image=gcr.io/$PROJECT_ID/my-service:$COMMIT_SHA'
- '--region=us-central1'
- '--platform=managed'
- '--allow-unauthenticated'
- '--memory=512Mi'
- '--cpu=1'
- '--min-instances=1'
- '--max-instances=100'
- '--concurrency=80'
- '--cpu-boost'
images:
- 'gcr.io/$PROJECT_ID/my-service:$COMMIT_SHA'
Structure
project/
├── Dockerfile
├── .dockerignore
├── src/
│ ├── index.js
│ └── routes/
├── package.json
└── cloudbuild.yaml
gcloud deploy
Direct gcloud deployment
gcloud run deploy my-service \
  --source . \
  --region us-central1 \
  --allow-unauthenticated \
  --memory 512Mi \
  --cpu 1 \
  --min-instances 1 \
  --max-instances 100 \
  --concurrency 80 \
  --cpu-boost
Cloud Run Functions Pattern
Event-driven functions (formerly Cloud Functions)
When to use: simple event handlers; Pub/Sub message processing; Cloud Storage triggers; HTTP webhooks
const functions = require('@google-cloud/functions-framework');
functions.http('helloHttp', (req, res) => {
const name = req.query.name || req.body.name || 'World';
res.send(`Hello, ${name}!`);
});
const functions = require('@google-cloud/functions-framework');
functions.cloudEvent('processPubSub', (cloudEvent) => {
const message = cloudEvent.data.message;
const data = message.data
? JSON.parse(Buffer.from(message.data, 'base64').toString())
: {};
console.log('Received message:', data);
processMessage(data);
});
const functions = require('@google-cloud/functions-framework');
functions.cloudEvent('processStorageEvent', async (cloudEvent) => {
const file = cloudEvent.data;
console.log(`Event: ${cloudEvent.type}`);
console.log(`Bucket: ${file.bucket}`);
console.log(`File: ${file.name}`);
if (cloudEvent.type === 'google.cloud.storage.object.v1.finalized') {
await processUploadedFile(file.bucket, file.name);
}
});
gcloud functions deploy hello-http \
--gen2 \
--runtime nodejs20 \
--trigger-http \
--allow-unauthenticated \
--region us-central1
gcloud functions deploy process-messages \
--gen2 \
--runtime nodejs20 \
--trigger-topic my-topic \
--region us-central1
gcloud functions deploy process-uploads \
--gen2 \
--runtime nodejs20 \
--trigger-event-filters="type=google.cloud.storage.object.v1.finalized" \
--trigger-event-filters="bucket=my-bucket" \
--region us-central1
Cold Start Optimization Pattern
Minimize cold start latency for Cloud Run
When to use: latency-sensitive applications; user-facing APIs; high-traffic services
1. Enable Startup CPU Boost
gcloud run deploy my-service \
--cpu-boost \
--region us-central1
2. Set Minimum Instances
gcloud run deploy my-service \
--min-instances 1 \
--region us-central1
3. Optimize Container Image
# Use distroless for minimal image
FROM node:20-slim AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
FROM gcr.io/distroless/nodejs20-debian12
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
COPY src ./src
CMD ["src/index.js"]
4. Lazy Initialize Heavy Dependencies
let bigQueryClient = null;
function getBigQueryClient() {
if (!bigQueryClient) {
const { BigQuery } = require('@google-cloud/bigquery');
bigQueryClient = new BigQuery();
}
return bigQueryClient;
}
app.get('/api/analytics', async (req, res) => {
const client = getBigQueryClient();
const results = await client.query({...});
res.json(results);
});
5. Increase Memory (More CPU)
gcloud run deploy my-service \
--memory 1Gi \
--cpu 2 \
--region us-central1
Optimization impact
- Startup CPU boost: cold starts roughly 50% faster
- Min instances: eliminates cold starts for baseline traffic
- Distroless image: smaller attack surface, faster image pull
- Lazy init: defers heavy loading to the first request that needs it
Concurrency Configuration Pattern
Proper concurrency settings for Cloud Run
When to use: optimizing instance utilization; handling traffic spikes efficiently; reducing cold starts
Understanding Concurrency
# Default: good for I/O-bound request handlers
gcloud run deploy my-service \
  --concurrency 80 \
  --cpu 1

# CPU-bound or thread-unsafe code: one request per instance
gcloud run deploy my-service \
  --concurrency 1 \
  --cpu 1

# Memory-intensive requests: low concurrency, more memory
gcloud run deploy my-service \
  --concurrency 10 \
  --memory 2Gi
Node.js Concurrency
app.get('/api/data', async (req, res) => {
const [users, products] = await Promise.all([
fetchUsers(),
fetchProducts()
]);
res.json({ users, products });
});
app.get('/api/compute', (req, res) => {
const result = heavyCpuOperation();
res.json(result);
});
Python Concurrency with Gunicorn
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# 4 workers x 2 threads = 8 concurrent requests per instance
CMD exec gunicorn --bind :$PORT --workers 4 --threads 2 main:app
from flask import Flask
app = Flask(__name__)
@app.route('/api/data')
def get_data():
return {'status': 'ok'}
Concurrency guidelines
- Concurrency=1: only for CPU-bound or thread-unsafe code
- Concurrency=8-20: memory-intensive workloads
- Concurrency=80: default, good for I/O-bound work
- Concurrency=1000: maximum, for very lightweight handlers
Pub/Sub Integration Pattern
Event-driven processing with Cloud Pub/Sub
When to use: asynchronous message processing; decoupled microservices; event-driven architecture
Push Subscription to Cloud Run
gcloud pubsub topics create orders
gcloud pubsub subscriptions create orders-push \
--topic orders \
--push-endpoint https://my-service-xxx.run.app/pubsub \
--ack-deadline 600
const express = require('express');
const app = express();
app.use(express.json());
app.post('/pubsub', async (req, res) => {
if (!req.body.message) {
return res.status(400).send('Invalid Pub/Sub message');
}
try {
const message = req.body.message;
const data = message.data
? JSON.parse(Buffer.from(message.data, 'base64').toString())
: {};
console.log('Processing order:', data);
await processOrder(data);
res.status(200).send('OK');
} catch (error) {
console.error('Processing failed:', error);
res.status(500).send('Processing failed');
}
});
Publishing Messages
const { PubSub } = require('@google-cloud/pubsub');
const pubsub = new PubSub();
async function publishOrder(order) {
const topic = pubsub.topic('orders');
const messageBuffer = Buffer.from(JSON.stringify(order));
const messageId = await topic.publishMessage({
data: messageBuffer,
attributes: {
type: 'order_created',
priority: 'high'
}
});
console.log(`Published message ${messageId}`);
return messageId;
}
Dead Letter Queue
gcloud pubsub topics create orders-dlq
gcloud pubsub subscriptions update orders-push \
--dead-letter-topic orders-dlq \
--max-delivery-attempts 5
Cloud SQL Connection Pattern
Connect Cloud Run to Cloud SQL securely
When to use: relational database needed; migrating existing applications; complex queries and transactions
gcloud run deploy my-service \
--add-cloudsql-instances PROJECT:REGION:INSTANCE \
--set-env-vars INSTANCE_CONNECTION_NAME="PROJECT:REGION:INSTANCE" \
--set-env-vars DB_NAME="mydb" \
--set-env-vars DB_USER="myuser" \
  --update-secrets DB_PASS=db-password:latest  # DB_PASS is read by the app; secret name illustrative
const { Pool } = require('pg');
const pool = new Pool({
user: process.env.DB_USER,
password: process.env.DB_PASS,
database: process.env.DB_NAME,
host: `/cloudsql/${process.env.INSTANCE_CONNECTION_NAME}`,
max: 5,
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 10000,
});
app.get('/api/users', async (req, res) => {
const client = await pool.connect();
try {
const result = await client.query('SELECT * FROM users LIMIT 100');
res.json(result.rows);
} finally {
client.release();
}
});
import os
from sqlalchemy import create_engine
def get_engine():
instance_connection_name = os.environ["INSTANCE_CONNECTION_NAME"]
db_user = os.environ["DB_USER"]
db_pass = os.environ["DB_PASS"]
db_name = os.environ["DB_NAME"]
engine = create_engine(
f"postgresql+pg8000://{db_user}:{db_pass}@/{db_name}",
connect_args={
"unix_sock": f"/cloudsql/{instance_connection_name}/.s.PGSQL.5432"
},
pool_size=5,
max_overflow=2,
pool_timeout=30,
pool_recycle=1800,
)
return engine
Best practices
- Use connection pooling (max 5-10 per instance)
- Set appropriate idle timeouts
- Handle connection errors gracefully
- Consider Cloud SQL Proxy for local development
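"Handle connection errors gracefully" can be as simple as a retry wrapper with exponential backoff. A sketch; the exception types to catch depend on your driver (psycopg2 and pg8000 raise their own OperationalError), so they are passed in explicitly:

```python
import time

def with_retries(fn, attempts=3, base_delay=0.5,
                 retry_on=(ConnectionError,)):
    """Retry transient connection failures with exponential backoff.
    Pass the driver's transient exception types via retry_on."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

Usage: wrap each query, e.g. `rows = with_retries(lambda: run_query(pool))`, rather than retrying at the pool level.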
Secret Manager Integration
Securely manage secrets in Cloud Run
When to use: API keys and database passwords; service account keys; any sensitive configuration
echo -n "my-secret-value" | gcloud secrets create my-secret --data-file=-
gcloud run deploy my-service \
--update-secrets=API_KEY=my-secret:latest
gcloud run deploy my-service \
--update-secrets=/secrets/api-key=my-secret:latest
// From environment variable (--update-secrets=API_KEY=my-secret:latest)
const apiKey = process.env.API_KEY;

// From mounted file (--update-secrets=/secrets/api-key=my-secret:latest)
const fs = require('fs');
const apiKeyFromFile = fs.readFileSync('/secrets/api-key', 'utf8');
const { SecretManagerServiceClient } = require('@google-cloud/secret-manager');
const client = new SecretManagerServiceClient();
const projectId = process.env.GOOGLE_CLOUD_PROJECT; // set via --set-env-vars
async function getSecret(name) {
  const [version] = await client.accessSecretVersion({
    name: `projects/${projectId}/secrets/${name}/versions/latest`
  });
  return version.payload.data.toString();
}
Sharp Edges
/tmp Filesystem Counts Against Memory
Severity: HIGH
Situation: Writing files to /tmp directory in Cloud Run
Symptoms:
Container killed with OOM error.
Memory usage spikes unexpectedly.
File operations cause container restarts.
"Container memory limit exceeded" in logs.
Why this breaks:
Cloud Run uses an in-memory filesystem for /tmp. Any files written
to /tmp consume memory from your container's allocation.
Common scenarios:
- Downloading files temporarily
- Creating temp processing files
- Libraries caching to /tmp
- Large log buffers
A 512MB container that downloads a 200MB file to /tmp only has
~300MB left for the application.
Recommended fix:
Calculate memory including /tmp usage
steps:
- name: 'gcr.io/cloud-builders/gcloud'
args:
- 'run'
- 'deploy'
- 'my-service'
- '--memory=1Gi'
- '--image=gcr.io/$PROJECT_ID/my-service'
Stream instead of buffering
# Avoid: buffers the whole file through /tmp, which counts against memory
def process_large_file(bucket_name, blob_name):
    blob = bucket.blob(blob_name)
    blob.download_to_filename('/tmp/large_file')
    with open('/tmp/large_file', 'rb') as f:
        process(f.read())

# Better: stream the blob in chunks
def process_large_file(bucket_name, blob_name):
    blob = bucket.blob(blob_name)
    with blob.open('rb') as f:
        for chunk in iter(lambda: f.read(8192), b''):
            process_chunk(chunk)
Use Cloud Storage for large files
from google.cloud import storage
def process_with_gcs(bucket_name, input_blob, output_blob):
client = storage.Client()
bucket = client.bucket(bucket_name)
input_blob = bucket.blob(input_blob)
output_blob = bucket.blob(output_blob)
with input_blob.open('rb') as reader:
with output_blob.open('wb') as writer:
for chunk in iter(lambda: reader.read(65536), b''):
processed = transform(chunk)
writer.write(processed)
Monitor memory usage
import psutil
import logging
def log_memory():
memory = psutil.virtual_memory()
logging.info(f"Memory: {memory.percent}% used, "
f"{memory.available / 1024 / 1024:.0f}MB available")
Concurrency=1 Causes Scaling Bottlenecks
Severity: HIGH
Situation: Setting concurrency to 1 for request isolation
Symptoms:
Auto-scaling creates many container instances.
High latency during traffic spikes.
Increased cold starts.
Higher costs from more instances.
Why this breaks:
Setting concurrency to 1 means each container handles only one
request at a time. During traffic spikes:
- 100 concurrent requests = 100 container instances
- Each instance has cold start overhead
- More instances = higher costs
- Scaling takes time, requests queue up
This should only be used when:
- Processing is truly single-threaded
- Memory-heavy per-request processing
- Using thread-unsafe libraries
Recommended fix:
Set appropriate concurrency
# I/O-bound (most web APIs): keep the default
gcloud run deploy my-service \
  --concurrency=80 \
  --max-instances=100

# CPU-heavy handlers: lower concurrency, more CPU
gcloud run deploy my-service \
  --concurrency=4 \
  --cpu=2

# Truly single-request workloads: allow many instances to absorb spikes
gcloud run deploy my-service \
  --concurrency=1 \
  --max-instances=1000
Node.js - use async properly
const express = require('express');
const app = express();
app.get('/api/data', async (req, res) => {
const data = await fetchFromDatabase();
const enriched = await enrichData(data);
res.json(enriched);
});
Python - use async framework
from fastapi import FastAPI
import asyncio
import httpx
app = FastAPI()
@app.get("/api/data")
async def get_data():
async with httpx.AsyncClient() as client:
response = await client.get("https://api.example.com/data")
return response.json()
Calculate concurrency
concurrency = memory_limit / per_request_memory
Example:
- 512MB container
- 20MB per request overhead
- Safe concurrency: ~25
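The rule of thumb above as a small function (illustrative only; measure real per-request memory under load before relying on it):

```python
def safe_concurrency(memory_limit_mb: int, per_request_mb: int) -> int:
    """Concurrency ceiling from the memory budget:
    memory_limit / per-request overhead, floored, at least 1."""
    return max(1, memory_limit_mb // per_request_mb)
```

`safe_concurrency(512, 20)` gives 25, matching the example above.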
CPU Throttled When Not Handling Requests
Severity: HIGH
Situation: Running background tasks or processing between requests
Symptoms:
Background tasks run extremely slowly.
Scheduled work doesn't complete.
Metrics collection fails.
Connection keep-alive breaks.
Why this breaks:
By default, Cloud Run throttles CPU to near-zero when not actively
handling a request. This is "CPU only during requests" mode.
Affected operations:
- Background threads
- Connection pool maintenance
- Metrics/telemetry emission
- Scheduled tasks within container
- Cleanup operations after response
Recommended fix:
Enable CPU always allocated
gcloud run deploy my-service \
  --no-cpu-throttling \
  --min-instances=1
Use startup CPU boost for initialization
gcloud run deploy my-service \
  --cpu-boost \
  --cpu-throttling
Move background work to Cloud Tasks
from google.cloud import tasks_v2
import json
def create_background_task(payload):
client = tasks_v2.CloudTasksClient()
parent = client.queue_path(
"my-project", "us-central1", "my-queue"
)
task = {
"http_request": {
"http_method": tasks_v2.HttpMethod.POST,
"url": "https://my-service.run.app/process",
"body": json.dumps(payload).encode(),
"headers": {"Content-Type": "application/json"}
}
}
client.create_task(parent=parent, task=task)
@app.post("/api/order")
async def create_order(order: Order):
order_id = await save_order(order)
create_background_task({"order_id": order_id})
return {"order_id": order_id, "status": "processing"}
Use Pub/Sub for async processing
steps:
- name: 'gcr.io/cloud-builders/gcloud'
  args: ['run', 'deploy', 'api-service',
         '--cpu-throttling']
- name: 'gcr.io/cloud-builders/gcloud'
  args: ['run', 'deploy', 'worker-service',
         '--no-cpu-throttling',
         '--min-instances=1']
VPC Connector 10-Minute Idle Timeout
Severity: MEDIUM
Situation: Cloud Run service connecting to VPC resources
Symptoms:
Connection errors after period of inactivity.
"Connection reset" or "Connection refused" errors.
Sporadic failures to VPC resources.
Database connections drop unexpectedly.
Why this breaks:
Cloud Run's VPC connector has a 10-minute idle timeout on connections.
If a connection is idle for 10 minutes, it's silently closed.
Affects:
- Database connection pools
- Redis connections
- Internal API connections
- Any persistent VPC connection
Recommended fix:
Configure connection pool with keep-alive
from sqlalchemy import create_engine
engine = create_engine(
DATABASE_URL,
pool_size=5,
max_overflow=2,
pool_recycle=300,
pool_pre_ping=True
)
TCP keep-alive for custom connections
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 60)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)
Redis with connection validation
import socket
import redis
pool = redis.ConnectionPool(
host=REDIS_HOST,
port=6379,
socket_keepalive=True,
socket_keepalive_options={
socket.TCP_KEEPIDLE: 60,
socket.TCP_KEEPINTVL: 60,
socket.TCP_KEEPCNT: 5
},
health_check_interval=30
)
client = redis.Redis(connection_pool=pool)
Use the Cloud SQL Python Connector
# requirements.txt
cloud-sql-python-connector[pg8000]
from google.cloud.sql.connector import Connector
import sqlalchemy
connector = Connector()
def getconn():
return connector.connect(
"project:region:instance",
"pg8000",
user="user",
password="password",
db="database"
)
engine = sqlalchemy.create_engine(
"postgresql+pg8000://",
creator=getconn
)
Container Startup Timeout (4 minutes max)
Severity: HIGH
Situation: Deploying containers with slow initialization
Symptoms:
Deployment fails with "Container failed to start".
Service never becomes healthy.
"Revision failed to become ready" errors.
Works locally but fails on Cloud Run.
Why this breaks:
Cloud Run expects your container to start listening on PORT within
4 minutes (240 seconds). If it doesn't, the instance is killed.
Common causes:
- Heavy framework initialization (ML models, etc.)
- Waiting for external dependencies at startup
- Large dependency loading
- Database migrations on startup
Recommended fix:
Enable startup CPU boost
gcloud run deploy my-service \
  --cpu-boost
Lazy initialization
from functools import lru_cache
from fastapi import FastAPI

app = FastAPI()

@lru_cache()
def get_model():
    # Loaded once, on the first request that needs it
    return load_heavy_model()

@app.get("/predict")
async def predict(data: dict):
    model = get_model()
    return model.predict(data)
Start listening immediately
import asyncio
from fastapi import FastAPI, HTTPException
app = FastAPI()
initialized = asyncio.Event()
@app.on_event("startup")
async def startup():
asyncio.create_task(async_init())
async def async_init():
await load_models()
await warm_up_connections()
initialized.set()
@app.get("/ready")
async def ready():
if not initialized.is_set():
raise HTTPException(503, "Still initializing")
return {"status": "ready"}
@app.get("/health")
async def health():
return {"status": "healthy"}
Use multi-stage builds
# Build stage - slow
FROM python:3.11 AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip wheel --no-cache-dir --wheel-dir /wheels -r requirements.txt
# Runtime stage - fast startup
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /wheels /wheels
RUN pip install --no-cache /wheels/* && rm -rf /wheels
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]
Run migrations separately
steps:
- name: 'gcr.io/cloud-builders/gcloud'
entrypoint: 'bash'
args:
- '-c'
- |
gcloud run jobs execute migrate-job --wait
- name: 'gcr.io/cloud-builders/gcloud'
args: ['run', 'deploy', 'my-service', ...]
Second Generation Execution Environment Differences
Severity: MEDIUM
Situation: Migrating to or using Cloud Run second-gen execution environment
Symptoms:
Network behavior changes.
Different syscall support.
File system behavior differences.
Container behaves differently than in first-gen.
Why this breaks:
Cloud Run's second-generation execution environment runs containers on a
full Linux microVM rather than the gVisor sandbox used by the first
generation, so behavior differs:
- More Linux syscalls supported
- Full /proc and /sys access
- Different network stack
- No automatic HTTPS redirect
- Different tmp filesystem behavior
Recommended fix:
Explicitly set execution environment
gcloud run deploy my-service \
--execution-environment=gen1
gcloud run deploy my-service \
--execution-environment=gen2
Handle network differences
from fastapi import FastAPI, Request
from fastapi.responses import RedirectResponse
app = FastAPI()
@app.middleware("http")
async def redirect_https(request: Request, call_next):
if request.headers.get("X-Forwarded-Proto") == "http":
url = request.url.replace(scheme="https")
return RedirectResponse(url, status_code=301)
return await call_next(request)
GPU access (second-gen only)
gcloud run deploy ml-service \
--execution-environment=gen2 \
--gpu=1 \
--gpu-type=nvidia-l4
Check execution environment
def get_execution_environment():
    # gVisor identifies itself in /proc/version; gVisor is the gen1 sandbox
    try:
        with open('/proc/version', 'r') as f:
            if 'gVisor' in f.read():
                return 'gen1'
    except OSError:
        pass
    return 'gen2'
Request Timeout Configuration Mismatch
Severity: MEDIUM
Situation: Long-running requests or background processing
Symptoms:
Requests terminated before completion.
504 Gateway Timeout errors.
Processing stops unexpectedly.
Inconsistent timeout behavior.
Why this breaks:
Cloud Run has multiple timeout configurations that must align:
- Request timeout (default 300 seconds, max 3600 seconds)
- Client timeout
- Downstream service timeouts
- Load balancer timeout (for external access)
Recommended fix:
Set consistent timeouts
gcloud run deploy my-service \
--timeout=900
Handle long-running with webhooks
from fastapi import FastAPI, BackgroundTasks
import httpx
app = FastAPI()
@app.post("/process")
async def process(data: dict, background_tasks: BackgroundTasks):
task_id = create_task_id()
background_tasks.add_task(
long_running_process,
task_id,
data,
data.get("callback_url")
)
return {"task_id": task_id, "status": "processing"}
async def long_running_process(task_id, data, callback_url):
result = await heavy_computation(data)
if callback_url:
async with httpx.AsyncClient() as client:
await client.post(callback_url, json={
"task_id": task_id,
"result": result
})
Use Cloud Tasks for reliable long-running
from google.cloud import tasks_v2
import json
def create_long_running_task(data):
client = tasks_v2.CloudTasksClient()
parent = client.queue_path(PROJECT, REGION, "long-tasks")
task = {
"http_request": {
"http_method": tasks_v2.HttpMethod.POST,
"url": "https://worker.run.app/process",
"body": json.dumps(data).encode(),
"headers": {"Content-Type": "application/json"}
},
"dispatch_deadline": {"seconds": 1800}
}
return client.create_task(parent=parent, task=task)
Streaming for long responses
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
@app.get("/large-report")
async def large_report():
async def generate():
for chunk in process_large_data():
yield chunk
return StreamingResponse(generate(), media_type="text/plain")
Validation Checks
Hardcoded GCP Credentials
Severity: ERROR
GCP credentials must never be hardcoded in source code
Message: Hardcoded GCP service account credentials. Use Secret Manager or Workload Identity.
GCP API Key in Source Code
Severity: ERROR
API keys should use Secret Manager
Message: Hardcoded GCP API key. Use Secret Manager.
Credentials JSON File in Repository
Severity: ERROR
Service account JSON files should not be in source control
Message: Credentials file detected. Add to .gitignore and use Secret Manager.
Running as Root User
Severity: WARNING
Containers should not run as root for security
Message: Dockerfile runs as root. Add USER directive for security.
Missing Health Check in Dockerfile
Severity: INFO
Cloud Run ignores Dockerfile HEALTHCHECK and performs its own HTTP health checks
Message: No HEALTHCHECK in Dockerfile. Cloud Run uses its own health checks.
Hardcoded Port in Application
Severity: WARNING
Port should come from PORT environment variable
Message: Hardcoded port. Use PORT environment variable for Cloud Run.
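A pattern that passes this check: read the port from the environment with a local fallback (a sketch; the helper name is illustrative):

```python
import os

def get_port(env=os.environ) -> int:
    """Cloud Run injects PORT; fall back to 8080 for local development."""
    return int(env.get("PORT", "8080"))
```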
Large File Writes to /tmp
Severity: WARNING
/tmp uses container memory, large writes can cause OOM
Message: /tmp writes consume memory. Consider Cloud Storage for large files.
Synchronous File Operations
Severity: WARNING
Sync file ops block the event loop in async apps
Message: Synchronous file operations. Use async versions for better concurrency.
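When a third-party async file library isn't warranted, the standard library can offload blocking reads. A sketch using asyncio.to_thread (Python 3.9+):

```python
import asyncio

async def read_file(path: str) -> bytes:
    """Run the blocking read in a worker thread so the event loop
    keeps serving other requests."""
    def _read() -> bytes:
        with open(path, "rb") as f:
            return f.read()
    return await asyncio.to_thread(_read)
```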
Global Mutable State
Severity: WARNING
Global state issues with concurrent requests
Message: Global mutable state may cause issues with concurrent requests.
Thread-Unsafe Singleton Pattern
Severity: WARNING
Singletons need thread safety for concurrency > 1
Message: Singleton pattern - ensure thread safety if using concurrency > 1.
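A thread-safe version of the singleton pattern this check refers to, using double-checked locking (`object()` stands in for an expensive client):

```python
import threading

_client = None
_lock = threading.Lock()

def get_client():
    """Create the shared client once, even under concurrent requests."""
    global _client
    if _client is None:
        with _lock:
            # Re-check inside the lock: another thread may have won the race
            if _client is None:
                _client = object()  # stand-in for an expensive client
    return _client
```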
Collaboration
Delegation Triggers
- user needs AWS serverless -> aws-serverless (Lambda, API Gateway, SAM)
- user needs Azure containers -> azure-functions (Azure Container Apps, Functions)
- user needs database design -> postgres-wizard (Cloud SQL design, AlloyDB)
- user needs authentication -> auth-specialist (Firebase Auth, Identity Platform)
- user needs AI integration -> llm-architect (Vertex AI, Cloud Run + LLM)
- user needs workflow orchestration -> workflow-automation (Cloud Workflows, Eventarc)
When to Use
Use this skill when the request clearly matches the capabilities and patterns described above.
Limitations
- Use this skill only when the task clearly matches the scope described above.
- Do not treat the output as a substitute for environment-specific validation, testing, or expert review.
- Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.