devops-docker-patterns
// Docker containerization patterns including Dockerfile best practices, Compose orchestration, image optimization, networking, volumes, and security hardening for production workloads.
| name | devops-docker-patterns |
| description | Docker containerization patterns including Dockerfile best practices, Compose orchestration, image optimization, networking, volumes, and security hardening for production workloads. |
Use multi-stage builds to separate build dependencies from the final runtime image. This keeps production images small and free of compilers, package managers, and source code.
```dockerfile
# ---- Build stage ----
FROM python:3.12-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# ---- Runtime stage ----
FROM python:3.12-slim
WORKDIR /app
COPY --from=builder /install /usr/local
COPY . .
USER nobody
EXPOSE 8000
CMD ["gunicorn", "app:create_app()", "--bind", "0.0.0.0:8000"]
```
Order layers so that dependency installation comes before source code COPY. This way, code changes do not invalidate the expensive dependency-install layer.
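The same ordering can be combined with a BuildKit cache mount so that pip's download cache survives even when the dependency layer is rebuilt. The following is a minimal sketch rather than part of the original patterns. It assumes BuildKit is the active builder and that pip runs as root during the build, which is why the cache lives under `/root/.cache/pip`.

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim
WORKDIR /app

# Copy only the dependency manifest so this layer is reused until it changes.
COPY requirements.txt .

# Assumption: BuildKit is enabled. The cache mount keeps pip's download cache
# between builds without writing it into an image layer, so --no-cache-dir
# is deliberately omitted here.
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

# Source code comes last; editing it does not invalidate the layers above.
COPY . .
```

The cache mount only speeds up rebuilds when `requirements.txt` changes. The layer ordering is still what prevents a reinstall on ordinary code edits.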
See dockerfile-patterns for: multi-stage builds, ARG/ENV usage, COPY vs ADD, ENTRYPOINT vs CMD patterns, and .dockerignore configuration.
Define services, networks, and volumes declaratively. Use depends_on with health checks to control startup order reliably.
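```yaml
services:
  api:
    build: .
    ports:
      - "8000:8000"
    depends_on:
      db:
        condition: service_healthy
    environment:
      # Example credentials only; inject real secrets at deploy time.
      DATABASE_URL: postgres://app:secret@db:5432/mydb
    networks:
      - backend

  db:
    image: postgres:16-alpine
    environment:
      # Must match the credentials used in DATABASE_URL above.
      POSTGRES_USER: app
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: mydb
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U app -d mydb"]
      interval: 5s
      timeout: 3s
      retries: 5
    networks:
      - backend

volumes:
  pgdata:

networks:
  backend:
```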
See compose-patterns for: service definitions, healthcheck strategies, profiles, override files, and environment management.
Choose the right base image for your workload. Alpine images are small but use musl libc, which can cause compatibility issues with some native extensions. Distroless images contain only the application runtime and its dependencies, with no shell or package manager.
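```dockerfile
# Node.js production image, considerably smaller than one based on node:20
# Note: native addons built on Alpine link against musl; use a Debian-based
# builder if the app depends on any, since the runtime image below is Debian.
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Install all dependencies; the build step typically needs devDependencies.
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies before node_modules is copied into the runtime image.
RUN npm prune --omit=dev

FROM gcr.io/distroless/nodejs20-debian12
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["dist/server.js"]
```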
See image-optimization for: base image selection guide, layer caching strategies, .dockerignore examples, and distroless versus Alpine trade-offs.
Use named volumes for persistent data and bind mounts only for development. Choose the right network driver for your deployment model.
```yaml
services:
  app:
    image: myapp:latest
    volumes:
      - app-data:/app/data        # Named volume for persistence
      - ./config:/app/config:ro   # Bind mount, read-only
      - tmp:/tmp                  # tmpfs-backed volume for ephemeral scratch
    networks:
      - frontend
      - backend

volumes:
  app-data:
    driver: local
  tmp:
    driver: local
    driver_opts:
      type: tmpfs
      device: tmpfs

# Networks referenced by a service must be declared at the top level.
networks:
  frontend:
  backend:
```
See networking-volumes for: bridge/host/overlay network drivers, named volume lifecycle, bind mount patterns, and tmpfs usage.
Always run containers as a non-root user. Use Docker secrets or environment variables from a vault -- never embed credentials in the image.
```dockerfile
FROM python:3.12-slim
RUN groupadd -r appuser && useradd -r -g appuser -d /app -s /sbin/nologin appuser
WORKDIR /app
# Install dependencies before copying source so code changes keep this layer cached.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY --chown=appuser:appuser . .
USER appuser
EXPOSE 8000
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD ["python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"]
CMD ["gunicorn", "app:create_app()", "--bind", "0.0.0.0:8000"]
```
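For credentials that are needed only at build time, a BuildKit secret mount is one way to keep them out of the image. This is a hedged sketch, not the original author's setup. The secret id `pip_conf` and the idea of a private package index are hypothetical, and BuildKit is assumed to be enabled.

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .

# Assumption: the hypothetical secret "pip_conf" is a pip config file holding
# private index credentials. It is mounted at /etc/pip.conf for this RUN step
# only and is not stored in any image layer.
RUN --mount=type=secret,id=pip_conf,target=/etc/pip.conf \
    pip install --no-cache-dir -r requirements.txt

# Build with:
#   docker build --secret id=pip_conf,src=./pip.conf .
```

Unlike `ARG` or `ENV`, a secret mounted this way does not appear in `docker history` output for the resulting image.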
See security-patterns for: non-root user setup, secrets management, image scanning with Trivy/Grype, read-only filesystem configuration, and capability dropping.
Define health checks so orchestrators can detect and replace unhealthy containers. Log to stdout/stderr so the Docker logging driver can collect output.
```dockerfile
# Shell form is required for "|| exit 1"; curl must be installed in the image.
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD curl -f http://localhost:8000/health || exit 1
```
```yaml
services:
  app:
    image: myapp:latest
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
```
Write application logs to stdout, not to files inside the container. This lets Docker, Kubernetes, and log aggregators handle rotation and shipping.
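As an illustration of logging to stdout, the sketch below configures the gunicorn application used in the earlier examples so that both of its logs go to the container's standard streams. It assumes the same `app:create_app()` entry point and a `requirements.txt` that includes gunicorn.

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

# Disable Python's output buffering so log lines reach the logging driver promptly.
ENV PYTHONUNBUFFERED=1

USER nobody
# "-" tells gunicorn to write the access log to stdout and the error log to stderr.
CMD ["gunicorn", "app:create_app()", "--bind", "0.0.0.0:8000", "--access-logfile", "-", "--error-logfile", "-"]
```

With this in place, `docker logs` shows both request and error output, and the `json-file` rotation options apply to them.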
| Avoid | Use Instead |
|---|---|
| `FROM ubuntu:latest` as base image | `FROM python:3.12-slim` or distroless images sized for your runtime |
| Running as root inside containers | Create a dedicated user with `useradd` and switch with `USER` |
| `COPY . .` before installing dependencies | Copy dependency manifests first, install, then copy source |
| Hardcoding secrets in Dockerfile `ENV` | Docker secrets, mounted env files, or vault integration |
| Using `ADD` for local files | Use `COPY`; `ADD` has implicit tar extraction and URL fetch side effects |
| `apt-get install` without cleanup | Chain `apt-get update && apt-get install -y ... && rm -rf /var/lib/apt/lists/*` |
| One giant Dockerfile stage | Multi-stage builds to separate build tools from runtime |
| No `.dockerignore` file | Maintain `.dockerignore` to exclude `.git`, `node_modules`, `__pycache__` |
| `docker-compose up` with no health checks | Use `depends_on` with `condition: service_healthy` |
| Storing data in container writable layer | Named volumes for persistence, tmpfs for ephemeral data |
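The `apt-get` row in the table can be sketched as a single `RUN` instruction. This is an illustrative fragment; `curl` is only an example package, chosen because a curl-based health check needs it in slim images.

```dockerfile
FROM python:3.12-slim

# Update, install, and clean up in one RUN so the package index never
# persists in a layer. curl is an illustrative package choice.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
```

Splitting these commands across separate `RUN` instructions would leave the package lists in an earlier layer, so the final `rm` would not reduce image size.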
- Place `COPY requirements.txt` and `RUN pip install` before `COPY . .` to avoid reinstalling dependencies on every code change.
- Enable BuildKit with `DOCKER_BUILDKIT=1`. It provides parallel stage execution, better caching, and secret mounts (`--mount=type=secret`).
- Use `depends_on` with health conditions instead of arbitrary `sleep` commands for reliable service ordering.
- Set `deploy.resources.limits` in Compose to prevent a single container from consuming all host memory or CPU.
- Use `docker build --squash` or combine related `RUN` commands to reduce final layer count without sacrificing cache efficiency during development.

Source: Docker Official Documentation, Dockerfile Best Practices, Docker Compose Specification