Name: Docker Containerization
Author: findinfinitelabs

// Reality of the Chuuk Dictionary container builds — multi-stage Flask + React app image and a separate Ollama sidecar image. No docker-compose is used. Use when modifying the Dockerfiles, debugging build failures, or adding system dependencies.


updated: May 4, 2026, 01:56

SKILL.md


name	docker-containerization
description	Reality of the Chuuk Dictionary container builds — multi-stage Flask + React app image and a separate Ollama sidecar image. No docker-compose is used. Use when modifying the Dockerfiles, debugging build failures, or adding system dependencies.

Docker Containerization

Two Dockerfiles, no docker-compose. Production builds run in ACR (see the azure-container-deployment skill); local builds are useful for smoke-testing only.

Main app: `Dockerfile`

Multi-stage build:

Frontend stage: node:18-slim. Runs npm ci from frontend/package*.json, then npm run build, which writes frontend/dist.
Runtime stage: python:3.11-slim. Installs Tesseract (tesseract-ocr-eng only), poppler-utils, and a build toolchain, then the Python deps. Copies models/ (the Helsinki fine-tuned weights), then the app code with COPY . ., and finally the built frontend with COPY --from=frontend-builder /app/frontend/dist ./frontend/dist.
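When a build fails, it can help to build one stage in isolation. The sketch below wraps docker build --target; frontend-builder is the stage name that appears in the COPY --from line, while the chuuk-debug tag and the DOCKER override variable are assumptions made for illustration.

```shell
# Container CLI to call. Override with DOCKER=podman, or DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# Build a single stage of the main Dockerfile and tag it for inspection.
# $1 = stage name as declared in the Dockerfile (for example frontend-builder)
# The chuuk-debug tag is a hypothetical name, not one the project defines.
build_stage() {
  "$DOCKER" build --target "$1" -t "chuuk-debug:$1" .
}
```

Calling build_stage frontend-builder stops once the frontend stage is built, so a failing npm run build can be reproduced without waiting for the Python layers.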

Final command:

gunicorn --bind 0.0.0.0:8000 --workers 2 --timeout 300 \
  --access-logfile - --error-logfile - app:app

The container runs as the default user (root in python:3.11-slim). There is no HEALTHCHECK and no dedicated non-root user; if you need either, add it explicitly to the Dockerfile.
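Before changing the Dockerfile, both ideas can be tried at run time with docker run flags. This is a minimal sketch under stated assumptions: UID 1000 is an arbitrary unprivileged user, the probe assumes GET / answers successfully, and the app may need writable directories or environment variables that are not shown here.

```shell
# Container CLI to call. Override with DOCKER=podman, or DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# Health probe run inside the container. It uses Python's standard library so
# it does not depend on curl being installed in the image.
# Assumption: GET / on port 8000 answers successfully.
HEALTH_CMD="python -c \"import urllib.request as u; u.urlopen('http://127.0.0.1:8000/', timeout=5)\""

# Run the app image as an unprivileged UID with a runtime health check.
# $1 = image name; any remaining arguments are passed through to docker run.
run_unprivileged() {
  local image="$1"
  shift
  "$DOCKER" run --rm -p 8000:8000 \
    --user 1000:1000 \
    --health-cmd "$HEALTH_CMD" \
    --health-interval 30s --health-timeout 10s --health-retries 3 \
    "$@" "$image"
}
```

For example, run_unprivileged chuuk-dictionary-app -e FLASK_SECRET_KEY=dev-only starts the image built in the local build section. If the app fails to write files under UID 1000, that is a sign the Dockerfile needs an explicit user and directory ownership rather than a runtime override.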

The build context is the entire repo (COPY . .), filtered by .dockerignore. Anything not excluded ships into the image — be deliberate when adding large fixtures.
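One way to stay deliberate about what ships is to look at the largest files in the tree before building. The helper below is a rough sketch: largest_files is a hypothetical name, and the listing is raw, so it skips .git but does not apply .dockerignore rules.

```shell
# List the largest files under a directory, biggest first (sizes in KiB).
# $1 = directory to scan (default: current directory)
# $2 = number of entries to show (default: 10)
# Note: this does not interpret .dockerignore; it only skips .git.
largest_files() {
  find "${1:-.}" -type f ! -path '*/.git/*' -exec du -k {} + \
    | sort -rn \
    | head -n "${2:-10}"
}
```

Running largest_files . 20 from the repo root gives a quick list of candidates to compare against .dockerignore.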

Ollama sidecar: `Dockerfile.ollama`

python:3.11-slim (not the official ollama/ollama base) plus the upstream installer:

RUN curl -fsSL https://ollama.com/install.sh | sh

Build arg PREPULL_LLM defaults to false. When it is set to true, Dockerfile.ollama bakes llama3.2:3b into the image, which makes the image much larger but the cold start faster.
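The build arg is passed with docker build --build-arg. A small sketch of both variants follows; the chuuk-ollama:prepulled tag and the DOCKER override variable are assumptions, not names the project defines.

```shell
# Container CLI to call. Override with DOCKER=podman, or DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# Build the Ollama sidecar image.
# $1 = "true" to bake llama3.2:3b into the image; anything else uses the default.
build_ollama() {
  if [ "${1:-false}" = "true" ]; then
    # Hypothetical tag, used here to keep the large image apart from the default one.
    "$DOCKER" build --build-arg PREPULL_LLM=true \
      -t chuuk-ollama:prepulled -f Dockerfile.ollama .
  else
    "$DOCKER" build -t chuuk-ollama -f Dockerfile.ollama .
  fi
}
```

Tagging the two variants differently makes it easy to compare their sizes with docker image ls.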

The entrypoint is ollama-entrypoint.sh, which runs ollama serve and ensures the chuukese-translator custom model exists.
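To confirm from outside the container that the custom model was created, Ollama's GET /api/tags endpoint lists the local models. The helper below is a sketch that greps that JSON rather than parsing it; has_model is a hypothetical name, and the model name is treated as a regular expression.

```shell
# Read the JSON body of Ollama's GET /api/tags from stdin and succeed if a model
# with the given name (with or without a :tag suffix) is listed.
# $1 = model name, for example chuukese-translator
has_model() {
  grep -Eq "\"name\"[[:space:]]*:[[:space:]]*\"$1(:[^\"]*)?\""
}
```

With the sidecar published on port 11434, a check could look like curl -fsS http://localhost:11434/api/tags | has_model chuukese-translator && echo "model present".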

System dependencies (and why they're there)

| Package | Reason |
| --- | --- |
| `tesseract-ocr` + `tesseract-ocr-eng` | OCR (src/ocr/ocr_processor.py) |
| `poppler-utils` | `pdf2image` rasterization for PDF OCR |
| `gcc`, `g++`, `python3-dev` | Building wheels for `pymongo`/`numpy` extensions |

Tesseract data packs other than English are not installed — Chuukese is handled by post-processing at the OCR layer.
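After changing system dependencies, the result can be checked by running the tools inside the built image. This sketch assumes the image keeps its command in CMD (so arguments after the image name replace it) and uses the chuuk-dictionary-app tag from the local build section; check_system_deps is a hypothetical helper name.

```shell
# Container CLI to call. Override with DOCKER=podman, or DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# Print the Tesseract language packs and the poppler version found in an image.
# $1 = image name (default: chuuk-dictionary-app)
check_system_deps() {
  local image="${1:-chuuk-dictionary-app}"
  "$DOCKER" run --rm "$image" tesseract --list-langs
  "$DOCKER" run --rm "$image" pdftoppm -v
}
```

The language list should include eng; any other language you expect to see has to be installed explicitly.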

Local build & run (for smoke testing)

# Build (slow on M-series due to amd64 cross — prefer ACR)
docker build -t chuuk-dictionary-app .
docker build -t chuuk-ollama -f Dockerfile.ollama .

# Run
docker run --rm -p 8000:8000 \
  -e COSMOS_MONGO_CONNECTION_STRING="$COSMOS_MONGO_CONNECTION_STRING" \
  -e FLASK_SECRET_KEY=dev-only \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  chuuk-dictionary-app

docker run --rm -p 11434:11434 chuuk-ollama
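For a scripted smoke test, a small polling loop can wait until the app answers. This is a sketch, assuming curl is available on the host and that the URL you pass returns a success status once the app is up; wait_for_http is a hypothetical helper name.

```shell
# Poll a URL until it answers successfully or the attempts run out.
# $1 = URL
# $2 = maximum number of attempts (default: 30)
# $3 = seconds to sleep between attempts (default: 2)
wait_for_http() {
  local url="$1" tries="${2:-30}" delay="${3:-2}" i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl fail on HTTP errors, so only a success status ends the loop.
    if curl -fsS -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, wait_for_http http://localhost:8000/ 30 2 || echo "app did not come up" gives the container about a minute to start.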

For full-stack local dev, use dev-start.sh (Vite + Flask side-by-side, no containers).

.dockerignore

Excludes .venv/, node_modules/, tests/, .git/, uploads/, output/, training_data/, plus the usual cache and bytecode patterns. Note: models/ is not excluded; the model weights are intentionally baked into the image.
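A quick way to check that the file still matches this description is to look for each expected entry. The helper below is a sketch with deliberately simple matching: it compares whole lines, with or without a trailing slash, and does not interpret glob patterns or "!" negations. missing_ignores is a hypothetical name.

```shell
# Print every expected entry that has no matching line in a .dockerignore file.
# $1 = path to the .dockerignore file; remaining arguments = entries to look for.
missing_ignores() {
  local file="$1" entry
  shift
  for entry in "$@"; do
    # -x matches whole lines, -F treats the entry as a fixed string.
    if ! grep -qxF -e "${entry%/}" -e "${entry%/}/" "$file"; then
      printf '%s\n' "$entry"
    fi
  done
}
```

Running missing_ignores .dockerignore .venv/ node_modules/ tests/ .git/ uploads/ output/ training_data/ should print nothing if the entries are written in that plain form, while missing_ignores .dockerignore models/ should print models/ because the weights are meant to ship.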

Common modifications

Adding a system package

Add the package as a new line in the existing apt-get install -y \ block in Dockerfile, so all system packages stay in one layer.

Adding a Python package

Add it to requirements.txt. Don't pip install it in a new RUN line: that adds an extra layer and leaves the dependency out of requirements.txt.

Adding a frontend dependency

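Run cd frontend && npm install <pkg>, then commit the updated package.json and package-lock.json. The build stage picks the dependency up via npm ci, which installs exactly what the lockfile records.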

Adding a Tesseract language

RUN apt-get update && apt-get install -y tesseract-ocr-<lang> && rm -rf /var/lib/apt/lists/*

Place the package inside the existing system-deps block rather than adding a separate RUN, so the system packages stay in one layer.

Pitfalls

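node:18-slim is the current frontend base, and the frontend builds on it today. Bumping to node:20 or node:22 should be safe, but verify that Vite 7 and React 19 still build cleanly afterwards. Vite 7's documented minimum is Node 20.19, so the bump is worth planning.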
The frontend stage's WORKDIR is /app/frontend and the runtime stage's WORKDIR is /app. In COPY --from=frontend-builder /app/frontend/dist ./frontend/dist, the source path is resolved inside the frontend-builder stage's filesystem, so it must stay the full /app/frontend/dist; only the destination is relative to the runtime WORKDIR. Don't shorten it.
pip install --root-user-action=ignore is set because the runtime user is root. The flag does not skip anything; it only suppresses pip's warning about running as root. If you switch to a non-root user, drop the flag.
--workers 2 --timeout 300 is tuned for OCR/translate latency. Reducing the timeout will cut off long publication-processing SSE streams (see publication-ocr-processing-workflow).
The CMD uses JSON-array exec form, so gunicorn runs as PID 1 and receives signals such as SIGTERM directly. Don't switch to shell form unless you really want /bin/sh -c to be PID 1 instead.
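Whether an image ended up with exec form or shell form can be read back from its metadata, because Docker records shell form as a /bin/sh -c wrapper. The helper below is a sketch that inspects the JSON printed by docker inspect; is_shell_form is a hypothetical name.

```shell
# Read the JSON printed by
#   docker inspect --format '{{json .Config.Cmd}}' <image>
# from stdin and succeed if the command is in shell form, which Docker records
# as ["/bin/sh","-c","..."].
is_shell_form() {
  grep -q '^\["/bin/sh","-c",'
}
```

For example, docker inspect --format '{{json .Config.Cmd}}' chuuk-dictionary-app | is_shell_form && echo "CMD is shell form" prints a warning only when the wrapper is present.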

related-skills.json

same repository

ai-training-data-generation.md

from "findinfinitelabs/chuuk"

Generate high-quality training datasets from documents, text corpora, EPUBs, and structured content. Use when creating AI training data from dictionaries, Bible EPUBs, brochures, or when generating examples for machine learning models. Optimized for low-resource languages and domain-specific knowledge extraction. Supports parallel corpus extraction from NWT Bible EPUBs.

2026-05-04

azure-container-deployment.md

from "findinfinitelabs/chuuk"

Deploy the Chuuk Dictionary stack (main app + Ollama sidecar) to Azure Container Apps. Covers ACR remote builds via `az acr build`, Key Vault prerequisites, Cosmos DB credential injection, and the env-var contract. Use when running a deploy, debugging a failed deploy, or modifying Azure infrastructure.

2026-05-04

bible-epub-processing.md

from "findinfinitelabs/chuuk"

Parse JW.org NWT EPUBs to extract individual Bible verses for verse previews, parallel-corpus building, and Bible-coverage analytics. Use when working with NWT EPUB files, building parallel Chuukese↔English Bible data, or wiring scripture preview features.

2026-05-04

chuukese-language-processing.md

from "findinfinitelabs/chuuk"

Specialized processing for Chuukese language text including tokenization, accent handling, cultural context preservation, and language-specific patterns. Use when working with Chuukese text, translation tasks, or when building language models for this Micronesian language.

2026-05-04

code-documentation-standards.md

from "findinfinitelabs/chuuk"

Comprehensive code documentation standards and guidelines for maintaining up-to-date documentation across Python, HTML, CSS, and JavaScript codebases. Use when creating or modifying code to ensure proper documentation practices and maintainable code.

2026-05-04

css-styling-standards.md

from "findinfinitelabs/chuuk"

Styling conventions for the Chuuk Dictionary frontend — Mantine v8 theme, CSS Modules per page, global app-shell CSS, multilingual / accented-character considerations. Use when adding or modifying styles in `frontend/src/`.

2026-05-04

package.json

"author": "findinfinitelabs"

"repository": "findinfinitelabs/chuuk"


