Generate high-quality training datasets from documents, text corpora, EPUBs, and structured content. Use when creating AI training data from dictionaries, Bible EPUBs, or brochures, or when generating examples for machine learning models. Optimized for low-resource languages and domain-specific knowledge extraction. Supports parallel corpus extraction from NWT Bible EPUBs.
Deploy the Chuuk Dictionary stack (main app + Ollama sidecar) to Azure Container Apps. Covers ACR remote builds via `az acr build`, Key Vault prerequisites, Cosmos DB credential injection, and the env-var contract. Use when running a deploy, debugging a failed deploy, or modifying Azure infrastructure.
Parse JW.org NWT EPUBs to extract individual Bible verses for verse previews, parallel-corpus building, and Bible-coverage analytics. Use when working with NWT EPUB files, building parallel Chuukese↔English Bible data, or wiring scripture preview features.
Specialized processing for Chuukese-language text, including tokenization, accent handling, cultural context preservation, and language-specific patterns. Use when working with Chuukese text or translation tasks, or when building language models for this Micronesian language.
Code documentation standards and guidelines for keeping documentation up to date across Python, HTML, CSS, and JavaScript codebases. Use when creating or modifying code, so that it stays properly documented and maintainable.
Styling conventions for the Chuuk Dictionary frontend — Mantine v8 theme, CSS Modules per page, global app-shell CSS, multilingual / accented-character considerations. Use when adding or modifying styles in `frontend/src/`.
Conventions for the Chuuk Dictionary persistence layer — Azure Cosmos DB (MongoDB API) via `db_factory`, `DictionaryDB`, `UserDB`, and `PublicationManager`. Covers connection-resolution order, the actual collection/method names in use, and the managed-identity path. Use when adding queries, debugging connection issues, or extending the schema.
Actual state of the Chuuk Dictionary container builds: a multi-stage Flask + React app image and a separate Ollama sidecar image. No docker-compose is used. Use when modifying the Dockerfiles, debugging build failures, or adding system dependencies.