terraform-skill
// Use when writing, reviewing, or debugging Terraform/OpenTofu modules, tests, CI, scans, or state ops - diagnoses failure mode (identity churn, secrets, blast radius, CI drift, state corruption) with version-aware guards.
| name | terraform-skill |
| description | Use when writing, reviewing, or debugging Terraform/OpenTofu modules, tests, CI, scans, or state ops - diagnoses failure mode (identity churn, secrets, blast radius, CI drift, state corruption) with version-aware guards. |
| license | Apache-2.0 |
| metadata | {"author":"Anton Babenko","version":"1.17.0"} |
Diagnose-first guidance for Terraform and OpenTofu. Core file is a workflow; depth lives in references loaded on demand.
Every Terraform/OpenTofu response must include:
- … terraform or tofu), exact version, providers, state backend, execution path (local/CI/Cloud/Atlantis), environment criticality. State assumptions explicitly if the user did not provide them.
- … fmt -check, validate, plan -out, policy check) tailored to runtime and risk tier.
- … moved, import), CI changes, policy rules.

Never recommend direct production apply without a reviewed plan artifact and approval.

| Failure category | Symptoms | Primary references |
|---|---|---|
| Identity churn | Resource addresses shift after refactor, count index churn, missing moved blocks | Code Patterns: count vs for_each, Code Patterns: moved blocks, Code Patterns: LLM mistakes |
| Secret exposure | Secrets in defaults, state, logs, CI artifacts | Security & Compliance, Code Patterns: write-only, State Management |
| Blast radius | Oversized stacks, shared prod/non-prod state, unsafe applies | State Management, Module Patterns |
| CI drift | Local plan ≠ CI plan, apply without reviewed artifact, unpinned versions | CI/CD Workflows, Code Patterns: versions |
| Compliance gaps | Missing policy stage, no approval model, no evidence retention | Security & Compliance, CI/CD Workflows |
| Testing blind spots | Plan-only validation of computed values, set-type indexing, mock/real confusion | Testing Frameworks |
| State corruption / recovery | Stuck lock, backend migration, drift reconciliation | State Management |
| Provider upgrade risk | Breaking-change provider bump, unpinned modules | Code Patterns: versions, Module Patterns |
| Provider lifecycle | Removing a provider with resources still in state, orphaned resources, removed block usage | State Management: Provider Removal |
| Bootstrap / orchestration misuse | null_resource + local-exec for bootstrap, remote-exec for setup scripts, provisioner stdout leaking secrets in CI logs | Code Patterns: Provisioners as Last Resort |
| Navigation / safe-rename blind spots | Cannot locate symbol defs/refs semantically, value-symbol rename done as blind text replace, grep-only refactor missing refs, hallucinated rg shim | Code Intelligence |
| Cross-cloud / provider mapping | "What's the Azure/GCP equivalent of X", picking a backend/auth model per cloud | State Management: Cross-cloud equivalents |
Activate when: creating or reviewing Terraform/OpenTofu configurations or modules, setting up or debugging tests, structuring multi-environment deployments, implementing IaC CI/CD, choosing module patterns or state organization, configuring or migrating remote state backends.
Don't use for: basic HCL syntax questions Claude already knows, provider API reference (link to docs), cloud-platform questions unrelated to Terraform/OpenTofu.
| Type | When to Use | Scope |
|---|---|---|
| Resource module | Single logical group of connected resources | VPC + subnets, SG + rules |
| Infrastructure module | Collection of resource modules for a purpose | Multiple resource modules in one region/account |
| Composition | Complete infrastructure | Spans multiple regions/accounts |
Flow: resource → resource module → infrastructure module → composition.
```
environments/   # prod/ staging/ dev/ - per-env configurations
modules/        # networking/ compute/ data/ - reusable modules
examples/       # minimal/ complete/ - docs + integration fixtures
```
Separate environments from modules. Use examples/ as both documentation and test fixtures. Keep modules small and single-responsibility.
See Module Patterns for architecture principles, naming conventions, variable/output contracts.
- … aws_instance.web_server, not aws_instance.main)
- `this` for genuine singleton resources only
- … vpc_cidr_block, not cidr)
- main.tf, variables.tf, outputs.tf, versions.tf

See Module Patterns: Variable Naming and Code Patterns: Block Ordering for examples.
Resource blocks: count/for_each first → arguments → tags → depends_on → lifecycle.
Variable blocks: description → type → default → validation → nullable → sensitive.
See Code Patterns: Block Ordering & Structure for the full rules and examples.
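As a minimal sketch of the resource-block ordering above: the variable names, the internet gateway, and the `t3.micro` instance type are illustrative assumptions, not taken from the skill.

```hcl
# Hypothetical inputs, declared only so the sketch is self-contained.
variable "create_web_server" {
  description = "Whether to create the web server instance."
  type        = bool
  default     = true
}

variable "ami_id" {
  description = "AMI ID for the web server."
  type        = string
}

variable "subnet_id" {
  description = "Subnet the web server is launched into."
  type        = string
}

variable "vpc_id" {
  description = "VPC that owns the internet gateway."
  type        = string
}

resource "aws_internet_gateway" "this" {
  vpc_id = var.vpc_id
}

resource "aws_instance" "web_server" {
  # 1. count / for_each first
  count = var.create_web_server ? 1 : 0

  # 2. arguments
  ami           = var.ami_id
  instance_type = "t3.micro"
  subnet_id     = var.subnet_id

  # 3. tags
  tags = {
    Name = "web-server"
  }

  # 4. depends_on
  depends_on = [aws_internet_gateway.this]

  # 5. lifecycle
  lifecycle {
    create_before_destroy = true
  }
}
```

Keeping the meta-arguments at fixed positions lets a reviewer see at a glance whether a resource is conditional and whether it carries lifecycle overrides.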
| Situation | Approach | Tools | Cost |
|---|---|---|---|
| Quick syntax check | Static analysis | validate, fmt | Free |
| Pre-commit validation | Static + lint | validate, tflint, trivy, checkov | Free |
| Terraform 1.6+, simple logic | Native test framework | terraform test | Free-Low |
| Pre-1.6, or Go expertise | Integration testing | Terratest | Low-Med |
| Security/compliance focus | Policy as code | OPA, Sentinel | Free |
| Cost-sensitive workflow | Mock providers (1.7+) | Native tests + mocks | Free |
| Multi-cloud, complex | Full integration | Terratest + real infra | Med-High |
Before writing test code: validate resource schemas via Terraform MCP so assertions target real attributes.
- command = plan: fast, for input-derived values only
- command = apply: required for computed values (ARNs, generated names) and set-type nested blocks
- … `[0]`: use `for` expressions or materialize via command = apply

See Testing Frameworks for static-analysis pipelines, native-test patterns, Terratest integration, mock providers, and the full LLM-mistake checklist.
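The plan/apply split can be sketched as a native test file. This assumes a hypothetical module under test that declares `var.vpc_cidr_block` and a resource `aws_vpc.this`; neither name comes from the skill, and the run names are made up for illustration.

```hcl
# tests/module_test.tftest.hcl (requires Terraform/OpenTofu 1.6+)
# Assumption: the module under test declares var.vpc_cidr_block and aws_vpc.this.

variables {
  vpc_cidr_block = "10.0.0.0/16"
}

# Input-derived value: known at plan time, so a plan-only run is enough.
run "cidr_is_passed_through" {
  command = plan

  assert {
    condition     = aws_vpc.this.cidr_block == "10.0.0.0/16"
    error_message = "VPC CIDR block does not match the input variable."
  }
}

# Computed value: the ARN is unknown until the resource exists,
# so this run has to apply.
run "arn_is_populated" {
  command = apply

  assert {
    condition     = startswith(aws_vpc.this.arn, "arn:")
    error_message = "VPC ARN was not populated after apply."
  }
}
```

Without a mock provider, the `apply` run creates real infrastructure for the duration of the test, so it belongs in a sandbox account.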
| Scenario | Use | Why |
|---|---|---|
| Boolean condition (create / don't) | count = condition ? 1 : 0 | Optional singleton toggle |
| Items may be reordered or removed | for_each = toset(list) | Stable resource addresses |
| Reference by key | for_each = map | Named access |
| Multiple named resources | for_each | Better identity stability |
Never use list index as long-lived identity — removing a middle element reshuffles every address after it. For the decision matrix, safe migration playbook, moved block patterns, and known-at-plan failure cases, see Code Patterns: count vs for_each.
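A minimal sketch of keyed identity, plus the `moved` blocks that migrate from index-based addresses. The resource type, the variable name, and the user names are illustrative assumptions.

```hcl
variable "user_names" {
  description = "IAM user names to create."
  type        = set(string)
  default     = ["alice", "bob", "carol"]
}

# Each instance is addressed by its key, e.g. aws_iam_user.team["bob"].
# Removing "bob" from the set touches only that one address.
resource "aws_iam_user" "team" {
  for_each = var.user_names

  name = each.key
}

# Assumption: the resource previously used count over a list, so state
# holds team[0], team[1], team[2]. These blocks re-key the existing
# objects instead of destroying and recreating them.
moved {
  from = aws_iam_user.team[0]
  to   = aws_iam_user.team["alice"]
}

moved {
  from = aws_iam_user.team[1]
  to   = aws_iam_user.team["bob"]
}

moved {
  from = aws_iam_user.team[2]
  to   = aws_iam_user.team["carol"]
}
```

After the migration, a plan should list the three moves and propose no creates or destroys.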
Using try() in a local to prefer a conditional resource's attribute over its parent is a specialized but high-value pattern — it forces correct deletion order without explicit depends_on. Common use: VPC + secondary CIDR associations + subnets.
See Code Patterns: Locals for Dependency Management for the full pattern and worked example.
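A minimal sketch of the `try()` local, assuming a VPC with one optional secondary CIDR association; the variable names and the subnet are hypothetical.

```hcl
variable "vpc_cidr_block" {
  description = "Primary CIDR block of the VPC."
  type        = string
}

variable "secondary_cidr_block" {
  description = "Optional secondary CIDR block; null disables the association."
  type        = string
  default     = null
}

variable "subnet_cidr_block" {
  description = "CIDR block of the subnet."
  type        = string
}

resource "aws_vpc" "this" {
  cidr_block = var.vpc_cidr_block
}

resource "aws_vpc_ipv4_cidr_block_association" "this" {
  count = var.secondary_cidr_block != null ? 1 : 0

  vpc_id     = aws_vpc.this.id
  cidr_block = var.secondary_cidr_block
}

locals {
  # Prefer the association's vpc_id when the association exists,
  # otherwise fall back to the VPC itself. Both yield the same ID,
  # but the first reference adds the association to the dependency graph.
  vpc_id = try(aws_vpc_ipv4_cidr_block_association.this[0].vpc_id, aws_vpc.this.id)
}

resource "aws_subnet" "this" {
  vpc_id     = local.vpc_id
  cidr_block = var.subnet_cidr_block
}
```

Because the subnet reads `local.vpc_id`, it depends on the association whenever one exists, so a destroy removes the subnet before the CIDR association it sits in.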
Standard layout:
```
my-module/
├── README.md        # Usage documentation
├── main.tf          # Primary resources
├── variables.tf     # Typed inputs with descriptions
├── outputs.tf       # Output values
├── versions.tf      # required_version + required_providers
├── examples/
│   ├── minimal/
│   └── complete/
└── tests/
    └── module_test.tftest.hcl   # or Go for Terratest
```
Variable contracts: always description, always explicit type, use validation for complex constraints, use sensitive = true for secrets, prefer optional() with typed defaults (1.3+) over untyped map(any).
Output contracts: always description, mark sensitive outputs, expose stable subsets (not whole provider objects).
See Module Patterns for the full contract patterns, module release checklist, and LLM-mistake checklist.
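The variable and output contracts might look like the sketch below. The `database` object and its attribute defaults are hypothetical and only serve to show `optional()` with typed defaults.

```hcl
variable "vpc_cidr_block" {
  description = "IPv4 CIDR block for the VPC, e.g. 10.0.0.0/16."
  type        = string

  validation {
    # cidrhost() fails on malformed CIDRs, and can() turns that failure into false.
    condition     = can(cidrhost(var.vpc_cidr_block, 0))
    error_message = "vpc_cidr_block must be a valid IPv4 CIDR block."
  }

  nullable = false
}

variable "database" {
  description = "Database settings; omitted attributes fall back to typed defaults."
  type = object({
    engine = optional(string, "postgres")
    port   = optional(number, 5432)
  })
  default = {}
}

# Expose a stable, narrow value rather than the whole object.
output "database_port" {
  description = "Port the database listens on."
  value       = var.database.port
}
```

A caller can pass `database = { port = 6432 }` and still get `engine = "postgres"`, which an untyped `map(any)` cannot guarantee.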
Pipeline stages: validate → test → plan → apply (with environment protection).
Cost control: mock providers on PR validation, real-cloud integration only on main or scheduled, tag test resources, auto-cleanup.
Drift prevention: pin runtime and providers, commit .terraform.lock.hcl, apply the reviewed plan artifact from the plan stage (do not re-run plan inside the apply job), run policy/security stage on every path to apply.
See CI/CD Workflows for GitHub Actions, GitLab CI, and Atlantis templates plus the LLM-mistake checklist.
Essential checks:
```bash
trivy config .
checkov -d .
```
Don't: store secrets in variables or .tfvars, use default VPC, skip encryption, open security groups to 0.0.0.0/0, use inline ingress/egress blocks in aws_security_group.
Do: source secrets from a cloud secret manager (AWS Secrets Manager / Azure Key Vault / GCP Secret Manager) or use write_only arguments on 1.11+, create dedicated VPCs, enforce encryption at rest and TLS, least-privilege SGs, use separate aws_vpc_security_group_{ingress,egress}_rule resources (e.g. AWS provider v5+).
Marking a variable sensitive = true masks display only — the value still lives in state. Use write_only / *_wo on 1.11+, or keep secret material out of Terraform entirely via runtime lookups.
See Security & Compliance for trivy/checkov pipelines, state-file hardening, compliance mappings, and the LLM-mistake checklist.
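A minimal sketch of a security group with standalone rule resources instead of inline `ingress`/`egress` blocks. The group name, the port, and both variables are illustrative assumptions.

```hcl
variable "vpc_id" {
  description = "VPC the security group belongs to."
  type        = string
}

variable "allowed_cidr_ipv4" {
  description = "IPv4 CIDR allowed to exchange HTTPS traffic with the service."
  type        = string
}

# The group itself carries no inline rules.
resource "aws_security_group" "web" {
  name   = "web"
  vpc_id = var.vpc_id
}

# One resource per rule: each rule has its own address in state.
resource "aws_vpc_security_group_ingress_rule" "https_in" {
  security_group_id = aws_security_group.web.id

  cidr_ipv4   = var.allowed_cidr_ipv4
  ip_protocol = "tcp"
  from_port   = 443
  to_port     = 443
}

resource "aws_vpc_security_group_egress_rule" "https_out" {
  security_group_id = aws_security_group.web.id

  cidr_ipv4   = var.allowed_cidr_ipv4
  ip_protocol = "tcp"
  from_port   = 443
  to_port     = 443
}
```

With one resource per rule, adding or removing a rule appears in the plan as a change to that rule alone.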
Never use local state in teams or production. Remote backends provide automatic locking, encryption, versioning, audit logging, and safe collaboration.
AWS example (Azure azurerm / GCP gcs / TF Cloud syntax: see State Management: Choosing a Remote Backend):
```hcl
terraform {
  backend "s3" {
    bucket       = "my-terraform-state"
    key          = "prod/vpc/terraform.tfstate"
    region       = "us-east-1"
    encrypt      = true
    use_lockfile = true # Native S3 locking, 1.10+
  }
}
```
On Terraform < 1.10, use dynamodb_table = "terraform-state-lock" instead of use_lockfile. Azure Storage, GCS, and Terraform Cloud all offer built-in locking - see the State Management reference for syntax. For choosing among backends and their locking models, see Choosing a Remote Backend.
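For runtimes below 1.10, the same backend with DynamoDB locking would look roughly like this. It reuses the bucket and key from the example above, and the lock table is assumed to exist already.

```hcl
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/vpc/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock" # DynamoDB-based locking, pre-1.10
  }
}
```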
| Pattern | Use When | Example Path |
|---|---|---|
| Per environment | Different teams per env | prod/terraform.tfstate, staging/... |
| Per component | Independent lifecycles | prod/vpc/, prod/eks/, prod/rds/ |
| Hybrid (recommended) | Both benefits | prod/networking/, prod/compute/, staging/networking/ |
Split state when: different teams, different update cadences, or >500 resources. Combine when: tightly coupled resources, <100 resources, same lifecycle.
See State Management for locking, migration, multi-team isolation, disaster recovery, and the LLM-mistake checklist.
| Component | Strategy | Example |
|---|---|---|
| Terraform runtime | Pin minor | required_version = "~> 1.9.0" |
| Providers | Pin major | version = "~> 5.0" |
| Modules (prod) | Pin exact | version = "5.1.2" |
| Modules (dev) | Allow patch | version = "~> 5.1.0" |
Commit .terraform.lock.hcl intentionally. Keep provider/runtime upgrades in a separate PR from functional changes. See Code Patterns: Version Management for constraint syntax and upgrade workflow.
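A minimal sketch of these strategies in one configuration. The `~>` operator lets only the rightmost component you write increase, so the number of components decides how tight the pin is. The registry module is used purely as an illustration of an exact pin.

```hcl
# versions.tf
terraform {
  # Pin minor: >= 1.9.0 and < 1.10.0, so only patch releases are accepted.
  required_version = "~> 1.9.0"

  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Pin major: >= 5.0 and < 6.0.
      version = "~> 5.0"
    }
  }
}

# main.tf
module "vpc" {
  source = "terraform-aws-modules/vpc/aws"
  # Production: exact pin, upgraded deliberately in its own PR.
  version = "5.1.2"
}
```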
| Feature | Min version | Common use |
|---|---|---|
| try() | 0.13+ | Safe fallbacks, replaces element(concat()) |
| nullable = false | 1.1+ | Prevent null silently overriding defaults |
| moved blocks | 1.1+ | Refactor without destroy/recreate |
| optional() with defaults | 1.3+ | Typed object attributes |
| import blocks | 1.5+ | Declarative imports, reviewable in VCS |
| check blocks | 1.5+ | Runtime assertions |
| Native terraform test | 1.6+ | Built-in test framework |
| Mock providers | 1.7+ | Cost-free unit testing |
| removed blocks | 1.7+ | Declarative resource removal |
| Provider-defined functions | 1.8+ | Provider-specific transformations (requires provider to declare functions) |
| Cross-variable validation | 1.9+ | Reference other var.* in validation blocks |
| write_only arguments | 1.11+ | Secrets never stored in state |
| S3 native lock-file | 1.10+ | State locking without DynamoDB |
Before emitting a feature, verify the runtime floor. See Code Patterns: Feature Guard Table for the full table with common LLM error patterns per feature.
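Three of the version-gated features, sketched side by side. The bucket name, the `aws_instance.legacy` address, and the size variables are hypothetical.

```hcl
# 1.5+: declarative import, reviewed in the plan like any other change.
import {
  to = aws_s3_bucket.logs
  id = "example-logs-bucket" # hypothetical existing bucket
}

resource "aws_s3_bucket" "logs" {
  bucket = "example-logs-bucket"
}

# 1.7+: stop managing a resource without destroying it.
# The resource block for aws_instance.legacy must already be deleted
# from the configuration.
removed {
  from = aws_instance.legacy

  lifecycle {
    destroy = false
  }
}

# 1.9+: a validation block may reference another variable.
variable "min_size" {
  description = "Minimum number of instances."
  type        = number
}

variable "max_size" {
  description = "Maximum number of instances."
  type        = number

  validation {
    condition     = var.max_size >= var.min_size
    error_message = "max_size must be greater than or equal to min_size."
  }
}
```

On a runtime below the listed floor, each of these blocks fails at parse or validate time, which is why the floor is checked before the feature is emitted.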
- terraform test / tofu test available: migrate simple unit tests, keep Terratest for complex integration.
- Native S3 locking (use_lockfile) is the correct default for new configurations; DynamoDB locking is no longer required.
- write_only arguments for secret handling keep credentials out of state.

Semantic navigation for HCL. terraform-ls is optional; without it every row below degrades to a disclosed rg + Read fallback.
Self-contained terraform-ls layer of a generic code-intelligence discipline - apply the rows below directly. Recommended companion: the code-intelligence plugin (same antonbabenko/agent-plugins marketplace) carries the generic discipline (position anchoring, degradation gate, disclosure format, anti-phantom-shim) and ships /code-intelligence:doctor for readiness. If it is installed, defer to its generic protocol; this skill stays fully self-contained without it.
| Goal | Use | Tradeoff |
|---|---|---|
| Find definition / all references | terraform-ls goToDefinition / findReferences | Needs init + a position anchor |
| Rename value symbol (var/local/output/provider alias) | Manual: findReferences -> per-file fresh Read -> edit -> validate | No rename provider |
| Rename resource/module address | moved block + plan shows 0 destroy | Text rename forces destroy/recreate |
| Exact text / known name / .tfvars / non-HCL | rg + Read | No semantic scope |
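The resource-address rename row can be sketched as follows, assuming a hypothetical instance that was previously declared as `aws_instance.main`. The AMI variable and instance type are illustrative.

```hcl
variable "ami_id" {
  description = "AMI ID for the web server."
  type        = string
}

# Renamed from aws_instance.main; the arguments are unchanged.
resource "aws_instance" "web_server" {
  ami           = var.ami_id
  instance_type = "t3.micro"
}

# Records the rename so the existing object is re-addressed in state
# rather than destroyed and recreated.
moved {
  from = aws_instance.main
  to   = aws_instance.web_server
}
```

The rename is safe to merge when the plan reports the move and nothing to destroy.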
✅ Supported: goToDefinition, findReferences, documentSymbol, hover, workspaceSymbol.
❌ Unsupported: goToImplementation, call hierarchy, rename provider. Do not call these then report their absence as a finding.
- terraform/tofu on PATH, terraform init run; cold start may need one retry.
- … file:line:character) - anchor with rg first, never symbol-name-only.

Depth: Code Intelligence.
Progressive disclosure — essentials here, depth on demand:
- … terraform_remote_state rules, release checklist
- … count/for_each deep dive, modern features, version management, locals

Apache License 2.0. See LICENSE for full terms.
Copyright © 2026 Anton Babenko