| name | observability |
| description | Use when a runtime behavior change needs diagnosable signals: async/background jobs, external calls, user-visible operations, or incident-prone flows. Do not use for docs-only or refactor-only changes with no behavior shift. |
| metadata | {"short-description":"Observability plan and checklist"} |
Purpose
Use this skill to make important runtime behavior diagnosable through logs, metrics, traces, correlation identifiers, and safe logging practices.
When to use
Use this skill when a change adds or changes runtime behavior where operators or developers need evidence to diagnose outcomes, latency, or failures:
- Async jobs, queues, schedulers, workers, or other background work.
- External calls to APIs, databases, filesystems, message brokers, or third-party services.
- User-visible operations where success, failure, or degraded behavior must be explainable.
- Incident-prone flows such as retries, timeouts, fallbacks, rate limits, auth, payments, imports, exports, or migrations.
Do not use this skill for docs-only changes, pure formatting, mechanical refactors, or tests that do not change runtime behavior.
How to use
- If this skill is triggered, open references/observability.md and follow only the relevant template sections.
- Live-discover existing instrumentation before adding examples: logging config, metric/tracing libraries, dashboards/queries, external dependency interfaces, schema/config paths, connection state, and relevant version/status output.
- Define the operations that need to be observable (user-facing or system-facing actions).
- Identify correlation identifiers (request_id / job_id / trace_id) and ensure they are logged consistently; see the correlation-id sketch after this list.
- Add the minimum log events: start / outcome / failure, with required fields; see the start/outcome/failure sketch below.
- Add metrics for errors and latency (expand to golden signals if relevant); see the metrics sketch below.
- Add trace spans and ensure logs and metrics are correlated via identifiers; see the tracing sketch below.
- Apply safety rules (no secrets/PII; follow OWASP/NIST logging guidance); see the redaction sketch below.
- Control noise (sampling, throttling, or once-only logging); see the noise-control sketch below.
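The sketches below use Python as one concrete option; the skill itself is language-agnostic, and every module, function, and field name in them is illustrative rather than taken from an existing codebase. First, the correlation-id sketch: a context variable carries one id per operation, and a logging filter stamps it onto every record so all log lines for that operation share the identifier.

```python
import logging
import uuid
from contextvars import ContextVar

# One id per in-flight operation; ContextVar keeps it correct under async/threads.
request_id: ContextVar[str] = ContextVar("request_id", default="-")

class CorrelationFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id.get()  # stamp every record with the current id
        return True

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s [%(request_id)s] %(message)s",
)
log = logging.getLogger("app")
log.addFilter(CorrelationFilter())

def handle_request(payload: dict) -> None:  # hypothetical entry point
    request_id.set(uuid.uuid4().hex)  # all log lines for this request share the id
    log.info("request received with %d fields", len(payload))
```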
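Next, the start/outcome/failure sketch: a wrapper that emits the minimum event set with a shared job_id and a duration field. The `run_job` helper and its field names are assumptions for illustration.

```python
import logging
import time

log = logging.getLogger("jobs")

def run_job(job_id: str, work) -> None:
    """Emit start / success / failure events carrying the same job_id."""
    log.info("job.start", extra={"job_id": job_id})
    started = time.monotonic()
    try:
        work()
    except Exception:
        # logger.exception records the traceback next to the structured fields
        log.exception(
            "job.failure",
            extra={"job_id": job_id, "duration_s": time.monotonic() - started},
        )
        raise
    log.info(
        "job.success",
        extra={"job_id": job_id, "duration_s": time.monotonic() - started},
    )
```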
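The metrics sketch assumes the prometheus_client library; an error counter and a latency histogram cover the minimum, and the remaining golden signals (traffic, saturation) follow the same pattern. `checkout`, `submit_order`, and the metric names are hypothetical.

```python
from prometheus_client import Counter, Histogram

CHECKOUT_ERRORS = Counter(
    "checkout_errors_total", "Failed checkout operations", ["reason"]
)
CHECKOUT_LATENCY = Histogram(
    "checkout_latency_seconds", "Checkout latency in seconds"
)

def submit_order(cart: dict) -> None:
    ...  # hypothetical downstream call

def checkout(cart: dict) -> None:
    with CHECKOUT_LATENCY.time():  # records elapsed seconds when the block exits
        try:
            submit_order(cart)
        except TimeoutError:
            CHECKOUT_ERRORS.labels(reason="timeout").inc()
            raise
```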
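The tracing sketch uses the OpenTelemetry Python API and assumes an SDK and exporter are configured elsewhere; span and attribute names are illustrative. The point is that the span carries the same identifier that appears in the logs and metric labels, so the three signals can be joined during an incident.

```python
from opentelemetry import trace

tracer = trace.get_tracer("payments")

def charge(order_id: str, amount_cents: int) -> None:
    # Put the correlation id on the span so logs, metrics, and traces
    # can be joined on the same value.
    with tracer.start_as_current_span("payments.charge") as span:
        span.set_attribute("order.id", order_id)
        span.set_attribute("amount.cents", amount_cents)
```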
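The redaction sketch shows one mechanical backstop for the no-secrets/PII rule; the primary control is still not logging sensitive values in the first place, and the pattern list here is illustrative, not a vetted denylist. Attaching the filter to a handler (rather than a logger) applies it to every record that handler emits.

```python
import logging
import re

# Illustrative patterns only; a real deployment needs a reviewed, broader list.
SENSITIVE = re.compile(
    r"(authorization|password|api[_-]?key)\s*[=:]\s*\S+", re.IGNORECASE
)

class RedactingFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Redacts the message template; arguments logged via %-args need the
        # same treatment if they can carry user data.
        record.msg = SENSITIVE.sub(r"\1=[REDACTED]", str(record.msg))
        return True

handler = logging.StreamHandler()
handler.addFilter(RedactingFilter())
logging.getLogger().addHandler(handler)
```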
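Finally, the noise-control sketch: once-only logging for repetitive failures. Sampling or token-bucket throttling follow the same shape, a small guard in front of the log call; `log_once` is a hypothetical helper.

```python
import logging

log = logging.getLogger("app")
_logged_once: set[str] = set()

def log_once(key: str, message: str, *args) -> None:
    """Log a warning the first time `key` is seen, then stay silent."""
    if key not in _logged_once:
        _logged_once.add(key)
        log.warning(message + " (further occurrences suppressed)", *args)
```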
Output expectation
- Record decisions in the Observability Plan.
- Include live-discovery evidence: inspected files/commands, version/status output, connection state, and log/metric/trace artifact paths or dashboard/query links.
- Ensure the quality gate’s observability checklist passes.