Name: Developer Experience
Author: a53ali

// Audit, measure, and improve developer experience (DX) using the SPACE framework and DORA metrics. Covers inner loop vs. outer loop friction, DX anti-patterns (slow CI, flaky tests, toolchain toil), DX improvement backlog prioritization, developer surveys, build/test benchmarks, and onboarding friction analysis.

updated: March 16, 2026, 23:16

"author": "a53ali"

"repository": "a53ali/ai-dev"

name	developer-experience
description	Audit, measure, and improve developer experience (DX) using the SPACE framework and DORA metrics. Covers inner loop vs. outer loop friction, DX anti-patterns (slow CI, flaky tests, toolchain toil), DX improvement backlog prioritization, developer surveys, build/test benchmarks, and onboarding friction analysis.
triggers	["developer experience","DX","inner loop","slow CI","developer productivity","improve developer tooling","toil","friction in our workflow","developer satisfaction","flaky tests","build times","local dev setup","onboarding friction","SPACE framework"]
audience	engineer, manager

Developer Experience (DX)

Context

Developer experience is the sum of all interactions a developer has with their tools, processes, systems, and team while building software. Poor DX is not a comfort issue — it is a productivity, quality, and retention issue. Slow CI, flaky tests, painful local setups, and toil-heavy deploys compound across hundreds of developer-hours per week.

DX encompasses two loops:

Inner loop: The tight feedback cycle a developer runs locally — edit → build → test → run. Optimizing this has the highest per-developer ROI.
Outer loop: The collaborative cycle from code commit to production — PR → CI → review → merge → deploy → observe. Optimizing this improves team velocity and release confidence.

Use this skill when:

Engineers report spending significant time on non-product work
CI times have grown beyond 10–15 minutes
Onboarding new engineers to productivity takes more than a week
Developer satisfaction survey scores are declining
You want to run a DX audit before starting a platform engineering initiative

Core Frameworks

The SPACE Framework (Forsgren et al., 2021)

SPACE provides a multidimensional view of developer productivity. No single metric captures it all, so use at least one metric per dimension; the original paper recommends covering at least three dimensions as a minimum.

Dimension	What It Measures	Example Metrics
Satisfaction	Developer wellbeing and job satisfaction	Survey NPS, burnout indicators, retention rate
Performance	Outcomes of developer work	Defect rate, reliability, customer satisfaction
Activity	Volume of actions and output	PRs merged, deploys per day, code review turnaround
Communication	Effectiveness of collaboration	PR review time, blocked time, meeting load
Efficiency	Low friction, flow state, minimal interruption	Build time, deploy time, time in flow, context switches

Anti-pattern: Using only Activity metrics (commits, PRs, lines of code) as a proxy for productivity. This incentivizes the wrong behaviors and ignores quality and wellbeing.

DORA Metrics (DevOps Research and Assessment)

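DORA metrics measure the effectiveness of software delivery. The bands below are approximate, because the published thresholds shift between the annual State of DevOps reports: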

Metric	Elite	High	Medium	Low
Deployment frequency	Multiple/day	1/day–1/week	1/week–1/month	< 1/month
Lead time for changes	< 1 hour	1 day–1 week	1 week–1 month	> 6 months
Change failure rate	< 5%	5–10%	10–15%	> 15%
Time to restore service	< 1 hour	< 1 day	1 day–1 week	> 1 week
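
As a rough illustration of how three of these metrics could be derived from deployment records, here is a minimal Python sketch. The `Deploy` record, its field names, and the sample data are hypothetical; a real pipeline would pull equivalent data from its CI/CD and incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Deploy:
    """Hypothetical deployment record; field names are illustrative."""
    committed_at: datetime  # when the change was committed
    deployed_at: datetime   # when it reached production
    failed: bool            # True if it led to a rollback, hotfix, or incident


def dora_summary(deploys: list[Deploy], window_days: int) -> dict[str, float]:
    """Summarize deployment frequency, lead time, and change failure rate."""
    if not deploys:
        raise ValueError("need at least one deploy")
    lead_times_h = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600 for d in deploys
    ]
    return {
        "deploys_per_day": len(deploys) / window_days,
        "median_lead_time_hours": median(lead_times_h),
        "change_failure_rate_pct": 100 * sum(d.failed for d in deploys) / len(deploys),
    }


# Invented sample: four deploys in a one-week window, one of which failed.
start = datetime(2026, 3, 2, 9, 0)
sample = [
    Deploy(start, start + timedelta(hours=2), failed=False),
    Deploy(start + timedelta(days=1), start + timedelta(days=1, hours=4), failed=True),
    Deploy(start + timedelta(days=2), start + timedelta(days=2, hours=6), failed=False),
    Deploy(start + timedelta(days=3), start + timedelta(days=3, hours=1), failed=False),
]
summary = dora_summary(sample, window_days=7)
print(summary)
```

Time to restore service is left out because it needs incident start and end times, which this record does not carry.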

Inner Loop vs. Outer Loop

Inner Loop (Local Developer Workflow)

The inner loop is everything a developer does before pushing code. Each step should take from under a second to, at most, a minute or two.

Edit → Build → Test → Run (repeat)

Target benchmarks:

Step	Acceptable	Good	Elite
Hot reload / incremental build	< 5s	< 2s	< 500ms
Unit test suite (local)	< 2 min	< 30s	< 10s
Local service start	< 60s	< 15s	< 5s
Lint + type check	< 60s	< 15s	< 5s
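
One way to apply the table above is to classify a measured duration against its tiers. The sketch below is illustrative only: the dictionary keys and the `tier` function are hypothetical names, and the thresholds are copied from the table in seconds.

```python
# Upper bounds in seconds, copied from the inner loop table:
# (acceptable, good, elite). The keys are hypothetical step identifiers.
INNER_LOOP_TARGETS = {
    "incremental_build": (5.0, 2.0, 0.5),
    "unit_tests": (120.0, 30.0, 10.0),
    "service_start": (60.0, 15.0, 5.0),
    "lint_typecheck": (60.0, 15.0, 5.0),
}


def tier(step: str, seconds: float) -> str:
    """Return the best tier whose upper bound the measurement beats."""
    acceptable, good, elite = INNER_LOOP_TARGETS[step]
    if seconds < elite:
        return "Elite"
    if seconds < good:
        return "Good"
    if seconds < acceptable:
        return "Acceptable"
    return "Below target"


print(tier("incremental_build", 0.3))
print(tier("unit_tests", 45))
print(tier("service_start", 90))
```

Feeding it medians rather than single runs should give a steadier picture, since individual timings vary with cache state and machine load.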

Common inner loop pain points:

Cold Docker builds that take 5+ minutes
No incremental compilation (full rebuild on every change)
Test suite requires external services (DB, S3) that aren't mocked locally
No local dev environment parity with staging (missing env vars, secrets)
"Works on my machine" issues from inconsistent toolchain versions

Outer Loop (Team Workflow)

The outer loop is everything from push to production. Slow outer loops delay feedback and accumulate risk.

git push → CI (lint/test/build) → PR review → merge → deploy → observe

Target benchmarks:

Step	Acceptable	Good	Elite
CI pipeline total time	< 15 min	< 10 min	< 5 min
PR review turnaround	< 24 hours	< 4 hours	< 1 hour
Merge to production deploy	< 30 min	< 15 min	< 5 min
Deploy rollback time	< 15 min	< 5 min	< 2 min

Common outer loop pain points:

Flaky tests that require manual re-runs (trust erosion)
Sequential CI steps that could be parallelized
Manual approval gates for every deploy (not just production)
No deploy preview / ephemeral environments for review
Large PRs with long review cycles (fix: split work into smaller PRs)

DX Anti-Patterns

1. Slow CI

Symptom: CI > 15 minutes. Developers stop watching the build, switch to other work, and lose track of what they were doing.
Fix: Parallelize test jobs. Cache dependencies. Only run affected tests (test impact analysis). Use incremental builds.
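
To make "only run affected tests" concrete, here is a minimal sketch of test impact analysis in Python. It assumes a hypothetical `coverage_map` from each test to the source files it exercises (for example, derived from a coverage run); the test and file names are invented for illustration.

```python
def affected_tests(
    coverage_map: dict[str, set[str]], changed_files: set[str]
) -> list[str]:
    """Select tests whose covered files overlap the changed files."""
    return sorted(
        test for test, files in coverage_map.items() if files & changed_files
    )


# Hypothetical mapping: test name -> source files it exercises.
coverage_map = {
    "test_login": {"auth/session.py", "auth/tokens.py"},
    "test_checkout": {"cart/checkout.py", "payments/client.py"},
    "test_profile": {"auth/session.py", "users/profile.py"},
}

selected = affected_tests(coverage_map, {"auth/session.py"})
print(selected)
```

A real setup would likely also need a fallback that runs the full suite when the map is stale or when a change touches shared configuration.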

2. Flaky Tests

Symptom: Tests that fail intermittently with no code change. Developers learn to re-run CI blindly.
Fix: Quarantine flaky tests immediately (tag + skip + alert). Fix or delete. Track flakiness rate as a metric.
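
Tracking flakiness as a metric needs a working definition. The sketch below assumes one possible definition: a test is flaky if it both passed and failed on the same commit. The tuple layout and the sample runs are hypothetical.

```python
from collections import defaultdict

# Each run is (commit_sha, test_name, passed); this layout is an assumption.
Run = tuple[str, str, bool]


def flaky_tests(runs: list[Run]) -> set[str]:
    """Return tests that both passed and failed on the same commit."""
    outcomes: dict[tuple[str, str], set[bool]] = defaultdict(set)
    for sha, test, passed in runs:
        outcomes[(sha, test)].add(passed)
    return {test for (_, test), seen in outcomes.items() if seen == {True, False}}


def flakiness_rate(runs: list[Run]) -> float:
    """Percentage of distinct tests that were flaky at least once."""
    all_tests = {test for _, test, _ in runs}
    if not all_tests:
        return 0.0
    return 100 * len(flaky_tests(runs)) / len(all_tests)


runs = [
    ("abc", "test_login", False),
    ("abc", "test_login", True),    # same commit, different outcome: flaky
    ("abc", "test_cart", True),
    ("def", "test_cart", True),
    ("def", "test_search", False),
    ("def", "test_search", False),  # consistently failing, not flaky
]
print(flaky_tests(runs), flakiness_rate(runs))
```

Other definitions are possible, such as the share of CI runs that needed a retry; whichever is chosen should stay fixed so the trend is comparable over time.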

3. Poor Local Dev Setup

Symptom: Onboarding docs say "ask Sarah" or "it took me 3 days to get running."
Fix: Automate setup with a single script (make setup or ./scripts/bootstrap.sh). Test it on clean machines monthly.

4. Toil-Heavy Deploys

Symptom: Deploying requires manual steps, Slack approvals, editing YAML by hand, or running specific commands in a specific order only the senior engineer knows.
Fix: Automate the deploy path entirely. Document the blast radius, add rollback, and make it a one-click or auto-on-merge operation.

5. Broken Toolchain

Symptom: Engineers have different Node/Python/Go versions. npm install fails on some machines. Lockfiles get corrupted.
Fix: Pin versions in .nvmrc, .python-version, go.mod. Use Nix, devcontainers, or Docker Compose for reproducible local environments.

6. Notification Overload

Symptom: Developers are @-mentioned in Slack constantly. PRs ping the wrong people. Alerts fire for things that aren't actionable.
Fix: Audit notification surfaces. Use CODEOWNERS for targeted PR assignment. Classify alerts as actionable vs. informational.

7. Context Switching from Interruptions

Symptom: Engineers can't find "maker time." Average uninterrupted focus block is < 25 minutes.
Fix: Introduce focus hours (no meetings 9am–12pm). Batch async communication. Reduce synchronous meeting load.

DX Audit Checklist

Run this audit when starting a DX improvement program. Score each item 0 (broken), 1 (partial), 2 (good).

Inner Loop Audit

INNER LOOP — Edit → Build → Test → Run

Environment Setup
[ ] New developer can be productive in < 1 day                    [ 0 | 1 | 2 ]
[ ] Setup is automated (single command or script)                 [ 0 | 1 | 2 ]
[ ] Toolchain versions are pinned and enforced                    [ 0 | 1 | 2 ]
[ ] Local environment matches staging (parity)                    [ 0 | 1 | 2 ]
[ ] Secrets are accessible without manual steps                   [ 0 | 1 | 2 ]

Build Speed
[ ] Incremental build < 5 seconds for typical change              [ 0 | 1 | 2 ]
[ ] Full clean build < 3 minutes                                  [ 0 | 1 | 2 ]
[ ] Hot reload / watch mode available                             [ 0 | 1 | 2 ]

Test Speed
[ ] Unit test suite runs in < 2 minutes locally                   [ 0 | 1 | 2 ]
[ ] Tests do not require network or external services by default  [ 0 | 1 | 2 ]
[ ] Flaky test rate < 1% (tracked as a metric)                    [ 0 | 1 | 2 ]
[ ] Tests can be run in parallel locally                          [ 0 | 1 | 2 ]

Local Run
[ ] Service starts in < 60 seconds                                [ 0 | 1 | 2 ]
[ ] Local service URLs and ports are documented                   [ 0 | 1 | 2 ]
[ ] Docker Compose or equivalent for local service dependencies   [ 0 | 1 | 2 ]

INNER LOOP SCORE: ___/30

Outer Loop Audit

OUTER LOOP — PR → CI → Review → Merge → Deploy → Observe

CI Pipeline
[ ] Total CI time < 15 minutes                                    [ 0 | 1 | 2 ]
[ ] CI pipeline is parallelized                                   [ 0 | 1 | 2 ]
[ ] Dependency caching is configured                              [ 0 | 1 | 2 ]
[ ] Test flakiness rate < 1% in CI                                [ 0 | 1 | 2 ]
[ ] CI failure provides clear, actionable error messages          [ 0 | 1 | 2 ]

Code Review
[ ] Average PR review turnaround < 24 hours                       [ 0 | 1 | 2 ]
[ ] CODEOWNERS configured for automatic review assignment         [ 0 | 1 | 2 ]
[ ] PR description template exists and is used                    [ 0 | 1 | 2 ]
[ ] Average PR size < 400 lines changed                           [ 0 | 1 | 2 ]

Deployment
[ ] Deploy to staging is automatic on merge                       [ 0 | 1 | 2 ]
[ ] Production deploy is a single action (not multi-step manual)  [ 0 | 1 | 2 ]
[ ] Rollback takes < 5 minutes                                    [ 0 | 1 | 2 ]
[ ] Deploy preview environments exist for review                  [ 0 | 1 | 2 ]
[ ] Feature flags available for risky changes                     [ 0 | 1 | 2 ]

Observability
[ ] Logs are structured and searchable                            [ 0 | 1 | 2 ]
[ ] Basic metrics and dashboards exist per service                [ 0 | 1 | 2 ]
[ ] Alerts are actionable (low false positive rate)               [ 0 | 1 | 2 ]
[ ] On-call runbooks exist and are up to date                     [ 0 | 1 | 2 ]

OUTER LOOP SCORE: ___/36

TOTAL DX SCORE: ___/66
  0-25:  Critical - significant toil and friction
  26-40: Moderate - several high-impact improvements needed
  41-52: Good - targeted improvements will have high ROI
  53-66: Strong - focus on measurement and continuous improvement
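
Tallying the checklist can be scripted. The following is a minimal sketch, assuming scores are collected per section as lists of 0/1/2 values; the function names and the sample scores are hypothetical. The maximum is derived from the number of items rather than hard-coded, and the bands use the lower bounds from the key above.

```python
def audit_score(section_scores: dict[str, list[int]]) -> tuple[int, int]:
    """Return (total, maximum) for checklist items scored 0, 1, or 2."""
    total = 0
    items = 0
    for name, scores in section_scores.items():
        for score in scores:
            if score not in (0, 1, 2):
                raise ValueError(f"{name}: score must be 0, 1, or 2, got {score}")
        total += sum(scores)
        items += len(scores)
    return total, 2 * items


def band(total: int) -> str:
    """Map a full-audit total to a band (lower bounds 0, 26, 41, 53)."""
    if total >= 53:
        return "Strong"
    if total >= 41:
        return "Good"
    if total >= 26:
        return "Moderate"
    return "Critical"


# Invented scores for three of the sections, to show the tally only.
partial = {
    "Environment Setup": [2, 1, 2, 0, 1],
    "Build Speed": [1, 2, 2],
    "CI Pipeline": [1, 1, 2, 0, 1],
}
print(audit_score(partial))
print(band(45))
```

The bands only make sense for a complete audit, so `band` should be applied to the total across every section, not to a partial tally like the one printed here.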

SPACE Framework Self-Assessment

Use this with your team in a survey or workshop. Score each statement from 1 (strongly disagree) to 5 (strongly agree).

SPACE SELF-ASSESSMENT
=====================

SATISFACTION
------------
1. I feel productive in my day-to-day work                        [ 1 | 2 | 3 | 4 | 5 ]
2. I have the tools and access I need to do my job                [ 1 | 2 | 3 | 4 | 5 ]
3. I am not frequently blocked by process or tooling              [ 1 | 2 | 3 | 4 | 5 ]
4. I rarely feel burned out by operational toil                   [ 1 | 2 | 3 | 4 | 5 ]

PERFORMANCE
-----------
5. My team ships features that customers value                    [ 1 | 2 | 3 | 4 | 5 ]
6. Bugs and incidents are rare and resolved quickly               [ 1 | 2 | 3 | 4 | 5 ]

ACTIVITY
--------
7. I can ship small changes frequently (multiple times/week)      [ 1 | 2 | 3 | 4 | 5 ]
8. Code reviews happen quickly and don't block my work            [ 1 | 2 | 3 | 4 | 5 ]

COMMUNICATION / COLLABORATION
------------------------------
9. I know who to ask when I'm stuck                               [ 1 | 2 | 3 | 4 | 5 ]
10. Technical decisions are made clearly and communicated well    [ 1 | 2 | 3 | 4 | 5 ]
11. I have enough uninterrupted focus time                        [ 1 | 2 | 3 | 4 | 5 ]

EFFICIENCY / FLOW
-----------------
12. My local build and test cycle is fast (< 5 min)               [ 1 | 2 | 3 | 4 | 5 ]
13. CI is reliable and fast (< 15 min, rarely flaky)              [ 1 | 2 | 3 | 4 | 5 ]
14. Deploying to production is low-friction                       [ 1 | 2 | 3 | 4 | 5 ]
15. I spend most of my time on product work, not toil             [ 1 | 2 | 3 | 4 | 5 ]

OPEN QUESTIONS
--------------
16. What is your single biggest source of daily friction?
    [open text]

17. What one tool, process, or change would most improve your productivity?
    [open text]

18. On a scale of 0-10, how likely are you to recommend this engineering
    environment to a friend joining the company?
    [0–10 NPS]

SCORING KEY (sum of items 1-15, range 15-75)
-----------
< 40:  Critical DX issues — engineers are burning significant time on non-product work
40-55: Moderate — several actionable improvements available
56-65: Good — targeted wins will compound over time
66-75: Strong — focus on the long tail and measurement
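
A respondent's score is the sum of the 15 rated statements, which the scoring key then maps to a band. A minimal sketch of that arithmetic, with hypothetical function names and invented answers:

```python
def space_total(answers: list[int]) -> int:
    """Sum the 15 rated statements (each 1-5); open questions are excluded."""
    if len(answers) != 15:
        raise ValueError("expected answers to statements 1-15")
    if any(a not in range(1, 6) for a in answers):
        raise ValueError("each answer must be between 1 and 5")
    return sum(answers)


def space_band(total: int) -> str:
    """Map a total to the scoring key above."""
    if total < 40:
        return "Critical"
    if total <= 55:
        return "Moderate"
    if total <= 65:
        return "Good"
    return "Strong"


# Invented answers, grouped by dimension in the order of the form.
answers = [4, 3, 2, 3,  4, 3,  3, 2,  4, 3, 2,  2, 2, 3, 3]
total = space_total(answers)
print(total, space_band(total))
```

For a team, averaging the per-respondent totals gives a single number, while per-dimension averages show where the friction concentrates.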

DX Improvement Backlog Template

Prioritize by: (Frequency of pain × Severity of impact) / Effort to fix

DX IMPROVEMENT BACKLOG
=======================
Scoring: Frequency (1-5) × Severity (1-5) = Impact score
         Effort: S (< 1 day), M (1 week), L (1 month), XL (> 1 month)
         Priority = Impact / Effort weight (S=1, M=2, L=4, XL=8)

ID  | Problem                          | Freq | Sev | Impact | Effort | Priority | Owner
----|----------------------------------|------|-----|--------|--------|----------|------
DX1 | CI pipeline takes 22 minutes     |  5   |  4  |  20    |   M    |   10.0   | ___
DX2 | Flaky auth test requires reruns  |  5   |  3  |  15    |   S    |   15.0   | ___
DX3 | Local setup takes 3 days         |  3   |  5  |  15    |   M    |    7.5   | ___
DX4 | No ephemeral preview envs        |  4   |  3  |  12    |   L    |    3.0   | ___
DX5 | Manual deploy process (6 steps)  |  3   |  4  |  12    |   M    |    6.0   | ___
... |                                  |      |     |        |        |          |

NEXT SPRINT: Pick the top 2-3 by priority score.

DEFINITION OF DONE per item:
  [ ] Improvement shipped and measurable
  [ ] Benchmark before/after recorded
  [ ] Announced to affected developers
  [ ] Added to DX metrics dashboard
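
The scoring rule in the template is easy to compute and sort on. Here is a minimal sketch that reproduces the priorities for the five example rows; the `BacklogItem` class is a hypothetical structure, not part of the template.

```python
from dataclasses import dataclass

# Effort weights as defined in the template's scoring rule.
EFFORT_WEIGHT = {"S": 1, "M": 2, "L": 4, "XL": 8}


@dataclass
class BacklogItem:
    id: str
    problem: str
    freq: int    # 1-5
    sev: int     # 1-5
    effort: str  # "S", "M", "L", or "XL"

    @property
    def impact(self) -> int:
        return self.freq * self.sev

    @property
    def priority(self) -> float:
        return self.impact / EFFORT_WEIGHT[self.effort]


items = [
    BacklogItem("DX1", "CI pipeline takes 22 minutes", 5, 4, "M"),
    BacklogItem("DX2", "Flaky auth test requires reruns", 5, 3, "S"),
    BacklogItem("DX3", "Local setup takes 3 days", 3, 5, "M"),
    BacklogItem("DX4", "No ephemeral preview envs", 4, 3, "L"),
    BacklogItem("DX5", "Manual deploy process (6 steps)", 3, 4, "M"),
]

ranked = sorted(items, key=lambda item: item.priority, reverse=True)
for item in ranked:
    print(f"{item.id}  impact={item.impact:>2}  priority={item.priority:.1f}")
```

Sorted this way, DX2 and DX1 come out on top, which matches the "pick the top 2-3" guidance.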

DX Metrics Dashboard Spec

Track these monthly. Present to engineering leadership quarterly.

DX METRICS DASHBOARD — [Month] [Year]
======================================

INNER LOOP
----------
Median incremental build time:        ___ sec   (target: < 5s)
Median full build time:               ___ min   (target: < 3 min)
Median unit test suite time (local):  ___ min   (target: < 2 min)
Local setup time (new hire):          ___ hrs   (target: < 4 hrs)

OUTER LOOP
----------
Median CI pipeline time:            ___ min   (target: < 15 min)
CI flakiness rate:                  ___%      (target: < 1%)
Median PR review turnaround:        ___ hrs   (target: < 24 hrs)
Median lead time (commit → prod):   ___ hrs   (target: < 24 hrs)
Deployment frequency (org):         ___/day
Change failure rate:                ___%      (target: < 5%)

ONBOARDING
----------
Time for new hire first deploy:     ___ days  (target: < 5 days)
Onboarding survey score:            ___/5

SATISFACTION (quarterly survey)
--------------------------------
SPACE self-assessment average:      ___/75
NPS score:                          ___       (target: > 40)
Top pain point cited this quarter:  ___

TOIL TRACKING
-------------
% time on non-product work (survey): ___%     (target: < 20%)
On-call incidents last month:         ___
Avg incident resolution time:         ___ min

TRENDS (vs. last quarter)
--------------------------
CI time:                ↑ worse / ↓ better / → stable
Flakiness:              ↑ worse / ↓ better / → stable
Review turnaround:      ↑ worse / ↓ better / → stable
Developer satisfaction: ↑ better / ↓ worse  / → stable
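
The trend arrows can be derived from two quarters of measurements. The sketch below is one way to do it, assuming a 5% relative tolerance for "stable" (an arbitrary choice, not part of the spec) and invented sample numbers.

```python
from statistics import median


def trend(
    previous: float,
    current: float,
    lower_is_better: bool = True,
    tolerance: float = 0.05,
) -> str:
    """Classify a quarter-over-quarter change in the dashboard's notation.

    Changes within +/- tolerance (relative) count as stable; the 5% default
    is an assumption.
    """
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    change = (current - previous) / previous
    if abs(change) <= tolerance:
        return "→ stable"
    went_up = change > 0
    arrow = "↑" if went_up else "↓"
    # Going up is an improvement only when higher values are better.
    improved = went_up != lower_is_better
    return f"{arrow} {'better' if improved else 'worse'}"


# Invented CI durations in minutes for two quarters.
ci_last_quarter = [22, 19, 25, 21]
ci_this_quarter = [14, 16, 15, 13]
print(trend(median(ci_last_quarter), median(ci_this_quarter)))
print(trend(3.4, 3.9, lower_is_better=False))  # satisfaction score out of 5
```

Medians are used rather than means so that a single pathological run does not flip the arrow.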

Developer Satisfaction Survey Template

Run quarterly, and keep it short enough to complete in under 5 minutes.

DEVELOPER EXPERIENCE SURVEY — Q[X] [YEAR]

HOW'S YOUR TOOLING?
-------------------
1. How fast is your local build?
   [ ] < 30 sec  [ ] 30 sec – 2 min  [ ] 2–5 min  [ ] > 5 min

2. How often do you have to re-run CI due to flaky tests?
   [ ] Never  [ ] Rarely (< once/week)  [ ] Sometimes (1-3/week)  [ ] Often (daily)

3. How long does a typical CI run take?
   [ ] < 5 min  [ ] 5–10 min  [ ] 10–20 min  [ ] > 20 min

4. How painful is it to deploy your service?
   [ ] Fully automated, no friction  [ ] Mostly automated, minor manual steps
   [ ] Manual but documented  [ ] Manual and painful / undocumented

5. How easy is it to debug issues in production?
   [ ] Very easy (good logs, traces, dashboards)
   [ ] Mostly easy
   [ ] Difficult (some tools, but incomplete)
   [ ] Very difficult (log grep + tribal knowledge)

HOW'S YOUR WORKFLOW?
--------------------
6. How often are you blocked waiting for another team or process?
   [ ] Never  [ ] Rarely  [ ] Weekly  [ ] Daily

7. How much of your week is spent on non-product work (toil, meetings, context-switching)?
   [ ] < 10%  [ ] 10–20%  [ ] 20–40%  [ ] > 40%

8. Do you have enough uninterrupted focus time?
   [ ] Yes  [ ] Somewhat  [ ] No

OPEN FEEDBACK
-------------
9. What is your #1 source of daily friction?
   [open text]

10. What one improvement would most increase your productivity?
    [open text]

11. How likely are you to recommend this engineering environment to a friend?
    [0–10 NPS]
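
The final question yields a Net Promoter Score, computed as the percentage of promoters (9-10) minus the percentage of detractors (0-6). A minimal sketch with invented responses:

```python
def nps(scores: list[int]) -> float:
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    if not scores:
        raise ValueError("need at least one response")
    if any(not 0 <= s <= 10 for s in scores):
        raise ValueError("scores must be between 0 and 10")
    promoters = sum(s >= 9 for s in scores)
    detractors = sum(s <= 6 for s in scores)
    return 100 * (promoters - detractors) / len(scores)


# Invented responses to question 11.
responses = [10, 9, 9, 8, 7, 7, 6, 4, 10, 9]
print(nps(responses))
```

The result ranges from -100 to 100. With small teams a single response can move it by many points, so the quarter-over-quarter direction is usually more informative than the absolute value.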

Toolchain Audit

Run this when you suspect that DX problems are rooted in the toolchain:

TOOLCHAIN AUDIT CHECKLIST

Version Management
[ ] Language versions pinned (.nvmrc / .python-version / go.mod / .tool-versions)
[ ] All developers using same version (verified, not assumed)
[ ] CI uses same version as local (matched)

Package Management
[ ] Lockfile committed and enforced in CI
[ ] Dependency installation is cached in CI
[ ] No globally installed tools required (everything in devDependencies / pyproject / go.mod)

Local Dev Environment
[ ] Single command to set up (make setup / ./bootstrap.sh)
[ ] Docker Compose or devcontainer for external dependencies
[ ] Environment variable management documented (direnv / .env.example)
[ ] No manual steps required after initial setup

Editor / IDE
[ ] Recommended extensions/plugins documented
[ ] Project-level lint and format config committed (.eslintrc, pyproject.toml, .editorconfig)
[ ] Format-on-save configured or documented

Git Hooks
[ ] Pre-commit hooks configured for fast local checks (lint, type-check, secrets scan)
[ ] Hooks installed automatically on setup (husky / pre-commit)
[ ] Hooks run in < 10 seconds (fast enough not to be bypassed)
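
The "verified, not assumed" item under Version Management can be checked mechanically. The sketch below compares a pinned version string with an installed one; it assumes purely numeric pins such as `20` or `20.11.1` (aliases like `lts/*` are not handled), and the function names are hypothetical.

```python
def normalize(version: str) -> tuple[int, ...]:
    """Turn 'v20.11.1' or '20.11' into a tuple of ints for comparison."""
    cleaned = version.strip().lstrip("vV")
    return tuple(int(part) for part in cleaned.split("."))


def matches_pin(pinned: str, installed: str) -> bool:
    """True if the installed version agrees with every component of the pin.

    A pin of '20' accepts any 20.x.y; a pin of '20.11.1' must match exactly.
    """
    pin = normalize(pinned)
    actual = normalize(installed)
    return actual[: len(pin)] == pin


print(matches_pin("20.11.1", "v20.11.1"))
print(matches_pin("20", "v20.3.0"))
print(matches_pin("3.12", "3.11.9"))
```

In a setup script, `pinned` would typically come from reading the pin file and `installed` from the tool's own version output; running the same check in CI covers the "CI uses same version as local" item.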

Onboarding Friction as a DX Signal

First-day experience is the highest-signal audit you can run. New hires encounter every broken assumption because they have no tribal knowledge to work around.

Onboarding friction audit — run with every new hire:

Day 1 timer:
  Time to get development environment running:    ___ hours
  Number of "ask a human" moments:                ___
  Blockers encountered (list):                    ___

Day 5 checkpoint:
  Has the new hire shipped at least one PR to production?  Y / N
  What took longest?                                       ___
  What was most confusing?                                 ___
  What documentation was missing or wrong?                 ___

Treat every "ask a human" moment as a documentation or automation bug. Fix it before the next hire.
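
Collecting the Day 1 blocker lists across several hires shows which problems recur and therefore deserve a fix first. A minimal sketch, with invented blocker names and a hypothetical `recurring_blockers` helper:

```python
from collections import Counter


def recurring_blockers(
    hires: list[list[str]], min_hires: int = 2
) -> list[tuple[str, int]]:
    """Return blockers hit by at least `min_hires` new hires, most common first."""
    # Count each blocker once per hire, even if it was logged several times.
    counts = Counter(blocker for blockers in hires for blocker in set(blockers))
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(blocker, n) for blocker, n in ordered if n >= min_hires]


# Invented Day 1 blocker lists from three onboarding audits.
audits = [
    ["VPN access", "missing .env"],
    ["missing .env", "Docker memory limit"],
    ["missing .env", "VPN access"],
]
print(recurring_blockers(audits))
```

Blockers seen only once still count as bugs under the rule above; the ranking only suggests an order in which to fix them.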

Output Format

When applying this skill, the agent should:

Start with the inner loop: Ask how long local builds and tests take before discussing CI or deployment. Inner loop has the highest per-developer ROI.
Run the DX audit: Use the checklist to score current state before recommending improvements.
Prioritize ruthlessly: Use the frequency × severity / effort model. Don't try to fix everything at once.
Produce artifacts on request: DX audit checklist, SPACE self-assessment, backlog template, metrics dashboard, developer survey.
Connect to platform engineering when inner/outer loop friction is infrastructure-rooted — recommend the platform-engineering skill.
Cite SPACE dimensions when discussing productivity metrics to avoid the "lines of code" trap.
Format checklists as Markdown task lists - [ ]. Format benchmarks as tables. Format templates as code blocks.

References

Nicole Forsgren, Margaret-Anne Storey, Chandra Maddila, Thomas Zimmermann, Brian Houck, Jenna Butler — "The SPACE of Developer Productivity" (ACM Queue, 2021)
Forsgren, Humble, Kim — Accelerate: The Science of Lean Software and DevOps (IT Revolution, 2018)
DORA — State of DevOps Report (annual)
LeadDev — "Measuring developer experience"
Gartner — "Innovation Insight: Developer Experience Platforms" (2023)
Nate Swanson — "Developer Experience at Spotify"
DX Core 4 — getdx.com (speed, effectiveness, quality, impact)

name	developer-experience
description	Audit, measure, and improve developer experience (DX) using the SPACE framework and DORA metrics. Covers inner loop vs. outer loop friction, DX anti-patterns (slow CI, flaky tests, toolchain toil), DX improvement backlog prioritization, developer surveys, build/test benchmarks, and onboarding friction analysis.
triggers	["developer experience","DX","inner loop","slow CI","developer productivity","improve developer tooling","toil","friction in our workflow","developer satisfaction","flaky tests","build times","local dev setup","onboarding friction","SPACE framework"]
audience	engineer, manager

Developer Experience (DX)

Context

DX encompasses two loops:

Inner loop: The tight feedback cycle a developer runs locally — edit → build → test → run. Optimizing this has the highest per-developer ROI.
Outer loop: The collaborative cycle from code commit to production — PR → CI → review → merge → deploy → observe. Optimizing this improves team velocity and release confidence.

Use this skill when:

Engineers report spending significant time on non-product work
CI times have grown beyond 10–15 minutes
Onboarding new engineers to productivity takes more than a week
Developer satisfaction survey scores are declining
You want to run a DX audit before starting a platform engineering initiative

Core Frameworks

The SPACE Framework (Forsgren et al., 2021)

SPACE provides a multidimensional view of developer productivity. No single metric captures it all — use at least one metric per dimension.

Dimension	What It Measures	Example Metrics
Satisfaction	Developer wellbeing and job satisfaction	Survey NPS, burnout indicators, retention rate
Performance	Outcomes of developer work	Defect rate, reliability, customer satisfaction
Activity	Volume of actions and output	PRs merged, deploys per day, code review turnaround
Communication	Effectiveness of collaboration	PR review time, blocked time, meeting load
Efficiency	Low friction, flow state, minimal interruption	Build time, deploy time, time in flow, context switches

Anti-pattern: Using only Activity metrics (commits, PRs, lines of code) as a proxy for productivity. This incentivizes the wrong behaviors and ignores quality and wellbeing.

DORA Metrics (DevOps Research and Assessment)

DORA metrics measure the effectiveness of software delivery:

Metric	Elite	High	Medium	Low
Deployment frequency	Multiple/day	1/day–1/week	1/week–1/month	< 1/month
Lead time for changes	< 1 hour	1 day–1 week	1 week–1 month	> 6 months
Change failure rate	< 5%	5–10%	10–15%	> 15%
Time to restore service	< 1 hour	< 1 day	1 day–1 week	> 1 week

Inner Loop vs. Outer Loop

Inner Loop (Local Developer Workflow)

The inner loop is everything a developer does before pushing code. It should be sub-second to sub-minute.

Edit → Build → Test → Run (repeat)

Target benchmarks:

Step	Acceptable	Good	Elite
Hot reload / incremental build	< 5s	< 2s	< 500ms
Unit test suite (local)	< 2 min	< 30s	< 10s
Local service start	< 60s	< 15s	< 5s
Lint + type check	< 60s	< 15s	< 5s

Common inner loop pain points:

Cold Docker builds that take 5+ minutes
No incremental compilation (full rebuild on every change)
Test suite requires external services (DB, S3) that aren't mocked locally
No local dev environment parity with staging (missing env vars, secrets)
"Works on my machine" issues from inconsistent toolchain versions

Outer Loop (Team Workflow)

The outer loop is everything from push to production. Slow outer loops delay feedback and accumulate risk.

git push → CI (lint/test/build) → PR review → merge → deploy → observe

Target benchmarks:

Step	Acceptable	Good	Elite
CI pipeline total time	< 15 min	< 10 min	< 5 min
PR review turnaround	< 24 hours	< 4 hours	< 1 hour
Merge to production deploy	< 30 min	< 15 min	< 5 min
Deploy rollback time	< 15 min	< 5 min	< 2 min

Common outer loop pain points:

Flaky tests that require manual re-runs (trust erosion)
Sequential CI steps that could be parallelized
Manual approval gates for every deploy (not just production)
No deploy preview / ephemeral environments for review
Large PRs with long review cycles (→ batch smaller)

DX Anti-Patterns

1. Slow CI

2. Flaky Tests

3. Poor Local Dev Setup

4. Toil-Heavy Deploys

5. Broken Toolchain

6. Notification Overload

7. Context Switching from Interruptions

DX Audit Checklist

Run this audit when starting a DX improvement program. Score each item 0 (broken), 1 (partial), 2 (good).

Inner Loop Audit

INNER LOOP — Edit → Build → Test → Run

Environment Setup
[ ] New developer can be productive in < 1 day                    [ 0 | 1 | 2 ]
[ ] Setup is automated (single command or script)                 [ 0 | 1 | 2 ]
[ ] Toolchain versions are pinned and enforced                    [ 0 | 1 | 2 ]
[ ] Local environment matches staging (parity)                    [ 0 | 1 | 2 ]
[ ] Secrets are accessible without manual steps                   [ 0 | 1 | 2 ]

Build Speed
[ ] Incremental build < 5 seconds for typical change              [ 0 | 1 | 2 ]
[ ] Full clean build < 3 minutes                                  [ 0 | 1 | 2 ]
[ ] Hot reload / watch mode available                             [ 0 | 1 | 2 ]

Test Speed
[ ] Unit test suite runs in < 2 minutes locally                   [ 0 | 1 | 2 ]
[ ] Tests do not require network or external services by default  [ 0 | 1 | 2 ]
[ ] Flaky test rate < 1% (tracked as a metric)                    [ 0 | 1 | 2 ]
[ ] Tests can be run in parallel locally                          [ 0 | 1 | 2 ]

Local Run
[ ] Service starts in < 60 seconds                                [ 0 | 1 | 2 ]
[ ] Local service URLs and ports are documented                   [ 0 | 1 | 2 ]
[ ] Docker Compose or equivalent for local service dependencies   [ 0 | 1 | 2 ]

INNER LOOP SCORE: ___/30

Outer Loop Audit

OUTER LOOP — PR → CI → Review → Merge → Deploy → Observe

CI Pipeline
[ ] Total CI time < 15 minutes                                    [ 0 | 1 | 2 ]
[ ] CI pipeline is parallelized                                   [ 0 | 1 | 2 ]
[ ] Dependency caching is configured                              [ 0 | 1 | 2 ]
[ ] Test flakiness rate < 1% in CI                                [ 0 | 1 | 2 ]
[ ] CI failure provides clear, actionable error messages          [ 0 | 1 | 2 ]

Code Review
[ ] Average PR review turnaround < 24 hours                       [ 0 | 1 | 2 ]
[ ] CODEOWNERS configured for automatic review assignment         [ 0 | 1 | 2 ]
[ ] PR description template exists and is used                    [ 0 | 1 | 2 ]
[ ] Average PR size < 400 lines changed                           [ 0 | 1 | 2 ]

Deployment
[ ] Deploy to staging is automatic on merge                       [ 0 | 1 | 2 ]
[ ] Production deploy is a single action (not multi-step manual)  [ 0 | 1 | 2 ]
[ ] Rollback takes < 5 minutes                                    [ 0 | 1 | 2 ]
[ ] Deploy preview environments exist for review                  [ 0 | 1 | 2 ]
[ ] Feature flags available for risky changes                     [ 0 | 1 | 2 ]

Observability
[ ] Logs are structured and searchable                            [ 0 | 1 | 2 ]
[ ] Basic metrics and dashboards exist per service                [ 0 | 1 | 2 ]
[ ] Alerts are actionable (low false positive rate)               [ 0 | 1 | 2 ]
[ ] On-call runbooks exist and are up to date                     [ 0 | 1 | 2 ]

OUTER LOOP SCORE: ___/34

TOTAL DX SCORE: ___/64
  0-25:  Critical — significant toil and friction
  26-40: Moderate — several high-impact improvements needed
  41-52: Good — targeted improvements will have high ROI
  53-64: Strong — focus on measurement and continuous improvement

SPACE Framework Self-Assessment

Use this with your team (survey or workshop). Score 1–5.

SPACE SELF-ASSESSMENT
=====================

SATISFACTION
------------
1. I feel productive in my day-to-day work                        [ 1 | 2 | 3 | 4 | 5 ]
2. I have the tools and access I need to do my job                [ 1 | 2 | 3 | 4 | 5 ]
3. I am not frequently blocked by process or tooling              [ 1 | 2 | 3 | 4 | 5 ]
4. I rarely feel burned out by operational toil                   [ 1 | 2 | 3 | 4 | 5 ]

PERFORMANCE
-----------
5. My team ships features that customers value                    [ 1 | 2 | 3 | 4 | 5 ]
6. Bugs and incidents are rare and resolved quickly               [ 1 | 2 | 3 | 4 | 5 ]

ACTIVITY
--------
7. I can ship small changes frequently (multiple times/week)      [ 1 | 2 | 3 | 4 | 5 ]
8. Code reviews happen quickly and don't block my work            [ 1 | 2 | 3 | 4 | 5 ]

COMMUNICATION / COLLABORATION
------------------------------
9. I know who to ask when I'm stuck                               [ 1 | 2 | 3 | 4 | 5 ]
10. Technical decisions are made clearly and communicated well    [ 1 | 2 | 3 | 4 | 5 ]
11. I have enough uninterrupted focus time                        [ 1 | 2 | 3 | 4 | 5 ]

EFFICIENCY / FLOW
-----------------
12. My local build and test cycle is fast (< 5 min)               [ 1 | 2 | 3 | 4 | 5 ]
13. CI is reliable and fast (< 15 min, rarely flaky)              [ 1 | 2 | 3 | 4 | 5 ]
14. Deploying to production is low-friction                       [ 1 | 2 | 3 | 4 | 5 ]
15. I spend most of my time on product work, not toil             [ 1 | 2 | 3 | 4 | 5 ]

OPEN QUESTIONS
--------------
16. What is your single biggest source of daily friction?
    [open text]

17. What one tool, process, or change would most improve your productivity?
    [open text]

18. On a scale of 0-10, how likely are you to recommend this engineering
    environment to a friend joining the company?
    [0–10 NPS]

SCORING KEY
-----------
< 40:  Critical DX issues — engineers are burning significant time on non-product work
40-55: Moderate — several actionable improvements available
56-65: Good — targeted wins will compound over time
66-75: Strong — focus on the long tail and measurement

DX Improvement Backlog Template

Prioritize by: (Frequency of pain × Severity of impact) / Effort to fix

DX IMPROVEMENT BACKLOG
=======================
Scoring: Frequency (1-5) × Severity (1-5) = Impact score
         Effort: S (< 1 day), M (1 week), L (1 month), XL (> 1 month)
         Priority = Impact / Effort weight (S=1, M=2, L=4, XL=8)

ID  | Problem                          | Freq | Sev | Impact | Effort | Priority | Owner
----|----------------------------------|------|-----|--------|--------|----------|------
DX1 | CI pipeline takes 22 minutes     |  5   |  4  |  20    |   M    |   10.0   | ___
DX2 | Flaky auth test requires reruns  |  5   |  3  |  15    |   S    |   15.0   | ___
DX3 | Local setup takes 3 days         |  3   |  5  |  15    |   M    |    7.5   | ___
DX4 | No ephemeral preview envs        |  4   |  3  |  12    |   L    |    3.0   | ___
DX5 | Manual deploy process (6 steps)  |  3   |  4  |  12    |   M    |    6.0   | ___
... |                                  |      |     |        |        |          |

NEXT SPRINT: Pick the top 2-3 by priority score.

DEFINITION OF DONE per item:
  [ ] Improvement shipped and measurable
  [ ] Benchmark before/after recorded
  [ ] Announced to affected developers
  [ ] Added to DX metrics dashboard
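
The priority arithmetic in the template is simple enough to script. This is a minimal sketch that reproduces the Impact and Priority columns from the example rows. The `BacklogItem` class and `rank` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Effort weights as given in the template's scoring line.
EFFORT_WEIGHT = {"S": 1, "M": 2, "L": 4, "XL": 8}


@dataclass
class BacklogItem:
    id: str
    problem: str
    freq: int    # 1-5
    sev: int     # 1-5
    effort: str  # "S", "M", "L", or "XL"

    @property
    def impact(self) -> int:
        return self.freq * self.sev

    @property
    def priority(self) -> float:
        return self.impact / EFFORT_WEIGHT[self.effort]


def rank(items):
    """Return items sorted by priority, highest first."""
    return sorted(items, key=lambda item: item.priority, reverse=True)


backlog = [
    BacklogItem("DX1", "CI pipeline takes 22 minutes", 5, 4, "M"),
    BacklogItem("DX2", "Flaky auth test requires reruns", 5, 3, "S"),
    BacklogItem("DX3", "Local setup takes 3 days", 3, 5, "M"),
    BacklogItem("DX4", "No ephemeral preview envs", 4, 3, "L"),
    BacklogItem("DX5", "Manual deploy process (6 steps)", 3, 4, "M"),
]

if __name__ == "__main__":
    # Next sprint: the top three by priority score.
    for item in rank(backlog)[:3]:
        print(f"{item.id}  {item.priority:.1f}  {item.problem}")
```

With the example rows, the top three come out as DX2, DX1, DX3, matching the Priority column in the table.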

DX Metrics Dashboard Spec

Track these monthly. Present to engineering leadership quarterly.

DX METRICS DASHBOARD — [Month] [Year]
======================================

INNER LOOP
----------
Median incremental build time:       ___ sec   (target: < 5 sec)
Median full build time:              ___ min   (target: < 3 min)
Median unit test suite time (local): ___ min   (target: < 2 min)
Local setup time (new hire):         ___ hrs   (target: < 4 hrs)

OUTER LOOP
----------
Median CI pipeline time:            ___ min   (target: < 15 min)
CI flakiness rate:                  ___%      (target: < 1%)
Median PR review turnaround:        ___ hrs   (target: < 24 hrs)
Median lead time (commit → prod):   ___ hrs   (target: < 24 hrs)
Deployment frequency (org):         ___/day
Change failure rate:                ___%      (target: < 5%)

ONBOARDING
----------
Time to new hire's first deploy:    ___ days  (target: < 5 days)
Onboarding survey score:            ___/5

SATISFACTION (quarterly survey)
--------------------------------
SPACE self-assessment average:      ___/75
NPS score:                          ___       (target: > 40)
Top pain point cited this quarter:  ___

TOIL TRACKING
-------------
% time on non-product work (survey): ___%     (target: < 20%)
On-call incidents last month:         ___
Avg incident resolution time:         ___ min

TRENDS (vs. last quarter)
--------------------------
CI time:                ↑ worse / ↓ better / → stable
Flakiness:              ↑ worse / ↓ better / → stable
Review turnaround:      ↑ worse / ↓ better / → stable
Developer satisfaction: ↑ better / ↓ worse  / → stable
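
Two of the outer-loop numbers can be derived from exported CI run records. The sketch below is one possible approach under stated assumptions: each record is a hypothetical `(commit_sha, duration_min, passed)` tuple, and a commit counts as flaky when the same SHA has both a failed and a passed run. The dashboard spec does not define flakiness, so adjust the rule to your team's definition.

```python
from collections import defaultdict
from statistics import median


def ci_summary(runs):
    """Summarize CI runs given as a list of (commit_sha, duration_min, passed).

    Assumption: a commit is "flaky" if it has at least one failed and one
    passed run, since nothing changed between the attempts.
    """
    if not runs:
        raise ValueError("need at least one CI run")

    durations = [duration for _, duration, _ in runs]

    outcomes = defaultdict(set)
    for sha, _, passed in runs:
        outcomes[sha].add(passed)
    flaky_commits = sum(1 for seen in outcomes.values() if seen == {True, False})

    return {
        "median_ci_min": median(durations),
        "flakiness_pct": 100 * flaky_commits / len(outcomes),
    }


if __name__ == "__main__":
    runs = [
        ("a1", 14.0, True),
        ("b2", 16.0, False),  # failed first...
        ("b2", 15.0, True),   # ...then passed on rerun: counted as flaky
        ("c3", 22.0, True),
        ("d4", 13.0, False),  # failed and never rerun: not counted as flaky
    ]
    print(ci_summary(runs))
```

Medians are used rather than means, as in the dashboard, so a single pathological run does not distort the trend.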

Developer Satisfaction Survey Template

Run quarterly. Keep it short enough to complete in under 5 minutes.

DEVELOPER EXPERIENCE SURVEY — Q[X] [YEAR]

HOW'S YOUR TOOLING?
-------------------
1. How fast is your local build?
   [ ] < 30 sec  [ ] 30 sec – 2 min  [ ] 2–5 min  [ ] > 5 min

2. How often do you have to re-run CI due to flaky tests?
   [ ] Never  [ ] Rarely (< once/week)  [ ] Sometimes (1-3/week)  [ ] Often (daily)

3. How long does a typical CI run take?
   [ ] < 5 min  [ ] 5–10 min  [ ] 10–20 min  [ ] > 20 min

4. How painful is it to deploy your service?
   [ ] Fully automated, no friction  [ ] Mostly automated, minor manual steps
   [ ] Manual but documented  [ ] Manual and painful / undocumented

5. How easy is it to debug issues in production?
   [ ] Very easy (good logs, traces, dashboards)
   [ ] Mostly easy
   [ ] Difficult (some tools, but incomplete)
   [ ] Very difficult (log grep + tribal knowledge)

HOW'S YOUR WORKFLOW?
--------------------
6. How often are you blocked waiting for another team or process?
   [ ] Never  [ ] Rarely  [ ] Weekly  [ ] Daily

7. How much of your week is spent on non-product work (toil, meetings, context-switching)?
   [ ] < 10%  [ ] 10–20%  [ ] 20–40%  [ ] > 40%

8. Do you have enough uninterrupted focus time?
   [ ] Yes  [ ] Somewhat  [ ] No

OPEN FEEDBACK
-------------
9. What is your #1 source of daily friction?
   [open text]

10. What one improvement would most increase your productivity?
    [open text]

11. How likely are you to recommend this engineering environment to a friend?
    [0–10 NPS]
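
The 0-10 answers to the last question are turned into a single NPS figure using the standard Net Promoter convention: the percentage of promoters (9-10) minus the percentage of detractors (0-6), with 7-8 counted as passive. A minimal sketch follows; the function name `nps` is illustrative.

```python
def nps(scores):
    """Compute a Net Promoter Score from a list of 0-10 integer answers."""
    if not scores:
        raise ValueError("need at least one response")
    if any(s < 0 or s > 10 for s in scores):
        raise ValueError("scores must be between 0 and 10")

    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    # Passives (7-8) count toward the denominator only.
    return round(100 * (promoters - detractors) / len(scores))


if __name__ == "__main__":
    # 4 promoters, 2 passives, 2 detractors out of 8 responses.
    print(nps([10, 9, 9, 8, 7, 6, 3, 10]))  # prints 25
```

The result ranges from -100 to 100. With small teams a single response moves the score by several points, so read quarter-over-quarter changes with that in mind.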

Toolchain Audit

Run this when you suspect DX problems are rooted in the toolchain:

TOOLCHAIN AUDIT CHECKLIST

Version Management
[ ] Language versions pinned (.nvmrc / .python-version / go.mod / .tool-versions)
[ ] All developers using same version (verified, not assumed)
[ ] CI uses same version as local (matched)

Package Management
[ ] Lockfile committed and enforced in CI
[ ] Dependency installation is cached in CI
[ ] No globally installed tools required (everything in devDependencies / pyproject / go.mod)

Local Dev Environment
[ ] Single command to set up (make setup / ./bootstrap.sh)
[ ] Docker Compose or devcontainer for external dependencies
[ ] Environment variable management documented (direnv / .env.example)
[ ] No manual steps required after initial setup

Editor / IDE
[ ] Recommended extensions/plugins documented
[ ] Project-level lint and format config committed (.eslintrc, pyproject.toml, .editorconfig)
[ ] Format-on-save configured or documented

Git Hooks
[ ] Pre-commit hooks configured for fast local checks (lint, type-check, secrets scan)
[ ] Hooks installed automatically on setup (husky / pre-commit)
[ ] Hooks run in < 10 seconds (fast enough not to be bypassed)
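
Several checklist items leave a file in the repository, so a first pass can be automated. The sketch below only checks for the presence of files. The version-pin names come from the checklist; the lockfile, env, and hook file names are assumptions about common tooling and should be edited to match your stack. A passing check is a hint, not proof: a `Makefile` existing does not mean `make setup` works.

```python
from pathlib import Path

# Check name -> files or directories where any one of them satisfies the check.
CHECKS = {
    "version pinned": [".nvmrc", ".python-version", "go.mod", ".tool-versions"],
    # Assumed lockfile names for npm, Yarn, pnpm, Poetry, and Go modules.
    "lockfile committed": ["package-lock.json", "yarn.lock", "pnpm-lock.yaml",
                           "poetry.lock", "go.sum"],
    "setup entry point": ["Makefile", "bootstrap.sh"],
    # Assumed: .envrc is the direnv file, .env.example a documented template.
    "env vars documented": [".env.example", ".envrc"],
    "editor config committed": [".editorconfig"],
    # Assumed: pre-commit's config file or husky's hooks directory.
    "git hooks configured": [".pre-commit-config.yaml", ".husky"],
}


def audit(repo_root):
    """Return {check name: bool} based on which files exist under repo_root."""
    root = Path(repo_root)
    return {
        name: any((root / candidate).exists() for candidate in candidates)
        for name, candidates in CHECKS.items()
    }


if __name__ == "__main__":
    for name, ok in audit(".").items():
        print(f"[{'x' if ok else ' '}] {name}")
```

Items that depend on behavior rather than files (same version on every machine, hooks finishing in under 10 seconds) still need a manual or timed check.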

Onboarding Friction as a DX Signal

First-day experience is the highest-signal audit you can run. New hires encounter every broken assumption because they have no tribal knowledge to work around.

Onboarding friction audit — run with every new hire:

Day 1 timer:
  Time to get development environment running:    ___ hours
  Number of "ask a human" moments:                ___
  Blockers encountered (list):                    ___

Day 5 checkpoint:
  Has the new hire shipped at least one PR to production?  Y / N
  What took longest?                                       ___
  What was most confusing?                                 ___
  What documentation was missing or wrong?                 ___

Treat every "ask a human" moment as a documentation or automation bug. Fix it before the next hire.

Output Format

When applying this skill, the agent should:

Start with the inner loop: Ask how long local builds and tests take before discussing CI or deployment. The inner loop has the highest per-developer ROI.
Run the DX audit: Use the checklist to score current state before recommending improvements.
Prioritize ruthlessly: Use the (frequency × severity) / effort model. Don't try to fix everything at once.
Produce artifacts on request: DX audit checklist, SPACE self-assessment, backlog template, metrics dashboard, developer survey.
Connect to platform engineering: When inner/outer loop friction is infrastructure-rooted, recommend the platform-engineering skill.
Cite SPACE dimensions when discussing productivity metrics to avoid the "lines of code" trap.
Format checklists as Markdown task lists (- [ ]). Format benchmarks as tables. Format templates as code blocks.

References

Nicole Forsgren, Margaret-Anne Storey, Chandra Maddila, Thomas Zimmermann, Brian Houck, Jenna Butler — "The SPACE of Developer Productivity" (ACM Queue, 2021)
Forsgren, Humble, Kim — Accelerate: The Science of Lean Software and DevOps (IT Revolution, 2018)
DORA — State of DevOps Report (annual)
LeadDev — "Measuring developer experience"
Gartner — "Innovation Insight: Developer Experience Platforms" (2023)
Nate Swanson — "Developer Experience at Spotify"
DX Core 4 — getdx.com (speed, effectiveness, quality, impact)