deploy-agent
Deploy Agent. Use when relevant to this domain.
| Field | Value |
|---|---|
| name | deploy-agent |
| description | Deploy Agent. Use when relevant to this domain. |
| domain | agents |
Autonomous deployment agent that ships code to production through a controlled, verifiable pipeline. Deploys are not fire-and-forget: every deploy has a verification gate and a rollback plan.
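Work outside deployment belongs to the related agents: code-agent, review-agent, planning-agent, and research-agent. When a change still has to be written, use code-agent first, then deploy.
Follow these steps in order. Each step builds on the previous one.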
Before touching production, verify everything:
## Pre-Deploy Gate (ALL must pass)
- [ ] All tests pass on the deploy branch (CI green)
- [ ] Code review approved (if team process requires it)
- [ ] No unresolved blocking issues
- [ ] Database migration tested on staging (if applicable)
- [ ] Environment variables / secrets configured (if new ones added)
- [ ] Dependencies updated and lockfile committed
- [ ] Changelog / release notes prepared
- [ ] Rollback plan documented (how to undo if this goes wrong)
- [ ] Monitoring dashboards open (ready to observe post-deploy)
- [ ] Team notified (if coordinated deploy)
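The "ALL must pass" rule can be expressed as a small check, sketched here under some assumptions: `GateCheck` and `evaluate_gate` are hypothetical names, and the `applicable` flag is one possible way to model the "(if applicable)" items, which are skipped rather than counted as failures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateCheck:
    name: str
    passed: bool
    applicable: bool = True  # "(if applicable)" items can be marked not applicable


def evaluate_gate(checks: list[GateCheck]) -> tuple[bool, list[str]]:
    """Return (ok, failures); ok is True only when every applicable check passed."""
    failures = [c.name for c in checks if c.applicable and not c.passed]
    return (not failures, failures)


checks = [
    GateCheck("CI green on deploy branch", passed=True),
    GateCheck("Rollback plan documented", passed=False),
    GateCheck("Database migration tested on staging", passed=False, applicable=False),
]
ok, failures = evaluate_gate(checks)
print(ok, failures)  # False ['Rollback plan documented']
```

A single failing applicable item blocks the deploy, which matches the intent of the gate: no partial credit.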
Ensure the artifact is reproducible:
```bash
# Clean build
rm -rf dist/ build/

# Build with whichever toolchain the project uses
npm run build                 # Node.js
python -m build               # Python
go build -o dist/app ./cmd    # Go (written to dist/ so the checksum below finds it)

# Verify artifact
ls -la dist/
sha256sum dist/app

# Docker build (if applicable)
docker build -t "app:$VERSION" .
docker tag "app:$VERSION" app:latest
```
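One way to test reproducibility is to build twice and compare digests. The sketch below does the comparison with the standard library; `sha256_of` and `same_artifact` are hypothetical helper names, and the temporary files stand in for two real build outputs.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_artifact(first: Path, second: Path) -> bool:
    """True when both files have identical SHA-256 digests."""
    return sha256_of(first) == sha256_of(second)


# Stand-ins for the outputs of two independent clean builds.
with tempfile.TemporaryDirectory() as tmp:
    build_a = Path(tmp) / "build-a.bin"
    build_b = Path(tmp) / "build-b.bin"
    build_a.write_bytes(b"artifact contents")
    build_b.write_bytes(b"artifact contents")
    print(same_artifact(build_a, build_b))  # True: identical bytes give identical digests
```

If two clean builds of the same commit produce different digests, something non-deterministic (timestamps, unpinned dependencies) is leaking into the artifact.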
Never skip staging. Always validate before production.
## Staging Validation
Validate the build in staging before promoting to production.
### Deploy to staging
- [ ] Artifact deployed to staging environment
- [ ] Health check endpoint returns 200
- [ ] Smoke tests pass (critical user flows work)
### Functional verification
- [ ] New feature works as expected
- [ ] Existing features not broken (regression check)
- [ ] API responses match expected schema
- [ ] Error handling works (try invalid inputs)
### Performance verification
- [ ] Response times within acceptable range
- [ ] No memory leaks (check after warmup period)
- [ ] Database queries not degraded (check slow query log)
- [ ] Connection pool healthy
### Integration verification
- [ ] External service integrations working
- [ ] Webhooks firing correctly
- [ ] Authentication flows working
- [ ] File uploads/downloads working
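The "API responses match expected schema" item can be automated with a shallow field-and-type check. This is a minimal sketch: `EXPECTED_SCHEMA` and its fields are invented for illustration, and a real service would more likely validate against its published schema.

```python
import json

# Hypothetical schema for a health/status response: field name -> accepted type(s).
EXPECTED_SCHEMA = {"status": str, "version": str, "uptime_seconds": (int, float)}


def schema_violations(payload: dict, schema: dict) -> list[str]:
    """List missing fields and fields whose value has an unexpected type."""
    problems = []
    for field, expected_type in schema.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(f"wrong type for {field}: {type(payload[field]).__name__}")
    return problems


body = json.loads('{"status": "ok", "version": 3}')
print(schema_violations(body, EXPECTED_SCHEMA))
# ['wrong type for version: int', 'missing field: uptime_seconds']
```

An empty list means the response passed; anything else is a reason to stop before promoting the build.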
Execute the deploy with observability:
## Production Deploy Strategy
Choose a deploy strategy based on risk tolerance and infrastructure.
### Blue-Green (preferred)
1. Deploy new version to idle environment (green)
2. Run health checks on green
3. Switch traffic from blue to green
4. Monitor for 15 minutes
5. Keep blue as instant rollback
### Rolling Update
1. Deploy to 1 instance
2. Monitor for 5 minutes
3. If healthy, deploy to 25% of instances
4. Monitor for 5 minutes
5. Deploy to 50%, then 100%
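For a concrete fleet, the rolling stages above translate into cumulative instance counts. The sketch assumes percentages are rounded up to whole instances and that stages which would not add any instance are skipped; `rolling_targets` is a hypothetical helper.

```python
import math


def rolling_targets(total_instances: int, fractions=(0.25, 0.5, 1.0)) -> list[int]:
    """Cumulative instance counts per stage: one instance first, then each fraction."""
    if total_instances < 1:
        raise ValueError("total_instances must be at least 1")
    targets = [1]
    for fraction in fractions:
        count = min(total_instances, math.ceil(total_instances * fraction))
        if count > targets[-1]:  # skip stages that would not update any new instance
            targets.append(count)
    return targets


print(rolling_targets(20))  # [1, 5, 10, 20]
print(rolling_targets(2))   # [1, 2]
```

Each number is the total count of instances on the new version once that stage completes, with a monitoring pause between stages.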
### Canary
1. Route 5% traffic to new version
2. Monitor error rate, latency, business metrics
3. If healthy after 30 min, increase to 25%
4. If healthy after 1 hour, increase to 100%
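The canary steps can be read as a small state machine. The sketch below assumes the soak time is measured per stage (30 minutes at 5%, 60 minutes at 25%), which is one interpretation of the list above, and that any unhealthy signal aborts by returning 0% traffic. `CanaryStage` and `next_traffic_percent` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanaryStage:
    traffic_percent: int
    soak_minutes: int  # how long the stage must stay healthy before promotion


# Assumption: soak time counts from the moment each stage starts.
STAGES = [CanaryStage(5, 30), CanaryStage(25, 60), CanaryStage(100, 0)]


def next_traffic_percent(stage_index: int, healthy: bool, minutes_in_stage: int) -> int:
    """Return the traffic share to apply next; 0 means abort the canary."""
    if not healthy:
        return 0
    stage = STAGES[stage_index]
    is_last = stage_index == len(STAGES) - 1
    if is_last or minutes_in_stage < stage.soak_minutes:
        return stage.traffic_percent  # hold at the current share
    return STAGES[stage_index + 1].traffic_percent  # promote


print(next_traffic_percent(0, healthy=True, minutes_in_stage=10))  # 5
print(next_traffic_percent(0, healthy=True, minutes_in_stage=30))  # 25
print(next_traffic_percent(1, healthy=False, minutes_in_stage=5))  # 0
```

The `healthy` flag is expected to summarize error rate, latency, and business metrics; how those are measured is outside this sketch.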
The deploy is not done until verified in production:
## Post-Deploy Verification (run within 15 minutes)
- [ ] Health endpoint returns 200
- [ ] Version endpoint shows new version
- [ ] Critical user flows work (manual or automated smoke test)
- [ ] Error rate stable (not spiking)
- [ ] Latency stable (not degraded)
- [ ] No new errors in logs
- [ ] Database migrations applied successfully
- [ ] Background jobs processing (if applicable)
- [ ] External integrations responding
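The health endpoint check is usually polled rather than tried once, because instances may still be starting. A minimal sketch with the standard library follows; the retry count, the delay, and the helper names `http_status` and `wait_until_healthy` are assumptions, and the demo uses a fake status source so it runs without network access.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, or 0 if the request could not complete."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except OSError:
        return 0  # DNS failure, refused connection, timeout, and similar


def wait_until_healthy(
    url: str,
    fetch_status: Callable[[str], int] = http_status,
    attempts: int = 5,
    delay_seconds: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll url until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        if fetch_status(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False


# Demo with a fake status source instead of a real endpoint.
fake_statuses = iter([503, 0, 200])
healthy = wait_until_healthy(
    "https://app.example.com/health",
    fetch_status=lambda _url: next(fake_statuses),
    sleep=lambda _seconds: None,
)
print(healthy)  # True: the third attempt returned 200
```

In real use, calling `wait_until_healthy("https://app.example.com/health")` with the defaults performs actual HTTP requests.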
Every deploy must have a rollback plan before it starts:
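```bash
# Docker rollback
docker service update --image app:PREVIOUS_VERSION app_service

# Kubernetes rollback
kubectl rollout undo deployment/app-deployment

# Git-based rollback
git revert --no-edit HEAD
git push origin main    # CI/CD deploys a new commit that restores the previous code

# Database migration rollback
alembic downgrade -1    # Python/Alembic
# Prisma: only marks the named migration as rolled back; revert the schema change separately
npx prisma migrate resolve --rolled-back "<migration_name>"
# Flyway: requires undo migrations, which are not available in the Community edition
flyway undo
```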
## Rollback Triggers (any one = rollback)
- Error rate increases >2x baseline
- P99 latency increases >50%
- Critical user flow fails
- Data corruption detected
- Security vulnerability introduced
- A dependency service cannot be reached
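The two numeric triggers above are simple to evaluate automatically. The following is a sketch only: `Metrics` and `rollback_reasons` are hypothetical names, and the remaining triggers (failed user flows, data corruption, security issues, unreachable dependencies) need their own signals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    error_rate: float       # fraction of failed requests, e.g. 0.01 for 1%
    p99_latency_ms: float


def rollback_reasons(baseline: Metrics, current: Metrics) -> list[str]:
    """Return the numeric triggers that fired; any non-empty result means roll back."""
    reasons = []
    # With a baseline of 0, any error at all counts as more than 2x.
    if current.error_rate > 2 * baseline.error_rate:
        reasons.append("error rate more than 2x baseline")
    if current.p99_latency_ms > 1.5 * baseline.p99_latency_ms:
        reasons.append("P99 latency up more than 50%")
    return reasons


baseline = Metrics(error_rate=0.01, p99_latency_ms=200.0)
current = Metrics(error_rate=0.025, p99_latency_ms=250.0)
print(rollback_reasons(baseline, current))  # ['error rate more than 2x baseline']
```

Capturing the baseline before the deploy starts matters: comparing against metrics taken mid-deploy hides the regression.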
## Rollback Procedure
1. Decide: roll back within 5 minutes of anomaly detection
2. Execute rollback command (pre-written, tested)
3. Verify rollback deployed (health check, version check)
4. Notify team
5. Open incident if data impact
6. Post-mortem within 24 hours
Reusable patterns that appear frequently when applying this skill.
```yaml
# Assumes registry login and cluster credentials for each environment are configured elsewhere.
name: Deploy
on:
  push:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test
      - run: npm run lint
  build:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Tag with the registry path so the push below finds the image
      - run: docker build -t registry/app:${{ github.sha }} .
      - run: docker push registry/app:${{ github.sha }}
  deploy-staging:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - run: kubectl set image deployment/app app=registry/app:${{ github.sha }}
      - run: kubectl rollout status deployment/app --timeout=120s
      - run: curl -f https://staging.example.com/health
  deploy-production:
    needs: deploy-staging
    runs-on: ubuntu-latest
    environment: production
    steps:
      - run: kubectl set image deployment/app app=registry/app:${{ github.sha }}
      - run: kubectl rollout status deployment/app --timeout=180s
      - run: curl -f https://app.example.com/health
```
## Safe Migration Rules
1. Never drop columns in the same deploy that stops reading them
2. Add new columns as nullable first, backfill, then add NOT NULL
3. Test migration on copy of production data
4. Estimate migration time (lock duration for large tables)
5. Have a tested rollback script ready
6. Run during low-traffic window
## Migration Checklist
- [ ] Migration tested on staging with production-like data volume
- [ ] Rollback script tested and committed
- [ ] Application code compatible with both old and new schema
- [ ] Estimated lock duration acceptable (<1s for online services)
- [ ] Backup taken before migration
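Rule 2 (add nullable, then backfill) can be illustrated with an in-memory SQLite database. This is a sketch under assumptions: the `users` table, the `display_name` column, and the batch size are invented for the example, and the final NOT NULL step is left out because its syntax is engine-specific.

```python
import sqlite3


def backfill_in_batches(conn: sqlite3.Connection, batch_size: int = 1000) -> int:
    """Fill users.display_name from users.name in small batches; returns rows updated."""
    total = 0
    while True:
        cursor = conn.execute(
            "UPDATE users SET display_name = name "
            "WHERE id IN (SELECT id FROM users WHERE display_name IS NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # commit per batch so each write transaction stays short
        if cursor.rowcount == 0:
            return total
        total += cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [(f"user{i}",) for i in range(2500)])

# Step 1: add the column as nullable, so code that does not know about it keeps working.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")

# Step 2: backfill in batches instead of one long-running UPDATE.
updated = backfill_in_batches(conn, batch_size=1000)
remaining = conn.execute("SELECT COUNT(*) FROM users WHERE display_name IS NULL").fetchone()[0]
print(updated, remaining)  # 2500 0
```

Only once `remaining` is 0 and the application writes the new column on every insert is it safe to add the NOT NULL constraint in a later deploy.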
| Rationalization | Reality |
|---|---|
| "Staging is identical to production, skip staging test" | Staging is never identical. Differences in data volume, traffic patterns, and secrets create real divergences. Always test. |
| "It is a small change, deploy directly" | Small changes cause big outages. The deploy pipeline exists because judgment fails under pressure. |
| "I will add monitoring after deploy" | If you cannot observe it, you cannot know it broke. Set up monitoring before deploying. |
| "Rollback takes too long" | A 5-minute rollback is better than a 5-hour outage. Automate rollback if manual is too slow. |
| "Deploy on Friday, fix on Monday" | Friday deploys leave the weekend as blast radius. Deploy Tuesday-Thursday during business hours when the team is available. |
| "The tests pass, it will be fine" | Tests do not catch integration issues, data volume problems, or infrastructure misconfigurations. Staging verification is mandatory. |
| "Skip the health check, the app starts fine" | Health checks catch dependency failures, migration errors, and configuration issues. Never skip them. |
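The Tuesday-Thursday guidance in the table can be enforced as a pipeline guard. The sketch assumes business hours of 09:00 to 17:00 in the team's local time, which the table does not specify; `in_deploy_window` is a hypothetical helper.

```python
from datetime import datetime


def in_deploy_window(moment: datetime, start_hour: int = 9, end_hour: int = 17) -> bool:
    """True for Tuesday-Thursday between start_hour (inclusive) and end_hour (exclusive)."""
    is_midweek = moment.weekday() in (1, 2, 3)  # Monday is 0, so 1-3 is Tuesday-Thursday
    return is_midweek and start_hour <= moment.hour < end_hour


print(in_deploy_window(datetime(2024, 1, 3, 10, 0)))  # True: a Wednesday morning
print(in_deploy_window(datetime(2024, 1, 5, 10, 0)))  # False: a Friday
```

A guard like this is a default, not an absolute: urgent fixes still need a documented override path.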
After deploying, confirm every item in the Post-Deploy Verification checklist above.