Comprehensive guide for updating Auto Code deployments with zero downtime using rolling updates, blue-green deployments, and canary releases.
Auto Code supports multiple update strategies for both Docker Compose and Kubernetes deployments:
- Rolling Updates - Gradual replacement of old instances with new ones
- Blue-Green Deployments - Switch traffic between two identical environments
- Canary Releases - Gradual rollout to a subset of users before full deployment
- Recreate Strategy - Stop old version, start new version (downtime acceptable)
Recommended strategies by deployment type:
- Development/Staging: Rolling updates (default)
- Production (small scale): Blue-green deployments
- Production (large scale): Canary releases with gradual rollout
Auto Code follows semantic versioning (MAJOR.MINOR.PATCH):
2.8.0 → 2.8.1 (patch: bug fixes, backwards compatible)
2.8.0 → 2.9.0 (minor: new features, backwards compatible)
2.8.0 → 3.0.0 (major: breaking changes)
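The bump type can be derived mechanically from two version strings, which is handy for deciding how much pre-update review a release needs. The following is a minimal sketch, not part of Auto Code itself; the function name `classify_update` is hypothetical, and it assumes plain MAJOR.MINOR.PATCH strings without pre-release suffixes.

```python
def classify_update(current: str, target: str) -> str:
    """Classify a version change as 'major', 'minor', 'patch', 'none' or 'downgrade'."""
    cur = tuple(int(part) for part in current.split("."))
    new = tuple(int(part) for part in target.split("."))
    if len(cur) != 3 or len(new) != 3:
        raise ValueError("expected MAJOR.MINOR.PATCH")
    if new == cur:
        return "none"
    if new < cur:
        # Tuples compare element by element, so (2, 8, 0) < (2, 8, 1)
        return "downgrade"
    if new[0] != cur[0]:
        return "major"  # breaking changes possible
    if new[1] != cur[1]:
        return "minor"  # new features, backwards compatible
    return "patch"      # bug fixes only


print(classify_update("2.8.0", "2.8.1"))
print(classify_update("2.8.0", "3.0.0"))
```

A deployment script could, for example, refuse to proceed automatically when the result is `major` and require a manual changelog review instead.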
Docker images are tagged with:
- Version tags: auto-code-backend:2.8.1, auto-code-web-backend:2.8.1
- Latest tag: auto-code-backend:latest (always points to latest stable)
- Branch tags: auto-code-backend:develop (for testing pre-release builds)
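Because the latest and branch tags move over time, it can help to check that a deployment references a fixed version tag. The sketch below is illustrative only: `split_image`, `is_pinned` and the `MUTABLE_TAGS` set are hypothetical names, and digest references (`name@sha256:...`) are not handled.

```python
MUTABLE_TAGS = {"latest", "develop"}  # tags that move over time (assumed set)


def split_image(ref: str) -> tuple[str, str]:
    """Split 'name:tag' into (name, tag); a missing tag defaults to 'latest'."""
    # Only look for the tag separator after the last '/', so a registry
    # port such as 'registry.example.com:5000/app' is not mistaken for a tag.
    last_component = ref.rsplit("/", 1)[-1]
    if ":" in last_component:
        name, tag = ref.rsplit(":", 1)
        return name, tag
    return ref, "latest"


def is_pinned(ref: str) -> bool:
    """Return True when the image reference uses a fixed (non-moving) tag."""
    _, tag = split_image(ref)
    return tag not in MUTABLE_TAGS


for image in ("auto-code-backend:2.8.1", "auto-code-backend:latest"):
    print(image, "pinned" if is_pinned(image) else "NOT pinned")
```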
# Build and tag images
docker build -t auto-code-backend:2.8.1 -t auto-code-backend:latest apps/backend
docker build -t auto-code-web-backend:2.8.1 -t auto-code-web-backend:latest apps/web-backend
docker build -t auto-code-web-frontend:2.8.1 -t auto-code-web-frontend:latest apps/web-frontend
# Push to registry
docker push auto-code-backend:2.8.1
docker push auto-code-backend:latest
Before any update:
- Backup data:
  # Docker Compose (-T disables the pseudo-TTY so the redirected dump is written cleanly)
  docker-compose exec -T postgres pg_dump -U postgres autoclaude > backup-$(date +%Y%m%d).sql
  # Kubernetes
  kubectl exec -n auto-claude postgres-0 -- pg_dump -U postgres autoclaude > backup-$(date +%Y%m%d).sql
- Review changelog:
  - Check CHANGELOG.md for breaking changes
  - Review migration scripts (if any)
  - Identify required configuration changes
- Test in staging:
  - Deploy to staging environment first
  - Run smoke tests and integration tests
  - Verify all services are healthy
- Plan rollback:
  - Document rollback steps
  - Keep previous version images available
  - Ensure backups are restorable
Docker Compose doesn't support true rolling updates out-of-the-box, but you can achieve gradual updates with orchestration:
# Pull new images
docker-compose pull
# Restart services one at a time
docker-compose up -d --no-deps --scale web-backend=2 web-backend
sleep 30 # Wait for new instance to be healthy
docker-compose up -d --no-deps --scale web-backend=1 web-backend
# Repeat for other services
docker-compose up -d --no-deps backend
docker-compose up -d --no-deps web-frontend
Requires external load balancer (nginx, HAProxy, Traefik):
# Scale up with new version
docker-compose up -d --no-deps --scale web-backend=3 web-backend
# Wait for health checks
sleep 30
docker-compose ps
# Gracefully stop old instances
docker stop $(docker ps -q --filter name=web-backend | tail -n 1)
sleep 10
docker stop $(docker ps -q --filter name=web-backend | tail -n 1)
# Verify only new version running
docker-compose ps
Create scripts/rolling-update-compose.sh:
#!/bin/bash
set -euo pipefail
if [ "$#" -ne 2 ]; then
  echo "Usage: $0 <service> <new-version>" >&2
  exit 1
fi
SERVICE=$1
NEW_VERSION=$2
echo "Starting rolling update for $SERVICE to version $NEW_VERSION"
# Update image version in docker-compose.yml (a .bak copy is kept for rollback)
sed -i.bak "s|image: auto-code-$SERVICE:.*|image: auto-code-$SERVICE:$NEW_VERSION|" docker-compose.yml
# Pull new image
docker-compose pull "$SERVICE"
# Scale up to 2 instances
docker-compose up -d --no-deps --scale "$SERVICE=2" "$SERVICE"
# Wait for new instance to be healthy
echo "Waiting for new instance to be healthy..."
sleep 30
# Check health: fail if any instance is unhealthy or still starting
# (a plain grep for "healthy" would also match "unhealthy")
if docker-compose ps "$SERVICE" | grep -Eq "unhealthy|starting"; then
  echo "New instance unhealthy, rolling back"
  # Restore the previous image version before scaling back down
  mv docker-compose.yml.bak docker-compose.yml
  docker-compose up -d --no-deps --scale "$SERVICE=1" "$SERVICE"
  exit 1
else
  echo "New instance healthy, scaling down old instance"
  docker-compose up -d --no-deps --scale "$SERVICE=1" "$SERVICE"
  rm -f docker-compose.yml.bak
  echo "Rolling update complete"
fi
Usage:
chmod +x scripts/rolling-update-compose.sh
./scripts/rolling-update-compose.sh web-backend 2.8.1
Kubernetes supports rolling updates natively with configurable strategies.
In kubernetes/web-backend-deployment.yaml:
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Max pods above desired count during update
      maxUnavailable: 0  # Max pods unavailable during update (zero-downtime)
  minReadySeconds: 10    # Wait 10s after pod ready before continuing
Strategy parameters:
- maxSurge: 1 - Create 1 extra pod during rollout (3 → 4 → 3)
- maxUnavailable: 0 - Never reduce below desired replica count
- minReadySeconds: 10 - A new pod must stay ready for 10s before it counts as available, which paces the rollout
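To see what these settings imply for capacity, the bounds on pod counts during a rollout can be worked out directly. This is a small sketch with hypothetical helper names (`resolve`, `rollout_bounds`); it follows the Kubernetes convention that a percentage maxSurge is rounded up and a percentage maxUnavailable is rounded down.

```python
import math


def resolve(value, replicas: int, round_up: bool) -> int:
    """Turn an absolute count or a 'NN%' string into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        raw = replicas * int(value[:-1]) / 100
        return math.ceil(raw) if round_up else math.floor(raw)
    return int(value)


def rollout_bounds(replicas: int, max_surge, max_unavailable) -> tuple[int, int]:
    """Return (minimum available pods, maximum total pods) during a rolling update."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    if surge == 0 and unavailable == 0:
        # Kubernetes rejects this combination: the rollout could never progress
        raise ValueError("maxSurge and maxUnavailable cannot both be 0")
    return replicas - unavailable, replicas + surge


# The configuration above: 3 replicas, maxSurge 1, maxUnavailable 0
print(rollout_bounds(3, 1, 0))
```

With the values from the manifest, the cluster needs headroom for one extra pod, and serving capacity never drops below the three desired replicas.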
# Update image version
kubectl set image deployment/web-backend \
web-backend=auto-code-web-backend:2.8.1 \
-n auto-claude
# Alternative: Edit deployment directly
kubectl edit deployment web-backend -n auto-claude
# Or use kubectl apply with updated manifest
kubectl apply -f kubernetes/web-backend-deployment.yaml
# Watch rollout status
kubectl rollout status deployment/web-backend -n auto-claude
# Watch pods being replaced
kubectl get pods -n auto-claude -w
# View rollout history
kubectl rollout history deployment/web-backend -n auto-claude
Example output:
Waiting for deployment "web-backend" rollout to finish: 1 out of 3 new replicas have been updated...
Waiting for deployment "web-backend" rollout to finish: 1 old replicas are pending termination...
Waiting for deployment "web-backend" rollout to finish: 2 of 3 updated replicas are available...
deployment "web-backend" successfully rolled out
Useful for testing mid-rollout:
# Pause rollout after first pod
kubectl rollout pause deployment/web-backend -n auto-claude
# Test new version
kubectl exec -n auto-claude web-backend-xxxxx-yyy -- curl http://localhost:8000/health
# Resume rollout
kubectl rollout resume deployment/web-backend -n auto-claude
With Helm charts:
# Update values.yaml or override during upgrade
helm upgrade auto-claude ./helm/auto-code \
--namespace auto-claude \
--set backend.image.tag=2.8.1 \
--set webBackend.image.tag=2.8.1 \
--set webFrontend.image.tag=2.8.1 \
--wait \
--timeout 5m
# Verify rollout
helm status auto-claude -n auto-claude
Blue-green deployments maintain two identical environments and switch traffic between them.
Requires manual orchestration with Docker Compose.
docker-compose.blue.yml:
version: '3.8'
services:
  web-backend-blue:
    image: auto-code-web-backend:2.8.0
    container_name: web-backend-blue
    ports:
      - "8001:8000" # Blue environment on port 8001
    environment:
      - ENVIRONMENT=blue
    # ... other config same as docker-compose.yml
  web-frontend-blue:
    image: auto-code-web-frontend:2.8.0
    container_name: web-frontend-blue
    ports:
      - "3001:3000" # Blue environment on port 3001
    # ... other config
docker-compose.green.yml:
version: '3.8'
services:
  web-backend-green:
    image: auto-code-web-backend:2.8.1 # New version
    container_name: web-backend-green
    ports:
      - "8002:8000" # Green environment on port 8002
    environment:
      - ENVIRONMENT=green
    # ... other config
  web-frontend-green:
    image: auto-code-web-frontend:2.8.1 # New version
    container_name: web-frontend-green
    ports:
      - "3002:3000" # Green environment on port 3002
    # ... other config
# Start green environment with new version
docker-compose -f docker-compose.green.yml up -d
# Wait for health checks
sleep 30
# Verify green environment
curl http://localhost:8002/health
curl http://localhost:3002/
Configure nginx/HAProxy to switch traffic:
nginx.conf:
upstream backend {
    server localhost:8001; # Blue (current)
    # server localhost:8002; # Green (new) - comment out until ready
}
upstream frontend {
    server localhost:3001; # Blue (current)
    # server localhost:3002; # Green (new) - comment out until ready
}
Switch to green:
upstream backend {
    # server localhost:8001; # Blue (old) - commented out
    server localhost:8002; # Green (new) - now active
}
upstream frontend {
    # server localhost:3001; # Blue (old) - commented out
    server localhost:3002; # Green (new) - now active
}
Reload nginx:
nginx -s reload
If issues detected, switch back to blue:
# Edit nginx.conf to point back to blue
nginx -s reload
# Or stop green environment
docker-compose -f docker-compose.green.yml down
Kubernetes makes blue-green deployments easier with Services and labels.
backend-deployment-green.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend-green
  namespace: auto-claude
spec:
  replicas: 3
  selector:
    matchLabels:
      app: backend
      version: green # Green label
  template:
    metadata:
      labels:
        app: backend
        version: green
    spec:
      containers:
        - name: backend
          image: auto-code-backend:2.8.1 # New version
          # ... rest of spec
Deploy green:
kubectl apply -f kubernetes/backend-deployment-green.yaml
kubectl apply -f kubernetes/web-backend-deployment-green.yaml
kubectl apply -f kubernetes/web-frontend-deployment-green.yaml
# Wait for green pods to be ready
kubectl wait --for=condition=ready pod -l version=green -n auto-claude --timeout=300s
# Port-forward to test green directly
kubectl port-forward -n auto-claude deployment/web-backend-green 8001:8000
# Test in another terminal
curl http://localhost:8001/health
Update Service selector to point to green:
# Patch service to use green version
kubectl patch service web-backend -n auto-claude -p '{"spec":{"selector":{"version":"green"}}}'
kubectl patch service web-frontend -n auto-claude -p '{"spec":{"selector":{"version":"green"}}}'
# Verify traffic is now going to green
kubectl get endpoints web-backend -n auto-claude
# Monitor logs and metrics
kubectl logs -n auto-claude -l version=green --tail=100 -f
# If issues detected, rollback to blue
kubectl patch service web-backend -n auto-claude -p '{"spec":{"selector":{"version":"blue"}}}'
kubectl patch service web-frontend -n auto-claude -p '{"spec":{"selector":{"version":"blue"}}}'
Once green is stable:
# Delete blue deployments
kubectl delete deployment backend-blue -n auto-claude
kubectl delete deployment web-backend-blue -n auto-claude
kubectl delete deployment web-frontend-blue -n auto-claude
Canary releases gradually roll out a new version to a subset of users, monitoring metrics before full rollout.
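When stable and canary pods sit behind the same Service, the canary's share of traffic is roughly its share of the ready pods. The sketch below makes that arithmetic explicit; the function names are hypothetical, and it assumes the Service spreads requests evenly across pods, which is only approximately true in practice.

```python
def canary_share(stable_replicas: int, canary_replicas: int) -> float:
    """Approximate fraction of traffic reaching the canary pods."""
    total = stable_replicas + canary_replicas
    if total == 0:
        raise ValueError("no replicas running")
    return canary_replicas / total


def canary_replicas_for(stable_replicas: int, target_percent: int) -> int:
    """Smallest canary replica count whose share is at least target_percent."""
    if not 0 < target_percent < 100:
        raise ValueError("target_percent must be between 1 and 99")
    # c / (s + c) >= p / 100  is equivalent to  c >= p * s / (100 - p);
    # the negated floor division is an integer ceiling.
    return -(-target_percent * stable_replicas // (100 - target_percent))


print(canary_share(9, 1))           # 1 canary next to 9 stable pods
print(canary_replicas_for(9, 30))   # canary pods needed for at least 30%
```

Note that adding canary replicas without removing stable ones grows the total pod count, so three canary pods next to nine stable pods receive about a quarter of the traffic, not 30%.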
Keep existing deployment (stable) and create canary with fewer replicas:
web-backend-deployment-canary.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-backend-canary
  namespace: auto-claude
spec:
  replicas: 1 # Start with 1 canary pod (10% of traffic if stable has 9 replicas)
  selector:
    matchLabels:
      app: web-backend
      track: canary
  template:
    metadata:
      labels:
        app: web-backend
        track: canary
        version: 2.8.1 # New version
    spec:
      containers:
        - name: web-backend
          image: auto-code-web-backend:2.8.1 # New version
          # ... rest of spec
Deploy canary:
# Ensure stable deployment is running
kubectl apply -f kubernetes/web-backend-deployment.yaml # 9 replicas, version 2.8.0
# Deploy canary
kubectl apply -f kubernetes/web-backend-deployment-canary.yaml # 1 replica, version 2.8.1
# Service will load-balance across both (90% stable, 10% canary)
kubectl get pods -n auto-claude -l app=web-backend
Monitor error rates, latency, and health metrics:
# Watch canary logs
kubectl logs -n auto-claude -l track=canary -f
# Check error rates (requires Prometheus/monitoring)
# Compare error rate of canary vs stable
# promtool takes the Prometheus server URL before the query; http://prometheus:9090 is assumed here
kubectl exec -n auto-claude monitoring-pod -- \
promtool query instant http://prometheus:9090 'rate(http_requests_total{track="canary",status=~"5.."}[5m])'
# Check latency percentiles
kubectl exec -n auto-claude monitoring-pod -- \
promtool query instant http://prometheus:9090 'histogram_quantile(0.95, rate(http_request_duration_seconds_bucket{track="canary"}[5m]))'
If canary metrics are healthy, gradually increase canary traffic:
# Increase canary to 3 replicas (25% of traffic alongside 9 stable replicas)
kubectl scale deployment web-backend-canary --replicas=3 -n auto-claude
# Monitor for 15 minutes
sleep 900
# Increase to 5 replicas (about 36% of traffic)
kubectl scale deployment web-backend-canary --replicas=5 -n auto-claude
# Monitor for 15 minutes
sleep 900
# Promote: roll the stable deployment to the new image, then remove the canary
# (a Deployment cannot be renamed in place, so the canary is not patched into "web-backend")
kubectl set image deployment/web-backend web-backend=auto-code-web-backend:2.8.1 -n auto-claude
kubectl rollout status deployment/web-backend -n auto-claude
kubectl delete deployment web-backend-canary -n auto-claude
For production canary deployments, use Flagger for automated progressive delivery:
Install Flagger:
# Add Flagger Helm repo
helm repo add flagger https://flagger.app
# Install Flagger (requires Istio, Linkerd, or other service mesh)
kubectl apply -f https://raw.githubusercontent.com/fluxcd/flagger/main/artifacts/flagger/crd.yaml
helm upgrade -i flagger flagger/flagger \
--namespace flagger-system \
--set meshProvider=istio \
--set metricsServer=http://prometheus:9090
Flagger Canary resource:
web-backend-canary-flagger.yaml:
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: web-backend
  namespace: auto-claude
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-backend
  service:
    port: 8000
  analysis:
    interval: 1m
    threshold: 5 # Number of failed checks before rollback
    maxWeight: 50 # Max percentage of traffic to canary
    stepWeight: 10 # Increase traffic by 10% each interval
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99 # Require 99% success rate
        interval: 1m
      - name: request-duration
        thresholdRange:
          max: 500 # Max 500ms latency
        interval: 1m
    webhooks:
      - name: load-test
        url: http://flagger-loadtester/
        timeout: 5s
        metadata:
          cmd: "hey -z 1m -q 10 -c 2 http://web-backend-canary:8000/health"
Apply and update deployment:
kubectl apply -f kubernetes/web-backend-canary-flagger.yaml
# Trigger canary by updating deployment image
kubectl set image deployment/web-backend web-backend=auto-code-web-backend:2.8.1 -n auto-claude
# Flagger will automatically:
# 1. Deploy canary pods
# 2. Gradually increase traffic (0% → 10% → 20% → ... → 50%)
# 3. Monitor metrics at each step
# 4. Promote or rollback based on success criteria
Monitor Flagger canary:
# Watch canary progress
kubectl get canary -n auto-claude -w
# View events
kubectl describe canary web-backend -n auto-claude
Docker Compose canary requires external traffic management:
# Run stable version (3 instances)
docker-compose up -d --scale web-backend=3
# Run canary version (1 instance on different port)
docker run -d --name web-backend-canary \
-p 8001:8000 \
auto-code-web-backend:2.8.1
# Configure load balancer for weighted routing
# nginx: 90% to localhost:8000, 10% to localhost:8001
Kubernetes tracks rollout history and supports instant rollbacks.
# View all revisions
kubectl rollout history deployment/web-backend -n auto-claude
# View specific revision details
kubectl rollout history deployment/web-backend -n auto-claude --revision=2
Example output:
REVISION CHANGE-CAUSE
1 <none>
2 kubectl set image deployment/web-backend web-backend=auto-code-web-backend:2.8.1
3 kubectl set image deployment/web-backend web-backend=auto-code-web-backend:2.8.2
# Rollback to previous revision
kubectl rollout undo deployment/web-backend -n auto-claude
# Rollback to specific revision
kubectl rollout undo deployment/web-backend -n auto-claude --to-revision=2
# Watch rollback progress
kubectl rollout status deployment/web-backend -n auto-claude
# List Helm releases
helm history auto-claude -n auto-claude
# Rollback to previous release
helm rollback auto-claude -n auto-claude
# Rollback to specific revision
helm rollback auto-claude 2 -n auto-claude
# Update docker-compose.yml to previous version
sed -i 's|auto-code-web-backend:2.8.1|auto-code-web-backend:2.8.0|' docker-compose.yml
# Pull old image (if not cached)
docker-compose pull web-backend
# Restart with old version
docker-compose up -d web-backend
If the update includes database migrations, rollback requires reversing migrations:
# For Django migrations
docker-compose exec backend python manage.py migrate app_name 0003_previous_migration
# For Alembic migrations
docker-compose exec backend alembic downgrade -1
# For Kubernetes
kubectl exec -n auto-claude backend-xxxxx -- python manage.py migrate app_name 0003_previous_migration
Best practices for database migrations:
- Always make migrations backwards compatible
- Use multi-phase migrations for breaking changes
- Test rollback in staging before production
- Multiple replicas - At least 2 pods/containers per service
- Health checks - Liveness and readiness probes configured
- Graceful shutdown - Application handles SIGTERM properly
- Rolling update strategy - maxUnavailable: 0 in Kubernetes
- Backwards compatibility - New version compatible with old version during rollout
Ensure application handles SIGTERM for graceful shutdown:
Python (FastAPI/Django):
import signal
import sys

def graceful_shutdown(signum, frame):
    print("Received SIGTERM, shutting down gracefully...")
    # Close database connections
    # Finish in-flight requests
    # Clean up resources
    sys.exit(0)

signal.signal(signal.SIGTERM, graceful_shutdown)
Node.js:
process.on('SIGTERM', () => {
  console.log('Received SIGTERM, shutting down gracefully...');
  server.close(() => {
    // Close database connections
    // Finish in-flight requests
    process.exit(0);
  });
});
For zero-downtime updates with database changes:
- Add new columns/tables (nullable or with defaults)
- Deploy new application version
- Old version ignores new columns, new version uses them
- Write to both old and new columns
- Backfill old data to new columns
- Remove old columns/tables
- Deploy new version that only uses new schema
Example: Renaming column user_id to account_id
Phase 1: Add new column
ALTER TABLE tasks ADD COLUMN account_id INTEGER;
Phase 2: Dual-write (application code)
# v2.8.1: Write to both columns
task.user_id = user.id
task.account_id = user.id # Also write to new column
task.save()
-- Backfill existing data (SQL)
UPDATE tasks SET account_id = user_id WHERE account_id IS NULL;
Phase 3: Remove old column
-- v2.8.2: Drop old column
ALTER TABLE tasks DROP COLUMN user_id;
- Always test in staging first
  - Deploy to staging environment
  - Run full test suite
  - Perform manual testing
  - Verify rollback procedure
- Use semantic versioning
  - Tag all images with version numbers
  - Never use :latest in production
  - Keep old versions available for rollback
- Monitor during updates
  - Watch error rates and latency
  - Check health endpoints
  - Monitor resource usage (CPU, memory)
  - Review application logs
- Automate updates
  - Use CI/CD pipelines
  - Automate testing and validation
  - Use GitOps for declarative deployments
- Document changes
  - Maintain CHANGELOG.md
  - Document breaking changes
  - Update runbooks and operational docs
- Configure proper resource limits
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
    limits:
      cpu: 1000m
      memory: 1Gi
- Use PodDisruptionBudgets
  apiVersion: policy/v1
  kind: PodDisruptionBudget
  metadata:
    name: web-backend-pdb
  spec:
    minAvailable: 2 # Keep at least 2 pods running during updates
    selector:
      matchLabels:
        app: web-backend
- Configure preStop hooks for graceful shutdown
  lifecycle:
    preStop:
      exec:
        command: ["/bin/sh", "-c", "sleep 15"] # Wait for load balancer to deregister
- Set appropriate terminationGracePeriodSeconds
  spec:
    terminationGracePeriodSeconds: 30 # Give 30s for graceful shutdown
- Use external load balancer
  - nginx or HAProxy for traffic management
  - Health check endpoints for routing decisions
- Version lock images
  services:
    web-backend:
      image: auto-code-web-backend:2.8.1 # Specific version, not :latest
- Use healthchecks
  healthcheck:
    test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
    interval: 30s
    timeout: 10s
    retries: 3
    start_period: 40s
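The fixed `sleep 30` waits used in the Docker Compose examples can be replaced by polling the health endpoint until it responds, which is both faster and safer. The following is a minimal sketch using only the standard library; `wait_until_healthy` and `http_health_check` are hypothetical helpers, and the attempt count and interval are illustrative defaults.

```python
import time
import urllib.request
from typing import Callable


def http_health_check(url: str, timeout: float = 5.0) -> Callable[[], bool]:
    """Build a check that returns True when the URL answers with HTTP 200."""
    def check() -> bool:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    return check


def wait_until_healthy(check: Callable[[], bool], attempts: int = 15,
                       interval: float = 2.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call check() up to `attempts` times, pausing `interval` seconds between tries."""
    for attempt in range(attempts):
        try:
            if check():
                return True
        except OSError:
            # Connection refused, timeouts and HTTP errors all derive from OSError
            pass
        if attempt < attempts - 1:
            sleep(interval)
    return False
```

For example, `wait_until_healthy(http_health_check("http://localhost:8000/health"))` returns as soon as the endpoint reports healthy and gives up after roughly 30 seconds otherwise. The `sleep` parameter exists so the loop can be exercised in tests without real delays.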
- Error rates - HTTP 5xx responses
- Latency - p50, p95, p99 response times
- Throughput - Requests per second
- Resource usage - CPU, memory, disk
- Health check status - Liveness and readiness probe failures
# Error rate by version
rate(http_requests_total{status=~"5..",version="2.8.1"}[5m])
# Compare latency between versions
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket{version="2.8.1"}[5m]))
# Pod restart count (high restarts indicate issues)
kube_pod_container_status_restarts_total{namespace="auto-claude"}
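Once values like these have been collected for the new version, the promote-or-rollback decision can be expressed as a simple rule. The sketch below is an assumption-laden illustration: the `Metrics` class and `canary_verdict` function are hypothetical, the 99% success rate and 500 ms p95 limits mirror the Flagger example earlier in this guide, and the minimum request count is an arbitrary guard against deciding on too little data.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    requests: int   # total requests observed in the window
    errors: int     # HTTP 5xx responses in the window
    p95_ms: float   # 95th percentile latency in milliseconds


def success_rate(metrics: Metrics) -> float:
    """Percentage of requests that did not fail; 100.0 when there was no traffic."""
    if metrics.requests == 0:
        return 100.0
    return 100.0 * (metrics.requests - metrics.errors) / metrics.requests


def canary_verdict(canary: Metrics, min_success_rate: float = 99.0,
                   max_p95_ms: float = 500.0, min_requests: int = 100) -> str:
    """Return 'wait', 'rollback' or 'promote' for the observed canary metrics."""
    if canary.requests < min_requests:
        return "wait"  # not enough traffic to judge yet
    if success_rate(canary) < min_success_rate:
        return "rollback"
    if canary.p95_ms > max_p95_ms:
        return "rollback"
    return "promote"


print(canary_verdict(Metrics(requests=1000, errors=5, p95_ms=320.0)))
```

A stricter variant could also compare the canary against the stable version's metrics over the same window rather than against fixed thresholds.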
Configure alerts for update issues:
groups:
  - name: update_alerts
    rules:
      - alert: HighErrorRate
        expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.01
        for: 2m
        annotations:
          summary: "High error rate during update"
      - alert: HighPodRestarts
        expr: rate(kube_pod_container_status_restarts_total[15m]) > 0
        for: 5m
        annotations:
          summary: "Pods restarting during update"
Symptoms:
- Rollout not progressing
- Pods stuck in Pending or ImagePullBackOff
Diagnosis:
# Check pod status
kubectl get pods -n auto-claude
# View pod events
kubectl describe pod web-backend-xxxxx -n auto-claude
# Check rollout status
kubectl rollout status deployment/web-backend -n auto-claude
Solutions:
- Verify image exists in registry
- Check resource availability (CPU, memory)
- Review node capacity: kubectl describe node
- Check image pull credentials
Symptoms:
- Readiness probes failing
- Pods in CrashLoopBackOff
- High error rates after update
Diagnosis:
# Check pod logs
kubectl logs -n auto-claude web-backend-xxxxx --previous
# Check events
kubectl get events -n auto-claude --sort-by='.lastTimestamp'
# Verify health endpoint
kubectl exec -n auto-claude web-backend-xxxxx -- curl http://localhost:8000/health
Solutions:
- Rollback immediately: kubectl rollout undo deployment/web-backend -n auto-claude
- Check configuration changes (environment variables, secrets)
- Verify database migrations completed successfully
- Review application logs for errors
Symptoms:
- Rollback command succeeds but still seeing new version
- Rollback completes but application still broken
Solutions:
# Verify rollback completed
kubectl rollout status deployment/web-backend -n auto-claude
# Check deployment image
kubectl get deployment web-backend -n auto-claude -o jsonpath='{.spec.template.spec.containers[0].image}'
# Force rollback to specific revision
kubectl rollout undo deployment/web-backend -n auto-claude --to-revision=1
# If database issue, rollback migrations
kubectl exec -n auto-claude backend-xxxxx -- python manage.py migrate app_name 0003_previous
- Docker Deployment Guide - Docker Compose deployment instructions
- Kubernetes Deployment Guide - Kubernetes and Helm deployment
- Troubleshooting Guide - General troubleshooting
This guide covered comprehensive update strategies for Auto Code:
- Rolling updates - Default strategy for gradual replacement
- Blue-green deployments - Zero-downtime with instant rollback
- Canary releases - Risk mitigation with gradual rollout
- Rollback strategies - Quick recovery from failed updates
- Zero-downtime techniques - Graceful shutdown and database migrations
Recommended approach:
- Start with rolling updates for development/staging
- Use blue-green for small-scale production deployments
- Implement canary releases for large-scale production with automated tooling (Flagger)
- Always test in staging and have rollback plan ready