Deployment Strategies: Blue-Green, Canary, and Rolling Updates
Choosing the right deployment strategy is critical to delivering software reliably. A bad deployment can take down your entire system, erode user trust, and cost your organization real money. The good news is that the industry has developed battle-tested patterns — blue-green, canary, rolling, and others — that minimize risk while maximizing deployment velocity.
In this guide, we cover the most important deployment strategies, when to use each one, how they compare, and how to integrate them with your CI/CD pipeline. For controlling feature exposure independently of deployment, see Feature Flags.
Blue-Green Deployments
Blue-green deployment maintains two identical production environments, called Blue and Green. At any time, one is live (serving traffic) and the other is idle (staging the next release).
How It Works
- Blue is the current production environment serving all traffic
- Deploy the new version to Green
- Run smoke tests and health checks against Green
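- Switch the load balancer (or DNS) to point traffic to Green; a load balancer switch takes effect almost immediately, while a DNS change only propagates as cached records expire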
- Green is now production; Blue becomes the rollback target
- If anything goes wrong, switch traffic back to Blue instantly
# Nginx configuration for blue-green switching
upstream app_backend {
    # Toggle between blue and green
    server green.internal:8080;   # Active
    # server blue.internal:8080;  # Standby
}

server {
    listen 80;

    location / {
        proxy_pass http://app_backend;
    }
}
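The switch-and-rollback flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `BlueGreenRouter` and the `smoke_test` callback are hypothetical names, and a real setup would update a load balancer or proxy configuration rather than an in-memory field.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BlueGreenRouter:
    """Hypothetical helper that tracks which environment receives live traffic."""

    active: str = "blue"

    @property
    def idle(self) -> str:
        # The environment that is not currently serving traffic.
        return "green" if self.active == "blue" else "blue"

    def release(self, smoke_test: Callable[[str], bool]) -> bool:
        """Promote the idle environment only if its smoke test passes."""
        candidate = self.idle
        if not smoke_test(candidate):
            return False  # live traffic never left the active environment
        self.active = candidate
        return True

    def rollback(self) -> None:
        """Point traffic back at the previous environment."""
        self.active = self.idle


router = BlueGreenRouter(active="blue")
# Assumed smoke test: pretend that only Green has the new release deployed.
promoted = router.release(smoke_test=lambda env: env == "green")
print(router.active, promoted)  # green True
```

Because the previous environment is left untouched after promotion, `rollback()` is the same kind of switch in the opposite direction.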
Advantages
- Instant rollback: Switch traffic back in seconds
- Zero downtime: Users never see an outage during deployment
- Full environment testing: Test the complete new version before any user sees it
- Simple mental model: Easy to understand and implement
Disadvantages
- Double the infrastructure cost: You need two full environments
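- Database migrations are tricky: If Blue and Green share a database, schema changes must stay compatible with both versions until the old environment is retired
- Long-running transactions: Requests in flight during the switch may fail unless the old environment is given time to finish them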
Canary Deployments
Canary deployment routes a small percentage of traffic to the new version while the majority continues on the old version. You gradually increase traffic to the new version as confidence grows.
How It Works
- Deploy the new version alongside the existing version
- Route 1-5% of traffic to the new version
- Monitor error rates, latency, and business metrics
- If metrics are healthy, increase to 10%, 25%, 50%, 100%
- If metrics degrade, route all traffic back to the old version
# Kubernetes canary using Istio VirtualService
# (the "stable" and "canary" subsets must be defined in a matching DestinationRule)
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
    - my-app.example.com
  http:
    - route:
        - destination:
            host: my-app
            subset: stable
          weight: 95
        - destination:
            host: my-app
            subset: canary
          weight: 5
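The promote-or-abort loop described in the steps above might look like the following sketch. The function name `run_canary` and the `is_healthy` callback are assumptions made for illustration; in practice the health check would query your monitoring system, and setting a weight would mean updating the routing configuration.

```python
from typing import Callable, List, Sequence, Tuple


def run_canary(
    steps: Sequence[int],
    is_healthy: Callable[[int], bool],
) -> Tuple[int, List[int]]:
    """Step through canary weights, dropping to 0% on the first unhealthy check.

    Returns the final canary weight and the list of weights that were tried.
    """
    tried: List[int] = []
    for weight in steps:
        tried.append(weight)        # route `weight` percent of traffic to the canary
        if not is_healthy(weight):  # e.g. compare error rate and latency to stable
            return 0, tried         # send all traffic back to the old version
    # Every step passed, so the last weight stays in place.
    return (tried[-1] if tried else 0), tried


# Hypothetical health check that starts failing once the canary gets 50% of traffic.
final_weight, history = run_canary([5, 10, 25, 50, 100], lambda w: w < 50)
print(final_weight, history)  # 0 [5, 10, 25, 50]
```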
Advantages
- Low risk: Only a small fraction of users are exposed to potential issues
- Real production traffic: Test with actual user behavior, not synthetic tests
- Gradual confidence: Increase exposure as metrics confirm stability
Disadvantages
- More complex routing: Requires traffic splitting capabilities
- Monitoring is critical: You need robust observability to detect issues at low traffic percentages
- Session stickiness: Users may switch between versions if not handled properly
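One common way to keep a user on a single version is to derive the routing decision from a stable identifier instead of picking randomly per request. The sketch below assumes a string user ID is available on every request; `bucket` and `version_for` are illustrative names, not part of any particular router.

```python
import hashlib


def bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def version_for(user_id: str, canary_percent: int) -> str:
    """Send the lowest `canary_percent` buckets to the canary, the rest to stable."""
    return "canary" if bucket(user_id) < canary_percent else "stable"


# The same user always gets the same answer for a given percentage.
print(version_for("user-42", 5) == version_for("user-42", 5))  # True
```

Because the bucket never changes, a user who lands on the canary at 5% stays on it when the weight grows to 25% or 50%.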
Rolling Updates
Rolling updates replace instances of the old version with the new version one at a time (or in small batches). This is the default strategy for Kubernetes Deployments.
How It Works
- You have N instances running version 1
- Take one instance out of the load balancer
- Upgrade it to version 2
- Add it back to the load balancer
- Repeat until all instances are running version 2
# Kubernetes rolling update configuration
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 6
  selector:                # required in apps/v1; must match the pod template labels
    matchLabels:
      app: my-app          # label name and value are illustrative
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most 1 extra pod during the update
      maxUnavailable: 1    # at most 1 pod can be unavailable during the update
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:v2
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
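To see what `maxSurge` and `maxUnavailable` mean for capacity, the following sketch computes the pod-count bounds a rolling update stays within. The helper names are hypothetical; the rounding follows the documented Kubernetes behavior, where a percentage is rounded up for `maxSurge` and down for `maxUnavailable`.

```python
from typing import Tuple, Union


def resolve(value: Union[int, str], replicas: int, round_up: bool) -> int:
    """Turn an absolute count or a percentage string such as "25%" into a pod count."""
    if isinstance(value, str):
        scaled = replicas * int(value.rstrip("%"))
        # Integer arithmetic avoids floating-point rounding surprises.
        return -(-scaled // 100) if round_up else scaled // 100
    return value


def rollout_bounds(
    replicas: int,
    max_surge: Union[int, str],
    max_unavailable: Union[int, str],
) -> Tuple[int, int]:
    """Return (minimum available pods, maximum total pods) during the update."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return replicas - unavailable, replicas + surge


print(rollout_bounds(6, 1, 1))           # (5, 7)
print(rollout_bounds(10, "25%", "25%"))  # (8, 13)
```

For the manifest above, at least 5 of the 6 pods stay available and at most 7 pods exist at any moment.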
Advantages
- Little extra infrastructure: Needs at most a few surge instances beyond normal capacity (one extra pod with maxSurge: 1)
- Zero downtime: Some instances are always available
- Built into Kubernetes: Minimal configuration required
Disadvantages
- Mixed versions during deployment: Both old and new versions serve traffic simultaneously
- Slow rollback: Rolling back means doing another rolling update in reverse
- Compatibility requirement: Old and new versions must coexist, especially for API contracts
Shadow (Dark) Deployments
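Shadow deployments route a copy of production traffic to the new version without serving its responses to users. The new version processes real requests, but its responses are discarded. This is ideal for testing performance, correctness, and resource consumption with real traffic patterns. Mirrored requests still execute, so point the shadow version at isolated or stubbed dependencies to avoid duplicate side effects such as writes or outgoing emails.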
# Istio mirroring configuration
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
    - my-app.example.com
  http:
    - route:
        - destination:
            host: my-app
            subset: stable
      mirror:
        host: my-app
        subset: canary
      mirrorPercentage:
        value: 100.0
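Conceptually, mirroring means the user always receives the stable version's response while the shadow's output is only recorded for comparison. The sketch below is a simplified, synchronous illustration with hypothetical handler functions; a real mirror sends the copy asynchronously so the shadow can never slow down the live request.

```python
from typing import Callable, List, Tuple

Handler = Callable[[str], str]


def handle_with_shadow(
    request: str,
    primary: Handler,
    shadow: Handler,
    mismatches: List[Tuple[str, str, str]],
) -> str:
    """Serve the primary response; record, but never return, the shadow's result."""
    response = primary(request)
    try:
        shadow_response = shadow(request)
        if shadow_response != response:
            mismatches.append((request, response, shadow_response))
    except Exception as exc:
        # A crash in the shadow version must not affect the user.
        mismatches.append((request, response, repr(exc)))
    return response


def stable_version(req: str) -> str:
    return req.upper()


def shadow_version(req: str) -> str:
    # Hypothetical new version that disagrees on one input.
    return "???" if req == "edge-case" else req.upper()


diffs: List[Tuple[str, str, str]] = []
print(handle_with_shadow("edge-case", stable_version, shadow_version, diffs))  # EDGE-CASE
print(len(diffs))  # 1
```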
A/B Deployments
A/B deployments route traffic based on user attributes such as geographic location, device type, or user ID. Unlike canary deployments (which are percentage-based), A/B deployments use targeting rules to select which users see which version. This is closely related to feature flags but operates at the infrastructure level.
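The difference from percentage-based routing is easiest to see in code: the decision depends on who the user is, not on a random draw. The rules, attribute names, and the `BETA_USERS` set below are invented for illustration only.

```python
from dataclasses import dataclass

# Hypothetical set of user IDs that opted in to the new version.
BETA_USERS = {1001, 1002}


@dataclass(frozen=True)
class Request:
    user_id: int
    country: str
    device: str


def choose_version(req: Request) -> str:
    """Apply targeting rules in order; fall back to version A."""
    if req.user_id in BETA_USERS:
        return "B"
    if req.country == "DE" and req.device == "mobile":
        return "B"
    return "A"


print(choose_version(Request(user_id=7, country="DE", device="mobile")))      # B
print(choose_version(Request(user_id=7, country="US", device="mobile")))      # A
print(choose_version(Request(user_id=1001, country="US", device="desktop")))  # B
```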
Strategy Comparison
| Strategy | Downtime | Rollback Speed | Infrastructure Cost | Risk Level | Complexity |
|---|---|---|---|---|---|
| Blue-Green | Zero | Instant | 2x | Low | Low |
| Canary | Zero | Fast | 1x + small | Very Low | Medium |
| Rolling | Zero | Slow | 1x | Medium | Low |
| Shadow | Zero | N/A | 2x | None | High |
| A/B | Zero | Fast | 1x + small | Low | High |
| Recreate | Yes | Slow | 1x | High | Very Low |
Rollback Strategies
No deployment strategy is complete without a rollback plan. Here are the key rollback approaches:
Instant Rollback (Blue-Green)
Switch traffic back to the previous environment. This is the fastest rollback method — measured in seconds, not minutes.
Progressive Rollback (Canary)
Set the canary weight back to 0%. Traffic immediately stops flowing to the new version. Then remove the canary deployment.
Rolling Rollback
Trigger a new rolling update that reverts to the previous version. In Kubernetes, this is as simple as:
kubectl rollout undo deployment/my-app
# Or roll back to a specific revision
kubectl rollout undo deployment/my-app --to-revision=3
Kubernetes Deployment Strategies
Kubernetes natively supports the RollingUpdate and Recreate strategies; Recreate terminates all old pods before starting new ones, which causes downtime. For canary and blue-green, teams typically add other tools:
- Istio / Linkerd: Service mesh for traffic splitting (canary, A/B)
- Argo Rollouts: Kubernetes controller for blue-green and canary with automated analysis
- Flagger: Progressive delivery tool that automates canary releases
For a deeper dive into Kubernetes concepts, see Kubernetes Architecture.
# Argo Rollouts canary strategy
# (selector and pod template omitted for brevity)
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-app
spec:
  replicas: 5
  strategy:
    canary:
      steps:
        - setWeight: 5
        - pause: { duration: 5m }
        - setWeight: 25
        - pause: { duration: 10m }
        - setWeight: 50
        - pause: { duration: 10m }
        - setWeight: 100
      canaryService: my-app-canary
      stableService: my-app-stable
      trafficRouting:
        istio:
          virtualService:
            name: my-app-vsvc
CI/CD Integration
Your deployment strategy should be automated through your CI/CD pipeline. A typical pipeline integrating deployment strategies looks like:
- Build: Compile, lint, unit test
- Package: Create container image, push to registry
- Deploy to staging: Full deployment to non-production environment
- Integration tests: Run e2e tests against staging
- Deploy canary: Release the new version and route 5% of production traffic to it
- Automated analysis: Compare canary metrics against baseline for 10 minutes
- Progressive rollout: If healthy, increase to 25%, 50%, 100%
- Rollback: If any step fails, automatically roll back
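The control flow of such a pipeline can be sketched as an ordered list of stages with a rollback hook. Everything here is hypothetical (stage names, the `run_pipeline` function, the lambdas standing in for real jobs); a CI/CD system would express the same idea in its own pipeline syntax.

```python
from typing import Callable, List, Sequence, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: Sequence[Stage], rollback: Callable[[], None]) -> List[str]:
    """Run stages in order; on the first failure, roll back and stop."""
    log: List[str] = []
    for name, action in stages:
        if action():
            log.append(name + ": ok")
            continue
        log.append(name + ": failed")
        rollback()
        log.append("rollback: done")
        break
    return log


# Stand-in stages: the automated analysis step reports a failure.
stages: List[Stage] = [
    ("build", lambda: True),
    ("deploy-canary", lambda: True),
    ("automated-analysis", lambda: False),
    ("progressive-rollout", lambda: True),
]
result = run_pipeline(stages, rollback=lambda: None)
print(result)
# ['build: ok', 'deploy-canary: ok', 'automated-analysis: failed', 'rollback: done']
```

Later stages never run after a failure, which is what keeps a bad canary from being promoted to 100%.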
The right deployment strategy depends on your infrastructure, risk tolerance, and organizational maturity. Start with rolling updates, which are built into Kubernetes and need no second environment; move to blue-green when you need near-instant rollback; and adopt canary when you need fine-grained control over how many users a release can affect. Combine any of these with feature flags to control feature exposure independently of deployment.