Kubernetes Cost Optimization: Beyond the Basics
The Kubernetes Cost Problem
Kubernetes is powerful, but it makes it easy to waste money. Many teams treat it as a lift-and-shift migration target without understanding its cost dynamics. The result? Bills 2-3x higher than they need to be.
Most cost savings conversations focus on:
- Resource requests and limits
- Node instance types
- Auto-scaling configuration
These are good starts, but they're table stakes. The real savings come from deeper optimization.
Level 1: Resource Requests & Limits
The foundation of K8s cost optimization is right-sizing containers.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-app
spec:
  containers:
    - name: app
      image: myapp:latest
      resources:
        requests:
          memory: "128Mi"
          cpu: "100m"
        limits:
          memory: "256Mi"
          cpu: "500m"
```
Impact: 15-20% savings from proper requests/limits configuration.
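Rather than guessing at values, you can let the Vertical Pod Autoscaler suggest them. A minimal sketch in recommendation-only mode (assumes the VPA CRDs and recommender are installed in the cluster; the target Deployment name `web-app` is illustrative):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app      # illustrative target
  updatePolicy:
    updateMode: "Off"  # recommend only; never evict pods
```

Running `kubectl describe vpa web-app-vpa` then shows the recommended requests, which you can fold back into your manifests.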
Level 2: Node Pool Optimization
Move beyond uniform node types:
```yaml
apiVersion: v1
kind: Node
metadata:
  labels:
    workload-type: compute-optimized
    spot: "true"
    gpu: "false"
```
Separate node pools for:
- Spot instances (non-critical workloads)
- Reserved instances (production baseline)
- On-demand (burstable capacity)
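Workloads are steered to the right pool with node selectors and tolerations. A sketch of the pod-spec side, assuming the spot pool carries the `spot: "true"` label shown above and a matching `spot=true:NoSchedule` taint (the taint key is an illustrative assumption):

```yaml
# Pod spec fragment for a non-critical workload
spec:
  nodeSelector:
    spot: "true"            # matches the node pool label above
  tolerations:
    - key: "spot"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"  # assumes the spot pool is tainted spot=true:NoSchedule
```

Tainting the spot pool keeps critical workloads off it by default; only pods that explicitly tolerate the taint land there.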
Impact: 30-40% additional savings through intelligent node selection.
Level 3: Pod Autoscaling & Bin Packing
Cluster Autoscaler is good, but it only removes nodes that are already empty. The Descheduler goes further: it evicts pods from underutilized nodes so they can be rescheduled more densely and the freed nodes drained away.
```shell
# Install the Descheduler to compact pods onto fewer nodes.
# The deprecated "stable" Helm repo no longer hosts it; use the official chart.
helm repo add descheduler https://kubernetes-sigs.github.io/descheduler/
helm install descheduler descheduler/descheduler --namespace kube-system
```
This consolidates workloads onto fewer nodes, reducing waste from fragmented scheduling.
Impact: 20-25% additional savings from better bin-packing.
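The thresholds that drive consolidation live in a `DeschedulerPolicy`. A sketch using the v1alpha2 policy format (the threshold numbers are illustrative; tune them for your cluster):

```yaml
apiVersion: "descheduler/v1alpha2"
kind: "DeschedulerPolicy"
profiles:
  - name: default
    pluginConfig:
      - name: "LowNodeUtilization"
        args:
          thresholds:        # nodes below these are considered underutilized
            cpu: 20
            memory: 20
          targetThresholds:  # evict until other nodes reach these levels
            cpu: 50
            memory: 50
    plugins:
      balance:
        enabled:
          - "LowNodeUtilization"
```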
Level 4: Workload-Specific Optimizations
Different workload types need different approaches:
Batch Jobs
- Use spot instances aggressively
- Distribute across multiple node pools
- Implement retries so spot interruptions don't lose work
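For example, a batch Job that tolerates a spot pool and retries pods lost to reclaims (the taint key and image name are illustrative assumptions):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-report          # illustrative
spec:
  backoffLimit: 6               # re-run pods evicted by spot reclaims
  template:
    spec:
      restartPolicy: OnFailure
      tolerations:
        - key: "spot"
          operator: "Equal"
          value: "true"
          effect: "NoSchedule"
      containers:
        - name: report
          image: myapp-batch:latest   # illustrative
```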
Long-running Services
- Use reserved instances as baseline
- Auto-scale based on business metrics, not CPU
- Implement proper graceful shutdown
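Graceful shutdown mostly comes down to two settings: a termination grace period long enough to drain in-flight work, and optionally a preStop hook so the pod stops receiving traffic before the process exits. A sketch (the durations are assumptions):

```yaml
# Pod spec fragment
spec:
  terminationGracePeriodSeconds: 60   # time allowed to finish in-flight requests
  containers:
    - name: app
      image: myapp:latest
      lifecycle:
        preStop:
          exec:
            # brief pause so load balancers remove the endpoint before SIGTERM matters
            command: ["sh", "-c", "sleep 10"]
```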
Development Environments
- Namespace resource quotas
- Auto-scaling to zero during off-hours
- Ephemeral storage optimization
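Namespace quotas put a hard ceiling on what a dev environment can consume. A minimal sketch (the namespace and limits are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: dev-quota
  namespace: dev          # illustrative namespace
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"
```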
Real Numbers
We helped an e-commerce company optimize their Kubernetes infrastructure:
| Stage | Monthly Cost | Cumulative Savings |
|-------|--------------|--------------------|
| Initial | $85,000 | - |
| After Level 1 | $72,000 | 15% |
| After Level 2 | $52,000 | 39% |
| After Level 3 | $44,000 | 48% |
| After Level 4 | $38,000 | 55% |
Tools That Help
- Kubecost: Cost allocation, chargeback, and monitoring
- Falco: Runtime security monitoring
- Kyverno: Policy as code (e.g., requiring resource requests) for cost governance
- Goldpinger: Node-to-node connectivity and latency visualization
The Culture Shift
Technical optimization only gets you so far. The real wins come from building a cost-aware culture:
- Educate developers on pod resource impact
- Make cost visible at deployment time
- Create accountability for resource consumption
- Celebrate cost optimization wins
Key Takeaways
- K8s cost optimization is layered—start with requests/limits, then move to infrastructure
- Spot instances should be your default for non-critical workloads
- Bin-packing and pod consolidation are underrated opportunities
- Automation prevents regression better than manual optimization
The companies that excel at Kubernetes cost optimization aren't just tech-savvy—they've built organizational practices around it.