Kubernetes Resource Limits and Requests

February 13, 2026 | Kubernetes Resources

Avoid OOMKilled pods with correct sizing.

Understanding Resource Requests and Limits

Misconfigured resource settings are among the most common causes of Kubernetes reliability issues. Requests determine scheduling, limits cap what a container can consume so that one workload cannot starve its neighbors, and the relationship between requests and limits determines a pod's Quality of Service class. Getting this right is critical for both stability and cost efficiency.

Requests vs Limits

Requests — The amount of CPU/memory reserved for a container. The scheduler uses requests, not actual usage, to decide which node can accommodate a pod. For example, if a node has 4 allocatable CPUs and 3.5 are already reserved by other pods' requests, a pod requesting 1 CPU won't fit.
Limits — The maximum amount a container can use. A container that exceeds its memory limit is OOMKilled. A container that reaches its CPU limit is throttled rather than killed.
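
The scheduler's fit check described above can be approximated in a few lines of Python. This is a minimal sketch for CPU only. The function names parse_cpu_millis and fits are hypothetical, and the real scheduler also weighs memory, other resources, taints, and affinity rules.

```python
def parse_cpu_millis(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as "250m", "1" or "0.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def fits(allocatable_cpu: str, existing_requests: list[str], new_request: str) -> bool:
    """Return True if new_request fits into the CPU not yet reserved by requests.

    Only requests count here; actual usage of the running pods is ignored,
    which mirrors how scheduling decisions are made.
    """
    reserved = sum(parse_cpu_millis(q) for q in existing_requests)
    free = parse_cpu_millis(allocatable_cpu) - reserved
    return parse_cpu_millis(new_request) <= free
```

With 4 allocatable CPUs and 3.5 already reserved, fits("4", ["2", "1500m"], "1") is False, while a 500m request still fits.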

Quality of Service (QoS) Classes

QoS Class	Condition	Eviction Priority
Guaranteed	Every container sets CPU and memory requests equal to its limits	Last to be evicted
Burstable	At least one container sets a request or limit, but the pod does not meet the Guaranteed criteria	Evicted after BestEffort
BestEffort	No container sets any requests or limits	First to be evicted
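
To make the table concrete, here is a rough Python sketch of how a QoS class could be derived from a pod's container specs. The function name qos_class is hypothetical, and the sketch compares quantities as plain strings, so it treats "1" and "1000m" as different values even though Kubernetes considers them equal.

```python
def qos_class(containers: list[dict]) -> str:
    """Classify a pod from its containers' `resources` blocks (simplified)."""
    any_set = False
    guaranteed = True
    for container in containers:
        resources = container.get("resources", {})
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        for name in ("cpu", "memory"):
            limit = limits.get(name)
            # When only a limit is set, the request defaults to the limit.
            request = requests.get(name, limit)
            if request is not None or limit is not None:
                any_set = True
            # Guaranteed needs a limit and an equal request for CPU and memory.
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"
```

A container with requests of 250m CPU / 256Mi memory and limits of 1 CPU / 512Mi memory comes out as Burstable under this sketch.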

Practical Sizing Strategy

Start with metrics — Run your application for at least 48 hours and observe actual CPU and memory usage
Set requests to P95 usage — This covers 95% of the usage samples you observed
Set limits to 2-3x requests — Allow burst capacity for traffic spikes
Never set CPU limits too tight — CPU throttling causes latency spikes that are hard to debug
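
The steps above can be sketched as a small calculation. This is an illustrative example, not a sizing tool: the function names percentile and suggest_resources are hypothetical, the samples are assumed to be usage readings in a single unit (for example MiB of memory), and the default factor of 2 is simply the low end of the 2-3x range mentioned above.

```python
import math


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample covering pct% of readings."""
    if not samples:
        raise ValueError("at least one usage sample is required")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def suggest_resources(samples: list[float], limit_factor: float = 2.0) -> dict:
    """Suggest a request at P95 usage and a limit at limit_factor times that."""
    request = percentile(samples, 95)
    return {"request": request, "limit": math.ceil(request * limit_factor)}
```

Feeding it 48 hours of memory readings in MiB gives a starting request and limit in MiB. Treat the result as a first guess to validate under load.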

Example Configuration

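apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
spec:
  replicas: 3
  selector:              # required for apps/v1; must match the template labels
    matchLabels:
      app: api-service   # illustrative label
  template:
    metadata:
      labels:
        app: api-service
    spec:
      containers:
      - name: api
        image: api:v1.2.0
        resources:
          requests:
            cpu: 250m      # 0.25 CPU cores
            memory: 256Mi  # 256 MiB
          limits:
            cpu: "1"       # 1 CPU core (4x request)
            memory: 512Mi  # 512 MiB (2x request)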

Common Mistakes

No requests set — Pods get BestEffort QoS and are first to be evicted under pressure
Requests too high — Wastes cluster resources; nodes look full to the scheduler while their actual utilization stays low
Memory limits too low — Causes frequent OOMKills and pod restarts
CPU limits too tight — Causes throttling; the application appears slow even though averaged metrics show it using less than 50% CPU
Same values for all services — A Go API server needs different resources than a Java application

Detecting OOMKilled Pods

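# Find OOMKilled pods (prints namespace/name once per pod)
kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(any(.status.containerStatuses[]?; .lastState.terminated.reason == "OOMKilled")) | "\(.metadata.namespace)/\(.metadata.name)"'

# Check current per-container usage (requires metrics-server), then compare it against the configured limits
kubectl top pods --containers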
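
If jq is not available, the same filter can be written in Python against the JSON that kubectl get pods -o json produces. This is a sketch under that assumption. The function name oomkilled_pods is hypothetical, and it only inspects each container's last terminated state, as the command above does.

```python
def oomkilled_pods(pod_list: dict) -> list[str]:
    """Return "namespace/name" for every pod with a container last killed by OOM."""
    found = []
    for pod in pod_list.get("items", []):
        statuses = pod.get("status", {}).get("containerStatuses") or []
        for status in statuses:
            terminated = (status.get("lastState") or {}).get("terminated") or {}
            if terminated.get("reason") == "OOMKilled":
                meta = pod["metadata"]
                found.append(f'{meta.get("namespace", "default")}/{meta["name"]}')
                break  # report each pod once, even if several containers were killed
    return found
```

To use it, parse the kubectl output with json.load and pass the resulting dictionary to the function.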

Vertical Pod Autoscaler (VPA)

The Vertical Pod Autoscaler, an add-on that must be installed in the cluster separately, can recommend or automatically set resource values based on actual usage:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service
  updatePolicy:
    updateMode: "Off"  # Start with recommendations only

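Use updateMode: "Off" initially to get recommendations without automatic changes. You can read them with kubectl describe vpa api-vpa. Review the suggestions before switching to "Auto", which can evict and recreate pods to apply new values.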
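
For reviewing recommendations across many services, the target values can be pulled out of the VPA object programmatically. The sketch below assumes the object was fetched with kubectl get vpa api-vpa -o json and that recommendations appear under status.recommendation.containerRecommendations. The function name vpa_targets is hypothetical.

```python
def vpa_targets(vpa: dict) -> dict[str, dict[str, str]]:
    """Map each container name to its recommended target CPU and memory."""
    status = vpa.get("status") or {}
    recommendation = status.get("recommendation") or {}
    recs = recommendation.get("containerRecommendations") or []
    # A VPA that has not gathered enough data yet has no recommendations,
    # in which case the result is an empty dict.
    return {rec["containerName"]: dict(rec.get("target", {})) for rec in recs}
```

Comparing these targets against the requests in your Deployment shows where a service is over- or under-provisioned before you let the autoscaler act on its own.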

Eazy SaaS Tip: We run a resource audit for our Kubernetes clients quarterly. In most cases, we find 30-40% of cluster resources are allocated but unused. Right-sizing these deployments directly reduces your cloud bill.
