🧠 Concept 13: Resource Requests & Limits (Scheduling + Performance 💯)


🚀 1. Core Idea (1-line)

👉 Requests = resources reserved for a container at scheduling time, Limits = maximum resources it may use at runtime


🧠 2. Why This Concept Exists (VERY IMPORTANT ⚠️)

Without limits:

  • One pod can consume all the CPU/RAM on its node ❌

  • Other apps on that node starve 😡

👉 Cluster becomes unstable


💡 3. Two Key Terms


🟢 1. Requests (Minimum guarantee)

👉 Scheduler uses this to decide:

  • Where to place the pod

Example:

requests:
  cpu: "500m"
  memory: "256Mi"

👉 Means:

  • Needs at least 0.5 CPU (500 millicores)

  • Needs at least 256 MiB RAM


🔴 2. Limits (Maximum cap)

limits:
  cpu: "1"
  memory: "512Mi"

👉 Means:

  • Cannot exceed 1 CPU

  • Cannot exceed 512 MiB RAM


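⚙️ 4. How Scheduling Works (VERY IMPORTANT 🔥)

👉 Kubernetes scheduler checks:

  • Does a node have enough unreserved capacity (allocatable minus the requests of pods already on it) for the pod's requests?
    ✅ Yes → Pod scheduled
    ❌ No → Pod stays Pending

👉 Only requests count here; limits and actual usage are ignored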
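
The check above can be sketched in a few lines of Python. This is a simplified illustration, not the real scheduler: the `Node` class and `schedule` function are hypothetical names, resources are plain integers (millicores and MiB), and the first fitting node wins, whereas the real scheduler also scores the nodes that pass the filter.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical, simplified view of a node (not a Kubernetes API type)."""
    name: str
    allocatable_cpu_m: int      # millicores available to pods
    allocatable_mem_mi: int     # MiB available to pods
    # (cpu_m, mem_mi) requests of pods already placed on this node
    placed_requests: list = field(default_factory=list)

    def free(self):
        """Capacity not yet reserved by the requests of placed pods."""
        used_cpu = sum(cpu for cpu, _ in self.placed_requests)
        used_mem = sum(mem for _, mem in self.placed_requests)
        return self.allocatable_cpu_m - used_cpu, self.allocatable_mem_mi - used_mem


def schedule(nodes, cpu_m, mem_mi):
    """Return the name of the first node that fits the requests, else None (Pending)."""
    for node in nodes:
        free_cpu, free_mem = node.free()
        if cpu_m <= free_cpu and mem_mi <= free_mem:
            node.placed_requests.append((cpu_m, mem_mi))
            return node.name
    return None


nodes = [
    Node("node-a", allocatable_cpu_m=1000, allocatable_mem_mi=1024,
         placed_requests=[(800, 512)]),
    Node("node-b", allocatable_cpu_m=2000, allocatable_mem_mi=4096),
]

first = schedule(nodes, cpu_m=500, mem_mi=256)     # node-a has only 200m free
second = schedule(nodes, cpu_m=4000, mem_mi=256)   # no node has 4 CPUs free
print(first, second)
```

The first pod lands on node-b because node-a has only 200m unreserved, no matter how idle node-a actually is. The second pod fits nowhere, which is the Pending case.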

💥 5. Runtime Behavior

CPU:

  • Can burst above request (until limit)

  • If it tries to exceed the limit → throttled ⚠️ (slowed down, not killed)

Memory:

  • If it exceeds the limit → container is OOMKilled 💀 and restarted per the pod's restartPolicy
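
To make CPU throttling concrete: on Linux nodes a CPU limit is typically enforced as a time quota per scheduling period. The sketch below assumes the common default period of 100 ms; the function names are made up for illustration and the model ignores multi-threading details.

```python
CFS_PERIOD_US = 100_000  # assumed default period: 100 ms, in microseconds


def cpu_quota_us(limit_millicores, period_us=CFS_PERIOD_US):
    """CPU time (microseconds) the container may use in each period."""
    return limit_millicores * period_us // 1000


def throttled_us(demand_us, limit_millicores, period_us=CFS_PERIOD_US):
    """CPU time the container wanted in one period but had to wait for."""
    return max(0, demand_us - cpu_quota_us(limit_millicores, period_us))


quota = cpu_quota_us(500)             # limit of 500m
waiting = throttled_us(80_000, 500)   # container wants 80 ms of CPU in one period
print(quota, waiting)
```

With a 500m limit the container gets 50 ms of CPU time per 100 ms period. If it wants 80 ms, the remaining 30 ms waits for the next period, which shows up as latency rather than a crash.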

📦 6. Example YAML

resources:
  requests:
    cpu: "200m"
    memory: "128Mi"
  limits:
    cpu: "500m"
    memory: "256Mi"
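
For context, the `resources` block sits inside each container of a pod spec. A minimal sketch that builds a full Pod manifest as a Python dict and prints it as JSON follows. The pod name, container name and image are placeholders chosen for this example, not values from the notes above.

```python
import json

# Same values as the YAML example above
resources = {
    "requests": {"cpu": "200m", "memory": "128Mi"},
    "limits": {"cpu": "500m", "memory": "256Mi"},
}

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo-app"},          # hypothetical pod name
    "spec": {
        "containers": [
            {
                "name": "app",                   # hypothetical container name
                "image": "nginx",                # placeholder image
                "resources": resources,          # set per container, not per pod
            }
        ]
    },
}

manifest = json.dumps(pod, indent=2)
print(manifest)
```

Since kubectl accepts JSON as well as YAML, the printed manifest could be saved to a file and passed to `kubectl apply -f`.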

🧠 7. VERY IMPORTANT Concepts

👉 CPU Units:

  • 1000m = 1 CPU (one core or vCPU)

  • 500m = 0.5 CPU

👉 Memory Units:

  • Mi = Mebibyte (1024 × 1024 bytes)

  • Gi = Gibibyte (1024 Mi)
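
The unit rules above can be checked with a small converter. This is a sketch that handles only a common subset of Kubernetes quantity suffixes (for example, no exponent notation), and the function names are made up for this example.

```python
from decimal import Decimal

_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
_DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def cpu_to_millicores(quantity):
    """Convert a CPU quantity such as "500m", "1" or "0.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(Decimal(quantity) * 1000)


def memory_to_bytes(quantity):
    """Convert a memory quantity such as "256Mi" or "1G" to bytes."""
    # Binary suffixes are checked first; "Mi" never matches "M" because of the trailing "i".
    for suffix, factor in {**_BINARY, **_DECIMAL}.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    return int(quantity)  # plain number of bytes


print(cpu_to_millicores("500m"), cpu_to_millicores("1"))
print(memory_to_bytes("256Mi"), memory_to_bytes("256M"))
```

Note the difference between `256Mi` (268,435,456 bytes) and `256M` (256,000,000 bytes). Mixing the two up is an easy way to set a limit about 5% lower than intended.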


🔥 8. Real-world DevOps Insight

For your ML workloads 👀:

  • Inference → moderate CPU + memory

  • Training → high CPU/GPU

👉 Set proper limits to avoid:

  • Node instability (e.g. the node running out of memory)

  • Resource starvation


⚠️ 9. Common Mistakes (INTERVIEW TRAPS)

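❌ Not setting requests → bad scheduling (nodes get overpacked)
❌ Setting limits too low → CPU throttling or OOMKill
❌ Setting requests too high → wasted capacity (reserved but unused); limits far above requests → overcommitted nodes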


💼 10. Interview Answer

👉 “Resource requests define the minimum resources required for scheduling, while limits define the maximum resources a container can consume, ensuring fair resource usage and cluster stability.”


⚑ 11. CKA Commands

kubectl describe pod <name>

👉 Shows:

  • Requests

  • Limits

  • OOMKilled (as the Reason under the container's Last State)
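
The same information is available in machine-readable form from `kubectl get pod <name> -o json`. A minimal sketch of picking out OOMKilled containers from that JSON follows. The helper name is made up, and the sample dict is hand-written and trimmed to the fields used here, not real cluster output.

```python
def oom_killed_containers(pod):
    """Names of containers whose current or previous run ended with OOMKilled."""
    names = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        for key in ("state", "lastState"):
            terminated = status.get(key, {}).get("terminated")
            if terminated and terminated.get("reason") == "OOMKilled":
                names.append(status["name"])
                break  # count each container once
    return names


# Hand-written sample, trimmed to the fields used above (not real cluster output).
sample_pod = {
    "status": {
        "containerStatuses": [
            {"name": "app", "restartCount": 3, "state": {"running": {}},
             "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}},
            {"name": "sidecar", "restartCount": 0, "state": {"running": {}},
             "lastState": {}},
        ]
    }
}

killed = oom_killed_containers(sample_pod)
print(killed)
```

In practice the dict would come from `json.load` on the kubectl output. A container that keeps appearing here is a hint that its memory limit is too low for the workload.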


🧠 12. Memory Trick

👉 Request = reservation 🪑
👉 Limit = boundary 🚧


🔥 13. Pro Insight (Real-world)

  • Always set requests + limits

  • Use:

    • HPA (autoscaling)

    • Metrics (Prometheus)

👉 Optimize based on usage


🚀 Next Step

Say:

👉 “next”

Then we go to:
🔥 Concept 14: HPA (Horizontal Pod Autoscaling 💯, VERY IMPORTANT FOR REAL WORLD)