Kubernetes Tutorial for Beginners to Advanced with Practical Projects and YAML Examples
Kubernetes Security: RBAC, PSA, and OPA
Security in Kubernetes is multi-layered. A misconfigured cluster can expose your entire infrastructure — a rogue pod can read all Secrets, escape to the host, or pivot to cloud provider APIs. In this chapter, we cover the three core security pillars: RBAC (who can do what in the cluster), Pod Security Admission (what containers are allowed to do on the node), and OPA/Gatekeeper (policy enforcement as code). This is what separates a CKS-level engineer from a beginner.
RBAC: Role-Based Access Control
RBAC controls which users, service accounts, and groups can perform which actions on which resources. Every kubectl command becomes an API request that the API server must authorize, and in most clusters RBAC is the authorizer that makes that decision. Understanding RBAC is mandatory for any production cluster because it is the primary access control mechanism.
Role: Defines permissions within a namespace. Contains rules: which resources (pods, services, deployments) and which verbs (get, list, create, delete, patch). Namespaced resource only.
ClusterRole: Same as Role but cluster-wide. Can grant access to non-namespaced resources (nodes, PersistentVolumes, namespaces themselves, ClusterRoles). Also reusable across namespaces via RoleBinding.
RoleBinding: Binds a Role (or ClusterRole) to a subject (User, Group, ServiceAccount) within a namespace. The binding lives in a namespace; the permissions apply only to that namespace.
ClusterRoleBinding: Binds a ClusterRole to a subject cluster-wide. Use sparingly: it grants the subject permissions in all namespaces. Suitable for cluster admins, monitoring agents, and CI/CD systems.
# =============================================
# SCENARIO: Developer access — can view pods/logs in staging
# but CANNOT create, delete, or modify anything
# =============================================
# 1. Create the Role (namespaced — staging only)
# The manifests for this scenario (developer-role.yaml, dev-rolebinding.yaml, cicd-role.yaml,
# prometheus-rbac.yaml) were written with heredocs whose bodies are not preserved here.
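As an illustration of the read-only developer scenario above, a namespaced Role and RoleBinding could look roughly like the sketch below. The object names (developer-readonly, developer-readonly-binding) and the developers group are assumptions made for this example, not values taken from the original manifests.

```yaml
# Sketch only: read-only access to pods and their logs in the staging namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: developer-readonly             # hypothetical name
  namespace: staging
rules:
  - apiGroups: [""]                    # "" is the core API group
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]    # no create, delete, patch, or update
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-readonly-binding     # hypothetical name
  namespace: staging
subjects:
  - kind: Group
    name: developers                   # hypothetical group supplied by your authenticator
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: developer-readonly
  apiGroup: rbac.authorization.k8s.io
```

Because RBAC is additive and has no deny rules, leaving the write verbs out of the Role is what keeps this subject read-only. A check such as kubectl auth can-i delete pods -n staging --as some-user --as-group developers should then be refused, provided no other binding grants that permission.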
Pod Security Admission (PSA): Replacing PodSecurityPolicies
Pod Security Admission (PSA) is the built-in replacement for PodSecurityPolicy (PSP). It reached stable (GA) in Kubernetes 1.25, the same release that removed the long-deprecated PSP API. PSA enforces the Pod Security Standards at the namespace level: you label a namespace with one of three built-in levels. These levels prevent dangerous configurations such as running as root, privilege escalation, and host network access.
Privileged: No restrictions, so a pod can do anything. Only for system-level workloads like CNI plugins and node agents. Never use for application workloads.
Baseline: Prevents known privilege escalation. Blocks: hostNetwork, hostPID, hostPath, privileged containers, and capability additions beyond a small default set. Good for most application workloads.
Restricted: All baseline restrictions plus: must run as non-root, must drop ALL capabilities (only NET_BIND_SERVICE may be added back), must set a seccomp profile, no privilege escalation allowed. Use for sensitive workloads (payment, PII data). May require image changes.
# =============================================
# Applying PSA via Namespace Labels
# =============================================
# Three modes per policy:
# enforce — rejects violating pods
# audit — allows but logs violation to audit log
# warn — allows but returns a warning to the user
# Production: strict enforcement
kubectl label namespace production \
pod-security.kubernetes.io/enforce=restricted \
pod-security.kubernetes.io/enforce-version=latest \
pod-security.kubernetes.io/warn=restricted \
pod-security.kubernetes.io/warn-version=latest
# Staging: enforce baseline, but warn and audit against restricted (see restricted violations without blocking)
kubectl label namespace staging \
pod-security.kubernetes.io/enforce=baseline \
pod-security.kubernetes.io/warn=restricted \
pod-security.kubernetes.io/audit=restricted
# System namespaces: privileged (required for system components)
kubectl label namespace kube-system \
pod-security.kubernetes.io/enforce=privileged
# =============================================
# Pod spec that passes 'restricted' PSA
# =============================================
apiVersion: v1
kind: Pod
metadata:
  name: secure-api
  namespace: production
spec:
  securityContext:
    runAsNonRoot: true        # Must not run as root (UID 0)
    runAsUser: 1000           # Specific non-root UID
    runAsGroup: 3000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault    # Use container runtime's default seccomp profile
  containers:
    - name: api
      image: mycompany/api:v1.0
      securityContext:
        allowPrivilegeEscalation: false   # Cannot gain more privileges than parent process
        readOnlyRootFilesystem: true      # Container cannot write to its root filesystem
        capabilities:
          drop: ["ALL"]                   # Drop ALL Linux capabilities (restricted requires this)
          add: ["NET_BIND_SERVICE"]       # Only add back what you need (bind port < 1024)
      volumeMounts:
        - name: tmp
          mountPath: /tmp                 # Writable scratch space (since rootfs is read-only)
        - name: cache
          mountPath: /app/cache
  volumes:
    - name: tmp
      emptyDir: {}
    - name: cache
      emptyDir: {}
# =============================================
# Test if your pod would pass PSA
# =============================================
# Dry-run deploy to see PSA warnings
kubectl apply --dry-run=server -f pod.yaml -n production
# Warning: would violate PodSecurity "restricted:latest": ...
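# Check existing pods in a namespace for PSA violations without changing anything:
# a server-side dry run of the enforce label returns a warning for each violating pod
kubectl label --dry-run=server --overwrite namespace staging \
pod-security.kubernetes.io/enforce=restricted
# To record violations from newly created or updated pods, set the audit label
kubectl label namespace staging pod-security.kubernetes.io/audit=restricted --overwrite
# Then check the API server audit log for violations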
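The same labels can also be kept declaratively, which fits the GitOps workflow used later in this tutorial. A minimal sketch of the production namespace as a manifest, using the same label keys and values as the kubectl label command above:

```yaml
# Declarative equivalent of the "kubectl label namespace production ..." command.
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
```

The version labels accept either latest or a specific minor version such as v1.29. Pinning a version is a design choice that keeps policy behavior stable across cluster upgrades, at the cost of having to bump it deliberately.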
OPA Gatekeeper: Policy as Code
PSA handles pod-level security, but OPA Gatekeeper handles complex organizational policies — required labels, allowed image registries, resource limit enforcement, naming conventions, and more. It uses Rego (a policy language) to define rules and runs as an admission webhook, blocking non-compliant resources before they are created.
# Install OPA Gatekeeper
helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts
helm install gatekeeper gatekeeper/gatekeeper \
--namespace gatekeeper-system \
--create-namespace \
--set replicas=3 \
--set auditInterval=60
# =============================================
# Policy 1: All pods MUST have resource limits
# Without limits, a pod can starve all other pods on the node
# =============================================
# Step 1: Define the ConstraintTemplate (the policy logic in Rego)
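# The heredocs for require-resource-limits-template.yaml, require-resource-limits-constraint.yaml,
# allowed-registries-template.yaml, and allowed-registries-constraint.yaml are not preserved here.
cat > required-labels.yaml <<'EOF'
# The ConstraintTemplate that opens this file is not preserved either. Its Rego rule ended with:
#       msg := sprintf("Missing required labels: %v", [missing])
#     }
---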
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: RequiredLabels
metadata:
  name: deployment-must-have-labels
spec:
  enforcementAction: deny
  match:
    kinds:
      - apiGroups: ["apps"]
        kinds: ["Deployment"]
  parameters:
    labels:
      - "app"
      - "team"          # Which team owns this service
      - "cost-center"   # For cloud cost allocation
      - "version"       # Application version
EOF
kubectl apply -f required-labels.yaml
# Verify all constraints
kubectl get constraints
# NAME ENFORCEMENT-ACTION TOTAL-VIOLATIONS
# must-have-resource-limits deny 3
# only-approved-registries deny 0
# deployment-must-have-labels deny 12
# See details of violations
kubectl describe requireresourcelimits must-have-resource-limits
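The RequiredLabels constraint above only works once a ConstraintTemplate has defined that kind. A minimal sketch, modeled on the required-labels example in the Gatekeeper documentation, might look like the following. The package name, the parameter schema, and every Rego line other than the final sprintf are assumptions, since the original template is not shown here.

```yaml
# Sketch of a ConstraintTemplate that creates the RequiredLabels constraint kind.
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: requiredlabels            # must be the lowercase form of the kind below
spec:
  crd:
    spec:
      names:
        kind: RequiredLabels
      validation:
        openAPIV3Schema:
          type: object
          properties:
            labels:               # matches spec.parameters.labels in the constraint
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package requiredlabels

        violation[{"msg": msg}] {
          # Label keys present on the object under review
          provided := {label | input.review.object.metadata.labels[label]}
          # Label keys demanded by the constraint parameters
          required := {label | label := input.parameters.labels[_]}
          missing := required - provided
          count(missing) > 0
          msg := sprintf("Missing required labels: %v", [missing])
        }
```

The template holds the policy logic once, and each Constraint then supplies the parameters and the match scope, which is why one template can back several constraints with different label lists.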
Enterprise Security Hardening: SOC2 Compliant Kubernetes Cluster
Scenario: FinSecure Corp is undergoing a SOC2 Type II audit. The auditors require: least-privilege RBAC, no containers running as root, all images from approved registries, audit logging enabled, and no privileged containers in production. You need to implement a complete security hardening checklist.
# =============================================
# PHASE 1: Namespace Security Labeling
# =============================================
for ns in production staging; do
kubectl label namespace $ns \
pod-security.kubernetes.io/enforce=restricted \
pod-security.kubernetes.io/warn=restricted \
pod-security.kubernetes.io/audit=restricted \
--overwrite
done
# =============================================
# PHASE 2: RBAC — Principle of Least Privilege
# =============================================
# Remove default service account auto-mount
# By default, Kubernetes mounts the service account token in every pod
# This is a security risk — pods can call the Kubernetes API!
kubectl patch serviceaccount default -n production \
-p '{"automountServiceAccountToken": false}'
# Create specific service accounts for apps that NEED API access
kubectl create serviceaccount payment-service -n production
kubectl patch serviceaccount payment-service -n production \
-p '{"automountServiceAccountToken": true}'
# =============================================
# PHASE 3: Audit Logging (API Server config)
# =============================================
# audit-policy.yaml and no-privileged.yaml were written here with heredocs whose bodies are not preserved.
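For the audit logging phase, an API server audit policy could be sketched as below. The choice of which resources get which level is an assumption made for illustration; the original audit-policy.yaml is not shown here.

```yaml
# Sketch of an audit policy; rules are evaluated top to bottom and the first match wins.
apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - RequestReceived               # skip the earliest stage to reduce log volume
rules:
  # Record who touched Secrets and ConfigMaps, but never their contents
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
  # Record full request and response bodies for RBAC changes
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
  # Everything else: metadata only
  - level: Metadata
```

On a self-managed control plane the policy takes effect when the API server is started with --audit-policy-file pointing at this file and --audit-log-path set to a log destination. Managed services typically expose audit logging through their own settings instead.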
Security Troubleshooting Guide
Problem: a pod is rejected by Pod Security Admission. Debug: in enforce mode the pod is never created, so read the rejection message returned by kubectl apply, or for a Deployment run kubectl describe replicaset replicaset-name and look for the PSA violation message in the events. Common fixes: set runAsNonRoot: true, allowPrivilegeEscalation: false, capabilities.drop: [ALL], seccompProfile.type: RuntimeDefault. Use PSA in warn mode first to detect violations before enforcing.
Problem: a user or service account gets 403 Forbidden. Debug: Run kubectl auth can-i get pods -n namespace --as username to confirm. Check if a RoleBinding exists: kubectl get rolebindings -n namespace -o yaml | grep username. Remember: RBAC is additive, so the user needs at least one binding that grants the permission.
Problem: Gatekeeper constraints block system pods. Fix: Always add excludedNamespaces: ["kube-system", "gatekeeper-system", "cert-manager"] under spec.match in all constraints. System pods need privileged access and will fail restricted policies. Test constraints with enforcementAction: dryrun before switching to deny.
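The Gatekeeper advice above can be sketched as a constraint that runs in dry-run mode and skips system namespaces. The kind RequireResourceLimits is inferred from the kubectl describe requireresourcelimits command used earlier; treat it, and the match on Pods, as assumptions.

```yaml
# Sketch: audit-only rollout of a constraint that excludes system namespaces.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: RequireResourceLimits             # assumed kind, inferred from the earlier describe command
metadata:
  name: must-have-resource-limits
spec:
  enforcementAction: dryrun             # report violations without rejecting requests
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    excludedNamespaces: ["kube-system", "gatekeeper-system", "cert-manager"]
```

While the constraint is in dryrun, violations still show up in its status and in kubectl get constraints. Once the count is zero or understood, changing enforcementAction to deny turns it into a blocking policy.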
Interview Questions — Chapter 11
- Explain the difference between Role and ClusterRole. When would you use a ClusterRole bound with a RoleBinding (not ClusterRoleBinding)?
- A pod is making API calls to the Kubernetes API server. How is it authenticated? What security risk does the default behavior create?
- What are the three PSA modes (enforce, audit, warn)? What is the recommended migration strategy from PSP to PSA?
- What does capabilities.drop: [ALL] do in a container's securityContext? Why is it required for the restricted PSA profile?
- What is OPA Gatekeeper and how does it differ from PSA? Can they be used together?
- What is a ConstraintTemplate vs a Constraint in OPA Gatekeeper?
- Describe the 4 Cs of Cloud Native Security.
- A developer's kubectl commands are being rejected with 403. Walk through your RBAC debugging steps.
Monitoring with Prometheus and Grafana
You cannot run production Kubernetes without observability. When a pod crashes at 3am, you need to know why — was it OOM killed? CPU throttled? Did the upstream API start returning 500s? The Prometheus + Grafana stack is the industry standard for Kubernetes monitoring. Together with Alertmanager, it provides metrics collection, dashboards, and automated incident alerting.
The Observability Stack: Architecture Overview
┌─────────────────────────────────────────────────────────────────────┐
│ Kubernetes Cluster │
│ │
│ Application Pods kube-state-metrics node-exporter │
│ /metrics endpoint ──┐ (cluster objects) ──┐ (node CPU/mem) ──┐│
│ │ │ ││
│ ▼ ▼ ▼│
│ ┌─────────────────────────────────────────────┐ │
│ │ PROMETHEUS SERVER │ │
│ │ - Scrapes /metrics every 15 seconds │ │
│ │ - Stores time-series data (TSDB) │ │
│ │ - Evaluates alerting rules │ │
│ └────────────┬───────────────────┬────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────┐ ┌─────────────────┐ │
│ │ GRAFANA │ │ ALERTMANAGER │ │
│ │ Dashboards │ │ PagerDuty/Slack│ │
│ │ Alerts UI │ │ Email/Webhook │ │
│ └──────────────┘ └─────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
Installing kube-prometheus-stack (One Command Setup)
# kube-prometheus-stack installs EVERYTHING:
# Prometheus + Alertmanager + Grafana + node-exporter + kube-state-metrics
# + pre-built Kubernetes dashboards + default alerting rules
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Install with production-ready values
# monitoring-values.yaml was written here with a heredoc whose body is not preserved.
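A values file for the kube-prometheus-stack chart might look roughly like the sketch below. The retention period, storage size, and password are placeholder assumptions, not the original values; check the keys against the values.yaml of the chart version you install.

```yaml
# monitoring-values.yaml (illustrative values only)
grafana:
  adminPassword: "change-me"            # placeholder; prefer an existing Secret in production
prometheus:
  prometheusSpec:
    retention: 15d                      # assumed retention period
    # Let Prometheus pick up ServiceMonitors and PrometheusRules that do not
    # carry the Helm release label
    serviceMonitorSelectorNilUsesHelmValues: false
    ruleSelectorNilUsesHelmValues: false
    storageSpec:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 50Gi             # assumed size
```

Without persistent storage, Prometheus keeps its data in an emptyDir and loses history whenever the pod is rescheduled, which is why the storageSpec block is usually one of the first things set for production.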
ServiceMonitor: Scraping Your Application's Metrics
# ServiceMonitor tells Prometheus HOW and WHERE to scrape metrics from your app
# Your app must expose a /metrics endpoint in Prometheus format
# Example: Node.js API with prom-client metrics
# In your app: const promClient = require('prom-client');
# promClient.collectDefaultMetrics(); // CPU, memory, event loop lag
# app.get('/metrics', async (req, res) => { res.end(await promClient.register.metrics()); });
# 1. Your app's service must have a port named "metrics"
apiVersion: v1
kind: Service
metadata:
  name: api-service
  namespace: production
  labels:
    app: api-service
    monitoring: "true"     # Label used by ServiceMonitor selector
spec:
  selector:
    app: api-service
  ports:
    - name: http
      port: 80
      targetPort: 3000
    - name: metrics        # Named metrics port, required for ServiceMonitor
      port: 9090
      targetPort: 9090
---
# 2. ServiceMonitor: tells Prometheus to scrape this service
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: api-service-monitor
  namespace: production
  labels:
    release: kube-prometheus-stack   # Needed when Prometheus selects ServiceMonitors by release label (the chart default)
spec:
  selector:
    matchLabels:
      monitoring: "true"   # Must match labels on the Service
  namespaceSelector:
    matchNames:
      - production
  endpoints:
    - port: metrics        # Must match the named port on the Service
      interval: 15s        # Scrape every 15 seconds
      path: /metrics
      scheme: http
      # For authenticated endpoints:
      # bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
# 3. Verify Prometheus is scraping your app
# Port-forward Prometheus UI
kubectl port-forward svc/kube-prometheus-stack-prometheus -n monitoring 9090:9090
# Open: http://localhost:9090/targets
# Find your service — status should be "UP"
# =============================================
# Useful PromQL Queries for Kubernetes
# =============================================
# CPU usage by namespace (fraction of CPU requests; multiply by 100 for a percentage)
sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (namespace)
/ sum(kube_pod_container_resource_requests{resource="cpu"}) by (namespace)
# Memory usage by pod
sum(container_memory_working_set_bytes{container!=""}) by (pod, namespace)
# Pod restart count (last 1 hour)
increase(kube_pod_container_status_restarts_total[1h]) > 0
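# HTTP error rate for your service (needs your app to expose http_requests_total)
# Both sides are summed first; dividing the raw series would only match 5xx series with themselves
sum(rate(http_requests_total{status=~"5..", job="api-service"}[5m]))
/ sum(rate(http_requests_total{job="api-service"}[5m]))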
# P99 request latency
histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket{job="api-service"}[5m])) by (le))
# Node disk usage
(node_filesystem_size_bytes - node_filesystem_free_bytes) / node_filesystem_size_bytes * 100
# Pods in CrashLoopBackOff
kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff"} == 1
PrometheusRule: Defining Alerts
# PrometheusRule defines alerting and recording rules
# These are automatically loaded by the Prometheus Operator
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: production-alerts
  namespace: production
  labels:
    release: kube-prometheus-stack   # Must match the Prometheus operator's ruleSelector
spec:
  groups:
    - name: pod-alerts
      interval: 30s
      rules:
        # Alert: Pod crash looping
        - alert: PodCrashLooping
          expr: |
            increase(kube_pod_container_status_restarts_total[15m]) > 3
          for: 5m   # Must be true for 5 minutes before firing
          labels:
            severity: critical
            team: "{{ $labels.namespace }}"
          annotations:
            summary: "Pod {{ $labels.pod }} is crash looping"
            description: "Container {{ $labels.container }} in pod {{ $labels.pod }} has restarted {{ $value }} times in 15 minutes."
            runbook_url: "https://wiki.company.com/runbooks/pod-crash-loop"
        # Alert: High memory usage
        - alert: HighMemoryUsage
          expr: |
            (container_memory_working_set_bytes{container!=""}
              / on(pod, container, namespace)
            kube_pod_container_resource_limits{resource="memory"}) > 0.9
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Container {{ $labels.container }} memory usage is above 90%"
            description: "Memory usage: {{ $value | humanizePercentage }}"
        # Alert: Deployment has 0 replicas available
        - alert: DeploymentHasNoReplicas
          expr: |
            kube_deployment_status_replicas_available == 0
          for: 2m
          labels:
            severity: critical
          annotations:
            summary: "Deployment {{ $labels.deployment }} has 0 available replicas"
            description: "Service may be completely down."
    - name: sla-alerts
      rules:
        # Alert: High HTTP error rate (> 1% errors for 5 min)
        - alert: HighHTTPErrorRate
          expr: |
            sum(rate(http_requests_total{status=~"5.."}[5m])) by (service)
            / sum(rate(http_requests_total[5m])) by (service) > 0.01
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "High error rate on {{ $labels.service }}"
            description: "Error rate: {{ $value | humanizePercentage }}"
        # Alert: P99 latency above 2 seconds
        - alert: HighP99Latency
          expr: |
            histogram_quantile(0.99,
              sum(rate(http_request_duration_seconds_bucket[5m])) by (le, service)
            ) > 2
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "P99 latency on {{ $labels.service }} above 2s"
Alertmanager: Routing Alerts to PagerDuty and Slack
# Configure Alertmanager routing
# Critical alerts → PagerDuty (wake someone up)
# Warning alerts → Slack (non-urgent notification)
kubectl create secret generic alertmanager-config \
--from-literal=alertmanager.yaml='
global:
  resolve_timeout: 5m
  pagerduty_url: "https://events.pagerduty.com/v2/enqueue"
route:
  group_by: ["alertname", "namespace"]
  group_wait: 30s            # Wait 30s to batch alerts
  group_interval: 5m
  repeat_interval: 4h        # Re-alert every 4 hours if unresolved
  receiver: slack-warnings   # Default receiver, used only when no route below matches
  routes:
    # Critical alerts go to PagerDuty (immediate human response)
    - match:
        severity: critical
      receiver: pagerduty-critical
      continue: true         # Keep evaluating the sibling routes below
    # Critical alerts are also posted to Slack for visibility
    - match:
        severity: critical
      receiver: slack-warnings
      continue: true
    # Database alerts get their own Slack channel
    - match:
        team: database
      receiver: slack-database
receivers:
  - name: pagerduty-critical
    pagerduty_configs:
      - service_key: "YOUR_PAGERDUTY_SERVICE_KEY"
        description: "{{ range .Alerts }}{{ .Annotations.summary }}\n{{ end }}"
        details:
          runbook: "{{ .CommonAnnotations.runbook_url }}"
          namespace: "{{ .CommonLabels.namespace }}"
  - name: slack-warnings
    slack_configs:
      - api_url: "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
        channel: "#k8s-alerts"
        title: "{{ .CommonAnnotations.summary }}"
        text: "{{ range .Alerts }}{{ .Annotations.description }}\n{{ end }}"
        color: "{{ if eq .CommonLabels.severity \"critical\" }}danger{{ else }}warning{{ end }}"
  - name: slack-database
    slack_configs:
      - api_url: "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
        channel: "#database-ops"
        title: "[DB] {{ .CommonAnnotations.summary }}"
inhibit_rules:
  # If a node is down, suppress alerts about pods on that node
  - source_match:
      severity: critical
      alertname: NodeDown
    target_match:
      severity: warning
    equal: ["node"]
' -n monitoring
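Creating the Secret does not by itself make the operator-managed Alertmanager use it. One way to wire it in, assuming the kube-prometheus-stack chart and the Secret name alertmanager-config from the command above, is a values fragment along these lines. The two keys are taken from the chart's values file as I understand it; verify them against the values.yaml of your chart version.

```yaml
# Fragment for monitoring-values.yaml (sketch; verify keys for your chart version)
alertmanager:
  alertmanagerSpec:
    useExistingSecret: true             # do not generate a config from the chart values
    configSecret: alertmanager-config   # Secret must contain the key alertmanager.yaml
```

Keeping the routing config in its own Secret means receiver credentials such as the PagerDuty key and the Slack webhook URL never have to appear in the Helm values.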
Full SRE Observability Stack: SLOs, Dashboards, and On-Call Automation
Scenario: OrderFlow e-commerce has an SLA of 99.9% uptime and 200ms P99 latency. Their SRE team needs: real-time dashboards showing error budgets, automated alerts to PagerDuty when SLOs are at risk, and a Grafana dashboard that non-technical stakeholders can understand. You will implement the complete observability stack from scratch.
# 1. Install the full stack
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
-n monitoring --create-namespace --values monitoring-values.yaml
# 2. Create SLO-based Recording Rules (pre-compute expensive queries)
kubectl apply -f - <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: orderflow-slo-rules
  namespace: production
  labels:
    release: kube-prometheus-stack
spec:
  groups:
    - name: slo-recording-rules
      interval: 30s
      rules:
        # 5-minute success rate
        - record: job:http_requests_success:rate5m
          expr: |
            sum(rate(http_requests_total{status!~"5..", job="orderflow-api"}[5m])) by (job)
            / sum(rate(http_requests_total{job="orderflow-api"}[5m])) by (job)
        # 1-hour success rate (for error budget)
        - record: job:http_requests_success:rate1h
          expr: |
            sum(rate(http_requests_total{status!~"5..", job="orderflow-api"}[1h])) by (job)
            / sum(rate(http_requests_total{job="orderflow-api"}[1h])) by (job)
        # Error budget remaining over the last hour (99.9% SLO target); negative when overspent
        - record: job:error_budget_remaining:rate1h
          expr: |
            (job:http_requests_success:rate1h - 0.999) / (1 - 0.999)
    - name: slo-alerts
      rules:
        # Alert when error budget burn rate is too fast
        # 14.4x burn rate = 2% of a 30-day budget per hour, exhausting it in about 50 hours
        - alert: ErrorBudgetBurnHigh
          expr: |
            (1 - job:http_requests_success:rate5m) > (14.4 * (1 - 0.999))
          for: 2m
          labels:
            severity: critical
            team: sre
          annotations:
            summary: "High error budget burn rate for orderflow-api"
            description: "Current error rate: {{ $value | humanizePercentage }}. At this rate, the 30-day error budget will be exhausted in about 2 days."
        # 6x burn rate exhausts a 30-day budget in 5 days
        - alert: ErrorBudgetBurnMedium
          expr: |
            (1 - job:http_requests_success:rate1h) > (6 * (1 - 0.999))
          for: 15m
          labels:
            severity: warning
            team: sre
          annotations:
            summary: "Elevated error budget burn rate"
            description: "Error budget may be exhausted within about 5 days if this rate continues."
EOF
# 3. Import Grafana dashboards via ConfigMap
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: orderflow-dashboard
  namespace: monitoring
  labels:
    grafana_dashboard: "1"   # Grafana sidecar automatically imports this
data:
  orderflow.json: |
    {
      "title": "OrderFlow SLO Dashboard",
      "panels": [
        {
          "title": "Success Rate (SLO: 99.9%)",
          "type": "stat",
          "targets": [{"expr": "job:http_requests_success:rate5m * 100"}],
          "thresholds": {"steps": [
            {"value": 0, "color": "red"},
            {"value": 99.0, "color": "yellow"},
            {"value": 99.9, "color": "green"}
          ]}
        },
        {
          "title": "Error Budget Remaining (1h window)",
          "type": "gauge",
          "targets": [{"expr": "job:error_budget_remaining:rate1h * 100"}]
        },
        {
          "title": "P99 Latency",
          "type": "graph",
          "targets": [{
            "expr": "histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket{job=\"orderflow-api\"}[5m])) by (le))",
            "legendFormat": "P99"
          }]
        }
      ]
    }
EOF
# 4. Verify everything
kubectl port-forward svc/kube-prometheus-stack-grafana -n monitoring 3000:80
# Open http://localhost:3000 → admin / SecureGrafanaPassword123!
# Navigate to Dashboards → OrderFlow SLO Dashboard
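The burn-rate multipliers used in the SLO alert rules in step 2 are easier to reason about with the arithmetic written out. The derivation below assumes a 30-day SLO period and the 99.9% target from the scenario; the 14.4 and 6 multipliers are the ones from the rules.

```latex
% P = SLO period, b = burn rate, w = length of the alert window
\begin{align*}
b &= \frac{\text{observed error rate}}{1 - \text{SLO}}
   = \frac{\text{observed error rate}}{0.001} \\[4pt]
\text{time to exhaust the budget} &= \frac{P}{b},
   \qquad P = 30\ \text{days} = 720\ \text{h} \\[4pt]
b = 14.4:\quad \frac{720\ \text{h}}{14.4} &= 50\ \text{h},
   \qquad \text{error-rate threshold} = 14.4 \times 0.001 = 1.44\% \\[4pt]
b = 6:\quad \frac{720\ \text{h}}{6} &= 120\ \text{h} = 5\ \text{days},
   \qquad \text{error-rate threshold} = 6 \times 0.001 = 0.6\% \\[4pt]
\text{budget spent in a window } w &= \frac{b \cdot w}{P},
   \qquad \frac{14.4 \times 1\ \text{h}}{720\ \text{h}} = 2\%
\end{align*}
```

In other words, a 14.4x burn spends 2% of the monthly budget every hour and would use all of it in roughly two days, while a 6x burn would take five days. That difference is why the first alert pages someone and the second only posts a warning.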
Interview Questions — Chapter 12
- What is the difference between metrics, logs, and traces? Which tool handles each in a typical Kubernetes observability stack?
- What is a ServiceMonitor and how does it differ from configuring scrape jobs in prometheus.yml directly?
- Explain what kube-state-metrics does and how it differs from node-exporter.
- What is a PrometheusRule? How does the Prometheus Operator load these automatically?
- Explain the concept of an error budget and how it relates to SLOs and SLAs.
- What does the for: 5m field in a PrometheusRule alert mean? Why is it important?
- Your Grafana dashboard shows no data. What are the 3 most common causes?
- What is the difference between a recording rule and an alerting rule in Prometheus?
CI/CD with ArgoCD — GitOps for Kubernetes
GitOps is the practice of using Git as the single source of truth for both application code and infrastructure configuration. With ArgoCD, your Kubernetes cluster continuously syncs itself to match what's in your Git repository. Nobody runs kubectl apply manually — Git is the only way to change production. This creates a complete audit trail, instant rollback capability (just revert a git commit), and eliminates configuration drift.
Push-Based CI/CD (Old Way) vs Pull-Based GitOps (ArgoCD Way):
Push: CI pipeline runs → builds Docker image → kubectl apply directly to cluster. The pipeline needs cluster credentials. Drift is possible — someone can kubectl apply something that's not in git.
Pull (ArgoCD): CI pipeline runs → builds Docker image → pushes image tag to Git repo → ArgoCD detects git change → syncs cluster to match git. ArgoCD lives inside the cluster. No external system needs cluster access. Git is always the truth.
Installing ArgoCD
# Install ArgoCD
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
# Wait for all ArgoCD pods to be ready
kubectl wait --for=condition=Ready pods --all -n argocd --timeout=300s
# Install ArgoCD CLI
curl -sSL -o argocd-linux-amd64 https://github.com/argoproj/argo-cd/releases/latest/download/argocd-linux-amd64
chmod +x argocd-linux-amd64 && mv argocd-linux-amd64 /usr/local/bin/argocd
# Get the initial admin password
argocd admin initial-password -n argocd
# jX9Kd2pLmQrT4vWx ← initial password (change this immediately!)
# Access the ArgoCD UI
kubectl port-forward svc/argocd-server -n argocd 8080:443
# Open: https://localhost:8080
# Login via CLI
argocd login localhost:8080 --username admin --password jX9Kd2pLmQrT4vWx --insecure
# Change admin password
argocd account update-password \
--current-password jX9Kd2pLmQrT4vWx \
--new-password MySecureArgoPassword123!
# Expose ArgoCD via Ingress (production)
# (The Ingress manifest piped to kubectl apply -f - is not preserved here.)
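For the production exposure step, an Ingress for the argocd-server Service could be sketched as follows. It assumes the ingress-nginx controller with SSL passthrough enabled, and the hostname argocd.example.com is a placeholder; neither comes from the original manifest.

```yaml
# Sketch: expose the ArgoCD API/UI through ingress-nginx with TLS passthrough.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argocd-server
  namespace: argocd
  annotations:
    nginx.ingress.kubernetes.io/ssl-passthrough: "true"     # argocd-server terminates TLS itself
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  ingressClassName: nginx
  rules:
    - host: argocd.example.com          # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: argocd-server
                port:
                  name: https
```

Passthrough only works if the ingress-nginx controller was started with --enable-ssl-passthrough. The alternative design is to terminate TLS at the ingress and run argocd-server in insecure mode behind it.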
ArgoCD Application: Syncing from Git
# =============================================
# Repository structure for GitOps
# =============================================
# git-ops-repo/
# apps/
# myapp/
# base/ ← Kustomize base manifests
# deployment.yaml
# service.yaml
# ingress.yaml
# kustomization.yaml
# overlays/
# staging/ ← Staging-specific patches
# kustomization.yaml (replicas=1, image tag=latest)
# production/ ← Production-specific patches
# kustomization.yaml (replicas=5, image tag=v2.1.0)
# infrastructure/
# monitoring/ ← ArgoCD also manages monitoring
# cert-manager/
# argocd/
# apps/ ← ArgoCD Application manifests (App of Apps pattern)
# myapp-staging.yaml
# myapp-production.yaml
# =============================================
# ArgoCD Application — Deploying from Git
# =============================================
# myapp-production-app.yaml, base/kustomization.yaml, and overlays/production/kustomization.yaml
# were written here with heredocs whose bodies are not preserved.
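An ArgoCD Application for the production overlay in the repository layout above might look like this sketch. The repository URL, branch name, and the choice to enable prune and selfHeal are assumptions for illustration.

```yaml
# Sketch: Application that keeps the production namespace in sync with Git.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp-production
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/mycompany/gitops-repo.git   # hypothetical repository URL
    targetRevision: main                                     # assumed branch
    path: apps/myapp/overlays/production
  destination:
    server: https://kubernetes.default.svc                   # the cluster ArgoCD runs in
    namespace: production
  syncPolicy:
    automated:
      prune: true        # delete resources that were removed from Git
      selfHeal: true     # revert manual changes made directly in the cluster
    syncOptions:
      - CreateNamespace=true
```

With selfHeal enabled, a manual kubectl scale or kubectl edit is treated as drift and reverted on the next reconciliation. With prune disabled, resources deleted from Git would linger in the cluster until someone removes them by hand.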
App of Apps Pattern: Managing Many Applications
# The "App of Apps" pattern: one ArgoCD Application manages other ArgoCD Applications
# This is the standard for managing a full platform with many services
# root-app.yaml — The single root application that manages everything else
# root-app.yaml and frontend-project.yaml were written here with heredocs whose bodies are not preserved.
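Alongside the root application, team isolation is usually expressed with an AppProject. The sketch below assumes a frontend team, a hypothetical repository URL, and a frontend-* namespace naming convention; none of these come from the original frontend-project.yaml.

```yaml
# Sketch: AppProject that limits where the frontend team's Applications may deploy.
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: frontend
  namespace: argocd
spec:
  description: Applications owned by the frontend team
  sourceRepos:
    - https://github.com/mycompany/gitops-repo.git   # hypothetical; only this repo is allowed
  destinations:
    - server: https://kubernetes.default.svc
      namespace: "frontend-*"                         # assumed namespace naming convention
  clusterResourceWhitelist: []                        # no cluster-scoped resources allowed
```

An Application that sets project: frontend is then rejected if it points at another repository or tries to deploy outside the listed namespaces, which is how one ArgoCD instance can be shared by several teams.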
Full GitOps Pipeline: GitHub Actions + ArgoCD + Automated Promotions
Scenario: ShipFast startup wants a zero-touch deployment pipeline. When a developer merges to main: (1) GitHub Actions builds and pushes the Docker image, (2) automatically updates the image tag in the GitOps repo, (3) ArgoCD deploys to staging automatically, (4) after passing smoke tests, promotes to production. Zero manual kubectl commands — ever.
# GitHub Actions workflow: .github/workflows/gitops-deploy.yaml
cat > .github/workflows/gitops-deploy.yaml <<'WORKFLOW'
name: Build, Push, and Deploy via GitOps
on:
  push:
    branches: [main]
env:
  REGISTRY: 123456789.dkr.ecr.us-east-1.amazonaws.com
  IMAGE_NAME: shipfast/api
  GITOPS_REPO: mycompany/gitops-repo
jobs:
  build-and-push:
    runs-on: ubuntu-latest
    permissions:
      id-token: write   # Needed to assume the AWS role through GitHub OIDC
      contents: read
    outputs:
      short-sha: ${{ steps.short-sha.outputs.SHA }}
    steps:
      - uses: actions/checkout@v4
      - name: Get short SHA
        id: short-sha
        run: echo "SHA=${GITHUB_SHA::8}" >> "$GITHUB_OUTPUT"
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: us-east-1
      - name: Login to ECR
        uses: aws-actions/amazon-ecr-login@v2
      - name: Build and push Docker image
        run: |
          docker build -t $REGISTRY/$IMAGE_NAME:${{ steps.short-sha.outputs.SHA }} .
          docker push $REGISTRY/$IMAGE_NAME:${{ steps.short-sha.outputs.SHA }}
          echo "Image pushed: $REGISTRY/$IMAGE_NAME:${{ steps.short-sha.outputs.SHA }}"
  update-gitops-repo:
    needs: build-and-push
    runs-on: ubuntu-latest
    steps:
      - name: Checkout GitOps repo
        uses: actions/checkout@v4
        with:
          repository: ${{ env.GITOPS_REPO }}
          token: ${{ secrets.GITOPS_PAT }}   # Personal Access Token for the gitops repo
      - name: Update staging image tag
        run: |
          cd apps/api/overlays/staging
          # Use yq to update the image tag in kustomization.yaml
          yq e '.images[0].newTag = "${{ needs.build-and-push.outputs.short-sha }}"' \
            -i kustomization.yaml
          echo "Updated staging to: ${{ needs.build-and-push.outputs.short-sha }}"
      - name: Commit and push
        run: |
          git config user.name "GitHub Actions Bot"
          git config user.email "bot@mycompany.com"
          git add .
          # Two -m flags give a subject line and a body; a push event has no pull request URL to record
          git commit \
            -m "chore: update staging api image to ${{ needs.build-and-push.outputs.short-sha }}" \
            -m "Triggered by: ${{ github.sha }}"
          git push
      # ArgoCD detects the git change → deploys to staging automatically
  smoke-test-staging:
    needs: update-gitops-repo
    runs-on: ubuntu-latest
    steps:
      - name: Wait for ArgoCD to sync staging
        run: |
          # Install ArgoCD CLI
          curl -sSL -o argocd https://github.com/argoproj/argo-cd/releases/latest/download/argocd-linux-amd64
          chmod +x argocd
          ./argocd login ${{ secrets.ARGOCD_SERVER }} \
            --username ${{ secrets.ARGOCD_USERNAME }} \
            --password ${{ secrets.ARGOCD_PASSWORD }} \
            --insecure
          # Wait for staging sync to complete
          ./argocd app wait api-staging --health --timeout 300
      - name: Run smoke tests against staging
        run: |
          curl -f https://staging.shipfast.io/health || exit 1
          curl -f https://staging.shipfast.io/api/v1/status || exit 1
          echo "Smoke tests passed!"
  promote-to-production:
    # build-and-push must be listed here, otherwise its outputs are not available in this job
    needs: [build-and-push, smoke-test-staging]
    runs-on: ubuntu-latest
    environment: production   # Requires manual approval in GitHub (optional)
    steps:
      - name: Checkout GitOps repo
        uses: actions/checkout@v4
        with:
          repository: ${{ env.GITOPS_REPO }}
          token: ${{ secrets.GITOPS_PAT }}
      - name: Promote image to production
        run: |
          cd apps/api/overlays/production
          yq e '.images[0].newTag = "${{ needs.build-and-push.outputs.short-sha }}"' \
            -i kustomization.yaml
      - name: Commit promotion to production
        run: |
          git config user.name "GitHub Actions Bot"
          git config user.email "bot@mycompany.com"
          git add .
          git commit -m "feat: promote api ${{ needs.build-and-push.outputs.short-sha }} to production"
          git push
      # ArgoCD detects → deploys to production
WORKFLOW
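The yq commands in the workflow assume that each overlay's kustomization.yaml has an images list whose first entry is the API image. A sketch of what apps/api/overlays/staging/kustomization.yaml might contain is shown below; the relative base path and the tag value are illustrative assumptions.

```yaml
# Sketch of the overlay file that the workflow rewrites with yq.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base                          # assumed location of the base manifests
images:
  - name: 123456789.dkr.ecr.us-east-1.amazonaws.com/shipfast/api
    newTag: "a1b2c3d4"                  # .images[0].newTag, replaced by the 8-character short SHA
```

Because the image tag lives in Git rather than in the pipeline, rolling back is a git revert of the commit that changed newTag, and ArgoCD then syncs the cluster back to the previous image.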
Interview Questions — Chapter 13
- What is GitOps and how does it differ from traditional CI/CD (push-based deployments)?
- What is the difference between ArgoCD's "Synced" and "Healthy" statuses? Can an app be Synced but Unhealthy?
- Explain the selfHeal and prune settings in ArgoCD's automated sync policy. What are the risks if you don't enable them?
- What is the App of Apps pattern? Why is it useful for managing a platform with many services?
- How does Kustomize differ from Helm? When would you choose one over the other?
- A developer manually ran kubectl scale deployment api --replicas=0 in production. If selfHeal is enabled in ArgoCD, what happens?
- What is an ArgoCD AppProject? How does it enforce multi-team isolation?
- How do you manage secrets in a GitOps workflow? You cannot commit secrets to git — what are the patterns for handling this?
Kubernetes on AWS EKS
Amazon Elastic Kubernetes Service (EKS) is one of the most widely used managed Kubernetes platforms. AWS manages the control plane (API server, etcd, scheduler), and you manage the worker nodes and the workloads on them. EKS integrates deeply with AWS services: IAM for authentication, VPC for networking, EBS/EFS for storage, ALB for ingress, and ECR for container images. Understanding EKS-specific concepts is essential for any cloud-native DevOps role.
EKS Architecture: What AWS Manages vs What You Manage
| Component | AWS Manages | You Manage |
|---|---|---|
| API Server | ✅ Fully managed, HA across 3 AZs | Nothing |
| etcd | ✅ Managed, encrypted, backed up | Nothing |
| Worker Nodes | EC2 instances provisioned (Managed Node Groups) | Instance type, scaling, OS patching |
| Networking (CNI) | AWS VPC CNI installed by default | VPC design, subnet selection, security groups |
| IAM Authentication | AWS IAM integration via aws-auth ConfigMap | IAM roles, users, RBAC mappings |
| Storage (CSI) | EBS CSI Driver available as addon | StorageClasses, PVCs, backup policies |
Creating an EKS Cluster with eksctl (Recommended)
# Install eksctl
curl --silent --location "https://github.com/eksctl-io/eksctl/releases/latest/download/eksctl_Linux_amd64.tar.gz" | tar xz -C /tmp
sudo mv /tmp/eksctl /usr/local/bin   # /usr/local/bin is normally root-owned
# Create production EKS cluster using config file (best practice)
cat > eks-cluster.yaml < 5m v1.29.0
# ip-10-0-2-100.ec2.internal   Ready   <none>   5m   v1.29.0
# ip-10-0-3-75.ec2.internal    Ready   <none>   4m   v1.29.0
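For orientation, an eksctl config file for a cluster like this one could look roughly like the sketch below. Only the cluster name, region, Kubernetes version, and three-node size are taken from this chapter; the file name, the node group name, the instance type, and the min/max sizes are assumptions chosen for illustration.

```shell
# Write a minimal eksctl ClusterConfig to a file (illustrative values, see notes above).
cat > eks-cluster-sketch.yaml <<'EOF'
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: production-cluster
  region: us-east-1
  version: "1.29"
iam:
  withOIDC: true              # creates the IAM OIDC provider that IRSA relies on
managedNodeGroups:
  - name: general             # hypothetical node group name
    instanceType: m5.large    # assumed instance type
    minSize: 3
    maxSize: 6
    desiredCapacity: 3
    privateNetworking: true   # nodes receive private IPs only
EOF
echo "wrote eks-cluster-sketch.yaml"

# Creating the cluster takes a while and incurs AWS cost, so it is left commented out:
# eksctl create cluster -f eks-cluster-sketch.yaml
```

Keeping the cluster definition in a file like this means it can be reviewed and version-controlled like any other manifest.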
IRSA: IAM Roles for Service Accounts
IRSA (IAM Roles for Service Accounts) is the EKS feature that gives individual pods their own AWS IAM permissions, without storing AWS credentials in environment variables or relying on the node's EC2 instance role. It is one of the most important EKS security concepts: a pod that needs S3 access gets only S3 permissions, not everything the node's instance role allows.
# =============================================
# SCENARIO: Give a pod access to S3 only
# =============================================
# 1. Enable OIDC provider for the cluster (one-time setup)
eksctl utils associate-iam-oidc-provider \
--cluster production-cluster \
--region us-east-1 \
--approve
# Get the OIDC issuer URL
OIDC_ISSUER=$(aws eks describe-cluster --name production-cluster \
--query "cluster.identity.oidc.issuer" --output text | sed 's|https://||')
echo "OIDC Issuer: $OIDC_ISSUER"
# 2. Create IAM Policy for S3 access
aws iam create-policy \
--policy-name app-s3-policy \
--policy-document '{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
"Resource": "arn:aws:s3:::my-app-bucket/*"
},
{
"Effect": "Allow",
"Action": ["s3:ListBucket"],
"Resource": "arn:aws:s3:::my-app-bucket"
}
]
}'
# 3. Create IAM Role with a trust policy for the Kubernetes ServiceAccount
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
aws iam create-role \
--role-name app-s3-role \
--assume-role-policy-document "{
\"Version\": \"2012-10-17\",
\"Statement\": [{
\"Effect\": \"Allow\",
\"Principal\": {
\"Federated\": \"arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${OIDC_ISSUER}\"
},
\"Action\": \"sts:AssumeRoleWithWebIdentity\",
\"Condition\": {
\"StringEquals\": {
\"${OIDC_ISSUER}:sub\": \"system:serviceaccount:production:app-service-account\",
\"${OIDC_ISSUER}:aud\": \"sts.amazonaws.com\"
}
}
}]
}"
aws iam attach-role-policy \
--role-name app-s3-role \
--policy-arn arn:aws:iam::${ACCOUNT_ID}:policy/app-s3-policy
# 4. Create Kubernetes ServiceAccount annotated with the IAM Role ARN
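kubectl apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-service-account
  namespace: production
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::${ACCOUNT_ID}:role/app-s3-role
EOF
# Pods that set serviceAccountName: app-service-account receive temporary credentials for app-s3-role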
Cluster Autoscaler and Karpenter
# =============================================
# OPTION 1: Cluster Autoscaler (traditional)
# Scales existing node groups up/down based on pending pods
# =============================================
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm install cluster-autoscaler autoscaler/cluster-autoscaler \
--namespace kube-system \
--set autoDiscovery.clusterName=production-cluster \
--set awsRegion=us-east-1 \
--set rbac.serviceAccount.annotations."eks\.amazonaws\.com/role-arn"="arn:aws:iam::123456789012:role/ClusterAutoscalerRole" \
--set extraArgs.balance-similar-node-groups=true \
--set extraArgs.skip-nodes-with-system-pods=false \
--set extraArgs.scale-down-delay-after-add=2m \
--set extraArgs.scale-down-unneeded-time=5m
# =============================================
# OPTION 2: Karpenter (modern, much faster)
# Provisions individual EC2 instances in seconds, not minutes
# More cost-efficient — picks optimal instance types automatically
# =============================================
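# Note: charts.karpenter.sh is the legacy chart repository. Newer Karpenter releases are published as an
# OCI chart (oci://public.ecr.aws/karpenter/karpenter) and use different settings keys; check the docs for your version.
helm repo add karpenter https://charts.karpenter.sh/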
helm install karpenter karpenter/karpenter \
--namespace karpenter \
--create-namespace \
--set settings.aws.clusterName=production-cluster \
--set settings.aws.defaultInstanceProfile=KarpenterNodeInstanceProfile \
--set controller.resources.requests.cpu=1 \
--set controller.resources.requests.memory=1Gi
# Create a NodePool (Karpenter's provisioner)
cat > karpenter-nodepool.yaml <
Production EKS Cluster: Multi-AZ, Private Networking, and Cost Optimization
Scenario: ScaleUp SaaS is moving from self-managed Kubernetes to EKS. They need: a private cluster (no public node IPs), multi-AZ deployment for high availability, Karpenter for auto-scaling, AWS Load Balancer Controller for Ingress (instead of nginx), and RDS PostgreSQL as the database instead of in-cluster PostgreSQL. Total infrastructure cost must be optimized using spot instances for non-critical workloads.
# 1. Install AWS Load Balancer Controller (replaces nginx ingress on EKS)
# It creates ALB and NLB natively from Ingress and Service resources
# Create IAM policy
curl -o alb-controller-policy.json \
https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/main/docs/install/iam_policy.json
aws iam create-policy \
--policy-name AWSLoadBalancerControllerIAMPolicy \
--policy-document file://alb-controller-policy.json
# Create ServiceAccount with IRSA
eksctl create iamserviceaccount \
--cluster=production-cluster \
--namespace=kube-system \
--name=aws-load-balancer-controller \
--attach-policy-arn=arn:aws:iam::123456789012:policy/AWSLoadBalancerControllerIAMPolicy \
--approve
helm repo add eks https://aws.github.io/eks-charts
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
-n kube-system \
--set clusterName=production-cluster \
--set serviceAccount.create=false \
--set serviceAccount.name=aws-load-balancer-controller
# 2. Use ALB Ingress (EKS-native — uses actual AWS ALB, not a pod-based ingress)
cat > alb-ingress.yaml <
{"Type": "redirect", "RedirectConfig": {"Protocol": "HTTPS", "StatusCode": "HTTP_301"}}
    alb.ingress.kubernetes.io/healthcheck-path: /health
    alb.ingress.kubernetes.io/group.name: production-alb   # Share one ALB across multiple Ingresses
spec:
  rules:
    - host: app.scaleup.io
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api-service
                port: { number: 80 }
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend-service
                port: { number: 80 }
EOF
kubectl apply -f alb-ingress.yaml
# 3. Connect to RDS — use ExternalName service for seamless migration
kubectl apply -f - < batch-deployment.yaml <
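A minimal sketch of the ExternalName approach from step 3, assuming a Service called postgres in the production namespace and a made-up RDS endpoint (both are placeholders, not values from this tutorial). The Service gives pods a stable in-cluster DNS name that resolves, via a CNAME, to the external database.

```shell
# Write an ExternalName Service manifest that aliases an in-cluster name to RDS.
cat > rds-externalname.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: postgres              # assumed name the application already uses
  namespace: production
spec:
  type: ExternalName
  # Hypothetical endpoint: replace with the endpoint of your own RDS instance
  externalName: mydb.abc123xyz.us-east-1.rds.amazonaws.com
EOF
echo "wrote rds-externalname.yaml"

# Apply it against a real cluster with:
# kubectl apply -f rds-externalname.yaml
# Pods can then keep connecting to postgres.production.svc.cluster.local:5432
```

Because ExternalName works purely at the DNS level, no ports are remapped and no traffic is proxied; applications that previously talked to an in-cluster postgres Service need no configuration change beyond credentials.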
Interview Questions — Chapter 14
- What does EKS actually manage vs what you are responsible for as the cluster operator?
- What is IRSA (IAM Roles for Service Accounts) and why is it superior to using node instance roles for pod permissions?
- How does the AWS VPC CNI differ from other CNI plugins like Calico? What is the implication for pod density per node?
- What is the difference between Cluster Autoscaler and Karpenter? Why is Karpenter generally preferred for new EKS clusters?
- Explain the aws-auth ConfigMap. What happens if it gets corrupted or deleted?
- What is a Managed Node Group in EKS? How does updating the AMI work with zero downtime?
- How do you handle spot instance interruptions gracefully in Kubernetes?
- What is the AWS Load Balancer Controller and how does it differ from nginx-ingress on EKS?
Production Deployment Strategies
Deploying to production is not just kubectl set image and hoping for the best. Production deployments must be safe, reversible, and observable. The strategy you choose determines your blast radius when things go wrong, your deployment speed, and the experience your users have during releases. This chapter covers every major deployment strategy used by top engineering teams worldwide.
Deployment Strategies Comparison
| Strategy | Downtime | Rollback Speed | Resource Cost | Best For |
|---|---|---|---|---|
| Recreate | Yes (downtime) | Slow | Low (1x) | Dev environments, batch jobs |
| Rolling Update | None | Minutes | Low (1.25x) | Most stateless apps |
| Blue/Green | None | Instant | High (2x) | High-risk releases, compliance |
| Canary | None | Fast | Medium (1.1–1.5x) | Risk mitigation, A/B testing |
| Shadow (Traffic Mirroring) | None | N/A (not serving traffic) | High (2x) | Testing new version with real traffic safely |
Rolling Updates: Tuning for Zero-Downtime
# Rolling update is the Kubernetes default strategy, but the default settings are rarely what you want
# Default: maxUnavailable=25%, maxSurge=25%, so up to 25% of desired pods (rounded down) can be unavailable mid-rollout
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
  namespace: production
spec:
  replicas: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # Never drop below 10 available pods during the rollout
      maxSurge: 3         # Allow 3 extra pods during the rollout (13 max total)
      # With these settings:
      # 1. 3 new pods start (total: 10 old + 3 new = 13)
      # 2. Wait until new pods pass readiness probes
      # 3. Terminate 3 old pods (total: 7 old + 3 new = 10)
      # 4. Repeat until all 10 are new version
  minReadySeconds: 30   # New pod must be Ready for 30s before counting as available
                        # Prevents a "too fast" rollout from hiding crashes
  progressDeadlineSeconds: 600   # Report the rollout as failed if it makes no progress for 10 minutes (no automatic rollback)
  selector:
    matchLabels: { app: api-server }
  template:
    metadata:
      labels: { app: api-server }
    spec:
      containers:
        - name: api
          image: mycompany/api:v2.0
          readinessProbe:   # CRITICAL: Rolling update waits for this before moving on
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 15   # Wait 15s after container starts
            periodSeconds: 5          # Check every 5 seconds
            successThreshold: 2       # Must be healthy twice in a row before counted as Ready
            failureThreshold: 3       # 3 failures = marked not ready; the rollout stalls until the pod recovers
          livenessProbe:    # Restarts the container if unhealthy
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
            failureThreshold: 3
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 15"]
                # Wait 15s before stopping so the load balancer can drain connections
                # Prevents 502 errors during pod termination
      terminationGracePeriodSeconds: 60   # Give the app 60s to finish in-flight requests
# Monitor the rollout
kubectl rollout status deployment/api-server -n production -w
# Waiting for deployment "api-server" rollout to finish: 3 out of 10 new replicas have been updated...
# Waiting for deployment "api-server" rollout to finish: 6 out of 10 new replicas have been updated...
# Waiting for deployment "api-server" rollout to finish: 9 out of 10 new replicas have been updated...
# deployment "api-server" successfully rolled out
# Roll back quickly if issues are detected
kubectl rollout undo deployment/api-server -n production
# Starts a rolling update back to the previous revision (the same maxSurge/maxUnavailable rules apply)
# Rollback to specific revision
kubectl rollout history deployment/api-server -n production
kubectl rollout undo deployment/api-server --to-revision=3 -n production
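The effect of maxUnavailable and maxSurge is easiest to see as arithmetic. The sketch below is a hypothetical helper (rollout_bounds is not a kubectl feature) that applies the documented rounding rules for percentage values: maxUnavailable rounds down and maxSurge rounds up.

```shell
# rollout_bounds REPLICAS MAX_UNAVAILABLE_PCT MAX_SURGE_PCT
# Prints the minimum number of available pods and the maximum total pods
# that a rolling update may reach with the given percentage settings.
rollout_bounds() {
  replicas=$1
  max_unavailable_pct=$2
  max_surge_pct=$3
  unavailable=$(( replicas * max_unavailable_pct / 100 ))   # rounds down
  surge=$(( (replicas * max_surge_pct + 99) / 100 ))        # rounds up
  echo "min_available=$(( replicas - unavailable )) max_total=$(( replicas + surge ))"
}

# Kubernetes defaults (25% / 25%) with 10 replicas
rollout_bounds 10 25 25

# The tuned settings from the manifest above: maxUnavailable 0, maxSurge 3 (30% of 10)
rollout_bounds 10 0 30
```

With the defaults, a 10-replica Deployment may temporarily run with only 8 available pods; with the tuned settings it never drops below 10, at the cost of needing room for 13.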
Blue/Green Deployments
# Blue/Green: Run two complete environments simultaneously
# "Blue" = current production version
# "Green" = new version (fully deployed but not receiving traffic)
# Switch traffic: update the Service selector from blue to green (instant!)
# Rollback: update selector back to blue (instant!)
# =============================================
# 1. Blue Deployment (current production)
# =============================================
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-blue
  namespace: production
  labels:
    app: api
    version: blue
spec:
  replicas: 5
  selector:
    matchLabels:
      app: api
      version: blue
  template:
    metadata:
      labels:
        app: api
        version: blue
    spec:
      containers:
        - name: api
          image: mycompany/api:v1.5.0   # Current production version
          ports:
            - containerPort: 3000
---
# =============================================
# 2. Service: currently pointing to BLUE
# =============================================
apiVersion: v1
kind: Service
metadata:
  name: api-service
  namespace: production
spec:
  selector:
    app: api
    version: blue   # ← This is the key! Points to blue pods
  ports:
    - port: 80
      targetPort: 3000
---
# =============================================
# 3. Green Deployment (new version, not receiving traffic yet)
# =============================================
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-green
  namespace: production
  labels:
    app: api
    version: green
spec:
  replicas: 5
  selector:
    matchLabels:
      app: api
      version: green
  template:
    metadata:
      labels:
        app: api
        version: green
    spec:
      containers:
        - name: api
          image: mycompany/api:v2.0.0   # New version: fully deployed, not live yet
          ports:
            - containerPort: 3000
# =============================================
# 4. Test green deployment (no real user traffic)
# =============================================
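kubectl port-forward deployment/api-green 8080:3000 -n production   # Blocks; run it in a separate terminal
curl http://localhost:8080/health # Test new version directly
# Run integration tests against green
# The URL below needs a Service named api-green; create a preview Service that selects only the green pods:
kubectl expose deployment api-green --name=api-green --port=80 --target-port=3000 -n production
kubectl run test --image=mycompany/integration-tests --rm -it --restart=Never \
--env="API_URL=http://api-green.production.svc.cluster.local" \
-- pytest tests/integration/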
# =============================================
# 5. SWITCH TO GREEN: update Service selector
# The selector change is a single API update; new connections move to green within seconds
# =============================================
kubectl patch service api-service -n production \
-p '{"spec": {"selector": {"app": "api", "version": "green"}}}'
# New connections flow to green pods as soon as the endpoint update propagates
# No downtime: the Service IP doesn't change (long-lived connections to blue persist until they close)
# =============================================
# 6. Rollback — just patch selector back to blue
# =============================================
kubectl patch service api-service -n production \
-p '{"spec": {"selector": {"app": "api", "version": "blue"}}}'
# Near-instant rollback, as long as the blue Deployment is still running.
# =============================================
# 7. After validation — clean up blue
# =============================================
kubectl delete deployment api-blue -n production
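Because the switch and the rollback are the same patch with a different label value, they are easy to wrap in a small helper. The following is a sketch with hypothetical function names (selector_patch, other_color); it only prints the kubectl command so it can be reviewed before being run against a cluster.

```shell
# Build the JSON patch that points api-service at one colour.
selector_patch() {
  printf '{"spec": {"selector": {"app": "api", "version": "%s"}}}' "$1"
}

# Return the colour that is NOT currently live.
other_color() {
  case "$1" in
    blue)  echo green ;;
    green) echo blue ;;
    *)     echo "unknown color: $1" >&2; return 1 ;;
  esac
}

live=blue                          # assumed current state; read it from the Service in practice
target=$(other_color "$live")

# Print (rather than execute) the command that performs the switch.
echo "kubectl patch service api-service -n production -p '$(selector_patch "$target")'"
```

Running the same helper with live=green produces the rollback command, which is why blue should only be deleted once green has been validated.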
PodDisruptionBudget: Protecting Availability During Node Maintenance
# PodDisruptionBudget (PDB) tells Kubernetes how many pods from a Deployment
# can be voluntarily disrupted at once (node drain, cluster upgrade, etc.)
# Without a PDB, a node drain can terminate ALL pods of a service simultaneously!
# =============================================
# PDB: At least 80% of pods must be available at all times
# =============================================
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
  namespace: production
spec:
  minAvailable: "80%"   # With 10 replicas: at least 8 must always be running
  # OR:
  # maxUnavailable: 2   # At most 2 can be unavailable at once
  selector:
    matchLabels:
      app: api-server
# =============================================
# Real-world scenario: EKS node group upgrade
# =============================================
# Without PDB: kubectl drain node drains ALL pods on a node instantly
# If all pods of a service land on the same node = complete outage
# With PDB: kubectl drain respects PDB
# It will only evict pods if minAvailable is maintained
# If it would violate the PDB, the drain pauses and waits
# Drain a node (for maintenance/upgrade)
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data
# evicting pod production/api-server-xxxxx
# evicting pod production/api-server-yyyyy
# error when evicting pods/"api-server-zzzzz" -n "production" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
# ← kubectl keeps retrying until a replacement pod is Ready elsewhere and the budget allows the eviction
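How many pods a drain may evict at once follows directly from the budget. The sketch below uses a hypothetical helper (pdb_allowed_disruptions) and assumes every replica is currently healthy; a percentage minAvailable is rounded up to a whole pod.

```shell
# pdb_allowed_disruptions REPLICAS MIN_AVAILABLE_PCT
# Prints how many pods may be voluntarily evicted at the same time,
# assuming all replicas are currently healthy.
pdb_allowed_disruptions() {
  replicas=$1
  min_available_pct=$2
  min_available=$(( (replicas * min_available_pct + 99) / 100 ))   # rounds up
  echo $(( replicas - min_available ))
}

pdb_allowed_disruptions 10 80   # the api-pdb example: 10 replicas, minAvailable 80%
pdb_allowed_disruptions 10 90   # a stricter 90% budget
pdb_allowed_disruptions 4 90    # small replica count with a strict budget
```

The last case is worth noting: with 4 replicas, 90% rounds up to 4 required pods, so no eviction is ever allowed and a node drain would block until the budget or the replica count is changed.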
# =============================================
# HPA: Horizontal Pod Autoscaler
# Scale based on CPU, memory, or custom metrics
# Utilization targets are a percentage of each container's resource requests,
# so the target pods must set resources.requests for cpu and memory
# =============================================
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 5    # Never scale below 5 (for reliability)
  maxReplicas: 50   # Never scale above 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60   # Target average CPU of 60% of requests: scale out above it, in below it
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70
    - type: Pods   # Custom metric: pending jobs in queue (requires a custom metrics adapter)
      pods:
        metric:
          name: jobs_pending_per_pod
        target:
          type: AverageValue
          averageValue: "30"   # Scale when > 30 pending jobs per pod
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60   # Act on the lowest recommendation from the last 60s (damps brief spikes)
      policies:
        - type: Pods
          value: 5   # Add at most 5 pods per 60s period
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300   # Act on the highest recommendation from the last 5 minutes
      policies:
        - type: Percent
          value: 10   # Remove at most 10% of pods per 60s period
          periodSeconds: 60
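The HPA's core calculation is desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), skipped when the ratio is within a 10% tolerance (the default) and clamped to minReplicas/maxReplicas. The sketch below is a simplified, hypothetical helper (hpa_desired) for a single metric; it ignores the behavior policies and stabilization windows, which further limit how fast the result is applied.

```shell
# hpa_desired CURRENT_REPLICAS CURRENT_METRIC TARGET_METRIC MIN MAX
# Simplified single-metric version of the HPA replica calculation.
hpa_desired() {
  current=$1; metric=$2; target=$3; min=$4; max=$5
  # Within the default 10% tolerance of the target: keep the current count.
  if [ $(( metric * 10 )) -ge $(( target * 9 )) ] && [ $(( metric * 10 )) -le $(( target * 11 )) ]; then
    desired=$current
  else
    desired=$(( (current * metric + target - 1) / target ))   # ceil(current * metric / target)
  fi
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

hpa_desired 10 90 60 5 50   # CPU at 90% against a 60% target
hpa_desired 10 63 60 5 50   # CPU at 63%: inside the tolerance band
hpa_desired 6 30 60 5 50    # CPU at 30%: would drop to 3, clamped to minReplicas
```

When several metrics are configured, as in api-hpa, the HPA computes a desired count per metric and uses the largest one.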
Capstone: Zero-Downtime Progressive Delivery with Full Observability
Scenario: PayStream processes $10M/day in payments. A single bad deployment cost them $200K in a previous outage. You must design and implement a deployment pipeline with: automated canary analysis (automatic rollback if error rate spikes), PDB to guarantee availability during upgrades, HPA to handle traffic spikes, and a complete runbook for on-call engineers.
# =============================================
# 1. PodDisruptionBudget — never < 90% pods available
# =============================================
kubectl apply -f - <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payment-api-pdb
  namespace: production
spec:
  minAvailable: "90%"
  selector:
    matchLabels:
      app: payment-api
EOF
# =============================================
# 2. HPA with custom metrics from Prometheus
# =============================================
kubectl apply -f - <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: payment-api-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payment-api
  minReplicas: 10
  maxReplicas: 100
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # Aggressive: scale earlier for the payment service
    - type: External   # Requires an external metrics adapter (for example prometheus-adapter) to expose this metric
      external:
        metric:
          name: http_requests_per_second
          selector:
            matchLabels:
              service: payment-api
        target:
          type: AverageValue
          averageValue: "500"   # Scale when > 500 req/s per pod
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 30   # Short window so traffic spikes are handled quickly
      policies:
        - type: Percent
          value: 100   # Double pods if needed
          periodSeconds: 30
    scaleDown:
      stabilizationWindowSeconds: 600   # 10 min window to avoid thrashing
EOF
# =============================================
# 3. Canary deployment with automated analysis
# Using Flagger for progressive delivery
# =============================================
helm repo add flagger https://flagger.app
helm install flagger flagger/flagger \
--namespace flagger-system \
--create-namespace \
--set prometheus.install=true \
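--set meshProvider=kubernetes
# Note: with meshProvider=kubernetes, Flagger runs blue/green style analysis without gradual traffic shifting.
# The weighted canary below (stepWeight/maxWeight) needs a service mesh or ingress provider such as istio, linkerd, or nginx.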
kubectl apply -f - <<'EOF'
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: payment-api
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payment-api
  progressDeadlineSeconds: 600
  service:
    port: 80
    targetPort: 3000
  analysis:
    interval: 1m     # Analyze metrics every 1 minute
    threshold: 5     # Roll back after 5 failed metric checks
    maxWeight: 50    # Maximum 50% canary traffic
    stepWeight: 10   # Increase canary by 10% per interval
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99    # Success rate must stay at or above 99%
        interval: 1m
      - name: request-duration
        thresholdRange:
          max: 500   # P99 latency must be < 500ms
        interval: 1m
    webhooks:
      - name: load-test
        url: http://flagger-loadtester.test/
        metadata:
          cmd: "hey -z 1m -q 10 -c 2 http://payment-api-canary.production/"
EOF
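# How Flagger works:
# 1. You update the Deployment image tag
# 2. Flagger detects the change
# 3. Flagger scales up payment-api (the canary) with the new version; payment-api-primary keeps serving the old one
# 4. Sends 10% traffic to canary
# 5. Checks Prometheus metrics every 1 minute
# 6. If success rate >= 99% and latency < 500ms: increase to 20%, then 30%, up to maxWeight (50%)
# 7. After passing at maxWeight: promotes the new version to primary and shifts all traffic back to it
# 8. If checks fail 5 times (threshold): AUTO-ROLLBACK → 0% to canary, alert sent if alerting is configured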
# Monitor canary progress
kubectl get canary payment-api -n production -w
# NAME STATUS WEIGHT LASTTRANSITIONTIME
# payment-api Progressing 10 2024-01-15T10:00:00Z
# payment-api Progressing 20 2024-01-15T10:01:00Z
# payment-api Progressing 30 2024-01-15T10:02:00Z
# payment-api Succeeded 0 2024-01-15T10:07:00Z ← Full rollout!
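The weights in the output above follow from stepWeight and maxWeight, and the same two values give the minimum time a successful canary takes: interval × (maxWeight / stepWeight). A small sketch with hypothetical helper names makes the schedule explicit.

```shell
# canary_weights STEP_WEIGHT MAX_WEIGHT
# Prints the sequence of traffic weights the canary passes through.
canary_weights() {
  step=$1
  max=$2
  weight=$step
  out=""
  while [ "$weight" -le "$max" ]; do
    out="$out$weight "
    weight=$(( weight + step ))
  done
  echo "${out% }"   # strip the trailing space
}

# canary_min_minutes INTERVAL_MINUTES STEP_WEIGHT MAX_WEIGHT
# Minimum analysis time before promotion, in minutes.
canary_min_minutes() {
  echo $(( $1 * $3 / $2 ))
}

canary_weights 10 50          # stepWeight 10, maxWeight 50
canary_min_minutes 1 10 50    # interval 1m
```

For the payment-api settings this means five analysis steps and at least five minutes of observed traffic before promotion, while a rollback triggers after at most interval × threshold, also five minutes here.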
# =============================================
# 4. On-Call Runbook (in the cluster as ConfigMap)
# =============================================
kubectl create configmap payment-api-runbook -n production \
--from-literal=runbook.md="
# Payment API On-Call Runbook
## Pod CrashLoopBackOff
1. kubectl logs pod-name --previous -n production
2. kubectl describe pod pod-name -n production
3. Check memory usage (possible OOM): kubectl top pods -n production
4. Rollback if recent deploy: kubectl rollout undo deploy/payment-api -n production
## High Latency Alert (P99 > 500ms)
1. Check HPA: kubectl get hpa -n production
2. Check DB settings: kubectl exec -n production deploy/payment-api -- env | grep DB
3. Check upstream: kubectl exec -n production deploy/payment-api -- curl -s payment-gateway/health
4. If DB issue: kubectl get pods -n data
## Scaling Emergency (traffic spike)
1. kubectl scale deployment payment-api --replicas=50 -n production
   (the HPA may scale back down later; raise minReplicas to hold the higher count)
2. Verify nodes available: kubectl get nodes | grep -w Ready
3. If nodes insufficient: Karpenter auto-provisions, wait 90s
"
You have built a production-grade progressive delivery system that validates each deployment against real traffic, rolls back on degradation, scales with load, and documents operational procedures. Progressive delivery of this kind is a widely used pattern at large engineering organizations. You are now a production Kubernetes engineer.
Interview Questions — Chapter 15
- What is the difference between a rolling update and a blue/green deployment? When is the extra cost of blue/green justified?
- What does maxUnavailable: 0 and maxSurge: 3 mean in a rolling update strategy? What is the trade-off?
- Why does terminationGracePeriodSeconds exist and what happens if it's too short?
- What is a PodDisruptionBudget and when does it actually take effect? What operations does it NOT protect against?
- Explain how HPA works. What Kubernetes component is responsible for executing the scaling decision?
- What is Flagger and how does it automate canary analysis? What metrics would you use to automatically decide to roll back?
- A rolling update is stuck at 50% and not progressing. What are the possible causes?
- What is the difference between a liveness probe and a readiness probe in the context of deployments? What happens during a rolling update if a readiness probe fails on new pods?
Chapters 11–15 Covered
You've mastered Security (RBAC, PSA, OPA Gatekeeper), Monitoring (Prometheus, Grafana, SLOs), GitOps with ArgoCD, AWS EKS with IRSA and Karpenter, and Production Deployment Strategies including canary analysis with Flagger. You are now equipped to operate production Kubernetes clusters at scale.
Practice all labs on killer.sh (official CKA simulator). Focus on speed: the exam is time-pressured. Use imperative kubectl commands wherever possible. Bookmark the Kubernetes documentation (kubernetes.io/docs); it is open book during the exam.