Kubernetes Full Course (Beginner to Advanced) with Real Projects and Hands-on Examples
Services and Networking
Pods are ephemeral — they die and get recreated with new IP addresses. If your frontend directly called a pod’s IP, that IP would break every time the pod restarted. Kubernetes Services solve this by providing a stable virtual IP and DNS name in front of a dynamic set of pods. Networking is the glue that holds microservices together, and understanding it deeply is what separates a junior from a senior Kubernetes engineer.
The Four Service Types
ClusterIP: Exposes the service on an internal IP only, reachable from within the cluster. Used for inter-service communication (frontend talking to backend). This is the default and the most common type.
NodePort: Exposes the service on each node’s IP at a static port (30000–32767 by default). Useful for local development and testing. Not recommended for production because it exposes node IPs directly.
LoadBalancer: Provisions an external cloud load balancer (AWS ELB, GCP LB, Azure LB) automatically. Each LoadBalancer service gets its own cloud LB and external IP, which gets expensive if overused; prefer Ingress for HTTP.
ExternalName: Maps a service to an external DNS name (e.g., mydb.rds.amazonaws.com). No proxying, just a CNAME record served by CoreDNS. Used to reference external services using internal Kubernetes DNS names.
Complete Service YAML Examples
# =============================================
# 1. ClusterIP — Internal service (most common)
# =============================================
apiVersion: v1
kind: Service
metadata:
  name: backend-api
  namespace: production
  labels:
    app: backend-api
spec:
  type: ClusterIP          # Default; no need to specify, but shown for clarity
  selector:
    app: backend-api       # Routes traffic to pods with this label
  ports:
    - name: http
      port: 80             # Port the Service listens on (inside cluster)
      targetPort: 3000     # Port the pod container is listening on
      protocol: TCP
    - name: metrics
      port: 9090
      targetPort: 9090
# DNS name: backend-api.production.svc.cluster.local
# Other pods call it as: http://backend-api (within same namespace)
# Or: http://backend-api.production (cross-namespace)
---
# =============================================
# 2. NodePort — For local testing / dev
# =============================================
apiVersion: v1
kind: Service
metadata:
  name: frontend-nodeport
  namespace: staging
spec:
  type: NodePort
  selector:
    app: frontend
  ports:
    - port: 80
      targetPort: 3000
      nodePort: 31000      # Fixed node port; must be 30000-32767
# Accessible at: http://<node-ip>:31000
---
# =============================================
# 3. LoadBalancer — Cloud production exposure
# =============================================
apiVersion: v1
kind: Service
metadata:
  name: payment-api-lb
  namespace: production
  annotations:
    # AWS-specific annotations for EKS
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-internal: "false"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
spec:
  type: LoadBalancer
  selector:
    app: payment-api
  ports:
    - port: 443
      targetPort: 8443
      protocol: TCP
  loadBalancerSourceRanges:
    - "203.0.113.0/24"     # Restrict access to specific IP ranges (security best practice)
---
# =============================================
# 4. ExternalName — Alias for external services
# =============================================
apiVersion: v1
kind: Service
metadata:
  name: production-database
  namespace: production
spec:
  type: ExternalName
  externalName: myapp-db.abc123xyz.us-east-1.rds.amazonaws.com
# Pods can now call: postgres://production-database:5432/mydb
# Instead of hardcoding the RDS endpoint everywhere
How Kubernetes Networking Works: The CNI Layer
Kubernetes does not implement pod-to-pod networking itself. Instead, it delegates to a Container Network Interface (CNI) plugin. The CNI plugin is responsible for: assigning IP addresses to pods, ensuring every pod can communicate with every other pod across nodes (without NAT), and implementing NetworkPolicies (firewall rules).
Flannel: Simple overlay network that uses VXLAN. Easy to set up, good for beginners. Does NOT support NetworkPolicy; you need Calico for that. Best for: learning, simple on-prem clusters.
Calico: Production-grade CNI. Uses BGP for routing (no overlay needed, so higher performance). Supports NetworkPolicy. Best for: production clusters needing network security.
Cilium: Modern CNI using eBPF for networking (can replace iptables-based routing). Very high performance at scale. Supports L7 policy (filter by HTTP method, path). Best for: large-scale production, security-conscious teams.
AWS VPC CNI: EKS default. Assigns real VPC IP addresses to pods. No overlay, so pods are native VPC citizens. Each EC2 instance can host a limited number of pods (based on ENI/IP limits per instance type).
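To find out which of these plugins a cluster is actually running, one common approach is to look at the names of the CNI agent pods. The helper below is a minimal sketch, not an official tool: the function name detect_cni is hypothetical, and it assumes the default pod name prefixes each project ships with (kube-flannel, calico-node, cilium, aws-node). Customized installs may use different names.

```shell
# detect_cni: read pod names on stdin (one per line) and print the CNI
# plugin they suggest. The name prefixes are assumed defaults, not guarantees.
detect_cni() {
  while IFS= read -r pod; do
    case "$pod" in
      kube-flannel*) echo "flannel";     return 0 ;;
      calico-node*)  echo "calico";      return 0 ;;
      cilium*)       echo "cilium";      return 0 ;;
      aws-node*)     echo "aws-vpc-cni"; return 0 ;;
    esac
  done
  echo "unknown"
  return 1
}

# Against a live cluster, feed it the pod names from all namespaces:
#   kubectl get pods -A --no-headers -o custom-columns=NAME:.metadata.name | detect_cni
```

The check is only a heuristic based on naming; the authoritative source is the CNI configuration on each node.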
NetworkPolicy: Firewall Rules for Pods
By default, all pods can communicate with all other pods in the cluster. In a production environment with PCI-DSS, HIPAA, or SOC2 compliance requirements, you must isolate services using NetworkPolicies. NetworkPolicy works like a firewall — you define which pods can talk to which pods, on which ports.
# network-policy.yaml — Production security configuration
# Scenario: payment-service should only accept traffic from api-gateway
# and only communicate with postgres database. Nothing else.
# STEP 1: Default deny all ingress and egress for the payment namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payment
spec:
  podSelector: {}          # Applies to ALL pods in this namespace
  policyTypes:
    - Ingress
    - Egress
  # No rules = deny everything
---
# STEP 2: Allow ingress to payment-service ONLY from api-gateway
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-api-gateway-to-payment
  namespace: payment
spec:
  podSelector:
    matchLabels:
      app: payment-service
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: api-gateway-ns   # Only from pods in this namespace
          podSelector:
            matchLabels:
              app: api-gateway       # AND with this label (AND logic, not OR)
      ports:
        - protocol: TCP
          port: 8080
---
# STEP 3: Allow egress from payment-service to postgres only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-payment-to-postgres
  namespace: payment
spec:
  podSelector:
    matchLabels:
      app: payment-service
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
    - to:                            # Also allow DNS resolution (required!)
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP              # DNS falls back to TCP for large responses
          port: 53
---
# Verify policies
kubectl get networkpolicies -n payment
kubectl describe networkpolicy allow-api-gateway-to-payment -n payment
# Test connectivity from the allowed client (api-gateway)
# The api-gateway namespace must carry the label name=api-gateway-ns used in the policy
kubectl exec -n api-gateway deploy/api-gateway -- \
  curl -s --connect-timeout 3 http://payment-service.payment:8080/pay
# Should succeed
kubectl exec -n frontend deploy/frontend -- \
  curl -s --connect-timeout 3 http://payment-service.payment:8080/pay
# Should time out: frontend is not allowed to reach payment
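The "AND logic, not OR" comment in step 2 hinges on a single dash in the YAML. The sketch below writes two illustrative policies to a file so the difference can be compared side by side; the policy names and-example and or-example and the file name are hypothetical, everything else mirrors the labels used above.

```shell
# Write two illustrative policies to a file for side-by-side comparison.
cat > selector-logic.yaml <<'EOF'
# AND: one "from" entry holding both selectors.
# Allows only pods labeled app=api-gateway inside namespaces labeled name=api-gateway-ns.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: and-example
  namespace: payment
spec:
  podSelector:
    matchLabels:
      app: payment-service
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: api-gateway-ns
          podSelector:
            matchLabels:
              app: api-gateway
---
# OR: two "from" entries (note the extra dash before podSelector).
# Allows ANY pod in namespaces labeled name=api-gateway-ns, and also any pod
# labeled app=api-gateway in the policy's own namespace (payment).
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: or-example
  namespace: payment
spec:
  podSelector:
    matchLabels:
      app: payment-service
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: api-gateway-ns
        - podSelector:
            matchLabels:
              app: api-gateway
EOF
echo "wrote selector-logic.yaml"
```

The OR form is usually far more permissive than intended, so it is worth reviewing the indentation of every from block before applying a policy.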
DNS in Kubernetes: CoreDNS Deep Dive
# Every service gets a DNS record automatically:
# Format: <service-name>.<namespace>.svc.cluster.local
# Examples:
# backend-api.production.svc.cluster.local → ClusterIP of backend-api service
# postgres.data.svc.cluster.local → ClusterIP of postgres service
# From within the SAME namespace, you can use shorthand:
# http://backend-api (short name)
# http://backend-api:80 (with port)
# From a DIFFERENT namespace, use full name:
# http://backend-api.production
# http://backend-api.production.svc.cluster.local
# Verify DNS resolution from inside a pod
kubectl run dns-test --image=busybox --rm -it -- nslookup backend-api.production
# Server: 10.96.0.10 (CoreDNS ClusterIP)
# Address: 10.96.0.10:53
# Name: backend-api.production.svc.cluster.local
# Address: 10.100.47.23 (Service ClusterIP)
# Check CoreDNS ConfigMap (customize DNS behavior)
kubectl get configmap coredns -n kube-system -o yaml
# You can add custom DNS entries, forward specific domains to external DNS, etc.
# Custom DNS entry example - add to CoreDNS ConfigMap data.Corefile:
# example.com:53 {
# forward . 8.8.8.8
# }
# This forwards all *.example.com queries to Google DNS
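Short names work because each pod's /etc/resolv.conf carries a search list based on the pod's own namespace. The sketch below imitates that expansion; it assumes the default cluster domain cluster.local and the usual three-entry search list, and the function name expand_service_name is hypothetical.

```shell
# expand_service_name NAME POD_NAMESPACE
# Print, in order, the candidate names a pod's resolver would try for NAME
# via its search list (before finally trying NAME as given).
expand_service_name() {
  name=$1
  pod_ns=$2
  for suffix in "$pod_ns.svc.cluster.local" "svc.cluster.local" "cluster.local"; do
    echo "$name.$suffix"
  done
}

# Pod in "production" asking for the short name:
expand_service_name backend-api production
# Pod in "staging" asking for the cross-namespace form:
expand_service_name backend-api.production staging
```

In the first call the very first candidate is the Service's full name, which is why the short name only resolves from inside the same namespace. In the second call the match comes from the second candidate, backend-api.production.svc.cluster.local.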
Multi-Tier Banking Application with Network Isolation
Scenario: SecureBank needs a 3-tier application — frontend (React), backend API (Node.js), and database (PostgreSQL) — with strict network isolation. The frontend can only reach the API, the API can only reach the database, and the database cannot initiate any outbound connections. This is a PCI-DSS requirement.
# 1. Create namespaces with labels (labels are used in NetworkPolicy selectors)
kubectl create namespace securebank-frontend
kubectl create namespace securebank-api
kubectl create namespace securebank-data
kubectl label namespace securebank-frontend name=securebank-frontend
kubectl label namespace securebank-api name=securebank-api
kubectl label namespace securebank-data name=securebank-data
# 2. Deploy the 3 tiers
cat > securebank-stack.yaml <<EOF
# --- FRONTEND DEPLOYMENT ---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
  namespace: securebank-frontend
spec:
  replicas: 2
  selector:
    matchLabels: { app: frontend }
  template:
    metadata:
      labels: { app: frontend }
    spec:
      containers:
        - name: frontend
          image: nginx:1.25-alpine
          ports:
            - containerPort: 80
          resources:
            requests: { memory: "64Mi", cpu: "50m" }
            limits: { memory: "128Mi", cpu: "200m" }
---
# Frontend Service (NodePort for external access in dev)
apiVersion: v1
kind: Service
metadata:
  name: frontend-svc
  namespace: securebank-frontend
spec:
  type: NodePort
  selector: { app: frontend }
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30080
---
# --- API DEPLOYMENT ---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: banking-api
  namespace: securebank-api
spec:
  replicas: 3
  selector:
    matchLabels: { app: banking-api }
  template:
    metadata:
      labels: { app: banking-api }
    spec:
      containers:
        - name: banking-api
          image: node:20-alpine
          command: ["/bin/sh", "-c", "node -e \"const h=require('http');h.createServer((q,r)=>{r.writeHead(200);r.end('Banking API OK')}).listen(3000)\""]
          ports:
            - containerPort: 3000
          resources:
            requests: { memory: "128Mi", cpu: "100m" }
            limits: { memory: "256Mi", cpu: "500m" }
---
# API Service (ClusterIP, internal only)
apiVersion: v1
kind: Service
metadata:
  name: banking-api-svc
  namespace: securebank-api
spec:
  type: ClusterIP
  selector: { app: banking-api }
  ports:
    - port: 3000
      targetPort: 3000
---
# --- DATABASE STATEFULSET ---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: securebank-data
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels: { app: postgres }
  template:
    metadata:
      labels: { app: postgres }
    spec:
      containers:
        - name: postgres
          image: postgres:15-alpine
          env:
            - name: POSTGRES_PASSWORD
              value: "securepassword123"   # Demo only; use a Secret in real deployments
          ports:
            - containerPort: 5432
          resources:
            requests: { memory: "256Mi", cpu: "200m" }
            limits: { memory: "512Mi", cpu: "500m" }
---
# Database Service (ClusterIP, internal only)
apiVersion: v1
kind: Service
metadata:
  name: postgres-svc
  namespace: securebank-data
spec:
  type: ClusterIP
  selector: { app: postgres }
  ports:
    - port: 5432
      targetPort: 5432
EOF
kubectl apply -f securebank-stack.yaml
# 3. Apply Network Policies: default deny all in each namespace
# Note: this also blocks all egress (including DNS lookups) until matching egress rules exist
for ns in securebank-frontend securebank-api securebank-data; do
  kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: $ns
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]
EOF
done
# 4. Allow frontend → API
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: securebank-api
spec:
  podSelector:
    matchLabels: { app: banking-api }
  policyTypes: [Ingress]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels: { name: securebank-frontend }
          podSelector:
            matchLabels: { app: frontend }
      ports:
        - port: 3000
EOF
# 5. Allow API → Database
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-api-to-db
  namespace: securebank-data
spec:
  podSelector:
    matchLabels: { app: postgres }
  policyTypes: [Ingress]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels: { name: securebank-api }
          podSelector:
            matchLabels: { app: banking-api }
      ports:
        - port: 5432
EOF
# 6. Verify isolation — frontend cannot reach database directly
kubectl exec -n securebank-frontend deploy/frontend -- \
timeout 3 nc -zv postgres-svc.securebank-data 5432 || echo "BLOCKED - correct!"
# 7. Verify allowed path — API can reach database
kubectl exec -n securebank-api deploy/banking-api -- \
timeout 3 nc -zv postgres-svc.securebank-data 5432 && echo "ALLOWED - correct!"
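One gap in the walkthrough is worth spelling out: the default-deny policies from step 3 block egress as well as ingress, and steps 4 and 5 only open ingress. As written, the API pods cannot send traffic to the database or resolve DNS, so the check in step 7 would be expected to fail. The sketch below adds matching egress rules. The policy names and file name are illustrative, and the DNS rule assumes CoreDNS runs in kube-system with the automatic kubernetes.io/metadata.name namespace label.

```shell
# Write egress rules that pair with the ingress rules from steps 4 and 5.
# The database namespace deliberately gets no egress rule at all.
cat > securebank-egress.yaml <<'EOF'
# Frontend may call the API on 3000 and resolve DNS.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-egress
  namespace: securebank-frontend
spec:
  podSelector:
    matchLabels: { app: frontend }
  policyTypes: [Egress]
  egress:
    - to:
        - namespaceSelector:
            matchLabels: { name: securebank-api }
          podSelector:
            matchLabels: { app: banking-api }
      ports:
        - port: 3000
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - { protocol: UDP, port: 53 }
        - { protocol: TCP, port: 53 }
---
# API may call PostgreSQL on 5432 and resolve DNS.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-api-egress
  namespace: securebank-api
spec:
  podSelector:
    matchLabels: { app: banking-api }
  policyTypes: [Egress]
  egress:
    - to:
        - namespaceSelector:
            matchLabels: { name: securebank-data }
          podSelector:
            matchLabels: { app: postgres }
      ports:
        - port: 5432
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - { protocol: UDP, port: 53 }
        - { protocol: TCP, port: 53 }
EOF
echo "wrote securebank-egress.yaml"
```

After kubectl apply -f securebank-egress.yaml, steps 6 and 7 can be rerun. External traffic to the frontend NodePort is likewise blocked by default-deny ingress and would need its own allow rule on port 80.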
Networking Troubleshooting Guide
Symptom: a service name does not resolve. Debug steps: (1) Verify the service exists: kubectl get svc -n namespace. (2) Test DNS from inside the pod: kubectl exec pod -- nslookup service-name. (3) Check that the CoreDNS pods are running: kubectl get pods -n kube-system | grep coredns. (4) Check whether a NetworkPolicy is blocking DNS (port 53).
Symptom: the service resolves but no pod answers. Debug: Check that every key/value in the service selector is present on the pods: kubectl get endpoints service-name. If Endpoints shows <none>, the selector doesn’t match any ready pods. Check: kubectl get pods --show-labels and compare with the service selector.
Symptom: a connection between pods times out. Debug: Use kubectl exec pod -- curl -v --connect-timeout 3 http://target to test. If it times out, a NetworkPolicy is likely blocking it. Check all policies in both namespaces, on the egress side of the caller as well as the ingress side of the target. Remember: NetworkPolicies are additive, so if any policy allows the traffic, it is allowed. The default (no policies) = allow all.
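The selector check in the second item can be made concrete. A Service selects a pod when every selector key/value appears among the pod's labels; extra pod labels do not matter. The helper below is a small sketch of that rule using the comma-separated key=value format that kubectl get pods --show-labels prints. The function name selector_matches is hypothetical.

```shell
# selector_matches SELECTOR LABELS
# Both arguments are comma-separated key=value lists, e.g. "app=frontend,tier=web".
# Succeeds only if every selector pair is present in the pod's labels.
selector_matches() {
  selector=$1
  labels=",$2,"
  old_ifs=$IFS
  IFS=,
  for pair in $selector; do
    case "$labels" in
      *",$pair,"*) ;;                        # this pair is present, keep checking
      *) IFS=$old_ifs; return 1 ;;           # one missing pair means no match
    esac
  done
  IFS=$old_ifs
  return 0
}

# Extra labels on the pod are fine:
selector_matches "app=backend-api" "app=backend-api,pod-template-hash=7df9c" && echo "match"
# A near miss (app=backend vs app=backend-api) is not:
selector_matches "app=backend" "app=backend-api,pod-template-hash=7df9c" || echo "no match"
```

A one-character difference between the selector and the pod labels is enough to leave the Endpoints list empty, which is why comparing the two strings directly is often the fastest check.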
Interview Questions — Chapter 6
- A pod’s IP changes every time it restarts. How do Services solve this problem internally?
- What is the difference between ClusterIP, NodePort, and LoadBalancer? When would you choose each?
- How does kube-proxy implement service routing using iptables? What is the alternative and why is it better at scale?
- Explain the full DNS name format for a Kubernetes Service and how a pod resolves it.
- What is a CNI plugin and name three examples? What does Cilium do differently from Flannel?
- If you apply a default-deny NetworkPolicy to a namespace, what breaks immediately and why?
- What is the difference between namespaceSelector and podSelector in a NetworkPolicy? What happens when you use both in the same from rule?
- Your LoadBalancer service is stuck in Pending. What is the most common cause?
ConfigMaps and Secrets
Hardcoding configuration into container images is an anti-pattern. If your database host changes, you would need to rebuild and redeploy your image. Instead, Kubernetes separates configuration from code using ConfigMaps (for non-sensitive data) and Secrets (for sensitive data). This follows the 12-Factor App methodology — store config in the environment, not in the application.
ConfigMaps: Non-Sensitive Configuration
# =============================================
# Creating ConfigMaps — Multiple methods
# =============================================
# Method 1: From literal values (quick and simple)
kubectl create configmap app-config \
--from-literal=NODE_ENV=production \
--from-literal=LOG_LEVEL=info \
--from-literal=API_URL=https://api.example.com \
--from-literal=MAX_CONNECTIONS=100
# Method 2: From a file (great for config files like nginx.conf, app.properties)
# Key = filename, Value = file content
kubectl create configmap nginx-config \
  --from-file=nginx.conf \
  --from-file=mime.types
# Method 3: From YAML (most common in production — version controlled)
cat > app-configmap.yaml <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config
  namespace: production
data:
  # Simple key-value pairs
  NODE_ENV: "production"
  LOG_LEVEL: "info"
  DB_HOST: "postgres-service"
  DB_PORT: "5432"
  DB_NAME: "appdb"
  MAX_POOL_SIZE: "20"
  # Multi-line config file stored as a ConfigMap entry
  app.json: |
    {
      "server": {
        "port": 3000,
        "timeout": 30000,
        "maxConnections": 100
      },
      "cache": {
        "ttl": 3600,
        "maxItems": 10000
      },
      "features": {
        "darkMode": true,
        "betaAPI": false
      }
    }
  nginx.conf: |
    upstream backend {
      server localhost:3000;
    }
    server {
      listen 80;
      location / {
        proxy_pass http://backend;
        proxy_read_timeout 60;
      }
      location /health {
        return 200 'OK';
      }
    }
EOF
kubectl apply -f app-configmap.yaml
# =============================================
# Using ConfigMaps in Pods — 3 Methods
# =============================================
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels: { app: api-server }
  template:
    metadata:
      labels: { app: api-server }
    spec:
      containers:
        - name: api
          image: mycompany/api:v3.0
          # METHOD 1: Inject specific keys as environment variables
          env:
            - name: NODE_ENV
              valueFrom:
                configMapKeyRef:
                  name: api-config
                  key: NODE_ENV
            - name: DB_HOST
              valueFrom:
                configMapKeyRef:
                  name: api-config
                  key: DB_HOST
          # METHOD 2: Inject ALL keys as environment variables at once
          envFrom:
            - configMapRef:
                name: api-config   # Keys that are valid env var names become env vars
                # NODE_ENV=production, LOG_LEVEL=info, DB_HOST=postgres-service, etc.
                # File-style keys such as app.json may be skipped, depending on cluster version
          # METHOD 3: Mount as files in a volume (best for config files)
          volumeMounts:
            - name: config-files
              mountPath: /etc/app/config   # app.json and nginx.conf appear here as files
              readOnly: true
      volumes:
        - name: config-files
          configMap:
            name: api-config
            items:                   # Optionally select specific keys to mount
              - key: app.json
                path: app.json       # Creates /etc/app/config/app.json
              - key: nginx.conf
                path: nginx.conf     # Creates /etc/app/config/nginx.conf
Secrets: Sensitive Data Management
Kubernetes Secrets are base64 encoded, NOT encrypted by default. Anyone with access to etcd or the right RBAC permissions can read them. For true production security: (1) Enable encryption at rest for etcd, (2) Use AWS Secrets Manager / HashiCorp Vault with the Secrets Store CSI Driver, (3) Limit Secret access via RBAC. Never commit Secret YAML files to git repositories.
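To see what "encoded, not encrypted" means in practice, the round trip below uses nothing but the base64 tool. It is a minimal illustration with the same sample password used later in this chapter; the variable names are arbitrary.

```shell
# Encode a value the way it appears in the data field of a Secret manifest.
encoded=$(printf '%s' 'SuperSecret123!' | base64)
echo "$encoded"    # U3VwZXJTZWNyZXQxMjMh

# Decode it again. No key or password is involved, which is the whole point:
# anyone who can read the Secret object can recover the plain text.
decoded=$(printf '%s' "$encoded" | base64 --decode)
echo "$decoded"    # SuperSecret123!
```

Against a live cluster the same decoding step works on real data, for example kubectl get secret db-credentials -o jsonpath='{.data.password}' | base64 --decode, which is why RBAC on Secrets and encryption at rest matter.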
# =============================================
# Creating Secrets
# =============================================
# Method 1: Imperative (values are auto-base64 encoded by kubectl)
# Single quotes keep the shell from interpreting "!" and other special characters
kubectl create secret generic db-credentials \
  --from-literal=username=appuser \
  --from-literal=password='SuperSecret123!' \
  --from-literal=connection-string='postgresql://appuser:SuperSecret123!@postgres:5432/appdb'
# Method 2: From files (certificates, SSH keys)
kubectl create secret generic tls-certs \
--from-file=tls.crt=./server.crt \
--from-file=tls.key=./server.key
# Method 3: TLS Secret (special type for Ingress TLS termination)
kubectl create secret tls api-tls-secret \
--cert=server.crt \
--key=server.key
# Method 4: Docker registry credentials (for private image pulls)
kubectl create secret docker-registry registry-credentials \
--docker-server=private-registry.mycompany.com \
--docker-username=myuser \
--docker-password=mypassword \
--docker-email=devops@mycompany.com
# Method 5: YAML (values must be manually base64 encoded)
# echo -n 'SuperSecret123!' | base64 → U3VwZXJTZWNyZXQxMjMh
cat > db-secret.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
  namespace: production
type: Opaque                       # Generic secret type
data:
  username: YXBwdXNlcg==           # base64('appuser')
  password: U3VwZXJTZWNyZXQxMjMh   # base64('SuperSecret123!')
stringData:                        # Alternative: plain text (Kubernetes encodes automatically)
  connection-string: "postgresql://appuser:SuperSecret123!@postgres:5432/appdb"
EOF
# WARNING: Never commit this file to git!
# Add db-secret.yaml to .gitignore immediately
# =============================================
# Using Secrets in Pods
# =============================================
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-with-secrets
spec:
  selector:                          # Required for a Deployment; must match the template labels
    matchLabels: { app: api-with-secrets }
  template:
    metadata:
      labels: { app: api-with-secrets }
    spec:
      # Pull images from private registry
      imagePullSecrets:
        - name: registry-credentials
      containers:
        - name: api
          image: private-registry.mycompany.com/api:v1.0
          # Inject individual secret keys as env vars
          env:
            - name: DB_USERNAME
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: username
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: password
          # Mount secrets as files (better for certs, complex configs)
          volumeMounts:
            - name: tls-certs
              mountPath: /etc/ssl/app
              readOnly: true
            - name: db-secret-volume
              mountPath: /etc/secrets/db
              readOnly: true
      volumes:
        - name: tls-certs
          secret:
            secretName: api-tls-secret
            defaultMode: 0400        # Read-only by owner only (security!)
        - name: db-secret-volume
          secret:
            secretName: db-credentials
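A small gotcha with defaultMode: 0400: the value is octal, which YAML accepts, but JSON has no octal notation. If the same manifest is written or patched as JSON, the mode has to be given in decimal, and kubectl get -o yaml will also show the decimal form. A quick way to convert, as a sketch:

```shell
# printf treats a leading 0 as octal, so this prints the decimal value
# to use in JSON manifests for file mode 0400.
mode_decimal=$(printf '%d' 0400)
echo "$mode_decimal"    # 256
```

So defaultMode: 0400 in YAML and "defaultMode": 256 in JSON describe the same permissions.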
Production Secret Management with External Vault
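# Using AWS Secrets Manager via Secrets Store CSI Driver
# Secrets are mounted straight from the external store. They are written to etcd only
# if you enable sync to Kubernetes Secrets, which this example does (syncSecret + secretObjects).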
# 1. Install the Secrets Store CSI Driver
helm repo add secrets-store-csi-driver https://kubernetes-sigs.github.io/secrets-store-csi-driver/charts
helm install csi-secrets-store secrets-store-csi-driver/secrets-store-csi-driver \
--namespace kube-system \
--set syncSecret.enabled=true # Sync to Kubernetes Secrets as well
# 2. Install the AWS Provider
kubectl apply -f https://raw.githubusercontent.com/aws/secrets-store-csi-driver-provider-aws/main/deployment/aws-provider-installer.yaml
# 3. Create a SecretProviderClass
cat > aws-secret-provider.yaml <<EOF
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: aws-secrets
  namespace: production
spec:
  provider: aws
  parameters:
    objects: |
      - objectName: "prod/myapp/db-credentials"   # AWS Secrets Manager secret name
        objectType: "secretsmanager"
        jmesPath:                                  # Extract individual fields
          - path: username
            objectAlias: db-username
          - path: password
            objectAlias: db-password
  secretObjects:                                   # Sync to Kubernetes Secret
    - secretName: db-credentials-synced
      type: Opaque
      data:
        - objectName: db-username
          key: username
        - objectName: db-password
          key: password
EOF
kubectl apply -f aws-secret-provider.yaml
# 4. Use in a pod — secrets are mounted from AWS Secrets Manager
# The pod must have an IAM role (via IRSA on EKS) to access the secret
apiVersion: v1
kind: Pod
metadata:
  name: secure-api
  namespace: production
spec:
  serviceAccountName: api-service-account   # Must have IAM role with secrets access
  containers:
    - name: api
      image: mycompany/api:v1.0
      volumeMounts:
        - name: aws-secrets
          mountPath: /mnt/secrets
          readOnly: true
      env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials-synced   # Synced Kubernetes Secret
              key: password
  volumes:
    - name: aws-secrets
      csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: aws-secrets
Multi-Environment Config Management: Dev, Staging, Production
Scenario: HealthApp runs the same application in 3 environments. Each environment has different database hosts, log levels, feature flags, and API keys. Without ConfigMaps and Secrets, the team was baking environment-specific values into Docker images — a maintenance nightmare. You need to externalize all config so the same image runs in all environments.
# Create namespace per environment
kubectl create namespace dev
kubectl create namespace staging
kubectl create namespace production
# Development ConfigMap — verbose logging, local services
cat > dev-config.yaml <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: dev
data:
  ENVIRONMENT: "development"
  LOG_LEVEL: "debug"
  DB_HOST: "postgres-dev.dev.svc.cluster.local"
  DB_NAME: "healthapp_dev"
  CACHE_ENABLED: "false"
  FEATURE_NEW_DASHBOARD: "true"    # Feature flags: enabled in dev for testing
  FEATURE_AI_DIAGNOSIS: "true"
  API_RATE_LIMIT: "10000"          # High limit in dev (no real traffic)
EOF
# Production ConfigMap — minimal logging, production services
cat > prod-config.yaml <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: production
data:
  ENVIRONMENT: "production"
  LOG_LEVEL: "warn"                # Only warnings and errors in production
  DB_HOST: "postgres-prod.production.svc.cluster.local"
  DB_NAME: "healthapp_prod"
  CACHE_ENABLED: "true"
  FEATURE_NEW_DASHBOARD: "false"   # Not yet released to production
  FEATURE_AI_DIAGNOSIS: "false"
  API_RATE_LIMIT: "1000"           # Strict rate limiting in production
EOF
kubectl apply -f dev-config.yaml
kubectl apply -f prod-config.yaml
# Create different secrets per environment
kubectl create secret generic app-secrets -n dev \
--from-literal=DB_PASSWORD=devpassword123 \
--from-literal=JWT_SECRET=dev-jwt-secret-key-not-secure \
--from-literal=STRIPE_API_KEY=sk_test_123456789
# Quote the substitutions and strip newlines: openssl wraps long base64 output across lines
kubectl create secret generic app-secrets -n production \
  --from-literal=DB_PASSWORD="$(openssl rand -base64 32)" \
  --from-literal=JWT_SECRET="$(openssl rand -base64 64 | tr -d '\n')" \
  --from-literal=STRIPE_API_KEY=sk_live_actualproductionkey
# The SAME deployment YAML works in all environments
# It references app-config and app-secrets which exist in every namespace
cat > app-deployment.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: healthapp
spec:
  replicas: 1
  selector:
    matchLabels: { app: healthapp }
  template:
    metadata:
      labels: { app: healthapp }
    spec:
      containers:
        - name: healthapp
          image: healthapp/api:v2.5.0   # SAME image in all environments
          envFrom:
            - configMapRef:
                name: app-config        # Picks up the namespace-specific ConfigMap
            - secretRef:
                name: app-secrets       # Picks up the namespace-specific Secret
          resources:
            requests: { memory: "128Mi", cpu: "100m" }
            limits: { memory: "512Mi", cpu: "1000m" }
EOF
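# Deploy same YAML to different environments
# (staging needs its own app-config ConfigMap and app-secrets Secret first,
#  created the same way as for dev; otherwise its pods will not start)
kubectl apply -f app-deployment.yaml -n dev
kubectl apply -f app-deployment.yaml -n staging
kubectl apply -f app-deployment.yaml -n production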
# Verify each environment has the right config
kubectl exec -n dev deploy/healthapp -- env | grep ENVIRONMENT
# ENVIRONMENT=development
kubectl exec -n production deploy/healthapp -- env | grep ENVIRONMENT
# ENVIRONMENT=production
# Verify secrets are different (check DB_PASSWORD length only, never print secrets!)
kubectl exec -n dev deploy/healthapp -- sh -c 'echo ${#DB_PASSWORD} chars'
kubectl exec -n production deploy/healthapp -- sh -c 'echo ${#DB_PASSWORD} chars'
Interview Questions — Chapter 7
- What is the difference between ConfigMap and Secret? When would you use each?
- Kubernetes Secrets are “base64 encoded, not encrypted.” What does this mean in practice and how do you secure them properly?
- A pod’s ConfigMap is updated. Does the running pod automatically see the new values? Does the answer differ between env vars and volume mounts?
- What is the Secrets Store CSI Driver and why would you use it instead of native Kubernetes Secrets?
- How do you manage environment-specific configuration (dev/staging/prod) without duplicating Deployment YAML files?
- What happens if a pod references a ConfigMap or Secret that does not exist? What status does the pod show?
- Explain the security risk of using envFrom: secretRef versus injecting individual secret keys.
Volumes and Storage
Container filesystems are ephemeral — when a container dies, all data written to its filesystem is gone. For stateful applications like databases, message queues, and file servers, you need persistent storage that outlives pods and containers. Kubernetes storage is one of the most complex topics, but mastering it unlocks the ability to run any workload on Kubernetes.
The Storage Hierarchy: PV → PVC → StorageClass
StorageClass (defines HOW storage is provisioned — EBS, NFS, etc.)
↓ (dynamic provisioning)
PersistentVolume [PV] (represents actual storage — 50Gi EBS volume)
↓ (bound to)
PersistentVolumeClaim [PVC] (pod's request for storage — "I need 10Gi")
↓ (mounted by)
Pod → Container (sees the storage as a regular filesystem path)
FLOW:
1. Admin creates StorageClass (or cloud provider does automatically)
2. Developer creates PVC ("give me 20Gi ReadWriteOnce")
3. StorageClass dynamically provisions a real volume (EBS, GCP PD, etc.)
4. PV is created and bound to the PVC
5. Pod mounts the PVC — container sees /data as a persistent directory
Volume Types Quick Reference
emptyDir: Temporary directory shared between containers in a pod. Deleted when the pod terminates. Use for: scratch space, caching, sharing files between sidecar containers.
hostPath: Mounts a path from the host node’s filesystem into the pod. Dangerous (gives the pod access to the node). Use for: DaemonSets that need to read node logs, container runtime sockets. Never for user workloads.
persistentVolumeClaim: The standard way to request persistent storage. Decouples the pod spec from the storage implementation. The same YAML works on AWS (EBS), GCP (PD), or on-prem (NFS) by just changing the StorageClass.
configMap / secret: Mount ConfigMap or Secret data as files. Covered in Chapter 7. These are the volume types most stateless application deployments actually use.
nfs: Network File System. Supports ReadWriteMany, so multiple pods on different nodes can mount it simultaneously. Good for: shared file storage, legacy apps that need a shared filesystem. Performance is lower than block storage.
csi: The Container Storage Interface, the modern extensible storage plugin interface. AWS EBS CSI, Ceph CSI, Portworx all use CSI. Replaces the old in-tree volume plugins. Install CSI drivers as separate deployments in your cluster.
StorageClass, PV, PVC — Complete Production Example
# =============================================
# 1. StorageClass — defines storage provider and parameters
# =============================================
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"   # Used when PVC has no storageClassName
provisioner: ebs.csi.aws.com       # AWS EBS CSI Driver
parameters:
  type: gp3                        # GP3 SSD: good performance/cost balance on AWS
  iops: "3000"                     # Provisioned IOPS
  throughput: "125"                # MB/s throughput
  encrypted: "true"                # Encrypt volume at rest (compliance!)
reclaimPolicy: Retain              # When PVC is deleted: Retain (keep data) or Delete
allowVolumeExpansion: true         # Allow resizing PVCs without downtime
volumeBindingMode: WaitForFirstConsumer   # Only provision when a pod actually claims it
                                          # Ensures volume is created in same AZ as pod
---
# NFS StorageClass for shared ReadWriteMany storage
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-shared
provisioner: nfs.csi.k8s.io
parameters:
  server: nfs.mycompany.internal
  share: /exports/k8s
reclaimPolicy: Retain
allowVolumeExpansion: true
---
# =============================================
# 2. PersistentVolumeClaim — "I need 50Gi of fast SSD"
# =============================================
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data-pvc
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce        # RWO: mounted by one node at a time (block storage)
    # - ReadWriteMany      # RWX: multiple nodes simultaneously (NFS/Ceph)
    # - ReadOnlyMany       # ROX: multiple nodes, read-only
  storageClassName: fast-ssd   # Which StorageClass to use
  resources:
    requests:
      storage: 50Gi        # Request 50 gibibytes
---
# =============================================
# 3. Check PVC status
# =============================================
kubectl get pvc -n production
# NAME                STATUS   VOLUME             CAPACITY   ACCESS MODES   STORAGECLASS   AGE
# postgres-data-pvc   Bound    pvc-3f4a1b2c-...   50Gi       RWO            fast-ssd       2m
# A Bound PVC means a real EBS volume has been provisioned and bound to the claim.
# A Pending PVC means: StorageClass not found, insufficient quota, an AZ issue, or
# (with volumeBindingMode: WaitForFirstConsumer) that no pod has mounted the claim yet.
# =============================================
# 4. Use the PVC in a Pod
# =============================================
apiVersion: apps/v1
kind: StatefulSet                  # Use StatefulSet for databases, not Deployment
metadata:
  name: postgres
  namespace: production
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels: { app: postgres }
  template:
    metadata:
      labels: { app: postgres }
    spec:
      securityContext:
        fsGroup: 999               # postgres user GID; ensures postgres can write to volume
      containers:
        - name: postgres
          image: postgres:15.4
          env:
            - name: POSTGRES_DB
              value: appdb
            - name: POSTGRES_USER
              value: appuser
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: password
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata   # Data dir inside mount point
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: postgres-storage
              mountPath: /var/lib/postgresql/data
          resources:
            requests: { memory: "512Mi", cpu: "500m" }
            limits: { memory: "2Gi", cpu: "2000m" }
          # PostgreSQL specific liveness check
          livenessProbe:
            exec:
              command: ["pg_isready", "-U", "appuser", "-d", "appdb"]
            initialDelaySeconds: 30
            periodSeconds: 10
  # volumeClaimTemplates: StatefulSet creates one PVC per replica automatically
  volumeClaimTemplates:
    - metadata:
        name: postgres-storage
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 50Gi
# StatefulSet names: postgres-0 (pod) → postgres-storage-postgres-0 (PVC)
# If you scale to 3 replicas, you get postgres-0, postgres-1, postgres-2
# Each with their own dedicated PVC
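The naming rule in the comments above can be sketched as a small shell helper. This is only an illustration: pvc_names is a hypothetical function (not a kubectl feature), and it assumes the default pattern <volumeClaimTemplate name>-<StatefulSet name>-<ordinal>.

```shell
# Hypothetical helper: list the PVC names a StatefulSet would create.
# Pattern: <volumeClaimTemplate name>-<StatefulSet name>-<ordinal>
pvc_names() (
  template=$1
  statefulset=$2
  replicas=$3
  i=0
  while [ "$i" -lt "$replicas" ]; do
    printf '%s-%s-%s\n' "$template" "$statefulset" "$i"
    i=$((i + 1))
  done
)

# Three replicas of the "postgres" StatefulSet with template "postgres-storage"
pvc_names postgres-storage postgres 3
```

Knowing the pattern matters in practice: it is the name you pass to kubectl get pvc or kubectl patch pvc when you inspect or resize the volume of one specific replica.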
StatefulSets vs Deployments for Databases
| Feature | Deployment | StatefulSet |
|---|---|---|
| Pod names | Random (api-7df9c-xkp2z) | Stable ordinal (postgres-0, postgres-1) |
| Storage | Shared PVC (if any) | Dedicated PVC per pod (via volumeClaimTemplates) |
| Scaling order | Parallel, random | Sequential (0→1→2 up, 2→1→0 down) |
| DNS | Service DNS only | Per-pod DNS: postgres-0.postgres.ns.svc.cluster.local |
| Use case | Stateless apps (APIs, web servers) | Databases, Kafka, ZooKeeper, Elasticsearch |
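The per-pod DNS row in the table follows a fixed pattern, which the sketch below spells out. pod_dns is a hypothetical helper written for this illustration, and it assumes the cluster uses the default cluster.local domain.

```shell
# Hypothetical helper: build the stable DNS name of one StatefulSet pod.
# Pattern: <statefulset>-<ordinal>.<headless service>.<namespace>.svc.<cluster domain>
# Assumption: the cluster domain is the default "cluster.local".
pod_dns() (
  statefulset=$1
  ordinal=$2
  service=$3
  namespace=$4
  printf '%s-%s.%s.%s.svc.cluster.local\n' "$statefulset" "$ordinal" "$service" "$namespace"
)

# First replica of the "postgres" StatefulSet behind the "postgres" headless Service
pod_dns postgres 0 postgres production
```

These names only resolve when the StatefulSet's serviceName points at an existing headless Service (clusterIP: None).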
Production PostgreSQL + Redis StatefulSet with Automated Backups
Scenario: DataVault Inc. needs to run PostgreSQL for primary storage and Redis for caching on Kubernetes. Both need persistent storage, automated daily backups to S3, and the ability to survive node failures without losing data. You also need to resize the PostgreSQL volume after initial deployment when the data grows.
# 1. PostgreSQL StatefulSet with backup CronJob
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: postgresql
namespace: datavault
spec:
serviceName: postgresql
replicas: 1
selector:
matchLabels: { app: postgresql }
template:
metadata:
labels: { app: postgresql }
spec:
initContainers:
- name: set-permissions
image: busybox
# The alpine-based postgres image runs as UID/GID 70 (the Debian-based image uses 999)
command: ["sh", "-c", "chown -R 70:70 /var/lib/postgresql/data"]
volumeMounts:
- name: data
mountPath: /var/lib/postgresql/data
containers:
- name: postgresql
image: postgres:15.4-alpine
env:
- name: POSTGRES_DB
value: datavault
- name: POSTGRES_USER
value: dvadmin
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: postgres-secret
key: password
- name: PGDATA
value: /var/lib/postgresql/data/pgdata
ports:
- containerPort: 5432
volumeMounts:
- name: data
mountPath: /var/lib/postgresql/data
resources:
requests: { memory: "1Gi", cpu: "500m" }
limits: { memory: "4Gi", cpu: "2000m" }
livenessProbe:
exec:
command: ["pg_isready", "-U", "dvadmin", "-d", "datavault"]
initialDelaySeconds: 30
periodSeconds: 10
volumeClaimTemplates:
- metadata:
name: data
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: fast-ssd
resources:
requests:
storage: 100Gi
---
# Headless service for StatefulSet DNS
apiVersion: v1
kind: Service
metadata:
name: postgresql
namespace: datavault
spec:
clusterIP: None # Headless — enables DNS for each pod
selector: { app: postgresql }
ports:
- port: 5432
EOF
# 2. CronJob for daily database backups to S3
kubectl apply -f - <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
name: postgres-backup
namespace: datavault
spec:
schedule: "0 2 * * *" # 2:00 AM UTC every day
concurrencyPolicy: Forbid # Don't run overlapping backups
successfulJobsHistoryLimit: 3
failedJobsHistoryLimit: 3
jobTemplate:
spec:
template:
spec:
restartPolicy: OnFailure
containers:
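- name: backup
# Assumption: the image must contain both pg_dump and the AWS CLI.
# The stock postgres image does not ship the AWS CLI, so build a small custom image that has both.
image: postgres:15.4-alpine
command:
- /bin/sh
- -c
- |
set -eo pipefail # Fail the Job if pg_dump, gzip, or the upload fails
BACKUP_FILE="backup-$(date +%Y%m%d-%H%M%S).sql.gz"
echo "Starting backup: $BACKUP_FILE"
pg_dump -h postgresql.datavault.svc.cluster.local \
-U dvadmin -d datavault | gzip > "/tmp/$BACKUP_FILE"
# Upload to S3 using AWS CLI (needs an IAM role via IRSA, i.e. a serviceAccountName on this pod spec)
aws s3 cp "/tmp/$BACKUP_FILE" "s3://datavault-backups/postgres/$BACKUP_FILE"
echo "Backup complete: s3://datavault-backups/postgres/$BACKUP_FILE"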
env:
- name: PGPASSWORD
valueFrom:
secretKeyRef:
name: postgres-secret
key: password
EOF
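# 3. Expand the PostgreSQL volume from 100Gi to 200Gi
# (online with no downtime if the CSI driver supports it; a PVC can grow but never shrink)
# First, verify the StorageClass supports expansion
kubectl get storageclass fast-ssd -o jsonpath='{.allowVolumeExpansion}'
# true
# Patch the PVC directly (volumeClaimTemplates are immutable, so the StatefulSet template still says 100Gi)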
kubectl patch pvc data-postgresql-0 -n datavault \
--type='merge' \
-p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'
# Watch the resize happen
kubectl get pvc data-postgresql-0 -n datavault -w
# NAME STATUS VOLUME CAPACITY ACCESS MODES
# data-postgresql-0 Bound pvc-... 100Gi RWO
# data-postgresql-0 Bound pvc-... 200Gi RWO ← resized!
# 4. Redis StatefulSet for caching
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: redis
namespace: datavault
spec:
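serviceName: redis # Requires a headless Service named "redis" (same pattern as the postgresql Service above); it is not created here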
replicas: 1
selector:
matchLabels: { app: redis }
template:
metadata:
labels: { app: redis }
spec:
containers:
- name: redis
image: redis:7.2-alpine
command: ["redis-server", "--appendonly", "yes", "--requirepass", "$(REDIS_PASSWORD)"]
env:
- name: REDIS_PASSWORD
valueFrom:
secretKeyRef:
name: redis-secret
key: password
ports:
- containerPort: 6379
volumeMounts:
- name: redis-data
mountPath: /data
resources:
requests: { memory: "256Mi", cpu: "100m" }
limits: { memory: "1Gi", cpu: "500m" }
volumeClaimTemplates:
- metadata:
name: redis-data
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: fast-ssd
resources:
requests:
storage: 10Gi
EOF
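Step 3 above patches the PVC to a larger size. Because Kubernetes rejects any attempt to shrink a PVC, a pre-flight check before patching can save a failed command. The helper below is a hypothetical sketch, not part of kubectl, and it assumes both sizes are whole numbers with a Gi suffix.

```shell
# Hypothetical pre-flight check before patching a PVC size.
# Assumption: both sizes are whole numbers with a Gi suffix (e.g. 100Gi).
gi_value() (
  # Strip the trailing "Gi" so the sizes can be compared as integers
  printf '%s\n' "${1%Gi}"
)

can_expand() (
  current=$(gi_value "$1")
  requested=$(gi_value "$2")
  # Succeeds only when the requested size is strictly larger
  [ "$requested" -gt "$current" ]
)

if can_expand 100Gi 200Gi; then
  echo "OK: 100Gi -> 200Gi is an expansion"
fi
if ! can_expand 200Gi 100Gi; then
  echo "Refused: a PVC cannot be shrunk"
fi
```

In a real script you would read the current size from the cluster (for example with kubectl get pvc and a jsonpath on .spec.resources.requests.storage) instead of hard-coding it.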
Interview Questions — Chapter 8
- What is the difference between a PersistentVolume and a PersistentVolumeClaim? Who creates each?
- What do the reclaimPolicy values Retain and Delete mean on a StorageClass?
- Explain the three access modes: ReadWriteOnce, ReadWriteMany, ReadOnlyMany. Which cloud storage types support each?
- Why do we use StatefulSet for databases instead of a Deployment?
- A PVC is stuck in Pending. What are three common causes?
- How do volumeClaimTemplates in a StatefulSet differ from a regular volume in a Deployment?
- What is the Container Storage Interface (CSI) and why was it introduced?
- How do you expand a PVC after it has been created? What must be true for this to work?
Ingress and Load Balancing
If you create a LoadBalancer Service for every microservice, each one gets its own cloud load balancer, at roughly $15–20/month each. With 50 microservices, that is $750–1,000/month just for load balancers, not counting data transfer costs. Ingress solves this: one load balancer, one external IP, routing HTTP traffic to dozens of services based on hostname and path.
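The arithmetic behind that comparison is simple enough to check in a shell. The sketch below uses the $15–20 per load balancer figure quoted above as an assumption; it is not a current price list, and monthly_lb_cost is a helper invented for this example.

```shell
# Monthly load balancer cost = number of load balancers x price per load balancer.
# Assumption: the 15-20 USD range is the figure from the text, not a real price list.
monthly_lb_cost() (
  load_balancers=$1
  price_per_month=$2
  echo $((load_balancers * price_per_month))
)

echo "One LoadBalancer per service (50 services): \$$(monthly_lb_cost 50 15)-\$$(monthly_lb_cost 50 20) per month"
echo "One shared Ingress load balancer:           \$$(monthly_lb_cost 1 15)-\$$(monthly_lb_cost 1 20) per month"
```

The gap grows linearly with the number of services, which is why the single-entry-point design pays off quickly.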
How It Works: An Ingress Controller (Nginx, Traefik, AWS ALB) is deployed in your cluster as a pod. It watches for Ingress resources and programs itself to route traffic. One LoadBalancer Service points to the Ingress Controller. All your application services are ClusterIP. The Ingress Controller routes traffic from the single external IP to the right service based on the rules you define.
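Conceptually, the routing decision is a lookup on host first and path second. The sketch below models that decision in plain shell. The hosts and service names are hypothetical, and a real controller compiles Ingress rules into its proxy configuration rather than running anything like this.

```shell
# Conceptual model of Ingress routing: match the host, then the path prefix.
# Hosts and service names are made up for illustration.
route() (
  host=$1
  path=$2
  case "$host" in
    app.example.com)
      echo frontend-service ;;
    api.example.com)
      case "$path" in
        /users|/users/*)   echo user-service ;;
        /orders|/orders/*) echo order-service ;;
        *)                 echo api-gateway ;;   # catch-all for this host
      esac ;;
    *)
      echo default-backend ;;                    # no rule matched the host
  esac
)

route api.example.com /users/42
route api.example.com /health
route app.example.com /
```

Note that /usersx falls through to the catch-all: a prefix rule matches whole path segments, not arbitrary string prefixes.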
Installing Nginx Ingress Controller
# Install Nginx Ingress Controller via Helm (most common approach)
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
# replicaCount=2 gives HA, metrics.enabled exposes Prometheus metrics,
# use-gzip enables gzip compression, proxy-body-size sets the max upload size.
# (A comment after a trailing backslash breaks the line continuation, so the notes live up here.)
helm install ingress-nginx ingress-nginx/ingress-nginx \
--namespace ingress-nginx \
--create-namespace \
--set controller.replicaCount=2 \
--set controller.service.type=LoadBalancer \
--set controller.metrics.enabled=true \
--set controller.config.use-gzip="true" \
--set controller.config.proxy-body-size="50m"
# Verify installation
kubectl get pods -n ingress-nginx
# ingress-nginx-controller-5c8d66c76d-2xhkp 1/1 Running 0 2m
# ingress-nginx-controller-5c8d66c76d-7klmn 1/1 Running 0 2m
# Get the external IP
kubectl get svc -n ingress-nginx
# ingress-nginx-controller LoadBalancer 10.0.0.100 34.120.45.67 80:30080/TCP,443:30443/TCP 2m
# 34.120.45.67 is your single external IP for ALL services
Ingress Rules: Path and Host Based Routing
# =============================================
# 1. Path-Based Routing — same host, different paths
# api.example.com/users → user-service
# api.example.com/orders → order-service
# api.example.com/ → api-gateway
# =============================================
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: api-ingress
namespace: production
annotations:
nginx.ingress.kubernetes.io/rewrite-target: /$2 # Strip the path prefix when forwarding
nginx.ingress.kubernetes.io/ssl-redirect: "true" # Always redirect HTTP to HTTPS
nginx.ingress.kubernetes.io/proxy-connect-timeout: "10"
nginx.ingress.kubernetes.io/proxy-read-timeout: "60"
nginx.ingress.kubernetes.io/proxy-body-size: "10m"
spec:
ingressClassName: nginx # Which Ingress Controller handles this
tls:
- hosts:
- api.example.com
secretName: api-tls-secret # TLS cert stored as a Kubernetes Secret
rules:
- host: api.example.com
http:
paths:
- path: /users(/|$)(.*) # Regex match
pathType: ImplementationSpecific
backend:
service:
name: user-service
port: { number: 80 }
- path: /orders(/|$)(.*)
pathType: ImplementationSpecific
backend:
service:
name: order-service
port: { number: 80 }
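- path: /()(.*) # Catch-all. The empty first group keeps $2 = the full path, so rewrite-target /$2 forwards it unchanged
pathType: ImplementationSpecific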
backend:
service:
name: api-gateway
port: { number: 80 }
---
# =============================================
# 2. Host-Based Routing — different hostnames → different services
# app.example.com → frontend
# api.example.com → backend API
# admin.example.com → admin panel (with auth)
# =============================================
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: multi-host-ingress
namespace: production
annotations:
nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
ingressClassName: nginx
tls:
- hosts: [app.example.com, api.example.com, admin.example.com]
secretName: wildcard-tls-secret # *.example.com wildcard cert
rules:
- host: app.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: frontend-service
port: { number: 80 }
- host: api.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: backend-api-service
port: { number: 3000 }
- host: admin.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: admin-service
port: { number: 8080 }
---
# =============================================
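# 3. Ingress with Basic Auth (quick access control)
# Note: this rule covers the same host and path as the admin.example.com rule in
# multi-host-ingress above. Keep only one of them; ingress-nginx does not merge duplicates.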
# =============================================
# Create auth secret first
# htpasswd -c auth admin → enter password when prompted
# kubectl create secret generic basic-auth --from-file=auth -n production
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: admin-protected-ingress
namespace: production
annotations:
nginx.ingress.kubernetes.io/auth-type: basic
nginx.ingress.kubernetes.io/auth-secret: basic-auth
nginx.ingress.kubernetes.io/auth-realm: "Admin Area — Authorized Access Only"
spec:
ingressClassName: nginx
rules:
- host: admin.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: admin-service
port: { number: 8080 }
TLS with cert-manager (Automated SSL)
# cert-manager automatically provisions and renews TLS certificates from Let's Encrypt
# No more manual cert renewals!
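# Install cert-manager
# (installCRDs=true below is the older flag; recent chart versions prefer crds.enabled=true)
helm repo add jetstack https://charts.jetstack.io
helm repo update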
helm install cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--set installCRDs=true
# Create a ClusterIssuer (Let's Encrypt production)
cat > letsencrypt-issuer.yaml <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-prod
spec:
acme:
server: https://acme-v02.api.letsencrypt.org/directory
email: devops@example.com # Your email for expiry notifications
privateKeySecretRef:
name: letsencrypt-prod-key
solvers:
- http01: # HTTP-01 challenge (requires port 80 accessible)
ingress:
class: nginx
EOF
kubectl apply -f letsencrypt-issuer.yaml
# Now create Ingress with automatic TLS
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: auto-tls-ingress
namespace: production
annotations:
cert-manager.io/cluster-issuer: letsencrypt-prod # This annotation triggers cert-manager!
nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
ingressClassName: nginx
tls:
- hosts:
- app.example.com
secretName: app-tls-cert # cert-manager creates and fills this Secret
rules:
- host: app.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: frontend-service
port: { number: 80 }
# Watch cert-manager provision the certificate
kubectl get certificate -n production -w
# NAME READY SECRET AGE
# app-tls-cert False app-tls-cert 10s ← provisioning
# app-tls-cert True app-tls-cert 45s ← certificate issued!
# Check details
kubectl describe certificate app-tls-cert -n production
Advanced Ingress: Rate Limiting and Canary Deployments
# Rate limiting annotation — prevent API abuse
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: rate-limited-api
namespace: production
annotations:
nginx.ingress.kubernetes.io/limit-rps: "10" # Max 10 requests/second per IP
nginx.ingress.kubernetes.io/limit-burst-multiplier: "5" # Burst size = limit-rps x multiplier = 50 requests
nginx.ingress.kubernetes.io/limit-connections: "20" # Max 20 concurrent connections per IP
spec:
ingressClassName: nginx
rules:
- host: api.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: api-service
port: { number: 80 }
---
# Canary deployment — send 10% of traffic to new version
# Useful for testing new releases with real traffic before full rollout
# STEP 1: Main Ingress (routes 90% traffic to stable service)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: main-ingress
namespace: production
spec:
ingressClassName: nginx
rules:
- host: api.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: api-service-v1 # Stable version
port: { number: 80 }
---
# STEP 2: Canary Ingress (routes 10% traffic to new version)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: canary-ingress
namespace: production
annotations:
nginx.ingress.kubernetes.io/canary: "true" # This is a canary!
nginx.ingress.kubernetes.io/canary-weight: "10" # 10% of traffic
spec:
ingressClassName: nginx
rules:
- host: api.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: api-service-v2 # New version
port: { number: 80 }
# Monitor error rates in both services:
# If v2 shows higher error rates, delete the canary ingress to rollback instantly
# If v2 is healthy, gradually increase canary-weight to 25, 50, 100,
# then point main-ingress at api-service-v2 and remove the canary Ingress and v1
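To get a feel for what each canary-weight step means in request counts, the sketch below computes the expected split. It is a back-of-the-envelope model only: canary_split is a hypothetical helper, and because the controller picks a backend per request at random, real counts will only approximate these numbers.

```shell
# Expected traffic split for a given canary-weight (0-100).
# Assumption: the weight is a percentage of all requests to this host and path.
canary_split() (
  total=$1
  weight=$2
  canary=$((total * weight / 100))
  stable=$((total - canary))
  printf 'stable=%s canary=%s\n' "$stable" "$canary"
)

# Walk through the rollout steps for 1000 requests
for weight in 10 25 50 100; do
  printf 'weight %3s%%: ' "$weight"
  canary_split 1000 "$weight"
done
```

At low weights the canary sees few requests, so give each step enough time to collect a meaningful error-rate sample before increasing the weight.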
Full-Stack SaaS App: Multi-Tenant Ingress with Auto TLS
Scenario: CloudDesk SaaS serves 50 enterprise customers. Each gets their own subdomain: acme.clouddesk.io, globex.clouddesk.io, etc. You need: wildcard TLS cert, host-based routing to tenant-specific services, rate limiting to prevent abuse, and a canary pipeline to test new API versions on 5% of traffic before full rollout.
# 1. Install Nginx Ingress + cert-manager
helm install ingress-nginx ingress-nginx/ingress-nginx -n ingress-nginx --create-namespace
helm install cert-manager jetstack/cert-manager -n cert-manager --create-namespace --set installCRDs=true
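# 2. Get the external address and set the DNS wildcard record
# Most clouds report .ip; AWS load balancers report .hostname instead (use a CNAME or Route53 alias record)
EXTERNAL_ADDR=$(kubectl get svc ingress-nginx-controller -n ingress-nginx \
-o jsonpath='{.status.loadBalancer.ingress[0].ip}{.status.loadBalancer.ingress[0].hostname}')
echo "Create a DNS wildcard record: *.clouddesk.io -> $EXTERNAL_ADDR"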
# 3. Create Let's Encrypt ClusterIssuer with DNS-01 challenge for wildcard cert
kubectl apply -f - <<'EOF'
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-dns
spec:
acme:
server: https://acme-v02.api.letsencrypt.org/directory
email: ops@clouddesk.io
privateKeySecretRef:
name: le-dns-key
solvers:
- dns01: # DNS-01 required for wildcard certs
route53:
region: us-east-1
hostedZoneID: Z1234EXAMPLE # Your Route53 Hosted Zone ID
EOF
# 4. Request wildcard certificate
kubectl apply -f - <<'EOF'
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: clouddesk-wildcard-cert
namespace: production # Same namespace as the Ingress below: an Ingress can only use TLS Secrets from its own namespace
spec:
secretName: clouddesk-wildcard-tls
issuerRef:
name: letsencrypt-dns
kind: ClusterIssuer
dnsNames:
- "clouddesk.io"
- "*.clouddesk.io" # Wildcard covers all subdomains
EOF
# 5. Multi-tenant Ingress: each tenant gets their own subdomain
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: clouddesk-tenants
namespace: production
annotations:
nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/limit-rps: "100"
nginx.ingress.kubernetes.io/use-regex: "true"
spec:
ingressClassName: nginx
tls:
- hosts: ["*.clouddesk.io"]
secretName: clouddesk-wildcard-tls
rules:
- host: acme.clouddesk.io
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: tenant-acme-app
port: { number: 80 }
- host: globex.clouddesk.io
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: tenant-globex-app
port: { number: 80 }
# For smaller tenants sharing the same service, use path routing:
- host: app.clouddesk.io
http:
paths:
- path: /tenant/initech
pathType: Prefix
backend:
service:
name: shared-app-service
port: { number: 80 }
EOF
# 6. Verify
kubectl get ingress -n production
kubectl describe certificate clouddesk-wildcard-cert -n production
curl -sS -o /dev/null -w '%{http_code}\n' https://acme.clouddesk.io/health # Should print 200 (no -k, so the certificate is actually verified)
Interview Questions — Chapter 9
- Why is Ingress more cost-effective than creating a LoadBalancer Service per microservice?
- What is an Ingress Controller? Name three popular ones and one key difference between them.
- Explain the difference between path-based and host-based routing. When would you use each?
- How does cert-manager work with Let’s Encrypt to automatically issue and renew TLS certificates?
- What is the difference between HTTP-01 and DNS-01 challenge in Let’s Encrypt? When must you use DNS-01?
- How do you implement a canary deployment using Nginx Ingress annotations? What metrics would you monitor?
- An Ingress rule is defined but traffic isn’t reaching the backend service. Walk through your troubleshooting steps.
- What is the IngressClass resource and why was it introduced in Kubernetes 1.18?
Helm Package Manager
Deploying a microservice application to Kubernetes means writing and maintaining dozens of YAML files — Deployment, Service, Ingress, ConfigMap, Secret, HPA, PDB, ServiceAccount, Roles… As applications grow, raw YAML becomes unmanageable. Helm is the Kubernetes package manager. It bundles all related YAML files into a single versioned, configurable package called a Chart — the equivalent of an apt package or npm module for Kubernetes.
Helm Core Concepts
Chart: A Helm package. Contains all YAML templates plus a values file. Can be versioned and distributed via Helm repositories. Examples: the ingress-nginx chart, the PostgreSQL chart.
Values: The configuration file (values.yaml) that customizes a chart. You can override any value at install time. This is how the same chart deploys with 2 replicas in dev and 20 replicas in production.
Release: A specific deployment of a chart into a cluster. Installing the same chart twice (e.g., two PostgreSQL instances) creates two releases with separate names, each of which can be upgraded and rolled back independently.
Repository: A collection of charts, like the npm registry for Kubernetes. Popular repos: Bitnami (databases, common apps), ingress-nginx, jetstack (cert-manager). Search all at: artifacthub.io.
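The override behaviour of values follows one rule: when the same key is set more than once, the last definition wins. The sketch below models that rule on flat key=value lines. This is a deliberate simplification made for the example (real values files are nested YAML and Helm merges them itself), and merge_values is a hypothetical helper.

```shell
# Simplified model of Helm value precedence: the last definition of a key wins.
# Assumption: input is flat key=value lines with no "=" inside the values.
merge_values() {
  awk -F= '
    !($1 in value) { order[++count] = $1 }   # remember first-seen order of keys
    { value[$1] = $2 }                        # later lines overwrite earlier ones
    END { for (i = 1; i <= count; i++) print order[i] "=" value[order[i]] }
  '
}

# Chart defaults first, then a production override of replicaCount
merge_values <<'EOF'
replicaCount=2
image.tag=3.2.1
replicaCount=20
EOF
```

The same ordering applies on the command line: values files are applied left to right, and anything passed with --set is applied after them.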
Helm Essential Commands
# =============================================
# Installation and Repository Management
# =============================================
# Install Helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
helm version
# version.BuildInfo{Version:"v3.13.0", ...}
# Add repositories
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add stable https://charts.helm.sh/stable # Deprecated and archived; only needed for legacy charts
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
# Update repos (like apt-get update)
helm repo update
# Search for charts
helm search repo postgresql # Search the repos you have added locally
helm search hub redis # Search Artifact Hub (public)
# Show chart info and default values
helm show chart bitnami/postgresql # Chart metadata
helm show values bitnami/postgresql # All configurable values (very long!)
helm show values bitnami/postgresql | grep -A 5 "replicaCount" # Filter specific settings
# =============================================
# Installing Charts
# =============================================
# Basic install with default values
helm install my-postgres bitnami/postgresql \
--namespace database \
--create-namespace
# Install with custom values inline
helm install my-redis bitnami/redis \
--namespace cache \
--create-namespace \
--set auth.password=MyRedisPassword123 \
--set master.persistence.size=10Gi \
--set replica.replicaCount=2
# Install with custom values file (BEST PRACTICE — version controlled!)
helm install my-app ./my-chart \
--namespace production \
--values values.yaml \
--values values-production.yaml # Second file overrides first
# Dry run — see what would be deployed without actually deploying
helm install my-app ./my-chart \
--dry-run \
--values values.yaml | grep -A 20 "kind: Deployment"
# =============================================
# Managing Releases
# =============================================
# List all releases
helm list -A # All namespaces
helm list -n production # Specific namespace
# Get detailed status of a release
helm status my-postgres -n database
# Get the computed values used for a release
helm get values my-postgres -n database
# Get all rendered YAML for a release
helm get manifest my-postgres -n database
# Upgrade a release (applies changes to values or chart version)
helm upgrade my-postgres bitnami/postgresql \
--namespace database \
--set primary.resources.limits.memory=2Gi \
--reuse-values # Keep all previously set values, only override what you specify
# Rollback to previous revision
helm rollback my-postgres 1 -n database # Roll back to revision 1
# View revision history
helm history my-postgres -n database
# REVISION STATUS CHART APP VERSION DESCRIPTION
# 1 superseded postgresql-12.12.10 15.2.0 Install complete
# 2 deployed postgresql-12.13.0 15.3.0 Upgrade complete
# Uninstall (Helm 3 also deletes the release history; add --keep-history to retain it for a later rollback)
helm uninstall my-postgres -n database
Creating Your Own Helm Chart
# Generate chart scaffold
helm create myapp
# Creates this structure:
# myapp/
# Chart.yaml ← Chart metadata (name, version, description)
# values.yaml ← Default values
# templates/ ← YAML templates with Go templating
# deployment.yaml
# service.yaml
# ingress.yaml
# _helpers.tpl ← Reusable template functions
# charts/ ← Chart dependencies (sub-charts)
# =============================================
# Chart.yaml — Chart metadata
# =============================================
cat > myapp/Chart.yaml <<'EOF'
apiVersion: v2
name: myapp
description: A production-grade microservice Helm chart
type: application
version: 1.0.0 # Chart version (increment this when you change templates)
appVersion: "3.2.1" # Application version (the Docker image tag)
maintainers:
- name: DevOps Team
email: devops@company.com
dependencies:
- name: postgresql
version: "12.x.x"
repository: https://charts.bitnami.com/bitnami
condition: postgresql.enabled # Only include if postgresql.enabled=true in values
EOF
# =============================================
# values.yaml — Default values
# =============================================
cat > myapp/values.yaml <<'EOF'
replicaCount: 2
image:
repository: mycompany/myapp
tag: "3.2.1"
pullPolicy: IfNotPresent
service:
type: ClusterIP
port: 80
targetPort: 3000
ingress:
enabled: true
className: nginx
host: myapp.example.com
tls: true
certIssuer: letsencrypt-prod
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "256Mi"
cpu: "500m"
autoscaling:
enabled: false
minReplicas: 2
maxReplicas: 20
targetCPUUtilization: 60
env:
LOG_LEVEL: "info"
NODE_ENV: "production"
postgresql:
enabled: true # Enable the postgresql sub-chart dependency
auth:
database: myappdb
username: myappuser
existingSecret: db-credentials
affinity: {}
tolerations: []
nodeSelector: {}
EOF
# =============================================
# templates/deployment.yaml — with Go templating
# =============================================
cat > myapp/templates/deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "myapp.fullname" . }} {{/* Uses helper function */}}
namespace: {{ .Release.Namespace }}
labels:
{{- include "myapp.labels" . | nindent 4 }} {{/* Standard labels block */}}
spec:
{{- if not .Values.autoscaling.enabled }}
replicas: {{ .Values.replicaCount }} {{/* Value from values.yaml */}}
{{- end }}
selector:
matchLabels:
{{- include "myapp.selectorLabels" . | nindent 6 }}
template:
metadata:
labels:
{{- include "myapp.selectorLabels" . | nindent 8 }}
spec:
containers:
- name: {{ .Chart.Name }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- containerPort: {{ .Values.service.targetPort }}
env:
{{- range $key, $value := .Values.env }}
- name: {{ $key }}
value: {{ $value | quote }}
{{- end }}
{{- if .Values.postgresql.enabled }}
- name: DB_HOST
value: {{ .Release.Name }}-postgresql {{/* The sub-chart names its Service <release>-postgresql */}}
- name: DB_NAME
value: {{ .Values.postgresql.auth.database }}
{{- end }}
resources:
{{- toYaml .Values.resources | nindent 10 }}
EOF
# =============================================
# templates/ingress.yaml — conditional rendering
# =============================================
cat > myapp/templates/ingress.yaml <<'EOF'
{{- if .Values.ingress.enabled -}} {{/* Only create Ingress if ingress.enabled=true */}}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: {{ include "myapp.fullname" . }}
annotations:
{{- if .Values.ingress.tls }}
nginx.ingress.kubernetes.io/ssl-redirect: "true"
{{- end }}
{{- if .Values.ingress.certIssuer }}
cert-manager.io/cluster-issuer: {{ .Values.ingress.certIssuer }}
{{- end }}
spec:
ingressClassName: {{ .Values.ingress.className }}
{{- if .Values.ingress.tls }}
tls:
- hosts:
- {{ .Values.ingress.host }}
secretName: {{ include "myapp.fullname" . }}-tls
{{- end }}
rules:
- host: {{ .Values.ingress.host }}
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: {{ include "myapp.fullname" . }}
port:
number: {{ .Values.service.port }}
{{- end }}
EOF
Helm in CI/CD Pipelines
# =============================================
# GitHub Actions workflow for Helm deployment
# =============================================
# .github/workflows/deploy.yaml
name: Deploy to Kubernetes
on:
push:
branches: [main]
workflow_dispatch:
inputs:
environment:
description: 'Target environment'
required: true
default: 'staging'
type: choice
options: [staging, production]
jobs:
deploy:
runs-on: ubuntu-latest
permissions:
id-token: write # Required for OIDC when using role-to-assume
contents: read
steps:
- uses: actions/checkout@v4
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: arn:aws:iam::123456789:role/github-actions-role
aws-region: us-east-1
- name: Update kubeconfig for EKS
run: aws eks update-kubeconfig --name production-cluster --region us-east-1
- name: Set image tag from git commit
run: echo "IMAGE_TAG=${GITHUB_SHA::8}" >> $GITHUB_ENV
- name: Lint chart
run: helm lint ./helm/myapp
- name: Deploy to staging
if: github.event_name == 'push' || inputs.environment == 'staging'
run: |
# --wait blocks until all pods are ready, --timeout fails the step after 5 minutes,
# --atomic rolls back automatically on failure
helm upgrade --install myapp-staging ./helm/myapp \
--namespace staging \
--create-namespace \
--values ./helm/myapp/values.yaml \
--values ./helm/myapp/values-staging.yaml \
--set image.tag=${{ env.IMAGE_TAG }} \
--wait \
--timeout 5m \
--atomic
- name: Run smoke tests
run: |
# Use -i without -t: CI runners have no TTY
kubectl run smoke-test --image=curlimages/curl --rm -i --restart=Never \
-- curl -f http://myapp-staging.staging.svc.cluster.local/health
- name: Deploy to production
if: inputs.environment == 'production'
run: |
helm upgrade --install myapp ./helm/myapp \
--namespace production \
--values ./helm/myapp/values.yaml \
--values ./helm/myapp/values-production.yaml \
--set image.tag=${{ env.IMAGE_TAG }} \
--wait --timeout 10m --atomic
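The workflow derives the image tag with ${GITHUB_SHA::8}, which is bash-only substring syntax. If a step ever runs under a plain POSIX shell, the same result can be produced with cut. The sketch below assumes the tag is simply the first 8 characters of the commit SHA; short_sha is a hypothetical helper and the SHA is made up.

```shell
# POSIX-portable equivalent of the bash-only ${GITHUB_SHA::8} substring syntax.
short_sha() (
  printf '%s\n' "$1" | cut -c1-8
)

# Example commit SHA (made up for illustration)
IMAGE_TAG=$(short_sha 3f4a1b2c9d8e7f6a5b4c3d2e1f0a9b8c7d6e5f4a)
echo "image.tag=$IMAGE_TAG"
```

Tagging images by commit SHA keeps every deployment traceable to the exact source revision, which is what makes helm rollback meaningful.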
Build and Publish a Helm Chart for a Full Microservices Stack
Scenario: TaskFlow SaaS has 4 microservices: frontend (React), API (Node.js), worker (Python), and a PostgreSQL database. Currently deployed with 40+ raw YAML files that nobody dares to touch. You will package the entire stack into a single Helm chart, create environment-specific values files, publish it to GitHub Container Registry, and deploy it with a single command.
# 1. Create the umbrella chart
helm create taskflow
cd taskflow
# 2. Update Chart.yaml with sub-chart dependencies
cat > Chart.yaml <<'EOF'
apiVersion: v2
name: taskflow
description: TaskFlow SaaS — complete application stack
version: 2.0.0
appVersion: "4.1.0"
dependencies:
- name: postgresql
version: "12.12.10"
repository: https://charts.bitnami.com/bitnami
condition: postgresql.enabled
- name: redis
version: "18.2.0"
repository: https://charts.bitnami.com/bitnami
condition: redis.enabled
EOF
# 3. Create environment-specific values files
cat > values-staging.yaml <<'EOF'
replicaCount: 1
image:
tag: "latest"
resources:
requests: { memory: "64Mi", cpu: "50m" }
limits: { memory: "128Mi", cpu: "200m" }
ingress:
host: staging.taskflow.io
postgresql:
enabled: true
primary:
persistence:
size: 5Gi
redis:
enabled: true
master:
persistence:
size: 1Gi
EOF
cat > values-production.yaml <<'EOF'
replicaCount: 5
image:
tag: "4.1.0" # Pinned version in production!
resources:
requests: { memory: "512Mi", cpu: "500m" }
limits: { memory: "1Gi", cpu: "1000m" }
ingress:
host: app.taskflow.io
tls: true
certIssuer: letsencrypt-prod
autoscaling:
enabled: true
minReplicas: 5
maxReplicas: 100
targetCPUUtilization: 50
postgresql:
enabled: false # Use RDS in production, not in-cluster postgres
externalHost: taskflow.cluster-xyz.us-east-1.rds.amazonaws.com
redis:
enabled: true
master:
persistence:
size: 20Gi
replica:
replicaCount: 2
EOF
# 4. Download sub-chart dependencies
helm dependency update
# 5. Validate the chart
helm lint . --values values-staging.yaml
helm lint . --values values-production.yaml
# 6. Test render (see all generated YAML)
helm template taskflow . \
--values values-production.yaml \
--namespace production | head -100
# 7. Package the chart
helm package .
# Successfully packaged chart and saved it to: taskflow-2.0.0.tgz
# 8. Publish to GitHub Container Registry
echo "$GITHUB_TOKEN" | helm registry login ghcr.io -u "$GITHUB_USERNAME" --password-stdin
helm push taskflow-2.0.0.tgz oci://ghcr.io/mycompany/charts
# 9. Deploy staging from published chart
helm install taskflow-staging oci://ghcr.io/mycompany/charts/taskflow \
--version 2.0.0 \
--namespace staging \
--create-namespace \
--values values-staging.yaml \
--wait --atomic
# 10. Deploy production
helm install taskflow oci://ghcr.io/mycompany/charts/taskflow \
--version 2.0.0 \
--namespace production \
--create-namespace \
--values values-production.yaml \
--wait --timeout 10m --atomic
# 11. Verify everything deployed correctly
helm list -A
# NAMESPACE NAME REVISION STATUS CHART APP VERSION
# staging taskflow-staging 1 deployed taskflow-2.0.0 4.1.0
# production taskflow 1 deployed taskflow-2.0.0 4.1.0
kubectl get all -n production | grep taskflow
40+ YAML files → 1 Helm chart. The entire TaskFlow stack deploys to any environment with a single command, any release can be rolled back with one helm rollback, and every deployment is version-controlled and auditable. This is how mature engineering teams manage Kubernetes at scale.
Common Helm Errors
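"cannot re-use a name that is still in use": You ran helm install for a release name that already exists. Use helm upgrade --install instead; it creates the release or updates it (idempotent). This is the pattern to use in CI/CD pipelines.
"another operation (install/upgrade/rollback) is in progress": A previous Helm operation got stuck or was interrupted. Check: helm history release-name. If the latest revision is in a pending state such as "pending-upgrade", roll back to the last good revision: helm rollback release-name <revision>.
"rendered manifests contain a resource that already exists": A Kubernetes resource (e.g., a Service) was created outside Helm, and now Helm is trying to create one with the same name. Fix: delete it and let Helm recreate it, or let Helm adopt it by adding the label app.kubernetes.io/managed-by=Helm and the annotations meta.helm.sh/release-name and meta.helm.sh/release-namespace to the existing resource. Do not reach for --force here; it is risky in production.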
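For the "resource already exists" case, Helm 3.2 and later will adopt a resource that carries the ownership label app.kubernetes.io/managed-by=Helm and the annotations meta.helm.sh/release-name and meta.helm.sh/release-namespace. The sketch below only prints the kubectl commands that would add them, so you can review before running anything. helm_adopt_cmds is a hypothetical helper, and the resource and release names are placeholders.

```shell
# Hypothetical helper: print the kubectl commands that mark an existing resource
# as owned by a Helm release so Helm can adopt it instead of failing.
# It prints only; nothing is applied to the cluster.
helm_adopt_cmds() (
  kind=$1
  name=$2
  release=$3
  namespace=$4
  printf 'kubectl -n %s label %s %s app.kubernetes.io/managed-by=Helm --overwrite\n' \
    "$namespace" "$kind" "$name"
  printf 'kubectl -n %s annotate %s %s meta.helm.sh/release-name=%s meta.helm.sh/release-namespace=%s --overwrite\n' \
    "$namespace" "$kind" "$name" "$release" "$namespace"
)

# Placeholder names: Service "myapp" adopted by release "myapp" in "production"
helm_adopt_cmds service myapp myapp production
```

After the label and annotations are in place, the next helm upgrade --install treats the resource as part of the release and reconciles it against the chart.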
Interview Questions — Chapter 10
- What is Helm and what problem does it solve over raw kubectl apply?
- Explain the difference between a Helm Chart, a Release, and a Revision.
- What is the difference between helm install and helm upgrade --install? Which do you use in CI/CD and why?
- How does --atomic work in helm upgrade? When would you use it?
- What are Helm hooks and give two production examples of when you would use them?
- What is the difference between values.yaml and values-production.yaml? How do you combine them in a single helm command?
- A colleague ran helm uninstall in production. How do you recover, and how does the answer change if it was run with --keep-history so the release history still exists?
- What are Chart dependencies and how does helm dependency update work?
- How do you test a Helm chart without deploying it to a cluster?
Chapters 6–10 Covered
You’ve mastered Services, NetworkPolicy, ConfigMaps, Secrets, Persistent Storage, Ingress with TLS, and Helm. Coming up in Part 3: