Kubernetes Full Course (Beginner to Advanced) with Real Projects and Hands-on Examples

Complete DevOps Training Program — Part 2

Kubernetes: Zero to Production

Chapters 6–10: Networking, Configuration, Storage, Ingress, and Helm Package Manager.

Chapter 6: Services & Networking

Chapter 7: ConfigMaps & Secrets

Chapter 8: Volumes & Storage

Chapter 9: Ingress & Load Balancing

Chapter 10: Helm Package Manager

Chapter Six

Services and Networking

Pods are ephemeral — they die and get recreated with new IP addresses. If your frontend directly called a pod’s IP, that IP would break every time the pod restarted. Kubernetes Services solve this by providing a stable virtual IP and DNS name in front of a dynamic set of pods. Networking is the glue that holds microservices together, and understanding it deeply is what separates a junior from a senior Kubernetes engineer.

The Four Service Types

ClusterIP (Default)

Exposes the service on an internal IP only. Only reachable from within the cluster. Used for inter-service communication (frontend talking to backend). This is the most common type.

NodePort

Exposes the service on each node’s IP at a static port (30000–32767). Useful for local development and testing. Not recommended for production — exposes node IPs directly.

LoadBalancer

Provisions an external cloud load balancer (AWS ELB, GCP LB, Azure LB) automatically. Each LoadBalancer service gets its own cloud LB and external IP. Expensive if overused — prefer Ingress for HTTP.

ExternalName

Maps a service to an external DNS name (e.g., mydb.rds.amazonaws.com). No proxying — just a CNAME alias in CoreDNS. Used to reference external services using internal Kubernetes DNS names.

Complete Service YAML Examples

# =============================================
# 1. ClusterIP — Internal service (most common)
# =============================================
apiVersion: v1
kind: Service
metadata:
  name: backend-api
  namespace: production
  labels:
    app: backend-api
spec:
  type: ClusterIP          # Default — no need to specify, but shown for clarity
  selector:
    app: backend-api       # Routes traffic to pods with this label
  ports:
  - name: http
    port: 80               # Port the Service listens on (inside cluster)
    targetPort: 3000       # Port the pod container is listening on
    protocol: TCP
  - name: metrics
    port: 9090
    targetPort: 9090
# DNS name: backend-api.production.svc.cluster.local
# Other pods call it as: http://backend-api (within same namespace)
# Or: http://backend-api.production (cross-namespace)

---
# =============================================
# 2. NodePort — For local testing / dev
# =============================================
apiVersion: v1
kind: Service
metadata:
  name: frontend-nodeport
  namespace: staging
spec:
  type: NodePort
  selector:
    app: frontend
  ports:
  - port: 80
    targetPort: 3000
    nodePort: 31000        # Fixed node port; must be 30000-32767
                           # Accessible at: http://<node-ip>:31000

---
# =============================================
# 3. LoadBalancer — Cloud production exposure
# =============================================
apiVersion: v1
kind: Service
metadata:
  name: payment-api-lb
  namespace: production
  annotations:
    # AWS-specific annotations for EKS
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-internal: "false"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
spec:
  type: LoadBalancer
  selector:
    app: payment-api
  ports:
  - port: 443
    targetPort: 8443
    protocol: TCP
  loadBalancerSourceRanges:
  - "203.0.113.0/24"    # Restrict access to specific IP ranges (security best practice)

---
# =============================================
# 4. ExternalName — Alias for external services
# =============================================
apiVersion: v1
kind: Service
metadata:
  name: production-database
  namespace: production
spec:
  type: ExternalName
  externalName: myapp-db.abc123xyz.us-east-1.rds.amazonaws.com
  # Pods can now call: postgres://production-database:5432/mydb
  # Instead of hardcoding the RDS endpoint everywhere

How Kubernetes Networking Works: The CNI Layer

Kubernetes does not implement pod-to-pod networking itself. Instead, it delegates to a Container Network Interface (CNI) plugin. The CNI plugin is responsible for assigning IP addresses to pods and for ensuring every pod can communicate with every other pod across nodes (without NAT). Enforcing NetworkPolicies (firewall rules) is also the plugin's job, but only some plugins support it.

Flannel

Simple overlay network. Uses VXLAN. Easy to set up, good for beginners. Does NOT support NetworkPolicy — you need Calico for that. Best for: learning, simple on-prem clusters.

Calico

Production-grade CNI. Can route with BGP and no overlay (higher performance), with IP-in-IP or VXLAN overlays available when the underlying network requires them. Supports NetworkPolicy. Used by major cloud providers. Best for: production clusters needing network security.

Cilium

Modern CNI using eBPF for networking (bypasses iptables entirely). Extremely high performance at scale. Supports L7 NetworkPolicy (filter by HTTP method, path). Best for: large-scale production, security-conscious teams.

AWS VPC CNI

EKS default. Assigns real VPC IP addresses to pods. No overlay — pods are native VPC citizens. Each EC2 instance can host a limited number of pods (based on ENI/IP limits per instance type).
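The ENI/IP limit can be estimated with simple arithmetic. The sketch below assumes the commonly cited formula for the VPC CNI without prefix delegation: max pods = ENIs x (IPv4 addresses per ENI - 1) + 2. The function name max_pods is hypothetical, and the ENI and address counts for a given instance type should be taken from the AWS documentation rather than from this example.

```shell
# max_pods ENIS IPS_PER_ENI
# One address on each ENI is used by the ENI itself; the +2 accounts for
# host-network pods (commonly aws-node and kube-proxy) that need no pod IP.
max_pods() {
  enis=$1
  ips_per_eni=$2
  echo $(( enis * (ips_per_eni - 1) + 2 ))
}

max_pods 3 10   # instance with 3 ENIs and 10 IPv4 addresses per ENI: prints 29
max_pods 3 6    # instance with 3 ENIs and 6 IPv4 addresses per ENI: prints 17
```

Small instance types run out of pod IPs long before they run out of CPU or memory, so it is worth running this estimate when sizing node groups.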

NetworkPolicy: Firewall Rules for Pods

By default, all pods can communicate with all other pods in the cluster. In a production environment with PCI-DSS, HIPAA, or SOC2 compliance requirements, you must isolate services using NetworkPolicies. NetworkPolicy works like a firewall — you define which pods can talk to which pods, on which ports.

# network-policy.yaml — Production security configuration
# Scenario: payment-service should only accept traffic from api-gateway
# and only communicate with postgres database. Nothing else.

# STEP 1: Default deny all ingress and egress for the payment namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payment
spec:
  podSelector: {}          # Applies to ALL pods in this namespace
  policyTypes:
  - Ingress
  - Egress
  # No rules = deny everything

---
# STEP 2: Allow ingress to payment-service ONLY from api-gateway
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-api-gateway-to-payment
  namespace: payment
spec:
  podSelector:
    matchLabels:
      app: payment-service
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: api-gateway-ns      # Only from pods in this namespace
      podSelector:
        matchLabels:
          app: api-gateway           # AND with this label (AND logic, not OR)
    ports:
    - protocol: TCP
      port: 8080

---
# STEP 3: Allow egress from payment-service to postgres only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-payment-to-postgres
  namespace: payment
spec:
  podSelector:
    matchLabels:
      app: payment-service
  policyTypes:
  - Egress
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: postgres
    ports:
    - protocol: TCP
      port: 5432
  - to:                       # Also allow DNS resolution (required!)
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP           # DNS falls back to TCP for large responses
      port: 53

# Verify policies
kubectl get networkpolicies -n payment
kubectl describe networkpolicy allow-api-gateway-to-payment -n payment

# Test connectivity from an allowed source (api-gateway)
kubectl exec -n api-gateway deploy/api-gateway -- \
  curl -s --connect-timeout 3 http://payment-service.payment:8080/pay
# Should succeed: api-gateway is allowed by the ingress rule

# Test connectivity from a blocked source (frontend)
kubectl exec -n frontend deploy/frontend -- \
  curl -s --connect-timeout 3 http://payment-service.payment:8080/pay
# Should time out: frontend is not allowed to reach payment
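The AND/OR distinction in STEP 2 is easy to get wrong, so here is a hedged sketch of the OR form for comparison. The file name or-example.yaml and the policy name are hypothetical. Moving podSelector into its own list item (note the extra dash) makes the rule match any pod in the labelled namespace OR any app: api-gateway pod in the payment namespace itself, which is usually broader than intended.

```shell
# Write the OR variant to a file for review (hypothetical file and policy names)
cat > or-example.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-gateway-ns-or-gateway-pods
  namespace: payment
spec:
  podSelector:
    matchLabels:
      app: payment-service
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:          # Item 1: ANY pod in namespaces with this label
        matchLabels:
          name: api-gateway-ns
    - podSelector:                # Item 2 (separate dash): OR pods with this label
        matchLabels:              # in the policy's own namespace (payment)
          app: api-gateway
    ports:
    - protocol: TCP
      port: 8080
EOF
# Review the file, then apply it only if this broader rule is really what you want:
# kubectl apply -f or-example.yaml
```

A single dash in front of namespaceSelector with podSelector nested beside it gives AND; two dashes give OR. Reading the policy back with kubectl describe is a quick way to confirm which one the cluster understood.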

DNS in Kubernetes: CoreDNS Deep Dive

# Every service gets a DNS record automatically:
# Format: <service-name>.<namespace>.svc.cluster.local

# Examples:
# backend-api.production.svc.cluster.local  → ClusterIP of backend-api service
# postgres.data.svc.cluster.local           → ClusterIP of postgres service

# From within the SAME namespace, you can use shorthand:
# http://backend-api         (short name)
# http://backend-api:80      (with port)

# From a DIFFERENT namespace, use full name:
# http://backend-api.production
# http://backend-api.production.svc.cluster.local

# Verify DNS resolution from inside a pod
kubectl run dns-test --image=busybox:1.28 --rm -it --restart=Never -- nslookup backend-api.production
# Server: 10.96.0.10         (CoreDNS ClusterIP)
# Address: 10.96.0.10:53
# Name: backend-api.production.svc.cluster.local
# Address: 10.100.47.23      (Service ClusterIP)

# Check CoreDNS ConfigMap (customize DNS behavior)
kubectl get configmap coredns -n kube-system -o yaml
# You can add custom DNS entries, forward specific domains to external DNS, etc.

# Custom DNS entry example - add to CoreDNS ConfigMap data.Corefile:
# example.com:53 {
#   forward . 8.8.8.8
# }
# This forwards all *.example.com queries to Google DNS
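The naming pattern for Service DNS records can be captured in a small helper. This is a minimal sketch: the function name svc_fqdn is hypothetical, and it assumes the cluster domain is cluster.local unless a third argument overrides it (some clusters are configured with a different domain).

```shell
# svc_fqdn SERVICE NAMESPACE [CLUSTER_DOMAIN]
# Builds <service>.<namespace>.svc.<cluster-domain>
svc_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"
}

svc_fqdn backend-api production   # prints backend-api.production.svc.cluster.local
svc_fqdn postgres data            # prints postgres.data.svc.cluster.local
```

Using the fully qualified name in cross-namespace configuration avoids surprises from the pod's DNS search path, at the cost of tying the config to one cluster domain.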

Chapter 6 — Project 1

Multi-Tier Banking Application with Network Isolation

Scenario: SecureBank needs a 3-tier application — frontend (React), backend API (Node.js), and database (PostgreSQL) — with strict network isolation. The frontend can only reach the API, the API can only reach the database, and the database cannot initiate any outbound connections. This is a PCI-DSS requirement.

# 1. Create namespaces with labels (labels are used in NetworkPolicy selectors)
kubectl create namespace securebank-frontend
kubectl create namespace securebank-api
kubectl create namespace securebank-data
kubectl label namespace securebank-frontend name=securebank-frontend
kubectl label namespace securebank-api name=securebank-api
kubectl label namespace securebank-data name=securebank-data

# 2. Deploy the 3 tiers
cat > securebank-stack.yaml <<EOF
# --- FRONTEND DEPLOYMENT ---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
  namespace: securebank-frontend
spec:
  replicas: 2
  selector:
    matchLabels: { app: frontend }
  template:
    metadata:
      labels: { app: frontend }
    spec:
      containers:
      - name: frontend
        image: nginx:1.25-alpine
        ports:
        - containerPort: 80
        resources:
          requests: { memory: "64Mi", cpu: "50m" }
          limits:   { memory: "128Mi", cpu: "200m" }
---
# Frontend Service (NodePort for external access in dev)
apiVersion: v1
kind: Service
metadata:
  name: frontend-svc
  namespace: securebank-frontend
spec:
  type: NodePort
  selector: { app: frontend }
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30080
---
# --- API DEPLOYMENT ---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: banking-api
  namespace: securebank-api
spec:
  replicas: 3
  selector:
    matchLabels: { app: banking-api }
  template:
    metadata:
      labels: { app: banking-api }
    spec:
      containers:
      - name: banking-api
        image: node:20-alpine
        command: ["/bin/sh", "-c", "node -e \"const h=require('http');h.createServer((q,r)=>{r.writeHead(200);r.end('Banking API OK')}).listen(3000)\""]
        ports:
        - containerPort: 3000
        resources:
          requests: { memory: "128Mi", cpu: "100m" }
          limits:   { memory: "256Mi", cpu: "500m" }
---
# API Service (ClusterIP — internal only)
apiVersion: v1
kind: Service
metadata:
  name: banking-api-svc
  namespace: securebank-api
spec:
  type: ClusterIP
  selector: { app: banking-api }
  ports:
  - port: 3000
    targetPort: 3000
---
# --- DATABASE STATEFULSET ---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: securebank-data
spec:
  serviceName: postgres-svc   # Should name an existing Service (a headless one in production)
  replicas: 1
  selector:
    matchLabels: { app: postgres }
  template:
    metadata:
      labels: { app: postgres }
    spec:
      containers:
      - name: postgres
        image: postgres:15-alpine
        env:
        - name: POSTGRES_PASSWORD
          value: "securepassword123"
        ports:
        - containerPort: 5432
        resources:
          requests: { memory: "256Mi", cpu: "200m" }
          limits:   { memory: "512Mi", cpu: "500m" }
---
# Database Service (ClusterIP — internal only)
apiVersion: v1
kind: Service
metadata:
  name: postgres-svc
  namespace: securebank-data
spec:
  type: ClusterIP
  selector: { app: postgres }
  ports:
  - port: 5432
    targetPort: 5432
EOF
kubectl apply -f securebank-stack.yaml

# 3. Apply Network Policies — Default deny all in each namespace
for ns in securebank-frontend securebank-api securebank-data; do
  kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: $ns
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]
EOF
done

# 4. Allow frontend → API
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: securebank-api
spec:
  podSelector:
    matchLabels: { app: banking-api }
  policyTypes: [Ingress]
  ingress:
  - from:
    - namespaceSelector:
        matchLabels: { name: securebank-frontend }
      podSelector:
        matchLabels: { app: frontend }
    ports:
    - port: 3000
EOF

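# 5. Allow API → Database
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-api-to-db
  namespace: securebank-data
spec:
  podSelector:
    matchLabels: { app: postgres }
  policyTypes: [Ingress]
  ingress:
  - from:
    - namespaceSelector:
        matchLabels: { name: securebank-api }
      podSelector:
        matchLabels: { app: banking-api }
    ports:
    - port: 5432
EOF

# 5b. The default-deny policies from step 3 also block Egress, so open the matching
#     outbound paths plus DNS: frontend → API:3000 and API → postgres:5432.
#     securebank-data gets no egress rule, so the database cannot initiate connections.
for rule in "securebank-frontend securebank-api 3000" "securebank-api securebank-data 5432"; do
  set -- $rule                      # $1 = source namespace, $2 = destination, $3 = port
  kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress
  namespace: $1
spec:
  podSelector: {}
  policyTypes: [Egress]
  egress:
  - to:
    - namespaceSelector:
        matchLabels: { name: $2 }
    ports:
    - port: $3
  - to:                             # DNS lookups via CoreDNS in kube-system
    - namespaceSelector:
        matchLabels: { kubernetes.io/metadata.name: kube-system }
    ports:
    - { protocol: UDP, port: 53 }
    - { protocol: TCP, port: 53 }
EOF
done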

# 6. Verify isolation — frontend cannot reach database directly
kubectl exec -n securebank-frontend deploy/frontend -- \
  timeout 3 nc -zv postgres-svc.securebank-data 5432 || echo "BLOCKED - correct!"

# 7. Verify allowed path — API can reach database
kubectl exec -n securebank-api deploy/banking-api -- \
  timeout 3 nc -zv postgres-svc.securebank-data 5432 && echo "ALLOWED - correct!"

Networking Troubleshooting Guide

Problem: Pod cannot reach a Service by DNS name

Debug steps: (1) Verify the service exists: kubectl get svc -n <namespace>. (2) Test DNS from inside the pod: kubectl exec <pod> -- nslookup <service-name>. (3) Check that the CoreDNS pods are running: kubectl get pods -n kube-system | grep coredns. (4) Check whether a NetworkPolicy is blocking DNS egress (port 53, UDP and TCP).

Problem: Service exists but no traffic reaches pods

Debug: Check that the service selector matches the pod labels: kubectl get endpoints <service-name>. If Endpoints shows <none>, the selector doesn’t match any ready pods. Check: kubectl get pods --show-labels and compare with the service selector. Every key=value pair in the selector must appear on the pod; extra pod labels are fine.
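The matching rule behind an empty Endpoints list can be illustrated locally, without a cluster. This is only a sketch of the equality-based selector rule, not how Kubernetes implements it; the function name selector_matches and the comma-separated key=value argument format are assumptions made for this example.

```shell
# selector_matches SELECTOR LABELS
# Both arguments are comma-separated key=value lists.
# Returns 0 when every selector pair is present in the label list.
selector_matches() {
  selector=$1
  labels=",$2,"
  old_ifs=$IFS
  IFS=,
  for pair in $selector; do
    case "$labels" in
      *",$pair,"*) ;;                 # this pair is present, keep checking
      *) IFS=$old_ifs; return 1 ;;    # one missing or different pair means no match
    esac
  done
  IFS=$old_ifs
  return 0
}

selector_matches "app=frontend" "app=frontend,pod-template-hash=5d9c7b" && echo "match"
selector_matches "app=frontend,tier=web" "app=frontend" || echo "no match: Endpoints would be empty"
selector_matches "app=Frontend" "app=frontend" || echo "no match: label values are case-sensitive"
```

The second and third calls show the two most common causes of an empty Endpoints list: a selector key the pods do not carry, and a value that differs only by case or a typo.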

Problem: NetworkPolicy blocking unexpected traffic

Debug: Run kubectl exec <pod> -- curl -v --connect-timeout 3 http://<target> to test. If it times out, a NetworkPolicy is likely blocking it. Check all policies in both the source and destination namespaces (egress on one side, ingress on the other). Remember: NetworkPolicies are additive, so traffic is allowed if any policy allows it. If no policy selects a pod, all traffic to and from it is allowed.

Interview Questions — Chapter 6

A pod’s IP changes every time it restarts. How do Services solve this problem internally?
What is the difference between ClusterIP, NodePort, and LoadBalancer? When would you choose each?
How does kube-proxy implement service routing using iptables? What is the alternative and why is it better at scale?
Explain the full DNS name format for a Kubernetes Service and how a pod resolves it.
What is a CNI plugin and name three examples? What does Cilium do differently from Flannel?
If you apply a default-deny NetworkPolicy to a namespace, what breaks immediately and why?
What is the difference between namespaceSelector and podSelector in a NetworkPolicy? What happens when you use both in the same from rule?
Your LoadBalancer service is stuck in Pending. What is the most common cause?

Chapter Seven

ConfigMaps and Secrets

Hardcoding configuration into container images is an anti-pattern. If your database host changes, you would need to rebuild and redeploy your image. Instead, Kubernetes separates configuration from code using ConfigMaps (for non-sensitive data) and Secrets (for sensitive data). This follows the 12-Factor App methodology — store config in the environment, not in the application.

ConfigMaps: Non-Sensitive Configuration

# =============================================
# Creating ConfigMaps — Multiple methods
# =============================================

# Method 1: From literal values (quick and simple)
kubectl create configmap app-config \
  --from-literal=NODE_ENV=production \
  --from-literal=LOG_LEVEL=info \
  --from-literal=API_URL=https://api.example.com \
  --from-literal=MAX_CONNECTIONS=100

# Method 2: From a file (great for config files like nginx.conf, app.properties)
# Key = filename, Value = file content
# (a comment after a trailing backslash would break the line continuation)
kubectl create configmap nginx-config \
  --from-file=nginx.conf \
  --from-file=mime.types

# Method 3: From YAML (most common in production — version controlled)
cat > app-configmap.yaml <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config
  namespace: production
data:
  # Simple key-value pairs
  NODE_ENV: "production"
  LOG_LEVEL: "info"
  DB_HOST: "postgres-service"
  DB_PORT: "5432"
  DB_NAME: "appdb"
  MAX_POOL_SIZE: "20"
  
  # Multi-line config file stored as a ConfigMap entry
  app.json: |
    {
      "server": {
        "port": 3000,
        "timeout": 30000,
        "maxConnections": 100
      },
      "cache": {
        "ttl": 3600,
        "maxItems": 10000
      },
      "features": {
        "darkMode": true,
        "betaAPI": false
      }
    }
  
  nginx.conf: |
    upstream backend {
      server localhost:3000;
    }
    server {
      listen 80;
      location / {
        proxy_pass http://backend;
        proxy_read_timeout 60;
      }
      location /health {
        return 200 'OK';
      }
    }
EOF
kubectl apply -f app-configmap.yaml

# =============================================
# Using ConfigMaps in Pods — 3 Methods
# =============================================

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels: { app: api-server }
  template:
    metadata:
      labels: { app: api-server }
    spec:
      containers:
      - name: api
        image: mycompany/api:v3.0
        
        # METHOD 1: Inject specific keys as environment variables
        env:
        - name: NODE_ENV
          valueFrom:
            configMapKeyRef:
              name: api-config
              key: NODE_ENV
        - name: DB_HOST
          valueFrom:
            configMapKeyRef:
              name: api-config
              key: DB_HOST
        
        # METHOD 2: Inject ALL keys as environment variables at once
        envFrom:
        - configMapRef:
            name: api-config     # All keys become env vars
            # NODE_ENV=production, LOG_LEVEL=info, DB_HOST=postgres-service, etc.
        
        # METHOD 3: Mount as files in a volume (best for config files)
        volumeMounts:
        - name: config-files
          mountPath: /etc/app/config   # app.json and nginx.conf appear here as files
          readOnly: true
      
      volumes:
      - name: config-files
        configMap:
          name: api-config
          items:             # Optionally select specific keys to mount
          - key: app.json
            path: app.json   # Creates /etc/app/config/app.json
          - key: nginx.conf
            path: nginx.conf # Creates /etc/app/config/nginx.conf

Secrets: Sensitive Data Management

Security Warning

Kubernetes Secrets are base64 encoded, NOT encrypted by default. Anyone with access to etcd or the right RBAC permissions can read them. For true production security: (1) Enable encryption at rest for etcd, (2) Use AWS Secrets Manager / HashiCorp Vault with the Secrets Store CSI Driver, (3) Limit Secret access via RBAC. Never commit Secret YAML files to git repositories.
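To make "encoded, not encrypted" concrete, the sketch below round-trips a value through base64 with no key involved. It assumes a base64 binary that accepts -d for decoding (GNU coreutils or BusyBox); the password is the same sample value used in this chapter's examples.

```shell
# Encode the way a Secret's data field stores values (printf avoids a trailing newline)
encoded=$(printf '%s' 'SuperSecret123!' | base64)
echo "$encoded"     # prints U3VwZXJTZWNyZXQxMjMh

# Decoding needs no key at all, which is why base64 offers no confidentiality
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"     # prints SuperSecret123!
```

Anyone who can run kubectl get secret -o yaml can do the same decoding, so RBAC and encryption at rest are the real protection.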

# =============================================
# Creating Secrets
# =============================================

# Method 1: Imperative (values are auto-base64 encoded by kubectl)
# Single quotes stop interactive shells from treating ! as history expansion
kubectl create secret generic db-credentials \
  --from-literal=username=appuser \
  --from-literal=password='SuperSecret123!' \
  --from-literal=connection-string='postgresql://appuser:SuperSecret123!@postgres:5432/appdb'

# Method 2: From files (certificates, SSH keys)
kubectl create secret generic tls-certs \
  --from-file=tls.crt=./server.crt \
  --from-file=tls.key=./server.key

# Method 3: TLS Secret (special type for Ingress TLS termination)
kubectl create secret tls api-tls-secret \
  --cert=server.crt \
  --key=server.key

# Method 4: Docker registry credentials (for private image pulls)
kubectl create secret docker-registry registry-credentials \
  --docker-server=private-registry.mycompany.com \
  --docker-username=myuser \
  --docker-password=mypassword \
  --docker-email=devops@mycompany.com

# Method 5: YAML (values must be manually base64 encoded)
# echo -n 'SuperSecret123!' | base64  → U3VwZXJTZWNyZXQxMjMh
cat > db-secret.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
  namespace: production
type: Opaque           # Generic secret type
data:
  username: YXBwdXNlcg==          # base64('appuser')
  password: U3VwZXJTZWNyZXQxMjMh # base64('SuperSecret123!')
stringData:             # Alternative: plain text (Kubernetes encodes automatically)
  connection-string: "postgresql://appuser:SuperSecret123!@postgres:5432/appdb"
EOF
# WARNING: Never commit this file to git!
# Add db-secret.yaml to .gitignore immediately

# =============================================
# Using Secrets in Pods
# =============================================
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-with-secrets
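spec:
  selector:                     # Required: must match the pod template labels
    matchLabels: { app: api-with-secrets }
  template:
    metadata:
      labels: { app: api-with-secrets }
    spec: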
      # Pull images from private registry
      imagePullSecrets:
      - name: registry-credentials
      
      containers:
      - name: api
        image: private-registry.mycompany.com/api:v1.0
        
        # Inject individual secret keys as env vars
        env:
        - name: DB_USERNAME
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: username
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
        
        # Mount secrets as files (better for certs, complex configs)
        volumeMounts:
        - name: tls-certs
          mountPath: /etc/ssl/app
          readOnly: true
        - name: db-secret-volume
          mountPath: /etc/secrets/db
          readOnly: true
      
      volumes:
      - name: tls-certs
        secret:
          secretName: api-tls-secret
          defaultMode: 0400   # Read-only by owner only (security!)
      - name: db-secret-volume
        secret:
          secretName: db-credentials

Production Secret Management with External Vault

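# Using AWS Secrets Manager via Secrets Store CSI Driver
# This is the production-grade approach: the source of truth stays in AWS Secrets Manager.
# Note: secrets are kept out of etcd only if you skip the optional sync to Kubernetes
# Secrets (syncSecret.enabled and secretObjects below); synced copies do live in etcd.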

# 1. Install the Secrets Store CSI Driver
helm repo add secrets-store-csi-driver https://kubernetes-sigs.github.io/secrets-store-csi-driver/charts
helm install csi-secrets-store secrets-store-csi-driver/secrets-store-csi-driver \
  --namespace kube-system \
  --set syncSecret.enabled=true  # Sync to Kubernetes Secrets as well

# 2. Install the AWS Provider
kubectl apply -f https://raw.githubusercontent.com/aws/secrets-store-csi-driver-provider-aws/main/deployment/aws-provider-installer.yaml

# 3. Create a SecretProviderClass
cat > aws-secret-provider.yaml <<EOF
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: aws-secrets
  namespace: production
spec:
  provider: aws
  parameters:
    objects: |
      - objectName: "prod/myapp/db-credentials"   # AWS Secrets Manager secret name
        objectType: "secretsmanager"
        jmesPath:                                   # Extract individual fields
          - path: username
            objectAlias: db-username
          - path: password
            objectAlias: db-password
  secretObjects:              # Sync to Kubernetes Secret
  - secretName: db-credentials-synced
    type: Opaque
    data:
    - objectName: db-username
      key: username
    - objectName: db-password
      key: password
EOF
kubectl apply -f aws-secret-provider.yaml

# 4. Use in a pod — secrets are mounted from AWS Secrets Manager
# The pod must have an IAM role (via IRSA on EKS) to access the secret
apiVersion: v1
kind: Pod
metadata:
  name: secure-api
  namespace: production
spec:
  serviceAccountName: api-service-account   # Must have IAM role with secrets access
  containers:
  - name: api
    image: mycompany/api:v1.0
    volumeMounts:
    - name: aws-secrets
      mountPath: /mnt/secrets
      readOnly: true
    env:
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: db-credentials-synced    # Synced Kubernetes Secret
          key: password
  volumes:
  - name: aws-secrets
    csi:
      driver: secrets-store.csi.k8s.io
      readOnly: true
      volumeAttributes:
        secretProviderClass: aws-secrets

Chapter 7 — Project 1

Multi-Environment Config Management: Dev, Staging, Production

Scenario: HealthApp runs the same application in 3 environments. Each environment has different database hosts, log levels, feature flags, and API keys. Without ConfigMaps and Secrets, the team was baking environment-specific values into Docker images — a maintenance nightmare. You need to externalize all config so the same image runs in all environments.

# Create namespace per environment
kubectl create namespace dev
kubectl create namespace staging
kubectl create namespace production

# Development ConfigMap — verbose logging, local services
cat > dev-config.yaml <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: dev
data:
  ENVIRONMENT: "development"
  LOG_LEVEL: "debug"
  DB_HOST: "postgres-dev.dev.svc.cluster.local"
  DB_NAME: "healthapp_dev"
  CACHE_ENABLED: "false"
  FEATURE_NEW_DASHBOARD: "true"    # Feature flags — enabled in dev for testing
  FEATURE_AI_DIAGNOSIS: "true"
  API_RATE_LIMIT: "10000"          # High limit in dev (no real traffic)
EOF

# Production ConfigMap — minimal logging, production services
cat > prod-config.yaml <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: production
data:
  ENVIRONMENT: "production"
  LOG_LEVEL: "warn"               # Only warnings and errors in production
  DB_HOST: "postgres-prod.production.svc.cluster.local"
  DB_NAME: "healthapp_prod"
  CACHE_ENABLED: "true"
  FEATURE_NEW_DASHBOARD: "false"  # Not yet released to production
  FEATURE_AI_DIAGNOSIS: "false"
  API_RATE_LIMIT: "1000"          # Strict rate limiting in production
EOF

kubectl apply -f dev-config.yaml
kubectl apply -f prod-config.yaml

# Create different secrets per environment
kubectl create secret generic app-secrets -n dev \
  --from-literal=DB_PASSWORD=devpassword123 \
  --from-literal=JWT_SECRET=dev-jwt-secret-key-not-secure \
  --from-literal=STRIPE_API_KEY=sk_test_123456789

# Quote the substitutions; tr strips the line break openssl inserts into long base64 output
kubectl create secret generic app-secrets -n production \
  --from-literal=DB_PASSWORD="$(openssl rand -base64 32)" \
  --from-literal=JWT_SECRET="$(openssl rand -base64 64 | tr -d '\n')" \
  --from-literal=STRIPE_API_KEY=sk_live_actualproductionkey

# The SAME deployment YAML works in all environments
# It references app-config and app-secrets which exist in every namespace
cat > app-deployment.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: healthapp
spec:
  replicas: 1
  selector:
    matchLabels: { app: healthapp }
  template:
    metadata:
      labels: { app: healthapp }
    spec:
      containers:
      - name: healthapp
        image: healthapp/api:v2.5.0   # SAME image in all environments
        envFrom:
        - configMapRef:
            name: app-config          # Picks up the namespace-specific ConfigMap
        - secretRef:
            name: app-secrets         # Picks up the namespace-specific Secret
        resources:
          requests: { memory: "128Mi", cpu: "100m" }
          limits:   { memory: "512Mi", cpu: "1000m" }
EOF

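# Deploy same YAML to different environments
# (staging needs its own app-config and app-secrets first, created like the dev and
#  production ones above; otherwise its pods stay in CreateContainerConfigError)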
kubectl apply -f app-deployment.yaml -n dev
kubectl apply -f app-deployment.yaml -n staging
kubectl apply -f app-deployment.yaml -n production

# Verify each environment has the right config
kubectl exec -n dev deploy/healthapp -- env | grep ENVIRONMENT
# ENVIRONMENT=development

kubectl exec -n production deploy/healthapp -- env | grep ENVIRONMENT
# ENVIRONMENT=production

# Verify secrets are different (check DB_PASSWORD length only, never print secrets!)
kubectl exec -n dev deploy/healthapp -- sh -c 'echo ${#DB_PASSWORD} chars'
kubectl exec -n production deploy/healthapp -- sh -c 'echo ${#DB_PASSWORD} chars'

Interview Questions — Chapter 7

What is the difference between ConfigMap and Secret? When would you use each?
Kubernetes Secrets are “base64 encoded, not encrypted.” What does this mean in practice and how do you secure them properly?
A pod’s ConfigMap is updated. Does the running pod automatically see the new values? Does the answer differ between env vars and volume mounts?
What is the Secrets Store CSI Driver and why would you use it instead of native Kubernetes Secrets?
How do you manage environment-specific configuration (dev/staging/prod) without duplicating Deployment YAML files?
What happens if a pod references a ConfigMap or Secret that does not exist? What status does the pod show?
Explain the security risk of using envFrom: secretRef versus injecting individual secret keys.

Chapter Eight

Volumes and Storage

Container filesystems are ephemeral — when a container dies, all data written to its filesystem is gone. For stateful applications like databases, message queues, and file servers, you need persistent storage that outlives pods and containers. Kubernetes storage is one of the most complex topics, but mastering it unlocks the ability to run any workload on Kubernetes.

The Storage Hierarchy: StorageClass → PV → PVC


  StorageClass (defines HOW storage is provisioned — EBS, NFS, etc.)
       ↓ (dynamic provisioning)
  PersistentVolume [PV] (represents actual storage — 50Gi EBS volume)
       ↓ (bound to)
  PersistentVolumeClaim [PVC] (pod's request for storage — "I need 10Gi")
       ↓ (mounted by)
  Pod → Container (sees the storage as a regular filesystem path)

  FLOW:
  1. Admin creates StorageClass (or cloud provider does automatically)
  2. Developer creates PVC ("give me 20Gi ReadWriteOnce")
  3. StorageClass dynamically provisions a real volume (EBS, GCP PD, etc.)
  4. PV is created and bound to the PVC
  5. Pod mounts the PVC — container sees /data as a persistent directory

Volume Types Quick Reference

emptyDir

Temporary directory shared between containers in a pod. It survives container restarts but is deleted when the pod is removed from its node. Use for: scratch space, caching, sharing files between sidecar containers.

hostPath

Mounts a path from the host node’s filesystem into the pod. Dangerous (gives pod access to node). Use for: DaemonSets that need to read node logs, container runtime sockets. Never for user workloads.

PersistentVolumeClaim

The standard way to request persistent storage. Decouples pod spec from storage implementation. The same YAML works on AWS (EBS), GCP (PD), or on-prem (NFS) by just changing the StorageClass.

configMap / secret

Mount ConfigMap or Secret data as files. Covered in Chapter 7. The most common non-ephemeral volume type in typical stateless application deployments.

NFS

Network File System. ReadWriteMany — multiple pods on different nodes can mount simultaneously. Good for: shared file storage, legacy apps that need a shared filesystem. Performance is lower than block storage.

CSI (Container Storage Interface)

The modern extensible storage plugin interface. AWS EBS CSI, Ceph CSI, Portworx — all use CSI. Replaces the old in-tree volume plugins. Install CSI drivers as separate deployments in your cluster.

StorageClass, PV, PVC — Complete Production Example

# =============================================
# 1. StorageClass — defines storage provider and parameters
# =============================================
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"  # Used when PVC has no storageClassName
provisioner: ebs.csi.aws.com          # AWS EBS CSI Driver
parameters:
  type: gp3                            # GP3 SSD — best performance/cost on AWS
  iops: "3000"                         # Provisioned IOPS
  throughput: "125"                    # MB/s throughput
  encrypted: "true"                    # Encrypt volume at rest (compliance!)
reclaimPolicy: Retain                  # When PVC is deleted: Retain (keep data) or Delete
allowVolumeExpansion: true             # Allow resizing PVCs without downtime
volumeBindingMode: WaitForFirstConsumer  # Only provision when a pod actually claims it
                                          # Ensures volume is created in same AZ as pod

---
# NFS StorageClass for shared ReadWriteMany storage
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-shared
provisioner: nfs.csi.k8s.io
parameters:
  server: nfs.mycompany.internal
  share: /exports/k8s
reclaimPolicy: Retain
allowVolumeExpansion: true

---
# =============================================
# 2. PersistentVolumeClaim — "I need 50Gi of fast SSD"
# =============================================
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data-pvc
  namespace: production
spec:
  accessModes:
  - ReadWriteOnce                # RWO: mounted by one node at a time (block storage)
  # - ReadWriteMany              # RWX: multiple nodes simultaneously (NFS/Ceph)
  # - ReadOnlyMany               # ROX: multiple nodes, read-only
  storageClassName: fast-ssd     # Which StorageClass to use
  resources:
    requests:
      storage: 50Gi              # Request 50 gigabytes

---
# =============================================
# 3. Check PVC status
# =============================================
kubectl get pvc -n production
# NAME                 STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
# postgres-data-pvc    Bound    pvc-3f4a1b2c-...                           50Gi       RWO            fast-ssd       2m

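# A Bound PVC means a PersistentVolume has been provisioned and bound to the claim.
# The EBS volume is attached to a node only once a pod that uses the claim is scheduled.
# A Pending PVC usually means: StorageClass not found, insufficient quota, an AZ mismatch,
# or (with volumeBindingMode: WaitForFirstConsumer, as in fast-ssd) no pod uses the claim yet.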

# =============================================
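# 4. Use persistent storage in a StatefulSet
#    (volumeClaimTemplates below create their own PVC per replica; a standalone
#    claim such as postgres-data-pvc would instead be referenced from spec.volumes)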
# =============================================
apiVersion: apps/v1
kind: StatefulSet           # Use StatefulSet for databases, not Deployment
metadata:
  name: postgres
  namespace: production
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels: { app: postgres }
  template:
    metadata:
      labels: { app: postgres }
    spec:
      securityContext:
        fsGroup: 999          # postgres user GID — ensures postgres can write to volume
      containers:
      - name: postgres
        image: postgres:15.4
        env:
        - name: POSTGRES_DB
          value: appdb
        - name: POSTGRES_USER
          value: appuser
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata   # Data dir inside mount point
        ports:
        - containerPort: 5432
        volumeMounts:
        - name: postgres-storage
          mountPath: /var/lib/postgresql/data
        resources:
          requests: { memory: "512Mi", cpu: "500m" }
          limits:   { memory: "2Gi",   cpu: "2000m" }
        # PostgreSQL specific liveness check
        livenessProbe:
          exec:
            command: ["pg_isready", "-U", "appuser", "-d", "appdb"]
          initialDelaySeconds: 30
          periodSeconds: 10
  # volumeClaimTemplates: StatefulSet creates one PVC per replica automatically
  volumeClaimTemplates:
  - metadata:
      name: postgres-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: fast-ssd
      resources:
        requests:
          storage: 50Gi
# StatefulSet names: postgres-0 (pod) → postgres-storage-postgres-0 (PVC)
# If you scale to 3 replicas, you get postgres-0, postgres-1, postgres-2
# Each with their own dedicated PVC

StatefulSets vs Deployments for Databases

Feature	Deployment	StatefulSet
Pod names	Random (api-7df9c-xkp2z)	Stable ordinal (postgres-0, postgres-1)
Storage	Shared PVC (if any)	Dedicated PVC per pod (via volumeClaimTemplates)
Scaling order	Parallel, random	Sequential (0→1→2 up, 2→1→0 down)
DNS	Service DNS only	Per-pod DNS: postgres-0.postgres.ns.svc.cluster.local
Use case	Stateless apps (APIs, web servers)	Databases, Kafka, ZooKeeper, Elasticsearch
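
The stable identities in the table follow fixed naming patterns. The Python sketch below derives them for the postgres StatefulSet from earlier in this chapter. The helper name statefulset_identities is illustrative, and the DNS suffix assumes the default cluster domain cluster.local.

```python
def statefulset_identities(name, service, namespace, claim_template, replicas):
    """Return (pod, pvc, dns) name triples for each ordinal of a StatefulSet."""
    identities = []
    for ordinal in range(replicas):
        pod = f"{name}-{ordinal}"
        pvc = f"{claim_template}-{pod}"  # <template>-<statefulset>-<ordinal>
        dns = f"{pod}.{service}.{namespace}.svc.cluster.local"
        identities.append((pod, pvc, dns))
    return identities


identities = statefulset_identities(
    name="postgres",
    service="postgres",            # the headless Service named in serviceName
    namespace="production",
    claim_template="postgres-storage",
    replicas=3,
)
for pod, pvc, dns in identities:
    print(pod, pvc, dns)
```

Because the names are derived from the ordinal, a rescheduled postgres-1 finds the same claim and the same DNS record again.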

Chapter 8 — Project 1

Production PostgreSQL + Redis StatefulSet with Automated Backups

Scenario: DataVault Inc. needs to run PostgreSQL for primary storage and Redis for caching on Kubernetes. Both need persistent storage, automated daily backups to S3, and the ability to survive node failures without losing data. You also need to resize the PostgreSQL volume after initial deployment when the data grows.

# 1. PostgreSQL StatefulSet with backup CronJob
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgresql
  namespace: datavault
spec:
  serviceName: postgresql
  replicas: 1
  selector:
    matchLabels: { app: postgresql }
  template:
    metadata:
      labels: { app: postgresql }
    spec:
      initContainers:
      - name: set-permissions
        image: busybox
        # The alpine variant of the postgres image runs as UID/GID 70 (the Debian variant uses 999)
        command: ["sh", "-c", "chown -R 70:70 /var/lib/postgresql/data"]
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
      containers:
      - name: postgresql
        image: postgres:15.4-alpine
        env:
        - name: POSTGRES_DB
          value: datavault
        - name: POSTGRES_USER
          value: dvadmin
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: password
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        ports:
        - containerPort: 5432
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
        resources:
          requests: { memory: "1Gi", cpu: "500m" }
          limits:   { memory: "4Gi", cpu: "2000m" }
        livenessProbe:
          exec:
            command: ["pg_isready", "-U", "dvadmin", "-d", "datavault"]
          initialDelaySeconds: 30
          periodSeconds: 10
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: fast-ssd
      resources:
        requests:
          storage: 100Gi
---
# Headless service for StatefulSet DNS
apiVersion: v1
kind: Service
metadata:
  name: postgresql
  namespace: datavault
spec:
  clusterIP: None        # Headless — enables DNS for each pod
  selector: { app: postgresql }
  ports:
  - port: 5432
EOF

# 2. CronJob for daily database backups to S3
kubectl apply -f - <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup
  namespace: datavault
spec:
  schedule: "0 2 * * *"       # 2:00 AM UTC every day
  concurrencyPolicy: Forbid    # Don't run overlapping backups
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
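          - name: backup
            # NOTE: the stock postgres image does not include the AWS CLI. Build or choose
            # an image that provides both pg_dump and aws, and reference it here.
            image: postgres:15.4-alpine
            command:
            - /bin/sh
            - -c
            - |
              set -e   # Abort on the first failing command so a broken dump is never uploaded
              BACKUP_FILE="backup-$(date +%Y%m%d-%H%M%S).sql.gz"
              echo "Starting backup: $BACKUP_FILE"
              # --compress=9 makes pg_dump gzip its own output, so no pipe can hide a failure
              pg_dump -h postgresql.datavault.svc.cluster.local \
                -U dvadmin -d datavault --compress=9 --file="/tmp/$BACKUP_FILE"
              # Upload to S3 using AWS CLI (needs IAM role via IRSA)
              aws s3 cp "/tmp/$BACKUP_FILE" "s3://datavault-backups/postgres/$BACKUP_FILE"
              echo "Backup complete: s3://datavault-backups/postgres/$BACKUP_FILE"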
            env:
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: password
EOF

# 3. Scale PostgreSQL volume from 100Gi to 200Gi (online, no downtime)
# First, verify the StorageClass supports expansion
kubectl get storageclass fast-ssd -o jsonpath='{.allowVolumeExpansion}'
# true

# Edit the PVC
kubectl patch pvc data-postgresql-0 -n datavault \
  --type='merge' \
  -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'

# Watch the resize happen
kubectl get pvc data-postgresql-0 -n datavault -w
# NAME                  STATUS   VOLUME    CAPACITY   ACCESS MODES
# data-postgresql-0     Bound    pvc-...   100Gi      RWO
# data-postgresql-0     Bound    pvc-...   200Gi      RWO    ← resized!

# 4. Redis StatefulSet for caching
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
  namespace: datavault
spec:
  serviceName: redis
  replicas: 1
  selector:
    matchLabels: { app: redis }
  template:
    metadata:
      labels: { app: redis }
    spec:
      containers:
      - name: redis
        image: redis:7.2-alpine
        command: ["redis-server", "--appendonly", "yes", "--requirepass", "$(REDIS_PASSWORD)"]
        env:
        - name: REDIS_PASSWORD
          valueFrom:
            secretKeyRef:
              name: redis-secret
              key: password
        ports:
        - containerPort: 6379
        volumeMounts:
        - name: redis-data
          mountPath: /data
        resources:
          requests: { memory: "256Mi", cpu: "100m" }
          limits:   { memory: "1Gi",   cpu: "500m" }
  volumeClaimTemplates:
  - metadata:
      name: redis-data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: fast-ssd
      resources:
        requests:
          storage: 10Gi
EOF
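
The resize in step 3 works only under two conditions: the StorageClass allows expansion, and the new size is larger than the current one, because the API server rejects shrinking a claim. The Python sketch below checks both and builds the same JSON merge patch that kubectl patch receives. The function name expansion_patch is illustrative, and sizes are assumed to be whole Gi values.

```python
import json


def expansion_patch(current_gi: int, requested_gi: int, allow_expansion: bool) -> str:
    """Build the JSON merge patch for growing a PVC, or raise if it cannot work."""
    if not allow_expansion:
        raise ValueError("StorageClass does not set allowVolumeExpansion: true")
    if requested_gi <= current_gi:
        raise ValueError("PVCs can only grow; shrinking is rejected by the API server")
    patch = {"spec": {"resources": {"requests": {"storage": f"{requested_gi}Gi"}}}}
    return json.dumps(patch, separators=(",", ":"))


patch = expansion_patch(100, 200, allow_expansion=True)
print(patch)
```

Note that the patch changes the existing claim only. The volumeClaimTemplates entry in the StatefulSet still says 100Gi, so replicas added later start at the old size.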

Interview Questions — Chapter 8

What is the difference between a PersistentVolume and a PersistentVolumeClaim? Who creates each?
What do the reclaimPolicy values Retain and Delete mean on a StorageClass?
Explain the three access modes: ReadWriteOnce, ReadWriteMany, ReadOnlyMany. Which cloud storage types support each?
Why do we use StatefulSet for databases instead of a Deployment?
A PVC is stuck in Pending. What are three common causes?
How do volumeClaimTemplates in a StatefulSet differ from a regular volume in a Deployment?
What is the Container Storage Interface (CSI) and why was it introduced?
How do you expand a PVC after it has been created? What must be true for this to work?

Chapter Nine

Ingress and Load Balancing

If you create a LoadBalancer Service for every microservice, each one gets its own cloud load balancer, at roughly $15–20/month each. With 50 microservices, that is $750–1,000/month just for load balancers, not counting data transfer costs. Ingress solves this: one load balancer, one external IP, routing HTTP traffic to dozens of services based on hostname and path.
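
The arithmetic behind that comparison is easy to check. The Python sketch below uses the rough $15 to $20 per balancer figure from the paragraph above; real pricing varies by provider and region and usually adds data processing charges, so treat the numbers as illustrative. The function name monthly_lb_cost is hypothetical.

```python
def monthly_lb_cost(balancers: int, price_low: int, price_high: int) -> tuple:
    """Return the (low, high) monthly cost for a number of cloud load balancers."""
    return balancers * price_low, balancers * price_high


per_service = monthly_lb_cost(50, 15, 20)    # one LoadBalancer Service per microservice
single_ingress = monthly_lb_cost(1, 15, 20)  # one balancer in front of an Ingress Controller

print(per_service)     # (750, 1000)
print(single_ingress)  # (15, 20)
```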

How It Works: An Ingress Controller (Nginx, Traefik, AWS ALB) is deployed in your cluster as a pod. It watches for Ingress resources and programs itself to route traffic. One LoadBalancer Service points to the Ingress Controller. All your application services are ClusterIP. The Ingress Controller routes traffic from the single external IP to the right service based on the rules you define.

Installing Nginx Ingress Controller

# Install Nginx Ingress Controller via Helm (most common approach)
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update

# Flags: 2 replicas for HA, Prometheus metrics, gzip compression, 50m max upload size.
# (A comment cannot follow a trailing backslash, so the notes are collected here.)
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace \
  --set controller.replicaCount=2 \
  --set controller.service.type=LoadBalancer \
  --set controller.metrics.enabled=true \
  --set-string controller.config.use-gzip=true \
  --set-string controller.config.proxy-body-size=50m

# Verify installation
kubectl get pods -n ingress-nginx
# ingress-nginx-controller-5c8d66c76d-2xhkp   1/1   Running   0   2m
# ingress-nginx-controller-5c8d66c76d-7klmn   1/1   Running   0   2m

# Get the external IP
kubectl get svc -n ingress-nginx
# ingress-nginx-controller   LoadBalancer   10.0.0.100   34.120.45.67   80:30080/TCP,443:30443/TCP   2m
# 34.120.45.67 is your single external IP for ALL services

Ingress Rules: Path and Host Based Routing

# =============================================
# 1. Path-Based Routing — same host, different paths
# api.example.com/users  → user-service
# api.example.com/orders → order-service
# api.example.com/       → api-gateway
# =============================================
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$2    # Strip the path prefix when forwarding
    nginx.ingress.kubernetes.io/ssl-redirect: "true"   # Always redirect HTTP to HTTPS
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "10"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "60"
    nginx.ingress.kubernetes.io/proxy-body-size: "10m"
spec:
  ingressClassName: nginx            # Which Ingress Controller handles this
  tls:
  - hosts:
    - api.example.com
    secretName: api-tls-secret       # TLS cert stored as a Kubernetes Secret
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /users(/|$)(.*)        # Regex match
        pathType: ImplementationSpecific
        backend:
          service:
            name: user-service
            port: { number: 80 }
      - path: /orders(/|$)(.*)
        pathType: ImplementationSpecific
        backend:
          service:
            name: order-service
            port: { number: 80 }
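      # Caution: rewrite-target applies to every path in this Ingress. This catch-all has no
      # capture groups, so $2 is empty and each request reaches api-gateway as "/". Put the
      # catch-all in a separate Ingress without the annotation if the original path matters.
      - path: /
        pathType: Prefix             # Catch-all for root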
        backend:
          service:
            name: api-gateway
            port: { number: 80 }

---
# =============================================
# 2. Host-Based Routing — different hostnames → different services
# app.example.com        → frontend
# api.example.com        → backend API
# admin.example.com      → admin panel (with auth)
# =============================================
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: multi-host-ingress
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
  - hosts: [app.example.com, api.example.com, admin.example.com]
    secretName: wildcard-tls-secret     # *.example.com wildcard cert
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend-service
            port: { number: 80 }
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: backend-api-service
            port: { number: 3000 }
  - host: admin.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: admin-service
            port: { number: 8080 }

---
# =============================================
# 3. Ingress with Basic Auth (quick access control)
# =============================================
# Create auth secret first
# htpasswd -c auth admin   → enter password when prompted
# kubectl create secret generic basic-auth --from-file=auth -n production

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: admin-protected-ingress
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/auth-type: basic
    nginx.ingress.kubernetes.io/auth-secret: basic-auth
    nginx.ingress.kubernetes.io/auth-realm: "Admin Area — Authorized Access Only"
spec:
  ingressClassName: nginx
  rules:
  - host: admin.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: admin-service
            port: { number: 8080 }
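
The rewrite-target annotation in the path-based example is easier to follow with concrete inputs. The Python sketch below applies the same pattern, /users(/|$)(.*), and the same template, /$2, to a few request paths. It only approximates the controller: Python's re module stands in for the NGINX regex engine, and the function name rewrite is illustrative.

```python
import re

# Path regex from the Ingress rule; the rewrite template is "/$2".
PATH_PATTERN = re.compile(r"^/users(/|$)(.*)")


def rewrite(path: str) -> str:
    """Return the path the backend would see, or the input if the rule does not match."""
    match = PATH_PATTERN.match(path)
    if match is None:
        return path
    return "/" + match.group(2)  # "/$2": keep only the second capture group


for request_path in ["/users", "/users/", "/users/42/profile", "/usersettings"]:
    print(request_path, "->", rewrite(request_path))
```

The (/|$) group is what stops /usersettings from matching: after /users the next character must be a slash or the end of the path.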

TLS with cert-manager (Automated SSL)

# cert-manager automatically provisions and renews TLS certificates from Let's Encrypt
# No more manual cert renewals!

# Install cert-manager
helm repo add jetstack https://charts.jetstack.io
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set installCRDs=true        # Newer chart versions (v1.15+) prefer --set crds.enabled=true

# Create a ClusterIssuer (Let's Encrypt production)
cat > letsencrypt-issuer.yaml <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: devops@example.com           # Your email for expiry notifications
    privateKeySecretRef:
      name: letsencrypt-prod-key
    solvers:
    - http01:                           # HTTP-01 challenge (requires port 80 accessible)
        ingress:
          class: nginx
EOF
kubectl apply -f letsencrypt-issuer.yaml

# Now create Ingress with automatic TLS
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: auto-tls-ingress
  namespace: production
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod    # This annotation triggers cert-manager!
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - app.example.com
    secretName: app-tls-cert         # cert-manager creates and fills this Secret
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend-service
            port: { number: 80 }

# Watch cert-manager provision the certificate
kubectl get certificate -n production -w
# NAME           READY   SECRET          AGE
# app-tls-cert   False   app-tls-cert    10s   ← provisioning
# app-tls-cert   True    app-tls-cert    45s   ← certificate issued!

# Check details
kubectl describe certificate app-tls-cert -n production

Advanced Ingress: Rate Limiting and Canary Deployments

# Rate limiting annotation — prevent API abuse
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rate-limited-api
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/limit-rps: "10"          # Max 10 requests/second per IP
    nginx.ingress.kubernetes.io/limit-burst-multiplier: "5"  # Burst allowance = limit-rps x multiplier = 50 requests
    nginx.ingress.kubernetes.io/limit-connections: "20"  # Max 20 concurrent connections per IP
spec:
  ingressClassName: nginx
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: api-service
            port: { number: 80 }

---
# Canary deployment — send 10% of traffic to new version
# Useful for testing new releases with real traffic before full rollout

# STEP 1: Main Ingress (routes 90% traffic to stable service)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: main-ingress
  namespace: production
spec:
  ingressClassName: nginx
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: api-service-v1        # Stable version
            port: { number: 80 }

---
# STEP 2: Canary Ingress (routes 10% traffic to new version)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: canary-ingress
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"           # This is a canary!
    nginx.ingress.kubernetes.io/canary-weight: "10"      # 10% of traffic
spec:
  ingressClassName: nginx
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: api-service-v2        # New version
            port: { number: 80 }

# Monitor error rates in both services:
# If v2 shows higher error rates, delete the canary ingress to rollback instantly
# If v2 is healthy, gradually increase canary-weight (25, 50, 100), then point
# main-ingress at api-service-v2 and delete the canary Ingress and v1
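
To get a feel for what canary-weight: "10" means in practice, the Python sketch below simulates a weighted split. It assumes each request is assigned independently at random, which is a simplification and not a reproduction of the controller's actual selection code. The function name route and the fixed seed are illustrative.

```python
import random


def route(weight_percent: int, rng: random.Random) -> str:
    """Pick a backend for one request; weight_percent mirrors canary-weight."""
    return "api-service-v2" if rng.random() * 100 < weight_percent else "api-service-v1"


rng = random.Random(42)  # fixed seed so the run is repeatable
total = 10_000
canary_hits = sum(route(10, rng) == "api-service-v2" for _ in range(total))
canary_share = canary_hits / total

print(f"canary share: {canary_share:.1%}")
```

The share is close to 10% but not exact, so compare error rates rather than raw error counts when judging v2 against v1.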

Chapter 9 — Project 1

Full-Stack SaaS App: Multi-Tenant Ingress with Auto TLS

Scenario: CloudDesk SaaS serves 50 enterprise customers. Each gets their own subdomain: acme.clouddesk.io, globex.clouddesk.io, etc. You need: wildcard TLS cert, host-based routing to tenant-specific services, rate limiting to prevent abuse, and a canary pipeline to test new API versions on 5% of traffic before full rollout.

# 1. Install Nginx Ingress + cert-manager
helm install ingress-nginx ingress-nginx/ingress-nginx -n ingress-nginx --create-namespace
helm install cert-manager jetstack/cert-manager -n cert-manager --create-namespace --set installCRDs=true

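# 2. Get the external address and set a DNS wildcard record
# Cloud providers report either an IP or a hostname (AWS load balancers report a hostname)
EXTERNAL_ADDR=$(kubectl get svc ingress-nginx-controller -n ingress-nginx \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}{.status.loadBalancer.ingress[0].hostname}')
echo "Create a DNS wildcard record: *.clouddesk.io -> $EXTERNAL_ADDR (A record for an IP, CNAME/alias for a hostname)"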

# 3. Create Let's Encrypt ClusterIssuer with DNS-01 challenge for wildcard cert
kubectl apply -f - <<'EOF'
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@clouddesk.io
    privateKeySecretRef:
      name: le-dns-key
    solvers:
    - dns01:                              # DNS-01 required for wildcard certs
        route53:
          region: us-east-1
          hostedZoneID: Z1234EXAMPLE      # Your Route53 Hosted Zone ID
EOF

# 4. Request wildcard certificate
# The Certificate lives in the production namespace because an Ingress can only
# reference a TLS Secret from its own namespace.
kubectl apply -f - <<'EOF'
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: clouddesk-wildcard-cert
  namespace: production
spec:
  secretName: clouddesk-wildcard-tls
  issuerRef:
    name: letsencrypt-dns
    kind: ClusterIssuer
  dnsNames:
  - "clouddesk.io"
  - "*.clouddesk.io"          # Wildcard covers all subdomains
EOF

# 5. Multi-tenant Ingress: each tenant gets their own subdomain
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: clouddesk-tenants
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/limit-rps: "100"
    nginx.ingress.kubernetes.io/use-regex: "true"
spec:
  ingressClassName: nginx
  tls:
  - hosts: ["*.clouddesk.io"]
    secretName: clouddesk-wildcard-tls
  rules:
  - host: acme.clouddesk.io
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: tenant-acme-app
            port: { number: 80 }
  - host: globex.clouddesk.io
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: tenant-globex-app
            port: { number: 80 }
  # For smaller tenants sharing the same service, use path routing:
  - host: app.clouddesk.io
    http:
      paths:
      - path: /tenant/initech
        pathType: Prefix
        backend:
          service:
            name: shared-app-service
            port: { number: 80 }
EOF

# 6. Verify
kubectl get ingress -n production
kubectl describe certificate clouddesk-wildcard-cert -n production
curl https://acme.clouddesk.io/health   # Should return 200; no -k needed once the certificate is issued
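
The Certificate in step 4 lists both clouddesk.io and *.clouddesk.io because a wildcard stands for exactly one DNS label. The Python sketch below models that matching rule for the two dnsNames. It is a simplified illustration, not a TLS library, and the function names covered_by and certificate_covers are hypothetical.

```python
def covered_by(hostname: str, dns_name: str) -> bool:
    """Check whether one certificate dnsName covers a hostname."""
    host_labels = hostname.lower().split(".")
    name_labels = dns_name.lower().split(".")
    if name_labels[0] != "*":
        return host_labels == name_labels
    # A leading wildcard stands for exactly one label.
    return len(host_labels) == len(name_labels) and host_labels[1:] == name_labels[1:]


DNS_NAMES = ["clouddesk.io", "*.clouddesk.io"]


def certificate_covers(hostname: str) -> bool:
    """Check a hostname against every dnsName on the certificate."""
    return any(covered_by(hostname, name) for name in DNS_NAMES)


for host in ["acme.clouddesk.io", "clouddesk.io", "api.acme.clouddesk.io"]:
    print(host, certificate_covers(host))
```

A nested name such as api.acme.clouddesk.io is not covered, so tenants that need a second subdomain level require an additional dnsName.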

Interview Questions — Chapter 9

Why is Ingress more cost-effective than creating a LoadBalancer Service per microservice?
What is an Ingress Controller? Name three popular ones and one key difference between them.
Explain the difference between path-based and host-based routing. When would you use each?
How does cert-manager work with Let’s Encrypt to automatically issue and renew TLS certificates?
What is the difference between HTTP-01 and DNS-01 challenge in Let’s Encrypt? When must you use DNS-01?
How do you implement a canary deployment using Nginx Ingress annotations? What metrics would you monitor?
An Ingress rule is defined but traffic isn’t reaching the backend service. Walk through your troubleshooting steps.
What is the IngressClass resource and why was it introduced in Kubernetes 1.18?

Chapter Ten

Helm Package Manager

Deploying a microservice application to Kubernetes means writing and maintaining dozens of YAML files — Deployment, Service, Ingress, ConfigMap, Secret, HPA, PDB, ServiceAccount, Roles… As applications grow, raw YAML becomes unmanageable. Helm is the Kubernetes package manager. It bundles all related YAML files into a single versioned, configurable package called a Chart — the equivalent of an apt package or npm module for Kubernetes.

Helm Core Concepts

Chart

A Helm package. Contains all YAML templates plus a values file. Can be versioned and distributed via Helm repositories. Examples: the ingress-nginx chart, the Bitnami PostgreSQL chart.

Values

The configuration file (values.yaml) that customizes a chart. You can override any value at install time. This is how the same chart deploys with 2 replicas in dev and 20 replicas in production.

Release

A specific deployment of a chart into a cluster. Installing the same chart twice (e.g., two PostgreSQL instances) creates two releases with separate names, each independently upgradeable and rollbackable.

Repository

A collection of charts. Like npm registry for Kubernetes. Popular repos: Bitnami (databases, common apps), ingress-nginx, jetstack (cert-manager). Search all at: artifacthub.io.
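
The override behavior described under Values can be sketched as a recursive merge: nested maps are merged key by key, while scalars and lists from the override replace the default outright. The Python sketch below is a model of that idea, not Helm's own code. The function name merge_values and the image tag 3.2.2 are illustrative.

```python
def merge_values(base: dict, override: dict) -> dict:
    """Recursively merge override into base; maps merge, everything else is replaced."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = value
    return merged


defaults = {
    "replicaCount": 2,
    "image": {"repository": "mycompany/myapp", "tag": "3.2.1"},
}
production = {"replicaCount": 20, "image": {"tag": "3.2.2"}}

effective = merge_values(defaults, production)
print(effective)
```

The production file only needs the keys that differ; image.repository is inherited from the defaults.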

Helm Essential Commands

# =============================================
# Installation and Repository Management
# =============================================

# Install Helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
helm version
# version.BuildInfo{Version:"v3.13.0", ...}

# Add repositories
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add stable https://charts.helm.sh/stable   # Deprecated since 2020 and no longer updated; legacy charts only
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx

# Update repos (like apt-get update)
helm repo update

# Search for charts
helm search repo postgresql            # Search the repos you have added locally
helm search hub redis                  # Search Artifact Hub (public)

# Show chart info and default values
helm show chart bitnami/postgresql     # Chart metadata
helm show values bitnami/postgresql    # All configurable values (very long!)
helm show values bitnami/postgresql | grep -A 5 "replicaCount"  # Filter specific settings

# =============================================
# Installing Charts
# =============================================

# Basic install with default values
helm install my-postgres bitnami/postgresql \
  --namespace database \
  --create-namespace

# Install with custom values inline
helm install my-redis bitnami/redis \
  --namespace cache \
  --create-namespace \
  --set auth.password=MyRedisPassword123 \
  --set master.persistence.size=10Gi \
  --set replica.replicaCount=2

# Install with custom values file (BEST PRACTICE — version controlled!)
helm install my-app ./my-chart \
  --namespace production \
  --values values.yaml \
  --values values-production.yaml    # Second file overrides first

# Dry run — see what would be deployed without actually deploying
helm install my-app ./my-chart \
  --dry-run \
  --values values.yaml | grep -A 20 "kind: Deployment"

# =============================================
# Managing Releases
# =============================================

# List all releases
helm list -A              # All namespaces
helm list -n production   # Specific namespace

# Get detailed status of a release
helm status my-postgres -n database

# Get the computed values used for a release
helm get values my-postgres -n database

# Get all rendered YAML for a release
helm get manifest my-postgres -n database

# Upgrade a release (applies changes to values or chart version)
helm upgrade my-postgres bitnami/postgresql \
  --namespace database \
  --set primary.resources.limits.memory=2Gi \
  --reuse-values               # Keep all previously set values, only override what you specify

# Rollback to previous revision
helm rollback my-postgres 1 -n database   # Roll back to revision 1

# View revision history
helm history my-postgres -n database
# REVISION   STATUS     CHART               APP VERSION   DESCRIPTION
# 1          superseded postgresql-12.12.10  15.2.0        Install complete
# 2          deployed   postgresql-12.13.0   15.3.0        Upgrade complete

# Uninstall (removes the release history by default; add --keep-history to retain it)
helm uninstall my-postgres -n database
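
One detail of the rollback command above is easy to miss: a rollback does not rewind the history. Helm records it as a new revision that reuses the configuration of the target. The Python sketch below models that bookkeeping with plain dictionaries. It is a toy model of the history table, and the function name rollback is illustrative.

```python
def rollback(history: list, target_revision: int) -> list:
    """Return a new history in which target_revision is redeployed as a new revision."""
    target = next(entry for entry in history if entry["revision"] == target_revision)
    updated = [dict(entry, status="superseded") for entry in history]
    updated.append({
        "revision": history[-1]["revision"] + 1,
        "chart": target["chart"],
        "status": "deployed",
        "description": f"Rollback to {target_revision}",
    })
    return updated


history = [
    {"revision": 1, "chart": "postgresql-12.12.10", "status": "superseded",
     "description": "Install complete"},
    {"revision": 2, "chart": "postgresql-12.13.0", "status": "deployed",
     "description": "Upgrade complete"},
]

after = rollback(history, 1)
for entry in after:
    print(entry["revision"], entry["status"], entry["chart"], entry["description"])
```

After rolling back to revision 1, the release is at revision 3, so a later rollback to 2 is still possible.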

Creating Your Own Helm Chart

# Generate chart scaffold
helm create myapp
# Creates this structure:
# myapp/
#   Chart.yaml          ← Chart metadata (name, version, description)
#   values.yaml         ← Default values
#   templates/          ← YAML templates with Go templating
#     deployment.yaml
#     service.yaml
#     ingress.yaml
#     _helpers.tpl       ← Reusable template functions
#   charts/             ← Chart dependencies (sub-charts)

# =============================================
# Chart.yaml — Chart metadata
# =============================================
cat > myapp/Chart.yaml <<'EOF'
apiVersion: v2
name: myapp
description: A production-grade microservice Helm chart
type: application
version: 1.0.0          # Chart version (increment this when you change templates)
appVersion: "3.2.1"     # Application version (the Docker image tag)
maintainers:
- name: DevOps Team
  email: devops@company.com
dependencies:
- name: postgresql
  version: "12.x.x"
  repository: https://charts.bitnami.com/bitnami
  condition: postgresql.enabled    # Only include if postgresql.enabled=true in values
EOF

# =============================================
# values.yaml — Default values
# =============================================
cat > myapp/values.yaml <<'EOF'
replicaCount: 2

image:
  repository: mycompany/myapp
  tag: "3.2.1"
  pullPolicy: IfNotPresent

service:
  type: ClusterIP
  port: 80
  targetPort: 3000

ingress:
  enabled: true
  className: nginx
  host: myapp.example.com
  tls: true
  certIssuer: letsencrypt-prod

resources:
  requests:
    memory: "128Mi"
    cpu: "100m"
  limits:
    memory: "256Mi"
    cpu: "500m"

autoscaling:
  enabled: false
  minReplicas: 2
  maxReplicas: 20
  targetCPUUtilization: 60

env:
  LOG_LEVEL: "info"
  NODE_ENV: "production"

postgresql:
  enabled: true          # Enable the postgresql sub-chart dependency
  auth:
    database: myappdb
    username: myappuser
    existingSecret: db-credentials

affinity: {}
tolerations: []
nodeSelector: {}
EOF

# =============================================
# templates/deployment.yaml — with Go templating
# =============================================
cat > myapp/templates/deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "myapp.fullname" . }}        {{/* Uses helper function */}}
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "myapp.labels" . | nindent 4 }} {{/* Standard labels block */}}
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}           {{/* Value from values.yaml */}}
  {{- end }}
  selector:
    matchLabels:
      {{- include "myapp.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "myapp.selectorLabels" . | nindent 8 }}
    spec:
      containers:
      - name: {{ .Chart.Name }}
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        imagePullPolicy: {{ .Values.image.pullPolicy }}
        ports:
        - containerPort: {{ .Values.service.targetPort }}
        env:
        {{- range $key, $value := .Values.env }}
        - name: {{ $key }}
          value: {{ $value | quote }}
        {{- end }}
        {{- if .Values.postgresql.enabled }}
        - name: DB_HOST
          value: {{ .Release.Name }}-postgresql   {{/* Bitnami sub-chart names its Service <release>-postgresql by default */}}
        - name: DB_NAME
          value: {{ .Values.postgresql.auth.database | quote }}
        {{- end }}
        resources:
          {{- toYaml .Values.resources | nindent 10 }}
EOF

# =============================================
# templates/ingress.yaml — conditional rendering
# =============================================
cat > myapp/templates/ingress.yaml <<'EOF'
{{- if .Values.ingress.enabled -}}   {{/* Only create Ingress if ingress.enabled=true */}}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ include "myapp.fullname" . }}
  annotations:
    {{- if .Values.ingress.tls }}
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    {{- end }}
    {{- if .Values.ingress.certIssuer }}
    cert-manager.io/cluster-issuer: {{ .Values.ingress.certIssuer }}
    {{- end }}
spec:
  ingressClassName: {{ .Values.ingress.className }}
  {{- if .Values.ingress.tls }}
  tls:
  - hosts:
    - {{ .Values.ingress.host }}
    secretName: {{ include "myapp.fullname" . }}-tls
  {{- end }}
  rules:
  - host: {{ .Values.ingress.host }}
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: {{ include "myapp.fullname" . }}
            port:
              number: {{ .Values.service.port }}
{{- end }}
EOF

Helm in CI/CD Pipelines

# =============================================
# GitHub Actions workflow for Helm deployment
# =============================================
# .github/workflows/deploy.yaml
name: Deploy to Kubernetes

on:
  push:
    branches: [main]
  workflow_dispatch:
    inputs:
      environment:
        description: 'Target environment'
        required: true
        default: 'staging'
        type: choice
        options: [staging, production]

jobs:
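  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write     # Required for OIDC role assumption (role-to-assume)
      contents: read      # Required by actions/checkout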
    steps:
    - uses: actions/checkout@v4
    
    - name: Configure AWS credentials
      uses: aws-actions/configure-aws-credentials@v4
      with:
        role-to-assume: arn:aws:iam::123456789012:role/github-actions-role   # Placeholder 12-digit account ID
        aws-region: us-east-1
    
    - name: Update kubeconfig for EKS
      run: aws eks update-kubeconfig --name production-cluster --region us-east-1
    
    - name: Set image tag from git commit
      run: echo "IMAGE_TAG=${GITHUB_SHA::8}" >> $GITHUB_ENV
    
    - name: Lint chart
      run: helm lint ./helm/myapp
    
    - name: Deploy to staging
      if: github.event_name == 'push' || inputs.environment == 'staging'
      run: |
        # --wait blocks until all pods are ready, --timeout fails the step
        # if they are not ready within 5 minutes, and --atomic rolls the
        # release back automatically on failure.
        helm upgrade --install myapp-staging ./helm/myapp \
          --namespace staging \
          --create-namespace \
          --values ./helm/myapp/values.yaml \
          --values ./helm/myapp/values-staging.yaml \
          --set image.tag=${{ env.IMAGE_TAG }} \
          --wait \
          --timeout 5m \
          --atomic
    
    - name: Run smoke tests
      run: |
        kubectl run smoke-test --image=curlimages/curl --rm -i --restart=Never \
          -- curl -f http://myapp-staging.staging.svc.cluster.local/health
    
    - name: Deploy to production
      if: inputs.environment == 'production'
      run: |
        helm upgrade --install myapp ./helm/myapp \
          --namespace production \
          --values ./helm/myapp/values.yaml \
          --values ./helm/myapp/values-production.yaml \
          --set image.tag=${{ env.IMAGE_TAG }} \
          --wait --timeout 10m --atomic

Chapter 10 — Project 1

Build and Publish a Helm Chart for a Full Microservices Stack

Scenario: TaskFlow SaaS has 4 microservices: frontend (React), API (Node.js), worker (Python), and a PostgreSQL database. Currently deployed with 40+ raw YAML files that nobody dares to touch. You will package the entire stack into a single Helm chart, create environment-specific values files, publish it to GitHub Container Registry, and deploy it with a single command.

# 1. Create the umbrella chart
helm create taskflow
cd taskflow

# 2. Update Chart.yaml with sub-chart dependencies
cat > Chart.yaml <<'EOF'
apiVersion: v2
name: taskflow
description: TaskFlow SaaS — complete application stack
version: 2.0.0
appVersion: "4.1.0"
dependencies:
- name: postgresql
  version: "12.12.10"
  repository: https://charts.bitnami.com/bitnami
  condition: postgresql.enabled
- name: redis
  version: "18.2.0"
  repository: https://charts.bitnami.com/bitnami
  condition: redis.enabled
EOF

# 3. Create environment-specific values files
cat > values-staging.yaml <<'EOF'
replicaCount: 1
image:
  tag: "latest"
resources:
  requests: { memory: "64Mi",  cpu: "50m" }
  limits:   { memory: "128Mi", cpu: "200m" }
ingress:
  host: staging.taskflow.io
postgresql:
  enabled: true
  primary:
    persistence:
      size: 5Gi
redis:
  enabled: true
  master:
    persistence:
      size: 1Gi
EOF

cat > values-production.yaml <<'EOF'
replicaCount: 5
image:
  tag: "4.1.0"          # Pinned version in production!
resources:
  requests: { memory: "512Mi", cpu: "500m" }
  limits:   { memory: "1Gi",   cpu: "1000m" }
ingress:
  host: app.taskflow.io
  tls: true
  certIssuer: letsencrypt-prod
autoscaling:
  enabled: true
  minReplicas: 5
  maxReplicas: 100
  targetCPUUtilization: 50
postgresql:
  enabled: false          # Use RDS in production, not in-cluster postgres
  externalHost: taskflow.cluster-xyz.us-east-1.rds.amazonaws.com
redis:
  enabled: true
  master:
    persistence:
      size: 20Gi
  replica:
    replicaCount: 2
EOF

# 4. Download sub-chart dependencies
helm dependency update

# 5. Validate the chart
helm lint . --values values-staging.yaml
helm lint . --values values-production.yaml

# 6. Test render (see all generated YAML)
helm template taskflow . \
  --values values-production.yaml \
  --namespace production | head -100

# 7. Package the chart
helm package .
# Successfully packaged chart and saved it to: taskflow-2.0.0.tgz

# 8. Publish to GitHub Container Registry
echo "$GITHUB_TOKEN" | helm registry login ghcr.io -u "$GITHUB_USERNAME" --password-stdin
helm push taskflow-2.0.0.tgz oci://ghcr.io/mycompany/charts

# 9. Deploy staging from published chart
helm install taskflow-staging oci://ghcr.io/mycompany/charts/taskflow \
  --version 2.0.0 \
  --namespace staging \
  --create-namespace \
  --values values-staging.yaml \
  --wait --atomic

# 10. Deploy production
helm install taskflow oci://ghcr.io/mycompany/charts/taskflow \
  --version 2.0.0 \
  --namespace production \
  --create-namespace \
  --values values-production.yaml \
  --wait --timeout 10m --atomic

# 11. Verify everything deployed correctly
helm list -A
# NAMESPACE    NAME               REVISION   STATUS    CHART             APP VERSION
# staging      taskflow-staging   1          deployed  taskflow-2.0.0    4.1.0
# production   taskflow           1          deployed  taskflow-2.0.0    4.1.0

kubectl get all -n production | grep taskflow

What You’ve Achieved

40+ YAML files → 1 Helm chart. Deploy the entire TaskFlow stack to any environment with a single command. Roll back any release with a single helm rollback command. Every deployment is version-controlled and auditable through helm history. This is how mature engineering teams manage Kubernetes at scale.

Common Helm Errors

Error: cannot re-use a name that is still in use

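You ran helm install with a release name that already exists in that namespace. Use helm upgrade --install instead: it installs the release if it is missing and upgrades it otherwise, so running it repeatedly is safe (idempotent). This is the pattern to use in CI/CD pipelines.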
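
As a minimal sketch of that idempotent pattern, the helper below wraps helm upgrade --install. The function name deploy_release and the HELM override variable are illustrative assumptions, not part of Helm; the flags mirror the ones used in the pipeline earlier in this chapter.

```shell
# Hypothetical helper (not part of Helm): installs the release if it does
# not exist and upgrades it otherwise, so re-running it is safe.
# Set HELM=echo to preview the command instead of executing it.
deploy_release() {
  release="$1"      # e.g. myapp-staging
  namespace="$2"    # e.g. staging
  chart="$3"        # e.g. ./helm/myapp
  image_tag="$4"    # e.g. the short commit SHA

  ${HELM:-helm} upgrade --install "$release" "$chart" \
    --namespace "$namespace" \
    --create-namespace \
    --set image.tag="$image_tag" \
    --wait --timeout 5m --atomic
}
```

Calling deploy_release myapp-staging staging ./helm/myapp abc12345 twice in a row should not raise the "cannot re-use a name" error, because the second call becomes an upgrade.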

Error: UPGRADE FAILED: another operation is in progress

A previous Helm operation was interrupted and left the release locked. Check: helm history release-name. If the latest revision is in a “pending-upgrade” (or “pending-rollback”) state, roll back to the last good revision: helm rollback release-name.
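
The recovery decision can be sketched as a small shell function. The name unstick_release and the HELM override are assumptions made for illustration; the status strings are the pending states Helm reports, and the caller is expected to read the current status from helm history or helm status first.

```shell
# Hypothetical helper (not part of Helm): decide what to do with a release
# based on the status of its latest revision.
# Set HELM=echo to preview the command instead of executing it.
unstick_release() {
  release="$1"
  namespace="$2"
  status="$3"       # status of the latest revision, e.g. pending-upgrade

  case "$status" in
    pending-upgrade|pending-rollback)
      # Without a revision number, helm rollback targets the previous revision.
      ${HELM:-helm} rollback "$release" --namespace "$namespace" --wait
      ;;
    pending-install)
      # A stuck first install has no earlier revision to roll back to.
      echo "no earlier revision: consider helm uninstall $release -n $namespace"
      ;;
    *)
      echo "release $release is $status; nothing to do"
      ;;
  esac
}
```

Before rolling back, it is worth confirming that no other pipeline run is still legitimately working on the release, since the same error also appears while a real operation is in flight.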

Error: rendered manifests contain a resource that already exists

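A Kubernetes resource (e.g., a Service) was created outside Helm, and now Helm is trying to create one with the same name. Fix: either delete the resource and let Helm recreate it, or let Helm adopt it (Helm 3.2+) by adding the label app.kubernetes.io/managed-by=Helm and the annotations meta.helm.sh/release-name and meta.helm.sh/release-namespace to the existing resource. Avoid reaching for --force here: it forces resource updates through a replacement strategy and is risky in production.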
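
Helm 3.2 and later will take ownership of an existing resource that carries the label app.kubernetes.io/managed-by=Helm together with the annotations meta.helm.sh/release-name and meta.helm.sh/release-namespace. A minimal sketch of applying that metadata follows; the function name adopt_resource and the KUBECTL override are illustrative assumptions.

```shell
# Hypothetical helper (not part of kubectl or Helm): mark an existing
# resource as owned by a Helm release so the next install/upgrade adopts it.
# Set KUBECTL=echo to preview the commands instead of executing them.
adopt_resource() {
  kind="$1"         # e.g. service
  name="$2"         # e.g. backend-api
  release="$3"      # Helm release that should own the resource
  namespace="$4"    # namespace of both the resource and the release

  ${KUBECTL:-kubectl} label "$kind" "$name" \
    app.kubernetes.io/managed-by=Helm \
    --overwrite --namespace "$namespace"

  ${KUBECTL:-kubectl} annotate "$kind" "$name" \
    meta.helm.sh/release-name="$release" \
    meta.helm.sh/release-namespace="$namespace" \
    --overwrite --namespace "$namespace"
}
```

For example, adopt_resource service backend-api myapp production followed by helm upgrade --install myapp should let Helm manage the Service from then on. The live resource is then reconciled toward the chart's rendered manifest, so review the diff before doing this in production.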

Interview Questions — Chapter 10

What is Helm and what problem does it solve over raw kubectl apply?
Explain the difference between a Helm Chart, a Release, and a Revision.
What is the difference between helm install and helm upgrade --install? Which do you use in CI/CD and why?
How does --atomic work in helm upgrade? When would you use it?
What are Helm hooks and give two production examples of when you would use them?
What is the difference between values.yaml and values-production.yaml? How do you combine them in a single helm command?
A colleague ran helm uninstall in production. How do you recover if the release history still exists (that is, the release was uninstalled with --keep-history)?
What are Chart dependencies and how does helm dependency update work?
How do you test a Helm chart without deploying it to a cluster?

Part 2 Complete

Chapters 6–10 Covered

You’ve mastered Services, NetworkPolicy, ConfigMaps, Secrets, Persistent Storage, Ingress with TLS, and Helm. Coming up in Part 3:

Chapter 11: Kubernetes Security (RBAC, PSA, OPA)

Chapter 12: Monitoring with Prometheus & Grafana

Chapter 13: CI/CD Pipelines with ArgoCD

Chapter 14: Kubernetes on AWS EKS

Chapter 15: Production Deployment Strategies

Sumit Sharma
