☸️
Module 3 of 9 · 20–25% · 3 sub-modules · 21 units · Domain 1: Develop Containerized Solutions on Azure

Deploy and Monitor Applications on Azure Kubernetes Service

Deploy and manage workloads on AKS using kubectl and manifest files. Configure node pools, networking with Azure CNI, and monitor with Container Insights and KQL.

Azure Kubernetes Service · kubectl · Helm

Aligned with Course AI-200T00-A

Module

Deploy Applications to Azure Kubernetes Service

🎬 Unit 1

Introduction

5 min

Azure Kubernetes Service (AKS) is a managed Kubernetes service: Azure runs the control plane, and you manage the workloads. Deploy AI inference APIs, vector search services, and background processors as containers that scale out, recover from crashes, and expose public endpoints through Azure Load Balancer. Four concepts to master: Pods, Deployments, Services, and kubectl.

💡 Exam Tip
Exam focus: 1) Deployment manifest fields (replicas, resources, secretKeyRef) 2) Service types (ClusterIP vs LoadBalancer) 3) selector/label matching 4) kubectl troubleshooting commands and failure modes.
📘 Unit 2

Create Kubernetes Deployment Manifests

10 min

AKS Architecture: User → LoadBalancer Service → Pods in Deployment → ACR

[Diagram: User → HTTP → LoadBalancer Service (port 80 → 8080) → Deployment (replicas: 3) with Pod 1, Pod 2, Pod 3 (label app: inference, 2 CPU / 4Gi, Ready ✓). Images are pulled from ACR using AcrPull via managed identity. Service selector app=inference ↔ pod label app=inference (must match!)]

1. Deployment Manifest Structure

A Deployment tells Kubernetes: "run N copies of this container image." Kubernetes then maintains that state by restarting crashed pods and rescheduling them onto healthy nodes.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-inference-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: inference-api       # Must match template.labels below
  template:
    metadata:
      labels:
        app: inference-api     # Must match selector above
    spec:
      containers:
      - name: api
        image: myregistry.azurecr.io/inference-api:v1.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            memory: "2Gi"
            cpu: "1000m"       # 1 CPU = 1000 millicores
          limits:
            memory: "4Gi"
            cpu: "2000m"

2. Replicas for High Availability

  1. 1 replica: single point of failure. App is down while Kubernetes restarts it after a crash.
  2. 2 replicas: basic resilience. One can crash, the other keeps serving. Minimum for production.
  3. 3+ replicas: enables rolling updates with zero downtime. One updates while two keep serving traffic.
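
The "one updates while two keep serving" behavior can be made explicit with a rolling update strategy. The sketch below is illustrative only: the maxUnavailable and maxSurge values are assumptions (not part of the course manifest), and min_serving is a hypothetical helper that computes how many pods are guaranteed to stay available during a rollout.

```shell
#!/bin/sh
# Pods guaranteed to keep serving during a rolling update:
# replicas minus maxUnavailable (given as an absolute number here).
min_serving() {
  replicas=$1
  max_unavailable=$2
  echo $(( replicas - max_unavailable ))
}

# Strategy fragment for a Deployment spec (assumed values, for illustration).
cat <<'EOF'
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most one pod down at a time
      maxSurge: 1         # at most one extra pod during the rollout
EOF

min_serving 3 1   # three replicas, one updating
min_serving 1 1   # a single replica leaves nothing serving
```

With these assumed values, three replicas leave two pods serving while one updates, whereas a single replica with maxUnavailable: 1 leaves none.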

3. Resource Requests and Limits

  1. Requests = minimum guaranteed resources (the scheduler uses them to pick a node). Set too low = pod lands on an overloaded node = poor performance.
  2. Limits = maximum allowed. A container that exceeds its memory limit is killed (OOMKill); one that hits its CPU limit is throttled, not killed. A memory limit set too low for model inference = constant restarts.
⚠️ Common Gotcha
OOMKill = Out-of-Memory Kill (shown as OOMKilled in kubectl output). It happens when a container exceeds its memory limit. Fix: raise the memory limit; restarting the pod only repeats the crash. The exam presents this as "pod keeps restarting", and the expected answer is to increase the memory limit.
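
To make the millicore arithmetic concrete, here is a small sketch that normalizes CPU quantities to millicores and totals the CPU requested by all replicas. The helper name to_millicores is hypothetical, and it only handles the plain forms used in this module ("1000m", "2", "0.5").

```shell
#!/bin/sh
# Convert a Kubernetes CPU quantity ("500m", "2", "0.5") to millicores.
to_millicores() {
  case "$1" in
    *m) printf '%s\n' "${1%m}" ;;                       # already millicores
    *)  awk -v cores="$1" 'BEGIN { printf "%d\n", cores * 1000 + 0.5 }' ;;
  esac
}

# Total CPU the scheduler must find for 2 replicas requesting 1000m each.
per_pod=$(to_millicores 1000m)
total=$(( 2 * per_pod ))
echo "2 replicas request ${total}m CPU in total"
```

The scheduler places each pod on a node with at least the requested amount still unallocated, so the total is a quick sanity check against node pool capacity.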

4. Injecting Secrets via Kubernetes Secrets

kubectl create secret generic api-secrets --from-literal=api-key=your-secret-key

# Reference in Deployment manifest:
env:
- name: API_KEY
  valueFrom:
    secretKeyRef:
      name: api-secrets   # Secret object name
      key: api-key        # Key within the secret
💡 Exam Tip
Kubernetes Secrets are base64-encoded (NOT encrypted by default). For production on AKS, use Azure Key Vault with the Secrets Store CSI driver. The exam tests this distinction.
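
The "encoded, not encrypted" distinction is easy to see locally. This sketch needs no cluster; it uses the placeholder value from the kubectl create secret example and assumes a base64 utility that accepts -d for decoding.

```shell
#!/bin/sh
# Placeholder secret value from the kubectl example above.
secret_value='your-secret-key'

# Kubernetes stores Secret data base64-encoded; encoding is not encryption.
encoded=$(printf '%s' "$secret_value" | base64)
decoded=$(printf '%s' "$encoded" | base64 -d)

printf 'encoded: %s\n' "$encoded"
printf 'decoded: %s\n' "$decoded"   # the original value, recovered with no key
```

Against a real cluster, anyone allowed to read the Secret can do the same with something like kubectl get secret api-secrets -o jsonpath='{.data.api-key}' | base64 -d, which is why Key Vault with the Secrets Store CSI driver is the recommended production option.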
📘 Unit 3

Expose Applications with Kubernetes Services

10 min

1. Service Types: When to Use Each

# | Type | Accessible From | Use For
1 | ClusterIP (default) | Inside cluster only | Backend microservices: vector DB, embeddings service, worker
2 | NodePort | Node IP + high port | Dev/test access without load balancer
3 | LoadBalancer | Internet (public IP) | Production AI APIs accessible externally
4 | ExternalName | Cluster → external DNS | Represent external services as K8s services
💡 Exam Tip
AI scenario: vector DB / embeddings service = ClusterIP (internal only). Public inference API = LoadBalancer (external). This exact pattern appears on the exam.

2. LoadBalancer Service Manifest

apiVersion: v1
kind: Service
metadata:
  name: inference-api-service
spec:
  type: LoadBalancer
  selector:
    app: inference-api    # Routes to pods with this label
  ports:
  - protocol: TCP
    port: 80              # External port (clients connect here)
    targetPort: 8080      # Container port (app listens here)

3. The Selector-Label Contract (Most Common Failure)

Service selector must exactly match pod labels. One typo = Service has no endpoints = silent traffic drop.

kubectl get pods --show-labels
kubectl describe svc inference-api-service | grep Selector
⚠️ Common Gotcha
Selector mismatch is the #1 "why is my service not routing traffic" scenario. Always check selector vs pod labels first.
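
The matching rule itself is simple: every key=value pair in the Service selector must appear among the pod's labels. The sketch below expresses that rule as a hypothetical helper, selector_matches, operating on comma-separated strings in the same shape that kubectl get pods --show-labels prints. It is an illustration of the rule, not how Kubernetes implements it.

```shell
#!/bin/sh
# Return 0 when every key=value pair in the selector appears in the pod labels.
# Both arguments are comma-separated lists such as "app=inference-api,tier=web".
selector_matches() {
  selector=$1
  labels=",$2,"
  old_ifs=$IFS
  IFS=','
  for pair in $selector; do
    case "$labels" in
      *",$pair,"*) ;;                        # this pair is present, keep checking
      *) IFS=$old_ifs; return 1 ;;           # one missing pair means no match
    esac
  done
  IFS=$old_ifs
  return 0
}

pod_labels="app=inference-api,pod-template-hash=abc123"   # example label string

if selector_matches "app=inference-api" "$pod_labels"; then
  echo "match: Service routes to this pod"
fi
if ! selector_matches "app=inference" "$pod_labels"; then
  echo "no match: Service would have no endpoints"
fi
```

Note that the typo case (app=inference versus app=inference-api) fails silently in a real cluster as well: the Service is created without error and simply has no endpoints.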

4. ClusterIP Internal DNS

ClusterIP services are reachable via stable DNS: servicename.namespace.svc.cluster.local. Your API pod calls http://vector-search-svc.default.svc.cluster.local:8080 even as pod IPs change.
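
As a sketch of the naming pattern, the script below writes a ClusterIP manifest for the vector search service and builds its DNS name. The pod label app: vector-search, the file name, and the svc_dns helper are assumptions for illustration; cluster.local is the usual default cluster domain.

```shell
#!/bin/sh
# Build the stable in-cluster DNS name for a Service.
svc_dns() {
  service=$1
  namespace=${2:-default}
  printf '%s.%s.svc.cluster.local\n' "$service" "$namespace"
}

# ClusterIP Service manifest (assumed pod label, for illustration only).
cat > vector-search-svc.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: vector-search-svc
  namespace: default
spec:
  type: ClusterIP
  selector:
    app: vector-search      # hypothetical pod label
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 8080
EOF

echo "http://$(svc_dns vector-search-svc default):8080"
```

Callers in the same namespace can usually shorten the host to just the Service name, but the fully qualified form works from any namespace.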

📘 Unit 4

Deploy, Verify, and Troubleshoot

10 min

1. Deploy with kubectl apply (Idempotent)

kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
kubectl apply -f ./k8s/   # Apply entire directory

kubectl apply creates a resource if it does not exist and updates it if it does, so it is safe to run multiple times.

2. Verify Status

kubectl get deployments                    # READY 2/2 = both replicas healthy
kubectl get pods                           # STATUS Running/Pending/CrashLoopBackOff
kubectl get svc                            # EXTERNAL-IP for LoadBalancer (may show pending)

3. Common Failure Modes and Fixes

  1. ImagePullBackOff: wrong image tag, wrong registry URL, or missing pull credentials. Fix: kubectl describe pod to see the exact error.
  2. CrashLoopBackOff: app starts, then crashes. Fix: kubectl logs pod-name for the stack trace. Do NOT blindly restart.
  3. Pending: the pod can't be scheduled, usually because of insufficient node resources. Fix: kubectl describe pod → Events section shows the reason.
  4. No Endpoints: Service selector doesn't match pod labels. Fix: kubectl describe svc and kubectl get pods --show-labels.
kubectl describe pod my-pod     # Events, conditions, image details
kubectl logs my-pod             # App stdout/stderr
kubectl logs my-pod --previous  # Logs from PREVIOUS crashed container
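
The failure modes above map onto a first diagnostic command. The sketch below encodes that mapping in a hypothetical helper, first_check, which only prints the command to run; it does not contact a cluster.

```shell
#!/bin/sh
# Print the first command to run for a given pod STATUS (from kubectl get pods).
first_check() {
  status=$1
  pod=$2
  case "$status" in
    CrashLoopBackOff)              echo "kubectl logs $pod --previous" ;;
    ImagePullBackOff|ErrImagePull) echo "kubectl describe pod $pod" ;;
    Pending)                       echo "kubectl describe pod $pod" ;;
    *)                             echo "kubectl describe pod $pod" ;;   # default: read Events
  esac
}

first_check CrashLoopBackOff my-pod
first_check Pending my-pod
```

Only CrashLoopBackOff starts with logs, because the container did run and left output behind; the other states fail before the app starts, so the Events section is the useful source.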

⚡ AKS Master Cheatsheet

Task | Command or value
Deploy manifest | kubectl apply -f deployment.yaml
Check pod status | kubectl get pods
View logs | kubectl logs pod-name
Previous crash logs | kubectl logs pod-name --previous
Describe pod (events) | kubectl describe pod pod-name
Check labels | kubectl get pods --show-labels
Check service endpoints | kubectl describe svc svc-name
External IP | kubectl get svc → EXTERNAL-IP column
Attach AKS to ACR | az aks create --attach-acr myregistry
Internal service (backend) | ClusterIP (default)
External service (prod API) | LoadBalancer
1 CPU core | 1000m (millicores)
🧪 Unit 5

Exercise β€” Deploy AI Inference API to AKS

30 min
  1. Create AKS cluster with ACR integration: az aks create -n myaks -g rg --attach-acr myregistry
  2. Write Deployment manifest with 2 replicas, resource requests/limits, and secret injection
  3. Write LoadBalancer Service manifest with correct selector
  4. Apply with kubectl apply -f
  5. Verify EXTERNAL-IP and test API endpoint
  6. Intentionally mismatch labels and observe the "no endpoints" failure
✅ Unit 6

Knowledge Check

5 min
  1. Q: Vector search service should only be reachable by pods inside the cluster. Which Service type? A: ClusterIP
  2. Q: Pod shows ImagePullBackOff. Most likely cause? A: Wrong image tag, registry URL, or missing pull credentials
  3. Q: Service has no endpoints. Where to look? A: Compare pod labels (kubectl get pods --show-labels) with Service selector (kubectl describe svc)
  4. Q: A pod keeps restarting with OOMKill. Fix? A: Increase the memory limit in the Deployment manifest
  5. Q: You need 3 replicas for zero-downtime rolling updates. Which field? A: spec.replicas: 3 in the Deployment manifest
🏁 Unit 7

Summary

2 min

AKS manifests define desired state, and Kubernetes makes it real. Set replicas: 2+ for production. Match Service selector to pod labels exactly. ClusterIP for internal services, LoadBalancer for external APIs. Diagnose in order: kubectl describe pod → kubectl logs → check selector/labels.

🧠 Memory Tricks

Diagnostic flow: "Describe then Logs". describe shows WHAT happened (events); logs shows WHY (app error).

Service type quick rule: Backend microservice = ClusterIP. Public AI API = LoadBalancer.

Failure cheatsheet: ImagePullBackOff=bad image/creds | CrashLoopBackOff=app crash | Pending=no resources | No Endpoints=selector mismatch

☸️
Module Cheatsheet

Azure Kubernetes Service (AKS)

20–25%

🔑 Key Facts

  • ClusterIP: internal only; backend services, vector DBs
  • LoadBalancer: public IP; production AI API endpoints
  • Selector = Labels: Service routes to pods by exact label match
  • 1 CPU = 1000m: 500m = half core. requests = scheduler min, limits = hard cap
  • --previous flag: logs from last CRASHED container instance
  • ImagePullBackOff: wrong tag, registry URL, or missing pull credentials
  • CrashLoopBackOff: app crashes on start; check kubectl logs pod --previous
  • OOMKill: memory limit too low; increase resources.limits.memory

💻 Commands & Patterns

az aks create -n myaks -g rg --node-count 3 --attach-acr myacr
az aks get-credentials -n myaks -g rg
kubectl apply -f deployment.yaml
kubectl get pods          # Running/Pending/CrashLoop?
kubectl describe pod my-pod     # WHY did it fail?
kubectl logs my-pod             # App stdout
kubectl logs my-pod --previous  # Last crash logs
kubectl get pods --show-labels  # Check selector match
kubectl scale deployment myapp --replicas=3
Module

Monitor and Troubleshoot AKS Workloads

🎬 Unit 1

AKS Monitoring Overview

3 min

Container Insights (part of Azure Monitor) provides cluster-level metrics, pod logs, and node utilization for AKS. Combined with kubectl for live debugging, you get full visibility into AI workloads running on Kubernetes.

💡 Exam Tip
AKS monitoring exam pillars: 1) Container Insights for cluster metrics + Log Analytics 2) kubectl describe/logs/exec for live debugging 3) CrashLoopBackOff → check logs --previous 4) Pending → describe pod for resource/scheduling issues 5) Workload Identity for Azure service access from pods.
📘 Unit 2

Container Insights and Log Analytics

7 min

Enable Container Insights

# Enable Container Insights on existing cluster
az aks enable-addons \
  --addons monitoring \
  --name my-aks-cluster \
  --resource-group rg \
  --workspace-resource-id $LOG_ANALYTICS_ID

// KQL: pod restart count (CrashLoopBackOff signal)
// PodRestartCount is a running total per record, so take the max, not the sum
KubePodInventory
| where TimeGenerated > ago(1h)
| where Namespace == "ai-apps"
| summarize restarts=max(PodRestartCount) by PodUid, Name
| where restarts > 3
| order by restarts desc

// KQL: container CPU usage (values are in nanocores)
Perf
| where ObjectName == "K8SContainer"
| where CounterName == "cpuUsageNanoCores"
| summarize avg_cpu=avg(CounterValue)
    by bin(TimeGenerated, 5m), InstanceName
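
The cpuUsageNanoCores counter reports nanocores, while manifests express CPU in millicores, so a unit conversion is needed to compare usage against requests and limits. The sketch below shows the arithmetic (1 core = 1,000,000,000 nanocores = 1000 millicores); the helper names and sample values are hypothetical.

```shell
#!/bin/sh
# 1 millicore = 1,000,000 nanocores.
nanocores_to_millicores() {
  echo $(( $1 / 1000000 ))
}

# Usage as a whole-number percentage of a CPU limit given in millicores.
cpu_percent_of_limit() {
  usage_m=$(nanocores_to_millicores "$1")
  echo $(( usage_m * 100 / $2 ))
}

nanocores_to_millicores 250000000       # a quarter of a core
cpu_percent_of_limit 1500000000 2000    # 1.5 cores used against a 2000m limit
```

A container averaging close to 100% of its CPU limit is being throttled rather than killed, which shows up as latency instead of restarts.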
📘 Unit 3

kubectl Debugging Commands

8 min

Diagnose Common Failures

# CrashLoopBackOff: app crashes on start
kubectl logs my-pod --previous        # last crash logs
kubectl describe pod my-pod           # events section = root cause

# Pending: stuck scheduling
kubectl describe pod my-pod           # look for "Insufficient cpu" or "Insufficient memory"
kubectl get nodes                     # check node capacity
kubectl top nodes                     # live resource usage

# ImagePullBackOff: can't pull image
kubectl describe pod my-pod           # shows registry/auth error
# Fix: ensure AcrPull role on managed identity

# OOMKilled: out of memory
kubectl describe pod my-pod | grep -A5 "Last State"
# Fix: increase resources.limits.memory in deployment YAML

# Exec into running pod for debugging (use /bin/sh if the image has no bash)
kubectl exec -it my-pod -- /bin/bash
kubectl port-forward my-pod 8080:8080  # local testing
⚠️ Common Gotcha
Always check the Events section of kubectl describe pod first: it tells you exactly why scheduling failed. The --previous flag shows the last crashed container's logs, not the current one's.
📘 Unit 4

Workload Identity for Azure Services

7 min

Pods Access Azure Services Securely

# Enable OIDC + Workload Identity on cluster
az aks update --enable-oidc-issuer \
  --enable-workload-identity \
  --name my-aks --resource-group rg

# Create managed identity for the workload
az identity create --name ai-workload-id --resource-group rg

# Federate: allow pod SA to use the managed identity
az identity federated-credential create \
  --name aks-federated \
  --identity-name ai-workload-id \
  --resource-group rg \
  --issuer $OIDC_ISSUER \
  --subject system:serviceaccount:ai-apps:ai-sa

# Grant managed identity access to Azure OpenAI
az role assignment create \
  --role "Cognitive Services OpenAI User" \
  --assignee $MI_CLIENT_ID \
  --scope $OPENAI_RESOURCE_ID
💡 Exam Tip
Workload Identity = pods authenticate to Azure services (Cosmos DB, Azure OpenAI, Key Vault) using a managed identity, with no secrets in YAML or environment variables. Label the pod with azure.workload.identity/use: "true" and run it under the federated service account.
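
On the Kubernetes side, the federation above expects a service account annotated with the managed identity's client ID and a pod that opts in through the label. The sketch below writes such a manifest to a local file. The pod name inference-test, the file name, and the placeholder client ID are assumptions; the ai-apps namespace and ai-sa service account come from the --subject value above.

```shell
#!/bin/sh
# Placeholder client ID. In practice it could come from:
#   az identity show --name ai-workload-id --resource-group rg --query clientId -o tsv
# The issuer URL used for --issuer could come from:
#   az aks show --name my-aks --resource-group rg --query oidcIssuerProfile.issuerUrl -o tsv
MI_CLIENT_ID=${MI_CLIENT_ID:-00000000-0000-0000-0000-000000000000}

# Subject string that the federated credential must match exactly.
federated_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

cat > workload-identity.yaml <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ai-sa
  namespace: ai-apps
  annotations:
    azure.workload.identity/client-id: "${MI_CLIENT_ID}"
---
apiVersion: v1
kind: Pod
metadata:
  name: inference-test          # hypothetical pod name
  namespace: ai-apps
  labels:
    azure.workload.identity/use: "true"
spec:
  serviceAccountName: ai-sa
  containers:
  - name: api
    image: myregistry.azurecr.io/inference-api:v1.0
EOF

federated_subject ai-apps ai-sa
```

If the namespace or service account name in the manifest differs from the subject registered on the federated credential, token exchange fails even though the pod starts normally.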
🏁 Unit 5

Summary

2 min

AKS monitoring: Container Insights → Log Analytics KQL for cluster metrics. kubectl: describe (events/scheduling), logs --previous (crashes), top (live usage), exec (interactive debug). Failure map: CrashLoopBackOff=app crash, Pending=no resources, ImagePullBackOff=registry auth, OOMKill=raise memory limits. Workload Identity: pods access Azure services via managed identity, with no secrets in YAML.


Frequently Asked Questions

What percentage of the AI-200 exam covers Develop Containerized Solutions on Azure?

Domain 1 (Develop Containerized Solutions on Azure) accounts for 20–25% of the AI-200 exam. Topics from this module, such as Azure Kubernetes Service and kubectl, are actively tested. Study all official skill objectives listed in the module header above.

Is Azure Kubernetes Service on the AI-200 exam?

Yes. Deploy and Monitor Applications on Azure Kubernetes Service is part of Domain 1 in the official AI-200 skill outline, weighted at 20–25%. The key technologies tested are Azure Kubernetes Service, kubectl, and Helm. Review the code examples and exam tips in this module for targeted prep.

How do I practice Azure Kubernetes Service hands-on?

The best approach is to create a free Azure account and follow the code examples in this module step by step. The official Microsoft Learn sandbox for Course AI-200T00-A also provides free lab environments for Azure Kubernetes Service and related services.