📋 Quick Reference

AI-200 Cheatsheets

One cheatsheet per module — key facts, CLI commands, memory tricks, and exam gotchas.
All sourced from Microsoft Learn AI-200 course content.


📦

Module 1 · Containers & Registries

Azure Container Registry (ACR) + App Service

20–25%

🔑 Key Facts

SKUs: Basic (10 GB, dev) → Standard (100 GB, prod) → Premium (500 GB, geo-rep, private link)
Geo-replication: Premium ONLY — push once, pull from nearest replica
Auth: Managed Identity + AcrPull role (never admin user in prod)
ACR Tasks (QTS): Quick (one-shot) | Triggered (git/base-image/schedule) | Multi-step (YAML pipeline)
Base image trigger: Auto-rebuild when the upstream base image is patched — part of ACR Tasks, available on all SKUs
Tag types: Stable (mutable, dev) vs Unique (immutable, prod) vs Digest (sha256, gold)
Lock image: --write-enabled false prevents tag overwrite
WEBSITES_PORT: Must match your app listen port or container fails to start
KV Reference syntax: @Microsoft.KeyVault(SecretUri=https://vault.../secrets/name/)
CD from ACR: --enable-cd true creates webhook for auto-pull on image push
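The stable/unique/digest distinction above comes down to how an image reference is pinned. Here is a minimal sketch, assuming a hypothetical helper named `pin_type` (not part of any Azure SDK), that tells a digest-pinned reference from a tag-pinned one:

```python
import re

# Hypothetical helper for illustration; not an Azure or Docker API.
_DIGEST = re.compile(r"@sha256:[0-9a-f]{64}$")

def pin_type(image_ref: str) -> str:
    """Return 'digest', 'tag', or 'untagged' for a container image reference."""
    if _DIGEST.search(image_ref):
        return "digest"
    # Only a ':' in the final path segment is a tag separator; an earlier ':'
    # belongs to a registry port such as localhost:5000.
    last_segment = image_ref.rsplit("/", 1)[-1]
    if ":" in last_segment:
        return "tag"
    return "untagged"

print(pin_type("myacr.azurecr.io/api:v1.2.3"))
print(pin_type("myacr.azurecr.io/api@sha256:" + "a" * 64))
```

A tag can be moved to a different image unless it is locked, while a digest always names one exact manifest, which is why the cheatsheet ranks digests highest.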

🧠 Memory Tricks

SKU ladder: BSP — Basic/Standard/Premium. Only P gets Private link + geo-rep.
QTS tasks: Quick=run once, Triggered=event fires it, Step=YAML pipeline.
Port mismatch = #1 startup failure → ALWAYS check WEBSITES_PORT first.

💻 Commands

az acr create --name myacr --sku Premium -g rg

# Build in cloud (no local Docker)
az acr build --registry myacr --image api:v1 .

# Triggered task (rebuild on git commit)
az acr task create --name build-on-push \
  --registry myacr --image "api:{{.Run.ID}}" \
  --context https://github.com/org/repo.git#main \
  --file Dockerfile --git-access-token $TOKEN

# Lifecycle: auto-delete untagged >30d
az acr config retention update \
  --registry myacr --status enabled \
  --days 30 --type UntaggedManifests

# Lock production tag
az acr repository update \
  --name myacr --image api:v1.2.3 \
  --write-enabled false

# App Service: set container port
az webapp config appsettings set \
  -g rg -n myapp --settings WEBSITES_PORT=8000

# App Service: view live logs
az webapp log tail -g rg -n myapp

⚠️ Exam Gotchas

Admin user = NEVER production. Geo-rep = Premium only. KV reference shows literal string = MI missing Secrets User role. Base image trigger = ACR Tasks feature, works on every SKU (not Premium-gated).

🚀

Module 2 · Serverless Containers

Azure Container Apps (ACA)

20–25%

🔑 Key Facts

Environment: Shared VNet + Log Analytics for all apps inside it
External ingress: Public FQDN — internet accessible
Internal ingress: Same-environment only via internal DNS
Revision: Immutable snapshot created on each update — enable rollback & canary
secretref: Maps Container Apps secret to env var — THE secure injection pattern
--yaml flag: Config-as-code; IGNORES all other CLI flags
Registry auth (best): Managed identity + AcrPull role
Scale to zero: minReplicas=0 + KEDA trigger
Canary: Multiple revision mode + traffic split % (e.g. 90/10)
Diagnose order: logs → revision list → replica list
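To make the canary split concrete, the sketch below routes requests to revisions by percentage weight. It is an illustration only: `pick_revision`, the revision names, and the hash-bucket approach are assumptions for this example, not how Container Apps implements traffic splitting internally.

```python
import hashlib

def pick_revision(request_id: str, weights: dict[str, int]) -> str:
    """Map a request to a revision using percentage weights that sum to 100."""
    if sum(weights.values()) != 100:
        raise ValueError("traffic weights must sum to 100")
    # Hash the request id into a stable bucket in the range 0..99.
    bucket = int(hashlib.sha256(request_id.encode()).hexdigest(), 16) % 100
    upper = 0
    for revision, weight in weights.items():
        upper += weight
        if bucket < upper:
            return revision
    raise AssertionError("unreachable when weights sum to 100")

# Hypothetical revision names for a 90/10 canary.
weights = {"ai-api--stable": 90, "ai-api--canary": 10}
print(pick_revision("req-1", weights))
```

The weights must sum to 100, which mirrors the constraint on a traffic split across revisions in multiple revision mode.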

🧠 Memory Tricks

secretref: value never appears in YAML, history, or logs. Always the correct answer for "secure secret injection without hardcoding."
Revision = safety net — every update makes one. Update went bad? Deactivate it, activate previous.

💻 Commands

# Fast deploy (creates env + app)
az containerapp up --name ai-api -g rg \
  --environment myenv \
  --image myacr.azurecr.io/api:v1 \
  --target-port 8000 --ingress external

# Config-as-code (YAML overrides ALL flags)
az containerapp create -n ai-api -g rg \
  --environment myenv --yaml ./app.yaml

# Add secret
az containerapp secret set -n ai-api -g rg \
  --secrets openai-key="sk-abc123"

# Reference secret as env var
az containerapp update -n ai-api -g rg \
  --set-env-vars \
  OPENAI_KEY=secretref:openai-key

# View live logs
az containerapp logs show \
  -n ai-api -g rg --follow --tail 50

# List revisions
az containerapp revision list \
  -n ai-api -g rg -o table

⚠️ Exam Gotchas

--yaml ignores ALL other CLI flags. secretref: = never stored in plain text. Fan-out needs a Service Bus topic, not multiple Container Apps queues.

☸️

Module 3 · Kubernetes

Azure Kubernetes Service (AKS)

20–25%

🔑 Key Facts

Deployment: Desired state — Kubernetes maintains N replicas always
ClusterIP: Internal only — backend microservices, vector DBs
LoadBalancer: Public IP — production AI API endpoints
Selector MUST match labels: Service routes to pods by label match
1 CPU = 1000m (millicores); 500m = half core
requests: Minimum guaranteed (scheduler picks node)
limits: Maximum — OOMKill if memory exceeded
secretKeyRef: Inject K8s Secret as pod env var
--attach-acr: Grants AcrPull to cluster managed identity
Rolling update: Needs replicas ≥ 2 for zero-downtime
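Two of the facts above are easy to check in code: the millicore arithmetic and the selector-to-label match. This is a minimal sketch with hypothetical helper names (`cpu_to_cores`, `selector_matches`); it handles only plain and `m`-suffixed CPU quantities and equality-based selectors.

```python
def cpu_to_cores(quantity: str) -> float:
    """Convert a Kubernetes CPU quantity such as '500m' or '2' to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)

def selector_matches(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """True when every selector key/value pair is present in the pod labels."""
    return all(labels.get(key) == value for key, value in selector.items())

print(cpu_to_cores("500m"))
print(selector_matches({"app": "api"}, {"app": "api", "tier": "backend"}))
```

Extra labels on the pod do not matter; a single differing value (including a case difference) is enough to leave the Service without endpoints.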

🧠 Memory Tricks

Failure map: ImagePullBackOff=bad image/creds | CrashLoopBackOff=app crash (check --previous logs) | Pending=no resources | No Endpoints=selector≠labels | OOMKill=raise memory limit.
Diagnose: describe (WHAT) then logs (WHY).

💻 Commands

# Create cluster with ACR attached
az aks create -n myaks -g rg \
  --node-count 3 --attach-acr myacr

# Get kubeconfig
az aks get-credentials -n myaks -g rg

# Deploy
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml

# Status checks
kubectl get pods
kubectl get svc
kubectl get deployments

# Troubleshoot
kubectl describe pod my-pod     # Events/errors
kubectl logs my-pod             # App stdout
kubectl logs my-pod --previous  # Last crash logs
kubectl get pods --show-labels  # Check labels

# Scale
kubectl scale deployment my-app --replicas=3

# Exec into pod
kubectl exec -it my-pod -- /bin/sh

⚠️ Exam Gotchas

Selector/label mismatch = Service has no endpoints (silent traffic drop). OOMKill = raise limits.memory, don't restart. K8s Secrets are base64-encoded NOT encrypted — use Key Vault CSI driver for production.

🌍

Module 4 · NoSQL Database

Azure Cosmos DB for NoSQL

25–30%

🔑 Key Facts

Hierarchy: Account → Database → Container → Item
Good PK: High cardinality + query-aligned + immutable (userId, productId)
Bad PK: isActive, status, region — low cardinality = hot partitions
Point read: ~1 RU/KB — cheapest op (needs BOTH id + partition key)
Write: ~5–10 RU/KB
Cross-partition query: 100+ RU — fan-out to ALL partitions
Manual throughput min: 400 RU/s per dedicated container
Autoscale min: 1,000 RU/s as the max setting (scales between 100 and 1,000 RU/s) — use for bursty AI traffic
_etag: Optimistic concurrency token — pass as etag with match_condition=MatchConditions.IfNotModified (sends If-Match)
Prod auth: Entra ID + Cosmos DB Built-in Data Contributor role
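The good-versus-bad partition key rule can be seen by measuring how unevenly items spread across key values. The following is a rough sketch on made-up data; `largest_partition_share` and the sample items are assumptions for illustration, and real Cosmos DB placement also involves hashing logical partitions onto physical ones.

```python
from collections import Counter

def largest_partition_share(items: list[dict], key: str) -> float:
    """Fraction of items that land in the most populated logical partition."""
    counts = Counter(item[key] for item in items)
    return max(counts.values()) / len(items)

# Synthetic data: 1,000 users, 90% of them active.
items = [{"userId": f"u{i}", "isActive": i % 10 != 0} for i in range(1000)]

user_share = largest_partition_share(items, "userId")
active_share = largest_partition_share(items, "isActive")
print(user_share, active_share)
```

With `userId` every logical partition holds one item; with `isActive` a single partition value holds most of the data and therefore most of the traffic.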

🧠 Memory Tricks

PK rule of 3 — HAI: High cardinality, Aligned to queries, Immutable.
RU ladder: Point read (1) → Simple query (3–5) → Complex (10+) → Cross-partition (100+).
Autoscale for AI = bursty. Manual = steady/predictable.

💻 SDK Pattern (Python)

from azure.cosmos import CosmosClient, PartitionKey
from azure.identity import DefaultAzureCredential

client = CosmosClient(
  "https://myacct.documents.azure.com:443/",
  credential=DefaultAzureCredential()
)
db = client.get_database_client("aidata")
ctr = db.get_container_client("products")

# Point read — ~1 RU (cheapest)
item = ctr.read_item(
  item="product-123",
  partition_key="electronics"
)

# Upsert (create or replace)
ctr.upsert_item(body={...})

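# Optimistic concurrency (fails with HTTP 412 if the item changed)
from azure.core import MatchConditions
ctr.replace_item(
  item=item["id"], body=item,
  etag=item["_etag"],
  match_condition=MatchConditions.IfNotModified
)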

# Query with partition key
items = list(ctr.query_items(
  query="SELECT * FROM c WHERE c.cat = @cat",
  parameters=[{"name":"@cat","value":"electronics"}],
  partition_key="electronics"
))

⚠️ Exam Gotchas

isActive/status as PK = always wrong (2 hot partitions). Point read needs BOTH id AND pk — id alone forces a query. Monitor RU cost via x-ms-request-charge header.

🐘

Module 5 · Relational Database

Azure Database for PostgreSQL

25–30%

🔑 Key Facts

Tiers: Burstable (dev, no PgBouncer) | General Purpose (prod) | Memory Optimized (complex queries)
PgBouncer: ONLY on GP and MO — port 6432 (not 5432)
Entra token resource: https://ossrdbms-aad.database.windows.net/.default
Best TLS mode: verify-full (validates CA AND hostname)
JSONB: Binary JSON with GIN indexing — use for flexible AI metadata
BIGSERIAL: Auto-increment 64-bit PK (SERIAL is 32-bit and runs out at ~2.1 billion values)
TIMESTAMPTZ: Always over TIMESTAMP (stores UTC, timezone-aware)
pgvector: CREATE EXTENSION vector — similarity search on embeddings
Transactional DDL: Wrap ALTER TABLE in BEGIN...COMMIT
Keyset pagination: WHERE id > last_id LIMIT N (not OFFSET)
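Keyset pagination is easiest to understand as a loop that remembers the last id it saw. The sketch below runs the same `WHERE id > ? ... LIMIT ?` shape against an in-memory SQLite table, used here only as a stand-in so the example runs without a PostgreSQL server; the table contents and function names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, content TEXT)")
conn.executemany(
    "INSERT INTO messages (id, content) VALUES (?, ?)",
    [(i, f"msg {i}") for i in range(1, 8)],
)

def fetch_page(last_id: int, page_size: int) -> list[tuple[int, str]]:
    """Return the next page of rows whose id is greater than last_id."""
    cur = conn.execute(
        "SELECT id, content FROM messages WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size),
    )
    return cur.fetchall()

def iterate_pages(page_size: int = 3):
    """Yield pages until the table is exhausted, seeking by the last seen id."""
    last_id = 0
    while True:
        page = fetch_page(last_id, page_size)
        if not page:
            return
        yield page
        last_id = page[-1][0]

page_ids = [[row[0] for row in page] for page in iterate_pages(3)]
print(page_ids)
```

Each query seeks directly to `id > last_id` through the primary key index, so the cost of a page does not grow with how deep into the result set it is, unlike OFFSET.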

🧠 Memory Tricks

BuGM: Burstable (no Bouncer) | General (Bouncer ✅) | Memory (Bouncer ✅).
TLS order: disable (blocked) → require (encrypt) → verify-ca (+CA) → verify-full (+hostname = best).
Entra URL: ossrdbms-aad.database.windows.net — memorise exactly.

💻 Patterns

-- Create AI conversation table
CREATE TABLE messages (
  id         BIGSERIAL PRIMARY KEY,
  session_id UUID NOT NULL DEFAULT gen_random_uuid(),
  role       VARCHAR(50) CHECK (role IN ('user','assistant')),
  content    TEXT NOT NULL,
  metadata   JSONB DEFAULT '{}'::jsonb,
  created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);

-- Keyset pagination (fast at scale)
SELECT * FROM messages
WHERE session_id = $1 AND id > $2   -- $2 = last id from the previous page
ORDER BY id ASC LIMIT 50;

-- Upsert
INSERT INTO sessions (id, user_id)
VALUES ($1, $2)
ON CONFLICT (id) DO UPDATE
  SET user_id = EXCLUDED.user_id;

-- Transactional DDL (PostgreSQL superpower)
BEGIN;
ALTER TABLE messages ADD COLUMN tokens INT;
CREATE INDEX idx_messages_tokens ON messages(tokens);
COMMIT;

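# Entra auth token (Python): pass token.token as the connection password
from azure.identity import DefaultAzureCredential
cred = DefaultAzureCredential()
token = cred.get_token(
  "https://ossrdbms-aad.database.windows.net/.default"
)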

⚠️ Exam Gotchas

PgBouncer on Burstable = impossible. verify-full validates CA AND hostname — "ensure client connects to correct server" = verify-full. OFFSET degrades at scale; use keyset pagination.

⚡

Module 6 · In-Memory Cache

Azure Managed Redis

20–25%

🔑 Key Facts

Azure Redis port: 10000 with TLS — NOT 6379 (open-source default)
Cache-aside: GET → MISS → DB → SET with TTL → return (app manages cache)
Write-through: Write to cache AND DB simultaneously (always consistent)
Write-behind: Write to cache, async DB update (lower latency, loss risk)
String: Simple key→value, SET with TTL, INCR for atomic counters
Hash: Object fields — hset/hget/hincrby (avoid full JSON serialization)
List: Ordered queue/history — lpush/rpush + lrange + ltrim
Set: Unique members — sadd/sismember/smembers
Sorted Set: Ranked data — zadd + zrange (leaderboards, priority)
Prod auth: redis-entraid + DefaultAzureCredential (no stored keys)
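The difference between write-through and write-behind is when the database write happens relative to the caller returning. Here is a toy sketch using plain dicts in place of Redis and the database; the class names and the explicit `flush` step are assumptions for illustration (a real write-behind setup would flush from a background worker).

```python
from collections import deque

class WriteThroughCache:
    def __init__(self, db: dict):
        self.db = db
        self.cache: dict = {}

    def put(self, key, value) -> None:
        self.db[key] = value        # database write happens in the same call
        self.cache[key] = value

class WriteBehindCache:
    def __init__(self, db: dict):
        self.db = db
        self.cache: dict = {}
        self.pending: deque = deque()

    def put(self, key, value) -> None:
        self.cache[key] = value     # caller returns before the database is touched
        self.pending.append((key, value))

    def flush(self) -> None:
        """Apply queued writes; anything still pending is lost on a crash."""
        while self.pending:
            key, value = self.pending.popleft()
            self.db[key] = value

db_a, db_b = {}, {}
through, behind = WriteThroughCache(db_a), WriteBehindCache(db_b)
through.put("user:1", "Alice")
behind.put("user:1", "Alice")
db_b_before_flush = dict(db_b)
behind.flush()
```

The window between `put` and `flush` is the loss risk the cheatsheet mentions for write-behind.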

🧠 Memory Tricks

Port: Azure Redis = 10000 (not 6379). Always.
Data types SHLSS: String (cache), Hash (objects), List (queue), Set (unique), Sorted Set (ranked).
TTL ladder: Rate limit 60s | Inference 5-60m | Profile 15-60m | Catalog 1-24h | Static 24h+.

💻 Python (redis-py)

import time
import redis

r = redis.Redis(
  host="myinst.redis.azure.com",
  port=10000, ssl=True,
  decode_responses=True,
  password="your-access-key"
)

# Cache-aside pattern
def get_result(key: str):
  hit = r.get(key)
  if hit is not None: return hit   # HIT: return instantly (empty string is still a hit)
  val = expensive_db_query()
  r.set(key, val, ex=300)      # Cache 5 minutes
  return val

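# Rate limiter (fixed 1-minute window; INCR + EXPIRE sent as one MULTI/EXEC)
def check_rate(user_id: str) -> bool:
  key = f"rate:{user_id}:{int(time.time())//60}"
  pipe = r.pipeline()          # transactional by default
  pipe.incr(key)
  pipe.expire(key, 60)
  count, _ = pipe.execute()
  return count <= 10           # max 10 req/min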

# Hash for user profile
r.hset("user:1001", mapping={
  "name":"Alice","tier":"premium","credits":"500"
})
r.hincrby("user:1001", "credits", -10)

# List for chat history (last 10)
r.lpush("chat:abc", "Hello")
r.ltrim("chat:abc", 0, 9)

⚠️ Exam Gotchas

Port 10000 NOT 6379. Never store API keys/PII in Redis without short TTL + encryption. Cache-aside = app manages cache, NOT transparent proxy. decode_responses=True returns str instead of bytes (leave it off only for binary values).

📨

Module 7 · Async Messaging

Azure Service Bus

25–30%

🔑 Key Facts

Queue: Point-to-point — ONE consumer processes each message
Topic + Subscriptions: Pub/Sub — ALL subscriptions get a copy (fan-out)
Peek-lock (default): Lock → Process → complete() or abandon() — at-least-once delivery
Receive-and-Delete: Deleted on receive — fast but lost on crash
complete(): Removes message (success path)
abandon(): Returns to queue for retry (failure path)
DLQ path: queuename/$deadletterqueue (after max delivery count)
Claim-check: Large payload (>256 KB) → Blob Storage, send URI in message
Sessions: FIFO per session_id — use for per-user conversation ordering
SQL filter: Route topic messages by ApplicationProperties values
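The peek-lock, abandon, and dead-letter facts above fit together as one small state machine. The sketch below is an in-memory toy, not the Service Bus SDK: `ToyQueue`, `drain`, and the max delivery count of 3 are assumptions chosen for illustration, and locks never time out here.

```python
from collections import deque

class ToyQueue:
    """In-memory stand-in for a queue with peek-lock and a dead-letter queue."""

    def __init__(self, max_delivery_count: int = 3):
        self.max_delivery_count = max_delivery_count
        self.active: deque = deque()     # entries are [body, delivery_count]
        self.dead_letter: list = []

    def send(self, body: str) -> None:
        self.active.append([body, 0])

    def receive(self):
        """Lock the next message: it is hidden from others until settled."""
        if not self.active:
            return None
        entry = self.active.popleft()
        entry[1] += 1
        return entry

    def complete(self, entry) -> None:
        """Success path: the locked message is simply discarded."""

    def abandon(self, entry) -> None:
        """Failure path: retry, or dead-letter once deliveries are exhausted."""
        if entry[1] >= self.max_delivery_count:
            self.dead_letter.append(entry[0])
        else:
            self.active.append(entry)

def drain(queue: ToyQueue, handler) -> None:
    while (entry := queue.receive()) is not None:
        try:
            handler(entry[0])
            queue.complete(entry)
        except Exception:
            queue.abandon(entry)

processed, failures = [], []

def handler(body: str) -> None:
    if body == "bad":
        failures.append(body)
        raise ValueError("cannot process")
    processed.append(body)

queue = ToyQueue(max_delivery_count=3)
queue.send("ok")
queue.send("bad")
drain(queue, handler)
```

A message that keeps failing is retried up to the delivery limit and then parked in the dead-letter list, which is why a growing DLQ usually points at broken consumer logic rather than at the queue.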

🧠 Memory Tricks

Queue vs Topic: Queue = one wins. Topic = everyone wins.
Peek-lock flow: "Look before you delete" — peek (lock) → process → complete or abandon.
Claim-check: "Check coat at door" — store blob (coat), carry URI (ticket).

💻 SDK Pattern (Python)

from azure.servicebus import ServiceBusClient, ServiceBusMessage
from azure.identity import DefaultAzureCredential

client = ServiceBusClient(
  "mynamespace.servicebus.windows.net",
  DefaultAzureCredential()
)

# Send message
with client.get_queue_sender("doc-queue") as sender:
  msg = ServiceBusMessage(
    body='{"doc_id":"123","url":"https://..."}',
    message_id="doc-123",
    application_properties={"priority":"high"}
  )
  sender.send_messages(msg)

# Receive with peek-lock
with client.get_queue_receiver(
  "doc-queue", max_wait_time=5
) as receiver:
  for msg in receiver:
    try:
      process(str(msg))
      receiver.complete_message(msg)   # SUCCESS
    except Exception:
      receiver.abandon_message(msg)    # RETRY

# Claim-check: large payloads (open a new sender; the one above is closed)
blob_uri = upload_to_blob(large_content)
with client.get_queue_sender("doc-queue") as sender:
  sender.send_messages(
    ServiceBusMessage(f'{{"blob_uri":"{blob_uri}"}}')
  )

⚠️ Exam Gotchas

Fan-out to N services = Topic NOT Queue. Receive-and-Delete = message lost on crash (never for critical AI pipeline). Growing DLQ = urgent alert — consumer logic is broken. Standard max msg = 256 KB.

🔑

Module 8 · Secrets Management

Azure Key Vault

25–30%

🔑 Key Facts

Object types (SKC): Secrets (strings) | Keys (crypto) | Certificates (TLS)
Secrets User: Read-only secret values — assign to app managed identity
Secrets Officer: Full CRUD on secrets — CI/CD, developers
Contributor ≠ data access: Management-plane role CANNOT read secrets
Versionless URI: Auto-rotates — always resolves latest version
Versioned URI: Pinned — does NOT auto-rotate
Local cache: 5-min in-process TTL — avoids KV throttling (4,000 secret transactions per 10 s per vault)
Soft delete: Always on — 7–90 day recovery window
Purge protection: Prevents permanent delete during retention (compliance)
HSM keys: Premium tier — FIPS 140-2 Level 3, private key never leaves vault
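Whether a secret URI is versionless or pinned depends only on the number of path segments. The sketch below builds both forms and wraps one in the App Service reference syntax quoted in Module 1; the helper names and the vault, secret, and version values are hypothetical.

```python
from typing import Optional
from urllib.parse import urlparse

def secret_uri(vault_name: str, secret_name: str, version: Optional[str] = None) -> str:
    """Build a Key Vault secret URI; omit version for the versionless form."""
    base = f"https://{vault_name}.vault.azure.net/secrets/{secret_name}/"
    return base + version if version else base

def is_versionless(uri: str) -> bool:
    """A versionless URI has exactly two path segments: 'secrets' and the name."""
    parts = [part for part in urlparse(uri).path.split("/") if part]
    return len(parts) == 2 and parts[0] == "secrets"

def app_setting_reference(uri: str) -> str:
    """Wrap a secret URI in the App Service Key Vault reference syntax."""
    return f"@Microsoft.KeyVault(SecretUri={uri})"

latest = secret_uri("myvault", "openai-api-key")
pinned = secret_uri("myvault", "openai-api-key", "abc123")
print(app_setting_reference(latest))
```

Only the versionless form follows rotation; the pinned form keeps resolving the same version until the setting is edited.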

🧠 Memory Tricks

Role ladder: Secrets User (read) → Secrets Officer (CRUD) → Administrator (everything).
"Contributor ≠ secret access" — management and data plane are COMPLETELY SEPARATE in Key Vault.
KV reference fails = always check MI has Secrets User role first.

💻 SDK + App Service Pattern

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

client = SecretClient(
  vault_url="https://myvault.vault.azure.net",
  credential=DefaultAzureCredential()
)

# Get latest version
secret = client.get_secret("openai-api-key")
api_key = secret.value

# Get specific version
v = client.get_secret("openai-api-key", version="abc")

# Cache to avoid throttling
import time
_cache = {}
def get_secret_cached(name):
  c = _cache.get(name)
  if c and time.time() - c["t"] < 300:
    return c["v"]
  v = client.get_secret(name).value
  _cache[name] = {"v": v, "t": time.time()}
  return v

# App Service KV Reference (auto-rotates)
# Set app setting value to:
# @Microsoft.KeyVault(SecretUri=
#   https://myvault.vault.azure.net/secrets/key/)

# Role assignment
az role assignment create \
  --role "Key Vault Secrets User" \
  --assignee $MI_PRINCIPAL_ID \
  --scope $VAULT_RESOURCE_ID

⚠️ Exam Gotchas

Contributor CANNOT read secrets — must add data-plane role. KV reference literal in app = MI missing Secrets User role. Fetch once at startup + cache. Versionless URI = auto-rotate; pinned = manual rotation.

🔭

Module 9 · Observability

OpenTelemetry + Application Insights

15–20%

🔑 Key Facts

Three pillars (TML): Traces (where) | Metrics (what trend) | Logs (why)
Trace: Complete request journey across all services, linked by one trace ID
Span: One named, timed unit of work. Has trace ID + span ID + parent span ID
W3C traceparent: HTTP header that links spans across service boundaries
Server span → App Insights: requests table
Client/Internal span → App Insights: dependencies table
span.set_attribute() → App Insights: customDimensions column
Trace ID → App Insights: operation_Id column
Cloud role name: service.name + service.namespace → Application Map node
Auto-instrumented: requests, FastAPI, psycopg2, Azure SDK, logging
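The `traceparent` header is what carries the trace ID from one service to the next. Here is a minimal sketch of parsing an incoming header and deriving the header for an outgoing call, assuming version `00` of the W3C format (`version-traceid-parentid-flags`); the function names are hypothetical, the validation is deliberately partial, and in practice the OpenTelemetry propagators do this for you.

```python
import re
import secrets

_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)

def parse_traceparent(header: str) -> dict:
    """Split a traceparent header into its four fields."""
    match = _TRACEPARENT.match(header.strip())
    if not match:
        raise ValueError("malformed traceparent header")
    version, trace_id, parent_id, flags = match.groups()
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 1),
    }

def child_traceparent(header: str) -> str:
    """Header for an outgoing call: same trace id, freshly generated span id."""
    ctx = parse_traceparent(header)
    new_span_id = secrets.token_hex(8)      # 8 random bytes = 16 hex characters
    flags = "01" if ctx["sampled"] else "00"
    return f"00-{ctx['trace_id']}-{new_span_id}-{flags}"

incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
ctx = parse_traceparent(incoming)
outgoing = parse_traceparent(child_traceparent(incoming))
```

The trace ID survives every hop while the span ID changes, which is what lets Application Insights group all spans under one `operation_Id`.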

🧠 Memory Tricks

Table rule: "My Server Receives Requests. My Client Creates Dependencies." SERVER=requests table, CLIENT=dependencies table.
Without unique cloud role names = ALL services appear as ONE node on Application Map.
traceparent = GPS coordinate linking every span to the global trace.

💻 Setup + Custom Spans (Python)

pip install azure-monitor-opentelemetry

# One call — traces + metrics + logs
# Reads APPLICATIONINSIGHTS_CONNECTION_STRING env var
from azure.monitor.opentelemetry import configure_azure_monitor
from opentelemetry.sdk.resources import Resource

# Call once per process; the resource sets a unique cloud role (per service)
configure_azure_monitor(
  resource=Resource.create({
    "service.name": "embedding-service",
    "service.namespace": "rag-pipeline"
  })
)

# Custom span for AI operation
from opentelemetry import trace
from opentelemetry.trace import StatusCode

tracer = trace.get_tracer("rag-pipeline")

def run_inference(prompt):
  with tracer.start_as_current_span("llm_inference") as span:
    span.set_attribute("model", "gpt-4o")
    span.set_attribute("prompt_tokens", len(prompt.split()))  # rough word count, not a tokenizer count
    try:
      result = call_openai(prompt)
      span.set_attribute("total_tokens", result.usage.total_tokens)
      return result
    except Exception as e:
      span.set_status(StatusCode.ERROR, str(e))
      span.record_exception(e)
      raise

⚠️ Exam Gotchas

Outgoing calls (OpenAI, Cosmos, Key Vault) = dependencies table NOT requests table. Without cloud role names = one node on App Map = cannot identify bottleneck. configure_azure_monitor() = one call sets up ALL three pillars.

🎯

Exam Day · All Modules

Scenario → Answer Decision Tree

READ THIS LAST

Secure secret injection without hardcoding? Key Vault → Managed Identity → Secrets User role

Message payload exceeds 256 KB? Claim-check: Blob Storage + URI in Service Bus message

Multiple services must react to same event? Service Bus Topic + subscriptions (fan-out)

Database for an AI app with bursty/unpredictable traffic? Cosmos DB with Autoscale throughput (scales between 10% and 100% of max)

Container fails to start on App Service? Check WEBSITES_PORT first, then registry auth credentials

PostgreSQL connection pooling needed? General Purpose or higher tier + connect on port 6432

Cache AI inference results at low latency? Redis cache-aside on port 10000 with TTL 5–60 min

K8s Service not routing traffic? Check for a selector/label mismatch: kubectl get pods --show-labels

Find latency bottleneck across AI pipeline? App Insights Application Map → KQL on dependencies table

ACR geo-replication needed? Premium SKU only — Basic/Standard = no geo-rep

Contributor can't read Key Vault secrets? Add Key Vault Secrets User data-plane role separately

Messages per user must be processed in order? Service Bus Sessions with session_id = userId

Pod keeps restarting with OOMKill? Increase resources.limits.memory in Deployment manifest

Deploy with git-tracked config, no drift? Container Apps --yaml flag (overrides all other CLI flags)