Quick Reference
AI-200 Cheatsheets
One cheatsheet per module: key facts, CLI commands, memory tricks, and exam gotchas.
All sourced from Microsoft Learn AI-200 course content.
Module 1 · Containers & Registries
Azure Container Registry (ACR) + App Service
20–25%
Key Facts
- SKUs: Basic (10 GB, dev) → Standard (100 GB, prod) → Premium (500 GB, geo-replication, private link)
- Geo-replication: Premium ONLY; push once, pull from the nearest replica
- Auth: Managed Identity + AcrPull role (never the admin user in prod)
- ACR Tasks (QTS): Quick (one-shot) | Triggered (git/base-image/schedule) | Multi-step (YAML pipeline)
- Base image trigger: Auto-rebuild when the upstream OS image is patched; Premium only
- Tag types: Stable (mutable, dev) vs Unique (immutable, prod) vs Digest (sha256, gold)
- Lock image: --write-enabled false prevents tag overwrite
- WEBSITES_PORT: Must match your app's listen port or the container fails to start
- KV Reference syntax: @Microsoft.KeyVault(SecretUri=https://vault.../secrets/name/)
- CD from ACR: --enable-cd true creates a webhook for auto-pull on image push
Memory Tricks
SKU ladder: BSP = Basic/Standard/Premium. Only P gets Private link + geo-rep.
QTS tasks: Quick=run once, Triggered=event fires it, Step=YAML pipeline.
Port mismatch = #1 startup failure. ALWAYS check WEBSITES_PORT first.
Commands
# Create a Premium registry
az acr create --name myacr --sku Premium -g rg
# Build in cloud (no local Docker)
az acr build --registry myacr --image api:v1 .
# Triggered task (rebuild on git commit)
az acr task create --name build-on-push \
  --registry myacr --image "api:{{.Run.ID}}" \
  --context https://github.com/org/repo \
  --branch main --git-access-token $TOKEN
# Lifecycle: auto-delete untagged manifests older than 30 days
az acr config retention update \
  --registry myacr --status enabled \
  --days 30 --type UntaggedManifests
# Lock production tag
az acr repository update \
  --name myacr --image api:v1.2.3 \
  --write-enabled false
# App Service: set container port
az webapp config appsettings set \
  -g rg -n myapp --settings WEBSITES_PORT=8000
# App Service: view live logs
az webapp log tail -g rg -n myapp
Exam Gotchas
Admin user = NEVER production. Geo-rep = Premium only. KV reference shows literal string = MI missing Secrets User role. Base image trigger = Premium only (common trap).
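The SKU ladder in Key Facts can be read as a small lookup. The sketch below is illustrative only: pick_acr_sku and its parameters are hypothetical names, and the storage figures are the included-storage numbers quoted in Key Facts, not live Azure data.

```python
# Included storage per SKU in GB, as quoted in Key Facts (assumed, not fetched from Azure).
INCLUDED_STORAGE_GB = {"Basic": 10, "Standard": 100, "Premium": 500}


def pick_acr_sku(storage_gb: float, geo_replication: bool = False,
                 private_link: bool = False) -> str:
    """Return the lowest rung of Basic -> Standard -> Premium that fits (hypothetical helper)."""
    # Geo-replication and private link are listed as Premium-only features.
    if geo_replication or private_link:
        return "Premium"
    for sku in ("Basic", "Standard", "Premium"):
        if storage_gb <= INCLUDED_STORAGE_GB[sku]:
            return sku
    # Beyond the included storage, Premium is still the top rung.
    return "Premium"
```

For example, pick_acr_sku(50) lands on Standard, while pick_acr_sku(5, geo_replication=True) jumps straight to Premium regardless of size.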
Module 2 · Serverless Containers
Azure Container Apps (ACA)
20–25%
Key Facts
- Environment: Shared VNet + Log Analytics for all apps inside it
- External ingress: Public FQDN, internet accessible
- Internal ingress: Same-environment only via internal DNS
- Revision: Immutable snapshot created on each update; enables rollback & canary
- secretref: Maps a Container Apps secret to an env var; THE secure injection pattern
- --yaml flag: Config-as-code; IGNORES all other CLI flags
- Registry auth (best): Managed identity + AcrPull role
- Scale to zero: minReplicas=0 + KEDA trigger
- Canary: Multiple revision mode + traffic split % (e.g. 90/10)
- Diagnose order: logs → revision list → replica list
Memory Tricks
secretref: value never appears in YAML, history, or logs. Always the correct answer for "secure secret injection without hardcoding."
Revision = safety net; every update makes one. Update went bad? Deactivate it, activate previous.
Commands
# Fast deploy (creates env + app)
az containerapp up --name ai-api -g rg \
  --environment myenv \
  --image myacr.azurecr.io/api:v1 \
  --target-port 8000 --ingress external
# Config-as-code (YAML overrides ALL flags)
az containerapp create -n ai-api -g rg \
  --environment myenv --yaml ./app.yaml
# Add secret
az containerapp secret set -n ai-api -g rg \
  --secrets openai-key="sk-abc123"
# Reference secret as env var
az containerapp update -n ai-api -g rg \
  --set-env-vars \
  OPENAI_KEY=secretref:openai-key
# View live logs
az containerapp logs show \
  -n ai-api -g rg --follow --tail 50
# List revisions
az containerapp revision list \
  -n ai-api -g rg -o table
Exam Gotchas
--yaml ignores ALL other CLI flags. secretref: = never stored in plain text. Fan-out needs a Service Bus topic, not multiple Container Apps queues.
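As a mental model of the secretref: mapping, the sketch below resolves env var values against a secret store. It is only an illustration of the idea; resolve_env is a hypothetical helper, not how the Container Apps platform is implemented.

```python
SECRETREF_PREFIX = "secretref:"


def resolve_env(env_vars: dict, secrets: dict) -> dict:
    """Replace 'secretref:<name>' values with the named secret's value (hypothetical helper)."""
    resolved = {}
    for key, value in env_vars.items():
        if value.startswith(SECRETREF_PREFIX):
            secret_name = value[len(SECRETREF_PREFIX):]
            if secret_name not in secrets:
                raise KeyError(f"env var {key} references unknown secret {secret_name!r}")
            resolved[key] = secrets[secret_name]
        else:
            # Plain values pass through untouched.
            resolved[key] = value
    return resolved
```

The config only ever carries the reference (secretref:openai-key); the value is looked up at the point the container environment is built.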
Module 3 · Kubernetes
Azure Kubernetes Service (AKS)
20–25%
Key Facts
- Deployment: Desired state; Kubernetes keeps N replicas running
- ClusterIP: Internal only (backend microservices, vector DBs)
- LoadBalancer: Public IP (production AI API endpoints)
- Selector MUST match labels: Service routes to pods by label match
- 1 CPU = 1000m (millicores); 500m = half a core
- requests: Minimum guaranteed (scheduler uses it to pick a node)
- limits: Maximum; the container is OOMKilled if its memory limit is exceeded
- secretKeyRef: Inject K8s Secret as pod env var
- --attach-acr: Grants AcrPull to the cluster's kubelet managed identity
- Rolling update: Needs replicas ≥ 2 for zero-downtime
Memory Tricks
Failure map: ImagePullBackOff=bad image/creds | CrashLoopBackOff=app crash (check --previous logs) | Pending=no resources | No Endpoints=selector≠labels | OOMKill=raise memory limit.
Diagnose: describe (WHAT) then logs (WHY).
Commands
# Create cluster with ACR attached
az aks create -n myaks -g rg \
  --node-count 3 --attach-acr myacr
# Get kubeconfig
az aks get-credentials -n myaks -g rg
# Deploy
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
# Status checks
kubectl get pods
kubectl get svc
kubectl get deployments
# Troubleshoot
kubectl describe pod my-pod        # Events/errors
kubectl logs my-pod                # App stdout
kubectl logs my-pod --previous     # Last crash logs
kubectl get pods --show-labels     # Check labels
# Scale
kubectl scale deployment my-app --replicas=3
# Exec into pod
kubectl exec -it my-pod -- /bin/sh
Exam Gotchas
Selector/label mismatch = Service has no endpoints (silent traffic drop). OOMKill = raise limits.memory, don't restart. K8s Secrets are base64-encoded, NOT encrypted; use the Key Vault CSI driver for production.
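Two of the facts above are easy to check by hand: millicore arithmetic and selector/label matching. The sketch below shows both, assuming simple equality-based selectors; parse_cpu and selector_matches are hypothetical helpers, not part of any Kubernetes client.

```python
def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity such as '500m' or '2' to cores (1000m = 1 core)."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def selector_matches(selector: dict, pod_labels: dict) -> bool:
    """True when every selector key/value pair appears in the pod's labels.

    Assumes a non-empty, equality-based selector; extra pod labels are allowed.
    """
    return all(pod_labels.get(key) == value for key, value in selector.items())
```

A selector of {"app": "api"} matches a pod labelled {"app": "api", "tier": "backend"} but not one labelled {"app": "ai-api"}, which is the "no endpoints" failure from the gotchas.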
Module 4 · NoSQL Database
Azure Cosmos DB for NoSQL
25–30%
Key Facts
- Hierarchy: Account → Database → Container → Item
- Good PK: High cardinality + query-aligned + immutable (userId, productId)
- Bad PK: isActive, status, region; low cardinality = hot partitions
- Point read: ~1 RU per 1 KB item; cheapest op (needs BOTH id + partition key)
- Write: ~5–10 RU/KB
- Cross-partition query: 100+ RU; fans out to ALL partitions
- Manual throughput min: 400 RU/s per dedicated container
- Autoscale min: 1,000 RU/s max setting (scales between 10% and 100% of max); use for bursty AI traffic
- _etag: Optimistic concurrency token; send it back as an If-Match condition on replace
- Prod auth: Entra ID + Cosmos DB Built-in Data Contributor role
Memory Tricks
PK rule of 3 (HAI): High cardinality, Aligned to queries, Immutable.
RU ladder: Point read (1) → Simple query (3–5) → Complex (10+) → Cross-partition (100+).
Autoscale for AI = bursty. Manual = steady/predictable.
SDK Pattern (Python)
from azure.core import MatchConditions
from azure.cosmos import CosmosClient, PartitionKey
from azure.identity import DefaultAzureCredential
client = CosmosClient(
    "https://myacct.documents.azure.com:443/",
    credential=DefaultAzureCredential()
)
db = client.get_database_client("aidata")
ctr = db.get_container_client("products")
# Point read: ~1 RU (cheapest)
item = ctr.read_item(
    item="product-123",
    partition_key="electronics"
)
# Upsert (create or replace)
ctr.upsert_item(body={...})
# Optimistic concurrency: fails with HTTP 412 if the item changed since it was read
ctr.replace_item(
    item=item["id"], body=item,
    etag=item["_etag"],
    match_condition=MatchConditions.IfNotModified
)
# Query with partition key (stays on one partition, no fan-out)
items = list(ctr.query_items(
    query="SELECT * FROM c WHERE c.cat = @cat",
    parameters=[{"name": "@cat", "value": "electronics"}],
    partition_key="electronics"
))
Exam Gotchas
isActive/status as PK = always wrong (2 hot partitions). Point read needs BOTH id AND pk; id alone forces a query. Monitor RU cost via the x-ms-request-charge header.
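To see why the _etag check prevents lost updates, the sketch below simulates it with an in-memory store. This is a stand-in for illustration, not the Cosmos DB SDK: InMemoryContainer, PreconditionFailed, and the method names are all hypothetical.

```python
import uuid


class PreconditionFailed(Exception):
    """Stand-in for the HTTP 412 a stale ETag produces."""


class InMemoryContainer:
    """Toy container keyed by (partition key, id); every write issues a new _etag."""

    def __init__(self):
        self._items = {}

    def upsert(self, pk: str, item: dict) -> dict:
        stored = {**item, "_etag": uuid.uuid4().hex}
        self._items[(pk, item["id"])] = stored
        return dict(stored)

    def read(self, pk: str, item_id: str) -> dict:
        return dict(self._items[(pk, item_id)])

    def replace(self, pk: str, item: dict, if_match: str) -> dict:
        current = self._items[(pk, item["id"])]
        if current["_etag"] != if_match:
            # Someone else wrote after this caller read: reject instead of overwriting.
            raise PreconditionFailed("item changed since it was read")
        return self.upsert(pk, item)
```

If two callers read the same item and both try to replace it, the first replace succeeds and changes the _etag; the second carries a stale _etag and is rejected, so it has to re-read and retry.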
Module 5 · Relational Database
Azure Database for PostgreSQL
25–30%
Key Facts
- Tiers: Burstable (dev, no PgBouncer) | General Purpose (prod) | Memory Optimized (complex queries)
- PgBouncer: ONLY on GP and MO; port 6432 (not 5432)
- Entra token resource: https://ossrdbms-aad.database.windows.net/.default
- Best TLS mode: verify-full (validates CA AND hostname)
- JSONB: Binary JSON with GIN indexing; use for flexible AI metadata
- BIGSERIAL: Auto-increment 64-bit PK (SERIAL is 32-bit and overflows at about 2.1 billion)
- TIMESTAMPTZ: Always over TIMESTAMP (stores UTC, timezone-aware)
- pgvector: CREATE EXTENSION vector; similarity search on embeddings
- Transactional DDL: Wrap ALTER TABLE in BEGIN...COMMIT
- Keyset pagination: WHERE id > last_id LIMIT N (not OFFSET)
Memory Tricks
BuGM: Burstable (no Bouncer) | General (Bouncer ✅) | Memory (Bouncer ✅).
TLS order: disable (blocked) → require (encrypt) → verify-ca (+CA) → verify-full (+hostname = best).
Entra URL: ossrdbms-aad.database.windows.net; memorise exactly.
Patterns
-- Create AI conversation table
CREATE TABLE messages (
    id BIGSERIAL PRIMARY KEY,
    session_id UUID NOT NULL DEFAULT gen_random_uuid(),
    role VARCHAR(50) CHECK (role IN ('user','assistant')),
    content TEXT NOT NULL,
    metadata JSONB DEFAULT '{}'::jsonb,
    created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);
-- Keyset pagination (fast at scale); $2 is the last id from the previous page
SELECT * FROM messages
WHERE session_id = $1 AND id > $2
ORDER BY id ASC LIMIT 50;
-- Upsert
INSERT INTO sessions (id, user_id)
VALUES ($1, $2)
ON CONFLICT (id) DO UPDATE
SET user_id = EXCLUDED.user_id;
-- Transactional DDL (PostgreSQL superpower)
BEGIN;
ALTER TABLE messages ADD COLUMN tokens INT;
CREATE INDEX idx_messages_tokens ON messages(tokens);
COMMIT;
# Entra auth token (Python); pass token.token as the PostgreSQL password
from azure.identity import DefaultAzureCredential
cred = DefaultAzureCredential()
token = cred.get_token(
    "https://ossrdbms-aad.database.windows.net/.default"
)
Exam Gotchas
PgBouncer on Burstable = impossible. verify-full validates CA AND hostname; "ensure client connects to correct server" = verify-full. OFFSET degrades at scale; use keyset pagination.
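The keyset pattern can be tried locally without a PostgreSQL server. The sketch below uses Python's built-in sqlite3 as a stand-in (an assumption for portability; the WHERE id > cursor ... LIMIT shape is the same), and fetch_page is a hypothetical helper.

```python
import sqlite3


def fetch_page(conn, session_id: str, last_id: int, page_size: int):
    """Return rows with id > last_id plus the cursor to pass for the next page."""
    rows = conn.execute(
        "SELECT id, content FROM messages "
        "WHERE session_id = ? AND id > ? ORDER BY id ASC LIMIT ?",
        (session_id, last_id, page_size),
    ).fetchall()
    # The cursor is the last id seen; it stays put when the page is empty.
    next_cursor = rows[-1][0] if rows else last_id
    return rows, next_cursor


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, session_id TEXT, content TEXT)"
)
conn.executemany(
    "INSERT INTO messages (session_id, content) VALUES (?, ?)",
    [("s1", f"msg {i}") for i in range(1, 8)],
)
page1, cursor = fetch_page(conn, "s1", 0, 3)
page2, cursor = fetch_page(conn, "s1", cursor, 3)
```

Each page seeks straight to id > cursor through the primary key index, whereas OFFSET has to walk and discard every skipped row first.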
Module 6 · In-Memory Cache
Azure Managed Redis
20–25%
Key Facts
- Azure Redis port: 10000 with TLS, NOT 6379 (open-source default)
- Cache-aside: GET → MISS → DB → SET with TTL → return (app manages cache)
- Write-through: Write to cache AND DB simultaneously (always consistent)
- Write-behind: Write to cache, async DB update (lower latency, loss risk)
- String: Simple key-value, SET with TTL, INCR for atomic counters
- Hash: Object fields; hset/hget/hincrby (avoid full JSON serialization)
- List: Ordered queue/history; lpush/rpush + lrange + ltrim
- Set: Unique members; sadd/sismember/smembers
- Sorted Set: Ranked data; zadd + zrange (leaderboards, priority)
- Prod auth: redis-entraid + DefaultAzureCredential (no stored keys)
Memory Tricks
Port: Azure Redis = 10000 (not 6379). Always.
Data types SHLSS: String (cache), Hash (objects), List (queue), Set (unique), Sorted Set (ranked).
TTL ladder: Rate limit 60s | Inference 5-60m | Profile 15-60m | Catalog 1-24h | Static 24h+.
Python (redis-py)
import time
import redis
r = redis.Redis(
    host="myinst.redis.azure.com",
    port=10000, ssl=True,
    decode_responses=True,
    password="your-access-key"
)
# Cache-aside pattern
def get_result(key: str):
    hit = r.get(key)
    if hit is not None:
        return hit               # HIT: return instantly
    val = expensive_db_query()   # MISS: fall back to the database
    r.set(key, val, ex=300)      # Cache 5 minutes
    return val
# Atomic rate limiter (fixed one-minute window)
def check_rate(user_id: str) -> bool:
    key = f"rate:{user_id}:{int(time.time())//60}"
    count = r.incr(key)
    r.expire(key, 60)
    return count <= 10  # max 10 req/min
# Hash for user profile
r.hset("user:1001", mapping={
    "name": "Alice", "tier": "premium", "credits": "500"
})
r.hincrby("user:1001", "credits", -10)
# List for chat history (last 10)
r.lpush("chat:abc", "Hello")
r.ltrim("chat:abc", 0, 9)
Exam Gotchas
Port 10000 NOT 6379. Never store API keys/PII in Redis without short TTL + encryption. Cache-aside = app manages cache, NOT transparent proxy. decode_responses=True always.
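Write-through and write-behind from Key Facts differ only in when the database sees the write. The sketch below contrasts them with plain dicts standing in for Redis and the database; both class names are hypothetical and nothing here talks to a real cache.

```python
class WriteThroughCache:
    """Every write reaches the backing store and the cache before set() returns."""

    def __init__(self, store: dict):
        self.cache = {}
        self.store = store

    def set(self, key, value):
        self.store[key] = value   # durable write first
        self.cache[key] = value   # then keep the cache in step


class WriteBehindCache:
    """Writes land in the cache at once; the store only catches up on flush()."""

    def __init__(self, store: dict):
        self.cache = {}
        self.store = store
        self._pending = {}

    def set(self, key, value):
        self.cache[key] = value
        self._pending[key] = value  # lost if the process dies before flush()

    def flush(self):
        self.store.update(self._pending)
        self._pending.clear()
```

The window between set() and flush() is the "loss risk" in the write-behind bullet; write-through pays for closing that window with a slower write path.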
Module 7 · Async Messaging
Azure Service Bus
25–30%
Key Facts
- Queue: Point-to-point; ONE consumer processes each message
- Topic + Subscriptions: Pub/Sub; ALL subscriptions get a copy (fan-out)
- Peek-lock (default): Lock → Process → complete() or abandon(); guaranteed delivery
- Receive-and-Delete: Deleted on receive; fast but lost on crash
- complete(): Removes message (success path)
- abandon(): Returns to queue for retry (failure path)
- DLQ path: queuename/$deadletterqueue (after max delivery count)
- Claim-check: Large payload (>256 KB) → Blob Storage, send URI in message
- Sessions: FIFO per session_id; use for per-user conversation ordering
- SQL filter: Route topic messages by ApplicationProperties values
Memory Tricks
Queue vs Topic: Queue = one wins. Topic = everyone wins.
Peek-lock flow: "Look before you delete": peek (lock) → process → complete or abandon.
Claim-check: "Check coat at door": store blob (coat), carry URI (ticket).
SDK Pattern (Python)
from azure.servicebus import ServiceBusClient, ServiceBusMessage
from azure.identity import DefaultAzureCredential
client = ServiceBusClient(
    "mynamespace.servicebus.windows.net",
    DefaultAzureCredential()
)
# Send message
with client.get_queue_sender("doc-queue") as sender:
    msg = ServiceBusMessage(
        body='{"doc_id":"123","url":"https://..."}',
        message_id="doc-123",
        application_properties={"priority": "high"}
    )
    sender.send_messages(msg)
# Receive with peek-lock
with client.get_queue_receiver(
    "doc-queue", max_wait_time=5
) as receiver:
    for msg in receiver:
        try:
            process(str(msg))
            receiver.complete_message(msg)  # SUCCESS
        except Exception:
            receiver.abandon_message(msg)   # RETRY
# Claim-check: large payloads (upload_to_blob is your own helper)
blob_uri = upload_to_blob(large_content)
with client.get_queue_sender("doc-queue") as sender:
    sender.send_messages(
        ServiceBusMessage(f'{{"blob_uri":"{blob_uri}"}}')
    )
Exam Gotchas
Fan-out to N services = Topic NOT Queue. Receive-and-Delete = message lost on crash (never for critical AI pipeline). Growing DLQ = urgent alert; consumer logic is broken. Standard max msg = 256 KB.
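The peek-lock flow and the path into the dead-letter queue can be traced with a toy in-memory queue. This is a simplified simulation under stated assumptions (no lock timeouts, one receiver), not the Service Bus SDK; PeekLockQueue and its methods are hypothetical.

```python
from collections import deque


class PeekLockQueue:
    """Toy queue: receive() locks a message, complete()/abandon() settle it."""

    def __init__(self, max_delivery_count: int = 3):
        self.max_delivery_count = max_delivery_count
        self.active = deque()   # messages waiting to be delivered
        self.dead_letter = []   # messages that exhausted their retries
        self._locked = {}       # message id -> message being processed

    def send(self, msg_id: str, body: str):
        self.active.append({"id": msg_id, "body": body, "delivery_count": 0})

    def receive(self):
        """Lock the next message; it is not gone until it is settled."""
        if not self.active:
            return None
        msg = self.active.popleft()
        msg["delivery_count"] += 1
        self._locked[msg["id"]] = msg
        return msg

    def complete(self, msg):
        """Success path: remove the message for good."""
        del self._locked[msg["id"]]

    def abandon(self, msg):
        """Failure path: release the lock for a retry, or dead-letter the message."""
        del self._locked[msg["id"]]
        if msg["delivery_count"] >= self.max_delivery_count:
            self.dead_letter.append(msg)
        else:
            self.active.append(msg)
```

A message that keeps getting abandoned is redelivered until its delivery count hits the maximum and then moves to the dead-letter list, which is why a growing DLQ points at broken consumer logic.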
Module 8 · Secrets Management
Azure Key Vault
25–30%
Key Facts
- Object types (SKC): Secrets (strings) | Keys (crypto) | Certificates (TLS)
- Secrets User: Read-only secret values; assign to app managed identity
- Secrets Officer: Full CRUD on secrets; CI/CD, developers
- Contributor ≠ data access: Management-plane role CANNOT read secrets
- Versionless URI: Auto-rotates; always resolves latest version
- Versioned URI: Pinned; does NOT auto-rotate
- Local cache: 5-min in-process TTL; avoids KV throttling (2,000 ops/10s)
- Soft delete: Always on; 7–90 day recovery window
- Purge protection: Prevents permanent delete during retention (compliance)
- HSM keys: Premium tier; FIPS 140-2 Level 3, private key never leaves vault
Memory Tricks
Role ladder: Secrets User (read) → Secrets Officer (CRUD) → Administrator (everything).
"Contributor ≠ secret access": management and data plane are COMPLETELY SEPARATE in Key Vault.
KV reference fails = always check MI has Secrets User role first.
SDK + App Service Pattern
import time
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
client = SecretClient(
    vault_url="https://myvault.vault.azure.net",
    credential=DefaultAzureCredential()
)
# Get latest version
secret = client.get_secret("openai-api-key")
api_key = secret.value
# Get specific version
v = client.get_secret("openai-api-key", version="abc")
# Cache to avoid throttling (5-minute in-process TTL)
_cache = {}
def get_secret_cached(name):
    c = _cache.get(name)
    if c and time.time() - c["t"] < 300:
        return c["v"]
    v = client.get_secret(name).value
    _cache[name] = {"v": v, "t": time.time()}
    return v
# App Service KV Reference (auto-rotates)
# Set app setting value to:
# @Microsoft.KeyVault(SecretUri=
# https://myvault.vault.azure.net/secrets/key/)
# Role assignment (Azure CLI)
az role assignment create \
  --role "Key Vault Secrets User" \
  --assignee $MI_PRINCIPAL_ID \
  --scope $VAULT_RESOURCE_ID
Exam Gotchas
Contributor CANNOT read secrets; must add data-plane role. KV reference literal in app = MI missing Secrets User role. Fetch once at startup + cache. Versionless URI = auto-rotate; pinned = manual rotation.
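The versionless vs pinned distinction comes down to whether the SecretUri has a version segment after the secret name. A minimal sketch, assuming the SecretUri form shown above; kv_reference and is_versionless are hypothetical helpers, not part of any Azure SDK.

```python
from typing import Optional

PREFIX = "@Microsoft.KeyVault(SecretUri="
SUFFIX = ")"


def kv_reference(vault_name: str, secret_name: str, version: Optional[str] = None) -> str:
    """Build an App Service Key Vault reference; leave version out for a versionless URI."""
    uri = f"https://{vault_name}.vault.azure.net/secrets/{secret_name}/"
    if version:
        uri += version
    return f"{PREFIX}{uri}{SUFFIX}"


def is_versionless(reference: str) -> bool:
    """True when the SecretUri stops at the secret name (no version segment)."""
    if not (reference.startswith(PREFIX) and reference.endswith(SUFFIX)):
        raise ValueError("not a SecretUri-style Key Vault reference")
    uri = reference[len(PREFIX):-len(SUFFIX)]
    # Segments: 'https:', '', host, 'secrets', name, and optionally a version.
    return len(uri.rstrip("/").split("/")) == 5
```

A quick check like is_versionless(...) in a deployment script is one way to catch a pinned URI before it silently stops following rotation.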
Module 9 · Observability
OpenTelemetry + Application Insights
15–20%
Key Facts
- Three pillars (TML): Traces (where) | Metrics (what trend) | Logs (why)
- Trace: Complete request journey across all services, linked by one trace ID
- Span: One named, timed unit of work. Has trace ID + span ID + parent span ID
- W3C traceparent: HTTP header that links spans across service boundaries
- Server span → App Insights: requests table
- Client/Internal span → App Insights: dependencies table
- span.set_attribute() → App Insights: customDimensions column
- Trace ID → App Insights: operation_Id column
- Cloud role name: service.name + service.namespace → Application Map node
- Auto-instrumented: requests, FastAPI, psycopg2, Azure SDK, logging
Memory Tricks
Table rule: "My Server Receives Requests. My Client Creates Dependencies." SERVER=requests table, CLIENT=dependencies table.
Without unique cloud role names = ALL services appear as ONE node on Application Map.
traceparent = GPS coordinate linking every span to the global trace.
Setup + Custom Spans (Python)
pip install azure-monitor-opentelemetry
# One call: traces + metrics + logs
from azure.monitor.opentelemetry import configure_azure_monitor
configure_azure_monitor()
# Reads APPLICATIONINSIGHTS_CONNECTION_STRING env var
# Set unique cloud role (per service); use this call instead of the bare one above
from opentelemetry.sdk.resources import Resource
configure_azure_monitor(
    resource=Resource.create({
        "service.name": "embedding-service",
        "service.namespace": "rag-pipeline"
    })
)
# Custom span for AI operation
from opentelemetry import trace
from opentelemetry.trace import StatusCode
tracer = trace.get_tracer("rag-pipeline")
def run_inference(prompt):
    with tracer.start_as_current_span("llm_inference") as span:
        span.set_attribute("model", "gpt-4o")
        # Word count as a rough stand-in for the real prompt token count
        span.set_attribute("prompt_tokens", len(prompt.split()))
        try:
            result = call_openai(prompt)
            span.set_attribute("total_tokens", result.usage.total_tokens)
            return result
        except Exception as e:
            span.set_status(StatusCode.ERROR, str(e))
            span.record_exception(e)
            raise
Exam Gotchas
Outgoing calls (OpenAI, Cosmos, Key Vault) = dependencies table NOT requests table. Without cloud role names = one node on App Map = cannot identify bottleneck. configure_azure_monitor() = one call sets up ALL three pillars.
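The traceparent header is what carries the trace ID from one service to the next. The sketch below parses a version-00 header and derives the header for an outgoing call; it is a simplified illustration (it does not reject all-zero IDs, and the function names are hypothetical), since the OpenTelemetry instrumentation normally does this for you.

```python
import re
import secrets

# version-traceid-parentid-flags, all lowercase hex
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(header: str) -> dict:
    """Split a version-00 traceparent into trace ID, parent span ID, and sampled flag."""
    match = TRACEPARENT_RE.match(header)
    if not match:
        raise ValueError(f"not a version-00 traceparent: {header!r}")
    trace_id, parent_span_id, flags = match.groups()
    return {
        "trace_id": trace_id,
        "parent_span_id": parent_span_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


def child_traceparent(header: str) -> str:
    """Header for an outgoing call: same trace ID, fresh span ID, same flags."""
    ctx = parse_traceparent(header)
    new_span_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex characters
    flags = header.rsplit("-", 1)[1]
    return f"00-{ctx['trace_id']}-{new_span_id}-{flags}"
```

Because every hop keeps the 32-character trace ID and only swaps the span ID, all spans of one request share the same operation_Id in Application Insights.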
Exam Day · All Modules
Scenario → Answer Decision Tree
READ THIS LAST
- Secure secret injection without hardcoding? Key Vault → Managed Identity → Secrets User role
- Message payload exceeds 256 KB? Claim-check: Blob Storage + URI in Service Bus message
- Multiple services must react to same event? Service Bus Topic + subscriptions (fan-out)
- Database for an AI app with bursty/unpredictable traffic? Cosmos DB with Autoscale throughput (10–100% of max)
- Container fails to start on App Service? Check WEBSITES_PORT first, then registry auth credentials
- PostgreSQL connection pooling needed? General Purpose or higher tier + connect on port 6432
- Cache AI inference results at low latency? Redis cache-aside on port 10000 with TTL 5–60 min
- K8s Service not routing traffic? Check selector ≠ labels: kubectl get pods --show-labels
- Find latency bottleneck across AI pipeline? App Insights Application Map → KQL on dependencies table
- ACR geo-replication needed? Premium SKU only; Basic/Standard = no geo-rep
- Contributor can't read Key Vault secrets? Add Key Vault Secrets User data-plane role separately
- Messages per user must be processed in order? Service Bus Sessions with session_id = userId
- Pod keeps restarting with OOMKill? Increase resources.limits.memory in Deployment manifest
- Deploy with git-tracked config, no drift? Container Apps --yaml flag (overrides all other CLI flags)