What percentage of the AI-200 exam covers Secure, Monitor, and Troubleshoot Azure Solutions?

Domain 4 (Secure, Monitor, and Troubleshoot Azure Solutions) accounts for 20–25% of the AI-200 exam. Topics from the Observe and Troubleshoot Apps on Azure area, such as OpenTelemetry and KQL, are actively tested.

Is Azure Monitor & KQL on the AI-200 exam?

Yes. Observe and Troubleshoot Apps on Azure is part of Domain 4 in the official AI-200 skill outline, and Domain 4 is weighted at 20–25%. The key topics tested are OpenTelemetry, KQL, Azure Monitor, and Application Insights.

How do I practice Azure Monitor & KQL hands-on?

Create a free Azure account and follow the code examples in this module step-by-step. The official Microsoft Learn sandbox for Course AI-200T00-A also provides free lab environments for OpenTelemetry and related services.

Module 9: Azure Monitor & KQL — AI-200 Study Notes

Module: Instrument AI Applications with OpenTelemetry

🎬 Unit 1

Introduction

3 min

Distributed AI pipelines span multiple services — API gateway, embedding service, vector search, LLM orchestrator. When a user reports slow responses, which service is the bottleneck? OpenTelemetry answers this with distributed traces showing the full request journey across all services, linked by a single trace ID. Azure Monitor Application Insights stores, queries, and visualizes this telemetry.

💡 Exam Tip

Exam pillars: 1) Trace, span, context propagation (W3C traceparent) 2) Azure Monitor Distro setup (one function call) 3) Cloud role names for Application Map 4) OpenTelemetry → App Insights term mapping 5) KQL queries (requests vs dependencies tables).

📘 Unit 2

Observability Concepts

7 min

Distributed Trace: One trace ID links all spans across services — find the bottleneck instantly

1. The Three Pillars of Observability

Distributed Traces — full path of a request through all services with timing. Answers: "Where is the latency?" Primary AI-200 focus.
Metrics — aggregate numbers over time (request rate, error rate, p95 latency). Answers: "Is something trending wrong?"
Logs — timestamped discrete events. Answers: "Why did this specific operation fail?"

Together: Metrics tell you something changed, Traces tell you where, Logs tell you why.

2. Traces and Spans Anatomy

A trace = the complete record of one request across all services. A span = one named, timed unit of work within that trace.

Trace ID — 128-bit hex, shared by ALL spans in one request journey
Span ID — 64-bit hex, unique to this specific operation
Parent Span ID — links this span to its caller (forms a tree/waterfall)
Name — operation name: HTTP GET /api/chat, vector_search, llm_inference
Start/End timestamps — exact duration in milliseconds
Attributes — key-value metadata: HTTP method, model name, token count
Status — OK or ERROR
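
To make the span fields above concrete, here is a minimal sketch that models spans as plain Python objects and uses the parent links to find where a request spends its time. The `Span` class, the short span IDs, and the timings are illustrative assumptions, not the OpenTelemetry SDK types.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    """Illustrative span record (not the OpenTelemetry SDK class)."""
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]
    name: str
    start_ms: int
    end_ms: int
    status: str = "OK"

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def self_time_ms(span: Span, spans: List[Span]) -> int:
    """Time spent in a span excluding its direct children.

    Assumes the children run one after another without overlapping.
    """
    children = [s for s in spans if s.parent_span_id == span.span_id]
    return span.duration_ms - sum(c.duration_ms for c in children)


TRACE_ID = "4bf92f3577b34da6a3ce929d0e0e4736"  # shared by every span below

# Shortened span IDs and made-up timings, for readability only.
spans = [
    Span(TRACE_ID, "a1", None, "HTTP GET /api/chat", 0, 1000),
    Span(TRACE_ID, "b2", "a1", "vector_search", 50, 200),
    Span(TRACE_ID, "c3", "a1", "llm_inference", 220, 950),
]

bottleneck = max(spans, key=lambda s: self_time_ms(s, spans))
print(bottleneck.name, self_time_ms(bottleneck, spans))
```

The root span lasts 1000 ms, but most of that time belongs to its children, which is why the waterfall view in a trace viewer is more useful than the top-level duration alone.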

3. W3C traceparent — How Spans Connect Across Services

When Service A calls Service B via HTTP, it passes the trace context in a header so B's spans link to A's span in the same trace:

traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
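  version  = 00
  trace ID = 4bf92f3577b34da6a3ce929d0e0e4736 (32 hex chars, shared by every span in the trace)
  span ID  = 00f067aa0ba902b7 (16 hex chars, the caller's span)
  flags    = 01 (sampled)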

Without this header: each service creates a disconnected trace. With it: all services' spans form one unified waterfall.
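
The header layout can be sketched with the standard library alone. The helper names `parse_traceparent` and `child_traceparent` are hypothetical; in a real service the OpenTelemetry instrumentation reads and writes this header for you, so treat this only as an illustration of what is being propagated.

```python
import re
import secrets
from typing import NamedTuple


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    flags: str


# version (2 hex) - trace ID (32 hex) - parent span ID (16 hex) - flags (2 hex)
_TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(header: str) -> TraceParent:
    """Split a traceparent header into its four fields."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        raise ValueError(f"Not a valid traceparent header: {header!r}")
    return TraceParent(*match.groups())


def child_traceparent(incoming: TraceParent) -> str:
    """Build the header for an outgoing call.

    The trace ID is kept so both services land in the same trace;
    only the span ID changes to identify the new calling span.
    """
    new_span_id = secrets.token_hex(8)  # 8 random bytes = 16 hex chars
    return f"{incoming.version}-{incoming.trace_id}-{new_span_id}-{incoming.flags}"


incoming = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
outgoing = parse_traceparent(child_traceparent(incoming))
print(incoming.trace_id == outgoing.trace_id)
```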

💡 Exam Tip

W3C traceparent = the thread that stitches spans across services. Question: "What links spans from different services into a single trace?" Answer: W3C traceparent header propagation.

4. OpenTelemetry → Application Insights Term Mapping

#	OpenTelemetry Concept	App Insights Term	KQL Table
1	Trace ID	Operation ID (operation_Id)	All tables
2	Server span (SpanKind.SERVER)	Request	requests table
3	Client/Internal span	Dependency	dependencies table
4	span.set_attribute("key", val)	customDimensions	customDimensions column
5	Span events	Traces	traces table

⚠️ Common Gotcha

Incoming HTTP requests = requests table. Outgoing calls (OpenAI API, Cosmos DB, Key Vault) = dependencies table. KQL query "find slow OpenAI calls" → query the dependencies table where name contains "openai".

📘 Unit 3

Configure the Azure Monitor OpenTelemetry Distro

8 min

1. Install and Initialize

pip install azure-monitor-opentelemetry

from azure.monitor.opentelemetry import configure_azure_monitor

# One call — sets up traces, metrics, and logs
# Reads APPLICATIONINSIGHTS_CONNECTION_STRING from environment variable
configure_azure_monitor()

This single call initializes trace, metric, and log providers — all exporting to Application Insights. No separate configuration needed for each.

2. Connection String

# Recommended: environment variable
export APPLICATIONINSIGHTS_CONNECTION_STRING="InstrumentationKey=...;IngestionEndpoint=..."

# Code-based (avoid in production)
configure_azure_monitor(connection_string="InstrumentationKey=...")
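
A missing or malformed connection string is a common reason for telemetry silently not arriving. The sketch below is an optional startup check written with the standard library; `parse_connection_string` and `check_connection_string` are hypothetical helpers, and the values used are placeholders rather than real keys or endpoints.

```python
import os
from typing import Dict, Mapping


def parse_connection_string(value: str) -> Dict[str, str]:
    """Split 'Key1=Value1;Key2=Value2' into a dictionary."""
    parts: Dict[str, str] = {}
    for segment in value.split(";"):
        if not segment.strip():
            continue  # tolerate a trailing semicolon
        key, separator, val = segment.partition("=")
        if not separator:
            raise ValueError(f"Malformed segment: {segment!r}")
        parts[key.strip()] = val.strip()
    return parts


def check_connection_string(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Fail fast if the environment variable is absent or incomplete."""
    raw = env.get("APPLICATIONINSIGHTS_CONNECTION_STRING", "")
    parts = parse_connection_string(raw)
    if "InstrumentationKey" not in parts:
        raise RuntimeError("APPLICATIONINSIGHTS_CONNECTION_STRING is missing or incomplete")
    return parts


# Placeholder values only; a real string comes from the Application Insights resource.
fake_env = {
    "APPLICATIONINSIGHTS_CONNECTION_STRING":
        "InstrumentationKey=00000000-0000-0000-0000-000000000000;"
        "IngestionEndpoint=https://example.invalid/"
}
parsed = check_connection_string(fake_env)
print(sorted(parsed))
```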

3. What the Distro Auto-Instruments (No Code Changes)

requests / urllib3 — outgoing HTTP calls to OpenAI, vector DBs → appear as dependency spans
Flask / Django / FastAPI — incoming HTTP requests → appear as server spans (requests table)
psycopg2 — PostgreSQL queries → dependencies table
Azure SDK — Key Vault, Storage, Service Bus calls → dependencies table
Python logging module — standard log calls flow to traces table automatically

💡 Exam Tip

For common AI calls (HTTP to OpenAI, psycopg2 to PostgreSQL, Azure SDK to Key Vault), the Distro captures telemetry automatically. Custom spans are only needed for business-logic operations (chunking, ranking, re-ranking).

4. Cloud Role Name — Multi-Service Identification

Without unique cloud role names, all services appear as a single node on the Application Map. You lose visibility into inter-service latency.

from opentelemetry.sdk.resources import Resource

configure_azure_monitor(
    resource=Resource.create({
        "service.name": "embedding-service",
        "service.namespace": "rag-pipeline"
    })
)
# App Map shows node: "rag-pipeline.embedding-service"

# Or via env var (no code change):
export OTEL_SERVICE_NAME=embedding-service
export OTEL_RESOURCE_ATTRIBUTES=service.namespace=rag-pipeline

⚠️ Common Gotcha

Cloud role name = service.name (+ service.namespace). Without unique names per service, the Application Map shows one node for all services — you cannot see which service is slow.
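
The naming rule described above can be sketched as a small function. This mirrors the behaviour the notes describe (namespace and name joined with a dot) and is not the exporter's actual code; the function names and the `unknown_service` fallback are assumptions for illustration.

```python
from typing import Dict


def parse_resource_attributes(value: str) -> Dict[str, str]:
    """Parse an OTEL_RESOURCE_ATTRIBUTES-style string: 'k1=v1,k2=v2'."""
    attributes: Dict[str, str] = {}
    for pair in value.split(","):
        key, separator, val = pair.partition("=")
        if separator and key.strip():
            attributes[key.strip()] = val.strip()
    return attributes


def cloud_role_name(attributes: Dict[str, str]) -> str:
    """Join namespace and name the way the Application Map node is labelled."""
    name = attributes.get("service.name", "unknown_service")  # assumed fallback
    namespace = attributes.get("service.namespace")
    return f"{namespace}.{name}" if namespace else name


attributes = parse_resource_attributes("service.namespace=rag-pipeline")
attributes["service.name"] = "embedding-service"  # what OTEL_SERVICE_NAME would supply

print(cloud_role_name(attributes))
print(cloud_role_name({"service.name": "vector-search"}))
```

Running the same function over every service's settings before deployment is a cheap way to confirm that no two services would collapse into one node.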

📘 Unit 4

Create Custom Spans for AI Operations

10 min

1. Custom Span for Model Inference

from opentelemetry import trace

tracer = trace.get_tracer("rag-pipeline")

def run_inference(prompt: str):
    # call_openai_api is a placeholder for your own OpenAI client wrapper
    with tracer.start_as_current_span("llm_inference") as span:
        span.set_attribute("model", "gpt-4o")
        # Rough estimate only: whitespace word count, not a real token count
        span.set_attribute("prompt_tokens", len(prompt.split()))

        result = call_openai_api(prompt)

        # Actual token counts as reported in the API response
        span.set_attribute("completion_tokens", result.usage.completion_tokens)
        span.set_attribute("total_tokens", result.usage.total_tokens)
        return result

2. Record Errors in Spans

from opentelemetry.trace import StatusCode

with tracer.start_as_current_span("vector_search") as span:
    try:
        results = vector_db.search(embedding, top_k=5)
        span.set_attribute("results_count", len(results))
    except Exception as e:
        span.set_status(StatusCode.ERROR, str(e))
        span.record_exception(e)   # Adds stack trace to span
        raise
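
The control flow of that pattern is easier to see without the SDK in the way. The sketch below uses a toy `ToySpan` class and `start_span` context manager, both hypothetical stand-ins rather than OpenTelemetry APIs, to show that a failing span is still finished and exported with an error status while the exception continues to the caller.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class ToySpan:
    """Hypothetical stand-in for a span (not the OpenTelemetry class)."""
    name: str
    status: str = "UNSET"
    attributes: Dict[str, object] = field(default_factory=dict)
    events: List[Dict[str, str]] = field(default_factory=list)
    duration_ms: float = 0.0


@contextmanager
def start_span(name: str, exported: List[ToySpan]) -> Iterator[ToySpan]:
    span = ToySpan(name)
    started = time.perf_counter()
    try:
        yield span
    finally:
        # The span is finished and exported even when the body raises.
        span.duration_ms = (time.perf_counter() - started) * 1000
        exported.append(span)


def vector_search(exported: List[ToySpan], fail: bool) -> int:
    with start_span("vector_search", exported) as span:
        try:
            if fail:
                raise TimeoutError("vector DB timed out")
            span.attributes["results_count"] = 5
            return 5
        except Exception as exc:
            span.status = "ERROR"
            span.events.append(
                {"name": "exception", "type": type(exc).__name__, "message": str(exc)}
            )
            raise  # the caller still sees the failure


exported: List[ToySpan] = []
vector_search(exported, fail=False)
try:
    vector_search(exported, fail=True)
except TimeoutError:
    pass

for finished in exported:
    print(finished.name, finished.status, finished.events)
```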

3. Span Kinds and Their Mappings

SpanKind.SERVER — incoming request handler → requests table in App Insights
SpanKind.CLIENT — outgoing call to external service → dependencies table
SpanKind.INTERNAL — internal processing (chunking, ranking) → dependencies table
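
As a quick self-check, the mapping above fits in a lookup table. The `SpanKind` enum here is a local stand-in covering only the three kinds listed, and the span names are made up for the example.

```python
from enum import Enum
from typing import Dict, List, Tuple


class SpanKind(Enum):
    """Local stand-in covering only the three kinds listed above."""
    SERVER = "server"
    CLIENT = "client"
    INTERNAL = "internal"


TABLE_FOR_KIND: Dict[SpanKind, str] = {
    SpanKind.SERVER: "requests",
    SpanKind.CLIENT: "dependencies",
    SpanKind.INTERNAL: "dependencies",
}


def group_by_table(spans: List[Tuple[str, SpanKind]]) -> Dict[str, List[str]]:
    """Sort span names into the KQL table where they would be queried."""
    tables: Dict[str, List[str]] = {"requests": [], "dependencies": []}
    for name, kind in spans:
        tables[TABLE_FOR_KIND[kind]].append(name)
    return tables


pipeline_spans = [
    ("HTTP GET /api/chat", SpanKind.SERVER),
    ("vector_search", SpanKind.INTERNAL),
    ("openai_chat_completion", SpanKind.CLIENT),  # hypothetical span name
    ("llm_inference", SpanKind.INTERNAL),
]
tables = group_by_table(pipeline_spans)
print(tables)
```

Only the incoming request handler ends up in `requests`; everything the service does while handling it, including custom spans, is queried from `dependencies`.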

📘 Unit 5

Analyze Telemetry in Application Insights

7 min

1. Application Map

Visualizes the topology of your AI pipeline. Shows each service as a node, with call volume and failure rates on edges. Requires each service to have a unique cloud role name. Immediately reveals which service is the bottleneck or source of errors.

2. KQL Queries for AI Pipeline Analysis

// 1. Find slowest operations in last hour
requests
| where timestamp > ago(1h)
| summarize avg(duration), percentile(duration, 95) by name
| order by percentile_duration_95 desc

// 2. Find slow OpenAI calls
dependencies
| where name contains "openai"
| where duration > 5000    // over 5 seconds
| project timestamp, name, duration, customDimensions

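// 3. Query by custom span attribute (model name)
// Custom spans such as llm_inference are internal spans, so they land in dependencies
dependencies
| where tostring(customDimensions.model) == "gpt-4o"
| summarize count(), avg(duration) by bin(timestamp, 5m)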

// 4. Error rate by service (cloud_RoleName)
requests
| where timestamp > ago(1h)
| summarize total=count(), errors=countif(success==false) by cloud_RoleName
| extend error_rate = round(100.0 * errors / total, 2)
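
For readers new to `summarize` and `countif`, the error-rate query (query 4) can be mirrored in plain Python over a handful of rows. The sample rows and role names are invented, and the function only imitates the query's arithmetic, not the KQL engine.

```python
from collections import defaultdict
from typing import Dict, List

# Invented sample rows shaped like the requests table columns used in the query.
request_rows: List[dict] = [
    {"cloud_RoleName": "api", "success": True},
    {"cloud_RoleName": "api", "success": True},
    {"cloud_RoleName": "api", "success": False},
    {"cloud_RoleName": "api", "success": True},
    {"cloud_RoleName": "vector-search", "success": False},
    {"cloud_RoleName": "vector-search", "success": False},
    {"cloud_RoleName": "vector-search", "success": True},
]


def error_rate_by_role(rows: List[dict]) -> Dict[str, float]:
    """total=count(), errors=countif(success==false), then the percentage."""
    totals: Dict[str, int] = defaultdict(int)
    errors: Dict[str, int] = defaultdict(int)
    for row in rows:
        role = row["cloud_RoleName"]
        totals[role] += 1
        if not row["success"]:
            errors[role] += 1
    return {role: round(100.0 * errors[role] / totals[role], 2) for role in totals}


rates = error_rate_by_role(request_rows)
print(rates)
```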

⚡ OpenTelemetry + App Insights Master Cheatsheet

Setup (1 line): configure_azure_monitor()

Connection string env var: APPLICATIONINSIGHTS_CONNECTION_STRING

Trace → App Insights: Trace ID = operation_Id column

Server span → KQL table: requests table

Client/Internal span → KQL: dependencies table

Custom attribute → KQL: customDimensions column

Service identity on App Map: service.name + service.namespace resource attrs

Propagation header: W3C traceparent

Auto-instrumented: requests, FastAPI, psycopg2, Azure SDK, logging

Record error in span: span.set_status(ERROR) + span.record_exception(e)

🧪 Unit 6

Exercise — Instrument a RAG Pipeline

30 min

Install azure-monitor-opentelemetry and call configure_azure_monitor()
Set unique cloud role names for each service (API, embeddings, vector search)
Create custom spans for embedding generation and LLM inference with token attributes
Add error recording with record_exception()
Send requests and observe the Application Map in Application Insights
Write KQL to find the slowest operation across the pipeline

🏁 Unit 7

Summary

2 min

OpenTelemetry provides vendor-neutral observability: Traces (where), Metrics (what), Logs (why). The Azure Monitor Distro sets everything up in one call. W3C traceparent links spans across services. Set unique service.name per service for Application Map. Create custom spans for AI-specific operations. Query Application Insights with KQL: incoming requests → requests table, outgoing calls → dependencies table.

🧠 Memory Tricks

Three pillars mnemonic — TML: Traces (where), Metrics (what trend), Logs (why it failed)

Table mapping: "My server receives Requests. My client creates Dependencies." SERVER = requests. CLIENT = dependencies.

traceparent: "The GPS coordinate that finds your span in the global trace timeline."

🔭 Module Cheatsheet

OpenTelemetry + Application Insights


🔑 Key Facts

Three pillars (TML) — Traces (where) | Metrics (what trend) | Logs (why)
W3C traceparent — HTTP header linking spans across service boundaries into one trace
Server span → KQL — requests table (incoming HTTP to your service)
Client span → KQL — dependencies table (outgoing: OpenAI, Cosmos, Key Vault)
Trace ID → App Insights — operation_Id column — joins all spans in a request
Custom attr → KQL — span.set_attribute() → customDimensions column
Cloud role name — service.name + service.namespace → Application Map node
configure_azure_monitor() — One call sets up ALL three pillars (traces + metrics + logs)

💻 Commands & Patterns

pip install azure-monitor-opentelemetry
from azure.monitor.opentelemetry import configure_azure_monitor
from opentelemetry.sdk.resources import Resource
# One call — reads APPLICATIONINSIGHTS_CONNECTION_STRING env var
configure_azure_monitor(resource=Resource.create({
  "service.name": "embedding-service",
  "service.namespace": "rag-pipeline"
}))
# Custom span for AI operation
from opentelemetry import trace
from opentelemetry.trace import StatusCode
tracer = trace.get_tracer("rag-pipeline")
def run_inference(prompt):
  with tracer.start_as_current_span("llm_inference") as span:
    span.set_attribute("model", "gpt-4o")
    span.set_attribute("prompt_tokens", len(prompt.split()))
    try:
      return call_openai(prompt)
    except Exception as e:
      span.set_status(StatusCode.ERROR, str(e))
      span.record_exception(e)
      raise

Module: Analyze Logs and Set Up Alerts with KQL and Azure Monitor

Analyze Logs with KQL — Microsoft Learn

🎬 Unit 1

Introduction to KQL

3 min

Kusto Query Language (KQL) is the query language for Azure Monitor Logs and Application Insights. Use it to query telemetry tables, find errors, calculate latency, and define the queries behind log search alerts. All of these are critical for AI app observability.

💡 Exam Tip

KQL exam pillars: 1) Core operators: where, project, summarize, extend, join, render 2) Key tables: requests, dependencies, exceptions, traces, customMetrics 3) Alert rule: signal → condition → action group → notification 4) ago() for time ranges 5) bin() for time-series aggregation.

📘 Unit 2

Core KQL Operators

10 min

Essential Query Patterns

// Failed requests in last hour
requests
| where timestamp > ago(1h)
| where success == false
| project timestamp, name, resultCode, duration

// P95 latency by operation (bin = time bucket)
requests
| where timestamp > ago(24h)
| summarize p95=percentile(duration, 95) by
    bin(timestamp, 1h), name
| render timechart

// Join requests with exceptions
requests
| where timestamp > ago(1h) and success == false
| join kind=leftouter exceptions
    on operation_Id
| project timestamp, name, type, outerMessage

// Track OpenAI call latency via dependencies
dependencies
| where timestamp > ago(6h)
| where target contains "openai"
| summarize avg_ms=avg(duration), calls=count()
    by bin(timestamp, 30m)
| render timechart
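
The `bin()` plus `summarize` combination in the last query is just bucketing followed by aggregation. Here is a minimal sketch in Python, assuming UTC timestamps and bucket sizes that divide a day evenly; the sample durations are invented and the helper names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Tuple

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bin_timestamp(timestamp: datetime, size: timedelta) -> datetime:
    """Round a timestamp down to the start of its bucket, like bin(timestamp, size)."""
    return EPOCH + ((timestamp - EPOCH) // size) * size


def summarize_by_bin(rows: List[Tuple[datetime, float]], size: timedelta) -> Dict[datetime, dict]:
    """avg_ms=avg(duration), calls=count() by bin(timestamp, size)."""
    buckets: Dict[datetime, List[float]] = defaultdict(list)
    for timestamp, duration_ms in rows:
        buckets[bin_timestamp(timestamp, size)].append(duration_ms)
    return {
        start: {"avg_ms": sum(values) / len(values), "calls": len(values)}
        for start, values in sorted(buckets.items())
    }


def at(hour: int, minute: int) -> datetime:
    return datetime(2024, 1, 1, hour, minute, tzinfo=timezone.utc)


# Invented dependency durations in milliseconds.
dependency_rows = [(at(10, 5), 800.0), (at(10, 20), 1200.0), (at(10, 40), 3000.0)]
summary = summarize_by_bin(dependency_rows, timedelta(minutes=30))
for start, stats in summary.items():
    print(start.strftime("%H:%M"), stats)
```

Each output row corresponds to one point on the timechart: the bucket start time on the x-axis and the aggregated value on the y-axis.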

📘 Unit 3

Azure Monitor Alerts

8 min

Alert Rule → Action Group → Notification

# Create action group (email notification)
# Receivers are added with --action TYPE NAME VALUE
az monitor action-group create \
  --name ai-alerts-ag \
  --resource-group rg \
  --short-name aialerts \
  --action email oncall team@example.com

# Create log alert: fires when more than 10 requests fail within a 5-minute window
# The name FailedRequests in --condition refers to the query defined in --condition-query
az monitor scheduled-query create \
  --name high-error-rate \
  --resource-group rg \
  --scopes $APP_INSIGHTS_ID \
  --condition "count 'FailedRequests' > 10" \
  --condition-query FailedRequests="requests | where success == false" \
  --evaluation-frequency 5m \
  --window-size 5m \
  --action-groups $ACTION_GROUP_ID

💡 Exam Tip

Alert flow: Signal (KQL query) → Condition (threshold) → Action Group (who to notify) → Notification (email/webhook/SMS). Action group is reusable across multiple alert rules.
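
The four-stage flow can be modelled in a few lines to show why a reusable action group is convenient. The `ActionGroup` and `AlertRule` classes below are a toy model of the concepts, not Azure SDK types, and the signal functions return hard-coded numbers in place of real KQL results.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ActionGroup:
    """Toy model: who gets notified. Messages are collected instead of emailed."""
    name: str
    email_receivers: List[str]
    sent: List[str] = field(default_factory=list)

    def notify(self, message: str) -> None:
        for address in self.email_receivers:
            self.sent.append(f"to={address}: {message}")


@dataclass
class AlertRule:
    """Toy model: signal (a query result) plus condition (a threshold)."""
    name: str
    signal: Callable[[], int]
    threshold: int
    action_groups: List[ActionGroup]

    def evaluate(self) -> bool:
        value = self.signal()
        fired = value > self.threshold
        if fired:
            for group in self.action_groups:
                group.notify(f"{self.name}: {value} > {self.threshold}")
        return fired


oncall = ActionGroup("ai-alerts-ag", ["team@example.com"])

# Two rules share one action group; the signals are stubbed with fixed values.
high_error_rate = AlertRule("high-error-rate", lambda: 14, 10, [oncall])
slow_openai_calls = AlertRule("slow-openai-calls", lambda: 2, 5, [oncall])

results = [rule.evaluate() for rule in (high_error_rate, slow_openai_calls)]
print(results, oncall.sent)
```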

📘 Unit 4

Custom Metrics and Dashboards

6 min

Track AI-Specific Metrics

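# Uses the legacy Application Insights SDK (pip install applicationinsights).
# For new code, OpenTelemetry metrics through the Distro are generally the better fit.
from applicationinsights import TelemetryClient

tc = TelemetryClient("YOUR_INSTRUMENTATION_KEY")

# total_tokens, model, and cache_hit come from your own request-handling code
# Track token usage as custom metric
tc.track_metric("openai_tokens_used", total_tokens,
    properties={"model": model, "operation": "embed"})

# Track cache hit/miss ratio
tc.track_metric("semantic_cache_hit", 1 if cache_hit else 0)
tc.flush()

# Query in KQL:
# customMetrics
# | where name == "openai_tokens_used"
# | summarize sum(value) by bin(timestamp, 1h)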

💡 Exam Tip

Custom metrics appear in the customMetrics KQL table. Use them to track AI-specific KPIs: token cost, cache hit rate, embedding latency, RAG relevance scores.
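
Recording a cache hit as 1 and a miss as 0 means the average of the metric is the hit ratio, and summing token values gives usage per model. The sketch below shows that arithmetic over invented rows; the flat row shape is a simplification (in Application Insights, properties such as the model name sit inside customDimensions).

```python
from collections import defaultdict
from typing import Dict, List

# Invented rows loosely shaped like customMetrics records.
metric_rows: List[dict] = [
    {"name": "openai_tokens_used", "value": 1200, "model": "gpt-4o"},
    {"name": "openai_tokens_used", "value": 300, "model": "gpt-4o"},
    {"name": "openai_tokens_used", "value": 90, "model": "embedding-model"},
    {"name": "semantic_cache_hit", "value": 1},
    {"name": "semantic_cache_hit", "value": 0},
    {"name": "semantic_cache_hit", "value": 1},
    {"name": "semantic_cache_hit", "value": 1},
]


def tokens_by_model(rows: List[dict]) -> Dict[str, int]:
    """Equivalent in spirit to: summarize sum(value) by model."""
    totals: Dict[str, int] = defaultdict(int)
    for row in rows:
        if row["name"] == "openai_tokens_used":
            totals[row["model"]] += row["value"]
    return dict(totals)


def cache_hit_ratio(rows: List[dict]) -> float:
    """Average of the 1/0 values, which equals hits divided by lookups."""
    values = [row["value"] for row in rows if row["name"] == "semantic_cache_hit"]
    return sum(values) / len(values) if values else 0.0


token_totals = tokens_by_model(metric_rows)
hit_ratio = cache_hit_ratio(metric_rows)
print(token_totals, hit_ratio)
```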

🏁 Unit 5

Summary

2 min

KQL: pipe-based queries on telemetry tables (requests, dependencies, exceptions, traces, customMetrics). Core operators: where → summarize → project → render. Alerts: KQL signal + threshold → action group → notification. Custom metrics for AI KPIs (tokens, cache hits). bin() for time-series, ago() for relative time windows, percentile() for latency percentiles.
