🌍
Module 4 of 9 · 25–30% · 3 sub-modules · 24 units · Domain 2: Develop AI Solutions by Using Azure Data Management Services

Develop AI Solutions with Azure Cosmos DB for NoSQL

Connect to Cosmos DB using the SDK and run queries. Optimize RU consumption with indexing policies and consistency levels. Store embeddings and execute vector similarity search. Implement the change feed processor.

Azure Cosmos DB · DiskANN · Change Feed · Python SDK

Last updated: · Aligned with Course AI-200T00-A

Module

Build Queries for Azure Cosmos DB for NoSQL

🎬 Unit 1

Introduction

3 min

Azure Cosmos DB for NoSQL is a globally distributed, schemaless JSON document database with automatic indexing and single-digit millisecond reads. AI applications use it for product catalogs, user profiles, inference result caching, and conversation history. The key insight: partition key selection determines performance more than anything else.

💡 Exam Tip
Exam pillars: 1) Resource hierarchy (Account → DB → Container → Item) 2) Partition key design rules 3) Request Units (RU) cost model 4) SDK CRUD pattern (point read vs query) 5) Throughput modes (Manual vs Autoscale).
📘 Unit 2

Explore Azure Cosmos DB for NoSQL

10 min

[Figure: Cosmos DB container partitioned by userId, with data spread across physical partitions. Panels: Partition /userId (user-001: 5 RU, user-002: 3 RU, user-003: 8 RU); Partition /region (us-east: 12 RU, eu-west: 7 RU, asia-se: 4 RU); Vector Index (DiskANN index, 1536-dim embeddings, cosine similarity).]

1. Resource Hierarchy

Memory aid: Library → Floor → Bookshelf → Book

  1. Account: top-level resource. Unique DNS endpoint: https://myaccount.documents.azure.com:443/. Holds up to 500 databases and containers combined (default limit).
  2. Database: logical namespace. Groups related containers. Can share throughput across containers (shared throughput mode).
  3. Container: unit of scalability. Requires a partition key. Schema-agnostic: items in the same container can have different shapes.
  4. Item: individual JSON document. Must have id + partition key value to be uniquely addressable via point read.

2. Partition Key Design: The Most Important Decision

  1. High cardinality: many distinct values spread data across partitions evenly. Good: userId, tenantId, productId. Bad: status (only a handful of values), isActive (boolean).
  2. Query-aligned: if you always query by userId, make it the partition key. Same-partition queries cost fewer RUs and are faster.
  3. Immutable: you cannot change a container's partition key. Wrong choice = create new container + migrate all data.
⚠️ Common Gotcha
Using isActive (boolean) or status as a partition key creates only 2-3 logical partitions, which become hot and throttle under load. The exam tests this anti-pattern frequently, and it's always wrong.
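
To make the cardinality rule concrete, here is a small simulation. It is only a sketch: it assumes a hypothetical layout of 8 partitions and uses MD5 as a stand-in hash, whereas Cosmos DB applies its own internal hashing.

```python
import hashlib
from collections import Counter

def partition_counts(keys, partitions=8):
    """Count how many items land in each simulated partition.

    MD5 is only a stand-in here; Cosmos DB uses its own hash internally.
    """
    counts = Counter()
    for key in keys:
        digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
        counts[int(digest, 16) % partitions] += 1
    return counts

# 10,000 items keyed by a high-cardinality userId vs. a boolean isActive
by_user = partition_counts(f"user-{i}" for i in range(10_000))
by_flag = partition_counts(i % 2 == 0 for i in range(10_000))

print("userId   partitions used:", len(by_user))
print("isActive partitions used:", len(by_flag))
```

The boolean key can never occupy more than two buckets no matter how much data arrives, which is the hot-partition effect described above.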

3. Request Units (RUs): Cost Model

  1. Point read 1 KB item ≈ 1 RU: cheapest operation. Requires both id AND partition key.
  2. Write 1 KB item ≈ 5–10 RUs: always more expensive than reads.
  3. Simple query (with partition key) ≈ 3–5 RUs
  4. Cross-partition query (scan) ≈ 100+ RUs: fan-out to all partitions. Avoid in hot paths.

Monitor cost via the x-ms-request-charge response header. In the Python SDK, read it right after a call from container.client_connection.last_response_headers["x-ms-request-charge"].
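
One way to capture the charge per operation is a small wrapper like the sketch below. It assumes the container object exposes client_connection.last_response_headers, as the azure-cosmos Python package does; the helper name read_with_charge and the offline stand-in object are hypothetical and exist only so the example runs without an Azure account.

```python
from types import SimpleNamespace

def read_with_charge(container, item_id, partition_key):
    """Point-read an item and return (item, request charge in RUs)."""
    item = container.read_item(item=item_id, partition_key=partition_key)
    headers = container.client_connection.last_response_headers
    return item, float(headers["x-ms-request-charge"])

# Offline stand-in that mimics only the two attributes used above
fake = SimpleNamespace(
    read_item=lambda item, partition_key: {"id": item, "categoryId": partition_key},
    client_connection=SimpleNamespace(
        last_response_headers={"x-ms-request-charge": "1.0"}
    ),
)

item, charge = read_with_charge(fake, "product-123", "electronics")
print(item["id"], charge)
```

With a real container, the same function would return whatever charge the service reported for that read.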

4. Throughput Modes

  1. Manual throughput: fixed RU/s you specify. Min 400 RU/s for dedicated container. Predictable cost. Use for steady, known workloads.
  2. Autoscale throughput: scales automatically between 10% and 100% of the maximum you set. The lowest maximum you can set is 1,000 RU/s (so it scales between 100 and 1,000 RU/s). No over-provisioning. Best for AI workloads with bursty or unpredictable traffic.
💡 Exam Tip
Bursty AI workloads = Autoscale. Predictable steady workload = Manual. The exam gives a scenario and asks which throughput mode β€” match traffic pattern to mode.
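
The arithmetic behind the two modes can be sketched as follows. The 10% floor and the 1,000 RU/s minimum come from the text above; the suggest_mode helper and its burst_ratio threshold are hypothetical rules of thumb, not an official sizing formula.

```python
def autoscale_range(max_ru_per_sec):
    """Return (floor, ceiling) RU/s for a given autoscale maximum."""
    if max_ru_per_sec < 1000:
        raise ValueError("autoscale maximum must be at least 1,000 RU/s")
    return max_ru_per_sec // 10, max_ru_per_sec

def suggest_mode(peak_ru, average_ru, burst_ratio=3.0):
    """Hypothetical heuristic: treat traffic as bursty when peak far exceeds average."""
    return "autoscale" if peak_ru >= burst_ratio * average_ru else "manual"

low, high = autoscale_range(4000)
print(f"autoscale max 4000 RU/s scales between {low} and {high} RU/s")
print("peak 4000 / avg 500:", suggest_mode(4000, 500))
print("peak 600 / avg 500:", suggest_mode(600, 500))
```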

5. Item System Properties

  1. _etag: optimistic concurrency token. Send it back as an If-Match precondition (in the Python SDK: etag=item["_etag"] with match_condition=MatchConditions.IfNotModified) to prevent lost updates from race conditions.
  2. _ts: Unix timestamp (seconds) of last modification. Use for change feed or auditing.
  3. _rid, _self: internal resource identifiers (read-only, auto-generated).
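
Because _ts is stored as epoch seconds, auditing code usually converts it first. A minimal sketch, using a made-up sample item:

```python
from datetime import datetime, timezone

def last_modified(item):
    """Convert an item's _ts (epoch seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(item["_ts"], tz=timezone.utc)

sample = {"id": "product-123", "_ts": 1700000000}  # hypothetical item
print(last_modified(sample).isoformat())
```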
📘 Unit 3

Implement the Cosmos DB NoSQL SDK

12 min

1. Client Initialization: Singleton Pattern

Create CosmosClient ONCE at application startup and reuse it. Recreating per request wastes connections and adds latency (TCP handshake overhead).

from azure.cosmos import CosmosClient
from azure.identity import DefaultAzureCredential

client = CosmosClient(
    "https://myaccount.documents.azure.com:443/",
    credential=DefaultAzureCredential()   # Managed identity in prod, az CLI locally
)
db = client.get_database_client("aidata")
container = db.get_container_client("products")

2. Authentication Options

  1. Account keys: simple, full access. Use for dev/test only. Avoid in production (long-lived, no audit trail).
  2. Entra ID (RBAC): production standard. Assign Cosmos DB Built-in Data Contributor (read/write) or Cosmos DB Built-in Data Reader (read-only). Works with managed identities.

3. CRUD Operations

from azure.core import MatchConditions

# Create: fails if id already exists
container.create_item(body={"id": "product-123", "categoryId": "electronics", "name": "Speaker"})

# Upsert: insert or replace (no conflict error); {...} stands for the full item body
container.upsert_item(body={...})

# Point read: cheapest, ~1 RU for 1 KB (REQUIRES both id AND partition key)
item = container.read_item(item="product-123", partition_key="electronics")

# Optimistic concurrency: the replace fails with HTTP 412 if the item changed since it was read
container.replace_item(
    item=item["id"],
    body=item,
    etag=item["_etag"],
    match_condition=MatchConditions.IfNotModified,
)

# Delete
container.delete_item(item="product-123", partition_key="electronics")
⚠️ Common Gotcha
Point read requires BOTH id AND partition key value. If you only know the id, you must query (much more expensive). Always design your access pattern to use point reads when possible.
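
The lost-update scenario that _etag guards against can be reproduced without Azure. The sketch below is a toy in-memory model, not the real SDK: InMemoryContainer and PreconditionFailed are hypothetical stand-ins for a container and for the HTTP 412 response.

```python
import uuid

class PreconditionFailed(Exception):
    """Stand-in for the HTTP 412 the service returns on an ETag mismatch."""

class InMemoryContainer:
    """Toy model of ETag checks; not the real SDK."""

    def __init__(self):
        self._items = {}

    def upsert(self, body):
        # Every write stamps a fresh ETag, as the service does
        stored = dict(body, _etag=uuid.uuid4().hex)
        self._items[body["id"]] = stored
        return dict(stored)

    def read(self, item_id):
        return dict(self._items[item_id])

    def replace(self, body, etag):
        # Reject the write if the stored version moved on since the caller read it
        if self._items[body["id"]]["_etag"] != etag:
            raise PreconditionFailed(body["id"])
        return self.upsert(body)

store = InMemoryContainer()
store.upsert({"id": "product-123", "price": 50})

copy_a = store.read("product-123")   # writer A reads
copy_b = store.read("product-123")   # writer B reads the same version

copy_a["price"] = 45
store.replace(copy_a, etag=copy_a["_etag"])   # A wins, ETag changes

copy_b["price"] = 60
try:
    store.replace(copy_b, etag=copy_b["_etag"])   # B holds a stale ETag
    conflict = False
except PreconditionFailed:
    conflict = True   # B should re-read the item and retry
print("conflict detected:", conflict)
```

Without the ETag check, writer B would silently overwrite A's price change.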

4. Idempotent Container Creation

from azure.cosmos import PartitionKey, ThroughputProperties

db = client.create_database_if_not_exists(id="aidata")
container = db.create_container_if_not_exists(
    id="products",
    partition_key=PartitionKey(path="/categoryId"),
    offer_throughput=ThroughputProperties(auto_scale_max_throughput=4000)
)
📘 Unit 4

Build SQL Queries for Cosmos DB

10 min

1. SQL Syntax (JSON-aware)

# Basic filter + projection (c = container alias)
SELECT c.id, c.name, c.price
FROM c
WHERE c.categoryId = "electronics"
  AND c.price < 100

# Array contains
SELECT c.id FROM c WHERE ARRAY_CONTAINS(c.features, "wifi")

2. Parameterized Queries via SDK

query = "SELECT * FROM c WHERE c.categoryId = @cat AND c.price < @price"
params = [{"name": "@cat", "value": "electronics"}, {"name": "@price", "value": 100}]

items = list(container.query_items(
    query=query,
    parameters=params,
    partition_key="electronics"   # Stays in one partition = cheap
))
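
Writing the parameter list by hand gets repetitive, so a small helper can build it from a plain dict. This is a convenience sketch; to_parameters is a hypothetical name and not part of the SDK.

```python
def to_parameters(values):
    """Turn {"cat": "electronics"} into the list-of-dicts shape query_items expects."""
    return [{"name": f"@{name}", "value": value} for name, value in values.items()]

params = to_parameters({"cat": "electronics", "price": 100})
print(params)
```

Keeping values in parameters rather than formatting them into the query string avoids injection problems with user-supplied input.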

3. Cross-Partition Queries: Avoid in Hot Paths

Queries without the partition key in the WHERE clause fan out to ALL partitions. They cost dramatically more RUs and have higher latency. If you frequently need cross-partition queries, reconsider your partition key choice.

⚠️ Common Gotcha
Cross-partition queries are expensive. The exam will present a scenario with high RU consumption + latency and ask why. The answer is cross-partition queries due to poor partition key design.
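
One way to keep fan-out deliberate is to route every query through a helper that opts in to cross-partition execution only when no partition key is available. The sketch below assumes the synchronous azure-cosmos query_items keyword enable_cross_partition_query; run_query and RecordingContainer are hypothetical names, and the recording class is an offline stand-in.

```python
def run_query(container, query, parameters, partition_key=None):
    """Run a query, opting in to cross-partition fan-out only when no key is known."""
    if partition_key is not None:
        return list(container.query_items(
            query=query, parameters=parameters, partition_key=partition_key))
    return list(container.query_items(
        query=query, parameters=parameters, enable_cross_partition_query=True))

class RecordingContainer:
    """Offline stand-in that records how query_items was called."""

    def __init__(self):
        self.calls = []

    def query_items(self, **kwargs):
        self.calls.append(kwargs)
        return iter([])

fake = RecordingContainer()
sql = "SELECT * FROM c WHERE c.price < @p"
args = [{"name": "@p", "value": 100}]
run_query(fake, sql, args)                               # fans out
run_query(fake, sql, args, partition_key="electronics")  # single partition
print([sorted(call) for call in fake.calls])
```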

⚑ Cosmos DB Master Cheatsheet

HierarchyAccount β†’ Database β†’ Container β†’ Item
Partition key rulesHigh cardinality + query-aligned + immutable
Point read cost~1 RU for 1 KB (cheapest operation)
Write cost~5–10 RUs per 1 KB
Autoscale min1,000 RU/s
Manual min400 RU/s dedicated container
Optimistic concurrency_etag with if_match parameter
Monitor RU costx-ms-request-charge header
Auth (prod)Entra ID + Cosmos DB Built-in Data Contributor
Bad partition keyisActive, status (low cardinality = hot partitions)
Point read requiresBOTH id AND partition key value
Bursty trafficAutoscale throughput
πŸ§ͺ Unit 5

Exercise: Build a Product Query Service

30 min
  1. Create Cosmos DB account and container with /categoryId as partition key
  2. Use Python SDK to insert 10 product items with different categories
  3. Perform a point read using id + partition key and observe the ~1 RU cost
  4. Run a same-partition query and observe 3-5 RUs
  5. Run a cross-partition query and observe the higher RU charge (100+ RUs at scale)
  6. Implement optimistic concurrency with _etag and test a conflict scenario
✅ Unit 6

Knowledge Check

5 min
  1. Q: You know a product's id and categoryId. Cheapest fetch? A: Point read: read_item(item=id, partition_key=categoryId) ≈ 1 RU
  2. Q: AI app has bursty traffic. Which throughput mode? A: Autoscale (scales 10%–100% of max automatically)
  3. Q: Two processes updating same item concurrently. How to prevent lost update? A: Optimistic concurrency: send the item's _etag as an If-Match precondition (Python SDK: etag=item["_etag"], match_condition=MatchConditions.IfNotModified)
  4. Q: Why avoid isActive as partition key? A: Only 2 values (true/false) = 2 hot partitions = throttling under load
  5. Q: RU consumption suddenly spiked 10x. Most likely cause? A: Cross-partition queries due to missing partition key in WHERE clause
🏁 Unit 7

Summary

2 min

Choose partition keys with high cardinality aligned to your most common query pattern. Use Autoscale for AI workloads. Prefer point reads (1 RU) over queries (3–100+ RUs). Use the SDK singleton pattern. Protect concurrent writes with _etag. Monitor costs via x-ms-request-charge.

🧠 Memory Tricks

Partition key rule of 3: HAI = High cardinality, Aligned to queries, Immutable

RU cost ladder: Point read (1) → Simple query (3–5) → Complex (10+) → Cross-partition (100+)

_etag = "version stamp" for optimistic concurrency, like a database row version

🏁 Unit 8

Exam Summary Card

2 min
Scenario: Answer
Bursty traffic throughput: Autoscale (min 1,000 RU/s)
Predictable steady traffic: Manual (min 400 RU/s)
Cheapest fetch operation: Point read (id + partition key)
Prevent concurrent overwrites: If-Match precondition with _etag
Bad partition key example: isActive, status, region (low cardinality)
Good partition key example: userId, tenantId, productId (high cardinality)
Cross-partition query cost: 100+ RUs (fan-out to all partitions)
Production auth: Entra ID + Built-in Data Contributor role
🌍
Module Cheatsheet

Azure Cosmos DB for NoSQL

25–30%

🔑 Key Facts

  • Hierarchy: Account → Database → Container → Item
  • Good partition key: High cardinality + query-aligned + immutable (userId, productId)
  • Bad partition key: isActive, status, region (low cardinality = hot partitions)
  • Point read: ~1 RU/KB, cheapest op (needs BOTH id + partition key)
  • Cross-partition query: 100+ RU, fan-out to ALL partitions. Avoid in hot paths.
  • Autoscale throughput: 1,000 RU/s min, use for bursty AI workloads
  • Manual throughput: 400 RU/s min, use for steady, predictable workloads
  • _etag: Optimistic concurrency token, sent as an If-Match precondition to prevent lost updates

💻 Commands & Patterns

from azure.core import MatchConditions
from azure.cosmos import CosmosClient
from azure.identity import DefaultAzureCredential
client = CosmosClient(
  "https://myacct.documents.azure.com:443/",
  credential=DefaultAzureCredential()
)
ctr = client.get_database_client("db").get_container_client("ctr")
# Point read (~1 RU, cheapest)
item = ctr.read_item(item="id-123", partition_key="electronics")
# Upsert (no conflict error)
ctr.upsert_item(body={...})
# Optimistic concurrency
ctr.replace_item(item["id"], body=item, etag=item["_etag"],
  match_condition=MatchConditions.IfNotModified)
# Parameterized query (with pk = cheap)
list(ctr.query_items(query="SELECT * FROM c WHERE c.cat=@c",
  parameters=[{"name":"@c","value":"electronics"}],
  partition_key="electronics"))
Module

Implement Vector Search and Change Feed in Azure Cosmos DB

🎬 Unit 1

Introduction to Cosmos DB Vector Search

3 min

Azure Cosmos DB for NoSQL supports native vector search: store document embeddings alongside your data in the same container. No separate vector store needed. The change feed enables event-driven architecture: process new/updated documents automatically in real time.

💡 Exam Tip
Vector + Change Feed exam pillars: 1) Vector policy on container (flat/quantizedFlat/diskANN) 2) VectorDistance() in SQL queries 3) Change feed: all inserts + updates, no deletes (in the default latest-version mode), ordered per partition key 4) Change feed processor with lease container 5) DiskANN = approximate (production), flat = exact (dev).
📘 Unit 2

Vector Policy and Indexing

8 min

Configure Vector Index on Container

from azure.cosmos import CosmosClient, PartitionKey
from azure.identity import DefaultAzureCredential

client = CosmosClient(
    "https://myacct.documents.azure.com:443/",
    credential=DefaultAzureCredential()
)
db = client.get_database_client("ai-db")

# Create container with vector policy
container = db.create_container_if_not_exists(
    id="documents",
    partition_key=PartitionKey("/category"),
    vector_embedding_policy={
        "vectorEmbeddings": [{
            "path": "/embedding",
            "dataType": "float32",
            "dimensions": 1536,
            "distanceFunction": "cosine"
        }]
    },
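    indexing_policy={
        "includedPaths": [{"path": "/*"}],
        # Keep the vector out of the regular index to avoid costly inserts
        "excludedPaths": [{"path": "/embedding/*"}],
        "vectorIndexes": [{
            "path": "/embedding",
            "type": "diskANN"
        }]
    }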
)
💡 Exam Tip
diskANN = approximate search, production-ready. flat = exact but slow at scale. quantizedFlat = compressed for memory efficiency. Always set dimensions to match your embedding model.
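
To see what the cosine distance function and an exact (flat) scan compute, here is a plain-Python sketch. It uses made-up 3-dimensional vectors instead of 1536-dimensional embeddings and is not how Cosmos DB implements its indexes; approximate indexes such as diskANN avoid scoring every document.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def exact_top_k(query, docs, k=2):
    """Score every document (what an exact scan does) and keep the best k."""
    scored = [(cosine_similarity(query, d["embedding"]), d["id"]) for d in docs]
    return sorted(scored, reverse=True)[:k]

docs = [  # toy data for illustration only
    {"id": "a", "embedding": [1.0, 0.0, 0.0]},
    {"id": "b", "embedding": [0.0, 1.0, 0.0]},
    {"id": "c", "embedding": [0.9, 0.1, 0.0]},
]
results = exact_top_k([1.0, 0.0, 0.0], docs)
print([doc_id for _, doc_id in results])
```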
📘 Unit 4

Change Feed Processor

8 min

Process New Documents Automatically

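import asyncio
from azure.cosmos.aio import CosmosClient
from azure.identity.aio import DefaultAzureCredential

# NOTE: the Python SDK has no change feed processor class (that exists in the
# .NET and Java SDKs, where a lease container stores the position). This sketch
# uses the pull model via query_items_change_feed() instead; persist the
# continuation yourself if you need restarts.
# generate_embedding_async() is a placeholder: supply your own embedding call.

async def process_changes(container, changes):
    async for doc in changes:
        # Patching an item re-emits it on the change feed, so skip items
        # that already carry an embedding to avoid an endless loop.
        if "embedding" in doc:
            continue
        embedding = await generate_embedding_async(doc.get("content", ""))
        await container.patch_item(
            item=doc["id"],
            partition_key=doc["category"],
            patch_operations=[{"op": "add", "path": "/embedding", "value": embedding}],
        )

async def main():
    async with DefaultAzureCredential() as credential, CosmosClient(
        "https://myacct.documents.azure.com:443/", credential=credential
    ) as client:
        container = client.get_database_client("ai-db").get_container_client("documents")
        # Read the feed from the start of the container's history
        changes = container.query_items_change_feed(is_start_from_beginning=True)
        await process_changes(container, changes)

asyncio.run(main())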
💡 Exam Tip
Change feed = all inserts + updates, no deletes. Ordered per partition key. Lease container tracks position, which enables restart from the last checkpoint. Pattern: new doc inserted → change feed → auto-embed → ready for vector search.
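
The checkpoint-and-resume behaviour that leases provide can be illustrated with an in-memory simulation. Everything here is a stand-in: a list plays the change feed for one partition, a dict plays the lease container, and flaky_handler simulates a single crash.

```python
def process_from_checkpoint(feed, lease, handler, batch_size=2):
    """Handle feed entries after the saved position, checkpointing per batch."""
    position = lease.get("position", 0)
    while position < len(feed):
        batch = feed[position:position + batch_size]
        handler(batch)
        position += len(batch)
        lease["position"] = position   # checkpoint only after the batch succeeds

feed = [{"id": f"doc-{i}"} for i in range(5)]   # stand-in for one partition's changes
lease = {}                                       # stand-in for the lease container
state = {"crashed": False}
seen = []

def flaky_handler(batch):
    # Fail once, the first time the batch containing doc-2 is delivered
    if any(d["id"] == "doc-2" for d in batch) and not state["crashed"]:
        state["crashed"] = True
        raise RuntimeError("simulated crash")
    seen.extend(d["id"] for d in batch)

try:
    process_from_checkpoint(feed, lease, flaky_handler)
except RuntimeError:
    pass
process_from_checkpoint(feed, lease, flaky_handler)   # restart resumes at the checkpoint
print(seen)
```

Because the position is saved only after a batch succeeds, a crash means the failed batch is delivered again on restart, so handlers should tolerate repeats.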
🏁 Unit 5

Summary

2 min

Cosmos DB vector search: container vector policy (diskANN) + VectorDistance() in SQL. Change feed processor: leases track position, processes all inserts/updates (not deletes), enables auto-embedding pipeline. Combine both: insert document → change feed triggers embedding → VectorDistance() query for RAG retrieval. All in one Cosmos DB container, with no separate vector store.
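
For the retrieval step, a VectorDistance() query can be assembled as in the sketch below. The helper name vector_search_spec is hypothetical, the property name embedding matches the /embedding path used in this module, and the path argument is assumed to come from trusted code rather than user input.

```python
def vector_search_spec(query_vector, k=5, path="embedding"):
    """Build the query text and parameters for a top-k VectorDistance search."""
    if not isinstance(k, int) or k < 1:
        raise ValueError("k must be a positive integer")
    query = (
        f"SELECT TOP {k} c.id, VectorDistance(c.{path}, @vec) AS score "
        f"FROM c ORDER BY VectorDistance(c.{path}, @vec)"
    )
    return query, [{"name": "@vec", "value": list(query_vector)}]

query, params = vector_search_spec([0.1, 0.2, 0.3], k=3)
print(query)
```

The returned pair can be passed to query_items(query=query, parameters=params, ...) together with either a partition key or the cross-partition option.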


Frequently Asked Questions

What percentage of the AI-200 exam covers Develop AI Solutions by Using Azure Data Management Services?

Domain 2 (Develop AI Solutions by Using Azure Data Management Services) accounts for 25–30% of the AI-200 exam. Topics from this module, such as Azure Cosmos DB and DiskANN, are actively tested. Study all official skill objectives listed in the module header above.

Is Cosmos DB for NoSQL on the AI-200 exam?

Yes. Develop AI Solutions with Azure Cosmos DB for NoSQL is part of Domain 2 in the official AI-200 skill outline, weighted at 25–30%. The key services tested are Azure Cosmos DB, DiskANN, Change Feed, and the Python SDK. Review the code examples and exam tips in this module for targeted prep.

How do I practice Cosmos DB for NoSQL hands-on?

The best approach is to create a free Azure account and follow the code examples in this module step-by-step. The official Microsoft Learn sandbox for Course AI-200T00-A also provides free lab environments for Azure Cosmos DB and related services.