Build Queries for Azure Cosmos DB for NoSQL
Introduction
3 min
Azure Cosmos DB for NoSQL is a globally distributed, schemaless JSON document database with automatic indexing and single-digit millisecond reads. AI applications use it for product catalogs, user profiles, inference result caching, and conversation history. The key insight: partition key selection determines performance more than anything else.
Explore Azure Cosmos DB for NoSQL
10 min
Cosmos DB: Container partitioned by userId → data spread across physical partitions
1. Resource Hierarchy
Memory aid: Library → Floor → Bookshelf → Book
- Account: top-level resource. Unique DNS endpoint: https://myaccount.documents.azure.com:443/. Holds up to 500 databases and containers combined by default.
- Database: logical namespace. Groups related containers. Can share throughput across containers (shared throughput mode).
- Container: unit of scalability. Requires a partition key. Schema-agnostic: items in the same container can have different shapes.
- Item: individual JSON document. Needs id + partition key value to be uniquely addressable via point read.
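The addressing rule for items can be sketched with a plain dictionary. This is an illustration only, not how Cosmos DB stores data: the ItemStore class and its method bodies are hypothetical, and only the idea that an item is identified by partition key value plus id comes from the text above.

```python
class ItemStore:
    """Toy stand-in for a container: items are keyed by (partition key, id)."""

    def __init__(self, partition_key_field):
        self.partition_key_field = partition_key_field
        self._items = {}

    def create_item(self, body):
        key = (body[self.partition_key_field], body["id"])
        if key in self._items:
            raise KeyError(f"item {key} already exists")
        self._items[key] = dict(body)

    def read_item(self, item, partition_key):
        # A point read needs both halves of the key
        return self._items[(partition_key, item)]


store = ItemStore(partition_key_field="categoryId")
store.create_item({"id": "p-1", "categoryId": "electronics", "name": "Speaker"})
# The same id is allowed in a different logical partition
store.create_item({"id": "p-1", "categoryId": "toys", "name": "Robot"})

print(store.read_item(item="p-1", partition_key="toys")["name"])
```

The id alone is ambiguous here, which is why a point read has to supply the partition key as well.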
2. Partition Key Design: The Most Important Decision
- High cardinality: many distinct values spread data evenly across partitions. Good: userId, tenantId, productId. Bad: status (a handful of values), isActive (boolean).
- Query-aligned: if you always query by userId, make it the partition key. Same-partition queries cost fewer RUs and are faster.
- Immutable: you cannot change a container's partition key in place. Wrong choice = create a new container + migrate all data.
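A small simulation makes the cardinality point concrete. It is a sketch under stated assumptions: the MD5-modulo bucketing, the bucket count of 10, and the sample items are all made up for illustration and are not the hashing scheme the service actually uses.

```python
import hashlib
from collections import Counter


def physical_partition(key_value, partition_count):
    """Map a partition key value to a bucket with a stable hash (illustrative)."""
    digest = hashlib.md5(str(key_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % partition_count


def distribution(items, key_field, partition_count=10):
    """Count how many items land in each bucket for a given partition key."""
    return Counter(physical_partition(item[key_field], partition_count) for item in items)


# 1,000 hypothetical items: unique userId, boolean isActive
items = [{"userId": f"user-{i}", "isActive": i % 2 == 0} for i in range(1000)]

by_user = distribution(items, "userId")
by_active = distribution(items, "isActive")

print("userId buckets used:", len(by_user), "largest:", max(by_user.values()))
print("isActive buckets used:", len(by_active), "largest:", max(by_active.values()))
```

With userId the items spread over every bucket; with isActive at most two buckets ever receive data, no matter how many partitions exist.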
Using isActive (boolean) or status as a partition key creates only 2-3 logical partitions, which become hot and throttle under load. The exam tests this anti-pattern frequently; it's always wrong.
3. Request Units (RUs): Cost Model
- Point read, 1 KB item: ~1 RU. Cheapest operation. Requires both id AND partition key.
- Write, 1 KB item: 5-10 RUs. Always more expensive than reads.
- Simple query (with partition key): 3-5 RUs
- Cross-partition query (scan): often 100+ RUs. Fans out to all partitions. Avoid in hot paths.
Monitor cost via the x-ms-request-charge response header. In the Python SDK, the same value is available in container.client_connection.last_response_headers after each call.
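The cost model above lends itself to a quick back-of-the-envelope estimate. The sketch below assumes the ballpark charges from the list (taking the upper bound where a range is given) and a made-up operation mix; real charges depend on item size, indexing, and query shape.

```python
# Ballpark RU charges per operation, taken from the list above (upper bounds)
RU_COST = {
    "point_read": 1,
    "write": 10,
    "query": 5,
    "cross_partition_query": 100,
}


def estimate_ru_per_second(ops_per_second):
    """Sum the ballpark RU charge for a mix of operations per second."""
    return sum(RU_COST[op] * rate for op, rate in ops_per_second.items())


# Hypothetical workload: operations per second by type
workload = {"point_read": 200, "write": 20, "query": 10, "cross_partition_query": 1}
print(estimate_ru_per_second(workload))  # 200 + 200 + 50 + 100 = 550
```

Note how a single cross-partition query per second costs as much as 100 point reads in this estimate.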
4. Throughput Modes
- Manual throughput: fixed RU/s you specify. Min 400 RU/s for a dedicated container. Predictable cost. Use for steady, known workloads.
- Autoscale throughput: scales automatically between 10% and 100% of the maximum you set. The lowest maximum you can set is 1,000 RU/s. No over-provisioning. Best for AI workloads with bursty or unpredictable traffic.
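The autoscale rule is simple arithmetic, sketched below. The function name is hypothetical; the 10% floor and the 1,000 RU/s minimum for the configured maximum are the figures stated above.

```python
def autoscale_range(max_ru_per_second):
    """Return the (floor, ceiling) RU/s that autoscale moves between."""
    if max_ru_per_second < 1000:
        raise ValueError("autoscale maximum must be at least 1,000 RU/s")
    # Autoscale never drops below 10% of the configured maximum
    return max_ru_per_second // 10, max_ru_per_second


print(autoscale_range(4000))  # (400, 4000)
print(autoscale_range(1000))  # (100, 1000)
```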
5. Item System Properties
- _etag: optimistic concurrency token. Use with if_match to prevent lost updates from race conditions.
- _ts: Unix timestamp of last modification. Use for change feed or auditing.
- _rid, _self: internal resource identifiers (read-only, auto-generated).
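How _etag prevents a lost update can be shown without a database. The sketch below uses a hypothetical in-memory FakeContainer and PreconditionFailed exception; it only mimics the etag comparison (the real service answers a stale etag with HTTP 412 Precondition Failed).

```python
import uuid


class PreconditionFailed(Exception):
    """Raised when the caller's etag no longer matches the stored item."""


class FakeContainer:
    """In-memory stand-in that mimics _etag checks on replace."""

    def __init__(self):
        self._items = {}

    def upsert_item(self, body):
        stored = dict(body, _etag=uuid.uuid4().hex)  # new etag on every write
        self._items[body["id"]] = stored
        return dict(stored)

    def read_item(self, item):
        return dict(self._items[item])

    def replace_item(self, item, body, if_match=None):
        current = self._items[item]
        if if_match is not None and if_match != current["_etag"]:
            raise PreconditionFailed(f"etag mismatch for {item}")
        return self.upsert_item(body)


container = FakeContainer()
container.upsert_item({"id": "product-123", "stock": 5})

# Two writers read the same version
writer_a = container.read_item("product-123")
writer_b = container.read_item("product-123")

writer_a["stock"] -= 1
container.replace_item("product-123", writer_a, if_match=writer_a["_etag"])  # succeeds

writer_b["stock"] -= 2
try:
    container.replace_item("product-123", writer_b, if_match=writer_b["_etag"])
except PreconditionFailed:
    # Stale etag: re-read, re-apply the change, retry
    fresh = container.read_item("product-123")
    fresh["stock"] -= 2
    container.replace_item("product-123", fresh, if_match=fresh["_etag"])

print(container.read_item("product-123")["stock"])  # 2
```

Without the etag check, writer B would have overwritten writer A's change and left the stock at 3.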
Implement the Cosmos DB NoSQL SDK
12 min
1. Client Initialization: Singleton Pattern
Create CosmosClient ONCE at application startup and reuse it. Recreating per request wastes connections and adds latency (TCP handshake overhead).
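One way to enforce the create-once rule is to hide construction behind a cached factory. The sketch below is an assumption about how you might structure this, not part of the SDK: FakeClient stands in for CosmosClient so the example runs without Azure, and get_client and handle_request are hypothetical names.

```python
from functools import lru_cache


class FakeClient:
    """Stand-in for an expensive client; counts how often it is constructed."""

    instances = 0

    def __init__(self, endpoint):
        FakeClient.instances += 1
        self.endpoint = endpoint


@lru_cache(maxsize=None)
def get_client(endpoint="https://myaccount.documents.azure.com:443/"):
    # Body runs once per distinct endpoint; later calls return the cached object
    return FakeClient(endpoint)


def handle_request():
    return get_client()  # every request reuses the same client


first = handle_request()
second = handle_request()
print(first is second, FakeClient.instances)  # True 1
```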
from azure.cosmos import CosmosClient
from azure.identity import DefaultAzureCredential

# Create the client once at startup and share it across requests
client = CosmosClient(
    "https://myaccount.documents.azure.com:443/",
    credential=DefaultAzureCredential()  # Managed identity in prod, az CLI locally
)
db = client.get_database_client("aidata")
container = db.get_container_client("products")
2. Authentication Options
- Account keys: simple, full access. Use for dev/test only. Avoid in production (long-lived, no audit trail).
- Entra ID (RBAC): production standard. Assign Cosmos DB Built-in Data Contributor (read/write) or Cosmos DB Built-in Data Reader (read-only). Works with managed identities.
3. CRUD Operations
# Create: fails if the id already exists in that logical partition
container.create_item(body={"id": "product-123", "categoryId": "electronics", "name": "Speaker"})
# Upsert: insert or replace (no conflict error)
container.upsert_item(body={...})
# Point read, the cheapest operation: ~1 RU for 1 KB (REQUIRES both id AND partition key)
item = container.read_item(item="product-123", partition_key="electronics")
# Optimistic concurrency: prevent lost updates
# (the SDK reference documents this as etag=item["_etag"], match_condition=MatchConditions.IfNotModified)
container.replace_item(item=item["id"], body=item, if_match=item["_etag"])
# Delete
container.delete_item(item="product-123", partition_key="electronics")
4. Idempotent Container Creation
from azure.cosmos import PartitionKey, ThroughputProperties

# Both calls return the existing resource instead of failing if it is already there
db = client.create_database_if_not_exists(id="aidata")
container = db.create_container_if_not_exists(
    id="products",
    partition_key=PartitionKey(path="/categoryId"),
    offer_throughput=ThroughputProperties(auto_scale_max_throughput=4000)  # autoscale, 400-4,000 RU/s
)
Build SQL Queries for Cosmos DB
10 min
1. SQL Syntax (JSON-aware)
# Basic filter + projection (c = container alias)
SELECT c.id, c.name, c.price
FROM c
WHERE c.categoryId = "electronics"
AND c.price < 100
# Array contains
SELECT c.id FROM c WHERE ARRAY_CONTAINS(c.features, "wifi")
2. Parameterized Queries via SDK
query = "SELECT * FROM c WHERE c.categoryId = @cat AND c.price < @price"
params = [{"name": "@cat", "value": "electronics"}, {"name": "@price", "value": 100}]
items = list(container.query_items(
    query=query,
    parameters=params,
    partition_key="electronics"  # Stays in one partition = cheap
))
3. Cross-Partition Queries: Avoid in Hot Paths
Queries that do not supply the partition key (in the WHERE clause or via the SDK's partition_key argument) fan out to ALL partitions. They cost dramatically more RUs and have higher latency. If you frequently need cross-partition queries, reconsider your partition key choice.
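The fan-out behaviour can be sketched with an in-memory model. Everything here is hypothetical (the PARTITIONS data and the query_cheap_items helper); it only counts how many partitions a query has to visit, which is the driver of the extra RU cost described above.

```python
# Hypothetical container contents, grouped by partition key value (categoryId)
PARTITIONS = {
    "electronics": [{"id": "p1", "price": 80}, {"id": "p2", "price": 250}],
    "toys": [{"id": "p3", "price": 15}],
    "books": [{"id": "p4", "price": 30}, {"id": "p5", "price": 120}],
}


def query_cheap_items(max_price, partition_key=None):
    """Return (matching items, partitions visited) for price < max_price."""
    if partition_key is not None:
        targets = [partition_key]          # routed to a single partition
    else:
        targets = list(PARTITIONS)         # fan-out: every partition is visited
    hits = [
        item
        for name in targets
        for item in PARTITIONS[name]
        if item["price"] < max_price
    ]
    return hits, len(targets)


scoped, visited_scoped = query_cheap_items(100, partition_key="electronics")
fanned, visited_fanned = query_cheap_items(100)

print([i["id"] for i in scoped], visited_scoped)   # ['p1'] 1
print([i["id"] for i in fanned], visited_fanned)   # ['p1', 'p3', 'p4'] 3
```

The scoped query touches one partition regardless of how large the container grows; the unscoped one touches all of them.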
Cosmos DB Master Cheatsheet
- Optimistic concurrency: _etag with if_match parameter
- RU cost monitoring: x-ms-request-charge header
Exercise: Build a Product Query Service
30 min
- Create a Cosmos DB account and a container with /categoryId as the partition key
- Use the Python SDK to insert 10 product items with different categories
- Perform a point read using id + partition key; observe the ~1 RU cost
- Run a same-partition query; observe 3-5 RUs
- Run a cross-partition query; observe the higher charge (100+ RUs on larger containers)
- Implement optimistic concurrency with _etag and test a conflict scenario
Knowledge Check
5 min
- Q: You know a product's id and categoryId. Cheapest fetch? A: Point read, read_item(item=id, partition_key=categoryId), ~1 RU
- Q: AI app has bursty traffic. Which throughput mode? A: Autoscale (scales 10%-100% of max automatically)
- Q: Two processes are updating the same item concurrently. How do you prevent a lost update? A: Optimistic concurrency: if_match=item["_etag"]
- Q: Why avoid isActive as partition key? A: Only 2 values (true/false) = 2 hot partitions = throttling under load
- Q: RU consumption suddenly spiked 10x. Most likely cause? A: Cross-partition queries due to a missing partition key in the WHERE clause
Summary
2 min
Choose partition keys with high cardinality aligned to your most common query pattern. Use autoscale for AI workloads. Prefer point reads (~1 RU) over queries (3 to 100+ RUs). Use the SDK singleton pattern. Protect concurrent writes with _etag. Monitor costs via x-ms-request-charge.
Memory Tricks
Partition key rule of 3: HAI = High cardinality, Aligned to queries, Immutable
RU cost ladder: Point read (1) → Simple query (3-5) → Complex (10+) → Cross-partition (100+)
_etag = "version stamp" for optimistic concurrency, like a database row version
Exam Summary Card
2 min
| Scenario | Answer |
|---|---|
| Bursty traffic throughput | Autoscale (min 1,000 RU/s) |
| Predictable steady traffic | Manual (min 400 RU/s) |
| Cheapest fetch operation | Point read (id + partition key) |
| Prevent concurrent overwrites | if_match with _etag |
| Bad partition key example | isActive, status, region (low cardinality) |
| Good partition key example | userId, tenantId, productId (high cardinality) |
| Cross-partition query cost | 100+ RUs (fan-out to all partitions) |
| Production auth | Entra ID + Built-in Data Contributor role |
Azure Cosmos DB for NoSQL
Key Facts
- Hierarchy: Account → Database → Container → Item
- Good partition key: high cardinality + query-aligned + immutable (userId, productId)
- Bad partition key: isActive, status, region. Low cardinality = hot partitions
- Point read: ~1 RU for a 1 KB item. Cheapest op (needs BOTH id + partition key)
- Cross-partition query: 100+ RU. Fans out to ALL partitions. Avoid in hot paths.
- Autoscale throughput: 1,000 RU/s min. Use for bursty AI workloads
- Manual throughput: 400 RU/s min. Use for steady, predictable workloads
- _etag: optimistic concurrency token. Use with if_match to prevent lost updates
Commands & Patterns
from azure.cosmos import CosmosClient
from azure.identity import DefaultAzureCredential

client = CosmosClient(
    "https://myacct.documents.azure.com:443/",
    credential=DefaultAzureCredential()
)
ctr = client.get_database_client("db").get_container_client("ctr")
# Point read (~1 RU, cheapest)
item = ctr.read_item(item="id-123", partition_key="electronics")
# Upsert (no conflict error)
ctr.upsert_item(body={...})
# Optimistic concurrency
ctr.replace_item(item["id"], body=item, if_match=item["_etag"])
# Parameterized query (with pk = cheap)
list(ctr.query_items(query="SELECT * FROM c WHERE c.cat=@c",
    parameters=[{"name":"@c","value":"electronics"}],
    partition_key="electronics"))
Implement Vector Search and Change Feed in Azure Cosmos DB
Introduction to Cosmos DB Vector Search
3 min
Azure Cosmos DB for NoSQL supports native vector search: store document embeddings alongside your data in the same container. No separate vector store needed. The change feed enables event-driven architecture: process new and updated documents automatically in near real time.
Vector Policy and Indexing
8 min
Configure Vector Index on Container
from azure.cosmos import CosmosClient, PartitionKey
from azure.identity import DefaultAzureCredential

client = CosmosClient(
    "https://myacct.documents.azure.com:443/",
    credential=DefaultAzureCredential()
)
db = client.get_database_client("ai-db")
# Create container with vector policy
container = db.create_container_if_not_exists(
    id="documents",
    partition_key=PartitionKey("/category"),
    # Describes the vector field: path, element type, size, and distance metric
    vector_embedding_policy={
        "vectorEmbeddings": [{
            "path": "/embedding",
            "dataType": "float32",
            "dimensions": 1536,
            "distanceFunction": "cosine"
        }]
    },
    # Builds a vector index on the same path
    indexing_policy={
        "vectorIndexes": [{
            "path": "/embedding",
            "type": "diskANN"
        }]
    }
)
- diskANN: approximate search, production-ready.
- flat: exact but slow at scale.
- quantizedFlat: compressed for memory efficiency.
Always set dimensions to match your embedding model.
Vector Similarity Search
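To make the index types concrete, here is what an exact (flat) search computes: score every stored vector against the query and keep the best matches. This is a plain-Python sketch with made-up 3-dimensional embeddings and hypothetical helper names; approximate indexes exist precisely so the service can avoid this full scan at scale.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def flat_search(query_vector, documents, top_k=2):
    """Score every document (exact search) and return the top_k most similar."""
    scored = [
        (cosine_similarity(query_vector, doc["embedding"]), doc["title"])
        for doc in documents
    ]
    return sorted(scored, reverse=True)[:top_k]


# Tiny 3-dimensional embeddings, made up for the example
docs = [
    {"title": "speaker", "embedding": [1.0, 0.0, 0.0]},
    {"title": "headphones", "embedding": [0.8, 0.6, 0.0]},
    {"title": "novel", "embedding": [0.0, 0.0, 1.0]},
]

for score, title in flat_search([1.0, 0.0, 0.0], docs):
    print(f"{title}: {score:.2f}")
```

The dimension check mirrors the rule about matching dimensions to your embedding model: a query vector of the wrong length cannot be compared at all.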
7 min
VectorDistance() Query
def search_documents(query_text, top_k=5):
    # generate_embedding is a placeholder for your embedding model call
    q_emb = generate_embedding(query_text)
    results = list(container.query_items(
        query="""SELECT TOP @k c.title, c.content,
                 VectorDistance(c.embedding, @qvec) AS score
                 FROM c
                 ORDER BY VectorDistance(c.embedding, @qvec)""",
        parameters=[
            {"name": "@k", "value": top_k},
            {"name": "@qvec", "value": q_emb}
        ],
        enable_cross_partition_query=True
    ))
    return results
VectorDistance() returns a score computed with the container's distance function. With cosine, as configured on the container, higher means more similar; with euclidean, lower means more similar. ORDER BY VectorDistance() returns the most similar items first. A global vector search has to run as a cross-partition query, which costs more RUs.
Change Feed Processor
8 min
Process New Documents Automatically
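import asyncio
from azure.cosmos.aio import CosmosClient
from azure.identity.aio import DefaultAzureCredential

# The push-style change feed processor (with a lease container) ships in the
# .NET and Java SDKs. The Python SDK reads the change feed with the pull model
# shown here. generate_embedding_async is a placeholder for your model call.
async def process_changes(container, changes):
    async for doc in changes:
        # Skip items that already carry an embedding; the patch below is itself
        # a change, so without this guard the loop would re-embed forever.
        if "embedding" in doc:
            continue
        text = doc.get("content", "")
        embedding = await generate_embedding_async(text)
        await container.patch_item(
            item=doc["id"],
            partition_key=doc["category"],
            patch_operations=[{
                "op": "add",
                "path": "/embedding",
                "value": embedding
            }]
        )

async def main():
    async with DefaultAzureCredential() as credential:
        async with CosmosClient("https://myacct.documents.azure.com:443/",
                                credential=credential) as client:
            container = client.get_database_client("ai-db").get_container_client("documents")
            # One pass over every change since the container was created;
            # a long-running worker would repeat this from a saved continuation
            changes = container.query_items_change_feed(is_start_from_beginning=True)
            await process_changes(container, changes)

asyncio.run(main())
Summary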
2 min
Cosmos DB vector search: container vector policy (diskANN) + VectorDistance() in SQL. Change feed: delivers inserts and updates (not deletes) and enables an auto-embedding pipeline; the change feed processor tracks its position with leases, while a pull-model reader tracks it with a continuation token. Combine both: insert document → change feed triggers embedding → VectorDistance() query for RAG retrieval. All in one Cosmos DB container, with no separate vector store.
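The position-tracking idea behind the change feed can be sketched in memory. The FakeChangeFeed class, the integer token, and fake_embedding are all hypothetical stand-ins: the token plays the role that leases or continuation tokens play in the real service, and the guard shows why an embedding pipeline must skip its own writes.

```python
class FakeChangeFeed:
    """In-memory log of item versions, read with a continuation token."""

    def __init__(self):
        self._log = []

    def write(self, item):
        self._log.append(dict(item))  # inserts and updates append a new version

    def read(self, continuation=0):
        """Return (changes since the token, new token)."""
        return self._log[continuation:], len(self._log)


def fake_embedding(text):
    # Placeholder for a real embedding model call
    return [float(len(text))]


def run_once(feed, continuation):
    changes, next_token = feed.read(continuation)
    for doc in changes:
        if "embedding" in doc:
            continue  # our own patch shows up as a change; skip it
        feed.write(dict(doc, embedding=fake_embedding(doc["content"])))
    return next_token


feed = FakeChangeFeed()
feed.write({"id": "doc-1", "content": "hello"})

token = run_once(feed, 0)        # embeds doc-1, which appends one more change
token = run_once(feed, token)    # sees only the patched version and skips it
token = run_once(feed, token)    # nothing new

print(token, len(feed._log))     # 2 2
```

Because the token is saved between passes, each change is handled once, and the log stops growing as soon as every document has an embedding.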