Module 6 of 9 · 25–30% · 3 sub-modules · 21 units · Domain 2: Develop AI Solutions by Using Azure Data Management Services

Enhance AI Solutions with Azure Managed Redis

Implement Azure Managed Redis data operations including caching, expiration, and invalidation. Implement vector indexing to enable similarity search for semantic caching and RAG.

Azure Managed Redis · RediSearch · Vector Index

Aligned with Course AI-200T00-A

Module

Cache Content with Azure Managed Redis

🎬 Unit 1

Introduction

3 min

Azure Managed Redis is a fully managed Redis service. Redis is an in-memory key-value store that serves reads and writes from RAM, typically with sub-millisecond latency (roughly 100-1,000x faster than a disk-backed database query, depending on the workload). AI applications use it to cache model inference results, prompt/response pairs, embedding vectors, rate-limit counters, and session state, all to reduce cost and latency.

💡 Exam Tip
Exam pillars: 1) Tiers + features 2) Cache-aside pattern implementation 3) Data types and commands 4) TTL strategy 5) Cache invalidation patterns 6) Redis port 10000 with TLS (not 6379).
📘 Unit 2

Explore Azure Managed Redis

7 min

Cache-Aside Pattern: App checks Redis → HIT returns instantly → MISS fetches DB then caches result

[Diagram: AI App get_product(id) → ① check cache in Redis, port 10000 (TLS) → ② HIT: return value → ③ MISS: query database (PostgreSQL/Cosmos) → ④ cache result + TTL. TTL: fast-changing 1-5m | stable 1-24h | static 24h+]

1. Azure Managed Redis Tiers

# | Tier | Memory | Persistence | Cluster | Use For
1 | Memory Optimized | Up to 1.5 TB | AOF | ✅ | Large AI caches, embedding stores
2 | Balanced | Up to 120 GB | AOF | ✅ | General production AI workloads
3 | Compute Optimized | Up to 120 GB | AOF | ✅ | High throughput, CPU-intensive operations
4 | Flash Optimized | Up to 13 TB | AOF | ✅ | Very large datasets, warm data on flash

Also: Azure Cache for Redis with Basic/Standard/Premium tiers. Basic is single-node dev/test; Standard adds replication; Premium adds geo-replication and VNet.

2. Connection: Port 10000 (Not 6379)

Azure Managed Redis uses port 10000 with mandatory TLS, not the default Redis port 6379. This is a common exam trap.

rediss://<your-instance>.<region>.redis.azure.net:10000 (TLS required; rediss:// is the TLS URL scheme)
⚠️ Common Gotcha
Port 6379 = open-source Redis default. Azure Managed Redis = port 10000 with TLS. Any question about an Azure Managed Redis connection → port 10000.
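
The port and TLS rule can be captured in a small helper. This is a minimal sketch, assuming redis-py conventions where the `rediss://` scheme (two s's) requests TLS; `build_redis_url`, `uses_tls`, and the host name are hypothetical names for illustration, not part of any Azure SDK.

```python
from urllib.parse import urlparse

AZURE_MANAGED_REDIS_PORT = 10000  # not the open-source default 6379


def build_redis_url(host: str, port: int = AZURE_MANAGED_REDIS_PORT) -> str:
    """Return a TLS connection URL (rediss://) for the given host."""
    return f"rediss://{host}:{port}"


def uses_tls(url: str) -> bool:
    """True when the URL scheme requests TLS."""
    return urlparse(url).scheme == "rediss"


url = build_redis_url("myinstance.example.net")  # placeholder host
print(url)            # rediss://myinstance.example.net:10000
print(uses_tls(url))  # True
```

With redis-py, a URL of this shape can be handed to `redis.Redis.from_url(url)`; credentials are supplied separately.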

3. Caching Strategies: Which to Use When

  1. Cache-aside (Lazy loading): app checks cache first, on MISS fetches from DB and stores in cache. Most common for AI inference results. Requires cache invalidation on source data change.
  2. Write-through: writes go to cache AND DB simultaneously. Cache always consistent. Higher write latency. Good for user profiles that must always be current.
  3. Write-behind (Write-back): writes go to cache, DB updated asynchronously. Low write latency. Risk of data loss on crash. Advanced scenario.
  4. Read-through: cache fetches from DB on MISS automatically. Cache acts as transparent proxy. Simplifies application code.
💡 Exam Tip
Cache-aside = application manages cache. Cache hit: return from cache. Cache miss: fetch DB, populate cache, set TTL, return. Expect an "implement caching" scenario on the exam where cache-aside is the right answer for AI inference results.
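
The difference between cache-aside and write-through is easiest to see side by side. The sketch below is illustrative only: `InMemoryCache` is a hypothetical dict-backed stand-in that mimics the `get` / `set(ex=...)` / `delete` calls of redis-py so the example runs without a live instance, and `db` is a plain dict standing in for the database.

```python
import time


class InMemoryCache:
    """Dict-backed stand-in with redis-py-style get/set(ex=...)/delete."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # lazily drop expired entries
            return None
        return value

    def set(self, key, value, ex=None):
        expires_at = None if ex is None else time.monotonic() + ex
        self._data[key] = (value, expires_at)

    def delete(self, key):
        self._data.pop(key, None)


def cache_aside_get(cache, db, key, ttl=300):
    """Cache-aside: check the cache; on MISS read the source and populate."""
    hit = cache.get(key)
    if hit is not None:
        return hit, "HIT"
    value = db[key]                # MISS: fetch from the source of truth
    cache.set(key, value, ex=ttl)  # populate with a TTL
    return value, "MISS"


def write_through(cache, db, key, value, ttl=300):
    """Write-through: update the source and the cache in one operation."""
    db[key] = value
    cache.set(key, value, ex=ttl)


def invalidate(cache, key):
    """Cache-aside invalidation: drop the key when the source changes."""
    cache.delete(key)


db = {"product:1": "widget"}  # hypothetical source of truth
cache = InMemoryCache()
print(cache_aside_get(cache, db, "product:1"))  # ('widget', 'MISS')
print(cache_aside_get(cache, db, "product:1"))  # ('widget', 'HIT')
```

Note that with cache-aside a direct change to `db` stays invisible until `invalidate` runs or the TTL expires, which is why the pattern needs an invalidation step.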
📘 Unit 3

Client Libraries and Configuration

5 min

1. Python: redis-py with Managed Identity

import redis

# Access key authentication
r = redis.Redis(
    host='<your-instance>.<region>.redis.azure.net',
    port=10000,
    ssl=True,
    decode_responses=True,
    password='your-access-key'
)

# Entra ID (Managed Identity) authentication
from redis_entraid.cred_provider import create_from_default_azure_credential
credential_provider = create_from_default_azure_credential(
    ("https://redis.azure.com/.default",)
)
r = redis.Redis(host='...', port=10000, ssl=True,
    decode_responses=True, credential_provider=credential_provider)
💡 Exam Tip
Production: use create_from_default_azure_credential from the redis-entraid package. The token refreshes automatically. An access key is a shared, long-lived secret; avoid it in production.

2. decode_responses=True

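Set this in every client that handles text. Without it, Redis returns bytes objects. With it, Redis returns Python str, which is nearly always what you want. The exception is a client that reads or writes binary vector data, which needs decode_responses=False.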
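
To make the difference concrete, here is a minimal sketch of what a hash reply looks like without decoding and how to decode it by hand. `decode_hash` is a hypothetical helper, and `raw_reply` is a hand-written sample rather than output captured from a server.

```python
def decode_hash(raw: dict) -> dict:
    """Decode a bytes-keyed hash reply into str keys and values."""
    return {k.decode("utf-8"): v.decode("utf-8") for k, v in raw.items()}


# Shape of an HGETALL reply when decode_responses is left at False
raw_reply = {b"name": b"Alice", b"tier": b"premium"}
print(decode_hash(raw_reply))  # {'name': 'Alice', 'tier': 'premium'}
```

Manual decoding like this only suits text fields; a packed embedding is arbitrary binary and generally will not decode as UTF-8.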

📘 Unit 4

Implement Redis Data Operations

12 min

1. Strings — Most Common Type (Key→Value)

r.set("inference:result:user-123", '{"answer": "42", "confidence": 0.98}', ex=300)  # Cache for 5 minutes
result = r.get("inference:result:user-123")  # None if missing or expired

r.setex("rate:user-123", 60, 1)  # Set with TTL of 60 seconds
r.incr("rate:user-123")           # Atomic increment: thread-safe counter, existing TTL is kept
remaining = r.ttl("rate:user-123")  # Time to live in seconds
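
The counter commands above are the building blocks of a fixed-window rate limiter. The sketch below shows only the windowing logic, under stated assumptions: a `collections.Counter` stands in for Redis (incrementing it mimics INCR), and `rate_key` and `allow_request` are hypothetical helper names.

```python
from collections import Counter

counters = Counter()  # stand-in for Redis: counters[key] += 1 mimics INCR


def rate_key(user_id: str, now: float, window_s: int = 60) -> str:
    """One key per user per window, so counts reset when the window rolls over."""
    return f"rate:{user_id}:{int(now) // window_s}"


def allow_request(user_id: str, now: float, limit: int = 10, window_s: int = 60) -> bool:
    """Count this call and report whether it is still within the limit."""
    key = rate_key(user_id, now, window_s)
    counters[key] += 1  # with redis-py: count = r.incr(key); r.expire(key, window_s)
    return counters[key] <= limit


decisions = [allow_request("user-123", now=120.0, limit=3) for _ in range(5)]
print(decisions)  # [True, True, True, False, False]
```

Putting the window number in the key means no reset logic is needed; the TTL only has to clean up old window keys.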

2. Hashes: Object with Multiple Fields

Use for session data and user profiles: a hash avoids serializing/deserializing a full JSON blob when you only need one field.

r.hset("user:1001", mapping={"name": "Alice", "tier": "premium", "credits": "500"})
name = r.hget("user:1001", "name")        # Get one field
profile = r.hgetall("user:1001")          # Get all fields as dict
r.hincrby("user:1001", "credits", -10)   # Atomically decrement credits

3. Lists: Ordered Queue / Recent History

r.lpush("chat:session-abc", "Hello")   # Push to left (head, index 0)
r.rpush("chat:session-abc", "World")   # Push to right (tail)
history = r.lrange("chat:session-abc", 0, -1)  # All items
r.ltrim("chat:session-abc", 0, 9)      # Keep indexes 0-9: the 10 newest when you add with LPUSH
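
Which messages survive an LPUSH followed by LTRIM 0 9 can be checked locally. As a rough analogy (not a Redis call), a `collections.deque` with `maxlen=10` and `appendleft` behaves the same way: the newest item sits at index 0 and the oldest falls off the far end.

```python
from collections import deque

history = deque(maxlen=10)  # mimics LPUSH followed by LTRIM key 0 9
for i in range(1, 13):
    history.appendleft(f"msg-{i}")  # newest at index 0, like LPUSH

print(list(history)[:3])  # ['msg-12', 'msg-11', 'msg-10']
print(len(history))       # 10
```

If messages are added with RPUSH instead, the newest items are at the tail, so the matching trim would be `ltrim(key, -10, -1)`.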

4. Sets: Unique Members

r.sadd("active-sessions", "sess-001", "sess-002")
r.sismember("active-sessions", "sess-001")  # True/False
r.smembers("active-sessions")               # All members

5. Sorted Sets: Leaderboards / Priority Queues

r.zadd("model-latency", {"gpt-4o": 250.5, "gpt-4o-mini": 85.2})
fastest = r.zrange("model-latency", 0, -1, withscores=True)  # Ascending (fastest first)

6. TTL Strategy by Data Type

# | Data | TTL | Reason
1 | Rate limit counters | 60–300 seconds | Reset per time window
2 | Inference result cache | 5–60 minutes | Reasonably fresh, invalidate on model update
3 | User session data | 15–60 minutes | Expire inactive sessions
4 | Product catalog | 1–24 hours | Changes infrequently
5 | Static config | 24+ hours | Very stable data
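
One way to keep TTL choices consistent is a single lookup table in code. This is a sketch under assumptions: the category names and `ttl_for` are hypothetical, and each number is just one pick from the ranges listed in the table.

```python
# TTLs in seconds; each value is one pick from the ranges in the table.
TTL_SECONDS = {
    "rate_limit": 60,         # 60-300 seconds
    "inference_result": 300,  # 5-60 minutes
    "session": 1800,          # 15-60 minutes
    "product_catalog": 3600,  # 1-24 hours
    "static_config": 86400,   # 24+ hours
}


def ttl_for(category: str) -> int:
    """Look up a TTL, failing loudly on unknown categories."""
    try:
        return TTL_SECONDS[category]
    except KeyError:
        raise ValueError(f"no TTL policy for {category!r}") from None


print(ttl_for("inference_result"))  # 300
```

A call site would then read `r.set(key, value, ex=ttl_for("session"))`, so no write can silently skip its expiry.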
⚠️ Common Gotcha
Never store sensitive data (API keys, PII) in Redis without encryption: Redis stores data in memory and may persist it to disk. Expire sensitive data with short TTLs or use Key Vault instead.

⚡ Redis Master Cheatsheet

Azure Redis port | 10000 (TLS, NOT 6379)
Cache-aside key commands | GET → miss → DB → SET with TTL
Set with TTL (seconds) | r.set(key, val, ex=300)
Atomic counter | r.incr(key), thread-safe
Object storage | Hash: r.hset(key, mapping=dict)
Ordered queue | List: lpush/rpush + lrange
Unique members | Set: r.sadd
Ranked data | Sorted Set: r.zadd
Decode bytes to str | decode_responses=True
Prod auth | redis-entraid + DefaultAzureCredential
🧪 Unit 5

Exercise: Cache AI Inference Results

30 min
  1. Create Azure Managed Redis instance and connect on port 10000 with TLS
  2. Implement cache-aside for model inference: GET → MISS → model call → SET with 5-min TTL
  3. Implement rate limiting: INCR per user per minute, block at limit
  4. Store session history in a Redis List, trim to last 10 messages
  5. Measure latency difference: Redis HIT vs database query
✅ Unit 6

Knowledge Check

5 min
  1. Q: Azure Managed Redis connection port? A: 10000 with TLS (not 6379)
  2. Q: Cache frequently-read inference results, update on source change. Which pattern? A: Cache-aside (lazy loading)
  3. Q: Track API calls per user per minute atomically. Which command? A: INCR, with EXPIRE (or an initial SETEX) setting the window TTL
  4. Q: Store user profile fields individually without full JSON serialization. Which data type? A: Redis Hash
  5. Q: Production authentication for Redis without stored keys? A: Entra ID with redis-entraid + DefaultAzureCredential
🏁 Unit 7

Summary

2 min

Azure Managed Redis = in-memory cache at microsecond latency. Connect on port 10000 with TLS. Implement cache-aside for AI inference results. Choose data types by pattern: String (simple cache), Hash (objects), List (queues/history), Set (unique members), Sorted Set (leaderboards). Set TTLs based on data volatility. Use Entra ID for production auth.

🧠 Memory Tricks

Cache-aside flow: "Get → Miss → DB → Set", always in that order

Data type mnemonic: "SHLSS" = String (cache), Hash (objects), List (queue), Set (unique), Sorted Set (ranked)

Azure Redis port: just remember 10000, not 6379

⚡ Module Cheatsheet

Azure Managed Redis


🔑 Key Facts

  • Azure Redis port: 10000 with TLS, NOT the open-source default 6379
  • Cache-aside flow: GET → MISS → DB → SET with TTL → return
  • Atomic counter: r.incr(key), thread-safe, use for rate limiting
  • Hash (object): r.hset(key, mapping=dict), avoids full JSON serialize/deserialize
  • List (queue): lpush/rpush + lrange + ltrim, for chat history and task queues
  • Set (unique): r.sadd/sismember, for active sessions and deduplicated sets
  • Sorted Set (ranked): r.zadd, for leaderboards, priority queues, TTL ordering
  • Prod auth: redis-entraid + DefaultAzureCredential (no stored key)

💻 Commands & Patterns

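import time
import redis

r = redis.Redis(host="myinst.<region>.redis.azure.net",
  port=10000, ssl=True, decode_responses=True,
  password="your-access-key")

# Cache-aside
def get_result(key):
  hit = r.get(key)
  if hit is not None:  # an empty string is still a valid hit
    return hit
  val = db_query(key)  # db_query: your own source-of-truth lookup (not shown)
  r.set(key, val, ex=300)  # 5-min TTL
  return val

# Atomic rate limit (fixed one-minute window)
class RateLimitError(Exception):
  pass

def check_rate_limit(user_id, limit=10):
  key = f"rate:{user_id}:{int(time.time()) // 60}"
  count = r.incr(key)
  r.expire(key, 60)  # let the window key clean itself up
  if count > limit:
    raise RateLimitError(f"{user_id} exceeded {limit} calls per minute")

# Hash for user profile
r.hset("user:1001", mapping={"name": "Alice", "tier": "premium"})
r.hincrby("user:1001", "credits", -10)

# List: keep the 10 newest messages
r.lpush("chat:abc", "Hello")
r.ltrim("chat:abc", 0, 9)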
Module

Implement Vector Indexing and Semantic Caching with Azure Managed Redis

🎬 Unit 1

Introduction to Redis Vector Search

3 min

Azure Managed Redis (via the RediSearch module) supports vector similarity search: store embeddings in Redis hashes and query with FT.SEARCH KNN. The primary AI use case is semantic caching: return cached LLM responses for semantically similar prompts instead of calling the API every time.

💡 Exam Tip
Redis vector exam pillars: 1) FT.CREATE with VECTOR HNSW field 2) FLOAT32 / DIM / DISTANCE_METRIC COSINE 3) FT.SEARCH KNN syntax with DIALECT 2 4) Cosine distance 0 = identical; similarity = 1 - distance 5) Cache hit threshold ~0.95.
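
The distance-to-similarity conversion in pillars 4 and 5 can be worked through in plain Python. This sketch computes cosine distance by hand for two tiny example vectors; `cosine_distance` and `is_cache_hit` are hypothetical helper names, and the 0.95 threshold is the one quoted in the tip.

```python
import math


def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


THRESHOLD = 0.95  # minimum similarity to count as a cache hit


def is_cache_hit(distance, threshold=THRESHOLD):
    """Convert distance back to similarity and compare with the threshold."""
    return (1.0 - distance) >= threshold


same = cosine_distance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # parallel vectors
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])      # unrelated vectors

print(is_cache_hit(same))        # distance is approximately 0, so a hit
print(is_cache_hit(orthogonal))  # distance is 1, so a miss
```

Scaling a vector does not change its direction, which is why the first pair counts as identical under cosine distance.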
📘 Unit 2

Create a Vector Index with FT.CREATE

8 min

Create RediSearch Vector Index

import struct

import redis
from openai import OpenAI

# Assumption: `oai` is an OpenAI-style client that reads its API key from the
# environment (for Azure OpenAI, use AzureOpenAI and a deployment name as `model`).
oai = OpenAI()

r = redis.Redis(
    host="myredis.<region>.redis.azure.net",
    port=10000, password="key", ssl=True,
    decode_responses=False  # MUST be False for binary vectors
)

r.execute_command(
    "FT.CREATE", "idx:docs", "ON", "HASH",
    "PREFIX", "1", "doc:",
    "SCHEMA",
    "title", "TEXT",
    "embedding", "VECTOR", "HNSW", "6",  # 6 = number of attribute tokens that follow
        "TYPE", "FLOAT32",
        "DIM", "1536",
        "DISTANCE_METRIC", "COSINE"
)

# Store document with embedding
def embed(text):
    vec = oai.embeddings.create(input=text,
        model="text-embedding-3-small").data[0].embedding
    # Pack as FLOAT32 bytes: 4 bytes per dimension, matching TYPE and DIM above
    return struct.pack(f"{len(vec)}f", *vec)

r.hset("doc:1", mapping={
    "title": "Azure Key Vault",
    "embedding": embed("Azure Key Vault secures secrets")
})
💡 Exam Tip
decode_responses=False is required for binary vector data. HNSW = approximate nearest-neighbor index type (FLAT is the exact alternative). DIM 1536 must match your embedding model dimensions.
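
The link between DIM and the stored bytes is simple arithmetic: FLOAT32 uses 4 bytes per dimension. The sketch below round-trips a tiny vector; the helper names are hypothetical, and the explicit `<` (little-endian) format is an assumption that matches typical x86 and ARM hosts.

```python
import struct


def to_float32_bytes(vec):
    """Pack floats as little-endian FLOAT32, 4 bytes per dimension."""
    return struct.pack(f"<{len(vec)}f", *vec)


def from_float32_bytes(blob):
    """Inverse of to_float32_bytes."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


vec = [0.5, -1.0, 0.25, 2.0]  # 4 dims for illustration; values exact in float32
blob = to_float32_bytes(vec)
print(len(blob))                 # 16  (4 dims * 4 bytes)
print(from_float32_bytes(blob))  # [0.5, -1.0, 0.25, 2.0]
```

By the same arithmetic, a 1536-dimension embedding should pack to 6144 bytes; a different length is a quick sign that the model and the index DIM disagree.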
📘 Unit 4

Semantic Caching Pattern

8 min

Cache-Aside with Similarity Threshold

import hashlib

THRESHOLD = 0.95  # cosine similarity for cache hit

def semantic_cache_get(prompt):
    # Assumes an index "idx:cache" over the "cache:" key prefix, created with
    # FT.CREATE the same way as the document index.
    q_emb = embed(prompt)
    res = r.execute_command(
        "FT.SEARCH", "idx:cache",
        "*=>[KNN 1 @embedding $blob AS score]",
        "PARAMS", "2", "blob", q_emb,
        "RETURN", "2", "response", "score",
        "DIALECT", "2"
    )
    if res[0] > 0:
        # res = [total, key, [field, value, ...]]; names are bytes here
        fields = dict(zip(res[2][::2], res[2][1::2]))
        distance = float(fields[b"score"])
        if (1 - distance) >= THRESHOLD:
            return fields[b"response"]
    return None

def semantic_cache_set(prompt, response, ttl=3600):
    key = f"cache:{hashlib.md5(prompt.encode()).hexdigest()}"
    r.hset(key, mapping={
        "prompt": prompt,
        "response": response,
        "embedding": embed(prompt)
    })
    r.expire(key, ttl)
⚠️ Common Gotcha
Redis returns cosine distance (0=identical). Similarity = 1 - distance. Always set a TTL on cached responses, because LLM answers become stale. Never cache user-specific or sensitive data.
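
The reply handling is the part of the pattern most worth testing in isolation. Below is a minimal sketch that parses a reply of the shape the code above indexes into (a total count, then a key and a flat field/value list per result) and applies the threshold. `parse_knn_reply` and `best_match` are hypothetical helpers, and `sample` is a hand-written reply, not captured server output.

```python
def parse_knn_reply(reply):
    """Split [total, key, [field, value, ...], ...] into (total, results)."""
    total, rest = reply[0], reply[1:]
    results = []
    for key, flat in zip(rest[::2], rest[1::2]):
        fields = dict(zip(flat[::2], flat[1::2]))  # pair up names and values
        results.append((key, fields))
    return total, results


def best_match(reply, threshold=0.95):
    """Return the cached response if the nearest neighbor is similar enough."""
    total, results = parse_knn_reply(reply)
    if total == 0:
        return None
    _key, fields = results[0]
    similarity = 1.0 - float(fields[b"score"])  # score holds cosine distance
    return fields[b"response"] if similarity >= threshold else None


sample = [1, b"cache:abc", [b"score", b"0.02", b"response", b"Use Key Vault."]]
print(best_match(sample))  # b'Use Key Vault.'
```

Keeping the parsing separate from the network call lets the threshold logic be unit-tested without a Redis instance.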
🧪 Unit 5

Exercise

20 min
  1. Create an Azure Managed Redis instance with the RediSearch module enabled
  2. Create an FT.CREATE index with HNSW FLOAT32 DIM=1536 COSINE
  3. Store 10 Q&A pairs with embeddings using r.hset()
  4. Implement semantic_cache_get() with 0.95 similarity threshold
  5. Test with rephrased questions and verify cache hits vs misses
🏁 Unit 6

Summary

2 min

Redis vector search: FT.CREATE with HNSW index → HSET with binary embedding → FT.SEARCH KNN with DIALECT 2. Cosine similarity = 1 - distance. Semantic cache returns cached LLM responses for similar prompts (≥0.95 threshold). Always set a TTL on cache entries. Use decode_responses=False for binary vector operations.


Frequently Asked Questions

What percentage of the AI-200 exam covers Develop AI Solutions by Using Azure Data Management Services?

Domain 2 (Develop AI Solutions by Using Azure Data Management Services) accounts for 25–30% of the AI-200 exam. Enhance AI Solutions with Azure Managed Redis topics like Azure Managed Redis and RediSearch are actively tested. Study all official skill objectives listed in the module header above.

Is Azure Managed Redis on the AI-200 exam?

Yes. Enhance AI Solutions with Azure Managed Redis is part of Domain 2 in the official AI-200 skill outline, weighted at 25–30%. The key services tested are Azure Managed Redis, RediSearch, Vector Index. Review the code examples and exam tips in this module for targeted prep.

How do I practice Azure Managed Redis hands-on?

The best approach is to create a free Azure account and follow the code examples in this module step-by-step. The official Microsoft Learn sandbox for Course AI-200T00-A also provides free lab environments for Azure Managed Redis and related services.