Azure Auth Series — Blog 7: API Gateway Authentication#
In Blog 6, every service validated its own JWTs. The Task API fetched JWKS, the Notification Service fetched JWKS, the Audit Service fetched JWKS — three separate HTTP calls to Azure AD on every request. That works, but it doesn't scale. What if you have ten services? Twenty?
This blog puts Azure API Management (APIM) in front of all three services. APIM validates the JWT once, extracts the claims, and forwards them as trusted HTTP headers. Backends skip token validation entirely and just read the headers. One gateway, one place to change auth rules, one place to add rate limiting.
Source code: github.com/MinhQuanBuiSco/Azure/.../07_api_gateway
The Azure Auth Series#
| Blog | Topic | What You'll Learn |
|---|---|---|
| 1. Basic Login | Frontend auth | Sign in with Microsoft Entra ID |
| 2. Protected API | Backend auth | Build a FastAPI backend that validates tokens |
| 3. RBAC | Authorization | Control access based on user roles |
| 4. Managed Identity | Zero secrets | Deploy to Azure without storing credentials |
| 5. Multi-Tenant | Organizations | Let users from any org sign in |
| 6. Service-to-Service | OBO + Client Credentials | Authenticate services to each other |
| 7. API Gateway | You are here | Centralize auth with Azure API Management |
What We're Building#
Architecture#
Blog 6 (no gateway):
Frontend → Task API (validate JWT)
→ Notification Svc (validate JWT)
→ Audit Svc (validate JWT)
Each service fetches JWKS independently
Blog 7 (with APIM gateway):
Frontend → APIM Gateway → Task API (trust headers)
→ Notification Svc (trust headers)
→ Audit Svc (trust headers)
APIM validates JWT once, forwards claims as headers
What Changed from Blog 6#
| Aspect | Blog 6 | Blog 7 |
|---|---|---|
| JWT validation | Every service validates independently | APIM validates once at the gateway |
| JWKS fetching | 3 services × N requests | APIM caches JWKS, services skip |
| Rate limiting | None | 100 req/min per user at gateway |
| Auth rule location | Spread across 3 auth.py files | Centralized in APIM XML policies |
| Backend auth mode | Always validate JWT | Dual-mode: TRUST_GATEWAY flag |
| Frontend API URL | http://localhost:8000 | APIM gateway URL |
| Infrastructure | Local dev only | Terraform: APIM + Container Apps + ACR |
The App#

The frontend looks the same as Blog 6 — sign in, manage tasks, see notification and audit badges. But the request path is different: everything goes through APIM first.

Step 1: The Problem — Duplicated Auth Logic#
In Blog 6, every service had its own auth.py that did the same thing:
1. Extract Bearer token from Authorization header
2. Decode the JWT header to get the key ID (kid)
3. Fetch JWKS from login.microsoftonline.com/{tenant_id}
4. Find the matching RSA key
5. Verify signature, audience, issuer
6. Return claims
Three services means three copies of this logic. If you want to change the allowed issuers or add a new audience, you update three files. If you want to add rate limiting, you add it to three main.py files.
An API gateway solves this by doing steps 1–6 once and passing the verified claims downstream.
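The bookkeeping parts of that list (steps 1, 2 and 4) can be sketched with the standard library alone. This is an illustrative sketch, not the series' actual auth.py: the function names and the demo token are hypothetical, and the JWKS fetch (step 3) and signature, audience and issuer checks (step 5) are deliberately left out because they need a JWT library.

```python
import base64
import json


def extract_bearer_token(authorization: str) -> str:
    """Step 1: pull the raw token out of an Authorization header value."""
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        raise ValueError("Missing Bearer token")
    return token.strip()


def read_key_id(token: str) -> str:
    """Step 2: decode the (still unverified) JWT header and return its kid."""
    header_segment = token.split(".")[0]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = header_segment + "=" * (-len(header_segment) % 4)
    header = json.loads(base64.urlsafe_b64decode(padded))
    return header["kid"]


def find_signing_key(jwks: dict, kid: str) -> dict:
    """Step 4: pick the JWKS entry whose kid matches the token header."""
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    raise KeyError(f"No JWKS key with kid={kid!r}")


# Demo with a hand-built header; payload and signature are placeholders.
_header = base64.urlsafe_b64encode(
    json.dumps({"alg": "RS256", "kid": "key-1"}).encode()
).rstrip(b"=").decode()
demo_token = f"{_header}.payload.signature"
demo_jwks = {"keys": [{"kid": "key-0"}, {"kid": "key-1", "kty": "RSA"}]}

kid = read_key_id(extract_bearer_token(f"Bearer {demo_token}"))
signing_key = find_signing_key(demo_jwks, kid)
```

Even this reduced version is code that would otherwise live in three places, which is the duplication the gateway removes.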
Step 2: APIM Policies — The Core Innovation#
APIM uses XML policies to process every request before it reaches the backend. Each API gets its own policy file.
Task API Policy (task-api.xml)#
```xml
<policies>
  <inbound>
    <base />

    <!-- CORS for frontend -->
    <cors allow-credentials="true">
      <allowed-origins>
        <origin>http://localhost:3000</origin>
      </allowed-origins>
      <allowed-methods preflight-result-max-age="300">
        <method>*</method>
      </allowed-methods>
      <allowed-headers>
        <header>*</header>
      </allowed-headers>
    </cors>

    <!-- Rate limit: 100 req/min per user (JWT sub) or IP -->
    <rate-limit-by-key calls="100" renewal-period="60"
        counter-key="@(context.Request.Headers
            .GetValueOrDefault("Authorization","")
            .AsJwt()?.Subject ?? context.Request.IpAddress)" />

    <!-- Validate JWT: store parsed token in "jwt" variable -->
    <validate-jwt header-name="Authorization"
                  require-scheme="Bearer"
                  failed-validation-httpcode="401"
                  output-token-variable-name="jwt">
      <openid-config url="https://login.microsoftonline.com/${tenant_id}/v2.0/.well-known/openid-configuration" />
      <audiences>
        <audience>${audience}</audience>
      </audiences>
      <issuers>
        <issuer>https://sts.windows.net/${tenant_id}/</issuer>
      </issuers>
    </validate-jwt>

    <!-- Extract claims → backend headers -->
    <set-header name="X-User-OID" exists-action="override">
      <value>@(((Jwt)context.Variables["jwt"])
        .Claims.GetValueOrDefault("oid", new [] {""}).FirstOrDefault())</value>
    </set-header>
    <set-header name="X-User-Name" exists-action="override">
      <value>@(((Jwt)context.Variables["jwt"])
        .Claims.GetValueOrDefault("name", new [] {""}).FirstOrDefault())</value>
    </set-header>
    <set-header name="X-User-Email" exists-action="override">
      <value>@(((Jwt)context.Variables["jwt"])
        .Claims.GetValueOrDefault("preferred_username", new [] {""}).FirstOrDefault())</value>
    </set-header>
    <set-header name="X-Tenant-ID" exists-action="override">
      <value>@(((Jwt)context.Variables["jwt"])
        .Claims.GetValueOrDefault("tid", new [] {""}).FirstOrDefault())</value>
    </set-header>
    <set-header name="X-User-Roles" exists-action="override">
      <value>@(String.Join(",", ((Jwt)context.Variables["jwt"])
        .Claims.GetValueOrDefault("roles", new string[0])))</value>
    </set-header>
  </inbound>
  <backend><base /></backend>
  <outbound><base /></outbound>
  <on-error><base /></on-error>
</policies>
```
Three things happen on every request:
- Rate limiting: 100 calls per minute, keyed by the JWT `sub` claim. Falls back to the caller's IP address for unauthenticated requests.
- JWT validation: fetches the OpenID configuration, verifies the signature, and checks audience and issuer. The parsed token is stored in the `jwt` variable.
- Claims forwarding: extracts `oid`, `name`, `preferred_username`, `tid`, and `roles` from the validated JWT and sets them as HTTP headers.
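To make the rate-limit rule concrete, here is a simplified Python model of "100 calls per 60 seconds per key, where the key is the JWT subject or else the IP address". It is only a sketch: the class and function names are hypothetical, and the fixed-window counter is an assumption made for illustration rather than a description of how APIM counts internally.

```python
import time
from typing import Callable, Dict, Optional, Tuple


def counter_key(jwt_subject: Optional[str], ip_address: str) -> str:
    """Mirror the policy's counter-key: JWT sub if present, else caller IP."""
    return jwt_subject if jwt_subject is not None else ip_address


class FixedWindowLimiter:
    """Allow at most `calls` requests per key in each `renewal_period` window."""

    def __init__(self, calls: int = 100, renewal_period: float = 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.calls = calls
        self.renewal_period = renewal_period
        self.clock = clock
        # key -> (window start time, requests counted in that window)
        self._windows: Dict[str, Tuple[float, int]] = {}

    def allow(self, key: str) -> bool:
        now = self.clock()
        window = self._windows.get(key)
        if window is None or now - window[0] >= self.renewal_period:
            window = (now, 0)  # start a fresh window for this key
        start, count = window
        if count >= self.calls:
            self._windows[key] = window
            return False  # over the limit until the window renews
        self._windows[key] = (start, count + 1)
        return True


# Demo with a controllable clock so the window renewal is visible.
fake_now = [0.0]
limiter = FixedWindowLimiter(calls=100, renewal_period=60,
                             clock=lambda: fake_now[0])
user = counter_key("user-sub-123", "203.0.113.7")
results = [limiter.allow(user) for _ in range(101)]  # 101st call is rejected
fake_now[0] = 60.0
after_renewal = limiter.allow(user)  # new window, allowed again
```

Keying on the subject rather than the IP means two users behind the same NAT get separate budgets, while anonymous traffic is still bounded per address.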
Audit API Policy — Role Enforcement at the Gateway#
The audit API policy adds an extra check: the token must contain the AuditLog.Write application role.
```xml
<validate-jwt ... output-token-variable-name="jwt">
  <openid-config url="..." />
  <audiences>
    <audience>${audience}</audience>
  </audiences>
  <issuers>
    <issuer>https://sts.windows.net/${tenant_id}/</issuer>
  </issuers>

  <!-- Require AuditLog.Write application role -->
  <required-claims>
    <claim name="roles" match="any">
      <value>AuditLog.Write</value>
    </claim>
  </required-claims>
</validate-jwt>
```
If the token doesn't have AuditLog.Write, APIM returns 401 before the request even reaches the Audit Service. This is the same check that audit-service/auth.py did in Blog 6, but now it happens at the gateway.
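The role check itself is small. A Python model of it, assuming that `match="any"` passes when at least one listed value appears in the token's claim and `match="all"` requires every listed value (the function name is hypothetical):

```python
from typing import Iterable


def claim_satisfied(token_values: Iterable[str],
                    required_values: Iterable[str],
                    match: str = "all") -> bool:
    """Model of one <claim> element inside <required-claims>."""
    present = set(token_values)
    if match == "any":
        return any(value in present for value in required_values)
    return all(value in present for value in required_values)


# An app-only token carrying the role passes; one without it is rejected.
with_role = claim_satisfied(["AuditLog.Write"], ["AuditLog.Write"], match="any")
without_role = claim_satisfied(["Tasks.Read"], ["AuditLog.Write"], match="any")
```

With a single required value, as in the audit policy, "any" and "all" behave the same; the distinction only matters once a claim lists several acceptable values. `Tasks.Read` is a made-up role used only to show the failing case.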
Step 3: The TRUST_GATEWAY Pattern#
The key architectural decision in Blog 7 is the TRUST_GATEWAY flag. Each service's config.py adds one new variable:
```python
# config.py: NEW in Blog 7
import logging
import os

logger = logging.getLogger(__name__)

# Only the literal string "true" (case-insensitive) enables gateway mode.
TRUST_GATEWAY = os.getenv("TRUST_GATEWAY", "false").lower() == "true"

if TRUST_GATEWAY:
    logger.info(
        "TRUST_GATEWAY=true: accepting claims "
        "from APIM headers (skip JWT validation)"
    )
else:
    logger.info(
        "TRUST_GATEWAY=false: validating JWT "
        "tokens locally"
    )
```
And each service's auth.py dispatches based on it:
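```python
# auth.py: dual-mode validation
from fastapi import HTTPException, Request, status
from fastapi.security import HTTPAuthorizationCredentials

from config import TRUST_GATEWAY  # assumed import path for the flag above

# _extract_claims_from_headers and _validate_jwt are defined in this module.


async def validate_token(request: Request) -> dict:
    if TRUST_GATEWAY:
        # Gateway mode: APIM already validated the JWT, so read its headers.
        return _extract_claims_from_headers(request)

    # Fall back to JWT validation (Blog 6 behavior)
    auth_header = request.headers.get("Authorization", "")
    if not auth_header.startswith("Bearer "):
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Missing Bearer token",
        )
    credentials = HTTPAuthorizationCredentials(
        scheme="Bearer", credentials=auth_header[7:]
    )
    return await _validate_jwt(credentials)
```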
Extracting Claims from Headers#
When TRUST_GATEWAY=true, the backend reads APIM-injected headers instead of parsing JWT:
```python
def _extract_claims_from_headers(request: Request) -> dict:
    """Extract user claims from APIM-injected headers."""
    oid = request.headers.get("X-User-OID", "")
    name = request.headers.get("X-User-Name", "")
    email = request.headers.get("X-User-Email", "")
    tenant_id = request.headers.get("X-Tenant-ID", "")
    roles_header = request.headers.get("X-User-Roles", "")

    # The object ID and tenant ID are mandatory; reject the request without them.
    if not oid or not tenant_id:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Missing required gateway headers",
        )

    # X-User-Roles is a comma-separated list; drop empty entries.
    roles = [r.strip() for r in roles_header.split(",") if r.strip()]

    return {
        "oid": oid,
        "name": name,
        "preferred_username": email,
        "tid": tenant_id,
        "roles": roles,
        "scp": "access_as_user",
        "_source": "gateway_headers",
    }
```
The result is the same claims dictionary that _validate_jwt() returns — the rest of the application code doesn't know or care which mode was used.
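One way to convince yourself of that is a round trip: turn a claims dictionary into headers the way the gateway policy does, then parse the headers back the way the backend does. The sketch below uses plain dictionaries instead of APIM and FastAPI objects; the helper names and the sample claim values are hypothetical.

```python
# Claim name -> header name, as used by the policy and the backend.
CLAIM_TO_HEADER = {
    "oid": "X-User-OID",
    "name": "X-User-Name",
    "preferred_username": "X-User-Email",
    "tid": "X-Tenant-ID",
}


def claims_to_headers(claims: dict) -> dict:
    """Gateway side: copy selected claims into X-User-* headers."""
    headers = {
        header: str(claims.get(claim, ""))
        for claim, header in CLAIM_TO_HEADER.items()
    }
    headers["X-User-Roles"] = ",".join(claims.get("roles", []))
    return headers


def headers_to_claims(headers: dict) -> dict:
    """Backend side: rebuild a claims dict from the forwarded headers."""
    claims = {
        claim: headers.get(header, "")
        for claim, header in CLAIM_TO_HEADER.items()
    }
    roles_header = headers.get("X-User-Roles", "")
    claims["roles"] = [r.strip() for r in roles_header.split(",") if r.strip()]
    return claims


token_claims = {
    "oid": "abc-123",
    "name": "Test Admin",
    "preferred_username": "admin@contoso.com",
    "tid": "tenant-1",
    "roles": ["Admin", "Editor"],
    "aud": "api://example",  # not forwarded, so the backend never sees it
}
round_tripped = headers_to_claims(claims_to_headers(token_claims))
```

Two consequences fall out of this: only the forwarded claims survive the trip, and a role name containing a comma would be split in two, so role names should avoid commas with this encoding.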
Why Two Modes?#
| Mode | When | Why |
|---|---|---|
| TRUST_GATEWAY=false | Local development | No APIM needed, validates JWT directly |
| TRUST_GATEWAY=true | Cloud deployment | APIM already validated, skip redundant check |
This means docker compose up still works locally without deploying APIM, while the cloud deployment gets the performance benefit of centralized validation.
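Because the flag is parsed as `os.getenv("TRUST_GATEWAY", "false").lower() == "true"`, only the word "true" in any letter case switches gateway mode on. The sketch below wraps the same expression in a hypothetical helper that takes the environment as a parameter so the behavior can be checked without touching real environment variables.

```python
import os
from typing import Mapping


def trust_gateway_enabled(environ: Mapping[str, str] = os.environ) -> bool:
    """Same expression as config.py, with the environment passed in."""
    return environ.get("TRUST_GATEWAY", "false").lower() == "true"


# Values that look truthy to a human but do not enable gateway mode.
samples = ["true", "TRUE", "True", "false", "1", "yes", " true"]
parsed = {value: trust_gateway_enabled({"TRUST_GATEWAY": value})
          for value in samples}
unset = trust_gateway_enabled({})
```

The strictness is a safe default here: a typo such as `1` or `yes` leaves the service validating JWTs locally instead of silently trusting headers.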
Step 4: Audit Service — Role Enforcement in Both Modes#
The Audit Service has a special requirement: it only accepts tokens with the AuditLog.Write application role. In Blog 6, this was checked in auth.py. In Blog 7, it's checked twice — at the gateway (APIM policy) and at the backend (defense in depth).
```python
# audit-service/auth.py
def _extract_claims_from_headers(request: Request) -> dict:
    """Extract claims from APIM headers for app-only tokens."""
    oid = request.headers.get("X-User-OID", "")
    tenant_id = request.headers.get("X-Tenant-ID", "")
    roles_header = request.headers.get("X-User-Roles", "")

    if not tenant_id:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Missing required gateway header",
        )

    roles = [r.strip() for r in roles_header.split(",") if r.strip()]

    # Defense in depth: check role even in gateway mode
    if "AuditLog.Write" not in roles:
        raise HTTPException(
            status_code=status.HTTP_403_FORBIDDEN,
            detail="Missing required application role: "
                   "AuditLog.Write",
        )

    return {
        "oid": oid,
        "tid": tenant_id,
        "roles": roles,
        "azp": oid,
        "_source": "gateway_headers",
    }
```
Even though APIM already checked the role in its <required-claims> block, the backend checks again. If APIM is misconfigured or bypassed, the backend still rejects unauthorized calls.
Step 5: Infrastructure as Code — Terraform#
Blog 7's Terraform provisions four modules:
infra/
├── main.tf # Root orchestrator
├── variables.tf # ACR name, APIM name, client IDs
├── outputs.tf # APIM gateway URL, service FQDNs
└── modules/
├── resource_group/ # Azure Resource Group
├── container_registry/ # ACR (Basic SKU)
├── container_apps/ # 3 Container Apps
└── api_management/ # APIM + 3 APIs + policies
└── policies/
├── task-api.xml
├── notification.xml
└── audit.xml
Root Module#
```hcl
# infra/main.tf
module "resource_group" {
  source   = "./modules/resource_group"
  name     = var.resource_group_name
  location = var.location
  tags     = local.tags
}

module "container_registry" {
  source              = "./modules/container_registry"
  name                = var.acr_name
  resource_group_name = module.resource_group.name
  location            = module.resource_group.location
  tags                = local.tags
}

module "container_apps" {
  source                 = "./modules/container_apps"
  project_name           = var.project_name
  resource_group_name    = module.resource_group.name
  location               = module.resource_group.location
  acr_login_server       = module.container_registry.login_server
  acr_admin_username     = module.container_registry.admin_username
  acr_admin_password     = module.container_registry.admin_password
  tenant_id              = var.tenant_id
  task_api_client_id     = var.task_api_client_id
  task_api_client_secret = var.task_api_client_secret
  notification_client_id = var.notification_client_id
  audit_client_id        = var.audit_client_id
  allowed_origins        = var.allowed_origins
  tags                   = local.tags
}

module "api_management" {
  source                = "./modules/api_management"
  apim_name             = var.apim_name
  resource_group_name   = module.resource_group.name
  location              = module.resource_group.location
  publisher_email       = var.publisher_email
  tenant_id             = var.tenant_id
  task_api_fqdn         = module.container_apps.task_api_fqdn
  notification_fqdn     = module.container_apps.notification_fqdn
  audit_fqdn            = module.container_apps.audit_fqdn
  task_api_audience     = "api://${var.task_api_client_id}"
  notification_audience = "api://${var.notification_client_id}"
  audit_audience        = "api://${var.audit_client_id}"
  tags                  = local.tags
}
```
Container Apps — TRUST_GATEWAY=true#
Each Container App is configured with TRUST_GATEWAY=true so it accepts APIM-forwarded headers:
```hcl
# modules/container_apps/main.tf (Task API excerpt)
resource "azurerm_container_app" "task_api" {
  name                         = "task-api"
  container_app_environment_id = azurerm_container_app_environment.this.id
  resource_group_name          = var.resource_group_name
  revision_mode                = "Single"

  template {
    min_replicas = 0
    max_replicas = 1

    container {
      name   = "task-api"
      image  = "mcr.microsoft.com/k8se/quickstart:latest"
      cpu    = 0.25
      memory = "0.5Gi"

      env {
        name  = "TRUST_GATEWAY"
        value = "true"
      }
      env {
        name  = "AZURE_HOME_TENANT_ID"
        value = var.tenant_id
      }
      env {
        name  = "AZURE_API_CLIENT_ID"
        value = var.task_api_client_id
      }
      env {
        name        = "AZURE_API_CLIENT_SECRET"
        secret_name = "api-client-secret"
      }
      env {
        name  = "NOTIFICATION_URL"
        value = "https://${azurerm_container_app.notification.ingress[0].fqdn}"
      }
      env {
        name  = "AUDIT_URL"
        value = "https://${azurerm_container_app.audit.ingress[0].fqdn}"
      }
      # ... more env vars
    }
  }

  ingress {
    external_enabled = true
    target_port      = 8000
    transport        = "http"

    traffic_weight {
      latest_revision = true
      percentage      = 100
    }
  }

  secret {
    name  = "api-client-secret"
    value = var.task_api_client_secret
  }

  # ... registry config
}
```
Setting min_replicas = 0 lets the Container App scale to zero when idle, so you pay for compute only while requests are being handled.
APIM Module — Three APIs, Three Policies#
```hcl
# modules/api_management/main.tf
resource "azurerm_api_management" "this" {
  name                = var.apim_name
  location            = var.location
  resource_group_name = var.resource_group_name
  publisher_name      = "Azure Auth Series"
  publisher_email     = var.publisher_email
  sku_name            = "Developer_1"
}

# Task API: path /api
resource "azurerm_api_management_api" "task_api" {
  name                  = "task-api"
  api_management_name   = azurerm_api_management.this.name
  resource_group_name   = var.resource_group_name
  revision              = "1"
  display_name          = "Task API"
  path                  = "api"
  protocols             = ["https"]
  service_url           = "https://${var.task_api_fqdn}/api"
  subscription_required = false
}

# Notification API: path /notification
resource "azurerm_api_management_api" "notification" {
  # ... similar
  path        = "notification"
  service_url = "https://${var.notification_fqdn}"
}

# Audit API: path /auditing
resource "azurerm_api_management_api" "audit" {
  # ... similar
  path        = "auditing"
  service_url = "https://${var.audit_fqdn}"
}
```
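The effect of `path` and `service_url` on routing can be modelled in Python. This is a simplified sketch: it assumes APIM matches the first path segment against an API's `path`, drops that segment, and appends the remainder to `service_url`. The hostnames, the `/auditing/events` route, and the function name are placeholders, and real APIM also matches individual operations, which is ignored here.

```python
# API path prefix -> service_url, mirroring the three APIs above.
API_ROUTES = {
    "api": "https://task-api.example.internal/api",
    "notification": "https://notification.example.internal",
    "auditing": "https://audit.example.internal",
}


def backend_url(request_path: str) -> str:
    """Map a gateway request path to the backend URL that would be called."""
    prefix, _, rest = request_path.lstrip("/").partition("/")
    try:
        service_url = API_ROUTES[prefix]
    except KeyError:
        raise LookupError(f"No API registered for prefix {prefix!r}") from None
    # The API path prefix is dropped; the remainder is appended to service_url.
    return f"{service_url}/{rest}" if rest else service_url


task_url = backend_url("/api/tasks")
audit_url = backend_url("/auditing/events")
```

Because the Task API's `service_url` itself ends in `/api`, a gateway request to `/api/tasks` still reaches `/api/tasks` on the backend, whereas the other two services receive the path without the gateway prefix.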
Each API gets its policy via templatefile(), which injects the tenant_id and audience variables into the XML:
```hcl
resource "azurerm_api_management_api_policy" "task_api" {
  api_name            = azurerm_api_management_api.task_api.name
  api_management_name = azurerm_api_management.this.name
  resource_group_name = var.resource_group_name

  xml_content = templatefile(
    "${path.module}/policies/task-api.xml",
    {
      tenant_id = var.tenant_id
      audience  = var.task_api_audience
    }
  )
}
```
Step 6: Request Flow — End to End#
Here's what happens when a user creates a task:
1. User clicks "Add" in the frontend
2. Frontend sends POST to APIM gateway:
POST https://blog07-apim.azure-api.net/api/tasks
Authorization: Bearer eyJ0eXAi...
3. APIM processes the request:
a. Rate limit check (100/min per JWT sub)
b. Validate JWT (signature, audience, issuer)
c. Extract claims → set headers:
X-User-OID: abc-123
X-User-Name: Test Admin
X-User-Email: admin@contoso.com
X-Tenant-ID: 0f485f73-...
X-User-Roles: Admin
4. APIM routes to Task API Container App:
POST https://task-api.internal/api/tasks
(with X-User-* headers)
5. Task API reads headers (TRUST_GATEWAY=true):
claims = _extract_claims_from_headers(request)
# No JWKS fetch needed!
6. Task API creates the task, then:
a. OBO → Notification Service (direct HTTPS)
b. Client Credentials → Audit Service (direct HTTPS)
Both S2S calls bypass APIM — they go directly
to the Container App FQDNs.
Notice that service-to-service calls (step 6) bypass APIM. The Task API calls Notification and Audit services directly because:
- Those calls use different tokens (OBO and Client Credentials), not the user's original token
- Internal calls don't need gateway routing or rate limiting
- Each downstream service still validates its own token (the S2S tokens, not the original user token)
Step 7: Deployment#
The setup.sh script automates everything:
./setup.sh
It runs five phases:
Phase 1: Azure AD Setup
Create 4 app registrations (Task API, Notification,
Audit, SPA Frontend)
Define scopes and app roles
Create 3 test users with role assignments
Phase 2: Terraform
terraform init && terraform apply
Provisions: RG + ACR + 3 Container Apps + APIM
(APIM Developer tier takes ~30-45 minutes)
Phase 3: Docker Build + Push
Build 3 images with --platform linux/amd64
Push to ACR
Phase 4: Update Container Apps
az containerapp update → point to real Docker images
Phase 5: Write .env Files
task-api/.env: TRUST_GATEWAY=false (for local dev)
frontend/.env.local: API URL = APIM gateway URL
After deployment, your frontend points to the APIM gateway:
```bash
# frontend/.env.local
NEXT_PUBLIC_API_URL=https://blog07-apim.azure-api.net
```
All API calls from the frontend go through https://blog07-apim.azure-api.net/api/* — APIM handles auth, rate limiting, and routing to the correct Container App.
Cleanup#
./cleanup.sh
Destroys all Azure resources (Terraform + AD apps + test users) to avoid charges.
Step 8: Local Development#
For local development, set TRUST_GATEWAY=false in your .env files (the default). This makes each service validate JWTs directly — same as Blog 6.
```bash
# task-api/.env
TRUST_GATEWAY=false
AZURE_HOME_TENANT_ID=your-tenant-id
AZURE_API_CLIENT_ID=your-api-client-id
# ... rest of config
```
Start services locally:
```bash
# Terminal 1
cd task-api && uvicorn main:app --port 8000

# Terminal 2
cd notification-service && uvicorn main:app --port 8001

# Terminal 3
cd audit-service && uvicorn main:app --port 8002

# Terminal 4
cd frontend && npm run dev
```
The frontend points to http://localhost:8000 (no APIM), and every service validates tokens independently. When you deploy to Azure, flip TRUST_GATEWAY=true and point the frontend to the APIM URL.
How the Gateway Changes Security#
Without Gateway (Blog 6)#
Frontend → Task API
✓ Validates JWT (JWKS fetch)
✓ Checks roles
✗ No rate limiting
✗ Auth logic duplicated in 3 services
With Gateway (Blog 7)#
Frontend → APIM
✓ Validates JWT (cached JWKS)
✓ Rate limits (100/min per user)
✓ Forwards claims as headers
→ Task API
✓ Reads headers (fast, no HTTP call)
✓ Checks roles (same logic, same result)
The backends still check roles — require_role("Admin", "Editor") works identically in both modes. The difference is where token validation happens.
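A framework-free sketch shows why the role check is indifferent to the mode. The `require_role` below is a hypothetical stand-in for the series' FastAPI dependency, not its actual code: it only looks at the `roles` list, so claims built from a JWT and claims built from gateway headers are treated alike. The `Forbidden` exception, the `"jwt"` source label, and the `Viewer` role are made up for the demo.

```python
class Forbidden(Exception):
    """Stand-in for an HTTP 403 response in this framework-free sketch."""


def require_role(*allowed_roles: str):
    """Return a checker that passes claims through only if a role matches."""
    def check(claims: dict) -> dict:
        if not set(allowed_roles) & set(claims.get("roles", [])):
            raise Forbidden(f"Requires one of: {', '.join(allowed_roles)}")
        return claims
    return check


editor_or_admin = require_role("Admin", "Editor")

# Same user, two validation paths: only the _source marker differs.
jwt_claims = {"oid": "abc-123", "roles": ["Admin"], "_source": "jwt"}
gateway_claims = {"oid": "abc-123", "roles": ["Admin"],
                  "_source": "gateway_headers"}

checked_jwt = editor_or_admin(jwt_claims)
checked_gateway = editor_or_admin(gateway_claims)

try:
    editor_or_admin({"oid": "def-456", "roles": ["Viewer"]})
    viewer_allowed = True
except Forbidden:
    viewer_allowed = False
```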
Common Pitfalls#
1. Forgetting to Set TRUST_GATEWAY in Container Apps#
If TRUST_GATEWAY is missing or false in the cloud, backends will try to validate JWTs that APIM already stripped or modified. Make sure Terraform sets TRUST_GATEWAY=true in the Container App environment variables.
2. Header Spoofing#
When TRUST_GATEWAY=true, backends trust whatever is in the X-User-* headers. If someone can bypass APIM and call backends directly, they can forge these headers. Ensure:
- Container Apps are only accessible through APIM (use internal ingress or network restrictions)
- Or keep both gateway mode and local JWT validation as a fallback
3. APIM Developer Tier Limitations#
The Developer tier is not backed by an SLA and shouldn't be used in production. For production workloads, use Standard or Premium tier. Developer tier is perfect for learning and testing.
4. Policy Template Variables#
The APIM policy XML uses ${tenant_id} and ${audience} — these are Terraform template variables, not APIM expressions. If you edit the XML manually, replace them with actual values. The @(...) syntax is APIM's C# expression language.
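The two syntaxes can be told apart with a small experiment. The sketch below uses Python's `string.Template` purely as a stand-in for Terraform's `templatefile()` (the real function has a richer syntax); the GUIDs are placeholders and the policy fragment is abbreviated.

```python
from string import Template

policy_fragment = (
    "<audience>${audience}</audience>\n"
    "<issuer>https://sts.windows.net/${tenant_id}/</issuer>\n"
    '<value>@(((Jwt)context.Variables["jwt"]).Subject)</value>'
)

# Deploy-time substitution: only the ${...} placeholders are replaced.
rendered = Template(policy_fragment).substitute(
    tenant_id="00000000-0000-0000-0000-000000000000",
    audience="api://11111111-1111-1111-1111-111111111111",
)

# The @(...) expression is untouched; APIM evaluates it per request.
apim_expression = '@(((Jwt)context.Variables["jwt"]).Subject)'
expression_survives = apim_expression in rendered
placeholders_left = "${" in rendered
```

In other words, `${...}` is resolved once when the policy is deployed, while `@(...)` stays in the deployed XML and runs on every request.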
5. S2S Calls Bypass APIM#
Service-to-service calls (OBO, Client Credentials) go directly to Container App FQDNs, not through APIM. This means the downstream services still need their own token validation for S2S calls (the _validate_jwt path in dual-mode auth).
Cost Considerations#
| Resource | Approximate Cost |
|---|---|
| APIM Developer tier | ~$50/month |
| 3 Container Apps (0.25 CPU, 0.5 GB) | ~$0.07/hr each when active |
| ACR Basic | ~$5/month |
| Container Apps at zero replicas | $0 when idle |
Run ./cleanup.sh when you're done testing to avoid charges.
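For a rough monthly figure, the table's numbers can be combined as below. This is back-of-the-envelope arithmetic using only the approximate prices listed above; the function name and the 20-hour usage figure are assumptions, and actual Azure pricing varies by region and over time.

```python
# Approximate prices taken from the table above (USD).
APIM_DEVELOPER_PER_MONTH = 50.0
ACR_BASIC_PER_MONTH = 5.0
CONTAINER_APP_PER_ACTIVE_HOUR = 0.07


def monthly_estimate(active_hours_per_app: float, apps: int = 3) -> float:
    """Fixed monthly costs plus compute for the hours each app is active."""
    compute = apps * active_hours_per_app * CONTAINER_APP_PER_ACTIVE_HOUR
    return round(APIM_DEVELOPER_PER_MONTH + ACR_BASIC_PER_MONTH + compute, 2)


idle_month = monthly_estimate(0)      # apps scaled to zero all month
testing_month = monthly_estimate(20)  # each app active about 20 hours
```

The fixed APIM and ACR charges dominate at this scale, which is why tearing the environment down matters more than tuning Container App usage.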