Azure Auth Series — Blog 7: API Gateway Authentication

Centralize JWT validation, rate limiting, and multi-service routing with Azure API Management. One APIM gateway protects three microservices — backends trust pre-validated claims from headers instead of verifying tokens themselves.

In Blog 6, every service validated its own JWTs. The Task API fetched JWKS, the Notification Service fetched JWKS, the Audit Service fetched JWKS — three separate HTTP calls to Azure AD on every request. That works, but it doesn't scale. What if you have ten services? Twenty?

This blog puts Azure API Management (APIM) in front of all three services. APIM validates the JWT once, extracts the claims, and forwards them as trusted HTTP headers. Backends skip token validation entirely and just read the headers. One gateway, one place to change auth rules, one place to add rate limiting.

Source code: github.com/MinhQuanBuiSco/Azure/.../07_api_gateway


The Azure Auth Series#

| Blog | Topic | What You'll Learn |
| --- | --- | --- |
| 1. Basic Login | Frontend auth | Sign in with Microsoft Entra ID |
| 2. Protected API | Backend auth | Build a FastAPI backend that validates tokens |
| 3. RBAC | Authorization | Control access based on user roles |
| 4. Managed Identity | Zero secrets | Deploy to Azure without storing credentials |
| 5. Multi-Tenant | Organizations | Let users from any org sign in |
| 6. Service-to-Service | OBO + Client Credentials | Authenticate services to each other |
| 7. API Gateway | You are here | Centralize auth with Azure API Management |

What We're Building#

Architecture#

Blog 6 (no gateway):
  Frontend → Task API (validate JWT)
           → Notification Svc (validate JWT)
           → Audit Svc (validate JWT)
  Each service fetches JWKS independently

Blog 7 (with APIM gateway):
  Frontend → APIM Gateway → Task API (trust headers)
                           → Notification Svc (trust headers)
                           → Audit Svc (trust headers)
  APIM validates JWT once, forwards claims as headers

What Changed from Blog 6#

| Aspect | Blog 6 | Blog 7 |
| --- | --- | --- |
| JWT validation | Every service validates independently | APIM validates once at the gateway |
| JWKS fetching | 3 services × N requests | APIM caches JWKS, services skip |
| Rate limiting | None | 100 req/min per user at gateway |
| Auth rule location | Spread across 3 auth.py files | Centralized in APIM XML policies |
| Backend auth mode | Always validate JWT | Dual-mode: TRUST_GATEWAY flag |
| Frontend API URL | http://localhost:8000 | APIM gateway URL |
| Infrastructure | Local dev only | Terraform: APIM + Container Apps + ACR |

The App#

[Screenshot: API Gateway landing page]

The frontend looks the same as Blog 6 — sign in, manage tasks, see notification and audit badges. But the request path is different: everything goes through APIM first.

[Screenshot: API Gateway dashboard]


Step 1: The Problem — Duplicated Auth Logic#

In Blog 6, every service had its own auth.py that did the same thing:

1. Extract Bearer token from Authorization header
2. Decode the JWT header to get the key ID (kid)
3. Fetch JWKS from login.microsoftonline.com/{tenant_id}
4. Find the matching RSA key
5. Verify signature, audience, issuer
6. Return claims

Three services means three copies of this logic. If you want to change the allowed issuers or add a new audience, you update three files. If you want to add rate limiting, you add it to three main.py files.

An API gateway solves this by doing steps 1–6 once and passing the verified claims downstream.
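
Steps 1 and 2 need neither network access nor cryptography, so they are easy to sketch with the standard library alone. The helper names below (bearer_token, jwt_kid) are illustrative and not taken from the series' auth.py, and signature verification (steps 3 to 5) is deliberately left out.

```python
import base64
import json


def bearer_token(authorization: str) -> str:
    """Step 1: pull the raw token out of an Authorization header value."""
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        raise ValueError("Missing Bearer token")
    return token.strip()


def jwt_kid(token: str) -> str:
    """Step 2: decode the (unverified) JWT header and return its key ID."""
    header_b64 = token.split(".")[0]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    header = json.loads(base64.urlsafe_b64decode(padded))
    return header["kid"]


def _b64url(data: dict) -> str:
    """Demo helper: encode a dict as an unpadded base64url segment."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Demo token with a fake signature; it is NOT a valid Entra ID token.
demo_token = ".".join([
    _b64url({"alg": "RS256", "kid": "demo-key-1"}),
    _b64url({"sub": "user-1"}),
    "signature",
])
demo_kid = jwt_kid(bearer_token(f"Bearer {demo_token}"))
```

The kid read here is untrusted until the signature has been checked against the matching JWKS key, which is exactly the work the gateway takes over.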


Step 2: APIM Policies — The Core Innovation#

APIM uses XML policies to process every request before it reaches the backend. Each API gets its own policy file.

Task API Policy (task-api.xml)#

<policies>
  <inbound>
    <base />
    <!-- CORS for frontend -->
    <cors allow-credentials="true">
      <allowed-origins>
        <origin>http://localhost:3000</origin>
      </allowed-origins>
      <allowed-methods preflight-result-max-age="300">
        <method>*</method>
      </allowed-methods>
      <allowed-headers>
        <header>*</header>
      </allowed-headers>
    </cors>

    <!-- Rate limit: 100 req/min per user (JWT sub) or IP -->
    <rate-limit-by-key
      calls="100"
      renewal-period="60"
      counter-key="@(
        context.Request.Headers
          .GetValueOrDefault("Authorization","")
          .AsJwt()?.Subject
        ?? context.Request.IpAddress
      )" />

    <!-- Validate JWT — store parsed token in "jwt" variable -->
    <validate-jwt
      header-name="Authorization"
      require-scheme="Bearer"
      failed-validation-httpcode="401"
      output-token-variable-name="jwt">
      <openid-config
        url="https://login.microsoftonline.com/${tenant_id}/v2.0/.well-known/openid-configuration" />
      <audiences>
        <audience>${audience}</audience>
      </audiences>
      <issuers>
        <issuer>https://sts.windows.net/${tenant_id}/</issuer>
      </issuers>
    </validate-jwt>

    <!-- Extract claims → backend headers -->
    <set-header name="X-User-OID" exists-action="override">
      <value>@(((Jwt)context.Variables["jwt"])
        .Claims.GetValueOrDefault("oid",
          new [] {""}).FirstOrDefault())</value>
    </set-header>
    <set-header name="X-User-Name" exists-action="override">
      <value>@(((Jwt)context.Variables["jwt"])
        .Claims.GetValueOrDefault("name",
          new [] {""}).FirstOrDefault())</value>
    </set-header>
    <set-header name="X-User-Email" exists-action="override">
      <value>@(((Jwt)context.Variables["jwt"])
        .Claims.GetValueOrDefault("preferred_username",
          new [] {""}).FirstOrDefault())</value>
    </set-header>
    <set-header name="X-Tenant-ID" exists-action="override">
      <value>@(((Jwt)context.Variables["jwt"])
        .Claims.GetValueOrDefault("tid",
          new [] {""}).FirstOrDefault())</value>
    </set-header>
    <set-header name="X-User-Roles" exists-action="override">
      <value>@(String.Join(",",
        ((Jwt)context.Variables["jwt"])
          .Claims.GetValueOrDefault("roles",
            new string[0])))</value>
    </set-header>
  </inbound>
  <backend><base /></backend>
  <outbound><base /></outbound>
  <on-error><base /></on-error>
</policies>

Three things happen on every request:

  1. Rate limiting: 100 calls per minute, keyed by the JWT sub claim, with the caller's IP address as the fallback for requests that carry no parsable token. This policy runs before validate-jwt, so the sub it reads comes from a token that has not been verified yet.
  2. JWT validation: fetches the OpenID configuration, verifies the signature, and checks audience and issuer. The parsed token is stored in the jwt variable.
  3. Claims forwarding: extracts oid, name, preferred_username, tid, and roles from the validated JWT and sets them as HTTP headers.
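
To make the rate-limiting step concrete, here is a minimal fixed-window counter in Python. It is a simplified model for illustration only: the class and function names are hypothetical, and APIM's internal counting algorithm may differ from a plain fixed window.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class FixedWindowLimiter:
    """Allow at most `calls` requests per `renewal_period` seconds per key."""

    def __init__(self, calls: int = 100, renewal_period: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.calls = calls
        self.renewal_period = renewal_period
        self.clock = clock
        # key -> (window start time, requests counted in that window)
        self._windows: Dict[str, Tuple[float, int]] = {}

    def allow(self, key: str) -> bool:
        now = self.clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.renewal_period:
            start, count = now, 0  # window expired: start a new one
        if count >= self.calls:
            return False  # over the limit; a gateway would answer 429
        self._windows[key] = (start, count + 1)
        return True


def counter_key(jwt_subject: Optional[str], ip_address: str) -> str:
    """Mirror the policy's `Subject ?? IpAddress` fallback."""
    return jwt_subject if jwt_subject is not None else ip_address
```

Keying by subject rather than by IP means users behind a shared NAT address do not consume each other's quota, while anonymous callers are still bounded per address.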

Audit API Policy — Role Enforcement at the Gateway#

The audit API policy adds an extra check: the token must contain the AuditLog.Write application role.

<validate-jwt ...
  output-token-variable-name="jwt">
  <openid-config url="..." />
  <audiences>
    <audience>${audience}</audience>
  </audiences>
  <issuers>
    <issuer>https://sts.windows.net/${tenant_id}/</issuer>
  </issuers>
  <!-- Require AuditLog.Write application role -->
  <required-claims>
    <claim name="roles" match="any">
      <value>AuditLog.Write</value>
    </claim>
  </required-claims>
</validate-jwt>

If the token doesn't have AuditLog.Write, APIM returns 401 before the request even reaches the Audit Service. This is the same check that audit-service/auth.py did in Blog 6, but now it happens at the gateway.
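
The match attribute decides how the listed values are compared with the token's claim. The sketch below models that comparison in Python, assuming the documented semantics ("any" needs at least one listed value, "all" needs every listed value); claim_satisfied is a hypothetical helper, not an APIM API.

```python
from typing import Iterable


def claim_satisfied(token_values: Iterable[str],
                    required_values: Iterable[str],
                    match: str = "all") -> bool:
    """Return True if the token's claim values meet the requirement."""
    present = set(token_values)
    required = set(required_values)
    if match == "any":
        return bool(present & required)  # at least one required value present
    if match == "all":
        return required <= present       # every required value present
    raise ValueError(f"unknown match mode: {match!r}")


# An app-only token carrying two roles passes the audit policy's check.
audit_ok = claim_satisfied(["AuditLog.Write", "Other.Role"],
                           ["AuditLog.Write"], match="any")
```

With a single required value, as in the audit policy, "any" and "all" behave identically; the distinction only matters once more than one value is listed.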


Step 3: The TRUST_GATEWAY Pattern#

The key architectural decision in Blog 7 is the TRUST_GATEWAY flag. Each service's config.py adds one new variable:

# config.py (new in Blog 7)
import logging
import os

logger = logging.getLogger(__name__)

TRUST_GATEWAY = os.getenv(
    "TRUST_GATEWAY", "false"
).lower() == "true"

if TRUST_GATEWAY:
    logger.info(
        "TRUST_GATEWAY=true — accepting claims "
        "from APIM headers (skip JWT validation)"
    )
else:
    logger.info(
        "TRUST_GATEWAY=false — validating JWT "
        "tokens locally"
    )

And each service's auth.py dispatches based on it:

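# auth.py: dual-mode validation
from fastapi import HTTPException, Request, status
from fastapi.security import HTTPAuthorizationCredentials

from config import TRUST_GATEWAY
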
async def validate_token(request: Request) -> dict:
    if TRUST_GATEWAY:
        return _extract_claims_from_headers(request)

    # Fall back to JWT validation (Blog 6 behavior)
    auth_header = request.headers.get("Authorization", "")
    if not auth_header.startswith("Bearer "):
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Missing Bearer token",
        )
    credentials = HTTPAuthorizationCredentials(
        scheme="Bearer", credentials=auth_header[7:]
    )
    return await _validate_jwt(credentials)

Extracting Claims from Headers#

When TRUST_GATEWAY=true, the backend reads APIM-injected headers instead of parsing JWT:

def _extract_claims_from_headers(request: Request) -> dict:
    """Extract user claims from APIM-injected headers."""
    oid = request.headers.get("X-User-OID", "")
    name = request.headers.get("X-User-Name", "")
    email = request.headers.get("X-User-Email", "")
    tenant_id = request.headers.get("X-Tenant-ID", "")
    roles_header = request.headers.get("X-User-Roles", "")

    if not oid or not tenant_id:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Missing required gateway headers",
        )

    roles = [
        r.strip()
        for r in roles_header.split(",")
        if r.strip()
    ]

    return {
        "oid": oid,
        "name": name,
        "preferred_username": email,
        "tid": tenant_id,
        "roles": roles,
        "scp": "access_as_user",
        "_source": "gateway_headers",
    }

The result is the same claims dictionary that _validate_jwt() returns — the rest of the application code doesn't know or care which mode was used.
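
One way to convince yourself of that equivalence is a round trip: turn a claims dictionary into headers the way the policy does, then parse the headers back. The sketch below uses plain dictionaries instead of a FastAPI Request, and claims_to_headers is a hypothetical stand-in for the APIM policy rather than code from the series.

```python
from typing import Dict, List


def claims_to_headers(claims: dict) -> Dict[str, str]:
    """Stand-in for the gateway policy: copy selected claims into headers."""
    return {
        "X-User-OID": claims.get("oid", ""),
        "X-User-Name": claims.get("name", ""),
        "X-User-Email": claims.get("preferred_username", ""),
        "X-Tenant-ID": claims.get("tid", ""),
        "X-User-Roles": ",".join(claims.get("roles", [])),
    }


def headers_to_claims(headers: Dict[str, str]) -> dict:
    """Backend side: rebuild the claims dictionary from those headers."""
    roles: List[str] = [
        r.strip()
        for r in headers.get("X-User-Roles", "").split(",")
        if r.strip()
    ]
    return {
        "oid": headers.get("X-User-OID", ""),
        "name": headers.get("X-User-Name", ""),
        "preferred_username": headers.get("X-User-Email", ""),
        "tid": headers.get("X-Tenant-ID", ""),
        "roles": roles,
    }


token_claims = {
    "oid": "abc-123",
    "name": "Test Admin",
    "preferred_username": "admin@contoso.com",
    "tid": "tenant-1",
    "roles": ["Admin", "Editor"],
}
round_tripped = headers_to_claims(claims_to_headers(token_claims))
```

The round trip is lossless only for the five forwarded claims, and only while role names contain no commas, since the roles travel as one comma-separated header.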

Why Two Modes?#

| Mode | When | Why |
| --- | --- | --- |
| TRUST_GATEWAY=false | Local development | No APIM needed, validates JWT directly |
| TRUST_GATEWAY=true | Cloud deployment | APIM already validated, skip redundant check |

This means docker compose up still works locally without deploying APIM, while the cloud deployment gets the performance benefit of centralized validation.
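
Because the flag is compared against the literal string "true", only that word (in any letter case) switches a service into gateway mode. The small sketch below repeats the comparison from config.py in a hypothetical helper to show which values count.

```python
def parse_trust_gateway(raw: str = "false") -> bool:
    """Same comparison as config.py: only the word "true" enables gateway mode."""
    return raw.lower() == "true"


# Values an operator might plausibly set, and how each one is interpreted.
results = {value: parse_trust_gateway(value)
           for value in ["true", "TRUE", "false", "1", "yes", " true"]}
```

Values such as "1", "yes", or a "true" with stray whitespace all leave the service in local validation mode, which is the safer failure direction.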


Step 4: Audit Service — Role Enforcement in Both Modes#

The Audit Service has a special requirement: it only accepts tokens with the AuditLog.Write application role. In Blog 6, this was checked in auth.py. In Blog 7, it's checked twice — at the gateway (APIM policy) and at the backend (defense in depth).

# audit-service/auth.py
def _extract_claims_from_headers(request: Request) -> dict:
    """Extract claims from APIM headers for app-only tokens."""
    oid = request.headers.get("X-User-OID", "")
    tenant_id = request.headers.get("X-Tenant-ID", "")
    roles_header = request.headers.get("X-User-Roles", "")

    if not tenant_id:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Missing required gateway header",
        )

    roles = [
        r.strip()
        for r in roles_header.split(",")
        if r.strip()
    ]

    # Defense in depth: check role even in gateway mode
    if "AuditLog.Write" not in roles:
        raise HTTPException(
            status_code=status.HTTP_403_FORBIDDEN,
            detail="Missing required application role: "
                   "AuditLog.Write",
        )

    return {
        "oid": oid,
        "tid": tenant_id,
        "roles": roles,
        "azp": oid,
        "_source": "gateway_headers",
    }

Even though APIM already checked the role in its <required-claims> block, the backend checks again. If APIM is misconfigured or bypassed, the backend still rejects unauthorized calls.


Step 5: Infrastructure as Code — Terraform#

Blog 7's Terraform provisions four modules:

infra/
├── main.tf                    # Root orchestrator
├── variables.tf               # ACR name, APIM name, client IDs
├── outputs.tf                 # APIM gateway URL, service FQDNs
└── modules/
    ├── resource_group/        # Azure Resource Group
    ├── container_registry/    # ACR (Basic SKU)
    ├── container_apps/        # 3 Container Apps
    └── api_management/        # APIM + 3 APIs + policies
        └── policies/
            ├── task-api.xml
            ├── notification.xml
            └── audit.xml

Root Module#

# infra/main.tf
module "resource_group" {
  source   = "./modules/resource_group"
  name     = var.resource_group_name
  location = var.location
  tags     = local.tags
}

module "container_registry" {
  source              = "./modules/container_registry"
  name                = var.acr_name
  resource_group_name = module.resource_group.name
  location            = module.resource_group.location
  tags                = local.tags
}

module "container_apps" {
  source              = "./modules/container_apps"
  project_name        = var.project_name
  resource_group_name = module.resource_group.name
  location            = module.resource_group.location

  acr_login_server   = module.container_registry.login_server
  acr_admin_username = module.container_registry.admin_username
  acr_admin_password = module.container_registry.admin_password

  tenant_id              = var.tenant_id
  task_api_client_id     = var.task_api_client_id
  task_api_client_secret = var.task_api_client_secret
  notification_client_id = var.notification_client_id
  audit_client_id        = var.audit_client_id
  allowed_origins        = var.allowed_origins
  tags                   = local.tags
}

module "api_management" {
  source              = "./modules/api_management"
  apim_name           = var.apim_name
  resource_group_name = module.resource_group.name
  location            = module.resource_group.location
  publisher_email     = var.publisher_email

  tenant_id         = var.tenant_id
  task_api_fqdn     = module.container_apps.task_api_fqdn
  notification_fqdn = module.container_apps.notification_fqdn
  audit_fqdn        = module.container_apps.audit_fqdn

  task_api_audience     = "api://${var.task_api_client_id}"
  notification_audience = "api://${var.notification_client_id}"
  audit_audience        = "api://${var.audit_client_id}"
  tags                  = local.tags
}

Container Apps — TRUST_GATEWAY=true#

Each Container App is configured with TRUST_GATEWAY=true so it accepts APIM-forwarded headers:

# modules/container_apps/main.tf (Task API excerpt)
resource "azurerm_container_app" "task_api" {
  name                         = "task-api"
  container_app_environment_id = azurerm_container_app_environment.this.id
  resource_group_name          = var.resource_group_name
  revision_mode                = "Single"

  template {
    min_replicas = 0
    max_replicas = 1

    container {
      name   = "task-api"
      image  = "mcr.microsoft.com/k8se/quickstart:latest"
      cpu    = 0.25
      memory = "0.5Gi"

      env {
        name  = "TRUST_GATEWAY"
        value = "true"
      }
      env {
        name  = "AZURE_HOME_TENANT_ID"
        value = var.tenant_id
      }
      env {
        name  = "AZURE_API_CLIENT_ID"
        value = var.task_api_client_id
      }
      env {
        name        = "AZURE_API_CLIENT_SECRET"
        secret_name = "api-client-secret"
      }
      env {
        name  = "NOTIFICATION_URL"
        value = "https://${azurerm_container_app.notification.ingress[0].fqdn}"
      }
      env {
        name  = "AUDIT_URL"
        value = "https://${azurerm_container_app.audit.ingress[0].fqdn}"
      }
      # ... more env vars
    }
  }

  ingress {
    external_enabled = true
    target_port      = 8000
    transport        = "http"
    traffic_weight {
      latest_revision = true
      percentage      = 100
    }
  }

  secret {
    name  = "api-client-secret"
    value = var.task_api_client_secret
  }
  # ... registry config
}

Setting min_replicas = 0 lets Container Apps scale to zero when idle, so you pay for compute only while requests are being served.

APIM Module — Three APIs, Three Policies#

# modules/api_management/main.tf
resource "azurerm_api_management" "this" {
  name                = var.apim_name
  location            = var.location
  resource_group_name = var.resource_group_name
  publisher_name      = "Azure Auth Series"
  publisher_email     = var.publisher_email
  sku_name            = "Developer_1"
}

# Task API — path: /api
resource "azurerm_api_management_api" "task_api" {
  name                  = "task-api"
  api_management_name   = azurerm_api_management.this.name
  resource_group_name   = var.resource_group_name
  revision              = "1"
  display_name          = "Task API"
  path                  = "api"
  protocols             = ["https"]
  service_url           = "https://${var.task_api_fqdn}/api"
  subscription_required = false
}

# Notification API — path: /notification
resource "azurerm_api_management_api" "notification" {
  # ... similar, path = "notification"
  service_url = "https://${var.notification_fqdn}"
}

# Audit API — path: /auditing
resource "azurerm_api_management_api" "audit" {
  # ... similar, path = "auditing"
  service_url = "https://${var.audit_fqdn}"
}

Each API gets its policy via templatefile(), which injects the tenant_id and audience variables into the XML:

resource "azurerm_api_management_api_policy" "task_api" {
  api_name            = azurerm_api_management_api.task_api.name
  api_management_name = azurerm_api_management.this.name
  resource_group_name = var.resource_group_name

  xml_content = templatefile(
    "${path.module}/policies/task-api.xml",
    {
      tenant_id = var.tenant_id
      audience  = var.task_api_audience
    }
  )
}

Step 6: Request Flow — End to End#

Here's what happens when a user creates a task:

1. User clicks "Add" in the frontend

2. Frontend sends POST to APIM gateway:
   POST https://blog07-apim.azure-api.net/api/tasks
   Authorization: Bearer eyJ0eXAi...

3. APIM processes the request:
   a. Rate limit check (100/min per JWT sub)
   b. Validate JWT (signature, audience, issuer)
   c. Extract claims → set headers:
      X-User-OID: abc-123
      X-User-Name: Test Admin
      X-User-Email: admin@contoso.com
      X-Tenant-ID: 0f485f73-...
      X-User-Roles: Admin

4. APIM routes to Task API Container App:
   POST https://task-api.internal/api/tasks
   (with X-User-* headers)

5. Task API reads headers (TRUST_GATEWAY=true):
   claims = _extract_claims_from_headers(request)
   # No JWKS fetch needed!

6. Task API creates the task, then:
   a. OBO → Notification Service (direct HTTPS)
   b. Client Credentials → Audit Service (direct HTTPS)
   Both S2S calls bypass APIM — they go directly
   to the Container App FQDNs.

Notice that service-to-service calls (step 6) bypass APIM. The Task API calls Notification and Audit services directly because:

  • Those calls use different tokens (OBO and Client Credentials), not the user's original token
  • Internal calls don't need gateway routing or rate limiting
  • Each downstream service still validates its own token (the S2S tokens, not the original user token)

Step 7: Deployment#

The setup.sh script automates everything:

./setup.sh

It runs five phases:

Phase 1: Azure AD Setup
  Create 4 app registrations (Task API, Notification,
    Audit, SPA Frontend)
  Define scopes and app roles
  Create 3 test users with role assignments

Phase 2: Terraform
  terraform init && terraform apply
  Provisions: RG + ACR + 3 Container Apps + APIM
  (APIM Developer tier takes ~30-45 minutes)

Phase 3: Docker Build + Push
  Build 3 images with --platform linux/amd64
  Push to ACR

Phase 4: Update Container Apps
  az containerapp update → point to real Docker images

Phase 5: Write .env Files
  task-api/.env: TRUST_GATEWAY=false (for local dev)
  frontend/.env.local: API URL = APIM gateway URL

After deployment, your frontend points to the APIM gateway:

# frontend/.env.local
NEXT_PUBLIC_API_URL=https://blog07-apim.azure-api.net

All API calls from the frontend go through https://blog07-apim.azure-api.net/api/* — APIM handles auth, rate limiting, and routing to the correct Container App.
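
The routing part can be pictured as a prefix lookup. The sketch below assumes that APIM replaces the API's path prefix with that API's service_url and appends the rest of the request path; the hostnames are placeholders, and backend_url is a hypothetical helper rather than anything APIM exposes.

```python
from typing import Dict

# API path prefix -> backend service_url (hostnames are placeholders).
ROUTES: Dict[str, str] = {
    "api": "https://task-api.example/api",
    "notification": "https://notification.example",
    "auditing": "https://audit.example",
}


def backend_url(request_path: str) -> str:
    """Map a gateway request path to the backend URL it is forwarded to."""
    prefix, _, rest = request_path.lstrip("/").partition("/")
    if prefix not in ROUTES:
        raise LookupError(f"no API registered for /{prefix}")
    base = ROUTES[prefix]
    return f"{base}/{rest}" if rest else base


task_url = backend_url("/api/tasks")
```

Note that the Task API keeps its /api prefix only because its service_url ends in /api; the other two APIs drop their gateway prefix before the request reaches the backend.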

Cleanup#

./cleanup.sh

Destroys all Azure resources (Terraform + AD apps + test users) to avoid charges.


Step 8: Local Development#

For local development, set TRUST_GATEWAY=false in your .env files (the default). This makes each service validate JWTs directly — same as Blog 6.

# task-api/.env
TRUST_GATEWAY=false
AZURE_HOME_TENANT_ID=your-tenant-id
AZURE_API_CLIENT_ID=your-api-client-id
# ... rest of config

Start services locally:

# Terminal 1
cd task-api && uvicorn main:app --port 8000

# Terminal 2
cd notification-service && uvicorn main:app --port 8001

# Terminal 3
cd audit-service && uvicorn main:app --port 8002

# Terminal 4
cd frontend && npm run dev

The frontend points to http://localhost:8000 (no APIM), and every service validates tokens independently. When you deploy to Azure, flip TRUST_GATEWAY=true and point the frontend to the APIM URL.


How the Gateway Changes Security#

Without Gateway (Blog 6)#

Frontend → Task API
  ✓ Validates JWT (JWKS fetch)
  ✓ Checks roles
  ✗ No rate limiting
  ✗ Auth logic duplicated in 3 services

With Gateway (Blog 7)#

Frontend → APIM
  ✓ Validates JWT (cached JWKS)
  ✓ Rate limits (100/min per user)
  ✓ Forwards claims as headers
         → Task API
           ✓ Reads headers (fast, no HTTP call)
           ✓ Checks roles (same logic, same result)

The backends still check roles — require_role("Admin", "Editor") works identically in both modes. The difference is where token validation happens.
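
The role check is independent of where the claims came from because it only looks at the roles list. The series' require_role implementation is not shown here, so the sketch below uses a hypothetical has_any_role helper to illustrate the idea.

```python
from typing import Iterable


def has_any_role(claims: dict, allowed: Iterable[str]) -> bool:
    """True if the caller holds at least one of the allowed roles."""
    return not set(claims.get("roles", [])).isdisjoint(allowed)


# Claims as produced by local JWT validation and by gateway headers.
from_jwt = {"oid": "abc-123", "roles": ["Editor"]}
from_gateway = {"oid": "abc-123", "roles": ["Editor"],
                "_source": "gateway_headers"}

same_decision = (has_any_role(from_jwt, ["Admin", "Editor"])
                 == has_any_role(from_gateway, ["Admin", "Editor"]))
```

In a FastAPI service a check like this would typically sit inside a dependency that raises 403 when it returns False.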


Common Pitfalls#

1. Forgetting to Set TRUST_GATEWAY in Container Apps#

If TRUST_GATEWAY is missing or false in the cloud, backends ignore the X-User-* headers and validate every JWT again themselves, which brings back the per-service JWKS fetches the gateway was meant to remove. Make sure Terraform sets TRUST_GATEWAY=true in the Container App environment variables.

2. Header Spoofing#

When TRUST_GATEWAY=true, backends trust whatever is in the X-User-* headers. If someone can bypass APIM and call backends directly, they can forge these headers. Ensure:

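  • Container Apps are only accessible through APIM (use internal ingress or network restrictions). The Terraform excerpt in Step 5 sets external_enabled = true, so the Container App FQDNs stay publicly reachable until you restrict them.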
  • Or keep both gateway mode and local JWT validation as a fallback

3. APIM Developer Tier Limitations#

The Developer tier is not backed by an SLA and shouldn't be used in production. For production workloads, use Standard or Premium tier. Developer tier is perfect for learning and testing.

4. Policy Template Variables#

The APIM policy XML uses ${tenant_id} and ${audience} — these are Terraform template variables, not APIM expressions. If you edit the XML manually, replace them with actual values. The @(...) syntax is APIM's C# expression language.
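
Python's string.Template happens to use the same ${name} placeholder syntax, so it makes a convenient stand-in for seeing which parts of a policy get substituted. This is an analogy only, not Terraform's template engine, and the audience value is a placeholder.

```python
from string import Template

# One Terraform-style placeholder and one APIM expression, side by side.
policy_fragment = (
    "<audience>${audience}</audience>\n"
    "<value>@(context.Request.IpAddress)</value>"
)

# Substitution fills ${audience} and leaves the @(...) expression untouched.
rendered = Template(policy_fragment).substitute(audience="api://1234")
```

After rendering, the first line carries the literal audience while the second still contains the @(...) expression, which APIM evaluates per request at runtime.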

5. S2S Calls Bypass APIM#

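Service-to-service calls (OBO, Client Credentials) go directly to Container App FQDNs, not through APIM. This means the downstream services still need their own token validation for S2S calls (the _validate_jwt path in dual-mode auth). Note that validate_token as shown in Step 3 returns header claims whenever TRUST_GATEWAY=true, so check that a service receiving direct S2S calls actually reaches _validate_jwt for them.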


Cost Considerations#

| Resource | Approximate Cost |
| --- | --- |
| APIM Developer tier | ~$50/month |
| 3 Container Apps (0.25 CPU, 0.5 GB) | ~$0.07/hr each when active |
| ACR Basic | ~$5/month |
| Container Apps at zero replicas | $0 when idle |

Run ./cleanup.sh when you're done testing to avoid charges.
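
As a rough worked example, the figures above combine into a fixed monthly part plus a usage-dependent part. The sketch below uses only the approximate prices from the table; the active-hours input is an assumption you would replace with your own usage.

```python
# Approximate prices taken from the table above (USD).
APIM_DEVELOPER_PER_MONTH = 50.0
ACR_BASIC_PER_MONTH = 5.0
CONTAINER_APP_PER_ACTIVE_HOUR = 0.07
CONTAINER_APP_COUNT = 3


def estimate_monthly_cost(active_hours_per_app: float) -> float:
    """Fixed APIM + ACR cost plus Container Apps cost for the active hours."""
    apps = (CONTAINER_APP_COUNT
            * CONTAINER_APP_PER_ACTIVE_HOUR
            * active_hours_per_app)
    return round(APIM_DEVELOPER_PER_MONTH + ACR_BASIC_PER_MONTH + apps, 2)


idle_month = estimate_monthly_cost(0)       # services scaled to zero all month
light_testing = estimate_monthly_cost(10)   # 10 active hours per service
```

With these numbers an idle month costs about $55 and ten active hours per service adds roughly $2, so the APIM instance dominates the bill during testing.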


Resources#