MS Entra Managed Identity in Kubernetes

Introduction

One of the most common mistakes I see teams make when running workloads on Azure Kubernetes Service (AKS) is storing credentials in Kubernetes Secrets to authenticate with Azure services: hard-coded storage account keys, client secrets for app registrations, or connection strings checked in alongside manifests. These approaches create operational pain, because secrets expire, have to be rotated by hand, and leak through logs or container image layers.

When I migrated our multi-tenant SaaS platform from key-based auth to managed identity, the difference was immediate. No more secret rotations causing 3 AM pages. No more emergency credential revocations after repository exposure incidents. No more "which team owns that service principal?" mysteries.

Microsoft Entra Managed Identity is the right default for Azure workloads. This article covers both system-assigned and user-assigned managed identities, explains the modern AKS Workload Identity federation model, and provides production-ready YAML and code examples to get you running.

What is Managed Identity?

Managed Identity is an Azure feature that provides an automatically managed identity in Microsoft Entra ID for applications to use when connecting to Azure resources. The Azure platform handles credential management — you never see, store, or rotate a password or certificate.

Benefits

| Feature | Key-Based Auth | Managed Identity |
| --- | --- | --- |
| Credential rotation | Manual, error-prone | Automatic |
| Secret storage | Required | Not needed |
| Audit trail | Limited | Full Entra audit logs |
| Least privilege | Hard to enforce | Enforced via RBAC |
| Multi-environment | Different secrets per env | Same identity, different roles |
| Compliance | High overhead | Built-in |


System-Assigned vs User-Assigned Managed Identity

Understanding the difference is critical to choosing the right pattern.

System-Assigned Managed Identity

  • Lifecycle tied to the resource: Created when you enable it on a specific Azure resource (e.g., an AKS cluster or a virtual machine scale set), deleted when that resource is deleted.

  • One-to-one mapping: One system-assigned identity per resource.

  • Good for: Simple, single-service scenarios. AKS node-level operations (pulling images from ACR, writing to Azure Monitor).

User-Assigned Managed Identity

  • Standalone resource: Created independently, assigned to one or many resources.

  • Reusable: Share the same identity across multiple pods, node pools, or even different Azure resources.

  • Lifecycle independent: Deleting an AKS cluster does not delete the managed identity.

  • Good for: Application workloads, multi-pod scenarios, shared identity patterns, and disaster recovery.

When to Use Which

| Scenario | Recommendation |
| --- | --- |
| AKS pulls images from ACR | System-assigned on node pool |
| App reads from Azure Blob Storage | User-assigned via Workload Identity |
| Multi-pod shared access | User-assigned via Workload Identity |
| Temporary / dev clusters | System-assigned (simpler) |
| Production microservices | User-assigned (portable, auditable) |
| Disaster recovery / cluster migration | User-assigned (survives cluster deletion) |


AKS Workload Identity Architecture

AKS Workload Identity (the successor to AAD Pod Identity) uses OpenID Connect (OIDC) federation, a Kubernetes-native, standards-based approach. Unlike the legacy pod identity solution, it does not need a node-level DaemonSet intercepting metadata-endpoint traffic; the only in-cluster component is a lightweight mutating admission webhook that injects configuration into pods.

Key Components

| Component | Purpose |
| --- | --- |
| OIDC Issuer | AKS exposes a public OIDC endpoint; Entra uses it to validate KSA tokens |
| Federated Identity Credential | Links a Managed Identity to a specific KSA (namespace + name) |
| Mutating Webhook | Injects AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_FEDERATED_TOKEN_FILE into pods |
| Projected Service Account Token | Short-lived Kubernetes token used as the federated credential |
| DefaultAzureCredential | Azure SDK credential chain; picks up injected env vars automatically |
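To make the flow between these components concrete, here is a rough sketch of the token exchange that the Azure SDK performs on your behalf. It only builds the OAuth2 client-credentials request (it never sends it), and it assumes the public-cloud login endpoint; the function names are my own and purely illustrative.

```python
import urllib.parse
import urllib.request

# Assumption: public Azure cloud. Sovereign clouds use a different authority host.
TOKEN_URL_TEMPLATE = "https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token"
ASSERTION_TYPE = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"


def read_federated_token(env):
    """Read the projected service account token from the path the webhook injected."""
    with open(env["AZURE_FEDERATED_TOKEN_FILE"], encoding="utf-8") as fh:
        return fh.read().strip()


def build_token_request(env, scope, federated_token):
    """Build (but do not send) the request that trades a KSA token for an Entra token."""
    form = {
        "grant_type": "client_credentials",
        "client_id": env["AZURE_CLIENT_ID"],
        "scope": scope,
        # The Kubernetes token is presented as a client assertion instead of a secret.
        "client_assertion_type": ASSERTION_TYPE,
        "client_assertion": federated_token,
    }
    url = TOKEN_URL_TEMPLATE.format(tenant=env["AZURE_TENANT_ID"])
    body = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(url, data=body, method="POST")
```

In a pod you would pass os.environ as env and a scope such as https://storage.azure.com/.default. In practice DefaultAzureCredential does all of this for you; the sketch is only meant to show that no secret is involved at any point.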


Setting Up AKS Workload Identity

Prerequisites

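Create AKS Cluster with OIDC and Workload Identity Enabled

Pass both the --enable-oidc-issuer and --enable-workload-identity flags to az aks create when creating the cluster.

Enable Workload Identity on an Existing Cluster

Pass the same two flags to az aks update. Pods that were already running must be recreated before the webhook can mutate them.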


System-Assigned Managed Identity on AKS

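System-assigned identity on AKS is primarily used at the node pool / kubelet level, allowing nodes to pull container images from Azure Container Registry (ACR) or write logs to Azure Monitor without storing credentials. Strictly speaking, the kubelet identity is a user-assigned identity that AKS creates and manages for you in the node resource group, while the system-assigned identity proper belongs to the cluster itself. This section treats both as platform-managed, node-level identity.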

Attach ACR to AKS Using System-Assigned Identity (Kubelet Identity)

Running az aks update with the --attach-acr flag automatically assigns the AcrPull role on the ACR to the AKS kubelet's managed identity. Your pods can then pull images from that ACR without imagePullSecrets.

System-Assigned Identity for Node-Level Azure Operations

The kubelet identity can also be granted access to other resources through ordinary role assignments.

Important: System-assigned identity at the node level gives ALL pods on that node the same access. For application workloads requiring fine-grained per-pod permissions, always use User-Assigned Managed Identity via Workload Identity instead.


User-Assigned Managed Identity on AKS

This is the recommended approach for application workloads. It uses Workload Identity federation to bind a Kubernetes Service Account to an Azure Managed Identity.

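Step 1: Create User-Assigned Managed Identity

Create the identity with az identity create and note its clientId (needed for the service account annotation) and its principalId (needed for role assignments).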

Step 2: Create Kubernetes Service Account

Or as a YAML file:
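Kubernetes accepts JSON as well as YAML, so one way to show the shape of that manifest is to generate it. This is a minimal sketch: the service account name, namespace, and client ID below are placeholders I made up, and the only part that matters for Workload Identity is the azure.workload.identity/client-id annotation.

```python
import json


def service_account_manifest(name, namespace, client_id):
    """Return a ServiceAccount manifest annotated with the managed identity's client ID."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {
                # Tells the webhook which managed identity this service account maps to.
                "azure.workload.identity/client-id": client_id,
            },
        },
    }


# Placeholder values; substitute your own before applying.
manifest = service_account_manifest(
    name="workload-sa",
    namespace="my-app",
    client_id="00000000-0000-0000-0000-000000000000",
)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped straight into kubectl apply -f -.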

Step 3: Create Federated Identity Credential

This is the critical link — it tells Entra ID to trust tokens issued by this AKS cluster's OIDC endpoint for this specific service account.
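The federated credential boils down to three values: the cluster's OIDC issuer URL, the exact subject, and the audience. Here is a small sketch of how I would assemble and sanity-check them before handing them to az identity federated-credential create or Terraform. The helper name and default credential name are hypothetical; api://AzureADTokenExchange is the audience that Workload Identity expects.

```python
TOKEN_EXCHANGE_AUDIENCE = "api://AzureADTokenExchange"


def federated_credential_params(issuer_url, namespace, service_account,
                                name="aks-federated-credential"):
    """Assemble the issuer/subject/audience triple for a federated identity credential."""
    for label, value in (("namespace", namespace), ("service_account", service_account)):
        if not value:
            raise ValueError(f"{label} must not be empty")
        if "*" in value:
            # The subject has to match one service account exactly.
            raise ValueError(f"{label} must not contain wildcards: {value!r}")
    return {
        "name": name,
        "issuer": issuer_url,
        "subject": f"system:serviceaccount:{namespace}:{service_account}",
        "audiences": [TOKEN_EXCHANGE_AUDIENCE],
    }
```

The issuer URL is whatever your cluster reports as its OIDC issuer; it has to match the token's iss claim character for character, trailing slash included.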

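Step 4: Grant Azure RBAC to the Managed Identity

Grant access with az role assignment create, passing the identity's principal ID as the assignee and the narrowest scope that works.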

Step 5: Deploy Your Pod with Workload Identity
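The pod needs two things: the azure.workload.identity/use label set to "true", and serviceAccountName pointing at the annotated service account. A minimal sketch of such a manifest, generated as JSON, might look like this. The pod name, namespace, and image are placeholders of mine.

```python
import json


def workload_pod_manifest(name, namespace, service_account, image):
    """Return a Pod manifest that opts in to Workload Identity."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # Without this label the webhook leaves the pod untouched.
            "labels": {"azure.workload.identity/use": "true"},
        },
        "spec": {
            "serviceAccountName": service_account,
            "containers": [{"name": name, "image": image}],
        },
    }


# Placeholder values; substitute your own before applying.
pod = workload_pod_manifest(
    name="blob-reader",
    namespace="my-app",
    service_account="workload-sa",
    image="myregistry.azurecr.io/blob-reader:latest",
)
print(json.dumps(pod, indent=2))
```

For a Deployment, the same label goes on the pod template (spec.template.metadata.labels), not on the Deployment's own metadata.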

After the webhook mutates the pod, you can verify the injected environment:
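One way to do that check from inside the container is a few lines of Python. This sketch only looks at the three variables named earlier in this article and confirms that the token file exists; the function name is my own.

```python
import os

REQUIRED_VARS = ("AZURE_CLIENT_ID", "AZURE_TENANT_ID", "AZURE_FEDERATED_TOKEN_FILE")


def check_injected_env(env=None):
    """Return a list of problems with the injected environment (empty means healthy)."""
    env = os.environ if env is None else env
    problems = [f"{var} is not set" for var in REQUIRED_VARS if not env.get(var)]
    token_path = env.get("AZURE_FEDERATED_TOKEN_FILE")
    if token_path and not os.path.isfile(token_path):
        problems.append(f"token file {token_path} does not exist")
    return problems


for problem in check_injected_env():
    print("WARNING:", problem)
```

An empty result means the webhook did its job. If all three variables are missing, the pod label is the first thing I would check.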


Application Code Integration

The beauty of managed identity is that your application code uses DefaultAzureCredential — the same code works locally (via az login) and in AKS (via Workload Identity) without any changes.

Go

Python
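Here is a minimal sketch using the azure-identity and azure-storage-blob packages, assuming the managed identity already holds a blob data role on the storage account. The function name, account URL, and container name are placeholders. I import the SDK inside the function only so the module still loads on machines where the packages are not installed.

```python
def list_blob_names(account_url, container_name):
    """List blob names in a container, authenticating with DefaultAzureCredential."""
    # Lazy imports: requires `pip install azure-identity azure-storage-blob`.
    from azure.identity import DefaultAzureCredential
    from azure.storage.blob import BlobServiceClient

    # No keys or connection strings: the credential chain finds the injected
    # Workload Identity settings in AKS, or your `az login` session locally.
    credential = DefaultAzureCredential()
    service = BlobServiceClient(account_url=account_url, credential=credential)
    container = service.get_container_client(container_name)
    return [blob.name for blob in container.list_blobs()]
```

A call would look like list_blob_names("https://<account>.blob.core.windows.net", "<container>"), with both placeholders replaced by your own values.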

Node.js / TypeScript


Role Assignments and RBAC

Least-privilege role assignments are central to managed identity security. Always scope to the minimum necessary resource.

Common Role Assignments

Scope Examples
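Scopes are just Azure resource IDs, so it helps to build them with small helpers rather than hand-typing them. The sketch below shows the ID shape for a blob container and a Key Vault, plus a guard that flags anything broader than a single resource. All function names are hypothetical.

```python
def storage_container_scope(subscription_id, resource_group, account, container):
    """Resource ID of a single blob container (narrowest useful Storage scope)."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
        f"/blobServices/default/containers/{container}"
    )


def key_vault_scope(subscription_id, resource_group, vault):
    """Resource ID of a single Key Vault."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.KeyVault/vaults/{vault}"
    )


def is_resource_level(scope):
    """True when the scope names a specific resource, not a subscription or resource group."""
    parts = [p.lower() for p in scope.strip("/").split("/")]
    return (
        len(parts) >= 8
        and parts[0] == "subscriptions"
        and parts[2] == "resourcegroups"
        and parts[4] == "providers"
    )
```

A check like is_resource_level is the kind of thing I would drop into a CI step that reviews role assignments before they are applied.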


Multiple Identities and Multi-Tenancy

In a multi-team cluster, different services should have different managed identities.

Pattern: One Identity Per Workload

Each service account in its own namespace maps to its own managed identity with its own scoped role assignments.
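A quick way to keep this pattern honest is to audit the mapping from service accounts to identities. The sketch below assumes you can export that mapping as a dictionary keyed by (namespace, service account); the data shape and function name are my own invention.

```python
from collections import Counter


def shared_identities(bindings):
    """Return client IDs bound to more than one (namespace, service account) pair."""
    counts = Counter(bindings.values())
    return sorted(client_id for client_id, n in counts.items() if n > 1)


# Hypothetical inventory: two teams accidentally sharing one identity.
bindings = {
    ("payments", "payments-sa"): "client-id-a",
    ("orders", "orders-sa"): "client-id-b",
    ("reports", "reports-sa"): "client-id-a",
}
print(shared_identities(bindings))
```

Any client ID this returns is an identity whose role assignments are effectively shared by several workloads.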

Managing With Terraform


Security Best Practices

1. Avoid Node-Level Identity for Application Access

2. Always Scope Roles to Specific Resources

Never assign roles at the subscription level for application workloads.

3. Restrict Federated Credential Subject Precisely

The subject in a federated credential must be system:serviceaccount:<namespace>:<name>. Wildcards are not supported; this is intentional security enforcement.

4. Separate Identities Per Environment

5. Use Azure Policy to Prevent Credential Leakage

6. Monitor Identity Usage


Troubleshooting

Webhook Not Injecting Environment Variables

AADSTS70021: No Matching Federated Identity Record Found

This means the federated credential subject doesn't match the pod's actual service account.
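When I hit this error, I decode the projected token and compare its claims with what the federated credential expects. The sketch below reads the JWT payload without verifying the signature, which is fine for debugging but never for authorization decisions. The function names are hypothetical.

```python
import base64
import json


def decode_jwt_claims(token):
    """Decode a JWT payload WITHOUT verifying the signature (debugging only)."""
    payload_b64 = token.split(".")[1]
    # JWTs strip base64 padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def federation_mismatches(token, expected_issuer, namespace, service_account):
    """Compare the token's iss and sub claims with the federated credential's values."""
    claims = decode_jwt_claims(token)
    expected_subject = f"system:serviceaccount:{namespace}:{service_account}"
    problems = []
    if claims.get("sub") != expected_subject:
        problems.append(f"sub is {claims.get('sub')!r}, expected {expected_subject!r}")
    if claims.get("iss") != expected_issuer:
        problems.append(f"iss is {claims.get('iss')!r}, expected {expected_issuer!r}")
    return problems
```

Feed it the contents of the file at AZURE_FEDERATED_TOKEN_FILE. A mismatch in sub usually points at the wrong namespace or service account name; a mismatch in iss often comes down to a missing or extra trailing slash.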

AADSTS50034: User Account Does Not Exist

The managed identity's principal ID was not found in the tenant. This usually means the role assignment used the wrong assignee or the identity is in a different subscription.

Pod Can't Access Azure Resource (403 Forbidden)

Check Token Exchange is Working


What I Learned

After migrating dozens of services from key-based authentication to managed identity, here are the lessons that saved me the most time:

  1. User-assigned is almost always the right choice for applications. System-assigned makes sense only for node-level operations like ACR pulling. For everything else, user-assigned gives you portability, reuse, and survivability across cluster recreations.

  2. The federated credential subject must be exact. The format is system:serviceaccount:<namespace>:<name> — no wildcards. Get this wrong and you'll waste hours on confusing AADSTS errors.

  3. Role assignment propagation takes time. Azure RBAC changes usually propagate within a minute or two, but can occasionally take several minutes. In CI/CD pipelines, add a retry loop (or at minimum a sleep 60) after granting permissions.

  4. DefaultAzureCredential is the right primitive. Don't use WorkloadIdentityCredential directly unless you have a specific reason. DefaultAzureCredential works locally with az login, in AKS with Workload Identity, and in any other Azure compute — same code everywhere.

  5. Always scope to the minimum resource, not the resource group. Key Vault at vault scope, Storage at container scope. It's more verbose in Terraform/CLI but dramatically reduces blast radius.

  6. Label the pod, not just the service account. Both the Service Account annotation (azure.workload.identity/client-id) AND the pod label (azure.workload.identity/use: "true") are required. Missing either one means the webhook won't inject credentials.

  7. Separate identities per environment, per service. A single shared managed identity across dev/prod is a compliance and blast radius problem waiting to happen.
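For lesson 3, a retry loop with exponential backoff tends to be more robust than a fixed sleep. The sketch below is deliberately generic: it uses the built-in PermissionError as a stand-in for whatever "403 Forbidden" exception your client raises, and the function name and defaults are my own choices.

```python
import time


def retry_until_authorized(operation, attempts=6, initial_delay=5.0, sleep=time.sleep):
    """Call operation() until it stops raising PermissionError, backing off between tries."""
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except PermissionError:
            # Stand-in for a 403 from Azure while the role assignment propagates.
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= 2  # 5s, 10s, 20s, ... with the defaults
```

With the defaults this waits up to about two and a half minutes in total before giving up, which covers typical propagation delays without blocking the pipeline on every run.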

