An agent that bypasses your SSO is a procurement deal-breaker in any regulated org. The good news: agents fit the OAuth model cleanly if you choose the right pattern. The bad news: most teams pick the wrong one. Here are the three patterns that work, the one to avoid, and how to roll it out.
The three patterns that work
1. User-on-behalf-of (OBO)
The agent acts under the user's identity. The user logs into the agent host with SSO; the host obtains a token that the agent presents to every downstream MCP server. Audit logs show "user X via agent A".
- Strengths: clean attribution, minimal new identities, leverages existing IdP groups.
- Weaknesses: the agent lives only as long as the user's session; tokens are scoped per user, not per agent.
- Pick when: the agent assists a known user (IDE copilots, support agents).
2. Workload identity
The agent has its own machine identity (service principal in Entra, machine user in Okta). It authenticates with the OAuth 2.0 client credentials grant, holds short-lived tokens, and carries metadata about which user triggered it.
- Strengths: identity outlives any session; clean for batch and scheduled agents.
- Weaknesses: new identity to govern; needs careful least-privilege.
- Pick when: the agent runs autonomously (cron jobs, monitoring, scheduled research).
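As a rough sketch of the workload identity pattern, the snippet below builds a client credentials token request and the headers for a downstream call. The function names, the `X-Triggered-By` header, and the use of HTTP Basic client authentication are illustrative assumptions, not something any particular IdP or MCP server mandates; no network call is made.

```python
import base64
from urllib.parse import urlencode


def client_credentials_request(client_id: str, client_secret: str, scopes: list) -> dict:
    """Build headers and form body for an OAuth 2.0 client credentials token request."""
    # Simplified HTTP Basic client authentication; some IdPs expect other methods.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return {
        "headers": {
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        "body": urlencode({
            "grant_type": "client_credentials",
            "scope": " ".join(scopes),
        }),
    }


def downstream_headers(access_token: str, triggered_by: str) -> dict:
    """Attach the short-lived token plus the triggering user.

    X-Triggered-By is a hypothetical header name chosen for this example.
    """
    return {
        "Authorization": f"Bearer {access_token}",
        "X-Triggered-By": triggered_by,
    }
```

The triggering user travels as plain metadata here, so downstream services should treat it as a hint for audit rather than as proof of identity.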
3. Hybrid: workload identity + impersonation
The agent uses workload identity for auth, then impersonates the requesting user via OBO for downstream calls. Best audit story; most plumbing.
- Strengths: complete attribution chain (agent + user); least-privilege per call.
- Weaknesses: complex; not every IdP supports the impersonation primitive cleanly.
- Pick when: the agent is multi-user and audit-heavy (regulated industries).
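One way to picture the attribution chain: RFC 8693 defines an `act` (actor) claim that records who is acting for the token's subject. The sketch below assumes your IdP emits that claim in the exchanged token, which not every provider does; the claim values and the `attribution_chain` helper are made up for illustration.

```python
def attribution_chain(claims: dict) -> list:
    """Return [subject, current actor, earlier actors...] from decoded token claims."""
    chain = [claims["sub"]]
    actor = claims.get("act")
    # Nested "act" claims record earlier actors in a delegation chain.
    while isinstance(actor, dict):
        chain.append(actor["sub"])
        actor = actor.get("act")
    return chain


# Hypothetical claims for a token issued to the agent on the user's behalf.
claims = {
    "sub": "alice@example.com",
    "scope": "zendesk:tickets:read",
    "act": {"sub": "agent:support-bot"},
}
print(attribution_chain(claims))  # ['alice@example.com', 'agent:support-bot']
```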
The one to avoid
Long-lived shared tokens in MCP env blocks. Common, easy, fundamentally insecure. See credential rotation for the migration.
Token exchange in practice
The flow looks the same whichever pattern you choose:
1. User authenticates to agent host (SSO + MFA)
2. Host receives ID token + refresh token
3. Host exchanges ID token for access tokens scoped to the agent
4. Agent presents access tokens to MCP servers
5. MCP servers (or the gateway in front of them) validate tokens against the IdP introspection endpoint
6. Tokens expire in 5–15 minutes; refresh on demand
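Step 6 can be sketched as a small cache that fetches a new token shortly before the current one expires. `TokenCache`, the `fetch` callback, and the 30-second skew are assumptions made for this example; in practice `fetch` would wrap whatever call your host makes to the IdP.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Token:
    value: str
    expires_at: float  # epoch seconds


class TokenCache:
    """Hold one short-lived token and refresh it on demand."""

    def __init__(self, fetch: Callable[[], Token], skew: float = 30.0,
                 clock: Callable[[], float] = time.time):
        self._fetch = fetch      # hypothetical callback that obtains a fresh token
        self._skew = skew        # refresh this many seconds before expiry
        self._clock = clock      # injectable so tests can control time
        self._token: Optional[Token] = None

    def get(self) -> str:
        expired = (
            self._token is None
            or self._clock() >= self._token.expires_at - self._skew
        )
        if expired:
            self._token = self._fetch()
        return self._token.value
```

Refreshing slightly early avoids presenting a token that expires while the request is in flight.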
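Token exchange (RFC 8693) is the right primitive. Okta supports it, and Entra offers an equivalent on-behalf-of flow (built on the JWT bearer grant rather than the RFC 8693 grant type); Auth0 has parity. Older OAuth-only setups need a small adapter.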
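For orientation, here is a minimal sketch of the form body for an RFC 8693 exchange of an ID token for a scoped access token. The grant type and token type URNs come from the RFC; the function name and the audience value you pass in are assumptions, and your IdP may require extra parameters or client authentication.

```python
from urllib.parse import urlencode

# Identifiers defined by RFC 8693.
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ID_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:id_token"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def token_exchange_body(id_token: str, audience: str, scopes: list) -> str:
    """Form-encode a request that swaps a user's ID token for an access token."""
    return urlencode({
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": id_token,
        "subject_token_type": ID_TOKEN_TYPE,
        "requested_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,          # the MCP server or gateway the token is for
        "scope": " ".join(scopes),     # only the scopes this agent declares
    })
```

The body is POSTed to the IdP's token endpoint with `Content-Type: application/x-www-form-urlencoded`.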
Scopes per agent
Each agent should define its own scopes. Sample scope catalogue:
agent: support-bot
required_scopes:
  - zendesk:tickets:read
  - zendesk:tickets:write
  - postgres:support_db:read
optional_scopes:
  - slack:channels:read
Map scopes to existing IdP groups for assignment. The agent never asks for more than declared; the gateway rejects out-of-scope calls.
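A minimal sketch of the "never more than declared" rule, mirroring the sample catalogue in Python. `ScopeCatalogue` and `undeclared` are names invented for this example; a real gateway would load the catalogue from configuration instead of hard-coding it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopeCatalogue:
    agent: str
    required: frozenset
    optional: frozenset = frozenset()

    def undeclared(self, requested) -> set:
        """Scopes in the request that the agent never declared."""
        return set(requested) - (self.required | self.optional)


SUPPORT_BOT = ScopeCatalogue(
    agent="support-bot",
    required=frozenset({
        "zendesk:tickets:read",
        "zendesk:tickets:write",
        "postgres:support_db:read",
    }),
    optional=frozenset({"slack:channels:read"}),
)

# Any non-empty result means the call is out of scope and should be rejected.
extra = SUPPORT_BOT.undeclared({"zendesk:tickets:read", "github:repos:write"})
print(extra)  # {'github:repos:write'}
```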
The MCP gateway is the enforcement point
Token validation, scope checks, and per-agent rate limits happen at the MCP gateway — not at every MCP server. The upstream servers trust the gateway and operate on a service token. This keeps server implementations simple and policy in one place.
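The gateway's three checks might look roughly like this. The sketch assumes the token has already been sent to the IdP and that `introspection` holds an RFC 7662 style response (`active`, `exp`, `scope`, `client_id`); the `Decision` and `RateLimiter` types, the reason strings, and the sliding-window policy are illustrative choices.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass
class Decision:
    allow: bool
    reason: str


class RateLimiter:
    """Sliding-window limiter keyed by agent identity."""

    def __init__(self, max_calls: int, window_seconds: float):
        self.max_calls = max_calls
        self.window = window_seconds
        self._calls = defaultdict(deque)

    def allow(self, agent: str, now: float) -> bool:
        calls = self._calls[agent]
        # Drop timestamps that have fallen out of the window.
        while calls and now - calls[0] >= self.window:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True


def authorize(introspection: dict, required_scope: str,
              limiter: RateLimiter, now: float) -> Decision:
    """Token validation, then scope check, then per-agent rate limit."""
    if not introspection.get("active"):
        return Decision(False, "token inactive")
    if introspection.get("exp", 0) <= now:
        return Decision(False, "token expired")
    granted = set(introspection.get("scope", "").split())
    if required_scope not in granted:
        return Decision(False, f"missing scope {required_scope}")
    if not limiter.allow(introspection.get("client_id", "unknown"), now):
        return Decision(False, "rate limit exceeded")
    return Decision(True, "ok")
```

Rejected calls do not consume rate-limit budget in this sketch, because the limiter runs last; whether that is the right policy depends on how you want to treat misbehaving agents.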
Roll-out plan
Three weeks for most teams:
- Week 1: workload identity provisioned; non-prod agents use it.
- Week 2: gateway enforces scopes for non-prod; user OBO wired for prod read-only.
- Week 3: prod write-path moves to OBO; old shared tokens revoked.
Test with a small pilot group; expect at least one IdP edge case. Have a rollback path that re-enables a fallback token for 24 hours if needed.
Audit story
A tool call should produce an audit record with:
- IdP user id (subject claim).
- Agent identity (workload principal).
- Issuing IdP (multi-tenant orgs).
- Scopes presented vs scopes required.
- Decision (allow/deny + reason).
This record feeds straight into the audit trail. Auditors recognise the shape from any other OAuth audit they have seen.
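As one possible shape for that record, the sketch below maps the five fields onto a dataclass and serialises it as a JSON line. The field names and the `build_audit_record` helper are assumptions for illustration; the `sub`, `iss`, and `scope` claim names follow common OAuth and OIDC usage.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class AuditRecord:
    user_id: str            # IdP subject claim
    agent_id: str           # workload principal
    issuer: str             # issuing IdP, relevant in multi-tenant orgs
    scopes_presented: list
    scopes_required: list
    decision: str           # "allow" or "deny"
    reason: str


def build_audit_record(claims: dict, agent_id: str, required: list,
                       allow: bool, reason: str) -> AuditRecord:
    return AuditRecord(
        user_id=claims["sub"],
        agent_id=agent_id,
        issuer=claims["iss"],
        scopes_presented=sorted(claims.get("scope", "").split()),
        scopes_required=sorted(required),
        decision="allow" if allow else "deny",
        reason=reason,
    )


def to_json_line(record: AuditRecord) -> str:
    """Serialise with stable key order so log lines are easy to diff."""
    return json.dumps(asdict(record), sort_keys=True)
```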
Common pitfalls
- Reusing user tokens past their TTL: token exchange exists for a reason; use it.
- Skipping introspection: never trust JWT claims on a long-lived token without a back-channel check.
- Over-broad scopes: *:write is a CV submission, not a scope.
- No agent identity at all: "the agent uses Alice's token" is fine until Alice quits.
Where this is heading
Expect standardised agent identity primitives in OIDC by 2027: a urn:agent claim distinguishing agent-driven calls from user-initiated ones. Build the conventions now, swap to standards later.