Tutorial · 5 min read

SOC 2 compliant MCP deployment: the controls auditors actually ask about

An MCP-heavy stack changes how SOC 2 controls map to reality. Here are the controls auditors will probe in 2026 and what evidence to collect from day one.

SOC 2 was written before agents. The Trust Services Criteria still apply, but the way you map them onto an MCP-driven system changes. Here is a practical control mapping for 2026, plus the evidence trail that makes the audit painless.

What SOC 2 cares about, applied to MCP

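A SOC 2 Type II report evaluates how controls operated over a period of time against the Trust Services Criteria, which are grouped into five categories: Security (the common criteria, always in scope), Availability, Processing Integrity, Confidentiality, and Privacy. For an MCP-heavy stack, the changes concentrate in Security and Processing Integrity.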

This article assumes you are aiming for Type II and have an existing SOC 2 program. If you are also subject to GDPR, pair this with GDPR compliant AI agents.

The control mapping

CC6.1 — logical access

Auditors will ask: "Who can call which MCP tools, and how do you enforce it?"

Required:

  • Per-call authorisation, not standing grants. See MCP access control lists.
  • Workload identity for every MCP server (not shared tokens).
  • Quarterly access reviews including agent service accounts.

Evidence: policy decision logs, identity issuance logs, access review attestations.
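A minimal sketch of per-call authorisation that also produces the policy decision log. The policy shape, the SPIFFE-style principal IDs, and the tool names are illustrative assumptions, not part of MCP or of any particular policy engine.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Grant:
    principal: str  # workload identity (SPIFFE-style ID assumed for illustration)
    tool: str       # MCP tool name (hypothetical)


# Hypothetical policy: explicit (principal, tool) pairs; everything else is denied.
POLICY = {
    Grant("spiffe://prod/agent/support", "tickets.read"),
    Grant("spiffe://prod/agent/support", "tickets.comment"),
}


def authorise(principal: str, tool: str, log: list) -> bool:
    """Decide one tool call and append the decision to the log."""
    allowed = Grant(principal, tool) in POLICY
    log.append(json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "principal": principal,
        "tool": tool,
        "decision": "allow" if allowed else "deny",
    }, sort_keys=True))
    return allowed


decision_log: list = []
ok = authorise("spiffe://prod/agent/support", "tickets.read", decision_log)
denied = authorise("spiffe://prod/agent/support", "tickets.delete", decision_log)
```

Because the check runs on every call and writes a record for both outcomes, the log doubles as the evidence for this control.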

CC6.6 — identification of unauthorised access

For MCP: tool calls outside expected scope, denied authorisations, anomalous tool sequences.

Required:

  • Real-time alerting on policy denies.
  • Trace storage for at least the audit window (Type II observation periods typically run 3 to 12 months; 12 months is a safe default).
  • Documented response to detected anomalies.

Evidence: alerting configuration, sample incident tickets, audit trails.
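One way to turn policy denies into an alert is a sliding-window counter. This is a sketch under assumptions: the threshold, the window length, and the `DenyAlert` class are hypothetical, and a stricter setup could page on every single deny instead.

```python
from collections import deque


class DenyAlert:
    """Fire when more than `threshold` denies land within `window_s` seconds."""

    def __init__(self, threshold: int = 3, window_s: float = 60.0):
        self.threshold = threshold
        self.window_s = window_s
        self._denies = deque()  # timestamps of recent denies, oldest first

    def record(self, decision: str, ts: float) -> bool:
        """Record one decision; return True if the alert should fire."""
        if decision != "deny":
            return False
        self._denies.append(ts)
        # Drop denies that have aged out of the window.
        while self._denies and ts - self._denies[0] > self.window_s:
            self._denies.popleft()
        return len(self._denies) > self.threshold


alert = DenyAlert(threshold=2, window_s=60.0)
fired = [alert.record("deny", t) for t in (0.0, 10.0, 20.0, 200.0)]
```

With these inputs the third deny trips the alert, and the fourth does not because the earlier denies have left the window.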

CC6.7 — transmission and disposal

MCP uses JSON-RPC over stdio (local) or HTTP (remote): Streamable HTTP in current versions of the specification, HTTP+SSE in older ones. Any transport that crosses a trust boundary needs encryption in transit.

Required:

  • mTLS for any MCP server reachable over the network.
  • Documented data-disposal policy for trace stores (with PII-aware redaction).

Evidence: TLS configuration, retention/disposal policy, sample disposal logs.
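As an illustration of the mTLS requirement, the sketch below configures a server-side context with Python's standard `ssl` module. The function names and file paths are assumptions; in practice the same settings often live in a proxy or service mesh in front of the MCP server.

```python
import ssl


def require_client_certs(ctx: ssl.SSLContext) -> ssl.SSLContext:
    """Turn a server-side TLS context into an mTLS one."""
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.verify_mode = ssl.CERT_REQUIRED  # reject clients without a valid certificate
    return ctx


def build_server_context(certfile: str, keyfile: str, client_ca: str) -> ssl.SSLContext:
    """Build an mTLS context from PEM files (the paths are caller-supplied)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)  # server identity
    ctx.load_verify_locations(cafile=client_ca)              # CA that issues workload certs
    return require_client_certs(ctx)
```

The resulting context can be handed to any server that accepts an `ssl.SSLContext`; the configuration itself is the TLS evidence an auditor will ask to see.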

CC7.2 — system monitoring

The agent itself is part of the system. Monitoring the LLM is now in scope.

Required:

  • Real-time metrics covering tool calls, error rates, latency. See real-time agent monitoring.
  • Alerting wired to on-call.
  • Documented runbooks per alert.

Evidence: monitoring dashboards, alert policies, runbook docs.
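The three metrics named above can be derived from a window of tool-call records. A minimal sketch, assuming a hypothetical `ToolCall` record and a nearest-rank p95; a real deployment would more likely export these to its metrics backend.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    tool: str
    latency_ms: float
    ok: bool


def summarise(calls: list) -> dict:
    """Return call count, error rate, and nearest-rank p95 latency."""
    if not calls:
        return {"calls": 0, "error_rate": 0.0, "p95_ms": 0.0}
    latencies = sorted(c.latency_ms for c in calls)
    rank = (95 * len(latencies) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    errors = sum(1 for c in calls if not c.ok)
    return {
        "calls": len(calls),
        "error_rate": errors / len(calls),
        "p95_ms": latencies[rank - 1],
    }


# Twenty synthetic calls; every tenth one fails.
window = [ToolCall("search", 100.0 + i, ok=(i % 10 != 0)) for i in range(20)]
stats = summarise(window)
```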

CC8.1 — change management

When does a model upgrade or a new prompt count as a "change"? Auditors want it to count.

Required:

  • Pinned model versions in production (no auto-bumps).
  • PR review for prompt changes, with the same rigour as code.
  • Regression suite as part of the change pipeline (see continuous agent regression testing).

Evidence: deployment changelog, PR reviews, regression results per release.
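A deployment step can enforce pinning and record what shipped in one pass. The sketch below is an assumption-laden example: the floating-alias suffixes, the model ID, and the release label are hypothetical, and your provider's naming scheme may need a different pin check.

```python
import hashlib
import json

FLOATING_ALIASES = ("latest", "default", "stable")  # assumed naming convention


def is_pinned(model_id: str) -> bool:
    """Treat IDs ending in a floating alias as unpinned (assumed convention)."""
    return not model_id.endswith(FLOATING_ALIASES)


def build_manifest(release: str, models: dict, prompts: dict) -> dict:
    """Record model IDs and SHA-256 prompt hashes for one deployment."""
    unpinned = sorted(m for m in models.values() if not is_pinned(m))
    if unpinned:
        raise ValueError(f"unpinned model versions: {unpinned}")
    return {
        "release": release,
        "models": dict(sorted(models.items())),
        "prompt_sha256": {
            name: hashlib.sha256(text.encode("utf-8")).hexdigest()
            for name, text in sorted(prompts.items())
        },
    }


manifest = build_manifest(
    release="2026.02.1",                              # hypothetical release label
    models={"planner": "example-model-2026-01-15"},   # hypothetical pinned model ID
    prompts={"system": "You are a support agent."},
)
print(json.dumps(manifest, indent=2))
```

Committing the manifest alongside each release gives the deployment changelog a record that can be compared across versions.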

PI1.4 — processing integrity

For MCP: was the right tool called with the right arguments, and did the result reach the user correctly?

Required:

  • Trace for every user-visible task, end to end.
  • Evidence of validation (schema checks, refusal rates within bounds).
  • Documented error-handling and retry semantics — see distributed agent failure recovery.

Evidence: trace samples, validation logs, error-classification reports.
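Schema checks on tool arguments are the simplest validation evidence to produce. The sketch below uses a hand-rolled checker and a hypothetical schema to stay dependency-free; a JSON Schema validator would be the more usual choice, since MCP tools declare their inputs as JSON Schema.

```python
# Hypothetical tool schema: argument name -> (expected type, required?)
SCHEMA = {
    "ticket_id": (int, True),
    "comment": (str, True),
    "notify": (bool, False),
}


def validate_args(args: dict, schema: dict) -> list:
    """Return a list of validation errors; an empty list means the call is valid."""
    errors = []
    for name, (typ, required) in schema.items():
        if name not in args:
            if required:
                errors.append(f"missing: {name}")
            continue
        value = args[name]
        # bool is a subclass of int in Python, so reject it for int fields.
        if not isinstance(value, typ) or (typ is int and isinstance(value, bool)):
            errors.append(f"wrong type: {name}")
    # Arguments the schema does not know about are rejected too.
    errors.extend(f"unexpected: {k}" for k in sorted(set(args) - set(schema)))
    return errors


good = validate_args({"ticket_id": 42, "comment": "done"}, SCHEMA)
bad = validate_args({"ticket_id": "42", "force": True}, SCHEMA)
```

Logging the returned error list next to the trace ID yields the validation log listed as evidence.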

Evidence collection — automate from day one

The single biggest difference between a painful audit and an easy one: evidence is automated, not screenshot-based.

Build these collectors as part of your platform, not at audit time:

  • Authorisation log archive. Every allow/deny decision, signed and immutable, retained for 12 months.
  • Trace archive. Sampled (or full, if storage allows) traces per task, retained for 12 months.
  • Model + prompt version manifest. Every deployment writes the active model IDs and prompt hashes to a versioned manifest. Auditors love this.
  • Access review automation. A nightly job that lists all human and service principals with their grants, dumps to S3 with a hash, and emails the security team for review.
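For the authorisation log archive, one way to make tampering detectable is to chain entries with an HMAC. This is a sketch under assumptions: the key is a placeholder, the entry layout is hypothetical, and an HMAC is a symmetric integrity check rather than a public-key signature. Immutability itself still has to come from the storage layer, for example write-once object storage.

```python
import hashlib
import hmac
import json

KEY = b"demo-key-load-from-a-secret-manager"  # placeholder, not a real secret
GENESIS = "0" * 64                            # "previous MAC" for the first entry


def _mac(prev: str, event: dict, key: bytes) -> str:
    body = json.dumps(event, sort_keys=True)
    return hmac.new(key, (prev + body).encode("utf-8"), hashlib.sha256).hexdigest()


def append_entry(chain: list, event: dict, key: bytes = KEY) -> dict:
    """Append an event whose MAC covers the previous entry's MAC."""
    prev = chain[-1]["mac"] if chain else GENESIS
    entry = {"prev": prev, "event": event, "mac": _mac(prev, event, key)}
    chain.append(entry)
    return entry


def verify_chain(chain: list, key: bytes = KEY) -> bool:
    """Recompute every MAC; any edited, reordered, or removed entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        expected = _mac(prev, entry["event"], key)
        if entry["prev"] != prev or not hmac.compare_digest(entry["mac"], expected):
            return False
        prev = entry["mac"]
    return True


log = []
append_entry(log, {"principal": "agent-a", "tool": "tickets.read", "decision": "allow"})
append_entry(log, {"principal": "agent-a", "tool": "tickets.delete", "decision": "deny"})
intact = verify_chain(log)
log[0]["event"]["decision"] = "deny"  # simulate tampering with an archived entry
tampered_ok = verify_chain(log)
```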

For each, document:

  • What system writes it.
  • Where it is stored.
  • Retention and disposal.
  • Who can access it (and the access log for that).

Vendor and sub-processor management

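Every external MCP server is a vendor in SOC 2 terms (a subservice organization if you rely on its controls to meet your commitments), and usually a sub-processor under GDPR. Auditors will ask: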

  1. Is there a list?
  2. Does it include data flows?
  3. Is each item under contract or covered by acceptable terms?
  4. How do you know they are still trustworthy?

Required:

  • Sub-processor registry covering every MCP server (including OSS ones used in prod).
  • DPA (or equivalent) with each commercial vendor.
  • Periodic re-review tied to the registry trust criteria — see trusted MCP registry providers.
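The registry and its periodic re-review can start as a small structured record plus a staleness check. A sketch only: the field names, the example vendors, and the 90-day review interval are assumptions rather than anything SOC 2 prescribes.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class McpVendor:
    name: str
    data_flows: list     # which data categories reach this server
    contract: str        # e.g. "DPA" or "OSS licence"
    last_reviewed: date


def overdue(registry: list, today: date, max_age_days: int = 90) -> list:
    """Return names of vendors whose last review is older than the interval."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(v.name for v in registry if v.last_reviewed < cutoff)


# Hypothetical registry entries.
registry = [
    McpVendor("example-search-mcp", ["queries"], "DPA", date(2026, 1, 10)),
    McpVendor("example-oss-files-mcp", ["file paths"], "OSS licence", date(2025, 9, 1)),
]
stale = overdue(registry, today=date(2026, 2, 1))
```

Running the check on a schedule and filing the output as a ticket leaves a dated record that each re-review happened.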

Common gaps auditors find

The first SOC 2 audit of an MCP-heavy stack typically surfaces:

  • Shared service-account tokens. Replace with workload identity before the audit, not during.
  • Auto-pulled latest tags. Pin every dependency by digest.
  • Trace store with raw PII. Redact at write time; the trace store is in scope.
  • No regression suite. "We test in prod" is not an answer.
  • Undocumented break-glass paths. Document them, audit access, time-limit them.
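Redacting at write time, as suggested for the trace store, can be sketched as a recursive scrub applied before a span is persisted. The two patterns below are deliberately loose illustrations and the span layout is hypothetical; production redaction usually needs a vetted PII detector and field-level allow-lists.

```python
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Loose phone pattern: nine or more digits with optional separators (illustrative only).
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(value):
    """Recursively redact strings inside dicts and lists before they are stored."""
    if isinstance(value, str):
        value = EMAIL.sub("[EMAIL]", value)
        return PHONE.sub("[PHONE]", value)
    if isinstance(value, dict):
        return {k: redact(v) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v) for v in value]
    return value  # numbers, booleans, None pass through unchanged


span = {
    "tool": "crm.lookup",
    "args": {"q": "jane@example.com"},
    "notes": ["call +1 415 555 0100"],
}
clean = redact(span)
```

Calling `redact` in the trace writer, rather than in a later batch job, means raw PII never reaches the store.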

Fixing each post-audit costs 5–10× more than building it in.

A 90-day prep plan

If you are 90 days from your first audit window:

Days     Focus
0–30     Inventory every MCP server. Write sub-processor registry. Pin all dependencies.
30–60    Implement per-call authorisation + audit log. Stand up regression suite.
60–75    Wire monitoring + alerting + runbooks.
75–90    Dry-run with internal "auditor" reviewing the evidence trail. Fix gaps.

If you slip on any of these, slip the audit window. A Type II report with exceptions or a qualified opinion is worse than a delayed one.

Where this is heading

AICPA is drafting AI-specific guidance that will likely formalise much of the above. EU equivalents (under the AI Act) are similar in shape — see EU AI Act MCP compliance. Build now to a strict interpretation; the formal guidance will codify what good teams already do.


© 2026 Loadout.