Tutorial · 3 min read

MCP server health metrics: the standard health check every server should expose

Hosts cannot pick a healthy server if servers do not declare health. Here is the proposed standard health-check shape, the metrics worth surfacing, and how to wire it into a registry.

Every web service exposes a health endpoint. MCP servers, mostly, do not. The result is hosts that cannot tell a struggling server from a dead one, and registries that cannot prune stale entries. Here is the health-check standard that is emerging in 2026 and the metrics worth exposing today.

Why MCP needs its own shape

A generic /health endpoint is not enough. MCP servers have specifics:

  • Tool readiness — the server is up, but is its Postgres connection ready?
  • Per-tool health — one tool may be degraded while others work.
  • Capability declaration — what does the server actually expose right now?
  • Version and protocol — what MCP version, what server version, what schema?

The proposed shape (under discussion in the MCP spec working group at the time of writing) covers all four.

The proposed health method

A new MCP method, callable from the host:

{
  "jsonrpc": "2.0",
  "method": "health",
  "id": 1
}

Response:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "status": "ok",
    "server": { "name": "github-mcp", "version": "1.4.2" },
    "protocol": "mcp/1.0",
    "uptime_seconds": 3942,
    "tools": {
      "search_issues": { "status": "ok", "p95_ms": 142 },
      "create_issue": { "status": "degraded", "p95_ms": 1820, "reason": "rate-limited upstream" }
    },
    "dependencies": {
      "github_api": "ok",
      "rate_limit_remaining": 4500
    }
  }
}
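
One way to pin the response down on the host side is a set of TypeScript types plus a small runtime check. This is only a sketch inferred from the example payload above, not the working group's schema; the names `Status`, `ToolHealth`, `HealthResult` and `isHealthResult` are made up for illustration.

```typescript
// Types inferred from the example response; illustrative names,
// not part of any published MCP schema.
type Status = 'ok' | 'degraded' | 'down';

interface ToolHealth {
  status: Status;
  p95_ms?: number;
  reason?: string;
}

interface HealthResult {
  status: Status;
  server: { name: string; version: string };
  protocol: string;
  uptime_seconds: number;
  tools: Record<string, ToolHealth>;
  dependencies: Record<string, string | number>;
}

const STATUSES: readonly string[] = ['ok', 'degraded', 'down'];

function isStatus(value: unknown): value is Status {
  return typeof value === 'string' && STATUSES.includes(value);
}

// Minimal structural check: overall status, uptime, and a status on every tool.
// It deliberately does not validate every field.
function isHealthResult(value: unknown): value is HealthResult {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  if (!isStatus(v.status)) return false;
  if (typeof v.uptime_seconds !== 'number') return false;
  if (typeof v.tools !== 'object' || v.tools === null) return false;
  return Object.values(v.tools as Record<string, unknown>).every(
    (t) =>
      typeof t === 'object' &&
      t !== null &&
      isStatus((t as Record<string, unknown>).status),
  );
}
```

A host would run `isHealthResult(response.result)` before reading any field, and treat a failed check the same as a server that did not answer.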

The seven metrics worth exposing

1. Overall status

Enum: ok | degraded | down. Drives the host's red/yellow/green indicator.
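
The overall value can be derived from the per-tool statuses rather than set by hand. The sketch below assumes one possible policy (every tool down means `down`, any non-ok tool means `degraded`); the function name `overallStatus` and the policy itself are assumptions, not part of the proposal.

```typescript
type Status = 'ok' | 'degraded' | 'down';

// Assumed roll-up policy: all tools down -> down, any tool not ok -> degraded.
// A server with no tools registered is reported as ok.
function overallStatus(toolStatuses: Status[]): Status {
  if (toolStatuses.length === 0) return 'ok';
  if (toolStatuses.every((s) => s === 'down')) return 'down';
  return toolStatuses.every((s) => s === 'ok') ? 'ok' : 'degraded';
}
```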

2. Per-tool latency (p50, p95)

Hosts use this to decide which tool to surface first when there are alternatives.

3. Per-tool error rate (last 5 min)

Catches transient regressions before they propagate.

4. Upstream dependency status

Most MCP servers proxy something else; expose its status.

5. Rate-limit headroom

Both the server's own rate limit (against the host) and any upstream limits (GitHub API, Stripe).
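
Headroom is easier for a host to act on as a fraction than as a raw count. A minimal sketch, assuming the server knows both the quota and the remaining calls; the `RateLimit` shape, the function names, and the 10% warning threshold are illustrative choices, not recommended values.

```typescript
interface RateLimit {
  limit: number;
  remaining: number;
}

// Fraction of the quota still available, clamped to [0, 1].
function headroom({ limit, remaining }: RateLimit): number {
  if (limit <= 0) return 0;
  return Math.min(1, Math.max(0, remaining / limit));
}

// Maps headroom onto the status enum. The 10% default is arbitrary.
function rateLimitStatus(rl: RateLimit, warnBelow = 0.1): 'ok' | 'degraded' | 'down' {
  const h = headroom(rl);
  if (h === 0) return 'down';
  return h < warnBelow ? 'degraded' : 'ok';
}
```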

6. Cache hit rate

For read-heavy servers; helps the host decide whether to cache further upstream.

7. Version and schema hash

Lets the host detect schema drift and, if it chooses, refuse to call a server whose schema has changed.
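
A schema hash only helps if the same schema always produces the same hash, so key order has to be normalised before hashing. The sketch below uses SHA-256 from Node's built-in `crypto` module over a key-sorted serialisation; `canonicalize` and `schemaHash` are hypothetical helper names, and the input is assumed to be plain JSON-compatible data.

```typescript
import { createHash } from 'node:crypto';

// Serialises JSON-compatible data with object keys sorted,
// so that key order does not change the resulting hash.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(',')}]`;
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalize(obj[k])}`)
      .join(',');
    return `{${body}}`;
  }
  return JSON.stringify(value);
}

// Hex-encoded SHA-256 of the canonical form of the tool schemas.
function schemaHash(toolSchemas: unknown): string {
  return createHash('sha256').update(canonicalize(toolSchemas)).digest('hex');
}
```

A host can store the hash it saw when the server was approved and compare it on each poll; a mismatch is the drift signal.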

Implementation in 30 lines

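const VERSION = '1.4.2'; // placeholder; normally read from package.json
const counters = { calls: 0, errors: 0, latencies: [] as number[] };

// Nearest-rank percentile; returns 0 when there are no samples yet.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.max(0, Math.ceil((p / 100) * sorted.length) - 1)];
}
// Placeholder probe; a real one would ping whatever the server proxies.
const probeUpstream = async () => ({ github_api: 'ok' });

export async function health() {
  const p95 = percentile(counters.latencies, 95);
  // Avoid 0/0 (NaN) before the first call is recorded.
  const errorRate = counters.calls ? counters.errors / counters.calls : 0;
  return {
    status: errorRate > 0.05 ? 'degraded' : 'ok',
    server: { name: 'github-mcp', version: VERSION },
    uptime_seconds: Math.floor(process.uptime()),
    tools: { /* per-tool roll-up, e.g. { status, p95_ms: p95 } */ },
    dependencies: await probeUpstream(),
  };
}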

Update the counters inside every tool handler, and trim the window once a minute so that only the last 5 minutes of samples remain. Done.
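
The wiring can live in one wrapper instead of being repeated in each handler. The following is a sketch under assumptions: samples are kept per tool in memory, and the names `recordCall`, `trimWindow`, `errorRate` and `instrument` are invented for this example.

```typescript
interface Sample {
  at: number; // epoch ms when the call finished
  ms: number; // call duration
  ok: boolean;
}

const WINDOW_MS = 5 * 60 * 1000;
const samples = new Map<string, Sample[]>();

function recordCall(tool: string, ms: number, ok: boolean, now: number = Date.now()): void {
  const list = samples.get(tool) ?? [];
  list.push({ at: now, ms, ok });
  samples.set(tool, list);
}

// Drops samples older than the window; intended to run about once a minute.
function trimWindow(now: number = Date.now()): void {
  for (const [tool, list] of samples) {
    samples.set(tool, list.filter((s) => now - s.at <= WINDOW_MS));
  }
}

// Share of failed calls among the samples currently held for a tool (0 if none).
function errorRate(tool: string): number {
  const list = samples.get(tool) ?? [];
  if (list.length === 0) return 0;
  return list.filter((s) => !s.ok).length / list.length;
}

// Wraps a tool handler so every call is timed and its outcome recorded.
function instrument<A extends unknown[], R>(
  tool: string,
  handler: (...args: A) => Promise<R>,
): (...args: A) => Promise<R> {
  return async (...args: A): Promise<R> => {
    const start = Date.now();
    try {
      const result = await handler(...args);
      recordCall(tool, Date.now() - start, true);
      return result;
    } catch (err) {
      recordCall(tool, Date.now() - start, false);
      throw err; // the caller still sees the original error
    }
  };
}
```

Registering a tool then becomes `instrument('search_issues', searchIssues)`, with `trimWindow` scheduled on a one-minute `setInterval`.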

Host-side use

The host polls health every 30–60 seconds per server. Three actions:

  1. Drop unhealthy servers from the active tool menu.
  2. Surface degradation in the UI — hosts such as Claude Desktop or Cursor could show a status indicator per server.
  3. Inform routing — multiple servers offering the same tool? Pick the healthy one.
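
The routing action can be sketched as a pure function over the latest health reports. This assumes the host keeps one report per server in the shape of the response shown earlier; `ServerHealth`, `pickServer`, and the tie-break on p95 latency are illustrative choices.

```typescript
type Status = 'ok' | 'degraded' | 'down';

interface ServerHealth {
  server: string;
  status: Status;
  tools: Record<string, { status: Status; p95_ms?: number }>;
}

const RANK: Record<Status, number> = { ok: 0, degraded: 1, down: 2 };

// Returns the best server for a tool, or undefined if none is usable.
// Servers that are down, or whose copy of the tool is down, are skipped.
// Among the rest: better tool status first, then lower p95 latency.
function pickServer(tool: string, reports: ServerHealth[]): string | undefined {
  const usable = reports.filter((r) => {
    const t = r.tools[tool];
    return r.status !== 'down' && t !== undefined && t.status !== 'down';
  });
  const latency = (r: ServerHealth): number =>
    r.tools[tool].p95_ms ?? Number.MAX_SAFE_INTEGER;
  usable.sort(
    (a, b) =>
      RANK[a.tools[tool].status] - RANK[b.tools[tool].status] ||
      latency(a) - latency(b),
  );
  return usable[0]?.server;
}
```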

Registry use

A self-hosted registry can pull health from registered servers and surface it in the catalogue. Stale servers get demoted. Healthy ones get promoted.
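
Promotion and demotion can be expressed as a sort key. The sketch below assumes the registry stores each server's last reported status and the time of its last successful poll; the ten-minute staleness cutoff and all names here are assumptions for illustration.

```typescript
type Status = 'ok' | 'degraded' | 'down';

interface CatalogueEntry {
  name: string;
  status: Status;
  lastCheckedAt: number; // epoch ms of the last successful health poll
}

// Assumption: ten minutes without a successful poll counts as stale.
const STALE_AFTER_MS = 10 * 60 * 1000;

// Lower score ranks higher; stale entries sink below everything else.
function score(entry: CatalogueEntry, now: number): number {
  if (now - entry.lastCheckedAt > STALE_AFTER_MS) return 3;
  return { ok: 0, degraded: 1, down: 2 }[entry.status];
}

// Returns a new array ordered by health, then by name for stable output.
function rankCatalogue(entries: CatalogueEntry[], now: number = Date.now()): CatalogueEntry[] {
  return [...entries].sort(
    (a, b) => score(a, now) - score(b, now) || a.name.localeCompare(b.name),
  );
}
```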

Privacy notes

Health responses must not leak:

  • Sample inputs or arguments.
  • User identifiers.
  • Internal IPs or hostnames beyond what is already public.

Especially for hosted MCP servers, treat the health endpoint as semi-public.
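
An allowlist is a simple way to enforce this: copy only the fields known to be safe and drop everything else by default. A minimal sketch; the field list and the name `sanitizeTool` are assumptions, and free-text fields such as `reason` still need care because they can carry details from upstream errors.

```typescript
// Fields considered safe to publish per tool; anything else is dropped.
const ALLOWED_TOOL_FIELDS = ['status', 'p50_ms', 'p95_ms', 'error_rate', 'reason'] as const;

// Copies only allowlisted fields from an internal per-tool record.
function sanitizeTool(raw: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of ALLOWED_TOOL_FIELDS) {
    if (key in raw) out[key] = raw[key];
  }
  return out;
}
```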

Common mistakes

  • One global status only — too coarse; per-tool granularity matters.
  • Stale metrics — don't return cached numbers older than a minute.
  • Returning sensitive data — sample inputs are tempting; do not include them.
  • No rate-limit signal — most outages start as rate-limit exhaustion.

Where this is heading

Two trends to expect: the health method lands as a 2027 MCP spec extension, and observability platforms ingest it natively. Implementing today gives you a head start.


© 2026 Loadout. Built on Angular 21 SSR.