Every web service exposes a health endpoint. MCP servers, mostly, do not. The result is hosts that cannot tell a struggling server from a dead one, and registries that cannot prune stale entries. Here is the health-check standard that is emerging in 2026 and the metrics worth exposing today.
Why MCP needs its own shape
A generic /health endpoint is not enough; MCP servers have concerns of their own:
- Tool readiness — server is up, but is its postgres connection ready?
- Per-tool health — one tool may be degraded while others work.
- Capability declaration — what does the server actually expose right now?
- Version and protocol — what MCP version, what server version, what schema?
The proposed shape (under discussion in the MCP spec working group at the time of writing) covers all four.
The proposed health method
A new MCP method, callable from the host:
```json
{
  "jsonrpc": "2.0",
  "method": "health",
  "id": 1
}
```
Response:
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "status": "ok",
    "server": { "name": "github-mcp", "version": "1.4.2" },
    "protocol": "mcp/1.0",
    "uptime_seconds": 3942,
    "tools": {
      "search_issues": { "status": "ok", "p95_ms": 142 },
      "create_issue": { "status": "degraded", "p95_ms": 1820, "reason": "rate-limited upstream" }
    },
    "dependencies": {
      "github_api": "ok",
      "rate_limit_remaining": 4500
    }
  }
}
```
The seven metrics worth exposing
1. Overall status
Enum: ok | degraded | down. Drives the host's red/yellow/green indicator.
2. Per-tool latency (p50, p95)
Hosts use this to decide which tool to surface first when there are alternatives.
3. Per-tool error rate (last 5 min)
Catches transient regressions before they propagate.
4. Upstream dependency status
Most MCP servers proxy something else; expose its status.
5. Rate-limit headroom
Both the server's own rate limit (against the host) and any upstream limits (GitHub API, Stripe).
6. Cache hit rate
For read-heavy servers; helps the host decide whether to cache further upstream.
7. Version and schema hash
Lets the host detect schema drift; can refuse to call a server if the schema changed.
Implementation in 30 lines
```typescript
const counters = { calls: 0, errors: 0, latencies: [] as number[] };

// Nearest-rank percentile over the recorded latencies.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.min(sorted.length - 1, Math.floor((p / 100) * sorted.length))];
}

export async function health() {
  const errorRate = counters.calls > 0 ? counters.errors / counters.calls : 0;
  return {
    status: errorRate > 0.05 ? 'degraded' : 'ok',
    server: { name: 'github-mcp', version: VERSION },
    uptime_seconds: Math.round(process.uptime()),
    tools: { /* per-tool roll-up, e.g. { status, p95_ms: percentile(latencies, 95) } */ },
    dependencies: await probeUpstream(),
  };
}
```
Wire counter updates into every tool handler. Trim the window once a minute so only the last five minutes survive. Done.
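That wiring can be sketched as a wrapper around each handler; `instrument`, `windows`, and `trim` are illustrative names, not part of any MCP SDK:

```typescript
type Sample = { t: number; ms: number; error: boolean };
const windows = new Map<string, Sample[]>();

// Wrap a tool handler so every call records latency and error status.
function instrument<A extends unknown[], R>(
  tool: string,
  handler: (...args: A) => Promise<R>,
): (...args: A) => Promise<R> {
  return async (...args: A) => {
    const start = Date.now();
    let error = false;
    try {
      return await handler(...args);
    } catch (e) {
      error = true;
      throw e;
    } finally {
      const samples = windows.get(tool) ?? [];
      samples.push({ t: Date.now(), ms: Date.now() - start, error });
      windows.set(tool, samples);
    }
  };
}

// Drop samples older than five minutes; call this from a once-a-minute timer.
function trim(now = Date.now()): void {
  for (const [tool, samples] of windows) {
    windows.set(tool, samples.filter((s) => now - s.t < 5 * 60_000));
  }
}
```

The `finally` block records even on throw, so the error rate and latency come from the same sample set.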
Host-side use
The host polls health every 30–60 seconds per server. Three actions:
- Drop unhealthy servers from the active tool menu.
- Surface degradation in the UI — Claude Desktop / Cursor get a status indicator per server.
- Inform routing — multiple servers offering the same tool? Pick the healthy one.
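A host-side sketch of the menu and routing actions, assuming the host has already polled and cached each server's latest health response (the `Server` shape here is illustrative, not a host API):

```typescript
type Health = {
  status: 'ok' | 'degraded' | 'down';
  tools?: Record<string, { status: string }>;
};
type Server = { name: string; health: Health };

// Drop unhealthy servers from the active tool menu.
function activeServers(servers: Server[]): Server[] {
  return servers.filter((s) => s.health.status !== 'down');
}

// Among servers offering the same tool, pick the healthiest one.
function routeTool(servers: Server[], tool: string): Server | undefined {
  const rank = { ok: 0, degraded: 1, down: 2 } as const;
  return servers
    .filter((s) => s.health.tools?.[tool])
    .sort((a, b) => rank[a.health.status] - rank[b.health.status])[0];
}
```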
Registry use
A self-hosted registry can pull health from registered servers and surface it in the catalogue. Stale servers get demoted. Healthy ones get promoted.
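One way a registry might rank its catalogue, as a sketch: healthy entries first, degraded below them, and anything whose last successful health poll is older than a cutoff (one hour here, an arbitrary choice) demoted to the bottom.

```typescript
type Entry = { name: string; status: 'ok' | 'degraded' | 'down'; lastSeen: number };

// One hour without a successful health poll counts as stale (arbitrary cutoff).
const STALE_MS = 60 * 60_000;

function rankCatalogue(entries: Entry[], now = Date.now()): Entry[] {
  const score = (e: Entry): number => {
    if (now - e.lastSeen > STALE_MS) return 3; // stale: bottom of the list
    return { ok: 0, degraded: 1, down: 2 }[e.status];
  };
  return [...entries].sort((a, b) => score(a) - score(b));
}
```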
Privacy notes
Health responses must not leak:
- Sample inputs or arguments.
- User identifiers.
- Internal IPs or hostnames beyond what is already public.
Especially for hosted MCP servers, treat the health endpoint as semi-public.
Common mistakes
- One global status only — too coarse; per-tool granularity matters.
- Stale metrics — don't return cached numbers older than a minute.
- Returning sensitive data — sample inputs are tempting; do not include them.
- No rate-limit signal — most outages start as rate-limit exhaustion.
Where this is heading
Two trends to expect: the health method lands as a 2027 MCP spec extension, and observability platforms ingest it natively. Implementing today gives you a head start.