Every web service exposes a health endpoint. MCP servers, mostly, do not. The result is hosts that cannot tell a struggling server from a dead one, and registries that cannot prune stale entries. Here is the health-check standard that is emerging in 2026 and the metrics worth exposing today.
Why MCP needs its own shape
A generic /health endpoint is not enough; MCP servers have concerns of their own:
- Tool readiness — server is up, but is its postgres connection ready?
- Per-tool health — one tool may be degraded while others work.
- Capability declaration — what does the server actually expose right now?
- Version and protocol — what MCP version, what server version, what schema?
The proposed shape (under discussion in the MCP spec working group at the time of writing) covers all four.
The proposed health method
A new MCP method, callable from the host:
```json
{
  "jsonrpc": "2.0",
  "method": "health",
  "id": 1
}
```
Response:
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "status": "ok",
    "server": { "name": "github-mcp", "version": "1.4.2" },
    "protocol": "mcp/1.0",
    "uptime_seconds": 3942,
    "tools": {
      "search_issues": { "status": "ok", "p95_ms": 142 },
      "create_issue": { "status": "degraded", "p95_ms": 1820, "reason": "rate-limited upstream" }
    },
    "dependencies": {
      "github_api": "ok",
      "rate_limit_remaining": 4500
    }
  }
}
```
The seven metrics worth exposing
1. Overall status
Enum: ok | degraded | down. Drives the host's red/yellow/green indicator.
2. Per-tool latency (p50, p95)
Hosts use this to decide which tool to surface first when there are alternatives.
3. Per-tool error rate (last 5 min)
Catches transient regressions before they propagate.
4. Upstream dependency status
Most MCP servers proxy something else; expose its status.
5. Rate-limit headroom
Both the server's own rate limit (against the host) and any upstream limits (GitHub API, Stripe).
6. Cache hit rate
For read-heavy servers; helps the host decide whether to cache further upstream.
7. Version and schema hash
Lets the host detect schema drift; can refuse to call a server if the schema changed.
Implementation in 30 lines
```typescript
const counters = { calls: 0, errors: 0, latencies: [] as number[] };

// Nearest-rank percentile over the recorded latencies.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.min(sorted.length - 1, Math.floor((p / 100) * sorted.length))];
}

export async function health() {
  const errorRate = counters.calls > 0 ? counters.errors / counters.calls : 0;
  return {
    status: errorRate > 0.05 ? 'degraded' : 'ok',
    server: { name: 'github-mcp', version: VERSION },
    uptime_seconds: Math.round(process.uptime()),
    tools: { /* per-tool roll-up, e.g. { status, p95_ms: percentile(latencies, 95) } */ },
    dependencies: await probeUpstream(),
  };
}
```
Wire counter updates into every tool handler. Trim the window once a minute so only the last five minutes survive. Done.
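That wiring can be sketched as a wrapper around each handler; `instrument`, `windows`, and `trim` are illustrative names, not part of any MCP SDK:

```typescript
type Sample = { t: number; ms: number; error: boolean };
const windows = new Map<string, Sample[]>();

// Wrap a tool handler so every call records latency and error status.
function instrument<A extends unknown[], R>(
  tool: string,
  handler: (...args: A) => Promise<R>,
): (...args: A) => Promise<R> {
  return async (...args: A) => {
    const start = Date.now();
    let error = false;
    try {
      return await handler(...args);
    } catch (e) {
      error = true;
      throw e;
    } finally {
      const samples = windows.get(tool) ?? [];
      samples.push({ t: Date.now(), ms: Date.now() - start, error });
      windows.set(tool, samples);
    }
  };
}

// Drop samples older than five minutes; call this from a once-a-minute timer.
function trim(now = Date.now()): void {
  for (const [tool, samples] of windows) {
    windows.set(tool, samples.filter((s) => now - s.t < 5 * 60_000));
  }
}
```

The `finally` block records even on throw, so the error rate and latency come from the same sample set.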
Host-side use
The host polls health every 30–60 seconds per server. Three actions:
- Drop unhealthy servers from the active tool menu.
- Surface degradation in the UI — Claude Desktop / Cursor get a status indicator per server.
- Inform routing — multiple servers offering the same tool? Pick the healthy one.
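A host-side sketch of the menu and routing actions, assuming the host has already polled and cached each server's latest health response (the `Server` shape here is illustrative, not a host API):

```typescript
type Health = {
  status: 'ok' | 'degraded' | 'down';
  tools?: Record<string, { status: string }>;
};
type Server = { name: string; health: Health };

// Drop unhealthy servers from the active tool menu.
function activeServers(servers: Server[]): Server[] {
  return servers.filter((s) => s.health.status !== 'down');
}

// Among servers offering the same tool, pick the healthiest one.
function routeTool(servers: Server[], tool: string): Server | undefined {
  const rank = { ok: 0, degraded: 1, down: 2 } as const;
  return servers
    .filter((s) => s.health.tools?.[tool])
    .sort((a, b) => rank[a.health.status] - rank[b.health.status])[0];
}
```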
Registry use
A self-hosted registry can pull health from registered servers and surface it in the catalogue. Stale servers get demoted. Healthy ones get promoted.
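One way a registry might rank its catalogue, as a sketch: healthy entries first, degraded below them, and anything whose last successful health poll is older than a cutoff (one hour here, an arbitrary choice) demoted to the bottom.

```typescript
type Entry = { name: string; status: 'ok' | 'degraded' | 'down'; lastSeen: number };

// One hour without a successful health poll counts as stale (arbitrary cutoff).
const STALE_MS = 60 * 60_000;

function rankCatalogue(entries: Entry[], now = Date.now()): Entry[] {
  const score = (e: Entry): number => {
    if (now - e.lastSeen > STALE_MS) return 3; // stale: bottom of the list
    return { ok: 0, degraded: 1, down: 2 }[e.status];
  };
  return [...entries].sort((a, b) => score(a) - score(b));
}
```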
Privacy notes
Health responses must not leak:
- Sample inputs or arguments.
- User identifiers.
- Internal IPs or hostnames beyond what is already public.
Especially for hosted MCP servers, treat the health endpoint as semi-public.
Common mistakes
- One global status only — too coarse; per-tool granularity matters.
- Stale metrics — don't return cached numbers older than a minute.
- Returning sensitive data — sample inputs are tempting; do not include them.
- No rate-limit signal — most outages start as rate-limit exhaustion.
Where this is heading
Two trends to expect: the health method lands as a 2027 MCP spec extension, and observability platforms ingest it natively. Implementing today gives you a head start.