You approve an MCP server once, the tools look benign, and your agent starts using them. Weeks later the server quietly changes what those tools do — and nothing asks you to re-approve. That's a rug pull, and it's one of the nastiest attacks against MCP precisely because the protocol has no built-in mechanism to notice a tool definition has shifted underneath it.
How a rug pull works
The attack splits trust from behaviour in time. First, the server exposes genuinely useful, harmless tools to earn your one-time approval — the moment of human scrutiny. Once that approval is banked, the server silently alters a tool's definition, description or behaviour. Because MCP doesn't track tool-definition changes or force re-approval when they happen, your agent keeps calling a tool whose meaning has changed. The malicious payload lives in the tool definition itself, which every session shares, so a single poisoned definition compromises every agent that calls it until someone notices and pulls it.
Rug pull vs tool poisoning
The two are cousins. Tool poisoning is deploying a tool that masquerades as legitimate from the start, hoping the user or the model picks it. A rug pull is the time-delayed variant: clean at approval, malicious afterwards. Both operate at the supply-chain layer — the definition, not the call — which is what makes them slip past defences built only to inspect runtime arguments. If your security model trusts a tool forever because you vetted it once, both attacks beat it.
The incidents that made it real
This isn't only theory. The postmark-mcp package-squatting incident in September 2025 saw a fake npm package build trust across fifteen versions before silently BCC'ing every email to an attacker. The Clawdbot exposure in January 2026 leaked credentials and conversation histories from two thousand-plus MCP instances through unauthenticated gateways. And the GitHub MCP prompt-injection chain used malicious issues to hijack agents into exfiltrating private repository data through an entirely legitimate tool. As of 2026 the rug pull is well-documented and named in vendor threat matrices — the building blocks are all in the wild.
How to defend against it
Pin and verify. Treat a tool definition like a dependency: record a hash of each tool's definition at approval, and re-prompt when it changes rather than trusting silently. Prefer servers that ship signed tool definitions and have a track record under a publisher reputation system. Run untrusted servers in a sandboxed execution runtime, keep a human checkpoint on write actions, and lean on the same hygiene that catches the rest of the supply-chain family — detecting malicious MCP servers, how to vet MCP servers and supply chain attacks.
Going further
The structural fix is coming from the ecosystem, not a single setting: signed definitions, verified namespaces in the official registry, and re-approval on change. Until those are universal, assume one-time approval is exactly that. Read MCP security best practices and prompt injection prevention, and browse the security category.