ML Engineer Loadout
4 MCP servers for managing experiments and data without context-switching
Last verified: 2026-04-24
Query training data from BigQuery, version code on GitHub, store experiment notes in Memory, and read your project files. A pragmatic loadout for engineers who want to spend less time switching between Jupyter and a dozen tabs.
Install ML Engineer Loadout
Paste into ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/you/ml"]
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_..." }
    },
    "bigquery": {
      "command": "npx",
      "args": ["-y", "bigquery-mcp"],
      "env": { "GOOGLE_APPLICATION_CREDENTIALS": "/Users/you/.gcp/service-account.json" }
    },
    "memory": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-memory"]
    }
  }
}

After pasting, fully restart your MCP client. Replace placeholders (paths, API keys) with your own values.
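One common tweak: the stock config only exposes /Users/you/ml. The reference filesystem server accepts multiple directory arguments, so if your datasets live elsewhere you can list each allowed root. A minimal sketch; the second path is illustrative:

"filesystem": {
  "command": "npx",
  "args": [
    "-y",
    "@modelcontextprotocol/server-filesystem",
    "/Users/you/ml",
    "/Users/you/datasets"
  ]
}

The server only serves files under the listed roots, so keep the set as narrow as your project allows.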
What's in this loadout
- Filesystem MCP Server (core)
  Read training scripts, notebooks, and model configs.
- GitHub MCP Server (core)
  Open PRs for new experiments, link runs to commits.
- BigQuery MCP (core)
  Pull training-data slices, compute distributions, build evaluation sets.
- Memory MCP Server (optional)
  Persist hyperparameter notes, eval results, and dataset versions across sessions.
What you can do with it
Compare two runs
I trained two models last week — run IDs 42 and 47, both tracked in /Users/me/ml/experiments. Read their configs, compare hyperparameters and final metrics, and tell me which one we should ship.

Build an eval set
Using bigquery, sample 10,000 user queries from last month's logs where the model returned a confidence below 0.5. Save them as a CSV in /Users/me/ml/eval-sets/low-confidence-2026-04.csv and commit through github.

Recall what we tried
From memory: what hyperparameter ranges have we already swept on the ranking model? List the ones we ruled out and why.
Client compatibility
- Claude Desktop 0.7.4: Tested ✓
- Cursor 0.42+: Tested ✓
- Windsurf 1.4: Tested ✓
- Continue 0.9: Untested
Variants
Heavy version
Adds Snowflake for teams on a different warehouse and a vector DB (Pinecone) for retrieval experiments.
Adds: snowflake-mcp, pinecone-mcp
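As a sketch, the two extra entries slot into the same mcpServers block as the main config, assuming both ship as npx-runnable packages like the others. The package names come from the Adds line above; the env variable names (SNOWFLAKE_ACCOUNT, SNOWFLAKE_USER, SNOWFLAKE_PASSWORD, PINECONE_API_KEY) are assumptions, flagged here because JSON carries no comments; check each server's README for the keys it actually expects.

"snowflake": {
  "command": "npx",
  "args": ["-y", "snowflake-mcp"],
  "env": {
    "SNOWFLAKE_ACCOUNT": "your-account",
    "SNOWFLAKE_USER": "your-user",
    "SNOWFLAKE_PASSWORD": "..."
  }
},
"pinecone": {
  "command": "npx",
  "args": ["-y", "pinecone-mcp"],
  "env": { "PINECONE_API_KEY": "..." }
}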
Lite version
Filesystem + GitHub. Just code and version control.
Removes: bigquery-mcp, @modelcontextprotocol/server-memory
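For reference, the Lite config is just the main block with those two entries deleted; nothing else changes:

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/you/ml"]
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_..." }
    }
  }
}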
FAQ — ML Engineer Loadout
What is the ML Engineer Loadout?
Query training data from BigQuery, version code on GitHub, store experiment notes in Memory, and read your project files. A pragmatic loadout for engineers who want to spend less time switching between Jupyter and a dozen tabs. It includes 4 MCP servers: Filesystem MCP Server, GitHub MCP Server, BigQuery MCP, Memory MCP Server.
Does this loadout work with Cursor and Windsurf?
Yes. The config block above is one JSON object that drops into Claude Desktop's, Cursor's, or Windsurf's MCP config. Each client uses the same mcpServers schema; only the file path differs.
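Typical locations, assuming default installs (these move between releases, so confirm against your client's documentation):
- Claude Desktop (macOS): ~/Library/Application Support/Claude/claude_desktop_config.json
- Cursor: ~/.cursor/mcp.json
- Windsurf: ~/.codeium/windsurf/mcp_config.json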
How long does setup take?
About 3 minutes the first time: copy the config, paste it into your client's MCP configuration file, replace placeholder API keys with your own, and restart the client.
Can I customise this loadout?
Yes. Use the Lite version if you want fewer servers, or the Heavy version to add 2 more for advanced workflows. You can also remove any individual server by deleting its entry from the JSON.
How much does it cost?
The loadout itself is free, and most listed servers are open-source and free to run. Servers that talk to paid services (BigQuery, or Pinecone in the Heavy variant) follow that vendor's pricing.
When was this loadout last verified?
Last verified on 2026-04-24. We re-test featured loadouts at least monthly and update the config when MCP servers ship breaking changes.