
How to Host a Remote MCP Server in 2026: A Practical Guide

OBTO Team · Insights from the Glass Box

If you've built an MCP server that works locally over stdio, you're maybe 30% of the way to something an agent can actually rely on. The hard part isn't the protocol — it's hosting: transport, auth, scaling, and the part almost everyone skips, knowing whether your server is actually working once agents start calling it.

That last part matters more than you'd think. An April 2026 analysis of 2,181 public remote MCP endpoints found that 52% were completely dead and only 9% were fully healthy. The rest limped along — slow responses, stale data, silent failures. Hosting an MCP server is easy. Hosting one that's still alive in three months is the real problem this guide addresses.

Local vs. remote: when you actually need hosting

Local stdio servers are fine for personal tooling — your own machine, your own Claude Desktop config. You need a remote server the moment any of these are true: multiple people or agents share the tools, the tools touch credentials you don't want on every laptop, or you want one canonical version instead of N drifting copies.

Remote means HTTP, and in 2026 that means Streamable HTTP. The older HTTP+SSE transport was deprecated in the 2025-03-26 spec revision and major providers are sunsetting it this year. If you're starting now, build Streamable HTTP only; if you're maintaining an SSE server, plan the migration before your clients drop support for you.

The four problems every remote MCP server must solve

1. Transport and session handling

Streamable HTTP uses a single endpoint that accepts POST and GET requests and can optionally answer with an SSE stream. It's simpler than the old HTTP+SSE transport's dual-endpoint dance, but session management is still on you: agents hold long conversations, and your server needs to either be stateless per call or persist session state somewhere that survives a restart. Serverless platforms make this awkward: cold starts are reported to break the MCP initialization handshake on platforms such as Lambda and Cloud Run.
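
As a rough illustration of the stateless-per-call option, the sketch below dispatches MCP JSON-RPC messages with a pure function, so any replica can answer any request, even right after a restart. The HTTP layer is left out, and the server name, the `echo` tool, and the pinned protocol version are illustrative assumptions, not part of any particular SDK.

```python
from typing import Optional

PROTOCOL_VERSION = "2025-03-26"  # assumption: pin whichever spec revision you target


def _echo(args: dict) -> str:
    """Hypothetical tool handler: return the given text unchanged."""
    return str(args.get("text", ""))


# Hypothetical tool registry: tool name -> metadata and handler.
TOOLS = {
    "echo": {
        "description": "Return the given text unchanged.",
        "inputSchema": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
        "handler": _echo,
    }
}


def handle_message(msg: dict) -> Optional[dict]:
    """Handle one JSON-RPC message without reading or writing session state."""
    method = msg.get("method")
    msg_id = msg.get("id")
    if msg_id is None:
        # Notifications (e.g. notifications/initialized) get no response body.
        return None

    if method == "initialize":
        result = {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {"tools": {}},
            "serverInfo": {"name": "example-server", "version": "0.1.0"},
        }
    elif method == "tools/list":
        result = {
            "tools": [
                {
                    "name": name,
                    "description": tool["description"],
                    "inputSchema": tool["inputSchema"],
                }
                for name, tool in TOOLS.items()
            ]
        }
    elif method == "tools/call":
        params = msg.get("params", {})
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return {
                "jsonrpc": "2.0",
                "id": msg_id,
                "error": {"code": -32602, "message": "Unknown tool"},
            }
        text = tool["handler"](params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        return {
            "jsonrpc": "2.0",
            "id": msg_id,
            "error": {"code": -32601, "message": "Method not found"},
        }

    return {"jsonrpc": "2.0", "id": msg_id, "result": result}
```

Because nothing here depends on what an earlier request did, a cold replica can serve a `tools/call` without first replaying the session. If your tools do need per-session state, this is the function that would have to load it from an external store.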

2. Authentication

The spec settled on OAuth 2.1 with Dynamic Client Registration, and major clients now require it for their connector ecosystems. Implementing this per-server is a genuine pain, which is why the emerging consensus is a gateway pattern: one component handles auth centrally and proxies authenticated requests to your servers. This also sidesteps the token-lifecycle mismatch where an agent session outlives the OAuth token that started it.
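
The gateway pattern boils down to one check that runs before any request is proxied. The sketch below is a minimal version under loud assumptions: the in-memory `TOKENS` store, the `check_request` function, and the `X-Client-Id` forwarding header are all hypothetical, and a real gateway would verify tokens against its authorization server instead of a dictionary.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class TokenInfo:
    client_id: str
    expires_at: float  # Unix timestamp


# Hypothetical in-memory token store, standing in for real token verification.
TOKENS: Dict[str, TokenInfo] = {}


def check_request(
    headers: Dict[str, str], now: Optional[float] = None
) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers). 200 means: proxy the request upstream."""
    now = time.time() if now is None else now
    scheme, _, token = headers.get("Authorization", "").partition(" ")

    if scheme.lower() != "bearer" or not token:
        # No credentials at all: ask the client to authenticate.
        return 401, {"WWW-Authenticate": "Bearer"}

    info = TOKENS.get(token)
    if info is None:
        return 401, {"WWW-Authenticate": 'Bearer error="invalid_token"'}

    if info.expires_at <= now:
        # The agent session outlived its token: signal that a refresh is needed.
        return 401, {
            "WWW-Authenticate": 'Bearer error="invalid_token", '
            'error_description="token expired"'
        }

    # Hypothetical header that passes the verified identity to the upstream server.
    return 200, {"X-Client-Id": info.client_id}
```

Handling expiry at the gateway means a long-running agent session gets a clear 401 it can react to, rather than each upstream server inventing its own failure behavior.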

3. Scaling and cost

MCP traffic is bursty and weird. An agent might call your tool once a day or 400 times in one planning loop. Scale-to-zero pricing sounds right until cold starts break handshakes; always-on containers fix that but cost money while idle. There's no universal answer — but you should know what a tool call costs you, because your agents' operators will eventually ask. (We've written before about how inference cost and speed interact in agent loops in Fast AI Inference — the same economics apply to tool calls.)
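
To make "know what a tool call costs you" concrete, here is a small arithmetic sketch comparing an always-on container with pay-per-call pricing. The function names are hypothetical and any prices you plug in are your own; nothing here reflects a specific provider's rates.

```python
def cost_per_call_always_on(monthly_cost: float, calls_per_month: int) -> float:
    """Effective cost of one call when you pay a flat monthly fee."""
    if calls_per_month <= 0:
        raise ValueError("calls_per_month must be positive")
    return monthly_cost / calls_per_month


def break_even_calls(monthly_cost: float, per_call_cost: float) -> float:
    """Monthly call volume above which the flat fee beats pay-per-call."""
    if per_call_cost <= 0:
        raise ValueError("per_call_cost must be positive")
    return monthly_cost / per_call_cost
```

With purely hypothetical numbers, a $20 per month container against $0.0004 per call breaks even at roughly 50,000 calls a month. Below that volume pay-per-call is cheaper on paper, which is exactly where the cold-start risk has to be weighed against the savings.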

4. Observability — the one everyone skips

This is why half the public MCP ecosystem is dead. A remote MCP server fails silently by default: the agent gets an error or garbage, retries or hallucinates around it, and no human ever sees a stack trace. At minimum you need per-call logging (which tool, which client, latency, success/failure), error rates over time, and some way to replay a failed call. If you can't answer "what did this agent's tool call actually do yesterday," you don't have a production server — you have a demo with a domain name.
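
The minimum described above (per-call logging, error rates, replay) fits in a few lines. This is a sketch, not a recommendation of a specific logging stack: `observed_call`, `error_rate`, and the in-memory `CALL_LOG` sink are hypothetical names, and a real deployment would ship entries to durable storage and redact sensitive inputs.

```python
import json
import time
from typing import Any, Callable, Dict, List

CALL_LOG: List[Dict[str, Any]] = []  # hypothetical sink; swap for your log pipeline


def observed_call(tool: str, client: str, fn: Callable[..., Any], /, **inputs: Any) -> Any:
    """Run a tool handler and record one structured entry for the call."""
    start = time.perf_counter()
    entry: Dict[str, Any] = {
        "tool": tool,
        "client": client,
        "inputs": inputs,  # kept so a failed call can be replayed later
        "ok": True,
        "error": None,
    }
    try:
        return fn(**inputs)
    except Exception as exc:
        entry["ok"] = False
        entry["error"] = f"{type(exc).__name__}: {exc}"
        raise
    finally:
        # Runs on success and failure alike, so no call goes unrecorded.
        entry["latency_ms"] = round((time.perf_counter() - start) * 1000, 3)
        CALL_LOG.append(entry)
        print(json.dumps(entry, default=str))


def error_rate(entries: List[Dict[str, Any]]) -> float:
    """Fraction of recorded calls that failed (0.0 for an empty log)."""
    if not entries:
        return 0.0
    return sum(1 for e in entries if not e["ok"]) / len(entries)
```

Replaying a failed call is then a matter of calling the handler again with the logged `inputs`, and alerting is a threshold on `error_rate` over a recent window.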

Your hosting options, honestly compared

DIY on serverless (Cloudflare Workers, Lambda, Cloud Run). Cheapest entry, real engineering cost. Cloudflare Workers is the most MCP-mature of these, with first-party guides and automatic TLS and scaling. You still own auth, observability, and session handling. Right choice if MCP infrastructure is your product.

DIY on containers (Azure Container Apps, Render, your own k8s). More control, no cold-start handshake issues, more ops burden. Right choice if you have a platform team and compliance requirements that demand it.

Managed MCP platforms. Hosting, auth, and the operational layer handled; you write tool logic. The trade-off to scrutinize is lock-in and opacity — can you see what's happening inside, and can you leave?

That second question is where we'll be transparent about our own stake: OBTO is in this third category, with two opinionated differences. First, everything is self-hostable — mcp.obto.co runs the same open stack you can run yourself, so "can you leave" is answered by design rather than by a sales call. Second, every tool call produces a Glass Receipt: a full, inspectable record of what was called, by whom, with what inputs, at what cost. That's our answer to the dead-endpoint problem — you can't fix a server you can't see into.

A 30-minute path to a hosted server

1. Define one tool well. Single responsibility, typed inputs, errors that an LLM can act on ("rate limited, retry after 60s" beats "Error 429").
2. Pick stateless if you can. If every call is self-contained, every hosting option gets easier.
3. Deploy behind central auth. Gateway pattern, OAuth 2.1 DCR at the edge; don't reimplement auth in every server.
4. Wire observability before traffic. Per-call logs and an error-rate alert. Day one, not month three.
5. Test from a real client. Claude, or any MCP-capable agent. The handshake is where deployments fail, and you won't see it from curl.
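
The "define one tool well" step is the easiest to show in code. The sketch below assumes a hypothetical order-lookup tool: `LookupInput`, `RateLimited`, and `call_tool` are illustrative names, and the backend is passed in as a plain callable so the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class LookupInput:
    """Typed input for a hypothetical single-purpose order-lookup tool."""
    order_id: str


class RateLimited(Exception):
    """Raised by the (hypothetical) backend when it is throttling requests."""

    def __init__(self, retry_after_s: int):
        super().__init__(f"rate limited, retry after {retry_after_s}s")
        self.retry_after_s = retry_after_s


def parse_input(arguments: dict) -> LookupInput:
    """Validate raw tool arguments into a typed input, or explain what is wrong."""
    order_id = arguments.get("order_id")
    if not isinstance(order_id, str) or not order_id:
        raise ValueError("order_id must be a non-empty string, e.g. 'A-1001'")
    return LookupInput(order_id=order_id)


def call_tool(arguments: dict, backend: Callable[[LookupInput], str]) -> dict:
    """Return an MCP-style tool result; failures become text the model can act on."""
    try:
        status = backend(parse_input(arguments))
        return {"content": [{"type": "text", "text": status}], "isError": False}
    except RateLimited as exc:
        text = f"Rate limited. Retry after {exc.retry_after_s}s."
    except ValueError as exc:
        text = f"Invalid input: {exc}"
    return {"content": [{"type": "text", "text": text}], "isError": True}
```

The design choice worth copying is that every failure path ends in a sentence telling the agent what to do next, such as waiting or fixing an argument, instead of a bare status code.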

On OBTO, steps 2–4 are the platform's job; the getting started guide walks through defining and deploying a first tool. The Builder tier is $0 with no card required, so the experiment costs you nothing but the 30 minutes — and pricing stays flat and published ($49 Team / $149 Business) rather than metered-with-surprises.

The bigger picture

Remote MCP servers are becoming the API layer of the agentic web — the thing your AI workforce actually touches when it acts on the world. The 52%-dead statistic isn't an indictment of MCP; it's what every young protocol's ecosystem looks like before operational discipline catches up. The teams that treat MCP servers as production infrastructure — authenticated, observed, accountable — are the ones whose agents will still be working in a year.

Host it like you mean it.