
How to Build an MCP Tool: A Practical Guide

OBTO Team · Insights from the Glass Box

The Model Context Protocol (MCP) has become a common way for AI agents to reach the outside world. Introduced by Anthropic in November 2024 and since adopted across major model providers, it gives a model a standard way to call your functions, read your data, and act on your systems, without bespoke glue code for every integration. If you have an API, a database, or an internal service you want an agent to use safely, building an MCP tool is how you expose it.

This guide walks through what an MCP server actually is, the three primitives you'll work with, how to choose a transport, a minimal working example, and what changes when you take it to production.

What an MCP server really is

An MCP server is a small, focused process that advertises a set of capabilities to an AI client over a standard protocol. Under the hood, MCP is built on JSON-RPC 2.0 and split into two layers: a data layer that defines the messages and primitives, and a transport layer that defines how those messages travel between client and server. Because the contract is standardized, any MCP-compatible client can talk to any MCP server. That interoperability is the whole point, and the reason the ecosystem grew to hundreds of public servers within a year.
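To make the data layer concrete, here is a rough sketch of the JSON-RPC 2.0 envelopes behind a single tool call. The `tools/call` method and the `content` result shape follow the MCP specification; the helper names (`makeToolCallRequest`, `makeTextResult`) and the example arguments are illustrative assumptions, not SDK functions.

```typescript
// Minimal JSON-RPC 2.0 envelopes as MCP uses them (a sketch, not the SDK's own types).
type JsonRpcRequest = {
  jsonrpc: "2.0";
  id: number | string;
  method: string;
  params?: Record<string, unknown>;
};

type JsonRpcResponse = {
  jsonrpc: "2.0";
  id: number | string;
  result?: unknown;
  error?: { code: number; message: string };
};

// Hypothetical helper: build the request a client sends to invoke a tool.
function makeToolCallRequest(
  id: number,
  name: string,
  args: Record<string, unknown>
): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Hypothetical helper: wrap text in the result shape a server sends back.
function makeTextResult(id: number | string, text: string): JsonRpcResponse {
  return { jsonrpc: "2.0", id, result: { content: [{ type: "text", text }] } };
}

const request = makeToolCallRequest(1, "search_orders", {
  customer_email: "ada@example.com",
});
const response = makeTextResult(request.id, "[]"); // the response echoes the request id
```

The transport layer only has to move these objects between the two sides; it does not change their shape.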

The mental model that helps most: you are not building an "AI feature." You are publishing a typed, discoverable interface to something you already have, and letting the model decide when to use it.

The three primitives

MCP gives a server exactly three first-class things to expose. Picking the right one for each capability is most of the design work.

Tools — executable actions

A tool is a function the model can call: "create an invoice," "search orders," "restart the build." Each tool declares a name, a description, and a JSON Schema for its inputs, so the client knows exactly how to call it and the model knows when it applies. Tools are where agents touch the real world, so they're also where you put your validation and guardrails.
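On the wire, a tool is advertised as a plain descriptor. The sketch below shows roughly what one entry in a `tools/list` response looks like, plus a deliberately tiny required-field check. `missingRequired` is a hypothetical helper for illustration and is no substitute for a real JSON Schema validator.

```typescript
// Shape of a tool as advertised to clients: name, description, JSON Schema inputs.
type ToolDescriptor = {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string; description?: string; enum?: string[] }>;
    required?: string[];
  };
};

const searchOrders: ToolDescriptor = {
  name: "search_orders",
  description: "Find a customer's orders by email. Use when the user asks about order history.",
  inputSchema: {
    type: "object",
    properties: {
      customer_email: { type: "string", description: "Customer's email address" },
      status: { type: "string", enum: ["open", "closed"] },
    },
    required: ["customer_email"],
  },
};

// Hypothetical guardrail: list the required inputs that a call left out.
function missingRequired(tool: ToolDescriptor, args: Record<string, unknown>): string[] {
  return (tool.inputSchema.required ?? []).filter((key) => !(key in args));
}
```

A call with no arguments would report `customer_email` as missing, which gives the server a specific reason to reject the call before doing any work.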

Resources — read-only data

A resource is addressable, read-only context the model can pull in: a file, a database row, a config document, a knowledge-base entry. Resources are for reading, not doing. Keeping reads separate from actions makes behavior easier to reason about and to audit.
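As a sketch of the read path, a resource read boils down to a lookup by URI that returns content and changes nothing. The `contents` result shape follows the MCP specification; the `config://app/settings` URI, the in-memory `documents` map, and `readResource` are assumptions made for illustration.

```typescript
// Result shape for reading a resource: one or more content items keyed by URI.
type ResourceContents = { uri: string; mimeType?: string; text: string };
type ReadResourceResult = { contents: ResourceContents[] };

// Hypothetical in-memory store standing in for files, rows, or documents.
const documents = new Map<string, string>([
  ["config://app/settings", JSON.stringify({ region: "us-east-1", retries: 3 })],
]);

// Read-only lookup: it never mutates `documents`, it only returns what is there.
function readResource(uri: string): ReadResourceResult {
  const text = documents.get(uri);
  if (text === undefined) {
    throw new Error(`Unknown resource: ${uri}`);
  }
  return { contents: [{ uri, mimeType: "application/json", text }] };
}
```

Because the function has no write path, an audit of what the agent could have changed never needs to look at it.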

Prompts — reusable templates

A prompt is a parameterized template the server offers to the client — a vetted starting point for a common task ("summarize this incident," "draft a release note"). Prompts let you ship known-good interaction patterns instead of hoping each user reinvents them.
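One way to picture a prompt is as a function from arguments to ready-to-send messages. The `messages` result shape follows the MCP specification; the template wording, the argument names, and `summarizeIncidentPrompt` are hypothetical.

```typescript
// Result shape for fetching a prompt: a description plus ready-to-send messages.
type PromptMessage = {
  role: "user" | "assistant";
  content: { type: "text"; text: string };
};
type GetPromptResult = { description?: string; messages: PromptMessage[] };

// Hypothetical template: the server owns the wording, the client supplies arguments.
function summarizeIncidentPrompt(args: { incidentId: string; audience: string }): GetPromptResult {
  return {
    description: "Summarize an incident for a given audience",
    messages: [
      {
        role: "user",
        content: {
          type: "text",
          text:
            `Summarize incident ${args.incidentId} for ${args.audience}. ` +
            "Cover impact, root cause, and follow-up actions in under 200 words.",
        },
      },
    ],
  };
}
```

The vetted wording lives in one place on the server, so improving it improves every client that uses the prompt.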

One useful distinction: a few capabilities don't live on the server at all. Sampling, for instance, is the server asking the client's model to complete a prompt — control flows back to the host. Knowing what belongs on the server versus the client keeps your design clean.

Choosing a transport

MCP standardizes on two official transports, and the right choice is mostly about where the server runs:

stdio — the server runs as a local subprocess and communicates over standard input/output. No network, no auth overhead, lowest latency. Ideal for local developer tools and desktop clients.
Streamable HTTP — the server runs remotely and accepts HTTP POST requests, with optional Server-Sent Events for streaming responses. This is the path for hosted, multi-user tools, and it supports standard HTTP auth: bearer tokens, API keys, and custom headers.

A practical rule: prototype locally with stdio, then move to Streamable HTTP when the tool needs to be shared, secured, and scaled. (A newer proposal worth watching is the Server Card: a metadata document served from a well-known path such as /.well-known/mcp.json so clients can discover a server's capabilities and auth requirements before connecting.)
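For a feel of what the stdio transport does, the specification frames each JSON-RPC message as a single line of JSON terminated by a newline. The sketch below is a simplified illustration of that framing; `splitMessages` and `frameMessage` are hypothetical helpers, and the SDK's transports handle this for you.

```typescript
// Splits buffered input into complete JSON messages plus any unfinished tail.
function splitMessages(buffer: string): { messages: unknown[]; rest: string } {
  const lines = buffer.split("\n");
  const rest = lines.pop() ?? ""; // last piece has no newline yet, so keep buffering it
  const messages = lines
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as unknown);
  return { messages, rest };
}

// Serializes one message for the wire; JSON.stringify escapes embedded newlines.
function frameMessage(message: unknown): string {
  return JSON.stringify(message) + "\n";
}

// One complete message followed by the start of a second one.
const chunk =
  frameMessage({ jsonrpc: "2.0", id: 1, method: "ping" }) + '{"jsonrpc":"2.0","id":2';
const { messages, rest } = splitMessages(chunk);
// `messages` holds the complete ping request; `rest` holds the partial second message
```

Streamable HTTP carries the same JSON-RPC payloads in HTTP request and response bodies, which is why moving between transports does not require rewriting your tools.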

A minimal tool

Here's the shape of a single tool using the official TypeScript SDK (Node). The details vary by language, but the contract — name, description, typed inputs, a handler that returns a result — is the same everywhere.

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import { db } from "./db.js"; // hypothetical data-access module; substitute your own client

const server = new McpServer({ name: "orders-server", version: "1.0.0" });

server.registerTool(
  "search_orders",
  {
    // Written for the model: what the tool does and when to call it
    description: "Find a customer's orders by email. Use for order-history questions; returns open orders unless status is set.",
    inputSchema: {
      customer_email: z.string().email(),                 // typed + validated before any side effect
      status: z.enum(["open", "closed"]).default("open"), // an omitted status falls back to "open"
    },
  },
  async ({ customer_email, status }) => {
    const orders = await db.queryOrders({ email: customer_email, status });
    // Tool results are returned as content items; here, a single JSON text item
    return { content: [{ type: "text", text: JSON.stringify(orders) }] };
  }
);

// Connect over stdio; swap in StreamableHTTPServerTransport to go remote
await server.connect(new StdioServerTransport());

Three things make this a good tool rather than just a working one. The description is written for a model to read — it says what the tool does and when to use it. The input schema is explicit and typed, so bad calls fail fast. And validation happens before any real-world side effect, because the model will eventually call it with something you didn't expect.

Designing tools agents use well

Most MCP frustration is design, not code. A few principles that pay off:

One tool, one job. Narrow, well-named tools are easier for a model to choose correctly than a single do-everything endpoint with a mode flag.
Write descriptions for the model. The description and parameter docs are the model's only guide to when and how to call your tool. Treat them as part of the interface, not an afterthought.
Return structured, bounded results. A tool that can return 40,000 rows will eventually blow a context window. Paginate, summarize, or cap by default.
Fail loudly and specifically. Clear errors let a capable agent recover; vague ones send it in circles.
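The last two principles can be sketched together. The handler below caps every page, returns a cursor for the next one, and reports a bad cursor as a specific tool error. The `isError` flag on a tool result comes from the MCP specification; `listOrdersPage`, the cursor scheme, and the cap of 50 are assumptions chosen for illustration.

```typescript
type ToolResult = { content: { type: "text"; text: string }[]; isError?: boolean };

const MAX_PAGE_SIZE = 50; // assumed cap; tune to your data and context budget

// Hypothetical handler: returns one bounded page plus a cursor for the next.
function listOrdersPage(allOrders: string[], cursor = 0, limit = 20): ToolResult {
  if (!Number.isInteger(cursor) || cursor < 0 || cursor > allOrders.length) {
    // Specific, actionable error so the agent can correct its next call
    return {
      isError: true,
      content: [
        {
          type: "text",
          text: `Invalid cursor ${cursor}: expected an integer from 0 to ${allOrders.length}.`,
        },
      ],
    };
  }
  // Clamp the requested size into the range 1..MAX_PAGE_SIZE
  const size = Math.min(Math.max(1, Math.trunc(limit) || 1), MAX_PAGE_SIZE);
  const page = allOrders.slice(cursor, cursor + size);
  const nextCursor = cursor + size < allOrders.length ? cursor + size : null;
  return {
    content: [
      {
        type: "text",
        text: JSON.stringify({ orders: page, total: allOrders.length, nextCursor }),
      },
    ],
  };
}
```

Even if the model asks for 1,000 rows, it gets at most 50 along with enough information to ask for the next page.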

From localhost to production

Getting a tool working is the easy part. Running it for real means handling authentication and authorization, rate limiting, logging, versioning, and — critically — visibility into what the tool was actually asked to do and what it did in response. The last point is easy to skip and expensive to add later: when an agent does something surprising, you need a record of every call, its arguments, and its result, or you're debugging blind. We covered that discipline in depth in our guide to AI agent observability.
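A minimal sketch of that record-keeping, assuming you wrap each handler yourself: the wrapper below captures the tool name, arguments, outcome, and duration for every call. `withAudit`, `AuditRecord`, and the `sink` callback are hypothetical names, not part of any SDK or of OBTO's product.

```typescript
type ToolHandler<A, R> = (args: A) => Promise<R>;

type AuditRecord = {
  tool: string;
  args: unknown;
  ok: boolean;
  result?: unknown;
  error?: string;
  durationMs: number;
  at: string; // ISO timestamp of when the call started
};

// Hypothetical wrapper: records every call, its arguments, and its outcome.
function withAudit<A, R>(
  tool: string,
  handler: ToolHandler<A, R>,
  sink: (record: AuditRecord) => void
): ToolHandler<A, R> {
  return async (args: A) => {
    const started = Date.now();
    const at = new Date(started).toISOString();
    try {
      const result = await handler(args);
      sink({ tool, args, ok: true, result, durationMs: Date.now() - started, at });
      return result;
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      sink({ tool, args, ok: false, error, durationMs: Date.now() - started, at });
      throw err; // still surface the failure to the caller
    }
  };
}
```

In practice the sink would write to durable storage, and arguments that may contain secrets or personal data would need redaction before they are logged.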

This is the part of the problem OBTO is built to absorb. Hosting an MCP tool on OBTO means it runs as a secured, observable endpoint by default: every invocation produces a Glass Receipt — a structured record of the call, its arguments, its result, and its token cost — so the tool is auditable from the first request. And because OBTO is open and containerized on Kubernetes by default, you can run that same tool on our cloud or take it and deploy it on your own servers. As the manifesto puts it: you see everything, you own everything, and you can leave anytime.

Where to start

If you're building your first MCP tool, a sensible path: pick one capability you already expose through an API, model it as a single well-described tool, run it locally over stdio, and exercise it from a real client. Once the contract feels right, move it to Streamable HTTP, add auth and observability, and only then expand the surface area. Our getting-started guide walks through hosting a tool on OBTO end to end, the AI workforce overview shows how multiple tools compose into real workflows, and transparent, published pricing means the cost of running them is something you can see before you commit.

MCP turned integration from a custom project into a standard. The teams that win with agents this year are the ones who expose their systems as clean, well-described, observable tools — and that work starts with the first one.