
What Is Model Context Protocol

The Problem MCP Was Designed to Solve in AI Development

Before MCP existed, connecting an AI model to an external system — a database, a version control API, a ticketing system — required one of two uncomfortable compromises. Either you stuffed all relevant data into the context window before each inference call (expensive, fragile, and bounded by token limits), or you built a bespoke integration layer: a custom tool-calling harness, hand-rolled JSON schemas, model-specific prompt wrappers, and glue code that broke every time either the model provider or the target API changed.

This is not a minor inconvenience. Consider a realistic mid-size engineering team that wants its AI coding assistant to read open Jira tickets, check the current Git branch status, query Sentry for recent errors, and look up relevant Confluence pages — all in one session. Without a standard, each of these integrations must be written, tested, and maintained independently. Swap the AI provider (say, from OpenAI to Anthropic) and you rewrite the integrations. Upgrade the Jira schema and you chase down broken JSON deserializers. The integration matrix grows multiplicatively: M providers times N services means M × N bespoke integrations to build and maintain.

The deeper problem is architectural: the AI model has no standardized way to express "I need to call out to something external" and receive a structured, typed result back within a well-defined protocol boundary. OpenAI's function calling, Anthropic's tool use, Google's function declarations — each is a valid approach but they are not interoperable. An MCP server written once works across all hosts that implement the MCP client protocol, regardless of which model powers them.

MCP was designed by Anthropic and released as an open specification in late 2024 to solve exactly this: define a universal, model-agnostic protocol for connecting AI agents to external data sources and tools. The analogy used in the specification is instructive — MCP is to AI tools what LSP (Language Server Protocol) is to code intelligence. LSP means a Go language server works in VS Code, Neovim, and Emacs without being rewritten for each editor. MCP means a GitHub integration server works with Claude, GPT-4, Gemini, and any future model without per-model rewrites.

The practical consequences for a senior developer are significant. You build your integration once, as an MCP server. Clients negotiate capabilities at connection time. The protocol handles transport, serialization, error propagation, and capability advertisement. Your integration code does not need to know which model will consume it.
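
To make this concrete, the following is a minimal sketch of such a server using the FastMCP helper from the official Python SDK; the server name and the trigger_deployment tool are hypothetical:

# Minimal MCP server sketch. "deployment-api" and trigger_deployment
# are illustrative names, not part of any real server.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("deployment-api")

@mcp.tool()
def trigger_deployment(env: str) -> str:
    """Trigger a deployment to the named environment."""
    # A real implementation would call your internal deployment API here.
    return f"deployment to {env} started"

if __name__ == "__main__":
    mcp.run()  # stdio transport by default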

Tips
- When evaluating whether MCP fits your use case, ask: "Would I need to rewrite this integration if I changed AI providers?" If yes, MCP is likely the right abstraction.
- The LSP analogy is the best mental model to anchor on. If you have experience building or using LSP servers, the MCP client/server architecture will feel familiar within minutes.
- Review the history of bespoke AI tool integrations on your current project before adopting MCP — the migration pain is usually lower than expected because MCP servers are thin wrappers over existing APIs.
- Read the changelog sections of the MCP specification before assuming feature parity across host implementations. Not all hosts implement all protocol capabilities.


Core MCP Concepts: Resources, Tools, Prompts, and Sampling

The MCP specification defines four first-class primitives. Understanding their semantics precisely is important because misusing them leads to agent behavior that is brittle, unpredictable, or insecure.

Resources represent externally readable data that the server exposes. A resource has a URI scheme (e.g., file://, github://, postgres://), a MIME type, and content that is either text or binary. Resources are analogous to GET endpoints — they are read-only from the agent's perspective, and the server controls what they expose. A resource might be a file, a database row, a rendered Confluence page, or a live feed of recent Sentry events. The critical design point: resources do not execute logic. They surface data. When an agent reads a resource, it is pulling state into its context, not triggering a side effect.

Resource URI examples:
  file:///repo/src/auth/session.ts
  github://repos/acme/api/pulls/open
  postgres://mydb/public/users?limit=50
  sentry://projects/backend/issues?status=unresolved&limit=20
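
Using the FastMCP helper shown earlier, a resource handler is a plain function bound to a URI template; the config:// scheme and file layout here are hypothetical:

# Read-only resource sketch: surfaces data, executes no side effects.
@mcp.resource("config://deployment/{env}")
def deployment_config(env: str) -> str:
    """Return the deployment configuration for an environment."""
    with open(f"configs/{env}.yaml") as f:
        return f.read()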

Tools are the opposite of resources in one critical dimension: they are allowed to have side effects. A tool call may write to a database, open a PR, send a Slack message, restart a service, or deploy a build. Tools are defined with a JSON Schema for their input parameters, and the server validates inputs before execution. The agent's host is responsible for deciding whether to auto-approve or require user confirmation before executing a tool — this is a host-level policy, not a protocol-level one, which is an important security boundary discussed later.

// Example tool definition returned by a GitHub MCP server
{
  "name": "create_pull_request",
  "description": "Open a new pull request on a GitHub repository",
  "inputSchema": {
    "type": "object",
    "properties": {
      "owner": { "type": "string", "description": "Repository owner (user or organization)" },
      "repo": { "type": "string", "description": "Repository name" },
      "title": { "type": "string", "description": "Pull request title" },
      "head": { "type": "string", "description": "Branch containing the changes" },
      "base": { "type": "string", "description": "Branch to merge into" },
      "body": { "type": "string", "description": "Pull request description (Markdown)" }
    },
    "required": ["owner", "repo", "title", "head", "base"]
  }
}

Prompts are server-defined prompt templates that the host can surface to users or compose into agent workflows. This is the least commonly implemented primitive but one of the most powerful for workflow automation. A Prompts response from an MCP server includes structured message templates with slot arguments — think of them as parameterized system prompts that encode domain expertise. A Jira MCP server might expose a triage_bug_prompt that takes an issue ID and returns a structured prompt guiding the model through root cause analysis using known project conventions.
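
A sketch of the hypothetical triage_bug_prompt using the Python SDK's prompt decorator:

# Prompt template sketch; the triage wording is illustrative.
@mcp.prompt()
def triage_bug_prompt(issue_id: str) -> str:
    """Guide the model through root cause analysis for a Jira issue."""
    return (
        f"Investigate issue {issue_id}. Reproduce the failure, identify "
        "the first bad commit, and propose a fix that follows our project conventions."
    )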

Sampling is the most architecturally interesting primitive. It allows the MCP server itself to request that the host perform an LLM inference call. This inverts the typical flow: rather than agent → tool call → server → response, sampling allows server → inference request → agent → result → server. This enables agentic sub-tasks to be embedded in server-side logic, which is powerful for complex multi-step automation but demands careful thinking about recursion depth, cost, and trust boundaries.
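
On the wire, a sampling request is a server-initiated sampling/createMessage call, which the client answers with the completion result; the message content below is illustrative:

Server → Client:
{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": { "type": "text", "text": "Summarize these Sentry events into a root-cause hypothesis." }
      }
    ],
    "maxTokens": 400
  }
}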

Tips
- Enforce the resource/tool distinction rigorously in your own MCP servers. Resource reads should be side-effect-free and idempotent. If your "resource" has a side effect, it is actually a tool — model it that way so hosts can apply correct confirmation policies.
- When writing tool input schemas, use description fields on every property, not just the object. Models use these descriptions to choose correct argument values — sparse schemas produce poor tool calls.
- Avoid using Sampling in your servers unless you have a concrete need for server-initiated inference. It increases the attack surface and complicates the trust model significantly for most use cases.
- Prompt templates are underutilized. If your team has domain-specific investigation runbooks, encoding them as MCP Prompts lets every AI session benefit from them without copy-pasting into system prompts.


How MCP Compares to Direct API Calls and Function Calling

Three integration patterns dominate AI-to-external-system connectivity in 2025: direct API calls embedded in agent code, model-native function/tool calling, and MCP. Each occupies a distinct position on the trade-off surface.

Direct API calls (the agent code makes HTTP requests itself, either by generating curl commands or by having the runtime execute requests) provide maximum flexibility but zero standardization. There is no capability negotiation, no typed interface, no schema validation. The model guesses parameter names from documentation embedded in the system prompt. Errors surface as raw HTTP responses that the model must interpret. This works for one-off experiments but is not a foundation for production workflows.

Model-native function calling (OpenAI function calling, Anthropic tool use, Gemini function declarations) is a significant improvement. The model declares intent to call a tool in a structured JSON envelope. The host validates and executes the call. However, the tool definitions are written into the agent's configuration at startup. Adding a new tool requires redeploying the agent. The tool definitions are model-specific in their schema encoding nuances. There is no standard for tool discovery at runtime or capability versioning.

MCP addresses the gaps that model-native function calling leaves open:

Dimension                  Direct API Call    Model-Native Tool Calling    MCP
Schema validation          None               At call time                 At definition and call time
Runtime tool discovery     No                 No                           Yes (tools/list)
Capability versioning      No                 No                           Yes (protocol negotiation)
Transport abstraction      No                 No                           Yes (stdio, SSE, HTTP)
Multi-model portability    No                 No                           Yes
Side-effect policy         None               Host-dependent               Host-controlled, protocol-guided
Resource streaming         No                 No                           Yes

The key distinction for a senior developer evaluating when to use MCP vs model-native function calling: if your tools need to be shared across different agent frameworks, models, or teams, MCP is the right layer. If you are building a single-model, single-framework integration that will never be reused, model-native function calling has lower setup overhead.

A practical example: a team running Claude Code, Cursor, and a CI pipeline LLM automation all need access to the same internal deployment API. Without MCP, each environment gets its own integration implementation. With MCP, one deployment MCP server is written once and all three clients connect to it.


Registering the server with Claude Code's CLI:

claude mcp add deployment-api -- ./deployment-mcp-server

Declaring it in a JSON client configuration, for hosts that read an mcpServers block (the URL is illustrative):

{
  "mcpServers": {
    "deployment-api": {
      "url": "http://localhost:8080/mcp"
    }
  }
}

Calling it from Python with the official SDK's stdio client (a sketch; run inside an async function):

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server = StdioServerParameters(command="./deployment-mcp-server")
async with stdio_client(server) as (read, write):
    async with ClientSession(read, write) as session:
        await session.initialize()
        result = await session.call_tool("trigger_deployment", {"env": "staging"})

Tips
- Do not treat MCP as a replacement for model-native function calling in existing single-model projects. The migration cost is only justified when cross-client reuse or runtime tool discovery is needed.
- When benchmarking MCP against direct API calls for latency-sensitive workloads, measure round-trip time including serialization. MCP adds one protocol layer but saves the token cost of embedding raw documentation in the system prompt.
- The tools/list capability is particularly valuable for large MCP servers with many tools — let the agent discover relevant tools dynamically rather than loading all tool schemas upfront, which wastes context tokens. A sample exchange is shown after these tips.
- For internal enterprise MCP servers, lean into JSON Schema validation on inputs aggressively. Production incidents caused by malformed tool calls are significantly reduced when the MCP server rejects bad inputs before executing them.
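
For reference, here is what a minimal tools/list exchange looks like; the server's tool inventory below is illustrative:

Client → Server:
{ "jsonrpc": "2.0", "method": "tools/list", "id": 2 }

Server → Client:
{
  "jsonrpc": "2.0",
  "result": {
    "tools": [
      {
        "name": "trigger_deployment",
        "description": "Trigger a deployment to a named environment",
        "inputSchema": {
          "type": "object",
          "properties": { "env": { "type": "string" } },
          "required": ["env"]
        }
      }
    ]
  },
  "id": 2
}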


Official MCP Specification: Key Versions and What Changed

The MCP specification is managed as an open document at modelcontextprotocol.io and versioned using a date-based scheme. Understanding the version history matters in practice because host implementations often lag the specification, and certain features you rely on may not be available in older MCP clients.

2024-11-05 (Initial Release): The first publicly released version. Established the core primitives: Resources, Tools, Prompts, and the JSON-RPC 2.0 message framing. Defined stdio and SSE as the two supported transports. Introduced the capability negotiation handshake (initialize / initialized). This version was intentionally minimal — a sufficient base for the ecosystem to form around, not a feature-complete specification.

2025-03-26: A significant revision that introduced Streamable HTTP as a replacement for the original SSE transport. SSE over HTTP has a structural limitation: it is unidirectional (server to client only), which requires maintaining a separate HTTP POST endpoint for client-to-server messages. Streamable HTTP resolves this by using a single bidirectional HTTP connection where both request and response bodies can be streamed. This version also formalized OAuth 2.1 as the authentication mechanism for remote MCP servers, replacing the ad-hoc token header patterns that proliferated in the 2024-11-05 ecosystem. Batch request support was also added, enabling multiple JSON-RPC calls in a single HTTP round-trip.

Client → Server:
{
  "jsonrpc": "2.0",
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-03-26",
    "capabilities": {
      "roots": { "listChanged": true },
      "sampling": {}
    },
    "clientInfo": { "name": "claude-code", "version": "1.2.0" }
  },
  "id": 1
}

Server → Client:
{
  "jsonrpc": "2.0",
  "result": {
    "protocolVersion": "2025-03-26",
    "capabilities": {
      "tools": { "listChanged": true },
      "resources": { "subscribe": true, "listChanged": true },
      "prompts": { "listChanged": true }
    },
    "serverInfo": { "name": "github-mcp", "version": "0.9.1" }
  },
  "id": 1
}

The listChanged capability flags are important for runtime behavior. When a server sets tools.listChanged: true, it signals that it will emit notifications/tools/list_changed events when its tool inventory changes. Hosts that handle these notifications can dynamically update their available tool set without reconnecting. Hosts that do not implement this notification handler will have a stale view of available tools.
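
The notification itself is an id-less JSON-RPC message:

Server → Client:
{ "jsonrpc": "2.0", "method": "notifications/tools/list_changed" }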

What to check when evaluating an MCP server's spec compliance: look for the protocolVersion field in the server's initialize response. Servers that respond with 2024-11-05 may not support Streamable HTTP, OAuth, or batch requests. This matters if you are building a remote MCP server deployment — the 2025-03-26 OAuth flow is substantially more secure than passing raw API keys in headers, which was the de facto pattern in the initial release.
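
With the official Python SDK, the negotiated version is available on the result of initialize. A minimal sketch, assuming the SDK's InitializeResult shape:

# Log the negotiated protocol version during the handshake.
init_result = await session.initialize()
print(f"negotiated MCP protocol version: {init_result.protocolVersion}")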

For the resources.subscribe capability: when true, clients can send resources/subscribe requests to receive push notifications when a specific resource's content changes. This is underutilized in current implementations but is the correct mechanism for building reactive agent workflows (e.g., watching a file tree for changes and triggering re-analysis automatically).
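
A subscription and the corresponding update notification look like this (the URI is illustrative):

Client → Server:
{
  "jsonrpc": "2.0",
  "method": "resources/subscribe",
  "params": { "uri": "file:///repo/src/auth/session.ts" },
  "id": 3
}

Server → Client (later, when the resource content changes):
{
  "jsonrpc": "2.0",
  "method": "notifications/resources/updated",
  "params": { "uri": "file:///repo/src/auth/session.ts" }
}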

Tips
- Always log the negotiated protocolVersion from the initialize response when building or debugging an MCP integration. Version mismatches are the first thing to check when features behave unexpectedly.
- If you are building a new remote MCP server today (May 2026), target the 2025-03-26 spec and implement Streamable HTTP and OAuth 2.1. The SSE transport has maintenance overhead and is no longer the recommended path for remote deployments.
- Subscribe to the modelcontextprotocol/specification GitHub repository to receive notifications about spec updates. Breaking changes are rare but transport and auth changes have implementation implications across your entire MCP server fleet.
- When writing integration tests for an MCP server, test capability negotiation explicitly — assert that the server's initialize response contains the capability flags your client depends on, not just that tool calls succeed.