MCP Architecture and Primitives
The client-host-server architecture
MCP uses a layered architecture with three roles: hosts, clients, and servers.
Hosts are applications users interact with directly—Claude Code, Cursor, or any application embedding AI capabilities. Hosts create and manage client instances, control permissions, enforce security policies, and coordinate AI interactions.
Clients connect hosts to servers. Each client talks to exactly one server—a host running Claude Code might spawn three clients: one for PostgreSQL, one for GitHub, one for Jira. Clients handle protocol negotiation, route messages, and manage subscriptions.
Servers provide capabilities. A PostgreSQL server exposes queries; a GitHub server exposes repository operations; a filesystem server exposes file access. Servers operate independently and expose capabilities through MCP's primitives: tools, resources, and prompts.
This architecture enforces isolation. Servers cannot see each other. A database server has no knowledge of what the GitHub server is doing. Context flows through the client, which mediates all interactions. The host maintains the global view.
The one-to-one client-server relationship simplifies security: each connection can be managed, logged, and permissioned independently.
The four primitives
MCP defines four primitives that structure all communication between clients and servers. Three are server-side: tools, resources, and prompts. One is client-side: sampling.
Each primitive has a different control model—who decides when to use it.
| Primitive | Provider | Control Model | Purpose |
|---|---|---|---|
| Tools | Server | Model-controlled | Executable actions |
| Resources | Server | Application-controlled | Data access |
| Prompts | Server | User-controlled | Interaction templates |
| Sampling | Client | Server-initiated | LLM completions |
The control model matters more than it might seem. The agent decides when to call tools. The application decides when to fetch resources. Users decide when to activate prompts. Servers decide when to request sampling.
Tools: executable actions
Tools are functions that servers expose for agents to call. When Claude Code determines it needs to query a database, it calls a tool. When it needs to create a GitHub pull request, it calls a tool. Tools represent actions that change state or retrieve information.
A tool definition includes a name, description, and input schema:
```json
{
  "name": "execute_query",
  "description": "Execute a SQL query against the database",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "SQL query to execute"
      },
      "database": {
        "type": "string",
        "description": "Target database name"
      }
    },
    "required": ["query"]
  }
}
```

The input schema uses JSON Schema, which agents interpret to construct valid invocations. Clear descriptions matter: agents rely on them to decide when and how to use tools.
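A server can enforce that schema before executing anything. A minimal sketch in Python, using hand-rolled checks for required fields and string types rather than a full JSON Schema validator (the function name is illustrative):

```python
def validate_arguments(schema: dict, arguments: dict) -> list:
    """Return human-readable errors for a tools/call arguments payload."""
    errors = []
    # Every field listed in "required" must be present.
    for field in schema.get("required", []):
        if field not in arguments:
            errors.append("missing required field: " + field)
    # Supplied fields must be declared and match their declared type.
    for name, value in arguments.items():
        prop = schema.get("properties", {}).get(name)
        if prop is None:
            errors.append("unexpected field: " + name)
        elif prop.get("type") == "string" and not isinstance(value, str):
            errors.append(name + " must be a string")
    return errors

schema = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "database": {"type": "string"},
    },
    "required": ["query"],
}
```

A valid payload yields an empty error list; omitting `query` yields a single missing-field error.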
Invocation is direct: the client sends tools/call with the tool name and arguments; the server executes the tool and returns a result.
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "execute_query",
    "arguments": {
      "query": "SELECT * FROM users LIMIT 10",
      "database": "production"
    }
  }
}
```

Results contain content blocks that can be text, images, or structured data:
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "| id | name | email |\n|-----|------|-------|\n| 1 | Alice | alice@example.com |"
      }
    ],
    "isError": false
  }
}
```

The isError flag distinguishes between tool execution failures and protocol errors.
A failed database query returns isError: true with an error message in the content.
A malformed request returns a protocol-level error.
Tool descriptions directly affect agent behavior. Vague descriptions lead to misuse. Write descriptions that answer: what does this tool do, when should it be used, and what are its limitations?
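The two failure paths can be sketched in one handler. The handler shape and the run_query stub below are illustrative, not part of the protocol: an unknown tool produces a protocol-level error, while a query that fails during execution produces a normal result with isError set:

```python
def run_query(sql: str) -> str:
    # Stub executor: anything that is not a SELECT fails, simulating a bad query.
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("syntax error in query")
    return "1 row"

def handle_tools_call(req: dict) -> dict:
    params = req.get("params", {})
    if params.get("name") != "execute_query":
        # Unknown tool: a protocol-level error response.
        return {"jsonrpc": "2.0", "id": req["id"],
                "error": {"code": -32602, "message": "Unknown tool"}}
    try:
        text = run_query(params["arguments"]["query"])
        return {"jsonrpc": "2.0", "id": req["id"],
                "result": {"content": [{"type": "text", "text": text}],
                           "isError": False}}
    except ValueError as exc:
        # Execution failure: still a result, flagged with isError: true.
        return {"jsonrpc": "2.0", "id": req["id"],
                "result": {"content": [{"type": "text", "text": str(exc)}],
                           "isError": True}}
```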
Resources: data access
Resources are data that servers expose for reading; unlike tools, they are passive. Configuration files, database schemas, documentation pages: anything an agent might read for context can be a resource.
Resources are identified by URIs:
```json
{
  "uri": "file:///project/src/config.json",
  "name": "Application Configuration",
  "description": "Main configuration file for the application",
  "mimeType": "application/json"
}
```

Clients read resources with resources/read:
```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "resources/read",
  "params": {
    "uri": "file:///project/src/config.json"
  }
}
```

The response includes the content:
```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "contents": [
      {
        "uri": "file:///project/src/config.json",
        "mimeType": "application/json",
        "text": "{\"debug\": false, \"port\": 8080}"
      }
    ]
  }
}
```

Resources support subscriptions. Clients can subscribe to changes and receive notifications when resources update. A file watcher server might notify clients when monitored files change. This enables reactive workflows without polling.
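The server side of that flow is small: record the subscription, then emit an update notification when a watched URI changes. A sketch (handler names are illustrative):

```python
subscriptions = set()

def handle_subscribe(req: dict) -> dict:
    """Record interest in a resource URI (resources/subscribe)."""
    subscriptions.add(req["params"]["uri"])
    return {"jsonrpc": "2.0", "id": req["id"], "result": {}}

def on_file_changed(uri: str):
    """Build an update notification for subscribed URIs.

    The message has no id, so the client must not respond to it.
    """
    if uri not in subscriptions:
        return None
    return {"jsonrpc": "2.0",
            "method": "notifications/resources/updated",
            "params": {"uri": uri}}
```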
Resource templates allow parameterized access using URI templates (RFC 6570):
```json
{
  "uriTemplate": "db://schema/{table}",
  "name": "Table Schema",
  "description": "Get schema for a database table"
}
```

The application-controlled designation means the host application decides which resources to fetch. The agent does not autonomously request resources; the application determines what context to provide.
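Simple templates like db://schema/{table} only need level-1 expansion from RFC 6570: substitute each {var} with its value. A minimal sketch:

```python
import re

def expand(template: str, params: dict) -> str:
    """Expand simple {var} expressions (RFC 6570 level 1)."""
    return re.sub(r"\{(\w+)\}", lambda m: str(params[m.group(1)]), template)
```

Full RFC 6570 supports operators like {?query} and {/path}; this handles only the plain form shown above.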
Prompts: interaction templates
Prompts are templates that shape how agents approach tasks. They are user-controlled—users select them explicitly, often via slash commands in the UI.
A code review prompt might define the perspective the agent should take:
```json
{
  "name": "security_review",
  "description": "Review code for security vulnerabilities",
  "arguments": [
    {
      "name": "code",
      "description": "Code to review",
      "required": true
    }
  ]
}
```

When retrieved, the prompt expands into messages that seed the conversation:
```json
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "prompts/get",
  "params": {
    "name": "security_review",
    "arguments": {
      "code": "function login(user, pass) { ... }"
    }
  }
}
```

The server responds with the expanded messages:

```json
{
  "jsonrpc": "2.0",
  "id": 3,
  "result": {
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "Review this code for security vulnerabilities. Focus on injection attacks, authentication flaws, and data exposure risks.\n\nCode:\nfunction login(user, pass) { ... }"
        }
      }
    ]
  }
}
```

Prompts encode domain expertise. A security team creates prompts ensuring reviews follow organizational guidelines; a documentation team creates prompts enforcing style standards. Users select the prompt; the server provides the expertise.
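Server-side, a prompts/get handler is string templating over the declared arguments. A sketch of the expansion for the security_review prompt above (the function name is illustrative):

```python
PROMPT_TEXT = ("Review this code for security vulnerabilities. Focus on injection "
               "attacks, authentication flaws, and data exposure risks.\n\nCode:\n")

def get_prompt(name: str, arguments: dict) -> dict:
    """Expand a named template into seed messages (prompts/get result shape)."""
    if name != "security_review":
        raise KeyError(name)
    return {"messages": [{
        "role": "user",
        "content": {"type": "text", "text": PROMPT_TEXT + arguments["code"]},
    }]}
```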
Sampling: server-initiated LLM requests
Sampling flips the direction. Servers request LLM completions from clients rather than the other way around.
A data analysis server might need to interpret results before returning them. Rather than hosting its own LLM, it asks the client's LLM to generate a summary:
```json
{
  "jsonrpc": "2.0",
  "id": 4,
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "Summarize these query results in plain English: [data]"
        }
      }
    ],
    "maxTokens": 500
  }
}
```

The client presents the request to the user for approval, invokes the LLM, and returns the result. Human oversight remains in the loop.
Sampling lets servers do things that would otherwise require hosting their own models. A documentation server asks the LLM to explain a concept; a code analysis server asks it to classify patterns. The server borrows LLM capabilities from the client.
Scrutinize sampling requests—they let servers influence what the LLM generates. Clients should implement approval flows so users can review and reject them.
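One way to structure that approval flow on the client side: gate the request behind a callback, and return an error when the user rejects it. The handler shape, the callbacks, and the simplified result are illustrative assumptions, not the exact SDK API:

```python
def handle_sampling(req: dict, approve, complete) -> dict:
    """Gate a sampling/createMessage request behind user approval.

    approve(messages) -> bool and complete(messages, max_tokens) -> str
    are host-supplied callables (illustrative signatures).
    """
    if not approve(req["params"]["messages"]):
        # User rejected: the server gets an error, not a completion.
        return {"jsonrpc": "2.0", "id": req["id"],
                "error": {"code": -32603,
                          "message": "Sampling request rejected by user"}}
    text = complete(req["params"]["messages"], req["params"].get("maxTokens"))
    return {"jsonrpc": "2.0", "id": req["id"],
            "result": {"role": "assistant",
                       "content": {"type": "text", "text": text}}}
```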
JSON-RPC 2.0 message format
All MCP communication uses JSON-RPC 2.0 messages encoded as UTF-8.
Every message includes "jsonrpc": "2.0" to identify the protocol version.
Requests initiate operations and expect responses:
```json
{
  "jsonrpc": "2.0",
  "id": 5,
  "method": "tools/list",
  "params": {}
}
```

The id field uniquely identifies the request.
It must be a string or number, never null.
Responses include the same id to correlate with requests.
Responses answer requests with results or errors:
```json
{
  "jsonrpc": "2.0",
  "id": 5,
  "result": {
    "tools": [...]
  }
}
```

Error responses replace result with error:
```json
{
  "jsonrpc": "2.0",
  "id": 5,
  "error": {
    "code": -32602,
    "message": "Invalid params",
    "data": {"reason": "Missing required field"}
  }
}
```

Standard error codes follow JSON-RPC conventions: -32700 for parse errors, -32600 for invalid requests, -32601 for unknown methods, -32602 for invalid parameters, -32603 for internal errors.
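A server can keep those codes in one table and build error responses from it. A small sketch (the helper name is illustrative):

```python
# Standard JSON-RPC 2.0 error codes and their conventional messages.
JSONRPC_ERRORS = {
    -32700: "Parse error",
    -32600: "Invalid Request",
    -32601: "Method not found",
    -32602: "Invalid params",
    -32603: "Internal error",
}

def error_response(req_id, code: int, reason: str = None) -> dict:
    """Build an error response, attaching an optional data.reason detail."""
    err = {"code": code, "message": JSONRPC_ERRORS[code]}
    if reason is not None:
        err["data"] = {"reason": reason}
    return {"jsonrpc": "2.0", "id": req_id, "error": err}
```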
Notifications are one-way messages without responses:
```json
{
  "jsonrpc": "2.0",
  "method": "notifications/resources/updated",
  "params": {
    "uri": "file:///project/config.json"
  }
}
```

Notifications omit the id field.
The receiver must not send a response.
They are used for events: resource changes, tool list updates, progress reports.
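A dispatcher can use the presence of id to tell the two message kinds apart: requests get a response, notifications never do. A sketch (the dispatch shape is illustrative):

```python
import json

def dispatch(raw: str, handlers: dict):
    """Handle one JSON-RPC message; return a response string, or None."""
    msg = json.loads(raw)
    handler = handlers.get(msg["method"])
    if "id" not in msg:
        # Notification: run any side effects, but never answer.
        if handler is not None:
            handler(msg.get("params", {}))
        return None
    if handler is None:
        return json.dumps({"jsonrpc": "2.0", "id": msg["id"],
                           "error": {"code": -32601,
                                     "message": "Method not found"}})
    return json.dumps({"jsonrpc": "2.0", "id": msg["id"],
                       "result": handler(msg.get("params", {}))})
```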
Protocol lifecycle
MCP connections follow a three-phase lifecycle: initialization, operation, and shutdown.
Initialization establishes the connection and negotiates capabilities.
The client sends an initialize request declaring its protocol version and capabilities:
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-06-18",
    "capabilities": {
      "roots": {"listChanged": true},
      "sampling": {}
    },
    "clientInfo": {
      "name": "claude-code",
      "version": "1.0.0"
    }
  }
}
```

The server responds with its own capabilities:
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2025-06-18",
    "capabilities": {
      "tools": {"listChanged": true},
      "resources": {"subscribe": true},
      "prompts": {}
    },
    "serverInfo": {
      "name": "postgres-mcp",
      "version": "2.0.0"
    }
  }
}
```

The client confirms with an initialized notification:
```json
{
  "jsonrpc": "2.0",
  "method": "notifications/initialized"
}
```

Version negotiation ensures compatibility. If the server cannot support the client's version, it responds with a version it does support. If versions are incompatible, the client disconnects.
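Server-side, the negotiation rule is a few lines: echo the requested version if supported, otherwise offer one the server does speak and let the client decide whether to continue. A sketch (the supported-version list is hypothetical):

```python
# Protocol versions this server implements, newest first (hypothetical list).
SUPPORTED_VERSIONS = ["2025-06-18", "2025-03-26"]

def negotiate_version(requested: str) -> str:
    """Echo the requested version if supported, else offer our newest."""
    if requested in SUPPORTED_VERSIONS:
        return requested
    return SUPPORTED_VERSIONS[0]
```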
Operation is the working phase. Both parties respect negotiated capabilities—clients can only use what servers declared, and vice versa.
Shutdown terminates the connection cleanly. For stdio transport, the client closes the input stream and waits for the server to exit. For HTTP transport, the client sends an HTTP DELETE with the session identifier.
The lifecycle ensures both parties agree on capabilities before real work begins.
A server that does not declare resources never receives resources/read requests.
A client that does not declare sampling never receives sampling requests.
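Enforcing that boundary can be as simple as checking a request's method prefix against the capabilities declared during initialization. A sketch (the helper name is illustrative):

```python
def method_allowed(method: str, declared: dict) -> bool:
    """Check a request against capabilities declared at initialization."""
    capability = method.split("/")[0]   # "resources/read" -> "resources"
    return capability in declared

# Capabilities from the example server response above: no resources declared
# means resources/read should never be routed to it.
caps = {"tools": {"listChanged": True}, "prompts": {}}
```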
Primitive interactions in practice
An agent fixing a database performance issue might work like this:
- Agent queries available tools (tools/list) and finds explain_query and execute_query.
- Application fetches the slow query from logs (resource, application-controlled).
- Agent calls explain_query on the problematic SQL (tool, model-controlled).
- Server asks the LLM to summarize the execution plan in plain English (sampling).
- Agent reads the table schema (resource) to check indexes.
- Agent calls execute_query with an optimized version.
Tools execute. Resources inform. Sampling borrows LLM capabilities. The protocol ties it together.