# MCP Engineering — Complete Model Context Protocol System

Install skill "MCP Engineering" with this command: npx skills add 1kalin/afrexai-mcp-engineering

Build, integrate, secure, and scale MCP servers and clients. From first server to production multi-tool architecture.

When to Use

  • Building an MCP server (any language)
  • Integrating MCP tools into an AI agent
  • Debugging MCP connection/auth issues
  • Designing multi-server architectures
  • Securing MCP endpoints for production
  • Evaluating which MCP servers to use

Phase 1: MCP Fundamentals

What MCP Is

Model Context Protocol = standardized way for AI agents to call external tools. Think of it as "USB for AI" — one protocol, any tool.

Architecture

Agent (Client) ←→ MCP Transport ←→ MCP Server ←→ External Service
                   (stdio/HTTP)      (your code)    (API, DB, file system)

Core Concepts

| Concept | What It Does | Example |
|---------|--------------|---------|
| Server | Exposes tools, resources, prompts | A server wrapping the GitHub API |
| Client | Discovers and calls server capabilities | OpenClaw, Claude Desktop, Cursor |
| Tool | A callable function with typed params | create_issue(title, body, labels) |
| Resource | Read-only data the agent can access | file://workspace/config.json |
| Prompt | Reusable prompt templates | summarize_pr(pr_url) |
| Transport | How client and server communicate | stdio (local) or HTTP+SSE (remote) |

Transport Decision

| Factor | stdio | HTTP/SSE | Streamable HTTP |
|--------|-------|----------|-----------------|
| Setup complexity | Low | Medium | Medium |
| Multi-client | No | Yes | Yes |
| Remote access | No | Yes | Yes |
| Streaming | Via stdio | SSE | Native |
| Auth needed | No (local) | Yes | Yes |
| Best for | Local dev, single agent | Production, shared | Modern production |

Rule: Start with stdio for development. Move to HTTP for production or multi-agent setups.


Phase 2: Building Your First MCP Server

Server Brief YAML

server_name: "[service]-mcp"
description: "[What this server does in one sentence]"
transport: stdio | http
tools:
  - name: "[verb_noun]"
    description: "[What it does — be specific for LLM tool selection]"
    params:
      - name: "[param]"
        type: "string | number | boolean | object | array"
        required: true | false
        description: "[What this param controls]"
    returns: "[What the tool returns]"
    error_cases:
      - "[When/how it fails]"
resources:
  - uri: "[protocol://path]"
    description: "[What data this exposes]"
external_dependencies:
  - "[API/service this wraps]"
auth_required: true | false
auth_method: "api_key | oauth2 | none"

TypeScript Server Template (stdio)

// server.ts — minimal MCP server
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({
  name: "my-service",
  version: "1.0.0",
});

// Define a tool
server.tool(
  "get_item",                          // tool name (verb_noun)
  "Fetch an item by ID",               // description (LLM reads this)
  { id: z.string().describe("Item ID") }, // params with descriptions
  async ({ id }) => {
    try {
      const result = await fetchItem(id);
      return {
        content: [{ type: "text", text: JSON.stringify(result, null, 2) }],
      };
    } catch (error) {
      // catch param is `unknown` in TypeScript; narrow before reading .message
      const message = error instanceof Error ? error.message : String(error);
      return {
        content: [{ type: "text", text: `Error: ${message}` }],
        isError: true,
      };
    }
  }
);

// Define a resource
server.resource(
  "config",
  "config://app",
  async (uri) => ({
    contents: [{ uri: uri.href, mimeType: "application/json", text: JSON.stringify(config) }],
  })
);

// Start
const transport = new StdioServerTransport();
await server.connect(transport);

Python Server Template (stdio)

# server.py — minimal MCP server
from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.types import Tool, TextContent
import json

server = Server("my-service")

@server.list_tools()
async def list_tools():
    return [
        Tool(
            name="get_item",
            description="Fetch an item by ID",
            inputSchema={
                "type": "object",
                "properties": {
                    "id": {"type": "string", "description": "Item ID"}
                },
                "required": ["id"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    if name == "get_item":
        result = await fetch_item(arguments["id"])
        return [TextContent(type="text", text=json.dumps(result, indent=2))]
    raise ValueError(f"Unknown tool: {name}")

async def main():
    async with stdio_server() as (read, write):
        await server.run(read, write, server.create_initialization_options())

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())

Tool Design Rules

  1. Verb-noun naming: create_issue, search_docs, update_config — never issue or doStuff
  2. Descriptions are critical: The LLM picks tools based on descriptions. Be specific. Include when NOT to use.
  3. Granular over god-tools: search_issues + get_issue + create_issue beats manage_issues
  4. Return structured data: JSON over prose. Let the LLM format for the user.
  5. Error messages for LLMs: Include what went wrong AND what to try next
  6. Idempotent where possible: create_or_update > create (prevents duplicates from retries)
  7. Limit output size: Paginate or truncate. A 10MB response kills the context window.
  8. Include examples in descriptions: "Search issues. Example: search_issues(query='bug label:critical')"
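Rule 7 (limit output size) is easy to enforce with a tiny helper. A minimal sketch, with an illustrative character limit:

```typescript
// Truncate tool output so a huge payload never floods the context window.
// MAX_CHARS is an illustrative limit; tune it to your model's context budget.
const MAX_CHARS = 20_000;

function truncateOutput(text: string, maxChars: number = MAX_CHARS): string {
  if (text.length <= maxChars) return text;
  const omitted = text.length - maxChars;
  // Tell the LLM what happened and how to get the rest.
  return `${text.slice(0, maxChars)}\n\n[Truncated ${omitted} characters. Narrow the query or request a specific page.]`;
}
```

Run every tool's text output through a guard like this before returning it, so a retry loop never sees a 10MB response.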

Tool Description Quality Checklist

  • Says what the tool DOES (not just the name restated)
  • Mentions when to use vs. when NOT to use
  • Each param has a description with format hints
  • Return format is documented
  • Edge cases mentioned (empty results, not found, etc.)

Phase 3: HTTP Transport & Production Server

HTTP Server Template (TypeScript)

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import express from "express";

const app = express();
app.use(express.json());

const server = new McpServer({ name: "my-service", version: "1.0.0" });
// ... register tools ...

app.post("/mcp", async (req, res) => {
  // Stateless mode (per the SDK docs): a fresh transport per request, no sessions.
  const transport = new StreamableHTTPServerTransport({ sessionIdGenerator: undefined });
  await server.connect(transport);
  await transport.handleRequest(req, res, req.body);
});

app.listen(3001, () => console.log("MCP server on :3001"));

Auth Patterns

API Key (simplest)

// Middleware
function authMiddleware(req, res, next) {
  const key = req.headers["x-api-key"] || req.headers.authorization?.replace("Bearer ", "");
  if (!key || !validKeys.has(key)) {
    return res.status(401).json({ error: "Invalid API key" });
  }
  req.userId = keyToUser.get(key);
  next();
}

OAuth 2.0 (for user-scoped access)

# MCP OAuth flow
1. Client requests tool → server returns 401 with auth URL
2. User completes OAuth in browser → gets access token
3. Client stores token, includes in subsequent requests
4. Server validates token, calls external API on user's behalf

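Step 1 of that flow can be sketched as a framework-free decision function; `AUTH_URL` and the token set below are placeholders for a real provider integration:

```typescript
// Sketch of step 1: decide whether a request is authorized or needs the OAuth dance.
// AUTH_URL and validTokens are placeholders; real code would call the provider's
// token introspection endpoint instead of checking a Set.
const AUTH_URL = "https://auth.example.com/oauth/authorize";
const validTokens = new Set(["token-abc"]); // stand-in for real validation

type AuthDecision =
  | { status: 200 }
  | { status: 401; authUrl: string };

function checkAuth(authorizationHeader: string | undefined): AuthDecision {
  const token = authorizationHeader?.replace(/^Bearer /, "");
  if (!token || !validTokens.has(token)) {
    // Step 1: return 401 plus the URL the user must visit to authorize.
    return { status: 401, authUrl: AUTH_URL };
  }
  return { status: 200 }; // token valid → call the external API on the user's behalf
}
```

A transport layer (Express middleware, for instance) would translate the 401 decision into an HTTP response carrying the auth URL.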
Production Checklist

  • Rate limiting per client/key
  • Request validation (schema check before execution)
  • Structured logging (request ID, tool name, latency, status)
  • Health check endpoint (/health)
  • Graceful shutdown (finish in-flight requests)
  • Timeout on external calls (don't let tools hang forever)
  • Output size limits (truncate large responses)
  • Error categorization (4xx client vs 5xx server)
  • CORS if browser clients connect
  • TLS in production (always HTTPS)
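The timeout item above can be a small wrapper applied to every external call; a sketch (the 30s default mirrors the security checklist in Phase 7):

```typescript
// Reject any external call that outlives its deadline, so a hung API
// can't hang the tool. Works with any Promise-returning call.
function withTimeout<T>(promise: Promise<T>, ms = 30_000, label = "external call"): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}
```

Usage inside a tool handler might look like `await withTimeout(fetchItem(id), 10_000, "fetchItem")`, with the rejection surfaced as an `isError` response.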

Phase 4: Client Integration

OpenClaw Configuration

# In openclaw config — stdio server
mcpServers:
  my-service:
    command: "node"
    args: ["path/to/server.js"]
    env:
      API_KEY: "{{env.MY_SERVICE_API_KEY}}"
# HTTP server
mcpServers:
  my-service:
    url: "https://mcp.myservice.com/mcp"
    headers:
      Authorization: "Bearer {{env.MY_SERVICE_TOKEN}}"

Claude Desktop Configuration

{
  "mcpServers": {
    "my-service": {
      "command": "node",
      "args": ["/path/to/server.js"],
      "env": { "API_KEY": "your-key" }
    }
  }
}

Client-Side Tool Selection

When multiple MCP servers are connected, the agent sees ALL tools. Help the agent pick correctly:

  1. Unique tool names: Prefix if needed (github_search vs jira_search)
  2. Clear descriptions: Disambiguate similar tools across servers
  3. Don't overload: aim for under ~30 tools across all servers. Beyond that, agents increasingly pick the wrong tool (see the tool count table in Phase 10).

Multi-Server Architecture

Agent
├── github-mcp (code: create_pr, search_code, list_issues)
├── slack-mcp (comms: send_message, search_messages)
├── postgres-mcp (data: query, list_tables)
└── internal-mcp (business: get_customer, update_pipeline)

Principle: One server per domain. Don't build a mega-server.


Phase 5: Testing MCP Servers

Test Pyramid

        /  E2E  \        Agent actually uses the tool
       / Integration \    Tool calls real API (sandbox)
      /    Unit       \   Business logic without MCP layer

Unit Test Pattern

// Test the tool handler directly, no MCP transport
describe("get_item", () => {
  it("returns item when found", async () => {
    mockDb.findById.mockResolvedValue({ id: "123", name: "Test" });
    const result = await getItemHandler({ id: "123" });
    expect(result.content[0].text).toContain("Test");
  });

  it("returns error for missing item", async () => {
    mockDb.findById.mockResolvedValue(null);
    const result = await getItemHandler({ id: "missing" });
    expect(result.isError).toBe(true);
  });

  it("handles API timeout gracefully", async () => {
    mockDb.findById.mockRejectedValue(new Error("timeout"));
    const result = await getItemHandler({ id: "123" });
    expect(result.isError).toBe(true);
    expect(result.content[0].text).toContain("try again");
  });
});

Integration Test with MCP Inspector

# Use the MCP Inspector to manually test
npx @modelcontextprotocol/inspector node server.js

# Or use mcporter for CLI testing
mcporter call my-service.get_item id=123
mcporter list my-service --schema  # verify tool schemas

Test Checklist Per Tool

  • Happy path returns expected format
  • Missing required params returns clear error
  • Invalid param types return clear error
  • Not-found cases handled (don't throw, return error content)
  • Rate limit / quota exceeded handled
  • Auth failure handled (expired token, invalid key)
  • Large response truncated appropriately
  • Timeout handled (external API slow)
  • Concurrent calls don't interfere

Phase 6: Common MCP Server Patterns

1. API Wrapper (most common)

Wrap an existing REST/GraphQL API as MCP tools.

External API → MCP Server → Agent

Key decisions:

  • Map 1 API endpoint → 1 MCP tool (usually)
  • Simplify params (agent doesn't need every API option)
  • Aggregate related calls (e.g., get user + get user's repos = 1 tool)
  • Cache where safe (reduce API calls)

2. Database Query

Database → MCP Server → Agent

Safety rules:

  • Read-only by default. Write tools require explicit opt-in.
  • Parameterized queries only. NEVER interpolate agent input into SQL.
  • Row limit on all queries (agent can ask for more if needed).
  • Schema as a resource (let agent discover tables/columns).
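The first three rules combined might look like this query builder; `db.query(text, params)` is assumed to follow the node-postgres style, and the table allowlist is an example:

```typescript
// Safety rules applied: table allowlist, identifier validation, enforced row cap,
// and a parameterized value placeholder — agent input never lands in the SQL string.
const ALLOWED_TABLES = new Set(["customers", "orders"]); // example allowlist
const MAX_ROWS = 100;

function buildSelect(table: string, idColumn: string, limit: number) {
  if (!ALLOWED_TABLES.has(table)) throw new Error(`Table not allowed: ${table}`);
  if (!/^[a-z_]+$/.test(idColumn)) throw new Error(`Bad column name: ${idColumn}`);
  const cappedLimit = Math.min(limit, MAX_ROWS); // never exceed the row cap
  // Identifiers are allowlist-validated above; the value goes in as $1,
  // e.g. db.query(text, [value]) with a node-postgres-style driver.
  return { text: `SELECT * FROM ${table} WHERE ${idColumn} = $1 LIMIT ${cappedLimit}` };
}
```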

3. File System

File System → MCP Server → Agent

Safety rules:

  • Sandbox to specific directories. Never allow ../ traversal.
  • Read-only by default. Write requires allowlist.
  • Size limits on reads. Don't send 1GB files through MCP.
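The traversal rule can be enforced by resolving every requested path against the sandbox root before touching the file system; a sketch using Node's path module:

```typescript
import * as path from "node:path";

// Resolve the requested path and confirm it stays inside the allowed root,
// which defeats ../ traversal regardless of how the input is encoded.
function resolveSandboxed(root: string, requested: string): string {
  const resolved = path.resolve(root, requested);
  const rootResolved = path.resolve(root);
  if (resolved !== rootResolved && !resolved.startsWith(rootResolved + path.sep)) {
    throw new Error(`Path escapes sandbox: ${requested}`);
  }
  return resolved;
}
```

Note that symlinks inside the root can still escape it; a hardened version would also check `fs.realpath` of the resolved target.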

4. Multi-Step Workflow

Some tools need to orchestrate multiple steps:

// error() / success() are assumed helpers that wrap a string into an MCP content
// response ({ content: [{ type: "text", text }] }, with isError set on error).
server.tool("deploy_service", "Build, test, and deploy a service", {
  service: z.string(),
  environment: z.enum(["staging", "production"]),
}, async ({ service, environment }) => {
  // Step 1: Build
  const buildResult = await build(service);
  if (!buildResult.success) return error(`Build failed: ${buildResult.error}`);

  // Step 2: Test
  const testResult = await runTests(service);
  if (!testResult.success) return error(`Tests failed: ${testResult.summary}`);

  // Step 3: Deploy (only if build + tests pass)
  if (environment === "production") {
    // Extra safety: require confirmation resource
    return {
      content: [{
        type: "text",
        text: `Ready to deploy ${service} to production. Tests: ${testResult.passed}/${testResult.total} passed. Call confirm_deploy to proceed.`
      }]
    };
  }
  const deployResult = await deploy(service, environment);
  return success(`Deployed ${service} to ${environment}: ${deployResult.url}`);
});

5. Aggregator Server

Combine multiple data sources into unified tools:

GitHub + Jira + PagerDuty → DevOps MCP Server → Agent

One get_service_status tool that queries all three and returns a unified view.


Phase 7: Security & Hardening

Threat Model

| Threat | Risk | Mitigation |
|--------|------|------------|
| Prompt injection via tool output | Agent executes malicious instructions in API response | Sanitize output, strip HTML/scripts |
| Excessive permissions | Tool has write access it shouldn't | Principle of least privilege per tool |
| Data exfiltration | Agent sends sensitive data to wrong tool | Tool allowlists, audit logging |
| Denial of service | Agent calls tool in infinite loop | Rate limiting, circuit breakers |
| Credential leakage | API keys in tool responses | Strip sensitive fields from output |
| SSRF | Agent provides URL that hits internal network | URL allowlisting, no private IPs |

Security Checklist

  • Every tool has minimum required permissions
  • Write operations require explicit confirmation or are behind feature flags
  • API keys/secrets NEVER appear in tool responses
  • Output sanitized (no HTML, no executable content)
  • Rate limits per tool AND per client
  • Audit log: who called what tool, when, with what params
  • Input validation before any external call
  • URL parameters validated against allowlist (prevent SSRF)
  • Timeout on every external call (max 30s default)
  • Circuit breaker: disable tool if error rate > 50% for 5 min
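The circuit-breaker item might be sketched like this; the thresholds mirror the checklist (50% error rate over 5 minutes), and `minCalls` is an added assumption to avoid tripping on tiny samples:

```typescript
// Track recent call outcomes per tool; report "open" (tool disabled)
// when the error rate inside the window exceeds the threshold.
class CircuitBreaker {
  private events: { at: number; ok: boolean }[] = [];
  constructor(
    private windowMs = 5 * 60_000,       // 5-minute window, per the checklist
    private errorRateThreshold = 0.5,    // >50% errors trips the breaker
    private minCalls = 10,               // assumption: ignore tiny samples
  ) {}

  record(ok: boolean, now = Date.now()): void {
    this.events.push({ at: now, ok });
    this.events = this.events.filter((e) => now - e.at <= this.windowMs);
  }

  isOpen(now = Date.now()): boolean {
    const recent = this.events.filter((e) => now - e.at <= this.windowMs);
    if (recent.length < this.minCalls) return false;
    const errors = recent.filter((e) => !e.ok).length;
    return errors / recent.length > this.errorRateThreshold;
  }
}
```

A tool handler would call `record(...)` after every attempt and short-circuit with a clear "temporarily disabled" error while `isOpen()` is true.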

Dangerous Tool Patterns (Avoid)

❌ server.tool("execute_sql", ..., async ({ query }) => db.raw(query))
❌ server.tool("run_command", ..., async ({ cmd }) => exec(cmd))
❌ server.tool("fetch_url", ..., async ({ url }) => fetch(url))  // SSRF
❌ server.tool("write_file", ..., async ({ path, content }) => fs.writeFile(path, content))

Safe Alternatives

✅ Parameterized queries with allowlisted tables
✅ Predefined commands with argument validation
✅ URL allowlist + no private IP ranges
✅ Write to specific directory + filename validation
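The URL-allowlist alternative to a raw fetch_url tool could look like this; the allowed hosts are examples, and the private-IP regex is a crude illustration, not a complete SSRF defense (a real one must also resolve DNS and check the resulting IPs):

```typescript
// Gate outbound fetches: https only, allowlisted hosts, and a rough block
// on hostnames that look like loopback/private/link-local addresses.
const ALLOWED_HOSTS = new Set(["api.github.com", "api.example.com"]); // examples

function isUrlAllowed(raw: string): boolean {
  let url: URL;
  try { url = new URL(raw); } catch { return false; }          // unparseable → deny
  if (url.protocol !== "https:") return false;                 // no http, file:, etc.
  if (!ALLOWED_HOSTS.has(url.hostname)) return false;          // allowlist first
  // Belt-and-braces: reject obvious private/loopback literals.
  if (/^(localhost|127\.|10\.|192\.168\.|169\.254\.|172\.(1[6-9]|2\d|3[01])\.)/.test(url.hostname)) {
    return false;
  }
  return true;
}
```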

Phase 8: Debugging & Troubleshooting

Common Issues

| Symptom | Likely Cause | Fix |
|---------|--------------|-----|
| Tool not appearing in agent | Schema error / server not connected | Check mcporter list or client logs |
| "Connection refused" | Server not running or wrong port | Verify process, check port |
| Tool times out | External API slow or hanging | Add timeout, check API health |
| "Invalid params" | Schema mismatch between client/server | Verify schema with --schema flag |
| Agent picks wrong tool | Ambiguous descriptions | Rewrite descriptions, add "Use this when..." |
| Agent calls tool in loop | Tool returning confusing error | Return clearer error with "do NOT retry" |
| Large response crashes | No output truncation | Add pagination or character limit |
| Auth errors intermittent | Token expiry | Implement token refresh |

Debug Workflow

  1. Verify server starts: node server.js — does it start without errors?
  2. List tools: mcporter list my-server --schema — are all tools registered?
  3. Call directly: mcporter call my-server.tool_name param=value — does it return expected output?
  4. Check client config: Is the server path/URL correct? Are env vars set?
  5. Read client logs: Most clients log MCP connection errors
  6. Test with Inspector: npx @modelcontextprotocol/inspector for interactive debugging

Logging Template

// success() / errorResponse() are assumed helpers wrapping results into MCP content payloads
server.tool("my_tool", description, schema, async (params) => {
  const requestId = crypto.randomUUID().slice(0, 8);
  console.error(`[${requestId}] my_tool called:`, JSON.stringify(params));
  const start = Date.now();
  try {
    const result = await doWork(params);
    console.error(`[${requestId}] my_tool success: ${Date.now() - start}ms`);
    return success(result);
  } catch (error) {
    console.error(`[${requestId}] my_tool error: ${error.message} (${Date.now() - start}ms)`);
    return errorResponse(error.message);
  }
});

Note: Use console.error for logs in stdio transport (stdout is reserved for MCP protocol).


Phase 9: MCP Server Selection Guide

Evaluating Existing MCP Servers

Score 0-5 per dimension:

| Dimension | What to Check |
|-----------|---------------|
| Maintained | Last commit < 3 months? Issues addressed? Version > 1.0? |
| Secure | No raw SQL/exec? Auth implemented? Input validated? |
| Well-typed | Full JSON Schema for all tools? Descriptions useful? |
| Tested | Has tests? CI passing? |
| Documented | Setup instructions? Tool descriptions? Examples? |
| Lightweight | Minimal dependencies? Fast startup? |

Score < 15/30: Build your own. Score 15-24: Use with caution. Score 25+: Good to use.

Popular MCP Server Categories

| Category | Use Case | Examples |
|----------|----------|----------|
| Code | GitHub, GitLab, code search | github-mcp, gitlab-mcp |
| Data | PostgreSQL, SQLite, Snowflake | postgres-mcp, sqlite-mcp |
| Comms | Slack, Discord, email | slack-mcp, gmail-mcp |
| Docs | Notion, Confluence, Google Docs | notion-mcp, gdocs-mcp |
| DevOps | AWS, GCP, Kubernetes, Terraform | aws-mcp, k8s-mcp |
| Search | Brave, Google, vector stores | brave-search, rag-mcp |
| Files | Local FS, S3, Google Drive | filesystem-mcp, s3-mcp |
| CRM | HubSpot, Salesforce | hubspot-mcp, sfdc-mcp |

Phase 10: Architecture Patterns

Single Agent + Multiple Servers

Agent ──┬── github-mcp
        ├── slack-mcp
        ├── postgres-mcp
        └── custom-mcp

Best for: Most use cases. Simple, effective.

Gateway Pattern

Agent ── MCP Gateway ──┬── server-1
                       ├── server-2
                       └── server-3

Gateway handles: auth, rate limiting, logging, routing. Best for: Enterprise, multi-tenant, compliance requirements.

Agent-per-Domain

Orchestrator Agent
├── Code Agent (github-mcp, gitlab-mcp)
├── Data Agent (postgres-mcp, analytics-mcp)
└── Comms Agent (slack-mcp, email-mcp)

Best for: Complex workflows, specialized agents.

Tool Count Guidelines

| Total Tools | Recommendation |
|-------------|----------------|
| 1-10 | Great. Agent handles well. |
| 10-20 | Good. Ensure distinct descriptions. |
| 20-30 | Caution. Group by server, review descriptions. |
| 30-50 | Risk. Consider agent-per-domain pattern. |
| 50+ | Dangerous. Agent WILL pick wrong tools. Split or use gateway. |

Phase 11: Publishing MCP Servers

Package Structure

my-mcp-server/
├── src/
│   ├── server.ts        # MCP server entry
│   ├── tools/           # Tool handlers
│   │   ├── search.ts
│   │   └── create.ts
│   ├── auth.ts          # Auth middleware
│   └── config.ts        # Configuration
├── tests/
│   ├── tools.test.ts
│   └── integration.test.ts
├── package.json
├── tsconfig.json
├── README.md            # Setup + tool docs
└── LICENSE

README Template for MCP Servers

# [Service] MCP Server

[One sentence: what this enables]

## Quick Start
[3 steps max to get running]

## Tools
| Tool | Description | Params |
|------|-------------|--------|
[Table of all tools]

## Configuration
[Env vars, auth setup]

## Examples
[2-3 real usage examples with agent conversation]

npm Publishing

# package.json
{
  "name": "@myorg/service-mcp",
  "version": "1.0.0",
  "bin": { "service-mcp": "./dist/server.js" },
  "files": ["dist"],
  "keywords": ["mcp", "model-context-protocol", "ai-tools"]
}

npm publish

Quality Rubric (0-100)

| Dimension | Weight | What to Score |
|-----------|--------|---------------|
| Tool design | 20% | Names, descriptions, granularity, params |
| Security | 20% | Auth, input validation, output sanitization, least privilege |
| Reliability | 15% | Error handling, timeouts, circuit breakers |
| Testing | 15% | Unit + integration coverage, edge cases |
| Documentation | 10% | Setup, tool docs, examples |
| Performance | 10% | Response time, output size, caching |
| Maintainability | 10% | Code structure, types, logging |

Score 0-40: Not production ready. 40-70: Usable with caveats. 70-90: Solid. 90+: Excellent.


Common Mistakes

| Mistake | Fix |
|---------|-----|
| God-tool that does everything | Split into focused tools |
| Vague tool descriptions | Write descriptions as if explaining to a new hire |
| No error handling | Every external call wrapped in try/catch |
| Returning raw API responses | Shape output for agent consumption |
| No rate limiting | Add per-tool and per-client limits |
| Ignoring output size | Paginate or truncate responses |
| Hardcoded credentials | Use env vars or secret manager |
| No logging | Can't debug what you can't see |
| Testing only happy path | Test errors, timeouts, edge cases |
| Building before checking | Search for existing MCP server first |

Natural Language Commands

  • "Build an MCP server for [service]" → Use Phase 2 templates
  • "Add a tool to my MCP server" → Follow tool design rules
  • "Secure my MCP server" → Phase 7 checklist
  • "Debug MCP connection issue" → Phase 8 workflow
  • "Evaluate this MCP server" → Phase 9 scoring
  • "Design multi-server architecture" → Phase 10 patterns
  • "Publish my MCP server" → Phase 11 structure
  • "Convert REST API to MCP" → Phase 6 Pattern 1
  • "Add auth to my MCP server" → Phase 3 auth patterns
  • "Test my MCP server" → Phase 5 checklist
  • "How many tools is too many?" → Phase 10 tool count table
  • "Review my tool descriptions" → Phase 2 quality checklist
