Concepts

Understanding how mcp-setu works.

The Agentic Loop

mcp-setu runs an agentic loop that bridges Ollama and MCP:

User Input
    ↓
[Bridge Loop]
    ↓
Send message + tool definitions to Ollama
    ↓
Ollama decides: "I'll call this tool"
    ↓
Execute tool via MCP (get result)
    ↓
Send result back to Ollama
    ↓
Repeat until Ollama says "I'm done"
    ↓
Display final response to user

The loop exits when:

Ollama stops requesting tools (final response ready)
Max iterations reached (default: 20, prevents infinite loops)
An error occurs

Models & Tool Calling

Not all language models support tool calling. mcp-setu requires a model that understands the JSON-based tool definition format and can generate structured tool calls.

Supported models:

✅ Gemma 4 & 3
✅ Qwen 2.5 & 3
✅ Llama 3.2 & 3.3
✅ Mistral Nemo
✅ Command R
✅ Phi 4
✅ DeepSeek R1

Unsupported models:

❌ Llama 2
❌ Older Qwen versions
❌ Mistral 7B

When you run mcp-setu, it checks tool support at startup and exits with a clear error if your model doesn't support it.

MCP Servers

MCP (Model Context Protocol) servers are processes that provide tools to the model.

How MCP Servers Work

mcp-setu spawns a server process (or connects to HTTP endpoint)
Server advertises tools via JSON-RPC (or HTTP API)
Model calls a tool
mcp-setu routes the call to the appropriate server
Server executes and returns result
mcp-setu sends result back to model
Loop continues until model is satisfied

Transport Mechanisms

mcp-setu supports three ways to talk to MCP servers:

Stdio (JSON-RPC 2.0)

Servers run as subprocesses. Communication via stdin/stdout.

json

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "."]
    }
  }
}

Best for: Local Node.js servers, development, reliable communication.

HTTP Streamable (Modern Standard)

Remote servers accessed via HTTP POST with streaming.

json

{
  "mcpServers": {
    "remote-api": {
      "type": "http-streamable",
      "url": "http://your-mcp-server.com/mcp"
    }
  }
}

Best for: Cloud-hosted servers, production, remote integration.

HTTP/SSE (Legacy)

Server-Sent Events based communication (deprecated).

json

{
  "mcpServers": {
    "legacy": {
      "type": "http-sse",
      "url": "http://legacy-server.com/events"
    }
  }
}

Best for: Legacy server compatibility.

Configuration Structure

mcp.json
├── ollama (Ollama settings)
│   ├── baseUrl        → Where Ollama runs
│   ├── model          → Which model to use
│   ├── systemPrompt   → System message
│   ├── temperature    → Creativity (0=deterministic, 1=creative)
│   └── contextLength  → How much history to keep
│
└── mcpServers (Available tools)
    ├── filesystem     → File operations
    ├── sqlite         → Database queries
    ├── memory         → Persistent context
    └── [custom]       → Your servers

Tool Definition Format

Tools are sent to Ollama in this format:

json

{
  "name": "read_file",
  "description": "Read the contents of a file",
  "inputSchema": {
    "type": "object",
    "properties": {
      "path": {
        "type": "string",
        "description": "Path to the file"
      }
    },
    "required": ["path"]
  }
}

The model sees these definitions and knows:

What tools are available
What each tool does
What arguments each tool takes
What types of values are required

Workflow Example

Here's a real execution flow:

User: "How many lines are in main.go?"

┌─ mcp-setu ─────────────────────────────┐
│ Sends to Ollama (tool definitions included):
│ - User message: "How many lines..."
│ - Available tools: read_file, etc.
└──────────────────────────────────────┘
                    ↓
        Ollama processes & decides
                    ↓
        "I'll call read_file with path='main.go'"
                    ↓
┌─ mcp-setu ─────────────────────────────┐
│ Routes to filesystem server:
│ → Call: read_file("main.go")
│ ← Result: [file contents...]
└──────────────────────────────────────┘
                    ↓
        Sends result back to Ollama
                    ↓
        "The file has 477 lines"
                    ↓
┌─ mcp-setu ─────────────────────────────┐
│ Checks: Is model done? Yes.
│ Sends response to user
└──────────────────────────────────────┘
                    ↓
        User sees: "main.go has 477 lines"

Performance Considerations

Response Times

Three main factors:

Model inference time — How long the model takes to generate a response (usually the slowest)
Tool execution time — How long tools take to run (filesystem, database queries)
Network latency — For HTTP-based MCP servers

Optimization Tips

Use GPU — Run Ollama with GPU acceleration (100x faster)
Smaller models — llama3.2:3b is much faster than llama3.3:70b
Shorter context — Reduce contextLength if responses are slow
Cache results — If tools return the same data, the model can reuse answers
Parallel tools — Independent tool calls run in parallel

Monitoring

Use /stats in chat to monitor:

/stats

Performance Statistics
  Messages: 12
  Tool calls: 8
  Iterations: 4
  Session duration: 2m 34s
  Average response time: 1.2s

Security

mcp-setu runs entirely on your machine:

No cloud — Everything stays local
No data sharing — Models run locally
Credentials in env vars — Not stored in config files
OAuth 2.1 support — For remote servers

Best Practices

Store API tokens in environment variables
Use HTTPS/TLS for remote servers
Validate tool results before trusting them
Don't share your mcp.json with secrets

Next Steps

Configuration — Deep dive into settings
Examples — Real-world patterns
Development — Build on mcp-setu

Concepts ​

The Agentic Loop ​

Models & Tool Calling ​

MCP Servers ​

How MCP Servers Work ​

Transport Mechanisms ​

Stdio (JSON-RPC 2.0) ​

HTTP Streamable (Modern Standard) ​

HTTP/SSE (Legacy) ​

Configuration Structure ​

Tool Definition Format ​

Workflow Example ​

Performance Considerations ​

Response Times ​

Optimization Tips ​

Monitoring ​

Security ​

Best Practices ​

Next Steps ​

Concepts

The Agentic Loop

Models & Tool Calling

MCP Servers

How MCP Servers Work

Transport Mechanisms

Stdio (JSON-RPC 2.0)

HTTP Streamable (Modern Standard)

HTTP/SSE (Legacy)

Configuration Structure

Tool Definition Format

Workflow Example

Performance Considerations

Response Times

Optimization Tips

Monitoring

Security

Best Practices

Next Steps