
How MCP Actually Works: The Capability Bus Behind Codex, Claude Code, and Gemini CLI

MCP, demystified. Why it's a capability bus, not a plugin store, and what that changes about how agents reach external tools.

8 min read




Full transcript (from the video)

This video is a focused MCP explainer for engineers who already know the acronym but still feel like the mechanism is fuzzy. The useful framing is not that MCP is some magical agent protocol. The useful framing is that MCP is the capability bus: the layer that lets a coding agent runtime discover external systems, present them as usable capabilities, and move structured results back into the model loop. That is why it shows up behind Codex, Claude Code, and Gemini CLI. The exact UX differs, but the architectural job is similar. I also want to separate MCP from adjacent ideas that people blur together. Instruction files are not MCP.

Long-term memory is not MCP. Approval policy is not MCP. Those are separate layers that happen to touch the same session. The official MCP architecture starts with three roles: host, client, and server. The host is the application the human is using. In this video, that means Codex, Claude Code, and Gemini CLI.

Inside the host, there is an MCP client for each server connection. The server is the external system exposing capabilities. That detail matters because it kills one common misconception: the model is not connecting directly to your doc site or issue tracker. The host runtime is. It owns the connections, the permissions, and the translation from protocol surface to model-visible capability. The official architecture docs also describe hosts connecting to multiple servers at once, which is why MCP works as a reusable extension layer instead of a one-off plug-in path.

At the protocol level, MCP is not only about tools. The broader surface is tools, resources, and prompts. Tools are the most obvious because they map cleanly onto actions: search docs, open a ticket, call an internal API. Resources are structured context objects the host can attach or reference. Prompts are reusable entry points exposed by the server. Different hosts emphasize different pieces of this: Claude Code and Gemini CLI document resource and prompt behavior very explicitly, while OpenAI's Codex docs focus more on the server registration, transport, auth, and tool-gating path. The important architectural point is that the server exports capabilities and the host decides how much of that surface becomes user-facing UX. MCP is a capability layer, not a remote screen-sharing layer.

When people say MCP lets the model use tools, they usually skip the machinery that makes that safe enough to trust. The actual loop is discover, gate, call, and return. First, the server describes what it can do. Then the host validates the server configuration, applies transport and auth rules, and decides what should be exposed. Only then does the model see a named capability surface. If the model asks to use something, it does not directly run code on the remote system.

It asks the host runtime. The host executes through the MCP connection, collects the result, and feeds structured output back into the next model turn. That loop is what makes MCP feel native inside a coding CLI. The protocol is not replacing the runtime.
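That discover, gate, call, return loop can be sketched in a few lines. This is a hypothetical, in-process mock for illustration only: none of these functions belong to a real MCP SDK, and the fake server and tool names are invented.

```python
# Minimal sketch of the host-side loop: discover -> gate -> call -> return.
# All names here are hypothetical stand-ins for work a host like Codex,
# Claude Code, or Gemini CLI does around a real MCP connection.

def discover(server):
    """Ask the server what it can do (tools/list in real MCP)."""
    return server["tools"]

def gate(tools, allowed):
    """Host policy decides which capabilities the model may even see."""
    return [t for t in tools if t["name"] in allowed]

def call(server, name, args):
    """The host, not the model, executes the capability."""
    return server["handlers"][name](args)

# A fake in-process "server" so the sketch runs end to end.
fake_server = {
    "tools": [{"name": "search_docs"}, {"name": "delete_everything"}],
    "handlers": {"search_docs": lambda args: {"hits": [args["query"]]}},
}

visible = gate(discover(fake_server), allowed={"search_docs"})
result = call(fake_server, "search_docs", {"query": "mcp transports"})
# `result` is the structured output fed back into the next model turn.
print(visible, result)
```

The point of the sketch is the ordering: the model never sees `delete_everything` because gating happens before the capability surface is exposed at all.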

The protocol feeds the runtime. Transport details matter operationally, but they are not the main idea. Local MCP servers are usually child processes connected over standard input and output. Remote servers ride HTTP-based transports, and the current product docs vary in exactly which forms they emphasize: Codex documents stdio and streamable HTTP; Claude Code documents local stdio and remote HTTP while also noting older SSE support; Gemini CLI documents stdio, SSE, and streamable HTTP. The architectural point is that transport is plumbing. It affects deployment, reliability, and auth, but not the fundamental role of the bus.
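For the local stdio case, the wire format is newline-delimited JSON-RPC 2.0 over a child process's stdin and stdout. A minimal sketch of just the framing, without spawning a real server (`tools/list` is a real protocol method; the request id is arbitrary):

```python
import json

def frame(message: dict) -> bytes:
    """Encode one JSON-RPC message as a single newline-terminated line,
    which is how MCP's stdio transport delimits messages."""
    encoded = json.dumps(message)
    assert "\n" not in encoded  # stdio messages must not embed newlines
    return (encoded + "\n").encode("utf-8")

def unframe(line: bytes) -> dict:
    """Decode one line back into a JSON-RPC message."""
    return json.loads(line.decode("utf-8"))

request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}}
roundtrip = unframe(frame(request))
print(roundtrip == request)
```

With a real local server, the host would hold a subprocess handle and write frames like this to the child's stdin, reading responses line by line from its stdout.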

The host still owns the connection details, the environment variables, the startup behavior, the timeouts, and in some cases the OAuth flow for remote access. OpenAI's Codex docs present MCP very clearly as a runtime extension.
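As a hedged sketch of what that registration surface can look like in Codex's TOML config (the server names, command, URL, and some key names here are invented; the exact schema is in the Codex docs and may differ across versions):

```toml
# Hypothetical entries in the Codex config; consult the Codex docs
# for the exact keys your version supports.
[mcp_servers.docs]
command = "npx"
args = ["-y", "example-docs-mcp-server"]
env = { DOCS_API_KEY = "..." }

[mcp_servers.tracker]
url = "https://mcp.example.com/tracker"   # remote server over HTTP
enabled_tools = ["search_issues"]          # illustrative tool gating
```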

You register servers in the Codex config either by naming a local command plus arguments or by pointing to a remote URL. The docs also expose a very practical control surface around that server: you can pass environment variables, set startup timeouts, and restrict the tool surface with enabled and disabled tool lists for remote servers. Codex also documents OAuth support. That is a strong clue about where OpenAI thinks MCP lives in the stack. It is not being treated as a prompt add-on. It is being treated as a governed runtime connection. Also notice what sits next to it but is not the same thing: AGENTS.md repo instructions shape behavior, while MCP supplies callable capability. Anthropic's Claude Code docs make two things especially concrete.

First, MCP is scoped infrastructure. You can register servers locally, per project, or at the user level, and project-scoped configuration can live in a repo file named .mcp.json.
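A hedged sketch of that project-scoped file (the server name, command, and env var are invented; check Claude Code's docs for the current schema):

```json
{
  "mcpServers": {
    "docs": {
      "command": "npx",
      "args": ["-y", "example-docs-mcp-server"],
      "env": { "DOCS_API_KEY": "..." }
    }
  }
}
```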

Second, Claude Code shows more of the non-tool surface directly in product UX.

The docs describe how resources can be referenced, how prompts can show up as slash commands, and how tool changes can update dynamically without a full restart. The docs also describe tool search as essentially a lazy-loading answer to giant tool surfaces: if MCP tool descriptions would eat too much context, Claude can discover the right tools more selectively. That is an important design lesson. Capability buses need discovery discipline, not just more endpoints.

Gemini CLI is explicit that MCP is a built-in part of the CLI surface. The docs describe both JSON configuration through settings.json and command-line management through `gemini mcp`. They also document support for multiple transport types and call out tools, resources, and prompts directly. One detail I like in the Gemini docs is that they talk about validating and sanitizing tool schemas and resolving naming conflicts. That is exactly the kind of boring runtime work that makes a capability bus usable at scale. Gemini also keeps the trust layer visibly separate: trusted folders determine whether a workspace is trusted, and the policy engine governs what kinds of actions are allowed. So again, MCP is the capability surface, but policy still decides whether that capability can actually fire.

This is the separation that makes the whole system understandable. Instruction files such as AGENTS.md, CLAUDE.md, or GEMINI.md tell the agent how to behave in this repo. They define workflow boundaries and completion rules. MCP does something different: it tells the runtime what outside capability exists. Then there is a third layer on top of that: trust, approvals, and policy. That layer decides whether a capability should be usable in the current workspace and under the current settings. Once you split the system that way, a lot of confusion disappears. If the model behaved oddly, look at instructions. If the wrong tools were visible, look at MCP registration.

If something was blocked or demanded consent, look at policy. Blending those layers together is what makes agent systems feel haunted.

I keep using the phrase capability bus because it matches what the protocol is doing architecturally. A bus lets multiple systems attach to shared capability providers through a stable contract. That is exactly what MCP enables. The same docs server or browser server can be consumed by different hosts. A single host can combine many servers at once. And none of that requires rewriting the model itself every time your internal tool surface changes. That is important for builders because it creates a clean seam. You can improve or replace the model. You can improve or replace the servers. You can tighten policy. You can update repo instructions. Those are different axes of change. MCP helps keep the capability axis modular instead of baking every integration into one vendor-specific monolith.

The failure modes are predictable once you stop treating MCP like magic. The first bad pattern is the kitchen-sink server that exposes a huge pile of barely named write actions. The second is weak schemas that force the model to guess what you meant.

The third is assuming the protocol itself provides the safety model. It does not. The host still needs policy, trust, and approval boundaries. And the fourth mistake is trying to make MCP do jobs it was not built for, like acting as your memory system or your full planning substrate. Good MCP design is boring in the right way: small tool surfaces, explicit argument shapes, clear errors, separate policy. When those pieces are missing, the protocol is not the problem; the surrounding architecture is. If you're building MCP servers yourself, the design target should be host-mediated reliability.

Keep tools narrow enough that a human reviewer can understand them. Separate read-heavy capability from write-heavy capability so policy can treat them differently. Return structured payloads instead of giant prose blobs whenever you can. Structured output gives the host and model a better chance of doing the next step correctly. Use resources for external context that should stay referenceable. Use prompts when a server has a reusable task entry point worth naming. And avoid the temptation to tunnel around the host with hidden side effects. The whole benefit of MCP is that the runtime remains in the loop. If you break that property, you lose the reviewability that makes this architecture valuable.

The clean mental model is now pretty simple. MCP is the capability bus. Instruction files shape behavior inside the repo. Policy and trust decide what execution is allowed.

The runtime sits in the middle and mediates every action. Codex, Claude Code, and Gemini CLI each expose that stack with different UX, different config shapes, and different product emphases. But the architecture rhymes strongly enough to learn from as one family. That is why MCP matters. It gives you a portable extension seam for capabilities without forcing all of agent behavior, trust, and context into one tangled prompt blob. If you're building local AI tools, keep those layers separate.
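The server-design advice above (narrow tools, explicit argument shapes, a read/write split, structured payloads) can be made concrete with a small sketch. Everything here is hypothetical: the tool names, schemas, and stub handler are invented, though the inputSchema shape follows MCP's JSON Schema convention for tool arguments.

```python
# Hypothetical server-side design sketch, not any real server's code.
# Read and write tools are kept in separate lists so host policy can
# treat them differently.

READ_TOOLS = [
    {
        "name": "get_ticket",
        "description": "Fetch one ticket by id.",
        "inputSchema": {  # explicit argument shape, no guessing
            "type": "object",
            "properties": {"ticket_id": {"type": "string"}},
            "required": ["ticket_id"],
        },
    }
]

WRITE_TOOLS = [
    {
        "name": "create_ticket",
        "description": "Open a new ticket. Gate behind approval policy.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "body": {"type": "string"},
            },
            "required": ["title"],
        },
    }
]

def handle_get_ticket(args):
    # Stub handler returning canned data. The point is the shape:
    # a structured payload, not a prose blob, so the host and the
    # next model turn can act on it directly.
    return {"id": args["ticket_id"], "status": "open", "assignee": None}

result = handle_get_ticket({"ticket_id": "T-123"})
print(result["status"])  # open
```

Splitting the tool lists is what lets a host apply the layering described above: reads can be auto-approved in a trusted workspace while writes still demand consent.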