Tools Series · 7 of 7 issues

Building Effective Tools for AI

Seven issues on building production-grade MCP servers, A2A agents, and observable tool systems. Grounded in the open-source Recall memory layer.

By the end of this series you'll be able to...
  1. Design idempotent tools that are safe to retry under network failures, without duplicating state or silently swallowing errors. (Issue 1)
  2. Write LLM-readable error messages that tell an agent what went wrong, what was expected, and what to do next, eliminating retry loops caused by opaque errors. (Issue 1)
  3. Choose the right MCP transport and place auth, secrets, and session context at the correct layer so credentials never appear in model context. (Issue 2)
  4. Build a production FastMCP server with a three-layer middleware stack (auth, timeout, logging) and async-acknowledge for slow operations. (Issue 3)
  5. Reduce planning errors by designing tools around user intent, collapsing overlapping tools into fewer, better-scoped interfaces with measured error rates. (Issue 4)
  6. Implement A2A coordination with a full task lifecycle (SUBMITTED → WORKING → INPUT_REQUIRED → COMPLETED) so multi-agent systems can pause, surface blockers, and resume. (Issue 5)
  7. Instrument every tool call with a ToolCallRecord and configure the four alert queries that catch 90% of production failures. (Issue 6)
  8. Defend against tool poisoning and tool shadowing with startup description validation and namespaced tool names. (Issue 7)
  9. Trace cost back to the originating session in a multi-agent chain using per-call cost_usd logging. (Issues 6–7)
  10. Evaluate any third-party MCP server or A2A sub-agent against a security checklist: description whitelist, namespace isolation, Agent Card signature verification. (Issues 5, 7)
01

What Makes a Good Tool

Five properties every production MCP tool must have — and why most demo tools satisfy only one of them.

Idempotency by Design · LLM-Readable Errors · Structured Return Shape · Call-Level Observability
17 min read · mcp, tools, idempotency
02

MCP Architecture In Depth

Three transports, three primitives, and the trust boundary that determines where auth and secrets belong in an MCP server.

Transport Selection · Trust Boundary · Auth at the Transport Layer · Resource vs Tool Decision
18 min read · mcp, auth, architecture
03

Building Your First Production MCP Server

How to wire auth, timeout, and logging middleware before your first tool — and the async-acknowledge pattern that prevents hanging tool calls.

Schema From Code · Middleware-First Architecture · Async-Acknowledge Pattern · Docstring as Runtime Instruction
22 min read · mcp, fastmcp, middleware
04

Tool Design in the Real World

How Recall's 8-tool design collapsed to 5 tools — and why designing for the LLM's decision surface, not your backend's capability, is the key to lower planning error rates.

Intent-Aligned Tool Design · Planning Error Rate · Parameter vs Tool Boundary · Start Coarse Heuristic
21 min read · mcp, tool-design, llm
05

A2A — When Agents Need to Talk to Each Other

A2A gives multi-agent systems a task lifecycle that makes every state in a sub-agent's execution visible, pausable, and recoverable — solving the coordination failures that async function calls cannot.

A2A Task Lifecycle State Machine · INPUT_REQUIRED (Pause and Resume) · Agent Card Discovery · REJECTED vs FAILED Semantics
21 min read · a2a, mcp, multi-agent
06

Tool Observability

Every tool call produces one record that answers three questions: did it succeed, how long did it take, and how much did it cost? Asked consistently, those three questions are the foundation of what you can know about your tool layer in production.

ToolCallRecord (One Record Per Call) · Cost-Per-Session Alerting · P95 Latency Trending · inputs_hash (Cardinality-Safe Deduplication)
22 min read · observability, monitoring, mcp
07

The Tool Ecosystem in 2026

MCP is no longer an emerging standard — it's infrastructure, with real security threats, five open problems, and a clear picture of what teams can solve today versus what requires ecosystem-level coordination.

Tool Poisoning (MCPTox Attack Surface) · Namespace Isolation · Five Open Problems · Signed Agent Cards
21 min read · mcp, a2a, security