All Issues
28 issues · 4 series · 28 checklists
Part I Founders
Pragmatic AI for Founders
AI Is a Probabilistic Engine, Not a Deterministic One · 12 min
AI is a probabilistic pattern matcher that requires a deterministic chassis to be reliable, and most value comes from the system around the model, not the model itself.
Prompts as Control Programs, Not Questions · 11 min
Prompts are control programs that compile a deterministic interface to a probabilistic engine, and the quality of that interface determines whether your AI system is reliable or brittle.
Agents as Three-Layer Systems: Tools, Memory, and Orchestration · 13 min
Agents are three-layer systems where the LLM is the decision layer, but 80% of the work is software engineering -- design the data layer first, then tools, then orchestration.
Where AI Breaks and The Data Layer Solution · 10 min
AI fails predictably in two ways -- hallucination and bad retrieval -- and designing your data layer first prevents 80% of production disasters.
Silent Failures and Monitoring AI in Production · 11 min
The scariest AI failures are silent -- no errors thrown, just slow degradation -- and monitoring drift is the only way to catch them before users leave.
Build vs Buy vs Embed: AI Strategy That Actually Works · 13 min
Most AI strategy failures come from building when you should buy, buying vendors that can't handle your data layer, or measuring ROI without counting the disasters you prevented.
Part II Technical
Agent Design Fieldbook
The Foundation: Why Most AI Agents Fail in Production · 12 min
Most AI agents are wrappers that crash in production; production agents are systems with 8 deterministic layers where the LLM is constrained, not trusted.
The Data Layer: Your Agent Is Only As Good As Your Data · 11 min
LLMs will hallucinate field names and values unless you explicitly define what's real through schemas, field registries, and validation boundaries.
The Ingestion Layer: Getting Messy Data Into Your Clean Schema · 12 min
Vendor data is chaos -- different units, formats, nulls, duplicates; ingestion is 80% of the work, and scripts beat pipelines for flexibility.
The Intent Layer: Classify Before You Act · 12 min
The first LLM call should classify what the user wants (search, compare, select), not execute it -- then route to specialized handlers that each do one job well.
The Filter Extraction Layer: From Natural Language to Query · 15 min
LLMs are excellent at extracting filters from natural language but terrible at enforcing boundaries -- inject your field registry, validate everything, and relax constraints when needed.
The Memory Layer: Conversations That Persist · 16 min
Agents need memory -- not just chat history, but structured context (what was fetched, what was selected, what tokens were used) persisted to a database for multi-turn conversations.
The Sort & Rank Layer: Ordering Results Intelligently · 12 min
Sorting isn't just ORDER BY -- define what 'best' means for each field, handle JSONB with SQL expressions, and let the LLM infer user intent from keywords like 'cheapest' or 'best'.
The Product Deep-Dive Layer: When Users Want More Than Specs · 17 min
When users want to go deep on a product -- reading reviews, understanding thermal performance, checking compatibility -- RAG lets you search unstructured knowledge that doesn't fit in structured columns.
Part III Tools
Building Effective Tools for AI
What Makes a Good Tool · 17 min
Five properties every production MCP tool must have -- and why most demo tools satisfy only one of them.
MCP Architecture In Depth · 18 min
Three transports, three primitives, and the trust boundary that determines where auth and secrets belong in an MCP server.
Building Your First Production MCP Server · 22 min
How to wire auth, timeout, and logging middleware before your first tool -- and the async-acknowledge pattern that prevents hanging tool calls.
Tool Design in the Real World · 21 min
How Recall's 8-tool design collapsed to 5 tools -- and why designing for the LLM's decision surface, not your backend's capability, is the key to lower planning error rates.
A2A -- When Agents Need to Talk to Each Other · 21 min
A2A gives multi-agent systems a task lifecycle that makes every state in a sub-agent's execution visible, pausable, and recoverable -- solving the coordination failures that async function calls cannot.
Tool Observability · 22 min
Every tool call produces one record that answers three questions -- did it succeed, how long did it take, and how much did it cost -- and those three questions, asked consistently, are the foundation of everything useful you'll ever know about your tool layer in production.
The Tool Ecosystem in 2026 · 21 min
MCP is no longer an emerging standard -- it's infrastructure, with real security threats, five open problems, and a clear picture of what teams can solve today versus what requires ecosystem-level coordination.
Part IV Memory
Memory in AI Systems
Memory Is Belief State, Not Storage · 17 min
Most agents treat memory as a storage problem. It is a state management problem -- and the distinction produces completely different architectures.
The WRITE Phase -- What to Remember and How · 18 min
Not every turn deserves a memory. How you design the extraction step and score importance and confidence at write time determines whether your memory store improves or degrades.
The MANAGE Phase -- The Work Nobody Does · 15 min
Memory without curation rots. How contradiction detection, decay scoring, and consolidation keep your agent's belief state accurate over time.
The READ Phase -- Retrieval as a Hyperparameter · 17 min
Retrieval thresholds, ranking functions, and injection rates are tunable. Most teams set them once and forget them. Here's what to tune and how.
Four Design Patterns in Order · 15 min
Most teams jump to Pattern 3 (vector store) or Pattern 4 (episodic log). Most should start with Pattern 2. A decision matrix for which memory architecture your use case actually requires.
Memory Failures -- Named and Fixable · 17 min
Six memory failure modes with detection queries and mitigations. Plus GDPR erasure: the single SQL call that satisfies Article 17.
Building a Memory System End-to-End · 19 min
Schema to governance -- the complete implementation. Every component from the series assembled, every decision justified, and the migration path from SQLite to Postgres+pgvector.