Agent Design Fieldbook

An 8-layer architecture for production AI agents — from data layer to RAG. Real code, real failures, battle-tested patterns for engineers and technical founders.

By the end of this series you'll be able to...

Diagnose why your AI agent fails in production, and use the 5-gap framework to identify exactly which layer is missing. (Issue 1)
Design a database schema that LLMs can reason about, including a Field Registry with aliases, units, ranges, and vague-term mappings that eliminates hallucinated field names. (Issue 2)
Build a 4-step ingestion pipeline (Fetch → Transform → Validate → Upsert) that normalizes multi-vendor chaos into a clean schema, with raw data preserved for debugging. (Issue 3)
Implement intent classification and routing: a Classify-Route-Execute pattern that handles multi-intent messages and resolves entity references. (Issue 4)
Convert natural language to validated database queries, using field-aware extraction prompts, range clamping, and constraint relaxation that handles zero-result searches without dead-ending the user. (Issue 5)
Build persistent agent memory: a 5-table session schema that survives page refreshes, resolves pronouns and positional references, and tracks token usage to prevent runaway API costs. (Issue 6)
Define what "best" means for every field: a Sortable Field Registry with preferred direction per metric, JSONB SQL expression generation, and LLM-inferred sort extraction. (Issue 7)
Decide when RAG is actually needed and implement a hybrid architecture that serves structured specs from a database and unstructured knowledge from a vector store. (Issue 8)
Track and optimize LLM costs at three levels (step, turn, session) with per-step token records that double as a fine-tuning dataset. (Issue 6)
Assess any existing agent's production readiness using eight layer-specific checklists, each scored so you know whether to stop, prototype with caution, or ship. (All issues)

The Foundation: Why Most AI Agents Fail in Production

Most AI agents are wrappers that crash in production; production agents are systems with 8 deterministic layers where the LLM is constrained, not trusted.

8-Layer Agent Architecture · Wrapper vs. System · Probabilistic vs. Deterministic · Agent Readiness Scorecard

12 min read · ai, agents, engineering

The Data Layer: Your Agent Is Only As Good As Your Data

LLMs will hallucinate field names and values unless you explicitly define what's real through schemas, field registries, and validation boundaries.

Field Registry Pattern · Schema Scoping · Three Validation Boundaries · Typed Columns vs. JSONB

11 min read · ai, agents, engineering
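
As a taste of the Field Registry idea, here is a minimal sketch, not the implementation from the issue. The field names, units, ranges, and aliases are hypothetical; the point is only that anything the LLM names gets resolved against an explicit registry, and unknown names are rejected rather than trusted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    name: str
    unit: str
    min_value: float
    max_value: float
    aliases: tuple = ()


# Hypothetical registry entries, for illustration only.
REGISTRY = {
    spec.name: spec
    for spec in (
        FieldSpec("price_usd", "USD", 0, 10_000, ("price", "cost")),
        FieldSpec("weight_kg", "kg", 0.1, 10, ("weight", "mass")),
    )
}

# Reverse lookup from every alias to its canonical field name.
ALIASES = {alias: spec.name for spec in REGISTRY.values() for alias in spec.aliases}


def resolve_field(raw: str) -> Optional[str]:
    """Map a name produced by the LLM to a real field, or None if unknown."""
    key = raw.strip().lower()
    if key in REGISTRY:
        return key
    return ALIASES.get(key)
```

With this in place, `resolve_field("Price")` resolves to the canonical `price_usd`, while a made-up name such as `battery_life` comes back as `None` and can be dropped before it reaches a query.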

The Ingestion Layer: Getting Messy Data Into Your Clean Schema

Vendor data is chaos -- different units, formats, nulls, duplicates; ingestion is 80% of the work, and scripts beat pipelines for flexibility.

ETL vs. ELT · The Airflow Tax · Normalization Pipeline · full_data Safety Net

12 min read · ai, agents, engineering
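
The four steps named in the series overview (Fetch → Transform → Validate → Upsert) can be sketched as a plain script. Everything here is an assumption made for illustration: the vendor records are invented, a dict stands in for the database, and storing the raw payload under a `full_data` key is one guess at what the "full_data Safety Net" looks like.

```python
import json

LB_TO_KG = 0.45359237


def fetch():
    # Stand-in for a vendor API call; these records are invented.
    return [
        {"sku": "A1", "weight": "2.5 lb", "price": "199.00"},
        {"sku": "B2", "weight": "1.2 kg", "price": None},
    ]


def transform(raw):
    # Normalize units and types; keep the untouched payload for debugging.
    value, unit = raw["weight"].split()
    kg = float(value) * (LB_TO_KG if unit == "lb" else 1.0)
    price = float(raw["price"]) if raw["price"] is not None else None
    return {
        "sku": raw["sku"],
        "weight_kg": round(kg, 3),
        "price_usd": price,
        "full_data": json.dumps(raw),
    }


def validate(row):
    errors = []
    if not 0 < row["weight_kg"] < 100:
        errors.append("weight_kg out of range")
    if row["price_usd"] is None:
        errors.append("price_usd missing")
    return errors


def upsert(store, row):
    # Insert or overwrite by primary key (a dict stands in for the table).
    store[row["sku"]] = row


def run(store):
    rejected = []
    for raw in fetch():
        row = transform(raw)
        errors = validate(row)
        if errors:
            rejected.append((raw["sku"], errors))
        else:
            upsert(store, row)
    return rejected
```

Calling `run({})` loads the clean record and returns the rejected ones with their reasons, so bad vendor rows are visible instead of silently landing in the schema.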

The Intent Layer: Classify Before You Act

The first LLM call should classify what the user wants (search, compare, select), not execute it -- then route to specialized handlers that do one job well.

Classify-Route-Execute · Routing / Gateway Pattern · Multi-Intent Detection · Entity Extraction

12 min read · ai, agents, engineering
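
A rough sketch of Classify-Route-Execute, under loud assumptions: the `classify` function below is a keyword stub standing in for the first LLM call, and the handler names and return strings are hypothetical. What it shows is the shape: classification returns a list of intents, and each intent is routed to a handler that does one job.

```python
def classify(message):
    """Keyword stub standing in for the LLM classification call."""
    text = message.lower()
    intents = []
    if any(word in text for word in ("find", "show", "search")):
        intents.append("search")
    if "compare" in text:
        intents.append("compare")
    if any(word in text for word in ("pick", "select", "choose")):
        intents.append("select")
    return intents or ["unknown"]


def handle_search(message):
    return "search handler ran"


def handle_compare(message):
    return "compare handler ran"


def handle_select(message):
    return "select handler ran"


def handle_fallback(message):
    return "fallback: ask the user to rephrase"


# Routing table: one specialized handler per intent.
HANDLERS = {
    "search": handle_search,
    "compare": handle_compare,
    "select": handle_select,
}


def handle(message):
    # Execute every detected intent in order, so multi-intent messages work.
    return [HANDLERS.get(intent, handle_fallback)(message) for intent in classify(message)]
```

A message like "Find laptops under $1000 and compare the top two" produces two intents and therefore two handler calls, while an unrecognized message falls through to the fallback.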

The Filter Extraction Layer: From Natural Language to Query

LLMs are excellent at extracting filters from natural language, but terrible at enforcing boundaries -- inject your field registry, validate everything, and relax constraints when needed.

Field-Aware Extraction · Validate-Clamp-Relax · Constraint Relaxation · Intermediate Representation (SQL Injection Shield)

15 min read · ai, agents, engineering
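
One way to picture Validate-Clamp-Relax is the sketch below. It assumes the LLM emits filters as plain dicts (an intermediate representation, never raw SQL), uses hypothetical field ranges, and relaxes by dropping the last filter first; the issue itself may use a different relaxation strategy.

```python
# Hypothetical allowed ranges per field.
RANGES = {"price_usd": (0, 5000), "weight_kg": (0.5, 5)}
ALLOWED_OPS = ("<=", ">=")


def validate_and_clamp(filters):
    clean = []
    for f in filters:
        # Validate: drop hallucinated fields and unsupported operators.
        if f["field"] not in RANGES or f["op"] not in ALLOWED_OPS:
            continue
        # Clamp: force the value into the field's known range.
        lo, hi = RANGES[f["field"]]
        clean.append({**f, "value": min(max(f["value"], lo), hi)})
    return clean


def matches(item, f):
    value = item[f["field"]]
    return value <= f["value"] if f["op"] == "<=" else value >= f["value"]


def search(items, filters):
    return [item for item in items if all(matches(item, f) for f in filters)]


def search_with_relaxation(items, filters):
    filters = validate_and_clamp(filters)
    results = search(items, filters)
    dropped = []
    # Relax: on zero results, drop the last filter and retry.
    while not results and filters:
        dropped.append(filters.pop())
        results = search(items, filters)
    return results, dropped
```

Returning the dropped filters alongside the results lets the agent tell the user which constraint was loosened instead of dead-ending on an empty list.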

The Memory Layer: Conversations That Persist

Agents need memory: not just chat history, but structured context (what was fetched, what was selected, what tokens were used) persisted to a database for multi-turn conversations.

Session Context Structure · Three-Level Token Aggregation · Database Persistence Schema · Reference Resolution

16 min read · ai, agents, engineering
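
Three-level token aggregation can be illustrated with a small sketch. The record shape and step names are assumptions, not the schema from the issue: each LLM call writes one per-step record, and turn and session totals are derived by rolling those records up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class StepUsage:
    session_id: str
    turn: int
    step: str  # hypothetical step label, e.g. "classify" or "extract"
    prompt_tokens: int
    completion_tokens: int

    @property
    def total(self):
        return self.prompt_tokens + self.completion_tokens


def aggregate(steps):
    """Roll per-step records up to per-turn and per-session totals."""
    per_turn = defaultdict(int)
    per_session = defaultdict(int)
    for s in steps:
        per_turn[(s.session_id, s.turn)] += s.total
        per_session[s.session_id] += s.total
    return dict(per_turn), dict(per_session)
```

Keeping the step level as the source of truth means the turn and session numbers can always be recomputed, and a session total is a natural place to enforce a spending cap.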

The Sort & Rank Layer: Ordering Results Intelligently

Sorting isn't just ORDER BY -- define what 'best' means for each field, handle JSONB with SQL expressions, and let the LLM infer user intent from keywords like 'cheapest' or 'best'.

Sortable Field Registry · Preferred Direction per Field · JSONB SQL Expressions · LLM-Inferred Sort

12 min read · ai, agents, engineering
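
A minimal sketch of a sortable field registry, assuming PostgreSQL and a hypothetical `specs` JSONB column; the field keys and expressions are invented for illustration. Each entry pairs a SQL expression with the direction that means "best" for that field, and the ORDER BY clause is built only from registry entries, never from user or LLM text.

```python
# Hypothetical registry: sort key -> SQL expression and preferred direction.
SORTABLE = {
    "price": {"expr": "price_usd", "preferred": "ASC"},
    "battery": {"expr": "(specs->>'battery_hours')::numeric", "preferred": "DESC"},
}


def order_by(field, direction=None):
    """Build an ORDER BY clause from the registry, defaulting to 'best first'."""
    spec = SORTABLE.get(field)
    if spec is None:
        raise ValueError(f"unsortable field: {field}")
    direction = (direction or spec["preferred"]).upper()
    if direction not in ("ASC", "DESC"):
        raise ValueError(f"invalid direction: {direction}")
    return f"ORDER BY {spec['expr']} {direction} NULLS LAST"
```

With this shape, a request for the "best battery" only needs the LLM to name the `battery` key; the registry supplies both the JSONB cast and the descending direction.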

The Product Deep-Dive Layer: When Users Want More Than Specs

When users want to go deep on a product -- reading reviews, understanding thermal performance, checking compatibility -- RAG lets you search unstructured knowledge that doesn't fit in structured columns.

Hybrid Architecture (SQL + RAG) · RAG Decision Framework · Section-Aware Chunking · Streaming Citations

17 min read · ai, agents, engineering
