Agent Design Fieldbook Issue 5/8

The Filter Extraction Layer: From Natural Language to Query

LLMs are excellent at extracting filters from natural language, but terrible at enforcing boundaries -- inject your field registry, validate everything, and relax constraints when needed.

Apr 13, 2026 · 15 min read · Sentient Zero Labs

In this issue (13 sections)

In Issue 4, we built intent classification. Now we dive into what happens when the intent is “search”: converting natural language to precise database queries.

User says: “Find me a lightweight laptop with great battery, not too expensive.”

What does that mean? “Lightweight” — is that under 1kg? 2kg? 3kg? “Great battery” — 5 hours? 10 hours? “Not too expensive” — $500? $1,000? $2,000?

If you let the LLM guess, it will hallucinate numbers. We learned this the hard way.

In this issue:

Why LLMs can’t extract filters without field definitions
The field-aware extraction pattern (inject your registry)
Validation and clamping (reject unknown fields, clamp ranges)
Constraint relaxation (what to do when you get zero results)

ℹ

This is where your field registry (Issue 2) becomes critical.

History Anchor: Symbolic AI -> Tool Use + ReAct

Extracting structured data from natural language is one of AI’s oldest problems. Frame-based dialog systems (1990s) solved it with hand-coded slot definitions — “departure_city,” “arrival_city,” “date.” Our field registry is the same idea, updated for LLMs: explicit definitions that constrain extraction instead of letting the model guess.

The Disasters

Disaster 1: The Vague Terms

USER: "Find me a lightweight laptop with great battery, not too expensive"

VAGUE TERMS:
• "lightweight" = ??? (< 1kg? < 2kg? < 3kg?)
• "great battery" = ??? (> 5h? > 8h? > 12h?)
• "not too expensive" = ??? (< $500? < $1000? < $2000?)

┌──────────────────────────────────────────────────────────┐
│ LLM EXTRACTION (Without Field Definitions):              │
├──────────────────────────────────────────────────────────┤
│ {                                                        │
│   "weight": {"<": 100},        ← Hallucinated nonsense   │
│   "battery": {">": 1000},      ← Impossible value        │
│   "price": {"<": 999999}       ← Unbounded garbage       │
│ }                                                        │
└──────────────────────────────────────────────────────────┘
  ↓
DATABASE QUERY: WHERE weight < 100 AND battery > 1000 AND price < 999999
  ↓
RESULT: 0 products (or everything) ❌

The LLM had no idea what reasonable values looked like. So it made them up.

⚠ Why it happened

No field definitions. The LLM knows “cheap” means low price, but it doesn’t know that in your domain, prices range from $300 to $5,000.

Disaster 2: The Invented Field

USER: "Find laptops with good GPU performance"
  ↓
LLM EXTRACTION: {"gpu_performance": {">": "good"}}
  ↓
DATABASE: ERROR - column "gpu_performance" does not exist
  ↓
ACTUAL FIELD: "graphics_card" (text field with model names like "RTX 4060")

The LLM invented a field that sounded reasonable. But our database doesn’t have gpu_performance. It has graphics_card.

⚠ Why it happened

No field registry injection. The LLM guessed a plausible field name instead of using your actual schema.

Disaster 3: The Zero Results

USER: "Find Dell laptops under $500 with 32GB RAM"
  ↓
FILTERS: brand = "Dell", price < 500, ram >= 32
  ↓
DATABASE QUERY: WHERE brand = 'Dell' AND price < 500 AND ram >= 32
  ↓
RESULT: 0 products ❌
  ↓
REALITY: No Dell laptop under $500 has 32GB RAM
  ↓
USER: "Your search sucks!" (leaves)

The filters were valid. The extraction was correct. But no products match those criteria.

⚠ Why it happened

No constraint relaxation strategy. When you get zero results, you need to intelligently relax constraints.

The Pattern

LLMs are excellent at understanding intent. They’re terrible at knowing:

What fields exist in your database
What ranges are valid
What vague terms mean in your domain
What to do when constraints are impossible

You must tell them explicitly.

The Diagnosis: Why Extraction Alone Fails

What LLMs Can vs. Can’t Do

┌──────────────────────────────────────────────────────────┐
│ WHAT LLM CAN DO (Extraction)                             │
├──────────────────────────────────────────────────────────┤
│ ✓ Understand "cheap" means low price                     │
│ ✓ Understand "lightweight" means low weight              │
│ ✓ Extract multiple constraints from one sentence         │
│ ✓ Handle synonyms ("budget" = "cheap" = "affordable")    │
│ ✓ Map natural language to filter structure               │
└──────────────────────────────────────────────────────────┘

┌──────────────────────────────────────────────────────────┐
│ WHAT LLM CAN'T DO (Boundaries)                           │
├──────────────────────────────────────────────────────────┤
│ ❌ Know your field names ("price" vs "list_price")        │
│ ❌ Know your units (USD vs EUR, kg vs lbs)                │
│ ❌ Know your ranges (price: $0-$10,000, not unlimited)    │
│ ❌ Know what "cheap" means in your domain ($500? $1000?)  │
│ ❌ Enforce constraints (will hallucinate if unsure)       │
└──────────────────────────────────────────────────────────┘

The Gap

USER: "Find cheap laptops"
  ↓
┌─────────────────────────────────────────────────────────┐
│ WITHOUT FIELD DEFINITIONS                               │
├─────────────────────────────────────────────────────────┤
│ LLM guesses:                                            │
│ {"price": {"<": 10000000}}  ← Unbounded hallucination   │
└─────────────────────────────────────────────────────────┘
  ↓
RETURNS EVERYTHING (useless)

  vs

┌─────────────────────────────────────────────────────────┐
│ WITH FIELD DEFINITIONS                                  │
├─────────────────────────────────────────────────────────┤
│ Prompt includes:                                        │
│ "price (numeric) [unit: USD] [range: 0-10000]           │
│  [aliases: cost, budget, cheap]                         │
│  [mapping: cheap < 800, affordable < 1200]"             │
│                                                         │
│ LLM extracts:                                           │
│ {"price": {"<": 800}}  ← Grounded in field definition   │
└─────────────────────────────────────────────────────────┘
  ↓
RETURNS RELEVANT RESULTS

💡 Key Insight

The gap is not intelligence — the LLM is smart enough. The gap is knowledge. It doesn’t know your schema. Inject it.

Field Registry Injection

Telling the LLM What Exists

Your field registry (Issue 2) becomes critical here. You inject it directly into the extraction prompt.

Field Definition Format

┌─────────────────────────────────────────────────────────┐
│  FIELD DEFINITION (What Goes in Prompt)                 │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  For each filterable field, provide:                    │
│                                                         │
│  1. NAME: Canonical field name                          │
│  2. TYPE: numeric, text, enum                           │
│  3. UNIT: USD, kg, hours, etc.                          │
│  4. RANGE: min and max values                           │
│  5. ALIASES: Alternative names users might say          │
│  6. MAPPINGS: Vague term → specific value               │
│                                                         │
└─────────────────────────────────────────────────────────┘

Example: The Price Field

FIELD: price
├─ Type: numeric
├─ Unit: USD
├─ Range: 0 - 10,000
├─ Aliases: cost, budget, amount
└─ Mappings:
   • "cheap" → < 800
   • "affordable" → < 1200
   • "mid-range" → 1200 - 2500
   • "expensive" → > 2500
   • "premium" → > 4000

Generating Field Prompts from Registry

def get_field_prompt(field_registry: dict) -> str:
    """Generate field definitions for injection into LLM prompt."""
    lines = []
    for name, spec in field_registry.items():
        if not spec.get("filterable"):
            continue
        
        line = f"- {name} ({spec['data_type']})"
        
        if spec.get("unit"):
            line += f" [unit: {spec['unit']}]"
        
        if spec.get("range_min") is not None:
            line += f" [range: {spec['range_min']}-{spec['range_max']}]"
        
        if spec.get("aliases"):
            line += f" [aliases: {', '.join(spec['aliases'])}]"
        
        lines.append(line)
    
    return "\n".join(lines)

# Example output:
# - price (numeric) [unit: USD] [range: 0-10000] [aliases: cost, budget]
# - weight_kg (numeric) [unit: kg] [range: 0.5-5] [aliases: weight, lightweight]
# - battery_life (numeric) [unit: hours] [range: 1-24] [aliases: battery]
# - graphics_card (text) [aliases: gpu, graphics]

The Extraction Prompt

Designing a Field-Aware Prompt

FILTER_PROMPT = """
Extract search filters from the user's query.

AVAILABLE FIELDS (Only use these):
{field_definitions}

OPERATORS:
- = (equals)
- < (less than)
- <= (less than or equal)
- > (greater than)
- >= (greater than or equal)
- between (range)

VAGUE TERM MAPPINGS:
- "cheap" / "budget" → price < 800
- "affordable" → price < 1200
- "lightweight" → weight_kg < 1.5
- "ultralight" → weight_kg < 1.0
- "good battery" / "long battery" → battery_life > 8
- "all-day battery" → battery_life > 12

USER QUERY: "{query}"

Return JSON:
{{
  "filters": {{
    "field_name": {{"operator": "value"}},
    ...
  }},
  "reasoning": "Why these filters were extracted"
}}

RULES:
- ONLY use fields from the list above
- Map vague terms to specific values using the mappings
- If a term is unclear, OMIT it (don't guess)
- Use correct units from field definitions
- Values must be within the field's range
"""

Extraction Flow

USER: "Find cheap laptops with good battery, under 2kg"
  ↓
┌─────────────────────────────────────────────────────────┐
│ PROMPT INJECTION                                        │
├─────────────────────────────────────────────────────────┤
│ Available fields:                                       │
│ - price (numeric) [unit: USD] [range: 0-10000]          │
│ - weight_kg (numeric) [unit: kg] [range: 0.5-5]         │
│ - battery_life (numeric) [unit: hours] [range: 1-24]    │
│ - brand (text)                                          │
│ - graphics_card (text)                                  │
│                                                         │
│ Vague term mappings:                                    │
│ - "cheap" → price < 800                                 │
│ - "good battery" → battery_life > 8                     │
└─────────────────────────────────────────────────────────┘
  ↓
┌─────────────────────────────────────────────────────────┐
│ LLM EXTRACTION                                          │
├─────────────────────────────────────────────────────────┤
│ {                                                       │
│   "filters": {                                          │
│     "price": {"<": 800},                                │
│     "battery_life": {">": 8},                           │
│     "weight_kg": {"<": 2}                               │
│   },                                                    │
│   "reasoning": "cheap→price<800, good battery→>8h,      │
│                 explicit 'under 2kg'"                   │
│ }                                                       │
└─────────────────────────────────────────────────────────┘

The Extraction Function

async def extract_filters(query: str, field_registry: dict) -> dict:
    """Extract filters from natural language query."""
    # Generate field definitions
    field_defs = get_field_prompt(field_registry)
    
    # Build prompt
    prompt = FILTER_PROMPT.format(
        field_definitions=field_defs,
        query=query
    )
    
    # Call LLM
    response = await llm.generate(
        prompt=prompt,
        temperature=0.1,  # Low temperature for consistency
        max_tokens=300
    )
    
    # Parse JSON response
    result = json.loads(response)
    return result["filters"]

💡 Key Insight

By extracting JSON instead of asking the LLM to write raw SQL, you immunize your agent against SQL injection attacks. A prompt injection that tries to DROP TABLE will fail — the LLM can only output filter JSON, which your code validates and converts to parameterized queries. This is a critical security win.

ℹ

For schemas with more than 50 fields, the field registry prompt can get large. Use context pruning — only inject fields relevant to the detected intent or component type. This keeps token usage low and improves extraction accuracy.

Validation and Clamping

The Validation Flow

Even with field definitions, LLMs can still return invalid values. Always validate.

LLM EXTRACTION: {"price": {"<": 50000}, "gpu_performance": {">": "good"}}
  ↓
┌─────────────────────────────────────────────────────────┐
│ STEP 1: CHECK FIELD EXISTS                              │
├─────────────────────────────────────────────────────────┤
│ • "price" in registry? ✓                                │
│ • "gpu_performance" in registry? ❌                      │
│   → REJECT: Unknown field "gpu_performance"             │
└─────────────────────────────────────────────────────────┘
  ↓
┌─────────────────────────────────────────────────────────┐
│ STEP 2: CHECK VALUE IN RANGE                            │
├─────────────────────────────────────────────────────────┤
│ • price: 50000 in range [0, 10000]? ❌                   │
│   → CLAMP: 50000 → 10000                                │
│   → WARN: "Clamped price to max $10,000"                │
└─────────────────────────────────────────────────────────┘
  ↓
┌─────────────────────────────────────────────────────────┐
│ VALIDATED FILTERS                                       │
├─────────────────────────────────────────────────────────┤
│ {                                                       │
│   "price": {"<": 10000}                                 │
│ }                                                       │
│ Warnings: ["Clamped price to $10,000",                  │
│            "Ignored unknown field: gpu_performance"]    │
└─────────────────────────────────────────────────────────┘

The Validation Function

from dataclasses import dataclass
from typing import Dict, List, Tuple, Any

@dataclass
class ValidationResult:
    filters: Dict[str, Any]
    warnings: List[str]

def validate_filters(raw_filters: dict, field_registry: dict) -> ValidationResult:
    """Validate and clamp extracted filters."""
    validated = {}
    warnings = []
    
    for field, condition in raw_filters.items():
        # Step 1: Check field exists
        if field not in field_registry:
            warnings.append(f"Unknown field ignored: '{field}'")
            continue
        
        spec = field_registry[field]
        
        # Step 2: Check field is filterable
        if not spec.get("filterable", True):
            warnings.append(f"Field '{field}' is not filterable")
            continue
        
        # Extract operator and value
        operator = list(condition.keys())[0]
        value = condition[operator]
        
        # Step 3: Clamp to valid range
        if spec.get("range_max") is not None and value > spec["range_max"]:
            warnings.append(f"{field}: clamped {value} to max {spec['range_max']}")
            value = spec["range_max"]
        
        if spec.get("range_min") is not None and value < spec["range_min"]:
            warnings.append(f"{field}: clamped {value} to min {spec['range_min']}")
            value = spec["range_min"]
        
        validated[field] = {operator: value}
    
    return ValidationResult(filters=validated, warnings=warnings)

Constraint Relaxation

The Zero Results Problem

Sometimes valid filters return no results. Don’t just show “No results found” — help the user.

USER: "Find Dell laptops under $500 with 32GB RAM"
  ↓
FILTERS: brand = "Dell", price < 500, ram >= 32
  ↓
DATABASE QUERY: WHERE brand = 'Dell' AND price < 500 AND ram >= 32
  ↓
RESULT: 0 products ❌
  ↓
WHAT NOW? (Don't just fail!)

Relaxation Strategy

Define a priority order for constraints. Relax the least important first.

┌─────────────────────────────────────────────────────────┐
│ PRIORITY ORDER (Most to Least Important)               │
├─────────────────────────────────────────────────────────┤
│ 1. Core Specs (RAM, storage) - Most important          │
│ 2. Price - Somewhat flexible                           │
│ 3. Brand - Least important (user might accept others)  │
└─────────────────────────────────────────────────────────┘

Relaxation Flow

INITIAL FILTERS: brand = "Dell", price < 500, ram >= 32
  ↓
┌─────────────────────────────────────────────────────────┐
│ TRY 1: All constraints                                  │
│ WHERE brand = 'Dell' AND price < 500 AND ram >= 32      │
│ → 0 results                                             │
└─────────────────────────────────────────────────────────┘
  ↓
┌─────────────────────────────────────────────────────────┐
│ TRY 2: Relax brand (least important)                    │
│ WHERE price < 500 AND ram >= 32                         │
│ → 0 results                                             │
└─────────────────────────────────────────────────────────┘
  ↓
┌─────────────────────────────────────────────────────────┐
│ TRY 3: Relax price (next least important)               │
│ WHERE ram >= 32                                         │
│ → 12 results ✓                                          │
└─────────────────────────────────────────────────────────┘
  ↓
RESPONSE: "No Dell laptops under $500 with 32GB RAM found.
           I found 12 laptops with 32GB RAM (relaxed brand and price):"

Relaxation Implementation

async def search_with_relaxation(
    filters: dict,
    priority_order: list,
    min_results: int = 3
) -> Tuple[list, list]:
    """Search with automatic constraint relaxation."""
    
    # Try strict search first
    results = await search(filters)
    if len(results) >= min_results:
        return results, []  # No relaxation needed
    
    # Relax constraints in priority order (least important first)
    relaxed_fields = []
    current_filters = filters.copy()
    
    for field in reversed(priority_order):  # Least important first
        if field not in current_filters:
            continue
        
        # Remove this constraint
        del current_filters[field]
        relaxed_fields.append(field)
        
        # Try again
        results = await search(current_filters)
        if len(results) >= min_results:
            return results, relaxed_fields
    
    # Return whatever we have
    return results, relaxed_fields

# Usage:
results, relaxed = await search_with_relaxation(
    filters={"brand": "Dell", "price": {"<": 500}, "ram": {">=": 32}},
    priority_order=["ram", "price", "brand"]  # Most to least important
)

if relaxed:
    print(f"Note: Relaxed constraints: {', '.join(relaxed)}")

The Complete Flow

End-to-End Filter Extraction

USER: "Find cheap Dell laptops with good battery"
  ↓
┌─────────────────────────────────────────────────────────┐
│ STEP 1: INJECT FIELD DEFINITIONS                        │
│ Generate prompt with field registry                     │
└─────────────────────────────────────────────────────────┘
  ↓
┌─────────────────────────────────────────────────────────┐
│ STEP 2: LLM EXTRACTION                                  │
│ {                                                       │
│   "price": {"<": 800},                                  │
│   "brand": {"=": "Dell"},                               │
│   "battery_life": {">": 8}                              │
│ }                                                       │
└─────────────────────────────────────────────────────────┘
  ↓
┌─────────────────────────────────────────────────────────┐
│ STEP 3: VALIDATION                                      │
│ All fields exist? ✓                                     │
│ All values in range? ✓                                  │
└─────────────────────────────────────────────────────────┘
  ↓
┌─────────────────────────────────────────────────────────┐
│ STEP 4: DATABASE SEARCH                                 │
│ Found: 0 results                                        │
└─────────────────────────────────────────────────────────┘
  ↓
┌─────────────────────────────────────────────────────────┐
│ STEP 5: CONSTRAINT RELAXATION                           │
│ Relax brand → 3 results ✓                               │
└─────────────────────────────────────────────────────────┘
  ↓
RESPONSE: "No Dell laptops matched, but I found 3 cheap 
           laptops with good battery from other brands:"

The Proof: Before/After

Metrics That Changed

Before	After
Unknown field errors: 35%	Unknown field errors: 2% (rejected with warning)
Out-of-range hallucinations: 28%	Out-of-range hallucinations: 0% (clamped)
'Cheap' mapped correctly: 0% (unbounded guesses)	'Cheap' mapped correctly: 94% (price < $800)
Zero-result dead ends: 42%	Zero-result dead ends: 8% (relaxation helps)
User frustration: 'Search doesn't work'	User satisfaction: 'Search is smart'

What changed: We stopped letting the LLM guess and started telling it exactly what exists.

Resources

ReAct: Synergizing Reasoning and Acting in Language Models (Yao et al., 2022) — Filter extraction is the “Act” step: the LLM reasons about the query, then the code executes the validated filter. The paper explains why the loop needs grounding in external data to avoid hallucination.
Toolformer: Language Models Can Teach Themselves to Use Tools (Schick et al., 2023) — Demonstrates the model deciding what arguments to pass to APIs; directly analogous to field-aware filter extraction. The failure modes (invented API arguments) map exactly to the invented-field disasters in this issue.
OpenAI Structured Outputs — The production mechanism for getting structured JSON from LLMs. The extraction patterns in this issue map directly to OpenAI’s strict: true function definitions.
OWASP: SQL Injection — Why the intermediate JSON representation (extracting filters as JSON, converting to parameterized queries) is a critical security pattern, not just an engineering convenience.
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (Wei et al., 2022) — The reasoning field in the extraction prompt ("reasoning": "cheap→price<800") is a direct application of CoT: asking the model to explain its extraction improves accuracy and makes failures debuggable.

The Checklist: Filter Extraction Readiness

	Item	Score
	Do you inject field definitions into the extraction prompt?	Field Defs
	Do you map vague terms to specific values?	Vague Terms
	Do you reject unknown fields?	Validation
	Do you clamp out-of-range values?	Clamping
	Do you warn users about adjustments?	Warnings
	Do you have a constraint relaxation strategy?	Relaxation
	Is relaxation priority order defined?	Priority
	Do you tell users which constraints were relaxed?	Feedback

0 of 8

Score Interpretation:

0-3 checked: Extraction not ready. Start with field definitions.
4-6 checked: Prototype-ready. Add validation and clamping.
7-8 checked: Production-ready.

What’s Next

Issue 6: The Memory Layer

Now that you can extract filters and search, how do you remember what was found?

“User says ‘Tell me about the first one.’ You need to remember: (1) what products were fetched, (2) what order they were in, (3) what the user has selected.”

What You’ll Learn:

Session context design (what to remember)
Database persistence (chat history, selections)
Token tracking (cost control across sessions)

Key Takeaways

1 The Problem: LLMs extract filters but don't know your schema's boundaries
2 The Solution: Inject field definitions (name, type, unit, range, aliases), validate everything (reject unknown, clamp out-of-range), relax constraints when needed (least important first)
3 Security Win: JSON extraction instead of raw SQL immunizes against injection attacks
4 Key Takeaway: Field-aware extraction turns vague language into precise queries

Glossary

Filter Extraction: Converting natural language to structured query filters
Field Registry: List of filterable fields with metadata (units, ranges, aliases)
Validation: Checking if extracted filters match your schema
Clamping: Adjusting out-of-range values to valid boundaries
Constraint Relaxation: Removing filters when query returns 0 results
Vague Terms: Words like “cheap”, “good”, “lightweight” that need explicit mapping
Priority Order: Ranking constraints from most to least important for relaxation
Intermediate Representation: JSON filters instead of raw SQL — provides security against injection
Context Pruning: Only injecting relevant fields into prompts for large schemas (>50 fields)

Until next issue,

Sentient Zero Labs