How to Build a Safe Agent Layer on Top of Postgres

by Hoshang Mehta

Most teams connect AI agents directly to Postgres. It seems simple—just give agents a connection string and let them query. But here's what happens: agents write inefficient queries that lock tables, access sensitive data they shouldn't, and create security risks that keep you up at night.

I've seen teams deploy agents with direct Postgres access, then discover agents scanning millions of rows, accessing production data during peak hours, or violating compliance requirements. The fix is always harder than building it right from the start.

A safe agent layer sits between your agents and Postgres. It provides the controls that direct database access can't: query optimization, access boundaries, security enforcement, and performance protection. It's the difference between hoping agents behave and ensuring they do.

This guide shows you how to build a safe agent layer on top of Postgres. You'll learn the architecture, implementation patterns, and practical steps to secure agent access without breaking functionality.

Why You Need a Safe Agent Layer

Direct Postgres access for agents creates multiple problems:

Problem 1: Performance Impact

What happens: Agents write queries that scan millions of rows, hold connections open for long stretches, block maintenance operations, or exhaust the connection pool.

Example: An agent writes a query that scans the entire customers table (10 million rows) without touching an index. The query runs for 45 seconds, holds its connection, and stacks up behind other requests until the application starts timing out.

Impact: Customer-facing services slow down or crash. Lost sales during an outage like this can easily run into the tens of thousands of dollars.

Solution: Safe layer routes agents to read replicas and optimizes queries through views.

Problem 2: Security Risks

What happens: Agents can access any table, any column, any data they have permissions for.

Example: A support agent needs to look up Customer A, but with direct access, it can query all customers, employee data, financial records, and more.

Impact: Data breaches, compliance violations, security incidents.

Solution: Safe layer enforces access boundaries through sandboxed views. Agents can only access data defined in views.

Problem 3: Cost Explosion

What happens: Agents generate expensive queries that spike database costs.

Example: An agent enters an infinite loop, querying a 500GB table 1,000 times per minute. At even a few cents of compute per scan, that's roughly 43 million scans and well over $2 million per month.

Impact: Unexpected infrastructure costs, budget overruns.

Solution: Safe layer monitors costs, sets limits, and optimizes queries to keep costs predictable.
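
Postgres can enforce some of these limits directly at the role level. A minimal sketch, assuming agents connect through a dedicated role (agent_ro is an illustrative name):

-- Kill any agent query that runs longer than 5 seconds
ALTER ROLE agent_ro SET statement_timeout = '5s';

-- Cap how many connections the agent role can hold open
ALTER ROLE agent_ro CONNECTION LIMIT 10;

-- Keep per-query memory bounded for this role
ALTER ROLE agent_ro SET work_mem = '64MB';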

Problem 4: Compliance Failures

What happens: During compliance audits, you can't prove agents only access appropriate data.

Example: SOC2 auditor asks: "How do you ensure agents only access customer data they're authorized to see?" You don't have an answer.

Impact: Failed audits, delayed product launches, regulatory fines.

Solution: Safe layer provides audit trails, access logs, and governance controls that prove compliance.
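
On the database side, the pgAudit extension can produce the read-access trail auditors ask for. A sketch, assuming pgAudit is installed on the replica and agents connect as the agent_ro role:

-- Load pgAudit (requires a server restart)
ALTER SYSTEM SET shared_preload_libraries = 'pgaudit';

-- After the restart:
CREATE EXTENSION IF NOT EXISTS pgaudit;

-- Log every read the agent role performs
ALTER ROLE agent_ro SET pgaudit.log = 'read';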

Problem 5: No Query Control

What happens: Agents can write any query, including dangerous ones (DROP TABLE, DELETE, etc.).

Example: An agent writes a query that deletes all test data, but accidentally targets production.

Impact: Data loss, service disruption, recovery costs.

Solution: Safe layer restricts queries to SELECT statements on sandboxed views. No write access, no dangerous operations.
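
Postgres can back this boundary up with a role that holds SELECT privileges only on the sandboxed views. A sketch; agent_ro and the agent_views schema are illustrative names:

-- Dedicated login role for agents, no write privileges anywhere
CREATE ROLE agent_ro LOGIN;

-- Make sure it can't touch the raw tables
REVOKE ALL ON ALL TABLES IN SCHEMA public FROM agent_ro;

-- Grant read access only to the schema that holds the sandboxed views
GRANT USAGE ON SCHEMA agent_views TO agent_ro;
GRANT SELECT ON ALL TABLES IN SCHEMA agent_views TO agent_ro;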


The Architecture: Three-Layer Pattern

The safe agent layer uses a three-layer architecture:

Agent → Tool Layer → View Layer → Read Replica → Postgres Production

Layer 1: Read Replica (Isolation)

Purpose: Isolate agent queries from production Postgres.

What it does:

  • Routes all agent queries to read replicas
  • Protects production performance
  • Allows independent scaling

Why it matters: Agent queries don't impact production. Production stays fast for customer-facing services.

Layer 2: Sandboxed Views (Governance)

Purpose: Define exactly what data agents can access.

What it does:

  • Creates SQL views that limit access
  • Filters rows and columns
  • Enforces compliance requirements

Why it matters: Agents can only access data defined in views. No accidental exposure of sensitive data.

Layer 3: Tool Abstraction (Interface)

Purpose: Provide agent-friendly interfaces to views.

What it does:

  • Creates MCP tools from views
  • Validates inputs
  • Handles errors gracefully

Why it matters: Agents use tools, not views directly. Natural language interfaces make agents more effective.


Layer 1: Read Replica Isolation

The first layer isolates agent queries from production Postgres.

Why Read Replicas?

Production protection: Agent queries can't impact production performance. Production stays fast for customer-facing services.

Independent scaling: Scale replicas independently based on agent query volume.

Cost optimization: Replicas can run on smaller instance classes than production, and you can tune their configuration specifically for analytical queries.

Disaster recovery: A replica can be promoted if production fails, adding continuity. (Replication is not a substitute for real backups, though: destructive changes replicate too.)

Setting Up Read Replicas

Option 1: Managed Replicas (Recommended)

AWS RDS:

# Create read replica
aws rds create-db-instance-read-replica \
  --db-instance-identifier agent-replica \
  --source-db-instance-identifier production-db \
  --db-instance-class db.r5.large

Google Cloud SQL:

# Create read replica
gcloud sql instances create agent-replica \
  --master-instance-name production-db \
  --tier db-n1-standard-2

Azure Database:

# Create read replica
az postgres flexible-server replica create \
  --resource-group myResourceGroup \
  --name agent-replica \
  --source-server production-db

Option 2: Streaming Replication (Self-Managed)

Postgres streaming replication:

-- On production (primary)
ALTER SYSTEM SET wal_level = replica;
ALTER SYSTEM SET max_wal_senders = 3;
ALTER SYSTEM SET max_replication_slots = 3;
-- These three settings require a server restart; pg_reload_conf() is not enough

-- Create replication user
CREATE USER replicator WITH REPLICATION PASSWORD 'password';

-- On the replica: seed the data directory with pg_basebackup,
-- create an empty standby.signal file (Postgres 12+), and add to postgresql.conf:
primary_conninfo = 'host=production-db port=5432 user=replicator password=password'

Configuring Replicas for Agent Queries

Optimize replicas for analytical queries:

Configuration:

-- Increase work_mem for analytical queries
ALTER SYSTEM SET work_mem = '256MB';

-- Increase shared_buffers for caching (takes effect only after a restart)
ALTER SYSTEM SET shared_buffers = '4GB';

-- Enable query logging ('all' is verbose; log_min_duration_statement
-- alone may be enough in production)
ALTER SYSTEM SET log_statement = 'all';
ALTER SYSTEM SET log_min_duration_statement = 1000; -- Log queries > 1 second

SELECT pg_reload_conf(); -- applies the reloadable settings above

Monitoring Replica Performance

Track replica health:

Key metrics:

  • Replication lag: How far behind production (should be < 1 second)
  • Query latency: How fast queries execute
  • Connection pool usage: How many connections are active
  • CPU/Memory usage: Resource utilization

Monitoring queries:

-- Check replication lag
SELECT 
  client_addr,
  state,
  sync_state,
  pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn) AS lag_bytes
FROM pg_stat_replication;

-- Check query performance
SELECT 
  query,
  calls,
  total_exec_time,
  mean_exec_time,
  max_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
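
Note that pg_stat_statements is an extension and must be enabled before the second query returns anything:

-- One-time setup (requires a restart)
ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';

-- After the restart:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;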

Layer 2: Sandboxed Views

The second layer creates sandboxed views that define what agents can access.

What Are Sandboxed Views?

Sandboxed views are SQL views that:

  • Limit columns (exclude sensitive fields)
  • Filter rows (only relevant data)
  • Join data across systems (unified access)
  • Optimize queries (pre-aggregate, index)

Example:

-- Customer Support View (Sandboxed)
CREATE VIEW customer_support_view AS
SELECT 
  customer_id,
  customer_name,
  email,
  plan_name,
  signup_date,
  subscription_status,
  last_login_date,
  active_users_30d,
  open_tickets
FROM customers
WHERE is_active = true
  AND signup_date >= CURRENT_DATE - INTERVAL '2 years';  -- GDPR: only last 2 years
  -- Excludes: credit_card_number, internal_notes, ssn, etc.

Creating Sandboxed Views

Step 1: Identify What Agents Need

Before creating views, identify what data agents actually need:

Questions to ask:

  • What questions will agents answer?
  • What data is required to answer those questions?
  • What's the minimum data needed? (principle of least privilege)
  • What data should agents never access?

Example: A support agent needs:

  • ✅ Customer name, email, plan, signup date
  • ✅ Recent product usage (last 30 days)
  • ✅ Open support tickets
  • ❌ Credit card numbers
  • ❌ Internal sales notes
  • ❌ Other customers' data

Step 2: Create the View

Create a view that includes only what agents need:

-- Customer Support View
CREATE VIEW customer_support_view AS
SELECT 
  customer_id,
  customer_name,
  email,
  plan_name,
  signup_date,
  subscription_status,
  last_login_date,
  active_users_30d,
  open_tickets
FROM customers
WHERE is_active = true
  AND signup_date >= CURRENT_DATE - INTERVAL '2 years';

Step 3: Add Indexes

Plain Postgres views can't be indexed directly, so add indexes on the underlying table's filter columns. Partial indexes that match the view's WHERE clause work well:

-- Index for common queries
CREATE INDEX idx_customer_support_email ON customers(email) WHERE is_active = true;
CREATE INDEX idx_customer_support_signup ON customers(signup_date) WHERE is_active = true;

Step 4: Test the View

Test the view to verify it works:

-- Test query
SELECT * FROM customer_support_view 
WHERE email = 'customer@example.com';

-- Verify it returns correct data
-- Verify it excludes sensitive fields
-- Verify it filters correctly
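
You can check the column list mechanically instead of by eye:

-- Confirm exactly which columns the view exposes
SELECT column_name
FROM information_schema.columns
WHERE table_name = 'customer_support_view'
ORDER BY ordinal_position;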

Advanced View Patterns

Pattern 1: Customer-Scoped Views

Views that filter by customer from conversation context:

-- Customer-scoped view: Postgres views can't take bind parameters,
-- so filter on a session setting that the tool layer controls
CREATE VIEW customer_context_view AS
SELECT * FROM customer_support_view
WHERE customer_id::text = current_setting('app.conversation_customer_id', true);
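
The tool layer sets the session variable before running the agent's query, so the agent can never supply (or override) the customer scope. The setting name app.conversation_customer_id is arbitrary; any namespaced name works:

-- Scoped to a single transaction:
BEGIN;
SET LOCAL app.conversation_customer_id = 'cust_123';
SELECT * FROM customer_context_view;
COMMIT;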

Pattern 2: Multi-Table Joins

Views that join data across multiple tables:

-- Unified customer view (joins customers + orders + support)
CREATE VIEW customer_360_view AS
SELECT 
  c.customer_id,
  c.customer_name,
  c.email,
  c.plan_name,
  o.order_count,
  o.total_revenue,
  s.open_tickets,
  s.last_ticket_date
FROM customers c
LEFT JOIN (
  SELECT customer_id, COUNT(*) as order_count, SUM(amount) as total_revenue
  FROM orders
  GROUP BY customer_id
) o ON c.customer_id = o.customer_id
LEFT JOIN (
  SELECT customer_id, COUNT(*) as open_tickets, MAX(created_at) as last_ticket_date
  FROM support_tickets
  WHERE status = 'open'
  GROUP BY customer_id
) s ON c.customer_id = s.customer_id
WHERE c.is_active = true;

Pattern 3: Pre-Aggregated Views

Views that pre-aggregate data for performance:

-- Pre-aggregated customer metrics
CREATE VIEW customer_metrics_aggregated AS
SELECT 
  customer_id,
  customer_name,
  email,
  plan_name,
  -- Pre-aggregated metrics
  total_revenue,
  order_count,
  avg_order_value,
  active_users_30d,
  feature_adoption_score
FROM customers_aggregated
WHERE is_active = true;
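
In Postgres, the customers_aggregated source can itself be a materialized view refreshed on a schedule. A simplified sketch covering only the revenue columns (active_users_30d and feature_adoption_score would come from similar rollups; it assumes an orders table with order_id and amount):

CREATE MATERIALIZED VIEW customers_aggregated AS
SELECT
  c.customer_id,
  c.customer_name,
  c.email,
  c.plan_name,
  c.is_active,
  COUNT(o.order_id)          AS order_count,
  COALESCE(SUM(o.amount), 0) AS total_revenue,
  COALESCE(AVG(o.amount), 0) AS avg_order_value
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.customer_name, c.email, c.plan_name, c.is_active;

-- CONCURRENTLY lets agent reads continue during refresh; it needs a unique index
CREATE UNIQUE INDEX ON customers_aggregated (customer_id);
REFRESH MATERIALIZED VIEW CONCURRENTLY customers_aggregated;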

Layer 3: Tool Abstraction

The third layer creates tools that agents use to query views.

What Are MCP Tools?

MCP (Model Context Protocol) tools are functions that agents can call. They abstract database queries behind natural language interfaces.

Tool structure:

{
  "name": "get_customer_info",
  "description": "Get customer information for support context. Returns customer details, subscription status, recent usage, and open tickets.",
  "parameters": {
    "email": {
      "type": "string",
      "description": "Customer email address",
      "required": true
    }
  },
  "query": "SELECT * FROM customer_support_view WHERE email = :email LIMIT 1"
}

Creating Tools from Views

Step 1: Define the Tool

Describe what the tool should do:

Tool: get_customer_info
Purpose: Get customer information for support context
Input: Customer email
Output: Customer details, subscription status, recent usage, open tickets
View: customer_support_view

Step 2: Create Tool Definition

Create the tool definition:

{
  "name": "get_customer_info",
  "description": "Get customer information for support context. Returns customer details, subscription status, recent usage, and open tickets.",
  "parameters": {
    "email": {
      "type": "string",
      "description": "Customer email address",
      "required": true
    }
  },
  "query": "SELECT * FROM customer_support_view WHERE email = :email LIMIT 1"
}

Step 3: Add Input Validation

Validate inputs to prevent SQL injection:

import re

EMAIL_PATTERN = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')

def validate_email(email: str) -> bool:
    """Validate email format before it ever reaches the database."""
    return bool(EMAIL_PATTERN.match(email))

def get_customer_info(email: str):
    # Reject malformed input early
    if not validate_email(email):
        raise ValueError("Invalid email format")

    # Bound parameters (never string concatenation) prevent SQL injection;
    # `db` is your database client or connection pool
    query = "SELECT * FROM customer_support_view WHERE email = :email LIMIT 1"
    return db.execute(query, {"email": email})

Step 4: Test the Tool

Test the tool to verify it works:

# Test with valid email
result = get_customer_info("customer@example.com")
assert result is not None

# Test with invalid email
try:
    get_customer_info("invalid-email")
    assert False, "Should raise ValueError"
except ValueError:
    pass

# Test with non-existent email
result = get_customer_info("nonexistent@example.com")
assert result is None

Step-by-Step Implementation

Here's how to build the safe agent layer step by step:

Step 1: Set Up Read Replica

Time: 30-60 minutes

Steps:

  1. Create read replica of production Postgres
  2. Configure replica for analytical queries
  3. Test replication lag (< 1 second)
  4. Set up monitoring

Commands:

# AWS RDS example
aws rds create-db-instance-read-replica \
  --db-instance-identifier agent-replica \
  --source-db-instance-identifier production-db \
  --db-instance-class db.r5.large
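
To confirm the lag target in step 3, run this on the replica itself:

-- How far behind replay is, as an interval
SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;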

Step 2: Create Sandboxed Views

Time: 1-2 hours per view

Steps:

  1. Identify what data agents need
  2. Create view with only needed columns
  3. Add filters (active customers, date ranges, etc.)
  4. Add indexes for performance
  5. Test the view

Example:

-- Create view
CREATE VIEW customer_support_view AS
SELECT 
  customer_id,
  customer_name,
  email,
  plan_name,
  subscription_status,
  active_users_30d,
  open_tickets
FROM customers
WHERE is_active = true
  AND signup_date >= CURRENT_DATE - INTERVAL '2 years';

-- Add indexes
CREATE INDEX idx_customer_support_email ON customers(email) WHERE is_active = true;

Step 3: Create MCP Tools

Time: 30 minutes per tool

Steps:

  1. Define tool purpose and parameters
  2. Create tool definition
  3. Add input validation
  4. Test the tool

Example:

{
  "name": "get_customer_info",
  "description": "Get customer information for support context",
  "parameters": {
    "email": {
      "type": "string",
      "description": "Customer email address",
      "required": true
    }
  },
  "query": "SELECT * FROM customer_support_view WHERE email = :email LIMIT 1"
}

Step 4: Connect Agents

Time: 15 minutes per agent

Steps:

  1. Get MCP server URL and credentials
  2. Add to agent framework configuration
  3. Test agent can use tools
  4. Verify access boundaries work

Example (Claude Desktop):

{
  "mcpServers": {
    "pylar": {
      "url": "https://api.pylar.ai/mcp/your-server-id",
      "apiKey": "your-bearer-token"
    }
  }
}

Step 5: Add Monitoring

Time: 1 hour

Steps:

  1. Set up query logging
  2. Track query performance
  3. Monitor costs
  4. Set up alerts

Example:

-- Enable query logging
ALTER SYSTEM SET log_statement = 'all';
ALTER SYSTEM SET log_min_duration_statement = 1000;

-- Monitor query performance
SELECT 
  query,
  calls,
  total_exec_time,
  mean_exec_time
FROM pg_stat_statements
WHERE query LIKE '%customer_support_view%'
ORDER BY total_exec_time DESC;

Step 6: Iterate Based on Usage

Time: Ongoing

Steps:

  1. Monitor how agents use the layer
  2. Identify optimization opportunities
  3. Refine views and tools
  4. Add new views as needed
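
One way to ground step 1 in data: mine pg_stat_statements for which views agents actually hit, and how often:

-- Which sandboxed views get queried, and how much
SELECT query, calls, mean_exec_time
FROM pg_stat_statements
WHERE query ILIKE '%_view%'
ORDER BY calls DESC
LIMIT 20;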

Real-World Examples

Let me show you real-world implementations:

Example 1: Support Agent Layer

Requirements:

  • Support agents answer customer questions
  • Access customer data for support context
  • Cannot access other customers' data
  • Cannot access financial data

Implementation:

1. Read Replica:

# Create read replica
aws rds create-db-instance-read-replica \
  --db-instance-identifier support-agent-replica \
  --source-db-instance-identifier production-db

2. Sandboxed View:

CREATE VIEW customer_support_view AS
SELECT 
  customer_id,
  customer_name,
  email,
  plan_name,
  subscription_status,
  last_login_date,
  active_users_30d,
  open_tickets
FROM customers
WHERE is_active = true
  AND signup_date >= CURRENT_DATE - INTERVAL '2 years';

CREATE INDEX idx_customer_support_email ON customers(email) WHERE is_active = true;

3. MCP Tool:

{
  "name": "get_customer_info",
  "description": "Get customer information for support context",
  "parameters": {
    "email": {
      "type": "string",
      "description": "Customer email address",
      "required": true
    }
  },
  "query": "SELECT * FROM customer_support_view WHERE email = :email LIMIT 1"
}

Result: Support agents get complete customer context without accessing other customers' data or financial information.

Example 2: Analytics Agent Layer

Requirements:

  • Analytics agents generate insights
  • Access aggregated customer data
  • Cannot access PII
  • Cannot access individual customer records

Implementation:

1. Read Replica:

# Create read replica optimized for analytics
aws rds create-db-instance-read-replica \
  --db-instance-identifier analytics-agent-replica \
  --source-db-instance-identifier production-db \
  --db-instance-class db.r5.xlarge  # Larger instance for analytics

2. Sandboxed View (Pre-aggregated):

CREATE VIEW customer_analytics_view AS
SELECT 
  -- Aggregated metrics only (no PII)
  plan_name,
  subscription_status,
  COUNT(*) as customer_count,
  AVG(active_users_30d) as avg_active_users,
  SUM(total_revenue) as total_revenue,
  AVG(order_count) as avg_orders
FROM customers_aggregated
WHERE is_active = true
GROUP BY plan_name, subscription_status
HAVING COUNT(*) >= 10;  -- Minimum aggregation threshold

3. MCP Tool:

{
  "name": "get_customer_analytics",
  "description": "Get aggregated customer analytics (no PII)",
  "parameters": {
    "plan_name": {
      "type": "string",
      "description": "Plan name to filter by (optional)",
      "required": false
    }
  },
  "query": "SELECT * FROM customer_analytics_view WHERE plan_name = COALESCE(:plan_name, plan_name)"
}

Result: Analytics agents get insights without accessing individual customer data or PII.

Example 3: Multi-Tenant Agent Layer

Requirements:

  • Agents access tenant data
  • Complete isolation between tenants
  • Agents can only access their tenant's data

Implementation:

1. Read Replica:

# Create read replica
aws rds create-db-instance-read-replica \
  --db-instance-identifier tenant-agent-replica \
  --source-db-instance-identifier production-db

2. Sandboxed View (Tenant-scoped):

CREATE VIEW tenant_data_view AS
SELECT 
  tenant_id,
  customer_id,
  customer_name,
  email,
  plan_name,
  subscription_status,
  usage_data
FROM customers
-- Views can't take bind parameters; read the tenant from a session setting
WHERE tenant_id::text = current_setting('app.tenant_id', true)
  AND is_active = true;

CREATE INDEX idx_tenant_data_tenant ON customers(tenant_id) WHERE is_active = true;
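
For isolation that holds even if a view is misconfigured, Postgres row-level security can enforce the tenant boundary on the base table. A sketch; note that a view normally runs with its owner's privileges, so on Postgres 15+ you'd mark it security_invoker for RLS to apply to the querying role:

-- Enforce the tenant boundary at the table level
ALTER TABLE customers ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON customers
  USING (tenant_id::text = current_setting('app.tenant_id', true));

-- Postgres 15+: make the view check RLS as the querying role
ALTER VIEW tenant_data_view SET (security_invoker = true);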

3. MCP Tool:

{
  "name": "get_tenant_customers",
  "description": "Get customers for current tenant",
  "parameters": {
    "tenant_id": {
      "type": "string",
      "description": "Tenant ID (from conversation context)",
      "required": true
    }
  },
  "query": "SELECT * FROM tenant_data_view WHERE tenant_id = :tenant_id"
}

Result: Complete tenant isolation. Agents can only access their tenant's data.


Common Mistakes to Avoid

Here are mistakes I've seen teams make:

Mistake 1: Skipping the Read Replica

What happens: Teams connect agents directly to production Postgres.

Why it's a problem: Agent queries impact production performance. One slow query can crash customer-facing services.

The fix: Always use read replicas for agent queries. Never connect agents directly to production.

Mistake 2: Not Creating Views

What happens: Teams give agents access to raw tables instead of views.

Why it's a problem: No access boundaries. Agents can access any data, any column, any row.

The fix: Always create sandboxed views. Agents query views, not tables.

Mistake 3: Views That Are Too Broad

What happens: Teams create views that include too much data "just in case."

Why it's a problem: Broad views defeat the purpose. Agents can still access data they shouldn't.

The fix: Follow principle of least privilege. Views should include only the minimum data needed.

Mistake 4: Not Adding Indexes

What happens: Teams create views but don't add indexes.

Why it's a problem: Queries are slow, costs spike, user experience degrades.

The fix: Add indexes for frequently queried columns. Monitor query performance and add indexes as needed.

Mistake 5: No Input Validation

What happens: Tools accept any input without validation.

Why it's a problem: SQL injection risk, invalid queries, errors.

The fix: Validate all inputs. Check formats, ranges, and types before executing queries.

Mistake 6: Not Monitoring Performance

What happens: Teams deploy the layer and don't monitor it.

Why it's a problem: Performance issues go unnoticed. Costs spike. Users get frustrated.

The fix: Monitor query performance, costs, and access patterns from day one. Set up alerts for anomalies.

Mistake 7: Static Views

What happens: Views are created once and never updated.

Why it's a problem: As agents evolve, views become outdated. Over-permissive or under-permissive access.

The fix: Review and update views regularly. Remove unused views, optimize existing views, add new views as needed.


Where Pylar Fits In

Pylar makes building a safe agent layer on top of Postgres practical. Here's how:

Read Replica Support: Pylar connects to read replicas, not production. You configure replica connections in Pylar, and all agent queries route to replicas automatically. Production stays protected.

Sandboxed Views: Pylar's SQL IDE lets you create sandboxed views that define exactly what agents can access. Views can join data across multiple systems, filter rows and columns, and enforce compliance requirements. Views are version-controlled and auditable.

MCP Tool Builder: Pylar automatically generates MCP tools from your views. Describe what you want in natural language, and Pylar creates the tool definition, parameter validation, and query logic. No backend engineering required.

Query Optimization: Pylar views are optimized for agent queries. Pylar suggests indexes, pre-aggregations, and query optimizations that keep costs low and performance fast.

Access Control: Pylar enforces access control on every query. Before executing a query, Pylar validates role, context, and permissions. Queries that violate access boundaries are rejected.

Monitoring: Pylar Evals tracks query performance, costs, and access patterns. You can see exactly how agents are using your Postgres layer, which queries are most expensive, and where optimization opportunities exist.

Framework-Agnostic: Pylar tools work with any MCP-compatible framework. Whether you're using Claude Desktop, LangGraph, OpenAI, n8n, or any other framework, Pylar provides the same safe layer.

Pylar is the safe agent layer for Postgres. Instead of building custom read replica routing, view management, and tool generation, you build views and tools in Pylar. The safe layer is built in.




Building a safe agent layer on top of Postgres isn't optional—it's essential. Without it, agents can impact production performance, access sensitive data, and create security risks. Start with read replicas, add sandboxed views, create tools, and monitor continuously.

The three-layer pattern gives you the foundation. Isolate agent queries from production, govern access through views, and provide agent-friendly interfaces through tools. With proper implementation, agents become secure, performant business tools rather than production risks.
