Most teams connect AI agents directly to Postgres. It seems simple—just give agents a connection string and let them query. But here's what happens: agents write inefficient queries that lock tables, access sensitive data they shouldn't, and create security risks that keep you up at night.
I've seen teams deploy agents with direct Postgres access, then discover agents scanning millions of rows, accessing production data during peak hours, or violating compliance requirements. The fix is always harder than building it right from the start.
A safe agent layer sits between your agents and Postgres. It provides the controls that direct database access can't: query optimization, access boundaries, security enforcement, and performance protection. It's the difference between hoping agents behave and ensuring they do.
This guide shows you how to build a safe agent layer on top of Postgres. You'll learn the architecture, implementation patterns, and practical steps to secure agent access without breaking functionality.
Table of Contents
- Why You Need a Safe Agent Layer
- The Architecture: Three-Layer Pattern
- Layer 1: Read Replica Isolation
- Layer 2: Sandboxed Views
- Layer 3: Tool Abstraction
- Step-by-Step Implementation
- Real-World Examples
- Common Mistakes to Avoid
- Where Pylar Fits In
- Frequently Asked Questions
Why You Need a Safe Agent Layer
Direct Postgres access for agents creates multiple problems:
Problem 1: Performance Impact
What happens: Agents write queries that hold long locks, scan millions of rows, or exhaust the connection pool.
Example: An agent writes a query that scans the entire customers table (10 million rows) without using an index. The query runs for 45 seconds, ties up connections, and causes timeouts across your application.
Impact: Customer-facing services slow down or crash. Revenue impact: $50,000 in lost sales.
Solution: Safe layer routes agents to read replicas and optimizes queries through views.
Problem 2: Security Risks
What happens: Agents can access any table, any column, any data they have permissions for.
Example: A support agent needs to look up Customer A, but with direct access, it can query all customers, employee data, financial records, and more.
Impact: Data breaches, compliance violations, security incidents.
Solution: Safe layer enforces access boundaries through sandboxed views. Agents can only access data defined in views.
Problem 3: Cost Explosion
What happens: Agents generate expensive queries that spike database costs.
Example: An agent enters a retry loop, scanning a 500GB table 1,000 times per minute. At even $0.05 of compute and I/O per scan, that compounds to roughly $2.2 million per month if nobody catches it.
Impact: Unexpected infrastructure costs, budget overruns.
Solution: Safe layer monitors costs, sets limits, and optimizes queries to keep costs predictable.
Problem 4: Compliance Failures
What happens: During compliance audits, you can't prove agents only access appropriate data.
Example: SOC2 auditor asks: "How do you ensure agents only access customer data they're authorized to see?" You don't have an answer.
Impact: Failed audits, delayed product launches, regulatory fines.
Solution: Safe layer provides audit trails, access logs, and governance controls that prove compliance.
Problem 5: No Query Control
What happens: Agents can write any query, including dangerous ones (DROP TABLE, DELETE, etc.).
Example: An agent writes a query that deletes all test data, but accidentally targets production.
Impact: Data loss, service disruption, recovery costs.
Solution: Safe layer restricts queries to SELECT statements on sandboxed views. No write access, no dangerous operations.
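To make that restriction concrete, here's a minimal sketch of a SELECT-only guard in Python, assuming only the sqlparse library; in practice you'd also enforce this at the database level by granting the agent role SELECT on views only.
import sqlparse

def guard_query(sql: str) -> str:
    """Reject anything that isn't a single SELECT statement."""
    statements = sqlparse.parse(sql)
    if len(statements) != 1:
        raise ValueError("Exactly one statement is allowed")
    statement_type = statements[0].get_type()  # 'SELECT', 'DROP', 'DELETE', ...
    if statement_type != "SELECT":
        raise ValueError(f"Only SELECT is permitted, got {statement_type}")
    return sql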
The Architecture: Three-Layer Pattern
The safe agent layer uses a three-layer architecture:
Agent → Tool Layer → View Layer → Read Replica → Postgres Production
Layer 1: Read Replica (Isolation)
Purpose: Isolate agent queries from production Postgres.
What it does:
- Routes all agent queries to read replicas
- Protects production performance
- Allows independent scaling
Why it matters: Agent queries don't impact production. Production stays fast for customer-facing services.
Layer 2: Sandboxed Views (Governance)
Purpose: Define exactly what data agents can access.
What it does:
- Creates SQL views that limit access
- Filters rows and columns
- Enforces compliance requirements
Why it matters: Agents can only access data defined in views. No accidental exposure of sensitive data.
Layer 3: Tool Abstraction (Interface)
Purpose: Provide agent-friendly interfaces to views.
What it does:
- Creates MCP tools from views
- Validates inputs
- Handles errors gracefully
Why it matters: Agents use tools, not views directly. Natural language interfaces make agents more effective.
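Here's a minimal end-to-end sketch of the three layers in Python, assuming psycopg2 and a hypothetical replica DSN: the tool validates input (Layer 3), queries only a sandboxed view (Layer 2), over a connection that points at the replica (Layer 1).
import re
import psycopg2

# Layer 1: the connection string points at the read replica, never production
REPLICA_DSN = "host=agent-replica dbname=app user=agent_ro password=..."

def get_customer_info(email: str):
    # Layer 3: validate input before it reaches the database
    if not re.match(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", email):
        raise ValueError("Invalid email format")
    with psycopg2.connect(REPLICA_DSN) as conn, conn.cursor() as cur:
        # Layer 2: a fixed query against the sandboxed view, with a bound parameter
        cur.execute(
            "SELECT * FROM customer_support_view WHERE email = %s LIMIT 1",
            (email,),
        )
        row = cur.fetchone()
        if row is None:
            return None
        return dict(zip([col[0] for col in cur.description], row))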
Layer 1: Read Replica Isolation
The first layer isolates agent queries from production Postgres.
Why Read Replicas?
Production protection: Agent queries can't impact production performance. Production stays fast for customer-facing services.
Independent scaling: Scale replicas independently based on agent query volume.
Cost optimization: Replicas are cheaper than production. You can optimize replica configuration for analytical queries.
Disaster recovery: Replicas can serve as backups. If production fails, replicas provide continuity.
Setting Up Read Replicas
Option 1: Managed Replicas (Recommended)
AWS RDS:
# Create read replica
aws rds create-db-instance-read-replica \
--db-instance-identifier agent-replica \
--source-db-instance-identifier production-db \
--db-instance-class db.r5.large
Google Cloud SQL:
# Create read replica
gcloud sql instances create agent-replica \
--master-instance-name production-db \
--tier db-custom-2-7680
Azure Database:
# Create read replica
az postgres flexible-server replica create \
--resource-group myResourceGroup \
--replica-name agent-replica \
--source-server production-db
Option 2: Streaming Replication (Self-Managed)
Postgres streaming replication:
-- On production (primary)
ALTER SYSTEM SET wal_level = replica;
ALTER SYSTEM SET max_wal_senders = 3;
ALTER SYSTEM SET max_replication_slots = 3;
-- These three settings require a server restart; pg_reload_conf() is not enough
-- Create replication user
CREATE USER replicator WITH REPLICATION PASSWORD 'password';
-- On the replica: restore a base backup (e.g. via pg_basebackup), create an
-- empty standby.signal file in the data directory, then add to postgresql.conf:
primary_conninfo = 'host=production-db port=5432 user=replicator password=password'
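Once the replica is streaming, it's worth verifying from code that the agent layer really is pointed at a standby; a small sketch assuming psycopg2, using pg_is_in_recovery(), which returns true on a replica.
import psycopg2

def assert_is_replica(dsn: str) -> None:
    """Fail fast if the agent layer is pointed at a primary by mistake."""
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute("SELECT pg_is_in_recovery()")
        if not cur.fetchone()[0]:
            raise RuntimeError("DSN points at a primary, not a read replica")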
Configuring Replicas for Agent Queries
Optimize replicas for analytical queries:
Configuration:
-- Increase work_mem for analytical queries
ALTER SYSTEM SET work_mem = '256MB';
-- Increase shared_buffers for caching (takes effect only after a restart)
ALTER SYSTEM SET shared_buffers = '4GB';
-- Enable query logging
ALTER SYSTEM SET log_statement = 'all';
ALTER SYSTEM SET log_min_duration_statement = 1000; -- Log queries > 1 second
SELECT pg_reload_conf(); -- Reloads everything except shared_buffers
Monitoring Replica Performance
Track replica health:
Key metrics:
- Replication lag: How far behind production (should be < 1 second)
- Query latency: How fast queries execute
- Connection pool usage: How many connections are active
- CPU/Memory usage: Resource utilization
Monitoring queries:
-- Check replication lag
SELECT
client_addr,
state,
sync_state,
pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn) AS lag_bytes
FROM pg_stat_replication;
-- Check query performance
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
max_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
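A sketch of a small monitor that runs the lag check on a schedule, assuming psycopg2 and a hypothetical alert() hook; note the lag query runs against the primary, since pg_stat_replication is populated there.
import time
import psycopg2

LAG_QUERY = """
    SELECT COALESCE(MAX(pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn)), 0)
    FROM pg_stat_replication
"""

def alert(message: str) -> None:
    print(f"ALERT: {message}")  # stand-in for Slack/PagerDuty/etc.

def watch_replication(primary_dsn: str, max_lag_bytes: int = 16 * 1024 * 1024) -> None:
    while True:
        with psycopg2.connect(primary_dsn) as conn, conn.cursor() as cur:
            cur.execute(LAG_QUERY)
            lag_bytes = cur.fetchone()[0]
            if lag_bytes > max_lag_bytes:
                alert(f"Replication lag is {lag_bytes} bytes")
        time.sleep(60)  # check once a minute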
Layer 2: Sandboxed Views
The second layer creates sandboxed views that define what agents can access.
What Are Sandboxed Views?
Sandboxed views are SQL views that:
- Limit columns (exclude sensitive fields)
- Filter rows (only relevant data)
- Join data across systems (unified access)
- Optimize queries (pre-aggregate, index)
Example:
-- Customer Support View (Sandboxed)
CREATE VIEW customer_support_view AS
SELECT
customer_id,
customer_name,
email,
plan_name,
signup_date,
subscription_status,
last_login_date,
active_users_30d,
open_tickets
FROM customers
WHERE is_active = true
AND signup_date >= CURRENT_DATE - INTERVAL '2 years' -- GDPR: only last 2 years
-- Excludes: credit_card_number, internal_notes, ssn, etc.
Creating Sandboxed Views
Step 1: Identify What Agents Need
Before creating views, identify what data agents actually need:
Questions to ask:
- What questions will agents answer?
- What data is required to answer those questions?
- What's the minimum data needed? (principle of least privilege)
- What data should agents never access?
Example: A support agent needs:
- ✅ Customer name, email, plan, signup date
- ✅ Recent product usage (last 30 days)
- ✅ Open support tickets
- ❌ Credit card numbers
- ❌ Internal sales notes
- ❌ Other customers' data
Step 2: Create the View
Create a view that includes only what agents need:
-- Customer Support View
CREATE VIEW customer_support_view AS
SELECT
customer_id,
customer_name,
email,
plan_name,
signup_date,
subscription_status,
last_login_date,
active_users_30d,
open_tickets
FROM customers
WHERE is_active = true
AND signup_date >= CURRENT_DATE - INTERVAL '2 years';
Step 3: Add Indexes
Add indexes to optimize queries:
-- Index for common queries
CREATE INDEX idx_customer_support_email ON customers(email) WHERE is_active = true;
CREATE INDEX idx_customer_support_signup ON customers(signup_date) WHERE is_active = true;
Step 4: Test the View
Test the view to verify it works:
-- Test query
SELECT * FROM customer_support_view
WHERE email = 'customer@example.com';
-- Verify it returns correct data
-- Verify it excludes sensitive fields
-- Verify it filters correctly
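These checks are easy to automate; a sketch assuming psycopg2, which asserts that the view's column list (read from information_schema) never exposes sensitive fields:
import psycopg2

SENSITIVE_COLUMNS = {"credit_card_number", "ssn", "internal_notes"}

def test_view_excludes_sensitive_columns(dsn: str) -> None:
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(
            "SELECT column_name FROM information_schema.columns WHERE table_name = %s",
            ("customer_support_view",),
        )
        columns = {row[0] for row in cur.fetchall()}
        leaked = columns & SENSITIVE_COLUMNS
        assert not leaked, f"View exposes sensitive columns: {leaked}"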
Advanced View Patterns
Pattern 1: Customer-Scoped Views
Views that filter by customer from conversation context:
-- Customer-scoped view. Postgres views can't take bind parameters, so read
-- the customer ID from a per-session setting the tool layer sets on each
-- request (the view fails closed if the setting is absent)
CREATE VIEW customer_context_view AS
SELECT * FROM customer_support_view
WHERE customer_id = current_setting('app.customer_id')::bigint;
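On the tool side, each request pins the session to the conversation's customer before querying; a sketch assuming psycopg2, where set_config's third argument makes the setting transaction-local:
import psycopg2

def query_customer_context(dsn: str, customer_id: int) -> list:
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        # Both statements run in one transaction; the setting is transaction-local,
        # so concurrent requests can't see each other's customer_id
        cur.execute(
            "SELECT set_config('app.customer_id', %s, true)",
            (str(customer_id),),
        )
        cur.execute("SELECT * FROM customer_context_view")
        return cur.fetchall()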
Pattern 2: Multi-Table Joins
Views that join data across multiple tables:
-- Unified customer view (joins customers + orders + support)
CREATE VIEW customer_360_view AS
SELECT
c.customer_id,
c.customer_name,
c.email,
c.plan_name,
o.order_count,
o.total_revenue,
s.open_tickets,
s.last_ticket_date
FROM customers c
LEFT JOIN (
SELECT customer_id, COUNT(*) as order_count, SUM(amount) as total_revenue
FROM orders
GROUP BY customer_id
) o ON c.customer_id = o.customer_id
LEFT JOIN (
SELECT customer_id, COUNT(*) as open_tickets, MAX(created_at) as last_ticket_date
FROM support_tickets
WHERE status = 'open'
GROUP BY customer_id
) s ON c.customer_id = s.customer_id
WHERE c.is_active = true;
Pattern 3: Pre-Aggregated Views
Views that pre-aggregate data for performance:
-- Pre-aggregated customer metrics
CREATE VIEW customer_metrics_aggregated AS
SELECT
customer_id,
customer_name,
email,
plan_name,
-- Pre-aggregated metrics
total_revenue,
order_count,
avg_order_value,
active_users_30d,
feature_adoption_score
FROM customers_aggregated
WHERE is_active = true;
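One way to maintain a table like customers_aggregated is a materialized view refreshed on a schedule; a sketch assuming psycopg2, run against the primary (replicas are read-only), with the refreshed data then streaming down to the replica:
import time
import psycopg2

def refresh_aggregates(primary_dsn: str, interval_seconds: int = 3600) -> None:
    """Periodically rebuild the pre-aggregated data agents read from."""
    while True:
        with psycopg2.connect(primary_dsn) as conn, conn.cursor() as cur:
            # CONCURRENTLY avoids blocking readers (requires a unique index)
            cur.execute("REFRESH MATERIALIZED VIEW CONCURRENTLY customers_aggregated")
        time.sleep(interval_seconds)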
Layer 3: Tool Abstraction
The third layer creates tools that agents use to query views.
What Are MCP Tools?
MCP (Model Context Protocol) tools are functions that agents can call. They abstract database queries behind natural language interfaces.
Tool structure:
{
"name": "get_customer_info",
"description": "Get customer information for support context. Returns customer details, subscription status, recent usage, and open tickets.",
"parameters": {
"email": {
"type": "string",
"description": "Customer email address",
"required": true
}
},
"query": "SELECT * FROM customer_support_view WHERE email = :email LIMIT 1"
}
Creating Tools from Views
Step 1: Define the Tool
Describe what the tool should do:
Tool: get_customer_info
Purpose: Get customer information for support context
Input: Customer email
Output: Customer details, subscription status, recent usage, open tickets
View: customer_support_view
Step 2: Create Tool Definition
Create the tool definition:
{
"name": "get_customer_info",
"description": "Get customer information for support context. Returns customer details, subscription status, recent usage, and open tickets.",
"parameters": {
"email": {
"type": "string",
"description": "Customer email address",
"required": true
}
},
"query": "SELECT * FROM customer_support_view WHERE email = :email LIMIT 1"
}
Step 3: Add Input Validation
Validate inputs to prevent SQL injection:
import re

EMAIL_PATTERN = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')

def validate_email(email: str) -> bool:
    """Validate email format before it reaches the database"""
    return bool(EMAIL_PATTERN.match(email))

def get_customer_info(email: str):
    # Validate input
    if not validate_email(email):
        raise ValueError("Invalid email format")
    # Execute with a bound parameter -- never string interpolation
    query = "SELECT * FROM customer_support_view WHERE email = :email LIMIT 1"
    result = db.execute(query, {"email": email})  # db: your database handle
    return result
Step 4: Test the Tool
Test the tool to verify it works:
# Test with valid email
result = get_customer_info("customer@example.com")
assert result is not None
# Test with invalid email
try:
get_customer_info("invalid-email")
assert False, "Should raise ValueError"
except ValueError:
pass
# Test with non-existent email
result = get_customer_info("nonexistent@example.com")
assert not result  # empty result set; most drivers don't return None here
Step-by-Step Implementation
Here's how to build the safe agent layer step by step:
Step 1: Set Up Read Replica
Time: 30-60 minutes
Steps:
- Create read replica of production Postgres
- Configure replica for analytical queries
- Test replication lag (< 1 second)
- Set up monitoring
Commands:
# AWS RDS example
aws rds create-db-instance-read-replica \
--db-instance-identifier agent-replica \
--source-db-instance-identifier production-db \
--db-instance-class db.r5.large
Step 2: Create Sandboxed Views
Time: 1-2 hours per view
Steps:
- Identify what data agents need
- Create view with only needed columns
- Add filters (active customers, date ranges, etc.)
- Add indexes for performance
- Test the view
Example:
-- Create view
CREATE VIEW customer_support_view AS
SELECT
customer_id,
customer_name,
email,
plan_name,
subscription_status,
active_users_30d,
open_tickets
FROM customers
WHERE is_active = true
AND signup_date >= CURRENT_DATE - INTERVAL '2 years';
-- Add indexes
CREATE INDEX idx_customer_support_email ON customers(email) WHERE is_active = true;
Step 3: Create MCP Tools
Time: 30 minutes per tool
Steps:
- Define tool purpose and parameters
- Create tool definition
- Add input validation
- Test the tool
Example:
{
"name": "get_customer_info",
"description": "Get customer information for support context",
"parameters": {
"email": {
"type": "string",
"description": "Customer email address",
"required": true
}
},
"query": "SELECT * FROM customer_support_view WHERE email = :email LIMIT 1"
}
Step 4: Connect Agents
Time: 15 minutes per agent
Steps:
- Get MCP server URL and credentials
- Add to agent framework configuration
- Test agent can use tools
- Verify access boundaries work
Example (Claude Desktop):
{
"mcpServers": {
"pylar": {
"url": "https://api.pylar.ai/mcp/your-server-id",
"apiKey": "your-bearer-token"
}
}
}
Step 5: Add Monitoring
Time: 1 hour
Steps:
- Set up query logging
- Track query performance
- Monitor costs
- Set up alerts
Example:
-- Enable query logging
ALTER SYSTEM SET log_statement = 'all';
ALTER SYSTEM SET log_min_duration_statement = 1000;
-- Monitor query performance
SELECT
query,
calls,
total_exec_time,
mean_exec_time
FROM pg_stat_statements
WHERE query LIKE '%customer_support_view%'
ORDER BY total_exec_time DESC;
Step 6: Iterate Based on Usage
Time: Ongoing
Steps:
- Monitor how agents use the layer
- Identify optimization opportunities
- Refine views and tools
- Add new views as needed
Real-World Examples
Let me show you real-world implementations:
Example 1: Support Agent Layer
Requirements:
- Support agents answer customer questions
- Access customer data for support context
- Cannot access other customers' data
- Cannot access financial data
Implementation:
1. Read Replica:
# Create read replica
aws rds create-db-instance-read-replica \
--db-instance-identifier support-agent-replica \
--source-db-instance-identifier production-db
2. Sandboxed View:
CREATE VIEW customer_support_view AS
SELECT
customer_id,
customer_name,
email,
plan_name,
subscription_status,
last_login_date,
active_users_30d,
open_tickets
FROM customers
WHERE is_active = true
AND signup_date >= CURRENT_DATE - INTERVAL '2 years';
CREATE INDEX idx_customer_support_email ON customers(email) WHERE is_active = true;
3. MCP Tool:
{
"name": "get_customer_info",
"description": "Get customer information for support context",
"parameters": {
"email": {
"type": "string",
"description": "Customer email address",
"required": true
}
},
"query": "SELECT * FROM customer_support_view WHERE email = :email LIMIT 1"
}
Result: Support agents get complete customer context without accessing other customers' data or financial information.
Example 2: Analytics Agent Layer
Requirements:
- Analytics agents generate insights
- Access aggregated customer data
- Cannot access PII
- Cannot access individual customer records
Implementation:
1. Read Replica:
# Create read replica optimized for analytics
aws rds create-db-instance-read-replica \
--db-instance-identifier analytics-agent-replica \
--source-db-instance-identifier production-db \
--db-instance-class db.r5.xlarge # Larger instance for analytics
2. Sandboxed View (Pre-aggregated):
CREATE VIEW customer_analytics_view AS
SELECT
-- Aggregated metrics only (no PII)
plan_name,
subscription_status,
COUNT(*) as customer_count,
AVG(active_users_30d) as avg_active_users,
SUM(total_revenue) as total_revenue,
AVG(order_count) as avg_orders
FROM customers_aggregated
WHERE is_active = true
GROUP BY plan_name, subscription_status
HAVING COUNT(*) >= 10; -- Minimum aggregation threshold
3. MCP Tool:
{
"name": "get_customer_analytics",
"description": "Get aggregated customer analytics (no PII)",
"parameters": {
"plan_name": {
"type": "string",
"description": "Plan name to filter by (optional)",
"required": false
}
},
"query": "SELECT * FROM customer_analytics_view WHERE plan_name = COALESCE(:plan_name, plan_name)"
}
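In the executing layer, the same optional filter looks like this; a sketch assuming psycopg2, where the explicit IS NULL branch keeps rows with a NULL plan_name from being silently dropped:
import psycopg2

def get_customer_analytics(dsn: str, plan_name: str | None = None) -> list:
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        # NULL-safe optional filter: when plan_name is None, the first
        # branch is true for every row, so nothing is filtered out
        cur.execute(
            "SELECT * FROM customer_analytics_view "
            "WHERE %(plan)s::text IS NULL OR plan_name = %(plan)s",
            {"plan": plan_name},
        )
        return cur.fetchall()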
Result: Analytics agents get insights without accessing individual customer data or PII.
Example 3: Multi-Tenant Agent Layer
Requirements:
- Agents access tenant data
- Complete isolation between tenants
- Agents can only access their tenant's data
Implementation:
1. Read Replica:
# Create read replica
aws rds create-db-instance-read-replica \
--db-instance-identifier tenant-agent-replica \
--source-db-instance-identifier production-db
2. Sandboxed View (Tenant-scoped):
CREATE VIEW tenant_data_view AS
SELECT
tenant_id,
customer_id,
customer_name,
email,
plan_name,
subscription_status,
usage_data
FROM customers
WHERE tenant_id = current_setting('app.tenant_id') -- set per request by the tool layer
AND is_active = true;
CREATE INDEX idx_tenant_data_tenant ON customers(tenant_id) WHERE is_active = true;
3. MCP Tool:
{
"name": "get_tenant_customers",
"description": "Get customers for current tenant",
"parameters": {
"tenant_id": {
"type": "string",
"description": "Tenant ID (from conversation context)",
"required": true
}
},
"query": "SELECT * FROM tenant_data_view WHERE tenant_id = :tenant_id"
}
Result: Complete tenant isolation. Agents can only access their tenant's data.
Common Mistakes to Avoid
Here are mistakes I've seen teams make:
Mistake 1: Skipping the Read Replica
What happens: Teams connect agents directly to production Postgres.
Why it's a problem: Agent queries impact production performance. One slow query can crash customer-facing services.
The fix: Always use read replicas for agent queries. Never connect agents directly to production.
Mistake 2: Not Creating Views
What happens: Teams give agents access to raw tables instead of views.
Why it's a problem: No access boundaries. Agents can access any data, any column, any row.
The fix: Always create sandboxed views. Agents query views, not tables.
Mistake 3: Views That Are Too Broad
What happens: Teams create views that include too much data "just in case."
Why it's a problem: Broad views defeat the purpose. Agents can still access data they shouldn't.
The fix: Follow principle of least privilege. Views should include only the minimum data needed.
Mistake 4: Not Adding Indexes
What happens: Teams create views but don't add indexes.
Why it's a problem: Queries are slow, costs spike, user experience degrades.
The fix: Add indexes for frequently queried columns. Monitor query performance and add indexes as needed.
Mistake 5: No Input Validation
What happens: Tools accept any input without validation.
Why it's a problem: SQL injection risk, invalid queries, errors.
The fix: Validate all inputs. Check formats, ranges, and types before executing queries.
Mistake 6: Not Monitoring Performance
What happens: Teams deploy the layer and don't monitor it.
Why it's a problem: Performance issues go unnoticed. Costs spike. Users get frustrated.
The fix: Monitor query performance, costs, and access patterns from day one. Set up alerts for anomalies.
Mistake 7: Static Views
What happens: Views are created once and never updated.
Why it's a problem: As agents evolve, views become outdated. Over-permissive or under-permissive access.
The fix: Review and update views regularly. Remove unused views, optimize existing views, add new views as needed.
Where Pylar Fits In
Pylar makes building a safe agent layer on top of Postgres practical. Here's how:
Read Replica Support: Pylar connects to read replicas, not production. You configure replica connections in Pylar, and all agent queries route to replicas automatically. Production stays protected.
Sandboxed Views: Pylar's SQL IDE lets you create sandboxed views that define exactly what agents can access. Views can join data across multiple systems, filter rows and columns, and enforce compliance requirements. Views are version-controlled and auditable.
MCP Tool Builder: Pylar automatically generates MCP tools from your views. Describe what you want in natural language, and Pylar creates the tool definition, parameter validation, and query logic. No backend engineering required.
Query Optimization: Pylar views are optimized for agent queries. Pylar suggests indexes, pre-aggregations, and query optimizations that keep costs low and performance fast.
Access Control: Pylar enforces access control on every query. Before executing a query, Pylar validates role, context, and permissions. Queries that violate access boundaries are rejected.
Monitoring: Pylar Evals tracks query performance, costs, and access patterns. You can see exactly how agents are using your Postgres layer, which queries are most expensive, and where optimization opportunities exist.
Framework-Agnostic: Pylar tools work with any MCP-compatible framework. Whether you're using Claude Desktop, LangGraph, OpenAI, n8n, or any other framework, Pylar provides the same safe layer.
Pylar is the safe agent layer for Postgres. Instead of building custom read replica routing, view management, and tool generation, you build views and tools in Pylar. The safe layer is built in.
Frequently Asked Questions
Do I need a read replica for agents?
For production workloads, yes. Agent queries on the primary compete with customer-facing traffic, and one bad query can degrade everything. For a low-volume internal prototype you can start on the primary, but treat that as temporary.
Can I use the same replica for multiple agents?
Yes. Replicas scale independently of production, so one replica can serve several agents. Split workloads onto separate replicas when volumes or query shapes (fast support lookups vs. heavy analytics) start to conflict.
How do I handle write operations?
Keep them out of this layer. The safe layer is SELECT-only by design; route writes through your application's APIs, where they get the same validation and review as any other write path.
What if I need real-time data?
Streaming replication typically lags by under a second, which is effectively real-time for most agent use cases. If you genuinely need zero lag, expose a narrow, read-only view on the primary and monitor it closely.
How do I optimize view performance?
Index the columns agents filter on, pre-aggregate expensive joins, and watch pg_stat_statements for the slowest queries. Optimize the top offenders first.
Can I use views with existing Postgres setup?
Yes. Views are standard Postgres objects, so you can add them to an existing database without touching tables or application code.
How do I test the safe layer?
Verify each view returns correct data and excludes sensitive fields, test each tool with valid, invalid, and non-existent inputs, and confirm that queries outside the access boundaries are rejected.
What if I need to update a view?
Use CREATE OR REPLACE VIEW to change it in place (drop and recreate if you're removing columns). Review views regularly so they track what agents actually need.
How do I monitor the safe layer?
Enable query logging, track performance and costs through pg_stat_statements, watch replication lag, and alert on anomalies like sudden spikes in query volume.
Can I use this pattern with other databases?
Yes. The three-layer pattern (replica isolation, sandboxed views, tool abstraction) applies to any database that supports read replicas and views; the Supabase guide linked below walks through one variant.
Building a safe agent layer on top of Postgres isn't optional—it's essential. Without it, agents can impact production performance, access sensitive data, and create security risks. Start with read replicas, add sandboxed views, create tools, and monitor continuously.
The three-layer pattern gives you the foundation. Isolate agent queries from production, govern access through views, and provide agent-friendly interfaces through tools. With proper implementation, agents become secure, performant business tools rather than production risks.
Related Posts
How to Build MCP Tools Without Coding
You don't need to code to build MCP tools. This tactical guide shows three ways to create them—from manual coding to Pylar's natural language approach—and why the simplest method takes under 2 minutes.
Agent Cost Optimization: A Data Engineer's Guide
Agent costs can spiral out of control fast. This practical guide for data engineers shows where costs come from, how to measure them, and strategies to optimize costs by 50-70% without breaking functionality.
Building a Supabase MCP Server for AI Agents
Learn how to build a Supabase MCP server that safely exposes your database to AI agents. Use RLS policies, sandboxed views, and MCP tools to create a secure agent data access layer.