
Building Agent Products

Ship production-ready agent systems that users love and pay for

3-4 hours
3 min read
Updated January 15, 2026

You can build agents. Now let's ship products. This guide covers architecture, deployment, UX, and monetization for agent-powered applications.

Product Architecture
The full stack

Architecture Patterns

Choose based on your use case:

Synchronous (Simple)

Best for quick tasks (< 30 seconds):

Python
@app.post("/analyze")
async def analyze(request: AnalyzeRequest):
result = agent.run(request.query)
return {"result": result}

Pros: Simple, easy to debug.
Cons: User waits; timeout risk.

Asynchronous (Scalable)

Best for longer tasks:

Python
@app.post("/analyze")
async def start_analysis(request: AnalyzeRequest):
job_id = create_job(request)
queue.enqueue(run_agent_job, job_id)
return {"job_id": job_id}
@app.get("/jobs/{job_id}")
async def get_job(job_id: str):
return get_job_status(job_id)

Pros: Handles long tasks, scalable.
Cons: More complex; needs job management.
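
A minimal sketch of the queued job itself, following on from the snippet above. The agent call matches the synchronous example; the job-store helpers (`load_job`, `update_job_status`) are illustrative names, not a specific framework API.

Python
def run_agent_job(job_id: str) -> None:
    # Worker-side counterpart to queue.enqueue(run_agent_job, job_id) above.
    update_job_status(job_id, status="running")
    try:
        job = load_job(job_id)                 # fetch the stored request
        result = agent.run(job["query"])       # same agent as the sync example
        update_job_status(job_id, status="done", result=result)
    except Exception as exc:
        update_job_status(job_id, status="failed", error=str(exc))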

Streaming (Real-time)

Best for interactive experiences:

Python
@app.post("/chat/stream")
async def stream_chat(request: ChatRequest):
async def generate():
async for chunk in agent.stream(request.message):
yield f"data: {json.dumps(chunk)}\n\n"
return StreamingResponse(generate(), media_type="text/event-stream")

Pros: Great UX, immediate feedback.
Cons: More complex client handling.

When to Use What

Pattern | Best for | Trade-off
Synchronous | Quick tasks (< 30 seconds) | User waits; timeout risk
Asynchronous | Long-running tasks | Needs job management
Streaming | Interactive, real-time experiences | More complex client handling

Real-World Product Examples

1. Code Review Bot

Product: Automated code review for GitHub PRs

Code Review Flow

Key Features:

  • Webhook receives PR events
  • Agent analyzes diff for bugs, security, style
  • Posts inline comments on GitHub
  • Approves or requests changes
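
A minimal sketch of the webhook end of this flow, assuming FastAPI, a GitHub token, and a hypothetical `review_agent`; the result is posted through GitHub's pull request reviews endpoint.

Python
import os

import httpx
from fastapi import FastAPI, Request

app = FastAPI()
GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]

@app.post("/webhooks/github")
async def handle_pr_event(request: Request):
    event = await request.json()
    if event.get("action") not in {"opened", "synchronize"}:
        return {"status": "ignored"}

    pr = event["pull_request"]
    async with httpx.AsyncClient() as client:
        # Fetch the diff, run the (assumed) review agent, post the result as a review.
        diff = (await client.get(pr["diff_url"])).text
        findings = review_agent.run(diff)      # hypothetical agent
        await client.post(
            f"{pr['url']}/reviews",
            headers={"Authorization": f"Bearer {GITHUB_TOKEN}"},
            json={
                "body": findings.summary,
                "event": findings.verdict,     # e.g. "APPROVE" or "REQUEST_CHANGES"
            },
        )
    return {"status": "reviewed"}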

2. Customer Support Agent

Product: AI-powered ticket handling

Support Flow

Key Features:

  • Multi-channel intake (email, chat, form)
  • Knowledge base search
  • CRM integration for context
  • Human escalation workflow
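
One way to wire the escalation step is a confidence threshold: the agent drafts a reply using the knowledge base and CRM context listed above, and anything below the bar goes to a human. All helper names here are illustrative.

Python
CONFIDENCE_THRESHOLD = 0.7   # assumption; tune against your own escalation data

def handle_ticket(ticket: dict) -> dict:
    # Gather context before the agent sees the ticket (KB search + CRM lookup).
    context = {
        "kb_articles": search_kb(ticket["subject"]),
        "customer": crm_lookup(ticket["customer_id"]),
    }
    draft = support_agent.run(ticket["body"], context=context)

    if draft.confidence < CONFIDENCE_THRESHOLD:
        escalate_to_human(ticket, draft)       # human agent takes over the thread
        return {"status": "escalated"}

    send_reply(ticket["id"], draft.text)
    return {"status": "resolved"}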

3. Research Assistant

Product: Automated research and reports

Key Features:

  • Multi-source research
  • Fact verification
  • Citation management
  • Multiple output formats

4. Data Analysis Platform

Product: Natural language data exploration

Key Features:

  • Natural language to SQL
  • Chart generation
  • Dashboard builder
  • Scheduled reports
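
The "natural language to SQL" step is usually a constrained prompt over a known schema plus a read-only execution path. A sketch using the Anthropic Python SDK; the model name and the SELECT-only guardrail are assumptions.

Python
from anthropic import Anthropic

client = Anthropic()

def question_to_sql(question: str, schema_ddl: str) -> str:
    """Translate an analytics question into a single read-only SQL statement."""
    message = client.messages.create(
        model="claude-sonnet-4-20250514",      # assumed model name; use whatever you deploy
        max_tokens=1024,
        system=(
            "You translate analytics questions into one SQLite SELECT statement.\n"
            f"Schema:\n{schema_ddl}\n"
            "Return only SQL, with no commentary."
        ),
        messages=[{"role": "user", "content": question}],
    )
    sql = message.content[0].text.strip()
    if not sql.lower().startswith("select"):
        raise ValueError("Refusing to run non-SELECT SQL")   # cheap safety check
    return sql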

User Experience Design

Progressive Disclosure

Don't overwhelm users with agent complexity:

Bash
Level 1: Simple Chat
└── User asks, agent responds
Level 2: Show Thinking
└── Display reasoning steps
Level 3: Tool Visibility
└── Show which tools are used
Level 4: Full Control
└── User guides decisions

Real-time Feedback

Keep users informed during long operations:

TypeScript
import { useEffect, useState } from "react";

function AgentStatus({ jobId, totalSteps }) {
  const [status, setStatus] = useState("starting");
  const [steps, setSteps] = useState([]);

  useEffect(() => {
    const events = new EventSource(`/jobs/${jobId}/stream`);
    events.onmessage = (e) => {
      const data = JSON.parse(e.data);
      setStatus(data.status);
      setSteps(data.steps);
    };
    return () => events.close();
  }, [jobId]);

  return (
    <div>
      <StatusBadge status={status} />
      <StepsList steps={steps} />
      <ProgressBar progress={steps.length / totalSteps} />
    </div>
  );
}

Error States

Design for failures:

State | Message | Actions
Timeout | "Taking longer than expected" | Wait, Cancel
Tool failure | "Couldn't access X" | Retry, Skip
Uncertain | "Found conflicting info" | Show both
Can't help | "Outside my capabilities" | Contact human
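
Keeping these states in one place lets the API and the UI stay in sync; a small sketch of that mapping (names and shape are illustrative).

Python
# Map internal failure modes to the user-facing copy and actions in the table above.
ERROR_STATES = {
    "timeout":      {"message": "Taking longer than expected", "actions": ["wait", "cancel"]},
    "tool_failure": {"message": "Couldn't access {tool}",      "actions": ["retry", "skip"]},
    "uncertain":    {"message": "Found conflicting info",      "actions": ["show_both"]},
    "cannot_help":  {"message": "Outside my capabilities",     "actions": ["contact_human"]},
}

def error_payload(state: str, **details) -> dict:
    spec = ERROR_STATES[state]
    return {
        "state": state,
        "message": spec["message"].format(**details),   # e.g. error_payload("tool_failure", tool="GitHub")
        "actions": spec["actions"],
    }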

Trust Building

Help users trust agent outputs:

  1. Show sources: Link to where information came from
  2. Confidence indicators: Signal when uncertain
  3. Edit suggestions: Let users modify outputs
  4. Explain reasoning: Show why decisions were made
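
In practice this means returning sources, a confidence signal, and a short rationale alongside the answer instead of bare text. A sketch of one possible response shape; the field names and the 0.6 threshold are assumptions.

Python
from dataclasses import dataclass, field

@dataclass
class AgentAnswer:
    text: str                                          # the answer shown to the user
    sources: list[str] = field(default_factory=list)   # URLs or document IDs the answer drew on
    confidence: float = 1.0                            # 0-1 signal surfaced in the UI
    reasoning: str = ""                                # short explanation of key decisions

def render(answer: AgentAnswer) -> dict:
    payload = {"text": answer.text, "sources": answer.sources, "reasoning": answer.reasoning}
    if answer.confidence < 0.6:                        # assumed threshold; tune per product
        payload["notice"] = "I'm not fully certain about this; please verify."
    return payload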

Deployment

Docker Setup

dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
YAML
# docker-compose.yml
version: '3.8'
services:
  api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - REDIS_URL=redis://redis:6379
    depends_on:
      - redis
  worker:
    build: .
    command: celery -A tasks worker --loglevel=info
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - REDIS_URL=redis://redis:6379
  redis:
    image: redis:7-alpine

Serverless

For variable workloads:

Python
# AWS Lambda handler
import json

def handler(event, context):
    body = json.loads(event['body'])
    result = agent.run(body['query'])
    return {
        'statusCode': 200,
        'body': json.dumps({'result': result})
    }

Kubernetes

For scale:

YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: agent-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: agent-service
  template:
    metadata:
      labels:
        app: agent-service
    spec:
      containers:
        - name: agent
          image: your-registry/agent:latest
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          env:
            - name: ANTHROPIC_API_KEY
              valueFrom:
                secretKeyRef:
                  name: api-keys
                  key: anthropic

Monitoring

Key Metrics

Metric | Description | Alert If
Completion rate | % of tasks completed | < 95%
Latency (p95) | Time to response | > 30s
Tool failure rate | % of tool calls failed | > 5%
Token usage | Tokens per task | > budget
User satisfaction | Ratings | < 4/5
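
To produce these metrics (and feed the alert rules below), instrument the agent loop. A sketch with prometheus_client; metric names are chosen to line up with the alert expressions, everything else is an assumption.

Python
from prometheus_client import Counter, Histogram

AGENT_DURATION = Histogram("agent_duration_seconds", "End-to-end agent latency")
TASKS = Counter("agent_tasks", "Agent tasks by outcome", ["status"])
TOOL_CALLS = Counter("tool_calls", "Tool invocations")
TOOL_FAILURES = Counter("tool_failures", "Failed tool invocations")

def run_instrumented(task: str):
    with AGENT_DURATION.time():                  # feeds the p95 latency alert below
        try:
            result = agent.run(task)             # `agent` as elsewhere in this guide
            TASKS.labels(status="completed").inc()
            return result
        except Exception:
            TASKS.labels(status="failed").inc()
            raise

def record_tool_call(success: bool) -> None:
    # Call this from your tool-execution wrapper to drive the failure-rate alert.
    TOOL_CALLS.inc()
    if not success:
        TOOL_FAILURES.inc()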

Structured Logging

Python
import time
from uuid import uuid4

import structlog

logger = structlog.get_logger()

def run_with_logging(user_id: str, task: str):
    log = logger.bind(user_id=user_id, task_id=str(uuid4()))
    log.info("agent_start", task=task)
    try:
        start = time.time()
        result = agent.run(task)
        duration = time.time() - start
        log.info("agent_done",
                 duration=duration,
                 tokens=result.tokens,
                 tools=result.tools)
        return result
    except Exception as e:
        log.error("agent_failed", error=str(e))
        raise

Alerting

YAML
# Prometheus alert rules
groups:
  - name: agent-alerts
    rules:
      - alert: HighLatency
        expr: histogram_quantile(0.95, rate(agent_duration_seconds_bucket[5m])) > 30
        for: 5m
        labels:
          severity: warning
      - alert: HighToolFailures
        expr: rate(tool_failures_total[5m]) / rate(tool_calls_total[5m]) > 0.05
        for: 5m
        labels:
          severity: critical

Cost Management

Token Budgeting

Python
class BudgetExceededError(Exception):
    """Raised when an agent exceeds its token budget."""

class BudgetedAgent:
    def __init__(self, max_tokens: int = 100000):
        self.max_tokens = max_tokens
        self.tokens_used = 0

    def run(self, task: str):
        if self.tokens_used >= self.max_tokens:
            raise BudgetExceededError()
        response = self._call_api(task)
        self.tokens_used += response.usage.total_tokens
        return response

Caching

Python
import hashlib

class CachedAgent:
    def __init__(self):
        self.cache = {}

    def run(self, task: str) -> str:
        key = hashlib.md5(task.encode()).hexdigest()
        if key in self.cache:
            return self.cache[key]
        result = self._run(task)
        self.cache[key] = result
        return result

Usage Tiers

Python
TIERS = {
    "free": {"tokens_day": 10000, "tasks_day": 10},
    "pro": {"tokens_day": 100000, "tasks_day": 100},
    "enterprise": {"tokens_day": None, "tasks_day": None},
}

def check_usage(user_id: str, tier: str) -> bool:
    usage = get_daily_usage(user_id)
    limits = TIERS[tier]
    if limits["tokens_day"] and usage.tokens > limits["tokens_day"]:
        return False
    if limits["tasks_day"] and usage.tasks > limits["tasks_day"]:
        return False
    return True

Safety and Guardrails

Input Validation

Python
from pydantic import BaseModel, validator

class AgentRequest(BaseModel):
    task: str
    max_tokens: int = 4096

    @validator('task')
    def validate_task(cls, v):
        if len(v) > 10000:
            raise ValueError("Task too long")
        if contains_pii(v):
            raise ValueError("Remove personal information")
        return v

Output Filtering

Python
def filter_output(response: str) -> str:
    response = remove_system_paths(response)
    response = redact_secrets(response)
    if contains_harmful_content(response):
        return "I cannot provide that information."
    return response

Action Approval

Python
SENSITIVE_ACTIONS = ["delete_file", "send_email", "make_payment"]

def execute_with_approval(tool: str, params: dict, user_id: str) -> str:
    if tool in SENSITIVE_ACTIONS:
        request_id = create_approval_request(user_id, tool, params)
        approved = wait_for_approval(request_id, timeout=300)
        if not approved:
            return "Action not approved."
    return execute_tool(tool, params)

Rate Limiting

Python
from slowapi import Limiter
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)

@app.post("/agent")
@limiter.limit("10/minute")
async def run_agent(request: Request):
    ...  # agent call goes here

Monetization Models

Usage-Based

Bash
Pay per task:
- Free: 10 tasks/month
- Basic: $0.10/task (up to 1000)
- Pro: $0.05/task (unlimited)
Pay per token:
- $0.01 per 1000 tokens
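
A sketch of how the per-task pricing above turns into a monthly charge; the prices come from the list, the function itself is illustrative.

Python
PRICE_PER_TASK = {"basic": 0.10, "pro": 0.05}   # from the pricing list above

def monthly_bill(plan: str, tasks_used: int) -> float:
    if plan == "free":
        # Free usage (10 tasks/month) is capped rather than billed; enforce at request time.
        return 0.0
    if plan == "basic" and tasks_used > 1000:
        raise ValueError("Basic is capped at 1000 tasks/month; upgrade to Pro")
    return round(tasks_used * PRICE_PER_TASK[plan], 2)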

Subscription Tiers

Bash
Free Tier:
- 100 tasks/month
- Basic tools
- Community support
Pro ($29/month):
- 1000 tasks/month
- All tools
- Priority support
Enterprise (Custom):
- Unlimited
- Custom tools
- SLA guarantee

Value-Based

Price by outcome:

  • Research reports: $5-50/report
  • Code reviews: $1/PR
  • Support tickets: $0.50/resolution

Launch Checklist

Technical

  • [ ] Load testing completed
  • [ ] Error handling covers edge cases
  • [ ] Monitoring and alerts configured
  • [ ] Backup and recovery tested
  • [ ] Security audit completed
  • [ ] API documentation published

Product

  • [ ] Onboarding flow tested
  • [ ] Help documentation written
  • [ ] Feedback mechanism in place
  • [ ] Pricing page live
  • [ ] Terms of service updated

Operations

  • [ ] Support channels ready
  • [ ] Escalation procedures defined
  • [ ] Cost monitoring active
  • [ ] Analytics tracking verified

Case Study: Content Agent Product

Let's walk through building a complete product.

Product: ContentBot

AI content creation for marketers.

Features:

  • Blog post generation
  • Social media content
  • Email campaigns
  • SEO optimization
  • Brand voice matching

MVP Timeline

Week | Focus
1 | Core agent + basic API
2 | User system + tracking
3 | Templates + formats
4 | Deploy + monitor

Architecture

Python
# content_agent.py
from anthropic import Anthropic

class ContentAgent:
    def __init__(self, brand_voice: str = None):
        self.client = Anthropic()
        self.brand_voice = brand_voice

    async def generate_blog(self, topic: str, keywords: list) -> str:
        # Research topic
        research = await self._research(topic)
        # Generate outline
        outline = await self._create_outline(topic, research)
        # Write content
        content = await self._write_content(outline, keywords)
        # Apply brand voice
        if self.brand_voice:
            content = await self._apply_voice(content)
        return content

Next Steps

  • Review Building Agents for implementation
  • Study successful products in your domain
  • Join AI product communities for feedback

Practice Projects

Project | Goal
Ship an MVP | Pick an idea, deploy in 4 weeks
Add monetization | Implement usage-based pricing
Scale to 100 users | Handle concurrent load
Reduce costs 50% | Optimize token usage

