Building Agent Products
Learn to build, deploy, and monetize production-ready products powered by AI agents
Transform your agents into real products that users love and pay for. This guide covers architecture, deployment, user experience, and business considerations for agent-powered applications.
Product Architecture
Core Components
Every agent product needs these building blocks:
```
┌─────────────────────────────────────────┐
│              Your Product               │
├─────────────────────────────────────────┤
│  Frontend          │  Backend           │
│  - User Interface  │  - API Gateway     │
│  - Real-time UI    │  - Agent Runtime   │
│  - Settings        │  - Tool Services   │
│                    │  - Database        │
├─────────────────────────────────────────┤
│             Infrastructure              │
│  - Queues  - Cache  - Storage  - CDN    │
└─────────────────────────────────────────┘
```

Architecture Patterns
1. Synchronous (Simple)
Best for quick tasks (< 30 seconds)
```python
# User waits for the complete response
@app.post("/analyze")
async def analyze(request: AnalyzeRequest):
    result = agent.run(request.query)
    return {"result": result}
```

2. Asynchronous (Scalable)
Best for longer tasks
```python
# Return a job ID immediately
@app.post("/analyze")
async def start_analysis(request: AnalyzeRequest):
    job_id = create_job(request)
    queue.enqueue(run_agent_job, job_id)
    return {"job_id": job_id}

# Poll for results
@app.get("/jobs/{job_id}")
async def get_job(job_id: str):
    job = get_job_status(job_id)
    return job
```

3. Streaming (Real-time)
Best for interactive experiences
@app.post("/chat/stream")async def stream_chat(request: ChatRequest): async def generate(): async for chunk in agent.stream(request.message): yield f"data: {json.dumps(chunk)}\n\n" return StreamingResponse(generate(), media_type="text/event-stream")Real-World Product Examples
1. Code Review Bot
Product: Automated code review for GitHub PRs
Architecture:
```
GitHub Webhook → Queue → Code Review Agent → GitHub API
                                ↓
                  Analysis Storage (for history)
```

Key Features:
- Webhook receives PR events (see the intake sketch below)
- Agent analyzes code changes
- Posts inline comments on GitHub
- Tracks issues found over time
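A minimal sketch of the webhook intake, assuming FastAPI for the endpoint and rq for the queue; `review_pr` is a hypothetical worker function, and the secret must match the one configured on the GitHub webhook:

```python
import hashlib
import hmac

from fastapi import FastAPI, Header, HTTPException, Request
from redis import Redis
from rq import Queue

app = FastAPI()
queue = Queue(connection=Redis())
WEBHOOK_SECRET = b"replace-me"  # shared secret set on the GitHub webhook

@app.post("/webhooks/github")
async def github_webhook(request: Request, x_hub_signature_256: str = Header(...)):
    body = await request.body()
    # Reject payloads that were not signed by GitHub
    expected = "sha256=" + hmac.new(WEBHOOK_SECRET, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, x_hub_signature_256):
        raise HTTPException(status_code=401, detail="Bad signature")
    event = await request.json()
    # Only review newly opened or updated pull requests
    if event.get("action") in ("opened", "synchronize"):
        queue.enqueue(review_pr, event["pull_request"]["url"])  # review_pr: hypothetical worker
    return {"ok": True}
```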
Ask Claude Code:
```
Build a GitHub code review bot that:
1. Receives webhooks for new PRs
2. Fetches the diff
3. Analyzes code for issues, style, security
4. Posts review comments on GitHub
5. Approves or requests changes

Include FastAPI backend, Redis queue, and GitHub integration.
```

2. Customer Support Agent
Product: AI-powered support ticket handling
Architecture:
```
Support Channels → Ticket System → Support Agent → Response
       ↓                                ↓
 (Email, Chat)               Knowledge Base + CRM
```

Key Features:
- Multi-channel intake (email, chat, form)
- Knowledge base search
- CRM integration for context
- Human escalation workflow (see the sketch below)
- Response quality scoring
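A minimal sketch of the escalation decision; `search_kb`, `draft_reply`, and `escalate_to_human` are hypothetical helpers, and the confidence cutoff is an assumption to tune against real tickets:

```python
CONFIDENCE_THRESHOLD = 0.7  # assumed cutoff; tune against labeled tickets

def handle_ticket(ticket: dict) -> dict:
    # Ground the draft in knowledge-base articles plus CRM context
    articles = search_kb(ticket["subject"] + " " + ticket["body"])  # hypothetical RAG search
    draft = draft_reply(ticket, articles)                           # hypothetical agent call
    # Escalate when the agent is unsure or the customer asks for a person
    if draft.confidence < CONFIDENCE_THRESHOLD or "human" in ticket["body"].lower():
        return escalate_to_human(ticket, draft)                     # hypothetical handoff
    return {"response": draft.text, "sources": [a.url for a in articles]}
```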
Ask Claude Code:
```
Build a customer support agent that:
1. Receives tickets from multiple channels
2. Searches knowledge base for answers
3. Drafts responses with tone matching
4. Escalates complex issues to humans
5. Tracks resolution metrics

Include ticket queue, knowledge base RAG, and admin dashboard.
```

3. Research Assistant
Product: Automated research and report generation
Architecture:
```
Research Query → Research Agent → Report Generator → Delivery
                       ↓
     Web Search + Document Analysis + Citations
```

Key Features:
- Multi-source research
- Fact verification
- Citation management (example below)
- Multiple output formats
- Collaborative editing
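Citation management mostly comes down to keeping each claim attached to its source through synthesis. A sketch, with a hypothetical `Finding` record rather than a fixed schema:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    claim: str
    source_url: str
    source_title: str

def render_markdown(findings: list[Finding]) -> str:
    # Number each source once, then cite it footnote-style after every claim
    sources = sorted({(f.source_url, f.source_title) for f in findings})
    index = {url: i + 1 for i, (url, _) in enumerate(sources)}
    body = "\n".join(f"- {f.claim} [{index[f.source_url]}]" for f in findings)
    refs = "\n".join(f"[{i + 1}]: {url} ({title})" for i, (url, title) in enumerate(sources))
    return body + "\n\n" + refs
```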
Ask Claude Code:
```
Build a research assistant product that:
1. Takes research questions from users
2. Searches web and academic sources
3. Synthesizes findings with citations
4. Generates reports in PDF/Markdown
5. Allows users to ask follow-up questions

Include user accounts, report history, and sharing.
```

4. Data Analysis Platform
Product: Natural language data exploration
Architecture:
```
User Question → Data Agent → SQL/Analysis → Visualization
                    ↓
      Database Schema + Query History
```

Key Features:
- Natural language to SQL (sketched below)
- Chart generation
- Dashboard builder
- Scheduled reports
- Data alerts
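A minimal sketch of the question-to-SQL step using the Anthropic Python SDK; the model id, schema string, and SELECT-only guard are illustrative choices, not the only way to do it:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def question_to_sql(question: str, schema: str) -> str:
    message = client.messages.create(
        model="claude-sonnet-4-20250514",  # substitute a current model id
        max_tokens=512,
        system=f"Translate the user's question into one read-only SQL query.\nSchema:\n{schema}",
        messages=[{"role": "user", "content": question}],
    )
    sql = message.content[0].text.strip()
    # Defense in depth: never execute anything except SELECT against user data
    if not sql.lower().startswith("select"):
        raise ValueError(f"Refusing non-SELECT query: {sql}")
    return sql
```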
Ask Claude Code:
```
Build a data analysis agent that:
1. Connects to user databases
2. Understands schema automatically
3. Converts questions to SQL queries
4. Generates visualizations
5. Creates shareable dashboards

Include database connectors, chart library, and permissions.
```

User Experience Design
Progressive Disclosure
Don't overwhelm users with agent complexity:
```
Level 1: Simple Chat
└── User asks, agent responds

Level 2: Show Thinking
└── Display agent's reasoning steps

Level 3: Tool Visibility
└── Show which tools are being used

Level 4: Full Control
└── Let users guide agent decisions
```

Real-time Feedback
Keep users informed during long operations:
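The component below subscribes to a server-sent events stream. A minimal sketch of the matching backend endpoint, assuming FastAPI and a hypothetical `job_events` async generator:

```python
import json

from fastapi.responses import StreamingResponse

@app.get("/jobs/{job_id}/stream")
async def stream_job(job_id: str):
    async def events():
        # job_events: hypothetical async generator yielding {"status": ..., "steps": [...]} dicts
        async for update in job_events(job_id):
            yield f"data: {json.dumps(update)}\n\n"
    return StreamingResponse(events(), media_type="text/event-stream")
```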
```jsx
// Frontend component for agent status
function AgentStatus({ jobId, totalSteps }) {  // totalSteps passed in rather than left undefined
  const [status, setStatus] = useState('starting');
  const [steps, setSteps] = useState([]);

  useEffect(() => {
    const eventSource = new EventSource(`/jobs/${jobId}/stream`);
    eventSource.onmessage = (event) => {
      const data = JSON.parse(event.data);
      setStatus(data.status);
      setSteps(data.steps);
    };
    return () => eventSource.close();
  }, [jobId]);

  return (
    <div>
      <StatusBadge status={status} />
      <StepsList steps={steps} />
      <ProgressBar progress={steps.length / totalSteps} />
    </div>
  );
}
```

Error States
Design for agent failures (a handling sketch follows the list):
- Timeout: "Taking longer than expected. [Wait] [Cancel]"
- Tool failure: "Couldn't access X. Trying alternative..."
- Uncertain: "I found conflicting information. Here are both perspectives..."
- Can't help: "This is outside my capabilities. [Contact human]"
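A sketch of mapping runtime failures to those states; the exception names here are placeholders for whatever your agent runtime actually raises:

```python
# Placeholder exception names; map whatever your runtime raises
ERROR_STATES = {
    "AgentTimeout": {"message": "Taking longer than expected.", "actions": ["wait", "cancel"]},
    "ToolError": {"message": "Couldn't access a tool. Trying an alternative...", "actions": ["retry"]},
    "OutOfScope": {"message": "This is outside my capabilities.", "actions": ["contact_human"]},
}

def to_error_state(exc: Exception) -> dict:
    # Fall back to a generic retryable state for anything unmapped
    return ERROR_STATES.get(
        type(exc).__name__,
        {"message": "Something went wrong.", "actions": ["retry"]},
    )
```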
Trust Building
Help users trust agent outputs (a response-envelope example follows):
- Show sources: Link to where information came from
- Confidence indicators: Signal when agent is uncertain
- Edit suggestions: Let users modify agent outputs
- Explain reasoning: Show why agent made decisions
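One way to carry all four through the stack is a response envelope; a sketch with illustrative field names:

```python
from pydantic import BaseModel

class AgentAnswer(BaseModel):
    text: str
    sources: list[str] = []        # URLs the answer was grounded in
    confidence: float = 1.0        # 0-1; the UI can flag anything below ~0.7
    reasoning: str | None = None   # shown behind a "why?" disclosure in the UI
```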
Deployment Strategies
Container-based Deployment
```dockerfile
# Dockerfile for agent service
FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

```yaml
# docker-compose.yml
version: '3.8'

services:
  api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - REDIS_URL=redis://redis:6379
    depends_on:
      - redis
      - postgres

  worker:
    build: .
    command: celery -A tasks worker --loglevel=info
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - REDIS_URL=redis://redis:6379

  redis:
    image: redis:7-alpine

  postgres:
    image: postgres:15
    environment:
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}  # the postgres image refuses to start without it
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:
```

Serverless Deployment
For variable workloads:
```python
import json

# AWS Lambda handler
def handler(event, context):
    body = json.loads(event['body'])
    result = agent.run(body['query'])
    return {
        'statusCode': 200,
        'body': json.dumps({'result': result})
    }
```

Kubernetes for Scale
```yaml
# k8s deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: agent-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: agent-service
  template:
    metadata:
      labels:
        app: agent-service  # must match the selector above
    spec:
      containers:
        - name: agent
          image: your-registry/agent:latest
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          env:
            - name: ANTHROPIC_API_KEY
              valueFrom:
                secretKeyRef:
                  name: api-keys
                  key: anthropic
```

Monitoring and Observability
Key Metrics to Track
| Metric | Description | Alert Threshold |
|--------|-------------|-----------------|
| Task completion rate | % of tasks completed successfully | < 95% |
| Average latency | Time from request to response | > 30s |
| Tool failure rate | % of tool calls that fail | > 5% |
| Token usage | Tokens consumed per task | > budget |
| User satisfaction | Ratings/feedback scores | < 4/5 |
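These can be emitted with prometheus_client; a sketch whose metric names match the alert rules shown later in this section:

```python
from prometheus_client import Counter, Histogram

AGENT_DURATION = Histogram("agent_duration_seconds", "Time from request to response")
TOOL_CALLS = Counter("tool_calls_total", "Tool calls attempted")
TOOL_FAILURES = Counter("tool_failures_total", "Tool calls that failed")

def timed_agent_run(task: str):
    # Histogram.time() records the elapsed seconds when the block exits
    with AGENT_DURATION.time():
        return agent.run(task)  # agent as defined elsewhere in your service
```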
Structured Logging
```python
import time
from uuid import uuid4

import structlog

logger = structlog.get_logger()

def run_agent_with_logging(user_id: str, task: str):
    log = logger.bind(user_id=user_id, task_id=str(uuid4()))
    log.info("agent_started", task=task)
    try:
        start = time.time()
        result = agent.run(task)
        duration = time.time() - start
        log.info("agent_completed",
                 duration=duration,
                 tokens_used=result.tokens,
                 tools_used=result.tools)
        return result
    except Exception as e:
        log.error("agent_failed", error=str(e))
        raise
```

Distributed Tracing
```python
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def run_agent(task: str):
    with tracer.start_as_current_span("agent_run") as span:
        span.set_attribute("task", task)
        # Agent loop
        for i, step in enumerate(agent_steps):
            with tracer.start_as_current_span(f"step_{i}") as step_span:
                step_span.set_attribute("tool", step.tool)
                result = execute_step(step)
                step_span.set_attribute("result_size", len(result))
```

Alerting
```yaml
# Prometheus alert rules
groups:
  - name: agent-alerts
    rules:
      - alert: HighAgentLatency
        # p95 latency computed from the histogram's buckets
        expr: histogram_quantile(0.95, sum(rate(agent_duration_seconds_bucket[5m])) by (le)) > 30
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Agent latency is high"

      - alert: HighToolFailureRate
        expr: rate(tool_failures_total[5m]) / rate(tool_calls_total[5m]) > 0.05
        for: 5m
        labels:
          severity: critical
```

Cost Management
Token Budgeting
```python
class BudgetedAgent:
    def __init__(self, max_tokens: int = 100000):
        self.max_tokens = max_tokens
        self.tokens_used = 0

    def run(self, task: str) -> str:
        if self.tokens_used >= self.max_tokens:
            raise BudgetExceededError("Token budget exceeded")
        response = self._call_api(task)
        self.tokens_used += response.usage.total_tokens
        return response
```

Caching
```python
import hashlib

class CachedAgent:
    def __init__(self):
        self.cache = {}

    def run(self, task: str) -> str:
        # md5 here is only a cache key, not a security boundary
        cache_key = hashlib.md5(task.encode()).hexdigest()
        if cache_key in self.cache:
            return self.cache[cache_key]
        result = self._run_agent(task)
        self.cache[cache_key] = result
        return result
```

Usage Tiers
```python
TIER_LIMITS = {
    "free": {"tokens_per_day": 10000, "tasks_per_day": 10},
    "pro": {"tokens_per_day": 100000, "tasks_per_day": 100},
    "enterprise": {"tokens_per_day": None, "tasks_per_day": None},
}

def check_usage(user_id: str, tier: str) -> bool:
    usage = get_daily_usage(user_id)
    limits = TIER_LIMITS[tier]
    if limits["tokens_per_day"] and usage.tokens > limits["tokens_per_day"]:
        return False
    if limits["tasks_per_day"] and usage.tasks > limits["tasks_per_day"]:
        return False
    return True
```

Safety and Guardrails
Input Validation
```python
from pydantic import BaseModel, validator

class AgentRequest(BaseModel):
    task: str
    max_tokens: int = 4096

    @validator('task')
    def validate_task(cls, v):
        if len(v) > 10000:
            raise ValueError("Task too long")
        if contains_pii(v):
            raise ValueError("Please remove personal information")
        return v

    @validator('max_tokens')
    def validate_tokens(cls, v):
        if v > 100000:
            raise ValueError("Token limit too high")
        return v
```

Output Filtering
```python
def filter_output(response: str) -> str:
    """Filter agent output before returning to user."""
    # Remove any leaked system information
    response = remove_system_paths(response)
    # Redact sensitive patterns
    response = redact_secrets(response)
    # Check for harmful content
    if contains_harmful_content(response):
        return "I cannot provide that information."
    return response
```

Action Approval
```python
SENSITIVE_ACTIONS = ["delete_file", "send_email", "make_payment"]

def execute_with_approval(tool: str, params: dict, user_id: str) -> str:
    if tool in SENSITIVE_ACTIONS:
        # Create approval request
        request_id = create_approval_request(user_id, tool, params)
        # Wait for approval (or timeout)
        approved = wait_for_approval(request_id, timeout=300)
        if not approved:
            return "Action requires approval and was not approved."
    return execute_tool(tool, params)
```

Rate Limiting
```python
from slowapi import Limiter
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)

@app.post("/agent")
@limiter.limit("10/minute")
async def run_agent(request: Request):
    # ...
```

Monetization Models
Usage-Based Pricing
```
Pay per task:
- Free: 10 tasks/month
- Basic: $0.10/task (up to 1000)
- Pro: $0.05/task (unlimited)

Pay per token:
- $0.01 per 1000 tokens
```

Subscription Tiers
```
Free Tier:
- 100 tasks/month
- Basic tools only
- Community support

Pro Tier ($29/month):
- 1000 tasks/month
- All tools
- Priority support
- Custom prompts

Enterprise (Custom):
- Unlimited tasks
- Custom tools
- SLA guarantee
- Dedicated support
```

Value-Based Pricing
Price based on outcomes:
- Research reports: $5-50 per report
- Code reviews: $1 per PR
- Support tickets: $0.50 per resolution
Hybrid Model
```python
def calculate_price(user: User, task: Task) -> float:
    base_price = TIER_PRICES[user.tier]

    # Add usage overage
    if task.tokens > user.tier_limit:
        overage = task.tokens - user.tier_limit
        base_price += overage * OVERAGE_RATE

    # Add premium features
    if task.uses_premium_tools:
        base_price += PREMIUM_TOOL_FEE

    return base_price
```

Case Study: Building a Content Agent Product
Let's walk through building a complete product.
Product Vision
ContentBot: AI-powered content creation for marketers
Features:
- Blog post generation
- Social media content
- Email campaigns
- SEO optimization
- Brand voice matching (prompt sketch below)
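Brand voice matching can start as simple prompt assembly; a sketch with a hypothetical BrandVoice record built from a customer's approved copy:

```python
from dataclasses import dataclass

@dataclass
class BrandVoice:
    tone: str            # e.g. "friendly but direct"
    avoid: list[str]     # phrases the brand never uses
    examples: list[str]  # short excerpts of approved copy

def build_system_prompt(voice: BrandVoice) -> str:
    samples = "\n---\n".join(voice.examples[:3])
    return (
        f"Write in this brand voice: {voice.tone}.\n"
        f"Never use these phrases: {', '.join(voice.avoid)}.\n"
        f"Match the style of these samples:\n{samples}"
    )
```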
Technical Implementation
Ask Claude Code:
```
Build a content creation agent product with:

1. Backend (FastAPI):
   - User authentication
   - Content generation API
   - Template management
   - Usage tracking

2. Agent:
   - Web research for topics
   - Brand voice analysis
   - SEO keyword optimization
   - Multiple content formats

3. Database:
   - User accounts and teams
   - Content history
   - Templates and prompts
   - Usage analytics

4. Frontend (Next.js):
   - Dashboard
   - Content editor
   - Template library
   - Settings

Include deployment config for Railway or Vercel.
```

MVP Timeline
Week 1: Core agent
- Basic content generation
- Simple API
Week 2: User system
- Authentication
- Usage tracking
- Basic frontend
Week 3: Polish
- Brand voice
- Templates
- Multiple formats
Week 4: Launch
- Deployment
- Monitoring
- Documentation
Launch Checklist
Technical
- [ ] Load testing completed
- [ ] Error handling covers edge cases
- [ ] Monitoring and alerts configured
- [ ] Backup and recovery tested
- [ ] Security audit completed
- [ ] API documentation published
Product
- [ ] Onboarding flow tested
- [ ] Help documentation written
- [ ] Feedback mechanism in place
- [ ] Pricing page live
- [ ] Terms of service updated
Operations
- [ ] Support channels ready
- [ ] Escalation procedures defined
- [ ] Cost monitoring active
- [ ] Analytics tracking verified
Next Steps
Continue Learning
- Review Building Agents for implementation details
- Study successful agent products in your domain
- Join AI product communities for feedback
Practice Projects
- Build an MVP: Take one of the examples and ship it
- Add monetization: Implement usage-based pricing
- Scale it: Handle 100+ concurrent users
- Optimize costs: Reduce token usage by 50%
Resources
- Anthropic API Pricing
- Vercel AI SDK
- LangChain - Agent framework
- Modal - Serverless compute for AI
Start building your agent product! The best products solve real problems for real users. Pick a problem you understand deeply and build an agent to solve it. Ship early, iterate based on feedback, and grow from there.