Building Agent Products
Ship production-ready agent systems that users love and pay for
You can build agents. Now let's ship products. This guide covers architecture, deployment, UX, and monetization for agent-powered applications.
Architecture Patterns
Choose based on your use case:
Synchronous (Simple)
Best for quick tasks (< 30 seconds):
```python
@app.post("/analyze")
async def analyze(request: AnalyzeRequest):
    result = agent.run(request.query)
    return {"result": result}
```

Pros: Simple, easy to debug
Cons: User waits, timeout risk
Asynchronous (Scalable)
Best for longer tasks:
```python
@app.post("/analyze")
async def start_analysis(request: AnalyzeRequest):
    job_id = create_job(request)
    queue.enqueue(run_agent_job, job_id)
    return {"job_id": job_id}

@app.get("/jobs/{job_id}")
async def get_job(job_id: str):
    return get_job_status(job_id)
```

Pros: Handles long tasks, scalable
Cons: More complex, needs job management
Streaming (Real-time)
Best for interactive experiences:
```python
@app.post("/chat/stream")
async def stream_chat(request: ChatRequest):
    async def generate():
        async for chunk in agent.stream(request.message):
            yield f"data: {json.dumps(chunk)}\n\n"
    return StreamingResponse(generate(), media_type="text/event-stream")
```

Pros: Great UX, immediate feedback
Cons: Complex client handling
Real-World Product Examples
1. Code Review Bot
Product: Automated code review for GitHub PRs
Key Features:
- Webhook receives PR events
- Agent analyzes diff for bugs, security, style
- Posts inline comments on GitHub
- Approves or requests changes
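That flow can be sketched as a small dispatcher. The payload fields below mirror GitHub's `pull_request` webhook event, but `run_review_agent` and the finding shape are hypothetical stand-ins for the actual agent call, not a real API:

```python
def run_review_agent(diff_url: str) -> list:
    # Placeholder: a real implementation would fetch the diff
    # and ask the agent for structured findings.
    return [{"path": "app.py", "line": 10,
             "body": "Possible SQL injection", "blocking": True}]

def handle_webhook(event: dict):
    """Route GitHub PR events to the review agent; ignore everything else."""
    if event.get("action") not in {"opened", "synchronize"}:
        return None  # only review new or updated PRs
    pr = event["pull_request"]
    findings = run_review_agent(pr["diff_url"])
    # Any blocking finding turns the review into a change request.
    verdict = "REQUEST_CHANGES" if any(f["blocking"] for f in findings) else "APPROVE"
    return {"pr": pr["number"], "comments": findings, "verdict": verdict}
```

The key design choice is filtering on the event `action` early, so re-deliveries of closed or labeled events never reach the (expensive) agent.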
2. Customer Support Agent
Product: AI-powered ticket handling
Key Features:
- Multi-channel intake (email, chat, form)
- Knowledge base search
- CRM integration for context
- Human escalation workflow
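The escalation workflow reduces to a routing decision. A minimal sketch, assuming a knowledge-base confidence score is already available; the 0.7 threshold is an illustrative value you would tune against real tickets:

```python
def route_ticket(ticket: dict, kb_confidence: float,
                 escalation_threshold: float = 0.7) -> str:
    """Decide whether the agent answers or a human takes over."""
    if ticket.get("priority") == "urgent":
        return "human"  # urgent tickets always escalate
    if kb_confidence >= escalation_threshold:
        return "agent"  # knowledge-base answer is confident enough
    return "human"
```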
3. Research Assistant
Product: Automated research and reports
Key Features:
- Multi-source research
- Fact verification
- Citation management
- Multiple output formats
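Citation management is mostly bookkeeping: every claim carries its sources, and anything without one is flagged rather than silently emitted. A sketch, assuming a hypothetical claim dict with `text` and `sources` fields:

```python
def attach_citations(claims: list) -> list:
    """Render each claim with its source list; flag unsourced claims."""
    lines = []
    for i, claim in enumerate(claims, 1):
        sources = ", ".join(claim["sources"]) if claim["sources"] else "UNVERIFIED"
        lines.append(f"[{i}] {claim['text']} ({sources})")
    return lines
```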
4. Data Analysis Platform
Product: Natural language data exploration
Key Features:
- Natural language to SQL
- Chart generation
- Dashboard builder
- Scheduled reports
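Natural-language-to-SQL needs a safety gate before anything touches the database. One common pattern, sketched here with an assumed table allow-list: only read-only statements against known tables are executed; everything else is rejected.

```python
import re

# Assumed allow-list; in practice this comes from the schema catalog.
ALLOWED_TABLES = {"orders", "customers"}

def validate_generated_sql(sql: str) -> bool:
    """Accept only SELECT statements over allow-listed tables."""
    stmt = sql.strip().rstrip(";")
    if not stmt.lower().startswith("select"):
        return False  # read-only queries only
    tables = set(t.lower() for t in re.findall(r"\bfrom\s+(\w+)", stmt, flags=re.IGNORECASE))
    return bool(tables) and tables <= ALLOWED_TABLES
```

A regex check is a sketch, not a parser; production systems typically combine this with a read-only database role so the allow-list is enforced by the database itself.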
User Experience Design
Progressive Disclosure
Don't overwhelm users with agent complexity:
```
Level 1: Simple Chat
└── User asks, agent responds

Level 2: Show Thinking
└── Display reasoning steps

Level 3: Tool Visibility
└── Show which tools are used

Level 4: Full Control
└── User guides decisions
```

Real-time Feedback
Keep users informed during long operations:
```jsx
function AgentStatus({ jobId, totalSteps }) {
  const [status, setStatus] = useState('starting');
  const [steps, setSteps] = useState([]);

  useEffect(() => {
    const events = new EventSource(`/jobs/${jobId}/stream`);
    events.onmessage = (e) => {
      const data = JSON.parse(e.data);
      setStatus(data.status);
      setSteps(data.steps);
    };
    return () => events.close();
  }, [jobId]);

  return (
    <div>
      <StatusBadge status={status} />
      <StepsList steps={steps} />
      <ProgressBar progress={steps.length / totalSteps} />
    </div>
  );
}
```

Error States
Design for failures:
| State | Message | Actions |
|---|---|---|
| Timeout | "Taking longer than expected" | Wait, Cancel |
| Tool failure | "Couldn't access X" | Retry, Skip |
| Uncertain | "Found conflicting info" | Show both |
| Can't help | "Outside my capabilities" | Contact human |
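The table above maps naturally to a data-driven error renderer, so messages and recovery actions stay in one place. A sketch; the state names and action identifiers are illustrative:

```python
ERROR_STATES = {
    "timeout": {"message": "Taking longer than expected", "actions": ["wait", "cancel"]},
    "tool_failure": {"message": "Couldn't access {tool}", "actions": ["retry", "skip"]},
    "uncertain": {"message": "Found conflicting info", "actions": ["show_both"]},
    "cant_help": {"message": "Outside my capabilities", "actions": ["contact_human"]},
}

def render_error(state: str, **context) -> dict:
    """Look up the error state; unknown states degrade to 'cant_help'."""
    entry = ERROR_STATES.get(state, ERROR_STATES["cant_help"])
    return {"message": entry["message"].format(**context), "actions": entry["actions"]}
```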
Trust Building
Help users trust agent outputs:
- Show sources: Link to where information came from
- Confidence indicators: Signal when uncertain
- Edit suggestions: Let users modify outputs
- Explain reasoning: Show why decisions were made
Deployment
Docker Setup
```dockerfile
FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

```yaml
# docker-compose.yml
version: '3.8'
services:
  api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - REDIS_URL=redis://redis:6379
    depends_on:
      - redis
  worker:
    build: .
    command: celery -A tasks worker --loglevel=info
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - REDIS_URL=redis://redis:6379
  redis:
    image: redis:7-alpine
```

Serverless
For variable workloads:
```python
# AWS Lambda handler
def handler(event, context):
    body = json.loads(event['body'])
    result = agent.run(body['query'])
    return {
        'statusCode': 200,
        'body': json.dumps({'result': result})
    }
```

Kubernetes
For scale:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: agent-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: agent
  template:
    metadata:
      labels:
        app: agent
    spec:
      containers:
        - name: agent
          image: your-registry/agent:latest
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          env:
            - name: ANTHROPIC_API_KEY
              valueFrom:
                secretKeyRef:
                  name: api-keys
                  key: anthropic
```

Monitoring
Key Metrics
| Metric | Description | Alert If |
|---|---|---|
| Completion rate | % tasks completed | < 95% |
| Latency (p95) | Time to response | > 30s |
| Tool failure rate | % tool calls failed | > 5% |
| Token usage | Tokens per task | > budget |
| User satisfaction | Ratings | < 4/5 |
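The first three metrics and their alert thresholds can be tracked with a minimal in-process counter like the sketch below; a real deployment would export these through a client such as `prometheus_client` instead. The thresholds come from the table above:

```python
class AgentMetrics:
    """Track task and tool outcomes; fire alerts per the thresholds above."""

    def __init__(self):
        self.completed = 0
        self.failed = 0
        self.tool_calls = 0
        self.tool_failures = 0

    def completion_rate(self) -> float:
        total = self.completed + self.failed
        return self.completed / total if total else 1.0

    def tool_failure_rate(self) -> float:
        return self.tool_failures / self.tool_calls if self.tool_calls else 0.0

    def alerts(self) -> list:
        fired = []
        if self.completion_rate() < 0.95:
            fired.append("completion_rate")
        if self.tool_failure_rate() > 0.05:
            fired.append("tool_failure_rate")
        return fired
```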
Structured Logging
```python
import time
from uuid import uuid4

import structlog

logger = structlog.get_logger()

def run_with_logging(user_id: str, task: str):
    log = logger.bind(user_id=user_id, task_id=str(uuid4()))
    log.info("agent_start", task=task)
    try:
        start = time.time()
        result = agent.run(task)
        duration = time.time() - start
        log.info("agent_done", duration=duration, tokens=result.tokens, tools=result.tools)
        return result
    except Exception as e:
        log.error("agent_failed", error=str(e))
        raise
```

Alerting
```yaml
# Prometheus alert rules
groups:
  - name: agent-alerts
    rules:
      - alert: HighLatency
        expr: histogram_quantile(0.95, rate(agent_duration_seconds_bucket[5m])) > 30
        for: 5m
        labels:
          severity: warning
      - alert: HighToolFailures
        expr: rate(tool_failures[5m]) / rate(tool_calls[5m]) > 0.05
        for: 5m
        labels:
          severity: critical
```

Cost Management
Token Budgeting
```python
class BudgetedAgent:
    def __init__(self, max_tokens: int = 100000):
        self.max_tokens = max_tokens
        self.tokens_used = 0

    def run(self, task: str):
        if self.tokens_used >= self.max_tokens:
            raise BudgetExceededError()
        response = self._call_api(task)
        self.tokens_used += response.usage.total_tokens
        return response
```

Caching
```python
import hashlib

class CachedAgent:
    def __init__(self):
        self.cache = {}

    def run(self, task: str) -> str:
        key = hashlib.md5(task.encode()).hexdigest()
        if key in self.cache:
            return self.cache[key]
        result = self._run(task)
        self.cache[key] = result
        return result
```

Usage Tiers
```python
TIERS = {
    "free": {"tokens_day": 10000, "tasks_day": 10},
    "pro": {"tokens_day": 100000, "tasks_day": 100},
    "enterprise": {"tokens_day": None, "tasks_day": None},
}

def check_usage(user_id: str, tier: str) -> bool:
    usage = get_daily_usage(user_id)
    limits = TIERS[tier]
    if limits["tokens_day"] and usage.tokens > limits["tokens_day"]:
        return False
    if limits["tasks_day"] and usage.tasks > limits["tasks_day"]:
        return False
    return True
```

Safety and Guardrails
Input Validation
```python
from pydantic import BaseModel, validator

class AgentRequest(BaseModel):
    task: str
    max_tokens: int = 4096

    @validator('task')
    def validate_task(cls, v):
        if len(v) > 10000:
            raise ValueError("Task too long")
        if contains_pii(v):
            raise ValueError("Remove personal information")
        return v
```

Output Filtering
```python
def filter_output(response: str) -> str:
    response = remove_system_paths(response)
    response = redact_secrets(response)
    if contains_harmful_content(response):
        return "I cannot provide that information."
    return response
```

Action Approval
```python
SENSITIVE_ACTIONS = ["delete_file", "send_email", "make_payment"]

def execute_with_approval(tool: str, params: dict, user_id: str) -> str:
    if tool in SENSITIVE_ACTIONS:
        request_id = create_approval_request(user_id, tool, params)
        approved = wait_for_approval(request_id, timeout=300)
        if not approved:
            return "Action not approved."
    return execute_tool(tool, params)
```

Rate Limiting
```python
from slowapi import Limiter
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)

@app.post("/agent")
@limiter.limit("10/minute")
async def run_agent(request: Request):
    ...  # handler body
```

Monetization Models
Usage-Based
Pay per task:
- Free: 10 tasks/month
- Basic: $0.10/task (up to 1000)
- Pro: $0.05/task (unlimited)

Pay per token:
- $0.01 per 1000 tokens

Subscription Tiers
Free Tier:
- 100 tasks/month
- Basic tools
- Community support

Pro ($29/month):
- 1000 tasks/month
- All tools
- Priority support

Enterprise (Custom):
- Unlimited
- Custom tools
- SLA guarantee

Value-Based
Price by outcome:
- Research reports: $5-50/report
- Code reviews: $1/PR
- Support tickets: $0.50/resolution
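Whichever model you choose, sanity-check the margin per task against model cost. A back-of-envelope sketch; the per-million-token prices below are placeholders, so substitute the provider's current rates from their pricing page:

```python
def task_margin(price_per_task: float, input_tokens: int, output_tokens: int,
                input_price_per_mtok: float = 3.0,
                output_price_per_mtok: float = 15.0) -> float:
    """Revenue minus model cost for one task, in dollars."""
    cost = ((input_tokens / 1_000_000) * input_price_per_mtok
            + (output_tokens / 1_000_000) * output_price_per_mtok)
    return price_per_task - cost
```

For example, a $0.10 task that consumes 2,000 input and 1,000 output tokens at these placeholder rates costs about $0.021 in model usage, leaving roughly $0.079 of margin.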
Launch Checklist
Technical
- [ ] Load testing completed
- [ ] Error handling covers edge cases
- [ ] Monitoring and alerts configured
- [ ] Backup and recovery tested
- [ ] Security audit completed
- [ ] API documentation published
Product
- [ ] Onboarding flow tested
- [ ] Help documentation written
- [ ] Feedback mechanism in place
- [ ] Pricing page live
- [ ] Terms of service updated
Operations
- [ ] Support channels ready
- [ ] Escalation procedures defined
- [ ] Cost monitoring active
- [ ] Analytics tracking verified
Case Study: Content Agent Product
Let's walk through building a complete product.
Product: ContentBot
AI content creation for marketers.
Features:
- Blog post generation
- Social media content
- Email campaigns
- SEO optimization
- Brand voice matching
MVP Timeline
| Week | Focus |
|---|---|
| 1 | Core agent + basic API |
| 2 | User system + tracking |
| 3 | Templates + formats |
| 4 | Deploy + monitor |
Architecture
```python
# content_agent.py
class ContentAgent:
    def __init__(self, brand_voice: str = None):
        self.client = Anthropic()
        self.brand_voice = brand_voice

    async def generate_blog(self, topic: str, keywords: list) -> str:
        # Research topic
        research = await self._research(topic)
        # Generate outline
        outline = await self._create_outline(topic, research)
        # Write content
        content = await self._write_content(outline, keywords)
        # Apply brand voice
        if self.brand_voice:
            content = await self._apply_voice(content)
        return content
```

Next Steps
- Review Building Agents for implementation
- Study successful products in your domain
- Join AI product communities for feedback
Practice Projects
| Project | Goal |
|---|---|
| Ship an MVP | Pick an idea, deploy in 4 weeks |
| Add monetization | Implement usage-based pricing |
| Scale to 100 users | Handle concurrent load |
| Reduce costs 50% | Optimize token usage |
Resources
- Anthropic API Pricing
- Vercel AI SDK
- LangChain
- Modal — Serverless AI compute
Success
Start shipping! The best products solve real problems for real users. Pick a problem you understand, build an agent to solve it, ship early, iterate on feedback.