
Using Agents Effectively

Master prompting, tool selection, and output handling to get the best results from AI agents

2-3 hours
10 min read

Learn to work with AI agents like a pro. This guide covers prompting strategies, tool orchestration, debugging techniques, and patterns for getting reliable results.

Prompting for Agents

System Prompts

Define the agent's personality and constraints:

Python
SYSTEM_PROMPT = """You are a helpful research assistant.
Your capabilities:
- Search the web for information
- Read and analyze documents
- Take notes and organize findings
- Generate comprehensive reports
Guidelines:
- Always cite your sources
- Verify facts from multiple sources
- Acknowledge uncertainty
- Ask for clarification when needed
Limitations:
- Cannot access private or paywalled content
- Cannot make changes to external systems
- Must respect rate limits on searches
"""

Task Prompts

Be specific about what you want:

Bad prompt:

Text
Research AI agents

Good prompt:

Text
Research the current state of AI agents in customer support.
Focus on:
1. Major platforms and their features
2. Implementation challenges
3. Success metrics and case studies
4. Cost considerations
Output format:
- Executive summary (3-5 sentences)
- Detailed findings by topic
- Recommendations
- Sources cited
Target length: 1500-2000 words

Few-shot Examples

Show the agent what you want:

Python
PROMPT_WITH_EXAMPLES = """
Analyze customer feedback and extract key themes.

Example input:
"The app is fast but crashes when I try to export. Also wish it had dark mode."

Example output:
{
  "sentiment": "mixed",
  "themes": [
    {"topic": "performance", "sentiment": "positive", "detail": "app speed"},
    {"topic": "stability", "sentiment": "negative", "detail": "export crashes"},
    {"topic": "features", "sentiment": "neutral", "detail": "dark mode request"}
  ]
}

Now analyze this feedback:
"{user_input}"
"""

Tool Selection Strategies

Give Clear Tool Descriptions

Tools should be self-explanatory:

Python
{
    "name": "search_knowledge_base",
    "description": """Search the company knowledge base for information.

Use this tool when:
- User asks about company policies
- User needs product documentation
- User asks 'how to' questions

Do NOT use for:
- General knowledge questions
- External information
- Real-time data

Returns: List of relevant articles with titles and snippets""",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Search query - use specific keywords"
            },
            "category": {
                "type": "string",
                "enum": ["policies", "products", "procedures"],
                "description": "Filter by category for better results"
            }
        },
        "required": ["query"]
    }
}
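
For reference, here is a hedged sketch of passing a definition like this to the model with the Anthropic Python SDK. It assumes the dict above is assigned to a variable named search_kb_tool and that an API key is available in the environment; the user question is illustrative.

Python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[search_kb_tool],  # the tool definition shown above
    messages=[{"role": "user", "content": "How do I reset my password?"}],
)

# If the model chose the tool, the response contains a tool_use block with
# the query (and optionally the category) it generated.
for block in response.content:
    if block.type == "tool_use":
        print(block.name, block.input)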

Guide Tool Choice

Help the agent pick the right tool:

Python
TOOL_SELECTION_PROMPT = """
Available tools and when to use them:
1. **web_search**: For current events, general knowledge, external information
2. **knowledge_base**: For company-specific information, policies, products
3. **database_query**: For user data, analytics, specific records
4. **calculator**: For any mathematical computations
Decision tree:
- Is it about our company? -> knowledge_base
- Is it about the user's data? -> database_query
- Does it need math? -> calculator
- Otherwise -> web_search
"""

Tool Composition

Design tools that work together:

Python
# Tools that build on each other
RESEARCH_TOOLS = [
    {
        "name": "search_sources",
        "description": "Find relevant sources for a topic"
    },
    {
        "name": "read_source",
        "description": "Extract information from a specific source"
    },
    {
        "name": "take_note",
        "description": "Save a note with source attribution"
    },
    {
        "name": "get_notes",
        "description": "Retrieve all notes taken so far"
    },
    {
        "name": "write_report",
        "description": "Generate report from notes"
    }
]
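
For the note-taking pair to be useful, take_note and get_notes need to share state on your side. A minimal in-memory sketch of that backing store (the handler names and storage choice are assumptions, not part of the tool definitions above):

Python
# Shared store so later tools can build on earlier results.
# A real implementation might persist notes to a database.
notes: list[dict] = []

def handle_take_note(text: str, source: str) -> str:
    notes.append({"text": text, "source": source})
    return f"Saved note #{len(notes)}"

def handle_get_notes() -> str:
    if not notes:
        return "No notes yet."
    return "\n".join(f"[{note['source']}] {note['text']}" for note in notes)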

Output Handling

Structured Outputs

Request specific formats:

Python
STRUCTURED_OUTPUT_PROMPT = """
Analyze the provided data and return your findings in this exact JSON format:
{
  "summary": "One paragraph overview",
  "key_findings": [
    {"finding": "string", "confidence": "high|medium|low", "evidence": "string"}
  ],
  "recommendations": ["string"],
  "limitations": ["string"]
}

Important: Return ONLY the JSON, no additional text.
"""

Validation

Validate agent outputs:

Python
import json

from pydantic import BaseModel, ValidationError, validator
from typing import List, Literal


class Finding(BaseModel):
    finding: str
    confidence: Literal["high", "medium", "low"]
    evidence: str


class AnalysisResult(BaseModel):
    summary: str
    key_findings: List[Finding]
    recommendations: List[str]
    limitations: List[str]

    @validator('key_findings')
    def at_least_one_finding(cls, v):
        if len(v) < 1:
            raise ValueError("Must have at least one finding")
        return v


def parse_agent_output(response: str) -> AnalysisResult:
    """Parse and validate agent output."""
    try:
        data = json.loads(response)
        return AnalysisResult(**data)
    except json.JSONDecodeError:
        raise ValueError("Agent did not return valid JSON")
    except ValidationError as e:
        raise ValueError(f"Invalid output format: {e}")

Streaming

Handle streaming responses:

Python
from anthropic import AsyncAnthropic

client = AsyncAnthropic()  # reads ANTHROPIC_API_KEY from the environment


async def stream_agent_response(task: str):
    """Stream agent response with tool call handling."""
    async with client.messages.stream(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        messages=[{"role": "user", "content": task}],
        tools=TOOLS  # your tool definitions, as in the earlier examples
    ) as stream:
        async for event in stream:
            if event.type == "content_block_delta":
                if event.delta.type == "text_delta":
                    yield {"type": "text", "content": event.delta.text}
            elif event.type == "content_block_start":
                if event.content_block.type == "tool_use":
                    yield {
                        "type": "tool_start",
                        "tool": event.content_block.name
                    }
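
A small consumer for this generator might look like the sketch below (the task string is illustrative):

Python
import asyncio

async def main():
    async for event in stream_agent_response("Summarize the latest support tickets"):
        if event["type"] == "text":
            print(event["content"], end="", flush=True)
        elif event["type"] == "tool_start":
            print(f"\n[calling {event['tool']}...]")

asyncio.run(main())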

Debugging Agents

Common Issues and Solutions

Issue: Agent not using tools

Symptoms: Agent makes up information instead of using tools

Solutions:

Python
# 1. Make tools more prominent in system prompt
SYSTEM = """You MUST use the provided tools to gather information.
Do NOT rely on your training data for facts."""
# 2. Add explicit instructions in task
TASK = """
Research {topic}.
IMPORTANT: Use the search tool for EVERY fact you include.
Do not include any information that doesn't come from a tool.
"""
# 3. Require tool citation
"""After each fact, cite the tool call that provided it."""

Issue: Agent stuck in loop

Symptoms: Same tool called repeatedly with same inputs

Solutions:

Python
import json

# 1. Track and detect loops
seen_calls = set()
for call in tool_calls:
    key = (call.name, json.dumps(call.input, sort_keys=True))
    if key in seen_calls:
        # Inject guidance
        messages.append({
            "role": "user",
            "content": "You've already tried this. Try a different approach or provide your answer with available information."
        })
    seen_calls.add(key)

# 2. Limit iterations
MAX_TOOL_CALLS = 10
if len(tool_calls) >= MAX_TOOL_CALLS:
    # Force completion, e.g. by sending the next request without tools
    pass

Issue: Hallucinated tool results

Symptoms: Agent references tool results that don't exist

Solutions:

Python
# 1. Validate tool results exist
def validate_citations(response: str, tool_results: list) -> bool:
    """Check that all citations reference actual results."""
    # Implementation
    pass

# 2. Use explicit result markers
TOOL_RESULT_FORMAT = """
[TOOL RESULT - {tool_name}]
{result}
[END TOOL RESULT]

Only use information that appears within TOOL RESULT markers.
"""

Debugging Tools

Logging

Python
import structlog
from uuid import uuid4

logger = structlog.get_logger()


def debug_agent_run(task: str):
    log = logger.bind(task_id=str(uuid4()))
    log.debug("starting_agent", task=task)

    for i, step in enumerate(agent_steps):
        log.debug("agent_step",
                  step=i,
                  thinking=step.thinking,
                  tool=step.tool,
                  input=step.input)

        result = execute_tool(step.tool, step.input)
        log.debug("tool_result",
                  step=i,
                  tool=step.tool,
                  result=result[:200])  # Truncate for logging

    log.debug("agent_complete", steps=len(agent_steps))

Visual Debugging

Python
import json


def visualize_agent_trace(trace: list):
    """Create visual representation of agent execution."""
    output = []
    for i, step in enumerate(trace):
        output.append(f"""
Step {i + 1}:
  Thinking: {step.thinking[:100]}...
  Tool: {step.tool}
  Input: {json.dumps(step.input, indent=2)}
  Result: {step.result[:100]}...
  Duration: {step.duration:.2f}s
""")
    return "\n".join(output)

Prompt Engineering Patterns

Chain of Thought

Make the agent think step-by-step:

Python
COT_PROMPT = """
Solve this problem step by step:
{problem}
Work through it like this:
1. Understand what is being asked
2. Identify what information you need
3. Gather that information using tools
4. Analyze the information
5. Form your conclusion
6. Verify your answer
Show your work at each step.
"""

Self-Critique

Have the agent check its own work:

Python
SELF_CRITIQUE_PROMPT = """
After completing the task, critique your own work:
1. Did you fully address the question?
2. Are your sources reliable and cited?
3. Are there gaps in your analysis?
4. What could be improved?
Based on your critique, revise if needed.
"""

Reflection

Help the agent learn from mistakes:

Python
REFLECTION_PROMPT = """
The previous attempt had issues:
{error_description}
Before trying again:
1. What went wrong?
2. Why did it happen?
3. How will you avoid it this time?
Now try again with these learnings.
"""

Advanced Patterns

Multi-turn Refinement

Iterate to improve results:

Python
def iterative_refinement(task: str, max_iterations: int = 3) -> str:
    """Refine agent output through multiple passes."""
    current_output = agent.run(task)

    for i in range(max_iterations):
        # Ask for critique
        critique_prompt = f"""
Here is an attempt at: {task}

Attempt:
{current_output}

Critique this attempt. Be specific about:
- What's good
- What's missing or wrong
- How to improve it

Then provide an improved version.
"""
        response = agent.run(critique_prompt)

        # Extract improved version
        if "improved version" in response.lower():
            current_output = extract_improved_version(response)
        else:
            break  # No improvements needed

    return current_output
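
The extract_improved_version helper is left to you; a naive version might simply take everything after the last "improved version" marker, the same phrase the loop above checks for:

Python
def extract_improved_version(response: str) -> str:
    """Naive extraction: everything after the last 'improved version' marker."""
    marker = "improved version"
    idx = response.lower().rfind(marker)
    if idx == -1:
        return response  # fall back to the full response
    return response[idx + len(marker):].lstrip(" :\n")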

Parallel Exploration

Explore multiple approaches simultaneously:

Python
import asyncio


async def parallel_exploration(task: str, num_approaches: int = 3) -> str:
    """Try multiple approaches in parallel and select best."""
    # Generate different approaches
    approach_prompt = f"""
For this task: {task}

Generate {num_approaches} different approaches to solving it.
Format as:
APPROACH 1: ...
APPROACH 2: ...
"""
    approaches = await agent.run(approach_prompt)

    # Execute each approach in parallel
    results = await asyncio.gather(*[
        agent.run(f"Execute this approach: {approach}")
        for approach in parse_approaches(approaches)
    ])

    # Select best result
    selection_prompt = f"""
Task: {task}

Here are {len(results)} different results:
{format_results(results)}

Select the best one and explain why.
"""
    return await agent.run(selection_prompt)
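
The parse_approaches and format_results helpers are assumed above; minimal versions matching the "APPROACH N:" format requested in the prompt could look like this:

Python
import re

def parse_approaches(text: str) -> list[str]:
    """Split the agent's output on 'APPROACH <n>:' headers."""
    parts = re.split(r"APPROACH \d+:", text)
    return [part.strip() for part in parts if part.strip()]

def format_results(results: list[str]) -> str:
    """Number the results so the selection prompt can refer to them."""
    return "\n\n".join(f"RESULT {i + 1}:\n{result}" for i, result in enumerate(results))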

Human-in-the-Loop

Include human checkpoints:

Python
def run_with_approval(task: str, checkpoints: list) -> str:
    """Run agent with human approval at key points."""
    messages = [{"role": "user", "content": task}]

    while True:
        response = agent.step(messages)

        # Check if at a checkpoint
        if should_checkpoint(response, checkpoints):
            # Present to human
            print(f"Agent wants to: {describe_action(response)}")
            approved = input("Approve? (y/n): ")

            if approved.lower() != 'y':
                # Add human guidance
                guidance = input("Guidance: ")
                messages.append({
                    "role": "user",
                    "content": f"Do not do that. Instead: {guidance}"
                })
                continue

        # Execute
        messages = execute_step(response, messages)

        if response.stop_reason == "end_turn":
            break

    return extract_final_response(messages)
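
What counts as a checkpoint is up to you. One simple convention (assumed here) is to pass a list of tool names that always require approval and check the pending tool calls against it, assuming agent.step() returns an Anthropic-style message whose content blocks expose type and name:

Python
def should_checkpoint(response, checkpoints: list) -> bool:
    """Checkpoint whenever the agent wants to call a tool on the approval list."""
    return any(
        block.type == "tool_use" and block.name in checkpoints
        for block in response.content
    )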

Best Practices Checklist

Before Running

  • [ ] Clear, specific task description
  • [ ] Appropriate tools available
  • [ ] System prompt defines constraints
  • [ ] Output format specified
  • [ ] Error handling in place

During Execution

  • [ ] Monitor for loops
  • [ ] Track token usage
  • [ ] Log all tool calls
  • [ ] Handle timeouts
  • [ ] Validate tool results

After Completion

  • [ ] Validate output format
  • [ ] Check source citations
  • [ ] Review for hallucinations
  • [ ] Measure quality metrics
  • [ ] Log for analysis

Common Use Case Recipes

Research Question

Python
def research_question(question: str) -> str:
    return agent.run(f"""
Research this question thoroughly: {question}

Process:
1. Search for 3-5 authoritative sources
2. Read and extract key information from each
3. Identify areas of agreement and disagreement
4. Synthesize into a comprehensive answer

Requirements:
- Cite every factual claim
- Note confidence levels
- Include diverse perspectives
- Acknowledge limitations

Format: Start with summary, then detailed findings, then sources.
""")

Code Generation

Python
def generate_code(specification: str) -> str:
    return agent.run(f"""
Generate code for: {specification}

Process:
1. Clarify requirements (ask if needed)
2. Design the solution approach
3. Write the code with comments
4. Add error handling
5. Write example usage

Output:
- Code block with full implementation
- Brief explanation of approach
- Example usage
- Notes on dependencies or limitations
""")

Data Analysis

Python
def analyze_data(question: str, data_context: str) -> str:
    return agent.run(f"""
Analyze this data to answer: {question}
Data context: {data_context}

Process:
1. Understand the question and data available
2. Write and execute analysis queries
3. Interpret the results
4. Generate visualizations if helpful
5. Form conclusions

Output:
- Direct answer to the question
- Supporting analysis
- Visualizations (describe or generate)
- Caveats and limitations
""")

Next Steps

Now that you can use agents effectively:

  1. Building Agents: Create custom agents
  2. Agent Products: Ship production systems

Practice

  1. Take an existing agent and optimize its prompts
  2. Add detailed logging and analyze its behavior
  3. Implement output validation for your use case
  4. Build a self-critiquing version

Master the craft! Effective agent use is about clear communication, proper tooling, and continuous refinement. The more you work with agents, the better you'll understand how to get reliable results.