Building AI Agents with Tool Calling: Architecture Patterns for Production
Tool calling is the difference between an LLM that can talk about doing things and an agent that can actually do them. Without tools, the model can only generate text. With tools, it can query your database, call APIs, run code, send messages, and chain multiple actions to complete a complex task.
The demo is easy. Getting it reliable in production is the work. Here's the full architectural picture.
What tool calling actually is
The LLM doesn't execute tools directly. The flow is:
1. You send: user message + list of available tools (with schemas)
2. LLM responds with: "I want to call tool X with these arguments"
3. Your code executes tool X
4. You send: tool result back to LLM
5. LLM responds with: final answer (or another tool call)
The LLM is a reasoning engine that decides which tools to call and with what arguments. Your code executes the actual function and returns the result. The LLM never has direct access to your systems - only to what you explicitly expose as tools.
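To make the five steps concrete, here is a minimal sketch of how the message list grows across one round trip. The assistant turn is hand-written rather than returned by a real API call, and the `toolu_01` ID, the `ORD-1001` order, and the `toolResultMessage` helper are illustrative assumptions. The block shapes follow the Anthropic Messages API format used in the rest of this article.

```javascript
// Step 1: the user message you send (alongside the tool list).
const messages = [
  { role: 'user', content: 'Where is my order ORD-1001?' },
];

// Step 2: the model replies with a tool_use block instead of a final answer.
// Hand-written here for illustration; a real response comes from the API.
const assistantTurn = {
  role: 'assistant',
  content: [
    { type: 'text', text: 'Let me look that up.' },
    { type: 'tool_use', id: 'toolu_01', name: 'get_order_status', input: { order_id: 'ORD-1001' } },
  ],
};
messages.push(assistantTurn);

// Steps 3-4: your code runs the tool, then wraps the result in a user message
// whose tool_use_id matches the id of the request it answers.
function toolResultMessage(toolUse, result) {
  return {
    role: 'user',
    content: [
      { type: 'tool_result', tool_use_id: toolUse.id, content: JSON.stringify(result) },
    ],
  };
}

const toolUse = assistantTurn.content.find((block) => block.type === 'tool_use');
messages.push(toolResultMessage(toolUse, { status: 'shipped' }));

// Step 5: send `messages` back; the model answers or asks for another tool.
console.log(messages.map((m) => m.role).join(' -> ')); // prints "user -> assistant -> user"
```

The pairing of `id` and `tool_use_id` is what lets the model match each result to the request it made, which matters once several tools are called in one turn.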
Basic tool calling with Claude
import Anthropic from '@anthropic-ai/sdk';

// Reads ANTHROPIC_API_KEY from the environment by default
const client = new Anthropic();

// Define your tools
const tools = [
  {
    name: 'get_order_status',
    description: 'Get the current status of a customer order. Use this when the customer asks about their order.',
    input_schema: {
      type: 'object',
      properties: {
        order_id: {
          type: 'string',
          description: 'The order ID, usually starts with ORD-',
        },
      },
      required: ['order_id'],
    },
  },
  {
    name: 'cancel_order',
    description: 'Cancel an order that is still in pending or processing status. Cannot cancel shipped or delivered orders.',
    input_schema: {
      type: 'object',
      properties: {
        order_id: { type: 'string' },
        reason: { type: 'string', description: 'Reason for cancellation' },
      },
      required: ['order_id', 'reason'],
    },
  },
  {
    name: 'search_products',
    description: 'Search the product catalog. Use this when customer asks about available products.',
    input_schema: {
      type: 'object',
      properties: {
        query: { type: 'string' },
        category: { type: 'string', enum: ['electronics', 'clothing', 'home', 'books'] },
        max_price: { type: 'number' },
      },
      required: ['query'],
    },
  },
];

// Tool implementations
// db, searchIndex, and refundPayment are your application's own modules
const toolHandlers = {
  get_order_status: async ({ order_id }) => {
    const order = await db.orders.findById(order_id);
    if (!order) return { error: 'Order not found' };
    return {
      order_id: order.id,
      status: order.status,
      created_at: order.createdAt,
      estimated_delivery: order.estimatedDelivery,
      items: order.items.map(i => ({ name: i.name, quantity: i.quantity })),
    };
  },
  cancel_order: async ({ order_id, reason }) => {
    const order = await db.orders.findById(order_id);
    if (!order) return { error: 'Order not found' };
    // Allow-list the cancellable states so an already cancelled order
    // is not cancelled (and refunded) a second time
    if (!['pending', 'processing'].includes(order.status)) {
      return { error: `Cannot cancel order in ${order.status} status` };
    }
    await db.orders.update(order_id, { status: 'cancelled', cancellationReason: reason });
    await refundPayment(order.paymentId);
    return { success: true, message: 'Order cancelled and refund initiated' };
  },
  search_products: async ({ query, category, max_price }) => {
    const results = await searchIndex.search(query, { category, maxPrice: max_price, limit: 5 });
    return { products: results.map(p => ({ id: p.id, name: p.name, price: p.price, description: p.description })) };
  },
};
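The handlers above trust whatever arguments they receive. Because the model generates those arguments, a cheap check before dispatch can catch malformed calls early. The following is a minimal sketch of a hypothetical `validateToolInput` helper. It covers only `required`, the primitive types `string`, `number`, and `boolean`, and `enum`, which is all the schemas in this article use. It is not a full JSON Schema validator; a dedicated library would be the safer choice for anything more complex.

```javascript
// Hypothetical helper: checks model-supplied arguments against a tool's
// input_schema. Handles only `required`, primitive `type`, and `enum`.
const PRIMITIVE_TYPES = ['string', 'number', 'boolean'];

function validateToolInput(tool, input) {
  const problems = [];
  const { properties = {}, required = [] } = tool.input_schema;

  for (const name of required) {
    if (input[name] === undefined) problems.push(`missing required field: ${name}`);
  }

  for (const [name, value] of Object.entries(input)) {
    const spec = properties[name];
    if (!spec) {
      problems.push(`unexpected field: ${name}`);
      continue;
    }
    // Other schema types (array, object, integer) are skipped, not rejected.
    if (PRIMITIVE_TYPES.includes(spec.type) && typeof value !== spec.type) {
      problems.push(`${name} should be ${spec.type}, got ${typeof value}`);
    }
    if (spec.enum && !spec.enum.includes(value)) {
      problems.push(`${name} must be one of: ${spec.enum.join(', ')}`);
    }
  }
  return problems;
}

// Same shape as the search_products definition above.
const searchProducts = {
  name: 'search_products',
  input_schema: {
    type: 'object',
    properties: {
      query: { type: 'string' },
      category: { type: 'string', enum: ['electronics', 'clothing', 'home', 'books'] },
      max_price: { type: 'number' },
    },
    required: ['query'],
  },
};

// Reports the invalid category and the string passed where a number belongs.
console.log(validateToolInput(searchProducts, { query: 'lamp', category: 'toys', max_price: '20' }));
```

Returning the problem list to the model as an error result, rather than throwing, gives it a chance to correct the call on the next iteration.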
The agent loop
The core execution pattern - keep calling until no more tools are needed:
async function runAgent(userMessage, conversationHistory = []) {
  const messages = [
    ...conversationHistory,
    { role: 'user', content: userMessage },
  ];

  // Safety limit: prevent infinite loops
  const MAX_ITERATIONS = 10;
  let iterations = 0;

  while (iterations < MAX_ITERATIONS) {
    iterations++;

    const response = await client.messages.create({
      model: 'claude-sonnet-4-6',
      max_tokens: 4096,
      system: `You are a customer support agent for ShopApp. Help customers with their orders and product questions.
Always be accurate - only report information from the tools, never make up order details.
If you cannot help with something, say so clearly.`,
      tools,
      messages,
    });

    // Add assistant response to message history
    messages.push({ role: 'assistant', content: response.content });

    // Check stop reason
    if (response.stop_reason === 'end_turn') {
      // No more tool calls - extract final text response
      const textBlock = response.content.find(block => block.type === 'text');
      return { response: textBlock?.text, history: messages };
    }

    if (response.stop_reason !== 'tool_use') {
      throw new Error(`Unexpected stop reason: ${response.stop_reason}`);
    }

    // Execute all tool calls in parallel
    const toolUseBlocks = response.content.filter(block => block.type === 'tool_use');
    const toolResults = await Promise.all(
      toolUseBlocks.map(async (toolUse) => {
        try {
          const handler = toolHandlers[toolUse.name];
          if (!handler) throw new Error(`Unknown tool: ${toolUse.name}`);
          const result = await handler(toolUse.input);
          return {
            type: 'tool_result',
            tool_use_id: toolUse.id,
            content: JSON.stringify(result),
          };
        } catch (err) {
          return {
            type: 'tool_result',
            tool_use_id: toolUse.id,
            is_error: true,
            content: `Tool execution failed: ${err.message}`,
          };
        }
      })
    );

    // Add tool results to messages and continue the loop
    messages.push({ role: 'user', content: toolResults });
  }

  throw new Error('Agent exceeded maximum iterations');
}
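The loop can be exercised without network calls by swapping the client for a scripted stand-in. The sketch below assumes the loop is refactored to take the client and handlers as parameters: `runLoop` is a condensed variant of `runAgent`, not the exact function above, and `createScriptedClient` is a hypothetical test double.

```javascript
// Hypothetical test double: returns pre-written responses in order and
// records every request so a test can inspect what the loop sent.
function createScriptedClient(responses) {
  const requests = [];
  let next = 0;
  return {
    requests,
    messages: {
      create: async (request) => {
        // Copy the message list: the loop keeps mutating the original array.
        requests.push({ ...request, messages: [...request.messages] });
        if (next >= responses.length) throw new Error('Script exhausted');
        return responses[next++];
      },
    },
  };
}

// Condensed version of the agent loop with client and handlers injected.
async function runLoop(client, handlers, userMessage, maxIterations = 10) {
  const messages = [{ role: 'user', content: userMessage }];
  for (let i = 0; i < maxIterations; i++) {
    const response = await client.messages.create({ messages });
    messages.push({ role: 'assistant', content: response.content });
    if (response.stop_reason !== 'tool_use') {
      return response.content.find((b) => b.type === 'text')?.text;
    }
    const results = await Promise.all(
      response.content
        .filter((b) => b.type === 'tool_use')
        .map(async (toolUse) => ({
          type: 'tool_result',
          tool_use_id: toolUse.id,
          content: JSON.stringify(await handlers[toolUse.name](toolUse.input)),
        }))
    );
    messages.push({ role: 'user', content: results });
  }
  throw new Error('Agent exceeded maximum iterations');
}

// Script: one tool call, then a final answer.
const script = [
  {
    stop_reason: 'tool_use',
    content: [{ type: 'tool_use', id: 'toolu_01', name: 'get_order_status', input: { order_id: 'ORD-1' } }],
  },
  { stop_reason: 'end_turn', content: [{ type: 'text', text: 'Your order has shipped.' }] },
];

const fakeClient = createScriptedClient(script);
const fakeHandlers = { get_order_status: async () => ({ status: 'shipped' }) };

runLoop(fakeClient, fakeHandlers, 'Where is ORD-1?').then((answer) => {
  console.log(answer); // prints "Your order has shipped."
  console.log(fakeClient.requests.length); // prints 2
});
```

The same harness makes it straightforward to script the unhappy paths: a model that keeps requesting tools until the iteration limit trips, or a handler that throws.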
Tool design: the most important decisions
Descriptions matter more than schemas
The model chooses which tool to call mostly from the descriptions; the schema mainly tells it how to fill in the arguments. A poorly described tool will be used at the wrong time or not at all.
// Bad description - too vague
{
  name: 'query_db',
  description: 'Query the database',
}

// Good description - specific about when to use it
{
  name: 'get_customer_orders',
  description: 'Retrieve order history for a specific customer. Use this when the customer asks about past purchases, order history, or wants to find a previous order. Returns last 20 orders sorted by date.',
}
Design for the model's reasoning
The model sees your tool as a black box. It reasons about what the tool does from the description and decides inputs from the schema. Design both to match how the model will think:
- Use plain language in descriptions
- Make parameter names self-explanatory
- Specify what the tool returns in the description
- Note limitations ("cannot cancel shipped orders")
- Specify when NOT to use a tool if it's commonly confused with another
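One way to keep these guidelines from eroding as the tool list grows is a small check that runs alongside your tests. The sketch below is a hypothetical `lintTool` function; the 40-character threshold and the keyword patterns are arbitrary heuristics chosen for illustration, not a measure of description quality.

```javascript
// Hypothetical lint: flags tool definitions that skip the guidelines above.
// The thresholds and keyword checks are arbitrary heuristics.
function lintTool(tool) {
  const warnings = [];
  const description = tool.description ?? '';

  if (description.length < 40) warnings.push('description is very short');
  if (!/\breturns?\b/i.test(description)) {
    warnings.push('description does not say what the tool returns');
  }
  if (!/\buse this (when|to|for)\b/i.test(description)) {
    warnings.push('description does not say when to use the tool');
  }

  for (const [name, spec] of Object.entries(tool.input_schema?.properties ?? {})) {
    if (name.length < 3) warnings.push(`parameter "${name}" is not self-explanatory`);
    if (!spec.description) warnings.push(`parameter "${name}" has no description`);
  }
  return warnings;
}

const vagueTool = {
  name: 'query_db',
  description: 'Query the database',
  input_schema: { type: 'object', properties: { q: { type: 'string' } } },
};

const specificTool = {
  name: 'get_customer_orders',
  description:
    'Retrieve order history for a specific customer. Use this when the customer asks about past purchases. ' +
    'Returns the last 20 orders sorted by date. Do not use this for tracking a single order; use get_order_status instead.',
  input_schema: {
    type: 'object',
    properties: {
      customer_id: { type: 'string', description: 'The customer ID, not the email address' },
    },
    required: ['customer_id'],
  },
};

console.log(lintTool(vagueTool).length); // prints 5
console.log(lintTool(specificTool));     // prints []
```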
Granular tools outperform omnibus tools
// Bad: one big tool with many modes
{
  name: 'manage_order',
  description: 'Manage orders - get status, cancel, update, etc.',
  input_schema: {
    type: 'object',
    properties: {
      action: { type: 'string', enum: ['get', 'cancel', 'update', 'refund'] },
      // ...
    },
  },
}

// Good: separate tools per action
// get_order_status, cancel_order, update_order_address, request_refund
// Each with its own clear description and precise schema
Granular tools lead to more reliable tool selection and easier error handling.
Authorization and safety
Never expose a tool that can take a destructive action without checking authorization:
const toolHandlers = {
  cancel_order: async ({ order_id, reason }, context) => {
    const order = await db.orders.findById(order_id);

    // Verify the order exists and belongs to the authenticated user.
    // Both cases return the same error so the response does not reveal
    // whether someone else's order exists.
    if (!order || order.userId !== context.userId) {
      return { error: 'Order not found' };
    }

    // Check business rules
    if (['shipped', 'delivered'].includes(order.status)) {
      return {
        error: `This order cannot be cancelled because it has already been ${order.status}.`,
        suggestion: 'You can initiate a return after delivery.'
      };
    }

    await db.orders.update(order_id, { status: 'cancelled' });

    // Log for audit trail
    await auditLog.write({
      action: 'order_cancelled',
      orderId: order_id,
      userId: context.userId,
      reason,
      triggeredBy: 'ai_agent',
    });

    return { success: true };
  },
};

// Pass context through the agent call
async function runAgent(userMessage, userId) {
  const context = { userId, permissions: await getUserPermissions(userId) };

  // Wrap handlers to inject context
  const authorizedHandlers = Object.fromEntries(
    Object.entries(toolHandlers).map(([name, handler]) => [
      name,
      (args) => handler(args, context),
    ])
  );

  // ... run the agent loop as before, looking up handlers in
  // authorizedHandlers instead of toolHandlers
}
Rule: treat tool calling like an API endpoint. Every tool call needs the same authorization checks you'd apply to a REST API call.
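Ownership checks belong inside each handler, but coarse permission checks can be enforced once, at the point where context is bound. The sketch below illustrates that idea under stated assumptions: the `requiredPermission` map and the permission strings are hypothetical, and `context.permissions` is assumed to be an array of strings.

```javascript
// Hypothetical mapping from tool name to the permission it requires.
// Tools not listed here are denied by default.
const requiredPermission = {
  get_order_status: 'orders:read',
  cancel_order: 'orders:cancel',
};

// Wraps every handler so it (1) refuses to run without the required
// permission and (2) receives the request context as a second argument.
function bindContext(handlers, context) {
  return Object.fromEntries(
    Object.entries(handlers).map(([name, handler]) => [
      name,
      async (args) => {
        const needed = requiredPermission[name];
        if (!needed || !context.permissions.includes(needed)) {
          return { error: 'You are not allowed to perform this action.' };
        }
        return handler(args, context);
      },
    ])
  );
}

// Stub handlers standing in for the real ones.
const stubHandlers = {
  get_order_status: async ({ order_id }, ctx) => ({ order_id, viewer: ctx.userId }),
  cancel_order: async () => ({ success: true }),
};

const readOnlyHandlers = bindContext(stubHandlers, { userId: 'u1', permissions: ['orders:read'] });

// The read succeeds; the cancellation is refused before the handler runs.
readOnlyHandlers.get_order_status({ order_id: 'ORD-1' }).then((r) => console.log(r));
readOnlyHandlers.cancel_order({ order_id: 'ORD-1', reason: 'test' }).then((r) => console.log(r));
```

Denying unlisted tools by default means a newly added tool cannot be called until someone decides which permission it needs.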
Handling long-running tasks
Some tools trigger async operations (sending emails, processing payments, running reports). Don't block the agent waiting for them:
const tools = [
  {
    name: 'generate_monthly_report',
    description: 'Generates a detailed monthly report. This takes 2-5 minutes. Returns a job ID that can be used to check status.',
    input_schema: {
      type: 'object',
      properties: {
        month: { type: 'string', description: 'Format: YYYY-MM' },
        include_charts: { type: 'boolean' },
      },
      required: ['month'],
    },
  },
  {
    name: 'check_report_status',
    description: 'Check the status of a report generation job.',
    input_schema: {
      type: 'object',
      properties: {
        job_id: { type: 'string' },
      },
      required: ['job_id'],
    },
  },
];

const toolHandlers = {
  generate_monthly_report: async ({ month, include_charts }) => {
    // Fire and forget - return job ID immediately
    const jobId = await reportQueue.add({ month, include_charts });
    return {
      job_id: jobId,
      message: 'Report generation started. Check back in 2-5 minutes with the job ID.',
      estimated_completion: new Date(Date.now() + 3 * 60 * 1000).toISOString(),
    };
  },
  check_report_status: async ({ job_id }) => {
    const job = await reportQueue.getJob(job_id);
    if (!job) return { error: 'Job not found' };
    // await is needed if the queue library returns a promise here
    // (Bull and BullMQ do); it is harmless if the call is synchronous
    if (await job.isCompleted()) {
      return { status: 'complete', download_url: job.result.url };
    }
    return { status: job.state, progress: job.progress };
  },
};
Multi-agent patterns
For complex workflows, use multiple specialized agents:
// Orchestrator: decides which sub-agent to invoke

// Routing tools take no arguments, but every tool still needs an input_schema
const noArgs = { type: 'object', properties: {} };

async function orchestratorAgent(userRequest) {
  const response = await client.messages.create({
    model: 'claude-opus-4-6',
    max_tokens: 1024, // required by the Messages API
    system: 'You are an orchestrator. Based on the user request, decide which specialist to invoke.',
    tools: [
      { name: 'invoke_billing_agent', description: 'For payment, subscription, and invoice questions', input_schema: noArgs },
      { name: 'invoke_support_agent', description: 'For product issues, returns, and complaints', input_schema: noArgs },
      { name: 'invoke_sales_agent', description: 'For product recommendations and upgrades', input_schema: noArgs },
    ],
    messages: [{ role: 'user', content: userRequest }],
  });

  const toolCall = response.content.find(b => b.type === 'tool_use');
  switch (toolCall?.name) {
    case 'invoke_billing_agent': return billingAgent(userRequest);
    case 'invoke_support_agent': return supportAgent(userRequest);
    case 'invoke_sales_agent': return salesAgent(userRequest);
    default:
      // The model answered directly instead of picking a specialist;
      // adapt the return shape to match what the sub-agents return
      return response.content.find(b => b.type === 'text')?.text;
  }
}
Observability: what to log
async function runAgentWithLogging(userMessage, userId, sessionId) {
  const trace = {
    sessionId,
    userId,
    startTime: Date.now(),
    iterations: [],
  };

  // Instrument every LLM call and tool execution inside runAgent so that each
  // loop iteration pushes { inputTokens, outputTokens, toolCalls, ... } onto
  // trace.iterations; without that, the totals below stay at zero.
  // Log: model, tokens, tool calls, results, errors, total duration
  const result = await runAgent(userMessage);

  trace.totalDuration = Date.now() - trace.startTime;
  trace.totalInputTokens = trace.iterations.reduce((s, i) => s + i.inputTokens, 0);
  trace.totalOutputTokens = trace.iterations.reduce((s, i) => s + i.outputTokens, 0);
  trace.toolCallsCount = trace.iterations.reduce((s, i) => s + i.toolCalls.length, 0);

  await observability.recordTrace(trace);
  return result;
}
Use a dedicated observability tool: LangSmith, Langfuse (open source), or Helicone. Without tracing, debugging agent failures is nearly impossible - you need to see the full message chain, which tools were called, what they returned, and why the model made each decision.
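Whichever backend you pick, the trace is only as good as what each loop iteration records. Here is a minimal sketch of a hypothetical `recordIteration` helper that could be called after every model response. The `usage.input_tokens` and `usage.output_tokens` fields are the token counts the Anthropic Messages API returns on each response; the entry shape (`inputTokens`, `outputTokens`, `toolCalls`) is chosen to match the reducers in `runAgentWithLogging`.

```javascript
// Hypothetical helper: appends one entry per model call to the trace.
function recordIteration(trace, response, startedAt) {
  const toolCalls = response.content
    .filter((block) => block.type === 'tool_use')
    .map((block) => ({ name: block.name, input: block.input }));

  trace.iterations.push({
    stopReason: response.stop_reason,
    inputTokens: response.usage?.input_tokens ?? 0,
    outputTokens: response.usage?.output_tokens ?? 0,
    toolCalls,
    durationMs: Date.now() - startedAt,
  });
}

// Hand-written response standing in for a real API result.
const sampleTrace = { sessionId: 's1', userId: 'u1', startTime: Date.now(), iterations: [] };
recordIteration(
  sampleTrace,
  {
    stop_reason: 'tool_use',
    usage: { input_tokens: 812, output_tokens: 64 },
    content: [{ type: 'tool_use', id: 'toolu_01', name: 'get_order_status', input: { order_id: 'ORD-1' } }],
  },
  Date.now()
);

const totalInputTokens = sampleTrace.iterations.reduce((s, i) => s + i.inputTokens, 0);
console.log(totalInputTokens); // prints 812
```

Tool inputs can contain personal data, so consider redacting them before the trace leaves your system.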
Production checklist
- Maximum iteration limit (prevent infinite loops)
- Timeout per tool call (don't block indefinitely)
- Authorization check in every destructive tool
- Audit log for every action taken via agent
- Token usage tracking per session
- Graceful handling of tool execution errors
- Full trace logging for debugging
- Rate limiting per user (agents can be expensive)
- System prompt that specifies what the agent cannot do
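The per-tool timeout is the one item on this list that the code in this article does not show. A minimal sketch using `Promise.race` follows; `withTimeout` and the 50 ms limit are illustrative assumptions. Note that the race only stops waiting: it does not cancel the underlying work, so a handler that must be aborted needs its own cancellation mechanism.

```javascript
// Hypothetical wrapper: rejects if the handler takes longer than `ms`.
function withTimeout(handler, ms) {
  return async (...args) => {
    let timer;
    const timeout = new Promise((_, reject) => {
      timer = setTimeout(() => reject(new Error(`Tool timed out after ${ms} ms`)), ms);
    });
    try {
      return await Promise.race([handler(...args), timeout]);
    } finally {
      // Clear the timer so a fast result does not leave it pending.
      clearTimeout(timer);
    }
  };
}

// Stub tools: one resolves immediately, one takes 200 ms.
const fastTool = async () => ({ ok: true });
const slowTool = () => new Promise((resolve) => setTimeout(() => resolve({ ok: true }), 200));

withTimeout(fastTool, 50)().then((result) => console.log(result));
withTimeout(slowTool, 50)().catch((err) => console.log(err.message)); // prints "Tool timed out after 50 ms"
```

Because the wrapper rejects with an ordinary `Error`, a loop that already catches handler failures and reports them as `is_error` tool results needs no further changes.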
The agent pattern is genuinely powerful - a well-designed agent can handle complex, multi-step customer requests that would have required multiple human interactions. The reliability comes from good tool design, proper authorization, and thorough testing of edge cases. Demos work on happy paths. Production agents need to handle everything else.
Aunimeda builds AI-powered solutions - chatbots, AI agents, voice assistants, and automation systems for businesses.