Tool Calling / Function Calling

Language models are impressive text processors, but without tools they can’t check today’s weather, query your database, or send an email. Tool calling (also called function calling) is the mechanism that lets LLMs interact with external systems, turning them from text generators into agents that can act.

The Core Mechanic

Tool calling works through a structured request-response cycle:

1. You define available tools (name, description, input schema)
2. You send a message + tools to the model
3. The model decides: answer directly OR call a tool
4. If calling a tool: model returns structured call with arguments
5. Your code executes the actual function
6. You send the result back to the model
7. Model uses result to answer (or calls another tool)

The model never directly executes tools — it just requests them. Your code executes them. This is critical for safety: you control what tools are available and what they’re allowed to do.
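
As a rough sketch of that division of labor, the snippet below treats a tool call as plain data (a name plus arguments) and lets application code decide whether and how to run it. The `ALLOWED_TOOLS` registry, the `dispatch` helper, and the stubbed `get_time` tool are hypothetical names used only for illustration; no model API is involved.

```python
from typing import Any, Callable

def get_time(timezone: str = "UTC") -> dict:
    # Stub tool for illustration; a real tool would consult a clock or an API.
    return {"timezone": timezone, "time": "12:00"}

# Only tools listed here can ever run, whatever the model asks for.
ALLOWED_TOOLS: dict[str, Callable[..., Any]] = {"get_time": get_time}

def dispatch(tool_call: dict) -> dict:
    """Run a model-requested tool call, or refuse it with an error payload."""
    name = tool_call.get("name")
    args = tool_call.get("input", {})
    func = ALLOWED_TOOLS.get(name)
    if func is None:
        return {"ok": False, "error": f"Unknown tool: {name}"}
    try:
        return {"ok": True, "result": func(**args)}
    except TypeError as exc:
        # Typically raised when the arguments do not match the function signature.
        return {"ok": False, "error": f"Bad arguments for {name}: {exc}"}
```

A request for a tool that is not in the registry comes back as an error payload instead of being executed, which is the property the safety argument relies on.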

Defining Tools

Each tool is defined by a name, a description, and an input schema written in JSON Schema. The description matters enormously: it’s how the model decides when and how to use each tool.

# Anthropic Claude tool definition
tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a city. Use this when the user asks "
                       "about weather conditions, temperature, or forecasts.",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name, e.g. 'London' or 'Tokyo'"
                },
                "units": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature units to return"
                }
            },
            "required": ["city"]
        }
    },
    {
        "name": "search_database",
        "description": "Search the product database. Use this for any product-related "
                       "questions — inventory, pricing, specifications.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "category": {
                    "type": "string",
                    "enum": ["electronics", "clothing", "furniture", "all"]
                },
                "max_results": {"type": "integer", "default": 5}
            },
            "required": ["query"]
        }
    }
]
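
The arguments a model produces are not guaranteed to match the schema, so it can help to check them before running anything. The following is a minimal sketch, assuming a hand-rolled `validate_input` helper (hypothetical, not part of any SDK) that covers only `required`, basic `type`, and `enum`; a full JSON Schema validator library would handle far more.

```python
# Maps JSON Schema type names to Python types (a subset, for illustration).
JSON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}

def validate_input(schema: dict, args: dict) -> list[str]:
    """Return a list of problems; an empty list means the args look valid."""
    errors = []
    props = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in args:
            errors.append(f"missing required field: {key}")
    for key, value in args.items():
        if key not in props:
            # Stricter than JSON Schema's default, which allows extra fields.
            errors.append(f"unexpected field: {key}")
            continue
        spec = props[key]
        expected = JSON_TYPES.get(spec.get("type"))
        # bool is a subclass of int in Python, so reject it for numeric types.
        if expected and (not isinstance(value, expected)
                         or (isinstance(value, bool) and spec["type"] != "boolean")):
            errors.append(f"{key}: expected {spec['type']}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"{key}: must be one of {spec['enum']}")
    return errors

weather_schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "units": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city"],
}

print(validate_input(weather_schema, {"city": "Tokyo"}))
print(validate_input(weather_schema, {"units": "kelvin"}))
```

Returning the problems as strings makes it easy to send them back to the model as a tool error so it can correct the call.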

Handling Tool Calls: Complete Example

import anthropic
import json

client = anthropic.Anthropic()

def get_weather(city: str, units: str = "celsius") -> dict:
    # Your actual weather API call here
    return {"temperature": 22, "condition": "partly cloudy", "humidity": 65}

def search_database(query: str, category: str = "all", max_results: int = 5) -> list:
    # Your actual database query here
    return [{"id": 1, "name": "Widget Pro", "price": 49.99}]

TOOL_FUNCTIONS = {
    "get_weather": get_weather,
    "search_database": search_database
}

def run_agent(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )

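        # Tool calls requested: run them and send the results back
        if response.stop_reason == "tool_use":
            # Add assistant's response (including tool_use blocks) to history
            messages.append({"role": "assistant", "content": response.content})

            # Execute each tool call
            tool_results = []
            for block in response.content:
                if block.type != "tool_use":
                    continue
                func = TOOL_FUNCTIONS.get(block.name)
                try:
                    if func is None:
                        raise ValueError(f"Unknown tool: {block.name}")
                    content, is_error = json.dumps(func(**block.input)), False
                except Exception as exc:
                    # Report failures to the model instead of crashing the loop
                    content, is_error = f"Error: {exc}", True
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": content,
                    "is_error": is_error
                })

            # Add tool results to history (as a user turn), then loop again
            messages.append({"role": "user", "content": tool_results})
            continue

        # Any other stop reason (end_turn, max_tokens, ...): the model is done.
        # Join the text blocks rather than assuming content[0] is text.
        return "".join(b.text for b in response.content if b.type == "text")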
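
The loop above keeps going for as long as the model asks for tools. One possible refinement, sketched here under assumptions, is to cap the number of model calls and to pass the model in as a function so the loop can be exercised offline. `run_bounded`, `scripted_model`, and the `add` tool are hypothetical names; the fake responses only mimic the `stop_reason` and `content` fields used in the example, not the full SDK objects.

```python
from types import SimpleNamespace
from typing import Callable

def run_bounded(create: Callable[[list], SimpleNamespace],
                tool_functions: dict, user_message: str, max_turns: int = 5) -> str:
    """Tool loop that gives up after max_turns model calls."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        response = create(messages)
        if response.stop_reason != "tool_use":
            return "".join(b.text for b in response.content if b.type == "text")
        messages.append({"role": "assistant", "content": response.content})
        results = []
        for block in response.content:
            if block.type == "tool_use":
                output = tool_functions[block.name](**block.input)
                results.append({"type": "tool_result",
                                "tool_use_id": block.id,
                                "content": str(output)})
        messages.append({"role": "user", "content": results})
    raise RuntimeError(f"No final answer after {max_turns} model calls")

def scripted_model(messages: list) -> SimpleNamespace:
    """Fake model: requests one tool call, then answers using its result."""
    last = messages[-1]["content"]
    if isinstance(last, str):  # first turn: plain user text
        call = SimpleNamespace(type="tool_use", id="call_1",
                               name="add", input={"a": 2, "b": 3})
        return SimpleNamespace(stop_reason="tool_use", content=[call])
    text = SimpleNamespace(type="text", text=f"The sum is {last[0]['content']}.")
    return SimpleNamespace(stop_reason="end_turn", content=[text])

print(run_bounded(scripted_model, {"add": lambda a, b: a + b}, "What is 2 + 3?"))
```

With a real client, `create` could be a small wrapper around `client.messages.create` that fills in the model, tools, and token limit.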

Parallel Tool Calling

Modern LLMs can request several tool calls in a single response when the calls are independent. Your code can then run them concurrently, so the wait is roughly that of the slowest call rather than the sum of all of them.

Sequential (slow):
  [Call weather] → [wait 200ms] → [Call news] → [wait 200ms] → [generate response, ~200ms]
  Total: ~600ms

Parallel (fast):
  [Call weather + Call news simultaneously] → [wait 200ms] → [generate response, ~200ms]
  Total: ~400ms

# Example: the model requests three independent tool calls in one response.
# response.content might contain:
[
    ToolUseBlock(id="tool_1", name="get_weather", input={"city": "Tokyo"}),
    ToolUseBlock(id="tool_2", name="get_weather", input={"city": "London"}),
    ToolUseBlock(id="tool_3", name="search_news", input={"query": "AI developments"})
]

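# Execute all in parallel
import asyncio

async def execute_tool(name, args):
    # The tool functions are synchronous, so run each one in a worker thread
    return await asyncio.to_thread(TOOL_FUNCTIONS[name], **args)

async def execute_tools_parallel(tool_calls):
    # gather() runs the calls concurrently and returns results in input order
    tasks = [execute_tool(call.name, call.input) for call in tool_calls]
    return await asyncio.gather(*tasks)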
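
To make the latency difference concrete, here is a small self-contained timing sketch. It assumes blocking tools, simulated by a hypothetical `slow_lookup` that just sleeps, and uses `asyncio.to_thread` (Python 3.9+) to run them side by side. Exact timings will vary by machine.

```python
import asyncio
import time

def slow_lookup(label: str, delay: float = 0.2) -> str:
    # Stand-in for a blocking API call (hypothetical; sleeps instead of doing I/O).
    time.sleep(delay)
    return f"{label}: done"

async def run_concurrently(labels: list[str]) -> list[str]:
    # Each blocking call runs in its own worker thread.
    tasks = [asyncio.to_thread(slow_lookup, label) for label in labels]
    return await asyncio.gather(*tasks)

def timed(fn):
    # Returns the function's result together with the elapsed wall-clock time.
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start

labels = ["weather", "news", "stocks"]
sequential, t_seq = timed(lambda: [slow_lookup(label) for label in labels])
concurrent, t_par = timed(lambda: asyncio.run(run_concurrently(labels)))
print(f"sequential: {t_seq:.2f}s, concurrent: {t_par:.2f}s")
```

Both runs return the same results in the same order. The sequential run should take about three times the per-call delay, while the concurrent run should take roughly one delay plus scheduling overhead.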

Tool Calling vs. RAG

A common architectural question: should you use tool calling or RAG for knowledge retrieval?

Scenario	Tool Calling	RAG
Real-time data (weather, stock prices)	✓	✗
Structured DB queries	✓	✗
Static knowledge base (docs, wikis)	Either	✓
Large document corpora	✗	✓
Requires computation (sum, filter)	✓	✗
Semantic search over text	Either	✓

In practice, many production systems use both: a search_knowledge_base tool that internally calls a RAG pipeline.
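
A minimal sketch of that combined pattern follows. The `retrieve` function is a toy keyword-overlap ranker standing in for a real RAG pipeline (embeddings, vector store, reranking), and `DOCS` holds made-up sample passages; both are assumptions for illustration only.

```python
import re

# Made-up sample passages; a real system would query an index instead.
DOCS = [
    "Returns are accepted within 30 days of delivery.",
    "Standard shipping takes 3 to 5 business days.",
    "Gift cards never expire and cannot be refunded.",
]

def _tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, top_k: int = 2) -> list[str]:
    """Toy retriever: rank documents by words shared with the query."""
    q = _tokens(query)
    scored = [(len(q & _tokens(doc)), doc) for doc in DOCS]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:top_k] if score > 0]

def search_knowledge_base(query: str, top_k: int = 2) -> dict:
    """Tool function: the model sees only this interface, not the pipeline."""
    return {"query": query, "passages": retrieve(query, top_k)}

search_knowledge_base_tool = {
    "name": "search_knowledge_base",
    "description": "Search internal help documents. Use this for questions "
                   "about policies such as returns, shipping, or gift cards.",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Natural-language question"},
            "top_k": {"type": "integer", "description": "Maximum passages to return"},
        },
        "required": ["query"],
    },
}

print(search_knowledge_base("How long does shipping take?"))
```

Because the retrieval logic sits behind the tool function, it can be swapped for a real pipeline without changing the definition the model sees.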

Tool Safety and Guardrails

Tools can have real-world consequences. A send_email tool can spam your customers. A delete_record tool can destroy data. Design your tool layer defensively:

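from datetime import datetime

# db and audit_log stand in for your database client and audit logger.
# Example allowlist: table names cannot be bound as SQL parameters.
DELETABLE_TABLES = {"orders", "customers"}

def delete_record(table: str, record_id: int) -> dict:
    # Validate inputs before building any SQL (they come from the model)
    if table not in DELETABLE_TABLES or not isinstance(record_id, int):
        return {"success": False, "error": "Invalid table or record ID"}

    # Confirm the record exists first (parameterized to prevent SQL injection)
    record = db.query(f"SELECT * FROM {table} WHERE id = ?", (record_id,))
    if not record:
        return {"success": False, "error": f"Record {record_id} not found in {table}"}

    # Log the deletion for audit
    audit_log.write(f"DELETE {table}/{record_id} at {datetime.now()}")

    # Soft delete instead of hard delete ("?" placeholder style depends on the driver)
    db.execute(f"UPDATE {table} SET deleted_at = NOW() WHERE id = ?", (record_id,))
    return {"success": True, "soft_deleted": True}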

Design principles for safe tools:

Prefer read-only tools; make write tools require explicit confirmation
Soft-delete instead of hard-delete
Rate limit expensive tools
Log all tool calls for audit
Return clear error messages (they go back to the model)
Validate and sanitize all inputs before execution
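
Several of these principles can be applied in one place by wrapping tool functions. The sketch below is one possible shape, not a standard API: `RateLimiter`, `guarded`, and the stubbed `send_email` are hypothetical names, and the sliding-window limit is just one simple policy.

```python
import time
from collections import deque
from typing import Callable

class RateLimiter:
    """Allow at most max_calls within any sliding window of `period` seconds."""
    def __init__(self, max_calls: int, period: float):
        self.max_calls, self.period = max_calls, period
        self.calls = deque()

    def allow(self) -> bool:
        now = time.monotonic()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            return False
        self.calls.append(now)
        return True

def guarded(func: Callable, limiter: RateLimiter, audit: list, is_write: bool = False):
    """Wrap a tool with audit logging, a confirmation gate, and rate limiting."""
    def wrapper(confirmed: bool = False, **kwargs) -> dict:
        # Every attempt is logged, including refused ones.
        audit.append({"tool": func.__name__, "args": kwargs, "confirmed": confirmed})
        if is_write and not confirmed:
            return {"success": False, "error": "Confirmation required; ask the user first."}
        if not limiter.allow():
            return {"success": False, "error": "Rate limit exceeded; try again later."}
        return {"success": True, "result": func(**kwargs)}
    return wrapper

def send_email(to: str, subject: str) -> str:
    # Stub: a real implementation would call a mail service.
    return f"queued email to {to}"

audit_trail: list = []
safe_send_email = guarded(send_email, RateLimiter(max_calls=2, period=60.0),
                          audit_trail, is_write=True)
```

In this design the `confirmed` flag is meant to be set by application code after the user approves the action, not copied from the model's arguments; otherwise the model could confirm its own requests.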

The Tool Description Is Everything

The model decides which tool to call based largely on the tool’s name, description, and input schema; it never sees your implementation. Bad descriptions lead to wrong tool choices:

Bad description:
  name: "db_query"
  description: "Query the database"

  → Model doesn't know what's in the database, when to use it,
    or what kind of query format to pass

Good description:
  name: "search_customer_orders"
  description: "Search orders for a specific customer by name, email, or order ID.
                Returns order history including status, items, and dates.
                Use for any question about past orders or order status."

  → Model knows exactly when and how to use this tool

Writing good tool descriptions is as important as prompt engineering. Budget time for it.
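
One lightweight way to budget that time is an automated check that flags obviously thin definitions before they ship. The `lint_tool` helper below is a hypothetical sketch; its word-count threshold and the crude test for the word "use" are arbitrary assumptions, not established rules, and it cannot judge whether a description is actually accurate.

```python
def lint_tool(tool: dict, min_words: int = 8) -> list[str]:
    """Flag common description problems. Thresholds are arbitrary assumptions."""
    warnings = []
    description = tool.get("description", "")
    if len(description.split()) < min_words:
        warnings.append("description is very short")
    if "use" not in description.lower():
        warnings.append("description does not say when to use the tool")
    props = tool.get("input_schema", {}).get("properties", {})
    for name, spec in props.items():
        if not spec.get("description"):
            warnings.append(f"parameter '{name}' has no description")
    return warnings

# Hypothetical definition mirroring the bad example; the "sql" parameter is made up.
bad_tool = {
    "name": "db_query",
    "description": "Query the database",
    "input_schema": {"type": "object", "properties": {"sql": {"type": "string"}}},
}
for warning in lint_tool(bad_tool):
    print(warning)
```

A check like this catches only the most mechanical problems; reviewing real transcripts of which tools the model picks remains the better test.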

2026: Tools Are the New APIs

The industry is converging on tool calling as the standard interface for AI-to-system integration. MCP (Model Context Protocol) is formalizing this into a standard protocol. The implication:

Any backend service that exposes well-designed tools can be controlled by an AI agent. Your existing REST APIs can become agent capabilities with a thin tool-definition layer on top.
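
As an illustration of how thin that layer can be, the sketch below converts a simplified endpoint description into a tool definition. The endpoint format, the `endpoint_to_tool` helper, and the `get_order_status` endpoint are all hypothetical; real API specifications carry more detail (authentication, nested bodies, error codes) that would need handling.

```python
def endpoint_to_tool(endpoint: dict) -> dict:
    """Turn a simple endpoint description into a tool definition."""
    properties, required = {}, []
    for param in endpoint["params"]:
        properties[param["name"]] = {
            "type": param["type"],
            "description": param["description"],
        }
        if param.get("required", False):
            required.append(param["name"])
    return {
        "name": endpoint["operation_id"],
        "description": endpoint["summary"],
        "input_schema": {"type": "object", "properties": properties, "required": required},
    }

# Hypothetical endpoint description, for illustration only.
order_status_endpoint = {
    "method": "GET",
    "path": "/orders/{order_id}",
    "operation_id": "get_order_status",
    "summary": "Look up the status of an order. Use when the user asks where an order is.",
    "params": [
        {"name": "order_id", "type": "string", "description": "Order ID", "required": True},
    ],
}

tool = endpoint_to_tool(order_status_endpoint)
print(tool["name"], tool["input_schema"]["required"])
```

The generated definition is only a starting point: the summary usually needs rewriting with the model in mind, since the description drives tool selection.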

This is fundamentally changing how software is architected — AI-first applications are tool-first applications.