Applications using LLMs become much more powerful when the model can interact with external resources, such as APIs, databases, and the web, to gather dynamic data or to perform actions. Tool use (or function calling) is what transforms a language model from a conversational interface into an autonomous agent capable of taking action, accessing real-time information, and solving complex multi-step problems.
This doc starts with a high-level overview of tool use and then dives into the details of how tool use works. If you're already familiar with tool use, you can skip to the How to Use Tools on the Groq API section.
There are a few important pieces in the tool calling process: providing tool definitions to the model, receiving structured tool calls back from the model, executing those tools in your application code and returning the results, and finally receiving the model's response that incorporates the tool results.
Let's break down each step in more detail.
To use tools, the model must be provided with tool definitions. These tool definitions are in JSON schema format and are passed to the model via the tools parameter in the API request.
// Sample request body with tool definitions and messages
{
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather for a location",
"parameters": {
// JSON Schema object
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"]
}
},
"required": ["location"]
}
}
}
],
"messages": [
{
"role": "system",
"content": "You are a weather assistant. Respond to the user question and use tools if needed to answer the query."
},
{
"role": "user",
"content": "What's the weather in San Francisco?"
}
]
}
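In application code, this request is usually sent through an SDK rather than as raw JSON. Here is a minimal sketch using the Groq Python SDK (the `groq` package); it assumes `GROQ_API_KEY` is set in your environment, and the model ID is illustrative:

```python
import os
from groq import Groq

# The client can also read GROQ_API_KEY from the environment automatically
client = Groq(api_key=os.environ["GROQ_API_KEY"])

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City and state, e.g. San Francisco, CA",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]

messages = [
    {
        "role": "system",
        "content": "You are a weather assistant. Respond to the user question and use tools if needed to answer the query.",
    },
    {"role": "user", "content": "What's the weather in San Francisco?"},
]

# Any tool-use-capable model from the table below works here
response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=messages,
    tools=tools,
)
```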
Key fields:

- `name`: Function identifier
- `description`: Helps the model decide when to use this tool
- `parameters`: Function parameters defined as a JSON Schema object. Refer to JSON Schema for schema documentation.

When the model decides to use a tool, it returns structured tool calls in the response. The model returns a `tool_calls` array with the following fields:
{
"role": "assistant",
"tool_calls": [{
"id": "call_abc123",
"type": "function",
"function": {
"name": "get_weather",
"arguments": "{\"location\": \"San Francisco, CA\", \"unit\": \"fahrenheit\"}"
}
}]
}

Key fields:
- `id`: Unique identifier you'll reference when returning results
- `function.name`: Which tool to execute
- `function.arguments`: JSON string of arguments (needs parsing)

Application code will then execute the tool and create a new message with the results. This new message is appended to the conversation and sent back to the model.
{
"role": "tool",
// must match the `id` from the assistant's `tool_calls`
"tool_call_id": "call_abc123",
"name": "get_weather",
"content": "{\"temperature\": 72, \"condition\": \"sunny\", \"unit\": \"fahrenheit\"}"
}
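Continuing the Python sketch from above, your application parses the arguments, runs the matching local function, and builds this tool message. The `get_weather` implementation here is a stand-in for whatever your tool actually does:

```python
import json

def get_weather(location: str, unit: str = "fahrenheit") -> dict:
    # Stand-in implementation; a real version would call a weather API
    return {"temperature": 72, "condition": "sunny", "unit": unit}

tool_call = response.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)  # arguments arrive as a JSON string

result = get_weather(**args)

tool_message = {
    "role": "tool",
    "tool_call_id": tool_call.id,     # must match the id from the assistant's tool_calls
    "name": tool_call.function.name,
    "content": json.dumps(result),    # content must be a string
}
```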
Key connections:

- The tool message's `tool_call_id` must match the `id` from the assistant's `tool_calls`
- `content` can be any string value. Different tools may return different types of data.

The model is then provided with the updated messages array:
[
{
"role": "user",
"content": "What's the weather in San Francisco?"
},
{
"role": "assistant",
"tool_calls": [{
"id": "call_abc123",
"type": "function",
"function": {
"name": "get_weather",
"arguments": "{\"location\": \"San Francisco, CA\", \"unit\": \"fahrenheit\"}"
}
}]
},
{
"role": "tool",
"tool_call_id": "call_abc123",
"name": "get_weather",
"content": "{\"temperature\": 72, \"condition\": \"sunny\", \"unit\": \"fahrenheit\"}"
}
]

The model then analyzes the tool results and either:

- Returns a final text response to the user, or
- Requests another round of tool calls (`tool_calls`) if it needs more information

A final response looks like this:

{
"role": "assistant",
"content": "The weather in San Francisco is sunny and 72 degrees Fahrenheit."
}

This tool-calling sequence is normally implemented in your application code, but Groq supports a number of ways to call tools server-side, which keeps your application code simple while still letting you use tools.
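Putting the steps together, a minimal local tool-calling loop might look like the sketch below. It reuses the `tools`, `messages`, and `get_weather` definitions from the earlier sketches and keeps calling the model until it returns a plain text answer instead of more tool calls:

```python
import json

# Map tool names to local implementations
available_tools = {"get_weather": get_weather}

while True:
    response = client.chat.completions.create(
        model="llama-3.3-70b-versatile",  # illustrative model ID
        messages=messages,
        tools=tools,
    )
    message = response.choices[0].message

    # No tool calls means the model has produced its final answer
    if not message.tool_calls:
        print(message.content)
        break

    # Append the assistant's tool calls, then one tool message per call
    messages.append(message)
    for tool_call in message.tool_calls:
        func = available_tools[tool_call.function.name]
        args = json.loads(tool_call.function.arguments)
        result = func(**args)
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "name": tool_call.function.name,
            "content": json.dumps(result),
        })
```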
All models hosted on Groq support tool use, and in general, we recommend the latest models for improved tool use capabilities:
| Model ID | Local & Remote Tool Use Support? | Parallel Tool Use Support? | JSON Mode Support? | Built-In Tools Support? |
|---|---|---|---|---|
| moonshotai/kimi-k2-instruct-0905 | Yes ✅ | Yes ✅ | Yes ✅ | No ❌ |
| openai/gpt-oss-20b | Yes ✅ | No ❌ | Yes ✅ | Yes ✅ |
| openai/gpt-oss-120b | Yes ✅ | No ❌ | Yes ✅ | Yes ✅ |
| openai/gpt-oss-safeguard-20b | Yes ✅ | No ❌ | Yes ✅ | No ❌ |
| qwen/qwen3-32b | Yes ✅ | Yes ✅ | Yes ✅ | No ❌ |
| meta-llama/llama-4-scout-17b-16e-instruct | Yes ✅ | Yes ✅ | Yes ✅ | No ❌ |
| meta-llama/llama-4-maverick-17b-128e-instruct | Yes ✅ | Yes ✅ | Yes ✅ | No ❌ |
| llama-3.3-70b-versatile | Yes ✅ | Yes ✅ | Yes ✅ | No ❌ |
| llama-3.1-8b-instant | Yes ✅ | Yes ✅ | Yes ✅ | No ❌ |
| groq/compound | No ❌ | N/A | Yes ✅ | Yes ✅ |
| groq/compound-mini | No ❌ | N/A | Yes ✅ | Yes ✅ |
Groq supports three distinct patterns for tool use, each suited for different use cases: Groq built-in tools, remote tool calling via MCP servers, and local tool calling.
Groq maintains a set of pre-built tools like web search, code execution, and browser automation that execute entirely on Groq's infrastructure. These tools require minimal configuration and no tool orchestration on your end. With one API call, you get a capable, real-time AI agent. All tool calls happen in a single API call: when configured with access to built-in tools, the model autonomously calls them and handles the entire agentic loop internally.
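As a rough sketch, calling one of the compound systems looks like any other chat completions request; the system decides on its own when to invoke built-in tools such as web search:

```python
from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment

# groq/compound handles tool selection, execution, and synthesis server-side;
# no tools parameter or orchestration loop is needed in your code
response = client.chat.completions.create(
    model="groq/compound",
    messages=[{"role": "user", "content": "What's the weather in San Francisco right now?"}],
)

print(response.choices[0].message.content)
```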
Ideal for:
Supported models:
- groq/compound and groq/compound-mini
- openai/gpt-oss-20b and openai/gpt-oss-120b

The Model Context Protocol (MCP) is an open standard that allows models to connect to and execute external tools. Each MCP server hosts a set of tools, providing endpoints to fetch their definitions and execute them without requiring the end user to implement the underlying tool logic.
Groq supports MCP tool discovery and execution server-side via remote tool calling. Similar to built-in tools, this allows you to use third-party tools with minimal configuration and no tool orchestration on your end. To use remote tools, you provide an MCP server configuration, which includes the MCP server URL and authentication headers. Groq's servers will connect to the MCP server, discover the available tools, pass them to the model, and execute any tools that are called server-side — all in a single API call.
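As an illustration only, the configuration you provide for a remote MCP server boils down to its URL plus any auth headers. The field names in this sketch are assumptions rather than confirmed Groq API fields, so check the remote tool calling documentation for the exact request format:

```python
# Hypothetical sketch: the field names below ("type", "server_url", "headers")
# are assumed, not confirmed Groq API parameters.
mcp_server_config = {
    "type": "mcp",
    "server_url": "https://example.com/mcp",            # remote MCP server endpoint
    "headers": {"Authorization": "Bearer YOUR_TOKEN"},  # auth forwarded to the server
}

# This configuration is passed alongside the request; Groq's servers then discover
# the server's tools, expose them to the model, and execute any calls it makes.
```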
Ideal for:
If you want the most control over tool execution logic, you can implement local tool calling. To do this, you manually write a set of functions and corresponding tool definitions. The tool definitions are provided to the model at inference time, and the model returns structured tool call requests (as in the example above: a JSON object specifying which function to call and what arguments to use). Your application code then executes the function that corresponds to the tool call request locally and sends the results back to the model for the final response.
These functions can connect to external resources such as databases, APIs, and external services, but they are "local" in the sense that they are executed on the same machine as the application code. You can also connect to MCP servers locally to execute tools. This requires implementing code to discover tools from the MCP server, provide them to the model at inference time, route any tool calls back to the MCP server for execution, and return the results to the model for the final response.
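For the local MCP case, a sketch using the official MCP Python SDK (the `mcp` package) might look like the following. The server command and the `get_weather` tool name are illustrative; the conversion step produces tool definitions in the same function-calling format shown earlier:

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Launch a local MCP server over stdio (command and args are illustrative)
    server = StdioServerParameters(command="python", args=["weather_mcp_server.py"])

    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # 1. Discover tools from the MCP server
            listing = await session.list_tools()

            # 2. Convert them into tool definitions for the model
            tools = [
                {
                    "type": "function",
                    "function": {
                        "name": t.name,
                        "description": t.description,
                        "parameters": t.inputSchema,
                    },
                }
                for t in listing.tools
            ]

            # 3. Provide `tools` to the model as before; when the model returns a
            #    tool call, route it back to the MCP server for execution
            result = await session.call_tool(
                "get_weather", arguments={"location": "San Francisco, CA"}
            )

            # 4. Send result.content back to the model as a tool message
            print(result.content)

asyncio.run(main())
```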
Ideal for:
| Pattern | You Provide | Execution Location | Orchestration | API Calls |
|---|---|---|---|---|
| Built-In | List of enabled built-in tools | Groq servers | Groq manages | Single call |
| Remote MCP | MCP server URL + auth | MCP server | Groq manages | Single call |
| Local | Tool definitions + implementation | Your code | You manage loop | Multiple (2+ per iteration) |
Many models support parallel tool use, where multiple tools can be called simultaneously in a single request. This is crucial for efficient agentic systems:
Without parallel tool use:
Query: "What's the weather in NYC and LA?"
Call 1: get_weather(location="NYC") → Wait for result
Call 2: get_weather(location="LA") → Wait for result
Final response

With parallel tool use:
Query: "What's the weather in NYC and LA?"
Call 1: [get_weather(location="NYC"), get_weather(location="LA")]
Both execute simultaneously → Final response

Parallel tool use dramatically reduces latency for queries that require multiple tool calls.
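On the application side, handling parallel tool calls just means executing every entry in `tool_calls` and appending one tool message per call before going back to the model, as in this sketch (reusing `client`, `tools`, `messages`, and `get_weather` from the earlier sketches):

```python
import json

message = response.choices[0].message
messages.append(message)

# The model may return several tool calls at once; execute each one and append
# a separate tool message whose tool_call_id matches that specific call
for tool_call in message.tool_calls:
    args = json.loads(tool_call.function.arguments)
    result = get_weather(**args)  # e.g. one call for NYC, one for LA
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "name": tool_call.function.name,
        "content": json.dumps(result),
    })

# A single follow-up request gives the model both results at once
final = client.chat.completions.create(
    model="llama-3.3-70b-versatile",  # illustrative model ID
    messages=messages,
    tools=tools,
)
```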
Because agentic workflows involve multiple inference calls, using Groq's fast inference can significantly improve the user experience of an agentic application:
With traditional inference speeds of 10-30 tokens/second, multi-tool workflows can feel painfully slow. Groq's inference speed of 300-1,000+ tokens/second makes these agentic experiences feel instantaneous.
Now that you understand the fundamentals of tool use and agentic systems, explore the specific patterns for using tools on the Groq API: