While LLMs excel at generating text, Groq's Compound systems take the next step. Compound is an advanced AI system designed to solve problems by taking action: it intelligently uses external tools, such as web search and code execution, alongside the powerful GPT-OSS 120B, Llama 4 Scout, and Llama 3.3 70B models. This gives it access to real-time information and lets it interact with external environments, providing more accurate, up-to-date, and capable responses than an LLM alone.
Groq's compound AI system should not be used by customers to process protected health information, as it is not a HIPAA Covered Cloud Service under Groq's Business Associate Addendum at this time. This system is also not currently available for use with regional / sovereign endpoints.
There are two compound systems available:
- groq/compound: supports multiple tool calls per request. This system is great for use cases that require multiple web searches or code executions per request.
- groq/compound-mini: supports a single tool call per request. This system is great for use cases that require a single web search or code execution per request. groq/compound-mini has an average of 3x lower latency than groq/compound.
Both systems support the following tools:
Custom user-provided tools are not supported at this time.
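The trade-off between the two systems above can be captured in a small helper. This is only a sketch: `pick_compound_system` and its `needs_multiple_tool_calls` flag are hypothetical names introduced for illustration, not part of the Groq SDK.

```python
# Model identifiers as given in this document.
COMPOUND = "groq/compound"
COMPOUND_MINI = "groq/compound-mini"


def pick_compound_system(needs_multiple_tool_calls: bool) -> str:
    """Return the compound system id suited to the request.

    Hypothetical helper: groq/compound handles several tool calls per
    request, while groq/compound-mini handles one with lower latency.
    """
    return COMPOUND if needs_multiple_tool_calls else COMPOUND_MINI


# A single lookup fits the lower-latency system.
print(pick_compound_system(needs_multiple_tool_calls=False))  # groq/compound-mini
# Research that chains several searches fits the full system.
print(pick_compound_system(needs_multiple_tool_calls=True))  # groq/compound
```

The returned string can be passed directly as the model parameter of a chat completion request.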
To use compound systems, change the model parameter to either groq/compound or groq/compound-mini:
from groq import Groq

# The client reads the API key from the GROQ_API_KEY environment variable
client = Groq()

completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "What is the current weather in Tokyo?",
        }
    ],
    # Change model to compound to use built-in tools
    # model="llama-3.3-70b-versatile",
    model="groq/compound",
)

print(completion.choices[0].message.content)

# Print all tool calls
# print(completion.choices[0].message.executed_tools)
And that's it!
When the API is called, it will intelligently decide when to use search or code execution to best answer the user's query. These tool calls are performed on the server side, so no additional setup is required on your part to use built-in tools.
In the above example, the API will use its built-in web search tool to find the current weather in Tokyo. Without compound systems, you might have needed to add your own custom tools to make API requests to a weather service, then perform multiple API calls to Groq to get a final result. With compound systems, you get a final result with a single API call.
To view the tools (search or code execution) used automatically by the compound system, check the executed_tools field in the response:
import os
from groq import Groq

client = Groq(api_key=os.environ.get("GROQ_API_KEY"))

response = client.chat.completions.create(
    model="groq/compound",
    messages=[
        {"role": "user", "content": "What did Groq release last week?"}
    ]
)

# Log the tools that were used to generate the response
print(response.choices[0].message.executed_tools)
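When a query is answered without any tool, it can be convenient to treat the field as an empty list instead of handling a missing value at every call site. The sketch below assumes that executed_tools may be absent or None in that case (an assumption, not confirmed by this document); `executed_tools_of` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for a real response so the example runs offline.

```python
from types import SimpleNamespace


def executed_tools_of(response) -> list:
    """Return the executed_tools list of the first choice, or [] if absent.

    Assumption: the field can be missing or None when no tool was used.
    """
    message = response.choices[0].message
    return getattr(message, "executed_tools", None) or []


# Stand-in responses (not real API output) that only mimic the attribute path.
no_tools = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="Hello"))]
)
one_tool = SimpleNamespace(
    choices=[
        SimpleNamespace(
            message=SimpleNamespace(content="Hi", executed_tools=["placeholder"])
        )
    ]
)

print(len(executed_tools_of(no_tools)))  # 0
print(len(executed_tools_of(one_tool)))  # 1
```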
The usage_breakdown field in responses provides detailed information about all the underlying models used during the compound system's execution.
"usage_breakdown": {
  "models": [
    {
      "model": "llama-3.3-70b-versatile",
      "usage": {
        "queue_time": 0.017298032,
        "prompt_tokens": 226,
        "prompt_time": 0.023959775,
        "completion_tokens": 16,
        "completion_time": 0.061639794,
        "total_tokens": 242,
        "total_time": 0.085599569
      }
    },
    {
      "model": "openai/gpt-oss-120b",
      "usage": {
        "queue_time": 0.019125835,
        "prompt_tokens": 903,
        "prompt_time": 0.033082052,
        "completion_tokens": 873,
        "completion_time": 1.776467372,
        "total_tokens": 1776,
        "total_time": 1.809549424
      }
    }
  ]
}
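To get a single figure per request, the per-model entries can be summed. The sketch below works on a plain dictionary shaped like the sample above, trimmed to the fields it uses; `summarize_usage` is a hypothetical helper. The summed total_time is model time added across entries, which is not necessarily the wall-clock latency of the request.

```python
def summarize_usage(usage_breakdown: dict) -> dict:
    """Sum token counts and model time across all entries in usage_breakdown."""
    totals = {
        "prompt_tokens": 0,
        "completion_tokens": 0,
        "total_tokens": 0,
        "total_time": 0.0,
    }
    tokens_by_model = {}
    for entry in usage_breakdown.get("models", []):
        usage = entry["usage"]
        for key in totals:
            totals[key] += usage.get(key, 0)
        # The same model may appear more than once, so accumulate per name.
        name = entry["model"]
        tokens_by_model[name] = tokens_by_model.get(name, 0) + usage.get("total_tokens", 0)
    return {"totals": totals, "tokens_by_model": tokens_by_model}


# Values copied from the sample response above.
sample = {
    "models": [
        {
            "model": "llama-3.3-70b-versatile",
            "usage": {
                "prompt_tokens": 226,
                "completion_tokens": 16,
                "total_tokens": 242,
                "total_time": 0.085599569,
            },
        },
        {
            "model": "openai/gpt-oss-120b",
            "usage": {
                "prompt_tokens": 903,
                "completion_tokens": 873,
                "total_tokens": 1776,
                "total_time": 1.809549424,
            },
        },
    ]
}

summary = summarize_usage(sample)
print(summary["totals"]["total_tokens"])  # 2018
print(summary["tokens_by_model"])
```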
Compound systems support versioning through the Groq-Model-Version header. In most cases, you won't need to change anything since you'll automatically be on the latest stable version. To view the latest changes to the compound systems, see the Compound Changelog.
System | Default Version (no header) | Latest Version (Groq-Model-Version: latest) |
---|---|---|
groq/compound | 2025-07-23 (stable) | 2025-08-16 (prerelease) |
groq/compound-mini | 2025-07-23 (stable) | 2025-08-16 (prerelease) |
- Default (no header): Uses version 2025-07-23, the latest stable version that has been fully tested and deployed.
- Latest (Groq-Model-Version: latest): Uses version 2025-08-16, the prerelease version with the newest features before they're rolled out to everyone.
To use a specific version, pass the version in the Groq-Model-Version header:
curl -X POST "https://api.groq.com/openai/v1/chat/completions" \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Groq-Model-Version: latest" \
  -d '{
    "model": "groq/compound",
    "messages": [{"role": "user", "content": "What is the weather today?"}]
  }'
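The same header can presumably be sent from Python. The sketch below assumes the Groq client constructor accepts a default_headers argument that is attached to every request; `version_headers` and `ask_compound` are hypothetical helpers. The SDK import is deferred so the header helper can be run without the SDK installed or a network connection.

```python
import os

VERSION_HEADER = "Groq-Model-Version"


def version_headers(version: str = "latest") -> dict:
    """Build the header that selects a compound system version."""
    if not version:
        raise ValueError("version must be a non-empty string")
    return {VERSION_HEADER: version}


def ask_compound(prompt: str, version: str = "latest") -> str:
    """Send one prompt to groq/compound, pinned to the given version."""
    # Imported lazily so version_headers works without the SDK installed.
    from groq import Groq

    # Assumption: default_headers is applied to every request from this client.
    client = Groq(
        api_key=os.environ.get("GROQ_API_KEY"),
        default_headers=version_headers(version),
    )
    completion = client.chat.completions.create(
        model="groq/compound",
        messages=[{"role": "user", "content": prompt}],
    )
    return completion.choices[0].message.content


print(version_headers("2025-07-23"))  # {'Groq-Model-Version': '2025-07-23'}
```

Calling ask_compound("What is the weather today?") would mirror the curl request above, and requires a valid GROQ_API_KEY.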