While LLMs excel at generating text, compound-beta takes the next step. It is an advanced AI system designed to solve problems by taking action: it intelligently uses external tools, starting with web search and code execution, alongside the powerful Llama 4 models and the Llama 3.3 70b model. This gives it access to real-time information and lets it interact with external environments, so it can provide more accurate, up-to-date, and capable responses than an LLM alone.
There are two compound systems available:
- compound-beta: supports multiple tool calls per request. This system is great for use cases that require multiple web searches or code executions per request.
- compound-beta-mini: supports a single tool call per request. This system is great for use cases that require a single web search or code execution per request. compound-beta-mini has an average of 3x lower latency than compound-beta.
Both systems support the following tools:
- Web search
- Code execution
Custom user-provided tools are not supported at this time.
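As a rough illustration of the trade-off between the two systems, a small helper could choose one based on whether a request is expected to need more than one tool call. The function name `pick_compound_model` and its parameter are hypothetical and not part of the Groq SDK; only the two model names come from this page.

```python
def pick_compound_model(needs_multiple_tool_calls: bool) -> str:
    """Return the compound system name suited to a request (hypothetical helper)."""
    if needs_multiple_tool_calls:
        # Several web searches or code executions may be needed in one request
        return "compound-beta"
    # One tool call is enough, so prefer the lower-latency system
    return "compound-beta-mini"


print(pick_compound_model(True))   # compound-beta
print(pick_compound_model(False))  # compound-beta-mini
```

The returned string can be passed directly as the model parameter of a chat completion request.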
To use compound systems, change the model parameter to either compound-beta or compound-beta-mini:
from groq import Groq

client = Groq()

completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "What is the current weather in Tokyo?",
        }
    ],
    # Change model to compound-beta to use agentic tooling
    # model="llama-3.3-70b-versatile",
    model="compound-beta",
)

print(completion.choices[0].message.content)
# Print all tool calls
# print(completion.choices[0].message.executed_tools)
And that's it!
When the API is called, it will intelligently decide when to use search or code execution to best answer the user's query. These tool calls are performed on the server side, so no additional setup is required on your part to use agentic tooling.
In the above example, the API will use its built-in web search tool to find the current weather in Tokyo. Without compound systems, you might have needed to add your own custom tools to make API requests to a weather service, then make multiple API calls to Groq to get a final result. With compound systems, you get a final result with a single API call.
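To make the single-call pattern concrete, here is a minimal sketch of a wrapper that sends one user message and returns the final answer text. The name `ask_compound` is hypothetical; the function only relies on the `chat.completions.create` call and the `choices[0].message.content` field already used in the example above, and it accepts the client as an argument so that it can also be exercised with a stand-in object.

```python
from typing import Any


def ask_compound(client: Any, question: str, model: str = "compound-beta") -> str:
    """Send one user message and return the answer text (hypothetical helper)."""
    # A single request: any web search or code execution happens server side
    completion = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],
    )
    return completion.choices[0].message.content


# Usage (requires the groq package and a GROQ_API_KEY environment variable):
# from groq import Groq
# print(ask_compound(Groq(), "What is the current weather in Tokyo?"))
```

Passing `model="compound-beta-mini"` switches systems without any other change to the call.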
To view the tools (search or code execution) used automatically by the compound system, check the executed_tools field in the response:
import os
from groq import Groq

client = Groq(api_key=os.environ.get("GROQ_API_KEY"))

response = client.chat.completions.create(
    model="compound-beta",
    messages=[
        {"role": "user", "content": "What did Groq release last week?"}
    ]
)

# Log the tools that were used to generate the response
print(response.choices[0].message.executed_tools)
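When logging tool usage across many requests, it can help to read the field defensively. The sketch below assumes that executed_tools may be missing or empty when the system answers without calling a tool; that behavior is an assumption, not something this page states. The helper names are hypothetical, and nothing is assumed about the shape of the individual entries.

```python
from typing import Any, List


def executed_tools_of(response: Any) -> List[Any]:
    """Return the executed tools of the first choice, or an empty list."""
    message = response.choices[0].message
    # Assumption: the field can be absent or None when no tool was called
    tools = getattr(message, "executed_tools", None)
    return list(tools) if tools else []


def used_tools(response: Any) -> bool:
    """Report whether the compound system ran at least one tool."""
    return len(executed_tools_of(response)) > 0
```

With the response from the example above, `used_tools(response)` gives a quick check before printing or storing the full list.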