Some models and systems on Groq have native support for automatic code execution, allowing them to perform calculations, run code snippets, and solve computational problems in real-time.
Only Python is currently supported for code execution.
The use of this tool with a supported model in GroqCloud is not a HIPAA Covered Cloud Service under Groq's Business Associate Addendum at this time. This tool is also not available currently for use with regional / sovereign endpoints.
Built-in code execution is supported for the following models and systems:
| Model ID | Model |
|---|---|
| openai/gpt-oss-20b | OpenAI GPT-OSS 20B |
| openai/gpt-oss-120b | OpenAI GPT-OSS 120B |
| compound-beta | Compound Beta |
| compound-beta-mini | Compound Beta Mini |
For a comparison between the compound-beta and compound-beta-mini systems, and more information on their extra capabilities, see the Compound Systems page.

To use code execution with Groq's Compound systems, set the model parameter to one of the supported models or systems.
```python
import os
from groq import Groq

client = Groq(api_key=os.environ.get("GROQ_API_KEY"))

response = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Calculate the square root of 101 and show me the Python code you used",
        }
    ],
    model="compound-beta-mini",
)

# Final output
print(response.choices[0].message.content)

# Reasoning + internal tool calls
print(response.choices[0].message.reasoning)

# Code execution tool call
if response.choices[0].message.executed_tools:
    print(response.choices[0].message.executed_tools[0])
```
And that's it!
When the API is called, it will intelligently decide when to use code execution to best answer the user's query. Code execution is performed on the server side in a secure sandboxed environment, so no additional setup is required on your part.
This is the final response from the model, containing the answer based on code execution results. The model combines computational results with explanatory text to provide a comprehensive response. Use this as the primary output for user-facing applications.
The square root of 101 is: 10.04987562112089

Here is the Python code I used:

```python
import math
print("The square root of 101 is: ")
print(math.sqrt(101))
```
This shows the model's internal reasoning process and the Python code it executed to solve the problem. You can inspect this to understand how the model approached the computational task and what code it generated. This is useful for debugging and understanding the model's decision-making process.
```
<tool>python(import math; print("The square root of 101 is: "); print(math.sqrt(101)))</tool>
<output>The square root of 101 is: 10.04987562112089</output>
```
This contains the raw executed tools data, including the generated Python code, execution output, and metadata. You can use this to access the exact code that was run and its results programmatically.
```json
{
  "string": "",
  "name": "",
  "index": 0,
  "type": "python",
  "arguments": "{\"code\": \"import math; print(\"The square root of 101 is: \"); print(math.sqrt(101))\"}",
  "output": "The square root of 101 is: \n10.04987562112089\n",
  "search_results": { "results": [] }
}
```
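If you want to work with this data programmatically, a small helper can pull the generated code and its output from an executed tool entry. This is a minimal sketch based on the field names shown in the sample above ("type", "arguments", "output"); treat them as illustrative rather than a formal schema, and handle missing or malformed fields defensively.

```python
import json

def summarize_executed_tool(tool: dict) -> dict:
    """Extract the generated Python code and its output from an
    executed tool entry, using the field names seen in the sample
    response above (an assumption, not a guaranteed schema)."""
    code = ""
    if tool.get("type") == "python":
        try:
            # "arguments" is a JSON string like {"code": "..."}
            code = json.loads(tool.get("arguments", "{}")).get("code", "")
        except json.JSONDecodeError:
            # Fall back to the raw string if the escaping is irregular
            code = tool.get("arguments", "")
    return {"code": code, "output": tool.get("output", "")}

# Hypothetical sample shaped like the response above
sample = {
    "type": "python",
    "arguments": '{"code": "import math; print(math.sqrt(101))"}',
    "output": "10.04987562112089\n",
}
print(summarize_executed_tool(sample))
```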
To use code execution with OpenAI's GPT-OSS models on Groq (20B & 120B), add the code_interpreter
tool to your request.
```python
from groq import Groq

client = Groq(api_key="your-api-key-here")

response = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Calculate the square root of 12345. Output only the final answer.",
        }
    ],
    model="openai/gpt-oss-20b",  # or "openai/gpt-oss-120b"
    tool_choice="required",
    tools=[
        {
            "type": "code_interpreter"
        }
    ],
)

# Final output
print(response.choices[0].message.content)

# Reasoning + internal tool calls
print(response.choices[0].message.reasoning)

# Code execution tool call
print(response.choices[0].message.executed_tools[0])
```
When the API is called, it will use code execution to best answer the user's query. Code execution is performed on the server side in a secure sandboxed environment, so no additional setup is required on your part.
This is the final response from the model, containing the answer based on code execution results. The model combines computational results with explanatory text to provide a comprehensive response.
```
111.1080555135405112450044
```
This shows the model's internal reasoning process and the Python code it executed to solve the problem. You can inspect this to understand how the model approached the computational task and what code it generated.
```
We need sqrt(12345). Compute.math.sqrt returns 111.1080555... Let's compute with precision.Let's get more precise.We didn't get output because decimal sqrt needs context. Let's compute.It didn't output because .sqrt() might not be available for Decimal? Actually Decimal has sqrt method? There is sqrt in Decimal from Python 3.11? Actually it's decimal.Decimal.sqrt() available. But maybe need import Decimal. Let's try.It outputs nothing? Actually maybe need to print.
```
This contains the raw executed tools data, including the generated Python code, execution output, and metadata. You can use this to access the exact code that was run and its results programmatically.
```
{
  name: 'python',
  index: 0,
  type: 'function',
  arguments: 'import math\nmath.sqrt(12345)\n',
  search_results: { results: null },
  code_results: [ { text: '111.1080555135405' } ]
}
```
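The reasoning trace above shows the model reaching for Python's decimal module to get more digits than float precision allows. You can reproduce that approach locally with a minimal sketch; the precision setting of 25 significant digits is an assumption chosen to match the length of the answer shown above.

```python
from decimal import Decimal, getcontext

# Compute sqrt(12345) to 25 significant digits, well beyond the
# ~16 digits that a float (math.sqrt) can represent.
getcontext().prec = 25  # assumed precision, matching the answer above
value = Decimal(12345).sqrt()
print(value)
```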
When you make a request to a model or system that supports code execution, the model decides whether running code would help, generates Python, executes it in a secure sandbox, and incorporates the results into its response.

Ask the model to perform complex calculations, and it will automatically execute Python code to compute the result.
```python
import os
from groq import Groq

client = Groq(api_key=os.environ.get("GROQ_API_KEY"))

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Calculate the monthly payment for a $30,000 loan over 5 years at 6% annual interest rate using the standard loan payment formula. Use python code.",
        }
    ],
    model="compound-beta-mini",
)

print(chat_completion.choices[0].message.content)
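You can sanity-check the model's answer yourself with the standard amortization formula, M = P·r / (1 − (1 + r)^−n), where P is the principal, r the monthly rate, and n the number of monthly payments. A quick local sketch:

```python
# Standard loan payment formula: M = P * r / (1 - (1 + r) ** -n)
P = 30_000      # principal
r = 0.06 / 12   # monthly interest rate (6% annual)
n = 5 * 12      # number of monthly payments

monthly_payment = P * r / (1 - (1 + r) ** -n)
print(f"${monthly_payment:.2f}")  # ≈ $579.98
```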
Provide code snippets to check for errors or understand their behavior. The model can execute the code to verify functionality.
```python
import os
from groq import Groq

client = Groq(api_key=os.environ.get("GROQ_API_KEY"))

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Will this Python code raise an error? `import numpy as np; a = np.array([1, 2]); b = np.array([3, 4, 5]); print(a + b)`",
        }
    ],
    model="compound-beta-mini",
)

print(chat_completion.choices[0].message.content)
```
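For reference, you can confirm the answer locally: arrays of shape (2,) and (3,) are not broadcast-compatible, so the addition raises a ValueError. A minimal sketch (assumes numpy is installed):

```python
import numpy as np

# Shapes (2,) and (3,) cannot be broadcast together,
# so this addition raises a ValueError.
a = np.array([1, 2])
b = np.array([3, 4, 5])
try:
    print(a + b)
    raised = False
except ValueError as err:
    raised = True
    print(f"ValueError: {err}")
```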
Code execution is priced at $0.00005 per second of execution.
Please see the Pricing page for more information.
Code execution functionality is powered by E2B, a secure cloud environment for AI code execution. E2B provides isolated, ephemeral sandboxes that allow models to run code safely without access to external networks or sensitive data.