Model Context Protocol (MCP) is an open standard that enables AI applications to connect with external systems like databases, APIs, and tools. Think of MCP as a "USB-C port for AI applications": it provides a standardized way for AI models to access and interact with your data and workflows.
Remote MCP on Groq is currently in beta. Please share your feedback in our Community.
As a developer, you know how powerful AI can be when it has access to the right information and tools. But connecting AI models to your existing systems has traditionally required custom integrations for each service. MCP solves this problem by creating a universal protocol that lets AI models securely connect to any external system.
With MCP, you can build AI agents that query live data sources, call external APIs, and automate multi-step workflows without writing a custom integration for each service.
Groq's implementation of MCP pairs the protocol's universal connectivity with Groq's fast inference and full compatibility with OpenAI's remote MCP API.
Remote MCP is available on all models that support tool use:
| Model ID | Model |
|---|---|
| `openai/gpt-oss-20b` | GPT-OSS 20B |
| `openai/gpt-oss-120b` | GPT-OSS 120B |
| `qwen/qwen3-32b` | Qwen3 32B |
| `moonshotai/kimi-k2-instruct-0905` | Kimi K2 Instruct |
| `meta-llama/llama-4-maverick-17b-128e-instruct` | Llama 4 Maverick |
| `meta-llama/llama-4-scout-17b-16e-instruct` | Llama 4 Scout |
| `llama-3.3-70b-versatile` | Llama 3.3 70B |
| `llama-3.1-8b-instant` | Llama 3.1 8B Instant |
MCP works by adding external tools to your AI model requests through the `tools` parameter. Each MCP tool specifies:

- `type`: always `"mcp"` for remote MCP servers
- `server_label`: a short name that identifies the server
- `server_url`: the URL of the remote MCP server
- Optional fields such as `headers` (for authentication), `server_description` (extra context for the model), and `require_approval`
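Put together, a single entry in the `tools` array looks like the sketch below. All values here are placeholders; only `type`, `server_label`, and `server_url` are required, as in the Hugging Face example that follows.

```javascript
// One MCP tool definition (placeholder values; swap in your server's details).
const mcpTool = {
  type: "mcp",                                    // identifies this tool as a remote MCP server
  server_label: "example_server",                 // short label the model uses to refer to the server
  server_url: "https://example.com/mcp",          // the remote MCP server endpoint
  headers: { "x-api-key": "<YOUR_API_KEY>" },     // optional: auth headers forwarded to the server
  server_description: "What this server is for",  // optional: guides the model's tool selection
  require_approval: "never",                      // optional: approval policy for tool calls
};
```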
Here's a simple example using Hugging Face's MCP server to search for trending AI models.
```javascript
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.GROQ_API_KEY,
baseURL: "https://api.groq.com/openai/v1",
});
const response = await client.responses.create({
model: "openai/gpt-oss-120b",
input: "What models are trending on Huggingface?",
tools: [
{
type: "mcp",
server_label: "Huggingface",
server_url: "https://huggingface.co/mcp",
}
]
});
console.log(response);
```
When using MCP with the Responses API, you'll receive a structured response containing:
```json
{
"id": "resp_01k59jhydefcd8wb7hbc460yav",
"object": "response",
"status": "completed",
"output": [
{
"type": "mcp_list_tools",
"id": "mcpl_1720577121",
"server_label": "Huggingface",
"tools": [...] // Available tools from the MCP server
},
{
"type": "reasoning",
"content": [
{
"type": "reasoning_text",
"text": "User asks: 'What are the trending models on Huggingface?' Need to fetch trending models..."
}
]
},
{
"type": "mcp_call",
"server_label": "Huggingface",
"name": "model_search",
"arguments": "{\"limit\":10,\"sort\":\"trendingScore\"}",
"output": "Showing first 10 models matching sorted by trendingScore..."
},
{
"type": "message",
"role": "assistant",
"content": [
{
"type": "output_text",
"text": "Here are the top 10 trending models on Hugging Face..."
}
]
}
]
}
```
For detailed information on configuring the OpenAI client with Groq, see our Responses API documentation. Groq's remote MCP support is fully compatible with OpenAI's remote MCP API.
MCP servers have access to all data in your AI model's context, including your messages, system prompts, and previous conversation history. Only connect to MCP servers from trusted sources that you control or verify. Malicious servers could potentially exfiltrate sensitive information from your requests. Always review the server's documentation and security practices before integration.
Groq's Responses API is specifically designed for agentic workflows that involve multiple steps and tool interactions. Unlike chat completions, which treat everything as conversational text, the Responses API treats actions as first-class citizens: tool discovery (`mcp_list_tools`), model reasoning (`reasoning`), and tool execution (`mcp_call`) each appear as distinct, typed items in the output array. The Responses API was also built from the ground up with MCP in mind, whereas chat completions require retrofitting MCP onto a conversation-based API, so multi-step tool workflows are easier to inspect and control.
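Because each step is its own typed item, you can walk a run programmatically. A minimal sketch, assuming `response` was returned by `client.responses.create` as in the example above:

```javascript
// Inspect each step of an agentic run by switching on the output item type.
for (const item of response.output) {
  switch (item.type) {
    case "mcp_list_tools":
      console.log(`${item.server_label} exposes ${item.tools.length} tools`);
      break;
    case "mcp_call":
      console.log(`Called ${item.name} with ${item.arguments}`);
      break;
    case "message":
      // The final assistant answer lives in the message content items.
      console.log(item.content[0].text);
      break;
  }
}
```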
Connect to Firecrawl's MCP server for automated web scraping and data extraction. You'll need a Firecrawl API key to authenticate with their API.
Important Notes:

- Replace `<APIKEY>` in the `server_url` below with your actual Firecrawl API key
- Use the optional `server_description` field to help the AI model understand how to use the tool effectively

```javascript
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.GROQ_API_KEY,
baseURL: "https://api.groq.com/openai/v1",
});
const response = await client.responses.create({
model: "openai/gpt-oss-120b",
input: [
{
type: "message",
role: "user",
content: "What are the production models on https://console.groq.com/docs/models?"
}
],
tools: [
{
type: "mcp",
server_label: "firecrawl",
server_description: "Web scraping and content extraction capabilities",
server_url: "https://mcp.firecrawl.dev/<APIKEY>/v2/mcp",
require_approval: "never"
}
],
stream: false
});
console.log(response);
```
A typical Firecrawl MCP response includes tool discovery, reasoning, and web scraping execution:
```json
{
"id": "resp_01k5sv3np4fydva2jd9zzknbdv",
"object": "response",
"status": "completed",
"output": [
{
"type": "mcp_list_tools",
"server_label": "firecrawl",
"tools": [
{
"name": "firecrawl_scrape",
"description": "Scrape content from a single URL with advanced options. This is the most powerful, fastest and most reliable scraper tool..."
},
{
"name": "firecrawl_map",
"description": "Map a website to discover all indexed URLs on the site..."
},
{
"name": "firecrawl_search",
"description": "Search the web and optionally extract content from search results..."
},
{
"name": "firecrawl_crawl",
"description": "Starts a crawl job on a website and extracts content from all pages..."
},
{
"name": "firecrawl_extract",
"description": "Extract structured information from web pages using LLM capabilities..."
}
// ... additional tools available
]
},
{
"type": "reasoning",
"content": [{
"type": "reasoning_text",
"text": "User asks: 'What are the production models on https://console.groq.com/docs/models?' Need to retrieve content from that URL. Use firecrawl_scrape..."
}]
},
{
"type": "mcp_call",
"server_label": "firecrawl",
"name": "firecrawl_search",
"arguments": "{\"limit\":5,\"query\":\"Groq production models console.groq.com/docs/models\",\"scrapeOptions\":{\"formats\":[\"markdown\"]}}",
"output": "{\"web\":[{\"url\":\"https://console.groq.com/docs/models\",\"title\":\"Supported Models - GroqDocs\",\"markdown\":\"# Production Models...\"}]}"
},
{
"type": "message",
"role": "assistant",
"content": [{
"type": "output_text",
"text": "Here are the **Production‑grade models** listed on Groq's documentation. These models meet high standards for speed, quality, and reliability:\n\n| Model ID | Developer | Context Window | Max Completion Tokens |\n|----------|-----------|----------------|------------------------|\n| llama-3.1-8b-instant | Meta | 131,072 | 131,072 |\n| llama-3.3-70b-versatile | Meta | 131,072 | 32,768 |\n| openai/gpt-oss-120b | OpenAI | 131,072 | 65,536 |\n..."
}]
}
]
}
```
This example demonstrates how Firecrawl's MCP server provides multiple web scraping tools (`firecrawl_scrape`, `firecrawl_search`, `firecrawl_map`, etc.) and intelligently selects the best approach to extract the requested information from web pages.
Enable natural language web search for your AI agents with Parallel's MCP server. You'll need a Parallel API key to authenticate with their API.
```javascript
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.GROQ_API_KEY,
baseURL: "https://api.groq.com/openai/v1",
});
const response = await client.responses.create({
model: "openai/gpt-oss-120b",
input: "What are the best models for agentic workflows on Groq? Search only on console.groq.com",
tools: [
{
type: "mcp",
server_label: "parallel_web_search",
server_url: "https://mcp.parallel.ai/v1beta/search_mcp/",
headers: {
"x-api-key": "<PARALLEL_API_KEY>"
},
require_approval: "never"
}
]
});
console.log(response);
```
```json
{
"id": "resp_01k59pzd4bfe698awmye9cnd99",
"object": "response",
"status": "completed",
"created_at": 1758041453,
"output": [
{
"type": "mcp_list_tools",
"id": "mcpl_2365835890",
"status": "completed",
"server_label": "parallel_web_search",
"tools": [
{
"annotations": null,
"description": "Purpose: Perform web searches for a given objective and return result... [truncated]",
"input_schema": {
"properties": {
"include_domains": {
"items": {
"type": "string"
},
"title": "Include Domains",
"type": "array"
},
"objective": {
"title": "Objective",
"type": "string"
},
"search_queries": {
"items": {
"type": "string"
},
"title": "Search Queries",
"type": "array"
},
"search_type": {
"enum": [
"list",
"targeted",
"general",
"single_page"
],
"title": "Search Type",
"type": "string"
}
},
"required": [
"objective",
"search_queries",
"search_type",
"include_domains"
],
"title": "simple_search_toolArguments",
"type": "object"
},
"name": "web_search_preview"
}
]
},
{
"type": "reasoning",
"id": "resp_01k59pzd4bfe6s5m93vee5mzfb",
"status": "completed",
"content": [
{
"type": "reasoning_text",
"text": "We need to answer: best models for agentic workflows on Groq, search only on console.groq.com... [truncated]"
}
],
"summary": []
},
{
"type": "mcp_call",
"id": "mcpc_01k59pzd4bfe7vwk5ffp35f2g2",
"status": "completed",
"server_label": "parallel_web_search",
"name": "web_search_preview",
"arguments": "{"include_domains":["console.groq.com"],"objective":"Find the best models for agentic workflows on Groq according to console.groq.com documentation or listings... [truncated]",
"approval_request_id": null,
"output": "[truncated]"
},
{
"type": "message",
"id": "msg_01k59pzd4bfe7a2k07v4n3bjdn",
"status": "completed",
"role": "assistant",
"content": [
{
"type": "output_text",
"text": "**Best Groq models for agentic / tool‑using workflows**... [truncated]",
"annotations": [],
"logprobs": null
}
]
}
],
"model": "openai/gpt-oss-120b",
"tools": [
{
"type": "mcp",
"server_label": "parallel_web_search",
"server_url": "https://mcp.parallel.ai/<redacted>",
"headers": {
"x-api-key": "<PARALLEL_API_KEY>"
},
"allowed_tools": null,
"require_approval": "never",
"server_description": null
}
]
}
```
**Best Groq models for agentic / tool‑using workflows**

(All information pulled from the official Groq console documentation – only results from `console.groq.com` were consulted.)
| Model / System | Why it's a top pick for agentic workflows | Key agentic features | Typical use‑cases |
|---|---|---|---|
| Groq Compound (`groq/compound`) | Fully‑featured "AI system" that already bundles the most useful tools (web‑search, code‑execution, browser‑automation, etc.). No extra tool‑definition work is required. | • Built‑in tools are auto‑invoked <br/>• Handles parallel tool calls <br/>• Extremely fast (~450 tps) | Autonomous agents that need to browse the web, run code, or scrape data without custom tooling. |
| Groq Compound‑Mini (`groq/compound-mini`) | Same tool set as Compound but lighter‑weight (fewer parameters, lower cost) – great for rapid prototyping or when the latency budget is tight. | • Same built‑in tools as Compound <br/>• ~450 tps, lower memory footprint | Quick proofs‑of‑concept, edge‑deployed agents, or budget‑constrained pipelines. |
| Llama 3.3‑70B‑versatile (`llama-3.3-70b-versatile`) | Flagship production model with full tool‑use support, parallel‑tool capability, and JSON‑mode. | • ✅ Tool‑use & parallel tool‑use <br/>• ✅ JSON‑mode for structured output <br/>• 131 k context window | Multi‑step reasoning, data‑retrieval + computation pipelines, mixed‑modal (vision + text) agents. |
| Llama 3.1‑8B‑instant (`llama-3.1-8b-instant`) | Smallest "instant" model that still supports tool use – ideal when you need many concurrent agents or very low latency. | • ✅ Tool‑use <br/>• ✅ Parallel tool‑use <br/>• Fast inference (~450 tps) | High‑throughput agent farms, real‑time chat assistants, low‑cost routing agents. |
| Meta‑Llama 4 Scout‑17B‑16E‑instruct (`meta-llama/llama-4-scout-17b-16e-instruct`) | MoE model with strong reasoning & tool‑use, offering a good balance of size and performance. | • ✅ Tool‑use & parallel tool‑use <br/>• ✅ JSON‑mode <br/>• 131 k context | Complex planning agents, code‑generation + execution loops, multi‑tool orchestration. |
| Meta‑Llama 4 Maverick‑17B‑128E‑instruct (`meta-llama/llama-4-maverick-17b-128e-instruct`) | Slightly larger MoE variant with higher token‑speed (~500 tps) and robust tool handling. | • ✅ Tool‑use & parallel tool‑use <br/>• ✅ JSON‑mode <br/>• 131 k context | Heavy‑duty autonomous agents, research assistants that call many APIs in parallel. |
| OpenAI GPT‑OSS 120B (`openai/gpt-oss-120b`) | Open‑weight model with full tool‑use support (single‑tool only, but still very capable). | • ✅ Tool‑use (no parallel) <br/>• ✅ JSON‑mode | Scenarios where you only need one tool per turn (e.g., single‑API lookup) and want the large‑parameter knowledge base. |
| OpenAI GPT‑OSS 20B (`openai/gpt-oss-20b`) | Smaller OpenAI model that also supports tool‑use; useful when you need lower cost but still want OpenAI‑style reasoning. | • ✅ Tool‑use (no parallel) <br/>• ✅ JSON‑mode | Lightweight agents that call a single function (e.g., weather or simple database query). |
| Moonshot Kimi K2 0905 (`moonshotai/kimi-k2-instruct-0905`) | Designed specifically for agentic intelligence; excellent at tool use, coding, and autonomous problem‑solving. | • ✅ Tool‑use & parallel tool‑use <br/>• ✅ JSON‑mode <br/>• 256 k context window | High‑complexity agents that need huge context (e.g., long document analysis + tool calls). |
| Scenario | Recommended model(s) |
|---|---|
| Full‑featured autonomous agents (web search, code exec, browser automation) | `groq/compound` (or `groq/compound-mini` for lower cost) |
| High‑throughput many‑agent pipelines | `llama-3.1-8b-instant` |
| Best overall performance with parallel tool calls | `llama-3.3-70b-versatile` |
| MoE‑style strong reasoning + tool use | `meta-llama/llama-4-scout-17b-16e-instruct` / `meta-llama/llama-4-maverick-17b-128e-instruct` |
| When you need massive context (>128 k tokens) | `moonshotai/kimi-k2-instruct-0905` |
| Open‑source large‑parameter knowledge base | `openai/gpt-oss-120b` (single‑tool) or `openai/gpt-oss-20b` (lighter) |
These models collectively give you the best blend of tool‑use capability, parallel execution, large context windows, and Groq’s ultra‑fast inference, making them the top choices for building agentic workflows on the Groq platform.
Automate your invoicing process with Stripe's MCP server. You'll need a Stripe API key with appropriate permissions.
```javascript
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.GROQ_API_KEY,
baseURL: "https://api.groq.com/openai/v1",
});
const response = await client.responses.create({
model: "openai/gpt-oss-120b",
input: "Create an invoice for $100 for customer Groq Labs Testing using Stripe.",
tools: [
{
type: "mcp",
server_label: "Stripe",
server_url: "https://mcp.stripe.com",
headers: {
Authorization: "Bearer <STRIPE_TOKEN>"
},
require_approval: "never"
}
]
});
console.log(response);
```
A typical Stripe MCP response includes tool discovery, reasoning, and sequential action execution to create an invoice:
```json
{
"id": "resp_01k59tasz2eg4as5q4n37kaqch",
"object": "response",
"status": "completed",
"output": [
{
"type": "mcp_list_tools",
"server_label": "Stripe",
"tools": [
{
"name": "create_customer",
"description": "This tool will create a customer in Stripe."
},
{
"name": "create_product",
"description": "This tool will create a product in Stripe."
},
{
"name": "create_price",
"description": "This tool will create a price in Stripe."
},
{
"name": "create_invoice",
"description": "This tool will create an invoice in Stripe."
},
{
"name": "create_invoice_item",
"description": "This tool will create an invoice item in Stripe."
},
{
"name": "finalize_invoice",
"description": "This tool will finalize an invoice in Stripe."
}
// ... additional tools available
]
},
{
"type": "reasoning",
"content": [{
"type": "reasoning_text",
"text": "We need to create an invoice for $100 for customer named Groq Labs Testing. Steps: 1. Create customer 2. Create product and price 3. Create invoice 4. Add invoice item 5. Finalize invoice..."
}]
},
{
"type": "mcp_call",
"server_label": "Stripe",
"name": "create_customer",
"arguments": "{\"name\":\"Groq Labs Testing\"}",
"output": "{\"id\":\"cus_T4BNAMWCQH1Po3\"}"
},
{
"type": "mcp_call",
"server_label": "Stripe",
"name": "create_product",
"arguments": "{\"name\":\"Groq Labs Testing Invoice\"}",
"output": "{\"id\":\"prod_T4BNkXTDpoQjEo\"}"
},
{
"type": "mcp_call",
"server_label": "Stripe",
"name": "create_price",
"arguments": "{\"currency\":\"usd\",\"product\":\"prod_T4BNkXTDpoQjEo\",\"unit_amount\":10000}",
"output": "{\"id\":\"price_1S830L4C7KtbdMSK1Oy4u38s\"}"
},
{
"type": "mcp_call",
"server_label": "Stripe",
"name": "create_invoice",
"arguments": "{\"customer\":\"cus_T4BNAMWCQH1Po3\"}",
"output": "{\"id\":\"in_1S830M4C7KtbdMSKpV3FA37N\",\"status\":\"draft\"}"
},
{
"type": "mcp_call",
"server_label": "Stripe",
"name": "create_invoice_item",
"arguments": "{\"customer\":\"cus_T4BNAMWCQH1Po3\",\"invoice\":\"in_1S830M4C7KtbdMSKpV3FA37N\",\"price\":\"price_1S830L4C7KtbdMSK1Oy4u38s\"}",
"output": "{\"id\":\"ii_1S830N4C7KtbdMSKWML91TiW\"}"
},
{
"type": "mcp_call",
"server_label": "Stripe",
"name": "finalize_invoice",
"arguments": "{\"invoice\":\"in_1S830M4C7KtbdMSKpV3FA37N\"}",
"output": "{\"id\":\"in_1S830M4C7KtbdMSKpV3FA37N\",\"status\":\"open\",\"url\":\"https://invoice.stripe.com/i/acct_1S82ye4C7KtbdMSK/test_...?s=ap\"}"
},
{
"type": "message",
"role": "assistant",
"content": [{
"type": "output_text",
"text": "Your invoice has been created and finalized for **$100** (USD) for the customer **Groq Labs Testing**..."
}]
}
]
}
```
This example demonstrates how MCP orchestrates multiple Stripe API calls to complete a complex business workflow: creating a customer, product, price, invoice, and invoice item, then finalizing the invoice, all in a single request.
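If your application needs the hosted payment link afterwards, you can pull it from the `finalize_invoice` step in the response. A small sketch, assuming a `response` shaped like the example above (each `mcp_call`'s `output` field is a JSON-encoded string):

```javascript
// Locate the finalize_invoice call and parse its output for the hosted invoice URL.
const finalizeCall = response.output.find(
  (item) => item.type === "mcp_call" && item.name === "finalize_invoice"
);
if (finalizeCall) {
  const invoice = JSON.parse(finalizeCall.output);
  console.log(`Invoice ${invoice.id} is ${invoice.status}: ${invoice.url}`);
}
```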
Other payment processors also support MCP. For example, PayPal's MCP server allows you to create invoices, manage payments, and more.
You can connect to multiple MCP servers in a single request, allowing AI to coordinate across different systems:
```json
{
"tools": [
{
"type": "mcp",
"server_label": "parallel_web_search",
"server_url": "https://mcp.parallel.ai/<redacted>",
"headers": {
"x-api-key": "<PARALLEL_API_KEY>"
}
},
{
"type": "mcp",
"server_label": "Stripe",
"server_url": "https://mcp.stripe.com",
"headers": { "Authorization": "Bearer <STRIPE_TOKEN>" }
}
]
}
```
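A full request with multiple servers looks like any single-server call. The sketch below reuses the Parallel and Stripe servers from the earlier examples (keys and the prompt are placeholders, and `client` is configured as before), letting the model research on the web and then act in Stripe within one request:

```javascript
// One request, two MCP servers: the model can search the web, then act in Stripe.
const response = await client.responses.create({
  model: "openai/gpt-oss-120b",
  input: "Find ACME Corp's billing contact on the web, then draft a $100 invoice for them in Stripe.",
  tools: [
    {
      type: "mcp",
      server_label: "parallel_web_search",
      server_url: "https://mcp.parallel.ai/v1beta/search_mcp/",
      headers: { "x-api-key": "<PARALLEL_API_KEY>" },
      require_approval: "never",
    },
    {
      type: "mcp",
      server_label: "Stripe",
      server_url: "https://mcp.stripe.com",
      headers: { Authorization: "Bearer <STRIPE_TOKEN>" },
      require_approval: "never",
    },
  ],
});
```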
MCP servers often require authentication. Credentials are passed per request through each tool's `headers` field (for example an `x-api-key` header or an `Authorization: Bearer` token), and Groq handles them securely when forwarding requests to the MCP server.

In the case of authentication issues, you will receive a `424 Failed Dependency` error with the following content:
```json
{
"error": {
"message": "Error retrieving tool list from MCP server: 'Stripe' Http status code: 401 (Unauthorized)",
"type": "external_connector_error",
"param": "tools",
"code": "http_error"
}
}
```
This may be due to:

- An invalid or expired API key
- A missing or malformed authentication header
- An MCP server that is unreachable or rejecting requests
If connection issues persist, try testing with a known working MCP server first to isolate whether the issue is with your configuration or the specific server.
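In code, you can catch this as an API error and branch on the status code. A minimal sketch using the OpenAI SDK's error class (your recovery logic will vary):

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.GROQ_API_KEY,
  baseURL: "https://api.groq.com/openai/v1",
});

try {
  const response = await client.responses.create({
    model: "openai/gpt-oss-120b",
    input: "What models are trending on Huggingface?",
    tools: [{ type: "mcp", server_label: "Huggingface", server_url: "https://huggingface.co/mcp" }],
  });
  console.log(response);
} catch (err) {
  if (err instanceof OpenAI.APIError && err.status === 424) {
    // 424 Failed Dependency: Groq couldn't connect to or authenticate with the MCP server.
    console.error("MCP server connection failed:", err.message);
  } else {
    throw err;
  }
}
```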
While Groq's MCP implementation is fully compatible with OpenAI's remote MCP specification, there are some limitations to be aware of (we're working on them!):

- Tool call approvals (`"require_approval": true`)
- Streaming of MCP tool interactions (`"streaming": true`)
- Restricting the server's available tools (`"allowed_tools": ["tool1", "tool2"]`)

Migrating an existing OpenAI remote MCP integration typically only requires changing:

- The base URL from `https://api.openai.com/v1` to `https://api.groq.com/openai/v1`
- The model name to a Groq-hosted model, e.g. `openai/gpt-oss-120b`
While we recommend the Responses API for its native MCP support, you can also use MCP with the Chat Completions API.
The Chat Completions API retrofits MCP onto a conversation-based interface. For the best MCP experience with multi-step workflows and approval controls, use the Responses API.
```javascript
import Groq from "groq-sdk";
const groq = new Groq({
apiKey: process.env.GROQ_API_KEY,
});
const completion = await groq.chat.completions.create({
model: "openai/gpt-oss-120b",
messages: [
{
role: "user",
content: "What models are trending on Huggingface?"
}
],
tools: [
{
type: "mcp",
server_label: "Huggingface",
server_url: "https://huggingface.co/mcp"
}
]
});
console.log(completion.choices[0].message);
```