Model Context Protocol (MCP)

Model Context Protocol (MCP) is an open-source standard that enables AI applications to connect with external systems like databases, APIs, and tools. Think of MCP as a "USB-C port for AI applications" - it provides a standardized way for AI models to access and interact with your data and workflows.

What is MCP?

As a developer, you know how powerful AI can be when it has access to the right information and tools. But connecting AI models to your existing systems has traditionally required custom integrations for each service. MCP solves this problem by creating a universal protocol that lets AI models securely connect to any external system.

Real-World Examples

With MCP, you can build AI agents that:

  • Access your codebase: Let AI read GitHub repositories, create issues, and manage pull requests
  • Query your database: Enable natural language queries against PostgreSQL, MySQL, or any database
  • Browse the web: Give AI the ability to search and extract information from websites
  • Control your tools: Connect to Slack, Notion, Google Calendar, or any API-based service
  • Analyze your data: Let AI work with spreadsheets, documents, and business intelligence tools

Why Use MCP with Groq?

Groq's implementation of MCP provides significant advantages:

  • Drop-in compatibility: Existing OpenAI Responses + MCP integrations work with just an endpoint change
  • Superior performance: Groq's speed makes tool-using agents feel snappier and more reliable
  • Cost efficiency: Run the same AI experiences more cost-effectively at scale
  • Built-in security: Clear approval controls and allowlists help teams control tool usage

Supported Models

Remote MCP is available on all models that support tool use:

| Model ID | Model |
|----------|-------|
| openai/gpt-oss-20b | GPT-OSS 20B |
| openai/gpt-oss-120b | GPT-OSS 120B |
| qwen/qwen3-32b | Qwen3 32B |
| moonshotai/kimi-k2-instruct-0905 | Kimi K2 Instruct |
| meta-llama/llama-4-maverick-17b-128e-instruct | Llama 4 Maverick |
| meta-llama/llama-4-scout-17b-16e-instruct | Llama 4 Scout |
| llama-3.3-70b-versatile | Llama 3.3 70B |
| llama-3.1-8b-instant | Llama 3.1 8B Instant |

Getting Started

MCP works by adding external tools to your AI model requests through the tools parameter. Each MCP tool specifies (see the sketch after this list):

  • Server details: Where to connect (URL, authentication)
  • Tool restrictions: Which operations are allowed
  • Approval settings: Whether human approval is required
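
Putting these together, each MCP tool is a single object in the tools array. Below is a minimal sketch of the full shape; the server label, URL, and header value are placeholders, and note that allowed_tools and approval values other than "never" are not yet supported on Groq (see Limitations below).

const mcpTool = {
  type: "mcp",
  // Server details: where to connect and how to authenticate
  server_label: "example_server",               // short name shown to the model
  server_url: "https://example.com/mcp",        // placeholder MCP endpoint
  headers: { Authorization: "Bearer <TOKEN>" }, // sent only to this server
  // Tool restrictions: which of the server's tools may be called
  allowed_tools: ["search"],                    // optional allowlist (see Limitations)
  // Approval settings: whether human approval is required before each call
  require_approval: "never"
};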

Your First MCP Request

Here's a simple example using Hugging Face's MCP server to search for trending AI models.

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.GROQ_API_KEY,
  baseURL: "https://api.groq.com/openai/v1",
});

const response = await client.responses.create({
  model: "openai/gpt-oss-120b",
  input: "What models are trending on Huggingface?",
  tools: [
    {
      type: "mcp",
      server_label: "Huggingface",
      server_url: "https://huggingface.co/mcp",
    }
  ]
});

console.log(response);

Response Structure

When using MCP with the Responses API, you'll receive a structured response containing:

  1. Tool Discovery: Lists available tools from the MCP server
  2. Reasoning: Shows the model's decision-making process
  3. MCP Call: The actual tool execution with results
  4. Final Message: The synthesized answer using tool data

JSON
{
  "id": "resp_01k59jhydefcd8wb7hbc460yav",
  "object": "response",
  "status": "completed",
  "output": [
    {
      "type": "mcp_list_tools",
      "id": "mcpl_1720577121",
      "server_label": "Huggingface",
      "tools": [...] // Available tools from the MCP server
    },
    {
      "type": "reasoning",
      "content": [
        {
          "type": "reasoning_text",
          "text": "User asks: 'What are the trending models on Huggingface?' Need to fetch trending models..."
        }
      ]
    },
    {
      "type": "mcp_call",
      "server_label": "Huggingface",
      "name": "model_search",
      "arguments": "{\"limit\":10,\"sort\":\"trendingScore\"}",
      "output": "Showing first 10 models matching sorted by trendingScore..."
    },
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Here are the top 10 trending models on Hugging Face..."
        }
      ]
    }
  ]
}

For detailed information on configuring the OpenAI client with Groq, see our Responses API documentation. Groq's remote MCP support is fully compatible with OpenAI's remote MCP API.

Why the Responses API?

Groq's Responses API is specifically designed for agentic workflows that involve multiple steps and tool interactions:

Action-Oriented Design

Unlike chat completions that treat everything as conversational text, the Responses API treats actions as first-class citizens (see the sketch after this list):

  • Tool discovery is a separate, labeled step
  • Reasoning is exposed as its own output type
  • Tool calls are clearly identified and structured
  • Approvals are built into the flow naturally
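
Because each action arrives as its own typed output item, handling results in code is a simple dispatch. Here's a minimal sketch, assuming response comes from client.responses.create as in the first example:

for (const item of response.output) {
  switch (item.type) {
    case "mcp_list_tools": // tool discovery step
      console.log(`Discovered ${item.tools.length} tools from ${item.server_label}`);
      break;
    case "reasoning": // the model's decision-making process
      console.log("Reasoning:", item.content[0]?.text);
      break;
    case "mcp_call": // an executed tool call and its result
      console.log(`Called ${item.name} with ${item.arguments}`);
      break;
    case "message": // the final synthesized answer
      console.log("Answer:", item.content[0]?.text);
      break;
  }
}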

Native MCP Support

The Responses API was built from the ground up with MCP in mind, whereas chat completions require retrofitting MCP onto a conversation-based API. This means:

  • Better handling of multi-step tool workflows
  • Clearer separation between reasoning and action
  • Built-in approval and control mechanisms
  • More reliable stateless operation

MCP Examples

Firecrawl Integration

Connect to Firecrawl's MCP server for automated web scraping and data extraction. You'll need a Firecrawl API key to authenticate with their API.

Important Notes:

  • Use a descriptive server_description to help the AI model understand how to use the tool effectively
  • Firecrawl requires a URL in your request so it knows which page to browse and extract content from
  • The API key should be included in the server URL as shown in the example

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.GROQ_API_KEY,
  baseURL: "https://api.groq.com/openai/v1",
});

const response = await client.responses.create({
  model: "openai/gpt-oss-120b",
  input: [
    {
      type: "message",
      role: "user",
      content: "What are the production models on https://console.groq.com/docs/models?"
    }
  ],
  tools: [
    {
      type: "mcp",
      server_label: "firecrawl",
      server_description: "Web scraping and content extraction capabilities",
      server_url: "https://mcp.firecrawl.dev/<APIKEY>/v2/mcp",
      require_approval: "never"
    }
  ],
  stream: false
});

console.log(response);

Example Firecrawl Integration Response

A typical Firecrawl MCP response includes tool discovery, reasoning, and web scraping execution:

JSON
{
  "id": "resp_01k5sv3np4fydva2jd9zzknbdv",
  "object": "response",
  "status": "completed",
  "output": [
    {
      "type": "mcp_list_tools",
      "server_label": "firecrawl",
      "tools": [
        {
          "name": "firecrawl_scrape",
          "description": "Scrape content from a single URL with advanced options. This is the most powerful, fastest and most reliable scraper tool..."
        },
        {
          "name": "firecrawl_map",
          "description": "Map a website to discover all indexed URLs on the site..."
        },
        {
          "name": "firecrawl_search",
          "description": "Search the web and optionally extract content from search results..."
        },
        {
          "name": "firecrawl_crawl",
          "description": "Starts a crawl job on a website and extracts content from all pages..."
        },
        {
          "name": "firecrawl_extract",
          "description": "Extract structured information from web pages using LLM capabilities..."
        }
        // ... additional tools available
      ]
    },
    {
      "type": "reasoning",
      "content": [{
        "type": "reasoning_text",
        "text": "User asks: 'What are the production models on https://console.groq.com/docs/models?' Need to retrieve content from that URL. Use firecrawl_scrape..."
      }]
    },
    {
      "type": "mcp_call",
      "server_label": "firecrawl",
      "name": "firecrawl_search",
      "arguments": "{\"limit\":5,\"query\":\"Groq production models console.groq.com/docs/models\",\"scrapeOptions\":{\"formats\":[\"markdown\"]}}",
      "output": "{\"web\":[{\"url\":\"https://console.groq.com/docs/models\",\"title\":\"Supported Models - GroqDocs\",\"markdown\":\"# Production Models...\"}]}"
    },
    {
      "type": "message",
      "role": "assistant",
      "content": [{
        "type": "output_text",
        "text": "Here are the **Production‑grade models** listed on Groq's documentation. These models meet high standards for speed, quality, and reliability:\n\n| Model ID | Developer | Context Window | Max Completion Tokens |\n|----------|-----------|----------------|------------------------|\n| llama-3.1-8b-instant | Meta | 131,072 | 131,072 |\n| llama-3.3-70b-versatile | Meta | 131,072 | 32,768 |\n| openai/gpt-oss-120b | OpenAI | 131,072 | 65,536 |\n..."
      }]
    }
  ]
}

This example demonstrates how Firecrawl's MCP server exposes multiple web scraping tools (firecrawl_scrape, firecrawl_search, firecrawl_map, etc.) and how the model selects the most suitable one to extract the requested information from web pages.

Parallel Web Search Integration

Enable natural language web search for your AI agents with Parallel's MCP server. You'll need a Parallel API key to authenticate with their API.

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.GROQ_API_KEY,
  baseURL: "https://api.groq.com/openai/v1",
});

const response = await client.responses.create({
  model: "openai/gpt-oss-120b",
  input: "What are the best models for agentic workflows on Groq? Search only on console.groq.com",
  tools: [
    {
      type: "mcp",
      server_label: "parallel_web_search",
      server_url: "https://mcp.parallel.ai/v1beta/search_mcp/",
      headers: {
        "x-api-key": "<PARALLEL_API_KEY>"
      },
      require_approval: "never"
    }
  ]
});

console.log(response);

Example Web Search Response

JSON
{
  "id": "resp_01k59pzd4bfe698awmye9cnd99",
  "object": "response",
  "status": "completed",
  "created_at": 1758041453,
  "output": [
    {
      "type": "mcp_list_tools",
      "id": "mcpl_2365835890",
      "status": "completed",
      "server_label": "parallel_web_search",
      "tools": [
        {
          "annotations": null,
          "description": "Purpose: Perform web searches for a given objective and return result... [truncated]",
          "input_schema": {
            "properties": {
              "include_domains": {
                "items": {
                  "type": "string"
                },
                "title": "Include Domains",
                "type": "array"
              },
              "objective": {
                "title": "Objective",
                "type": "string"
              },
              "search_queries": {
                "items": {
                  "type": "string"
                },
                "title": "Search Queries",
                "type": "array"
              },
              "search_type": {
                "enum": [
                  "list",
                  "targeted",
                  "general",
                  "single_page"
                ],
                "title": "Search Type",
                "type": "string"
              }
            },
            "required": [
              "objective",
              "search_queries",
              "search_type",
              "include_domains"
            ],
            "title": "simple_search_toolArguments",
            "type": "object"
          },
          "name": "web_search_preview"
        }
      ]
    },
    {
      "type": "reasoning",
      "id": "resp_01k59pzd4bfe6s5m93vee5mzfb",
      "status": "completed",
      "content": [
        {
          "type": "reasoning_text",
          "text": "We need to answer: best models for agentic workflows on Groq, search only on console.groq.com... [truncated]"
        }
      ],
      "summary": []
    },
    {
      "type": "mcp_call",
      "id": "mcpc_01k59pzd4bfe7vwk5ffp35f2g2",
      "status": "completed",
      "server_label": "parallel_web_search",
      "name": "web_search_preview",
      "arguments": "{"include_domains":["console.groq.com"],"objective":"Find the best models for agentic workflows on Groq according to console.groq.com documentation or listings... [truncated]",
      "approval_request_id": null,
      "output": "[truncated]"
    },
    {
      "type": "message",
      "id": "msg_01k59pzd4bfe7a2k07v4n3bjdn",
      "status": "completed",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "**Best Groq models for agentic / tool‑using workflows**... [truncated]",
          "annotations": [],
          "logprobs": null
        }
      ]
    }
  ],
  "model": "openai/gpt-oss-120b",
  "tools": [
    {
      "type": "mcp",
      "server_label": "parallel_web_search",
      "server_url": "https://mcp.parallel.ai/<redacted>",
      "headers": {
        "x-api-key": "<PARALLEL_API_KEY>"
      },
      "allowed_tools": null,
      "require_approval": "never",
      "server_description": null
    }
  ]
}

Final Output


Best Groq models for agentic / tool‑using workflows

(All information pulled from the official Groq console documentation – only results from console.groq.com were consulted.)

| Model / System | Why it’s a top pick for agentic workflows | Key agentic features | Typical use‑cases |
|----------------|-------------------------------------------|----------------------|-------------------|
| Groq Compound (groq/compound) | Fully‑featured “AI system” that already bundles the most useful tools (web‑search, code‑execution, browser‑automation, etc.). No extra tool‑definition work is required. | • Built‑in tools are auto‑invoked <br/>• Handles parallel tool calls <br/>• Extremely fast (~450 tps) | Autonomous agents that need to browse the web, run code, or scrape data without custom tooling. |
| Groq Compound‑Mini (groq/compound-mini) | Same tool set as Compound but lighter‑weight (fewer parameters, lower cost) – great for rapid prototyping or when latency budget is tight. | • Same built‑in tools as Compound <br/>• ~450 tps, lower memory footprint | Quick proofs‑of‑concept, edge‑deployed agents, or budget‑constrained pipelines. |
| Llama 3.3‑70B‑versatile (llama-3.3-70b-versatile) | Flagship production model with full tool‑use support, parallel‑tool capability, and JSON‑mode. | • ✅ Tool‑use & parallel tool‑use <br/>• ✅ JSON‑mode for structured output <br/>• 131 k context window | Multi‑step reasoning, data‑retrieval + computation pipelines, mixed‑modal (vision + text) agents. |
| Llama 3.1‑8B‑instant (llama-3.1-8b-instant) | Smallest “instant” model that still supports tool use – ideal when you need many concurrent agents or very low latency. | • ✅ Tool‑use <br/>• ✅ Parallel tool‑use <br/>• Fast inference (~450 tps) | High‑throughput agent farms, real‑time chat assistants, low‑cost routing agents. |
| Meta‑Llama 4 Scout‑17B‑16E‑instruct (meta-llama/llama-4-scout-17b-16e-instruct) | MoE model with strong reasoning & tool‑use, offering a good balance of size and performance. | • ✅ Tool‑use & parallel tool‑use <br/>• ✅ JSON‑mode <br/>• 131 k context | Complex planning agents, code‑generation + execution loops, multi‑tool orchestration. |
| Meta‑Llama 4 Maverick‑17B‑128E‑instruct (meta-llama/llama-4-maverick-17b-128e-instruct) | Slightly larger MoE variant with higher token‑speed (~500 tps) and robust tool handling. | • ✅ Tool‑use & parallel tool‑use <br/>• ✅ JSON‑mode <br/>• 131 k context | Heavy‑duty autonomous agents, research assistants that call many APIs in parallel. |
| OpenAI GPT‑OSS 120B (openai/gpt-oss-120b) | Open‑weight model with full tool‑use support (single‑tool only, but still very capable). | • ✅ Tool‑use (no parallel) <br/>• ✅ JSON‑mode | Scenarios where you only need one tool per turn (e.g., single‑API lookup) and want the large‑parameter knowledge base. |
| OpenAI GPT‑OSS 20B (openai/gpt-oss-20b) | Smaller OpenAI model that also supports tool‑use; useful when you need lower cost but still want OpenAI‑style reasoning. | • ✅ Tool‑use (no parallel) <br/>• ✅ JSON‑mode | Lightweight agents that call a single function (e.g., weather or simple database query). |
| Moonshot Kimi K2 0905 (moonshotai/kimi-k2-instruct-0905) | Designed specifically for agentic intelligence; excellent at tool use, coding, and autonomous problem‑solving. | • ✅ Tool‑use & parallel tool‑use <br/>• ✅ JSON‑mode <br/>• 256 k context window | High‑complexity agents that need huge context (e.g., long document analysis + tool calls). |

How the list was derived

  1. Tool‑use support table (found on the Introduction to Tool Use page) enumerates every model that can call external tools, marks those that also support parallel tool calls, and notes JSON‑mode availability.
  2. The Agentic Tooling section highlights the Groq Compound systems as the “out‑of‑the‑box” agentic solution that already bundles tools.
  3. The Supported Models table on the Models page lists the production models; the ones above are the ones that also appear in the tool‑use table.
  4. The AutoGen + Groq and CrewAI + Groq guides all use Llama 3.3‑70B‑versatile or Llama 3.1‑8B‑instant as the recommended default, reinforcing their status as the go‑to agents for the Groq ecosystem.

Quick recommendation cheat‑sheet

| Scenario | Recommended model(s) |
|----------|----------------------|
| Full‑featured autonomous agents (web search, code exec, browser automation) | groq/compound (or groq/compound-mini for lower cost) |
| High‑throughput many‑agent pipelines | llama-3.1-8b-instant |
| Best overall performance with parallel tool calls | llama-3.3-70b-versatile |
| MoE‑style strong reasoning + tool use | meta-llama/llama-4-scout-17b-16e-instruct / meta-llama/llama-4-maverick-17b-128e-instruct |
| When you need massive context (>128 k tokens) | moonshotai/kimi-k2-instruct-0905 |
| Open‑source large‑parameter knowledge base | openai/gpt-oss-120b (single‑tool) or openai/gpt-oss-20b (lighter) |

These models collectively give you the best blend of tool‑use capability, parallel execution, large context windows, and Groq’s ultra‑fast inference, making them the top choices for building agentic workflows on the Groq platform.

Creating an Invoice

Automate your invoicing process with Stripe's MCP server. You'll need a Stripe API key with appropriate permissions.

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.GROQ_API_KEY,
  baseURL: "https://api.groq.com/openai/v1",
});

const response = await client.responses.create({
  model: "openai/gpt-oss-120b",
  input: "Create an invoice for $100 for customer Groq Labs Testing using Stripe.",
  tools: [
    {
      type: "mcp",
      server_label: "Stripe",
      server_url: "https://mcp.stripe.com",
      headers: {
        Authorization: "Bearer <STRIPE_TOKEN>"
      },
      require_approval: "never"
    }
  ]
});

console.log(response);

Example Stripe Invoice Creation

A typical Stripe MCP response includes tool discovery, reasoning, and sequential action execution to create an invoice:

JSON
{
  "id": "resp_01k59tasz2eg4as5q4n37kaqch",
  "object": "response",
  "status": "completed",
  "output": [
    {
      "type": "mcp_list_tools",
      "server_label": "Stripe",
      "tools": [
        {
          "name": "create_customer",
          "description": "This tool will create a customer in Stripe."
        },
        {
          "name": "create_product",
          "description": "This tool will create a product in Stripe."
        },
        {
          "name": "create_price",
          "description": "This tool will create a price in Stripe."
        },
        {
          "name": "create_invoice",
          "description": "This tool will create an invoice in Stripe."
        },
        {
          "name": "create_invoice_item",
          "description": "This tool will create an invoice item in Stripe."
        },
        {
          "name": "finalize_invoice",
          "description": "This tool will finalize an invoice in Stripe."
        }
        // ... additional tools available
      ]
    },
    {
      "type": "reasoning",
      "content": [{
        "type": "reasoning_text",
        "text": "We need to create an invoice for $100 for customer named Groq Labs Testing. Steps: 1. Create customer 2. Create product and price 3. Create invoice 4. Add invoice item 5. Finalize invoice..."
      }]
    },
    {
      "type": "mcp_call",
      "server_label": "Stripe",
      "name": "create_customer",
      "arguments": "{\"name\":\"Groq Labs Testing\"}",
      "output": "{\"id\":\"cus_T4BNAMWCQH1Po3\"}"
    },
    {
      "type": "mcp_call",
      "server_label": "Stripe",
      "name": "create_product",
      "arguments": "{\"name\":\"Groq Labs Testing Invoice\"}",
      "output": "{\"id\":\"prod_T4BNkXTDpoQjEo\"}"
    },
    {
      "type": "mcp_call",
      "server_label": "Stripe",
      "name": "create_price",
      "arguments": "{\"currency\":\"usd\",\"product\":\"prod_T4BNkXTDpoQjEo\",\"unit_amount\":10000}",
      "output": "{\"id\":\"price_1S830L4C7KtbdMSK1Oy4u38s\"}"
    },
    {
      "type": "mcp_call",
      "server_label": "Stripe",
      "name": "create_invoice",
      "arguments": "{\"customer\":\"cus_T4BNAMWCQH1Po3\"}",
      "output": "{\"id\":\"in_1S830M4C7KtbdMSKpV3FA37N\",\"status\":\"draft\"}"
    },
    {
      "type": "mcp_call",
      "server_label": "Stripe",
      "name": "create_invoice_item",
      "arguments": "{\"customer\":\"cus_T4BNAMWCQH1Po3\",\"invoice\":\"in_1S830M4C7KtbdMSKpV3FA37N\",\"price\":\"price_1S830L4C7KtbdMSK1Oy4u38s\"}",
      "output": "{\"id\":\"ii_1S830N4C7KtbdMSKWML91TiW\"}"
    },
    {
      "type": "mcp_call",
      "server_label": "Stripe",
      "name": "finalize_invoice",
      "arguments": "{\"invoice\":\"in_1S830M4C7KtbdMSKpV3FA37N\"}",
      "output": "{\"id\":\"in_1S830M4C7KtbdMSKpV3FA37N\",\"status\":\"open\",\"url\":\"https://invoice.stripe.com/i/acct_1S82ye4C7KtbdMSK/test_...?s=ap\"}"
    },
    {
      "type": "message",
      "role": "assistant",
      "content": [{
        "type": "output_text",
        "text": "Your invoice has been created and finalized for **$100** (USD) for the customer **Groq Labs Testing**..."
      }]
    }
  ]
}

This example demonstrates how MCP orchestrates multiple Stripe API calls to complete a complex business workflow: creating a customer, product, price, invoice, and invoice item, then finalizing the invoice, all in a single request.

Other payment processors also support MCP. For example, PayPal's MCP server allows you to create invoices, manage payments, and more.

Advanced Features

Multiple MCP Servers

You can connect to multiple MCP servers in a single request, allowing AI to coordinate across different systems (a complete request sketch follows the JSON below):

JSON
{
  "tools": [
    {
      "type": "mcp",
      "server_label": "parallel_web_search",
      "server_url": "https://mcp.parallel.ai/<redacted>",
      "headers": {
        "x-api-key": "<PARALLEL_API_KEY>"
      }
    },
    {
      "type": "mcp",
      "server_label": "Stripe",
      "server_url": "https://mcp.stripe.com",
      "headers": { "Authorization": "Bearer <STRIPE_TOKEN>" }
    }
  ]
}
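
As a complete request, that configuration might look like the following sketch; the prompt is illustrative, and both credentials are placeholders:

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.GROQ_API_KEY,
  baseURL: "https://api.groq.com/openai/v1",
});

const response = await client.responses.create({
  model: "openai/gpt-oss-120b",
  // Hypothetical prompt: research on the web, then act in Stripe
  input: "Look up the company behind example.com, then create a Stripe customer for it.",
  tools: [
    {
      type: "mcp",
      server_label: "parallel_web_search",
      server_url: "https://mcp.parallel.ai/v1beta/search_mcp/",
      headers: { "x-api-key": "<PARALLEL_API_KEY>" },
      require_approval: "never"
    },
    {
      type: "mcp",
      server_label: "Stripe",
      server_url: "https://mcp.stripe.com",
      headers: { Authorization: "Bearer <STRIPE_TOKEN>" },
      require_approval: "never"
    }
  ]
});

console.log(response);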

Authentication & Security

MCP servers often require authentication. Groq handles credentials securely (see the sketch after this list):

  • Headers sent only to MCP servers: Tokens are only transmitted to the specific server URL
  • Redacted logs: Authentication headers are automatically redacted from logs
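
To keep tokens out of source control, read them from environment variables instead of hardcoding them. A minimal sketch (the variable name is an arbitrary placeholder):

const stripeTool = {
  type: "mcp",
  server_label: "Stripe",
  server_url: "https://mcp.stripe.com",
  // The token is read from the environment and transmitted only to this server URL
  headers: { Authorization: `Bearer ${process.env.STRIPE_TOKEN}` }
};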

Connection Troubleshooting

If Groq cannot connect to or authenticate with the MCP server, you will receive a 424 Failed Dependency error with content like the following:

JSON
{
  "error": {
    "message": "Error retrieving tool list from MCP server: 'Stripe' Http status code: 401 (Unauthorized)",
    "type": "external_connector_error",
    "param": "tools",
    "code": "http_error"
  }
}

This may be due to:

  • Incorrect credentials: Check your authentication tokens and headers
  • Invalid server URL: Verify the MCP server endpoint is correct and accessible
  • Server unavailable: The MCP server may be down or not responding

Debugging Connection Issues

  1. Verify credentials: Double-check API keys, tokens, and authentication headers
  2. Test server URL: Ensure the MCP server URL is accessible and returns valid responses
  3. Check server status: Confirm the MCP server is running and healthy

If connection issues persist, try testing with a known working MCP server first to isolate whether the issue is with your configuration or the specific server.
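
In code, you can catch these failures and inspect the status before retrying or falling back. Here's a sketch using the OpenAI Node SDK's APIError type:

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.GROQ_API_KEY,
  baseURL: "https://api.groq.com/openai/v1",
});

try {
  const response = await client.responses.create({
    model: "openai/gpt-oss-120b",
    input: "What models are trending on Huggingface?",
    tools: [
      { type: "mcp", server_label: "Huggingface", server_url: "https://huggingface.co/mcp" }
    ]
  });
  console.log(response);
} catch (err) {
  // 424 indicates Groq could not reach or authenticate with the MCP server
  if (err instanceof OpenAI.APIError && err.status === 424) {
    console.error("MCP server connection failed:", err.message);
  } else {
    throw err;
  }
}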

Limitations

While Groq's MCP implementation is fully compatible with OpenAI's remote MCP specification, there are some limitations to be aware of (we're working on them!):

  • Approvals are not yet supported ("require_approval": "always")
  • Streaming is not yet supported ("stream": true)
  • Filtering tools is not yet supported ("allowed_tools": ["tool1", "tool2"])

OpenAI Compatibility

Groq's MCP implementation is fully compatible with OpenAI's remote MCP specification. Existing integrations typically only need to change three things (see the sketch after this list):

  • Base URL: From https://api.openai.com/v1 to https://api.groq.com/openai/v1
  • Model name: To a Groq-supported model like openai/gpt-oss-120b
  • API key: To your Groq API key
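
For example, a client configured for OpenAI migrates by updating the client configuration; the OpenAI setup shown in the comment is for contrast:

import OpenAI from "openai";

// Before: new OpenAI({ apiKey: process.env.OPENAI_API_KEY })
// After: the same SDK, pointed at Groq's OpenAI-compatible endpoint
const client = new OpenAI({
  apiKey: process.env.GROQ_API_KEY,
  baseURL: "https://api.groq.com/openai/v1",
});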

Using MCP with Chat Completions

While we recommend the Responses API for its native MCP support, you can also use MCP with the Chat Completions API.

import Groq from "groq-sdk";

const groq = new Groq({
  apiKey: process.env.GROQ_API_KEY,
});

const completion = await groq.chat.completions.create({
  model: "openai/gpt-oss-120b",
  messages: [
    {
      role: "user",
      content: "What models are trending on Huggingface?"
    }
  ],
  tools: [
    {
      type: "mcp",
      server_label: "Huggingface",
      server_url: "https://huggingface.co/mcp"
    }
  ]
});

console.log(completion.choices[0].message);
