Firecrawl is an enterprise-grade web scraping platform that turns any website into clean, AI-ready data. Combined with Groq's fast inference through MCP, you can build intelligent agents that scrape websites, extract structured data, and conduct deep research with natural language instructions.
Key Features:
Install the dependencies:

```bash
pip install openai python-dotenv
```

Set your API keys:

```bash
export GROQ_API_KEY="your-groq-api-key"
export FIRECRAWL_API_KEY="your-firecrawl-api-key"
```

Then point the OpenAI client at Groq and register Firecrawl as a remote MCP server:

```python
import os

from openai import OpenAI
from openai.types import responses as openai_responses

# OpenAI-compatible client that sends requests to Groq
client = OpenAI(
    base_url="https://api.groq.com/api/openai/v1",
    api_key=os.getenv("GROQ_API_KEY"),
)

# Firecrawl's hosted MCP server; the Firecrawl API key is part of the URL
tools = [
    openai_responses.tool_param.Mcp(
        server_label="firecrawl",
        server_url=f"https://mcp.firecrawl.dev/{os.getenv('FIRECRAWL_API_KEY')}/v2/mcp",
        type="mcp",
        require_approval="never",
    )
]

response = client.responses.create(
    model="openai/gpt-oss-120b",
    input="Scrape https://console.groq.com/docs/models and provide an overview of available models",
    tools=tools,
    temperature=0.1,
    top_p=0.4,
)

print(response.output_text)
```

Extract data in specific JSON formats across multiple sources:
```python
response = client.responses.create(
    model="openai/gpt-oss-120b",
    input="""Extract pricing from https://openai.com, https://anthropic.com, https://groq.com
Return JSON:
{
  "company_name": "string",
  "pricing_plans": [{"plan_name": "string", "price": "string", "features": ["string"]}]
}""",
    tools=tools,
    temperature=0.1,
)

print(response.output_text)
```

Conduct comprehensive research across multiple sources:
```python
response = client.responses.create(
    model="openai/gpt-oss-120b",
    input="""Research "latest trends in AI model inference speed and performance":
1. Recent developments (2024-2025)
2. Key companies and technologies
3. Performance benchmarks
4. Future trends
Provide a comprehensive report with citations.""",
    tools=tools,
    temperature=0.1,
)

print(response.output_text)
```

Scrape multiple URLs in parallel:
```python
response = client.responses.create(
    model="openai/gpt-oss-120b",
    input="""Batch scrape these URLs and summarize key findings:
- https://arxiv.org/abs/2401.xxxxx
- https://arxiv.org/abs/2402.xxxxx
- https://arxiv.org/abs/2403.xxxxx""",
    tools=tools,
    temperature=0.1,
)

print(response.output_text)
```

Firecrawl MCP provides several tools for web scraping, data extraction, and research:

| Tool | Description |
|---|---|
| `firecrawl_scrape` | Scrape content from a single URL with advanced options and formatting |
| `firecrawl_batch_scrape` | Scrape multiple URLs efficiently with built-in rate limiting and parallel processing |
| `firecrawl_check_batch_status` | Check the status of a batch operation and retrieve results |
| `firecrawl_search` | Search the web and optionally extract content from search results |
| `firecrawl_crawl` | Start an asynchronous crawl with advanced options for depth and link following |
| `firecrawl_extract` | Extract structured information from web pages using LLM capabilities and JSON schemas |
| `firecrawl_deep_research` | Conduct comprehensive deep web research with intelligent crawling and LLM analysis |
| `firecrawl_generate_llmstxt` | Generate standardized llms.txt files that define how LLMs should interact with a site |
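
If an agent only needs a few of these tools, it may help to restrict what the model can call. The sketch below builds the same MCP tool entry as a plain dict and validates requested tool names against the table above. The helper `firecrawl_mcp_tool` is hypothetical, and the `allowed_tools` field is taken from OpenAI's Responses API MCP tool schema; whether Groq's endpoint honors it is an assumption you should verify.

```python
# Tool names copied from the table above.
FIRECRAWL_TOOLS = frozenset({
    "firecrawl_scrape",
    "firecrawl_batch_scrape",
    "firecrawl_check_batch_status",
    "firecrawl_search",
    "firecrawl_crawl",
    "firecrawl_extract",
    "firecrawl_deep_research",
    "firecrawl_generate_llmstxt",
})


def firecrawl_mcp_tool(api_key, allowed_tools=None):
    """Build an MCP tool entry for Firecrawl as a plain dict (hypothetical helper)."""
    if not api_key:
        raise ValueError("A Firecrawl API key is required")

    tool = {
        "type": "mcp",
        "server_label": "firecrawl",
        "server_url": f"https://mcp.firecrawl.dev/{api_key}/v2/mcp",
        "require_approval": "never",
    }

    if allowed_tools is not None:
        # Reject names that are not in the documented tool list.
        unknown = sorted(set(allowed_tools) - FIRECRAWL_TOOLS)
        if unknown:
            raise ValueError(f"Unknown Firecrawl tools: {unknown}")
        # Assumption: the endpoint accepts `allowed_tools` as in OpenAI's MCP tool schema.
        tool["allowed_tools"] = sorted(allowed_tools)

    return tool
```

The result can be passed as `tools=[firecrawl_mcp_tool(os.getenv("FIRECRAWL_API_KEY"), ["firecrawl_scrape"])]` in a `client.responses.create(...)` call.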
Challenge: Build an AI-powered competitive intelligence system that monitors competitor websites, extracts key business metrics, and generates automated reports using Firecrawl and Groq!
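
One possible starting point for the challenge is sketched below: a prompt builder that asks for a fixed JSON shape, plus a tolerant parser for the model's reply. Both helpers (`build_competitor_prompt`, `parse_json_report`) and the report schema are hypothetical illustrations and not part of Firecrawl or Groq.

```python
import json


def build_competitor_prompt(urls, metrics):
    """Compose a scraping prompt that requests one JSON record per competitor."""
    url_lines = "\n".join(f"- {url}" for url in urls)
    # Hypothetical report shape: one string field per requested metric.
    schema = {"competitors": [{"url": "string", **{m: "string" for m in metrics}}]}
    return (
        "Scrape each competitor site below and extract the requested metrics.\n"
        f"{url_lines}\n"
        "Return only JSON in this shape:\n"
        f"{json.dumps(schema, indent=2)}"
    )


def parse_json_report(text):
    """Extract the outermost JSON object from model output that may include extra prose."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("No JSON object found in model output")
    return json.loads(text[start:end + 1])
```

The prompt would be sent as `input=build_competitor_prompt(...)` with the Firecrawl `tools` list, and `parse_json_report(response.output_text)` would turn the reply into a dict that a scheduler or report generator can consume.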
For more detailed documentation and resources on building web intelligence applications with Groq and Firecrawl, see: