
Tavily + Groq: Real-Time Search, Scraping & Crawling for AI

Tavily is a comprehensive web search, scraping, and crawling API designed specifically for AI agents. It provides real-time web access, content extraction, and advanced search capabilities. Combined with Groq's ultra-fast inference through MCP (Model Context Protocol), you can build intelligent agents that research topics, monitor websites, and extract structured data in seconds.

Key Features:

  • Multi-Modal Search: Web search, content extraction, and crawling in one API
  • AI-Optimized Results: Clean, structured data designed for LLM consumption
  • Advanced Filtering: Search by date range, domain, content type, and more
  • Content Extraction: Pull complete article content from any URL
  • Search Depth Control: Choose between basic and advanced search modes
  • Fast Execution: Groq's inference makes synthesis nearly instant

Quick Start

1. Install the required packages:

```bash
pip install openai python-dotenv
```

2. Get your API keys:

```bash
export GROQ_API_KEY="your-groq-api-key"
export TAVILY_API_KEY="your-tavily-api-key"
```

3. Create your first research agent:

```python
import os
from openai import OpenAI

# Point the OpenAI SDK at Groq's OpenAI-compatible endpoint
client = OpenAI(
    base_url="https://api.groq.com/api/openai/v1",
    api_key=os.getenv("GROQ_API_KEY")
)

# Register Tavily's hosted MCP server as a tool
tools = [{
    "type": "mcp",
    "server_url": f"https://mcp.tavily.com/mcp/?tavilyApiKey={os.getenv('TAVILY_API_KEY')}",
    "server_label": "tavily",
    "require_approval": "never",  # let the agent call Tavily tools without manual approval
}]

response = client.responses.create(
    model="openai/gpt-oss-120b",
    input="What are recent AI startup funding announcements?",
    tools=tools,
    temperature=0.1,
    top_p=0.4,
)

print(response.output_text)
```
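The snippet above interpolates the raw key into the URL's query string. If a key could ever contain characters that are unsafe in a query string, URL-encoding it first is safer; a standard-library sketch (the key value here is hypothetical):

```python
from urllib.parse import quote_plus

api_key = "tvly-example+key"  # hypothetical key for illustration
# quote_plus escapes characters like '+' and '&' so they survive the query string
server_url = f"https://mcp.tavily.com/mcp/?tavilyApiKey={quote_plus(api_key)}"
```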

Advanced Examples

Time-Filtered Research

Search within specific time ranges:

```python
response = client.responses.create(
    model="openai/gpt-oss-120b",
    input="""Find AI model releases from the past month.
    Use tavily_search with:
    - time_range: month
    - search_depth: advanced
    - max_results: 10

    Provide details about models, companies, and capabilities.""",
    tools=tools,
    temperature=0.1,
)

print(response.output_text)
```

Product Information Extraction

Extract structured product data:

```python
response = client.responses.create(
    model="openai/gpt-oss-120b",
    input="""Find iPhone models on apple.com.
    Use tavily_search then tavily_extract to get:
    - Model names
    - Prices
    - Key features
    - Availability""",
    tools=tools,
    temperature=0.1,
)

print(response.output_text)
```

Multi-Source Content Extraction

Extract and compare content from multiple URLs:

```python
urls = [
    "https://example.com/article1",
    "https://example.com/article2",
    "https://example.com/article3"
]

response = client.responses.create(
    model="openai/gpt-oss-120b",
    input=f"""Extract content from: {', '.join(urls)}

    Analyze and compare:
    - Main themes
    - Key differences in perspective
    - Common facts
    - Author conclusions""",
    tools=tools,
    temperature=0.1,
)

print(response.output_text)
```

Available Tavily Tools

| Tool | Description |
|------|-------------|
| tavily_search | Search with advanced filters (time, depth, topic, max results) |
| tavily_extract | Extract full content from specific URLs |
| tavily_scrape | Scrape single pages with clean output |
| tavily_batch_scrape | Scrape multiple URLs in parallel |
| tavily_crawl | Crawl websites with depth and pattern controls |
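The crawl tool has no example above; as with the other examples, its controls can be expressed in the prompt itself. A sketch (the helper, URL, and control values are hypothetical, following the same pattern as the earlier examples):

```python
def build_crawl_prompt(url: str, max_depth: int, pattern: str) -> str:
    """Compose a natural-language instruction asking the agent to use tavily_crawl."""
    return (
        f"Crawl {url} with tavily_crawl.\n"
        f"- max_depth: {max_depth}\n"
        f"- only follow pages matching: {pattern}\n"
        "Summarize each page you visit."
    )

prompt = build_crawl_prompt("https://example.com/docs", 2, "/docs/*")
```

The resulting string is passed as `input=prompt` to `client.responses.create(...)` with the same `tools` list from the Quick Start.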

Search Parameters

Search Depth:

  • basic - Fast, surface-level results (under 3 seconds)
  • advanced - Comprehensive, deep results (5-10 seconds)

Time Range:

  • day, week, month, year

Topic:

  • general, news
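These parameters can be combined in a single prompt; a sketch with illustrative values (the query text is an assumption):

```python
# Illustrative parameter values combined into one tavily_search instruction
params = {
    "search_depth": "advanced",
    "time_range": "week",
    "topic": "news",
    "max_results": 5,
}
settings = "\n".join(f"- {k}: {v}" for k, v in params.items())
prompt = f"Find this week's open-source LLM coverage.\nUse tavily_search with:\n{settings}"
```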

Challenge: Build an automated content curation system that monitors news sources, filters by relevance, extracts key information, generates summaries, and publishes daily digests!
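A starting point for the challenge, as a sketch (the helper name and source URLs are hypothetical; it reuses the `client` and `tools` objects from the Quick Start):

```python
def build_digest_prompt(sources, topic):
    """Compose a daily-digest instruction over a list of news sources (hypothetical helper)."""
    urls = "\n".join(f"- {u}" for u in sources)
    return (
        f"Monitor these sources for {topic} news:\n{urls}\n"
        "Use tavily_search (time_range: day) to find today's stories, "
        "tavily_extract to pull full articles, then write a ranked digest "
        "with a one-paragraph summary per story."
    )

prompt = build_digest_prompt(
    ["https://example.com/ai-news", "https://example.com/research-blog"],
    "AI",
)
```

Running this daily (e.g., from a scheduler) and passing `prompt` as `input=` to `client.responses.create(...)` covers the monitor, extract, and summarize steps; publishing the digest is left to you.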
