
Tavily + Groq: Real-Time Search, Scraping & Crawling for AI

Tavily is a comprehensive web search, scraping, and crawling API designed specifically for AI agents. It provides real-time web access, content extraction, and advanced search capabilities. Combined with Groq's ultra-fast inference through MCP (Model Context Protocol), you can build intelligent agents that research topics, monitor websites, and extract structured data in seconds.

Key Features:

  • Multi-Modal Search: Web search, content extraction, and crawling in one API
  • AI-Optimized Results: Clean, structured data designed for LLM consumption
  • Advanced Filtering: Search by date range, domain, content type, and more
  • Content Extraction: Pull complete article content from any URL
  • Search Depth Control: Choose between basic and advanced search modes
  • Fast Execution: Groq's inference makes synthesis nearly instant

Quick Start

1. Install the required packages:

```bash
pip install openai python-dotenv
```

2. Get your API keys:

```bash
export GROQ_API_KEY="your-groq-api-key"
export TAVILY_API_KEY="your-tavily-api-key"
```

3. Create your first research agent:

```python
import os
from openai import OpenAI

# Point the OpenAI SDK at Groq's OpenAI-compatible endpoint
client = OpenAI(
    base_url="https://api.groq.com/api/openai/v1",
    api_key=os.getenv("GROQ_API_KEY")
)

# Register Tavily's hosted MCP server as a tool
tools = [{
    "type": "mcp",
    "server_url": f"https://mcp.tavily.com/mcp/?tavilyApiKey={os.getenv('TAVILY_API_KEY')}",
    "server_label": "tavily",
    "require_approval": "never",  # let the agent call Tavily tools without manual approval
}]

response = client.responses.create(
    model="openai/gpt-oss-120b",
    input="What are recent AI startup funding announcements?",
    tools=tools,
    temperature=0.1,
    top_p=0.4,
)

print(response.output_text)
```
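The snippet above interpolates the raw key into the URL's query string. If a key could ever contain characters that are unsafe in a query string, URL-encoding it first is safer; a standard-library sketch (the key value here is hypothetical):

```python
from urllib.parse import quote_plus

api_key = "tvly-example+key"  # hypothetical key for illustration
# quote_plus escapes characters like '+' and '&' so they survive the query string
server_url = f"https://mcp.tavily.com/mcp/?tavilyApiKey={quote_plus(api_key)}"
```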

Advanced Examples

Time-Filtered Research

Search within specific time ranges:

```python
response = client.responses.create(
    model="openai/gpt-oss-120b",
    input="""Find AI model releases from the past month.
    Use tavily_search with:
    - time_range: month
    - search_depth: advanced
    - max_results: 10

    Provide details about models, companies, and capabilities.""",
    tools=tools,
    temperature=0.1,
)

print(response.output_text)
```

Product Information Extraction

Extract structured product data:

```python
response = client.responses.create(
    model="openai/gpt-oss-120b",
    input="""Find iPhone models on apple.com.
    Use tavily_search then tavily_extract to get:
    - Model names
    - Prices
    - Key features
    - Availability""",
    tools=tools,
    temperature=0.1,
)

print(response.output_text)
```

Multi-Source Content Extraction

Extract and compare content from multiple URLs:

```python
urls = [
    "https://example.com/article1",
    "https://example.com/article2",
    "https://example.com/article3"
]

response = client.responses.create(
    model="openai/gpt-oss-120b",
    input=f"""Extract content from: {', '.join(urls)}

    Analyze and compare:
    - Main themes
    - Key differences in perspective
    - Common facts
    - Author conclusions""",
    tools=tools,
    temperature=0.1,
)

print(response.output_text)
```

Available Tavily Tools

| Tool | Description |
|------|-------------|
| tavily_search | Search with advanced filters (time, depth, topic, max results) |
| tavily_extract | Extract full content from specific URLs |
| tavily_scrape | Scrape single pages with clean output |
| tavily_batch_scrape | Scrape multiple URLs in parallel |
| tavily_crawl | Crawl websites with depth and pattern controls |
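The crawl tool has no example above; as with the other examples, its controls can be expressed in the prompt itself. A sketch (the helper, URL, and control values are hypothetical, following the same pattern as the earlier examples):

```python
def build_crawl_prompt(url: str, max_depth: int, pattern: str) -> str:
    """Compose a natural-language instruction asking the agent to use tavily_crawl."""
    return (
        f"Crawl {url} with tavily_crawl.\n"
        f"- max_depth: {max_depth}\n"
        f"- only follow pages matching: {pattern}\n"
        "Summarize each page you visit."
    )

prompt = build_crawl_prompt("https://example.com/docs", 2, "/docs/*")
```

The resulting string is passed as `input=prompt` to `client.responses.create(...)` with the same `tools` list from the Quick Start.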

Search Parameters

Search Depth:

  • basic - Fast, surface-level results (under 3 seconds)
  • advanced - Comprehensive, deep results (5-10 seconds)

Time Range:

  • day, week, month, year

Topic:

  • general, news
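These parameters can be combined in a single prompt; a sketch with illustrative values (the query text is an assumption):

```python
# Illustrative parameter values combined into one tavily_search instruction
params = {
    "search_depth": "advanced",
    "time_range": "week",
    "topic": "news",
    "max_results": 5,
}
settings = "\n".join(f"- {k}: {v}" for k, v in params.items())
prompt = f"Find this week's open-source LLM coverage.\nUse tavily_search with:\n{settings}"
```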

Challenge: Build an automated content curation system that monitors news sources, filters by relevance, extracts key information, generates summaries, and publishes daily digests!
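A starting point for the challenge, as a sketch (the helper name and source URLs are hypothetical; it reuses the `client` and `tools` objects from the Quick Start):

```python
def build_digest_prompt(sources, topic):
    """Compose a daily-digest instruction over a list of news sources (hypothetical helper)."""
    urls = "\n".join(f"- {u}" for u in sources)
    return (
        f"Monitor these sources for {topic} news:\n{urls}\n"
        "Use tavily_search (time_range: day) to find today's stories, "
        "tavily_extract to pull full articles, then write a ranked digest "
        "with a one-paragraph summary per story."
    )

prompt = build_digest_prompt(
    ["https://example.com/ai-news", "https://example.com/research-blog"],
    "AI",
)
```

Running this daily (e.g., from a scheduler) and passing `prompt` as `input=` to `client.responses.create(...)` covers the monitor, extract, and summarize steps; publishing the digest is left to you.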
