# https://console.groq.com llms.txt

- [JigsawStack 🧩](https://console.groq.com/docs/jigsawstack): The JigsawStack 🧩 documentation page provides an overview of the AI SDK, detailing its capabilities in automating tasks such as web scraping, OCR, and translation using Large Language Models (LLMs), as well as its features like the Prompt Engine and built-in safety guardrails. This page serves as a starting point for integrating JigsawStack into existing applications and learning about its key features and setup process.
- [Groq API Reference](https://console.groq.com/docs/api-reference): The Groq API Reference provides detailed information on the available APIs, endpoints, and parameters for interacting with Groq's services, enabling developers to integrate and utilize Groq's capabilities in their applications. This reference guide serves as a comprehensive resource for understanding and using the Groq API effectively.
- [Rate Limits](https://console.groq.com/docs/rate-limits): This documentation page explains rate limits, which regulate how frequently users and applications can access the API within specified timeframes to ensure service stability, fair access, and protection against misuse. It provides details on understanding, viewing, and handling rate limits, as well as options for requesting higher limits.
- [Wolfram‑Alpha Integration](https://console.groq.com/docs/wolfram-alpha): The Wolfram‑Alpha Integration documentation page provides information on how to integrate Wolfram's computational knowledge engine with Groq models, enabling them to access precise calculations and structured knowledge for mathematical, scientific, and engineering computations. This page outlines the supported models, setup process, and usage examples for leveraging Wolfram‑Alpha integration in Groq applications.
- [Model Permissions](https://console.groq.com/docs/model-permissions): This page provides information on configuring model permissions to limit which models can be used at the organization and project level, allowing for fine-grained control over model access. Model permissions can be set using "Only Allow" or "Only Block" strategies to either allow or block specific models.
- [Overview Refresh: Page (mdx)](https://console.groq.com/docs/overview-refresh): The "Overview Refresh: Page (mdx)" page is used to provide an overview of the refresh functionality for MDX pages. This page currently does not have any content to display, indicating that it may be under development or require additional information.
- [Model Context Protocol (MCP)](https://console.groq.com/docs/mcp): The Model Context Protocol (MCP) is an open-source standard that enables AI applications to connect with external systems like databases, APIs, and tools, providing a standardized way for AI models to access and interact with data and workflows. This documentation page explains how to use MCP to integrate AI models with various external systems and tools.
- [Visit Website](https://console.groq.com/docs/visit-website): The "Visit Website" page provides information on how to use Groq's website visiting tool, which allows supported models to access and analyze content from publicly accessible websites. This tool enables models to retrieve and process website content, providing detailed analysis and responses based on the actual page content.
- [FlutterFlow + Groq: Fast & Powerful Cross-Platform Apps](https://console.groq.com/docs/flutterflow): This documentation page provides a guide on integrating FlutterFlow with Groq to build fast and powerful cross-platform apps with AI capabilities. It outlines the benefits of the integration and offers a step-by-step quick start guide to get started with building AI-powered apps using FlutterFlow and Groq.
- [Billing FAQs](https://console.groq.com/docs/billing-faqs): The "Billing FAQs" page provides answers to common questions about Groq's billing model, including upgrading to the Developer tier, understanding billing cycles, and progressive billing thresholds. This page helps users navigate the billing process, covering topics such as benefits of upgrading, downgrading, and special billing considerations for customers in India.
- [Supported Models](https://console.groq.com/docs/models): This page provides an overview of the models supported on GroqCloud, including production models, preview models, and deprecated models, to help users choose the right models for their applications. It also offers code examples and API endpoints to access and retrieve a list of available models.
- [Models: Featured Cards (tsx)](https://console.groq.com/docs/models/featured-cards): This page provides an overview of featured cards showcasing various AI systems, including their capabilities, modalities, and performance metrics. It highlights specific models, such as Groq Compound and OpenAI GPT-OSS 120B, and their respective features and functionalities.
- [Models: Models (tsx)](https://console.groq.com/docs/models/models): This page provides detailed information on available models, including their speeds, pricing, and technical specifications. It lists model attributes such as speed, pricing, rate limits, context window, maximum completion tokens, and maximum file size to help users choose and utilize the models effectively.
- [Projects](https://console.groq.com/docs/projects): The "Projects" page provides a framework for managing multiple applications, environments, and teams within a single Groq account, enabling organizations to isolate workloads and gain granular control over resources, costs, and access permissions. This page guides users in creating and managing projects to organize their work, track spending, and collaborate with teams.
- [Qwen3 32b: Page (mdx)](https://console.groq.com/docs/model/qwen3-32b): The Qwen3 32b page provides information and documentation for the Qwen3 32b model, likely a type of AI or machine learning model. This page currently does not contain any relevant information or content.
- [Deepseek R1 Distill Qwen 32b: Model (tsx)](https://console.groq.com/docs/model/deepseek-r1-distill-qwen-32b): The Deepseek R1 Distill Qwen 32b model page provides information on a distilled version of DeepSeek's R1 model, fine-tuned from the Qwen-2.5-32B base model, which leverages knowledge distillation to retain robust reasoning capabilities while enhancing efficiency. This page details the model's key features, including exceptional performance on mathematical and logical reasoning tasks, a massive 128K context window, native tool use, and JSON mode support.
- [Llama Prompt Guard 2 86m: Page (mdx)](https://console.groq.com/docs/model/llama-prompt-guard-2-86m): The Llama Prompt Guard 2 86m page provides information and guidelines for utilizing the Llama Prompt Guard 2 model with 86 million parameters. This page serves as a resource for understanding the capabilities and implementation of this specific model variant.
- [Key Technical Specifications](https://console.groq.com/docs/model/meta-llama/llama-prompt-guard-2-86m): This documentation page provides key technical specifications for Llama Prompt Guard 2, detailing its architecture, performance metrics, and capabilities for detecting and preventing malicious prompt attacks on LLM applications. It outlines essential information for integrating and optimizing the model for enhanced security.
- [Key Technical Specifications](https://console.groq.com/docs/model/meta-llama/llama-prompt-guard-2-22m): This documentation page provides key technical specifications and usage guidelines for Llama Prompt Guard 2, a model designed to detect and prevent malicious prompt attacks on LLM applications. It outlines the model's architecture, performance metrics, and best practices for implementation and integration.
- [Llama 4 Scout 17b 16e Instruct: Model (tsx)](https://console.groq.com/docs/model/meta-llama/llama-4-scout-17b-16e-instruct): The Llama 4 Scout 17b 16e Instruct model page provides information on Meta's 17 billion parameter mixture-of-experts model, featuring native multimodality for text and image understanding, instruction-tuned for tasks like chat, visual reasoning, and coding. This page details the model's capabilities, including its 128K token context length and industry-leading inference speed on Groq.
- [Llama 4 Maverick 17b 128e Instruct: Model (tsx)](https://console.groq.com/docs/model/meta-llama/llama-4-maverick-17b-128e-instruct): This page provides documentation for the Llama 4 Maverick 17b 128e Instruct model, a 17 billion parameter mixture-of-experts model with native multimodality for text and image understanding. The model is optimized for tasks such as assistant-like chat, visual reasoning, and coding, and offers industry-leading inference speed on Groq.
- [Key Technical Specifications](https://console.groq.com/docs/model/meta-llama/llama-guard-4-12b): This documentation page provides an overview of the key technical specifications for Llama-Guard-4-12B, a specialized multimodal content moderation model designed to identify and classify potentially harmful content. It outlines the model's architecture, performance metrics, and use cases for content moderation and AI safety applications.
- [Qwen3 32b: Model (tsx)](https://console.groq.com/docs/model/qwen/qwen3-32b): The Qwen3 32B model page provides an overview of the latest generation large language model, offering advancements in reasoning, instruction-following, and multilingual support. This page details the model's key features, including its ability to switch between thinking and non-thinking modes for various applications.
- [Key Technical Specifications](https://console.groq.com/docs/model/whisper-large-v3): This documentation page provides key technical specifications and details for the Whisper Large v3 model, including its architecture, performance metrics, and usage guidelines for various applications. It outlines the model's capabilities, such as high-accuracy transcription, multilingual support, and handling challenging audio conditions, to help users understand its optimal use cases and best practices.
- [Key Technical Specifications](https://console.groq.com/docs/model/whisper-large-v3-turbo): This documentation page provides key technical specifications and model details for the Whisper Large v3 Turbo speech-to-text model, highlighting its optimized architecture for speed, performance metrics, and use cases. It outlines the model's capabilities, including multilingual support, real-time transcription, and cost-effective solutions for various applications.
- [Llama 3.3 70b Versatile: Model (tsx)](https://console.groq.com/docs/model/llama-3.3-70b-versatile): This page provides information about the Llama 3.3 70b Versatile model, a multilingual large language model with 70 billion parameters optimized for various natural language processing tasks. It details the model's capabilities, performance, and applications.
- [Llama3 70b 8192: Model (tsx)](https://console.groq.com/docs/model/llama3-70b-8192): This page provides documentation for the Llama3 70b 8192 model, a production-ready foundation model optimized for dialogue and content-generation tasks. The model offers a balance of performance and speed, delivering fast and consistent outputs through the Groq API.
- [Key Technical Specifications](https://console.groq.com/docs/model/distil-whisper-large-v3-en): This documentation page provides key technical specifications and details for Distil-Whisper Large v3, a speech-to-text model built on the encoder-decoder transformer architecture, highlighting its performance metrics, model details, and best practices for usage. The page outlines the model's capabilities, use cases, and optimization strategies for real-time transcription, content processing, and interactive applications.
- [Llama3 8b 8192: Model (tsx)](https://console.groq.com/docs/model/llama3-8b-8192): This page provides documentation for the Llama3 8b 8192 model, specifically its integration and usage on Groq hardware. The Llama-3-8B-8192 model delivers exceptional performance with industry-leading speed and cost-efficiency.
- [Key Technical Specifications](https://console.groq.com/docs/model/openai/gpt-oss-20b): This documentation page provides an overview of the key technical specifications, use cases, and best practices for the GPT-OSS 20B model, a Mixture-of-Experts (MoE) architecture with 20B total parameters. It serves as a reference for developers and users to understand the model's capabilities, performance metrics, and optimal deployment strategies.
- [Key Technical Specifications](https://console.groq.com/docs/model/openai/gpt-oss-120b): This documentation page provides an overview of the key technical specifications, use cases, and best practices for the GPT-OSS 120B model, a Mixture-of-Experts (MoE) architecture with 120B total parameters and exceptional performance across various benchmarks. The page serves as a guide for developers and researchers to understand the model's capabilities and optimize its use in applications such as agentic systems, research, and multilingual AI assistants.
- [Key Technical Specifications](https://console.groq.com/docs/model/openai/gpt-oss-safeguard-20b): This documentation page provides an overview of the key technical specifications for GPT-OSS-Safeguard 20B, a model designed for safety classification tasks with support for custom policy interpretation and transparent reasoning. The page covers model architecture, performance metrics, use cases, and best practices for implementing the model in various applications.
- [Mistral Saba 24b: Model (tsx)](https://console.groq.com/docs/model/mistral-saba-24b): This page provides details about the Mistral Saba 24b model, a specialized multilingual model optimized for Arabic, Farsi, Urdu, Hebrew, and Indic languages. The model features a 32K token context window and tool use capabilities, enabling strong performance across various languages.
- [Llama Prompt Guard 2 22m: Page (mdx)](https://console.groq.com/docs/model/llama-prompt-guard-2-22m): The Llama Prompt Guard 2 22m page provides information on safeguarding prompts for the Llama model. This page currently does not contain any relevant details.
- [Llama 4 Scout 17b 16e Instruct: Page (mdx)](https://console.groq.com/docs/model/llama-4-scout-17b-16e-instruct): The Llama 4 Scout 17b 16e Instruct page provides information and guidance on utilizing the Llama 4 Scout 17b 16e Instruct model for specific tasks. This page is currently under development or has no available content to display.
- [Llama 3.3 70b Specdec: Model (tsx)](https://console.groq.com/docs/model/llama-3.3-70b-specdec): The Llama 3.3 70b Specdec model is Groq's speculative decoding version of Meta's Llama 3.3 70B model, optimized for high-speed inference while maintaining high quality. This model delivers exceptional performance with significantly reduced latency, making it ideal for real-time applications.
- [Llama 4 Maverick 17b 128e Instruct: Page (mdx)](https://console.groq.com/docs/model/llama-4-maverick-17b-128e-instruct): The Llama 4 Maverick 17b 128e Instruct page provides information and documentation for the specified model, likely including its architecture, training data, and usage guidelines. This page serves as a central resource for developers and researchers working with the Llama 4 Maverick 17b 128e Instruct model.
- [Key Technical Specifications](https://console.groq.com/docs/model/allam-2-7b): This documentation page provides key technical specifications, use cases, best practices, and getting started information for the ALLaM-2-7B model, a bilingual Arabic-English autoregressive transformer. The page details the model's architecture, performance metrics, and application areas, including Arabic language technology, research, and development.
- [Deepseek R1 Distill Llama 70b: Model (tsx)](https://console.groq.com/docs/model/deepseek-r1-distill-llama-70b): The Deepseek R1 Distill Llama 70b model page provides information on a distilled version of DeepSeek's R1 model, fine-tuned from the Llama-3.3-70B-Instruct base model for robust reasoning capabilities and exceptional performance on mathematical and logical tasks. This page offers details on leveraging Groq's industry-leading speed with this model.
- [Qwen 2.5 Coder 32b: Model (tsx)](https://console.groq.com/docs/model/qwen-2.5-coder-32b): The Qwen 2.5 Coder 32b model is a specialized AI model fine-tuned for code generation and development tasks, built on 5.5 trillion tokens of code and technical content. This model delivers instant, production-quality code generation capabilities comparable to GPT-4.
- [Llama 3.2 1b Preview: Model (tsx)](https://console.groq.com/docs/model/llama-3.2-1b-preview): The Llama 3.2 1b Preview model page provides information on a fast and cost-effective language model with 1.23 billion parameters, suitable for high-throughput applications such as text analysis, information retrieval, and content summarization. This page details the model's capabilities, including its 128K context window, speed, accuracy, and use cases for rapid prototyping and content processing.
- [Key Technical Specifications](https://console.groq.com/docs/model/playai-tts-arabic): This documentation page provides key technical specifications, use cases, best practices, and limitations for a text-to-speech model, offering guidance on its application in various scenarios such as creative content generation, voice agentic experiences, and customer support. It also covers essential considerations for responsible use, data privacy, and bias mitigation to ensure effective and fair utilization of the model.
- [Llama 3.2 3b Preview: Model (tsx)](https://console.groq.com/docs/model/llama-3.2-3b-preview): The Llama 3.2 3b Preview model page provides details on a fast and balanced AI model with 3.1 billion parameters, suitable for tasks like content creation, summarization, and information retrieval. This page offers information on the model's capabilities, ideal use cases, and performance benefits for real-time applications.
- [Qwen Qwq 32b: Model (tsx)](https://console.groq.com/docs/model/qwen-qwq-32b): The Qwen Qwq 32b model page provides information on a 32-billion parameter reasoning model, detailing its competitive performance and speed capabilities. This page offers an overview of the model's key features, including its performance, speed, and technical details, as well as resources for further exploration.
- [Key Technical Specifications](https://console.groq.com/docs/model/gemma2-9b-it): This documentation page provides an overview of the key technical specifications for the Gemma 2 9B IT model, including its architecture, performance metrics, and optimal usage guidelines. It details the model's capabilities, performance benchmarks, and best practices for deployment in various applications.
- [Llama Guard 4 12b: Page (mdx)](https://console.groq.com/docs/model/llama-guard-4-12b): This page provides information on Llama Guard 4 12b, a component designed to safeguard interactions and ensure compliance with predefined guidelines. The details on this page are presented in Markdown format for easy readability and understanding.
- [Llama Guard 3 8b: Model (tsx)](https://console.groq.com/docs/model/llama-guard-3-8b): The Llama Guard 3 8b model, built on the Llama framework, is a specialized content moderation tool designed to identify and filter potentially harmful content. This page provides information on utilizing the model with Groq's high-speed AI processing capabilities for content moderation applications.
- [Key Technical Specifications](https://console.groq.com/docs/model/moonshotai/kimi-k2-instruct-0905): This documentation page outlines the key technical specifications, performance metrics, and best practices for the Kimi-K2-Instruct-0905 model, a cutting-edge AI model built on a Mixture-of-Experts architecture. It provides essential information for developers to understand the model's capabilities and leverage its features for enhanced frontend development, advanced agent scaffolds, tool calling, and full-stack development.
- [Kimi K2 Version](https://console.groq.com/docs/model/moonshotai/kimi-k2-instruct): The Kimi K2 Version page provides information on the current model version, which redirects to the latest 0905 version, offering improved performance, 256K context, and enhanced tool use capabilities. This page details the technical specifications, use cases, and best practices for utilizing the Kimi K2 model, particularly its advanced coding, reasoning, and multilingual capabilities.
- [Qwen 2.5 32b: Model (tsx)](https://console.groq.com/docs/model/qwen-2.5-32b): This page provides information about the Qwen 2.5 32b model, a flagship AI model developed by Alibaba with GPT-4 level capabilities and near-instant response times. The page details the model's key features, including its performance in creative writing and complex reasoning tasks, as well as its availability for use on the Groq Hosted AI Models website.
- [Key Technical Specifications](https://console.groq.com/docs/model/playai-tts): The "Key Technical Specifications" page provides an overview of the PlayAI Dialog v1.0 model, including its architecture, training data, and key use cases, to help developers understand its capabilities and limitations. This page serves as a technical reference for integrating and utilizing the PlayAI Dialog model for various applications, such as creative content generation, voice agentic experiences, and customer support.
- [Llama 3.1 8b Instant: Model (tsx)](https://console.groq.com/docs/model/llama-3.1-8b-instant): The Llama 3.1 8b Instant model on Groq provides rapid response times with production-grade reliability, making it suitable for latency-sensitive applications. This model balances efficiency and performance for use cases such as chat interfaces, content filtering systems, and large-scale data processing workloads.
- [Compound Beta: Page (mdx)](https://console.groq.com/docs/agentic-tooling/compound-beta): The Compound Beta: Page (mdx) documentation provides information on utilizing Markdown extensions (mdx) to create and customize pages within the Compound Beta framework. This page serves as a resource for developers looking to leverage mdx for page creation.
- [Agentic Tooling: Page (mdx)](https://console.groq.com/docs/agentic-tooling): This page provides information on utilizing Agentic Tooling with Markdown extensions (MDX) to create interactive and dynamic content. It serves as a resource for developers looking to integrate agentic capabilities into their MDX pages.
- [Compound Beta Mini: Page (mdx)](https://console.groq.com/docs/agentic-tooling/compound-beta-mini): The Compound Beta Mini: Page (mdx) documentation provides information on utilizing the MDX page type within the Compound Beta Mini framework. This page currently does not have any specific content or details to display.
- [Compound: Page (mdx)](https://console.groq.com/docs/agentic-tooling/groq/compound): The Page compound is a basic building block for creating content pages in MDX format. It serves as a container for page content, providing a structured layout for displaying text, images, and other media.
- [Compound Mini: Page (mdx)](https://console.groq.com/docs/agentic-tooling/groq/compound-mini): The Compound Mini: Page (mdx) component is a Markdown extension (mdx) page type used for creating and displaying content within the Compound Mini application. This page type allows developers to author content using Markdown syntax and render it within the application.
- [✨ Vercel AI SDK + Groq: Rapid App Development](https://console.groq.com/docs/ai-sdk): This documentation page provides a comprehensive guide on integrating Vercel's AI SDK with Groq for rapid app development, enabling developers to leverage powerful language models for various applications. It offers a step-by-step quick start guide in JavaScript to deploy a scalable and high-speed application within minutes.
- [Parallel + Groq: Fast Web Search for Real-Time AI Research](https://console.groq.com/docs/parallel): This documentation page provides a guide on integrating Parallel's real-time web search capabilities with Groq's high-speed inference to build AI research agents that can quickly find and analyze current information. It offers a step-by-step quick start guide, code examples, and advanced use cases for applications such as multi-company comparisons, real-time market data retrieval, and breaking news monitoring.
- [Tavily + Groq: Real-Time Search, Scraping & Crawling for AI](https://console.groq.com/docs/tavily): This documentation page provides a guide on integrating Tavily's real-time search, scraping, and crawling API with Groq's ultra-fast inference to build intelligent agents that can research topics, monitor websites, and extract structured data. It offers a comprehensive overview of the combined capabilities, key features, and example use cases for leveraging Tavily and Groq to accelerate AI-driven data extraction and research tasks.
- [AutoGen + Groq: Building Multi-Agent AI Applications](https://console.groq.com/docs/autogen): This documentation page provides a guide on building multi-agent AI applications using AutoGen and Groq, enabling developers to create sophisticated AI agents that collaborate to solve complex tasks quickly. It covers key features, including multi-agent orchestration, tool integration, and flexible workflows, along with code examples for getting started with the technology.
- [Content Moderation](https://console.groq.com/docs/content-moderation): This documentation page provides an overview of content moderation, a crucial aspect of ensuring safe and responsible use of models by detecting and filtering harmful or unwanted content in user prompts and model responses. It also introduces various content moderation models offered by Groq, including policy-following and prebaked safety models.
- [Browser Automation](https://console.groq.com/docs/browser-automation): This documentation page provides information on browser automation, a feature that enables certain Groq models to launch and control multiple browsers simultaneously to gather comprehensive information from various sources. It covers the supported models, setup instructions, and how browser automation works to facilitate parallel web research and deeper analysis.
- [Understanding and Optimizing Latency on Groq](https://console.groq.com/docs/production-readiness/optimizing-latency): This guide provides a comprehensive overview of latency in Groq-powered applications, helping developers understand, measure, and optimize latency for production deployment. It covers key metrics, factors affecting latency, and strategies for optimizing performance in Large Language Model (LLM) applications.
- [Security Onboarding](https://console.groq.com/docs/production-readiness/security-onboarding): The **Security Onboarding** guide provides best practices and recommendations for securing API keys, client configurations, and integrations when using Groq's API, ensuring a secure and reliable experience. This page outlines shared security responsibilities between Groq and customers, covering key management, transport security, input safety, and incident response.
- [Production-Ready Checklist for Applications on GroqCloud](https://console.groq.com/docs/production-readiness/production-ready-checklist): The "Production-Ready Checklist for Applications on GroqCloud" provides a comprehensive guide for deploying and scaling LLM applications on GroqCloud, covering critical aspects such as model selection, performance optimization, monitoring, and cost management. This checklist helps developers ensure their Groq-powered applications are launched and scaled with confidence, minimizing common pitfalls and optimizing user experience, operational costs, and system reliability.
- [Quickstart](https://console.groq.com/docs/quickstart): The Quickstart guide provides a step-by-step introduction to getting started with the Groq API, covering essential setup and usage procedures. This page walks you through creating an API key, setting it up securely, and making your first API request using various programming languages and third-party libraries.
- [Structured Outputs](https://console.groq.com/docs/structured-outputs): Structured Outputs guarantees model responses strictly conform to a provided JSON schema, ensuring reliable and type-safe data structures by throwing an error if the model cannot produce a compliant response. This feature enables binary output, type-safe responses, and programmatic refusal detection, simplifying the process of obtaining structured information from unstructured text.
- [Speech to Text](https://console.groq.com/docs/speech-to-text): The "Speech to Text" documentation page provides an overview of the Groq API's speech-to-text solution, offering OpenAI-compatible endpoints for near-instant transcriptions and translations. This page guides developers on integrating high-quality audio processing into their applications using the API's endpoints, supported models, and audio file limitations.
- [Agno + Groq: Fast Agents](https://console.groq.com/docs/agno): This documentation page provides a guide on building fast agents using Agno and Groq, enabling the creation of multi-modal agents that can perform tasks such as searching knowledge stores, understanding images, and generating structured outputs. It offers a quick start tutorial and examples for building simple agents and multi-agent teams.
- [🚅 LiteLLM + Groq for Production Deployments](https://console.groq.com/docs/litellm): This documentation page provides a guide on integrating LiteLLM with Groq for production deployments, enabling features such as cost management, smart caching, and spend tracking. It offers a quick start tutorial and next steps for configuring advanced features and building production-ready applications with LiteLLM and Groq.
- [Text to Speech](https://console.groq.com/docs/text-to-speech): The Text to Speech documentation page provides information on how to use the Groq API to convert text into lifelike audio using fast text-to-speech (TTS) models. This page guides developers on generating high-quality audio content for various applications, such as customer support agents and game development characters.
- [Prometheus Metrics](https://console.groq.com/docs/prometheus-metrics): This documentation page provides information on accessing and utilizing Prometheus metrics for monitoring Groq's usage, specifically for Enterprise tier customers. It outlines the available metrics, APIs, and querying methods, including integration with Grafana and usage of MetricsQL.
- [Groq Batch API](https://console.groq.com/docs/batch): The Groq Batch API enables large-scale asynchronous processing of API requests, allowing users to submit batches of requests for processing within a 24-hour to 7-day window at a 50% cost discount compared to synchronous APIs. This API is ideal for use cases that don't require immediate responses, such as processing large datasets, generating content in bulk, and running evaluations.
- [Changelog](https://console.groq.com/docs/legacy-changelog): The Groq Changelog provides a chronological record of updates, releases, and developments to the Groq API, allowing users to track changes and stay informed about new features and models. This page lists updates in reverse chronological order, with the most recent changes appearing at the top.
- [Arize + Groq: Open-Source AI Observability](https://console.groq.com/docs/arize): This documentation page provides a guide on integrating Arize Phoenix, an open-source AI observability library, with Groq-powered applications to gain deep insights into LLM workflow performance and behavior. It outlines the features and steps to set up automatic tracing, real-time monitoring, and evaluation frameworks for Groq applications using Arize Phoenix.
- [Responses API](https://console.groq.com/docs/responses-api): The Responses API page provides documentation for integrating advanced conversational AI capabilities into applications using Groq's API, which is fully compatible with OpenAI's Responses API. This page covers configuration, features, and usage guidelines for the Responses API, including text and image inputs, stateful conversations, function calling, and more.
- [Images and Vision](https://console.groq.com/docs/vision): The "Images and Vision" documentation page provides information on utilizing Groq API's multimodal models with vision capabilities to analyze and interpret visual data from images, generating human-readable text and insights. This page guides developers on how to integrate image processing features, such as visual question answering and Optical Character Recognition (OCR), into their applications using supported models.
- [Assistant Message Prefilling](https://console.groq.com/docs/prefilling): This page provides information on Assistant Message Prefilling, a technique used with Groq API to control model output by prefilling `assistant` messages. It allows users to direct text-to-text models to skip introductions, enforce output formats, and maintain conversation consistency.
- [OpenAI Compatibility](https://console.groq.com/docs/openai): This page provides information on the compatibility of Groq API with OpenAI's client libraries, enabling users to easily configure their existing OpenAI applications to run on Groq. It covers configuration, unsupported features, and additional resources for migrating to Groq's API.
- [Prompt Caching](https://console.groq.com/docs/prompt-caching): This documentation page explains Prompt Caching, a feature that automatically reuses computation from recent API requests with shared prefixes to deliver cost savings and improved response times. It provides an overview of how prompt caching works, its benefits, supported models, and best practices for structuring prompts to optimize caching.
- [Firecrawl + Groq: AI-Powered Web Scraping & Data Extraction](https://console.groq.com/docs/firecrawl): This documentation page provides a guide on integrating Firecrawl, an enterprise-grade web scraping platform, with Groq's fast inference capabilities to build intelligent agents for scraping websites, extracting structured data, and conducting deep research. The page offers a quick start guide, code examples, and advanced use cases for AI-powered web scraping and data extraction using Firecrawl and Groq.
- [Built-in Tools](https://console.groq.com/docs/compound/built-in-tools): The "Built-in Tools" page provides an overview of the comprehensive set of tools that come equipped with Compound systems, enabling users to access real-time information, computational power, and interactive environments. This page details the default and available tools, their configurations, and usage guidelines for Compound system versions.
- [Compound](https://console.groq.com/docs/compound): The Compound documentation page provides information on advanced AI systems that solve problems by taking action and intelligently using external tools, such as web search and code execution, alongside powerful large language models. This page details the capabilities, available Compound systems, and usage guidelines for integrating these systems into applications.
- [Use Cases](https://console.groq.com/docs/compound/use-cases): The "Use Cases" documentation page provides an overview of various applications and scenarios where Groq's compound systems can be utilized, particularly those requiring real-time information. This page explores specific use cases, including real-time fact checking, chart generation, natural language calculation, and code debugging, highlighting the capabilities and benefits of using Groq's compound systems.
- [Search Settings: Page (mdx)](https://console.groq.com/docs/compound/search-settings): The "Search Settings: Page (mdx)" documentation page provides information on configuring search settings for MDX pages. This page is currently not populated with content, but is intended to guide users on customizing search functionality for their MDX pages.
- [Compound Beta: Page (mdx)](https://console.groq.com/docs/compound/systems/compound-beta): The Compound Beta: Page (mdx) documentation provides information on utilizing Markdown extensions (MDX) to create and customize pages within the Compound Beta framework. This page serves as a resource for developers looking to leverage MDX for page creation.
- [Systems](https://console.groq.com/docs/compound/systems): This documentation page provides an overview of Groq's compound AI systems, including the Compound and Compound Mini models, which utilize external tools to enhance response accuracy and capability. The page details the features, use cases, and differences between these two systems, helping users choose the most suitable option for their specific needs.
- [Compound Beta Mini: Page (mdx)](https://console.groq.com/docs/compound/systems/compound-beta-mini): The Compound Beta Mini: Page (mdx) documentation provides information on utilizing the MDX page type within the Compound Beta Mini framework. This page currently does not have any specific content or details to display.
- [Key Technical Specifications](https://console.groq.com/docs/compound/systems/compound): The "Key Technical Specifications" page provides an overview of Compound's technical capabilities, including its model architecture and performance metrics. This page details Compound's core features, such as intelligent reasoning, tool use, and benchmark performance.
- [Key Technical Specifications](https://console.groq.com/docs/compound/systems/compound-mini): This documentation page provides an overview of the key technical specifications for Compound Mini, a model powered by Llama 3.3 70B and GPT-OSS 120B for intelligent reasoning and tool use. It covers details on model architecture, performance metrics, use cases, best practices, and other essential information for utilizing Compound Mini effectively.
- [E2B + Groq: Open-Source Code Interpreter](https://console.groq.com/docs/e2b): The E2B + Groq: Open-Source Code Interpreter documentation page provides a guide on using the E2B SDK to create secure, sandboxed environments for executing code generated by LLMs via the Groq API. This page offers a Python quick start tutorial, example code, and resources for building AI-driven applications that generate and execute code in real-time.
- [Anchor Browser + Groq: Blazing Fast Browser Agents](https://console.groq.com/docs/anchorbrowser): This documentation page provides a quickstart guide on using Anchor Browser with Groq to create blazing-fast browser agents for automating web interactions, such as data collection, using AI-powered browser automation. It outlines the prerequisites, setup, and usage examples for integrating Anchor Browser with Groq's fast inference to simplify browser-based automations.
- [🎨 Gradio + Groq: Easily Build Web Interfaces](https://console.groq.com/docs/gradio): This documentation page provides a guide on integrating Gradio with Groq to easily build web interfaces for fast Groq applications, enabling the creation of interactive demos and shareable apps. It offers a quick start tutorial and resources for building robust applications with Gradio and Groq, including multimodal chatbots with text, audio, and vision models.
- [Browser Use + Groq: Intelligent Web Research & Product Comparison](https://console.groq.com/docs/browseruse): This documentation page provides instructions and examples for integrating Browser Use and Groq to build intelligent web research agents that can autonomously browse the web, extract information, and compare products across multiple sources. It guides developers through setting up the required packages, API keys, and creating research agents that can deliver comprehensive insights in seconds.
- [HuggingFace + Groq: Real-Time Model & Dataset Discovery](https://console.groq.com/docs/huggingface): This documentation page provides instructions and examples for integrating HuggingFace's model and dataset repository with Groq's fast inference capabilities, enabling real-time discovery, analysis, and recommendations of AI models and datasets. The guide covers key features such as trending models, smart recommendations, and dataset exploration, along with a quick start guide and advanced examples for building intelligent agents.
- [Introduction to Tool Use](https://console.groq.com/docs/tool-use): This page provides an introduction to tool use in Large Language Models (LLMs), a feature that enables LLMs to interact with external resources and perform actions beyond simple text generation. It covers supported models, agentic tooling, and how tool use works with the Groq API.
- [Google Cloud Private Service Connect](https://console.groq.com/docs/security/gcp-private-service-connect): This page provides a guide on setting up Google Cloud Private Service Connect (PSC) to securely access Groq's API services through private network connections, eliminating exposure to the public internet. It outlines the overview, prerequisites, and step-by-step setup process for configuring PSC endpoints.
- [Reasoning](https://console.groq.com/docs/reasoning): The "Reasoning" page provides information on utilizing reasoning models for complex problem-solving tasks that require step-by-step analysis and logical deduction. This documentation covers supported models, reasoning formats, and API parameters for controlling the presentation of a model's decision-making process.
- [Your Data in GroqCloud](https://console.groq.com/docs/your-data): This page provides information on how Groq handles customer data in GroqCloud, including the types of data retained, circumstances for retention, and available controls. It helps users understand and manage data retention settings through the Data Controls settings.
- [Browser Search](https://console.groq.com/docs/browser-search): The "Browser Search" documentation page provides information on using built-in browser search functionality with supported models on Groq, allowing for interactive web content access and more comprehensive search results. This page covers features, supported models, usage guidelines, and best practices for leveraging browser search capabilities.
- [Integrations: Button Group (tsx)](https://console.groq.com/docs/integrations/button-group): The "Integrations: Button Group (tsx)" page provides information on creating a button group, a collection of buttons displayed together, and its properties. This documentation outlines the structure and customization options for integration buttons within the button group.
- [What are integrations?](https://console.groq.com/docs/integrations): This documentation page explains what integrations are and how they can enhance Groq-powered applications by connecting to external services. It provides an overview of available integration categories to help users find suitable tools for their needs.
- [Integrations: Integration Buttons (ts)](https://console.groq.com/docs/integrations/integration-buttons): This page documents the integration buttons available for various categories, including AI agent frameworks, browser automation, and LLM app development. The integration buttons are defined in the `integrationButtons` object, which maps integration groups to arrays of button configurations.
- [🦜️🔗 LangChain + Groq](https://console.groq.com/docs/langchain): This documentation page provides a guide on integrating LangChain with Groq, a fast inference API, to build sophisticated applications with Large Language Models (LLMs). It outlines the benefits and capabilities of combining LangChain components, such as chains, prompt templates, and tools, with Groq's API for accelerated LLM inference.
- [xRx + Groq: Easily Build Rich Multi-Modal Experiences](https://console.groq.com/docs/xrx): This documentation page provides a guide on how to use xRx, an open-source framework, in conjunction with Groq to build rich multi-modal experiences, enabling developers to create AI-powered applications with seamless text, voice, and other interaction forms. The page offers a quick start guide, including setup instructions and example applications, to help developers get started with integrating xRx and Groq.
- [🗂️ LlamaIndex 🦙](https://console.groq.com/docs/llama-index): This page provides an overview of LlamaIndex, a data framework for building LLM-based applications that leverage context augmentation, such as Retrieval-Augmented Generation (RAG) systems.
It outlines how LlamaIndex enables the ingestion, structuring, and access of private or domain-specific data for safe and reliable injection into LLMs.
- [Mastra + Groq: Build Production AI Agents & Workflows](https://console.groq.com/docs/mastra): This documentation page provides a guide on integrating Mastra, a TypeScript framework for building production-ready AI applications, with Groq's fast inference to create sophisticated AI agents and workflows. It outlines the key features and capabilities of the combined solution, along with a step-by-step quick start guide and advanced examples for building AI systems.
- [CrewAI + Groq: High-Speed Agent Orchestration](https://console.groq.com/docs/crewai): This documentation page provides a guide on integrating CrewAI, a framework for orchestrating multiple AI agents, with Groq's high-speed inference capabilities to enable rapid autonomous decision-making and collaboration. It outlines the benefits and implementation details of using Groq with CrewAI for fast and reliable agent interactions and scalable multi-agent systems.
- [Spend Limits](https://console.groq.com/docs/spend-limits): This page provides information on setting and managing spend limits to control API costs, including automated spending limits and proactive usage alerts. It outlines the steps to set a spending limit, add usage alerts, and manage notifications to help users stay within their budget.
- [API Error Codes and Responses](https://console.groq.com/docs/errors): This documentation page provides detailed information on the error codes and responses returned by the Groq API, including standard HTTP response status codes and their corresponding descriptions. It outlines the various error codes that may be encountered, along with example response bodies and guidance on how to handle each error.
- [Toolhouse 🛠️🏠](https://console.groq.com/docs/toolhouse): This documentation page provides a step-by-step guide on how to use Toolhouse, a Backend-as-a-Service for the agentic stack, in conjunction with Groq's fast inference and models like Llama 4 Maverick and Compound Beta to build conversational and autonomous agents. It covers setup, configuration, and deployment of Toolhouse agents using the Toolhouse CLI and Groq API keys.
- [Flex Processing](https://console.groq.com/docs/flex-processing): The Flex Processing service tier is optimized for high-throughput workloads that prioritize fast inference and can handle occasional request failures, offering higher rate limits at the same pricing as on-demand processing. This tier is ideal for workloads that require rapid processing and can gracefully handle temporary request failures.
- [Text Generation](https://console.groq.com/docs/text-chat): The "Text Generation" documentation page provides an overview of generating human-like text with Groq's Chat Completions API, enabling natural conversational interactions with large language models. This page guides developers on using the API for various applications, including conversational agents, content generation, and task automation.
- [BrowserBase + Groq: Scalable Browser Automation with AI](https://console.groq.com/docs/browserbase): This documentation page explains how to integrate BrowserBase, a cloud-based headless browser infrastructure, with Groq's AI-powered natural language control for scalable browser automation. It provides a guide on setting up and using the combined solution for automating browser actions using plain English instructions.
- [LiveKit + Groq: Build End-to-End AI Voice Applications](https://console.groq.com/docs/livekit): This documentation page provides a guide on integrating LiveKit with Groq to build end-to-end AI voice applications, combining speech recognition, text-to-speech, and real-time communication features.
It offers a step-by-step tutorial on setting up a voice agent using LiveKit and Groq, enabling developers to create scalable voice applications with multi-user interactions.
- [Overview: Page (mdx)](https://console.groq.com/docs/overview): The "Page (mdx)" documentation page provides an overview of the MDX page component, outlining its purpose and functionality within the system. This page serves as a central resource for understanding the role and capabilities of MDX pages.
- [Overview](https://console.groq.com/docs/overview/content): The "Overview" page provides an introduction to the Groq API, highlighting its key features such as fast LLM inference, OpenAI compatibility, and ease of integration and scaling. This page serves as a starting point for developers to quickly get started with building applications on Groq.
- [Composio](https://console.groq.com/docs/composio): This documentation page provides an overview and setup guide for Composio, a platform for integrating tools with LLMs and AI agents, enabling fast and secure interactions with external applications. It details how to build Groq-based assistants that can seamlessly interact with various tools and applications.
- [Prompt Engineering Patterns Guide](https://console.groq.com/docs/prompting/patterns): The Prompt Engineering Patterns Guide provides a systematic approach to selecting effective prompt patterns for various tasks when working with open-source language models, ensuring improved output reliability and performance. This guide helps users choose the optimal prompt pattern for their specific task, with a focus on maximizing model performance and accuracy across applications.
- [Prompt Basics](https://console.groq.com/docs/prompting): The "Prompt Basics" guide provides fundamental principles for crafting effective prompts for open-source instruction-tuned large language models, enabling users to communicate clear instructions and expectations.
This guide covers essential concepts, including prompt building blocks, role channels, and best practices for optimizing prompt quality and model output.
- [Model Migration Guide](https://console.groq.com/docs/prompting/model-migration): The Model Migration Guide provides a comprehensive framework for transitioning prompts from commercial models to open-source ones like Llama, focusing on adjusting prompting techniques, matching generation parameters, and testing outputs. This guide outlines key principles and strategies for ensuring a smooth migration, including refactoring prompts, aligning system behavior and tone, and achieving sampling and parameter parity.
- [Exa + Groq: AI-Powered Web Search & Content Discovery](https://console.groq.com/docs/exa): This documentation page provides a guide on integrating Exa, an AI-native search engine, with Groq's fast inference capabilities to build intelligent search applications that enable AI-powered web search and content discovery. It offers tutorials, code examples, and key features for utilizing Exa and Groq to create advanced search agents for tasks such as semantic understanding, company research, and content extraction.
- [Web Search](https://console.groq.com/docs/web-search): The "Web Search" documentation page provides information on how to utilize native web search capabilities in Groq models, allowing them to access real-time web content and provide up-to-date answers. This page explains the functionality, supported systems, and usage guidelines for web search, including customization options and output formats.
- [Web Search: Countries (ts)](https://console.groq.com/docs/web-search/countries): The "Web Search: Countries (ts)" page provides a list of countries in TypeScript format, allowing developers to easily integrate a comprehensive list of nations into their web applications.
This list includes 196 countries, each represented as a string, and can be used for various purposes such as autocomplete suggestions, geographic data analysis, or validation.
- [Code Execution](https://console.groq.com/docs/code-execution): This documentation page provides information on code execution capabilities in Groq, including supported models and systems, and how to use code execution to perform calculations and run code snippets. It covers native support for automatic code execution, currently limited to Python, and details on usage with specific models and systems.
- [MLflow + Groq: Open-Source GenAI Observability](https://console.groq.com/docs/mlflow): This documentation page provides instructions and information on integrating MLflow with Groq for open-source GenAI observability, enabling users to build and monitor better Generative AI applications. It covers features such as tracing dashboards, automated tracing, and evaluation metrics, as well as a Python quick start guide for getting started with MLflow and Groq.
- [How Groq Uses Your Feedback](https://console.groq.com/docs/feedback-policy): This page explains how Groq collects, reviews, and uses feedback provided by users to improve the safety, reliability, and usefulness of GroqCloud and its products. It outlines the types of feedback collected, how it is reviewed and retained, and how it is used to enhance product quality and system safety.
- [LoRA Inference on Groq](https://console.groq.com/docs/lora): This documentation page provides information on utilizing Groq's inference services for Low-Rank Adaptation (LoRA) adapters, a Parameter-Efficient Fine-Tuning (PEFT) technique that customizes model behavior without altering base model weights. It outlines the benefits, features, and deployment options for LoRA inference on Groq's infrastructure.
- [Groq Client Libraries](https://console.groq.com/docs/libraries): The Groq Client Libraries page provides information on the official Python and JavaScript/TypeScript client libraries for accessing the Groq REST API, as well as community-developed libraries for other programming languages. These libraries offer convenient access to the Groq API, including type definitions and synchronous/asynchronous clients.