# https://console.groq.com llms.txt

- [Groq API Reference](https://console.groq.com/docs/api-reference): The Groq API Reference provides detailed documentation on the Groq API, including endpoints, parameters, and response formats. This reference guide is intended to help developers integrate and interact with the Groq API to build applications and services.
- [FlutterFlow + Groq: Fast & Powerful Cross-Platform Apps](https://console.groq.com/docs/flutterflow): This documentation page provides a guide on integrating FlutterFlow with Groq to build fast and powerful cross-platform apps with AI capabilities. It outlines a quick start process for building AI-powered apps with FlutterFlow and Groq in just 10 minutes.
- [Images and Vision](https://console.groq.com/docs/vision): The "Images and Vision" documentation page provides information on using the Groq API's multimodal models to analyze and interpret visual data from images, enabling applications such as visual question answering, caption generation, and Optical Character Recognition (OCR). This page guides developers on how to integrate vision capabilities into their applications using supported models and API endpoints.
- [E2B + Groq: Open-Source Code Interpreter](https://console.groq.com/docs/e2b): This page provides instructions and code examples for using the E2B SDK to create secure, sandboxed environments for executing code generated by LLMs via the Groq API. It guides developers in integrating E2B and Groq to build AI-powered data analysis applications, focusing on code interpretation and execution in a secure sandbox environment.
- [Responses API](https://console.groq.com/docs/responses-api): The Responses API page provides documentation for Groq's conversational AI API, which supports text and image inputs, stateful conversations, and function calling, and is compatible with OpenAI's Responses API. This page guides developers on configuring the API, using built-in tools, and implementing features such as structured outputs and reasoning.
- [Text Generation](https://console.groq.com/docs/text-chat): The "Text Generation" page provides an overview of generating human-like text with Groq's Chat Completions API, enabling natural conversational interactions with large language models for various applications. This documentation covers key concepts, including chat completions, streaming responses, and structured outputs, to help developers get started with implementing text generation capabilities.
- [Built-in Tools](https://console.groq.com/docs/compound/built-in-tools): The "Built-in Tools" page provides an overview of the comprehensive set of tools that come equipped with Compound systems, enabling users to access real-time information, computational power, and interactive environments. This page details the default and available tools, their configurations, and usage examples for customizing and utilizing these tools effectively.
- [Search Settings: Page (mdx)](https://console.groq.com/docs/compound/search-settings): The "Search Settings: Page (mdx)" documentation page provides information on configuring search settings for MDX pages. This page allows users to customize search functionality and behavior for their MDX content.
- [Key Technical Specifications](https://console.groq.com/docs/compound/systems/compound): This documentation page outlines the key technical specifications of Compound, a powerful system powered by Llama 4 Scout and GPT-OSS 120B for intelligent reasoning and tool use. It provides details on the model architecture, performance metrics, and best practices for deploying Compound for applications such as real-time web search and code execution.
- [Compound Beta: Page (mdx)](https://console.groq.com/docs/compound/systems/compound-beta): The Compound Beta: Page (mdx) documentation provides information on utilizing the MDX page type within Compound Beta. This page serves as a resource for developers and users to understand the implementation and usage of MDX pages in Compound Beta.
- [Compound Beta Mini: Page (mdx)](https://console.groq.com/docs/compound/systems/compound-beta-mini): The Compound Beta Mini: Page (mdx) documentation page appears to be a stub or placeholder, as it currently contains no actual documentation content. This page likely serves as a redirect or import hub for related documentation components.
- [Key Technical Specifications](https://console.groq.com/docs/compound/systems/compound-mini): This documentation page outlines the key technical specifications of Compound Mini, a system powered by Llama 3.3 70B and GPT-OSS 120B, highlighting its performance, capabilities, and limitations. It provides an overview of Compound Mini's features, including its intelligent reasoning, tool use, and benchmark performance.
- [Systems](https://console.groq.com/docs/compound/systems): This documentation page provides an overview of Groq's compound AI systems, including the Compound and Compound Mini systems, which utilize external tools to enhance response accuracy and capability. The page details the features, use cases, and differences between these two systems, helping users choose the most suitable one for their needs.
- [Use Cases](https://console.groq.com/docs/compound/use-cases): This documentation page provides an overview of various use cases for Groq's compound systems, highlighting their capabilities in handling real-time information and diverse tasks. It showcases solutions for applications such as real-time fact checking, chart generation, natural language calculation, and code debugging.
- [Compound](https://console.groq.com/docs/compound): The Compound documentation page provides information on Groq's advanced AI system that solves problems by taking action and intelligently using external tools, such as web search and code execution, alongside powerful large language models. This page details the capabilities, usage, and limitations of the Compound system, including its available variants, supported tools, and integration with Groq's API.
- [API Error Codes and Responses](https://console.groq.com/docs/errors): This page provides detailed information on API error codes and responses, including standard HTTP status codes and custom error codes, to help with troubleshooting and error handling. It outlines the various error codes, their descriptions, and example response bodies to facilitate understanding and resolution of API request issues.
- [FAQs](https://console.groq.com/docs/billing-faqs): The "FAQs" page provides answers to frequently asked questions about Groq's billing model, including details on progressive billing thresholds, monthly billing cycles, and payment processing. This page helps users understand how Groq's billing works, including special considerations for customers in India.
- [Projects](https://console.groq.com/docs/projects): The "Projects" page provides a framework for managing multiple applications, environments, and teams within a single Groq account, enabling organizations to isolate workloads and gain granular control over resources, costs, and access permissions. This page guides users in creating and managing projects to organize their work, track spending and usage, and control team collaboration.
- [Compound: Page (mdx)](https://console.groq.com/docs/agentic-tooling/groq/compound): The "Compound: Page (mdx)" component is a MarkdownX (MDX) page template used for creating compound pages. This page type serves as a container for displaying content, but currently, no content is available for display.
- [Compound Mini: Page (mdx)](https://console.groq.com/docs/agentic-tooling/groq/compound-mini): The Compound Mini: Page (mdx) documentation page provides information on utilizing MDX pages within the Compound Mini framework. This page serves as a resource for developers looking to integrate MDX content into their Compound Mini projects.
- [Compound Beta: Page (mdx)](https://console.groq.com/docs/agentic-tooling/compound-beta): The Compound Beta: Page (mdx) documentation provides information on utilizing the MDX page type within Compound Beta. This page serves as a resource for developers and users to understand the implementation and usage of MDX pages in Compound Beta.
- [Compound Beta Mini: Page (mdx)](https://console.groq.com/docs/agentic-tooling/compound-beta-mini): The Compound Beta Mini: Page (mdx) documentation provides information on utilizing the MDX page type within the Compound Beta Mini framework. This page currently does not contain any specific details or guidelines for use.
- [Agentic Tooling: Page (mdx)](https://console.groq.com/docs/agentic-tooling): This page provides information on Agentic Tooling in MDX format. It serves as a documentation resource for understanding and implementing Agentic Tooling using Markdown extensions.
- [Overview Refresh: Page (mdx)](https://console.groq.com/docs/overview-refresh): The "Overview Refresh: Page (mdx)" page provides a refreshed overview in MDX format. This page is used to display refreshed information related to a particular page.
- [Understanding and Optimizing Latency on Groq](https://console.groq.com/docs/production-readiness/optimizing-latency): This documentation page provides guidance on understanding, measuring, and optimizing latency in Groq-powered applications, specifically when building production applications with Large Language Models (LLMs). It helps developers comprehend key metrics, factors affecting latency, and optimization strategies to ensure efficient deployment of their Groq-based applications.
- [Production-Ready Checklist for Applications on GroqCloud](https://console.groq.com/docs/production-readiness/production-ready-checklist): The "Production-Ready Checklist for Applications on GroqCloud" provides a comprehensive guide for deploying and scaling LLM applications on GroqCloud, covering critical aspects such as model selection, performance optimization, monitoring, and cost management. This checklist helps developers launch and manage their Groq-powered applications with confidence, ensuring a reliable and efficient user experience.
- [Rate Limits](https://console.groq.com/docs/rate-limits): This documentation page explains rate limits, which regulate how frequently users and applications can access the API within specified timeframes to ensure service stability, fair access, and protection against misuse. It provides details on understanding, viewing, and handling rate limits, as well as options for requesting higher limits.
- [Assistant Message Prefilling](https://console.groq.com/docs/prefilling): This documentation page explains how to use Assistant Message Prefilling with the Groq API, a technique that allows for more control over model output by directing the model to skip introductions, enforce specific output formats, and maintain conversation consistency. By prefilling `assistant` messages, users can customize the output of text-to-text models powered by Groq.
- [Changelog](https://console.groq.com/docs/legacy-changelog): The Groq Changelog provides a chronological record of updates, releases, and developments to the Groq API, allowing users to track changes and stay informed about new features and models. This page lists updates in reverse chronological order, with the most recent changes appearing at the top.
- [MLflow + Groq: Open-Source GenAI Observability](https://console.groq.com/docs/mlflow): This documentation page provides a comprehensive guide on integrating MLflow with Groq for open-source GenAI observability, enabling users to build and monitor better Generative AI applications. It covers features such as tracing dashboards, automated tracing, and evaluation metrics, as well as a Python quick start guide for getting started with MLflow and Groq.
- [Arize + Groq: Open-Source AI Observability](https://console.groq.com/docs/arize): This documentation page provides a guide on integrating Arize Phoenix, an open-source AI observability library, with Groq-powered applications to gain deep insights into LLM workflow performance and behavior. It outlines the features and steps to achieve comprehensive tracing and monitoring, enabling users to track application performance, identify bottlenecks, and assess LLM performance.
- [Structured Outputs](https://console.groq.com/docs/structured-outputs): The "Structured Outputs" feature guarantees that model responses conform to a provided JSON schema, ensuring reliable and type-safe data structures by throwing an error if the model cannot produce a compliant response. This feature enables customers to obtain consistent and programmatically valid outputs, eliminating the need for manual validation or retries.
- [Code Execution](https://console.groq.com/docs/code-execution): This documentation page provides information on code execution capabilities in Groq, including supported models and systems, and how to use code execution to perform calculations and run code snippets. It outlines the details of native code execution support, currently limited to Python, and its usage with specific models and systems.
- [Groq Client Libraries](https://console.groq.com/docs/libraries): The Groq Client Libraries documentation page provides information on the official Python and JavaScript/TypeScript client libraries for interacting with the Groq REST API, as well as community-developed libraries in other programming languages. This page guides developers on installing, using, and contributing to Groq client libraries for their specific use cases.
- [🗂️ LlamaIndex 🦙](https://console.groq.com/docs/llama-index): The LlamaIndex page provides an overview of the LlamaIndex data framework, which enables the ingestion, structuring, and access of private or domain-specific data for Retrieval-Augmented Generation (RAG) systems and other LLM-based applications. This framework facilitates the safe and reliable injection of data into LLMs for more accurate text generation.
- [🦜️🔗 LangChain + Groq](https://console.groq.com/docs/langchain): This page provides a guide on integrating LangChain with the Groq API to build sophisticated applications with Large Language Models (LLMs), leveraging LangChain components such as chains, prompt templates, memory, tools, and agents. It offers a quick start tutorial on installing the package, setting up API keys, and creating a LangChain assistant with the Groq API for fast inference speed.
- [Agno + Groq: Fast Agents](https://console.groq.com/docs/agno): This documentation page provides a guide on building fast agents using Agno and Groq, enabling the creation of autonomous programs that utilize language models to achieve tasks. It covers the setup, implementation, and use cases of Agno agents, including examples of single agents and multi-agent teams.
- [Reasoning](https://console.groq.com/docs/reasoning): The "Reasoning" page provides information on utilizing reasoning models for complex problem-solving tasks that require step-by-step analysis and logical deduction. This page details the importance of speed, supported models, and API parameters for controlling the reasoning process, including format and effort levels.
- [🎨 Gradio + Groq: Easily Build Web Interfaces](https://console.groq.com/docs/gradio): This documentation page provides a guide on integrating Gradio with Groq to easily build web interfaces for Groq applications, enabling users to create interactive demos and deploy shareable apps. It offers a step-by-step quick start guide and resources for building robust multimodal applications with Gradio and Groq.
- [Spend Limits](https://console.groq.com/docs/spend-limits): This page provides information on setting and managing spend limits to control API costs, including automated spending limits and proactive usage alerts. It guides users through setting up spending limits, adding usage alerts, and understanding how spend limits work to prevent exceeding budget thresholds.
- [Browser Search](https://console.groq.com/docs/browser-search): The "Browser Search" documentation page provides information on using built-in browser search functionality with supported models on Groq, allowing for interactive web content access and more comprehensive search results. This page covers supported models, usage guidelines, pricing, and best practices for leveraging browser search capabilities.
- [Prompt Engineering Patterns Guide](https://console.groq.com/docs/prompting/patterns): The "Prompt Engineering Patterns Guide" provides a systematic approach to selecting effective prompt patterns for various tasks when working with open-source language models, ensuring improved output reliability and performance. This guide helps users choose the optimal prompt pattern for their specific task, covering a range of applications, from simple Q&A and creative writing to complex tasks like multi-step math and mission-critical accuracy.
- [Model Migration Guide](https://console.groq.com/docs/prompting/model-migration): The Model Migration Guide provides step-by-step instructions for transitioning from commercial models like GPT, Claude, and Gemini to open-source models like Llama, focusing on adjusting prompting techniques and generation parameters. This guide helps users adapt their prompts and model settings to achieve desired outputs from open-source models.
- [Prompt Basics](https://console.groq.com/docs/prompting): The "Prompt Basics" guide provides fundamental principles for crafting effective prompts for open-source instruction-tuned large language models, enabling users to communicate clear instructions and expectations. This guide covers essential concepts, including prompt building blocks, role channels, and best practices for optimizing prompt quality and model output.
- [Google Cloud Private Service Connect](https://console.groq.com/docs/security/gcp-private-service-connect): This page provides a guide on setting up Google Cloud Private Service Connect (PSC) to securely access Groq's API services through private network connections, eliminating exposure to the public internet. It outlines the prerequisites, setup steps, and benefits of using PSC, including private IP access, reduced latency, and improved network security controls.
- [Integrations: Button Group (tsx)](https://console.groq.com/docs/integrations/button-group): The Button Group component page provides documentation and usage guidelines for displaying a collection of buttons in a grid layout. It outlines the properties and usage of the component, including the structure of the button objects and their respective properties.
- [Integrations: Integration Buttons (ts)](https://console.groq.com/docs/integrations/integration-buttons): This page documents the integration buttons available for various categories, including AI agent frameworks, browser automation, LLM app development, and more. It provides a list of integration groups and their corresponding buttons, each with details such as title, description, and icon sources.
- [What are integrations?](https://console.groq.com/docs/integrations): This page provides an overview of integrations, which enable users to connect their Groq-powered applications to external services and enhance their capabilities. It serves as a catalog to browse and explore various integration categories that can be used to extend and customize Groq applications.
- [Quickstart](https://console.groq.com/docs/quickstart): The Quickstart page provides a rapid onboarding guide to get started with the Groq API, covering essential steps such as creating an API key, setting it up securely, and making your first API request. This page serves as a launchpad for developers to quickly integrate Groq's capabilities into their applications.
- [Speech to Text](https://console.groq.com/docs/speech-to-text): The "Speech to Text" documentation page provides information on using the Groq API to convert audio to text through fast and high-quality speech-to-text solutions, offering OpenAI-compatible endpoints for transcriptions and translations. This page details API endpoints, supported models, and guidelines for working with audio files to integrate speech-to-text functionality into applications.
- [Text to Speech](https://console.groq.com/docs/text-to-speech): The "Text to Speech" documentation page provides instructions on how to use the Groq API to convert text into lifelike audio using text-to-speech (TTS) models, supporting 23 voices in English and Arabic. This page guides developers on generating high-quality audio content for various applications, such as customer support agents and game development characters.
- [Browser Automation](https://console.groq.com/docs/browser-automation): This documentation page provides information on browser automation, a feature that enables advanced web research by launching and controlling up to 10 browsers simultaneously to gather comprehensive information from multiple sources. It details the supported models, setup, and functionality of browser automation on Groq's systems.
- [LiveKit + Groq: Build End-to-End AI Voice Applications](https://console.groq.com/docs/livekit): This documentation page provides a guide on integrating LiveKit with Groq to build end-to-end AI voice applications, combining speech recognition, text-to-speech, and real-time communication features. It offers a step-by-step tutorial on setting up a voice agent using LiveKit and Groq, enabling developers to create scalable voice applications with multi-user interactions.
- [LoRA Inference on Groq](https://console.groq.com/docs/lora): This documentation page provides information on running LoRA (Low-Rank Adaptation) inference on Groq's infrastructure, a parameter-efficient fine-tuning technique that allows for customized model behavior without altering base model weights. It outlines the benefits, features, and deployment options for LoRA adapters on GroqCloud, specifically for enterprise-tier customers.
- [🚅 LiteLLM + Groq for Production Deployments](https://console.groq.com/docs/litellm): This page provides a guide on using LiteLLM with Groq for production deployments, covering features such as cost management, smart caching, and spend tracking to optimize resource utilization. It offers a quick start tutorial and next steps for configuring advanced features and building production-ready applications with LiteLLM and Groq.
- [Content Moderation](https://console.groq.com/docs/content-moderation): This documentation page provides guidelines and technical information on content moderation, a crucial aspect of ensuring safe and responsible use of models by detecting and filtering harmful or unwanted content in user prompts and model responses. It specifically covers the usage and functionality of Llama Guard 4, a multimodal safeguard model developed by Meta, for content moderation across multiple formats.
- [Visit Website](https://console.groq.com/docs/visit-website): The "Visit Website" page provides information on how to use Groq's website visiting tool, which allows supported models to access and analyze content from publicly accessible websites. This tool enables models to retrieve and process website content, providing detailed analysis based on the actual page content.
- [Groq Batch API](https://console.groq.com/docs/batch): The Groq Batch API enables large-scale asynchronous processing of API requests, allowing users to submit batches of requests for processing within a 24-hour to 7-day window at a 50% cost discount compared to synchronous APIs. This API is ideal for use cases that don't require immediate responses, such as processing large datasets, generating content in bulk, and running evaluations.
- [Toolhouse 🛠️🏠](https://console.groq.com/docs/toolhouse): This documentation page provides a step-by-step guide on how to use Toolhouse, a Backend-as-a-Service for the agentic stack, in conjunction with Groq's fast inference and Llama 4 models to build conversational and autonomous agents. It outlines the setup process, integration with the Groq API, and usage examples for the Llama 4 Maverick and Compound Beta models.
- [Anchor Browser + Groq: Blazing Fast Browser Agents](https://console.groq.com/docs/anchorbrowser): This documentation page provides a quickstart guide on using Anchor Browser with Groq to create blazing-fast browser agents for automating web interactions, such as data collection, using AI-powered browser automation. It covers prerequisites, setup, and example code for extracting data from websites using the Anchor Browser Python SDK and Groq's fast inference.
- [AutoGen + Groq: Building Multi-Agent AI Applications](https://console.groq.com/docs/autogen): This documentation page provides a guide on building multi-agent AI applications using AutoGen and Groq, enabling developers to create sophisticated AI agents that collaborate to solve complex tasks quickly. It covers features such as multi-agent orchestration, tool integration, and flexible workflows, and includes a Python quick start guide and examples of advanced features.
- [How Groq Uses Your Feedback](https://console.groq.com/docs/feedback-policy): This page explains how Groq collects, reviews, and uses feedback provided by users through various channels, and how it is handled in accordance with Groq's Privacy Policy. It outlines the types of feedback collected, the process of review, and how feedback is used to improve product quality and system safety.
- [Prompt Caching](https://console.groq.com/docs/prompt-caching): This documentation page explains Prompt Caching, a feature that automatically reuses computation from recent requests when they share a common prefix, delivering cost savings and improved response times. Prompt Caching works automatically on API requests with no code changes required, providing a 50% discount on cached input tokens.
- [OpenAI Compatibility](https://console.groq.com/docs/openai): This page provides information on the compatibility of the Groq API with OpenAI's client libraries, allowing users to easily configure their existing applications to run on Groq. It outlines the necessary configuration, unsupported features, and alternative APIs, such as the Responses API, to facilitate a seamless transition.
- [Your Data in GroqCloud](https://console.groq.com/docs/your-data): This page provides information on how Groq handles customer data in GroqCloud, including the types of data retained, circumstances for retention, and controls available to users. It outlines the data retention policies for usage metadata and customer data, and explains how users can manage their data settings through the Data Controls settings.
- [xRx + Groq: Easily Build Rich Multi-Modal Experiences](https://console.groq.com/docs/xrx): This documentation page provides a guide on how to use xRx, an open-source framework, in conjunction with Groq to build rich multi-modal experiences, enabling developers to create AI-powered applications with seamless text, voice, and other interaction forms. The page offers a quick start guide, key features, and sample applications to help developers get started with building sophisticated AI systems.
- [Web Search: Countries (ts)](https://console.groq.com/docs/web-search/countries): The "Web Search: Countries (ts)" page provides a list of countries in TypeScript format, exported as a constant array of strings. This page serves as a reference for developers who need to integrate a comprehensive list of countries into their web applications.
- [Web Search](https://console.groq.com/docs/web-search): The "Web Search" documentation page provides information on using native web search capabilities in Groq models, allowing them to access real-time web content and provide up-to-date answers. This page explains the functionality, supported systems, and usage guidelines for web search, including customization options and output formats.
- [Overview](https://console.groq.com/docs/overview/content): The "Overview" page provides an introduction to Groq's API, highlighting its fast LLM inference, OpenAI compatibility, and ease of integration and scalability. This page serves as a starting point for developers to quickly get started with building apps on Groq, offering resources, code examples, and model information.
- [Overview: Page (mdx)](https://console.groq.com/docs/overview): The "Page (mdx)" component serves as a container for displaying content in MDX format. This page component is designed to render MDX content, providing a flexible and dynamic way to present information.
- [Key Technical Specifications](https://console.groq.com/docs/model/allam-2-7b): The "Key Technical Specifications" page provides an overview of the ALLaM-2-7B model, including its architecture, performance metrics, and use cases, highlighting its capabilities as a bilingual Arabic-English autoregressive transformer. This page serves as a technical reference for developers and researchers looking to understand the model's specifications and best practices for leveraging its bilingual language understanding and generation capabilities.
- [Kimi K2 Version](https://console.groq.com/docs/model/moonshotai/kimi-k2-instruct): The Kimi K2 Version page provides information on the model's technical specifications, performance metrics, and use cases for the Kimi K2 model, which currently redirects to the latest 0905 version. This page serves as a documentation hub for developers and users to understand the capabilities and best practices for leveraging the Kimi K2 model in various applications.
- [Key Technical Specifications](https://console.groq.com/docs/model/moonshotai/kimi-k2-instruct-0905): This documentation page outlines the key technical specifications, use cases, and best practices for the Kimi-K2-Instruct-0905 model, a Mixture-of-Experts (MoE) architecture-based AI model with exceptional performance in coding, math, and reasoning tasks. The page provides essential information for developers to effectively utilize the model's capabilities in various applications, including frontend development, agent scaffolds, tool calling, and full-stack development.
- [Llama 3.1 8b Instant: Model (tsx)](https://console.groq.com/docs/model/llama-3.1-8b-instant): The Llama 3.1 8B Instant model on Groq provides rapid response times with production-grade reliability, making it suitable for latency-sensitive applications. This model balances efficiency and performance for use cases such as chat interfaces, content filtering systems, and large-scale data processing workloads.
- [Deepseek R1 Distill Llama 70b: Model (tsx)](https://console.groq.com/docs/model/deepseek-r1-distill-llama-70b): The DeepSeek R1 Distill Llama 70B model page provides information on a distilled version of DeepSeek's R1 model, fine-tuned from the Llama-3.3-70B-Instruct base model, which leverages knowledge distillation to retain robust reasoning capabilities. This model delivers exceptional performance on mathematical and logical reasoning tasks with Groq's industry-leading speed.
- [Qwen Qwq 32b: Model (tsx)](https://console.groq.com/docs/model/qwen-qwq-32b): The Qwen QwQ 32B model page provides an overview of the 32-billion parameter reasoning model, detailing its competitive performance on complex tasks and deployment on Groq's hardware for rapid results. This page offers information on the model's key features, including its performance, parameters, and speed, as well as a link to try the model.
- [Qwen3 32b: Page (mdx)](https://console.groq.com/docs/model/qwen3-32b): The Qwen332b: Page (mdx) documentation provides information on utilizing the Qwen3.2B model within MDX (Multidimensional Expressions) pages. This page serves as a resource for integrating and implementing Qwen3.2B functionality in compatible environments. - [Qwen 2.5 Coder 32b: Model (tsx)](https://console.groq.com/docs/model/qwen-2.5-coder-32b): This page provides documentation for the Qwen2.5 Coder32b model, a specialized AI model fine-tuned for code generation and development tasks. The model delivers production-quality code generation capabilities, comparable to GPT-4, built on a large dataset of code and technical content. - [Llama3 8b 8192: Model (tsx)](https://console.groq.com/docs/model/llama3-8b-8192): The Llama3 8b 8192 model page provides details on a Groq-hosted AI model that delivers exceptional performance with industry-leading speed and cost-efficiency. This page offers information on utilizing the Llama-3-8B-8192 model, optimized for high-volume applications where both speed and cost are crucial. - [Llama 3.2 3b Preview: Model (tsx)](https://console.groq.com/docs/model/llama-3.2-3b-preview): The Llama3.2.3b Preview model page provides details on a fast and balanced AI model with 3.1 billion parameters, suitable for tasks like content creation, summarization, and information retrieval. This page offers information on the model's capabilities, ideal use cases, and performance benefits for real-time applications. - [Key Technical Specifications](https://console.groq.com/docs/model/distil-whisper-large-v3-en): This documentation page provides key technical specifications and details for Distil-Whisper Large v3, a speech-to-text model built on the encoder-decoder transformer architecture, highlighting its performance metrics, model architecture, and use cases. 
The page outlines essential information for implementing and optimizing the model for various applications, including real-time transcription, content processing, and interactive speech recognition features. - [Key Technical Specifications](https://console.groq.com/docs/model/whisper-large-v3-turbo): This documentation page provides key technical specifications and details for Whisper Large v3 Turbo, OpenAI's fastest speech recognition model optimized for speed and high accuracy. It outlines the model's capabilities, use cases, and best practices for applications requiring rapid transcription, such as real-time streaming, meeting transcription, and high-volume audio processing. - [Llama 4 Scout 17b 16e Instruct: Page (mdx)](https://console.groq.com/docs/model/llama-4-scout-17b-16e-instruct): The Llama 4 Scout 17B 16E Instruct: Page (mdx) documentation provides guidance on utilizing the Llama 4 Scout 17B 16E model for instructional tasks in Markdown format. This page serves as a reference for integrating and implementing the model in various applications. - [Qwen 2.5 32b: Model (tsx)](https://console.groq.com/docs/model/qwen-2.5-32b): The Qwen 2.5 32B model page provides information on Alibaba's flagship AI model, which delivers near-instant responses with GPT-4-level capabilities across various tasks. This model excels in creative writing, complex reasoning, and more, thanks to its training on 5.5 trillion diverse tokens. - [Qwen3 32b: Model (tsx)](https://console.groq.com/docs/model/qwen/qwen3-32b): The Qwen3 32B model page provides information on the latest generation of large language models in the Qwen series, offering advancements in reasoning, instruction-following, and multilingual support. This page details the model's capabilities, including its ability to switch between thinking and non-thinking modes for various tasks. 
- [Mistral Saba 24b: Model (tsx)](https://console.groq.com/docs/model/mistral-saba-24b): The Mistral Saba 24B model page provides information on a specialized multilingual model trained to excel in Arabic, Farsi, Urdu, Hebrew, and Indic languages, with a 32K-token context window and tool use capabilities. This page details the capabilities and features of the Mistral Saba 24B model, which delivers exceptional results across multilingual tasks while maintaining strong performance in English. - [Llama 4 Maverick 17b 128e Instruct: Page (mdx)](https://console.groq.com/docs/model/llama-4-maverick-17b-128e-instruct): The Llama 4 Maverick 17B 128E Instruct: Page (mdx) documentation provides information on utilizing the Llama 4 Maverick 17B 128E model in Markdown pages. This page serves as a reference for integrating and implementing the model within MDX files. - [Llama 4 Scout 17b 16e Instruct: Model (tsx)](https://console.groq.com/docs/model/meta-llama/llama-4-scout-17b-16e-instruct): The Llama 4 Scout 17B 16E Instruct model page provides information on Meta's 17-billion-parameter mixture-of-experts model, featuring native multimodality for text and image understanding, instruction-tuned for tasks like chat, visual reasoning, and coding. This page offers details on the model's capabilities, including its 128K-token context length and industry-leading inference speed on Groq. - [Llama 4 Maverick 17b 128e Instruct: Model (tsx)](https://console.groq.com/docs/model/meta-llama/llama-4-maverick-17b-128e-instruct): The Llama 4 Maverick 17B 128E Instruct model page provides information on Meta's 17-billion-parameter mixture-of-experts model, featuring native multimodality for text and image understanding, instruction-tuned for tasks like chat, visual reasoning, and coding. This page details the model's capabilities, including its 128K-token context length and industry-leading inference speed on Groq. 
- [Key Technical Specifications](https://console.groq.com/docs/model/meta-llama/llama-prompt-guard-2-86m): This documentation page outlines the key technical specifications and features of Llama Prompt Guard 2, a model designed to detect and prevent malicious prompt attacks on LLM applications. It provides detailed information on the model's architecture, performance metrics, and best practices for implementation to enhance LLM application security. - [Key Technical Specifications](https://console.groq.com/docs/model/meta-llama/llama-prompt-guard-2-22m): This documentation page provides key technical specifications for Llama Prompt Guard 2, a specialized classifier model designed to detect and prevent prompt attacks in LLM applications. It outlines the model's architecture, performance metrics, features, use cases, and best practices for implementation. - [Key Technical Specifications](https://console.groq.com/docs/model/meta-llama/llama-guard-4-12b): This documentation page outlines the key technical specifications, use cases, and best practices for the Llama-Guard-4-12B model, a content moderation and safety classification tool built on Meta's Llama 4 Scout architecture. The page provides essential information for integrating and optimizing the model for safe and effective content evaluation in various applications. - [Llama3 70b 8192: Model (tsx)](https://console.groq.com/docs/model/llama3-70b-8192): The Llama 3 70B model on Groq is a reliable foundation model that excels at dialogue and content-generation tasks, offering a balance of performance and speed. This model remains production-ready and cost-effective, providing fast and consistent outputs via the Groq API. - [Key Technical Specifications](https://console.groq.com/docs/model/gemma2-9b-it): This page provides an overview of the key technical specifications for the Gemma 2 9B IT model, including its architecture, performance metrics, and use cases. 
It outlines essential details for developers and researchers to understand the model's capabilities and best practices for implementation. - [Llama Prompt Guard 2 86m: Page (mdx)](https://console.groq.com/docs/model/llama-prompt-guard-2-86m): The Llama Prompt Guard 2 86M page provides information on the 86M-parameter variant of the Llama Prompt Guard 2 classifier. This page documents the details of that configuration, including its functionality, usage, and technical specifications. - [Key Technical Specifications](https://console.groq.com/docs/model/whisper-large-v3): This documentation page outlines the key technical specifications, model details, and use cases for the Whisper Large v3 speech recognition model, including its architecture, performance metrics, and best practices for high-accuracy transcription, multilingual applications, and challenging audio conditions. The page provides essential information for developers and users to effectively utilize the model for various speech-to-text applications. - [Key Technical Specifications](https://console.groq.com/docs/model/openai/gpt-oss-120b): This documentation page outlines the key technical specifications, use cases, and best practices for the GPT-OSS 120B model, a Mixture-of-Experts (MoE) architecture with 120B total parameters and exceptional performance across various benchmarks. The page provides essential information for developers and researchers to effectively deploy and utilize the model for frontier-grade agentic applications, advanced research, and high-accuracy mathematical and coding tasks. - [Key Technical Specifications](https://console.groq.com/docs/model/openai/gpt-oss-20b): This documentation page outlines the key technical specifications, use cases, and best practices for the GPT-OSS 20B model, a Mixture-of-Experts (MoE) architecture with 20B total parameters and exceptional performance across various benchmarks. 
The page provides essential information for developers and users to effectively deploy and utilize the GPT-OSS 20B model in various applications. - [Llama Prompt Guard 2 22m: Page (mdx)](https://console.groq.com/docs/model/llama-prompt-guard-2-22m): The Llama Prompt Guard 2 22M page provides information on the 22M-parameter variant of the Llama Prompt Guard 2 classifier, designed to detect and prevent prompt attacks in LLM applications. This documentation outlines the variant's features, usage, and technical specifications. - [Deepseek R1 Distill Qwen 32b: Model (tsx)](https://console.groq.com/docs/model/deepseek-r1-distill-qwen-32b): The DeepSeek R1 Distill Qwen 32B model page provides an overview of a distilled language model, fine-tuned from the Qwen-2.5-32B base model, which leverages knowledge distillation to retain robust reasoning capabilities while enhancing efficiency. This page details the model's key features, performance, and capabilities, particularly in mathematical and logical reasoning tasks. - [Llama Guard 4 12b: Page (mdx)](https://console.groq.com/docs/model/llama-guard-4-12b): The Llama Guard 4 12B page provides information on the content moderation and safety classification model, likely detailing its features, usage, and integration guidelines. This page appears to be a placeholder or a template, awaiting documentation content related to the Llama Guard 4 12B model. - [Key Technical Specifications](https://console.groq.com/docs/model/playai-tts-arabic): The "Key Technical Specifications" documentation page provides an overview of the PlayAI Dialog v1.0 model, including its technical architecture, training data, and use cases, to help developers understand its capabilities and limitations. This page serves as a central resource for technical details, best practices, and considerations for implementing the model in various applications, such as creative content generation, voice agentic experiences, and customer support systems. 
- [Llama Guard 3 8b: Model (tsx)](https://console.groq.com/docs/model/llama-guard-3-8b): The Llama Guard 3 8B model is a specialized content moderation model built on the Llama framework, designed to identify and filter potentially harmful content. This model is hosted by Groq, which provides fast inference with industry-leading latency and performance for high-speed AI processing. - [Key Technical Specifications](https://console.groq.com/docs/model/playai-tts): The "Key Technical Specifications" documentation page provides an overview of the PlayAI Dialog v1.0 model, including its technical architecture, training data, and use cases, to help developers understand its capabilities and limitations. This page serves as a central resource for technical details, best practices, and considerations for implementing the model in various applications, such as creative content generation, voice agentic experiences, and customer support systems. - [Llama 3.2 1b Preview: Model (tsx)](https://console.groq.com/docs/model/llama-3.2-1b-preview): The Llama 3.2 1B Preview model page provides information on a fast and cost-effective language model with 1.23 billion parameters, suitable for high-throughput applications requiring rapid responses. This page details the model's capabilities, including text analysis, information retrieval, and content summarization, and its optimal balance of speed, quality, and cost. - [Llama 3.3 70b Specdec: Model (tsx)](https://console.groq.com/docs/model/llama-3.3-70b-specdec): The Llama 3.3 70B SpecDec model is a speculative decoding version of Meta's Llama 3.3 70B model, optimized for high-speed inference while maintaining high quality. This model delivers exceptional performance with significantly reduced latency, making it ideal for real-time applications. 
- [Llama 3.3 70b Versatile: Model (tsx)](https://console.groq.com/docs/model/llama-3.3-70b-versatile): The Llama 3.3 70B Versatile model page provides information on Meta's advanced multilingual large language model, optimized for various natural language processing tasks. This page offers details on the model's capabilities, performance, and applications, allowing users to explore its features and potential use cases. - [CrewAI + Groq: High-Speed Agent Orchestration](https://console.groq.com/docs/crewai): This documentation page provides a guide on integrating CrewAI with Groq for high-speed agent orchestration, enabling rapid autonomous decision-making and collaboration among multiple AI agents. It outlines the benefits and implementation details of leveraging Groq's fast inference speeds with CrewAI to optimize response times and streamline complex workflows. - [Introduction to Tool Use](https://console.groq.com/docs/tool-use): This documentation page introduces the concept of tool use in Large Language Models (LLMs), a feature that enables LLMs to interact with external resources and perform actions beyond simple text generation. It provides an overview of supported models, agentic tooling, and a step-by-step guide on how to integrate tools with the Groq API. - [✨ Vercel AI SDK + Groq: Rapid App Development](https://console.groq.com/docs/ai-sdk): This documentation page provides a guide on rapidly developing applications using the Vercel AI SDK with Groq, enabling seamless integration with powerful language models for scalable and high-speed applications. It offers a quick start guide, setup instructions, and a challenge to enhance a basic chat interface into a specialized code explanation assistant. - [Composio](https://console.groq.com/docs/composio): This documentation page provides an overview and setup guide for Composio, a platform that enables the integration of tools with LLMs and AI agents, specifically with Groq-based assistants. 
It outlines the key features and a quick start tutorial for building AI agents that can interact with external applications using Composio tools. - [Models: Models (tsx)](https://console.groq.com/docs/models/models): The Models Table component page provides documentation for displaying a list of models based on specified criteria, including filtering by release stage and model inclusion or exclusion. This page guides developers on how to use the Models Table, including required and optional props, table headers, and model filtering and hiding functionality. - [Supported Models](https://console.groq.com/docs/models): This page lists and describes the models supported on GroqCloud, including production models, preview models, and deprecated models, to help users choose the right model for their needs. It provides information on the different types of models and systems available, as well as how to access them through the GroqCloud Models API endpoint. - [Models: Featured Cards (tsx)](https://console.groq.com/docs/models/featured-cards): This page documents featured cards that demonstrate the capabilities of various AI systems, including their technical specifications and functionalities. The featured cards, such as Groq Compound and OpenAI GPT-OSS 120B, showcase AI systems with advanced capabilities like tool use, reasoning, and code execution. - [Wolfram‑Alpha Integration](https://console.groq.com/docs/wolfram-alpha): The Wolfram‑Alpha Integration documentation page provides information on how to integrate Wolfram's computational knowledge engine with Groq models, enabling them to access precise calculations and structured knowledge for mathematical, scientific, and engineering computations. This page outlines the supported models, setup process, and usage examples for leveraging Wolfram‑Alpha integration in Groq applications. 
- [Flex Processing](https://console.groq.com/docs/flex-processing): The Flex Processing service tier is optimized for high-throughput workloads that prioritize fast inference and can gracefully handle occasional request failures, offering higher rate limits at the same pricing as on-demand processing. - [JigsawStack 🧩](https://console.groq.com/docs/jigsawstack): The JigsawStack 🧩 documentation page provides an overview of the AI SDK, detailing its capabilities in automating tasks such as web scraping, OCR, and translation using Large Language Models (LLMs), as well as its features like the Prompt Engine and built-in safety guardrails. This page serves as a starting point for integrating JigsawStack into existing applications and learning how to utilize its features for optimized performance and safety.
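
The Flex Processing entry above describes a tier that is selected per request. As a minimal sketch (assuming the `service_tier` request parameter described in the Flex Processing docs; the model name and helper function here are illustrative, not prescriptive), a client might build an opt-in request payload like this:

```python
def build_flex_request(model: str, user_message: str) -> dict:
    """Build a chat-completion payload that opts into the flex service tier.

    Assumption: the Groq chat completions endpoint accepts a "service_tier"
    field, with "flex" trading occasional rejections under heavy load for
    higher rate limits (per the Flex Processing docs).
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "service_tier": "flex",
    }

# Illustrative usage; a caller should be prepared to retry or fall back
# to on-demand processing if a flex request is rejected.
payload = build_flex_request("llama-3.3-70b-versatile", "Summarize flex processing.")
print(payload["service_tier"])  # prints: flex
```

Because flex requests may fail when capacity is tight, the calling code, not the payload, carries the retry/fallback logic.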