# https://console.groq.com llms.txt

- [🗂️ LlamaIndex 🦙](https://console.groq.com/docs/llama-index): The LlamaIndex page provides an overview of the LlamaIndex data framework, which enables the ingestion, structuring, and access of private or domain-specific data for Retrieval-Augmented Generation (RAG) systems and other LLM-based applications. This framework facilitates the safe and reliable injection of data into LLMs for more accurate text generation.
- [Rate Limits](https://console.groq.com/docs/rate-limits): This documentation page explains rate limits, which regulate how frequently users and applications can access the API within specified timeframes to ensure service stability, fair access, and protection against misuse. It covers the limit types, how to view your current limits and rate limit information, and how to handle rate limit errors.
- [Speech to Text](https://console.groq.com/docs/speech-to-text): The "Speech to Text" documentation page provides information on using the Groq API to convert audio to text through its OpenAI-compatible endpoints, enabling fast transcriptions and translations. This page guides developers on integrating high-quality audio processing into their applications using the API's transcription and translation endpoints.
- [CrewAI + Groq: High-Speed Agent Orchestration](https://console.groq.com/docs/crewai): This page provides a guide on integrating CrewAI, a framework for orchestrating multiple AI agents, with Groq's high-speed inference API to optimize response times and enable rapid autonomous decision-making. It outlines the benefits and implementation details of using Groq with CrewAI for fast, reliable agent interactions and scalable multi-agent systems.
- [Introduction to Tool Use](https://console.groq.com/docs/tool-use): This documentation page introduces tool use in Large Language Models (LLMs), a feature that enables LLMs to interact with external resources and perform actions beyond simple text generation. The page provides an overview of supported models, agentic tooling, and a step-by-step guide on integrating tools with the Groq API (see the function-calling sketch below).
- [Toolhouse 🛠️🏠](https://console.groq.com/docs/toolhouse): The Toolhouse documentation page provides a comprehensive guide on equipping Large Language Models (LLMs) with tools such as Code Interpreter, Web Search, and Email, enabling them to perform tasks autonomously. It offers a step-by-step tutorial on integrating Toolhouse with Groq, including installation, configuration, and usage of custom and pre-built tools.
- [Content Moderation](https://console.groq.com/docs/content-moderation): This documentation page covers content moderation for Large Language Models (LLMs), focusing on Llama Guard 3, a safeguard model designed to detect and filter harmful or unwanted content generated by LLMs. The page details the functionality, usage, and implementation of Llama Guard 3 for ensuring responsible and safe use of LLMs.
- [xRx + Groq: Easily Build Rich Multi-Modal Experiences](https://console.groq.com/docs/xrx): This documentation page provides a guide on using xRx, an open-source framework, together with Groq to build rich multi-modal experiences, enabling developers to create AI-powered applications with seamless text, voice, and other interaction forms. The page offers a quick start guide, including setup instructions and example applications, to help developers get started with building immersive experiences.
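A minimal function-calling sketch for the Tool Use guide above, assuming the official `groq` Python client and the OpenAI-compatible `tools` parameter; the `get_weather` function, its schema, and the model ID are illustrative assumptions, not the guide's exact example.

```python
# Hypothetical sketch of tool use with the Groq API (OpenAI-compatible
# function calling). The get_weather tool and model ID are illustrative.
import json
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

def get_weather(city: str) -> str:
    # Stand-in for a real weather lookup.
    return json.dumps({"city": city, "forecast": "sunny", "temp_c": 24})

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Lisbon?"}]
response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",  # assumed tool-capable model ID
    messages=messages,
    tools=tools,
)

# If the model chose to call the tool, run it and send the result back.
choice = response.choices[0].message
if choice.tool_calls:
    messages.append(choice)
    for call in choice.tool_calls:
        args = json.loads(call.function.arguments)
        messages.append(
            {"role": "tool", "tool_call_id": call.id, "content": get_weather(**args)}
        )
    response = client.chat.completions.create(
        model="llama-3.3-70b-versatile", messages=messages
    )

print(response.choices[0].message.content)
```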
- [Prompting for AI Models on Groq](https://console.groq.com/docs/prompting): This documentation page provides guidance on crafting effective prompts for AI models hosted on Groq, covering best practices, strategies, and techniques to optimize output quality. It outlines actionable tips on structured queries, prompt placement, temperature settings, and leveraging system and user prompts to get the most out of Groq's fast inference speed.
- [🦜️🔗 LangChain + Groq](https://console.groq.com/docs/langchain): This page provides a guide on integrating LangChain with the Groq API to build sophisticated applications with large language models (LLMs), leveraging LangChain components such as chains, prompt templates, memory, tools, and agents. It offers a quick start tutorial on setting up the LangChain-Groq package and creating a simple LangChain assistant for extracting product information from text.
- [Assistant Message Prefilling](https://console.groq.com/docs/prefilling): The Assistant Message Prefilling documentation page explains how to control model output by prefilling `assistant` messages, allowing for customized responses that skip introductions, enforce specific formats, and maintain conversational consistency. This technique is particularly useful for generating concise code snippets, extracting structured data, and directing text-to-text models powered by Groq (see the prefilling sketch below).
- [Flex Processing](https://console.groq.com/docs/flex-processing): This documentation page describes Flex Processing, a service tier optimized for high-throughput workloads that prioritizes fast inference and can tolerate occasional request failures. It provides information on the benefits, availability, and usage of Flex Processing and the other service tiers, including the on-demand and auto tiers.
- [Text Generation](https://console.groq.com/docs/text-chat): This page provides documentation and guidance on using Groq's Text Generation capabilities, specifically the Chat Completions API, to generate human-like text responses for various applications. It covers basic chat completions, streaming, stop sequences, and JSON mode, to help developers integrate conversational AI into their applications.
- [MLflow + Groq: Open-Source GenAI Observability](https://console.groq.com/docs/mlflow): This documentation page provides a comprehensive guide on integrating MLflow with Groq to enable open-source observability for Generative AI (GenAI) applications, allowing users to track and monitor their interactions with Groq models. The page covers the key features of the integration, including tracing dashboards, automated tracing, and evaluation metrics, as well as a Python quick start guide.
- [Groq client libraries](https://console.groq.com/docs/libraries): This documentation page provides information on Groq client libraries, including official Python and JavaScript/TypeScript libraries, as well as community-developed libraries for other programming languages. The page details installation, usage, and features of these libraries, which offer convenient access to the Groq REST API.
- [Models: Models (tsx)](https://console.groq.com/docs/models/models): This page provides a comprehensive list of available models, including details such as developer, context window tokens, maximum completion tokens, and maximum file size. The model table is divided into production and preview models, offering a quick reference for users to explore and select suitable models for their needs.
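A minimal sketch of assistant message prefilling, assuming the official `groq` Python client; the model ID and the backtick stop sequence are illustrative assumptions. The trailing `assistant` message seeds the response, so the model continues from it rather than starting fresh.

```python
# Hypothetical prefilling sketch: the trailing assistant message seeds the
# output so the model continues it (here, emitting only a code block).
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",  # assumed model ID
    messages=[
        {"role": "user", "content": "Write a Python one-liner that reverses a string."},
        # Prefill: skip the introduction and open a code fence directly.
        {"role": "assistant", "content": "```python\n"},
    ],
    stop="```",  # stop once the model closes the fence
)
print(response.choices[0].message.content)
```

Because the completion contains only the continuation, the printed text is the bare code without the prefilled opening fence.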
- [Supported Models](https://console.groq.com/docs/models): The "Supported Models" page lists the models currently supported by GroqCloud, categorized into production models, preview models, and preview systems, and provides information on their intended use and lifecycle. This page also details how to access hosted models through the GroqCloud Models API endpoint and provides resources for deprecated models.
- [Images and Vision](https://console.groq.com/docs/vision): The "Images and Vision" documentation page provides information on using the Groq API's multimodal models to analyze and interpret visual data from images, enabling applications such as visual question answering, caption generation, and Optical Character Recognition (OCR). This page guides developers on integrating vision capabilities into their applications using supported models and API endpoints.
- [Arize + Groq: Open-Source AI Observability](https://console.groq.com/docs/arize): This documentation page provides a guide on integrating Arize Phoenix, an open-source AI observability library, with Groq-powered applications to gain deep insights into LLM workflow performance and behavior. It outlines the features and steps to instrument a Groq application with Arize Phoenix for comprehensive tracing, monitoring, and evaluation of AI applications.
- [Text to Speech](https://console.groq.com/docs/text-to-speech): The "Text to Speech" documentation page provides instructions on using the Groq API to convert text into lifelike spoken audio with the available text-to-speech models. This page guides developers on utilizing the API endpoint to generate high-quality audio content in various voices and languages (see the synthesis sketch below).
- [Reasoning](https://console.groq.com/docs/reasoning): This documentation page provides information on reasoning models, which excel at complex problem-solving tasks that require step-by-step analysis, logical deduction, and structured thinking. It covers supported models, reasoning formats, and configuration parameters for leveraging instant reasoning capabilities in real-time applications.
- [Quickstart](https://console.groq.com/docs/quickstart): The "Quickstart" page provides a step-by-step guide to getting started with the Groq API, covering key setup and usage basics to help users quickly integrate the API into their projects. This page walks users through creating an API key, setting it up securely, and making their first API request in various programming languages.
- [Key Technical Specifications](https://console.groq.com/docs/agentic-tooling/compound-beta): This documentation page provides key technical specifications for Compound-beta, a model that leverages Llama 4 Scout and Llama 3.3 70B for core reasoning, routing, and tool use. It outlines technical details, performance metrics, and use cases for Compound-beta, as well as best practices for deployment and implementation.
- [Agentic Tooling](https://console.groq.com/docs/agentic-tooling): This documentation page provides an overview of Agentic Tooling, a feature that enables advanced AI systems to solve problems by taking action and intelligently using external tools, such as web search and code execution, alongside powerful Llama models. It details the available agentic tool systems, including `compound-beta` and `compound-beta-mini`, and guides users on how to utilize these tools in their applications.
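A minimal text-to-speech sketch for the entry above, assuming the official `groq` Python client and the `playai-tts` model named on the model page; the voice name and the `write_to_file` helper are assumptions to check against the Text to Speech docs.

```python
# Hypothetical TTS sketch: synthesize speech and save it as a WAV file.
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

response = client.audio.speech.create(
    model="playai-tts",           # model ID from the PlayAI TTS model page
    voice="Fritz-PlayAI",         # assumed voice name; see docs for options
    input="Hello from Groq text to speech!",
    response_format="wav",
)
response.write_to_file("speech.wav")  # assumed helper on the binary response
```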
- [Key Technical Specifications](https://console.groq.com/docs/agentic-tooling/compound-beta-mini): This documentation page provides an overview of the key technical specifications for Compound-beta-mini, a model that leverages Llama 4 Scout and Llama 3.3 70B for core reasoning, routing, and tool use. The page details the model's architecture, performance metrics, technical details, and best practices for deployment and use.
- [Integrations: Button Group (tsx)](https://console.groq.com/docs/integrations/button-group): The Button Group component page provides documentation for integrating a collection of buttons in a grid layout, accepting an array of customizable button objects. This page outlines the properties and configuration options for the Button Group and individual Integration Button components.
- [Integrations: Integration Buttons (ts)](https://console.groq.com/docs/integrations/integration-buttons): This page provides information on integration buttons, specifically a catalog of predefined buttons for various integration groups, such as AI agent frameworks and LLM app development tools. The integration buttons are organized by category and contain details like title, description, and icon sources.
- [What are integrations?](https://console.groq.com/docs/integrations): This page provides an overview of integrations, which enable connections to external services to enhance Groq-powered applications with additional capabilities. It categorizes available integrations to help users find suitable tools for their needs.
- [🎨 Gradio + Groq: Easily Build Web Interfaces](https://console.groq.com/docs/gradio): This documentation page provides a guide on integrating Gradio with Groq to easily build web interfaces for applications, enabling the creation of interactive demos and shareable apps. It offers a quick start tutorial and resources for building robust, multimodal applications with Gradio and Groq.
- [FlutterFlow + Groq: Fast & Powerful Cross-Platform Apps](https://console.groq.com/docs/flutterflow): This documentation page provides a guide on integrating FlutterFlow with Groq to build fast and powerful cross-platform apps with AI capabilities. It outlines a quick start process for building AI-powered apps with FlutterFlow and Groq in just 10 minutes.
- [Groq Batch API](https://console.groq.com/docs/batch): The Groq Batch API enables large-scale asynchronous processing of API requests, allowing users to submit batches of requests for processing within a 24-hour to 7-day window at a 25% cost discount compared to synchronous APIs. This API is ideal for use cases that don't require immediate responses, such as processing large datasets, generating content in bulk, and running evaluations (see the batch sketch below).
- [Qwen Qwq 32b: Model (tsx)](https://console.groq.com/docs/model/qwen-qwq-32b): The Qwen QwQ-32B model page provides information on a 32-billion parameter reasoning model, Qwen/QwQ-32B, which delivers competitive performance on complex reasoning and coding tasks. This page details the model's key features, including its performance and deployment on Groq's hardware for fast reasoning.
- [Qwen 2.5 Coder 32b: Model (tsx)](https://console.groq.com/docs/model/qwen-2.5-coder-32b): This page provides documentation for the Qwen 2.5 Coder 32B model, a specialized AI model fine-tuned for code generation and development tasks. The model delivers production-quality code generation capabilities, comparable to GPT-4, leveraging a large dataset of code and technical content.
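A sketch of the batch workflow described above, assuming the official `groq` Python client mirrors the OpenAI-style files-and-batches flow (`files.create` with `purpose="batch"`, then `batches.create`); the JSONL request shape, endpoint path, and completion window value are assumptions to verify against the Batch API docs.

```python
# Hypothetical batch sketch: write JSONL requests, upload, submit a batch.
import json
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

# Each JSONL line is one chat completion request with a unique custom_id.
requests = [
    {
        "custom_id": f"req-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "llama-3.3-70b-versatile",  # assumed model ID
            "messages": [{"role": "user", "content": prompt}],
        },
    }
    for i, prompt in enumerate(["Summarize doc A", "Summarize doc B"])
]
with open("batch_input.jsonl", "w") as f:
    for line in requests:
        f.write(json.dumps(line) + "\n")

uploaded = client.files.create(file=open("batch_input.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=uploaded.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",  # the page says windows range from 24h to 7d
)
print(batch.id, batch.status)
```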
- [Model: Eu Notice (tsx)](https://console.groq.com/docs/model/eu-notice): This page provides a notice highlighting limitations on rights granted under the Llama 4 Community License Agreement for individuals and companies based in the European Union. The notice affects the multimodal models included in Llama 4, outlining restricted usage within the EU.
- [Llama3 70b 8192: Model (tsx)](https://console.groq.com/docs/model/llama3-70b-8192): The Llama 3 70B model on Groq is a reliable foundation model that excels at dialogue and content-generation tasks, offering a balance of performance and speed. This page covers the model's key features and capabilities.
- [Llama 3.1 8b Instant: Model (tsx)](https://console.groq.com/docs/model/llama-3.1-8b-instant): This page provides documentation for the Llama 3.1 8B Instant model, a Groq-hosted AI model optimized for rapid response times and production-grade reliability. The page details the model's capabilities and applications, including chat interfaces, content filtering systems, and large-scale data processing workloads.
- [Llama 3.2 3b Preview: Model (tsx)](https://console.groq.com/docs/model/llama-3.2-3b-preview): The Llama 3.2 3B Preview model page provides information on a fast and balanced AI model with 3.1 billion parameters, ideal for tasks like content creation, summarization, and information retrieval. This page details the model's capabilities, features, and applications, including its efficient design for cost-effective performance in real-time uses such as chatbots and content generation.
- [Llama 3.3 70b Specdec: Model (tsx)](https://console.groq.com/docs/model/llama-3.3-70b-specdec): The Llama 3.3 70B SpecDec model page provides information on Groq's speculative-decoding version of Meta's Llama 3.3 70B model, optimized for high-speed inference while maintaining high quality. This page details the model's capabilities, including exceptional performance with reduced latency, making it suitable for real-time applications.
- [Llama3 8b 8192: Model (tsx)](https://console.groq.com/docs/model/llama3-8b-8192): The Llama 3 8B 8192 model page provides information on a Groq-hosted AI model that delivers exceptional performance with industry-leading speed and cost-efficiency. This model is suitable for high-volume applications where both speed and cost are crucial.
- [Mistral Saba 24b: Model (tsx)](https://console.groq.com/docs/model/mistral-saba-24b): The Mistral Saba 24B model page provides information on a specialized AI model trained to excel in Arabic, Farsi, Urdu, Hebrew, and Indic languages, with a 32K token context window and tool use capabilities. This model delivers exceptional results across multilingual tasks while maintaining strong performance in English (see the model-selection sketch below).
- [Qwen 2.5 32b: Model (tsx)](https://console.groq.com/docs/model/qwen-2.5-32b): The Qwen 2.5 32B model page provides an overview of Alibaba's flagship AI model, delivering GPT-4 level capabilities and near-instant responses across various tasks. This page details the model's key features, including its training data and performance in creative writing and complex reasoning.
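Each model page above corresponds to a model ID that is passed directly to the chat completions endpoint. A minimal sketch, assuming the official `groq` Python client and inferring the `mistral-saba-24b` ID from the page URL:

```python
# Hypothetical sketch: any hosted model is selected by its ID string.
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

response = client.chat.completions.create(
    model="mistral-saba-24b",  # ID inferred from the model page URL
    messages=[{"role": "user", "content": "Please reply with a greeting in Arabic."}],
)
print(response.choices[0].message.content)
```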
- [Llama 4 Maverick 17b 128e Instruct: Model (tsx)](https://console.groq.com/docs/model/llama-4-maverick-17b-128e-instruct): The Llama 4 Maverick 17B 128E Instruct model page provides information on Meta's 17 billion parameter mixture-of-experts model, featuring native multimodality for text and image understanding, instruction-tuned for tasks like chat, visual reasoning, and coding. This page offers details on utilizing the model on Groq for industry-leading inference speed.
- [Deepseek R1 Distill Llama 70b: Model (tsx)](https://console.groq.com/docs/model/deepseek-r1-distill-llama-70b): The DeepSeek R1 Distill Llama 70B model page provides information on a distilled version of DeepSeek's R1 model, fine-tuned from the Llama-3.3-70B-Instruct base model, which leverages knowledge distillation to deliver exceptional performance on mathematical and logical reasoning tasks. This model is hosted on Groq's platform, offering industry-leading speed.
- [Llama 3.3 70b Versatile: Model (tsx)](https://console.groq.com/docs/model/llama-3.3-70b-versatile): The Llama 3.3 70B Versatile model page provides information on Meta's advanced multilingual large language model, optimized for various natural language processing tasks. With 70 billion parameters, the model offers high performance across benchmarks and suits diverse applications.
- [Model Details](https://console.groq.com/docs/model/playai-tts): The "Model Details" page provides an in-depth overview of the PlayAI Dialog model, including its architecture, training data, key features, and limitations, to help users understand its capabilities and potential applications. This page is intended to guide developers, content creators, and researchers in utilizing the PlayAI Dialog model for creative content generation, interactive storytelling, and narrative development.
- [Llama Guard 3 8b: Model (tsx)](https://console.groq.com/docs/model/llama-guard-3-8b): The Llama Guard 3 8B model is a specialized content moderation model built on the Llama framework, designed to identify and filter potentially harmful content. This model is hosted by Groq, which provides fast inference with industry-leading latency and performance for high-speed AI processing (see the moderation sketch below).
- [Deepseek R1 Distill Qwen 32b: Model (tsx)](https://console.groq.com/docs/model/deepseek-r1-distill-qwen-32b): The DeepSeek R1 Distill Qwen 32B model page provides an overview of a distilled language model that leverages knowledge distillation to retain robust reasoning capabilities while enhancing efficiency. This page details the model's key features, performance, and capabilities, particularly in mathematical and logical reasoning, complex problem-solving, and integration with external tools and APIs.
- [Llama 3.2 1b Preview: Model (tsx)](https://console.groq.com/docs/model/llama-3.2-1b-preview): The Llama 3.2 1B Preview model page provides information on a high-speed, cost-effective AI model with 1.23 billion parameters, ideal for applications requiring rapid text analysis, information retrieval, and content summarization. This page details the model's key features, use cases, and benefits, including its optimal balance of speed, quality, and cost.
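A minimal moderation sketch for the Llama Guard 3 entry above, assuming the official `groq` Python client and inferring the `llama-guard-3-8b` ID from the page URL; the safe/unsafe output convention follows Llama Guard's standard behavior as described on the Content Moderation page.

```python
# Hypothetical moderation sketch: Llama Guard 3 classifies a message as
# "safe" or "unsafe" (with a violated-category code on the next line).
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

user_message = "How do I bake sourdough bread?"
response = client.chat.completions.create(
    model="llama-guard-3-8b",  # ID inferred from the model page URL
    messages=[{"role": "user", "content": user_message}],
)

verdict = response.choices[0].message.content.strip()
print(verdict)  # e.g. "safe", or "unsafe" plus a category code like "S7"
```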
- [Llama 4 Scout 17b 16e Instruct: Model (tsx)](https://console.groq.com/docs/model/llama-4-scout-17b-16e-instruct): The Llama 4 Scout 17B 16E Instruct model page provides information on Meta's 17 billion parameter mixture-of-experts model, featuring native multimodality for text and image understanding, instruction-tuned for tasks like chat, visual reasoning, and coding. This page offers details on the model's capabilities, including a 10M token context length and industry-leading inference speed on Groq.
- [LiveKit + Groq: Build End-to-End AI Voice Applications](https://console.groq.com/docs/livekit): This page provides a guide on integrating LiveKit with Groq to build end-to-end AI voice applications, combining speech recognition, text-to-speech, and real-time communication features. It offers a step-by-step tutorial on setting up a voice agent using LiveKit and Groq's high-performance speech recognition capabilities.
- [Agno + Groq: Lightning Fast Agents](https://console.groq.com/docs/agno): This documentation page provides a guide on building lightning-fast agents using Agno and Groq, enabling developers to create autonomous programs that leverage language models to achieve tasks. It covers the setup, configuration, and implementation of various agent types, including agentic RAG, image agents, reasoning agents, and multi-agent teams.
- [API Error Codes and Responses](https://console.groq.com/docs/errors): This documentation page provides a comprehensive list of API error codes and responses, including standard HTTP status codes and custom error codes, to help with error handling and debugging. It explains the meaning and implications of each error code, along with example response bodies and a structured error object to facilitate troubleshooting (see the error-handling sketch below).
- [Composio](https://console.groq.com/docs/composio): This documentation page provides an overview and quick start guide for Composio, a platform for managing and integrating tools with LLMs and AI agents, enabling the creation of fast, Groq-based assistants that interact with external applications. It outlines Composio's key features and capabilities, including tool integration, authentication management, and optimized execution, and offers a step-by-step tutorial on building a Composio-enabled Groq agent.
- [Changelog](https://console.groq.com/docs/legacy-changelog): The Groq Changelog provides a chronological record of updates, releases, and changes to the Groq API, allowing users to track ongoing developments and new features. This page lists recent updates, including new model releases, feature additions, and deprecations, to help users stay informed about changes to the API.
- [Overview](https://console.groq.com/docs/overview): The "Overview" page provides an introduction to Groq's API, highlighting its fast LLM inference, OpenAI compatibility, and ease of integration and scalability. This page serves as a starting point for developers to quickly get started with building applications on Groq.
- [✨ Vercel AI SDK + Groq: Rapid App Development](https://console.groq.com/docs/ai-sdk): This documentation page provides a guide on rapidly developing applications using the Vercel AI SDK with Groq, enabling seamless integration with powerful language models for scalable and high-speed applications. It offers a step-by-step quick start guide in JavaScript, covering setup, configuration, and deployment of AI-powered applications.
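A sketch of defensive error handling for the error-codes page above, assuming the official `groq` Python client exposes OpenAI-style exception classes (`RateLimitError`, `APIStatusError`); the exact class and attribute names are assumptions to verify against the client library docs.

```python
# Hypothetical error-handling sketch around a chat completion call.
import os

import groq
from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

try:
    response = client.chat.completions.create(
        model="llama-3.3-70b-versatile",  # assumed model ID
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)
except groq.RateLimitError:
    # 429: back off and retry, ideally honoring the retry-after header.
    print("Rate limited; retry later.")
except groq.APIStatusError as err:
    # Other non-2xx responses carry a structured error object.
    print(f"API error {err.status_code}: {err.message}")
```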
- [E2B + Groq: Open-Source Code Interpreter](https://console.groq.com/docs/e2b): The E2B + Groq documentation page provides instructions and code examples for building secure, sandboxed environments that execute code generated by Large Language Models (LLMs) via the Groq API. This page guides developers in integrating E2B's code interpreter with Groq's AI capabilities to create applications for data analysis, coding, and reasoning-heavy agents.
- [🚅 LiteLLM + Groq for Production Deployments](https://console.groq.com/docs/litellm): This page provides a guide on integrating LiteLLM with Groq for production deployments, enabling features such as cost management, smart caching, and spend tracking. It offers a quick start tutorial and next steps for configuring advanced features and building production-ready applications with LiteLLM and Groq.
- [AutoGen + Groq: Building Multi-Agent AI Applications](https://console.groq.com/docs/autogen): This documentation page provides a guide on building multi-agent AI applications using AutoGen, an open-source framework developed by Microsoft Research, powered by Groq's fast inference speed. It covers the integration of AutoGen with Groq to create sophisticated AI agents that collaborate to solve complex tasks, along with code examples and advanced features.
- [OpenAI Compatibility](https://console.groq.com/docs/openai): This page provides information on the compatibility of the Groq API with OpenAI's client libraries, allowing users to easily configure their existing applications to run on Groq. It outlines the configuration steps and currently unsupported features, and encourages users to provide feedback on requested features (see the configuration sketch below).
- [JigsawStack 🧩](https://console.groq.com/docs/jigsawstack): The JigsawStack 🧩 documentation page provides information on integrating a powerful AI SDK into any backend to automate tasks such as web scraping, OCR, and translation using Large Language Models (LLMs). This page explains the features and capabilities of the JigsawStack Prompt Engine, including its ability to automatically select the best LLMs, optimize performance, and ensure output accuracy and safety.
- [Groq API Reference](https://console.groq.com/docs/api-reference): The Groq API Reference provides detailed information on the Groq API, including endpoints, parameters, and response formats, to help developers integrate and interact with Groq services. This reference guide serves as a comprehensive resource for building, testing, and deploying applications that utilize the Groq API.
- [Hooks: Use Extended Models (ts)](https://console.groq.com/docs/hooks/use-extended-models): This page documents the "Hooks: Use Extended Models" feature, which provides a way to interact with extended models using TypeScript. It outlines the schema and types for extended models, including features, metadata, and pricing, as well as utility functions for working with these models.
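A minimal sketch of the OpenAI-compatibility setup described above, assuming the `openai` Python client and Groq's documented base URL `https://api.groq.com/openai/v1`; the model ID is an assumption.

```python
# Hypothetical sketch: point the OpenAI client at Groq's compatible endpoint.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",  # Groq's OpenAI-compatible URL
    api_key=os.environ["GROQ_API_KEY"],
)

response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",  # assumed model ID
    messages=[{"role": "user", "content": "Say hello from Groq."}],
)
print(response.choices[0].message.content)
```

Because only the base URL and API key change, existing OpenAI-based code can usually be pointed at Groq without further modification, subject to the unsupported features noted on the page.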