
Supported Models

GroqCloud currently supports the following models:

Llama 3.1 405B (Preview)

  • Model ID: llama-3.1-405b-reasoning
  • Developer: Meta
  • Context Window: 131,072 tokens
  • Model Card

Llama 3.1 70B (Preview)

  • Model ID: llama-3.1-70b-versatile
  • Developer: Meta
  • Context Window: 131,072 tokens
  • Model Card

Llama 3.1 8B (Preview)

  • Model ID: llama-3.1-8b-instant
  • Developer: Meta
  • Context Window: 131,072 tokens
  • Model Card

Early API access to Llama 3.1 405B is currently available only to paying customers; stay tuned for general availability. During the preview launch, all 3.1 models are limited to a max_tokens of 8k, and 405b is additionally limited to 16k input tokens.
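
As a rough illustration of the preview limits described above, the sketch below caps a requested max_tokens value and checks the 405B input size on the client side. The helper names and constants are hypothetical. The sketch also assumes that "8k" and "16k" mean 8,000 and 16,000 tokens, which the text does not spell out.

```python
# Assumed numeric values for the preview limits: "8k" and "16k" are read as
# 8,000 and 16,000 tokens here; the exact figures are not stated.
PREVIEW_MAX_TOKENS = 8_000
PREVIEW_405B_INPUT_TOKENS = 16_000

LLAMA_31_MODELS = {
    "llama-3.1-405b-reasoning",
    "llama-3.1-70b-versatile",
    "llama-3.1-8b-instant",
}


def clamp_max_tokens(model_id: str, requested: int) -> int:
    """Return max_tokens reduced to the preview cap for Llama 3.1 models."""
    if model_id in LLAMA_31_MODELS:
        return min(requested, PREVIEW_MAX_TOKENS)
    return requested


def input_fits(model_id: str, input_tokens: int) -> bool:
    """Return False if a 405B prompt exceeds the assumed preview input limit."""
    if model_id == "llama-3.1-405b-reasoning":
        return input_tokens <= PREVIEW_405B_INPUT_TOKENS
    return True
```

Checking these limits before sending a request is one way to avoid a rejected call during the preview; the service itself remains the authority on the actual limits.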

Llama 3 Groq 70B Tool Use (Preview)

  • Model ID: llama3-groq-70b-8192-tool-use-preview
  • Developer: Groq
  • Context Window: 8,192 tokens
  • Model Card

Llama 3 Groq 8B Tool Use (Preview)

  • Model ID: llama3-groq-8b-8192-tool-use-preview
  • Developer: Groq
  • Context Window: 8,192 tokens
  • Model Card

Meta Llama 3 70B

  • Model ID: llama3-70b-8192
  • Developer: Meta
  • Context Window: 8,192 tokens
  • Model Card

Meta Llama 3 8B

  • Model ID: llama3-8b-8192
  • Developer: Meta
  • Context Window: 8,192 tokens
  • Model Card

Mixtral 8x7B

  • Model ID: mixtral-8x7b-32768
  • Developer: Mistral
  • Context Window: 32,768 tokens
  • Model Card

Gemma 7B

  • Model ID: gemma-7b-it
  • Developer: Google
  • Context Window: 8,192 tokens
  • Model Card

Gemma 2 9B

  • Model ID: gemma2-9b-it
  • Developer: Google
  • Context Window: 8,192 tokens
  • Model Card


Whisper Large V3

  • Model ID: whisper-large-v3
  • Developer: OpenAI
  • Max File Size: 25 MB
  • Model Card
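
The model IDs and context windows listed above can be collected into a lookup table, which is handy for checking a request before sending it. This is a minimal sketch: the names `CONTEXT_WINDOWS` and `fits_context` are hypothetical, and the audio model is left out because it has a file size limit rather than a context window.

```python
# Context windows (in tokens) copied from the model list above.
CONTEXT_WINDOWS = {
    "llama-3.1-405b-reasoning": 131_072,
    "llama-3.1-70b-versatile": 131_072,
    "llama-3.1-8b-instant": 131_072,
    "llama3-groq-70b-8192-tool-use-preview": 8_192,
    "llama3-groq-8b-8192-tool-use-preview": 8_192,
    "llama3-70b-8192": 8_192,
    "llama3-8b-8192": 8_192,
    "mixtral-8x7b-32768": 32_768,
    "gemma-7b-it": 8_192,
    "gemma2-9b-it": 8_192,
}


def fits_context(model_id: str, prompt_tokens: int, max_tokens: int) -> bool:
    """Check that prompt plus completion tokens fit the model's context window.

    Raises KeyError for a model ID that is not in the table.
    """
    return prompt_tokens + max_tokens <= CONTEXT_WINDOWS[model_id]
```

For example, `fits_context("llama3-8b-8192", 8000, 500)` is False because 8,500 tokens exceed the 8,192-token window. The table only reflects the listed context windows and does not account for any temporary preview limits.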

These chat and audio models are directly accessible through the GroqCloud Models API endpoint using the model IDs listed above. You can use the endpoint to return a JSON list of all active models:

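import os

import requests

# Read the API key from the environment rather than hard-coding it.
api_key = os.environ.get("GROQ_API_KEY")
# Set this to the GroqCloud Models API endpoint URL.
url = ""

headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}

response = requests.get(url, headers=headers)
response.raise_for_status()

# Print the JSON list of active models.
print(response.json())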
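
To pull just the model IDs out of the returned JSON, something like the following sketch may help. It assumes an OpenAI-style list payload (a top-level "data" array of objects that each carry an "id" field); that shape is an assumption rather than something stated here, and `extract_model_ids` and the sample payload are hypothetical.

```python
from typing import Any


def extract_model_ids(payload: dict[str, Any]) -> list[str]:
    """Return the sorted model IDs from a models-list payload.

    Assumes the payload has a "data" list whose entries contain an "id" key.
    """
    return sorted(entry["id"] for entry in payload.get("data", []))


# Hypothetical sample payload, shaped like the assumed response format.
sample = {
    "object": "list",
    "data": [
        {"id": "mixtral-8x7b-32768", "object": "model"},
        {"id": "gemma-7b-it", "object": "model"},
    ],
}

print(extract_model_ids(sample))  # ['gemma-7b-it', 'mixtral-8x7b-32768']
```

With a live request, the same helper would be called as `extract_model_ids(response.json())`.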
