Supported Models

GroqCloud currently supports the following models:

| Model ID | Developer | Context Window (tokens) | Max Tokens | Max File Size | Model Card Link |
| --- | --- | --- | --- | --- | --- |
| distil-whisper-large-v3-en | HuggingFace | - | - | 25 MB | Card |
| gemma2-9b-it | Google | 8,192 | - | - | Card |
| gemma-7b-it | Google | 8,192 | - | - | Card |
| llama3-groq-70b-8192-tool-use-preview | Groq | 8,192 | - | - | Card |
| llama3-groq-8b-8192-tool-use-preview | Groq | 8,192 | - | - | Card |
| llama-3.1-70b-versatile | Meta | 128k | 32,768 | - | Card |
| llama-3.1-70b-specdec | Meta | 128k | 8,192 | - | Card |
| llama-3.1-8b-instant | Meta | 128k | 8,192 | - | Card |
| llama-3.2-1b-preview | Meta | 128k | 8,192 | - | Card |
| llama-3.2-3b-preview | Meta | 128k | 8,192 | - | Card |
| llama-3.2-11b-vision-preview | Meta | 128k | 8,192 | - | Card |
| llama-3.2-90b-vision-preview | Meta | 128k | 8,192 | - | Card |
| llama-guard-3-8b | Meta | 8,192 | - | - | Card |
| llama3-70b-8192 | Meta | 8,192 | - | - | Card |
| llama3-8b-8192 | Meta | 8,192 | - | - | Card |
| mixtral-8x7b-32768 | Mistral | 32,768 | - | - | Card |
| whisper-large-v3 | OpenAI | - | - | 25 MB | Card |
| whisper-large-v3-turbo | OpenAI | - | - | 25 MB | Card |

Note: Models with a context window of 128K tokens are currently limited to 8,192 max tokens in preview. Fields that do not apply to a model are denoted with a dash ("-").
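
For example, a 128K-context model from the table above can be called with an explicit max_tokens value that stays within the preview limit. The sketch below is illustrative only: it assumes the OpenAI-compatible chat completions endpoint at https://api.groq.com/openai/v1/chat/completions, and the model choice and prompt are placeholders.

```python
import os

import requests

# Sketch only: call a 128K-context model while respecting the 8,192 max token
# preview limit described above. The chat completions endpoint path is assumed
# to follow the OpenAI-compatible convention; the prompt is a placeholder.
api_key = os.environ.get("GROQ_API_KEY")
url = "https://api.groq.com/openai/v1/chat/completions"

headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
}

payload = {
    "model": "llama-3.1-70b-versatile",  # 128K-token context window (see table)
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 8192                   # current preview limit for 128K models
}

response = requests.post(url, headers=headers, json=payload)
print(response.json())
```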


Hosted models are directly accessible through the GroqCloud Models API endpoint using the model IDs listed above. You can query the https://api.groq.com/openai/v1/models endpoint to retrieve a JSON list of all active models:

```python
import requests
import os

api_key = os.environ.get("GROQ_API_KEY")
url = "https://api.groq.com/openai/v1/models"

headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
}

response = requests.get(url, headers=headers)

print(response.json())
```
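
If you only need the model IDs themselves, the response can be filtered. This is a minimal sketch that assumes the OpenAI-compatible list shape, where each entry in the "data" array carries an "id" field:

```python
import os

import requests

# Sketch: print just the active model IDs, assuming the OpenAI-compatible
# response shape {"object": "list", "data": [{"id": "...", ...}, ...]}.
api_key = os.environ.get("GROQ_API_KEY")
headers = {"Authorization": f"Bearer {api_key}"}

response = requests.get("https://api.groq.com/openai/v1/models", headers=headers)
model_ids = [model["id"] for model in response.json().get("data", [])]
print(sorted(model_ids))
```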