GroqCloud currently supports the following models:


Production Models

Note: Production models are intended for use in your production environments. They meet or exceed our high standards for speed and quality.

| Model ID | Developer | Context Window (tokens) | Max Completion Tokens | Max File Size |
| --- | --- | --- | --- | --- |
| distil-whisper-large-v3-en | HuggingFace | - | - | 25 MB |
| gemma2-9b-it | Google | 8,192 | - | - |
| llama-3.3-70b-versatile | Meta | 128K | 32,768 | - |
| llama-3.1-8b-instant | Meta | 128K | 8,192 | - |
| llama-guard-3-8b | Meta | 8,192 | - | - |
| llama3-70b-8192 | Meta | 8,192 | - | - |
| llama3-8b-8192 | Meta | 8,192 | - | - |
| mixtral-8x7b-32768 | Mistral | 32,768 | - | - |
| whisper-large-v3 | OpenAI | - | - | 25 MB |
| whisper-large-v3-turbo | OpenAI | - | - | 25 MB |
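
As a minimal sketch of how the limits in the production table above might be applied in client code, the helper below caps a requested completion length at the listed maximum. The names `MAX_COMPLETION_TOKENS` and `clamp_max_tokens` are hypothetical, only three models are included for brevity, and treating a "-" entry as "no limit listed" is an assumption rather than documented behavior.

```python
from typing import Optional

# Max completion tokens copied from the production table above.
# None stands for a "-" entry (assumed here to mean "no limit listed").
MAX_COMPLETION_TOKENS: dict[str, Optional[int]] = {
    "llama-3.3-70b-versatile": 32_768,
    "llama-3.1-8b-instant": 8_192,
    "gemma2-9b-it": None,
}


def clamp_max_tokens(model_id: str, requested: int) -> int:
    """Cap `requested` at the model's listed completion limit, if any."""
    if model_id not in MAX_COMPLETION_TOKENS:
        raise KeyError(f"unknown model ID: {model_id}")
    limit = MAX_COMPLETION_TOKENS[model_id]
    return requested if limit is None else min(requested, limit)


print(clamp_max_tokens("llama-3.1-8b-instant", 20_000))
print(clamp_max_tokens("gemma2-9b-it", 4_000))
```

Raising on an unknown model ID, instead of passing the request through unchanged, is a design choice that surfaces typos in model IDs early.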

Preview Models

Note: Preview models are intended for evaluation purposes only and should not be used in production environments as they may be discontinued at short notice.

| Model ID | Developer | Context Window (tokens) | Max Completion Tokens | Max File Size |
| --- | --- | --- | --- | --- |
| qwen-qwq-32b | Alibaba Cloud | 128K | - | - |
| mistral-saba-24b | Mistral | 32K | - | - |
| qwen-2.5-coder-32b | Alibaba Cloud | 128K | - | - |
| qwen-2.5-32b | Alibaba Cloud | 128K | - | - |
| deepseek-r1-distill-qwen-32b | DeepSeek | 128K | 16,384 | - |
| deepseek-r1-distill-llama-70b-specdec | DeepSeek | 128K | 16,384 | - |
| deepseek-r1-distill-llama-70b | DeepSeek | 128K | - | - |
| llama-3.3-70b-specdec | Meta | 8,192 | - | - |
| llama-3.2-1b-preview | Meta | 128K | 8,192 | - |
| llama-3.2-3b-preview | Meta | 128K | 8,192 | - |
| llama-3.2-11b-vision-preview | Meta | 128K | 8,192 | - |
| llama-3.2-90b-vision-preview | Meta | 128K | 8,192 | - |

Deprecated models are models that are no longer supported or are scheduled to lose support. Each deprecated model is listed with a suggested alternative. See the deprecated models list in the GroqCloud documentation.


Hosted models are directly accessible through the GroqCloud Models API endpoint using the model IDs listed above. A GET request to the https://api.groq.com/openai/v1/models endpoint returns a JSON list of all active models:

import requests
import os

# Read the API key from the environment rather than hard-coding it.
api_key = os.environ.get("GROQ_API_KEY")
url = "https://api.groq.com/openai/v1/models"

headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
}

# Request the list of active models and print the parsed JSON body.
response = requests.get(url, headers=headers)

print(response.json())
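
The printed JSON can be reduced to just the model IDs. The sketch below assumes the response follows the OpenAI-compatible list shape, that is, a top-level `data` array whose entries each carry an `id` field; this page does not spell out that shape, so treat it as an assumption. The helper name `extract_model_ids` is hypothetical, and the sample payload is hand-written for illustration, not captured from the API.

```python
from typing import Any


def extract_model_ids(payload: dict[str, Any]) -> list[str]:
    """Return the sorted model IDs found in a models-list payload.

    Assumes an OpenAI-style shape: {"data": [{"id": "..."}, ...]}.
    Entries without an "id" key are skipped.
    """
    entries = payload.get("data", [])
    return sorted(entry["id"] for entry in entries if "id" in entry)


# Illustrative payload (hand-written, not real API output).
sample_payload = {
    "object": "list",
    "data": [
        {"id": "whisper-large-v3", "object": "model"},
        {"id": "gemma2-9b-it", "object": "model"},
        {"object": "model"},  # malformed entry, skipped
    ],
}

model_ids = extract_model_ids(sample_payload)
print(model_ids)
```

With a live request, `extract_model_ids(response.json())` would take the place of the sample payload, provided the assumed shape holds.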