Supported Models

GroqCloud currently supports the following models:


Production Models

MODEL ID                      DEVELOPER    CONTEXT WINDOW (TOKENS)  MAX OUTPUT TOKENS  MAX FILE SIZE  MODEL CARD LINK
distil-whisper-large-v3-en    HuggingFace  -                        -                  25 MB          Card
gemma2-9b-it                  Google       8,192                    -                  -              Card
llama-3.3-70b-versatile       Meta         128k                     32,768             -              Card
llama-3.1-8b-instant          Meta         128k                     8,192              -              Card
llama-guard-3-8b              Meta         8,192                    -                  -              Card
llama3-70b-8192               Meta         8,192                    -                  -              Card
llama3-8b-8192                Meta         8,192                    -                  -              Card
mixtral-8x7b-32768            Mistral      32,768                   -                  -              Card
whisper-large-v3              OpenAI       -                        -                  25 MB          Card
whisper-large-v3-turbo        OpenAI       -                        -                  25 MB          Card

Preview Models

Note: Preview models are intended for evaluation purposes only and should not be used in production environments, as they may be discontinued at short notice.

MODEL ID                      DEVELOPER    CONTEXT WINDOW (TOKENS)  MAX OUTPUT TOKENS  MAX FILE SIZE  MODEL CARD LINK
llama-3.3-70b-specdec         Meta         8,192                    -                  -              Card
llama-3.2-1b-preview          Meta         128k                     8,192              -              Card
llama-3.2-3b-preview          Meta         128k                     8,192              -              Card
llama-3.2-11b-vision-preview  Meta         128k                     8,192              -              Card
llama-3.2-90b-vision-preview  Meta         128k                     8,192              -              Card
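
One way to act on the note above is to check a configured model ID against the preview list before deploying. The following is a minimal sketch under stated assumptions: the set of IDs is copied by hand from the preview table and must be kept up to date manually, and the helper names `is_preview_model` and `require_production_model` are hypothetical, not part of any GroqCloud API.

```python
# Preview model IDs copied from the table above. This is a hand-maintained
# snapshot (an assumption of this sketch), not something fetched from the API.
PREVIEW_MODELS = frozenset({
    "llama-3.3-70b-specdec",
    "llama-3.2-1b-preview",
    "llama-3.2-3b-preview",
    "llama-3.2-11b-vision-preview",
    "llama-3.2-90b-vision-preview",
})


def is_preview_model(model_id: str) -> bool:
    """Return True if model_id appears in the hand-maintained preview set."""
    return model_id in PREVIEW_MODELS


def require_production_model(model_id: str) -> str:
    """Return model_id unchanged, or raise ValueError if it is a preview model."""
    if is_preview_model(model_id):
        raise ValueError(
            f"{model_id!r} is a preview model and may be discontinued at short notice"
        )
    return model_id
```

Calling `require_production_model` once at application startup would surface a misconfigured model ID early rather than at request time.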

See our deprecated models here


Hosted models are directly accessible through the GroqCloud API using the model IDs listed above. To retrieve a JSON list of all active models, send a GET request to the https://api.groq.com/openai/v1/models endpoint:

import os

import requests

# Read the API key from the environment rather than hard-coding it.
api_key = os.environ.get("GROQ_API_KEY")
url = "https://api.groq.com/openai/v1/models"

headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}

# Request the list of active models and print the parsed JSON body.
response = requests.get(url, headers=headers)

print(response.json())
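
The parsed response can be reduced to a plain list of model IDs. The sketch below assumes the endpoint returns an OpenAI-style list object, that is, a JSON object whose `data` array holds one entry per model with an `id` field. That shape is an assumption here and should be checked against a real response. The helper name `extract_model_ids` and the sample payload are illustrative only.

```python
def extract_model_ids(payload: dict) -> list[str]:
    """Return the sorted model IDs found in a models-list payload.

    Assumes an OpenAI-style shape: {"data": [{"id": "..."}, ...]}.
    Entries without an "id" key are skipped.
    """
    models = payload.get("data", [])
    return sorted(m["id"] for m in models if "id" in m)


# Hand-written sample payload for illustration; not real API output.
sample = {
    "object": "list",
    "data": [
        {"id": "llama-3.1-8b-instant", "object": "model"},
        {"id": "gemma2-9b-it", "object": "model"},
    ],
}

print(extract_model_ids(sample))
```

With a live request, `response.json()` would be passed in place of `sample`.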