Supported Models
GroqCloud currently supports the following models:
Llama 3.1 405B (Preview)
- Model ID: llama-3.1-405b-reasoning
- Developer: Meta
- Context Window: 131,072 tokens
- Model Card
Llama 3.1 70B (Preview)
- Model ID: llama-3.1-70b-versatile
- Developer: Meta
- Context Window: 131,072 tokens
- Model Card
Llama 3.1 8B (Preview)
- Model ID: llama-3.1-8b-instant
- Developer: Meta
- Context Window: 131,072 tokens
- Model Card
Early API access to Llama 3.1 405B is currently available only to paying customers; stay tuned for general availability. During the preview launch, all Llama 3.1 models are limited to a max_tokens of 8,192, and the 405B model is limited to 16,384 input tokens.
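Requests that exceed the preview limits above will be rejected, so it can be convenient to cap max_tokens client-side before sending a request. A minimal sketch (the helper name and structure are ours, not part of the API; the 8,192 cap is the documented preview limit):

```python
# Preview-launch cap on max_tokens for Llama 3.1 models (see note above).
PREVIEW_MAX_TOKENS = 8192

def clamp_max_tokens(requested: int) -> int:
    """Clamp a requested max_tokens value to the preview limit."""
    return min(requested, PREVIEW_MAX_TOKENS)

print(clamp_max_tokens(16000))  # 8192: requests above the cap are clamped
print(clamp_max_tokens(512))    # 512: requests within the cap pass through
```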
Llama 3 Groq 70B Tool Use (Preview)
- Model ID: llama3-groq-70b-8192-tool-use-preview
- Developer: Groq
- Context Window: 8,192 tokens
- Model Card
Llama 3 Groq 8B Tool Use (Preview)
- Model ID: llama3-groq-8b-8192-tool-use-preview
- Developer: Groq
- Context Window: 8,192 tokens
- Model Card
Meta Llama 3 70B
- Model ID: llama3-70b-8192
- Developer: Meta
- Context Window: 8,192 tokens
- Model Card
Meta Llama 3 8B
- Model ID: llama3-8b-8192
- Developer: Meta
- Context Window: 8,192 tokens
- Model Card
Mixtral 8x7B
- Model ID: mixtral-8x7b-32768
- Developer: Mistral
- Context Window: 32,768 tokens
- Model Card
Gemma 7B
- Model ID: gemma-7b-it
- Developer: Google
- Context Window: 8,192 tokens
- Model Card
Gemma 2 9B
- Model ID: gemma2-9b-it
- Developer: Google
- Context Window: 8,192 tokens
- Model Card
Whisper
- Model ID: whisper-large-v3
- Developer: OpenAI
- Max File Size: 25 MB
- Model Card
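Whisper is an audio model rather than a chat model, so it is reached through the OpenAI-compatible audio transcription endpoint instead of chat completions. A minimal sketch, assuming a local audio file under the 25 MB limit (the file path and helper names here are illustrative, not part of the API):

```python
import os
import requests

API_URL = "https://api.groq.com/openai/v1/audio/transcriptions"

def transcription_payload(model: str = "whisper-large-v3") -> dict:
    # Form fields sent alongside the audio file in the multipart request.
    return {"model": model}

def transcribe(file_path: str, api_key: str) -> dict:
    headers = {"Authorization": f"Bearer {api_key}"}
    with open(file_path, "rb") as f:
        response = requests.post(
            API_URL,
            headers=headers,
            files={"file": (os.path.basename(file_path), f)},
            data=transcription_payload(),
        )
    response.raise_for_status()
    return response.json()
```

Usage would look like `transcribe("meeting.m4a", os.environ["GROQ_API_KEY"])`, returning the transcription as JSON.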
These chat and audio models are directly accessible through the GroqCloud API using the model IDs listed above. You can query the https://api.groq.com/openai/v1/models
endpoint to return a JSON list of all active models:
import os
import requests

# Read the API key from the environment; never hard-code credentials.
api_key = os.environ.get("GROQ_API_KEY")

url = "https://api.groq.com/openai/v1/models"
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}

response = requests.get(url, headers=headers)
response.raise_for_status()  # fail loudly on auth or network errors
print(response.json())
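The chat models above are used through the same OpenAI-compatible base URL via the chat completions endpoint. A minimal sketch (the model choice, prompt, and helper names are illustrative):

```python
import os
import requests

CHAT_URL = "https://api.groq.com/openai/v1/chat/completions"

def chat_request_body(model: str, prompt: str) -> dict:
    # Standard OpenAI-style chat payload; any chat model ID listed above works.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(model: str, prompt: str, api_key: str) -> str:
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    response = requests.post(
        CHAT_URL, headers=headers, json=chat_request_body(model, prompt)
    )
    response.raise_for_status()
    # The reply text lives in the first choice's message content.
    return response.json()["choices"][0]["message"]["content"]
```

Usage would look like `chat("llama3-8b-8192", "Hello!", os.environ["GROQ_API_KEY"])`.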