Welcome

Fast LLM inference, OpenAI-compatible. Simple to integrate, easy to scale. Start building in minutes.

curl -X POST https://api.groq.com/openai/v1/chat/completions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b-versatile",
    "messages": [{
      "role": "user",
      "content": "Explain the importance of fast language models"
    }]
  }'
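Because the endpoint is OpenAI-compatible, the same request can also be made with the OpenAI Python SDK by pointing its base URL at https://api.groq.com/openai/v1. The following is a minimal sketch, assuming the openai package is installed and GROQ_API_KEY is set in your environment.

# Minimal sketch: the OpenAI Python SDK against Groq's
# OpenAI-compatible endpoint. Assumes `pip install openai`
# and GROQ_API_KEY set in the environment.
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["GROQ_API_KEY"],
    base_url="https://api.groq.com/openai/v1",
)

response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[
        {
            "role": "user",
            "content": "Explain the importance of fast language models",
        }
    ],
)

print(response.choices[0].message.content)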
OpenAI GPT-OSS (20B & 120B) models are now available for instant inference.
These models have built-in browser search and code execution capabilities.
Learn about GPT-OSS
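Calling one of these models uses the same chat completions request with a different model field. The sketch below assumes the identifier openai/gpt-oss-20b for the 20B variant; confirm the exact ID against the models list, and see the GPT-OSS documentation for how to enable the built-in browser search and code execution tools.

# Same OpenAI-compatible request, pointed at a GPT-OSS model.
# The model ID below is an assumption; verify it before use.
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["GROQ_API_KEY"],
    base_url="https://api.groq.com/openai/v1",
)

response = client.chat.completions.create(
    model="openai/gpt-oss-20b",  # assumed identifier for the 20B variant
    messages=[
        {"role": "user", "content": "Summarize why low-latency inference matters."}
    ],
)

print(response.choices[0].message.content)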
