# https://console.groq.com llms-full.txt
## 🗂️ LlamaIndex 🦙
URL: https://console.groq.com/docs/llama-index
[LlamaIndex](https://www.llamaindex.ai/) is a data framework for LLM-based applications that benefit from context augmentation, such as Retrieval-Augmented Generation (RAG) systems. LlamaIndex provides the essential abstractions to more easily ingest, structure, and access private or domain-specific data, resulting in safe and reliable injection into LLMs for more accurate text generation.
For more information, read the LlamaIndex Groq integration documentation for [Python](https://docs.llamaindex.ai/en/stable/examples/llm/groq.html) and [JavaScript](https://ts.llamaindex.ai/modules/llms/available_llms/groq).
---
## Rate Limits
URL: https://console.groq.com/docs/rate-limits
Rate limits act as control measures to regulate how frequently users and applications can access our API within specified timeframes. These limits help ensure service stability, fair access, and protection
against misuse so that we can serve reliable and fast inference for all.
### Understanding Rate Limits
Rate limits are measured in:
- **RPM:** Requests per minute
- **RPD:** Requests per day
- **TPM:** Tokens per minute
- **TPD:** Tokens per day
- **ASH:** Audio seconds per hour
- **ASD:** Audio seconds per day
Rate limits apply at the organization level, not to individual users. You can hit any limit type depending on which threshold you reach first.
**Example:** Let's say your RPM = 50 and your TPM = 200K. If you send 50 requests, each with only 100 tokens, within a minute, you would reach your request limit even though you did not send 200K tokens across those 50 requests.
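The arithmetic of the example above, as a quick sketch (limit values taken from the example, not from any live account):

```python
# Illustrative limits from the example: RPM = 50, TPM = 200K
RPM, TPM = 50, 200_000

requests_sent = 50
tokens_sent = 50 * 100  # 50 requests at 100 tokens each = 5,000 tokens

hit_request_limit = requests_sent >= RPM  # True: the RPM ceiling trips first
hit_token_limit = tokens_sent >= TPM      # False: well under 200K tokens
```

Whichever threshold you cross first is the one that throttles you.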
## Rate Limits
The following is a high level summary and there may be exceptions to these limits. You can view the current, exact rate limits for your organization on the [limits page](/settings/limits) in your account settings.
## Rate Limit Headers
In addition to viewing your limits on your account's [limits](https://console.groq.com/settings/limits) page, you can also view rate limit information such as remaining requests and tokens in HTTP response
headers as follows:
The following headers are set (values are illustrative):
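A hedged illustration of what these headers look like, assuming the OpenAI-style rate limit header names Groq uses (the values below are made up for illustration, not taken from a live response):

```
retry-after: 2
x-ratelimit-limit-requests: 14400
x-ratelimit-limit-tokens: 18000
x-ratelimit-remaining-requests: 14370
x-ratelimit-remaining-tokens: 17997
x-ratelimit-reset-requests: 2m59.56s
x-ratelimit-reset-tokens: 7.66s
```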
## Handling Rate Limits
When you exceed rate limits, our API returns a `429 Too Many Requests` HTTP status code.
**Note**: `retry-after` is only set if you hit the rate limit and status code 429 is returned. The other headers are always included.
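One common way to handle a 429 is to retry with exponential backoff, honoring the `retry-after` header when the server provides it. A minimal sketch (the `make_request` callable and its response shape are assumptions for illustration, not part of the Groq SDK):

```python
import random
import time


def call_with_backoff(make_request, max_retries=5):
    """Call make_request(); on a 429 response, wait and retry.

    make_request should return an object with .status_code and .headers.
    """
    for attempt in range(max_retries):
        response = make_request()
        if response.status_code != 429:
            return response
        # Prefer the server's retry-after hint; otherwise back off
        # exponentially with a little jitter.
        retry_after = response.headers.get("retry-after")
        delay = float(retry_after) if retry_after else (2 ** attempt) + random.random()
        time.sleep(delay)
    raise RuntimeError(f"still rate limited after {max_retries} retries")
```

The same pattern works with any HTTP client; the only requirement is access to the response status code and headers.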
---
## Initialize the Groq client
URL: https://console.groq.com/docs/speech-to-text/scripts/transcription.py
```python
import os
import json

from groq import Groq

# Initialize the Groq client
client = Groq()

# Specify the path to the audio file
filename = os.path.dirname(__file__) + "/YOUR_AUDIO.wav"  # Replace with your audio file!

# Open the audio file
with open(filename, "rb") as file:
    # Create a transcription of the audio file
    transcription = client.audio.transcriptions.create(
        file=file,  # Required audio file
        model="whisper-large-v3-turbo",  # Required model to use for transcription
        prompt="Specify context or spelling",  # Optional
        response_format="verbose_json",  # Optional
        timestamp_granularities=["word", "segment"],  # Optional (requires response_format "verbose_json"; accepts "word", "segment" (default), or both)
        language="en",  # Optional
        temperature=0.0,  # Optional
    )

# To print only the transcription text, use print(transcription.text)
# (here we print the entire transcription object to access timestamps)
print(json.dumps(transcription, indent=2, default=str))
```
---
## Speech To Text: Transcription (js)
URL: https://console.groq.com/docs/speech-to-text/scripts/transcription
```javascript
import fs from "fs";
import Groq from "groq-sdk";

// Initialize the Groq client
const groq = new Groq();

async function main() {
  // Create a transcription job
  const transcription = await groq.audio.transcriptions.create({
    file: fs.createReadStream("YOUR_AUDIO.wav"), // Required path to audio file - replace with your audio file!
    model: "whisper-large-v3-turbo", // Required model to use for transcription
    prompt: "Specify context or spelling", // Optional
    response_format: "verbose_json", // Optional
    timestamp_granularities: ["word", "segment"], // Optional (requires response_format "verbose_json"; accepts "word", "segment" (default), or both)
    language: "en", // Optional
    temperature: 0.0, // Optional
  });
  // To print only the transcription text, use console.log(transcription.text);
  // (here we print the entire transcription object to access timestamps)
  console.log(JSON.stringify(transcription, null, 2));
}

main();
```
---
## Initialize the Groq client
URL: https://console.groq.com/docs/speech-to-text/scripts/translation.py
```python
import os

from groq import Groq

# Initialize the Groq client
client = Groq()

# Specify the path to the audio file
filename = os.path.dirname(__file__) + "/sample_audio.m4a"  # Replace with your audio file!

# Open the audio file
with open(filename, "rb") as file:
    # Create a translation of the audio file
    translation = client.audio.translations.create(
        file=(filename, file.read()),  # Required audio file
        model="whisper-large-v3",  # Required model to use for translation
        prompt="Specify context or spelling",  # Optional
        language="en",  # Optional ('en' only)
        response_format="json",  # Optional
        temperature=0.0,  # Optional
    )

# Print the translation text
print(translation.text)
```
---
## Speech To Text: Translation (js)
URL: https://console.groq.com/docs/speech-to-text/scripts/translation
```javascript
import fs from "fs";
import Groq from "groq-sdk";

// Initialize the Groq client
const groq = new Groq();

async function main() {
  // Create a translation job
  const translation = await groq.audio.translations.create({
    file: fs.createReadStream("sample_audio.m4a"), // Required path to audio file - replace with your audio file!
    model: "whisper-large-v3", // Required model to use for translation
    prompt: "Specify context or spelling", // Optional
    language: "en", // Optional ('en' only)
    response_format: "json", // Optional
    temperature: 0.0, // Optional
  });
  // Log the translated text
  console.log(translation.text);
}

main();
```
---
## Speech to Text
URL: https://console.groq.com/docs/speech-to-text
Groq API is the fastest speech-to-text solution available, offering OpenAI-compatible endpoints that
enable near-instant transcriptions and translations. With Groq API, you can integrate high-quality audio
processing into your applications at speeds that rival human interaction.
## API Endpoints
We support two endpoints:
| Endpoint | Usage | API Endpoint |
|----------------|--------------------------------|-------------------------------------------------------------|
| Transcriptions | Convert audio to text | `https://api.groq.com/openai/v1/audio/transcriptions` |
| Translations | Translate audio to English text| `https://api.groq.com/openai/v1/audio/translations` |
## Supported Models
| Model ID | Model | Supported Language(s) | Description |
|-----------------------------|----------------------|-------------------------------|-------------------------------------------------------------------------------------------------------------------------------|
| `whisper-large-v3-turbo` | Whisper Large V3 Turbo | Multilingual | A fine-tuned version of a pruned Whisper Large V3 designed for fast, multilingual transcription tasks. |
| `distil-whisper-large-v3-en` | Distil-Whisper English | English-only | A distilled, or compressed, version of OpenAI's Whisper model, designed to provide faster, lower cost English speech recognition while maintaining comparable accuracy. |
| `whisper-large-v3` | Whisper large-v3 | Multilingual | Provides state-of-the-art performance with high accuracy for multilingual transcription and translation tasks. |
## Which Whisper Model Should You Use?
Having more choices is great, but let's try to avoid decision paralysis by breaking down the tradeoffs between models to find the one most suitable for
your applications:
- If your application is error-sensitive and requires multilingual support, use `whisper-large-v3`.
- If your application is less sensitive to errors and requires English only, use `distil-whisper-large-v3-en`.
- If your application requires multilingual support and you need the best price for performance, use `whisper-large-v3-turbo`.
The following table breaks down the metrics for each model.
| Model | Cost Per Hour | Language Support | Transcription Support | Translation Support | Real-time Speed Factor | Word Error Rate |
|--------|--------|--------|--------|--------|--------|--------|
| `whisper-large-v3` | $0.111 | Multilingual | Yes | Yes | 189 | 10.3% |
| `whisper-large-v3-turbo` | $0.04 | Multilingual | Yes | No | 216 | 12% |
| `distil-whisper-large-v3-en` | $0.02 | English only | Yes | No | 250 | 13% |
## Working with Audio Files
### Audio File Limitations
* Max File Size: 40 MB (free tier), 100 MB (dev tier)
* Max Attachment File Size: 25 MB. If you need to process larger files, use the `url` parameter to specify a url to the file instead.
* Minimum File Length: 0.01 seconds
* Minimum Billed Length: 10 seconds. If you submit a request less than this, you will still be billed for 10 seconds.
* Supported File Types: Either a URL or a direct file upload for `flac`, `mp3`, `mp4`, `mpeg`, `mpga`, `m4a`, `ogg`, `wav`, `webm`
* Single Audio Track: Only the first track will be transcribed for files with multiple audio tracks. (e.g. dubbed video)
* Supported Response Formats: `json`, `verbose_json`, `text`
* Supported Timestamp Granularities: `segment`, `word`
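Combining the per-hour rates from the model comparison table with the 10-second minimum billed length, a rough cost estimate can be sketched as follows (the rates are hard-coded from the table above and may change; check your account's pricing for current values):

```python
# Dollars per hour of audio, from the model comparison table above
RATES_PER_HOUR = {
    "whisper-large-v3": 0.111,
    "whisper-large-v3-turbo": 0.04,
    "distil-whisper-large-v3-en": 0.02,
}
MIN_BILLED_SECONDS = 10  # shorter requests are still billed as 10 seconds


def estimate_cost(model: str, audio_seconds: float) -> float:
    """Estimate the dollar cost of transcribing one audio file."""
    billed_seconds = max(audio_seconds, MIN_BILLED_SECONDS)
    return RATES_PER_HOUR[model] * billed_seconds / 3600
```

For example, 30 minutes (1800 seconds) of audio with `whisper-large-v3-turbo` comes to 0.04 × 1800 / 3600 = $0.02, while a 3-second clip is billed as 10 seconds.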
### Audio Request Examples
I am on the **free tier** and want to transcribe an audio file:
- Use the `file` parameter to upload a local file up to 25 MB.
- Use the `url` parameter to pass a URL to a file up to 40 MB.

I am on the **dev tier** and want to transcribe an audio file:
- Use the `file` parameter to upload a local file up to 25 MB.
- Use the `url` parameter to pass a URL to a file up to 100 MB.
If audio files exceed these limits, you may receive a [413 error](/docs/errors).
### Audio Preprocessing
Our speech-to-text models downsample audio to 16 kHz mono before transcribing, which is optimal for speech recognition. You can perform this preprocessing client-side if your original file is very large and you want to reduce its size without losing quality (without chunking, Groq API speech-to-text endpoints accept up to 40 MB on the free tier and 100 MB on the [dev tier](/settings/billing)). We recommend FLAC for lossless compression.
The following `ffmpeg` command can be used to reduce file size:
```shell
ffmpeg \
  -i <your_file> \
  -ar 16000 \
  -ac 1 \
  -map 0:a \
  -c:a flac \
  <output_file_name>.flac
```