Rate Limits

Rate limits control how frequently a user or application can make requests within a given time frame.

Current rate limits for chat completions

You can view the current rate limits for chat completions in your organization settings.


The team is working on introducing paid tiers with stable, higher rate limits in the near future.

Status code & rate limit headers

We set the following x-ratelimit headers to inform you of the current rate limits that apply to the API key and its associated organization (values are illustrative):


| Header | Value | Notes |
| --- | --- | --- |
| retry-after | 2 | In seconds |
| x-ratelimit-limit-requests | 14400 | Always refers to Requests Per Day (RPD) |
| x-ratelimit-limit-tokens | 18000 | Always refers to Tokens Per Minute (TPM) |
| x-ratelimit-remaining-requests | 14370 | Always refers to Requests Per Day (RPD) |
| x-ratelimit-remaining-tokens | 17997 | Always refers to Tokens Per Minute (TPM) |
| x-ratelimit-reset-requests | 2m59.56s | Always refers to Requests Per Day (RPD) |
| x-ratelimit-reset-tokens | 7.66s | Always refers to Tokens Per Minute (TPM) |
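As a rough illustration of how a client might read the headers listed above, the following Python sketch turns them into numbers. The helper names `parse_duration` and `summarize_rate_limits` are hypothetical and not part of any SDK. The duration format (optional hours, minutes, seconds, or milliseconds, as in "2m59.56s") is an assumption inferred from the illustrative values, and the sketch assumes the header names are available as lowercase dictionary keys.

```python
import re

# Assumption: reset values look like the illustrative ones above
# ("2m59.56s", "7.66s"), i.e. a sequence of number+unit pairs.
_DURATION_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}


def parse_duration(value: str) -> float:
    """Convert a duration string such as '2m59.56s' into seconds."""
    parts = _DURATION_PART.findall(value)
    # Reject input that contains anything besides number+unit pairs.
    if not parts or "".join(num + unit for num, unit in parts) != value:
        raise ValueError(f"unrecognized duration: {value!r}")
    return sum(float(num) * _UNIT_SECONDS[unit] for num, unit in parts)


def summarize_rate_limits(headers: dict) -> dict:
    """Extract the x-ratelimit values from a dict of lowercase header names."""
    return {
        # Request headers refer to Requests Per Day (RPD).
        "requests_limit": int(headers["x-ratelimit-limit-requests"]),
        "requests_remaining": int(headers["x-ratelimit-remaining-requests"]),
        "requests_reset_s": parse_duration(headers["x-ratelimit-reset-requests"]),
        # Token headers refer to Tokens Per Minute (TPM).
        "tokens_limit": int(headers["x-ratelimit-limit-tokens"]),
        "tokens_remaining": int(headers["x-ratelimit-remaining-tokens"]),
        "tokens_reset_s": parse_duration(headers["x-ratelimit-reset-tokens"]),
    }
```

A client could call `summarize_rate_limits` on each response and slow down when `tokens_remaining` or `requests_remaining` approaches zero, rather than waiting for a rejected request.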

When the rate limit is reached, we return a 429 Too Many Requests HTTP status code.


Note: retry-after is set only when you hit the rate limit and a 429 status code is returned. The other headers are always included.
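One way a client might react to a 429 response is to wait for the number of seconds given in retry-after and then try again. The sketch below is a minimal illustration under stated assumptions, not an official client. `call_with_retry` and its parameters are hypothetical names, and `send` stands for any caller-supplied function that performs one request and returns the status code, a dict of lowercase header names, and the body.

```python
import time
from typing import Callable, Mapping, Tuple

# (status code, headers with lowercase names, response body)
Response = Tuple[int, Mapping[str, str], bytes]


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    fallback_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it stops returning 429 or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        # Return anything that is not a rate-limit rejection, and also
        # return the last 429 so the caller can see why the call failed.
        if status != 429 or attempt == max_attempts:
            return status, headers, body
        # retry-after is given in seconds and is only present on a 429.
        try:
            delay = float(headers.get("retry-after", fallback_delay))
        except ValueError:
            delay = fallback_delay  # assumption: fall back if unparsable
        sleep(delay)
    raise AssertionError("unreachable")
```

Passing `send` in as a function keeps the retry logic independent of any particular HTTP library. The `sleep` parameter exists only so the waiting can be replaced in tests.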