The Performance tier provides prioritized capacity for low, consistent latency. It includes a 99.9% availability SLA and a 99% latency guarantee aligned to your enterprise agreement.
Enterprise only: The Performance tier is only available on enterprise plans. Reach out to our enterprise team to get access.
service_tier=auto in order to use your performance tier limits if they are avaiable, and burst into on_demand. This is a great way to balance perforamcne and costs.| Model ID | Name |
|---|---|
openai/gpt-oss-120b | GPT-OSS 120B |
openai/gpt-oss-20b | GPT-OSS 20B |
llama-3.3-70b-versatile | Llama 3.3 70B Versatile |
Note: If you want automatic context-length handling and tier mapping, use service_tier=auto and we'll route requests to appropriate tiers behind the scenes based on your limits and context size.