Text to Speech

Learn how to instantly generate lifelike audio from text.

Overview

The Groq API speech endpoint provides fast text-to-speech (TTS), converting text to spoken audio in seconds. With support for English and Arabic voices, you can create lifelike audio content for customer support agents, game characters, narration, and more.

API Endpoint

Endpoint   Usage                   API Endpoint
Speech     Convert text to audio   https://api.groq.com/openai/v1/audio/speech
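
For readers who want to call the endpoint without the SDK, the request can be sketched with the Python standard library. This is a minimal sketch, not an official client: the helper names `build_speech_request` and `send_and_save` and the `User-Agent` value are hypothetical, and it assumes the endpoint accepts a JSON body with the same fields the SDK uses (`model`, `voice`, `input`, `response_format`) and a Bearer token in the `Authorization` header, as OpenAI-compatible endpoints typically do.

```python
import json
import urllib.request

SPEECH_URL = "https://api.groq.com/openai/v1/audio/speech"


def build_speech_request(api_key, model, voice, text, response_format="wav"):
    """Build (but do not send) a POST request for the speech endpoint."""
    payload = {
        "model": model,
        "voice": voice,
        "input": text,
        "response_format": response_format,
    }
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: Bearer-token auth, as on OpenAI-compatible APIs.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            # Hypothetical client identifier, chosen for this example only.
            "User-Agent": "groq-tts-example/0.1",
        },
        method="POST",
    )


def send_and_save(request, path):
    """Send the request (a real API call) and write the audio bytes to path."""
    with urllib.request.urlopen(request) as response:
        audio_bytes = response.read()
    with open(path, "wb") as audio_file:
        audio_file.write(audio_bytes)
    return path
```

Building the request separately from sending it makes the payload easy to inspect or log before any network call is made; calling `send_and_save(request, "speech.wav")` would then perform the actual request.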

Supported Models

Model ID                          Language         Description
canopylabs/orpheus-v1-english     English          Expressive TTS with vocal direction controls
canopylabs/orpheus-arabic-saudi   Arabic (Saudi)   Authentic Saudi dialect synthesis
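
If an application picks the model from a language setting, the table above can be mirrored as a small lookup. This is only a sketch: `SUPPORTED_MODELS` and `model_for_language` are hypothetical names, and the language labels are copied from the table, not returned by the API.

```python
# Model IDs and language labels as listed in the Supported Models table.
SUPPORTED_MODELS = {
    "canopylabs/orpheus-v1-english": "English",
    "canopylabs/orpheus-arabic-saudi": "Arabic (Saudi)",
}


def model_for_language(language):
    """Return the first model ID whose language label starts with `language`."""
    wanted = language.strip().lower()
    if not wanted:
        raise ValueError("language must not be empty")
    for model_id, label in SUPPORTED_MODELS.items():
        # Case-insensitive prefix match, so "arabic" matches "Arabic (Saudi)".
        if label.lower().startswith(wanted):
            return model_id
    raise ValueError(f"No supported TTS model for language: {language!r}")
```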

Quick Start

The speech endpoint takes four key inputs:

  • model: canopylabs/orpheus-v1-english or canopylabs/orpheus-arabic-saudi
  • input: the text to generate audio from
  • voice: the desired voice for output
  • response_format: the audio format of the output; defaults to "wav"

import os
from groq import Groq

# Read the API key from the environment instead of hard-coding it.
client = Groq(api_key=os.environ.get("GROQ_API_KEY"))

speech_file_path = "orpheus-english.wav"
model = "canopylabs/orpheus-v1-english"
voice = "troy"
# Bracketed tags such as [cheerful] are vocal directions for the English model.
text = "Welcome to Orpheus text-to-speech. [cheerful] This is an example of high-quality English audio generation with vocal directions support."
response_format = "wav"

# Request the synthesized audio.
response = client.audio.speech.create(
    model=model,
    voice=voice,
    input=text,
    response_format=response_format
)

# Save the returned audio to disk.
response.write_to_file(speech_file_path)
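
When the same call is made from several places, it can help to wrap it in a small function. The sketch below is one possible shape, assuming only the `client.audio.speech.create` and `write_to_file` calls shown above; the function name `synthesize_to_file`, its defaults, and the empty-input check are illustrative choices, not part of the Groq SDK.

```python
def synthesize_to_file(
    client,
    text,
    path,
    model="canopylabs/orpheus-v1-english",
    voice="troy",
    response_format="wav",
):
    """Generate speech for `text` and write it to `path`; return `path`."""
    # Illustrative guard: reject blank input before making an API call.
    if not text.strip():
        raise ValueError("input text must not be empty")
    response = client.audio.speech.create(
        model=model,
        voice=voice,
        input=text,
        response_format=response_format,
    )
    response.write_to_file(path)
    return path
```

With the `client` from the Quick Start, a call would look like `synthesize_to_file(client, "Hello there.", "hello.wav")`. Passing the client in as an argument keeps the function easy to test with a stand-in object.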

Next Steps

For comprehensive documentation on available voices, vocal directions, use cases, and best practices, see the Orpheus documentation:

Orpheus Text to Speech
Learn about vocal directions, available voices, use cases, and best practices for generating expressive speech
