Groq

Canopy Labs Orpheus V1 English

Preview
canopylabs/orpheus-v1-english
INPUT: Text
OUTPUT: Audio
CAPABILITIES

Orpheus V1 English is an expressive text-to-speech model developed by Canopy Labs. It generates high-quality audio with low-latency inference, offers multiple voices, and accepts bracketed vocal directions such as [cheerful] or [whisper] to control how the text is spoken.


PRICING

Per Million Characters: $22.00 (about 45,454 characters per $1)
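
The per-character arithmetic behind these figures can be sketched as follows. This is a minimal illustration that uses only the $22.00 rate listed above; the function names are hypothetical, and it assumes every character of the input, including bracketed directions, is billed, which this page does not confirm.

```python
# Rate taken from the pricing section above.
PRICE_PER_MILLION_CHARS_USD = 22.00


def estimate_cost_usd(num_chars: int) -> float:
    """Estimated cost in USD for synthesizing num_chars characters."""
    return num_chars * PRICE_PER_MILLION_CHARS_USD / 1_000_000


def chars_per_dollar() -> int:
    """Whole number of characters that $1 buys at the listed rate."""
    return int(1_000_000 / PRICE_PER_MILLION_CHARS_USD)


if __name__ == "__main__":
    # A single request at the 200-character limit.
    print(f"{estimate_cost_usd(200):.4f} USD")
    print(chars_per_dollar(), "characters per dollar")
```

At this rate, one request at the 200-character limit costs well under one cent, so per-request cost matters mainly at high volume.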

QUANTIZATION

This model uses Groq's TruePoint Numerics, which reduces precision only in areas that don't affect accuracy, preserving quality while delivering significant speedup over traditional approaches.

Key Technical Specifications

Model Architecture

Orpheus V1 English is built on an architecture optimized for expressive speech synthesis. The model supports vocal direction controls through bracketed text, enabling everything from subtle conversational nuances to highly expressive character performances. It offers six professionally trained voice personas, each with different strengths when performing expressive directions.

Vocal Directions Support

The model uniquely supports vocal directions for expressive control:
  • Use bracketed text like [cheerful], [whisper], or [dramatic] to control speech style
  • More directions create more expressive, acted performances
  • Fewer or no directions produce natural, conversational cadence
  • Supports 1-2 word directions (typically adjectives or adverbs)
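
As a rough illustration of the rules above, the helper below prepends a bracketed direction to a line of text and rejects directions longer than two words. The function name is hypothetical, and placing the direction at the start of the text is an assumption; this page does not state where in the input a direction should appear.

```python
from typing import Optional


def with_direction(text: str, direction: Optional[str] = None) -> str:
    """Return text prefixed with a bracketed vocal direction.

    With no direction, the text is returned unchanged, which the page
    describes as producing natural, conversational cadence.
    """
    if not direction:
        return text
    words = direction.split()
    # The page recommends 1-2 word directions (adjectives or adverbs).
    if not 1 <= len(words) <= 2:
        raise ValueError("directions should be 1-2 words")
    # Assumption: the direction goes in front of the text it applies to.
    return f"[{' '.join(words)}] {text}"


if __name__ == "__main__":
    print(with_direction("Welcome back!", "cheerful"))
    print(with_direction("You should not have come here.", "menacing whisper"))
    print(with_direction("Your order has shipped."))
```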

Use Cases

Customer Support & AI Assistants
Use with no directions for natural, conversational interactions that feel human and approachable. Perfect for customer service bots, virtual assistants, and FAQ systems where authenticity matters.
Game Characters & Interactive Media
Leverage expressive directions to create memorable, dynamic character performances. Add bracketed directions like [menacing whisper] or [excited] for engaging game dialogue and interactive storytelling.
Professional Narration & Business Content
Use subtle directions like [professionally] or [authoritatively] for polished, authoritative delivery in corporate videos, e-learning content, and business presentations.
Content Creation & Entertainment
Combine multiple directions for engaging, varied performances in podcasts, audiobooks, YouTube content, and storytelling. Create everything from subtle nuances to highly expressive narrative performances.

Best Practices

  • For natural conversations (customer support, AI assistants), omit directions entirely to get conversational, human-like cadence.
  • Use 1-2 word directions (adjectives or adverbs) for best results - examples: [cheerful], [whisper], [professionally], [dramatically].
  • Experiment with removing punctuation to give the model more freedom in choosing intonation patterns, especially for expressive performances.
  • Test different voices for your use case; some voices perform better with expressive directions than others, particularly for complex emotional ranges.
  • Keep input text under 200 characters per request.
  • Use hyphens (2-0-3) to have numbers read digit by digit; plain numbers like 203 are normalized to 'two hundred and three'.
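
The last two practices lend themselves to small preprocessing helpers. The sketch below is one possible approach, not part of any official tooling: `spell_digits` and `split_text` are hypothetical names, the hyphenation is applied to every run of digits (which may be unwanted for years, prices, or decimals), and the splitter treats 200 as an inclusive limit, so pass a lower `limit` if a request of exactly 200 characters turns out to be rejected.

```python
import re
from typing import List


def spell_digits(text: str) -> str:
    """Hyphenate every run of digits, e.g. '203' becomes '2-0-3'."""
    return re.sub(r"\d+", lambda m: "-".join(m.group()), text)


def split_text(text: str, limit: int = 200) -> List[str]:
    """Greedily pack whole words into chunks of at most `limit` characters."""
    chunks: List[str] = []
    current = ""
    for word in text.split():
        if len(word) > limit:
            raise ValueError("a single word exceeds the character limit")
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this word would overflow, so close the current chunk.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    print(spell_digits("Please go to room 203."))
    for chunk in split_text("word " * 100):
        print(len(chunk), chunk[:30])
```

Because the splitter breaks on word boundaries rather than sentence boundaries, and does not repeat a leading direction on later chunks, long passages may sound better if they are split by sentence first.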

Quick Start

To get started with Orpheus V1 English, please visit our Orpheus text-to-speech documentation page for detailed usage examples, vocal direction guides, and code samples.
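
For orientation only, a request to the model might be assembled as in the sketch below. Apart from the model ID, nothing here comes from this page: the endpoint URL, the JSON field names, and the "wav" response format are assumptions based on Groq's OpenAI-compatible API and should be checked against the Orpheus documentation. The voice name is left as a placeholder because this page does not list the available voices, and both function names are hypothetical.

```python
import json
import os
import urllib.request

# Assumed OpenAI-compatible speech endpoint; verify against the Orpheus docs.
SPEECH_URL = "https://api.groq.com/openai/v1/audio/speech"
MODEL_ID = "canopylabs/orpheus-v1-english"  # model ID shown on this page


def build_speech_request(text: str, voice: str, api_key: str) -> urllib.request.Request:
    """Build, but do not send, a POST request for one synthesis call."""
    payload = {
        "model": MODEL_ID,
        "input": text,             # may include directions such as [cheerful]
        "voice": voice,            # voice names are not listed on this page
        "response_format": "wav",  # assumed field name and value
    }
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(request: urllib.request.Request, path: str) -> None:
    """Send the request and write the returned audio bytes to path."""
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())


if __name__ == "__main__":
    # GROQ_API_KEY and VOICE_NAME are placeholders to replace with real values.
    api_key = os.environ.get("GROQ_API_KEY", "")
    request = build_speech_request("[cheerful] Welcome back!", "VOICE_NAME", api_key)
    if api_key:
        synthesize_to_file(request, "speech.wav")
```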
