Canopy Labs Orpheus V1 English

Preview

canopylabs/orpheus-v1-english


INPUT: Text

OUTPUT: Audio

CAPABILITIES: Text to Speech

Canopy Labs

Orpheus V1 English is an expressive text-to-speech model developed by Canopy Labs that generates fast, high-quality audio with unique vocal direction controls. This model offers multiple voices and low-latency inference, with support for bracketed directions like [cheerful] or [whisper] to control how the model speaks.

PRICING

Per Million Characters: $22.00 (about 45,454 characters per $1)
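The pricing arithmetic can be checked with a short sketch. It assumes billing is strictly proportional to character count at the listed rate, with no minimums or rounding rules, which this page does not specify. The function names are illustrative and not part of any SDK.

```python
from decimal import Decimal

# Listed rate from the pricing section: USD per one million input characters.
PRICE_PER_MILLION_CHARS = Decimal("22.00")


def estimate_cost_usd(num_chars: int) -> Decimal:
    """Estimate the cost in USD of synthesizing num_chars characters.

    Assumes purely linear billing (an assumption, not a documented rule).
    """
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    return PRICE_PER_MILLION_CHARS * num_chars / Decimal(1_000_000)


def chars_per_dollar() -> int:
    """Whole number of characters that one dollar buys at the listed rate."""
    return int(Decimal(1_000_000) / PRICE_PER_MILLION_CHARS)


if __name__ == "__main__":
    print(chars_per_dollar())       # characters per $1
    print(estimate_cost_usd(200))   # cost of one 200-character request
```

Decimal is used instead of float so that small per-request amounts do not pick up binary rounding noise when summed over many requests.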

QUANTIZATION

This model uses Groq's TruePoint Numerics, which reduces precision only in areas that don't affect accuracy, preserving quality while delivering a significant speedup over traditional approaches.

Key Technical Specifications

Model Architecture

Orpheus V1 English is built on an architecture optimized for expressive speech synthesis. The model supports vocal direction controls through bracketed text, enabling everything from subtle conversational nuances to highly expressive character performances. It features six professionally trained voice personas, each with different strengths when performing expressive directions.

Vocal Directions Support

The model uniquely supports vocal directions for expressive control:

Use bracketed text like [cheerful], [whisper], or [dramatic] to control speech style
More directions create more expressive, acted performances
Fewer or no directions produce natural, conversational cadence
Supports 1-2 word directions (typically adjectives or adverbs)
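The rules above can be sketched as a small helper that prepares input text. Two things here are assumptions rather than documented behavior: that the direction is placed immediately before the text it applies to, and that a direction consists only of alphabetic words. The helper name with_direction is hypothetical.

```python
import re
from typing import Optional

# Assumption: a direction is one or two alphabetic words, e.g. "cheerful"
# or "menacing whisper". The model itself may accept a wider range.
_DIRECTION_RE = re.compile(r"[A-Za-z]+(?: [A-Za-z]+)?")


def with_direction(text: str, direction: Optional[str] = None) -> str:
    """Prefix text with a bracketed vocal direction.

    With no direction the text is returned unchanged, which matches the
    guidance to omit directions for natural, conversational cadence.
    """
    if direction is None:
        return text
    # Collapse repeated whitespace and lowercase the direction.
    cleaned = " ".join(direction.split()).lower()
    if not _DIRECTION_RE.fullmatch(cleaned):
        raise ValueError("direction should be 1-2 words, got: %r" % direction)
    return "[%s] %s" % (cleaned, text)


if __name__ == "__main__":
    print(with_direction("Welcome back!", "cheerful"))
    print(with_direction("How can I help you today?"))
```

Validating the word count up front keeps longer free-form instructions, which the guidance above does not cover, out of the request text.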

Use Cases

Customer Support & AI Assistants

Use with no directions for natural, conversational interactions that feel human and approachable. Perfect for customer service bots, virtual assistants, and FAQ systems where authenticity matters.

Game Characters & Interactive Media

Leverage expressive directions to create memorable, dynamic character performances. Add bracketed directions like [menacing whisper] or [excited] for engaging game dialogue and interactive storytelling.

Professional Narration & Business Content

Use subtle directions like [professionally] or [authoritatively] for polished, authoritative delivery in corporate videos, e-learning content, and business presentations.

Content Creation & Entertainment

Combine multiple directions for engaging, varied performances in podcasts, audiobooks, YouTube content, and storytelling. Create everything from subtle nuances to highly expressive narrative performances.

Best Practices

For natural conversations (customer support, AI assistants), omit directions entirely to get conversational, human-like cadence.
Use 1-2 word directions (adjectives or adverbs) for best results - examples: [cheerful], [whisper], [professionally], [dramatically].
Experiment with removing punctuation to give the model more freedom in choosing intonation patterns, especially for expressive performances.
Test different voices for your use case; some voices perform better with expressive directions than others, particularly for complex emotional ranges.
Keep input text under 200 characters per request.
Use hyphens (2-0-3) to have numbers read digit by digit; plain numbers like 203 are normalized to 'two hundred and three'.
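The last two practices lend themselves to preprocessing. The sketch below is one possible approach, not an official utility: spell_digits rewrites digit runs in the hyphenated form, and split_for_tts greedily packs words into chunks shorter than 200 characters while keeping bracketed directions intact. Both function names are hypothetical, and splitting on whitespace rather than sentence boundaries is a simplification.

```python
import re
from typing import List

MAX_CHARS = 200  # per-request limit from the best practices above


def spell_digits(text: str) -> str:
    """Rewrite each run of two or more digits as hyphen-separated digits.

    "203" becomes "2-0-3". Decimals, dates, and similar formats are not
    treated specially, so apply this only where digit-by-digit reading
    is wanted.
    """
    return re.sub(r"\b\d{2,}\b", lambda m: "-".join(m.group()), text)


def split_for_tts(text: str, limit: int = MAX_CHARS) -> List[str]:
    """Greedily pack tokens into chunks strictly shorter than limit.

    A token is either a whole bracketed direction such as
    "[menacing whisper]" or a run of non-whitespace characters, so a
    direction is never split across two chunks.
    """
    tokens = re.findall(r"\[[^\]]+\]|\S+", text)
    chunks: List[str] = []
    current = ""
    for token in tokens:
        if len(token) >= limit:
            raise ValueError("single token does not fit in a chunk: %r" % token)
        candidate = current + " " + token if current else token
        if len(candidate) < limit:
            current = candidate
        else:
            chunks.append(current)
            current = token
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    script = spell_digits("Please go to room 203 and ask for extension 4410.")
    for chunk in split_for_tts(script):
        print(len(chunk), chunk)
```

Spelling out digits adds characters, so it runs before chunking. A direction presumably only influences the request it is sent in, so a long passage split into several chunks may need its direction repeated on each chunk.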

Quick Start

To get started with Orpheus V1 English, please visit our Orpheus text-to-speech documentation page for detailed usage examples, vocal direction guides, and code samples.
