LiveKit + Groq: Build End-to-End AI Voice Applications
LiveKit complements Groq's high-performance speech recognition with real-time communication infrastructure and text-to-speech integrations. Together they let you build end-to-end AI voice applications with:
- Complete Voice Pipeline: Combine Groq's fast and accurate speech-to-text (STT) with LiveKit's text-to-speech (TTS) capabilities
- Real-time Communication: Enable multi-user voice interactions with LiveKit's WebRTC infrastructure
- Flexible TTS Options: Access multiple text-to-speech voices and languages through LiveKit's TTS integrations
- Scalable Architecture: Handle thousands of concurrent users with LiveKit's distributed system
Quick Start (7 minutes to hello world)
Prerequisites
- Grab your Groq API Key
- Create a free LiveKit Cloud account
- Install the LiveKit CLI and authenticate it with your LiveKit Cloud account (one way to do this is shown below)
- Create a free ElevenLabs account and generate an API Key
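If you haven't set up the CLI yet, one way to install and authenticate it (assuming macOS with Homebrew; see the LiveKit CLI docs for other platforms):

```bash
# Install the LiveKit CLI, which provides the `lk` command
brew install livekit-cli

# Link the CLI to your LiveKit Cloud account
lk cloud auth
```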
1. Clone the starter template for our Python voice agent using your CLI:

```bash
lk app create --template voice-pipeline-agent-python
```

When prompted for your OpenAI and Deepgram API keys, press Enter to skip them, as we'll be using customized plugins for Groq and ElevenLabs for fast inference speed.
2. cd into your project directory and update the `.env.local` file, replacing `OPENAI_API_KEY` and `DEEPGRAM_API_KEY` with the following:

```bash
GROQ_API_KEY=<your-groq-api-key>
ELEVEN_API_KEY=<your-elevenlabs-api-key>
```
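For reference, the finished `.env.local` should end up looking roughly like this. The `LIVEKIT_*` values are generated by the CLI when it creates the app; the placeholders below are illustrative, not literal values:

```bash
LIVEKIT_URL=wss://<your-project>.livekit.cloud
LIVEKIT_API_KEY=<your-livekit-api-key>
LIVEKIT_API_SECRET=<your-livekit-api-secret>
GROQ_API_KEY=<your-groq-api-key>
ELEVEN_API_KEY=<your-elevenlabs-api-key>
```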
3. Update your `requirements.txt` file and add the following line:

```
livekit-plugins-elevenlabs>=0.7.9
```
4. Update your `agent.py` file with the following to configure Groq for STT with `whisper-large-v3`, Groq for LLM with `llama-3.3-70b-versatile`, and ElevenLabs for TTS:
```python
import logging

from dotenv import load_dotenv
from livekit.agents import (
    AutoSubscribe,
    JobContext,
    JobProcess,
    WorkerOptions,
    cli,
    llm,
)
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import silero, openai, elevenlabs

load_dotenv(dotenv_path=".env.local")
logger = logging.getLogger("voice-agent")


def prewarm(proc: JobProcess):
    # Load the Silero voice activity detection (VAD) model once per worker process
    proc.userdata["vad"] = silero.VAD.load()


async def entrypoint(ctx: JobContext):
    initial_ctx = llm.ChatContext().append(
        role="system",
        text=(
            "You are a voice assistant created by LiveKit. Your interface with users will be voice. "
            "You should use short and concise responses, and avoid the use of unpronounceable punctuation. "
            "You were created as a demo to showcase the capabilities of LiveKit's agents framework."
        ),
    )

    logger.info(f"connecting to room {ctx.room.name}")
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)

    # Wait for the first participant to connect
    participant = await ctx.wait_for_participant()
    logger.info(f"starting voice assistant for participant {participant.identity}")

    # Assemble the pipeline: Silero VAD, Groq STT, Groq LLM, ElevenLabs TTS
    agent = VoicePipelineAgent(
        vad=ctx.proc.userdata["vad"],
        stt=openai.STT.with_groq(model="whisper-large-v3"),
        llm=openai.LLM.with_groq(model="llama-3.3-70b-versatile"),
        tts=elevenlabs.TTS(),
        chat_ctx=initial_ctx,
    )

    agent.start(ctx.room, participant)

    # The agent should be polite and greet the user when it joins :)
    await agent.say("Hey, how can I help you today?", allow_interruptions=True)


if __name__ == "__main__":
    cli.run_app(
        WorkerOptions(
            entrypoint_fnc=entrypoint,
            prewarm_fnc=prewarm,
        ),
    )
```
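The `with_groq` helpers work because Groq's API is OpenAI-compatible: the `openai` plugin simply points at Groq's endpoint. If you want to trade a little accuracy for even lower latency, you can swap in a lighter STT model (a hedged variation, not part of the template; check Groq's model list for current availability):

```python
# Variation: a faster Whisper variant with an explicit language hint
stt = openai.STT.with_groq(model="whisper-large-v3-turbo", language="en")
```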
5. From your project directory, create a virtual environment, install the dependencies, and start your agent:

```bash
python3 -m venv venv
source venv/bin/activate
python3 -m pip install -r requirements.txt
python3 agent.py dev
```
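The `dev` subcommand runs the agent in development mode, connected to your LiveKit project. When you're ready to deploy, the same entrypoint can run in production mode via the agents CLI's `start` subcommand (a convention of the LiveKit agents framework; check its docs for your version):

```bash
python3 agent.py start
```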
6. Within your project directory, clone the voice assistant frontend Next.js app starter template using your CLI:

```bash
lk app create --template voice-assistant-frontend
```
7. cd into your frontend directory and launch your frontend application locally:

```bash
pnpm install
pnpm dev
```
8. Visit your application (http://localhost:3000/ by default), select Connect, and talk to your agent!
Challenge: Configure your voice assistant and the frontend to create a travel agent that will help plan trips!
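One possible starting point (an illustrative sketch, not part of the template) is to change the system prompt in `agent.py` so the assistant behaves like a travel planner:

```python
initial_ctx = llm.ChatContext().append(
    role="system",
    text=(
        "You are a friendly travel agent. Your interface with users will be voice. "
        "Ask about the destination, dates, and budget, then suggest a short itinerary. "
        "Keep responses concise and avoid unpronounceable punctuation."
    ),
)
```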
For more detailed documentation and resources, see: