Add ElevenLabs Low Latency Voice Assistant Integration #261

zealoushacker · 2025-11-01T04:03:32Z

Pull Request

Description

This PR adds a comprehensive cookbook demonstrating how to build a low-latency voice assistant by integrating ElevenLabs' speech processing capabilities with Claude's conversational AI. The cookbook progressively optimizes for real-time performance, teaching developers how to minimize latency through various streaming techniques.

What this cookbook demonstrates:

Integration of ElevenLabs speech-to-text and text-to-speech APIs with Claude
Step-by-step optimization techniques to reduce latency in voice applications
Comparison of different streaming approaches (HTTP streaming vs WebSocket streaming)
Production-ready implementation of a continuous conversational voice assistant

Type of Change

New cookbook
Bug fix (fixes an issue in existing cookbook)
Documentation update
Code quality improvement (refactoring, optimization)
Dependency update
Other (please describe):

Cookbook Checklist (if applicable)

Cookbook has a clear, descriptive title
Includes a problem statement or use case description
Code is well-commented and easy to follow
Includes expected outputs or results

Testing

I have tested this cookbook/change locally
All cells execute without errors

Additional Context

This cookbook includes two main components:

Interactive Notebook (low_latency_stt_claude_tts.ipynb) - A tutorial-style notebook that walks through building a voice assistant step-by-step, demonstrating various optimization techniques with performance metrics at each stage.
Production WebSocket Script (stream_voice_assistant_websocket.py) - A fully functional voice assistant using WebSocket streaming for minimal latency, featuring continuous microphone input and gapless audio playback.

The cookbook is particularly valuable for developers building real-time voice applications who need to understand the tradeoffs between different streaming approaches and how to optimize for latency.

Key features:

Compatible with ElevenLabs free tier
Includes comprehensive setup instructions and environment configuration
Well-documented code with security best practices (environment variables, input validation)
Performance comparisons between different implementation approaches

- Interactive voice assistant with enter-to-start/stop recording - ElevenLabs speech-to-text using Scribe model - Claude Haiku 4.5 for intelligent responses - WebSocket streaming TTS for minimal latency - Comprehensive notebook demonstrating optimization techniques - API key validation and placeholder setup 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Security improvements: - Replace hardcoded API keys with environment variables - Add .env.example template with setup instructions - Add python-dotenv dependency for environment management Code quality improvements: - Add missing docstring to on_close function - Extract magic numbers to named constants in AudioQueue class - Make voice ID dynamically fetched from available voices - Make TTS model and output format configurable constants 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Use dynamically selected VOICE_ID variable instead of hardcoded voice ID in the "Generate Input Audio" section for consistency. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Remove redundant resource links and streamline documentation references to focus on main website and API overview. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

- Change sounddevice requirement from >=0.5.2 to >=0.5.1 to fix installation issues (version 0.5.2 doesn't exist) - Update sentence-by-sentence streaming cell to use mp3_44100_128 format instead of pcm_44100 (free tier compatible) - Add pip upgrade cell to notebook for better package management - Clean up notebook cell execution outputs Co-Authored-By: ashprabaker <[email protected]>

Added a detailed "How to Use This Cookbook" section that guides users through: - Step 1: Environment setup with API keys and dependencies - Step 2: Working through the notebook to learn concepts - Step 3: Running the production script for hands-on experience Also expanded the "More About ElevenLabs" section with additional resources including Voice Library, API Playground, and SDK links. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

The script was using pcm_44100 format which requires ElevenLabs Pro tier, causing WebSocket connections to close with error 1008. Fixed by: - Changed TTS_OUTPUT_FORMAT from pcm_44100 to mp3_44100_128 (free tier) - Added pydub dependency for MP3 decoding - Updated AudioQueue.add() to decode MP3 chunks before playback - Enhanced WebSocket close handler to log error details - Updated docstring to reflect MP3 format usage The script now works with free tier ElevenLabs accounts. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

github-actions · 2025-11-01T04:04:03Z

Summary

Status	Count
🔍 Total	2
✅ Successful	0
⏳ Timeouts	0
🔀 Redirected	0
👻 Excluded	1
❓ Unknown	0
🚫 Errors	1
⛔ Unsupported	0

Errors per input

Errors in temp_md/low_latency_stt_claude_tts.md

[200] https://elevenlabs.io/app/developers/api-keys | Rejected status code (this depends on your "accept" configuration): OK
Full Github Actions output

github-actions · 2025-11-01T04:05:07Z

Model Check Results ✅

I've reviewed the Claude model usage in the changed files for this PR.

Files Reviewed

third_party/ElevenLabs/README.md
third_party/ElevenLabs/low_latency_stt_claude_tts.ipynb
third_party/ElevenLabs/stream_voice_assistant_websocket.py

Model References Found

All files use: claude-haiku-4-5

Analysis

Status: ✅ All model references are valid and follow best practices

The code uses claude-haiku-4-5, which is the official API alias for Claude Haiku 4.5 (full model ID: claude-haiku-4-5-20251001). This is the recommended approach as:

✅ It's a current public model from the latest generation
✅ It's not deprecated
✅ It uses the generation alias format which automatically points to the latest version
✅ This provides better maintainability without needing to update date-stamped model IDs

References

Validated against the current model list at: https://docs.claude.com/en/docs/about-claude/models/overview.md

No changes needed! 🎉

zealoushacker · 2025-11-01T04:09:42Z

@Adriaan-ANT thanks for an awesome cookbook!

Adriaan-ANT

Thank you @zealoushacker!

Adriaan-ANT and others added 9 commits October 21, 2025 06:11

Clear notebook outputs

2532bbf

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Fix hardcoded voice ID in notebook

269cdc2

Use dynamically selected VOICE_ID variable instead of hardcoded voice ID in the "Generate Input Audio" section for consistency. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Simplify ElevenLabs resources section in README

a3c8b60

Remove redundant resource links and streamline documentation references to focus on main website and API overview. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Added notebook outputs

718581e

zealoushacker requested review from Adriaan-ANT and PedramNavid November 1, 2025 04:05

Adriaan-ANT approved these changes Nov 1, 2025

View reviewed changes

Adriaan-ANT merged commit 279a36e into main Nov 1, 2025
6 checks passed

Adriaan-ANT mentioned this pull request Nov 1, 2025

Revert "Add ElevenLabs Low Latency Voice Assistant Integration" #262

Closed

zealoushacker mentioned this pull request Nov 1, 2025

Update ElevenLabs Voice Assistant: Improve documentation and error handling #263

Merged

12 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add ElevenLabs Low Latency Voice Assistant Integration #261

Add ElevenLabs Low Latency Voice Assistant Integration #261

Uh oh!

zealoushacker commented Nov 1, 2025

Uh oh!

github-actions bot commented Nov 1, 2025

Uh oh!

github-actions bot commented Nov 1, 2025

Uh oh!

zealoushacker commented Nov 1, 2025

Uh oh!

Adriaan-ANT left a comment •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Add ElevenLabs Low Latency Voice Assistant Integration #261

Add ElevenLabs Low Latency Voice Assistant Integration #261

Uh oh!

Conversation

zealoushacker commented Nov 1, 2025

Pull Request

Description

Type of Change

Cookbook Checklist (if applicable)

Testing

Additional Context

Uh oh!

github-actions bot commented Nov 1, 2025

Summary

Errors per input

Errors in temp_md/low_latency_stt_claude_tts.md

Uh oh!

github-actions bot commented Nov 1, 2025

Model Check Results ✅

Files Reviewed

Model References Found

Analysis

References

Uh oh!

zealoushacker commented Nov 1, 2025

Uh oh!

Adriaan-ANT left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Adriaan-ANT left a comment •

edited

Loading