Releases: livekit/agents
livekit-agents@1.3.5
What's Changed
- Improve IVR example README and add inline comments for clarifications by @toubatbrian in #4065
- show milliseconds in CLI by @tinalenguyen in #4080
- fix legacy api `ws_url` (WorkerOptions) by @theomonnom in #4090
- fix turn-detector loading issue due to transformers 4.57.2 by @longcw in #4084
- add openai prompt cache retention param by @tinalenguyen in #4089
- flush telemetry traces and logs when cleanup job task by @longcw in #4082
- livekit-agents 1.3.5 by @theomonnom in #4091
Full Changelog: https://github.com/livekit/agents/compare/[email protected]@1.3.5
livekit-agents@1.3.4
What's Changed
- fix `task_ids` is not defined by @theomonnom in #4025
- fix tests and type checking by @longcw in #4011
- fix contextvar when using text mode in console by @longcw in #3972
- allow turn detection mode to be updated within session by @longcw in #3816
- Inference: Allow provider specific parameter updates by @adrian-cowham in #3808
- Fix docstrings after #1811 Blingfire default tokenizer switch by @mrkowalski in #3812
- fix bithuman avatar getting local participant identity by @longcw in #4029
- Allow pause in final transcript by @chenghao-mou in #3995
- clear internal buffer of datastream io when interruption by @longcw in #4030
- Support for pronunciation dictionary in Cartesia TTS by @cateet in #4033
- Add OVHcloud AI Endpoints provider by @eliasto in #4037
- bring back `drain-timeout` on the CLI by @theomonnom in #4038
- feat(elevenlabs): add STTv2 with streaming support for Scribe v2 by @yorrick in #3909
- add JobContext.local_participant_identity by @longcw in #4031
- fix: ensure logger name is set even when custom scope is provided by @davidzhao in #4040
- chore: remove pyav <16 lock by @davidzhao in #4044
- add use_realtime to elevenlabs stt and support scribe v2 realtime model by @longcw in #4041
- Remove flags from RawFunctionDescription by @philipp-eisen in #4050
- Temp workaround for langfuse otel traces by @chenghao-mou in #3987
- fix cloud tracer overwrites user-defined tracer provider by @longcw in #4060
- Fix: Propagate ws_url in AgentServer.from_server_options by @kstonekuan in #4046
- make `ChatContext.summarize` private by @theomonnom in #4068
- add makefile by @chenghao-mou in #4067
- feat(openai): add verbosity parameter support to LLM.with_azure() by @IanSteno in #4070
- add dump signal handler and IPC message by @chenghao-mou in #4064
- fix: accurate speech duration in VAD EOS by @jayeshp19 in #4058
- add `chat_ctx` argument to `AgentSession.generate_reply` by @theomonnom in #4074
- add livekit credentials to environment by @tinalenguyen in #4075
- Changing audio format for rime from wav/mp3 to pcm by @gokuljs in #4073
- livekit-agents 1.3.4 by @theomonnom in #4077
New Contributors
- @eliasto made their first contribution in #4037
- @yorrick made their first contribution in #3909
- @philipp-eisen made their first contribution in #4050
- @kstonekuan made their first contribution in #4046
- @IanSteno made their first contribution in #4070
Full Changelog: https://github.com/livekit/agents/compare/[email protected]@1.3.4
[email protected]
New Features
Observability
To learn more about the new observability features, check out our full write-up on the LiveKit blog. It walks through how session playback, trace inspection, and synchronized logs streamline debugging for voice agents.
New CLI
The CLI has been redesigned, and a new text-only mode was added so you can test your agent without using voice.
python3 my_agent.py console --text
You can also now configure both the input device and output device directly through the provided parameters.
python3 my_agent.py console --input-device "AirPods" --output-device "MacBook"
New AgentServer API
We've renamed Worker to AgentServer, and you now need to use a decorator to define the entrypoint. All existing functionality remains backward compatible. This change lays the groundwork for upcoming design improvements and new features.
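from livekit.agents import AgentServer, JobContext, JobProcess

server = AgentServer()

def prewarm(proc: JobProcess): ...
def load(proc: JobProcess): ...

# setup_fnc and load_fnc are assigned directly rather than registered with decorators
server.setup_fnc = prewarm
server.load_fnc = load

@server.rtc_session(agent_name="my_customer_service_agent")
async def entrypoint(ctx: JobContext): ...

Session Report & on_session_end callback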
Use the on_session_end callback to generate a structured SessionReport that contains the conversation history, events, recording metadata, and the agent's configuration.
import json

server = AgentServer()

async def on_session_end(ctx: JobContext) -> None:
    report = ctx.make_session_report()
    print(json.dumps(report.to_dict(), indent=2))

    chat_history = report.chat_history
    # Do post-processing on your session (e.g. final evaluations, generating a summary, ...)

@server.rtc_session(on_session_end=on_session_end)
async def my_agent(ctx: JobContext) -> None:
    ...

AgentHandoff item
To capture everything that occurred during your session, we added an AgentHandoff item to the ChatContext.
class AgentHandoff(BaseModel):
    ...
    old_agent_id: str | None
    new_agent_id: str

Improved turn detection model
We updated the turn-detection model, resulting in measurable accuracy improvements across most languages. The table below shows the change in [email protected] between versions 0.4.0 and 0.4.1, along with the percentage difference.
This new version also handles special user inputs such as email addresses, street addresses, and phone numbers much more effectively.
TaskGroup
We added TaskGroup, which lets you run multiple tasks concurrently and wait for all of them to finish. This is useful when collecting several pieces of information from a user where the order doesn't matter, or when the user may revise earlier inputs while continuing the flow.
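As a rough illustration of the "run several tasks, then wait for all of them" idea, here is a minimal sketch built only on the standard library. It is not LiveKit's TaskGroup implementation: the coroutines `get_email` and `get_phone_number`, the helper `run_all`, and the returned values are hypothetical stand-ins chosen for this example.

```python
import asyncio
from typing import Awaitable, Callable, Dict

# Hypothetical stand-ins for real tasks; each returns the value it "collected".
async def get_email() -> str:
    await asyncio.sleep(0.01)
    return "email-value"

async def get_phone_number() -> str:
    await asyncio.sleep(0.01)
    return "phone-value"

async def run_all(factories: Dict[str, Callable[[], Awaitable[str]]]) -> Dict[str, str]:
    """Start every task, wait for all of them, and key the results by task id."""
    ids = list(factories)
    values = await asyncio.gather(*(factories[task_id]() for task_id in ids))
    return dict(zip(ids, values))

results = asyncio.run(
    run_all({"get_email_task": get_email, "phone_number_task": get_phone_number})
)
print(results)
```

Keying results by an id, as the sketch does, makes it easy to look up one answer regardless of the order in which the tasks finished.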
We've also added an example that uses TaskGroup to build a SurveyAgent, which you can use as a reference.
task_group = TaskGroup()
task_group.add(lambda: GetEmailTask(), id="get_email_task", description="Get the email address")
task_group.add(lambda: GetPhoneNumberTask(), id="phone_number_task", description="Get the phone number")
task_group.add(lambda: GetCreditCardTask(), id="credit_card_task", description="Get credit card")
results = await task_group

IVR systems
Agents can now optionally handle IVR-style interactions. Enabling ivr_detection allows the session to identify and respond appropriately to IVR tones or patterns, and min_endpointing_delay lets you control how long the system waits before ending a turn, which is useful for menu-style inputs.
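To make the effect of a longer endpointing delay concrete, the following sketch groups speech segments into turns with a simple silence threshold. It is a simplified model, not LiveKit's turn detection: the function `group_into_turns`, the sample timestamps, and the assumption that the delay is expressed in seconds are all illustrative.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) of one stretch of speech, in seconds

def group_into_turns(segments: List[Segment], min_endpointing_delay: float) -> List[List[Segment]]:
    """Close a turn only when the silence after it lasts at least the delay."""
    turns: List[List[Segment]] = []
    for start, end in segments:
        previous_end = turns[-1][-1][1] if turns else None
        if previous_end is not None and start - previous_end < min_endpointing_delay:
            turns[-1].append((start, end))  # pause too short: same turn continues
        else:
            turns.append([(start, end)])    # long enough silence: new turn
    return turns

# Hypothetical menu prompt spoken with pauses between options.
speech = [(0.0, 0.4), (2.0, 2.3), (4.5, 4.9), (12.0, 12.5)]
short_delay_turns = group_into_turns(speech, 0.5)
long_delay_turns = group_into_turns(speech, 5.0)
print(len(short_delay_turns), len(long_delay_turns))
```

With the short delay every pause ends the turn, while the 5-second delay keeps the first three segments together, which is the behavior you generally want when a menu is read out with gaps between options. The session configuration from the release notes follows.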
session = AgentSession(
    ivr_detection=True,
    min_endpointing_delay=5,
)

llm_node FlushSentinel
We added a FlushSentinel marker that can be yielded from llm_node to flush partial LLM output to TTS and start a new TTS stream. This lets you emit a short, early response (for example, when a specific tool call is detected) while the main LLM response continues in the background. For a concrete pattern, see the flush_llm_node.py example.
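The general pattern, a marker object in a stream that tells the consumer to close the current segment and start a new one, can be sketched without LiveKit. In the example below `FlushMarker`, `fake_llm_node`, and `split_on_flush` are hypothetical stand-ins; the real consumer is the TTS pipeline, whose internals are not described here.

```python
import asyncio
from typing import AsyncIterator, List, Union

class FlushMarker:
    """Hypothetical stand-in for a flush sentinel; it carries no data."""

async def fake_llm_node() -> AsyncIterator[Union[str, FlushMarker]]:
    yield "This is the first sentence"
    yield FlushMarker()          # end the first segment here
    yield "Another "
    yield "TTS generation"

async def split_on_flush(stream: AsyncIterator[Union[str, FlushMarker]]) -> List[str]:
    """Join text chunks into segments, starting a new segment at each marker."""
    segments: List[str] = []
    current: List[str] = []
    async for item in stream:
        if isinstance(item, FlushMarker):
            if current:
                segments.append("".join(current))
            current = []
        else:
            current.append(item)
    if current:
        segments.append("".join(current))
    return segments

segments = asyncio.run(split_on_flush(fake_llm_node()))
print(segments)  # ['This is the first sentence', 'Another TTS generation']
```

The llm_node signature from the release notes, which yields the real FlushSentinel, follows.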
async def llm_node(self, chat_ctx: llm.ChatContext, tools: list[llm.FunctionTool], model_settings: ModelSettings) -> AsyncIterable[llm.ChatChunk | FlushSentinel]:
    yield "This is the first sentence"
    yield FlushSentinel()
    yield "Another TTS generation"

Changes
asyncio-debug
The --asyncio-debug argument was removed; use the PYTHONASYNCIODEBUG environment variable instead.
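PYTHONASYNCIODEBUG is a standard CPython setting, so its effect can be checked independently of LiveKit. The sketch below starts a child interpreter with and without the variable and reports whether a new event loop comes up in debug mode; the helper name `debug_enabled` is made up for this example.

```python
import os
import subprocess
import sys

# Child program: create an event loop and print whether debug mode is on.
CHECK = "import asyncio; loop = asyncio.new_event_loop(); print(loop.get_debug()); loop.close()"

def debug_enabled(env_value):
    """Run a child interpreter and report whether asyncio debug mode is active."""
    env = dict(os.environ)
    # Drop settings that would turn debug mode on regardless of our variable.
    env.pop("PYTHONASYNCIODEBUG", None)
    env.pop("PYTHONDEVMODE", None)
    if env_value is not None:
        env["PYTHONASYNCIODEBUG"] = env_value
    out = subprocess.run(
        [sys.executable, "-c", CHECK],
        env=env, capture_output=True, text=True, check=True,
    )
    return out.stdout.strip() == "True"

with_variable = debug_enabled("1")
without_variable = debug_enabled(None)
print(with_variable, without_variable)
```

In practice you would simply export the variable in the shell before launching your agent script.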
What's Changed
- feat: new CLI & new AgentServer API by @theomonnom in #3199
- remove unused code & fix ServerEnvOption by @theomonnom in #3220
- remove custom excepthook by @theomonnom in #3221
- fix python 3.9 by @theomonnom in #3222
- fix invalid `LogLevel` on the CLI by @theomonnom in #3292
- add `Agent.id` by @theomonnom in #3478
- add `AgentHandoff` chat item by @theomonnom in #3479
- Add `AgentHandoff` to the chat_ctx & AgentSessionReport by @theomonnom in #3541
- fix cli `readchar` by @theomonnom in #3542
- fix `RecorderIO` av.error.MemoryError by @theomonnom in #3543
- fix record & save to tempfile by @theomonnom in #3544
- save session json report when `--record` is enabled by @theomonnom in #3572
- brianyin/agt-1947-automatically-parse-dtmf-input-from-users by @toubatbrian in #3512
- ingest data to cloud by @theomonnom in #3609
- fix Audio/Video input source attach by @theomonnom in #3615
- Allow Recording Verbal DTMF Input when ask_confirmation is turned off by @toubatbrian in #3607
- Agent IVR System Example by @toubatbrian in #3610
- add `ChatContext.summarize` by @theomonnom in #3660
- Gather DTMF Minor Bug Fix by @toubatbrian in #3672
- brianyin/agt-2076-support-repeat-instruction-in-dtmf-gathering by @toubatbrian in #3674
- rename `assistant` to `agent` by @theomonnom in #3690
- TaskGroup by @tinalenguyen in #3680
- ignore on_enter on GetEmailTask by @theomonnom in #3691
- Refactor mock session utilities into a separate file by @toubatbrian in #3692
- fix _MetadataLogProcessor by @tinalenguyen in #3697
- add Created-At header for the audio recording by @theomonnom in #3698
- fix tool validation by @tinalenguyen in #3699
- use otel logger for the chat_history by @theomonnom in #3700
- Support Agent Session Tools by @toubatbrian in #3707
- add extra instructions + tools params into GetEmailTask by @tinalenguyen in #3711
- format transcript logs by @paulwe in #3708
- add participant attributes to traces by @theomonnom in #3725
- fix duplicate `agent_session` span by @theomonnom in #3726
- fix chat_history upload by @theomonnom in #3728
- rename `realtime_session` to `rtc_session` by @theomonnom in #3729
- add backward compatibility by @theomonnom in #3730
- add missing options attr to session start log by @paulwe in #3731
- brian/dtmf-send-tool by @toubatbrian in #3656
- log potential thread leaks preventing process from exiting by @theomonnom in #3744
- check room connection state + rename to on_emit + taskgroup fix by @tinalenguyen in #3738
- add survey agent example by @tinalenguyen in #3681
- update examples to use AgentServer by @tinalenguyen in #3767
- allow multiple ids for out of scope by @tinalenguyen in #3789
- add chat_history json to the report upload by @theomonnom in #3799
- set log timestamps for chat history by @paulwe in #3800
- check recorder_io in make_session_report by @ti...
livekit-agents@1.3.1
Note
A more detailed changelog will be available soon!
What's Changed
- feat: new CLI & new AgentServer API by @theomonnom in #3199
- remove unused code & fix ServerEnvOption by @theomonnom in #3220
- remove custom excepthook by @theomonnom in #3221
- fix python 3.9 by @theomonnom in #3222
- fix invalid `LogLevel` on the CLI by @theomonnom in #3292
- add `Agent.id` by @theomonnom in #3478
- add `AgentHandoff` chat item by @theomonnom in #3479
- Add `AgentHandoff` to the chat_ctx & AgentSessionReport by @theomonnom in #3541
- fix cli `readchar` by @theomonnom in #3542
- fix `RecorderIO` av.error.MemoryError by @theomonnom in #3543
- fix record & save to tempfile by @theomonnom in #3544
- save session json report when `--record` is enabled by @theomonnom in #3572
- brianyin/agt-1947-automatically-parse-dtmf-input-from-users by @toubatbrian in #3512
- ingest data to cloud by @theomonnom in #3609
- fix Audio/Video input source attach by @theomonnom in #3615
- Allow Recording Verbal DTMF Input when ask_confirmation is turned off by @toubatbrian in #3607
- Agent IVR System Example by @toubatbrian in #3610
- add `ChatContext.summarize` by @theomonnom in #3660
- Gather DTMF Minor Bug Fix by @toubatbrian in #3672
- brianyin/agt-2076-support-repeat-instruction-in-dtmf-gathering by @toubatbrian in #3674
- rename `assistant` to `agent` by @theomonnom in #3690
- TaskGroup by @tinalenguyen in #3680
- ignore on_enter on GetEmailTask by @theomonnom in #3691
- Refactor mock session utilities into a separate file by @toubatbrian in #3692
- fix _MetadataLogProcessor by @tinalenguyen in #3697
- add Created-At header for the audio recording by @theomonnom in #3698
- fix tool validation by @tinalenguyen in #3699
- use otel logger for the chat_history by @theomonnom in #3700
- Support Agent Session Tools by @toubatbrian in #3707
- add extra instructions + tools params into GetEmailTask by @tinalenguyen in #3711
- format transcript logs by @paulwe in #3708
- add participant attributes to traces by @theomonnom in #3725
- fix duplicate `agent_session` span by @theomonnom in #3726
- fix chat_history upload by @theomonnom in #3728
- rename `realtime_session` to `rtc_session` by @theomonnom in #3729
- add backward compatibility by @theomonnom in #3730
- add missing options attr to session start log by @paulwe in #3731
- brian/dtmf-send-tool by @toubatbrian in #3656
- log potential thread leaks preventing process from exiting by @theomonnom in #3744
- check room connection state + rename to on_emit + taskgroup fix by @tinalenguyen in #3738
- add survey agent example by @tinalenguyen in #3681
- update examples to use AgentServer by @tinalenguyen in #3767
- allow multiple ids for out of scope by @tinalenguyen in #3789
- add chat_history json to the report upload by @theomonnom in #3799
- set log timestamps for chat history by @paulwe in #3800
- check recorder_io in make_session_report by @tinalenguyen in #3805
- feat(cartesia): add LiveKit user agent to requests by @mi-yu in #3809
- Add Speechmatics TTS by @aaronng91 in #3754
- built-in GetAddressTask by @tinalenguyen in #3807
- fix extra instructions param and update confirm_address docstring by @tinalenguyen in #3810
- Add support for using a previous silero vad model file by @zaheerabbas-prodigal in #3779
- allow updating the same agent that is running to apply changes in agent by @longcw in #3814
- chore: fix ruff & formatting by @davidzhao in #3827
- fix type checking for agents 1.3 by @longcw in #3842
- fix: correct base64 data handling in image content conversion #3867 by @tarsyang in #3868
- fix observability by @davidzhao in #3828
- avoid rotating transcription synchronizer twice during detach and attach by @longcw in #3845
- fix pickling AgentServer for python 3.9 by @longcw in #3847
- add better word alignment for Cartesia by @chenghao-mou in #3876
- fix jupyter for agents 1.3 by @longcw in #3877
- feat(minimax): comprehensive TTS updates and parameter rename by @zhenyujia23-crypto in #3788
- feat(aws): add credentials customization for aws stt by @civilcoder55 in #3840
- make sure user away timer is cancelled when session closed by @longcw in #3895
- fix duplicate responses from gemini by @tinalenguyen in #3898
- support google safety settings by @tinalenguyen in #3815
- add audio_frame_size_ms for RoomInputOptions by @longcw in #3899
- add new room options by @longcw in #3417
- feat(tts): add sample rate option to TTS configuration for rime tts plugin arcana model by @gokuljs in #3910
- deepgram plugin: better websocket logs by @jjmaldonis in #3912
- Add download location in readme by @chenghao-mou in #3908
- chore: move LK env var checks later by @davidzhao in #3920
- fix ForkServerContext import by @theomonnom in #3924
- add timeout to datastream clear_buffer to avoid deadlock when missing playback finished event by @longcw in #3917
- observability cleanup by @davidzhao in #3929
- feature: GPT-5.1 support by @c0mpli in #3928
- record when the session was started by @davidzhao in #3930
- add <3.14 requirement temporarily by @chenghao-mou in #3921
- Allow tool role for dummy user message by @chenghao-mou in #3938
- add <3.14 requirement temporarily by @chenghao-mou in #3942
- skip CI checks for md changes by @chenghao-mou in #3939
- AGT-2200 Improve usage collector and metric logging with more details by @chenghao-mou in #3935
- feat(cartesia): debug log Cartesia request id on WS connection by @mi-yu in #3940
- Allow users to pick BVCTelephony at runtime by @bcherry in #3926
- don't use decorators for setup_fnc & load_fnc by @theomonnom in #3945
- expose `room_io` from the `AgentSession` by @theomonnom in #3946
- Added Support for gpt-5.1-chat-latest by @devb-enp in #3932
- turn-detection: use v0.4.1-intl by @lwestn in #3941
- optimizations for turn detector model size by @davidzhao in #3953
- feat: add AvatarTalk integration by @Maelstro in #3139
- remove noisy error/warn logs by @theomonnom in #3955
- release livekit-agents 1.3.1 by @theomonnom in #3957
New Contributors
- @mi-yu made their first contribution in #3809
- @aaronng91 made their first contribution in #3754
- @zaheerabbas-prodigal made their first contribution in #3779
- @tarsyang made their first contribution in #3868
- @chenghao-mou made their first contr...
livekit-agents@1.2.18
What's Changed
- chore: update to correct livekit-plugins-minimax-ai package by @davidzhao in #3753
- New Spitch STT model: Mansa v1 by @temibabs in #3748
- add missing file error for turn detection model by @tinalenguyen in #3755
- Add Language Parameter Support for Rime Arcana TTS Model by @gokuljs in #3757
- Update Letta voice API integration to use new endpoint by @cpfiffer in #3736
- livekit-blingfire v1.0.1 by @theomonnom in #3763
- fix blingfire windows compilation for python 3.14-freethreaded by @theomonnom in #3766
- allow elevenlabs language code parameter to be null by @tinalenguyen in #3761
- changes default base url for spelling by @aryeila in #3768
- Improve streaming handling for Neuphonic by @alexshelkov in #3703
- turn-detector: add turn detector v0.4.0-intl by @lwestn in #3764
- fix(google): improved handling of tool_response_scheduling by @davidzhao in #3781
- Update stt.py - Deepgram plugin allows PT for Keyterms by @cateet in #3786
- skip exception log for StopResponse from a tool call by @longcw in #3790
- restore root otel context for AgentTask and generate_reply in entrypoint by @longcw in #3772
- feat: add preflight transcript via utterance by @dan-ince-aai in #3654
- fix(google): do not pass in scheduling parameter by default by @davidzhao in #3793
- Add Fish Audio TTS Plugin for LiveKit Agents by @ywkim in #3720
- Shubhra/nvidia plugins by @Shubhrakanti in #3392
- Undo `basic_agent` accidental commit in #3392 by @Shubhrakanti in #3797
- fix: bithuman prewarm credential issue by @CathyL0 in #3794
- Added missing parameters to Gladia by @Karamouche in #3796
- livekit-agents 1.2.18 by @theomonnom in #3806
New Contributors
- @cpfiffer made their first contribution in #3736
- @aryeila made their first contribution in #3768
- @cateet made their first contribution in #3786
- @ywkim made their first contribution in #3720
- @Karamouche made their first contribution in #3796
Full Changelog: https://github.com/livekit/agents/compare/[email protected]@1.2.18
livekit-agents@1.2.17
What's Changed
- Fix/aws realtime tooluse barge in by @kachenjr in #3704
- reset user state from away when user input transcribed by @longcw in #3716
- set _read_audio_atask to None in init by @tinalenguyen in #3727
- add issue templates by @tinalenguyen in #3689
- feat: gemini session resumption handle by @aryanvdesh in #3735
- turn detection: model v0.3.1-intl by @lwestn in #3724
- fix room io audio output deadlock by @longcw in #3746
- added gemini live model by @tinalenguyen in #3750
- add openai gpt-5-chat-latest model by @tinalenguyen in #3741
New Contributors
- @kachenjr made their first contribution in #3704
- @aryanvdesh made their first contribution in #3735
Full Changelog: https://github.com/livekit/agents/compare/[email protected]@1.2.17
livekit-agents@1.2.16
What's Changed
- Conditional message truncation based on LLM capabilities by @hadamove-rapidsos in #3655
- AWS realtime: make ModelStreamErrorException recoverable by @longcw in #3662
- use inference gateway in basic_agent by @longcw in #3611
- chore: use self.chat_ctx in multi_agent example by @longcw in #3663
- fix(speechmatics): AdditionalVocab configuration update by @sam-s10s in #3676
- chore(cartesia): default to the latest API version by @davidzhao in #3652
- Update Soniox STT parameters by @matejmarinko-soniox in #3670
- Improve gcp vertex credential check by @ChenghaoMou in #2798
- Add streaming support for Neuphonic by @alexshelkov in #3182
- Unify generate_reply and say code pattern by @ChenghaoMou in #3683
- chore: correct minimax-ai package name by @davidzhao in #3682
- fix(cartesia,deepgram): correctly timeout while in middle of TTS synthesis by @davidzhao in #3686
- feat(voice/run_result): OTEL tracing of judge by @bml1g12 in #3639
- feat: add CometAPI integration to OpenAI plugin by @tensornull in #3641
- Configure Prometheus in multi process mode by @efontan-dialpad in #3565
- Fix: Connection Pool race condition by @adrian-cowham in #3705
- chore: lock onnxruntime to <=1.23.1 by @davidzhao in #3712
- fix(worker): ensure safe iteration over process pool during job joining by @Panmax in #3710
- feat(cartesia): sonic-3 by @davidzhao in #3715
New Contributors
- @hadamove-rapidsos made their first contribution in #3655
- @matejmarinko-soniox made their first contribution in #3670
- @tensornull made their first contribution in #3641
- @efontan-dialpad made their first contribution in #3565
Full Changelog: https://github.com/livekit/agents/compare/[email protected]@1.2.16
livekit-agents@1.2.15
What's Changed
- reduce CI noise by @theomonnom in #3552
- automatically update livekit-agents pyproject.toml optional dependencies on version bump by @theomonnom in #3553
- added drain param to session.shutdown() by @tinalenguyen in #3562
- fix stt final transcript triggers user turn in manual turn detection by @longcw in #3559
- feat(livekit-plugins-hume): add model_version parameter by @zgreathouse in #3563
- fix model provider and metrics for FallbackAdapter and StreamAdapter by @longcw in #3526
- handle aiohttp client error when connecting to openai realtime api by @longcw in #3574
- Add `timeout` param to `with_openrouter()` function by @msaelices in #3538
- Add a dev folder for examples to keep the git graph clean by @Shubhrakanti in #3582
- Ensure ctx.api uses WorkerOptions credentials by exporting LIVEKIT_* in worker by @hwuiwon in #3581
- feat(google): Add thinking_config support, new model, and expanded voice profiles for google gemini TTS by @hwuiwon in #3583
- chore: ensure a recent version of certifi is installed by @davidzhao in #3580
- fix(deepgram): correctly handle timeout related errors by @davidzhao in #3579
- realtime model: wait for generate_reply before update tool results by @longcw in #3511
- fix aws realtime deps version by @longcw in #3592
- Updating Cartesia Version by @namantalreja in #3570
- fix: lock pyav to <16 due to build issue by @davidzhao in #3593
- lift google realtime api out of beta by @tinalenguyen in #3614
- catch delete_room errors and disable delete_room_on_close by default by @longcw in #3600
- feat(telemetry/utils): add ttft reporting to LangFuse by @bml1g12 in #3594
- Add RTZR(ReturnZero) STT Plugin for LiveKit Agents by @kimdwkimdw in #3376
- chore: Remove duplicate docstring for `preemptive_generation` parameter in AgentSession by @m-hamashita in #3624
- fix(deepgram): send CloseStream message before closing TTS WebSocket by @Nisarg38 in #3608
- feat(speechmatics): add max_speakers parameter for speaker diarization by @nsepehr in #3524
- Align Google STT plugin with official documentation by @mrkowalski in #3628
- add backwards compatibility for google's realtime model by @tinalenguyen in #3630
- fix: exclude temperature parameter for gpt-5 and similar models by @TheAli711 in #3573
- turn_detection: reduce max_endpointing_delay to 3s by @lwestn in #3640
- feat: Integrate streaming endpoints for Sarvam APIs by @shreyas-sarvam in #3498
- fix: heartbeat by @zachkamran in #3648
- enable zero retention mode in elevenlabs by @tinalenguyen in #3647
- Unprompted STT Reconnection at startup by @adrian-cowham in #3649
- fix #3650 cartesia version backward compatibility by @wlbksy in #3651
- livekit-agents 1.2.15 by @theomonnom in #3658
New Contributors
- @hwuiwon made their first contribution in #3581
- @namantalreja made their first contribution in #3570
- @kimdwkimdw made their first contribution in #3376
- @m-hamashita made their first contribution in #3624
- @Nisarg38 made their first contribution in #3608
- @nsepehr made their first contribution in #3524
- @TheAli711 made their first contribution in #3573
- @lwestn made their first contribution in #3640
- @shreyas-sarvam made their first contribution in #3498
- @adrian-cowham made their first contribution in #3649
Full Changelog: https://github.com/livekit/agents/compare/[email protected]@1.2.15
livekit-agents@1.2.14
New feature
- Introduce LiveKit Inference: a unified model interface enabling STT, LLM, and TTS via one API key, with optimized latency, billing, and concurrency management
https://blog.livekit.io/introducing-livekit-inference/
What's Changed
- inference: fix `extra_headers` provider by @theomonnom in #3549
Full Changelog: https://github.com/livekit/agents/compare/[email protected]@1.2.14
livekit-agents@1.2.13
What's Changed
- Gladia STT: support partial transcriptions by @fabrice404 in #3530
- support STT with `model:lang` and parse model specs outside ctor by @longcw in #3536
- update inference API & update model names by @theomonnom in #3545
- deepgram: support for Flux by @davidzhao in #3245
New Contributors
- @fabrice404 made their first contribution in #3530
Full Changelog: https://github.com/livekit/agents/compare/[email protected]@1.2.13
