proposed changes to PR #2962 #2981
Conversation
- Add support for speech-2.6-hd and speech-2.6-turbo models
- Add 16 new languages (total 40): Afrikaans, Bulgarian, Catalan, Danish, Persian, Filipino, Hebrew, Croatian, Hungarian, Malay, Norwegian, Nynorsk, Slovak, Slovenian, Swedish, Tamil
- Add new emotions: calm and fluent
- Add new parameters: text_normalization (renamed from english_normalization), latex_read, force_cbr, exclude_aggregated_audio, subtitle_enable, subtitle_type
- Extract trace_id from response headers for all requests
- Improve error handling for non-streaming error responses
- Add detailed extra_info logging (audio_length, audio_size, usage_characters, word_count)
- Add validation warnings for language/model compatibility
- Fix silent error issue where HTTP 200 responses with errors were ignored

BREAKING CHANGE: Renamed parameter english_normalization to text_normalization
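For orientation, a minimal configuration sketch of how these options might be passed, assuming the service class is `MiniMaxHttpTTSService` with a nested `InputParams` as in other pipecat TTS services; the exact constructor arguments and enum members shown are assumptions based on this PR's description, not a confirmed API:

```python
# Sketch only: class name, constructor arguments, and InputParams fields are
# assumptions based on this PR's description, not a confirmed public API.
import aiohttp

from pipecat.services.minimax.tts import MiniMaxHttpTTSService
from pipecat.transcriptions.language import Language


async def build_minimax_tts(
    session: aiohttp.ClientSession, api_key: str, group_id: str
) -> MiniMaxHttpTTSService:
    params = MiniMaxHttpTTSService.InputParams(
        language=Language.SV,        # Swedish, one of the 16 newly added languages
        emotion="calm",              # new emotion added in this PR
        text_normalization=True,     # renamed from english_normalization
    )
    return MiniMaxHttpTTSService(
        api_key=api_key,
        group_id=group_id,
        model="speech-2.6-turbo",    # new model added in this PR
        aiohttp_session=session,
        params=params,
    )
```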
latex_read: Enable LaTeX formula reading.
force_cbr: Enable Constant Bitrate (CBR) for audio encoding (MP3 only).
exclude_aggregated_audio: Whether to exclude aggregated audio in final chunk.
subtitle_enable: Enable subtitle generation (non-streaming only).
We should consider removing this parameter and subtitle_type, since they're for non-streaming only. This MiniMax implementation is streaming only.
The alternative would be to add a streaming arg to allow for a non-streaming mode, but I don't think that's a good idea.
There's a mismatch here. Our model can support streaming subtitles.
Perhaps "(non-streaming only)" should be omitted from the comment then?
@zhenyujia23-crypto is there a way subtitles can be added to the example 07y-interruptible-minimax.py to exercise this parameter?
| "disgusted", "surprised", "calm", "fluent"). | ||
| english_normalization: Deprecated; use `text_normalization` instead | ||
| text_normalization: Enable text normalization (Chinese/English). | ||
| latex_read: Enable LaTeX formula reading. |
Is this actually applicable? I guess LLMs can output LaTeX format, but I don't know if it makes sense for a TTS to do the same 🤷
I guess there's no harm in including it, but generally voice agents work best when they skip pronouncing complex math, code, or tables.
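If that preference matters in practice, one option outside this PR is to strip LaTeX spans client-side before the text reaches the TTS, rather than relying on the service to read formulas. A rough, hypothetical sketch:

```python
# Not part of this PR: a rough illustration of skipping LaTeX math client-side
# instead of asking the TTS to read it aloud.
import re

# Matches $$...$$, $...$, \(...\) and \[...\] style LaTeX spans.
_LATEX_SPAN = re.compile(r"\$\$.*?\$\$|\$[^$]*\$|\\\(.*?\\\)|\\\[.*?\\\]", re.DOTALL)


def strip_latex(text: str) -> str:
    """Replace LaTeX spans with a short spoken placeholder."""
    return _LATEX_SPAN.sub(" (formula omitted) ", text)


if __name__ == "__main__":
    print(strip_latex("The loss is $L = -\\sum_i y_i \\log p_i$ per batch."))
    # -> "The loss is  (formula omitted)  per batch."
```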
src/pipecat/services/minimax/tts.py (Outdated)
english_normalization: Deprecated; use `text_normalization` instead.
text_normalization: Enable text normalization (Chinese/English).
latex_read: Enable LaTeX formula reading.
force_cbr: Enable Constant Bitrate (CBR) for audio encoding (MP3 only).
Is MP3 an option? Again, this probably doesn't make sense for a streaming use case.
Will remove.
Yes, let's remove if it doesn't apply to streaming then.
src/pipecat/services/minimax/tts.py (Outdated)
if service_lang:
    self._settings["language_boost"] = service_lang

# Validate language-model compatibility
I'm not sure if we should include this type of checking. It's likely to become stale and need to be maintained. It would be better to rely on the service returning an error about unsupported languages for a given model.
Will remove it.
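For reference, relying on the service error rather than a local compatibility table might look roughly like the sketch below. The `base_resp.status_code` / `status_msg` fields are an assumption about the MiniMax response shape, not verified here:

```python
# Sketch only: assumes a failed MiniMax response carries base_resp.status_code and
# base_resp.status_msg; these field names are assumptions, not confirmed here.
from loguru import logger


def raise_on_minimax_error(data: dict) -> None:
    """Surface service-side errors (e.g. an unsupported language/model combination)."""
    base_resp = data.get("base_resp") or {}
    status_code = base_resp.get("status_code", 0)
    if status_code != 0:
        msg = base_resp.get("status_msg", "unknown error")
        logger.error(f"MiniMax TTS error {status_code}: {msg}")
        raise RuntimeError(f"MiniMax TTS error {status_code}: {msg}")
```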
if params.text_normalization is not None:
    self._settings["voice_setting"]["text_normalization"] = params.text_normalization

# Add latex_read if provided
Consider removing this if the parameter isn't included.
src/pipecat/services/minimax/tts.py (Outdated)
if params.latex_read is not None:
    self._settings["voice_setting"]["latex_read"] = params.latex_read

# Add force_cbr if provided (for MP3 format only)
Same. Consider removing this if the parameter isn't included.
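The pattern under discussion, shown in isolation, is to copy an optional value into the request settings only when the caller actually provided it; the settings keys below mirror the diff, while the helper itself is hypothetical:

```python
# Hypothetical helper showing the "only set it when provided" pattern from the diff.
# The voice_setting/audio_setting keys mirror the diff above.
from typing import Any, Optional


def apply_optional_settings(
    settings: dict[str, Any],
    text_normalization: Optional[bool] = None,
    force_cbr: Optional[bool] = None,
) -> dict[str, Any]:
    if text_normalization is not None:
        settings.setdefault("voice_setting", {})["text_normalization"] = text_normalization
    if force_cbr is not None:  # MP3-only, and likely dropped per the review comments
        settings.setdefault("audio_setting", {})["force_cbr"] = force_cbr
    return settings
```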
if params.force_cbr is not None:
    self._settings["audio_setting"]["force_cbr"] = params.force_cbr

# Add subtitle settings if provided
I think we should remove the parameter and this logic since non-streaming isn't supported.
yield TTSStartedFrame()

# Process the streaming response
logger.trace(f"Starting to read streaming response, status={response.status}")
Remove
async for chunk in response.content.iter_chunked(CHUNK_SIZE):
    chunk_count += 1
    logger.trace(f"Received chunk #{chunk_count}, size={len(chunk)} bytes")
Remove.
# Log raw buffer content for debugging
if chunk_count == 1:
    logger.trace(f"Raw buffer content: {buffer[:200]}")  # First 200 bytes
Remove.
data = json.loads(data_block[5:].decode("utf-8"))
# Skip data blocks containing extra_info
data_str = data_block[5:].decode("utf-8")
logger.trace(
Remove this and other trace logging.
But I think we still need a place to log the trace_id, because it will help us identify problems on our service side.
Is the tracing code necessary for other folks who use this class?
Perhaps a better approach is to create a git patch with all the tracing code that can be applied locally to a user's pipecat code.
I see that it is useful, but it feels like it should be a more "opt-in" feature.
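One possible middle ground, sketched below: gate trace-id logging behind an opt-in flag and only emit it on failures. The `Trace-Id` header name and the flag are assumptions, not confirmed parts of the API or this PR:

```python
# Sketch of an opt-in approach: log the service trace id only when explicitly
# enabled, and only on failures. The "Trace-Id" header name and flag are assumptions.
import aiohttp
from loguru import logger


def log_trace_id_on_error(
    response: aiohttp.ClientResponse, *, enable_trace_logging: bool = False
) -> None:
    if not enable_trace_logging or response.status == 200:
        return
    trace_id = response.headers.get("Trace-Id", "<missing>")
    logger.warning(
        f"MiniMax TTS request failed (status={response.status}, trace_id={trace_id})"
    )
```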
if not chunk_data:
    continue

# Check for subtitle file (if subtitle generation is enabled)
Remove
markbackman left a comment:
Ok, I think the biggest issue to sort out is what the scope of this service is. Pipecat is for real-time use cases, so non-streaming is not really applicable. I think we should remove the non-streaming items from this class to keep it focused and simple.
Also, remove the trace logging, as it's only applicable during development and isn't practical for real-world debugging.
Co-authored-by: Mark Backman <[email protected]>
- Remove non-essential parameters: latex_read, force_cbr, exclude_aggregated_audio
- Simplify trace_id logging (keep for errors only, remove from success logs)
- Make subtitle generation support streaming with word-level timestamps
- Remove non-streaming logic warnings and validation
- Clean up code structure and remove duplicate parameter handling
- Reduce code complexity while maintaining core functionality

Responds to PR pipecat-ai#2981 reviewer comments about keeping the TTS service focused on real-time use cases.
- Remove force_cbr parameter and complex audio encoding options
- Remove exclude_aggregated_audio streaming control
- Remove all logger.debug statements for cleaner logging
- Simplify InputParams class (11 → 7 parameters)
- Streamline subtitle support to word-level only
- Remove sentence-level subtitle functionality
- Fix geographic comment: "west of unite state" → "western United States"
- Update documentation for streaming-only word-level subtitles
- Maintain all core TTS functionality for real-time use cases
- Preserve language support (40 languages), emotions, and trace ID tracking

This addresses PR pipecat-ai#2981 feedback by simplifying the service to focus purely on streaming TTS with word-level subtitles, removing unnecessary complexity and debugging code while maintaining essential functionality.