feat: Add logprobs support to SGLang backend #4912
base: main
Conversation
Signed-off-by: Aryan Bagade <[email protected]>
👋 Hi AryanBagade! Thank you for contributing to ai-dynamo/dynamo.
Walkthrough

This change adds logprobs support to the SGLang backend by introducing logprobs extraction from framework responses, parsing logprobs parameters from requests, and integrating extracted probabilities into response outputs for both chat and completion request paths.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks: ✅ Passed checks (5 passed)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
components/src/dynamo/sglang/request_handlers/llm/decode_handler.py (1)
373-428: Add logprobs extraction to OpenAI-format text stream output.

The `_process_text_stream` method does not extract or emit logprobs in the response, even though `_build_sampling_params` requests them from the SGLang engine (lines 117-123) and the data is available in the response. According to the OpenAI chat completion streaming specification, logprobs should be included in each choice object when requested. Currently, `_process_token_stream` correctly extracts and includes logprobs using the `_extract_logprobs` method, but `_process_text_stream` ignores this data entirely, causing OpenAI-format requests to violate API spec compliance by omitting logprobs even when requested.

Add logprobs extraction to the `choice_data` dict in `_process_text_stream`, similar to how `_process_token_stream` handles it, to ensure logprobs are included in the OpenAI-formatted chunks when requested.
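As a minimal sketch of the shape that fix could take — the helper and field names here are illustrative assumptions based on the review text, not the actual handler code:

```python
from typing import Any, Optional

def build_choice_data(
    text_delta: str,
    finish_reason: Optional[str],
    logprobs: Optional[dict[str, Any]],
) -> dict[str, Any]:
    """Build one OpenAI-style streaming choice, attaching logprobs only when present."""
    choice: dict[str, Any] = {
        "index": 0,
        "delta": {"role": "assistant", "content": text_delta},
        "finish_reason": finish_reason,
    }
    if logprobs is not None:
        # Per the OpenAI streaming spec, logprobs belong on the choice object,
        # and only when the request asked for them.
        choice["logprobs"] = logprobs
    return choice
```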
🧹 Nitpick comments (1)
components/src/dynamo/sglang/request_handlers/llm/decode_handler.py (1)
58-61: Inconsistent logger usage.

These lines use `logging.info()` (the root logger) while the logprobs validation uses the module-level `logger` defined at line 17. For consistency and proper log namespacing, consider using `logger.info()` throughout.

```diff
 if self.serving_mode == DisaggregationMode.DECODE:
     if self.prefill_client is None:
         raise ValueError(
             "prefill_client must be provided when serving_mode is decode"
         )
     self.prefill_client = prefill_client
-    logging.info("Decode worker handler initialized")
+    logger.info("Decode worker handler initialized")
 self.prefill_router_client = prefill_router_client
-logging.info("Worker handler initialized")
+logger.info("Worker handler initialized")
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- components/src/dynamo/sglang/request_handlers/llm/decode_handler.py (6 hunks)
- tests/serve/test_sglang.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/serve/test_sglang.py (1)
tests/utils/payload_builder.py (2)
- chat_payload_with_logprobs (316-356)
- completion_payload_with_logprobs (359-393)
components/src/dynamo/sglang/request_handlers/llm/decode_handler.py (2)
components/src/dynamo/vllm/handlers.py (1)
- _extract_logprobs (683-740)

components/src/dynamo/trtllm/request_handlers/handler_base.py (1)

- _extract_logprobs (110-177)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (5)
components/src/dynamo/sglang/request_handlers/llm/decode_handler.py (4)
17-18: LGTM! Adding a module-level logger is appropriate for the new warning messages in logprobs validation.
92-107: LGTM! The logprobs validation is thorough: it handles empty strings, non-integer values, and negative integers with appropriate warnings.
117-124: LGTM! The OpenAI-format handling correctly differentiates between the boolean `logprobs` flag and the optional `top_logprobs` count, matching the OpenAI API specification.
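To illustrate the behavior the two comments above describe, here is a hedged sketch of how such parsing could look; the function and key names are assumptions, not the actual `_build_sampling_params` code:

```python
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)

def parse_logprobs_params(request: dict[str, Any]) -> Optional[int]:
    """Normalize logprobs inputs to a top-logprobs count, or None if unset/invalid.

    Completions-style requests carry an integer `logprobs`; chat-style requests
    carry a boolean `logprobs` plus an optional `top_logprobs` count.
    """
    value = request.get("logprobs")
    if value is None or value == "":
        if value == "":
            logger.warning("Ignoring empty-string logprobs value")
        return None
    if isinstance(value, bool):
        # OpenAI chat format: boolean flag plus optional top_logprobs count.
        # (Checked before int, since bool is a subclass of int in Python.)
        if not value:
            return None
        return request.get("top_logprobs") or 1
    if not isinstance(value, int):
        logger.warning("Ignoring non-integer logprobs value: %r", value)
        return None
    if value < 0:
        logger.warning("Ignoring negative logprobs value: %d", value)
        return None
    return value
```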
346-355: LGTM! The logprobs extraction is correctly integrated after determining new tokens, and fields are only added when non-None.
tests/serve/test_sglang.py (1)
20-22: LGTM! The new payload builder imports are correctly added alongside existing imports.
Signed-off-by: Aryan Bagade <[email protected]>
Force-pushed from e6b1c9d to 246ab54
ishandhanani left a comment:
Some thoughts
```python
async for out in self._process_text_stream(agg, context):
    yield out

@staticmethod
```
wdyt about moving this into handler base?
I did some digging here; basically the extraction logic is backend-specific, since SGLang returns logprobs as tuples of (logprob, token_id, decoded_text), which differs from TRTLLM's structure.
vLLM also keeps its `_extract_logprobs` in handlers.py rather than a shared base for the same reason. Happy to refactor if you see a clean abstraction though!
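For context, a sketch of what tuple-based extraction can look like; the `meta_info` and `output_token_logprobs` names are assumptions about SGLang's response shape, inferred from the tuple format described above:

```python
from typing import Any, Optional

def extract_sglang_logprobs(meta_info: dict[str, Any]) -> Optional[list[dict[str, Any]]]:
    """Convert SGLang's tuple-based logprobs into a list of plain dicts."""
    token_logprobs = meta_info.get("output_token_logprobs")
    if not token_logprobs:
        return None
    results = []
    for logprob, token_id, decoded_text in token_logprobs:
        # Each SGLang entry is packed as (logprob, token_id, decoded_text),
        # unlike TRTLLM's structured objects.
        results.append(
            {"token": decoded_text, "token_id": token_id, "logprob": logprob}
        )
    return results
```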
/ok to test 45fb04e
…'s SamplingParams doesn't accept return_logprob or top_logprobs_num. These must be passed as separate kwargs to engine.async_generate().

Signed-off-by: Aryan Bagade <[email protected]>
Fixed logprobs params.
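A minimal sketch of the corrected call shape the commit message describes; `engine`, `prompt`, and the surrounding function are illustrative, with only the two kwarg names taken from the commit message:

```python
async def generate_with_logprobs(engine, prompt, sampling_params, top_logprobs_num):
    # Logprobs options go to async_generate directly, since SGLang's
    # SamplingParams rejects return_logprob / top_logprobs_num.
    stream = await engine.async_generate(
        prompt=prompt,
        sampling_params=sampling_params,  # no logprobs fields in here
        stream=True,
        return_logprob=True,
        top_logprobs_num=top_logprobs_num,
    )
    async for out in stream:
        yield out
```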
Overview:
Add logprobs support to SGLang backend, enabling log probability data to be returned in streaming responses. This is the SGLang counterpart to the vLLM logprobs support added in #4697.
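To make the feature concrete, a hypothetical client request against an OpenAI-compatible frontend (the URL and model name are placeholders):

```python
import requests

payload = {
    "model": "Qwen/Qwen2.5-0.5B-Instruct",  # placeholder model
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": True,
    "logprobs": True,   # OpenAI chat format: boolean flag...
    "top_logprobs": 3,  # ...plus how many alternatives per token
}
resp = requests.post(
    "http://localhost:8000/v1/chat/completions", json=payload, stream=True
)
for line in resp.iter_lines():
    if line:
        print(line.decode())
```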
Details:
Related Issues: