feat(embeddings): fix embedding dispatch in executor #112
base: dev
Conversation
@mirrobot-agent your PR review workflow failed, re-trigger it.

@b3nw, I've received your request to re-trigger the review. I'm starting my analysis of the embedding dispatch fixes now and will post the results shortly.
Overall Assessment
This PR correctly identifies and fixes the issue where embedding requests were being dispatched to litellm.acompletion instead of litellm.aembedding. The introduction of request_type in RequestContext is a solid approach and the refactor in RequestExecutor to dynamically select the call function is clean.
Architectural Feedback
The addition of request_type to RequestContext is a good way to handle different API operation types within the same executor logic. This pattern could potentially be extended to other operation types in the future.
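As a rough sketch of the pattern the review describes (field names other than `request_type` are assumptions inferred from the diffs in this PR, not the project's actual definition):

```python
from dataclasses import dataclass
from typing import Any, Literal

@dataclass
class RequestContext:
    """Carries everything the executor needs for one API call."""
    model: str
    provider: str
    kwargs: dict[str, Any]
    streaming: bool = False
    # Tells the executor which litellm entry point to dispatch to.
    request_type: Literal["completion", "embedding"] = "completion"

# An embedding request is never streaming and tags itself explicitly.
ctx = RequestContext(model="mistral-embed", provider="mistral",
                     kwargs={}, streaming=False, request_type="embedding")
```

Because the field defaults to `"completion"`, existing completion call sites keep working unchanged, which is what makes the pattern easy to extend to other operation types later.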
Key Suggestions
- **Enhance `aembedding` feature parity**: Currently, `aembedding` in `RotatingClient` is missing model resolution and transaction logging.
- **Async Consistency**: Consider making `aembedding` an `async def` to match `acompletion`.
Warnings
None.
This review was generated by an AI assistant.
```python
    provider=provider,
    kwargs=kwargs,
    streaming=False,
    request_type="embedding",
```
While adding request_type="embedding" here is correct, the aembedding method is still missing feature parity with acompletion. Specifically, it lacks model resolution and transaction logging.
Also, for consistency with the rest of the client, this should ideally be an async def and await the executor.
Suggested change:

```python
# Resolve model ID
resolved_model = self._model_resolver.resolve_model_id(model, provider)
kwargs["model"] = resolved_model

# Create transaction logger if enabled
transaction_logger = None
if self.enable_request_logging:
    transaction_logger = TransactionLogger(
        provider=provider,
        model=resolved_model,
        enabled=True,
        parent_dir=kwargs.pop("_parent_log_dir", None),
    )
    transaction_logger.log_request(kwargs)

# Build request context (embeddings are never streaming)
context = RequestContext(
    model=resolved_model,
    provider=provider,
    kwargs=kwargs,
    streaming=False,
    request_type="embedding",
    credentials=self.all_credentials.get(provider, []),
    deadline=time.time() + self.global_timeout,
    request=request,
    pre_request_callback=pre_request_callback,
    transaction_logger=transaction_logger,
)
return await self._executor.execute(context)
```
Note: This would also require changing the function signature to async def aembedding(...).
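The dispatch side of the fix can be sketched like this, with stub coroutines standing in for `litellm.acompletion` and `litellm.aembedding` (the mapping and function names here are illustrative, not the project's actual code):

```python
import asyncio

# Stubs standing in for litellm.acompletion / litellm.aembedding.
async def acompletion(**kwargs):
    return {"endpoint": "completion"}

async def aembedding(**kwargs):
    return {"endpoint": "embedding"}

# One possible shape of the fix: pick the call function from
# request_type instead of hard-coding acompletion.
DISPATCH = {"completion": acompletion, "embedding": aembedding}

async def execute_non_streaming(request_type: str, **kwargs):
    call_fn = DISPATCH.get(request_type)
    if call_fn is None:
        raise ValueError(f"Unknown request_type: {request_type}")
    return await call_fn(**kwargs)

result = asyncio.run(execute_non_streaming("embedding", model="mistral-embed"))
print(result["endpoint"])  # embedding
```

A table lookup keeps the two code paths sharing all retry, deadline, and logging logic, with only the final call differing.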
@b3nw, I've completed the re-triggered review for PR #112.

Summary: The fixes for embedding dispatch are technically sound and resolve the core issue where embeddings were incorrectly using the completion endpoint. I've suggested some further enhancements to bring `aembedding` to feature parity with `acompletion`.

You can find the detailed feedback and code suggestions in the formal review. This update was generated by an AI assistant.
The embedding endpoint existed but was broken because the executor refactoring (Nov 2025) forgot to add litellm.aembedding() dispatch. All requests were going through litellm.acompletion(), causing failures.

Changes:
- Add request_type field to RequestContext (completion/embedding)
- Update RotatingClient.aembedding() to set request_type='embedding'
- Add conditional dispatch in RequestExecutor._execute_non_streaming() to call litellm.aembedding() for embedding requests
- Refactored dispatch logic to reduce code duplication
- Added feature parity to aembedding: async signature, model resolution, and transaction logging (matching acompletion)

Tested with Mistral (mistral-embed) - returns proper 1024-dim vectors.
Force-pushed from 2c755e8 to 68babe7.
@mirrobot-agent addressed your items, re-review

@b3nw, I've received your request for a re-review. I'll check the latest changes to ensure the embedding dispatch logic is fully aligned with the suggestions. I'll post my findings shortly! 🔍
Overall Assessment
The updates successfully address all the points raised in the previous review. The aembedding method now has full feature parity with acompletion, and the dispatch logic in the RequestExecutor is clean and robust.
Key Changes Verified
- **Async Parity**: `aembedding` is now an `async def` and properly awaits the executor.
- **Model Resolution**: Added `_model_resolver.resolve_model_id` to ensure embedding requests use resolved model IDs.
- **Transaction Logging**: Integrated `TransactionLogger` into the embedding flow, ensuring requests and responses are logged when enabled.
- **Clean Dispatch**: The refactored logic in `RequestExecutor._execute_non_streaming` correctly selects between `litellm.aembedding` and `litellm.acompletion` based on the `request_type`.
This is a great improvement to the library's embedding support.
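The verified flow can be sketched end to end with minimal stand-ins for the client's collaborators (class bodies here are illustrative; only the ordering — resolve, build context, await the executor — mirrors the diff):

```python
import asyncio

class ModelResolver:
    # Illustrative: the real resolver maps aliases to provider model IDs.
    def resolve_model_id(self, model, provider):
        return f"{provider}/{model}"

class Executor:
    async def execute(self, context):
        return {"model": context["model"]}

class Client:
    def __init__(self):
        self._model_resolver = ModelResolver()
        self._executor = Executor()

    async def aembedding(self, model, provider, **kwargs):
        # Resolve the model ID before building the context,
        # matching acompletion's flow.
        resolved = self._model_resolver.resolve_model_id(model, provider)
        context = {"model": resolved, "request_type": "embedding",
                   "streaming": False, "kwargs": kwargs}
        # async def + await gives parity with acompletion.
        return await self._executor.execute(context)

resp = asyncio.run(Client().aembedding("mistral-embed", "mistral"))
print(resp["model"])  # mistral/mistral-embed
```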
This review was generated by an AI assistant.
```diff
@@ -375,19 +375,39 @@ def aembedding(
                 f"Invalid model format or no credentials for provider: {model}"
             )

+        # Extract internal logging parameters (not passed to API)
+        parent_log_dir = kwargs.pop("_parent_log_dir", None)
+
+        # Resolve model ID
+        resolved_model = self._model_resolver.resolve_model_id(model, provider)
+        kwargs["model"] = resolved_model
+
+        # Create transaction logger if enabled
+        transaction_logger = None
+        if self.enable_request_logging:
+            transaction_logger = TransactionLogger(
+                provider=provider,
+                model=resolved_model,
+                enabled=True,
+                parent_dir=parent_log_dir,
+            )
+            transaction_logger.log_request(kwargs)
+
         # Build request context (embeddings are never streaming)
         context = RequestContext(
-            model=model,
+            model=resolved_model,
             provider=provider,
             kwargs=kwargs,
             streaming=False,
+            request_type="embedding",
             credentials=self.all_credentials.get(provider, []),
             deadline=time.time() + self.global_timeout,
             request=request,
             pre_request_callback=pre_request_callback,
+            transaction_logger=transaction_logger,
         )

-        return self._executor.execute(context)
+        return await self._executor.execute(context)
```
Fix verified: aembedding now has full feature parity with acompletion, including async support, model resolution, and transaction logging. This resolves my previous feedback.
@b3nw, I've completed the re-review for PR #112.

Summary: The updates successfully address all the items identified in the previous review. The embedding support is now fully integrated with the library's core features (model resolution, transaction logging, and async execution).

I've approved the PR. Great work on these fixes! This update was generated by an AI assistant.
Important

Fix embedding dispatch in `RequestExecutor` by adding `request_type` to `RequestContext` and updating logic to correctly handle embedding requests.

- Add conditional dispatch in `RequestExecutor._execute_non_streaming()` to call `litellm.aembedding()` for embedding requests.
- Add `request_type` field to `RequestContext` in `types.py` to distinguish between completion and embedding requests.
- Update `RotatingClient.aembedding()` to set `request_type='embedding'`.

This description was created by an AI assistant for 2c755e8. You can customize this summary. It will automatically update as commits are pushed.