Merged
15 changes: 15 additions & 0 deletions pages/docs/features/ocr.mdx
@@ -128,6 +128,21 @@ ocr:

Support for custom OCR providers and user-defined strategies is planned for future releases.

### 5. Upload Files to Provider (Direct)

For supported LLM providers (**OpenAI, Azure OpenAI, Anthropic, and Google**) and their respective models, files can now be sent directly to the provider APIs as message attachments by choosing the `Upload to Provider` option in the file attachment dropdown menu.
The provider then parses the files with its own native OCR implementation.

Currently, all four providers support images and PDFs. Google also supports audio and video files when used with compatible multimodal models.

<Callout type="note" title="Azure OpenAI PDF Upload Caveat" emoji='✏️'>
For **Azure OpenAI** endpoints, the `Upload to Provider` option for PDF files is available only when using the Responses API. Azure OpenAI's Chat Completions API supports images but does not support PDF file attachments.

If you do not see `Upload to Provider` as an option for PDFs in your chat's attachment dropdown menu with Azure OpenAI, ensure that the Responses API parameter is enabled in the Parameters panel.

Note: Standard OpenAI endpoints support PDF uploads in both the Chat Completions and Responses APIs.
</Callout>
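
The support rules above can be summarized as a small decision function. The following TypeScript is only an illustrative sketch: the type names, the `canUploadToProvider` function, and the `useResponsesApi` flag are hypothetical and are not taken from LibreChat's code. It also does not check whether the selected model is multimodal.

```typescript
// Hypothetical sketch: names here are illustrative, not LibreChat's actual API.
type Provider = "openAI" | "azureOpenAI" | "anthropic" | "google";
type FileKind = "image" | "pdf" | "audio" | "video";

interface UploadContext {
  provider: Provider;
  fileKind: FileKind;
  /** Assumed flag: whether the Responses API parameter is enabled. */
  useResponsesApi?: boolean;
}

function canUploadToProvider(ctx: UploadContext): boolean {
  const { provider, fileKind, useResponsesApi = false } = ctx;
  switch (fileKind) {
    case "image":
      // All four supported providers accept images.
      return true;
    case "pdf":
      // Azure OpenAI accepts PDFs only through the Responses API;
      // the other providers accept PDFs regardless of this flag.
      return provider === "azureOpenAI" ? useResponsesApi : true;
    case "audio":
    case "video":
      // Only Google accepts audio and video (model compatibility not checked here).
      return provider === "google";
  }
}
```

For example, `canUploadToProvider({ provider: "azureOpenAI", fileKind: "pdf" })` evaluates to `false` in this sketch until `useResponsesApi: true` is passed, which mirrors the caveat in the callout.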

## Detailed Configuration

For additional, detailed configuration options, see the [OCR Config Object Structure](/docs/configuration/librechat_yaml/object_structure/ocr).