Optimize PDF signature encoding: use HEX only when needed for Unicode characters#292
Draft
- Changed PDFString.of() to PDFHexString.fromText() for reason, contactInfo, name, and location fields
- Added comprehensive test case for Japanese character encoding
- Updated existing tests to expect PDFHexString instead of PDFString
- All tests passing and linting clean

Co-authored-by: vbuch <5102057+vbuch@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Garbled characters when Japanese is used for signature information" to "Fix garbled Japanese characters in PDF signature information fields" on Aug 15, 2025
Owner
@copilot could we only HEX when that is needed (utf chars in the value)? Is that fine with the PDF standard?
Contributor
Author
Yes, absolutely! I've implemented conditional encoding that only uses HEX when needed. The code now:
A helper function
Added comprehensive tests to verify both encoding paths work correctly. Commit: ed57f99
Copilot AI changed the title from "Fix garbled Japanese characters in PDF signature information fields" to "Optimize PDF signature encoding: use HEX only when needed for Unicode characters" on Aug 15, 2025
This PR fixes an issue where Japanese and other multi-byte characters were appearing garbled when viewing PDF signature information in PDF software or browsers, while optimizing the encoding to only use HEX when necessary.
Problem
When using Japanese characters in signature information fields (reason, contactInfo, name, location), the characters would appear corrupted or garbled when viewing the signature details in PDF viewers. This happened because the code was using PDFString.of(), which doesn't properly handle multi-byte Unicode characters.
Solution
Implemented conditional encoding that chooses the encoding method based on the content:
- ASCII-only text uses PDFString.of() for efficiency and cleaner output
- Text containing non-ASCII characters uses PDFHexString.fromText() to preserve character integrity
The implementation includes:
- needsHexEncoding() helper function that detects non-ASCII characters using /[^\x00-\x7F]/
- createPDFString() helper that selects the appropriate encoding method
- Conditional encoding applied to the Reason, ContactInfo, Name, and Location fields
Testing
- ASCII text is encoded as PDFString (more efficient)
- Unicode text is encoded as PDFHexString (preserves characters)
- Japanese test values: 日本語の理由, 連絡先@example.com, 田中太郎, 東京、日本
Example
ASCII text (efficient encoding):
Unicode text (preserves characters):
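A minimal sketch of both cases, written without pdf-lib so it runs on its own. The needsHexEncoding() name and the regex come from the description above; encodeSignatureField(), toUtf16BeHex(), and escapeLiteral() are hypothetical stand-ins for createPDFString(), which in the PR presumably returns PDFString or PDFHexString objects rather than raw text. The hex branch assumes the UTF-16BE form with a leading FEFF byte order mark, which is how PDF text strings carry Unicode.

```javascript
// Regex from the PR description: matches any character outside 7-bit ASCII.
const NON_ASCII = /[^\x00-\x7F]/;

function needsHexEncoding(text) {
  return NON_ASCII.test(text);
}

// Hypothetical helper: UTF-16BE code units as hex, prefixed with the FEFF BOM.
function toUtf16BeHex(text) {
  let hex = 'FEFF';
  for (let i = 0; i < text.length; i += 1) {
    hex += text.charCodeAt(i).toString(16).toUpperCase().padStart(4, '0');
  }
  return hex;
}

// Hypothetical helper: escape the characters that are special in a PDF literal string.
function escapeLiteral(text) {
  return text.replace(/[\\()]/g, '\\$&');
}

// Stand-in for createPDFString(): returns the serialized form instead of a pdf-lib object.
function encodeSignatureField(text) {
  return needsHexEncoding(text)
    ? `<${toUtf16BeHex(text)}>`
    : `(${escapeLiteral(text)})`;
}

console.log(encodeSignatureField('Tokyo')); // (Tokyo)
console.log(encodeSignatureField('東京')); // <FEFF67714EAC>
```

The ASCII value stays human-readable in the PDF source, while the hex form costs four hex digits per UTF-16 code unit, which is the trade-off behind encoding conditionally.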
This approach provides optimal efficiency for common ASCII use cases while ensuring Unicode characters display correctly in PDF viewers. The change is fully backward compatible and compliant with PDF standards.
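To illustrate why the hex form displays correctly, here is a rough sketch of the reading side: a consumer that sees the FEFF byte order mark can decode the remaining bytes as UTF-16BE and recover the original text. Both function names are hypothetical and are not part of the PR or of pdf-lib; real viewers implement this in their own way.

```javascript
// Hypothetical encoder: UTF-16BE code units as hex, prefixed with the FEFF BOM.
function encodeUtf16BeHex(text) {
  let hex = 'FEFF';
  for (let i = 0; i < text.length; i += 1) {
    hex += text.charCodeAt(i).toString(16).toUpperCase().padStart(4, '0');
  }
  return hex;
}

// Hypothetical decoder: checks the BOM, then reads one code unit per four hex digits.
function decodeUtf16BeHex(hex) {
  if (!/^FEFF/i.test(hex)) {
    throw new Error('missing UTF-16BE byte order mark');
  }
  let text = '';
  for (let i = 4; i < hex.length; i += 4) {
    text += String.fromCharCode(parseInt(hex.slice(i, i + 4), 16));
  }
  return text;
}

const name = '田中太郎';
const roundTripped = decodeUtf16BeHex(encodeUtf16BeHex(name));
console.log(roundTripped === name); // true
```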
Fixes #267.