Skip to content

[13.4-stable] newlog: sanitize non-Latin-1 chars in gzip header#5988

Open
eriknordmark wants to merge 1 commit into
lf-edge:13.4-stablefrom
eriknordmark:newlog-latin1-backport-13.4-stable
Open

[13.4-stable] newlog: sanitize non-Latin-1 chars in gzip header#5988
eriknordmark wants to merge 1 commit into
lf-edge:13.4-stablefrom
eriknordmark:newlog-latin1-backport-13.4-stable

Conversation

@eriknordmark
Copy link
Copy Markdown
Contributor

Description

Backport of #5977.

newlogd assigns the per-batch metadata header to gzip.Writer.Name (app logs,
holds the app's DisplayName) or Comment (device logs). Go's gzip writer
enforces RFC 1952's NUL-free ISO-8859-1 restriction on those fields, so the
operator-supplied DisplayName can contain bytes the writer refuses to emit
(CJK, emoji, smart quotes, extended-Latin glyphs like Łódź). The error
surfaces from gw.Write or gw.Close as log.Fatal, taking newlogd down;
the watchdog then reboots the device. After reboot the same collect file is
still on /persist, so the crash repeats — a reboot loop attributed to
"agent newlogd" with the signature
finalizeGzipToOutTempFile: cannot close filegzip.Write: non-Latin-1 header string.

Sanitize the header at the assignment site. Runes outside 0x01–0xFF are
rewritten to the literal text form of a Go Unicode escape (\uXXXX or
\UXXXXXXXX); valid Latin-1 (including 0x80–0xFF accented characters)
passes through unchanged. The substitution is reversible — cloud-side parsing
can recover the original DisplayName if desired. A one-shot warning is logged
when sanitization fires.

This branch predates the newlog: restructuring files in the package split
on master, so the sanitizer lives inline in pkg/newlog/cmd/newlogd.go next
to prepareGzipToOutTempFile, and the unit tests live in a new
pkg/newlog/cmd/sanitize_header_test.go. Functionally identical to #5977.

How to test and validate this PR

Deploy an app instance with a non-Latin-1 character in its DisplayName (e.g.
test-中-app). Watch newlogd in logread; with the fix it logs
prepareGzipToOutTempFile: header "test-中-app" contains non-Latin-1 characters; escaped to "test-中-app"
and rotates a gzip file under /persist/newlog/keepSentQueue/app.<uuid>.log.<ts>.gz
with Name="test-中-app". Without the fix, newlogd dies on the rotation
and the device reboots within ~60s with the signature above.

Latin-1-only DisplayNames (e.g. café-Ñandú) must pass through unchanged —
no warning, no escape.

Unit coverage: pkg/newlog/cmd/sanitize_header_test.go covers identity
behavior on Latin-1 input, the escape rules for runes outside Latin-1 and
NUL, the round-trip through gzip.Writer.Close, and pins the underlying
stdlib precondition.

End-to-end validation was performed on master before merge of #5977.

Changelog notes

Fixes a device reboot loop triggered when an app's DisplayName contains
non-Latin-1 characters.

PR Backports

  • 14.5-stable: separate PR.
  • 13.4-stable: this PR.

Checklist

  • I've added a reference link to the original PR
  • PR's title follows the template

When a log batch rolls over, newlogd writes the metadata header into
the gzip Name field for app logs (the app's DisplayName) and the
Comment field for device logs. RFC 1952 restricts both fields to
NUL-free ISO-8859-1, and Go's gzip writer enforces this on Close()
with the error "gzip.Write: non-Latin-1 header string". The error
path is log.Fatal, so when an operator names an app with a CJK
character, a smart quote, an emoji, an extended-Latin glyph like
Lodz, or anything else outside Latin-1, newlogd dies the moment it
tries to finalize the next gzip for that app -- which presents on
the device as a reboot loop attributed to "agent newlogd".

Sanitize the header before assigning. Runes outside 0x01-0xFF are
rewritten to the literal text form of a Go Unicode escape; valid
Latin-1 (including 0x80-0xFF accented characters) passes through.
The substitution is reversible, so the cloud-side parser can still
recover the original DisplayName if it chooses to. A warning is
logged on the first sanitization so the operator can see which
header was rewritten and why.

Backport of lf-edge#5977 (cherry-picked from commit
bfd646f; manually adapted because
this branch predates the newlog/cmd file restructuring, so the
sanitizer and its tests live in newlogd.go / sanitize_header_test.go
instead of util.go / util_test.go).

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions github-actions Bot requested review from deitch and naiming-zededa May 22, 2026 21:04
@eriknordmark eriknordmark added bug Something isn't working stable Should be backported to stable release(s) labels May 22, 2026
@eriknordmark eriknordmark marked this pull request as ready for review May 22, 2026 21:09
@eriknordmark eriknordmark requested review from europaul and rene May 22, 2026 21:09
@codecov
Copy link
Copy Markdown

codecov Bot commented May 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 32.66%. Comparing base (8077a3d) to head (24598c4).
⚠️ Report is 200 commits behind head on 13.4-stable.

Additional details and impacted files
@@               Coverage Diff               @@
##           13.4-stable    #5988      +/-   ##
===============================================
+ Coverage        24.78%   32.66%   +7.88%     
===============================================
  Files                8        9       +1     
  Lines             1138     1090      -48     
===============================================
+ Hits               282      356      +74     
+ Misses             788      654     -134     
- Partials            68       80      +12     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working stable Should be backported to stable release(s)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant