-
Notifications
You must be signed in to change notification settings - Fork 29
Fix remote edit crash #436
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
also adds a transliteration from NNBSP to NBSP
Reviewer's GuideUpdates Latin-1 transliteration and validation to avoid crashes when converting certain Unicode characters (notably narrow non-breaking space) to Latin-1, centralizes QLatin1Char conversion logic, adds tests, and ensures user-visible messages always end with a newline, including a specific GMCP failure message path. Sequence diagram for GMCP failure message newline-safe deliverysequenceDiagram
participant M as MudTelnet
participant G as global_sendToUser
participant R as SendToUser_Callback
participant U as UI
M->>G: sendToUser("Failure %1 remote message: %2\n".arg(action, result))
activate G
alt str endsWith(QC_NEWLINE)
G->>R: invoke(str)
else missing newline
G->>R: invoke(str + QS_NEWLINE)
end
deactivate G
R->>U: onMessage(str_with_newline)
activate U
U-->>U: render message line
deactivate U
Class diagram for updated Latin1 transliteration and sendToUser flowclassDiagram
class mmqt_Charset {
<<namespace>>
+bool is_latin_control(uint8_t uc)
+bool is_noncontrol_latin1(char byte)
+char16_t simple_unicode_translit(char16_t input)
+QChar simple_unicode_translit(QChar qc)
+bool isValidLatin1(char latin1)
+QLatin1Char toQLatin1Char(QChar input)
+QString~ref~ toLatin1InPlace(QString~ref~ str)
+QString toLatin1(QString str)
}
class global_SendToUser {
<<namespace>>
+void registerSendToUser(Signal2Lifetime~ref~ lifetime, Signal2QString_Function callback)
+Signal2QString_Function~ref~ getSendToUser()
+void sendToUser(QString~const ref~ str)
}
class MudTelnet {
+void virt_receiveGmcpMessage(GmcpMessage~const ref~ msg)
}
class UIMessageSink {
<<interface>>
+void onMessage(QString~const ref~ str)
}
mmqt_Charset ..> global_SendToUser : uses_via_QString
global_SendToUser ..> UIMessageSink : invokes_callback
MudTelnet ..> global_SendToUser : calls_sendToUser
%% Relationships inside mmqt_Charset
mmqt_Charset : isValidLatin1(char latin1) calls is_latin_control(uint8_t uc)
mmqt_Charset : toQLatin1Char(QChar input) calls simple_unicode_translit(QChar qc)
mmqt_Charset : toQLatin1Char(QChar input) calls isValidLatin1(char latin1)
mmqt_Charset : toLatin1InPlace(QString~ref~ str) calls toQLatin1Char(QChar input)
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey - I've found 2 issues, and left some high level feedback:
- The new
isValidLatin1/is_noncontrol_latin1logic relies oncharbut passes values throughis_latin_control, which works in practice but is subtle with signedchar; consider making these helpers explicitly takeuint8_t(orunsigned char) to avoid any ambiguity around negative values or platform-dependentcharsignedness. - The
sendToUserchange will now emit aqWarning()for every missing newline; if this can be hit frequently (e.g., in loops or from noisy sources), consider downgrading to debug logging or adding some form of rate limiting to avoid log spam.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new `isValidLatin1`/`is_noncontrol_latin1` logic relies on `char` but passes values through `is_latin_control`, which works in practice but is subtle with signed `char`; consider making these helpers explicitly take `uint8_t` (or `unsigned char`) to avoid any ambiguity around negative values or platform-dependent `char` signedness.
- The `sendToUser` change will now emit a `qWarning()` for every missing newline; if this can be hit frequently (e.g., in loops or from noisy sources), consider downgrading to debug logging or adding some form of rate limiting to avoid log spam.
## Individual Comments
### Comment 1
<location> `src/global/Charset.cpp:80-83` </location>
<code_context>
}
template<typename T>
-NODISCARD constexpr bool is_noncontrol_ascii(const T ascii)
+NODISCARD constexpr bool is_noncontrol_latin1(const T byte)
{
// non-control ASCII
- return std::is_same_v<char, T> //
- && (static_cast<uint8_t>(ascii) & 0x7Fu) == static_cast<uint8_t>(ascii) //
- && static_cast<uint8_t>(ascii) >= 0x20u; //
+ return std::is_same_v<char, T> && !is_latin_control(static_cast<uint8_t>(byte));
}
template<typename T>
</code_context>
<issue_to_address>
**issue (bug_risk):** The new is_noncontrol_latin1 no longer enforces the non-control range invariant due to a type mismatch with is_latin_control.
`is_noncontrol_ascii` previously guaranteed `_ascii` was a non-control ASCII char (0x20–0x7F). In the new version, `is_latin_control` is constrained to `T == char16_t`, but here `T` is deduced as `uint8_t`, so `is_latin_control(static_cast<uint8_t>(byte))` is never actually called and effectively always yields false. The condition then collapses to `std::is_same_v<char, T>`, so `static_assert(is_noncontrol_latin1(_ascii));` no longer excludes Latin1 control bytes or out-of-range values. To preserve the invariant, either add an 8-bit overload of `is_latin_control` or restore an explicit range check on the `uint8_t` value.
</issue_to_address>
### Comment 2
<location> `src/global/Charset.cpp:298-302` </location>
<code_context>
return QChar{charset::simple_unicode_translit(static_cast<char16_t>(qc.unicode()))};
}
+NODISCARD static bool isValidLatin1(char latin1)
+{
+ // Eventually we may want to exclude some other ASCII control codes,
+ // but be careful not to exclude things like "\t\r\n".
+ return latin1 != C_NUL && !is_latin_control(latin1);
+}
+
</code_context>
<issue_to_address>
**issue (bug_risk):** isValidLatin1 also relies on is_latin_control with a mismatched type, so Latin1 control characters will be treated as valid.
Here `latin1` is a `char`, so this calls `is_latin_control<char>`, whose implementation always returns `false` because it only handles `char16_t`. As a result, `isValidLatin1` only rejects `C_NUL` and treats all Latin1 control characters (0x80–0x9F) as valid. To actually filter those controls, cast to a 16-bit type (e.g. `char16_t`) before calling `is_latin_control`, or add a dedicated 8-bit overload and use it consistently (including in `is_noncontrol_latin1`).
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| NODISCARD constexpr bool is_noncontrol_latin1(const T byte) | ||
| { | ||
| // non-control ASCII | ||
| return std::is_same_v<char, T> // | ||
| && (static_cast<uint8_t>(ascii) & 0x7Fu) == static_cast<uint8_t>(ascii) // | ||
| && static_cast<uint8_t>(ascii) >= 0x20u; // | ||
| return std::is_same_v<char, T> && !is_latin_control(static_cast<uint8_t>(byte)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): The new is_noncontrol_latin1 no longer enforces the non-control range invariant due to a type mismatch with is_latin_control.
is_noncontrol_ascii previously guaranteed _ascii was a non-control ASCII char (0x20–0x7F). In the new version, is_latin_control is constrained to T == char16_t, but here T is deduced as uint8_t, so is_latin_control(static_cast<uint8_t>(byte)) is never actually called and effectively always yields false. The condition then collapses to std::is_same_v<char, T>, so static_assert(is_noncontrol_latin1(_ascii)); no longer excludes Latin1 control bytes or out-of-range values. To preserve the invariant, either add an 8-bit overload of is_latin_control or restore an explicit range check on the uint8_t value.
| NODISCARD static bool isValidLatin1(char latin1) | ||
| { | ||
| // Eventually we may want to exclude some other ASCII control codes, | ||
| // but be careful not to exclude things like "\t\r\n". | ||
| return latin1 != C_NUL && !is_latin_control(latin1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): isValidLatin1 also relies on is_latin_control with a mismatched type, so Latin1 control characters will be treated as valid.
Here latin1 is a char, so this calls is_latin_control<char>, whose implementation always returns false because it only handles char16_t. As a result, isValidLatin1 only rejects C_NUL and treats all Latin1 control characters (0x80–0x9F) as valid. To actually filter those controls, cast to a 16-bit type (e.g. char16_t) before calling is_latin_control, or add a dedicated 8-bit overload and use it consistently (including in is_noncontrol_latin1).
Summary by Sourcery
Improve Unicode-to-Latin1 handling and newline normalization to prevent crashes and ensure consistent user-facing messages.
New Features:
Bug Fixes:
Enhancements:
Tests: