Fix stale CASE exchange blocking new session handshake #16
Open
jonlil wants to merge 2 commits into
Force-pushed: 5043e71 to 30715b1
When a Matter client restarts and sends a new CASESigma1 reusing the same exchange ID as an in-progress handshake, the old exchange is still awaiting CASESigma3. The new Sigma1 gets routed to the stale exchange, producing "Invalid opcode: CASESigma1, expected: CASESigma3" and the handshake fails. The client must then wait through its retry/backoff cycle (typically 30-60s) before succeeding on a subsequent attempt with a different exchange ID.
Fix: in Session::post_recv(), when an existing exchange receives a new session request (CASESigma1 or PBKDFParamRequest), remove the stale exchange via remove_exch() and fall through to create a fresh one. remove_exch() either clears the slot immediately (allowing the new exchange to reuse it) or marks it as Dropped if MRP operations are pending.
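The routing change described in this commit can be sketched roughly as follows. This is a simplified, hypothetical model, not the rs-matter implementation: the `Slot`, `Routed`, and `OpCode` types, the fixed table of four slots, and the `mrp_pending` flag are assumptions made for illustration. Only the names `post_recv`, `remove_exch`, and `is_new_session` come from the PR text, and their real signatures may differ.

```rust
// Hypothetical, simplified model of an exchange table. Not rs-matter's real types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OpCode {
    PbkdfParamRequest,
    CaseSigma1,
    CaseSigma3,
}

impl OpCode {
    /// Stand-in for the idea behind `MessageMeta::is_new_session()`:
    /// these opcodes start a new handshake.
    fn is_new_session(self) -> bool {
        matches!(self, OpCode::CaseSigma1 | OpCode::PbkdfParamRequest)
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Slot {
    /// A live exchange; `mrp_pending` models outstanding MRP work.
    Active { exch_id: u16, mrp_pending: bool },
    /// An evicted exchange that still has MRP work to flush.
    Dropped { exch_id: u16 },
}

#[derive(Debug, PartialEq, Eq)]
enum Routed {
    /// Message delivered to the exchange already in this slot.
    Existing(usize),
    /// A fresh exchange was created in this slot.
    New(usize),
}

struct Session {
    slots: [Option<Slot>; 4],
}

impl Session {
    fn new() -> Self {
        Self { slots: [None; 4] }
    }

    /// Finds a live (not Dropped) exchange with the given ID.
    fn find_active(&self, exch_id: u16) -> Option<usize> {
        self.slots.iter().position(
            |s| matches!(s, Some(Slot::Active { exch_id: id, .. }) if *id == exch_id),
        )
    }

    /// Clears the slot, or parks it as Dropped when MRP work is pending.
    fn remove_exch(&mut self, idx: usize) {
        let current = self.slots[idx];
        self.slots[idx] = match current {
            Some(Slot::Active { exch_id, mrp_pending: true }) => Some(Slot::Dropped { exch_id }),
            _ => None,
        };
    }

    /// Routes an incoming message; returns None if no slot is free.
    fn post_recv(&mut self, exch_id: u16, opcode: OpCode) -> Option<Routed> {
        if let Some(idx) = self.find_active(exch_id) {
            if !opcode.is_new_session() {
                return Some(Routed::Existing(idx));
            }
            // A new session request on an existing exchange means the old
            // handshake is stale: evict it and fall through to create a new one.
            self.remove_exch(idx);
        }
        let free = self.slots.iter().position(|s| s.is_none())?;
        self.slots[free] = Some(Slot::Active { exch_id, mrp_pending: false });
        Some(Routed::New(free))
    }
}
```

In this sketch, a repeated `CaseSigma1` on the same exchange ID yields `Routed::New` rather than being delivered to the exchange that was waiting for `CaseSigma3`. If the stale slot had MRP work pending, it is parked as `Dropped` and the new exchange takes a different free slot.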
Force-pushed: 30715b1 to 30dd689
When a stale exchange is evicted by post_recv (new CASESigma1 on an existing exchange), the slot is set to None. The old Exchange object still holds a reference to that slot index. When it is eventually dropped, Drop calls remove_exch on the already-cleared slot, causing an unwrap panic. Handle the None case gracefully by returning true (already cleaned up) instead of panicking.
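A minimal sketch of the panic and its fix, under stated assumptions: the `Table`, `Slot`, and `Exchange` types below and the use of `Rc<RefCell<...>>` are hypothetical stand-ins chosen to keep the example self-contained, and do not reflect how rs-matter shares its exchange table. Only the behavior is taken from the commit message: `remove_exch` is called from `Drop` and must return `true` when the slot is already `None`.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical slot state. Not rs-matter's real types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Slot {
    Active { mrp_pending: bool },
    Dropped,
}

struct Table {
    slots: Vec<Option<Slot>>,
}

impl Table {
    /// Returns `true` if the slot is empty afterwards,
    /// `false` if it stays parked as Dropped.
    fn remove_exch(&mut self, idx: usize) -> bool {
        // An `unwrap()` here would panic once the slot has been evicted.
        // Treat an empty slot as "already cleaned up" instead.
        let Some(slot) = self.slots[idx] else {
            return true;
        };
        match slot {
            Slot::Active { mrp_pending: false } => {
                self.slots[idx] = None;
                true
            }
            Slot::Active { mrp_pending: true } => {
                self.slots[idx] = Some(Slot::Dropped);
                false
            }
            Slot::Dropped => false,
        }
    }
}

/// Handle that remembers its slot index and cleans up on drop.
struct Exchange {
    table: Rc<RefCell<Table>>,
    idx: usize,
}

impl Drop for Exchange {
    fn drop(&mut self) {
        // Safe even if the slot was evicted while this handle was still alive.
        self.table.borrow_mut().remove_exch(self.idx);
    }
}
```

In this model, evicting slot 0 through `remove_exch` and later dropping the `Exchange` that still points at index 0 does not panic, because the second call takes the early-return branch.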
Problem
When a Matter client restarts and sends a new CASESigma1 reusing the same exchange ID as an in-progress handshake, the old exchange is still awaiting CASESigma3. The new Sigma1 gets routed to the stale exchange, producing:
    Invalid opcode: CASESigma1, expected: CASESigma3
The handshake fails and the client must wait through its retry/backoff cycle (typically 30-60s) before succeeding with a different exchange ID.
This is easily reproducible by restarting a Matter controller that connects to an rs-matter server — the first reconnect attempt always fails.
Fix
In Session::post_recv(), when an existing exchange receives a new session request (CASESigma1 or PBKDFParamRequest), mark the stale exchange as Dropped and fall through to create a fresh one. The new handshake proceeds immediately from scratch.
Uses the existing MessageMeta::is_new_session() check and Role::set_dropped_state(); no new types or APIs are needed.
Testing
Tested with a Zigbee-to-Matter bridge (rs-matter server) and a Matter controller client (also rs-matter based). Before the fix, every server restart caused a 30-60s reconnect delay. After the fix, the client reconnects on its first attempt.