windows: fix layout-switch freeze (Punto Switcher) via WM_INPUTLANGCHANGE#15
windows: fix layout-switch freeze (Punto Switcher) via WM_INPUTLANGCHANGE#15chilango74 wants to merge 2 commits into
Conversation
The WM_KEYDOWN handler in KeyEventBuilder::process_message() called next_kbd_msg() (PeekMessageW) before acquiring the LAYOUT_CACHE mutex. PeekMessageW can synchronously dispatch pending messages — notably WM_INPUTLANGCHANGE sent by layout-switching tools like PuntoSwitcher. If the dispatched message re-enters the WNDPROC and tries to acquire LAYOUT_CACHE, the non-reentrant std::sync::Mutex deadlocks. The WM_KEYUP handler already had the correct ordering (lock first, drop, then PeekMessageW) with a comment explaining the deadlock risk. This commit applies the same pattern to WM_KEYDOWN. Additionally, handle WM_INPUTLANGCHANGE explicitly to refresh the layout cache and update modifier state, instead of falling through to DefWindowProc which may dispatch further IME messages in an unpredictable re-entrant context. Fixes warpdotdev/warp#8675. Fixes warpdotdev/warp#10050. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
/oz-review |
There was a problem hiding this comment.
Thank you for confirming that this diff fixes the deadlock.
The root cause you've cited doesn't make sense to me. What I understand from what you said:
next_kbd_msg, which callsPeekMessageW, was called before acquiring theLAYOUT_CACHEmutexPeekMessageWcan synchronously dispatch pending messages, e.g.WM_INPUTLANGCHANGEwhich is used by layout-switchers- If the dispatched message re-enters the
WNDPROCand tries to acquireLAYOUT_CACHE, the non-reentrantstd::sync::Mutexdeadlocks on the same thread
However, point (3) is not consistent with (1). Why would a deadlock occur in (3) if PeekMessageW was called before acquiring the LAYOUT_CACHE mutex? PeekMessageW should only deadlock if we already hold the lock.
Can you provide more detail how you tested the fix? How did you run the code and what steps did you use to do once it was running?
| } | ||
| update_modifiers(window, userdata); | ||
| result = ProcResult::Value(1); | ||
| }, |
There was a problem hiding this comment.
Are you sure we aren't breaking something by preventing WM_INPUTLANGCHANGE from falling back to DefWindowProc? The official microsoft docs say:
"You should make any application-specific settings and pass the message to the DefWindowProc function, which passes the message to all first-level child windows"
…PUTLANGCHANGE Revise the fix after a minidump of the frozen process disproved the original explanation. The freeze is NOT a `LAYOUT_CACHE` mutex deadlock: the captured dump shows the main thread blocked in `NtUserMsgWaitForMultipleObjectsEx` (winit's event-loop wait) re-entered through Punto Switcher's injected hook (`pshook64.dll` / `CallNextHookEx`) — no mutex frames anywhere. Accordingly: - Revert the `WM_KEYDOWN` lock reorder in keyboard.rs; it was irrelevant (no mutex is involved, and `next_kbd_msg` already ran before the lock). - Handle `WM_INPUTLANGCHANGE` by refreshing the cached keyboard layout and then deferring to `DefWindowProc` (kept, per the Win32 docs, so the message still propagates to first-level child windows). Isolation testing on Windows with Punto Switcher: refreshing the layout cache in this handler is the minimal change that stops the freeze; `update_modifiers` and swallowing the message are not needed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
@acarl005 you were right on both counts — thanks for the careful review. I've revised the PR. On the root cause. Your objection was correct: with To get the real answer I captured a full minidump of the frozen process (original, unpatched code) with procdump and walked it with exact symbols. The main thread is blocked in: i.e. winit's event-loop wait ( On your inline Changes vs the previous revision:
Honest caveat: I isolated the operative change empirically (the layout-cache refresh; Re-requesting review when you have a moment. |
Summary
Fixes a freeze that affects Warp (and any winit app) on Windows when the keyboard layout is switched by tools like Punto Switcher.
Root cause (from a minidump of the frozen process)
A full dump was taken with procdump on the frozen window (original, unpatched code) and analyzed with
minidump-stackwalkusing symbols generated from the local system DLLs (so x64 unwind/CFI is exact).Reliable top of the main thread:
MsgWaitForMultipleObjectsEx— winit's event-loop wait (wait_for_messages_impl) — re-entered during message dispatch.pshook64.dll(Punto Switcher's global hook,%LOCALAPPDATA%\Yandex\Punto Switcher\pshook64.dll) is injected into the process and present on the stack (CallNextHookEx), i.e. it runs on our thread during dispatch.LAYOUT_CACHE/WaitOnAddressframes anywhere.So this is a re-entrant Win32 message-loop wait mediated by Punto Switcher's hook, not a Rust mutex self-deadlock. (Thanks @acarl005 for pushing back — the original mutex explanation was inconsistent with the code, and the dump confirms it.)
Fix
Handle
WM_INPUTLANGCHANGEto refresh the cached keyboard layout, then defer toDefWindowProc(kept, per the Win32 docs, so the message still propagates to first-level child windows).Testing (Windows 11 + Punto Switcher)
Built the winit test app for
x86_64-pc-windows-gnuand exercised real Punto Switcher layout switches (English ↔ Russian, confirmed by alphabet changes in the key log). Isolation variants:WM_INPUTLANGCHANGEhandlerupdate_modifiers+Value(1)(swallow)DefWindowProc(this PR)DefWindowProc, noupdate_modifiersSo the layout-cache refresh is the minimal change that stops the freeze; swallowing the message and
update_modifiersare not needed.Honest limitation: we isolated the operative change empirically and proved what it is not (a mutex deadlock), but the precise reason a cache refresh prevents the re-entrant wait is not fully traced (Punto Switcher's hook is closed-source). The dump at the freeze point shows no layout-loading frames.
Checklist
changelogentryDefWindowProcretained for child-window propagation