windows: fix layout-switch freeze (Punto Switcher) via WM_INPUTLANGCHANGE #15

Open
chilango74 wants to merge 2 commits into warpdotdev:master from chilango74:fix/windows-ime-layout-switch-deadlock

@chilango74 chilango74 commented May 26, 2026

Summary

Fixes a freeze that affects Warp (and any winit app) on Windows when the keyboard layout is switched by tools like Punto Switcher.

Note: this PR was revised after capturing a minidump of the frozen process. The original explanation (a LAYOUT_CACHE mutex deadlock) was wrong — see "Root cause" below. The keyboard.rs change from the first revision has been dropped.

Root cause (from a minidump of the frozen process)

A full dump was taken with procdump on the frozen window (original, unpatched code) and analyzed with minidump-stackwalk using symbols generated from the local system DLLs (so x64 unwind/CFI is exact).

Reliable top of the main thread:

0  win32u.dll!NtUserMsgWaitForMultipleObjectsEx        (instruction pointer)
1  winit::...::event_loop::wait_for_messages_impl       (Found by: call frame info)
2  user32.dll!CallNextHookEx
3  user32.dll!...
  • The thread is blocked in MsgWaitForMultipleObjectsEx — winit's event-loop wait (wait_for_messages_impl) — re-entered during message dispatch.
  • pshook64.dll (Punto Switcher's global hook, %LOCALAPPDATA%\Yandex\Punto Switcher\pshook64.dll) is injected into the process and present on the stack (CallNextHookEx), i.e. it runs on our thread during dispatch.
  • No mutex / LAYOUT_CACHE / WaitOnAddress frames anywhere.

So this is a re-entrant Win32 message-loop wait mediated by Punto Switcher's hook, not a Rust mutex self-deadlock. (Thanks @acarl005 for pushing back — the original mutex explanation was inconsistent with the code, and the dump confirms it.)
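
Re-entrancy of this kind can also be made observable without a dump by keeping a per-thread depth counter around message dispatch. The following is a minimal sketch, not winit code: `DISPATCH_DEPTH`, `DispatchGuard`, `dispatch`, and `wait_is_reentrant` are hypothetical names, and the closure stands in for the window procedure plus any hooks that run on the thread during dispatch.

```rust
use std::cell::Cell;

thread_local! {
    // Number of dispatch calls currently active on this thread.
    static DISPATCH_DEPTH: Cell<u32> = Cell::new(0);
}

/// RAII guard: increments the depth on creation, decrements it on drop.
struct DispatchGuard;

impl DispatchGuard {
    fn enter() -> Self {
        DISPATCH_DEPTH.with(|d| d.set(d.get() + 1));
        DispatchGuard
    }
}

impl Drop for DispatchGuard {
    fn drop(&mut self) {
        DISPATCH_DEPTH.with(|d| d.set(d.get() - 1));
    }
}

/// Stand-in for the event-loop wait. Returns true if it was entered
/// while a dispatch was still in progress on this thread.
fn wait_is_reentrant() -> bool {
    DISPATCH_DEPTH.with(|d| d.get() > 0)
}

/// Stand-in for dispatching one message. `handler` plays the role of
/// the window procedure and anything it calls, hooks included.
fn dispatch<R>(handler: impl FnOnce() -> R) -> R {
    let _guard = DispatchGuard::enter();
    handler()
}
```

In a real event loop, a check like `wait_is_reentrant()` at the top of the wait function could log or assert when the wait is reached from inside a dispatch, which is the situation the stack above shows.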

Fix

Handle WM_INPUTLANGCHANGE to refresh the cached keyboard layout, then defer to DefWindowProc (kept, per the Win32 docs, so the message still propagates to first-level child windows).

WM_INPUTLANGCHANGE => {
    {
        // Refresh the cached layout; the guard drops at the end of this block.
        let mut layouts = LAYOUT_CACHE.lock().unwrap();
        layouts.get_current_layout();
    }
    result = ProcResult::DefWindowProc(wparam);
}
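
The shape of the handler (lock in a narrow scope, refresh, release, then defer) can be tried in isolation. This is a self-contained sketch under stated assumptions: `LayoutCache`, its `refreshes` counter, the `ProcResult` enum, and `on_input_lang_change` are hypothetical stand-ins that only mirror the names in the snippet above, not winit's actual definitions.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the layout cache.
struct LayoutCache {
    refreshes: u32,
}

impl LayoutCache {
    /// Stand-in for the real refresh; here it only counts calls.
    fn get_current_layout(&mut self) {
        self.refreshes += 1;
    }
}

static LAYOUT_CACHE: Mutex<LayoutCache> = Mutex::new(LayoutCache { refreshes: 0 });

/// Hypothetical mirror of the handler's result type.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum ProcResult {
    DefWindowProc(usize),
    Value(isize),
}

fn on_input_lang_change(wparam: usize) -> ProcResult {
    {
        // The guard lives only inside this block.
        let mut layouts = LAYOUT_CACHE.lock().unwrap();
        layouts.get_current_layout();
    } // Guard dropped here, so the lock is free before the message is deferred.
    ProcResult::DefWindowProc(wparam)
}
```

Because the guard is dropped before the function returns, nothing that runs later for the same message can find `LAYOUT_CACHE` still held by this handler.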

Testing (Windows 11 + Punto Switcher)

Built the winit test app for x86_64-pc-windows-gnu and exercised real Punto Switcher layout switches (English ↔ Russian, confirmed by alphabet changes in the key log). Isolation variants:

| Variant | WM_INPUTLANGCHANGE handler | Result |
| --- | --- | --- |
| original | (no handler) | freezes (dump captured) |
| keyboard.rs reorder only | | still freezes (dropped) |
| refresh + update_modifiers + Value(1) (swallow) | skips DefWindowProc | no freeze |
| refresh + DefWindowProc (this PR) | keeps DefWindowProc | no freeze |
| refresh only + DefWindowProc, no update_modifiers | keeps DefWindowProc | no freeze |

So the layout-cache refresh is the minimal change that stops the freeze; swallowing the message and update_modifiers are not needed.

Honest limitation: we isolated the operative change empirically and proved what it is not (a mutex deadlock), but the precise reason a cache refresh prevents the re-entrant wait is not fully traced (Punto Switcher's hook is closed-source). The dump at the freeze point shows no layout-loading frames.

Checklist

  • Tested on Windows (the only platform changed)
  • Added a changelog entry
  • DefWindowProc retained for child-window propagation

The WM_KEYDOWN handler in KeyEventBuilder::process_message() called
next_kbd_msg() (PeekMessageW) before acquiring the LAYOUT_CACHE mutex.
PeekMessageW can synchronously dispatch pending messages — notably
WM_INPUTLANGCHANGE sent by layout-switching tools like PuntoSwitcher.
If the dispatched message re-enters the WNDPROC and tries to acquire
LAYOUT_CACHE, the non-reentrant std::sync::Mutex deadlocks.

The WM_KEYUP handler already had the correct ordering (lock first,
drop, then PeekMessageW) with a comment explaining the deadlock risk.
This commit applies the same pattern to WM_KEYDOWN.

Additionally, handle WM_INPUTLANGCHANGE explicitly to refresh the
layout cache and update modifier state, instead of falling through to
DefWindowProc which may dispatch further IME messages in an
unpredictable re-entrant context.

Fixes warpdotdev/warp#8675.
Fixes warpdotdev/warp#10050.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@acarl005
Collaborator

/oz-review

@acarl005 acarl005 left a comment

need to redo

@acarl005 acarl005 left a comment

Thank you for confirming that this diff fixes the deadlock.

The root cause you've cited doesn't make sense to me. What I understand from what you said:

  1. next_kbd_msg, which calls PeekMessageW, was called before acquiring the LAYOUT_CACHE mutex
  2. PeekMessageW can synchronously dispatch pending messages, e.g. WM_INPUTLANGCHANGE which is used by layout-switchers
  3. If the dispatched message re-enters the WNDPROC and tries to acquire LAYOUT_CACHE, the non-reentrant std::sync::Mutex deadlocks on the same thread

However, point (3) is not consistent with (1). Why would a deadlock occur in (3) if PeekMessageW was called before acquiring the LAYOUT_CACHE mutex? PeekMessageW should only deadlock if we already hold the lock.
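
The ordering argument can be checked with a small model. This is a sketch under assumptions, not the winit code path: `CACHE` stands in for `LAYOUT_CACHE`, `peek` stands in for `PeekMessageW` synchronously dispatching a message, and the re-entrant handler uses `try_lock` so the example reports contention instead of hanging.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for LAYOUT_CACHE.
static CACHE: Mutex<u32> = Mutex::new(0);

/// Stand-in for a re-entrant window-procedure call that needs the cache.
/// Returns true if the lock was free at the moment of re-entry.
fn reentrant_handler_can_lock() -> bool {
    CACHE.try_lock().is_ok()
}

/// Stand-in for a peek that synchronously dispatches one message.
fn peek(dispatch: impl FnOnce() -> bool) -> bool {
    dispatch()
}

/// Ordering from point (1): peek first, take the lock afterwards.
fn peek_then_lock() -> bool {
    let ok = peek(reentrant_handler_can_lock);
    let _guard = CACHE.lock().unwrap();
    ok
}

/// The ordering that would be dangerous: lock held across the peek.
fn lock_then_peek() -> bool {
    let _guard = CACHE.lock().unwrap();
    peek(reentrant_handler_can_lock)
}
```

In this model `peek_then_lock()` reports that the re-entrant handler could take the lock, while `lock_then_peek()` reports that it could not; only the second ordering would block if the handler used a plain `lock()`.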

Can you provide more detail on how you tested the fix? How did you run the code, and what steps did you take once it was running?

}
update_modifiers(window, userdata);
result = ProcResult::Value(1);
},

@acarl005 acarl005 May 27, 2026

Are you sure we aren't breaking something by preventing WM_INPUTLANGCHANGE from falling back to DefWindowProc? The official Microsoft docs say:

"You should make any application-specific settings and pass the message to the DefWindowProc function, which passes the message to all first-level child windows"

…PUTLANGCHANGE

Revise the fix after a minidump of the frozen process disproved the original
explanation. The freeze is NOT a `LAYOUT_CACHE` mutex deadlock: the captured
dump shows the main thread blocked in `NtUserMsgWaitForMultipleObjectsEx`
(winit's event-loop wait) re-entered through Punto Switcher's injected hook
(`pshook64.dll` / `CallNextHookEx`) — no mutex frames anywhere.

Accordingly:
- Revert the `WM_KEYDOWN` lock reorder in keyboard.rs; it was irrelevant (no
  mutex is involved, and `next_kbd_msg` already ran before the lock).
- Handle `WM_INPUTLANGCHANGE` by refreshing the cached keyboard layout and then
  deferring to `DefWindowProc` (kept, per the Win32 docs, so the message still
  propagates to first-level child windows).

Isolation testing on Windows with Punto Switcher: refreshing the layout cache
in this handler is the minimal change that stops the freeze; `update_modifiers`
and swallowing the message are not needed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@chilango74 chilango74 changed the title from "windows: fix deadlock on keyboard layout switch (PuntoSwitcher, IME)" to "windows: fix layout-switch freeze (Punto Switcher) via WM_INPUTLANGCHANGE" on May 28, 2026
@chilango74
Author

@acarl005 you were right on both counts — thanks for the careful review. I've revised the PR.

On the root cause. Your objection was correct: with next_kbd_msg (PeekMessageW) already called before the LAYOUT_CACHE lock, the mutex-deadlock story doesn't hold. I audited every LAYOUT_CACHE holder and none holds the lock across a message pump, so a self-deadlock there is impossible.

To get the real answer I captured a full minidump of the frozen process (original, unpatched code) with procdump and walked it with exact symbols. The main thread is blocked in:

0  win32u.dll!NtUserMsgWaitForMultipleObjectsEx     (instruction pointer)
1  winit::...::event_loop::wait_for_messages_impl    (call frame info)
2  user32.dll!CallNextHookEx

i.e. winit's event-loop wait (MsgWaitForMultipleObjectsEx) re-entered through Punto Switcher's injected hook (pshook64.dll, via CallNextHookEx). There are no mutex/LAYOUT_CACHE/WaitOnAddress frames anywhere. So it's a re-entrant Win32 message-loop wait mediated by the layout-switcher's hook — not a Rust mutex deadlock.

On your inline DefWindowProc comment. Also fair — the revised handler keeps DefWindowProc (returns ProcResult::DefWindowProc(wparam)), so the message still propagates to first-level child windows per the docs. I verified on Windows that keeping DefWindowProc still fixes the freeze (the earlier Value(1) that swallowed the message is gone).
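
The difference between swallowing the message and deferring it can be sketched as follows. This is an illustration under assumptions: the `ProcResult` enum and the `resolve` function are hypothetical mirrors of the names used in this thread, and the `def_window_proc` closure stands in for the Win32 default procedure.

```rust
/// Hypothetical mirror of the handler's result type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ProcResult {
    /// Forward the message to the default procedure.
    DefWindowProc(usize),
    /// Return this value directly; the default procedure never runs.
    Value(isize),
}

/// Resolves a handler result into the value returned for the message.
/// `def_window_proc` is a stand-in for the default window procedure.
fn resolve(result: ProcResult, def_window_proc: impl FnOnce(usize) -> isize) -> isize {
    match result {
        ProcResult::DefWindowProc(wparam) => def_window_proc(wparam),
        ProcResult::Value(v) => v,
    }
}
```

With `ProcResult::Value(1)` the closure is never called, which models the earlier revision that swallowed the message; with `ProcResult::DefWindowProc(wparam)` the closure runs, which models the revised handler.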

Changes vs the previous revision:

  • Dropped the keyboard.rs lock reorder entirely (proven irrelevant).
  • WM_INPUTLANGCHANGE now just refreshes the cached layout and defers to DefWindowProc.

Honest caveat: I isolated the operative change empirically (the layout-cache refresh; update_modifiers and swallowing the message are both unnecessary) and proved what it is not (a mutex deadlock), but I can't fully explain why refreshing the cache prevents the re-entrant wait — Punto Switcher's hook is closed-source and the dump shows no layout-loading frames at the freeze point. Happy to dig further if you have a theory.

Re-requesting review when you have a moment.
