diff --git a/specs/GH537/product.md b/specs/GH537/product.md new file mode 100644 index 0000000000..7bf9aa85c2 --- /dev/null +++ b/specs/GH537/product.md @@ -0,0 +1,755 @@ +# PRODUCT.md — Honor user-defined shell bindkeys in Warp's input editor + +Issue: https://github.com/warpdotdev/warp/issues/537 + +Figma: none provided. + +## Summary + +Warp's input editor currently ignores user-defined keybindings declared in the +user's shell — `bindkey` in zsh, `bind` / `~/.inputrc` in bash readline, and +`bind` in fish. When a user types in a Warp prompt, those customizations have +no effect, even though the same keys work in any other terminal running the +same shell. This spec covers honoring those user bindings inside the Warp +input editor (the prompt where shell commands are typed) for zsh, bash, and +fish, sourced from the user's actual running shell so that whatever the shell +reports is what Warp respects. + +## Goals / Non-goals + +In scope: + +- Honoring user-defined keybindings in Warp's shell command input editor for + zsh, bash, and fish sessions. +- Discovery via the user's live shell session (querying the shell for its + current binding table), so dynamic and conditionally-declared bindings are + picked up — not by parsing rc files. +- Best-effort coverage of the action set: any shell widget / readline function + / fish input function that has a Warp-input equivalent is honored. Widgets + with no clean Warp equivalent degrade gracefully (see below) rather than + silently stealing the keystroke. +- Keymap modes: emacs vs vi (insert/command/visual) for the shells that + expose them. Mode switches initiated by the user (e.g. `bindkey -v`, + `set -o vi`, vi-mode plugins, fish bind modes) take effect without restart. +- Conflict policy with Warp's own keybindings (see Behavior #14). 
+ +Out of scope for this spec: + +- Bindings inside surfaces other than the shell command input editor — the + AI prompt input, command palette, search, settings — keep their existing + Warp keybindings unchanged. (See Open question at #5.) +- Other shells (PowerShell, nushell, xonsh, csh family). Adding more shells + follows the same shape but is not required for this issue to land. +- A Warp-native keybinding-import config surface where users redeclare their + bindings inside Warp settings. The intent of this issue is "the bindings I + already have should just work" — not "give me yet another config". +- Static parsing of `~/.zshrc`, `~/.inputrc`, `~/.config/fish/`, etc. The + source of truth is the live shell. +- Honoring shell-level abbreviations, aliases, completions, syntax + highlighting, or autosuggestion plugins. Only key-to-action bindings. + +## Behavior + +### Motivating cases + +Real-world bindkey users in 2025 fall into two overlapping groups, and +the spec must serve both. The driving examples: + +**Group 1: single-keystroke external widgets (TUI takeovers).** + +- **atuin** binds `Ctrl-R` (history search) and Up arrow to its own + zsh widget / bash `bind -x` command / fish function. Pressing the + bound key opens atuin's TUI, the user fuzzy-searches their history, + selects a command, atuin writes the result to the shell's + `$BUFFER` / `$READLINE_LINE` / fish `commandline` and exits. The + user is then back at the prompt with that command in the editor. +- **fzf** binds `Ctrl-R` (history fuzzy-find), `Ctrl-T` (file + fuzzy-find), and `Alt-C` (directory fuzzy-cd) to similar shell + widgets that invoke the `fzf` binary as a TUI. +- **Editor-launching macros** like the canonical + `bindkey '^X^E' edit-command-line` (open `$EDITOR` to edit the + current command). 
+ +**Group 2: continuous inline-rendering plugins** (the line editor +itself is customized — these don't fire on a single keystroke; they +hook every keystroke to paint, suggest, highlight, or expand inline). + +- **zsh-autosuggestions** wraps `self-insert` and other widgets to + paint a dimmed history-suggestion inline as the user types. Right + arrow / End / Ctrl-E accepts the suggestion via a wrapper widget. +- **zsh-syntax-highlighting** (and **fast-syntax-highlighting**) + hooks widgets to repaint the prompt line with syntax colors as the + user types. +- **fish abbreviations** (`abbr`) expand on space / enter — this is + fish's first-class feature, not a plugin, and many users rely on + it heavily. +- **zsh-vi-mode** (jeffreytse/zsh-vi-mode) rebinds large parts of the + keymap, swaps cursor shapes per mode, and adds surround/text-object + operators. + +The spec must honor both groups as primary v1 use cases; "v1 ships +without atuin/fzf" is not acceptable, and "v1 ships but +zsh-autosuggestions silently no longer works" is also not acceptable. +Group 1 is handled by external widget pass-through (#11.5). Group 2 +needs a separate mechanism — the shell's line editor needs to be the +authority for the current prompt's display so its plugins can paint +inline. See #11.6. + +### Discovery and lifecycle + +1. When a Warp tab starts a supported shell (zsh, bash, fish), Warp queries + that shell for its current keybinding table once the shell is ready to + accept commands but before the first user keystroke is processed by the + input editor. Until the table arrives, the input editor uses Warp's + default keymap; once the table arrives, user bindings take effect on the + next keystroke. + +2. The query mechanism is shell-native and visible only to Warp internals — + the user does not see the query command echoed in their scrollback, in + history, or in any block. 
Equivalents in spirit (not literal): + - zsh: `bindkey -L` for each keymap (`main`, `emacs`, `viins`, `vicmd`, + `vivis`, `viopp`, `command`, `isearch`, `menuselect`, plus any + user-defined keymaps). + - bash: `bind -p` for the current keymap and `bind -p -m emacs` / + `-m vi-insert` / `-m vi-command` for the others. + - fish: `bind` with no args, plus `bind -M insert` / `default` / + `visual` / etc. + +3. If the shell fails to start, exits before the query completes, or returns + an unparseable response, Warp logs a diagnostic and falls back to its + default keymap for that tab. The tab remains usable; no user-facing error + toast is required. + +4. When the user changes their bindings inside an existing session + (`bindkey '^X^E' edit-command-line`, `bind '"\C-x\C-e": edit-and-execute-command'`, + sourcing a new rc file mid-session, switching emacs/vi mode), Warp picks + up the change without requiring a restart of the tab. Discovery is + driven shell-side at every `precmd`, so the change is detected when + the prompt next redraws. The user-visible invariant: a binding + declared at the shell prompt is honored starting with the first + keystroke after Warp has parsed the next `ShellBindings` payload + from that prompt. Keystrokes pressed during the small async window + between the prompt firing and the payload being parsed use the + previous keymap (consistent with the non-blocking guarantee in #26); + declarations never block typing. + +5. Each tab tracks its own bindings independently. Changing bindings in one + tab does not affect another tab, even if both run the same shell. + +6. Closing and reopening a tab re-queries from scratch. Warp does not cache + bindings across tab restarts; the user's current shell state is always + the source of truth. + +### Honoring bindings in the input editor + +7. 
While the user is typing in the shell command input editor, every key + press is resolved against the precedence ladder defined in #14 + (reserved infrastructure keys → user-customized Warp keybindings → + user shell bindings for the active keymap → Warp's default + keybindings → default character insertion). Shell bindings are + consulted only after the two higher tiers have been checked and have + not produced a match. When the matched action is a shell binding and + the bound widget has a Warp equivalent, Warp performs that action + and consumes the keystroke. When the bound widget is unsupported + (#11), the keystroke continues down the ladder to Warp's defaults. + +8. Multi-key sequences (`^X^E`, `^[f`, `gg`, fish `\\cx\\ce`) are honored + as a single action. Resolution rules: + + - **Mid-sequence buffering.** While Warp has received one or more + keys that match a prefix of a longer binding but not yet a complete + binding, no action fires and no character is inserted; Warp waits + for the next key. + - **Ambiguous bindings (prefix is also a complete binding).** When + a key sequence matches both a complete binding and a prefix of a + longer binding (the canonical example: `^[` is `vi-cmd-mode` *and* + a prefix of `^[f`), Warp uses a 500 ms ambiguity timeout. If + another key arrives within 500 ms that extends the prefix, the + longer match wins. If no key arrives within the timeout, the + complete short binding fires. + - **Pure-prefix timeout.** When a sequence matches a prefix but no + complete binding (e.g. partial `^X` of `^X^E`), pending keys are + held without timeout — readline / ZLE both wait indefinitely on + pure prefixes. The user pressing any non-extending key abandons + the prefix immediately (next rule). + - **Abandonment.** When a non-matching key arrives mid-sequence, + the prefix is abandoned: Warp replays the buffered keys plus the + just-received key through normal handling, in arrival order. 
Any + of those replayed keys may itself trigger a single-key binding; + none of them re-enter prefix accumulation until the replay + finishes. This matches readline / ZLE behavior. + - **No keystroke is ever silently dropped.** Either a binding fires, + or the buffered keys are replayed. + - **Focus loss / window blur** mid-sequence abandons the prefix + (replay path); on refocus, accumulation starts fresh. + + The 500 ms ambiguity timeout is the standard readline default; it + may be made configurable in a follow-up but is fixed for v1. + +9. "Insert literal string" bindings (e.g. zsh `bindkey -s '^X' 'echo hi\n'`, + readline `"\C-x": "echo hi\n"`) inject the bound text into the input + stream as if the user had typed each character — matching shell + input-queue semantics, not literal text insertion. A newline in the + bound string therefore submits the line (it triggers `accept-line`), + `^A` moves the cursor to the start, and so on. The injected + characters are processed through the same key-resolution chain as + real keystrokes, including any other bindings they happen to trigger. + Plain `self-insert` (no string macro) inserts the literal key + character at the cursor as expected. + +10. Bindings fall into three categories that Warp handles differently; + the user does not need to think about which is which, but the spec + must be precise about each. + + **Category A — built-in widgets** (the bound action is a well-known + ZLE / readline / fish input function — `backward-kill-word`, + `kill-line`, `up-history`, etc.). Warp translates these to its own + `InputAction` and executes them natively in its block-mode editor. + Fast, no shell roundtrip. + + **Category B — string macros** (`bindkey -s` / readline string + bindings). Handled per #9: injected back through the input + pipeline so newlines submit and control characters trigger their + actions. 
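
The multi-key resolution rules in #8 (mid-sequence buffering, the ambiguity timeout, pure-prefix holding, abandonment replay, and focus-loss replay) can be sketched as a small state machine. This is a minimal illustration, not Warp's implementation: `SequenceMatcher`, its method names, and the `("action", …)` / `("key", …)` event tuples are all hypothetical, and the sketch assumes the caller owns the 500 ms timer and calls `on_timeout` when it expires.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceMatcher:
    """Illustrative only: class and method names are hypothetical, not Warp's API."""

    bindings: dict  # maps a tuple of key names to an action name
    pending: list = field(default_factory=list)  # keys held mid-sequence

    def _is_prefix(self, seq):
        # True when some longer binding starts with seq.
        return any(len(k) > len(seq) and k[: len(seq)] == seq for k in self.bindings)

    def _single(self, key):
        # Replay path: single-key bindings may fire, but no prefix accumulation.
        action = self.bindings.get((key,))
        return ("action", action) if action is not None else ("key", key)

    def _replay(self):
        keys, self.pending = self.pending, []
        return [self._single(k) for k in keys]

    def feed(self, key):
        """Handle one key; returns ('action', name) and ('key', k) events."""
        seq = tuple(self.pending) + (key,)
        if self._is_prefix(seq):
            # Pure or ambiguous prefix: hold the keys. The caller is assumed to
            # arm the 500 ms timer only when seq is also a complete binding.
            self.pending = list(seq)
            return []
        if seq in self.bindings:
            self.pending = []
            return [("action", self.bindings[seq])]
        # Abandonment: replay buffered keys plus this one, in arrival order.
        self.pending = list(seq)
        return self._replay()

    def on_timeout(self):
        """Ambiguity timer expired: fire the short binding if there is one."""
        seq = tuple(self.pending)
        if seq in self.bindings:
            self.pending = []
            return [("action", self.bindings[seq])]
        return []  # pure prefix: keep waiting indefinitely

    def on_blur(self):
        """Focus loss abandons the prefix through the replay path."""
        return self._replay()
```

Every path either returns an action or replays the held keys, which is how the sketch keeps the "no keystroke is ever silently dropped" invariant. A `("key", k)` event stands for "continue down the precedence ladder for this key".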
+ + **Category C — external shell-function widgets** (the bound + action is a user-defined zsh widget declared via `zle -N`, a bash + `bind -x` shell command, or a fish function — including atuin's + `atuin-search`, fzf's `fzf-history-widget`, custom user widgets, + plugin-provided widgets, and `edit-command-line`). These are + honored via pass-through: see #11.5 for the user experience. + + The full set of Category A widgets Warp must honor when bound + includes, at minimum: + + - Cursor motion: `forward-char`, `backward-char`, `forward-word`, + `backward-word`, `beginning-of-line`, `end-of-line`, + `beginning-of-buffer-or-history`, `end-of-buffer-or-history`. + - Deletion: `backward-delete-char`, `delete-char`, `backward-kill-word`, + `kill-word`, `kill-line`, `backward-kill-line`, `kill-whole-line`, + `unix-word-rubout`, `unix-line-discard`. + - Yank / kill ring: `yank`, `yank-pop`, `kill-region`, `copy-region-as-kill`. + - History: `up-line-or-history`, `down-line-or-history`, `up-history`, + `down-history`, `history-incremental-search-backward`, + `history-incremental-search-forward`, + `history-search-backward` / `-forward`, fish's history-pager bindings. + - Editing: `transpose-chars`, `transpose-words`, `upcase-word`, + `downcase-word`, `capitalize-word`, `quoted-insert`, `tab-insert`, + `overwrite-mode`, `undo`, `redo` (where supported). + - Submission and abort: `accept-line`, `accept-and-hold`, + `accept-and-infer-next-history`, `accept-search`, `send-break` + (`^C`), `eof` / `delete-char-or-list` (`^D` on empty line). + - Vi mode: `vi-cmd-mode`, `vi-insert`, `vi-replace`, `vi-add-next`, + `vi-add-eol`, `vi-change`, `vi-delete`, `vi-yank`, `vi-put-after`, + `vi-put-before`, `vi-find-next-char` / `-prev-char`, `vi-repeat-find`, + `vi-up-line-or-history`, `vi-down-line-or-history`, `vi-goto-mark`, + `vi-set-mark`, `vi-replace-chars`, `vi-substitute`, + `vi-change-whole-line`, plus fish-mode equivalents. 
+ - Completion: `complete-word`, `expand-or-complete`, + `expand-or-complete-prefix`, `menu-complete`, `reverse-menu-complete`. + These trigger Warp's existing completion UI (not the shell's), but + from whichever key the user has bound them to. Behavior of the + completion UI itself is unchanged by this spec. + +11. Category A widgets that have no clean Warp equivalent (`redisplay`, + `quoted-insert` in edge cases, etc.) are handled as follows: + + - If the widget has a documented behavior Warp can replicate + cheaply, Warp replicates it. + - Otherwise the keystroke falls through to Warp's default handling + for that key, and Warp emits a one-time-per-session diagnostic + noting the unsupported widget. The diagnostic uses the same + redaction policy as telemetry: widget name verbatim only when in + the documented shell-vocabulary allowlist; user-defined or + otherwise unknown names are written as `user-defined`. Key + sequences and binding bodies are never included. + - This rule does not apply to Category C (external shell-function + widgets) — those go through pass-through, never the + "unsupported" path. + +11.5. **External widget pass-through (Category C).** When a key is + bound to an external shell-function widget, pressing that key: + + - Briefly hands input control to the shell. Warp's block-mode + input editor yields; the shell's line editor (ZLE / readline / + fish-line-editor) takes over the prompt with the user's + currently-typed buffer pre-populated as `$BUFFER` / + `$READLINE_LINE` / `commandline`. The cursor starts at the + same position the user had in Warp's editor *on shells where + the v1 sync mechanism preserves cursor position* — namely + zsh and bash with blesh detected (`full`-mode shells, where + every keystroke has already reached the shell). 
On + `batched`-mode shells (vanilla bash, fish in v1) the cursor + lands at end-of-buffer because the v1 literal-paste sync + lacks a keymap-independent way to position the cursor (see + TECH §6.2.5); the v1 motivating widgets (atuin, fzf, + `edit-command-line`) operate on buffer content and are not + sensitive to cursor position, so this gap is acceptable for + v1. Cursor-sensitive widgets that *do* care (custom + buffer-transform widgets that delete from the cursor, etc.) + will see the cursor at end-of-buffer on these shells; this + is a tracked v1 limitation, not a permanent design choice. + Lifting it is a follow-up gated on a keymap-independent + cursor-set primitive. + - The widget runs natively. If it draws a TUI (atuin, fzf, + `edit-command-line` opening `$EDITOR`, etc.), the alt-screen + handling Warp already uses for `vim` / `less` / `htop` applies — + the widget gets full terminal control until it exits. + - When the widget exits, the new buffer state (whatever the + widget wrote into `$BUFFER` / `$READLINE_LINE` / `commandline`) + is synced back into Warp's input editor and the user is + returned to block-mode editing. The cursor position the widget + left behind is preserved. + - If the widget calls `accept-line` (i.e. submits the command + itself, as some atuin configurations do), Warp treats the + submission the same as if the user had pressed Enter in + block mode — the resulting command is run as a Warp block. + - The widget's stderr / stdout (anything it writes outside its + alt-screen) renders as terminal output, like any other + command. It does not appear in Warp's input editor. + - Cancellation: if the widget exits without writing to the + buffer (the user presses Esc inside atuin), Warp's editor + content is restored to whatever it was before the binding + fired. The user did not lose their in-progress typing. 
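
The hand-back rules in #11.5 (sync the widget's buffer, treat `accept-line` as a submission, restore the pre-invocation buffer on cancellation or error) reduce to one decision once the widget exits. The sketch below assumes a hypothetical `WidgetResult` record reported by the shell side; the type names, fields, and `finish_pass_through` function are illustrative and do not come from the spec or from Warp's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EditorState:
    buffer: str
    cursor: int


@dataclass
class WidgetResult:
    """Hypothetical summary of what the widget left behind."""

    buffer: Optional[str]         # None: the widget exited without writing
    cursor: Optional[int] = None  # None: the shell reported no cursor position
    accepted: bool = False        # the widget called accept-line itself
    failed: bool = False          # the shell errored during invocation


def finish_pass_through(before, result):
    """Return (editor state after pass-through, whether to submit the line)."""
    if result.failed or result.buffer is None:
        # Cancellation or error: the user keeps their in-progress typing.
        return before, False
    # Assumption: fall back to end-of-buffer when no cursor was reported,
    # and clamp a reported cursor into the valid range.
    cursor = len(result.buffer) if result.cursor is None else result.cursor
    cursor = max(0, min(cursor, len(result.buffer)))
    return EditorState(result.buffer, cursor), result.accepted
```

Returning the submit decision alongside the state lets a caller route an accepted line through the same path as Enter in block mode, so the command still runs as a Warp block.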
+ + **Failure modes.** If the shell errors during widget invocation + (the widget is undefined, the bound function exits non-zero, the + shell crashes), Warp restores the user's pre-invocation buffer + and surfaces a one-time diagnostic naming the widget. The widget + name in the diagnostic follows the same allowlist-or-bucket + redaction policy as #11 and telemetry — verbatim only when the + widget name appears in the well-known ZLE/readline/fish + vocabulary; user-defined or plugin-private names are + redacted to the bucket `user-defined`. Key contents and + binding bodies never appear in the diagnostic. The + keystroke is not silently swallowed and the user is never + left with a dead prompt. + + **Latency.** Pass-through introduces a small round-trip: typically + 50–150 ms before the widget's TUI appears. This is not a hard + invariant but the spec calls it out so the implementation budgets + appropriately. atuin's own latency measured outside Warp is the + floor. + + **Concurrent input.** Once Warp has yielded for the widget, + subsequent keystrokes reach the shell directly (this is what + makes atuin's UI navigable). Warp does not buffer or re-intercept + keystrokes during pass-through. Returning focus to Warp's + block-mode editor happens when the widget signals completion. + +11.6. **Continuous inline-rendering plugin support.** When the user has + plugins installed that hook every keystroke to paint, suggest, + highlight, or expand inline (zsh-autosuggestions, zsh- or + fast-syntax-highlighting, fish abbreviations, zsh-vi-mode's + visual mode indicators), Warp honors them. 
Concretely, while the + user types in the input editor on a tab where these plugins are + active: + + - **Inline suggestions appear.** If zsh-autosuggestions or an + equivalent is loaded and would have suggested a completion at + the current buffer state, that suggestion is visible in the + input editor in dimmed text after the cursor, exactly as it + would render in the user's terminal without Warp. + - **Suggestion acceptance works.** The keys the plugin binds to + accept a suggestion (typically Right arrow, End, Ctrl-E) accept + it the same way they would natively. Word-at-a-time acceptance + (Alt-F when bound to a `_zsh_autosuggest_accept_word`-style + widget) also works. + - **Syntax highlighting renders.** If zsh-syntax-highlighting, + fast-syntax-highlighting, or an equivalent is loaded, the + input editor shows the same per-token coloring as the user's + native terminal does — command vs argument, valid vs invalid + command, matching/mismatching quote and bracket pairs. + - **fish abbreviations expand.** Pressing space or enter after a + typed abbreviation expands it to its full form before the + command runs, exactly as fish does natively. + - **vi-mode indicators are correct.** Cursor shape per vi mode + (block in command mode, beam in insert, underline in replace) + matches what the active vi-mode plugin would draw. zsh-vi-mode + surround / text-object operators behave as they would natively. + - **No double-render or flicker.** The user sees one rendered + line per prompt — Warp's editor is not separately rendered on + top of (or under) the shell's view of the buffer. + - **Block mode UI is preserved.** Everything above the current + prompt (block list, sidebar, command palette, etc.) renders + and behaves exactly as it does today. The change is scoped to + how the active prompt's input area is composed. + + The implementation uses per-keystroke PTY injection plus + shell-emitted DCS overlays (see TECH §6, §7). 
Behavioral + invariants above are the bar; TECH owns the latency budget + and the per-shell capability matrix that determines whether + each shell hits all of them in v1. The matrix: + + - **zsh and bash-with-blesh:** all invariants above (inline + suggestions, syntax highlighting, vi-mode indicators, + acceptance keys, abbreviation-style expansion if any). + The shell's line editor exposes a per-keystroke hook + (`zle-line-pre-redraw` on zsh; blesh's hook surface on + bash) so every keystroke participates in the + inline-rendering pipeline. + - **fish v1:** Category C bindings (atuin, fzf, custom + widgets) plus `abbr` expansion on space and enter (the + one fish-specific inline behavior PRODUCT #11.6 + requires) plus vi-mode indicators (the in-app state + machine in TECH §"#13" tracks dispatched vi widgets and + renders cursor shape per mode regardless of injection + mode). Fish has no per-keystroke hook, so the remaining + inline overlays (syntax highlighting, autosuggestions + painted as the user types) are not delivered in v1; + abbr expansion is delivered via a fish-specific + space-trigger sync (TECH §6.1 step 5) and an Enter sync + that catches all-other-cases. Continuous inline- + rendering parity for fish is a tracked follow-up. + - **vanilla bash (no blesh):** Category C bindings only. + Readline has no per-keystroke hook and no fish-style + abbr feature to honor; the inline-rendering invariants + above don't apply structurally. TECH §7.3 surfaces this + with a one-time diagnostic at tab start. + + **Security of the overlay channel.** The shell-emitted DCS + overlays carrying buffer state and plugin output are + process-controlled input arriving over the PTY; any + process the user runs can write the same byte stream. 
To + prevent a hostile or careless process from spoofing + overlays, every payload carries the same per-tab nonce as + Warp's other shell-integration DCS payloads, and payloads + are subject to size caps, strict schema validation, and + whole-payload-discard on any failure. TECH §6.4 specifies + the exact validation phases and bounds. + + The defense rests on the nonce staying secret from + descendant processes. The nonce is delivered to the shell + out-of-band — for zsh/bash via the initial environment as + `WARP_BOOTSTRAP_NONCE`, which the bootstrap copies into a + non-exported shell-local variable and *unsets* before any + user rc file runs; for fish via a `0600`-mode tempfile + whose path is passed in `--init-command` and which the + bootstrap reads-then-`rm`s before any further work. After + bootstrap, the nonce lives only in the shell process's own + memory: it is not in the environment, not in shell history, + not in command output, and not in any tempfile. Child + processes spawned by the shell inherit a clean environment + that does not contain the nonce. Same-user processes that + can already read the shell's memory (`/proc/<pid>/environ`, + debugger attach, `ptrace`) can defeat the nonce but they + can already break the user's session in many other ways; + those are out of scope as documented in TECH §1's threat + model. If a real leak occurs (e.g. an rc-file misconfiguration + that exports the variable) the user can rotate by + closing and reopening the tab — every tab gets a fresh + nonce at bootstrap. User-visible result: a process that + tries to inject a fake buffer-state overlay does not affect + Warp's editor; the bad payload is dropped silently. 
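
The overlay checks described above (size cap, per-tab nonce, strict schema validation, whole-payload discard) might be ordered as in the sketch below. Everything concrete here is an assumption made for illustration: the JSON encoding, the field names `nonce` / `buffer` / `cursor`, the 64 KiB cap, and the `parse_overlay` function are hypothetical. TECH §6.4 remains the authority on the real wire format, validation phases, and bounds.

```python
import hmac
import json

MAX_PAYLOAD_BYTES = 64 * 1024  # assumed cap; the real bound lives in TECH §6.4
EXPECTED_FIELDS = {"nonce", "buffer", "cursor"}  # hypothetical schema


def parse_overlay(raw: bytes, tab_nonce: str):
    """Return the validated overlay as a dict, or None to discard it whole."""
    if len(raw) > MAX_PAYLOAD_BYTES:
        return None
    try:
        payload = json.loads(raw.decode("utf-8"))
    except ValueError:  # covers both bad UTF-8 and malformed JSON
        return None
    if not isinstance(payload, dict) or set(payload) != EXPECTED_FIELDS:
        return None
    nonce = payload["nonce"]
    # Constant-time comparison so a spoofer learns nothing from timing.
    if not isinstance(nonce, str) or not hmac.compare_digest(
        nonce.encode("utf-8"), tab_nonce.encode("utf-8")
    ):
        return None
    buffer, cursor = payload["buffer"], payload["cursor"]
    if not isinstance(buffer, str):
        return None
    if isinstance(cursor, bool) or not isinstance(cursor, int):
        return None
    if not 0 <= cursor <= len(buffer):
        return None
    return {"buffer": buffer, "cursor": cursor}
```

Any failed check returns `None` without a partial result, which mirrors the whole-payload-discard rule: a caller that sees `None` leaves the editor untouched.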
+ + **Failure mode.** If the plugin emits something Warp's renderer + can't faithfully display (an obscure ANSI sequence, a 24-bit + color the active theme rejects, custom terminal-mode toggling), + the plugin's output renders as plain text (not crashing) and a + one-time-per-session diagnostic notes the limitation. The user + is never left with a broken prompt. + + **Detection.** Warp does not need to enumerate plugins by name. + The same `bindkey -L` / `bind -p` / `bind` query used for + Category A/B/C bindings already surfaces the plugin's installed + widgets (e.g., `_zsh_autosuggest_accept` shows up bound to + Right arrow when zsh-autosuggestions is loaded). The presence + of these widgets is the signal that the implementation should + activate the inline-rendering path for that tab. + +12. `clear-screen` (typically `^L`) clears Warp's block list to the current + prompt, matching the user's expectation from a real terminal — even if + Warp's default `^L` already does this, the binding must continue to + work when remapped to another key. + +### Modes + +13. When the shell is in vi mode, the active keymap follows the shell's + current mode (insert / command / visual / replace). Warp learns about + mode transitions through the same mechanism it uses for binding + discovery (see #4) — when the shell signals a mode change (by repaint, + OSC, prompt update, or whichever signal the implementation lands on), + Warp's input editor switches keymaps so the next keystroke is matched + against the new map. Visible mode indicators (cursor shape, vim-mode + plugin status text in the prompt) remain whatever the shell already + drew; Warp does not add its own. + - **Resolved (TECH §"#13"):** an in-app state machine drives + mode tracking. 
Initial mode comes from the bootstrap payload + (zsh `$KEYMAP`, bash `bind -v`, fish `$fish_bind_mode`); + each `Precmd` payload is authoritative for resync; between + prompts the widget dispatcher updates `active_keymap` + synchronously based on the dispatched widget (vi-cmd-mode → + ViCommand, vi-insert → ViInsert, etc.). + +### Precedence and conflicts + +14. Resolution order for a single keystroke in the shell command input + editor, highest priority first: + 1. Reserved infrastructure keys (see below). + 2. User-customized Warp keybindings (anything the user has explicitly + set in Warp settings). + 3. User shell bindkeys for the active keymap. + 4. Warp's default keybindings. + 5. Default character insertion. + + Rationale: a key the user has explicitly bound in Warp settings is the + strongest signal of intent. Below that, the user's shell bindings + override Warp's defaults — that is the entire point of this issue. Warp + defaults are the floor. + + **Reserved infrastructure keys.** A small set of keys is structurally + needed for Warp ↔ shell communication (input reporting, prompt-mode + switching, kill-buffer signaling) and cannot be honored as + user-controlled in v1. User bindings on these keys are imported into + Warp's debug view tagged `reserved-by-warp` and do not fire; + bindings on every other key follow the regular precedence above. + The reserved set per shell: + + - **zsh:** `^P` (Warp uses for `kill-buffer`), `\ei` (input + reporting), `\ep` (switch to PS1 prompt), `\ew` (switch + to Warp prompt). The `\ep`/`\ew` bindings match the + bash/fish set; the zsh bootstrap already installs them + for prompt-mode switching, and treating them as + user-controlled would break Warp ↔ shell prompt + communication. + - **bash:** `\C-p` (`kill-whole-line` for clear-buffer), `\ei` + (input reporting), `\ep` (switch to PS1 prompt), `\ew` (switch to + Warp prompt). 
+ - **fish:** `\cP` (clear input buffer), `\ep` (switch to PS1 + prompt), `\ew` (switch to Warp prompt), `\ei` (input reporting). + + These match the keys Warp's existing bootstrap already installs in + each shell. Lifting the exception (re-implementing each integration + point without bind-level interception) is a tracked follow-up; the + integrations exist today and replacing them is out of scope here. + +15. When a user shell binding shadows a Warp default, no warning, banner, or + toast appears. The user already declared this binding in their shell + config; the override is the desired behavior. Diagnostics for shadowed + Warp defaults may be available through verbose logging but are not + user-facing. + +16. When a user shell binding cannot be honored because the bound widget is + unsupported (#11), the keystroke falls through to Warp's default — it + does not steal the keystroke and produce nothing. The user sees the + Warp default fire on that key, which may differ from what their shell + would have done. The diagnostic from #11 is the user's signal that + something they configured isn't supported yet. + +### Multi-tab and multi-shell scenarios + +17. A window with multiple tabs running different shells (one zsh, one + bash, one fish) honors each tab's bindings independently. Switching + focus between tabs changes the active binding table to that tab's. + +18. SSH sessions: when the user SSHes from a Warp tab to a remote host and + a shell starts on the remote, Warp does not query the remote shell for + bindings in v1. The local Warp input editor continues to use the + bindings of the local shell that started the tab, or Warp defaults if + the local shell wasn't a supported one. Honoring remote bindings + requires the remote-side Warp agent and is out of scope here. + +19. Subshells started inside a session (`bash` typed at a zsh prompt, + `tmux`, etc.) keep the parent tab's binding table. 
The user does not + see a re-query, and bindings the subshell may have configured do not + take effect in the Warp input editor. Re-querying on every subshell + transition is feasible but a follow-up. + +### Surface boundaries + +20. Bindings only apply while the user's input focus is in the shell + command input editor of a tab whose shell is one of zsh / bash / fish. + They do not affect: + - Warp command palette, settings, search, AI prompt input, + block-level chrome (the keystrokes there continue to use Warp's own + keymap). + - Tabs whose shell is not a supported shell — those tabs use Warp + defaults. + - Any modal overlay rendered above the input editor (file palette, + command palette, suggestions popover focus, etc.). + +21. Switching focus from the input editor to another surface and back does + not require re-querying. The binding table from the most recent query + remains valid for the duration of the tab unless invalidated by #4. + +### AI / agent prompt input + +22. The AI prompt input editor does not honor shell bindkeys by default — + it is not a shell, and shell vi/emacs muscle memory there would + conflict with the AI input's own conventions. + - **Resolved (TECH §"#22"):** v1 ships with no AI-prompt + opt-in. The matcher's `BindingOrigin::Contextual` tier only + activates on tabs whose focus is the shell command input + editor. The opt-in setting and its dependent + `ClassifierGate` (#22.5) are tracked as a follow-up — they + must ship together since #22 without the gate ships the + flicker bug. + +22.5. **Interaction with the shell-vs-natural-language classifier + (follow-up scope, ships with #22 opt-in).** This section + describes the design that lands *when* the AI-prompt opt-in + from #22 ships — v1 of this PR does *not* implement either + #22 or #22.5; the matcher's `Contextual` tier is inactive + in the AI prompt input throughout v1 (see #22's resolution + in Open Questions). 
The design is specified here so that + when #22 lands as a follow-up the classifier interaction is + already worked out and the two ship together; the + follow-ups list at the bottom of TECH.md tracks this as a + pair. + + Warp's agent conversation input runs a per-keystroke classifier + that labels the current buffer as "shell command" or "natural + language", and the label can flicker as the user types + (`cd ~/p` initially looks shell-y, then `cd ~/please help me` + flips to NL). If bindkey honoring is gated on this classifier, + naive gating produces three failure modes the follow-up must + avoid: + + - **Flickering inline plugins.** Dimmed autosuggestions and + syntax-highlight colors that appear and disappear as the + classifier oscillates mid-word. The user sees their command + lose its highlighting between keystrokes for no visible reason. + - **Misclassified bindkey loss.** The user presses `Ctrl-R` + expecting atuin, but the classifier had just flipped to "NL", + so the binding is not honored and atuin doesn't open. The + user thinks bindings are broken. + - **Misclassified bindkey activation.** The user is composing + a sentence in NL, classifier briefly flips to "shell", an + autosuggest dimmed-text suggestion appears in the middle of + their sentence and looks like a rendering bug. + + The follow-up rules that resolve these (engaged when #22 opts + in; not in v1 scope): + + a. **Explicit bound keystrokes are classifier-independent.** When + the user presses a key bound to an external widget (Category + C: atuin's `Ctrl-R`, fzf's `Ctrl-T`, etc.) the binding is + always honored, regardless of the current classifier label. + Pressing the bound key is an unambiguous intent signal from + the user that overrides whatever the classifier last said. + Cost of a stray accidental press is low (the widget opens, + user dismisses it with Escape). Cost of a missed intentional + press is high (binding feels broken). + + b. 
**Inline-plugin rendering is hysteretic and debounced.** + Continuous inline plugins (autosuggest, syntax-highlighting, + fish abbreviations) only render when the classifier has held + "shell command" for at least the last N characters (N small, + on the order of 3–5) and only stop rendering when the + classifier has held "NL" for the last N characters. The + output is debounced over a short window (~80 ms) before the + transition takes effect. The user sees one stable rendering + state per logical phrase of typing, not a flicker on every + keystroke. + + c. **Transitions are clean, not animated.** When the + hysteretic state flips, inline-plugin output disappears (or + appears) in a single frame. No fade, no partial paint. + + d. **Explicit user override (lock).** A keyboard shortcut + (default `Ctrl-Alt-L`, user-rebindable) and a small + affordance in the input editor let the user lock the + current buffer to "shell mode" or "NL mode", disabling + the classifier for that buffer. Use case: the user knows + what they're typing and the classifier keeps getting it + wrong. The lock resets at the next agent turn (per-buffer, + not sticky across the conversation). + + e. **Classifier output is observable for debugging.** A + developer setting exposes the current classifier label and + hysteresis state in the input editor (subtle indicator). + Off by default; used to diagnose user reports of "bindings + are flaky in agent input". + + These rules apply whenever #22 is opted in. If #22 stays off + (v1 default), the classifier interaction doesn't arise because + bindkeys aren't honored in the agent input at all. + +### Settings, opt-out, and discoverability + +23. The feature is on by default for supported shells once it ships. + - **Resolved (TECH §5):** gated by + `FeatureFlag::HonorShellBindkeys` for staged rollout + (default off → dogfood → preview → stable). Once stable + the flag is on by default for supported shells. + +24. 
A single setting "Honor shell keybindings in input editor" lives under + the Keybindings section of settings. Toggling it off immediately + restores Warp's default keymap for all tabs (no restart). Toggling + it back on resumes matching against each tab's most recently + received binding table; any drift since the toggle was off is + picked up on the tab's next `precmd` payload, since re-queries are + shell-driven (see TECH.md §1). Users who want a fresh re-import + without waiting for the next prompt can press Enter on an empty + line, which fires `precmd` immediately. + +25. The Keybindings settings page surfaces, somewhere reachable, a + read-only view of the bindings Warp has imported for the active tab — + enough that a user debugging "why didn't my binding work" can see what + Warp received from the shell and which entries Warp marked unsupported. + Format: a list of `key → action (status)` rows where status is one + of `honored-builtin` (Category A — translated to a Warp action), + `honored-macro` (Category B — string macro re-injected per #9), + `honored-passthrough` (Category C — external widget routed through + pass-through per #11.5), `shadowed-by-warp-user` (a user-customized + Warp keybinding wins for this key), `reserved-by-warp` (one of the + structurally reserved keys from #14), or `unsupported` (Category A + widget Warp cannot replicate; user-defined-shell-function widgets + do not appear in this status — they are always + `honored-passthrough`). The exact UI is left to the tech spec; the + behavioral requirement is that the information is discoverable + without enabling debug logging. + +### Performance and correctness invariants + +26. The initial binding query must not block the user's first keystroke + perceptibly. If the query has not completed by the time the user types, + the keystroke is handled with Warp defaults; it is not buffered or + delayed. Late-arriving bindings apply to subsequent keystrokes. + +27. 
The query must not appear in the user's shell history, in scrollback, + in the block list, or as visible output. Side effects on the shell's + own state (kill ring, last-status `$?`, etc.) must be avoided or + cleaned up. + +28. Receiving a malformed or partial response from the shell never causes + a crash, hang, or stuck input editor. The fallback is always: drop the + bad data, log a diagnostic, keep using whatever binding table was + valid before. + +29. Existing Warp keybindings that the user has not customized continue to + work unchanged on tabs running unsupported shells, on tabs where the + feature is disabled, and on tabs where the query has not yet completed + or has failed. + +## Open questions + +Collected from inline references above plus a few cross-cutting +ones the tech spec resolved during the spec iteration. All +items below are marked Resolved with a pointer to the TECH +section that closed each one; this section exists as the +single index of v1 design decisions for reviewers and future +maintainers. + +- (Resolved) v1 handling of user-defined named widgets whose + body is shell code (#11): honored via Category C pass-through + (TECH §3, §6); only built-in widgets without a Warp equivalent + land in the `Unsupported` fallthrough path. +- (Resolved) Canonical signal for vi-mode transitions across + zsh, bash, and fish (#13): in-app state machine driven by + dispatched widget transitions, with bootstrap and `Precmd` + payloads providing initial state and resync. See TECH §"#13". +- (Resolved) AI prompt input opt-in for shell bindings (#22): + v1 ships with no opt-in; the `Contextual` tier doesn't + activate on the AI prompt. The opt-in plus its dependent + `ClassifierGate` (#22.5) are tracked as a follow-up that + must ship together. +- (Resolved) Default-on vs feature-flagged staged rollout + (#23): gated by `FeatureFlag::HonorShellBindkeys` + (default off → dogfood → preview → stable). See TECH §5. 
+- (Resolved) Telemetry redaction policy for widget names — see + #11; the rule is allowlist-or-bucket, never raw user-defined + names. +- (Resolved) Default keystroke for `agent-input.lock-mode` + (#22.5d): `Ctrl-Alt-L`, user-rebindable via the editable + Warp keymap. diff --git a/specs/GH537/tech.md b/specs/GH537/tech.md new file mode 100644 index 0000000000..3b83a63b16 --- /dev/null +++ b/specs/GH537/tech.md @@ -0,0 +1,2047 @@ +# TECH.md — Honor user-defined shell bindkeys in Warp's input editor + +Issue: https://github.com/warpdotdev/warp/issues/537 +Product spec: [`product.md`](./product.md) + +## Context + +Warp's input editor receives raw keystrokes, matches them against the +`Keymap` table, and dispatches `InputAction` variants. Today that table +knows nothing about the user's shell bindings, so user customizations +(`bindkey '^X^E' edit-command-line`, readline `bind`, fish `bind`) are +ignored. See `product.md` for the user-visible behavior we want. + +Relevant code, with line ranges: + +- **Input editor and actions** — `app/src/terminal/input.rs (1072-1149)`. + `InputAction` is the dispatched action type. Today it covers + Warp-flavored actions (`FocusInputBox`, `CtrlR`, `CtrlD`, + `MaybeOpenCompletionSuggestions`, etc.) but does **not** have the + fine-grained editor verbs ZLE / readline expose + (`backward-kill-word`, `transpose-chars`, `kill-line`, `yank-pop`, + `up-history`, `vi-cmd-mode`, …). The buffer model lives in + `InputBufferModel` in the same file. `crates/editor/src/editor.rs + (18-55)` exposes the underlying `EditorView` trait. +- **Keymap** — `crates/warpui_core/src/keymap.rs (25-38, 44-49, 72-150)`. + `Keymap { fixed_bindings, editable_bindings }` indexed by name, with + `Trigger::{Keystrokes(Vec), Standard(StandardAction), + Custom(CustomTag)}` and `ContextPredicate` for context-scoped layering. + Resolution: `editable_bindings` (user-overridden) wins over + `fixed_bindings` (Warp defaults). 
Matching lives in + `crates/warpui_core/src/keymap/matcher.rs`. +- **Shell type and session** — `crates/warp_terminal/src/shell/mod.rs + (58-96, 250-255)`. `ShellType { Zsh, Bash, Fish, PowerShell }`, + `Shell { type, version, options, plugins, shell_path }`, + `ShellStarter::init()` at line 79. `app/src/terminal/local_tty/shell.rs + (1-200)` for spawn details. +- **Bootstrap and DCS hooks** — `app/src/terminal/bootstrap.rs (1-150)` + injects a per-shell init script from `bundled/bootstrap/{zsh,bash,fish, + pwsh}.sh`. Script-to-app communication uses + `app/src/terminal/model/ansi/dcs_hooks.rs (1-150)`: `DProtoHook` + variants (`Bootstrapped`, `Precmd`, `Preexec`, `InputBuffer`, + `InitShell`, …) carry hex-encoded JSON payloads + (`HEX_ENCODED_JSON_MARKER = 'd'`). DCS dispatch arrives as + `ModelEvent::PluggableNotification` in + `app/src/terminal/model_events.rs (468-472)`. **There is no live + "invisible command exec" primitive today**; bootstrap-emitted DCS + payloads are the right plumbing to extend. +- **Settings** — `app/src/terminal/keys_settings.rs (15-71, 26-34)`. + `define_settings_group!` macro is the pattern for new boolean toggles + (see `quake_mode_enabled`). Feature flags live in + `crates/warp_features/src/lib.rs (9+)`. +- **Telemetry** — `app/src/server/telemetry/events.rs (1237+, 2920)`. + `TelemetryEvent` enum + `send_telemetry_from_ctx!` macro. +- **Vi-mode tracking** — none today. The `vim` crate is Warp's own + in-editor vi emulation, not shell awareness. + +## Proposed changes + +The implementation has five logical pieces. Each maps cleanly to one +subsystem above. + +### 1. Bootstrap-side binding query + +Extend the bootstrap scripts to dump the user's binding table to Warp +via a new DCS hook variant. Doing the query in bootstrap (rather than +adding a runtime invisible-exec primitive) avoids polluting history, +scrollback, and last-status; it also runs before the first prompt so +bindings are available when the user starts typing. 
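
As orientation for the receive side of this transport, the sketch below shows one way a hex-encoded JSON body (the encoding the existing `DProtoHook` payloads use, per the Context section) could be turned back into JSON text with a size check applied before any output buffer is allocated. It is an illustration under stated assumptions, not the code in `dcs_hooks.rs`: the names `decode_hex_json`, `DecodeError`, and `max_bytes` are hypothetical, the marker byte and DCS framing are deliberately left out, and JSON parsing itself is not shown.

```rust
/// Hypothetical error type for this sketch; not an existing Warp type.
#[derive(Debug, PartialEq)]
enum DecodeError {
    TooLarge,
    OddLength,
    BadHexDigit,
    NotUtf8,
}

/// Value of a single ASCII hex digit, or `None` for any other byte.
fn hex_val(b: u8) -> Option<u8> {
    match b {
        b'0'..=b'9' => Some(b - b'0'),
        b'a'..=b'f' => Some(b - b'a' + 10),
        b'A'..=b'F' => Some(b - b'A' + 10),
        _ => None,
    }
}

/// Decode a hex-encoded body into JSON text.
/// Assumption: `max_bytes` is a caller-chosen cap on the decoded size; the
/// body is rejected before the output buffer is allocated if it exceeds it.
fn decode_hex_json(body: &str, max_bytes: usize) -> Result<String, DecodeError> {
    let raw = body.as_bytes();
    if raw.len() % 2 != 0 {
        return Err(DecodeError::OddLength);
    }
    if raw.len() / 2 > max_bytes {
        return Err(DecodeError::TooLarge);
    }
    let mut out = Vec::with_capacity(raw.len() / 2);
    // The length is even, so every chunk holds exactly two hex digits.
    for pair in raw.chunks(2) {
        let hi = hex_val(pair[0]).ok_or(DecodeError::BadHexDigit)?;
        let lo = hex_val(pair[1]).ok_or(DecodeError::BadHexDigit)?;
        out.push((hi << 4) | lo);
    }
    String::from_utf8(out).map_err(|_| DecodeError::NotUtf8)
}
```

A caller would pass the hex body of a received frame together with its cap and hand the resulting string to the JSON decoder only on `Ok`; any `Err` means the whole payload is dropped.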
+ +- `bundled/bootstrap/zsh.sh`: discover keymaps dynamically with + `bindkey -l` (this enumerates the standard set — `main`, `emacs`, + `viins`, `vicmd`, `vivis`, `viopp`, `command`, `isearch`, + `menuselect` — and any user-defined keymaps created via + `bindkey -N `), then run `bindkey -L -M $keymap` per keymap and + emit a JSON object `{ keymap_name: [ { keys, widget }, … ] }`. Also + emit `KEYMAP` so the active keymap is known. User-defined keymaps + pass through with their declared name; the matcher honors them when + they are referenced as the active keymap (resolves PRODUCT #2's + reference to "user-defined keymaps"). +- `bundled/bootstrap/bash.sh`: `bind -p` for the current keymap and + `bind -p -m emacs / vi-insert / vi-command` for the others. Detect vi + vs emacs via `set -o | grep -E '^(vi|emacs)'`. +- `bundled/bootstrap/fish.sh`: this requires reworking the existing + bootstrap, which currently sets + `fish_key_bindings = fish_default_key_bindings` (line 306) and then + installs four Warp-required binds (`\cP`, `\ep`, `\ew`, `\ei`) on + top — clobbering any user `fish_vi_key_bindings` setting and any + user-installed binds. To honor user fish bindings without losing + Warp's required reporting binds, we change the bootstrap to: + + 1. Capture the user's `fish_key_bindings` value at the very top of + the bootstrap, and stop the unconditional reset at line 306. The + user's chosen scheme runs as configured. + 2. After the user's scheme runs, install Warp's four reserved binds + (`\cP`, `\ep`, `\ew`, `\ei`) explicitly in every bind mode the + user uses (default, insert, visual for vi mode; default for + emacs; plus any custom modes discovered via `bind -L`). Those + four keys are reserved for Warp and intentionally shadow user + bindings on them — the explicit precedence boundary from + PRODUCT #14. + 3. Snapshot the resulting `bind` output per mode and emit it as the + `ShellBindings` payload. 
The vi-mode-vs-input-reporting conflict + that originally motivated the reset is resolved here because the + reporting bind is reinstalled in whichever mode is active, instead + of the scheme being reset wholesale. + + Mode tracking uses `$fish_bind_mode` for the initial snapshot and + the in-app vi state machine described in the open-questions section + for transitions. + +The payload is emitted as a new `DProtoHook::ShellBindings` variant in +`dcs_hooks.rs` carrying `{ shell, keymaps: Vec, +active_keymap, schema_version, nonce }`. Reuse `HEX_ENCODED_JSON_MARKER`. +`KeymapTable` is the wire-format helper (a serializable form of one +keymap's bindings, distinct from §2's in-app `Vec` +storage type); the receive path decodes wire-format `ShellBindings` +and translates it into §2's in-app `ShellBindings` struct before +storing it on `Shell`. The §1 shape and the §2 struct are +related-but-distinct types — same name for brevity, different field +layouts because the wire format prizes parser simplicity and the +in-app form prizes lookup ergonomics. + +**`schema_version` policy.** The bootstrap script is shipped from +the app to the PTY at runtime (`bundled/bootstrap/*.sh` are +embedded in the binary and written into the shell's init path on +session start), so the bootstrap version and the app's expected +schema are always pinned together — version skew should be +impossible in steady state. `schema_version` validation is +defense-in-depth, covering: stale bootstrap scripts persisted on +disk from a partial install, rc-file injection of a hand-crafted +`ShellBindings` DCS frame (caught upstream by the nonce check +but the version check is a second layer), and dev-branch +mismatches where a developer is running an old app against a new +bootstrap. Schema-version bumps follow the existing +`DProtoHook` convention: a new version is only introduced when +field semantics change incompatibly; additive fields use +`#[serde(default)]` and do not bump. 
Any future bump ships +alongside an app release; the new bootstrap is always co-shipped. +The same policy applies to `WarpBufferState` (§6.1 step 4): it +carries the same `schema_version` field and follows the same +co-shipped-bump rules. + +The `ShellBindings` payload is a privileged terminal-control message +(it can rewrite local key handling) and is only accepted from the +bootstrap context: + +- Each Warp-spawned shell receives a per-session, per-tab nonce in its + initial environment (`WARP_BOOTSTRAP_NONCE`). The very first action in + the bootstrap script is to copy this value into a non-exported, + shell-local variable (`typeset -g` in zsh, plain assignment in bash + with `export -n`, `set -l` plus careful scoping in fish), then + `unset WARP_BOOTSTRAP_NONCE` and remove it from the inherited + environment so it is not visible to any descendant process. Every + `ShellBindings` and `Precmd` payload the bootstrap emits embeds this + value. The app rejects any payload whose nonce does not match the + expected value for that tab. + + **Threat model** (documented explicitly so the limits are not + oversold). The nonce defends against: + - Innocent process output that happens to contain a DCS sequence + (`cat`'d binary file, curl response, log dump, terminal-art). + - Descendants of the user's shell that did not exist at bootstrap + time and never had the chance to read the nonce. + + It does **not** defend against: + - A process spawned during the window between the shell starting + and the bootstrap unsetting the variable. For zsh and bash this + window is closed by making the unset the first non-trivial line + of the bootstrap, before any user rc file is sourced. + + **Fish-specific caveat.** Warp launches fish as + `fish -f no-mark-prompt --login --init-command ''` + (`app/src/terminal/local_tty/shell.rs:632`). Fish runs `config.fish` + and any user functions *before* `--init-command`, so the env-var + nonce is readable to user code that runs at config time. 
To close + this gap the fish path passes the nonce out-of-band: Warp writes + the nonce to a tempfile under the user's runtime dir with mode + `0600`, passes the path as the first argument of `--init-command`, + and the bootstrap reads it then `rm`s the file before any further + work. The `WARP_BOOTSTRAP_NONCE` env var is not used for fish at + all. This brings fish to parity with zsh/bash on later-spawned + descendants but does not protect against an adversarial + `config.fish` written before Warp launched, which is consistent + with the same-uid threat model below. + - A same-user process that already has read access to the parent + shell's environment (`/proc//environ` on Linux, + `ps eww` on macOS; both are gated by same-uid). Such a + process can already inject keystrokes through `TIOCSTI` (where + enabled), modify rc files, or attach via debugger; defending the + DCS channel against this attacker offers no marginal security. + - A privileged adversary; out of scope for any user-mode mitigation. + + This trust boundary is the same one Warp's existing shell-integration + hooks already implicitly rely on. The nonce makes that boundary + explicit and raises the bar above pure-output spoofing. +- **Validation order, single rule.** Validation runs in three + strictly ordered phases on the receive side, each of which rejects + the entire payload on failure: + 1. **Pre-decode byte cap.** The DCS frame's hex-decoded byte length + is checked against the 256 KiB total cap *before* JSON + deserialization runs. Frames exceeding the cap are dropped at + the framing layer and logged; no allocation for parsed structures + happens. This bounds memory and CPU before untrusted data + reaches `serde_json`. + 2. **Schema decode (with nonce check).** JSON is decoded into the + `ShellBindings` struct via a single `serde_json` pass. The + decoded struct's `nonce` field is compared against the tab's + expected `WARP_BOOTSTRAP_NONCE` (zsh/bash) or fish tempfile- + nonce value. 
Missing/mismatched nonce, field type mismatch, + unknown `schema_version`, or malformed `Keystroke` string + discards the entire payload — no partial application. There + is no per-entry "drop one and keep the rest" branch. + 3. **Post-decode bounds.** After successful decode, the parsed + structure is checked against the per-entry caps (max 4 KiB + per binding entry, max 16 keymaps, max 8192 bindings total). + Any violation discards the entire payload. + + The previous draft's "drop oversized entries before parsing" + language is retired; the rule is uniform — every validation + failure is whole-payload rejection so the app never applies a + partial table. +- The same nonce check applies to the existing `Precmd` hook; + a `Precmd` payload that fails the nonce check has its + contents ignored, so the previous binding table stays in + place and the re-query hash on that payload is discarded + along with everything else. + +### Re-query mechanism + +Re-queries are driven entirely shell-side; the app never has to mutate +shell state to trigger a re-emit (which the running shell can't observe +anyway — flipping an env var from outside has no effect on the live +session). The bootstrap script keeps a shell-scoped variable +`__warp_bindings_hash` initialized at startup to the hash emitted +alongside the first `ShellBindings` payload. On every `precmd` the +script: + +1. Recomputes the 64-bit hash of the current binding table. +2. Emits the hash in the `Precmd` DCS payload (informational; the app + uses it for telemetry and to detect mid-session resyncs). +3. If the new hash differs from `__warp_bindings_hash`, emits a fresh + `ShellBindings` payload with the full table and updates + `__warp_bindings_hash` to the new value. + +The app-side handler simply consumes whatever arrives. Steady state is +one hash computation per prompt; the full payload is re-emitted only on +real changes (new `bindkey`, mode switch via `bindkey -v`, sourcing a +new rc file, plugin rebind). 
PRODUCT #26 holds because the work runs +inside `warp_precmd` after the user's command output, asynchronously to +keystrokes. + +**Preserving shell state during the hash step (PRODUCT #27).** The +hash step runs as the very first action of `warp_precmd` and must +leave shell-observable state untouched. The discipline: + +- **Last-status (`$?` / `$status`).** Save before any other + expression: zsh `local __warp_status=$?`, bash + `local __warp_status=$?`, fish `set -l __warp_status $status`. Any + later read of `$?` in the user's own `precmd` chain sees the + saved value, which the function restores via + `return $__warp_status` at its end (in fish as well: `$status` is + read-only there, so it cannot be restored with `set`). +- **Shell options.** No `set -o`, `setopt`, `shopt`, or + `set -gx fish_