Skip to content

fix(helpers): press_key supports keyboard shortcuts (suppress char + canonical code/vk)#297

Open
sauravpanda wants to merge 1 commit intomainfrom
fix/press-key-shortcut-support
Open

fix(helpers): press_key supports keyboard shortcuts (suppress char + canonical code/vk)#297
sauravpanda wants to merge 1 commit intomainfrom
fix/press-key-shortcut-support

Conversation

@sauravpanda
Copy link
Copy Markdown
Collaborator

@sauravpanda sauravpanda commented May 5, 2026

Summary

Fixes a P1 surfaced by a codex review of the repo. Lifts the workaround that PR #258 made inside fill_input up to the press_key helper itself, so every keyboard shortcut — Cmd+S, Ctrl+Z, Ctrl+L, Cmd+Shift+P, etc. — now works for any caller, not just the select-all that fill_input does.

The bugs

Two distinct problems in helpers.press_key:

  1. char event always fired after keyDown. Even with Ctrl/Cmd held, the helper unconditionally emitted an Input.dispatchKeyEvent type=char after the keyDown. Chrome treats char as printable text input, so the shortcut handler fires AND the literal letter gets typed. PR fix(helpers): add fill_input, wait_for_element, wait_for_network_idle #258 worked around this inside fill_input by dispatching select-all via raw CDP calls; every other shortcut remained broken.

  2. code and windowsVirtualKeyCode were wrong for letters/digits. The fallback computed code=key (so code="a" for the A key) and vk=ord(key[0]) (97 for "a"). CDP's shortcut handlers compare against canonical physical-key codes — "KeyA" / 65 for the A key, "Digit5" / 53 for the 5 key. Without those, e.code in JS is wrong and shortcut listeners that check e.code === "KeyA" (a common pattern) don't fire.

The fix

  • New _key_metadata(key) returns the canonical (vk, code, text) for letters (KeyA/65, KeyZ/90), digits (Digit5/53), and the pre-existing special-key table (Enter/13/\r, Backspace/8, etc.). Punctuation/symbols fall back to ASCII vk + literal code (best-effort; Chrome may still recognise the shortcut from the modifier flags alone).
  • press_key now suppresses both text on the keyDown and the entire char event when any of Alt|Ctrl|Meta (modifier bits 0b0111) is set. Shift alone is still text input — Shift+A still types A.
  • fill_input's clear-first path simplified back to press_key("a", modifiers=...). The raw-CDP workaround was 5 lines; it's now one call. The helper does the right thing.

Tests

10 new in tests/unit/test_helpers.py:

  • test_press_key_letter_uses_canonical_code_and_vk — covers "a"KeyA/65 and "Z"KeyZ/90.
  • test_press_key_digit_uses_canonical_code_and_vk"5"Digit5/53.
  • test_press_key_special_key_uses_kkeys_tableEnter still resolves via _KEYS.
  • test_press_key_no_modifiers_emits_text_and_char_event — keyDown carries text, char event fires.
  • test_press_key_with_ctrl_suppresses_text_and_char — keyDown has no text, no char event.
  • test_press_key_with_meta_suppresses_text_and_char — same for Cmd.
  • test_press_key_with_alt_suppresses_text_and_char — same for Alt.
  • test_press_key_shift_alone_still_emits_text_and_char — Shift alone is text input.
  • test_press_key_ctrl_shift_combo_still_suppressesCtrl+Shift+P (modifier wins over Shift).
  • test_press_key_emits_keyup_with_consistent_metadata — keyUp carries the same code/vk/modifiers.

uv run pytest tests/93 passed (83 → 93).

Test plan

  • Unit tests pass locally.
  • On a real daemon: an agent calling press_key("z", modifiers=2 if not darwin else 4) in a textarea undoes the last edit (Ctrl/Cmd+Z).
  • press_key("l", modifiers=2 if not darwin else 4) focuses the address bar (Ctrl/Cmd+L).
  • fill_input(selector, "hello", clear_first=True) still clears on macOS (Cmd+A path goes through the new press_key).

Summary by cubic

Fixes keyboard shortcuts by updating press_key to suppress printable char events with Alt/Ctrl/Meta and to emit canonical code/vk for letters and digits. Shortcuts like Cmd+S, Ctrl+Z, and Cmd/Ctrl+A now work reliably, and fill_input uses press_key directly.

  • Bug Fixes
    • Suppress text and char events when Alt/Ctrl/Meta are held; Shift alone still types text.
    • Emit canonical CDP metadata for letters/digits (e.g., KeyA/65, Digit5/53); special keys unchanged.
    • Simplify fill_input clear path to use press_key("a", modifiers=Cmd/Ctrl); removed raw CDP workaround.
    • Added 10 unit tests; total suite now 93 passing.

Written for commit e0dcdd6. Summary will update on new commits.

Two related issues that have shown up across recent PRs:

1. press_key always emitted a `char` event after every single-char
   keyDown, regardless of modifiers. With Ctrl/Cmd held, that `char`
   makes Chrome treat the press as printable text input (typing "a")
   instead of firing the shortcut (Cmd+A). PR #258 worked around this
   inside fill_input by dispatching the select-all directly via raw
   CDP calls, but every other shortcut was still broken.

2. The keyDown's `code` and `windowsVirtualKeyCode` for letters/digits
   came from a literal-key fallback (code="a", vk=ord("a")=97). CDP's
   shortcut handlers compare against canonical physical-key codes —
   "KeyA" / 65 for the A key, "Digit5" / 53 for the 5 key. Without
   that, e.code in JS is wrong and shortcut listeners that check
   `e.code === "KeyA"` (a common pattern) do not fire.

Fix at source so every shortcut works for any caller:

- Added _key_metadata(key) which returns the canonical (vk, code, text)
  for letters (Key{X}, ord(upper)), digits (Digit{N}, ord(N)), and the
  pre-existing special-key table. Punctuation/symbols fall back to
  ASCII vk + literal code.
- press_key suppresses both `text` on keyDown and the entire `char`
  event when any of Alt/Ctrl/Meta is set (modifier bits 0b0111).
  Shift alone is still text input.
- fill_input's clear path now just calls press_key("a", modifiers=...)
  instead of dispatching directly via cdp; the helper does the right
  thing now.

10 new tests in tests/unit/test_helpers.py cover:
- canonical code/vk for letters (KeyA/65, KeyZ/90) and digits (Digit5/53)
- Enter/Backspace/etc still use the _KEYS table
- no-modifier press emits text + char
- Ctrl / Meta / Alt each suppress text + char
- Shift alone keeps text + char
- Ctrl+Shift combo suppresses (modifier wins over Shift)
- keyUp metadata is consistent

Identified via codex review (P1). Full suite: 93 passed (83 -> 93).
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 2 files

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e0dcdd6905

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

upper = key.upper()
if "A" <= upper <= "Z":
# vk for letters is the uppercase ASCII codepoint, regardless of case.
return (ord(upper), f"Key{upper}", key)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Handle multichar uppercase mappings in _key_metadata

The new letter path crashes for some valid single-character inputs because key.upper() can return multiple code points (for example "ß" -> "SS"), and ord(upper) then raises TypeError. This is a regression from the previous implementation and breaks fill_input for international text, since it calls press_key for each character and will now fail at runtime on these inputs.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant