
docs: document input pipeline architecture and headless keyboard barrier #53

Closed
mark-e-deyoung wants to merge 1 commit into SemperSupra:main from mark-e-deyoung:fix/headless-vnc-input-passthrough

@mark-e-deyoung

Collaborator

Problem

In headless WineBot containers, keyboard input injected via VNC or xdotool never reaches Windows applications. Key events arrive at Xvfb but are not delivered to the target app. This has been a long-standing issue in the project.

Root Cause (Definitive)

The Wine desktop shell (explorer.exe /desktop) intercepts ALL X11 keyboard input. The container runs a supervisor that continuously restarts this desktop, making it impossible to run apps in standalone X11 window mode.

Input pipeline:

VNC Client → x11vnc(:5900) → Xvfb(:99) → explorer.exe /desktop → terran.exe
                                                    ^
                                           KEYBOARD EVENTS STOP HERE

Proof:

  1. VNC RFB protocol handshake succeeds (DES auth, framebuffer reads confirm app rendering)
  2. VNC key events (struct.pack(">BBHI", 4, 1, 0, keysym)) are acknowledged by x11vnc
  3. /proc/PID/mem reads of the target app show NO state changes after key injection
  4. Killing explorer.exe /desktop makes xdotool key injection work immediately
  5. The supervisor restarts the desktop within seconds, re-blocking input
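
For reference, the key event in step 2 follows the RFB KeyEvent layout: message type 4, a down flag, two padding bytes, and a 32-bit keysym. The following is a minimal sketch of building a press/release pair for that message. The helper names (`key_event`, `key_tap`) are illustrative assumptions and are not part of WineBot or x11vnc.

```python
import struct

# RFB client-to-server message type for KeyEvent (RFC 6143, section 7.5.4).
RFB_KEY_EVENT = 4

# X11 keysym for the Return key, used here only as an example value.
XK_RETURN = 0xFF0D


def key_event(keysym: int, down: bool) -> bytes:
    """Pack one KeyEvent: type (U8), down-flag (U8), padding (U16), keysym (U32)."""
    return struct.pack(">BBHI", RFB_KEY_EVENT, 1 if down else 0, 0, keysym)


def key_tap(keysym: int) -> bytes:
    """Hypothetical helper: a key press immediately followed by its release."""
    return key_event(keysym, True) + key_event(keysym, False)


if __name__ == "__main__":
    # Each KeyEvent is 8 bytes on the wire; a tap is 16 bytes.
    print(key_tap(XK_RETURN).hex())
```

Sending only the press (down flag 1), as in step 2, leaves the key logically held, so a matching release is usually sent as well.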

Documentation Added

  • Input pipeline architecture explaining the Xvfb → desktop → app chain
  • Three documented solutions with tradeoffs (hybrid mode, no-desktop, mouse-only)
  • HTTP 423 troubleshooting with root cause and fix

Testing

No code changes — documentation only. Existing test suite passes.

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ooting

Documents the root cause of VNC/xdotool keyboard events not reaching
Windows apps: Wine's explorer.exe /desktop intercepts all X11 keyboard
input. The desktop supervisor continuously restarts it.

Adds three documented solutions:
1. Hybrid control mode (WINEBOT_ALLOW_HEADLESS_HYBRID=1)
2. Disable desktop supervisor (WINEBOT_SUPERVISE_EXPLORER=0)
3. Mouse-only VNC interaction workaround

Also documents the HTTP 423 'Agent control denied by policy' error
and its fix.

Discovered during headless SMAC game automation where /proc/PID/mem
confirmed VNC key events delivered to Xvfb but terran.exe state
never changed. Root cause confirmed by killing explorer.exe /desktop
and observing that xdotool key injection then reached app windows
directly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mark-e-deyoung

Collaborator Author

Superseded by commit a7184c8, which includes this documentation plus a new /input/key endpoint and comprehensive tests.

mark-e-deyoung added a commit that referenced this pull request Jun 21, 2026
feat: add /input/key endpoint and fix headless keyboard input pipeline

Adds a POST /input/key API endpoint for keyboard injection into Windows
applications. The endpoint defaults to AHK Send injection, which operates
inside the Wine process space and bypasses the X11 explorer.exe/desktop
keyboard interception layer entirely.

Key changes:
- New /input/key endpoint in api/routers/input.py with AHK and xdotool backends
- xdotool-to-AHK key syntax translation (_xdotool_to_ahk_keys)
- KeyModel Pydantic model for request validation
- WINEBOT_INPUT_KEY_BACKEND and WINEBOT_TIMEOUT_INPUT_KEY_SECONDS config fields
- Expose WINEBOT_ALLOW_HEADLESS_HYBRID and WINEBOT_SUPERVISE_EXPLORER in docker-compose
- Document keyboard barrier and solutions in docs/troubleshooting.md
- 20 unit tests for key translation, e2e test for full pipeline
- All existing tests pass (162 passed, 0 new failures)
- Ruff and Mypy pass clean

Closes PR #52 and #53 (superseded by this commit).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
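
To illustrate the kind of xdotool-to-AHK key syntax translation mentioned in the commit above, here is a rough sketch. It is not the project's `_xdotool_to_ahk_keys` implementation, which is not shown on this page. The function name `xdotool_to_ahk` and both mapping tables are assumptions chosen for illustration, and they cover only a handful of keys.

```python
# Hypothetical mapping from xdotool modifier names to AHK Send prefixes.
MODIFIERS = {"ctrl": "^", "control": "^", "alt": "!", "shift": "+", "super": "#"}

# Hypothetical mapping from a few X11 key names to AHK Send key names.
NAMED_KEYS = {
    "Return": "{Enter}",
    "Escape": "{Esc}",
    "Tab": "{Tab}",
    "space": "{Space}",
    "BackSpace": "{Backspace}",
    "Delete": "{Delete}",
}

# Characters with special meaning in AHK Send must be wrapped in braces.
AHK_SPECIAL = set("^!+#{}")


def xdotool_to_ahk(combo: str) -> str:
    """Translate one xdotool-style combo such as 'ctrl+c' into AHK Send syntax."""
    *mods, key = combo.split("+")
    try:
        prefix = "".join(MODIFIERS[m.lower()] for m in mods)
    except KeyError as exc:
        raise ValueError(f"unknown modifier in {combo!r}") from exc

    if key in NAMED_KEYS:
        body = NAMED_KEYS[key]
    elif len(key) == 1:
        # Plain characters pass through; AHK metacharacters are escaped.
        body = "{" + key + "}" if key in AHK_SPECIAL else key
    else:
        # Multi-character names such as F5 or Up are wrapped as-is.
        body = "{" + key + "}"
    return prefix + body


if __name__ == "__main__":
    for combo in ("ctrl+c", "alt+F4", "Return", "shift+Tab"):
        print(combo, "->", xdotool_to_ahk(combo))
```

A table-driven approach like this keeps unsupported modifiers loud (a `ValueError`) rather than silently sending the wrong key into the Wine process.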