Releases: baronsmv/linux-rt-upscaler
v0.3.0
Changelog
Introducing tile‑based processing, new hotkeys and options, and a new Vulkan + XCB backend
Tile‑Based Processing
The capture and upscaling pipeline now supports a tile‑based mode that dramatically reduces GPU and CPU work when only a small portion of the source window changes. This mode is configurable from the CLI.
- The capture C library (`src/upscaler/capture/lib/`) was refactored into modular components for damage tracking and shared‑memory tile images.
- Each frame is divided into tiles (configurable `tile_size`, default 64 px). A fast xxHash64 checksum of every tile is stored. On subsequent captures only tiles whose hash differs are re‑processed.
- Damage rectangles from the X server are expanded by a context margin (`tile_context_margin`) to account for the neural network’s receptive field, preventing artifacts at its boundaries.
- When the number of changed tiles exceeds `max_tile_layers` or the changed area fraction exceeds `area_threshold`, the system falls back to a full‑frame upscale to avoid excessive dispatch overhead.
- Dedicated tile shaders (e.g., `PassX_tile.hlsl`/`.spv`) that operate on `Texture2DArray` allow batching many tiles in a single GPU dispatch.
- All tile‑related settings (`use_tile_processing`, `use_damage_tracking`, `tile_size`, `tile_context_margin`, `max_tile_layers`, `area_threshold`) are configurable in the `config.yaml` file and as CLI options.
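
The hash‑and‑compare step and the full‑frame fallback can be sketched roughly as follows. This is an illustration only, not the project's C implementation: `tile_hashes` and `plan_update` are hypothetical names, `hashlib.blake2b` with an 8‑byte digest stands in for xxHash64, the changed area fraction is approximated by the fraction of changed tiles, and the threshold values in the usage lines are made up.

```python
import hashlib


def tile_hashes(frame, width, height, tile_size=64, bytes_per_pixel=4):
    """Return {(tile_x, tile_y): digest} for a tightly packed frame buffer."""
    stride = width * bytes_per_pixel
    hashes = {}
    for ty in range(0, height, tile_size):
        for tx in range(0, width, tile_size):
            # 64-bit digest as a stand-in for xxHash64 (assumption of this sketch).
            digest = hashlib.blake2b(digest_size=8)
            row_len = min(tile_size, width - tx) * bytes_per_pixel
            for y in range(ty, min(ty + tile_size, height)):
                start = y * stride + tx * bytes_per_pixel
                digest.update(frame[start:start + row_len])
            hashes[(tx // tile_size, ty // tile_size)] = digest.digest()
    return hashes


def plan_update(prev, curr, max_tile_layers, area_threshold):
    """Choose between per-tile processing and a full-frame upscale."""
    if not curr:
        return "tiles", []
    changed = [key for key, value in curr.items() if prev.get(key) != value]
    too_many = len(changed) > max_tile_layers
    too_large = len(changed) / len(curr) > area_threshold
    return ("full" if too_many or too_large else "tiles"), changed


# Usage: a 128x128 frame has four 64 px tiles; touching one byte dirties one tile.
old = bytes(128 * 128 * 4)
new = bytearray(old)
new[0] = 255
mode, changed = plan_update(
    tile_hashes(old, 128, 128),
    tile_hashes(bytes(new), 128, 128),
    max_tile_layers=2,    # hypothetical value
    area_threshold=0.5,   # hypothetical value
)
```

In this sketch a single dirty tile stays on the tile path, while a frame in which every tile changes trips the fallback.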
New hotkeys
Hotkey support has been extended to cover zooming and panning (offset):
| Hotkey | Action |
|---|---|
| `Alt+Shift+Plus` / `Alt+Shift+Minus` | Zoom in / out through predefined levels (50 %, 75 %, 100 %, 150 %, …) |
| `Alt+Shift+Up/Down/Left/Right` | Pan the upscaled content by a step (25 px by default) |
| `Alt+Shift+Escape` | Gracefully exit the application |
All hotkeys remain customizable in the configuration file.
Vulkan Backend
Previously, the project relied on a fork of compushady, specifically its Vulkan backend. While it served as a great foundation, its monolithic and unmaintained implementation became a challenge to work with, as some features required deep changes to its architecture.
This release introduces a new C++ Vulkan extension (src/upscaler/vulkan/lib/), tailored to the needs of the upscaler. Direct Vulkan usage gives us:
- Resource management with pre‑allocated staging buffers and fence‑based frame pipelining.
- A custom swapchain manager with its own XCB connection and fence synchronization.
- Batch methods for tile and slice processing.
All HLSL shaders are now pre‑compiled to SPIR‑V (.spv files) using dxc at build time. The runtime no longer compiles HLSL on the fly, reducing startup latency.
Migration from Xlib to XCB
All X11 interaction (except the capture C library) has been migrated from python‑xlib to xcffib, the Python binding for XCB. This resolves the X11 thread collisions noticed previously and allows asynchronous connections for tasks such as screenshots, queue events, and display events.
The now‑unused compushady‑lrtu, ewmh, and xlib dependencies were removed, replaced by the new custom Vulkan backend and the XCB binding respectively.
Bonus: Early Development Artifacts, or A Brief Tile‑Processing Glitch Gallery
Some FuNNy artifacts from early tiled‑processing development. After many attempts the feature is ready and artifact‑free, but the process produced some interesting sightings.

v0.2.10
Changelog
Introducing a new high‑performance X11 capture method, with significantly better framerates and lower latency
It replaces the previous slow screenshot‑based capture with a native C implementation using:
- XShm (shared memory) for zero‑copy frame grabs.
- XDamage extension to detect changed regions and skip static frames.
- Automatic fallback to `XGetImage` when SHM is unavailable.
This reduces CPU usage and eliminates unnecessary GPU work when the target window is idle.
Damage‑Aware Frame Skipping
The pipeline now checks the damage rectangles returned by the capture library.
If no pixel changes are detected (and no OSD is active), the upscaling compute passes and Lanczos scaling are skipped, saving GPU power and reducing latency. Some windows may still force an update.
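
As a rough illustration of the skip decision (not the project's code), the check could look like the sketch below. `should_upscale` is a hypothetical name, and damage rectangles are assumed to be `(x, y, width, height)` tuples.

```python
from typing import Sequence, Tuple

Rect = Tuple[int, int, int, int]  # x, y, width, height (assumed layout)


def should_upscale(damage: Sequence[Rect], osd_active: bool) -> bool:
    """Run the compute passes only if something visible may have changed."""
    if osd_active:
        # The OSD must be redrawn (and faded) even when the source is idle.
        return True
    # Any rectangle with a non-empty area counts as a change.
    return any(w > 0 and h > 0 for _, _, w, h in damage)
```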
Performance Improvements (tested with two models)
4x24
| Metric | v0.2.9 | v0.2.10 | Change |
|---|---|---|---|
| Average FPS | 35.7 | 46.8 | +31% |
| 1% Low FPS | 22.5 | 32.9 | +46% |
| 0.1% Low FPS | 20.3 | 30.4 | +50% |
| GPU Load | 46.4% | 58.3% | +12pp |
| Frame Time | 28.0 ms | 21.4 ms | -24% |
8x32
| Metric | v0.2.9 | v0.2.10 | Change |
|---|---|---|---|
| Average FPS | 20.9 | 25.1 | +20% |
| 1% Low FPS | 16.1 | 21.2 | +32% |
| 0.1% Low FPS | 15.0 | 20.6 | +37% |
| GPU Load | 59.8% | 72.3% | +12.5pp |
| Frame Time | 47.7 ms | 39.8 ms | -17% |
Measured with MangoHud on AMD Radeon RX 5600 XT + Intel Core i3-12100F
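
For readers checking the tables: the Change column appears to be a relative change for FPS and frame time, and a difference in percentage points for GPU load. The short sketch below reproduces the 4x24 rows under that reading; the helper names are illustrative, and a measured frame time need not equal the exact reciprocal of the average FPS.

```python
def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


def frame_time_ms(fps: float) -> float:
    """Average frame time implied by an average frame rate."""
    return 1000 / fps


avg_fps_change = round(pct_change(35.7, 46.8))      # Average FPS row
frame_time_change = round(pct_change(28.0, 21.4))   # Frame Time row
gpu_load_pp = round(58.3 - 46.4)                    # GPU Load row, percentage points
implied_ms = (round(frame_time_ms(35.7), 1), round(frame_time_ms(46.8), 1))
```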
Changes to the Vulkan backend and new CLI options
- Added `--vulkan-present-mode` option with support for:
  - `fifo` (V‑Sync on, lowest power, no tearing)
  - `mailbox` (tear‑free, low latency, uncapped FPS)
  - `immediate` (lowest latency, may tear)
- Staging buffer pool configurable via `--vulkan-buffer-pool-size`. Pre‑allocates buffers for partial texture updates, reducing allocation overhead during frequent small uploads.
- Refactored monolithic `vulkan.cpp` into modules for better maintainability in the future.
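
A minimal sketch of how the two options might be parsed and mapped to Vulkan values. The integers are the `VkPresentModeKHR` enum values from the Vulkan specification; the default values and the `build_parser` helper are assumptions made for illustration, not the upscaler's actual defaults or code.

```python
import argparse

# VkPresentModeKHR enum values as defined by the Vulkan specification.
PRESENT_MODES = {
    "immediate": 0,  # VK_PRESENT_MODE_IMMEDIATE_KHR
    "mailbox": 1,    # VK_PRESENT_MODE_MAILBOX_KHR
    "fifo": 2,       # VK_PRESENT_MODE_FIFO_KHR
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--vulkan-present-mode",
        choices=sorted(PRESENT_MODES),
        default="fifo",  # assumed default for this sketch
    )
    parser.add_argument(
        "--vulkan-buffer-pool-size",
        type=int,
        default=4,       # assumed default for this sketch
    )
    return parser


args = build_parser().parse_args(["--vulkan-present-mode", "mailbox"])
vk_mode = PRESENT_MODES[args.vulkan_present_mode]
```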
Bug Fixes and internal improvements
- Damage rectangle clamping to capture subregion prevents out‑of‑bounds blits.
- Thread termination now uses non‑blocking queue `put_nowait()` to avoid hangs on shutdown.
- Frame grabber explicitly cleaned up during window switches and pipeline stops.
- Switch queue now drains old requests, keeping only the most recent target window.
- Added guard against zero‑size scaling rects to prevent invalid GPU dispatches.
- Fixed potential deadlock between pipeline thread and Qt main thread during swapchain recreation.
- Fixed double‑destruction of `VkSurfaceKHR` when swapchain creation fails.
- Added missing Python exceptions for texture creation failures and missing Vulkan queue families.
- Corrected staging buffer pool size check to prevent out‑of‑bounds access.
- Added bounds checking for buffer‑to‑image copies in `Resource.copy_to`.
- Improved memory type selection for texture downloads (now uses `DEVICE_LOCAL` for GPU‑write buffer).
- `XSync` added after `XShmGetImage` to ensure image transfer completion (prevents tearing/artifacts).
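
Two of the fixes above, damage rectangle clamping and draining the switch queue, follow common patterns that can be sketched as below. The function names are hypothetical and rectangles are assumed to be `(x, y, width, height)` tuples; the real code may differ.

```python
import queue


def clamp_rect(rect, bounds):
    """Intersect rect with bounds; return None when nothing is left."""
    x, y, w, h = rect
    bx, by, bw, bh = bounds
    x0, y0 = max(x, bx), max(y, by)
    x1, y1 = min(x + w, bx + bw), min(y + h, by + bh)
    if x1 <= x0 or y1 <= y0:
        return None  # entirely outside the capture subregion
    return (x0, y0, x1 - x0, y1 - y0)


def drain_latest(q):
    """Discard stale requests and return only the most recent one (or None)."""
    latest = None
    while True:
        try:
            latest = q.get_nowait()
        except queue.Empty:
            return latest
```

Returning `None` for an empty intersection also gives the caller a natural place to skip a zero‑size dispatch.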
v0.2.9.post2
Changelog
See the v0.2.9 release notes for earlier changes.
Post-release: Python 3.14 Support
After testing both the compushady fork and the upscaler with Python 3.14, everything worked as expected. If any issues arise, the Vulkan backend can be adjusted as needed.
- Added Python 3.14 to the wheel-building matrix.
- Pre‑install `PySide6` using `uv` in CI to work around `cp314` wheel resolution quirks.
- Updated README with compatibility notes.
v0.2.9.post1
Changelog
See the v0.2.9 release notes for earlier changes.
Post-release: version dependency bound and better screenshot directory default
- Added a minimum version with a loose upper bound to the `compushady-lrtu` dependency to avoid packaging with a previous version.
- The default screenshot directory is now the user's localized Pictures folder, obtained by calling `user_pictures_dir()` from the new `platformdirs` dependency.
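
A minimal sketch of resolving such a default. `default_screenshot_dir` is a hypothetical helper, and the `~/Pictures` fallback when `platformdirs` is not installed is an assumption added only so the snippet runs anywhere.

```python
from pathlib import Path


def default_screenshot_dir() -> Path:
    """Return the user's Pictures folder as a Path."""
    try:
        from platformdirs import user_pictures_dir
    except ImportError:
        # Fallback for this sketch only; the project lists platformdirs
        # as a dependency.
        return Path.home() / "Pictures"
    return Path(user_pictures_dir())
```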
v0.2.9
Changelog
Introducing hotkey support, OSD, lossless screenshots, automatic Wayland scale factor detection and automatic pause events
Hotkey Manager
Added an XCB‑based global hotkey manager (HotkeyManager) that listens for keyboard shortcuts even when the overlay window does not have focus. Supported actions:
- `Alt+Shift+S` to toggle overlay visibility
- `Alt+Shift+M` to cycle through upscaling models
- `Alt+Shift+G` to cycle through the main output geometry modes (`fit`, `stretch`, `cover`)
- `Alt+Shift+P` to take a lossless screenshot (pre-Lanczos)
Hotkeys can be customised via the hotkeys section in the YAML configuration file.
On‑Screen Display (OSD)
An overlay message appears when switching models, changing geometry, or taking a screenshot. The OSD is rendered directly on the Vulkan surface and fades out automatically.
Lossless asynchronous screenshots
Pressing the screenshot hotkey captures the raw SRCNN‑upscaled image (pre‑Lanczos) and saves it as a PNG. The save location can be customised with the screenshot_dir configuration option. The capture is performed asynchronously to minimise impact on the rendering pipeline.
Model and geometry switching
The upscaler can now cycle through all upscaling models and main geometries (fit, stretch, and cover) without restarting the application.
Automatic scale-factor detection
The monitor scale factor is now detected automatically using screeninfo, eliminating the need for manual override on Wayland. The --scale-factor CLI flag remains available for cases where auto‑detection is insufficient.
Pause on focus loss or window minimization
When the target window loses focus (and the overlay is in an always‑on‑top mode), the pipeline automatically pauses and hides the overlay. This behaviour can be disabled with --no-focus-pause or the pause_on_focus_loss configuration option.
Other changes
- All external requests (model switching, geometry cycling, screenshot capture, and OSD display) now use dedicated `queue.Queue` instances.
- The main pipeline loop dequeues and processes these requests on the pipeline thread, eliminating race conditions and simplifying cross‑thread communication.
- The `PipelineController` helper class encapsulates this logic, keeping the core `Pipeline` class focused solely on frame processing.
- `compushady-lrtu` has been updated to include the native `Texture2D.download()` method, which performs a two‑stage GPU-CPU readback using a device‑local staging buffer.
- The `screeninfo` library is now a required dependency and is used to automatically detect the monitor’s physical scale factor on Wayland.
- The `WindowTracker` class was enhanced to accurately detect window minimized state (`map_state`) and active focus (`_NET_ACTIVE_WINDOW`). Combined with the refactored pause‑flag logic, the overlay now pauses and resumes when the target window loses/regains focus or is minimized/restored.
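
The queue‑per‑request pattern described above might look roughly like this. `PipelineControllerSketch` and its method names are illustrative stand‑ins, not the real `PipelineController` API.

```python
import queue
import threading


class PipelineControllerSketch:
    """Collects requests from any thread; applies them on the pipeline thread."""

    def __init__(self):
        self.model_requests = queue.Queue()
        self.screenshot_requests = queue.Queue()
        self.handled = []  # stands in for the real side effects

    def request_model(self, name):
        # Safe to call from the hotkey or GUI thread.
        self.model_requests.put(name)

    def request_screenshot(self):
        self.screenshot_requests.put(True)

    def process_pending(self):
        # Called once per frame, only from the pipeline thread.
        pending = (
            ("model", self.model_requests),
            ("screenshot", self.screenshot_requests),
        )
        for kind, q in pending:
            while True:
                try:
                    item = q.get_nowait()
                except queue.Empty:
                    break
                self.handled.append((kind, item))


ctrl = PipelineControllerSketch()
worker = threading.Thread(target=ctrl.request_model, args=("model-a",))
worker.start()
worker.join()
ctrl.request_screenshot()
ctrl.process_pending()
```

Because only `process_pending` touches pipeline state, the requesting threads never need to share a lock with the frame loop.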
Fixes
- Captured images previously exhibited swapped red and blue channels due to a mismatch between Vulkan’s `B8G8R8A8_UNORM` texture format and PIL’s default `RGBA` decoder. The fix instructs PIL to decode the raw data as `BGRA`, producing correct colours in all saved screenshots.
- The original hotkey implementation used `python-xlib` with a `QSocketNotifier` on the same X11 display connection that Qt internally manages via XCB. The new `HotkeyManager` opens a separate XCB connection and processes events via a dedicated `QSocketNotifier`, integrating with Qt’s event loop without interference.
- Configuration overrides containing `null` values (e.g., `scale_factor: null`, `log_file: null`) were incorrectly logged as “unknown configuration key” warnings. The `apply_overrides` function now properly skips `None` values after confirming the key exists, eliminating the spurious warnings.
- A log statement queried information about non-existent windows before that state could be handled, causing an uncaught `AttributeError`. The state is now handled first.
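
To make the channel mismatch concrete, here is a pure‑Python sketch of the reordering involved. The actual fix is described as telling PIL to decode the buffer as `BGRA` (Pillow's raw decoder accepts that raw mode); `bgra_to_rgba` below is only a hypothetical, dependency‑free illustration of the same byte swap.

```python
def bgra_to_rgba(data: bytes) -> bytes:
    """Swap the B and R bytes of every 4-byte pixel."""
    if len(data) % 4:
        raise ValueError("expected 4 bytes per pixel")
    out = bytearray(data)
    out[0::4] = data[2::4]  # red comes from the third byte of each BGRA pixel
    out[2::4] = data[0::4]  # blue comes from the first byte
    return bytes(out)


# A pure-blue BGRA pixel (B=255, G=0, R=0, A=255) read naively as RGBA
# would look red; after the swap it is blue in RGBA order.
pixel = bgra_to_rgba(bytes([255, 0, 0, 255]))
```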
v0.2.8
Changelog
Introducing a follow‑focus window mode
With the option -f/--follow-focus, the upscaler now automatically switches to the currently focused window. The pipeline adapts to the new window’s size and geometry.
Improvements & Refactoring
- The code has been reworked for better maintainability, paying some of the technical debt from previous releases.
- `Pipeline` and `OverlayWindow` now have subpackages with helper classes that take most of the delegated responsibilities, and each accepts a `Config` object instead of long parameter lists from the CLI. They also encapsulate most of their internal logic, exposing only their public class.
- Geometry overlay calculations moved to a dedicated `overlay.geometry` module.
- Opacity control moved from `pipeline` to `overlay`.
- Configuration handling encapsulated in a new `config` subpackage.
- Window acquisition and tracking logic reorganized under a dedicated `window` package.
Bug Fixes
- Fixed swapchain recreation problems related to more than one reference of the instance. Suboptimal responses are also now ignored.
- Fixed a bug where the pipeline would fail to recover after a target window resize, causing the overlay to display cropped content.
- Corrected an early return in `_handle_window_change` that prevented full geometry updates after the `--follow-focus` feature was introduced.
- Adjusted the focus monitor to ignore the overlay window itself.
v0.2.7
Changelog
Introducing configuration profiles
You can now define named configuration profiles in the YAML config file. Select a profile manually with -p <name> or let the application automatically match a profile based on the window title (exact match, regex, or substring).
Profiles can override any configuration option (crop, model, scale factor, etc.) and follow the priority: CLI arguments > manual profile > auto‑matched profile > general YAML > defaults.
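
The priority chain can be modelled with `collections.ChainMap`, where earlier mappings win. The sketch below is an assumption‑laden illustration: the `match` key with `exact`/`regex`/`substring` entries is a hypothetical schema, not necessarily the one used in the YAML file, and both function names are made up.

```python
import re
from collections import ChainMap


def match_profile(title, profiles):
    """Return the first profile name whose (hypothetical) match rule fits."""
    for name, profile in profiles.items():
        rule = profile.get("match", {})
        if "exact" in rule and title == rule["exact"]:
            return name
        if "regex" in rule and re.search(rule["regex"], title):
            return name
        if "substring" in rule and rule["substring"] in title:
            return name
    return None


def resolve(cli, manual, auto, general, defaults):
    """CLI > manual profile > auto-matched profile > general YAML > defaults."""
    # Unset CLI options (None) must not shadow lower-priority values.
    cli = {key: value for key, value in cli.items() if value is not None}
    return dict(ChainMap(cli, manual, auto, general, defaults))


settings = resolve(
    cli={"model": "a", "crop": None},
    manual={"model": "b", "crop": 8},
    auto={"scale_factor": 2.0, "crop": 16},
    general={"scale_factor": 1.5, "opacity": 0.9},
    defaults={"opacity": 1.0, "model": "default", "scale_factor": 1.0},
)
```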
Mouse wheel forwarding
Mouse wheel events (both vertical and horizontal) are now forwarded to the target window.
The overlay translates Qt wheel events into X11 ButtonPress/ButtonRelease events with the appropriate button numbers (4=up, 5=down, 6=left, 7=right) and sends them to the target window. Horizontal scrolling (e.g., on touchpads) is now supported.
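
A small sketch of the button mapping, using the numbers given above. The function name is hypothetical, and the sign convention (positive `dy` scrolls up, positive `dx` scrolls right) is an assumption of this sketch rather than a statement about Qt's or the project's behaviour.

```python
WHEEL_UP, WHEEL_DOWN, WHEEL_LEFT, WHEEL_RIGHT = 4, 5, 6, 7


def wheel_to_x11_events(dx: int, dy: int):
    """Translate a wheel delta into (event_name, button) pairs."""
    buttons = []
    if dy:
        buttons.append(WHEEL_UP if dy > 0 else WHEEL_DOWN)
    if dx:
        buttons.append(WHEEL_RIGHT if dx > 0 else WHEEL_LEFT)
    events = []
    for button in buttons:
        # X11 has no dedicated wheel event: one notch is a press/release pair.
        events.append(("ButtonPress", button))
        events.append(("ButtonRelease", button))
    return events
```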
Wayland scale factor
Added --scale-factor CLI option (float) to compensate for Wayland fractional scaling.
This multiplier is applied when determining the physical monitor geometry, ensuring that the overlay covers the correct area and mouse clicks map accurately. The scale factor can also be set per profile, allowing different scaling needs per application.
Other improvements
- If a log file is specified (`--log-file`), the file now records all messages at DEBUG level and above, regardless of the console log level (which can be set via `-q/--quiet` or `--debug`).
- Moved all validation logic to `utils/validators.py` with a declarative rule system.
- Added an explicit swapchain destroyer and garbage collector call to free Vulkan resources when the application exits.
- Reorganised code for better separation of concerns (config loading, window acquisition, overlay creation, pipeline setup).
- Fixed a slight offset in mouse click coordinates when using `--scale-factor` with letterboxed content.
- The scale factor is now applied correctly in the pipeline when calculating the scaling rectangle, ensuring clicks land exactly where they should.
v0.2.6
Improvements around swap chain creation and Vulkan responses
- The project now depends on a forked version of compushady that exposes `is_suboptimal()`, `is_out_of_date()`, and `needs_recreation()` methods.
- Detect and recover from `VK_SUBOPTIMAL_KHR` and `VK_ERROR_OUT_OF_DATE_KHR` by recreating the swapchain when needed.
- Automatically detect when the captured window is resized or recreated (XID change) and recreate the frame grabber and SRCNN upscaler with the new dimensions.
- Content dimensions are now re-evaluated when the overlay window is resized, ensuring the scaling mode (fit/stretch/cover) adapts correctly.
- Overlay logic updated for crop, content dimensions, target handle, and target size when window properties change, ensuring clicks always map to the correct position in the target window.
- We now reuse the same Lanczos compute object across frames, avoiding per‑frame creation overhead.
- Added detailed timing logs for initialization, frame processing, and resource recreation; added debug logs for coordinate mapping and event forwarding.
- Properly release Vulkan resources before closing X display to prevent segmentation faults.
- Added a failure counter; after 30 consecutive failures, the pipeline stops gracefully instead of looping indefinitely.
- The pipeline thread is now non‑daemon, and `stop()` waits for it to finish before cleaning up, eliminating race conditions on exit.
- Reduced X11 calls for opacity changes to at most once per 100 ms, lowering overhead.
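
The failure counter can be sketched as a loop that resets on success and gives up after 30 failures in a row. `run_pipeline` and its return values are illustrative only; the real loop runs until stopped rather than for a fixed number of frames.

```python
MAX_CONSECUTIVE_FAILURES = 30


def run_pipeline(process_frame, max_frames):
    """Process frames; stop gracefully after too many consecutive failures."""
    failures = 0
    processed = 0
    for _ in range(max_frames):
        try:
            process_frame()
        except Exception:
            failures += 1
            if failures >= MAX_CONSECUTIVE_FAILURES:
                return processed, "stopped"
            continue
        failures = 0  # any success resets the streak
        processed += 1
    return processed, "finished"
```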
v0.2.5
New options to customize overlay, target window area and positioning
- Added `--output-geometry`
- Pure mode keywords (`stretch`, `fit`, `cover`).
- Added `!` for stretch, `^` for cover (e.g., `1920x1080!`, `1920x1080^`).
- Percentage‑based sizing with optional stretch (`50%` = fit, `50%!` = stretch).
- Fixed‑width / fixed‑height accept `!` for stretch (`1920x`, `1920x!`, `x1080`, `x1080!`).
- Added `--background-color` (CSS color names, hex codes).
- Letterbox areas now use the chosen color (previously hardcoded black in HLSL).
- Added `--offset-x` and `--offset-y` to shift the content rectangle from its centered position.
- Added `--crop {top,bottom,left,right}` to remove unwanted borders before upscaling.
- The neural network focuses only on the cropped region, improving quality and performance.
Fixes and internal improvements
- Used Qt6’s native interfaces to obtain the X11 display pointer, improving compatibility (especially on Wayland/XWayland).
- Corrected double upscale (4×) pipeline: first pass now writes to intermediate texture, second pass reads from it.
- Ensured Python `struct.pack` format matches the HLSL constant buffer layout exactly (BGRA order for background colour).
- Fixed signed/unsigned comparison in Lanczos shader to correctly handle negative `dstX`/`dstY` (for `cover` mode).
- Removed hardcoded aspect‑ratio calculation from the shader; the rectangle is now passed directly from the host, making all scaling modes possible.
- Centralised all config default values in a `DEFAULTS` dictionary; reduced repetitive conditionals in argument handling.
- Added geometry syntax validation with clear error messages for invalid `--output-geometry` strings.
- Added crop dimension checks to prevent crashes when crop values exceed window size.
- Resolved “pack expected 13 items” errors by correcting `struct.pack` format.
- Ensured percentage and fixed‑dimension specs now correctly apply `fit` (letterbox) by default, with `!` for stretch.
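
Since the destination rectangle is now computed on the host, a sketch of that calculation helps show why `cover` yields negative origins. `scaling_rect` is a hypothetical helper that ignores offsets and cropping; it assumes the content is centred.

```python
def scaling_rect(src_w, src_h, dst_w, dst_h, mode="fit"):
    """Return (x, y, width, height) of the content inside the output."""
    if mode == "stretch":
        return (0, 0, dst_w, dst_h)
    if mode not in ("fit", "cover"):
        raise ValueError(f"unknown mode: {mode!r}")
    scale_x, scale_y = dst_w / src_w, dst_h / src_h
    # fit: the smaller scale keeps everything visible (letterbox).
    # cover: the larger scale fills the output and overflows one axis.
    scale = min(scale_x, scale_y) if mode == "fit" else max(scale_x, scale_y)
    width, height = round(src_w * scale), round(src_h * scale)
    return ((dst_w - width) // 2, (dst_h - height) // 2, width, height)


fit_rect = scaling_rect(640, 480, 1920, 1080, "fit")
cover_rect = scaling_rect(640, 480, 1920, 1080, "cover")
```

Here the `cover` rectangle starts above the visible area, which is the negative `dstY` case the shader fix addresses.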
v0.2.4
The overlay no longer freezes when the target window is closed, and X11 error spam has been eliminated.
- Fixed overlay freeze after target window closure.
- Eliminated `X protocol error` spam when sending events to a destroyed window.
- Resolved `QMetaObject::invokeMethod` error by using a proper slot and queued connection.
- Ensured the application exits fully without hanging.
Details:
- Added `WindowWatcher` to monitor the target window and automatically stop the pipeline when it's closed.
- The pipeline now sets a `stopped_event` and signals the main thread to quit cleanly via a Qt queued connection. No more frozen overlay.
- Custom error handlers suppress default stderr printing and log X errors (like `BadWindow`) at debug level, preventing console clutter.
- `FrameGrabber.grab()` now checks the C function's return code and raises an exception on failure, ensuring the pipeline stops immediately when the window is gone.
- Normal shutdown messages are now `INFO` instead of `ERROR`, so users see a clean console unless verbose logging is enabled.
- Centralized X11 connection management with proper cleanup in `__del__` and `closeEvent`.
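
The return‑code check and clean stop can be sketched as follows. Everything here is illustrative: `GrabError`, `grab_checked`, and `pipeline_loop` are hypothetical names, a non‑zero return code is assumed to mean failure, and a plain callback stands in for the Qt queued connection.

```python
import threading


class GrabError(RuntimeError):
    """Raised when the capture function reports a failure."""


def grab_checked(c_grab):
    """Call the capture function and turn an error code into an exception."""
    rc = c_grab()
    if rc != 0:  # assumption: 0 means success
        raise GrabError(f"capture failed with code {rc}")


def pipeline_loop(c_grab, stopped_event, on_stopped):
    """Grab frames until stopped or until the window disappears."""
    try:
        while not stopped_event.is_set():
            grab_checked(c_grab)
    except GrabError:
        pass  # target window is gone; fall through to a clean shutdown
    finally:
        stopped_event.set()
        on_stopped()  # stand-in for signalling the main thread to quit
```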