Releases: baronsmv/linux-rt-upscaler

v0.3.0

27 Apr 04:40

Changelog

Introducing tile‑based processing, new hotkeys and options, and a new Vulkan + XCB backend

Tile‑Based Processing

The capture and upscaling pipeline now supports a tile‑based mode that dramatically reduces GPU and CPU work when only a small portion of the source window changes. This mode is configurable from the CLI.

  • The capture C library (src/upscaler/capture/lib/) was refactored into modular components for damage tracking and shared‑memory tile images.
  • Each frame is divided into tiles (configurable tile_size, default 64 px). A fast xxHash64 checksum of every tile is stored. On subsequent captures only tiles whose hash differs are re‑processed.
  • Damage rectangles from the X server are expanded by a context margin (tile_context_margin) to account for the neural network’s receptive field, preventing artifacts at tile boundaries.
  • When the number of changed tiles exceeds max_tile_layers or the changed area fraction exceeds area_threshold, the system falls back to a full‑frame upscale to avoid excessive dispatch overhead.
  • Dedicated tile shaders (e.g., PassX_tile.hlsl / .spv) that operate on Texture2DArray allow batching many tiles in a single GPU dispatch.
  • All tile‑related settings (use_tile_processing, use_damage_tracking, tile_size, tile_context_margin, max_tile_layers, area_threshold) are configurable in the config.yaml file and as CLI options.
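The hash comparison and full‑frame fallback described above can be sketched roughly as follows. This is an illustration only, not the project's C implementation: hashlib.blake2b stands in for xxHash64 (which is not in the Python standard library), the function names are hypothetical, and the changed‑area fraction is approximated by the fraction of changed tiles.

```python
import hashlib
from typing import Dict, List, Tuple

TileKey = Tuple[int, int]  # (tile column, tile row)


def tile_hashes(frame: bytes, width: int, height: int,
                tile_size: int = 64, bpp: int = 4) -> Dict[TileKey, bytes]:
    """Hash every tile of a tightly packed frame (bpp bytes per pixel)."""
    stride = width * bpp
    hashes: Dict[TileKey, bytes] = {}
    for ty in range(0, height, tile_size):
        for tx in range(0, width, tile_size):
            h = hashlib.blake2b(digest_size=8)  # 64-bit stand-in for xxHash64
            row_bytes = min(tile_size, width - tx) * bpp
            for y in range(ty, min(ty + tile_size, height)):
                start = y * stride + tx * bpp
                h.update(frame[start:start + row_bytes])
            hashes[(tx // tile_size, ty // tile_size)] = h.digest()
    return hashes


def changed_tiles(prev: Dict[TileKey, bytes],
                  curr: Dict[TileKey, bytes]) -> List[TileKey]:
    """Tiles whose hash differs from the previous capture."""
    return sorted(key for key, digest in curr.items() if prev.get(key) != digest)


def needs_full_frame(n_changed: int, n_total: int,
                     max_tile_layers: int, area_threshold: float) -> bool:
    """Fall back to a full-frame upscale when too much of the frame changed."""
    return n_changed > max_tile_layers or (n_changed / n_total) > area_threshold
```

A caller would keep the previous frame's hash dictionary, compute the new one after each capture, and dispatch either the changed tiles or the whole frame depending on needs_full_frame.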

New hotkeys

Hotkey support has been extended to cover zooming and panning (offset):

| Hotkey | Action |
|---|---|
| Alt+Shift+Plus / Alt+Shift+Minus | Zoom in / out through predefined levels (50 %, 75 %, 100 %, 150 %, …) |
| Alt+Shift+Up/Down/Left/Right | Pan the upscaled content by a step (25 px by default) |
| Alt+Shift+Escape | Gracefully exit the application |

All hotkeys remain customizable in the configuration file.

Vulkan Backend

Previously, the project relied on a fork of compushady, specifically its Vulkan backend. While it served as a great foundation, its monolithic and unmaintained implementation became a challenge to work with, as some features required deep changes to its architecture.

This release introduces a new C++ Vulkan extension (src/upscaler/vulkan/lib/), tailored to the needs of the upscaler. Using Vulkan directly provides:

  • Resource management with pre‑allocated staging buffers and fence‑based frame pipelining.
  • A custom swapchain manager with its own XCB connection and fence synchronization.
  • Batch methods for tile and slice processing.

All HLSL shaders are now pre‑compiled to SPIR‑V (.spv files) using dxc at build time. The runtime no longer compiles HLSL on the fly, reducing startup latency.
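As a rough illustration of such a build step, the helper below assembles a dxc command line that compiles one HLSL file to SPIR‑V. The -spirv, -T, -E and -Fo flags are standard dxc options; the entry point name, the cs_6_0 profile, the file name and the helper itself are assumptions, not taken from the project's build scripts.

```python
from pathlib import Path
from typing import List


def dxc_command(hlsl: Path, entry: str = "main",
                profile: str = "cs_6_0") -> List[str]:
    """Build a dxc invocation that writes <name>.spv next to <name>.hlsl."""
    spv = hlsl.with_suffix(".spv")
    return [
        "dxc",
        "-spirv",        # emit SPIR-V instead of DXIL
        "-T", profile,   # target profile (assumed: compute shader model 6.0)
        "-E", entry,     # entry point (assumed name)
        str(hlsl),
        "-Fo", str(spv), # output file
    ]
```

A build script could pass the returned list to subprocess.run(cmd, check=True) for every shader in the source tree.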

Migration from Xlib to XCB

All X11 interaction (except the capture C library) has been migrated from python‑xlib to xcffib, the Python binding for XCB. This resolves the X11 thread collisions noticed previously and makes it possible to use separate, asynchronous connections for tasks such as screenshots, queue events and display events.

The now‑unused compushady‑lrtu, ewmh, and xlib dependencies have been removed, replaced by the new custom Vulkan backend and the XCB binding respectively.

Bonus: Early Development Artifacts, or A Brief Tile‑Processing Glitch Gallery

Some FuNNy artifacts from early tiled‑processing development. After many attempts, tile processing is now ready and artifact‑free, but getting there produced some interesting sights.

[Gallery: seven screenshots of early tile‑processing artifacts]

v0.2.10

12 Apr 20:18

Changelog

Introducing a new high‑performance X11 capture method, with significantly better framerates and lower latency

Replaces the previous slow screenshot‑based capture with a native C implementation using:

  • XShm (shared memory) for zero‑copy frame grabs.
  • XDamage extension to detect changed regions and skip static frames.
  • Automatic fallback to XGetImage when SHM is unavailable.

This reduces CPU usage and eliminates unnecessary GPU work when the target window is idle.

Damage‑Aware Frame Skipping

The pipeline now checks the damage rectangles returned by the capture library.

If no pixel changes are detected (and no OSD is active), the upscaling compute passes and Lanczos scaling are skipped, saving GPU power and reducing latency. Some windows may still force an update.
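The skip decision can be pictured as a small predicate. This is a minimal sketch with a hypothetical function name and rectangle layout (x, y, width, height), not the pipeline's actual code.

```python
from typing import Sequence, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height), assumed layout


def should_upscale(damage: Sequence[Rect], osd_active: bool) -> bool:
    """Run the compute passes only if something changed or the OSD is showing."""
    has_damage = any(w > 0 and h > 0 for _, _, w, h in damage)
    return has_damage or osd_active
```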

Performance Improvements (tested with two models)

4x24

| Metric | v0.2.9 | v0.2.10 | Change |
|---|---|---|---|
| Average FPS | 35.7 | 46.8 | +31% |
| 1% Low FPS | 22.5 | 32.9 | +46% |
| 0.1% Low FPS | 20.3 | 30.4 | +50% |
| GPU Load | 46.4% | 58.3% | +12pp |
| Frame Time | 28.0 ms | 21.4 ms | -24% |

8x32

| Metric | v0.2.9 | v0.2.10 | Change |
|---|---|---|---|
| Average FPS | 20.9 | 25.1 | +20% |
| 1% Low FPS | 16.1 | 21.2 | +32% |
| 0.1% Low FPS | 15.0 | 20.6 | +37% |
| GPU Load | 59.8% | 72.3% | +12.5pp |
| Frame Time | 47.7 ms | 39.8 ms | -17% |

Measured with MangoHud on AMD Radeon RX 5600 XT + Intel Core i3-12100F

Changes to the Vulkan backend and new CLI options

  • Added --vulkan-present-mode option with support for:
    • fifo (V‑Sync on, lowest power, no tearing)
    • mailbox (tear‑free, low latency, uncapped FPS)
    • immediate (lowest latency, may tear)
  • Staging buffer pool configurable via --vulkan-buffer-pool-size. Pre‑allocates buffers for partial texture updates, reducing allocation overhead during frequent small uploads.
  • Refactored monolithic vulkan.cpp into modules for better maintainability in the future.
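One way to wire the present‑mode option to Vulkan is a lookup from the CLI string to the VkPresentModeKHR enum value. The numeric values below are the ones defined by the Vulkan specification; the parser layout, the helper name and the fifo default are assumptions made for this sketch.

```python
import argparse
from typing import List

# VkPresentModeKHR values as defined by the Vulkan specification.
PRESENT_MODES = {
    "immediate": 0,  # VK_PRESENT_MODE_IMMEDIATE_KHR
    "mailbox": 1,    # VK_PRESENT_MODE_MAILBOX_KHR
    "fifo": 2,       # VK_PRESENT_MODE_FIFO_KHR
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--vulkan-present-mode",
        choices=sorted(PRESENT_MODES),
        default="fifo",  # assumed default; FIFO is the only mode Vulkan guarantees
    )
    return parser


def present_mode_value(argv: List[str]) -> int:
    """Parse argv and return the matching VkPresentModeKHR integer."""
    args = build_parser().parse_args(argv)
    return PRESENT_MODES[args.vulkan_present_mode]
```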

Bug Fixes and internal improvements

  • Damage rectangle clamping to capture subregion prevents out‑of‑bounds blits.
  • Thread termination now uses non‑blocking queue put_nowait() to avoid hangs on shutdown.
  • Frame grabber explicitly cleaned up during window switches and pipeline stops.
  • Switch queue now drains old requests, keeping only the most recent target window.
  • Added guard against zero‑size scaling rects to prevent invalid GPU dispatches.
  • Fixed potential deadlock between pipeline thread and Qt main thread during swapchain recreation.
  • Fixed double‑destruction of VkSurfaceKHR when swapchain creation fails.
  • Added missing Python exceptions for texture creation failures and missing Vulkan queue families.
  • Corrected staging buffer pool size check to prevent out‑of‑bounds access.
  • Added bounds checking for buffer‑to‑image copies in Resource.copy_to.
  • Improved memory type selection for texture downloads (now uses DEVICE_LOCAL for GPU‑write buffer).
  • XSync added after XShmGetImage to ensure image transfer completion (prevents tearing/artifacts).
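The rectangle clamping and the zero‑size guard from the list above amount to an intersection test. A minimal sketch, assuming (x, y, width, height) tuples and a hypothetical function name:

```python
from typing import Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def clamp_rect(rect: Rect, bounds: Rect) -> Optional[Rect]:
    """Intersect rect with bounds; return None if nothing is left to copy."""
    x, y, w, h = rect
    bx, by, bw, bh = bounds
    x0, y0 = max(x, bx), max(y, by)
    x1, y1 = min(x + w, bx + bw), min(y + h, by + bh)
    if x1 <= x0 or y1 <= y0:
        return None  # zero-size result: skip the blit or dispatch entirely
    return (x0, y0, x1 - x0, y1 - y0)
```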

v0.2.9.post2

11 Apr 06:20

Changelog

See the v0.2.9 release notes for earlier changes.

Post-release: Python 3.14 Support

After testing both the compushady fork and the upscaler with Python 3.14, everything worked as expected. If any issues arise, the Vulkan backend can be adjusted as needed.

  • Added Python 3.14 to the wheel-building matrix.
  • Pre‑install PySide6 using uv in CI to work around cp314 wheel resolution quirks.
  • Updated README with compatibility notes.

v0.2.9.post1

11 Apr 07:26

Changelog

See the v0.2.9 release notes for earlier changes.

Post-release: version dependency bound and better screenshot directory default

  • Added a minimum version with a loose upper bound for compushady-lrtu, to avoid packaging against a previous version.
  • The default screenshot directory is now the user's localized Pictures folder, obtained by calling user_pictures_dir() from the new platformdirs dependency.

v0.2.9

11 Apr 07:12

Changelog

Introducing hotkey support, OSD, lossless screenshots, automatic Wayland scale factor detection and automatic pause events

Hotkey Manager

Added an XCB‑based global hotkey manager (HotkeyManager) that listens for keyboard shortcuts even when the overlay window does not have focus. Supported actions:

  • Alt+Shift+S to toggle overlay visibility
  • Alt+Shift+M to cycle through upscaling models
  • Alt+Shift+G to cycle through the main output geometry modes (fit, stretch, cover)
  • Alt+Shift+P to take a lossless screenshot (pre-Lanczos)

Hotkeys can be customised via the hotkeys section in the YAML configuration file.

On‑Screen Display (OSD)

An overlay message appears when switching models, changing geometry, or taking a screenshot. The OSD is rendered directly on the Vulkan surface and fades out automatically.

Lossless asynchronous screenshots

Pressing the screenshot hotkey captures the raw SRCNN‑upscaled image (pre‑Lanczos) and saves it as a PNG. The save location can be customised with the screenshot_dir configuration option. The capture is performed asynchronously to minimise impact on the rendering pipeline.

Model and geometry switching

The upscaler can now cycle through all upscaling models and main geometries (fit, stretch, and cover) without restarting the application.

Automatic scale-factor detection

The monitor scale factor is now detected automatically using screeninfo, eliminating the need for manual override on Wayland. The --scale-factor CLI flag remains available for cases where auto‑detection is insufficient.

Pause on focus loss or window minimization

When the target window loses focus (and the overlay is in an always‑on‑top mode), the pipeline automatically pauses and hides the overlay. This behaviour can be disabled with --no-focus-pause or the pause_on_focus_loss configuration option.

Other changes

  • All external requests—model switching, geometry cycling, screenshot capture, and OSD display—now use dedicated queue.Queue instances.
  • The main pipeline loop dequeues and processes these requests on the pipeline thread, eliminating race conditions and simplifying cross‑thread communication.
  • The PipelineController helper class encapsulates this logic, keeping the core Pipeline class focused solely on frame processing.
  • compushady-lrtu has been updated to include the native Texture2D.download() method, which performs a two‑stage GPU-CPU readback using a device‑local staging buffer.
  • The screeninfo library is now a required dependency and is used to automatically detect the monitor’s physical scale factor on Wayland.
  • The WindowTracker class was enhanced to accurately detect window minimized state (map_state) and active focus (_NET_ACTIVE_WINDOW). Combined with the refactored pause‑flag logic, the overlay now pauses and resumes when the target window loses/regains focus or is minimized/restored.
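The queue‑based hand‑off described in the first items of this list can be sketched as follows. Other threads put requests into per‑action queues, and the pipeline thread drains them without blocking once per loop iteration. The function and queue names are hypothetical; this is not the PipelineController implementation.

```python
import queue
from typing import Any, Callable, Dict


def process_pending(requests: Dict[str, "queue.Queue[Any]"],
                    handlers: Dict[str, Callable[[Any], None]]) -> int:
    """Drain every request queue without blocking; return how many were handled."""
    handled = 0
    for name, q in requests.items():
        while True:
            try:
                payload = q.get_nowait()
            except queue.Empty:
                break  # nothing left for this action, move on
            handlers[name](payload)  # runs on the calling (pipeline) thread
            handled += 1
    return handled
```

Because every handler runs on the thread that calls process_pending, state touched by the handlers needs no extra locking.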

Fixes

  • Captured images previously exhibited swapped red and blue channels due to a mismatch between Vulkan’s B8G8R8A8_UNORM texture format and PIL’s default RGBA decoder. The fix instructs PIL to decode the raw data as BGRA, producing correct colours in all saved screenshots.
  • The original hotkey implementation used python-xlib with a QSocketNotifier on the same X11 display connection that Qt internally manages via XCB. The new HotkeyManager opens a separate XCB connection and processes events via a dedicated QSocketNotifier, integrating with Qt’s event loop without interference.
  • Configuration overrides containing null values (e.g., scale_factor: null, log_file: null) were incorrectly logged as “unknown configuration key” warnings. The apply_overrides function now properly skips None values after confirming the key exists, eliminating the spurious warnings.
  • A log statement queried information about a non‑existent window before the missing‑window state was handled, causing an uncaught AttributeError. The state is now handled first.
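To make the red/blue swap from the first fix concrete, the sketch below reorders raw BGRA bytes into RGBA in plain Python. The release fixes this by telling PIL to decode the data as BGRA; the function here is only an illustration of the channel order and is not the project's code.

```python
def bgra_to_rgba(data: bytes) -> bytes:
    """Swap the blue and red channels of tightly packed 8-bit BGRA pixels."""
    if len(data) % 4:
        raise ValueError("expected 4 bytes per pixel")
    out = bytearray(data)
    out[0::4] = data[2::4]  # red comes from the third byte of each pixel
    out[2::4] = data[0::4]  # blue comes from the first byte of each pixel
    return bytes(out)
```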

v0.2.8

27 Mar 02:20

Changelog

Introducing a follow‑focus window mode

With the option -f/--follow-focus, the upscaler now automatically switches to the currently focused window. The pipeline adapts to the new window’s size and geometry.

Improvements & Refactoring

  • The code has been reworked for better maintainability, paying down some of the technical debt from previous releases.
  • Pipeline and OverlayWindow now have subpackages with helper classes that take over most of the delegated responsibilities. Each accepts a Config object instead of a long parameter list from the CLI, and each encapsulates most of its internal logic, exposing only its public class.
  • Geometry overlay calculations moved to a dedicated overlay.geometry module.
  • Opacity control moved from pipeline to overlay.
  • Configuration handling encapsulated in a new config subpackage.
  • Window acquisition and tracking logic reorganized under a dedicated window package.

Bug Fixes

  • Fixed swapchain recreation problems caused by more than one reference to the instance. Suboptimal responses are now also ignored.
  • Fixed a bug where the pipeline would fail to recover after a target window resize, causing the overlay to display cropped content.
  • Corrected an early return in _handle_window_change that prevented full geometry updates after the --follow-focus feature was introduced.
  • Adjusted the focus monitor to ignore the overlay window itself.

v0.2.7

22 Mar 02:20

Changelog

Introducing configuration profiles

You can now define named configuration profiles in the YAML config file. Select a profile manually with -p <name> or let the application automatically match a profile based on the window title (exact match, regex, or substring).

Profiles can override any configuration option (crop, model, scale factor, etc.) and follow the priority: CLI arguments > manual profile > auto‑matched profile > general YAML > defaults.
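The priority chain can be modelled with collections.ChainMap, which looks a key up in the first mapping that contains it. This is a sketch under assumptions: the function name is hypothetical, and treating None as "not set" (so an unset CLI flag does not mask lower layers) is a choice made for the example.

```python
from collections import ChainMap
from typing import Any, Dict, Mapping


def resolve_config(cli: Mapping[str, Any],
                   manual_profile: Mapping[str, Any],
                   auto_profile: Mapping[str, Any],
                   general: Mapping[str, Any],
                   defaults: Mapping[str, Any]) -> Dict[str, Any]:
    """Merge layers: CLI > manual profile > auto-matched profile > YAML > defaults."""
    layers = [
        {k: v for k, v in layer.items() if v is not None}  # None means "unset"
        for layer in (cli, manual_profile, auto_profile, general, defaults)
    ]
    return dict(ChainMap(*layers))
```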

Mouse wheel forwarding

Mouse wheel events (both vertical and horizontal) are now forwarded to the target window.

The overlay translates Qt wheel events into X11 ButtonPress/ButtonRelease events with the appropriate button numbers (4=up, 5=down, 6=left, 7=right) and sends them to the target window. Horizontal scrolling (e.g., on touchpads) is now supported.
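The button mapping can be sketched as a small function from wheel deltas to X11 button numbers. The button numbers come from the description above; the sign convention (positive vertical delta means up, positive horizontal delta means left) and the function name are assumptions for this example.

```python
from typing import List

# X11 button numbers used for scrolling, as listed above.
BUTTON_UP, BUTTON_DOWN, BUTTON_LEFT, BUTTON_RIGHT = 4, 5, 6, 7


def wheel_buttons(dx: int, dy: int) -> List[int]:
    """Map a wheel delta to the X11 buttons to press and release."""
    buttons = []
    if dy:
        buttons.append(BUTTON_UP if dy > 0 else BUTTON_DOWN)
    if dx:
        buttons.append(BUTTON_LEFT if dx > 0 else BUTTON_RIGHT)
    return buttons
```

Each returned button would then be sent to the target window as a ButtonPress followed by a ButtonRelease.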

Wayland scale factor

Added --scale-factor CLI option (float) to compensate for Wayland fractional scaling.

This multiplier is applied when determining the physical monitor geometry, ensuring that the overlay covers the correct area and mouse clicks map accurately. The scale factor can also be set per profile, allowing different scaling needs per application.

Other improvements

  • If a log file is specified (--log-file), the file now records all messages at DEBUG level and above, regardless of the console log level (which can be set via -q/--quiet or --debug).
  • Moved all validation logic to utils/validators.py with a declarative rule system.
  • Added an explicit swapchain destroyer and garbage collector call to free Vulkan resources when the application exits.
  • Reorganised code for better separation of concerns (config loading, window acquisition, overlay creation, pipeline setup).
  • Fixed a slight offset in mouse click coordinates when using --scale-factor with letterboxed content.
  • The scale factor is now applied correctly in the pipeline when calculating the scaling rectangle, ensuring clicks land exactly where they should.
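The log‑file behaviour from the first item of this list maps naturally onto per‑handler levels in Python's logging module. A minimal sketch, with a hypothetical function name and logger name:

```python
import logging
from typing import Optional


def configure_logging(log_file: Optional[str] = None,
                      console_level: int = logging.INFO,
                      name: str = "upscaler") -> logging.Logger:
    """Console honours console_level; the log file always records DEBUG and above."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # let handlers do the filtering
    logger.handlers.clear()

    console = logging.StreamHandler()
    console.setLevel(console_level)
    logger.addHandler(console)

    if log_file:
        file_handler = logging.FileHandler(log_file)
        file_handler.setLevel(logging.DEBUG)
        logger.addHandler(file_handler)
    return logger
```

The logger itself stays at DEBUG so that each handler can apply its own threshold.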

v0.2.6

20 Mar 20:49

Improvements around swap chain creation and Vulkan responses

  • The project now depends on a forked version of compushady that exposes is_suboptimal(), is_out_of_date(), and needs_recreation() methods.
  • Detect and recover from VK_SUBOPTIMAL_KHR and VK_ERROR_OUT_OF_DATE_KHR by recreating the swapchain when needed.
  • Automatically detect when the captured window is resized or recreated (XID change), and recreate the frame grabber and SRCNN upscaler with the new dimensions.
  • Content dimensions are now re-evaluated when the overlay window is resized, ensuring the scaling mode (fit/stretch/cover) adapts correctly.
  • Overlay logic updated for crop, content dimensions, target handle, and target size when window properties change, ensuring clicks always map to the correct position in the target window.
  • We now reuse the same Lanczos compute object across frames, avoiding per‑frame creation overhead.
  • Added detailed timing logs for initialization, frame processing, and resource recreation; added debug logs for coordinate mapping and event forwarding.
  • Properly release Vulkan resources before closing X display to prevent segmentation faults.
  • Added a failure counter; after 30 consecutive failures, the pipeline stops gracefully instead of looping indefinitely.
  • The pipeline thread is now non‑daemon, and stop() waits for it to finish before cleaning up, eliminating race conditions on exit.
  • Reduced X11 calls for opacity changes to once per 100 ms, lowering overhead.
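The consecutive‑failure limit from the list above can be sketched as a small counter that resets on success. The class name is hypothetical; only the limit of 30 comes from the release notes.

```python
class FailureCounter:
    """Track consecutive failures and signal when the pipeline should stop."""

    def __init__(self, limit: int = 30) -> None:
        self.limit = limit
        self.count = 0

    def record(self, ok: bool) -> bool:
        """Record one frame result; return True while the pipeline may continue."""
        self.count = 0 if ok else self.count + 1
        return self.count < self.limit
```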

v0.2.5

19 Mar 03:51

New options to customize overlay, target window area and positioning

  • Added --output-geometry:
    • Pure mode keywords (stretch, fit, cover).
    • Suffix ! for stretch, ^ for cover (e.g., 1920x1080!, 1920x1080^).
    • Percentage‑based sizing with optional stretch (50% = fit, 50%! = stretch).
    • Fixed‑width / fixed‑height accept ! for stretch (1920x, 1920x!, x1080, x1080!).
  • Added --background-color (CSS color names, hex codes).
    • Letterbox areas now use the chosen color (previously hardcoded black in HLSL).
  • Added --offset-x and --offset-y to shift the content rectangle from its centered position.
  • Added --crop {top,bottom,left,right} to remove unwanted borders before upscaling.
    • The neural network processes only the cropped region, improving quality and performance.
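The --output-geometry forms listed above could be parsed roughly as follows. This is an illustrative parser, not the project's validator: the Geometry tuple and function name are hypothetical, and rejecting ^ on single‑dimension specs is an assumption based on the forms the notes enumerate.

```python
import re
from typing import NamedTuple, Optional


class Geometry(NamedTuple):
    mode: str                       # "fit", "stretch" or "cover"
    width: Optional[int] = None
    height: Optional[int] = None
    percent: Optional[float] = None


_SIZE = re.compile(r"^(\d+)?x(\d+)?([!^])?$")
_PERCENT = re.compile(r"^(\d+(?:\.\d+)?)%(!)?$")
_SUFFIX_MODE = {None: "fit", "!": "stretch", "^": "cover"}


def parse_output_geometry(spec: str) -> Geometry:
    """Parse keyword, WxH, W x / x H, and percentage specs with ! or ^ suffixes."""
    spec = spec.strip().lower()
    if spec in ("stretch", "fit", "cover"):
        return Geometry(spec)
    m = _PERCENT.match(spec)
    if m:
        return Geometry(_SUFFIX_MODE[m.group(2)], percent=float(m.group(1)))
    m = _SIZE.match(spec)
    if m and (m.group(1) or m.group(2)):
        w, h, suffix = m.groups()
        if suffix == "^" and not (w and h):
            raise ValueError(f"'^' needs both width and height: {spec!r}")
        return Geometry(_SUFFIX_MODE[suffix],
                        int(w) if w else None,
                        int(h) if h else None)
    raise ValueError(f"invalid --output-geometry: {spec!r}")
```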

Fixes and internal improvements

  • Used Qt6’s native interfaces to obtain the X11 display pointer, improving compatibility (especially on Wayland/XWayland).
  • Corrected double upscale (4×) pipeline: first pass now writes to intermediate texture, second pass reads from it.
  • Ensured Python struct.pack format matches the HLSL constant buffer layout exactly (BGRA order for background colour).
  • Fixed signed/unsigned comparison in Lanczos shader to correctly handle negative dstX/dstY (for cover mode).
  • Removed hardcoded aspect‑ratio calculation from the shader; the rectangle is now passed directly from the host, making all scaling modes possible.
  • Centralised all config default values in a DEFAULTS dictionary; reduced repetitive conditionals in argument handling.
  • Added geometry syntax validation with clear error messages for invalid --output-geometry strings.
  • Added crop dimension checks to prevent crashes when crop values exceed window size.
  • Resolved “pack expected 13 items” errors by correcting struct.pack format.
  • Ensured percentage and fixed‑dimension specs now correctly apply fit (letterbox) by default, with ! for stretch.

v0.2.4

17 Mar 02:22

The overlay no longer freezes when the target window is closed, and X11 error spam has been eliminated.

  • Fixed overlay freeze after target window closure.
  • Eliminated X protocol error spam when sending events to a destroyed window.
  • Resolved QMetaObject::invokeMethod error by using a proper slot and queued connection.
  • Ensured the application exits fully without hanging.

Details:

  • Added WindowWatcher to monitor the target window and automatically stop the pipeline when it's closed.
  • The pipeline now sets a stopped_event and signals the main thread to quit cleanly via a Qt queued connection. No more frozen overlay.
  • Custom error handlers suppress default stderr printing and log X errors (like BadWindow) at debug level, preventing console clutter.
  • FrameGrabber.grab() now checks the C function's return code and raises an exception on failure, ensuring the pipeline stops immediately when the window is gone.
  • Normal shutdown messages are now INFO instead of ERROR, so users see a clean console unless verbose logging is enabled.
  • Centralized X11 connection management with proper cleanup in __del__ and closeEvent.