52 changes: 17 additions & 35 deletions ROADMAP.md
@@ -3,29 +3,25 @@
## === LONG TERM ====

### 1. RetroArch Shader Support
Verify compatibility with RetroArch shader presets
Ongoing compatibility with RetroArch shader presets (currently 89% — 1702/1906 presets compile)

**Rule**: Fix issues in our filter chain system (`src/render/filter_chain/`), not in shader files themselves
**Rule**: Fix issues in our filter chain system (`filter-chain/`), not in shader files themselves

**Reference**: https://github.com/libretro/slang-shaders/blob/master/spec/SHADER_SPEC.md

- [ ] (Known issues to be populated as discovered)

### 2. Latency Optimization
Minimize end-to-end latency while maintaining code quality

**Example Areas** (tasks uncertain, use Tracy for measurement):
- Compositor frame delivery latency
- DMA-BUF import overhead
- Shader pass execution time
- Uniform buffer updates
- Render pass barriers/transitions
- Queue submission overhead
- CPU-GPU synchronization points
**Instrumented** (measurement infra in place): compositor latency tracking, GPU timestamps per pass, Tracy CPU zones, frame pacing

**Known Issues**:

- [ ] `device.waitIdle()` on every frame import (`external_frame_importer.cpp`) — full GPU pipeline stall. Replace with queue-level waits or timeline semaphores
- [ ] No Presentation Time protocol feedback loop for monitor-sync frame pacing

## === Phase 1: Fundamental Infrastructure & Compositor Frame Delivery ===

This roadmap covers core infrastructure work focused on establishing robust compositor-based frame capture and shader processing capabilities.
Core infrastructure for robust compositor-based frame capture and shader processing.

---

@@ -34,47 +30,33 @@ Prevent regressions in filter chain when adding new features

- [x] Catch SPIR-V compilation errors early
- [x] Report shader compilation failures with diagnostics
- [ ] Golden image generation for reference outputs
- [ ] Comparison against golden images (pixel-by-pixel or perceptual)
- [ ] Automated regression detection for various shader presets
- [ ] Automated test runner for shader validation
- [x] Golden image generation for reference outputs
- [x] Comparison against golden images (pixel-by-pixel and SSIM)
- [x] Automated regression detection for shader presets (14 curated upstream presets)
- [x] Automated test runner for shader validation
- [ ] Enable visual tests in CI — remove `DISABLED` guard in `tests/CMakeLists.txt`, set `GOGGLES_INCLUDE_VISUAL_TESTS=1` in CI lane, enable Git LFS in checkout

### 2. Tracy Profiling Improvements

- [ ] Add Tracy GPU profiling support (Vulkan)
- [x] Single-process Tracy timeline profiling, [context](https://github.com/wolfpld/tracy/issues/822)

### 3. Error Traceback Integration

- [ ] Integrate cpptrace for stack traces on errors
- [ ] Hook into existing error handling (`tl::expected`)
- [ ] Configure for debug/release builds

---

### 4. Compositor Protocol Completeness
### 3. Compositor Protocol Completeness
Extend nested compositor to support broader app compatibility

**Current State**: Headless wlroots compositor with XDG Shell, XWayland, basic input
**Current State**: Headless wlroots compositor with XDG Shell, XWayland, Layer Shell, pointer constraints, relative pointer, DMA-BUF explicit sync

**Missing Capabilities** (blocking specific app types):

- [ ] **Layer Shell** (`wlr_layer_shell_v1`) - Game launcher overlays (Steam, Epic), desktop panels
- [x] **Layer Shell** (`wlr_layer_shell_v1`) - Game launcher overlays (Steam, Epic), desktop panels
- [ ] **Presentation Time** (`wlr_presentation_time`) - Frame pacing, tear-free presentation, VRR
- [ ] **Data Device** (`wl_data_device_manager`) - Clipboard and drag-and-drop for launchers
- [ ] **DRM Lease** (`wlr_drm_lease_v1`) - VR applications (SteamVR)
- [ ] **Idle Inhibit** (`zwlr_idle_inhibit_v1`) - Prevent screensaver during video playback
- [ ] **Touch Input** (`wlr_touch`) - Mobile/touchscreen game ports
- [ ] **Text Input** (`zwlr_text_input_v3`) - IME support for CJK languages

**Nice-to-Have Enhancements**:

- [ ] **Primary Selection** (`zwlr_primary_selection_v1`) - Middle-click paste
- [ ] **Output Management** (`wlr_output_manager_v1`) - Multi-monitor display configuration
- [ ] **Fractional Scaling** (`wp_fractional_scale_v1`) - HiDPI text rendering
- [ ] **Tablet/Stylus** (`wlr_tablet_tool`) - Drawing applications
- [ ] **Session Lock** (`ext_session_lock_manager_v1`) - Screen locker support
- [ ] **Gamma Control** (`wlr_gamma_control_manager`) - Color management
- [ ] **xdg-activation** - Window focus tokens for multi-window launchers
- [ ] **Keyboard Shortcuts Inhibit** (`zwp_keyboard_shortcuts_inhibit_v1`) - Global hotkeys
- [ ] **Tearing Control** (`wp_tearing_control_v1`) - Reduced latency mode
119 changes: 119 additions & 0 deletions docs/policies/boundary.md
@@ -0,0 +1,119 @@
# Boundary Layer

Contracts at module interfaces — what data crosses, who owns it, what the receiver may assume.

**Governing principles:** Data flows down, events flow up (#1). Ownership is in the types (#3). Napkin dependency graph (#2).

---

### BOUND.modules.001 — No forwarding accessors

**Guard:** Adding a public method that returns a mutable or const reference to an owned member, or a method that delegates to an owned member without adding logic.
**Safety:** A type must not expose its internal components via accessor methods that let callers reach through to manipulate internals. If callers need functionality from an internal component, either: (a) the owning type provides a method that encapsulates the operation, or (b) the internal component should be a sibling, not a child — restructure the ownership tree so both are independently accessible.

<details>

**Violation:** Callers couple to internal structure. Refactoring the owner's internals breaks all callers. The owner becomes a facade that adds no value — just an indirection layer.
**Escapes:** Static tools cannot distinguish a legitimate accessor (exposing a stable interface) from a leaky abstraction (exposing implementation details). The test is: would callers break if the internal type changed? If yes, it's leaky.
**Locked:** No `component()` style accessors that return references to owned members. No pass-through methods that add no logic.
**Free:** Accessors for value-type properties (e.g., `width()`, `format()`). Methods that genuinely encapsulate multi-step operations on internals.

</details>
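
A minimal sketch of option (a) from the Safety clause, assuming a hypothetical owner `Output` with a hypothetical internal `Scaler` (neither name comes from the codebase). The owner exposes value-type properties and one encapsulating operation, and never hands out a reference to the component it owns.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical internal component, used only for illustration.
class Scaler {
public:
    void set_factor(std::uint32_t factor) { m_factor = factor; }
    std::uint32_t apply(std::uint32_t value) const { return value * m_factor; }

private:
    std::uint32_t m_factor = 1;
};

// Hypothetical owner. Callers see the operation they need, not the Scaler.
class Output {
public:
    Output(std::uint32_t width, std::uint32_t height)
        : m_width(width), m_height(height), m_scaled_width(width), m_scaled_height(height) {}

    // Free: accessors for value-type properties.
    std::uint32_t width() const { return m_width; }
    std::uint32_t height() const { return m_height; }
    std::uint32_t scaled_width() const { return m_scaled_width; }
    std::uint32_t scaled_height() const { return m_scaled_height; }

    // Free: a method that encapsulates a multi-step operation on internals
    // (validate, update the component, recompute derived state).
    bool set_scale(std::uint32_t factor) {
        if (factor == 0 || factor > 8) {
            return false;  // rejected; previous state is kept
        }
        m_scaler.set_factor(factor);
        m_scaled_width = m_scaler.apply(m_width);
        m_scaled_height = m_scaler.apply(m_height);
        return true;
    }

    // Locked (would violate this rule): Scaler& scaler() { return m_scaler; }

private:
    std::uint32_t m_width;
    std::uint32_t m_height;
    std::uint32_t m_scaled_width;
    std::uint32_t m_scaled_height;
    Scaler m_scaler;
};
```

If `Scaler` were later replaced by a different internal type, callers of `set_scale()` would not need to change, which is the test the Escapes clause describes.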

### BOUND.modules.002 — Typed events, not callbacks

**Guard:** Adding inter-module communication via `std::function` callbacks, callback registration methods (`set_*_callback`), or lambda wiring.
**Safety:** Subsystems communicate through typed event structs emitted on the source object. Listeners hold RAII connection objects that auto-disconnect on destruction. Events are type-safe (subscribe to a specific struct type, not a string or enum). No `std::function` member variables for inter-module communication. No callback registration methods.

<details>

**Violation:** Callback spaghetti — a mediator registers N callbacks between N subsystems, creating hidden coupling. Callback lifetimes are manual (dangling callbacks if listener is destroyed). Adding a new event type requires modifying the mediator.
**Escapes:** Static tools cannot distinguish a callback used for internal strategy (fine) from one used for inter-module communication (bad). The distinction is scope: callbacks within a single module are acceptable; callbacks that cross module boundaries should be typed events.
**Locked:** Typed event structs for cross-module communication. RAII connections for listener lifecycle. No `set_*_callback` registration patterns for inter-module communication.
**Free:** Internal use of callbacks within a module (e.g., lambda passed to an algorithm). Event struct field design. Whether events are emitted synchronously or queued.

</details>
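
One possible shape for typed events with RAII connections, as a sketch only. `Signal`, `Connection`, and `SurfaceResized` are hypothetical names, not the project's event API. The sketch uses `std::function` inside the signal implementation, which falls under the "internal use" allowance; the public surface is the event struct and the connection object. It is single-threaded by assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// RAII handle: disconnects the listener when destroyed.
class Connection {
public:
    Connection() = default;
    explicit Connection(std::function<void()> disconnect)
        : m_disconnect(std::move(disconnect)) {}
    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;
    Connection(Connection&& other) : m_disconnect(std::move(other.m_disconnect)) {
        other.m_disconnect = nullptr;
    }
    Connection& operator=(Connection&& other) {
        if (this != &other) {
            reset();
            m_disconnect = std::move(other.m_disconnect);
            other.m_disconnect = nullptr;
        }
        return *this;
    }
    ~Connection() { reset(); }

    void reset() {
        if (m_disconnect) {
            m_disconnect();
            m_disconnect = nullptr;
        }
    }

private:
    std::function<void()> m_disconnect;
};

// One signal per event struct type; listeners subscribe to the type itself.
template <typename Event>
class Signal {
public:
    using Handler = std::function<void(const Event&)>;

    Signal() = default;
    Signal(const Signal&) = delete;
    Signal& operator=(const Signal&) = delete;

    [[nodiscard]] Connection connect(Handler handler) {
        const std::size_t id = m_state->next_id++;
        m_state->slots.push_back(Slot{id, std::move(handler)});
        // weak_ptr keeps disconnect safe even if the signal dies first.
        std::weak_ptr<State> weak = m_state;
        return Connection([weak, id] {
            if (auto state = weak.lock()) {
                auto& slots = state->slots;
                for (auto it = slots.begin(); it != slots.end(); ++it) {
                    if (it->id == id) {
                        slots.erase(it);
                        break;
                    }
                }
            }
        });
    }

    void emit(const Event& event) const {
        // Iterate over a copy so a handler may disconnect during emission.
        const auto snapshot = m_state->slots;
        for (const auto& slot : snapshot) {
            slot.handler(event);
        }
    }

private:
    struct Slot {
        std::size_t id;
        Handler handler;
    };
    struct State {
        std::size_t next_id = 0;
        std::vector<Slot> slots;
    };
    std::shared_ptr<State> m_state = std::make_shared<State>();
};

// Hypothetical typed event struct.
struct SurfaceResized {
    int width;
    int height;
};
```

A listener stores the returned `Connection` as a member, so destroying the listener removes its handler without any mediator being involved.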

### BOUND.modules.003 — No cross-module type leakage

**Guard:** Importing types from one module's namespace into another module's public interface.
**Safety:** A module's public interface must not reference types from peer modules. Boundary types shared across modules must live in a shared location (e.g., `util/`). If module A needs to display information from module B, B defines a snapshot/DTO type in a shared location, not in B's internal namespace.

<details>

**Violation:** UI module imports compositor types directly — changes to compositor internals break the UI. Circular dependencies between modules. "Include what you use" becomes "include everything."
**Escapes:** Static tools can detect circular includes but cannot judge whether a cross-module type reference is appropriate or leaky. The judgment is: does the referenced type belong to the module's public contract, or is it an internal type that happens to be accessible?
**Locked:** No peer-module type references in public interfaces. Shared types live in shared locations.
**Free:** Internal use of types within a module. Which types are shared vs module-internal.

</details>

### BOUND.render.001 — DMA-BUF frame ownership transfer

**Guard:** Modifying `ExternalImageFrame`, its production in the compositor, or its consumption in the render pipeline.
**Safety:** DMA-BUF handle ownership transfers via `UniqueFd::dup()` — producer retains its copy, consumer owns the duplicate. `sync_fd` (if present) transfers with the frame via the same dup mechanism. Receiver must not assume `sync_fd` is always present. Single-plane DMA-BUF only.

<details>

**Violation:** Double-close on DMA-BUF fd (shared instead of duplicated). Vulkan import fails on multi-plane buffer.
**Escapes:** Ownership transfer is semantic — no tool verifies that `dup()` is used rather than raw fd copy.
**Locked:** Ownership model (dup, not transfer). Single-plane requirement.
**Free:** Adding fields to the frame type. Internal rendering that produces the buffer.

</details>
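
The dup-based ownership model can be sketched as follows. This is a stand-in, not the project's `UniqueFd`: the class body, the `FrameHandles` struct (standing in for `ExternalImageFrame`), and `make_consumer_copy` are assumptions for illustration. Only POSIX `fcntl` and `close` are used.

```cpp
#include <cassert>
#include <fcntl.h>
#include <optional>
#include <unistd.h>
#include <utility>

// Stand-in for a move-only fd owner with an explicit dup().
class UniqueFd {
public:
    UniqueFd() = default;
    explicit UniqueFd(int fd) : m_fd(fd) {}
    UniqueFd(const UniqueFd&) = delete;
    UniqueFd& operator=(const UniqueFd&) = delete;
    UniqueFd(UniqueFd&& other) noexcept : m_fd(std::exchange(other.m_fd, -1)) {}
    UniqueFd& operator=(UniqueFd&& other) noexcept {
        if (this != &other) {
            reset();
            m_fd = std::exchange(other.m_fd, -1);
        }
        return *this;
    }
    ~UniqueFd() { reset(); }

    bool valid() const { return m_fd >= 0; }
    int get() const { return m_fd; }

    // Returns a new descriptor referring to the same underlying object.
    // The result is invalid if this fd is invalid or duplication fails.
    UniqueFd dup() const {
        if (!valid()) {
            return UniqueFd{};
        }
        return UniqueFd{::fcntl(m_fd, F_DUPFD_CLOEXEC, 0)};
    }

    void reset() {
        if (m_fd >= 0) {
            ::close(m_fd);
            m_fd = -1;
        }
    }

private:
    int m_fd = -1;
};

// Hypothetical frame fields; sync_fd is optional by contract.
struct FrameHandles {
    UniqueFd dmabuf_fd;
    std::optional<UniqueFd> sync_fd;
};

// Producer keeps its handles; the consumer receives independent duplicates,
// so each side closes exactly one descriptor per handle.
FrameHandles make_consumer_copy(const FrameHandles& producer) {
    FrameHandles copy;
    copy.dmabuf_fd = producer.dmabuf_fd.dup();
    if (producer.sync_fd) {
        copy.sync_fd = producer.sync_fd->dup();
    }
    return copy;
}
```

Because copying is deleted, sharing a raw fd between two owners requires deliberately bypassing the type, which makes the double-close violation harder to write by accident.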

### BOUND.render.002 — Vulkan result checking beyond semgrep scope

**Guard:** Adding or modifying Vulkan API calls that return `vk::Result`.
**Safety:** All `vk::Result` returns must be checked explicitly. Use `VK_TRY(call, code, msg)` for propagation. Semgrep only catches `static_cast<void>(waitIdle())` — all other unchecked results escape the hard gate.

<details>

**Violation:** Vulkan call silently fails; subsequent operations use invalid state.
**Escapes:** Semgrep rule only matches one specific pattern. General result checking is semantic.
**Locked:** Explicit result checking for all Vulkan calls. `VK_TRY` as the standard propagation mechanism.
**Free:** Error codes and messages. Whether cleanup paths log or propagate.

</details>
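
For readers unfamiliar with the propagation pattern, here is a sketch of how a `VK_TRY`-style macro could work. The project's actual macro definition is not shown in this diff, so everything below is assumed: `VK_TRY_SKETCH`, the stand-in `Result` enum (in place of `vk::Result`), `ErrorCode`, `Error`, and the `std::optional<Error>` return type.

```cpp
#include <cassert>
#include <optional>
#include <string>

enum class Result { eSuccess, eErrorDeviceLost };                  // stand-in for vk::Result
enum class ErrorCode { vulkan_init_failed, vulkan_device_lost };   // hypothetical codes

struct Error {
    ErrorCode code;
    std::string message;
};

// Evaluates `call` once; on anything other than success, returns an Error
// from the enclosing function. The enclosing function must return
// std::optional<Error> in this sketch.
#define VK_TRY_SKETCH(call, code, msg)                    \
    do {                                                  \
        if (const Result vk_try_result = (call);          \
            vk_try_result != Result::eSuccess) {          \
            return Error{(code), (msg)};                  \
        }                                                 \
    } while (false)

// Stand-in for a Vulkan call that returns a result code.
Result fake_call(bool fail) {
    return fail ? Result::eErrorDeviceLost : Result::eSuccess;
}

std::optional<Error> submit_frame(bool fail_submit, int& steps_run) {
    VK_TRY_SKETCH(fake_call(false), ErrorCode::vulkan_init_failed, "begin failed");
    ++steps_run;
    VK_TRY_SKETCH(fake_call(fail_submit), ErrorCode::vulkan_device_lost, "submit failed");
    ++steps_run;  // not reached when submit fails
    return std::nullopt;
}
```

The point of the pattern is that every result is inspected at the call site, so a failed call cannot be followed by operations on invalid state.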

### BOUND.render.003 — Vulkan destruction ordering

**Guard:** Modifying Vulkan object destruction logic. Adding new Vulkan objects.
**Safety:** Vulkan objects must be destroyed in dependency order. The GPU must be idle, or the relevant fence waited on, before in-use objects are destroyed. Destruction order should follow from the ownership tree (principle #5), not from manual sequencing.

<details>

**Violation:** Validation layer error on destroying a referenced object. GPU crash from destroying a pipeline while in use.
**Escapes:** Semgrep enforces manual destruction (no RAII wrappers), but cannot verify destruction ORDER.
**Locked:** Dependency-ordered destruction. GPU idle before destroying in-use objects.
**Free:** Which specific Vulkan objects exist. Internal organization of destroy calls.

</details>

### BOUND.code.001 — Comment and documentation rules

**Guard:** Adding or modifying comments or Doxygen docstrings in any C++ source file.
**Safety:** Comments must explain non-obvious why, constraints, workarounds, or invariants. Comments must NOT narrate obvious what, provide step-by-step tutorials, or include LLM-verbose justifications. Doxygen `///` is required only when the declaration alone is insufficient.

<details>

**Violation:** Codebase bloated with narration comments that obscure real invariant documentation.
**Escapes:** No static tool can distinguish useful constraint comments from narration.
**Locked:** "Why, not what" principle. Doxygen restricted to declarations where the type signature is insufficient.
**Free:** Comment phrasing. Whether a particular line needs a comment.

</details>

### BOUND.headers.001 — Pragma once in headers

**Guard:** Creating new header files (`.hpp`, `.h`) in `src/` or `tests/`.
**Safety:** All headers must use `#pragma once`. No `#ifndef`/`#define`/`#endif` include guards.

<details>

**Violation:** Double-inclusion causing redefinition errors.
**Escapes:** No hard gate enforces `#pragma once` presence.
**Locked:** `#pragma once` as the sole include guard mechanism.
**Free:** Header file naming. Header content beyond the guard.

</details>
53 changes: 53 additions & 0 deletions docs/policies/liveness.md
@@ -0,0 +1,53 @@
# Liveness Layer

Capabilities that must remain reachable. Pipeline must not stall. Deadlock-freedom.

**Governing principles:** Lifecycle follows the dependency graph (#5). No mediator objects (#4).

---

### LIVE.render.001 — Shader pipeline recordable during hot-reload

**Guard:** Modifying shader hot-reload, pipeline slot management, or the active/pending swap logic.
**Safety:** The active shader pipeline must always be recordable (or empty). Async compilation must never block the render loop. Failed compilation must not corrupt the active pipeline. Render loop calls `record()` every frame — this must complete in bounded time.

<details>

**Liveness:** Render loop never blocks on compilation. Failed reload preserves previous pipeline. Retired pipelines outlive in-flight GPU work.
**Violation:** Render stall during compile. Black frame after failed reload. Use-after-free on retired pipeline referenced by in-flight frame.
**Escapes:** "Active always valid" depends on swap logic never removing active without a ready replacement. This is control flow. The retire delay must exceed frames in flight — a numeric relationship not checkable by tools.
**Locked:** Non-blocking compilation. Retire delay before destruction. Failed reload preserves active.
**Free:** Compilation mechanism. Retire delay value. Number of retired slots.
**Related:** RES.render.003

</details>
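
The active/pending/retired relationship can be illustrated with a small single-threaded sketch. `PipelineSlots`, `Pipeline`, and the method names are hypothetical; a real implementation would also need to synchronize the hand-off from the compile thread, and the caller is assumed to pass a retire delay larger than the number of frames in flight.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

struct Pipeline {
    int id;  // stand-in for a compiled shader pipeline
};

class PipelineSlots {
public:
    explicit PipelineSlots(std::uint64_t retire_delay_frames)
        : m_retire_delay(retire_delay_frames) {}

    // Called when an async compile finishes. nullptr means the compile
    // failed; the active pipeline is left untouched in that case.
    void on_compile_finished(std::unique_ptr<Pipeline> result) {
        if (result) {
            m_pending = std::move(result);
        }
    }

    // Called once per frame by the render loop. Never waits on compilation.
    // Returns the pipeline to record with, or nullptr if none exists yet.
    const Pipeline* begin_frame() {
        ++m_frame;
        if (m_pending) {
            if (m_active) {
                // Keep the old pipeline alive until in-flight frames finish.
                m_retired.push_back(Retired{std::move(m_active), m_frame + m_retire_delay});
            }
            m_active = std::move(m_pending);
        }
        // Destroy retired pipelines whose delay has elapsed.
        m_retired.erase(std::remove_if(m_retired.begin(), m_retired.end(),
                                       [this](const Retired& r) {
                                           return r.destroy_at_frame <= m_frame;
                                       }),
                        m_retired.end());
        return m_active.get();
    }

    std::size_t retired_count() const { return m_retired.size(); }

private:
    struct Retired {
        std::unique_ptr<Pipeline> pipeline;
        std::uint64_t destroy_at_frame;
    };

    std::uint64_t m_retire_delay;
    std::uint64_t m_frame = 0;
    std::unique_ptr<Pipeline> m_active;
    std::unique_ptr<Pipeline> m_pending;
    std::vector<Retired> m_retired;
};
```

The swap only ever replaces the active pipeline with a ready one, so a failed reload leaves the previous pipeline in place, and each per-frame call does a bounded amount of work.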

### LIVE.render.002 — Frame pipeline must advance

**Guard:** Modifying the acquire → record → submit → present cycle.
**Safety:** Frame slots are independent — no circular dependency between them. Each slot has its own fence, semaphore, and command buffer. Round-robin advancement ensures no slot waits on another slot's data (beyond the GPU fence signal). At least one present mode must be selectable (FIFO is mandatory per Vulkan spec).

<details>

**Liveness:** No circular fence dependency. No deadlock from frame pipeline stall.
**Violation:** Deadlock from circular wait. Black screen from pipeline stall.
**Escapes:** Circular dependency freedom depends on the round-robin design. A refactor adding cross-slot data dependency could introduce a cycle. No tool verifies deadlock-freedom of a frame pipeline.
**Locked:** Independent frame slots. Round-robin advancement. FIFO as fallback present mode.
**Free:** Number of frame slots. Present mode preference. FPS pacing mechanism.

</details>
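
A sketch of the two locked properties, using stand-ins rather than Vulkan types: `PresentMode` replaces `vk::PresentModeKHR`, a `bool` replaces the per-slot fence, and `FrameRing`, `FrameSlot`, and `choose_present_mode` are hypothetical names. Real code would wait on the slot's fence instead of returning `nullptr`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

enum class PresentMode { immediate, mailbox, fifo };  // stand-in enum

// Falls back to FIFO, which the Vulkan spec requires every surface to support.
PresentMode choose_present_mode(PresentMode preferred,
                                const std::vector<PresentMode>& available) {
    const bool supported =
        std::find(available.begin(), available.end(), preferred) != available.end();
    return supported ? preferred : PresentMode::fifo;
}

struct FrameSlot {
    bool fence_signaled = true;  // stand-in for a fence created in the signaled state
    int frames_recorded = 0;
};

template <std::size_t N>
class FrameRing {
public:
    // Returns the next slot in round-robin order, or nullptr if that slot's
    // own fence is still pending. A slot never inspects another slot's state.
    FrameSlot* try_acquire() {
        FrameSlot& slot = m_slots[m_next];
        if (!slot.fence_signaled) {
            return nullptr;
        }
        slot.fence_signaled = false;  // reset before the next submit
        ++slot.frames_recorded;
        m_next = (m_next + 1) % N;
        return &slot;
    }

    // Stand-in for the GPU signaling completion of a slot's work.
    void signal(std::size_t index) { m_slots[index].fence_signaled = true; }

private:
    std::array<FrameSlot, N> m_slots{};
    std::size_t m_next = 0;
};
```

Since each slot depends only on its own fence, there is no cycle of waits between slots; a refactor that made one slot read another slot's data would break that property.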

### LIVE.compositor.001 — Event loop terminable

**Guard:** Modifying compositor thread lifecycle, the event loop, or shutdown logic.
**Safety:** `wl_display_terminate()` causes the event loop to return. Thread join must not deadlock. The compositor thread must not block indefinitely on any operation that prevents it from checking the terminate flag. Shutdown must complete in bounded time.

<details>

**Liveness:** Compositor always terminable. Application shutdown completes in bounded time.
**Violation:** Application hangs on shutdown — thread never joins.
**Escapes:** If the compositor thread blocks in a long operation that doesn't return to the event loop, it won't see the terminate flag. No tool verifies all code paths return to dispatch.
**Locked:** `wl_display_terminate()` as shutdown signal. Thread join in destructor. No infinite blocking in compositor thread.
**Free:** Event loop iteration frequency. Timer sources. Work done per iteration.

</details>