---
id: LSN-0027
ticket: perf-host-debug-overlay-isolation
title: Host Debug Overlay Isolation
created: 2026-04-10
tags: [performance, host, gfx, telemetry]
---

# Host Debug Overlay Isolation

The PROMETEU debug overlay (HUD) was decoupled from the emulated machine pipeline and moved to the Host layer to ensure measurement purity and architectural separation.

## The Original Problem

The debug overlay used to be rendered by injecting pixels directly into the emulated GFX pipeline during the logical frame execution. This caused several issues:

- **Performance Distortion:** Cycle measurements for certification included the overhead of formatting technical strings and performing extra draw calls.
- **Leaky Abstraction:** The emulated machine became aware of Host-only inspection needs.
- **GFX Coupling:** The HUD was "burned" into the emulated framebuffer, making it impossible to capture raw game frames without the overlay while technical debugging was active.

## The Solution: Host-Side Rendering with Atomic Telemetry

The implemented solution follows a strictly non-intrusive approach:

1. **Atomic Telemetry (Push-based):** A new `AtomicTelemetry` structure was added to the HAL. It uses `AtomicU64`, `AtomicU32`, and `AtomicUsize` to track metrics (Cycles, Memory, Logs) in real-time.
2. **Runtime Decoupling:** The `VirtualMachineRuntime` updates these atomic counters during its `tick` loop only if `inspection_active` is enabled. It does not perform any rendering or string formatting.
3. **Host-Side HUD:** The `HostRunner` (in `prometeu-host-desktop-winit`) now takes a `snapshot()` of the atomic telemetry and renders the HUD as a native layer after the emulated machine has finished its work for the tick.

## Impact and Benefits

- **Zero Machine Overhead:** Rendering the HUD consumes Host CPU/GPU cycles but does not affect the emulated machine's cycle counter or logical behavior.
- **Fidelity:** The emulated framebuffer remains pure, containing only game pixels.
- **Responsive Telemetry:** By using atomics, the Host can read the most recent metrics at any time without waiting for frame boundaries or acquiring heavy read-locks on the runtime state.
- **Platform Agnosticism:** Non-desktop hosts (which do not need the overlay) do not pay any implementation cost or performance penalty for the HUD's existence.

## Lessons Learned

- **Decouple Data from View:** Even for internal debugging tools, keeping the data collection (Runtime) separate from the visualization (Host) is crucial for accurate profiling.
- **Atomic Snapshots are Sufficient:** For high-frequency HUD updates, eventual consistency via relaxed atomic loads is more than enough and significantly more performant than synchronizing via Mutexes or logical frame boundaries.
- **Late Composition:** Composition of technical layers should always happen at the latest possible stage of the display pipeline to avoid polluting the core simulation state.
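
The push-based telemetry flow described under "The Solution" can be illustrated with a minimal Rust sketch. It is not the actual PROMETEU code: only the names `AtomicTelemetry`, `snapshot()`, `tick`, and `inspection_active` come from this note, while the field names, `TelemetrySnapshot`, `SketchRuntime`, and `format_hud` are hypothetical stand-ins chosen for illustration.

```rust
use std::sync::atomic::{AtomicU32, AtomicU64, AtomicUsize, Ordering};
use std::sync::Arc;

/// Lock-free counters shared by the runtime (writer) and the host (reader).
/// Field names are assumptions; the note only lists the atomic types used.
#[derive(Default)]
pub struct AtomicTelemetry {
    cycles: AtomicU64,
    log_count: AtomicU32,
    memory_bytes: AtomicUsize,
}

/// Plain-value copy of the counters (hypothetical type) that the host can
/// format and draw without touching runtime state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TelemetrySnapshot {
    pub cycles: u64,
    pub log_count: u32,
    pub memory_bytes: usize,
}

impl AtomicTelemetry {
    /// Reads each counter with a relaxed load. The fields are read one by
    /// one, so values may come from slightly different moments.
    pub fn snapshot(&self) -> TelemetrySnapshot {
        TelemetrySnapshot {
            cycles: self.cycles.load(Ordering::Relaxed),
            log_count: self.log_count.load(Ordering::Relaxed),
            memory_bytes: self.memory_bytes.load(Ordering::Relaxed),
        }
    }
}

/// Hypothetical stand-in for the emulated machine runtime.
pub struct SketchRuntime {
    telemetry: Arc<AtomicTelemetry>,
    inspection_active: bool,
}

impl SketchRuntime {
    pub fn new(telemetry: Arc<AtomicTelemetry>, inspection_active: bool) -> Self {
        Self { telemetry, inspection_active }
    }

    /// One tick: the emulated work would run here. Counters are pushed only
    /// when inspection is active; no formatting or drawing happens.
    pub fn tick(&mut self, cycles_used: u64, memory_bytes: usize, new_logs: u32) {
        if self.inspection_active {
            self.telemetry.cycles.fetch_add(cycles_used, Ordering::Relaxed);
            self.telemetry.memory_bytes.store(memory_bytes, Ordering::Relaxed);
            self.telemetry.log_count.fetch_add(new_logs, Ordering::Relaxed);
        }
    }
}

/// Host-side string formatting, kept out of the runtime (hypothetical helper).
pub fn format_hud(s: &TelemetrySnapshot) -> String {
    format!(
        "cycles: {} | mem: {} B | logs: {}",
        s.cycles, s.memory_bytes, s.log_count
    )
}
```

In this sketch the host would call `snapshot()` and `format_hud` after the runtime's tick has returned, so all string formatting cost stays on the host side. Because each field is loaded independently, a snapshot is not guaranteed to be consistent across fields, which matches the eventual-consistency trade-off noted in the lessons above.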