| id | ticket | title | created | tags |
|---|---|---|---|---|
| LSN-0027 | perf-host-debug-overlay-isolation | Host Debug Overlay Isolation | 2026-04-10 | |
Host Debug Overlay Isolation
The PROMETEU debug overlay (HUD) was decoupled from the emulated machine pipeline and moved to the Host layer to ensure measurement purity and architectural separation.
The Original Problem
The debug overlay used to be rendered by injecting pixels directly into the emulated GFX pipeline during the logical frame execution. This caused several issues:
- Performance Distortion: Cycle measurements for certification included the overhead of formatting technical strings and performing extra draw calls.
- Leaky Abstraction: The emulated machine became aware of Host-only inspection needs.
- GFX Coupling: The HUD was "burned" into the emulated framebuffer, making it impossible to capture raw game frames without the overlay while technical debugging was active.
The Solution: Host-Side Rendering with Atomic Telemetry
The implemented solution follows a strictly non-intrusive approach:
- Atomic Telemetry (Push-based): A new `AtomicTelemetry` structure was added to the HAL. It uses `AtomicU64`, `AtomicU32`, and `AtomicUsize` to track metrics (Cycles, Memory, Logs) in real-time.
- Runtime Decoupling: The `VirtualMachineRuntime` updates these atomic counters during its `tick` loop only if `inspection_active` is enabled. It does not perform any rendering or string formatting.
- Host-Side HUD: The `HostRunner` (in `prometeu-host-desktop-winit`) now takes a `snapshot()` of the atomic telemetry and renders the HUD as a native layer after the emulated machine has finished its work for the tick.
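The push-based flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual PROMETEU code: the field names (`cycles`, `log_count`, `memory_bytes`), the `TelemetrySnapshot` type, and the `RuntimeSketch` stand-in for `VirtualMachineRuntime` are all assumptions made for the example. Only the use of `AtomicU64`, `AtomicU32`, `AtomicUsize`, the `inspection_active` gate, and a `snapshot()` call come from the text.

```rust
use std::sync::atomic::{AtomicU32, AtomicU64, AtomicUsize, Ordering};
use std::sync::Arc;

/// Shared counters written by the runtime and read by the Host.
/// Field names are hypothetical.
#[derive(Default)]
pub struct AtomicTelemetry {
    pub cycles: AtomicU64,
    pub log_count: AtomicU32,
    pub memory_bytes: AtomicUsize,
}

/// Plain-value copy handed to the HUD renderer (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TelemetrySnapshot {
    pub cycles: u64,
    pub log_count: u32,
    pub memory_bytes: usize,
}

impl AtomicTelemetry {
    /// Reads each counter with a relaxed load. No lock is taken, so the
    /// fields are individually valid but not captured at a single instant.
    pub fn snapshot(&self) -> TelemetrySnapshot {
        TelemetrySnapshot {
            cycles: self.cycles.load(Ordering::Relaxed),
            log_count: self.log_count.load(Ordering::Relaxed),
            memory_bytes: self.memory_bytes.load(Ordering::Relaxed),
        }
    }
}

/// Simplified stand-in for `VirtualMachineRuntime` (assumption).
pub struct RuntimeSketch {
    pub inspection_active: bool,
    pub telemetry: Arc<AtomicTelemetry>,
}

impl RuntimeSketch {
    /// One tick of emulated work. Telemetry is pushed only while
    /// inspection is active; no formatting or drawing happens here.
    pub fn tick(&mut self, cycles_spent: u64, memory_in_use: usize) {
        // ... emulated machine work would run here ...
        if self.inspection_active {
            self.telemetry.cycles.fetch_add(cycles_spent, Ordering::Relaxed);
            self.telemetry.memory_bytes.store(memory_in_use, Ordering::Relaxed);
        }
    }
}
```

In this sketch the Host would hold its own `Arc<AtomicTelemetry>` clone and call `snapshot()` after the tick completes, formatting strings and drawing the HUD entirely on its side.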
Impact and Benefits
- Zero Machine Overhead: Rendering the HUD consumes Host CPU/GPU cycles but does not affect the emulated machine's cycle counter or logical behavior.
- Fidelity: The emulated framebuffer remains pure, containing only game pixels.
- Responsive Telemetry: By using atomics, the Host can read the most recent metrics at any time without waiting for frame boundaries or acquiring heavy read-locks on the runtime state.
- Platform Agnosticism: Non-desktop hosts (which do not need the overlay) do not pay any implementation cost or performance penalty for the HUD's existence.
Lessons Learned
- Decouple Data from View: Even for internal debugging tools, keeping the data collection (Runtime) separate from the visualization (Host) is crucial for accurate profiling.
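- Atomic Snapshots are Sufficient: For high-frequency HUD updates, eventual consistency via relaxed atomic loads is enough. Each counter is loaded independently, so a snapshot may combine values from adjacent ticks, which is acceptable for a display-only overlay and significantly cheaper than synchronizing via mutexes or logical frame boundaries.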
- Late Composition: Composition of technical layers should always happen at the latest possible stage of the display pipeline to avoid polluting the core simulation state.
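To make the late-composition idea concrete, here is a small CPU-side sketch. It assumes a framebuffer stored as a flat `u32` pixel slice and a HUD reduced to a single solid rectangle; `HudRect` and `compose_for_display` are hypothetical names for this example, and a real winit-based host would more likely draw the HUD as a separate GPU layer. The point is only that the overlay is applied to a display copy, so the emulated framebuffer stays untouched and raw frames remain capturable.

```rust
/// Hypothetical HUD element: a solid rectangle in framebuffer coordinates.
pub struct HudRect {
    pub x: usize,
    pub y: usize,
    pub w: usize,
    pub h: usize,
    pub color: u32,
}

/// Builds the image shown on screen from the emulated frame plus an
/// optional HUD. The input frame is only borrowed immutably, so the
/// emulated framebuffer cannot be modified by the overlay.
pub fn compose_for_display(game_frame: &[u32], width: usize, hud: Option<&HudRect>) -> Vec<u32> {
    let mut out = game_frame.to_vec();
    if let Some(r) = hud {
        let height = if width == 0 { 0 } else { out.len() / width };
        // Clip the rectangle to the frame bounds before writing.
        let y_end = r.y.saturating_add(r.h).min(height);
        let x_end = r.x.saturating_add(r.w).min(width);
        for y in r.y..y_end {
            for x in r.x..x_end {
                out[y * width + x] = r.color;
            }
        }
    }
    out
}
```

Passing `None` for the HUD returns an exact copy of the game frame, which is what a host without the overlay would present.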