Reviewed-on: #13 Co-authored-by: bQUARKz <bquarkz@gmail.com> Co-committed-by: bQUARKz <bquarkz@gmail.com>
| id | ticket | title | created | tags |
|---|---|---|---|---|
| LSN-0026 | perf-runtime-telemetry-hot-path | Push-based Telemetry Model | 2026-04-10 | |
Push-based Telemetry Model
The PROMETEU telemetry system evolved from an on-demand scan model (pull) to an incremental counter model (push), aiming to minimize the impact on the runtime's hot path.
The Original Problem
Previously, at every host tick, the runtime requested memory usage information from the asset banks. This resulted in:
- O(n) scans over resource maps.
- Multiple read lock acquisitions in every tick.
- Unnecessary overhead on handheld hardware, where every microsecond counts.
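To make the cost concrete, here is a minimal sketch of what a pull-style query can look like. The names `PullBank` and `tick_pull` and the map layout are hypothetical, chosen only for illustration; they are not taken from the PROMETEU code base.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical asset bank: resource id -> size in bytes, behind a lock.
pub struct PullBank {
    resources: RwLock<HashMap<u32, usize>>,
}

impl PullBank {
    pub fn new() -> Self {
        Self { resources: RwLock::new(HashMap::new()) }
    }

    pub fn insert(&self, id: u32, bytes: usize) {
        self.resources.write().unwrap().insert(id, bytes);
    }

    /// Pull-style query: acquires a read lock and walks every entry, O(n).
    pub fn used_bytes(&self) -> usize {
        self.resources.read().unwrap().values().sum()
    }
}

/// Work done per host tick under the pull model (assumed shape):
/// one read lock and one full scan for every bank.
pub fn tick_pull(banks: &[PullBank]) -> usize {
    banks.iter().map(PullBank::used_bytes).sum()
}
```

In this sketch the per-tick cost grows with both the number of banks and the number of resources in each, even when nothing has changed since the previous tick.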
The Solution: Push Model with Atomics
The implemented solution uses AtomicUsize in drivers and the VM to maintain the system state in real-time with O(1) read and write cost:
- Drivers (Assets): Atomic counters in each BankPolicy are updated during load, commit, and cancel.
- VM (Heap): A used_bytes counter in the Heap struct tracks allocations and deallocations (sweep).
- System (Logs): The LogService tracks log pressure emitted in each frame.
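The counters described above might be sketched as follows. Only the names `BankPolicy`, `Heap`, `used_bytes`, and `AtomicUsize` come from this note; the split into pending and committed bytes, the method signatures, and the `on_alloc`/`on_sweep` hooks are assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Stand-in for a per-bank policy with push-style counters (fields assumed).
#[derive(Default)]
pub struct BankPolicy {
    committed_bytes: AtomicUsize, // bytes of assets currently resident
    pending_bytes: AtomicUsize,   // bytes of loads started but not yet committed
}

impl BankPolicy {
    /// A load begins: record its size as pending.
    pub fn load(&self, bytes: usize) {
        self.pending_bytes.fetch_add(bytes, Ordering::Relaxed);
    }

    /// The load is committed: move its size from pending to committed.
    pub fn commit(&self, bytes: usize) {
        self.pending_bytes.fetch_sub(bytes, Ordering::Relaxed);
        self.committed_bytes.fetch_add(bytes, Ordering::Relaxed);
    }

    /// The load is cancelled: drop its size from pending.
    pub fn cancel(&self, bytes: usize) {
        self.pending_bytes.fetch_sub(bytes, Ordering::Relaxed);
    }

    /// O(1) reads, no lock and no scan.
    pub fn committed(&self) -> usize {
        self.committed_bytes.load(Ordering::Relaxed)
    }

    pub fn pending(&self) -> usize {
        self.pending_bytes.load(Ordering::Relaxed)
    }
}

/// Stand-in for the VM heap; only the counter is modelled here.
#[derive(Default)]
pub struct Heap {
    used_bytes: AtomicUsize,
}

impl Heap {
    /// Called on every allocation (hypothetical hook).
    pub fn on_alloc(&self, bytes: usize) {
        self.used_bytes.fetch_add(bytes, Ordering::Relaxed);
    }

    /// Called when a sweep frees memory (hypothetical hook).
    pub fn on_sweep(&self, freed: usize) {
        self.used_bytes.fetch_sub(freed, Ordering::Relaxed);
    }

    pub fn used_bytes(&self) -> usize {
        self.used_bytes.load(Ordering::Relaxed)
    }
}
```

Each mutation costs one or two atomic read-modify-write operations, and a telemetry read is a single atomic load regardless of how many resources are resident.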
Two Levels of Observability
To balance performance and debugging, the collection was divided:
- Frame Snapshot (Always): Automatic capture at the end of each logical frame. Negligible cost (O(1)). Serves the Certifier and historical logs.
- Host Tick (On-Demand): Detailed collection in every tick occurs only if inspection_active is enabled (e.g., F1 Overlay on).
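One way to picture the two levels is the sketch below. The `inspection_active` flag is named in this note; the `Telemetry`, `Counters`, and `FrameSnapshot` types, their fields, and the choice to reset the log counter at frame end are assumptions for illustration only.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Hypothetical counters pushed by the drivers, the VM, and the log service.
#[derive(Default)]
pub struct Counters {
    pub bank_bytes: AtomicUsize,
    pub heap_bytes: AtomicUsize,
    pub logs_this_frame: AtomicUsize,
}

/// Hypothetical snapshot record.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FrameSnapshot {
    pub bank_bytes: usize,
    pub heap_bytes: usize,
    pub logs: usize,
}

#[derive(Default)]
pub struct Telemetry {
    pub counters: Counters,
    pub inspection_active: AtomicBool,
}

impl Telemetry {
    /// Level 1: always runs at the end of a logical frame.
    /// A fixed number of atomic operations, so O(1).
    pub fn end_of_frame(&self) -> FrameSnapshot {
        FrameSnapshot {
            bank_bytes: self.counters.bank_bytes.load(Ordering::Relaxed),
            heap_bytes: self.counters.heap_bytes.load(Ordering::Relaxed),
            // swap reads the per-frame log count and resets it for the next frame
            logs: self.counters.logs_this_frame.swap(0, Ordering::Relaxed),
        }
    }

    /// Level 2: called on every host tick, but collects only while
    /// inspection is enabled (for example, when the overlay is shown).
    pub fn host_tick(&self) -> Option<FrameSnapshot> {
        if !self.inspection_active.load(Ordering::Relaxed) {
            return None;
        }
        Some(FrameSnapshot {
            bank_bytes: self.counters.bank_bytes.load(Ordering::Relaxed),
            heap_bytes: self.counters.heap_bytes.load(Ordering::Relaxed),
            logs: self.counters.logs_this_frame.load(Ordering::Relaxed),
        })
    }
}
```

With inspection disabled, the per-tick path in this sketch reduces to a single relaxed load of the flag.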
Lessons Learned
- Trigger Decoupling: We should not use the Certifier state to enable visual debugging features (like the overlay), as they have different purposes and costs.
- Eventual Consistency is Sufficient: For telemetry metrics, it is not necessary to lock the system to obtain an exact value at every instant. A relaxed atomic read is sufficient and much cheaper.
- Cost Isolation: Moving the aggregation logic to the driver simplifies the runtime and ensures that the telemetry cost is paid only during state mutations, rather than repeatedly during stable execution.
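The eventual-consistency point can be illustrated with a small, self-contained experiment. The function `run_writers` is hypothetical and not part of the runtime; it only shows that relaxed increments are never lost, while a relaxed read taken mid-flight may see any intermediate value.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Spawns `writers` threads that each add `per_writer` units one at a time.
/// Returns (a lock-free sample taken while writers may still be running,
/// the final value after all writers have been joined).
pub fn run_writers(writers: usize, per_writer: usize) -> (usize, usize) {
    let used = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..writers)
        .map(|_| {
            let used = Arc::clone(&used);
            thread::spawn(move || {
                for _ in 0..per_writer {
                    // Atomic read-modify-write: no increment is ever lost.
                    used.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    // Telemetry-style read: no lock, value may lag behind the writers.
    let sample = used.load(Ordering::Relaxed);

    for handle in handles {
        handle.join().unwrap();
    }

    // Joining synchronizes with each thread, so every increment is visible.
    (sample, used.load(Ordering::Relaxed))
}
```

The sample is only approximately current, which is acceptable for an overlay or a per-frame log, while the final total is exact once the writers have finished.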