# Debug, Inspection, and Profiling
Domain: machine diagnostics
Function: normative
Didactic companion: [`../learn/mental-model-observability-and-debugging.md`](../runtime/learn/mental-model-observability-and-debugging.md)
## 1 Scope
This chapter defines the machine-visible debugging, inspection, and profiling surface of PROMETEU.
It covers:
- execution modes;
- pause and stepping;
- state inspection;
- graphics inspection;
- profiling;
- breakpoints and watchpoints;
- event and fault visibility;
- certification-facing diagnostics;
- Host-side debug overlay (HUD) isolation.
## 2 Execution Modes
PROMETEU operates in three main modes:
### 2.1 Normal Mode
- continuous execution
- no detailed inspection
- focus on game and experience
### 2.2 Debug Mode
- controlled execution
- access to internal state
- pauses and stepping
### 2.3 Certification Mode
- deterministic execution
- collected metrics
- report generation
No mode alters the logical result of the program.
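
The mode invariant can be illustrated with a minimal sketch. The names `ExecutionMode`, `allows_inspection`, `collects_report`, and `logical_update` are hypothetical and not part of the specification; the update function is a stand-in for program logic, included only to show that the mode has no influence on the result.

```rust
/// Hypothetical enum for the three modes described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecutionMode {
    Normal,
    Debug,
    Certification,
}

impl ExecutionMode {
    /// Debug mode gives access to internal state, pauses, and stepping.
    pub fn allows_inspection(self) -> bool {
        matches!(self, ExecutionMode::Debug)
    }

    /// Certification mode collects metrics and generates a report.
    pub fn collects_report(self) -> bool {
        matches!(self, ExecutionMode::Certification)
    }
}

/// Stand-in for one logical update step (assumption, not real program logic).
/// The mode is accepted but deliberately ignored: it must not alter the result.
pub fn logical_update(state: u32, _mode: ExecutionMode) -> u32 {
    state.wrapping_mul(31).wrapping_add(7)
}
```
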
## 3 Execution Debug
### 3.1 Pause and Resume
The system can be paused at safepoints:
- frame start
- before UPDATE
- after DRAW
- before SYNC
During pause:
- state is frozen
- buffers are not swapped
- logical time does not advance
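
As a rough sketch of the pause contract above, the following model honors a pause request only at a safepoint and then leaves logical time and buffer swaps untouched. `Machine`, `Safepoint`, and all method names are assumptions made for illustration; for brevity the sketch only checks the frame-start safepoint.

```rust
/// Safepoints listed in this section (enum name is hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Safepoint {
    FrameStart,
    BeforeUpdate,
    AfterDraw,
    BeforeSync,
}

/// Hypothetical machine model tracking only what the pause rules mention.
#[derive(Debug, Default)]
pub struct Machine {
    pub logical_frame: u64,
    pub buffer_swaps: u64,
    pub pause_requested: bool,
    pub paused_at: Option<Safepoint>,
}

impl Machine {
    pub fn request_pause(&mut self) {
        self.pause_requested = true;
    }

    pub fn resume(&mut self) {
        self.pause_requested = false;
        self.paused_at = None;
    }

    /// Runs one frame unless the machine is paused or a pending pause
    /// request takes effect at the frame-start safepoint.
    pub fn run_frame(&mut self) {
        if self.paused_at.is_some() {
            return; // frozen: no swap, no logical time
        }
        if self.pause_requested {
            self.paused_at = Some(Safepoint::FrameStart);
            return;
        }
        // UPDATE, DRAW, and SYNC would run here.
        self.buffer_swaps += 1;
        self.logical_frame += 1;
    }
}
```
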
### 3.2 Step-by-Step
PROMETEU allows stepping at different levels:
- **by frame**
- **by function**
- **by VM instruction**
Stepping by instruction reveals:
- Program Counter (PC)
- current instruction
- operand stack
- call stack
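
To make instruction-level stepping concrete, here is a toy stack machine whose `step` returns the data listed above. The instruction set, `Vm`, and `StepView` are invented for this sketch and do not reflect the PROMETEU VM; there is no call instruction, so the call depth stays at zero.

```rust
/// Toy instruction set (assumption, not the PROMETEU ISA).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Instr {
    Push(i64),
    Add,
    Halt,
}

/// What a debugger would display after a single instruction step.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StepView {
    pub pc: usize,
    pub instr: Instr,
    pub operand_stack: Vec<i64>,
    pub call_depth: usize,
}

pub struct Vm {
    code: Vec<Instr>,
    pc: usize,
    stack: Vec<i64>,
    call_stack: Vec<usize>,
}

impl Vm {
    pub fn new(code: Vec<Instr>) -> Self {
        Vm { code, pc: 0, stack: Vec::new(), call_stack: Vec::new() }
    }

    /// Executes exactly one instruction. Returns `None` on `Halt`,
    /// on running past the end of the code, or on stack underflow.
    pub fn step(&mut self) -> Option<StepView> {
        let pc = self.pc;
        let instr = *self.code.get(pc)?;
        match instr {
            Instr::Push(v) => self.stack.push(v),
            Instr::Add => {
                let b = self.stack.pop()?;
                let a = self.stack.pop()?;
                self.stack.push(a + b);
            }
            Instr::Halt => return None,
        }
        self.pc += 1;
        Some(StepView {
            pc,
            instr,
            operand_stack: self.stack.clone(),
            call_depth: self.call_stack.len(),
        })
    }
}
```
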
## 4 State Inspection
### 4.1 Stacks
PROMETEU allows inspecting:
- **Operand Stack**
- **Call Stack**
For each frame:
- content
- depth
- growth and cleanup
### 4.2 Heap
The heap can be inspected in real time:
- total size
- current usage
- peak usage
- live objects
The programmer can observe:
- allocation patterns
- fragmentation
- GC pressure
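
The heap counters above could be tracked along these lines. This is a sketch under assumptions: `HeapStats`, its field names, and the byte-based accounting are illustrative, and it does not model fragmentation or the garbage collector.

```rust
/// Hypothetical heap counters matching the inspectable values listed above.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct HeapStats {
    pub total_bytes: usize,
    pub used_bytes: usize,
    pub peak_bytes: usize,
    pub live_objects: usize,
}

impl HeapStats {
    pub fn new(total_bytes: usize) -> Self {
        HeapStats { total_bytes, ..HeapStats::default() }
    }

    /// Records an allocation. Returns false (and changes nothing)
    /// if the request would exceed the total heap size.
    pub fn alloc(&mut self, size: usize) -> bool {
        match self.used_bytes.checked_add(size) {
            Some(next) if next <= self.total_bytes => {
                self.used_bytes = next;
                self.peak_bytes = self.peak_bytes.max(next);
                self.live_objects += 1;
                true
            }
            _ => false,
        }
    }

    /// Records the release of one object of the given size.
    pub fn free(&mut self, size: usize) {
        self.used_bytes = self.used_bytes.saturating_sub(size);
        self.live_objects = self.live_objects.saturating_sub(1);
    }
}
```
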
### 4.3 Global Space
Global variables:
- current values
- references
- initialization
## 5 Graphics Debug
PROMETEU allows inspecting the graphics system:
- front buffer
- back buffer
- palette state
- active sprites
It is possible to:
- freeze the image
- observe buffers separately
- identify excessive redraw
## 6 Time Profiling (Cycles)
### 6.1 Per-Frame Measurement
For each frame, PROMETEU records:
- total cycles used
- cycles per subsystem
- execution peaks
Conceptual example:
```
Frame 18231:
  Total:  9,842 / 10,000 cycles
  UPDATE: 4,210
  DRAW:   3,180
  AUDIO:    920
  SYSTEM:   612
```
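
A per-frame record like the conceptual listing above might be represented as follows. `FrameProfile` and its methods are hypothetical; the sketch only shows a budget check and a lookup of the most expensive subsystem, and it does not assume that the listed subsystems add up to the total.

```rust
/// Hypothetical per-frame cycle record (field names are assumptions).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FrameProfile {
    pub frame: u64,
    pub budget: u32,
    pub total: u32,
    pub update: u32,
    pub draw: u32,
    pub audio: u32,
    pub system: u32,
}

impl FrameProfile {
    /// True when the frame stayed within its cycle budget.
    pub fn within_budget(&self) -> bool {
        self.total <= self.budget
    }

    /// Cycles left in the budget (zero if the frame overran).
    pub fn headroom(&self) -> u32 {
        self.budget.saturating_sub(self.total)
    }

    /// Name and cost of the most expensive listed subsystem.
    pub fn hottest(&self) -> (&'static str, u32) {
        let parts = [
            ("UPDATE", self.update),
            ("DRAW", self.draw),
            ("AUDIO", self.audio),
            ("SYSTEM", self.system),
        ];
        parts
            .iter()
            .copied()
            .max_by_key(|&(_, cycles)| cycles)
            .expect("subsystem list is non-empty")
    }
}
```
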
### 6.2 Per-Function Profiling
PROMETEU can associate cycles with:
- functions
- methods
- logical blocks
### 6.3 Per-Instruction Profiling
At the lowest level, the system can display:
- executed instructions
- individual cost
- frequency
## 7 Memory Profiling
PROMETEU records:
- average heap usage
- heap peak
- allocations per frame
- GC frequency
Example:
```
Heap:
  Avg:   24 KB
  Peak:  34 KB ❌
  Limit: 32 KB
```
This data feeds directly into certification.
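
The heap summary in the example can be derived from per-frame samples roughly as shown here. `HeapReport`, `from_samples`, and the rule that only the peak is compared against the limit are assumptions for illustration; the specification does not state how the pass/fail decision is computed.

```rust
/// Hypothetical summary of per-frame heap usage samples, in KB.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct HeapReport {
    pub avg_kb: u32,
    pub peak_kb: u32,
    pub limit_kb: u32,
}

impl HeapReport {
    /// Builds a report from one usage sample per frame.
    /// Returns `None` when no frames were sampled.
    pub fn from_samples(samples_kb: &[u32], limit_kb: u32) -> Option<Self> {
        let peak_kb = *samples_kb.iter().max()?;
        let sum: u64 = samples_kb.iter().map(|&s| u64::from(s)).sum();
        // Integer average, rounded down.
        let avg_kb = (sum / samples_kb.len() as u64) as u32;
        Some(HeapReport { avg_kb, peak_kb, limit_kb })
    }

    /// Assumed rule: the peak must not exceed the limit.
    pub fn passes(&self) -> bool {
        self.peak_kb <= self.limit_kb
    }
}
```

With samples of 14, 24, and 34 KB against a 32 KB limit, this sketch reproduces the average of 24 KB and the failing 34 KB peak from the example.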
### 7.1 Bank Occupancy Profiling
Bank occupancy diagnostics are slot-first.
The visible per-bank telemetry used by host inspection surfaces and certification is:
- `bank_type`
- `used_slots`
- `total_slots`
For the current runtime banks, the canonical bank names are:
- `GLYPH`
- `SOUNDS`
Byte-oriented bank occupancy is not the canonical visible profiling contract.
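
A slot-first telemetry record could look like the sketch below. The field names `bank_type`, `used_slots`, and `total_slots` and the bank names `GLYPH` and `SOUNDS` come from this section; the Rust types, the `BankType` enum, and the helper methods are assumptions.

```rust
/// Banks named in this section (enum representation is an assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BankType {
    Glyph,
    Sounds,
}

impl BankType {
    /// Canonical bank name as listed in the specification.
    pub fn name(self) -> &'static str {
        match self {
            BankType::Glyph => "GLYPH",
            BankType::Sounds => "SOUNDS",
        }
    }
}

/// Per-bank telemetry expressed in slots, never in bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BankTelemetry {
    pub bank_type: BankType,
    pub used_slots: u32,
    pub total_slots: u32,
}

impl BankTelemetry {
    /// Slots still available in this bank (hypothetical helper).
    pub fn free_slots(&self) -> u32 {
        self.total_slots.saturating_sub(self.used_slots)
    }
}
```
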
## 8 Breakpoints and Watchpoints
### 8.1 Breakpoints
PROMETEU supports breakpoints in:
- specific frames
- functions
- VM instructions
Breakpoints:
- pause execution
- preserve state
- do not change behavior
### 8.2 Watchpoints
Watchpoints monitor:
- variables
- heap addresses
- specific values
Execution can pause when:
- a value changes
- a limit is exceeded
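
The two pause conditions above can be sketched as a small watchpoint type. `Watchpoint`, `Condition`, and `observe` are hypothetical names, and the watched value is modeled as a plain `i64` for simplicity.

```rust
/// Pause conditions described above (names are assumptions).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Condition {
    /// Pause when the observed value differs from the previous one.
    Changed,
    /// Pause when the observed value is greater than the limit.
    Exceeds(i64),
}

#[derive(Debug, Clone, Copy)]
pub struct Watchpoint {
    condition: Condition,
    last: i64,
}

impl Watchpoint {
    pub fn new(initial: i64, condition: Condition) -> Self {
        Watchpoint { condition, last: initial }
    }

    /// Feeds the current value to the watchpoint.
    /// Returns true if execution should pause.
    pub fn observe(&mut self, value: i64) -> bool {
        let hit = match self.condition {
            Condition::Changed => value != self.last,
            Condition::Exceeds(limit) => value > limit,
        };
        self.last = value;
        hit
    }
}
```
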
## 9 Event and Fault Debugging
PROMETEU allows observing:
- event queue
- active timers
- published system faults
Each event has:
- origin
- frame
- cost
- consequence
## 10 Certification Diagnostics
Certification diagnostics may enforce bank occupancy ceilings.
For bank residency, certification uses slot-based limits, such as:
- `max_glyph_slots_used`
- `max_sound_slots_used`
Bank certification MUST NOT depend on `max_gfx_bytes` or `max_audio_bytes`.
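
The slot-based ceilings used for bank certification could be checked as in the following sketch. The limit names `max_glyph_slots_used` and `max_sound_slots_used` come from this chapter; `CertLimits`, `BankUsage`, `check_bank_limits`, the optional-limit representation, and the message format are assumptions.

```rust
/// Hypothetical certification limits; `None` means no ceiling is set.
#[derive(Debug, Default, Clone, Copy)]
pub struct CertLimits {
    pub max_glyph_slots_used: Option<u32>,
    pub max_sound_slots_used: Option<u32>,
}

/// Hypothetical observed occupancy, expressed in slots only.
#[derive(Debug, Clone, Copy)]
pub struct BankUsage {
    pub glyph_slots_used: u32,
    pub sound_slots_used: u32,
}

/// Returns one message per exceeded ceiling; empty means the check passed.
pub fn check_bank_limits(limits: &CertLimits, usage: &BankUsage) -> Vec<String> {
    let checks = [
        ("max_glyph_slots_used", limits.max_glyph_slots_used, usage.glyph_slots_used),
        ("max_sound_slots_used", limits.max_sound_slots_used, usage.sound_slots_used),
    ];
    checks
        .iter()
        .filter_map(|&(name, limit, used)| match limit {
            Some(max) if used > max => {
                Some(format!("{}: used {} > limit {}", name, used, max))
            }
            _ => None,
        })
        .collect()
}
```

No byte-based field appears in either structure, in line with the rule that bank certification must not depend on byte limits.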
## 10 Host-Side Debug Overlay (HUD) Isolation
The visual Debug Overlay (HUD) for technical inspection is not part of the emulated machine pipeline.
### 10.1 Responsibilities
1. **Runtime:** Only exposes telemetry data via the machine diagnostics surface. It does not perform HUD rendering or string formatting.
2. **Emulated graphics contract:** Machine graphics primitives such as `fill_rect` and `draw_text` remain valid parts of the emulated graphics/syscall contract. They are not host overlay APIs.
3. **Host overlay module:** The Desktop Host owns a dedicated overlay module that performs host-side text, panel, and simple bar composition.
4. **Composition boundary:** The overlay is composed on the Host presentation surface after the emulated frame is ready. Overlay pixels must not be written back into the emulated framebuffer.
5. **Host control:** Overlay visibility and presentation policy remain under Host control.
6. **Host (Desktop):** Collects telemetry from the runtime and renders the HUD as a native, transparent layer.
### 10.2 Principles
- **Zero Pipeline Interference:** HUD rendering must not inject pixels into the emulated framebuffer. It is applied after upscaling or as a separate display surface.
- **Zero Cycle Impact:** HUD-related processing (like formatting technical text) must occur outside the emulated machine cycles.
- **Toggle Control:** The activation of the overlay (typically via **F1**) is managed by the Host layer.
### 10.3 Atomic Telemetry Model
To ensure zero-impact synchronization between the VM and the Host Debug Overlay, PROMETEU uses a **push-based atomic model**:
1. **Atomic Storage:** Metrics such as cycles, syscalls, and memory usage are stored in a dedicated `AtomicTelemetry` structure using thread-safe atomic types (`AtomicU64`, `AtomicU32`, etc.).
2. **Lockless Access:** The Host (Desktop) reads these metrics asynchronously and without locks by taking a `snapshot()` of the atomic state.
3. **Single Source of Truth:** This model is the exclusive source of truth for both real-time inspection and frame-end certification, replacing legacy per-frame buffered fields.
4. **Frame-Closed Log Metric:** `logs_count` in the snapshot represents the number of logs emitted in the last completed logical frame, not a transient in-flight counter.
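
A minimal sketch of this model follows. Only `AtomicTelemetry`, `snapshot()`, `logs_count`, and the use of `AtomicU64`/`AtomicU32` are taken from the text; every other field, the method names, the `TelemetrySnapshot` type, the separate in-flight log counter, and the choice of `Ordering::Relaxed` are assumptions.

```rust
use std::sync::atomic::{AtomicU32, AtomicU64, Ordering};

/// Runtime-side metrics, written by the VM and read by the Host.
#[derive(Debug, Default)]
pub struct AtomicTelemetry {
    cycles: AtomicU64,
    syscalls: AtomicU32,
    heap_used_bytes: AtomicU64,
    /// Logs emitted in the last completed logical frame.
    logs_count: AtomicU32,
    /// Logs emitted so far in the current frame (assumed helper counter).
    logs_in_flight: AtomicU32,
}

/// Plain copy of the metrics, safe to format on the Host side.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TelemetrySnapshot {
    pub cycles: u64,
    pub syscalls: u32,
    pub heap_used_bytes: u64,
    pub logs_count: u32,
}

impl AtomicTelemetry {
    pub fn add_cycles(&self, n: u64) {
        self.cycles.fetch_add(n, Ordering::Relaxed);
    }

    pub fn record_syscall(&self) {
        self.syscalls.fetch_add(1, Ordering::Relaxed);
    }

    pub fn set_heap_used(&self, bytes: u64) {
        self.heap_used_bytes.store(bytes, Ordering::Relaxed);
    }

    pub fn record_log(&self) {
        self.logs_in_flight.fetch_add(1, Ordering::Relaxed);
    }

    /// Called when a logical frame completes: publishes the frame's log
    /// count and resets the in-flight counter.
    pub fn close_frame(&self) {
        let emitted = self.logs_in_flight.swap(0, Ordering::Relaxed);
        self.logs_count.store(emitted, Ordering::Relaxed);
    }

    /// Lock-free read of every metric. Each field is read atomically, but
    /// the fields are not captured at one single instant.
    pub fn snapshot(&self) -> TelemetrySnapshot {
        TelemetrySnapshot {
            cycles: self.cycles.load(Ordering::Relaxed),
            syscalls: self.syscalls.load(Ordering::Relaxed),
            heap_used_bytes: self.heap_used_bytes.load(Ordering::Relaxed),
            logs_count: self.logs_count.load(Ordering::Relaxed),
        }
    }
}
```

In this sketch the Host would hold a shared reference (for example through `Arc<AtomicTelemetry>`) and call `snapshot()` from its own thread; string formatting then happens entirely on the Host side, outside the emulated machine cycles.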
## 11 Integration with CAP and Certification
All debug and profiling data:
- feed the certification report
- are collected deterministically
- do not depend on external tools
- are consistent whether or not the Host HUD is active