Prometeu Runtime — Architecture (Baseline)

This document is the concise, authoritative description of the current Prometeu VM baseline after the architectural reset. It reflects the implementation as it exists today, with no legacy or transitional wording.

1. Overview
-----------

- Stack‑based virtual machine
  - Operand stack + call frames; bytecode is fetched from a ROM/program image with a separate constant pool.
- GC‑managed heap
  - Non‑compacting mark–sweep collector. Object handles (`HeapRef`) remain stable while the object is live. Sweep invalidates unreachable handles; objects are never moved.
- Closures (Model B)
  - First‑class closures with a heap‑allocated environment. The closure object is passed to the callee as a hidden `arg0` when invoking a closure.
- Cooperative coroutines
  - Deterministic, cooperative scheduling. Switching and GC occur only at explicit safepoints (`FRAME_SYNC`).
- Unified syscall ABI
  - PBX pre-load artifacts declare canonical host bindings in `SYSC` and encode call sites as `HOSTCALL <sysc_index>`. The loader resolves and patches them to numeric `SYSCALL <id>` before verification/execution. Capability gating is enforced at load and checked again defensively at runtime. Syscalls are not first‑class values.

2. Memory Model
----------------

2.1 Stack vs Heap

- Stack
  - Each running context has an operand stack plus call frames (locals, return bookkeeping). Primitive values (integers, floats, booleans) reside on the stack. Heap objects are referenced by opaque `HeapRef` values on the stack.
  - The VM’s current operand stack and frames are GC roots.

- Heap
  - The heap stores runtime objects that require identity and reachability tracking. Handles are `HeapRef` indices into an internal object store.
  - The collector is mark–sweep, non‑moving: it marks from roots, then reclaims unreachable objects without relocating survivors. Indices for live objects remain stable across collections.
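
As a rough illustration of the handle model above, the following Rust sketch shows an index-based object store in which sweeping frees slots in place, so the index behind a live `HeapRef` never changes. Only `HeapRef` and `Value` are names taken from this document; `Heap`, `Slot`, the free list, and the array-only payload are assumptions made for the example and may not match the runtime's actual types.

```rust
/// Opaque handle: an index into the object store (layout is an assumption).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct HeapRef(pub u32);

/// Stack-resident value: primitives inline, heap objects by handle.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Value {
    Int(i64),
    Float(f64),
    Bool(bool),
    Ref(HeapRef),
}

/// One slot of the store; `object == None` means the slot is free.
struct Slot {
    marked: bool,
    object: Option<Vec<Value>>, // payload simplified to an array of values
}

#[derive(Default)]
pub struct Heap {
    slots: Vec<Slot>,
    free: Vec<u32>, // reclaimed indices, reused by later allocations
}

impl Heap {
    /// Allocate into a free slot if one exists, otherwise grow the store.
    pub fn alloc(&mut self, payload: Vec<Value>) -> HeapRef {
        let slot = Slot { marked: false, object: Some(payload) };
        match self.free.pop() {
            Some(i) => {
                self.slots[i as usize] = slot;
                HeapRef(i)
            }
            None => {
                self.slots.push(slot);
                HeapRef((self.slots.len() - 1) as u32)
            }
        }
    }

    /// Returns `None` for a free slot or an out-of-range handle.
    pub fn get(&self, r: HeapRef) -> Option<&Vec<Value>> {
        self.slots.get(r.0 as usize)?.object.as_ref()
    }

    /// Set the mark bit of a live object (child traversal is omitted here).
    pub fn mark(&mut self, r: HeapRef) {
        if let Some(slot) = self.slots.get_mut(r.0 as usize) {
            if slot.object.is_some() {
                slot.marked = true;
            }
        }
    }

    /// Free every unmarked object in place and clear marks on survivors.
    /// Survivors keep their index; nothing is relocated.
    pub fn sweep(&mut self) -> usize {
        let mut freed = 0;
        for (i, slot) in self.slots.iter_mut().enumerate() {
            if slot.object.is_some() && !slot.marked {
                slot.object = None;
                self.free.push(i as u32);
                freed += 1;
            }
            slot.marked = false;
        }
        freed
    }
}
```

In this sketch a freed index can be handed out again by a later `alloc`, which is only safe because an unreachable handle is, by definition, no longer held anywhere the program can observe.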

2.2 Heap Object Kinds (as used today)

- Arrays of `Value`
  - Variable‑length arrays whose elements may contain further `HeapRef`s.
- Closures
  - Carry a function identifier and a captured environment (a slice/vector of `Value`s stored with the closure). Captured `HeapRef`s are traversed by the GC.
- Coroutines
  - Heap‑resident coroutine records (state + wake time + suspended operand stack and call frames). These act as GC roots when suspended.

Notes:
- Literals like strings and numbers are sourced from the constant pool in the program image; heap allocation is only used for runtime objects (closures, arrays, coroutine records, and any future heap kinds). The constant pool never embeds raw `HeapRef`s.

2.3 GC Roots

- VM roots
  - Current operand stack and call frames of the running coroutine (or main context).
- Suspended coroutines
  - All heap‑resident, suspended coroutine objects are treated as roots. Their saved stacks/frames are scanned during marking.
- Root traversal
  - The VM exposes a root‑visitor that walks the operand stack, frames, and coroutine records to feed the collector. The collector then follows children from each object kind (e.g., array elements, closure environments, coroutine stacks).
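
The root traversal and per-kind child enumeration described above can be sketched as a worklist mark phase. This is a simplified model under stated assumptions: `Vm`, `Object`, `visit_roots`, and `mark` are hypothetical names, call frames are folded into a single operand stack, and the heap is a plain `Vec` indexed by `HeapRef`.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct HeapRef(pub usize);

#[derive(Clone, Copy, Debug)]
pub enum Value {
    Int(i64),
    Ref(HeapRef),
}

/// The heap object kinds from section 2.2, reduced to what the GC scans.
pub enum Object {
    Array(Vec<Value>),
    Closure { func_id: u32, env: Vec<Value> },
    Coroutine { suspended: bool, stack: Vec<Value> },
}

pub struct Vm {
    pub operand_stack: Vec<Value>, // running context; frames folded in for brevity
    pub heap: Vec<Object>,
}

impl Vm {
    /// Root visitor: reports every handle that is directly a root.
    pub fn visit_roots(&self, mut visit: impl FnMut(HeapRef)) {
        for v in &self.operand_stack {
            if let Value::Ref(r) = v {
                visit(*r);
            }
        }
        // Suspended coroutine records are roots in their own right.
        for (i, obj) in self.heap.iter().enumerate() {
            if let Object::Coroutine { suspended: true, .. } = obj {
                visit(HeapRef(i));
            }
        }
    }

    /// Mark phase: returns one mark bit per heap slot.
    pub fn mark(&self) -> Vec<bool> {
        let mut marked = vec![false; self.heap.len()];
        let mut work = Vec::new();
        self.visit_roots(|r| work.push(r));
        while let Some(HeapRef(i)) = work.pop() {
            if i >= marked.len() || marked[i] {
                continue; // out of range or already visited
            }
            marked[i] = true;
            // Each object kind exposes its children to the collector.
            let children = match &self.heap[i] {
                Object::Array(items) => items,
                Object::Closure { env, .. } => env,
                Object::Coroutine { stack, .. } => stack,
            };
            for v in children {
                if let Value::Ref(r) = v {
                    work.push(*r);
                }
            }
        }
        marked
    }
}
```

An explicit worklist is used instead of recursion so that deeply nested arrays cannot overflow the native stack; whether the real collector does the same is not stated in this document.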

3. Execution Model
-------------------

3.1 Interpreter Loop

- The VM runs a classic fetch–decode–execute loop over the ROM’s bytecode. The current program counter (PC), operand stack, and call frames define execution state.
- Function calls establish new frames; returns restore the caller’s frame and adjust the operand stack to the callee’s declared return slot count (the verifier enforces this shape statically).
- Errors
  - Traps (well‑defined fault conditions) surface as trap reasons; panics indicate internal consistency failures. The VM can report logical frame endings such as `FrameSync`, `BudgetExhausted`, `Halted`, end‑of‑ROM, `Breakpoint`, `Trap(code, …)`, and `Panic(msg)`.
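
A minimal sketch of such a loop, assuming a toy encoding, is shown below. The opcode byte values, the `PUSH`/`ADD` instructions, the trap codes, and the `LogicalFrameEnding` type name are all invented for illustration; only the ending names themselves come from the list above, and `Trap` is reduced to a bare code.

```rust
/// Reasons `run` can return; a subset of the endings listed above.
#[derive(Debug, PartialEq)]
pub enum LogicalFrameEnding {
    FrameSync,
    BudgetExhausted,
    Halted,
    EndOfRom,
    Trap(u32),
}

// Hypothetical opcode bytes and trap codes, chosen only for this example.
pub const OP_PUSH: u8 = 0x01; // PUSH imm8
pub const OP_ADD: u8 = 0x02;
pub const OP_FRAME_SYNC: u8 = 0x03;
pub const OP_HALT: u8 = 0x04;
pub const TRAP_STACK_UNDERFLOW: u32 = 1;
pub const TRAP_BAD_OPCODE: u32 = 2;

pub struct Vm<'rom> {
    pub rom: &'rom [u8],
    pub pc: usize,
    pub stack: Vec<i64>,
}

impl Vm<'_> {
    /// Execute until a logical frame ending; `budget` caps the instruction count.
    pub fn run(&mut self, mut budget: u32) -> LogicalFrameEnding {
        loop {
            if budget == 0 {
                return LogicalFrameEnding::BudgetExhausted;
            }
            // Fetch
            let Some(&opcode) = self.rom.get(self.pc) else {
                return LogicalFrameEnding::EndOfRom;
            };
            self.pc += 1;
            budget -= 1;
            // Decode + execute
            match opcode {
                OP_PUSH => {
                    let Some(&imm) = self.rom.get(self.pc) else {
                        return LogicalFrameEnding::Trap(TRAP_BAD_OPCODE); // truncated immediate
                    };
                    self.pc += 1;
                    self.stack.push(imm as i64);
                }
                OP_ADD => {
                    let (Some(b), Some(a)) = (self.stack.pop(), self.stack.pop()) else {
                        return LogicalFrameEnding::Trap(TRAP_STACK_UNDERFLOW);
                    };
                    self.stack.push(a.wrapping_add(b));
                }
                OP_FRAME_SYNC => return LogicalFrameEnding::FrameSync,
                OP_HALT => return LogicalFrameEnding::Halted,
                _ => return LogicalFrameEnding::Trap(TRAP_BAD_OPCODE),
            }
        }
    }
}
```

Because `pc` and `stack` live in the `Vm` value, calling `run` again after a `FrameSync` or `BudgetExhausted` ending resumes exactly where the previous call stopped.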

3.2 Safepoints

- `FRAME_SYNC` is the only safepoint.
- At `FRAME_SYNC`, the VM performs two actions in a well‑defined order:
  1) Garbage‑collection opportunity: root enumeration + mark–sweep.
  2) Scheduler handoff: the currently running coroutine may yield/sleep, and a next ready coroutine is selected deterministically.
- No other opcode constitutes a GC or scheduling safepoint. Syscalls do not implicitly trigger GC or rescheduling.

3.3 Scheduler Behavior (Cooperative Coroutines)

- Coroutines are cooperative and scheduled deterministically (FIFO among ready coroutines).
- `YIELD` and `SLEEP` take effect at `FRAME_SYNC`:
  - `YIELD` places the current coroutine at the end of the ready queue.
  - `SLEEP` parks the current coroutine until its exact `wake_tick`, after which it re‑enters the ready queue at the correct point.
- `SPAWN` creates a new coroutine with its own stack/frames recorded in the heap and enqueues it deterministically.
- No preemption: the VM never interrupts a coroutine between safepoints.
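
The queueing rules above can be modeled with a FIFO ready queue and a list of sleepers, as in the sketch below. `Scheduler`, `Request`, `CoroId`, and `frame_sync` are hypothetical names. The ordering chosen here (due sleepers are appended to the ready queue, by `wake_tick` and then by arrival, before the current coroutine's own request is applied) is one deterministic interpretation of "the correct point" and is an assumption, not a statement of the runtime's actual policy.

```rust
use std::collections::VecDeque;

pub type CoroId = u32;

/// What the running coroutine asked for before reaching FRAME_SYNC.
pub enum Request {
    Yield,
    Sleep { wake_tick: u64 },
}

#[derive(Default)]
pub struct Scheduler {
    ready: VecDeque<CoroId>,
    sleeping: Vec<(u64, CoroId)>, // (wake_tick, coroutine) in arrival order
}

impl Scheduler {
    /// SPAWN: a new coroutine joins the back of the ready queue.
    pub fn spawn(&mut self, id: CoroId) {
        self.ready.push_back(id);
    }

    /// Called only at FRAME_SYNC. Applies the pending request and
    /// returns the coroutine that should run next, if any.
    pub fn frame_sync(&mut self, current: CoroId, request: Request, now: u64) -> Option<CoroId> {
        // 1. Move sleepers whose wake_tick has arrived into the ready queue.
        let mut due: Vec<(u64, CoroId)> = Vec::new();
        self.sleeping.retain(|&(tick, id)| {
            if tick <= now {
                due.push((tick, id));
                false
            } else {
                true
            }
        });
        due.sort_by_key(|&(tick, _)| tick); // stable: ties keep arrival order
        self.ready.extend(due.into_iter().map(|(_, id)| id));

        // 2. Apply the current coroutine's request.
        match request {
            Request::Yield => self.ready.push_back(current),
            Request::Sleep { wake_tick } => self.sleeping.push((wake_tick, current)),
        }

        // 3. FIFO selection among ready coroutines.
        self.ready.pop_front()
    }
}
```

Nothing in this model depends on wall-clock time or hash ordering, so replaying the same sequence of requests and ticks yields the same schedule.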

4. Verification Model
----------------------

4.1 Verifier Responsibilities

The verifier statically checks bytecode for structural safety and stack‑shape correctness. Representative checks include:

- Instruction well‑formedness
  - Unknown opcode, truncated immediates/opcodes, malformed function boundaries, trailing bytes.
- Control‑flow integrity
  - Jump targets must be within bounds and land on instruction boundaries; functions must have proper terminators; path coverage ensures that every path reaches a valid exit.
- Stack discipline
  - No underflow/overflow relative to declared max stack; consistent stack height at control‑flow joins; `RET` occurs at the expected height.
- Call/return shape
  - Direct calls and returns must match the declared argument counts and return slot counts. Mismatches are rejected.
- Syscalls
  - The verifier runs only on the patched executable image. `HOSTCALL` is invalid at verification time. Final `SYSCALL` IDs must exist per `SyscallMeta`, and arity/declared return slot counts must match metadata.
- Closures
  - `CALL_CLOSURE` is only allowed on closure values; the callee function must be known; argument counts for closure calls must match.
- Coroutines
  - `YIELD` context must be valid; `SPAWN` argument counts are validated.
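
To make the stack-discipline and control-flow checks concrete, here is a small sketch of a stack-height verifier over already-decoded instructions. The instruction set (`Push`, `Pop`, `Add`, `Jump`, `JumpIfFalse`), the error names, and the `verify` signature are assumptions for illustration; only `RET` and the checked properties come from the list above.

```rust
#[derive(Clone, Copy, Debug)]
pub enum Instr {
    Push,               // pushes 1
    Pop,                // pops 1
    Add,                // pops 2, pushes 1
    Jump(usize),        // unconditional jump to an instruction index
    JumpIfFalse(usize), // pops 1, then jumps or falls through
    Ret,                // requires exactly `ret_slots` values on the stack
}

#[derive(Debug, PartialEq)]
pub enum VerifyError {
    StackUnderflow { pc: usize },
    StackOverflow { pc: usize },
    JumpOutOfBounds { pc: usize },
    HeightMismatch { pc: usize },
    BadReturnHeight { pc: usize },
    MissingTerminator,
}

/// Explore every reachable path, tracking the stack height on entry to
/// each instruction and rejecting any inconsistency.
pub fn verify(code: &[Instr], max_stack: usize, ret_slots: usize) -> Result<(), VerifyError> {
    let mut height_at: Vec<Option<usize>> = vec![None; code.len()];
    let mut work = vec![(0usize, 0usize)]; // (pc, stack height on entry)
    while let Some((pc, h)) = work.pop() {
        if pc >= code.len() {
            return Err(VerifyError::MissingTerminator); // fell off the end
        }
        let seen = height_at[pc];
        match seen {
            Some(prev) if prev == h => continue, // join with a consistent height
            Some(_) => return Err(VerifyError::HeightMismatch { pc }),
            None => height_at[pc] = Some(h),
        }
        let (pops, pushes) = match code[pc] {
            Instr::Push => (0, 1),
            Instr::Pop | Instr::JumpIfFalse(_) => (1, 0),
            Instr::Add => (2, 1),
            Instr::Jump(_) | Instr::Ret => (0, 0),
        };
        if h < pops {
            return Err(VerifyError::StackUnderflow { pc });
        }
        let out = h - pops + pushes;
        if out > max_stack {
            return Err(VerifyError::StackOverflow { pc });
        }
        match code[pc] {
            Instr::Ret => {
                if h != ret_slots {
                    return Err(VerifyError::BadReturnHeight { pc });
                }
            }
            Instr::Jump(t) | Instr::JumpIfFalse(t) => {
                if t >= code.len() {
                    return Err(VerifyError::JumpOutOfBounds { pc });
                }
                work.push((t, out));
                if matches!(code[pc], Instr::JumpIfFalse(_)) {
                    work.push((pc + 1, out)); // fall-through successor
                }
            }
            _ => work.push((pc + 1, out)),
        }
    }
    Ok(())
}
```

Working on decoded instructions sidesteps the "instruction boundary" check, which in a byte-oriented encoding would need a separate pass that records valid instruction start offsets.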

4.2 Runtime vs Verifier Guarantees

- The verifier guarantees structural correctness and stack‑shape invariants. It does not perform full type checking of value contents; dynamic checks (e.g., numeric domain checks, polymorphic comparisons, concrete syscall argument validation) occur at runtime and may trap.
- Capability gating for syscalls is enforced at load from cartridge capability flags and checked again at runtime by the VM/native interface.

5. Closures (Model B) — Calling Convention
-------------------------------------------

- Creation
  - `MAKE_CLOSURE` captures N values from the operand stack into a heap‑allocated environment alongside a function identifier. The opcode pushes a `HeapRef` to the new closure.
- Call
  - `CALL_CLOSURE` invokes a closure. The closure object itself is supplied to the callee as a hidden `arg0`. User‑visible arguments follow the function’s declared arity.
- Access to captures
  - The callee can access captured values via the closure’s environment. Captured `HeapRef`s are traced by the GC.
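
The calling convention can be sketched as follows. This is an illustrative model only: the operand layout for `CALL_CLOSURE` (closure reference pushed first, then the user-visible arguments), the `Func` signature, native Rust functions standing in for bytecode functions, and a `usize` index standing in for `HeapRef` are all assumptions.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Value {
    Int(i64),
    Ref(usize), // stands in for HeapRef
}

pub struct Closure {
    pub func_id: usize,
    pub env: Vec<Value>, // captured values
}

/// Callee signature: `args[0]` is the hidden closure reference (`arg0`),
/// followed by the user-visible arguments.
pub type Func = fn(&Vm, &[Value]) -> Value;

pub struct Vm {
    pub stack: Vec<Value>,
    pub heap: Vec<Closure>,
    pub funcs: Vec<Func>,
}

impl Vm {
    /// MAKE_CLOSURE: pop `n` captures, allocate the closure, push its reference.
    pub fn make_closure(&mut self, func_id: usize, n: usize) -> Result<(), &'static str> {
        if self.stack.len() < n {
            return Err("stack underflow");
        }
        let env = self.stack.split_off(self.stack.len() - n);
        self.heap.push(Closure { func_id, env });
        self.stack.push(Value::Ref(self.heap.len() - 1));
        Ok(())
    }

    /// CALL_CLOSURE: expects [closure, arg1, .., argN] on top of the stack.
    pub fn call_closure(&mut self, argc: usize) -> Result<(), &'static str> {
        if self.stack.len() < argc + 1 {
            return Err("stack underflow");
        }
        let args = self.stack.split_off(self.stack.len() - argc - 1);
        let Value::Ref(r) = args[0] else {
            return Err("CALL_CLOSURE on a non-closure value");
        };
        let closure = self.heap.get(r).ok_or("dangling closure reference")?;
        let func = *self.funcs.get(closure.func_id).ok_or("unknown function")?;
        let result = func(self, &args); // args[0] is the closure itself
        self.stack.push(result);
        Ok(())
    }
}

/// Example callee: adds its first capture to its single visible argument.
pub fn add_captured(vm: &Vm, args: &[Value]) -> Value {
    let Value::Ref(this) = args[0] else {
        panic!("arg0 must be the closure reference");
    };
    match (vm.heap[this].env[0], args[1]) {
        (Value::Int(captured), Value::Int(x)) => Value::Int(captured + x),
        _ => panic!("expected integers"),
    }
}
```

Passing the closure as `arg0` means the callee needs no separate "environment register": captures are reached through an ordinary argument, which the GC already scans as part of the frame.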

6. Unified Syscall ABI
-----------------------

- Identification
  - Host bindings are declared canonically as `(module, name, version)` in PBX `SYSC`, then executed as numeric IDs after loader patching. Syscalls are not first‑class values.
- Metadata‑driven
  - `SyscallMeta` defines expected arity and return slot counts. The loader resolves `HOSTCALL` against this metadata and rejects raw `SYSCALL` in PBX pre-load artifacts; the verifier checks final IDs/arity/return‑slot counts against the same metadata.
- Arguments and returns
  - Arguments are taken from the operand stack in the order defined by the ABI. Returns use multi‑slot results via a host‑side return buffer (`HostReturn`) which the VM copies back onto the stack, or zero slots for “void”. A mismatch in result counts is a fault/panic per current hardening logic.
- Capabilities
  - Cartridge capability flags are applied before load-time host resolution. Missing required capability aborts load; invoking a syscall without the required capability also traps defensively at runtime.
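
The load-time resolution step can be sketched as a single pass over decoded instructions. The field layout of `SyscallMeta`, the `Binding` struct, the bitmask representation of capabilities, the `LoadError` variants, and the `patch` function are assumptions for illustration; the document only fixes the behavior (resolve `HOSTCALL`, reject raw `SYSCALL`, abort on a missing capability).

```rust
use std::collections::HashMap;

/// Canonical host binding as declared in `SYSC`: (module, name, version).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Binding {
    pub module: String,
    pub name: String,
    pub version: u16,
}

/// Hypothetical metadata record; the real `SyscallMeta` may differ.
pub struct SyscallMeta {
    pub id: u32,
    pub arity: u8,
    pub ret_slots: u8,
    pub required_caps: u32, // bitmask of capabilities the call needs
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Instr {
    HostCall(usize), // index into the SYSC table
    Syscall(u32),    // final numeric ID
    Other,           // any other opcode
}

#[derive(Debug, PartialEq)]
pub enum LoadError {
    RawSyscall,
    BadSyscIndex(usize),
    UnknownBinding(usize),
    MissingCapability(u32),
}

/// Rewrite every HOSTCALL into SYSCALL, or fail the load.
pub fn patch(
    code: &mut [Instr],
    sysc: &[Binding],
    registry: &HashMap<Binding, SyscallMeta>,
    caps: u32,
) -> Result<(), LoadError> {
    for instr in code.iter_mut() {
        let current = *instr;
        match current {
            // Pre-load artifacts must not contain already-numeric syscalls.
            Instr::Syscall(_) => return Err(LoadError::RawSyscall),
            Instr::HostCall(index) => {
                let binding = sysc.get(index).ok_or(LoadError::BadSyscIndex(index))?;
                let meta = registry.get(binding).ok_or(LoadError::UnknownBinding(index))?;
                if meta.required_caps & caps != meta.required_caps {
                    return Err(LoadError::MissingCapability(meta.id));
                }
                *instr = Instr::Syscall(meta.id);
            }
            Instr::Other => {}
        }
    }
    Ok(())
}
```

On error the code buffer may be left partially patched, which is acceptable in this sketch because the load is aborted and the image is discarded.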

7. Garbage Collection
----------------------

- Collector
  - Non‑moving mark–sweep.
- Triggers
  - GC runs only at `FRAME_SYNC` safepoints.
- Liveness
  - Roots comprise the live VM stack/frames and all suspended coroutines. The collector traverses object‑specific children (array elements, closure environments, coroutine stacks).
- Determinism
  - GC opportunities and scheduling order are tied to `FRAME_SYNC`, ensuring repeatable execution traces across runs with the same inputs.

8. Non‑Goals
-------------

- No RC
- No HIP
- No preemption
- No mailbox

9. Notes for Contributors
--------------------------

- Keep the public surface minimal and metadata‑driven (e.g., syscalls via `SyscallMeta`).
- Do not assume implicit safepoints; schedule and GC only at `FRAME_SYNC`.
- When adding new opcodes or object kinds, extend the verifier and GC traversal accordingly (children enumeration, environment scanning, root sets).
- This document is the canonical reference; update it alongside any architectural change.