Prometeu VM Runtime — Canonical Architecture
Status: canonical
This document is the authoritative architectural reference for the Prometeu VM runtime. It reflects the implementation as it exists today and defines the invariants that govern architectural changes in the VM layer.
Scope boundary:
- PROMETEU itself is a fantasy handheld / fantasy console with a broader machine model, firmware model, cartridge model, and virtual hardware surface.
- This document does not define the whole PROMETEU machine.
- This document defines the VM/runtime subsystem that executes bytecode inside that machine.
- For broader machine-level framing, see `../specs/README.md`.
Document roles:
- This file is normative for VM/runtime architecture.
- Detailed domain specifications may live under `docs/runtime/specs/`, but they must not contradict this document where VM/runtime invariants are concerned.
- Roadmaps, agendas, and PR proposals may discuss future changes, but they are not authoritative until this document is updated.
- The machine-wide fantasy console framing lives in the runtime specs manual and related domain specs; those documents are complementary, not competing VM architecture sources.
Maintenance rule:
- Any PR that changes VM/runtime architectural invariants must update this document in the same change.
1. Overview
- Stack‑based virtual machine
- Operand stack + call frames; bytecode is fetched from a ROM/program image with a separate constant pool.
- GC‑managed heap
- Non‑compacting mark–sweep collector; stable object handles (`HeapRef`) while live. Sweep invalidates unreachable handles; objects are never moved.
- Closures (Model B)
- First‑class closures with a heap‑allocated environment. The closure object is passed to the callee as a hidden `arg0` when invoking a closure.
- Cooperative coroutines
- Deterministic, cooperative scheduling. Switching and GC occur only at explicit safepoints (`FRAME_SYNC`).
- Unified syscall ABI
- PBX pre-load artifacts declare canonical host bindings in `SYSC` and encode call sites as `HOSTCALL <sysc_index>`. The loader resolves and patches them to numeric `SYSCALL <id>` before verification/execution. Capability gating is enforced at load and checked again defensively at runtime. Syscalls are not first‑class values.
2. Memory Model
2.1 Stack vs Heap
- Stack
- Each running context has an operand stack plus call frames (locals, return bookkeeping). Primitive values (integers, floats, booleans) reside on the stack. Heap objects are referenced by opaque `HeapRef` values on the stack.
- The VM’s current operand stack and frames are GC roots.
- Heap
- The heap stores runtime objects that require identity and reachability tracking. Handles are `HeapRef` indices into an internal object store.
- The collector is mark–sweep, non‑moving: it marks from roots, then reclaims unreachable objects without relocating survivors. Indices for live objects remain stable across collections.
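To make the handle model concrete, here is a minimal Python sketch of a non‑moving object store. The class name `ObjectStore` and its methods are illustrative assumptions, not the runtime's actual types; the only property taken from the text above is that a live object's index never changes when other objects are reclaimed.

```python
class ObjectStore:
    """Hypothetical non-moving store: a HeapRef is modelled as a slot index."""

    def __init__(self):
        self.slots = []  # slot index -> object, or None once reclaimed

    def alloc(self, obj):
        # New objects always take a fresh slot in this sketch.
        self.slots.append(obj)
        return len(self.slots) - 1

    def get(self, ref):
        obj = self.slots[ref]
        if obj is None:
            raise LookupError(f"stale HeapRef {ref}")
        return obj

    def release(self, ref):
        # What a sweep would do for an unreachable object.
        # Survivors are not touched, so their indices stay valid.
        self.slots[ref] = None


store = ObjectStore()
a = store.alloc(["array", 1, 2])
b = store.alloc(["array", 3])
store.release(a)                      # pretend `a` was unreachable
assert store.get(b) == ["array", 3]   # `b` is still found at the same index
```

Whether reclaimed slots are reused, and how stale handles are detected, is not specified above and is left out of the sketch.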
2.2 Heap Object Kinds (as used today)
- Arrays of `Value`
- Variable‑length arrays whose elements may contain further `HeapRef`s.
- Closures
- Carry a function identifier and a captured environment (a slice/vector of `Value`s stored with the closure). Captured `HeapRef`s are traversed by the GC.
- Coroutines
- Heap‑resident coroutine records (state + wake time + suspended operand stack and call frames). These act as GC roots when suspended.
Notes:
- Literals like strings and numbers are sourced from the constant pool in the program image; heap allocation is only used for runtime objects (closures, arrays, coroutine records, and any future heap kinds). The constant pool never embeds raw `HeapRef`s.
2.3 GC Roots
- VM roots
- Current operand stack and call frames of the running coroutine (or main context).
- Suspended coroutines
- All heap‑resident, suspended coroutine objects are treated as roots. Their saved stacks/frames are scanned during marking.
- Root traversal
- The VM exposes a root‑visitor that walks the operand stack, frames, and coroutine records to feed the collector. The collector then follows children from each object kind (e.g., array elements, closure environments, coroutine stacks).
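The root set and per‑kind child traversal described above can be sketched as follows. This is a simplified model, not the collector's real code: the dictionary‑based heap, the `("ref", index)` encoding of a `HeapRef`, and the field names `elems`, `env`, and `stack` are all assumptions made for illustration.

```python
# Hypothetical layout: which field of each object kind holds its child values.
CHILD_FIELD = {"array": "elems", "closure": "env", "coroutine": "stack"}


def refs_in(values):
    """Pick out HeapRefs, modelled here as ("ref", index) tuples."""
    return [v[1] for v in values if isinstance(v, tuple) and v[0] == "ref"]


def collect(heap, vm_stack, suspended):
    """Mark from roots, then sweep unreachable entries without moving survivors."""
    worklist = refs_in(vm_stack)     # running context: operand stack and frames
    worklist += list(suspended)      # suspended coroutine records are roots too
    marked = set()
    while worklist:
        ref = worklist.pop()
        if ref in marked or ref not in heap:
            continue
        marked.add(ref)
        obj = heap[ref]
        worklist.extend(refs_in(obj[CHILD_FIELD[obj["kind"]]]))
    for ref in list(heap):
        if ref not in marked:
            del heap[ref]            # handle becomes invalid; nothing is relocated
    return marked


heap = {
    0: {"kind": "array", "elems": [("ref", 1), 5]},
    1: {"kind": "closure", "env": [7]},
    2: {"kind": "coroutine", "stack": [("ref", 3)]},
    3: {"kind": "array", "elems": []},
    4: {"kind": "array", "elems": []},   # unreachable
}
collect(heap, vm_stack=[("ref", 0), 42], suspended=[2])
assert sorted(heap) == [0, 1, 2, 3]
```

Object 3 survives only because the suspended coroutine's saved stack references it, which is why suspended coroutines must be part of the root set.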
3. Execution Model
3.1 Interpreter Loop
- The VM runs a classic fetch–decode–execute loop over the ROM’s bytecode. The current program counter (PC), operand stack, and call frames define execution state.
- Function calls establish new frames; returns restore the caller’s frame and adjust the operand stack to the callee’s declared return slot count (the verifier enforces this shape statically).
- Errors
- Traps (well‑defined fault conditions) surface as trap reasons; panics indicate internal consistency failures. The VM can report logical frame endings such as `FrameSync`, `BudgetExhausted`, `Halted`, end‑of‑ROM, `Breakpoint`, `Trap(code, …)`, and `Panic(msg)`.
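As an illustration of the loop and its logical frame endings, the sketch below runs a toy program until one of those endings is reached. The opcode numbering, the one‑slot immediate encoding, the string form of the endings, and the assumption that the budget is counted in instructions are all hypothetical; only the fetch–decode–execute shape and the names of the endings come from the text above.

```python
PUSH_CONST, ADD, FRAME_SYNC, HALT = range(4)   # hypothetical opcode numbering


def run_frame(rom, consts, state, budget):
    """Execute until a logical frame ending is reached; return it as a string."""
    while True:
        if state["pc"] >= len(rom):
            return "EndOfRom"
        if budget == 0:
            return "BudgetExhausted"
        budget -= 1
        op = rom[state["pc"]]                  # fetch
        state["pc"] += 1
        if op == PUSH_CONST:                   # decode + execute
            idx = rom[state["pc"]]             # immediate: constant pool index
            state["pc"] += 1
            state["stack"].append(consts[idx])
        elif op == ADD:
            b = state["stack"].pop()
            a = state["stack"].pop()
            state["stack"].append(a + b)
        elif op == FRAME_SYNC:
            return "FrameSync"
        elif op == HALT:
            return "Halted"
        else:
            return f"Trap(unknown opcode {op})"


rom = [PUSH_CONST, 0, PUSH_CONST, 1, ADD, FRAME_SYNC, HALT]
state = {"pc": 0, "stack": []}
assert run_frame(rom, [2, 40], state, budget=100) == "FrameSync"
assert state["stack"] == [42]
assert run_frame(rom, [2, 40], state, budget=100) == "Halted"
```

The sketch omits call frames and does not guard against truncated immediates or stack underflow, since those shapes are rejected statically by the verifier before execution.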
3.2 Safepoints
- `FRAME_SYNC` is the only safepoint.
- At `FRAME_SYNC`, the VM performs two actions in a well‑defined order:
  1. Garbage‑collection opportunity: root enumeration + mark–sweep.
  2. Scheduler handoff: the currently running coroutine may yield/sleep, and a next ready coroutine is selected deterministically.
- No other opcode constitutes a GC or scheduling safepoint. Syscalls do not implicitly trigger GC or rescheduling.
3.3 Scheduler Behavior (Cooperative Coroutines)
- Coroutines are cooperative and scheduled deterministically (FIFO among ready coroutines).
- `YIELD` and `SLEEP` take effect at `FRAME_SYNC`:
  - `YIELD` places the current coroutine at the end of the ready queue.
  - `SLEEP` parks the current coroutine until its exact `wake_tick`, after which it re‑enters the ready queue at the correct point.
- `SPAWN` creates a new coroutine with its own stack/frames recorded in the heap and enqueues it deterministically.
- No preemption: the VM never interrupts a coroutine between safepoints.
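The scheduling rules above can be modelled with a small FIFO scheduler whose decisions are applied only when `frame_sync` is called. The class, the `request` parameter, and the tie‑breaking order (sleepers that are due re‑enter the queue before a yielding coroutine, ordered by wake tick and then by the order in which they went to sleep) are assumptions for illustration; the text above only fixes FIFO order and determinism.

```python
from collections import deque


class Scheduler:
    """Hypothetical FIFO scheduler; state changes happen only in frame_sync()."""

    def __init__(self):
        self.ready = deque()   # coroutine ids, FIFO
        self.sleeping = []     # (wake_tick, seq, coroutine id)
        self.seq = 0           # insertion counter, breaks wake_tick ties

    def spawn(self, co):
        self.ready.append(co)

    def frame_sync(self, tick, current, request=None, wake_tick=None):
        """Apply the pending YIELD/SLEEP of `current`, then pick who runs next."""
        # Wake sleepers whose tick has arrived, in a fixed (wake_tick, seq) order.
        due = sorted(s for s in self.sleeping if s[0] <= tick)
        self.sleeping = [s for s in self.sleeping if s[0] > tick]
        self.ready.extend(co for _, _, co in due)
        if request == "sleep":
            self.sleeping.append((wake_tick, self.seq, current))
            self.seq += 1
        elif request == "yield":
            self.ready.append(current)
        elif current is not None:
            return current     # no request: the current coroutine keeps running
        return self.ready.popleft() if self.ready else None


sched = Scheduler()
sched.spawn("A")
sched.spawn("B")
running = sched.frame_sync(0, None)                           # picks "A"
running = sched.frame_sync(1, running, "yield")               # picks "B"
running = sched.frame_sync(2, running, "sleep", wake_tick=4)  # picks "A"
assert running == "A"
```

Because every queue operation is driven by the tick and by insertion order, replaying the same sequence of requests always produces the same schedule.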
4. Verification Model
4.1 Verifier Responsibilities
The verifier statically checks bytecode for structural safety and stack‑shape correctness. Representative checks include:
- Instruction well‑formedness
- Unknown opcode, truncated immediates/opcodes, malformed function boundaries, trailing bytes.
- Control‑flow integrity
- Jump targets within bounds and to instruction boundaries; functions must have proper terminators; path coverage ensures a valid exit.
- Stack discipline
- No underflow/overflow relative to declared max stack; consistent stack height at control‑flow joins; `RET` occurs at the expected height.
- Call/return shape
- Direct calls and returns must match the declared argument counts and return slot counts. Mismatches are rejected.
- Syscalls
- The verifier runs only on the patched executable image. `HOSTCALL` is invalid at verification time. Final `SYSCALL` IDs must exist per `SyscallMeta`, and arity/declared return slot counts must match metadata.
- Closures
- `CALL_CLOSURE` is only allowed on closure values; the callee function must be known; argument counts for closure calls must match.
- Coroutines
- `YIELD` context must be valid; `SPAWN` argument counts are validated.
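The stack‑discipline and control‑flow checks listed above amount to tracking one stack height per instruction and rejecting any disagreement. The following sketch shows that idea on a toy instruction set; the opcode names, the `(pops, pushes)` table, and the rule that `RET` must see exactly `ret_slots` values are assumptions, not the real verifier's tables.

```python
# Hypothetical instruction set: name -> (pops, pushes).
EFFECTS = {"PUSH": (0, 1), "ADD": (2, 1), "POP": (1, 0),
           "JMP": (0, 0), "JMP_IF": (1, 0), "RET": (0, 0)}


def verify(code, max_stack, ret_slots):
    """Return None if the stack shape is consistent, else an error string."""
    height_at = {0: 0}          # instruction index -> stack height on entry
    work = [0]
    while work:
        pc = work.pop()
        op, arg = code[pc]
        if op not in EFFECTS:
            return f"unknown opcode at {pc}"
        pops, pushes = EFFECTS[op]
        h = height_at[pc]
        if h < pops:
            return f"stack underflow at {pc}"
        h = h - pops + pushes
        if h > max_stack:
            return f"stack overflow at {pc}"
        if op == "RET":
            if h != ret_slots:
                return f"bad return height at {pc}"
            continue
        if op == "JMP":
            succs = [arg]
        elif op == "JMP_IF":
            succs = [pc + 1, arg]
        else:
            succs = [pc + 1]
        for s in succs:
            if not 0 <= s < len(code):
                return f"target out of bounds at {pc}"
            if s not in height_at:
                height_at[s] = h
                work.append(s)
            elif height_at[s] != h:
                return f"inconsistent stack height at join {s}"
    return None


ok = [("PUSH", None), ("JMP_IF", 3), ("JMP", 3), ("PUSH", None), ("RET", None)]
bad = [("PUSH", None), ("JMP_IF", 3), ("PUSH", None), ("PUSH", None), ("RET", None)]
assert verify(ok, max_stack=2, ret_slots=1) is None
assert verify(bad, max_stack=2, ret_slots=1) == "inconsistent stack height at join 3"
```

In `bad`, instruction 3 is reached with height 0 through the branch and with height 1 through fall‑through, which is exactly the kind of join mismatch a stack‑shape verifier rejects.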
4.2 Runtime vs Verifier Guarantees
- The verifier guarantees structural correctness and stack‑shape invariants. It does not perform full type checking of value contents; dynamic checks (e.g., numeric domain checks, polymorphic comparisons, concrete syscall argument validation) occur at runtime and may trap.
- Capability gating for syscalls is enforced at load from cartridge capability flags and checked again at runtime by the VM/native interface.
5. Closures (Model B): Calling Convention
- Creation
- `MAKE_CLOSURE` captures N values from the operand stack into a heap‑allocated environment alongside a function identifier. The opcode pushes a `HeapRef` to the new closure.
- Call
- `CALL_CLOSURE` invokes a closure. The closure object itself is supplied to the callee as a hidden `arg0`. User‑visible arguments follow the function’s declared arity.
- Access to captures
- The callee can access captured values via the closure’s environment. Captured `HeapRef`s are traced by the GC.
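A compact way to see the convention is to model the two opcodes as functions over an operand stack. In this sketch the list‑based heap, the `("ref", index)` handle encoding, the capture order, and the stack layout at the call site (closure below its arguments) are assumptions; the parts taken from the text are that creation pushes a handle and that the callee receives the closure itself as a hidden `arg0`.

```python
heap = []   # index in this list stands in for a HeapRef


def make_closure(stack, fn_id, n):
    """MAKE_CLOSURE sketch: pop n captured values, push a ref to the new closure."""
    env = [stack.pop() for _ in range(n)][::-1]   # restore push order
    heap.append({"fn": fn_id, "env": env})
    stack.append(("ref", len(heap) - 1))


def call_closure(stack, functions, argc):
    """CALL_CLOSURE sketch: the callee gets the closure ref as hidden arg0."""
    args = [stack.pop() for _ in range(argc)][::-1]
    closure_ref = stack.pop()
    closure = heap[closure_ref[1]]
    stack.append(functions[closure["fn"]](closure_ref, *args))


def add_captured(arg0, x):
    # The callee reaches its captures through the closure passed as arg0.
    return heap[arg0[1]]["env"][0] + x


stack = [10]
make_closure(stack, "add_captured", 1)   # captures the value 10
stack.append(32)                         # user-visible argument
call_closure(stack, {"add_captured": add_captured}, 1)
assert stack == [42]
```

Passing the closure as `arg0` means the callee needs no separate mechanism to find its environment: captures are read through an ordinary argument slot.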
6. Unified Syscall ABI
- Identification
- Host bindings are declared canonically as `(module, name, version)` in PBX `SYSC`, then executed as numeric IDs after loader patching. Syscalls are not first‑class values.
- Metadata‑driven
- `SyscallMeta` defines expected arity and return slot counts. The loader resolves `HOSTCALL` against this metadata and rejects raw `SYSCALL` in PBX pre-load artifacts; the verifier checks final IDs/arity/return‑slot counts against the same metadata.
- Arguments and returns
- Arguments are taken from the operand stack in the order defined by the ABI. Returns use multi‑slot results via a host‑side return buffer (`HostReturn`) which the VM copies back onto the stack, or zero slots for “void”. A mismatch in result counts is a fault/panic per current hardening logic.
- Example: the canonical asset runtime load surface is `asset.load(asset_id, slot) -> (status, handle)`. The caller does not supply `asset_name` or `asset_type`; bank kind is derived from `asset_table` using `asset_id`.
- Capabilities
- Cartridge capability flags are applied before load-time host resolution. Missing required capability aborts load; invoking a syscall without the required capability also traps defensively at runtime.
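The load‑time half of this ABI (resolve each `HOSTCALL` through `SYSC`, check capabilities, emit a numeric `SYSCALL`) can be sketched as below. The registry contents are invented for illustration: the numeric IDs, the capability names, the version number, and the `gfx.clear` binding are hypothetical, and the tuple‑based instruction encoding is a stand‑in for the real bytecode format.

```python
# Hypothetical registry: (module, name, version) -> (id, arity, ret_slots, capability)
SYSCALL_META = {
    ("asset", "load", 1): (0x10, 2, 2, "ASSET"),
    ("gfx", "clear", 1): (0x20, 1, 0, "GFX"),
}


def patch_hostcalls(code, sysc, capabilities):
    """Rewrite ("HOSTCALL", sysc_index) into ("SYSCALL", id) or fail the load."""
    patched = []
    for op, arg in code:
        if op == "SYSCALL":
            raise ValueError("raw SYSCALL is not allowed in a pre-load artifact")
        if op == "HOSTCALL":
            if not 0 <= arg < len(sysc):
                raise ValueError(f"HOSTCALL index {arg} out of range")
            binding = sysc[arg]
            if binding not in SYSCALL_META:
                raise ValueError(f"unknown host binding {binding}")
            sys_id, _arity, _rets, cap = SYSCALL_META[binding]
            if cap not in capabilities:
                raise PermissionError(f"missing capability {cap}")
            op, arg = "SYSCALL", sys_id
        patched.append((op, arg))
    return patched


sysc = [("asset", "load", 1)]
code = [("PUSH", 7), ("PUSH", 0), ("HOSTCALL", 0)]
assert patch_hostcalls(code, sysc, {"ASSET"}) == [("PUSH", 7), ("PUSH", 0), ("SYSCALL", 0x10)]
```

After this pass no `HOSTCALL` remains, which matches the requirement that the verifier only ever sees the patched image. The defensive runtime capability check would be a separate step at dispatch time and is not shown.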
7. Garbage Collection
- Collector
- Non‑moving mark–sweep.
- Triggers
- GC runs only at `FRAME_SYNC` safepoints.
- Liveness
- Roots comprise: the live VM stack/frames and all suspended coroutines. The collector traverses object‑specific children (array elements, closure environments, coroutine stacks).
- Determinism
- GC opportunities and scheduling order are tied to `FRAME_SYNC`, ensuring repeatable execution traces across runs with the same inputs.
8. Non‑Goals
- No RC
- No HIP
- No preemption
- No mailbox
9. Notes for Contributors
- Keep the public surface minimal and metadata‑driven (e.g., syscalls via `SyscallMeta`).
- Do not assume implicit safepoints; schedule and GC only at `FRAME_SYNC`.
- When adding new opcodes or object kinds, extend the verifier and GC traversal accordingly (children enumeration, environment scanning, root sets).
- Update this document alongside any architectural change that affects runtime invariants.