prometeu-runtime/ARCHITECTURE.md
2026-03-24 13:40:42 +00:00

Prometeu Runtime — Architecture (Baseline)

This document is the concise, authoritative description of the current Prometeu VM baseline after the architectural reset. It reflects the implementation as it exists today, with no legacy or transitional wording.

  1. Overview

  • Stack-based virtual machine
    • Operand stack + call frames; bytecode is fetched from a ROM/program image with a separate constant pool.
  • GC-managed heap
    • Non-compacting mark-sweep collector; stable object handles (HeapRef) while live. Sweep invalidates unreachable handles; objects are never moved.
  • Closures (Model B)
    • First-class closures with a heap-allocated environment. The closure object is passed to the callee as a hidden arg0 when invoking a closure.
  • Cooperative coroutines
    • Deterministic, cooperative scheduling. Switching and GC occur only at explicit safepoints (FRAME_SYNC).
  • Unified syscall ABI
    • Numeric ID dispatch with metadata (SyscallMeta). The verifier enforces arity and return-slot counts; capability gating happens at runtime. Syscalls are not first-class values.
  2. Memory Model

2.1 Stack vs Heap

  • Stack

    • Each running context has an operand stack plus call frames (locals, return bookkeeping). Primitive values (integers, floats, booleans) reside on the stack. Heap objects are referenced by opaque HeapRef values on the stack.
    • The VM's current operand stack and frames are GC roots.
  • Heap

    • The heap stores runtime objects that require identity and reachability tracking. Handles are HeapRef indices into an internal object store.
    • The collector is mark-sweep, non-moving: it marks from roots, then reclaims unreachable objects without relocating survivors. Indices for live objects remain stable across collections.
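
The split between stack values and heap handles can be sketched as follows. This is a minimal illustration, not the runtime's actual code: the `Heap` class, its `alloc`/`free`/`get` methods, and the append-only slot policy are assumptions made for the example; only the `HeapRef` name comes from this document.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class HeapRef:
    """Opaque handle: an index into the heap's object store."""
    index: int


# A stack slot holds a primitive or a HeapRef, never the object itself.
Value = Union[int, float, bool, HeapRef]


class Heap:
    """Hypothetical object store; slots are never relocated."""

    def __init__(self) -> None:
        self.slots: List[Optional[list]] = []  # None marks a reclaimed slot

    def alloc(self, obj: list) -> HeapRef:
        # Append-only in this sketch; slot reuse policy is not specified here.
        self.slots.append(obj)
        return HeapRef(len(self.slots) - 1)

    def free(self, ref: HeapRef) -> None:
        # Stands in for the sweep phase reclaiming an unreachable object.
        self.slots[ref.index] = None

    def get(self, ref: HeapRef) -> list:
        obj = self.slots[ref.index]
        if obj is None:
            raise LookupError("stale HeapRef")
        return obj


heap = Heap()
a = heap.alloc([1, 2])
b = heap.alloc([3])
heap.free(a)  # reclaiming `a` does not move `b`
operand_stack: List[Value] = [42, 1.5, True, b]
```

After `a` is reclaimed, `b` still resolves through the same index, which is the property the non-moving collector relies on.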

2.2 Heap Object Kinds (as used today)

  • Arrays of Value
    • Variable-length arrays whose elements may contain further HeapRefs.
  • Closures
    • Carry a function identifier and a captured environment (a slice/vector of Values stored with the closure). Captured HeapRefs are traversed by the GC.
  • Coroutines
    • Heap-resident coroutine records (state + wake time + suspended operand stack and call frames). These act as GC roots when suspended.

Notes:

  • Literals like strings and numbers are sourced from the constant pool in the program image; heap allocation is only used for runtime objects (closures, arrays, coroutine records, and any future heap kinds). The constant pool never embeds raw HeapRefs.
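
As an illustration of how each object kind exposes the handles it holds, the sketch below models the three kinds as plain Python dataclasses. The field names and the `children` helper are hypothetical; the document only states which data each kind carries.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HeapRef:
    index: int


@dataclass
class Array:
    elements: List[object]


@dataclass
class Closure:
    fn_id: int
    env: List[object]          # captured values


@dataclass
class Coroutine:
    state: str
    wake_tick: int
    stack: List[object]        # suspended operand stack
    frames: List[List[object]] # locals of each suspended call frame


def children(obj: object) -> List[HeapRef]:
    """Return the HeapRefs directly reachable from one heap object."""
    if isinstance(obj, Array):
        values = obj.elements
    elif isinstance(obj, Closure):
        values = obj.env
    elif isinstance(obj, Coroutine):
        values = obj.stack + [v for frame in obj.frames for v in frame]
    else:
        raise TypeError(f"unknown heap object kind: {type(obj).__name__}")
    return [v for v in values if isinstance(v, HeapRef)]


arr = Array([1, HeapRef(2), True])
clo = Closure(7, [HeapRef(0), 3.5])
co = Coroutine("sleeping", 12, [HeapRef(4)], [[HeapRef(5), 9]])
```

A new object kind would need its own branch in a function like `children`, otherwise the collector could not see the handles it holds.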

2.3 GC Roots

  • VM roots
    • Current operand stack and call frames of the running coroutine (or main context).
  • Suspended coroutines
    • All heap-resident, suspended coroutine objects are treated as roots. Their saved stacks/frames are scanned during marking.
  • Root traversal
    • The VM exposes a root visitor that walks the operand stack, frames, and coroutine records to feed the collector. The collector then follows children from each object kind (e.g., array elements, closure environments, coroutine stacks).
  3. Execution Model
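
Root enumeration as described above might look like the following sketch. The `Context` and `Frame` types and the `visit_roots` function are assumptions for illustration; in particular, suspended coroutines are modeled here as plain saved contexts rather than heap-resident records.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass(frozen=True)
class HeapRef:
    index: int


@dataclass
class Frame:
    locals: List[object] = field(default_factory=list)


@dataclass
class Context:
    """Operand stack plus call frames of one coroutine (or the main context)."""
    stack: List[object] = field(default_factory=list)
    frames: List[Frame] = field(default_factory=list)


def context_refs(ctx: Context) -> Iterator[HeapRef]:
    # Primitives are skipped; only handles are reported to the collector.
    for value in ctx.stack:
        if isinstance(value, HeapRef):
            yield value
    for frame in ctx.frames:
        for value in frame.locals:
            if isinstance(value, HeapRef):
                yield value


def visit_roots(running: Context, suspended: List[Context]) -> Iterator[HeapRef]:
    yield from context_refs(running)
    for saved in suspended:
        yield from context_refs(saved)


running = Context(stack=[1, HeapRef(3)], frames=[Frame([HeapRef(5), 2.0])])
suspended = [Context(stack=[HeapRef(7)])]
root_indices = [ref.index for ref in visit_roots(running, suspended)]
```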

3.1 Interpreter Loop

  • The VM runs a classic fetch-decode-execute loop over the ROM's bytecode. The current program counter (PC), operand stack, and call frames define execution state.
  • Function calls establish new frames; returns restore the caller's frame and adjust the operand stack to the callee's declared return slot count (the verifier enforces this shape statically).
  • Errors
    • Traps (well-defined fault conditions) surface as trap reasons; panics indicate internal consistency failures. The VM can report logical frame endings such as FrameSync, BudgetExhausted, Halted, end-of-ROM, Breakpoint, Trap(code, …), and Panic(msg).
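
A toy version of such a loop is sketched below. The opcode numbers, the one-byte immediate encoding, and the string result names are invented for the example and do not reflect the real instruction set; calls, frames, and traps are omitted.

```python
from typing import List, Tuple

# Hypothetical opcode numbers, chosen only for this sketch.
PUSH_CONST, ADD, FRAME_SYNC, HALT = 0, 1, 2, 3


def run(rom: bytes, consts: List[int], budget: int) -> Tuple[str, List[int], int]:
    """Run until a logical frame ending; returns (reason, operand stack, pc)."""
    pc = 0
    stack: List[int] = []
    while True:
        if pc >= len(rom):
            return "EndOfRom", stack, pc
        if budget == 0:
            return "BudgetExhausted", stack, pc
        budget -= 1
        op = rom[pc]  # fetch
        pc += 1
        if op == PUSH_CONST:  # decode + execute
            stack.append(consts[rom[pc]])  # one-byte constant-pool index
            pc += 1
        elif op == ADD:
            # Underflow is assumed to be ruled out by the verifier.
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs + rhs)
        elif op == FRAME_SYNC:
            return "FrameSync", stack, pc
        elif op == HALT:
            return "Halted", stack, pc
        else:
            # Verified code never reaches this branch, so treat it as a panic.
            return "Panic(unknown opcode)", stack, pc


rom = bytes([PUSH_CONST, 0, PUSH_CONST, 1, ADD, HALT])
consts = [40, 2]
finished = run(rom, consts, budget=100)
starved = run(rom, consts, budget=2)
```

With a generous budget the program halts with `42` on the stack; with a budget of two instructions it stops early and reports `BudgetExhausted`.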

3.2 Safepoints

  • FRAME_SYNC is the only safepoint.
    • At FRAME_SYNC, the VM performs two actions in a well-defined order:
      1. Garbage-collection opportunity: root enumeration + mark-sweep.
      2. Scheduler handoff: the currently running coroutine may yield/sleep, and a next ready coroutine is selected deterministically.
  • No other opcode constitutes a GC or scheduling safepoint. Syscalls do not implicitly trigger GC or rescheduling.
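
The ordering rule can be shown with a tiny event log. The `Vm` class and its `step` method are hypothetical stand-ins; the only point is that GC and scheduling happen at FRAME_SYNC, in that order, and nowhere else.

```python
from typing import List


class Vm:
    """Hypothetical VM shell that only records what happens at each opcode."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def step(self, opcode: str) -> None:
        if opcode == "FRAME_SYNC":
            self.log.append("gc")        # 1. collection opportunity
            self.log.append("schedule")  # 2. scheduler handoff
        else:
            # Every other opcode, syscalls included, is not a safepoint.
            self.log.append(f"exec {opcode}")


vm = Vm()
for opcode in ["ADD", "SYSCALL", "FRAME_SYNC", "ADD"]:
    vm.step(opcode)
```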

3.3 Scheduler Behavior (Cooperative Coroutines)

  • Coroutines are cooperative and scheduled deterministically (FIFO among ready coroutines).
  • YIELD and SLEEP take effect at FRAME_SYNC:
    • YIELD places the current coroutine at the end of the ready queue.
    • SLEEP parks the current coroutine until its exact wake_tick, after which it re-enters the ready queue at the correct point.
  • SPAWN creates a new coroutine with its own stack/frames recorded in the heap and enqueues it deterministically.
  • No preemption: the VM never interrupts a coroutine between safepoints.
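
A minimal model of this scheduling policy is sketched below. The `Scheduler` class and its method names are hypothetical. The document does not define "the correct point" for woken sleepers, so this sketch assumes they are appended behind already-ready coroutines, ordered by wake tick and then by the order in which they went to sleep.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class Scheduler:
    def __init__(self) -> None:
        self.ready: Deque[str] = deque()
        self.sleeping: List[Tuple[int, int, str]] = []  # (wake_tick, seq, id)
        self.seq = 0  # tie-breaker that keeps wake order deterministic

    def spawn(self, co: str) -> None:
        self.ready.append(co)

    def yield_now(self, co: str) -> None:
        self.ready.append(co)  # back of the FIFO queue

    def sleep(self, co: str, wake_tick: int) -> None:
        self.sleeping.append((wake_tick, self.seq, co))
        self.seq += 1

    def next_ready(self, tick: int) -> Optional[str]:
        """Called at FRAME_SYNC: wake due sleepers, then pick the queue head."""
        due = sorted(entry for entry in self.sleeping if entry[0] <= tick)
        self.sleeping = [entry for entry in self.sleeping if entry[0] > tick]
        for _, _, co in due:
            self.ready.append(co)
        return self.ready.popleft() if self.ready else None


s = Scheduler()
s.spawn("A")
s.spawn("B")
trace = []
trace.append(s.next_ready(tick=0))  # A runs first
s.yield_now("A")
trace.append(s.next_ready(tick=1))  # B runs, then sleeps until tick 3
s.sleep("B", wake_tick=3)
trace.append(s.next_ready(tick=2))  # only A is ready
s.yield_now("A")
trace.append(s.next_ready(tick=3))  # B wakes behind A, so A runs
trace.append(s.next_ready(tick=3))  # now B runs
```

Because every queue operation is ordered and tick-driven, the same sequence of calls always yields the same trace.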
  4. Verification Model

4.1 Verifier Responsibilities

The verifier statically checks bytecode for structural safety and stack-shape correctness. Representative checks include:

  • Instruction well-formedness
    • Unknown opcode, truncated immediates/opcodes, malformed function boundaries, trailing bytes.
  • Control-flow integrity
    • Jump targets within bounds and to instruction boundaries; functions must have proper terminators; path coverage ensures a valid exit.
  • Stack discipline
    • No underflow/overflow relative to declared max stack; consistent stack height at control-flow joins; RET occurs at the expected height.
  • Call/return shape
    • Direct calls and returns must match the declared argument counts and return slot counts. Mismatches are rejected.
  • Syscalls
    • Syscall IDs must exist per SyscallMeta. Arity and declared return slot counts must match metadata. Capability checks are enforced at runtime (not by the verifier).
  • Closures
    • CALL_CLOSURE is only allowed on closure values; the callee function must be known; argument counts for closure calls must match.
  • Coroutines
    • YIELD context must be valid; SPAWN argument counts are validated.
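
The stack-discipline check can be illustrated on straight-line code. This is a simplified sketch: the opcode names, the `EFFECTS` table, and the `verify` function are hypothetical, and jumps, calls, and control-flow joins are left out.

```python
from typing import Dict, List, Tuple

# Hypothetical stack effects: opcode -> (values popped, values pushed).
EFFECTS: Dict[str, Tuple[int, int]] = {
    "PUSH_CONST": (0, 1),
    "ADD": (2, 1),
    "POP": (1, 0),
    "RET": (0, 0),
}


def verify(code: List[str], max_stack: int, return_slots: int) -> None:
    """Raise ValueError if straight-line code breaks stack discipline."""
    height = 0
    for pc, op in enumerate(code):
        if op not in EFFECTS:
            raise ValueError(f"pc {pc}: unknown opcode {op}")
        pops, pushes = EFFECTS[op]
        if height < pops:
            raise ValueError(f"pc {pc}: stack underflow")
        height += pushes - pops
        if height > max_stack:
            raise ValueError(f"pc {pc}: exceeds declared max stack")
        if op == "RET":
            if height != return_slots:
                raise ValueError(f"pc {pc}: RET at height {height}")
            if pc != len(code) - 1:
                raise ValueError(f"pc {pc}: trailing instructions after RET")
            return
    raise ValueError("function has no terminator")


def rejected(code: List[str], max_stack: int, return_slots: int) -> bool:
    try:
        verify(code, max_stack, return_slots)
    except ValueError:
        return True
    return False
```

For example, `["PUSH_CONST", "PUSH_CONST", "ADD", "RET"]` with a max stack of 2 and one return slot passes, while `["ADD", "RET"]` is rejected for underflow before the code ever runs.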

4.2 Runtime vs Verifier Guarantees

  • The verifier guarantees structural correctness and stack-shape invariants. It does not perform full type checking of value contents; dynamic checks (e.g., numeric domain checks, polymorphic comparisons, concrete syscall argument validation) occur at runtime and may trap.
  • Capability gating for syscalls is enforced at runtime by the VM/native interface.
  5. Closures (Model B): Calling Convention

  • Creation
    • MAKE_CLOSURE captures N values from the operand stack into a heap-allocated environment alongside a function identifier. The opcode pushes a HeapRef to the new closure.
  • Call
    • CALL_CLOSURE invokes a closure. The closure object itself is supplied to the callee as a hidden arg0. User-visible arguments follow the function's declared arity.
  • Access to captures
    • The callee can access captured values via the closure's environment. Captured HeapRefs are traced by the GC.
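
The calling convention can be sketched with a small stack model. Several details here are assumptions made for the example: the closure object sits directly on the stack instead of behind a HeapRef, the closure is assumed to lie below its arguments at call time, and the `FUNCTIONS` table and helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Closure:
    fn_id: int
    env: List[object]  # captured values


def add_captured(closure: Closure, x: int) -> int:
    # arg0 is the closure itself; captures are read from its environment.
    return closure.env[0] + x


# Hypothetical function table keyed by function identifier.
FUNCTIONS: Dict[int, Callable[..., object]] = {7: add_captured}


def make_closure(stack: List[object], fn_id: int, n: int) -> None:
    """Pop n captured values and push a new closure."""
    split = len(stack) - n
    captured = stack[split:]
    del stack[split:]
    stack.append(Closure(fn_id, captured))


def call_closure(stack: List[object], argc: int) -> None:
    """Pop argc user-visible arguments and the closure, then invoke it."""
    split = len(stack) - argc
    args = stack[split:]
    del stack[split:]
    closure = stack.pop()
    if not isinstance(closure, Closure):
        raise TypeError("CALL_CLOSURE on a non-closure value")
    stack.append(FUNCTIONS[closure.fn_id](closure, *args))


stack: List[object] = [10]
make_closure(stack, fn_id=7, n=1)  # captures 10
stack.append(5)                    # user-visible argument
call_closure(stack, argc=1)        # callee receives (closure, 5)
```

Passing the closure as arg0 means the callee needs no separate mechanism to locate its captures; they travel with the call.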
  6. Unified Syscall ABI

  • Identification
    • Syscalls are addressed by a numeric ID. They are not firstclass values.
  • Metadata-driven
    • SyscallMeta defines expected arity and return slot counts. The verifier checks IDs/arity/return-slot counts against this metadata.
  • Arguments and returns
    • Arguments are taken from the operand stack in the order defined by the ABI. Returns use bounded multi-slot results via a host-side return buffer (HostReturn) which the VM copies back onto the stack, or zero slots for “void”. A mismatch in result counts is a fault/panic per current hardening logic.
  • Capabilities
    • Each VM instance has capability flags. Invoking a syscall without the required capability traps.
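
A metadata-driven dispatch along these lines is sketched below. Only the names `SyscallMeta` and `HostReturn` come from this document; the field layout, the syscall ID `0x10`, the capability bits, and the `Trap`/`Panic` exception classes are invented for the example, and the host return buffer is modeled as a plain list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

CAP_LOG = 1 << 0  # hypothetical capability bits
CAP_GFX = 1 << 1


@dataclass(frozen=True)
class SyscallMeta:
    arity: int
    ret_slots: int
    required_cap: int
    handler: Callable[[List[int]], List[int]]  # returns the host-side results


class Trap(Exception):
    """Well-defined fault visible to the program."""


class Panic(Exception):
    """Internal consistency failure."""


SYSCALLS: Dict[int, SyscallMeta] = {
    0x10: SyscallMeta(arity=2, ret_slots=1, required_cap=CAP_GFX,
                      handler=lambda args: [args[0] + args[1]]),
}


def syscall(stack: List[int], sys_id: int, caps: int) -> None:
    meta = SYSCALLS[sys_id]  # the verifier is assumed to have checked the ID
    if (caps & meta.required_cap) != meta.required_cap:
        raise Trap("missing capability")
    split = len(stack) - meta.arity
    args = stack[split:]
    del stack[split:]
    host_return = meta.handler(args)
    if len(host_return) != meta.ret_slots:
        raise Panic("result count does not match SyscallMeta")
    stack.extend(host_return)  # copy results back onto the operand stack


stack = [1, 20, 22]
syscall(stack, 0x10, caps=CAP_GFX | CAP_LOG)
```

Keeping arity and return-slot counts in one table lets the verifier and the runtime check against the same source of truth.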
  7. Garbage Collection

  • Collector
    • Non-moving mark-sweep.
  • Triggers
    • GC runs only at FRAME_SYNC safepoints.
  • Liveness
    • Roots comprise: the live VM stack/frames and all suspended coroutines. The collector traverses object-specific children (array elements, closure environments, coroutine stacks).
  • Determinism
    • GC opportunities and scheduling order are tied to FRAME_SYNC, ensuring repeatable execution traces across runs with the same inputs.
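
The mark and sweep phases can be sketched over a heap modeled as a list of slots. This is an illustrative simplification: each object is reduced to the list of handle indices it references, and the `collect` function is hypothetical.

```python
from typing import List, Optional, Set


def collect(heap: List[Optional[List[int]]], roots: List[int]) -> int:
    """Mark from roots, then free unreachable slots in place.

    heap[i] lists the child indices of object i, or is None if the slot is
    free. Returns the number of objects reclaimed. Survivors keep their index.
    """
    marked: Set[int] = set()
    worklist = list(roots)
    while worklist:  # mark phase
        index = worklist.pop()
        if index in marked:
            continue
        marked.add(index)
        worklist.extend(heap[index])
    freed = 0
    for index, obj in enumerate(heap):  # sweep phase
        if obj is not None and index not in marked:
            heap[index] = None  # invalidate; nothing is moved
            freed += 1
    return freed


# Object 0 references 1; objects 2 and 3 form an unreachable cycle; 4 is garbage.
heap: List[Optional[List[int]]] = [[1], [], [3], [2], []]
freed = collect(heap, roots=[0])
```

The unreachable cycle between objects 2 and 3 is reclaimed, which a reference-counting scheme alone would not manage, and the survivors stay at indices 0 and 1.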
  8. Non-Goals

  • No RC
  • No HIP
  • No preemption
  • No mailbox
  9. Notes for Contributors

  • Keep the public surface minimal and metadata-driven (e.g., syscalls via SyscallMeta).
  • Do not assume implicit safepoints; schedule and GC only at FRAME_SYNC.
  • When adding new opcodes or object kinds, extend the verifier and GC traversal accordingly (children enumeration, environment scanning, root sets).
  • This document is the canonical reference; update it alongside any architectural change.