2026-03-24 13:40:42 +00:00


Prometeu Bytecode — Core ISA (Minimal, Bytecode-Only)

This document defines the minimal, stable Core ISA surface for the Prometeu Virtual Machine at the bytecode level. It specifies instruction encoding, the stack evaluation model, and the instruction set currently available. Higher-level constructs (closures, coroutines) are intentionally out of scope at this stage.

Encoding Rules

  • Endianness: Little-endian.
  • Instruction layout: [opcode: u16][immediate: spec.imm_bytes].
  • Opcodes are defined in prometeu_bytecode::isa::core::CoreOpCode.
  • Immediate sizes and stack effects are defined by CoreOpCode::spec() returning CoreOpcodeSpec.
  • All jump immediates are absolute u32 byte offsets from the start of the current function.
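
The layout rules above can be sketched as a tiny encoder. This is only an illustration: the opcode numbers and the helper names (`emit`, `encode_example`) are hypothetical and are not taken from `CoreOpCode`; only the little-endian `[opcode: u16][immediate]` shape and the absolute u32 jump offset come from the rules above.

```rust
// Hypothetical opcode numbers, for illustration only. The real values are
// defined by prometeu_bytecode::isa::core::CoreOpCode.
const OP_NOP: u16 = 0x0000;
const OP_HALT: u16 = 0x0001;
const OP_JMP: u16 = 0x0002;

/// Appends one instruction: a little-endian u16 opcode followed by the
/// raw immediate bytes (which may be empty).
fn emit(code: &mut Vec<u8>, opcode: u16, imm: &[u8]) {
    code.extend_from_slice(&opcode.to_le_bytes());
    code.extend_from_slice(imm);
}

/// Encodes `NOP; JMP 8; HALT` for a single function.
/// NOP occupies bytes 0..2, JMP occupies 2..8, so HALT starts at offset 8.
fn encode_example() -> Vec<u8> {
    let mut code = Vec::new();
    emit(&mut code, OP_NOP, &[]);
    // The jump target is an absolute byte offset from the function start.
    emit(&mut code, OP_JMP, &8u32.to_le_bytes());
    emit(&mut code, OP_HALT, &[]);
    code
}
```

Under these assumed opcode numbers, `encode_example()` yields ten bytes, and the `JMP` immediate appears as `08 00 00 00`, least significant byte first.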

Stack Machine Model

  • The VM is stack-based. Unless noted, operands are taken from the top of the operand stack and results are pushed back.
  • Types at the bytecode level are represented by the Value enum; the VM may perform numeric promotion where appropriate (e.g., Int32 + Float -> Float).
  • Stack underflow is a trap (TRAP_STACK_UNDERFLOW).
  • Some operations may trap for other reasons (e.g., division by zero, invalid indices, type mismatches).
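
A minimal sketch of how a binary operation might behave under this model, assuming a two-variant `SketchValue` enum and a `SketchTrap` type. Both are hypothetical stand-ins: the real `Value` enum has more variants, and the wrapping behaviour on `Int32` overflow is an assumption, not something this document specifies.

```rust
/// Hypothetical subset of the VM's Value enum.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SketchValue {
    Int32(i32),
    Float(f64),
}

/// Hypothetical trap type; only the underflow case is modelled here.
#[derive(Debug, PartialEq)]
enum SketchTrap {
    StackUnderflow,
}

/// Pops one operand, reporting underflow as a trap instead of panicking.
fn pop_operand(stack: &mut Vec<SketchValue>) -> Result<SketchValue, SketchTrap> {
    stack.pop().ok_or(SketchTrap::StackUnderflow)
}

/// ADD: pops [lhs, rhs] (rhs on top) and pushes the sum, promoting to
/// Float when either operand is a Float.
fn exec_add(stack: &mut Vec<SketchValue>) -> Result<(), SketchTrap> {
    let rhs = pop_operand(stack)?;
    let lhs = pop_operand(stack)?;
    let sum = match (lhs, rhs) {
        // Assumption: integer overflow wraps. The real VM may trap instead.
        (SketchValue::Int32(a), SketchValue::Int32(b)) => SketchValue::Int32(a.wrapping_add(b)),
        (SketchValue::Int32(a), SketchValue::Float(b)) => SketchValue::Float(f64::from(a) + b),
        (SketchValue::Float(a), SketchValue::Int32(b)) => SketchValue::Float(a + f64::from(b)),
        (SketchValue::Float(a), SketchValue::Float(b)) => SketchValue::Float(a + b),
    };
    stack.push(sum);
    Ok(())
}
```

With a stack of `[Int32(2), Float(0.5)]`, `exec_add` leaves `[Float(2.5)]`; with a single value on the stack it returns the underflow trap. Note that this sketch does not restore the stack after a trap.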

Instruction Set (Core)

  • Execution control:

    • NOP — no effect.
    • HALT — terminates execution (block terminator).
    • JMP u32 — unconditional absolute jump (block terminator).
    • JMP_IF_FALSE u32 — pops [bool], jumps if false.
    • JMP_IF_TRUE u32 — pops [bool], jumps if true.
    • TRAP — software trap/breakpoint (block terminator).
  • Stack manipulation:

    • PUSH_CONST u32 — load constant by index → pushes [value].
    • PUSH_I64 i64, PUSH_F64 f64, PUSH_BOOL u8, PUSH_I32 i32, PUSH_BOUNDED u32 (<= 0xFFFF) — push literals.
    • POP — pops 1.
    • POP_N u32 — pops N.
    • DUP [x] -> [x, x].
    • SWAP [a, b] -> [b, a].
  • Arithmetic:

    • ADD, SUB, MUL, DIV, MOD — binary numeric ops.
    • NEG — unary numeric negation.
    • BOUND_TO_INT [bounded] -> [int64].
    • INT_TO_BOUND_CHECKED [int] -> [bounded] (traps if the value is outside 0..65535).
  • Comparison and logic:

    • EQ, NEQ, LT, LTE, GT, GTE — comparisons → [bool].
    • AND, OR, NOT — boolean logic.
    • BIT_AND, BIT_OR, BIT_XOR, SHL, SHR — integer bit operations.
  • Variables:

    • GET_GLOBAL u32, SET_GLOBAL u32 — access global slots.
    • GET_LOCAL u32, SET_LOCAL u32 — access local slots (current frame).
  • Functions and scopes:

    • CALL u32 — call by function index; argument/result arity per function metadata.
    • RET — return from current function (block terminator).
    • PUSH_SCOPE, POP_SCOPE — begin/end lexical scope.
  • System/Timing:

    • SYSCALL u32 — platform call; arity/types are verified by the VM/firmware layer.
    • FRAME_SYNC — yield until the next frame boundary (e.g., vblank); explicit safepoint.

For exact immediates and stack effects, see CoreOpCode::spec() which is the single source of truth used by the decoder, disassembler, and (later) verifier.
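
As an illustration of the bounded conversions listed under Arithmetic, the range check can be sketched as follows. The representation of a bounded value as `u16`, the `BoundedRangeTrap` type, and both function names are assumptions made for this example; only the 0..65535 range and the int64 result type come from the list above.

```rust
/// Hypothetical trap payload carrying the rejected value.
#[derive(Debug, PartialEq)]
struct BoundedRangeTrap(i64);

/// INT_TO_BOUND_CHECKED: accepts only values in 0..=65535, otherwise traps.
fn int_to_bound_checked(value: i64) -> Result<u16, BoundedRangeTrap> {
    if (0..=0xFFFF).contains(&value) {
        // The cast is lossless because the range was just checked.
        Ok(value as u16)
    } else {
        Err(BoundedRangeTrap(value))
    }
}

/// BOUND_TO_INT: widening conversion to int64, which cannot fail.
fn bound_to_int(bounded: u16) -> i64 {
    i64::from(bounded)
}
```

In this sketch, 65535 converts successfully, while 65536 and -1 both produce the trap value.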

Canonical Decoder Contract

  • The canonical decoder is prometeu_bytecode::decode_next(pc, bytes).
  • It uses the Core ISA spec to determine immediate size and the canonical next_pc.
  • Unknown or legacy opcodes must produce a deterministic UnknownOpcode error.
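
The contract above can be sketched as a standalone function. This is not the real `prometeu_bytecode::decode_next`: the return type, the `Truncated` error, the opcode numbers, and the `sketch_imm_bytes` table (a stand-in for `CoreOpCode::spec()`) are all assumptions for illustration.

```rust
#[derive(Debug, PartialEq)]
enum SketchDecodeError {
    /// The opcode is not part of the (hypothetical) core table.
    UnknownOpcode { pc: usize, opcode: u16 },
    /// Assumed error for bytecode that ends mid-instruction.
    Truncated { pc: usize },
}

#[derive(Debug, PartialEq)]
struct SketchDecoded<'a> {
    opcode: u16,
    imm: &'a [u8],
    next_pc: usize,
}

/// Hypothetical stand-in for the spec lookup: immediate size per opcode.
fn sketch_imm_bytes(opcode: u16) -> Option<usize> {
    match opcode {
        0x0000 | 0x0001 => Some(0), // NOP, HALT (assumed numbers)
        0x0002 => Some(4),          // JMP u32 (assumed number)
        _ => None,
    }
}

/// Decodes the instruction at `pc` and reports where the next one starts.
fn decode_next_sketch(pc: usize, bytes: &[u8]) -> Result<SketchDecoded<'_>, SketchDecodeError> {
    let op_end = pc
        .checked_add(2)
        .filter(|&end| end <= bytes.len())
        .ok_or(SketchDecodeError::Truncated { pc })?;
    let opcode = u16::from_le_bytes([bytes[pc], bytes[pc + 1]]);
    let imm_len = sketch_imm_bytes(opcode)
        .ok_or(SketchDecodeError::UnknownOpcode { pc, opcode })?;
    let next_pc = op_end + imm_len;
    if next_pc > bytes.len() {
        return Err(SketchDecodeError::Truncated { pc });
    }
    Ok(SketchDecoded { opcode, imm: &bytes[op_end..next_pc], next_pc })
}
```

Because the error depends only on the opcode and `pc`, the same input always yields the same `UnknownOpcode` value, which is one way to read the "deterministic" requirement.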

Module Boundary

  • Core ISA lives under prometeu_bytecode::isa::core and re-exports:
    • CoreOpCode — the opcode enum of the core profile.
    • CoreOpcodeSpec and CoreOpCodeSpecExt — spec with imm_bytes, stack effects, and flags.
  • Consumers (encoder/decoder/disasm/verifier) should import from this module to avoid depending on internal layout.

FRAME_SYNC — Semantics and Placement (Bytecode Level)

  • Semantics:

    • FRAME_SYNC is a zero-operand instruction and does not modify the operand stack.
    • It marks a VM safepoint for GC and the cooperative scheduler. In CoreOpcodeSpec this is exposed as spec.is_safepoint == true.
    • On execution, the VM may suspend the current fiber/coroutine until the next frame boundary (e.g., vsync) and/or perform GC. After resuming, execution continues at the next instruction.
  • Placement rules (representable and checkable):

    • FRAME_SYNC may appear anywhere inside a function body where normal instructions can appear. It is NOT a block terminator (spec.is_terminator == false).
    • Instruction boundaries are canonical: encoders/emitters must only place FRAME_SYNC at valid instruction PCs. The verifier already enforces “jump-to-boundary” and end-exclusive [start, end) function ranges using the canonical layout routine.
    • Entrypoints that represent a render/update loop SHOULD ensure at least one reachable FRAME_SYNC along every long-running path to provide deterministic safepoints for GC/scheduling. This policy is semantic and may be enforced by higher-level tooling; at the bytecode level it is representable via spec.is_safepoint and can be counted by static analyzers.
  • Disassembly:

    • Disassemblers must print the mnemonic FRAME_SYNC verbatim for this opcode.
    • Tools MAY optionally annotate it as a safepoint in comments, e.g., FRAME_SYNC ; safepoint.
  • Verification notes:

    • The bytecode verifier treats FRAME_SYNC as a normal instruction with no stack effect and no control-flow targets. It is permitted before RET, between basic blocks, and as the last instruction of a function. Jumps targeting the function end (pc == end) remain valid under the end-exclusive rule.
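
To make the placement and verification notes concrete, here is a hedged sketch of a static pass that counts FRAME_SYNC safepoints in one function body and checks that every JMP lands on an instruction boundary or on the function end. The opcode numbers, `imm_len`, `SketchVerifyError`, and `count_safepoints` are hypothetical; the real verifier relies on `CoreOpCode::spec()` and its own layout routine.

```rust
use std::collections::BTreeSet;

// Hypothetical opcode numbers, for illustration only.
const JMP_OPCODE: u16 = 0x0002;
const FRAME_SYNC_OPCODE: u16 = 0x0003;

#[derive(Debug, PartialEq)]
enum SketchVerifyError {
    UnknownOpcode(usize),
    Truncated(usize),
    BadJumpTarget { at: usize, target: usize },
}

/// Hypothetical immediate-size table standing in for the spec lookup.
fn imm_len(opcode: u16) -> Option<usize> {
    match opcode {
        0x0000 | 0x0001 | FRAME_SYNC_OPCODE => Some(0), // NOP, HALT, FRAME_SYNC
        JMP_OPCODE => Some(4),                          // JMP u32
        _ => None,
    }
}

/// Treats `code` as one function occupying [0, code.len()) and returns the
/// number of FRAME_SYNC instructions, or the first verification error.
fn count_safepoints(code: &[u8]) -> Result<usize, SketchVerifyError> {
    let end = code.len();
    let mut boundaries = BTreeSet::new();
    let mut jumps = Vec::new(); // (pc of the jump, absolute target)
    let mut safepoints = 0;
    let mut pc = 0;
    while pc < end {
        if end - pc < 2 {
            return Err(SketchVerifyError::Truncated(pc));
        }
        let opcode = u16::from_le_bytes([code[pc], code[pc + 1]]);
        let n = imm_len(opcode).ok_or(SketchVerifyError::UnknownOpcode(pc))?;
        let next_pc = pc + 2 + n;
        if next_pc > end {
            return Err(SketchVerifyError::Truncated(pc));
        }
        boundaries.insert(pc);
        if opcode == FRAME_SYNC_OPCODE {
            safepoints += 1;
        }
        if opcode == JMP_OPCODE {
            let imm = [code[pc + 2], code[pc + 3], code[pc + 4], code[pc + 5]];
            jumps.push((pc, u32::from_le_bytes(imm) as usize));
        }
        pc = next_pc;
    }
    for (at, target) in jumps {
        // End-exclusive rule: a target equal to `end` is still valid.
        if target != end && !boundaries.contains(&target) {
            return Err(SketchVerifyError::BadJumpTarget { at, target });
        }
    }
    Ok(safepoints)
}
```

Under these assumptions, the body `FRAME_SYNC; JMP 0` reports one safepoint, `JMP 8` in an 8-byte function is accepted because the target equals the function end, and `JMP 1` is rejected because offset 1 falls inside the FRAME_SYNC instruction. The pass only counts safepoints; it does not prove that one is reachable on every long-running path.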