prometeu-runtime/docs/pull-requests/PR-2 - PBX SYSC and HOSTCALL Loader Patching.md
2026-03-24 13:40:44 +00:00

12 KiB

PR-2 - PBX SYSC and HOSTCALL Loader Patching

Goal

Teach the runtime to load canonical host bindings from PBX, resolve them at load time, validate ABI and capabilities, and rewrite pre-load host calls into final numeric syscalls.

This PR assumes PR-1 is already available, so cartridge capabilities are exposed to the loader as internal CapFlags.

Why

Prometeu's hardware contract is:

  • source- and SDK-level host APIs map to canonical identities (module, name, version)
  • load-time resolution maps those identities to numeric syscall ids
  • runtime execution is numeric-only via SYSCALL <id>

The runtime already has the important building blocks:

  • canonical syscall registry
  • load-time resolve_program_syscalls
  • verifier support for numeric SYSCALL <id>
  • VM dispatch by numeric id

What is missing is the PBX and loader wiring.

Scope

In scope:

  • add a mandatory PBX SYSC section
  • extend each SYSC entry with ABI shape
  • add pre-load opcode HOSTCALL <sysc_index>
  • parse SYSC at load time
  • resolve SYSC against the host syscall registry
  • validate declared ABI against authoritative host metadata
  • validate capabilities using cartridge-granted CapFlags
  • reject unused SYSC entries
  • reject out-of-bounds HOSTCALL indices
  • patch all HOSTCALL <sysc_index> to SYSCALL <id>
  • ensure verifier runs only on the patched image

Out of scope:

  • compiler emission
  • stdlib SDK pack format
  • final external tooling for PBX generation
  • platform policy beyond current manifest-derived granted capabilities

Architectural Contract

PBX SYSC

SYSC is a unique, deduplicated, program-wide table of declared host bindings.

Each entry carries:

  • module
  • name
  • version
  • arg_slots
  • ret_slots

SYSC is mandatory for every valid PBX. If the program requires no host bindings, SYSC is present with count = 0.

Pre-load callsites

The compiler-side artifact form is:

HOSTCALL <u32 sysc_index>

Rules:

  • sysc_index is zero-based into the SYSC table
  • code must not contain final SYSCALL <id> for unresolved host-backed SDK calls
  • the final executable image given to the VM must contain no HOSTCALL

Load-time responsibilities

The loader must:

  1. parse SYSC
  2. resolve each entry to syscall_id
  3. validate arg_slots and ret_slots
  4. validate capabilities
  5. scan code for HOSTCALL
  6. mark used SYSC indices
  7. reject out-of-bounds indices
  8. reject unused SYSC entries
  9. rewrite HOSTCALL <index> into SYSCALL <id>
  10. ensure no HOSTCALL remains
  11. only then hand the image to the verifier

PBX Format Changes

prometeu-bytecode

Add a new section kind:

  • SYSC

Recommended temporary binary layout:

u32 count
repeat count times:
    u16 module_len
    [module_len bytes UTF-8]
    u16 name_len
    [name_len bytes UTF-8]
    u16 version
    u16 arg_slots
    u16 ret_slots

Rules:

  • strings are UTF-8
  • duplicate canonical identities are invalid
  • malformed lengths are invalid
  • missing SYSC is invalid

Bytecode model changes

Extend BytecodeModule with a syscall declaration vector.

Suggested shape:

pub struct SyscallDecl {
    pub module: String,
    pub name: String,
    pub version: u16,
    pub arg_slots: u16,
    pub ret_slots: u16,
}

Then:

pub syscalls: Vec<SyscallDecl>

Serialization/deserialization must include section kind for SYSC.

Opcode Changes

Add HOSTCALL

Add a new opcode:

  • HOSTCALL

Immediate:

  • u32 sysc_index

Recommended behavior:

  • valid only in pre-load artifact form
  • not allowed to reach final verifier/VM execution path

Required code updates:

  • opcode enum
  • decoder
  • opcode spec
  • assembler
  • disassembler

Verifier recommendation:

  • the verifier does not need to support HOSTCALL
  • it should continue to run after loader patching only

Loader Integration

Resolution

Use the existing canonical resolver in prometeu-hal::syscalls.

For each SYSC entry:

  1. resolve (module, name, version)
  2. obtain SyscallResolved
  3. compare arg_slots
  4. compare ret_slots
  5. compare required capability against granted cartridge flags

If any check fails, load fails deterministically.

Capability source

Capabilities come from the already-loaded cartridge manifest via PR-1.

This PR must not invent authority from PBX contents.

Code patching

Recommended algorithm:

  1. parse code buffer
  2. whenever HOSTCALL <index> is found:
  3. validate index < syscalls.len()
  4. mark that SYSC[index] is used
  5. rewrite opcode/immediate in place or rebuild the code buffer
  6. emit final SYSCALL <resolved_id>

After scan:

  1. every SYSC entry must be marked used
  2. no HOSTCALL may remain

Either patching strategy is acceptable:

  • mutate byte buffer in place
  • rebuild a fresh patched buffer

The final ProgramImage must contain only numeric SYSCALL <id>.

Verifier Contract

No verifier redesign is required.

The intended contract is:

  • loader validates interface compatibility
  • verifier validates final numeric program structure

This means:

  • loader checks SYSC-declared ABI against host metadata
  • verifier checks stack effects of final SYSCALL <id> using existing runtime metadata

This is intentional and not considered harmful duplication.

Proposed Code Areas

prometeu-bytecode

Likely files:

  • src/model.rs
  • src/opcode.rs
  • src/opcode_spec.rs
  • src/decoder.rs
  • src/assembler.rs
  • src/disassembler.rs
  • src/lib.rs

prometeu-hal

Likely files:

  • src/cartridge_loader.rs
  • possibly helper types or resolver glue near syscall loading logic

prometeu-vm

Likely files:

  • load path in src/virtual_machine.rs

Required behavior:

  • patch before Verifier::verify(...)

Implementation Plan

Implementation should be staged so each phase leaves the workspace in a coherent state and keeps the verifier/VM contract intact.

Commit Checklist

  1. prometeu-bytecode: add SyscallDecl model and mandatory SYSC section
  2. prometeu-bytecode: add SYSC parser validation and load errors
  3. prometeu-bytecode: add phase-1 coverage for empty/valid/invalid SYSC
  4. prometeu-bytecode: wire downstream constructors to new BytecodeModule.syscalls field
  5. prometeu-bytecode: clean up naming/docs after phase-1 passes

Phase 1 - Extend PBX module format

Target crates:

  • crates/console/prometeu-bytecode

Steps:

  1. add SyscallDecl to src/model.rs and extend BytecodeModule with syscalls: Vec<SyscallDecl>
  2. reserve a new section kind for SYSC in module serialization/deserialization
  3. enforce the mandatory-section rule: valid PBS images always carry SYSC, including count = 0
  4. reject malformed payloads, invalid UTF-8, and duplicate canonical identities during load
  5. update src/lib.rs exports if needed so the VM and HAL can consume the new declarations

Checkpoint:

  • BytecodeLoader::load(...) returns BytecodeModule with canonical syscall declarations preserved from PBX

Phase 2 - Add pre-load opcode support

Target crates:

  • crates/console/prometeu-bytecode

Steps:

  1. add HOSTCALL to src/opcode.rs with a u32 immediate
  2. extend src/opcode_spec.rs and src/decoder.rs so the loader can scan and decode HOSTCALL
  3. update src/assembler.rs and src/disassembler.rs so tests and fixtures can produce/read pre-load artifacts
  4. keep the runtime contract explicit: HOSTCALL is representable in bytecode artifacts but must not survive loader patching

Checkpoint:

  • bytecode tooling round-trips HOSTCALL <index> correctly, while runtime execution still depends on patched SYSCALL <id>

Phase 3 - Bridge PBX declarations to host metadata

Target crates:

  • crates/console/prometeu-hal

Steps:

  1. add a resolver path that accepts program-owned syscall declarations instead of only &'static str identities
  2. resolve each (module, name, version) against src/syscalls.rs
  3. validate arg_slots and ret_slots against authoritative SyscallMeta
  4. validate required capabilities against cartridge-derived CapFlags
  5. return deterministic, load-facing errors for unknown syscalls, ABI mismatches, and missing capabilities

Design note:

  • this phase likely needs a small owned-string adapter or a new helper alongside resolve_program_syscalls(...), because PBX strings are runtime data, not &'static str

Checkpoint:

  • given only BytecodeModule.syscalls plus granted CapFlags, the loader can produce a resolved table sysc_index -> syscall_id

Phase 4 - Patch before verification

Target crates:

  • crates/console/prometeu-vm

Steps:

  1. add a load-time patching helper near src/virtual_machine.rs or a small dedicated module
  2. run that helper immediately after BytecodeLoader::load(...) and before Verifier::verify(...)
  3. scan module.code, decode every instruction, and rewrite HOSTCALL <index> into SYSCALL <resolved_id>
  4. reject out-of-bounds HOSTCALL indices during the scan
  5. track SYSC usage and reject declarations that are never referenced
  6. assert that no HOSTCALL remains before handing code to the verifier
  7. only then call Verifier::verify(...), compute max_stack_slots, and build ProgramImage::from(module)

Why the sequencing matters:

  • the current load path in crates/console/prometeu-vm/src/virtual_machine.rs verifies the raw module immediately after BytecodeLoader::load(...)
  • ProgramImage currently stores only the final ROM/functions/constants and does not preserve a syscall declaration table, so patching must happen while the code is still a BytecodeModule

Checkpoint:

  • the verifier sees only numeric SYSCALL <id> instructions, preserving the existing verifier and VM execution model

Phase 5 - Tests and failure matrix

Target crates:

  • crates/console/prometeu-bytecode
  • crates/console/prometeu-hal
  • crates/console/prometeu-vm

Steps:

  1. add serialization/deserialization tests for missing, empty, valid, malformed, and duplicate SYSC
  2. add opcode tests for HOSTCALL decoding and assembler/disassembler coverage
  3. add resolver tests for unknown identity, ABI mismatch, and capability mismatch
  4. add VM load-path tests proving patch-before-verify behavior
  5. add an assertion that final executable images contain only numeric SYSCALL

Suggested PR slicing:

  1. bytecode format + tests
  2. HOSTCALL opcode plumbing
  3. HAL resolution/ABI validation
  4. VM loader patching
  5. integration and regression tests

Deterministic Load Errors

Load must fail for at least:

  1. missing SYSC
  2. malformed SYSC
  3. invalid UTF-8
  4. duplicate syscall identities
  5. unknown syscall identity
  6. ABI mismatch between SYSC and host metadata
  7. missing capability
  8. HOSTCALL with out-of-bounds sysc_index
  9. declared SYSC entry unused by all HOSTCALLs
  10. HOSTCALL still present after patch

Acceptance Criteria

  • PBX parser supports mandatory SYSC
  • BytecodeModule carries syscall declarations
  • runtime understands HOSTCALL
  • loader resolves SYSC entries before verification
  • loader validates arg_slots and ret_slots
  • loader validates capabilities against cartridge flags
  • loader rewrites HOSTCALL to SYSCALL
  • verifier runs only on patched code
  • final VM execution path remains numeric-only

Tests

Add tests covering:

  1. valid PBX with empty SYSC and no HOSTCALL
  2. valid PBX with one syscall and one HOSTCALL
  3. unknown syscall identity
  4. capability mismatch
  5. ABI mismatch
  6. missing SYSC
  7. duplicate SYSC entries
  8. malformed SYSC payload
  9. HOSTCALL index out of bounds
  10. unused SYSC entry
  11. patched image contains only SYSCALL

Prefer synthetic in-memory PBX images in tests.

Definition of Done

After this PR:

  • PBX declares canonical host bindings in SYSC
  • pre-load code references those bindings with HOSTCALL
  • the loader resolves and validates them during load
  • the loader patches executable code to SYSCALL <id>
  • the verifier and VM continue to operate on numeric syscalls only