# PR-2 - PBX SYSC and HOSTCALL Loader Patching

## Goal

Teach the runtime to load canonical host bindings from PBX, resolve them at load time, validate ABI and capabilities, and rewrite pre-load host calls into final numeric syscalls.

This PR assumes PR-1 is already available, so cartridge capabilities are exposed to the loader as internal `CapFlags`.

## Why

Prometeu's hardware contract is:

- source- and SDK-level host APIs map to canonical identities `(module, name, version)`
- load-time resolution maps those identities to numeric syscall ids
- runtime execution is numeric-only via `SYSCALL <id>`

The runtime already has the important building blocks:

- canonical syscall registry
- load-time `resolve_program_syscalls`
- verifier support for numeric `SYSCALL <id>`
- VM dispatch by numeric id

What is missing is the PBX and loader wiring.

## Scope

In scope:

- add a mandatory PBX `SYSC` section
- extend each `SYSC` entry with ABI shape
- add pre-load opcode `HOSTCALL <sysc_index>`
- parse `SYSC` at load time
- resolve `SYSC` against the host syscall registry
- validate declared ABI against authoritative host metadata
- validate capabilities using cartridge-granted `CapFlags`
- reject unused `SYSC` entries
- reject out-of-bounds `HOSTCALL` indices
- patch all `HOSTCALL <sysc_index>` to `SYSCALL <id>`
- ensure verifier runs only on the patched image

Out of scope:

- compiler emission
- stdlib SDK pack format
- final external tooling for PBX generation
- platform policy beyond current manifest-derived granted capabilities

## Architectural Contract

### PBX `SYSC`

`SYSC` is a unique, deduplicated, program-wide table of declared host bindings.

Each entry carries:

- `module`
- `name`
- `version`
- `arg_slots`
- `ret_slots`

`SYSC` is mandatory for every valid PBX. If the program requires no host bindings, `SYSC` is present with `count = 0`.

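The uniqueness rule for `SYSC` can be illustrated with a minimal Rust sketch. It is not the loader's actual code: `first_duplicate` is a hypothetical helper, and the struct simply mirrors the entry fields listed above, with integer widths taken from the suggested `SyscallDecl` shape elsewhere in this document. Two entries count as duplicates when their `(module, name, version)` identity matches, regardless of the declared slot counts.

```rust
use std::collections::HashSet;

/// Illustrative mirror of one `SYSC` entry (assumed shape, not the real model type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SyscallDecl {
    pub module: String,
    pub name: String,
    pub version: u16,
    pub arg_slots: u16,
    pub ret_slots: u16,
}

/// Hypothetical helper: returns the index of the first entry whose canonical
/// identity `(module, name, version)` repeats an earlier entry, or `None`
/// when the table is deduplicated. An empty table (`count = 0`) is valid.
pub fn first_duplicate(table: &[SyscallDecl]) -> Option<usize> {
    let mut seen: HashSet<(&str, &str, u16)> = HashSet::new();
    for (index, decl) in table.iter().enumerate() {
        let identity = (decl.module.as_str(), decl.name.as_str(), decl.version);
        // `insert` returns false when the identity was already present.
        if !seen.insert(identity) {
            return Some(index);
        }
    }
    None
}
```

A loader built along these lines would turn a `Some(index)` result into a deterministic load error rather than silently merging the entries.
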
### Pre-load callsites

The compiler-side artifact form is:

```text
HOSTCALL <sysc_index>
```

Rules:

- `sysc_index` is zero-based into the `SYSC` table
- code must not contain final `SYSCALL <id>` for unresolved host-backed SDK calls
- the final executable image given to the VM must contain no `HOSTCALL`

### Load-time responsibilities

The loader must:

1. parse `SYSC`
2. resolve each entry to `syscall_id`
3. validate `arg_slots` and `ret_slots`
4. validate capabilities
5. scan code for `HOSTCALL`
6. mark used `SYSC` indices
7. reject out-of-bounds indices
8. reject unused `SYSC` entries
9. rewrite `HOSTCALL <sysc_index>` into `SYSCALL <id>`
10. ensure no `HOSTCALL` remains
11. only then hand the image to the verifier

## PBX Format Changes

### `prometeu-bytecode`

Add a new section kind:

- `SYSC`

Recommended temporary binary layout:

```text
u32 count
repeat count times:
  u16 module_len
  [module_len bytes UTF-8]
  u16 name_len
  [name_len bytes UTF-8]
  u16 version
  u16 arg_slots
  u16 ret_slots
```

Rules:

- strings are UTF-8
- duplicate canonical identities are invalid
- malformed lengths are invalid
- missing `SYSC` is invalid

### Bytecode model changes

Extend `BytecodeModule` with a syscall declaration vector.

Suggested shape:

```rust
pub struct SyscallDecl {
    pub module: String,
    pub name: String,
    pub version: u16,
    pub arg_slots: u16,
    pub ret_slots: u16,
}
```

Then:

```rust
pub syscalls: Vec<SyscallDecl>
```

Serialization/deserialization must include section kind for `SYSC`.

## Opcode Changes

### Add `HOSTCALL`

Add a new opcode:

- `HOSTCALL`

Immediate:

- `u32 sysc_index`

Recommended behavior:

- valid only in pre-load artifact form
- not allowed to reach final verifier/VM execution path

Required code updates:

- opcode enum
- decoder
- opcode spec
- assembler
- disassembler

Verifier recommendation:

- the verifier does not need to support `HOSTCALL`
- it should continue to run after loader patching only

## Loader Integration

### Resolution

Use the existing canonical resolver in `prometeu-hal::syscalls`.

For each `SYSC` entry:

1. resolve `(module, name, version)`
2. obtain `SyscallResolved`
3. compare `arg_slots`
4. compare `ret_slots`
5. compare required capability against granted cartridge flags

If any check fails, load fails deterministically.

### Capability source

Capabilities come from the already-loaded cartridge manifest via PR-1.

This PR must not invent authority from PBX contents.

### Code patching

Recommended algorithm:

1. parse code buffer
2. whenever `HOSTCALL <sysc_index>` is found:
3. validate `index < syscalls.len()`
4. mark that `SYSC[index]` is used
5. rewrite opcode/immediate in place or rebuild the code buffer
6. emit final `SYSCALL <id>`

After scan:

1. every `SYSC` entry must be marked used
2. no `HOSTCALL` may remain

Either patching strategy is acceptable:

- mutate byte buffer in place
- rebuild a fresh patched buffer

The final `ProgramImage` must contain only numeric `SYSCALL <id>`.

## Verifier Contract

No verifier redesign is required.

The intended contract is:

- loader validates interface compatibility
- verifier validates final numeric program structure

This means:

- loader checks `SYSC`-declared ABI against host metadata
- verifier checks stack effects of final `SYSCALL <id>` using existing runtime metadata

This is intentional and not considered harmful duplication.

## Proposed Code Areas

### `prometeu-bytecode`

Likely files:

- `src/model.rs`
- `src/opcode.rs`
- `src/opcode_spec.rs`
- `src/decoder.rs`
- `src/assembler.rs`
- `src/disassembler.rs`
- `src/lib.rs`

### `prometeu-hal`

Likely files:

- `src/cartridge_loader.rs`
- possibly helper types or resolver glue near syscall loading logic

### `prometeu-vm`

Likely files:

- load path in `src/virtual_machine.rs`

Required behavior:

- patch before `Verifier::verify(...)`

## Deterministic Load Errors

Load must fail for at least:

1. missing `SYSC`
2. malformed `SYSC`
3. invalid UTF-8
4. duplicate syscall identities
5. unknown syscall identity
6. ABI mismatch between `SYSC` and host metadata
7. missing capability
8. `HOSTCALL` with out-of-bounds `sysc_index`
9. declared `SYSC` entry unused by all `HOSTCALL`s
10. `HOSTCALL` still present after patch

## Acceptance Criteria

- PBX parser supports mandatory `SYSC`
- `BytecodeModule` carries syscall declarations
- runtime understands `HOSTCALL`
- loader resolves `SYSC` entries before verification
- loader validates `arg_slots` and `ret_slots`
- loader validates capabilities against cartridge flags
- loader rewrites `HOSTCALL` to `SYSCALL`
- verifier runs only on patched code
- final VM execution path remains numeric-only

## Tests

Add tests covering:

1. valid PBX with empty `SYSC` and no `HOSTCALL`
2. valid PBX with one syscall and one `HOSTCALL`
3. unknown syscall identity
4. capability mismatch
5. ABI mismatch
6. missing `SYSC`
7. duplicate `SYSC` entries
8. malformed `SYSC` payload
9. `HOSTCALL` index out of bounds
10. unused `SYSC` entry
11. patched image contains only `SYSCALL`

Prefer synthetic in-memory PBX images in tests.

## Definition of Done

After this PR:

- PBX declares canonical host bindings in `SYSC`
- pre-load code references those bindings with `HOSTCALL`
- the loader resolves and validates them during load
- the loader patches executable code to `SYSCALL <id>`
- the verifier and VM continue to operate on numeric syscalls only
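
## Appendix: Patching Sketch

The code patching algorithm under Loader Integration can be sketched in Rust as follows. This is a minimal illustration of the "rebuild a fresh patched buffer" strategy, not the real loader. It makes several assumptions: it works on an already-decoded `Instr` enum rather than PBX bytes (the byte encoding is not defined in this document), `resolved_ids[i]` is assumed to hold the numeric id that `SYSC[i]` resolved to, and the names `Instr`, `PatchError`, and `patch_hostcalls` are hypothetical.

```rust
/// Simplified, already-decoded instruction form (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Instr {
    /// Pre-load form: zero-based index into the `SYSC` table.
    HostCall(u32),
    /// Final form: numeric syscall id.
    Syscall(u32),
    /// Any instruction unrelated to host calls.
    Other(u8),
}

/// Hypothetical error type covering the two patch-time failures.
#[derive(Debug, PartialEq, Eq)]
pub enum PatchError {
    IndexOutOfBounds { sysc_index: u32 },
    UnusedSysc { sysc_index: usize },
}

/// Rebuilds the code with every `HostCall` replaced by the `Syscall` that its
/// `SYSC` entry resolved to. Fails on an out-of-bounds index or on a `SYSC`
/// entry that no `HostCall` references.
pub fn patch_hostcalls(
    code: &[Instr],
    resolved_ids: &[u32],
) -> Result<Vec<Instr>, PatchError> {
    let mut used = vec![false; resolved_ids.len()];
    let mut patched = Vec::with_capacity(code.len());

    for &instr in code {
        match instr {
            Instr::HostCall(sysc_index) => {
                let slot = sysc_index as usize;
                // Out-of-bounds indices are rejected before any marking.
                let id = *resolved_ids
                    .get(slot)
                    .ok_or(PatchError::IndexOutOfBounds { sysc_index })?;
                used[slot] = true;
                patched.push(Instr::Syscall(id));
            }
            // Everything else is copied through unchanged.
            other => patched.push(other),
        }
    }

    // After the scan, every declared entry must have been referenced.
    if let Some(sysc_index) = used.iter().position(|&was_used| !was_used) {
        return Err(PatchError::UnusedSysc { sysc_index });
    }
    Ok(patched)
}
```

Because every `HostCall` arm pushes a `Syscall`, the rebuilt buffer cannot contain a `HostCall` by construction. An in-place byte-level patcher would need an explicit post-scan check for the "no `HOSTCALL` may remain" rule instead.
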