From 5c53b926273a70b8c34cdcfcf446abb9e4055625 Mon Sep 17 00:00:00 2001
From: bQUARKz
Date: Mon, 2 Mar 2026 15:11:20 +0000
Subject: [PATCH] clean up

---
 ...- PBX SYSC and HOSTCALL Loader Patching.md | 442 ------------------
 1 file changed, 442 deletions(-)
 delete mode 100644 docs/pull-requests/PR-2 - PBX SYSC and HOSTCALL Loader Patching.md

diff --git a/docs/pull-requests/PR-2 - PBX SYSC and HOSTCALL Loader Patching.md b/docs/pull-requests/PR-2 - PBX SYSC and HOSTCALL Loader Patching.md
deleted file mode 100644
index f84f5f1b..00000000
--- a/docs/pull-requests/PR-2 - PBX SYSC and HOSTCALL Loader Patching.md
+++ /dev/null
@@ -1,442 +0,0 @@
-# PR-2 - PBX SYSC and HOSTCALL Loader Patching
-
-## Goal
-
-Teach the runtime to load canonical host bindings from PBX, resolve them at load time, validate ABI and capabilities, and rewrite pre-load host calls into final numeric syscalls.
-
-This PR assumes PR-1 is already available, so cartridge capabilities are exposed to the loader as internal `CapFlags`.
-
-## Why
-
-Prometeu's hardware contract is:
-
-- source- and SDK-level host APIs map to canonical identities `(module, name, version)`
-- load-time resolution maps those identities to numeric syscall ids
-- runtime execution is numeric-only via `SYSCALL <id>`
-
-The runtime already has the important building blocks:
-
-- canonical syscall registry
-- load-time `resolve_program_syscalls`
-- verifier support for numeric `SYSCALL <id>`
-- VM dispatch by numeric id
-
-What is missing is the PBX and loader wiring.
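The contract described under "Why" (canonical identity in, numeric id out, with ABI and capability checks at load time) can be sketched roughly as follows. This is an illustration only, not the code in `prometeu-hal`: `HostEntry`, `Decl`, `ResolveError`, `resolve_decl`, and the `u64` capability bitmask standing in for `CapFlags` are all hypothetical names chosen for this sketch.

```rust
/// Hypothetical host-side registry row (stand-in for the real syscall metadata).
pub struct HostEntry {
    pub module: &'static str,
    pub name: &'static str,
    pub version: u16,
    pub id: u32,
    pub arg_slots: u16,
    pub ret_slots: u16,
    /// Assumption: capabilities modeled as a plain bitmask.
    pub required_caps: u64,
}

/// Hypothetical borrowed view of one program-declared binding.
pub struct Decl<'a> {
    pub module: &'a str,
    pub name: &'a str,
    pub version: u16,
    pub arg_slots: u16,
    pub ret_slots: u16,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ResolveError {
    UnknownSyscall,
    AbiMismatch,
    MissingCapability,
}

/// Resolve one declaration to a numeric syscall id, or fail deterministically.
pub fn resolve_decl(
    registry: &[HostEntry],
    decl: &Decl<'_>,
    granted_caps: u64,
) -> Result<u32, ResolveError> {
    // 1. Look up the canonical identity (module, name, version).
    let entry = registry
        .iter()
        .find(|e| e.module == decl.module && e.name == decl.name && e.version == decl.version)
        .ok_or(ResolveError::UnknownSyscall)?;

    // 2. The declared ABI shape must match the authoritative host metadata.
    if entry.arg_slots != decl.arg_slots || entry.ret_slots != decl.ret_slots {
        return Err(ResolveError::AbiMismatch);
    }

    // 3. Every required capability bit must be granted by the cartridge.
    if entry.required_caps & granted_caps != entry.required_caps {
        return Err(ResolveError::MissingCapability);
    }

    Ok(entry.id)
}
```

The checks run in a fixed order so that a given input always produces the same error, which is one way to meet the "load fails deterministically" requirement.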
-
-## Scope
-
-In scope:
-
-- add a mandatory PBX `SYSC` section
-- extend each `SYSC` entry with ABI shape
-- add pre-load opcode `HOSTCALL <sysc_index>`
-- parse `SYSC` at load time
-- resolve `SYSC` against the host syscall registry
-- validate declared ABI against authoritative host metadata
-- validate capabilities using cartridge-granted `CapFlags`
-- reject unused `SYSC` entries
-- reject out-of-bounds `HOSTCALL` indices
-- patch all `HOSTCALL <sysc_index>` to `SYSCALL <id>`
-- ensure verifier runs only on the patched image
-
-Out of scope:
-
-- compiler emission
-- stdlib SDK pack format
-- final external tooling for PBX generation
-- platform policy beyond current manifest-derived granted capabilities
-
-## Architectural Contract
-
-### PBX `SYSC`
-
-`SYSC` is a unique, deduplicated, program-wide table of declared host bindings.
-
-Each entry carries:
-
-- `module`
-- `name`
-- `version`
-- `arg_slots`
-- `ret_slots`
-
-`SYSC` is mandatory for every valid PBX.
-If the program requires no host bindings, `SYSC` is present with `count = 0`.
-
-### Pre-load callsites
-
-The compiler-side artifact form is:
-
-```text
-HOSTCALL <sysc_index>
-```
-
-Rules:
-
-- `sysc_index` is zero-based into the `SYSC` table
-- code must not contain final `SYSCALL <id>` for unresolved host-backed SDK calls
-- the final executable image given to the VM must contain no `HOSTCALL`
-
-### Load-time responsibilities
-
-The loader must:
-
-1. parse `SYSC`
-2. resolve each entry to `syscall_id`
-3. validate `arg_slots` and `ret_slots`
-4. validate capabilities
-5. scan code for `HOSTCALL`
-6. mark used `SYSC` indices
-7. reject out-of-bounds indices
-8. reject unused `SYSC` entries
-9. rewrite `HOSTCALL <sysc_index>` into `SYSCALL <id>`
-10. ensure no `HOSTCALL` remains
-11. only then hand the image to the verifier
-
-## PBX Format Changes
-
-### `prometeu-bytecode`
-
-Add a new section kind:
-
-- `SYSC`
-
-Recommended temporary binary layout:
-
-```text
-u32 count
-repeat count times:
-  u16 module_len
-  [module_len bytes UTF-8]
-  u16 name_len
-  [name_len bytes UTF-8]
-  u16 version
-  u16 arg_slots
-  u16 ret_slots
-```
-
-Rules:
-
-- strings are UTF-8
-- duplicate canonical identities are invalid
-- malformed lengths are invalid
-- missing `SYSC` is invalid
-
-### Bytecode model changes
-
-Extend `BytecodeModule` with a syscall declaration vector.
-
-Suggested shape:
-
-```rust
-pub struct SyscallDecl {
-    pub module: String,
-    pub name: String,
-    pub version: u16,
-    pub arg_slots: u16,
-    pub ret_slots: u16,
-}
-```
-
-Then:
-
-```rust
-pub syscalls: Vec<SyscallDecl>
-```
-
-Serialization/deserialization must include section kind for `SYSC`.
-
-## Opcode Changes
-
-### Add `HOSTCALL`
-
-Add a new opcode:
-
-- `HOSTCALL`
-
-Immediate:
-
-- `u32 sysc_index`
-
-Recommended behavior:
-
-- valid only in pre-load artifact form
-- not allowed to reach final verifier/VM execution path
-
-Required code updates:
-
-- opcode enum
-- decoder
-- opcode spec
-- assembler
-- disassembler
-
-Verifier recommendation:
-
-- the verifier does not need to support `HOSTCALL`
-- it should continue to run after loader patching only
-
-## Loader Integration
-
-### Resolution
-
-Use the existing canonical resolver in `prometeu-hal::syscalls`.
-
-For each `SYSC` entry:
-
-1. resolve `(module, name, version)`
-2. obtain `SyscallResolved`
-3. compare `arg_slots`
-4. compare `ret_slots`
-5. compare required capability against granted cartridge flags
-
-If any check fails, load fails deterministically.
-
-### Capability source
-
-Capabilities come from the already-loaded cartridge manifest via PR-1.
-
-This PR must not invent authority from PBX contents.
-
-### Code patching
-
-Recommended algorithm:
-
-1. parse code buffer
-2. whenever `HOSTCALL <sysc_index>` is found:
-3. validate `index < syscalls.len()`
-4. mark that `SYSC[index]` is used
-5. rewrite opcode/immediate in place or rebuild the code buffer
-6. emit final `SYSCALL <id>`
-
-After scan:
-
-1. every `SYSC` entry must be marked used
-2. no `HOSTCALL` may remain
-
-Either patching strategy is acceptable:
-
-- mutate byte buffer in place
-- rebuild a fresh patched buffer
-
-The final `ProgramImage` must contain only numeric `SYSCALL <id>`.
-
-## Verifier Contract
-
-No verifier redesign is required.
-
-The intended contract is:
-
-- loader validates interface compatibility
-- verifier validates final numeric program structure
-
-This means:
-
-- loader checks `SYSC`-declared ABI against host metadata
-- verifier checks stack effects of final `SYSCALL <id>` using existing runtime metadata
-
-This is intentional and not considered harmful duplication.
-
-## Proposed Code Areas
-
-### `prometeu-bytecode`
-
-Likely files:
-
-- `src/model.rs`
-- `src/opcode.rs`
-- `src/opcode_spec.rs`
-- `src/decoder.rs`
-- `src/assembler.rs`
-- `src/disassembler.rs`
-- `src/lib.rs`
-
-### `prometeu-hal`
-
-Likely files:
-
-- `src/cartridge_loader.rs`
-- possibly helper types or resolver glue near syscall loading logic
-
-### `prometeu-vm`
-
-Likely files:
-
-- load path in `src/virtual_machine.rs`
-
-Required behavior:
-
-- patch before `Verifier::verify(...)`
-
-## Implementation Plan
-
-Implementation should be staged so each phase leaves the workspace in a coherent state and keeps the verifier/VM contract intact.
-
-### Commit Checklist
-
-1. `prometeu-bytecode: add SyscallDecl model and mandatory SYSC section`
-2. `prometeu-bytecode: add SYSC parser validation and load errors`
-3. `prometeu-bytecode: add phase-1 coverage for empty/valid/invalid SYSC`
-4. `prometeu-bytecode: wire downstream constructors to new BytecodeModule.syscalls field`
-5. `prometeu-bytecode: clean up naming/docs after phase-1 passes`
-
-### Phase 1 - Extend PBX module format
-
-Target crates:
-
-- `crates/console/prometeu-bytecode`
-
-Steps:
-
-1. add `SyscallDecl` to `src/model.rs` and extend `BytecodeModule` with `syscalls: Vec<SyscallDecl>`
-2. reserve a new section kind for `SYSC` in module serialization/deserialization
-3. enforce the mandatory-section rule: valid PBX images always carry `SYSC`, including `count = 0`
-4. reject malformed payloads, invalid UTF-8, and duplicate canonical identities during load
-5. update `src/lib.rs` exports if needed so the VM and HAL can consume the new declarations
-
-Checkpoint:
-
-- `BytecodeLoader::load(...)` returns `BytecodeModule` with canonical syscall declarations preserved from PBX
-
-### Phase 2 - Add pre-load opcode support
-
-Target crates:
-
-- `crates/console/prometeu-bytecode`
-
-Steps:
-
-1. add `HOSTCALL` to `src/opcode.rs` with a `u32` immediate
-2. extend `src/opcode_spec.rs` and `src/decoder.rs` so the loader can scan and decode `HOSTCALL`
-3. update `src/assembler.rs` and `src/disassembler.rs` so tests and fixtures can produce/read pre-load artifacts
-4. keep the runtime contract explicit: `HOSTCALL` is representable in bytecode artifacts but must not survive loader patching
-
-Checkpoint:
-
-- bytecode tooling round-trips `HOSTCALL <sysc_index>` correctly, while runtime execution still depends on patched `SYSCALL <id>`
-
-### Phase 3 - Bridge PBX declarations to host metadata
-
-Target crates:
-
-- `crates/console/prometeu-hal`
-
-Steps:
-
-1. add a resolver path that accepts program-owned syscall declarations instead of only `&'static str` identities
-2. resolve each `(module, name, version)` against `src/syscalls.rs`
-3. validate `arg_slots` and `ret_slots` against authoritative `SyscallMeta`
-4. validate required capabilities against cartridge-derived `CapFlags`
-5. return deterministic, load-facing errors for unknown syscalls, ABI mismatches, and missing capabilities
-
-Design note:
-
-- this phase likely needs a small owned-string adapter or a new helper alongside `resolve_program_syscalls(...)`, because PBX strings are runtime data, not `&'static str`
-
-Checkpoint:
-
-- given only `BytecodeModule.syscalls` plus granted `CapFlags`, the loader can produce a resolved table `sysc_index -> syscall_id`
-
-### Phase 4 - Patch before verification
-
-Target crates:
-
-- `crates/console/prometeu-vm`
-
-Steps:
-
-1. add a load-time patching helper near `src/virtual_machine.rs` or a small dedicated module
-2. run that helper immediately after `BytecodeLoader::load(...)` and before `Verifier::verify(...)`
-3. scan `module.code`, decode every instruction, and rewrite `HOSTCALL <sysc_index>` into `SYSCALL <id>`
-4. reject out-of-bounds `HOSTCALL` indices during the scan
-5. track `SYSC` usage and reject declarations that are never referenced
-6. assert that no `HOSTCALL` remains before handing code to the verifier
-7. only then call `Verifier::verify(...)`, compute `max_stack_slots`, and build `ProgramImage::from(module)`
-
-Why the sequencing matters:
-
-- the current load path in `crates/console/prometeu-vm/src/virtual_machine.rs` verifies the raw module immediately after `BytecodeLoader::load(...)`
-- `ProgramImage` currently stores only the final ROM/functions/constants and does not preserve a syscall declaration table, so patching must happen while the code is still a `BytecodeModule`
-
-Checkpoint:
-
-- the verifier sees only numeric `SYSCALL <id>` instructions, preserving the existing verifier and VM execution model
-
-### Phase 5 - Tests and failure matrix
-
-Target crates:
-
-- `crates/console/prometeu-bytecode`
-- `crates/console/prometeu-hal`
-- `crates/console/prometeu-vm`
-
-Steps:
-
-1. add serialization/deserialization tests for missing, empty, valid, malformed, and duplicate `SYSC`
-2. add opcode tests for `HOSTCALL` decoding and assembler/disassembler coverage
-3. add resolver tests for unknown identity, ABI mismatch, and capability mismatch
-4. add VM load-path tests proving patch-before-verify behavior
-5. add an assertion that final executable images contain only numeric `SYSCALL`
-
-Suggested PR slicing:
-
-1. bytecode format + tests
-2. `HOSTCALL` opcode plumbing
-3. HAL resolution/ABI validation
-4. VM loader patching
-5. integration and regression tests
-
-## Deterministic Load Errors
-
-Load must fail for at least:
-
-1. missing `SYSC`
-2. malformed `SYSC`
-3. invalid UTF-8
-4. duplicate syscall identities
-5. unknown syscall identity
-6. ABI mismatch between `SYSC` and host metadata
-7. missing capability
-8. `HOSTCALL` with out-of-bounds `sysc_index`
-9. declared `SYSC` entry unused by all `HOSTCALL`s
-10. `HOSTCALL` still present after patch
-
-## Acceptance Criteria
-
-- PBX parser supports mandatory `SYSC`
-- `BytecodeModule` carries syscall declarations
-- runtime understands `HOSTCALL`
-- loader resolves `SYSC` entries before verification
-- loader validates `arg_slots` and `ret_slots`
-- loader validates capabilities against cartridge flags
-- loader rewrites `HOSTCALL` to `SYSCALL`
-- verifier runs only on patched code
-- final VM execution path remains numeric-only
-
-## Tests
-
-Add tests covering:
-
-1. valid PBX with empty `SYSC` and no `HOSTCALL`
-2. valid PBX with one syscall and one `HOSTCALL`
-3. unknown syscall identity
-4. capability mismatch
-5. ABI mismatch
-6. missing `SYSC`
-7. duplicate `SYSC` entries
-8. malformed `SYSC` payload
-9. `HOSTCALL` index out of bounds
-10. unused `SYSC` entry
-11. patched image contains only `SYSCALL`
-
-Prefer synthetic in-memory PBX images in tests.
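A synthetic in-memory `SYSC` payload for such tests could be produced and parsed along the lines below. This is a minimal sketch, not the `prometeu-bytecode` implementation. It follows the recommended temporary layout from the deleted document, but little-endian byte order is an assumption (the layout does not state endianness), and `encode_sysc`, `parse_sysc`, and `SyscError` are hypothetical names.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SyscallDecl {
    pub module: String,
    pub name: String,
    pub version: u16,
    pub arg_slots: u16,
    pub ret_slots: u16,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SyscError {
    Malformed,
    InvalidUtf8,
    Duplicate,
}

/// Test helper: encode declarations as a SYSC payload (assumed little-endian).
pub fn encode_sysc(decls: &[SyscallDecl]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(decls.len() as u32).to_le_bytes());
    for d in decls {
        out.extend_from_slice(&(d.module.len() as u16).to_le_bytes());
        out.extend_from_slice(d.module.as_bytes());
        out.extend_from_slice(&(d.name.len() as u16).to_le_bytes());
        out.extend_from_slice(d.name.as_bytes());
        for v in [d.version, d.arg_slots, d.ret_slots] {
            out.extend_from_slice(&v.to_le_bytes());
        }
    }
    out
}

/// Take `n` bytes at `pos`, failing instead of panicking on short input.
fn take<'a>(buf: &'a [u8], pos: &mut usize, n: usize) -> Result<&'a [u8], SyscError> {
    let end = pos.checked_add(n).ok_or(SyscError::Malformed)?;
    let slice = buf.get(*pos..end).ok_or(SyscError::Malformed)?;
    *pos = end;
    Ok(slice)
}

fn read_u16(buf: &[u8], pos: &mut usize) -> Result<u16, SyscError> {
    let b = take(buf, pos, 2)?;
    Ok(u16::from_le_bytes([b[0], b[1]]))
}

fn read_str(buf: &[u8], pos: &mut usize) -> Result<String, SyscError> {
    let len = read_u16(buf, pos)? as usize;
    let bytes = take(buf, pos, len)?;
    String::from_utf8(bytes.to_vec()).map_err(|_| SyscError::InvalidUtf8)
}

/// Parse a SYSC payload, rejecting short input, bad UTF-8, and duplicates.
pub fn parse_sysc(buf: &[u8]) -> Result<Vec<SyscallDecl>, SyscError> {
    let mut pos = 0;
    let c = take(buf, &mut pos, 4)?;
    let count = u32::from_le_bytes([c[0], c[1], c[2], c[3]]);
    let mut seen = HashSet::new();
    let mut decls = Vec::new();
    for _ in 0..count {
        let module = read_str(buf, &mut pos)?;
        let name = read_str(buf, &mut pos)?;
        let version = read_u16(buf, &mut pos)?;
        let arg_slots = read_u16(buf, &mut pos)?;
        let ret_slots = read_u16(buf, &mut pos)?;
        // The canonical identity is (module, name, version); it must be unique.
        if !seen.insert((module.clone(), name.clone(), version)) {
            return Err(SyscError::Duplicate);
        }
        decls.push(SyscallDecl { module, name, version, arg_slots, ret_slots });
    }
    Ok(decls)
}
```

Round-tripping through `encode_sysc` and `parse_sysc` covers the empty and valid cases, while truncating or duplicating the encoded bytes gives the malformed and duplicate cases without needing fixture files.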
-
-## Definition of Done
-
-After this PR:
-
-- PBX declares canonical host bindings in `SYSC`
-- pre-load code references those bindings with `HOSTCALL`
-- the loader resolves and validates them during load
-- the loader patches executable code to `SYSCALL <id>`
-- the verifier and VM continue to operate on numeric syscalls only
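The patch-before-verify pass described in the deleted document (scan, bounds-check, mark used, rewrite, reject unused entries) might look like the sketch below. It takes the "rebuild a fresh patched buffer" strategy and works on an already-decoded instruction list rather than raw bytes, because the real opcode encoding is not given in the document. `Instr`, `PatchError`, and `patch_hostcalls` are hypothetical names, and `resolved` stands for the `sysc_index -> syscall_id` table produced by resolution.

```rust
/// Hypothetical decoded instruction; `Other` stands in for every other opcode.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Instr {
    /// Pre-load form: zero-based index into the SYSC table.
    HostCall(u32),
    /// Final form: numeric syscall id.
    Syscall(u32),
    Other(u8),
}

#[derive(Debug, PartialEq, Eq)]
pub enum PatchError {
    IndexOutOfBounds { index: u32 },
    UnusedDecl { index: usize },
}

/// Rewrite every HOSTCALL into a SYSCALL using the resolved id table.
pub fn patch_hostcalls(code: &[Instr], resolved: &[u32]) -> Result<Vec<Instr>, PatchError> {
    let mut used = vec![false; resolved.len()];
    let mut out = Vec::with_capacity(code.len());

    for instr in code {
        match instr {
            Instr::HostCall(index) => {
                let slot = *index as usize;
                // Reject indices that point past the SYSC table.
                let id = *resolved
                    .get(slot)
                    .ok_or(PatchError::IndexOutOfBounds { index: *index })?;
                used[slot] = true;
                out.push(Instr::Syscall(id));
            }
            other => out.push(other.clone()),
        }
    }

    // Every declared SYSC entry must be referenced by at least one HOSTCALL.
    if let Some(index) = used.iter().position(|u| !*u) {
        return Err(PatchError::UnusedDecl { index });
    }

    // By construction no HOSTCALL survives; this check documents the invariant.
    debug_assert!(out.iter().all(|i| !matches!(i, Instr::HostCall(_))));
    Ok(out)
}
```

Rebuilding into a new vector keeps the input untouched on failure, so a rejected image is never left half-patched. An in-place rewrite would also satisfy the document, which accepts either strategy.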