clean up
This commit is contained in:
parent
8ce12d9a71
commit
5c53b92627
@ -1,442 +0,0 @@
|
|||||||
# PR-2 - PBX SYSC and HOSTCALL Loader Patching
|
|
||||||
|
|
||||||
## Goal
|
|
||||||
|
|
||||||
Teach the runtime to load canonical host bindings from PBX, resolve them at load time, validate ABI and capabilities, and rewrite pre-load host calls into final numeric syscalls.
|
|
||||||
|
|
||||||
This PR assumes PR-1 is already available, so cartridge capabilities are exposed to the loader as internal `CapFlags`.
|
|
||||||
|
|
||||||
## Why
|
|
||||||
|
|
||||||
Prometeu's hardware contract is:
|
|
||||||
|
|
||||||
- source- and SDK-level host APIs map to canonical identities `(module, name, version)`
|
|
||||||
- load-time resolution maps those identities to numeric syscall ids
|
|
||||||
- runtime execution is numeric-only via `SYSCALL <id>`
|
|
||||||
|
|
||||||
The runtime already has the important building blocks:
|
|
||||||
|
|
||||||
- canonical syscall registry
|
|
||||||
- load-time `resolve_program_syscalls`
|
|
||||||
- verifier support for numeric `SYSCALL <id>`
|
|
||||||
- VM dispatch by numeric id
|
|
||||||
|
|
||||||
What is missing is the PBX and loader wiring.
|
|
||||||
|
|
||||||
## Scope
|
|
||||||
|
|
||||||
In scope:
|
|
||||||
|
|
||||||
- add a mandatory PBX `SYSC` section
|
|
||||||
- extend each `SYSC` entry with ABI shape
|
|
||||||
- add pre-load opcode `HOSTCALL <sysc_index>`
|
|
||||||
- parse `SYSC` at load time
|
|
||||||
- resolve `SYSC` against the host syscall registry
|
|
||||||
- validate declared ABI against authoritative host metadata
|
|
||||||
- validate capabilities using cartridge-granted `CapFlags`
|
|
||||||
- reject unused `SYSC` entries
|
|
||||||
- reject out-of-bounds `HOSTCALL` indices
|
|
||||||
- patch all `HOSTCALL <sysc_index>` to `SYSCALL <id>`
|
|
||||||
- ensure verifier runs only on the patched image
|
|
||||||
|
|
||||||
Out of scope:
|
|
||||||
|
|
||||||
- compiler emission
|
|
||||||
- stdlib SDK pack format
|
|
||||||
- final external tooling for PBX generation
|
|
||||||
- platform policy beyond current manifest-derived granted capabilities
|
|
||||||
|
|
||||||
## Architectural Contract
|
|
||||||
|
|
||||||
### PBX `SYSC`
|
|
||||||
|
|
||||||
`SYSC` is a unique, deduplicated, program-wide table of declared host bindings.
|
|
||||||
|
|
||||||
Each entry carries:
|
|
||||||
|
|
||||||
- `module`
|
|
||||||
- `name`
|
|
||||||
- `version`
|
|
||||||
- `arg_slots`
|
|
||||||
- `ret_slots`
|
|
||||||
|
|
||||||
`SYSC` is mandatory for every valid PBX.
|
|
||||||
If the program requires no host bindings, `SYSC` is present with `count = 0`.
|
|
||||||
|
|
||||||
### Pre-load callsites
|
|
||||||
|
|
||||||
The compiler-side artifact form is:
|
|
||||||
|
|
||||||
```text
|
|
||||||
HOSTCALL <u32 sysc_index>
|
|
||||||
```
|
|
||||||
|
|
||||||
Rules:
|
|
||||||
|
|
||||||
- `sysc_index` is zero-based into the `SYSC` table
|
|
||||||
- code must not contain final `SYSCALL <id>` for unresolved host-backed SDK calls
|
|
||||||
- the final executable image given to the VM must contain no `HOSTCALL`
|
|
||||||
|
|
||||||
### Load-time responsibilities
|
|
||||||
|
|
||||||
The loader must:
|
|
||||||
|
|
||||||
1. parse `SYSC`
|
|
||||||
2. resolve each entry to `syscall_id`
|
|
||||||
3. validate `arg_slots` and `ret_slots`
|
|
||||||
4. validate capabilities
|
|
||||||
5. scan code for `HOSTCALL`
|
|
||||||
6. mark used `SYSC` indices
|
|
||||||
7. reject out-of-bounds indices
|
|
||||||
8. reject unused `SYSC` entries
|
|
||||||
9. rewrite `HOSTCALL <index>` into `SYSCALL <id>`
|
|
||||||
10. ensure no `HOSTCALL` remains
|
|
||||||
11. only then hand the image to the verifier
|
|
||||||
|
|
||||||
## PBX Format Changes
|
|
||||||
|
|
||||||
### `prometeu-bytecode`
|
|
||||||
|
|
||||||
Add a new section kind:
|
|
||||||
|
|
||||||
- `SYSC`
|
|
||||||
|
|
||||||
Recommended temporary binary layout:
|
|
||||||
|
|
||||||
```text
|
|
||||||
u32 count
|
|
||||||
repeat count times:
|
|
||||||
u16 module_len
|
|
||||||
[module_len bytes UTF-8]
|
|
||||||
u16 name_len
|
|
||||||
[name_len bytes UTF-8]
|
|
||||||
u16 version
|
|
||||||
u16 arg_slots
|
|
||||||
u16 ret_slots
|
|
||||||
```
|
|
||||||
|
|
||||||
Rules:
|
|
||||||
|
|
||||||
- strings are UTF-8
|
|
||||||
- duplicate canonical identities are invalid
|
|
||||||
- malformed lengths are invalid
|
|
||||||
- missing `SYSC` is invalid
|
|
||||||
|
|
||||||
### Bytecode model changes
|
|
||||||
|
|
||||||
Extend `BytecodeModule` with a syscall declaration vector.
|
|
||||||
|
|
||||||
Suggested shape:
|
|
||||||
|
|
||||||
```rust
|
|
||||||
pub struct SyscallDecl {
|
|
||||||
pub module: String,
|
|
||||||
pub name: String,
|
|
||||||
pub version: u16,
|
|
||||||
pub arg_slots: u16,
|
|
||||||
pub ret_slots: u16,
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
Then:
|
|
||||||
|
|
||||||
```rust
|
|
||||||
pub syscalls: Vec<SyscallDecl>
|
|
||||||
```
|
|
||||||
|
|
||||||
Serialization/deserialization must include section kind for `SYSC`.
|
|
||||||
|
|
||||||
## Opcode Changes
|
|
||||||
|
|
||||||
### Add `HOSTCALL`
|
|
||||||
|
|
||||||
Add a new opcode:
|
|
||||||
|
|
||||||
- `HOSTCALL`
|
|
||||||
|
|
||||||
Immediate:
|
|
||||||
|
|
||||||
- `u32 sysc_index`
|
|
||||||
|
|
||||||
Recommended behavior:
|
|
||||||
|
|
||||||
- valid only in pre-load artifact form
|
|
||||||
- not allowed to reach final verifier/VM execution path
|
|
||||||
|
|
||||||
Required code updates:
|
|
||||||
|
|
||||||
- opcode enum
|
|
||||||
- decoder
|
|
||||||
- opcode spec
|
|
||||||
- assembler
|
|
||||||
- disassembler
|
|
||||||
|
|
||||||
Verifier recommendation:
|
|
||||||
|
|
||||||
- the verifier does not need to support `HOSTCALL`
|
|
||||||
- it should continue to run after loader patching only
|
|
||||||
|
|
||||||
## Loader Integration
|
|
||||||
|
|
||||||
### Resolution
|
|
||||||
|
|
||||||
Use the existing canonical resolver in `prometeu-hal::syscalls`.
|
|
||||||
|
|
||||||
For each `SYSC` entry:
|
|
||||||
|
|
||||||
1. resolve `(module, name, version)`
|
|
||||||
2. obtain `SyscallResolved`
|
|
||||||
3. compare `arg_slots`
|
|
||||||
4. compare `ret_slots`
|
|
||||||
5. compare required capability against granted cartridge flags
|
|
||||||
|
|
||||||
If any check fails, load fails deterministically.
|
|
||||||
|
|
||||||
### Capability source
|
|
||||||
|
|
||||||
Capabilities come from the already-loaded cartridge manifest via PR-1.
|
|
||||||
|
|
||||||
This PR must not invent authority from PBX contents.
|
|
||||||
|
|
||||||
### Code patching
|
|
||||||
|
|
||||||
Recommended algorithm:
|
|
||||||
|
|
||||||
1. parse code buffer
|
|
||||||
2. whenever `HOSTCALL <index>` is found:
|
|
||||||
3. validate `index < syscalls.len()`
|
|
||||||
4. mark that `SYSC[index]` is used
|
|
||||||
5. rewrite opcode/immediate in place or rebuild the code buffer
|
|
||||||
6. emit final `SYSCALL <resolved_id>`
|
|
||||||
|
|
||||||
After scan:
|
|
||||||
|
|
||||||
1. every `SYSC` entry must be marked used
|
|
||||||
2. no `HOSTCALL` may remain
|
|
||||||
|
|
||||||
Either patching strategy is acceptable:
|
|
||||||
|
|
||||||
- mutate byte buffer in place
|
|
||||||
- rebuild a fresh patched buffer
|
|
||||||
|
|
||||||
The final `ProgramImage` must contain only numeric `SYSCALL <id>`.
|
|
||||||
|
|
||||||
## Verifier Contract
|
|
||||||
|
|
||||||
No verifier redesign is required.
|
|
||||||
|
|
||||||
The intended contract is:
|
|
||||||
|
|
||||||
- loader validates interface compatibility
|
|
||||||
- verifier validates final numeric program structure
|
|
||||||
|
|
||||||
This means:
|
|
||||||
|
|
||||||
- loader checks `SYSC`-declared ABI against host metadata
|
|
||||||
- verifier checks stack effects of final `SYSCALL <id>` using existing runtime metadata
|
|
||||||
|
|
||||||
This is intentional and not considered harmful duplication.
|
|
||||||
|
|
||||||
## Proposed Code Areas
|
|
||||||
|
|
||||||
### `prometeu-bytecode`
|
|
||||||
|
|
||||||
Likely files:
|
|
||||||
|
|
||||||
- `src/model.rs`
|
|
||||||
- `src/opcode.rs`
|
|
||||||
- `src/opcode_spec.rs`
|
|
||||||
- `src/decoder.rs`
|
|
||||||
- `src/assembler.rs`
|
|
||||||
- `src/disassembler.rs`
|
|
||||||
- `src/lib.rs`
|
|
||||||
|
|
||||||
### `prometeu-hal`
|
|
||||||
|
|
||||||
Likely files:
|
|
||||||
|
|
||||||
- `src/cartridge_loader.rs`
|
|
||||||
- possibly helper types or resolver glue near syscall loading logic
|
|
||||||
|
|
||||||
### `prometeu-vm`
|
|
||||||
|
|
||||||
Likely files:
|
|
||||||
|
|
||||||
- load path in `src/virtual_machine.rs`
|
|
||||||
|
|
||||||
Required behavior:
|
|
||||||
|
|
||||||
- patch before `Verifier::verify(...)`
|
|
||||||
|
|
||||||
## Implementation Plan
|
|
||||||
|
|
||||||
Implementation should be staged so each phase leaves the workspace in a coherent state and keeps the verifier/VM contract intact.
|
|
||||||
|
|
||||||
### Commit Checklist
|
|
||||||
|
|
||||||
1. `prometeu-bytecode: add SyscallDecl model and mandatory SYSC section`
|
|
||||||
2. `prometeu-bytecode: add SYSC parser validation and load errors`
|
|
||||||
3. `prometeu-bytecode: add phase-1 coverage for empty/valid/invalid SYSC`
|
|
||||||
4. `prometeu-bytecode: wire downstream constructors to new BytecodeModule.syscalls field`
|
|
||||||
5. `prometeu-bytecode: clean up naming/docs after phase-1 passes`
|
|
||||||
|
|
||||||
### Phase 1 - Extend PBX module format
|
|
||||||
|
|
||||||
Target crates:
|
|
||||||
|
|
||||||
- `crates/console/prometeu-bytecode`
|
|
||||||
|
|
||||||
Steps:
|
|
||||||
|
|
||||||
1. add `SyscallDecl` to `src/model.rs` and extend `BytecodeModule` with `syscalls: Vec<SyscallDecl>`
|
|
||||||
2. reserve a new section kind for `SYSC` in module serialization/deserialization
|
|
||||||
3. enforce the mandatory-section rule: valid PBS images always carry `SYSC`, including `count = 0`
|
|
||||||
4. reject malformed payloads, invalid UTF-8, and duplicate canonical identities during load
|
|
||||||
5. update `src/lib.rs` exports if needed so the VM and HAL can consume the new declarations
|
|
||||||
|
|
||||||
Checkpoint:
|
|
||||||
|
|
||||||
- `BytecodeLoader::load(...)` returns `BytecodeModule` with canonical syscall declarations preserved from PBX
|
|
||||||
|
|
||||||
### Phase 2 - Add pre-load opcode support
|
|
||||||
|
|
||||||
Target crates:
|
|
||||||
|
|
||||||
- `crates/console/prometeu-bytecode`
|
|
||||||
|
|
||||||
Steps:
|
|
||||||
|
|
||||||
1. add `HOSTCALL` to `src/opcode.rs` with a `u32` immediate
|
|
||||||
2. extend `src/opcode_spec.rs` and `src/decoder.rs` so the loader can scan and decode `HOSTCALL`
|
|
||||||
3. update `src/assembler.rs` and `src/disassembler.rs` so tests and fixtures can produce/read pre-load artifacts
|
|
||||||
4. keep the runtime contract explicit: `HOSTCALL` is representable in bytecode artifacts but must not survive loader patching
|
|
||||||
|
|
||||||
Checkpoint:
|
|
||||||
|
|
||||||
- bytecode tooling round-trips `HOSTCALL <index>` correctly, while runtime execution still depends on patched `SYSCALL <id>`
|
|
||||||
|
|
||||||
### Phase 3 - Bridge PBX declarations to host metadata
|
|
||||||
|
|
||||||
Target crates:
|
|
||||||
|
|
||||||
- `crates/console/prometeu-hal`
|
|
||||||
|
|
||||||
Steps:
|
|
||||||
|
|
||||||
1. add a resolver path that accepts program-owned syscall declarations instead of only `&'static str` identities
|
|
||||||
2. resolve each `(module, name, version)` against `src/syscalls.rs`
|
|
||||||
3. validate `arg_slots` and `ret_slots` against authoritative `SyscallMeta`
|
|
||||||
4. validate required capabilities against cartridge-derived `CapFlags`
|
|
||||||
5. return deterministic, load-facing errors for unknown syscalls, ABI mismatches, and missing capabilities
|
|
||||||
|
|
||||||
Design note:
|
|
||||||
|
|
||||||
- this phase likely needs a small owned-string adapter or a new helper alongside `resolve_program_syscalls(...)`, because PBX strings are runtime data, not `&'static str`
|
|
||||||
|
|
||||||
Checkpoint:
|
|
||||||
|
|
||||||
- given only `BytecodeModule.syscalls` plus granted `CapFlags`, the loader can produce a resolved table `sysc_index -> syscall_id`
|
|
||||||
|
|
||||||
### Phase 4 - Patch before verification
|
|
||||||
|
|
||||||
Target crates:
|
|
||||||
|
|
||||||
- `crates/console/prometeu-vm`
|
|
||||||
|
|
||||||
Steps:
|
|
||||||
|
|
||||||
1. add a load-time patching helper near `src/virtual_machine.rs` or a small dedicated module
|
|
||||||
2. run that helper immediately after `BytecodeLoader::load(...)` and before `Verifier::verify(...)`
|
|
||||||
3. scan `module.code`, decode every instruction, and rewrite `HOSTCALL <index>` into `SYSCALL <resolved_id>`
|
|
||||||
4. reject out-of-bounds `HOSTCALL` indices during the scan
|
|
||||||
5. track `SYSC` usage and reject declarations that are never referenced
|
|
||||||
6. assert that no `HOSTCALL` remains before handing code to the verifier
|
|
||||||
7. only then call `Verifier::verify(...)`, compute `max_stack_slots`, and build `ProgramImage::from(module)`
|
|
||||||
|
|
||||||
Why the sequencing matters:
|
|
||||||
|
|
||||||
- the current load path in `crates/console/prometeu-vm/src/virtual_machine.rs` verifies the raw module immediately after `BytecodeLoader::load(...)`
|
|
||||||
- `ProgramImage` currently stores only the final ROM/functions/constants and does not preserve a syscall declaration table, so patching must happen while the code is still a `BytecodeModule`
|
|
||||||
|
|
||||||
Checkpoint:
|
|
||||||
|
|
||||||
- the verifier sees only numeric `SYSCALL <id>` instructions, preserving the existing verifier and VM execution model
|
|
||||||
|
|
||||||
### Phase 5 - Tests and failure matrix
|
|
||||||
|
|
||||||
Target crates:
|
|
||||||
|
|
||||||
- `crates/console/prometeu-bytecode`
|
|
||||||
- `crates/console/prometeu-hal`
|
|
||||||
- `crates/console/prometeu-vm`
|
|
||||||
|
|
||||||
Steps:
|
|
||||||
|
|
||||||
1. add serialization/deserialization tests for missing, empty, valid, malformed, and duplicate `SYSC`
|
|
||||||
2. add opcode tests for `HOSTCALL` decoding and assembler/disassembler coverage
|
|
||||||
3. add resolver tests for unknown identity, ABI mismatch, and capability mismatch
|
|
||||||
4. add VM load-path tests proving patch-before-verify behavior
|
|
||||||
5. add an assertion that final executable images contain only numeric `SYSCALL`
|
|
||||||
|
|
||||||
Suggested PR slicing:
|
|
||||||
|
|
||||||
1. bytecode format + tests
|
|
||||||
2. `HOSTCALL` opcode plumbing
|
|
||||||
3. HAL resolution/ABI validation
|
|
||||||
4. VM loader patching
|
|
||||||
5. integration and regression tests
|
|
||||||
|
|
||||||
## Deterministic Load Errors
|
|
||||||
|
|
||||||
Load must fail for at least:
|
|
||||||
|
|
||||||
1. missing `SYSC`
|
|
||||||
2. malformed `SYSC`
|
|
||||||
3. invalid UTF-8
|
|
||||||
4. duplicate syscall identities
|
|
||||||
5. unknown syscall identity
|
|
||||||
6. ABI mismatch between `SYSC` and host metadata
|
|
||||||
7. missing capability
|
|
||||||
8. `HOSTCALL` with out-of-bounds `sysc_index`
|
|
||||||
9. declared `SYSC` entry unused by all `HOSTCALL`s
|
|
||||||
10. `HOSTCALL` still present after patch
|
|
||||||
|
|
||||||
## Acceptance Criteria
|
|
||||||
|
|
||||||
- PBX parser supports mandatory `SYSC`
|
|
||||||
- `BytecodeModule` carries syscall declarations
|
|
||||||
- runtime understands `HOSTCALL`
|
|
||||||
- loader resolves `SYSC` entries before verification
|
|
||||||
- loader validates `arg_slots` and `ret_slots`
|
|
||||||
- loader validates capabilities against cartridge flags
|
|
||||||
- loader rewrites `HOSTCALL` to `SYSCALL`
|
|
||||||
- verifier runs only on patched code
|
|
||||||
- final VM execution path remains numeric-only
|
|
||||||
|
|
||||||
## Tests
|
|
||||||
|
|
||||||
Add tests covering:
|
|
||||||
|
|
||||||
1. valid PBX with empty `SYSC` and no `HOSTCALL`
|
|
||||||
2. valid PBX with one syscall and one `HOSTCALL`
|
|
||||||
3. unknown syscall identity
|
|
||||||
4. capability mismatch
|
|
||||||
5. ABI mismatch
|
|
||||||
6. missing `SYSC`
|
|
||||||
7. duplicate `SYSC` entries
|
|
||||||
8. malformed `SYSC` payload
|
|
||||||
9. `HOSTCALL` index out of bounds
|
|
||||||
10. unused `SYSC` entry
|
|
||||||
11. patched image contains only `SYSCALL`
|
|
||||||
|
|
||||||
Prefer synthetic in-memory PBX images in tests.
|
|
||||||
|
|
||||||
## Definition of Done
|
|
||||||
|
|
||||||
After this PR:
|
|
||||||
|
|
||||||
- PBX declares canonical host bindings in `SYSC`
|
|
||||||
- pre-load code references those bindings with `HOSTCALL`
|
|
||||||
- the loader resolves and validates them during load
|
|
||||||
- the loader patches executable code to `SYSCALL <id>`
|
|
||||||
- the verifier and VM continue to operate on numeric syscalls only
|
|
||||||
Loading…
x
Reference in New Issue
Block a user