clean up
This commit is contained in:
parent
8ce12d9a71
commit
5c53b92627
@ -1,442 +0,0 @@
|
||||
# PR-2 - PBX SYSC and HOSTCALL Loader Patching
|
||||
|
||||
## Goal
|
||||
|
||||
Teach the runtime to load canonical host bindings from PBX, resolve them at load time, validate ABI and capabilities, and rewrite pre-load host calls into final numeric syscalls.
|
||||
|
||||
This PR assumes PR-1 is already available, so cartridge capabilities are exposed to the loader as internal `CapFlags`.
|
||||
|
||||
## Why
|
||||
|
||||
Prometeu's hardware contract is:
|
||||
|
||||
- source- and SDK-level host APIs map to canonical identities `(module, name, version)`
|
||||
- load-time resolution maps those identities to numeric syscall ids
|
||||
- runtime execution is numeric-only via `SYSCALL <id>`
|
||||
|
||||
The runtime already has the important building blocks:
|
||||
|
||||
- canonical syscall registry
|
||||
- load-time `resolve_program_syscalls`
|
||||
- verifier support for numeric `SYSCALL <id>`
|
||||
- VM dispatch by numeric id
|
||||
|
||||
What is missing is the PBX and loader wiring.
|
||||
|
||||
## Scope
|
||||
|
||||
In scope:
|
||||
|
||||
- add a mandatory PBX `SYSC` section
|
||||
- extend each `SYSC` entry with ABI shape
|
||||
- add pre-load opcode `HOSTCALL <sysc_index>`
|
||||
- parse `SYSC` at load time
|
||||
- resolve `SYSC` against the host syscall registry
|
||||
- validate declared ABI against authoritative host metadata
|
||||
- validate capabilities using cartridge-granted `CapFlags`
|
||||
- reject unused `SYSC` entries
|
||||
- reject out-of-bounds `HOSTCALL` indices
|
||||
- patch all `HOSTCALL <sysc_index>` to `SYSCALL <id>`
|
||||
- ensure verifier runs only on the patched image
|
||||
|
||||
Out of scope:
|
||||
|
||||
- compiler emission
|
||||
- stdlib SDK pack format
|
||||
- final external tooling for PBX generation
|
||||
- platform policy beyond current manifest-derived granted capabilities
|
||||
|
||||
## Architectural Contract
|
||||
|
||||
### PBX `SYSC`
|
||||
|
||||
`SYSC` is a unique, deduplicated, program-wide table of declared host bindings.
|
||||
|
||||
Each entry carries:
|
||||
|
||||
- `module`
|
||||
- `name`
|
||||
- `version`
|
||||
- `arg_slots`
|
||||
- `ret_slots`
|
||||
|
||||
`SYSC` is mandatory for every valid PBX.
|
||||
If the program requires no host bindings, `SYSC` is present with `count = 0`.
|
||||
|
||||
### Pre-load callsites
|
||||
|
||||
The compiler-side artifact form is:
|
||||
|
||||
```text
|
||||
HOSTCALL <u32 sysc_index>
|
||||
```
|
||||
|
||||
Rules:
|
||||
|
||||
- `sysc_index` is zero-based into the `SYSC` table
|
||||
- code must not contain final `SYSCALL <id>` for unresolved host-backed SDK calls
|
||||
- the final executable image given to the VM must contain no `HOSTCALL`
|
||||
|
||||
### Load-time responsibilities
|
||||
|
||||
The loader must:
|
||||
|
||||
1. parse `SYSC`
|
||||
2. resolve each entry to `syscall_id`
|
||||
3. validate `arg_slots` and `ret_slots`
|
||||
4. validate capabilities
|
||||
5. scan code for `HOSTCALL`
|
||||
6. mark used `SYSC` indices
|
||||
7. reject out-of-bounds indices
|
||||
8. reject unused `SYSC` entries
|
||||
9. rewrite `HOSTCALL <index>` into `SYSCALL <id>`
|
||||
10. ensure no `HOSTCALL` remains
|
||||
11. only then hand the image to the verifier
|
||||
|
||||
## PBX Format Changes
|
||||
|
||||
### `prometeu-bytecode`
|
||||
|
||||
Add a new section kind:
|
||||
|
||||
- `SYSC`
|
||||
|
||||
Recommended temporary binary layout:
|
||||
|
||||
```text
|
||||
u32 count
|
||||
repeat count times:
|
||||
u16 module_len
|
||||
[module_len bytes UTF-8]
|
||||
u16 name_len
|
||||
[name_len bytes UTF-8]
|
||||
u16 version
|
||||
u16 arg_slots
|
||||
u16 ret_slots
|
||||
```
|
||||
|
||||
Rules:
|
||||
|
||||
- strings are UTF-8
|
||||
- duplicate canonical identities are invalid
|
||||
- malformed lengths are invalid
|
||||
- missing `SYSC` is invalid
|
||||
|
||||
### Bytecode model changes
|
||||
|
||||
Extend `BytecodeModule` with a syscall declaration vector.
|
||||
|
||||
Suggested shape:
|
||||
|
||||
```rust
|
||||
pub struct SyscallDecl {
|
||||
pub module: String,
|
||||
pub name: String,
|
||||
pub version: u16,
|
||||
pub arg_slots: u16,
|
||||
pub ret_slots: u16,
|
||||
}
|
||||
```
|
||||
|
||||
Then:
|
||||
|
||||
```rust
|
||||
pub syscalls: Vec<SyscallDecl>
|
||||
```
|
||||
|
||||
Serialization/deserialization must include section kind for `SYSC`.
|
||||
|
||||
## Opcode Changes
|
||||
|
||||
### Add `HOSTCALL`
|
||||
|
||||
Add a new opcode:
|
||||
|
||||
- `HOSTCALL`
|
||||
|
||||
Immediate:
|
||||
|
||||
- `u32 sysc_index`
|
||||
|
||||
Recommended behavior:
|
||||
|
||||
- valid only in pre-load artifact form
|
||||
- not allowed to reach final verifier/VM execution path
|
||||
|
||||
Required code updates:
|
||||
|
||||
- opcode enum
|
||||
- decoder
|
||||
- opcode spec
|
||||
- assembler
|
||||
- disassembler
|
||||
|
||||
Verifier recommendation:
|
||||
|
||||
- the verifier does not need to support `HOSTCALL`
|
||||
- it should continue to run after loader patching only
|
||||
|
||||
## Loader Integration
|
||||
|
||||
### Resolution
|
||||
|
||||
Use the existing canonical resolver in `prometeu-hal::syscalls`.
|
||||
|
||||
For each `SYSC` entry:
|
||||
|
||||
1. resolve `(module, name, version)`
|
||||
2. obtain `SyscallResolved`
|
||||
3. compare `arg_slots`
|
||||
4. compare `ret_slots`
|
||||
5. compare required capability against granted cartridge flags
|
||||
|
||||
If any check fails, load fails deterministically.
|
||||
|
||||
### Capability source
|
||||
|
||||
Capabilities come from the already-loaded cartridge manifest via PR-1.
|
||||
|
||||
This PR must not invent authority from PBX contents.
|
||||
|
||||
### Code patching
|
||||
|
||||
Recommended algorithm:
|
||||
|
||||
1. parse code buffer
|
||||
2. whenever `HOSTCALL <index>` is found:
|
||||
3. validate `index < syscalls.len()`
|
||||
4. mark that `SYSC[index]` is used
|
||||
5. rewrite opcode/immediate in place or rebuild the code buffer
|
||||
6. emit final `SYSCALL <resolved_id>`
|
||||
|
||||
After scan:
|
||||
|
||||
1. every `SYSC` entry must be marked used
|
||||
2. no `HOSTCALL` may remain
|
||||
|
||||
Either patching strategy is acceptable:
|
||||
|
||||
- mutate byte buffer in place
|
||||
- rebuild a fresh patched buffer
|
||||
|
||||
The final `ProgramImage` must contain only numeric `SYSCALL <id>`.
|
||||
|
||||
## Verifier Contract
|
||||
|
||||
No verifier redesign is required.
|
||||
|
||||
The intended contract is:
|
||||
|
||||
- loader validates interface compatibility
|
||||
- verifier validates final numeric program structure
|
||||
|
||||
This means:
|
||||
|
||||
- loader checks `SYSC`-declared ABI against host metadata
|
||||
- verifier checks stack effects of final `SYSCALL <id>` using existing runtime metadata
|
||||
|
||||
This is intentional and not considered harmful duplication.
|
||||
|
||||
## Proposed Code Areas
|
||||
|
||||
### `prometeu-bytecode`
|
||||
|
||||
Likely files:
|
||||
|
||||
- `src/model.rs`
|
||||
- `src/opcode.rs`
|
||||
- `src/opcode_spec.rs`
|
||||
- `src/decoder.rs`
|
||||
- `src/assembler.rs`
|
||||
- `src/disassembler.rs`
|
||||
- `src/lib.rs`
|
||||
|
||||
### `prometeu-hal`
|
||||
|
||||
Likely files:
|
||||
|
||||
- `src/cartridge_loader.rs`
|
||||
- possibly helper types or resolver glue near syscall loading logic
|
||||
|
||||
### `prometeu-vm`
|
||||
|
||||
Likely files:
|
||||
|
||||
- load path in `src/virtual_machine.rs`
|
||||
|
||||
Required behavior:
|
||||
|
||||
- patch before `Verifier::verify(...)`
|
||||
|
||||
## Implementation Plan
|
||||
|
||||
Implementation should be staged so each phase leaves the workspace in a coherent state and keeps the verifier/VM contract intact.
|
||||
|
||||
### Commit Checklist
|
||||
|
||||
1. `prometeu-bytecode: add SyscallDecl model and mandatory SYSC section`
|
||||
2. `prometeu-bytecode: add SYSC parser validation and load errors`
|
||||
3. `prometeu-bytecode: add phase-1 coverage for empty/valid/invalid SYSC`
|
||||
4. `prometeu-bytecode: wire downstream constructors to new BytecodeModule.syscalls field`
|
||||
5. `prometeu-bytecode: clean up naming/docs after phase-1 passes`
|
||||
|
||||
### Phase 1 - Extend PBX module format
|
||||
|
||||
Target crates:
|
||||
|
||||
- `crates/console/prometeu-bytecode`
|
||||
|
||||
Steps:
|
||||
|
||||
1. add `SyscallDecl` to `src/model.rs` and extend `BytecodeModule` with `syscalls: Vec<SyscallDecl>`
|
||||
2. reserve a new section kind for `SYSC` in module serialization/deserialization
|
||||
3. enforce the mandatory-section rule: valid PBS images always carry `SYSC`, including `count = 0`
|
||||
4. reject malformed payloads, invalid UTF-8, and duplicate canonical identities during load
|
||||
5. update `src/lib.rs` exports if needed so the VM and HAL can consume the new declarations
|
||||
|
||||
Checkpoint:
|
||||
|
||||
- `BytecodeLoader::load(...)` returns `BytecodeModule` with canonical syscall declarations preserved from PBX
|
||||
|
||||
### Phase 2 - Add pre-load opcode support
|
||||
|
||||
Target crates:
|
||||
|
||||
- `crates/console/prometeu-bytecode`
|
||||
|
||||
Steps:
|
||||
|
||||
1. add `HOSTCALL` to `src/opcode.rs` with a `u32` immediate
|
||||
2. extend `src/opcode_spec.rs` and `src/decoder.rs` so the loader can scan and decode `HOSTCALL`
|
||||
3. update `src/assembler.rs` and `src/disassembler.rs` so tests and fixtures can produce/read pre-load artifacts
|
||||
4. keep the runtime contract explicit: `HOSTCALL` is representable in bytecode artifacts but must not survive loader patching
|
||||
|
||||
Checkpoint:
|
||||
|
||||
- bytecode tooling round-trips `HOSTCALL <index>` correctly, while runtime execution still depends on patched `SYSCALL <id>`
|
||||
|
||||
### Phase 3 - Bridge PBX declarations to host metadata
|
||||
|
||||
Target crates:
|
||||
|
||||
- `crates/console/prometeu-hal`
|
||||
|
||||
Steps:
|
||||
|
||||
1. add a resolver path that accepts program-owned syscall declarations instead of only `&'static str` identities
|
||||
2. resolve each `(module, name, version)` against `src/syscalls.rs`
|
||||
3. validate `arg_slots` and `ret_slots` against authoritative `SyscallMeta`
|
||||
4. validate required capabilities against cartridge-derived `CapFlags`
|
||||
5. return deterministic, load-facing errors for unknown syscalls, ABI mismatches, and missing capabilities
|
||||
|
||||
Design note:
|
||||
|
||||
- this phase likely needs a small owned-string adapter or a new helper alongside `resolve_program_syscalls(...)`, because PBX strings are runtime data, not `&'static str`
|
||||
|
||||
Checkpoint:
|
||||
|
||||
- given only `BytecodeModule.syscalls` plus granted `CapFlags`, the loader can produce a resolved table `sysc_index -> syscall_id`
|
||||
|
||||
### Phase 4 - Patch before verification
|
||||
|
||||
Target crates:
|
||||
|
||||
- `crates/console/prometeu-vm`
|
||||
|
||||
Steps:
|
||||
|
||||
1. add a load-time patching helper near `src/virtual_machine.rs` or a small dedicated module
|
||||
2. run that helper immediately after `BytecodeLoader::load(...)` and before `Verifier::verify(...)`
|
||||
3. scan `module.code`, decode every instruction, and rewrite `HOSTCALL <index>` into `SYSCALL <resolved_id>`
|
||||
4. reject out-of-bounds `HOSTCALL` indices during the scan
|
||||
5. track `SYSC` usage and reject declarations that are never referenced
|
||||
6. assert that no `HOSTCALL` remains before handing code to the verifier
|
||||
7. only then call `Verifier::verify(...)`, compute `max_stack_slots`, and build `ProgramImage::from(module)`
|
||||
|
||||
Why the sequencing matters:
|
||||
|
||||
- the current load path in `crates/console/prometeu-vm/src/virtual_machine.rs` verifies the raw module immediately after `BytecodeLoader::load(...)`
|
||||
- `ProgramImage` currently stores only the final ROM/functions/constants and does not preserve a syscall declaration table, so patching must happen while the code is still a `BytecodeModule`
|
||||
|
||||
Checkpoint:
|
||||
|
||||
- the verifier sees only numeric `SYSCALL <id>` instructions, preserving the existing verifier and VM execution model
|
||||
|
||||
### Phase 5 - Tests and failure matrix
|
||||
|
||||
Target crates:
|
||||
|
||||
- `crates/console/prometeu-bytecode`
|
||||
- `crates/console/prometeu-hal`
|
||||
- `crates/console/prometeu-vm`
|
||||
|
||||
Steps:
|
||||
|
||||
1. add serialization/deserialization tests for missing, empty, valid, malformed, and duplicate `SYSC`
|
||||
2. add opcode tests for `HOSTCALL` decoding and assembler/disassembler coverage
|
||||
3. add resolver tests for unknown identity, ABI mismatch, and capability mismatch
|
||||
4. add VM load-path tests proving patch-before-verify behavior
|
||||
5. add an assertion that final executable images contain only numeric `SYSCALL`
|
||||
|
||||
Suggested PR slicing:
|
||||
|
||||
1. bytecode format + tests
|
||||
2. `HOSTCALL` opcode plumbing
|
||||
3. HAL resolution/ABI validation
|
||||
4. VM loader patching
|
||||
5. integration and regression tests
|
||||
|
||||
## Deterministic Load Errors
|
||||
|
||||
Load must fail for at least:
|
||||
|
||||
1. missing `SYSC`
|
||||
2. malformed `SYSC`
|
||||
3. invalid UTF-8
|
||||
4. duplicate syscall identities
|
||||
5. unknown syscall identity
|
||||
6. ABI mismatch between `SYSC` and host metadata
|
||||
7. missing capability
|
||||
8. `HOSTCALL` with out-of-bounds `sysc_index`
|
||||
9. declared `SYSC` entry unused by all `HOSTCALL`s
|
||||
10. `HOSTCALL` still present after patch
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- PBX parser supports mandatory `SYSC`
|
||||
- `BytecodeModule` carries syscall declarations
|
||||
- runtime understands `HOSTCALL`
|
||||
- loader resolves `SYSC` entries before verification
|
||||
- loader validates `arg_slots` and `ret_slots`
|
||||
- loader validates capabilities against cartridge flags
|
||||
- loader rewrites `HOSTCALL` to `SYSCALL`
|
||||
- verifier runs only on patched code
|
||||
- final VM execution path remains numeric-only
|
||||
|
||||
## Tests
|
||||
|
||||
Add tests covering:
|
||||
|
||||
1. valid PBX with empty `SYSC` and no `HOSTCALL`
|
||||
2. valid PBX with one syscall and one `HOSTCALL`
|
||||
3. unknown syscall identity
|
||||
4. capability mismatch
|
||||
5. ABI mismatch
|
||||
6. missing `SYSC`
|
||||
7. duplicate `SYSC` entries
|
||||
8. malformed `SYSC` payload
|
||||
9. `HOSTCALL` index out of bounds
|
||||
10. unused `SYSC` entry
|
||||
11. patched image contains only `SYSCALL`
|
||||
|
||||
Prefer synthetic in-memory PBX images in tests.
|
||||
|
||||
## Definition of Done
|
||||
|
||||
After this PR:
|
||||
|
||||
- PBX declares canonical host bindings in `SYSC`
|
||||
- pre-load code references those bindings with `HOSTCALL`
|
||||
- the loader resolves and validates them during load
|
||||
- the loader patches executable code to `SYSCALL <id>`
|
||||
- the verifier and VM continue to operate on numeric syscalls only
|
||||
Loading…
x
Reference in New Issue
Block a user