# Host ABI Binding and Loader Resolution Specification Status: Draft v1 (Temporary) Applies to: PBX host-binding metadata, cartridge load, canonical syscall resolution, and loader-side capability gating ## 1. Purpose This document defines the current contract between: - the PBS compiler/toolchain, - the PBX program artifact, - the cartridge loader, - and the host syscall registry. Its purpose is to let compiler and runtime work in parallel without diverging on the most critical boundary: how host-backed SDK calls become load-time-resolved numeric syscalls. This document is intentionally narrower than a full PBX or VM specification. It focuses on loader behavior and the minimum artifact contract the loader must consume. ## 2. Scope This document defines: - the canonical identity of a host binding, - the relationship between `declare host` and canonical host bindings, - the minimum PBX metadata required for load-time host resolution, - the pre-load opcode form used for host-backed call sites, - the required loader algorithm, - the required loader patching step from pre-load host callsites to numeric syscalls, - the division of responsibility between loader-side ABI checks and verifier-side stack checks, - capability checks required during load, - deterministic loader failures for malformed or unauthorized host usage, - the requirement that VM execution remains numeric-only. This document does not define: - the full PBX module format, - the exact in-memory patching strategy used by the loader, - the publication format of stdlib SDK packs, - the semantics of individual subsystems such as GFX or AUDIO. ## 3. Normative Inputs This document follows two prior decisions: 1. Canonical host identity is defined by the hardware ABI as `(module, name, version)`. 2. User code does not declare host bindings directly; SDK/toolchain-controlled modules expose reserved `declare host` surfaces and the compiler lowers those usages into PBX host-binding metadata. This document therefore treats: - `declare host Gfx { fn draw_pixel(...); }` as compile-time surface only, and treats: - `("gfx", "draw_pixel", 1)` as the normative runtime-facing identity. ## 4. Core Model The host-binding pipeline is: 1. User code imports SDK surfaces such as `Gfx` from `@sdk:gfx`. 2. The compiler resolves those surfaces against the selected stdlib line. 3. The compiler maps each host-backed SDK member to a canonical identity `(module, name, version)`. 4. The compiler emits those required canonical identities into the PBX `SYSC` table. 5. The compiler emits pre-load host call sites as `HOSTCALL `. 6. At cartridge load time, the loader resolves `SYSC` entries against the host syscall registry. 7. The loader verifies ABI compatibility and capabilities before execution starts. 8. The loader rewrites every `HOSTCALL ` into `SYSCALL `. 9. The VM executes only numeric syscall identifiers. The loader never resolves host bindings from source-level names such as `Gfx.draw_pixel`. It resolves only canonical identities declared in the PBX. ## 5. Canonical Host Binding Identity Every host binding is identified by the triple: ```text (module, name, version) ``` Rules: - `module` is the canonical host subsystem namespace, such as `gfx` or `audio`. - `name` is the canonical host operation name within that module. - `version` is the canonical ABI version of that host operation. - This identity is stable across source languages and SDK naming styles. - This identity is the only identity used by the loader for host resolution. Source-level owners such as `Gfx` are SDK-facing aliases. They are not loader-facing identities. ## 6. Relationship to `declare host` `declare host` is a reserved SDK/toolchain surface used to expose host-backed APIs to source programs. Rules: - User-authored project modules do not declare canonical host bindings directly. - `declare host` exists to define compile-time interface shape, not executable PBS bodies. - The compiler is responsible for mapping each host-backed member to its canonical host identity. - The PBX must not depend on source spellings such as `Gfx` for runtime resolution. Example: ```pbs import { Gfx, color } from @sdk:gfx; let c: color = color.red; Gfx.draw_pixel(1, 1, c); ``` The source-level call may originate from an SDK declaration such as: ```pbs declare host Gfx { [Host(module = "gfx", name = "draw_pixel", version = 1)] [Capability(name = "gfx")] fn draw_pixel(x: int, y: int, c: color); } ``` but the PBX-facing declaration is canonical, for example: ```text ("gfx", "draw_pixel", 1) ``` ## 7. PBX Host Binding Section ### 7.1 Temporary section contract Until the full PBX host-binding format is stabilized elsewhere, the loader contract uses a dedicated PBX section for declared host bindings. Temporary section identifier: - `SYSC` Meaning: - the list of canonical host bindings required by the program. This section is metadata only. It does not define the executable opcode stream itself. ### 7.2 Temporary binary layout The current temporary layout is: ```text u32 count repeat count times: u16 module_len [module_len bytes UTF-8] u16 name_len [name_len bytes UTF-8] u16 version u16 arg_slots u16 ret_slots ``` Rules: - strings are UTF-8, - `module_len` and `name_len` are byte lengths, - `arg_slots` is the slot count the compiler expects the host binding to consume, - `ret_slots` is the slot count the compiler expects the host binding to produce, - duplicate `(module, name, version)` entries are invalid, - malformed lengths or invalid UTF-8 are load errors. The `SYSC` table is: - unique per program, - deduplicated by canonical identity, - ordered by first occurrence in the compiled program. The loader MUST preserve the table order as emitted. ### 7.3 Presence requirements The `SYSC` section is mandatory for every valid PBX artifact. Rules: - every valid `program.pbx` MUST contain a `SYSC` section, - the loader MUST inspect `SYSC` before execution begins, - an empty `SYSC` section is valid and means the program requires no host bindings, - absence of `SYSC` is a deterministic load error, - malformed or duplicate `SYSC` declarations are deterministic load errors. Rationale: - the Prometeu runtime is always host-aware, - PBX structural uniformity is preferred over optional host-binding metadata, - loader simplicity and deterministic testability take priority over omitting an empty section. ## 8. Loader Resolution Algorithm The loader must perform host-binding resolution before VM execution starts. Required algorithm: 1. Read the PBX header and section table. 2. Locate and parse the `SYSC` section before launching the VM. 3. Decode the declared canonical identities in source order. 4. Reject duplicate identities. 5. Resolve each identity against the host syscall registry. 6. For each resolved entry, verify that declared `arg_slots` and `ret_slots` match the authoritative host metadata. 7. Verify that the granted capability set covers every resolved binding. 8. Scan the executable bytecode for every `HOSTCALL ` instruction. 9. Reject any `HOSTCALL` whose `sysc_index` is out of bounds. 10. Reject any `SYSC` entry that is never referenced by a `HOSTCALL`. 11. Rewrite every `HOSTCALL ` into `SYSCALL `. 12. Reject the loaded image if any `HOSTCALL` remains after the rewrite pass. 13. Produce the final executable program image. 14. Abort load deterministically if any step fails. The loader must not defer canonical-name resolution until the first call site executes. ## 9. Host Registry Resolution The host maintains a canonical syscall registry: ```text (module, name, version) -> syscall_id + metadata ``` That metadata includes, at minimum: - numeric syscall identifier, - canonical identity, - argument slot count, - return slot count, - required capabilities. Rules: - the loader resolves only against this canonical registry, - unknown identities are load errors, - successful resolution yields numeric syscall identifiers and associated metadata, - the resolved metadata is authoritative for capability gating and later runtime verification. ## 10. Pre-Load Callsite Form The compiler emits host-backed call sites in pre-load form as: ```text HOSTCALL ``` Rules: - `sysc_index` is a zero-based index into the program's `SYSC` table, - every host-backed call site in PBX code must use `HOSTCALL`, not `SYSCALL`, - `HOSTCALL` exists only in the pre-load artifact form, - `HOSTCALL` must not remain in the final executable image passed to the VM. Rationale: - the compiler does not know final numeric syscall identifiers, - the loader resolves by canonical identity and materializes the numeric form, - the verifier and VM can stay numeric-only after the patch step. ## 11. Loader-Side ABI Validation Load-time ABI validation is mandatory. Rules: - every `SYSC` entry declares `arg_slots` and `ret_slots`, - the loader resolves the entry to authoritative host metadata, - the loader MUST reject the image if declared `arg_slots` differs from host metadata, - the loader MUST reject the image if declared `ret_slots` differs from host metadata. This validation is not a replacement for bytecode verification. It exists to detect mismatches between: - the compiler's emitted host-binding contract, - and the host registry contract actually present in the runtime. ## 12. Capability Gating Capability gating is mandatory during load. Rules: - every resolved host binding declares required capabilities, - frontend/build host-admission computes deterministic `requiredCapabilities` from host binding metadata for packer assistance, - the cartridge manifest declares requested capabilities using a nominal capability list, - the loader derives or receives the granted capability set from the cartridge/platform policy environment, - if a required capability is not granted, load fails, - the loader must fail before the program begins execution. The compiler may infer which capabilities are required by the emitted `SYSC` table, but it does not grant authority. Authority belongs to the cartridge/platform loading environment. Runtime capability checks may still exist defensively in the VM, but they do not replace mandatory load-time validation. ## 13. Verifier Interaction The loader and the verifier validate different layers of the contract. Loader responsibilities: - validate canonical host identity resolution, - validate `SYSC`-declared ABI shape against host metadata, - validate capabilities, - patch `HOSTCALL ` into `SYSCALL `. Verifier responsibilities: - run on the already-patched executable image, - validate stack safety and control-flow safety, - treat `SYSCALL ` exactly like any other final numeric instruction, - derive stack effects from authoritative runtime syscall metadata. This is intentional and not considered harmful duplication: - the loader checks interface compatibility, - the verifier checks structural correctness of the final program. ## 14. Execution Model After Load After successful load: - the VM executes numeric syscall identifiers only, - runtime dispatch must not depend on canonical strings, - canonical names exist only in PBX metadata and loader-side resolution, - the runtime-facing execution form remains `SYSCALL `. This preserves the hardware contract: - canonical identity is used for declaration and linking, - numeric identity is used for execution. ## 15. Deterministic Loader Errors The loader must fail deterministically for at least the following cases: 1. missing `SYSC` section, 2. malformed `SYSC` payload, 3. invalid UTF-8 in canonical names, 4. duplicate `(module, name, version)` declarations, 5. unknown canonical identity, 6. ABI mismatch between `SYSC` declaration and host registry metadata, 7. capability mismatch, 8. `HOSTCALL` references an out-of-bounds `sysc_index`, 9. a declared `SYSC` entry is unused by all `HOSTCALL` instructions, 10. a `HOSTCALL` remains in the executable image after loader patching, 11. any internal inconsistency between parsed host-binding metadata and the loader's host registry contract. Diagnostics should identify the failing canonical identity whenever available. ## 16. Relationship to `stdlib` `stdlib` selection is a compile-time concern that determines which SDK surfaces and host-backed declarations the compiler may use. Rules: - the loader does not resolve source imports such as `@sdk:gfx`, - the loader only consumes canonical host-binding metadata emitted into the PBX, - `stdlib` affects the loader indirectly through what the compiler emitted, - the loader must not attempt to reconstruct SDK/module semantics from the PBX. In other words: - `stdlib` selects the compile-time SDK contract, - `SYSC` carries the runtime-facing host-binding contract. ## 17. Relationship to Cartridge Manifest Capabilities The cartridge manifest is the current declarative source for requested runtime capabilities. Rules: - cartridge capability declarations are external to `program.pbx`, - capability declarations are serializable nominal identifiers such as `gfx` or `audio`, - the loader converts that nominal declaration into the internal capability flag set used by the runtime, - the loader MUST NOT infer granted authority solely from the presence of `SYSC` entries. This keeps authority separate from code generation: - PBX declares what the program needs, - cartridge packaging declares what the cartridge requests, - the loader/runtime environment decides what is granted. ## 18. Current Implementation Direction This document aligns with the current Prometeu direction: - SDK/toolchain-controlled `declare host` surfaces define host-backed APIs, - the compiler emits canonical host-binding declarations into PBX `SYSC`, - the compiler emits `HOSTCALL ` in pre-load code, - the loader resolves those declarations at cartridge load, - the loader patch-rewrites host calls to numeric syscalls, - the VM continues to execute numeric syscalls only. ## 19. Open Items Deferred The following remain deferred: 1. final PBX section numbering and chunk registry policy, 2. exact binary opcode allocation for `HOSTCALL`, 3. whether the loader patches bytecode in place or rebuilds a fresh executable image buffer, 4. final cartridge policy model when platform-level grants differ from manifest requests, 5. final chunk numbering and binary compatibility policy for PBX versions that predate mandatory `SYSC`, 6. final integration shape with `ProgramImage` or another loaded-program container. ## 20. Current Decision Summary The current temporary contract is: 1. Host bindings are identified canonically by `(module, name, version)`. 2. `declare host` is reserved to SDK/toolchain modules and is compile-time surface only. 3. The compiler emits a unique, deduplicated `SYSC` table ordered by first occurrence. 4. Each `SYSC` entry carries `module`, `name`, `version`, `arg_slots`, and `ret_slots`. 5. The compiler emits host-backed call sites as `HOSTCALL `. 6. The loader parses `SYSC` before execution. 7. The loader resolves canonical identities against the host registry during load. 8. The loader validates `arg_slots` and `ret_slots` against authoritative host metadata. 9. The loader enforces capability gating during load. 10. The loader rejects out-of-bounds and unused `SYSC` references. 11. The loader rewrites every `HOSTCALL ` into `SYSCALL `. 12. The verifier runs after patching and validates the final numeric program. 13. The VM executes numeric syscall identifiers only.