# PBS Bytecode and PBX Mapping Specification

Status: Draft v1 (Backend Baseline)

Applies to: mapping from lowered/optimized backend programs into PBX sections, bytecode-facing artifacts, and source-to-artifact invariants required by loader/verifier/runtime

## 1. Purpose

This document defines the normative mapping between backend-lowered semantics and emitted PBX/bytecode-facing artifacts.

Its purpose is to keep artifact emission deterministic and compatible with runtime loader/verifier contracts.

## 2. Scope

This document defines:

- artifact-level obligations at `IRVM -> BytecodeModule` emission boundary,
- mapping invariants for function layout, code layout, callsite forms, and host-binding declarations,
- minimum debug/source-attribution hooks for v1 backend/runtime diagnostics and conformance,
- and deterministic artifact rejection expectations for emitter-side failures.

This document does not define:

- full ISA semantics,
- runtime loader patching internals,
- or one mandatory emitter implementation architecture.

## 3. Authority and Precedence

Normative precedence:

1. Runtime authority (`docs/specs/hardware/topics/chapter-2.md`, `chapter-3.md`, `chapter-9.md`, `chapter-12.md`, `chapter-16.md`)
2. Bytecode authority (`docs/specs/bytecode/ISA_CORE.md`)
3. `docs/specs/compiler-languages/pbs/6.1. Intrinsics and Builtin Types Specification.md`
4. `docs/specs/compiler-languages/pbs/6.2. Host ABI Binding and Loader Resolution Specification.md`
5. `20. IRBackend to IRVM Lowering Specification.md`
6. `21. IRVM Optimization Pipeline Specification.md`
7. This document

If a rule here conflicts with higher-precedence authorities, it is invalid.

## 4. Normative Inputs

This document depends on, at minimum:

- `docs/specs/compiler-languages/pbs/6.1. Intrinsics and Builtin Types Specification.md`
- `docs/specs/compiler-languages/pbs/6.2. Host ABI Binding and Loader Resolution Specification.md`
- `20. IRBackend to IRVM Lowering Specification.md`
- `21. IRVM Optimization Pipeline Specification.md`

## 5. Already-Settled Inputs

The following are fixed and must not be contradicted:

- The compiler emits host-binding declarations in PBX `SYSC`.
- Host-backed callsites are emitted in pre-load form as `HOSTCALL `.
- `SYSC` entries are deduplicated by canonical identity and ordered by first occurrence.
- The loader resolves host bindings and rewrites `HOSTCALL` to `SYSCALL` before execution.
- Raw `SYSCALL` in pre-load artifacts is rejected.
- VM-owned intrinsic artifacts are distinct from `SYSC`, `HOSTCALL`, and `SYSCALL`.
- `SYSC` section is mandatory in valid PBX artifacts (empty section is valid).

## 6. Artifact Mapping Contract (v1)

### 6.1 Required module surfaces

Emitter output must map to a `BytecodeModule` shape containing:

1. `const_pool`,
2. `functions`,
3. `code`,
4. `exports`,
5. `syscalls`,
6. optional `debug_info` that still satisfies v1 minimum debug obligations.

### 6.2 Function ordering and IDs

Function ordering must be deterministic:

1. the published wrapper function index is `0`,
2. function index `0` is owned by the compiler-selected physical wrapper rather than by manifest metadata or nominal export lookup,
3. remaining functions are ordered by `(moduleId -> modulePool canonical key, callable_name, source_start)`,
4. identical admitted input graph yields identical function ordering and function ids.

For PBS executable publication:

- the userland callable marked with `[Frame]` is not itself the physical entrypoint unless it is wrapped by the published synthetic wrapper,
- final `FRAME_RET` belongs to the wrapper path.

### 6.3 Function code layout

Emitter must satisfy:

1. `code_offset` values are unique and monotonic over function order,
2. `code_len` exactly matches emitted bytes for each function body,
3. `code_offset + code_len` stays within `code.len`,
4. and final code concatenation is deterministic.

### 6.4 Instruction encoding

Emitter must satisfy:

1. little-endian encoding,
2. instruction layout `[opcode: u16][immediate]`,
3. jump immediates as `u32` offsets relative to function start,
4. immediate sizes matching selected Core ISA opcode spec.

### 6.5 Host-backed mapping obligations

For host-backed operations:

1. emit canonical declarations in `SYSC` (`module`, `name`, `version`, `arg_slots`, `ret_slots`),
2. deduplicate by canonical identity,
3. order by first occurrence,
4. emit callsites as `HOSTCALL ` only,
5. do not emit raw `SYSCALL` in pre-load artifact form.

### 6.6 VM-owned intrinsic mapping obligations

For VM-owned intrinsic operations:

1. emit VM-owned intrinsic call form (`INTRINSIC `),
2. resolve `` from the canonical ISA-scoped intrinsic registry artifact,
3. keep intrinsic path distinct from host-binding metadata and host call opcodes,
4. and do not emit VM-owned builtin/intrinsic semantics through `SYSC`.

### 6.7 Internal symbolic-to-index mapping

Compilers may use internal symbolic references before final index materialization.

If used, symbolic references must be resolved deterministically to final numeric indices before serialization.

## 7. Minimum Debug Attribution Contract (v1)

For v1 backend/runtime diagnostics and conformance support, emitted artifacts must preserve at minimum:

1. `function_names` entries for all emitted function indices,
2. `pc_to_span` entries for each emitted instruction start PC.

This minimum does not require one universal source-map format.

## 8. Deterministic Emitter Rejection

Emitter-side rejection must be deterministic for malformed or inconsistent artifact candidates, including at minimum:

1. inconsistent function layout bounds,
2. unresolved symbolic references at serialization boundary,
3. illegal pre-load host call form (`SYSCALL` in pre-load image),
4. duplicate `SYSC` canonical identities,
5. and declared host ABI shape mismatch detectable at compile target metadata line.

## 9. Conformance-Facing Baseline

At minimum, artifact conformance checks should assert:

1. canonical `SYSC` declarations for admitted host-backed operations,
2. deterministic `SYSC` dedup/order,
3. pre-load `HOSTCALL` callsites for host-backed paths,
4. no host-binding leakage for VM-owned intrinsic/builtin operations,
5. minimum debug attribution hooks required by v1,
6. deterministic function ordering and code layout invariants.

## 10. Explicit Deferrals

The following remain deferred:

- richer optional debug/source-map formats,
- additional PBX section-level contracts beyond current baseline,
- and profile-specific binary compatibility policy details beyond current v1 baseline.

## 11. Non-Goals

- Repeating full ISA/runtime documentation.
- Mandating one byte-for-byte whole-image golden as sole conformance oracle.
- Defining loader patching internals already owned elsewhere.

## 12. Exit Criteria

This document is healthy when:

1. artifact mapping obligations are explicit and testable,
2. host-backed and VM-owned emission boundaries are explicit,
3. deterministic ordering/layout rules are explicit,
4. and v1 minimum debug/source attribution contract is explicit.
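As an informal illustration of the function ordering rule in section 6.2, the following Python sketch places the wrapper at index `0` and sorts the remaining functions by the three-part key. The `FunctionDecl` record, its field names, and the use of a plain string for the module canonical key are assumptions made for this example, not definitions from this specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionDecl:
    # Hypothetical record; field names are illustrative only.
    module_key: str      # assumed string form of the modulePool canonical key
    callable_name: str
    source_start: int


def order_functions(wrapper, others):
    """Return functions in emission order: wrapper first, then sorted by key."""
    ordered = sorted(
        others,
        key=lambda f: (f.module_key, f.callable_name, f.source_start),
    )
    # Function ids are the positions in this list, so the wrapper gets id 0.
    return [wrapper] + ordered
```

Because the sort key is total over the three fields and does not depend on input order, permuting `others` yields the same result, which is the determinism property item 4 of section 6.2 asks for.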
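The code layout rules in section 6.3 can be sketched as a concatenation step followed by a bounds check. This is a minimal sketch, assuming that function bodies are already encoded as non-empty byte strings; `FunctionMeta`, `layout_functions`, and `check_layout` are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionMeta:
    code_offset: int
    code_len: int


def layout_functions(bodies):
    """Concatenate bodies in function order and record offset and length."""
    code = bytearray()
    metas = []
    for body in bodies:
        # code_len is taken from the emitted bytes, so it matches by construction.
        metas.append(FunctionMeta(code_offset=len(code), code_len=len(body)))
        code += body
    return metas, bytes(code)


def check_layout(metas, code):
    """Raise ValueError when offsets are not strictly increasing or exceed code."""
    previous = -1
    for index, meta in enumerate(metas):
        if meta.code_offset <= previous:
            raise ValueError(f"function {index}: code_offset not unique/monotonic")
        if meta.code_offset + meta.code_len > len(code):
            raise ValueError(f"function {index}: body exceeds code.len")
        previous = meta.code_offset
```

A failing `check_layout` corresponds to the "inconsistent function layout bounds" rejection case listed in section 8. Note that under the assumption above an empty body would produce a repeated offset and be rejected.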
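The encoding rules in section 6.4 can be illustrated with Python's `struct` module. The opcode values used in the usage note are hypothetical placeholders and are not taken from `ISA_CORE.md`; the helper names are likewise illustrative.

```python
import struct


def encode_instruction(opcode, immediate=b""):
    """Encode `[opcode: u16][immediate]` with a little-endian opcode."""
    return struct.pack("<H", opcode) + immediate


def encode_jump(opcode, target_offset):
    """Encode a jump whose u32 immediate is an offset from function start."""
    # "<I" is an unsigned 32-bit little-endian integer.
    return encode_instruction(opcode, struct.pack("<I", target_offset))
```

For example, with a made-up jump opcode `0x0002` and a target 16 bytes into the function, `encode_jump(0x0002, 16)` produces the six bytes `02 00 10 00 00 00`. Whether a given opcode carries an immediate, and of what size, is decided by the Core ISA opcode spec, not by this sketch.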
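The `SYSC` deduplication and first-occurrence ordering from section 6.5 can be sketched as below. Two points are assumptions of this example rather than statements of this specification: the canonical identity is taken to be `(module, name, version)`, and each callsite is taken to refer to its declaration by position in the resulting table. `HostBinding` and `build_sysc_table` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostBinding:
    # Field names follow the declaration fields listed in section 6.5.
    module: str
    name: str
    version: int  # assumed integer; the spec text does not fix the type
    arg_slots: int
    ret_slots: int


def build_sysc_table(callsites):
    """Deduplicate bindings, keep first-occurrence order, map callsites to slots."""
    table = []
    position = {}  # canonical identity -> index in table
    indices = []
    for binding in callsites:
        # Assumption: identity is (module, name, version).
        identity = (binding.module, binding.name, binding.version)
        if identity not in position:
            position[identity] = len(table)
            table.append(binding)
        elif table[position[identity]] != binding:
            # Same identity, different slot shape: reject deterministically.
            raise ValueError(f"host ABI shape mismatch for {identity}")
        indices.append(position[identity])
    return table, indices
```

The table never contains two entries with the same identity, and its order depends only on the order in which callsites are visited, so a deterministic traversal of the program yields a deterministic `SYSC` section.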