# PBS Bytecode and PBX Mapping Specification

Status: Draft v1 (Backend Baseline)

Applies to: mapping from lowered/optimized backend programs into PBX sections, bytecode-facing artifacts, and source-to-artifact invariants required by loader/verifier/runtime

## 1. Purpose

This document defines the normative mapping between backend-lowered semantics and emitted PBX/bytecode-facing artifacts.

Its purpose is to keep artifact emission deterministic and compatible with runtime loader/verifier contracts.

## 2. Scope

This document defines:

- artifact-level obligations at the `IRVM -> BytecodeModule` emission boundary,
- mapping invariants for function layout, code layout, callsite forms, and host-binding declarations,
- minimum debug/source-attribution hooks for v1 backend/runtime diagnostics and conformance,
- and deterministic artifact rejection expectations for emitter-side failures.

This document does not define:

- full ISA semantics,
- runtime loader patching internals,
- or one mandatory emitter implementation architecture.

## 3. Authority and Precedence

Normative precedence:

1. Runtime authority (`docs/specs/hardware/topics/chapter-2.md`, `chapter-3.md`, `chapter-9.md`, `chapter-12.md`, `chapter-16.md`)
2. Bytecode authority (`docs/specs/bytecode/ISA_CORE.md`)
3. `docs/pbs/specs/6.1. Intrinsics and Builtin Types Specification.md`
4. `docs/pbs/specs/6.2. Host ABI Binding and Loader Resolution Specification.md`
5. `20. IRBackend to IRVM Lowering Specification.md`
6. `21. IRVM Optimization Pipeline Specification.md`
7. This document

If a rule in this document conflicts with a higher-precedence authority, the rule in this document is invalid.

## 4. Normative Inputs

This document depends on, at minimum:

- `docs/pbs/specs/6.1. Intrinsics and Builtin Types Specification.md`
- `docs/pbs/specs/6.2. Host ABI Binding and Loader Resolution Specification.md`
- `20. IRBackend to IRVM Lowering Specification.md`
- `21. IRVM Optimization Pipeline Specification.md`

## 5. Already-Settled Inputs

The following are fixed and must not be contradicted:

- The compiler emits host-binding declarations in PBX `SYSC`.
- Host-backed callsites are emitted in pre-load form as `HOSTCALL <sysc_index>`.
- `SYSC` entries are deduplicated by canonical identity and ordered by first occurrence.
- The loader resolves host bindings and rewrites `HOSTCALL` to `SYSCALL` before execution.
- Raw `SYSCALL` in pre-load artifacts is rejected.
- VM-owned intrinsic artifacts are distinct from `SYSC`, `HOSTCALL`, and `SYSCALL`.
- The `SYSC` section is mandatory in valid PBX artifacts (an empty section is valid).

## 6. Artifact Mapping Contract (v1)

### 6.1 Required module surfaces

Emitter output must map to a `BytecodeModule` shape containing:

1. `const_pool`,
2. `functions`,
3. `code`,
4. `exports`,
5. `syscalls`,
6. optional `debug_info` that still satisfies v1 minimum debug obligations.
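
The module shape listed above can be pictured with a minimal sketch. Only the six top-level field names come from this document; the helper types `FunctionMeta`, `SyscallDecl`, and `DebugInfo`, and every field type, are assumptions made for illustration and are not a normative layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class FunctionMeta:
    # Hypothetical per-function record; see 6.3 for the layout rules.
    code_offset: int
    code_len: int


@dataclass
class SyscallDecl:
    # Field names follow the SYSC declaration tuple named in 6.5.
    module: str
    name: str
    version: int
    arg_slots: int
    ret_slots: int


@dataclass
class DebugInfo:
    # Names follow section 7; the value types are assumptions.
    function_names: Dict[int, str] = field(default_factory=dict)
    pc_to_span: Dict[int, Tuple[int, int]] = field(default_factory=dict)


@dataclass
class BytecodeModule:
    const_pool: List[object] = field(default_factory=list)
    functions: List[FunctionMeta] = field(default_factory=list)
    code: bytes = b""
    exports: Dict[str, int] = field(default_factory=dict)  # assumed: name -> function index
    syscalls: List[SyscallDecl] = field(default_factory=list)  # may be empty, never absent
    debug_info: Optional[DebugInfo] = None
```

An empty `syscalls` list in this sketch corresponds to the valid empty `SYSC` section.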

### 6.2 Function ordering and IDs

Function ordering must be deterministic:

1. the entrypoint function index is `0`,
2. the function at index `0` is selected by the qualified `EntrypointRef(entryPointModuleId, entryPointCallableName)`,
3. remaining functions are ordered by `(moduleId -> modulePool canonical key, callable_name, source_start)`,
4. an identical admitted input graph yields identical function ordering and function ids.
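
As a rough illustration of these ordering rules, the sketch below places the entrypoint first and sorts the rest by the three-part key. The `CallableInfo` record, the string-typed module key, and the function name `order_functions` are hypothetical; the spec only fixes the ordering itself.

```python
from typing import List, NamedTuple


class CallableInfo(NamedTuple):
    module_key: str      # assumed: canonical module pool key, comparable as a string
    callable_name: str
    source_start: int


def order_functions(callables: List[CallableInfo],
                    entry_module_key: str,
                    entry_name: str) -> List[CallableInfo]:
    """Return callables in emission order; the list index is the function id."""
    entry = [c for c in callables
             if (c.module_key, c.callable_name) == (entry_module_key, entry_name)]
    if len(entry) != 1:
        raise ValueError("entrypoint must resolve to exactly one callable")
    rest = sorted(
        (c for c in callables if c != entry[0]),
        key=lambda c: (c.module_key, c.callable_name, c.source_start),
    )
    return entry + rest
```

Because the result depends only on the sort key and the entrypoint reference, feeding the same set of callables in any input order gives the same function ids.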

### 6.3 Function code layout

Emitter must satisfy:

1. `code_offset` values are unique and monotonically increasing in function order,
2. `code_len` exactly matches emitted bytes for each function body,
3. `code_offset + code_len` stays within `code.len`,
4. and final code concatenation is deterministic.
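
One way to satisfy these layout rules is to derive every offset from a single concatenation pass and then re-check the invariants. This is a sketch under the assumption that function bodies are already encoded as non-empty byte strings in function order; `layout_code` and `check_layout` are illustrative names.

```python
from typing import List, Sequence, Tuple


def layout_code(bodies: Sequence[bytes]) -> Tuple[bytes, List[Tuple[int, int]]]:
    """Concatenate bodies in function order; return (code, [(code_offset, code_len)])."""
    code = bytearray()
    table = []
    for body in bodies:
        table.append((len(code), len(body)))  # offset is the running length so far
        code += body
    return bytes(code), table


def check_layout(code: bytes, table: Sequence[Tuple[int, int]]) -> None:
    """Raise ValueError if offsets are not strictly increasing or a body leaves the code blob."""
    prev_offset = -1
    for index, (offset, length) in enumerate(table):
        if offset <= prev_offset:
            raise ValueError(f"function {index}: code_offset not strictly increasing")
        if offset + length > len(code):
            raise ValueError(f"function {index}: code_offset + code_len exceeds code length")
        prev_offset = offset
```

Deriving offsets from the running length means `code_len` cannot drift from the bytes actually emitted.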

### 6.4 Instruction encoding

Emitter must satisfy:

1. little-endian encoding,
2. instruction layout `[opcode: u16][immediate]`,
3. jump immediates as `u32` offsets relative to function start,
4. immediate sizes matching the selected Core ISA opcode spec.
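
The byte layout described here can be sketched with Python's `struct` module. The opcode value `0x0002` used in the usage lines is a placeholder, not a real Core ISA opcode, and the helper names are hypothetical.

```python
import struct


def encode_instruction(opcode: int, immediate: bytes = b"") -> bytes:
    """Encode [opcode: u16][immediate] with a little-endian opcode."""
    return struct.pack("<H", opcode) + immediate


def encode_jump(opcode: int, target_offset: int) -> bytes:
    """Encode a jump whose immediate is a u32 offset relative to function start."""
    return struct.pack("<HI", opcode, target_offset)


# Usage with a placeholder opcode value (assumption, not taken from the ISA):
jump_bytes = encode_jump(0x0002, 0x10)
print(jump_bytes.hex())  # 020010000000
```

The `<` prefix selects little-endian byte order with no padding, so the jump form is always six bytes.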

### 6.5 Host-backed mapping obligations

For host-backed operations:

1. emit canonical declarations in `SYSC` (`module`, `name`, `version`, `arg_slots`, `ret_slots`),
2. deduplicate by canonical identity,
3. order by first occurrence,
4. emit callsites as `HOSTCALL <sysc_index>` only,
5. do not emit raw `SYSCALL` in pre-load artifact form.
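
A minimal sketch of first-occurrence deduplication follows. It assumes the canonical identity is the triple `(module, name, version)`; this document does not define the identity, so treat that choice, the `SyscTable` class, and the example binding names as hypothetical.

```python
from typing import Dict, List, Tuple

Identity = Tuple[str, str, int]          # assumed canonical identity
Decl = Tuple[str, str, int, int, int]    # module, name, version, arg_slots, ret_slots


class SyscTable:
    def __init__(self) -> None:
        self._index: Dict[Identity, int] = {}
        self.entries: List[Decl] = []    # kept in first-occurrence order

    def intern(self, module: str, name: str, version: int,
               arg_slots: int, ret_slots: int) -> int:
        """Return the SYSC index for a binding, appending it on first use."""
        key = (module, name, version)
        decl = (module, name, version, arg_slots, ret_slots)
        if key in self._index:
            index = self._index[key]
            if self.entries[index] != decl:
                raise ValueError(f"ABI shape mismatch for {key}")
            return index
        self._index[key] = len(self.entries)
        self.entries.append(decl)
        return self._index[key]

    def emit_callsite(self, module: str, name: str, version: int,
                      arg_slots: int, ret_slots: int) -> Tuple[str, int]:
        """Emit the pre-load callsite form; never a raw SYSCALL."""
        return ("HOSTCALL", self.intern(module, name, version, arg_slots, ret_slots))
```

Indices are assigned only when a binding is first seen, so the table order is a pure function of callsite order.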

### 6.6 VM-owned intrinsic mapping obligations

For VM-owned intrinsic operations:

1. emit the VM-owned intrinsic call form (`INTRINSIC <id>`),
2. resolve `<id>` from the canonical ISA-scoped intrinsic registry artifact,
3. keep the intrinsic path distinct from host-binding metadata and host call opcodes,
4. and do not emit VM-owned builtin/intrinsic semantics through `SYSC`.
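
The separation between the two call paths can be sketched as a dispatch that never lets an intrinsic touch host-binding state. The registry contents, the `kind` tags, and the `sysc_index_of` callback are all assumptions for illustration; the real registry is the ISA-scoped artifact referenced above.

```python
from typing import Callable, Dict, Tuple


def emit_call(kind: str,
              name: str,
              intrinsic_registry: Dict[str, int],
              sysc_index_of: Callable[[str], int]) -> Tuple[str, int]:
    """Return the callsite form for a lowered call (sketch)."""
    if kind == "intrinsic":
        if name not in intrinsic_registry:
            raise ValueError(f"unknown intrinsic: {name}")
        # Intrinsic ids come only from the registry; SYSC is never consulted.
        return ("INTRINSIC", intrinsic_registry[name])
    if kind == "host":
        return ("HOSTCALL", sysc_index_of(name))
    raise ValueError(f"unknown call kind: {kind}")
```

Keeping the intrinsic branch free of any `SYSC` lookup makes the "no host-binding leakage" property easy to check in tests.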

### 6.7 Internal symbolic-to-index mapping

Compilers may use internal symbolic references before final index materialization.

If used, symbolic references must be resolved deterministically to final numeric indices before serialization.
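
A resolution pass of this kind might look like the sketch below. It assumes instructions are `(mnemonic, operand)` pairs in which a string operand denotes a symbolic reference; that representation, the mnemonics in the usage lines, and the name `resolve_symbols` are hypothetical.

```python
from typing import Dict, List, Tuple, Union

Operand = Union[int, str]


def resolve_symbols(instrs: List[Tuple[str, Operand]],
                    symbol_table: Dict[str, int]) -> List[Tuple[str, int]]:
    """Replace every symbolic operand with its final numeric index."""
    resolved = []
    for mnemonic, operand in instrs:
        if isinstance(operand, str):
            if operand not in symbol_table:
                # Unresolved references must not reach serialization.
                raise ValueError(f"unresolved symbolic reference: {operand}")
            operand = symbol_table[operand]
        resolved.append((mnemonic, operand))
    return resolved


# Usage with placeholder mnemonics and symbol names:
print(resolve_symbols([("CALL", "app::main"), ("PUSH", 3)], {"app::main": 0}))
# [('CALL', 0), ('PUSH', 3)]
```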

## 7. Minimum Debug Attribution Contract (v1)

For v1 backend/runtime diagnostics and conformance support, emitted artifacts must preserve at minimum:

1. `function_names` entries for all emitted function indices,
2. `pc_to_span` entries for each emitted instruction start PC.

This minimum does not require one universal source-map format.
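
A coverage check for the two minimum hooks could be sketched as follows. The dictionary shapes for `function_names` and `pc_to_span` and the helper name `check_min_debug` are assumptions; the spec deliberately leaves the concrete format open.

```python
from typing import Dict, Iterable, List, Tuple


def check_min_debug(function_count: int,
                    instruction_pcs: Iterable[int],
                    function_names: Dict[int, str],
                    pc_to_span: Dict[int, Tuple[int, int]]) -> Tuple[List[int], List[int]]:
    """Return (function indices without a name, instruction PCs without a span)."""
    missing_names = [i for i in range(function_count) if i not in function_names]
    missing_spans = [pc for pc in instruction_pcs if pc not in pc_to_span]
    return missing_names, missing_spans
```

Both returned lists are empty when the artifact meets the v1 minimum.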

## 8. Deterministic Emitter Rejection

Emitter-side rejection must be deterministic for malformed or inconsistent artifact candidates, including at minimum:

1. inconsistent function layout bounds,
2. unresolved symbolic references at serialization boundary,
3. illegal pre-load host call form (`SYSCALL` in pre-load image),
4. duplicate `SYSC` canonical identities,
5. and declared host ABI shape mismatch detectable at compile target metadata line.
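
To make the determinism requirement concrete, the sketch below runs the first four checks in a fixed order and reports diagnostics in input order, so the same candidate always produces the same list. The candidate representation (layout pairs, `(mnemonic, operand)` instructions with string operands marking unresolved symbols, and identity tuples) and the message wording are assumptions, not part of the contract.

```python
from typing import List, Sequence, Tuple, Union


def validate_candidate(code_len: int,
                       functions: Sequence[Tuple[int, int]],
                       instrs: Sequence[Tuple[str, Union[int, str]]],
                       sysc_keys: Sequence[tuple]) -> List[str]:
    """Return diagnostics in a fixed, input-ordered sequence; empty means accepted."""
    errors = []
    for idx, (offset, length) in enumerate(functions):
        if offset < 0 or length < 0 or offset + length > code_len:
            errors.append(f"function {idx}: layout out of bounds")
    for pos, (mnemonic, operand) in enumerate(instrs):
        if isinstance(operand, str):
            errors.append(f"instr {pos}: unresolved symbolic reference {operand!r}")
        if mnemonic == "SYSCALL":
            errors.append(f"instr {pos}: raw SYSCALL in pre-load image")
    seen = set()  # used for membership only, so iteration order never matters
    for idx, key in enumerate(sysc_keys):
        if key in seen:
            errors.append(f"SYSC {idx}: duplicate canonical identity")
        seen.add(key)
    return errors
```

Collecting every diagnostic instead of stopping at the first is a design choice of this sketch; an emitter that rejects on the first failure is equally deterministic as long as the check order is fixed.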

## 9. Conformance-Facing Baseline

At minimum, artifact conformance checks should assert:

1. canonical `SYSC` declarations for admitted host-backed operations,
2. deterministic `SYSC` dedup/order,
3. pre-load `HOSTCALL` callsites for host-backed paths,
4. no host-binding leakage for VM-owned intrinsic/builtin operations,
5. minimum debug attribution hooks required by v1,
6. deterministic function ordering and code layout invariants.

## 10. Explicit Deferrals

The following remain deferred:

- richer optional debug/source-map formats,
- additional PBX section-level contracts beyond the current baseline,
- and profile-specific binary compatibility policy details beyond the current v1 baseline.

## 11. Non-Goals

- Repeating full ISA/runtime documentation.
- Mandating one byte-for-byte whole-image golden as the sole conformance oracle.
- Defining loader patching internals already owned elsewhere.

## 12. Exit Criteria

This document is healthy when:

1. artifact mapping obligations are explicit and testable,
2. host-backed and VM-owned emission boundaries are explicit,
3. deterministic ordering/layout rules are explicit,
4. and the v1 minimum debug/source attribution contract is explicit.