diff --git a/docs/general/specs/13. Conformance Test Specification.md b/docs/general/specs/13. Conformance Test Specification.md index 3d9dd78a..6910c004 100644 --- a/docs/general/specs/13. Conformance Test Specification.md +++ b/docs/general/specs/13. Conformance Test Specification.md @@ -38,7 +38,9 @@ Normative precedence: 4. `4. Static Semantics Specification.md` 5. `12. Diagnostics Specification.md` 6. `13. Lowering IRBackend Specification.md` -7. This document +7. `20. IRBackend to IRVM Lowering Specification.md` +8. `21. IRVM Optimization Pipeline Specification.md` +9. This document This document must not weaken normative requirements from higher-precedence specs. @@ -50,6 +52,8 @@ This document depends on: - `4. Static Semantics Specification.md` - `12. Diagnostics Specification.md` - `13. Lowering IRBackend Specification.md` +- `20. IRBackend to IRVM Lowering Specification.md` +- `21. IRVM Optimization Pipeline Specification.md` - `17. Compatibility and Evolution Policy.md` ## 5. Quality Gate Model @@ -77,7 +81,8 @@ At minimum, those tests must: 1. compile selected fixtures through the executable pipeline for that language, 2. execute them against the runtime line declared for that backend, 3. assert deterministic observable outcomes for supported fixtures, -4. and preserve fixed bugs and published supported examples as regression tests. +4. preserve fixed bugs and published supported examples as regression tests, +5. and cover deterministic behavior across `LowerToVM`, `OptimizeIRVM`, and `EmitBytecode` boundaries when those stages are present. Until executable artifact emission exists for a language, Gate I is explicitly `deferred` for that language and does not block frontend progress. diff --git a/docs/general/specs/15. Bytecode and PBX Mapping Specification.md b/docs/general/specs/15. Bytecode and PBX Mapping Specification.md index a93642c8..de116807 100644 --- a/docs/general/specs/15. Bytecode and PBX Mapping Specification.md +++ b/docs/general/specs/15. Bytecode and PBX Mapping Specification.md @@ -1,27 +1,28 @@ # PBS Bytecode and PBX Mapping Specification -Status: Draft v0 (Skeleton) -Applies to: mapping from lowered PBS programs into PBX sections, bytecode-facing artifacts, and related source-to-artifact invariants +Status: Draft v1 (Backend Baseline) +Applies to: mapping from lowered/optimized backend programs into PBX sections, bytecode-facing artifacts, and source-to-artifact invariants required by loader/verifier/runtime ## 1. Purpose -This document will define the normative mapping between lowered PBS semantics and emitted PBX/bytecode-facing artifacts. +This document defines the normative mapping between backend-lowered semantics and emitted PBX/bytecode-facing artifacts. + +Its purpose is to keep artifact emission deterministic and compatible with runtime loader/verifier contracts. ## 2. Scope -This document is intended to define: +This document defines: -- artifact-level obligations after lowering, -- mapping of semantic constructs into PBX-visible structures, -- source-to-artifact invariants required by verifier and loader stages, -- interaction points with host-binding metadata such as `SYSC`, -- interaction points with VM-owned builtin and intrinsic metadata. +- artifact-level obligations at `IRVM -> BytecodeModule` emission boundary, +- mapping invariants for function layout, code layout, callsite forms, and host-binding declarations, +- minimum debug/source-attribution hooks for v1 backend/runtime diagnostics and conformance, +- and deterministic artifact rejection expectations for emitter-side failures. This document does not define: -- the full bytecode ISA, -- UI/tooling presentation, -- one compiler's internal IR representation. +- full ISA semantics, +- runtime loader patching internals, +- or one mandatory emitter implementation architecture. ## 3. Authority and Precedence @@ -29,10 +30,11 @@ Normative precedence: 1. Runtime authority (`docs/specs/hardware/topics/chapter-2.md`, `chapter-3.md`, `chapter-9.md`, `chapter-12.md`, `chapter-16.md`) 2. Bytecode authority (`docs/specs/bytecode/ISA_CORE.md`) -3. `6.1. Intrinsics and Builtin Types Specification.md` -4. `6.2. Host ABI Binding and Loader Resolution Specification.md` -5. `13. Lowering IRBackend Specification.md` -6. This document +3. `docs/pbs/specs/6.1. Intrinsics and Builtin Types Specification.md` +4. `docs/pbs/specs/6.2. Host ABI Binding and Loader Resolution Specification.md` +5. `20. IRBackend to IRVM Lowering Specification.md` +6. `21. IRVM Optimization Pipeline Specification.md` +7. This document If a rule here conflicts with higher-precedence authorities, it is invalid. @@ -40,75 +42,135 @@ If a rule here conflicts with higher-precedence authorities, it is invalid. This document depends on, at minimum: -- `6.1. Intrinsics and Builtin Types Specification.md` -- `6.2. Host ABI Binding and Loader Resolution Specification.md` -- `13. Lowering IRBackend Specification.md` +- `docs/pbs/specs/6.1. Intrinsics and Builtin Types Specification.md` +- `docs/pbs/specs/6.2. Host ABI Binding and Loader Resolution Specification.md` +- `20. IRBackend to IRVM Lowering Specification.md` +- `21. IRVM Optimization Pipeline Specification.md` ## 5. Already-Settled Inputs -The following inputs are already fixed elsewhere and must not be contradicted here: +The following are fixed and must not be contradicted: -- The compiler emits host-binding declarations into PBX `SYSC`. +- The compiler emits host-binding declarations in PBX `SYSC`. - Host-backed callsites are emitted in pre-load form as `HOSTCALL `. - `SYSC` entries are deduplicated by canonical identity and ordered by first occurrence. -- The loader resolves host bindings and rewrites `HOSTCALL` into `SYSCALL`. -- VM-owned builtin projections, builtin constants, and intrinsic callsites do not emit host-binding metadata. +- The loader resolves host bindings and rewrites `HOSTCALL` to `SYSCALL` before execution. +- Raw `SYSCALL` in pre-load artifacts is rejected. - VM-owned intrinsic artifacts are distinct from `SYSC`, `HOSTCALL`, and `SYSCALL`. -- V1 source-attribution hooks for diagnostics/conformance need only support recovery of at least one source location for source-attributable backend failures; one exact source-map format is not yet required. +- `SYSC` section is mandatory in valid PBX artifacts (empty section is valid). -## 6. Current Conformance-Facing Mapping Baseline +## 6. Artifact Mapping Contract (v1) -The current boundary across backend-facing specs is: +### 6.1 Required module surfaces -- `13. Lowering IRBackend Specification.md` defines what facts must be preserved by lowering, -- this document defines how the PBS-facing artifact facts appear in PBX/bytecode-visible form, -- and `19. Verification and Safety Checks Specification.md` defines later validity checks after the artifact is formed and, where relevant, patched by the loader. +Emitter output must map to a `BytecodeModule` shape containing: -The current conformance-facing artifact baseline is invariant-based. +1. `const_pool`, +2. `functions`, +3. `code`, +4. `exports`, +5. `syscalls`, +6. optional `debug_info` that still satisfies v1 minimum debug obligations. -At minimum, the PBS-facing artifact contract currently includes: +### 6.2 Function ordering and IDs -1. canonical host-binding declarations in `SYSC` for admitted host-backed uses; -2. `SYSC` deduplication by canonical identity and ordering by first occurrence; -3. host-backed callsites emitted in pre-load form as `HOSTCALL `; -4. no host-binding metadata emission for VM-owned builtin projections, builtin constants, or intrinsic callsites; -5. no executable bytecode emission for stdlib interface modules by themselves; -6. enough source-attribution information to recover at least one source location for backend failures that remain in the PBS-facing diagnostics/conformance surface. +Function ordering must be deterministic: -This baseline does not require one full-image PBX golden. +1. entrypoint function index is `0`, +2. remaining functions are ordered by `(module_key, callable_name, source_start)`, +3. identical admitted input graph yields identical function ordering and function ids. -## 7. Initial Section Targets +### 6.3 Function code layout -At minimum, the completed document should contain normative sections for: +Emitter must satisfy: -1. artifact mapping goals, -2. section-level invariants, -3. callsite and control-flow mapping invariants, -4. metadata mapping obligations, -5. intrinsic and builtin artifact obligations, -6. source-attribution hooks for diagnostics/conformance. +1. `code_offset` values are unique and monotonic over function order, +2. `code_len` exactly matches emitted bytes for each function body, +3. `code_offset + code_len` stays within `code.len`, +4. and final code concatenation is deterministic. -## 8. TODO +### 6.4 Instruction encoding -The following items remain to be closed in future agenda discussion. +Emitter must satisfy: -- Which PBX sections beyond `SYSC` are part of the PBS-facing contract. -- Whether a distinct intrinsic declaration section is part of the PBS-facing contract in v1. -- Which source-level constructs require dedicated artifact patterns versus implementation freedom. -- How much of artifact ordering, numbering, and layout is normative at PBS level versus owned purely by PBX authority. -- Whether compilers should emit final `INTRINSIC ` directly for a selected VM line or may emit a declaration-based preload form such as `INTRCALL `. -- Which richer debug/source-map surfaces, if any, are added beyond the minimum attribution hooks already sufficient for v1 diagnostics and conformance. +1. little-endian encoding, +2. instruction layout `[opcode: u16][immediate]`, +3. jump immediates as `u32` offsets relative to function start, +4. immediate sizes matching selected Core ISA opcode spec. -## 9. Non-Goals +### 6.5 Host-backed mapping obligations -- Repeating the full PBX or ISA documentation. -- Mandating one binary-emitter implementation strategy. +For host-backed operations: + +1. emit canonical declarations in `SYSC` (`module`, `name`, `version`, `arg_slots`, `ret_slots`), +2. deduplicate by canonical identity, +3. order by first occurrence, +4. emit callsites as `HOSTCALL ` only, +5. do not emit raw `SYSCALL` in pre-load artifact form. + +### 6.6 VM-owned intrinsic mapping obligations + +For VM-owned intrinsic operations: + +1. emit VM-owned intrinsic call form (`INTRINSIC `), +2. keep intrinsic path distinct from host-binding metadata and host call opcodes, +3. and do not emit VM-owned builtin/intrinsic semantics through `SYSC`. + +### 6.7 Internal symbolic-to-index mapping + +Compilers may use internal symbolic references before final index materialization. + +If used, symbolic references must be resolved deterministically to final numeric indices before serialization. + +## 7. Minimum Debug Attribution Contract (v1) + +For v1 backend/runtime diagnostics and conformance support, emitted artifacts must preserve at minimum: + +1. `function_names` entries for all emitted function indices, +2. `pc_to_span` entries for each emitted instruction start PC. + +This minimum does not require one universal source-map format. + +## 8. Deterministic Emitter Rejection + +Emitter-side rejection must be deterministic for malformed or inconsistent artifact candidates, including at minimum: + +1. inconsistent function layout bounds, +2. unresolved symbolic references at serialization boundary, +3. illegal pre-load host call form (`SYSCALL` in pre-load image), +4. duplicate `SYSC` canonical identities, +5. and declared host ABI shape mismatch detectable at compile target metadata line. + +## 9. Conformance-Facing Baseline + +At minimum, artifact conformance checks should assert: + +1. canonical `SYSC` declarations for admitted host-backed operations, +2. deterministic `SYSC` dedup/order, +3. pre-load `HOSTCALL` callsites for host-backed paths, +4. no host-binding leakage for VM-owned intrinsic/builtin operations, +5. minimum debug attribution hooks required by v1, +6. deterministic function ordering and code layout invariants. + +## 10. Explicit Deferrals + +The following remain deferred: + +- richer optional debug/source-map formats, +- additional PBX section-level contracts beyond current baseline, +- and profile-specific binary compatibility policy details beyond current v1 baseline. + +## 11. Non-Goals + +- Repeating full ISA/runtime documentation. +- Mandating one byte-for-byte whole-image golden as sole conformance oracle. - Defining loader patching internals already owned elsewhere. -## 10. Exit Criteria +## 12. Exit Criteria -This document is ready to move beyond skeleton status only when: +This document is healthy when: -1. the PBS-to-artifact boundary is explicit, -2. artifact-level invariants needed by verifier and loader are normative, -3. the document no longer relies on unresolved `TODO` items for ordinary v1 artifact mapping behavior. +1. artifact mapping obligations are explicit and testable, +2. host-backed and VM-owned emission boundaries are explicit, +3. deterministic ordering/layout rules are explicit, +4. and v1 minimum debug/source attribution contract is explicit. diff --git a/docs/general/specs/19. Verification and Safety Checks Specification.md b/docs/general/specs/19. Verification and Safety Checks Specification.md index 608bdab0..e60e0f4c 100644 --- a/docs/general/specs/19. Verification and Safety Checks Specification.md +++ b/docs/general/specs/19. Verification and Safety Checks Specification.md @@ -1,38 +1,44 @@ # PBS Verification and Safety Checks Specification -Status: Draft v0 (Quality-Gate Baseline) -Applies to: test-backed safety gates for compiler stages and integration regression safety checks +Status: Draft v1 (Quality-Gate Baseline) +Applies to: test-backed safety gates for backend compiler stages and integration regression safety checks against runtime ## 1. Purpose -This document defines the minimum safety/verification quality gates that can be enforced now. +This document defines the minimum safety/verification quality gates that must be enforced for executable backend pipelines. -The intent is to keep safety measurable without over-specifying future verifier internals before backend and runtime integration stabilize. +Its purpose is to keep safety measurable and deterministic across: + +- lowering, +- optimization, +- emission, +- and runtime integration boundaries. ## 2. Scope This document defines: -- compiler-stage safety checks that must be covered by tests now, -- conditional integration safety regression checks for executable backends, +- mandatory compiler-stage safety checks for executable pipeline stages, +- conditional integration safety regression checks against runtime line, - and evidence requirements for safety-related support claims. This document does not define: -- full verifier algorithm internals, -- full bytecode/runtime proof obligations, -- or one mandatory verifier implementation architecture. +- full runtime verifier algorithm internals, +- formal proof of all runtime safety properties, +- or one mandatory verifier/optimizer architecture. ## 3. Authority and Precedence Normative precedence: -1. `2. Governance and Versioning.md` -2. `12. Diagnostics Specification.md` -3. `13. Lowering IRBackend Specification.md` -4. `13. Conformance Test Specification.md` +1. `docs/pbs/specs/2. Governance and Versioning.md` +2. `docs/pbs/specs/12. Diagnostics Specification.md` +3. `20. IRBackend to IRVM Lowering Specification.md` +4. `21. IRVM Optimization Pipeline Specification.md` 5. `15. Bytecode and PBX Mapping Specification.md` -6. This document +6. `13. Conformance Test Specification.md` +7. This document If a rule here conflicts with higher-precedence authorities, it is invalid. @@ -40,65 +46,90 @@ If a rule here conflicts with higher-precedence authorities, it is invalid. This document depends on: -- `12. Diagnostics Specification.md` -- `13. Lowering IRBackend Specification.md` +- `docs/pbs/specs/12. Diagnostics Specification.md` - `13. Conformance Test Specification.md` - `15. Bytecode and PBX Mapping Specification.md` +- `20. IRBackend to IRVM Lowering Specification.md` +- `21. IRVM Optimization Pipeline Specification.md` ## 5. Safety Gate Model -PBS currently uses two safety gate families. - ### 5.1 Gate S-U: Compiler-Stage Safety Unit Tests (Required) -At minimum, test coverage must exist for: +For executable pipelines, S-U must cover stage-level safety checks at minimum: -1. deterministic rejection of malformed source input at lexer/parser level, -2. deterministic rejection of in-scope frontend semantic failures, -3. deterministic dependency/manifest validation failures, -4. stable diagnostics identity for required rejection surfaces (`code`, `severity`, primary attribution). +1. `LowerToVM` safety checks: + - deterministic control-flow lowering checks, + - deterministic function-id assignment checks, + - mandatory slot/return shape checks, + - mandatory callsite form checks. +2. `OptimizeIRVM` safety checks: + - semantic-preservation regression checks, + - deterministic output checks for same input graph, + - profile-compatibility checks (`vm_profile`). +3. `EmitBytecode` safety checks: + - deterministic function/code layout checks, + - host-backed pre-load form checks (`HOSTCALL`, `SYSC`), + - deterministic rejection for malformed emission candidates, + - minimum debug-attribution output checks. +4. deterministic diagnostics identity for required rejection surfaces (`code`, `severity`, primary attribution when applicable). -This gate is mandatory regardless of backend maturity. +This gate is mandatory regardless of backend maturity stage. -### 5.2 Gate S-I: Integration Safety Regression With Runtime (Conditional) +### 5.2 Gate S-I: Integration Safety Regression With Runtime (Required for Executable Backends) -For languages/backends that emit executable artifacts, integration safety regression tests must be introduced. +For backends that emit executable artifacts, S-I must cover integration safety at minimum: -At minimum, these tests must cover: +1. valid pre-load host path that loader patches and runtime executes, +2. rejection of out-of-bounds `HOSTCALL` index, +3. rejection of unused `SYSC` declarations, +4. rejection of raw `SYSCALL` in pre-load artifact, +5. rejection of host ABI mismatch (`arg_slots`/`ret_slots`), +6. rejection of missing capability at load-time, +7. valid VM-owned intrinsic path, +8. repeatability across declared runtime line. -1. deterministic rejection behavior for known malformed/unsupported integration scenarios in supported scope, -2. preservation of fixed safety regressions, -3. and repeatability across the declared runtime line for that backend. - -When executable artifact emission is not yet present, Gate S-I is explicitly `deferred`. +When executable artifact emission is not present, S-I may be explicitly `deferred`. ## 6. Evidence Requirements Safety claims must map to gate evidence: -- `S-U` evidence is required for baseline safety posture, -- `S-I` evidence is additionally required for integration-ready safety claims, -- and no published safety-related support claim should exist without current gate evidence. +1. S-U evidence is required for baseline safety posture, +2. S-I evidence is required for integration-ready safety claims, +3. and published safety-related support claims must remain evidence-backed and current. -## 7. Explicit Deferrals +For executable backends, evidence should include deterministic regression fixtures that prove pipeline order and stage-boundary safety behavior. + +## 7. Deterministic Failure Taxonomy Expectations + +Backend safety checks should expose stable rejection families at minimum: + +1. `MARSHAL_FORMAT_*`, +2. `MARSHAL_LINKAGE_*`, +3. `MARSHAL_VERIFY_PRECHECK_*`. + +This taxonomy complements existing diagnostics contracts and does not replace runtime loader/verifier error surfaces. + +## 8. Explicit Deferrals The following are intentionally deferred: -- full verifier obligation catalog, -- formal proof of runtime safety properties, -- richer source-map/debug attribution standards beyond current minimum diagnostics needs. +- full formal verifier obligation catalog, +- formal proof frameworks for optimizer equivalence, +- and richer standardized source-map formats beyond v1 minimum hooks. -## 8. Non-Goals +## 9. Non-Goals -- Replacing runtime/bytecode authority. -- Freezing verifier architecture before implementation maturity. -- Defining performance tooling in a safety spec. +- Replacing runtime or bytecode authority. +- Freezing optimizer/verifier internal architecture prematurely. +- Defining performance tooling in a safety specification. -## 9. Exit Criteria +## 10. Exit Criteria -This document is in a healthy state when: +This document is healthy when: -1. required compiler-stage safety checks are test-backed, +1. required stage-level safety checks are test-backed, 2. executable backends have explicit S-I status (`pass`/`fail`/`deferred`), -3. safety regressions are preserved in tests, +3. integration safety regressions are preserved in tests, 4. and safety-related support claims remain evidence-backed. diff --git a/docs/general/specs/20. IRBackend to IRVM Lowering Specification.md b/docs/general/specs/20. IRBackend to IRVM Lowering Specification.md new file mode 100644 index 00000000..e9ae8407 --- /dev/null +++ b/docs/general/specs/20. IRBackend to IRVM Lowering Specification.md @@ -0,0 +1,162 @@ +# PBS IRBackend to IRVM Lowering Specification + +Status: Draft v1 (Backend Baseline) +Applies to: executable-backend lowering from `IRBackend` handoff into `IRVM` + +## 1. Purpose + +This document defines the normative lowering contract from executable `IRBackend` to `IRVM`. + +Its purpose is to make backend lowering deterministic, reviewable, and compatible with runtime ISA/verification authority. + +## 2. Scope + +This document defines: + +- required backend preconditions before `IRBackend -> IRVM` starts, +- `IRVM` structural obligations needed before optimization/emission, +- deterministic function-id assignment and control-flow lowering obligations, +- callsite lowering obligations across function, host-backed, and VM-owned intrinsic paths, +- mandatory compiler-side structural pre-verification before bytecode emission, +- and deterministic rejection obligations for backend-originated failures. + +This document does not define: + +- binary PBX layout details, +- loader patching internals, +- runtime execution internals, +- or one mandatory optimizer implementation. + +## 3. Authority and Precedence + +Normative precedence: + +1. Runtime authority (`docs/specs/hardware/topics/chapter-2.md`, `chapter-3.md`, `chapter-9.md`, `chapter-12.md`, `chapter-16.md`) +2. Bytecode authority (`docs/specs/bytecode/ISA_CORE.md`) +3. `docs/pbs/specs/13. Lowering IRBackend Specification.md` +4. This document + +If a rule here conflicts with higher-precedence authorities, it is invalid. + +## 4. Normative Inputs + +This document depends on: + +- `docs/pbs/specs/13. Lowering IRBackend Specification.md` +- `docs/pbs/specs/6.1. Intrinsics and Builtin Types Specification.md` +- `docs/pbs/specs/6.2. Host ABI Binding and Loader Resolution Specification.md` +- `15. Bytecode and PBX Mapping Specification.md` +- `21. IRVM Optimization Pipeline Specification.md` + +## 5. Backend Entry Preconditions + +Lowering from `IRBackend` to `IRVM` may start only when: + +1. frontend admission to `IRBackend` is complete for the active slice, +2. required callsite categories (`CALL_FUNC`/`CALL_HOST`/`CALL_INTRINSIC`) are available, +3. required canonical host/intrinsic metadata is available for admitted callsites, +4. dependency-scoped fail-fast exclusions have already been applied at the `IRBackend` boundary, +5. and target `vm_profile` is selected deterministically. + +## 6. IRVM Model Obligations + +`IRVM` v1 is a quasi-ISA backend form: + +1. it may contain Core ISA ops and internal extension ops (`IRVM_EXT`), +2. each `IRVM_EXT` must declare structural metadata equivalent to opcode-spec needs: + - `pops`, + - `pushes`, + - `is_branch`, + - `is_terminator`, +3. and `IRVM_EXT` must be eliminable before bytecode emission. + +`IRVM` must preserve, per function: + +1. `param_slots`, +2. `local_slots`, +3. `return_slots`, +4. `max_stack_slots`, +5. function entry mapping and deterministic identity. + +## 7. Control-Flow Lowering Obligations + +Control-flow lowering must satisfy: + +1. symbolic labels are allowed internally in lowering, +2. final jump immediates must be resolved to `u32` offsets relative to function start before emission, +3. jump targets must be instruction-boundary-valid within function ranges, +4. and reachable fallthrough beyond function end without terminator is rejected. + +## 8. Deterministic Function-ID Assignment + +Function indexing must be deterministic: + +1. entrypoint function id is `0`, +2. remaining functions are ordered by `(module_key, callable_name, source_start)`, +3. and identical admitted input graphs must produce identical function-id assignments. + +## 9. Callsite Lowering Obligations + +### 9.1 Function calls + +`CALL_FUNC` lowers to `CALL ` with valid index and arity-consistent signature contracts. + +### 9.2 Host-backed calls + +`CALL_HOST` lowers to pre-load `HOSTCALL `, not raw `SYSCALL`. + +### 9.3 VM-owned intrinsic calls + +`CALL_INTRINSIC` lowers to VM-owned intrinsic path and must remain distinct from host-binding paths. + +### 9.4 Closure and coroutine constraints (v1) + +1. closure calls with unknown static callee are rejected in v1, +2. `SPAWN` argument count must match callee `param_slots`, +3. and `YIELD` with non-empty operand stack is rejected. + +## 10. Mandatory Compiler-Side Pre-Verification + +Before bytecode emission, backend must run structural pre-verification on lowered `IRVM` (after resolving to Core-ISA-compatible form), covering at minimum: + +1. jump target validity, +2. stack-depth join consistency, +3. underflow/overflow against declared limits, +4. return-shape consistency, +5. function-id validity, +6. callsite arity consistency, +7. and structural validity of host/intrinsic call forms. + +This pre-verification does not replace runtime verifier authority. + +## 11. Deterministic Rejection Policy + +When backend lowering fails at this boundary: + +1. rejection must be deterministic, +2. diagnostics identity and phase must remain stable, +3. and source attribution must be preserved whenever the failure remains source-attributable and user-actionable. + +## 12. Explicit Deferrals + +The following are intentionally deferred: + +- optimizer pass catalog and tuning policy, +- richer internal IR analysis models beyond v1 obligations, +- and advanced closure/lifetime optimizations beyond v1 constraints. + +## 13. Non-Goals + +- Defining one in-memory IR implementation graph. +- Redefining runtime verifier/loader internals. +- Defining PBX binary chunk policy in this document. + +## 14. Exit Criteria + +This document is healthy when: + +1. `IRBackend -> IRVM` obligations are explicit and testable, +2. deterministic function/call/control-flow lowering rules are explicit, +3. mandatory pre-verification boundary is explicit, +4. and scope boundaries with optimization/emission specs are explicit. + diff --git a/docs/general/specs/21. IRVM Optimization Pipeline Specification.md b/docs/general/specs/21. IRVM Optimization Pipeline Specification.md new file mode 100644 index 00000000..4e966ce5 --- /dev/null +++ b/docs/general/specs/21. IRVM Optimization Pipeline Specification.md @@ -0,0 +1,124 @@ +# PBS IRVM Optimization Pipeline Specification + +Status: Draft v1 (Pipeline Baseline) +Applies to: placement and invariants of the `OptimizeIRVM` stage between lowering and bytecode emission + +## 1. Purpose + +This document defines the normative placement and safety invariants of the `OptimizeIRVM` stage. + +Its purpose is to allow optimization in the backend pipeline without transferring semantic optimization responsibility to runtime. + +## 2. Scope + +This document defines: + +- canonical stage placement for `OptimizeIRVM`, +- input/output contracts of optimization stage boundaries, +- semantic-preservation invariants for optimization passes, +- profile-compatibility obligations (`vm_profile`) across optimization and emission, +- and minimum validation expectations for optimization-stage regressions. + +This document does not define: + +- a mandatory list of optimization passes, +- pass scheduling heuristics, +- or runtime dynamic optimization/JIT policies. + +## 3. Authority and Precedence + +Normative precedence: + +1. Runtime authority (`docs/specs/hardware/topics/chapter-2.md`, `chapter-3.md`, `chapter-9.md`, `chapter-12.md`, `chapter-16.md`) +2. Bytecode authority (`docs/specs/bytecode/ISA_CORE.md`) +3. `20. IRBackend to IRVM Lowering Specification.md` +4. `15. Bytecode and PBX Mapping Specification.md` +5. This document + +If a rule here conflicts with higher-precedence authorities, it is invalid. + +## 4. Normative Inputs + +This document depends on: + +- `20. IRBackend to IRVM Lowering Specification.md` +- `15. Bytecode and PBX Mapping Specification.md` +- `19. Verification and Safety Checks Specification.md` +- `docs/pbs/decisions/IRVM Optimization Stage Placement Decision.md` + +## 5. Canonical Backend Pipeline Order + +For executable backends, the canonical order is: + +1. `LowerToVM`: `IRBackend -> IRVM`, +2. `OptimizeIRVM`: `IRVM -> IRVM` optimized, +3. `EmitBytecode`: optimized `IRVM -> BytecodeModule` pre-load artifact. + +`OptimizeIRVM` must not be skipped in a pipeline that claims this profile unless explicitly configured as a no-op stage that still preserves contracts. + +## 6. Input and Output Contracts + +### 6.1 Input contract + +`OptimizeIRVM` input must: + +1. satisfy lowering obligations from `20`, +2. carry selected `vm_profile`, +3. and remain structurally valid for optimizer analyses. + +### 6.2 Output contract + +`OptimizeIRVM` output must: + +1. be semantically equivalent to input under the selected `vm_profile`, +2. preserve callable/slot/return contracts, +3. preserve deterministic mapping behavior required by emission, +4. and be valid for bytecode emission and subsequent runtime verification. + +## 7. Optimization Invariants + +Optimization passes must satisfy all of: + +1. semantic preservation of observable program behavior, +2. determinism for the same admitted input graph and selected profile, +3. profile compatibility (no introduction of out-of-profile executable opcode forms), +4. preservation of host-backed vs VM-owned call-boundary classification, +5. and preservation of diagnostics/source-attribution hooks required at v1 minimum. + +## 8. Runtime Responsibility Boundary + +1. Semantic/structural program optimization belongs to compile-time backend stages. +2. Runtime remains responsible for load, patch, verify, and execute responsibilities. +3. Runtime-internal micro-optimizations are allowed only when they do not alter bytecode-observable semantics or contract behavior. + +## 9. Validation Expectations + +At minimum, optimization-stage validation must include: + +1. equivalence-oriented regression fixtures across optimized vs non-optimized pipeline outputs, +2. preservation of known negative loader/verifier rejection behavior, +3. and deterministic reproduction of artifact-level invariants required by `15`. + +## 10. Explicit Deferrals + +The following are intentionally deferred: + +- target-specific optimization profitability models, +- advanced interprocedural pipelines beyond v1 baseline, +- and formal equivalence proof frameworks. + +## 11. Non-Goals + +- Defining runtime JIT or adaptive execution. +- Replacing loader/verifier contracts. +- Mandating one optimizer architecture or pass framework. + +## 12. Exit Criteria + +This document is healthy when: + +1. stage placement is explicit and implemented in pipeline order, +2. optimization invariants are explicit and test-backed, +3. backend/runtime responsibility boundary is explicit, +4. and integration safety evidence references this stage concretely. + diff --git a/docs/general/specs/README.md b/docs/general/specs/README.md index 68649ddf..4d17ff9b 100644 --- a/docs/general/specs/README.md +++ b/docs/general/specs/README.md @@ -13,6 +13,11 @@ Typical topics: - verification/safety acceptance boundaries, - artifact/runtime acceptance contracts. +Current backend-facing chapter additions: + +- `20. IRBackend to IRVM Lowering Specification.md` +- `21. IRVM Optimization Pipeline Specification.md` + ## Rule Language-specific specs (for example `docs/pbs/specs`) must reference this layer instead of duplicating cross-language acceptance policy. diff --git a/docs/pbs/specs/13. Lowering IRBackend Specification.md b/docs/pbs/specs/13. Lowering IRBackend Specification.md index f7642ce7..a525289e 100644 --- a/docs/pbs/specs/13. Lowering IRBackend Specification.md +++ b/docs/pbs/specs/13. Lowering IRBackend Specification.md @@ -1,7 +1,7 @@ # PBS Lowering IRBackend Specification -Status: Draft v1 (Frontend Scope) -Applies to: first lowering boundary from bound PBS frontend model into `IRBackend`/`IRBackendFile` +Status: Draft v1 (Frontend Scope + Backend Handoff Addendum) +Applies to: first lowering boundary from bound PBS frontend model into `IRBackend`/`IRBackendFile`, plus executable-backend handoff obligations at this boundary ## 1. Purpose @@ -17,7 +17,8 @@ This document defines: - AST invariants that lowering may assume as hard contract, - semantic obligations preserved in `IRBackend`, - deterministic rejection behavior for unsupported frontend-lowering forms, -- and diagnostics attribution obligations for lowering failures in frontend scope. +- diagnostics attribution obligations for lowering failures in frontend scope, +- and executable-backend handoff obligations at the `IRBackend` boundary. This document does not define: @@ -48,6 +49,7 @@ This document depends on: - `4. Static Semantics Specification.md` - `11. AST Specification.md` - `12. Diagnostics Specification.md` +- `docs/general/specs/20. IRBackend to IRVM Lowering Specification.md` ## 5. Lowering Preconditions @@ -135,3 +137,60 @@ This document is healthy when: 2. preserved `IRBackend` obligations are explicit, 3. deterministic rejection policy is explicit and test-backed, 4. and scope boundaries with general/backend acceptance specs are explicit. + +## 12. Executable Backend Handoff Addendum (v1) + +For executable backends, the `IRBackend` output from this frontend boundary must satisfy the additional handoff obligations below. + +### 12.1 Callable obligations + +For each executable callable admitted into `IRBackend`, the frontend must preserve: + +1. stable callable identity in the current build graph, +2. observable callable signature (input arity and return shape), +3. callable category needed by downstream lowering, +4. source attribution anchor (`fileId`, `start`, `end`), +5. and executable body representation that is backend-lowerable. + +### 12.2 Callsite classification obligations + +Each executable callsite in `IRBackend` must be classified into exactly one category: + +1. `CALL_FUNC`, +2. `CALL_HOST`, +3. `CALL_INTRINSIC`. + +Backend lowering must not infer this category by textual heuristics. + +### 12.3 Host-backed metadata obligations + +Each host-backed callsite admitted at this boundary must preserve canonical host identity: + +1. `module`, +2. `name`, +3. `version`. + +When available at this boundary, declared host ABI shape (`arg_slots`, `ret_slots`) must also be preserved for downstream validation. + +### 12.4 VM-owned metadata obligations + +VM-owned intrinsic callsites admitted at this boundary must preserve canonical intrinsic identity: + +1. `canonicalName`, +2. `canonicalVersion`. + +VM-owned builtin projections/constants/intrinsics must not be reclassified as host bindings. + +### 12.5 Deterministic capability obligations + +`requiredCapabilities` derived at this boundary must be deterministic for the same admitted input graph. + +### 12.6 Boundary with backend lowering specs + +This addendum defines obligations preserved at the `IRBackend` boundary only. + +It does not replace: + +1. `docs/general/specs/20. IRBackend to IRVM Lowering Specification.md`, +2. `docs/general/specs/21. IRVM Optimization Pipeline Specification.md`, +3. or `docs/general/specs/15. Bytecode and PBX Mapping Specification.md`.