# PBS AST Specification
Status: Draft v1
Applies to: canonical AST contract emitted by PBS parser/frontend before IRBackend lowering
## 1. Purpose
This document defines the mandatory AST contract between PBS parsing and frontend lowering to `IRBackend`.
The contract is obligations-first: implementations may choose any internal representation, but must expose equivalent observable AST behavior and invariants.
## 2. Scope
This document defines:
- AST root and ordering invariants,
- mandatory node families for declarations, statements, and expressions in the supported v1 slice,
- mandatory node attribution and declaration metadata,
- deterministic rejection and recovery boundaries,
- and Gate U evidence obligations for AST conformance.
This document does not define:
- static-semantics algorithms,
- runtime/bytecode/verifier behavior,
- or `IRBackend -> IRVM` lowering.
## 3. Authority and Precedence
Normative precedence:
1. `3. Core Syntax Specification.md`
2. `4. Static Semantics Specification.md`
3. `12. Diagnostics Specification.md`
4. This document
5. `13. Lowering IRBackend Specification.md`
If a rule here conflicts with a higher-precedence source, the higher-precedence rule wins.
## 4. Normative Inputs
This document depends on:
- `3. Core Syntax Specification.md`
- `4. Static Semantics Specification.md`
- `12. Diagnostics Specification.md`
- `13. Lowering IRBackend Specification.md`
## 5. AST Contract Model
The AST contract is defined by required observable behavior, not by one mandatory class hierarchy or parser architecture.
Conformance is implementation-language agnostic.
## 6. Root and Structural Invariants
Parser output must satisfy all of the following:
1. exactly one AST root per source file;
2. deterministic child ordering consistent with source order;
3. preserved declaration and lexical block hierarchy;
4. parent/child span integrity (no structurally impossible attribution relationships);
5. no post-parse AST rewrite that changes source-observable parse meaning.
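Informative sketch (non-normative): the ordering and span-integrity invariants above could be checked mechanically roughly as follows. The `Node` shape, the integer offsets, and the reading of "structurally impossible" as "a child span that escapes its parent span" are assumptions made for illustration; this document does not prescribe them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical minimal node: a kind label plus start/end offsets.
    kind: str
    start: int
    end: int
    children: List["Node"] = field(default_factory=list)


def structural_violations(node: Node) -> List[str]:
    """Collect violations of child ordering and parent/child span containment."""
    problems = []
    if node.start > node.end:
        problems.append(f"{node.kind}: start after end")
    prev_start = node.start
    for child in node.children:
        # Assumed reading of span integrity: a child must lie inside its parent.
        if child.start < node.start or child.end > node.end:
            problems.append(f"{child.kind}: span escapes parent {node.kind}")
        # Children must appear in source order.
        if child.start < prev_start:
            problems.append(f"{child.kind}: out of source order in {node.kind}")
        prev_start = child.start
        problems.extend(structural_violations(child))
    return problems
```

A conforming tree yields an empty list; each reported string identifies one offending node.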
## 7. Mandatory Attribution and Declaration Metadata
Every node consumed by diagnostics or lowering must carry stable source attribution:
- `file`,
- `start`,
- `end`.
Mandatory declaration metadata on required declaration nodes includes:
1. declaration name,
2. declared signature/surface when applicable (parameters/return),
3. declaration-level syntactic flags/attributes required by later phases,
4. stable `file/start/end` attribution,
5. lifecycle-marker metadata when a declaration carries `[Init]` or `[Frame]`.
Missing required attribution or metadata on mandatory nodes is non-conformant.
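Informative sketch (non-normative): one possible way to model the attribution triple and the declaration metadata listed above. The class names, field names, and the file name `main.pbs` are hypothetical; any representation that exposes equivalent information satisfies the contract.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Span:
    # Stable source attribution: file plus start/end offsets.
    file: str
    start: int
    end: int


@dataclass(frozen=True)
class FnDecl:
    # Hypothetical declaration node carrying the mandatory metadata.
    name: str
    params: Tuple[str, ...]
    return_type: Optional[str]
    span: Optional[Span]
    flags: Tuple[str, ...] = ()
    lifecycle_marker: Optional[str] = None  # "Init" or "Frame" when present


def missing_metadata(decl: FnDecl) -> List[str]:
    """Return the names of mandatory metadata items that are absent or malformed."""
    missing = []
    if not decl.name:
        missing.append("name")
    if decl.span is None or not decl.span.file or decl.span.start > decl.span.end:
        missing.append("span")
    if decl.lifecycle_marker not in (None, "Init", "Frame"):
        missing.append("lifecycle_marker")
    return missing
```

A non-empty result corresponds to the non-conformant case described above.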
## 8. Mandatory Declaration Families
The v1 AST must represent, at minimum, declaration families for:
- imports,
- top-level `fn`,
- `struct`,
- `contract`,
- `service`,
- `error`,
- `enum`,
- `callback`,
- `declare global`,
- `declare const`,
- and declaration nodes required by barrel/linking flow.
Declaration identity must be preserved at the AST boundary; implementations must not prematurely merge or collapse declarations (including overload sets).
Lifecycle-observable declarations must preserve marker metadata explicitly:
- top-level `fn` marked with `[Init]`,
- top-level `fn` marked with `[Frame]`,
- and host method signatures marked with `[InitAllowed]`.
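Informative sketch (non-normative): the difference between preserving declaration identity and collapsing an overload set. The `Decl` shape and the declaration names `draw` and `boot` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Decl:
    # Hypothetical declaration record: family, name, parameter types, markers.
    family: str
    name: str
    params: Tuple[str, ...] = ()
    markers: Tuple[str, ...] = ()


# Two overloads of `draw` stay as two separate nodes, in source order.
decls = [
    Decl("fn", "draw", ("Sprite",)),
    Decl("fn", "draw", ("Text",)),
    Decl("fn", "boot", (), ("Init",)),
]


def by_name(items: List[Decl], name: str) -> List[Decl]:
    """All declarations with the given name, overloads included."""
    return [d for d in items if d.name == name]


def with_marker(items: List[Decl], marker: str) -> List[Decl]:
    """Declarations that carry an explicit lifecycle marker."""
    return [d for d in items if marker in d.markers]


# Anti-pattern: keying by name silently drops one overload.
collapsed = {d.name: d for d in decls}
```

Here `by_name(decls, "draw")` still returns both overloads, while `collapsed` retains only one of them, which is the premature merge this section forbids.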
## 9. Mandatory Statement and Expression Families
The v1 AST must represent statement/expression families for the supported core syntax slice, including at minimum:
- statements: `block`, `let`, `return`, expression statement;
- expressions: `identifier`, literals, `unary`, `binary`, `call`, `group`.
Additional supported forms required by `3. Core Syntax Specification.md` are also in scope for explicit node representation.
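Informative sketch (non-normative): the minimum statement and expression families above, written as one explicit node type per family. The class and field names are hypothetical, and attribution fields are omitted for brevity.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Iterator, List


@dataclass
class Identifier:
    name: str

@dataclass
class Literal:
    value: Any

@dataclass
class Unary:
    op: str
    operand: Any

@dataclass
class Binary:
    op: str
    left: Any
    right: Any

@dataclass
class Call:
    callee: Any
    args: List[Any]

@dataclass
class Group:
    inner: Any

@dataclass
class Let:
    name: str
    init: Any

@dataclass
class Return:
    value: Any

@dataclass
class ExprStmt:
    expr: Any

@dataclass
class Block:
    statements: List[Any]


def kinds(node: Any) -> Iterator[str]:
    """Yield node kind names in pre-order, which follows source order."""
    if is_dataclass(node):
        yield type(node).__name__
        for f in fields(node):
            yield from kinds(getattr(node, f.name))
    elif isinstance(node, (list, tuple)):
        for item in node:
            yield from kinds(item)


tree = Block([
    Let("x", Binary("+", Call(Identifier("f"), [Literal(1)]),
                    Group(Unary("-", Identifier("y"))))),
    Return(Identifier("x")),
])
```

Note that `Group` is kept as its own node rather than being erased during parsing, since `group` is listed as a required expression family.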
## 10. Precedence, Associativity, and Rejection
Precedence and associativity outcomes are normative through AST shape.
Rules:
1. accepted forms must preserve parse outcome explicitly in AST shape;
2. non-associative or forbidden chained forms must be deterministically rejected with stable diagnostics;
3. unsupported forms outside the active source slice must be deterministically rejected;
4. unsupported/invalid forms must not be masked by permissive synthetic nodes that imply accepted semantics.
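Informative sketch (non-normative): how precedence and associativity become visible in AST shape, and how a non-associative chain is rejected instead of being given an arbitrary shape. The operator table, the choice of `<` as non-associative, and the diagnostic codes are assumptions for illustration only; the actual table is defined by the Core Syntax Specification.

```python
class ParseError(Exception):
    """Carries a stable machine code and a token position."""

    def __init__(self, code, pos):
        super().__init__(code)
        self.code = code
        self.pos = pos


# Hypothetical operator table: (precedence, associativity).
BINDING = {"+": (10, "left"), "-": (10, "left"), "*": (20, "left"), "<": (5, "none")}


def parse(tokens):
    """Precedence-climbing parser over a flat token list; returns nested tuples."""
    pos = 0

    def expr(min_prec):
        nonlocal pos
        if pos >= len(tokens):
            raise ParseError("E_UNEXPECTED_EOF", pos)
        left = tokens[pos]  # operand
        pos += 1
        while pos < len(tokens):
            op = tokens[pos]
            if op not in BINDING:
                raise ParseError("E_UNEXPECTED_TOKEN", pos)
            prec, assoc = BINDING[op]
            if prec < min_prec:
                break
            pos += 1
            right = expr(prec + 1)  # left-associative: right side binds tighter
            left = (op, left, right)
            # A non-associative operator may not be followed by a same-level operator.
            if assoc == "none" and pos < len(tokens):
                nxt = BINDING.get(tokens[pos])
                if nxt is not None and nxt[0] == prec:
                    raise ParseError("E_NON_ASSOC_CHAIN", pos)
        return left

    return expr(0)
```

With this table, `parse(["a", "+", "b", "*", "c"])` nests the multiplication under the addition, while `parse(["a", "<", "b", "<", "c"])` raises `ParseError` with the same code and position on every run.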
## 11. Recovery and Diagnostics Integration
Parser recovery may continue after syntax errors for diagnostic collection, but the recovered AST must remain:
- structurally coherent,
- attribution-consistent,
- and non-permissive (must not fabricate valid semantics that hide required rejection).
AST-facing diagnostics must follow the identity and attribution rules of `12. Diagnostics Specification.md`.
For conformance, required diagnostic identity is keyed by stable machine fields, not rendered sentence text.
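Informative sketch (non-normative): comparing diagnostics by machine fields only. Which fields make up the stable identity is defined by the Diagnostics Specification; the `code`, `file`, `start`, and `end` fields used here, along with the code and file name values, are assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Diagnostic:
    code: str      # hypothetical stable machine code
    file: str
    start: int
    end: int
    message: str   # rendered text; free to change between versions


def identity(diag: Diagnostic) -> Tuple[str, str, int, int]:
    """Conformance key: machine fields only, never the rendered message."""
    return (diag.code, diag.file, diag.start, diag.end)


a = Diagnostic("E_MISSING_CLOSER", "main.pbs", 7, 8, "expected '}'")
b = Diagnostic("E_MISSING_CLOSER", "main.pbs", 7, 8, "missing closing brace")
```

Under this key, `a` and `b` count as the same diagnostic even though their rendered text differs.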
## 12. Conformance and Gate U Evidence
AST conformance evidence is provided through Gate U fixtures as defined in `docs/specs/compiler/13. Conformance Test Specification.md`.
At minimum, Gate U must include:
1. representative valid AST fixtures for mandatory declaration/statement/expression families;
2. malformed/recovery fixtures with deterministic rejection assertions;
3. attribution assertions (`file/start/end`) on required nodes;
4. precedence/associativity shape assertions for representative expressions;
5. mandatory negative families:
   - unexpected token in declaration context,
   - missing required closer (`)`, `}`, or `;`) as applicable,
   - forbidden non-associative chains,
   - unsupported forms outside the active syntax slice.
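Informative sketch (non-normative): a possible layout for negative fixtures and a runner that compares machine-level results. The fixture format, the diagnostic codes, and the stand-in `check_closers` function are hypothetical; the source strings are illustrative and are not claimed to be valid PBS. The stand-in only covers `)` and `}`, not `;`.

```python
CLOSERS = {"(": ")", "{": "}"}


def check_closers(source):
    """Stand-in checker: report (code, offset) pairs for unbalanced delimiters."""
    stack, out = [], []
    for i, ch in enumerate(source):
        if ch in CLOSERS:
            stack.append((ch, i))
        elif ch in CLOSERS.values():
            if stack and CLOSERS[stack[-1][0]] == ch:
                stack.pop()
            else:
                out.append(("E_UNEXPECTED_TOKEN", i))
    # Any opener still on the stack never received its closer.
    out.extend(("E_MISSING_CLOSER", i) for _, i in stack)
    return out


# Each fixture: (name, source text, expected machine-level diagnostics).
FIXTURES = [
    ("balanced", "fn f() { }", []),
    ("missing_brace", "fn f() {", [("E_MISSING_CLOSER", 7)]),
    ("stray_paren", "fn f) { }", [("E_UNEXPECTED_TOKEN", 4)]),
]


def run_fixtures(fixtures, checker):
    """Return the names of fixtures whose actual result differs from the expectation."""
    return [name for name, src, expected in fixtures if checker(src) != expected]
```

Running the fixtures twice and obtaining the same result both times is a cheap way to exercise the determinism requirement.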
## 13. Non-Goals
- Freezing one parser implementation architecture.
- Replacing static semantics/linking decisions with AST-only rules.
- Defining backend/runtime/bytecode/verifier behavior.
## 14. Exit Criteria
This document is healthy when:
1. mandatory AST families and metadata are explicit,
2. recovery/rejection invariants are explicit and test-backed,
3. attribution invariants are explicit and test-backed,
4. lowering preconditions consumed by `13. Lowering IRBackend Specification.md` are unambiguous.