prometeu-studio/docs/specs/random/pbs/PBS - Language Syntax Specification v0.md
2026-03-24 13:42:16 +00:00

# PBS - Language Syntax Specification v0
Status: Draft (Normative for FE rewrite)
Scope: PBS v0 Core syntax (lexer + parser contract)
Language: English only
## 1. Goals
This document defines the canonical PBS v0 Core syntax.
It is designed to:
- be deterministic to parse,
- be stable for tests and tooling,
- align with runtime authority,
- provide a clear baseline for the frontend rewrite.
This document defines syntax only. Runtime behavior, bytecode encoding, scheduling, and GC internals are out of scope.
## 2. Authority and precedence
Normative precedence order:
1. Runtime authority (`docs/specs/hardware/topics/chapter-2.md`, `chapter-3.md`, `chapter-9.md`, `chapter-12.md`, `chapter-16.md`)
2. Bytecode authority (`docs/specs/bytecode/ISA_CORE.md`)
3. This syntax specification
4. Legacy references (`docs/specs/pbs_old/*`)
If a syntax rule from legacy material conflicts with runtime or bytecode authority, that rule is invalid.
## 3. Source model
- Source encoding: UTF-8.
- Line terminators: `\n` and `\r\n` are both valid.
- Whitespace is insignificant except as a token separator.
- A source file is a declaration unit; top-level executable statements are forbidden.
## 4. Lexical specification
### 4.1 Tokens
The lexer must produce at least:
- identifiers,
- keywords,
- numeric literals,
- string literals,
- punctuation,
- operators,
- comments,
- EOF.
Each token must carry a source span (byte offsets).
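A minimal sketch of how a token could carry a byte-offset span. The names `Span`, `Token`, `byte_span`, and `token_text` are hypothetical, and the half-open `[start, end)` convention is an assumption; this document only requires that spans be expressed in byte offsets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # byte offset of the first byte (assumed inclusive)
    end: int    # byte offset one past the last byte (assumed exclusive)


@dataclass(frozen=True)
class Token:
    kind: str
    span: Span


def byte_span(source: str, char_start: int, char_end: int) -> Span:
    """Convert character indices in a Python str to UTF-8 byte offsets."""
    start = len(source[:char_start].encode("utf-8"))
    end = start + len(source[char_start:char_end].encode("utf-8"))
    return Span(start, end)


def token_text(source: str, token: Token) -> str:
    """Recover the token text by slicing the UTF-8 encoded source."""
    data = source.encode("utf-8")
    return data[token.span.start:token.span.end].decode("utf-8")
```

Because the source encoding is UTF-8, character indices and byte offsets diverge as soon as a non-ASCII character appears, which is why the conversion is done explicitly here.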
### 4.2 Comments
- Line comment: `// ...` until line end.
- Block comments are not part of v0 Core.
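As an illustration, a lexer could locate the start of a line comment as sketched below. The function name `find_line_comment` is hypothetical. The sketch assumes that `//` inside a string literal does not start a comment, which this document does not state explicitly.

```python
def find_line_comment(line: str):
    """Return the index where a // comment starts on this line, or None.

    Assumption: // inside a double-quoted string literal is not a comment.
    """
    in_string = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_string:
            if ch == "\\":
                i += 2  # skip the escaped character
                continue
            if ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif line.startswith("//", i):
            return i
        i += 1
    return None
```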
### 4.3 Identifiers
Identifier:
- starts with `_` or alphabetic character,
- continues with `_`, alphabetic, or digit characters.
Keywords cannot be used as identifiers.
### 4.4 Keywords
Active keywords in `.pbs` files (v0 Core):
- `import`, `from`, `as`
- `service`, `fn`
- `declare`, `struct`, `contract`, `error`
- `let`
- `if`, `else`, `when`, `then`, `for`, `in`, `return`
- `true`, `false`
Barrel-only keywords:
- `pub`, `mod`
- `type`
Reserved (not active in the `.pbs` v0 Core grammar):
- `host`, `handle`
- `alloc`, `borrow`, `mutate`, `peek`, `take`, `weak`
- `spawn`, `yield`, `sleep`
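The identifier rule in 4.3 and the keyword sets above can be sketched as follows. The names `scan_word` and `classify_word` are hypothetical. Two assumptions are made that this document does not settle: "alphabetic" is read as Python's `str.isalpha` (Unicode-aware) with ASCII digits only, and barrel-only and reserved words are rejected as identifiers in the same way as active keywords.

```python
ACTIVE = {
    "import", "from", "as", "service", "fn",
    "declare", "struct", "contract", "error", "let",
    "if", "else", "when", "for", "in", "return", "true", "false",
    "then",  # used by the WhenExpr rule in Section 9
}
BARREL_ONLY = {"pub", "mod", "type"}
RESERVED = {
    "host", "handle",
    "alloc", "borrow", "mutate", "peek", "take", "weak",
    "spawn", "yield", "sleep",
}


def _is_continue(ch: str) -> bool:
    return ch == "_" or ch.isalpha() or ch in "0123456789"


def scan_word(source: str, pos: int):
    """Return (text, end) for an identifier-shaped word at pos, else None."""
    if pos >= len(source):
        return None
    first = source[pos]
    if not (first == "_" or first.isalpha()):
        return None
    end = pos + 1
    while end < len(source) and _is_continue(source[end]):
        end += 1
    return source[pos:end], end


def classify_word(word: str) -> str:
    """Classify a scanned word; only 'identifier' may be used as a name."""
    if word in ACTIVE:
        return "keyword"
    if word in BARREL_ONLY:
        return "barrel_keyword"
    if word in RESERVED:
        return "reserved"
    return "identifier"
```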
### 4.5 Literals
Numeric literals:
- `IntLit`: decimal integer (`0`, `42`, `1000`)
- `FloatLit`: decimal float with dot (`3.14`, `0.5`)
- `BoundedLit`: decimal integer with `b` suffix (`0b`, `255b`)
String literals:
- delimited by `"`.
- support escapes: `\\`, `\"`, `\n`, `\r`, `\t`.
Booleans:
- `true`, `false`.
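A hedged sketch of literal scanning under the rules above. The names `scan_number` and `decode_string` are hypothetical. The sketch assumes that a float needs digits on both sides of the dot (as in every example given) and that a `b` suffix followed by further identifier characters is not treated as a bounded literal; neither point is fixed by this document.

```python
import re

# Order matters: try the most specific form first.
_NUMBER_FORMS = (
    ("BoundedLit", re.compile(r"[0-9]+b(?![A-Za-z0-9_])")),
    ("FloatLit", re.compile(r"[0-9]+\.[0-9]+")),
    ("IntLit", re.compile(r"[0-9]+")),
)

_ESCAPES = {"\\": "\\", '"': '"', "n": "\n", "r": "\r", "t": "\t"}


def scan_number(source: str, pos: int):
    """Return (kind, text, end) for a numeric literal at pos, else None."""
    for kind, pattern in _NUMBER_FORMS:
        m = pattern.match(source, pos)
        if m:
            return kind, m.group(), m.end()
    return None


def decode_string(source: str, pos: int):
    """Decode a string literal whose opening quote is at pos.

    Returns (value, end). Raises ValueError on an unknown escape or a
    missing closing quote.
    """
    if source[pos] != '"':
        raise ValueError("string literal must start with a double quote")
    out = []
    i = pos + 1
    while i < len(source):
        ch = source[i]
        if ch == '"':
            return "".join(out), i + 1
        if ch == "\\":
            if i + 1 >= len(source) or source[i + 1] not in _ESCAPES:
                raise ValueError(f"invalid escape at offset {i}")
            out.append(_ESCAPES[source[i + 1]])
            i += 2
            continue
        out.append(ch)
        i += 1
    raise ValueError("unterminated string literal")
```

Requiring a digit after the dot means the start of a range such as `1..5` scans as the `IntLit` `1` rather than as a malformed float.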
## 5. Module and barrel model
### 5.1 Required files
A module is valid only if it contains:
- one or more `.pbs` source files,
- exactly one `mod.barrel` file.
Missing `mod.barrel` is a compile-time error.
### 5.2 Barrel responsibility
`mod.barrel` is the only place where module visibility is defined.
Visibility levels:
- `mod`: visible across files in the same module.
- `pub`: visible to other modules through imports.
Any top-level declaration not listed in `mod.barrel` is file-private.
Using `mod` or `pub` as top-level declaration modifiers in `.pbs` files is a syntax error.
### 5.3 Barrel grammar
```ebnf
BarrelFile ::= BarrelItem* EOF
BarrelItem ::= BarrelVisibility BarrelKind Identifier ';'
BarrelVisibility ::= 'mod' | 'pub'
BarrelKind ::= 'fn' | 'type' | 'service'
```
Rules:
- `mod.barrel` cannot declare aliases.
- Barrel item order has no semantic meaning.
- The same symbol cannot appear more than once in `mod.barrel`.
- Each barrel item must resolve to an existing top-level declaration in the module.
- Alias syntax is allowed only in `import` declarations.
- Importing modules may only import symbols marked as `pub` in the target module barrel.
Examples:
```barrel
mod fn clamp;
pub fn sum;
pub type Vector;
mod service Audio;
```
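The barrel grammar and the duplicate and resolution rules can be sketched as below. The names `parse_barrel` and `validate_barrel` are hypothetical, identifiers are restricted to ASCII for brevity, and comments inside `mod.barrel` are not handled because this document does not say whether they are allowed. Duplicates are detected by symbol name alone, which is an assumption.

```python
import re

_ITEM = re.compile(
    r"\s*(mod|pub)\s+(fn|type|service)\s+([A-Za-z_][A-Za-z0-9_]*)\s*;"
)


def parse_barrel(text: str):
    """Parse mod.barrel text into a list of (visibility, kind, name)."""
    items = []
    pos = 0
    while True:
        m = _ITEM.match(text, pos)
        if not m:
            break
        items.append(m.groups())
        pos = m.end()
    if text[pos:].strip():
        raise ValueError(f"unexpected barrel content at offset {pos}")
    return items


def validate_barrel(items, declared):
    """Check barrel items against the module's top-level declarations.

    declared maps a symbol name to its kind ('fn', 'type', or 'service').
    Returns a list of error messages in barrel order.
    """
    seen = set()
    errors = []
    for _visibility, kind, name in items:
        if name in seen:
            errors.append(f"duplicate barrel entry: {name}")
        seen.add(name)
        if declared.get(name) != kind:
            errors.append(f"unresolved barrel entry: {kind} {name}")
    return errors
```

Reporting errors in barrel order keeps the diagnostics deterministic even though item order itself has no semantic meaning.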
## 6. File and declaration grammar
EBNF conventions used:
- `A?` optional
- `A*` zero or more
- `A+` one or more
- `A{m,n}` between `m` and `n` repetitions, inclusive
- terminals in single quotes
### 6.1 File
```ebnf
File ::= ImportDecl* TopDecl* EOF
```
### 6.2 Imports
Imports must target modules, never files.
```ebnf
ImportDecl ::= 'import' ( ModuleRef | '{' ImportList '}' 'from' ModuleRef ) ';'
ImportList ::= ImportItem (',' ImportItem)*
ImportItem ::= Identifier ('as' Identifier)?
ModuleRef ::= '@' Identifier ':' ModulePath
ModulePath ::= Identifier ('/' Identifier)*
```
Examples:
```pbs
import @core:math;
import { Vector, Matrix as Mat } from @core:math;
```
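A minimal sketch of recognizing the two import forms with regular expressions. The function name `parse_import` and the result keys are hypothetical; in particular, `scope` is an invented label for the identifier that follows `@`, which the grammar leaves unnamed. A token-based parser would also accept whitespace between the tokens of a `ModuleRef`; this sketch does not.

```python
import re

_IDENT = r"[A-Za-z_][A-Za-z0-9_]*"  # ASCII-only for brevity (assumption)
_MODULE_REF = (
    r"@(?P<scope>" + _IDENT + r"):(?P<path>" + _IDENT + r"(?:/" + _IDENT + r")*)"
)
_WHOLE = re.compile(r"import\s+" + _MODULE_REF + r"\s*;")
_NAMED = re.compile(
    r"import\s*\{(?P<items>[^}]*)\}\s*from\s+" + _MODULE_REF + r"\s*;"
)
_ITEM = re.compile(r"(" + _IDENT + r")(?:\s+as\s+(" + _IDENT + r"))?")


def parse_import(line: str):
    """Parse one ImportDecl into a dict, or raise ValueError."""
    line = line.strip()
    m = _WHOLE.fullmatch(line)
    if m:
        return {"scope": m["scope"], "path": m["path"].split("/"), "items": None}
    m = _NAMED.fullmatch(line)
    if not m:
        raise ValueError(f"malformed import: {line!r}")
    items = []
    for raw in m["items"].split(","):
        im = _ITEM.fullmatch(raw.strip())
        if not im:
            raise ValueError(f"malformed import item: {raw.strip()!r}")
        name, alias = im.groups()
        items.append((name, alias or name))  # (imported name, local name)
    return {"scope": m["scope"], "path": m["path"].split("/"), "items": items}
```

A file-like target such as `@core:math.pbs` is rejected because `.` is not part of `ModulePath`, which matches the rule that imports target modules and never files.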
### 6.3 Top-level declarations
```ebnf
TopDecl ::= TypeDecl | ServiceDecl | FunctionDecl
```
Top-level `let` and top-level statements are not allowed.
### 6.4 Type declarations
```ebnf
TypeDecl ::= 'declare' TypeKind Identifier TypeBody
TypeKind ::= 'struct' | 'contract' | 'error'
TypeBody ::= '{' TypeMember* '}'
TypeMember ::= FieldDecl | FnSigDecl
FieldDecl ::= Identifier ':' TypeRef ';'
FnSigDecl ::= 'fn' Identifier ParamList ReturnType? ';'
```
### 6.5 Services
```ebnf
ServiceDecl ::= 'service' Identifier (':' Identifier)? ServiceBody
ServiceBody ::= '{' ServiceMember* '}'
ServiceMember ::= 'fn' Identifier ParamList ReturnType? Block
```
### 6.6 Functions
```ebnf
FunctionDecl ::= 'fn' Identifier ParamList ReturnType? ElseFallback? Block
ParamList ::= '(' (Param (',' Param)*)? ')'
Param ::= Identifier ':' TypeRef
ReturnType ::= ':' TypeRef
ElseFallback ::= 'else' Expr
```
## 7. Type syntax
```ebnf
TypeRef ::= TypePrimary
TypePrimary ::= SimpleType | GenericType | TupleType
SimpleType ::= Identifier
GenericType ::= Identifier '<' TypeRef (',' TypeRef)* '>'
TupleType ::= '(' TypeRef ',' TypeRef (',' TypeRef){0,4} ')'
```
Runtime alignment:
- Tuple type arity in v0 Core is 2..6.
- This aligns with runtime multi-return slot limits.
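The type grammar and the 2..6 tuple arity limit can be sketched as a small recursive-descent parser. The names `tokenize_type` and `parse_type` and the tuple-based node shapes are hypothetical, identifiers are ASCII-only for brevity, and the type names used in the usage note are illustrative rather than taken from this document.

```python
import re

_TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|[<>(),])")
_PUNCT = {"<", ">", "(", ")", ","}


def tokenize_type(text: str):
    """Split a type expression into identifier and punctuation tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = _TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_type(tokens, i=0):
    """Parse one TypeRef at tokens[i]; return (node, next_index)."""
    if i >= len(tokens):
        raise ValueError("expected a type")
    tok = tokens[i]
    if tok == "(":
        elems, i = _parse_list(tokens, i + 1, ")")
        if not 2 <= len(elems) <= 6:
            raise ValueError(f"tuple arity {len(elems)} is outside 2..6")
        return ("tuple", elems), i
    if tok in _PUNCT:
        raise ValueError(f"unexpected token {tok!r}")
    if i + 1 < len(tokens) and tokens[i + 1] == "<":
        args, i = _parse_list(tokens, i + 2, ">")
        return ("generic", tok, args), i
    return ("simple", tok), i + 1


def _parse_list(tokens, i, closer):
    """Parse TypeRef (',' TypeRef)* followed by the closing token."""
    items = []
    while True:
        node, i = parse_type(tokens, i)
        items.append(node)
        if i < len(tokens) and tokens[i] == ",":
            i += 1
            continue
        if i < len(tokens) and tokens[i] == closer:
            return items, i + 1
        raise ValueError(f"expected ',' or {closer!r}")
```

For example, `parse_type(tokenize_type("(int, list<float>)"))` yields a two-element tuple node, while `(int)` and any seven-element tuple are rejected by the arity check.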
## 8. Statements and blocks
```ebnf
Block ::= '{' Stmt* TailExpr? '}'
Stmt ::= LetStmt | ReturnStmt | IfStmt | ForStmt | ExprStmt
TailExpr ::= Expr
LetStmt ::= 'let' Identifier (':' TypeRef)? '=' Expr ';'
ReturnStmt ::= 'return' Expr? ';'
ExprStmt ::= Expr ';'
IfStmt ::= 'if' Expr Block ('else' (IfStmt | Block))?
ForStmt ::= 'for' Identifier 'in' RangeExpr Block
RangeExpr ::= '[' Expr? '..' Expr? ']'
```
Notes:
- `if` is a statement in v0 Core.
- `when` is an expression.
- `break` and `continue` are deferred from v0 Core syntax.
## 9. Expression grammar and precedence
Assignment is not an expression in v0 Core.
```ebnf
Expr ::= WhenExpr
WhenExpr ::= 'when' OrExpr 'then' Expr 'else' Expr | OrExpr
OrExpr ::= AndExpr ('||' AndExpr)*
AndExpr ::= EqualityExpr ('&&' EqualityExpr)*
EqualityExpr ::= CompareExpr (('==' | '!=') CompareExpr)?
CompareExpr ::= CastExpr (('<' | '<=' | '>' | '>=') CastExpr)?
CastExpr ::= AddExpr ('as' TypeRef)*
AddExpr ::= MulExpr (('+' | '-') MulExpr)*
MulExpr ::= UnaryExpr (('*' | '/' | '%') UnaryExpr)*
UnaryExpr ::= ('!' | '-') UnaryExpr | CallExpr
CallExpr ::= PrimaryExpr ('(' ArgList? ')')*
ArgList ::= Expr (',' Expr)*
Literal ::= IntLit | FloatLit | BoundedLit | StringLit | BoolLit
BoolLit ::= 'true' | 'false'
PrimaryExpr ::= Literal | Identifier | GroupExpr | Block
GroupExpr ::= '(' Expr ')'
```
Non-associative constraints:
- `a < b < c` is invalid.
- `a == b == c` is invalid.
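The precedence levels and the non-associative constraints can be illustrated with a recursive-descent sketch over a subset of the grammar. The class name `ExprParser` and the tuple-based AST are hypothetical. The sketch covers `OrExpr` down to `UnaryExpr`, plus integer literals, identifiers, and parenthesized groups; `when`, casts, calls, and blocks are omitted for brevity.

```python
import re

_TOKEN = re.compile(
    r"\s*(\|\||&&|==|!=|<=|>=|[<>+\-*/%!()]|[A-Za-z_][A-Za-z0-9_]*|[0-9]+)"
)


class ExprParser:
    def __init__(self, text: str):
        self.tokens, self.i = [], 0
        pos, text = 0, text.rstrip()
        while pos < len(text):
            m = _TOKEN.match(text, pos)
            if not m:
                raise ValueError(f"unexpected character at offset {pos}")
            self.tokens.append(m.group(1))
            pos = m.end()

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.i += 1
        return tok

    def parse(self):
        node = self.or_expr()
        if self.peek() is not None:
            raise ValueError(f"unexpected token {self.peek()!r}")
        return node

    def _left_assoc(self, operand, ops):
        node = operand()
        while self.peek() in ops:
            op = self.take()
            node = (op, node, operand())
        return node

    def _non_assoc(self, operand, ops):
        # At most one operator: a second one is left unconsumed and is
        # reported as an unexpected token by the caller.
        node = operand()
        if self.peek() in ops:
            op = self.take()
            node = (op, node, operand())
        return node

    def or_expr(self):
        return self._left_assoc(self.and_expr, {"||"})

    def and_expr(self):
        return self._left_assoc(self.equality, {"&&"})

    def equality(self):
        return self._non_assoc(self.compare, {"==", "!="})

    def compare(self):
        return self._non_assoc(self.add, {"<", "<=", ">", ">="})

    def add(self):
        return self._left_assoc(self.mul, {"+", "-"})

    def mul(self):
        return self._left_assoc(self.unary, {"*", "/", "%"})

    def unary(self):
        if self.peek() in {"!", "-"}:
            return (self.take(), self.unary())
        return self.primary()

    def primary(self):
        tok = self.take()
        if tok == "(":
            node = self.or_expr()
            if self.take() != ")":
                raise ValueError("expected ')'")
            return node
        if tok is None or not (tok[0].isalnum() or tok[0] == "_"):
            raise ValueError(f"unexpected token {tok!r}")
        return tok
```

With this structure, `a < b == c < d` parses as one equality over two comparisons, while `a < b < c` and `a == b == c` fail because the second operator is never consumed.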
## 10. Runtime authority alignment constraints
These are hard constraints for frontend and syntax decisions:
- Runtime is deterministic and frame-synchronized.
- `FRAME_SYNC` is a runtime safepoint, not surface syntax.
- Heap semantics are GC-based at runtime authority level.
- Host interaction is via `SYSCALL` in bytecode and host ABI mapping.
- Syscalls are callable but not first-class.
Syntax implications for v0 Core:
- No RC/HIP/gate-specific syntax is active.
- No closure literal syntax in v0 Core.
- No coroutine syntax (`spawn`, `yield`, `sleep`) in v0 Core.
## 11. Deferred syntax (explicitly out of v0 Core)
Deferred for later profiles:
- heap-specialized syntax: `alloc`, `borrow`, `mutate`, `peek`, `take`, `weak`
- first-class closure/lambda surface syntax
- coroutine surface syntax
- pattern matching
- macro system
The corresponding keywords (Section 4.4) stay reserved so that later profiles do not break source compatibility.
## 12. Conformance requirements for frontend rewrite
A v0 Core frontend is conformant if:
- tokenization follows this lexical spec,
- barrel parsing and validation follow Section 5,
- parsing follows this grammar and precedence,
- the parser is deterministic,
- each token and AST node keeps stable spans,
- forbidden constructs produce deterministic diagnostics.
## 13. Minimal canonical examples
### 13.1 Function
```pbs
fn sum(a: int, b: int): int {
    return a + b;
}
```
### 13.2 Local function + barrel visibility
```pbs
fn clamp(x: int, lo: int, hi: int): int {
    if x < lo {
        return lo;
    }
    if x > hi {
        return hi;
    }
    return x;
}
```
```barrel
mod fn clamp;
```
### 13.3 Imports
```pbs
import @core:math;
import { Vec2 as V2 } from @core:math;
```