diff --git a/files/TODO.md b/files/TODO.md
new file mode 100644
index 00000000..df284fb4
--- /dev/null
+++ b/files/TODO.md
@@ -0,0 +1,448 @@
+# Prometeu VM/Compiler/Bytecode — Atomic PR Plan (Junie-ready)
+
+> **Entry point contract (confirmed)**
+>
+> * The PBS entry point is **`src/main/modules/main.pbs::frame(): void`**.
+> * The compiler must inject **`FRAME_SYNC` immediately before `RET`** at the end of this function.
+> * `FRAME_SYNC` is a **signal only** (no GC opcodes). The VM uses it as a safe point.
+> * Missing entry point is a **fatal compile error**.
+
+> **Goal**: Deliver a sequence of small, incremental PRs that bring the implementation closer to the published PBS/VM specs.
+>
+> **Rules for Junie (strict)**
+>
+> * **Do not make product decisions.** Only implement what is specified in the PR.
+> * **If anything is unclear, stop and ask.** Do not improvise.
+> * **All new/modified code comments must be in English.**
+> * Each PR must be **atomic** and **mergeable**.
+> * Each PR must include: **Briefing**, **Target**, **Non-goals**, **Implementation notes**, **Tests**, **Acceptance criteria**.
+
+---
+
+## PR-00 — Compiler: Enforce PBS entry point and inject FRAME_SYNC in main.pbs::frame()
+
+### Briefing
+
+PBS requires a single logical entry point: `src/main/modules/main.pbs::frame(): void`. The VM relies on `FRAME_SYNC` as a **signal-only safe point** to perform GC work between logical frames. Today the compiler guarantees neither the existence of this entry point nor the injection of `FRAME_SYNC`.
+
+### Target
+
+1. **Entry point validation** (fatal error at compile time):
+
+   * Ensure the root project contains file `src/main/modules/main.pbs`.
+   * Ensure that file declares `fn frame(): void`.
+   * If missing, emit a **fatal diagnostic** and abort compilation.
+2.
+   **FRAME_SYNC injection** (only for the entry point):
+
+   * In lowering/codegen for `main.pbs::frame(): void`, ensure the epilogue emits:
+
+     * `FRAME_SYNC` **immediately before** `RET`.
+
+### Non-goals
+
+* Do not inject `FRAME_SYNC` into any other function named `frame`.
+* Do not add any GC opcode or GC scheduling metadata into bytecode.
+* Do not change runtime behavior besides the presence of `FRAME_SYNC` at the safe point.
+
+### Implementation notes
+
+* Identify the entry point by **(file path + function name + signature)**:
+
+  * file: `src/main/modules/main.pbs`
+  * function: `frame`
+  * return: `void`
+  * (parameters: must be none)
+* The safest place to inject is at the end of lowering for `Return` in that function:
+
+  * Emit `FRAME_SYNC` just before emitting `RET`.
+* Prefer to implement entry point existence checks in an early phase (project scan / module discovery) so errors are clear.
+
+### Tests
+
+1. **Positive**: project with `main.pbs` and `fn frame(): void` compiles.
+2. **Injection**: compiled bytecode for entry point contains `FRAME_SYNC` right before `RET`.
+
+   * Acceptable test forms:
+
+     * bytecode disasm snapshot test, or
+     * inspect emitted instruction stream before encoding.
+3. **Negative**:
+
+   * Missing `main.pbs` => fatal compile error.
+   * `main.pbs` exists but missing `frame` => fatal compile error.
+   * `frame` exists with wrong signature (params or non-void) => fatal compile error.
+
+### Acceptance criteria
+
+* Compiler rejects projects without the entry point with a clear fatal diagnostic.
+* Compiler injects `FRAME_SYNC` **only** in `main.pbs::frame(): void`.
+* `FRAME_SYNC` is placed **immediately before** `RET` in the entry point epilogue.
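
The PR-00 injection rule could look roughly like the sketch below. This is a minimal illustration, not the compiler's actual lowering code: `Instr`, `FnInfo`, `is_entry_point`, and `emit_return` are hypothetical names, and the real IR and function metadata will differ.

```rust
/// Hypothetical, simplified instruction set (assumption: the real IR differs).
#[derive(Debug, Clone, PartialEq)]
pub enum Instr {
    PushI32(i32),
    FrameSync,
    Ret,
}

/// Illustrative function identity as seen by the lowering pass.
pub struct FnInfo<'a> {
    pub file: &'a str,
    pub name: &'a str,
    pub param_count: usize,
    pub returns_void: bool,
}

/// Entry point file path, taken from the plan above.
pub const ENTRY_FILE: &str = "src/main/modules/main.pbs";

/// Matches on file path + function name + signature, as the notes require.
pub fn is_entry_point(f: &FnInfo) -> bool {
    f.file == ENTRY_FILE && f.name == "frame" && f.param_count == 0 && f.returns_void
}

/// Emits the return sequence. FRAME_SYNC is added only for the entry point,
/// and always immediately before RET.
pub fn emit_return(f: &FnInfo, out: &mut Vec<Instr>) {
    if is_entry_point(f) {
        out.push(Instr::FrameSync);
    }
    out.push(Instr::Ret);
}
```

Matching on the full identity rather than the name alone is what keeps a `frame` function in any other module from receiving a `FRAME_SYNC`.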
+
+---
+
+## PR-01 — Linker: Relocate control-flow jump targets when concatenating modules
+
+### Briefing
+
+Today the linker patches `CALL` and `PUSH_CONST` immediates, but **does not relocate jump targets** (`JMP`, `JMP_IF_*`) after concatenating module bytecode into a single code blob. This breaks cross-module correctness because label resolution in per-module assembly produces addresses relative to each module’s own code segment.
+
+### Target
+
+* In the linker’s relocation pass, patch immediates for:
+
+  * `OpCode::Jmp`
+  * `OpCode::JmpIfTrue`
+  * `OpCode::JmpIfFalse`
+* Add `module_code_offsets[module_index]` to the jump target immediate.
+
+### Non-goals
+
+* No changes to opcode encoding.
+* No changes to verifier.
+* No changes to how labels are resolved in the assembler.
+
+### Implementation notes
+
+* Extend the existing “patch immediates” loop in `Linker::link`.
+* Determine `module_index` from the current iterated module during relocation.
+* Make sure **only jump targets** are adjusted, not fallthrough logic.
+* Add a small helper function for patching immediates to reduce duplication.
+
+### Tests
+
+1. **Unit test** (preferred) in `prometeu-linker` (or wherever `Linker` tests live):
+
+   * Create 2 small modules where module #2 contains a local jump.
+   * Link them.
+   * Assert that the encoded jump immediate in the final program equals `original_target + module2_offset`.
+2. **Integration test** (if a unit test is hard):
+
+   * Build two modules and execute in VM; ensure it reaches expected instruction sequence (e.g., sets a known local/global).
+
+### Acceptance criteria
+
+* Multi-module programs with jumps inside non-first modules execute correctly.
+* Existing call/const relocation remains correct.
+* Tests cover at least one `JMP` and one conditional jump.
+
+---
+
+## PR-02 — VM: Introduce GateId vs heap index and a minimal GatePool (no RC yet)
+
+### Briefing
+
+Current runtime treats `Value::Gate(x)` as a **direct heap base index**.
+The spec requires a **Gate Pool** where `GateId` resolves to `{alive, base, slots, type_id, rc...}` and heap access happens only through gate validation + resolution.
+
+This PR introduces the **data model** without changing ownership/RC yet, enabling later RC work.
+
+### Target
+
+* Add `GateId` type (e.g., `u32`) and a `GateEntry` struct with fields:
+
+  * `alive: bool`
+  * `base: u32`
+  * `slots: u32`
+  * `type_id: u32` (store it, even if VM doesn’t use it yet)
+* Add `GatePool` container to the VM state.
+* Update `ALLOC(type_id, slots)` to:
+
+  * bump-alloc `slots` in heap
+  * insert new `GateEntry { alive: true, base, slots, type_id }`
+  * push `Value::Gate(GateId)` (GateId is index into gate_pool)
+* Update `GATE_LOAD/GATE_STORE` to:
+
+  * validate GateId (in-range + alive)
+  * bounds-check offset against `slots`
+  * translate to heap index: `base + offset`
+
+### Non-goals
+
+* No reference counting yet.
+* No reclaim/free yet.
+* No enforcement of borrow/mutate rules yet.
+
+### Implementation notes
+
+* Keep heap as `Vec` (or existing representation). In this PR, do not change heap layout.
+* Add a helper: `resolve_gate(gate_id) -> Result<&GateEntry, Trap>` and `resolve_gate_mut(...)`.
+* Define two new trap codes (or map onto existing ones if already defined):
+
+  * `TRAP_INVALID_GATE` (gate_id out of range)
+  * `TRAP_DEAD_GATE` (entry exists but `alive == false`)
+* Make sure the VM never reads/writes heap directly for gate operations without resolution.
+
+### Tests
+
+1. VM unit tests:
+
+   * Allocate 2 gates; ensure they get distinct GateIds.
+   * Store to gate offset 0, load back, assert equal.
+   * Storing to offset == slots (OOB) triggers the OOB trap.
+   * Using an invalid GateId triggers the INVALID_GATE trap.
+2. If trap codes are new, test for the exact trap code.
+
+### Acceptance criteria
+
+* `ALLOC` returns GateId (not heap base).
+* `GATE_LOAD/STORE` uses gate_pool resolution.
+* Invalid/dead gate attempts trap deterministically.
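
The PR-02 data model and resolution path might be sketched as below. This is an illustration under assumptions, not the VM's real code: the heap holds `i64` as a stand-in for the actual value type, the trap names are placeholders for whatever codes the codebase defines, and `Vm`, `alloc`, `gate_load`, and `gate_store` are hypothetical method names.

```rust
/// Placeholder trap codes (assumption: the real VM may already define these).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Trap {
    InvalidGate,
    DeadGate,
    OutOfBounds,
}

pub type GateId = u32;

#[derive(Debug, Clone)]
pub struct GateEntry {
    pub alive: bool,
    pub base: u32,
    pub slots: u32,
    pub type_id: u32,
}

#[derive(Default)]
pub struct Vm {
    heap: Vec<i64>, // stand-in for the real heap cell type
    gate_pool: Vec<GateEntry>,
}

impl Vm {
    /// Bump-allocates `slots` cells and returns a GateId, not a heap base.
    pub fn alloc(&mut self, type_id: u32, slots: u32) -> GateId {
        let base = self.heap.len() as u32;
        self.heap.resize(self.heap.len() + slots as usize, 0);
        self.gate_pool.push(GateEntry { alive: true, base, slots, type_id });
        (self.gate_pool.len() - 1) as GateId
    }

    /// Validates that the id is in range and the entry is alive.
    pub fn resolve_gate(&self, id: GateId) -> Result<&GateEntry, Trap> {
        let entry = self.gate_pool.get(id as usize).ok_or(Trap::InvalidGate)?;
        if entry.alive { Ok(entry) } else { Err(Trap::DeadGate) }
    }

    /// Bounds-checks the offset and translates it to a heap index.
    fn heap_index(&self, id: GateId, offset: u32) -> Result<usize, Trap> {
        let entry = self.resolve_gate(id)?;
        if offset >= entry.slots {
            return Err(Trap::OutOfBounds);
        }
        Ok((entry.base + offset) as usize)
    }

    pub fn gate_load(&self, id: GateId, offset: u32) -> Result<i64, Trap> {
        let idx = self.heap_index(id, offset)?;
        Ok(self.heap[idx])
    }

    pub fn gate_store(&mut self, id: GateId, offset: u32, value: i64) -> Result<(), Trap> {
        let idx = self.heap_index(id, offset)?;
        self.heap[idx] = value;
        Ok(())
    }
}
```

Routing both load and store through one private `heap_index` helper is one way to make "never touch the heap without resolution" hard to violate by accident.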
+
+---
+
+## PR-03 — VM: Add strong reference counting (RC) with deterministic retain/release semantics
+
+### Briefing
+
+The spec requires strong RC tracking for gates and deterministic behavior for invalid/dead gates. Today `GATE_RETAIN`/`GATE_RELEASE` are no-ops (or effectively pop-only).
+
+This PR implements **strong_rc** and updates runtime to adjust RC in well-defined places.
+
+### Target
+
+* Extend `GateEntry` with:
+
+  * `strong_rc: u32`
+* Define semantics:
+
+  * New allocation starts with `strong_rc = 1` (gate value returned on stack owns 1 reference)
+  * `GATE_RETAIN`: increment strong_rc
+  * `GATE_RELEASE`: decrement strong_rc; if it reaches 0, mark the gate `alive=false` and schedule reclaim
+
+### Non-goals
+
+* No compacting heap.
+* No weak refs.
+* Reclaim can be minimal (safe-point only) and may not actually reuse memory yet.
+
+### Implementation notes
+
+* Implement a `reclaim_queue: Vec` in VM state.
+* On `strong_rc` reaching 0:
+
+  * set `alive = false`
+  * push gate_id into `reclaim_queue`
+* **Safe point**: drain reclaim queue on `FRAME_SYNC`.
+
+  * Compiler guarantees `FRAME_SYNC` at the end of the PBS entry point `main.pbs::frame(): void` (PR-00).
+* In this PR, reclaim may simply:
+
+  * overwrite the heap range `[base, base+slots)` with `Value::Nil` (or a safe default)
+  * keep gate_id non-reusable
+
+### Tests
+
+1. RC lifecycle test:
+
+   * alloc gate (rc=1)
+   * retain (rc=2)
+   * release (rc=1)
+   * release (rc=0) => gate becomes dead
+   * subsequent load/store traps DEAD_GATE
+2. Reclaim effect test (if you overwrite heap slots):
+
+   * store a value, release to 0, run safe-point
+   * confirm heap region is cleared (only if heap is inspectable in tests)
+
+### Acceptance criteria
+
+* `GATE_RETAIN/RELEASE` changes RC.
+* Gate transitions to `dead` at rc==0.
+* Dead gate access traps.
+* Reclaim happens at the chosen safe point.
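
The PR-03 lifecycle (retain, release, death at zero, reclaim at the safe point) could be sketched as follows. The names `Vm`, `retain`, `release`, and `frame_sync` are illustrative, the heap uses `i64` with `0` standing in for `Value::Nil`, and the queue is assumed to hold gate ids; none of this is confirmed by the codebase.

```rust
/// Placeholder trap codes (assumption).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Trap {
    InvalidGate,
    DeadGate,
}

pub type GateId = u32;

pub struct GateEntry {
    pub alive: bool,
    pub base: u32,
    pub slots: u32,
    pub strong_rc: u32,
}

#[derive(Default)]
pub struct Vm {
    pub heap: Vec<i64>, // stand-in cell type; 0 plays the role of Value::Nil
    pub gate_pool: Vec<GateEntry>,
    pub reclaim_queue: Vec<GateId>,
}

impl Vm {
    /// A new gate starts with strong_rc = 1, owned by the value on the stack.
    pub fn alloc(&mut self, slots: u32) -> GateId {
        let base = self.heap.len() as u32;
        self.heap.resize(self.heap.len() + slots as usize, 0);
        self.gate_pool.push(GateEntry { alive: true, base, slots, strong_rc: 1 });
        (self.gate_pool.len() - 1) as GateId
    }

    fn live_entry(&mut self, id: GateId) -> Result<&mut GateEntry, Trap> {
        let entry = self.gate_pool.get_mut(id as usize).ok_or(Trap::InvalidGate)?;
        if entry.alive { Ok(entry) } else { Err(Trap::DeadGate) }
    }

    pub fn retain(&mut self, id: GateId) -> Result<(), Trap> {
        self.live_entry(id)?.strong_rc += 1;
        Ok(())
    }

    /// Decrements the count; at zero the gate dies and is queued for reclaim.
    pub fn release(&mut self, id: GateId) -> Result<(), Trap> {
        let died = {
            let entry = self.live_entry(id)?;
            entry.strong_rc -= 1; // a live gate always has strong_rc >= 1
            if entry.strong_rc == 0 {
                entry.alive = false;
            }
            !entry.alive
        };
        if died {
            self.reclaim_queue.push(id);
        }
        Ok(())
    }

    /// Safe point: drains the queue and clears each dead gate's heap range.
    /// Gate ids are not reused.
    pub fn frame_sync(&mut self) {
        for id in std::mem::take(&mut self.reclaim_queue) {
            let entry = &self.gate_pool[id as usize];
            let (start, end) = (entry.base as usize, (entry.base + entry.slots) as usize);
            self.heap[start..end].fill(0);
        }
    }
}
```

Note that in this sketch the heap contents survive between the final release and the next `frame_sync`, which is the deferred behavior the reclaim effect test relies on.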
+
+---
+
+## PR-04 — VM: Automatic RC adjustments on stack/local/global/heap moves (no more “manual RC correctness”)
+
+### Briefing
+
+Relying on explicit `GATE_RETAIN/RELEASE` everywhere is error-prone. The spec indicates RC must be adjusted on assignments/pops/stores. This PR makes RC correctness **a VM invariant**: when a gate value is copied into a slot, RC increments; when replaced/dropped, RC decrements.
+
+### Target
+
+* Implement centralized helpers:
+
+  * `inc_rc_if_gate(Value)`
+  * `dec_rc_if_gate(Value)`
+* Apply them in all places where values are moved or overwritten:
+
+  * Stack `PUSH`/`POP` (when dropping values)
+  * Local set/get if they clone values
+  * Global set/get
+  * `GATE_STORE` (heap cell overwrite)
+  * Any instruction that overwrites an existing slot (e.g., `STORE_LOCAL`, `STORE_GLOBAL`, etc.)
+
+### Non-goals
+
+* No changes to compiler output.
+* No borrow/mutate enforcement.
+
+### Implementation notes
+
+* When writing into a slot:
+
+  1. `dec_rc_if_gate(old_value)`
+  2. write new value
+  3. `inc_rc_if_gate(new_value)` **only if the semantics are “copy into slot”**
+* When moving (not copying) is possible, avoid double inc/dec.
+* If the VM uses `clone()` widely, be explicit about when RC should increase.
+
+> ⚠️ If it’s unclear whether an opcode is “move” or “copy”, **stop and ask** (do not guess).
+
+### Tests
+
+1. Stack drop test:
+
+   * alloc gate, push into local, pop stack, ensure rc doesn’t underflow.
+2. Overwrite test:
+
+   * local = gateA, then local = gateB
+   * rc of gateA decremented
+   * gateA becomes dead if no other refs
+3. Heap store overwrite test:
+
+   * gateX stores gateA into offset 0
+   * then stores gateB into same offset
+   * rc adjusts accordingly
+
+### Acceptance criteria
+
+* No RC leaks on overwrites.
+* No premature dead gates when references still exist.
+* Tests cover overwrite in at least 2 storage kinds (local + heap).
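
The three-step slot write from the PR-04 notes could be sketched as below, assuming copy-into-slot semantics. `Vm`, `store_local`, and the flat `strong_rc` vector (standing in for `GateEntry.strong_rc`) are hypothetical simplifications; gate death and reclaim are left out so that only the counting is visible.

```rust
/// Simplified runtime value (assumption: the real enum has more variants).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Value {
    Nil,
    Int(i64),
    Gate(u32),
}

#[derive(Default)]
pub struct Vm {
    /// Indexed by gate id; stand-in for the per-gate strong_rc field.
    pub strong_rc: Vec<u32>,
    pub locals: Vec<Value>,
}

impl Vm {
    pub fn inc_rc_if_gate(&mut self, v: Value) {
        if let Value::Gate(id) = v {
            self.strong_rc[id as usize] += 1;
        }
    }

    pub fn dec_rc_if_gate(&mut self, v: Value) {
        if let Value::Gate(id) = v {
            let rc = &mut self.strong_rc[id as usize];
            // Underflow would mean the RC invariant is already broken.
            *rc = rc.checked_sub(1).expect("strong_rc underflow");
        }
    }

    /// Overwrites a local with copy-into-slot semantics, in the order
    /// given in the notes: release old, write, retain new.
    pub fn store_local(&mut self, slot: usize, new_value: Value) {
        let old_value = self.locals[slot];
        self.dec_rc_if_gate(old_value);
        self.locals[slot] = new_value;
        self.inc_rc_if_gate(new_value);
    }
}
```

One case worth checking against the real VM: if the old and new values are the same gate and the slot holds its only reference, decrementing first would drive the count to zero before the increment. With the PR-03 death rule in place, that ordering would kill the gate, so the real helper may need to retain first or special-case self-assignment.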
+
+---
+
+## PR-05 — VM: Implement (debug-mode) Borrow/Mutate/Peek scopes as observable state
+
+### Briefing
+
+Currently `GATE_BEGIN_PEEK/BORROW/MUTATE` and `GATE_END_*` are no-ops. Even if v0 is permissive, the VM should at least track scope state to enable future enforcement and better diagnostics.
+
+### Target
+
+* Add per-gate “scope counters” (or a small state machine) in `GateEntry`:
+
+  * `peek_count: u32`
+  * `borrow_count: u32`
+  * `mutate_count: u32` (should be 0/1 if exclusive)
+* Implement opcodes to increment/decrement and validate balanced usage:
+
+  * End without begin => trap (or panic if considered VM bug)
+  * Negative underflow => trap
+* In **debug builds**, optionally enforce:
+
+  * cannot `begin_mutate` when `borrow_count>0`
+  * cannot `begin_borrow` when `mutate_count>0`
+
+### Non-goals
+
+* No compiler changes.
+* No runtime copy-back scratch buffers yet.
+
+### Implementation notes
+
+* Keep enforcement behind a feature flag or debug-only cfg.
+* Always keep counters balanced; mismatches should be deterministic.
+
+### Tests
+
+* Begin/End balance test per scope.
+* Debug-only conflict test (if enabled).
+
+### Acceptance criteria
+
+* Scopes are no longer no-ops.
+* Misbalanced begin/end produces deterministic error.
+
+---
+
+## PR-06 — Bytecode/VM: Represent strings as ConstId (dedup + stable value size)
+
+### Briefing
+
+`Value::String(String)` stores dynamic payload in runtime values. The spec direction prefers string refs into constant pools for stability and dedup. This PR migrates runtime string values to `ConstId` references.
+
+### Target
+
+* Add `Value::StringRef(ConstId)` (or `Value::String(ConstId)`)
+* Ensure program image contains a string pool.
+* Update `PUSH_CONST` behavior for string constants:
+
+  * push `StringRef(id)` instead of allocating a runtime `String`
+
+### Non-goals
+
+* No interning beyond the existing constant pool.
+* No changes to the source language.
+
+### Implementation notes
+
+* Decide where the string pool lives (ProgramImage / ConstPool).
+* Update debug printing and trap formatting if needed.
+
+### Tests
+
+* Constant string pushed twice references same ConstId.
+* Equality/comparison behavior remains unchanged (if supported).
+
+### Acceptance criteria
+
+* Strings in runtime values are pool references.
+* Existing programs using constants still run.
+
+---
+
+## PR-07 — Compiler: Enforce import placement rules (top-of-file)
+
+### Briefing
+
+The PBS module model specifies that imports must be at the top-level (and typically before other declarations). This PR makes the compiler reject invalid import placement to align with the module/linking spec.
+
+### Target
+
+* In parser/collector phase, detect imports appearing after non-import declarations.
+* Emit a diagnostic with a clear message and span.
+
+### Non-goals
+
+* No changes to how linking works.
+* No auto-fix.
+
+### Implementation notes
+
+* Track a boolean `seen_non_import_decl` during file scan.
+* When an import is encountered and the flag is set, produce an error diagnostic.
+
+### Tests
+
+* One file with valid imports at top => ok.
+* One file with import after a function/type => error.
+
+### Acceptance criteria
+
+* Compiler rejects invalid import placement with stable diagnostic.
+
+---
+
+# Suggested merge order
+
+1. PR-00 (entry point validation + FRAME_SYNC injection)
+2. PR-01 (linker JMP relocation)
+3. PR-02 (GatePool + GateId)
+4. PR-03 (RC strong + retain/release + reclaim at FRAME_SYNC)
+5. PR-04 (automatic RC on moves/overwrites)
+6. PR-05 (scope tracking)
+7. PR-06 (StringRef ConstId)
+8. PR-07 (import placement enforcement)
+
+---
+
+# Open questions (must be answered before PR-04)
+
+> Junie must stop and ask if any of these cannot be determined from the codebase.
+
+1. Which exact opcodes correspond to storage overwrites (locals/globals) in the current VM implementation (names may differ).
+2.
+   Which trap codes are already defined for gate errors, and which ones must be newly introduced.
+
+> ✅ Note: the safe point for reclaim is **FRAME_SYNC** (compiler inserts it at the end of `fn frame(): void`).
diff --git a/test-cartridges/test01/cartridge/program.pbc b/test-cartridges/test01/cartridge/program.pbc
index 13dbb99e..af7a5f4b 100644
Binary files a/test-cartridges/test01/cartridge/program.pbc and b/test-cartridges/test01/cartridge/program.pbc differ
diff --git a/test-cartridges/test01/src/main/modules/main.pbs b/test-cartridges/test01/src/main/modules/main.pbs
index e96f2f2a..bc375c85 100644
--- a/test-cartridges/test01/src/main/modules/main.pbs
+++ b/test-cartridges/test01/src/main/modules/main.pbs
@@ -35,7 +35,7 @@ fn add2(a: int, b: int): int {
 
 fn frame(): void {
     let zero = Vec2.ZERO;
-    let zz = add(zero.getX(), zero.getY());
+    let zz = add2(zero.getX(), zero.getY());
 
     // 1. Locals & Arithmetic
     let x = 10;