add TODO.md

bQUARKz 2026-02-07 17:16:19 +00:00
parent d994005f8b
commit 55e97f4407
Signed by: bquarkz
SSH Key Fingerprint: SHA256:Z7dgqoglWwoK6j6u4QC87OveEq74WOhFN+gitsxtkf8
3 changed files with 449 additions and 1 deletions

448
files/TODO.md Normal file

@@ -0,0 +1,448 @@
# Prometeu VM/Compiler/Bytecode — Atomic PR Plan (Junie-ready)
> **Entry point contract (confirmed)**
>
> * The PBS entry point is **`/src/main/modules/main.pbs::frame(): void`**.
> * The compiler must inject **`FRAME_SYNC` immediately before `RET`** at the end of this function.
> * `FRAME_SYNC` is a **signal only** (no GC opcodes). The VM uses it as a safe point.
> * Missing entry point is a **fatal compile error**.
> **Goal**: Deliver a sequence of small, incremental PRs that bring the implementation closer to the published PBS/VM specs.
>
> **Rules for Junie (strict)**
>
> * **Do not make product decisions.** Only implement what is specified in the PR.
> * **If anything is unclear, stop and ask.** Do not improvise.
> * **All new/modified code comments must be in English.**
> * Each PR must be **atomic** and **mergeable**.
> * Each PR must include: **Briefing**, **Target**, **Non-goals**, **Implementation notes**, **Tests**, **Acceptance criteria**.
---
## PR-00 — Compiler: Enforce PBS entry point and inject FRAME_SYNC in main.pbs::frame()
### Briefing
PBS requires a single logical entry point: `src/main/modules/main.pbs::frame(): void`. The VM relies on `FRAME_SYNC` as a **signal-only safe point** to perform GC work between logical frames. Today the compiler guarantees neither the existence of this entry point nor the injection of `FRAME_SYNC`.
### Target
1. **Entry point validation** (fatal error at compile time):
* Ensure the root project contains file `src/main/modules/main.pbs`.
* Ensure that file declares `fn frame(): void`.
* If missing, emit a **fatal diagnostic** and abort compilation.
2. **FRAME_SYNC injection** (only for the entry point):
* In lowering/codegen for `main.pbs::frame(): void`, ensure the epilogue emits:
* `FRAME_SYNC` **immediately before** `RET`.
### Non-goals
* Do not inject `FRAME_SYNC` into any other function named `frame`.
* Do not add any GC opcode or GC scheduling metadata into bytecode.
* Do not change runtime behavior besides the presence of `FRAME_SYNC` at the safe point.
### Implementation notes
* Identify the entry point by **(file path + function name + signature)**:
* file: `src/main/modules/main.pbs`
* function: `frame`
* return: `void`
* (parameters: must be none)
* The safest place to inject is at the end of lowering for `Return` in that function:
* Emit `FRAME_SYNC` just before emitting `RET`.
* Prefer to implement entry point existence checks in an early phase (project scan / module discovery) so errors are clear.
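The injection rule above can be sketched as follows. This is a minimal illustration, not the compiler's actual lowering code: `Instr`, `FnInfo`, `is_entry_point`, and `emit_return` are hypothetical names, and the real IR and function metadata will differ.

```rust
/// Hypothetical, simplified instruction set; the real compiler IR will differ.
#[derive(Debug, Clone, PartialEq)]
pub enum Instr {
    PushConst(u32),
    FrameSync,
    Ret,
}

/// Hypothetical description of the function currently being lowered.
pub struct FnInfo<'a> {
    pub file: &'a str,
    pub name: &'a str,
    pub param_count: usize,
    pub returns_void: bool,
}

/// Entry point file path, as stated in the plan.
pub const ENTRY_FILE: &str = "src/main/modules/main.pbs";

/// True only for `main.pbs::frame(): void` (file path + name + signature).
pub fn is_entry_point(f: &FnInfo<'_>) -> bool {
    f.file == ENTRY_FILE && f.name == "frame" && f.param_count == 0 && f.returns_void
}

/// Lower a `Return`: emit `FRAME_SYNC` immediately before `RET`,
/// but only when lowering the entry point.
pub fn emit_return(f: &FnInfo<'_>, out: &mut Vec<Instr>) {
    if is_entry_point(f) {
        out.push(Instr::FrameSync);
    }
    out.push(Instr::Ret);
}
```

Matching on the full (path, name, signature) triple is what keeps a function named `frame` in any other module from receiving the signal.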
### Tests
1. **Positive**: project with `main.pbs` and `fn frame(): void` compiles.
2. **Injection**: compiled bytecode for entry point contains `FRAME_SYNC` right before `RET`.
* Acceptable test forms:
* bytecode disasm snapshot test, or
* inspect emitted instruction stream before encoding.
3. **Negative**:
* Missing `main.pbs` => fatal compile error.
* `main.pbs` exists but missing `frame` => fatal compile error.
* `frame` exists with wrong signature (params or non-void) => fatal compile error.
### Acceptance criteria
* Compiler rejects projects without the entry point with a clear fatal diagnostic.
* Compiler injects `FRAME_SYNC` **only** in `main.pbs::frame(): void`.
* `FRAME_SYNC` is placed **immediately before** `RET` in the entry point epilogue.
---
## PR-01 — Linker: Relocate control-flow jump targets when concatenating modules
### Briefing
Today the linker patches `CALL` and `PUSH_CONST` immediates, but **does not relocate jump targets** (`JMP`, `JMP_IF_*`) after concatenating module bytecode into a single code blob. This breaks cross-module correctness because label resolution in per-module assembly produces addresses relative to each module's own code segment.
### Target
* In the linker's relocation pass, patch immediates for:
* `OpCode::Jmp`
* `OpCode::JmpIfTrue`
* `OpCode::JmpIfFalse`
* Add `module_code_offsets[module_index]` to the jump target immediate.
### Non-goals
* No changes to opcode encoding.
* No changes to verifier.
* No changes to how labels are resolved in the assembler.
### Implementation notes
* Extend the existing “patch immediates” loop in `Linker::link`.
* Determine `module_index` from the current iterated module during relocation.
* Make sure **only jump targets** are adjusted, not fallthrough logic.
* Add a small helper function for patching immediates to reduce duplication.
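A rough sketch of the relocation step, under simplifying assumptions: instructions are modelled as a decoded enum and offsets are counted in instructions, whereas the real `Linker::link` patches encoded immediates and most likely uses byte offsets. `Instr`, `relocate_jumps`, and this `link` function are hypothetical.

```rust
/// Hypothetical decoded instruction form; the real linker patches encoded bytes.
#[derive(Debug, Clone, PartialEq)]
pub enum Instr {
    Jmp(u32),
    JmpIfTrue(u32),
    JmpIfFalse(u32),
    Nop,
    Ret,
}

/// Helper: add `offset` to jump targets only; every other instruction is untouched.
pub fn relocate_jumps(code: &mut [Instr], offset: u32) {
    for instr in code.iter_mut() {
        match instr {
            Instr::Jmp(t) | Instr::JmpIfTrue(t) | Instr::JmpIfFalse(t) => *t += offset,
            _ => {}
        }
    }
}

/// Concatenate modules, relocating each one by the position where it starts
/// in the final program (the role of `module_code_offsets[module_index]`).
pub fn link(modules: &[Vec<Instr>]) -> Vec<Instr> {
    let mut program = Vec::new();
    for module in modules {
        let offset = program.len() as u32;
        let mut code = module.clone();
        relocate_jumps(&mut code, offset);
        program.extend(code);
    }
    program
}
```

With a two-instruction first module, a `Jmp(0)` in the second module becomes `Jmp(2)` after linking, which is the `original_target + module2_offset` relation the unit test below is meant to assert.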
### Tests
1. **Unit test** (preferred) in `prometeu-linker` (or wherever `Linker` tests live):
* Create 2 small modules where module #2 contains a local jump.
* Link them.
* Assert that the encoded jump immediate in the final program equals `original_target + module2_offset`.
2. **Integration test** (if a unit test is hard):
* Build two modules and execute in VM; ensure it reaches expected instruction sequence (e.g., sets a known local/global).
### Acceptance criteria
* Multi-module programs with jumps inside non-first modules execute correctly.
* Existing call/const relocation remains correct.
* Tests cover at least one `JMP` and one conditional jump.
---
## PR-02 — VM: Introduce GateId vs heap index and a minimal GatePool (no RC yet)
### Briefing
Current runtime treats `Value::Gate(x)` as a **direct heap base index**. The spec requires a **Gate Pool** where `GateId` resolves to `{alive, base, slots, type_id, rc...}` and heap access happens only through gate validation + resolution.
This PR introduces the **data model** without changing ownership/RC yet, enabling later RC work.
### Target
* Add `GateId` type (e.g., `u32`) and a `GateEntry` struct with fields:
* `alive: bool`
* `base: u32`
* `slots: u32`
* `type_id: u32` (store it, even if VM doesn't use it yet)
* Add `GatePool` container to the VM state.
* Update `ALLOC(type_id, slots)` to:
* bump-alloc `slots` in heap
* insert new `GateEntry { alive: true, base, slots, type_id }`
* push `Value::Gate(GateId)` (GateId is index into gate_pool)
* Update `GATE_LOAD/GATE_STORE` to:
* validate GateId (in-range + alive)
* bounds-check offset against `slots`
* translate to heap index: `base + offset`
### Non-goals
* No reference counting yet.
* No reclaim/free yet.
* No enforcement of borrow/mutate rules yet.
### Implementation notes
* Keep heap as `Vec<Value>` (or existing representation). In this PR, do not change heap layout.
* Add a helper: `resolve_gate(gate_id) -> Result<&GateEntry, Trap>` and `resolve_gate_mut(...)`.
* Define two new trap codes (or map onto existing ones if already defined):
* `TRAP_INVALID_GATE` (gate_id out of range)
* `TRAP_DEAD_GATE` (entry exists but `alive == false`)
* Make sure the VM never reads/writes heap directly for gate operations without resolution.
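The data model and resolution path could look roughly like this. It is a sketch under assumptions: the `Value` and `Trap` enums are reduced stand-ins for the VM's real types, the gate pool is a plain `Vec`, and `heap_index` is a hypothetical helper that combines validation, bounds check, and translation.

```rust
pub type GateId = u32;

/// Reduced stand-in for the VM's value type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Value {
    Nil,
    Int(i64),
    Gate(GateId),
}

/// Stand-ins for TRAP_INVALID_GATE, TRAP_DEAD_GATE, and the existing OOB trap.
#[derive(Debug, PartialEq)]
pub enum Trap {
    InvalidGate,
    DeadGate,
    OutOfBounds,
}

#[derive(Debug, Clone, Copy)]
pub struct GateEntry {
    pub alive: bool,
    pub base: u32,
    pub slots: u32,
    pub type_id: u32,
}

#[derive(Default)]
pub struct Vm {
    pub heap: Vec<Value>,
    pub gate_pool: Vec<GateEntry>,
}

impl Vm {
    /// ALLOC: bump-allocate `slots` cells and return a GateId, not a heap base.
    pub fn alloc(&mut self, type_id: u32, slots: u32) -> Value {
        let base = self.heap.len() as u32;
        self.heap.resize(self.heap.len() + slots as usize, Value::Nil);
        self.gate_pool.push(GateEntry { alive: true, base, slots, type_id });
        Value::Gate((self.gate_pool.len() - 1) as GateId)
    }

    /// Validate a GateId: it must be in range and alive.
    pub fn resolve_gate(&self, id: GateId) -> Result<&GateEntry, Trap> {
        let entry = self.gate_pool.get(id as usize).ok_or(Trap::InvalidGate)?;
        if entry.alive { Ok(entry) } else { Err(Trap::DeadGate) }
    }

    /// Resolve, bounds-check against `slots`, then translate to `base + offset`.
    fn heap_index(&self, id: GateId, offset: u32) -> Result<usize, Trap> {
        let entry = self.resolve_gate(id)?;
        if offset >= entry.slots {
            return Err(Trap::OutOfBounds);
        }
        Ok((entry.base + offset) as usize)
    }

    pub fn gate_load(&self, id: GateId, offset: u32) -> Result<Value, Trap> {
        let idx = self.heap_index(id, offset)?;
        Ok(self.heap[idx])
    }

    pub fn gate_store(&mut self, id: GateId, offset: u32, v: Value) -> Result<(), Trap> {
        let idx = self.heap_index(id, offset)?;
        self.heap[idx] = v;
        Ok(())
    }
}
```

Routing both `gate_load` and `gate_store` through one private index helper is one way to make "never touch the heap without resolution" hard to violate by accident.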
### Tests
1. VM unit tests:
* Allocate 2 gates; ensure they get distinct GateIds.
* Store to gate offset 0, load back, assert equal.
* Store to offset == slots (OOB) triggers OOB trap.
* Use an invalid GateId triggers INVALID_GATE trap.
2. If trap codes are new, test for the exact trap code.
### Acceptance criteria
* `ALLOC` returns GateId (not heap base).
* `GATE_LOAD/STORE` uses gate_pool resolution.
* Invalid/dead gate attempts trap deterministically.
---
## PR-03 — VM: Add strong reference counting (RC) with deterministic retain/release semantics
### Briefing
The spec requires strong RC tracking for gates and deterministic behavior for invalid/dead gates. Today `GATE_RETAIN`/`GATE_RELEASE` are no-ops (or effectively pop-only).
This PR implements **strong_rc** and updates runtime to adjust RC in well-defined places.
### Target
* Extend `GateEntry` with:
* `strong_rc: u32`
* Define semantics:
* New allocation starts with `strong_rc = 1` (gate value returned on stack owns 1 reference)
* `GATE_RETAIN`: increment strong_rc
* `GATE_RELEASE`: decrement strong_rc; if reaches 0 then mark gate `alive=false` and schedule reclaim
### Non-goals
* No compacting heap.
* No weak refs.
* Reclaim can be minimal (safe-point only) and may not actually reuse memory yet.
### Implementation notes
* Implement a `reclaim_queue: Vec<GateId>` in VM state.
* On `strong_rc` reaching 0:
* set `alive = false`
* push gate_id into `reclaim_queue`
* **Safe point**: drain reclaim queue on `FRAME_SYNC`.
* Compiler guarantees `FRAME_SYNC` at the end of the PBS entry point `main.pbs::frame(): void` (PR-00).
* In this PR, reclaim may simply:
* overwrite the heap range `[base, base+slots)` with `Value::Nil` (or a safe default)
* keep gate_id non-reusable
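The retain/release/reclaim flow described above might be sketched like this. All types are simplified stand-ins, `frame_sync` is a hypothetical method name for the `FRAME_SYNC` handler, and reclaim only clears the heap range, as the notes allow for this PR.

```rust
pub type GateId = u32;

/// Reduced stand-in for the VM's value type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Value {
    Nil,
    Int(i64),
}

#[derive(Debug, PartialEq)]
pub enum Trap {
    InvalidGate,
    DeadGate,
}

pub struct GateEntry {
    pub alive: bool,
    pub base: u32,
    pub slots: u32,
    pub strong_rc: u32,
}

#[derive(Default)]
pub struct Vm {
    pub heap: Vec<Value>,
    pub gate_pool: Vec<GateEntry>,
    pub reclaim_queue: Vec<GateId>,
}

impl Vm {
    /// New allocations start with strong_rc = 1 (owned by the returned value).
    pub fn alloc(&mut self, slots: u32) -> GateId {
        let base = self.heap.len() as u32;
        self.heap.resize(self.heap.len() + slots as usize, Value::Nil);
        self.gate_pool.push(GateEntry { alive: true, base, slots, strong_rc: 1 });
        (self.gate_pool.len() - 1) as GateId
    }

    fn live_entry(&mut self, id: GateId) -> Result<&mut GateEntry, Trap> {
        let e = self.gate_pool.get_mut(id as usize).ok_or(Trap::InvalidGate)?;
        if e.alive { Ok(e) } else { Err(Trap::DeadGate) }
    }

    /// GATE_RETAIN: increment the strong count of a live gate.
    pub fn retain(&mut self, id: GateId) -> Result<(), Trap> {
        self.live_entry(id)?.strong_rc += 1;
        Ok(())
    }

    /// GATE_RELEASE: decrement; at zero, mark dead and schedule reclaim.
    pub fn release(&mut self, id: GateId) -> Result<(), Trap> {
        let e = self.live_entry(id)?;
        e.strong_rc -= 1; // a live gate always has strong_rc >= 1
        if e.strong_rc == 0 {
            e.alive = false;
            self.reclaim_queue.push(id);
        }
        Ok(())
    }

    /// Safe point: drain the queue and clear each dead gate's heap range.
    /// Gate ids are not reused.
    pub fn frame_sync(&mut self) {
        for id in std::mem::take(&mut self.reclaim_queue) {
            let e = &self.gate_pool[id as usize];
            let (start, end) = (e.base as usize, (e.base + e.slots) as usize);
            self.heap[start..end].fill(Value::Nil);
        }
    }
}
```

Note that between the final release and the next `frame_sync`, the heap cells still hold their old contents; only the gate entry is dead. That gap is what makes the "reclaim effect" test below observable.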
### Tests
1. RC lifecycle test:
* alloc gate (rc=1)
* retain (rc=2)
* release (rc=1)
* release (rc=0) => gate becomes dead
* subsequent load/store traps DEAD_GATE
2. Reclaim effect test (if you overwrite heap slots):
* store a value, release to 0, run safe-point
* confirm heap region is cleared (only if heap is inspectable in tests)
### Acceptance criteria
* `GATE_RETAIN/RELEASE` changes RC.
* Gate transitions to `dead` at rc==0.
* Dead gate access traps.
* Reclaim happens at the chosen safe point.
---
## PR-04 — VM: Automatic RC adjustments on stack/local/global/heap moves (no more “manual RC correctness”)
### Briefing
Relying on explicit `GATE_RETAIN/RELEASE` everywhere is error-prone. The spec indicates RC must be adjusted on assignments/pops/stores. This PR makes RC correctness **a VM invariant**: when a gate value is copied into a slot, RC increments; when replaced/dropped, RC decrements.
### Target
* Implement centralized helpers:
* `inc_rc_if_gate(Value)`
* `dec_rc_if_gate(Value)`
* Apply them in all places where values are moved or overwritten:
* Stack `PUSH`/`POP` (when dropping values)
* Local set/get if they clone values
* Global set/get
* `GATE_STORE` (heap cell overwrite)
* Any instruction that overwrites an existing slot (e.g., `STORE_LOCAL`, `STORE_GLOBAL`, etc.)
### Non-goals
* No changes to compiler output.
* No borrow/mutate enforcement.
### Implementation notes
* When writing into a slot:
1. `dec_rc_if_gate(old_value)`
2. write new value
3. `inc_rc_if_gate(new_value)` **only if the semantics is “copy into slot”**
* When moving (not copying) is possible, avoid double inc/dec.
* If the VM uses `clone()` widely, be explicit about when RC should increase.
> ⚠️ If it's unclear whether an opcode is “move” or “copy”, **stop and ask** (do not guess).
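The three-step slot write from the notes can be sketched with a single helper. This is illustrative only: `RcTable` is a hypothetical stand-in for the gate pool (one strong count per gate), `store_copy` is an invented name, and dead-gate handling at zero is left out.

```rust
/// Reduced stand-in for the VM's value type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Value {
    Nil,
    Int(i64),
    Gate(usize),
}

/// Hypothetical stand-in for the gate pool: one strong count per gate id.
pub struct RcTable {
    pub strong_rc: Vec<u32>,
}

impl RcTable {
    /// Increment the count only when the value is a gate.
    pub fn inc_rc_if_gate(&mut self, v: Value) {
        if let Value::Gate(id) = v {
            self.strong_rc[id] += 1;
        }
    }

    /// Decrement the count only when the value is a gate.
    pub fn dec_rc_if_gate(&mut self, v: Value) {
        if let Value::Gate(id) = v {
            self.strong_rc[id] -= 1;
        }
    }

    /// "Copy into slot" semantics, in the order listed in the notes:
    /// 1. release the old occupant, 2. write, 3. retain the new occupant.
    pub fn store_copy(&mut self, slot: &mut Value, new_value: Value) {
        self.dec_rc_if_gate(*slot);
        *slot = new_value;
        self.inc_rc_if_gate(new_value);
    }
}
```

One thing worth checking against the real VM: if the old and new values can be the same gate, decrementing first lets the count touch zero transiently, so incrementing before decrementing may be the safer order in that case.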
### Tests
1. Stack drop test:
* alloc gate, push into local, pop stack, ensure rc doesn't underflow.
2. Overwrite test:
* local = gateA, then local = gateB
* rc of gateA decremented
* gateA becomes dead if no other refs
3. Heap store overwrite test:
* gateX stores gateA into offset 0
* then stores gateB into same offset
* rc adjusts accordingly
### Acceptance criteria
* No RC leaks on overwrites.
* No premature dead gates when references still exist.
* Tests cover overwrite in at least 2 storage kinds (local + heap).
---
## PR-05 — VM: Implement (debug-mode) Borrow/Mutate/Peek scopes as observable state
### Briefing
Currently `GATE_BEGIN_PEEK/BORROW/MUTATE` and `GATE_END_*` are no-ops. Even if v0 is permissive, the VM should at least track scope state to enable future enforcement and better diagnostics.
### Target
* Add per-gate “scope counters” (or a small state machine) in `GateEntry`:
* `peek_count: u32`
* `borrow_count: u32`
* `mutate_count: u32` (should be 0/1 if exclusive)
* Implement opcodes to increment/decrement and validate balanced usage:
* End without begin => trap (or panic if considered VM bug)
* Counter underflow (decrementing a count that is already 0) => trap
* In **debug builds**, optionally enforce:
* cannot `begin_mutate` when `borrow_count>0`
* cannot `begin_borrow` when `mutate_count>0`
### Non-goals
* No compiler changes.
* No runtime copy-back scratch buffers yet.
### Implementation notes
* Keep enforcement behind a feature flag or debug-only cfg.
* Always keep counters balanced; mismatches should be deterministic.
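A possible shape for the per-gate scope state, shown for borrow and mutate only (peek would follow the same pattern). The `enforce` parameter is an assumption standing in for the feature flag or debug-only cfg; `ScopeState` and the `Trap` variants are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
pub enum Trap {
    /// An `end_*` without a matching `begin_*`.
    UnbalancedScope,
    /// Debug-only conflict between borrow and mutate scopes.
    ScopeConflict,
}

/// Counters that would live inside `GateEntry`.
#[derive(Debug, Default)]
pub struct ScopeState {
    pub peek_count: u32,
    pub borrow_count: u32,
    pub mutate_count: u32,
}

impl ScopeState {
    /// `enforce` stands in for a debug/feature switch in the real VM.
    pub fn begin_borrow(&mut self, enforce: bool) -> Result<(), Trap> {
        if enforce && self.mutate_count > 0 {
            return Err(Trap::ScopeConflict);
        }
        self.borrow_count += 1;
        Ok(())
    }

    /// Decrement with an explicit underflow check instead of wrapping.
    pub fn end_borrow(&mut self) -> Result<(), Trap> {
        self.borrow_count = self.borrow_count.checked_sub(1).ok_or(Trap::UnbalancedScope)?;
        Ok(())
    }

    pub fn begin_mutate(&mut self, enforce: bool) -> Result<(), Trap> {
        if enforce && self.borrow_count > 0 {
            return Err(Trap::ScopeConflict);
        }
        self.mutate_count += 1;
        Ok(())
    }

    pub fn end_mutate(&mut self) -> Result<(), Trap> {
        self.mutate_count = self.mutate_count.checked_sub(1).ok_or(Trap::UnbalancedScope)?;
        Ok(())
    }
}
```

Using `checked_sub` keeps the "end without begin" case deterministic in both debug and release builds, since plain `u32` subtraction would panic in one and wrap in the other.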
### Tests
* Begin/End balance test per scope.
* Debug-only conflict test (if enabled).
### Acceptance criteria
* Scopes are no longer no-ops.
* Misbalanced begin/end produces deterministic error.
---
## PR-06 — Bytecode/VM: Represent strings as ConstId (dedup + stable value size)
### Briefing
`Value::String(String)` stores dynamic payload in runtime values. The spec direction prefers string refs into constant pools for stability and dedup. This PR migrates runtime string values to `ConstId` references.
### Target
* Add `Value::StringRef(ConstId)` (or `Value::String(ConstId)`)
* Ensure program image contains a string pool.
* Update `PUSH_CONST` behavior for string constants:
* push `StringRef(id)` instead of allocating a runtime `String`
### Non-goals
* No interning beyond the existing constant pool.
* No changes to the source language.
### Implementation notes
* Decide where the string pool lives (ProgramImage / ConstPool).
* Update debug printing and trap formatting if needed.
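As a sketch of the intended shape, assuming the pool is a simple vector of strings: `ConstPool`, `add_string`, and `push_const_string` are hypothetical names, and deduplication here is a linear scan at pool-construction time rather than anything done at runtime.

```rust
pub type ConstId = u32;

/// Reduced stand-in for the VM's value type; note the string variant is a
/// fixed-size id rather than an owned `String`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Value {
    Int(i64),
    StringRef(ConstId),
}

/// Hypothetical string pool carried by the program image.
#[derive(Default)]
pub struct ConstPool {
    strings: Vec<String>,
}

impl ConstPool {
    /// Return the existing id for `s`, or append it (linear scan, fine for a sketch).
    pub fn add_string(&mut self, s: &str) -> ConstId {
        if let Some(i) = self.strings.iter().position(|x| x.as_str() == s) {
            return i as ConstId;
        }
        self.strings.push(s.to_string());
        (self.strings.len() - 1) as ConstId
    }

    /// Look up the text behind an id, e.g. for debug printing.
    pub fn get(&self, id: ConstId) -> Option<&str> {
        self.strings.get(id as usize).map(String::as_str)
    }
}

/// PUSH_CONST for a string constant: push a reference, allocate nothing.
pub fn push_const_string(stack: &mut Vec<Value>, id: ConstId) {
    stack.push(Value::StringRef(id));
}
```

Because equal strings share one id in this sketch, equality of two `StringRef` values reduces to comparing ids; whether the real pool guarantees that depends on where deduplication is decided to live.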
### Tests
* Constant string pushed twice references same ConstId.
* Equality/comparison behavior remains unchanged (if supported).
### Acceptance criteria
* Strings in runtime values are pool references.
* Existing programs using constants still run.
---
## PR-07 — Compiler: Enforce import placement rules (top-of-file)
### Briefing
The PBS module model specifies that imports must be at the top-level (and typically before other declarations). This PR makes the compiler reject invalid import placement to align with the module/linking spec.
### Target
* In parser/collector phase, detect imports appearing after non-import declarations.
* Emit a diagnostic with a clear message and span.
### Non-goals
* No changes to how linking works.
* No auto-fix.
### Implementation notes
* Track a boolean `seen_non_import_decl` during file scan.
* When an import is encountered and the flag is set, produce an error diagnostic.
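The flag-based scan might look like the following. `Decl`, `DeclKind`, and `Diagnostic` are hypothetical simplifications of the parser's output, and a line number stands in for a real span.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DeclKind {
    Import,
    Other,
}

/// Hypothetical top-level declaration as seen by the file scan.
pub struct Decl {
    pub kind: DeclKind,
    pub line: u32,
}

#[derive(Debug, PartialEq)]
pub struct Diagnostic {
    pub line: u32,
    pub message: &'static str,
}

/// Report every import that appears after a non-import declaration.
pub fn check_import_placement(decls: &[Decl]) -> Vec<Diagnostic> {
    let mut seen_non_import_decl = false;
    let mut diagnostics = Vec::new();
    for decl in decls {
        match decl.kind {
            DeclKind::Import if seen_non_import_decl => diagnostics.push(Diagnostic {
                line: decl.line,
                message: "imports must appear before other declarations",
            }),
            DeclKind::Import => {}
            DeclKind::Other => seen_non_import_decl = true,
        }
    }
    diagnostics
}
```

Collecting all offending imports, rather than stopping at the first, is a choice made here for illustration; the plan only requires a clear, stable diagnostic.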
### Tests
* One file with valid imports at top => ok.
* One file with import after a function/type => error.
### Acceptance criteria
* Compiler rejects invalid import placement with stable diagnostic.
---
# Suggested merge order
1. PR-00 (entry point validation + FRAME_SYNC injection)
2. PR-01 (linker JMP relocation)
3. PR-02 (GatePool + GateId)
4. PR-03 (strong RC + retain/release + reclaim at FRAME_SYNC)
5. PR-04 (automatic RC on moves/overwrites)
6. PR-05 (scope tracking)
7. PR-06 (StringRef ConstId)
8. PR-07 (import placement enforcement)
---
# Open questions (must be answered before PR-04)
> Junie must stop and ask if any of these cannot be determined from the codebase.
1. Which exact opcodes perform storage overwrites (locals/globals) in the current VM implementation? (Names may differ.)
2. Which trap codes are already defined for gate errors, and which new ones must be introduced?
> ✅ Note: the safe point for reclaim is **FRAME_SYNC** (compiler inserts it at the end of `fn frame(): void`).


@@ -35,7 +35,7 @@ fn add2(a: int, b: int): int {
 fn frame(): void {
     let zero = Vec2.ZERO;
-    let zz = add(zero.getX(), zero.getY());
+    let zz = add2(zero.getX(), zero.getY());
     // 1. Locals & Arithmetic
     let x = 10;