prometeu-runtime/files/Hard Reset.md
2026-03-24 13:40:31 +00:00

8.0 KiB

Prometeu Industrial-Grade Refactor Plan (JVM-like)

Language policy: All implementation notes, code comments, commit messages, PR descriptions, and review discussion must be in English.

Reset policy: This is a hard reset. We do not keep compatibility with the legacy bytecode/linker/verifier behaviors. No heuristics, no “temporary support”, no string hacks.

North Star: A JVM-like philosophy:

  • Control-flow is method-local and canonical.
  • The linker resolves symbols and tables, not intra-function branches.
  • A single canonical layout/decoder/spec is used across compiler/linker/verifier/VM.
  • Any invalid program fails with clear diagnostics, not panics.

Phase 2 — Canonical Layout + Verifier Contract (JVM-like Control Flow)

PR-04 (5 pts) — Rewrite layout to compute instruction boundaries via decoder (no heuristics)

Briefing

Layout must be computed canonically using the decoder, not guessed via ad-hoc stepping.

Target

prometeu_bytecode::layout becomes the only authority for:

  • function ranges [start, end)
  • function length
  • valid instruction boundaries
  • pc→function lookup

Scope

  • Implement layout computation by scanning bytes with the canonical decoder.

  • Provide APIs:

    • function_range(func_idx) -> (start, end)
    • function_len(func_idx)
    • is_boundary(func_idx, rel_pc) or is_boundary_abs(abs_pc)
    • lookup_function_by_pc(abs_pc)

Requirements Checklist

  • No “clamp_jump_target” or tolerant APIs remain.
  • Layout derived only via decoder.

Completion Tests

  • Unit tests: boundaries for a known bytecode sequence.
  • Fuzz/table tests: random instruction sequences produce monotonic ranges and valid boundaries.

PR-05 (3 pts) — Verifier hard reset: branches are function-relative only

Briefing

The verifier must not guess absolute vs relative. One encoding only.

Target

Branches use immediate = target_rel_to_function_start, with target == func_len allowed.

Scope

  • Replace any dual-format logic.

  • Validation:

    • target_rel <= func_len
    • if target_rel == func_len: OK (end-exclusive)
    • else target must be an instruction boundary
  • All boundary checks must come from layout.

Requirements Checklist

  • No heuristics.
  • Verifier depends only on layout + decoder.

Completion Tests

  • JumpToEnd accepted.
  • JumpToMidInstruction rejected.
  • JumpOutsideFunction rejected.

PR-06 (3 pts) — Linker hard reset: never relocate intra-function branches

Briefing

Linker must not rewrite local control-flow.

Target

Remove any relocation/patching for Jmp/JmpIf*.

Scope

  • Delete branch relocation logic.
  • Ensure only symbol/table/call relocations remain.

Requirements Checklist

  • Linker does not inspect/patch branch immediates.

Completion Tests

  • Link-order invariance test (A+B vs B+A) passes for intra-function branches.

Phase 3 — JVM-like Symbol Identity: Signature-based Overload & Constant-Pool Mindset

PR-07 (5 pts) — Introduce Signature interning (SigId) and descriptor canonicalization

Briefing

Overload must be by signature, not by name/arity.

Target

Create a canonical function descriptor system (JVM-like) and intern signatures.

Scope

  • Add Signature model:

    • params types + return type
  • Add SignatureInterner -> SigId

  • Add descriptor() canonical representation (stable, deterministic).

Requirements Checklist

  • SigId is used as identity in compiler IR.
  • Descriptor is stable and round-trippable.

Completion Tests

  • debug(int)->void and debug(string)->void produce different descriptors.
  • Descriptor stability tests.

PR-08 (5 pts) — Replace name/arity import/export keys with (name, SigId)

Briefing

name/arity and dedup-by-name break overload and are not industrial.

Target

Rewrite import/export identity:

  • ExportKey { module_path, base_name, sig }
  • ImportKey { dep, module_path, base_name, sig }

Scope

  • Update lowering to stop producing name/arity.
  • Update output builder to stop exporting short names and name/arity.
  • Update collector to stop dedup-by-name.

Requirements Checklist

  • No code constructs or parses "{name}/{arity}".
  • Overload is represented as first-class, not a hack.

Completion Tests

  • Cross-module overload works.
  • Duplicate export of same (name, sig) fails deterministically.

PR-09 (3 pts) — Overload resolution rules (explicit, deterministic)

Briefing

Once overload exists, resolution rules must be explicit.

Target

Implement a deterministic overload resolver based on exact type match (no implicit hacks).

Scope

  • Exact-match resolution only (initially).
  • Clear diagnostic when ambiguous or missing.

Requirements Checklist

  • No best-effort fallback.

Completion Tests

  • Ambiguous call produces a clear diagnostic.
  • Missing overload produces a clear diagnostic.

Phase 4 — Eliminate Stringly-Typed Protocols & Debug Hacks

PR-10 (5 pts) — Replace origin: Option<String> and all string protocols with structured enums

Briefing

String prefixes like svc: and @dep: are fragile and non-industrial.

Target

All origins and external references become typed data.

Scope

  • Replace string origins with enums.
  • Update lowering/collector/output accordingly.

Requirements Checklist

  • No .starts_with('@'), split(':') protocols.

Completion Tests

  • Grep-based test/lint step fails if forbidden patterns exist.

PR-11 (5 pts) — DebugInfo V1: structured function metadata (no name@offset+len)

Briefing

Encoding debug metadata in strings is unacceptable.

Target

Introduce a structured debug info format that stores offset/len as fields.

Scope

  • Add DebugFunctionInfo { func_idx, name, code_offset, code_len }.
  • Remove all parsing of @offset+len.
  • Update orchestrator/linker/emit to use structured debug info.

Requirements Checklist

  • No code emits or parses @offset+len.

Completion Tests

  • A test that fails if any debug name contains @ pattern.
  • Debug info roundtrip test.

Phase 5 — Hardening: Diagnostics, Error Handling, and Regression Shields

PR-12 (3 pts) — Replace panics in critical build pipeline with typed errors + diagnostics

Briefing

unwrap/expect in compiler/linker transforms user errors into crashes.

Target

Introduce typed errors and surface diagnostics.

Scope

  • Replace unwraps in:

    • symbol resolution
    • import/export linking
    • entrypoint selection
  • Ensure clean error return with context.

Requirements Checklist

  • No panic paths for invalid user programs.

Completion Tests

  • Invalid program produces diagnostics, not panic.

Briefing

We need a system immune to opcode churn.

Target

Add tests that fail if:

  • linker steps bytes manually
  • decoder/spec drift exists
  • link order changes semantics

Scope

  • Link-order invariance tests.
  • Spec coverage tests.
  • Optional: lightweight “forbidden patterns” tests.

Requirements Checklist

  • Changing an opcode immediate size requires updating only the spec and tests.

Completion Tests

  • All new regression tests pass.

Summary of Estimated Cost (Points)

  • Phase 1: PR-01 (3) + PR-02 (5) + PR-03 (3) = 11
  • Phase 2: PR-04 (5) + PR-05 (3) + PR-06 (3) = 11
  • Phase 3: PR-07 (5) + PR-08 (5) + PR-09 (3) = 13
  • Phase 4: PR-10 (5) + PR-11 (5) = 10
  • Phase 5: PR-12 (3) + PR-13 (3) = 6

Total: 51 points

Note: If any PR starts to exceed 5 points in practice, it must be split into smaller PRs.


Non-Negotiables

  • No compatibility with legacy encodings.
  • No heuristics.
  • No string hacks.
  • One canonical decoder/spec/layout.
  • Everything in English (including review comments).