Intrepid/prometeu-runtime

bQUARKz c9b9f18b32

fixes

2026-03-24 13:40:31 +00:00

Prometeu Industrial-Grade Refactor Plan (JVM-like)

Language policy: All implementation notes, code comments, commit messages, PR descriptions, and review discussion must be in English.

Reset policy: This is a hard reset. We do not keep compatibility with the legacy bytecode/linker/verifier behaviors. No heuristics, no “temporary support”, no string hacks.

North Star: A JVM-like philosophy:

Control-flow is method-local and canonical.
The linker resolves symbols and tables, not intra-function branches.
A single canonical layout/decoder/spec is used across compiler/linker/verifier/VM.
Any invalid program fails with clear diagnostics, not panics.

Phase 2 — Canonical Layout + Verifier Contract (JVM-like Control Flow)

PR-04 (5 pts) — Rewrite layout to compute instruction boundaries via decoder (no heuristics)

Briefing

Layout must be computed canonically using the decoder, not guessed via ad-hoc stepping.

Target

prometeu_bytecode::layout becomes the only authority for:

function ranges [start, end)
function length
valid instruction boundaries
pc→function lookup

Scope

Implement layout computation by scanning bytes with the canonical decoder.
Provide APIs:
- function_range(func_idx) -> (start, end)
- function_len(func_idx)
- is_boundary(func_idx, rel_pc) or is_boundary_abs(abs_pc)
- lookup_function_by_pc(abs_pc)
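
A minimal sketch of how these APIs could sit on top of a decoder-driven scan. Everything here is an assumption for illustration: `decode_len` stands in for the canonical decoder, the two-opcode encoding is a toy, and `Layout`/`LayoutError` are hypothetical names, not the actual `prometeu_bytecode` types.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the canonical decoder: length of the instruction
/// starting at `pc`, or None if it does not decode within `code`.
/// Toy encoding (assumption): 0x00 = Nop (1 byte), 0x01 = Jmp + u32 immediate (5 bytes).
fn decode_len(code: &[u8], pc: usize) -> Option<usize> {
    let len = match *code.get(pc)? {
        0x00 => 1,
        0x01 => 5,
        _ => return None,
    };
    if pc + len <= code.len() { Some(len) } else { None }
}

#[derive(Debug, PartialEq)]
pub enum LayoutError {
    /// The decoder rejected the bytes at this absolute pc.
    Undecodable { abs_pc: usize },
    /// Function ranges overlap, are inverted, or exceed the code buffer.
    BadRange { func_idx: usize },
}

pub struct Layout {
    ranges: Vec<(usize, usize)>, // [start, end) per function
    boundaries: BTreeSet<usize>, // absolute instruction starts
}

impl Layout {
    /// Walks every function with the decoder; no byte stepping is guessed.
    pub fn compute(code: &[u8], ranges: &[(usize, usize)]) -> Result<Layout, LayoutError> {
        let mut boundaries = BTreeSet::new();
        let mut prev_end = 0;
        for (func_idx, &(start, end)) in ranges.iter().enumerate() {
            if start < prev_end || start > end || end > code.len() {
                return Err(LayoutError::BadRange { func_idx });
            }
            let mut pc = start;
            while pc < end {
                boundaries.insert(pc);
                // Slicing at `end` rejects instructions that straddle the function end.
                let len = decode_len(&code[..end], pc)
                    .ok_or(LayoutError::Undecodable { abs_pc: pc })?;
                pc += len;
            }
            prev_end = end;
        }
        Ok(Layout { ranges: ranges.to_vec(), boundaries })
    }

    pub fn function_range(&self, func_idx: usize) -> Option<(usize, usize)> {
        self.ranges.get(func_idx).copied()
    }

    pub fn function_len(&self, func_idx: usize) -> Option<usize> {
        self.function_range(func_idx).map(|(start, end)| end - start)
    }

    pub fn is_boundary(&self, func_idx: usize, rel_pc: usize) -> bool {
        match self.function_range(func_idx) {
            Some((start, end)) => match start.checked_add(rel_pc) {
                Some(abs) => abs < end && self.boundaries.contains(&abs),
                None => false,
            },
            None => false,
        }
    }

    pub fn lookup_function_by_pc(&self, abs_pc: usize) -> Option<usize> {
        self.ranges.iter().position(|&(s, e)| s <= abs_pc && abs_pc < e)
    }
}
```

In this sketch the end offset of a function is never reported as a boundary; whether a branch may target it is left to the verifier.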

Requirements Checklist

No “clamp_jump_target” or tolerant APIs remain.
Layout derived only via decoder.

Completion Tests

Unit tests: boundaries for a known bytecode sequence.
Fuzz/table tests: random instruction sequences produce monotonic ranges and valid boundaries.

PR-05 (3 pts) — Verifier hard reset: branches are function-relative only

Briefing

The verifier must not guess absolute vs relative. One encoding only.

Target

Branches use immediate = target_rel_to_function_start, with target == func_len allowed.

Scope

Replace any dual-format logic.
Validation:
- target_rel <= func_len
- if target_rel == func_len: OK (end-exclusive)
- else target must be an instruction boundary
All boundary checks must come from layout.
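
The validation rule above can be sketched as a single function. The names `BranchError` and `check_branch_target` are hypothetical, and the `is_boundary` callback is an assumed stand-in for the layout query.

```rust
#[derive(Debug, PartialEq)]
pub enum BranchError {
    /// target_rel > func_len
    OutsideFunction { target_rel: usize, func_len: usize },
    /// target_rel < func_len but not on an instruction boundary
    MidInstruction { target_rel: usize },
}

/// Validates a function-relative branch target.
/// `is_boundary` is assumed to be answered by layout (hypothetical callback).
pub fn check_branch_target(
    target_rel: usize,
    func_len: usize,
    is_boundary: impl Fn(usize) -> bool,
) -> Result<(), BranchError> {
    if target_rel > func_len {
        return Err(BranchError::OutsideFunction { target_rel, func_len });
    }
    if target_rel == func_len {
        // End-exclusive target: jumping to the end of the function is allowed.
        return Ok(());
    }
    if is_boundary(target_rel) {
        Ok(())
    } else {
        Err(BranchError::MidInstruction { target_rel })
    }
}
```

The three outcomes map one-to-one onto the JumpToEnd, JumpToMidInstruction, and JumpOutsideFunction completion tests.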

Requirements Checklist

No heuristics.
Verifier depends only on layout + decoder.

Completion Tests

JumpToEnd accepted.
JumpToMidInstruction rejected.
JumpOutsideFunction rejected.

PR-06 (3 pts) — Linker hard reset: never relocate intra-function branches

Briefing

Linker must not rewrite local control-flow.

Target

Remove any relocation/patching for Jmp/JmpIf*.

Scope

Delete branch relocation logic.
Ensure only symbol/table/call relocations remain.

Requirements Checklist

Linker does not inspect/patch branch immediates.

Completion Tests

Link-order invariance test (A+B vs B+A) passes for intra-function branches.

Phase 3 — JVM-like Symbol Identity: Signature-based Overload & Constant-Pool Mindset

PR-07 (5 pts) — Introduce Signature interning (`SigId`) and descriptor canonicalization

Briefing

Overload must be by signature, not by name/arity.

Target

Create a canonical function descriptor system (JVM-like) and intern signatures.

Scope

Add Signature model:
- params types + return type
Add SignatureInterner -> SigId
Add descriptor() canonical representation (stable, deterministic).
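
One possible shape for this model, as a sketch only. The `Ty` enum, the one-letter type codes, and the `(params)ret` descriptor syntax are assumptions borrowed loosely from JVM descriptors; the plan does not fix any of them.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum Ty { Int, Str, Void }

impl Ty {
    // One-letter codes are an assumption of this sketch.
    fn code(&self) -> char {
        match self { Ty::Int => 'I', Ty::Str => 'S', Ty::Void => 'V' }
    }
    fn from_code(c: char) -> Option<Ty> {
        match c { 'I' => Some(Ty::Int), 'S' => Some(Ty::Str), 'V' => Some(Ty::Void), _ => None }
    }
}

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Signature { pub params: Vec<Ty>, pub ret: Ty }

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct SigId(pub u32);

impl Signature {
    /// Deterministic descriptor, e.g. "(IS)V".
    pub fn descriptor(&self) -> String {
        let params: String = self.params.iter().map(Ty::code).collect();
        format!("({}){}", params, self.ret.code())
    }

    /// Inverse of `descriptor`, used to check round-tripping.
    pub fn parse(desc: &str) -> Option<Signature> {
        let rest = desc.strip_prefix('(')?;
        let (params, ret) = rest.split_once(')')?;
        let mut ret_chars = ret.chars();
        let ret_ty = Ty::from_code(ret_chars.next()?)?;
        if ret_chars.next().is_some() {
            return None; // trailing characters after the return type
        }
        let params = params.chars().map(Ty::from_code).collect::<Option<Vec<_>>>()?;
        Some(Signature { params, ret: ret_ty })
    }
}

#[derive(Default)]
pub struct SignatureInterner {
    ids: HashMap<Signature, SigId>,
    sigs: Vec<Signature>,
}

impl SignatureInterner {
    /// Returns the existing id for an equal signature, or allocates the next one.
    pub fn intern(&mut self, sig: Signature) -> SigId {
        if let Some(&id) = self.ids.get(&sig) {
            return id;
        }
        let id = SigId(self.sigs.len() as u32);
        self.sigs.push(sig.clone());
        self.ids.insert(sig, id);
        id
    }

    pub fn get(&self, id: SigId) -> Option<&Signature> {
        self.sigs.get(id.0 as usize)
    }
}
```

Ids are handed out in interning order, so they are deterministic only if the compiler interns signatures in a deterministic order; the descriptor string does not depend on that order.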

Requirements Checklist

SigId is used as identity in compiler IR.
Descriptor is stable and round-trippable.

Completion Tests

debug(int)->void and debug(string)->void produce different descriptors.
Descriptor stability tests.

PR-08 (5 pts) — Replace `name/arity` import/export keys with `(name, SigId)`

Briefing

Keying on name/arity and deduplicating by name breaks overloading and is not industrial-grade.

Target

Rewrite import/export identity:

ExportKey { module_path, base_name, sig }
ImportKey { dep, module_path, base_name, sig }
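
A sketch of how these keys might look as typed data. The field types (`String`, a `u32`-backed `SigId`) and the `ExportTable` helper are assumptions for illustration; only the field names come from the plan.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct SigId(pub u32);

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct ExportKey {
    pub module_path: String,
    pub base_name: String,
    pub sig: SigId,
}

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct ImportKey {
    pub dep: String,
    pub module_path: String,
    pub base_name: String,
    pub sig: SigId,
}

#[derive(Debug, PartialEq)]
pub struct DuplicateExport(pub ExportKey);

/// Hypothetical export table for a single dependency; the value is an
/// assumed function index.
#[derive(Default)]
pub struct ExportTable {
    by_key: HashMap<ExportKey, u32>,
}

impl ExportTable {
    /// Rejects a second export with the same (module_path, base_name, sig).
    pub fn insert(&mut self, key: ExportKey, func_idx: u32) -> Result<(), DuplicateExport> {
        if self.by_key.contains_key(&key) {
            return Err(DuplicateExport(key));
        }
        self.by_key.insert(key, func_idx);
        Ok(())
    }

    /// Looks up an import; `dep` is assumed to have selected this table already.
    pub fn resolve(&self, import: &ImportKey) -> Option<u32> {
        let key = ExportKey {
            module_path: import.module_path.clone(),
            base_name: import.base_name.clone(),
            sig: import.sig,
        };
        self.by_key.get(&key).copied()
    }
}
```

Because the signature is part of the key, two exports named `debug` with different signatures coexist, while a repeated (name, sig) pair is rejected on insertion regardless of order.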

Scope

Update lowering to stop producing name/arity.
Update output builder to stop exporting short names and name/arity.
Update collector to stop dedup-by-name.

Requirements Checklist

No code constructs or parses "{name}/{arity}".
Overloading is represented as a first-class concept, not a hack.

Completion Tests

Cross-module overload works.
Duplicate export of same (name, sig) fails deterministically.

PR-09 (3 pts) — Overload resolution rules (explicit, deterministic)

Briefing

Once overload exists, resolution rules must be explicit.

Target

Implement a deterministic overload resolver based on exact type match (no implicit hacks).

Scope

Exact-match resolution only (initially).
Clear diagnostic when ambiguous or missing.
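
An exact-match resolver could look roughly like this. `Ty`, `Candidate`, `ResolveError`, and `resolve_overload` are hypothetical names; the sketch assumes candidates have already been filtered by name.

```rust
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Ty { Int, Str }

#[derive(Clone, Debug, PartialEq)]
pub struct Candidate {
    pub func_idx: u32,
    pub params: Vec<Ty>,
}

#[derive(Debug, PartialEq)]
pub enum ResolveError {
    /// No candidate has exactly these parameter types.
    Missing { name: String, args: Vec<Ty> },
    /// More than one candidate matches exactly (e.g. duplicate definitions).
    Ambiguous { name: String, matches: Vec<u32> },
}

/// Exact-match only: no conversions, no ranking, no fallback.
pub fn resolve_overload(
    name: &str,
    candidates: &[Candidate],
    args: &[Ty],
) -> Result<u32, ResolveError> {
    let matches: Vec<u32> = candidates
        .iter()
        .filter(|c| c.params.as_slice() == args)
        .map(|c| c.func_idx)
        .collect();
    match matches.len() {
        1 => Ok(matches[0]),
        0 => Err(ResolveError::Missing { name: name.to_string(), args: args.to_vec() }),
        _ => Err(ResolveError::Ambiguous { name: name.to_string(), matches }),
    }
}
```

Both error variants carry enough context (name, argument types, matching indices) to build a diagnostic without re-running resolution.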

Requirements Checklist

No best-effort fallback.

Completion Tests

Ambiguous call produces a clear diagnostic.
Missing overload produces a clear diagnostic.

Phase 4 — Eliminate Stringly-Typed Protocols & Debug Hacks

PR-10 (5 pts) — Replace `origin: Option<String>` and all string protocols with structured enums

Briefing

String prefixes like svc: and @dep: are fragile and non-industrial.

Target

All origins and external references become typed data.

Scope

Replace string origins with enums.
Update lowering/collector/output accordingly.
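
As an illustration, a typed origin might replace the `svc:` and `@dep:` prefixes along these lines. The variant and field names are assumptions; the plan only states that origins become enums.

```rust
/// Hypothetical replacement for `origin: Option<String>`.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum Origin {
    /// Defined in the current project (previously `None`).
    Local,
    /// Previously encoded with a `svc:` prefix.
    Service { name: String },
    /// Previously encoded with an `@dep:` prefix.
    Dependency { dep: String, module_path: String },
}

/// True for anything that must be resolved outside the current project.
pub fn is_external(origin: &Origin) -> bool {
    !matches!(origin, Origin::Local)
}

/// Human-readable form for diagnostics only; it is never parsed back.
pub fn describe(origin: &Origin) -> String {
    match origin {
        Origin::Local => "local".to_string(),
        Origin::Service { name } => format!("service {}", name),
        Origin::Dependency { dep, module_path } => {
            format!("dependency {} module {}", dep, module_path)
        }
    }
}
```

Consumers match on the variant instead of inspecting prefixes, so adding a new kind of origin becomes a compile-time exhaustiveness error rather than a silent string mismatch.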

Requirements Checklist

No string protocols built on .starts_with('@') or split(':').

Completion Tests

Grep-based test/lint step fails if forbidden patterns exist.

PR-11 (5 pts) — DebugInfo V1: structured function metadata (no `name@offset+len`)

Briefing

Encoding debug metadata in strings is unacceptable.

Target

Introduce a structured debug info format that stores offset/len as fields.

Scope

Add DebugFunctionInfo { func_idx, name, code_offset, code_len }.
Remove all parsing of @offset+len.
Update orchestrator/linker/emit to use structured debug info.
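
A sketch of the structured record with a simple binary round-trip. The field types and the wire format (little-endian `u32` fields followed by a length-prefixed UTF-8 name) are assumptions, not the actual DebugInfo V1 encoding.

```rust
use std::convert::TryInto;

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct DebugFunctionInfo {
    pub func_idx: u32,
    pub name: String,
    pub code_offset: u32,
    pub code_len: u32,
}

/// Reads a little-endian u32 at `at`, or None if the buffer is too short.
fn read_u32(bytes: &[u8], at: usize) -> Option<u32> {
    let raw: [u8; 4] = bytes.get(at..at + 4)?.try_into().ok()?;
    Some(u32::from_le_bytes(raw))
}

impl DebugFunctionInfo {
    /// Appends the record to `out`; offset and length are fields, not name suffixes.
    pub fn encode(&self, out: &mut Vec<u8>) {
        out.extend_from_slice(&self.func_idx.to_le_bytes());
        out.extend_from_slice(&self.code_offset.to_le_bytes());
        out.extend_from_slice(&self.code_len.to_le_bytes());
        out.extend_from_slice(&(self.name.len() as u32).to_le_bytes());
        out.extend_from_slice(self.name.as_bytes());
    }

    /// Decodes one record and returns it with the number of bytes consumed.
    pub fn decode(bytes: &[u8]) -> Option<(DebugFunctionInfo, usize)> {
        let func_idx = read_u32(bytes, 0)?;
        let code_offset = read_u32(bytes, 4)?;
        let code_len = read_u32(bytes, 8)?;
        let name_len = read_u32(bytes, 12)? as usize;
        let end = 16usize.checked_add(name_len)?;
        let name = String::from_utf8(bytes.get(16..end)?.to_vec()).ok()?;
        Some((DebugFunctionInfo { func_idx, name, code_offset, code_len }, end))
    }
}
```

Truncated or non-UTF-8 input yields `None` rather than a panic, which keeps the decoder usable on untrusted files.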

Requirements Checklist

No code emits or parses @offset+len.

Completion Tests

A test that fails if any debug name contains the @offset+len pattern.
Debug info roundtrip test.

Phase 5 — Hardening: Diagnostics, Error Handling, and Regression Shields

PR-12 (3 pts) — Replace panics in critical build pipeline with typed errors + diagnostics

Briefing

Using unwrap/expect in the compiler/linker turns user errors into crashes.

Target

Introduce typed errors and surface diagnostics.

Scope

Replace unwraps in:
- symbol resolution
- import/export linking
- entrypoint selection
Ensure clean error return with context.
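
A sketch of what replacing an unwrap with a typed error might look like. `BuildError`, `resolve_symbol`, and `select_entrypoint` are hypothetical names, and the symbol table shape is an assumption.

```rust
use std::collections::HashMap;
use std::fmt;

#[derive(Debug, PartialEq)]
pub enum BuildError {
    UnresolvedSymbol { name: String },
    NoEntrypoint,
    MultipleEntrypoints { candidates: Vec<String> },
}

impl fmt::Display for BuildError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BuildError::UnresolvedSymbol { name } => write!(f, "unresolved symbol '{}'", name),
            BuildError::NoEntrypoint => write!(f, "no entrypoint found"),
            BuildError::MultipleEntrypoints { candidates } => {
                write!(f, "multiple entrypoints: {}", candidates.join(", "))
            }
        }
    }
}

impl std::error::Error for BuildError {}

/// Instead of `symbols[name]` or `.get(name).unwrap()`, return a typed error.
pub fn resolve_symbol(symbols: &HashMap<String, u32>, name: &str) -> Result<u32, BuildError> {
    symbols
        .get(name)
        .copied()
        .ok_or_else(|| BuildError::UnresolvedSymbol { name: name.to_string() })
}

/// Exactly one candidate is accepted; zero or many are reported, not guessed.
pub fn select_entrypoint(candidates: &[String]) -> Result<&str, BuildError> {
    match candidates {
        [] => Err(BuildError::NoEntrypoint),
        [only] => Ok(only.as_str()),
        many => Err(BuildError::MultipleEntrypoints { candidates: many.to_vec() }),
    }
}
```

Callers propagate these with `?`, and the diagnostics layer formats them through `Display`, so an invalid user program ends in a message rather than a panic.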

Requirements Checklist

No panic paths for invalid user programs.

Completion Tests

Invalid program produces diagnostics, not panic.

PR-13 (3 pts) — Add regression test suite: link-order invariance + opcode-change immunity

Briefing

We need a system immune to opcode churn.

Target

Add tests that fail if:

linker steps bytes manually
decoder/spec drift exists
link order changes semantics

Scope

Link-order invariance tests.
Spec coverage tests.
Optional: lightweight “forbidden patterns” tests.

Requirements Checklist

Changing an opcode immediate size requires updating only the spec and tests.

Completion Tests

All new regression tests pass.

Summary of Estimated Cost (Points)

Phase 1: PR-01 (3) + PR-02 (5) + PR-03 (3) = 11
Phase 2: PR-04 (5) + PR-05 (3) + PR-06 (3) = 11
Phase 3: PR-07 (5) + PR-08 (5) + PR-09 (3) = 13
Phase 4: PR-10 (5) + PR-11 (5) = 10
Phase 5: PR-12 (3) + PR-13 (3) = 6

Total: 51 points

Note: If any PR starts to exceed 5 points in practice, it must be split into smaller PRs.

Non-Negotiables

No compatibility with legacy encodings.
No heuristics.
No string hacks.
One canonical decoder/spec/layout.
Everything in English (including review comments).
