Prometeu Industrial-Grade Refactor Plan (JVM-like)
Language policy: All implementation notes, code comments, commit messages, PR descriptions, and review discussion must be in English.
Reset policy: This is a hard reset. We do not keep compatibility with the legacy bytecode/linker/verifier behaviors. No heuristics, no “temporary support”, no string hacks.
North Star: A JVM-like philosophy:
- Control-flow is method-local and canonical.
- The linker resolves symbols and tables, not intra-function branches.
- A single canonical layout/decoder/spec is used across compiler/linker/verifier/VM.
- Any invalid program fails with clear diagnostics, not panics.
Phase 3 — JVM-like Symbol Identity: Signature-based Overload & Constant-Pool Mindset
PR-08 (5 pts) — Replace name/arity import/export keys with (name, SigId)
Briefing
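Keying imports and exports by name/arity and deduplicating by name cannot distinguish overloads that share a name and arity, so both break overloading and are not industrial.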
Target
Rewrite import/export identity:
ExportKey { module_path, base_name, sig }
ImportKey { dep, module_path, base_name, sig }
Scope
- Update lowering to stop producing name/arity.
- Update output builder to stop exporting short names and name/arity.
- Update collector to stop dedup-by-name.
Requirements Checklist
- No code constructs or parses "{name}/{arity}".
- Overload is represented as first-class, not a hack.
Completion Tests
- Cross-module overload works.
- Duplicate export of same (name, sig) fails deterministically.
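A minimal sketch of the key types and the duplicate-export check, assuming a SigId that is an interned u32 and an export table that maps keys to function indices. The plan names only ExportKey, ImportKey, and their fields; SigId's representation, ExportTable, ExportError, and the String field types are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical interned signature id; the plan only names `SigId`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct SigId(pub u32);

/// Export identity: module path, base name, and signature (field types assumed).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ExportKey {
    pub module_path: String,
    pub base_name: String,
    pub sig: SigId,
}

/// Import identity: the same triple, qualified by the dependency it comes from.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ImportKey {
    pub dep: String,
    pub module_path: String,
    pub base_name: String,
    pub sig: SigId,
}

/// Hypothetical error type for export collection.
#[derive(Debug, PartialEq, Eq)]
pub enum ExportError {
    Duplicate(ExportKey),
}

/// Export table keyed by the full identity, never by name alone.
#[derive(Default)]
pub struct ExportTable {
    entries: HashMap<ExportKey, u32>, // value: hypothetical function index
}

impl ExportTable {
    /// Rejects a second export with the same key instead of silently
    /// deduplicating, so the outcome does not depend on insertion order.
    pub fn insert(&mut self, key: ExportKey, func_idx: u32) -> Result<(), ExportError> {
        if self.entries.contains_key(&key) {
            return Err(ExportError::Duplicate(key));
        }
        self.entries.insert(key, func_idx);
        Ok(())
    }

    pub fn get(&self, key: &ExportKey) -> Option<u32> {
        self.entries.get(key).copied()
    }
}
```

With this shape, two exports named add with different SigId values coexist as distinct entries, while a repeated (module_path, base_name, sig) triple is an error rather than a silent overwrite.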
PR-09 (3 pts) — Overload resolution rules (explicit, deterministic)
Briefing
Once overload exists, resolution rules must be explicit.
Target
Implement a deterministic overload resolver based on exact type match (no implicit hacks).
Scope
- Exact-match resolution only (initially).
- Clear diagnostic when ambiguous or missing.
Requirements Checklist
- No best-effort fallback.
Completion Tests
- Ambiguous call produces a clear diagnostic.
- Missing overload produces a clear diagnostic.
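The exact-match rule can be sketched as follows. This is an illustration under stated assumptions, not the resolver itself: Ty, Candidate, ResolveError, and resolve_overload are hypothetical names, and the type set is a placeholder.

```rust
/// Hypothetical type representation; the plan does not define one.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ty {
    Int,
    Float,
    Bool,
}

/// One overload candidate for a given base name.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Candidate {
    pub func_idx: u32,
    pub params: Vec<Ty>,
}

/// Typed outcomes that a diagnostic layer can render; no fallback variant exists.
#[derive(Debug, PartialEq, Eq)]
pub enum ResolveError {
    Missing { name: String, args: Vec<Ty> },
    Ambiguous { name: String, matches: Vec<u32> },
}

/// Exact-match resolution: a candidate matches only if its parameter list
/// equals the argument types element by element. No conversions are tried.
pub fn resolve_overload(
    name: &str,
    candidates: &[Candidate],
    args: &[Ty],
) -> Result<u32, ResolveError> {
    let matches: Vec<u32> = candidates
        .iter()
        .filter(|c| c.params.as_slice() == args)
        .map(|c| c.func_idx)
        .collect();

    match matches.len() {
        1 => Ok(matches[0]),
        0 => Err(ResolveError::Missing {
            name: name.to_string(),
            args: args.to_vec(),
        }),
        _ => Err(ResolveError::Ambiguous {
            name: name.to_string(),
            matches,
        }),
    }
}
```

Under exact matching, ambiguity can only arise when two visible candidates have identical parameter lists, for example when the same (name, sig) arrives from two different imports.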
Phase 4 — Eliminate Stringly-Typed Protocols & Debug Hacks
PR-10 (5 pts) — Replace origin: Option<String> and all string protocols with structured enums
Briefing
String prefixes like svc: and @dep: are fragile and non-industrial.
Target
All origins and external references become typed data.
Scope
- Replace string origins with enums.
- Update lowering/collector/output accordingly.
Requirements Checklist
- No .starts_with('@'), split(':') protocols.
Completion Tests
- Grep-based test/lint step fails if forbidden patterns exist.
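One possible shape for a typed origin, as a sketch only. The variants are inferred from the svc: and @dep: prefixes mentioned in the briefing; the variant names, their fields, and the helper functions are assumptions, not the actual design.

```rust
/// Hypothetical replacement for `origin: Option<String>`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Origin {
    /// Defined in the module being compiled (formerly `None`, by assumption).
    Local,
    /// Formerly encoded with a `svc:` prefix.
    Service { name: String },
    /// Formerly encoded with an `@dep:` prefix.
    Dependency { dep: String, module_path: String },
}

/// Consumers match on variants instead of inspecting string prefixes.
pub fn is_external(origin: &Origin) -> bool {
    match origin {
        Origin::Local => false,
        Origin::Service { .. } | Origin::Dependency { .. } => true,
    }
}

/// Human-readable text for diagnostics only; nothing parses this back.
pub fn describe(origin: &Origin) -> String {
    match origin {
        Origin::Local => "local".to_string(),
        Origin::Service { name } => format!("service {}", name),
        Origin::Dependency { dep, module_path } => {
            format!("dependency {} module {}", dep, module_path)
        }
    }
}
```

Because the match is exhaustive, adding a new origin kind forces every consumer to handle it at compile time, which a string prefix cannot do.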
PR-11 (5 pts) — DebugInfo V1: structured function metadata (no name@offset+len)
Briefing
Encoding debug metadata in strings is unacceptable.
Target
Introduce a structured debug info format that stores offset/len as fields.
Scope
- Add DebugFunctionInfo { func_idx, name, code_offset, code_len }.
- Remove all parsing of @offset+len.
- Update orchestrator/linker/emit to use structured debug info.
Requirements Checklist
- No code emits or parses @offset+len.
Completion Tests
- A test that fails if any debug name contains an @ pattern.
- Debug info roundtrip test.
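A sketch of the structured record and a roundtrip, assuming u32 fields and a little-endian, length-prefixed binary layout. Only the struct name and its four field names come from the plan; the field types, the byte layout, DebugDecodeError, and read_u32 are hypothetical.

```rust
/// Structured debug record; offset and length are fields, not text in `name`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DebugFunctionInfo {
    pub func_idx: u32,
    pub name: String,
    pub code_offset: u32,
    pub code_len: u32,
}

/// Hypothetical decode errors, returned instead of panicking on bad input.
#[derive(Debug, PartialEq, Eq)]
pub enum DebugDecodeError {
    Truncated,
    InvalidUtf8,
}

impl DebugFunctionInfo {
    /// Assumed layout: func_idx, code_offset, code_len, name_len (u32 LE each),
    /// followed by the UTF-8 bytes of the name.
    pub fn encode(&self, out: &mut Vec<u8>) {
        out.extend_from_slice(&self.func_idx.to_le_bytes());
        out.extend_from_slice(&self.code_offset.to_le_bytes());
        out.extend_from_slice(&self.code_len.to_le_bytes());
        out.extend_from_slice(&(self.name.len() as u32).to_le_bytes());
        out.extend_from_slice(self.name.as_bytes());
    }

    /// Returns the decoded record and the number of bytes consumed.
    pub fn decode(bytes: &[u8]) -> Result<(Self, usize), DebugDecodeError> {
        let func_idx = read_u32(bytes, 0)?;
        let code_offset = read_u32(bytes, 4)?;
        let code_len = read_u32(bytes, 8)?;
        let name_len = read_u32(bytes, 12)? as usize;
        let end = 16usize
            .checked_add(name_len)
            .ok_or(DebugDecodeError::Truncated)?;
        let raw = bytes.get(16..end).ok_or(DebugDecodeError::Truncated)?;
        let name = std::str::from_utf8(raw)
            .map_err(|_| DebugDecodeError::InvalidUtf8)?
            .to_string();
        Ok((Self { func_idx, name, code_offset, code_len }, end))
    }
}

/// Reads a little-endian u32 at `at`, failing cleanly on short input.
fn read_u32(bytes: &[u8], at: usize) -> Result<u32, DebugDecodeError> {
    let raw = bytes.get(at..at + 4).ok_or(DebugDecodeError::Truncated)?;
    let mut buf = [0u8; 4];
    buf.copy_from_slice(raw);
    Ok(u32::from_le_bytes(buf))
}
```

A roundtrip test encodes a record, decodes it, and asserts equality; the same test can assert that the decoded name contains no '@'.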
Phase 5 — Hardening: Diagnostics, Error Handling, and Regression Shields
PR-12 (3 pts) — Replace panics in critical build pipeline with typed errors + diagnostics
Briefing
Calls to unwrap/expect in the compiler and linker turn user errors into crashes.
Target
Introduce typed errors and surface diagnostics.
Scope
- Replace unwraps in:
  - symbol resolution
  - import/export linking
  - entrypoint selection
- Ensure clean error return with context.
Requirements Checklist
- No panic paths for invalid user programs.
Completion Tests
- Invalid program produces diagnostics, not panic.
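As an illustration of the pattern, here is entrypoint selection returning a typed error where an unwrap might otherwise sit. LinkError, its variants, select_entrypoint, and the (name, index) function list are all hypothetical; the real pipeline types are not shown in this plan.

```rust
use std::fmt;

/// Hypothetical typed error for the link stage.
#[derive(Debug, PartialEq, Eq)]
pub enum LinkError {
    MissingEntrypoint { wanted: String },
    DuplicateEntrypoint { name: String, count: usize },
}

impl fmt::Display for LinkError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LinkError::MissingEntrypoint { wanted } => {
                write!(f, "entrypoint `{}` not found in any linked module", wanted)
            }
            LinkError::DuplicateEntrypoint { name, count } => write!(
                f,
                "entrypoint `{}` is defined {} times; expected exactly one",
                name, count
            ),
        }
    }
}

impl std::error::Error for LinkError {}

/// Selects the unique function named `wanted`.
/// Invalid input yields an `Err` carrying context, never a panic.
pub fn select_entrypoint(functions: &[(String, u32)], wanted: &str) -> Result<u32, LinkError> {
    let hits: Vec<u32> = functions
        .iter()
        .filter(|(name, _)| name.as_str() == wanted)
        .map(|(_, idx)| *idx)
        .collect();

    match hits.len() {
        1 => Ok(hits[0]),
        0 => Err(LinkError::MissingEntrypoint {
            wanted: wanted.to_string(),
        }),
        count => Err(LinkError::DuplicateEntrypoint {
            name: wanted.to_string(),
            count,
        }),
    }
}
```

Callers propagate the error with ? and the driver renders it as a diagnostic, so a missing or duplicated entrypoint in a user program ends in a message rather than a crash.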
PR-13 (3 pts) — Add regression test suite: link-order invariance + opcode-change immunity
Briefing
We need a system immune to opcode churn.
Target
Add tests that fail if:
- the linker steps through instruction bytes manually
- decoder/spec drift exists
- link order changes semantics
Scope
- Link-order invariance tests.
- Spec coverage tests.
- Optional: lightweight “forbidden patterns” tests.
Requirements Checklist
- Changing an opcode immediate size requires updating only the spec and tests.
Completion Tests
- All new regression tests pass.
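A link-order invariance test could take roughly this form. The linker here is a toy stand-in that assigns global symbol indices in canonical sorted order; Module, link, and the use of a bare u32 as a signature id are assumptions made for the sketch, not the real linker API.

```rust
/// Hypothetical module shape: a path plus exported (base_name, sig id) pairs.
#[derive(Debug, Clone)]
pub struct Module {
    pub path: String,
    pub exports: Vec<(String, u32)>,
}

/// Toy linker: the position in the returned vector is the global symbol index.
/// Sorting by (module_path, base_name, sig) makes the result independent of
/// the order in which modules are passed in.
pub fn link(modules: &[Module]) -> Vec<(String, String, u32)> {
    let mut symbols: Vec<(String, String, u32)> = modules
        .iter()
        .flat_map(|m| {
            m.exports
                .iter()
                .map(move |(name, sig)| (m.path.clone(), name.clone(), *sig))
        })
        .collect();
    symbols.sort();
    symbols
}
```

The regression test links the same modules in every order of interest and asserts that the outputs are equal; any index assignment that depends on input order makes the assertion fail.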
Summary of Estimated Cost (Points)
- Phase 1: PR-01 (3) + PR-02 (5) + PR-03 (3) = 11
- Phase 2: PR-04 (5) + PR-05 (3) + PR-06 (3) = 11
- Phase 3: PR-07 (5) + PR-08 (5) + PR-09 (3) = 13
- Phase 4: PR-10 (5) + PR-11 (5) = 10
- Phase 5: PR-12 (3) + PR-13 (3) = 6
Total: 51 points
Note: If any PR starts to exceed 5 points in practice, it must be split into smaller PRs.
Non-Negotiables
- No compatibility with legacy encodings.
- No heuristics.
- No string hacks.
- One canonical decoder/spec/layout.
- Everything in English (including review comments).