prometeu-runtime/discussion/lessons/DSC-0014-perf-vm-allocation-and-copy-pressure/LSN-0035-first-materialization-is-not-the-same-as-hot-path-copy-pressure.md
bQUARKz cd6bf6b313
[PERF] VM Allocation and Copy Pressure
2026-04-20 09:09:24 +01:00

id: LSN-0035
ticket: perf-vm-allocation-and-copy-pressure
title: First Materialization Is Not the Same as Hot-Path Copy Pressure
created: 2026-04-20
tags: runtime, vm, performance, strings, allocation, telemetry

Context

DSC-0014 started from a broad performance complaint around VM allocation and string-heavy execution. The tempting response would have been to chase "zero alloc everywhere" or to reopen public VM semantics around globals and strings.

The completed work converged on a narrower and more durable rule: the real optimization boundary is not "all allocation is bad", but "first materialization cost must be separated from repeated hot-path copy pressure".

That distinction shaped three concrete outcomes:

  • immutable string payloads are now shared instead of recopied in common paths such as constant-pool use and GET_GLOBAL;
  • internal telemetry records heap-allocation and string-materialization evidence without turning those counters into certification policy;
  • the published specs now explain that zero-allocation happy paths are engineering targets, not guest-visible compatibility promises.

Key Decisions

Allocation Baseline for Strings and Globals (DEC-0018)

What: PROMETEU keeps strings in the public VM language surface, but the runtime treats string payloads as potentially expensive. Hardcoded strings may materialize once during build/load. Runtime-created strings may materialize when a new value is semantically created. GET_GLOBAL keeps its public behavior while internally avoiding unnecessary repeated payload copying.

Why: The original hotspot was not the existence of strings. It was the fact that already-materialized payloads were being cloned repeatedly on hot paths. Fixing that internally is cheaper and safer than redefining the guest ABI.

Trade-offs: String concatenation still creates a new string when the language semantics require it. The runtime gives up the illusion of "free strings", but preserves the correct public model while removing redundant copies that were never semantically required.
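
The decision above can be sketched as follows. This is a minimal illustration in Rust, not the PROMETEU implementation: the names `Value`, `Vm`, `load`, `push_const`, `get_global`, `concat`, and `shares_payload` are hypothetical, and the sketch assumes string payloads can live behind a reference-counted pointer. It shows a constant materialized once at load, GET_GLOBAL returning a handle to the same payload, and concatenation still materializing a new payload.

```rust
use std::rc::Rc;

/// Hypothetical VM value. String payloads are immutable and reference-counted,
/// so cloning a `Value::Str` bumps a counter instead of copying bytes.
#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    Int(i64),
    Str(Rc<str>),
}

/// Hypothetical VM state: a constant pool and a table of global slots.
pub struct Vm {
    constants: Vec<Value>,
    globals: Vec<Value>,
}

impl Vm {
    /// First materialization: each hardcoded string is copied once, at load time.
    pub fn load(constant_strings: &[&str], global_count: usize) -> Self {
        let constants: Vec<Value> = constant_strings
            .iter()
            .map(|s| Value::Str(Rc::from(*s)))
            .collect();
        Vm { constants, globals: vec![Value::Int(0); global_count] }
    }

    /// Using a constant shares the already-materialized payload.
    pub fn push_const(&self, index: usize) -> Value {
        self.constants[index].clone()
    }

    pub fn set_global(&mut self, slot: usize, value: Value) {
        self.globals[slot] = value;
    }

    /// GET_GLOBAL still yields the slot's value; only the ownership model changed.
    pub fn get_global(&self, slot: usize) -> Value {
        self.globals[slot].clone()
    }
}

/// Concatenation creates a semantically new value, so it materializes a new payload.
pub fn concat(a: &str, b: &str) -> Value {
    let mut joined = String::with_capacity(a.len() + b.len());
    joined.push_str(a);
    joined.push_str(b);
    Value::Str(Rc::from(joined))
}

/// True when both values are strings backed by the same allocation.
pub fn shares_payload(a: &Value, b: &Value) -> bool {
    match (a, b) {
        (Value::Str(x), Value::Str(y)) => Rc::ptr_eq(x, y),
        _ => false,
    }
}
```

`Rc` is enough for a single-threaded interpreter loop. If values had to cross threads, `Arc` would be the equivalent choice, at the cost of atomic reference counting.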

Patterns and Algorithms

  • Separate semantic allocation from accidental copying: A new value may require materialization. Re-reading or forwarding an existing value usually should not.

  • Share immutable payloads aggressively: If a payload is immutable and already materialized, shared ownership is often the cheapest way to preserve public semantics while removing internal copy pressure.

  • Keep public semantics stable, move optimization inward: GET_GLOBAL did not need new meaning. The fix belonged in representation and ownership, not in the opcode contract.

  • Keep engineering evidence internal unless explicitly promoted: The runtime now records internal allocation evidence, but that evidence is not automatically a certification rule or ABI promise.

  • Make internal counters robust under parallel tests: Process-global counters can produce false failures in concurrent test runs. Internal evidence for runtime behavior should be scoped, for example to a single VM instance, so that unrelated work cannot contaminate the result.
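
The last pattern can be sketched as follows. This is an illustrative Rust sketch under stated assumptions, not the runtime's actual telemetry: `AllocTelemetry`, `Vm`, `materialize_string`, `run_workload`, and `run_concurrently` are hypothetical names. The point is only that counters owned by a VM instance stay exact even when another workload runs at the same time, which a process-global counter cannot guarantee.

```rust
use std::rc::Rc;
use std::thread;

/// Hypothetical internal counters, owned by one VM instance, not by the process.
#[derive(Default, Debug, Clone, Copy, PartialEq)]
pub struct AllocTelemetry {
    pub heap_allocations: u64,
    pub string_materializations: u64,
}

/// Hypothetical VM that carries its own telemetry.
#[derive(Default)]
pub struct Vm {
    telemetry: AllocTelemetry,
}

impl Vm {
    /// Materializing a new string is recorded against this instance only.
    pub fn materialize_string(&mut self, text: &str) -> Rc<str> {
        self.telemetry.heap_allocations += 1;
        self.telemetry.string_materializations += 1;
        Rc::from(text)
    }

    /// Snapshot of the counters; engineering evidence, not a public contract.
    pub fn telemetry(&self) -> AllocTelemetry {
        self.telemetry
    }
}

/// Runs `count` materializations on a fresh VM and returns that VM's counters.
pub fn run_workload(count: u64) -> AllocTelemetry {
    let mut vm = Vm::default();
    for i in 0..count {
        let _payload = vm.materialize_string(&format!("value-{}", i));
    }
    vm.telemetry()
}

/// Runs two workloads on separate threads, as a parallel test runner would.
pub fn run_concurrently(a: u64, b: u64) -> (AllocTelemetry, AllocTelemetry) {
    let first = thread::spawn(move || run_workload(a));
    let second = thread::spawn(move || run_workload(b));
    (first.join().unwrap(), second.join().unwrap())
}
```

With this shape, a test can assert on the counters of the VM it created without caring what other tests are doing in the same process.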

Pitfalls

  • Do not treat every allocation as equivalent. A first materialization that creates a semantically new value is not the same problem as cloning an old value on every hot-path access.

  • Do not promote engineering goals to public contracts by accident. "Zero alloc on the happy path" is useful for implementation discipline, but dangerous if readers start assuming it is a certification guarantee.

  • Do not optimize strings by changing surface semantics prematurely. The pressure here was internal. Reopening the guest ABI first would have increased scope and risk without solving the actual hotspot cleanly.

  • Do not use shared global counters for per-test behavioral evidence. The failed coverage run showed that instrumentation design matters as much as the metric itself.
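
To make the first pitfall concrete, here is a small back-of-the-envelope cost model in Rust. It is an assumption-laden sketch, not a measurement from the runtime: the function names are hypothetical, and it counts only payload bytes copied, ignoring allocator overhead and reference-count updates.

```rust
/// Bytes copied when the payload is materialized once and then cloned on every read.
pub fn bytes_copied_clone_per_read(payload_len: u64, reads: u64) -> u64 {
    payload_len + payload_len * reads
}

/// Bytes copied when the payload is materialized once and every read shares it.
pub fn bytes_copied_shared(payload_len: u64, _reads: u64) -> u64 {
    payload_len
}
```

For a 64-byte string read 10,000 times, the first function gives 640,064 bytes and the second gives 64. Both cases pay the same first materialization; only the repeated term differs, and that term is the one worth removing.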

Takeaways

  • The right performance question is often "where is the repeated cost?" rather than "where is any cost at all?"
  • Immutable payload sharing is a strong default when public semantics must remain stable and hot-path copies are the real issue.
  • Internal telemetry should help engineering decisions without silently becoming a product contract.