prometeu-runtime/discussion/lessons/DSC-0014-perf-vm-allocation-and-copy-pressure/LSN-0035-first-materialization-is-not-the-same-as-hot-path-copy-pressure.md
bQUARKz cd6bf6b313
[PERF] VM Allocation and Copy Pressure
2026-04-20 09:09:24 +01:00


---
id: LSN-0035
ticket: perf-vm-allocation-and-copy-pressure
title: First Materialization Is Not the Same as Hot-Path Copy Pressure
created: 2026-04-20
tags: [runtime, vm, performance, strings, allocation, telemetry]
---
## Context
`DSC-0014` started from a broad performance complaint around VM allocation and string-heavy execution. The tempting response would have been to chase "zero alloc everywhere" or to reopen public VM semantics around globals and strings.
The completed work converged on a narrower and more durable rule. The useful optimization boundary is not "all allocation is bad"; it is that the one-time cost of first materialization must be kept separate from repeated copy pressure on hot paths.
That distinction shaped three concrete outcomes:
- immutable string payloads are now shared instead of recopied in common paths such as constant-pool use and `GET_GLOBAL`;
- internal telemetry records heap-allocation and string-materialization evidence without turning those counters into certification policy;
- the published specs now explain that zero-allocation happy paths are engineering targets, not guest-visible compatibility promises.
## Key Decisions
### Allocation Baseline for Strings and Globals (`DEC-0018`)
**What:**
PROMETEU keeps strings in the public VM language surface, but the runtime treats string payloads as potentially expensive. Hardcoded strings may materialize once during build/load; runtime-created strings may materialize when a new value is semantically created; and `GET_GLOBAL` keeps its public behavior while avoiding unnecessary repeated payload copying internally.
**Why:**
The original hotspot was not the existence of strings. It was the fact that already-materialized payloads were being cloned repeatedly on hot paths. Fixing that internally is cheaper and safer than redefining the guest ABI.
**Trade-offs:**
String concatenation still creates a new string when the language semantics require it. The runtime gives up the illusion of "free strings", but preserves the correct public model while removing redundant copies that were never semantically required.
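As an illustration of this decision, the following is a minimal sketch, not the actual PROMETEU implementation. It assumes a Rust runtime and a reference-counted `Rc<str>` payload; the names `Value`, `Globals`, `set_str`, `get_global`, and `concat` are hypothetical. It shows a global read that shares an already-materialized payload, while concatenation still materializes a new one.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical VM value: the string payload is immutable and reference-counted.
#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    Int(i64),
    Str(Rc<str>),
}

/// Hypothetical global table; not the real PROMETEU layout.
#[derive(Default)]
pub struct Globals {
    slots: HashMap<u32, Value>,
}

impl Globals {
    /// First materialization: the payload bytes are allocated once, here.
    pub fn set_str(&mut self, slot: u32, text: &str) {
        self.slots.insert(slot, Value::Str(Rc::from(text)));
    }

    /// Hot-path read: cloning the `Value` only bumps a reference count;
    /// the string bytes are not copied.
    pub fn get_global(&self, slot: u32) -> Option<Value> {
        self.slots.get(&slot).cloned()
    }
}

/// Concatenation creates a semantically new value, so it allocates a new payload.
pub fn concat(a: &Rc<str>, b: &Rc<str>) -> Rc<str> {
    let mut joined = String::with_capacity(a.len() + b.len());
    joined.push_str(a);
    joined.push_str(b);
    Rc::from(joined)
}
```

Two reads of the same slot return handles to the same allocation, which `Rc::ptr_eq` can confirm. If values had to cross threads, `Arc<str>` would be the analogous choice, at the cost of atomic reference counting.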
## Patterns and Algorithms
- Separate semantic allocation from accidental copying:
A new value may require materialization. Re-reading or forwarding an existing value usually should not.
- Share immutable payloads aggressively:
If a payload is immutable and already materialized, shared ownership is often the cheapest way to preserve public semantics while removing internal copy pressure.
- Keep public semantics stable, move optimization inward:
`GET_GLOBAL` did not need new meaning. The fix belonged in representation and ownership, not in the opcode contract.
- Keep engineering evidence internal unless explicitly promoted:
The runtime now records internal allocation evidence, but that evidence is not automatically a certification rule or ABI promise.
- Make internal counters robust under parallel tests:
Process-global counters can produce false failures in concurrent test runs. Internal evidence for runtime behavior should be scoped so unrelated work cannot contaminate the result.
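The last two patterns can be sketched together. This is an assumption-laden illustration in Rust, not the runtime's real telemetry: `AllocTelemetry`, `Vm`, `store_global`, and `load_global` are hypothetical names. The counter is owned by a single VM instance, so it records materializations for that instance only, and a shared read records nothing.

```rust
use std::rc::Rc;

/// Hypothetical evidence counters, owned by one VM instance rather than
/// stored in a process-global static.
#[derive(Default)]
pub struct AllocTelemetry {
    string_materializations: u64,
}

impl AllocTelemetry {
    pub fn string_materializations(&self) -> u64 {
        self.string_materializations
    }
}

/// Hypothetical VM shell, reduced to a single global slot for the sketch.
#[derive(Default)]
pub struct Vm {
    telemetry: AllocTelemetry,
    global: Option<Rc<str>>,
}

impl Vm {
    /// Creating a new string value is a materialization, so it is counted.
    pub fn store_global(&mut self, text: &str) {
        self.telemetry.string_materializations += 1;
        self.global = Some(Rc::from(text));
    }

    /// Re-reading shares the existing payload and records nothing.
    pub fn load_global(&self) -> Option<Rc<str>> {
        self.global.clone()
    }

    /// Internal evidence only; nothing here is a guest-visible promise.
    pub fn telemetry(&self) -> &AllocTelemetry {
        &self.telemetry
    }
}
```

With this shape, a test that builds its own `Vm` sees only its own counts, regardless of what other tests do concurrently.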
## Pitfalls
- Do not treat every allocation as equivalent.
A first materialization that creates a semantically new value is not the same problem as cloning an old value on every hot-path access.
- Do not promote engineering goals to public contracts by accident.
"Zero alloc on the happy path" is useful for implementation discipline, but dangerous if readers start assuming it is a certification guarantee.
- Do not optimize strings by changing surface semantics prematurely.
The pressure here was internal. Reopening the guest ABI first would have increased scope and risk without solving the actual hotspot cleanly.
- Do not use shared global counters for per-test behavioral evidence.
The failed coverage run showed that instrumentation design matters as much as the metric itself.
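One way to avoid the shared-counter pitfall when a counter cannot be owned by a VM instance is to scope it per thread. The sketch below assumes Rust and uses the standard `thread_local!` macro; the counter and function names are hypothetical, and the source does not say which scoping mechanism the runtime actually adopted.

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    /// Hypothetical per-thread evidence counter.
    static STRING_COPIES: Cell<u64> = Cell::new(0);
}

/// Records one payload copy on the calling thread only.
pub fn record_string_copy() {
    STRING_COPIES.with(|c| c.set(c.get() + 1));
}

/// Reads the calling thread's counter.
pub fn string_copies() -> u64 {
    STRING_COPIES.with(|c| c.get())
}

/// Records `times` copies on a fresh thread and returns what that thread observed.
pub fn count_on_other_thread(times: u64) -> u64 {
    thread::spawn(move || {
        for _ in 0..times {
            record_string_copy();
        }
        string_copies()
    })
    .join()
    .expect("worker thread panicked")
}
```

Work done on another thread leaves the caller's counter untouched. The isolation holds only while the measured code stays on the measuring thread, so an instance-owned counter remains the more robust option when one is available.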
## Takeaways
- The right performance question is often "where is the repeated cost?" rather than "where is any cost at all?"
- Immutable payload sharing is a strong default when public semantics must remain stable and hot-path copies are the real issue.
- Internal telemetry should help engineering decisions without silently becoming a product contract.