# Metadata Convergence to AssetEntry.metadata Decision
Status: Accepted
Date: 2026-03-17
Domain Owner: `docs/packer`
Cross-Domain Impact: `docs/vm-arch`, runtime asset consumers
## Context
The packer asset contract currently has multiple metadata-producing sources with different responsibilities:
- asset-level runtime contract metadata (authoring declaration);
- codec-related metadata (codec configuration/effective codec parameters);
- pipeline-derived metadata generated during build materialization (for example, indexed ranges for packed samples).
At runtime, consumers read one metadata sink from the asset table: `AssetEntry.metadata`.
Without an explicit decision, the system risks inconsistent behavior and documentation drift:
- metadata spread across sources without deterministic merge semantics;
- ambiguity between storage-layout fields (`AssetEntry.offset`/`AssetEntry.size`) and pipeline-internal indexing data (`offset`/`length` per sample);
- Studio, packer, and runtime docs diverging on where runtime consumers should read final values.
## Decision
The following direction is adopted:
1. All runtime-consumable metadata must converge to a single sink: `AssetEntry.metadata`.
2. Source segmentation in `asset.json` is allowed for authoring clarity, but build materialization must normalize these sources into that single sink.
3. Metadata normalization must be deterministic and testable.
4. `AssetEntry.offset` and `AssetEntry.size` remain payload slicing fields and are not reinterpreted as pipeline-internal indexing metadata.
5. Pipeline indexing metadata (for example, audio per-`sample_id` ranges) must live inside `AssetEntry.metadata` under explicit keys.
## Adopted Constraints
### 1. Source Segmentation vs Runtime Sink
- authoring sources may remain segmented (asset metadata, codec metadata, pipeline metadata);
- runtime consumers must read effective values from `AssetEntry.metadata`;
- packer build output is responsible for normalization.
### 2. Deterministic Convergence
- normalization must produce the same `AssetEntry.metadata` for the same effective declaration and build inputs;
- metadata key collisions between independent sources must be rejected with a build-time error unless the family/format contract explicitly defines how that collision is resolved;
- normalization order and collision policy must be documented by packer specs.
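The convergence rules above can be sketched as a small merge function. This is a minimal illustration, not the packer's implementation: `normalize_metadata`, `MetadataCollisionError`, the source names, and the `allowed_overrides` parameter are all hypothetical, and the sketch assumes collisions are checked on top-level keys only.

```python
from __future__ import annotations

from typing import Any, Mapping, Sequence


class MetadataCollisionError(ValueError):
    """Raised when two independent sources define the same metadata key."""


def normalize_metadata(
    sources: Sequence[tuple[str, Mapping[str, Any]]],
    allowed_overrides: frozenset[str] = frozenset(),
) -> dict[str, Any]:
    """Merge segmented sources into one dict (hypothetical helper).

    `sources` is an ordered list of (source_name, values) pairs. A key defined
    by more than one source is rejected unless it is listed in
    `allowed_overrides`, which stands in for a family/format contract rule.
    """
    merged: dict[str, Any] = {}
    owner: dict[str, str] = {}
    for source_name, values in sources:
        for key in sorted(values):
            if key in merged and key not in allowed_overrides:
                raise MetadataCollisionError(
                    f"key {key!r} defined by both {owner[key]!r} and {source_name!r}"
                )
            merged[key] = values[key]
            owner[key] = source_name
    # Emit keys in sorted order so the result does not depend on dict insertion order.
    return dict(sorted(merged.items()))


# Example inputs mirroring the three source kinds named in the Context section.
sources = [
    ("asset", {"sample_rate": 22050, "channels": 1}),
    ("codec", {"codec": {"parity": 10}}),
    ("pipeline", {"samples": {"1": {"offset": 0, "length": 100}}}),
]
metadata = normalize_metadata(sources)
```

With an explicit override list, a later source replaces an earlier one for the listed keys only; every other duplicate still fails the build.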
### 3. Audio Indexing Semantics
For multi-sample audio banks, sample indexing metadata belongs to `AssetEntry.metadata`, keyed by `sample_id`.
Illustrative shape:
```json
{
  "metadata": {
    "sample_rate": 22050,
    "channels": 1,
    "samples": {
      "1": { "offset": 0, "length": 100 }
    },
    "codec": {
      "parity": 10
    }
  }
}
```
This decision accepts codec metadata in either a nested shape (for example `metadata.codec.*`) or a flat equivalent, and in both cases only when the family/format spec declares that shape explicitly.
### 4. Offset Ambiguity Guardrail
- `AssetEntry.offset`/`AssetEntry.size` describe where one packed asset payload is stored in `assets.pa`;
- `metadata.samples[*].offset`/`metadata.samples[*].length` describe internal layout/indexing inside that asset's runtime payload contract;
- documentation and tests must keep these meanings separate.
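The two offset meanings can be kept apart by treating them as two separate slicing steps. The sketch below is illustrative only: the `AssetEntry` dataclass is a simplified stand-in for the real asset table entry, the helper names are hypothetical, and it assumes that `metadata.samples[*].offset` is a byte offset relative to the start of the asset's own payload, which this decision does not specify.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class AssetEntry:
    """Simplified stand-in for an asset table entry (fields beyond these are omitted)."""
    offset: int  # where the payload starts inside assets.pa
    size: int    # payload size inside assets.pa
    metadata: dict[str, Any] = field(default_factory=dict)


def slice_asset_payload(assets_pa: bytes, entry: AssetEntry) -> bytes:
    """Step 1: storage-layout slicing, driven by AssetEntry.offset/size."""
    end = entry.offset + entry.size
    if entry.offset < 0 or end > len(assets_pa):
        raise ValueError("asset payload range falls outside assets.pa")
    return assets_pa[entry.offset:end]


def slice_sample(payload: bytes, entry: AssetEntry, sample_id: str) -> bytes:
    """Step 2: pipeline-internal indexing, driven by metadata.samples[sample_id].

    Assumption: offset/length are bytes relative to the payload, not to assets.pa.
    """
    sample_range = entry.metadata["samples"][sample_id]
    start, length = sample_range["offset"], sample_range["length"]
    if start < 0 or length < 0 or start + length > len(payload):
        raise ValueError(f"sample {sample_id!r} range falls outside the asset payload")
    return payload[start:start + length]


# A 32-byte stand-in for assets.pa holding one asset at bytes 8..23.
assets_pa = bytes(range(32))
entry = AssetEntry(
    offset=8,
    size=16,
    metadata={"samples": {"1": {"offset": 4, "length": 3}}},
)
payload = slice_asset_payload(assets_pa, entry)
sample = slice_sample(payload, entry, "1")  # bytes 12, 13, 14 of assets.pa
```

Keeping the second step's input as `payload` rather than `assets_pa` makes it hard to apply a sample offset against the wrong base by accident.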
## Why This Direction Was Chosen
- It keeps runtime consumption simple: one metadata sink.
- It preserves authoring ergonomics: source metadata can stay segmented by concern.
- It avoids semantic duplication between packer and runtime consumers.
- It creates a clear path for bank-like assets (tiles/audio) that require indexed internal metadata.
## Explicit Non-Decisions
This decision does not define:
- the final complete metadata schema for every asset family;
- the final canonical codec metadata shape (`nested` vs `flat`) for all formats;
- multi-sample audio runtime loading implementation details;
- exact binary container/header layout for audio banks.
## Implications
- packer specs must define normalization semantics and collision policy;
- packer build/materialization must emit normalized metadata into `AssetEntry.metadata`;
- runtime-facing docs must state that effective metadata is read from `AssetEntry.metadata`;
- tests must cover convergence correctness and ambiguity boundaries for offset semantics.
## Propagation Targets
Specs:
- [`../specs/3. Asset Declaration and Virtual Asset Contract Specification.md`](../specs/3.%20Asset%20Declaration%20and%20Virtual%20Asset%20Contract%20Specification.md)
- [`../specs/4. Build Artifacts and Deterministic Packing Specification.md`](../specs/4.%20Build%20Artifacts%20and%20Deterministic%20Packing%20Specification.md)
- [`../../vm-arch/ARCHITECTURE.md`](../../vm-arch/ARCHITECTURE.md)
Plans:
- next packer PR/plan that introduces metadata normalization and multi-source merge validation
Code:
- packer asset declaration/materialization pipeline
- asset table emission path (`AssetEntry.metadata` payload)
Tests:
- normalization unit tests (source merge determinism)
- collision/ambiguity tests for offset semantics
- regression tests for runtime-readable metadata shape
Docs:
- packer specs and learn artifacts covering metadata source-to-sink flow
- runtime asset-management documentation referencing `AssetEntry.metadata` as sink
## References
Related specs:
- [`../specs/3. Asset Declaration and Virtual Asset Contract Specification.md`](../specs/3.%20Asset%20Declaration%20and%20Virtual%20Asset%20Contract%20Specification.md)
- [`../specs/4. Build Artifacts and Deterministic Packing Specification.md`](../specs/4.%20Build%20Artifacts%20and%20Deterministic%20Packing%20Specification.md)
Related agendas:
- no formal agenda artifact yet for this specific topic; this decision consolidates current packer/runtime alignment discussion.
## Validation Notes
This decision is correctly implemented only when all of the following are true:
- runtime consumers can read final effective metadata exclusively from `AssetEntry.metadata`;
- segmented metadata sources in authoring inputs converge deterministically during packing;
- offset semantics remain unambiguous between asset-table payload slicing and pipeline-internal indexing;
- documentation across packer and runtime-facing domains is consistent about this source-to-sink contract.
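One way to make the determinism condition checkable is to compare a canonical serialization of the normalized metadata across builds. The sketch below assumes, purely for illustration, that metadata can be compared as canonical JSON; `canonical_metadata_bytes` is a hypothetical helper, and the actual on-disk encoding of `AssetEntry.metadata` is outside the scope of this decision.

```python
import json
from typing import Any


def canonical_metadata_bytes(metadata: dict[str, Any]) -> bytes:
    """Serialize metadata with sorted keys and fixed separators.

    Two metadata dicts with the same content produce the same bytes,
    regardless of the order in which keys were inserted.
    """
    return json.dumps(metadata, sort_keys=True, separators=(",", ":")).encode("utf-8")


# Same content, different insertion order: the canonical forms match.
first = {"sample_rate": 22050, "channels": 1}
second = {"channels": 1, "sample_rate": 22050}
same = canonical_metadata_bytes(first) == canonical_metadata_bytes(second)
```

A regression test can store the canonical bytes (or a hash of them) for a fixture asset and fail when a rebuild with identical inputs produces a different value.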