prometeu-studio/docs/specs/packer/4. Build Artifacts and Deterministic Packing Specification.md

# Build Artifacts and Deterministic Packing Specification
Status: Draft

Scope: Runtime-facing artifacts and deterministic build behavior

Purpose: Define how the packer emits `assets.pa` and companion artifacts.
## Authority and Precedence
This specification consolidates the initial packer agenda and decision wave into normative form.
Runtime-side reading semantics remain upstream in:
- `../runtime/docs/runtime/specs/13-cartridge.md`
- `../runtime/docs/runtime/specs/15-asset-management.md`
## Core Rules
1. `assets.pa` is the authoritative runtime-facing artifact.
2. Companion JSON files do not replace the internal header as the runtime contract.
3. The global asset table order is deterministic by increasing `asset_id`.
4. The runtime-facing `asset_id` is the same stable `asset_id` allocated by the packer registry.
## `assets.pa` Structure
Baseline structure:
```text
[fixed binary prelude]
[canonical JSON header]
[binary payload region]
```
### Prelude Fields
The baseline prelude includes:
- `magic`
- `schema_version`
- `header_len`
- `payload_offset`
- `flags`
- `reserved`
Rules:
- `flags` and `reserved` are present from the first schema version;
- `flags = 0` unless later specified otherwise;
- `reserved = 0` unless later specified otherwise;
- `header_checksum` is not part of the baseline envelope contract.
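
The envelope above can be sketched as follows. This is a minimal illustration, not the normative layout: the spec names the prelude fields but does not fix their widths, byte order, or the magic value, so the little-endian `u32` fields, the `MAGIC` placeholder, and the reading of `payload_offset` as a file-absolute position are all assumptions.

```python
import struct

# ASSUMPTIONS: field widths, little-endian byte order, and the magic value are
# hypothetical; the spec only names the fields.
PRELUDE = struct.Struct("<4sIIIII")  # magic, schema_version, header_len, payload_offset, flags, reserved
MAGIC = b"ASPA"  # placeholder, not the real magic


def build_assets_pa(header_bytes: bytes, payload: bytes, schema_version: int = 1) -> bytes:
    """Concatenate prelude, canonical JSON header, and payload region."""
    # Assumed meaning: payload_offset is where the payload region starts in the file.
    payload_offset = PRELUDE.size + len(header_bytes)
    prelude = PRELUDE.pack(MAGIC, schema_version, len(header_bytes), payload_offset, 0, 0)
    return prelude + header_bytes + payload


def read_prelude(blob: bytes) -> dict:
    """Decode the prelude fields from the start of the file."""
    names = ("magic", "schema_version", "header_len", "payload_offset", "flags", "reserved")
    return dict(zip(names, PRELUDE.unpack_from(blob, 0)))
```

Both `flags` and `reserved` are written as `0`, matching the baseline rules, and no checksum field is emitted.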
## Canonical JSON Header
The header is serialized canonically.
Rules:
- UTF-8 encoding;
- no extra whitespace;
- object keys sorted lexicographically;
- arrays preserve declared order;
- canonicalization applies recursively;
- runtime-facing header values in v1 must avoid floating-point numbers.
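
One way to satisfy these rules is sketched below using Python's standard `json` module. The function names are illustrative, and treating "lexicographically" as Unicode code point order (which is what `sort_keys=True` gives for string keys) is an assumption.

```python
import json


def reject_floats(value, path="$"):
    """Recursively fail if any header value is a floating-point number."""
    if isinstance(value, float):
        raise ValueError(f"floating-point value at {path}")
    if isinstance(value, dict):
        for key, item in value.items():
            reject_floats(item, f"{path}.{key}")
    elif isinstance(value, list):
        for i, item in enumerate(value):
            reject_floats(item, f"{path}[{i}]")


def canonical_header_bytes(header: dict) -> bytes:
    """Serialize with sorted keys, no extra whitespace, UTF-8 output."""
    reject_floats(header)
    text = json.dumps(
        header,
        sort_keys=True,          # applies recursively to nested objects
        separators=(",", ":"),   # no whitespace after separators
        ensure_ascii=False,      # keep non-ASCII as UTF-8 instead of \u escapes
        allow_nan=False,
    )
    return text.encode("utf-8")
```

Array order is left untouched by `json.dumps`, so declared order is preserved.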
## Header Contents
The header carries:
- `asset_table`
- `preload`
### Asset Table
`asset_table` is emitted as a deterministically ordered list of asset entries.
Rules:
- one entry per included asset in the current build set;
- no secondary runtime-only identity layer;
- no synthetic dense reindexing layer;
- `asset_name` remains present as logical/API-facing metadata.
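
As a small sketch of these rules, the table can be built by sorting the included entries on their registry `asset_id` and leaving the ids as they are. The duplicate check is an assumption that follows from treating `asset_id` as a unique stable identity.

```python
def build_asset_table(entries: list) -> list:
    """Order entries by increasing asset_id without reindexing them."""
    ordered = sorted(entries, key=lambda entry: entry["asset_id"])
    ids = [entry["asset_id"] for entry in ordered]
    if len(set(ids)) != len(ids):
        # Assumption: a repeated asset_id is a build failure.
        raise ValueError("duplicate asset_id in build set")
    return ordered
```

Sparse ids such as `3, 7, 42` stay sparse in the output; the runtime sees the same ids the registry allocated.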
### Asset Entry Metadata Convergence
Each emitted asset entry has one runtime metadata sink: `AssetEntry.metadata`.
Rules:
- packer materialization must normalize all runtime-consumable metadata-producing sources into `asset_table[].metadata`;
- equivalent declaration/build inputs must produce equivalent normalized metadata;
- metadata key collisions across independent sources must fail build unless the family/format spec declares an explicit merge rule;
- normalization behavior must be testable and covered by conformance-oriented tests.
Baseline normalized metadata segmentation:
- `output.metadata` materializes at the metadata root;
- `output.codec_configuration` materializes under `metadata.codec`;
- `output.pipeline` materializes under `metadata.pipeline`;
- format-specific runtime-required fields may remain directly readable at the metadata root when the runtime consumer requires them there.
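
The segmentation above might be implemented along these lines. This is a sketch only: the function name is hypothetical, and it assumes that no family/format merge rule applies, so any key collision fails the build.

```python
def normalize_metadata(output: dict) -> dict:
    """Fold the declared output sections into one metadata object."""
    # output.metadata materializes at the metadata root.
    metadata = dict(output.get("metadata", {}))
    # output.codec_configuration -> metadata.codec, output.pipeline -> metadata.pipeline
    for source_key, target_key in (("codec_configuration", "codec"), ("pipeline", "pipeline")):
        if source_key not in output:
            continue
        if target_key in metadata:
            # Assumption: no explicit merge rule is declared, so collisions are fatal.
            raise ValueError(f"metadata key collision on '{target_key}'")
        metadata[target_key] = output[source_key]
    return metadata
```

Because the result depends only on the declared sections, equivalent declarations yield equal metadata, which makes the behavior straightforward to cover with conformance tests.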
### Preload
Preload is emitted deterministically from per-asset declaration.
Rules:
- assets with `preload.enabled = false` do not appear in emitted preload data;
- assets with `preload.enabled = true` do appear;
- preload ordering is deterministic by increasing `asset_id`.
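
A minimal sketch of preload emission follows. The shape of each emitted preload entry is not defined in this section, so emitting only `asset_id` is an assumption, as is requiring every asset to declare `preload.enabled` explicitly rather than falling back to a default.

```python
def build_preload(assets: list) -> list:
    """Emit preload entries for enabled assets, ordered by asset_id."""
    # Strict key access: a missing declaration raises KeyError instead of
    # silently defaulting, so no hidden default can affect the output.
    enabled = [asset for asset in assets if asset["preload"]["enabled"]]
    ordered = sorted(enabled, key=lambda asset: asset["asset_id"])
    # Assumption: the entry carries only the asset_id.
    return [{"asset_id": asset["asset_id"]} for asset in ordered]
```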
## Companion Artifacts
The baseline companion artifacts are:
- `build/asset_table.json`
- `build/preload.json`
- `build/asset_table_metadata.json`
Rules:
- `build/asset_table.json` mirrors `header.asset_table` 1:1;
- `build/preload.json` mirrors `header.preload` 1:1;
- `build/asset_table_metadata.json` is tooling-only;
- richer tooling data must not be added to the 1:1 mirror files.
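
The 1:1 mirror rule can be kept honest by deriving both mirror files from the same header object that is written into `assets.pa`, as in the sketch below. Reusing the canonical serialization for the companion files is an assumption; this section only requires that their content mirrors the header. `build/asset_table_metadata.json` is omitted because its contents are not specified here.

```python
import json


def companion_files(header: dict) -> dict:
    """Return path -> bytes for the two 1:1 mirror files."""
    def dump(value) -> bytes:
        # Assumption: mirrors use the same canonical form as the header.
        text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
        return text.encode("utf-8")

    return {
        "build/asset_table.json": dump(header["asset_table"]),
        "build/preload.json": dump(header["preload"]),
    }
```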
## Alignment and Offsets
Alignment applies only where a specification explicitly requires it.
Rules:
- there is no implicit baseline alignment beyond envelope and format requirements;
- any required alignment must be normative and visible;
- emitted offsets are always relative to the payload region, never the start of the full file.
Offset ambiguity guardrail:
- `asset_table[].offset`/`asset_table[].size` describe payload slicing inside `assets.pa`;
- internal pipeline indexing data (for example per-sample ranges for audio banks) must live under `asset_table[].metadata`;
- internal indexing fields must not be interpreted as payload slicing fields.
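
To illustrate payload-relative offsets, the sketch below lays payloads out back to back and records each slice relative to the start of the payload region. Contiguous placement in `asset_id` order with no padding is an assumption made for the example; this section fixes only the offset base and the absence of implicit alignment.

```python
def layout_payloads(blobs: list) -> tuple:
    """blobs is a list of (asset_id, payload_bytes); returns (slices, region)."""
    slices = []
    region = bytearray()
    for asset_id, data in sorted(blobs, key=lambda item: item[0]):
        # offset is relative to the payload region, not to the start of the file
        slices.append({"asset_id": asset_id, "offset": len(region), "size": len(data)})
        region += data
    return slices, bytes(region)


def absolute_range(payload_offset: int, entry: dict) -> tuple:
    """Translate a payload-relative slice into file-absolute byte positions."""
    start = payload_offset + entry["offset"]
    return start, start + entry["size"]
```

A reader would add the prelude's `payload_offset` once, as in `absolute_range`, rather than storing file-absolute positions in the table.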
## Format-Specific Baseline: `TILES/indexed_v1`
The first-wave producer contract for `TILES/indexed_v1` is fixed and runtime-aligned.
### Tile Selection and Identity
Rules:
- only selected artifacts participate in the emitted tile bank;
- `1 artifact = 1 tile` in v1;
- artifacts are normalized by ascending `artifacts[*].index`;
- emitted `tile_id` equals the normalized artifact index;
- duplicate or gapped artifact indices are build-blocking structural failures.
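
A sketch of the normalization step is shown below. It takes the already selected artifacts as input and assumes that a valid index sequence starts at `0`; the starting value is not stated in this section.

```python
def normalize_artifacts(selected: list) -> list:
    """Return (tile_id, artifact) pairs ordered by ascending artifact index."""
    ordered = sorted(selected, key=lambda artifact: artifact["index"])
    indices = [artifact["index"] for artifact in ordered]
    if len(set(indices)) != len(indices):
        raise ValueError("duplicate artifact index")
    # Assumption: indices must form 0..n-1 with no holes.
    if indices != list(range(len(indices))):
        raise ValueError("gapped artifact indices")
    # tile_id equals the normalized artifact index
    return [(artifact["index"], artifact) for artifact in ordered]
```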
### Emitted Sheet Shape
Rules:
- the emitted tile-bank sheet is always `256 x 256` in v1;
- tile placement within that emitted sheet is row-major;
- runtime `width` and `height` for the bank entry therefore refer to the full emitted sheet, not one individual artifact;
- resulting per-bank capacities are:
  - `tile_size = 8` -> `1024` tiles
  - `tile_size = 16` -> `256` tiles
  - `tile_size = 32` -> `64` tiles
- capacity overflow is a build-blocking structural failure.
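
The capacities follow from `(256 / tile_size)^2`. The sketch below derives them and maps a `tile_id` to its pixel origin in the sheet; placing tile `0` at the top-left corner is an assumption, since this section states only that placement is row-major.

```python
SHEET_SIZE = 256  # emitted sheet is always 256 x 256 in v1


def bank_capacity(tile_size: int) -> int:
    """Number of tiles that fit in the emitted sheet."""
    if tile_size not in (8, 16, 32):
        raise ValueError("unsupported tile_size")
    tiles_per_row = SHEET_SIZE // tile_size
    return tiles_per_row * tiles_per_row


def tile_origin(tile_id: int, tile_size: int) -> tuple:
    """Row-major (x, y) pixel origin of a tile inside the sheet."""
    if not 0 <= tile_id < bank_capacity(tile_size):
        raise ValueError("tile_id exceeds bank capacity")
    tiles_per_row = SHEET_SIZE // tile_size
    row, col = divmod(tile_id, tiles_per_row)
    return col * tile_size, row * tile_size
```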
### Payload Layout
Rules:
- the serialized payload for `TILES/indexed_v1` is:
  1. one packed `u4` pixel plane for the full emitted sheet;
  2. one palette block of `64 * 16 * 2 = 2048` bytes;
- packed pixel bytes must be emitted by the packer, not inferred later by the runtime;
- the palette block is serialized as `RGB565` `u16` values;
- palette bytes must be emitted in little-endian order;
- payload interpretation must not depend on incidental per-artifact boundaries.
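
The two payload parts could be serialized roughly as follows. Two details are assumptions because this section does not state them: the nibble order inside each packed byte (here the first pixel goes in the high nibble) and the palette block being laid out palette by palette, 16 colors each.

```python
import struct


def pack_u4_plane(pixels: list) -> bytes:
    """Pack palette indices (0..15) two per byte for the full sheet."""
    if any(not 0 <= p <= 15 for p in pixels):
        raise ValueError("pixel index does not fit in 4 bits")
    padded = list(pixels) + [0] * (len(pixels) % 2)
    # Assumption: first pixel of each pair occupies the high nibble.
    return bytes((padded[i] << 4) | padded[i + 1] for i in range(0, len(padded), 2))


def pack_palette_block(palettes: list) -> bytes:
    """Serialize 64 palettes of 16 RGB565 colors as little-endian u16 values."""
    if len(palettes) != 64 or any(len(p) != 16 for p in palettes):
        raise ValueError("expected 64 palettes of 16 colors")
    flat = [color for palette in palettes for color in palette]
    return struct.pack("<1024H", *flat)  # 64 * 16 * 2 = 2048 bytes


def tile_bank_payload(pixels: list, palettes: list) -> bytes:
    """Pixel plane first, palette block second."""
    return pack_u4_plane(pixels) + pack_palette_block(palettes)
```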
### Runtime Entry Derivation
Rules:
- emitted tile-bank entries use:
  - `bank_type = TILES`
  - `codec = NONE`
- tile-bank v1 metadata must expose at least:
  - `tile_size`
  - `width`
  - `height`
  - `palette_count`
- tile-bank v1 requires `palette_count = 64`;
- tile-bank v1 size formulas are:
  - `size = ceil(width * height / 2) + 2048`
  - `decoded_size = (width * height) + 2048`
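
As a worked example of the size formulas, the helper below (a hypothetical name) evaluates them with integer arithmetic. For the v1 sheet of `256 x 256` this gives `size = 32768 + 2048 = 34816` and `decoded_size = 65536 + 2048 = 67584`.

```python
PALETTE_BLOCK_BYTES = 64 * 16 * 2  # 2048


def tile_bank_sizes(width: int, height: int) -> tuple:
    """Return (size, decoded_size) for a TILES/indexed_v1 entry."""
    pixel_count = width * height
    size = (pixel_count + 1) // 2 + PALETTE_BLOCK_BYTES  # ceil(pixel_count / 2) + 2048
    decoded_size = pixel_count + PALETTE_BLOCK_BYTES
    return size, decoded_size
```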
### Palette Contract
Rules:
- bank palettes are declared under `output.pipeline.palettes`;
- each palette declaration uses the shape `{ "index": <int>, "palette": { ... } }`;
- palette ordering is ascending numeric `index`, never raw array position;
- palette ids in the emitted tile bank are the normalized declared palette indices;
- any tile in the bank may be rendered with any palette in the bank at runtime;
- palette selection is a runtime draw concern, not a tile-payload embedding concern.
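
The ordering rule can be sketched as a sort on the declared `index` that ignores array position. Rejecting duplicate indices is an assumption made by analogy with artifact indices; this section does not say how duplicates or undeclared indices are handled.

```python
def normalize_palettes(declared: list) -> list:
    """Order palette declarations by numeric index, not array position."""
    ordered = sorted(declared, key=lambda entry: entry["index"])
    indices = [entry["index"] for entry in ordered]
    if len(set(indices)) != len(indices):
        # Assumption: duplicate palette indices are a build failure.
        raise ValueError("duplicate palette index")
    return ordered
```

With declarations listed as indices `2, 0`, the normalized order is `0, 2`, and those indices are the palette ids seen at runtime.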
## Determinism
Equivalent build inputs must produce equivalent outputs.
Rules:
- no filesystem iteration dependence;
- no hidden defaults that affect output;
- no usage-based hot-first packing in the baseline contract;
- determinism is preferred over speculative physical-layout optimization.
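
One way to exercise this requirement in a test is to fingerprint the emitted data and confirm that the result does not change when the inputs arrive in a different order. The sketch below is illustrative: the function name is hypothetical and the fingerprint covers only a list of asset entries, not a full `assets.pa`.

```python
import hashlib
import json


def build_fingerprint(assets: list) -> str:
    """SHA-256 over the canonically serialized, asset_id-ordered entries."""
    ordered = sorted(assets, key=lambda asset: asset["asset_id"])
    blob = json.dumps(ordered, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()
```

Feeding the same assets in reversed or shuffled order should yield the same digest; a mismatch points to order dependence somewhere in the build.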
## Non-Goals
- per-format payload internals for every output family
- future locality optimization policy
- cartridge-level integrity/signature strategy
## Exit Criteria
This specification is complete enough when:
- runtime-facing artifact authority is explicit;
- companion artifacts are bounded;
- emitted ordering and header behavior are deterministic.