PR-24 Asset File Cache Hydration and Walker Reuse
Domain Owner: docs/packer
Cross-Domain Impact: docs/studio
Briefing
The runtime loader already walks asset roots and produces walkResult, and PackerWorkspacePaths already reserves assets/.prometeu/cache.json.
What is still missing is the actual cache lifecycle:
- previous cache is not loaded before a walk;
- walkers do not receive prior file facts for comparison;
- walkResult does not become a durable cache artifact after the scan.
That leaves the current runtime path unable to reuse prior file knowledge such as lastModified, size, fingerprint, and family-specific probe metadata.
This PR introduces the first durable asset file cache flow for the runtime-backed packer wave. It also tightens how walk output becomes part of the runtime snapshot and how diagnostics are split between normal aggregated surfaces and file-scoped UI-facing surfaces.
Objective
Deliver an asset-scoped file cache stored in assets/.prometeu/cache.json, hydrated before the runtime walk and refreshed from the current walkResult after the walk completes, while also attaching the current walkResult to the runtime snapshot.
Dependencies
- ./PR-14-project-runtime-core-snapshot-model-and-lifecycle.md
- ./PR-15-snapshot-backed-asset-query-services.md
- ./PR-16-write-lane-command-completion-and-used-write-services.md
- ./PR-21-point-in-memory-snapshot-updates-after-write-commit.md
- ../specs/2. Workspace, Registry, and Asset Identity Specification.md
- ../specs/4. Build Artifacts and Deterministic Packing Specification.md
- ../specs/5. Diagnostics, Operations, and Studio Integration Specification.md
Scope
- define the durable schema for assets/.prometeu/cache.json
- store cache entries per asset and per discovered file, not as one flat global fingerprint bag
- restrict cache and internal file walk analysis to assets that are already registered and therefore have stable asset_id
- load prior cache state during runtime snapshot bootstrap and refresh
- pass prior asset-scoped cache entries into the asset walker
- let walkers compare current file observations against prior cached facts such as lastModified, size, fingerprint, and family-specific metadata
- treat the current walkResult as the source used to build the next durable cache state
- attach the current walkResult to the in-memory runtime snapshot for later query and UI use
- persist refreshed cache after a successful runtime load or write-path point patch that recomputes asset content
- keep cache miss, corruption, or version mismatch non-fatal for normal asset reads
- keep the Studio-visible asset query surface stable while the cache becomes an internal optimization and comparison input
- keep diagnostics out of the durable cache artifact
- sink general walk diagnostics into the normal asset/runtime diagnostics surface
- preserve file-scoped diagnostics as segregated walk output for UI consumers
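The per-asset, per-file layout described in this scope can be pictured with a small in-memory model. This is a minimal sketch under stated assumptions, not the packer's schema: every type and field name is hypothetical, asset_id is modelled as an int purely for illustration, and family-specific probe metadata is reduced to a string map.

```java
import java.util.Map;
import java.util.Optional;

// Illustrative sketch only: every type and field name here is hypothetical.
final class AssetFileCacheModelSketch {

    /** Reusable facts for one discovered file; deliberately has no diagnostics field. */
    record CachedFile(String mimeType, long size, long lastModified,
                      String fingerprint, Map<String, String> probeMetadata) {}

    /** All cached files of one registered asset, keyed by normalized relative path. */
    record CachedAsset(Map<String, CachedFile> filesByRelativePath) {}

    /** Workspace-level artifact: a schema version plus asset entries keyed by asset_id. */
    record WorkspaceCache(int schemaVersion, Map<Integer, CachedAsset> assetsById) {

        /** Looks up one file strictly inside the owning asset's slice. */
        Optional<CachedFile> find(int assetId, String relativePath) {
            return Optional.ofNullable(assetsById.get(assetId))
                    .map(asset -> asset.filesByRelativePath().get(relativePath));
        }
    }
}
```

Because the lookup always goes through the asset entry first, a file path that exists under one asset_id cannot be returned for another asset, which is the isolation property the flat fingerprint bag would not give.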
Non-Goals
- no remote/shared cache
- no final build/pack incremental pipeline
- no background watch service or external reconcile loop
- no silent reuse of stale cache entries when file identity no longer matches the current asset file
- no cache file per asset root; the baseline artifact remains the workspace-level assets/.prometeu/cache.json
- no UI contract that exposes raw cache internals directly to Studio
- no cache support for unregistered assets; registration remains the prerequisite for internal file analysis and durable cache ownership
Execution Shape
PR-24 should be treated as an umbrella execution plan, not as one direct implementation PR.
This work should be split into smaller follow-up PRs so cache persistence, walker reuse policy, and runtime snapshot integration can each land with narrow tests and isolated regressions.
Execution Method
- Introduce a packer-owned cache repository around PackerWorkspacePaths.cachePath(project). The repository must load, validate, and save one workspace cache artifact without leaking raw filesystem JSON handling into loaders or walkers.
- Define a versioned durable cache model. The baseline model should include:
  - workspace-level schema/version fields
  - asset-scoped entries keyed by stable asset_id
  - file-scoped entries keyed by normalized relative path inside the asset root
  - reusable file facts such as mime type, size, lastModified, content fingerprint, and family-specific probe metadata
  - no persisted diagnostics; diagnostics remain runtime results produced by the current walk only
- Extend walker inputs so previous cache is available during content probing. The walker contract should receive the prior asset cache view together with the declaration and asset root, rather than forcing each concrete walker to reopen cache storage on its own. Unregistered assets do not enter this flow; they must be registered first before internal file analysis and cache ownership apply.
- Define cache comparison rules inside the walker layer. Baseline rules:
  - if current file size differs from cached size, cached data is invalid immediately
  - if current file lastModified is after cached lastModified, cached data is invalid immediately
  - content hash or fingerprint should be the last comparison step, used only when the cheaper checks do not already force invalidation and the policy still needs stronger confirmation
  - if prior file facts remain valid under that ordered comparison policy, the walker may reuse prior metadata instead of recomputing everything
  - if identity facts differ, the walker must treat the file as changed and emit fresh probe output
  - missing prior cache is a normal cache miss, not an error
  - corrupted or incompatible prior cache should surface diagnostics or operational logging as appropriate, then fall back to a cold walk
- Promote walkResult from transient scan output to cache refresh input. After a successful walk, the loader must convert only the cacheable portions of the current walkResult into the next durable asset cache entry set and merge it into the workspace cache model. Persisted cache data must be limited to reusable probe facts and metadata, never diagnostics.
- Attach walk output to the runtime snapshot. The runtime snapshot should retain a dedicated runtime projection of the current walk output, not the raw probe objects themselves, so query services and Studio-facing adapters can access file-scoped probe metadata and file-scoped diagnostics without forcing a new filesystem walk. The initial runtime posture should keep the full available file set and the subset that is currently build-eligible, plus bank-size measurement data needed by future fixed-size hardware bank checks. The raw probe may still carry file bytes during the active walk, but the snapshot projection must strip byte payloads before retention. The snapshot should keep inventory, probe metadata, build-candidate classification, and bank-size measurements, but not whole file contents or raw PackerFileProbe instances by default. Later cleanup may reduce that retained surface, but the first implementation should prefer preserving available walk data rather than prematurely trimming it.
- Split diagnostic sinks intentionally. Baseline rule:
  - asset-level or walk-level diagnostics that represent the normal operational truth of the asset should flow into the usual runtime/query diagnostics sink
  - file-scoped diagnostics produced by probe processing should remain segregated per file inside the walk result projection
  - Studio may consume those file-scoped diagnostics for detailed UI rendering, but that segregation must not be lost by collapsing everything into one flat diagnostics list
  - none of those diagnostics are persisted in cache.json
- Persist cache only at stable visibility points. The normal runtime path should save refreshed cache after the loader finishes building a coherent snapshot. Write-path flows that patch one asset in memory should update only the affected asset cache entry after durable commit and successful re-walk.
- Keep runtime snapshot and cache ownership aligned. Runtime snapshot data may retain the current walk output needed by query services, but the durable cache artifact remains a packer-owned operational store under assets/.prometeu/cache.json.
- Emit observability only at meaningful boundaries. The implementation may emit cache_hit and cache_miss events or counters, but adapters must not collapse cache behavior into fake asset-change semantics.
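The ordered comparison rules in the execution method can be sketched as a single decision function. This is a minimal illustration under stated assumptions: the type names, the decide method, and the confirmWithFingerprint flag are all hypothetical, and the flag is only one possible reading of "the policy still needs stronger confirmation". The fingerprint is passed as a supplier so the expensive hash is computed only when the cheaper checks have not already invalidated the entry.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Illustrative sketch only: type and method names are hypothetical, not packer APIs.
final class WalkerCacheComparisonSketch {

    /** Prior facts for one file as a walker might receive them from the asset cache view. */
    record PriorFileFacts(long size, long lastModified, String fingerprint) {}

    enum Decision { REUSE_PRIOR, RECOMPUTE }

    static Decision decide(Optional<PriorFileFacts> prior,
                           long currentSize,
                           long currentLastModified,
                           boolean confirmWithFingerprint,
                           Supplier<String> currentFingerprint) {
        if (prior.isEmpty()) {
            return Decision.RECOMPUTE; // normal cache miss, not an error
        }
        PriorFileFacts cached = prior.get();
        if (currentSize != cached.size()) {
            return Decision.RECOMPUTE; // cheapest check: size differs
        }
        if (currentLastModified > cached.lastModified()) {
            return Decision.RECOMPUTE; // next check: file is newer than the cached fact
        }
        if (!confirmWithFingerprint) {
            return Decision.REUSE_PRIOR; // policy accepts the cheap checks alone
        }
        // Last and most expensive step: the supplier is invoked only here.
        return cached.fingerprint().equals(currentFingerprint.get())
                ? Decision.REUSE_PRIOR
                : Decision.RECOMPUTE;
    }
}
```

A walker test can count supplier invocations to prove the fingerprint is never computed when size or lastModified already forces invalidation.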
Acceptance Criteria
- runtime load attempts to read assets/.prometeu/cache.json before walking assets
- prior asset-scoped cache entries are passed into walkers as comparison input
- cache entries are keyed by stable asset_id, not by asset path
- unregistered assets do not receive cache entries and do not undergo internal file analysis before registration
- walkers compare current files against prior facts using ordered checks where size invalidates first, lastModified invalidates next when the current value is newer, and fingerprint/hash remains the final expensive check
- the current walk output is attached to the in-memory runtime snapshot through a byte-free runtime projection, not through raw probe objects
- the runtime snapshot keeps enough walk data to expose available files, build-candidate files, and bank-size measurement data
- the runtime snapshot does not retain raw bytes for every discovered file by default
- normal asset/runtime diagnostics include the general walk diagnostics that should participate in the standard diagnostics surface
- file-scoped diagnostics remain segregated in the walk result projection for UI consumers
- the resulting walkResult is used to refresh the durable cache state
- successful runtime load writes a coherent updated cache artifact back to assets/.prometeu/cache.json
- missing, corrupted, or version-mismatched cache does not block snapshot load; the packer falls back to a cold walk
- point write flows that already patch one asset in memory can refresh only that asset's cache slice after commit instead of forcing full cache rebuild
- cache entries are isolated by asset and file path so one asset cannot accidentally reuse another asset's file facts
- persisted cache does not contain diagnostics from prior runs
- Studio list/details behavior remains stable and does not depend on direct cache awareness
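The non-fatal hydration criterion above (missing, corrupted, or version-mismatched cache falls back to a cold walk) could look roughly like the following. This is a sketch under stated assumptions: all names are hypothetical, the supported schema version is an assumed value, and JSON decoding is injected as a function because the document does not say which JSON library the packer uses.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch only: names are hypothetical, and the decoder is a stand-in
// for whatever JSON handling the packer-owned repository actually uses.
final class CacheHydrationSketch {

    static final int SUPPORTED_SCHEMA_VERSION = 1; // assumed value

    /** Stand-in for the decoded workspace cache; the payload shape is a placeholder. */
    record LoadedCache(int schemaVersion, Map<String, String> payload) {
        static LoadedCache empty() {
            return new LoadedCache(SUPPORTED_SCHEMA_VERSION, Map.of());
        }
    }

    enum Outcome { HIT, MISSING, UNREADABLE_OR_CORRUPTED, VERSION_MISMATCH }

    /** The outcome is kept so callers can log or emit diagnostics without failing the load. */
    record Hydration(Outcome outcome, LoadedCache cache) {}

    static Hydration hydrate(Path cacheFile, Function<String, LoadedCache> decoder) {
        if (!Files.isRegularFile(cacheFile)) {
            return new Hydration(Outcome.MISSING, LoadedCache.empty());
        }
        try {
            LoadedCache decoded = decoder.apply(Files.readString(cacheFile));
            if (decoded.schemaVersion() != SUPPORTED_SCHEMA_VERSION) {
                return new Hydration(Outcome.VERSION_MISMATCH, LoadedCache.empty());
            }
            return new Hydration(Outcome.HIT, decoded);
        } catch (IOException | RuntimeException e) {
            // Unreadable file or decoder failure: continue with an empty cache (cold walk).
            return new Hydration(Outcome.UNREADABLE_OR_CORRUPTED, LoadedCache.empty());
        }
    }
}
```

Every non-HIT outcome returns an empty cache rather than throwing, so the loader can proceed with a cold walk and still report why hydration was skipped.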
Tests
- loader tests for cold load when cache.json is absent
- loader tests for warm load when prior cache exists and matches current files
- loader tests for fallback when cache.json is malformed, unreadable, or schema-incompatible
- cache model tests proving asset cache lookup is aligned by asset_id
- walker tests proving changed size invalidates reuse immediately
- walker tests proving newer lastModified invalidates reuse immediately
- walker tests proving fingerprint/hash is evaluated only as the last comparison step when cheaper checks do not already invalidate reuse
- walker tests proving stable files can reuse prior metadata without changing query-visible results
- cache serialization tests proving diagnostics are never written to cache.json
- snapshot/query tests proving walkResult is attached to the runtime asset model
- tests proving general walk diagnostics sink into the normal diagnostics surface
- tests proving file-scoped diagnostics remain segregated per file for UI-facing consumers
- runtime registry tests for point cache refresh after write commit on one asset
- event or observability tests for cache_hit and cache_miss boundaries if those signals are emitted in this wave
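The serialization and snapshot tests above exercise two different projections of the same walk output: a durable one with no diagnostics, and a runtime one with no bytes. The sketch below illustrates that split with hypothetical stand-in types. ProbedFile is not the packer's PackerFileProbe, and the retained fields are assumptions chosen to keep the example short.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch only: all names are hypothetical stand-ins for the packer's types.
final class WalkResultProjectionSketch {

    /** Raw per-file walk output: may carry bytes and file-scoped diagnostics. */
    record ProbedFile(String relativePath, long size, long lastModified, String fingerprint,
                      byte[] bytes, List<String> diagnostics) {}

    /** Durable shape: reusable facts only, with no bytes and no diagnostics. */
    record CacheEntry(long size, long lastModified, String fingerprint) {}

    /** Snapshot shape: facts plus file-scoped diagnostics, still without bytes. */
    record SnapshotFile(String relativePath, long size, List<String> diagnostics) {}

    /** Builds the next cache entries for one asset, keyed by relative path. */
    static Map<String, CacheEntry> toCacheEntries(List<ProbedFile> walk) {
        return walk.stream().collect(Collectors.toMap(
                ProbedFile::relativePath,
                f -> new CacheEntry(f.size(), f.lastModified(), f.fingerprint())));
    }

    /** Builds the byte-free projection retained by the runtime snapshot. */
    static List<SnapshotFile> toSnapshotProjection(List<ProbedFile> walk) {
        return walk.stream()
                .map(f -> new SnapshotFile(f.relativePath(), f.size(), List.copyOf(f.diagnostics())))
                .collect(Collectors.toList());
    }
}
```

Since CacheEntry has no diagnostics component and SnapshotFile has no byte array, a serialization test or a snapshot test can assert the exclusion structurally instead of inspecting written output.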
Risks and Recovery
- path-keyed cache would become unsafe during relocate flows, so the cache owner key must remain asset_id
- overly aggressive cache reuse can hide real content changes if comparison rules are under-specified
- saving cache at the wrong lifecycle point can publish partial truth that no coherent snapshot ever observed
- if one part of the cache flow proves unstable, recovery should disable cache hydration or persistence for that path and preserve the current cold-walk behavior until the narrower follow-up PR is corrected
Affected Artifacts
- docs/packer/pull-requests/**
- docs/packer/specs/2. Workspace, Registry, and Asset Identity Specification.md
- docs/packer/specs/5. Diagnostics, Operations, and Studio Integration Specification.md
- prometeu-packer/prometeu-packer-v1/src/main/java/p/packer/PackerWorkspacePaths.java
- prometeu-packer/prometeu-packer-v1/src/main/java/p/packer/repositories/**
- prometeu-packer/prometeu-packer-v1/src/main/java/p/packer/models/**
- prometeu-packer/prometeu-packer-v1/src/test/java/p/packer/services/**
- prometeu-packer/prometeu-packer-v1/src/test/java/p/packer/repositories/**
Suggested Next Step
Derive smaller implementation PRs from PR-24:
- cache model and repository
  Scope:
  - durable cache.json schema
  - load/save repository
  - asset_id-aligned cache lookup
  - serialization tests proving diagnostics are excluded
- walker contract and comparison policy
  Scope:
  - previous-cache input contract
  - ordered invalidation checks by size, then newer lastModified, then fingerprint/hash
  - file-scoped diagnostics preservation
- runtime snapshot and loader integration
  Scope:
  - attach walkResult to runtime snapshot
  - sink general diagnostics into the normal asset/runtime diagnostics surface
  - refresh cache from the cacheable parts of walkResult
  - point write-path refresh for one affected asset