---
id: LSN-0025
ticket: compiler-analyze-compile-build-pipeline-split
title: Compiler Pipeline Entrypoints and Result Boundaries
created: 2026-03-30
tags: [compiler, pipeline, analyze, compile, build, contracts, conformance]
---

## Context

The compiler pipeline used to expose one public `run` flow that always resolved dependencies, loaded sources, ran the frontend, lowered to IRVM, optimized, emitted bytecode, verified it, and finally wrote `build/program.pbx`.

That shape hid three different intents behind one operation:

- tooling-only semantic analysis with no artifact side effects,
- in-memory executable compilation with no disk write,
- and filesystem-backed artifact materialization.

This became a real boundary problem once Studio and future LSP-like consumers needed semantic results without forcing PBX persistence.

## Key Decisions

### Keep One Canonical Pipeline, Not Three Divergent Pipelines

**What:** The compiler now keeps one canonical shared stage order and exposes three public entrypoints over that same pipeline: `analyze`, `compile`, and `build`.

**Why:** The important architectural rule is shared semantics with different terminal boundaries, not separate services that slowly drift apart. `build` must stay defined as `compile` plus terminal persistence, not as another independent executable path.

**Trade-offs:** This keeps behavior consistent for callers, but it requires the stage boundaries and result contracts to be explicit. Without explicit contracts, one shared pipeline easily collapses back into a mutable context API that callers misuse.

### Make Terminal Stage Boundaries Part of the Public Contract

**What:** The entrypoints now mean:

- `analyze = ResolveDeps + LoadSources + FrontendPhase`
- `compile = analyze + LowerToIRVM + OptimizeIRVM + EmitBytecode + LinkBytecode + VerifyBytecode`
- `build = compile + WriteBytecodeArtifact`

**Why:** The value is not just naming.
Each entrypoint now communicates a precise side-effect boundary and a precise result boundary. That lets tooling consumers ask for semantic facts, executable callers ask for validated in-memory bytecode, and filesystem callers ask for persisted artifacts without inventing alternate pipeline semantics.

**Trade-offs:** The team must protect these boundaries with tests and conformance docs. If callers start bypassing them with ad hoc helpers, `compile` and `build` drift immediately.

### Publish Stable Result Contracts Instead of Leaking Mutable Pipeline Context

**What:** The public surface now returns stable record contracts:

- `AnalysisSnapshot`
- `CompileResult`
- `BuildResult`

**Why:** `BuilderPipelineContext` is mutable internal state, not a good external contract. Stable result models make the minimum payload explicit and keep callers from depending on incidental intermediate fields.

**Trade-offs:** This adds adapter code from pipeline context to public results. That cost is worth paying because it limits public coupling and makes future multi-caller evolution safer.

## Final Implementation

The implementation landed across specs, code, and tests:

- Chapter 23 in `docs/specs/compiler` now defines the canonical entrypoints and minimum result contracts.
- `BuilderPipelineService` now exposes `analyze`, `compile`, and `build` as the public surface.
- `Compile.main` now composes the default filesystem context and calls `build`.
- `AnalysisSnapshot`, `CompileResult`, and `BuildResult` carry the stable output contracts.
- Integration coverage now proves that `analyze` and `compile` do not write `build/program.pbx`, while `build` does.

## Examples

- Use `analyze` when the caller needs diagnostics, source table access, workspace resolution, and frontend semantic facts for tooling.
- Use `compile` when the caller needs validated executable bytecode in memory and must not touch the filesystem.
- Use `build` when the caller wants the normal artifact-producing compiler behavior and a concrete `program.pbx` path.

## Pitfalls

- Do not reintroduce a public `run` alias. That would blur the side-effect boundary the discussion just made explicit.
- Do not let `build` diverge semantically from `compile`. The only extra step for `build` is terminal artifact persistence.
- Do not leak `BuilderPipelineContext` back into callsites as the real public contract. That would make the stable result models nominal only.
- Do not add caller-specific configs that silently change stage order or stage meaning under the names `analyze`, `compile`, or `build`.
- Do not treat `compile` as "half-build". It is a complete validated in-memory executable result, not an editorially weaker path.

## References

- `DEC-0007` Canonical compiler entrypoints for analyze, compile, and build
- `PLN-0009` Propagate DEC-0007 into compiler pipeline specs and public contracts
- `PLN-0010` Refactor BuilderPipelineService into explicit analyze, compile, and build entrypoints
- `PLN-0011` Migrate compiler callsites and tests to explicit build, compile, and analyze entrypoints
- `docs/specs/compiler/23. Compiler Pipeline Entry Points Specification.md`
- `docs/specs/compiler/22. Backend Spec-to-Test Conformance Matrix.md`
- `prometeu-compiler/prometeu-build-pipeline/src/main/java/p/studio/compiler/workspaces/BuilderPipelineService.java`
- `prometeu-compiler/prometeu-build-pipeline/src/test/java/p/studio/compiler/integration/MainProjectPipelineIntegrationTest.java`

## Takeaways

- The durable pattern is one canonical compiler pipeline with explicit terminal entrypoints, not multiple near-duplicate pipelines.
- Side-effect boundaries are first-class API semantics: `analyze` and `compile` must stay no-write, and `build` is the only artifact-materialization path.
- Stable result contracts are part of the architectural fix; callers should consume `AnalysisSnapshot`, `CompileResult`, and `BuildResult`, not mutable pipeline internals.
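
The layering described under Key Decisions can be illustrated with a small sketch. This is not the real `BuilderPipelineService`: the class name `PipelineSketch`, the record fields, the placeholder source name, and the stubbed stage bodies are all assumptions made for illustration. Only the entrypoint names, the result record names, and the `build/program.pbx` path come from this lesson. The point it tries to show is that `build` is literally `compile` plus one write, and `compile` is `analyze` plus in-memory backend work.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical result shapes; the real minimum payloads are defined in the spec.
record AnalysisSnapshot(List<String> diagnostics, List<String> sources) {}
record CompileResult(AnalysisSnapshot analysis, byte[] bytecode) {}
record BuildResult(CompileResult compile, Path artifact) {}

// Hypothetical stand-in for the pipeline service; every stage body is a stub.
final class PipelineSketch {
    private final Path projectRoot;

    PipelineSketch(Path projectRoot) {
        this.projectRoot = projectRoot;
    }

    // analyze = ResolveDeps + LoadSources + FrontendPhase (no filesystem writes)
    AnalysisSnapshot analyze() {
        List<String> sources = List.of("main.example"); // placeholder source name
        List<String> diagnostics = List.of();           // placeholder frontend output
        return new AnalysisSnapshot(diagnostics, sources);
    }

    // compile = analyze + backend stages; the result stays in memory
    CompileResult compile() {
        AnalysisSnapshot analysis = analyze();
        byte[] bytecode = {0x50, 0x42, 0x58}; // placeholder for lowered, verified bytecode
        return new CompileResult(analysis, bytecode);
    }

    // build = compile + WriteBytecodeArtifact; the only entrypoint that writes
    BuildResult build() throws IOException {
        CompileResult compiled = compile();
        Path artifact = projectRoot.resolve("build").resolve("program.pbx");
        Files.createDirectories(artifact.getParent());
        Files.write(artifact, compiled.bytecode());
        return new BuildResult(compiled, artifact);
    }
}

class PipelineSketchDemo {
    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("pipeline-sketch");
        PipelineSketch pipeline = new PipelineSketch(root);
        Path pbx = root.resolve("build").resolve("program.pbx");

        pipeline.analyze();
        pipeline.compile();
        System.out.println("after compile: " + Files.exists(pbx)); // after compile: false

        pipeline.build();
        System.out.println("after build: " + Files.exists(pbx));   // after build: true
    }
}
```

Because callers in this sketch only ever receive immutable records, there is no mutable context for them to reach into, which is the same coupling limit the stable result contracts are meant to enforce.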