Compiler Pipeline Entry Points Specification

Status: Draft v1 (Operational Entry-Point Baseline)
Applies to: canonical compiler entrypoints, stage-terminal boundaries, and result contracts for analyze, compile, and build

1. Purpose

This document defines the canonical operational entrypoints of the shared compiler pipeline.

Its purpose is to make the compiler usable through explicit entrypoints instead of one monolithic public build invocation, while preserving one shared semantic pipeline for all supported frontends.

2. Scope

This document defines:

  • the canonical stage order shared by compiler entrypoints,
  • the terminal stage and mandatory behavior of analyze,
  • the terminal stage and mandatory behavior of compile,
  • the terminal stage and mandatory behavior of build,
  • the minimum public result contracts for those entrypoints,
  • and the context/config constraints that allow different callers without creating parallel pipeline semantics.

This document does not define:

  • one mandatory Java class or package layout,
  • editor-specific source providers or document-session providers,
  • frontend-specific semantic facts beyond the minimum shared AnalysisSnapshot contract,
  • or one mandatory CLI architecture.

3. Authority and Precedence

Normative precedence:

  1. Runtime authority (docs/specs/hardware/topics/chapter-2.md, chapter-3.md, chapter-9.md, chapter-12.md, chapter-16.md)
  2. Bytecode authority (docs/specs/bytecode/ISA_CORE.md)
  3. 14. Name Resolution and Module Linking Specification.md
  4. 20. IRBackend to IRVM Lowering Specification.md
  5. 21. IRVM Optimization Pipeline Specification.md
  6. 15. Bytecode and PBX Mapping Specification.md
  7. 19. Verification and Safety Checks Specification.md
  8. This document

If a rule in this document conflicts with a higher-precedence authority, the rule in this document is invalid.

4. Normative Inputs

This document depends on:

  • 14. Name Resolution and Module Linking Specification.md
  • 15. Bytecode and PBX Mapping Specification.md
  • 19. Verification and Safety Checks Specification.md
  • 20. IRBackend to IRVM Lowering Specification.md
  • 21. IRVM Optimization Pipeline Specification.md
  • docs/specs/compiler-languages/pbs/13. Lowering IRBackend Specification.md

5. Canonical Shared Pipeline

The compiler MUST expose one canonical shared pipeline.

The canonical stage order is:

  1. ResolveDepsPipelineStage
  2. LoadSourcesPipelineStage
  3. FrontendPhasePipelineStage
  4. LowerToIRVMPipelineStage
  5. OptimizeIRVMPipelineStage
  6. EmitBytecodePipelineStage
  7. LinkBytecodePipelineStage
  8. VerifyBytecodePipelineStage
  9. WriteBytecodeArtifactPipelineStage

Public entrypoints MAY terminate early according to this document, but they MUST NOT:

  1. reorder these stages,
  2. redefine their shared semantic meaning,
  3. or create parallel compiler pipelines under the same public entrypoint names.
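
As a minimal sketch of the ordering rule, assuming a Java host (this document mandates no class or package layout), the canonical order can be declared once and early termination expressed as a prefix of it. The enum constant names and the prefixThrough helper are hypothetical; only the stage order is taken from this section.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Declaration order mirrors the canonical stage order listed in this section.
enum PipelineStage {
    RESOLVE_DEPS,
    LOAD_SOURCES,
    FRONTEND_PHASE,
    LOWER_TO_IRVM,
    OPTIMIZE_IRVM,
    EMIT_BYTECODE,
    LINK_BYTECODE,
    VERIFY_BYTECODE,
    WRITE_BYTECODE_ARTIFACT;

    // Hypothetical helper: an entrypoint that terminates early runs a prefix
    // of the canonical order, so it cannot reorder or skip stages.
    static List<PipelineStage> prefixThrough(PipelineStage terminal) {
        List<PipelineStage> prefix = new ArrayList<>();
        for (PipelineStage stage : values()) {
            prefix.add(stage);
            if (stage == terminal) {
                break;
            }
        }
        return Collections.unmodifiableList(prefix);
    }
}
```

Deriving every entrypoint's stage list from one declaration is one way to make reordering structurally impossible rather than a convention that callers must remember.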

6. Entry Point Contracts

6.1 analyze

analyze MUST terminate at FrontendPhasePipelineStage.

analyze is defined as:

  1. ResolveDeps
  2. LoadSources
  3. FrontendPhase

analyze MUST:

  1. resolve workspace/dependency inputs needed by the frontend,
  2. load source surfaces admitted by the selected compiler configuration,
  3. run frontend semantic analysis,
  4. return an AnalysisSnapshot or equivalent result contract,
  5. and remain free of backend artifact side effects.

analyze MUST NOT:

  1. lower to IRVM,
  2. optimize IRVM,
  3. emit bytecode,
  4. run bytecode link/verify gates,
  5. or persist artifact files.

6.2 compile

compile MUST terminate at VerifyBytecodePipelineStage.

compile is defined as:

  1. analyze
  2. LowerToIRVM
  3. OptimizeIRVM
  4. EmitBytecode
  5. LinkBytecode
  6. VerifyBytecode

compile MUST:

  1. preserve the canonical executable path used to produce an in-memory executable artifact,
  2. produce a validated executable result in memory,
  3. include bytecode emission, bytecode link precheck, and bytecode verification,
  4. and preserve the correctness expectations of the current executable path used by PBS.

compile MUST NOT persist artifact files.

6.3 build

build MUST terminate at WriteBytecodeArtifactPipelineStage.

build is defined as:

  1. compile
  2. WriteBytecodeArtifact

build MUST:

  1. preserve the current default filesystem-oriented artifact behavior,
  2. write the executable artifact only after the compile contract has been satisfied,
  3. and remain the canonical artifact-materialization entrypoint for filesystem-backed callers.
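
The nesting of the three contracts above (compile extends analyze, build extends compile) can be sketched as follows. This is an illustration only: the class name EntryPointComposition is hypothetical, and each stage merely records its name instead of doing real work, so that the resulting stage sequence is observable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: stages only record their names so the nesting is visible.
final class EntryPointComposition {
    private final List<String> trace = new ArrayList<>();

    // analyze stops after the frontend; no backend stage runs.
    void analyze() {
        trace.add("ResolveDeps");
        trace.add("LoadSources");
        trace.add("FrontendPhase");
    }

    // compile extends analyze through verification, entirely in memory.
    void compile() {
        analyze();
        trace.add("LowerToIRVM");
        trace.add("OptimizeIRVM");
        trace.add("EmitBytecode");
        trace.add("LinkBytecode");
        trace.add("VerifyBytecode");
    }

    // build is the only entrypoint that reaches artifact persistence.
    void build() {
        compile();
        trace.add("WriteBytecodeArtifact");
    }

    List<String> trace() {
        return Collections.unmodifiableList(trace);
    }
}
```

Because compile calls analyze and build calls compile, no entrypoint has a private copy of an earlier stage that could drift from the shared one.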

7. Result Contracts

7.1 AnalysisSnapshot

AnalysisSnapshot is the minimum shared result contract for analyze.

AnalysisSnapshot MUST expose at minimum:

  1. diagnostics,
  2. semantic facts produced by the frontend,
  3. stable references to loaded sources, including the equivalent of FileTable,
  4. and workspace-resolution metadata required by tooling consumers.

Implementations MAY add fields, but they MUST NOT omit those minimum elements.
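
One possible Java shape for this minimum contract is sketched below. The method names, the element types (plain strings and maps), and the BasicAnalysisSnapshot carrier are all assumptions made for illustration; the specification fixes only the four minimum elements, not their representation.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Minimum shared contract; element types are placeholders, not mandated shapes.
interface AnalysisSnapshot {
    List<String> diagnostics();
    Map<String, Object> semanticFacts();
    List<String> loadedSources();            // stands in for the equivalent of FileTable
    Map<String, String> workspaceMetadata();
}

// Hypothetical read-only carrier; real implementations MAY add fields on top.
final class BasicAnalysisSnapshot implements AnalysisSnapshot {
    private final List<String> diagnostics;
    private final Map<String, Object> semanticFacts;
    private final List<String> loadedSources;
    private final Map<String, String> workspaceMetadata;

    BasicAnalysisSnapshot(List<String> diagnostics,
                          Map<String, Object> semanticFacts,
                          List<String> loadedSources,
                          Map<String, String> workspaceMetadata) {
        // Wrap in unmodifiable views so consumers cannot mutate the snapshot.
        this.diagnostics = Collections.unmodifiableList(diagnostics);
        this.semanticFacts = Collections.unmodifiableMap(semanticFacts);
        this.loadedSources = Collections.unmodifiableList(loadedSources);
        this.workspaceMetadata = Collections.unmodifiableMap(workspaceMetadata);
    }

    @Override public List<String> diagnostics() { return diagnostics; }
    @Override public Map<String, Object> semanticFacts() { return semanticFacts; }
    @Override public List<String> loadedSources() { return loadedSources; }
    @Override public Map<String, String> workspaceMetadata() { return workspaceMetadata; }
}
```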

7.2 Compile result

The compile result MUST expose at minimum:

  1. the validated in-memory executable artifact,
  2. the equivalent of bytecodeModule,
  3. the equivalent of serialized bytecode bytes,
  4. and enough metadata to allow build to persist the artifact without recompiling through another semantic path.

7.3 Build result

The build result MUST expose at minimum:

  1. the persisted artifact location,
  2. the compile payload from which the artifact was produced or an equivalent stable bridge to it,
  3. and the outcome needed by filesystem-backed callers to confirm successful artifact materialization.
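
The relationship between the compile and build results can be sketched as below: build persists from the compile payload alone, without recompiling. Every name here (CompileResult, BuildResult, ArtifactWriter, artifactName) and every field type is hypothetical, and an in-memory map stands in for the filesystem so the sketch stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical shape; field types are placeholders for the real artifact types.
final class CompileResult {
    final Object executable;          // validated in-memory executable artifact
    final Object bytecodeModule;      // the equivalent of bytecodeModule
    final byte[] serializedBytecode;  // the equivalent of serialized bytecode bytes
    final String artifactName;        // assumed metadata that build needs to persist

    CompileResult(Object executable, Object bytecodeModule,
                  byte[] serializedBytecode, String artifactName) {
        this.executable = executable;
        this.bytecodeModule = bytecodeModule;
        this.serializedBytecode = serializedBytecode.clone();
        this.artifactName = artifactName;
    }
}

final class BuildResult {
    final String artifactLocation;      // where the artifact was persisted
    final CompileResult compilePayload; // bridge back to the compile payload
    final boolean materialized;         // outcome for filesystem-backed callers

    BuildResult(String artifactLocation, CompileResult compilePayload, boolean materialized) {
        this.artifactLocation = artifactLocation;
        this.compilePayload = compilePayload;
        this.materialized = materialized;
    }
}

// Hypothetical writer: persists the already-serialized bytes, with no recompilation.
final class ArtifactWriter {
    private final Map<String, byte[]> store = new HashMap<>(); // stand-in for a filesystem

    BuildResult write(CompileResult compiled, String directory) {
        String location = directory + "/" + compiled.artifactName;
        store.put(location, compiled.serializedBytecode.clone());
        return new BuildResult(location, compiled, store.containsKey(location));
    }

    byte[] read(String location) {
        return store.get(location);
    }
}
```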

8. Context and Configuration Model

The compiler MAY accept distinct config/context inputs per entrypoint in order to support CLI, editor, LSP, and other callers.

Those configs/contexts MAY vary:

  1. source acquisition strategies,
  2. sinks or collectors,
  3. composition helpers,
  4. and the exact shape of public result wrappers.

Those configs/contexts MUST NOT:

  1. change the canonical stage order for a given entrypoint,
  2. redefine stage semantics,
  3. or create an alternate semantic pipeline under the public names analyze, compile, or build.

Filesystem-default composition MUST happen outside the pipeline core.
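
A sketch of this separation, under the assumption that source acquisition is the varying strategy: the core accepts a config that carries strategies only, and each caller composes its own config outside the core. SourceProvider, AnalyzeConfig, PipelineCore, EditorComposition, and FilesystemComposition are hypothetical names, and the core returns a stage trace string instead of a real analysis result.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical caller-supplied strategy: how source text is acquired.
interface SourceProvider {
    Map<String, String> loadSources(); // path -> source text
}

// Hypothetical config: carries strategies only, never a stage list.
final class AnalyzeConfig {
    final SourceProvider sources;

    AnalyzeConfig(SourceProvider sources) {
        this.sources = sources;
    }
}

// Pipeline core: the same stage order regardless of which caller built the config.
final class PipelineCore {
    static String analyze(AnalyzeConfig config) {
        int loaded = config.sources.loadSources().size();
        return "ResolveDeps>LoadSources(" + loaded + ")>FrontendPhase";
    }
}

// Editor-side composition: sources come from open buffers, not from disk.
final class EditorComposition {
    static AnalyzeConfig fromOpenBuffers(Map<String, String> buffers) {
        Map<String, String> copy = new LinkedHashMap<>(buffers);
        return new AnalyzeConfig(() -> Collections.unmodifiableMap(copy));
    }
}

// Filesystem-default composition, kept outside the pipeline core.
final class FilesystemComposition {
    static AnalyzeConfig fromFiles(List<Path> files) {
        return new AnalyzeConfig(() -> {
            Map<String, String> out = new LinkedHashMap<>();
            for (Path file : files) {
                try {
                    out.put(file.toString(),
                            new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
            return out;
        });
    }
}
```

Both compositions reach the same PipelineCore.analyze, so a caller can change where sources come from but not which stages run.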

9. Public Surface Rule

The shared compiler public surface MUST expose explicit entrypoints equivalent in meaning to:

  1. analyze(config, logs)
  2. compile(config, logs)
  3. build(config, logs)

The exact method or service names MAY vary, but the semantic split MUST remain explicit.

The legacy public concept run MUST NOT remain the normative public entrypoint. The behavior previously associated with default run MUST be expressed as build with filesystem-oriented config/context assembled outside the pipeline itself.
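
A possible Java rendering of this surface is sketched below. PipelineConfig, LogSink, CompilerEntryPoints, and LegacyRunAdapter are hypothetical names; the result types are left generic because the exact shape of the result wrappers may vary. The adapter illustrates one way to migrate former run callers: it is caller-side glue that delegates to build, not a fourth entrypoint.

```java
// Hypothetical placeholder types; the specification fixes only the three-way split.
interface PipelineConfig {
}

interface LogSink {
    void log(String message);
}

// A = analyze result, C = compile result, B = build result.
interface CompilerEntryPoints<A, C, B> {
    A analyze(PipelineConfig config, LogSink logs);

    C compile(PipelineConfig config, LogSink logs);

    B build(PipelineConfig config, LogSink logs);
}

// Hypothetical migration shim: the filesystem-oriented config is assembled by
// the caller, outside the pipeline, and the legacy behavior becomes a build call.
final class LegacyRunAdapter {
    static <A, C, B> B run(CompilerEntryPoints<A, C, B> compiler,
                           PipelineConfig filesystemConfig,
                           LogSink logs) {
        logs.log("legacy run is expressed as build with filesystem-oriented config");
        return compiler.build(filesystemConfig, logs);
    }
}
```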

10. Compatibility Rule

The entrypoints defined in this document MUST preserve compatibility with the current executable path that produces artifacts for PBS.

At minimum, that compatibility covers:

  1. the same semantic stage ordering up to executable artifact production,
  2. the same correctness envelope for the in-memory executable artifact before persistence,
  3. and no regression in emit/link/verify safety behavior.

PBS is a baseline correctness consumer for this path, but PBS MUST NOT become an exclusive semantic owner of the public compiler entrypoint surface.

11. Explicit Deferrals

The following remain deferred:

  • editor-specific source-provider contracts,
  • richer multi-profile result types beyond the minimum contracts in this document,
  • and frontend-specific analysis payload extensions beyond the required shared AnalysisSnapshot minimum.

12. Non-Goals

  • Creating separate public pipelines per frontend.
  • Replacing backend, bytecode, or runtime authority documents.
  • Freezing one internal implementation architecture prematurely.

13. Exit Criteria

This document is healthy when:

  1. the compiler entrypoints are explicit and stable,
  2. the terminal stage for each entrypoint is unambiguous,
  3. the minimum result contracts are explicit,
  4. run no longer acts as a normative public concept,
  5. and caller-specific contexts are constrained tightly enough to prevent semantic drift.