prometeu-studio/docs/agendas/PBS Intrinsics and Builtin Types Agenda.md

12 KiB

PBS Intrinsics and Builtin Types

Status: draft

This document defines a proposed intrinsic model for PBS and the Prometeu VM. Its goal is to let the frontend expose builtin types such as color, vec2, and pixel as first-class language entities while keeping their behavior implemented canonically inside the VM.

The design intentionally avoids coupling builtin types to the HAL. Host-backed behavior continues to flow through syscalls. Builtin domain behavior flows through VM intrinsics.

1. Objectives

This model exists to satisfy these constraints:

  1. Builtin types must be real language types, not just naming conventions.
  2. Builtin values must remain stack-only unless explicitly specified otherwise.
  3. Domain behavior of builtin types must execute in the VM, not in frontend- generated helper code, so semantics remain canonical.
  4. Host integration must stay separate from builtin semantics.
  5. The verifier must be able to validate intrinsic calls using slot layouts.
  6. The model must scale from scalar builtins such as color to fixed-layout aggregates such as vec2 and pixel.

2. Non-Objectives

This document does not define:

  1. Final parser syntax for every frontend language.
  2. The full bytecode encoding that will ship in the first implementation.
  3. Heap-backed builtin types.
  4. User-extensible intrinsics.

3. Terms

3.1 Builtin type

A builtin type is a language-level type whose semantic identity is owned by the runtime platform rather than by user code.

Examples:

  • color
  • vec2
  • pixel

3.2 Physical slot layout

Each builtin type has a canonical physical representation in VM slots.

Examples:

  • color -> 1 slot
  • vec2 -> 2 slots
  • pixel -> 3 slots

3.3 Intrinsic

An intrinsic is a VM-owned operation with a canonical semantic definition.

An intrinsic is not:

  • a user function
  • a syscall
  • a host callback

An intrinsic is callable, but it is executed entirely by the VM.

4. Three Kinds of Calls

The execution model should distinguish three categories:

4.1 Function call

Calls user-defined code.

  • Owned by the program
  • Compiled from source bodies
  • Executed through normal CALL

4.2 Intrinsic call

Calls VM-defined builtin behavior.

  • Owned by the VM
  • Declared in the frontend as builtin members or builtin operations
  • Executed through an intrinsic dispatch path

4.3 Syscall

Calls host-managed services.

  • Owned by the host platform
  • Resolved by canonical identity and capability checks
  • Executed through SYSCALL

The rule is simple:

  • builtin semantics -> intrinsic
  • host/platform behavior -> syscall

5. Builtin Types Are Values, Not Just Method Bags

A builtin type must define more than methods. It must define its value model.

Each builtin type should specify:

  1. Semantic identity
  2. Physical slot layout
  3. Named fields or projections
  4. Construction rules
  5. Constants, if any
  6. Intrinsic operations

5.1 Example: color

color is a scalar builtin type.

  • Semantic meaning: RGB565 color
  • Physical layout: 1 slot
  • Domain behavior: conversions, comparisons, construction helpers

Illustrative shape:

builtin type color {
  layout: [color]

  static fn from_raw(raw: int) -> color
  static fn rgb(r: int, g: int, b: int) -> color
}

The physical carrier for a scalar builtin may be:

  • a dedicated Value variant such as Value::Color(u16), or
  • another canonical 1-slot carrier defined by the VM ABI

The semantic type remains color either way.

5.2 Example: vec2

vec2 is a fixed-layout aggregate builtin type.

  • Semantic meaning: 2D vector
  • Physical layout: 2 slots
  • Field projections: x, y
  • Domain behavior: dot, length, distance

Illustrative shape:

builtin type vec2 {
  layout: [float, float]

  field x: float @slot(0)
  field y: float @slot(1)

  static fn new(x: float, y: float) -> vec2

  fn dot(other: vec2) -> float
  fn length() -> float
  fn distance(other: vec2) -> float
}

5.3 Example: pixel

pixel is also a fixed-layout aggregate builtin type.

  • Semantic meaning: position + color
  • Physical layout: 3 slots

Illustrative shape:

builtin type pixel {
  layout: [int, int, color]

  field x: int @slot(0)
  field y: int @slot(1)
  field color: color @slot(2)

  static fn new(x: int, y: int, color: color) -> pixel
}

The key point is that pixel is still a first-class type in the language even though it occupies three stack slots physically.

6. Frontend Contract

Frontend languages should expose builtin types through a builtin registry or prelude, not through normal user modules.

The frontend may present syntax such as:

  • vec2
  • vec2.new(1.0, 2.0)
  • v.x
  • v.dot(other)

But user code must not provide the implementation body of builtin members.

The frontend owns:

  • syntax
  • name resolution
  • type checking
  • lowering decisions

The frontend does not own:

  • builtin semantics
  • builtin numeric algorithms
  • host integration behavior

6.1 Builtin declarations are semantic shells

For FE purposes, a builtin declaration behaves like an interface or shell with no user body.

For example:

builtin type vec2 {
  field x: float
  field y: float

  static fn new(x: float, y: float) -> vec2
  fn dot(other: vec2) -> float
}

The frontend uses this declaration to type-check source programs, but lowering must map each member access to VM-defined behavior.

7. Lowering Rules

7.1 Construction

Construction of a builtin aggregate does not necessarily require a VM intrinsic.

If the physical layout is already the canonical stack layout, the frontend may lower construction to ordinary pushes or tuple shaping.

Examples:

  • vec2.new(x, y) -> push x, push y
  • pixel.new(x, y, c) -> push x, push y, push c

The constructor remains semantically builtin even if its lowering is a pure layout operation.

7.2 Field projection

Field projection also does not necessarily require a VM intrinsic.

If a builtin type has a fixed stack layout, the frontend may lower field access through canonical slot projection rules.

Examples:

  • vec2.x -> project slot 0
  • vec2.y -> project slot 1
  • pixel.color -> project slot 2

If projection requires non-trivial stack reshaping, the compiler may emit helper stack operations or a dedicated intrinsic. The semantic rule remains the same.

7.3 Domain behavior

Operations whose meaning belongs to the builtin domain should lower to intrinsics.

Examples:

  • v1.dot(v2) -> intrinsic
  • v.length() -> intrinsic
  • v.distance(other) -> intrinsic

This is the core reason intrinsics exist: the semantics of these operations must be owned by the VM rather than by frontend-emitted helper code.

8. Intrinsic Identity

Intrinsics should be language-independent.

A canonical identity should use:

(owner, name, version)

Examples:

  • ("color", "rgb", 1)
  • ("vec2", "dot", 1)
  • ("vec2", "length", 1)
  • ("vec2", "distance", 1)

owner may be:

  • a builtin type name, such as vec2
  • a builtin namespace, if needed later

This identity is used by the frontend and by any optional preload artifact format. The VM may execute a compact numeric intrinsic id after resolution.

9. Intrinsic Metadata

The VM should own a canonical intrinsic registry.

Illustrative shape:

IntrinsicMeta {
  id: u32
  owner: string
  name: string
  version: u16
  arg_layout: [AbiType]
  ret_layout: [AbiType]
  deterministic: bool
  may_allocate: bool
  cost_hint: u32
}

AbiType must describe physical slot layout, not just source syntax.

Illustrative shape:

AbiType =
  Int32
  Int64
  Float64
  Bool
  String
  Handle
  Color
  Aggregate([AbiType])

Examples:

  • color -> Color
  • vec2 -> Aggregate([Float64, Float64])
  • pixel -> Aggregate([Int32, Int32, Color])

Flattening into slots is canonical. For verifier purposes:

  • Color consumes 1 slot
  • Aggregate([Float64, Float64]) consumes 2 slots
  • Aggregate([Int32, Int32, Color]) consumes 3 slots

10. Bytecode Model

This specification recommends a distinct intrinsic instruction path.

Illustrative final executable form:

INTRINSIC <id>

The VM interprets <id> using the intrinsic registry.

An optional preload artifact may use a declaration table, similar to syscall resolution:

INTRCALL <decl_index>

with:

IntrinsicDecl {
  owner: string
  name: string
  version: u16
}

The loader may then resolve:

INTRCALL <decl_index> -> INTRINSIC <id>

This mirrors syscall resolution without introducing host coupling.

If the implementation prefers, PBS may also emit final INTRINSIC <id> directly once the intrinsic registry is stable.

11. Verifier Rules

The verifier must treat intrinsics as first-class stack effects with known layouts.

For each intrinsic, the verifier must know:

  1. input slot count
  2. output slot count
  3. slot layout
  4. whether allocation is allowed

Example:

vec2.dot(vec2, vec2) -> float

Canonical flattened layout:

  • args: [float, float, float, float]
  • returns: [float]

So verifier stack effect is:

pop 4, push 1

Another example:

pixel.new(int, int, color) -> pixel

Flattened layout:

  • args: [int, int, color]
  • returns: [int, int, color]

This may be lowered as:

  • no-op stack shaping, or
  • a constructor intrinsic with pop 3, push 3

The verifier must reason about the actual lowered form.

12. Determinism Rules

Intrinsic behavior must be VM-canonical.

This means:

  1. Intrinsics must not call the host.
  2. Intrinsics must not depend on wall-clock time.
  3. Intrinsics must not depend on frontend-generated helper algorithms.
  4. Numeric behavior must be specified tightly enough for portability.

This matters especially for floating-point operations such as:

  • vec2.length
  • vec2.distance

If the platform requires strong cross-host determinism, the VM must define the numeric behavior precisely enough to avoid backend drift. Depending on the final design, that may mean:

  • canonical floating-point rules
  • constrained instruction sequences
  • fixed-point math for some domains

The important requirement is semantic ownership by the VM.

13. Separation from HAL

Builtin types must not be defined by HAL types.

Correct layering:

  1. PBS and the VM define builtin semantic types.
  2. The VM defines intrinsic behavior.
  3. Syscalls bridge VM values to host services.
  4. The HAL adapts host/runtime values to device-specific behavior.

For example:

  • color is a VM builtin type
  • gfx.clear(color) is a syscall using a color argument
  • the host runtime converts that value into any HAL-specific color adapter

The HAL may consume builtin values, but it must not own their language-level definition.

A minimal first wave should support:

  1. Scalar builtin type:
    • color
  2. Aggregate builtin types:
    • vec2
    • pixel
  3. Intrinsic-only domain methods for vec2:
    • dot
    • length
    • distance

Initial constructor and field access lowering may remain compiler-driven when they are pure layout operations.

15. Summary

This proposal establishes the following model:

  1. Builtin types are first-class language types with canonical slot layouts.
  2. Stack-only builtin aggregates are valid even when they occupy multiple slots.
  3. Frontends expose builtin declarations as semantic shells, not as user- implemented code.
  4. Construction and field projection may lower directly to canonical stack layout operations.
  5. Builtin domain behavior lowers to VM intrinsics.
  6. Syscalls remain exclusively for host-backed behavior.
  7. The VM, not the frontend or HAL, owns the semantics of builtin operations.