prometeu-runtime/docs/specs/05-audio-peripheral.md

Audio Peripheral (Audio System)

Domain: virtual hardware: audio

Function: normative

Didactic companion: ../learn/mental-model-audio.md

1 Scope

This chapter defines the runtime-facing audio contract of PROMETEU.

The AUDIO peripheral is responsible for sound generation and mixing under explicit machine limits.

Core contract:

  • command-driven control from the game loop;
  • fixed output format;
  • finite voice count;
  • deterministic voice conflict policy;
  • explicit cost in cycles and certification surface.

2 General Architecture

The audio system is composed of:

  • Voices (channels) — independent players
  • Samples — PCM data
  • Mixer — summation of voices
  • Output — PCM stereo buffer

Timing boundary:

  • game sends commands at 60 Hz;
  • audio generates PCM at 48 kHz.
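The two rates fix how much PCM each game frame must cover. The sketch below works out that ratio; the constant and function names are illustrative, and the interleaved L/R buffer layout is an assumption, not something this chapter specifies.

```rust
// Rates taken from the timing boundary above.
const OUTPUT_RATE_HZ: u32 = 48_000;
const GAME_RATE_HZ: u32 = 60;

/// Number of stereo output frames generated per game frame.
const fn pcm_frames_per_game_frame() -> u32 {
    OUTPUT_RATE_HZ / GAME_RATE_HZ
}

/// Length in i16 values of one game frame of audio, assuming an
/// interleaved layout (L, R, L, R, ...). The layout is an assumption.
const fn interleaved_len_per_game_frame() -> usize {
    (pcm_frames_per_game_frame() * 2) as usize
}
```

Under these assumptions, a command sent on one game frame takes effect for a block of 800 output frames.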

3 Output Format

  • Sample rate: 48,000 Hz
  • Format: PCM16 stereo (signed i16)
  • Clipping: saturation/clamp
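A minimal sketch of saturation clipping, assuming the mixer accumulates in a wider integer type (i32 here, which is an assumption) before writing signed 16-bit output. Values outside the i16 range are pinned to the nearest limit instead of wrapping around.

```rust
/// Saturate a wide accumulator value into the signed 16-bit output range.
fn clamp_to_i16(acc: i32) -> i16 {
    acc.clamp(i16::MIN as i32, i16::MAX as i32) as i16
}
```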

4 Voices

4.1 Quantity

MAX_VOICES = 16

Each voice:

  • plays 1 sample at a time
  • is independent
  • is mixed into the final output

4.2 Voice State

Each voice maintains:

  • sample_id
  • pos (fractional position in the sample)
  • rate (pitch)
  • volume (0..255)
  • pan (0..255, left→right)
  • loop_mode (off / on)
  • loop_start, loop_end
  • priority (optional)
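The state listed above could be held in a structure like the following. This is a hypothetical layout: the field types, the use of `Option` for an idle voice and for the optional priority, and the idle defaults (full volume, pan 128 as center) are all assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum LoopMode {
    Off,
    On,
}

#[derive(Debug, Clone, PartialEq)]
struct Voice {
    sample_id: Option<u32>, // None means the voice is idle (assumption)
    pos: f64,               // fractional position in the sample, in frames
    rate: f64,              // pitch: 1.0 is normal speed
    volume: u8,             // 0..255, enforced by the type
    pan: u8,                // 0..255, left to right
    loop_mode: LoopMode,
    loop_start: u32,
    loop_end: u32,
    priority: Option<u8>,   // optional, as in the list above
}

impl Voice {
    /// An idle voice with assumed neutral defaults.
    fn idle() -> Self {
        Voice {
            sample_id: None,
            pos: 0.0,
            rate: 1.0,
            volume: 255,
            pan: 128,
            loop_mode: LoopMode::Off,
            loop_start: 0,
            loop_end: 0,
            priority: None,
        }
    }

    fn is_active(&self) -> bool {
        self.sample_id.is_some()
    }
}
```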

4.3 Voice Conflict

If all voices are occupied:

  • an explicit policy is applied:
    • STEAL_OLDEST
    • STEAL_QUIETEST
    • STEAL_LOWEST_PRIORITY
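One way to keep voice stealing deterministic is to reduce each policy to a numeric key and take the minimum, breaking ties by lowest voice index. The sketch below assumes that tie-break rule and a per-voice start timestamp (`started_at`); neither is defined by this chapter, and all names are hypothetical.

```rust
#[derive(Debug, Clone, Copy)]
enum StealPolicy {
    Oldest,
    Quietest,
    LowestPriority,
}

#[derive(Debug, Clone, Copy)]
struct VoiceInfo {
    started_at: u64, // tick at which playback started (assumed field)
    volume: u8,
    priority: u8,
}

/// Ordering key for a policy: the voice with the smallest key is stolen.
fn steal_key(policy: StealPolicy, v: &VoiceInfo) -> u64 {
    match policy {
        StealPolicy::Oldest => v.started_at,
        StealPolicy::Quietest => v.volume as u64,
        StealPolicy::LowestPriority => v.priority as u64,
    }
}

/// Index of the voice to steal, or None if there are no voices.
/// `min_by_key` returns the first minimum, so ties go to the lowest index.
fn pick_victim(voices: &[VoiceInfo], policy: StealPolicy) -> Option<usize> {
    voices
        .iter()
        .enumerate()
        .min_by_key(|&(_, v)| steal_key(policy, v))
        .map(|(i, _)| i)
}
```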

5 Samples

5.1 Format

PROMETEU samples:

  • PCM16 mono
  • own sample_rate (e.g., 22050, 44100, 48000)
  • immutable data at runtime

Fields:

  • sample_rate
  • frames_len
  • loop_start, loop_end (optional)

5.2 Command Surface

Representative command surface:

audio.play(sample, voice, volume, pan, pitch, priority)
audio.stop(voice)
audio.setVolume(voice, value)
audio.setPan(voice, value)
audio.setPitch(voice, value)
audio.isPlaying(voice)

6 Pitch and Interpolation

  • rate = 1.0 → normal speed
  • rate > 1.0 → higher pitch
  • rate < 1.0 → lower pitch

When the playback position falls between two sample frames:

  • linear interpolation is used between two neighboring samples
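The interpolation rule can be sketched as follows. The position step formula, which folds the sample's own rate into the 48 kHz output rate, and the choice to hold the last frame at the end of the data are assumptions; loop handling is left out.

```rust
const OUTPUT_RATE_HZ: f64 = 48_000.0;

/// Advance per output frame, in source frames (assumed formula).
fn step_per_output_frame(rate: f64, sample_rate: u32) -> f64 {
    rate * sample_rate as f64 / OUTPUT_RATE_HZ
}

/// Linearly interpolate between the two frames around `pos`.
/// Past the last frame the final value is held (assumption).
fn sample_at(data: &[i16], pos: f64) -> f64 {
    if data.is_empty() {
        return 0.0;
    }
    let last = data.len() - 1;
    let base = pos.floor();
    let i0 = (base as usize).min(last);
    let i1 = (i0 + 1).min(last);
    let frac = pos - base;
    let a = data[i0] as f64;
    let b = data[i1] as f64;
    a + (b - a) * frac
}
```

With these definitions, a 24,000 Hz sample played at rate 1.0 advances half a source frame per output frame, so every second output value is an interpolated midpoint.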

7 Mixer

For each output frame (48kHz):

  1. For each active voice:
    • read sample at current position
    • apply pitch
    • apply volume
    • apply pan → generates L/R
  2. Sum all voices
  3. Apply clamp
  4. Write to the stereo buffer

Cost depends on:

  • number of active voices
  • use of interpolation
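The per-frame steps above can be sketched as a single function. It assumes each active voice has already been read and pitch-shifted into one i16 value for this frame, and it uses a simple linear pan law in integer arithmetic; the pan law and the i32 accumulator are assumptions, not part of the contract.

```rust
/// One active voice's contribution to the current output frame.
struct VoiceFrame {
    value: i16, // sample value after position lookup and interpolation
    volume: u8, // 0..255
    pan: u8,    // 0 = full left, 255 = full right
}

fn clamp_to_i16(acc: i32) -> i16 {
    acc.clamp(i16::MIN as i32, i16::MAX as i32) as i16
}

/// Mix all voices into one stereo frame (left, right).
fn mix_frame(voices: &[VoiceFrame]) -> (i16, i16) {
    let mut left: i32 = 0;
    let mut right: i32 = 0;
    for v in voices {
        // Apply volume, then split across channels (linear pan, assumed).
        let scaled = v.value as i32 * v.volume as i32 / 255;
        left += scaled * (255 - v.pan as i32) / 255;
        right += scaled * v.pan as i32 / 255;
    }
    // Sum first, clamp once at the end.
    (clamp_to_i16(left), clamp_to_i16(right))
}
```

The loop body runs once per active voice per output frame, which is why the active voice count drives the cost.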

8 Synchronization with the Game

  • Game runs at 60Hz
  • Audio generates data at 48kHz

Every frame (60Hz):

  • game sends commands:
    • play
    • stop
    • set_volume
    • set_pan
    • set_pitch

The audio peripheral applies these commands at the frame boundary and keeps playing between frames.
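A minimal sketch of that per-frame step, assuming commands are collected into a list and applied in submission order before the next block of PCM is generated. The `Command` and `VoiceSlot` types are hypothetical, and voice indices are assumed to have been validated at the syscall boundary already.

```rust
const MAX_VOICES: usize = 16;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Command {
    Play { voice: usize, sample_id: u32 },
    Stop { voice: usize },
    SetVolume { voice: usize, value: u8 },
    SetPan { voice: usize, value: u8 },
    SetPitch { voice: usize, value: f32 },
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct VoiceSlot {
    sample_id: Option<u32>, // None means idle (assumption)
    volume: u8,
    pan: u8,
    rate: f32,
}

const IDLE: VoiceSlot = VoiceSlot { sample_id: None, volume: 255, pan: 128, rate: 1.0 };

/// Apply one game frame's commands in submission order.
/// Indices are assumed valid; an out-of-range index would panic here.
fn apply_commands(voices: &mut [VoiceSlot; MAX_VOICES], commands: &[Command]) {
    for cmd in commands {
        match *cmd {
            Command::Play { voice, sample_id } => voices[voice].sample_id = Some(sample_id),
            Command::Stop { voice } => voices[voice].sample_id = None,
            Command::SetVolume { voice, value } => voices[voice].volume = value,
            Command::SetPan { voice, value } => voices[voice].pan = value,
            Command::SetPitch { voice, value } => voices[voice].rate = value,
        }
    }
}
```

Voices not named by any command keep their state, which matches the "continues playing" behavior described above.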

9 Host Responsibilities

The PROMETEU machine defines the audio model, commands, and limits.

The host is responsible for:

  • choosing the concrete audio backend;
  • scheduling buffer delivery;
  • keeping PCM output continuous without changing logical behavior.

10 Audio and CAP

Audio participates in the Execution CAP:

  • mixing cost per frame
  • cost per active voice
  • command cost

Example:

Frame 1024:
voices_active: 9
mix_cycles: 410
audio_commands: 6

11 Syscall Return and Fault Policy

Audio follows a status-first policy for operations with operational failure modes.

Fault boundary:

  • Trap: structural ABI misuse (type/arity/capability/shape mismatch);
  • status: operational failure;
  • Panic: internal invariant break only.

11.1 MVP return shape

In the current MVP:

  • audio.play returns status:int;
  • audio.play_sample returns status:int.

ABI audio.play:

  1. bank_id: int — index of the sound bank
  2. sample_id: int — index of the sample within the bank
  3. voice_id: int — index of the voice to use (0..15)
  4. volume: int — volume level (0..255)
  5. pan: int — panning (0..255)
  6. pitch: float — playback rate
  7. loop_mode: int — 0 for Off, 1 for On

Return-shape matrix in v1 syscall surface:

Syscall | Return | Policy basis
audio.play | status:int | operational rejection is observable
audio.play_sample | status:int | operational rejection is observable

11.2 Minimum status table for play/play_sample

  • 0 = OK
  • 1 = VOICE_INVALID
  • 2 = SAMPLE_NOT_FOUND
  • 3 = ARG_RANGE_INVALID
  • 5 = NO_EFFECT
  • 6 = BANK_INVALID

Operational rules:

  • no fallback to default bank when a bank id cannot be resolved;
  • no silent no-op for invalid voice_id;
  • invalid voice_id must return VOICE_INVALID, not ARG_RANGE_INVALID;
  • invalid numeric ranges (e.g. volume, pan, pitch) must return explicit status.
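The status table and rules above can be sketched as an enum plus an argument check. The check order (voice, then bank, then sample, then numeric ranges), the rule that pitch must be finite and positive, and the `bank_sizes` slice standing in for the loaded banks are assumptions made for illustration. `loop_mode` is omitted, and NO_EFFECT is declared but not produced by this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(i32)]
enum AudioStatus {
    Ok = 0,
    VoiceInvalid = 1,
    SampleNotFound = 2,
    ArgRangeInvalid = 3,
    NoEffect = 5,
    BankInvalid = 6,
}

const MAX_VOICES: i64 = 16;

/// Validate audio.play arguments; `bank_sizes[i]` is the number of
/// samples in bank `i` (hypothetical stand-in for the bank table).
fn check_play(
    bank_sizes: &[usize],
    bank_id: i64,
    sample_id: i64,
    voice_id: i64,
    volume: i64,
    pan: i64,
    pitch: f64,
) -> AudioStatus {
    // An invalid voice gets its own status, never ARG_RANGE_INVALID.
    if !(0..MAX_VOICES).contains(&voice_id) {
        return AudioStatus::VoiceInvalid;
    }
    // No fallback to a default bank when the id cannot be resolved.
    if bank_id < 0 || bank_id as usize >= bank_sizes.len() {
        return AudioStatus::BankInvalid;
    }
    if sample_id < 0 || sample_id as usize >= bank_sizes[bank_id as usize] {
        return AudioStatus::SampleNotFound;
    }
    let in_byte_range = |v: i64| (0..=255).contains(&v);
    // Pitch rule (finite and > 0) is an assumption.
    if !in_byte_range(volume) || !in_byte_range(pan) || !pitch.is_finite() || pitch <= 0.0 {
        return AudioStatus::ArgRangeInvalid;
    }
    AudioStatus::Ok
}
```

Checking the voice first means a call with both a bad voice and a bad volume reports VOICE_INVALID, which is consistent with the rule that an invalid voice_id must not be reported as a range error.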