prometeu-runtime/docs/specs/runtime/05-audio-peripheral.md

# Audio Peripheral (Audio System)

Domain: virtual hardware: audio
Function: normative

Didactic companion: [`../learn/mental-model-audio.md`](../runtime/learn/mental-model-audio.md)

## 1 Scope

This chapter defines the runtime-facing audio contract of PROMETEU.

The AUDIO peripheral is responsible for sound generation and mixing under explicit machine limits.

Core contract:

- command-driven control from the game loop;
- fixed output format;
- finite voice count;
- deterministic voice conflict policy;
- explicit cost in cycles and certification surface.

## 2 General Architecture

The audio system is composed of:

- **Voices (channels)** — independent players
- **Samples** — PCM data
- **Mixer** — summation of voices
- **Output** — PCM stereo buffer

Timing boundary:

- game sends commands at **60 Hz**;
- audio generates PCM at **48 kHz**.
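The two rates above imply a fixed number of PCM frames per game tick. The following is a small arithmetic sketch, not part of the normative contract; the function and constant names are illustrative only.

```python
OUTPUT_RATE_HZ = 48_000  # PCM output rate stated in this chapter
GAME_RATE_HZ = 60        # command rate stated in this chapter


def frames_per_tick(output_rate_hz: int = OUTPUT_RATE_HZ,
                    game_rate_hz: int = GAME_RATE_HZ) -> int:
    """Return how many stereo PCM frames cover one game tick."""
    frames, remainder = divmod(output_rate_hz, game_rate_hz)
    if remainder:
        # With 48000 / 60 this never triggers; other rates may not divide evenly.
        raise ValueError("output rate is not a multiple of the game rate")
    return frames


FRAMES_PER_TICK = frames_per_tick()
# PCM16 stereo: 2 channels x 2 bytes per frame.
BYTES_PER_TICK = FRAMES_PER_TICK * 2 * 2
```

With the stated rates, each 60 Hz tick corresponds to 800 output frames, or 3200 bytes of PCM16 stereo data.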

## 3 Output Format

- Sample rate: **48,000 Hz**
- Format: **PCM16 stereo (signed i16)**
- Clipping: saturation/clamp
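Saturation means that a mixed value outside the signed 16-bit range is pinned to the nearest limit instead of wrapping around. A minimal sketch, assuming the mix is accumulated in a wider integer before being written out (the helper name `clamp_i16` is hypothetical):

```python
I16_MIN = -32768
I16_MAX = 32767


def clamp_i16(value: int) -> int:
    """Saturate an accumulated mix value to the signed 16-bit range."""
    return max(I16_MIN, min(I16_MAX, value))
```

For example, two loud voices summing to 40000 would be written as 32767 rather than wrapping to a negative value.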

## 4 Voices

### 4.1 Quantity

```
MAX_VOICES = 16
```

Each voice:

- plays **1 sample at a time**
- is independent
- is mixed into the final output

### 4.2 Voice State

Each voice maintains:

- `sample_id`
- `pos` (fractional position in the sample)
- `rate` (pitch)
- `volume` (0..255)
- `pan` (0..255, left→right)
- `loop_mode` (off / on)
- `loop_start`, `loop_end`
- `priority` (optional)

### 4.3 Voice Conflict

If all voices are occupied, an explicit policy selects the voice to replace:

- `STEAL_OLDEST`
- `STEAL_QUIETEST`
- `STEAL_LOWEST_PRIORITY`
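The policies above could be sketched as a deterministic victim selection. This is an illustration under assumptions, not the runtime's implementation: the `VoiceInfo` record, its `started_at` field, the convention that a smaller `priority` value is less important, and the lowest-index tie-break are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VoiceInfo:
    index: int
    started_at: int  # hypothetical: tick at which playback started
    volume: int      # 0..255
    priority: int    # assumption: smaller value means less important


def pick_victim(voices: list[VoiceInfo], policy: str) -> int:
    """Return the index of the voice to steal under the given policy."""
    if policy == "STEAL_OLDEST":
        key = lambda v: (v.started_at, v.index)
    elif policy == "STEAL_QUIETEST":
        key = lambda v: (v.volume, v.index)
    elif policy == "STEAL_LOWEST_PRIORITY":
        key = lambda v: (v.priority, v.index)
    else:
        raise ValueError(f"unknown policy: {policy}")
    # The index in the key breaks ties, so the result never depends on
    # iteration order.
    return min(voices, key=key).index
```

Including the voice index in the sort key is one way to keep the result deterministic when two voices tie on the primary criterion.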

## 5 Samples

### 5.1 Format

PROMETEU samples:

- **PCM16 mono**
- each sample has its own `sample_rate` (e.g., 22050, 44100, or 48000 Hz)
- immutable data at runtime

Fields:

- `sample_rate`
- `frames_len`
- `loop_start`, `loop_end` (optional)

### 5.2 Command Surface

Representative command surface:

```text
audio.play(sample, voice, volume, pan, pitch, priority)
audio.stop(voice)
audio.setVolume(voice, value)
audio.setPan(voice, value)
audio.setPitch(voice, value)
audio.isPlaying(voice)
```

## 6 Pitch and Interpolation

- `rate = 1.0` → normal speed
- `rate > 1.0` → higher pitch
- `rate < 1.0` → lower pitch

Because the playback position `pos` is generally fractional, **linear interpolation** is used between the two neighboring sample frames.
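Linear interpolation at a fractional position can be sketched as follows. The function names are hypothetical. Two details are assumptions not stated in this chapter: holding the last frame at the end of the data, and scaling the per-frame step by the ratio of the sample's own `sample_rate` to the 48 kHz output rate.

```python
def read_interpolated(frames: list[int], pos: float) -> float:
    """Linearly interpolate between the two frames around `pos`.

    `pos` must satisfy 0 <= pos < len(frames).
    """
    i = int(pos)
    frac = pos - i
    a = frames[i]
    # Assumption: at the last frame there is no right neighbor, so hold it.
    b = frames[i + 1] if i + 1 < len(frames) else a
    return a + (b - a) * frac


def step_per_output_frame(rate: float, sample_rate: int,
                          output_rate: int = 48_000) -> float:
    """Assumed position advance per output frame for a given pitch rate."""
    return rate * sample_rate / output_rate
```

Under these assumptions, a 24000 Hz sample played at `rate = 1.0` advances half a source frame per output frame, so every second output value is an interpolated midpoint.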

## 7 Mixer

For each output frame (48 kHz):

1. For each active voice:
    - read sample at current position
    - apply pitch
    - apply volume
    - apply pan → generates L/R
2. Sum all voices
3. Apply clamp
4. Write to the stereo buffer

Cost depends on:

- number of active voices
- use of interpolation
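The per-frame mixing steps can be sketched as below. This chapter does not specify a pan law or a rounding rule, so the linear pan (`255 - pan` to the left, `pan` to the right) and the integer floor division are assumptions. The sketch also takes the already-read sample value for each voice as input, so position and pitch handling are outside it.

```python
I16_MIN, I16_MAX = -32768, 32767


def mix_frame(voices: list[tuple[int, int, int]]) -> tuple[int, int]:
    """Mix one stereo output frame.

    Each entry is (value, volume, pan): the voice's current sample value,
    its volume in 0..255, and its pan in 0..255 (left to right).
    """
    left = 0
    right = 0
    for value, volume, pan in voices:
        scaled = value * volume
        # Assumed linear pan law; floor division rounds toward negative infinity.
        left += scaled * (255 - pan) // (255 * 255)
        right += scaled * pan // (255 * 255)
    # Saturate the sums to PCM16 before writing to the stereo buffer.
    left = max(I16_MIN, min(I16_MAX, left))
    right = max(I16_MIN, min(I16_MAX, right))
    return left, right
```

The loop body runs once per active voice per output frame, which is why the number of active voices dominates the cost.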

## 8 Synchronization with the Game

- Game runs at **60 Hz**
- Audio generates data at **48 kHz**

Every frame (60 Hz):

- game sends commands:
    - play
    - stop
    - set_volume
    - set_pan
    - set_pitch

The audio peripheral applies these commands and continues playing.

## 9 Host Responsibilities

The PROMETEU machine defines the audio model, commands, and limits.

The host is responsible for:

- choosing the concrete audio backend;
- scheduling buffer delivery;
- keeping PCM output continuous without changing logical behavior.

## 10 Audio and CAP

Audio participates in the Execution CAP:

- mixing cost per frame
- cost per active voice
- command cost

Example:

```
Frame 1024:
voices_active: 9
mix_cycles: 410
audio_commands: 6
```

## 11 Syscall Return and Fault Policy

`audio` follows a status-first policy for operations with operational failure modes.

Fault boundary:

- `Trap`: structural ABI misuse (type/arity/capability/shape mismatch);
- `status`: operational failure;
- `Panic`: internal invariant break only.

### 11.1 MVP return shape

In the current MVP:

- `audio.play` returns `status:int`;
- `audio.play_sample` returns `status:int`.

ABI of `audio.play` (arguments in order):
1. `bank_id: int` — index of the sound bank
2. `sample_id: int` — index of the sample within the bank
3. `voice_id: int` — index of the voice to use (0..15)
4. `volume: int` — volume level (0..255)
5. `pan: int` — panning (0..255)
6. `pitch: float` — playback rate
7. `loop_mode: int` — `0` for Off, `1` for On

Return-shape matrix in v1 syscall surface:

| Syscall               | Return       | Policy basis                           |
| --------------------- | ------------ | -------------------------------------- |
| `audio.play`          | `status:int` | operational rejection is observable    |
| `audio.play_sample`   | `status:int` | operational rejection is observable    |

### 11.2 Minimum status table for `play`/`play_sample`

- `0` = `OK`
- `1` = `VOICE_INVALID`
- `2` = `SAMPLE_NOT_FOUND`
- `3` = `ARG_RANGE_INVALID`
- `5` = `NO_EFFECT`
- `6` = `BANK_INVALID`

Operational rules:

- no fallback to default bank when a bank id cannot be resolved;
- no silent no-op for invalid `voice_id`;
- invalid `voice_id` must return `VOICE_INVALID`, not `ARG_RANGE_INVALID`;
- invalid numeric ranges (e.g. `volume`, `pan`, `pitch`) must return explicit status.
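The operational rules above could be sketched as an argument check that returns a status code instead of failing silently. The status values come from the table in this section. Everything else is an assumption made for illustration: the `banks` mapping, the order of the checks, the requirement that `pitch` be positive, and treating an out-of-range `loop_mode` as `ARG_RANGE_INVALID`. `NO_EFFECT` is defined but not exercised, since this section does not state when it applies.

```python
MAX_VOICES = 16

OK = 0
VOICE_INVALID = 1
SAMPLE_NOT_FOUND = 2
ARG_RANGE_INVALID = 3
NO_EFFECT = 5
BANK_INVALID = 6


def play_status(banks: dict[int, set[int]], bank_id: int, sample_id: int,
                voice_id: int, volume: int, pan: int, pitch: float,
                loop_mode: int) -> int:
    """Return the status that a play request would produce.

    `banks` is a hypothetical mapping of bank id to the sample ids it holds.
    """
    if bank_id not in banks:
        return BANK_INVALID          # no fallback to a default bank
    if sample_id not in banks[bank_id]:
        return SAMPLE_NOT_FOUND
    if not 0 <= voice_id < MAX_VOICES:
        return VOICE_INVALID         # distinct from ARG_RANGE_INVALID
    if not 0 <= volume <= 255 or not 0 <= pan <= 255:
        return ARG_RANGE_INVALID
    if not pitch > 0 or loop_mode not in (0, 1):
        return ARG_RANGE_INVALID     # assumed ranges for pitch and loop_mode
    return OK
```

A caller would then branch on the returned integer, for example treating any non-zero status as a rejected request.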