How to Generate Audio from x86 Assembly Tiny Code Demos in 2025
What You'll Build: A 16-Byte x86 Program That Produces Sound
By the end of this tutorial you'll have a working bootable binary — assembled with NASM, run under QEMU — that programs the PC speaker through the Programmable Interval Timer, writes bright-green characters to the VGA text buffer at 0xB8000, and does all of it in under 512 bytes. The showpiece target is a 16-byte variant inspired by the demoscene entry wake_up 16b, a size-coding curiosity that squeezes beep-synchronized "Matrix rain" output into a single x86 instruction sequence small enough to fit in a tweet.
What you'll learn:
- How x86 boots into real mode and why BIOS-initialized register state is a free resource
- How to program PIT channel 2 (ports 0x42/0x43) and toggle the PC speaker gate at port 0x61
- How to write directly to VGA text memory without an OS or driver
- How demosceners squeeze dual audio+visual output into 16 raw bytes using opcode reuse and BIOS state assumptions
- How to assemble, pad, and run the binary under QEMU with PC speaker emulation enabled
Tech stack: NASM ≥ 2.15, QEMU ≥ 7.x, Linux or WSL2, a Makefile, and nothing else.
The 'wake_up' 16b demo explained
The wake_up 16b entry, listed on Pouet.net under the bootsector/16b category, produces an audible sequence of descending beeps while columns of ASCII characters cascade down a green-on-black terminal — all without a single OS call, sound card driver, or graphics library. The entire program is 16 bytes placed at the very start of a 512-byte MBR sector.
Here are those 16 bytes as a hex dump:
00000000: b0 b3 e6 61 b9 ff 2f e6 42 e2 fe c0 88 07 eb f2
Every one of those 16 bytes has to earn its place. The LOOP at offset 0x09 (E2 FE) branches to itself until CX reaches zero, so it is the delay that sets each beep's length, and the same AL value is sent to both the speaker port and the PIT data port. There is no wasted space.
Why 16 bytes is an extreme constraint
The demoscene "16b" category requires the entire executable payload to fit in 16 bytes. That rules out subroutine calls, stack setup, data sections, and most structured control flow. You have eight general-purpose registers (in 16-bit real mode: AX, BX, CX, DX, SI, DI, BP, SP), a direction flag, and whatever the BIOS left in those registers when it jumped to your bootsector at 0x7C00. Everything else costs bytes you don't have.
What the final program does
The demo produces a series of descending-pitch beeps (PIT divisor increments on each iteration), and on every timing loop iteration it also writes the current loop counter value as an ASCII character to VGA text memory with the 0x0A attribute (bright green on black). The effect looks like a stuttering matrix rain column synchronized to each beep's duration.
Background: x86 Real Mode, BIOS Interrupts, and PC Speaker Basics
Before writing a single instruction you need to understand the environment your 16 bytes lands in.
How x86 boots into real mode and why it matters for tiny programs
When a PC powers on, the CPU starts in 16-bit real mode at CS:IP = F000:FFF0. The BIOS runs POST, then reads 512 bytes from the first sector of the boot device into memory at physical address 0x7C00, verifies the magic bytes 0x55 0xAA at offsets 510–511, and jumps there. At that moment your code runs with full hardware access, no memory protection, and no OS overhead.
Segmentation in real mode is simple: physical address = segment × 16 + offset. So CS = 0x07C0, IP = 0x0000 and CS = 0x0000, IP = 0x7C00 both reach the same byte. The BIOS leaves register state only partially defined: DL holds the boot drive number, while the remaining registers vary between BIOS implementations. For size-coders the useful cases are values that recur in practice, such as a zero or near-zero CX, which can be exploited at the cost of portability.
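The segment arithmetic is easy to check from a shell prompt. A minimal sketch, assuming a POSIX shell; phys_addr is a hypothetical helper name introduced here, not part of NASM or QEMU:

```shell
# phys_addr SEGMENT OFFSET -> prints the physical address in hex
# (hypothetical helper; does not model wrap-around above 1 MiB)
phys_addr() {
    printf '0x%05X\n' $(( ($1 << 4) + $2 ))
}

phys_addr 0x07C0 0x0000   # bootsector addressed as 07C0:0000
phys_addr 0x0000 0x7C00   # bootsector addressed as 0000:7C00
phys_addr 0xB800 0x0000   # start of the VGA text buffer
```

The first two calls print the same address, which is why either CS:IP pair lands on the first byte of your code.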
BIOS INT 10h and INT 16h
For most bootsector demos you'd use INT 10h (video) and INT 16h (keyboard). INT 10h, AH=0Eh prints a character in teletype mode; INT 10h, AH=00h sets a video mode. But each interrupt call costs 2 bytes for the INT opcode + vector, plus setup instructions. At 16 bytes you skip BIOS entirely and write directly to hardware ports and VGA memory.
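To make the byte cost concrete: the smallest teletype call (mov ah, 0x0E / mov al, 'A' / int 0x10) encodes as B4 0E B0 41 CD 10. The sketch below writes those bytes with octal printf escapes and counts them; the file name teletype.bin is arbitrary.

```shell
# Encoded bytes of: mov ah,0x0E / mov al,'A' / int 0x10
# B4 0E B0 41 CD 10, written as octal escapes (portable across sh implementations)
printf '\264\016\260\101\315\020' > teletype.bin

wc -c < teletype.bin        # byte count of the three instructions
od -An -tx1 teletype.bin    # hex view of the same bytes
```

Six bytes for a single character is more than a third of a 16-byte budget, which is why the tiny variants go straight to ports and video memory.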
Generating sound via port 0x61 and PIT channel 2
The PC speaker is controlled by two pieces of hardware working together:
| Port | Name | Purpose |
|------|------|---------|
| 0x43 | PIT Mode/Command | Set PIT channel 2 mode (square wave, binary count) |
| 0x42 | PIT Channel 2 Data | Write 16-bit frequency divisor (low byte, then high byte) |
| 0x61 | PC Speaker Gate | Bits 0–1: connect PIT ch2 output to speaker |
To produce a tone: write the mode byte to 0x43, write the 16-bit divisor to 0x42 (low byte first, then high byte), then set bits 0 and 1 of port 0x61 to 1. To silence it, clear those bits.
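The mode byte written to 0x43 packs four fields (8253/8254 command word: bits 7–6 channel, bits 5–4 access mode, bits 3–1 operating mode, bit 0 BCD flag). As a sketch of how the value for "channel 2, low/high access, square wave, binary" is composed; pit_mode_byte is a hypothetical helper name:

```shell
# pit_mode_byte CHANNEL ACCESS MODE BCD -> command byte for port 0x43
# (hypothetical helper for illustration)
pit_mode_byte() {
    printf '0x%02X\n' $(( ($1 << 6) | ($2 << 4) | ($3 << 1) | $4 ))
}

# channel 2, access 3 (low byte then high byte), mode 3 (square wave), binary
pit_mode_byte 2 3 3 0
```

This prints 0xB6, the constant most PC speaker code loads before touching port 0x42.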
The role of the PIT in tone generation
The Programmable Interval Timer (Intel 8253/8254) runs on a clock of roughly 1.193182 MHz, which this tutorial rounds to 1193180 Hz. Channel 2 in Mode 3 (square wave) divides that clock by your divisor and sends the resulting square wave to the PC speaker. The frequency formula is:
divisor = 1193180 / frequency_hz
For concert A (440 Hz): 1193180 / 440 ≈ 2712 = 0x0A98. For a 1 kHz beep: 1193180 / 1000 ≈ 1193 = 0x04A9. The PIT accepts any 16-bit divisor (a value of 0 is treated as 65536); smaller values give a higher pitch.
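The divisor formula, plus the low/high byte split that port 0x42 expects, can be sketched in shell arithmetic. The helper name pit_divisor and the round-to-nearest choice are assumptions made for this example:

```shell
PIT_HZ=1193180   # input clock value used in this tutorial

# pit_divisor FREQ_HZ -> divisor (rounded to nearest) and its two bytes
# (hypothetical helper for illustration)
pit_divisor() {
    d=$(( (PIT_HZ + $1 / 2) / $1 ))
    printf '%d 0x%04X lo=0x%02X hi=0x%02X\n' "$d" "$d" $(( d & 0xFF )) $(( d >> 8 ))
}

pit_divisor 440    # 2712 0x0A98 lo=0x98 hi=0x0A
pit_divisor 1000   # 1193 0x04A9 lo=0xA9 hi=0x04
```

The lo and hi values are the bytes to send to port 0x42, in that order.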
Setting Up Your Development Environment
Goal: get NASM and QEMU installed and wired together so you can iterate in seconds.
Prerequisites
| Tool | Minimum Version | Install command (Debian/Ubuntu) |
|------|----------------|----------------------------------|
| NASM | 2.15 | sudo apt install nasm |
| QEMU | 7.0 | sudo apt install qemu-system-x86 |
| GNU Make | 4.x | sudo apt install make |
| dd | any | pre-installed on Linux/macOS |
On macOS: brew install nasm qemu. On Windows: use WSL2 (Ubuntu 22.04) and install the same packages inside WSL.
Building a Makefile for the assemble-and-run workflow
Create this Makefile at the root of your project:
# Makefile for x86 real-mode PC speaker demo
ASM = nasm
QEMU = qemu-system-i386
SRC = boot.asm
BIN = boot.bin
IMG = boot.img
# Host audio backend for newer QEMU (assumption: PulseAudio; try alsa, sdl, or coreaudio)
AUDIODEV = pa
# Assemble flat binary, pad to 512 bytes, run in QEMU
all: $(IMG)
	$(QEMU) -drive format=raw,file=$(IMG) \
		-soundhw pcspk \
		-display sdl \
		-no-reboot 2>/dev/null || \
	$(QEMU) -drive format=raw,file=$(IMG) \
		-audiodev $(AUDIODEV),id=snd0 \
		-machine pcspk-audiodev=snd0 \
		-display sdl \
		-no-reboot
$(IMG): $(BIN)
	cp $(BIN) $(IMG)
	dd if=/dev/zero bs=1 count=$$((512 - $$(wc -c < $(BIN)))) >> $(IMG) 2>/dev/null
	printf '\125\252' | dd of=$(IMG) bs=1 seek=510 conv=notrunc 2>/dev/null
$(BIN): $(SRC)
	$(ASM) -f bin $(SRC) -o $(BIN)
clean:
	rm -f $(BIN) $(IMG)
.PHONY: all clean
Three things worth noting. First, the Makefile tries -soundhw pcspk (the legacy flag, removed in QEMU 7.1) and falls back to -audiodev plus -machine pcspk-audiodev=snd0, which is how newer releases route the PC speaker; the pa backend is an assumption, so swap in alsa, sdl, or coreaudio as your QEMU build requires. Second, dd appends zero bytes to reach exactly 512 bytes so the BIOS accepts the sector. Third, printf writes the MBR magic signature at offset 510 without touching your code bytes, which is critical if your source omits the times 510-($-$$) db 0 / dw 0xAA55 boilerplate. The signature is spelled in octal ('\125\252' is 0x55 0xAA) because make runs recipes with /bin/sh, and not every sh understands \x escapes. Recipe lines must begin with a tab character.
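Whichever way the image is built, it is worth checking the two properties the BIOS cares about: a length of exactly 512 bytes, and 55 AA at offsets 510–511. A sketch using only POSIX tools; the two stub bytes (EB FE, a jump to itself) and the file name check.img are stand-ins for your real boot.bin and boot.img:

```shell
img=check.img                          # hypothetical file name for this check
printf '\353\376' > "$img"             # EB FE = jmp to self (stand-in for real code)
size=$(( $(wc -c < "$img") ))          # arithmetic expansion strips wc's padding
dd if=/dev/zero bs=1 count=$(( 510 - size )) >> "$img" 2>/dev/null
printf '\125\252' >> "$img"            # 0x55 0xAA written as octal escapes

wc -c < "$img"                         # total image length
od -An -tx1 -j510 -N2 "$img"           # the two signature bytes
```

Pointing the last two commands at your own boot.img gives a quick sanity check before launching QEMU.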
Step-by-Step: Writing a Minimal PC Speaker Tone in x86 Assembly
Goal: produce a 440 Hz beep, hold it for a fixed busy-wait delay, then halt, all in a correctly structured bootsector.
; boot.asm — 440 Hz beep demo, real-mode bootsector
; Assemble: nasm -f bin boot.asm -o boot.bin
[BITS 16]
[ORG 0x7C00]
start:
; --- Step 1: Program PIT channel 2 for 440 Hz ---
; Mode byte: channel 2 | lo+hi byte access | mode 3 (square wave) | binary
; 0b10110110 = 0xB6
mov al, 0xB6
out 0x43, al ; write mode command to PIT
; Divisor for 440 Hz: 1193180 / 440 = 2712 = 0x0A98
mov ax, 2712
out 0x42, al ; write low byte (0x98) of divisor
mov al, ah
out 0x42, al ; write high byte (0x0A) of divisor
; --- Step 2: Enable PC speaker gate ---
in al, 0x61 ; read current port 0x61 state
or al, 0x03 ; set bits 0 (timer gate) and 1 (speaker data)
out 0x61, al ; speaker now produces 440 Hz tone
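; --- Step 3: Busy-wait delay (uncalibrated; depends on CPU/emulator speed) ---
; 65535 outer x 65535 inner is about 4.3 billion LOOP iterations, typically
; several seconds; lower the outer count for a shorter beep
mov cx, 0xFFFF
outer:
; Inner loop: 65535 LOOP iterations per outer pass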
push cx
mov cx, 0xFFFF
inner:
loop inner
pop cx
loop outer
; --- Step 4: Disable PC speaker ---
in al, 0x61
and al, 0xFC ; clear bits 0 and 1 (mask with 11111100)
out 0x61, al
; --- Halt ---
cli
hlt
; MBR padding and boot signature
times 510-($-$$) db 0
dw 0xAA55
Line-by-line breakdown:
- mov al, 0xB6 / out 0x43, al: The mode byte 0xB6 selects channel 2, 16-bit access (low byte then high byte), Mode 3 (square wave), binary counting. This setup must come before the divisor write.
- mov ax, 2712 / two out 0x42, al writes: Sends the divisor low byte, then the high byte. Loading AX with 2712 (0x0A98) leaves 0x98 in AL; mov al, ah then copies the high byte 0x0A into AL.
- in al, 0x61 / or al, 0x03 / out 0x61, al: Read-modify-write on port 0x61. Bit 0 enables the PIT channel 2 gate; bit 1 connects the channel output to the speaker. OR the bits in rather than writing a constant, so that bits 2–7 (NMI masking and other system state) are preserved.
- The nested loop burns CPU cycles to create a delay without BIOS INT 15h. The duration is uncalibrated: 65535 × 65535 is about 4.3 billion iterations, which typically takes several seconds and varies with host speed.
- and al, 0xFC: Clears bits 0 and 1 (0xFC = 11111100b) to silence the speaker.
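The read-modify-write on port 0x61 is plain bit arithmetic. A sketch with a made-up starting value of 0x30 (what in al, 0x61 actually returns varies by machine):

```shell
port61=0x30                          # hypothetical value returned by: in al, 0x61
on=$(( $port61 | 0x03 ))             # or al, 0x03  -> gate and speaker bits set
off=$(( $on & 0xFC ))                # and al, 0xFC -> both bits cleared again
printf 'on=0x%02X off=0x%02X\n' "$on" "$off"
```

The upper bits come back unchanged after the speaker is switched off, which is the point of using OR and AND instead of writing constants.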
Extreme Size Optimization: Squeezing Audio Logic into 16 Bytes
Goal: understand how the wake_up 16b demo achieves the same audio+visual output in exactly 16 bytes.
Reusing BIOS-initialized state
When the BIOS hands off to your bootsector, it has already configured the PIT for the system timer (channel 0). More importantly, it leaves port 0x61 in a known state with bits 0–1 writable. Size-coders exploit this: instead of issuing a full PIT mode command, some demos write only the divisor bytes and toggle the gate, trusting that the BIOS mode setup for channel 2 survives. That drops the mov al, 0xB6 / out 0x43, al pair, a saving of four bytes.
The 16-byte sequence, byte by byte
00000000: b0 b3 mov al, 0xB3
00000002: e6 61 out 0x61, al
00000004: b9 ff2f mov cx, 0x2FFF
00000007: e6 42 out 0x42, al
00000009: e2 fe loop 0x00000009 ; tight delay + pitch step
0000000b: c0 (shift/rotate group opcode, not a prefix; overlaps the next encoding)
0000000c: 88 07 mov [bx], al
0000000e: eb f2 jmp short 0x00000002
| Offset | Hex | Mnemonic | Purpose |
|--------|-----|----------|---------|
| 0x00 | B0 B3 | MOV AL, 0xB3 | Load speaker-enable gate value (bits 0+1 set) and mode hint |
| 0x02 | E6 61 | OUT 0x61, AL | Enable PC speaker gate immediately |
| 0x04 | B9 FF 2F | MOV CX, 0x2FFF | Set loop counter (controls tone duration) |
| 0x07 | E6 42 | OUT 0x42, AL | Write current AL as PIT channel 2 divisor byte |
| 0x09 | E2 FE | LOOP 0x09 | Burn cycles (delay); CX is 0 when it falls through |
| 0x0B | C0 | (shift/rotate group opcode, not a prefix) | Overlaps with next instruction encoding |
| 0x0C | 88 07 | MOV [BX], AL | Write AL as a character byte to DS:BX (VGA text memory only if DS = 0xB800) |
| 0x0E | EB F2 | JMP SHORT 0x02 | Jump back to the OUT at 0x02 and repeat |
0xB3 = 10110011b has bits 0 and 1 set, so writing it to port 0x61 opens the timer gate and enables the speaker; the same byte is then reused as the PIT data byte and as the character code (0xB3 is the box-drawing vertical bar in code page 437). One immediate value does three jobs. The LOOP at 0x09 runs 0x2FFF = 12287 times per pass, and that burn sets the beep duration. As listed, nothing inside the loop modifies AL, so a variant that steps the pitch on each pass needs an instruction that changes AL between writes to 0x42.
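LOOP and JMP SHORT both encode their target as a signed 8-bit displacement relative to the address of the following instruction. A sketch that checks the two branch targets in the listing above; rel8_target is a hypothetical helper name:

```shell
# rel8_target INSN_OFFSET INSN_LENGTH REL8 -> branch target offset
# (hypothetical helper for illustration)
rel8_target() {
    disp=$(( ($3 ^ 0x80) - 0x80 ))        # sign-extend the 8-bit displacement
    printf '0x%02X\n' $(( $1 + $2 + disp ))
}

rel8_target 0x09 2 0xFE   # E2 FE at 0x09 -> 0x09 (LOOP spins on itself)
rel8_target 0x0E 2 0xF2   # EB F2 at 0x0E -> 0x02
printf '%d\n' 0x2FFF      # LOOP iterations per pass: 12287
```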
Self-modifying / overlapping opcodes
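The byte at 0x0B (0xC0) is where the two readings of this code diverge. 0xC0 is not a prefix: on the 80186 and later it is the opcode for the shift/rotate-by-immediate group (C0 /r ib). A CPU that falls through from the LOOP decodes C0 88 07 EB F2 as one five-byte instruction, ror byte [bx+si+0xEB07], 0xF2, whereas the MOV at 0x0C and the JMP at 0x0E in the listing appear only when decoding starts at 0x0C. Placing a byte so that it is both the tail of one instruction's encoding and the head of another is called opcode overlap, and it is a core demoscene trick. The x86 variable-length encoding makes this possible in ways fixed-width RISC encodings do not. Run the 16 bytes through ndisasm -b 16 to confirm which reading applies to your build.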
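How a decoder treats the bytes after an opcode depends on the ModRM byte that follows it. A sketch that splits a ModRM byte into its three fields (the layout, mod in bits 7–6, reg in bits 5–3, rm in bits 2–0, is the standard x86 format; modrm_fields is a made-up helper name):

```shell
# modrm_fields BYTE -> mod, reg, rm fields of an x86 ModRM byte
# (hypothetical helper for illustration)
modrm_fields() {
    printf 'mod=%d reg=%d rm=%d\n' $(( ($1 >> 6) & 3 )) $(( ($1 >> 3) & 7 )) $(( $1 & 7 ))
}

modrm_fields 0x88   # the byte that follows C0 at offset 0x0C
modrm_fields 0x07   # the byte that follows 88 at offset 0x0D
```

In 16-bit addressing, mod=2 means a 16-bit displacement follows, so an instruction starting at the C0 keeps consuming bytes; mod=0 with rm=7 is the plain [bx] operand of the two-byte MOV.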
Adding Matrix Rain Visuals Synchronized to the Beeps
Goal: write ASCII characters to VGA text memory inside the same timing loop that drives the PIT delay, so every visual update is locked to the audio rhythm.
; matrix_rain_snippet.asm
; Sets ES = 0xB800 itself, so it does not depend on BIOS-left segment values
; BX = current VGA text buffer offset (byte index into 0xB800:0000)
; AL = current loop counter / character value
[BITS 16]
[ORG 0x7C00]
start:
; Point ES at VGA text buffer segment
mov ax, 0xB800
mov es, ax
; Enable PC speaker (bits 0+1 of port 0x61)
mov al, 0xB3
out 0x61, al
; Program PIT channel 2 — write mode byte first
mov al, 0xB6
out 0x43, al
xor bx, bx ; BX = 0, start of VGA buffer
mov dl, 0xFF ; DL = pitch/character counter (CX is reserved for the delay loop)
pitch_loop:
; Load pitch/character value into AL
mov al, dl ; DL counts down from 0xFF
; Write one divisor byte to PIT ch2; in lo/hi access mode successive
; writes alternate between the low and high byte of the divisor
out 0x42, al
; Write character + attribute to VGA text memory
; Each VGA text cell = 2 bytes: [character][attribute]
mov byte [es:bx], al ; character code (counter value as glyph)
mov byte [es:bx+1], 0x0A ; attribute: bright green (0x0A) on black
; Advance to next VGA cell (2 bytes per cell)
add bx, 2
; Wrap at 80*25*2 = 4000 bytes
cmp bx, 4000
jb .no_wrap ; unsigned compare: buffer offsets are never negative
xor bx, bx
.no_wrap:
; Delay loop: LOOP runs CX down to zero, so CX cannot double as the pitch counter
mov cx, 0x1FFF
.delay:
loop .delay
; Decrement outer pitch counter and repeat
dec dl
jnz pitch_loop
; Halt
cli
hlt
times 510-($-$$) db 0
dw 0xAA55
Key design decisions:
- ES = 0xB800: the VGA text buffer starts at physical address 0xB8000. Setting ES to 0xB800 lets you address it with a zero-based BX offset using [es:bx].
- 0x0A attribute: In VGA text mode the attribute byte encodes background (4 bits) | foreground (4 bits). 0x0A = 0000 1010b means black background, bright green foreground. That's the Matrix look.
- mov byte [es:bx], al writes the character; mov byte [es:bx+1], 0x0A writes the color. Together they form one complete text cell.
- The loop .delay inner loop burns 0x1FFF = 8191 iterations per character written. The outer counter decrement and jnz pitch_loop drive 255 passes in total, each feeding a different counter value into the PIT divisor via out 0x42, al.
- In a true 16-byte version, BX is assumed pre-initialized by BIOS (often to zero) and the attribute write is dropped to save bytes, so the cell's existing attribute byte is left in place. Note that mov [bx], al addresses DS:BX, so DS must already point at 0xB800 for the write to land in the text buffer.
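The cell layout reduces to two small formulas: byte offset = (row × 80 + column) × 2, and attribute = (background << 4) | foreground. A sketch, with vga_cell as a hypothetical helper name and the 80-column text mode assumed:

```shell
# vga_cell ROW COL BG FG -> byte offset into segment 0xB800 and attribute byte
# (hypothetical helper; assumes the 80x25 text mode used in this tutorial)
vga_cell() {
    printf 'offset=%d attr=0x%02X\n' $(( ($1 * 80 + $2) * 2 )) $(( ($3 << 4) | $4 ))
}

vga_cell 0 0 0 10     # top-left cell, bright green (10) on black (0)
vga_cell 24 79 0 10   # bottom-right cell: last offset before the 4000-byte wrap
```

The second call prints offset=3998, the final cell before the wrap check in the loop resets BX to zero.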
Testing and Debugging Your Tiny Demo
Goal: verify you actually hear sound and see green characters, and know how to diagnose when you don't.
The complete build-and-run script
#!/usr/bin/env bash
# run_demo.sh — assemble, pad, and launch QEMU with PC speaker support
set -euo pipefail
ASM_FILE="${1:-boot.asm}"
BIN_FILE="boot.bin"
IMG_FILE="boot.img"
echo "[1/3] Assembling ${ASM_FILE}..."
nasm -f bin "${ASM_FILE}" -o "${BIN_FILE}"
echo "[2/3] Padding to 512 bytes and writing MBR signature..."
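BIN_SIZE=$(wc -c < "${BIN_FILE}")
if [ "${BIN_SIZE}" -eq 512 ]; then
    # Source already pads itself (times 510-($-$$) db 0 / dw 0xAA55): use it as-is
    cp "${BIN_FILE}" "${IMG_FILE}"
elif [ "${BIN_SIZE}" -le 510 ]; then
    cp "${BIN_FILE}" "${IMG_FILE}"
    dd if=/dev/zero bs=1 count=$((510 - BIN_SIZE)) >> "${IMG_FILE}" 2>/dev/null
    printf '\x55\xaa' >> "${IMG_FILE}"
else
    echo "ERROR: binary is ${BIN_SIZE} bytes; code must fit in 510 bytes" >&2
    exit 1
fi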
echo "[3/3] Launching QEMU..."
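# -soundhw was removed in QEMU 7.1; newer builds route the PC speaker through an audiodev
# (the pa backend is an assumption: substitute alsa, sdl, or coreaudio for your host)
if qemu-system-i386 --help 2>&1 | grep -q -- '-soundhw'; then
SOUND_FLAG="-soundhw pcspk"
else
SOUND_FLAG="-audiodev pa,id=snd0 -machine pcspk-audiodev=snd0"
fi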
qemu-system-i386 \
-drive format=raw,file="${IMG_FILE}" \
${SOUND_FLAG} \
-display sdl \
-no-reboot \
-no-shutdown
echo "Done."
Make it executable with chmod +x run_demo.sh, then run ./run_demo.sh boot.asm.
Troubleshooting table
| Symptom | Likely Cause | Fix |
|---------|-------------|-----|
| No sound in QEMU on macOS | QEMU uses CoreAudio; pcspk emulation may be missing | Install qemu via Homebrew and add -audiodev coreaudio,id=snd0 -machine pcspk-audiodev=snd0 |
| QEMU exits immediately | Missing 0xAA55 boot signature | Verify the Makefile printf step; check xxd boot.img \| tail -1 shows 55 aa |
| Screen blank / no VGA output | ES not set to 0xB800 before VGA write | Add mov ax, 0xB800 / mov es, ax before any [es:bx] access |
| Garbled sound / wrong pitch | Divisor written in wrong byte order | PIT expects low byte first, then high byte; check your two out 0x42 calls |
| -soundhw: invalid option | QEMU ≥ 7.1 removed -soundhw | Use -audiodev pa,id=snd0 -machine pcspk-audiodev=snd0 (or another audiodev backend); the script above checks for -soundhw before using it |
| Delay runs far longer than expected | CX is 0 when loop starts, so it wraps around and runs 65536 times | Never rely on a BIOS-left CX; load it explicitly, e.g. mov cx, 0xFFFF |
Using Bochs for instruction-level debugging
If QEMU gives you no audio feedback at all, Bochs with its built-in debugger lets you single-step and inspect port I/O:
# bochsrc.txt
megs: 4
floppya: 1_44=boot.img, status=inserted
boot: floppy
log: bochs.log
display_library: sdl2
nes: enabled=1
Run bochs -f bochsrc.txt and type b 0x7c00 then c to break at your bootsector, then s to step instruction by instruction. The info ports command shows the last values written to 0x42, 0x43, and 0x61.
Going Further: Size-Coding Communities and Next Challenges
You've built a working PC speaker demo. Here's how to keep going.
The demoscene size-coding progression
The standard category ladder looks like this:
| Category | Bytes | What becomes possible |
|----------|-------|-----------------------|
| 16b | 16 | Single effect, heavy BIOS state reuse |
| 32b | 32 | Two effects or a loop with variable pitch |
| 64b | 64 | Keyboard-reactive audio, VGA mode switching |
| 256b | 256 | Procedural music sequences, multiple voice simulation |
| 512b | 512 | Full MBR bootsector with screen-clearing, title, multi-tone melody |
| 1kb | 1024 | Intro-quality visuals + music, font rendering |
At 32 bytes you can add an outer melody loop that steps through a table of PIT divisors to play actual notes. At 256 bytes you can implement a crude square-wave arpeggiator. At 1kb, people have implemented full 4-channel tracker playback.
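A divisor table for such a melody loop can be generated rather than typed by hand. The sketch below emits NASM dw lines for a C-major scale; the integer note frequencies are rounded approximations, and the helper name divisor_table, the file name notes.inc, and the zero terminator are all conventions assumed for this example:

```shell
PIT_HZ=1193180

# divisor_table FREQ... -> one NASM "dw" line per frequency, then a 0 terminator
# (hypothetical helper; the terminator is a convention assumed here)
divisor_table() {
    for f in "$@"; do
        printf 'dw %d ; %d Hz\n' $(( (PIT_HZ + f / 2) / f )) "$f"
    done
    printf 'dw 0 ; end of table\n'
}

# Approximate C4 D4 E4 F4 G4 A4 B4 C5
divisor_table 262 294 330 349 392 440 494 523 > notes.inc
cat notes.inc
```

The generated file can be pulled into a demo with NASM's %include directive, leaving the playback loop to read words until it hits the zero.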
Pouet.net and the demoscene community
Pouet.net (pouet.net) is the canonical database for demoscene releases. Search for "bootsector" or filter by platform "PC (DOS/386)" and size "16b". The comment threads contain reverse-engineered breakdowns of many entries. Submissions happen at demoparties (Revision, Assembly, Solskogen) and online.
Resources
| Resource | URL | What it covers |
|----------|-----|----------------|
| Sizecoding.org wiki | sizecoding.org | Tricks index, register state tables, opcode savings cheatsheet |
| OSDev bare-metal audio | wiki.osdev.org/PC_Speaker | PIT programming reference, port map, mode byte format |
| NASM Manual | nasm.us/doc | Assembler syntax, directives, and the flat-binary (-f bin) output format |
| Pouet.net | pouet.net | Demoscene releases, source code links, category browser |
| QEMU documentation | qemu.org/docs | Audio options: -audiodev backends and the pcspk-audiodev machine property |
Browser-based PC speaker emulation
If you want to share your demo without asking viewers to install QEMU, the v86 project (github.com/copy/v86) emulates an entire x86 PC in WebAssembly and supports PC speaker audio via the Web Audio API. You can load your boot.img directly into a v86 instance embedded in a webpage — the 440 Hz tone comes out of the browser's audio context with no plugins required.
The path from 16 bytes to a full 1kb intro takes maybe 20 hours of deliberate practice. Every byte you add unlocks new capabilities. Start with the 440 Hz beep, get it working in QEMU, then add one more note. The constraint is the point — it forces you to understand x86 at a level that no OS-mediated API will ever require of you.