Learn Creative Coding (#72) - 3D Audio Visualization

in StemSocial

Back in episode 19 we connected audio to visuals for the first time -- the Web Audio API gave us frequency data as arrays of numbers, and we mapped those numbers to circle sizes, bar heights, colors, particle speeds. All of it was flat. 2D canvas. The bars went up and down, the circles pulsed, the particles drifted across a plane. Looked cool, sounded reactive, but it was a window into the music, not a space inside it.

Now we have Three.js. The audio analysis is the same -- AnalyserNode, getByteFrequencyData, arrays of 0-255 values, 60 times per second. Nothing changes on the data side. What changes is what we do with it. Frequency bins can drive the height of 3D columns arranged in a circle. Bass amplitude can shake the camera, pulse a point light, inflate a central sphere. We can build terrain that morphs in real time to the music's spectrum. Particles can emit faster on loud moments and drift in silence. Materials can shift color with the spectral centroid. And with Three.js's AudioListener and PositionalAudio, sound itself becomes spatial -- walk toward an object and it gets louder. Walk away and it fades.

This episode connects the audio pipeline from ep019 to the 3D pipeline we've been building since ep062. Frequency to geometry, amplitude to environment, spectrum to material. 3D scenes that don't just show music -- they inhabit it.

Setting up the audio pipeline in Three.js

Three.js has its own audio system built on top of the Web Audio API. You can use either the raw browser API (like we did in ep019) or Three.js's wrappers. The wrappers are convenient because they integrate with the scene graph and handle spatial audio automatically. Let's set up both so you can pick whichever feels right.

The Three.js approach uses AudioListener (attached to the camera, acts as your ears) and Audio (a non-positional sound source):

import * as THREE from 'three';
import { OrbitControls } from 'three/addons/controls/OrbitControls.js';

const scene = new THREE.Scene();
scene.background = new THREE.Color(0x050508);

const camera = new THREE.PerspectiveCamera(
  60, window.innerWidth / window.innerHeight, 0.1, 100
);
camera.position.set(0, 4, 12);

const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(window.innerWidth, window.innerHeight);
renderer.setPixelRatio(window.devicePixelRatio);
document.body.appendChild(renderer.domElement);

const controls = new OrbitControls(camera, renderer.domElement);
controls.enableDamping = true;

// audio setup
const listener = new THREE.AudioListener();
camera.add(listener);

const audio = new THREE.Audio(listener);
const audioLoader = new THREE.AudioLoader();
const analyser = new THREE.AudioAnalyser(audio, 256);
// analyser.getFrequencyData() returns a Uint8Array of 128 values

// load a track
audioLoader.load('your-track.mp3', function (buffer) {
  audio.setBuffer(buffer);
  audio.setLoop(true);
  audio.setVolume(0.7);
});

// start playback on user click (browsers require user gesture)
document.addEventListener('click', () => {
  if (!audio.isPlaying) audio.play();
}, { once: true });

THREE.AudioAnalyser wraps the browser's AnalyserNode -- under the hood it's the exact same FFT we used in ep019. analyser.getFrequencyData() returns the same 0-255 Uint8Array. analyser.getAverageFrequency() gives you the mean of all bins as a single number -- handy for "overall loudness" effects. The fftSize of 256 gives us 128 frequency bins, same as before.
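To make those bin numbers a bit more concrete, here is a small sketch that converts a bin index into the frequency it is centred on. The spacing between bins is sampleRate / fftSize. The helper name binCenterFrequency is made up for this example, and the 44100 Hz sample rate is an assumption; in the browser you'd read the real value from listener.context.sampleRate.

```javascript
// Hypothetical helper (not part of Three.js): centre frequency of one FFT bin.
function binCenterFrequency(binIndex, fftSize, sampleRate) {
  const binSpacing = sampleRate / fftSize; // Hz between neighbouring bins
  return binIndex * binSpacing;
}

// Assumed sample rate of 44100 Hz; use listener.context.sampleRate in a real scene.
console.log(binCenterFrequency(1, 256, 44100)); // spacing between bins, in Hz
console.log(binCenterFrequency(7, 256, 44100)); // centre of the 8th bin, in Hz
```

With these assumed numbers the bins sit roughly 172 Hz apart, so the first 8 bins already reach up to about 1.2 kHz. A larger fftSize gives narrower bins if you want finer resolution in the low end.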

One difference from ep019: here the audio source is a loaded file (AudioLoader). In ep019 we used getUserMedia for microphone input. Both end up feeding an AnalyserNode. If you want microphone input in Three.js, one option is a raw AnalyserNode created on the listener's AudioContext:

navigator.mediaDevices.getUserMedia({ audio: true }).then(stream => {
  const micSource = listener.context.createMediaStreamSource(stream);
  const rawAnalyser = listener.context.createAnalyser();
  rawAnalyser.fftSize = 256;
  micSource.connect(rawAnalyser);

  const micData = new Uint8Array(rawAnalyser.frequencyBinCount);

  // in your animation loop:
  // rawAnalyser.getByteFrequencyData(micData);
});

Same API, different source. The rest of this episode uses the file-based approach but everything works with microphone input too.

Frequency bars in 3D: the circular equalizer

The classic audio visualizer is a row of bars whose heights track frequency bins. In 3D we can arrange those bars in a circle, making a cylindrical equalizer sculpture. Frequency increases as you travel around the ring: the lowest bins start at angle 0 (tall, slow-moving) and the highest bins end up right next to them after a full lap (short, twitchy).

const barCount = 64;
const bars = [];
const barGroup = new THREE.Group();
scene.add(barGroup);

for (let i = 0; i < barCount; i++) {
  const angle = (i / barCount) * Math.PI * 2;
  const radius = 4;

  const geo = new THREE.BoxGeometry(0.15, 1, 0.15);
  // shift pivot to bottom so scaling goes upward
  geo.translate(0, 0.5, 0);

  const hue = i / barCount;
  const mat = new THREE.MeshStandardMaterial({
    color: new THREE.Color().setHSL(hue, 0.7, 0.4),
    roughness: 0.4,
    metalness: 0.3
  });

  const bar = new THREE.Mesh(geo, mat);
  bar.position.set(
    Math.cos(angle) * radius,
    0,
    Math.sin(angle) * radius
  );
  bar.lookAt(0, 0, 0);

  barGroup.add(bar);
  bars.push(bar);
}

// lighting
scene.add(new THREE.AmbientLight(0x111122, 0.6));
const topLight = new THREE.DirectionalLight(0xffeedd, 1.5);
topLight.position.set(2, 8, 3);
scene.add(topLight);

Each bar is a box with its geometry translated so the pivot is at the bottom -- when we scale Y, it grows upward instead of expanding from the center. The bars are arranged in a circle and rotated to face the center using lookAt. The hue shifts around the circle so you get a rainbow ring.

Now in the animation loop, map frequency data to bar height:

const clock = new THREE.Clock();

function animate() {
  requestAnimationFrame(animate);
  const t = clock.getElapsedTime();

  const freqData = analyser.getFrequencyData();

  for (let i = 0; i < barCount; i++) {
    // map bar index to frequency bin
    const binIndex = Math.floor((i / barCount) * freqData.length);
    const value = freqData[binIndex] / 255;  // normalize to 0..1

    // smooth the height change (lerp toward target)
    const targetHeight = 0.1 + value * 3.0;
    const currentHeight = bars[i].scale.y;
    bars[i].scale.y += (targetHeight - currentHeight) * 0.3;

    // color intensity from frequency value
    const hue = i / barCount;
    const lightness = 0.25 + value * 0.25;
    bars[i].material.color.setHSL(hue, 0.7, lightness);
  }

  // slow rotation of the whole ring
  barGroup.rotation.y = t * 0.15;

  controls.update();
  renderer.render(scene, camera);
}

animate();

The lerp smoothing (+= (target - current) * 0.3) is important. Raw frequency data jumps around frame to frame -- without smoothing the bars flicker nervously. The lerp factor of 0.3 gives snappy response while filtering out single-frame spikes. Higher values (0.5-0.8) track the audio tighter but jitter more. Lower values (0.1-0.15) give a fluid, lazy feel. Experiment with what matches the music's energy.
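One caveat with a fixed lerp factor: it is applied once per frame, so the smoothing feels different on a 120 Hz display than on a 60 Hz one. A common workaround is to derive the factor from the frame's delta time. The sketch below is one way to do that; the helper names rateFromFrameFactor and smoothingFactor are made up for this example, and it assumes you tuned the 0.3 at roughly 60 fps.

```javascript
// Convert a per-frame lerp factor tuned at a given fps into a rate per second.
function rateFromFrameFactor(factor, fps) {
  return -Math.log(1 - factor) * fps;
}

// Lerp factor for this frame, given the rate and the frame's delta time in seconds.
function smoothingFactor(rate, delta) {
  return 1 - Math.exp(-rate * delta);
}

const rate = rateFromFrameFactor(0.3, 60); // behaves like 0.3 at 60 fps

// One 60 fps frame versus two 120 fps frames, both easing from 0 toward 1.
let at60 = 0;
at60 += (1 - at60) * smoothingFactor(rate, 1 / 60);

let at120 = 0;
for (let i = 0; i < 2; i++) {
  at120 += (1 - at120) * smoothingFactor(rate, 1 / 120);
}

console.log(at60.toFixed(4), at120.toFixed(4)); // both land on the same value
```

In the bar loop this would replace the constant 0.3 with smoothingFactor(rate, delta), where delta comes from clock.getDelta().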

The whole group rotates slowly so you see the ring from different angles. Simple, but it really sells the 3D-ness of the visualization -- you're not looking at a flat bar chart, you're orbiting a frequency sculpture.

Bass-reactive environment

Individual bars track frequency bins, but the room itself should feel the music too. The bass (low frequency energy) is the most physically powerful part of music -- it's what shakes your chest at a concert. We can make the 3D environment respond to bass the same way: shake the camera, pulse lights, distort geometry.

function getBassEnergy(freqData) {
  // average the first 8 bins (lowest frequencies)
  let sum = 0;
  for (let i = 0; i < 8; i++) {
    sum += freqData[i];
  }
  return (sum / 8) / 255;  // normalized 0..1
}

function getTrebleEnergy(freqData) {
  // average the last 32 bins
  let sum = 0;
  const start = freqData.length - 32;
  for (let i = start; i < freqData.length; i++) {
    sum += freqData[i];
  }
  return (sum / 32) / 255;
}

// central sphere that inflates with bass
const sphere = new THREE.Mesh(
  new THREE.SphereGeometry(0.8, 32, 32),
  new THREE.MeshStandardMaterial({
    color: 0x000000,
    emissive: 0xff2244,
    emissiveIntensity: 2.0,
    roughness: 0.15
  })
);
sphere.position.set(0, 2, 0);
scene.add(sphere);

// point light that pulses with bass
const bassLight = new THREE.PointLight(0xff2244, 2.0, 10);
bassLight.position.copy(sphere.position);
scene.add(bassLight);

// in the animation loop:
function updateEnvironment(freqData, t) {
  const bass = getBassEnergy(freqData);
  const treble = getTrebleEnergy(freqData);

  // sphere inflates with bass
  const sphereScale = 0.8 + bass * 1.2;
  sphere.scale.setScalar(sphereScale);

  // emissive intensity pulses
  sphere.material.emissiveIntensity = 1.5 + bass * 4.0;

  // light intensity follows bass
  bassLight.intensity = 1.0 + bass * 5.0;

  // camera shake on heavy bass hits
  if (bass > 0.7) {
    camera.position.x += (Math.random() - 0.5) * 0.04;
    camera.position.y += (Math.random() - 0.5) * 0.03;
  }

  // scene background subtly shifts with treble
  const bgBrightness = 0.02 + treble * 0.03;
  scene.background.setRGB(bgBrightness, bgBrightness, bgBrightness * 1.5);
}

The camera shake is intentionally small -- just enough to feel the impact without making the viewer seasick. Only triggers above a threshold (0.7) so it's tied to actual bass hits, not constant background rumble. In a real project you'd smooth the camera back to its base position each frame (another lerp) so it doesn't drift.
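Here is a minimal sketch of that spring-back idea. Instead of nudging camera.position directly, it keeps the shake as a separate offset that decays toward zero every frame. The helper name updateShake and the decay value of 0.85 are assumptions for this example, not something from the scene above.

```javascript
// Hypothetical helper: keeps the shake as a separate offset that decays to zero.
// `rand` is injectable so the behaviour can be tested; it defaults to Math.random.
function updateShake(shake, bass, options = {}) {
  const { threshold = 0.7, amount = 0.04, decay = 0.85, rand = Math.random } = options;

  // ease the existing offset back toward zero
  shake.x *= decay;
  shake.y *= decay;

  // add a fresh kick only on heavy bass
  if (bass > threshold) {
    shake.x += (rand() - 0.5) * amount;
    shake.y += (rand() - 0.5) * amount * 0.75;
  }
  return shake;
}

const shake = { x: 0, y: 0 };

// In the animation loop (camera and controls as set up earlier):
//   camera.position.x -= shake.x;   // undo last frame's offset
//   camera.position.y -= shake.y;
//   controls.update();              // controls see the unshaken camera
//   updateShake(shake, bass);
//   camera.position.x += shake.x;   // apply this frame's offset
//   camera.position.y += shake.y;
```

Because the offset is removed before OrbitControls runs and re-applied afterwards, the shake never accumulates into the orbit position, and it dies out on its own once the bass drops below the threshold.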

The treble-reactive background is subtle -- barely visible most of the time. But on hi-hat rolls and cymbal crashes the background lightens just enough to sense it. These micro-details are what separate a reactive visualization from a truly immersive one. Every part of the scene should breathe with the audio, not just the obvious elements.

Audio-driven terrain

Here's where it gets really fun. Map the frequency spectrum directly to terrain geometry. Each frequency bin controls the height of a vertex row. The terrain morphs in real time -- mountains rise on bass drops, ridges form on snare hits, valleys open in quiet sections:

const terrainSize = 20;
const segments = 64;
const terrainGeo = new THREE.PlaneGeometry(
  terrainSize, terrainSize, segments, segments
);
terrainGeo.rotateX(-Math.PI / 2);

const terrainMat = new THREE.MeshStandardMaterial({
  color: 0x2244aa,
  wireframe: true,
  roughness: 0.8
});

const terrain = new THREE.Mesh(terrainGeo, terrainMat);
terrain.position.y = -1;
scene.add(terrain);

// keep a handle on the position attribute (it is updated in place each frame)
const terrainPositions = terrainGeo.attributes.position;
const vertexCount = terrainPositions.count;

function updateTerrain(freqData, t) {
  for (let i = 0; i < vertexCount; i++) {
    const x = terrainPositions.getX(i);
    const z = terrainPositions.getZ(i);

    // map Z position to a frequency bin
    const normalizedZ = (z + terrainSize / 2) / terrainSize;  // 0..1
    const binIndex = Math.floor(normalizedZ * (freqData.length - 1));
    const freqValue = freqData[binIndex] / 255;

    // also add some X-based variation so rows aren't flat
    const xWave = Math.sin(x * 0.8 + t * 0.5) * 0.3;

    const height = freqValue * 3.0 + xWave;
    terrainPositions.setY(i, height);
  }

  terrainPositions.needsUpdate = true;
  terrainGeo.computeVertexNormals();
}

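The wireframe material lets you see the mesh deform clearly. Each row of vertices along the Z axis maps to a different frequency bin. With the mapping above, the edge at negative Z (the far edge as seen from the default camera at z = 12) tracks the lowest frequencies and the edge at positive Z (nearest the camera) tracks the highest. When a bass note hits, the far rows spike upward. When hi-hats rattle, the near rows ripple. Use 1 - normalizedZ instead if you'd rather have the bass up front.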
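The row-to-bin mapping is easy to pull out into its own function, which also makes it simple to choose which edge gets the bass. This is a sketch with a made-up helper name (zToBin); it assumes the terrain is centred on the origin so Z runs from -terrainSize / 2 to +terrainSize / 2, as in the code above.

```javascript
// Hypothetical helper: map a vertex's Z coordinate to a frequency bin index.
// With flip = true the low bins land on the +Z edge instead of the -Z edge.
function zToBin(z, terrainSize, binCount, flip = false) {
  let normalized = (z + terrainSize / 2) / terrainSize; // 0..1 from -Z edge to +Z edge
  normalized = Math.min(1, Math.max(0, normalized));    // guard against rounding overshoot
  if (flip) normalized = 1 - normalized;
  return Math.floor(normalized * (binCount - 1));
}

console.log(zToBin(-10, 20, 128));       // 0   (-Z edge, lowest bin)
console.log(zToBin(10, 20, 128));        // 127 (+Z edge, highest bin)
console.log(zToBin(10, 20, 128, true));  // 0   (flipped: +Z edge is now bass)
```

With the camera sitting at positive Z, passing flip = true puts the bass rows closest to the viewer.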

The X-based sine wave prevents each row from being perfectly flat -- without it you'd see a staircase pattern where every vertex in a Z-row has the same height. The wave adds cross-contour variation that makes the terrain feel more organic.

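computeVertexNormals() recalculates lighting normals after deforming the mesh. You can skip it when the wireframe uses an unlit material such as MeshBasicMaterial, where normals play no role. With a lit material like MeshStandardMaterial the normals still feed the shading, even in wireframe mode, and you'll definitely need it if you switch to a solid surface -- without it, the lighting stays flat regardless of the terrain shape.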

Particle emission driven by audio

Audio-driven particle systems feel magical. Particles burst out on every beat, their color shifts with the spectrum, their speed tracks the overall energy. Combine the particle techniques from ep065 with the audio pipeline:

const maxParticles = 20000;
const particlePositions = new Float32Array(maxParticles * 3);
const particleVelocities = new Float32Array(maxParticles * 3);
const particleLifetimes = new Float32Array(maxParticles);
let activeParticles = 0;

const particleGeo = new THREE.BufferGeometry();
particleGeo.setAttribute('position',
  new THREE.BufferAttribute(particlePositions, 3));

const particleMat = new THREE.PointsMaterial({
  color: 0xffaa44,
  size: 0.06,
  sizeAttenuation: true,
  transparent: true,
  opacity: 0.8,
  blending: THREE.AdditiveBlending,
  depthWrite: false
});

const particles = new THREE.Points(particleGeo, particleMat);
scene.add(particles);

function emitParticles(count, bass, treble) {
  for (let j = 0; j < count; j++) {
    if (activeParticles >= maxParticles) break;

    const i = activeParticles;
    const i3 = i * 3;

    // emit from center sphere
    const theta = Math.random() * Math.PI * 2;
    const phi = Math.acos(2 * Math.random() - 1);

    particlePositions[i3] = sphere.position.x;
    particlePositions[i3 + 1] = sphere.position.y;
    particlePositions[i3 + 2] = sphere.position.z;

    // velocity outward, speed scales with bass
    const speed = 0.03 + bass * 0.1;
    particleVelocities[i3] = Math.sin(phi) * Math.cos(theta) * speed;
    particleVelocities[i3 + 1] = Math.sin(phi) * Math.sin(theta) * speed;
    particleVelocities[i3 + 2] = Math.cos(phi) * speed;

    particleLifetimes[i] = 1.0;
    activeParticles++;
  }
}

function updateParticles(delta, bass) {
  let writeIndex = 0;

  for (let i = 0; i < activeParticles; i++) {
    const i3 = i * 3;
    particleLifetimes[i] -= delta * 0.5;

    if (particleLifetimes[i] <= 0) continue;

    const w3 = writeIndex * 3;
    particlePositions[w3] = particlePositions[i3] + particleVelocities[i3];
    particlePositions[w3 + 1] = particlePositions[i3 + 1] + particleVelocities[i3 + 1];
    particlePositions[w3 + 2] = particlePositions[i3 + 2] + particleVelocities[i3 + 2];

    particleVelocities[w3] = particleVelocities[i3] * 0.98;
    particleVelocities[w3 + 1] = particleVelocities[i3 + 1] * 0.98 - 0.001;
    particleVelocities[w3 + 2] = particleVelocities[i3 + 2] * 0.98;

    particleLifetimes[writeIndex] = particleLifetimes[i];
    writeIndex++;
  }

  activeParticles = writeIndex;
  particleGeo.attributes.position.needsUpdate = true;
  particleGeo.setDrawRange(0, activeParticles);

  // color from spectral content
  const hue = 0.08 + bass * 0.1;  // orange when quiet, shifting toward yellow as bass rises
  particleMat.color.setHSL(hue, 0.8, 0.5);
}

The emission rate should scale with audio energy. In your animation loop:

function animate() {
  requestAnimationFrame(animate);
  const delta = clock.getDelta();
  const t = clock.getElapsedTime();

  const freqData = analyser.getFrequencyData();
  const bass = getBassEnergy(freqData);
  const treble = getTrebleEnergy(freqData);
  const avg = analyser.getAverageFrequency() / 255;

  // emit more particles when music is louder
  const emitCount = Math.floor(avg * 40);
  emitParticles(emitCount, bass, treble);
  updateParticles(delta, bass);

  updateEnvironment(freqData, t);

  controls.update();
  renderer.render(scene, camera);
}

Quiet passages emit maybe 2-3 particles per frame. A bass drop fires 30+. The particles burst outward from the central sphere, creating a pulsing cloud that expands and contracts with the music's dynamics. Additive blending means overlapping particles get brighter, so dense burst moments glow hot.
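The particle color in updateParticles only looks at bass. If you want it to follow the overall brightness of the sound instead, the spectral centroid is a handy single number: the energy-weighted average bin index. Below is a minimal sketch; the helper name spectralCentroid is made up for this example, and the synthetic arrays stand in for analyser.getFrequencyData().

```javascript
// Hypothetical helper: spectral centroid of a frequency array, normalized to 0..1.
// 0 means all energy sits in the lowest bin, 1 means all of it is in the highest.
// Assumes the array has more than one bin.
function spectralCentroid(freqData) {
  let weighted = 0;
  let total = 0;
  for (let i = 0; i < freqData.length; i++) {
    weighted += i * freqData[i];
    total += freqData[i];
  }
  if (total === 0) return 0; // silence: no meaningful centroid
  return weighted / total / (freqData.length - 1);
}

// Synthetic spectra standing in for analyser.getFrequencyData()
const bassy = new Uint8Array(128);
bassy[0] = 255;
const bright = new Uint8Array(128);
bright[127] = 255;

console.log(spectralCentroid(bassy));  // 0
console.log(spectralCentroid(bright)); // 1
```

One possible use: particleMat.color.setHSL(0.08 + centroid * 0.5, 0.8, 0.5), which would slide from orange on dark, bassy material toward blue on bright, trebly material. The 0.5 range is just a starting point to tweak.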

Reactive ShaderMaterial

For the most intimate audio-visual coupling, pass frequency data directly into a shader as uniforms. The shader can do things JavaScript can't do per-pixel -- frequency-dependent fresnel glow, bass-driven displacement, spectral color mapping:

const reactiveMat = new THREE.ShaderMaterial({
  vertexShader: `
    uniform float uBass;
    uniform float uTreble;
    uniform float uTime;

    varying vec3 vNormal;
    varying vec3 vPosition;

    void main() {
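      // world-space normal, so it matches cameraPosition (world space)
      // in the fragment shader; fine as long as the mesh scale is uniform
      vNormal = normalize((modelMatrix * vec4(normal, 0.0)).xyz);

      // bass displacement: push vertices outward
      vec3 displaced = position + normal * uBass * 0.3;

      // treble adds high-freq ripple
      displaced += normal * sin(position.y * 10.0 + uTime * 3.0) * uTreble * 0.08;

      // world-space position for the view direction
      vPosition = (modelMatrix * vec4(displaced, 1.0)).xyz;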
      gl_Position = projectionMatrix * modelViewMatrix * vec4(displaced, 1.0);
    }
  `,
  fragmentShader: `
    uniform float uBass;
    uniform float uTreble;
    uniform float uTime;

    varying vec3 vNormal;
    varying vec3 vPosition;

    void main() {
      // fresnel edge glow pulses with treble
      vec3 viewDir = normalize(cameraPosition - vPosition);
      float fresnel = 1.0 - abs(dot(viewDir, vNormal));
      fresnel = pow(fresnel, 2.0);

      // base color shifts with bass
      vec3 baseColor = mix(
        vec3(0.1, 0.15, 0.4),  // cool blue when quiet
        vec3(0.8, 0.2, 0.1),   // hot red on bass hit
        uBass
      );

      // edge glow color tracks treble
      vec3 glowColor = vec3(0.3, 0.6, 1.0) * (1.0 + uTreble * 2.0);

      vec3 col = mix(baseColor, glowColor, fresnel * (0.5 + uTreble));

      gl_FragColor = vec4(col, 1.0);
    }
  `,
  uniforms: {
    uBass: { value: 0 },
    uTreble: { value: 0 },
    uTime: { value: 0 }
  }
});

const reactiveSphere = new THREE.Mesh(
  new THREE.IcosahedronGeometry(1.5, 6),
  reactiveMat
);
reactiveSphere.position.set(0, 2, 0);
scene.add(reactiveSphere);

// in the animation loop:
reactiveMat.uniforms.uBass.value = bass;
reactiveMat.uniforms.uTreble.value = treble;
reactiveMat.uniforms.uTime.value = t;

The icosahedron inflates on bass hits (vertices pushed along their normals), ripples with treble (high-frequency sine displacement), shifts from cold blue to hot red with bass energy, and glows at the edges when treble is high. Every visual aspect responds to a different part of the spectrum. The result is a shape that feels like it's alive and breathing the music.

Icosahedron subdivision level 6 gives enough vertices for smooth displacement. Lower levels (3-4) would show visible faceting when the bass pushes vertices around. The normals are smooth enough that the fresnel effect reads cleanly. For even smoother results you could use SphereGeometry with high segment counts, but icosahedron at level 6 has more evenly distributed vertices which displaces more uniformly.

Spatial audio: sound in 3D space

Three.js supports positional audio through THREE.PositionalAudio. Instead of the sound coming from everywhere equally (like THREE.Audio), it's attached to an object in the scene. As the camera moves closer, the sound gets louder. Move away, it fades. Turn so the object sits to your left and you hear it mostly in the left ear. It's the same spatial processing that games use for footsteps and ambient effects.

// positional audio source attached to an object
const positionalSound = new THREE.PositionalAudio(listener);
audioLoader.load('ambient-loop.mp3', function (buffer) {
  positionalSound.setBuffer(buffer);
  positionalSound.setRefDistance(3);     // full volume within this distance
  positionalSound.setRolloffFactor(1.5); // how fast it fades with distance
  positionalSound.setLoop(true);
  positionalSound.play();
});

// attach to a visible object
const soundObject = new THREE.Mesh(
  new THREE.DodecahedronGeometry(0.4, 0),
  new THREE.MeshStandardMaterial({
    emissive: 0x44aaff,
    emissiveIntensity: 2.0
  })
);
soundObject.position.set(5, 2, -3);
soundObject.add(positionalSound);  // sound follows this mesh
scene.add(soundObject);

setRefDistance is the distance at which the sound is at full volume. Beyond that, it attenuates according to the rolloff factor. The listener (attached to the camera) determines the "ear" position and orientation. Three.js handles all the panning and volume math through the Web Audio API's PannerNode under the hood.
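To get a feel for how refDistance and rolloffFactor interact, here is the gain curve of the Web Audio "inverse" distance model written out as a plain function. This assumes the panner is using that model, which is the PannerNode default; as far as I know Three.js leaves it at the default, but you can check with positionalSound.getDistanceModel() in your version. The helper name inverseDistanceGain is made up for this example.

```javascript
// Gain for the Web Audio "inverse" distance model (the PannerNode default).
function inverseDistanceGain(distance, refDistance, rolloffFactor) {
  const d = Math.max(distance, refDistance); // no boost inside refDistance
  return refDistance / (refDistance + rolloffFactor * (d - refDistance));
}

// Values from the snippet above: refDistance 3, rolloffFactor 1.5
console.log(inverseDistanceGain(2, 3, 1.5));  // 1
console.log(inverseDistanceGain(5, 3, 1.5));  // 0.5
console.log(inverseDistanceGain(11, 3, 1.5)); // 0.2
```

So with these settings the sound is at full volume within 3 units, half volume at 5 units, and down to a fifth at 11 units. Raising rolloffFactor makes the drop steeper.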

For creative audio-visual installations, imagine placing multiple sound sources around a 3D scene -- each one playing a different instrument or stem from a track. Walking through the space mixes the music spatially. Close to the drums, you hear the beat louder. Move toward the synth pad, it dominates. The visual representation of each source can react to its own audio level, creating zones of activity scattered through the 3D world.

You'd create a separate AudioAnalyser for each PositionalAudio to get per-source frequency data. Then each visual element reacts only to its own audio source. The drum zone has bass-reactive geometry while the melody zone has treble-reactive particles. Same techniques, applied per-source instead of globally.

Beat detection

The techniques above respond continuously to frequency levels. But sometimes you want to trigger events on beats -- flash on the kick drum, change visual mode on the drop, spawn a burst of particles on every snare hit. That's beat detection.

A simple approach: track the average energy over time, and trigger a "beat" when the current energy exceeds the recent average by some threshold:

const energyHistory = new Float32Array(60); // ~1 second of history at 60fps
let historyIndex = 0;
let lastBeatTime = 0;

function detectBeat(freqData, t) {
  const bass = getBassEnergy(freqData);

  // add to rolling history
  energyHistory[historyIndex % energyHistory.length] = bass;
  historyIndex++;

  // compute average energy
  let avgEnergy = 0;
  for (let i = 0; i < energyHistory.length; i++) {
    avgEnergy += energyHistory[i];
  }
  avgEnergy /= energyHistory.length;

  // beat detected when current bass exceeds average by threshold
  // and enough time has passed since last beat (debounce)
  const threshold = 1.4;
  const minInterval = 0.15;  // seconds

  if (bass > avgEnergy * threshold && t - lastBeatTime > minInterval) {
    lastBeatTime = t;
    return true;
  }

  return false;
}

The threshold multiplier (1.4) means the current bass needs to be 40% above the running average to register as a beat. Higher threshold = fewer false triggers, but might miss soft beats. Lower threshold = catches more beats but fires on non-beat fluctuations too. The minInterval debounce prevents multiple triggers from a single drum hit that rings across several frames.
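If you want several detectors (one per source, or one for bass and one for treble), it can help to wrap the state in a small factory. The sketch below is a variation on detectBeat, not a drop-in copy: it compares against the average of the previous samples only, and it stays quiet until the history window has filled so the zero-initialized buffer doesn't cause false triggers at startup. The name createBeatDetector is made up for this example.

```javascript
// Hypothetical factory: each call returns an independent beat detector.
function createBeatDetector({ historySize = 60, threshold = 1.4, minInterval = 0.15 } = {}) {
  const history = new Float32Array(historySize);
  let count = 0;                 // samples written so far
  let lastBeatTime = -Infinity;

  return function detect(energy, t) {
    // average of the samples recorded before this one
    const filled = Math.min(count, historySize);
    let sum = 0;
    for (let i = 0; i < filled; i++) sum += history[i];
    const avg = filled > 0 ? sum / filled : 0;

    // record the current sample in the ring buffer
    history[count % historySize] = energy;
    count++;

    // stay quiet until the window is full
    if (filled < historySize) return false;

    if (energy > avg * threshold && t - lastBeatTime > minInterval) {
      lastBeatTime = t;
      return true;
    }
    return false;
  };
}

// Synthetic input: one second of steady energy, then a spike.
const detect = createBeatDetector();
let warmupBeats = 0;
for (let frame = 0; frame < 60; frame++) {
  if (detect(0.2, frame / 60)) warmupBeats++;
}
console.log(warmupBeats);                // 0
console.log(detect(0.5, 1.0));           // true  (spike well above the average)
console.log(detect(0.5, 1.0 + 1 / 60));  // false (inside the debounce window)
```

In the scene you'd call it as detect(getBassEnergy(freqData), t) once per frame.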

Use beat detection to trigger visual events:

if (detectBeat(freqData, t)) {
  // flash the background
  scene.background.setRGB(0.15, 0.05, 0.05);

  // burst of particles
  emitParticles(80, bass, treble);

  // camera impulse
  camera.position.z -= 0.3;
}

// decay the flash back to dark
const bg = scene.background;
bg.r *= 0.92;
bg.g *= 0.92;
bg.b *= 0.92;

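The background flash decays exponentially each frame (*= 0.92) so it fades quickly after each beat. The camera impulse pushes forward slightly on every beat. Note that OrbitControls damping only smooths user input; it won't pull the camera back by itself, so ease the camera toward its base distance each frame (another lerp) or the impulses add up and the camera creeps toward the target. With that spring-back in place you get a subtle rhythmic push-pull motion. Small effects that accumulate into something that genuinely feels like being inside the music.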
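How quick is "quickly"? A small bit of arithmetic answers that: multiplying by a factor once per frame reaches a target after log(to / from) / log(factor) frames. The sketch below plugs in the red channel of the flash (0.15) and the 0.02 base brightness used for the background earlier; the helper name framesToDecay and the 60 fps figure are assumptions.

```javascript
// Frames needed for a value to decay from `from` to at most `to`
// when it is multiplied by `factor` once per frame.
function framesToDecay(from, to, factor) {
  return Math.ceil(Math.log(to / from) / Math.log(factor));
}

const frames = framesToDecay(0.15, 0.02, 0.92);
console.log(frames);                    // 25
console.log((frames / 60).toFixed(2));  // "0.42" seconds at an assumed 60 fps
```

So the flash is gone in a bit under half a second. A factor closer to 1 (say 0.96) stretches the tail, a smaller one makes the flash feel more like a strobe.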

Creative exercise: 3D concert visualizer

Alright, time to put it all together. A 3D audio-reactive scene with a central reactive sphere, circular frequency bars, bass-pulsing lights, particles emitted on beats, and a terrain that flows with the spectrum. Connect your favourite track and orbit through a visualization that responds to every layer of the music.

The scene setup combines everything from above -- I'll show just the animation loop that ties it all together, since the individual pieces are the code blocks we already covered:

const clock = new THREE.Clock();

function animate() {
  requestAnimationFrame(animate);
  const delta = clock.getDelta();
  const t = clock.getElapsedTime();

  const freqData = analyser.getFrequencyData();
  const bass = getBassEnergy(freqData);
  const treble = getTrebleEnergy(freqData);
  const avg = analyser.getAverageFrequency() / 255;

  // 1. frequency bars
  for (let i = 0; i < barCount; i++) {
    const binIndex = Math.floor((i / barCount) * freqData.length);
    const value = freqData[binIndex] / 255;
    const target = 0.1 + value * 3.0;
    bars[i].scale.y += (target - bars[i].scale.y) * 0.3;
    bars[i].material.color.setHSL(i / barCount, 0.7, 0.25 + value * 0.25);
  }
  barGroup.rotation.y = t * 0.15;

  // 2. reactive sphere
  reactiveMat.uniforms.uBass.value += (bass - reactiveMat.uniforms.uBass.value) * 0.2;
  reactiveMat.uniforms.uTreble.value += (treble - reactiveMat.uniforms.uTreble.value) * 0.3;
  reactiveMat.uniforms.uTime.value = t;

  // 3. environment
  bassLight.intensity = 1.0 + bass * 5.0;
  const sphereScale = 0.8 + bass * 1.0;
  sphere.scale.setScalar(sphereScale);

  // 4. terrain
  updateTerrain(freqData, t);

  // 5. beat detection and particle burst
  if (detectBeat(freqData, t)) {
    emitParticles(60, bass, treble);
    scene.background.setRGB(0.1, 0.04, 0.04);
  }

  // 6. continuous particle emission (proportional to loudness)
  emitParticles(Math.floor(avg * 15), bass, treble);
  updateParticles(delta, bass);

  // 7. background fade
  const bg = scene.background;
  bg.r = Math.max(bg.r * 0.94, 0.02);
  bg.g = Math.max(bg.g * 0.94, 0.015);
  bg.b = Math.max(bg.b * 0.94, 0.025);

  controls.update();
  renderer.render(scene, camera);
}

animate();

The smoothed uniforms (lerped instead of directly assigned) prevent the shader from jittering on noisy audio. The bass value in the shader ramps up smoothly and decays smoothly, which translates to displacement that breathes rather than twitches.

Add bloom post-processing from ep069 and this scene genuinely glows. The emissive sphere halos pulse with the bass, the particles create streaks of additive light, the frequency bars shimmer. Without bloom it looks like a tech demo. With bloom it looks like a concert stage.

The terrain in wireframe mode creates a retro Tron-like landscape that flows with the music. Switch to solid material with emissive coloring and it becomes more alien -- a living ground that rises and falls with every note.

This is one of those episodes where I'd say: stop reading and go build. Load your favourite track, set up the audio pipeline, pick one or two reactive elements, and play. The code above is a toolkit -- use the pieces that fit your aesthetic. Some of the best audio visualizations I've seen use just one idea executed well. A single reactive sphere with great materials and lighting can be more powerful than a scene with every trick crammed in.

What's ahead

We've connected the Web Audio API to Three.js's 3D pipeline -- frequency data driving geometry, amplitude shaping the environment, spectral content controlling materials, beat detection triggering visual events, and spatial audio placing sound in 3D space. The same data arrays from ep019, now sculpting a 3D world in real time.

Next time we'll add physics to our 3D scenes -- rigid bodies, collisions, gravity, forces. Objects that fall, bounce, stack, and shatter. The creative possibilities when geometry has mass and the world has rules... well, you'll see.

It comes down to this...

  • Three.js wraps the Web Audio API with AudioListener (on the camera = your ears), Audio (non-positional source), and AudioAnalyser (frequency data extraction). Under the hood it's the same FFT from ep019 -- getFrequencyData() returns a Uint8Array of 0-255 values, one per frequency bin
  • Circular frequency bars: BoxGeometry pivoted at the base, arranged in a ring with lookAt(center). Map each bar's Y scale to its frequency bin value. Lerp toward the target height (+= (target - current) * factor) to smooth jittery audio data. Rotate the whole group slowly for 3D presence
  • Bass-reactive environment: average the lowest 8 frequency bins for bass energy. Use it to scale a central sphere, pulse a point light's intensity, shake the camera on heavy hits (small random offsets above a threshold). Treble energy can subtly shift the background color. Every element should breathe with the music
  • Audio-driven terrain: PlaneGeometry with per-vertex Y displacement mapped to frequency bins along one axis. The terrain morphs in real time -- bass raises one edge, treble ripples the opposite one. Add cross-axis sine waves to break the staircase pattern. Call computeVertexNormals() after deformation for correct lighting on solid materials
  • Particle emission scaled to audio energy: emit more particles when the average frequency level is high. Particles burst from the reactive sphere with velocities proportional to bass. Color shifts with spectral content. Additive blending makes dense bursts glow hot
  • ShaderMaterial driven by audio uniforms: pass bass, treble, and time as uniforms. Vertex shader displaces geometry along normals for bass inflation, adds high-frequency ripple for treble. Fragment shader blends colors based on bass energy and applies fresnel edge glow that pulses with treble
  • Spatial audio with PositionalAudio: sound attached to a 3D object that fades with distance and pans with position relative to the camera. refDistance sets the full-volume range, rolloffFactor controls attenuation speed. Multiple sources create spatial mixing -- walk through the scene to remix the music
  • Beat detection: compare current bass energy against a rolling average. Trigger on threshold crossing with debounce (minimum interval between beats). Use beats to flash the background, burst particles, impulse the camera. Exponential decay (*= 0.92) gives natural fade after each trigger

Bye for now! Thanks for reading.

@femdev