5 steps → 1

Sequence Expert

I built a tool that let Kibeam's content team hear audio sequences instantly.

Stack: Python/Flask, JavaScript, XML

"Before Sequence Expert, I couldn't hear what a sequence would actually sound like until we built it on the device. Choksi's tool saved countless hours and my sanity."

Alannah Forman
Producer, Kibeam Learning

In screenless play, audio is the interface. Every sequence in Kibeam's interactive books is defined in XML files that reference audio assets. To hear what a sequence would actually sound like, you had to trace through five applications: script to XML to manifest to Finder to audio player. Five steps per sound. We had thousands.

Having performed comedy for over a decade, I knew that a 300ms pause can make or break a punchline. Writing for PBS taught me kids need a moment of silence after new vocabulary. Those micro-decisions mattered, and the workflow made them painful to iterate on.

The Easter Egg Hunt

Find the sequence in the script...

Copy the asset ID... hold on. Getting a Slack msg...

Wait — DINO_NA_ThatsRight or DINO_NA_ThatsGreat?

I need a better finder than Finder.

OMG. 4 more files to add, and then set up the 200ms pauses between...

Verify what seq_game05_r01_correct actually sounds like.

[seq_game05_r01_correct]
NARRATOR: Can you find the T-Rex?
NARRATOR: Great job!
SFX: Dino Roar

ID	File
audio_SFX_PositiveInput	SFX_PositiveInput.wv
audio_NA_ThatsRight	DINO_NA_ThatsRight.wv
audio_NA_TRexWasTheKing	DINO_NA_TRex...King.wv

DINO_NA_ThatsGreat.wv

DINO_NA_ThatsRight.wv

DINO_SFX_ThatsWrong.wv

Repeat for each media entry.
In each sequence. ↻
5+ min per sequence. 😩

This process was so tedious that most of the team skipped it entirely. Instead, they'd wait an hour or longer for a device build just to check timing tweaks and audio changes. Technical Designers who needed to nail the sweet spot for pauses had no choice but to suffer through the manual path.

Process Friction Meant:

QA had to wait for hardware builds to verify simple audio updates

Producers couldn't audition existing audio when planning retrofits

Technical Designers couldn't validate script edits without building to device

The Solution

One click to play.

No tracing. No DAW. No hardware builds.

I built Sequence Expert, a full-stack local application that parses Kibeam's XML structure and plays back the audio as it would sound on the device.

In Action

🔊 Sound on

Select a sequence, hit play, hear it immediately — chime, narration, sound effect, all in order.

Video Transcript

The demo shows the Sequence Expert interface. A user selects an audio sequence from a dropdown menu and clicks "Play Sequence." The tool immediately plays the full sequence in order: an opening chime sound effect, followed by narration ("Let's count together! One, two, three..."), and concluding with a completion sound effect. No manual coordination between audio files is required — the entire sequence plays automatically with correct timing.

Under the Hood

How It Works

The problem was fragmentation. Sequence logic references lived in one file. Audio references in another. Asset mappings in a third. The tool needed to unify them automatically.

Sequence Expert

Parse

sequence skeleton

→

Resolve

ID → filename

→

Hydrate

attach paths

→

Play

Web Audio

The browser never sees the fragmentation—just a unified timeline.

The tool prepares a sequence in three phases before playback: parse the sequence structure to get the skeleton, resolve abstract IDs against the asset manifest to find filenames, and hydrate each node with a playable file path. The hydrated sequence then plays through Web Audio.

The Core Function: Hydration

async loadMediaAndWaits(sequenceId) {
  // 1. Get sequence structure (just IDs, no files yet)
  const sequence = this.sequences.find(
    (s) => s.id === sequenceId
  );
  if (!sequence) {
    throw new Error(`Unknown sequence: ${sequenceId}`);
  }

  // 2. Get the asset dictionary (ID → filename mapping)
  const assets = await this.loadAssetManifest(
    sequence.basePath
  );

  // 3. Hydrate: cross-reference to build playable paths
  sequence.children.forEach((node) => {
    if (node.type !== "media") return; // only media nodes reference an asset
    const assetId = node.getAttribute("assetId");
    const file = assets.find(
      (a) => a.id === assetId
    )?.file;
    // Leave unresolved IDs without a path rather than ".../undefined"
    node.fullPath = file
      ? `${sequence.assetPath}/${file}`
      : null;
  });

  return sequence;
}

Code sample is representative; proprietary details have been generalized.
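
Once every media node has a path, playback comes down to scheduling: each clip starts when the previous clip or pause ends. The sketch below shows one way to turn hydrated nodes into a timeline of start offsets. It is illustrative only. The node shapes, the `durationMs` field, the function name `buildTimeline`, and the example paths and durations are all assumptions, not Kibeam's actual schema or data.

```javascript
// Hypothetical node shapes (assumptions for illustration):
//   { type: "media", fullPath: string, durationMs: number }
//   { type: "wait",  durationMs: number }
function buildTimeline(nodes) {
  const timeline = [];
  let cursorMs = 0; // running playhead position in milliseconds

  for (const node of nodes) {
    if (node.type === "media") {
      // Media starts wherever the playhead currently sits.
      timeline.push({ path: node.fullPath, startMs: cursorMs });
    }
    // Both media and waits advance the playhead by their duration.
    cursorMs += node.durationMs;
  }

  return { timeline, totalMs: cursorMs };
}

// Example: a chime, a 200ms pause, then a narration line (made-up values).
const { timeline, totalMs } = buildTimeline([
  { type: "media", fullPath: "audio/chime.wav", durationMs: 400 },
  { type: "wait", durationMs: 200 },
  { type: "media", fullPath: "audio/narration.wav", durationMs: 1200 },
]);

console.log(timeline, totalMs);
```

In a browser, each `startMs` could be converted to seconds and added to `audioContext.currentTime` to get the `when` argument for `AudioBufferSourceNode.start()`. Keeping the offset math in a pure function makes it easy to check pause timing without loading any audio.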

The backend is a Python/Flask server that performs real-time audio transcoding—converting hardware-native formats to browser-playable audio on the fly. A teammate built the initial transcoding logic. I extended it with cross-platform deployment—automatic DNS configuration for Windows, Mac, and Linux—so anyone on the team could run the server locally without manual network setup.

Impact

5 → 1

Steps Eliminated

One click replaces manual cross-referencing

5-10×

Faster Reviews

Instant playback vs. waiting for builds

3

Disciplines Adopted

QA, Producers, & Technical Designers chose to use it

"Choksi identified a pain point and built a tool that let us listen to files on our workstation, rather than downloading test builds to the device."

Madeline Mechem
Technical Designer, Kibeam Learning

What I'd add next: Search by audio filename. Sometimes you start with the file, not the sequence. It would also help to see which sequences share the same audio.
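
A rough sketch of how that lookup might work: build a reverse index from audio filename to the sequences that use it, then search the index by substring. Nothing here exists in the tool today. The function names, the `file` field on hydrated nodes, and the second sequence ID in the example are hypothetical.

```javascript
// Assumed input: sequences whose media nodes already carry a resolved
// filename in a `file` field (a hypothetical field name).
function indexSequencesByFile(sequences) {
  const index = new Map(); // filename -> Set of sequence IDs
  for (const sequence of sequences) {
    for (const node of sequence.children) {
      if (node.type !== "media" || !node.file) continue;
      if (!index.has(node.file)) index.set(node.file, new Set());
      index.get(node.file).add(sequence.id);
    }
  }
  return index;
}

// Case-insensitive substring search over the indexed filenames.
function findSequencesByFile(index, query) {
  const needle = query.toLowerCase();
  const results = [];
  for (const [file, ids] of index) {
    if (file.toLowerCase().includes(needle)) {
      results.push({ file, sequenceIds: [...ids] });
    }
  }
  return results;
}

// Example data: the second sequence ID is invented for illustration.
const fileIndex = indexSequencesByFile([
  {
    id: "seq_game05_r01_correct",
    children: [
      { type: "media", file: "SFX_PositiveInput.wv" },
      { type: "wait" },
      { type: "media", file: "DINO_NA_ThatsRight.wv" },
    ],
  },
  {
    id: "seq_example_shared",
    children: [{ type: "media", file: "DINO_NA_ThatsRight.wv" }],
  },
]);

const matches = findSequencesByFile(fileIndex, "thatsright");
console.log(matches);
```

The same index answers both questions: a search starts from the file, and any entry with more than one sequence ID shows where audio is shared.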