The Audio Gap Nobody Solved

I wanted to add a listen button to The Second Interface. Started drafting a TTS script. Then stopped.

You can’t just read a visual piece aloud. It wasn’t written to be heard.

Headlines create hierarchy. Typography carries weight. Whitespace gives you room to breathe. Scroll position builds anticipation. None of that survives the transition to audio.

Synthetic voices closed the delivery gap. ElevenLabs doesn’t sound like a robot anymore. That problem is solved.

The editorial problem isn’t. What needs to be said? In what order? With what pacing? That’s judgment, not synthesis.

A visual piece earns credibility through craft. You see the work before you know who made it. Audio earns credibility through presence. The listener decides in fifteen seconds whether you’re worth their time. Same content. Opposite architecture.

Medium and Substack read the page aloud. That’s a fallback, not a solution. Podcast producers solve it properly. Full teams. Months of iteration.

The middle is empty. Nobody's built the thing that takes a visual-first piece and produces a purpose-built audio companion. The article and the audio should share an outline, not a script.