KVNCNNLLY

Typing to Think, Typing to Prompt

My earlier post about the rise of voice as a UI had me pondering what exactly it is that makes voice input ideal in some circumstances and entirely undesired in others. Spoken language is faster per word and, in a naive economic sense, “better”[1]. And yet, there are times when I have no interest in “going fast”. Why?

This topic makes me think of Andy Clark’s Supersizing the Mind, which opens with a famous Feynman quote insisting that his “daily notes” were not simply a record of his thinking but the thinking itself:

“No, no! They aren’t a record of my thinking process. They are my thinking process. I actually did the work on paper.” - Feynman on his ‘daily notes’

My referenced blog post is about 1,300 words and it took about 1.5 hours to write (and probably another 1.5 hours of editing here and there, though I can’t say I’m proud of it). At typical speaking rates of 120-150 words per minute, that’s roughly a 10-minute piece if spoken rather than written.
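For the curious, the back-of-the-envelope math can be sketched in a few lines of Python. This is only a rough check on my own estimates: the function name is made up for illustration, and the 3-hour total simply adds my guessed 1.5 hours of writing to my guessed 1.5 hours of editing.

```python
def speaking_minutes(words: int, words_per_minute: float) -> float:
    """Minutes needed to speak `words` at a steady pace (hypothetical helper)."""
    return words / words_per_minute


WORDS = 1300              # approximate length of the earlier post
WRITING_MINUTES = 3 * 60  # assumed: ~1.5 h writing + ~1.5 h editing

slow = speaking_minutes(WORDS, 120)  # slower end of typical speech
fast = speaking_minutes(WORDS, 150)  # faster end of typical speech

print(f"Spoken: {fast:.1f} to {slow:.1f} minutes")
print(f"Written: {WRITING_MINUTES} minutes, "
      f"about {WRITING_MINUTES / slow:.0f}x to {WRITING_MINUTES / fast:.0f}x longer")
```

The gap between the two numbers is mostly the interstitial thinking, not the typing.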

With my current habits and skills, I could not speak that piece out. I could speak something out, certainly. But not the end product of what was written and edited down. For that particular piece, most of the “writing” time was interstitial thinking. Writing isn’t always like that. I have written ~1k word things start to finish in what feels like a single breath, so the “struggle” factor can vary.

Struggle factor aside, I find it very hard to envision writing any long form piece by voice. I don’t think it’s wrong, or impossible. Certainly a draft of some sort could be produced, and that might even be more effective in some speak-right-through-it sense.

Prompting, searching, descriptive and summary tasks all feel like they belong to a different category of writing to me. There’s something about a 15- to 250-word prompt that is ephemeral, with a target or outcome in mind, a clear audience (the LLM/agent) and a sort of psychological comfort that a little meandering is A-OK. There’s something fluid about it that makes voice all the more appealing, on top of the speed benefits. But it doesn’t feel like thinking, which I say with no dismissiveness intended.

This sort of “short and easy” prompt-esque writing is just easier. The entirety of the output is loaded into mental RAM, so speaking it out may come with relative ease.

Conversely, speaking in long, structured, well-reasoned statements doesn’t feel like a common standard, but is certainly a skill that can be learned. With the burgeoning voice-first movement, products specializing in “rambles to beautified document” synthesis are becoming more popular. It’s possible this will put people on the path of rambling their way to learning to speak at incrementally greater length and with incrementally more clarity and structure.

The idea of voice-first tools getting good enough to become a common medium for deep work and deep thinking is intriguing. I don’t think we’re there now and I’m not sure we will ever get there. The written word and symbol give so much referential leverage that it’s hard to imagine escaping “the work on paper.” Voice is more like a new pen, so perhaps what we need is a new form of paper.


  1. That’s the business model for some voice-first products, more or less. For example, WisprFlow’s pricing page leans very heavily into time savings as the value proposition. ↩︎

#Questions #Tech #Writing