Information Efficiency of Code vs Spoken Word

English is efficient for human communication because of the shared context we have. We can mostly understand each other without having to spell out every detail. Step into a foreign culture and even if you're fluent in the language, you'll find you still struggle with communication until you've built up the shared cultural experience.

This is true even for regional moves within a given macro-culture like the USA. I grew up moving frequently with a military family. As a boy, I'd dedicate time after each move to learning the local sports teams and their players' names. This was the language the local boys spoke. It was the minimum necessary to follow conversations at school.

Code has to be unambiguous because the computer requires precise instructions. The spoken word, however, is necessarily ambiguous. This is the result of cultural evolution. A culture that spelled everything out verbally the way code does would have lost on the battlefield long ago, defeated by a culture with more efficient modes of communication. In fact, it's been shown that human languages convey information at rates within a relatively tight range. My guess is that anything more verbose would lead to slowness, anything more terse would lead to confusion, and either would lead to defeat. This is the efficient frontier of human language, discovered and maintained through millennia of competition.

Study reveals all languages share similar information speed despite differences

The LLM doesn't share your cultural context. It's not a local. It's not your countryman. It's not even a human. The onus is on you to tell it what you mean. Even if you do so optimally given the tools at your disposal, not all ambiguity will be driven out. When humans misunderstand each other, it often goes unnoticed. If we notice, we'll usually ask clarifying questions.

Unlike a human, the LLM will notice but it won't ask a clarifying question. It'll rank potential interpretations in order of probability and choose the best one. It won't tell you it did that. You'll have to read carefully to figure it out.
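Here's a toy sketch of that rank-and-pick behavior. It is not how a real model works internally. The request, the candidate readings, and the probabilities are all made up for illustration, and `Interpretation` and `pick_interpretation` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interpretation:
    description: str
    probability: float  # made-up illustrative weight, not a real model output


def pick_interpretation(candidates):
    """Return the most probable reading, plus the readings that were passed over."""
    ranked = sorted(candidates, key=lambda c: c.probability, reverse=True)
    return ranked[0], ranked[1:]


# Hypothetical readings of the ambiguous request "sort the users by name".
candidates = [
    Interpretation("ascending by last name, case-insensitive", 0.30),
    Interpretation("ascending by first name, case-sensitive", 0.45),
    Interpretation("ascending by full display name", 0.25),
]

chosen, discarded = pick_interpretation(candidates)

# Only the chosen reading shows up in the result you get back.
print("Acted on:", chosen.description)

# The alternatives are never reported unless you go looking for them.
for alt in discarded:
    print("Passed over:", alt.description)
```

The point is the return value: the caller gets one answer that looks confident, and the 55% of probability spread across the other readings is invisible in it.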

Ironically, this interaction closely resembles an interaction with a human who doesn't realize a misunderstanding has taken place. The listener has to stay focused, pay close attention to detail, and identify the source of confusion. It turns out that this is exactly what good software engineers excel at.

It's not necessarily Computer Science as a discipline that trains this ability. It's years of experience dealing with high-consequence misunderstandings, whether with humans or with computers. The computer is just a particularly obtuse partner, one that supplies the breadth and depth of misunderstandings needed to tune the human mind to spotting ambiguity. My guess is that lawyers and judges are really good at this too.
