Propositional Content
that ursine juggernaut isn't gonna bethink to sup on you by itself
LLMs write flawless prose. They are also, generally, bad-to-terrible writers. For people who aren’t really readers or writers — that is to say, most people — it may seem strange that these two things can be true at the same time. David Foster Wallace explained how and why way back in April 2001:
When I say or write something, there are actually a whole lot of different things I am communicating. The propositional content (the actual information I’m trying to convey) is only one part of it. Another part is stuff about me, the communicator. Everyone knows this. It’s a function of the fact that there are uncountably many well-formed ways to say the same basic thing, from e.g. “I was attacked by a bear!” to “Goddamn bear tried to kill me!” to “That ursine juggernaut bethought to sup upon my person!” and so on. And different levels of diction and formality are only the simplest kinds of distinction; things get way more complicated in the sorts of interpersonal communication where social relations and feelings and moods come into play. Here’s a familiar sort of example. Suppose that you and I are acquaintances and we’re in my apartment having a conversation and that at some point I want to terminate the conversation and not have you be in my apartment anymore. Very delicate social moment. Think of all the different ways I can try to handle it: “Wow, look at the time”; “Could we finish this up later?”; “Could you please leave now?”; “Go”; “Get out”; “Get the hell out of here”; “Didn’t you say you had to be someplace?”; “Time for you to hit the dusty trail, my friend”; “Off you go then, love”; or that sly old telephone-conversation ender: “Well, I’m going to let you go now”; etc. And then think of all the different factors and implications of each option.
People hate LLM prose because it is deeply impoverished prose. We are accustomed to writing speaking to us on many levels; LLM prose is restricted to the simplest, most basic level, while emitting nothing but a monotonic hum, or in the worst case a car alarm, on the other, subtler levels.
Sometimes this is OK. LLMs are one-note, but sometimes that note is the note you want. Heck, an LLM apparently recently won a significant literary prize (though if you think this says more about literary prizes and their juries than about LLMs, you’ll need to find someone other than me to argue the point). But usually that one note rapidly becomes incredibly grating.
An analogy: imagine that a movie or TV series had nothing but yacht rock as its soundtrack. Not inherently bad, right? Sometimes, e.g. Guardians of the Galaxy, it could even be a memorable strength. But now suppose that all movies and TV series only ever had yacht-rock soundtracks. Heartwrenching breakups, grim war scenes, suspense, The Exorcist, Schindler’s List, Barbie and Oppenheimer alike … all soundtracked by cheerful lyrics backed by banal guitars. This would be pretty infuriating! But at the same time you could probably live with it for corporate videos and commercials.
It is true that sometimes prose is purely functional, and its propositional content is all that matters … but such cases are actually exceedingly rare. At home, at work, in the world of commerce, the content of what you write is inevitably colored by how you write. DFW again:
People really do "judge" one another according to their use of language. Constantly. Of course, people judge one another on the basis of all kinds of things--weight, scent, physiognomy, occupation, make of vehicle--and, again, doubtless it's all terribly complicated and occupies whole battalions of sociolinguists. But it's clear that at least one component of all this interpersonal semantic judging involves acceptance, meaning not some touchy-feely emotional affirmation but actual acceptance or rejection of somebody's bid to be regarded as a peer, a member of somebody else's collective or community or Group. Another way to come at this is to acknowledge something that in the Usage Wars gets mentioned only in very abstract terms: "Correct" English usage is, as a practical matter, a function of whom you're talking to and how you want that person to respond--not just to your utterance but also to you. In other words, a large part of the agenda of any communication is rhetorical and depends on what some rhet-scholars call "Audience" or "Discourse Community."
This is especially true on the Internet, a point I made in a surprisingly popular piece I wrote way back in 2014:
the weird thing the Internet has done to language: Standard Written English — or, at least, its most fundamentalist form, Clinical Standard Written English — has actually become incorrect in most online contexts … to the Reddit-reading masses [it sounds] orthodox, lifeless, soulless, a parade of pale impersonal zombie words drained of blood by some linguistic vampire, if you’ll pardon the mixed horror-movie metaphor. That in turn is one reason why — online, at least — a new generation of irreverent, colloquial, acerbic online sites is eating old media’s lunch.
Now Clinical Standard Written English is back, and everywhere, but mutated into a new and weirdly distinctive form. It’s not X, it’s Y. The rule of three. Phrases and metaphors that sound like they should make sense, until you look at them slightly more closely and realize they don’t. And it’s not just em-dashes — it’s the way they’re used. (Sorry, I couldn’t resist.) We’re always listening for the human resonance in the words we encounter … and LLMs give us little more than a monotone.
It is true that you can paper over this problem today, some, by giving LLMs stylistic instructions, or asking them to write in a particular style. But the interesting thing is that over the last three years frontier LLMs have gotten palpably worse at writing prose. My very first “OMG this is insane and mind-blowing” LLM moment was when I asked GPT-4 to write something in the style of Riddley Walker’s post-apocalyptic broken-English prose. I had previously asked ChatGPT/GPT-3.5, and the results were deeply meh. But then:
But then, after years of alleged progress:
(It’s perhaps worth noting that GPT-4.5 was actually quite good.)
It’s pretty easy to guess why. Models since have been optimized and reinforcement-learned so as to be good at other things: coding, mostly, but more generally, outputs of immediate economic value, rather than outputs with emotional resonance. RL has side effects. Expanding an LLM’s capability envelope somewhere often means shrinking it elsewhere. (This is one reason why post-training is as much art as science.)
So LLMs aren’t inherently, necessarily, bad writers. If the frontier labs wanted to make them superbly symphonic rather than gratingly one-note, they could. (Good writing is admittedly much harder to grade than effective code or other quantifiable benchmarks, but it could be done.) But I don’t see a lot of incentive for that to change anytime soon, so we’re likely to be stuck with Sloppish for a while longer. An interesting implication, though, is that so long as we are, LLMs still don’t really pass the Turing test[1] … which perhaps, for now, is for the best.
[1] It’s well worth reading Turing’s actual 1950 paper on the subject, Computing Machinery and Intelligence, in which he predicts (and demolishes) many of today’s AI arguments … but also concedes: “May not machines carry out something which ought to be described as thinking but which is very different from what a man does? This objection is a very strong one, but at least we can say that if, nevertheless, a machine can be constructed to play the imitation game satisfactorily, we need not be troubled by this objection.”





