Double-Blind: “Why Does A.I. Write Like … That?”
Double-Blind (named after the research design) is a blog series where we each watch/read/review a piece of content and react to it individually. Our opinions may or may not match each time, but enthusiasm always does!
For this Double-Blind post, we read “Why Does A.I. Write Like … That?” by Sam Kriss for the NY Times:
Chris:
Think about a skill you have, like cooking or music, then think about life before that skill.
Before Chef John’s video taught me how to make an omelet, I simply knew that if I asked for one I’d be in for a treat. I wasn’t one of those people who could magically fluffify eggs.
AFTER Chef John and maybe hundreds of omelets, I now see one and have a sense of how hot the butter was, how long it rested in the pan, how the chef folded it, and too many other little details. The demystified, no longer magical omelet isn’t less delicious, but I see it in a different way.
I appreciate the author for bringing some of that demystifying to AI chatbots. Take for example this passage:
“In Nigerian English… words like ‘delve’ are not unusual… They’re [‘they’ meaning large language models] trained on essentially the entire internet, which means some regional usages become generalized. Because Nigeria has one of the world’s largest English-speaking populations, some things that look like robot behavior might actually just be another human culture, refracted through the machine.”
When you spend enough time building language models, you never wonder whether the outputs are ‘sentient’ or how the model ‘just gets you.’ Instead, you tend to wonder:
What data was this trained on?
How did they select people for the post-training process?
Big note: The author missed a critical detail here. Big labs hire workers in Nigeria to evaluate model outputs (this is called Reinforcement Learning from Human Feedback), so their preferences get baked into the models we’re sold.
Would it be able to solve use case X or Y with changes to who’s involved in post-training?
What’s the overall system prompt, and what would the difference be if the prompt weren’t tuned to be ‘engaging’?
This directly spills over into personal questions and use cases. Instead of, “I hope [LLM] gets my problem,” it’s, “I wonder if my input prompt and the data they’ve stored on me will drive a useful answer to this question.”
You can interrogate usefulness in many ways. Some examples:
Prompting the model to cite sources as clickable links so you can evaluate those sources
Asking the same question over and over again and looking for how the responses change (a quick sketch of this follows the list)
Leaving details out of your prompts and seeing how accurately the model fills them in across multiple guesses
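To make that second bullet concrete, here’s a minimal sketch of the repeat-the-question check. It assumes the OpenAI Python SDK with an API key in your environment; the model name, question, and number of repeats are placeholders for illustration, not a recommendation.

```python
# Minimal sketch: ask the same question several times and eyeball how the
# answers drift. Assumes the OpenAI Python SDK is installed and an API key
# is set in the environment; model name and question are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

QUESTION = "What are the main criticisms of Reinforcement Learning from Human Feedback?"

answers = []
for _ in range(5):
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": QUESTION}],
        temperature=1.0,      # leave sampling on so run-to-run variation shows up
    )
    answers.append(response.choices[0].message.content)

# Print the runs side by side; contradictions between them are a hint that
# the model is guessing rather than reporting something it "knows".
for i, answer in enumerate(answers, start=1):
    print(f"--- Run {i} ---\n{answer}\n")
```

The other two bullets are small variations on the same loop: append “cite your sources as clickable links” to the question, or strip details out of it and compare what the model fills in.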
Some folks find it insulting when you describe LLMs as next-token-predictors, but I think we should hold multiple truths:
LLMs are predicting the next token (a toy illustration of what that means follows this list)
There’s vast human labor and context in how they predict the next token
Next token prediction has powerful, cool use cases
Be cautious about what meaning you find in those predicted tokens
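If “predicting the next token” sounds abstract, here’s a toy sketch of what it literally means. The four-word vocabulary and hand-picked scores are invented for illustration; a real model does this over tens of thousands of tokens, with scores computed from everything it absorbed in training.

```python
# Toy sketch of next-token prediction: a real LLM produces a score for every
# token in its vocabulary, turns the scores into probabilities, and samples
# one. The vocabulary and scores below are made up for illustration.
import math
import random

vocab = ["delicious", "fluffy", "sentient", "omelet"]
scores = [2.1, 1.4, -3.0, 0.7]  # pretend "logits" for the prompt "The eggs were ..."

# Softmax: convert raw scores into a probability distribution over the vocabulary.
exps = [math.exp(s) for s in scores]
total = sum(exps)
probs = [e / total for e in exps]

# Sample one token according to those probabilities. That single draw is the
# "prediction"; any meaning we read into it is layered on top by us.
next_token = random.choices(vocab, weights=probs, k=1)[0]

for token, p in zip(vocab, probs):
    print(f"{token:>10}: {p:.2%}")
print("Predicted next token:", next_token)
```

Everything in the other three bullets lives in how those scores get computed, and in whose labor and language shaped them.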
Sidenote: I fed this draft into an LLM and asked what I’m about to do. It said, “You’re about to hit send or publish,” but actually, I’m about to go make an omelet. Not enough ‘hangry’ in the training data!
Dr. Kay:
As a non-native English speaker who read a lot of books as a neurodivergent child, I’m particularly aware of my own verbal idiosyncrasies.
I use high-frequency AI words like “delve,” “interplay,” and “resonate” with fervor (probably another high-frequency word now that I think about it). I also have some unusual sentence structures that probably have some remnants of my native language cadence. (As per one of my clinical supervisors: “Why do you write like that?”) I’m also a very eager and earnest person, and I have cultivated a deliberate stream-of-consciousness style of non-academic writing that is... enthusiastic. (Don’t even get me started on ellipses.)
So, funnily enough, I find a chunk of overlap between my own speaking/writing and AI’s preferred turns of phrase.
Reading this article, I found it validated so much of my felt-sense around AI writing, and, by extension, around the shifting writing of those around me. The author writes, “All of this contributes to the very particular tone of A.I.-generated text, always slightly wide-eyed, overeager, insipid but also on the verge of some kind of hysteria.” And while that may be okay for the next pitch deck, it’s not okay when people seek emotional support from AI.
From the article:
“The A.I. is trying to write well.
It knows that good writing involves subtlety: things that are said quietly or not at all, things that are halfway present and left for the reader to draw out themselves.
So to reproduce the effect, it screams at the top of its voice about how absolutely everything in sight is shadowy, subtle and quiet.”
My therapy version of that is:
“The A.I. is trying to do therapy well*.
It knows good therapy involves non-judgmental, positive regard for patients: affirming their emotional states, encouraging authenticity, and meeting them where they’re at.
So to reproduce the effect, it descends into a “wow, that’s a great idea!” spiral that we’ve seen leads to dire results.”
(*I do not believe AI takes “conscious” action)
I believe therapists choose their words carefully. I believe therapists are aware of how their words could impact their patients, and how specificity and context matter. I believe therapists try to tailor their communication to the person in front of them and balance empathy with critical engagement with the content a patient brings.
I do not believe AI has that capacity, nor do I believe it likely ever will. The act of sitting with someone in pain is utterly human, and I believe the language directly reflects that.
The other thing the article validated is why I get lost reading AI-generated text, especially when it’s given to me by a human. My brain can’t seem to anchor in the usual way, feeling as if there’s not enough to grab hold of. And I’ve been reflecting on this a lot over the last couple of weeks. Why do I struggle? One of the things I came up with is that the information-to-word ratio is so much lower than in most human speech. AI-generated text uses so much textual flourish and yet seems to convey much less than it should.
In a post a number of months back, I wrote about why I choose to not use AI for writing. And unsurprisingly, it got some pushback from a specific set of AI-forward folks.
I choose to continue to write because I enjoy it. I genuinely enjoy choosing the exact words that express my own inner world. This inner world was so hard to convey when I was young and more socially fearful, and so I learned to be so very specific about the words I use. I learned to pack my words with as much information as I could so that I had a better chance that others could understand my thoughts and experiences. And admittedly, I’m a person who likes specificity, and there’s something so specific about how people use words to describe the world around and within them. There’s something so specific about how therapists use words to support each patient.
But as the author said about humans using AI-generated text:
“This is just what the world sounds like now. This is how everything has chosen to speak.”
And that to me feels like a tragic loss.