Part 3: From Bakhtin's heteroglossia to AI model collapse (Bakhtin and model collapse: How to use AI with expressive writing without generating AI slop)

by Tom Johnson on Jan 30, 2026
categories: ai • ai-book-club • writing

AI model collapse occurs when models trained on their own output lose the creative "tails" of data, degrading into a stagnant, monologic average. Embracing "centrifugal" edge cases and diverse voices helps keep both writing and AI models alive and coherent.

This post is part of a series. See Part 2: Heteroglossia for the previous section.

Part 3: From Bakhtin’s heteroglossia to AI model collapse

Now let me pivot sharply to the present and compare Bakhtin’s ideas of monologism and heteroglossia (from essays originally written in 1934 and 1941) to AI model collapse today. In 2024, a troubling research paper came out called “AI models collapse when trained on recursively generated data.” It argued that AI trained on AI-generated content suffers “model collapse,” a degradation in output quality such that the model loses its ability to reflect reality.

The paper argues that this degradation happens because the “tails” of the data distribution disappear. With each new model version, when the model is trained exclusively on its own output, the prediction algorithms focus on the most probable, central part of the bell curve. The tails (those rare, creative, or bizarre data points that make up the jagged, alien edge of human thought) “get washed away.” As each copy of a copy of a copy thins out these fringes, the model arrives at a “mean representation” of the data that eventually collapses into nonsense (1). In the paper’s examples, by the ninth generation, a model asked about architecture begins babbling repetitively about “blue-tailed jackrabbits” (2). It loses its connection to the living world.
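
To make the disappearing tails concrete, here is a toy sketch in Python. It is my own illustration, not the paper's experiment or code. It assumes the simplest possible "model": a bell curve described by only a mean and a standard deviation. Each generation draws a small sample from the previous generation's curve and refits the curve to that sample. The function name `simulate_collapse` and the parameters `generations`, `sample_size`, and `seed` are hypothetical choices for this example.

```python
import random
import statistics


def simulate_collapse(generations=100, sample_size=10, seed=0):
    """Return the fitted standard deviation after each generation.

    Generation 1 samples from a standard bell curve (a stand-in for
    human-written data). Every later generation samples only from the
    curve fitted to the previous generation's samples.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # assumed starting distribution
    history = []
    for _ in range(generations):
        # The current "model" generates its output.
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        # The next "model" is fitted only to that output.
        mean = statistics.fmean(samples)
        std = statistics.pstdev(samples)
        history.append(std)
    return history


history = simulate_collapse()
print(f"spread after generation 1:   {history[0]:.6f}")
print(f"spread after generation 100: {history[-1]:.6f}")
```

In this toy setup, a small sample rarely contains the rare values out in the tails, so each refit tends to describe a slightly narrower curve than the one before it. Those small losses compound, and the spread drifts toward zero. Larger values of `sample_size` slow the drift, which loosely mirrors the point above: the more the model sees of the fringes, the longer it stays connected to the original distribution.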

The problem with AI-generated content, Bakhtin might say (I’m speculating here; obviously Bakhtin predates generative AI and AI slop by decades), is that it adopts a monologic style. That monologic style is inherently centripetal; it exerts a centralizing pressure to align on a single, unified “authoritative” voice (274). Bakhtin warns that this orientation toward unity eventually makes discourse “sclerotic” (292) and “histological” (like dead tissue under a microscope) (275); it turns living speech into a dead specimen.

In contrast, heteroglossia is a centrifugal force. It embraces the edge of content. On the outside edge, beyond the bell curve of predictability, content includes more fringe, extreme, radical, and subversive takes. The outer edge is less predictable; it’s more alien, unexpected, and bizarre. It isn’t smooth. It’s where language is “still warm from struggle and hostility, as yet unresolved and still fraught with hostile intentions and accents,” Bakhtin would say (331). The loss of this edge content leads us instead toward monologism. And monologism leads to model collapse.

In short, not only does writing need heteroglossia (many different tongues) to add meaning and interest, but AI models also thrive on this difference. Sameness is a death sentence for both soul and software. This finding should lead us to celebrate those edge voices that exert centrifugal pressure away from the center and keep language, and our AI models, alive. Welcome subversive language and be curious about otherness. Trespassing beyond the fenceline of normal, perhaps skirting danger, is what makes prose exciting (and AI models coherent).

Takeaways and a new approach to using AI with expressive writing

Let’s now move toward practical takeaways and a recommended approach to using AI with writing (remember my original aim, which is to rescue expressive writing from prohibitions against AI). If heteroglossia (especially bringing in alien contexts) is vital to any personal essay, can AI help here without diluting your own voice and language? Yes, and I think it can do so in a way that’s mostly acceptable. I’ll explain three strategies in the sections that follow.

Next section

Continue on to the next section, Part 4: Use AI as a research assistant.
