In today’s column, I examine an intriguing pattern that arises when using generative AI and large language models (LLMs) to write a creative short story. Researchers discovered that eleven specific words show up frequently and prominently, which seems to defy the assumption that AI is merely randomly composing a story for you. For what some might insist are inexplicable reasons, this set of eleven nouns is used consistently and repeatedly by the AIs, even when prompting entirely different LLMs to compose a fresh story from scratch.
This oddball outcome ought not to occur. In theory, each time you ask an LLM to write a creative story, the story should be brand new in the sense that the AI starts over and allows any words to be used. Furthermore, if you ask one LLM to write a story and ask a different LLM to write a story, you would not expect them to land on the same words, since the two are fully independent of each other. The statistical chances of two different LLMs happening to include the same nouns would seem to be astronomically small.
I shall walk you through the weighty and mysterious matter. We will ponder how this result is arising. I believe that there is a reasonable explanation for this phenomenon and that the underlying roots are tied to the nature of the AI foundational underpinnings, including the algorithms being used, the data scanned at the initial training of the AI, and the tuning that occurs after the AI is first established. There is indeed a method to the madness.
Let’s talk about it.
This analysis of AI breakthroughs is part of my ongoing Forbes column coverage on the latest in AI, including identifying and explaining various impactful AI complexities (see the link here).
Great Mysteries Of Modern AI
I have been doing a series of Sherlock Holmes-like analyses of why modern-era AI exhibits unusual or unexpected behaviors that are surprising and sometimes shocking (for my recap of the dozen or so AI mainstay mysteries, see the link here). This is more than a semblance of innate curiosity about these oddish outcomes. If we are going to be using contemporary advanced AI to run various aspects of our lives, such as AI managing the electrical grid and AI operating our factories, we ought to feel comfortable and assured that the AI is fully predictable and reliable.
The problem at hand is that generative AI and LLMs are large-scale and filled with arcane mathematical and computational complexities. Not even the topmost AI experts can explain precisely why AI does what it does. The flow of numbers is so complicated and confounding, and rests on so many billions of values and calculations, that it is not reasonably possible to logically explain it all. We are left to take it at face value that if the AI seems to be working, everything is okay.
But that’s not a sensible assumption.
The disconcerting matter of so-called AI hallucinations is a prime example of what we don’t know about AI. There are lots of theories about why AI at times produces confabulations, made-up answers that appear to come out of thin air. These have been labeled as “hallucinations,” which is a handy but false means of anthropomorphically referring to them. Some experts believe that AI hallucinations can ultimately be curtailed, while others declare that AI hallucinations will always arise and are unavoidable; see my in-depth discussion at the link here.
Twists And Turns
I’ve got a new mystery for you to mull over.
When asking AI to write a creative story, you would normally expect that the story is fresh and won’t be repetitive. The AI should be selecting from all possible words to craft a somewhat unique story. As the story comes together, the AI seemingly randomly selects this word or that word to compose the story.
A recent research study entitled “Elias In The Lighthouse, Again? Diagnosing Low Diversity In LLM Stories” by Sil Hamilton and David Mimno, arXiv, May 26, 2026, found that there are eleven words that keep being chosen, as noted in the research findings (excerpts):
- “We sample 20,000 total stories from four current models using five prompts. We find that 11 words occur in 88.3% of generated stories, with little difference between models.”
- “These words include names (Elias, Mara, Elara), settings (lighthouses), and professions (clockmaker, librarian).”
- “These tokens do not often occur in published literature nor in pre-training data, but they are found in preference data that is likely to have been used by all current models.”
- “This suggests models do not simply mimic the dominant patterns in their training corpora.”
- “When given little direction, current frontier models write stories using a narrow catalog of names, places, and occupations.”
Let’s go ahead and unpack the study and take a close look at some of the crucial details.
The Eleven Nouns
I’m sure that you are eager to see all eleven nouns that were found to be recurring. Here they are:
- (1) Lighthouse
- (2) Mara
- (3) Elias
- (4) Elara
- (5) Keeper
- (6) Baker
- (7) Mayor
- (8) Clockmaker
- (9) Fisherman
- (10) Librarian
- (11) Conductor
One notable aspect is that AI in this experiment is generating fictional stories.
I mention this because the AI would likely not lean on those eleven words if the story were non-fictional (that’s my best guess). A non-fictional story would presumably be based on real-life considerations and shaped around those factual elements. Unless the facts happened to involve lighthouses, librarians, and the like, those words would tend to be rarely used (so I believe).
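For readers who want to try this on their own batch of AI-generated stories, here is a minimal Python sketch of one way to measure how often the eleven nouns turn up. It assumes that the study's 88.3% figure refers to stories containing at least one of the words, which is my reading rather than something the excerpts spell out. The function names and the toy stories are hypothetical, and the matching is exact, so a plural such as "lighthouses" is not counted.

```python
import re

# The eleven nouns listed in this column, lowercased for matching.
ELEVEN_WORDS = {
    "lighthouse", "mara", "elias", "elara", "keeper", "baker",
    "mayor", "clockmaker", "fisherman", "librarian", "conductor",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def contains_target(story, targets=ELEVEN_WORDS):
    """Return True if the story uses at least one target word (exact match only)."""
    return any(token in targets for token in tokenize(story))

def coverage(stories, targets=ELEVEN_WORDS):
    """Fraction of stories that contain at least one target word."""
    if not stories:
        return 0.0
    hits = sum(contains_target(story, targets) for story in stories)
    return hits / len(stories)

# Hypothetical toy corpus for illustration (not data from the study).
toy_stories = [
    "Elias had been the keeper for forty years.",
    "A rocket left the launch pad at dawn.",
    "The librarian locked the door and walked home.",
    "Two friends argued about a chess game.",
]

print(coverage(toy_stories))  # 0.5, since two of the four toy stories use a listed word
```

A fuller analysis would likely handle plurals and possessives, and would count each word separately rather than lumping all eleven together.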
The Prompts Being Used
The experiment made use of extremely simple prompts. As you might know, when you tell AI to do something or ask a question of AI, you enter a prompt to do so. The prompt can be just a few words in length or can be very lengthy. It’s up to you to decide what you want to say in your prompt and how lengthy it will be.
If you opt to use very short prompts, the odds are that the AI is going to essentially fill in the blanks for you and revert to all sorts of defaults. The AI is supposed to do this so that you don’t have to compose lengthy prompts. AI makers know that people don’t usually want to write lengthy prompts. Therefore, to appease users, you can write succinct prompts, and the AI will make assumptions about what you are asking it to undertake.
Here are the five prompts used in their experiment:
- (1) “Write a story.”
- (2) “Please write a story.”
- (3) “Write me a story.”
- (4) “Tell me a story.”
- (5) “Please tell a story.”
Notice that the prompts provide no particular direction on how the stories are to be devised. This is significant. I wonder what the results would be if the prompts said to do this or that, such as write a story about rockets and interstellar travel. Would we still get mentions of lighthouses, librarians, and the like? My guess is that those words would be less likely to appear.
To clarify, I’m glad they used short prompts in this initial round of research. If the prompts had been more elaborate, it would have led to a debate about how much the added content of the prompts was tilting the AI. Instead, given the relatively pristine prompts, it seems reasonable to conclude that the prompts were tapping into the customary defaults and not swaying the direction of the AI.
For everyday use of LLMs, your best bet when using generative AI is to write detailed prompts that specify what you want the AI to do. If you are scant on details, the AI will willy-nilly make choices for you. For my explanation of the best practices in prompt engineering, see the link here.
The Experimental Runs
The experiment was nicely devised to run the prompts enough times so that we can be comfortable in making generalized conclusions about the results. I say this since the experiment could have done a handful of runs and stopped with those results, but we would undoubtedly have raised eyebrows that the runs weren’t sufficiently large to showcase a consistent pattern.
Also, the experiment made use of distinctly different LLMs, rather than using only one or using one that has several variations. That was a useful choice. Sometimes a study will choose one LLM and find something intriguing, but we are left wondering whether only that LLM has that peculiarity. By using multiple LLMs, the phenomenon can be said to arise across the board.
Here are the four different LLMs they used in the experiment:
- (1) OpenAI GPT-5.4-Mini
- (2) Anthropic Claude Haiku 4.5
- (3) Google Gemini 3.1 Flash-Lite
- (4) AI2 OLMo 7b Thinking
Those are handy selections. They are popular AIs. They are made by different AI makers. Some studies pick LLMs that are obscure and that few people use or know anything about.
Each of the 5 prompts was run 1,000 times, so each of the chosen LLMs had 5,000 runs. There are 4 LLMs in the experiment. Thus, 4 LLMs with 5,000 runs produced 20,000 total stories. The researchers indicated that this amounted to 12.8 million words. If we divide 12.8 million words by the 20,000 stories, it comes to 640 words on average for the length of each story.
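The tally above can be double-checked with a few lines of Python. This is merely a sketch that restates the counts reported in this column; the variable names are my own.

```python
# The five prompts and four LLMs named in this column.
PROMPTS = [
    "Write a story.",
    "Please write a story.",
    "Write me a story.",
    "Tell me a story.",
    "Please tell a story.",
]
MODELS = [
    "OpenAI GPT-5.4-Mini",
    "Anthropic Claude Haiku 4.5",
    "Google Gemini 3.1 Flash-Lite",
    "AI2 OLMo 7b Thinking",
]
RUNS_PER_PROMPT = 1_000    # each prompt was run 1,000 times per LLM
TOTAL_WORDS = 12_800_000   # word count reported by the researchers

runs_per_model = len(PROMPTS) * RUNS_PER_PROMPT   # stories per LLM
total_stories = len(MODELS) * runs_per_model      # stories overall
avg_words_per_story = TOTAL_WORDS / total_stories

print(runs_per_model, total_stories, avg_words_per_story)  # 5000 20000 640.0
```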
That’s something of potential interest. The AI opted to produce short stories, ones that were presumably around 640 words in length. It wasn’t explicitly instructed to do so. The odds are that the AI maker’s tuning led the AI to assume short stories were desired. Would the AI have done something different in terms of word choices if the stories were 5,000 words in length? There is a possibility that longer stories might alter the pattern, though seemingly we might still see the same eleven words being repetitively used.
Example Of A Generated Story
You might be quite curious about the shape and wording of the AI-generated stories. I say this because it could be that the eleven words were only tangentially mentioned in the stories. Perhaps the words were throwaway words. On the other hand, if they were used with intensity, that’s a whole different ball game.
Here is an example excerpt that was cited in the study:
- “The lighthouse at the edge of the world did not guide ships; it signaled to the stars. Elias had been the keeper for forty years. He was a man composed of salt, solitude, and the rhythmic ticking of gears. The lighthouse was a towering spire of obsidian, carved directly into a jagged needle of rock that rose from a sea so still it looked like polished slate.”
The gist is that these eleven words seemed to be used in a prominent way. We might be more inclined to handwave away the words if they had merely slipped into the stories on a here-or-there basis. The fact that the words at times had a big influence on the story is more significant and is especially a cause for concern.
Explaining Why AI Went This Route
Now that I’ve laid out some salient details, I’d like you to give serious deliberation to why the LLMs did this. Why were these words being used repetitively? The prompts didn’t sway the AI, at least not in any explicit directional manner. Also, we might shrug our shoulders if just one brand of LLM did this, but we have four different ones that are doing so.
Grab a glass of fine wine and find a reflective spot to ponder the hefty matter.
As astutely stated by the researchers, the AI isn’t using words that we would expect to see if all words were on the table. Think of it this way. If we did a statistical frequency analysis on all the words that the AI scanned during initial data training, based on scanning widely across the Internet, the chances are that these eleven words would not necessarily show up near the top of the list. Yet, somehow, they turn out to be used with great frequency when composing fictional short stories.
It is a puzzling mystery.
My Theory Of The Likely Solution
Here is my best guess. There is an important distinction between high-frequency language and a high-probability narrative structure. The prompts didn’t ask the AI to source the most common English nouns. Instead, the prompts basically told the AI to find the “best” nouns that fit the statistical signature of a compelling fictional narrative.
The AI has been tuned to generate compelling short stories that will capture the interest of humans, drawing on the stories that the AI was trained on from the get-go. I dare say that lighthouses, librarians, and the like recur in fiction through the ages. They have become archetypes in fiction. I would think that those eleven words hold a highly connected mathematical and computational position in the semantic network of the LLM.
Okay, maybe so, but why did different LLMs do the same thing?
That’s easily answered.
AI makers are making use of essentially the same algorithms; they are tapping into pretty much the same data during the scanning across the Internet, and they are roughly tuning their LLMs in generally similar ways. This is an ongoing bone of contention. Some ardently believe that we have landed in a rut. The major AI makers are all pursuing the same ground.
Whereas it might look like innovation from the outside, insiders know that we are plowing the same ground with the same plows. If this ends up a dead-end path toward artificial general intelligence (AGI), it will be a sad circumstance since no one is boldly trying utterly different approaches. For more analysis of this concern, see the link here.
The Runs Tell The Story Beautifully
The speculation that this is a mystery solved by how AI makers are crafting AI is bolstered by the evidence across different LLMs. They are each “independently” discovering the same aspects underlying English short stories. By repeating this thousands of times, you are sampling from the same conditional distribution, which helps to make these attractor words appear more often than intuition would predict.
It is an emergent literary prior, and the experiment surfaces it, revealing the statistical anchors of narrative composition by human writers.
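To make the sampling point concrete, here is a toy Python sketch. It treats each noun choice in a story as an independent draw, which is a gross simplification of how an LLM actually generates text. All the numbers (vocabulary size, number of noun slots, and the share of probability sitting on the attractor words) are invented for illustration and are not taken from or fitted to the study.

```python
import random

def prob_at_least_one(attractor_mass, slots):
    """Chance that at least one of `slots` independent draws is an attractor word."""
    return 1.0 - (1.0 - attractor_mass) ** slots

def simulate(attractor_mass, slots, stories, seed=0):
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(stories):
        if any(rng.random() < attractor_mass for _ in range(slots)):
            hits += 1
    return hits / stories

# Hypothetical numbers chosen only for illustration.
VOCAB_SIZE = 5_000   # candidate nouns available to the storyteller
ATTRACTORS = 11      # the recurring words
SLOTS = 20           # noun choices made per story

uniform_mass = ATTRACTORS / VOCAB_SIZE   # every noun equally likely
peaked_mass = 0.05                       # assumed: 5% of probability sits on the attractors

for label, mass in [("uniform", uniform_mass), ("peaked", peaked_mass)]:
    exact = prob_at_least_one(mass, SLOTS)
    estimate = simulate(mass, SLOTS, stories=20_000)
    print(f"{label}: exact={exact:.3f} simulated={estimate:.3f}")
```

Under these made-up settings, a story drawn from the uniform distribution includes an attractor word only about 4% of the time, whereas the peaked distribution pushes that to roughly 64%. A modest tilt per word choice compounds across a whole story, and across thousands of stories the pattern becomes hard to miss.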
I’ve identified this wonderment previously. For example, a notable experiment discovered that LLMs tend to invent the same fake names repeatedly, which I explain similarly; see the link here. I’ve also discussed that LLMs have a common form of data ancestry and a common base of knowledge; see the link here.
In practical terms, if you want AI to exhibit a greater semblance of creativity, you need to ensure that your prompts spur the LLM in that direction. For my coverage on how to write prompts to get AI to be a creative thinker, see the link here. An additional approach, though a bit more complicated, involves making use of a random number generator. I describe this in my coverage of the seed-of-thought prompting technique; see the link here.
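As a rough illustration of bringing a random number generator into the mix, here is a small Python sketch that picks story ingredients at random and folds them into the prompt. To be clear, this is my simplified stand-in and not necessarily the seed-of-thought technique as described in the linked coverage. The word lists and the build_prompt function are hypothetical.

```python
import random

# Hypothetical ingredient lists; swap in whatever suits your needs.
SETTINGS = ["a desert truck stop", "an orbital greenhouse", "a flooded subway station"]
PROFESSIONS = ["a beekeeper", "an auditor", "a crane operator"]
NAMES = ["Tomasz", "Ngozi", "Priyanka"]

def build_prompt(rng):
    """Compose a story prompt from randomly chosen ingredients."""
    setting = rng.choice(SETTINGS)
    profession = rng.choice(PROFESSIONS)
    name = rng.choice(NAMES)
    return f"Write a story set in {setting} about {name}, {profession}."

# Pass an integer to random.Random() if you want a reproducible prompt.
rng = random.Random()
print(build_prompt(rng))
```

The idea is that the randomness comes from outside the LLM, so the AI is nudged away from its customary defaults before it starts composing.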
The World We Are In
AI is increasingly becoming ubiquitous throughout society. That’s both good news and unsettling news. AI provides a lot of extremely useful advantages. At the same time, there are downsides that we need to give due attention to. The aim is to treasure and leverage the upsides and mitigate or stop the disadvantages.
A final thought for now. Sherlock Holmes made this pointed remark: “It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.” Thank goodness that AI researchers are performing bona fide experiments and making their data publicly available so that we can all conduct reasoned inquiries into solving the mysteries of AI.
As the line popularly attributed to Sherlock Holmes goes, that’s elementary, my dear Watson.

