LLMs favor whatever appears most often in their training data, so they’ll give you “key → unlock” a thousand times before “key → reef.”
We asked a frontier LLM for word associations and compared the results to Linguabase. The pattern is consistent: LLMs give you the statistically dominant meanings and miss the rest.
| Word | Linguabase only | Both | LLM only |
|---|---|---|---|
| key | reef, atoll, cay, clef, transpose, tumbler, cryptography, fob, keystone | unlock, cipher, door, password, piano, code, solution | lock, keyboard, essential, metal, chain, scale, crucial |
| bridge | bidding, trump, slam, contract, luthier, nose, Wheatstone, cantilever, viaduct | crossing, span, arch, suspension, river, dental, overpass | connect, gap, cable, pier, structure, link |
| window | browser, tab, popup, opportunity, mullion, oculus, clerestory, fenestration | sill, pane, casement, screen, curtain, frame, view, glass | shade, ledge, transparent, light, ventilation, drapes |
| nephew | nepotism, avuncular, nibling, godson, godfather, doting, prodigal, namesake | niece, uncle, aunt, cousin, brother, sister, relative, kin | relation, generation, son, kinship, bond |
| penguin | Linux, Happy Feet, Morgan Freeman, rookery, porpoising, countershading, crèche | cold, flightless, krill, waddle, emperor, tuxedo, Antarctic | ice, swim, arctic, ocean, pebble |
| tornado | Joplin, Moore, Wizard of Oz, storm chaser, mesocyclone, mobile home, alley | twister, funnel, supercell, cyclone, vortex, storm, Dorothy | rotating, destruction, severe, Midwest, siren |
| giraffe | ossicones, blood pressure, Geoffrey, okapi, reticulated, Serengeti, ruminant | acacia, spots, neck, tall, tongue, savanna, Africa, calf | legs, pattern, horns, graceful |
LLM tested: Claude Opus 4.5, January 2026. Prompt: “list words related to [word], comma delimited.”
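For readers who want to run the same comparison on their own lists, here is a minimal sketch of the three-way split using plain set operations. The function name `three_way_split` is ours, and the two input lists contain only the words shown in the “key” row above, not the full output of either source.

```python
def three_way_split(curated, llm):
    """Split two association lists into curated-only, shared, and LLM-only."""
    # Normalize case and whitespace so "Piano " and "piano" count as one word.
    curated_set = {w.strip().lower() for w in curated}
    llm_set = {w.strip().lower() for w in llm}
    return {
        "curated_only": sorted(curated_set - llm_set),
        "both": sorted(curated_set & llm_set),
        "llm_only": sorted(llm_set - curated_set),
    }


# Lists for "key", reassembled from the table row (assumption: table words only).
linguabase_key = [
    "reef", "atoll", "cay", "clef", "transpose", "tumbler", "cryptography",
    "fob", "keystone", "unlock", "cipher", "door", "password", "piano",
    "code", "solution",
]
llm_key = [
    "unlock", "cipher", "door", "password", "piano", "code", "solution",
    "lock", "keyboard", "essential", "metal", "chain", "scale", "crucial",
]

split = three_way_split(linguabase_key, llm_key)
for column, words in split.items():
    print(f"{column}: {', '.join(words)}")
```

Lowercasing is a convenience for matching; a real comparison would probably want to preserve proper nouns such as “Wheatstone” for display.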
Which do you prefer: the leftmost column (unique to Linguabase) or the rightmost (unique to the LLM)?
As game designers and wordplay lovers, we find that even the best LLM output feels flat. And over time it starts to feel repetitious: an insidious sameness that makes level 49 feel eerily like levels 36 and 85.
Good word games enchant and retain players when they include the kind of richness the left column shows: non-obvious facets, technical depth, cultural touchstones, etymological connections, sensory and experiential associations. And they feel better when there’s less generic filler.
The Linguabase column (left) is more interesting because it embodies more dimensions.
LLMs are trained on text where some meanings appear far more often than others. Perhaps in the training data, “key” in security contexts outnumbers “key” as a low island — leaving us with what’s statistically dominant: unlock, lock, door, password, access.
This isn’t a prompting problem. You can ask for “diverse” or “unusual” associations and the LLM will try — but it’s still sampling from the same skewed distribution.
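A toy simulation makes the skew concrete. The weights below are made up purely for illustration; they are not measurements of any model or corpus, and a real LLM is far more complicated than a weighted draw. The point is only that a sense with a tiny share of the probability mass almost never surfaces, however many times you sample.

```python
import random
from collections import Counter

# Illustrative weights only: invented stand-ins for how often each sense of
# "key" might show up in training text. They are NOT measured values.
ASSUMED_WEIGHTS = {
    "unlock": 1000,
    "door": 800,
    "password": 600,
    "piano": 200,
    "clef": 20,
    "reef": 1,
}


def sample_associations(weights, n_draws, seed=0):
    """Draw n_draws associations with probability proportional to weight."""
    rng = random.Random(seed)  # seeded so repeated runs give the same counts
    words = list(weights)
    draws = rng.choices(words, weights=[weights[w] for w in words], k=n_draws)
    return Counter(draws)


counts = sample_associations(ASSUMED_WEIGHTS, n_draws=10_000)
for word in ASSUMED_WEIGHTS:
    print(f"{word:10s} {counts[word]}")
```

Under these assumed weights, “reef” holds roughly 1/2621 of the mass, so 10,000 draws would be expected to produce it only a handful of times while “unlock” appears thousands of times.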
You can use an LLM to check whether “key → reef” is a valid association. It’ll say yes. But it won’t generate that association reliably, because reef isn’t in the high-probability zone when the prompt is “what’s related to key?”
That’s the core insight: LLMs are good validators, bad generators—at scale, for this task.
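The validator role can be sketched as a simple filter: candidates come from somewhere other than the model (a curated list, for instance), and the model only answers yes or no. This is a hypothetical outline, not a description of how Linguabase was built. The names `validate_candidates` and `stub_validator` are ours, and the stub stands in for a real LLM call, which we deliberately leave out.

```python
from typing import Callable, Iterable


def validate_candidates(
    word: str,
    candidates: Iterable[str],
    is_valid: Callable[[str, str], bool],
) -> list[str]:
    """Keep only the candidates that the validator accepts for `word`."""
    return [c for c in candidates if is_valid(word, c)]


def stub_validator(word: str, candidate: str) -> bool:
    # Hypothetical stand-in for an LLM yes/no check. A real version would
    # send a prompt to a model and parse its answer; here the accepted
    # pairs are hard-coded so the example runs offline.
    accepted = {("key", "reef"), ("key", "clef"), ("key", "unlock")}
    return (word, candidate) in accepted


kept = validate_candidates("key", ["reef", "clef", "giraffe", "unlock"], stub_validator)
print(kept)  # ['reef', 'clef', 'unlock']
```

Passing the validator in as a function keeps the filter independent of any particular model or API, so the same loop works with a human reviewer, a lookup table, or an LLM.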
Definitions are the layer where LLMs come closest to being a substitute. You could brute-force 400K API calls to generate definitions in your style. But there are catches:
For associations and vocabulary rankings, there’s no LLM shortcut. For definitions, there’s a hard way that might work—but you’d still need the word list and difficulty rankings to start from.
We spent a decade building the graph the hard way:
Built from 1.5 million words and 100 million connections. Shipped as a curated 400K-word graph with ~40M connections: every plausible word a player would use, without noise.