Words with Spaces

Companion article
The Small World of English — how any two words connect in 6–7 hops.

There are nearly half a million compound phrases that aren’t in any dictionary—simply because they contain spaces. “Boiling water.” “Saturday night.” “Help me.” I got interested in this because I make word games. I wanted to understand when to include words with spaces—and the legacy effects of traditional dictionaries on our sense of “what is a word.”

Here’s a slider. Look at expressions at different familiarity levels. Gold words are missing from traditional dictionaries:

[Interactive slider, from familiar to obscure. Legend: totally missing; in Wiktionary; in traditional dictionaries (MW, Oxford, or both).]

Crowd-sourced Wiktionary has 16 times more headwords than Merriam-Webster’s already hefty tabletop book. Yet even Wiktionary leaves gaps.

Focused on singletons

When dictionaries were planned and created, lexicographers focused on the building blocks of language—and overwhelmingly preferred individual words. Even the technical term is clinical: “multi-word expressions” (MWEs)—as if they’re a deviation from the norm.

[Chart: Coverage drops as you go deeper. Series: MW, MW + Wiktionary.]

Merriam-Webster (green) covers just 18% of the top 10,000 MWEs, dropping to 7% by 100K. Adding Wiktionary (yellow) brings coverage to 75%, but even that drops to 48%.
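
One way to picture the measurement behind these percentages: take MWEs ranked by familiarity and, at each depth, count the share that appear as dictionary headwords. The sketch below is only an illustration of that idea. The function name `coverage_at_cutoffs`, the toy ranking, and the toy headword set are all assumptions made up for the example, not the article's data or code.

```python
def coverage_at_cutoffs(ranked_mwes, headwords, cutoffs):
    """Return, for each cutoff k, the share of the top-k MWEs found in headwords."""
    heads = {h.lower() for h in headwords}
    coverage = {}
    for k in cutoffs:
        top = ranked_mwes[:k]
        hits = sum(1 for phrase in top if phrase.lower() in heads)
        coverage[k] = hits / len(top) if top else 0.0
    return coverage


# Toy ranking, most familiar first (hypothetical order, for illustration only).
toy_ranked = [
    "high school", "hot dog", "boiling water", "paper towel",
    "red tape", "front door", "cold weather", "loose leaf",
]
# Toy headword set (NOT real dictionary contents).
toy_headwords = {"High school", "hot dog", "red tape"}

toy_coverage = coverage_at_cutoffs(toy_ranked, toy_headwords, [2, 4, 8])
for k, share in toy_coverage.items():
    print(f"top {k}: {share:.1%} covered")
```

With real inputs, the same loop run at cutoffs of 10,000 and 100,000 would give the kind of falling curve described above.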

A few selected, non-obvious expressions (called “opaque compounds”) would be included if they seemed interesting enough. And dictionary pages were not wasted on self-evident combinations. But what was lost?

Buried in the billions

250 billion grammatically valid 2-word pairs
85% are nonsense: “tennis democracy,” “cute doornail”
15% are plausible (~30B): “cold water,” “dry shoe”
~700K crystallized into something more
One in 40,000 plausible pairs carries special conceptual weight

Language is a vast combinatorial space. English has roughly 325K nouns, 60K verbs, and 85K adjectives. Multiply across grammatical patterns and you get about 250 billion possible two-word combinations.

Most are nonsense—“purple Wednesday,” “angry furniture.” But roughly 15% are plausible: “wooden chair,” “morning coffee.” That’s still 30 billion sensible pairs.

What interests me is that within those billions, some combinations have crystallized into something more—expressions that carry conceptual weight beyond their parts. “Hot dog” isn’t just a warm canine. “Red tape” isn’t about colored adhesive.

I asked Claude to brainstorm candidates (explained below). It returned 730K MWEs. These are some of the types, and how often they appeared in Merriam-Webster (15,921 of 89,128 headwords), Oxford American (13,758 of 72,495 headwords), and Wiktionary (221,546 of 1.44 million English entries). In each graph below, the left side is more familiar, the right more obscure:

transparent (213K): boiling water, paper towel, cold weather, front door
technical (150K): blood pressure, machine learning, black hole, climate change
semi-opaque (130K): high school, living room, best friend, fast food
named entity (133K): New York, World War, Coca Cola, Michael Jackson
opaque (23K): hot dog, piece of cake, red tape, couch potato
phrasal verb (12K): kick out, give up, look after, break down
institutional (2K): null and void, terms and conditions, cease and desist, rules and regulations
binomial (1.7K): black and white, back and forth, hide and seek, yes or no
light verb (2.3K): take care, have fun, get married, tell the truth
other (1.5K): I am, as well as, each other, such as
Legend: totally missing; in Wiktionary; in traditional dictionaries (MW, Oxford, or both).

Print dictionaries barely cover these expressions: Merriam-Webster has just 2.5%, Oxford has 2.2%. Even combined, they cover only 3.4%. Wiktionary does better at 32%.

The opaque compounds and phrasal verbs get the best coverage in Wiktionary. The named entities (proper nouns) are mostly encyclopedic topics—so they’re in Wikipedia. The technical terms drift into jargon as they get more obscure, but jargon isn’t useful for word games because too few players know it.

But see all the gold at the top of those little graphs for “transparent” and “semi-opaque”? That’s a deep reservoir of real MWEs—and we’ve been conditioned by dictionaries not to expect it. The old guard dictionaries, colored above in dark blue and dark green, overwhelmingly ignore them.

Even among the transparent phrases, some carry more conceptual weight than their words alone. “Yellow pear” is just describing. “Boiling water” names a thing (what linguists call the naming function): a state of matter, a hazard, a cooking stage. The difference? One could have been a word.

Estimate source: These numbers are illustrative—no absolute counts exist. From 500K single-word terms, we sampled and classified parts of speech, including inflected forms and rare terms: ~325K nouns, 60K verbs, 85K adjectives (roughly twice WordNet’s counts). We multiplied across grammatical patterns. To estimate how many “make sense,” we randomly paired words: noun+noun 12% sensible, adjective+noun 64%, verb+noun 40%. The 30 billion is a weighted estimate.
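
The arithmetic in that note can be sketched as follows. This is a rough illustration, not the author's calculation: it uses only the three patterns for which the note gives a sensible rate, and the name `estimate_pairs` is hypothetical. These three patterns alone are not expected to reproduce the 250 billion and 30 billion figures, which depend on additional patterns and weighting that the note does not list.

```python
# Word counts quoted in the estimate note.
NOUNS, VERBS, ADJECTIVES = 325_000, 60_000, 85_000

# Assumption: only the three patterns with a stated "sensible" rate (in percent).
PATTERNS = {
    "noun+noun": (NOUNS, NOUNS, 12),
    "adjective+noun": (ADJECTIVES, NOUNS, 64),
    "verb+noun": (VERBS, NOUNS, 40),
}


def estimate_pairs(patterns):
    """Map each pattern to (possible pairs, estimated sensible pairs)."""
    rows = {}
    for name, (first, second, pct) in patterns.items():
        possible = first * second
        rows[name] = (possible, possible * pct // 100)
    return rows


rows = estimate_pairs(PATTERNS)
total_possible = sum(possible for possible, _ in rows.values())
total_sensible = sum(sensible for _, sensible in rows.values())

for name, (possible, sensible) in rows.items():
    print(f"{name}: {possible:,} possible, {sensible:,} sensible")
print(f"three-pattern total: {total_possible:,} possible, {total_sensible:,} sensible")
```

Integer percentages are used instead of floats so the products stay exact at this scale.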

Here are examples of phrases that are cousins of concepts that did get lexicalized:

Got a word                        Didn’t
frozen water → ice                boiling water
eyes closing briefly → blink      closed eyes
dull persistent pain → ache       severe pain

Should English have created a word for severe pain, other than the more emotionally tinged “agony” or “anguish”? I don’t know. But “severe pain” is tight enough to be a useful token in a word game. It names a thing.

Named entities

This analysis is about timeless words, not trivia. So I excluded 133K named entities—proper nouns like “New York City,” “Albert Einstein,” and “World War II.” Named entities are well-covered by Wikipedia, as are 47% of “technical” MWEs (“Parkinson’s disease”).

Scrabble famously excludes proper nouns. Crossword puzzles embrace them. Choosing how many proper nouns to include in a word game is tangled with cultural assumptions. Easy trivia is pointless, and hard trivia is alienating. Should all U.S. presidents be fair game? What about K-pop stars? British monarchs? Nobel laureates? The “right” answer depends on your audience, and that’s a design question, not a linguistic one.

What kinds of phrases?

The opacity distinction isn’t the only interesting one. MWEs vary in origin and character too—some are technical, some metaphorical, some named after people, some freshly coined. Here’s how they break down:

domain-specific: blood pressure, type 2 diabetes, zero gravity, habeas corpus, chapter 11
figurative: alarm bells, melting pot, young at heart, zero hour, cloud 9
eponymous: Achilles’ tendon, tommy gun, venetian blind, wellington boots
taxonomic: Tasmanian devil, Canada goose, tyrannosaurus rex, turkey vulture
neologism: fidget spinner, smart home, gig economy, viral video, zero waste
dated: wet nurse, video cassette, zoot suit, typewriter ribbon

The words that aren’t words

The line between “word” and “phrase” is fuzzier than dictionaries suggest. Words tend to carry more conceptual weight—but not always. Some words probably didn’t need to be coined; some phrases deserve more credit than they get. English is full of two- and three-word phrases that function as single semantic units—effectively “words”—but because they’re not fused into one orthographic word, they’re invisible to dictionaries and underappreciated as vocabulary. “Kindergarteners” is in the dictionary, but not “middle schoolers.” Newer professions—“car salesman,” “vacuum salesman,” “balloon seller”—are two words, but ancient professions might have a one-worder: jeweler, florist, fishmonger, tobacconist, stationer, mercer.

Meanwhile, in Spanish

Other languages sometimes have tidy one-worders for things English can only describe. Spanish carves up time with precision English lacks: madrugada for the pre-dawn hours, atardecer for late afternoon waning into evening. The mid-day nap was so compelling we adopted the siesta into English.

But concise one-word equivalents in other languages appear only among the 2,000 most familiar English MWEs, and even then rarely. It’s as if languages collectively decided these concepts weren’t worth crystallizing into single words.

Exceptions exist—Norwegian forelsket for the euphoria of falling in love, Danish hygge for candlelit contentment, Portuguese saudade for longing. But these are famous precisely because they’re rare. Beyond the top tier of familiarity, the well runs dry everywhere. No language has a single word for “parking lot frustration” or “Sunday evening dread.” Generally, if English describes it with a phrase, so does everyone else.

The lexicographer’s mind

Why did dictionary-makers include some phrases and not others?

Lexicographers used a substitutability test: if you can swap synonyms freely, it’s not a lexical unit. “Cold feet” (meaning fear) can’t become “frigid feet”—so it gets an entry. But the test cuts both ways. You can say “boiling water” but not “seething water” or “raging water.” The phrase resists substitution too.

Looking at the extremes (the words most often included or excluded at the start or end of a phrase) reveals patterns. Compare two groups that made these decisions. Traditional lexicographers (Merriam-Webster, Oxford) worked under page limits, favoring established scientific and technical vocabulary. Wiktionary volunteers, unconstrained by space, focused on phrasal verbs and everyday idioms. Tap the toggles to explore the lexicographers’ decisions.

[Interactive toggles. Notation: (N of M) = N phrases in the dictionary out of M total phrases with that anchor word.]

Curiously, Wiktionary also achieved 100% coverage on some less common ending words. Every lily we tested (water lilies, Easter lilies, tiger lilies). Every sauce (dipping sauces, hot sauces, pasta sauces). Every pudding (Yorkshire puddings, rice puddings, Christmas puddings). Every acid (stomach acid, fatty acid, sulfuric acid).
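
The (N of M) tallies by ending word could be computed along these lines. This is a minimal sketch under stated assumptions: the function name `coverage_by_last_word` is hypothetical, and the phrase list and headword set are toy values invented for the example, not real dictionary contents.

```python
from collections import defaultdict


def coverage_by_last_word(phrases, headwords):
    """Map each ending word to (N, M): N phrases in headwords out of M phrases."""
    heads = {h.lower() for h in headwords}
    totals = defaultdict(int)
    hits = defaultdict(int)
    for phrase in phrases:
        words = phrase.lower().split()
        if not words:
            continue  # skip empty strings
        anchor = words[-1]
        totals[anchor] += 1
        if " ".join(words) in heads:
            hits[anchor] += 1
    return {anchor: (hits[anchor], totals[anchor]) for anchor in totals}


# Toy inputs (hypothetical, for illustration only).
toy_phrases = [
    "water lily", "Easter lily", "tiger lily",
    "hot sauce", "pasta sauce", "front door",
]
toy_dictionary = {"water lily", "easter lily", "tiger lily", "hot sauce"}

by_anchor = coverage_by_last_word(toy_phrases, toy_dictionary)
for anchor, (n, m) in sorted(by_anchor.items()):
    print(f"{anchor}: ({n} of {m})")
```

Swapping `words[-1]` for `words[0]` would give the same tally for starting words.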

This analysis wasn’t possible before LLMs

Print dictionaries had space limits. Corpus linguistics could find frequent n-grams, but frequency can’t tell you what’s a concept. “The table” is extremely frequent but names nothing; “loose leaf” is rare but names a thing. Sorting n-grams by frequency buries what matters.

I needed a different signal. I went fishing in Claude’s brain.

Generate: word lists (Wiktionary, Wikipedia, LOC subjects); 22M+ probes; 730K MWEs surfaced
Rank: each term appears in 30 random batches of 75; 1.4M comparisons; average position = 0–1 score
Classify: tag each MWE: transparent, semi-opaque, opaque; phrasal verb, technical, named entity; figurative, eponymous, taxonomic; 10 types
Compare: MW (89K) → 2.3%; Oxford (72K) → 2.0%; Wikt (1.4M) → 33%; cross-tab by familiarity × type

The seed lists—Wiktionary, Wikipedia, Library of Congress subjects—were chosen to cover topics people actually care about. The probes work because Claude only produces a phrase as a category member if it thinks of it as a unit. Ask for marquee terms and you get “now playing.” Ask for cooking hazards and you get “boiling water.”
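
The Rank step (each term placed in 30 random batches, with its average position becoming a 0–1 score) can be sketched roughly as below. Several things here are assumptions rather than the article's method: `rank_batch` is a hypothetical stand-in for whatever orders a batch from most to least familiar (in the article's pipeline, presumably a model call), a score of 0 is taken to mean most familiar, the toy familiarity numbers are invented, and the batch size is shrunk from 75 to 3 so the demo runs on six terms.

```python
import random
from collections import defaultdict


def average_position_scores(terms, rank_batch, rounds=30, batch_size=75, seed=0):
    """Score each term by its mean normalized position across random batches."""
    rng = random.Random(seed)
    positions = defaultdict(list)
    for _ in range(rounds):
        shuffled = list(terms)
        rng.shuffle(shuffled)
        # Partition the shuffle so every term lands in exactly one batch per round.
        for start in range(0, len(shuffled), batch_size):
            ordered = rank_batch(shuffled[start:start + batch_size])
            denom = max(len(ordered) - 1, 1)
            for index, term in enumerate(ordered):
                positions[term].append(index / denom)
    return {term: sum(p) / len(p) for term, p in positions.items()}


# Invented familiarity values, used only to fake the batch judge in this demo.
TOY_FAMILIARITY = {
    "boiling water": 6, "paper towel": 5, "hot dog": 4,
    "loose leaf": 3, "wet nurse": 2, "zoot suit": 1,
}


def toy_rank_batch(batch):
    """Hypothetical judge: order a batch from most to least familiar."""
    return sorted(batch, key=lambda term: -TOY_FAMILIARITY[term])


scores = average_position_scores(
    list(TOY_FAMILIARITY), toy_rank_batch, rounds=30, batch_size=3, seed=1
)
for term, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{score:.2f}  {term}")
```

Averaging over many random batches means no single comparison decides a term's score, which is one plausible reason to rank this way instead of asking for one global ordering.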

Beyond the dictionary

When an English speaker says “Saturday night,” they’re not computing “the night portion of the day called Saturday.” They’re invoking a concept—end of the work week, social time, a feeling. The phrase is the word for that concept.

For word games, this is a reorientation. There’s a vast space of MWEs that could be playable—sitting alongside single words. Letter counts still matter, so long MWEs don’t have much role in gameplay (unless it’s Wheel of Fortune)—but “paper towel” fits anywhere “discombobulate” does. The constraint was never length. It was our dictionary-shaped sense of what counts.

It made me think about words differently. Not where the next space shows up—but whether something names a thing.