Words with Spaces

Companion articles

The Small World of English — how any two words connect in 6–7 hops.

Linguabase vs. Oxford — why traditional dictionaries skip these phrases.

English has hundreds of thousands of compound phrases that name things—not just describe them. Take “boiling water.” Is it just water that happens to be boiling? Or is it a loaded term—a hazard, a cooking stage, a concept that other languages gave its own word? Traditional dictionaries skip almost all such phrases, because they contain spaces. Merriam-Webster and Oxford cover about 3%.

I got interested in this because I make word games—and wanted to understand which phrases carry enough conceptual weight to count as vocabulary, and why dictionaries trained us to think they don’t.

Here’s a slider. Look at expressions at different familiarity levels. Gold words are missing from traditional dictionaries:

Slider: Familiar to Obscure (excluding terms longer than 11 characters). Key: totally missing; in Wiktionary; in traditional dictionaries (MW, Oxford, or both).

What I’m interested in are the phrases that carry more weight than their parts—the ones that are loaded. The obscure end of this slider is definitely noise—LLM artifacts, jargon fragments, Wiktionary debris. “I love you” isn’t opaque, but it’s tight enough to put on a tile. Where you draw the line is up to you.

Crowd-sourced Wiktionary has 16 times more headwords than Merriam-Webster’s already hefty tabletop book. Yet even Wiktionary leaves gaps.

Focused on singletons

When dictionaries were planned and created, lexicographers focused on the building blocks of language—and overwhelmingly preferred individual words. Even the technical term is clinical: “multi-word expressions” (MWEs)—as if they’re a deviation from the norm. Linguists also call many of these collocations—words that habitually travel together—and collocation dictionaries do exist. But their focus is on co-occurrence, not on whether a phrase is loaded—whether it names something, or just describes it.

Coverage drops as you go deeper

Merriam-Webster (green) covers just 18% of the top 10,000 MWEs, dropping to 7% by 100K. Adding Wiktionary (yellow) brings coverage to 75%, but even that drops to 48%.
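A minimal sketch of how coverage at different depths could be computed, assuming the MWEs are already sorted from most to least familiar and each dictionary is available as a set of headwords. The function name, the toy phrases, and the toy dictionary contents are all illustrative; they are not the data behind the chart.

```python
def coverage_at_cutoffs(ranked_phrases, dictionary, cutoffs):
    """Share of the top-N phrases found in `dictionary`, for each N in cutoffs.

    ranked_phrases must already be ordered from most to least familiar.
    Cutoffs larger than the list are skipped rather than extrapolated.
    """
    results = {}
    pending = sorted(c for c in set(cutoffs) if c > 0)
    hits = 0
    for rank, phrase in enumerate(ranked_phrases, start=1):
        if phrase in dictionary:
            hits += 1
        if pending and rank == pending[0]:
            results[pending.pop(0)] = hits / rank
    return results


# Toy data (invented for illustration, not real dictionary contents).
ranked = ["hot dog", "high school", "boiling water", "paper towel"]
mw = {"hot dog"}
wikt = {"hot dog", "high school", "boiling water"}

print(coverage_at_cutoffs(ranked, mw, [2, 4]))
# "Adding Wiktionary" is modeled as a set union of the two headword lists.
print(coverage_at_cutoffs(ranked, mw | wikt, [2, 4]))
```

Running the same function at cutoffs such as 10,000 and 100,000 over a real ranked list would give the kind of declining curve described above.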

A few selected, non-obvious expressions (called “opaque compounds”) would be included if they seemed interesting enough. And dictionary pages were not wasted on self-evident combinations. But what was lost?

Obvious meaning—often missing from dictionaries: hospital bills, smooth skin, angry person, exit door
Unpredictable meaning—usually in dictionaries: fat chance, melting pot, guilt trip, cold feet

Buried in the billions

250 billion grammatically valid 2-word pairs
85% are nonsense: tennis democracy, cute doornail
15% are plausible (~30B): cold water, dry shoe
~775K crystallized into something more
One in 40,000 plausible pairs carries special conceptual weight

Language is a vast combinatorial space. English has roughly 325K nouns, 60K verbs, and 85K adjectives. Multiply across grammatical patterns and you get about 250 billion possible two-word combinations.

Most are nonsense—“purple Wednesday,” “angry furniture.” But roughly 15% are plausible: “wooden chair,” “morning coffee.” That’s still 30 billion sensible pairs.

What interests me is that within those billions, some combinations have crystallized into something more—expressions that carry conceptual weight beyond their parts. “Hot dog” isn’t just a warm canine. “Red tape” isn’t about colored adhesive.

I asked Claude to brainstorm candidates (explained below). It returned 774K MWEs. These are some of the types, and how often they appeared in Merriam-Webster (19,641 of 89,128 headwords), Oxford American (15,488 of 72,495 headwords), and Wiktionary (232,454 of 1.44 million English entries). In each graph below, the left side is more familiar, the right more obscure:

transparent (213K): boiling water, paper towel, cold weather, front door
technical (150K): blood pressure, machine learning, black hole, climate change
semi-opaque (130K): high school, living room, best friend, fast food
named entity (133K): New York, World War, Coca Cola, Michael Jackson
opaque (23K): hot dog, piece of cake, red tape, couch potato
phrasal verb (12K): kick out, give up, look after, break down
institutional: null and void, terms and conditions, cease and desist, rules and regulations
binomial (1.7K): black and white, back and forth, hide and seek, yes or no
light verb (2.3K): take care, have fun, get married, tell the truth
other (1.5K): I am, as well as, each other, such as

Key: totally missing; in Wiktionary; in traditional dictionaries (MW, Oxford, or both)

Print dictionaries barely cover these expressions: Merriam-Webster has just 2.4%, Oxford has 2.1%. Even combined, they cover only 3.2%. Wiktionary does better at 30%.

Estimating the 250B → 30B funnel: The funnel numbers above are illustrative; no absolute counts exist. From 500K single-word terms, we sampled and classified parts of speech, including inflected forms and rare terms: ~325K nouns, 60K verbs, 85K adjectives (roughly twice WordNet’s counts). We multiplied across grammatical patterns. To estimate how many pairs “make sense,” we randomly paired words and judged the results: noun+noun 12% sensible, adjective+noun 64%, verb+noun 40%. The 30 billion is a weighted estimate.
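The weighting step can be sketched as follows. The word counts and the "sensible" rates are the ones quoted above; restricting the calculation to these three patterns is an assumption made for illustration. They account for roughly 153 billion of the 250 billion pairs, and the remaining patterns and their rates are not given in the text, so this sketch does not reproduce the 30 billion figure.

```python
# Word counts and "sensible" rates are taken from the text above.
# Using only these three patterns is an assumption of this sketch.
NOUNS, VERBS, ADJECTIVES = 325_000, 60_000, 85_000

# pattern -> (first-slot count, second-slot count, percent judged sensible)
PATTERNS = {
    "noun+noun": (NOUNS, NOUNS, 12),
    "adjective+noun": (ADJECTIVES, NOUNS, 64),
    "verb+noun": (VERBS, NOUNS, 40),
}


def funnel(patterns):
    """Return (total pairs, plausible pairs) summed over all patterns."""
    total = plausible = 0
    for first, second, percent in patterns.values():
        pairs = first * second
        total += pairs
        # Integer arithmetic keeps the counts exact.
        plausible += pairs * percent // 100
    return total, plausible


total_pairs, plausible_pairs = funnel(PATTERNS)
print(f"{total_pairs:,} pairs, of which {plausible_pairs:,} plausible")
```

The point of the weighting is that the overall plausible share depends on how large each pattern is, not on a simple average of the three rates.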

The opaque compounds and phrasal verbs get the best coverage in Wiktionary. The named entities (proper nouns) are mostly encyclopedic topics—so they’re in Wikipedia. The technical terms drift into jargon as they get more obscure, but jargon isn’t useful for word games because too few players know it.

But see all the gold at the top of those little graphs for “transparent” and “semi-opaque”? That’s a deep reservoir of real MWEs—and we’ve been conditioned by dictionaries not to expect it. The old guard dictionaries, colored above in dark blue and dark green, overwhelmingly ignore them.

Here are examples of phrases that are cousins of words that did get lexicalized.

Got a word	Didn’t
frozen water → ice	boiling water
eyes closing briefly → blink	closed eyes
dull persistent pain → ache	severe pain

But even among the transparent, some are more loaded than others. “Yellow pear” is just describing. “Boiling water” names a thing (linguists call this the naming function): a hazard, a cooking stage, a concept that Slavic languages and Japanese have each given a single word (Russian kipyatok, Polish wrzątek, Japanese nettō). The difference? One could have been a word.

Should English have created a word for severe pain, other than the more emotionally tinged “agony” or “anguish”? I don’t know. But “severe pain” is tight enough to be a useful token in a word game. It names a thing.

Named entities

This analysis is about timeless words, not trivia. So I excluded 133K named entities—proper nouns like “New York City,” “Albert Einstein,” and “World War II.” Named entities are well-covered by Wikipedia, as are 47% of “technical” MWEs (“Parkinson’s disease”).

Scrabble famously excludes proper nouns. Crossword puzzles embrace them. Choosing how many proper nouns to include in a word game is tangled with cultural assumptions. Easy trivia is pointless, and hard trivia is alienating. Should all U.S. presidents be fair game? What about K-pop stars? British monarchs? Nobel laureates? The “right” answer depends on your audience, and that’s a design question, not a linguistic one.

What Kinds of Phrases?

The opacity distinction isn’t the only interesting one. MWEs vary in origin and character too—some are technical, some metaphorical, some named after people, some freshly coined. Here’s how they break down:

domain-specific: blood pressure, type 2 diabetes, zero gravity, habeas corpus, chapter 11
figurative: alarm bells, melting pot, young at heart, zero hour, cloud 9
eponymous: Achilles’ tendon, tommy gun, venetian blind, wellington boots
taxonomic: Tasmanian devil, Canada goose, tyrannosaurus rex, turkey vulture
neologism: fidget spinner, smart home, gig economy, viral video, zero waste
dated: wet nurse, video cassette, zoot suit, typewriter ribbon

The words that aren’t words

The line between “word” and “phrase” is fuzzier than dictionaries suggest. Words tend to carry more conceptual weight—but not always. Some words probably didn’t need to be coined; some phrases deserve more credit than they get. English is full of two- and three-word phrases that function as single semantic units—effectively “words”—but because they’re not fused into one orthographic word, they’re invisible to dictionaries and underappreciated as vocabulary. “Kindergarteners” is in the dictionary, but not “middle schoolers.” Newer professions—“car salesman,” “vacuum salesman,” “balloon seller”—are two words, but ancient professions might have a one-worder: jeweler, florist, fishmonger, tobacconist, stationer, mercer.

German and Norwegian don’t “remove the space”—their compounds never had one. Champignoncremesuppe and Gesundheitszeugnis are single grammatical structures from the start, and adding spaces would change the meaning or break the grammar. Norwegian takes this so seriously it has a word for the error: særskrivingsfeil (separate-writing error). The space matters: røykfritt means “smokefree,” but røyk fritt means “smoke freely.” En norsklærer is a teacher of Norwegian; en norsk lærer is a Norwegian teacher. Not all compounds make it into the dictionary—there’s a near-unbounded set of possible ones, just as in English. But the ones that are loaded enough do. Norwegian even puts two-word compounds in the dictionary when they earn it—svart hull (black hole) has a space but gets an entry, because the meaning isn’t obvious from the parts. English has the opposite problem: the phrase-to-compound pipeline (ice cream → icecream? not yet) is slow, arbitrary, and not necessarily tied to meaning.

Meanwhile, in Spanish…

Other languages sometimes have tidy one-worders for things English can only describe. Spanish carves up time with precision English lacks: madrugada for the pre-dawn hours, atardecer for late afternoon waning into evening. The mid-day nap was so compelling we adopted the siesta into English.

But concise one-word equivalents in other languages appear only among the 2,000 or so most familiar English MWEs, and even then rarely. It’s as if languages collectively decided these concepts weren’t worth crystallizing into single words.

Exceptions exist—Norwegian forelsket for the euphoria of falling in love, Danish hygge for candlelit contentment, Portuguese saudade for longing. But these are famous precisely because they’re rare. Beyond the top tier of familiarity, the well runs dry everywhere. No language has a single word for “parking lot frustration” or “Sunday evening dread.” Generally, if English describes it with a phrase, so does everyone else.

The lexicographer’s mind

Corpus tool

Sketch Engine — a workhorse of modern lexicography. Its “Word Sketch” produces a collocation profile for any word: which adjectives, verbs, and nouns habitually travel with it, ranked by statistical salience.

Why did dictionary-makers include some phrases and not others?

Lexicographers used a substitutability test: if you can swap synonyms freely, it’s not a lexical unit. “Cold feet” (meaning fear) can’t become “frigid feet”—so it gets an entry.

Looking at extremes (the words most often included vs. excluded at the start or end of a phrase) reveals patterns. Compare two groups that made these decisions. Traditional lexicographers (Merriam-Webster, Oxford) worked under page limits, favoring established scientific and technical vocabulary. Wiktionary volunteers, unconstrained by space, focused on phrasal verbs and everyday idioms. Tap the toggles to explore the lexicographers’ decisions.

(N of M) = N phrases in dictionary out of M total phrases with that anchor word

Curiously, Wiktionary also achieved 100% coverage on some less common ending words. Every lily we tested (water lilies, Easter lilies, tiger lilies). Every sauce (dipping sauces, hot sauces, pasta sauces). Every pudding (Yorkshire puddings, rice puddings, Christmas puddings). Every acid (stomach acid, fatty acid, sulfuric acid).
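A minimal sketch of the (N of M) tally by ending word, assuming the MWEs are plain strings and the dictionary is a set of lowercase headwords. The function name and the toy data are hypothetical; the toy "dictionary" is invented for illustration and does not reflect real Wiktionary contents.

```python
from collections import Counter


def coverage_by_last_word(phrases, dictionary):
    """Map each ending word to (N found in dictionary, M total phrases)."""
    total = Counter()
    found = Counter()
    for phrase in phrases:
        words = phrase.lower().split()
        if not words:
            continue  # skip empty strings
        anchor = words[-1]
        total[anchor] += 1
        if " ".join(words) in dictionary:
            found[anchor] += 1
    return {anchor: (found[anchor], total[anchor]) for anchor in total}


# Toy data, invented for illustration only.
phrases = ["water lily", "tiger lily", "hot sauce", "pasta sauce",
           "front door", "exit door"]
toy_dictionary = {"water lily", "tiger lily", "hot sauce", "pasta sauce",
                  "front door"}

for anchor, (n, m) in coverage_by_last_word(phrases, toy_dictionary).items():
    print(f"{anchor}: {n} of {m}")
```

Swapping `words[-1]` for `words[0]` gives the same tally for starting words.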

This analysis wasn’t practical before LLMs

Print dictionaries had space limits. Corpus linguists built collocation dictionaries from frequency data—but frequency alone can’t tell you what’s a concept. “The table” is extremely frequent but names nothing; “loose leaf” is rare but names a thing. Sorting n-grams by frequency buries what matters.

I needed a different signal. I went fishing in Claude’s brain.

Generate: word lists (Wiktionary, Wikipedia, LOC subjects) → 22M+ probes → 774K MWEs surfaced
Rank: each term appears in 30 random batches of 75 → 1.4M comparisons → average position = 0–1 score
Classify: tag each MWE (transparent, semi-opaque, opaque; phrasal verb, technical, named entity; figurative, eponymous, taxonomic) → 10 types
Compare: MW (89K): 2.3%; Oxford (72K): 2.0%; Wikt (1.4M): 33% → cross-tab by familiarity × type

The seed lists—Wiktionary, Wikipedia, Library of Congress subjects—were chosen to cover topics people actually care about. The probes work because Claude only produces a phrase as a category member if it thinks of it as a unit. Ask for marquee terms and you get “now playing.” Ask for cooking hazards and you get “boiling water.”
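The Rank step (each term placed in 30 random batches of 75, then scored by its average position) might look roughly like the sketch below. Several things here are assumptions: `rank_batch` is a hypothetical callable standing in for whatever orders a batch from most to least familiar (the article uses an LLM for that), the batches are formed by reshuffling and slicing the whole list on each pass, and 0.0 is taken to mean "always ranked first."

```python
import random
from collections import defaultdict


def average_position_scores(terms, rank_batch, passes=30, batch_size=75,
                            seed=0):
    """Score each term by its mean normalized position across ranked batches.

    rank_batch (hypothetical) takes a list of terms and returns the same
    terms ordered from most to least familiar. Scores run from 0.0 (always
    first in its batch) to 1.0 (always last).
    """
    rng = random.Random(seed)
    positions = defaultdict(list)
    for _ in range(passes):
        # Shuffle and slice, so every term lands in exactly one batch per pass.
        shuffled = list(terms)
        rng.shuffle(shuffled)
        for start in range(0, len(shuffled), batch_size):
            ordered = rank_batch(shuffled[start:start + batch_size])
            last = max(len(ordered) - 1, 1)  # avoid dividing by zero
            for index, term in enumerate(ordered):
                positions[term].append(index / last)
    return {term: sum(p) / len(p) for term, p in positions.items()}


# Toy stand-in for the ranker: alphabetical order instead of familiarity.
toy_terms = [f"term{i:03d}" for i in range(150)]
scores = average_position_scores(toy_terms, rank_batch=sorted)
print(scores["term000"], scores["term149"])
```

Averaging over many random batches means a term is compared against a different mix of neighbors each time, so its score reflects where it sits in the whole list rather than in any one batch.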

Beyond the dictionary

When an English speaker says “Saturday night,” they’re not computing “the night portion of the day called Saturday.” They’re invoking a concept—end of the work week, social time, a feeling. The phrase is the word for that concept.

For word games, this is a reorientation. There’s a vast space of MWEs that could be playable—sitting alongside single words. Letter counts still matter, so long MWEs don’t have much role in gameplay (unless it’s Wheel of Fortune)—but “paper towel” fits anywhere “discombobulate” does. The constraint was never length. It was our dictionary-shaped sense of what counts.

It made me think about words differently. Not where the next space shows up—but whether something is loaded.