You define your game mechanics. We generate data that works for those mechanics.
You: Define Your Game (mechanics, rules, content needs)
↓
We: Build Your Bundle (compact, documented, ready to ship)
↓
Ship: All Your Levels (hundreds or thousands)
Every word game is a data problem wearing a different costume.
Scrabble needs to answer one question: “Is this a valid word?” A flat list of terms suffices. Wordle clones need a shorter list of familiar five-letter words—and someone has to decide whether SLAVE or ABORT might trend on social media for the wrong reasons. NYT Connections needs four categories with carefully calibrated decoys—words that look like they belong together but don’t. Semantic pathfinding games need pre-computed routes through meaning-space, because you can’t brute-force hundreds of millions of permutations in real time.
Below are four examples of puzzle formats we’ve built from our licensable data—but we craft puzzle data for YOUR game mechanics, not just these.
Download Sample Puzzles
25 complete puzzles—10 categories, 10 words each, with difficulty scores per word. Raw algorithmic output, no human editing.
For letter-tile games, anagram challenges, and Wordle-style spelling games. We provide a set of letter tiles, target words with optional semantic clues, and a complete list of valid anagrams and other letter combinations—everything a player might legitimately form from the available letters.
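As a minimal sketch of what "everything a player might legitimately form from the available letters" can mean, the check below tests each word in a lexicon against a rack of tiles. The function names, the minimum word length, and the tiny lexicon are assumptions for illustration, not the delivered data format.

```python
from collections import Counter

def can_form(word: str, tiles: str) -> bool:
    """True if `word` can be spelled using each tile at most once."""
    need, have = Counter(word), Counter(tiles)
    return all(have[ch] >= n for ch, n in need.items())

def valid_words(tiles: str, lexicon: list[str], min_len: int = 3) -> list[str]:
    """All lexicon words of at least `min_len` letters formable from `tiles`,
    longest first, then alphabetical."""
    found = [w for w in lexicon if len(w) >= min_len and can_form(w, tiles)]
    return sorted(found, key=lambda w: (-len(w), w))

# Hypothetical toy lexicon; a real bundle would draw on a full, vetted word list.
LEXICON = ["stale", "least", "steal", "tales", "slate",
           "seat", "tea", "late", "lasts", "at"]

print(valid_words("stale", LEXICON))
```

Here "lasts" is excluded because the rack holds only one S, and "at" falls below the assumed three-letter minimum.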
For semantic navigation games where players traverse meaning-space—connecting concepts through chains of associations. We provide validated origin-target pairs solvable in exactly 3 moves (4 hops through meaning-space), with multiple solution paths, precalculated hints, and backward “convergence” clues that tell players when they’re getting close.
Puzzle Generation
ORIGIN POOL: 819 curated origin words
→ BRUTE-FORCE ENUMERATION: enumerate ALL 4-hop paths to every reachable candidate target
→ PATH COUNT FILTER: keep 16–43 paths (3–4 advancing choices per hop)
→ CHOKE POINT REJECTION: reject if one hop-1 word dominates >60% of routes
→ SHORTCUT FILTER: reject if 2-hop or 3-hop paths exist (too close semantically)
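The filters above can be sketched roughly as follows. This assumes the association network is a plain adjacency dictionary and that paths never revisit a word. The function names and the toy graph are hypothetical, and the shortcut check here rejects any target reachable in fewer than 4 hops, which is slightly broader than the 2-hop and 3-hop rule stated above.

```python
from collections import Counter, defaultdict, deque

def paths_of_length(graph, origin, hops=4):
    """Enumerate every simple path (no repeated word) of exactly `hops` edges."""
    paths, stack = [], [(origin,)]
    while stack:
        path = stack.pop()
        if len(path) == hops + 1:
            paths.append(path)
            continue
        for nxt in graph.get(path[-1], []):
            if nxt not in path:
                stack.append(path + (nxt,))
    return paths

def shortest_hops(graph, origin, target):
    """Breadth-first search distance from origin to target, or None."""
    seen, queue = {origin}, deque([(origin, 0)])
    while queue:
        word, dist = queue.popleft()
        if word == target:
            return dist
        for nxt in graph.get(word, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def accept_targets(graph, origin, min_paths=16, max_paths=43,
                   max_share=0.60, hops=4):
    by_target = defaultdict(list)
    for p in paths_of_length(graph, origin, hops):
        by_target[p[-1]].append(p)
    accepted = {}
    for target, paths in by_target.items():
        if not (min_paths <= len(paths) <= max_paths):
            continue                      # path count filter
        first_hops = Counter(p[1] for p in paths)
        if max(first_hops.values()) / len(paths) > max_share:
            continue                      # choke point rejection
        if shortest_hops(graph, origin, target) < hops:
            continue                      # shortcut filter
        accepted[target] = paths
    return accepted

# Hypothetical layered toy graph: 3 x 3 x 2 = 18 four-hop paths from "o" to "t".
layers = [["o"], ["a1", "a2", "a3"], ["b1", "b2", "b3"], ["c1", "c2"], ["t"]]
toy = {w: list(nxt) for cur, nxt in zip(layers, layers[1:]) for w in cur}

accepted = accept_targets(toy, "o")
print({t: len(p) for t, p in accepted.items()})  # {'t': 18}
```

Adding a direct edge from a1 to t in the toy graph creates a 2-hop route, and the same target is then rejected by the shortcut filter.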
Hint Generation
VERIFIED PATHS: only paths that actually reach target
→ FORWARD EXTRACTION: for each hop: collect words from all solutions
→ FREQUENCY RANKING: hints appearing in more paths ranked higher (more downstream options)
→ BACKWARD TRAVERSAL: from TARGET back: find all words 1–3 hops away
→ CONVERGENCE WORDS: “am I close?” clues shared across puzzles with same target
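A rough sketch of the hint steps above, assuming the verified paths are already available as tuples of words. It uses the three sugar → peace sample paths plus one clearly hypothetical extra path so that the frequency ranking has something to rank. The function names are illustrative, and the backward walk here only follows edges that appear in the listed paths, whereas a real traversal would presumably use the full association network.

```python
from collections import Counter, defaultdict, deque

PATHS = [
    ("sugar", "sweet", "pleasant", "harmony", "peace"),
    ("sugar", "cane", "plant", "olive", "peace"),
    ("sugar", "dissolve", "solution", "resolution", "peace"),
    ("sugar", "sweet", "kind", "gentle", "peace"),  # hypothetical extra path
]

def ranked_hints(paths):
    """For each intermediate hop, rank words by how many verified paths use them."""
    hops = len(paths[0]) - 1
    ranked = []
    for i in range(1, hops):
        counts = Counter(p[i] for p in paths)
        ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        ranked.append([word for word, _ in ordered])
    return ranked

def convergence_words(paths, target, max_hops=3):
    """Walk edges backward from the target; map each word to its hop distance."""
    reverse = defaultdict(set)
    for p in paths:
        for a, b in zip(p, p[1:]):
            reverse[b].add(a)
    dist, queue = {target: 0}, deque([target])
    while queue:
        word = queue.popleft()
        if dist[word] == max_hops:
            continue  # do not expand beyond the allowed radius
        for prev in reverse[word]:
            if prev not in dist:
                dist[prev] = dist[word] + 1
                queue.append(prev)
    del dist[target]
    return dist

print(ranked_hints(PATHS)[0])  # ['sweet', 'cane', 'dissolve']
conv = convergence_words(PATHS, "peace")
print(sorted(w for w, d in conv.items() if d == 1))
```

Because every hint is read off a verified path, no hint can lead to a dead end, and the origin word never appears among the convergence words since it sits 4 hops from the target.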
Why hints must be precalculated: Finding valid hints requires enumerating all paths, not just one, to guarantee that every hint actually reaches the target. With 15 words visible at each hop, that’s 50,000+ candidate paths per puzzle (15⁴ = 50,625 across 4 hops). Ranking hints by how many paths use them (more paths = more downstream options) requires global knowledge of the full path set. And compressing hint data for mobile delivery (74% size reduction via ID encoding) requires knowing all hints upfront. Runtime computation would add seconds of latency; precalculation makes hints instant and guaranteed correct.
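To make the numbers concrete, the snippet below works out the candidate-path count and shows one plausible reading of "ID encoding": each hint word is stored once in a shared vocabulary table and referenced by integer index. The encoding scheme, the function names, and the second puzzle's hints are assumptions for illustration. The toy data is far too small to reproduce the 74% figure, since savings of that kind depend on words repeating across many puzzles.

```python
VISIBLE_PER_HOP = 15
HOPS = 4
CANDIDATE_PATHS = VISIBLE_PER_HOP ** HOPS
print(CANDIDATE_PATHS)  # 50625

def encode_ids(hints_by_puzzle):
    """Replace hint strings with indices into one shared, sorted vocabulary."""
    vocab = sorted({w for hints in hints_by_puzzle.values() for w in hints})
    index = {w: i for i, w in enumerate(vocab)}
    encoded = {pid: [index[w] for w in hints]
               for pid, hints in hints_by_puzzle.items()}
    return vocab, encoded

def decode_ids(vocab, encoded):
    """Inverse of encode_ids: turn indices back into words."""
    return {pid: [vocab[i] for i in ids] for pid, ids in encoded.items()}

HINTS = {
    "sugar>peace": ["sweet", "cane", "dissolve"],
    "honey>peace": ["sweet", "calm"],  # hypothetical second puzzle
}

vocab, encoded = encode_ids(HINTS)
print(vocab)
print(encoded)
```

The vocabulary can only be built once every hint is known, which is the dependency on upfront knowledge that the paragraph describes.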
MEDIUM
Sample Data Level 89 of 779
sugar → peace
Path A: sugar → sweet → pleasant → harmony → peace
Path B: sugar → cane → plant → olive → peace
Path C: sugar → dissolve → solution → resolution → peace
Hints: sweet, cane, dissolve (every hint verified to reach target; no dead ends)
Converge: harmony, calm, treaty (1 hop from target)
Themed Categories
For NYT Connections-style games with intentional decoys—words that tempt players into wrong groupings, where ambiguity is the challenge. Or for other category games that need mutually exclusive pools where no word could plausibly belong to two groups.
ANCHOR WORD: word with rich semantic profile
→ ASSOCIATION PULL: select 8 strongest associations (≤7 letters each)
→ THEMATIC ISOLATION CHECK: does this anchor’s FULL neighborhood overlap any existing output words? If so, reject
→ REPEAT ×8: all 8 groups must have zero semantic bleed between them
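The isolation check can be sketched as a greedy loop, assuming each candidate anchor arrives with a scored list of associations. The function name, the strength values, and the candidate anchors other than the sample themes are hypothetical. The example uses groups of 3 rather than 8 only to keep it short.

```python
def build_groups(candidates, group_size=8, max_letters=7, groups_needed=8):
    """candidates: list of (anchor, [(word, strength), ...]) in preference order.
    An anchor is rejected if ANY word in its full neighborhood is already
    part of an earlier group's output."""
    used = set()
    groups = {}
    for anchor, assoc in candidates:
        neighborhood = {w for w, _ in assoc} | {anchor}
        if neighborhood & used:
            continue  # semantic bleed into an existing group
        short = [(w, s) for w, s in assoc if len(w) <= max_letters]
        top = [w for w, _ in sorted(short, key=lambda ws: -ws[1])[:group_size]]
        if len(top) < group_size:
            continue  # not enough usable associations
        groups[anchor] = top
        used.update(top)
        if len(groups) == groups_needed:
            break
    return groups

# Hypothetical strengths; "prison" is an invented anchor that collides on "escape".
CANDIDATES = [
    ("keyboard", [("spacebar", 0.95), ("shift", 0.9), ("escape", 0.8),
                  ("control", 0.7), ("tab", 0.6)]),
    ("prison", [("cell", 0.9), ("escape", 0.85), ("guard", 0.8), ("warden", 0.7)]),
    ("climate", [("monsoon", 0.9), ("carbon", 0.8), ("ozone", 0.7), ("polar", 0.6)]),
]

groups = build_groups(CANDIDATES, group_size=3, groups_needed=2)
print(groups)
```

In this run "spacebar" is dropped for exceeding 7 letters, and "prison" is rejected because its neighborhood contains "escape", which the keyboard group already uses.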
HARD
Sample Data Level 234 of 1,500
Theme 1 (Climate): monsoon, carbon, ozone, warming, ecology, polar
Theme 2 (Pressed): coerced, ironed, cider, crushed, mashed, creased
Theme 3 (Keyboard): shift, return, escape, control, space, tab
Clue-Based Puzzles
For crossword-style games and semantic guessing games—short clues designed for gameplay, plus related words that don’t give away the answer morphologically. Players identify a target word from semantic clues, but there might be decoys that match some (not all!) of the clues.
reject same-stem, same-root relatives (SEAL clues can’t include SEALS, SEALING)
→ 4 CLUES: one strong association per sense
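A crude sketch of the morphological filter, assuming simple suffix stripping stands in for real stemming. The suffix list and function names are illustrative assumptions, and a production filter would need a proper morphological analysis. The prefix test is deliberately conservative and will over-reject for short answers.

```python
SUFFIXES = ("ing", "ed", "es", "s")

def crude_stem(word):
    """Strip one common inflectional suffix, keeping at least 3 letters."""
    w = word.lower()
    for suf in SUFFIXES:
        if w.endswith(suf) and len(w) - len(suf) >= 3:
            return w[: -len(suf)]
    return w

def filter_clues(answer, candidates):
    """Drop candidates that share the answer's stem or begin with it."""
    stem = crude_stem(answer)
    return [c for c in candidates
            if crude_stem(c) != stem and not c.lower().startswith(stem)]

print(filter_clues("seal", ["seals", "sealing", "walrus", "stamp", "Sealed"]))
# ['walrus', 'stamp']
```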
Decoy Selection
4 CLUES: given these 4 clues
→ CANDIDATE SCAN: score each word against all 4 clues
→ ZERO DETECTION: which clues have ZERO connection? (no network link)
→ MATCH PATTERN: record which clues each decoy matches: “0,1” or “2,3” etc.
→ SEPARATION SCORING: answer_total − best_decoy must exceed threshold
Why decoys work: Each decoy must match 2–3 of the 4 clues—enough to seem plausible—but have at least one “zero” (a clue it cannot possibly connect to). Players eliminate decoys by finding the impossible connection. The answer wins by having no zeros: it connects to all 4 clues through different senses of the word.
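The decoy logic can be sketched with the army sample, assuming each word has a link strength per clue and that a missing link counts as a zero. All strength values, the threshold, the extra candidate "picnic", and the function names are hypothetical.

```python
CLUES = ["platoon", "reconnaissance", "ambush", "barracks"]

# Hypothetical link strengths (0..1); a missing pair means no network link.
LINKS = {
    "army": {"platoon": 0.9, "reconnaissance": 0.7, "ambush": 0.6, "barracks": 0.9},
    "battalion": {"platoon": 0.8, "barracks": 0.7},
    "tactics": {"reconnaissance": 0.6, "ambush": 0.7},
    "squadron": {"platoon": 0.5, "reconnaissance": 0.6, "barracks": 0.4},
    "picnic": {"ambush": 0.1},
}

def profile(word):
    """Score a word against every clue and record its match pattern."""
    scores = [LINKS.get(word, {}).get(c, 0.0) for c in CLUES]
    matched = [i for i, s in enumerate(scores) if s > 0]
    return {
        "total": round(sum(scores), 2),
        "pattern": ",".join(map(str, matched)),  # e.g. "0,3"
        "zeros": len(CLUES) - len(matched),
    }

def pick_decoys(answer, candidates, threshold=1.0):
    ans = profile(answer)
    if ans["zeros"]:
        raise ValueError("answer must connect to every clue")
    decoys = {}
    for w in candidates:
        p = profile(w)
        matches = len(CLUES) - p["zeros"]
        if 2 <= matches <= 3:  # plausible, yet at least one impossible clue
            decoys[w] = p
    best = max((p["total"] for p in decoys.values()), default=0.0)
    if ans["total"] - best <= threshold:
        raise ValueError("answer does not separate from the best decoy")
    return decoys

decoys = pick_decoys("army", ["battalion", "tactics", "squadron", "picnic"])
for word, p in decoys.items():
    print(word, p["pattern"], p["total"])
```

With these invented scores, "picnic" matches only one clue and is too implausible to serve as a decoy, while the answer's total of 3.1 clears the best decoy's 1.5 by more than the assumed threshold.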
HARD
Sample Data Level 512 of 3,000
Answer: army
Clues: platoon, reconnaissance, ambush, barracks
Decoys: battalion, tactics, squadron
What You Get
We deliver production-ready puzzle data for all your levels—hundreds or thousands, depending on your game’s needs. The data is generated exclusively for your game, unique to your studio. Every puzzle draws from Linguabase’s complete vocabulary stack: vocabulary rankings, semantic associations, sense clouds, and more.
Compact data payload—minified JSON optimized for your game client, modular so you can update or swap level packs without redeploying
Human-readable version—uncompressed, commented format you can inspect to understand the data structure and verify puzzle quality
Clear documentation—schema reference, field explanations, integration examples
Embedded difficulty scores—each puzzle tagged with difficulty metrics so your game can dynamically adjust challenge to individual players
Content filtering—pre-vetted for your audience (family-friendly, teen, adult)
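As an illustration of how embedded difficulty scores might drive dynamic adjustment, the sketch below parses a minified level pack and picks the unplayed level closest to a player's skill estimate. The JSON field names, the 0 to 1 difficulty scale, and the difficulty values are all assumptions; the actual schema would be defined per game.

```python
import json

# Hypothetical minified level pack; field names and scores are illustrative only.
PACK = ('{"pack":"demo","levels":['
        '{"id":89,"difficulty":0.45},'
        '{"id":234,"difficulty":0.8},'
        '{"id":512,"difficulty":0.85}]}')

def next_level(pack_json, player_skill, played=()):
    """Return the unplayed level whose difficulty is nearest the player's skill,
    or None if every level has been played."""
    levels = json.loads(pack_json)["levels"]
    fresh = [lv for lv in levels if lv["id"] not in played]
    return min(fresh, key=lambda lv: abs(lv["difficulty"] - player_skill),
               default=None)

print(next_level(PACK, 0.5))                  # nearest to a mid-skill player
print(next_level(PACK, 0.9, played={512}))    # skips a level already played
```

Keeping the selection logic in the client and the scores in the data means a level pack can be swapped without changing game code.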
The four formats above are examples. Your game has its own mechanics—we’ll design a data structure that fits, exclusive to you.
Get in Touch
We can generate test datasets for creative development and playtesting—small batches to validate that your mechanics work. When you’re ready, we produce data for thousands of production levels.
We’ve been generating word puzzles about meaning for over a decade. The edge cases are where all the work is.
Describe your game mechanics—how players interact, what constitutes a “level,” what data you need at runtime—and we’ll design a puzzle data format that plugs directly into your game.