You define your game mechanics. We generate data that works for those mechanics.
You: Define Your Game (mechanics, rules, content needs)
↓
We: Build Your Bundle (compact, documented, ready to ship)
↓
Ship: All Your Levels (hundreds or thousands)
Every word game is a data problem wearing a different costume.
Scrabble needs to answer one question: “Is this a valid word?” A flat list of terms suffices. Wordle clones need a shorter list of familiar five-letter words—and someone has to decide whether SLAVE or ABORT might trend on Twitter for the wrong reasons. NYT Connections needs four categories with carefully calibrated decoys—words that look like they belong together but don’t. Semantic pathfinding games need pre-computed routes through meaning-space, because you can’t brute-force hundreds of millions of permutations in real time.
Below are four examples of puzzle formats we’ve built from our licensable data—but we design data for YOUR game mechanics, not just these. Tell us how your game works, and we’ll generate the data structures it needs.
Anagram Puzzles
For letter-tile games, anagram challenges, and Wordle-style spelling games. We provide a set of letter tiles, target words with optional semantic clues, and a complete list of valid anagrams and other letter combinations—everything a player might legitimately form from the available letters. Build your spelling game with puzzle data from Linguabase.
Puzzle Generation
LETTER SET: 7 letters (configurable)
→ VOCABULARY SCAN: for each word, does it use only these letters?
→ SPELLING RULES: can a letter repeat consecutively? DEEP needs 2 E’s to alternate; LEVEL needs 2 L’s
→ TARGET SELECTION: from the valid pool, choose for length variety, edit distance ≥3, no shared stems, no containment
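A minimal Python sketch of the scan and target-selection steps above, under stated assumptions. The function names (`uses_only`, `edit_distance`, `pick_targets`), the tile set, and the tiny vocabulary are hypothetical. The spelling rule is configurable in the text, so the sketch assumes the simplest variant (each tile is used once unless `reuse_tiles` is set). Stem checking and length variety are left out because they need a morphology resource.

```python
from collections import Counter

def uses_only(word, tiles, reuse_tiles=False):
    """Vocabulary scan: can `word` be spelled from `tiles`?"""
    if reuse_tiles:
        # Assumed variant: any tile may be reused freely.
        return set(word) <= set(tiles)
    have = Counter(tiles)
    return all(have[ch] >= n for ch, n in Counter(word).items())

def edit_distance(a, b):
    """Classic Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def pick_targets(pool, min_distance=3):
    """Greedy target selection: keep a word only if it is far enough from,
    and not contained in (or containing), every target already chosen."""
    targets = []
    for word in sorted(pool, key=len, reverse=True):
        if all(edit_distance(word, t) >= min_distance
               and word not in t and t not in word
               for t in targets):
            targets.append(word)
    return targets

tiles = "deepvls"  # hypothetical 7-letter set containing two E tiles
vocabulary = ["deep", "level", "peel", "sled", "speed", "eel", "dove"]
valid = [w for w in vocabulary if uses_only(w, tiles)]
targets = pick_targets(valid)
```

With this tile set, LEVEL is excluded because only one L tile is available, while DEEP passes because there are two E tiles. A real pipeline would swap in the game's own spelling rule at the `uses_only` step.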
Semantic Pathfinding
For semantic navigation games where players traverse meaning-space, connecting concepts through chains of associations. We provide validated origin-target pairs solvable in exactly 3 moves (4 hops through meaning-space), with multiple solution paths, precalculated hints, and backward “convergence” clues that tell players when they’re getting close. Build your exploration game with puzzle data from Linguabase.
Puzzle Generation
ORIGIN POOL: 819 curated origin words
→ BRUTE-FORCE ENUMERATION: enumerate ALL 4-hop paths to every reachable candidate target
→ PATH COUNT FILTER: keep 16–43 paths (3–4 advancing choices per hop)
→ CHOKE POINT REJECTION: reject if one hop-1 word dominates >60% of routes
→ SHORTCUT FILTER: reject if 2-hop or 3-hop paths exist (too close semantically)
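The filters above can be sketched in Python on a toy association graph. This is an illustration under assumptions, not the production pipeline: the graph is built from the sample paths shown on this page plus one hypothetical branch (sweet → kind → gentle → calm), the helper names are invented, and paths are assumed not to revisit a word. The default thresholds follow the text (16–43 paths, 60% choke point); the toy call lowers `min_paths` because the graph is tiny.

```python
from collections import Counter, defaultdict

def paths_of_length(graph, origin, hops):
    """All paths with exactly `hops` edges that never revisit a word."""
    paths = [[origin]]
    for _ in range(hops):
        paths = [p + [n] for p in paths
                 for n in graph.get(p[-1], ()) if n not in p]
    return paths

def candidate_targets(graph, origin, min_paths=16, max_paths=43, max_share=0.6):
    """Return {target: paths} for targets that survive all three filters."""
    by_target = defaultdict(list)
    for path in paths_of_length(graph, origin, 4):
        by_target[path[-1]].append(path)

    # Shortcut filter: anything reachable in 2 or 3 hops is too close.
    too_close = {p[-1] for hops in (2, 3)
                 for p in paths_of_length(graph, origin, hops)}

    kept = {}
    for target, paths in by_target.items():
        if not min_paths <= len(paths) <= max_paths:
            continue  # path count filter
        top_share = max(Counter(p[1] for p in paths).values()) / len(paths)
        if top_share > max_share:
            continue  # choke point: one hop-1 word dominates the routes
        if target in too_close:
            continue
        kept[target] = paths
    return kept

graph = {
    "sugar": ["sweet", "cane", "dissolve"],
    "sweet": ["pleasant", "kind"],        # "kind" branch is hypothetical
    "pleasant": ["harmony"], "harmony": ["peace"],
    "kind": ["gentle"], "gentle": ["calm"],
    "cane": ["plant"], "plant": ["olive"], "olive": ["peace"],
    "dissolve": ["solution"], "solution": ["resolution"],
    "resolution": ["peace"],
}
puzzles = candidate_targets(graph, "sugar", min_paths=3)
```

Here "peace" survives with three routes spread evenly across three hop-1 words, while "calm" is dropped: it has a single route, so it fails the path count filter and would also fail the choke point test.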
Hint Generation
VERIFIED PATHS: only paths that actually reach target
→ FORWARD EXTRACTION: for each hop, collect words from all solutions
→ FREQUENCY RANKING: hints appearing in more paths ranked higher (more downstream options)
→ BACKWARD TRAVERSAL: from TARGET back, find all words 1–3 hops away
→ CONVERGENCE WORDS: “am I close?” clues shared across puzzles with same target
Why hints must be precalculated: Finding valid hints requires enumerating all paths, not just one, to guarantee that every hint actually reaches the target. With 15 words visible at each hop and 4 hops per puzzle, that’s 15^4 = 50,625 candidate paths per puzzle. Ranking hints by how many paths use them (more paths = more downstream options) requires knowledge of the full path set. And compressing hint data for mobile delivery (74% size reduction via ID encoding) requires knowing all hints upfront. Runtime computation would add seconds of latency; precalculation makes hints instant and guaranteed correct.
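To make the hint steps concrete, here is a small Python sketch, assuming the verified paths are already available as lists of words. The first three paths are the sample paths shown on this page; the fourth is a hypothetical extra so that the frequency ranking has something to rank. The function names `rank_hints` and `convergence_words` are illustrative, not part of any delivered API.

```python
from collections import Counter, deque

paths = [
    ["sugar", "sweet", "pleasant", "harmony", "peace"],
    ["sugar", "cane", "plant", "olive", "peace"],
    ["sugar", "dissolve", "solution", "resolution", "peace"],
    ["sugar", "sweet", "kind", "gentle", "peace"],  # hypothetical extra path
]

def rank_hints(paths):
    """Forward extraction plus frequency ranking: for each intermediate hop,
    list the words used by verified paths, most-used first."""
    intermediates = len(paths[0]) - 2
    ranked = []
    for i in range(1, intermediates + 1):
        counts = Counter(p[i] for p in paths)
        ranked.append([word for word, _ in counts.most_common()])
    return ranked

def convergence_words(graph, target, max_hops=3):
    """Backward traversal: breadth-first search over reversed edges,
    returning {word: hops_to_target} for words within `max_hops`."""
    reverse = {}
    for src, neighbours in graph.items():
        for n in neighbours:
            reverse.setdefault(n, []).append(src)
    dist = {target: 0}
    queue = deque([target])
    while queue:
        word = queue.popleft()
        if dist[word] == max_hops:
            continue
        for prev in reverse.get(word, ()):
            if prev not in dist:
                dist[prev] = dist[word] + 1
                queue.append(prev)
    del dist[target]
    return dist

# Build a toy graph from the verified paths themselves.
graph = {}
for p in paths:
    for a, b in zip(p, p[1:]):
        graph.setdefault(a, set()).add(b)

hints = rank_hints(paths)
close = convergence_words(graph, "peace")
```

Because every hint is read off a path that is known to reach the target, no hint can lead to a dead end. "sweet" ranks first at hop 1 because two paths pass through it.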
Sample Data: Level 89 of 779 (Medium)
sugar → peace
Path A: sugar → sweet → pleasant → harmony → peace
Path B: sugar → cane → plant → olive → peace
Path C: sugar → dissolve → solution → resolution → peace
Hints: sweet, cane, dissolve (every hint verified to reach target; no dead ends)
Converge: harmony, calm, treaty (1 hop from target)
Themed Categories
For NYT Connections-style games with intentional decoys—words that tempt players into wrong groupings, where ambiguity is the challenge. Or for other category games that need mutually exclusive pools where no word could plausibly belong to two groups. Build your categorization game with puzzle data from Linguabase.
ANCHOR WORD: word with rich semantic profile
→ ASSOCIATION PULL: select 8 strongest associations (≤7 letters each)
→ THEMATIC ISOLATION CHECK: does this anchor’s FULL neighborhood overlap any existing output words? → reject
→ REPEAT ×8: all 8 groups must have zero semantic bleed between them
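The isolation check above can be sketched as a greedy loop in Python. This is a simplified illustration under assumptions: `neighborhoods` stands in for a ranked association list per anchor, the words come from the sample data on this page except for the hypothetical "laundry" anchor and the word "atmosphere", and the toy call uses groups of 6 rather than 8 to match the sample.

```python
def build_groups(neighborhoods, anchors, group_size=8, max_len=7, groups_needed=8):
    """Greedy group builder: accept an anchor only if its FULL neighborhood
    shares no word with the groups already emitted."""
    groups = {}
    used = set()
    for anchor in anchors:
        neighborhood = neighborhoods[anchor]
        if used & set(neighborhood):
            continue  # semantic bleed into an existing group: reject anchor
        # Association pull: strongest first, short enough for the tile UI.
        picks = [w for w in neighborhood if len(w) <= max_len][:group_size]
        if len(picks) < group_size:
            continue  # not enough usable associations
        groups[anchor] = picks
        used.update(picks)
        if len(groups) == groups_needed:
            break
    return groups

neighborhoods = {
    "climate": ["monsoon", "carbon", "ozone", "warming", "ecology",
                "polar", "atmosphere"],
    "pressed": ["coerced", "ironed", "cider", "crushed", "mashed", "creased"],
    # Hypothetical anchor that overlaps "pressed" through "ironed".
    "laundry": ["washed", "ironed", "folded", "dried", "bleach", "starch"],
    "keyboard": ["shift", "return", "escape", "control", "space", "tab"],
}
groups = build_groups(neighborhoods, list(neighborhoods),
                      group_size=6, groups_needed=3)
```

The check runs against the anchor's whole neighborhood rather than only its selected words, so "laundry" is rejected even though "ironed" is just one of its associations. For decoy-heavy games the same overlap test could be inverted to find deliberately tempting words.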
Sample Data: Level 234 of 1,500 (Hard)
Theme 1 (Climate): monsoon, carbon, ozone, warming, ecology, polar
Theme 2 (Pressed): coerced, ironed, cider, crushed, mashed, creased
Theme 3 (Keyboard): shift, return, escape, control, space, tab
Clue-Based Puzzles
For crossword-style games and semantic guessing games—definitions that work as hints, plus related words for crafting clues that don’t give away the answer morphologically. Players identify a target word from semantic clues, but there might be decoys that match some (not all!) of the clues. Build your word-guessing game with puzzle data from Linguabase.
reject same-stem, same-root relatives (SEAL clues can’t include SEALS, SEALING)
→ 4 CLUES: one strong association per sense
Decoy Selection
4 CLUES: given these 4 clues
→ CANDIDATE SCAN: score each word against all 4 clues
→ ZERO DETECTION: which clues have ZERO connection? (no graph link)
→ MATCH PATTERN: record which clues each decoy matches: “0,1” or “2,3” etc.
→ SEPARATION SCORING: answer_total − best_decoy must exceed threshold
Why decoys work: Each decoy must match 2–3 of the 4 clues—enough to seem plausible—but have at least one “zero” (a clue it cannot possibly connect to). Players eliminate decoys by finding the impossible connection. The answer wins by having no zeros: it connects to all 4 clues through different senses of the word.
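A minimal Python sketch of the decoy logic described above. The answer, clues, and decoys are taken from the sample data on this page, but every link strength is a made-up integer for illustration, "navy" is a hypothetical extra candidate, and the function names are assumptions. A score of 0 means no graph link between the word and that clue.

```python
CLUES = ["platoon", "reconnaissance", "ambush", "barracks"]

# Hypothetical link strengths, one per clue, in CLUES order.
links = {
    "army":      [9, 7, 6, 8],
    "battalion": [8, 0, 0, 7],
    "tactics":   [0, 6, 7, 0],
    "squadron":  [5, 6, 0, 0],
    "navy":      [0, 0, 0, 3],  # matches only one clue: too implausible
}

def match_pattern(scores):
    """Indices of the clues a word connects to, e.g. "0,3"."""
    return ",".join(str(i) for i, s in enumerate(scores) if s > 0)

def select_decoys(links, answer, min_matches=2, max_matches=3):
    """Keep candidates that match 2-3 clues. Capping matches below the
    number of clues guarantees each decoy has at least one zero."""
    decoys = {}
    for word, scores in links.items():
        if word == answer:
            continue
        matches = sum(s > 0 for s in scores)
        if min_matches <= matches <= max_matches:
            decoys[word] = match_pattern(scores)
    return decoys

def separation(links, answer, decoys):
    """answer_total minus the best decoy total."""
    return sum(links[answer]) - max(sum(links[d]) for d in decoys)

decoys = select_decoys(links, "army")
answer_has_no_zeros = all(s > 0 for s in links["army"])
margin = separation(links, "army", decoys)
puzzle_ok = answer_has_no_zeros and margin > 10  # threshold is an assumption
```

The recorded match patterns let a game explain after the fact why a decoy fails: "battalion" stores "0,3", so it has no link to reconnaissance or ambush.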
Sample Data: Level 512 of 3,000 (Hard)
Answer: army
Clues: platoon, reconnaissance, ambush, barracks
Decoys: battalion, tactics, squadron
What You Get
We deliver production-ready puzzle data for all your levels—hundreds or thousands, depending on your game’s needs. The data is generated exclusively for your game, unique to your studio. Every puzzle draws from Linguabase’s complete vocabulary stack: vocabulary rankings, semantic associations, sense clouds, and more.
Compact data payload—minified JSON optimized for your game client, modular so you can update or swap level packs without redeploying
Human-readable version—uncompressed, commented format you can inspect to understand the data structure and verify puzzle quality
Clear documentation—schema reference, field explanations, integration examples
Embedded difficulty scores—each puzzle tagged with difficulty metrics so your game can dynamically adjust challenge to individual players
Content filtering—pre-vetted for your audience (family-friendly, teen, adult)
The four formats above are examples. Your game has its own mechanics—we’ll design a data structure that fits, exclusive to you.
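As an illustration of how embedded difficulty scores might be used at runtime, here is a short Python sketch. The JSON schema (`format`, `levels`, `id`, `difficulty`) and the `next_level` helper are entirely hypothetical; the actual bundle format is designed per game and described in its own documentation.

```python
import json

# Hypothetical minified level pack; real field names will differ.
PACK = json.loads("""
{"format": "clue", "levels": [
  {"id": 1, "difficulty": 0.2},
  {"id": 2, "difficulty": 0.5},
  {"id": 3, "difficulty": 0.8}
]}
""")

def next_level(levels, player_skill, played):
    """Pick the unplayed level whose difficulty is closest to the
    player's current skill estimate (both assumed to be on a 0-1 scale)."""
    remaining = [lv for lv in levels if lv["id"] not in played]
    if not remaining:
        return None
    return min(remaining, key=lambda lv: abs(lv["difficulty"] - player_skill))

first = next_level(PACK["levels"], player_skill=0.6, played=set())
second = next_level(PACK["levels"], player_skill=0.6, played={first["id"]})
```

Because the pack is plain JSON keyed by level, a game could replace one pack with another without touching this selection logic.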
Get in Touch
We can generate test datasets for creative development and playtesting—small batches to validate that your mechanics work. When you’re ready, we produce data for thousands of production levels.
Benefit from a decade of experience with the twists, turns, and edge cases of making word puzzles about meaning.
Describe your game mechanics—how players interact, what constitutes a “level,” what data you need at runtime—and we’ll design a puzzle data format that plugs directly into your game.