I’ve been working on a fairly ambitious open world prototype in Unity but I was kind of hitting a wall in productivity, so I decided to take a break from that and instead work on a smaller game in Luxe, and that game is…
Words
Today I wanted to add the proper word matching (along with the core feature of matching words in any order) and so here's how I got on with that.
Word lists
One of the main things for a word game is a word list, perhaps no big surprise there. :laughing: However, the word lists for games like Scrabble are generally not available, or are under some kind of copyright. Poking around, I found a list called SOWPODS, which appears to be an amalgamation of the US and UK Scrabble dictionaries, but it wasn't immediately clear what the licensing situation for it was. So to sidestep this, first I tried...
Making a word list!
Getting some (licensed) data
There is a fairly well-known corpus of word frequency available for free from [https://www.wordfrequency.info](https://www.wordfrequency.info), the downside being that the free version only includes 5000 lemmas. (The lemma is the root word, so if the word 'sow' was to be included in the list, 'sows', 'sowing' etc. would also be included.) So I decided to use this list to generate my own dictionary of common words. Luxe has a useful data format built into it that looks similar to JSON, so I decided to make a Ruby script to parse the CSV files from wordfrequency.info and output them into a 'dictionary' file which the game can then load using the inbuilt parsing.
Word list format
To facilitate matching words from a collection of letters I decided to index the words based on the letters that make them up, arranged in alphabetical order. This means that if multiple matches for a group of letters are possible, then one or more of those can be shown. Given that the word frequency data also gives a ranking based on word usage, this could also mean that if you manage to spell an uncommon word directly, or get closer to an uncommon word than to a more common word that uses the same letters, the game would be able to detect this. Generating this was fairly simple, and the generated file looks something like this:
words = {
  ADERST = [
    "TRADES"
    "STARED"
    "DATERS"
    "TREADS"
    "DERATS"
  ]
}
If arranged alphabetically, the letters of any of those words would become the key “ADERST”, and using this we can easily get a list of ‘common’ words. This actually works fairly well and is the format I’m currently using, however I hit a snag…
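As a rough illustration of that keying idea (a minimal sketch, not the actual generator script), the lookup boils down to sorting a word's letters and using the result as a hash key. The helper names `letter_key` and `build_index` are made up for this example, and the input is assumed to be a plain array of words:

```ruby
# Hypothetical helper: the key is the word's letters, uppercased and sorted.
def letter_key(word)
  word.upcase.chars.sort.join
end

# Group words under their shared key, so anagrams land in the same list.
def build_index(words)
  words.map(&:upcase).uniq.group_by { |w| letter_key(w) }
end

index = build_index(%w[trades stared daters treads derats words sword])

# Letters in any order produce the same key, so matching is one hash lookup.
p letter_key("DRATSE")                # "ADERST"
p index[letter_key("DRATSE")]         # ["TRADES", "STARED", "DATERS", "TREADS", "DERATS"]
p index.fetch(letter_key("XYZ"), [])  # []
```

Because the key ignores letter order, the game only has to sort whatever letters the player has picked and check whether that key exists.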
5000 lemmas isn't really all that many...
The word list from wordfrequency.info is good quality, but it's missing a lot of words. This is probably to be expected: most Scrabble dictionaries seem to have around 300,000 words, 60 times as many. A commercial licence would be an option to extend the word list, but that's expensive and still only has 60,000 words. I decided to look around and see if any other options were available. Fortunately, there is a freely available public domain word list called ENABLE that is used by a lot of word games. You can [download it here](http://web.archive.org/web/20090122025747/http://personal.riverusers.com/~thegrendel/software.html) if you're interested.
Using this data I regenerated the dictionary, and now there are lots of commonly used words available. Unfortunately, I lost the ranking data for them. I may consider some way to bring that back if it turns out to matter for gameplay, but for now it will suffice.
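For anyone curious what regenerating the dictionary from a plain word list might look like, here is a minimal Ruby sketch. It assumes the source list is plain text with one word per line, uses a short inline sample in place of the real file, and the function names `index_from_lines` and `render_dictionary` are invented for illustration. The output only mirrors the shape of the sample file shown earlier and has not been checked against the Luxe parser.

```ruby
# Assumption: the source list is plain text with one word per line.
# A short inline sample stands in for the real file here.
SAMPLE_LIST = <<~WORDS
  daters
  derats
  stared
  sword
  trades
  treads
  words
WORDS

# Build { "ADERST" => ["DATERS", ...] } from the lines of a word list.
def index_from_lines(lines)
  index = Hash.new { |hash, key| hash[key] = [] }
  lines.each do |line|
    word = line.strip.upcase
    next unless word.match?(/\A[A-Z]+\z/) # skip blanks and non-alphabetic entries
    index[word.chars.sort.join] << word
  end
  index
end

# Render the index in the same shape as the generated file shown earlier.
def render_dictionary(index)
  out = +"words = {\n"
  index.sort.each do |key, words|
    out << "  #{key} = [\n"
    words.each { |w| out << "    \"#{w}\"\n" }
    out << "  ]\n"
  end
  out << "}\n"
end

dictionary_text = render_dictionary(index_from_lines(SAMPLE_LIST.each_line))
puts dictionary_text
```

To run this against a real list, the inline sample could be swapped for `File.foreach` on the downloaded file, which also yields one line at a time.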
Putting the word list into the game
This was fairly painless, the Luxe data format combined with the generated files made it easy enough. I added some conditions for success/failure to make a word, and now I am just tweaking the animations to make them more exciting! I tried a few different approaches, such as having the letters swap to form a word, but it looked fairly slow and clunky so my plan is to try and streamline that a bit next... and probably work on some particle confetti for a celebration of word matching success.
I have a commercial-friendly license for the COCA corpus from a few years back - if you give me a word list I can give it back to you sorted by frequency based on that.
Thank you, that’s extremely kind of you to offer. Let me work out what would be useful and I’ll get back to you!