Free Form Dialog Trees in “Don’t Crush Me”
Don’t Crush Me is a game about pleading for your life with a robotic garbage compactor. It came up many years ago during a discussion in the AwfulJams IRC channel. The recent advent of Smoothed Inverse Frequency (SIF) provided an ample opportunity to revisit the idea with the benefits of modern machine learning. In this post we’re going to cover building SIF in Rust, compiling it to a library we can use in the Godot game engine, and then building a dialog tree in GDScript to control our gameplay.
First, a little on Smoothed Inverse Frequency:
In a few words, SIF involves taking a bunch of sentences, converting them to row vectors, and removing the first principal component. The details are slightly more involved, but not MUCH more involved. Part of the conversion to row vectors involves tokenization (which I largely ignore in favor of splitting on whitespace for simplicity) and smoothing based on word frequency (which I also currently ignore).
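The steps above can be sketched in a few functions. This is a minimal illustration, not the code from the game: the function names are hypothetical, the sentence vector is a plain average (matching the "ignore frequency smoothing" shortcut), and the principal direction is estimated with a simple power iteration on the uncentered rows, which is my assumption about one reasonable way to do it.

```rust
// Hypothetical sketch: names and the power-iteration approach are assumptions.

/// Average the word vectors of one sentence into a single row vector.
fn sentence_vector(word_vectors: &[Vec<f32>], dim: usize) -> Vec<f32> {
    let mut sum = vec![0.0f32; dim];
    for v in word_vectors {
        for (s, x) in sum.iter_mut().zip(v) {
            *s += x;
        }
    }
    let n = word_vectors.len().max(1) as f32;
    sum.iter().map(|s| s / n).collect()
}

/// Estimate the dominant direction of the (uncentered) rows by power iteration.
/// Returns a unit-length vector, or the starting guess if all rows are zero.
fn first_principal_direction(rows: &[Vec<f32>], dim: usize, iterations: usize) -> Vec<f32> {
    let mut u = vec![1.0 / (dim as f32).sqrt(); dim];
    for _ in 0..iterations {
        // next = X^T (X u), accumulated one row at a time.
        let mut next = vec![0.0f32; dim];
        for row in rows {
            let dot: f32 = row.iter().zip(&u).map(|(a, b)| a * b).sum();
            for (acc, x) in next.iter_mut().zip(row) {
                *acc += dot * x;
            }
        }
        let norm = next.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm == 0.0 {
            return u;
        }
        u = next.iter().map(|x| x / norm).collect();
    }
    u
}

/// Subtract each row's projection onto `u` (assumed unit length).
fn remove_component(rows: &mut [Vec<f32>], u: &[f32]) {
    for row in rows.iter_mut() {
        let dot: f32 = row.iter().zip(u).map(|(a, b)| a * b).sum();
        for (x, c) in row.iter_mut().zip(u) {
            *x -= dot * c;
        }
    }
}
```

In use, you would call `sentence_vector` once per sentence, stack the results as rows, estimate the direction once, and then strip it from every row (and from any new sentence vector at query time).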
Really, one of the “largest” challenges in this process was taking the GloVe vectors and embedding them in the library so that GDScript didn’t have to read anything from a multi-gigabyte file. The GloVe 6B 50-D uncased vectors take up only about 150 megs in an optimal float format, and I’m quite certain they can be made more compact still. Additionally, since we know all of the tokens in advance, we can use a perfect hash function to optimally index into the words at runtime.
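To make the "packed floats plus a fixed token index" idea concrete, here is a small sketch. Everything in it is an assumption for illustration: the byte layout (little-endian `f32` values, `DIM` per token, back to back), the names, and the tiny three-word vocabulary. A sorted token table with binary search stands in for the perfect hash function, since a real one would come from a code generator or crate.

```rust
// Hypothetical layout: DIM little-endian f32 values per token, back to back.
const DIM: usize = 2; // Would be 50 for the GloVe 6B 50-D vectors; tiny here.

/// Tokens in sorted order. Binary search is a stand-in for a perfect hash.
const TOKENS: [&str; 3] = ["crush", "help", "stop"];

/// Decode the vector stored at `index` from the packed bytes.
fn vector_at(bytes: &[u8], index: usize) -> Option<Vec<f32>> {
    let start = index.checked_mul(DIM * 4)?;
    let chunk = bytes.get(start..start + DIM * 4)?;
    Some(
        chunk
            .chunks_exact(4)
            .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
            .collect(),
    )
}

/// Look up a token's vector; returns None for out-of-vocabulary words.
fn lookup(bytes: &[u8], token: &str) -> Option<Vec<f32>> {
    let index = TOKENS.binary_search(&token).ok()?;
    vector_at(bytes, index)
}

/// Pack floats into the layout above. In a real build this step would run
/// offline, with the result compiled in via something like `include_bytes!`.
fn pack(values: &[f32]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_le_bytes()).collect()
}
```

Because the table is a flat byte slice, nothing needs to be parsed at startup; a lookup is one index computation and a handful of byte-to-float conversions.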
With our ‘tokenize’ and ‘vectorize’ functions defined, we are free to put these methods into a small Rust GDNative library and build it out. After an absurdly long wait for the build to compile (~20 minutes on my Ryzen 3950X) we have a library! It’s then a matter of adding a few supporting config files and we have a similarity method we can use.
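For a sense of what sits behind such a similarity method, here is a rough sketch of the two simplest pieces. It is not the library's actual code: I am assuming cosine similarity as the comparison, and the lowercasing step is my guess at how input would be matched against uncased vectors. The GDNative binding layer is left out entirely.

```rust
/// Split on whitespace and lowercase each token (assumed, to match uncased
/// vectors). Punctuation stays attached to words, so "Hello?" becomes "hello?".
fn tokenize(text: &str) -> Vec<String> {
    text.split_whitespace().map(|w| w.to_lowercase()).collect()
}

/// Cosine similarity of two equal-length vectors; 0.0 if either is all zeros.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}
```

The player's sentence and each candidate phrase would both be tokenized and vectorized, and the pair with the highest score wins.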
Now the less fun part: Writing Dialog. In the older jam Hindsight is 60 Seconds, I capped things off with a dialog tree as part of a last-ditch effort to avoid doing work on things that mattered. The structure of that tree was something like this…
    const COMMENT = "_COMMENT"
    const ACTION = "_ACTION"
    const PROMPT = "_PROMPT"
    const BACKGROUND = "_BACKGROUND"

    var dialog = {
        "_TEMPLATE": {
            COMMENT: "We begin at _START. Ignore this.",
            PROMPT: "The dialog that starts this question.",
            ACTION: "method_name_to_invoke",
            "dialog_choice": "resulting path name or a dictionary. If a dictionary, parse as though it were a path on its own.",
            "alternative_choice": {
                PROMPT: "This is one of the ways to do it.",
                "What benefit does this have?": "question",
                "Oh neat.": {
                    PROMPT: "We can go deeper.",
                    "…": "_END"
                }
            }
        },
I like this format. It’s easy to read and reason about, but it’s limited in that each action can be reached by only one dialog choice. For DCM I wanted to be able to have multiple phrasings of the same thing without repeating the entire block. Towards that end, I used a structure like this:
    var dialog_tree = {
        "START": [ # AI Start state:
            # Possible transitions:
            {
                TRIGGER_PHRASES: ["Hello?", "Hey!", "Is anyone there?", "Help!", "Can anyone hear me?"],
                TRIGGER_WEIGHTS: 0, # Can be an array, too.
                NEXT_STATE: "HOW_CAN_I_HELP_YOU", # AI State.
                RESPONSE: "Greetings unidentified waste item. How can I assist you?",
                PLACEHOLDER: "Can you help me?",
                ON_ENTER: "show_robot" # When we run this transition.
            },
            {
                TRIGGER_PHRASES: ["Stop!", "Stop compressing!", "Don't crush me, please!", "Don't crush me!", "Wait!", "Hold on."],
                NEXT_STATE: "STOP_COMPRESS_1",
                RESPONSE: "Greetings unidentified waste item. You have asked to halt the compression process. Please give your justification.",
                PLACEHOLDER: "I am alive.",
                ON_ENTER: "show_robot"
            },
            {
                TRIGGER_PHRASES: ["Where am I?", "What is this place?"],
                NEXT_STATE: "WHERE_AM_I",
                RESPONSE: "Greetings unidentified waste item. You are in the trash compactor.",
                ON_ENTER: "show_robot"
            }
        ],
This has proven to be incredibly unruly and, if you are diligent, you may have realized that the same “multiple trigger phrases” behavior is just as achievable in the first approach by splitting each choice key on a special character like “|”.
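As a rough illustration of that splitting trick, the sketch below expands keys such as "Hello?|Hey!" so that every phrasing points at the same target. It is written in Rust only to match the library side of the project; in the game this logic would live in GDScript, and the function name and the (key, target) pair representation are hypothetical.

```rust
use std::collections::HashMap;

/// Expand keys like "Hello?|Hey!" so each phrasing maps to the same target.
/// Whitespace around each phrasing is trimmed.
fn expand_choices<'a>(choices: &[(&'a str, &'a str)]) -> HashMap<&'a str, &'a str> {
    let mut expanded = HashMap::new();
    for &(key, target) in choices {
        for phrase in key.split('|') {
            expanded.insert(phrase.trim(), target);
        }
    }
    expanded
}
```

The one catch is that the separator can no longer appear inside a phrasing, which is rarely a problem for short dialog lines.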
So how well does it work? The short answer is, “well enough.” It has highlighted a much more significant issue: the immensity of the input space. Initially, I thought that using a placeholder in the input text would help to anchor and bias the end-user’s choices and hide the seams of the system. In practice, this was still a fraught endeavor.
All things considered, I’m still proud of how things turned out. It’s a system that’s far from perfect, but it’s interesting and it was plenty satisfying to build. I hope that people enjoy the game after the last bits are buffed out (hopefully before GDC 2020).