Tracery tutorial

by Allison Parrish

This is a tutorial on Tracery. Tracery is a computer language for random text generation originally developed by Kate Compton.

In this tutorial, examples are presented in boxes like this:

{
  "origin": "#[test:#testNouns#]warning#",
  "warning": "This is #test.a# of the emergency #broadcastNouns# system. This is only #test.a#.",
  "testNouns": ["test", "trial", "experiment", "case study", "quiz", "analysis"],
  "broadcastNouns": ["broadcast", "program", "telecast", "podcast", "sitcom", "documentary"]
}

Regenerate

The grammar is in the top section, and a generated line of output is in the bottom section. You can generate another line of output by pressing the “Regenerate” button.

You can’t modify the examples in this tutorial, but you can copy and paste them into the following tools, which let you change a grammar and see the results:

Beau Gunderson’s Tracery writer
Kate Compton’s Tracery tutorial and Tracery visual editor
Cheap Bots Done Quick, which lets you turn your Tracery grammar into a Twitter bot with a minimum amount of fuss.

You might be interested in reading Nora Reed’s explanation of how @nerdgarbagebot works, which takes you through the process of ideating and implementing a Tracery grammar for a Twitter bot. (Nora Reed makes a lot of amazing bots with Tracery, including @thinkpiecebot.)

Rules and expansions

A Tracery grammar is a series of rules that tell the computer how to put text together, piece by piece. Each rule has a name and an expansion. The goal of writing a Tracery grammar is to write rules and expansions that, when followed by the computer, produce interesting (funny, insightful, poetic) text. The word for generating a text from a grammar is “expand”: we’ll be talking a lot below about “expanding” the grammar into a text. (Hopefully the reasons for using this word will become clear!)

The rules and expansions have to be written in a very specific format so that the computer can understand them. Here’s an example of a complete, but very boring, Tracery grammar:

{
  "origin": "Hello, world!"
}

This grammar can produce only one text: Hello, world! Not very interesting, but helpful for the moment to illustrate how a grammar is put together.

Tracery grammars have to start with a { and end with a }. Between those two curly brackets, you write a series of rules that look like this:

"rule": "expansion"

… where rule is the name of the rule, and expansion is what you want the text to be replaced with. Every grammar needs to have a rule named origin: it’s where the computer will start the process of expanding the entire grammar.

Here’s a Tracery grammar with two rules:

{
  "origin": "Hello, #noun#!",
  "noun": "galaxy"
}

This grammar, again, can only ever produce one text: Hello, galaxy! But it accomplishes it in a slightly more sophisticated way. Notice in the expansion for the origin rule the following text:

#noun#

When the computer encounters text that looks like this—a word surrounded by # signs—it looks in the grammar for a rule with the same name as the word, and replaces the text with the expansion for that rule.
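
The lookup-and-replace step described above can be sketched in a few lines of Python. This is only an illustration, not Tracery's actual implementation: the `expand` function and its regular expression are assumptions made for this example, and the sketch handles only plain `#name#` references to rules that have a single expansion.

```python
import re

# The two-rule grammar from above, written as a Python dictionary.
grammar = {
    "origin": "Hello, #noun#!",
    "noun": "galaxy",
}

def expand(text, rules):
    """Replace every #name# in text with the expansion of the rule called name."""
    def replace(match):
        rule_name = match.group(1)
        # The expansion may itself contain #...# references, so expand it too.
        return expand(rules[rule_name], rules)
    return re.sub(r"#(\w+)#", replace, text)

print(expand(grammar["origin"], grammar))  # prints "Hello, galaxy!"
```

Because `expand` calls itself on each expansion, a rule can refer to other rules, which can refer to still other rules, and so on.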

When a grammar has more than one rule, the rules have to be separated by commas. (That is: all of the rules should have commas after them except for the last one.) Make sure to put the commas after the double quotes, and not inside them like you might expect.

Let’s add a third rule to this grammar, just to see how it looks:

{
  "origin": "#greeting#, #noun#!",
  "greeting": "Howdy",
  "noun": "galaxy"
}

EXERCISE: Add another rule for the punctuation at the end of the sentence, so that the grammar produces the text “Howdy, galaxy?”

Adding alternatives

The examples above are really boring, because they can only ever produce one output. In order for a grammar to be able to produce different outputs, we need to make the expansions of our rules have alternatives for the computer to choose between. Rules with alternatives look like this:

"rule": ["alternative one", "alternative two", "alternative three"]

That is: the name of the rule (in quotes), followed by a colon, followed by a [, followed by a list of comma-separated bits of text, followed by a ]. When the computer has to expand a rule with multiple alternatives, it will select one at random.
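
To make "select one at random" concrete, here is a rough Python sketch of how a rule with alternatives might be expanded. It is an assumption-laden illustration rather than Tracery's real code: the `expand` function is hypothetical, and the small grammar simply reuses the placeholder alternatives shown above.

```python
import random
import re

def expand(text, rules):
    """Expand #name# references, picking one alternative at random for list rules."""
    def replace(match):
        expansion = rules[match.group(1)]
        if isinstance(expansion, list):
            # A rule with alternatives: choose one of them at random.
            expansion = random.choice(expansion)
        return expand(expansion, rules)
    return re.sub(r"#(\w+)#", replace, text)

# Hypothetical grammar for illustration only.
rules = {
    "origin": "I pick #rule#.",
    "rule": ["alternative one", "alternative two", "alternative three"],
}

print(expand("#origin#", rules))
```

Running the sketch several times should print different sentences, although the same alternative can come up more than once in a row.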

Here’s our “Hello, world!” grammar, now with multiple alternatives for what we’re greeting:

{
  "origin": "#greeting#, #noun#!",
  "greeting": "Howdy",
  "noun": ["world", "solar system", "galaxy", "local cluster", "universe"]
}

Click on the “Regenerate” button and you’ll see different outputs. (Sometimes it’ll look like it isn’t working, but that’s just because the computer randomly selected the same alternative twice in a row. It can happen!)

Let’s make this “Hello, world!” example even more interesting by adding alternatives for the greeting rule:

{
  "origin": "#greeting#, #noun#!",
  "greeting": ["Howdy", "Hello", "Greetings", "What's up", "Hey", "Hi"],
  "noun": ["world", "solar system", "galaxy", "local cluster", "universe"]
}
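
With six greetings and five nouns, this grammar can produce 6 × 5 = 30 different texts. The short Python sketch below lists them all by pairing every greeting with every noun. It does not use Tracery; it only reproduces the arithmetic for this one grammar.

```python
from itertools import product

greetings = ["Howdy", "Hello", "Greetings", "What's up", "Hey", "Hi"]
nouns = ["world", "solar system", "galaxy", "local cluster", "universe"]

# Every output pairs one greeting with one noun.
outputs = [f"{greeting}, {noun}!" for greeting, noun in product(greetings, nouns)]

print(len(outputs))  # prints 30
print(outputs[0])    # prints "Howdy, world!"
```

Each alternative you add to either rule multiplies the number of possible outputs, which is why even small grammars can feel surprisingly varied.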

JSON problems

Tracery grammars are written in a format called JSON (short for “JavaScript Object Notation”). JSON is a common format for exchanging data between computer programs written in different programming languages and on different kinds of computers.

Unfortunately for our purposes, JSON is a very persnickety format to write in. If you don’t get the formatting just right, the computer will complain and will refuse to do anything with your input. The tools linked to above will show you where the errors in your formatting are, usually by giving you a line number or highlighting the line that has the problem. If you get an error and you’re having trouble fixing it, try these strategies:

Check to ensure that every [ has a matching ].
Make sure that you have a comma between all of your rules, but that there is no comma after the last rule.
For rules with alternatives, make sure that you didn’t accidentally put a comma inside of the quotes.
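
If you happen to have Python available, its built-in `json` module can also check a grammar for formatting errors. The sketch below is one possible way to do that, not part of Tracery or of the tools above; the `check_grammar` helper is a name invented for this example, and the grammar in it is deliberately missing a comma.

```python
import json

# A grammar with a deliberate mistake: no comma after the "origin" rule.
broken_grammar = """{
  "origin": "#greeting#, #noun#!"
  "greeting": "Howdy",
  "noun": "galaxy"
}"""

def check_grammar(text):
    """Return None if text is valid JSON, otherwise a message with the error location."""
    try:
        json.loads(text)
    except json.JSONDecodeError as error:
        return f"line {error.lineno}, column {error.colno}: {error.msg}"
    return None

print(check_grammar(broken_grammar))
```

Keep in mind that the reported position is where the parser got stuck, which may be the line after the one with the actual mistake. A missing comma, for instance, tends to be noticed only when the next rule begins.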

The JSON format actually provides some flexibility that might be helpful in formatting your grammar. For example, the JSON format doesn’t care about where you put linebreaks, so you could format the “Hello, world” grammar like so with no repercussions on the behavior of the grammar:

{
  "origin": "#greeting#, #noun#!",
  "greeting": [
    "Howdy",
    "Hello",
    "Greetings",
    "What's up",
    "Hey",
    "Hi"
  ],
  "noun": [
    "world",
    "solar system",
    "galaxy",
    "local cluster",
    "universe"
  ]
}
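
One way to convince yourself that linebreaks don't matter is to parse the same grammar in both layouts and compare the results. The Python sketch below does this with the built-in `json` module, using a shortened version of the grammar to keep the example small. Python here is only a stand-in for whatever program ends up reading your grammar.

```python
import json

# The same (shortened) grammar, once on a single line...
compact = '{"origin": "#greeting#, #noun#!", "greeting": ["Howdy", "Hello"], "noun": ["world", "galaxy"]}'

# ...and once spread across many lines.
spread_out = """
{
  "origin": "#greeting#, #noun#!",
  "greeting": [
    "Howdy",
    "Hello"
  ],
  "noun": [
    "world",
    "galaxy"
  ]
}
"""

print(json.loads(compact) == json.loads(spread_out))  # prints True
```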

Modifiers

Let’s make a more sophisticated grammar that produces sentences in the format “Dammit Jim, I’m a X, not a Y!” popularized by the ground-breaking science fiction program, Star Trek. Such a grammar might look like this:

{
  "origin": "#interjection#, #name#! I'm a #profession#, not a #profession#!",
  "interjection": ["alas", "congratulations", "eureka", "fiddlesticks",
    "good grief", "hallelujah", "oops", "rats", "thanks", "whoa", "yes"],
  "name": ["Jim", "John", "Tom", "Steve", "Kevin", "Gary", "George", "Larry"],
  "profession": [
        "accountant",
        "actor",
        "archeologist",
        "astronomer",
        "audiologist",
        "bartender",
        "butcher",
        "carpenter",
        "composer",
        "crossing guard",
        "curator",
        "detective",
        "economist",
        "editor",
        "engineer",
        "epidemiologist",
        "farmer",
        "flight attendant",
        "forest fire prevention specialist",
        "graphic designer",
        "hydrologist",
        "librarian",
        "lifeguard",
        "locksmith",
        "mathematician",
        "middle school teacher",
        "nutritionist",
        "painter",
        "physical therapist",
        "priest",
        "proofreader",
        "rancher",
        "referee",
        "reporter",
        "sailor",
        "sculptor",
        "singer",
        "sociologist",
        "stonemason",
        "surgeon",
        "tailor",
        "taxi driver",
        "teacher assistant",
        "teacher",
        "teller",
        "therapist",
        "tour guide",
        "translator",
        "travel agent",
        "umpire",
        "undertaker",
        "urban planner",
        "veterinarian",
        "web developer",
        "weigher",
        "welder",
        "woodworker",
        "writer",
        "zoologist"
  ]
}

(List of professions taken from Darius Kazemi’s Corpora Project—an excellent place to find lists of things. And they’re already preformatted in JSON, which can make it easier to cut-and-paste into your Tracery grammars.)

This is pretty good, but there are problems. The first is that we typed in all of the interjections in lower case, but they’re supposed to have the first letter capitalized (since they’re at the beginning of the sentence). The second problem is that the grammar occasionally produces something like

yes, George! I'm a economist, not a zoologist!

“A economist” isn’t right. It should be “an economist.” English indefinite articles are tricky that way!

There are several ways to solve these problems. We could just change all of our interjections to be capitalized, and add the appropriate article to the beginning of each profession. But (1) this would be time consuming and (2) it would mean we could never re-use those rules in places that need the unmodified text, such as an interjection in the middle of a sentence or a profession without an article. What to do?

Thankfully, Tracery comes equipped with a series of modifiers that take the expansion of a rule and apply a transformation to it. The two modifiers we’re going to use are .a, which adds the appropriate indefinite article before the expansion of a rule, and .capitalize, which capitalizes the first letter of the expansion.

Use a modifier by adding a period and the modifier’s name inside the # signs, right after the name of the rule. For example, change:

#interjection#

to:

#interjection.capitalize#
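
To give a feel for what a modifier does, here is a rough Python sketch of `.a` and `.capitalize`. It is a simplification written for this tutorial, not Tracery's own code: the function names are invented, the article rule only looks at whether the first letter is a vowel (so it would get words like “hour” or “unicorn” wrong), and each rule has a single fixed expansion so the output is predictable.

```python
import re

VOWELS = set("aeiou")

def add_article(text):
    """Prefix text with "an" if it starts with a vowel letter, otherwise "a"."""
    article = "an" if text[:1].lower() in VOWELS else "a"
    return f"{article} {text}"

def capitalize(text):
    """Upper-case the first character and leave the rest unchanged."""
    return text[:1].upper() + text[1:]

# Maps the name written after the period to the function that applies it.
MODIFIERS = {"a": add_article, "capitalize": capitalize}

def expand(text, rules):
    """Expand #name# and #name.modifier# references using fixed (non-random) rules."""
    def replace(match):
        name, *modifier_names = match.group(1).split(".")
        result = expand(rules[name], rules)
        # Apply the modifiers to the finished expansion, left to right.
        for modifier_name in modifier_names:
            result = MODIFIERS[modifier_name](result)
        return result
    return re.sub(r"#([\w.]+)#", replace, text)

rules = {
    "origin": "#interjection.capitalize#, #name#! I'm #profession.a#!",
    "interjection": "yes",
    "name": "George",
    "profession": "economist",
}

print(expand("#origin#", rules))  # prints "Yes, George! I'm an economist!"
```

The important point is that the modifier runs after the rule has been expanded, so the rule itself can stay in its plain, reusable form.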

Here’s our “Dammit Jim” generator with the modifiers in place:

{
  "origin": "#interjection.capitalize#, #name#! I'm #profession.a#, not #profession.a#!",
  "interjection": ["alas", "congratulations", "eureka", "fiddlesticks",
    "good grief", "hallelujah", "oops", "rats", "thanks", "whoa", "yes"],
  "name": ["Jim", "John", "Tom", "Steve", "Kevin", "Gary", "George", "Larry"],
  "profession": [
        "accountant",
        "actor",
        "archeologist",
        "astronomer",
        "audiologist",
        "bartender",
        "butcher",
        "carpenter",
        "composer",
        "crossing guard",
        "curator",
        "detective",
        "economist",
        "editor",
        "engineer",
        "epidemiologist",
        "farmer",
        "flight attendant",
        "forest fire prevention specialist",
        "graphic designer",
        "hydrologist",
        "librarian",
        "lifeguard",
        "locksmith",
        "mathematician",
        "middle school teacher",
        "nutritionist",
        "painter",
        "physical therapist",
        "priest",
        "proofreader",
        "rancher",
        "referee",
        "reporter",
        "sailor",
        "sculptor",
        "singer",
        "sociologist",
        "stonemason",
        "surgeon",
        "tailor",
        "taxi driver",
        "teacher assistant",
        "teacher",
        "teller",
        "therapist",
        "tour guide",
        "translator",
        "travel agent",
        "umpire",
        "undertaker",
        "urban planner",
        "veterinarian",
        "web developer",
        "weigher",
        "welder",
        "woodworker",
        "writer",
        "zoologist"
  ]
}

Another modifier you can use is .s, which turns the text in the expansion into its plural version. Using this, we can modify the above example to be a Star Wars meme instead of a Star Trek one:

{
  "origin": "These aren't the #profession.s# we're looking for.",
  "profession": [
        "accountant",
        "actor",
        "archeologist",
        "astronomer",
        "audiologist",
        "bartender",
        "butcher",
        "carpenter",
        "composer",
        "crossing guard",
        "curator",
        "detective",
        "economist",
        "editor",
        "engineer",
        "epidemiologist",
        "farmer",
        "flight attendant",
        "forest fire prevention specialist",
        "graphic designer",
        "hydrologist",
        "librarian",
        "lifeguard",
        "locksmith",
        "mathematician",
        "middle school teacher",
        "nutritionist",
        "painter",
        "physical therapist",
        "priest",
        "proofreader",
        "rancher",
        "referee",
        "reporter",
        "sailor",
        "sculptor",
        "singer",
        "sociologist",
        "stonemason",
        "surgeon",
        "tailor",
        "taxi driver",
        "teacher assistant",
        "teacher",
        "teller",
        "therapist",
        "tour guide",
        "translator",
        "travel agent",
        "umpire",
        "undertaker",
        "urban planner",
        "veterinarian",
        "web developer",
        "weigher",
        "welder",
        "woodworker",
        "writer",
        "zoologist"
  ]
}
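
Pluralizing English words is harder than it looks, so it helps to see roughly what a modifier like `.s` has to do. The Python sketch below uses a handful of spelling rules chosen for this example. It is an assumption about the general approach, not a copy of Tracery's rules, and it makes no attempt to handle irregular plurals such as “child” or “person”.

```python
VOWELS = set("aeiou")

def pluralize(text):
    """Return a rough English plural of text using a few spelling rules."""
    # Words ending in a hissing sound take "es": "fox" becomes "foxes".
    if text.endswith(("s", "x", "ch", "sh")):
        return text + "es"
    # Consonant + "y" becomes "ies": "city" becomes "cities", but "day" stays "days".
    if text.endswith("y") and text[-2:-1].lower() not in VOWELS:
        return text[:-1] + "ies"
    return text + "s"

for profession in ["actor", "butcher", "tour guide"]:
    print(f"These aren't the {pluralize(profession)} we're looking for.")
```

Because the modifier only changes the end of the text, it works on multi-word expansions like “tour guide” too, as long as the last word is the one that should be pluralized.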

Next steps

Congratulations, you now know the basics of writing a Tracery grammar. You can paste your grammar right into Cheap Bots Done Quick to turn it into a Twitter bot!

If you need ideas for text generators to make, check out the Snowclones Database or Know Your Meme.

Tracery has a number of features that we didn’t go into here, including the ability to save the output of a rule to be re-used later in the same expansion. See Kate Compton’s tutorial for more information. You might be interested in these advanced text generators that Kate Compton made with Tracery.

If you’re a JavaScript programmer and you want to incorporate Tracery into your own projects, the source code is available here (also available as a Node module).