Tightening the World-Plot Interface: or, Why I Am Obsessed With Conversation Models

Argues in favor of conversational mechanics that offer the player both affect and diegetic agency, and provides some suggestions about design based on Aristotelian rhetorical categories, as well as experience from Versu and other projects.

Emily Short, Blogger

June 9, 2015

The following is a revised reprint of an article that appeared on my blog. It has been reformatted and lightly edited for a different audience.

Framed is an interactive comic game in which you rearrange the panels of the story, reordering events to change what happens. It looks really attractive, too.

When I first heard of this game, I was hugely excited about it. There aren't that many entries in the interactive comic space, and this seemed to offer a slightly different set of mechanics to go alongside Dan Benmergui's (unfinished but, to judge by the demos, awesome) Storyteller or Troy Chin's Forgetting or the somewhat over-difficult Strip 'Em All.

When I actually played Framed, though, I had essentially the same reaction as The Digital Reader:

While Framed is based on a clever dynamic, the actual game is repetitive to the point that I am bored... Rather than have the user solve puzzles with different goals and different solutions, the vast majority of the levels I played all had the same goal: avoid the cops. Other than setting things up so the protagonist can either bypass cops or sneak up behind cops and hit them over the head, there's not much to this game.

I'm maybe a little less harsh than this (I did feel that Framed was worth playing, and I know that some people did enjoy the puzzles), but nonetheless I was hoping for something that did new work in telling an interactive story, rather than just setting up a bunch of puzzle levels. In that area it fell short. All of the puzzles are about a similar problem, one set of characters escaping another, and the stakes don't alter much either. This makes for a boring story.

The problem occurs at the world model-to-plot interface. That's a challenging area for just about any game in which the player cannot influence the plot directly, but has to change the world model in order to move forward. Often even a well-developed, systematic game mechanic only provides a consistent way for the verbs to modify the world model while the relationship between the world model and the story continues to rely on standard triggers. For instance, the traditional model for old-school text adventures is terrific at describing doors and containers, and setting up lots of puzzles about opening doors and containers, and providing sweet new door-and-container-related mechanics. Unfortunately, very few interesting stories are actually about the doors and containers themselves. This is just the easiest proxy we have for the idea of gradual discovery, just as evading the police is the easiest proxy Framed could come up with for the turning point of an action thriller.

*

A frequent design solution involves making the world model-to-plot relationship a sequence of many special cases. In scene X, you're fighting the villain, so coming up with something to distract her is the thing that will move the plot forward. In scene Y, you're trying to get away with the goods, so inventing a method of transport is the order of the day. The story is dominant, and it creates contexts in which some aspect of the world state is endowed with a particular, if temporary, plot meaning. For one specific scene, a routine action with your Portal gun becomes plot-altering rather than simply physical. At the extreme end are games that consist only of special cases: QTE-driven pieces like Heavy Rain, many passages in recent Telltale games, the majority of what is written in Twine. The meaning of a given action is highly situation-dependent, or else only situation-appropriate actions are available at all.

The method can give you a lot of dramatic power, though at the cost of the particular kind of agency Stacey Mason calls affect:

Diegetic agency allows us to make changes to the narrative. Affect, on the other hand, allows us to move through the space, swing a sword, or jump over an obstacle. The two are interrelated, but not synonymous. In Big Blue Box’s 2004 game Fable, the player controls a character in an action-RPG style fantasy. She may press buttons to swing a sword, cast spells, and so on. Performing each of these actions individually, I would argue, constitutes affect. The player may perform “good” or “evil” tasks, saving a man versus killing him for example, and her character and game experience will change according to that decision. I would argue that this type of choice is diegetic agency. Sometimes the two might coincide: a player might swing a sword to kill a man, thus exercising both affect and diegetic agency at the same time. -- Stacey Mason, On Games and Links: Extending the Vocabulary of Agency and Immersion in Interactive Narratives

The tightly-constrained, low-affect storytelling often works better in intense scenes interspersed through the plot, rather than all the time. In its most extreme form, even with a world model, this kind of thing becomes indistinguishable from choice-based interactive fiction: if the only things you can meaningfully do in a given scene are DROP GUN or SHOOT GUN, then, functionally, even a parser game or a first person shooter shares a lot with the

(a) Drop gun
(b) Shoot gun

presentation. The experience is not identical, but the level of constraint is obvious to the player.

Occasionally special-casing is precisely the point. One of the things I love about Sam Ashwell's interactive fiction Invisible Parties is how fiercely it embraces these special cases. It is all about particularness and variety and difference; its puzzles are also all about learning to apply really esoteric verbs ("USE TEXTUAL CRITICISM", for pity's sake) in various situations.

*

Clearly, though, there’s something enormously seductive about the idea of a game in which both affect (ability to fiddle with the world in systematic, predictable, plannable ways) and diegetic agency (ability to make narratively meaningful choices that affect the plot) are available consistently throughout much of the experience, rather than just occasionally in specialized one-off scenes.

I don’t at all think that this is the only way for an interactive story to be good, but I think it is a way that is highly coveted by a lot of people.

When I hear people talk about their dream of a holodeck experience, often what seems to lurk behind that dream is actually this desire. When Warren Spector keynoted the Inventing the Future of Games conference a couple of years ago, he talked about wishing for a game in which the NPCs would notice and react to small social gestures as well as grand moves, where spontaneously spilling a glass of water on someone would be read as part of the story. Others have fantasized to me about games where “everything matters,” environments in which they could explore freely and have their every gesture multiplied into unimaginably juicy, story-rich responsiveness from the world around them. A sandbox game, plus more story content than the collected Tolstoy. I suspect that many of the people saying this would find that a LARP or even a tabletop storygame actually provided a level of possibility that inhibited play (the “OMG uh what should I do?? I’m not thinking of something cool enough to do in this moment!!” factor), but the affect-plus-diegetic-agency aspect does seem to be a major part of what they want.

I can think of very few games, indie or commercial, hobbyist or AAA, IF or not, that come anywhere close to this. Make It Good, perhaps — and it’s a very difficult, very inaccessible kind of work, but one of the masterpieces of modern interactive fiction. Slouching Towards Bedlam is full of scenes in which diegetic agency is possible but the player is unlikely to realize it on the first playthrough; it’s only on replay that one recognizes how the story can be bent at those moments. It works because it’s modeling a specific kind of action to have decisive, supernatural influence on the story, but the player doesn’t know this initially. Façade observes the player’s behavior and triggers a lot of different outcomes depending on that behavior, but it does so in such a black-box, inscrutable way that it’s rare to feel remotely in control of the situation. Prom Week offers a lot of detail on a large social network and allows the player to play with it inventively, but there’s so much data available about how every character feels towards every other that it can be hard to master the playing field.

*

Part of the solution, as many of us have been saying for many years, is to make the world model (and thus the verbs available to the player) be about things that typically matter narratively, rather than things that typically don’t. Shooting people, when it happens in a story, is usually important, but not very many stories primarily turn on shooting. (A few, yes. But not most of them, not even in action movies.) Opening boxes and getting into rooms appear much more frequently in stories but are often so unimportant as not to be mentioned explicitly. Hence the need for conversation models, for ways of systematizing communication between characters. Communication is at the core of most stories, one way or another. (I’ve written lots about conversation modeling in the past, including modeling of moods, knowledge, conversation topics, the presence of multiple parties in a room, and so on.)

This is necessary, but it’s not sufficient, because it doesn’t delve into the question of how to design a game around a conversation model, and how to get both diegetic agency and affect out of one.

The system in the AI-driven narrative tool Versu is the most advanced I’ve worked on: it allowed characters to develop complicated opinions about one another, tracked factual knowledge that had been tagged as being significant to the story, coped reasonably with conversation in which characters could come and go, and allowed NPCs to have in-the-moment reactions to things much like Spector’s knocking-over-a-glass. Versu provided a library of small gestural behaviors and appropriate reactions to them. It had the granularity required to allow characters to get on one another’s nerves, or fall in love, or become friends, gradually. Chris Crawford talks often about the need for floating point variables to calculate the nuances of feeling that you could have towards another person, and while I think that’s not really the biggest issue, nonetheless the sense of build-up in degrees is important if the player is going to have a sense that all their actions in the game have mattered and been taken under consideration. (I say more elsewhere about numbers used in mood modeling and thresholds for NPC behavior.)
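To make the idea of build-up in degrees concrete, here is a minimal sketch of a floating-point feeling that small gestures nudge, with thresholds deciding which relationship label applies. This is not how Versu stored opinions; the `Opinion` class, the labels, and every number are assumptions invented for illustration.

```python
from dataclasses import dataclass

# Hypothetical thresholds: the lowest affinity at which each label applies,
# listed from warmest to coldest.
THRESHOLDS = [
    (0.75, "in love"),
    (0.4, "friend"),
    (-0.4, "acquaintance"),
    (-1.0, "irritated"),
]


@dataclass
class Opinion:
    """How one character feels about another, on a scale from -1.0 to 1.0."""
    affinity: float = 0.0

    def nudge(self, amount: float) -> None:
        # Clamp so a long run of gestures cannot push the value off the scale.
        self.affinity = max(-1.0, min(1.0, self.affinity + amount))

    def label(self) -> str:
        # Return the first (warmest) label whose floor the affinity reaches.
        for floor, name in THRESHOLDS:
            if self.affinity >= floor:
                return name
        return THRESHOLDS[-1][1]


opinion = Opinion()
labels = []
for _ in range(4):
    opinion.nudge(0.125)  # one small friendly gesture
    labels.append(opinion.label())
```

In this run the label stays at "acquaintance" for three gestures and tips over to "friend" on the fourth, which is the kind of gradual, cumulative change the paragraph above describes.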

But having a model is not the same as having a complete design solution that builds a story effectively around those moments. With Versu we also experimented with a number of different design approaches. The last release, Blood & Laurels, is built around scenes that might correspond to levels in a different sort of game. Each scene has several possible outcomes, and the outcomes depend on some aspect of the model being in a particular state. Some scenes end when a character has certain information. Some scenes end when a character is in a particular mood, or they’ve reached some relationship. Those outcomes are situation-dependent, but there can be many different ways for the player to reach particular mood or information outcomes. In addition, B&L wasn’t trying to do dynamic plotting: the high level of the story is a nodal diagram with predetermined nodes. You will never fall into a completely new scene with new stakes in Blood & Laurels; no character will ever formulate a new plan specifically in response to what you’ve done, because they’re not capable of goal-seeking at the level of narrative space.
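The scene structure described above can be sketched roughly as follows: each scene carries a list of outcomes, and each outcome is a condition on the model plus the name of the next node in the predetermined diagram. This is an illustration under stated assumptions, not Blood & Laurels' actual code; `WorldState`, `Outcome`, `Scene`, and the scene and character names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class WorldState:
    knows: set = field(default_factory=set)   # (character, fact) pairs
    mood: dict = field(default_factory=dict)  # character -> mood name


@dataclass
class Outcome:
    next_scene: str                           # a fixed node in the plot diagram
    condition: Callable[[WorldState], bool]   # the model state that triggers it


@dataclass
class Scene:
    name: str
    outcomes: list

    def check(self, state: WorldState) -> Optional[str]:
        """Return the next scene's name once any outcome condition holds."""
        for outcome in self.outcomes:
            if outcome.condition(state):
                return outcome.next_scene
        return None


# A made-up scene with one information-based and one mood-based ending.
banquet = Scene("banquet", [
    Outcome("arrest", lambda s: ("guard", "plot") in s.knows),
    Outcome("alliance", lambda s: s.mood.get("senator") == "trusting"),
])

state = WorldState()
before = banquet.check(state)        # no condition met yet
state.mood["senator"] = "trusting"   # reached by whatever route the player found
after = banquet.check(state)
```

The scene only inspects the resulting state, so it does not care which sequence of actions put the senator in a trusting mood. The set of next scenes is still fixed in advance, which is why this structure gives no dynamic plotting.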

The results are imperfect, even within those constraints. Last fall I gave a post-mortem (PowerPoint file with notes; PDF of slides without notes) about lessons learned from the Blood & Laurels feedback and what I’d like to experiment with next. But a lot of the things I talked about there were about user interface, the choice of affordances, the need for richer text generation to better represent the richness of the AI model, and other problems that belong to the code level.

I didn’t talk so much about the design issues around specific elements of the conversation/knowledge model, and I want to come back to that a little now.

*

Thinking very broadly about conversation mechanics, I find it useful to think about persuading one or more NPCs to do something: change their goals, carry out an action you want them to carry out, prevent an action, get out of the way of an action, etc. Many scenes of conventional drama take this form one way or another, whether it’s Hamlet’s father’s ghost persuading Hamlet to try to kill Claudius or Bruce Willis saying something really badass to a bunch of terrorists and making them run away in fear.

If we want the player to navigate with intention, we need them to understand what they can reasonably hope to persuade the characters to do or not do. In Prom Week, a character can choose to befriend or break up with other characters, for instance — there’s a very clear domain of high school-style interactions in play. In other types of story, the options are much more situational — in Blood & Laurels, the protagonist is often simply trying to manipulate other characters out of having a reason to kill him.

Then we have to offer the player a persuasion toolkit. For all its age, I find the Aristotelian breakdown of rhetoric into ethos, pathos, and logos still pretty useful.

Ethos: You might convince an NPC because you’ve established a strong positive relationship with them, or by appeal to some other authority they respect, whether that’s God, the government, local standards of etiquette, etc. A lot of dating sims work on these grounds; you build credibility with the NPC by doing the sorts of things they like to see people do, and the thing you’re convincing the NPC to do is date you.

Pathos: You might convince an NPC via emotional manipulation, gaining their sympathy or making them too afraid of you to resist or by triggering a state of heightened emotional vulnerability in which they’re not thinking straight.

Logos: You might give an NPC information that makes it evident they should change their approach.
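One way to picture the three categories as parts of a single model is sketched below: the NPC carries a slow-moving trust value (ethos), a volatile fear value (pathos), and a set of known facts (logos), and a check reports which appeals would currently move them. The `Npc` fields, the thresholds, and the example fact are assumptions made up for this sketch, not a description of any shipped system.

```python
from dataclasses import dataclass, field
from enum import Enum


class Appeal(Enum):
    ETHOS = "ethos"
    PATHOS = "pathos"
    LOGOS = "logos"


@dataclass
class Npc:
    trust: float = 0.0   # built slowly over time; drives ethos
    fear: float = 0.0    # spikes in the moment; drives pathos
    known_facts: set = field(default_factory=set)  # drives logos


def persuaded_by(npc: Npc, decisive_fact: str) -> list:
    """List the appeals that would currently move this NPC.

    The thresholds are invented; a real design would tune them per scene.
    """
    appeals = []
    if npc.trust >= 0.5:
        appeals.append(Appeal.ETHOS)
    if npc.fear >= 0.75:
        appeals.append(Appeal.PATHOS)
    if decisive_fact in npc.known_facts:
        appeals.append(Appeal.LOGOS)
    return appeals


guard = Npc(trust=0.25)
unmoved = persuaded_by(guard, "the emperor is dead")
guard.known_facts.add("the emperor is dead")  # the player tells him
moved = persuaded_by(guard, "the emperor is dead")
```

Here the guard does not yet trust or fear the player enough for either of those routes, but handing him the decisive fact opens the logos route immediately.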

Here’s a thing about ethos: it works on long scales. Respect, affection, and love aren’t gained instantly. Game models that focus primarily on ethos-based persuasion are often long-form pieces with cumulative stats (visual novels, long RPGs) that would require you to replay big chunks if you wanted to get a different outcome. It’s really hard to plan ahead around this: often the player needs to rely on the effects of a long friendship in some circumstance they couldn’t possibly have anticipated before they got there.

At the other end of the scale is logos, pieces that turn on knowledge-modeling. Information can be delivered more or less instantly, and new information can be, as it were, crafted out of existing information (see Detective Grimoire or Phoenix Wright). This works best if the types of information you’re using are systematic (there’s a reason these are mystery games with trope-defined concepts of “evidence”) rather than just a cluster of variable world-state facts. Likewise, you need the player to be able to remember all the key facts they’ve gathered, so the knowledge model probably needs not to be too enormous.
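The notion of crafting new information out of existing information can be sketched as a small rule table over a deliberately tiny fact set. In games like the ones mentioned, the player performs the combination by hand; the function below only computes what could be derived. The rules and fact names are hypothetical and not taken from any of those games.

```python
# Hypothetical deduction rules: a pair of known facts yields a new fact.
RULES = {
    frozenset({"muddy boots", "footprints in garden"}): "suspect was in garden",
    frozenset({"suspect was in garden", "window forced open"}): "suspect broke in",
}


def deduce(known: set) -> set:
    """Return the known facts plus everything derivable by reapplying RULES."""
    facts = set(known)
    changed = True
    while changed:  # keep going until a full pass adds nothing new
        changed = False
        for premises, conclusion in RULES.items():
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts


evidence = {"muddy boots", "footprints in garden", "window forced open"}
derived = deduce(evidence) - evidence
```

Because every rule has the same shape (two pieces of evidence in, one conclusion out), the player can learn the system once and plan with it, which is the advantage of systematic information over a loose cluster of world-state flags.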

What about pathos? In my experience, it’s the least reliable basis for building a mechanic. What moves people is, after all, often highly situational and difficult to build into a plan. You can, if you like, give players a blunt emotional instrument like “compliment Bob” or “insult Bob” — and Versu did experiment with this — but spamming the same few social gestures decreases both their plausibility (how often do you stand around telling someone a list of reasons you think they’re awesome?) and their emotional impact (if you do, how long before they stop saying “thanks” and start wondering what you want from them?). In Blood & Laurels we actually put a cooldown timer on some of those verbs just in order to de-spam-ify them. (Which then made players wonder why the affordances came and went as they did. More UI clarity needed.) In a couple of other prototypes, we had NPCs who were capable of going into “overload” states — where they were too sad or too angry to react normally, and would start doing different kinds of things instead — but this is harder to manage in the context of a really script-heavy story.
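Both devices mentioned above, the cooldown on social verbs and the NPC "overload" state, are simple to sketch. The code below is an assumption-laden illustration rather than the Blood & Laurels implementation: `SocialVerbs`, `VolatileNpc`, the three-turn cooldown, and the overload threshold are all invented.

```python
from dataclasses import dataclass, field


@dataclass
class SocialVerbs:
    """Tracks when each verb was last used so it cannot be spammed."""
    cooldown: int = 3  # turns that must pass before a verb can be reused
    last_used: dict = field(default_factory=dict)

    def available(self, verb: str, turn: int) -> bool:
        used = self.last_used.get(verb)
        return used is None or turn - used >= self.cooldown

    def use(self, verb: str, turn: int) -> bool:
        if not self.available(verb, turn):
            return False
        self.last_used[verb] = turn
        return True


@dataclass
class VolatileNpc:
    anger: float = 0.0
    overload_at: float = 1.0  # hypothetical point at which normal reactions stop

    def react(self, insult_strength: float) -> str:
        self.anger += insult_strength
        if self.anger >= self.overload_at:
            return "storms out"  # overload: a different kind of behavior
        return "retorts"


verbs = SocialVerbs()
results = [verbs.use("compliment Bob", turn) for turn in range(5)]

bob = VolatileNpc()
reactions = [bob.react(0.5), bob.react(0.5)]
```

With a three-turn cooldown the compliment succeeds on turns 0 and 3 only. Exposing `available` to the interface is one way the game could explain why an affordance has temporarily disappeared, which is the clarity problem noted above.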

*

Here’s my contention: there are a lot of games that have experimented with doing just one type of persuasion, but having all three available gives the author a lot more creative flexibility in terms of pacing and variable dramatic intensity. As complex as it may be to have a conversation model that includes all of those elements, having access to all of them is likely to produce game stories that are closer to what we recognize as well-formed narrative in other media; it may be better to have all three axes of persuasion but a relatively small verb set in each category than to have one axis of persuasion that is more densely populated.

At the same time, the ideal design of this kind would be one that systematically taught the player how to use their conversational toolkit, and made the possible NPC goal states clear, so that the player knew what to be aiming at.

And then players will have affect and diegetic agency at the same time, and kittens will be born from daffodils, and we will rule the Empire as father and son.
