Recently in the Forum Ludorum seminar, we discussed the topic of interactive storytelling in games. Based on that discussion, I have summarized my current understanding of the topic below.

Basic Definitions

For the purpose of this essay, the terms "story" and "narrative" are synonymous with each other (but not with "plot" – see below) and are defined as a retrospective account of a sequence of causally and chronologically linked events (based on Simons 2007). Narrative is how we humans "make sense" of the world or, more formally, how we process, store, and share our in-the-moment experiences by reducing them to sequential reports. Because video games are a participatory, interactive medium, they generate in-the-moment experiences for players, but as authored media, developers also commonly insert curated narratives into them. This gives rise to two distinct kinds of narratives in games (first identified and named by Marc LeBlanc at GDC 2000):

  • Embedded narrative elements are authored by the developer and reproduced by the game in a more or less fixed sequence. An embedded narrative conforms to our definition of a "retrospective account" of events despite the player experiencing said events in-the-moment, because the developer has already retrospectively curated and composed them into a narrative to be shared. Authoring an embedded story has a lot in common with storytelling in traditional, non-interactive media (particularly, with film-making). We claim that the primary purpose of embedded narrative elements in a video game is to establish the context for its in-the-moment gameplay experiences and to convince the players to invest themselves emotionally in its outcome.
  • Emergent narrative is a retrospective account of the player's gameplay experience and, unlike the embedded narrative, does not exist until the player actively reflects on their experience (without necessarily sharing it with other players). Some sources further subdivide emergent narratives into "player-driven" and "procedural": a player-driven narrative is an account of the player's own actions and reactions during their experience, while a procedural one is an account of autonomous interactions between the game's rules and content that the player has perceived to be causally and chronologically linked. While player-driven narratives are fairly common in single- and multiplayer games, procedural narratives typically only emerge in games featuring intricately simulated worlds, interlocking game systems, and artificially-intelligent characters.

Procedural narratives are not to be confused with "computational narratives" (Yannakakis and Togelius 2014), which are authored at run time by the game's artificial intelligence, rather than (exclusively) by the player or by the developer. Computational narratives are, at the moment, a largely theoretical attempt to teach machines to recognize narrative threads in unstructured clusters of gameplay events and, in the long term, to allow machines to develop these narratives into dramatic conflicts, similar to how a hands-off game master would run a tabletop RPG campaign.

Finally, we define a dramatic narrative as one that is driven by a conflict between an individual character and an opposing force (often another character), which the audience (the player) is expected to emotionally invest themselves in.

Aspects of a Narrative

In this section, we present a framework for analyzing and improving video game narratives. Because of the fundamental differences in how embedded and emergent narratives are authored, however, two related but distinct toolsets are needed to model them. Our embedded narrative model is adapted from The Lorerunner's Six Points of Story and is rooted in the aforementioned claim that the primary purpose of embedded narrative is to establish context for gameplay interactions. Together, the following aspects make up the narrative design of a video game:

  • Plot is the overarching structure of a story, its central dramatic arc. In video games, plot usually concerns the driving "external" conflict(s) of the game. The existence of this authored overall structure is the main difference between embedded and emergent stories in video games.
  • Characterization comprises all details that make individual characters (whether player-controlled or not) in a game more distinct and memorable, with side characters generally receiving less attention than the main ones. Characters in video games are a topic of a separate study.
  • Character arcs are dramatic arcs concerning the "internal" conflicts and subsequent changes in individual (main) characters. This is distinct from characterization in that while a character arc usually requires some characterization, even a fleshed-out character may remain static through the story.
  • World building comprises all details that flesh out the setting of the game and its history. World building provides essential context for conflict in the story.
  • Themes are overarching concepts discussed by the story. Prominent central themes are particularly important in non-linear video games where they help bring multiple parallel subplots (e.g. side quests) into a coherent whole.

For emergent narratives, we can instead turn to a theory developed by the indie pen-and-paper community The Forge (Edwards 2001) to model fantasy role-playing as exploration of shared imaginary worlds. Borrowing from fantasy role-playing is appropriate because emergent narratives are arguably the main strength and purpose of this medium. Edwards' model states that dramatic narratives emerge from the Premise of a role-playing game, which consists of Character (who the story is about) and Setting (where and when it takes place), which together produce Situation (what kicks off the story); as well as System (how the shared fiction is negotiated by players) and Color (evocative details of the fantasy). A more technical look at emergent storytelling specifically in video games is given by Chauvin et al. and Ryan et al., among others.

Semiotic Building Blocks

An embedded narrative of a video game is created by the developer and mediated by the game to the player using a variety of signs in the semiotic sense (for a more in-depth introduction to Peircean semiotics and their relation to video games, see Aristov 2017). These can be subdivided into two broad categories: content and rules, or, in cybertextual terms, "textons" and "transition functions".

The content of a video game comprises its textual (where "text" is defined by Aarseth 1997 as "any object with the primary function to relay verbal information"), audio (sound), and visual components. A case can be made for including haptic (touch-based) and spatial content (such as 3D architecture) as separate sub-categories, but there are too few examples of haptic storytelling in commercial games to theoretically model them, while spatial signs are almost always mediated through visual display due to limitations of gaming hardware and are therefore rolled into that category.

The rules of a video game can also be subdivided into player controls (how the player interacts with the game state) and behaviors (how the game state evolves without player interaction). As tools for constructing embedded stories, they are unique to interactive media, but have been historically neglected by developers in favor of appropriating content-based storytelling techniques adopted first from literature and later from film. The term commonly used in game studies to describe narratively meaningful rules is "mechanics-as-metaphor" (cf. Portnow and Floyd 2012).

Textual Content

Even though the following content is referred to here as "textual", it includes both the visual representations of text (letters and words displayed on the screen) and audio representations (sound files played back on the speakers), and is subdivided into dialogue and flavor text. In terms of antecedent media, the former is rooted in theater, while the latter is typically literary in nature.

  • Dialogue is the textual representation of speech-based interactions. Dialogue content is usually tied to character constructs within the game, and most if not all of it involves the playable characters in some way. Dialogue can occur outside of gameplay (e.g. in cutscenes), as part of gameplay (e.g. dialogue trees), or during gameplay (e.g. idle enemy chatter).
  • Flavor text is any piece of descriptive text that is tied to inanimate objects, such as item descriptions, "codex" entries, in-game documents, audio logs, descriptions on geographic maps, mission briefings, and loading screens. Typically, flavor texts are not mandatory to read unless they contain crucial gameplay information like mission objectives.

Audio Content

Audio content comprises any sound files that convey information beyond the purely verbal.

  • Dialogue, when voiced by actors, is audial in addition to being textual, as voice acting can convey additional information, particularly characterization, through intonation and cadence of speech.
  • Ambient sounds convey information about the setting (world building) and important plot events. An example would be bird chatter suddenly stopping to inform the player of an enemy arrival.
  • Soundtrack typically refers to the background music playing during gameplay and cutscenes. Soundtrack is primarily used in games to convey mood and emotion associated with the current location in the game world, stage of the plot, or, more rarely, a particular character (their leitmotif).

Visual/Spatial Content

  • Character design, specifically the characters' appearance, is the main way of conveying characterization, while changes in appearance can show the progression of a character arc. Character design is also often the main promotion method for games with a strong character focus.
  • Animation comprises repeatable (gameplay) animations of both playable characters and NPCs (such as traversal, attacks, and idle animations), which can convey their characterization and character growth, but also cutscenes and scripted events, which often use cinematic techniques to advance the plot and to showcase world building.
  • Level design is the medium of environmental storytelling (cf. Brown 2020) wherein the spatial relationships between individual level elements and sub-areas convey additional information about the game world and its inhabitants, usually by implying a narrative of past events.
  • Stylization concerns how in-game objects are presented visually, rather than the objects themselves. It ranges from the stylized UI design, through stylized (e.g. cel-shaded) 3D visuals, to stylized camerawork (cf. Angkasa 2018). All of these can be used to convey the story's genre, themes, and overarching mood to the player.

Player Controls

"Controls" here are a subset of game rules that allow players to interact with the game's internal state. Note that the term "game mechanics" is ambiguous: Hunicke et al. use it to refer to all rules governing game behaviors, while Järvinen 2008 and Sicart 2008 specifically define it as player controls. In this section, we will use "mechanics" and "player controls" interchangeably.

  • Repeatable interactions are all controls that invoke consistent game behaviors. Cook 2012 uses the term "loops" to describe how repeated player actions and audiovisual feedback allow players to form and to refine a mental model of the game's hidden systems. Because said feedback remains consistent with player actions even when repeated indefinitely, such mechanics can be used to present a narrative of what is normal in the game world: what always happens, what has a chance of happening, and, most surreptitiously, what never happens under normal circumstances.
  • Exceptional mechanics (a term coined by Seraphine 2016) are intentional lapses in the consistency of player controls, inserted by the game designers to subvert the previously-established "normal" narrative. This is used to narrate some major change in the game world or in the player character (what Cook would describe as an "arc") by way of an enthymeme: "normally, when I do this, that happens, but now it doesn't, so something must have changed". As an illustration, Seraphine uses the example of a game temporarily disabling the "fire weapon" control to indicate that the player character – an otherwise stoic and unwavering soldier who has never before failed to fire his gun when the player pressed the button – cannot bring himself to shoot his friend.
  • Emotionally-aligned mechanics are perhaps the least-studied type of interactive storytelling, where gameplay mechanics are specifically designed to invoke the same emotional response in players as the narrative events invoke in the characters (without necessarily being the same type of activities). The best example would be the indie horror pen-and-paper RPG Dread, which uses a Jenga tower to effectively build up tension in players, just as it mounts up in their characters. By having players pull blocks from the tower any time their characters do anything dangerous, the player's fear of the tower toppling over superimposes itself onto their character's fear of injury and death, creating a much more meaningful experience.
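The contrast between repeatable interactions and exceptional mechanics can be sketched in a few lines of Python. This is only an illustration of Seraphine's "fire weapon" example under assumed names: the `Soldier` class, the `aiming_at_friend` flag, and the returned strings are all hypothetical and do not come from any actual game.

```python
class Soldier:
    """Toy player character with a single 'fire weapon' control."""

    def __init__(self, ammo: int) -> None:
        self.ammo = ammo
        # Hypothetical flag that a scripted plot event would set.
        self.aiming_at_friend = False

    def fire(self) -> str:
        # Exceptional mechanic: the same input suddenly yields a different
        # outcome, which the player reads as a change in the character.
        if self.aiming_at_friend:
            return "hesitates"
        # Repeatable interaction: consistent feedback that establishes
        # what is "normal" (shots fire while ammo lasts, then the gun clicks).
        if self.ammo == 0:
            return "click"
        self.ammo -= 1
        return "bang"
```

The exception only carries meaning because every earlier call to `fire()` behaved consistently; note that the refused shot also leaves the ammo count untouched, so the game state itself confirms that nothing mechanical went wrong.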

ADDENDUM 2021-03-24: Regarding exceptional mechanics, Hoge 2018 explains how Monolith's famed Nemesis System assembles procedural character arcs out of tightly-controlled exceptions to the consistent rules (which the game previously taught to the player). His argument that players mostly remember the rare exceptions to the normal gameplay goes a long way to answer the question posed in Portnow and Floyd 2014 of why moment-to-moment experience of interactive media appears to be much more forgettable than that of non-interactive fiction. The latter may be able to focus on exceptional events to tell a gripping story by piggybacking on the audience's preexisting notions of normality, while games must spend most of their playtime establishing said normality so that the players appreciate the rare unexpected events and remember them as a causally-linked story arc.

Procedural Behaviors

The final category encompasses rules, run-time behaviors that emerge from them, and all other systemic interactions that are used for narrative purposes. They are similar to repeatable interactions above, in that they establish a narrative of what is normal and what never happens, except that the player does not have to engage with them in any way other than observation and interpretation. Bogost 2007 has coined the term "procedural rhetoric" to describe how games-as-simulations make claims about the real world with their systemic interactions alone, but the same deliberation applies to "narrating" the realities of their own fictional worlds. Because these behaviors emerge from underlying rules and simulations, they blur the line between embedded and emergent storytelling.

  • Value-guided AI is a kind of video game artificial intelligence (see Aristov 2019a for an overview of the topic) that chooses the behaviors of non-player characters and computer-controlled opponents according to specific ethical and moral values authored by the designers, rather than game-theoretic optimization – in other words, it operates in the diegetic, rather than the ludic frame (see Aristov 2019b). Because each NPC's values can be set individually, the designer can use them to express their characterization and character arcs through behaviors, rather than just in their visual design, dialogue, etc. A classic example would be the Pac-Man ghosts having different movement patterns that are based on their authored personalities (Iwatani 2011).
  • Procedural environment covers any type of narrative change occurring in the scope of an entire level or region of the game world. This can be a change in the level elements in response to player actions (e.g. extra rat swarms in later levels of Dishonored if the player has caused too much chaos earlier), measurable differences in basic game rules across different locations (e.g. the wanted meter in Mafia III going up more slowly in response to crime in a predominantly black neighborhood), or even run-time manipulation of level design to conform with genre or themes (e.g. the AI Director in Left 4 Dead).
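As a rough sketch of the difference between value-guided and game-theoretic action selection, consider the following Python fragment. The action names, the "payoff" scores, and the value weights are all invented for illustration; a real game would derive them from its own systems.

```python
from typing import Dict

# Hypothetical actions, each scored on tactical payoff (ludic frame)
# and on how well it serves two authored values (diegetic frame).
ACTIONS: Dict[str, Dict[str, float]] = {
    "attack_wounded_player": {"payoff": 0.9, "honor": -1.0, "self_preservation": 0.2},
    "wait_for_fair_duel": {"payoff": 0.4, "honor": 1.0, "self_preservation": -0.2},
    "flee": {"payoff": 0.1, "honor": -0.5, "self_preservation": 1.0},
}


def ludic_choice(actions: Dict[str, Dict[str, float]]) -> str:
    """Pick the action with the best tactical payoff, ignoring character."""
    return max(actions, key=lambda name: actions[name]["payoff"])


def value_guided_choice(actions: Dict[str, Dict[str, float]],
                        values: Dict[str, float]) -> str:
    """Pick the action that best matches an NPC's authored value weights."""
    def score(name: str) -> float:
        return sum(weight * actions[name].get(value, 0.0)
                   for value, weight in values.items())
    return max(actions, key=score)


# Two NPCs that differ only in their authored values.
KNIGHT = {"honor": 1.0, "self_preservation": 0.1}
BANDIT = {"honor": 0.0, "self_preservation": 1.0}
```

With these made-up numbers, the optimizer always attacks the wounded player, while the knight waits for a fair duel and the bandit flees: the same situation yields different behaviors, and the difference itself is the characterization.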

Plot Progression in Video Games

Progression in video games refers to the player gaining incremental access to a game's content and mechanics, meaning that the player cannot experience certain parts of it until certain actions are performed, and is therefore synonymous with the term "exposition" as used by Kasavin 2010. Plot progression is thus a measure of how many of the embedded (authored) story building blocks the player has already accessed or can currently access. The appeal of progression systems lies in delayed gratification and the lusory attitude (Salen and Zimmerman), while for plot progression specifically, Aarseth's concept of intrigue ("a secret plot in which the [player] is an innocent, but voluntary, target... with several possible outcomes that depend on various factors", p.112) is also highly relevant.

There are two popular approaches and one rare approach to tracking plot progression (or, equivalently, referencing a particular plot moment) in video games:

  • Spatial progression ties the current stage of the plot to the player's position in the game world or, more formally, to the farthest position in the world the player can currently reach. This usually requires the world to be relatively immutable and unlocked region by region, strongly correlating plot progression with the size of the accessible portion of the game world. Plots with spatial progression tend either to be linear or to feature few major branches, and are commonly found in Metroidvanias, Souls-likes, and perpetually-static MMORPGs.
    • Level/mission sequence is a specific subcategory of spatial progression which disables backtracking to previously explored levels, so the player's plot progression is effectively specified by the current level ID and their local position relative to the level's (spatial) goal.
  • Causal or state-based progression instead stores an internal representation of the plot as a set of "event flags" that track which plot events the player has already experienced and which can occur next. This progression type allows (but does not require) plots to be extremely non-linear in their sequence of events, while still remaining fully authored by the designer. It is commonly found in open world games (e.g. The Elder Scrolls, Saints Row), where all "levels" are available for traversal from the start of the game, but individual plot events like dialogue and cutscenes must still be triggered in a specific narrative order.
  • Temporal progression is a rare variation where the current plot stage is tied to the time elapsed since the start of the game, as in non-interactive media. It is found in rhythm games and in "clockwork games" (Brown 2019) such as Outer Wilds, the adventure game The Last Express, and the immersive sim Pathologic.

The two main progression types (spatial and causal) are rarely used in isolation, e.g. many role-playing games tie plot events to the completion of dungeon levels (spatial progression), but grant access to said dungeon levels based on which plot event flags have been raised (causal progression).
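The interplay of causal and spatial progression described above can be sketched as a small Python model. All names here (`PlotEvent`, `Region`, `PlotState`, and the three-beat example plot) are hypothetical and chosen only for illustration; they are not taken from any particular engine or game.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional, Set


@dataclass(frozen=True)
class PlotEvent:
    name: str                               # doubles as the flag raised once the event has played
    requires: FrozenSet[str] = frozenset()  # flags that must already be raised (causal gate)
    location: Optional[str] = None          # region where the event fires; None means anywhere


@dataclass(frozen=True)
class Region:
    name: str
    unlocked_by: Optional[str] = None       # flag that opens the region; None means open from the start


@dataclass
class PlotState:
    flags: Set[str] = field(default_factory=set)

    def can_enter(self, region: Region) -> bool:
        """Causal progression gating spatial access."""
        return region.unlocked_by is None or region.unlocked_by in self.flags

    def available_events(self, events: List[PlotEvent], location: str) -> List[str]:
        """Unseen events whose prerequisites are met at the player's location."""
        return [
            e.name for e in events
            if e.name not in self.flags
            and e.requires <= self.flags
            and e.location in (None, location)
        ]

    def raise_flag(self, name: str) -> None:
        self.flags.add(name)


# Hypothetical three-beat plot, for illustration only.
EVENTS = [
    PlotEvent("meet_mentor", location="village"),
    PlotEvent("clear_dungeon", frozenset({"meet_mentor"}), "dungeon"),
    PlotEvent("finale", frozenset({"clear_dungeon"})),
]
DUNGEON = Region("dungeon", unlocked_by="meet_mentor")
```

In this sketch, a fresh `PlotState` cannot enter the dungeon and only offers "meet_mentor" in the village; raising that flag opens the dungeon, whose "clear_dungeon" event in turn unlocks the location-independent "finale". A purely spatial design would drop `requires` and key everything on `location`, while a purely causal one would drop `location` and `Region` altogether.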