Having published my video game AI reading list, I came to realize that there are two interrelated aspects of video game AI architectures that the academic and industry literature doesn't really talk about: namely, the scope and the frame of artificially intelligent agents. Specialists throw around terms like "squad controller" and "director AI", but never seem to set them in a broader context – possibly because that context is self-evident to them. I don't have that luxury, however, so I will try to present my current understanding of the two aspects and how they relate to each other.

Scopes

The "scope" of an artificially intelligent agent in a video game is a measure of how much power it can exert over its environment, or, more casually, how many game objects are under its control. It is important to understand that I view AI agents here as invisible and intangible game entities that tell actual physical objects in the game space how to behave, without necessarily being tied to them. For my model, I propose three basic scopes:

  1. Unit scope consists of a single game entity that an AI agent controls, usually a character. It can be a disposable entity as simple as a zergling in StarCraft or as complex as a Replica Soldier in F.E.A.R., or a simulation of another human player such as any computer-controlled brawler in Street Fighter. This is usually what most people immediately think about when "video game AI" is mentioned.
  2. Team scope consists of multiple game entities in a collaborative relationship. These entities may also be individually controlled by unit-scope agents, as with "squad managers" (like in F.E.A.R. and Dishonored 2), or they may not, as with strategy game AIs that coordinate all of their soldiers at once (like the aliens in XCOM: Enemy Unknown). As with unit-scope agents, multiple team-scope AIs can be active in a game at once.
  3. Level scope consists of the entirety of non-geometry and non-player elements of a game level, meaning that an AI in charge of it can spawn and direct enemies, items, and obstacles at runtime. The most famous example is the AI Director from the Left 4 Dead series, but the "Macro AI" in Alien: Isolation works in a similar way. While director AIs are associated with dynamic difficulty management, what they actually control is the pacing of a game, which makes them the closest thing video games have to a Dungeon Master at the moment. Unlike other scopes, there is usually at most one level-scope AI in a game.

A game's AI agents can control any combination of scopes: F.E.A.R. and Dishonored 2, for instance, feature unit- and team-scope agents, while Left 4 Dead and Alien: Isolation combine unit- and level-scope AIs. Theoretically, a combination of team- and level-scope agents is possible for a game like XCOM, where a director AI would control the frequency and severity of alien attacks, while a team manager would actually play said attacks out against the player.
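
The separation between intangible agents and the objects they control can be sketched in a few lines of Python. This is only an illustration of the model described above, not how any of the named games are implemented; the names `Scope`, `Entity`, `Agent`, `controlled`, and `GAME_SCOPES`, as well as the example soldiers, are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Scope(Enum):
    UNIT = "unit"    # a single game entity, usually a character
    TEAM = "team"    # multiple entities in a collaborative relationship
    LEVEL = "level"  # all non-geometry, non-player elements of a level


@dataclass
class Entity:
    """A physical object in the game space (hypothetical)."""
    name: str


@dataclass
class Agent:
    """An intangible controller; note that it is not itself an Entity."""
    scope: Scope
    controlled: list[Entity] = field(default_factory=list)


# Scope combinations as named in the text above.
GAME_SCOPES = {
    "F.E.A.R.": {Scope.UNIT, Scope.TEAM},
    "Dishonored 2": {Scope.UNIT, Scope.TEAM},
    "Left 4 Dead": {Scope.UNIT, Scope.LEVEL},
    "Alien: Isolation": {Scope.UNIT, Scope.LEVEL},
}

# The same entities can be under unit-scope and team-scope control at once.
soldiers = [Entity("soldier_a"), Entity("soldier_b")]
unit_agents = [Agent(Scope.UNIT, [s]) for s in soldiers]
squad_manager = Agent(Scope.TEAM, soldiers)
```

Here the squad manager and the unit agents refer to the very same `Entity` objects, which is the point of treating agents as controllers rather than as the characters themselves.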

Frames

I am not the first person to adapt Gary Alan Fine's threefold model of cognitive frames – social, ludic, and diegetic – which he observed in pen-and-paper role-players (as detailed in his book Shared Fantasy, chapter 6) to video games. Salen and Zimmerman incorporate it into their Play of Simulation schema, Montola builds his social framework of role-playing upon it, and Worch uses two parts of it to explore the player-avatar identification in video games. To the best of my knowledge, however, I am the first to apply it to artificially intelligent agents:

  1. Ludic (gaming) frame encompasses agents who are "playing to win" (to borrow the distinction introduced by Soren Johnson), i.e. to defeat the human players. This includes experimental AIs like Deep Blue and AlphaGo, enemies from F.E.A.R. that proactively look for ways to destroy your avatar, and player-imitating bots in online multiplayer games.
  2. Diegetic (narrative) frame encompasses agents who are "playing to lose", in Johnson's terms, or, more generously, have the purpose of facilitating the exact play experience the game's designer wanted the players to have. Most commonly, the purpose of diegetic AI behaviors is to sell the player on the illusion of interacting with living, possibly rational beings and groups, who look like they are trying to defeat you (if they are hostile at all).

Another way of thinking about it is that a ludic-frame AI is an opponent, while a diegetic-frame one is an obstacle. Note that this distinction is not as clear-cut as with scopes, seeing how "enemies who play to win" may well be "the exact play experience the designer wanted" (e.g. the relentless enemy assault in F.E.A.R. underscores the overall dread of its plot). You may also notice that Fine's social frame is missing from my model – mainly because game AIs currently don't persist outside of their respective games, although Jesse Schell talked about changing that as far back as 2013.

Overview

The most problematic part of my model is that scopes and frames don't fit together neatly in a grid. To the best of my current understanding, they actually come together somewhat like this:

Scope   Ludic frame                                             Diegetic frame
Unit    "Rival" (e.g. Deep Blue, AlphaGo, XCOM: Enemy Unknown)  "Character" (e.g. The Elder Scrolls V: Skyrim)
Team    "Rival" (cell shared with unit scope)                   "Coordinator" (e.g. F.E.A.R., Mark of the Ninja)
Level   "Director" (e.g. Left 4 Dead, Alien: Isolation; spans both frames)
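
Read as a lookup, the overview above could be written as a small Python function. This is a minimal sketch that assumes the "Rival" cell covers both unit and team scopes in the ludic frame and that the "Director" cell covers both frames at level scope; the names `Scope`, `Frame`, and `archetype` are hypothetical.

```python
from enum import Enum


class Scope(Enum):
    UNIT = "unit"
    TEAM = "team"
    LEVEL = "level"


class Frame(Enum):
    LUDIC = "ludic"
    DIEGETIC = "diegetic"


def archetype(scope: Scope, frame: Frame) -> str:
    """Return the agent archetype for a scope/frame pair."""
    if scope is Scope.LEVEL:
        # Level scope is not split by frame: pacing matters in both.
        return "Director"
    if frame is Frame.LUDIC:
        # Unit and team scopes share a single ludic cell.
        return "Rival"
    # Diegetic frame: unit scope and team scope get separate archetypes.
    return "Character" if scope is Scope.UNIT else "Coordinator"


# Print the archetypes row by row, in (ludic, diegetic) order.
for scope in Scope:
    print(scope.value, [archetype(scope, frame) for frame in Frame])
```

The early returns mirror why the model is not a clean grid: two of the three rows collapse at least one distinction.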

Some comments on individual combos:

  • Rival agents are there to defeat the human. Note how I make no distinction between unit and team scopes in the ludic frame, seeing how an AI that's playing to win has no need for a separation of individual and group behaviors – it will just micromanage every unit directly to maximize its chances of victory. An interesting exception is the online multiplayer bots in team-based games like Counter-Strike, which must display teamwork (a team-scope activity) while, in the interest of fairness, relying only on what individual agents (i.e. unit scope) detect or know a priori about the game.
  • Character agents exist to sell the illusion of a living being. They enact behaviors that maintain the player's suspension of disbelief and thus support the specific fiction that the game designer has set up for them. Rival and Character AIs are what people most commonly associate with the term "game AI".
  • Coordinator agents' purpose is to sell the illusion of coordination. They are needed because developers rarely achieve true emergent coordination with character agents alone, nor is it particularly efficient to burden a leader-type character with additional team-coordinating computations. Coordinator AIs comprise squad managers like in F.E.A.R. and Dishonored 2, but also dynamically spawned "situation controllers", e.g. when multiple guards react to a player sighting in Mark of the Ninja, and large crowd controllers, e.g. in recent Hitman games. Coordinators typically possess no physical presence in the game space, form "teams" out of adjacent character agents at runtime, and only assign roles or goals for characters' individual AIs to carry out, instead of micromanaging them, like a Rival agent would.
  • Director agents are there to maintain the pacing of gameplay. Fine's frames don't really apply to a level-scope agent, because its core responsibility (difficulty/pacing) has both ludic and diegetic significance.
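
The Coordinator behavior described in the list above (no physical presence, teams formed from adjacent characters at runtime, roles assigned rather than micromanaged) might look roughly like this in Python. It is a sketch under my own assumptions, not the approach of any game mentioned: the class names, the `radius` parameter, and the role names "investigate" and "cover" are all hypothetical.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Character:
    """A unit-scope character agent with a position and a current role."""
    name: str
    pos: tuple[float, float]
    role: str = "idle"


class Coordinator:
    """Intangible team-scope agent; it has no position of its own."""

    def __init__(self, radius: float):
        self.radius = radius  # hypothetical grouping distance

    def form_team(self, characters, sighting):
        # Recruit the characters adjacent to the event at runtime.
        return [c for c in characters if dist(c.pos, sighting) <= self.radius]

    def assign_roles(self, team, sighting):
        # Only hand out roles; each character's own AI decides how to act.
        ordered = sorted(team, key=lambda c: dist(c.pos, sighting))
        for i, c in enumerate(ordered):
            c.role = "investigate" if i == 0 else "cover"


guards = [
    Character("guard_a", (1.0, 0.0)),
    Character("guard_b", (3.0, 4.0)),
    Character("guard_c", (50.0, 50.0)),
]
sighting = (0.0, 0.0)
coordinator = Coordinator(radius=10.0)
team = coordinator.form_team(guards, sighting)
coordinator.assign_roles(team, sighting)
print([(g.name, g.role) for g in guards])
```

The nearest guard is told to investigate, the other nearby guard to cover, and the distant guard is left alone. A Rival agent, by contrast, would issue concrete movement and attack orders to every unit itself.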

Jesse Schell's ideas about an AI that persists across multiple installments of a series and develops a personal relationship with each individual player would perhaps extend this model into the social frame. I tentatively call it the Friend agent and believe that scope distinctions would be largely irrelevant to it, as it would exist and operate within the context of entire games and series.