
Game Decision Models


Two-Player Adversarial Games

Up to this point, our search models have operated under the assumption that the agent is the sole force affecting the environment. We now extend this to scenarios where other agents also act on the environment. The canonical case is the two-player adversarial game, in which an opponent deliberately works against the agent's goals.

Example Game State

In this example, the student goes first, attempting to form a vertical line.

PSSPSPPS
SPPSSSSP
SPSSSSPS
PSSSSPSP
PSSSSPSS
SSSSPSSS
SSSSPPPP
SPSSSSSP

Defining the Search Space

To analyze these games, we define the search space as follows:

  1. Initial state: the board configuration and which player moves first.

  2. Successor function: the legal moves available in a state and the states they lead to.

  3. Terminal test: determines when the game is over.

  4. Utility function: assigns a numeric outcome (e.g., win, loss, tie) to each terminal state.
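As a sketch, these components can be gathered into a small abstract interface. The class and method names below are assumptions chosen for illustration, not a standard API:

```python
class Game:
    """Abstract two-player game; concrete games override these methods."""

    def initial_state(self):
        """Starting board configuration plus the player to move."""
        raise NotImplementedError

    def actions(self, state):
        """Legal moves available in `state`."""
        raise NotImplementedError

    def result(self, state, action):
        """State reached by taking `action` in `state`."""
        raise NotImplementedError

    def is_terminal(self, state):
        """True when the game is over in `state`."""
        raise NotImplementedError

    def utility(self, state):
        """Numeric outcome of a terminal state for Max, e.g. +1/0/-1."""
        raise NotImplementedError
```

A concrete game (Tic-Tac-Toe, Mancala, the grid game above) subclasses this and fills in the five methods.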

The Minimax Strategy

We define two players: Max, who tries to maximize the final outcome, and Min, who tries to minimize it.

Max must devise a strategy to ensure victory. A strategy is a conditional plan: “First I will make move A. If Min responds with X, I will do Y; if Min responds with Z, I will do W.” This structure allows us to build an AND/OR tree representing the game flow.

The Minimax Algorithm

This logical structure forms the basis of the Minimax algorithm:

  1. Generate the entire game tree of possible moves.

  2. Evaluate the terminal states (leaves) and assign values (e.g., 0 for a tie, 1 for a win, -1 for a loss).

  3. Back these values up the tree: at each Min node take the minimum of the children's values, and at each Max node the maximum.

  4. Result: The values eventually reach the root. This value represents the best outcome Max can expect, assuming both players play optimally.

  5. Selection: Max chooses the move corresponding to the highest backed-up value.

Note: This relies on the assumption that the opponent plays perfectly. We also define a ply as one turn taken by one player.
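The steps above can be sketched directly in Python. For illustration the game tree is assumed to be a hand-built nested list whose leaves are terminal utilities from Max's perspective, rather than a tree generated from a real game:

```python
def minimax(node, maximizing):
    """Return the backed-up minimax value of `node`.

    A node is either a terminal utility (a number) or a list of child
    nodes; `maximizing` is True when it is Max's turn to move.
    """
    if isinstance(node, (int, float)):   # step 2: evaluate a terminal leaf
        return node
    # step 3: back the children's values up the tree
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Max moves at the root; each nesting level is one ply.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, True))  # 3: the best outcome Max can force against optimal play
```

Step 5 then corresponds to choosing the child whose backed-up value equals the root value (the first branch here).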

Properties of Minimax

Implemented as a depth-first search, Minimax is complete for finite game trees and optimal against an optimal opponent. Its time complexity is O(b^m) and its space complexity O(bm), where b is the branching factor and m the maximum depth of the tree.

The Problem of Scale

Searching the full tree is rarely feasible for complex games. In Chess, for example, the branching factor is about 35 and games often run to around 100 plies, giving a game tree on the order of 35^100 nodes.

To handle these scales, we modify the algorithm to cut off the search:

  1. Search only to a fixed depth.

  2. At each leaf node of this limited tree, evaluate the state using an evaluation function.

  3. Back these heuristic values up the tree as if they were true terminal values.
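These three steps can be sketched on a toy Nim game (players alternately remove one or two stones; whoever takes the last stone wins). The game and its heuristic are assumptions chosen so the example is self-contained:

```python
def children(pile):
    """Legal successor states: remove one or two stones."""
    return [pile - k for k in (1, 2) if pile - k >= 0]

def evaluate(pile, maximizing):
    """Heuristic from Max's perspective. In this Nim variant the player
    to move wins exactly when the pile is not a multiple of 3."""
    to_move_wins = pile % 3 != 0
    return 1 if to_move_wins == maximizing else -1

def h_minimax(pile, depth, maximizing):
    """Depth-limited minimax: cut off at `depth` and back up heuristic
    values as if they were true terminal utilities."""
    if pile == 0:              # terminal: the player to move lost
        return -1 if maximizing else 1
    if depth == 0:             # cutoff reached: apply the evaluation function
        return evaluate(pile, maximizing)
    values = [h_minimax(c, depth - 1, not maximizing) for c in children(pile)]
    return max(values) if maximizing else min(values)

print(h_minimax(5, 2, True))  # 1: Max, to move with 5 stones, can force a win
```

Here the heuristic happens to be exact, so any depth gives the right answer; in real games the heuristic is only an estimate and deeper search gives better play.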

The Evaluation Function

The key to success is the quality of the evaluation function: a heuristic that estimates how favorable a non-terminal state is for Max (e.g., for Checkers, Othello, Chess, or Mancala).
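For a Chess-like game, a common family of evaluation functions is a weighted material count. The board representation below (a string of piece letters, uppercase for Max) is an assumption for illustration; the weights are the traditional textbook piece values:

```python
# Traditional piece values: pawn 1, knight/bishop 3, rook 5, queen 9.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

def material_eval(board):
    """Evaluate `board` (a string of piece letters) from Max's perspective:
    uppercase pieces belong to Max, lowercase to Min."""
    score = 0
    for piece in board:
        value = PIECE_VALUES.get(piece.upper(), 0)  # unknown letters score 0
        score += value if piece.isupper() else -value
    return score

print(material_eval("QRPpn"))  # 9 + 5 + 1 - 1 - 3 = 11
```

Real evaluation functions layer positional terms (mobility, king safety, pawn structure) on top of material.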

Quiescence (When to Stop)

The cutoff should apply the evaluation function only to quiescent positions: ones whose value is unlikely to swing wildly within the next few moves (in Chess, for example, positions with no capture immediately available). Non-quiescent positions are searched further until quiescent ones are reached.


Alpha-Beta Pruning

Theoretically, the deeper we search, the better the program performs. If we can eliminate (prune) parts of the Minimax tree that do not affect the final decision, we gain time to search deeper in promising branches. However, we must ensure we never prune a path that could lead to an optimal solution.

Definitions

a (alpha): the best value Max can guarantee so far along the path to the root; it only increases during the search.

b (beta): the best (i.e., lowest) value Min can guarantee so far along the path to the root; it only decreases.

The Pruning Rule

Stop searching the children of a node (prune the sub-tree) when b <= a: at that point the node's value can no longer affect the decision at the root.

The Algorithm

alpha-beta(State s)
  ab-max(s, neg-inf, pos-inf)

ab-max(State s, val a, val b)
  if s is leaf
    return val(s)
  for each child c in children(s)
    v <- ab-min(c, a, b)
    if v > a
      a <- v
    if b <= a
      return a
  return a

ab-min(State s, val a, val b)
  if s is leaf
    return val(s)
  for each child c in children(s)
    v <- ab-max(c, a, b)
    if v < b
      b <- v
    if b <= a
      return b
  return b
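The pseudocode translates almost line-for-line into Python. As before, a leaf is assumed to be a plain number and an internal node a list of children:

```python
import math

def ab_max(node, a, b):
    if isinstance(node, (int, float)):   # leaf: return its value
        return node
    for child in node:
        a = max(a, ab_min(child, a, b))
        if b <= a:                       # Min already has a better option: prune
            return a
    return a

def ab_min(node, a, b):
    if isinstance(node, (int, float)):   # leaf: return its value
        return node
    for child in node:
        b = min(b, ab_max(child, a, b))
        if b <= a:                       # Max already has a better option: prune
            return b
    return b

def alpha_beta(node):
    """Entry point: search from `node` with an initially open (a, b) window."""
    return ab_max(node, -math.inf, math.inf)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alpha_beta(tree))  # 3, the same value plain Minimax computes
```

On this tree the second branch is pruned after its first leaf (2 <= a = 3), so Alpha-Beta reaches the same root value while evaluating fewer leaves.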

Efficiency Analysis

The effectiveness of Alpha-Beta pruning depends heavily on the order in which nodes are examined. If the best move is examined first at every node, Alpha-Beta evaluates only O(b^(m/2)) nodes rather than O(b^m), roughly doubling the depth reachable in the same time; with random move ordering the total is closer to O(b^(3m/4)) for moderate b.