Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Probability and Expectimax


Assumptions Thus Far

Up until this point, our exploration of games and search algorithms (such as Minimax) has focused on a specific subset of games. Specifically, we have looked at games that rely on two key assumptions:

  1. Deterministic: If an agent takes a specific action, the result is known and certain. A move on the board always results in the same resulting state.

  2. Perfect Information: The agent has access to the opponent’s complete state; there is no hidden information.

However, many real-world games do not fit these criteria. We must expand our algorithms to handle games that are:

Fundamentals of Probability

To handle stochastic environments, we must introduce concepts from probability theory.

Random Variables

A Random Variable represents an event with a set of possible outcomes. Examples include the roll of a die or the winner of a sports match.

Uncertainty vs. True Randomness

It is important to distinguish between true randomness and uncertainty.

Distributions

A probability distribution assigns a probability to each potential outcome. A fundamental rule is that the sum of probabilities for all possible outcomes must equal 1.

Expectation and the Expectimax Algorithm

Expectation

Expectation is a property of a distribution. When outcomes are numeric, the expectation represents the weighted average of those outcomes. To calculate the expected value, we multiply each outcome’s value by the probability of that outcome occurring and sum the results.

The formula for the expected value of a random variable is:

E[V]=ooutcomesP(V=o)×oE[V] = \sum_{o \in outcomes} P(V = o) \times o

The Expectimax Algorithm

How do we adapt our search trees when random events occur? We utilize the Expectimax algorithm.

In a standard search tree, we alternate between Maximizing and Minimizing layers. In Expectimax, we introduce a new type of node: the Expectation Node (or Chance Node).

From this setup, it looks a lot like MiniMax with slight changes:

Note: Probabilistic search can sometimes yield counter-intuitive results compared to strict Minimax. Strategies may favor a lower guaranteed score over a high-risk, high-reward option depending on the distribution.

Expecti-Minimax (Adversarial Games)

This approach can be applied to 2-player adversarial games involving chance, such as Backgammon. The tree structure alternates to account for the opponent:

  1. Max Node (Agent move)

  2. Chance Node (Dice roll/Environment)

  3. Min Node (Opponent move)

  4. Chance Node (Dice roll/Environment)

Regarding Pruning: While Alpha-Beta pruning is highly effective in Minimax, pruning in Expectimax trees is mathematically more complex. (See course problem set for details).

Advanced Concepts

Partial Observability

In games with hidden information, we cannot start from a known state.

Multiplayer Games

When more than two players are involved, the “Zero Sum” property often no longer applies.