
A* and Friends

Heuristics

Reading: 4.1, 4.2

Brute-force searches operate without regard to the specific nature of the problem; in pathfinding, for example, they can spend most of their time searching in entirely the wrong direction. We want a more intelligent approach: an algorithm that selects the next node to expand not by generic rules like depth, but by how likely the node is to lie on the correct path.

This requirement leads to a class of algorithms called “Best-First Search.” These algorithms function similarly to other search methods but leverage problem-specific knowledge to select the next node for expansion.

Best-First Generate and Test Algorithm

BestFirst(StartState s)
  while (isNotGoal(s))
    generate successors to s
    point successors back to s
    add successors to open
    add s to closed
    s <- best from open

The mechanism for estimating how likely a node is to be on the correct path is called a heuristic. Note that the timing of adding nodes to the closed list has been modified here relative to standard DFS or BFS; this distinction is important for implementation.

Defining Heuristics

We define a heuristic as a function, h(n), which takes a node in the search space as an argument and returns an estimate of the number (or cost) of the remaining steps to the goal. This allows the algorithm to select the next node to expand based, at least in part, on the h(n) value. In path planning, for example, a common heuristic is the straight-line distance to the goal.

In the context of AI, a “greedy search” selects the node with the smallest h(n) as the next node for expansion. While intuitive, this approach is not always complete or optimal.
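Greedy best-first search can be sketched as below. This is a minimal illustration, not a reference implementation: `successors`, `is_goal`, and `h` are assumed callables supplied by the caller, and a tie-break counter keeps the heap from ever comparing states directly.

```python
import heapq
import itertools

def greedy_best_first(start, is_goal, successors, h):
    """Greedy best-first search: always expand the open node with the
    smallest h(n).  `successors(state)` returns neighbor states."""
    tie = itertools.count()              # tie-breaker so states never compare
    open_heap = [(h(start), next(tie), start, [start])]
    closed = set()
    while open_heap:
        _, _, state, path = heapq.heappop(open_heap)
        if is_goal(state):
            return path
        if state in closed:
            continue
        closed.add(state)
        for succ in successors(state):
            if succ not in closed:
                heapq.heappush(open_heap, (h(succ), next(tie), succ, path + [succ]))
    return None                          # open list exhausted: no path found
```

Note that only h(n) orders the heap; the cost already paid to reach a node is ignored entirely, which is exactly why this strategy can be led astray.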

Finding Heuristics

A common example of a heuristic is using the Manhattan Distance for the 8-puzzle. In general, to find a heuristic, one often relaxes one of the problem’s strict rules.

For instance, in pathfinding, we might relax the rule requiring travel along the graph’s edges, allowing straight-line movement instead. In the 8-puzzle, we might relax the rule that two tiles cannot occupy the same space simultaneously. Often, the solution to a relaxed version of the problem serves as a good heuristic for the original, harder problem. However, one must be aware that if the heuristic is too costly to calculate, it may not be efficient to use.
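The Manhattan Distance heuristic for the 8-puzzle might be computed as follows, assuming states are length-9 tuples read row by row with 0 standing for the blank (a common but not universal encoding):

```python
def manhattan_distance(state, goal):
    """Sum, over tiles 1..8, of each tile's horizontal plus vertical
    distance from its goal position.  The blank (0) is ignored."""
    total = 0
    for tile in range(1, 9):                 # skip the blank
        i, j = state.index(tile), goal.index(tile)
        total += abs(i // 3 - j // 3) + abs(i % 3 - j % 3)
    return total
```

This is exactly the relaxed-problem idea: if tiles could slide through each other, each tile would need precisely its Manhattan distance in moves, so the sum never overestimates the true solution length.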


Greedy search has significant downsides. In the worst case, it suffers the same fate as Depth-First Search (DFS): it is neither optimal nor complete, and it can suffer O(b^m) time and space complexity. Conversely, Uniform Cost Search selects nodes based on g(n) (the cost incurred so far). While Uniform Cost Search is complete and optimal, it expands outward from the start in every direction, ignoring the goal entirely, so it is not efficient.

The solution is to combine the properties of both into a single cost function.

The A* Cost Function: f(n) = h(n) + g(n)

Since g(n) is the cost so far, and h(n) is the estimated remaining cost, f(n) represents the estimated total cost of the path through node n.
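A* itself is the best-first loop above with f(n) = g(n) + h(n) as the selection key. A minimal sketch, assuming `successors(state)` yields `(next_state, step_cost)` pairs and `h` is an admissible heuristic supplied by the caller:

```python
import heapq
import itertools

def a_star(start, is_goal, successors, h):
    """A*: expand the open node with the smallest f(n) = g(n) + h(n).
    Returns (cost, path) on success, None on failure."""
    tie = itertools.count()                  # tie-breaker so states never compare
    open_heap = [(h(start), next(tie), 0, start, [start])]
    best_g = {start: 0}                      # best g(n) seen; doubles as the closed list
    while open_heap:
        f, _, g, state, path = heapq.heappop(open_heap)
        if is_goal(state):
            return g, path
        if g > best_g.get(state, float("inf")):
            continue                         # stale entry; a cheaper path was found
        for succ, cost in successors(state):
            g2 = g + cost
            if g2 < best_g.get(succ, float("inf")):
                best_g[succ] = g2
                heapq.heappush(open_heap,
                               (g2 + h(succ), next(tie), g2, succ, path + [succ]))
    return None
```

Setting h(n) = 0 everywhere turns this into Uniform Cost Search, while dropping the g term recovers greedy search; A* sits between the two.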

Proofs and Assumptions

To prove the validity of A*, we must make specific assumptions about the heuristics used.

Assumption 1: Admissibility We assume that the heuristic h(n) never overestimates the actual cost to the goal from that node. No matter the value of h(n), we know h(n) <= h*(n) (where h* is the true cost). If this holds, h(n) is considered admissible. Consequently, f(n) never overestimates the actual cost of the best solution passing through n.

Assumption 2: Monotonicity We assume that for any pair of nodes n and n', where n is an ancestor of n', h(n) <= h(n') + c(a,n,n') where c(a,n,n') is the known cost of taking action a to move from n to n'. This is called monotonicity. The result of this is that f(n) <= f(n'). In general, a function is monotonic if it does not change direction (e.g., if it is increasing, it continues to increase or stays flat).
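For a finite, explicit graph, monotonicity can be checked directly by testing h(n) <= c(n, n') + h(n') on every edge. A small sketch, assuming the graph is an adjacency dict of `(successor, cost)` pairs and `h` is a dict of heuristic values:

```python
def is_consistent(graph, h):
    """Check monotonicity: h(n) <= c(n, n') + h(n') for every edge."""
    return all(h[n] <= cost + h[n2]
               for n, edges in graph.items()
               for n2, cost in edges)
```

Monotonicity (also called consistency) implies admissibility whenever h is zero at every goal, but the converse does not hold, which is why the pathmax fix below is sometimes needed.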

Proof of Optimality

  1. Assume two goal states, G1 and G2. Let G1 be the optimal goal state and G2 be a goal state on a non-optimal path.

  2. We know that f(G1) = C* (the optimal cost).

  3. Proof by Contradiction: Imagine that A* has selected G2 for expansion rather than G1.

  4. Therefore, there exists some node n which is an ancestor of G1 and is currently a leaf node on the A* search frontier. (Such a node must exist if G1 has not yet been expanded.)

  5. C* >= f(n) (by admissibility, since n lies on the optimal path to G1).

  6. f(G1) >= f(n) (by substitution).

  7. f(n) >= f(G2) (since G2 was selected for expansion instead of n).

  8. f(G1) >= f(G2) (by substitution).

  9. f(G2) = g(G2) + h(G2) (by definition).

  10. h(G2) = 0 (it is a goal node and the heuristic is admissible).

  11. f(G2) = g(G2).

  12. f(G1) >= g(G2) (by substitution).

  13. f(G1) = g(G1) + h(G1) (by definition).

  14. h(G1) = 0 (it is a goal node and the heuristic is admissible).

  15. f(G1) = g(G1).

  16. g(G1) >= g(G2) (by substitution).

  17. Contradiction: this says the cost of the optimal goal G1 is at least the cost of the non-optimal goal G2, i.e. g(G2) <= g(G1) = C*, which contradicts the assumption that G2 lies on a non-optimal (strictly more expensive) path.

  18. Q.E.D.

Proof of Completeness

A* is also complete, provided the branching factor is finite and every edge cost is at least some fixed epsilon > 0. Under those conditions, g(n) grows without bound along any infinite path, so only finitely many nodes can have f(n) <= C*; A* must therefore expand a goal node after finitely many expansions.

Monotonicity and Pathmax

We know g(n) increases as we traverse a path (assuming non-negative edge costs). If h(n) decreases by less than g(n) increases, then f(n) is increasing. However, we only know that h(n) is admissible (underestimates). We can easily construct an admissible h(n) that is not monotonic.

To address this, we use Pathmax. The true cost of any path through n is at least f(n), so if a child n' has f(n') < f(n), the true cost through n' is still at least f(n). We therefore track the maximum f-value seen along each path; if any node m has f(m) below that maximum, we set f(m) to the maximum. Since f then becomes non-decreasing along the path, the maximum is simply the parent's (corrected) f-value.
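The pathmax correction is a one-line adjustment applied when a child is generated. A sketch, with hypothetical names (`f_parent` is the parent's already-corrected f-value):

```python
def pathmax_f(f_parent, g_child, h_child):
    """Pathmax: never let a child's f-value drop below its parent's.
    Returns the corrected f-value for the child."""
    return max(f_parent, g_child + h_child)
```

The corrected value is an equally valid lower bound on the true cost, so admissibility is preserved while f becomes non-decreasing along every path.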

Fortunately, most natural heuristics are inherently monotonic. Using the concept of “relaxing a problem constraint” typically yields an admissible, monotonic heuristic.

Time & Space Complexity

In the worst case, A* is exponential in both time and space: it may store and expand O(b^d) nodes. The number of nodes expanded stays subexponential only when the heuristic's error grows no faster than the logarithm of the true cost to the goal; in practice, A* usually exhausts memory long before it exhausts time, which motivates the memory-bounded variants below.


IDA* (Iterative Deepening A*)

Just as iterative deepening gives the level-order behavior of Breadth-First Search (BFS) with only linear memory, we can apply the same technique to A*.

F-Cost Limited DFS

First, we define a DFS limited by f-cost rather than depth. Instead of stopping at a specific node depth, we stop when the f-cost of a node exceeds a limit. If the path is not found in the current iteration, the algorithm returns the smallest f-cost encountered that exceeded the current cutoff. This allows the cutoff to be increased by the exact necessary amount for the next iteration.

FLimitDFS(StartState s, Limit l)
  min <- infinity
  while ( s != null )
    if (f(s) <= l)
      if (isGoal(s))
        return path to s
      generate successors to s
      point successors back to s
      push successors on open stack
    else
      if (f(s) < min) min <- f(s)

    s <- pop next state from open

  // The smallest f-cost seen that exceeded Limit l
  return min

The IDA* Algorithm

IDA*(StartState s)
  l <- f(s)
  r <- null
  while (isNotPath(r))
    r <- FLimitDFS(s, l)
    if (r = infinity) return failure  // every path exhausted; no goal exists
    if (isNumber(r)) l <- r
  return r
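The two routines above can be collapsed into one recursive sketch. As with the A* sketch earlier, `successors(state)` yielding `(next_state, step_cost)` pairs and the callables `is_goal` and `h` are assumptions; the `succ in path` cycle check is a small addition over the pure tree-search algorithm.

```python
def ida_star(start, is_goal, successors, h):
    """IDA*: repeated f-cost-limited DFS.  Returns a path, or None."""
    def search(state, g, limit, path):
        f = g + h(state)
        if f > limit:
            return f                       # smallest over-limit f seen here
        if is_goal(state):
            return path
        minimum = float("inf")
        for succ, cost in successors(state):
            if succ in path:               # avoid cycling along the current path
                continue
            result = search(succ, g + cost, limit, path + [succ])
            if isinstance(result, list):
                return result              # path found; unwind immediately
            minimum = min(minimum, result)
        return minimum

    limit = h(start)                       # f(start), since g(start) = 0
    while True:
        result = search(start, 0, limit, [start])
        if isinstance(result, list):
            return result
        if result == float("inf"):
            return None                    # search space exhausted: no path
        limit = result                     # raise the cutoff by the exact amount needed
```

Each iteration stores only the current path plus recursion overhead, which is where the linear space bound comes from.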

Analysis of IDA*

IDA* works well in practice. It is complete, optimal, and linear in space complexity. Under the right conditions, it is O(b^d) in time.

It achieves linear space because FLimitDFS runs a standard DFS, storing only a linear amount of memory at any one time. It is complete and optimal because FLimitDFS expands new nodes (those not visited in previous iterations) in the same order as A* (increasing f-cost). If A* finds the optimal path, IDA* will as well.

In many situations, the time complexity remains O(b^d) for the same reason standard iterative deepening works: the final iteration dominates the runtime. However, potential issues can arise:

  1. Unique F-Costs: If every f-cost is unique, each iteration of FLimitDFS adds only one new node. This severely degrades performance.

  2. Tree Structure: Even if many nodes share f-costs, certain tree structures can degrade performance to Omega(b^d * lg(b^d)).

  3. Graph Loops: If the search space is a graph rather than a tree, there is no inherent loop checking. A* detects loops via the closed list, but IDA* may search loops repeatedly. In the worst case, this leads to Omega(2^(b^d)). Avoiding this while maintaining linear memory is fundamentally difficult.


SMA* (Simplified Memory-Bounded A*)

IDA* suffers from using too little memory, leading to the loop and regeneration issues described above. SMA* attempts to solve this by utilizing all available memory. There are a variety of algorithms for doing this, but we’ll talk about SMA* because the book calls it the “best”. Be aware that the book’s first author invented SMA*, so take that with a grain of salt.

The concept is to operate exactly like A* until memory is full. When memory is full and a new child needs to be generated, the algorithm must forget a node currently in memory to make room.

Algorithm Logic once Memory is Full

  1. Which node to delete? The algorithm selects the node least likely to be expanded (the one with the highest f-cost).

  2. Backing up values: We cannot forget the node entirely, or we might waste resources regenerating it later. We store the best f-value of the deleted descendant in its parent. Note that this applies to descendants, not just immediate children. If a node is deleted that already had a deleted child, the smaller of the two f-costs is backed up to its parent.

  3. Regeneration: If the search later explores nodes where the f-costs exceed this “backed-up” value, the algorithm knows a potentially better path exists in the previously deleted subtree and will regenerate it.
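The forget-and-back-up step might be sketched as follows. This is only a skeleton of one piece of the bookkeeping, not SMA* itself, and all names here (`nodes`, `parents`, `backed_up`) are hypothetical:

```python
def forget_node(nodes, parents, backed_up, node):
    """Forget `node` to free memory, backing its f-value up to its parent.
    `nodes` maps each in-memory node to its f-value; `backed_up[p]` records
    the best (smallest) f seen among p's forgotten descendants."""
    f = nodes.pop(node)                    # drop the node from memory
    parent = parents[node]
    # keep the smaller of any previously backed-up value and this one,
    # so deleted-grandchild values propagate correctly
    backed_up[parent] = min(backed_up.get(parent, float("inf")), f)
```

When a frontier node's f-cost later exceeds `backed_up[parent]`, the forgotten subtree is known to contain a cheaper candidate and is regenerated.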

Note: While conceptually simple, SMA* is difficult to implement correctly. The formal algorithm was notably removed from the 2nd edition of the standard textbook.