(This page contains course notes for Artificial Intelligence taught in the
Autumn of 2019. If you have come here from a search engine or the like, you may
wish to visit the course home page for more
material, or visit my
home page
for possibly more up to date versions.)
State Space Search.
- Reading: Chapter 1, 3.0-3.6
- Intelligence is all about making good choices. Making good choices is
about considering alternatives, and the effects of the alternatives.
- The process of systematically considering alternatives is called "search"
- Hypothesis: All problems that require intelligence can be characterized as
a state space, and intelligence can be characterized as a search in that
space.
- State space- definition of a problem:
- State- a condition or mode of the problem.
- Initial state- the start state from which the program tries to solve the
problem.
- Set of operators- an operator is an action that can be taken within the
framework of the problem that changes the current state to some other valid
state in the problem.
- Goal state- the state we want the problem to be in.
- We're usually interested in the sequence of operators to get from the
initial state to the goal state (but not always).
- Examples:
- Find a route from Michelson to Ram's Head Tavern.
- Missionaries and Cannibals
- Rubik's Cube
- Traveling salesman (wiring).
- Robot path planning.
- Note that these problems result in a graph
Generate and test algorithm:
GenerateTest(StartState s)
while (isNotGoal(s))
generate successors to s
point successors back to s
add successors to open
add successors to closed
s ← SOME state from open
Generating the successors is called expanding the state.
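The pseudocode above can be sketched in Python; the function names, the back-pointer dictionary, and the choose parameter (which stands in for "SOME state from open") are illustrative choices, not part of the notes:

```python
def generate_and_test(start, successors, is_goal, choose):
    """Generate-and-test: repeatedly pick SOME state from open and expand it."""
    open_list = [start]
    parent = {start: None}           # back-pointers; doubles as the closed set
    while open_list:
        s = choose(open_list)        # selection policy left open, as in the notes
        open_list.remove(s)
        if is_goal(s):
            path = []                # walk the back-pointers to recover the path
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1]
        for c in successors(s):      # expanding the state
            if c not in parent:      # skip states already generated
                parent[c] = s        # point successor back to s
                open_list.append(c)
    return None
```

Choosing the first element of open gives FIFO (breadth-first) behavior; choosing the last gives LIFO (depth-first).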
8-puzzle example
This algorithm builds a search tree, made up of search nodes (which correspond
to states).
Note that state space does not equal search tree.
- The ability to go in a loop, or even just undo the previous operator, results
in a tree that is often much larger than the state space.
- Note that these trees and graphs are different from ones we are used to
because they are implicit. Instead of the entire graph being there in
memory, we only keep a portion of it around. It exists in its entirety only as
a mathematical idea implied by the start state and the operators.
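To make the implicit-graph idea concrete, here is a sketch of an 8-puzzle successor function: we store only the rule for generating neighbors, never the whole graph. The tuple encoding (9 ints, 0 for the blank) is one common choice, not prescribed by the notes:

```python
def successors(state):
    """Yield the states reachable by sliding one tile into the blank.
    A state is a tuple of 9 ints in row-major order; 0 is the blank."""
    i = state.index(0)               # position of the blank
    row, col = divmod(i, 3)
    moves = []
    if row > 0: moves.append(i - 3)  # tile above slides down
    if row < 2: moves.append(i + 3)  # tile below slides up
    if col > 0: moves.append(i - 1)  # tile to the left slides right
    if col < 2: moves.append(i + 1)  # tile to the right slides left
    for j in moves:
        s = list(state)
        s[i], s[j] = s[j], s[i]      # swap the blank with the moved tile
        yield tuple(s)
```

The entire state space (all 9!/2 reachable configurations) exists only as a mathematical idea implied by the start state and this function.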
Algorithm evaluation criteria:
- Completeness- is the algorithm guaranteed to find a solution if there is one?
- Optimality- does the algorithm necessarily find the optimal solution?
- Time complexity- how long does it take to find a solution? Usually
worst-case, but sometimes average case. We often look at the branching factor
and the depth: the branching factor is the average number of children of a
node, and the depth is the number of generations of nodes the search algorithm
looks through.
- Space complexity- how much memory does the algorithm use? In AI, this is
often more important than time complexity.
Brute-force Search
- Breadth-first
- Uniform-cost
- Depth-first
- Bi-directional
Breadth-first Search
- Expand the current node.
- Expand all the children.
- Repeat.
- Covers all nodes of depth 1, then of depth 2, etc.
BreadthFirstSearch(StartState s)
add s to closed
while (isNotGoal(s))
generate successors to s
For each successor not in closed
point successors back to s
add successors to closed
add successors to open queue
s ← next state from open queue
| open | closed | Comment |
| (A) | | Start at A |
| (B A) (C A) | (A B C) | Add A's neighbors |
| (C A) (E B A) (F B A) (G B A) | (A B C E F G) | New states go to the back of the queue |
| (E B A) (F B A) (G B A) (D C A) | (A B C D E F G) | G already known about |
| (F B A) (G B A) (D C A) (H E B A) | (A B C D E F G H) | and so on... |
| ... | ... | ... |
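The BFS pseudocode above might look like this in Python (names are illustrative; the parent dictionary of back-pointers doubles as the closed set):

```python
from collections import deque

def breadth_first_search(start, successors, is_goal):
    """BFS over an implicit graph; every state ever generated goes in parent."""
    parent = {start: None}          # back-pointers; doubles as the closed set
    queue = deque([start])          # the open queue
    while queue:
        s = queue.popleft()
        if is_goal(s):
            path = []               # walk back-pointers to recover the path
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1]
        for c in successors(s):     # expand s
            if c not in parent:     # skip states already in closed
                parent[c] = s       # point successor back to s
                queue.append(c)     # new states go to the back of the queue
    return None
```

The graph below is a hypothetical stand-in for the lettered graph traced in the table, since the notes don't give its full edge list.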
Criteria analysis of BFS
- Complete- yes. Why?
- Optimal- yes. Why?
- Assume a branching factor b.
- Assume the solution is at depth d.
- Assume the dominant time cost is in the generation of the nodes.
- Assume the dominant space cost is in the storage of the nodes.
- Then the time it takes to find a solution at depth d is a factor of the
number of nodes the algorithm generates to find that solution:
- If d=0, then the number of nodes is 1. If d=1, then the number of
nodes is 1+b, and if d=2 then the number of nodes is 1+b+b^2.
- In general this becomes, for depth d: 1 + b + b^2 + b^3 + ... + b^d.
- Because the last term of the polynomial dominates the rest, we say the
time is O(b^d).
- How many nodes must be kept in memory at one time? The entire frontier
(at least) must be kept in memory so the next level of the tree can be
generated: O(b^d).
- Both time and space complexity are exponential in the depth of the solution.
This is very, very nasty.
Given a branching factor of 10, 100,000 nodes/second, and 100 bytes/node:
| Depth | Number of Nodes | Time | Memory |
| 0 | 1 | .00001 seconds | 100 bytes |
| 2 | 111 | .001 seconds | 11 K |
| 4 | 11,111 | .11 seconds | 1 M |
| 6 | 10^6 | 10 seconds | 111 M |
| 8 | 10^8 | 17 minutes | 11 G |
| 10 | 10^10 | 28 hours | 1 terabyte |
| 12 | 10^12 | 116 days | 111 terabytes |
| 14 | 10^14 | 32 years | 11 petabytes |
What does this tell us? If the problem is big enough, then time can
be a serious constraint, BUT using BFS, the real problem is memory.
I commonly have problems that take an hour or two to run, but 11 G of RAM
is hard to come by. Serious problems are often worth waiting a week
or two to run, but where are you going to get a terabyte? And these
are only problems of depth 10.
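The table's entries follow directly from the stated assumptions; here is a quick sketch that reproduces a row (the function name is illustrative):

```python
# Reproducing the table's assumptions: branching factor 10,
# 100,000 nodes generated per second, 100 bytes stored per node.
b, rate, bytes_per_node = 10, 100_000, 100

def bfs_cost(d):
    """Total nodes generated, seconds, and bytes for a BFS to depth d."""
    nodes = sum(b**i for i in range(d + 1))   # 1 + b + b^2 + ... + b^d
    return nodes, nodes / rate, nodes * bytes_per_node

nodes, seconds, memory = bfs_cost(4)
print(nodes)     # 11111 nodes
print(seconds)   # about 0.11 seconds
print(memory)    # about 1.1 MB, matching the depth-4 row
```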
- Uniform Cost- just like breadth-first search, except instead of expanding
a node at the shallowest depth, you expand the node with the minimal cost
accrued so far, g(n).
UniformCost(StartState s)
while (isNotGoal(s))
add s to closed
generate successors to s
foreach c ∈ successors
if c ∉ closed
calculate g(c)
point c back to s
add c to open priority queue
s ← next state from open
| open | closed | comments |
| (0 A) | | empty closed list |
| (4 B A) (8 C A) | (A) | Only add A to closed |
| (7 E B A) (8 C A) (13 F B A) (18 G B A) | (A B) | E is closest |
| (8 D E B A) (8 C A) (12 H E B A) (13 F B A) (18 G B A) | (A B E) | found a shorter path to D |
| (8 C A) (9 C D E B A) (12 H E B A) (13 F B A) (14 G D E B A) (18 G B A) | (A B D E) | why don't we worry about that long path to C? |
| ... | ... | ... |
Note that we use the closed list differently. We can't just add a state to
the closed set when it is generated, because there is no guarantee that that
path is really the shortest. Take this small example:

[Figure: a small graph in which D can be reached from B with a total cost of
11, and from C along a shorter path.]

We expand B first, giving us D, with a total cost of 11. Next we
expand C, which should give us D along a shorter path, but if we
added D to the closed set when we generated it from B then we
would toss it out when we generated it from C. Thus we must add nodes
only when expanded.
Note that we could ignore the closed set entirely and never put anything on
it. That would not be a problem in the sense that if there are multiple copies
of a state on the open queue, the ones with the higher path costs would be
buried. Unfortunately, we could end up with multiple copies in the queue,
taking up space, and even if we still find the optimal path, we could end up
searching through pieces of the tree multiple times.
Same characteristics as breadth-first search, except it works better for
searches with costs.
This is the famous Dijkstra's algorithm.
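A sketch of uniform-cost search in Python, adding states to closed only when they are expanded, as discussed above (the names and the (cost, child) successor convention are illustrative):

```python
import heapq, itertools

def uniform_cost(start, successors, is_goal):
    """UCS: successors(s) yields (step_cost, child) pairs. States enter
    closed only when expanded, so a later, cheaper path can still win."""
    counter = itertools.count()          # tie-breaker so the heap never compares states
    open_heap = [(0, next(counter), start, None)]
    parent, closed = {}, set()
    while open_heap:
        g, _, s, p = heapq.heappop(open_heap)
        if s in closed:
            continue                     # a stale, costlier copy of s
        parent[s] = p                    # point s back along its cheapest path
        closed.add(s)                    # closed only on expansion
        if is_goal(s):
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return g, path[::-1]
        for cost, c in successors(s):
            if c not in closed:
                heapq.heappush(open_heap, (g + cost, next(counter), c, s))
    return None
```

The test below uses hypothetical costs matching the small example above: D via B totals 11, D via C is cheaper.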
Depth First Search- Always expand the node deepest in the tree.
DepthFirstSearch(StartState s)
while (isNotGoal(s))
generate successors to s (not on this path)
point successors back to s
push successors on open stack
s ← pop next state from open
- NO CLOSED LIST!
- This has interesting effects:
| open | comment |
| (A) | starts the same |
| (B A) (C A) | not exciting yet |
| (E B A) (F B A) (G B A) (C A) | C is getting buried on the stack |
| (H E B A) (D E B A) (F B A) (G B A) (C A) | Can't count on a particular expansion order |
| (K H E B A) (J H E B A) (G H E B A) (D E B A) (F B A) (G B A) (C A) | hey, this isn't the shortest path... |
| ... | ... |
- Optimal? No.
- Complete? Maybe. (Limit the depth?)
- Time complexity? O(b^d).
- Space complexity? O(d). This is important because space is always
such a problem.
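A recursive sketch of this DFS; keeping only the current path gives the O(d) space bound, and the "not on this path" test is the loop check from the pseudocode (names are illustrative):

```python
def depth_first_search(s, successors, is_goal, path=None):
    """DFS with no closed list: only the current path is stored,
    and successors already on this path are skipped to avoid loops."""
    path = (path or []) + [s]        # extend the current path
    if is_goal(s):
        return path
    for c in successors(s):
        if c not in path:            # "(not on this path)"
            found = depth_first_search(c, successors, is_goal, path)
            if found:
                return found
    return None                      # dead end; backtrack
```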
Depth First Iterative Deepening- An attempt to add optimality and completeness
to depth first while maintaining the same time and space complexities.
- Imagine a DFS with a maximal depth d; that is, the algorithm will treat any
node of depth d as a dead end.
- Examine this when the minimal solution depth s is:
- s < d: the DFS will find a solution, but it is not guaranteed
to be optimal.
- s > d: the DFS will not find a solution.
- s = d: the DFS will find a solution, and it is guaranteed
to be optimal.
- What we would like to do is a depth-limited DFS that is limited to the
minimal depth of the solution.
- If we don't know the depth of the solution, we will have to find it.
- Proposal: do a depth-limited DFS with a depth of 0. If we find
a solution, then we're done. If not, do a DFS to depth 1. If
we find a solution, then we're done, and the optimal solution is at depth
1. If not, repeat this process until we find a solution.
- If we find a solution at depth n, then we're sure it is the optimal solution,
because if it weren't, we would have found a solution when we searched
to depth n-1.
- We know we will find a solution because eventually we will do a depth-limited
DFS to the depth of the solution, and it will, given enough time, come
across that solution.
- We know the space complexity remains O(d) because at any one time we're
only doing a single DFS, with space complexity O(d).
- But wait! That's a whole lot of extra work, isn't it? What
has happened to our time complexity?
- Funny you should ask...
The time complexity for the last DFS search, at depth d, is:
b^d + b^(d-1) + b^(d-2) + b^(d-3) + ... + b^0,
at depth d-1 it is:
b^(d-1) + b^(d-2) + b^(d-3) + ... + b^0,
and at depth d-2:
b^(d-2) + b^(d-3) + ... + b^0.
When we sum these all up, we get:
b^d + 2b^(d-1) + 3b^(d-2) + 4b^(d-3) + ... + (d+1)b^0.
This is a polynomial with the dominant term being b^d.
From what we know of complexity analysis, therefore, the time complexity
is still O(b^d).
- DFID can be a big win when considering brute-force algorithms and the problem
is big.
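The proposal above, sketched in Python: a depth-limited DFS plus a loop over increasing limits (the names and the max_depth safety cap are illustrative):

```python
def depth_limited(s, successors, is_goal, limit, path=None):
    """DFS that treats any node at depth `limit` as a dead end.
    A path of length k represents a node at depth k-1."""
    path = (path or []) + [s]
    if is_goal(s):
        return path
    if len(path) > limit:            # node is at the depth limit: dead end
        return None
    for c in successors(s):
        if c not in path:            # skip states already on this path
            found = depth_limited(c, successors, is_goal, limit, path)
            if found:
                return found
    return None

def iterative_deepening(start, successors, is_goal, max_depth=50):
    """Run depth-limited DFS with limits 0, 1, 2, ...; the first
    solution found is at minimal depth, hence optimal."""
    for limit in range(max_depth + 1):
        found = depth_limited(start, successors, is_goal, limit)
        if found:
            return found
    return None
```

In the test graph below, plain DFS would find the longer path through B and X first; iterative deepening finds the shallower one through C.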
Bi-directional search
- The idea of bi-directional search is to reduce time while caring less about
space.
- Essentially, you run 2 searches: one from the start state forward and
one from the goal state backward, and when they meet in the middle, you
have a solution.
- Complete? Yes, if even one of the searches is complete.
- Optimal? Yes, depending on the searches involved. Can you name 2 searches
that could result in a non-optimal solution?
- Time? If we can promise that the searches meet in the middle, we
end up with 2 searches that each run to depth d/2: 2*b^(d/2) = O(b^(d/2)).
- Space? Depends on the searches. Could be O(d) for two depth-first
searches.
- Issues to resolve:
- Can we go backwards? Not in all problems can we figure out the previous
state from the current one.
- What if there are multiple goal states? What if the goal states are
implicit?
- Can we efficiently check if a given node is part of the frontier of the
other search? Yes, using a hash table. But then the frontier
must be stored in that hash table. In order to guarantee that the
2 searches will meet at all, at least 1 must be BFS, thus the table size
is b^(d/2). Therefore, for any serious implementation of bi-directional
search, the space complexity is O(b^(d/2)).
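A sketch of bi-directional search with two BFS frontiers, each stored in a hash table (a dict of back-pointers) so frontier-membership checks are cheap, as discussed above. It assumes operators are reversible, so the same neighbor function serves both directions; all names are illustrative:

```python
from collections import deque

def bidirectional_bfs(start, goal, neighbors):
    """Grow a BFS tree from each end; stop when the frontiers meet."""
    if start == goal:
        return [start]
    parents = {start: None}            # tree grown forward from the start
    children = {goal: None}            # tree grown backward from the goal
    fwd, bwd = deque([start]), deque([goal])
    while fwd and bwd:
        # expand one node from each direction per round
        for queue, tree, other in ((fwd, parents, children),
                                   (bwd, children, parents)):
            s = queue.popleft()
            for c in neighbors(s):
                if c in other:         # the two frontiers have met at c
                    tree[c] = s
                    return _join(c, parents, children)
                if c not in tree:
                    tree[c] = s
                    queue.append(c)
    return None

def _join(meet, parents, children):
    """Splice the two half-paths together at the meeting state."""
    path = []
    s = meet
    while s is not None:               # walk back to the start
        path.append(s)
        s = parents[s]
    path.reverse()
    s = children[meet]
    while s is not None:               # walk on to the goal
        path.append(s)
        s = children[s]
    return path
```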
Partial Information
- Sensorless- we don't know what state we're in! Use sets of states, called
belief states.
- Contingency- What if some action doesn't work? Uncertain results of
actions.
- Exploration- We don't even know how the environment works! We'll have to
explore first....