g(node) + h(node). - Computing Science, Thompson Rivers University

Search Methodologies
Fall 2013
Comp3710 Artificial Intelligence
Computing Science
Thompson Rivers University
Course Outline


Part I – Introduction to Artificial Intelligence
Part II – Classical Artificial Intelligence


Knowledge Representation
Searching



Knowledge Represenation and Automated Reasoning





Search Methodologies
Advanced Search
Propositinoal and Predicate Logic
Inference and Resolution for Problem Solving
Rules and Expert Systems
Part III – Machine Learning
Part IV – Advanced Topics
TRU-COMP3710
Search Methodologies
2
Chapter Learning Outcomes







Decide if DFS (Depth First Search) and/or BFS (Breadth First Search)
is complete and/or admissible on a tree and/or a graph.
Define what an admissible heuristic is.
Write a pseudo A* algorithm.
Given two heuristics, decide which one is better.
Given expanded states within a running A* algorithm, decide the next
state to visit.
Design and implement a Java program for the 8-puzzle game, using an
A* algorithm.
Use of Hill Climbing for the 4-queens game.
TRU-COMP3710
Search Methodologies
3
Chapter Outline
















Introduction
Problem Solving as Search
Data-Driven and Goal-Driven Search
Generate and Test
Depth-First Search
Breadth-First Search
Properties of Search Methods
Why Humans Use Depth-First Search
Implementing Depth-First and Breadth-First Search
Example: Web Spidering
Depth-First Iterative Deepening
Using Heuristics for Search
Hill Climing
Best-First Search
Beam Search
Identifying Optimal Paths
TRU-COMP3710
Search Methodologies
4
Brute Force Search



Search methods that examine every node in the search tree – also
called exhaustive.
It does not say the search order.
Generate and Test is the simplest brute force search method:



Generate possible solutions to the problem.
Test each one in turn to see if it is a valid solution.
Stop when a valid solution is found.

The method used to generate possible solutions must be carefully
chosen.

[Q] Can brute force search find any solution?
[Q] What if there are so many possible solutions, i.e., states in the
search tree?
[Q] Any better ideas?


TRU-COMP3710
Search Methodologies
5
Depth-First Search (DFS)



Another exhaustive search method.
Follows each path to its deepest node, before backtracking to try the
next path.
[Q] A search traversal in the next example is ?
1
2
4
TRU-COMP3710
3
5
6
Search Methodologies
7
6
Breadth-First Search (BFS)



Another exhaustive search method.
Follows each path to a given depth before moving on to the next depth.
[Q] A search traversal in the next example is ?
1
2
4
TRU-COMP3710
3
5
6
Search Methodologies
7
7


It is important to choose the correct search method for a given
problem.
In some situations, for example, depth first search will never find a
solution, even though one exists.

[Q] Can you give an example of this case?
TRU-COMP3710
Search Methodologies
8
Properties of Search Methods
1.
Complexity

2.
Completeness

3.



A method is irrevocable if, like hill climbing, it does not ever back-track.
[Q] Assuming a tree with uniform edge values,


A method is optimal (or admissible) if it is guaranteed to find the best path
to the goal.
Is an admissible method is complete?
Is a complete method is admissible?
Irrevocability


A complete method is one that is guaranteed to find a goal if one exists.
Optimality & Admissibility

4.
How much time and memory the method uses.
Is DFS complete? Is DFS admissible? Is BFS complete? Is BFS admissible?
[Q] Assuming a tree with non-uniform edge values,
[Q] Assuming a graph with uniform edge values,
TRU-COMP3710
Search Methodologies
9
Implementations

Generic search algorithm:
or a graph.)
(Note that the search space can be a tree
Do
Select a node;
// using an evaluation function
Visit the node;
Test if the node is a goal, then stop;
// using a goal test function
Otherwise, expand the child nodes;

How to implement depth-first search?


Stack
How to implement breadth-first search?

Queue
TRU-COMP3710
Search Methodologies
10
Depth-First Iterative Deepening





An exhaustive search method based on both depth-first and breadthfirst search.
Carries out depth-first search to depth of 1, then to depth of 2, 3, and
so on until a goal node is found.
Efficient in memory use, and can cope with infinitely long branches.
Not as inefficient in time as it might appear, particularly for very large
trees, in which it only needs to examine the largest row (the last one)
once.
[Q] The search path in the next example is 1?
2
TRU-COMP3710
4
Search Methodologies5
3
6
117
Best-First Search




Picks the most likely path (based on heuristic value) from the
partially expanded tree at each stage.
Tends to find a shorter path than depth-first or breadth-first search,
but does not guarantee to find the best path in a graph.
[Q] What is the search path in the next example? Is the result the best
path?
[Q] What if the costs are all the same?
2
3
TRU-COMP3710
1
1.5
2
Search Methodologies
3
12
Beam Search







Breadth-first method.
Only expands the best few paths at each level.
Thus has the memory advantages of depth-first search.
Not exhaustive, and so may not find the best solution.
May not find a solution at all.
All the search methods so far have too large time complexity.
[Q] What do we have to do?
TRU-COMP3710
Search Methodologies
13
Heuristics


Heuristic: a rule or other piece of information that is used to make
methods such as search more efficient or effective.
In search, often use a heuristic evaluation function, f(n):


f(n) tells you the approximate distance (or cost) from a node, n, to a goal
node.
f(n) may not be 100% accurate, but it should give better results than
pure guesswork.
TRU-COMP3710
Search Methodologies
14
Heuristics – How Informed?


In general, the more informed a heuristic is, the better it will perform.
Heuristic h is called more informed than j, if:




h(n)  j(n) for all nodes n.
In general, a search method using h will search more efficiently than
one using j.
A heuristic should reduce the number of nodes that need to be
examined.
[Q] Can you solve the 8-puzzle game?
[Q] Which
one do you
want to move
first?


h1: the sum of Manhatan distances
h1(left) = 5
h2: the number of miss tiles
h2(left) = 4
Which one is more informed?
[Q] Can you find any good heuristic for the 8-puzzle game?
[Q] Can you decide which heuristic is better (or admissible)?
TRU-COMP3710
Search Methodologies
15
Optimal Paths





An optimal path through a tree is one that is the shortest possible path
from root to goal.
In other words, an optimal path has the lowest cost (not necessarily at
the shallowest depth, if edges have costs associated with them).
The British Museum algorithm is a general approach to find an optimal
path by examining every possible path, and selecting the one with the
least cost.
There are more efficient ways.
We need to use some good heuristics. [Q] Which ones then?
TRU-COMP3710
Search Methodologies
16
Heuristics – admissible

Definition: A heuristic h(n) is admissible if, for every node n,
h(n) ≤ h*(n), where h*(n) is the true cost to reach the goal state from n.

An admissible heuristic never overestimates the cost to reach the goal,
i.e., it is optimistic.
Example:




hSLD(n) (never overestimates the actual road distance) on a map
h(n) = 0 for all n
[Q] Can you give admissible heuristics for the 8-puzzle game?
h1: the sum of Manhatan distances
h1(left) = 5
h2: the number of miss tiles
h2(left) = 4
TRU-COMP3710
Search Methodologies
17
Optimal Path – A* Algorithms

Uses the cost function:





f(node) = g(node) + h(node).
g(node) is the cost of the path so far leading to the node.
h(node) is an underestimate of how far node is from a goal state. (This
is the definition of an admissible heuristic.)
f is a path-based evaluation function.
An A* algorithm expands paths from nodes that have the lowest f
value until the algorithm visits the goal node.

Theorem: If h(n) is admissible, A* using is optimal.

[Q] Which admissible heuristic is better?
[Q] When two admissible heuristics satisfy h1(n)  h2(n) for all
nodes n, which one is better?

TRU-COMP3710
Search Methodologies
18
Optimal Path – A* Algorithms

A* algorithms are optimal,
i.e., they are guaranteed to find the shortest path to a goal node,

provided h(node) is always an underestimate. (This is the definition of
an admissible heuristic.)
A* methods are also optimally efficient – they expand the fewest
possible paths to find the right one.
If h is not admissible, the method is called A, rather than A*.

Algorithm:

next node <- initial node;
while (next node != goal)
expand nodes;
select next node to visit;
TRU-COMP3710
Search Methodologies
// a non-visited node
// having the smallest f-value
19
Optimal Path – Uniform Cost Search

[Q] zero is an admissible heuristic?

Also known as branch and bound.
Like A*, but uses:





f(node) = g(node).
g(node) is the cost of the path leading up to node.
I.e., h(node) = 0.
Even when a goal node is found (at the child node expansion; still not
visited yet), the method needs to continue to run in case a preferable
solution is found, till a goal node is visited.
TRU-COMP3710
Search Methodologies
20

Let’s use the heuristic of Manhatan distances.
\

Let’s expand the current node, then





283, 104, 765
283, 164, 075
283, 164, 750
g = 1; h = 4; f = 5
g = 1; h = 6; f = 7
g = 1; h = 6; f = 7
283, 104, 765 (g=1) is selected and visited. Let’s expand this new node.






283, 164, 075
283, 164, 750
203, 184, 765
283, 014, 765
283, 140, 765
283, 164, 705
g = 1; h = 6; f = 7
g = 1; h = 6; f = 7
g = 1 + 1; h = 3; f = 5
g = 1 + 1; h = 5; f = 7
g = 1 + 1; h = 5; f = 7
previous ones in the queue
new ones
visited already
21

283, 104, 765 (g=1) is selected and visited. Let’s expand this new node.






g = 1; h = 6; f = 7
g = 1; h = 6; f = 7
g = 2; h = 3; f = 5
g = 2; h = 5; f = 7
g = 2; h = 5; f = 7
previous ones in the queue
new ones
283, 104, 765 (g=2) is selected and visited. Let’s expand this new node.








283, 164, 075
283, 164, 750
203, 184, 765
283, 014, 765
283, 140, 765
283, 164, 075
283, 164, 750
283, 014, 765
283, 140, 765
023, 184, 765
230, 184, 765
283, 104, 765
g = 1; h = 6; f = 7
g = 1; h = 6; f = 7
g = 2; h = 5; f = 7
g = 2; h = 5; f = 7
g = 2+1; h = 2; f = 5
g = 2+1; h = 4; f = 7
visited already
023, 184, 765 (g=3) is selected and visited. Let’s expand this new node.
22

023, 184, 765 (g=3) is selected and visited. Let’s expand this new node.








g = 1; h = 6; f = 7
g = 1; h = 6; f = 7
g = 2; h = 5; f = 7
g = 2; h = 5; f = 7
g = 3; h = 4; f = 7
g = 3+1; h = 1; f = 5
visited already
123, 084, 765 (g=4) is selected and visited. Let’s expand this new node.









283, 164, 075
283, 164, 750
283, 014, 765
283, 140, 765
230, 184, 765
123, 084, 765
283, 104, 765
283, 164, 075
283, 164, 750
283, 014, 765
283, 140, 765
230, 184, 765
023, 184, 765
123, 804, 765
123, 784, 065
g = 1; h = 6; f = 7
g = 1; h = 6; f = 7
g = 2; h = 5; f = 7
g = 2; h = 5; f = 7
g = 3; h = 4; f = 7
visited already
g = 4+1; h = 0; f = 5
g = 4+1; h = 2; f = 7
123, 804, 765 (g=5) is selected and visited. This is the goal node.
23
Optimal Path – Uniform Cost Search


From Arad to Bucharest
Using the heuristic hZERO
TRU-COMP3710
Search Methodologies
24













Arad(0): Zerind(75); Sibiu (140); Timisoara(118)
[Arad <-] Zerind(75): Oradea(75+71); Sibiu(140); Timisoara(118)
[Arad <-] Timisoara(118): Lugoj(118+111); Oradea(146); Sibiu(140)
[Arad <-] Sibiu(140): Oradea(140+151); Rimnicu(140+80); Fagaras(140+99); Lugoj(229); Oradea(146)
[Zerind <-] Oradea(146): Rimnicu(220); Fagaras(239); Lugoj(118+111)
[Sibiu <-] Rimnicu(220): Craiova(220+146); Pitesti(220+97); Fagaras(239); Lugoj(229)
[Timisoara <-] Lugoj(229): Mehadia(229+70); Craiova(366); Pitesti(317); Fagaras(239)
[Sibiu <-] Fagaras(239): Bucharest(239+211); Mehadia(299); Craiova(366); Pitesti(317)
[Lugoj <-] Mehadia(299): Dobreta(299+75); Bucharest(450); Craiova(366); Pitesti(317)
[Rimnicu <-] Pitesti(317): Bucharest(317+101); Dobreta(374); Bucharest(450); Craiova(366)
[Rimnicu <-] Craiova(366): Dobreta(366+120); Bucharest(418); Dobreta(374)
[Mehadia <-] Dobreta(374): Bucharest(418)
[Pitesti <-] Bucharest(418)
How many nodes are visited?
TRU-COMP3710
Search Methodologies
25
Optimal Path – more informed heuristic


From Arad to Bucharest
Using the heuristic hSLD
TRU-COMP3710
Search Methodologies
Additional information
26






Arad(0): Zerind(75+374); Sibiu (140+253); Timisoara(118+329)
[Arad <-] Sibiu(140): Oradea(140+151+380); Fagaras(140+99+176); Rimnicu(140+80+193);
Zerind(75+374); Timisoara(118+329)
[Sibiu <-] Rimnicu(220): Craiova(220+146+160); Pitesti(220+97+100); Oradea(291+380);
Fagaras(239+176); Zerind(75+374); Timisoara(118+329)
[Sibiu <-] Fagaras(239): Bucharest(239+211+0); Craiova(366+160); Pitesti(317+100);
Oradea(291+380); Zerind(75+374); Timisoara(118+329)
[Rimnicu <-] Pitesti(317): Craiova(317+138+160); Bucharest(317+101+0); Bucharest(450+0);
Craiova(366+160); Oradea(291+380); Zerind(75+374); Timisoara(118+329)
[Pitesti <-] Bucharest(418)
How many nodes are visited?
TRU-COMP3710
Search Methodologies
27






283164705(0+5): 283104765(1+4); 283164075(1+6); 283164750(1+6)
283104765(1+4): 203184765(2+3); 283014765(2+5); 283140765(2+5); 283164075(1+6);
283164750(1+6)
203184765(2+3): 023184765(3+2); 230184765(3+4); 283014765(2+5); 283140765(2+5);
283164075(1+6); 283164750(1+6)
023184765(3+2): 123084765(4+1); 230184765(3+4); 283014765(2+5); 283140765(2+5);
283164075(1+6); 283164750(1+6)
123084765(4+1); 123784065(5+2); 123804765(5+0); 230184765(3+4); 283014765(2+5);
283140765(2+5); 283164075(1+6); 283164750(1+6)
123804765(5+0)
[Q] How many times
do you need to move?
h1: the sum of Manhatan distances
h1(left) = 5
TRU-COMP3710
Search Methodologies
28

[Q] How to implement the 8-puzzle game?

A bit detail algorithm:
TRU-COMP3710
Search Methodologies
29
Optimal Path – Greedy Search

Like A*, but uses:

f(node) = h(node).

Hence, always expands the node that appears to be closest to a goal,
regardless of what has gone before.
We humans usually use this approach.

Not optimal, and not guaranteed to find a solution.



Can easily be fooled into taking poor paths.
[Q] Can you give an example?
TRU-COMP3710
Search Methodologies
30
Local Search Algorithms

In the previous problems, we know what a goal state is. However, in
many optimization problems, the path to the goal is irrelevant; the
goal state itself is the solution.


[Q] Can we make any good heuristic?
[Q] Can we use A* algorithms? Why?

State space = set of "complete" configurations
Find configuration satisfying constraints, e.g., n-queens. (We do not
have to consider of the order of move.)

In such cases, we can use local search algorithms.



Keep a single "current" state, and try to improve it.
[Q] Isn’t it the greedy approach?
TRU-COMP3710
Heuristic Search
31
Example: n-queens



Put n queens on an n × n board with no two queens on the same row,
column, or diagonal
[Q] What is a goal state? How do we solve?
[Q] Can we use a local search algorithm?
TRU-COMP3710
Heuristic Search
32
Hill Climbing




An informed, irrevocable local search method.
Easiest to understand when considered as a method for finding the
highest point in a three dimensional search space:
Check the height one foot away from your current location in each
direction; North, South, East and West.
As soon as you find a position whose height is higher than your current
position, move to that location, and restart the algorithm.
TRU-COMP3710
Search Methodologies
33
Hill Climbing – Foothills

Difficulties for hill-climbing methods.

A foothill is a local maximum.
TRU-COMP3710
Search Methodologies
34
Hill Climbing – Plateaus

Difficulties for hill-climbing methods.

Flat areas that make it hard to find where to go next.
TRU-COMP3710
Search Methodologies
35
Hill Climbing – Ridges

Difficulties for hill-climbing methods


B is higher than A.
At C, the hill-climber can’t find a higher point North, South, East or
West, so it stops.
TRU-COMP3710
Search Methodologies
36
Summary



A* algorithms with heuristics
Greedy search
[Q] When a goal is known?
Local search
[Q] When a goal is know?
[Q] How to improve?
TRU-COMP3710
Search Methodologies
37