Lecture 5:
On-line search
Constraint Satisfaction Problems
Reading: Sec. 4.5 & Ch. 5, AIMA
Rutgers CS440, Fall 2003
Recap
• Blind search (BFS, DFS)
• Informed search (Greedy, A*)
• Local search (Hill-climbing, SA, GA)
• Today:
– On-line search
– Constraint satisfaction problems
On-line search
• So far, off-line search:
– Observe environment, determine best set of actions (shortest path)
that leads from start to goal
– Good for static environments
• On-line search:
– Perform action, observe environment, perform, observe, …
– Dynamic, stochastic domains: exploration
– Similar to local search, but to reach all successor states the agent has to try all actions
Depth-first online
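• An online agent can only expand the state it physically occupies; it cannot jump from the current state to a very distant one, as A* does when it moves between frontier nodes
• Similar to DFS
– Try unexplored actions
– Once all actions have been tried and the goal is not reached, backtrack to the most recent previous state that has not been backtracked to yet
– Backtracking is a physical move, so actions must be reversible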
DFS-Online
function action = DFS-Online(percept)
  % sp, ap: previous state and action, kept between calls (initially empty)
  s = GetState(percept);
  if GoalTest(s) return stop;
  if NewState(s) unexplored[s] = Actions(s);
  if NonEmpty(sp)
    Result[sp,ap] = s;
    Push( unbacktracked[s], sp );
  end
  if Empty(unexplored[s])
    if Empty(unbacktracked[s]) return stop;
    else
      % go back: choose the action that leads to the popped state
      a = the action b with Result[s,b] = Pop( unbacktracked[s] );
    end
  else
    a = Pop( unexplored[s] );
  end
  sp = s; ap = a;
  return a;
end
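The agent above can be sketched in Python as follows. This is only an illustration under stated assumptions: the 3x3 maze, the `actions` and `step` helpers, and the class name are hypothetical and not part of the lecture. One deliberate deviation from the pseudocode: a predecessor is pushed onto `unbacktracked` only the first time a transition is observed, which keeps the agent from bouncing between two states forever when no goal is reachable.

```python
# Hypothetical maze: '#' is a wall, 'S' the start, 'G' the goal.
MAZE = ["S..",
        "#.#",
        "G.."]
MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}

def actions(s):
    """Moves that stay on the grid and do not run into a wall."""
    out = []
    for a, (dr, dc) in MOVES.items():
        r, c = s[0] + dr, s[1] + dc
        if 0 <= r < len(MAZE) and 0 <= c < len(MAZE[0]) and MAZE[r][c] != "#":
            out.append(a)
    return out

def step(s, a):
    """The environment: the agent learns the outcome only by acting."""
    return (s[0] + MOVES[a][0], s[1] + MOVES[a][1])

class OnlineDFSAgent:
    def __init__(self):
        self.result = {}         # (state, action) -> observed next state
        self.unexplored = {}     # state -> actions not tried yet
        self.unbacktracked = {}  # state -> predecessors not backtracked to yet
        self.sp = self.ap = None # previous state and action

    def act(self, s, at_goal):
        if at_goal:
            return None
        if s not in self.unexplored:              # new state
            self.unexplored[s] = actions(s)
            self.unbacktracked[s] = []
        if self.sp is not None and (self.sp, self.ap) not in self.result:
            self.result[(self.sp, self.ap)] = s   # record a newly seen transition
            self.unbacktracked[s].append(self.sp)
        if self.unexplored[s]:
            a = self.unexplored[s].pop()
        elif self.unbacktracked[s]:
            target = self.unbacktracked[s].pop()
            # Assumes reversible actions, so some tried action leads back.
            a = next(b for b in actions(s) if self.result.get((s, b)) == target)
        else:
            return None                           # nothing left to try
        self.sp, self.ap = s, a
        return a

def run(start=(0, 0)):
    """Alternate acting and observing until the agent stops."""
    agent, s, path = OnlineDFSAgent(), start, [start]
    while True:
        a = agent.act(s, MAZE[s[0]][s[1]] == "G")
        if a is None:
            return path
        s = step(s, a)
        path.append(s)
```

Calling `run()` returns the sequence of cells the agent physically visits, including the detours into both dead ends.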
Maze
[Figure: maze with start state s0 and goal G]
Online local search
• Hill-climbing…
– Is an online search!
– However, it does not have any memory.
– Can one use random restarts?
• Instead…
– Random wandering by choosing random actions (random walk): in a finite space it eventually visits all points
– Augment HC with memory: keep an estimate H(s) of the cost to reach the goal from each visited state
and update it as the search proceeds
– LRTA* - learning real-time A*
LRTA*
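function action = LRTA*(percept)
  % sp, ap: previous state and action, kept between calls (initially empty)
  s = GetState(percept);
  if GoalTest(s) return stop;
  if NewState(s) H[s] = h(s);
  if NonEmpty(sp)
    Result[sp,ap] = s;
    H[sp] = min over b ∈ Actions(sp) of Cost(sp,b);
  end
  a = arg min over b ∈ Actions(s) of Cost(s,b);
  sp = s; ap = a;
  return a;
end
% Cost(s,b) = c(s,b,Result[s,b]) + H[Result[s,b]]  if Result[s,b] is known,
%           = h(s)                                  otherwise (untried actions are judged optimistically)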
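A small Python sketch of LRTA* on a one-dimensional corridor, meant only as an illustration. The five-state world, the unit step cost, the initial estimates in `H0`, and all function names are assumptions made for this example. Untried actions are costed at h(s), the usual optimistic choice.

```python
GOAL = 4
# Hypothetical admissible estimates with a local minimum at state 1.
H0 = {0: 2, 1: 1, 2: 2, 3: 1, 4: 0}

def step(s, b):
    """Environment: move one cell left or right."""
    return s + (1 if b == "R" else -1)

def actions(s):
    return [b for b in ("L", "R") if 0 <= step(s, b) <= GOAL]

class LRTAStarAgent:
    def __init__(self, h):
        self.h = h            # initial heuristic h(s)
        self.H = {}           # learned cost estimates H[s]
        self.result = {}      # (state, action) -> observed next state
        self.sp = self.ap = None

    def cost(self, s, b):
        nxt = self.result.get((s, b))
        if nxt is None:
            return self.h[s]                       # untried action: be optimistic
        return 1 + self.H.get(nxt, self.h[nxt])    # unit step cost (assumption)

    def act(self, s):
        if s == GOAL:
            return None
        self.H.setdefault(s, self.h[s])
        if self.sp is not None:
            self.result[(self.sp, self.ap)] = s
            # Update the estimate of the state we just left.
            self.H[self.sp] = min(self.cost(self.sp, b) for b in actions(self.sp))
        a = min(actions(s), key=lambda b: self.cost(s, b))
        self.sp, self.ap = s, a
        return a

def run(start, max_steps=100):
    agent, s, path = LRTAStarAgent(H0), start, [start]
    for _ in range(max_steps):
        a = agent.act(s)
        if a is None:
            break
        s = step(s, a)
        path.append(s)
    return path, agent.H
```

Starting from state 1, the agent first wanders around the local minimum, raises the estimates there, and then moves on to the goal. Because `H0` is admissible, the learned estimates never exceed the true distances.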
Maze – LRTA*
[Figure: maze with start state s0 and goal G; cells are labeled with numeric cost estimates]
Constraint satisfaction problems - CSP
• Slightly different from standard search formulation
– Standard search: abstract notion of state + successor function +
goal test
• CSP:
– State: a set of variables V = {V1, …, VN} and the values that can be assigned to them, specified by their domains D = {D1, …, DN}, Vi ∈ Di
– Goal: a set of constraints on the values that combinations of variables can take
• Examples:
– Map coloring, scheduling, transportation, configuration, crossword
puzzle, N-queens, minesweeper, …
Map coloring
• Variables: { WA, NT, SA, Q, NSW, V, T }
• Domains: Di = { R, G, B } for every variable
• Goal / constraint: adjacent regions must have different colors
• Solution: { (WA,R), (NT,G), (SA,B), (Q,R), (NSW,G), (V,R), (T,G) }
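The solution above can be checked mechanically. A minimal sketch, assuming the usual adjacency of the Australian regions (the edge list is not spelled out on the slide, so it is an assumption here, as are the names `EDGES` and `violated`):

```python
# Assumed adjacency between regions; T (Tasmania) has no neighbors.
EDGES = [("WA", "NT"), ("WA", "SA"), ("NT", "SA"), ("NT", "Q"), ("SA", "Q"),
         ("SA", "NSW"), ("SA", "V"), ("Q", "NSW"), ("NSW", "V")]

SOLUTION = {"WA": "R", "NT": "G", "SA": "B", "Q": "R",
            "NSW": "G", "V": "R", "T": "G"}

def violated(assignment):
    """Return the adjacent pairs that share a color."""
    return [(a, b) for a, b in EDGES if assignment[a] == assignment[b]]

print(violated(SOLUTION))                 # no violated constraints
print(violated(dict(SOLUTION, NT="B")))   # NT now clashes with SA
```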
Constraint graph
• Nodes: variables, links: constraints
[Figure: constraint graph with nodes WA, NT, SA, Q, NSW, V, T]
Types of CSPs
• Discrete-valued variables
– Finite: O(d^N) assignments, d = |D|
• Map coloring, Boolean satisfiability
– Infinite: D is infinite, e.g., strings or natural numbers
• Scheduling
• Linear constraints: aV1 + bV2 < V3
• Continuous-valued variables
– Functional optimization
– Linear programming
Types of constraints
• Unary: V ≠ blue
• Binary: V ≠ Q
• Higher order: three or more variables
• Preferences: different values of V have different “scores”
CSP as search
function assignment = NaiveCSP(V,D,C)
1) Initial state = {};
2) Successor function: assign value to unassigned
variable that does not conflict with current assignments.
3) Goal: Assignment complete?
end
• Generic: fits all CSPs
• Depth N, N = number of variables
• DFS, but path irrelevant
• Problem: # leaves is N!·d^N ≫ d^N
• Luckily, we only need to consider one variable per depth: assignment is commutative.
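Where the N!·d^N comes from: at depth k the naive formulation may pick any of the N − k unassigned variables and any of d values, so the branching factor is (N − k)·d. A short sketch that multiplies this out (function names are illustrative only):

```python
from math import factorial

def naive_leaves(n, d):
    """Leaves of the naive search tree: product of (n - k) * d over all depths k."""
    leaves = 1
    for depth in range(n):
        leaves *= (n - depth) * d
    return leaves

def fixed_order_leaves(n, d):
    """Leaves when exactly one variable is considered per depth."""
    return d ** n

# Map coloring: N = 7 variables, d = 3 colors.
print(naive_leaves(7, 3), fixed_order_leaves(7, 3))
```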
Backtracking search
• DFS that assigns a single variable at each level and backtracks when no legal value remains
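A minimal sketch of backtracking search for the map-coloring problem. It assumes binary "different color" constraints and the usual adjacency of the Australian regions; the names `EDGES`, `NEIGHBORS`, and `backtracking_search` are chosen for this example.

```python
VARIABLES = ["WA", "NT", "SA", "Q", "NSW", "V", "T"]
EDGES = [("WA", "NT"), ("WA", "SA"), ("NT", "SA"), ("NT", "Q"), ("SA", "Q"),
         ("SA", "NSW"), ("SA", "V"), ("Q", "NSW"), ("NSW", "V")]
NEIGHBORS = {v: [] for v in VARIABLES}
for a, b in EDGES:
    NEIGHBORS[a].append(b)
    NEIGHBORS[b].append(a)
DOMAINS = {v: ["R", "G", "B"] for v in VARIABLES}

def backtracking_search(variables, domains, neighbors, assignment=None):
    """Assign one variable per level; undo the assignment when a branch fails."""
    if assignment is None:
        assignment = {}
    if len(assignment) == len(variables):
        return assignment
    var = next(v for v in variables if v not in assignment)  # fixed order
    for value in domains[var]:
        # Consistent if no already-assigned neighbor has the same value.
        if all(assignment.get(n) != value for n in neighbors[var]):
            assignment[var] = value
            result = backtracking_search(variables, domains, neighbors, assignment)
            if result is not None:
                return result
            del assignment[var]                              # backtrack
    return None

print(backtracking_search(VARIABLES, DOMAINS, NEIGHBORS))
```

With only two colors the same call returns `None`, because WA, NT and SA are mutually adjacent.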
Map coloring example
[Figure: map-coloring example over the regions WA, NT, SA, Q, NSW, V, T]
How to improve efficiency?
• Address the following questions
1. Which variable to pick next?
2. What value to assign next?
3. Can one detect failure early?
4. Take advantage of problem structure?
Most constrained variable
• Choose next the variable that is most constrained by the current assignment, i.e., the one with the fewest remaining legal values
[Figure: map-coloring constraint graph over WA, NT, SA, Q, NSW, V, T]
Most constraining variable
• Tie breaker among most constrained variables: pick the variable involved in the most constraints on the remaining unassigned variables
[Figure: map-coloring constraint graph over WA, NT, SA, Q, NSW, V, T]
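The two variable-ordering rules (most constrained variable, with the most constraining variable as tie breaker) can be combined in one selection function. A sketch under the assumption of "different color" constraints and the usual Australian adjacency; `legal_values` and `select_variable` are names invented for this example.

```python
VARIABLES = ["WA", "NT", "SA", "Q", "NSW", "V", "T"]
EDGES = [("WA", "NT"), ("WA", "SA"), ("NT", "SA"), ("NT", "Q"), ("SA", "Q"),
         ("SA", "NSW"), ("SA", "V"), ("Q", "NSW"), ("NSW", "V")]
NEIGHBORS = {v: [] for v in VARIABLES}
for a, b in EDGES:
    NEIGHBORS[a].append(b)
    NEIGHBORS[b].append(a)
DOMAINS = {v: ["R", "G", "B"] for v in VARIABLES}

def legal_values(var, assignment, domains, neighbors):
    """Values of var not already used by an assigned neighbor."""
    used = {assignment[n] for n in neighbors[var] if n in assignment}
    return [v for v in domains[var] if v not in used]

def select_variable(assignment, domains, neighbors):
    """Fewest legal values first; ties go to the most unassigned neighbors."""
    def key(var):
        remaining = len(legal_values(var, assignment, domains, neighbors))
        degree = sum(1 for n in neighbors[var] if n not in assignment)
        return (remaining, -degree)
    unassigned = [v for v in domains if v not in assignment]
    return min(unassigned, key=key)

# Empty assignment: all variables have 3 values, SA has the most neighbors.
print(select_variable({}, DOMAINS, NEIGHBORS))
# After WA=R, NT=G only one color is left for SA.
print(select_variable({"WA": "R", "NT": "G"}, DOMAINS, NEIGHBORS))
```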
Least constraining value
• Select the value that least constrains the other variables (rules
out the fewest values for the other variables)
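A sketch of value ordering by this rule, again assuming "different color" constraints and the usual Australian adjacency. For each candidate value it counts how many options it would remove from unassigned neighbors and tries the least damaging value first; `order_values` is a name invented here.

```python
VARIABLES = ["WA", "NT", "SA", "Q", "NSW", "V", "T"]
EDGES = [("WA", "NT"), ("WA", "SA"), ("NT", "SA"), ("NT", "Q"), ("SA", "Q"),
         ("SA", "NSW"), ("SA", "V"), ("Q", "NSW"), ("NSW", "V")]
NEIGHBORS = {v: [] for v in VARIABLES}
for a, b in EDGES:
    NEIGHBORS[a].append(b)
    NEIGHBORS[b].append(a)
DOMAINS = {v: ["R", "G", "B"] for v in VARIABLES}

def legal_values(var, assignment, domains, neighbors):
    """Values of var not already used by an assigned neighbor."""
    used = {assignment[n] for n in neighbors[var] if n in assignment}
    return [v for v in domains[var] if v not in used]

def order_values(var, assignment, domains, neighbors):
    """Legal values of var, sorted by how many neighbor options they rule out."""
    def ruled_out(value):
        return sum(value in legal_values(n, assignment, domains, neighbors)
                   for n in neighbors[var] if n not in assignment)
    return sorted(legal_values(var, assignment, domains, neighbors), key=ruled_out)

# With WA=R and NT=G, coloring Q blue would take the last color away from SA,
# so red is tried first.
print(order_values("Q", {"WA": "R", "NT": "G"}, DOMAINS, NEIGHBORS))
```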
Forward checking
• Terminate search early, if necessary:
– keep track of the remaining legal values of unassigned variables
– backtrack as soon as any variable has no legal value left
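A minimal sketch of forward checking for "different color" constraints, assuming the usual Australian adjacency. The function name and the convention of returning `None` on a wiped-out domain are choices made for this example.

```python
VARIABLES = ["WA", "NT", "SA", "Q", "NSW", "V", "T"]
EDGES = [("WA", "NT"), ("WA", "SA"), ("NT", "SA"), ("NT", "Q"), ("SA", "Q"),
         ("SA", "NSW"), ("SA", "V"), ("Q", "NSW"), ("NSW", "V")]
NEIGHBORS = {v: [] for v in VARIABLES}
for a, b in EDGES:
    NEIGHBORS[a].append(b)
    NEIGHBORS[b].append(a)
DOMAINS = {v: ["R", "G", "B"] for v in VARIABLES}

def forward_check(var, value, domains, neighbors, assignment):
    """Return pruned copies of the domains after var=value, or None on failure.

    assignment holds the variables assigned before this step.
    """
    new = {v: list(vals) for v, vals in domains.items()}
    new[var] = [value]
    for n in neighbors[var]:
        if n not in assignment and value in new[n]:
            new[n].remove(value)
            if not new[n]:
                return None      # some variable has no legal value left
    return new

d1 = forward_check("WA", "R", DOMAINS, NEIGHBORS, {})
d2 = forward_check("Q", "G", d1, NEIGHBORS, {"WA": "R"})
d3 = forward_check("V", "B", d2, NEIGHBORS, {"WA": "R", "Q": "G"})
print(d2["NT"], d2["SA"], d3)   # NT and SA are both down to blue; V=B empties SA
```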
Constraint propagation
• FC propagates assignment from assigned to unassigned
variables, but does not provide early detection of all failures
• NT & SA cannot both be blue!
• CP repeatedly enforces constraints
Arc consistency
• Simplest form of propagation makes each arc consistent:
• X -> Y is consistent iff for every value x ∈ X there is some value y ∈ Y that satisfies
the constraint
• Detects failure earlier than FC.
• Either preprocessor or after each assignment
• AC-3 algorithm
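A compact sketch of AC-3, specialized to "different value" constraints and the usual Australian adjacency (both assumptions of this example). It prunes the domains in place and returns `False` as soon as a domain becomes empty.

```python
from collections import deque

VARIABLES = ["WA", "NT", "SA", "Q", "NSW", "V", "T"]
EDGES = [("WA", "NT"), ("WA", "SA"), ("NT", "SA"), ("NT", "Q"), ("SA", "Q"),
         ("SA", "NSW"), ("SA", "V"), ("Q", "NSW"), ("NSW", "V")]
NEIGHBORS = {v: [] for v in VARIABLES}
for a, b in EDGES:
    NEIGHBORS[a].append(b)
    NEIGHBORS[b].append(a)

def revise(domains, x, y):
    """Drop values of x that have no compatible value in y (constraint: x != y)."""
    removed = False
    for vx in list(domains[x]):
        if not any(vx != vy for vy in domains[y]):
            domains[x].remove(vx)
            removed = True
    return removed

def ac3(domains, neighbors):
    """Make every arc consistent; False means a domain was wiped out."""
    queue = deque((x, y) for x in domains for y in neighbors[x])
    while queue:
        x, y = queue.popleft()
        if revise(domains, x, y):
            if not domains[x]:
                return False
            # x changed, so arcs pointing at x must be rechecked.
            queue.extend((z, x) for z in neighbors[x] if z != y)
    return True

# Domains left by forward checking after WA=R and Q=G:
# NT and SA can each only be blue, which arc consistency rejects.
after_fc = {"WA": ["R"], "NT": ["B"], "SA": ["B"], "Q": ["G"],
            "NSW": ["R", "B"], "V": ["R", "G", "B"], "T": ["R", "G", "B"]}
print(ac3(after_fc, NEIGHBORS))
```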
Problem structure
• Knowing something about the problem can significantly reduce
search time
[Figure: constraint graph over WA, NT, SA, Q, NSW, V, T]
• T is independent of the rest! Connected components.
• If each connected component has c of the n variables: O(d^c · n/c), linear in n, instead of O(d^n)
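A sketch that finds the connected components of the constraint graph and compares the number of assignments to examine with and without the decomposition. The adjacency of the Australian regions is assumed, and `components` is a helper written for this example.

```python
VARIABLES = ["WA", "NT", "SA", "Q", "NSW", "V", "T"]
EDGES = [("WA", "NT"), ("WA", "SA"), ("NT", "SA"), ("NT", "Q"), ("SA", "Q"),
         ("SA", "NSW"), ("SA", "V"), ("Q", "NSW"), ("NSW", "V")]
NEIGHBORS = {v: [] for v in VARIABLES}
for a, b in EDGES:
    NEIGHBORS[a].append(b)
    NEIGHBORS[b].append(a)

def components(neighbors):
    """Connected components of the constraint graph, found by DFS."""
    seen, comps = set(), []
    for start in neighbors:
        if start in seen:
            continue
        comp, stack = [], [start]
        seen.add(start)
        while stack:
            v = stack.pop()
            comp.append(v)
            for n in neighbors[v]:
                if n not in seen:
                    seen.add(n)
                    stack.append(n)
        comps.append(comp)
    return comps

d = 3
comps = components(NEIGHBORS)
whole = d ** len(VARIABLES)                 # one problem with 7 variables
split = sum(d ** len(c) for c in comps)     # mainland and T solved separately
print(comps, whole, split)
```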
Tree-structured CSP
• If the constraint graph is a tree, the CSP can be solved in O(n·d^2) time
[Figure: a tree-structured constraint graph over WA, NT, Q, SA, NSW, V and the same variables arranged as a chain]
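1. Flatten the tree into a “chain” in which each node is preceded by its
parent
2. Propagate constraints from the last node back to the second: make each arc (parent, child) consistent
3. Assign values from the first node to the last, each consistent with its parent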
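The three steps can be sketched as follows. The chain WA, NT, Q, NSW, V with two colors is a hypothetical tree chosen for illustration, and `solve_tree_csp` with its `consistent(parent_value, child_value)` argument is an interface invented here.

```python
def solve_tree_csp(order, parent, domains, consistent):
    """order lists every node after its parent; parent[root] is None."""
    domains = {v: list(vals) for v, vals in domains.items()}
    # Step 2: backward pass, keep only parent values supported by the child.
    for child in reversed(order[1:]):
        p = parent[child]
        domains[p] = [vp for vp in domains[p]
                      if any(consistent(vp, vc) for vc in domains[child])]
        if not domains[p]:
            return None
    if not domains[order[0]]:
        return None
    # Step 3: forward pass, a compatible value is guaranteed to exist.
    assignment = {}
    for var in order:
        p = parent[var]
        assignment[var] = next(v for v in domains[var]
                               if p is None or consistent(assignment[p], v))
    return assignment

ORDER = ["WA", "NT", "Q", "NSW", "V"]
PARENT = {"WA": None, "NT": "WA", "Q": "NT", "NSW": "Q", "V": "NSW"}
TWO_COLORS = {v: ["R", "G"] for v in ORDER}

def different(a, b):
    return a != b

print(solve_tree_csp(ORDER, PARENT, TWO_COLORS, different))
```

Each arc is revised once at a cost of at most d^2 value comparisons, which is where the O(n·d^2) bound comes from.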
Nearly tree-structured problems
• What if a graph is not a tree?
• It may become a tree after conditioning on a variable: fix that variable's value and prune its neighbors' domains accordingly.
[Figure: two versions of the constraint graph over WA, NT, SA, Q, NSW, V, T]
• If c is the cutset size, then the run time is O(d^c · (N−c)·d^2)
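A quick numeric reading of this bound, as a sketch: there are d^c ways to assign the cutset, and each leaves a tree over the remaining N − c variables. The numbers below (mainland Australia, N = 6, d = 3, a cutset of size 1) and the function names are chosen for illustration.

```python
def cutset_cost(d, n, c):
    """d**c cutset assignments, each followed by an O((n - c) * d**2) tree solve."""
    return d ** c * (n - c) * d ** 2

def brute_force_cost(d, n):
    """All complete assignments."""
    return d ** n

print(cutset_cost(3, 6, 1), brute_force_cost(3, 6))
```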
Iterative CSP
• Local search for CSP
• Start from a complete (possibly invalid) initial assignment and modify it to reduce the
number of violated constraints
1. Randomly pick a variable that violates a constraint
2. Reassign it the value that violates the fewest constraints (min-conflicts heuristic)
– Hill-climbing with h(s) = number of violated constraints
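A minimal sketch of min-conflicts local search for map coloring. Assumptions made here: the usual Australian adjacency, a step limit, random tie-breaking among equally good values, and a seeded random generator for repeatability. The search is not guaranteed to succeed within the limit, so the function may return `None`.

```python
import random

VARIABLES = ["WA", "NT", "SA", "Q", "NSW", "V", "T"]
EDGES = [("WA", "NT"), ("WA", "SA"), ("NT", "SA"), ("NT", "Q"), ("SA", "Q"),
         ("SA", "NSW"), ("SA", "V"), ("Q", "NSW"), ("NSW", "V")]
NEIGHBORS = {v: [] for v in VARIABLES}
for a, b in EDGES:
    NEIGHBORS[a].append(b)
    NEIGHBORS[b].append(a)
DOMAINS = {v: ["R", "G", "B"] for v in VARIABLES}

def conflicts(var, value, assignment, neighbors):
    """Number of neighbors that currently hold the same value."""
    return sum(assignment[n] == value for n in neighbors[var])

def min_conflicts(domains, neighbors, max_steps=1000, seed=0):
    rng = random.Random(seed)
    # Start from a complete random assignment, which may violate constraints.
    assignment = {v: rng.choice(vals) for v, vals in domains.items()}
    for _ in range(max_steps):
        conflicted = [v for v in assignment
                      if conflicts(v, assignment[v], assignment, neighbors) > 0]
        if not conflicted:
            return assignment
        var = rng.choice(conflicted)
        counts = {val: conflicts(var, val, assignment, neighbors)
                  for val in domains[var]}
        best = min(counts.values())
        assignment[var] = rng.choice([val for val in domains[var]
                                      if counts[val] == best])
    return None   # step limit reached without a solution

print(min_conflicts(DOMAINS, NEIGHBORS))
```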