A* Examples
CS 331/531, Dr M M Awais

8-Puzzle
f(N) = g(N) + h(N), with h(N) = number of misplaced tiles
[Figure: 8-puzzle search tree; each node is labelled g(N)+h(N), with labels running from 0+4 to 5+0.]

Robot Navigation
f(N) = h(N), with h(N) = Manhattan distance to the goal (not A*)
[Figure: grid map; each cell is labelled with h(N).]

Robot Navigation
f(N) = g(N) + h(N), with h(N) = Manhattan distance to the goal (A*)
[Figure: grid map; each cell is labelled with g(N)+h(N).]

Adversary Search (Games)
The aim is to move in such a way as to 'stop' the opponent from making a good or winning move.
Game playing can use tree search.
The game tree alternates between two players.

Games?
Games are a form of multi-agent environment.
- What do other agents do?
- How do they affect our success?
- Cooperative vs. competitive multi-agent environments.
- Competitive multi-agent environments give rise to adversarial problems (games).
Why study games?
- Fun; historically entertaining.
- Interesting subject of study because they are hard.
- Easy to represent, and agents are restricted to a small number of actions.

Games vs. Search
Search – no adversary
- Solution is a (heuristic) method for finding the goal.
- Heuristics and CSP techniques can find an optimal solution.
- Evaluation function: estimate of the cost from start to goal through a given node.
- Examples: path planning, scheduling activities.
Games – adversary
- Solution is a strategy (a strategy specifies a move for every possible opponent reply).
- Time limits force an approximate solution.
- Evaluation function: evaluates the "goodness" of a game position.
- Examples: chess, checkers, Othello, backgammon.

Types of Games
[Figure: types of games.]

Game setup
- Two players: MAX and MIN.
- MAX and MIN take turns until the game is over. The winner gets a reward, the loser gets a penalty.
Games as search:
- Initial state: e.g. the board configuration in chess.
- Successor function: list of (move, state) pairs specifying legal moves.
- Terminal test: is the game finished?
- Utility function: gives a numerical value for terminal states, e.g. win (+1), lose (-1) and draw (0) in tic-tac-toe.
MAX uses the search tree to determine its next move.

Things to Remember:
1. Every move is vital.
2. The opponent could win at the next move or at a subsequent move.
3. Keep track of the safest moves.
4. The opponent is well informed.
5. Consider how the opponent is likely to respond to your moves.

A Two-Move Win
Player 1 = P1, Player 2 = P2
[Figure: game tree with nodes A to J; levels alternate between P1 and P2 moves, and leaf nodes are marked with the winning player.]
- The safest move for P1 is always A to C.
- The safest move for P2 is always A to D (if allowed the first move).

MINIMAX Procedure for Games
Assumption: the opponent has the same knowledge of the state space and makes a consistent effort to win.
MIN: label for the opponent, who tries to minimize the other player's (MAX's) score.
MAX: the player trying to win (maximise advantage).
Both MAX and MIN are equally informed.

Rules
[Figure: tree whose levels are labelled MAX, MIN, MAX, MIN from the root down.]
1. Label the levels MAX and MIN.
2. Assign values to leaf nodes: 0 if MIN wins, 1 if MAX wins.
3. Propagate values up the graph.
If the parent is MAX, assign it the max-value of its children.
If the parent is MIN, assign it the min-value of its children.
[Figure: the leaf nodes of the example tree carry the values 0 and 1.]

Rules: propagating values on the example tree
- MAX nodes above the leaves: Max(0,1) = 1 and Max(1) = 1.
- MIN nodes one level up: Min(1) = 1 and Min(1) = 1.
- MAX root: Max(1,1) = 1.

Utility Values
• Leaf nodes represent the result of the game.
• The result could be WIN or LOSE for either player.
• WIN for MAX is 1, LOSE for MAX is 0.
• These values are known as utility values (utility functions).
• A draw could be another result; in this case:
• WIN for MAX could be 1
• LOSE for MAX could be -1
• DRAW could be 0

Game tree (2-player, deterministic, turns)
[Figure: game tree.]

MINIMAX: Unfinished Games
• Minimax is applied from the leaf nodes back to the start node.
• In other words, the result nodes must be in the search space.
• What if you want to evaluate the game status at an intermediate level?
• E.g.,
• the game finishes at level 5;
• we want to find the relative advantage of MAX up to level 3.
• Solution: evaluate the intermediate nodes through a heuristic and then apply minimax.

Minimaxing to fixed ply depth (Complex games)
Strategy: n-move look-ahead
- Suppose you start in the middle of the game.
- One cannot assign WIN/LOSE values at that stage.
- In this case a heuristic evaluation is applied.
- The values are then propagated back up to indicate a WINNING or LOSING trend.

HEURISTIC FUNCTION: TIC-TAC-TOE
M(n) = total of possible winning lines for MAX
O(n) = total of the opponent's possible winning lines
E(n) = M(n) - O(n)
[Figure: two example boards with X and O marks. For one board M(n) = 5, O(n) = 1, E(n) = 4; for the other M(n) = 4, O(n) = 2, E(n) = 2.]

Two-Ply Game Tree
[Figure: two-ply game tree.]
The minimax decision: minimax maximizes the worst-case outcome for MAX.

Problem of minimax search
The number of game states is exponential in the number of moves.

Solution
Do not examine every node: alpha-beta pruning.
Alpha = value of the best choice found so far at any choice point along the path for MAX.
Beta = value of the best choice found so far at any choice point along the path for MIN.

Alpha-Beta Procedures
• The minimax procedure pursues all branches in the space. Some of them could have been ignored or pruned.
• To improve efficiency, pruning is applied to two-person games.

Simple Idea
if A > 5 OR B < 0: if the first condition A > 5 succeeds, then B < 0 need not be evaluated.
if A > 5 AND B < 0: if the first condition A > 5 fails, then B < 0 need not be evaluated.
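Returning to the tic-tac-toe heuristic E(n) = M(n) - O(n) defined above, the following is a minimal Python sketch of how it could be computed. It is not taken from the slides. It assumes that a "possible winning line" is a row, column, or diagonal that contains no mark of the other player, and the names `LINES`, `open_lines`, and `evaluate`, the list-based board encoding, and the sample position are all illustrative choices.

```python
# All eight winning lines of a 3x3 board, cells indexed 0..8 row by row.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
    (0, 4, 8), (2, 4, 6),              # diagonals
]


def open_lines(board, player):
    """Count lines with no mark of the other player.

    Assumption: this is what 'possible winning line' means on the slide.
    """
    other = "O" if player == "X" else "X"
    return sum(1 for line in LINES if all(board[i] != other for i in line))


def evaluate(board, max_player="X"):
    """E(n) = M(n) - O(n), seen from MAX's side."""
    min_player = "O" if max_player == "X" else "X"
    return open_lines(board, max_player) - open_lines(board, min_player)


# Hypothetical position: X (MAX) in the centre, O in the top-left corner.
board = ["O", " ", " ",
         " ", "X", " ",
         " ", " ", " "]

print(open_lines(board, "X"), open_lines(board, "O"), evaluate(board))  # prints: 5 4 1
```

Under this reading, the O in the corner blocks three of MAX's eight lines and the X in the centre blocks four of the opponent's, so E(n) = 5 - 4 = 1. In a fixed-depth search, values like this would be assigned to the nodes at the cut-off depth and then propagated upward with the MAX/MIN rules.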
Implementation
FORWARD PASS: apply depth-first search to reach a leaf node.
BACKWARD PASS: propagate the values back to the root node.

[Figure, used on the following slides: MAX node a at the root; MIN nodes b = 0.4 and c below a; MAX node e below c; MIN node g = -0.2 below e. Nodes e and c each have a further child that has not yet been evaluated.]

Why is -0.2 the least value for node e?
- Suppose the unevaluated child of e takes a value less than -0.2. The value of node e does not change and remains at -0.2.
- Suppose it takes a value greater than -0.2, say v. The value of node e changes to v.
- What is the lower bound on the value of e? The lower bound is the value at node g.
The minimum advantage for the MAX node e is -0.2, so e = -0.2 (at least). This is called the ALPHA value for a MAX node.

Why is -0.2 the AT MOST value for node c?
- Suppose the unevaluated child of c takes a value v less than -0.2. The value of node c changes to v.
- Suppose it takes a value greater than -0.2. The value of node c does not change and remains at -0.2.
- What is the upper bound on the value of c? The upper bound is the value at node e.
The maximum advantage for the MIN node c is -0.2, so c = -0.2 (at most). This is called the BETA value for a MIN node.

Find the ALPHA value for node a.
a = 0.4 (at least), b = 0.4, c = -0.2 (at most), e = -0.2 (at least), g = -0.2
The least advantage that MAX can get in this portion of the game is 0.4.
If this least advantage is acceptable, then expanding c and all the nodes below it can be skipped: prune away the link to c, with ALPHA = 0.4.

- A MAX node ignores values <= alpha (the least it can score) at the MIN nodes below it.
- A MIN node ignores values >= beta (the most it can score) at the MAX nodes below it.
[Figure: node A above MIN nodes B = 10 and C; nodes G = 0 and H below C.]
- Node C can score AT MOST 0: nothing above 0 (beta).
- Node A can score AT LEAST 10: nothing less than 10 (alpha).

Alpha-Beta Example
Do depth-first search until the first leaf. Each node carries a range of possible values.
[Figure sequence: ranges at the root and its children as the search proceeds.]
1. Root [-∞,+∞]; first child [-∞,+∞].
2. Root [-∞,+∞]; first child [-∞,3].
3. Root [3,+∞]; first child [3,3].
4. Root [3,+∞]; first child [3,3]; second child [-∞,2]. This node is worse for MAX.
5. Root [3,14]; first child [3,3]; second child [-∞,2]; third child [-∞,14].
6. Root [3,5]; first child [3,3]; second child [-∞,2]; third child [-∞,5].
7. Root [3,3]; first child [3,3]; second child [-∞,2]; third child [2,2].