5 2 1 8 3 4 7 6 up 5 2 1 8 3 4 7 6 1 2 3 4 5 6 7 8 (a) (b) 1 5 2 8 3 4 7 6 up 1 5 2 4 8 3 7 6 left 1 5 2 1 5 2 down 4 8 3 4 3 7 6 7 8 6 left 1 2 3 4 5 6 7 8 up 1 2 3 4 5 7 8 6 Last tile moved down up 1 2 4 5 3 7 8 6 left 1 2 4 5 3 7 8 6 Blank tile (c) Figure 11.1 An 8-puzzle problem instance: (a) initial configuration; (b) final configuration; and (c) a sequence of moves leading from the initial to the final configuration. Terminal node (non-goal) Non-terminal node x1 = 1 x1 = 0 Terminal node (goal) x2 = 1 x2 = 0 x3 = 0 x3 = 0 x3 = 1 x4 = 0 x4 = 1 x4 = 0 f (x) = 0 x3 = 1 x4 = 1 f (x) = 2 Figure 11.2 The graph corresponding to the 0/1 integer-linear-programming problem. 1 2 3 1 4 5 2 4 3 6 5 7 8 6 7 9 8 7 9 8 9 (a) 1 1 2 3 4 5 6 7 8 9 10 2 3 4 4 5 6 5 6 7 7 7 7 8 9 8 9 8 9 8 9 10 10 10 10 10 10 10 10 (b) Figure 11.3 Two examples of unfolding a graph into a tree. 7 2 3 A 4 6 5 1 8 down right Step 1 7 2 3 6 5 4 C 1 8 7 2 3 B 4 6 1 8 5 up 7 2 3 D 4 6 5 7 2 E 4 6 3 F 1 8 5 1 8 Step 3 right down Step 2 up 7 2 3 G 4 6 1 8 5 7 2 3 4 6 1 8 5 right 7 Blank tile The last tile moved 2 H 4 6 3 1 8 5 Figure 11.4 States resulting from the first three steps of depth-first search applied to an instance of the 8-puzzle. Bottom of the stack 1 2 3 4 7 8 9 5 5 1 4 5 3 8 9 7 11 10 14 13 16 15 19 4 6 9 10 11 12 13 14 15 16 17 8 11 14 17 17 18 19 20 21 22 23 16 19 18 24 24 21 23 24 23 Current State (a) Top of the stack (b) (c) Figure 11.5 Representing a DFS tree: (a) the DFS tree; successor nodes shown with dashed lines have already been explored; (b) the stack storing untried alternatives only; and (c) the stack storing untried alternatives along with their parent. The shaded blocks represent the parent state and the block to the right represents successor states that have not been explored. 7 2 3 4 6 5 1 8 1 2 3 4 5 6 7 8 (a) (b) 6 Blank Tile The last tile moved 7 2 3 4 6 5 1 8 7 2 3 4 6 5 1 8 6 Step 1 Step 1 7 2 3 4 6 1 8 5 7 7 2 3 4 6 5 1 8 7 7 7 2 3 4 6 1 8 5 7 2 3 4 6 5 1 8 7 Step 2 8 7 2 3 4 6 1 8 5 7 2 4 6 3 1 8 5 7 2 3 4 6 5 1 8 6 7 2 3 4 6 5 1 8 6 Step 1 Step 1 7 7 2 3 4 6 1 8 5 7 2 3 4 6 5 1 8 7 7 Step 2 8 7 2 4 6 3 1 8 5 7 2 3 4 8 6 1 5 7 2 3 4 6 1 8 5 7 2 3 4 6 5 1 8 7 Step 2 7 2 3 4 6 1 8 5 6 8 7 2 4 6 3 1 8 5 Step 3 7 6 3 7 4 2 6 1 8 5 7 2 3 4 6 1 8 5 7 7 7 7 2 3 4 8 6 1 5 Step 4 7 2 3 4 6 1 8 5 6 7 2 3 4 6 5 1 8 8 3 7 4 2 6 1 8 5 7 2 3 4 6 1 8 5 7 2 3 5 4 1 6 8 8 7 7 (c) Figure 11.6 Applying best-first search to the 8-puzzle: (a) initial configuration; (b) final configuration; and (c) states resulting from the first four steps of best-first search. Each state is labeled with its h-value (that is, the Manhattan distance from the state to the final state). A B C (a) D E F (b) Figure 11.7 The unstructured nature of tree search and the imbalance resulting from static partitioning. Service any pending messages Do a fixed amount of work Finished available work Got work Processor active Processor idle Select a processor and request work from it Service any pending messages Got a reject Issued a request Figure 11.8 A generic scheme for dynamic load balancing. 1 1 3 3 5 4 5 7 4 7 9 8 8 9 10 10 11 14 11 13 13 14 16 17 15 16 17 19 18 19 Cutoff depth 24 21 23 22 23 24 Current State (a) (b) Figure 11.9 Splitting the DFS tree in Figure 11.5. The two subtrees along with their stack representations are shown in (a) and (b). w0 = 0.5 w0 = 0.25 w0 = 0.5 w1 = 0.25 w1 = 0.5 w2 = 0.25 Step 1 Step 2 Step 4 w3 = 0.25 w3 = 0.25 w2 = 0.25 Step 3 w0 = 0.5 w0 = 0.25 w1 = 0.5 w1 = 0.25 w0 = 1.0 w1 = 0.5 Step 5 Step 6 Figure 11.10 Tree-based termination detection. Steps 1–6 illustrate the weights at various processors after each work transfer. 700 Speedup 600 ARR GRR RP 500 400 300 200 100 0 0 200 400 600 800 1000 1200 p Figure 11.11 Speedups of parallel DFS using ARR, GRR and RP load-balancing schemes. 900000 Number of work requests 700000 500000 GRR Expected (GRR) RP Expected (RP) 300000 100000 0 0 200 400 600 800 1000 1200 p Figure 11.12 Number of work requests generated for RP and GRR and their expected values (O( p log2 p) and O( p log p) respectively). 2.5e+07 W 2e+07 1.5e+07 1e+07 E = 0.64 E = 0.74 E = 0.85 E = 0.90 5e+06 0 0 20000 40000 60000 80000 100000 120000 2 p log p Figure 11.13 Experimental isoefficiency curves for RP for different efficiencies. Number of busy Processors Time (a) Number of busy Processors Time (b) Number of busy Processors Time (c) Figure 11.14 Three different triggering mechanisms: (a) a high triggering frequency leads to high load-balancing cost, (b) the optimal frequency yields good performance, and (c) a low frequency leads to high idle times. Global pointer 1 2 3 4 5 6 Idle 5 6 7 1 2 3 4 Global pointer Busy Figure 11.15 Mapping idle and busy processors with the use of a global pointer. Global list maintained at designated processor Put expanded nodes Get current best node Lock the list Place generated nodes in the list Pick the best node from the list Unlock the list Expand the node to generate successors P0 Lock the list Lock the list Place generated nodes in the list Pick the best node from the list Unlock the list Expand the node to generate successors Place generated nodes in the list Pick the best node from the list Unlock the list Expand the node to generate successors Pp−1 P1 Figure 11.16 A general schematic for parallel best-first search using a centralized strategy. The locking operation is used here to serialize queue access by various processors. Exchange best nodes Local list Local list Local list Exchange best nodes Exchange best nodes P0 Pp−1 P1 Figure 11.17 A message-passing implementation of parallel best-first search using the ring communication strategy. blackboard Exchange best nodes Exchange best nodes Exchange best nodes Local list Local list Local list Pp−1 P0 P1 Figure 11.18 An implementation of parallel best-first search using the blackboard communication strategy. Start node S Start node S R1 1 2 3 10 11 4 12 5 6 R2 9 R3 L1 R4 R5 L3 13 L4 Goal node G 7 L2 Goal node G 8 Total number of nodes generated by sequential formulation = 13 Total number of nodes generated by two-processor formulation of DFS = 9 (a) (b) Figure 11.19 The difference in number of nodes searched by sequential and parallel formulations of DFS. For this example, parallel DFS reaches a goal node after searching fewer nodes than sequential DFS. Start node S 1 Start node S R1 2 R2 3 4 R3 5 L1 R4 L2 R5 6 L3 R6 7 L4 R7 Goal node G L5 Goal node G Total number of nodes generated by sequential DFS = 7 (a) Total number of nodes generated by two-processor formulation of DFS = 12 (b) Figure 11.20 A parallel DFS formulation that searches more nodes than its sequential counterpart. Initially, target = x After Increment, target = x+5 000 x x 000 1 1 x+1 1 001 3 x+2 010 010 100 x+3 1 2 1 011 100 110 x+3 x+2 1 011 010 100 x+1 x 111 x+2 2 000 000 110 100 x+4 101 11 101 110 111 000 001 Figure 11.21 Message combining and a sample implementation on an eight-processor hypercube. Subtask generator (manager) Subtasks Search tree Work request Processors Figure 11.22 The single-level work-distribution scheme for tree search.
© Copyright 2026 Paperzz