Deterministic Coin Tossing Presented by Gal Kamar April 19, 2015 236825 Seminar in Distributed Graph Algorithms Cole R., Vishkin U. (1986) Deterministic Coin Tossing with Applications to Optimal Parallel List Rankings, in Information and Control 70, pp 32-53 Motivation Find a distributed algorithm for selecting a MIS in a ring graph The problem has a standard linear time serial algorithm Use this MIS in order to solve the List Ranking problem Agenda Definitions and notation Lemmas Selecting a log 𝑛 − 𝑟𝑢𝑙𝑖𝑛𝑔 𝑠𝑒𝑡 Selecting a 2 − 𝑟𝑢𝑙𝑖𝑛𝑔 𝑠𝑒𝑡 (MIS) List Ranking using the 2 − 𝑟𝑢𝑙𝑖𝑛𝑔 𝑠𝑒𝑡 (dependent on time) The Ring Graph Ring Graph: Let G(V,E) be a connected directed graph. G(V,E) is called a ring if ∀ 𝑣 ∈ 𝑉 𝑑𝑖𝑛 𝑣 = 𝑑𝑜𝑢𝑡 𝑣 = 1. 𝑣𝑛𝑒𝑥𝑡 : G(V,E) is a ring thus a single vertex 𝑣𝑛𝑒𝑥𝑡 exists such that 𝑣, 𝑣𝑛𝑒𝑥𝑡 ∈ 𝐸, the vertex following 𝑣. 𝑣𝑝𝑟𝑒𝑣 : G(V,E) is a ring thus a single vertex 𝑣𝑝𝑟𝑒𝑣 exists such that 𝑣𝑝𝑟𝑒𝑣 , 𝑣 ∈ 𝐸, the vertex preceding 𝑣. 𝑣next 𝑣 𝑣prev R – Ruling Set r-ruling set of G(V,E) – A subset U of vertices such that: U is an Independent set. ∀ 𝑣 ∈ 𝑉 ∃ 𝑢 ∈ 𝑈 𝑠. 𝑡. There exists a directed path 𝑃 𝑣, 𝑢 = 𝑣 − ⋯ − 𝑢 whose edge length ≤ 𝑟. Notice that a 2-ruling set is a MIS 2-ruling set is a MIS 𝑣next Independent set 𝑣 directly from the definition of the ruling set 𝑣prev Maximal Set Lets assume that there exists a 2-ruling set 𝑈 such that it is not maximal. Therefore there exists a vertex 𝑣 ∈ 𝑉 𝑠. 𝑡. 𝑣 ∉ 𝑈 and 𝑣 ∪ 𝑈 is an independent set. So we can conclude that 𝑣𝑝𝑟𝑒𝑣 , 𝑣𝑛𝑒𝑥𝑡 ∉ 𝑈. 𝑣𝑝𝑟𝑒𝑣 ∈ 𝑉 and from the definition of the ruling set there exists a vertex 𝑢 ∈ 𝑈 𝑠. 𝑡. 𝑃(𝑣𝑝𝑟𝑒𝑣 , 𝑢) has an edge length ≤ 2. Let us examine the path 𝑃 𝑣𝑝𝑟𝑒𝑣 , 𝑣𝑛𝑒𝑥𝑡 = 𝑣𝑝𝑟𝑒𝑣 − 𝑣 − 𝑣𝑛𝑒𝑥𝑡 . Since G(V,E) is a ring 𝑃(𝑣𝑝𝑟𝑒𝑣 , 𝑢) must be a sub path of 𝑃 𝑣𝑝𝑟𝑒𝑣 , 𝑣𝑛𝑒𝑥𝑡 and since all vertices of 𝑃 𝑣𝑝𝑟𝑒𝑣 , 𝑣𝑛𝑒𝑥𝑡 are not in 𝑈 then there is a contradiction. 𝑆𝐸𝑅𝐼𝐴𝐿𝑘 (𝑣) 𝑆𝐸𝑅𝐼𝐴𝐿0 𝑣 : vertex 𝑣’s index (ID number) 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣 : The index of the rightmost bit in which SERIAL0 (𝑣) and S𝐸𝑅𝐼𝐴𝐿0 (𝑣𝑛𝑒𝑥𝑡 ) differ 𝑆𝐸𝑅𝐼𝐴𝐿𝑘+1 𝑣 : The index of the rightmost bit in which SERIALk (𝑣) and S𝐸𝑅𝐼𝐴𝐿𝑘 (𝑣𝑛𝑒𝑥𝑡 ) differ 𝑆𝐸𝑅𝐼𝐴𝐿𝑘 𝑣 𝑗 : The value of bit number j of 𝑆𝐸𝑅𝐼𝐴𝐿𝑘 𝑣 2 𝑆𝐸𝑅𝐼𝐴𝐿0 = 010 𝑆𝐸𝑅𝐼𝐴𝐿1 = 0 3 7 𝑆𝐸𝑅𝐼𝐴𝐿0 = 011 𝑆𝐸𝑅𝐼𝐴𝐿0 = 111 𝑆𝐸𝑅𝐼𝐴𝐿1 = 2 Local Extremum Local Minimum: (with respect to 𝑆𝐸𝑅𝐼𝐴𝐿𝐾 ) 𝑆𝐸𝑅𝐼𝐴𝐿𝑘 𝑣 ≤ 𝑆𝐸𝑅𝐼𝐴𝐿𝑘 (𝑣𝑝𝑟𝑒𝑣 ) 𝑆𝐸𝑅𝐼𝐴𝐿𝑘 𝑣 ≤ 𝑆𝐸𝑅𝐼𝐴𝐿𝑘 (𝑣𝑛𝑒𝑥𝑡 ) Local Maximum: (with respect to 𝑆𝐸𝑅𝐼𝐴𝐿𝐾 ) 𝑆𝐸𝑅𝐼𝐴𝐿𝑘 𝑣 ≥ 𝑆𝐸𝑅𝐼𝐴𝐿𝑘 (𝑣𝑝𝑟𝑒𝑣 ) 𝑆𝐸𝑅𝐼𝐴𝐿𝑘 𝑣 ≥ 𝑆𝐸𝑅𝐼𝐴𝐿𝑘 (𝑣𝑛𝑒𝑥𝑡 ) Lemmas 𝐿𝑒𝑚𝑚𝑎 1 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑖 can be computed in 𝑂(1). 𝐿𝑒𝑚𝑚𝑎 2 0 ≤ 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑖 ≤ log 𝑛 − 1 and is represented by 𝑐𝑒𝑖𝑙 𝑙𝑜𝑔𝑙𝑜𝑔 𝑛 Lemma 3 bits. ∀ 𝑣 ∈ 𝑉 the number of vertices in the shortest path from 𝑣 to the next local extremum with respect to 𝑆𝐸𝑅𝐼𝐴𝐿1 is at most log(𝑛). Computing 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣 if 𝑣 < 𝑣𝑛𝑒𝑥𝑡 then 𝑆𝑊𝐴𝑃(𝑣, 𝑣𝑛𝑒𝑥𝑡 ). Now we can assume 𝑣 ≥ 𝑣𝑛𝑒𝑥𝑡 . Assign ℎ = 𝑣 − 𝑣𝑛𝑒𝑥𝑡 , 𝑘 = ℎ − 1. Let ℎ𝑙 denote bit of index 𝑙 in ℎ. ℎ𝑙 = 0 𝑓𝑜𝑟 0 ≤ 𝑙 < 𝑆𝐸𝑅𝐼𝐴𝐿1 (𝑣). Example 𝑣 = 15 = 1111 𝑣𝑛𝑒𝑥𝑡 = 11 = 1011 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣 = 2 𝑘𝑙 = 1 𝑓𝑜𝑟 0 ≤ 𝑙 < 𝑆𝐸𝑅𝐼𝐴𝐿1 (𝑣). ℎ = 15 − 11 = 1111 − 1011 = 0100 ℎ𝑆𝐸𝑅𝐼𝐴𝐿1 (𝑣) = 1. 𝑘 = 4 − 1 = 0100 − 0001 = 0011 𝑘𝑆𝐸𝑅𝐼𝐴𝐿1 (𝑣) = 0. 𝑗 = 0100 ⨁ 0011 = 0111 ℎ𝑙 = 𝑘𝑙 𝑓𝑜𝑟 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣 < 𝑙 Assign 𝑗 = ℎ ⨁ 𝑘 𝑗=3 𝑗 =𝑗 −1=3 −1=2 J is the unary representation of 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣 + 1. Convert j to binary representation and subtract 1. Lemma 3 - proof Without loss of generality let 𝑣 ∈ 𝑉 𝑠. 𝑡. 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣 ≤ 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣𝑛𝑒𝑥𝑡 . If 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣 = 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣𝑛𝑒𝑥𝑡 then 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣𝑛𝑒𝑥𝑡 is a local extremum and we are done so assume 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣 < 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣𝑛𝑒𝑥𝑡 . Let 𝑣𝑠𝑚𝑎𝑙𝑙 be the first vertex on the path from 𝑣 𝑠. 𝑡. 𝑣 ≠ 𝑣𝑠𝑚𝑎𝑙𝑙 and 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣𝑠𝑚𝑎𝑙𝑙 ≤ 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣𝑠𝑚𝑎𝑙𝑙 𝑝𝑟𝑒𝑣 . Denote the path P = 𝑃(𝑣, 𝑣𝑠𝑚𝑎𝑙𝑙𝑝𝑟𝑒𝑣 ) = 𝑣 − ⋯ − 𝑣𝑠𝑚𝑎𝑙𝑙𝑝𝑟𝑒𝑣 . Let us examine the sequence of 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑃 = (𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣 , … , 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣𝑠𝑚𝑎𝑙𝑙𝑝𝑟𝑒𝑣 ). From the definition of 𝑣𝑠𝑚𝑎𝑙𝑙 we get that it is strictly monotonous. 𝑣𝑠𝑚𝑎𝑙𝑙 exists and the length of P is at most 𝒍𝒐𝒈 𝒏 since 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑃 is strictly monotonous and ∀ 𝑢 ∈ 𝑉 0 ≤ 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑢 ≤ log 𝑛 − 1 𝑣𝑠𝑚𝑎𝑙𝑙 𝑝𝑟𝑒𝑣 is a local maximum: < 𝑆𝐸𝑅𝐼𝐴𝐿1 (𝑣𝑠𝑚𝑎𝑙𝑙 𝑝𝑟𝑒𝑣 ) from the monotony of 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑃 . 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣𝑠𝑚𝑎𝑙𝑙 𝑝𝑟𝑒𝑣 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣𝑠𝑚𝑎𝑙𝑙 ≤ 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣𝑠𝑚𝑎𝑙𝑙 𝑝𝑟𝑒𝑣 𝑝𝑟𝑒𝑣 from the definition of 𝑣𝑠𝑚𝑎𝑙𝑙 . Lemmas 𝐿𝑒𝑚𝑚𝑎 4 Alteration property: Denote by 𝑏 the value of bit number 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣 in 𝑆𝐸𝑅𝐼𝐴𝐿0 𝑣 . Then the value of this bit in 𝑆𝐸𝑅𝐼𝐴𝐿0 𝑣𝑛𝑒𝑥𝑡 is 1 − 𝑏. Stems directly from the definition of 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑖 Let 𝑣1 , … 𝑣𝑘 be a sequence of consecutive vertices such that ∀ 𝑖 ∈ 1, . . 𝑘 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣𝑖 is a local minimum or ∀ 𝑖 ∈ 1, . . 𝑘 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣𝑖 is a local maximum. 𝐿𝑒𝑚𝑚𝑎 5 ∀ 𝑖, 𝑗 ∈ 1, . . 𝑘 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣𝑖 = 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣𝑗 . Stems directly from the definition of local minimum /maximum. 𝐿𝑒𝑚𝑚𝑎 6 ∀ 𝑗 ∈ 2, . . 𝑘 𝑆𝐸𝑅𝐼𝐴𝐿0 𝑗 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑗 Deferred by the alteration property = 1 − 𝑆𝐸𝑅𝐼𝐴𝐿0 𝑗 − 1 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑗 −1 So What do we do now? Find an algorithm that will receive the input graph and output a log 𝑛 − 𝑟𝑢𝑙𝑖𝑛𝑔 𝑠𝑒𝑡. The algorithm will select local extremums with respect to 𝑆𝐸𝑅𝐼𝐴𝐿1 . Problem Sequences of consecutive local maximums and consecutive local minimums will present a seemingly symmetric situations. Solution In order to break these apparently symmetric situations, “Deterministic Coin Tossing” is used. Deterministic Coin Tossing Let 𝑣1 , … 𝑣𝑘 be a sequence of consecutive vertices such that ∀ 𝑖 ∈ 1, . . 𝑘 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣𝑖 is a local minimum or ∀ 𝑖 ∈ 1, . . 𝑘 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑣𝑖 is a local maximum. Associate a-priory a bit with each vertex such that it will be different then its neighbors in the sequence. The Basic Step For Processor 𝑖; 0 ≤ 𝑖 ≤ 𝑛 − 1 ; pardo 𝑆𝐸𝑅𝐼𝐴𝐿0 𝑖 = 𝑖 Compute 𝑆𝐸𝑅𝐼𝐴𝐿1 (𝑖) If (𝑆𝐸𝑅𝐼𝐴𝐿1 (𝑖) is a local minimum) if (𝒊𝒑𝒓𝒆𝒗 𝒊𝒏𝒆𝒙𝒕 are not local minimums or 𝑆𝐸𝑅𝐼𝐴𝐿0 𝑖 𝑆𝐸𝑅𝐼𝐴𝐿1 (𝑖) = 1) Select 𝑖 If (𝑖, 𝑖𝑝𝑟𝑒𝑣 , 𝑖𝑛𝑒𝑥𝑡 are not selected and if 𝑆𝐸𝑅𝐼𝐴𝐿1 𝑖 is a local maximum ) (𝑖 is available) If (𝒊𝒑𝒓𝒆𝒗 𝒊𝒏𝒆𝒙𝒕 are not available or 𝑆𝐸𝑅𝐼𝐴𝐿0 𝑖 Select 𝑖 𝑆𝐸𝑅𝐼𝐴𝐿1 (𝑖) = 1) 𝐿𝑜𝑔(𝑛) – 𝑅𝑢𝑙𝑖𝑛𝑔 𝑆𝑒𝑡 Preforming the basic step on the input graph results in a log 𝑛 − 𝑟𝑢𝑙𝑖𝑛𝑔 𝑠𝑒𝑡 (The selected vertices) Independency adjacent vertices are never selected. (Lemma 5 and 6) ∀ 𝑣 ∈ 𝑉 ∃ 𝑢 ∈ 𝑈 𝑠. 𝑡. There exists a directed path 𝑃 𝑣, 𝑢 = 𝑣 − ⋯ − 𝑢 whose edge length ≤ log(𝑛). Every local extremum is either selected or is a neighbor of a selected vertex. ∀ 𝑣 ∈ 𝑉 the number of edges in the shortest path from 𝑣 to the next local extremum with respect to 𝑆𝐸𝑅𝐼𝐴𝐿1 is at most log 𝑛 − 1. (Lemma 3) runs in 𝑂 1 time using n processors. (𝑘) log (𝑛) − 𝑟𝑢𝑙𝑖𝑛𝑔 𝑠𝑒𝑡 A log (𝑘) (𝑛) − 𝑟𝑢𝑙𝑖𝑛𝑔 𝑠𝑒𝑡 can be obtained in 𝑂 𝑘 time using 𝑛 processors by applying the basic step to the input graph k times. In order to apply the K iteration of the basic step some changes need to be applied Vertices and neighbors of vertices that have been selected in previous iterations and their edges need to be deleted. The role of 𝑆𝐸𝑅𝐼𝐴𝐿0 will be played by 𝑆𝐸𝑅𝐼𝐴𝐿𝐾−1 The role of 𝑆𝐸𝑅𝐼𝐴𝐿1 will be played by 𝑆𝐸𝑅𝐼𝐴𝐿𝐾 Extend the basic step to handle vertices with degree < 2 A 𝟐 − 𝒓𝒖𝒍𝒊𝒏𝒈 𝒔𝒆𝒕, MIS, can be obtained in 𝑶(𝒍𝒐𝒈∗ (𝒏)) time using 𝒏 processors ′ The 𝑘 𝑡ℎ Basic Step For processor 𝑖; 0 ≤ 𝑖 ≤ 𝑛 − 1; pardo If (𝑖𝑝𝑟𝑒𝑣 𝑜𝑟 𝑖 𝑜𝑟 𝑖𝑛𝑒𝑥𝑡 were selected in a previous iteration) Delete vertex 𝑖 and its edges For processor 𝑖; 0 ≤ 𝑖 ≤ 𝑛 − 1; (𝑖 is in the remaining graph) pardo Case 1: deg 𝑖 = 2 Compute 𝑆𝐸𝑅𝐼𝐴𝐿𝐾 (𝑖) (lemma 7) If (deg 𝑖𝑝𝑟𝑒𝑣 = deg 𝑖𝑛𝑒𝑥𝑡 = 2) Case 2: deg 𝑖 = 1 If ( 𝑖, 𝑖𝑛𝑒𝑥𝑡 ∈ 𝐸 or 𝑖 ′ 𝑠 neighbor has degree 2 ) Apply basic step to 𝑖 Select 𝑖 Case 3: deg 𝑖 = 0 Select 𝑖 Lemmas 𝐿𝑒𝑚𝑚𝑎 7 Let 𝑖, 𝑗 be adjacent in the input graph for the kth basic step, then 𝑆𝐸𝑅𝐼𝐴𝐿𝐾−1 𝑖 ≠ 𝑆𝐸𝑅𝐼𝐴𝐿𝐾−1 𝑗 For 𝑘 = 1 this is trivial Lets assume that 𝑆𝐸𝑅𝐼𝐴𝐿𝐾−1 𝑖 = 𝑆𝐸𝑅𝐼𝐴𝐿𝐾−1 𝑗 . Both vertices had to be local extremums at the previous iteration. The basic step selects each local extremum or one of its neighbors therefore at least one of these vertices had to be deleted and cannot be included in the input graph. ′ Output of the 𝑘 𝑡ℎ Basic Step (1 < 𝑘) The output of the 𝑘 application of the basic step is a log k 𝑛 − 𝑟𝑢𝑙𝑖𝑛𝑔 𝑠𝑒𝑡 Independency adjacent vertices are never selected. (stems from the algorithm and the preprocessing phase) ∀ 𝑣 ∈ 𝑉 ∃ 𝑢 ∈ 𝑈 𝑠. 𝑡. There exists a directed path 𝑃 𝑣, 𝑢 = 𝑣 − ⋯ − 𝑢 whose edge length ≤ log 𝑘 (𝑛) . ′ Output of the 𝑘 𝑡ℎ Basic Step (1 < 𝑘) The input graph of the 𝑘 ′ 𝑡ℎ basic step consists of simple paths each comprising at most log 𝑘−1 (𝑛) vertices. Denote the edge length of a path by 𝛿 if 𝛿 ≤ 1 if 1 < 𝛿 < log 𝑘 𝑛 the first vertex is selected The last vertex is selected If log k n ≤ 𝛿 < log 𝑘−1(𝑛) Notice that 0 ≤ 𝑆𝐸𝑅𝐼𝐴𝐿𝐾−1 ≤ log 𝑘 𝑛 − 1 (extension of lemma 2) ∀ 𝑣 𝑠. 𝑡. 𝑣 is in the input graph of the 𝑘 ′ 𝑡ℎ basic step, the number of edges in the shortest path from 𝑣 to the next local extremum (if exists) with respect to 𝑆𝐸𝑅𝐼𝐴𝐿𝑘 is at most log k 𝑛 − 1. (extension of lemma 3) Every local extremum is either selected or is a neighbor of a selected vertex ∗ The output of the log (𝑛) iteration After log ∗ 𝑛 iterations all vertices are deleted The size of the selected set is at least 𝑛/3 The selected set forms a 2 − 𝑟𝑢𝑙𝑖𝑛𝑔 𝑠𝑒𝑡 The selected set froms a MIS Alternative Algorithm The paper presents an alternative algorithm for selecting a 2 − 𝑟𝑢𝑙𝑖𝑛𝑔 𝑠𝑒𝑡 The alternative algorithm runs in 𝑂 log 𝑛 𝑡𝑖𝑚𝑒 𝑢𝑠𝑖𝑛𝑔 𝑛 log 𝑛 𝑝𝑟𝑜𝑐𝑒𝑠𝑠𝑜𝑟𝑠 The List Ranking Problem Given a Linked list of length 𝑛, compute the distance from each element of the linked list to the end of the list The standard deterministic parallel algorithm Runs in 𝑂 𝑛𝑙𝑜𝑔 𝑛 𝑝 + log 𝑛 time (using 𝑝 processors) The Standard List Ranking Algorithm Assign a processor to each of the 𝑛 elements in the list Denote the pointer of element 𝑖 of the input array by 𝐷(𝑖) Initialize 𝑅 𝑖 = 1 , 1 ≤ 𝑖 ≤ 𝑛 Iterate 𝑐𝑒𝑖𝑙(log 𝑛 ) times For processor 𝑖, 1 ≤ 𝑖 ≤ 𝑛 pardo preform Shortcut: 𝑅 𝑖 =𝑅 𝑖 +𝑅 𝐷 𝑖 𝐷 𝑖 = 𝐷(𝐷 𝑖 ) The Summation Problem Input: An array of 𝑛 numbers 𝐴 1 , 𝐴 2 , … , 𝐴(𝑛). Assume without loss of generality that log 𝑛 is an integer Problem: Compute Σ𝑛 𝑖=1 𝐴(𝑖) Algorithm Runs in 𝑂 log 𝑛 using 𝑛 log 𝑛 processors The Summation Algorithm “plant” a balanced binary tree with 𝑛 leaves on the array. The nodes of the tree at level ℎ are denote ℎ, 𝑗 , 1 ≤ 𝑗 ≤ 2log Initialization: for all 1 ≤ 𝑗 ≤ 𝑛 pardo [0, 𝑗] = 𝐴 𝑗 For ℎ = 1 𝒕𝒐 𝑙𝑜𝑔(𝑛) do For all 1 ≤ 𝑗 ≤ 2log 𝑛 −ℎ pardo ℎ, 𝑗 = ℎ − 1, 2𝑗 − 1 + [ℎ − 1, 2𝑗] Output [𝑙𝑜𝑔 𝑛 , 1] 𝑛 −ℎ The Prefix Sum Problem Input: An array of 𝑛 numbers 𝐴 1 , 𝐴 2 , … , 𝐴(𝑛). Assume without loss of generality that log 𝑛 is an integer Problem: Compute Σ 𝑗 𝑖=1 𝐴 𝑖 𝑓𝑜𝑟 1 ≤ 𝑗 ≤ 𝑛 Algorithm Runs in 𝑂( + log 𝑛 ) time using 𝑝 processors. 𝑛 𝑝 The Prefix Sum Algorithm Preform the summation algorithm obtaining all [ℎ, 𝑗] values Associate another number 𝐶[ℎ, 𝑗] for each node of the tree Initialization: 𝐶 log 𝑛 , 1 = 0 For ℎ = 𝑙𝑜𝑔 𝑛 − 1 𝒅𝒐𝒘𝒏𝒕𝒐 0 𝒅𝒐 For all 1 ≤ 𝑗 ≤ 2log 𝑛 If −ℎ pardo 𝑗 is odd 𝑗+1 ] 2 Then 𝐶 ℎ, 𝑗 = 𝐶[ℎ + 1 , Else 𝐶 ℎ, 𝑗 = 𝐶 ℎ + 1, 2 + [ℎ, 𝑗 − 1] 𝑗 For all 1 ≤ 𝑗 ≤ 𝑛 pardo 𝐶 0, 𝑗 = 𝐶 0, 𝑗 + [0, 𝑗] Output 𝐶 0, 𝑗 , 1 ≤ 𝑗 ≤ 𝑛 The “ideal” List Ranking Problem Ideally an algorithm for the list ranking problem would preform a total of 𝑂(𝑛) short-cuts. If it were possible to “plant” a balanced binary tree in the linked list (in the order of the linked list) then solving the list ranking problem would be trivial. Assign 1 to each leaf and then apply the prefix sum algorithm In each iteration ℎ of the summation algorithm, the odd locations correspond to short cuts in the linked list at level ℎ − 1 The Problem: in order to “plant” a balanced binary tree with respect to the linked list, the list ranking problem needs to be solved! “short-cut” Example Solving the List Ranking Problem Notice that each iteration maintains the following features: The list is a single list whose length is half of the previous one It takes 𝑂(1) parallel time to execute Find an algorithm that approximates these features. “Plant” an approximately balanced tree, a 2-3 tree Each leaf corresponds to an element of the list Each level of the tree corresponds to an iteration The nodes in each level correspond to elements over which shortcuts have not yet been made in previous iterations. In each level divide the elements into two sets: Victims: elements that are shortcuts Survivors: elements that are not shortcuts Solving the List Ranking Problem In order to approximate the features in the “ideal” algorithm, the following constrains need to be met: If an element is a survivor then its successor (if any) is a victim Of every three adjacent elements at least one is a survivor From these constrains it can be deduced that: At most one half of the elements are survivors To remove all victims from the list, each survivor preforms at most two shortcut operations A 𝟐 − 𝒓𝒖𝒍𝒊𝒏𝒈 𝒔𝒆𝒕 provides an appropriate set of survivors! List Ranking Algorithm Initialization: m=n 𝑅 𝑖 = 1,1 ≤ 𝑖 ≤ 𝑛 Denote the pointer of element 𝑖 of the input array by 𝐷 𝑖 t=1 While (𝑚 > log 𝑛 ) do 𝑛 Step 1 (initialization for the present while loop iteration) For 𝑗, 0 ≤ 𝑗 ≤ 𝑚 − 1 pardo 𝑆𝐸𝑅𝐼𝐴𝐿0 𝑗 = 𝑗 List Ranking Algorithm Step 2 Compute a 2 − 𝑟𝑢𝑙𝑖𝑛𝑔 𝑠𝑒𝑡 into vector RULING using the previous algorithm. From now on for each instruction the processor that preforms it is specified Suppose 𝑝 processors are available Processor 𝑖 1 ≤ 𝑖 ≤ 𝑝 is assigned to segment [ array to this iteration. 𝑖−1 𝑚 𝑖𝑚 ,…, 𝑝 𝑝 − 1] of the input List Ranking Algorithm Step 3 For Processor 𝑖, 1 ≤ 𝑖 ≤ 𝑝 pardo For 𝑗 = 1−𝑖 𝑚 𝑝 to 𝑖𝑚 𝑝 − 1 do If ( 𝑅𝑈𝐿𝐼𝑁𝐺 𝐽 = 1 ) 𝑂𝑃 𝑖, 𝑡 = 𝐷 𝑗 , 𝑗, 𝑅 𝑗 𝑅 𝑗 =𝑅 𝑗 +𝑅 𝐷 𝑗 𝐷 𝑗 = 𝐷(𝐷 𝑗 ) If (𝑅𝑈𝐿𝐼𝑁𝐺 𝐷 𝑗 = 0) 𝑂𝑃 𝑖, 𝑡 = 𝐷 𝑗 , 𝑗, 𝑅 𝑗 𝑅 𝑗 =𝑅 𝑗 +𝑅 𝐷 𝑗 𝐷 𝑗 = 𝐷(𝐷 𝑗 ) List Ranking Algorithm Step 4 Preform the balanced binary tree prefix sum computation with respect to the vector 𝑅𝑈𝐿𝐼𝑁𝐺 resulting in: 𝑚 = Σ𝑗 𝑅𝑈𝐿𝐼𝑁𝐺(𝑗) An array of length 𝑚 containing each element 𝑗 𝑠. 𝑡. 𝑅𝑈𝐿𝐼𝑁𝐺 𝑗 = 1 (the input arry for the next iteration of the while loop) List Ranking Algorithm Step 5 Apply a simulation of the standard deterministic parallel algorithm by 𝑝 processors to the current array Step 6 For Processor 𝑖, 1 ≤ 𝑖 ≤ 𝑝 pardo For 𝑡 = 𝑇 downto 1do 𝑅 𝑂𝑃 𝑖, 𝑡 . 1 = 𝑅 𝑂𝑃 𝑖, 𝑡 . 2 − 𝑂𝑃 𝑖, 𝑡 . 3 Complexity 𝑛 𝑛 𝑂 𝑡𝑖𝑚𝑒 𝑢𝑠𝑖𝑛𝑔 𝑝 ≤ 𝑝 log 𝑛 𝑙𝑜𝑔𝑙𝑜𝑔(𝑛) * Each time 𝑚 gets a new value broadcast it to all processors Alternative Algorithm The paper presents an alternative algorithm for the List Ranking Problem The alternative Algorithm runs in 𝑂 𝑘𝑙𝑜𝑔 𝑛 𝑡𝑖𝑚𝑒 fixed 𝑘 𝑛𝑙𝑜𝑔𝑘 𝑛 𝑢𝑠𝑖𝑛𝑔 log(𝑛) for any THANK YOU
© Copyright 2026 Paperzz