Window-Based Greedy Contention Management for Transactional Memory
Gokarna Sharma (LSU), Brett Estrade (Univ. of Houston), Costas Busch (LSU)
DISC 2010 - 24th International Symposium on Distributed Computing

Transactional Memory - Background
• The emergence of multi-core architectures
  – Opportunities and challenges
• How to handle access to shared data?
  – Locks, monitors, …
• Transactional memory (TM) is an alternative synchronization abstraction
  – Simple, composable, …
• Three types
  – Hardware, software, and hybrid TMs
  – Our focus is on STM systems

STM Systems
• Progress is ensured through a contention management (CM) policy
• If transactions modify different data, everything is OK
• If transactions modify the same data, conflicts arise that must be resolved; this is the job of the contention management policy
• Of particular interest are greedy contention managers
  – Transactions restart immediately after every abort

Prior Work
• Mostly empirical evaluation
• Theoretical analysis
  – [Guerraoui et al., PODC’05]
    • Greedy contention manager
    • Competitive ratio = O(s^2), where s is the number of shared resources
  – [Attiya et al., PODC’06]
    • Improved to O(s)
  – [Schneider & Wattenhofer, ISAAC’09]
    • RandomizedRounds contention manager
    • Competitive ratio = O(C · log n), where C is the maximum number of conflicting transactions and n is the number of transactions
  – [Attiya & Milani, OPODIS’09]
    • Bimodal scheduler
    • Competitive ratio = O(s) (for bimodal workloads with equi-length transactions)

Our Contributions
• Execution window model for TM
  [Figure: M × N execution window; rows are threads 1..M, columns are transactions 1..N]
• A makespan bound for any CM algorithm, based on the contention measure C within the window and the window parameters M and N
• Two new randomized contention management algorithms that are very close to O(s)-competitive
• An adaptive version that adapts to the amount of contention C

Roadmap
• Previous TM models and problem complexity
• Our TM model
• Our algorithms and proof ideas

Previous TM Models
• One-shot scheduling problem
  – n transactions, a single transaction per thread
  – The best bound proven to be achievable is O(s)
• Problem complexity: directly related to vertex coloring
  – Coloring problem → one-shot scheduling problem → one-shot scheduling solution → coloring solution
• It is NP-hard to approximate an optimal vertex coloring
• Can we do better under the limitations of the coloring reduction?

Execution Window Model
• An M × N window W
  – M threads with a sequence of N transactions per thread, i.e., a collection of N one-shot transaction sets
  [Figure: M × N execution window; rows are threads 1..M, columns are transactions 1..N]

Makespan Bounds
• Let C denote the maximum number of conflicting transactions for any transaction inside the window
• Trivial makespan bounds:
  – Straightforward upper bound: τ · min(CN, MN), where τ is the execution time of a transaction
  – One-shot analysis bound [Attiya et al., PODC’06]: O(sN)
  – Using RandomizedRounds [Schneider & Wattenhofer, ISAAC’09] N times: makespan bound O(τ · CN log M)
• Our bounds:
  – Offline-Greedy: makespan bound = O(τ · (C + N log(MN))) and competitive ratio = O(s + log(MN)), with high probability
  – Online-Greedy: makespan bound = O(τ · (C log(MN) + N log^2(MN))) and competitive ratio = O(s · log(MN) + log^2(MN)), with high probability

Intuition
[Figure: the M × N window with a random interval inserted at the start of each thread; labels: N, N', M, random interval]
• The random delays shift conflicting transactions inside the window, so that their execution times may not coincide
• The benefit is more apparent when conflicts are more frequent between transactions in the same column and less frequent between transactions in different columns

How It Works (1/2)
• Random intervals: assume each thread P_i knows C_i and each transaction has the same duration τ (this assumption can be removed)
• Conflicts: time steps are grouped into frames (each time step has length τ)
  – The frame size depends on the conflict resolution strategy of the algorithm
• Number of frames in the random interval: each thread chooses a number q_i independently and uniformly at random from the range [0, α_i − 1], where α_i = C_i / log(MN)
• Handling conflicts: use priorities

How It Works (2/2)
[Figure: threads 1..M, each starting with a random interval followed by N frames; F_11 is the first frame of Thread 1, where T_11 executes, and F_12 is its second frame, where T_12 executes; other labeled frames include F_1N and F_3N]
• q_1 ∈ [0, α_1 − 1], α_1 = C_1 / log(MN)
• C = max_i C_i, 1 ≤ i ≤ M
• Makespan = (C / log(MN) + number of frames) × frame size = (C / log(MN) + N) × frame size

Offline-Greedy Algorithm (1/2)
• Initialization:
  – Frames are of size Φ = Θ(τ · ln(MN)) time steps
  – Each thread P_i is initially assigned a random period of q_i ∈ [0, α_i − 1] frames, α_i = C_i / log(MN)
  – Each transaction T_ij is assigned to frame F_ij = q_i + (j − 1)
• Priority assignment: each transaction has one of two priorities, low or high
  – Transaction T_ij is initially in low priority
  – T_ij switches to high priority in the first time step of frame F_ij and remains in high priority thereafter
• Conflict resolution: uses the conflict graph explicitly to resolve conflicts
  – The conflict graph is dynamic and evolves as the execution of the transactions progresses

Offline-Greedy Algorithm (2/2)
• Proof intuition: with high probability, each transaction commits in its assigned frame
  – Let A’ ⊆ A denote the subset of transactions conflicting with T_ij in frame F_ij
    • If |A’| ≤ log(MN) − 1, then T_ij commits in frame F_ij
    • |A’| ≥ log(MN) with probability at most (1/MN)^2
• Makespan: O(τ · (C + N log(MN))) with high probability
  – Pro: for C ≤ N log(MN), the makespan is within a log(MN) factor of optimal, since N is a trivial lower bound
  – Con: the dependency graph must be known to resolve conflicts
• Competitive ratio: O(s + log(MN)) with high probability
  – Pro: independent of the contention measure C

Online-Greedy Algorithm (1/2)
• Online in the sense that it does not depend on knowing the dependency graph to resolve conflicts
• Similar to Offline-Greedy except for the conflict resolution strategy
• Priority assignment
  – Each transaction carries two priorities as a vector (π^(1), π^(2))
  – π^(1) is the Boolean (low/high) priority, as in Offline-Greedy
  – π^(2) ∈ [1, M] is a random priority: a transaction chooses π^(2) uniformly at random at the start of frame F_ij and after every abort [idea from Schneider & Wattenhofer, ISAAC’09]
• Conflict resolution
  – On a conflict of T_ij with T_kl: if π_ij^(2) < π_kl^(2) then abort(T_ij, T_kl), otherwise abort(T_kl, T_ij)

Online-Greedy Algorithm (2/2)
• Proof intuition: the frame duration is now Φ’ = O(τ · log^2(MN))
  – The analysis is similar to that of Offline-Greedy
• Makespan: O(τ · (C log(MN) + N log^2(MN))) with high probability
  – Pro: no need to know the dependency graph to resolve conflicts
  – Con: the makespan is worse than that of Offline-Greedy
• Competitive ratio: O(s · log(MN) + log^2(MN)) with high probability
  – Pro: independent of the contention measure C

Adaptive-Greedy Algorithm
• Limitation of the Offline-Greedy and Online-Greedy algorithms
  – The values of C_i need to be known in advance
• Adaptive-Greedy: each thread starts by guessing C_i = 1
  – Similar to the exponential back-off strategy used by Polka
  – Based on its current estimate of C_i, the thread attempts to execute the Online-Greedy algorithm
  – If a thread P_i is unable to commit transactions (a bad event), then P_i assumes its choice of C_i is incorrect and starts over with C_i’ = 2 · C_i for the remaining transactions
• The correct choice of C_i is reached in log C_i iterations

Discussions
• For variable-length transactions
  – τ in the makespan bounds is replaced by τ_max, the maximum duration of any transaction in the window
  – The competitive ratio bounds gain a factor of τ_max / τ_min, where τ_min is the minimum duration of any transaction in the window
• Future extensions
  – Instead of one randomization interval at the beginning of the window, random periods of low priority between subsequent transactions
  – Dynamic expansion and contraction of the execution window to preserve the contention measure C

Conclusions
• Execution window model for TM
• Two new randomized greedy CM algorithms that are very close to O(s)-competitive
• An adaptive version of these algorithms, which avoids the limitation of needing to know the value of C in advance
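
Illustrative sketch (not from the slides)
The frame assignment used by Offline-Greedy and Online-Greedy (a random delay q_i per thread, then F_ij = q_i + (j − 1)) and the doubling of the C_i estimate in Adaptive-Greedy can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the function names are hypothetical, the logarithm is taken as natural, α_i is rounded up and clamped to at least 1, and no conflict resolution or actual transaction execution is modeled.

```python
import math
import random


def assign_frames(contention, n_per_thread, rng):
    """Pick a random delay q_i per thread and assign frames F_ij = q_i + (j - 1).

    contention[i] is the (assumed known) contention measure C_i of thread i.
    Returns (q, frames) where frames[i][j - 1] is the frame index of T_ij.
    """
    m = len(contention)
    # Assumption: natural log, clamped so that tiny windows do not divide by ~0.
    log_mn = max(1.0, math.log(m * n_per_thread))
    q, frames = [], []
    for c_i in contention:
        # Assumption: alpha_i = C_i / log(MN) is rounded up and is at least 1.
        alpha_i = max(1, math.ceil(c_i / log_mn))
        q_i = rng.randrange(alpha_i)  # uniform over 0 .. alpha_i - 1
        q.append(q_i)
        frames.append([q_i + (j - 1) for j in range(1, n_per_thread + 1)])
    return q, frames


def frames_used(frames):
    """Number of frames spanned by the whole window (frame indices start at 0)."""
    return max(row[-1] for row in frames) + 1


def adaptive_guesses(true_c):
    """Sequence of C_i estimates tried by a thread that doubles after each bad event.

    Simplification: a guess counts as 'bad' exactly when it is below true_c.
    """
    guess = 1
    guesses = [guess]
    while guess < true_c:
        guess *= 2
        guesses.append(guess)
    return guesses


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the run repeatable
    q, frames = assign_frames([8, 2, 16, 4], n_per_thread=5, rng=rng)
    print("random delays q_i:", q)
    print("frames spanned by the window:", frames_used(frames))
    print("estimates tried for C_i = 10:", adaptive_guesses(10))
```

Multiplying the value returned by frames_used by the frame size (Φ for Offline-Greedy, Φ’ for Online-Greedy) gives the makespan of one run under the assumption that every transaction commits in its assigned frame, which mirrors the (C / log(MN) + N) × frame size expression on the slides.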