Scalably Scheduling Processes with Arbitrary Speedup Curves (Better Scheduling in the Dark)
Jeff Edmonds, York University; Kirk Pruhs, University of Pittsburgh. SODA 2009.

Every Deterministic Nonclairvoyant Scheduler has a Suboptimal Load Threshold
Jeff Edmonds, York University. Submitted to STOC 2009.

The Scheduling Problem
• Allocate p processors to a stream of n jobs.
• Measure of quality (total flow time): F(A(I)) = Σ_{i=1..n} (c_i − r_i) = ∫_t n_t dt, where r_i and c_i are job i's release and completion times and n_t is the number of jobs alive at time t.
• Competitive ratio: max_I F(A(I)) / F(Opt(I)).

Examples of Schedulers
• Shortest Remaining Processing Time (SRPT)
• Shortest Elapsed Time First (SETF)

Online
• Online: does not know the future. Optimal: all knowing, all powerful.
• SRPT is competitive: max_I F(SRPT(I)) / F(Opt(I)) = 1.

Nonclairvoyant
• Nonclairvoyant: knows neither the future nor each job's remaining work. Optimal: all knowing, all powerful.
• Not competitive: F(SETF)/F(Opt) = Ω(n), F(Equi)/F(Opt) = Θ(n/ln n), and every nonclairvoyant algorithm has F(Nonclairvoyant)/F(Opt) = Ω(√n) [MPT].

Performance vs Load
[Figure: three panels plotting average performance F(A(I))/F(Opt(I)) against load.]
• Without extra resources, the worst case max_I F(A(I))/F(Opt(I)) = Ω(n) is reached as the load approaches capacity.
• An algorithm may instead stay O(1)-competitive up to some load threshold c: with speed s, max_I F(A_s(I))/F(Opt(I)) = O(c).

Resource Augmentation
• Give the nonclairvoyant algorithm extra speed over the optimal.
• [KP]: F(SETF_{1+ε}) / F(Opt_1) = Θ(1/ε).
• [E]: F(Equi_{2+ε}) / F(Opt_1) = Θ(1/ε); the speed 2+ε is required.

Sublinear Nondecreasing Speedup Functions
• A set of jobs J = {J_1, …, J_n}.
• Each job J_i has phases J_i^1, …, J_i^{q_i}.
• Each phase J_i^q = ⟨W_i^q, Γ_i^q⟩ has work W_i^q and a speedup function Γ_i^q that is nondecreasing and sublinear.
• Examples: Γ(p) = p (fully parallelizable), Γ(p) = 1 (sequential), Γ(p) = p^α.
• Question: with extra speed, is a nonclairvoyant scheduler competitive against the all-knowing, all-powerful optimal?
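As a concrete check of the flow-time objective, here is a minimal Python sketch (not from the talk; the toy jobs are hypothetical) computing F(A(I)) = Σ(c_i − r_i) both directly and as the integral of n_t over time:

```python
# Sketch: total flow time computed two ways on a toy schedule.
# Jobs are hypothetical (release, completion) pairs; F(A(I)) = sum(c_i - r_i)
# should equal the integral over time of n_t, the number of jobs alive at t.

jobs = [(0, 3), (1, 4), (2, 6)]

# Direct sum of flow times.
flow_sum = sum(c - r for r, c in jobs)

# Event-based integration of n_t.
events = sorted({t for rc in jobs for t in rc})
flow_integral = 0.0
for a, b in zip(events, events[1:]):
    alive = sum(1 for r, c in jobs if r <= a < c)  # n_t on [a, b)
    flow_integral += alive * (b - a)

print(flow_sum, flow_integral)  # 10 10.0
```

The equality F(A(I)) = ∫_t n_t dt is what makes potential-function arguments over n_t possible later in the talk.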
Sublinear Nondecreasing Speedup Functions
• Jobs arrive over time; a subset is currently alive.
• Opt gives all its resources to the parallelizable jobs and hence completes them as they arrive. The sequential jobs complete with no resources.
• Shortest Elapsed Time First (SETF) gives all its resources to a sequential job, wasting them. The parallelizable jobs, getting no resources, never complete: F(SETF_s) / F(Opt_1) = Ω(n) for any speed s.
• Equi spreads its resources fairly. Most are wasted on the sequential jobs; the parallelizable jobs don't get enough and fall behind. Equi_{1+ε} wastes most of its resources and has only ε extra, so it is not competitive; speed 2+ε is required [E]: F(Equi_{2+ε}) / F(Opt_1) = Θ(1/ε).

Latest Arrival Processor Sharing (LAPS)
• Of the n_t jobs currently alive, sorted by arrival time, LAPS_β shares the processors equally among the βn_t latest arrivals.
• SETF end (β ≈ 0): concentrates on 1 job, which may be sequential. Equi end (β = 1): spreads too thin and needs speed 2+ε.
• New result [EP]: F(LAPS_{⟨β,1+β+ε⟩}) / F(Opt_1) = Θ(1/(βε)).
• The speed compromise: the spectrum runs from F(SETF_{1+ε})/F(Opt_1) at β ≈ 0, through Θ(1/(βε)) for LAPS_{⟨β,1+β+ε⟩}, to F(Equi_{2+ε})/F(Opt_1) = Θ(1/ε) at β = 1.

Backwards Quantifiers
• Desired result: ∃Alg ∀ε: F(Alg_{1+ε}) / F(Opt_1) = Θ(1/ε^{O(1)}).
• Obtained [EP]: ∀ε ∃Alg: taking β = ε/2, F(LAPS_{⟨β,1+ε⟩}) / F(Opt_1) = Θ(1/(βε)) ≈ 1/ε².
• New result [E, STOC09?]: ∀Alg ∃ε: F(Alg_{1+ε}) / F(Opt_1) = ω(1). The quantifiers are backwards: no single algorithm works for every ε.

Performance vs Load Threshold
• Defn: a set of jobs I has load L ∈ [0,1] if F(Opt_L(I)) = 1, i.e. it can be optimally handled with speed L.
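The LAPS_β allocation rule can be sketched in a few lines of Python (a minimal illustration, not the paper's pseudocode; the arrival times are hypothetical). β = 1 recovers Equi, and β ≈ 0 approaches SETF:

```python
# Sketch: the LAPS_beta allocation rule. Of the n_t alive jobs sorted by
# arrival time, the ceil(beta * n_t) latest arrivals share the processing
# power equally; every other job gets nothing.
import math

def laps_allocation(arrival_times, beta, speed=1.0):
    """Return per-job speed shares for the currently alive jobs."""
    n = len(arrival_times)
    k = max(1, math.ceil(beta * n))       # number of favored (latest) jobs
    order = sorted(range(n), key=lambda i: arrival_times[i])
    favored = set(order[n - k:])          # the k latest arrivals
    return [speed / k if i in favored else 0.0 for i in range(n)]

alive = [0.0, 1.0, 2.5, 3.0]              # hypothetical arrival times
print(laps_allocation(alive, beta=0.5))   # [0.0, 0.0, 0.5, 0.5]
print(laps_allocation(alive, beta=1.0))   # Equi: [0.25, 0.25, 0.25, 0.25]
```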
• Defn: F_β(L) = max over I with load L of F(LAPS_{⟨β,1⟩}(I)).
• Equi (β = 1) has the best performance, but it can only handle half load: L = 1/2.
• Small β can handle L = 1−β, almost full load, but its performance degrades with 1/β.

Lower Bound
• LAPS_β concentrates on βn_t jobs: too thin, and it needs speed 1+β.
• Measure of resource concentration: when the processor allocations are ρ_i, define β_t = 1 / (n_t · Σ_{i≤n_t} ρ_i²). For LAPS_β, Σ_{i≤n_t} ρ_i² = (βn_t)·(1/(βn_t))² = 1/(βn_t), so β_t = β.
• Too concentrated, and the favored job may be sequential: performance = 1/β.
• β → 0 behaves like SETF; constant β behaves like Equi.
• In general, Alg specifies a processor allocation for each job whenever n_t jobs are alive, and β = lim_{t→∞} β_t.

Lower Bound
• Input: a stream of jobs arriving at times t_i.
• Opt ignores the extra jobs and completes the stream: Flow = Σ_i 2t_i.
• Alg attempts all the jobs and completes none: Flow = Σ_i (1+i)·t_i.
• Oops: we need an extra restriction, namely that we can switch the job the algorithm is working on the most to being sequential. This is likely needed because the algorithm favors the more recent jobs.

Lower Bound (adversary)
• Alg specifies a processor allocation for each job whenever n_t jobs are alive (e.g. Equi or LAPS_β; the argument handles an arbitrary Alg).
• The adversary promises that no job completes, computes the work w_i completed on each job, and then gives each job exactly work w_i so that it does not complete. The algebra is nontrivial; Brouwer's fixed point theorem is used.
• The time t_i between job arrivals is set to w_i, so Opt can complete the jobs as they arrive.
• Finally, compute the competitive ratio.

Proof Sketch: F(LAPS_{⟨β,1+β+ε⟩}) / F(Opt_1) = Θ(1/(βε))
• In the worst-case inputs, each phase is either sequential or parallelizable.

Proof Sketch: Potential Function
• Define a potential function Φ_t. It says how much debt LAPS has in the bank.
• Φ_0 = Φ_final = 0.
• Φ_t does not increase as jobs arrive or complete.
• At other times, dF(LAPS)/dt + dΦ/dt ≤ c · dF(Opt)/dt.
• The result follows by integrating: F(LAPS) + Φ_final − Φ_0 ≤ c · F(Opt).

Potential Function
• n_t jobs currently alive, sorted by arrival time.
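The resource-concentration measure from the lower-bound slides can be sanity-checked numerically. A minimal sketch, assuming the definition β_t = 1 / (n_t · Σ_i ρ_i²) as read off the slides (the job counts are hypothetical):

```python
# Sketch: the resource-concentration measure beta_t = 1/(n_t * sum rho_i^2),
# where rho_i is the fraction of the processors given to job i. For LAPS_beta
# the beta*n_t favored jobs each get rho = 1/(beta*n_t), so the measure
# recovers beta itself; for Equi it recovers 1.

def concentration(rhos):
    n = len(rhos)
    return 1.0 / (n * sum(r * r for r in rhos))

n_t = 100
beta = 0.2
k = int(beta * n_t)                          # 20 favored jobs
rhos = [1.0 / k] * k + [0.0] * (n_t - k)     # LAPS_beta allocation
print(concentration(rhos))                   # ≈ 0.2, recovering beta

rhos_equi = [1.0 / n_t] * n_t                # Equi allocation
print(concentration(rhos_equi))              # ≈ 1.0
```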
• Coefficient: job i's coefficient is its rank i in arrival order; x_i is the parallelizable work done by Opt but not by LAPS on job i.
• Φ = γ · Σ_{i∈[n_t]} i · max(x_i, 0).
• Job arrives: it enters with x = 0 and coefficient n_t+1, so dΦ/dt = 0.
• Job completes: later jobs' coefficients only decrease, so ΔΦ ≤ 0.
• Opt works: dΦ/dt ≤ γ · n_t · 1, the 1 being the speed of Opt.
• LAPS works: dΦ/dt ≤ γ · Σ_{i∈[(1−β)n_t, n_t−ℓ̂_t]} i · (−(1+β+ε)/(βn_t)), where:
  – (1+β+ε) is the speed of LAPS, shared with the βn_t latest jobs;
  – [(1−β)n_t, n_t−ℓ̂_t] is the range of the βn_t jobs worked on, trimmed so as to count less work as not done by LAPS.

Potential Function
• n_t jobs currently alive, sorted by arrival time.
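The bookkeeping behind the potential function can be checked numerically. A minimal sketch, assuming the form Φ = γ · Σ_i i · max(x_i, 0) from the proof outline (the lag values below are hypothetical):

```python
# Sketch: the potential Phi = gamma * sum_i i * max(x_i, 0) over alive jobs
# sorted by arrival, where x_i is parallelizable work Opt has done but LAPS
# has not. A newly arriving job enters with x = 0, so Phi is unchanged on
# arrival; when a job leaves, later coefficients shift down, so Phi drops.

def potential(xs, gamma):
    # xs is ordered by arrival time: xs[0] is the earliest arrival.
    return gamma * sum(i * max(x, 0.0) for i, x in enumerate(xs, start=1))

gamma = 2.0
xs = [1.0, -0.5, 3.0]        # hypothetical lags; negative means LAPS is ahead
phi = potential(xs, gamma)   # 2 * (1*1 + 2*0 + 3*3) = 20.0

# Job arrival: append x = 0; the potential does not change.
assert potential(xs + [0.0], gamma) == phi

# A job leaves: the remaining coefficients shift down, so Phi cannot increase.
assert potential(xs[1:], gamma) <= phi
print(phi)  # 20.0
```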
• ℓ̂_t = # of jobs that are sequential under LAPS or on which LAPS is ahead, i.e. x_i ≤ 0; these jobs contribute no decrease, so they are dropped from the sum's range.

Potential Function: dF(LAPS)/dt + dΦ/dt ≤ c · dF(Opt)/dt
• Here n_t = # of jobs alive under LAPS, ℓ̂_t ≤ N_t = # of jobs alive under Opt, and c = Θ(1/(βε)) is the resulting competitive ratio.
• Opt works: dΦ/dt ≤ γ · n_t · 1. LAPS works: dΦ/dt ≤ γ · Σ_{i∈[(1−β)n_t, n_t−ℓ̂_t]} i · (−(1+β+ε)/(βn_t)).
• A page of math later, and the proof is done.

Conclusions
• Latest Arrival Processor Sharing LAPS_β: of the n_t jobs currently alive, sorted by arrival time, share the processors equally among the βn_t latest arrivals.
• [EP] Resource augmentation: F(LAPS_{⟨β,1+β+ε⟩}) / F(Opt_1) = Θ(1/(βε)).
• [E] Suboptimal load threshold: ∀Alg ∃ε: F(Alg_{1+ε}) / F(Opt_1) = ω(1).

Other Models, Same Techniques
• Broadcast: many page requests serviced simultaneously [EP:SODA02, EP:SODA03].
• TCP: additive increase & multiplicative decrease ~ EQUI [EDD:PAA03, E:Latin04].
• Speed scaling: each algorithm can dynamically choose its speed s, but it must pay for it with energy P(s) = s^α [CELLSP:STACS09, EP??].

Thank you

Conclusions
• n_t jobs currently alive, sorted by arrival time.
• Latest Arrival Processor Sharing LAPS_β shares the processors equally among the βn_t latest arrivals.
• [EP] Resource augmentation: F(LAPS_{⟨β,1+β+ε⟩}) / F(Opt_1) = Θ(1/(βε)).
• [EP]: ∀Alg ∃ε: F(Alg_{1+ε}) / F(Opt_1) = ω(1).
• [CELLMP] Speed scaling: LAPS_β is Θ(α²)-competitive for β = 1/α.
• [EP] Speed scaling with multi-processors: LAPS_β is Θ(ln p)-competitive.

Scheduling in the Dark
• Multiprocessor, batch: Edmonds, Chinn, Brecht, Deng. STOC 97.
• Speed 2+ε: Edmonds. STOC 99.
• Speed 1+ε: Edmonds, Pruhs. SODA 09.
• ∀ε not competitive: Edmonds, Pruhs. ? STOC 09.
• Multicast, reduction: Edmonds, Pruhs. SODA 02.
• Multicast, LWF: Edmonds, Pruhs. SODA 04.
• TCP, one bottleneck: Edmonds, Datta, Dymond. PAA 03.
• TCP, general network: Edmonds. Latin 04.
• Speed scaling, one processor: Chan, Edmonds, Lam, Lee, Marchetti-Spaccamela, Pruhs. ? STACS 09.
• Speed scaling, multi-processor: Edmonds, Pruhs. Being written.

Nonclairvoyant Speed Scaling for Flow and Energy
Ho-Leung Chan (Pittsburgh), Jeff Edmonds (York), Tak-Wah Lam (Hong Kong), Lap-Kei Lee (Hong Kong), A. Marchetti-Spaccamela (Roma), Kirk Pruhs (Pittsburgh). Submitted to STACS 2009.

Speed Scaling
• Each algorithm can dynamically choose its speed s, but it must pay for it with energy P(s) = s^α.
• Objective: (F(Alg) + E(Alg)) / (F(Opt) + E(Opt)) = (∫_t n_{⟨Alg,t⟩} dt + ∫_t (s_{⟨Alg,t⟩})^α dt) / (∫_t n_{⟨Opt,t⟩} dt + ∫_t (s_{⟨Opt,t⟩})^α dt).
• Known: there is a 3-competitive clairvoyant algorithm [BCP].
• New [CELLMP]: LAPS_β is Θ(α³)-competitive for β = 1/α, running at speed s_{⟨LAPS,t⟩} = (n_{⟨LAPS,t⟩})^{1/α} and partitioning that speed among the βn latest-arriving jobs.
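The speed rule s_t = n_t^{1/α} has a clean consequence worth seeing numerically: the power paid, P(s_t) = s_t^α = n_t, exactly tracks the rate at which flow accumulates. A minimal sketch, assuming that rule from the slides (the job-count trajectory is hypothetical):

```python
# Sketch: nonclairvoyant speed scaling with s_t = n_t^(1/alpha) and power
# P(s) = s^alpha. At every instant the power spent equals n_t, so the energy
# term of the objective stays balanced against the flow term.

alpha = 3.0

def power(speed):
    return speed ** alpha

def laps_speed(n_alive):
    return n_alive ** (1.0 / alpha)

# Hypothetical trajectory of alive-job counts over unit time steps.
n_trajectory = [1, 2, 3, 2, 1]

flow = sum(n_trajectory)                              # integral of n_t = 9
energy = sum(power(laps_speed(n)) for n in n_trajectory)
print(flow, energy)                                   # flow = 9, energy ≈ 9
```

This balancing is the standard motivation for the n^{1/α} speed choice: neither objective term can dominate the other.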
Speed Scaling
• New [CELLMP]: LAPS_β is Θ(α³)-competitive for β = 1/α, but only for one processor or fully parallelizable jobs.
• New [EP]: multi-processors & parallel–sequential jobs.

Speed Scaling Models
• Processors model: dynamically allocate p_i unit-speed processors to job J_i; energy is (#processors)^α per unit time; α²-competitive.
• Individual speeds model: dynamically partition the p processors among the jobs and run processor k at speed s_k; energy is s_k^α per processor per unit time; log p-competitive.

Speed Scaling for Flow and Energy with Multi-Processors and Arbitrary Speedup Curves
Jeff Edmonds, York University; Kirk Pruhs, University of Pittsburgh. Being written.

Performance vs Load
• Defn: a set of jobs I has load s ∈ [0,1] if F(Opt_s(I)) = 1, i.e. it can be optimally handled with speed s.
• Defn: F_β(s) = max over I with load s of F(LAPS_{⟨β,1⟩}(I)) = max_I F(LAPS_{⟨β,1⟩}(I)) / F(Opt_s(I)).
• With s_LAPS / s_Opt = 1+β+ε: F(LAPS(I)) / F(Opt(I)) = 4(1+β+ε)/(βε).
• With s_LAPS / s_Opt = 1 at load s: F_β(s) = 4 / (s·β·(1 − (1+β)·s)) for s < 1/(1+β).
• Equi (β = 1) has the best performance, but it can only handle half load: s = 1/(1+β) = 1/2.
• Small β can handle s ≈ 1−β, almost full load, but its performance degrades with 1/β.

Speed Scaling
• Each algorithm can dynamically choose its speed s, but it must pay for it with energy P(s) = s^α.
• Known: there is a 3-competitive clairvoyant algorithm [BCP].
• New [CELLMP]: LAPS_β is Θ(α²)-competitive for β = 1/α when P(s) = s^α.
• Every nonclairvoyant algorithm is ω(1)-competitive when P(s) = s^α with α = ω(1).
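The gap between the two multiprocessor energy models comes down to the convexity of s^α: delivering the same total work rate on many slow processors is cheaper than on a few fast ones. A minimal sketch of that comparison; my reading of the two models, with all numbers hypothetical:

```python
# Sketch: energy rate for one instant, same total work rate of 4, under my
# reading of the two models. Convexity of s^alpha makes concentration
# expensive and spreading cheap.

alpha = 2.0
p = 8                                    # total processors available

# "Processors model" reading: one job gets 4 unit-speed processors,
# paying (#processors)^alpha.
energy_processors = 4 ** alpha           # 16.0

# "Individual speeds model" reading: the same work rate (4) delivered by
# all 8 processors running at speed 0.5, paying sum of s_k^alpha.
energy_individual = p * (0.5 ** alpha)   # 8 * 0.25 = 2.0

print(energy_processors, energy_individual)  # 16.0 2.0
```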