
Energy-Aware Wireless Scheduling with Near Optimal Backlog and Convergence Time Tradeoffs
Michael J. Neely
University of Southern California
INFOCOM 2015, Hong Kong
http://www-bcf.usc.edu/~mjneely
A Single Wireless Link

[Figure: queue with arrival process A(t), backlog Q(t), and service μ(t)]

Q(t+1) = max[Q(t) + A(t) − μ(t), 0]

• Uncontrolled: A(t) = random arrivals, rate λ
• Controlled: μ(t) = bits served [depends on power use & channel state]
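As a concrete illustration of this recursion, here is a minimal simulation sketch; the Bernoulli arrival and on/off service processes are assumed example values, not from the talk:

```python
import random

def simulate_queue(T=100000, lam=0.3, serve_prob=0.5, seed=0):
    """Simulate Q(t+1) = max[Q(t) + A(t) - mu(t), 0] over T slots."""
    rng = random.Random(seed)
    Q = 0
    backlog_sum = 0
    for _ in range(T):
        A = 1 if rng.random() < lam else 0          # Bernoulli arrivals, rate lam
        mu = 1 if rng.random() < serve_prob else 0  # on/off service this slot
        Q = max(Q + A - mu, 0)
        backlog_sum += Q
    return backlog_sum / T  # time-average backlog

print(simulate_queue())
```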
Random Channel States ω(t)

[Figure: sample path of channel states ω(t) over slots t]

• Observe ω(t) on slot t
• ω(t) in {0, ω1, ω2, …, ωM}
• ω(t) ~ i.i.d. over slots, with π(ωk) = Pr[ω(t) = ωk]
• Probabilities are unknown
Opportunistic Power Allocation

p(t) = power decision on slot t [based on observation of ω(t)]

Assume:
• p(t) in {0, 1} ("on" or "off")
• μ(t) = p(t)ω(t)

Time-average expectations:
p̄(t) = (1/t) ∑_{τ=0}^{t−1} E[p(τ)]
Stochastic Optimization Problem

Minimize:    lim_{t→∞} p̄(t)
Subject to:  lim_{t→∞} μ̄(t) ≥ λ
             p(t) in {0, 1} for all slots t

p* = ergodic optimal average power

Define: Fix ε > 0. An ε-approximation is achieved on slot t if:
p̄(t) ≤ p* + ε
μ̄(t) ≥ λ − ε

Challenge: the probabilities π(ωk) are unknown!
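For intuition about p*: when the probabilities π(ωk) are known, the optimal stationary policy transmits on the largest channel states first and time-shares on the boundary state. A minimal sketch of that computation, using a hypothetical state distribution:

```python
def optimal_power(states, probs, lam):
    """Compute p* = minimum average power achieving average rate >= lam,
    assuming p(t) in {0,1} and mu(t) = p(t)*omega(t)."""
    rate, power = 0.0, 0.0
    # Transmit on the largest states first: most bits per unit of power.
    for w, pi in sorted(zip(states, probs), reverse=True):
        if w <= 0:
            break
        gain = pi * w                   # rate added by always serving state w
        if rate + gain >= lam:
            frac = (lam - rate) / gain  # time-share fraction on boundary state
            return power + frac * pi
        rate, power = rate + gain, power + pi
    raise ValueError("lam exceeds E[omega]; rate is not supportable")

# hypothetical example: omega(t) uniform over {0, 1, 2, 3}, lam = 1
print(optimal_power([0, 1, 2, 3], [0.25] * 4, lam=1.0))  # -> 0.375
```

This greedy structure is also why the h(μ) curve in Part 2 is piecewise linear: each corner is a pure threshold policy and each segment is a time-share between adjacent thresholds.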
Prior algorithms and analysis

                                              E[Q]          Tε
• Neely 03, 06 (DPP); Georgiadis et al. 06:   O(1/ε)        O(1/ε²)
• Neely, Modiano, Li 05, 08:                  O(1/ε)        O(1/ε²)
• Neely 07:                                   O(log(1/ε))   O(1/ε²)
• Huang et al. '13 (DPP-LIFO):                O(log²(1/ε))  O(1/ε²)
• Li, Li, Eryilmaz '13, '15:                  O(1/ε)        O(1/ε²)
  (additional sample path results)
• Huang et al. '14:                           O(1/ε^(2/3))  O(1/ε^(5/3))
Main Results

1. Lower Bound: No algorithm can achieve convergence time better than Ω(1/ε).
2. Upper Bound: A tighter analysis shows the Drift-Plus-Penalty (DPP) algorithm achieves:
   • Convergence time: Tε = O(log(1/ε)/ε)
   • Average queue size: E[Q] ≤ O(log(1/ε))
Part 1: Ω(1/ε) Lower Bound for all Algorithms

Example system:
• ω(t) in {1, 2, 3}
• Pr[ω(t) = 3], Pr[ω(t) = 2], Pr[ω(t) = 1] unknown.

Proof methodology:
• Case 1: Pr[transmit | ω(0) = 2] > ½.
  o Assume Pr[ω(t) = 3] = Pr[ω(t) = 2] = ½.
  o Optimally compensate for the mistake on slot 0.
• Case 2: Pr[transmit | ω(0) = 2] ≤ ½.
  o Assume different probabilities.
  o Optimally compensate for the mistake on slot 0.
Case 1: Fix λ = 1, ε > 0

[Figure: power E[p(t)] versus rate E[μ(t)] with the h(μ) curve; X marks the optimal operating point. The slot-0 point (E[μ(0)], E[p(0)]) lies in a region A bounded away from X, and optimal compensation for this slot-0 mistake requires time Ω(1/ε).]
Part 2: Upper Bound

• Channel states 0 < ω1 < ω2 < … < ωM
• General h(μ) curve (piecewise linear)

[Figure: piecewise-linear h(μ) curve with the point (λ, p*) marked; corner points correspond to threshold policies "transmit iff ω(t) ≥ ωk", and points between corners to time-sharing between the thresholds ωk and ωk−1.]
Drift-Plus-Penalty Algorithm (DPP)

• Δ(t) = Q(t+1)² − Q(t)²   [drift]
• Observe ω(t), choose p(t) to minimize Δ(t) + V p(t)   [drift plus weighted penalty]
• The algorithm reduces to a threshold rule (see the sketch below):
  p(t) = 1 if Q(t)ω(t) ≥ V
  p(t) = 0 otherwise
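A minimal simulation sketch of this threshold rule; the deterministic unit arrivals and the uniform channel distribution are assumed example values (the same toy setup as the p* sketch above, for which p* ≈ 0.375):

```python
import random

def dpp(T=200000, states=(0, 1, 2, 3), V=20.0, seed=1):
    """DPP for the on/off link: transmit iff Q(t)*omega(t) >= V."""
    rng = random.Random(seed)
    Q = 0.0
    power_sum = queue_sum = 0.0
    for _ in range(T):
        w = rng.choice(states)        # i.i.d. channel state omega(t)
        A = 1.0                       # unit arrivals each slot (lam = 1)
        p = 1 if Q * w >= V else 0    # DPP threshold decision
        Q = max(Q + A - p * w, 0.0)   # queue update with mu(t) = p(t)*omega(t)
        power_sum += p
        queue_sum += Q
    return power_sum / T, queue_sum / T

avg_p, avg_q = dpp()
print(f"avg power = {avg_p:.3f} (p* = 0.375), avg backlog = {avg_q:.1f}")
```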
Drift Analysis of DPP

[Figure: Q(t) axis with thresholds V/ω_{k+1} < V/ω_k < V/ω_{k−1} (channel states 0 < ω1 < ω2 < … < ωM). Once Q(t) ≥ V/ω_k the rule transmits whenever ω(t) ≥ ω_k; the drift is positive below the threshold matched to λ and negative above it, so Q(t) concentrates around one threshold.]
Useful Drift Lemma (with transients)

[Figure: process Z(t) with negative drift −β when Z(t) > 0]

Lemma: E[e^{rZ(t)}] ≤ D + (e^{rZ(0)} − D)ρ^t, where D is the "steady state" term and (e^{rZ(0)} − D)ρ^t is the decaying "transient" term (0 < ρ < 1).

• Apply 1: Z(t) = Q(t)
• Apply 2: Z(t) = V/ωk − Q(t)

After a transient time of O(V) we get: Pr[red intervals] = O(e^{−cV}).
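The step from the exponential moment bound to this tail bound is a standard Markov/Chernoff argument; a sketch (the constant c and the mapping to the "red intervals" are assumptions of this sketch, derived from the lemma parameters r and D):

```latex
% Once t = O(V), the transient term is negligible, so E[e^{rZ(t)}] <= 2D (say).
% Markov's inequality applied to e^{rZ(t)} then gives, for any z >= 0:
\Pr[Z(t) \ge z]
  = \Pr\!\left[ e^{rZ(t)} \ge e^{rz} \right]
  \le \frac{\mathbb{E}\!\left[ e^{rZ(t)} \right]}{e^{rz}}
  \le 2D\, e^{-rz}.
% Taking Z(t) = V/\omega_k - Q(t) and z = \Theta(V) bounds the probability of
% large downward excursions, giving \Pr[\text{red intervals}] = O(e^{-cV}).
```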
Choosing V = log(1/ε) gives Pr[red intervals] = O(e^{−cV}) = O(ε).

[Figure: Q(t) axis with thresholds V/ω_{k+1} < V/ω_k < V/ω_{k−1}, positive drift on one side and negative drift on the other of the threshold matched to λ; the rare excursions away from that threshold are the "red" intervals.]
Analytical Result

[Figure: h(μ) curve with operating point near (λ, p*)]

• The queue is stable, so E[μ̄] = λ + O(ε).
• Time-sharing appropriately then gives:
  • E[Q(t)] ≤ O(log(1/ε))
  • Tε ≤ O(log(1/ε)/ε)
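To check the claimed scaling empirically, one can sweep ε with V = log(1/ε) in the dpp() sketch above (same assumed toy distribution; this is an illustrative experiment, not a result from the talk):

```python
import math

# Sweep epsilon with V = log(1/eps), reusing dpp() from the sketch above.
for eps in (0.1, 0.03, 0.01):
    V = math.log(1 / eps)
    avg_p, avg_q = dpp(V=V)
    print(f"eps={eps:<5} V={V:4.2f}  avg power={avg_p:.3f} (p*=0.375)  "
          f"avg backlog={avg_q:.2f}")
```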
Simulation

[Figure: E[p] versus queue size]
[Figure: E[p] versus time]
[Figure: non-ergodic simulation (adaptive to changes)]
Conclusions

• Fundamental lower bound on convergence time
  o Unknown probabilities
  o "Cramér-Rao"-like bound for controlled queues
• Tighter drift analysis for the DPP algorithm:
  o ε-approximation to optimal power
  o Queue size O(log(1/ε)) [optimal]
  o Convergence time O(log(1/ε)/ε) [near optimal]