On the convergence of SDDP and related algorithms
Speaker: Ziming Guan
Supervisor: A. B. Philpott
Sponsor: Fonterra New Zealand
Motivation
• Pereira and Pinto, Multi-Stage Stochastic Optimization Applied to
Energy Planning, Mathematical Programming, 52, pp. 359-375, 1991.
Summary
• Description of problem class
• SDDP and a related algorithm (DOASA)
• Theoretical convergence
• Implementation issues
Properties for random quantities
• Random quantities appear only on the right-hand side
of the linear constraints in each stage.
• The set of random outcomes is discrete and finite.
• Random quantities in different stages are independent.
• Can accommodate PARMA process for RHS
uncertainty.
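The properties above imply that a scenario is one outcome per stage and that its probability is the product of the stage-wise outcome probabilities. A minimal sketch, assuming a hypothetical two-stage data set (the `outcomes` values and the function name are illustrative, not from the talk):

```python
from itertools import product

# Hypothetical data: outcomes[t] lists (rhs_value, probability) pairs for one stage.
outcomes = [
    [(10.0, 0.5), (20.0, 0.5)],
    [(5.0, 0.2), (15.0, 0.3), (25.0, 0.5)],
]

def enumerate_scenarios(outcomes):
    """Return a list of (path, probability), one entry per scenario in the tree."""
    scenarios = []
    for combo in product(*outcomes):
        path = tuple(value for value, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p  # stage-wise independence: probabilities multiply
        scenarios.append((path, prob))
    return scenarios

scenarios = enumerate_scenarios(outcomes)
```

The number of scenarios grows as the product of the stage-wise outcome counts, which is why sampling-based methods avoid enumerating the whole tree.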
Scenario tree, scenario outcome, scenario
[Figure: scenario tree with nodes labelled w1(1), w2(1), w3(1), w1(2), w2(2), w3(2) and branches labelled p11, p12, p13, p21.]
Hydro-thermal scheduling
Stage problem
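The formulation shown on this slide did not survive extraction. For orientation only, a stage problem in the problem class described earlier (randomness on the right-hand side only, future cost represented by cuts) is commonly written along the following lines. All symbols here (c_t, A_t, B_t, E_t, u_t, the cut coefficients α and β) are assumed notation, not taken from the talk; Θ_{t+1} and x_{t+1} follow the labels Θ(t+1) and x(t+1) used on the next slide.

```latex
\begin{aligned}
Q_t(x_t, \omega_t) \;=\; \min_{x_{t+1},\,u_t,\,\Theta_{t+1}} \quad
  & c_t^{\top} u_t + \Theta_{t+1} \\
\text{s.t.} \quad
  & A_t x_{t+1} + B_t u_t = \omega_t + E_t x_t, \\
  & \Theta_{t+1} \ge \alpha_{t+1}^{k} + (\beta_{t+1}^{k})^{\top} x_{t+1},
    \qquad k = 1, \dots, K, \\
  & x_{t+1} \ge 0, \quad u_t \ge 0 .
\end{aligned}
```

In a hydro-thermal setting one would read x_t as reservoir storage, ω_t as inflow, and u_t as the generation and release decisions.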
Cuts
[Figure: cuts plotted as Θ(t+1) against reservoir storage, x(t+1).]
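A collection of cuts defines a piecewise-linear lower approximation of the future cost: its value at a storage level is the largest value among the cuts. A minimal sketch for a single reservoir, assuming each cut is stored as a hypothetical (intercept, slope) pair:

```python
def evaluate_cuts(cuts, x):
    """Outer approximation of the future cost at storage x: the maximum over all cuts."""
    return max(alpha + beta * x for alpha, beta in cuts)

# Hypothetical cuts (intercept, slope); more stored water means lower future cost.
cuts = [(100.0, -2.0), (60.0, -0.5), (40.0, 0.0)]

theta_empty = evaluate_cuts(cuts, 0.0)   # steepest cut is active
theta_mid = evaluate_cuts(cuts, 20.0)    # steepest cut is still active
theta_full = evaluate_cuts(cuts, 40.0)   # flatter cuts take over
```

Adding a cut can only raise this approximation or leave it unchanged, which is why the stage 1 objective value is a non-decreasing lower bound over iterations.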
Stochastic Dual Dynamic Programming
• [Pereira and Pinto, 1991]
• Initialization: Sample some scenarios and fix them through the
course of the algorithm.
• Forward pass: For stage t=1,…,T, solve the stage t problem
for each scenario.
• Calculate the lower bound and upper bound.
• If not converged:
– Backward pass: For stage t=T-1,…,1, for the stage t
problem in each scenario, solve all stage t+1 problems to
calculate a cut for stage t problems.
– Back to Forward pass.
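The bound comparison in the loop above can be sketched as follows. This assumes the commonly described stopping rule in which the lower bound (the stage 1 objective value) is compared with a confidence interval around the sample mean of the forward-pass costs; the function name, the 1.96 multiplier, and the sample data are illustrative assumptions.

```python
import math
import statistics

def sddp_converged(lower_bound, forward_costs, z=1.96):
    """Check whether lower_bound reaches the confidence interval of the upper-bound estimate.

    forward_costs holds one total cost per sampled scenario from the forward pass.
    Returns (converged, (ci_low, mean, ci_high)).
    """
    n = len(forward_costs)
    mean = statistics.mean(forward_costs)
    std_err = statistics.stdev(forward_costs) / math.sqrt(n)
    ci_low, ci_high = mean - z * std_err, mean + z * std_err
    return lower_bound >= ci_low, (ci_low, mean, ci_high)

# Hypothetical forward-pass costs for five sampled scenarios.
costs = [100.0, 110.0, 90.0, 105.0, 95.0]
done_early, _ = sddp_converged(80.0, costs)          # lower bound still far below
done_late, interval = sddp_converged(95.0, costs)    # lower bound inside the interval
```

Because the upper bound is only a statistical estimate over the fixed scenarios, passing this test says nothing about scenarios that were never sampled.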
[Figure: scenario tree diagrams illustrating SDDP; node labels w1(1), w2(1), w3(1), w1(2), w2(2), w3(2), w1(3), w3(3); branch labels p11, p12, p13.]
Dynamic Outer Approximation Sampling Algorithm
No upper bound is calculated until the algorithm terminates.
[Figure: scenario tree diagrams illustrating DOASA; node labels w1(1), w2(1), w3(1), w1(2), w2(2), w3(2), w1(3), w3(3); branch labels p11, p12, p13.]
• We have a convergence proof for DOASA.
• This can be used to understand the convergence
behaviour of SDDP.
Sampling properties of DOASA
• Forward Pass Sampling Property (FPSP):
Each scenario is traversed infinitely many times with
probability 1 in the forward pass.
How do we guarantee this?
Either
• Independently sample a single outcome in each stage
with a positive probability for each scenario outcome in
the forward pass.
• Repeat an exhaustive enumeration of each scenario in
the forward pass.
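Both options can be sketched in a few lines. This is an illustrative sketch only: scenarios are represented as tuples of outcome indices, and the names `sample_forward_path`, `exhaustive_paths`, `stage_sizes`, and `weights` are assumptions, not part of DOASA as presented.

```python
import itertools
import random

def sample_forward_path(weights, rng):
    """Option 1: independently sample one outcome index per stage.

    weights[t][i] > 0 is the sampling weight of outcome i in stage t; the weights
    need not equal the true probabilities as long as every one is positive.
    """
    return tuple(
        rng.choices(range(len(w)), weights=w, k=1)[0] for w in weights
    )

def exhaustive_paths(stage_sizes):
    """Option 2: cycle through every scenario, repeating the enumeration forever."""
    index_sets = [range(n) for n in stage_sizes]
    return itertools.cycle(itertools.product(*index_sets))

# Hypothetical tree: 2 outcomes in the first stage, 3 in the second.
weights = [[0.9, 0.1], [0.2, 0.3, 0.5]]
rng = random.Random(0)
sampled = {sample_forward_path(weights, rng) for _ in range(2000)}
first_seven = list(itertools.islice(exhaustive_paths([2, 3]), 7))
```

With positive weights every scenario keeps a positive chance of being drawn in each iteration, so it is visited infinitely often with probability 1; the enumeration achieves the same deterministically.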
Convergence Theorem
• Under FPSP, DOASA converges with probability 1 to an
optimal solution to the stage 1 problem in a finite number
of iterations.
Sampling in cut calculation
• Sample some stage problems.
• Keep a list of dual solutions and search it for the best one for
each stage problem that is not sampled.
• Backward Pass Sampling Property (BPSP):
In any stage, each scenario outcome is visited infinitely
many times with probability 1 in the backward pass.
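The dual-list search can be sketched as below. Because randomness enters only through the right-hand side, a dual solution stored from one stage problem remains dual feasible for the others, so it yields a valid (possibly loose) lower bound on an unsampled problem without solving it. The representation is an assumption: each stored dual is a hypothetical `(pi, const)` pair, where `const` collects any terms that do not depend on the right-hand side (for example, contributions from existing cuts).

```python
def best_stored_dual(duals, rhs):
    """Pick the stored dual giving the largest lower bound for right-hand side rhs.

    duals: list of (pi, const) pairs, pi being a tuple of dual multipliers.
    rhs:   right-hand side of the unsampled stage problem at the current state.
    Returns (best_dual, bound).
    """
    def bound(dual):
        pi, const = dual
        return const + sum(p * r for p, r in zip(pi, rhs))

    best = max(duals, key=bound)
    return best, bound(best)

# Hypothetical stored duals and two unsampled right-hand sides.
duals = [((1.0, 0.0), 0.0), ((0.0, 2.0), 1.0)]
dual_a, bound_a = best_stored_dual(duals, (3.0, 0.5))
dual_b, bound_b = best_stored_dual(duals, (1.0, 2.0))
```

The bound obtained this way is never better than the one from solving the stage problem exactly, which is one way to see why each outcome still has to be solved infinitely often (BPSP) for the convergence result.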
Convergence Theorem
• Under FPSP and BPSP, the algorithm converges with
probability 1 to an optimal solution to the stage 1
problem in a finite number of iterations.
Corollaries
• If every outcome is used in cut calculation we only need
FPSP.
• We can bias sampling as long as FPSP is satisfied.
(Note: estimating the upper bound requires unbiased scenario
sampling.)
Resampling
• SDDP does not resample the forward pass. It creates N
scenarios of inflows at the start.
• FPSP is NOT satisfied.
• SDDP will terminate with probability 1.
• Cuts give a lower bound, but policy need not be optimal.
Always Dry, when at convergence...
[Figure: scenario tree with Dry and Wet outcomes.]
Negative inflows
• SDDP uses a PARMA model for inflows.
• Negative inflows might result – not physically possible.
• Some implementations adjust random outcomes to make
inflow non-negative – this destroys stage-wise
independence.
• Cut sharing is no longer valid.
• Log-normal inflows are not valid for convexity reasons.
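The loss of stage-wise independence can be seen in a small example. The sketch below replaces the PARMA model with a much simpler AR(1) inflow model, inflow = mu + phi * (previous inflow - mu) + noise, purely for illustration; the parameter values and the function name are assumptions.

```python
def effective_noise(prev_inflow, noise_outcomes, mu=10.0, phi=0.8):
    """Noise values actually realised after truncating negative inflows at zero."""
    base = mu + phi * (prev_inflow - mu)  # inflow before the noise term is added
    return [max(base + e, 0.0) - base for e in noise_outcomes]

noise = [-8.0, 0.0, 8.0]
after_normal_week = effective_noise(10.0, noise)  # no truncation is needed
after_dry_week = effective_noise(0.0, noise)      # the lowest outcome is truncated
```

After a normal week the noise outcomes are unchanged, but after a dry week the lowest outcome is cut from -8 to -2. The set of noise outcomes now depends on the previous inflow, so a cut computed at one state is no longer guaranteed valid at another.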
Convexity matters in backward pass
• Transmission losses can make the stage problem non-convex
if free disposal is not allowed.
• Unit commitment introduces integer effects, which are not convex.
Convergence expectation
• We ran DOASA on a problem at Fonterra NZ.
• Maximum size for convergence
= 12 stages x 24 states.
• In a revenue management application, a problem with
8 states and 5000 stages converges;
one with 20 states and 5000 stages does not.
• Convergence is problem dependent.
Case study: NZ model
[Figure: NZ model network diagram; labels: N, S, MAN, HAW, TPO, demand.]
Computational results: NZ model
• 9 reservoirs
• 52 weekly stages
• 30 inflow outcomes per stage
• Model written in AMPL/CPLEX
• Takes 100 iterations and 2 hours on a standard Windows
PC to converge
2005-2006 policy simulated with historical inflow sequences
[Figure: one curve per historical inflow year, 1995 to 2004; vertical axis from 0 to 4500, horizontal axis from 0 to 50.]
END