Lecture 5
Monte-Carlo method for Two-Stage SLP
Leonidas Sakalauskas
Institute of Mathematics and Informatics
Vilnius, Lithuania
EURO Working Group on Continuous Optimization
Content
- Introduction
- Monte-Carlo estimators
- Stochastic differentiation
- ε-feasible gradient approach for two-stage SLP
- Interior-point method for two-stage SLP
- Testing optimality
- Convergence analysis
- Counterexample

Two-stage stochastic optimization problem

F(x) = c·x + E min_y [ q·y | W·y + T·x = h, y ∈ R^m_+ ] → min,

subject to

A·x = b,  x ∈ R^n_+,

where the vectors q, h and the matrices W, T are assumed to be random in general and, consequently, to depend on an elementary event ω.
Two-stage stochastic optimization problem with complete recourse

The problem with complete recourse is

F(x) = c·x + E Q(x, ω) → min over x ∈ D,

subject to the feasible set

D = { x ∈ R^n_+ | A·x = b },

where

Q(x, ω) = min_y [ q·y | W·y + T·x = h, y ∈ R^m_+ ].
It can be derived that, under the assumption of the existence of a solution to the second-stage problem and the continuity of the measure P, the objective function is smoothly differentiable and its gradient is expressed as

∇_x F(x) = E g(x, ω),

where

g(x, ω) = c − T^T·u*

is given by the set of solutions of the dual problem

(h − T·x)^T·u* = max_u [ (h − T·x)^T·u | W^T·u ≤ q, u ∈ R^s ].
Monte-Carlo samples

We assume here that Monte-Carlo samples of a certain size N are provided for any x ∈ R^n:

Y = (y^1, y^2, ..., y^N),

and the sampling estimator of the objective function can be computed:

F̃(x) = (1/N) · Σ_{j=1..N} f(x, y^j).
The sampling variance can also be computed, which is useful for evaluating the accuracy of the estimator:

D̃²(x) = (1/(N−1)) · Σ_{j=1..N} ( f(x, y^j) − F̃(x) )².
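A minimal sketch of these two estimators in Python/NumPy. The function name mc_objective_estimate and the callable f(x, y) (which, for the two-stage SLP, has to solve the second-stage problem for scenario y) are illustrative assumptions, not part of the lecture:

import numpy as np

def mc_objective_estimate(f, x, scenarios):
    """Monte-Carlo estimate of F(x) and its sampling variance.

    f         -- callable f(x, y) giving the objective value for one scenario y
    x         -- first-stage decision vector
    scenarios -- the sample y^1, ..., y^N
    """
    values = np.array([f(x, y) for y in scenarios])
    F_hat = values.mean()        # (1/N) * sum_j f(x, y^j)
    D2 = values.var(ddof=1)      # (1/(N-1)) * sum_j (f(x, y^j) - F_hat)^2
    return F_hat, D2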
Gradient

The gradient is evaluated using the same random sample:

g̃(x) = (1/N) · Σ_{j=1..N} g(x, y^j),  x ∈ D ⊂ R^n.
Covariance matrix

We use the sampling covariance matrix

A(x) = (1/(N−n)) · Σ_{j=1..N} ( g(x, y^j) − g̃(x) ) · ( g(x, y^j) − g̃(x) )^T

later on for normalising the gradient estimator.
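A matching sketch for the gradient estimate and the covariance matrix A(x), again assuming a user-supplied per-scenario gradient callable g(x, y) (illustrative names):

import numpy as np

def mc_gradient_estimate(g, x, scenarios):
    """Sample-average gradient and sampling covariance matrix A(x).

    g -- callable g(x, y) returning one stochastic gradient vector per scenario
    """
    G = np.array([g(x, y) for y in scenarios])   # N x n matrix of per-scenario gradients
    N, n = G.shape
    g_hat = G.mean(axis=0)                       # (1/N) * sum_j g(x, y^j)
    centered = G - g_hat
    A = centered.T @ centered / (N - n)          # note the 1/(N - n) normalisation
    return g_hat, A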
Approaches to the stochastic gradient

We examine several estimators for the stochastic gradient:
- Analytical approach (AA);
- Finite difference approach (FD);
- Simultaneous perturbation stochastic approximation (SPSA);
- Likelihood ratio approach (LR).
Analytical approach (AA)

The gradient is expressed as

∇_x F(x) = E g(x, ω),

where

g(x, ω) = c − T^T·u*

is given by the set of solutions of the dual problem

(h − T·x)^T·u* = max_u [ (h − T·x)^T·u | W^T·u ≤ q, u ∈ R^s ].
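A hedged sketch of the analytical approach for a single scenario, using SciPy's LP solver. It solves the dual problem above directly (avoiding any solver-specific sign conventions for marginals) and returns c − T^T·u*. The function name and the (q, W, T, h) scenario tuple layout are assumptions:

import numpy as np
from scipy.optimize import linprog

def stochastic_gradient_aa(x, c, scenario):
    """Analytical-approach stochastic gradient g(x, omega) = c - T^T u* for one scenario.

    scenario -- tuple (q, W, T, h) of realised second-stage data
    """
    q, W, T, h = scenario
    rhs = h - T @ x
    # Dual of the second-stage LP: max rhs^T u  s.t.  W^T u <= q, u free.
    # linprog minimises, so the cost vector is -rhs.
    res = linprog(c=-rhs, A_ub=W.T, b_ub=q, bounds=(None, None), method="highs")
    if not res.success:
        raise RuntimeError("second-stage dual LP failed: " + res.message)
    u_star = res.x
    return c - T.T @ u_star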
Gradient search procedure

Let some initial point x^0 ∈ D ⊂ R^n be given. A random sample of a certain initial size N_0 is generated at this point, and the Monte-Carlo estimates are computed. The iterative stochastic procedure of gradient search can then be used:

x^{t+1} = x^t − ρ·g̃(x^t).
ε-feasible direction approach

Let us define the set of feasible directions as follows:

V(x) = { g ∈ R^n | A·g = 0, g_j ≤ 0 if x_j = 0, 1 ≤ j ≤ n }.
Gradient projection

Denote by g_U the projection of the vector g onto the set U. Since the objective function is differentiable, the solution x ∈ D is optimal if

( ∇F(x) )_{V(x)} = 0.
Assume a certain multiplier ρ̂ > 0 to be given. Define the function ρ_x : V(x) → R_+ by

ρ_x(g) = min [ ρ̂, min_{1≤j≤n, g_j>0} ( x_j / g_j ) ].

Thus x − ρ·g ∈ D for any g ∈ V(x), x ∈ D, when 0 ≤ ρ ≤ ρ_x(g).
Now, let a certain small value ε > 0 be given. Then we introduce the function ρ^ε_x : V(x) → R_+,

ρ^ε_x(g) = max_{1≤j≤n, g_j>0} min( x_j, ε·g_j ), if g_j > 0 for some j,
ρ^ε_x(g) = 0, if g_j ≤ 0 for all 1 ≤ j ≤ n,

and define the ε-feasible set

V_ε(x) = { g ∈ R^n | A·g = 0, g_j ≤ 0 if 0 ≤ x_j ≤ ρ^ε_x(g) }.
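In general, projecting the gradient estimator onto V_ε(x) amounts to a small quadratic program. As a simplified stand-in (an assumption for illustration, not the lecture's exact routine), the sketch below projects onto the linear subspace obtained by forcing g_j = 0 on the ε-active coordinates, which preserves A·g = 0:

import numpy as np

def project_eps_feasible(g_hat, A, active):
    """Project the gradient estimate onto the subspace {g : A g = 0, g_j = 0 for j in active}.

    'active' stands in for the eps-active index set (coordinates with 0 <= x_j <= rho_eps_x(g)).
    """
    n = g_hat.size
    rows = [A] + [np.eye(n)[j:j + 1] for j in active]
    M = np.vstack(rows)
    # Orthogonal projector onto the null space of M: P = I - M^T (M M^T)^+ M
    P = np.eye(n) - M.T @ np.linalg.pinv(M @ M.T) @ M
    return P @ g_hat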
The starting point can be obtained as the solution of the deterministic linear problem

(x^0, y^0) = arg min_{x,y} [ c·x + q·y | A·x = b, W·y + T·x = h, y ∈ R^m_+, x ∈ R^n_+ ].

The iterative stochastic procedure of gradient search can then be used:

x^{t+1} = x^t − ρ^t·G(x^t),

where ρ^t = ρ^ε_{x^t}(G^t) is the step-length multiplier and G^t = G(x^t) is the projection of the gradient estimator onto the ε-feasible set V_ε(x^t).
Monte-Carlo sample size problem

There is no great need to compute the estimators with high accuracy at the start of the optimisation, because at that stage it suffices to evaluate the direction leading to the optimum only approximately. Therefore, one can use rather small samples at the beginning of the search and increase the sample size later on, so that the objective function is estimated with the desired accuracy exactly at the time of deciding that a solution to the optimisation problem has been found. We propose the following rule for regulating the sample size in practice:
N t 1
t





n

Fish
(

,
n
,
N

n
)

, N max 
 min max  

n
,
N
~
~

min
t T
t 1
t





(
G
(
x
)

(
A
(
x
))

(
G
(
x
)






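A small sketch of this rule using SciPy's F-distribution quantile. The function and parameter names and default values are illustrative, and the guard against a near-zero denominator is an added safeguard:

import numpy as np
from scipy.stats import f as fisher_f

def next_sample_size(G, A_cov, Nt, n, gamma=0.95, N_min=100, N_max=10000):
    """Sample-size rule: the next N is inversely proportional to the normalised
    squared gradient norm G^T A(x)^{-1} G (default values are illustrative)."""
    quantile = fisher_f.ppf(gamma, n, Nt - n)      # gamma-quantile of the Fisher distribution
    norm2 = float(G @ np.linalg.solve(A_cov, G))   # G^T A(x)^{-1} G
    N_next = n * quantile / max(norm2, 1e-12)      # guard against division by ~zero near the optimum
    return int(min(max(N_min, np.ceil(N_next)), N_max))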
Statistical testing of the optimality hypothesis

The optimality hypothesis can be accepted for some point x^t with significance 1 − μ, if the following condition is satisfied:

(T^t)² = ( (N^t − n) / n ) · G(x^t)^T·A(x^t)^{−1}·G(x^t) ≤ Fish(μ, n, N^t − n).
Next, we can use the asymptotic normality again and decide that the objective function is estimated with a permissible accuracy ε, if its confidence bound does not exceed this value:

η_β·√( D̃²(x^t) / N^t ) ≤ ε,

where η_β is the β-quantile of the standard normal distribution.
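A sketch combining the two stopping criteria, assuming Fish(·) is the F-distribution quantile and η_β the standard normal quantile; the function name and default parameter values are illustrative:

import numpy as np
from scipy.stats import f as fisher_f, norm

def optimality_reached(G, A_cov, D2, Nt, n, mu=0.95, beta=0.95, eps=0.2):
    """Statistical stopping test (default parameter values are illustrative).

    Passes when the Hotelling-type statistic does not reject the optimality
    hypothesis AND the confidence bound of the objective estimate is within eps."""
    T2 = (Nt - n) / n * float(G @ np.linalg.solve(A_cov, G))   # Hotelling-type statistic
    gradient_ok = T2 <= fisher_f.ppf(mu, n, Nt - n)
    accuracy_ok = norm.ppf(beta) * np.sqrt(D2 / Nt) <= eps     # confidence bound on the objective
    return gradient_ok and accuracy_ok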
Computer simulation

Two-stage stochastic linear optimisation problem. Dimensions of the task are as follows:
- the first stage has 10 rows and 20 variables;
- the second stage has 20 rows and 30 variables.

Test data: http://www.math.bme.hu/~deak/twostage/l1/20x20.1/ (2006-01-20).
Two-stage stochastic programming

- The estimate of the optimal value of the objective function given in the database is 182.94234 ± 0.066.
- N_0 = N_min = 100, N_max = 10000.
- The maximal number of iterations was t_max = 100; the generation of trials was stopped once the estimated confidence interval of the objective function no longer exceeded the admissible value ε.
Initial data were as follows:
- β = γ = 0.95, μ = 0.99;
- ε = 0.1; 0.2; 0.5; 1.0.
Frequency of stopping under admissible interval

[Figure: frequency of stopping versus iteration number, plotted for admissible intervals ε = 0.1, 0.2, 0.5 and 1.0.]
Change of the objective function under admissible interval

[Figure: objective function estimate versus iteration number for ε = 0.1, 0.2, 0.5 and 1.0.]
Change of confidence interval under admissible interval

[Figure: confidence interval of the objective function estimate versus iteration number for ε = 0.1, 0.2, 0.5 and 1.0.]
Change of the Monte-Carlo sample size under admissible interval

[Figure: Monte-Carlo sample size versus iteration number for ε = 0.1, 0.2, 0.5 and 1.0.]
Change of the Hotelling statistics under admissible interval

[Figure: Hotelling statistic versus iteration number for ε = 0.1, 0.2, 0.5 and 1.0.]
Histogram of the ratio Σ_{j=1..t} N^j / N^t under admissible interval

[Figure: histogram of the ratio, shown for ε = 0.1, 0.2, 0.5 and 1.0; the ratio ranges from about 8 to 36.]
Wrap-Up and Conclusions

- The stochastic adaptive method has been developed to solve stochastic linear problems by a finite sequence of Monte-Carlo sampling estimators.
- The method is grounded on adaptive regulation of the size of the Monte-Carlo samples and a statistical termination procedure, taking into consideration the statistical modelling accuracy.
- The proposed adjustment of the sample size, taken inversely proportional to the square of the norm of the Monte-Carlo estimate of the gradient, guarantees convergence a.s. at a linear rate.