Optimal Stopping Time

Pejman Mahboubi
August 21, 2012
- Consider the dice game described on page 87 of your textbook. It has two rules:
  1. Roll a die.
  2. If the die shows a 6, the game is over and you gain 0. If the die shows $x < 6$, then either take x dollars and end the game, or go back to step 1.
- Q: When should we stop to maximize the expected gain?
- The "stopping strategy", or the "optimal stopping time", is a random variable.
- To see that the optimal stopping time is a random variable, we write the probabilistic interpretation of the game.
- Consider the Markov chain $\{X_n\}_{n=1}^{\infty}$ with state space $S = \{1, \dots, 6\}$, with $p(x, y) = 1/6$ when $x \neq 6$, and $p(6, 6) = 1$.
- That is, 6 is an absorbing state and all other states are transient.
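The transition probabilities above can be written out explicitly. The following is a minimal sketch in Python using exact fractions; the names `STATES`, `transition_prob`, and `P` are illustrative choices, not notation from the text.

```python
from fractions import Fraction

STATES = range(1, 7)  # the six faces of the die


def transition_prob(x, y):
    """p(x, y) for the dice-game chain: 6 is absorbing, every other state
    moves to each face with probability 1/6."""
    if x == 6:
        return Fraction(1) if y == 6 else Fraction(0)
    return Fraction(1, 6)


# Row x-1 of P holds p(x, 1), ..., p(x, 6).
P = [[transition_prob(x, y) for y in STATES] for x in STATES]

for row in P:
    print([str(p) for p in row])
```

Each row sums to 1, and the last row puts all of its mass on state 6, which is what "absorbing" means here.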
- The sample space is the set of all sequences over $\{1, \dots, 6\}$ that are eventually 6 (because 6 is absorbing). Let's write down a few of these paths:
  $$\omega_1 = 1, 1, 3, 2, 6, 6, 6, \dots$$
  $$\omega_2 = 4, 5, 1, 2, 2, 6, 6, 6, 6, \dots$$
  $$\omega_3 = 5, 4, 3, 6, 6, 6, \dots$$
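- Assume the strategy $T$ is defined by stopping the first time we see a 5 or a bigger number.
- Then $X_T$ makes sense. For example,
  $$T(\omega_1) = 5, \quad T(\omega_2) = 2, \quad T(\omega_3) = 1$$
  (on $\omega_1$ no 5 appears, so $T$ is the time of the first 6), and
  $$X_1(\omega_2) = 4, \quad X_{T(\omega_2)}(\omega_2) = X_2(\omega_2) = 5.$$
- Therefore, for $X_T$ to make sense, we only need $T$ to be an integer-valued random variable,
  $$T : \Omega \to \{1, 2, \dots\}.$$
- BUT we need more if we want $T$ to be a stopping time.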
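To make the evaluation of $T$ and $X_T$ on a path concrete, here is a small sketch in Python. The function name `first_time_at_least` and the finite lists standing in for the paths are assumptions made for illustration; only the first few entries of each path are needed.

```python
def first_time_at_least(path, level):
    """Return the first (1-based) time n with path[n-1] >= level,
    or None if that does not happen within the listed entries."""
    for n, x in enumerate(path, start=1):
        if x >= level:
            return n
    return None


# Finite prefixes of the sample paths omega_2 and omega_3 from the slide.
omega2 = [4, 5, 1, 2, 2, 6]
omega3 = [5, 4, 3, 6]

for name, path in [("omega2", omega2), ("omega3", omega3)]:
    T = first_time_at_least(path, 5)
    # X_T is the value of the path at the (random) time T.
    print(name, "T =", T, "X_T =", path[T - 1])
```

The value of `T` changes from path to path, which is exactly the sense in which the strategy is a random variable.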
- For example, in our previous example, the value of $T$ is determined by looking along the path and finding the first time a 5 or a bigger number appears:
  $$\{T = 3\} = \{\text{paths } \omega : X_1(\omega) < 5,\ X_2(\omega) < 5,\ X_3(\omega) \ge 5\}$$
- To check whether $T(\omega) = n$ or not, we only need to check the path $\omega$ up to time $n$.
- To see what this means, consider this strategy: define $T'$ to be the time right before the first 6 appears.
  $$T'(\omega_1) = 4, \quad T'(\omega_2) = 5, \quad T'(\omega_3) = 3$$
- But we know that in a fair game we cannot look into the future. This means that $\{T' = n\}$ should be determined by $\{X_1, \dots, X_n\}$. It is not: deciding whether $T' = n$ requires knowing $X_{n+1}$.
- Therefore a stopping time $T$ (in discrete time) is a random variable with values in $\{0, 1, 2, \dots\}$ such that the set of paths $\{T = n\}$ is measurable with respect to $\{X_1, \dots, X_n\}$, where the word "measurable" is in the sense described above.
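The "no looking into the future" condition can be checked by brute force on short paths. The sketch below is an illustration under stated assumptions, not part of the original slides: it enumerates every valid path of a fixed length (the hypothetical parameter `horizon`), groups paths by their first n entries, and asks whether the event {rule = n} is the same for all paths sharing a prefix. The helper names are invented for this example.

```python
from itertools import product


def valid_paths(horizon):
    """All length-`horizon` paths of the chain: once a 6 appears, only 6s follow."""
    for path in product(range(1, 7), repeat=horizon):
        if all(b == 6 for a, b in zip(path, path[1:]) if a == 6):
            yield path


def first_at_least_5(path):
    """First time the path shows 5 or more (None if not within the horizon)."""
    for n, x in enumerate(path, start=1):
        if x >= 5:
            return n
    return None


def time_before_first_6(path):
    """The time right before the first 6 (None if no 6 within the horizon)."""
    for n, x in enumerate(path, start=1):
        if x == 6:
            return n - 1
    return None


def decided_by_prefix(rule, horizon, n):
    """True if, for every path, whether rule(path) == n depends only on path[:n]."""
    seen = {}
    for path in valid_paths(horizon):
        event = rule(path) == n
        if seen.setdefault(path[:n], event) != event:
            return False
    return True


for n in (1, 2, 3):
    print(n,
          decided_by_prefix(first_at_least_5, 4, n),
          decided_by_prefix(time_before_first_6, 4, n))
```

For each n tested, the first rule passes and the second fails: two paths can agree up to time n and still disagree on whether the next roll is a 6.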
- In our game we assigned a value to each state.
- This means we have a function $f$ on $S = \{1, \dots, 6\}$ given by
  $$f(k) = k \text{ if } k \le 5, \qquad f(6) = 0.$$
- We want to define the strategy $T$ so that it maximizes $E f(X_T)$.
- Let $T$ be the hitting time of a set, defined as follows:
  $$T = \inf\{n : X_n \ge k\}$$
- That is, $T$ (at a path $\omega$) is the first time that the path hits $k$ or a bigger number (hits the set $\{k, k+1, \dots, 6\}$).
- Is $T$ a stopping time?
- Finding the optimal strategy (OS) means finding the best $k$, i.e., the $k$ that maximizes $E f(X_T)$.
- If our first roll is this specific $k$, then we should stop.
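The expected gain of each threshold strategy can be computed exactly. The sketch below assumes a first-step argument that is not spelled out in the text: on each roll the player stops with gain f(j) for each face j ≥ k (probability 1/6 each), and otherwise, with probability (k − 1)/6, is back in the same situation, so the expected gain g satisfies g = Σ_{j≥k} f(j)/6 + ((k − 1)/6)·g. The function names are illustrative.

```python
from fractions import Fraction


def f(k):
    """Payoff of state k: k dollars for 1..5, nothing for 6."""
    return k if k <= 5 else 0


def expected_gain(k):
    """E f(X_T) for T = inf{n : X_n >= k}, from the first-step equation
    g = stop_now + roll_again * g."""
    stop_now = sum(Fraction(f(j), 6) for j in range(k, 7))
    roll_again = Fraction(k - 1, 6)  # probability of a face below k
    return stop_now / (1 - roll_again)


gains = {k: expected_gain(k) for k in range(1, 7)}
for k, g in gains.items():
    print(k, g)
```

Under these assumptions the gains for k = 1, ..., 6 are 5/2, 14/5, 3, 3, 5/2, 0, so the thresholds k = 3 and k = 4 tie for the largest expected gain.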
- Let $v(k)$ be the expected gain when we follow the optimal strategy, given that the first roll is a $k$ [we don't know yet what this strategy is].
- Let $u(k)$ be the expected gain if the player does not stop after rolling a $k$, but from then on plays according to the optimal strategy.
- By comparing $u(k)$ and $v(k)$ we can find at which $k$ we should stop.
- Assume the first roll is a $k$. If, according to the OS, we should stop, then the maximum gain equals what we get at $k$, i.e., $v(k) = f(k)$.
- This means that if $v(k) > f(k)$, then we should continue. In either case,
  $$v(k) \ge f(k) \quad \forall k = 1, \dots, 6.$$
- If we continue after the first roll, then we will go to each state $i$ with probability 1/6. Since our play after that follows the OS, we will gain $v(i)$ with probability 1/6 for each $i$.
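The two quantities can also be approximated numerically. The sketch below uses value iteration, a standard technique swapped in here for illustration rather than the argument the slides follow. It assumes the relation v(k) = max(f(k), u(k)) (stopping yields f(k), continuing yields u(k)) together with the 1/6 averaging described above; the function name `solve_values` and its tolerance parameters are hypothetical.

```python
def f(k):
    """Payoff of state k: k dollars for 1..5, nothing for 6."""
    return k if k <= 5 else 0


def solve_values(tol=1e-12, max_iter=10_000):
    """Iterate v <- max(f, u), with u the average of v over the six faces,
    until v stops changing. Returns (v, u) where u is the common
    continuation value u(k) for k = 1, ..., 5."""
    v = {k: float(f(k)) for k in range(1, 7)}
    for _ in range(max_iter):
        u = sum(v.values()) / 6  # each face has probability 1/6
        new_v = {k: max(float(f(k)), u) for k in range(1, 6)}
        new_v[6] = 0.0  # state 6 ends the game with gain 0
        if max(abs(new_v[k] - v[k]) for k in v) < tol:
            return new_v, u
        v = new_v
    raise RuntimeError("value iteration did not converge")


v, u = solve_values()
print("u =", round(u, 6))
print({k: round(val, 6) for k, val in v.items()})
```

Under these assumptions the continuation value settles at 3, so stopping at 4 or 5 beats continuing, while at 3 stopping and continuing give the same expected gain.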
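- Therefore, by the Markov property (MP) of the chain,
  $$u(k) = \sum_{i=1}^{6} p_{k,i}\, v(i) = [p_{k,1}, \dots, p_{k,6}] \begin{bmatrix} v(1) \\ \vdots \\ v(6) \end{bmatrix}$$
- Or more generally,
  $$\begin{bmatrix} u(1) \\ \vdots \\ u(6) \end{bmatrix} = P \begin{bmatrix} v(1) \\ \vdots \\ v(6) \end{bmatrix},$$
  where $P = [p_{ij}]_{6 \times 6}$ is the PTM (the transition matrix of the chain). Since $v(k) \ge f(k)$, then
  $$\begin{bmatrix} u(1) \\ \vdots \\ u(6) \end{bmatrix} \ge P \begin{bmatrix} f(1) \\ \vdots \\ f(6) \end{bmatrix} = \begin{bmatrix} 5/2 \\ \vdots \\ 5/2 \\ 0 \end{bmatrix}$$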
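The right-hand side of the last inequality is a plain matrix-vector product and can be verified directly. This is a minimal check in Python with exact fractions; the variable names `P`, `f`, and `Pf` are chosen to mirror the notation and are otherwise illustrative.

```python
from fractions import Fraction

# Rows 1-5: uniform over the six faces. Row 6: absorbing at 6.
P = [[Fraction(1, 6)] * 6 for _ in range(5)] + [[Fraction(0)] * 5 + [Fraction(1)]]

# Payoff vector (f(1), ..., f(6)).
f = [1, 2, 3, 4, 5, 0]

# (P f)(k) = sum_i p_{k,i} f(i)
Pf = [sum(p * fi for p, fi in zip(row, f)) for row in P]
print([str(x) for x in Pf])
```

The first five entries are (1 + 2 + 3 + 4 + 5 + 0)/6 = 5/2, and the last is f(6) = 0 because state 6 never leaves itself.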
- The last inequality tells us quite a bit about the OS.
- We know that if we hit 5, then we should stop.
- How about 4? Should we stop? YES, because otherwise,
v (4) = u(4) ≥