OPTIMAL U-TURN TIME
LARRY, ZHIRONG LI
Question: suppose that exactly one of 2N doors leads to the exit. You are standing at the origin, with N doors to your right and N doors to your left. The doors are equally spaced, and the exit is equally likely to be any of them. Let X be the accumulated walking distance until you find the right door, and let S be the distance from the right door to the origin. Find a strategy that minimizes $E[X/S]$.
Let D be the spacing between any two adjacent doors.
Scheme 1: A naive strategy is to pick one direction at random and walk in that direction to the end. If the right door is in the opposite direction, you make a U-turn and go past the origin to explore the other side.
Suppose the doors are aligned from west to east. A natural choice is to walk east with probability $\frac12$; more generally, let p be the probability of choosing to walk towards the east end.
The probability of choosing the right direction (let C indicate whether the chosen direction is right or wrong) is $\frac12$ regardless of p, because:
I) the right door is on the east side with probability $\frac12$, and you choose this direction with probability p;
II) the right door is on the west side with probability $\frac12$, and you choose this direction with probability 1 − p.
Hence the total probability of choosing the right direction is $\frac12 p + \frac12 (1 - p) = \frac12$.
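Therefore
\[
\begin{aligned}
E\left[\frac{X}{S}\right]
&= E\left[E\left[\frac{X}{S}\,\middle|\,C\right]\right] \\
&= P(C=\text{right})\,E\left[\frac{X}{S}\,\middle|\,C=\text{right}\right] + P(C=\text{wrong})\,E\left[\frac{X}{S}\,\middle|\,C=\text{wrong}\right] \\
&= \frac12 \times E\left[1\,\middle|\,C=\text{right}\right] + \frac12 \times E\left[\frac{ND+ND+S}{S}\,\middle|\,C=\text{wrong}\right] \\
&= \frac12 + \frac12 + \frac12 \times E\left[\frac{2ND}{S}\,\middle|\,C=\text{wrong}\right].
\end{aligned}
\]
However,
\[
E\left[\frac{ND}{S}\,\middle|\,C=\text{wrong}\right] = ND\sum_{i=1}^{N}\frac{1}{iD}\times P(S=iD) = \sum_{i=1}^{N}\frac{1}{i}.
\]
Therefore
\[
E\left[\frac{X}{S}\right] = 1 + \sum_{i=1}^{N}\frac{1}{i}.
\]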
So it does not really matter which direction you choose initially. The expected ratio is independent of p.
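The independence from p can be checked by exact enumeration. The following is a minimal Python sketch, not part of the original note; the names `scheme1_ratio` and `harmonic` are hypothetical, and unit spacing D = 1 is assumed (D cancels in the ratio anyway).

```python
from fractions import Fraction

def scheme1_ratio(n, p):
    """Exact E[X/S] for Scheme 1 with n doors per side and spacing D = 1.

    p is the probability of walking east first (a Fraction in [0, 1]).
    """
    total = Fraction(0)
    for i in range(1, n + 1):                 # door at distance i from the origin
        for door_is_east in (True, False):    # each side holds the door w.p. 1/2
            p_right = p if door_is_east else 1 - p
            ratio_right = Fraction(1)             # right direction: X = S
            ratio_wrong = Fraction(2 * n + i, i)  # wrong direction: X = N + N + S
            total += Fraction(1, 2 * n) * (
                p_right * ratio_right + (1 - p_right) * ratio_wrong
            )
    return total

def harmonic(n):
    """Exact harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, i) for i in range(1, n + 1))

if __name__ == "__main__":
    for p in (Fraction(0), Fraction(1, 3), Fraction(1, 2), Fraction(1)):
        print(p, scheme1_ratio(10, p) == 1 + harmonic(10))
```

Exact rational arithmetic is used so that the comparison with $1 + \sum_{i=1}^{N} 1/i$ is an equality rather than a floating-point approximation.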
Date: Dec. 09, 2011.
Scheme 2: The next question is whether it is better to make a U-turn partway through the walk in the initially chosen direction. Let J be the threshold, i.e., if you have passed J doors and the right door has not been found yet, you immediately make a U-turn. We assume this early U-turn is made at most once: after turning, you walk to the far end of the other side and, if the door is still not found, come back to finish the initial side.
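\[
\begin{aligned}
E\left[\frac{X}{S}\right]
&= E\left[E\left[\frac{X}{S}\,\middle|\,C\right]\right] \\
&= P(C=\text{right})\,E\left[\frac{X}{S}\,\middle|\,C=\text{right}\right] + P(C=\text{wrong})\,E\left[\frac{X}{S}\,\middle|\,C=\text{wrong}\right] \\
&= \frac12 E\left[\frac{X}{S}\,\middle|\,C=\text{right}\right] + \frac12 E\left[\frac{X}{S}\,\middle|\,C=\text{wrong}\right].
\end{aligned}
\]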
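Notice that
\[
\begin{aligned}
E\left[\frac{X}{S}\,\middle|\,C=\text{right}\right]
&= \sum_{i=1}^{J}\frac{iD}{iD}\times P(S=iD) + \sum_{i=J+1}^{N}\frac{iD+2(N+J)D}{iD}\times P(S=iD) \\
&= \frac{J}{N} + \frac{N-J}{N} + \frac{2(N+J)}{N}\sum_{i=J+1}^{N}\frac{1}{i}.
\end{aligned}
\]
Similarly,
\[
\begin{aligned}
E\left[\frac{X}{S}\,\middle|\,C=\text{wrong}\right]
&= \sum_{i=1}^{N}\frac{2JD+iD}{iD}\times P(S=iD) \\
&= 1 + \frac{2J}{N}\sum_{i=1}^{N}\frac{1}{i}.
\end{aligned}
\]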
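In conclusion,
\[
\begin{aligned}
E\left[\frac{X}{S}\right]
&= \frac12\left(1 + \frac{2(N+J)}{N}\sum_{i=J+1}^{N}\frac{1}{i}\right) + \frac12\left(1 + \frac{2J}{N}\sum_{i=1}^{N}\frac{1}{i}\right) \\
&= 1 + \frac{J}{N}\sum_{i=1}^{N}\frac{1}{i} + \frac{N+J}{N}\sum_{i=J+1}^{N}\frac{1}{i}.
\end{aligned}
\]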
We can do a sanity check at J = 0 and J = N. Intuitively, they should yield the same answer, since both reduce to Scheme 1.
Let J = N; then we get
\[
E\left[\frac{X}{S}\right] = 1 + \sum_{i=1}^{N}\frac{1}{i},
\]
which is not surprising at all.
When J = 0, we get
\[
E\left[\frac{X}{S}\right] = 1 + \sum_{i=1}^{N}\frac{1}{i}
\]
as expected.
Writing $E\left[\frac{X}{S}\right] = f(J) = 1 + \frac{J}{N}\sum_{i=1}^{N}\frac{1}{i} + \frac{N+J}{N}\sum_{i=J+1}^{N}\frac{1}{i}$ and treating it as a continuous function of J with $f(0) = f(N)$, where is the minimum?
We can consider $f(J+1) - f(J)$ or $\frac{f(J+1)}{f(J)}$ to locate the minimum. Instead, I attached a screenshot of the Matlab code and the minimum of the expected ratio. Amazingly, the best J value is about 0.12N (a coarse estimate). If N takes a very large value, we believe the ratio grows large, since it is not bounded above. Therefore we should make U-turns as early as possible.
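The search for the best threshold can be reproduced numerically from the closed form of f(J). This is a small Python sketch under that formula only; the names `ratio_curve` and `best_threshold` are hypothetical and do not come from the attached Matlab code.

```python
from itertools import accumulate

def ratio_curve(n):
    """Return [f(0), f(1), ..., f(n)] for the single-U-turn scheme.

    f(J) = 1 + (J/N) * H_N + ((N+J)/N) * (H_N - H_J), with H_k the
    k-th harmonic number.
    """
    # h[k] holds H_k, with h[0] = 0.
    h = [0.0] + list(accumulate(1.0 / i for i in range(1, n + 1)))
    return [
        1 + j / n * h[n] + (n + j) / n * (h[n] - h[j])
        for j in range(n + 1)
    ]

def best_threshold(n):
    """Return (J, f(J)) for the threshold J that minimizes f."""
    curve = ratio_curve(n)
    j_best = min(range(n + 1), key=curve.__getitem__)
    return j_best, curve[j_best]

if __name__ == "__main__":
    for n in (100, 1000, 10000):
        j, r = best_threshold(n)
        print(n, j, j / n, r)
```

Running it for several N shows how the best fraction J/N varies with N, which is one way to see that 0.12N is only a coarse estimate.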
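If we simplify this problem, we may be able to get a closed-form solution. Suppose that the right door is uniformly distributed in [−1, 1]; then
\[
\begin{aligned}
E\left[\frac{X}{S}\,\middle|\,C=\text{right}\right]
&= \int_0^1 \frac{x}{s}\,dP(S\le s) \\
&= \int_0^J \frac{s}{s}\,ds + \int_J^1 \frac{s+2(J+1)}{s}\,ds \\
&= J + 1 - J - 2(J+1)\ln J \\
&= 1 - 2(J+1)\cdot\ln J.
\end{aligned}
\]
Similarly,
\[
\begin{aligned}
E\left[\frac{X}{S}\,\middle|\,C=\text{wrong}\right]
&= \int_0^1 \frac{x}{s}\,dP(S\le s) \\
&= \lim_{\epsilon\to 0}\int_\epsilon^1 \frac{2J+s}{s}\,ds \\
&= \lim_{\epsilon\to 0}\left(1 - \epsilon - 2J\ln\epsilon\right).
\end{aligned}
\]
Hence the expected ratio is given by
\[
f(J) = 1 - (J+1)\cdot\ln J - J\ln\epsilon,
\]
where the vanishing term $-\epsilon/2$ has been dropped. There is no minimum, because $-J\ln\epsilon$ diverges as $\epsilon\to 0$ for any J > 0. This is not surprising, since S can approach zero while it appears in the denominator. So we should impose the constraint that the right door must be at least $\epsilon > 0$ away from the origin.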
Scheme 3: Based on Wei’s claim, we should make multiple U-turns. A good search algorithm might be to search both sides alternately, doubling the search region each time the exit is not found, i.e., go to
1, −1, 2, −2, 4, −4, 8, −8, 16, −16, · · ·
For simplicity, suppose the distance between any two adjacent doors is 1. As before, the expected ratio
can be written as
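\[
E\left[\frac{X}{S}\right] = E\left[E\left[\frac{X}{S}\,\middle|\,C\right]\right] = P(C=\text{right})\,E\left[\frac{X}{S}\,\middle|\,C=\text{right}\right] + P(C=\text{wrong})\,E\left[\frac{X}{S}\,\middle|\,C=\text{wrong}\right].
\]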
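We start out with $E\left[\frac{X}{S}\,\middle|\,C=\text{right}\right]$. Suppose
\[
2^K < N \le 2^{K+1},
\]
hence $K = \left\lfloor \frac{\ln N}{\ln 2} \right\rfloor$ when N is not a power of 2. We can decompose the integer interval {1, · · · , N} as
\[
\{1,\cdots,N\} = (0,1] \cup \bigcup_{i=0}^{K-1}\left(2^i, 2^{i+1}\right] \cup \left(2^K, N\right].
\]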
Let $X_{(2^i,2^{i+1}]}$ be the accumulated distance at the first time the player moves from the origin to $2^i$ and continues towards $2^{i+1}$, conditioning on the first move being in the right direction. (We will find that if the first move is in the opposite direction, the accumulated distance is slightly different.)
Notice that $X_{(2^0,2^1]} = 5$ and the recurrence relation is given by
\[
X_{(2^i,2^{i+1}]} = X_{(2^{i-1},2^i]} + 2^i - 2^{i-1} + 4\cdot 2^i.
\]
Figure 0.1. The Best Choice of U-turn step J
This is because, in order to reach $2^i$, one has to move from $2^{i-1}$ to the next reflection point $2^i$ and then make a U-turn. Since we suppose the initial move is in the right direction, one then has to move to $-2^i$ and back to $2^i$, which contributes $4\cdot 2^i$.
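Solving the above recurrence, we get
\[
\begin{aligned}
X_{(2^i,2^{i+1}]}
&= X_{(2^{i-1},2^i]} + 5\cdot 2^i - 2^{i-1} \\
&= X_{(2^{i-2},2^{i-1}]} + 5\cdot\left(2^i + 2^{i-1}\right) - \left(2^{i-1} + 2^{i-2}\right) \\
&= \cdots \\
&= X_{(2^0,2^1]} + 5\cdot\sum_{j=1}^{i} 2^j - \sum_{j=1}^{i} 2^{j-1} \\
&= 5 + 9\sum_{j=1}^{i} 2^{j-1} \\
&= 9\cdot 2^i - 4.
\end{aligned}
\]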
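The closed form can be checked by iterating the recurrence directly. This is a minimal Python sketch; the function name `x_by_recurrence` is hypothetical, and the only inputs taken from the text are the initial value 5 and the recurrence itself.

```python
def x_by_recurrence(i_max):
    """Iterate X_i = X_{i-1} + (2^i - 2^(i-1)) + 4 * 2^i starting from X_0 = 5.

    X_i stands for X_{(2^i, 2^(i+1)]} in the text. Returns [X_0, ..., X_{i_max}].
    """
    xs = [5]
    for i in range(1, i_max + 1):
        xs.append(xs[-1] + (2**i - 2**(i - 1)) + 4 * 2**i)
    return xs

if __name__ == "__main__":
    for i, x in enumerate(x_by_recurrence(6)):
        # Compare each iterate with the closed form 9 * 2^i - 4.
        print(i, x, 9 * 2**i - 4)
```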
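Therefore
\[
\begin{aligned}
E\left[\frac{X}{S}\,\middle|\,C=\text{right}\right]
&= \sum_{j=1}^{N}\left.\frac{X}{S}\right|_{S=j}\cdot P(S=j) \\
&= \frac{1}{1}\cdot\frac{1}{N} + \sum_{i=0}^{K-1}\sum_{j=2^i+1}^{2^{i+1}}\left.\frac{X}{S}\right|_{S=j}\cdot\frac{1}{N} + \sum_{j=2^K+1}^{N}\left.\frac{X}{S}\right|_{S=j}\cdot\frac{1}{N} \\
&= \frac{1}{N} + \sum_{i=0}^{K-1}\sum_{j=2^i+1}^{2^{i+1}}\frac{X_{(2^i,2^{i+1}]} + j - 2^i}{j}\cdot\frac{1}{N} + \sum_{j=2^K+1}^{N}\frac{X_{(2^K,2^{K+1}]} + j - 2^K}{j}\cdot\frac{1}{N} \\
&= \frac{1}{N} + \sum_{i=0}^{K-1}\sum_{j=2^i+1}^{2^{i+1}}\frac{1}{N} + \sum_{j=2^K+1}^{N}\frac{1}{N} + \sum_{i=0}^{K-1}\sum_{j=2^i+1}^{2^{i+1}}\frac{X_{(2^i,2^{i+1}]} - 2^i}{j}\cdot\frac{1}{N} + \sum_{j=2^K+1}^{N}\frac{X_{(2^K,2^{K+1}]} - 2^K}{j}\cdot\frac{1}{N} \\
&= \frac{1}{N}\cdot\left(1 + 2^K - 2^0 + N - 2^K\right) + \sum_{i=0}^{K-1}\sum_{j=2^i+1}^{2^{i+1}}\frac{X_{(2^i,2^{i+1}]} - 2^i}{j}\cdot\frac{1}{N} + \sum_{j=2^K+1}^{N}\frac{X_{(2^K,2^{K+1}]} - 2^K}{j}\cdot\frac{1}{N} \\
&= 1 + \frac{1}{N}\sum_{i=0}^{K-1}\left(2^{i+3} - 4\right)\cdot\sum_{j=2^i+1}^{2^{i+1}}\frac{1}{j} + \frac{1}{N}\left(2^{K+3} - 4\right)\cdot\sum_{j=2^K+1}^{N}\frac{1}{j}.
\end{aligned}
\]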
Similarly, suppose we initially move in the wrong direction; then
\[
Y_{(2^0,2^1]} = 9,
\]
since we move to −1 first, then to +1, then to −2, then back through 1 towards 2, so the total distance is 1 + 2 + 3 + 3 = 9. The recurrence equation is given by
\[
\begin{aligned}
Y_{(2^i,2^{i+1}]}
&= X_{(2^{i+1},2^{i+2}]} - \left(2^{i+1} - 2^i\right) - 2\cdot 2^{i+1} \\
&= 9\cdot 2^{i+1} - 4 - 3\cdot 2^{i+1} + 2^i \\
&= 6\cdot 2^{i+1} + 2^i - 4.
\end{aligned}
\]
The above recurrence relation is due to the following facts:
i) By definition, $Y_{(2^i,2^{i+1}]}$ is the accumulated distance at the first time the player moves from the origin to $2^i$ and continues beyond it, conditioning on the first move being in the wrong direction (moving to −1 first).
ii) Based on i), $2^i$ is not the reflection point while $-2^i$ is. Therefore the path is to move to $-2^i$ first and, if the door is not found, to move to $2^i$, return to $-2^{i+1}$, and then return to $2^{i+1}$ via $2^i$. Hence the accumulated distance $Y_{(2^i,2^{i+1}]}$ plus the distance $2^i \to 2^{i+1}$ plus the distance $2^{i+1} \to -2^{i+1}$ is equal to $X_{(2^{i+1},2^{i+2}]}$ (take the opposite direction as the equivalent problem in which the initial move is in the right direction).
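Both closed forms can also be checked by walking the Scheme 3 path explicitly. The sketch below is an illustration only: the helper names are hypothetical, the door is placed on the positive side, and the walker is assumed to turn exactly at 1, −1, 2, −2, 4, −4, ... (mirrored when the first move is wrong).

```python
def doubling_points(first_sign):
    """Yield the Scheme 3 turning points: s, -s, 2s, -2s, 4s, -4s, ..."""
    radius = 1
    while True:
        yield first_sign * radius
        yield -first_sign * radius
        radius *= 2

def walk_distance(door, turning_points):
    """Distance walked from the origin until `door` is first reached."""
    position, walked = 0, 0
    for target in turning_points:
        # The door is found if it lies on the current leg.
        if min(position, target) <= door <= max(position, target):
            return walked + abs(door - position)
        walked += abs(target - position)
        position = target

def x_right(i):
    """X_{(2^i, 2^(i+1)]}: distance on passing 2^i outward, first move right."""
    return walk_distance(2**i + 1, doubling_points(+1)) - 1

def y_wrong(i):
    """Y_{(2^i, 2^(i+1)]}: the same quantity when the first move is wrong."""
    return walk_distance(2**i + 1, doubling_points(-1)) - 1

if __name__ == "__main__":
    for i in range(5):
        print(i, x_right(i), 9 * 2**i - 4, y_wrong(i), 6 * 2**(i + 1) + 2**i - 4)
```

The trick of measuring the distance to door $2^i + 1$ and subtracting 1 relies on the fact that every door j in $(2^i, 2^{i+1}]$ is reached on the same outward leg.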
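Thus
\[
\begin{aligned}
E\left[\frac{X}{S}\,\middle|\,C=\text{wrong}\right]
&= \sum_{j=1}^{N}\left.\frac{X}{S}\right|_{S=j}\cdot P(S=j) \\
&= \frac{1+2}{1}\cdot\frac{1}{N} + \sum_{i=0}^{K-1}\sum_{j=2^i+1}^{2^{i+1}}\left.\frac{X}{S}\right|_{S=j}\cdot\frac{1}{N} + \sum_{j=2^K+1}^{N}\left.\frac{X}{S}\right|_{S=j}\cdot\frac{1}{N} \\
&= \frac{1}{N} + \sum_{i=0}^{K-1}\sum_{j=2^i+1}^{2^{i+1}}\frac{Y_{(2^i,2^{i+1}]} + j - 2^i}{j}\cdot\frac{1}{N} + \sum_{j=2^K+1}^{N}\frac{Y_{(2^K,2^{K+1}]} + j - 2^K}{j}\cdot\frac{1}{N} + \frac{2}{N} \\
&= \frac{1}{N}\cdot\left(1 + 2^K - 2^0 + N - 2^K\right) + \sum_{i=0}^{K-1}\sum_{j=2^i+1}^{2^{i+1}}\frac{6\cdot 2^{i+1} - 4}{j}\cdot\frac{1}{N} + \sum_{j=2^K+1}^{N}\frac{6\cdot 2^{K+1} - 4}{j}\cdot\frac{1}{N} + \frac{2}{N} \\
&= 1 + \frac{1}{N}\sum_{i=0}^{K-1}\left(6\cdot 2^{i+1} - 4\right)\cdot\sum_{j=2^i+1}^{2^{i+1}}\frac{1}{j} + \frac{1}{N}\left(6\cdot 2^{K+1} - 4\right)\cdot\sum_{j=2^K+1}^{N}\frac{1}{j} + \frac{2}{N}.
\end{aligned}
\]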
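Hence
\[
\begin{aligned}
E\left[\frac{X}{S}\right]
&= E\left[E\left[\frac{X}{S}\,\middle|\,C\right]\right] = P(C=\text{right})\,E\left[\frac{X}{S}\,\middle|\,C=\text{right}\right] + P(C=\text{wrong})\,E\left[\frac{X}{S}\,\middle|\,C=\text{wrong}\right] \\
&= 1 + \frac{1}{2N}\cdot\left[\sum_{i=0}^{K-1}\left(2^{i+3} - 4\right)\cdot\sum_{j=2^i+1}^{2^{i+1}}\frac{1}{j} + \left(2^{K+3} - 4\right)\cdot\sum_{j=2^K+1}^{N}\frac{1}{j}\right] \\
&\quad + \frac{1}{2N}\left[\sum_{i=0}^{K-1}\left(6\cdot 2^{i+1} - 4\right)\cdot\sum_{j=2^i+1}^{2^{i+1}}\frac{1}{j} + \left(6\cdot 2^{K+1} - 4\right)\cdot\sum_{j=2^K+1}^{N}\frac{1}{j} + 2\right].
\end{aligned}
\]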
Surprisingly, the ratio is fairly close to its upper bound $1 + \frac12(8 + 12) = 11$: I got ratio = 7.92 for N = 1000 and ratio = 8.3046 for N = 10K.
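As a cross-check of the Scheme 3 formula, the expected ratio can be computed twice: by walking the path for every door, and by evaluating the closed form. This Python sketch is an illustration, not the original Matlab; all function names are hypothetical, and it assumes the walker keeps turning at ±2^k even when a turning point lies beyond the last door, which is what the formula counts.

```python
def doubling_points(first_sign):
    """Yield the Scheme 3 turning points: s, -s, 2s, -2s, 4s, -4s, ..."""
    radius = 1
    while True:
        yield first_sign * radius
        yield -first_sign * radius
        radius *= 2

def walk_distance(door, turning_points):
    """Distance walked from the origin until `door` is first reached."""
    position, walked = 0, 0
    for target in turning_points:
        if min(position, target) <= door <= max(position, target):
            return walked + abs(door - position)
        walked += abs(target - position)
        position = target

def scheme3_simulated(n):
    """Average of X/S over all n door distances and both first moves."""
    total = 0.0
    for door in range(1, n + 1):
        for first_sign in (+1, -1):
            total += walk_distance(door, doubling_points(first_sign)) / door
    return total / (2 * n)

def harmonic_block(lo, hi):
    """Sum of 1/j for j = lo, ..., hi (zero if the range is empty)."""
    return sum(1.0 / j for j in range(lo, hi + 1))

def scheme3_formula(n):
    """Closed-form expected ratio for Scheme 3, with K = floor(log2(n))."""
    k = n.bit_length() - 1
    right = sum((2**(i + 3) - 4) * harmonic_block(2**i + 1, 2**(i + 1))
                for i in range(k))
    right += (2**(k + 3) - 4) * harmonic_block(2**k + 1, n)
    wrong = sum((6 * 2**(i + 1) - 4) * harmonic_block(2**i + 1, 2**(i + 1))
                for i in range(k))
    wrong += (6 * 2**(k + 1) - 4) * harmonic_block(2**k + 1, n) + 2
    return 1 + (right + wrong) / (2 * n)

if __name__ == "__main__":
    for n in (1000, 10000):
        print(n, scheme3_simulated(n), scheme3_formula(n))
```

Agreement between the two functions supports the derivation without relying on either one alone.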
Scheme 4: We examine another similar scheme: randomly choose one direction, but move 1, −2, 4, −8, 16, −32, · · ·
Similarly, suppose $2^{2K} < N \le 2^{2(K+1)}$, hence $K = \left\lfloor \frac{\ln N}{2\ln 2} \right\rfloor$. As before, suppose the right door is located on the positive part of the axis. Notice that the reflection points now become $2^0, 2^2, \cdots, 2^{2i}$. Let $X_{(2^{2i},2^{2(i+1)}]}$ be the accumulated distance at the first time the player moves from the origin to $2^{2i}$ and moves forward. The difference equation can be established as follows:
\[
X_{(2^{2i},2^{2(i+1)}]} = X_{(2^{2(i-1)},2^{2i}]} + 2^{2i} - 2^{2(i-1)} + 2\left(2^{2i} - \left(-2^{2i+1}\right)\right).
\]
The second term comes from the fact that the player moves to $2^{2(i-1)}$ and keeps moving toward the next reflection point $2^{2i}$. From there, the player moves backward to $-2^{2i+1}$ and then back to $2^{2i}$, at which point the accumulated distance is $X_{(2^{2i},2^{2(i+1)}]}$ by the definition of these X's. It can be easily verified that $X_{(2^{2\cdot 0},2^{2\cdot 1}]} = X_{(1,4]} = 1 + 3 + 3 = 7$.
We can get that
\[
X_{(2^{2i},2^{2(i+1)}]} = 9\cdot 4^i - 2, \quad \forall i \in \{0, 1, \cdots\}.
\]
Algorithm 1 Calculate the ratio for scheme 3
clear all, close all;
% N is the number of doors on each side
N=1000;
K=floor(log(N)/log(2));
expectedRatio=1;
% first summation term (right direction, full blocks)
acc=0;
for i=0:K-1
    accInner=0;
    for j=2^i+1:2^(i+1)
        accInner = accInner+1/j;
    end
    acc = acc + (2^(i+3)-4)*accInner;
end
expectedRatio = expectedRatio+acc/2/N;
% second summation term (right direction, last partial block)
accInner=0;
for j=2^K+1:N
    accInner = accInner+1/j;
end
expectedRatio=expectedRatio+(2^(K+3)-4)*accInner/2/N;
% third summation term (wrong direction, full blocks)
acc=0;
for i=0:K-1
    accInner=0;
    for j=2^i+1:2^(i+1)
        accInner = accInner+1/j;
    end
    acc = acc + (6*2^(i+1)-4)*accInner;
end
expectedRatio = expectedRatio+acc/2/N;
% fourth summation term (wrong direction, last partial block, plus 2/(2N))
accInner=0;
for j=2^K+1:N
    accInner = accInner+1/j;
end
expectedRatio=expectedRatio+(6*2^(K+1)-4)*accInner/2/N+1/N
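hence
\[
\begin{aligned}
E\left[\frac{X}{S}\,\middle|\,C=\text{right}\right]
&= \frac{1}{1}\cdot\frac{1}{N} + \sum_{i=0}^{K-1}\sum_{j=2^{2i}+1}^{2^{2(i+1)}}\frac{X_{(2^{2i},2^{2(i+1)}]} + j - 2^{2i}}{j}\cdot\frac{1}{N} + \sum_{j=2^{2K}+1}^{N}\frac{X_{(2^{2K},2^{2(K+1)}]} + j - 2^{2K}}{j}\cdot\frac{1}{N} \\
&= 1 + \sum_{i=0}^{K-1}\left(8\cdot 4^i - 2\right)\cdot\sum_{j=2^{2i}+1}^{2^{2(i+1)}}\frac{1}{j}\cdot\frac{1}{N} + \left(8\cdot 4^K - 2\right)\cdot\sum_{j=2^{2K}+1}^{N}\frac{1}{j}\cdot\frac{1}{N}.
\end{aligned}
\]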
If the first move is in the wrong direction, i.e., the player moves to $-2^0$ first, then the reflection points become $2^1, 2^3, \cdots$. In this case, we decompose {1, 2, · · · , N} as
\[
\{1,2,\cdots,N\} = \{1,2\} \cup \bigcup_{i=0}^{K'-1}\left\{2^{2i+1}+1,\cdots,2^{2(i+1)+1}\right\} \cup \left\{2^{2K'+1}+1,\cdots,N\right\},
\]
where
\[
2^{2K'+1} < N \le 2^{2(K'+1)+1}.
\]
Hence
\[
K' = \left\lfloor \frac{\frac{\ln N}{\ln 2} - 1}{2} \right\rfloor.
\]
We define $Y_{(2^{2i+1},2^{2(i+1)+1}]}$ to be the accumulated distance at the first time the player moves from the origin to $2^{2i+1}$ and moves forward. The difference equation is given by
\[
Y_{(2^{2i+1},2^{2(i+1)+1}]} = Y_{(2^{2(i-1)+1},2^{2i+1}]} + 2^{2i+1} - 2^{2(i-1)+1} + 2\left(2^{2i+1} - \left(-2^{2(i+1)}\right)\right).
\]
Notice that $Y_{(2^{2\cdot 0+1},2^{2(0+1)+1}]} = Y_{(2,8]} = 16$, hence we can get
\[
Y_{(2^{2i+1},2^{2(i+1)+1}]} = 18\cdot 4^i - 2, \quad \forall i \in \{0, 1, \cdots\}.
\]
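The two Scheme 4 closed forms, $9\cdot 4^i - 2$ and $18\cdot 4^i - 2$, can be checked by walking the path 1, −2, 4, −8, ... explicitly. The Python sketch below is an illustration with hypothetical helper names; the door is assumed to sit on the positive side, and a wrong first move is modelled by mirroring the path.

```python
def alternating_doubling(first_sign):
    """Yield the Scheme 4 turning points: s, -2s, 4s, -8s, ..."""
    sign, radius = first_sign, 1
    while True:
        yield sign * radius
        sign, radius = -sign, 2 * radius

def walk_distance(door, turning_points):
    """Distance walked from the origin until `door` is first reached."""
    position, walked = 0, 0
    for target in turning_points:
        if min(position, target) <= door <= max(position, target):
            return walked + abs(door - position)
        walked += abs(target - position)
        position = target

def x_right(i):
    """X for the block (4^i, 4^(i+1)] when the first move is right."""
    return walk_distance(4**i + 1, alternating_doubling(+1)) - 1

def y_wrong(i):
    """Y for the block (2*4^i, 8*4^i] when the first move is wrong."""
    return walk_distance(2 * 4**i + 1, alternating_doubling(-1)) - 1

if __name__ == "__main__":
    for i in range(5):
        print(i, x_right(i), 9 * 4**i - 2, y_wrong(i), 18 * 4**i - 2)
```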
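Therefore
\[
\begin{aligned}
E\left[\frac{X}{S}\,\middle|\,C=\text{wrong}\right]
&= \underbrace{\frac{3}{1}\cdot\frac{1}{N} + \frac{4}{2}\cdot\frac{1}{N}}_{j=1,2} + \sum_{i=0}^{K'-1}\sum_{j=2^{2i+1}+1}^{2^{2(i+1)+1}}\frac{Y_{(2^{2i+1},2^{2(i+1)+1}]} + j - 2^{2i+1}}{j}\cdot\frac{1}{N} \\
&\quad + \sum_{j=2^{2K'+1}+1}^{N}\frac{Y_{(2^{2K'+1},2^{2(K'+1)+1}]} + j - 2^{2K'+1}}{j}\cdot\frac{1}{N} \\
&= 1 + \frac{2+1}{N} + \sum_{i=0}^{K'-1}\left(16\cdot 4^i - 2\right)\cdot\sum_{j=2^{2i+1}+1}^{2^{2(i+1)+1}}\frac{1}{j}\cdot\frac{1}{N} + \left(16\cdot 4^{K'} - 2\right)\cdot\sum_{j=2^{2K'+1}+1}^{N}\frac{1}{j}\cdot\frac{1}{N}.
\end{aligned}
\]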
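Therefore the expected ratio is given by the following formula:
\[
\begin{aligned}
E\left[\frac{X}{S}\right]
&= E\left[E\left[\frac{X}{S}\,\middle|\,C\right]\right] = P(C=\text{right})\,E\left[\frac{X}{S}\,\middle|\,C=\text{right}\right] + P(C=\text{wrong})\,E\left[\frac{X}{S}\,\middle|\,C=\text{wrong}\right] \\
&= 1 + \frac{1}{2N}\left[\sum_{i=0}^{K-1}\left(8\cdot 4^i - 2\right)\cdot\sum_{j=2^{2i}+1}^{2^{2(i+1)}}\frac{1}{j} + \left(8\cdot 4^K - 2\right)\cdot\sum_{j=2^{2K}+1}^{N}\frac{1}{j}\right] \\
&\quad + \frac{1}{2N}\left[3 + \sum_{i=0}^{K'-1}\left(16\cdot 4^i - 2\right)\cdot\sum_{j=2^{2i+1}+1}^{2^{2(i+1)+1}}\frac{1}{j} + \left(16\cdot 4^{K'} - 2\right)\cdot\sum_{j=2^{2K'+1}+1}^{N}\frac{1}{j}\right].
\end{aligned}
\]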
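The Scheme 4 ratio can also be estimated without the closed form, by averaging X/S over every door and both first moves. This is a brute-force Python sketch with hypothetical names; it assumes turning points 1, −2, 4, −8, ... that are not clipped at the last door, matching what the formula counts.

```python
def alternating_doubling(first_sign):
    """Yield the Scheme 4 turning points: s, -2s, 4s, -8s, ..."""
    sign, radius = first_sign, 1
    while True:
        yield sign * radius
        sign, radius = -sign, 2 * radius

def walk_distance(door, turning_points):
    """Distance walked from the origin until `door` is first reached."""
    position, walked = 0, 0
    for target in turning_points:
        if min(position, target) <= door <= max(position, target):
            return walked + abs(door - position)
        walked += abs(target - position)
        position = target

def scheme4_simulated(n):
    """Average of X/S over doors 1..n, with right and wrong first moves
    each weighted by probability 1/2."""
    total = 0.0
    for door in range(1, n + 1):
        for first_sign in (+1, -1):
            path = alternating_doubling(first_sign)
            total += walk_distance(door, path) / door
    return total / (2 * n)

if __name__ == "__main__":
    for n in (1000, 10000):
        print(n, scheme4_simulated(n))
```

For N = 1000 the result can be compared with the value 5.1555 reported in Conclusion 2.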
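Conclusion 1: The expected ratios of Schemes 3 and 4 are bounded above by the constants 11 and 9, respectively. To see this for Scheme 4, we notice that
\[
\begin{aligned}
&\frac{1}{N}\left[3 + \sum_{i=0}^{K'-1}\left(16\cdot 4^i - 2\right)\cdot\sum_{j=2^{2i+1}+1}^{2^{2(i+1)+1}}\frac{1}{j} + \left(16\cdot 4^{K'} - 2\right)\cdot\sum_{j=2^{2K'+1}+1}^{N}\frac{1}{j}\right] \\
<{}&\frac{1}{N}\left[3 + \sum_{i=0}^{K'-1}\left(16\cdot 4^i - 2\right)\cdot\sum_{j=2^{2i+1}+1}^{2^{2(i+1)+1}}\frac{1}{2^{2i+1}} + \left(16\cdot 4^{K'} - 2\right)\cdot\sum_{j=2^{2K'+1}+1}^{N}\frac{1}{2^{2K'+1}}\right] \\
={}&\frac{1}{N}\left[3 + \sum_{i=0}^{K'-1}\frac{16\cdot 4^i - 2}{2\cdot 4^i}\sum_{j=2^{2i+1}+1}^{2^{2(i+1)+1}} 1 + \frac{16\cdot 4^{K'} - 2}{2\cdot 4^{K'}}\sum_{j=2^{2K'+1}+1}^{N} 1\right] \\
<{}&\frac{1}{N}\left[8\sum_{j=0}^{1} 1 + 8\sum_{i=0}^{K'-1}\sum_{j=2^{2i+1}+1}^{2^{2(i+1)+1}} 1 + 8\sum_{j=2^{2K'+1}+1}^{N} 1\right] \\
={}&\frac{8N}{N} = 8.
\end{aligned}
\]
Similarly for the other term, hence the total expected ratio is always upper bounded by $1 + \frac12(8 + 8) = 9$. Of course Wei’s proof is much simpler without going into the details of the expected ratio.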
Conclusion 2: Based on Matlab simulation, the Scheme 4 ratios for N = 1K and 10K are 5.1555 and 5.3882, respectively. Hence, if I am not mistaken (I will double check the derivation of scheme 3 later on), scheme 4 is much better than scheme 3, and 9 is not a tight bound at all!
Conclusion 3: Surprisingly, for any fixed N ≤ 10K, scheme 2 performs best, so there seems to be no need to make multiple U-turns. However, as N goes to infinity, you may choose the right direction but make the U-turn too early; you then have to explore all the doors in the wrong direction (a distance tending to infinity) before going back to explore the right direction. Theoretically speaking, the ratio then goes to infinity, as in scheme 1. In reality, which should we choose? N might not be very large, and the scheme 4 ratio changes little as N grows: I tried N = 100K, and the ratio for scheme 4 becomes 5.3867, NOT BAD at all.
Algorithm 2 Calculate the ratio for Scheme 4
clear all, close all;
% N is the number of doors on each side
N=100000;
K=floor(log(N)/log(2)/2);
expectedRatio=1;
% first summation term (right direction, full blocks)
acc=0;
for i=0:K-1
    accInner=0;
    for j=4^i+1:4^(i+1)
        accInner = accInner+1/j;
    end
    acc = acc + (8*4^i-2)*accInner;
end
expectedRatio = expectedRatio+acc/2/N;
% second summation term (right direction, last partial block)
accInner=0;
for j=4^K+1:N
    accInner = accInner+1/j;
end
expectedRatio=expectedRatio+(8*4^K-2)*accInner/2/N;
% third summation term (wrong direction, full blocks)
KPrime=floor((log(N)/log(2)-1)/2);
acc=0;
for i=0:KPrime-1
    accInner=0;
    for j=2^(2*i+1)+1:2^(2*i+3)
        accInner = accInner+1/j;
    end
    acc = acc + (16*4^i-2)*accInner;
end
expectedRatio = expectedRatio+acc/2/N;
% fourth summation term (wrong direction, last partial block, plus 3/(2N))
accInner=0;
for j=2^(2*KPrime+1)+1:N
    accInner = accInner+1/j;
end
expectedRatio=expectedRatio+(16*4^KPrime-2)*accInner/2/N+3/N/2
Scheme 5: Thanks to Tian for pointing out the original problem. I would shamelessly copy and paste the following figure from the UIUC course website simply because I love the drawing so much, LOL.
A nice illustration of competitive ratio analysis and issues is provided by the lost-cow problem [2]. As shown in Figure 12.26a [1], a short-sighted cow is following along an infinite fence and wants to find the gate. This makes a convenient one-dimensional planning problem. If the location of the gate is given, then the cow can reach it by traveling directly. If the cow is told that the gate is exactly distance 1 away, then it can move one unit in one direction and return to try the other direction if the gate has not been found. The competitive ratio in this case (the set of environments corresponds to all gate placements) is 3. What if the cow is told only that the gate is at least distance 1 away? In this case, the best strategy is a spiral search, which is to zig-zag back and forth while iteratively doubling the distance traveled in each direction, as shown in Figure 12.26b. In other words: left one unit, right one unit, left two units, right two units, left four units, and so on. The competitive ratio for this strategy turns out to be 9, which is optimal. This approach resembles iterative deepening.
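The competitive ratio of 9 can be illustrated numerically by taking the worst case of X/S over gate positions rather than the average. The Python sketch below is an assumption-laden illustration: it uses the alternating doubling pattern 1, −2, 4, −8, ... (the same pattern as Scheme 4), restricts gates to integer distances, and all names are hypothetical.

```python
def alternating_doubling(first_sign):
    """Yield turning points s, -2s, 4s, -8s, ... (doubling on every turn)."""
    sign, radius = first_sign, 1
    while True:
        yield sign * radius
        sign, radius = -sign, 2 * radius

def walk_distance(gate, turning_points):
    """Distance walked from the origin until `gate` is first reached."""
    position, walked = 0, 0
    for target in turning_points:
        if min(position, target) <= gate <= max(position, target):
            return walked + abs(gate - position)
        walked += abs(target - position)
        position = target

def worst_case_ratio(max_distance):
    """Largest X/S over integer gates at distance 1..max_distance,
    on either side of the starting point."""
    worst = 0.0
    for gate in range(1, max_distance + 1):
        for first_sign in (+1, -1):   # gate on the first side or the other
            path = alternating_doubling(first_sign)
            worst = max(worst, walk_distance(gate, path) / gate)
    return worst

if __name__ == "__main__":
    for limit in (10, 100, 1000, 10000):
        print(limit, worst_case_ratio(limit))
```

The worst cases occur for gates just beyond a turning point, and the printed values approach 9 from below as the distance limit grows.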
References
[1] http://planning.cs.uiuc.edu/node622.html#fig:cowpath
[2] R. A. Baeza-Yates, J. C. Culberson, and G. J. E. Rawlins. Searching in the plane. Information and Computation, 106(2):234-252, 1993.
