A new bijection on m-Dyck paths and application to random sampling

Axel Bacher
LIPN, Université Paris 13
June 3rd, 2016
Outline
1. Introduction
2. The folding bijection
   m-Łukasiewicz paths and m-Dyck prefixes
   Folding and unfolding
3. Random sampling
   Main algorithm
   Complexity and limit law
4. Perspectives
m-Dyck paths and (m+1)-ary trees
m-Dyck path: path in N from 0 to 0 with steps in {+1, −m}.
The number of paths of length $n = (m+1)n_0$ is $\frac{1}{mn_0+1}\binom{n}{n_0}$ (Fuß-Catalan number).
Random sampling of general classes of trees in time O(n log n),
based on the cycle lemma. [Devroye ’12]
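The Fuß-Catalan count can be checked by brute force for small parameters. The following is a minimal sketch, not part of the talk; the function names are illustrative, and the enumeration is exponential in n, so it is only meant for tiny cases.

```python
from itertools import product
from math import comb

def fuss_catalan(m, n0):
    """Closed form binom((m+1)*n0, n0) / (m*n0 + 1); the division is exact."""
    return comb((m + 1) * n0, n0) // (m * n0 + 1)

def count_m_dyck_paths(m, n0):
    """Count paths with steps +1 and -m that stay >= 0 and end at 0."""
    n = (m + 1) * n0
    count = 0
    for steps in product((1, -m), repeat=n):
        height = 0
        for s in steps:
            height += s
            if height < 0:
                break  # the path went below 0, discard it
        else:
            if height == 0:
                count += 1
    return count
```

For m = 1 this reduces to the usual Catalan numbers.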
The folding bijection for Dyck paths
One can sample a Dyck path in time O(n) by sampling a Dyck prefix
and using the folding bijection to get a pointed Łukasiewicz path.
[Barcucci, Pinzani, Sprugnoli ’92; B., Bodini, Jacquot ’15]
m-Łukasiewicz paths
m-Łukasiewicz path: non-negative path except at its end.
Paths of length n = (m+1) n0 + r have height h = r − (m+1).
Their number is $\frac{r}{n}\binom{n}{n_0}$ (Raney number).
A pointed path has an associated factorization:
$w = p\,q_0 d \cdots q_k d$, where $0 \le h(q_i) \le m-1$ for $i < k$ and $0 \le h(q_k) \le r-1$.
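The Raney count $\frac{r}{n}\binom{n}{n_0}$ of m-Łukasiewicz paths can be checked by enumeration. This is a small sketch under the assumption that 1 ≤ r ≤ m, so that the final height r − (m+1) is negative; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb

def raney(m, n0, r):
    """Closed form (r / n) * binom(n, n0) with n = (m+1)*n0 + r."""
    n = (m + 1) * n0 + r
    value = Fraction(r, n) * comb(n, n0)
    assert value.denominator == 1  # the count is an integer
    return int(value)

def count_lukasiewicz(m, n0, r):
    """Count paths with steps +1 and -m that stay >= 0 before the last
    step and end at height r - (m+1). Assumes 1 <= r <= m."""
    n = (m + 1) * n0 + r
    target = r - (m + 1)
    count = 0
    for steps in product((1, -m), repeat=n):
        height = 0
        non_negative = True
        for s in steps[:-1]:
            height += s
            if height < 0:
                non_negative = False
                break
        if non_negative and height + steps[-1] == target:
            count += 1
    return count
```

With m = 1 and r = 1 the count is again a Catalan number, since such a path is a Dyck path followed by one down step.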
Decorated m-Dyck prefixes
Paths of length n = (m+1) n0 + r have height h = (m+1) h0 + r.
[Figure: an m-Dyck prefix with decoration a_0 = 1, a_1 = 3, a_2 = 2.]
A decoration of an m-Dyck prefix is defined as a sequence
$(a_0, \ldots, a_{h_0})$, where $1 \le a_i \le m$ for $i < h_0$ and $1 \le a_{h_0} \le r$
(a path has $m^{h_0} r$ possible decorations).
A decorated m-Dyck prefix has an associated factorization:
$w = p\,u q_0 \cdots u q_{h_0}$, where $h(q_i) = a_i$.
Folding and unfolding
Folding maps $p\,u q_0 \cdots u q_k$ to $p\,q_0 d \cdots q_k d$; unfolding is the inverse map.
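As a consistency check on heights (a short derivation added here, assuming $k = h_0$ as in the factorization of a decorated prefix): folding replaces each of the $k+1$ marked $u$ steps (+1) by a $d$ step (−m), so the height drops by $m+1$ per factor.

```latex
\begin{align*}
h(p\,q_0 d \cdots q_k d)
  &= h(p\,u q_0 \cdots u q_k) - (k+1)(m+1) \\
  &= (m+1)h_0 + r - (h_0+1)(m+1) \\
  &= r - (m+1).
\end{align*}
```

This agrees with the height $r - (m+1)$ of an m-Łukasiewicz path of the same length $n = (m+1)n_0 + r$.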
Theorem
The folding operation is a bijection between decorated m-Dyck prefixes
and pointed m-Łukasiewicz paths. Folding or unfolding only requires
reading the part of the path after the point.
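A bijection forces both sides to have the same cardinality, which can be tested numerically. The sketch below compares the number of decorated m-Dyck prefixes of length n (each prefix counted with its $m^{h_0} r$ decorations) with n times the Raney number. It assumes that pointing a path means marking one of its n positions, which is an interpretation rather than something the slides state, and it does not implement folding itself.

```python
from itertools import product
from math import comb

def decorated_prefix_count(m, n):
    """Sum of m**h0 * r over m-Dyck prefixes of length n = (m+1)*n0 + r."""
    n0, r = divmod(n, m + 1)
    total = 0
    for steps in product((1, -m), repeat=n):
        height = 0
        for s in steps:
            height += s
            if height < 0:
                break  # not a prefix: the path went below 0
        else:
            h0 = (height - r) // (m + 1)  # height = (m+1)*h0 + r
            total += m ** h0 * r
    return total

def pointed_lukasiewicz_count(m, n):
    """n positions (assumed meaning of 'pointed') times the Raney number."""
    n0, r = divmod(n, m + 1)
    raney = r * comb(n, n0) // n
    return n * raney
```

Only lengths with 1 ≤ r ≤ m are meaningful here; for r = 0 both counts are zero.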
Random m-Dyck prefix
We draw u and d steps with probabilities $\frac{m}{m+1}$ and $\frac{1}{m+1}$.
If the path becomes negative, we point it at random and unfold.
At all times, paths of height $(m+1)h_0 + r$ appear with probability proportional to $m^{h_0}$.
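The weight $m^{h_0}$ matches what the step probabilities give for a single step sequence: with a up steps and b down steps, the probability is $m^a/(m+1)^n$, and for fixed n it depends only on $h_0$. A minimal exact check on free step sequences follows; it does not implement pointing or unfolding, and taking r as the remainder of n modulo m+1 is an assumption of this sketch.

```python
from fractions import Fraction
from itertools import product

def path_probability(steps, m):
    """Probability of a step sequence: u has probability m/(m+1), d has 1/(m+1)."""
    p = Fraction(1)
    for s in steps:
        p *= Fraction(m, m + 1) if s == "u" else Fraction(1, m + 1)
    return p

def ratios(m, n):
    """Set of P(path) / m**h0 over step sequences of length n ending at height >= 0."""
    r = n % (m + 1)  # assumption: r is the remainder of n modulo m + 1
    found = set()
    for steps in product("ud", repeat=n):
        height = sum(1 if s == "u" else -m for s in steps)
        if height >= 0:
            h0 = (height - r) // (m + 1)  # height = (m+1)*h0 + r
            found.add(path_probability(steps, m) / Fraction(m) ** h0)
    return found
```

If the probabilities are proportional to $m^{h_0}$, the returned set contains a single value for each pair (m, n).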
Random m-Łukasiewicz path
We draw a random m-Dyck prefix of length n = (m+1) n0 + r and
height h = (m+1) h0 + r.
We randomly decorate this prefix and we fold.
The result is a uniform pointed m-Łukasiewicz path.
Complexity
Random m-Łukasiewicz path
w ← ε
for i = 1, …, n do
    s ← u with probability m/(m+1), d otherwise
    w ← ws
    if h(w) < 0 then
        randomly point w
        unfold w (forget the decoration)
    end if
end for
randomly decorate w
fold w (forget the point)
return w
We consider complexity in random bits and memory accesses.
Cost of each line of the algorithm:
    s ← u or d: β random bits
    w ← ws: 1 memory access
    randomly point w: O(log i) random bits
    unfold w: Unif{1, …, i} memory accesses
    randomly decorate w: O(√n) random bits
    fold w: Unif{1, …, n} memory accesses
The if branches are taken independently, at step i with probability ∼ $\frac{1}{2i}$.
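The memory cost can be explored with a small simulation of this cost model alone. The sketch below makes several assumptions: each appended step costs one access, the if branch at step i is taken independently with probability exactly 1/(2i) (the slides only give this asymptotically), an unfold costs Unif{1, …, i} accesses, and the final fold costs Unif{1, …, n}. It simulates the model, not the sampler.

```python
import random

def simulate_memory_cost(n, rng):
    """One draw of M_n / n under the simplified cost model described above."""
    cost = n  # one memory access per appended step
    for i in range(1, n + 1):
        if rng.random() < 1.0 / (2 * i):  # if-branch taken at step i (assumed exact)
            cost += rng.randint(1, i)     # unfolding reads Unif{1..i} cells
    cost += rng.randint(1, n)             # final fold reads Unif{1..n} cells
    return cost / n

def mean_memory_cost(n=2000, trials=400, seed=0):
    """Average of M_n / n over independent simulated runs."""
    rng = random.Random(seed)
    return sum(simulate_memory_cost(n, rng) for _ in range(trials)) / trials
```

Under these assumptions the expected value of M_n / n tends to 1 + 1/4 + 1/2 = 1.75, since the unfold costs contribute about n/4 in expectation and the final fold about n/2.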
Complexity (cont.)
Theorem
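The costs $B_n$ in random bits and $M_n$ in memory accesses satisfy:
$$\frac{B_n}{n} \xrightarrow{d} \beta; \qquad \frac{M_n}{n} \xrightarrow{d} 1 + X + \mathrm{Unif}[0, 1].$$
The number β is the cost in random bits of Bernoulli$\left(\frac{1}{m+1}\right)$.
According to [Knuth, Yao ’76], we can take:
$$\beta \sim -\frac{1}{m+1}\log_2\frac{1}{m+1} - \frac{m}{m+1}\log_2\frac{m}{m+1}.$$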
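The value of β given here is the binary entropy of a Bernoulli variable with parameter 1/(m+1). A short sketch evaluating it for small m (the function name is illustrative, and m ≥ 1 is assumed):

```python
from math import log2

def bernoulli_entropy_bits(m):
    """Binary entropy of Bernoulli(1/(m+1)), in bits; assumes m >= 1."""
    p = 1.0 / (m + 1)
    q = 1.0 - p
    return -p * log2(p) - q * log2(q)

# Entropy per step for a few values of m.
table = {m: bernoulli_entropy_bits(m) for m in (1, 2, 3, 4)}
```

For m = 1 the two steps are equally likely and the entropy is exactly one bit per step; it decreases as m grows.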
The law X is defined by:
$$X = \sum_{x \in S} \mathrm{Unif}[0, x],$$
where S is a Poisson point process of density $\lambda(x) = \frac{1}{2x}$ on (0, 1].
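The law X can be sampled approximately. Under the change of variable t = −log x, a Poisson process of density 1/(2x) on (0, 1] becomes a homogeneous Poisson process of rate 1/2 on [0, ∞), so its points are x = exp(−T_k) for successive arrival times T_k. The sketch below uses this and truncates points smaller than a hypothetical threshold `eps`, which changes the sum by a negligible amount.

```python
import math
import random

def sample_X(rng, eps=1e-12):
    """Approximate sample of X = sum over points x of Unif[0, x]."""
    t = 0.0
    total = 0.0
    while True:
        t += rng.expovariate(0.5)  # next arrival of a rate-1/2 Poisson process
        x = math.exp(-t)           # corresponding point of S in (0, 1]
        if x < eps:                # remaining points contribute less than ~eps
            return total
        total += rng.uniform(0.0, x)

def empirical_mean(samples=20000, seed=1):
    """Average of independent approximate samples of X."""
    rng = random.Random(seed)
    return sum(sample_X(rng) for _ in range(samples)) / samples
```

The exact mean is $\int_0^1 \frac{x}{2}\cdot\frac{1}{2x}\,dx = \frac14$, which gives a simple sanity check on the sampler.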
Properties of the limit law
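The distribution: $X = \sum \mathrm{Poisson}_{x \in (0,1]}\left(\frac{1-x}{2x}\right)$, that is, X is the sum of the points of a Poisson point process of density $\frac{1-x}{2x}$ on (0, 1].
Cumulant generating function $K(z) = \log E(e^{zX})$:
$$K(z) = \int_0^z \frac{e^y - 1 - y}{2y^2}\,dy, \qquad \kappa_n(X) = \frac{1}{2n(n+1)}.$$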
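The integral form of K(z) and the cumulants $\kappa_n(X) = \frac{1}{2n(n+1)}$ can be cross-checked numerically through the expansion $K(z) = \sum_{n \ge 1} \kappa_n z^n / n!$. A minimal sketch using a midpoint rule; the number of panels and of series terms are arbitrary choices.

```python
from math import expm1, factorial

def K_integral(z, panels=20000):
    """Midpoint-rule value of the integral of (e^y - 1 - y) / (2 y^2) over [0, z]."""
    h = z / panels
    total = 0.0
    for j in range(panels):
        y = (j + 0.5) * h
        # expm1 avoids cancellation in e^y - 1 for small y
        total += (expm1(y) - y) / (2.0 * y * y)
    return total * h

def K_series(z, terms=40):
    """Sum of kappa_n * z^n / n! with kappa_n = 1 / (2 n (n + 1))."""
    return sum(z ** n / (2.0 * n * (n + 1) * factorial(n)) for n in range(1, terms + 1))
```

The two functions should agree to high precision for moderate z, for instance z = 1 or z = 2.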
Distribution function $F(x) = P(X \le x)$:
$$F(x) + F'(x) + 2x F''(x) = F(x - 1).$$
$$F(x) = \sqrt{\frac{2e^{1-\gamma}}{\pi}}\,\sin\sqrt{2x}, \qquad 0 \le x \le 1.$$
Tail distribution asymptotics:
$$1 - F(x) = x^{-x} (\log x)^{-2x} (e/2)^{x + o(x)}.$$
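As a check on the closed form for 0 < x ≤ 1 (a derivation added here, not taken from the slides): on this range F(x − 1) = 0 because X is positive, and the function $C\sin\sqrt{2x}$ satisfies the homogeneous equation for any constant C.

```latex
\begin{align*}
F(x) &= C \sin\sqrt{2x}, \qquad C = \sqrt{\tfrac{2e^{1-\gamma}}{\pi}}, \\
F'(x) &= \frac{C\cos\sqrt{2x}}{\sqrt{2x}}, \\
2x\,F''(x) &= -C\sin\sqrt{2x} - \frac{C\cos\sqrt{2x}}{\sqrt{2x}}, \\
F(x) + F'(x) + 2x\,F''(x) &= 0 = F(x-1) \qquad (0 < x \le 1).
\end{align*}
```

The equation alone does not fix the constant C; its value is the one stated above.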
Graph of the distribution function
[Figure: graph of F(x) against x, with the curve $\sqrt{2e^{1-\gamma}/\pi}\,\sin\sqrt{2x}$ labelled.]
Perspectives
Can we use a similar method for other paths (+a, −b)?
