The Galton{Watson tree conditioned on its height 1 Introduction

The Galton{Watson tree conditioned on its height
Jochen Geiger and Gotz Kersting
Universitat Frankfurty
September 1998
Abstract
We give a probabilistic construction of the Galton{Watson tree
conditioned on its height by decomposing it along the line of descent
to the left{most particle at maximal height. This construction provides
a representation of the nal generation size as a sum of independent
increments. Based on this representation we prove limit laws for the
nal generation size of a Galton{Watson tree conditioned on its height
and total progeny, respectively.
1 Introduction
Let T denote the random family tree of a Galton{Watson branching process
starting with a single founding ancestor, where each particle independently
has probability pk of producing k ospring. For a detailed denition and discussion of this process, we refer to 2], 1]. Regard T as a rooted planar tree
with the distinguishable ospring of each vertex ordered from left to right.
P
Let = kpk be the mean number of children per particle and denote by
y
Research of both authors supported by the German Research Foundation DFG
Fachbereich Mathematik, Postfach 11 19 32, D{60054 Frankfurt am Main, Germany
1
Zn (T ) the nth generation size (= number of vertices with distance n from
the root). We write H (T ) for the height of the tree (= maximal distance of
a vertex from the root) and q = P (Zn (T ) ! 0) for the extinction probability of the branching process. We assume 0 < p0 < 1 throughout, so that
P (H (T ) = n) > 0 for any n and q = 1 () 1:
In this paper we study the conditional structure of the family tree T given
that H (T ) = n: Clearly, particles no longer evolve independently of each
other nor homogeneously in time. However, the Galton{Watson tree conditioned on its height satises some striking consistency and independence
conditions. These properties allow to construct the tree inductively by attaching independent subtrees at the bottom of the line of descent to the
left{most particle at maximal height. This provides a decomposition of the
nal generation size of a conditioned Galton{Watson tree into a sum of independent increments. The construction, which is the analog of the backward
construction of the Galton{Watson tree conditioned on non{extinction in
4], is explained in section 2. In section 3 we prove a strong limit law for the
nal generation size for arbitrary ospring distribution. In the critical case
with nite variance the limiting distribution of the nal generation size is
known to be the same, if instead the total population size of the tree were
conditioned to be large 7]. We give a simple probabilistic proof of this fact.
The concept of spinal decompositions as an approach to proving limit theorems for Galton{Watson processes goes back to Lyons, Pemantle, and
Peres 10]. Similar and related constructions for the genealogical tree arising
from a branching particle system have occured in 5], 3].
2 Growing a conditioned Galton{Watson tree from
the top
Our starting point is the following simple but crucial observation: The law of
the subtree founded by the left{most child of the root, who has a descendant
2
in the nal generation of a Galton{Watson tree conditioned on height n +1,
is the same as the law of a Galton{Watson tree conditioned on height n:
This conditional law is independent of the rst generation size and the rank
of the child. The subtrees founded by the siblings to the left and right of
the distinguished rst generation particle are independent Galton{Watson
trees conditioned on height strictly less than n and n + 1 respectively.
More precisely, let H (T ) denote the height of the Galton{Watson tree T ,
i.e. for n 0
H (T ) = n : () Zn(T ) > 0 Zn+1 (T ) = 0
and let T (i), 1 i Z1(T ) be the subtrees founded by the rst generation
particles of T: For a nite tree T with Z1 (T ) > 0 write R(T ) for the rank
of the left{most such subtree with maximal height,
R(T ) = minf1 i Z1(T )j H (T (i)) = H (T ) ; 1g:
We sum up the properties stated above in the following lemma (we abbreviate R = R(T ) H = H (T ) : : :). The proof is immediate from the
Galton{Watson conditions of independence and time{homogeneity.
Lemma 2.1 The subtrees T (i), 1 i Z1, are conditionally independent
given fR = j Z1 = k H = n + 1g, 1 j k < 1 n 0, with
8 L(T j Z = 0) 1 i j ; 1
>
n
<
(i)
L(T j R = j Z1 = k H = n + 1) = >: L(T j H = n) i = j L(T j Zn+1 = 0) j + 1 i k:
The conditional joint distribution of R and Z1 is
P (R = j Z1 = k jH = n + 1) = cn pk P (Zn = 0)j ;1P (Zn+1 = 0)k;j (1)
where cn = P (H = n)=P (H = n + 1) n 0.
3
The properties stated in lemma 2.1 suggest to grow the conditioned
Galton{Watson tree from the top backwards. We now construct an increasing sequence of trees (Tn )n0 such that Tn has law L(T jH = n): The tree
Tn+1 is obtained from Tn by attaching independent subtrees at the bottom
of the line of descent of the left{most particle at maximal height.
Construction. Let (Vn+1 Wn+1), n 0, be a sequence of independent
random variables with distribution (1),
P (Vn+1 = j Wn+1 = k) = cn pk P (Zn = 0)j;1 P (Zn+1 = 0)k;j and let T0 be a Galton{Watson tree of height 0 (i.e. T0 consists of its root
only). Inductively construct Tn+1 , n 0, by the following procedure:
Let the rst generation size of Tn be Wn :
Let Tn be the subtree founded by the Vn th rst generation particle
+1
+1
+1
of Tn+1 .
Attach independent Galton{Watson trees conditioned on height strictly
less than n to the Vn ; 1 siblings to the left of the distinguished rst
generation particle.
+1
Attach independent Galton{Watson trees conditioned on height strictly
less than n + 1 to the Wn ; Vn siblings to the right of the distin+1
guished rst generation particle.
+1
The tree Tn has the same probabilistic structure as a Galton{Watson tree
conditioned on height n:
Proposition 2.2 Suppose 0 < p0 < 1: Then
L(Tn) = L(T jH = n)
for any n 0:
Proof. Per induction on n using the properties stated in Lemma 2.1.
4
Remark. For any ospring distribution (pk )k0, (Vn Wn) has a weak limit
(V1 W1) say, with distribution
P (V1 = j W1 = k) = c1 pk qk;1 1 j k
(2)
where
1
X
P
(
H
=
n
)
k;1 p ;1 1:
c1 := nlim
c
=
lim
=
kq
(3)
n
k
!1
n!1 P (H = n + 1)
k=1
Note that V1 is conditionally uniform on f1 kg given W1 = k k 1:
This displays the fact that, as n ! 1 all particles in the nal generation
of a Galton{Watson tree conditioned on height n are descendants of the
same rst generation particle. In subcritical and critical cases 1 the
extinction probability q equals 1, i.e. W1 has the size{biased distribution
(;1 kpk )k1 :
The particles in the nal generation n of Tn are the particles in generation
n +1 of Tn+1 descended from the distinguished rst generation particle. Any
other particle in the nal generation n + 1 of Tn+1 is a descendant of one of
the Wn+1 ; Vn+1 siblings to the right of the distinguished rst generation
particle. This provides the following representation of the nal generation
size of a Galton{Watson tree conditioned on its height:
Let Zni and Xn+1 , n 0 i 1 be independent random variables where
L(Zni)
= L(Zn jZn+1 = 0)
and Xn+1 has distribution L(Wn+1 ; Vn+1 )
P (Xn+1 = k) = cnP (Zn+1 = 0)k
Dene Z0 := 1 and
Zn+1 := Zn +
XX
n+1
i=1
1
X
j =k+1
Zni n 0:
pj P (Zn = 0)j;(k+1) k 0:
(4)
(5)
(6)
5
Corollary 2.3 Suppose 0 < p0 < 1: Then
L(Zn) = L(ZH jH = n)
for any n 0:
Proof. By construction of (Tn)n0 and Proposition 2.2.
Remark. A representation similar to (6) holds jointly for the last k n
generations of a Galton{Watson tree conditioned on height n: However, note
that also siblings to the left of particles in the distinguished line of descent
can have descendants in generations n ; j j 1: Alternatively, one can
construct the tree along the backbone of the left{most particle in generation
n ; k + 1:
3 Limit laws for the nal generation size
Our rst result states that, as n ! 1 the nal generation size of a Galton{
Watson tree conditioned on height n has a proper limit with nite mean.
The statement holds for arbitrary ospring distribution. We remark that
strong convergence of the joint distribution of the nal k generation sizes can
be derived in much the same way (compare the remark following Corollary
2.3).
Proposition 3.1 Suppose 0 < p0 < 1: Then
a:s: Zn ;!
Z1 as n ! 1
(7)
E Z1 = nlim
E Zn < 1:
(8)
!1
Proof. Since (Zn)n0 has non{negative increments, it has an almost sure
limit Z1 say, and lim n!1 E Zn = E Z1 1: By (6), the independence
properties of the Xn+1 and Zni and Corollary 2.3
E Zn+1 ; E Zn = EXn+1 E (ZnjZn+1 = 0)
= EXn+1 E (ZnjZn > 0 Zn+1 = 0) P (Zn > 0jZn+1 = 0)
= EXn+1 E Zn P (Zn > 0jZn+1 = 0):
(9)
6
For any increasing sequence (an )n0 with a0 > 0
log an+1 ; log an
a;1
n (an+1 ; an )
a;1
0 (an+1 ; an )
and, consequently,
lim a < 1
n!1 n
() X an (an ; an ) < 1:
1
;1
+1
n=0
Take an = E Zn , then (9) implies
E Z1 < 1
()
1
X
n=0
EXn+1 P (Zn > 0jZn+1 = 0) < 1:
(10)
(Note that the quantity on the right{hand side of (10) is the expected number of `contributing' siblings of the particles in the distinguished line of
descent.) We begin with an upper bound of the rst factor of the summands on the right{hand side of (10). We write f g if there exist positive
constants c1 and c2 such that c1f (n) g (n) c2 f (n) n 0.
EXn+1
X k P (Zn
(3)(5)
1
+1
k=1
= 0)k
j
X X k pj P (Zn
1
;1
j =2 k=1
1
X
j =1
+1
1
X
j =k+1
pj P (Zn = 0)j;(k+1)
= 0)j ;1
j 2pj P (Zn+1 = 0)j ;1 :
(11)
For the other term in the sum on the right{hand side of (10) note that
P (Zn > 0jZn+1 = 0) = PP(Z(H ==n)0)
n+1
P (H = n) (3)
P (H = n + 1):
(12)
In view of (11), (12), and the binomial theorem, we conclude (here c denotes
some positive constant)
1
X
n=0
EXn+1 P (Zn > 0jZn+1 = 0)
7
c XXj
1
1
2
n=0 j =1
= c
pj P (Zn+1 = 0)j ;1P (H = n + 1)
1
1
X
X
j =1
jpj
n=0
j P (H n)j;1 P (H = n + 1)
c X jpj X P (H n + 1)j ; P (H n)j
1
j =1
c
1
X
j =1
P
1
n=0
jpj P (H < 1)j <
1
(13)
since either = 1
j =1 jpj 1 or q = P (H < 1) < 1 (in fact, the inequality
P jp qj q holds).
The claim of Proposition 3.1 follows from (10)
j 1 j
and (13).
From Proposition 3.1 we derive the following limit law for the nal generation size of a Galton{Watson tree with arbitrary ospring distribution
conditioned on height n. In the critical, nite variance case the result is
due to Kesten, Ney, and Spitzer 6]. Seneta 11] removed the second moment assumption. In fact, no conditions on the ospring distributions are
required.
Theorem 3.2 Let n = L(ZH jH = n) n 0 and suppose 0 < p0 < 1:
The sequence (n )n0 converges in a strong sense,
1
X
jjn ; n jj < 1
n
where jj jj denotes total variation norm. If (14)
+1
=0
bution of (n )n0 , then
1
denotes the limiting distri-
1 (k) = pk0 k k 1
(15)
where k = limn!1 P (Zn = k)=P (H = n) is a solution of
k = c1
1
X
j =1
j P (Z1 = kjZ0 = j ):
8
(16)
Proof. By Corollary 2.3 we have n = L(Zn): Consequently,
jjn ; n jj 2P (Zn 6= Zn):
+1
+1
Since (Zn )n0 has independent non{negative, integer{valued increments,
P1 (Z 6= Z ) < 1 i P (Z < 1) = 1 (by the Borel{Cantelli lemma).
n
1
n=0 n+1
Clearly, (8) implies Z1 < 1 a.s., which establishes (14). For the representation of 1 in (15) note that
k
P (ZH = kjH = n) = P P(Z(nH==kn))p0 ! 1 (k) as n ! 1
which implies existence of k := limn!1 P (Zn = k)=P (H = n): Finally, for
any k 1
1
P (Zn+1 = k) = c X
P (Zn = j ) P (Z = kjZ = j )
n
n+1
n
P (H = n + 1)
j =1 P (H = n)
= cn
1
X
j =1
P (Zn = j jH = n) p;0 j P (Z1 = kjZ0 = j ): (17)
To justify that summation and limiting procedures may be interchanged on
the right{hand side of (17), note that the non{negativity of the increments
of (Zn )n0 implies
P (Z1 = j )
P (Zn = j )P (Z1 = Zn)
P (Zn = j )P (Z1 = Z1) (18)
where the second factor on the right{hand side of (18) is positive by (14)
and the fact, that P (Zm+1 = Zm ) > 0 for any m 1 due to the assumption
p0 > 0. Hence, by (17), Fatou's lemma and the dominated convergence
theorem,
k = c1 nlim
!1
= c1
1
X
j =1
1
X
j =1
P (Zn = j ) p;0 j P (Z1 = kjZ0 = j )
1(j ) p;0 j P (Z1 = kjZ0 = j )
which is equation (16).
9
P
In the critical case = 1 with nite variance 2 = k(k ; 1)pk < 1, the
limiting distribution of the nal generation size is the same, if the Galton{
Watson tree T is conditioned on its total population size instead,
lim
n!1 n;12hN
P (ZH = kjN = n) = 1 (k) k 1
P
(19)
where N (T ) = 1
k=0 Zk (T ) and h denotes the span of the ospring distribution (pk )k0 (see Theorem 2 in 7]). The criticality assumption is crucial
for (19) to hold: Suppose that for some > 0
p0k = ck pk k 0 and
P p0 = 1:
k0 k
Then the distribution of T conditioned on its total population size being n
is the same for the two ospring distributions (pk )k0 and (p0k )k0 , whereas
0 dier (e.g., if p = 1 ; p = ", then (2) ! 1
the distributions 1 and 1
2
0
1
as " ! 0). In 7] the limit law (19) is derived from a joint local limit
theorem for H ZH and N: We give here a simple probabilistic proof of a
somewhat weaker result.
Theorem 3.3 Suppose = P1k=1 kpk = 1 2 = P1k=1 k(k ; 1)pk < 1
and p0 > 0: Then
lim P (ZH = kjN n) = 1 (k) k 1
n!1
(20)
where 1 is the limit of L(ZH jH = n) from Theorem 3.2.
Proof. We will show that, as m ! 1 ZH is conditionally independent of
N given H = m: To this end recall the construction of Tm along the line of
descent to the left{most particle at maximal height m and let vj 0 j m
be the distinguished particle's ancestor in generation m ; j . Clearly, vj is
a child of vj +1 vm is the founding ancestor, and v0 is the distinguished
particle itself. Decompose the total population size of Tm by writing
N (Tm) = 1 +
10
Pm Y j =1 j
where Yj is the number of particles in Tm, whose most recent ancestor in
the distinguished line of descent is vj . By construction of Tm, the random
variables Yj 1 j m are independent and the law of Yj ; 1 can be
represented as a random sum of independent copies of the total population
size of Galton{Watson trees conditioned on extinction at generation j ; 1
and j respectively. In particular, the law of Yj does not depend on m and
P (Yj n)
(2)
P (Xj 1) P (N njH < j )
P (H < j jN n) P (N n) n j 1
(21)
where N = N (T ) H = H (T ): To estimate the right{hand side of (21) we
use the following asymptotics for the height and the size of a Galton{Watson
tree T ,
3
lim
n 2 P (N = n) =
n!1 n;12hN
L(n
; 12
H jN = n)
h p
2
(22)
2W (23)
U ;2 as n ! 1
(24)
;!
n!1 n;12hN
where W is the maximum of standard Brownian excursion of duration 1
(see 8], Lemma 2.1.4 and Theorem 2.4.3). It is immediate from (22) that
d
L(n N jN n) ;!
;1
where U is uniformly distributed on (0 1): Consequently,
1
d 2W as n ! 1
(n; 2 H jN n) ;!
L
U
(25)
with W and U independent. From (21), (22), and (25) we obtain the
following asymptotic lower bound for the tail of L(Yj ):
lim
inf jP (Yj xj 2) cx; 2 P (2W
j !1
1
x
; 12
U ) > 0 x > 0:
(26)
Estimating Nm = N (Tm) through the maximum of the Yj m=2 j m
we obtain from (26),
lim inf P (Nm xm2 ) > 0 x 0:
(27)
m!1
11
Now recall that L(ZH jH = m N n) = L(Zm jNm n) by Proposition 2.2.
Hence, for any " > 0
jP (ZH = kjN n) ; (k)j
X jP (ZH = kjH = m N n) ; (k)j P (H = mjN n)
m
msup" n jP (Zm = kjNm n) ; (k)j + P (H "pnjN n): (28)
1
1
1
=1
1
p
We may assume 1 (k) > 0 since 1 (k) = 0 implies P (Zm = k) = 0 for any
m 1. Then (27) implies
supp jP (Zm = kjNm n) ; 1 (k)j
m" n
supp 1 (k);1 P (Nm n Zm = k) ; P (Nm n)
m" n
supp jP (Nm njZ1 = k) ; P (Nm n)j
m" n
+ 1 (k);1 supp P (Zm 6= Z1 ):
(29)
The second term on the right{hand side of (29) goes to 0 as n
Proposition 3.1. We will show below that
! 1 by
m" n
jP (Nm njZ
1
= k) ; P (Nm n)j
P (
> m)
(30)
where is a nite coupling time depending on k but not on m or n. Hence,
if we pass to the limit n ! 1 in (28), then (29), (30) and the limit law (25)
imply
lim sup jP (ZH = kjN n) ; 1 (k)j P (2W
n!1
"
U )
(31)
for any " > 0: Then let " ! 0 in (31) to establish (20). We nally explain the coupling that leads to (30): Let (Yj0 Zj0 )j 1 have distribution
L((Yj Zj )j1jZ1 = k) and let
= minfj 0jZj = Z1 g 0 = minfj 0jZj0 = Z10 g:
12
Observe that for any j 1
L((Yi )i j j j > ) = L((Yi)i j j j > )
0
0
(32)
since Yi is conditionally independent of Z1 = k given i > : Also, note that
(30) is the coupling inequality for the coupling time
= minfj > _ 0j
Pj Y = Pj Y 0g:
i=1 i
i=1 i
In view of (32) we have to produce a coupling for sums of independent but
not identically distributed random variables. However, note that the Yj have
a weak limit by (2) and the fact that fH mg % fH < 1g: It is well{
known that in this case successfull couplings exist (a Mineka or Ornstein
coupling will do, see e.g. 9], Chap. 2), i.e. < 1 a.s.
References
1] Asmussen, S. and Hering, H. (1983) Branching Processes,
Birkhauser, Boston.
2] Athreya, K. B. and Ney, P. (1972) Branching Processes, Springer,
New York.
3] Chauvin, B., Rouault, A., and Wakolbinger, A. (1991) Growing conditioned trees, Stoch. Proc. Appl. 39, 117{130.
4] Geiger, J. (1997) Elementary new proofs of classical limit theorems
for Galton{Watson processes, to appear in J. Appl. Prob. 36
5] Kallenberg, O. (1977) Stability of critical cluster elds. Math.
Nachrichten 77, 7{43.
6] Kesten, H., Ney, P., and Spitzer, F. (1966) The Galton{Watson
process with mean one and nite variance. Theor. Prob. Appl. 11, 513{
540.
13
7] Kesten, H. and Pittel, B. (1996) A local limit theorem for the
number of nodes, the height, and the number of nal leaves in a critical
branching process tree, Random Structures and Algorithms 8, 243{
299.
8] Kolchin, V.F. (1986) Random Mappings, Optimization Software,
New York.
9] Lindvall, T. (1992) Lectures on the Coupling Method, Wiley, New
York.
10] Lyons, R., Pemantle, R., and Peres, Y. (1995) Conceptual
proofs of L log L criteria for mean behavior of branching processes.
Ann. Prob. 25, 1125{1138.
11] Seneta, E. (1967) The Galton{Watson process with mean one,
J. Appl. Prob. 4, 489{495.
14