The University of Melbourne
Department of Mathematics and Statistics
Master of Science Thesis

An Exposition of Extremal Processes

Author: James Oates
Supervisor: Associate Professor Aihua Xia

October 12, 2012
Abstract
This thesis will expose the reader to the convergence of extremal processes with
different properties. The main result will be the convergence of the maximum of
a stationary sequence of dependent random variables with independent random
sample size. This is achieved by associating the random variables to a
two-dimensional point process, then proving the point process converges to a
two-dimensional mixed Poisson process. These point processes can then be related to the maximal process and the extremal process, and their convergence in the Skorokhod topology completes the argument.
Contents

1 Introduction

2 Point Processes
  2.1 Preliminaries
    2.1.1 Probability space
    2.1.2 Set formalities
    2.1.3 Types of convergence
  2.2 Introduction to point processes
    2.2.1 General framework and idea
    2.2.2 Stationary point processes
    2.2.3 Simple point processes
    2.2.4 Realisations of point processes
  2.3 Point process theory
    2.3.1 Foundations
    2.3.2 Moment measures of point processes
    2.3.3 Probability generating functional
    2.3.4 Operations on point processes
    2.3.5 Poisson process on general phase space
    2.3.6 Limit theorems for point processes

3 The space D
  3.1 Definition
  3.2 Skorokhod metric and topology
  3.3 Compactness in D
  3.4 Weak convergence in D

4 Classical extreme value theory
  4.1 Extreme value distributions
  4.2 Stationary sequences of dependent random variables

5 Extremal process
  5.1 Foundations
  5.2 Convergence of finite dimensional distributions
  5.3 Tightness in Skorokhod topology

6 Extremal process of random sample size
  6.1 Foundations
  6.2 Convergence of finite dimensional distributions
  6.3 Tightness in Skorokhod topology

7 Extremal process of dependent random variables
  7.1 Foundations and dependent structure
  7.2 Convergence in vague topology
  7.3 Convergence in Skorokhod topology

8 Extremal process of dependent random variables with random sample size
  8.1 Convergence to the 2D mixed Poisson process
  8.2 Convergence in Skorokhod topology

9 Conclusion
Chapter 1
Introduction
In the early 1940s Gnedenko [8] was one of the first to investigate the distribution of the maximum of a set of random variables as the set size tends to infinity. His paper develops the ideas of what we know today as classical extreme value theory. The idea is to look at the maximum of a set of independent and identically distributed (iid) scaled random variables and ask whether it converges to a non-degenerate distribution in the limit as the set becomes infinitely large.
This was extended by considering stationary sequences of dependent random variables. Initially Watson [24] accomplished this by considering m-dependent random variables that were asymptotically independent. Then Newell [19] extended Watson's work by relaxing the restrictions on the dependence structure slightly and weakening a condition on the exceedances. Very similar work was also done by Galambos [7].
Berman then showed that the sample size could be random. In his paper [2] he showed that the distribution of the maximum of dependent random variables can converge to a limit even if the sample size is random, treating both the case where the sample size is independent of the sample random variables and the case where it is dependent on them. Thomas [22] enhanced Berman's result by removing the independence of the sample random variables in some theorems.
Extreme value theory was also being developed in the setting of random processes, in this case aptly titled extremal processes. Consider the process taking the value of the maximum of a set of random variables; as time increases, more random variables are included, changing the value of the maximum. As the set becomes infinitely large this process approaches the extremal process. Lamperti [12] showed convergence of this process in the Skorokhod topology for iid random variables as the sample size tends to infinity. These extremal processes are where the interests of this thesis lie.
Expanding on Lamperti's result, Silvestrov and Teugels [20] considered an extremal process based on a set of random variables with random sample size as opposed to non-random sample size. They proved that such a process converges weakly regardless of whether the sample size random variable is dependent on the sample random variables or not. This was achieved by assuming the non-random sample size extremal process and the sample size random variable jointly converge weakly.
After Lamperti showed convergence in the iid case, the field looked at extending it via two types of dependence structures. Papers by Loynes [16] and Welsch [27] considered weak convergence of extremal processes based on strongly mixing stationary sequences, whereas Berman [3] looked at Gaussian sequences where the correlation tends to zero as the distance increases. Leadbetter [15], [13] then combined the two approaches to develop what is known as asymptotic independence. This is weaker than strong mixing, yet it encompasses the Gaussian case.
Weissman followed by first proving that a two-dimensional point process based on independent non-identically distributed random variables converges weakly to a two-dimensional Poisson point process. As seen in [25] and [26], Weissman then related these point processes to the extremal process to prove weak convergence. Adler [1] followed this by proving that the two-dimensional point process based on dependent random variables still converges to a two-dimensional Poisson point process. This again transfers to the extremal process via Weissman's idea.
This is where we are now, and it leads the reader to the main result of the thesis. The aim is to prove that the maximum of a stationary sequence of dependent random variables with random sample size converges to an extremal process. This will be achieved in a similar manner to Adler [1], except the point process will converge to a two-dimensional mixed Poisson point process whose parameter depends on the limit of the sample size. This idea can be seen in Chapter 8. The preceding chapters provide the necessary background to tackle the arguments presented in the final proof.
Chapter 2
Point Processes
The aim of this chapter is to give the necessary background knowledge regarding
probability spaces and point processes. This is required for later chapters as we will
use this to show that a sequence of empirical point processes approaches a Poisson
point process as the sample size becomes large. A nice but formal introduction to
point processes can be found in [5], however a more rigorous and dense reference
is [11]. At this point I would also like to mention [18], as it provides a very gentle introduction to the topic.
2.1 Preliminaries

2.1.1 Probability space
A probability space (Ω, F, P ) is an axiomatic way of defining probabilities. The
triplet is defined as follows.
Ω is the set of all possible outcomes and is called the sample space. The
elements of this set are denoted by ω.
F is a collection of events that are subsets of Ω, and it forms a σ-algebra with
the following properties:
1. Ω ∈ F and ∅ ∈ F.
2. If A ∈ F then Ac ∈ F.
3. If A_1, A_2, ... ∈ F then $\bigcup_{i=1}^{\infty} A_i \in \mathcal{F}$ and $\bigcap_{i=1}^{\infty} A_i \in \mathcal{F}$.
P is a probability measure on F which is defined to have the following properties:
1. P {Ω} = 1.
2. P {A} ≥ 0, ∀A ∈ F.
3. If A_1, A_2, ... ∈ F and A_i ∩ A_j = ∅ for all i ≠ j, then $P\{\bigcup_{i=1}^{\infty} A_i\} = \sum_{i=1}^{\infty} P\{A_i\}$.
A function X is a random variable if it maps from the sample space Ω to the real line and is F-measurable, i.e.

$$\{\omega : a \le X(\omega) \le b\} \in \mathcal{F} \qquad (2.1)$$

for all a ≤ b with a, b ∈ R.
2.1.2 Set formalities
This subsection will cover some terminology that will be used in later sections.
For the following definitions consider space S with Borel σ-algebra BS and metric
ρ. All the following definitions can be found in Appendix M5 on page 239 of [4].
ε-net
An ε-net for a set A ⊂ S is a set of points {x_k} such that ∀x ∈ A there exists an x_k with ρ(x, x_k) < ε.
Bounded sets
The set A ⊂ S is bounded if its diameter, sup{ρ(x, y) : x, y ∈ A}, is finite.
Totally bounded set
The set A ⊂ S is totally bounded if ∀ε > 0 there exists a finite ε-net. The closure of a totally bounded set is bounded but the converse does not hold. The closure of A is compact if and only if A is totally bounded and the closure of A is complete. For a set B ⊂ R^k with the Euclidean metric, the set is totally bounded if and only if it is bounded.
Relatively compact set
The set A ⊂ S is relatively compact if the closure of A is compact. This
implies every sequence in A has a convergent subsequence, but the limit may
not be in A.
2.1.3 Types of convergence
The following subsection will present the types of convergence that will be seen in
the thesis.
Convergence of finite dimensional distributions
We consider a sequence of random processes {X_n(t), t ∈ T}, n ≥ 0. Let D ⊂ T; then we say that {X_n} converges to {X_0} in finite dimensional distributions along D if, ∀k ≥ 1 and t_1, ..., t_k ∈ D, $(X_n(t_1), \ldots, X_n(t_k)) \xrightarrow{d} (X_0(t_1), \ldots, X_0(t_k))$ as n → ∞, where $\xrightarrow{d}$ denotes convergence in distribution. For further details see [4].
Convergence in vague topology
Consider the set F_cb of continuous functions on (S, ρ) with bounded support. Also consider a sequence of measures {µ_n} on B_S. If $\int f \, d\mu_n \to \int f \, d\mu$ for all f ∈ F_cb, then we say µ_n converges to µ in the vague topology. This is denoted by $\mu_n \overset{v}{\Rightarrow} \mu$. This convergence is thoroughly explored in [11].
Weak convergence
Weak convergence is similar to convergence in the vague topology, however the mass must be conserved. Consider the set F_c of bounded continuous functions on (S, ρ) and a sequence of measures {µ_n} on B_S. If $\int f \, d\mu_n \to \int f \, d\mu$ for all f ∈ F_c, then we say µ_n converges weakly to µ. This is denoted by $\mu_n \overset{w}{\Rightarrow} \mu$. Weak convergence is formulated concisely in [4].
Notice that weak convergence implies convergence in vague topology, but the converse is false. The following example highlights the difference between the two.
Example 2.1. Let {µ_n} be a sequence of probability measures such that µ_n = δ_n, where δ_n is the Dirac measure at n,

$$\delta_n(A) = \begin{cases} 1, & n \in A, \\ 0, & \text{otherwise.} \end{cases} \qquad (2.2)$$

Let µ := 0; then $\mu_n \overset{v}{\Rightarrow} \mu$ as n → ∞, however µ_n does not converge weakly to µ as the mass escapes.
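As a concrete numerical illustration of Example 2.1 (a minimal Python sketch added for exposition; the bump function f is an arbitrary illustrative choice): integrating a continuous, compactly supported f against δ_n amounts to evaluating f(n), which vanishes for large n, while the total mass of δ_n stays at 1.

```python
# Sketch of Example 2.1: mu_n = delta_n, the Dirac mass at n.
# For continuous f with bounded support, the integral of f against delta_n
# is just f(n), which tends to 0 (the vague limit is mu = 0), yet
# mu_n(R) = 1 for every n, so no weak limit exists: the mass escapes.

def f(x):
    # a continuous bump supported on [-1, 1]
    return max(0.0, 1.0 - abs(x))

for n in [0, 1, 2, 5, 10]:
    print(f"n={n:2d}:  integral of f against delta_n = {f(n):.3f},  total mass = 1.0")
```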
This idea of the mass escaping is related to the concept of tightness, which will
be discussed later on (Section 3.4). However at this point it is worth mentioning
that convergence of finite dimensional distributions and tightness are enough to
ensure a sequence of probability measures weakly converges.
2.2 Introduction to point processes

2.2.1 General framework and idea
The idea behind point processes is best described by considering a phase space
S. Within the phase space S, there are generally a countable number of random
points. These points can be analysed using different realization methods that
will be covered later. Most of the work in the later chapters will be considering
S = Rd , but most of the point process theory will be proved in the framework of
S, a separable and complete metric space (Polish space).
When S = [0, ∞) one realisation is considering the number of points, Nt , in
the interval (0, t]. This can easily be extended to an arbitrary interval (t1 , t2 ] by
denoting N (t1 , t2 ] = Nt2 − Nt1 . If S has a metric space structure then we can
define BS as the Borel σ-algebra. The Borel σ-algebra is the smallest σ-algebra of
all subsets of S containing all open sets. From this, for B ∈ BS , we can consider
the number of points in B, N (B). We will assume that there are only finitely
many points in a bounded subset of S, that is, for bounded B ∈ BS , N (B) < ∞.
Assume N (B) is F-measurable ∀B ∈ BS , then the following probabilities are
well defined for Bi ∈ BS and ni ∈ Z+ := {0, 1, 2, . . . }:
$$P\{N(B_1) = n_1, \ldots, N(B_k) = n_k\}, \quad \text{for } k \ge 1. \qquad (2.3)$$

Then we can define the finite dimensional distributions of the point process {N(B) : B ∈ B_S}, for fixed k ∈ N := {1, 2, ...} and n_i ∈ Z_+, B_i ∈ B_S for i = 1, ..., k, as

$$P\{N(B_1) = n_1, \ldots, N(B_k) = n_k\}. \qquad (2.4)$$

2.2.2 Stationary point processes
If we consider Rd , there are two main types of stationary point processes, crudely
stationary and strictly stationary. A point process defined on R^d is crudely stationary if, ∀B ∈ B_{R^d} and t ∈ R^d,

$$P\{N(B + t) = n\} = P\{N(B) = n\}, \quad n \in \mathbb{Z}_+, \qquad (2.5)$$

where B + t = {s + t : s ∈ B}. Similarly it is a strictly stationary point process if, ∀B_1, ..., B_k ∈ B_{R^d} and t ∈ R^d,

$$P\{N(B_1 + t) = n_1, \ldots, N(B_k + t) = n_k\} = P\{N(B_1) = n_1, \ldots, N(B_k) = n_k\}, \quad \forall k \ge 1 \text{ and } n_1, \ldots, n_k \in \mathbb{Z}_+, \qquad (2.6)$$
so the finite dimensional distributions of the point process are invariant under a
shift of domain.
Now we will prove some results for a stationary point process on R. This is the simplest way to convey the ideas without getting bogged down in notation and measure theory. Let H : R_+ → R_+ be such that H(x) = E{N(0, x]}.
Theorem 2.1. For a crudely stationary point process on R,

$$H(x) = \lambda x, \quad x \in \mathbb{R}_+, \qquad (2.7)$$

where λ = E{N(0, 1]} ≤ ∞.
Proof. First we need to show that H(x) satisfies the Cauchy functional equation.
This is done using the crudely stationary property of the point process.
H(x + y) = E{N (0, x + y]}
= E{N (0, x]} + E{N (x, x + y]}
= E{N (0, x]} + E{N (0, y]}
= H(x) + H(y).
(2.8)
Knowing H(x) is non-decreasing in x and H(0) = 0, we can use Cauchy’s functional
equation defined in Lemma 3.6.III on page 65 of [5] to show H(x) = H(1)x,
completing the proof.
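As an aside, Theorem 2.1 can be checked numerically in the canonical crudely stationary case; the following is a minimal Python sketch (assuming NumPy), taking the homogeneous Poisson process as the test case.

```python
import numpy as np

# For a homogeneous Poisson process with intensity lam, N(0, x] ~ Poisson(lam * x),
# so the empirical mean count H(x) = E N(0, x] should be linear in x.
rng = np.random.default_rng(0)
lam = 2.5
for x in [0.5, 1.0, 2.0, 4.0]:
    counts = rng.poisson(lam * x, size=100_000)
    print(f"x = {x}: empirical H(x) = {counts.mean():.3f},  lam * x = {lam * x}")
```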
The quantity λ is commonly referred to as the intensity, and takes values 0 ≤ λ ≤ ∞. Another quantity that we need to define is the rate ρ, defined as

$$\rho = \lim_{h \downarrow 0} \frac{P\{N(0, h] > 0\}}{h}.$$
Theorem 2.2. For a crudely stationary point process on R, the limit

$$\rho = \lim_{h \downarrow 0} \frac{P\{N(0, h] > 0\}}{h} \qquad (2.9)$$

exists with 0 ≤ ρ ≤ ∞.
Proof. This proof will take advantage of the subadditive Lemma (Lemma 3.6.I on
page 64 of [5]). Let f (x) = P {N (0, x] > 0} for x ∈ (0, ∞) then using the crudely
stationary property,
f (x + y) = P {N (0, x + y] > 0}
= P {N (0, x] > 0} + P {N (0, x] = 0, N (x, x + y] > 0}
≤ P {N (0, x] > 0} + P {N (0, y] > 0}
= f (x) + f (y).
(2.10)
Complementing this with lim_{x↓0} P{N(0, x] > 0} = P{N(∅) > 0} = 0 and the fact that f(x) is non-decreasing in x, the subadditive lemma shows the existence of ρ, completing the proof.
Lemma 2.1. For a crudely stationary point process on R,

$$\lambda \ge \rho. \qquad (2.11)$$

Proof. Using the results from Theorem 2.1 and Theorem 2.2,

$$\lambda h = E\{N(0, h]\} = \sum_{n=0}^{\infty} n P\{N(0, h] = n\} \ge \sum_{n=1}^{\infty} P\{N(0, h] = n\} = P\{N(0, h] > 0\}, \qquad (2.12)$$

which implies

$$\lambda \ge \frac{P\{N(0, h] > 0\}}{h}. \qquad (2.13)$$
Then take the limit as h ↓ 0 of both sides to complete the proof.
2.2.3 Simple point processes
A point process is classified as a simple point process on S if,
P {N ({x}) = 0 or 1, ∀x ∈ S} = 1.
(2.14)
This is an elegant way of saying that there is almost surely nowhere in the domain where more than one point can exist. The definition of a simple point process allows us
to find a case where the rate and the intensity are equal.
Theorem 2.3. For all crudely stationary simple point processes on R,

$$\lambda = \rho, \qquad (2.15)$$

where the common value ∞ is allowed.
Proof. Let n ∈ N; then we can partition the interval (0, 1] into 2^n subintervals of equal length. Define the indicator random variables

$$\mathbf{1}_{ni} = \begin{cases} 1, & \text{if } N\left(\frac{i-1}{2^n}, \frac{i}{2^n}\right] > 0, \\ 0, & \text{otherwise,} \end{cases} \qquad i = 1, \ldots, 2^n. \qquad (2.16)$$

Using N(0, x + y] = N(0, x] + N(x, x + y], in the limit

$$\sum_{i=1}^{2^n} \mathbf{1}_{ni} \xrightarrow{a.s.} N(0, 1] \quad \text{as } n \to \infty, \qquad (2.17)$$

where the convergence is monotone as the intervals are continually halved. Thus using the monotone convergence theorem we can show

$$\lambda = E\{N(0, 1]\} = E\left\{\lim_{n \to \infty} \sum_{i=1}^{2^n} \mathbf{1}_{ni}\right\} = \lim_{n \to \infty} \sum_{i=1}^{2^n} E\{\mathbf{1}_{ni}\} = \lim_{n \to \infty} 2^n P\left\{N\left(0, \frac{1}{2^n}\right] > 0\right\} = \rho \qquad (2.18)$$

by Theorem 2.2, concluding the proof.
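The dyadic construction in this proof can be visualised with a small simulation (a sketch assuming NumPy; a homogeneous Poisson process on (0, 1] stands in for the simple point process):

```python
import numpy as np

# Sum of the indicators 1_{ni} counts the occupied dyadic cells of (0, 1];
# as the partition is refined the sum increases to the total count N(0, 1],
# since the (almost surely distinct) points are eventually separated.
rng = np.random.default_rng(1)
points = rng.uniform(0.0, 1.0, size=rng.poisson(5.0))  # a simple point process realisation
for n in [1, 2, 4, 8, 16]:
    cells = np.floor(points * 2**n).astype(int)        # dyadic cell index of each point
    occupied = len(np.unique(cells))                   # value of the sum of the 1_{ni}
    print(f"2^n = {2**n:5d} cells: occupied = {occupied},  N(0,1] = {len(points)}")
```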
Before moving to different types of realisations it is worth mentioning a definition. A distribution on Z_+ with probability generating function (pgfn) ϕ(z) is infinitely divisible if for each n ∈ N there exists a pgfn ϕ_n(z) such that

$$\varphi(z) = (\varphi_n(z))^n, \quad 0 \le z \le 1. \qquad (2.19)$$

Now let $\varphi_t(z) = E\{z^{N(0,t]}\}$ be the pgfn of a point process. If the point process is stationary with independent increments then $\varphi_{t_1 + t_2}(z) = \varphi_{t_1}(z)\varphi_{t_2}(z)$, as $N(0, t_1]$ and $N(t_1, t_1 + t_2]$ are independent. Continuing this we obtain $\varphi_t(z) = [\varphi_{t/n}(z)]^n$, hence a stationary point process with independent increments is infinitely divisible.
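For instance, for a stationary Poisson process the count N(0, t] is Poisson(λt), so ϕ_t(z) = e^{λt(z−1)}, and the factorisation ϕ_t(z) = [ϕ_{t/n}(z)]^n can be checked directly (a tiny numerical sketch using only the standard library; the values of λ, t and z are illustrative):

```python
import math

# pgfn of N(0, t] for a Poisson process: phi_t(z) = exp(lam * t * (z - 1)).
# Infinite divisibility: phi_t(z) == phi_{t/n}(z) ** n for every n.
lam, t, z = 2.0, 3.0, 0.7
phi = lambda s: math.exp(lam * s * (z - 1.0))
for n in [2, 5, 10]:
    print(n, phi(t), phi(t / n) ** n)   # the two columns agree
```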
2.2.4 Realisations of point processes
So far we have only considered point processes on R. In this space there are three
main ways of representing realisations, however they do not always apply to other
spaces so easily. I will go through the different ways of representing realisations
and their inherent pros and cons.
Interval between successive points
Representing the realisation as the interval between successive points is an
elegant way of seeing how our process works. However this only really works
in R, or spaces that have a natural order. If we consider more general spaces
they may not have such nice ordering properties, even Rd for d > 1 has
problems. How do you distinguish the next point from the point we are
currently at? Hence the problems faced by this particular realisation.
Countable subsets of the phase space
Counting the subsets of the phase space where the points exist will work in a more general space than R. For example, the subsets can be of the form {x_i}; these are assumed to be locally finite, which implies there are only finitely many points in a bounded subset of S. Counting such subsets has its own problems, due to the fact that a point process need not be simple. This means one subset can contain multiple points even if the subset is a singleton. This problem can be overcome in two ways. Firstly, if we order the realisations then the points are easily distinguished. Again this poses the problem of only really being a viable option in R or an ordered space. The second method is to consider each realisation as a multiset, so the points that make up the realisation can appear multiple times.
Counts in subsets
Possibly the best way to represent realisations of a point process is to count
the realisations themselves. This is easily done in R but not transferable
to more general spaces, unless we use counting measures. With S having a metric space structure we can say that N(·) is a measure defined on the Borel σ-algebra B_S, where the measure takes values in Z_+ ∪ {∞}. For this we need to assume the point sets are locally finite, so that for a bounded B ∈ B_S the measure is finite.
2.3 Point process theory

2.3.1 Foundations
The theory of point processes is best established in a complete and separable metric space (csm space). So for the rest of this work we will assume the phase space S is a complete and separable metric space. The completeness of the space ensures every Cauchy sequence in the space has its limit point in the space, whereas the separability ensures the space contains a countable dense subset A, such that ∀x ∈ S there exists a sequence of points in A converging to x.
Let the phase space S have Borel σ-algebra B_S. Denote by N_S the collection of all locally finite counting measures N(·) on (S, B_S), where locally finite means N(B) < ∞ for any bounded B ∈ B_S. We then equip N_S with the σ-algebra

B_N = the smallest σ-algebra of subsets of N_S containing all finite dimensional cylinder sets

$$\{N(\cdot) \in \mathcal{N}_S : N(B_1) = n_1, \ldots, N(B_k) = n_k\} \qquad (2.20)$$

for k ∈ N, bounded B_1, ..., B_k ∈ B_S, and n_1, ..., n_k ∈ Z_+.
The above definitions are given more rigorously in Appendix 2.5 on page 398 of [5].
They show the space NS is a csm space if S is a csm space and NS has Skorokhod
topology. The Skorokhod topology is explored in Chapter 3.
Let ξ be a point process on phase space S. Then it is a measurable mapping
from a probability space (Ω, F, P ) into a counting measure space (N , BN ). Define
another probability measure, Pξ = P ◦ ξ −1 on (N , BN ), as the distribution of the
point process ξ. Hence for a set A ∈ BN ,
Pξ {A} = P {ξ −1 (A)}.
(2.21)
Define δx to be the Dirac measure at x defined in (2.2).
Theorem 2.4. A measure µ(·) ∈ N_S if and only if it can be expressed as

$$\mu(\cdot) = \sum_{\alpha} k_\alpha \delta_{x_\alpha}(\cdot), \qquad (2.22)$$

where k_α ∈ N and {x_α} is a countable, locally finite set of distinct points from S.
Proof. This proof follows the idea of Lemma 2.1 on page 19 of [11]. It is clear that
a µ(·) expressed in this way is an element of the counting measure space N_S. So it is
only left to show, given a counting measure µ(·) it can be expressed in the form of
(2.22).
As µ(·) ∈ N_S we know it is integer valued, locally finite and supported on countably many points. Hence we only need to show that µ(·) has no non-atomic component. Consider an arbitrary point x ∈ S and a sequence of positive real numbers {ε_n, n = 1, 2, ...} such that ε_n → 0 as n → ∞. Let S_ε(x) denote the open ball of radius ε centred at x, and consider the sequence of open balls S_{ε_n}(x) ↓ {x} as n → ∞. By continuity of measures,
$$\mu(\{x\}) = \lim_{n \to \infty} \mu(S_{\varepsilon_n}(x)), \qquad (2.23)$$
and we know the right hand side only consists of non-negative integers, thus the
left hand side is the same. Hence if x is not an atom of µ(·) it must sit in the
centre of open balls such that µ(Sεn (x)) = 0. This implies that the support of µ(·)
is purely atomic, completing the proof.
Theorem 2.5. Let ξ be a mapping from probability space (Ω, F, P ) into (NS , BN ).
Then ξ is a point process on S if and only if for every B ∈ BS , ξB = ξ(B, ·) is a
random variable.
Proof. Suppose ξ is a point process on S. For all ω ∈ Ω, ξ(ω) = ξ(·, ω) is a counting measure, whose value on a set B ∈ B_S is ξ_B(ω) = ξ(B, ω). This implies ξ_B is a mapping ξ_B : Ω → Z_+. If we also consider the mapping π_B : N_S → Z_+ such that π_B(µ) = µ(B), ∀µ ∈ N_S, then the maps are related by ξ_B = π_B ∘ ξ.

By definition π_B^{-1}(C) ∈ B_N for C ∈ B_{Z_+}, where B_{Z_+} denotes the Borel σ-algebra of Z_+. Hence if we write ξ_B in the following way we can see it is measurable: for C ∈ B_{Z_+},

$$\xi_B^{-1}(C) = \xi^{-1}(\pi_B^{-1}(C)) \in \mathcal{F}, \qquad (2.24)$$

showing that ξ_B(·) is a random variable. Thus it is left to show that, given each ξ_B is a random variable, ξ is a point process. For all sets B ∈ B_S and C ∈ B_{Z_+},

$$\xi^{-1}(\pi_B^{-1}(C)) = \xi_B^{-1}(C) \in \mathcal{F}, \qquad (2.25)$$

hence all unions over B must also be in F,

$$\xi^{-1}\left(\bigcup_{B \in \mathcal{B}_S} \pi_B^{-1}(C)\right) \in \mathcal{F}. \qquad (2.26)$$

However we know $\mathcal{B}_N = \sigma\left(\bigcup_{B \in \mathcal{B}_S} \pi_B^{-1}(\mathcal{B}_{Z_+})\right)$, thus for a set A ∈ B_N,

$$\xi^{-1}(A) \in \mathcal{F}, \qquad (2.27)$$

concluding the proof.
Theorem 2.6. Let {η(B) : B ∈ B_S} be a family of non-negative integer valued random variables on a common probability space (Ω, F, P), indexed by the Borel sets B ∈ B_S, where S is a csm space. Then there exists a point process ξ such that

$$\xi_B \overset{a.s.}{=} \eta(B) \qquad (2.28)$$

if and only if for every pair of disjoint bounded sets B_1, B_2 ∈ B_S,

$$\eta(B_1 \cup B_2) \overset{a.s.}{=} \eta(B_1) + \eta(B_2), \qquad (2.29)$$

and for all sequences of bounded sets {B_n} ⊂ B_S such that B_n ↓ ∅,

$$\eta(B_n) \xrightarrow{a.s.} 0. \qquad (2.30)$$
Proof. Theorem 5.4 on page 42 of [11] provides a proof.
Now we can use Theorem 2.6 to define a completely random point process. A
point process ξ on (S, BS ) is called completely random if for all bounded, pairwise disjoint sets B1 , . . . , Bk , the random variables ξB1 , . . . , ξBk are all mutually
independent.
Theorem 2.7 (Daniell-Kolmogorov’s existence theorem for point processes). For
each k ∈ N and bounded Borel sets B1 , . . . , Bk ∈ BS , denote the discrete distribution on Z+ by,
qk (B1 , . . . , Bk ; n1 , . . . , nk )
for n1 , . . . , nk ∈ Z+ .
(2.31)
This distribution will be the finite dimensional distribution of a point process if it
satisfies the necessary and sufficient conditions which have been simplified to the
following four conditions. For each k ∈ N, bounded sets B1 , . . . , Bk ∈ BS , and
n1 , . . . , nk ∈ Z+ ,
1. qk (B1 , . . . , Bk ; n1 , . . . , nk ) = qk (Bπ1 , . . . , Bπk ; nπ1 , . . . , nπk ) for any permutation π of {1, . . . , k}.
2. $\sum_{j=0}^{\infty} q_k(B_1, \ldots, B_k; n_1, \ldots, n_{k-1}, j) = q_{k-1}(B_1, \ldots, B_{k-1}; n_1, \ldots, n_{k-1})$.
3. For sets B_1, B_2 such that B_1 ∩ B_2 = ∅,

$$q_3(B_1, B_2, B_1 \cup B_2; n_1, n_2, n_3) = 0 \qquad (2.32)$$

for n_1 + n_2 ≠ n_3.
4. Let {B_n} be a non-increasing sequence of sets from B_S such that B_n ↓ ∅ as n → ∞; then

$$q_1(B_n; 0) \to 1 \quad \text{as } n \to \infty. \qquad (2.33)$$
Proof. A more general proof can be found in Theorem 5.3 on page 41 of [11]. Also
Theorem 9.2.X on page 30 of [6] is very similar to the one posed here.
We can see the first two requirements of Theorem 2.7 are the same as in Kolmogorov's existence theorem, which proves the distribution is indeed a finite dimensional distribution. The third requirement enforces almost sure finite additivity of the point process, and the fourth enforces almost sure continuity of the point process at ∅. Together these two requirements ensure the point process is almost surely a counting measure.
Consider a probability space (Ω, F, P ), and a point process ξ that maps the
probability space into counting measure space (N , BN ). Then if we set Ω = N ,
F = BN and ξ to be the identity mapping ξ(ω) = ω where ω ∈ Ω, we create the
canonical process space. This space is useful because the original probability space
isn’t required any more, only the probability of each value occurring. This can be
seen by considering P {ω : ξ(ω) ∈ B} where ω ∈ Ω = N and B ∈ F = BN . Then
P {ω : ξ(ω) ∈ B} = P {ω : ω ∈ B} = P {B}, hence defining a distribution for the
point process ξ. This leads us conveniently to the next theorem.
Theorem 2.8. The distribution of a point process is completely determined by its finite dimensional distributions.
Proof. See Theorem 9.4 on page 77 of [11].
2.3.2 Moment measures of point processes
For a point process ξ on (S, B_S) we can define the set function M(·) by

$$M(B) = E\{\xi(B)\}, \quad B \in \mathcal{B}_S, \qquad (2.34)$$

which is a finitely additive set function. This is commonly known as the mean measure.
Lemma 2.2. The set function M (·) is a measure.
Proof. As ξ only takes values in Z_+, M(B) ≥ 0, ∀B ∈ B_S. Consider a non-increasing sequence of sets {B_n} ⊂ B_S such that B_n ↓ ∅ as n → ∞. The convergence is monotonic as the sets are non-increasing, hence

$$0 \le M(\emptyset) \le \limsup_{n \to \infty} M(B_n) = \limsup_{n \to \infty} E\{\xi(B_n)\} = E\{\limsup_{n \to \infty} \xi(B_n)\} = E\{\xi(\emptyset)\} = 0, \qquad (2.35)$$

which ensures

$$\lim_{n \to \infty} M(B_n) = M(\emptyset) = 0, \qquad (2.36)$$

leaving only countable additivity to be shown. Consider a countable collection of bounded, pairwise disjoint sets B_1, B_2, ... ∈ B_S. Using B_i ∩ B_j = ∅ for all i ≠ j and ξ(∅) = 0 a.s.,

$$M\left(\bigcup_{i=1}^{\infty} B_i\right) = E\left\{\xi\left(\bigcup_{i=1}^{\infty} B_i\right)\right\} = E\left\{\sum_{i=1}^{\infty} \xi(B_i)\right\} = \sum_{i=1}^{\infty} E\{\xi(B_i)\} = \sum_{i=1}^{\infty} M(B_i), \qquad (2.37)$$

concluding the proof.
From this we can define the second moment measure M_2(·) by

$$M_2(B_1 \times B_2) = E\{\xi(B_1)\xi(B_2)\}, \quad B_1, B_2 \in \mathcal{B}_S, \qquad (2.38)$$

which is a symmetric measure defined on (S², B_{S²}). It is symmetric because M_2(B_1 × B_2) = M_2(B_2 × B_1). Now we wish to define the kth factorial moment measure, M_[k](·). This will be useful later on when we are considering the
asymptotic behaviour of point processes. The factorial moment measure is the
point process analogue of the factorial moment of a random variable. The second
factorial moment measure M[2] (·) is defined as
M[2] (B1 × B2 ) = M2 (B1 × B2 ) − M (B1 ∩ B2 ).
(2.39)
Similarly the 3rd factorial moment measure is defined in the following way,
M[3] (B1 × B2 × B3 ) = M3 (B1 × B2 × B3 ) − M2 (B1 × (B2 ∩ B3 ))
− M2 (B2 × (B1 ∩ B3 )) − M2 (B3 × (B1 ∩ B2 )) + 2M (B1 ∩ B2 ∩ B3 ).
(2.40)
This can be extended to the kth factorial moment by adding and subtracting the
correct number of intersections.
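On a diagonal set B × ⋯ × B the kth factorial moment measure reduces to the kth factorial moment of the count ξ(B); for a Poisson count this equals µ(B)^k. A quick simulation sketch (assuming NumPy; the parameter values are illustrative):

```python
import numpy as np

# For N ~ Poisson(mu_B), the 3rd factorial moment E[N(N-1)(N-2)] equals mu_B**3,
# which is the value M_[3](B x B x B) takes for a Poisson process.
rng = np.random.default_rng(2)
mu_B = 2.0
N = rng.poisson(mu_B, size=500_000).astype(float)
print("empirical:", (N * (N - 1) * (N - 2)).mean(), " exact:", mu_B**3)
```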
2.3.3 Probability generating functional
There are multiple types of generating functionals, but the one of interest for this paper is the probability generating functional (pgfl). Let ξ be a point process on (S, B_S), and let h be a measurable function h : S → [0, 1] such that h is equal to 1 outside some bounded set. Then the pgfl G[h] is defined in the following way:

$$G[h] = E\left\{\exp\left(\int_S \log(h(x))\, \xi(dx)\right)\right\} \qquad (2.41)$$
$$= E\left\{\prod_{x \in S} h(x)^{\xi(dx)}\right\}, \qquad (2.42)$$

where log(0) := −∞ such that ∞ · 0 = 0. If we look closely at (2.42) we can see that for point processes dx can be reduced to the atoms at which there are realisations. So for point processes the pgfl can be written as

$$G[h] = E\left\{\prod_{x : \xi(\{x\}) > 0} h(x)^{\xi(\{x\})}\right\}. \qquad (2.43)$$
2.3.4 Operations on point processes
This section outlines the way in which point processes can be built from other
point processes. The two methods that will be covered here are superposition and
mixing.
Superposition
The superposition of point processes refers to summing n independent point processes. Let ξ_1, ..., ξ_n be independent point processes on a common probability space. Then

$$\xi = \sum_{i=1}^{n} \xi_i \qquad (2.44)$$

is another point process. If we consider the pgfl of such a process we can see

$$G[h] = E\left\{\exp\left(\int_S \log(h(x))\, \xi(dx)\right)\right\} = E\left\{\exp\left(\int_S \log(h(x)) \sum_{i=1}^{n} \xi_i(dx)\right)\right\} = E\left\{\prod_{i=1}^{n} \exp\left(\int_S \log(h(x))\, \xi_i(dx)\right)\right\} = \prod_{i=1}^{n} G_i[h], \qquad (2.45)$$

where G_i[h] is the pgfl of the corresponding point process ξ_i, and the last equality uses the independence of the ξ_i. Conversely, if a point process ξ has a pgfl G[h] which can be broken up into a product of pgfls G_i[h], then ξ is a superposition of independent point processes with pgfls G_i[h].
Mixing
Another way of constructing point processes is by altering a point process
with respect to a parameter or random variable. This will be covered more
in the next subsection by giving examples of a mixed Poisson process.
2.3.5 Poisson process on general phase space
A Poisson process on general phase space (S, BS ), with a locally finite mean measure µ(·) on (S, BS ), defines the family of discrete distributions:
1. For bounded sets B ∈ B_S and n ∈ Z_+,

$$q_1(B; n) = \frac{(\mu(B))^n e^{-\mu(B)}}{n!}. \qquad (2.46)$$

2. For bounded, pairwise disjoint sets B_1, ..., B_k ∈ B_S, k ≥ 2,

$$q_k(B_1, \ldots, B_k; n_1, \ldots, n_k) = \prod_{i=1}^{k} q_1(B_i; n_i). \qquad (2.47)$$
To show the Poisson process defines a point process, we need to show that the distributions above satisfy the Daniell-Kolmogorov existence theorem for point processes. Part one can be most easily shown by considering the case k = 2 and splitting into disjoint sets. For B_1, B_2 ∈ B_S and n_1, n_2 ∈ Z_+,

$$q_2(B_1, B_2; n_1, n_2) = \sum_{\substack{m_1 + m_3 = n_1 \\ m_2 + m_3 = n_2}} q_1(B_1 \cap B_2^c; m_1)\, q_1(B_2 \cap B_1^c; m_2)\, q_1(B_1 \cap B_2; m_3) = \sum_{\substack{m_1 + m_3 = n_2 \\ m_2 + m_3 = n_1}} q_1(B_2 \cap B_1^c; m_1)\, q_1(B_1 \cap B_2^c; m_2)\, q_1(B_2 \cap B_1; m_3) = q_2(B_2, B_1; n_2, n_1). \qquad (2.48)$$

This idea can easily be extended to k arguments, completing part one.
Part two is shown in a similar fashion to part one. Consider the case k = 2:

$$\sum_{j=0}^{\infty} q_2(B_1, B_2; n_1, j) = \sum_{i=0}^{n_1} \sum_{j=i}^{\infty} q_1(B_1 \cap B_2^c; n_1 - i)\, q_1(B_2 \cap B_1^c; j - i)\, q_1(B_1 \cap B_2; i)$$
$$= \sum_{i=0}^{n_1} q_1(B_1 \cap B_2^c; n_1 - i)\, q_1(B_1 \cap B_2; i) \sum_{j=i}^{\infty} q_1(B_2 \cap B_1^c; j - i)$$
$$= \sum_{i=0}^{n_1} q_2(B_1 \cap B_2^c, B_1 \cap B_2; n_1 - i, i) \sum_{j=0}^{\infty} q_1(B_2 \cap B_1^c; j)$$
$$= q_1(B_1; n_1). \qquad (2.49)$$

This procedure can again be extended to k arguments, completing part two.
Part three is shown by considering bounded, disjoint sets B_1, B_2 ∈ B_S:

$$q_3(B_1, B_2, B_1 \cup B_2; n_1, n_2, n_3) = q_3(B_1, B_2, B_1 \cap B_2; n_1, n_2, n_1 + n_2 - n_3). \qquad (2.50)$$

As all the sets are now disjoint we can use (2.47),

$$q_3(B_1, B_2, B_1 \cap B_2; n_1, n_2, n_1 + n_2 - n_3) = q_1(B_1; n_1)\, q_1(B_2; n_2)\, q_1(B_1 \cap B_2; n_1 + n_2 - n_3)$$
$$= \frac{(\mu(B_1))^{n_1} e^{-\mu(B_1)}}{n_1!} \cdot \frac{(\mu(B_2))^{n_2} e^{-\mu(B_2)}}{n_2!} \cdot \frac{(\mu(B_1 \cap B_2))^{n_1 + n_2 - n_3} e^{-\mu(B_1 \cap B_2)}}{(n_1 + n_2 - n_3)!}. \qquad (2.51)$$

Because µ(·) is a measure and B_1, B_2 are disjoint, µ(B_1 ∩ B_2) = µ(∅) = 0. Hence equation (2.51) is 0 unless n_3 = n_1 + n_2, leaving only part four to show.
Part four can be shown by considering a sequence of bounded sets {B_n} ⊂ B_S such that B_n ↓ ∅ as n → ∞. Then using the continuity of a measure we can see

$$\lim_{n \to \infty} q_1(B_n; 0) = \lim_{n \to \infty} \frac{(\mu(B_n))^0 e^{-\mu(B_n)}}{0!} = 1, \qquad (2.52)$$

proving the existence of a Poisson point process.
If a point process ξ on (S, B_S) has distributions (2.46) and (2.47) with mean measure µ(·), then it is called a Poisson process with mean measure µ(·). For A ∈ B_S the mean measure satisfies E{ξ(A)} = µ(A). Here some special cases of the Poisson process will be explored, mainly the stationary Poisson process on S = R^d, which has mean measure µ(·) = λ|·|, where |·| represents Lebesgue measure. So the mean measure µ(·) is just a constant λ multiplied by the Lebesgue measure of the set.
The other type that is worth discussing is the mixed Poisson process. It is a
type of mixing where the mean measure µ(·) is a random variable. More formally,
if ξ is a Poisson process on (S, BS ) with mean measure µ(·), and Λ is a random
measure on the same space, then a mixed Poisson process can be created by letting
µ(·) = Λ.
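A minimal simulation sketch of this mixing construction on S = [0, 1] (assuming NumPy; the Gamma mixing law is an arbitrary illustrative choice): conditional on Λ the process is an ordinary homogeneous Poisson process, and given the total count the points are placed independently, in line with the multinomial property derived below.

```python
import numpy as np

# Mixed Poisson process on [0, 1]: the mean measure is Lambda * Lebesgue
# with Lambda itself random (here Gamma(3, 1), an illustrative choice).
rng = np.random.default_rng(3)

def sample_mixed_poisson():
    lam = rng.gamma(shape=3.0, scale=1.0)     # random intensity Lambda
    n = rng.poisson(lam)                      # N(0, 1] | Lambda ~ Poisson(Lambda)
    return np.sort(rng.uniform(0.0, 1.0, n))  # given the count, points are iid uniform

print(sample_mixed_poisson())
```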
The next property that will be mentioned is the atomic nature of the Poisson
process. If we consider a Poisson process ξ on (S, BS ) with mean measure µ(·)
then we define an atom of µ(·) as a point x ∈ S, such that µ({x}) > 0. Similarly
it is an atom of ξ if P {ξ({x}) > 0} > 0. The definition of the atoms ensures that
no process can contain more than countably many atoms.
Theorem 2.9. For a Poisson process ξ on (S, BS ) with mean measure µ(·), x is
a fixed atom of ξ if and only if it is an atom of µ(·).
Proof. Firstly, we will assume that x is an atom of µ(·), therefore µ({x}) > 0. Then for any k ∈ N,

$$P\{\xi(\{x\}) = k\} = \frac{\mu(\{x\})^k e^{-\mu(\{x\})}}{k!} > 0. \qquad (2.53)$$

Thus x must be an atom of ξ. Now we just need to consider the converse: let x be an atom of ξ, then there exists a k ∈ N such that

$$0 < P\{\xi(\{x\}) = k\}, \qquad (2.54)$$

which implies µ({x}) > 0. Hence it is an atom of µ(·), concluding the proof.
Theorem 2.10. Let ξ be a Poisson process on (S, BS ) with mean measure µ(·). ξ
is simple if and only if µ(·) is non-atomic.
Proof. Assume µ(·) has an atom x ∈ S. At x a Poisson number of points with mean µ({x}) > 0 occurs, which ensures ξ is not simple. Thus if ξ is simple, µ(·) is non-atomic.
We will start the argument for the converse by considering some properties of a simple process. Consider open balls S_ε(x) with centre x and radius ε. Notice the requirement for a process to be simple is equivalent to

$$\lim_{\varepsilon \to 0} \frac{P\{\xi(S_\varepsilon(x)) > 1\}}{P\{\xi(S_\varepsilon(x)) > 0\}} = 0, \quad \forall x \in S, \qquad (2.55)$$

implying that a point process is simple if P{ξ(S_ε(x)) > 1} = o(P{ξ(S_ε(x)) > 0}). From the definition of a Poisson point process,

$$P\{\xi(S_\varepsilon(x)) > 0\} = 1 - e^{-\mu(S_\varepsilon(x))}, \qquad P\{\xi(S_\varepsilon(x)) > 1\} = 1 - e^{-\mu(S_\varepsilon(x))} - \mu(S_\varepsilon(x))\, e^{-\mu(S_\varepsilon(x))}. \qquad (2.56)$$
Assuming that µ(·) is non-atomic, consider the ratio

$$\frac{P\{\xi(S_\varepsilon(x)) > 1\}}{P\{\xi(S_\varepsilon(x)) > 0\}} = \frac{1 - e^{-\mu(S_\varepsilon(x))} - \mu(S_\varepsilon(x))\, e^{-\mu(S_\varepsilon(x))}}{1 - e^{-\mu(S_\varepsilon(x))}} = 1 - \frac{\mu(S_\varepsilon(x))}{e^{\mu(S_\varepsilon(x))} - 1}. \qquad (2.57)$$

As µ(·) is non-atomic, µ(S_ε(x)) → 0 as ε → 0; a Taylor expansion of e^{µ(S_ε(x))} about 0 shows that µ(S_ε(x))/(e^{µ(S_ε(x))} − 1) → 1, so the right hand side of (2.57) tends to 0 as ε → 0. This implies the Poisson process ξ is simple when µ(·) is non-atomic, completing the proof. The idea for the proof stemmed from Theorem 2.4.II on page 35 of [6].
The final addition to the properties of Poisson point processes will be the pgfl. Consider a Poisson point process ξ on (S, B_S) with mean measure µ(·); the pgfl is calculated using the definition in (2.42). Fix h such that h(x) = 1 for x ∉ B, where B is a bounded Borel set,

$$G[h] = E\left\{\prod_{x \in B} h(x)^{\xi(\{x\})}\right\} = E\left\{E\left\{\prod_{x \in B} h(x)^{\xi(\{x\})} \,\Big|\, \xi(B)\right\}\right\}. \qquad (2.58)$$
The conditional probability can be found by considering a partition {B_1, ..., B_k} of a bounded set B ∈ B_S,

$$P\{\xi(B_1) = n_1, \ldots, \xi(B_k) = n_k\} = \prod_{i=1}^{k} \frac{(\mu_B(B_i))^{n_i}}{n_i!}\, e^{-\mu_B(B_i)}, \qquad (2.59)$$

where µ_B(B_i) = µ(B ∩ B_i). Hence for n_1 + ... + n_k = n, n_1, ..., n_k ∈ Z_+,

$$P\{\xi(B_1) = n_1, \ldots, \xi(B_k) = n_k \mid \xi(B) = n\} = \frac{n!}{\prod_{i=1}^{k} n_i!} \prod_{i=1}^{k} \left(\frac{\mu_B(B_i)}{\mu(B)}\right)^{n_i}. \qquad (2.60)$$
We can see that (2.60) is a multinomial distribution. Thus given ξ(B) = n, the points are distributed independently among the partition sets, with probability µ_B(B_i)/µ(B) of being in each set. Denote by {x_1, ..., x_n} the multiset of points such that ξ({x_i}) > 0 for 1 ≤ i ≤ n. Being a multiset, the points do not need to be distinct, hence we can calculate (2.58):
$$G[h] = E\{E\{h(x_1) \cdots h(x_n) \mid \xi(B) = n\}\}$$
$$= \sum_{n=0}^{\infty} P\{\xi(B) = n\} \int_B \cdots \int_B h(x_1) \cdots h(x_n)\, \frac{\mu_B(dx_1)}{\mu(B)} \cdots \frac{\mu_B(dx_n)}{\mu(B)}$$
$$= \sum_{n=0}^{\infty} \frac{(\mu(B))^n}{n!} e^{-\mu(B)} \int_B \cdots \int_B h(x_1) \cdots h(x_n)\, \frac{\mu_B(dx_1)}{\mu(B)} \cdots \frac{\mu_B(dx_n)}{\mu(B)}$$
$$= \sum_{n=0}^{\infty} \frac{\left(\int_B h(x)\, \mu_B(dx)\right)^n}{n!}\, e^{-\mu(B)}$$
$$= \exp\left(\int_B h(x)\, \mu_B(dx) - \mu(B)\right)$$
$$= \exp\left(\int_S (h(x) - 1)\, \mu(dx)\right), \qquad (2.61)$$

as h(x) = 1, ∀x ∉ B.
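Formula (2.61) can be verified by simulation in a concrete case (a sketch assuming NumPy; the phase space S = [0, 1], the intensity λ and the test function h are all illustrative choices):

```python
import numpy as np

# Monte Carlo check of the Poisson pgfl G[h] = exp( integral_S (h - 1) d mu )
# on S = [0, 1] with mu = lam * Lebesgue and h(x) = 0.5 + 0.5 x (h = 1 off [0, 1]).
rng = np.random.default_rng(4)
lam = 4.0
h = lambda x: 0.5 + 0.5 * x

vals = []
for _ in range(200_000):
    pts = rng.uniform(0.0, 1.0, rng.poisson(lam))
    vals.append(np.prod(h(pts)))            # product of h over one realisation
# exact value: exp(lam * integral_0^1 (h(x) - 1) dx) = exp(-lam / 4)
print("empirical:", np.mean(vals), " exact:", np.exp(-lam / 4.0))
```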
2.3.6 Limit theorems for point processes
The main limit theorem for point processes that will be mentioned here is a result
from the superposition of point processes in space S. This will be done using the
probability generating functionals technique developed by [23] then refined by [28].
Convergence of point processes is in the vague topology, which is equivalent in this case to convergence of the finite dimensional distributions. For example, suppose the sequence of point processes {ξ_n, n = 1, 2, ...} on (S, B_S) with bounded Borel sets B_1, ..., B_k ∈ B_S has corresponding finite dimensional distributions q_n(B_1, ..., B_k; n_1, ..., n_k), and another point process ξ with the same properties has finite dimensional distributions q(B_1, ..., B_k; n_1, ..., n_k). If q_n(B_1, ..., B_k; n_1, ..., n_k) → q(B_1, ..., B_k; n_1, ..., n_k) as n → ∞, then $\xi_n \overset{v}{\Rightarrow} \xi$, where $\overset{v}{\Rightarrow}$ is described in Section 2.1.3.
This demonstrates the major difference between convergence of point processes and weak convergence of stochastic processes, as the latter requires tightness while the former relies only on the vague topology. With this idea now established we can prove a theorem equivalent to Lemma 4 of [28].
Theorem 2.11. Let {ξ_n} be a sequence of point processes on (S, B_S) with corresponding pgfls {G_n}, and let ξ be another point process with pgfl G. Then $\xi_n \overset{v}{\Rightarrow} \xi$ if and only if G_n[h] → G[h] for all continuous functions h : S → [0, 1] such that 1 − h has compact support.
Proof. Assume that G_n[h] → G[h]. As G_n[h] and G[h] specify the finite dimensional distributions uniquely, $\xi_n \overset{v}{\Rightarrow} \xi$ follows. Now to show the converse, consider sequences of simple functions {h′_m} and {h_m} that approximate h such that h′_m ≤ h ≤ h_m, where a simple function is a function that can be written in terms of measurable sets A_1, ..., A_m and real numbers a_1, ..., a_m, i.e.

$$h_m(x) = \sum_{i=1}^{m} a_i \mathbf{1}_{A_i}(x). \qquad (2.62)$$
Using the fact that G[h] is monotone in h,

$$G_n[h'_m] \le G_n[h] \le G_n[h_m]. \qquad (2.63)$$
Considering the limit as n → ∞,

$$G[h'_m] \le \liminf_{n \to \infty} G_n[h] \le \limsup_{n \to \infty} G_n[h] \le G[h_m], \qquad (2.64)$$
by the assumed convergence of {ξ_n} and the measurability of the simple functions. Then considering m → ∞, Theorem 2(ii) of [28] implies G[h_m] → G[h] and G[h′_m] → G[h], as 1 − h, 1 − h′_m and 1 − h_m have the same support. Thus we are left with the result

$$\lim_{n \to \infty} G_n[h] = G[h].$$
Consider the expansion of the pgfl, from [23],

$$G[1 - h] = 1 + \sum_{k=1}^{\infty} \frac{(-1)^k}{k!} \int_S \cdots \int_S h(x_1) \cdots h(x_k)\, M_{[k]}(dx_1 \times \cdots \times dx_k). \qquad (2.65)$$
Example 2.2. Consider a stationary point process ξ on (S, B_S) with mean measure µ and pgfl G[h]. Then consider a set of point processes {ξ_n, n = 1, 2, ...} derived from ξ by diluting the mean measure to µ/n, and denote its pgfl by G_n[h]. Let M_{n,[k]} denote the kth factorial moment measure of ξ_n. Then a superposition of n independent ξ_n point processes has pgfl (G_n[h])^n. Using the expansion defined in (2.65) and a result from Theorem 6 of [28],

$$G_n[1 - h] = 1 - \int_S h(x)\, M_{n,[1]}(dx) + o\left(\frac{1}{n}\right) = 1 - \frac{\mu}{n} \int_S h(x)\, dx + o\left(\frac{1}{n}\right), \qquad (2.66)$$

which implies

$$(G_n[1 - h])^n = \left(1 - \frac{\mu}{n} \int_S h(x)\, dx + o\left(\frac{1}{n}\right)\right)^n \to \exp\left(-\mu \int_S h(x)\, dx\right) \qquad (2.67)$$

as n → ∞. Thus $(G_n[h])^n \to \exp\left(\mu \int_S (h(x) - 1)\, dx\right)$, which is the pgfl of a stationary Poisson point process from (2.61). By Theorem 2.11 this implies that the superposition of n iid diluted copies of a stationary point process converges to a Poisson point process.
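Example 2.2 can be seen concretely in its simplest instance (a sketch assuming NumPy): let each diluted copy contribute at most one uniformly placed point, kept with probability µ/n, a hypothetical sparse process. Superposing n such copies gives a Binomial(n, µ/n) total count, which approaches Poisson(µ).

```python
import numpy as np

# Superposition of n independent sparse copies: each copy drops one uniform
# point on [0, 1] with probability mu / n. The total count is Binomial(n, mu/n),
# approaching Poisson(mu) as n grows, as predicted by Example 2.2.
rng = np.random.default_rng(5)
mu, reps = 3.0, 200_000
for n in [10, 100, 10_000]:
    counts = rng.binomial(n, mu / n, size=reps)
    print(f"n = {n:6d}: mean = {counts.mean():.3f}, var = {counts.var():.3f}"
          f" (Poisson: both = {mu})")
```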
Chapter 3
The space D
This chapter will cover the construction of space D and why it is useful to consider
random processes that converge in the space D. Many of the definitions and
structure of D can be found in [4]. Later on we will be defining the Skorokhod
metric which induces the commonly used Skorokhod topology.
3.1 Definition
Let D = D[0, 1] be the space of real functions on [0, 1] that are right continuous and have left hand limits. This is commonly known as the space of càdlàg functions. For a function x to be càdlàg the following two limits must exist:

$$x(t+) = \lim_{s \downarrow t} x(s) = x(t), \quad 0 \le t < 1, \qquad x(t-) = \lim_{s \uparrow t} x(s), \quad 0 < t \le 1.$$

The space D is a good way of capturing all the functions that have at most one jump at any one point in time.
3.2 Skorokhod metric and topology
The beauty of the Skorokhod topology is that it allows the convergence of processes
that are equal everywhere bar their discontinuities. It is an elegant way of allowing an infinitesimally small deformation of the time scale, so that functions that may differ only in the timing of their jumps can be deemed close. This is done by defining the metric in the following way.
Let Λ be the class of strictly increasing continuous mappings λ from [0, 1] onto itself; any such λ satisfies λ(0) = 0 and λ(1) = 1. Define I to be the identity mapping. For x, y ∈ D we define the Skorokhod metric d(x, y) as

$$d(x, y) = \inf_{\lambda \in \Lambda} \{\|\lambda - I\| \vee \|x - y \circ \lambda\|\}, \qquad (3.1)$$

where ‖x‖ = sup_t |x(t)| < ∞. This metric induces the Skorokhod topology.
Functions x_n ∈ D converge to x in the Skorokhod topology if there is a sequence of functions λ_n ∈ Λ such that lim_n x_n(λ_n(t)) = x(t) and lim_n λ_n(t) = t, both uniformly in t.
Uniform convergence implies convergence in Skorokhod topology but the converse
is not necessarily true.
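The effect of the time deformation can be made concrete with two step functions whose jumps are slightly misaligned (a numerical sketch assuming NumPy; the piecewise linear λ below is just one admissible time change, so it only certifies an upper bound on d(x, y)):

```python
import numpy as np

# x jumps at 0.5 and y at 0.5 + eps: the uniform distance is 1, but a
# piecewise-linear time change lambda with lambda(0.5) = 0.5 + eps aligns
# the jumps, giving d(x, y) <= eps.
eps = 0.01
t = np.linspace(0.0, 1.0, 10_001)
x = (t >= 0.5).astype(float)
y = (t >= 0.5 + eps).astype(float)
print("||x - y|| =", np.max(np.abs(x - y)))                    # = 1 for any eps > 0

lam = np.where(t <= 0.5,
               t * (0.5 + eps) / 0.5,                          # maps [0, 0.5] onto [0, 0.5 + eps]
               0.5 + eps + (t - 0.5) * (0.5 - eps) / 0.5)      # maps [0.5, 1] onto [0.5 + eps, 1]
y_of_lam = (lam >= 0.5 + eps).astype(float)                    # y composed with lambda
print("||lam - I|| =", np.max(np.abs(lam - t)))                # = eps, attained at t = 0.5
print("||x - y(lam)|| =", np.max(np.abs(x - y_of_lam)))        # = 0: the jumps are aligned
```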
There exists a metric d°, equivalent to d on the space D in the sense that d° also induces the Skorokhod topology, but with the additional property that the space (D, d°) is complete. Completeness is an important part of characterising the compactness of sets. The difference between the metrics is that we now measure how far the mapping λ is from the identity through its slopes: for an increasing function λ on [0, 1] with λ(0) = 0 and λ(1) = 1, define

$$\|\lambda\|^\circ = \sup_{s < t} \left| \log \frac{\lambda(t) - \lambda(s)}{t - s} \right|. \qquad (3.2)$$

Define Λ = {λ : ‖λ‖° < ∞}. Thus we can define the new metric

$$d^\circ(x, y) = \inf_{\lambda \in \Lambda} \{\|\lambda\|^\circ \vee \|x - y \circ \lambda\|\}, \qquad (3.3)$$

which generates the Skorokhod topology on D.
Theorem 3.1. The space D is separable under d and d°, and complete under d°.
Proof. See Theorem 12.2, page 128 of [4].
From Section 2.3.1 we require completeness and separability of the space to
consider convergence and compactness with greater ease. The complete and separable nature of D will play an integral part in the rest of the chapter.
3.3 Compactness in D
We will use an application of the Arzelà-Ascoli theorem for the requirements of a
set of functions x to be compact in Skorokhod topology. Initially we will consider
the space C = C[0, 1] which is the space of all continuous functions on the unit
interval. We will define the modulus of continuity of this space and then switch to
space D and make the appropriate changes.
We define the modulus of continuity for an arbitrary function x on [0, 1] to be

$$w(x, \delta) = \sup_{|s - t| \le \delta} |x(s) - x(t)|, \quad 0 < \delta \le 1. \qquad (3.4)$$
For the function x to be (uniformly) continuous on [0, 1] it is necessary and sufficient that

$$\lim_{\delta \to 0} w(x, \delta) = 0. \qquad (3.5)$$
If the function x satisfies (3.5) then x ∈ C. We can now state the Arzelà-Ascoli theorem, which completely characterises relative compactness in the space C.
Theorem 3.2 (Arzelà-Ascoli theorem). The set A ⊂ C has compact closure if and only if

$$\sup_{x \in A} |x(0)| < \infty \qquad (3.6)$$

and

$$\lim_{\delta \to 0} \sup_{x \in A} w(x, \delta) = 0 \qquad (3.7)$$

are both satisfied.
Proof. The proof can be found on page 81 of [4].
We now turn our attention to the space D with the Skorokhod topology, where we will have to define a different modulus of compactness. As functions are allowed infinitesimally small deformations of the time scale, we define the modulus of compactness as

$$w'(x, \delta) = \sup_{0 \vee (t - \delta) \le t_1 < t < t_2 \le (t + \delta) \wedge 1} \{|x(t) - x(t_1)| \wedge |x(t_2) - x(t)|\}. \qquad (3.8)$$
If x ∈ D, then w′(x, δ) → 0 as δ ↓ 0. On the other hand, for a function x with w′(x, δ) → 0 as δ ↓ 0, x is at each point either left continuous or right continuous, with right and left limits respectively. With this new modulus of compactness we can apply the Arzelà-Ascoli theorem to find when a set of functions has compact closure in the space D with the Skorokhod topology.
Theorem 3.3. A necessary and sufficient condition for a set A ⊂ D to have compact closure in the Skorokhod topology is that the following two conditions are met:

$$\sup_{x \in A} \|x\| < \infty \qquad (3.9)$$

and

$$\lim_{\delta \to 0} \sup_{x \in A} w'(x, \delta) = 0. \qquad (3.10)$$
Proof. The proof can be found on page 130 of [4].
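The modulus w′(x, δ) of (3.8) can be evaluated by brute force on a grid, which makes its behaviour tangible (a sketch assuming NumPy; the two-jump test function is an illustrative choice). A single jump never registers in w′, but two jumps inside the same δ-window do:

```python
import numpy as np

# Brute-force evaluation of w'(x, delta) from (3.8) on a grid. For a cadlag
# step function with two jumps a gap apart, w' is 0 once delta is small enough
# that no window (t - delta, t + delta) straddles both jumps.
t = np.linspace(0.0, 1.0, 401)
x = ((t >= 0.3) & (t < 0.34)).astype(float)      # jumps at 0.3 and 0.34

def w_prime(x, t, delta):
    best = 0.0
    for j in range(1, len(t) - 1):
        lo = np.searchsorted(t, t[j] - delta)                # first index with t1 >= t - delta
        hi = np.searchsorted(t, t[j] + delta, side="right")  # one past last t2 <= t + delta
        for i in range(lo, j):
            for k in range(j + 1, hi):
                best = max(best, min(abs(x[j] - x[i]), abs(x[k] - x[j])))
    return best

for delta in [0.05, 0.03, 0.015, 0.01]:
    print(f"delta = {delta}: w'(x, delta) = {w_prime(x, t, delta)}")
```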
3.4 Weak convergence in D
Consider a sequence of probability measures {P, Pn , n ≥ 1} on (D, BD ) where BD
is the Borel σ-algebra generated by do . {Pn } converges weakly to P if and only if
{Pn } is tight and it has finite dimensional distributions (see (3.16)) at a dense set
of times that converge to those of P .
Tightness is an important part in considering convergence of probability measures. A set of probability measures may settle down to a limit, however the limit may not be a probability measure (see Example 2.1). Formally, a set of probability measures {P_n} on (D, B_D) is tight if for every ε > 0 there exists a compact set
Kε such that inf n Pn {Kε } > 1 − ε.
Theorem 3.4. Let S be a separable and complete metric space with metric ρ. Then a single probability measure P on (S, B_S) is tight.
Proof. This proof follows from Theorem 1.3 on page 8 of [4]. Let S_{1/k}(x) denote an open ball centred at x with radius 1/k. Due to the separability of S we can find a sequence of open balls S_{1/k}(x_1), S_{1/k}(x_2), ... that cover the space S. Thus, given δ > 0, we can choose a sufficiently large n_k to satisfy $P\{\bigcup_{i \le n_k} S_{1/k}(x_i)\} > 1 - \delta/2^k$. Let ε > 1/k_0 and consider the set

$$A := \bigcap_{k \ge 1} \bigcup_{i \le n_k} S_{1/k}(x_i) \subset \bigcap_{k \ge k_0} \bigcup_{i \le n_k} S_{1/k}(x_i) \subset \bigcup_{i \le n_{k_0}} S_{1/k_0}(x_i). \qquad (3.11)$$

The points {x_1, ..., x_{n_{k_0}}} define an ε-net for A: the net is finite, and ∀x ∈ A, ∃x_i such that ρ(x, x_i) < 1/k_0 < ε, showing that A is totally bounded. From Section 2.1.2, the totally bounded set A has compact closure K because the space S is complete. Then, using the Bonferroni inequality,

$$P\{K\} \ge P\left\{\bigcap_{k \ge 1} \bigcup_{i \le n_k} S_{1/k}(x_i)\right\} \ge 1 - \sum_{k \ge 1} \left(1 - P\left\{\bigcup_{i \le n_k} S_{1/k}(x_i)\right\}\right) > 1 - \sum_{k \ge 1} \frac{\delta}{2^k} = 1 - \delta, \qquad (3.12)$$

which completes the proof.
Theorem 3.5 (Prohorov's Theorem). Let Π be a family of probability measures on (S, B_S). If Π is tight, then it is relatively compact.
Proof. See Theorem 5.1 on page 59 of [4].
Corollary 3.1. If the sequence of probability measures {P_n} is tight, and every convergent subsequence converges weakly to P, then the entire sequence converges weakly to P, i.e. $P_n \overset{w}{\Rightarrow} P$.
Proof. See Corollary of Theorem 5.1 on page 59 of [4].
As tightness is required to show that a sequence of probability measures converges weakly in the Skorokhod topology, we will note the conditions that need to be satisfied for a sequence of probability measures to be tight in the space D. Tightness of probability measures in the space D is characterised using the Arzelà-Ascoli characterisation of compactness.
Theorem 3.6. The sequence of probability measures {P_n} on (D, B_D) is tight if and only if both of the following conditions are satisfied:

$$\lim_{a \to \infty} \limsup_{n} P_n\{x : \|x\| \ge a\} = 0 \qquad (3.13)$$

and, for every ε > 0,

$$\lim_{\delta \to 0} \limsup_{n} P_n\{x : w'(x, \delta) \ge \varepsilon\} = 0. \qquad (3.14)$$
Proof. The proof follows from Theorem 13.2 on page 139 of [4]. The two conditions in Theorem 3.6 can be thought of as: ∀η > 0, there exist an a and δ such that P_n{x : ‖x‖ ≥ a} ≤ η and P_n{x : w′(x, δ) ≥ ε} ≤ η for all n.

First consider the sequence of measures {P_n} to be tight. As it is tight, given an η we can find a compact set K_ε such that inf_n P_n{K_ε} > 1 − η. By Theorem 3.3 the set K_ε ⊂ {x : ‖x‖ ≤ a} for sufficiently large a and K_ε ⊂ {x : w′(x, δ) ≤ ε} for sufficiently small δ, completing half of the argument.

As the space D is complete and separable, from Theorem 3.4 a single probability measure is tight. Thus each probability measure P_n on (D, B_D) is tight, which implies, for a given η, P_n{x : ‖x‖ ≥ a} ≤ η and P_n{x : w′(x, δ) ≥ ε} ≤ η by the first half of the proof.

Using this, for a given η choose an a such that, if B = {x : ‖x‖ ≤ a}, then P_n{B} ≥ 1 − η for all n. Similarly we can choose δ_k so that, if B_k = {x : w′(x, δ_k) ≤ 1/k}, then P_n{B_k} ≥ 1 − η/2^k for all n. Let K be the closure of A = B ∩ ⋂_k B_k, then apply Bonferroni's inequality:

$$P_n\{K\} \ge P_n\{A\} = P_n\left\{B \cap \bigcap_k B_k\right\} \ge 1 - \eta - \sum_{k=1}^{\infty} \frac{\eta}{2^k} > 1 - 2\eta. \qquad (3.15)$$

As the set A satisfies the conditions of Theorem 3.3, K must be compact, completing the proof.
Now we will consider the finite dimensional distributions to complete the section on weak convergence. The finite dimensional distributions are a projection
from D to Rk . Let 0 ≤ t1 < t2 < · · · < tk ≤ 1, then the projection is denoted by
πt1 ...tk where,
πt1 ...tk (x) = (x(t1 ), . . . , x(tk )).
(3.16)
From this definition the projection πt is continuous at x if and only if x is continuous at t. For a probability measure P on D denote,
TP = {t ∈ [0, 1] : πt is continuous except for points of P -measure 0}.
(3.17)
Then denote the set

$$J_t = \{x \in D : x(t) \neq x(t-)\}. \qquad (3.18)$$
By definition for t ∈ TP , P {Jt } = 0.
Lemma 3.1. The set TP contains 0,1 and the set TPc ∩ [0, 1] is at most countable.
Proof. From the definition of Λ in (3.1), 0 and 1 are fixed points so we know that
π_0 and π_1 are continuous, hence 0 and 1 are always in T_P. To show the second part of the statement we need to show P{J_t} > 0 for at most countably many t. By Lemma 1 on page 122 of [4] a single x ∈ D can only have finitely many points where |x(t) − x(t−)| > ε. Hence let

$$J_t(\varepsilon) = \{x \in D : |x(t) - x(t-)| > \varepsilon\}; \qquad (3.19)$$

then for fixed ε > 0 and 0 < δ ≤ 1 there can only be finitely many t such that P{J_t(ε)} > δ. Applying lim_{ε↓0} P{J_t(ε)} = P{J_t} completes the proof.
For t_1, ..., t_k ∈ T_P, π_{t_1...t_k} is continuous with P-measure 1. Then, using the continuous mapping theorem defined in Theorem 2.7 on page 21 of [4],

$$P_n \overset{w}{\Rightarrow} P \qquad (3.20)$$

implies

$$P_n \pi_{t_1 \ldots t_k}^{-1} \to P \pi_{t_1 \ldots t_k}^{-1} \quad \text{for } t_1, \ldots, t_k \in T_P. \qquad (3.21)$$

However, if the t_i's do not lie in T_P then we cannot assume that (3.20) implies (3.21).
A subclass A of B_S is a separating class if, whenever two probability measures are equal on A, they are equal on the whole of B_S. In conjunction with this definition let T ⊂ [0, 1], let H be a Borel set of R^k, and let t_1, ..., t_k ∈ T. Then let p{π_t : t ∈ T} be the class of sets $\pi_{t_1,\ldots,t_k}^{-1}(H)$ over all k.
Theorem 3.7. If T contains 1 and is dense in [0, 1], then σ{πt : t ∈ T } = BD
and p{πt : t ∈ T } is a separating class.
Proof. See Theorem 12.5 on page 134 of [4].
For the following theorem and the rest of this work we will use the notation →_i for convergence in the limit in i. Similarly, $\overset{J}{\Rightarrow}_i$ denotes convergence in the Skorokhod topology in the limit in i. This notation allows us to specify convergence with respect to i without strictly specifying the value. When we do have a value, the limiting value will be specified clearly; this notation will just be used for general theorems.
Theorem 3.8. Let {P_n} be a sequence of probability measures on (D, B_D). If {P_n} is tight, and $P_n \pi_{t_1 \ldots t_k}^{-1} \to P \pi_{t_1 \ldots t_k}^{-1}$ holds for t_1, ..., t_k ∈ T_P, then $P_n \overset{J}{\Rightarrow} P$.

Proof. This proof follows from Theorem 13.1 on page 139 of [4]. The proof takes advantage of Corollary 3.1 by assuming $P_{n_i} \overset{w}{\Rightarrow}_i Q$ and then showing P = Q. If t_1, ..., t_k ∈ T_P then by the statement of the theorem $P_{n_i} \pi_{t_1,\ldots,t_k}^{-1} \to_i P \pi_{t_1,\ldots,t_k}^{-1}$. If t_1, ..., t_k ∈ T_Q then by assumption $P_{n_i} \pi_{t_1,\ldots,t_k}^{-1} \to_i Q \pi_{t_1,\ldots,t_k}^{-1}$.

By definition T_P and T_Q contain 0 and 1, and by Lemma 3.1 their complements in [0, 1] are at most countable. Thus T_P and T_Q are dense in [0, 1]. If we now consider T_P ∩ T_Q, it too contains 0, 1 and is dense in [0, 1], as both T_P and T_Q are dense in [0, 1].

The final step of the proof is that if t_1, ..., t_k ∈ T_P ∩ T_Q then $P \pi_{t_1,\ldots,t_k}^{-1} = Q \pi_{t_1,\ldots,t_k}^{-1}$. However, using Theorem 3.7, p{π_t : t ∈ T_P ∩ T_Q} is a separating class, implying that P = Q and concluding the proof.
Chapter 4
Classical extreme value theory
Classical extreme value theory has been developed over the past half century and it
delves into the distribution of the maxima of a set of random variables. Consider a set of n suitably scaled random variables; in the limit as n → ∞ the maximum may converge to a non-degenerate distribution. Any set of random variables satisfying this is said to be in the domain of attraction of the corresponding extreme value distribution.
4.1 Extreme value distributions
We define the set of random variables {Yi , i = 1, 2, . . . } to be independent and identically distributed (iid) with distribution function FY . Let Qn = max {Y1 , . . . , Yn },
and {an > 0}, {bn } be sequences of real numbers. Consider
P {Qn ≤ x} = P {Y1 ≤ x, . . . , Yn ≤ x}
= FY (x)n .
(4.1)
Because 0 ≤ FY ≤ 1, from (4.1) as n becomes large the distribution of P {Qn ≤ x}
becomes degenerate. However there is a collection of distributions FY and {an >
0}, {bn } such that,
P {Qn ≤ an x + bn } = FY (an x + bn )n → G(x) as n → ∞,
(4.2)
where G(x) is a non-degenerate distribution. The collection of distributions FY
that satisfy (4.2) are said to be in the domain of attraction of G. It has been shown
in [8] and [14] that G must follow one of three distributions, subject to rescaling.
The distributions G can be found on page 4 of [14] as
$$\text{Type I (Gumbel):} \quad G(x) = e^{-e^{-x}}, \quad \forall x,$$
$$\text{Type II (Fréchet):} \quad G(x) = \begin{cases} 0, & x \le 0, \\ e^{-x^{-\alpha}}, & x > 0, \end{cases}$$
$$\text{Type III (Weibull):} \quad G(x) = \begin{cases} e^{-(-x)^{\alpha}}, & x \le 0, \\ 1, & x > 0, \end{cases} \qquad (4.3)$$
where α > 0. An example of a distribution that converges to a G is the exponential
distribution.
Example 4.1. Let {Y_i, i = 1, 2, ...} be a set of iid exponential random variables with parameter λ and let Q_n = max{Y_1, ..., Y_n}. Consider

$$P\{Q_n \le a_n x + b_n\} = F_Y(a_n x + b_n)^n = \left(1 - e^{-\lambda(a_n x + b_n)}\right)^n \approx e^{-n e^{-\lambda(a_n x + b_n)}} \quad \text{as } n \text{ becomes large.}$$

If we set $b_n = \frac{1}{\lambda} \log n$ and $a_n = \frac{1}{\lambda}$, then

$$P\{Q_n \le a_n x + b_n\} \to e^{-e^{-x}} \quad \text{as } n \to \infty,$$

which is Type I of the extreme value distributions defined in (4.3).
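Example 4.1 is easy to confirm by simulation (a sketch assuming NumPy; the parameter values are illustrative, and the maxima are drawn exactly by inverse transform rather than by simulating all n variables):

```python
import numpy as np

# Maxima of n iid Exp(lam) variables, drawn exactly via inverse transform:
# P{Q_n <= q} = F(q)^n, so Q_n = F^{-1}(U^{1/n}) with U uniform. The rescaled
# value lam * Q_n - log(n), i.e. (Q_n - b_n) / a_n, should be roughly Gumbel.
rng = np.random.default_rng(6)
lam, n = 1.5, 1_000
U = rng.uniform(size=200_000)
Q = -np.log(1.0 - U ** (1.0 / n)) / lam
Z = lam * Q - np.log(n)
for x in [-1.0, 0.0, 1.0]:
    print(f"P(Z <= {x:+.0f}): empirical = {np.mean(Z <= x):.3f}, "
          f"Gumbel = {np.exp(-np.exp(-x)):.3f}")
```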
4.2 Stationary sequences of dependent random variables
If we remove the iid requirement of the random variables in the previous section, it
has been shown in [14] that a stationary sequence will still converge to an extreme
value distribution G(·) under certain conditions. The sequence of random variables
41
CHAPTER 4. CLASSICAL EXTREME VALUE THEORY
{Xj , j = 1, 2, . . . } are a stationary sequence of random variables if {Xj , . . . , Xj+k }
d
= {X1 , . . . , Xk+1 }, ∀k ≥ 0 and j ≥ 1. Define Mn = max{X1 , . . . , Xn }, then we will
be considering P {Mn ≤ an x + bn }, where {an > 0} and {bn } are real sequences
like before.
We require two conditions on the sequence {Xj , j = 1, 2, . . . } for P {Mn ≤ an x+
bn } to converge to a non-degenerate distribution as n → ∞. The first condition
is a mixing condition and the second is a asymptotic mixing condition. Let’s
denote the joint density function of Xi1 , . . . , Xik as Fi1 ...ik (x1 , . . . , xk ) = P {Xi1 ≤
x1 , . . . , Xik ≤ xk }. We can always find a set of integers that satisfy,
1 ≤ i1 < · · · < ip < j1 < · · · < jq ,
j1 − ip ≥ m.
(4.4)
In a similar fashion we can find sets of integers E1 , . . . , Er that are sub intervals
of {1, . . . , n}, such that min{Ej } − max{Ei } ≥ m, ∀i < j. i.e. all sets E1 , . . . , Er
are separated by m time units. Denote the event Aj = {max{Ej } ≤ an x + bn }
then the joint distribution can be written as P {Aj } = FEj (an x + bn , . . . , an x + bn ).
With this we can define the mixing condition as follows,
|P {A1 ∩ A2 } − P {A1 }P {A2 }| ≤ αn,m
(4.5)
where αn,m is non-increasing in m. Such that if mn is a real sequence with mn → ∞
and mn /n → 0, then αn,mn → 0 as n → ∞.
The mixing condition in (4.5) can be extended to encompass more events,
r
Y
P {Aj } ≤ (r − 1)αn,m ,
(4.6)
P {A1 ∩ · · · ∩ Ar } −
j=1
where A1 , . . . , Ar are the events defined above. This extension is validated using
induction. We have the base case of r = 2 seen in (4.5). We will assume it holds
42
An Exposition of Extremal Processes
for r, then show it is valid for r + 1,
r+1
Y
P {Aj }
P {A1 ∩ · · · ∩ Ar+1 } −
j=1
≤ |P {A1 ∩ · · · ∩ Ar+1 } − P {A1 ∩ · · · ∩ Ar }P {Ar+1 }|
r
Y
+ P {A1 ∩ · · · ∩ Ar } −
P {Aj } P {Ar+1 }
j=1
≤ αn,m + (r − 1)αn,m
= rαn,m .
(4.7)
Therefore the mixing condition holds for r events separated by m time units.
The other condition on the Xi0 s that needs to be satisfied for there to be a
non-degenerate distribution of Mn as n → ∞, is there can’t be two maxima in
close proximity of each other. For a given k, the condition can be formulated by
n
lim sup n
n→∞
[k]
X
P {X1 > an x + bn , Xj > an x + bn } → 0 as n → ∞,
(4.8)
j=2
where [ nk ] is the integer part of
n
.
k
This lays a bound on the probability of an
exceedance occurring more than once in the set {X1 , . . . , X[ nk ] }. If conditions
(4.5) and (4.8) are satisfied then Theorem 3.4.1 on page 59 of [14] has shown for
sequences {an > 0} and {bn },
P {Mn ≤ an x + bn } → G(x) as n → ∞,
(4.9)
if and only if,
n(1 − P {X1 ≤ an x + bn }) → τ
for 0 ≤ τ < ∞,
where G(x) is one of the three distributions defined in (4.3).
43
(4.10)
Chapter 5
Extremal process
An extremal process can be thought of as a record process. Following the world
record times of the 100m sprint, the process will take the value of the current
record. This chapter will follow Lamperti’s work in [12] and expand upon some of
the proofs given in that paper. However we will only be considering the maximum
of the set opposed to Lamperti who went on to consider the kth largest element
of the set.
5.1
Foundations
Let {Yi , i = 1, 2, . . . } be a sequence of iid random variables with distribution
function FY (x) = P {Yi ≤ x}. Consider the maximum of the set of random
variables Qn = max{Y1 , . . . , Yn }. If there exists real sequences {an > 0} and {bn }
that satisfy (4.2), then an extremal process can exist. These sequences define a
set of scaled random variables Yn,i = (Yi − bn )/an , for 1 ≤ i ≤ n.
This is where Lamperti made the big break through in considering the random
process of maximums opposed to the distribution. We will make a slight deviation
from [12] here by using the random process qn (t) defined by
0,
0 < t < n,1
qn (t) =
max
1
≤ t ≤ 1.
k≤tn Yn,k ,
n
44
(5.1)
An Exposition of Extremal Processes
The problem with definition (5.1) comes about when 0 < t < 1/n. This is
because the process can have a negative first jump implying the process is no
longer non-decreasing, which presents difficulties in the work. To avoid this we
will consider the slightly modified process
q̃n (t) = max Yn,k ,
0 < t ≤ 1.
(5.2)
k≤(1∨tn)
The difference between the modified process and qn (t) is clearly, q̃n (t) − qn (t) =
Yn,1 1{t < 1/n} therefore,
P
q̃n (t) − qn (t) → 0 as n → ∞.
Furthermore for t > 0,
1
P sup |q̃n (s) − qn (s)| > 0 ≤ P t <
→ 0 as n → ∞.
n
s≥t
(5.3)
(5.4)
The assertion (5.3) implies that q̃(t) and q(t) have the same asymptotic behaviour
as n → ∞, while (5.4) implies they have the same asymptotic behaviour in the
sense of Skorokhod topology as n → ∞. We will work on q̃(t) as it is more
convenient due to it’s non-decreasing nature. After determining the asymptotic
behaviour of the modified process we can relate it back to q(t) to achieve the
desired result.
5.2
Convergence of finite dimensional distributions
This section will cover the convergence of {qn (t), 0 < t ≤ 1} to the extremal
process {q(t), 0 < t ≤ 1} whenever (4.2) is satisfied. From Chapter 3 we can
see the convergence of random processes can be broken down into two problems:
convergence of the finite dimensional distributions and tightness of the process
with respect to a topology. Specifically this section will cover the convergence of
the finite dimensional distributions. We will start by proving Theorem 2.1 of [12].
45
CHAPTER 5. EXTREMAL PROCESS
Theorem 5.1. Assume that (4.2) holds for one of the three non-degenerate distributions G(·) defined in (4.3). Then the finite dimensional distributions of {qn (t),
0 < t ≤ 1} defined in (5.1) converge to those of the Markov process {q(t), 0 < t ≤
1} for t ∈ (0, ∞) defined by the distribution Ht (x),
Ht (x) = P {q(t) ≤ x} = G(x)t
(5.5)
and,
P {q(t + s) ≤ y|q(s) = x) =
0,
for y < x,
(5.6)
G(y)t , for y ≥ x.
Proof. Initially we will compute the finite dimensional distributions of the limiting
process {q(t), 0 < t ≤ 1} via (5.5) and (5.6). Let 0 < t1 < · · · < tk ≤ 1, then
consider P {q(t1 ) ≤ x1 , . . . , q(tk ) ≤ xk }. The case when k = 2 is illustrated below,
P {q(t1 ) ≤ x1 , q(t2 ) ≤ x2 }
Z x1
=
P {q(t2 − t1 + t1 ) ≤ x2 |q(t1 ) = u}dHt1 (u)
0
Z x1
G(x2 )t2 −t1 1{x2 ≥ u}dHt1 (u)
=
0
Z x1 ∧x2
t2 −t1
= G(x2 )
dHt1 (u)
0
t2 −t1
= G(x2 )
G(x1 ∧ x2 )t1 .
(5.7)
By (5.7), if x2 ≤ x1 then P {q(t1 ) ≤ x1 , q(t2 ) ≤ x2 } = G(x2 )t2 = P {q(t2 ) ≤ x2 }.
Therefore the event when x2 ≤ x1 is already covered. Hence when 0 < t1 < · · · < tk
and x1 < · · · < xk , we have
P {q(t1 ) ≤ x1 , q(t2 ) ≤ x2 } = G(x2 )t2 −t1 G(x1 )t1 .
(5.8)
We can extend it to k terms using the Markovian nature assumed in the theorem.
46
An Exposition of Extremal Processes
Using the property discussed before, assume x1 < · · · < xk ,
P {q(t1 ) ≤ x1 , . . . , q(tk ) ≤ xk }
Z x1 Z xk−1
P {q(tk ) ≤ xk |q(tk−1 ) = uk−1 , . . . , q(t1 ) = u1 }
···
=
0
0
dHtk−1 |tk−2 ,...,t1 (uk−1 |uk−2 , . . . , u1 ) · · · dHt1 (u1 )
Z
x1
Z
xk−1
P {q(tk ) ≤ xk |q(tk−1 ) = uk−1 }dHtk−1 |tk−2 (uk−1 |uk−2 ) · · · dHt1 (u1 )
Z x1 Z xk−1
tk −tk−1
dHtk−1 |tk−2 (uk−1 |uk−2 ) · · · dHt1 (u1 )
···
= G(xk )
···
=
0
0
0
0
tk −tk−1
= G(xk )
G(xk−1 )
tk−1 −tk−2
· · · G(x1 )t1 .
(5.9)
Now we need to calculate the finite dimensional distributions of {q̃n (t), 0 < t ≤
1} to see whether they converge to (5.9) as n → ∞. As P {Qn ≤ x} = FY (x)n , for
0 < t ≤ 1,
P {q̃n (t) ≤ x} = P {Qbntc∨1 ≤ an x + bn }
= (F (an x + bn )n )
bntc 1
∨n
n
.
(5.10)
As the interior of the brackets of (5.10) converges to G(x) as n → ∞, P {q̃n (t) ≤
x} → G(x)t , because P {t < 1/n} → 0 as n → ∞. The distribution G(x) is one
of the three described in (4.3) and it is continuous for all x, which implies G(x)t
is continuous for all x. With this result we can go on to show that the finite
dimensional distributions of {q̃n (t), 0 < t ≤ 1} converge to those of {q(t), 0 < t ≤
1}.
For t1 < · · · < tk consider the event {q̃n (t1 ) ≤ x1 , . . . , q̃n (tk ) ≤ xk }. The nondecreasing nature of q̃n (t) implies for an i < j, q̃n (ti ) ≤ q(tj ), thus if xi ≥ xj ,
the event {q̃n (ti ) ≤ xi } contains the event {q̃n (tj ) ≤ xj }, so it may be omitted.
Without loss of generality assume x1 < · · · < xk . Let’s first consider the base case
47
CHAPTER 5. EXTREMAL PROCESS
of k = 2, remembering that the Yi0 s are iid,
P {q̃n (t1 ) ≤ x1 , q̃n (t2 ) ≤ x2 }
= P {q̃n (t1 ) ≤ x1 }P {q̃n (t2 ) ≤ x2 |q̃n (t1 ) ≤ x1 }
= P {q̃n (t1 ) ≤ x1 }P {Yn,i ≤ x2 , 1 ≤ i ≤ bnt2 c|Yn,i ≤ x1 , 1 ≤ i ≤ bnt1 c}
= P {q̃n (t1 ) ≤ x1 }P {Yn,i ≤ x2 , bnt1 c + 1 ≤ i ≤ bnt2 c}
= P {q̃n (t1 ) ≤ x1 }P {Yn,i ≤ x2 , 1 ≤ i ≤ bnt2 c − bnt1 c}.
(5.11)
From (5.10) the above converges to G(x1 )t1 G(x2 )t2 −t1 as n → ∞, thus proving the
base case. Let’s now assume it holds for k and show that this holds for k + 1,
P {q̃n (t1 ) ≤ x1 , . . . , q̃n (tk+1 ) ≤ xk+1 }
= P {q̃n (t1 ) ≤ x1 , . . . , q̃n (tk ) ≤ xk }P {q̃n (tk+1 ) ≤ xk+1 |q̃n (t1 ) ≤ x1 , . . . , qn (tk ) ≤ xk }
= P {q̃n (t1 ) ≤ x1 , . . . , q̃n (tk ) ≤ xk }P {Yn,ik+1 ≤ xk+1 , 1 ≤ ik+1 ≤ bntk+1 c
|Yn,i1 ≤ x1 , 1 ≤ i1 ≤ bnt1 c, . . . , Yn,ik ≤ xk , 1 ≤ ik ≤ bntk c}
= P {q̃n (t1 ) ≤ x1 , . . . , q̃n (tk ) ≤ xk }P {Yn,ik+1 ≤ xk+1 , bntk c + 1 ≤ ik+1 ≤ bntk+1 c}
= P {q̃n (t1 ) ≤ x1 , . . . , q̃n (tk ) ≤ xk }P {Yn,ik+1 ≤ xk+1 , 1 ≤ ik+1 ≤ bntk+1 c − bntk c}.
(5.12)
Similarly to before, as n → ∞ one can see that (5.12) converges to G(x1 )t1 G(x2 )t2 −t1
· · · G(xk+1 )tk+1 −tk . This concludes the induction and shows that the finite dimensional distributions of {q̃n (t), 0 < t ≤ 1} converge to those of {q(t), 0 < t ≤ 1}. By
(5.3) the asymptotic behaviour is the same in probability hence ∀k ≥ 1, 0 < t1 <
· · · < tk ≤ 1,
FDD
{qn (t), 0 < t ≤ 1} =⇒ {q(t), 0 < t ≤ 1},
as n → ∞,
(5.13)
concluding the proof.
5.3
Tightness in Skorokhod topology
In this section we will aim to show the process {qn (t), r ≤ t ≤ s} is tight in
Skorokhod topology as n → ∞.
48
An Exposition of Extremal Processes
Theorem 5.2. Assume Theorem 5.1 holds with an extra condition of P {Qn ≤
x} → G(x) as n → ∞ such that G(·) takes the form of Type II in (4.3). Also
assume the process {q(t), 0 < t ≤ 1} is separable. Separability is discussed in
Section 2.3.1, and is required as the foundations are built on this property. Then
for 0 < r ≤ t ≤ s ≤ 1 such that q(r) and q(s) are finite,
J
{qn (t), r ≤ t ≤ s} ⇒ {q(t), r ≤ t ≤ s},
as n → ∞
(5.14)
J
where ⇒ denotes convergence in the Skorokhod topology.
Proof. For this to hold we need to show that {qn (t), r ≤ t ≤ s} is tight in Skorokhod topology. We will prove tightness via an application of Theorem 3.6. Again
we will work on the more convenient process, {q̃n (t), r ≤ t ≤ s} then relate it to
{qn (t), r ≤ t ≤ s} at the end.
By the assumptions of the theorem, (3.13) is satisfied as q(r) and q(s) are finite
and {q̃n (t), r ≤ t ≤ s} is a non-decreasing process. So it is left to show (3.14) to
satisfy tightness in the Skorokhod topology.
We will need to make a slight modification to w0 (x, δ) described in (3.10) as
our t ∈ [r, s]. We will denote the modulus of compactness to be,
w00 (x, δ, r, s) =
sup
{|x(t) − x(t1 )| ∧ |x(t2 ) − x(t)|} .
(5.15)
r∨(t−δ)≤t1 <t<t2 ≤(t+δ)∧s
The modulus of compactness w00 (x, δ, r, s) is on a subinterval of w0 (x, δ) as 0 < r ≤
t ≤ s ≤ 1. Therefore theorems relating tightness in Skorokhod topology will hold.
Thus we need to show ∀ε > 0,
lim lim sup P {w00 (q̃n , δ, r, s) > ε} = 0,
δ→0
(5.16)
n→∞
to complete the proof.
We know q̃n (t) only increases in jumps, so the event {w00 (q̃n , δ, r, s) > ε} is
contained in the following two events. Consider the events for a α < 2ε ,
An = {q̃n (r) < α} ,
(5.17)
Bn = {q̃n (r) ≥ α, and ∃t ∈ [r, s] : q̃n (·)has at least two jumps in the
interval [t, (t + 2δ) ∧ s]} .
(5.18)
49
CHAPTER 5. EXTREMAL PROCESS
The limit of P {An } can be made arbitrarily small by selecting a small enough α.
Fix an α and denote #A as the number of elements in set A, then the event Bn
has a upper bound,
P {Bn }
≤ P {∃t ∈ [r, s] : q̃n (·)has at least two jumps in the interval [t, (t + 2δ) ∧ s]}
≤ E[#{t ∈ [r, s] : q̃n (·) has at least two jumps in the interval [t, (t + 2δ) ∧ s]}]
X
≤
P {Yn,i > α, Yn,j > α}
0≤i<j≤bnsc
j−i≤b2δnc
=
X
P {Yn,i > α}P {Yn,j > α}
0≤i<j≤bnsc
j−i≤b2δnc
X
= (1 − FY (an α + bn ))2
1
0≤i<j≤bnsc
j−i≤b2δnc
≤ (1 − FY (an α + bn ))2 2δn2 s.
(5.19)
As the FY (·) lies in the domain of attraction of G(·), limn→∞ nP {Yn,1 > x} =
− log(G(x)). Hence
lim n (1 − FY (an α + bn )) = G(α),
n→∞
(5.20)
which implies for sufficiently large n, (1 − FY (an α + bn ))2 = O(n−2 ). Implement
this in (5.19) to find, lim supn→∞ P {Cn } ≤ O(δ), concluding {q̃n (t), r ≤ t ≤ s} is
tight in the Skorokhod topology. This and (5.12) imply
J
{q̃n (t), r ≤ t ≤ s} ⇒ {q(t), r ≤ t ≤ s} as n → ∞.
(5.21)
Then (5.4) indicates q̃n (t) and qn (t) share the same asymptotic behaviour in the
Skorokhod topology as n → ∞. Theorem 3.1, page 27 of [4] implies random
elements with the same asymptotic behaviour must converge to the same limit.
This theorem and (5.21) complete the proof.
Note the proof here is only given for G(·) of the form Type II seen in (4.3).
This convergence does hold for all G(·) however the theorem here is to generate
the idea of how to go about the proof. See [12] for the full details.
50
Chapter 6
Extremal process of random
sample size
This chapter will focus on extending the results of Lamperti [12] seen in Chapter
4. The extension is removing the need for a non-random sample size. This has
been covered by a number of people, e.g. Berman in [2], where he considered both
a random sample size that was independent from the random variables, and a
sample size that is dependent, but converges in probability so that the sample size
becomes asymptotically non-random.
Teugels and Silvestrov made an advance in [20], where they showed an extremal
process of random sample size dependent on the random variables converges in
Skorokhod topology. This chapter will closely follow [20].
6.1
Foundations
Let {Yi , i = 1, 2, . . . } be a sequence of iid random variables with distribution
function FY (x) = P {Yi ≤ x}. Consider the maximum of the set of random
variables Qn = max{Y1 , . . . , Yn }. If there exists real sequences {an > 0} and
{bn } such that (4.2) can be satisfied, then define a set of scaled random variables,
51
CHAPTER 6. EXTREMAL PROCESS OF RANDOM SAMPLE SIZE
{Yn,i , i = 1, 2, · · · } where Yn,i = (Yi − bn )/an . Consider the same process as (5.1),
0,
0 < t < n,1
(6.1)
qn (t) =
max
Y , 1 ≤ t ≤ 1.
k≤tn
n,k
n
Let υn be a positive random variable, then the interests of this chapter lie in the
following process,
hn (t) = max Yn,k ,
k≤tυn
0<t≤1
(6.2)
which clearly has random sample size. If we denote µn = υn /n and assume the
following conditions on the random sample size µn ,
µn =
υn w
⇒ µ as n → ∞,
n
where P {µ > 0} = 1.
(6.3)
The random variable µn allows us to define (6.2) as a combination of two processes,
{µn (t) = tµn , 0 < t ≤ 1} and {qn (t), 0 < t ≤ 1},
hn (t) = qn (tµn ),
0 < t ≤ 1.
(6.4)
Again we will consider the modified process {q̃n (t), 0 < t ≤ 1} due to the nice
nature described in (5.2). This will define the modified process with random
sample size,
h̃n (t) = q̃n (tµn ),
0 < t ≤ 1.
(6.5)
Similar to before, consider h̃n (t) − hn (t) = Yn,1 1(tµn < 1/n). Using condition
(6.3),
P
h̃n (t) − hn (t) → 0 as n → ∞,
0 < t ≤ 1.
Furthermore for t > 0,
1
→ 0 as n → ∞.
P sup h̃n (s) − hn (s) > 0 ≤ P tµn <
n
s≥t
(6.6)
(6.7)
Equations (6.6) and (6.7) indicate that h̃n (·) and hn (·) have the same asymptotic
behaviour not only in one dimensional distributions but in Skorokhod topology as
52
An Exposition of Extremal Processes
well. This allows us to work on a convenient non-decreasing function, h̃n (t) then
relate our results back.
J
Chapter 5 proves {qn (t), 0 < t ≤ 1} ⇒ {q(t), 0 < t ≤ 1} as n → ∞, which
FDD
implies {qn (t), 0 < t ≤ 1} =⇒ {q(t), 0 < t ≤ 1} as n → ∞. This combined with
FDD
condition (6.3) would be enough to prove that {hn (t), 0 < t ≤ 1} =⇒ {h(t), 0 <
t ≤ 1} as n → ∞ if υn and hn (t) are independent. However if they are dependent
it is not a sufficient condition. Thus weak convergence of the joint distribution
needs to be assumed,
w
{(qn (t), µn ), 0 < t ≤ 1} ⇒ {(q(t), µ), 0 < t ≤ 1} as n → ∞.
(6.8)
For w > 0, let w < τ1,w < τ2,w < · · · be the jump times of the process
{h(t), 0 < t ≤ 1} on the interval [w, 1]. Then let S be the set of all points for
0 < t ≤ 1 such that P {τk, 1 = tµ} = 0 ∀k, n = 1, 2, . . . . Then the set S c contains
n
at most a countable number of points as the random variable τk, 1 /µ has at most
n
countably many atoms. Thus the set S is (0, 1] except for countably many points,
implying S is dense on (0, 1]. If the distribution of τk, 1 /µ is continuous then
n
S = (0, 1].
6.2
Convergence of finite dimensional distributions
This section will follow Theorem 1 of [20] where they developed an elegant proof
for the weak convergence of the extremal process with random sample size using
the monotonic nature of the modified process. We consider convergence of the
finite dimensional distributions on dense set S as the probability of there being a
jump in the process is 0 on this set. See Section 3.4 for more details.
Theorem 6.1. Let condition (6.8) hold, then
FDD
{hn (t) = qn (tµn ), t ∈ S} =⇒ {h(t) = q(tµ), t ∈ S}
as n → ∞.
(6.9)
Proof. Consider the process {h̃n (t), 0 < t ≤ 1} and it’s non-decreasing nature.
Let · · · < z−1,r < z0,r < z1,r < · · · for r = 1, 2, . . . , partition the interval (0, 1],
53
CHAPTER 6. EXTREMAL PROCESS OF RANDOM SAMPLE SIZE
Figure 6.1: Approximate process defined in (6.10)
such that z−k,r → 0 and zk,r → 1 as k → ∞. The partition must also satisfy
dr = maxk (zk+1,r − zk,r ) → 0 as r → ∞.
±
(t),
Define the approximate extremal process with non-random sample size {qn,r
0 < t ≤ 1} as
±
qn,r
(t) = q̃n (zk+ 1±1 ,n ) for zk,n ≤ t < zk+1,n ,
2
−∞ < k < ∞,
(6.10)
similarly for the asymptotic case,
qr± (t) = q̃(zk+ 1±1 ,n ) for zk,n ≤ t < zk+1,n ,
2
−∞ < k < ∞.
(6.11)
An illustration of how the approximate process (dotted line) relates to the extremal process (solid line) can be seen in Figure 6.1. Note the solid line can also
be a region where they are equal. Using (6.10) define the approximate process
with random sample size,
±
h±
n,r (t) = qn,r (tµn ),
0 < t ≤ 1.
(6.12)
By definition of the process {h±
n,r (t), t > 0}, it satisfies the following inequality for
all n = 1, 2, . . . ,
+
h−
n,r (t) ≤ h̃n (t) ≤ hn,r (t),
0<t≤1
(6.13)
and in the limiting case,
+
h−
r (t) ≤ h(t) ≤ hr (t),
54
0 < t ≤ 1.
(6.14)
An Exposition of Extremal Processes
By definition of the partitions,
h(t) − h±
r (t) ≤ sup |q(tµ) − q(tµ + s)| .
(6.15)
|s|≤dr
For t ∈ S the random point tµ is almost surely a point of continuity of the process
{q(t), 0 < t ≤ 1}. In conjunction with dr → 0 as r → ∞, (6.15) implies,
P
h±
r (t) → h(t)
for t ∈ S,
as r → ∞.
(6.16)
We can always find a set U which is dense in Rm , such that {u1 , . . . , um } ∈ U
are continuity points of distribution functions of the random vectors (h(ti ), i =
1, . . . , m) and (h±
r (ti ), i = 1, . . . , m). This is because distribution functions can
only have countably many discontinuities. For t1 , . . . , tm ∈ S, (6.14) implies
P {h−
r (ti ) ≤ ui , i = 1, . . . , m} ≥ P {h(ti ) ≤ ui , i = 1, . . . , m}
(6.17)
P {h+
r (ti ) ≤ ui , i = 1, . . . , m} ≤ P {h(ti ) ≤ ui , i = 1, . . . , m}.
(6.18)
and
Take the limit of both sides of both equations as r → ∞,
lim sup P {h+
r (ti ) ≤ ui , i = 1, . . . , m} ≤ P {h(ti ) ≤ ui , i = 1, . . . , m}
r→∞
≤ lim inf P {h−
r (ti ) ≤ ui , i = 1, . . . , m}
r→∞
(6.19)
Due to (6.16) the left and right hand sides of (6.19) converge to P {h(ti ) ≤ ui , i =
1, . . . , m}. As the ti , i = 1, . . . , m are arbitrary points in S, m ≥ 1,
FDD
{h±
r (t), t ∈ S} =⇒ {h(t), t ∈ S} as r → ∞.
(6.20)
Let’s again consider the arbitrary points t1 , . . . , tm ∈ S and select partitions in
such a way that, zk,r ∈ S and P {ti µ = zk,r } = 0 ∀k, r, i. Using this with condition
55
CHAPTER 6. EXTREMAL PROCESS OF RANDOM SAMPLE SIZE
(6.8),
P {h±
n,r (ti ) ≤ ui , i = 1, . . . , m}
±
(ti µn ) ≤ ui , i = 1, . . . , m}
= P {qn,r
∞
X
=
P {q̃n (zki + 1±1 ,r ) ≤ ui , ti µn ∈ [zki ,r , zki +1,r ), i = 1, . . . , m}
2
→
k1 ,...,km =−∞
∞
X
P {q(zki + 1±1 ,r ) ≤ ui , ti µn ∈ [zki ,r , zki +1,r ), i = 1, . . . , m} as n → ∞
2
k1 ,...,km =−∞
= P {h±
r (ti ) ≤ ui , i = 1, . . . , m}
by bounded convergence theorem. Thus
FDD
±
{h±
n,r (ti ), i = 1, . . . , m} =⇒ {hr (ti ), i = 1, . . . , m} as n → ∞.
(6.21)
For (u1 , . . . , um ) ∈ U we can combine (6.13), (6.16), (6.20) and (6.21) to find,
lim inf P {h̃n (ti ) ≤ ui , i = 1, . . . , m}
n→∞
≥ lim lim inf P {h+
nr (ti ) ≤ ui , i = 1, . . . , m}
r→∞ n→∞
= lim P {h+
r (ti ) ≤ ui , i = 1, . . . , m}
r→∞
= P {h(ti ) ≤ ui , i = 1, . . . , m}
(6.22)
and
lim supP {h̃n (ti ) ≤ ui , i = 1, . . . , m}
n→∞
≤ lim lim sup P {h−
nr (ti ) ≤ ui , i = 1, . . . , m}
r→∞
n→∞
= lim P {h−
r (ti ) ≤ ui , i = 1, . . . , m}
r→∞
= P {h(ti ) ≤ ui , i = 1, . . . , m}.
(6.23)
Thus leaving us with lim supn→∞ P {h̃n (ti ) ≤ ui , i = 1, . . . , m} ≤ P {h(ti ) ≤ ui , i =
1, . . . , m} ≤ lim inf n→∞ P {h̃n (ti ) ≤ ui , i = 1, . . . , m}. As the lim sup is on the left
and the lim inf is on the right the limits must be equal,
FDD
{h̃n (ti ), i = 1, . . . , m} =⇒ {h(ti ), i = 1, . . . , m} as n → ∞.
56
(6.24)
An Exposition of Extremal Processes
It is now time to relate the results back to {hn (t), 0 < t ≤ 1}. Equation (6.6) and
(6.24) imply
FDD
{hn (ti ), i = 1, . . . , m} =⇒ {h(ti ), i = 1, . . . , m} as n → ∞.
(6.25)
As ti , i = 1, . . . , m are arbitrary points in dense set S, the proof is complete.
6.3
Tightness in Skorokhod topology
The previous section showed the process {hn (t), t ∈ S} weakly converges as n →
∞. In this section we will show the process is tight in Skorokhod topology.
J
In Chapter 5 we showed {qn (t), 0 < t ≤ 1} ⇒ {q(t), 0 < t ≤ 1}. However to
prove the random sample size process {hn (t), 0 < t ≤ 1} converges in Skorokhod
topology we will have to assume a similar condition to (6.8). The joint convergence of {qn (t), 0 < t ≤ 1} and {µn } in Skorokhod topology is required to prove
{hn (t), 0 < t ≤ 1} converges in Skorokhod topology. Thus we will assume
J
{(qn (t), µn ), 0 < t ≤ 1} ⇒ {(q(t), µ), 0 < t ≤ 1} as n → ∞,
(6.26)
J
where ⇒ denotes convegence in Skorokhod topology and is discussed in Section
3.4. This allows us to show that {hn (t), 0 < t ≤ 1} is tight in Skorokhod topology
without assuming any independence on {qn (t), 0 < t ≤ 1} and {µn }.
Theorem 6.2. Let condition (6.26) hold, then
J
{hn (t) = qn (tµn ), 0 < t ≤ 1} ⇒ {h(t) = q(tµ), 0 < t ≤ 1} as n → ∞.
(6.27)
Proof. We use the same modulus of compactness defined in (5.15) for the same
reasons discussed there,
w00 (x, δ, r, s) =
sup
{|x(t) − x(t1 )| ∧ |x(t2 ) − x(t)|} .
(6.28)
r∨(t−δ)≤t1 <t<t2 ≤(t+δ)∧s
Theorem 6.1 gives us convergence of the finite dimensional distributions of {hn (t),
0 < t ≤ 1} in set S which is dense in (0, 1]. Thus it is left to show that it is tight
57
CHAPTER 6. EXTREMAL PROCESS OF RANDOM SAMPLE SIZE
in Skorokhod topology. For ε > 0 and 0 < r < s ≤ 1,
lim lim sup P {w00 (hn , δ, r, s) > ε}
n→∞
= lim lim sup P
sup
{|qn (tµn ) − qn (t1 µn )| ∧ |qn (t2 µn ) − qn (tµn )|} > ε
δ→0 n→∞
r∨(t−δ)≤t1 <t
t<t2 ≤(t+δ)∧s
≤ lim lim sup P
sup
{|qn (t) − qn (t1 )| ∧ |qn (t2 ) − qn (t)|} > ε
δ→0 n→∞
rµn ∨(t−δµn )≤t1 <t
t<t2 ≤(t+δµn )∧sµn
1
1
00
≤ lim lim sup P w (qn , δµn , rµn , sµn ) > ε, ≤ µn ≤ v + P µn <
δ→0 n→∞
v
v
δ→0
+ P {µn > v}
n o
r
1
00
≤ lim lim lim sup P w qn , δv, , sv > ε + P µn <
+ P {µn > v} .
v→∞ δ→0 n→∞
v
v
(6.29)
Applying condition (6.26), {qn (t), t > 0} is tight in Skorokhod topology hence
limv→∞ limδ→0 lim supn→∞ P {w00 (qn , δv, vr , sv) > ε} = 0. Thus,
lim lim sup P {w00 (hn , δ, r, s) > ε}
n→∞
1
≤ lim lim sup P µn <
+ P {µn > v}
v→∞ n→∞
v
n
vo
2
+P µ>
,
≤ lim P µ <
v→∞
v
2
δ→0
we double and half the interval as the point
1
v
(6.30)
and v may not be continuity points
of the distribution of µ. Condition (6.3) implies that (6.30) equals zero. Hence
the process {hn (t), t > 0} is tight in Skorokhod topology.
58
Chapter 7
Extremal process of dependent
random variables
This chapter will follow Adler’s work [1]. It will show the extremal process of
dependent random variables will converge for a non-random sample size. This
extends the work done by Lamperti in [12], as it includes a dependent structure
for the random variables.
7.1
Foundations and dependent structure
Let {Xj , j = 1, 2, . . . } be a stationary sequence of random variables such that
d
{Xj , . . . , Xj+k } = {X1 , . . . , Xk+1 } for k ≥ 0 and j ≥ 1. Denote Mn = max{Xj , j =
1, . . . , n}, then for certain distributions there exists two real sequences {an > 0}
and {bn } such that,
P {Mn ≤ an x + bn } → G(x) as n → ∞
(7.1)
as seen in (4.9). Denote the set of scaled random variables Xn,j = (Xj − bn )/an .
Denote B as a Borel subset of (0, 1] × (−∞, ∞) and #A as the number of
elements in set A. From Theorem 2.5 Ln (B) defines a point process where,
j
Ln (B) = # j :
, Xn,j ∈ B, j = 1, . . . , n ,
(7.2)
n
59
CHAPTER 7. EXTREMAL PROCESS OF DEPENDENT RANDOM
VARIABLES
which lies in space R. As {Ln (B), n = 1, 2, . . . } is a sequence of point processes it
is natural to consider the convergence in vague topology as n becomes large. The
idea of convergence in vague topology is explored in Section 2.1.3 and is denoted
w
by ⇒ .
Let G∗ be the left hand endpoint of the support of G, G∗ = inf{x : G(x) > 0},
then denote the set T = (0, 1] × (G∗ , ∞) ⊂ R2 . Define µ as the Lebesgue-Stieltjes
measure on (G∗ , ∞) such that
µ(x, y] = log
G(y)
G(x)
,
for x < y.
(7.3)
Then λ can be defined by
λ(A) = (t2 − t1 )µ(x, y],
(7.4)
for rectangles A = (t1 , t2 ] × (x, y] ⊂ T . We can define the two-dimensional Poisson
process on T with mean measure λ by the following two features,
L(B) is a Poisson variable with mean measure λ(B), for all Borel sets
B ⊂ T,
(7.5)
for k ≥ 1 the disjoint Borel sets B1 , . . . , Bk ⊂ T implies L(B1 ), . . . , L(Bk )
are independent.
(7.6)
In Section 2.3.5 we showed that the above definition indeed defines a point process.
v
This chapter will investigate the convergence of Ln ⇒ L as n → ∞, however we
will need to put some conditions on the dependent structure of the Xi ’s.
Consider the set of integers with the property,
1 ≤ i1 < · · · < ip < j1 < · · · < jq ≤ n,
j1 − ip > m.
(7.7)
Also consider the sequence of sets {Aj } ⊂ R and {Bj } ⊂ R, where each Aj
and Bj are the union of finitely many intervals, where the intervals can have
infinite bounds. Restrict the sequences of sets further by only allowing finitely
(n)
many different types of Aj and Bj . Then if we denote Fi1 ,...,ir (A1 , . . . , Ar ) as the
60
An Exposition of Extremal Processes
probability P {Xn,i1 ∈ A1 , . . . , Xn,ir ∈ Ar } we can define the mixing condition
using the sets with the form above and integers of the form (7.7) to be
(n)
Fi1 ,...,ip ,j1 ,...,jq (A1 , . . . , Ap , B1 , . . . , Bq )
(n)
(n)
−Fi1 ,...,ip (A1 , . . . , Ap )Fj1 ,...,jq (B1 , . . . , Bq ) ≤ αn,m ,
(7.8)
where αn,m is non-increasing in m. Let {qn } be a real sequence such that if qn → ∞
with qn /n → 0, then αn,qn → 0, as n → ∞. This mixing conditions is to ensure
sets of random variables that become more distant approach independent sets of
random variables.
Define another two real sequences {kn }, {pn } such that qn = (n − kn pn )/kn ,
pn → ∞, kn → ∞, kn pn /n → 1 and kn αn,qn → 0 as n → ∞. An example of such
sequences can be found in Theorem 1.3 of [9]. If condition (7.8) is satisfied we
require one more condition,
pn −1
lim kn
n→∞
X
(pn − j)P {Xn,1 ≥ x, Xn,j+1 ≥ x} = 0
(7.9)
j=1
for all x such that 0 < G(x) < 1. This condition ensures there are not two
exceedances close together. It lays a bound on probability of there being more
then one exceedance in the set {X1 , . . . , Xpn }.
This concludes the initial setup and dependent structure section. From here the
next section will consider the convergence in vague topology of the point process
as n becomes large
7.2
Convergence in vague topology
The proof of convergence in vague topology requires a result which is a simplified
version of Theorem 2.3 of [10] and Theorem 4.7 on page 35 of [11].
Theorem 7.1. Let L1 , L2 , . . . be point processes on [0, 1] × (−∞, ∞) and L be
v
a two-dimensional Poisson process with mean measure λ(·). Then Ln ⇒ L if the
61
CHAPTER 7. EXTREMAL PROCESS OF DEPENDENT RANDOM
VARIABLES
following two conditions hold
P {Ln (B) = 0} → exp{−λ(B)},
lim sup E[Ln (B)] ≤ λ(B),
(7.10)
(7.11)
n→∞
for all sets B of finite unions of disjoint, bounded rectangles of the form (x, y] ×
(s, t].
Proof. See Theorem 4.7 on page 35 of [11].
Theorem 7.2. Let {Xj , j = 1, 2, . . . } be a stationary sequence of random variables. If there exists two sequences of real numbers, {an > 0} and {bn } such that,
P n {Xn,1 ≤ x} → G(x),
as n → ∞
(7.12)
v
and conditions (7.8) and (7.9) are satisfied, then Ln ⇒ L where Ln is defined by
(7.2) and L is defined by (7.5) and (7.6).
Proof. Theorem 7.1 gives an easy way to proof the process Ln converges to L in
vague topology. Begin by considering condition (7.11). Without loss of generality,
we fix a set B = (c1 , c2 ] × (x, y] ⊂ T then let ζn,j be a indicator random variable
such that ζn,j = 1 if (j/n, Xn,j ) ∈ B, and ζn,j = 0 otherwise. This implies
P
Ln (B) = nj=1 ζn,j . Define the set An,j = {x : (j/n, x) ∈ B}, a diagram of such a
set can be seen in Figure 7.1. Knowing nP {Xn,1 > x} ≈ − log(G(x)),
j
E[ζn,j ] = P
, Xn,j ∈ B
n
1
= nP {Xn,j ∈ (x, y]}1 j ∈(c1 ,c2 ]
n
n
1
= (nP {Xn,j > x} − nP {Xn,j > y})1 j ∈(c1 ,c2 ]
n
n
1
G(y)
≈ log
1 nj ∈(c1 ,c2 ]
n
G(x)
µ(An,j )
=
n
62
(7.13)
An Exposition of Extremal Processes
Figure 7.1: Set An,j
where µ is defined in (7.3). Hence,
"
E[Ln (B)] = E
n
X
#
ζn,j
j=1
=
≈
n
X
j=1
n
X
j=1
E [ζn,j ]
µ(An,j )
.
n
(7.14)
Equation (7.4) implies as n → ∞ (7.14) tends to λ. Hence (7.11) is satisfied and
it is only left to show (7.10) is satisfied.
Let En denote events that are defined in terms of {ζn,1 , . . . , ζn,k } and Fn denote
events that are defined in terms of {ζn,1+m , . . . , ζn,k+m }. From (7.8) there must be
a double sequence such that,
|P {En ∩ Fn } − P {En }P {Fn }| ≤ αn,m ,
(7.15)
which is non-increasing in m. If a real sequence qn → ∞ with qn /n → 0 then
αn,qn → 0 as n → ∞. As before, we can define real sequences {kn } and {pn } that
satisfy the same conditions as (7.9). Let x = inf{s : (t, s) ∈ B, for some t ∈ [0, 1]}
then P {ζn,1 = 1, ζn,j = 1} ≤ P {Xn,1 > x, Xn,j > x}. Apply this to condition (7.9)
63
CHAPTER 7. EXTREMAL PROCESS OF DEPENDENT RANDOM
VARIABLES
to achieve
pn −1
lim kn
n→∞
X
(pn − j)P {ζn,1 = 1, ζn,j+1 = 1} = 0.
(7.16)
j=1
The next part is a simplified version of [17] which is similar to [16].
P
Denote Pn,k = P {Ln (B) = k} = P { nj=1 ζn,j = k}, then Pn,0 will be the focus
in satisfying (7.10). Partition the integers 1, . . . , n into 2kn blocks of size pn and
qn alternatively, starting with a block of size pn . Then denote Pn as the set of
integers that fall into blocks of size pn , likewise denote Qn as the set of integers
that fall into blocks of size qn .
Define the event Bn,k by “for exactly k values of i, ζn,i = 1 where i = 1, . . . , n
and all such i’s lie in Pn .” Also define the event Cn,k by “for exactly k values of
i, ζn,i = 1 where i = 1, . . . , n and some such i’s lie in Qn .” The event {Ln (B) =
k} = Bn,k ∪ Cn,k , however by definition Bn,k ∩ Cn,k = ∅ implying Pn,k = P {Bn,k } +
P {Cn,k }. Using (7.13) consider
P {Cn,k } ≤ E{Cn,k }
X
P {ζn,k = 1}
=
k∈Qn
≈
kn qn
sup µ(An,j ) → 0,
n j
as n → ∞.
(7.17)
This is due to supj µ(An,j ) being bounded for B ⊂ T and the conditions of the
sequences indicate that kn qn n−1 → 0 as n → ∞. Thus for Pn,k to have a limit as
n → ∞ it must be equivalent to P {Bn,k }, hence |Pn,k − P {Bn,k }| → 0 as n → ∞.
Let Gn,i , i = 1, . . . , kn be the event “ζn,j = 0 for all j in the ith Pn block.”
Then |Pn,0 − P {Gn,1 ∩ · · · ∩ Gn,kn }| → 0 as n → ∞. From (7.15), |P {Gn,1 ∩
Gn,2 } − P {Gn,1 }P {Gn,2 }| ≤ αn,qn as each pn block is separated by a block of
qn . This relation can be extended by considering kn events, then it will become
Qn
|P {Gn,1 ∩ · · · ∩ Gn,kn } − kj=1
P {Gn,j }| ≤ kn αn,qn . This can be shown by induction
in a similar fashion to (4.7).
Qn
Due to the conditions on the sequences |P {Gn,1 ∩· · ·∩Gn,kn }− kj=1
P {Gn,j }| →
Qk n
0 as n → ∞. This implies the asymptotic behaviour of j=1 P {Gn,j } will be the
64
An Exposition of Extremal Processes
same as P {Gn,1 ∩ · · · ∩ Gn,kn }, hence we need to calculate P {Gn,i }. Let r be the
first integer of the ith Pn block and denote Hn,i as the event “ζn,j = 1 at least
twice in the ith Pn block.” Consider Gn,i ,
P {Gn,i } = 1 − P {Hn,i }
− P {ζn,r = 0, . . . , ζn,r+pn −2 = 0, ζn,r+pn −1 = 1}
.
− ..
− P {ζn,r = 1, ζn,r+1 = 0, . . . , ζn,r+pn −1 = 0}
(7.18)
where,
P {Hn,i } ≤ P {ζn,r = 1, ζn,r+1 = 1}
.
+ ..
+ P {ζn,r+pn −2 = 1, ζn,r+pn −1
+ P {ζn,r = 1, ζn,r+2 = 1}
.
+ ..
+ P {ζn,r+pn −3 = 1, ζn,r+pn −1
.
+ ..
+ P {ζn,r = 1, ζn,r+pn −1 = 1}
= 1}
pn − 1
pn − 2
= 1}
o
1
pn −1
=
X
(pn − j)P {ζn,r = 1, ζn,r+j = 1}.
(7.19)
j=1
As Xn,i are stationary random variables, (7.19) is the same as (7.9) without the
factor of kn , thus P {Hn,i } ∼ o(kn−1 ). Hence,
P {Gn,i } = 1 −
pn
X
j=1
P {ζn,j = 1} +
pn
X
P {ζn,j = 1}
j=1
− P {ζn,1 = 0, . . . , ζn,r+pn −2 = 0, ζn,pn = 1}
.
− ..
− P {ζn,1 = 1, ζn,r+1 = 0, . . . , ζn,pn = 0} + o(kn−1 ).
65
(7.20)
CHAPTER 7. EXTREMAL PROCESS OF DEPENDENT RANDOM
VARIABLES
Denote the event Gn,i\{d} as “ζn,l = 0, ∀l 6= d in the ith Pn block”. If we now
consider the difference equation,
pn
X
(P {ζn,d = 1} − P {ζn,1 = 0, . . . , ζn,d−1 = 0, ζn,d = 1, ζn,d+1 = 0, . . . , ζn,pn = 0})
d=1
=
pn
X
P {ζn,d = 1} − P {{ζn,d = 1} ∩ Gn,1\{d} }
d=1
pn
=
X
P {{ζn,d = 1} ∩ Gn,1\{d} } + P {{ζn,d = 1} ∩ (Gn,1\{d} )c }
d=1
−P {{ζn,d = 1} ∩ Gn,1\{d} }
=
pn
X
P {ζn,d = 1, at least one of ζn,l = 1, for some l 6= d in the 1st Pn block}.
d=1
(7.21)
We can expand the sum out by counting the number of certain types of relationships. This means we’ll count types that are next to each other, one apart, two
apart, etc. This is similar to what we did in Hn,i , however we will end up with
double as the sum in (7.21) counts it twice, once when d = α, l = β then again
66
An Exposition of Extremal Processes
when d = β, l = α.
pn
X
P {ζn,d = 1, at least one of ζn,l = 1, for some l 6= d in the 1st Pn block}
d=1
≤ P {ζn,1 = 1, ζn,2 = 1} + P {ζn,2 = 1, ζn,1 = 1}
.
+ ..
+ P {ζn,pn −1 = 1, ζn,pn = 1} + P {ζn,pn = 1, ζn,pn −1
+ P {ζn,1 = 1, ζn,3 = 1} + P {ζn,3 = 1, ζn,1 = 1}
.
+ ..
+ P {ζn,pn −2 = 1, ζn,pn = 1} + P {ζn,pn = 1, ζn,pn −2
.
+ ..
+ P {ζn,1 = 1, ζn,pn = 1} + P {ζn,pn = 1, ζn,1 = 1}
= 1}
2(pn − 1)
2(pn − 2)
= 1}
o
2
pn −1
=2
X
(pn − j)P {ζn,1 = 1, ζn,j+1 = 1}
(7.22)
j=1
as Xn,i are stationary random variables. Equation (7.9) indicates (7.22) is o(kn−1 ).
Hence (7.19) and (7.22) imply
X
P {ζn,j = 1} + o(kn−1 )
1X
∼1−
µ(An,j ) + o(kn−1 )
n
P {Gn,i } = 1 −
(7.23)
where the sum is over all j in the ith Pn block. Because o(kn−1 ) → 0 as n → ∞,
P
for sufficiently large n, P {Gn,i } ∼ 1 − n1
µ(An,j ).
Ppn
pn
1
Consider n j=1 µ(An,j ) ∼ n supj µ(An,j ) → 0 as supj µ(An,j ) is bounded and
pn
n
→ 0. Make another approximation by considering sufficiently large n,
X
1X
1
1−
µ(An,j ) ≈ exp
µ(An,j ) .
(7.24)
n
n
Qn
P
Hence, |P {Gn,1 , . . . , Gn,kn } − ki=1
exp n1
µ(An,j ) | → 0 as n → ∞. As the
sum in the exponential is over all j in the ith Pn block, take the product into the
exponential to see it becomes a sum over all Pn . Hence, |P {Gn,1 , . . . , Gn,kn } −
67
CHAPTER 7. EXTREMAL PROCESS OF DEPENDENT RANDOM
VARIABLES
exp
µ(A
)
| → 0 and the exponent converges to λ as n → ∞. This is
n,j
j∈Pn
P
1
n
due to the probability of all the cases of ζn,j = 1 happening in Pn is one as n → ∞,
concluding the proof.
7.3
Convergence in Skorokhod topology
The previous section has shown that the point process Ln converges to the twodimensional Poisson process, L in vague topology. However there is a more relevant
topology, the Skorokhod topology, discussed in Chapter 3. Luckily a comment on
page 212 of [21] indicates, convergence in vague topology implies convergence in
Skorokhod topology when the limiting process is a Poisson process. This implies
J
that Ln ⇒ L as n → ∞.
The kth extremal process can be defined in the following manner,
kth largest in {Xn,1 , . . . , Xn,bntc }, 1 ≤ k ≤ bntc
mkn (t) =
−∞,
k > bntc.
(7.25)
Also denote Ln (t, x) as Ln ((0, t] × (x, ∞)), and in an analogous fashion write
L(t, x) = L((0, t] × (x, ∞)). Then the extremal process can be written in terms of
Ln in the same fashion as [25], remembering that Ln is right continuous.
mkn (t) = min{x : Ln (t, x) ≤ k − 1},
(7.26)
{mkn (t) ≤ x} = {Ln (t, x) ≤ k − 1}.
(7.27)
then,
There are similar equations for the limiting process,
mk (t) = min{x : L(t, x) ≤ k − 1},
(7.28)
{mk (t) ≤ x} = {L(t, x) ≤ k − 1}.
(7.29)
then,
Consider the random process Znk = (m1n , . . . , mkn ) and the k-dimensional extremal
process Z k = (m1 , . . . , mk ).
68
An Exposition of Extremal Processes
J
Theorem 7.3. Under the same conditions as Theorem 7.2, Znk ⇒ Z k on Dk [a, 1]
for fixed k and 0 < a < 1, where Dk is k-dimensional D space which is defined in
Chapter 3.
Proof. As Weissman didn’t use any of the independence properties that were assumed in they’re paper [26], Theorem 2.1 from [26] is sufficient to complete the
proof.
Although this proof may be claimed verbatim we still had to do all the initial
J
work to show that Ln ⇒ L for dependent random variables before using Weissman’s
work. This concludes the chapter and provides enough knowledge to consider a
process of dependent random variables with random sample size.
69
Chapter 8
Extremal process of dependent
random variables with random
sample size
This chapter will use the same techniques as in [1] but extend the results a little
by considering a set of random sample size rather than non-random number n.
Initially we will establish the convergence of a two-dimensional point process to a
two-dimensional mixed Poisson process in vague topology. Then apply this result
to show that the extremal process of stationary random variables with random
sample size converges in Skorokhod topology.
8.1
Convergence to the 2D mixed Poisson process
Let {Xj , j = 1, 2, . . . } be a stationary sequence of random variables such that
d
{Xj , . . . , Xj+k } = {X1 , . . . , Xk+1 } for j ≥ 1, k ≥ 0. Let {νn , n = 1, 2 . . . } be a
sequence of random variables independent of {Xj , j = 1, 2, . . . } such that
νn w
⇒η
n
as n → ∞, where η is an almost surely positive random variable. Assume there
70
An Exposition of Extremal Processes
are real sequences {an > 0} and {bn } such that,
X j − bn
≤ x → G(x)
P max
1≤j≤n
an
as n → ∞
(8.1)
where G(x) is one of the classical extreme value distributions considered in (4.3).
Denote Xn,j =
Xj −bn
an
to produce a set of scaled random variables {Xn,j , j =
1, 2, . . . }.
Let B be a Borel subset of (0, 1] × (−∞, ∞) and denote #A to be the number
of elements in the set A. From Theorem 2.5 In (B) defines a point process where,
j
In (B) = # j :
, Xn,j ∈ B, j = 1, . . . , νn ,
(8.2)
νn
which lies in space R. As {In , n = 1, 2, . . . } is a sequence of point processes it is
natural to consider convergence in vague topology as n becomes large. Convergence
v
in vague topology is outlined in Section 2.1.3 and is denoted by ⇒ .
Let G∗ be the left hand endpoint of G(x), G∗ = inf{x : G(x) > 0}. Then define
set T = (0, 1] × (G∗ , ∞) ⊂ R2 . Let µ be the Lebesgue-Stieltjes measure on (G∗ , ∞)
defined in (7.3). Let λ(A) be the same as (7.4) for rectangles A = (t1 , t2 ] × (x, y] ⊂
T . Then define the diffuse mixed mean measure by
λη (A) = ηλ(A) = η(t2 − t1 )µ(x, y]
(8.3)
where A are rectangles of the form defined above. We can define a two-dimensional
mixed Poisson process by the following two properties: given η,
I(B) is a Poisson variable with mean measure λη (Bi ), for all Borel sets
B ⊂ T,
(8.4)
for k ≥ 1 the disjoint Borel sets B1 , . . . , Bk ⊂ T implies the counts of points
on these sets I(B1 ), . . . , I(Bk ) are independent.
(8.5)
Akin to Chapter 7 we need to impose some conditions on the dependent structure of the set of random variables {Xn,j , j = 1, 2, . . . }. However the conditions
(7.8) and (7.9) suffice for Theorem 8.2. So we will assume that we can find a
{qn }, {pn } and {kn } that satisfy those conditions.
71
CHAPTER 8. EXTREMAL PROCESS OF DEPENDENT RANDOM
VARIABLES WITH RANDOM SAMPLE SIZE
Before the convergence of In is considered we will need to mention a special
version of Theorem 2.3 from [10]. This can also be found in Theorem 4.7 on page
35 of [11].
Theorem 8.1. Let I1 , I2 , . . . be point processes on [0, 1]×(−∞, ∞) and I be a twodimensional mixed Poisson process with the diffuse mean measure λη (·) directed by
v
η. Then In ⇒ I if the following two conditions hold
P {In (B) = 0} → E[exp(−λη (B))],
(8.6)
lim sup E[In (B)] ≤ E [λη (B)] ,
(8.7)
n→∞
for all sets B of finite unions of disjoint, bounded rectangles of the form (x, y] ×
(s, t].
Proof. As λη is diffuse, Theorem 2.10 ensures that I is a simple point process.
Thus we can use Theorem 4.7 on page 35 of [11] to complete the proof.
Note that Theorem 8.1 is slightly different to that of Theorem 7.1 as it is a
mixed Poisson directed by η. This will be the main difference for the rest of the
Chapter.
Theorem 8.2. Let {Xj , j = 1, 2, . . . } be a stationary sequence of random variables. Let νn be a random variable independent of {Xj , j = 1, 2, . . . }, such that
νn w
⇒η
n
as n → ∞ where P {η > 0} = 1. If two sequences of real numbers {an > 0},
and {bn } can be found such that,
P n {X1 ≤ an x + bn } → G(x)
(8.8)
v
and conditions (7.8), (7.9) are satisfied, then In ⇒ I where In is specified in (8.2)
and I is defined by (8.4) and (8.5) with diffuse mean measure λη defined in (8.3).
Proof. We will show that (8.6) and (8.7) are satisfied and we start with (8.7).
Without loss of generality, we fix a set B = (c1 , c2 ] × (x, y] ⊂ T then define a
sequence of indicator random variables {ξr,j , j = 1, 2, . . . , r} such that ξr,j = 1 if
72
An Exposition of Extremal Processes
( rj , Xn,j ) ∈ B and ξr,j = 0 otherwise. Consider,
j
E [ξνn ,j |νn = sn ] =P
, Xn,j ∈ B|νn = sn
νn
j
=P
, Xn,j ∈ B|νn = sn
sn
j
, Xn,j ∈ B
=P
sn
Pn
As In (B) = νj=1
ξνn ,j and nP {Xn,j > x} ≈ − log(G(x)), consider
"ν
#
n
X
E [In (B)|νn = sn ] = E
ξνn ,j |νn = sn
(8.9)
j=1
=
sn
X
E [ξνn ,j |νn = sn ]
j=1
=
sn
X
P
j=1
=
1
n
j
, Xn,j
sn
X
∈B
nP {Xn,j ∈ (x, y]}
j
∈(c1 ,c2 ]
sn
for 1≤j≤n
=
1
n
X
(nP {Xn,j > x} − nP {Xn,j > y})
j
∈(c1 ,c2 ]
sn
for 1≤j≤n
sn
≈ (c2 − c1 ) log
n
G(y)
G(x)
.
(8.10)
As E[In (B)] = E[E[In (B)|νn = sn ]], equation (8.10) shows E[In (B)] → E[ηλ(B)]
= E[λη (B)] as n → ∞ satisfying (8.7). It is left to show that (8.6) is satisfied to
complete the proof.
Let En denote an event in terms of {ξn,1 , . . . , ξn,k }, and Fn denote an event
in terms of {ξn,m+k , . . . , ξn,n }. According to condition (7.8) there exists a double
sequence {αn,m } which is non increasing in m and satisfies
|P (En ∩ Fn ) − P (En )P (Fn )| ≤ αn,m .
(8.11)
Let {qn } be a real sequence such that if qn → ∞ with qn /n → 0 then αnqn → 0 as
n → ∞. Let x = inf{r : (t, r) ∈ B, for some t ∈ [0, 1]} then P {ξνn ,1 = 1, ξνn ,j = 1}
73
CHAPTER 8. EXTREMAL PROCESS OF DEPENDENT RANDOM
VARIABLES WITH RANDOM SAMPLE SIZE
≤ P {Xn,1 ≥ x, Xn,j ≥ x}. Hence by condition (7.9) there exists further real sequences {kn } and {pn } such that,
pn −1
lim kn
n→∞
X
(pn − j) P {ξνn ,1 = 1, ξνn ,j+1 = 1} = 0.
(8.12)
j=1
Denote Pn,k = P {In (B) = k} = P {
P∞
j=1 ξνn ,j
= k} which is probability of
In being exactly k. Partition the integers 1, 2, . . . , νn into a random number 2kνn
blocks of size pn and qn alternatively, starting with {1, 2, . . . , pn }. The last block
may be incomplete. Note that this is different from [1] as we are required to
partition the integers into a random number of blocks with non-random sizes.
Denote the set χ0n to be the integers that fall into blocks of size pn , likewise
denote the set χ00n to be the integers that fall into blocks of size qn . Then define
the event Bn,k to be “for exactly k values of i = 1, 2, . . . , νn , ξνn ,i = 1, and all
such i0 s are in χ0n ,” similarly define the event Cn,k to be “for exactly k values of
i = 1, 2, . . . , νn , ξνn ,i = 1, and some such i0 s are in χ00n .” Notice {In (B) = k} =
Bn,k ∪ Cn,k , as Bn,k , Cn,k are disjoint events Pn,k = P {Bn,k } + P {Cn,k }.
Approximate P {Cn,k } using the same techniques as (8.10) to find the following,
P {Cn,k } ≤
X
P {ξνn ,j = 1}
j∈χ00
n
=
1
n
X
(nP {Xn,j > x} − nP {Xn,j > y})
j
∈(c1 ,c2 ]
νn
for j∈χ00
n
qn kνn
G(y)
≈
(c2 − c1 ) log
.
(8.13)
n
G(x)
G(y)
As (c2 − c1 ) and log G(x)
are bounded because B ⊂ T , we only need to consider
the sequences. The sequences can be broken up into two cases, consider νn ≥ n,
qn kνn
qν kν d
≤ n n → 0,
n
n
as n → ∞,
(8.14)
qn kνn
qn kn
≤
→ 0,
n
n
as n → ∞.
(8.15)
and νn < n,
74
An Exposition of Extremal Processes
This implies P {Cn,k } → 0 as n → ∞. Hence for Pn,k and P {Bn,k } to have a limit
as n → ∞ they must be equal. Which implies |Pn,k − P {Bn,k }| → 0 as n → ∞.
Denote Gνn ,i as the event “ξνn ,l = 0 for every l in the ith χ0n block.” Using the
result above, |Pn,0 − P {Gνn ,1 ∩ · · · ∩ Gνn ,kνn }| → 0 as n → ∞. Let Sn denote the
values νn can take then we can use (8.11) to calculate a bound for the following
difference formula remembering that each pn block is separated by a block of qn .
|P {Gνn ,1 ∩ · · · ∩ Gνn ,kνn } − P {Gνn ,1 } · · · P {Gνn ,kνn }|
X
≤
|P {Gsn ,1 ∩ · · · ∩ Gsn ,ksn } − P {Gsn ,1 } · · · P {Gsn ,ksn }|P {νn = sn }
sn ∈Sn
≤
X
ksn αsn ,qn P {νn = sn }.
(8.16)
sn ∈Sn
Again we will show this tends to zero by separating into two cases, consider n > sn ,
ksn αsn ,qn ≤ kn α snn n,qn → 0,
as
sn
n
as n → ∞,
(8.17)
approaches finite positive constant as n → ∞. Consider n ≤ sn ,
ksn αsn ,qn ≤ ksn αsn ,qsn → 0,
Hence |P {Gνn ,1 , . . . , Gνn ,kνn } −
Qkνn
i=1
as n → ∞.
(8.18)
P {Gνn ,i }| → 0 as n → ∞. The problem has
now boiled down to finding as asymptotic solution to P {Gνn ,i }. Let r denote the
first element of the ith χ0n block,
P {Gνn ,i } = 1−(P {ξνn ,r = 1, . . . , ξνn ,r+pn −1 = 1}
+ P {ξνn ,r = 0, ξνn ,r+1 = 1 . . . , ξνn ,r+pn −1 = 1}
.
+ ..
+ P {ξνn ,r = 1, . . . , ξνn ,r+pn −2 = 1, ξνn ,r+pn −1 = 0}
.
+ ..
+ P {ξνn ,r = 0, . . . , ξνn ,r+pn −2 = 0, ξνn ,r+pn −1 = 1}
.
+ ..
+ P {ξνn ,r = 1, ξνn ,r+1 = 0 . . . , ξνn ,r+pn −1 = 0}
75
(8.19)
CHAPTER 8. EXTREMAL PROCESS OF DEPENDENT RANDOM
VARIABLES WITH RANDOM SAMPLE SIZE
Let’s now denote Hνn ,i as the event that “there is at least two ξνn ,j = 1 in the ith
χ0n block”. Hence,
P {Hνn ,i } ≤ P {ξνn ,r = 1, ξνn ,r+1 = 1}
.
+ ..
+ P {ξνn ,r+pn −2 = 1, ξνn ,r+pn −1
+ P {ξνn ,r = 1, ξνn ,r+2 = 1}
.
+ ..
+ P {ξνn ,r+pn −3 = 1, ξνn ,r+pn −1
.
+ ..
+ P {ξνn ,r = 1, ξνn ,r+pn −1 = 1}
= 1}
pn − 1
pn − 2
= 1}
o
1
pn −1
=
X
(pn − j) P {ξνn ,r = 1, ξνn ,r+j = 1} .
(8.20)
j=1
From (7.9), P {Hνn ,i } ∼ o(kν−1
) because the {Xj , j = 1, 2, . . . } are stationary.
n
Applying this to (8.19) the probability of Gνn ,i can be expressed as,
P {Gνn ,j } = 1 −
pn
X
j=1
P {ξνn ,j = 1} +
pn
X
P {ξνn ,j = 1}
j=1
pn
−
X
P {ξνn ,1 = 0, . . . , ξνn ,j−1 = 0, ξνn ,j = 1, ξνn ,j+1 = 0, . . . , ξνn ,pn = 0} + o(kν−1
),
n
j=1
(8.21)
as the {Xj , j = 1, 2, . . . } are stationary. Define the event Gνn ,i\{j} as “ξνn ,l =
76
An Exposition of Extremal Processes
0, ∀l 6= d in the ith χ0n block”. Then consider the difference formula,
pn
X
(P {ξνn ,d = 1}
d=1
−P {ξνn ,1 = 0, . . . , ξνn ,d−1 = 0, ξνn ,d = 1, ξνn ,d+1 = 0, . . . , ξνn ,pn = 0})
=
pn
X
P {ξνn ,d = 1} − P {{ξνn ,d = 1} ∩ Gνn ,i\{d} }
d=1
=
pn
X
P {{ξνn ,d = 1} ∩ Gνn ,i\{d} } + P {{ξνn ,d = 1} ∩ (Gνn ,i\{d} )c }
d=1
−P {{ξνn ,d = 1} ∩ Gνn ,i\{d} }
=
pn
X
P {ξνn ,d = 1, at least one ξνn ,l = 1, for some l 6= d in the 1st χ0n block}.
d=1
(8.22)
We can expand the sum by doing the same as (7.22), except for random sample
size,
pn
X
P {ξνn ,d = 1, at least one ξνn ,l = 1, for some l 6= d in the 1st χ0n block}
d=1
≤ P {ξνn ,1 = 1, ξνn ,2 = 1} + P {ξνn ,2 = 1, ξνn ,1 = 1}
.
+ ..
+ P {ξνn ,pn −1 = 1, ξνn ,pn = 1} + P {ξνn ,pn = 1, ξνn ,pn = 1}
+ P {ξνn ,1 = 1, ξνn ,3 = 1} + P {ξνn ,3 = 1, ξνn ,1 = 1}
..
+.
+ P {ξ
= 1, ξ
= 1} + P {ξ
= 1, ξ
= 1}
νn ,pn −2
νn ,pn
νn ,pn
2(pn − 1)
2(pn − 2)
νn ,pn−2
.
+ ..
+ P {ξνn ,1 = 1, ξνn ,pn = 1} + P {ξνn ,pn = 1, ξνn ,1 = 1}
o
2
pn −1
=2
X
(pn − j) P {ξνn ,1 = 1, ξνn ,j+1 = 1} .
j=1
77
(8.23)
CHAPTER 8. EXTREMAL PROCESS OF DEPENDENT RANDOM
VARIABLES WITH RANDOM SAMPLE SIZE
Again by (7.9) we can see (8.23)∼ o(kν−1
). Applying (8.23) to (8.21) implies
n
P {Gνn ,i } = 1 −
X
P {ξνn ,j = 1} + o(kν−1
)
n
(8.24)
where the sum is over all j in the ith χ0n block. Equation (8.24) indicates for
P
P
a sufficiently large
n,
P
{G
}
≈
1
−
P
{ξ
=
1}.
As
P {ξνn ,j = 1} ≈
ν
,i
ν
,j
n
n
P
G(y)
pn
(c2 − c1 ) log G(x)
→ 0 as n → ∞, for sufficiently large n, 1 − P {ξνn ,j =
n
P
1} ≈ e− P {ξνn ,j =1} where the sum is over all j in the ith χ0n block. Therefore
kνn
X
Y
→ 0 as n → ∞.
P {Gνn ,i } − exp −
P
{ξ
=
1}
(8.25)
ν
,j
n
i=1
j∈χ0n
Finally consider
exp −
X
P {ξνn ,j = 1}
j∈χ0n
=
X
sn ∈Sn
≈
X
sn ∈Sn
exp −
X
P {ξsn ,j = 1} P {νn = sn }
j∈χ0n
G(y)
pn ksn
(c2 − c1 ) log
P {νn = sn }
exp −
n
G(x)
→ E[exp (−λη (B))],
as n → ∞
(8.26)
as pn ksn tends to sn as n becomes large. Hence (8.6) is satisfied which completes
the proof.
8.2
Convergence in Skorokhod topology
So far all the convergence arguments in Chapter 8 are in vague topology. However
the more relevant topology is Skorokhod topology defined in Chapter 3. To apply
our results in this topology a result from [21] is required. In general, convergence
in vague topology does not imply convergence in Skorokhod topology, however in
the case of the process converging to a Poisson Process such as I, the converse does
J
hold. Therefore Theorem 8.2 holds in Skorokhod topology as well. Hence In ⇒ I
78
An Exposition of Extremal Processes
J
where ⇒ denotes weak convergence in Skorokhod topology. It can be found as a
comment on page 212 of [21].
Consider the process defined by,
mkνn (t) = kth largest among
Xn,1 , . . . , Xn,bνn tc ,
0<t≤1
(8.27)
where k is an integer such that 1 ≤ k ≤ bνn tc. Let In (t, x) = In ((0, t], ×(x, ∞))
then from [25], In (t, x) and mkνn (t) are related by mkνn (t) = min {x, In (t, x) ≤ k − 1}
which implies,
mkνn (t) ≤ x = {In (t, x) ≤ k − 1} .
(8.28)
Then define the kth extremal process, {mk (t), t > 0} as mk (t) = min{x, I(t, x) ≤
k − 1} which in turn implies
mk (t) ≤ x = {I(t, x) ≤ k − 1} .
(8.29)
Before we get into the proof of Zνkn (t), 0 < t ≤ 1 = {(m1νn (t), . . . , mkνn (t)), 0 <
t ≤ 1} converging to Z k (t), 0 < t ≤ 1 = {(m1 (t), . . . , mk (t)), 0 < t ≤ 1} in
Skorokhod topology. It is worth noting, unlike [1] we cannot quote the results
from [26] verbatim, as they’re result is for a non-random sample size n. Therefore
we need to prove the results ourselves. This will be shown in two steps. First
the finite dimensional distributions of {mkνn (t), 0 < t ≤ 1} converge to a limit, in
a similar way to [25] except for random sample size. Then to show it is tight in
space D. This will follow [26] but again replacing n with random sample size.
Theorem 8.3. Under the same assumptions as Theorem 8.2, the finite dimen
1
sional distributions of Zνkn (t), 0 < t ≤ 1 =
mνn (t), . . . , mkνn (t) , 0 < t ≤ 1
converge to Z k (t), 0 < t ≤ 1 = m1 (t), . . . , mk (t) , 0 < t ≤ 1 .
Proof. For every 0 < t1 < · · · < tq and {xij ; i = 1, . . . , q; j = 1, . . . , k}, consider
the following event,
(
Wνn =
q
k
\
\
mjνn (ti )
i=1 j=1
79
≤ xij
)
.
(8.30)
CHAPTER 8. EXTREMAL PROCESS OF DEPENDENT RANDOM
VARIABLES WITH RANDOM SAMPLE SIZE
nT T
o
q
k
From (8.28), Wνn =
(I
(t
,
x
)
≤
j
−
1)
. Then Theorem 8.2 implies
n
i
ij
nT T i=1 j=1
o
nT T
o
q
k
q
k
j
P {Wνn } → P
i=1
j=1 (I(ti , xij ) ≤ j − 1) = P
i=1
j=1 (m (ti ) ≤ xij ) by
relation (8.29) as n → ∞. This final step completes the proof.
Before the tightness arguments begins, some definitions of sets and characteristics of functions need to be established. Denote the family of distribution functions
{Ht : t > 0} such that,
L (m1n (t)) → Ht
as n → ∞.
(8.31)
Then define x∗t = sup{x : Ht (x) = 0}, x∗0 = limt↓0 x∗t exists even if it is −∞ and
H0 (x) = 1(x > x∗0 ).
For a given b, M, δ > 0 denote the set U = {(t, x) : 0 ≤ t ≤ b, Kt ≤ x ≤ M }
where Kt = max {−M, x∗t + δ}. Then denote the space D(U ) as all integer valued
functions such that, for z ∈ D(U ), z : U 7→ R1 which are finite, non-decreasing in
t, non-increasing in x, and each component is right continuous. Let Λ2 be a group
of transforms λ, that maps U onto itself and have the form λ(t, x) = (λ1 (t), λ2 (x))
where each λ1,2 are strictly increasing mappings. With this define an extended
Skorokhod metric in two dimensions,
d2 (z, y) = inf2 {kz − y ◦ λk ∧ kλk}
(8.32)
λ∈Λ
where,
kz − yk = sup{|z(u) − y(λ(u))|1 }
u∈U
kλk = sup{|λ(u) − u|2 }
u∈U
and | · |i is the Euclidean norm on Ri . This extended Skorokhod metric induces
the extended Skorokhod topology.
For z ∈ D(U ) and sz (t) ⊂ U such that sz (t) is the set of all jump points with
−
−
absicca ≤ t. A point (ti , xi ) is classified as a jump point if z(ti , x−
i ) − z(ti , xi ) −
z(ti , xi ) + z(t−
i , xi ) 6= 0. Then define the following function,
kth largest ordinate of sz (t) if #sz (t) ≥ k
hk (t|z) =
,
K
if #s (t) < k
t
z
80
0≤t≤b
(8.33)
An Exposition of Extremal Processes
and clearly hk (·|z) ∈ D[0, b] for z ∈ D(U ).
Lemma 8.1. The mapping hk : D(U ) 7→ D[0, b] is continuous.
Proof. Let zn , z ∈ D(U ) such that d2 (zn , z) → 0 as n → ∞. Then for all ε > 0
there exists a n > nε such that d2 (zn , z) < ε. As zn and z are integers we are
particularly interested in the case when 0 < ε < 1. In such case there exists a
λn = (λn1 , λn2 ) ∈ Λ2 and n > n , such that ∀u ∈ U ,
zn (λn (u)) = z(u),
kλn k < ε.
(8.34)
This implies that hk (t|z ◦ λn ) = hk (t|z) for 0 ≤ t ≤ b. Looking at the definition
of hk (t|z), hk (t|z ◦ λn ) and the kth largest ordinate of sz (λn1 t) can only differ by
kλn2 k. Hence, |hk (λn1 (t)|zn ) − hk (t|z ◦ λn )| ≤ kλn2 k which implies,
|hk (λn1 (t)|zn ) − hk (t|z)| ≤ kλn2 k < ε
(8.35)
as kλn1 k < ε, it implies d1 (hk (·|zn ), hk (·|z)) = |hk (·|zn ) − hk (·|z)| < ε implying all
convergent subsequences of D(U ) converge after the mapping hk , hence fulfilling
the proof.
From Lemma 8.1 the mapping hk : D(U ) 7→ D[0, b] is continuous. As In ,
I ∈ D(U ) by the mapping theorem defined in Theorem 2.7, page 21 of [4] and
Theorem 8.2,
J
hk (·|In ) ⇒ hk (·|I) in D[0, b].
(8.36)
Before the main result of this section, one more lemma is required which is the
solution to problem 5.9 on page 65 of [4].
Lemma 8.2. A set of probability measures {Pn } on S = S 0 × S 00 are tight if and
only if the two sets of marginal distributions are tight.
Proof. Start by assuming that {Pn } is tight on S, and show this implies the
marginals are tight. As {Pn } is tight on S, ∀ε > 0 there exists a compact set kε ⊂ S
such that inf n Pn (kε ) ≥ 1 − ε. Denote the set of probability measures on S 0 and
81
CHAPTER 8. EXTREMAL PROCESS OF DEPENDENT RANDOM
VARIABLES WITH RANDOM SAMPLE SIZE
S 00 as {Pn1 }, {Pn2 } respectively. As they are marginals of S, Pn1 (A) = Pn (A × S 00 )
and Pn2 (A) = Pn (A × S 0 ).
Consider the projection of kε onto S 0 , it is the set kε1 = {x : ∃y, such that (x, y)
∈ kε }. Then for any sequence of points {am } ∈ kε1 , there exists a sequence {bm }
such that {(am , bm )} ∈ kε by definition of kε1 . As kε is a compact set, it must
contain a convergent subsubsequence, (ami , bmi ) →i (a0 , b0 ) ∈ kε (→i notation is
described in Section 3.4). This implies that ami →i a0 so for all sequences {am }
there exists a convergent subsequence implying kε1 is compact.
In a similar fashion we can consider the projection of kε onto S 00 , which is
the set kε2 = {y : ∃x, such that (x, y) ∈ kε }. Then for a sequence {bm } ∈ kε2
there exists a sequence {am } such that {(am , bm )} ∈ kε . As kε is compact it must
have a convergent subsubsequence so (ami , bmi ) →i (a0 , b0 ) which implies that
bmi →i b0 . For all sequences {bm } there exists a convergent subsequence implying
kε2 is compact.
Now it is left to show Pnj {kεj } ≥ 1 − ε for tightness as kεj is compact for
j = 1, 2. From the definition of marginals,
inf Pn1 {kε1 } = inf Pn {kε1 × S 00 }
n
n
≥ inf Pn {kε }
n
≥1−ε
(8.37)
thus Pn1 is tight. In similar fashion,
inf Pn2 {kε2 } = inf Pn {kε2 × S 0 }
n
n
≥ inf Pn {kε }
n
≥1−ε
(8.38)
thus showing that if the probability measures {Pn } are tight on S then the marginals are tight.
To finish the proof, assume the sets of probability measures {P_n^1} and {P_n^2} are tight on S′ and S″ respectively, and show this implies the set of probability measures {P_n} is tight on S. As {P_n^1} is tight on S′, for all ε > 0 there exists a compact set K_ε^1 ⊂ S′ such that P_n^1{K_ε^1} ≥ 1 − ε/2. Similarly, as {P_n^2} is tight on S″ there exists a compact set K_ε^2 ⊂ S″ such that P_n^2{K_ε^2} ≥ 1 − ε/2.

Consider the set K_ε = K_ε^1 × K_ε^2. As K_ε^1 is compact, any sequence {a_m} ⊂ K_ε^1 has a convergent subsequence a_{m_i} →_i a_0; as K_ε^2 is compact, the corresponding sequence {b_{m_i}} ⊂ K_ε^2 has a further convergent subsequence b_{m_{i_j}} →_j b_0. Thus any sequence {(a_m, b_m)} ⊂ K_ε has a convergent subsequence (a_{m_{i_j}}, b_{m_{i_j}}) →_j (a_0, b_0), implying the set K_ε is compact.

Consider P_n{K_ε} = P_n{K_ε^1 × K_ε^2}, and note the identity K_ε^1 × K_ε^2 = (K_ε^1 × S″) ∩ (S′ × K_ε^2).
Now consider the following:

    P_n{K_ε} = P_n{K_ε^1 × K_ε^2}
             = P_n{(K_ε^1 × S″) ∩ (S′ × K_ε^2)}
             = P_n{K_ε^1 × S″} + P_n{S′ × K_ε^2} − P_n{(K_ε^1 × S″) ∪ (S′ × K_ε^2)}
             ≥ 1 − ε/2 + 1 − ε/2 − 1
             = 1 − ε,    (8.39)

as P_n{K_ε^1 × S″} = P_n^1{K_ε^1} and P_n{S′ × K_ε^2} = P_n^2{K_ε^2}. This shows that the set of probability measures {P_n} is tight on S if the marginals are tight, thus concluding the proof.
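As a numerical sanity check, not part of the proof, the following Python sketch estimates the three probabilities appearing in (8.39) for a dependent bivariate sample on S = R × R, with compact intervals standing in for K_ε^1 and K_ε^2; the distribution and the intervals are arbitrary illustrative choices.

# Monte Carlo illustration of the inclusion-exclusion bound used in (8.39):
# P(K1 x K2) >= P(K1 x S'') + P(S' x K2) - 1, on S = R x R. No independence
# between the coordinates is assumed, matching the generality of Lemma 8.2.
import random

random.seed(0)
n = 100_000
# A dependent bivariate sample: the second coordinate is the first plus noise.
sample = [(x, x + random.gauss(0.0, 1.0))
          for x in (random.gauss(0.0, 1.0) for _ in range(n))]

K1 = (-2.5, 2.5)  # compact interval playing the role of K_eps^1 in S'
K2 = (-3.5, 3.5)  # compact interval playing the role of K_eps^2 in S''

p1 = sum(K1[0] <= x <= K1[1] for x, _ in sample) / n   # estimates P(K1 x S'')
p2 = sum(K2[0] <= y <= K2[1] for _, y in sample) / n   # estimates P(S' x K2)
p12 = sum(K1[0] <= x <= K1[1] and K2[0] <= y <= K2[1]
          for x, y in sample) / n                      # estimates P(K1 x K2)

print(f"P(K1 x K2) = {p12:.4f} >= {p1 + p2 - 1:.4f} = lower bound from (8.39)")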
Theorem 8.4. Under the same assumptions as Theorem 8.2, for fixed k,

    {Z_{ν_n}^k(t), 0 < t ≤ 1} ⇒_J {Z^k(t), 0 < t ≤ 1} on D^k[a, b] as n → ∞,

where D[a, b] is the space described in Chapter 3 and 0 < a < b ≤ 1.

Proof. As (8.33) depends on M and δ, consider the following events. For a given 0 < a < b,

    A_{ν_n}(M, δ) = {h_k(t|I_n) ≠ m_{ν_n}^k(t) for some t ∈ [a, b]}
                  ⊆ {I_n(b, M) ≥ 1 or I_n(a, K_a) ≤ k − 1}
                  =: B_{ν_n}(M, δ).    (8.40)
Consider the Portmanteau Theorem (Theorem 2.1, page 16 of [4]): P_n ⇒_w P is equivalent to lim sup_n P_n(F) ≤ P(F) for all closed sets F. As the event B_{ν_n}(M, δ) is closed, the Portmanteau Theorem and Theorem 8.2 give
    lim sup_{n→∞} P{B_{ν_n}(M, δ)} ≤ P{I(b, M) ≥ 1 or I(a, K_a) ≤ k − 1}.    (8.41)

From the definition of I, I(t, x) → 0 a.s. as x → ∞ and I(t, x) → ∞ a.s. as x ↓ x_t^*. Hence by choosing M large and δ small, the right-hand side of (8.41) becomes arbitrarily small, implying lim sup_n P{B_{ν_n}(M, δ)} ≤ ε. By (8.36) and the previous arguments, for a given k,

    {m_{ν_n}^k(t), 0 < t ≤ 1} ⇒_J {m^k(t), 0 < t ≤ 1} in D[a, b], for 0 < a < b.    (8.42)
From (8.42), for a given k the sequence {m_{ν_n}^k(t), 0 < t ≤ 1} is tight in D[a, b]. Lemma 8.2 extends to finitely many probability measures on a k-fold product space, provided k is finite, so {(m_{ν_n}^1(t), . . . , m_{ν_n}^k(t)), 0 < t ≤ 1} is tight on D^k[a, b]. Combining this with Theorem 8.3, which shows that the finite-dimensional distributions of {Z_{ν_n}^k(t), 0 < t ≤ 1} converge, yields

    {Z_{ν_n}^k(t), 0 < t ≤ 1} ⇒_J {Z^k(t), 0 < t ≤ 1},

and the theorem is proved.
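To give a feel for the limiting object, the following Python sketch simulates a two-dimensional mixed Poisson process on (0, 1] × (0, M] whose random intensity is Λ times Lebesgue measure, with Λ exponential, and reads off the processes m^1(t) and m^2(t). The choice of mixing law and intensity measure is an illustrative assumption and does not match the specific limit intensity of Theorem 8.2.

# Illustrative simulation only: a two-dimensional mixed Poisson process with
# random intensity Lambda * Lebesgue on (0, 1] x (0, M], Lambda ~ Exp(1).
# Only the structure (Poisson given Lambda, with Lambda random) mimics the
# thesis's limit process; its actual intensity measure is different.
import random

random.seed(1)
M = 5.0                        # height truncation, cf. the event B(M, delta)
lam = random.expovariate(1.0)  # the random ("mixing") intensity Lambda

def poisson(mean):
    """Knuth-style Poisson sampler: count unit-rate arrivals before `mean`."""
    count, total = 0, random.expovariate(1.0)
    while total < mean:
        count += 1
        total += random.expovariate(1.0)
    return count

# Given Lambda, the point count on the rectangle is Poisson with mean
# lam * M, and the points are iid uniform on the rectangle.
points = [(random.random(), random.uniform(0.0, M))
          for _ in range(poisson(lam * M))]

def m_k(t, k):
    """kth largest ordinate of the realisation over (0, t]."""
    hs = sorted((y for s, y in points if s <= t), reverse=True)
    return hs[k - 1] if len(hs) >= k else float("-inf")

for t in (0.25, 0.5, 0.75, 1.0):
    print(f"t = {t}: m^1(t) = {m_k(t, 1):.3f}, m^2(t) = {m_k(t, 2):.3f}")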
Chapter 9
Conclusion
This thesis has explored extremal processes with many different characteristics. The focus has been the convergence, in the Skorokhod topology, of the maximum of a set of random variables as the sample size becomes large, starting with iid random variables of both non-random and random sample size and moving to the more interesting case of a stationary sequence of dependent random variables with non-random sample size.
This led the reader to the main focus of the thesis: a stationary sequence of dependent random variables with independent random sample size. A two-dimensional point process based on these random variables was shown to converge to a two-dimensional mixed Poisson point process. The parameter of the Poisson point process was random and dependent on the number of random variables in the sample. From here it was a matter of relating the extremal process to the two-dimensional point process, as Weissman did in [26], to complete the convergence arguments.
The work of Chapter 8 differs from Adler's [1] due to the random sample size, which required a two-dimensional mixed Poisson process. A limitation of this technique, however, was the assumption that the sample-size random variable and the sample random variables are independent.
Future work in this area could be to show that a stationary sequence of dependent random variables with dependent random sample size converges in the Skorokhod topology. This may be achievable in a similar way to Chapter 8, although there may be more success using a technique akin to that of Silvestrov and Teugels outlined in Chapter 6. These ideas and questions will hopefully be answered one day, raising new ones in turn.
Acknowledgements
I would like to thank Associate Professor Aihua Xia for inspiring me to undertake a Masters and for guiding me the whole way. I have learnt a vast amount under his supervision, so thank you.
Bibliography
[1] R. J. Adler. Weak convergence results for extremal processes generated by dependent random variables. The Annals of Probability, 6(4):660–667, 1978.
[2] S. M. Berman. Limiting distribution of the maximum term in sequences of dependent random variables. The Annals of Mathematical Statistics, 33(3):894–908, 1962.
[3] S. M. Berman. Limit theorems for the maximum term in stationary sequences. The Annals of Mathematical Statistics, 35(2):502–516, 1964.
[4] P. Billingsley. Convergence of Probability Measures. Wiley, New York, second edition, 1999.
[5] D. J. Daley and D. Vere-Jones. An Introduction to the Theory of Point Processes, Volume I: Elementary Theory and Methods. Springer-Verlag, second edition, 2003.
[6] D. J. Daley and D. Vere-Jones. An Introduction to the Theory of Point Processes, Volume II: General Theory and Structure. Springer, second edition, 2008.
[7] J. Galambos. On the distribution of the maximum of random variables. The Annals of Mathematical Statistics, 43(2):516–521, 1972.
[8] B. Gnedenko. Sur la distribution limite du terme maximum d'une série aléatoire. The Annals of Mathematics, 44(3):423–453, 1943.
[9] I. Ibragimov. Some limit theorems for stationary processes. Theory of Probability and Its Applications, 7(4):349–382, 1962.
[10] O. Kallenberg. Characterization and convergence of random measures and point processes. Z. Wahrscheinlichkeitstheorie verw. Gebiete, 27:9–21, 1973.
[11] O. Kallenberg. Random Measures. Academic Press, third edition, 1983.
[12] J. Lamperti. On extreme order statistics. The Annals of Mathematical Statistics, 35(4):1726–1737, 1964.
[13] M. Leadbetter. Weak convergence of high level exceedances by a stationary sequence. Z. Wahrscheinlichkeitstheorie verw. Gebiete, 34:11–15, 1976.
[14] M. Leadbetter, G. Lindgren, and H. Rootzén. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, 1983.
[15] M. R. Leadbetter. On extreme values in stationary sequences. Probability Theory and Related Fields, 28:289–303, 1974. doi:10.1007/BF00532947.
[16] R. M. Loynes. Extreme values in uniformly mixing stationary stochastic processes. The Annals of Mathematical Statistics, 36(3):993–999, 1965.
[17] R. Meyer. A Poisson-type limit theorem for mixing sequences of dependent 'rare' events. The Annals of Probability, 1(3):480–483, 1973.
[18] R. Milne. Point processes. Lectures from the University of Western Australia (unpublished), 1992.
[19] G. F. Newell. Asymptotic extremes for m-dependent random variables. The Annals of Mathematical Statistics, 35(3):1322–1325, 1964.
[20] D. S. Silvestrov and J. L. Teugels. Limit theorems for extremes with random sample size. Advances in Applied Probability, 30(3):777–806, 1998.
[21] M. L. Straf. Weak convergence of stochastic processes with several parameters. Proc. Sixth Berkeley Symp. Math. Statist. Prob., 2:187–221, 1972.
[22] D. L. Thomas. On limiting distributions of a random number of dependent random variables. The Annals of Mathematical Statistics, 43(5):1719–1726, 1972.
[23] D. Vere-Jones. Some applications of probability generating functionals to the study of input-output streams. Journal of the Royal Statistical Society, Series B (Methodological), 30(2):321–333, 1968.
[24] G. S. Watson. Extreme values in samples from m-dependent stationary stochastic processes. The Annals of Mathematical Statistics, 25(4):798–800, 1954.
[25] I. Weissman. Multivariate extremal processes generated by independent non-identically distributed random variables. Journal of Applied Probability, 12(3):477–487, 1975.
[26] I. Weissman. On weak convergence of extremal processes. The Annals of Probability, 4(3):470–473, 1976.
[27] R. E. Welsch. A weak convergence theorem for order statistics from strong-mixing processes. The Annals of Mathematical Statistics, 42(5):1637–1646, 1971.
[28] M. Westcott. The probability generating functional. Journal of the Australian Mathematical Society, 14(4):448–466, 1972.