Goodness-of-fit for Fitting Real Data with Hawkes Processes
Yuanda Chen
May, 2016
Abstract
In this article we discuss the theory of random time change and apply it to multivariate Hawkes
processes for assessing the goodness-of-fit.
1 Random Time Change
The theory of random time change, or random change of scale, as Papangelou (1972) calls it, characterizes the relationship between a simple point process and the homogeneous Poisson process with unit rate.
Papangelou’s description is quoted in Daley and Vere-Jones (2003) as:
(Daley and Vere-Jones, 2003, p. 258) Suppose that, starting at 0 say, we trace $\mathbb{R}_+$ in such a way that at the time we are passing position $t$ our speed is $1/\lambda^*(t)$, which can be $\infty$. (The value $\lambda^*(t)$ is determined by the past, i.e. by what happened up to $t$.) Then the time instants at which we shall meet all the points in $\mathbb{R}_+$ of the process form a homogeneous Poisson process.
Here $\lambda^*(t)$ is notation we will use frequently in this article; it is shorthand for $\lambda(t \mid \mathcal{F}^N_{t-})$. Intuitively, the statement means that we can rescale time by stretching and shrinking the time axis according to the value of the conditional intensity $\lambda^*(t)$. Whenever $\lambda^*(t) < 1$, points are expected to occur less frequently than under a unit-rate homogeneous Poisson process, and shrinking time brings distant points closer together; likewise, whenever $\lambda^*(t) > 1$, stretching time draws nearby points further apart. We need the following definition to state the theorem precisely.
1.1 Univariate Case
Definition 1.1. (Daley and Vere-Jones, 2003, p. 241, Lemma 7.2.V) Let $\{T_k\}_{k=1,2,\dots,n}$ be a simple point process with internal history $\mathcal{F}^N_{t-}$. The integrated intensity function
\[
\Lambda(t \mid \mathcal{F}^N_{t-}) = \int_0^t \lambda(s \mid \mathcal{F}^N_{s-})\,ds
\]
is called the compensator of the point process.
Remark 1.1. (Daley and Vere-Jones, 2003, p. 241, Lemma 7.2.V) The process $\Lambda(t \mid \mathcal{F}^N_{t-})$ is called the compensator because it is the process that must be subtracted from the increasing process $N(t)$ to make it a martingale. More precisely, the process
\[
M(t) = N(t) - \Lambda(t \mid \mathcal{F}^N_{t-})
\]
is an $\mathcal{F}^N_{t-}$-martingale: for any $s > t > 0$,
\[
\mathbb{E}\left[M(s) \mid \mathcal{F}^N_{t-}\right] = M(t),
\]
which holds because
\[
\mathbb{E}[dM(t)] = \mathbb{E}[dN(t)] - \mathbb{E}[d\Lambda(t \mid \mathcal{F}^N_{t-})] = \mathbb{E}[dN(t)] - \mathbb{E}[\lambda^*(t)\,dt] = 0.
\]
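As a quick numerical sanity check of the compensator property (a minimal sketch assuming NumPy; the deterministic intensity $\lambda(t) = 1 + \sin t$ is chosen purely for illustration), one can verify by simulation that $\mathbb{E}[N(T)] \approx \Lambda(T)$, i.e. that $N(t) - \Lambda(t)$ is mean-zero:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_inhomogeneous_poisson(T, lam, lam_max):
    """Simulate an inhomogeneous Poisson process on [0, T] by thinning."""
    t, points = 0.0, []
    while True:
        t += rng.exponential(1.0 / lam_max)   # candidate point from a rate-lam_max process
        if t > T:
            break
        if rng.uniform() < lam(t) / lam_max:  # accept with probability lam(t)/lam_max
            points.append(t)
    return np.array(points)

T = 10.0
lam = lambda t: 1.0 + np.sin(t)            # example intensity
Lam = lambda t: t - np.cos(t) + 1.0        # its compensator

counts = [len(simulate_inhomogeneous_poisson(T, lam, 2.0)) for _ in range(5000)]
print(np.mean(counts), Lam(T))             # both should be close to Lambda(T) ~ 11.84
```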
Theorem 1.1. (Daley and Vere-Jones, 2003, p. 258, Theorem 7.4.I) Let $N(t)$ be a simple point process with natural filtration $\mathcal{F}^N_{t-}$, bounded, strictly positive conditional intensity $\lambda^*(t) = \lambda(t \mid \mathcal{F}^N_{t-})$, and compensator $\Lambda^*(t) = \int_0^t \lambda^*(s)\,ds$ that is not almost surely bounded. Define $\tau = \Lambda^*(t)$. Then under the random time change $t \mapsto \tau$, the transformed process defined by
\[
\tilde{N}(\tau) = N\left(\Lambda^{*-1}(\tau)\right) = N(t)
\]
is a homogeneous Poisson process with unit rate.
Proof. Let $0 = t_0 < t_1 < \cdots < t_n \le T$ be a realization of the simple point process $N(t)$. Define
\[
v_i = \int_{t_{i-1}}^{t_i} \lambda^*(s)\,ds \qquad (1)
\]
and a Jacobian matrix $J$ whose element in row $i$ and column $j$ is
\[
J_{ij} = \frac{dv_i}{dt_j} =
\begin{cases}
\lambda^*(t_i) & \text{if } i = j,\\
-\lambda^*(t_{i-1}) & \text{if } i = j + 1,\\
0 & \text{otherwise.}
\end{cases}
\]
Notice that $J$ is lower triangular, so its determinant is the product of its diagonal elements:
\[
|J| = \prod_{k=1}^{n} \lambda^*(t_k).
\]
The joint density function of $\{T_k\}_{k=1,2,\dots,n}$ is given by Rubin (1972):
\[
f_{T_1,\dots,T_n}(t_1,\dots,t_n) = \prod_{k=1}^{n} \lambda^*(t_k) \cdot \prod_{k=1}^{n} \exp\left\{-\int_{t_{k-1}}^{t_k} \lambda^*(s)\,ds\right\}.
\]
By the change of variables formula (see Casella and Berger, 2002, p. 158, (4.3.2) and p. 185, (4.6.7)), the joint density function of the random variables
\[
V_i = \int_{T_{i-1}}^{T_i} \lambda^*(s)\,ds
\]
is given by
\begin{align*}
f_{V_1,\dots,V_n}(v_1,\dots,v_n)
&= f_{T_1,\dots,T_n}(t_1,\dots,t_n)\,|J|^{-1}\\
&= \prod_{k=1}^{n} \lambda^*(t_k) \cdot \prod_{k=1}^{n} \exp\left\{-\int_{t_{k-1}}^{t_k} \lambda^*(s)\,ds\right\} \cdot \left[\prod_{k=1}^{n} \lambda^*(t_k)\right]^{-1}\\
&= \prod_{k=1}^{n} \exp\left\{-\int_{t_{k-1}}^{t_k} \lambda^*(s)\,ds\right\}\\
&= \prod_{k=1}^{n} e^{-v_k}.
\end{align*}
Notice that $V_1,\dots,V_n$ are the interarrival times of the process $\tilde{N}$, and by Lemma 1.2 given below, the expression above indicates that $V_1,\dots,V_n$ are mutually independent, each with the density of an exponential distribution with parameter 1.
Lemma 1.2. (Casella and Berger, 2002, p. 184, Theorem 4.6.11) The random variables $X_1, \dots, X_n$ are mutually independent if and only if there exist functions $f_i(x_i)$ for $i = 1, 2, \dots, n$, such that the joint density function $f(\cdot)$ of $(X_1, \dots, X_n)$ can be written as
\[
f(x_1, \dots, x_n) = f_1(x_1) \cdot f_2(x_2) \cdots f_n(x_n).
\]
Figure 1 illustrates how the random time change works. In the top panel, a simulated inhomogeneous Poisson process is shown, with its intensity function drawn as the solid curve and the simulated points as cross marks. In the bottom panel, the compensator defined in Definition 1.1, $\Lambda(t) = \int_0^t \lambda(s)\,ds = t - \cos(t) + 1$, is shown as the solid line, with the x-axis being $t$ and the y-axis the rescaled time $\tau = \Lambda(t)$ (see Theorem 1.1). The cross marks on the x-axis and y-axis are the occurrence times of the process, in the original time scale $t$ and in the rescaled time $\tau$, respectively. The distances between consecutive points on the x-axis and y-axis are the interarrival times $\{w_k\}_{k=1,2,\dots}$ and $\{v_k\}_{k=1,2,\dots}$ as defined by (1), respectively. This example clearly shows the shrinking and stretching described at the beginning of this section. For example, the interarrival time between the last two points is shrunk after the time change because of the low intensity during that period, and the shrinking is visualized by the decrease of the distance between the two points when transformed from the x-axis to the y-axis. As expected, the points are more evenly spread on the y-axis after applying the random time change.
Figure 1: An illustrative example of the random time change applied to a simulated inhomogeneous Poisson process with intensity function $\lambda(t) = 1 + \sin(t)$. In the top panel, the intensity function $\lambda(t)$ and the simulated points are shown as the solid curve and cross marks, respectively. In the bottom panel, the same simulated points are shown on the x-axis, which represents the original time scale $t$, and their rescaled times $\tau$ are shown on the y-axis, via the random time change $\tau = \Lambda(t)$, where $\Lambda(t)$ is the compensator and is shown as the solid curve.
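The construction in Figure 1 is easy to reproduce numerically. The following sketch (assuming NumPy and SciPy; the function names are illustrative) simulates the process with $\lambda(t) = 1 + \sin(t)$ by thinning, applies the time change $\tau = \Lambda(t) = t - \cos(t) + 1$, and checks the rescaled interarrival times against the unit-rate exponential distribution, as Theorem 1.1 predicts:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def simulate_by_thinning(T, lam, lam_max):
    """Simulate an inhomogeneous Poisson process on [0, T] by thinning."""
    t, pts = 0.0, []
    while True:
        t += rng.exponential(1.0 / lam_max)
        if t > T:
            break
        if rng.uniform() < lam(t) / lam_max:
            pts.append(t)
    return np.array(pts)

lam = lambda t: 1.0 + np.sin(t)        # intensity of Figure 1
Lam = lambda t: t - np.cos(t) + 1.0    # its compensator, Lambda(0) = 0

t_k = simulate_by_thinning(500.0, lam, 2.0)  # occurrence times on the original scale
tau_k = Lam(t_k)                             # rescaled times tau = Lambda(t)
w_k = np.diff(t_k, prepend=0.0)              # original interarrival times
v_k = np.diff(tau_k, prepend=0.0)            # rescaled interarrival times

# Under Theorem 1.1 the v_k should look like i.i.d. Exp(1) draws.
print(stats.kstest(v_k, "expon"))
```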
1.2 Multivariate Case
Proposition 1.3. (Daley and Vere-Jones, 2003, p. 265, Proposition 7.4.VI(a)) Let $\{N^m(t)\}_{m=1,2,\dots,M}$ be an $M$-variate point process defined on $[0, \infty)$ with stochastic intensities $\lambda^{m*}(t)$ that are strictly positive, so that the compensators
\[
\Lambda^{m*}(t) = \int_0^t \lambda^{m*}(s)\,ds \to \infty
\]
as $t \to \infty$. Define $\tau^m = \Lambda^{m*}(t)$. Then under the simultaneous random time changes $t \mapsto \tau^m$, the transformed processes defined by
\[
\tilde{N}^m(\tau^m) = N^m\left((\Lambda^{m*})^{-1}(\tau^m)\right) = N^m(t)
\]
are independent homogeneous Poisson processes, each having unit rate.
For the proof, please refer to Aalen and Hoem (1978, Section 4.2, pp. 91-93).
2 Goodness of Fit
2.1 Rescaled Interarrival Times for Hawkes Processes
By applying Proposition 1.3 to the multivariate Hawkes process with exponential decay kernels, we derive the following result, which will be used in the goodness-of-fit analysis.
Proposition 2.1. Consider an $M$-variate Hawkes process with exponential decay kernels. Let $\{T_k^m\}_{k=1,2,\dots}$ be the arrival times of events in the $m$-th dimension, for $m = 1, 2, \dots, M$. Denote
\[
V_k^m = \sum_{n=1}^{M} \frac{\alpha_{mn}}{\beta_{mn}} \left[ \left(1 - e^{-\beta_{mn}(T_k^m - T_{k-1}^m)}\right) R_{mn}(k-1) + \sum_{\{i:\, T_{k-1}^m \le T_i^n < T_k^m\}} \left(1 - e^{-\beta_{mn}(T_k^m - T_i^n)}\right) \right] + \mu_m\left(T_k^m - T_{k-1}^m\right),
\]
where $R_{mn}(k)$ is defined recursively as
\[
R_{mn}(k) = e^{-\beta_{mn}(T_k^m - T_{k-1}^m)} R_{mn}(k-1) + \sum_{\{i:\, T_{k-1}^m \le T_i^n < T_k^m\}} e^{-\beta_{mn}(T_k^m - T_i^n)}
\]
with initial condition $R_{mn}(0) = 0$. Then $\{V_k^1\}_{k=1,2,\dots}, \{V_k^2\}_{k=1,2,\dots}, \dots, \{V_k^M\}_{k=1,2,\dots}$ are $M$ independent sequences of independent, identically distributed exponential random variables with unit rate.
Proof. First consider a bivariate Hawkes process, where $M = 2$. By Proposition 1.3, we have
\begin{align*}
V_k^1 &= \Lambda^1(T_k^1) - \Lambda^1(T_{k-1}^1) = \int_{T_{k-1}^1}^{T_k^1} \lambda^1(s)\,ds\\
&= \mu_1\left(T_k^1 - T_{k-1}^1\right) + \alpha_{11} \int_{T_{k-1}^1}^{T_k^1} \sum_{\{i:\, T_i^1 < s\}} e^{-\beta_{11}(s - T_i^1)}\,ds + \alpha_{12} \int_{T_{k-1}^1}^{T_k^1} \sum_{\{i:\, T_i^2 < s\}} e^{-\beta_{12}(s - T_i^2)}\,ds\\
&= \mu_1\left(T_k^1 - T_{k-1}^1\right) + \alpha_{11} \int_{T_{k-1}^1}^{T_k^1} \left[ \sum_{\{i:\, T_i^1 < T_{k-1}^1\}} e^{-\beta_{11}(s - T_i^1)} + e^{-\beta_{11}(s - T_{k-1}^1)} \right] ds\\
&\qquad + \alpha_{12} \int_{T_{k-1}^1}^{T_k^1} \sum_{\{i:\, T_i^2 < T_{k-1}^1\}} e^{-\beta_{12}(s - T_i^2)}\,ds + \alpha_{12} \int_{T_{k-1}^1}^{T_k^1} \sum_{\{i:\, T_{k-1}^1 \le T_i^2 < s\}} e^{-\beta_{12}(s - T_i^2)}\,ds\\
&= \mu_1\left(T_k^1 - T_{k-1}^1\right) + \frac{\alpha_{11}}{\beta_{11}} \sum_{\{i:\, T_i^1 < T_{k-1}^1\}} \left[ e^{-\beta_{11}(T_{k-1}^1 - T_i^1)} - e^{-\beta_{11}(T_k^1 - T_i^1)} \right] + \frac{\alpha_{12}}{\beta_{12}} \sum_{\{i:\, T_i^2 < T_{k-1}^1\}} \left[ e^{-\beta_{12}(T_{k-1}^1 - T_i^2)} - e^{-\beta_{12}(T_k^1 - T_i^2)} \right]\\
&\qquad + \frac{\alpha_{11}}{\beta_{11}} \left[ 1 - e^{-\beta_{11}(T_k^1 - T_{k-1}^1)} \right] + \frac{\alpha_{12}}{\beta_{12}} \sum_{\{i:\, T_{k-1}^1 \le T_i^2 < T_k^1\}} \left[ 1 - e^{-\beta_{12}(T_k^1 - T_i^2)} \right].
\end{align*}
(For the $\alpha_{11}$ term we used the fact that there are no dimension-1 events strictly between $T_{k-1}^1$ and $T_k^1$, so the only new contribution comes from the event at $T_{k-1}^1$ itself.) Define
\[
R_{11}(k) = \sum_{\{i:\, T_i^1 < T_k^1\}} e^{-\beta_{11}(T_k^1 - T_i^1)}
\qquad \text{and} \qquad
R_{12}(k) = \sum_{\{i:\, T_i^2 < T_k^1\}} e^{-\beta_{12}(T_k^1 - T_i^2)};
\]
then recursively
\[
R_{11}(k) = e^{-\beta_{11}(T_k^1 - T_{k-1}^1)} \left[ R_{11}(k-1) + 1 \right]
\]
and
\[
R_{12}(k) = e^{-\beta_{12}(T_k^1 - T_{k-1}^1)} R_{12}(k-1) + \sum_{\{i:\, T_{k-1}^1 \le T_i^2 < T_k^1\}} e^{-\beta_{12}(T_k^1 - T_i^2)}
\]
for $k \ge 2$, with initial conditions
\[
R_{11}(1) = 0
\qquad \text{and} \qquad
R_{12}(1) = \sum_{\{i:\, T_i^2 < T_1^1\}} e^{-\beta_{12}(T_1^1 - T_i^2)}.
\]
In fact, this is the same as saying
\[
R_{mn}(k) = e^{-\beta_{mn}(T_k^m - T_{k-1}^m)} R_{mn}(k-1) + \sum_{\{i:\, T_{k-1}^m \le T_i^n < T_k^m\}} e^{-\beta_{mn}(T_k^m - T_i^n)}
\]
with $R_{mn}(0) = 0$. Using this notation, the rescaled interarrival times can be simplified as
\begin{align*}
V_k^1 &= \mu_1\left(T_k^1 - T_{k-1}^1\right)\\
&\quad + \frac{\alpha_{11}}{\beta_{11}} \left[ 1 - e^{-\beta_{11}(T_k^1 - T_{k-1}^1)} \right] R_{11}(k-1) + \frac{\alpha_{12}}{\beta_{12}} \left[ 1 - e^{-\beta_{12}(T_k^1 - T_{k-1}^1)} \right] R_{12}(k-1)\\
&\quad + \frac{\alpha_{11}}{\beta_{11}} \left[ 1 - e^{-\beta_{11}(T_k^1 - T_{k-1}^1)} \right] + \frac{\alpha_{12}}{\beta_{12}} \sum_{\{i:\, T_{k-1}^1 \le T_i^2 < T_k^1\}} \left[ 1 - e^{-\beta_{12}(T_k^1 - T_i^2)} \right].
\end{align*}
Similarly,
\begin{align*}
V_k^2 &= \mu_2\left(T_k^2 - T_{k-1}^2\right)\\
&\quad + \frac{\alpha_{21}}{\beta_{21}} \left[ 1 - e^{-\beta_{21}(T_k^2 - T_{k-1}^2)} \right] R_{21}(k-1) + \frac{\alpha_{22}}{\beta_{22}} \left[ 1 - e^{-\beta_{22}(T_k^2 - T_{k-1}^2)} \right] R_{22}(k-1)\\
&\quad + \frac{\alpha_{21}}{\beta_{21}} \sum_{\{i:\, T_{k-1}^2 \le T_i^1 < T_k^2\}} \left[ 1 - e^{-\beta_{21}(T_k^2 - T_i^1)} \right] + \frac{\alpha_{22}}{\beta_{22}} \left[ 1 - e^{-\beta_{22}(T_k^2 - T_{k-1}^2)} \right],
\end{align*}
where
\[
R_{21}(k) = \sum_{\{i:\, T_i^1 < T_k^2\}} e^{-\beta_{21}(T_k^2 - T_i^1)}
\qquad \text{and} \qquad
R_{22}(k) = \sum_{\{i:\, T_i^2 < T_k^2\}} e^{-\beta_{22}(T_k^2 - T_i^2)}
\]
follow from the recursive definition of $R_{mn}(k)$. In general,
\begin{align*}
V_k^m &= \Lambda^m(T_k^m) - \Lambda^m(T_{k-1}^m) = \int_{T_{k-1}^m}^{T_k^m} \lambda^m(s)\,ds\\
&= \int_{T_{k-1}^m}^{T_k^m} \left[ \mu_m + \sum_{n=1}^{M} \alpha_{mn} \sum_{\{i:\, T_i^n < s\}} e^{-\beta_{mn}(s - T_i^n)} \right] ds\\
&= \mu_m\left(T_k^m - T_{k-1}^m\right) + \sum_{n=1}^{M} \alpha_{mn} \int_{T_{k-1}^m}^{T_k^m} \left[ \sum_{\{i:\, T_i^n < T_{k-1}^m\}} e^{-\beta_{mn}(s - T_i^n)} + \sum_{\{i:\, T_{k-1}^m \le T_i^n < s\}} e^{-\beta_{mn}(s - T_i^n)} \right] ds\\
&= \mu_m\left(T_k^m - T_{k-1}^m\right) + \sum_{n=1}^{M} \frac{\alpha_{mn}}{\beta_{mn}} \sum_{\{i:\, T_i^n < T_{k-1}^m\}} \left[ e^{-\beta_{mn}(T_{k-1}^m - T_i^n)} - e^{-\beta_{mn}(T_k^m - T_i^n)} \right] + \sum_{n=1}^{M} \frac{\alpha_{mn}}{\beta_{mn}} \sum_{\{i:\, T_{k-1}^m \le T_i^n < T_k^m\}} \left[ 1 - e^{-\beta_{mn}(T_k^m - T_i^n)} \right]\\
&= \sum_{n=1}^{M} \frac{\alpha_{mn}}{\beta_{mn}} \left[ \left(1 - e^{-\beta_{mn}(T_k^m - T_{k-1}^m)}\right) R_{mn}(k-1) + \sum_{\{i:\, T_{k-1}^m \le T_i^n < T_k^m\}} \left(1 - e^{-\beta_{mn}(T_k^m - T_i^n)}\right) \right] + \mu_m\left(T_k^m - T_{k-1}^m\right),
\end{align*}
which is the expression stated in the proposition.
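Before turning to the graphical diagnostics, it may help to see Proposition 2.1 in code. The sketch below (assuming NumPy; the function and variable names mirror the notation above and are otherwise illustrative) computes the rescaled interarrival times $V_k^m$ in a single pass per dimension using the recursion for $R_{mn}(k)$:

```python
import numpy as np

def rescaled_interarrivals(times, mu, alpha, beta):
    """Compute the V_k^m of Proposition 2.1 for an M-variate exponential Hawkes process.

    times : list of M increasing arrays, times[m] = (T_1^m, T_2^m, ...)
    mu    : array of shape (M,), base intensities
    alpha, beta : arrays of shape (M, M), excitation jump sizes and decay rates
    """
    M = len(times)
    V = []
    for m in range(M):
        Tm = np.concatenate(([0.0], times[m]))   # prepend T_0^m = 0
        R = np.zeros(M)                          # R_mn(k-1), started at R_mn(0) = 0
        Vm = np.empty(len(times[m]))
        for k in range(1, len(Tm)):
            dt = Tm[k] - Tm[k - 1]
            v = mu[m] * dt
            for n in range(M):
                # events of dimension n falling in [T_{k-1}^m, T_k^m)
                Tn = times[n]
                mask = (Tn >= Tm[k - 1]) & (Tn < Tm[k])
                decays = np.exp(-beta[m, n] * (Tm[k] - Tn[mask]))
                v += alpha[m, n] / beta[m, n] * (
                    (1.0 - np.exp(-beta[m, n] * dt)) * R[n] + np.sum(1.0 - decays)
                )
                # update R_mn(k-1) -> R_mn(k) for the next step
                R[n] = np.exp(-beta[m, n] * dt) * R[n] + np.sum(decays)
            Vm[k - 1] = v
        V.append(Vm)
    return V   # each V[m] should be i.i.d. Exp(1) if the model is well specified
```

Setting all $\alpha_{mn} = 0$ recovers the Poisson case discussed below, in which $V_k^m$ reduces to $\mu_m(T_k^m - T_{k-1}^m)$.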
2.2 Q-Q Plot
We now assess how well the Hawkes process fits the data in comparison with the Poisson process. In statistics, a Q-Q plot (where both Q's stand for quantile) is a graphical method for comparing two probability distributions by plotting their quantiles against each other. Before we can describe a Q-Q plot precisely, we need the following definitions.
For any $0 < p < 1$, the $p$ quantile of a set of data is a number, denoted $Q(p)$, on the scale of the data that divides the data into two groups, so that a fraction $p$ of the observations fall below it and a fraction $1 - p$ fall above it (?, pp. 11-12). More specifically, we define the following empirical and theoretical quantiles.
Definition 2.1. (?, p. 12) For a set of data $y_i$, $i = 1, 2, \dots, n$, denote the sorted data by $y_{(i)}$, with $y_{(1)}$ the smallest and $y_{(n)}$ the largest. Then for each $i$ and $p_i = (i - 0.5)/n$, we define the $p_i$ empirical quantile to be
\[
Q_e(p_i) = y_{(i)},
\]
i.e. the quantiles $Q_e(p_i)$ of the data are just the ordered data values themselves, $y_{(i)}$.
Definition 2.2. (?, pp. 193-194) For a random variable $Y$ with theoretical cumulative distribution function (CDF) $F(y)$, the $p$ theoretical quantile of $F$, where $0 < p < 1$, is a number $Q_t(p)$ such that
\[
F(Q_t(p)) = p,
\]
or equivalently,
\[
Q_t(p) = F^{-1}(p).
\]
In a Q-Q plot that compares a set of data $y_i$, $i = 1, 2, \dots, n$, against a known distribution $F(y)$, $Q_e(p_i)$ is plotted against $Q_t(p_i)$, where $p_i = (i - 0.5)/n$. When the data $y_i$ are actually drawn from the distribution $F(y)$, the points will lie approximately on the line $y = x$.
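As a concrete sketch of these definitions (assuming NumPy; the helper name `qq_points` is illustrative), the pairs plotted in a Q-Q plot can be computed as follows:

```python
import numpy as np

def qq_points(data, theoretical_quantile):
    """Return (Q_t(p_i), Q_e(p_i)) pairs for p_i = (i - 0.5)/n, i = 1..n."""
    y = np.sort(np.asarray(data))                   # empirical quantiles Q_e(p_i) = y_(i)
    p = (np.arange(1, len(y) + 1) - 0.5) / len(y)
    return theoretical_quantile(p), y

# example: compare standard-exponential data against Exp(1), where Q_t(p) = -ln(1 - p)
rng = np.random.default_rng(2)
qt, qe = qq_points(rng.exponential(size=1000), lambda p: -np.log1p(-p))
```

When `data` really come from the distribution whose quantile function is passed in, the returned pairs fall close to the line $y = x$.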
For a multivariate Hawkes process, the quantities $V_k^m$ defined in Proposition 2.1 should follow an exponential distribution with parameter $\lambda = 1$. That is, assuming $N(t)$ is a multivariate Hawkes process,
\[
V_k^m = \sum_{n=1}^{M} \frac{\alpha_{mn}}{\beta_{mn}} \left[ \left(1 - e^{-\beta_{mn}(T_k^m - T_{k-1}^m)}\right) R_{mn}(k-1) + \sum_{\{i:\, T_{k-1}^m \le T_i^n < T_k^m\}} \left(1 - e^{-\beta_{mn}(T_k^m - T_i^n)}\right) \right] + \mu_m\left(T_k^m - T_{k-1}^m\right)
\]
is a sequence of independent, identically distributed exponential random variables with unit rate for each $m = 1, 2, 3, 4$, where $R_{mn}(k)$ is defined recursively as
\[
R_{mn}(k) = e^{-\beta_{mn}(T_k^m - T_{k-1}^m)} R_{mn}(k-1) + \sum_{\{i:\, T_{k-1}^m \le T_i^n < T_k^m\}} e^{-\beta_{mn}(T_k^m - T_i^n)}
\]
with initial condition $R_{mn}(0) = 0$, and where $T_k^m$ is the time of the $k$-th event in $N^m(t)$.
For Poisson processes, Proposition 2.1 still applies, because a homogeneous Poisson process can be regarded as a multivariate Hawkes process with base intensities $\nu_m$ and jump sizes $\alpha_{mn} = 0$ for $m, n = 1, 2, 3, 4$, in which case $V_k^m$ reduces to $\nu_m\left(T_k^m - T_{k-1}^m\right)$.
To get a clearer view of the extreme values, we make the Q-Q plot on a log scale. That is, we plot the log of the empirical quantiles, $\log_{10} Q_e(p_k)$, of $V_k^m$ against the log of the theoretical quantiles, $\log_{10} Q_t(p_k)$, of the standard exponential distribution, for each $m$, where $Q_t(p_k) = -\ln(1 - p_k)$.
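A sketch of how one such log-scale panel can be drawn (assuming NumPy and Matplotlib; `v_hawkes` and `v_poisson` stand for the rescaled interarrival times $V_k^m$ obtained from the two fitted models and are replaced here by synthetic placeholders):

```python
import numpy as np
import matplotlib.pyplot as plt

def log_qq(v, **kw):
    """Plot log10 empirical quantiles of v against log10 Exp(1) quantiles."""
    y = np.sort(v)
    p = (np.arange(1, len(y) + 1) - 0.5) / len(y)
    qt = -np.log1p(-p)                       # theoretical Exp(1) quantiles
    plt.plot(np.log10(qt), np.log10(y), linestyle="none", **kw)

rng = np.random.default_rng(3)
# Placeholders: in practice these are the V_k^m of Proposition 2.1 under each fitted model.
v_hawkes = rng.exponential(size=2000)                       # well-specified model: ~ Exp(1)
v_poisson = rng.lognormal(mean=0.0, sigma=1.0, size=2000)   # misspecified model

log_qq(v_hawkes, marker="+", color="blue", label="Hawkes")
log_qq(v_poisson, marker=".", color="black", label="Poisson")
lim = plt.xlim()
plt.plot(lim, lim, color="red")              # reference line y = x
plt.xlabel("log10 theoretical quantile")
plt.ylabel("log10 empirical quantile")
plt.legend()
plt.show()
```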
Figure 2 shows the Q-Q plots of the same data fitted to both the Hawkes process and the Poisson process. The data used are for the stock INTC traded on NASDAQ from 11:00 a.m. to 12:00 p.m. on 2012-06-21. The Hawkes process clearly fits the data better than the Poisson process does, as the empirical and theoretical quantiles are better matched when the data are fitted to a Hawkes process.
References

Aalen, O. O. and J. M. Hoem (1978). Random time changes for multivariate counting processes. Scandinavian Actuarial Journal, 1978(2):81–101.

Casella, G. and R. Berger (2002). Statistical Inference. Duxbury Advanced Series in Statistics and Decision Sciences. Thomson Learning.

Daley, D. and D. Vere-Jones (2003). An Introduction to the Theory of Point Processes, Volume 1. Springer.

Papangelou, F. (1972). Integrability of expected increments of point processes and a related random change of scale. Trans. Amer. Math. Soc., 165:483–506.

Rubin, I. (1972). Regular point processes and their detection. IEEE Trans. Inform. Theory, 18(5):547–557.
Figure 2: The Q-Q plots for the stock INTC traded on NASDAQ from 11:00 a.m. to 12:00 p.m. on 2012-06-21. Each panel shows the Q-Q plot of the empirical quantiles (as defined in Definition 2.1) of $V_k^m$, calculated for $N^m(t)$ as described in Proposition 2.1, against the theoretical quantiles (as defined in Definition 2.2) of a standard exponential distribution, on a log scale, for $m = 1, 2, 3, 4$. The Q-Q plot for the Hawkes process is shown with blue plus marks and that for the Poisson process with black dots. The red line is the diagonal $y = x$, and the $0.1, 0.2, \dots, 0.9$ quantiles are marked with red crosses.