
The Marginal Distribution of
Compound Poisson INAR(1)
Processes
Christian H. Weiß
Department of Mathematics & Statistics
Helmut Schmidt University, Hamburg
Pedro Puig
Department of Mathematics
Universitat Autònoma de Barcelona
(Poisson)
INAR(1) Processes
Definition & Properties
Count Data Processes
Basic approach for real-valued processes:
stationary ARMA(p,q) model.
Let the innovations $(\epsilon_t)_{\mathbb{Z}}$ be white noise, then
$X_t = \alpha_1 X_{t-1} + \dots + \alpha_p X_{t-p} + \epsilon_t + \beta_1 \epsilon_{t-1} + \dots + \beta_q \epsilon_{t-q}$,
where $\alpha_1, \dots, \alpha_p, \beta_1, \dots, \beta_q \in \mathbb{R}$ are suitably chosen.
Numerous extensions ("ARMA alphabet soup", Holan et al. (2010))
→ SARIMA, ARFIMA, VARMA, GARCH, ...
Not applicable to count data processes: generally, $\alpha \cdot X \notin \mathbb{N}_0$.
Christian H. Weiß — Helmut Schmidt University, Hamburg
Count Data Processes
Several approaches of how to avoid “multiplication problem”.
Popular: Use appropriate thinning operations.
(Weiß, 2008)
In this talk, we exclusively consider models based on
binomial thinning operator (Steutel & van Harn, 1979):
$\alpha \circ X := \sum_{i=1}^{X} Y_i$, where the $Y_i$ are i.i.d. Bin(1, α),
i. e., α ◦ X ∼ Bin(X, α) and has range {0, . . . , X}.
In particular, E[α ◦ X] = E[α · X].
(≈ number of “survivors” from population of size X)
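The thinning operation can be illustrated by simulation. The following is a minimal sketch using only the Python standard library; the function name `binomial_thinning`, the seed and the parameter values are illustrative assumptions, not taken from the talk.

```python
import random

def binomial_thinning(x, alpha, rng):
    """Return one draw of alpha ∘ x: the sum of x i.i.d. Bin(1, alpha) indicators."""
    return sum(1 for _ in range(x) if rng.random() < alpha)

rng = random.Random(1)  # fixed seed, chosen arbitrarily for reproducibility
draws = [binomial_thinning(10, 0.3, rng) for _ in range(20000)]

# The sample mean should be close to alpha * x = 3, in line with E[α ◦ X] = E[α · X].
sample_mean = sum(draws) / len(draws)
```

Every draw lies in {0, ..., 10}, which is exactly the property that plain multiplication α · X lacks.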
INAR(1) Processes
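Let $(\epsilon_t)_{\mathbb{Z}}$ be i.i.d. with range $\mathbb{N}_0 = \{0, 1, \dots\}$,
denote $E[\epsilon_t] = \mu_\epsilon$, $V[\epsilon_t] = \sigma_\epsilon^2$. Let $\alpha \in (0; 1)$.
$(X_t)_{\mathbb{Z}}$ is referred to as an INAR(1) process (McKenzie, 1985) if
$X_t = \alpha \circ X_{t-1} + \epsilon_t$
plus appropriate independence assumptions, where
$X_t$ = population at time $t$,
$\alpha \circ X_{t-1}$ = survivors of time $t-1$,
$\epsilon_t$ = immigration.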
Properties:
Homogeneous Markov chain with
$P(X_t = k \mid X_{t-1} = l) = \sum_{j=0}^{\min\{k,l\}} \binom{l}{j} \alpha^j (1-\alpha)^{l-j} \cdot P(\epsilon_t = k-j)$.
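The transition probabilities are a convolution of a Bin(l, α) pmf with the innovations' pmf. A small sketch of this sum follows; the names `inar1_transition` and `poisson_pmf` are hypothetical, and Poisson innovations are assumed purely for illustration.

```python
from math import comb, exp, factorial

def inar1_transition(k, l, alpha, innov_pmf):
    """P(X_t = k | X_{t-1} = l): convolution of Bin(l, alpha) with the innovation pmf."""
    return sum(
        comb(l, j) * alpha**j * (1 - alpha) ** (l - j) * innov_pmf(k - j)
        for j in range(min(k, l) + 1)
    )

def poisson_pmf(lam):
    """Return the Poi(lam) pmf as a function of m (illustrative innovation law)."""
    return lambda m: exp(-lam) * lam**m / factorial(m)

pmf = poisson_pmf(1.0)
# Row l = 5 of the transition matrix, truncated at k = 59.
row = [inar1_transition(k, 5, 0.5, pmf) for k in range(60)]
```

Each row should sum to one up to truncation error, and for l = 0 the row reduces to the innovations' pmf.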
INAR(1) Processes
If INAR(1) process stationary (see below), then
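$\mathrm{pgf}_X(z) = \mathrm{pgf}_X(1 - \alpha + \alpha z) \cdot \mathrm{pgf}_\epsilon(z)$.
In particular, if $\mu_\epsilon, \sigma_\epsilon < \infty$, we have
$\mu_X = \frac{\mu_\epsilon}{1-\alpha}$, $\sigma_X^2 = \mu_X \cdot \frac{\sigma_\epsilon^2/\mu_\epsilon + \alpha}{1+\alpha}$.
(Note: $X_t$ equidispersed iff $\epsilon_t$ equidispersed.)
Autocorrelation function: $\rho_X(k) = \alpha^k$, i.e., AR(1)-type.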
For further properties and references, see Weiß (2008).
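The moment formulas can be evaluated directly. The sketch below is only a transcription of the stationary mean and variance stated above; the function name `inar1_moments` and the numerical inputs are assumptions made for illustration.

```python
def inar1_moments(mu_eps, var_eps, alpha):
    """Stationary mean and variance of an INAR(1) process from the innovations' moments."""
    mu_x = mu_eps / (1 - alpha)
    var_x = mu_x * (var_eps / mu_eps + alpha) / (1 + alpha)
    return mu_x, var_x

# Equidispersed innovations (mean = variance) give equidispersed observations.
mu_equi, var_equi = inar1_moments(1.0, 1.0, 0.5)

# Overdispersed innovations (variance > mean) give overdispersed observations.
mu_over, var_over = inar1_moments(1.0, 2.0, 0.5)
```

In the second case the dispersion index of the observations is smaller than that of the innovations (5/3 versus 2) but still exceeds one.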
Poisson INAR(1) Processes
Most popular instance of INAR(1) family:
Poisson INAR(1) model,
$X_t = \alpha \circ X_{t-1} + \epsilon_t$.
Here, innovations $(\epsilon_t)_{\mathbb{Z}}$ i.i.d. Poi(λ), such that $\mu_\epsilon = \sigma_\epsilon^2 = \lambda$.
Stationary marginal distribution:
also Poisson distribution, Poi($\frac{\lambda}{1-\alpha}$), because:
• additivity of Poisson distribution, and
• Poisson distribution invariant w.r.t. binomial thinning:
If X ∼ Poi(µ), then α ◦ X ∼ Poi(α · µ).
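The stationarity claim can be checked numerically: applying one step of the transition kernel to Poi(λ/(1−α)) should return the same pmf. The following sketch assumes illustrative values α = 0.5, λ = 1 and a truncation point of 60; all names are hypothetical.

```python
from math import comb, exp, factorial

def poisson_pmf(m, lam):
    return exp(-lam) * lam**m / factorial(m)

def transition(k, l, alpha, lam):
    """P(X_t = k | X_{t-1} = l) for Poisson INAR(1) with Poi(lam) innovations."""
    return sum(
        comb(l, j) * alpha**j * (1 - alpha) ** (l - j) * poisson_pmf(k - j, lam)
        for j in range(min(k, l) + 1)
    )

alpha, lam = 0.5, 1.0
mu_x = lam / (1 - alpha)   # claimed stationary mean
L = 60                     # truncation of the state space (assumption)

def one_step(k):
    """Probability of state k after one transition, starting from Poi(mu_x)."""
    return sum(poisson_pmf(l, mu_x) * transition(k, l, alpha, lam) for l in range(L))

# Largest deviation between the propagated pmf and Poi(mu_x) on k = 0, ..., 14.
max_gap = max(abs(one_step(k) - poisson_pmf(k, mu_x)) for k in range(15))
```

The gap should be at the level of floating-point rounding, since the truncated tail mass of Poi(2) beyond 60 is negligible.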
Poisson INAR(1) Processes
Poisson INAR(1) has equidispersed marginals.
Modifications to get overdispersed marginals:
• modify thinning operation (Weiß, 2008), or
• modify distribution of innovations, e. g.,
negative binomially (NB) distributed $\epsilon_t$'s.
In the following, we more generally consider
compound Poisson (CP) distributed innovations.
Compound Poisson
INAR(1) Processes
Definition & Properties
Compound Poisson Distribution
Y1, Y2, . . . i.i.d. with range N = {1, 2, . . .},
denote pgf as H(z)
(compounding distribution).
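Let N ∼ Poi(λ), independent of Y1, Y2, . . .
$\epsilon := Y_1 + \dots + Y_N$ is compound Poisson distributed:
$\epsilon \sim \mathrm{CP}(\lambda, H)$.
Then
$\mathrm{pgf}_\epsilon(z) = \exp\bigl(\lambda (H(z) - 1)\bigr)$.
More precisely:
$\epsilon \sim \mathrm{CP}_\nu(\lambda, H)$ if $H(z) = h_1 z + \dots + h_\nu z^\nu$ with $h_\nu > 0$,
$\epsilon \sim \mathrm{CP}_\infty(\lambda, H)$ if the $Y_i$ have infinite range.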
Compound Poisson Distribution
Properties:
• equidispersed iff ν = 1:
CP1(λ, H) = Poi(λ).
overdispersed iff ν > 1, e. g., . . .
• CPν (λ, H) with ν < ∞:
Hermite distribution of order ν.
• CPν (λ, H) with ν < ∞ and hx = 1/ν for x = 1, . . . , ν:
Poisson distribution of order ν, abbr. Poiν (λ).
• NB(n, π)-distribution:
$\mathrm{CP}_\infty(\lambda, H)$ with
$\lambda := -n \ln \pi$ and $H(z)$ being pgf of log-series distribution:
$H(z) = \frac{\ln\bigl(1 - (1-\pi) z\bigr)}{\ln \pi} = \sum_{k=1}^{\infty} \frac{(1-\pi)^k}{-k \ln \pi} z^k$.
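The compound Poisson representation of the NB distribution can be sanity-checked numerically. The sketch below computes the log-series weights h_k, truncated at an arbitrarily chosen K = 200, and compares the implied CP mean λ · H′(1) with n(1 − π)/π; names and parameter values are illustrative assumptions.

```python
from math import log

def logseries_h(k, pi):
    """Coefficient h_k of the log-series pgf H(z) in the NB(n, pi) representation."""
    return (1 - pi) ** k / (-k * log(pi))

n, pi = 1.0, 0.5
lam = -n * log(pi)
K = 200  # truncation point of the infinite series (assumption)

h = [logseries_h(k, pi) for k in range(1, K + 1)]
total_mass = sum(h)                                              # should be close to 1
cp_mean = lam * sum(k * hk for k, hk in enumerate(h, start=1))   # lambda * H'(1)
nb_mean = n * (1 - pi) / pi
```

With π = 0.5 the weights decay geometrically, so the truncation error at K = 200 is far below floating-point precision.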
Compound Poisson Distribution
Properties (cont.):
• Additivity:
If X1, X2 independent with Xi ∼ CPν (λi, Hi) (common ν),
then X1 + X2 CPν (λ, H)-distributed.
• Invariance w.r.t. binomial thinning:
If X ∼ CPν (λ, H), then α ◦ X ∼ CPν (η, G).
• Count data model parametrized by ν first factorial cumulants
closed under addition and under binomial thinning
iff it has CPν -distribution.
(Puig & Valero, 2007; Schweer & Weiß, 2014)
Compound Poisson INAR(1) Processes
INAR(1) process (Xt)t∈Z referred to as
CPINAR(1) process
if innovations $(\epsilon_t)_{\mathbb{Z}}$ i.i.d. $\mathrm{CP}_\nu(\lambda, H)$ (possibly ν = ∞).
ν = 1:
Poisson INAR(1) model.
Popular for ν = ∞:
NB-innovations, NB-INAR(1) model.
Heathcote (1966), Schweer & Weiß (2014):
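If $H'(1) < \infty$, then unique stationary marginal distribution
of $(X_t)_{t \in \mathbb{Z}}$ is $\mathrm{CP}_\nu(\eta, G)$, where
$\eta \bigl(G(z) - 1\bigr) = \lambda \sum_{i=0}^{\infty} \bigl(H(1 - \alpha^i + \alpha^i z) - 1\bigr)$.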
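As a quick consistency check (a worked special case added for illustration, not part of the original slides), take ν = 1, i.e. H(z) = z. The series then collapses to a geometric sum:

```latex
\eta\bigl(G(z)-1\bigr)
  = \lambda \sum_{i=0}^{\infty} \bigl((1-\alpha^i+\alpha^i z) - 1\bigr)
  = \lambda\,(z-1) \sum_{i=0}^{\infty} \alpha^i
  = \frac{\lambda}{1-\alpha}\,(z-1)
```

Hence η = λ/(1 − α) and G(z) = z, which recovers the Poi(λ/(1 − α)) marginal of the Poisson INAR(1) model.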
Compound Poisson
INAR(1) Processes
Marginal Distribution
Compound Poisson INAR(1): Marginal Distribution
See above:
If INAR(1)’s innovations $(\epsilon_t)_{\mathbb{Z}}$ CPν-distributed (possibly ν = ∞),
then INAR(1)’s observations (Xt)t∈Z also CPν -distributed,
with same compounding order ν.
E.g., for the purpose of likelihood computations:
How to compute efficiently and precisely
pmf of marginal distribution of (Xt)t∈Z?
If compounding order finite, ν < ∞,
then marginal distribution can be computed exactly!
Compound Poisson INAR(1): Marginal Distribution
Scheme:
(see Weiß & Puig (2015) for details & references)
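1. Compute parameters η and $g_1, \dots, g_\nu$ of marginal distribution:
(a) If innovations’ parameters λ and $h_1, \dots, h_\nu$ readily available,
solve following linear equations in η and $g_1, \dots, g_\nu$:
$\frac{\lambda}{\eta} - (1-\alpha) \cdot g_1 - \dots - (1-\alpha)^\nu \cdot g_\nu = 0$,
$h_k \cdot \frac{\lambda}{\eta} - (1-\alpha^k) \cdot g_k + \alpha^k \sum_{i=k+1}^{\nu} \binom{i}{k} (1-\alpha)^{i-k} \cdot g_i = 0$, $k = \nu, \dots, 2$,
$g_1 + \dots + g_\nu = 1$.
(b) If innovations’ CPν-distribution given via factorial cumulants $\kappa_{(1),\epsilon}, \dots, \kappa_{(\nu),\epsilon}$,
then first compute $\kappa_{(1),X}, \dots, \kappa_{(\nu),X}$ via $\kappa_{(n),X} = \frac{\kappa_{(n),\epsilon}}{1-\alpha^n}$,
afterwards expand $\eta \bigl(G(z) - 1\bigr) := \sum_{r=1}^{\nu} \frac{\kappa_{(r),X}}{r!} \cdot (z-1)^r$.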
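Step 1(a) can be sketched in code. The equations become linear once u := λ/η is treated as the unknown, so one possible approach (an assumption of this sketch, not necessarily the authors' implementation) is back-substitution for k = ν, ..., 2 followed by solving the remaining two equations for u and g_1. The function name `marginal_cp_parameters` and the example inputs are hypothetical.

```python
from math import comb

def marginal_cp_parameters(lam, h, alpha):
    """Given innovations CP_nu(lam, H) with h = [h_1, ..., h_nu], return (eta, [g_1, ..., g_nu])."""
    nu = len(h)
    # c[k] = g_k / u with u = lam / eta; indices 0 and 1 are unused.
    c = [0.0] * (nu + 1)
    for k in range(nu, 1, -1):
        tail = sum(comb(i, k) * (1 - alpha) ** (i - k) * c[i] for i in range(k + 1, nu + 1))
        c[k] = (h[k - 1] + alpha**k * tail) / (1 - alpha**k)
    # Combine the first equation with g_1 + ... + g_nu = 1 to obtain u.
    S = sum(c[2:])
    T = sum((1 - alpha) ** k * c[k] for k in range(2, nu + 1))
    u = (1 - alpha) / (1 + (1 - alpha) * S - T)
    g = [1 - u * S] + [u * c[k] for k in range(2, nu + 1)]
    return lam / u, g

# Illustrative second-order example: lam = 1, H(z) = 0.6 z + 0.4 z^2, alpha = 0.5.
eta, g = marginal_cp_parameters(1.0, [0.6, 0.4], 0.5)
marginal_mean = eta * sum(k * gk for k, gk in enumerate(g, start=1))
```

For this example the marginal mean equals λ · H′(1)/(1 − α) = 1.4/0.5 = 2.8, in agreement with µ_X = µ_ε/(1 − α); for ν = 1 the function returns η = λ/(1 − α) and g_1 = 1.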
Compound Poisson INAR(1): Marginal Distribution
Scheme (cont.):
2. Recursive scheme for computation of pmf P(X = k) ("Panjer recursion"):
$P(X = 0) = e^{-\eta}$,
$P(X = k) = \frac{\eta}{k} \cdot \sum_{j=1}^{\min\{k,\nu\}} j\, g_j \cdot P(X = k-j)$ for $k \geq 1$.
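The recursion of Step 2 translates almost line by line into code. The sketch below assumes that η and g_1, ..., g_ν are already available; the function name `cp_pmf` and the example parameters are illustrative.

```python
from math import exp

def cp_pmf(eta, g, kmax):
    """Return [P(X = 0), ..., P(X = kmax)] for X ~ CP_nu(eta, G) with g = [g_1, ..., g_nu]."""
    nu = len(g)
    p = [exp(-eta)]
    for k in range(1, kmax + 1):
        s = sum(j * g[j - 1] * p[k - j] for j in range(1, min(k, nu) + 1))
        p.append(eta / k * s)
    return p

# nu = 1 reduces to the Poisson pmf: p[k] = eta / k * p[k - 1].
p_poisson = cp_pmf(2.0, [1.0], 10)

# A second-order (Hermite-type) example with g = [0.7, 0.3].
p_cp2 = cp_pmf(2.0, [0.7, 0.3], 80)
```

Each probability needs at most ν earlier values, so computing the pmf up to kmax costs on the order of ν · kmax operations.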
In the case ν = ∞, however, Step 1 is the crucial point.
Idea:
Replace infinite compounding structure by finite one,
then continue with above Scheme.
Compound Poisson INAR(1): Marginal Distribution
Hermite-ν approximation, type 1:
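For innovations $\epsilon_t$ being $\mathrm{CP}_\infty(\lambda, H)$-distributed,
define ν-th order approximation $\mathrm{CP}_\nu(\tilde\lambda, \tilde H)$ by
$\tilde\lambda := \lambda$,
$\tilde h_1 := h_1, \dots, \tilde h_{\nu-1} := h_{\nu-1}$,
$\tilde h_\nu := 1 - \sum_{k=1}^{\nu-1} h_k$.
Then proceed according to Steps 1(a) and 2 of above Scheme.
Preserves $P(\epsilon = 0), \dots, P(\epsilon = \nu - 1)$,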
approximates remaining innovations’ probabilities.
1st order approximation = Poisson approximation.
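The type 1 construction can be written as a short function. The sketch below applies it to the log-series weights of NB(1, 0.5) innovations with ν = 3; the names `hermite_type1` and `logseries_h` and the parameter values are assumptions for illustration.

```python
from math import log

def logseries_h(k, pi):
    """Log-series weight h_k from the CP representation of NB(n, pi)."""
    return (1 - pi) ** k / (-k * log(pi))

def hermite_type1(lam, h, nu):
    """Type 1 approximation: keep h_1, ..., h_{nu-1}, put the remaining mass on h_nu."""
    head = [h(k) for k in range(1, nu)]
    return lam, head + [1.0 - sum(head)]

n, pi = 1.0, 0.5
lam = -n * log(pi)
lam_t, h_t = hermite_type1(lam, lambda k: logseries_h(k, pi), 3)
```

With ν = 1 the list `head` is empty and the function returns H̃(z) = z, i.e. the Poisson approximation mentioned above.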
Compound Poisson INAR(1): Marginal Distribution
Hermite-ν approximation, type 2:
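For innovations $\epsilon_t$ being $\mathrm{CP}_\infty(\lambda, H)$-distributed with mean $\mu_\epsilon$,
define ν-th order approximation $\mathrm{CP}_\nu(\tilde\lambda, \tilde H)$ by
$\tilde h_1 := \frac{h_1}{\sum_{k=1}^{\nu} h_k}, \dots, \tilde h_\nu := \frac{h_\nu}{\sum_{k=1}^{\nu} h_k}$,
$\tilde\lambda := \frac{\mu_\epsilon}{\sum_{k=1}^{\nu} k\, \tilde h_k}$.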
Then proceed according to Steps 1(a) and 2 of above Scheme.
Preserves observations’ mean µX ,
approximates all innovations’ probabilities.
1st order approximation = Poisson approximation.
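The type 2 construction renormalizes the first ν weights and then rescales λ so that the innovations' mean is kept. A sketch for NB(1, 0.5) innovations with ν = 3 follows; the names `hermite_type2` and `logseries_h` and the parameter values are illustrative assumptions.

```python
from math import log

def logseries_h(k, pi):
    """Log-series weight h_k from the CP representation of NB(n, pi)."""
    return (1 - pi) ** k / (-k * log(pi))

def hermite_type2(mu_eps, h, nu):
    """Type 2 approximation: renormalize h_1, ..., h_nu and match the innovations' mean."""
    head = [h(k) for k in range(1, nu + 1)]
    total = sum(head)
    h_t = [hk / total for hk in head]
    lam_t = mu_eps / sum(k * hk for k, hk in enumerate(h_t, start=1))
    return lam_t, h_t

n, pi = 1.0, 0.5
mu_eps = n * (1 - pi) / pi  # mean of the NB(n, pi) innovations
lam_t, h_t = hermite_type2(mu_eps, lambda k: logseries_h(k, pi), 3)
approx_mean = lam_t * sum(k * hk for k, hk in enumerate(h_t, start=1))
```

Because the approximating innovations keep the mean µ_ε, the observations' mean µ_X = µ_ε/(1 − α) is preserved as well; the price is that λ̃ differs from λ, so none of the innovations' probabilities is matched exactly.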
Compound Poisson INAR(1): Marginal Distribution
Performance evaluation:
NB-INAR(1) model with NB(n, π)-distributed innovations,
different levels of mean parameter n,
dispersion parameter π, autocorrelation parameter α.
Benchmark:
Numerically exact Markov chain approach (Weiß, 2010).
Concerning computational efficiency:
Computing time for approximations: < 0.1 s.
Computing time for MC approach: 70–100 s.
Compound Poisson INAR(1): Marginal Distribution
Overall quality of approximation via Kullback-Leibler divergence:
[Figure: KL divergence (log scale, 1E-10 to 1E-1) against approximation order ν = 5, ..., 20, for HA1 and HA2; n = 1, π = 0.5, α = 0.5.]
Compound Poisson INAR(1): Marginal Distribution
Relative errors (approx − true)/true for mean, dispersion index and skewness:
[Figure: relative errors (axis from −0.5 to 0) against ν = 5, ..., 20, for mean, dispersion index and skewness under HA1 and HA2; n = 1, π = 0.5, α = 0.5.]
Compound Poisson INAR(1): Marginal Distribution
Overall quality of type 1 approximation via KL divergence:
[Figure: KL divergence (log scale, 1E-10 to 1E-1) of HA1 against ν = 5, ..., 20, for n = 1 and n = 2; π = 0.5, α = 0.5.]
Compound Poisson INAR(1): Marginal Distribution
Overall quality of type 1 approximation via KL divergence:
[Figure: KL divergence (log scale, 1E-10 to 1E-1) of HA1 against ν = 5, ..., 20, for α = 0.25, 0.5 and 0.75; n = 1, π = 0.5.]
Compound Poisson INAR(1): Marginal Distribution
Overall quality of type 1 approximation via KL divergence:
[Figure: KL divergence (log scale, 1E-16 to 1E-1) of HA1 against ν = 5, ..., 20, for π = 0.75, 0.5 and 0.25; n = 1, α = 0.5.]
Conclusions
. . . and Future Research
Conclusions
• Type 1 approximation performs best concerning KL divergence, and concerning dispersion index or skewness.
• If one chooses ν ≥ 15, then difference between exact distribution and type 1 approximation negligible.
• Both approximations much faster than MC approach.
Future research:
empirical version of the Hermite approximation, e. g.,
based on first ν empirical factorial cumulants.
(→ semi-parametric estimation of INAR(1)’s marginal)
Thank You
for Your Interest!
Literature
Weiß & Puig (2015): The Marginal Distribution of Compound Poisson
INAR(1) Processes. Springer Proc. Math. Stat. 122, 397–405.
Heathcote (1966): A branching process allowing immigration. Jour. Roy.
Stat. Soc. B 28(1), 213–217.
Holan et al. (2010): The ARMA alphabet soup: A tour of ARMA model
variants. Statistics Surveys 4, 232–274.
McKenzie (1985): Some simple models for discrete variate time series. Water
Resources Bull. 21, 645–650.
Puig & Valero (2007): Characterization of count data distributions involving
additivity and binomial subsampling. Bernoulli 13, 544–555.
Schweer & Weiß (2014): Compound Poisson INAR(1) processes: stochastic
properties and testing for overdispersion. C. Stat. Data Anal. 77, 267–284.
Steutel & van Harn (1979): Discrete analogues of self-decomposability and
stability. Ann. Probab. 7, 893–899.
Weiß (2008): Thinning operations for modelling time series of counts—a
survey. Adv. Stat. Anal. 92, 319–341.
Weiß (2010): The INARCH(1) model for overdispersed time series of counts.
Comm. Stat. Sim. Comp. 39, 1269–1291.