Parameter Estimation
Chapters 13-15
Stat 477 - Loss Models
Brian Hartman - BYU
Methods for parameter estimation
Methods for estimating parameters in a parametric model:
method of moments
matching of quantiles (or percentiles)
maximum likelihood estimation
full/complete, individual data
complete, grouped data
truncated or censored data
Bayes estimation
Method of moments
Assume we observe values of the random sample X1 , . . . , Xn from a
population with common distribution function F (x|θ) where
θ = (θ1 , . . . , θp ) is a vector of p parameters.
The observed sample values will be denoted by x1 , . . . , xn .
Denote the k-th raw moment by $\mu_k'(\theta) = E(X^k \mid \theta)$. Denote the empirical
(sample) moments by
$$\hat{\mu}_k' = \frac{1}{n} \sum_{j=1}^{n} x_j^k.$$
The method of moments is fairly straightforward: equate the first p raw
moments to the corresponding p empirical moments and solve the
resulting system of simultaneous equations:
n
µ0k (θ̂)
1X k
=
xj .
n
j=1
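As a concrete sketch (my own illustration, not part of the slides), the moment equations can be solved in closed form for a Gamma distribution with shape $\alpha$ and scale $\theta$: equating $\alpha\theta = \hat{\mu}_1'$ and $\alpha(\alpha+1)\theta^2 = \hat{\mu}_2'$ gives $\hat{\theta} = (\hat{\mu}_2' - \hat{\mu}_1'^2)/\hat{\mu}_1'$ and $\hat{\alpha} = \hat{\mu}_1'/\hat{\theta}$.

```python
def gamma_method_of_moments(xs):
    """Method-of-moments estimates for a Gamma(shape=alpha, scale=theta).

    Equating alpha*theta to the first empirical raw moment and
    alpha*(alpha + 1)*theta**2 to the second gives the closed form below.
    """
    n = len(xs)
    m1 = sum(xs) / n                 # first empirical raw moment
    m2 = sum(x * x for x in xs) / n  # second empirical raw moment
    theta = (m2 - m1 ** 2) / m1      # m2 - m1**2 is the biased sample variance
    alpha = m1 / theta
    return alpha, theta
```

For example, a sample with empirical moments $\hat{\mu}_1' = 2.5$ and $\hat{\mu}_2' = 7.5$ (such as 1, 2, 3, 4) yields $\hat{\theta} = 0.5$ and $\hat{\alpha} = 5$.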
Illustrations
Example 1: For a Normal distribution, derive expressions for the
method of moment estimators for the parameters µ and σ 2 .
Example 2: For a Gamma distribution with a shape parameter α and
a scale parameter θ, derive expressions for their method of moment
estimators.
Example 3: For a Pareto distribution, derive expressions for the
method of moment estimators for the parameters α and θ.
SOA Example #6
For a sample of dental claims x1 , x2 , . . . , x10 you are given:
$\sum x_i = 3860$ and $\sum x_i^2 = 4{,}574{,}802$
Claims are assumed to follow a lognormal distribution with
parameters µ and σ.
µ and σ are estimated using the method of moments.
Calculate E[X ∧ 500] for the fitted distribution. [∼259]
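A numerical sketch of this example (standard-library Python only; the helper names are my own): fit $\mu$ and $\sigma$ by matching the first two raw moments of the lognormal, then evaluate the limited expected value, which comes out near the stated 259.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def lognormal_mom(m1, m2):
    """Match E[X] = exp(mu + sigma^2/2) and E[X^2] = exp(2*mu + 2*sigma^2)."""
    sigma2 = math.log(m2) - 2.0 * math.log(m1)
    mu = math.log(m1) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)

def lognormal_lev(mu, sigma, d):
    """Limited expected value E[X ^ d] for the lognormal."""
    a = (math.log(d) - mu - sigma ** 2) / sigma
    b = (math.log(d) - mu) / sigma
    return math.exp(mu + sigma ** 2 / 2.0) * norm_cdf(a) + d * (1.0 - norm_cdf(b))

# n = 10, sum x = 3860, sum x^2 = 4,574,802
mu, sigma = lognormal_mom(3860 / 10, 4574802 / 10)
lev = lognormal_lev(mu, sigma, 500)  # roughly 259
```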
Matching of quantiles
Denote the 100q-th quantile of the distribution by $\pi_q(\theta)$, which, in the case of a
continuous distribution, is the solution to
$$F(\pi_q(\theta) \mid \theta) = q.$$
Denote the smoothed empirical estimate by $\hat{\pi}_q$. This is usually computed as
$$\hat{\pi}_q = (1 - h)\, x_{(j)} + h\, x_{(j+1)},$$
where $j = \lfloor (n + 1) q \rfloor$ and $h = (n + 1) q - j$. Here $x_{(1)} \le \cdots \le x_{(n)}$
denote the order statistics of the sample.
The quantile matching estimate of $\theta$ is any solution to the $p$ equations
$$\pi_{q_k}(\hat{\theta}) = \hat{\pi}_{q_k}, \qquad k = 1, 2, \dots, p,$$
where the $q_k$'s are $p$ arbitrarily chosen quantiles.
Example
Suppose the following observed sample
12, 15, 18, 21, 21, 23, 28, 32, 32, 32, 58
comes from a Weibull distribution with
$$F(x \mid \lambda, \gamma) = 1 - e^{-\lambda x^{\gamma}}.$$
Calculate the quantile matching estimates of λ and γ by matching the
median and the 75-th quantile.
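Under the smoothed-percentile formula from the previous slide, $(n + 1)q$ lands exactly on the 6th and 9th order statistics here (23 and 32), and the two matching equations can then be solved in closed form. A Python sketch (my own helper names):

```python
import math

def smoothed_quantile(xs, q):
    """Smoothed empirical quantile (1 - h) x_(j) + h x_(j+1),
    with j = floor((n + 1) q) and h = (n + 1) q - j;
    assumes (n + 1) q < n so both order statistics exist."""
    ys = sorted(xs)
    pos = (len(ys) + 1) * q
    j = math.floor(pos)
    h = pos - j
    return (1.0 - h) * ys[j - 1] + h * ys[j]

data = [12, 15, 18, 21, 21, 23, 28, 32, 32, 32, 58]
p50 = smoothed_quantile(data, 0.50)   # x_(6) = 23
p75 = smoothed_quantile(data, 0.75)   # x_(9) = 32

# Matching F at both quantiles gives lam * p50**gam = -log(0.5) and
# lam * p75**gam = -log(0.25), so (p75/p50)**gam = log(0.25)/log(0.5) = 2.
gam = math.log(2.0) / math.log(p75 / p50)
lam = -math.log(0.5) / p50 ** gam
```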
SOA Example #155
You are given the following data:
0.49 0.51 0.66 1.82 3.71 5.20 7.62 12.66 35.24
You use the method of percentile matching at the 40th and 80th
percentiles to fit an inverse Weibull distribution to these data. Calculate
the estimate of θ. [1.614]
Maximum likelihood estimation
The method of maximum likelihood
The maximum likelihood estimate of the parameter vector θ is obtained by
maximizing the likelihood function. The likelihood contribution of an
observation is the probability (or density) of observing it.
In many cases, it is more straightforward to maximize the logarithm of the
likelihood function.
The form of the likelihood function depends on whether the data are completely
observed or, if not, whether they are truncated and/or censored.
Likelihood function
Complete, individual data
No truncation and no censoring
In the case where we have no truncation and no censoring and each
observation Xi , for i = 1, . . . , n, is recorded, the likelihood function is
$$L(\theta) = \prod_{j=1}^{n} f_{X_j}(x_j \mid \theta).$$
The corresponding log-likelihood function is
$$\ell(\theta) = \log L(\theta) = \sum_{j=1}^{n} \log f_{X_j}(x_j \mid \theta).$$
Illustrations
Assume we have n samples completely observed. Derive expressions for
the maximum likelihood estimates for the following distributions:
Exponential: $f_X(x) = \lambda e^{-\lambda x}$, for $x > 0$.
Uniform: $f_X(x) = 1/\theta$, for $0 < x < \theta$.
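Both of these work out in closed form, which a small sketch makes concrete (my own function names; note that the uniform MLE sits at the boundary of the parameter space rather than at a zero of the score):

```python
def exponential_mle(xs):
    """MLE of the rate lambda in f(x) = lambda * exp(-lambda * x):
    the reciprocal of the sample mean."""
    return len(xs) / sum(xs)

def uniform_mle(xs):
    """MLE of theta for Uniform(0, theta): the likelihood theta**(-n)
    decreases in theta, so it is maximized at the sample maximum."""
    return max(xs)
```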
SOA Example #26
You are given:
Low-hazard risks have an exponential claim size distribution with
mean θ.
Medium-hazard risks have an exponential claim size distribution with
mean 2θ.
High-hazard risks have an exponential claim size distribution with
mean 3θ.
No claims from low-hazard risks are observed.
Three claims from medium-hazard risks are observed, of sizes 1, 2, and 3.
One claim from a high-hazard risk is observed, of size 15.
Calculate the maximum likelihood estimate of θ. [2]
Complete, grouped data
Start with a set of numbers $c_0 < c_1 < \cdots < c_k$ such that the sample has $n_j$
observed values in the interval $(c_{j-1}, c_j]$.
For such data, the likelihood function is
$$L(\theta) = \prod_{j=1}^{k} \left[ F(c_j \mid \theta) - F(c_{j-1} \mid \theta) \right]^{n_j}.$$
Example
Suppose you are given the following observations for a loss random
variable X:
Interval    Number of Observations
(0, 2]      4
(2, 4]      7
(4, 6]      10
(6, 8]      6
(8, ∞)      3
Total       30
Determine the log-likelihood function of the sample if X has a Pareto with
parameters α and θ. If it is possible to maximize this log-likelihood and
solve explicitly, determine the MLE of the parameters.
SOA Example #44
You are given:
Losses follow an exponential distribution with mean θ.
A random sample of 20 losses is distributed as follows:
Loss Range      Frequency
[0, 1000]       7
(1000, 2000]    6
(2000, ∞)       7
Calculate the maximum likelihood estimate of θ. [1996.90]
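A sketch of how this can be checked numerically (my own code, not from the slides): write the grouped log-likelihood from the previous slide for an exponential F and maximize it by a crude grid search. With $p = e^{-1000/\theta}$, the log-likelihood reduces to $13 \log(1 - p) + 20 \log p$, maximized at $p = 20/33$, i.e. $\theta = 1000/\log(33/20) \approx 1996.90$.

```python
import math

def grouped_loglik(theta, bounds, counts):
    """Grouped-data log-likelihood for an exponential with mean theta.
    bounds: endpoints c_0 < c_1 < ... < c_k (the last may be inf);
    counts: n_j observations in (c_{j-1}, c_j]."""
    def F(c):
        return 1.0 if c == math.inf else 1.0 - math.exp(-c / theta)
    return sum(n * math.log(F(b) - F(a))
               for a, b, n in zip(bounds, bounds[1:], counts))

bounds = [0.0, 1000.0, 2000.0, math.inf]
counts = [7, 6, 7]

# Crude grid search over integer values of theta.
theta_hat = max(range(500, 4000),
                key=lambda t: grouped_loglik(t, bounds, counts))
```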
Truncated and/or censored data
Consider the case where we have a left truncated and right censored
observation. It is usually best to represent such an observation as $(t_i, x_i, \delta_i)$,
where:
$t_i$ is the left truncation point;
$x_i$ is the value that produced the data point; and
$\delta_i$ is an indicator equal to 1 if the observation is right censored and 0 otherwise.
When the observation is right censored, its contribution to the likelihood is
$$\left[ \frac{1 - F(x_i)}{1 - F(t_i)} \right]^{\delta_i}.$$
Otherwise, if not right censored, the contribution to the likelihood is
$$\left[ \frac{f(x_i)}{1 - F(t_i)} \right]^{1 - \delta_i}.$$
Note that when the policy limit is reached, the limit equals $x_i - t_i$.
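These two cases can be folded into one helper (a sketch with hypothetical names; `f` and `F` stand for the model density and CDF):

```python
import math

def contribution_loglik(t, x, delta, f, F):
    """Log-likelihood contribution of one observation (t, x, delta):
    left truncated at t; delta = 1 means right censored at x,
    delta = 0 means x was observed exactly."""
    num = (1.0 - F(x)) if delta == 1 else f(x)
    return math.log(num) - math.log(1.0 - F(t))
```

For an exponential model, the exact-observation contribution reduces, by memorylessness, to $\lambda e^{-\lambda(x - t)}$, which makes a handy sanity check.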
SOA Example #4
You are given:
Losses follow a single-parameter Pareto distribution with density
function:
$$f(x) = \frac{\alpha}{x^{\alpha + 1}}, \qquad x > 1, \quad 0 < \alpha < \infty$$
A random sample of size five produced three losses with values 3, 6
and 14, and two losses exceeding 25.
Calculate the maximum likelihood estimate of α. [0.2507]
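The single-parameter Pareto has survival function $S(x) = x^{-\alpha}$, so setting the derivative of the log-likelihood to zero gives a closed form: $\hat{\alpha} = r$ divided by the sum of the logs of the exact losses plus the logs of the censoring points, where $r$ is the number of exact losses. A quick check of the stated 0.2507 (my own sketch):

```python
import math

exact = [3.0, 6.0, 14.0]   # fully observed losses
censored = [25.0, 25.0]    # losses known only to exceed 25

# log L(alpha) = r*log(alpha) - (alpha + 1)*sum(log x_i) - alpha*sum(log c_j);
# the score is zero at the ratio below.
r = len(exact)
alpha_hat = r / (sum(math.log(x) for x in exact)
                 + sum(math.log(c) for c in censored))
```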
Illustrative example
An insurance company records the claim amounts from a portfolio of
policies with a current deductible of 100 and policy limit of 1100. Losses
less than 100 are not reported to the company and losses above the limit
are recorded as 1000. The recorded claim amounts are
120, 180, 200, 270, 300, 1000, 1000.
Assume ground-up losses follow a Pareto with parameters α and θ = 400.
Use the maximum likelihood estimate of α to estimate the Loss
Elimination Ratio for a policy with twice the current deductible.
Bayesian estimation
Definitions and notation
Assume that conditionally on θ, the observable random variables
X1 , . . . , Xn are i.i.d. with conditional density fX|θ (x|θ). Denote the
random vector X = (X1 , . . . , Xn )0 .
prior distribution: the probability distribution of θ with density π(θ).
joint distribution of (X, θ) is given by
$$f_{X,\theta}(x, \theta) = f_{X|\theta}(x \mid \theta) \cdot \pi(\theta) = f_{X|\theta}(x_1 \mid \theta) \cdots f_{X|\theta}(x_n \mid \theta) \cdot \pi(\theta).$$
Posterior distribution
The conditional probability distribution of θ, given the observed data
X1 , . . . , Xn , is called the posterior distribution.
This is denoted by π(θ|X) and is computed using
$$\pi_{\theta|X}(\theta \mid x) = \frac{f_{X,\theta}(x, \theta)}{\int f_{X,\theta}(x, \theta)\, d\theta} = \frac{f_{X|\theta}(x \mid \theta)\, \pi(\theta)}{\int f_{X|\theta}(x \mid \theta)\, \pi(\theta)\, d\theta}.$$
We can conveniently write $\pi_{\theta|X}(\theta \mid x) = c \cdot f_{X|\theta}(x_1, \dots, x_n \mid \theta)\, \pi(\theta)$,
where $c$ is the normalizing constant.
More often, we write $\pi_{\theta|X}(\theta \mid x) \propto f_{X|\theta}(x_1, \dots, x_n \mid \theta)\, \pi(\theta)$.
The mean of the posterior, E(θ|X), is called the Bayes estimator of θ.
Poisson with a Gamma prior
Suppose that, conditionally on λ, the claim counts X1 , . . . , Xn on an insurance
policy are distributed as Poisson with mean λ.
Let the prior distribution of λ be a Gamma with parameters α and θ (as in
textbook).
Show that the posterior distribution is also Gamma distributed and
find expressions for its parameters.
Show that the resulting Bayes estimator can be expressed as the
weighted combination of the sample mean and the prior mean.
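Sketching the standard conjugate result (with the textbook Gamma(α, scale θ) parameterization; the posterior Gamma(α + Σxⱼ, θ/(nθ + 1)) is assumed here, not derived):

```python
def poisson_gamma_posterior(xs, alpha, theta):
    """Posterior parameters for Poisson(lambda) counts under a
    Gamma(alpha, scale=theta) prior on lambda: shape alpha + sum(x),
    scale theta / (n*theta + 1)."""
    n = len(xs)
    return alpha + sum(xs), theta / (n * theta + 1.0)

def bayes_estimate(xs, alpha, theta):
    """Posterior mean, expressible as Z * xbar + (1 - Z) * alpha * theta
    with credibility weight Z = n * theta / (n * theta + 1)."""
    a_post, t_post = poisson_gamma_posterior(xs, alpha, theta)
    return a_post * t_post
```

For instance, with xs = [2, 0, 1, 3], α = 2, θ = 0.5: Z = 2/3, the sample mean is 1.5, the prior mean is 1, and the Bayes estimate is (2/3)(1.5) + (1/3)(1) = 4/3.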
SOA Exam Question
You are given:
Conditionally, given β, an individual loss X has an Exponential
distribution with density:
$$f(x \mid \beta) = \frac{1}{\beta}\, e^{-x/\beta}, \qquad \text{for } x > 0.$$
The prior distribution of β is Inverse Gamma with density:
$$\pi(\beta) = \frac{c^2}{\beta^3}\, e^{-c/\beta}, \qquad \text{for } \beta > 0.$$
Hint:
$$\int_0^{\infty} \frac{1}{y^n}\, e^{-a/y}\, dy = \frac{(n - 2)!}{a^{n-1}}, \qquad \text{for } n = 2, 3, \dots$$
Given that the observed loss is x, calculate the mean of the posterior
distribution of β.
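Combining the density, the prior, and the hint shows the posterior kernel is $\beta^{-4} e^{-(x+c)/\beta}$, an inverse gamma whose mean works out to $(x + c)/2$. The quadrature sketch below (illustrative values c = 2, x = 3, both my own choices) checks that numerically:

```python
import math

c, x = 2.0, 3.0   # illustrative prior parameter and observed loss

def kernel(b):
    """Unnormalized posterior f(x|b) * pi(b), proportional to
    b**(-4) * exp(-(x + c)/b)."""
    return (math.exp(-x / b) / b) * (c ** 2 * math.exp(-c / b) / b ** 3)

# Midpoint rule on (0, 400); the integrand is negligible beyond that.
h = 0.002
bs = [(i + 0.5) * h for i in range(200_000)]
norm = h * sum(kernel(b) for b in bs)
post_mean = h * sum(b * kernel(b) for b in bs) / norm   # about (x + c)/2
```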
SOA Example #5
You are given:
The annual number of claims for a policyholder has a binomial
distribution with probability function:
$$p(x \mid q) = \binom{2}{x} q^x (1 - q)^{2 - x}, \qquad x = 0, 1, 2.$$
The prior distribution is:
$$\pi(q) = 4 q^3, \qquad 0 < q < 1.$$
This policyholder had one claim in each of Years 1 and 2. Calculate the
Bayesian estimate of the number of claims in Year 3. [4/3]
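A quadrature sketch (my own code) of this answer: the posterior is proportional to $[2q(1-q)]^2 \cdot 4q^3 \propto q^5 (1-q)^2$, a Beta(6, 3) with mean 2/3, so the predicted Year 3 claim count is $2 \cdot 2/3 = 4/3$.

```python
def posterior_kernel(q):
    """Likelihood of one claim in each of two years times the prior 4q^3;
    p(1|q) = 2q(1 - q) for a Binomial(2, q) claim count."""
    return (2.0 * q * (1.0 - q)) ** 2 * 4.0 * q ** 3

# Midpoint rule on (0, 1).
N = 20_000
qs = [(i + 0.5) / N for i in range(N)]
norm = sum(posterior_kernel(q) for q in qs) / N
post_mean_q = sum(q * posterior_kernel(q) for q in qs) / N / norm
expected_claims = 2.0 * post_mean_q   # E[X3 | data] = 2 * E[q | data]
```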