3. Credibility Theory

3.1. Introduction

[Table 3.1: Years with accidents. Columns: drivers 1 to 20; rows: years 1 to 10 and a final row Σ with the number of years with at least one claim for each driver.]
We start with an example. A company has insured twenty drivers in a small village
during the past ten years. Table 3.1 shows, indicated by a 1, which driver had at
least one accident in the corresponding year. The last row shows the number of
years with at least one claim. Some of the drivers had no accidents. They will think
that it is unfair that they have to pay the same premium as the drivers with 4 or
more years with accidents. Another insurer could offer the good drivers a cheaper
contract, and the first insurer would lose them. Thus the first insurer should offer
the good risks a cheaper premium, and must therefore increase the premia for the
poor risks.
But could it not be that the probability of at least one accident in a year
is the same for all the drivers? Had some of the drivers just been unlucky? We can
test this by using a $\chi^2$-test. Let $\hat\mu_i$ be the mean number of years with an accident
for driver $i$ and
\[
  \bar\mu = \frac{1}{20} \sum_{i=1}^{20} \hat\mu_i
\]
be the mean number of years with an accident for a typical driver. Then
\[
  Z = \frac{10 \sum_{i=1}^{20} (\hat\mu_i - \bar\mu)^2}{\bar\mu (1 - \bar\mu)} = 49.1631 .
\]
Under the hypothesis that $\mathbb{E}[\hat\mu_i]$ is the same for all drivers the statistic $Z$ is approximately $\chi^2_{19}$-distributed. But
\[
  \mathbb{P}[\chi^2_{19} > 49] \approx 0.0002 .
\]
Thus it is very unlikely that each driver has the same probability of having at least
one accident in a year.
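The test above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the claim counts in `counts` are hypothetical (they are not the entries of Table 3.1), and the function names `homogeneity_statistic` and `chi2_sf` are introduced here for the sketch. The tail probability is computed from the standard series of the regularised incomplete gamma function, so no external library is needed.

```python
import math

def homogeneity_statistic(years_with_claim, m):
    """Z = m * sum_i (mu_i - mu_bar)^2 / (mu_bar * (1 - mu_bar))."""
    n = len(years_with_claim)
    mu_hat = [k / m for k in years_with_claim]   # per-driver claim frequency
    mu_bar = sum(mu_hat) / n                     # frequency of a typical driver
    return m * sum((x - mu_bar) ** 2 for x in mu_hat) / (mu_bar * (1.0 - mu_bar))

def chi2_sf(x, df):
    """Upper tail P[chi2_df > x], via the series of the lower incomplete gamma."""
    a, t = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-16 * abs(total):
        n += 1
        term *= t / (a + n)
        total += term
    lower = total * math.exp(-t + a * math.log(t) - math.lgamma(a))
    return 1.0 - lower

# Hypothetical numbers of claim years for 20 drivers over m = 10 years.
counts = [0, 0, 2, 0, 0, 2, 2, 0, 6, 1, 1, 4, 3, 1, 1, 1, 1, 0, 0, 5]
z = homogeneity_statistic(counts, m=10)
print(z, chi2_sf(z, df=len(counts) - 1))
print(chi2_sf(49.0, 19))   # tail probability used in the text
```

A small tail probability speaks against the hypothesis that all drivers share the same claim probability.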
For the insurance company it is preferable to attract good risks and to get rid of
the poor risks. Thus they would like to charge premia according to their experience
with a particular customer. This procedure is called experience rating or credibility. Let us denote by $Y_{ij}$ the losses of the $i$-th risk in year $j$, where $i = 1, 2, \ldots, n$
and $j = 1, 2, \ldots, m$. Denote the mean losses of the $i$-th risk by
\[
  \bar Y_i := \frac{1}{m} \sum_{j=1}^{m} Y_{ij} .
\]
We make the following assumptions:
i) There exists a parameter $\Theta_i$ belonging to the risk $i$. The parameters $(\Theta_i : 1 \le i \le n)$ are iid.
ii) The vectors $((Y_{i1}, Y_{i2}, \ldots, Y_{im}, \Theta_i) : 1 \le i \le n)$ are iid.
iii) For fixed $i$, given $\Theta_i$ the random variables $Y_{i1}, Y_{i2}, \ldots, Y_{im}$ are conditionally iid.
For instance Θi can be the parameters of a family of distributions, or more generally,
the distribution function of a claim of the i-th risk.
Denote by $m(\vartheta) = \mathbb{E}[Y_{ij} \mid \Theta_i = \vartheta]$ the conditional mean of an aggregate
loss given $\Theta_i = \vartheta$, by $\mu = \mathbb{E}[m(\Theta_i)]$ the overall mean of the aggregate losses and
by $s^2(\vartheta) = \operatorname{Var}[Y_{ij} \mid \Theta_i = \vartheta]$ the conditional variance of $Y_{ij}$ given $\Theta_i = \vartheta$. Now
instead of using the overall mean $\mu$ of a typical risk for the premium
calculation one should use $m(\Theta_i)$. Unfortunately we do not know $\Theta_i$. The
empirical mean $\bar Y_i$ is an estimator for $m(\Theta_i)$, but it is not a good idea to use $\bar Y_i$ for the
premium calculation either. Assume that somebody who is insured for the first year has a
big claim. His premium for the next year would become larger than his claim. He
would no longer be able to afford insurance. But the idea behind insurance is
that the risk is shared among all policyholders, so that nobody gets into
trouble merely because he is unlucky. Thus we have to find other methods to estimate
$m(\Theta_i)$.
3.2. Bayesian Credibility
Suppose we know the distribution of $\Theta_i$ and the conditional distribution of $Y_{ij}$ given
$\Theta_i$. We now want to find the best estimate $M_0$ of $m(\Theta_i)$ in the sense that
\[
  \mathbb{E}[(M - m(\Theta_i))^2] \ge \mathbb{E}[(M_0 - m(\Theta_i))^2]
\]
for all measurable functions $M = M(Y_{i1}, Y_{i2}, \ldots, Y_{im})$.
Let $M_0' = \mathbb{E}[m(\Theta_i) \mid Y_{i1}, Y_{i2}, \ldots, Y_{im}]$. Then
\begin{align*}
  \mathbb{E}[(M - m(\Theta_i))^2] &= \mathbb{E}[\{(M - M_0') + (M_0' - m(\Theta_i))\}^2] \\
  &= \mathbb{E}[(M - M_0')^2] + \mathbb{E}[(M_0' - m(\Theta_i))^2] + 2\,\mathbb{E}[(M - M_0')(M_0' - m(\Theta_i))] .
\end{align*}
Because $M$ and $M_0'$ are functions of the observations, we have
\begin{align*}
  \mathbb{E}[(M - M_0')(M_0' - m(\Theta_i))] &= \mathbb{E}[\mathbb{E}[(M - M_0')(M_0' - m(\Theta_i)) \mid Y_{i1}, Y_{i2}, \ldots, Y_{im}]] \\
  &= \mathbb{E}[(M - M_0')(M_0' - \mathbb{E}[m(\Theta_i) \mid Y_{i1}, Y_{i2}, \ldots, Y_{im}])] \\
  &= 0 .
\end{align*}
Thus
\[
  \mathbb{E}[(M - m(\Theta_i))^2] = \mathbb{E}[(M - M_0')^2] + \mathbb{E}[(M_0' - m(\Theta_i))^2] .
\]
The second term is independent of the choice of $M$. That means that the best
estimate is $M_0 = M_0'$. We call $M_0$ the Bayesian credibility estimator. In the sequel
we will consider the two most often used Bayesian credibility models.
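3.2.1. The Poisson-Gamma Model
We assume that $Y_{ij}$ has a compound Poisson distribution with parameter $\lambda_i$ and an
individual claim size distribution which is the same for all the risks. It is therefore
sufficient for the company to estimate $\lambda_i$. In the sequel we assume that all claim
sizes are equal to 1, i.e. $Y_{ij} = N_{ij}$. Motivated by the compound negative binomial
model we assume that $\lambda_i \sim \Gamma(\gamma, \alpha)$. Let
\[
  \bar N_i = \frac{1}{m} \sum_{j=1}^{m} N_{ij}
\]
denote the mean number of claims of the $i$-th risk. We know the overall mean
$\mathbb{E}[\lambda_i] = \gamma \alpha^{-1}$. For the best estimator for $\lambda_i$ we get
\begin{align*}
  \mathbb{E}[\lambda_i \mid N_{i1}, N_{i2}, \ldots, N_{im}]
  &= \frac{\displaystyle\int_0^\infty \ell \prod_{j=1}^{m} \Bigl( \frac{\ell^{N_{ij}}}{N_{ij}!} e^{-\ell} \Bigr) \frac{\alpha^\gamma}{\Gamma(\gamma)} \ell^{\gamma-1} e^{-\alpha \ell} \, d\ell}
          {\displaystyle\int_0^\infty \prod_{j=1}^{m} \Bigl( \frac{\ell^{N_{ij}}}{N_{ij}!} e^{-\ell} \Bigr) \frac{\alpha^\gamma}{\Gamma(\gamma)} \ell^{\gamma-1} e^{-\alpha \ell} \, d\ell} \\
  &= \frac{\displaystyle\int_0^\infty \ell^{m \bar N_i + \gamma} e^{-(m+\alpha)\ell} \, d\ell}
          {\displaystyle\int_0^\infty \ell^{m \bar N_i + \gamma - 1} e^{-(m+\alpha)\ell} \, d\ell}
   = \frac{\Gamma(m \bar N_i + \gamma + 1) (m+\alpha)^{m \bar N_i + \gamma}}{(m+\alpha)^{m \bar N_i + \gamma + 1} \Gamma(m \bar N_i + \gamma)} \\
  &= \frac{m \bar N_i + \gamma}{m + \alpha}
   = \frac{m}{m+\alpha} \bar N_i + \frac{\alpha}{m+\alpha} \frac{\gamma}{\alpha}
   = \frac{m}{m+\alpha} \bar N_i + \Bigl( 1 - \frac{m}{m+\alpha} \Bigr) \mathbb{E}[\lambda_i] .
\end{align*}
The best estimator is therefore of the form
\[
  Z \bar N_i + (1 - Z)\, \mathbb{E}[\lambda_i]
\]
with the credibility factor
\[
  Z = \frac{1}{1 + \dfrac{\alpha}{m}} .
\]
The credibility factor $Z$ is increasing with the number of years $m$ and converges to
1 as $m \to \infty$. Thus the more observations we have, the higher is the weight put on
the empirical mean $\bar N_i$.
The only problem that remains is to estimate the parameters $\gamma$ and $\alpha$. For an
insurance company this is quite often no problem because they have big data sets.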
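As a quick numerical illustration of the Poisson-Gamma estimator, the following sketch evaluates the credibility factor and the posterior mean of $\lambda_i$. The function name `poisson_gamma_estimate` and the sample values of `gamma`, `alpha` and the claim counts are assumptions made for this example only.

```python
def poisson_gamma_estimate(claims, gamma, alpha):
    """Return (Z, estimate) with Z = m / (m + alpha) and
    estimate = Z * N_bar + (1 - Z) * gamma / alpha."""
    m = len(claims)
    n_bar = sum(claims) / m          # empirical claim frequency of the risk
    z = m / (m + alpha)              # credibility factor 1 / (1 + alpha / m)
    prior_mean = gamma / alpha       # E[lambda_i]
    return z, z * n_bar + (1.0 - z) * prior_mean

# Hypothetical risk: four observed years, prior Gamma(2, 4).
z, estimate = poisson_gamma_estimate([0, 1, 2, 1], gamma=2.0, alpha=4.0)
print(z, estimate)
```

The estimate agrees with the closed form $(m \bar N_i + \gamma)/(m + \alpha)$, which is a convenient check when implementing the formula.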
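3.2.2. The Normal-Normal Model
For the next model we do not assume that the individual claims have the same
distribution for all risks. First we assume that each risk consists of a lot of subrisks such that the random variables $Y_{ij}$ are approximately normally distributed.
Moreover, the conditional variance is assumed to be the same for all risks, i.e. $Y_{ij} \sim
N(\Theta_i, \sigma^2)$. Next we assume that the parameter $\Theta_i \sim N(\mu, \eta^2)$ is also normally
distributed. Note that $\mu$ and $\eta^2$ have to be chosen in such a way that $\mathbb{P}[Y_{ij} \le 0]$ is
small.
Let us first consider the joint density of $Y_{i1}, Y_{i2}, \ldots, Y_{im}, \Theta_i$
\[
  (2\pi)^{-(m+1)/2} \eta^{-1} \sigma^{-m} \exp\Bigl\{ - \Bigl( \frac{(\vartheta - \mu)^2}{2\eta^2} + \sum_{j=1}^{m} \frac{(y_{ij} - \vartheta)^2}{2\sigma^2} \Bigr) \Bigr\} .
\]
Because we will mainly be interested in the posterior distribution of $\Theta_i$ we should
write the exponent in the form $(\vartheta - \cdot)^2$. For abbreviation we write $m \bar y_i = \sum_{j=1}^{m} y_{ij}$.
\[
  \frac{(\vartheta - \mu)^2}{2\eta^2} + \sum_{j=1}^{m} \frac{(y_{ij} - \vartheta)^2}{2\sigma^2}
  = \vartheta^2 \Bigl( \frac{1}{2\eta^2} + \frac{m}{2\sigma^2} \Bigr)
  - 2\vartheta \Bigl( \frac{\mu}{2\eta^2} + \frac{m \bar y_i}{2\sigma^2} \Bigr)
  + \frac{\mu^2}{2\eta^2} + \sum_{j=1}^{m} \frac{y_{ij}^2}{2\sigma^2} .
\]
The joint density can therefore be written as
\[
  C(y_{i1}, y_{i2}, \ldots, y_{im}) \exp\Biggl\{ - \frac{\Bigl( \vartheta - \bigl( \frac{\mu}{\eta^2} + \frac{m \bar y_i}{\sigma^2} \bigr) \bigl( \frac{1}{\eta^2} + \frac{m}{\sigma^2} \bigr)^{-1} \Bigr)^2}{2 \bigl( \frac{1}{\eta^2} + \frac{m}{\sigma^2} \bigr)^{-1}} \Biggr\}
\]
where $C$ is a function of the data. Hence the posterior distribution of $\Theta_i$ is normal
with mean
\[
  \Bigl( \frac{\mu}{\eta^2} + \frac{m \bar Y_i}{\sigma^2} \Bigr) \Bigl( \frac{1}{\eta^2} + \frac{m}{\sigma^2} \Bigr)^{-1}
  = \frac{\mu \sigma^2 + m \eta^2 \bar Y_i}{\sigma^2 + m \eta^2}
  = \frac{m \eta^2}{\sigma^2 + m \eta^2} \bar Y_i + \Bigl( 1 - \frac{m \eta^2}{\sigma^2 + m \eta^2} \Bigr) \mu .
\]
Again the credibility premium is a linear combination $Z \bar Y_i + (1 - Z)\, \mathbb{E}[Y_{ij}]$ of the
mean losses of the $i$-th risk and the overall mean $\mu$. The credibility factor has a
similar form as in the Poisson-gamma model
\[
  Z = \frac{1}{1 + \dfrac{\sigma^2}{\eta^2 m}} .
\]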
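The posterior mean can be evaluated in two equivalent ways: through the precision-weighted form or through the credibility form. The sketch below computes both so that they can be compared; the function name `normal_normal_posterior` and all numerical inputs are hypothetical choices for illustration.

```python
def normal_normal_posterior(ys, mu, eta2, sigma2):
    """Posterior mean and variance of Theta_i, plus the credibility factor Z.

    ys: observed losses of one risk; mu, eta2: prior mean and variance of Theta_i;
    sigma2: conditional variance of a single observation."""
    m = len(ys)
    y_bar = sum(ys) / m
    precision = 1.0 / eta2 + m / sigma2                    # 1/eta^2 + m/sigma^2
    mean = (mu / eta2 + m * y_bar / sigma2) / precision    # precision-weighted form
    z = 1.0 / (1.0 + sigma2 / (eta2 * m))                  # credibility factor
    return mean, 1.0 / precision, z

# Hypothetical inputs: two observed years, prior N(100, 900), sigma^2 = 400.
mean, var, z = normal_normal_posterior([110.0, 130.0], mu=100.0, eta2=900.0, sigma2=400.0)
print(mean, var, z)
print(z * 120.0 + (1.0 - z) * 100.0)   # credibility form of the same posterior mean
```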
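3.2.3. Is the Credibility Premium Formula Always Linear?
From the previous considerations one could conjecture that the credibility premium
is always of the form $Z \bar Y_i + (1 - Z)\, \mathbb{E}[Y_{ij}]$. We try to find a counterexample. Assume
that $\Theta_i$ takes the values 1 and 2 with equal probabilities $\frac12$. Given $\Theta_i$ the random
variables $Y_{ij} \sim \mathrm{Pois}(\Theta_i)$ are Poisson distributed. Then
\begin{align*}
  \mathbb{E}[\Theta_i \mid Y_{i1}, Y_{i2}, \ldots, Y_{im}]
  &= \mathbb{P}[\Theta_i = 1 \mid Y_{i1}, Y_{i2}, \ldots, Y_{im}] + 2\, \mathbb{P}[\Theta_i = 2 \mid Y_{i1}, Y_{i2}, \ldots, Y_{im}] \\
  &= 2 - \mathbb{P}[\Theta_i = 1 \mid Y_{i1}, Y_{i2}, \ldots, Y_{im}]
   = 2 - \frac{\frac12 \prod_{j=1}^{m} \frac{1}{Y_{ij}!} e^{-1}}{\frac12 \prod_{j=1}^{m} \frac{1}{Y_{ij}!} e^{-1} + \frac12 \prod_{j=1}^{m} \frac{2^{Y_{ij}}}{Y_{ij}!} e^{-2}} \\
  &= 2 - \frac{1}{1 + e^{-m}\, 2^{m \bar Y_i}} .
\end{align*}
This expression is not a linear function of $\bar Y_i$, so the credibility premium cannot be written in the linear form here.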
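The non-linearity is easy to see numerically. The sketch below evaluates the posterior mean $2 - 1/(1 + e^{-m} 2^{m \bar Y_i})$ on an equally spaced grid of values of $\bar Y_i$; for a linear function the second differences would vanish. The function name `posterior_mean` and the grid are choices made for this illustration.

```python
import math

def posterior_mean(m, y_bar):
    """E[Theta_i | data] in the two-point prior example, as a function of the empirical mean."""
    return 2.0 - 1.0 / (1.0 + math.exp(-m) * 2.0 ** (m * y_bar))

m = 1
values = [posterior_mean(m, y) for y in (0.0, 1.0, 2.0)]
second_difference = values[2] - 2.0 * values[1] + values[0]
print(values, second_difference)   # a linear formula would give second difference 0
```

The estimator always stays between the two possible parameter values 1 and 2, which a linear function of $\bar Y_i$ could not guarantee.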
3.3. Empirical Bayes Credibility
In order to calculate the Bayes credibility estimator one needs to know the joint
distribution of $m(\Theta_i)$ and $(Y_{i1}, Y_{i2}, \ldots, Y_{im})$. In practice one does not have distributions but only data. It seems therefore not feasible to estimate joint distributions
if only one of the variables is observable. We therefore restrict ourselves to linear
estimators, i.e. estimators of the form
\[
  M = a_{i0} + a_{i1} Y_{i1} + a_{i2} Y_{i2} + \cdots + a_{im} Y_{im} .
\]
The best estimator of this form is called the linear Bayes estimator. It turns out that it is enough
to know the first two moments of $(m(\Theta_i), Y_{i1})$. In practice neither the mean values nor
the variances are known, so these quantities have to be estimated from the data.
We will therefore proceed in two
steps: first we determine the linear Bayes premium, and then we estimate the first
two moments. The corresponding estimator is called the empirical Bayes estimator.
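3.3.1. The Bühlmann Model
In addition to the notation introduced before denote $\sigma^2 = \mathbb{E}[s^2(\Theta_i)]$ and
$v^2 = \operatorname{Var}[m(\Theta_i)]$. Note that $\sigma^2$ is not the unconditional variance of $Y_{ij}$. In fact
\[
  \mathbb{E}[Y_{ij}^2] = \mathbb{E}[\mathbb{E}[Y_{ij}^2 \mid \Theta_i]] = \mathbb{E}[s^2(\Theta_i) + (m(\Theta_i))^2] = \sigma^2 + \mathbb{E}[(m(\Theta_i))^2] .
\]
Thus the variance of $Y_{ij}$ is
\[
  \operatorname{Var}[Y_{ij}] = \sigma^2 + \mathbb{E}[(m(\Theta_i))^2] - \mu^2 = \sigma^2 + v^2 . \tag{3.1}
\]
We consider credibility premia of the form
\[
  a_{i0} + \sum_{j=1}^{m} a_{ij} Y_{ij} .
\]
Which parameters $a_{ij}$ minimize the expected quadratic error
\[
  \mathbb{E}\Bigl[ \Bigl( a_{i0} + \sum_{j=1}^{m} a_{ij} Y_{ij} - m(\Theta_i) \Bigr)^2 \Bigr] \, ?
\]
We first differentiate with respect to $a_{i0}$.
\begin{align*}
  \frac12 \frac{d}{da_{i0}} \mathbb{E}\Bigl[ \Bigl( a_{i0} + \sum_{j=1}^{m} a_{ij} Y_{ij} - m(\Theta_i) \Bigr)^2 \Bigr]
  &= \mathbb{E}\Bigl[ a_{i0} + \sum_{j=1}^{m} a_{ij} Y_{ij} - m(\Theta_i) \Bigr] \\
  &= a_{i0} + \Bigl( \sum_{j=1}^{m} a_{ij} - 1 \Bigr) \mu = 0 .
\end{align*}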
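It follows that
\[
  a_{i0} = \Bigl( 1 - \sum_{j=1}^{m} a_{ij} \Bigr) \mu
\]
and thus our estimator has the property that
\[
  \mathbb{E}\Bigl[ a_{i0} + \sum_{j=1}^{m} a_{ij} Y_{ij} \Bigr] = \mu .
\]
Let $1 \le k \le m$ and differentiate with respect to $a_{ik}$.
\begin{align*}
  \frac12 \frac{d}{da_{ik}} \mathbb{E}\Bigl[ \Bigl( a_{i0} + \sum_{j=1}^{m} a_{ij} Y_{ij} - m(\Theta_i) \Bigr)^2 \Bigr]
  &= \mathbb{E}\Bigl[ Y_{ik} \Bigl( a_{i0} + \sum_{j=1}^{m} a_{ij} Y_{ij} - m(\Theta_i) \Bigr) \Bigr] \\
  &= a_{i0} \mu + \sum_{\substack{j=1 \\ j \ne k}}^{m} a_{ij}\, \mathbb{E}[Y_{ik} Y_{ij}] + a_{ik}\, \mathbb{E}[Y_{ik}^2] - \mathbb{E}[Y_{ik}\, m(\Theta_i)] = 0 .
\end{align*}
We already know from (3.1) that $\mathbb{E}[Y_{ik}^2] = \sigma^2 + v^2 + \mu^2$. For $j \ne k$ we get
\[
  \mathbb{E}[Y_{ik} Y_{ij}] = \mathbb{E}[\mathbb{E}[Y_{ik} Y_{ij} \mid \Theta_i]] = \mathbb{E}[(m(\Theta_i))^2] = v^2 + \mu^2 .
\]
And finally
\[
  \mathbb{E}[Y_{ik}\, m(\Theta_i)] = \mathbb{E}[\mathbb{E}[Y_{ik}\, m(\Theta_i) \mid \Theta_i]] = \mathbb{E}[(m(\Theta_i))^2] = v^2 + \mu^2 .
\]
The equations to solve are
\begin{align*}
  \Bigl( 1 - \sum_{j=1}^{m} a_{ij} \Bigr) \mu^2 + \sum_{\substack{j=1 \\ j \ne k}}^{m} a_{ij} (v^2 + \mu^2) + a_{ik} (\sigma^2 + v^2 + \mu^2) - (v^2 + \mu^2)
  = a_{ik} \sigma^2 - \Bigl( 1 - \sum_{j=1}^{m} a_{ij} \Bigr) v^2 = 0 .
\end{align*}
The right hand side of
\[
  \sigma^2 a_{ik} = v^2 \Bigl( 1 - \sum_{j=1}^{m} a_{ij} \Bigr)
\]
is independent of $k$. Thus
\[
  a_{i1} = a_{i2} = \cdots = a_{im} = \frac{v^2}{\sigma^2 + m v^2} . \tag{3.2}
\]
The credibility premium is
\[
  \frac{m v^2}{\sigma^2 + m v^2} \bar Y_i + \Bigl( 1 - \frac{m v^2}{\sigma^2 + m v^2} \Bigr) \mu = Z \bar Y_i + (1 - Z) \mu \tag{3.3}
\]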
where
\[
  Z = \frac{1}{1 + \dfrac{\sigma^2}{v^2 m}} .
\]
The formula for the credibility premium depends only on $\mu$, $\sigma^2$, and $v^2$. These
were also the only quantities we had assumed in our model. Hence the approach is
quite general. We do not need any other assumptions on the distribution of $\Theta_i$.
In order to apply the result we need to estimate the parameters $\mu$, $\sigma^2$, and $v^2$.
In order to make the right decision we look for unbiased estimators.
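i) $\mu$: The natural estimator of $\mu$ is
\[
  \hat\mu = \frac{1}{nm} \sum_{i=1}^{n} \sum_{j=1}^{m} Y_{ij} = \frac{1}{n} \sum_{i=1}^{n} \bar Y_i . \tag{3.4}
\]
It is easy to see that $\hat\mu$ is unbiased. It can be shown that $\hat\mu$ is the best linear
unbiased estimator.
ii) $\sigma^2$: For the estimation of $s^2(\Theta_i)$ we would use the unbiased estimator
\[
  \frac{1}{m-1} \sum_{j=1}^{m} (Y_{ij} - \bar Y_i)^2 .
\]
Thus
\[
  \hat\sigma^2 = \frac{1}{n(m-1)} \sum_{i=1}^{n} \sum_{j=1}^{m} (Y_{ij} - \bar Y_i)^2 \tag{3.5}
\]
is an unbiased estimator for $\sigma^2$.
iii) $v^2$: $\bar Y_i$ is an unbiased estimator of $m(\Theta_i)$. Therefore a natural estimate of $v^2$
would be
\[
  \frac{1}{n-1} \sum_{i=1}^{n} (\bar Y_i - \hat\mu)^2 .
\]
But this estimator is biased:
\begin{align*}
  \frac{1}{n-1} \sum_{i=1}^{n} \mathbb{E}[(\bar Y_i - \hat\mu)^2]
  &= \frac{1}{n-1} \sum_{i=1}^{n} \mathbb{E}\Bigl[ \Bigl( \frac{n-1}{n} \bar Y_i - \frac{1}{n} \sum_{j \ne i} \bar Y_j \Bigr)^2 \Bigr] \\
  &= \frac{n}{n-1}\, \mathbb{E}\Bigl[ \Bigl( \frac{n-1}{n} \bar Y_1 - \frac{1}{n} \sum_{j=2}^{n} \bar Y_j \Bigr)^2 \Bigr] \\
  &= \frac{n}{n-1}\, \mathbb{E}\Bigl[ \Bigl( \frac{n-1}{n} (\bar Y_1 - \mu) - \frac{1}{n} \sum_{j=2}^{n} (\bar Y_j - \mu) \Bigr)^2 \Bigr] \\
  &= \frac{n}{n-1} \Bigl( \Bigl( \frac{n-1}{n} \Bigr)^2 \mathbb{E}[(\bar Y_1 - \mu)^2] + \frac{n-1}{n^2}\, \mathbb{E}[(\bar Y_1 - \mu)^2] \Bigr) \\
  &= \mathbb{E}[(\bar Y_1 - \mu)^2]
   = \mathbb{E}\Bigl[ \frac{1}{m^2} \sum_{i=1}^{m} \sum_{j=1}^{m} Y_{1i} Y_{1j} - \frac{2\mu}{m} \sum_{j=1}^{m} Y_{1j} + \mu^2 \Bigr] \\
  &= \frac{1}{m} (\sigma^2 + v^2 + \mu^2) + \frac{m-1}{m} (v^2 + \mu^2) - \mu^2 = v^2 + \frac{\sigma^2}{m} .
\end{align*}
We have to correct the estimator and get
\[
  \hat v^2 = \frac{1}{n-1} \sum_{i=1}^{n} (\bar Y_i - \hat\mu)^2 - \frac{1}{m} \hat\sigma^2 . \tag{3.6}
\]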
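The three estimators (3.4), (3.5) and (3.6) translate directly into code. The sketch below is a minimal illustration on a small hypothetical data set; the function name `buhlmann_estimates` is an assumption, and so is the convention of setting $Z = 0$ when $\hat v^2$ turns out non-positive (the corrected estimator can be negative in small samples, a case the text does not discuss).

```python
def buhlmann_estimates(data):
    """Empirical Bayes estimates in the Buhlmann model.

    data[i][j] is the loss of risk i in year j (n risks, m years each).
    Returns (mu_hat, sigma2_hat, v2_hat, Z, premia)."""
    n, m = len(data), len(data[0])
    y_bar = [sum(row) / m for row in data]                        # individual means
    mu_hat = sum(y_bar) / n                                       # (3.4)
    sigma2_hat = sum((y - y_bar[i]) ** 2
                     for i, row in enumerate(data) for y in row) / (n * (m - 1))  # (3.5)
    v2_hat = sum((yb - mu_hat) ** 2 for yb in y_bar) / (n - 1) - sigma2_hat / m   # (3.6)
    # Assumption of this sketch: no credibility is given if v2_hat is not positive.
    z = 1.0 / (1.0 + sigma2_hat / (v2_hat * m)) if v2_hat > 0 else 0.0
    premia = [z * yb + (1.0 - z) * mu_hat for yb in y_bar]        # (3.3)
    return mu_hat, sigma2_hat, v2_hat, z, premia

# Hypothetical data: three risks observed for two years.
print(buhlmann_estimates([[1.0, 3.0], [5.0, 7.0], [9.0, 11.0]]))
```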
Example 3.1. A company has insured twenty similar collective risks over the last
ten years. Table 3.2 shows the annual losses of each of the risks. Thus the estimators
are
\[
  \hat\mu = \frac{1}{20} \sum_{i=1}^{20} \bar Y_i = 102.02 ,
\]
\[
  \hat\sigma^2 = \frac{1}{180} \sum_{i=1}^{20} \sum_{j=1}^{10} (Y_{ij} - \bar Y_i)^2 = 473.196
\]
and
\[
  \hat v^2 = \frac{1}{19} \sum_{i=1}^{20} (\bar Y_i - \hat\mu)^2 - \frac{1}{10} \hat\sigma^2 = 806.6 .
\]
The credibility factor can now be computed
\[
  Z = \frac{1}{1 + \dfrac{\hat\sigma^2}{\hat v^2 m}} = 0.944585 .
\]
Table 3.3 gives the credibility premia for next year for each of the risks. The data
had been simulated using $\mu = 100$, $\sigma^2 = 400$ and $v^2 = 900$. The credibility model
interprets the data as having more fluctuations in a single risk than in the mean
values of the risks. This is not surprising because we only have 10 data points for each risk
but 20 risks. Thus the fluctuations of $\bar Y_i$ could also be due to larger fluctuations
within the risk.

   i      1     2     3     4     5     6     7     8     9    10      Ȳ_i
   1     67    71    50    56    64    69    77    94    56    48     65.2
   2     80    82   109    61    89   113   149    91   127   108    100.9
   3    125   118   101   135   120   117   101   101   100   122    114.0
   4     96   144   152   124    94   132   155   143    94   113    124.7
   5    176   161   153   191   139   157   192   151   175   147    164.2
   6     89   129    95    88   131    72    91   122    79    56     95.2
   7     70   116   102   129    81    69    75   120    95    93     95.0
   8     22    33    48    20    56    25    43    53    48    64     41.2
   9    121   106    55   103    87    81   130   113    98    97     99.1
  10    126   101   158   129   110   112   107   114   117    95    116.9
  11    125   106   104   135    83   177   157    95   101   128    121.1
  12    116   134   133   167   142   150   134   153   134   109    137.2
  13     53    62    89    76    63    66    69    70    57    49     65.4
  14    110   136    75    52    49    85   110    89    68    34     80.8
  15    125   112   145   114    61    74   168    98   114    97    110.8
  16     87    95    85   111    89   101    85    88    90   164     99.5
  17     92    46    74    82    68    94   116   109    81    89     85.1
  18     44    74    67    93    89    57    83    53    82    38     68.0
  19     95   107   147   154   121   125   146    83   117   139    123.4
  20    103   105   139   138   145   145   140   148   137   127    132.7
Table 3.2: Annual losses of the risks (rows: risks i; columns: years 1 to 10 and the mean Ȳ_i)
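The credibility premia of Example 3.1 follow from the reported estimates and the individual means $\bar Y_i$ of Table 3.2. The short sketch below recomputes them from those published figures; it takes $\hat\mu$, $\hat\sigma^2$ and $\hat v^2$ as given rather than re-estimating them, and the variable names are choices made for this illustration.

```python
# Estimates reported in Example 3.1 and the individual means from Table 3.2.
mu_hat, sigma2_hat, v2_hat, m = 102.02, 473.196, 806.6, 10
y_bar = [65.2, 100.9, 114.0, 124.7, 164.2, 95.2, 95.0, 41.2, 99.1, 116.9,
         121.1, 137.2, 65.4, 80.8, 110.8, 99.5, 85.1, 68.0, 123.4, 132.7]

# Credibility factor Z = 1 / (1 + sigma^2 / (v^2 m)) and premia Z * Y_bar_i + (1 - Z) * mu.
z = 1.0 / (1.0 + sigma2_hat / (v2_hat * m))
premia = [z * y + (1.0 - z) * mu_hat for y in y_bar]

for risk, premium in enumerate(premia, start=1):
    print(f"{risk:2d} {premium:8.3f}")
```

Each premium is pulled from the individual mean towards the overall mean by the factor $1 - Z$, so risks far from 102.02 are adjusted the most in absolute terms.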
3.3.2. The Bühlmann Straub model
The main field of application for credibility theory is collective insurance contracts,
e.g. employees insurance, fire insurance for companies, third party liability insurance
for employees, travel insurance for the customers of a travel agent, etc. In reality the
volume of the risk is not the same for each of the contracts. Looking at the credibility
formulae calculated until now we can recognize that the credibility factors should
be larger for higher risk volumes, because we have more data available. Thus we
want to take the volume of the risk into consideration. The risk volume can be the
number of employees, the sum insured, etc.
In order to see how to model different risk volumes we first consider an example.
  Risk   Premium      Risk   Premium      Risk   Premium
    1     67.240        8     44.570       15    110.313
    2    100.962        9     99.262       16     99.640
    3    113.336       10    116.075       17     86.038
    4    123.443       11    120.043       18     69.885
    5    160.754       12    135.251       19    122.215
    6     95.578       13     67.429       20    131.000
    7     95.389       14     81.976
Table 3.3: Credibility premia for the twenty risks
Example 3.2. Assume that we insure taxi drivers. The losses of each driver
depend on the driver himself, but also on the policies of the employer and the
region where the company works. The $i$-th company employs $P_i$ taxi drivers. There
is a random variable $\Theta_i$ which determines the environment and the policies of the
company. The random variable $\Theta_{ik}$ determines the risk class of each single driver.
We assume that the vectors $(\Theta_{i1}, \Theta_{i2}, \ldots, \Theta_{iP_i}, \Theta_i)$ are independent and that, given
$\Theta_i$, the parameters $\Theta_{i1}, \Theta_{i2}, \ldots, \Theta_{iP_i}$ are conditionally iid. The aggregate claims of
different companies shall be independent. The aggregate claims of different drivers
of company $i$ are conditionally independent given $\Theta_i$, and the aggregate claims $Y_{ikj}$
of a driver are conditionally independent given $\Theta_{ik}$ and $Y_{ikj}$ depends on $\Theta_{ik}$ only.
Then the expected value of the annual aggregate claims of company $i$ given $\Theta_i$ is
\[
  \mathbb{E}\Bigl[ \sum_{k=1}^{P_i} Y_{ikj} \Bigm| \Theta_i \Bigr] = P_i\, \mathbb{E}[Y_{i1j} \mid \Theta_i] .
\]
The conditional variance of company $i$ is then
\[
  \mathbb{E}\Bigl[ \Bigl( \sum_{k=1}^{P_i} Y_{ikj} - P_i\, \mathbb{E}[Y_{i1j} \mid \Theta_i] \Bigr)^2 \Bigm| \Theta_i \Bigr]
  = P_i\, \mathbb{E}\bigl[ (Y_{i1j} - \mathbb{E}[Y_{i1j} \mid \Theta_i])^2 \bigm| \Theta_i \bigr] .
\]
Thus both the conditional mean and the conditional variance are proportional to the volume $P_i$,
with factors that depend only on characteristics of the company.
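The example will motivate the model assumptions. Let $P_{ij}$ denote the volume of
the $i$-th risk in year $j$. We assume that there exists a parameter $\Theta_i$ for each risk. The
parameters $\Theta_1, \Theta_2, \ldots, \Theta_n$ are assumed to be iid. Denote by $Y_{ij}$ the aggregate claims
of risk $i$ in year $j$. We assume that the vectors $(Y_{i1}, Y_{i2}, \ldots, Y_{im}, \Theta_i)$ are independent.
Within each risk the random variables $Y_{i1}, Y_{i2}, \ldots, Y_{im}$ are conditionally independent
given $\Theta_i$. Moreover, there exist functions $m(\vartheta)$ and $s^2(\vartheta)$ such that
\[
  \mathbb{E}[Y_{ij} \mid \Theta_i = \vartheta] = P_{ij}\, m(\vartheta)
\]
and
\[
  \operatorname{Var}[Y_{ij} \mid \Theta_i = \vartheta] = P_{ij}\, s^2(\vartheta) .
\]
As in the Bühlmann model we let $\mu = \mathbb{E}[m(\Theta_i)]$, $\sigma^2 = \mathbb{E}[s^2(\Theta_i)]$ and $v^2 =
\operatorname{Var}[m(\Theta_i)]$. Denote by
\[
  P_{i\cdot} = \sum_{j=1}^{m} P_{ij} , \qquad P_{\cdot\cdot} = \sum_{i=1}^{n} P_{i\cdot}
\]
the risk volume of the $i$-th risk over all $m$ years and the risk volume over all years
and companies.
First we normalize the annual losses. Let $X_{ij} = Y_{ij} / P_{ij}$. Note that $\mathbb{E}[X_{ij} \mid \Theta_i] =
m(\Theta_i)$ and $\operatorname{Var}[X_{ij} \mid \Theta_i] = s^2(\Theta_i) / P_{ij}$. We next try to find the best linear estimator
for $m(\Theta_i)$. Hence we have to minimize
\[
  \mathbb{E}\Bigl[ \Bigl( a_{i0} + \sum_{j=1}^{m} a_{ij} X_{ij} - m(\Theta_i) \Bigr)^2 \Bigr] .
\]
It is clear that $X_{kj}$ does not carry information on $m(\Theta_i)$ if $k \ne i$. We first differentiate with respect to $a_{i0}$.
\begin{align*}
  \frac12 \frac{d}{da_{i0}} \mathbb{E}\Bigl[ \Bigl( a_{i0} + \sum_{j=1}^{m} a_{ij} X_{ij} - m(\Theta_i) \Bigr)^2 \Bigr]
  &= \mathbb{E}\Bigl[ a_{i0} + \sum_{j=1}^{m} a_{ij} X_{ij} - m(\Theta_i) \Bigr] \\
  &= a_{i0} - \Bigl( 1 - \sum_{j=1}^{m} a_{ij} \Bigr) \mu = 0 .
\end{align*}
As in the Bühlmann model we get
\[
  a_{i0} = \Bigl( 1 - \sum_{j=1}^{m} a_{ij} \Bigr) \mu
\]
and therefore
\[
  \mathbb{E}\Bigl[ a_{i0} + \sum_{j=1}^{m} a_{ij} X_{ij} \Bigr] = \mu .
\]
Differentiate with respect to $a_{ik}$,
\begin{align*}
  \frac12 \frac{d}{da_{ik}} \mathbb{E}\Bigl[ \Bigl( a_{i0} + \sum_{j=1}^{m} a_{ij} X_{ij} - m(\Theta_i) \Bigr)^2 \Bigr]
  &= \mathbb{E}\Bigl[ X_{ik} \Bigl( a_{i0} + \sum_{j=1}^{m} a_{ij} X_{ij} - m(\Theta_i) \Bigr) \Bigr] \\
  &= a_{i0} \mu + \sum_{\substack{j=1 \\ j \ne k}}^{m} a_{ij}\, \mathbb{E}[X_{ik} X_{ij}] + a_{ik}\, \mathbb{E}[X_{ik}^2] - \mathbb{E}[X_{ik}\, m(\Theta_i)] = 0 .
\end{align*}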
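Compute the terms. For $j \ne k$
\begin{align*}
  \mathbb{E}[X_{ik} X_{ij}] &= \mathbb{E}[\mathbb{E}[X_{ik} X_{ij} \mid \Theta_i]] = \mathbb{E}[(m(\Theta_i))^2] = v^2 + \mu^2 , \\
  \mathbb{E}[X_{ik}^2] &= \mathbb{E}[\mathbb{E}[X_{ik}^2 \mid \Theta_i]] = \mathbb{E}[(m(\Theta_i))^2 + P_{ik}^{-1} s^2(\Theta_i)] = v^2 + \mu^2 + P_{ik}^{-1} \sigma^2 , \\
  \mathbb{E}[X_{ik}\, m(\Theta_i)] &= \mathbb{E}[\mathbb{E}[X_{ik}\, m(\Theta_i) \mid \Theta_i]] = \mathbb{E}[(m(\Theta_i))^2] = v^2 + \mu^2 .
\end{align*}
Thus
\begin{align*}
  \Bigl( 1 - \sum_{j=1}^{m} a_{ij} \Bigr) \mu^2 + \sum_{\substack{j=1 \\ j \ne k}}^{m} a_{ij} (v^2 + \mu^2) + a_{ik} (v^2 + \mu^2 + P_{ik}^{-1} \sigma^2) - (v^2 + \mu^2)
  = a_{ik} P_{ik}^{-1} \sigma^2 - \Bigl( 1 - \sum_{j=1}^{m} a_{ij} \Bigr) v^2 = 0 .
\end{align*}
The right hand side of
\[
  \sigma^2 P_{ik}^{-1} a_{ik} = \Bigl( 1 - \sum_{j=1}^{m} a_{ij} \Bigr) v^2
\]
is independent of $k$ and thus there exists a constant $a$ such that $a_{ik} = P_{ik}\, a$. Then it
follows readily
\[
  a_{ik} = P_{ik} \frac{v^2}{\sigma^2 + P_{i\cdot} v^2} .
\]
Note that the formula is consistent with (3.2). Define
\[
  \bar X_i = \frac{1}{P_{i\cdot}} \sum_{j=1}^{m} P_{ij} X_{ij} ,
\]
the weighted mean of the annual losses of the $i$-th risk. Then the credibility premium
is
\[
  \Bigl( 1 - \frac{P_{i\cdot} v^2}{\sigma^2 + P_{i\cdot} v^2} \Bigr) \mu + \frac{P_{i\cdot} v^2}{\sigma^2 + P_{i\cdot} v^2} \bar X_i = Z_i \bar X_i + (1 - Z_i) \mu
\]
with the credibility factor
\[
  Z_i = \frac{1}{1 + \dfrac{\sigma^2}{P_{i\cdot} v^2}} .
\]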
Note that the premium is consistent with (3.3). Denote by P̂i(m+1) the (estimated)
risk volume of the next year. Then the credibility premium for the next year will be
P̂i(m+1) (Zi X̄i + (1 − Zi )µ) .
The credibility factor Zi depends on Pi· . This means that in general different risks
will get different credibility factors.
It remains to find estimators for the parameters.
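i) $\mu$: We want to modify the estimator (3.4) for the Bühlmann Straub model. The
loss $X_{ij}$ should be weighted with $P_{ij}$ because it originates from a volume of this
size. Thus we get
\[
  \hat\mu = \frac{1}{P_{\cdot\cdot}} \sum_{i=1}^{n} \sum_{j=1}^{m} P_{ij} X_{ij} = \frac{1}{P_{\cdot\cdot}} \sum_{i=1}^{n} P_{i\cdot} \bar X_i .
\]
It turns out that $\hat\mu$ is unbiased.
ii) $\sigma^2$: We want to modify the estimator (3.5) for the Bühlmann Straub model.
The conditional variance of $X_{ij}$ is $P_{ij}^{-1} s^2(\Theta_i)$. This suggests that we should
weight the term $(X_{ij} - \bar X_i)^2$ with $P_{ij}$. Therefore we try an estimator of the form
\[
  c \sum_{i=1}^{n} \sum_{j=1}^{m} P_{ij} (X_{ij} - \bar X_i)^2 .
\]
In order to compute the expected value of this estimator we keep $i$ fixed.
\begin{align*}
  \mathbb{E}\Bigl[ \sum_{j=1}^{m} P_{ij} (X_{ij} - \bar X_i)^2 \Bigr]
  &= \mathbb{E}\Bigl[ \sum_{j=1}^{m} P_{ij} X_{ij}^2 - 2 \sum_{j=1}^{m} P_{ij} X_{ij} \bar X_i + \sum_{j=1}^{m} P_{ij} \bar X_i^2 \Bigr] \\
  &= \mathbb{E}\Bigl[ \sum_{j=1}^{m} P_{ij} X_{ij}^2 \Bigr] - \mathbb{E}[P_{i\cdot} \bar X_i^2] \\
  &= \sum_{j=1}^{m} P_{ij} \Bigl( \frac{\sigma^2}{P_{ij}} + v^2 + \mu^2 \Bigr) - \frac{1}{P_{i\cdot}} \sum_{j=1}^{m} \sum_{k=1}^{m} P_{ij} P_{ik}\, \mathbb{E}[X_{ij} X_{ik}] \\
  &= m \sigma^2 + P_{i\cdot} (v^2 + \mu^2) - P_{i\cdot} (v^2 + \mu^2) - \sigma^2 = (m - 1) \sigma^2 . \tag{3.7}
\end{align*}
Hence the proposed estimator has mean value
\[
  \mathbb{E}\Bigl[ c \sum_{i=1}^{n} \sum_{j=1}^{m} P_{ij} (X_{ij} - \bar X_i)^2 \Bigr] = c\, n (m - 1) \sigma^2 .
\]
Thus our unbiased estimator is
\[
  \hat\sigma^2 = \frac{1}{n(m-1)} \sum_{i=1}^{n} \sum_{j=1}^{m} P_{ij} (X_{ij} - \bar X_i)^2 .
\]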
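iii) $v^2$: A closer look at the estimator (3.6) shows that it contains the terms $(Y_{ij} -
\hat\mu)^2$, all with the same weight. For the estimator in the Bühlmann Straub model
we propose an estimator that contains a term of the form
\[
  \sum_{i=1}^{n} \sum_{j=1}^{m} P_{ij} (X_{ij} - \hat\mu)^2 .
\]
Let us compute its expected value.
\begin{align*}
  \mathbb{E}\Bigl[ \sum_{i=1}^{n} \sum_{j=1}^{m} P_{ij} (X_{ij} - \hat\mu)^2 \Bigr]
  &= \sum_{i=1}^{n} \sum_{j=1}^{m} P_{ij}\, \mathbb{E}[X_{ij}^2] - P_{\cdot\cdot}\, \mathbb{E}[\hat\mu^2] \\
  &= \sum_{i=1}^{n} \sum_{j=1}^{m} P_{ij} \Bigl( \frac{\sigma^2}{P_{ij}} + v^2 + \mu^2 \Bigr) - \frac{1}{P_{\cdot\cdot}} \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{n} \sum_{l=1}^{m} P_{ij} P_{kl}\, \mathbb{E}[X_{ij} X_{kl}] \\
  &= nm \sigma^2 + P_{\cdot\cdot} (v^2 + \mu^2) - \frac{1}{P_{\cdot\cdot}} \Bigl( \sum_{i=1}^{n} \sum_{j=1}^{m} P_{ij}^2 \Bigl( \frac{\sigma^2}{P_{ij}} + v^2 + \mu^2 \Bigr) \\
  &\qquad + \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{\substack{l=1 \\ l \ne j}}^{m} P_{ij} P_{il} (v^2 + \mu^2) + \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{\substack{k=1 \\ k \ne i}}^{n} \sum_{l=1}^{m} P_{ij} P_{kl}\, \mu^2 \Bigr) \\
  &= (nm - 1) \sigma^2 + \Bigl( P_{\cdot\cdot} - \sum_{i=1}^{n} \frac{P_{i\cdot}^2}{P_{\cdot\cdot}} \Bigr) v^2 . \tag{3.8}
\end{align*}
Let
\[
  P^* = \frac{1}{nm - 1} \sum_{i=1}^{n} P_{i\cdot} \Bigl( 1 - \frac{P_{i\cdot}}{P_{\cdot\cdot}} \Bigr) .
\]
Then
\[
  \hat v^2 = \frac{1}{P^*} \Bigl\{ \frac{1}{nm - 1} \sum_{i=1}^{n} \sum_{j=1}^{m} P_{ij} (X_{ij} - \hat\mu)^2 - \hat\sigma^2 \Bigr\}
\]
is an unbiased estimator of $v^2$.

                              Year
  Company     1    2    3    4    5    6    7    8    9   10
     1       10   10   12   12   15   15   15   13    9    9
     2        5    5    5    5    5    5    5    5    5    5
     3        2    2    3    3    3    3    3    3    3    3
     4       15   20   20   20   21   22   22   22   25   25
     5        5    5    5    3    3    3    3    3    3    5
     6       10   10   10   10   10   10   10   12   12   12
Table 3.4: Premium volumes $P_{ij}$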
Example 3.3. An insurance company has issued motor insurance policies for
business cars to six companies. Table 3.4 shows the number of cars for each of the
six companies in each of the past ten years. Table 3.5 shows the aggregate claims,
measured in thousands of €, for each of the companies in each of the past ten years.
The first step is to calculate the mean aggregate claims per car $X_{ij} = Y_{ij} / P_{ij}$ for each
of the companies in each year, see Table 3.6. For the computation of the estimators
the figures in Table 3.7 are needed. We further get
\[
  P_{\cdot\cdot} = 554 , \qquad P^* = 7.08585 .
\]
We can now compute the following estimates.
\[
  \hat\mu = \frac{1}{P_{\cdot\cdot}} \sum_{i=1}^{6} P_{i\cdot} \bar X_i = 9.94765 ,
\]
\[
  \hat\sigma^2 = \frac{1}{6} \sum_{i=1}^{6} \frac{1}{9} \sum_{j=1}^{10} P_{ij} (X_{ij} - \bar X_i)^2 = 157.808 ,
\]
\[
  \hat v^2 = \frac{1}{P^*} \Bigl\{ \frac{1}{59} \sum_{i=1}^{6} \sum_{j=1}^{10} P_{ij} (X_{ij} - \hat\mu)^2 - \hat\sigma^2 \Bigr\} = 28.7578 .
\]
The number of business cars each company plans to use next year and the credibility
premium are given in Table 3.8. One can clearly see how the credibility factor increases
with $P_{i\cdot}$.

                              Year
  Company     1    2    3    4    5    6    7    8    9   10
     1       74   50  180   43  179  140   95  149   20   81
     2       52  111   83   74   87   85   40   71   93   83
     3       40   28   60   59   43   74   61   46   72   81
     4      171   85  100  116  153   44   13   31  110  252
     5       49  132   74   37   21   54   27   43   53  102
     6      126  148  128  151  165  128  100   65  233  246
Table 3.5: Aggregate annual losses $Y_{ij}$

                                             Year
  Company      1       2       3       4       5       6       7       8       9      10
     1       7.4     5      15       3.583  11.933   9.333   6.333  11.462   2.222   9
     2      10.4    22.2    16.6    14.8    17.4    17       8      14.2    18.6    16.6
     3      20      14      20      19.667  14.333  24.667  20.333  15.333  24      27
     4      11.4     4.25    5       5.8     7.286   2       0.591   1.409   4.4    10.08
     5       9.8    26.4    14.8    12.333   7      18       9      14.333  17.667  20.4
     6      12.6    14.8    12.8    15.1    16.5    12.8    10       5.417  19.417  20.5
Table 3.6: Normalised losses $X_{ij}$
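The whole of Example 3.3 can be recomputed from the volumes of Table 3.4 and the aggregate losses of Table 3.5. The following sketch implements the estimators of this section for the case that all $P_{ij} > 0$; the function name `buhlmann_straub` and the layout of its result are assumptions of this illustration. Up to rounding, the output should agree with the figures quoted in the example and in Tables 3.7 and 3.8.

```python
# Rows are companies 1..6, columns are years 1..10 (Tables 3.4 and 3.5).
P = [[10, 10, 12, 12, 15, 15, 15, 13, 9, 9],
     [5, 5, 5, 5, 5, 5, 5, 5, 5, 5],
     [2, 2, 3, 3, 3, 3, 3, 3, 3, 3],
     [15, 20, 20, 20, 21, 22, 22, 22, 25, 25],
     [5, 5, 5, 3, 3, 3, 3, 3, 3, 5],
     [10, 10, 10, 10, 10, 10, 10, 12, 12, 12]]
Y = [[74, 50, 180, 43, 179, 140, 95, 149, 20, 81],
     [52, 111, 83, 74, 87, 85, 40, 71, 93, 83],
     [40, 28, 60, 59, 43, 74, 61, 46, 72, 81],
     [171, 85, 100, 116, 153, 44, 13, 31, 110, 252],
     [49, 132, 74, 37, 21, 54, 27, 43, 53, 102],
     [126, 148, 128, 151, 165, 128, 100, 65, 233, 246]]

def buhlmann_straub(P, Y):
    """Empirical Bayes estimates in the Buhlmann-Straub model (all volumes positive)."""
    n, m = len(P), len(P[0])
    X = [[y / p for y, p in zip(Yi, Pi)] for Yi, Pi in zip(Y, P)]   # normalised losses
    P_i = [sum(row) for row in P]                                   # volume per risk
    P_tot = sum(P_i)                                                # total volume
    X_bar = [sum(p * x for p, x in zip(P[i], X[i])) / P_i[i] for i in range(n)]
    mu = sum(pi * xb for pi, xb in zip(P_i, X_bar)) / P_tot
    within = sum(p * (x - X_bar[i]) ** 2 for i in range(n) for p, x in zip(P[i], X[i]))
    sigma2 = within / (n * (m - 1))
    total = sum(p * (x - mu) ** 2 for i in range(n) for p, x in zip(P[i], X[i]))
    P_star = sum(pi * (1.0 - pi / P_tot) for pi in P_i) / (n * m - 1)
    v2 = (total / (n * m - 1) - sigma2) / P_star
    Z = [pi * v2 / (sigma2 + pi * v2) for pi in P_i]                # credibility factors
    unit = [z * xb + (1.0 - z) * mu for z, xb in zip(Z, X_bar)]     # premium per unit volume
    return {"mu": mu, "sigma2": sigma2, "v2": v2, "P_star": P_star, "Z": Z, "unit": unit}

est = buhlmann_straub(P, Y)
cars_next_year = [9, 6, 3, 24, 3, 12]   # planned volumes from Table 3.8
premia = [c * u for c, u in zip(cars_next_year, est["unit"])]
print(est["mu"], est["sigma2"], est["v2"], est["P_star"])
print([round(z, 5) for z in est["Z"]], [round(p, 3) for p in premia])
```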
  Company   $P_{i\cdot}$   $\bar X_i$   $\sum_j P_{ij}(X_{ij} - \bar X_i)^2$   $\sum_j P_{ij}(X_{ij} - \hat\mu)^2$
     1         120            8.425              1659.62                              1937.84
     2          50           15.580               735.78                              2321.95
     3          28           20.143               494.10                              3404.48
     4         212            5.071              2310.63                              7352.86
     5          38           15.579              1289.26                              2494.30
     6         106           14.057              2032.23                              3821.88
Table 3.7: Quantities used in the estimators

  Company   Cars next year   Credibility factor   Credibility premium per unit volume   Actual premium
     1             9              0.95627                     8.4916                        76.424
     2             6              0.90110                    15.0230                        90.138
     3             3              0.83613                    18.4722                        55.417
     4            24              0.97477                     5.1938                       124.651
     5             3              0.87382                    14.8684                        44.605
     6            12              0.95078                    13.8544                       166.252
Table 3.8: Premium next year

3.3.3. The Bühlmann Straub model with missing data
In practice it is not realistic that all policy holders of a certain collective insurance
type started to insure their risks in the same year. Thus some of the $P_{ij}$'s may be
0. The question arises whether this has an influence on the model. The answer is
that its influence is only small. Denote by $m_i$ the number of years with $P_{ij} \ne 0$.
For the credibility estimator it is clear that $a_{ij} = 0$ if $P_{ij} = 0$ because then also
$Y_{ij} = 0$. For convenience we define $X_{ij} = 0$ if $P_{ij} = 0$. The computation of $a_{ij}$ does
not change if some of the $P_{ij}$'s are 0. Moreover, it is easy to see that $\hat\mu$ remains
unbiased.
Next let us consider the estimator of $\sigma^2$. A closer look at (3.7) shows that
\[
  \mathbb{E}\Bigl[ \sum_{j=1}^{m} P_{ij} (X_{ij} - \bar X_i)^2 \Bigr] = (m_i - 1) \sigma^2 .
\]
Hence the estimator of $\sigma^2$ must be changed to
\[
  \hat\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{m_i - 1} \sum_{j=1}^{m} P_{ij} (X_{ij} - \bar X_i)^2 .
\]
Consider the estimator of $v^2$. The computation (3.8) changes to
\[
  \mathbb{E}\Bigl[ \sum_{i=1}^{n} \sum_{j=1}^{m} P_{ij} (X_{ij} - \hat\mu)^2 \Bigr] = (m_\cdot - 1) \sigma^2 + \Bigl( P_{\cdot\cdot} - \sum_{i=1}^{n} \frac{P_{i\cdot}^2}{P_{\cdot\cdot}} \Bigr) v^2
\]
where
\[
  m_\cdot = \sum_{i=1}^{n} m_i .
\]
Thus one has to change
\[
  P^* = \frac{1}{m_\cdot - 1} \sum_{i=1}^{n} P_{i\cdot} \Bigl( 1 - \frac{P_{i\cdot}}{P_{\cdot\cdot}} \Bigr)
\]
and
\[
  \hat v^2 = \frac{1}{P^*} \Bigl\{ \frac{1}{m_\cdot - 1} \sum_{i=1}^{n} \sum_{j=1}^{m} P_{ij} (X_{ij} - \hat\mu)^2 - \hat\sigma^2 \Bigr\} .
\]
3.4. General Bayes Methods
We can generalise the approach above by summarising the observations in a vector
$X_i$. In the models considered before we would have $X_i = (X_{i,1}, X_{i,2}, \ldots, X_{i,m_i})^\top$.
We then want to estimate the vector $m_i \in \mathbb{R}^s$. In the models above we had
$m_i = m(\Theta_i)$. We assume that the vectors $(X_i^\top, m_i^\top)^\top$ are independent and that
second moments exist. In contrast to the models considered above we do not assume
that the $X_{ik}$ are conditionally independent nor identically distributed. For example,
we could have
\[
  m_i = (\mathbb{E}[X_{ik} \mid \Theta_i], \mathbb{E}[X_{ik}^2 \mid \Theta_i])^\top \quad \text{and} \quad X_i = (X_{i1}, X_{i1}^2, X_{i2}, X_{i2}^2, \ldots, X_{i,m_i}, X_{i,m_i}^2)^\top .
\]
The goal is now to estimate $m_i$.
The estimator $M$ minimising $\mathbb{E}[\|M - m_i\|^2]$ is then $M = \mathbb{E}[m_i \mid X_i]$. The
problem with the estimator is again that we need the joint distribution of the
quantities. We therefore restrict again to linear estimators, i.e. estimators of the
form $M = g + GX$, where we for the moment omit the index. We denote the
entries of $g$ by $(g_{i0})$ and the entries of $G$ by $(g_{ij})$. The quantity to minimise is then
\[
  \rho(M) := \sum_i \mathbb{E}\Bigl[ \Bigl( g_{i0} + \sum_{j=1}^{m} g_{ij} X_j - m_i \Bigr)^2 \Bigr] .
\]
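In order to minimise $\rho(M)$ we first take the derivative with respect to $g_{i0}$ and equate
it to zero
\[
  \frac{d\rho(M)}{dg_{i0}} = 2\, \mathbb{E}\Bigl[ g_{i0} + \sum_{\ell=1}^{m} g_{i\ell} X_\ell - m_i \Bigr] = 0 .
\]
The derivative with respect to $g_{ij}$ yields
\[
  \frac{d\rho(M)}{dg_{ij}} = 2\, \mathbb{E}\Bigl[ X_j \Bigl( g_{i0} + \sum_{\ell=1}^{m} g_{i\ell} X_\ell - m_i \Bigr) \Bigr] = 0 .
\]
In matrix form we can express the equations as
\[
  \mathbb{E}[g + GX - m] = 0 , \qquad \mathbb{E}[(g + GX - m) X^\top] = 0 .
\]
The solution we look for is therefore
\begin{align}
  g &= \mathbb{E}[m] - G\, \mathbb{E}[X] , \tag{3.9a} \\
  G &= \operatorname{Cov}[m, X^\top] \operatorname{Var}[X]^{-1} . \tag{3.9b}
\end{align}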
Here we assume that $\operatorname{Var}[X]$ is invertible. Note that $\operatorname{Var}[X]$ is symmetric. Because
for any vector $a$ we have $a^\top \operatorname{Var}[X] a = \operatorname{Var}[a^\top X]$ it follows that $\operatorname{Var}[X]$ is positive
semi-definite. If $\operatorname{Var}[X]$ is not invertible, then there is $a \ne 0$ such that $a^\top \operatorname{Var}[X] a = 0$,
or equivalently, $a^\top X$ is almost surely constant. Hence some of the entries of $X$ can be expressed via the
others. In this case, it is no loss of generality to reduce $X$ to $X^*$ such that $\operatorname{Var}[X^*]$
becomes invertible.
If we now again consider $n$ collective contracts $(X_i, m_i)$ the linear Bayes estimator $M_i$ for $m_i$ can be written as
\[
  M_i = \mathbb{E}[m_i] + \operatorname{Cov}[m_i, X_i^\top] \operatorname{Var}[X_i]^{-1} (X_i - \mathbb{E}[X_i]) .
\]
Thus we start with the expected value $\mathbb{E}[m_i]$ and correct it according to the deviation of $X_i$ from the expected value $\mathbb{E}[X_i]$. The correction is "larger" if the covariance between $m_i$ and $X_i$ is "larger", and "smaller" if the variance of $X_i$ is "larger".
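For a scalar target the formulas (3.9a) and (3.9b) amount to solving one linear system. The sketch below does this with a small Gaussian elimination and checks the result in the Bühlmann model, where $\operatorname{Var}[X]$ has entries $v^2 + \sigma^2 \delta_{jk}$ and $\operatorname{Cov}[m, X_j] = v^2$, so that the weights should equal $v^2/(\sigma^2 + m v^2)$ as in (3.2). The helper names `solve` and `linear_bayes_scalar` and the numerical values are assumptions made for the illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def linear_bayes_scalar(mean_m, mean_x, cov_mx, var_x):
    """Intercept g and weights G of the linear Bayes estimator g + G X for a scalar m.

    Var[X] is symmetric, so G = Cov[m, X^T] Var[X]^{-1} solves Var[X] G^T = Cov[X, m]."""
    weights = solve(var_x, cov_mx)                                        # (3.9b)
    intercept = mean_m - sum(w * mx for w, mx in zip(weights, mean_x))    # (3.9a)
    return intercept, weights

# Buhlmann model with hypothetical moments: m = 3 years, mu = 10, sigma^2 = 2, v^2 = 1.
m_years, mu, sigma2, v2 = 3, 10.0, 2.0, 1.0
var_x = [[v2 + (sigma2 if j == k else 0.0) for k in range(m_years)] for j in range(m_years)]
intercept, weights = linear_bayes_scalar(mu, [mu] * m_years, [v2] * m_years, var_x)
print(intercept, weights)
```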
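Example 3.4. Suppose there is an unobserved variable $\Theta_i$ that determines the
risk. Suppose $\mathbb{E}[X_i \mid \Theta_i] = Y_i\, b(\Theta_i)$ and $\operatorname{Var}[X_i \mid \Theta_i] = P_i V_i(\Theta_i)$ for some known
$Y_i \in \mathbb{R}^{m \times q}$, $P_i \in \mathbb{R}^{m \times m}$ and some functions $b(\theta) \in \mathbb{R}^q$, $V_i(\theta) \in \mathbb{R}^{m \times m}$ of the
unobservable random variable $\Theta_i$. We assume that $P_i$ is invertible. The Bühlmann
and the Bühlmann-Straub models are special cases.
We here are interested in estimating $b(\Theta_i)$. Introduce the following quantities
$\beta = \mathbb{E}[b(\Theta_i)]$, $\Lambda = \operatorname{Var}[b(\Theta_i)]$ and $\Phi_i = P_i\, \mathbb{E}[V_i(\Theta_i)]$. The moments required are
$\beta$, $\mathbb{E}[X_i] = Y_i \beta$,
\begin{align*}
  \operatorname{Cov}[b(\Theta_i), X_i^\top] &= \operatorname{Cov}[\mathbb{E}[b(\Theta_i) \mid \Theta_i], \mathbb{E}[X_i^\top \mid \Theta_i]] + \mathbb{E}[\operatorname{Cov}[b(\Theta_i), X_i^\top \mid \Theta_i]] \\
  &= \operatorname{Cov}[b(\Theta_i), b(\Theta_i)^\top Y_i^\top] = \operatorname{Cov}[b(\Theta_i), b(\Theta_i)^\top]\, Y_i^\top = \Lambda Y_i^\top ,
\end{align*}
and
\[
  \operatorname{Var}[X_i] = \operatorname{Var}[\mathbb{E}[X_i \mid \Theta_i]] + \mathbb{E}[\operatorname{Var}[X_i \mid \Theta_i]] = Y_i \Lambda Y_i^\top + \Phi_i .
\]
We assume that $\mathbb{E}[V_i(\Theta_i)]$ and therefore $\Phi_i$ is invertible. This yields the solution
\[
  \bar b_i = \Lambda Y_i^\top (Y_i \Lambda Y_i^\top + \Phi_i)^{-1} X_i + \bigl( I - \Lambda Y_i^\top (Y_i \Lambda Y_i^\top + \Phi_i)^{-1} Y_i \bigr) \beta . \tag{3.10}
\]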
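Note that $\Lambda$ is symmetric. If $\Lambda$ were not invertible, we could write some coordinates of $b(\Theta_i)$ as a linear function of the others. Thus we assume that $\Lambda$ is
invertible. A well known formula gives
\[
  (Y_i \Lambda Y_i^\top + \Phi_i)^{-1} = \Phi_i^{-1} - \Phi_i^{-1} Y_i (\Lambda^{-1} + Y_i^\top \Phi_i^{-1} Y_i)^{-1} Y_i^\top \Phi_i^{-1} .
\]
Define the matrix
\[
  Z_i = \Lambda Y_i^\top \Phi_i^{-1} Y_i (\Lambda^{-1} + Y_i^\top \Phi_i^{-1} Y_i)^{-1} \Lambda^{-1} = \Lambda Y_i^\top \Phi_i^{-1} Y_i (\Lambda Y_i^\top \Phi_i^{-1} Y_i + I)^{-1} .
\]
Then we find
\[
  Z_i = (\Lambda Y_i^\top \Phi_i^{-1} Y_i + I - I)(\Lambda Y_i^\top \Phi_i^{-1} Y_i + I)^{-1} = I - (\Lambda Y_i^\top \Phi_i^{-1} Y_i + I)^{-1} .
\]
We also find
\begin{align*}
  \Lambda Y_i^\top (Y_i \Lambda Y_i^\top + \Phi_i)^{-1} Y_i
  &= \Lambda Y_i^\top \Phi_i^{-1} Y_i - \Lambda Y_i^\top \Phi_i^{-1} Y_i (\Lambda^{-1} + Y_i^\top \Phi_i^{-1} Y_i)^{-1} Y_i^\top \Phi_i^{-1} Y_i \\
  &= (I - Z_i) \Lambda Y_i^\top \Phi_i^{-1} Y_i = (\Lambda Y_i^\top \Phi_i^{-1} Y_i + I)^{-1} \Lambda Y_i^\top \Phi_i^{-1} Y_i = Z_i .
\end{align*}
It follows that we can write the LB estimator as
\[
  \bar b_i = (I - Z_i)(\Lambda Y_i^\top \Phi_i^{-1} X_i + \beta) . \tag{3.11}
\]
The matrix $Z_i$ is called the credibility matrix.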
Suppose that in addition $Y_i^\top \Phi_i^{-1} Y_i$ is invertible. This is equivalent to $Y_i$
having full rank $q$ and $m \ge q$. Then
\[
  (\Lambda Y_i^\top \Phi_i^{-1} Y_i + I)(Y_i^\top \Phi_i^{-1} Y_i)^{-1} Y_i^\top \Phi_i^{-1} = \bigl( \Lambda + (Y_i^\top \Phi_i^{-1} Y_i)^{-1} \bigr) Y_i^\top \Phi_i^{-1} .
\]
Multiplying by $(\Lambda Y_i^\top \Phi_i^{-1} Y_i + I)^{-1}$ from the left gives
\[
  (Y_i^\top \Phi_i^{-1} Y_i)^{-1} Y_i^\top \Phi_i^{-1} = (\Lambda Y_i^\top \Phi_i^{-1} Y_i + I)^{-1} \bigl( \Lambda + (Y_i^\top \Phi_i^{-1} Y_i)^{-1} \bigr) Y_i^\top \Phi_i^{-1} .
\]
Rearranging the terms yields
\begin{align*}
  (I - Z_i) \Lambda Y_i^\top \Phi_i^{-1} &= (\Lambda Y_i^\top \Phi_i^{-1} Y_i + I)^{-1} \Lambda Y_i^\top \Phi_i^{-1} \\
  &= \bigl( I - (\Lambda Y_i^\top \Phi_i^{-1} Y_i + I)^{-1} \bigr) (Y_i^\top \Phi_i^{-1} Y_i)^{-1} Y_i^\top \Phi_i^{-1} \\
  &= Z_i (Y_i^\top \Phi_i^{-1} Y_i)^{-1} Y_i^\top \Phi_i^{-1} .
\end{align*}
If we define
\[
  \hat b_i = (Y_i^\top \Phi_i^{-1} Y_i)^{-1} Y_i^\top \Phi_i^{-1} X_i
\]
we can express the estimator in credibility weighted form
\[
  \bar b_i = Z_i \hat b_i + (I - Z_i) \beta .
\]
3.5. Hilbert Space Methods
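In the section before we considered the space $L^2$ of square integrable random variables. Choose the inner product $\langle X, Y \rangle = \mathbb{E}[X^\top Y]$. The problem was to
minimise $\|M - m\|$, where $M$ was an estimator from some linear subspace of estimators. We now want to consider the problem in the Hilbert space $(L^2, \langle \cdot, \cdot \rangle)$.
Let $m \in L^2$ be an unknown quantity. Let $\mathcal{L} \subset L^2$ be some subset. We now
want to minimise $\rho(M) = \|M - m\|^2$ over all estimators $M \in \mathcal{L}$. If $\mathcal{L}$ is a
closed subspace of $L^2$ then the optimal estimator, called the $\mathcal{L}$-Bayes estimator,
is the projection $m_{\mathcal{L}} = \mathrm{pro}(m \mid \mathcal{L})$. Because $m - m_{\mathcal{L}} \in \mathcal{L}^\perp$ we have $\rho(M) =
\|m - m_{\mathcal{L}}\|^2 + \|m_{\mathcal{L}} - M\|^2$ which clearly is minimised by choosing $M = m_{\mathcal{L}}$.
If $\mathcal{L}' \subset \mathcal{L}$ is a closed linear subspace then iterated projections give $m_{\mathcal{L}'} =
\mathrm{pro}(m_{\mathcal{L}} \mid \mathcal{L}')$. More generally, let $\{0\} = \mathcal{L}_0 \subset \mathcal{L}_1 \subset \cdots \subset \mathcal{L}_n = L^2$ be a nested family
of closed linear subspaces of $L^2$. Then we have $m = m_{\mathcal{L}_n}$ and
\[
  m_{\mathcal{L}_k} = \sum_{j=1}^{k} (m_{\mathcal{L}_j} - m_{\mathcal{L}_{j-1}}) .
\]
For example, if $\mathcal{L}_1$ is the space of constants and $\mathcal{L}_2$ is the space of linear estimators
we find $m_{\mathcal{L}_1} = \mathbb{E}[m_i]$ and
\[
  m_{\mathcal{L}_2} - m_{\mathcal{L}_1} = \operatorname{Cov}[m_i, X_i^\top] \operatorname{Var}[X_i]^{-1} (X_i - \mathbb{E}[X_i]) .
\]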
3.6. Bonus-Malus Systems
Credibility is a method that works well for collective contracts. For individual
customers, the method is not feasible because the customers will not understand
the premium policy. One therefore looks for a scheme that is easier to understand.
For motor insurance, a traditional scheme is the bonus-malus system.
A bonus-malus system consists of $I$ classes $\{1, 2, \ldots, I\}$, a transition rule $T(k) =
(t_{ij}(k))_{i,j}$, a premium scale $b$, an initial state $i_0$, and a premium $\pi$ for the maximal
malus class $I$. Here, $t_{ij}(k) = 1$ if a customer in class $i$ is transferred to class $j$ if he has $k$
claims, and $t_{ij}(k) = 0$ otherwise. In particular, $\sum_j t_{ij}(k) = 1$. The premium scale
is $b = (b_1, b_2, \ldots, b_I)^\top$ with $0 < b_1 \le b_2 \le \cdots \le b_I = 1$.
A customer now has $N_i$ claims in year $i$. Let $X(0)$ be the unit row vector with
$X_i(0) = 1_{\{i = i_0\}}$. Then the process $X(n) = X(n-1)\, T(N_n)$ models the movement
of the customer through the classes. The premium the customer has to pay in
period $n$ is then $X(n-1)\, b\, \pi$.
Suppose now that $\{N_i\}$ are iid. Then $\{X(n)\}$ is a Markov chain with transition
probabilities $P = \sum_{k=0}^{\infty} T(k)\, \mathbb{P}[N_n = k]$. We can assume that the Markov chain is
ergodic. Then there is a stationary distribution $p = (p_1, p_2, \ldots, p_I)$. Thus, the average
premium earned per year converges to $p b \pi$. The net premium the insurer should
charge is therefore $\pi = \mathbb{E}[N_k]\, \mu / (p b)$, where $\mu$ is the expected claim size.
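To make the construction concrete, the sketch below builds the transition matrix $P = \sum_k T(k)\, \mathbb{P}[N_n = k]$ for a small system, finds the stationary distribution by iterating $p \mapsto pP$, and evaluates $\pi = \mathbb{E}[N_k]\mu/(pb)$. Everything specific here is an assumption for illustration: classes are indexed from 0, claim numbers are Poisson, a claim-free year moves the customer down one class, $k \ge 1$ claims move him up $k$ classes (capped at the top class), and the premium scale and parameters are invented.

```python
import math

def transition_matrix(num_classes, lam):
    """P = sum_k T(k) P[N = k] for Poisson(lam) claim numbers (hypothetical transition rule)."""
    I = num_classes
    P = [[0.0] * I for _ in range(I)]
    tail = 1.0
    for k in range(I):
        pk = math.exp(-lam) * lam ** k / math.factorial(k)
        tail -= pk
        for i in range(I):
            j = max(i - 1, 0) if k == 0 else min(i + k, I - 1)
            P[i][j] += pk
    for i in range(I):
        P[i][I - 1] += tail      # k >= I claims always lead to the top class
    return P

def stationary_distribution(P, iterations=2000):
    """Approximate the stationary row vector p = p P by repeated multiplication."""
    I = len(P)
    p = [1.0 / I] * I
    for _ in range(iterations):
        p = [sum(p[i] * P[i][j] for i in range(I)) for j in range(I)]
    return p

def net_premium(expected_claims, mean_claim_size, p, b):
    """pi = E[N] mu / (p b), the premium for the maximal malus class."""
    return expected_claims * mean_claim_size / sum(pi * bi for pi, bi in zip(p, b))

lam, mu = 0.1, 1000.0            # hypothetical claim frequency and mean claim size
b = [0.6, 0.8, 1.0]              # hypothetical premium scale with b_I = 1
P = transition_matrix(3, lam)
p = stationary_distribution(P)
print(p, net_premium(lam, mu, p, b))
```

Because most customers end up in the lowest classes, $pb < 1$ and the premium $\pi$ of the top class exceeds the average claims cost $\mathbb{E}[N_k]\mu$.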
Of course, in such a scheme there are many customers. Suppose now there is a
risk parameter $\Theta$. Denote its distribution by $F_\Theta(\vartheta)$. Then the stationary distribution
$p(\vartheta)$ depends on the risk parameter. The net premium is then
\[
  \int \frac{\mathbb{E}[N_k \mid \Theta = \vartheta]\, \mu(\vartheta)}{p(\vartheta)\, b} \, dF_\Theta(\vartheta) .
\]
In a credibility system, the ultimate premium charged to a customer with risk
parameter $\vartheta$ is $\mathbb{E}[N_k \mid \Theta = \vartheta]\, \mu(\vartheta) / (p(\vartheta)\, b)$. In a bonus-malus system one would like
to charge this premium. However, because only a finite number of premia are possible,
this may not be possible in every case. One possible criterion could be to minimise
the square loss
\[
  \int \bigl[ p(\vartheta)\, b\, \pi - \mathbb{E}[N_k \mid \Theta = \vartheta]\, \mu(\vartheta) \bigr]^2 \, dF_\Theta(\vartheta) .
\]
This problem of finding the optimal $b$ is quite complicated because the premium $\pi$
depends on $b$ in a non-linear way.
Bibliographical Remarks
Credibility theory originated in North America in the early part of the twentieth
century. These early ideas are now known as American credibility. References can be
found in the survey paper by Norberg [65]. The Bayesian approach to credibility is
generally attributed to Bailey [14] and [15]. The empirical Bayes approach is due to
Bühlmann [20] and Bühlmann and Straub [21]. For an introduction to bonus-malus
systems see [68, Ch. 7] and references therein.