Thank you to… without whom none of this would have been possible

Contents
ƒ Status
St t update
d t
GI ROC
Effectiveness of Reserving Methods
working party
Steven Fisher & Derek Newton
GIRO 23-26
GIRO,
23 26 September 2008
ƒ What have we been doing?
ƒ What are our results so far?
ƒ What plans for the year ahead?
ƒ Prize draw!
Thank y
you to…
without whom none
of this would have
been possible
1
Working party members
Insurers providing data
Testing exercise volunteers
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
G
Gary
Yeates
Y t
Chris Wiltshire
Shreyas Shah
David Payne
James Orr
Derek Newton
Mary-Frances Miller
ƒ
ƒ
ƒ
ƒ
ƒ
Chris
Ch
i M
Marinan
i
Chris Jelfs
Andrew Gray
Steven Fisher (chair)
Rob Barritt
C
Company
A
Company B
Company C
Company D
Company E
Company F
Company G
Company H
Aditya
y Tibrewala
Alex Panayi
Andrew Gray
Andy White
Antony Claughton
Ben Qin
Brian Gravelsons
Caroline Symonds
Chlose Paillot
Clare Edler
Colin Kerley
Colum D'Auria
Conor Dolan
Craig Martindale
Dale Lee
Daniel Smith
David Halse
Debarshi Chatterjee
Derek Newton
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
Gary
y Yeates
Geoff Morley
Graham Robertson
Ian Thomas
Isobel Prowen
James Wackrow
Jenny Wong
Jonathan Broughton
Julian Leigh
Julie Evans
Julie Sims
Jürgen Wielandts
y
Keith Taylor
Kerryn Ferris
Lis Gibson
Marian Keane
Mark Wylie
Matthew Brown
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
Matthew Myring-McCullagh
y g
g
Matthew Spedding
Oliver Bettis
Patrick Crellin
Paul Goodenough
Philip Archer-Lock
Rebecca Christie
Richard Winter
Scott Yin
Shane O'Dea
Shreyas Shah
Silvana Sarabia
Stuart Yates
Suzanne Patten
Thomas Cordier
Tim Jordan
Vincent Robert
Yew Khuen-Yoon
2
Research funding
ƒ G
Gratefully
t f ll received
i d ffrom th
the A
Actuarial
t i l
Profession
ƒ Used to set up prize fund as incentive for
testing volunteers
ƒ Results of prize draw later…
“The 5 big questions”
What have we
been doing?
ƒ
How accurate are the reserving methods?
ƒ Which reserving methods work best for which classes (or in which
circumstances)
i
) – or more importantly
i
l when
h d
do they
h not work
k well?
ll?
ƒ How much value does the actuary add?
ƒ How much value does understanding the business add (i.e., from additional
information), and what additional information is required for each method?
ƒ
How much value is added by combining different methods, and how does one
assess how much weight to give to each method?
ƒ
What real world circumstances impact the robustness of the reserving methods?
ƒ
What diagnostics, method variants and other adjustments can be applied to improve
the robustness and accuracy of the methods?
ƒ
How volatile is the best estimate under different reserving methods, and how does
this volatility interact with any measure of accuracy for the methods?
3
“The 5 big questions”
ƒ
How accurate are the reserving methods?
ƒ Which reserving methods work best for which classes (or in which
circumstances)
i
) – or more importantly
i
l when
h do
d they
h not work
k well?
ll?
ƒ How much value does the actuary add?
ƒ How much value does understanding the business add (i.e., from additional
information), and what additional information is required for each method?
ƒ
How much value is added by combining different methods, and how does one
assess how much weight to give to each method?
ƒ
What real world circumstances impact the robustness of the reserving methods?
ƒ
What diagnostics, method variants and other adjustments can be applied to improve
the robustness and accuracy of the methods?
ƒ
How volatile is the best estimate under different reserving methods, and how
does this volatility interact with any measure of accuracy for the methods?
Testing approach
Objectives of working party
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
E i i l ttesting
Empirical
ti
Based on real data
Manual & mechanical testing
Analysis of thousands of “reserve errors”
q
Identification of keyy themes/issues/questions
W wantt to
We
t test
t t many different
diff
t methods…
th d
Based on many different datasets...
Covering many different classes…
Run by many different actuaries...
y different yyear-ends!
At many
4
A philosophical question
ƒ What do we mean by an “effective”
effective method?
ƒ Which is more “effective”?
ƒ A method that frequently differs widely from the eventual outcome but,
on average over many trials, comes very close to the eventual
outcome; or
ƒ A method that has less variability from the eventual outcome, but on
average over many trials is not as close to the answer; or
ƒ A method that gives a good answer at an early stage of development,
but the accuracy of that answer doesn't
doesn t improve over time
ƒ Different methods may be more effective in different circumstances
ƒ Development of a “method reliability index” versus graphical
analysis of estimates
Caveats…
What are our
results so far?
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
Artificiality
A
tifi i lit off testing
t ti exercise
i
Absence of real business information
No access to underwriters & claims staff
Absence of market benchmark data
“True” ultimate not known
Time available for testing
Lack of peer review
5
Dataset (All) AccYear (All) Experience (All) Location (All) Qualification (All) Experience in class (All) Type Judgement
Important!
Results overview
3
Average of Error%Ult
2.5
2
ƒ Still much work to be done…
1.5
1
0.5
0
-0.5
-1
-1.5
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
1
2
3
4
5
6
7
8
9
1
2
3
4
5
6
7
8
1
2
3
4
5
6
7
8
9
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
ƒ We
W focus
f
today
t d on questions,
ti
nott conclusions
l i
ƒ 44
44,203
203 individual estimates to analyse
ƒ Possible evidence to suggest tendency to be conservative amongst
testers (but not universal)
ƒ Wide divergence in testers’ estimates – testers suffered from lack
of real business information
ƒ Premium better estimator than claims in early years; claims better
than premium in later years
ƒ No evidence observed from this exercise that final selected
estimates were “better” than ICL/IBF
ƒ Fewer outliers from non-CL/BF techniques
ƒ Beware of misleading patterns
PCL
ICL
ELR
(exp)
ELR
(prem)
PBF
(exp)
PBF
(prem)
IBF
(exp)
IBF
(prem)
APC
AIC
PCE
PPCI
1
2
3
4
5
6
7
8
9
10
11
12
Method order Method Age of acc year
PPCF
OpTimePaid
OpTimeIncICRFS
13
14
15
16
Other1
Other2
Final
17
18
19
YearEnd
Tester
6-1
6-2
6-3
6-4
6-5
6-6
6-7
6-8
6-9
6 - 10
6 - 11
6 - 12
6 - 13
6 - 14
6 - 15
6 - 16
6 - 17
6 - 18
6 - 19
6 - 20
6 - 21
6 - 22
6 - 23
6 - 24
6 - 25
6 - 26
6 - 27
6
1
Dataset Aviation Products Liability AccYear (All) Experience (All) Location (All) Qualification (All) Experience in class (All) Type Judgement
Average of Error%Ult
0.8
0.6
0.4
0.2
0
-0.2
-0.4
-0.6
-0.8
0.8
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
-1
PCL
ICL
ELR (exp)
ELR (prem)
1
2
3
4
PBF (exp)
PBF (prem)
IBF (exp)
IBF (prem)
APC
AIC
PCE
Final
5
6
7
8
9
10
11
19
Method order Method Age of acc year
YearEnd
Tester
6 - 30
6 - 31
6 - 32
6 - 33
6 - 34
7 - 30
7 - 31
7 - 32
7 - 33
7 - 34
8 - 30
8 - 31
8 - 32
8 - 33
8 - 34
9 - 30
9 - 31
9 - 32
9 - 33
9 - 34
10 - 30
10 - 31
10 - 32
10 - 33
10 - 34
3
Average of Error%Ult
2.5
2
1.5
1
0.5
0
-0.5
-1
-1.5
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
Dataset General Liability AccYear (All) Experience (All) Location (All) Qualification (All) Experience in class (All) Type Judgement
PCL
ICL
ELR (prem)
1
2
4
PBF (prem)
IBF (prem)
PCE
ICRFS
Final
6
8
11
16
19
Method order Method Age of acc year
YearEnd
Tester
6-1
6-2
6-3
6-4
6-5
6-6
6-7
6-8
6-9
6 - 60
7-1
7-2
7-3
7-4
7-5
7-6
7-7
7-8
7 - 60
8-1
8-2
8-3
8-4
8-5
8-6
8-7
8-8
Paid vs incurred data
ƒ IIs incurred
i
dd
data
t always
l
“b
“better”
tt ” th
than paid
id d
data?
t ?
ƒ Is choice of method more important than choice
of data?
ƒ How did testers respond to case estimate
redundancies?
ƒ Important to use both datasets?
7
Dataset General Liability AccYear 6 YearEnd (All) Experience (All) Location (All) Qualification (All) Experience in class (All) Type Judgement
1
Dataset Employers' Liability AccYear (All) YearEnd (All) Experience (All) Location (All) Qualification (All) Experience in class (All) Type Judgement
Average of Error%Ult
BF exposure basis
1.2
Average of Error%Ult
1
0.8
0.8
0.6
Method
Method order
PCL - 1
ICL - 2
AIC - 10
APC - 9
Final - 19
IBF (exp) - 7
PBF (exp) - 5
0.4
0.2
0
ƒ T
Traditional
diti
lb
basis
i ffor BF method
th d iis premium
i
as
exposure measure
ƒ How does this compare with other exposure
measures (eg vehicle-years, wageroll), where
available?
Tester
21
22
23
24
25
26
27
28
29
0.6
0.4
0.2
0
-0.2
-0.2
1
-0.4
1
2
3
30
4
5
1
2
3
31
4
5
1
2
3
4
5
32
Tester Age of acc year
1
2
3
33
4
5
1
2
3
34
4
5
2
3
4
5
6
7
8
1
2
3
4
5
IBF (exp)
IBF (prem)
7
8
6
7
8
Method order Method Age of acc year
8
Dataset Motor Third Party Injury AccYear (All) YearEnd 6 Experience (All) Location (All) Qualification (All) Experience in class (All) Type Mechanical
Sensitivity to development factors and
tail factors
1.2
Average of Error%Ult
Questions for further investigation
1
0.8
ƒ H
How bi
big a diff
difference d
does th
the d
development
l
t
factor averaging basis make?
ƒ How big a difference does the tail factor
extrapolation basis make?
ƒ How do these compare to the impact of
randomness?
d
?
Tester
0.6
1mech
2mech
3mech
4mech
5mech
11mech
12mech
0.4
0.2
0
-0.2
-0.4
-0.6
1
2
3
4
5
6
1
2
3
4
PCL
ICL
1
2
5
ƒ D
Do methods
th d b
based
d on ““extra”
t ”d
data
t b
bring
i added
dd d
value?
ƒ Is premium the best exposure measure?
ƒ How do we resolve divergent messages from
paid and incurred data?
ƒ Can we reduce tail factor uncertainty?
6
Method order Method Age of acc year
9
Next steps
What plans for
the year ahead?
ƒ
ƒ
ƒ
ƒ
ƒ
F db k tto ttesting
Feedback
ti volunteers
l t
Further analysis of testing results
Delve deeper into questions raised today
“Controlled” testing using pseudo-data
g
of impact
p
of understanding
g the
Investigation
business
ƒ Another testing exercise?
Prize Draw!
10
Prizes
ƒ
ƒ
ƒ
ƒ
ƒ
A Nintendo Wii games console
A bottle of Chateau Mouton Rothschild 2001
A helicopter tour of London
Dinner for two at Gordon Ramsay at Claridge's
An iPod Touch
ƒ Many thanks to the Actuarial Profession for
providing research funds
And the winners are…
Coming up after the break…
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
ƒ
M
More
detail
d t il on ttesting
ti methodology
th d l
In depth discussion of issues raised
Pseudo-data
Mechanical testing
Tail factors
How to measure “effectiveness”
Some surprising conclusions…
11
GI ROC
Effectiveness of Reserving Methods
working party
Steven Fisher & Derek Newton
GIRO 23-26
GIRO,
23 26 September 2008
12