DYNAMIC ECONOMETRIC ANALYSIS OF INSURANCE MARKETS WITH IMPERFECT INFORMATION

ISBN 978 90 361 00946
Cover design: Crasborn Graphic Designers bno, Valkenburg a.d. Geul

This book is no. 442 of the Tinbergen Institute Research Series, established through cooperation between Thela Thesis and the Tinbergen Institute. A list of books which already appeared in the series can be found in the back.

VRIJE UNIVERSITEIT

Dynamic Econometric Analysis of Insurance Markets with Imperfect Information

ACADEMIC DISSERTATION (ACADEMISCH PROEFSCHRIFT) to obtain the degree of Doctor at the Vrije Universiteit Amsterdam, by authority of the rector magnificus, prof.dr. L.M. Bouter, to be defended in public before the doctoral committee of the Faculty of Economics and Business Administration on Tuesday 13 January 2009 at 13.45 in the auditorium of the university, De Boelelaan 1105, by Tibor Zavadil, born in Šaľa, Slovakia.

promotor: prof.dr. J.H. Abbring

Dedicated to Jaap, Drahuška and my parents

Acknowledgments

First of all I would like to thank Jaap for accepting me for this PhD project. He showed me the magic of science and was not only an excellent supervisor, but also my good friend. Undoubtedly, without him this thesis would never have seen the light of day. Moreover, we had so much fun during my PhD that I can honestly say that it is a pity that my PhD is over. Finally, without Jaap I would never have finished my thesis on time. On the last day before the deadline we worked together all night long (in Jaap's hotel room in Milan) to complete the thesis.

I also have to thank my girlfriend Drahuška, who always supported me, especially in difficult situations that I could not handle myself. Since we met she has been the bright side of my life. Then I need to thank my parents for their love and generosity; they always motivated me to work hard on myself. I would also like to thank my grandparents for their curious questions about the progress of my PhD.
Further, I have to thank my closest colleagues: Ronald for always being available to help me with technical issues, and Marcel for taking care of the printing of my thesis while I was on holiday in south-east Asia. I also want to thank Sander, an insurance professional, for his precious help with the data and for explaining how the business works. I gratefully acknowledge financial support by the Netherlands Organisation for Scientific Research (NWO) through a MaGW Free Competition grant (400-03-257).

Finally, I thank all my friends from Tinbergen Institute, Vrije Universiteit and Sushi Me for making my stay in Amsterdam pleasant and fun. I have never had so many parties in my life before :-).

Peace and love,
Tibor

Contents

Acknowledgments
1 Introduction and Summary
  1.1 Adverse Selection versus Moral Hazard
  1.2 Ex Ante and Ex Post Moral Hazard
  1.3 Thesis Overview
2 Asymmetric Information in Car Insurance
  2.1 Introduction
  2.2 Car Insurance in the Netherlands
  2.3 Data
    2.3.1 Sample Selection
    2.3.2 Claims
  2.4 Asymmetric Information and Occurrence of Claims
    2.4.1 Pair of Probits
    2.4.2 Bivariate Probit
    2.4.3 χ² Test of Independence
    2.4.4 Implementation
      2.4.4.1 Actuarial Study
      2.4.4.2 Experience Rating
      2.4.4.3 Results for Young Drivers
      2.4.4.4 Results for Senior Drivers
      2.4.4.5 Negative Claim-Coverage Correlation
  2.5 Asymmetric Information and Incurred Damages
    2.5.1 Classical Regression
    2.5.2 Nonparametric Tests
      2.5.2.1 Implementation and Results
  2.6 Premium
    2.6.1 Liability Premium and Expected Damage
    2.6.2 Test for Asymmetric Information Based on Claim Frequency
    2.6.3 Test for Asymmetric Information Based on Claim Severity
      2.6.3.1 Model
      2.6.3.2 First Test
      2.6.3.3 Second Test
  2.7 Conclusion
Appendix to Chapter 2
  2.A Technical Details for Tests from Subsection 2.6.3
    2.A.1 Asymptotic Properties of the Test Statistic T2
    2.A.2 Asymptotic Properties of the Test Statistic T4
3 State Dependence
  3.1 Introduction
  3.2 State Dependence and Heterogeneity in Renewal
  3.3 Identifiability
    3.3.1 Occurrence Dependence and Duration Dependence
      3.3.1.1 Full Information
      3.3.1.2 Censored Data
    3.3.2 Occurrence Dependence and Lagged Duration Dependence
    3.3.3 Occurrence Dependence and Nonstationarity
  3.4 Nonparametric Tests
    3.4.1 A Simple Rank Test
      3.4.1.1 The Case without Censoring
      3.4.1.2 The Case with Censoring
      3.4.1.3 Sample Sizes
    3.4.2 A Transformed Rank Test
  3.5 Conclusion
4 Moral Hazard in Dynamic Insurance Data
  4.1 Introduction
  4.2 Institutional Background and Data
    4.2.1 Experience Rating in Dutch Car Insurance
    4.2.2 Data
  4.3 Model of Claim Rates and Sizes
    4.3.1 Primitives
    4.3.2 Optimal Risk, Claims and Savings
    4.3.3 Dynamic Incentives from Experience Rating
      4.3.3.1 Measure of Incentives
      4.3.3.2 Theoretical Characterization of Incentives
      4.3.3.3 Numerical Characterization of Incentives
  4.4 Empirical Analysis
    4.4.1 Econometric Model
    4.4.2 Structural Test on the Full Sample of Claim Times
    4.4.3 Tests for State Dependence in Claim Times and Sizes
      4.4.3.1 Theoretical Implications for the Claims Process
      4.4.3.2 Distribution of First Claim Time
      4.4.3.3 Marginal Distributions of First and Second Claim Times
      4.4.3.4 Joint Distribution of First and Second Claim Durations
      4.4.3.5 Claim Sizes
    4.4.4 Claim Withdrawals
  4.5 Conclusion
Appendices to Chapter 4
  4.A Proofs of Results in Section 4.3
  4.B Computation of Proposition 5's Function Q
  4.C Data
  4.D Main (Corrected) Sample without Young Drivers
  4.E Cleaned Data
  4.F Sample Corrected Based on Initial Bonus-Malus Class
  4.G Main (Corrected) Sample Including Withdrawn Claims
5 Conclusion
  5.1 Integration of the Results
  5.2 Dynamic Contract Choice
  5.3 Observed Contract Dynamics
Summary in Dutch
References

1 Introduction and Summary

Risk-averse agents benefit from insurance against income shocks. Imperfect, asymmetric information may lead to two problems in providing such insurance, moral hazard and adverse selection. Moral hazard arises when agents change their behavior in favor of more risky actions once they are insured. Adverse selection (on risk) arises when agents who are inherently more risk-prone (bad risks) select into buying more insurance.
Either way, competitive markets may fail to provide efficient levels of insurance, which is a main reason for regulation of insurance markets. Furthermore, moral hazard and adverse selection have substantially different implications for optimal contract design. The empirical analysis of the effects of asymmetric information on insurance markets is therefore of major interest. It is a core topic in the recent empirical literature on the economics of contracts. [1]

This PhD project analyzes asymmetric information in car insurance markets, which represent one of the most important sectors in non-life insurance. [2] We use data from a major Dutch car insurer, providing us with detailed information on insurees (age, sex, address of residence), their cars (brand, model, price, engine volume, power, etc.), contracts (coverage, premium, deductible, starting date, renewal date, etc.) and claims (type, damage, impact).

[1] See Chiappori and Salanié (2003) for an overview.
[2] In the Netherlands, car insurance is the second most important sector of nonlife insurance. With a total gross premium income of 4.5 billion euro in 2006, car insurance covered almost 20% of the income of the whole nonlife insurance market. The income from car insurance represents almost 1% of the total GDP. Source: Statistics Netherlands (www.cbs.nl).

As argued by Chiappori and Salanié (1997), insurance data are ideal for the empirical analysis of contract theory. Especially in car insurance, contracts are highly standardized and can be exhaustively described by a small set of variables. A large company typically covers hundreds of thousands of clients, which provides enough variation even with rare events. Finally, the empirical counterpart of the client's performance is the occurrence of an accident and its cost, which are again precisely recorded in the company's files.
Thus, insurance data allow most predictions of contract theory to be tested in detail, using standard econometric tools.

One popular approach to testing for the presence of asymmetric information is based on a general theoretical conclusion that, under asymmetric information, contracts with more comprehensive coverage are chosen by agents with higher expected accident costs. This conclusion holds under both adverse selection and moral hazard. Under adverse selection, high-risk agents, who expect to incur more losses, choose to buy more coverage (Rothschild and Stiglitz, 1976). Under moral hazard, agents who buy (for whatever reason) more insurance become more risky because the extensive coverage reduces incentives for cautious behavior (Shavell, 1979; Holmström, 1979).

Chiappori and Salanié (2000) point out a direct implication of these predictions, which is that, under asymmetric information, a positive correlation between coverage and frequency of accidents should be observed among observationally identical agents. They argue that this prediction is robust to a variety of generalizations. [3] Using both parametric and nonparametric tests, they do not find evidence of asymmetric information in the French car insurance market among young drivers. They suspect, however, that such information asymmetry may arise in the course of time due to asymmetric learning about risk.

[3] However, as argued by de Meza and Webb (2001), this prediction might not hold in the presence of both selection on risk preferences and moral hazard. See Section 2.7 for more discussion.

We adopt this approach in the first part of the thesis and extend it in various ways. First, we will test for asymmetric information also among senior drivers, controlling for their experience rating. Second, we will examine not only claim occurrences but also claim sizes, since both are relevant for the (expected) claim costs.
Finally, we will also explore data on premia, which will facilitate applications of fully nonparametric methods.

The conditional-correlation approach is quite easy to implement, requiring only cross-sectional data on contracts and claims. However, it cannot identify the type of information asymmetry involved (if any). Such a distinction requires more advanced methods, which are discussed in the next section.

1.1 Adverse Selection versus Moral Hazard

Chiappori (2001) discusses several ways to distinguish adverse selection from moral hazard. One of them is to use natural experiments. Assume that a given population faces an incentive structure that is suddenly modified for some exogenous reason, such as a policy reform. If this change in incentives coincides with a change in risk, this is evidence of moral hazard. [4]

Another possibility is to use quasi natural experiments in which identical agents face, for exogenous reasons, different incentive schemes. If they also have different risks, this is evidence of moral hazard. When using this approach, it is important to check that the differences in schemes are purely exogenous and do not reflect some hidden characteristics of the agents. [5]

[4] Natural experiments were exploited, among others, by Dionne and Vanasse (1997), Chiappori, Durand, and Geoffard (1998) and Dionne, Maurice, Pinquet, and Vanasse (2005).
[5] See, for example, Holly, Gardiol, Domenighetti, and Bisig (1998) and Cardon and Hendel (2001), who estimate structural models of health insurance.

The next possibility is to do a social experiment, in which agents are randomly assigned to contracts with different coverage. If agents with better coverage claim more costs, this is evidence of moral hazard. Such an experiment was done, for example, in health insurance by Manning, Newhouse, Duan, Keeler, and Leibowitz (1987), who estimated how cost sharing, i.e. the portion of the bill the patient pays, affects the demand for medical services.
The last possibility is to explore dynamic aspects of the contractual relationship. Two approaches can be chosen here. One assumes that existing contracts are optimal and compares the observed features of these contracts to the theoretical predictions about the form of optimal contracts, which are different under adverse selection and moral hazard. [6] The second approach does not rely on this optimality assumption. Instead, it takes existing contracts as given and contrasts behavior, implied by the theory under adverse selection and moral hazard, to observed behavior. The idea is that particular features of existing contracts (whether optimal or not) have different theoretical implications for observed behavior under adverse selection and moral hazard. Thus, the two can be distinguished by a careful analysis of observed behavior.

This approach was introduced by Abbring, Chiappori, Heckman, and Pinquet (2003), who observed that there is a close relation between the empirical analysis of moral hazard in a market with experience-rated insurance contracts and the classical problem of distinguishing state dependence and heterogeneity in labor economics (Heckman, 1981). Abbring, Chiappori, and Pinquet (2003) formalize this idea using dynamic economic theory. They show that in the French experience-rated car-insurance system each claim at fault (i.e., that triggers a premium increase, or malus) increases incentives to avoid further claims, and therefore reduces claim intensities under moral hazard. The resulting negative occurrence dependence (Heckman and Borjas, 1980) of claims due to moral hazard is possibly counteracted by the effects of unobserved heterogeneity: agents who incur claims are more likely to be bad drivers and to incur more claims in the future anyhow. Abbring et al.
(2003) extend the work of Heckman and Borjas to show that it is possible to detect true occurrence dependence due to moral hazard in the presence of such dynamic selection effects and possible non-stationarity of accident rates over time.

The second part of the thesis continues and broadens this line of work. First, we develop a fully structured dynamic micro-econometric model to study moral hazard in Dutch car insurance. This allows us to exploit the rich variation in incentives, induced by the Dutch experience rating scheme, in our analysis of moral hazard. We also increase the power of the tests by extending the analysis to longer panels of insurance data. Finally, we distinguish two forms of moral hazard, ex ante and ex post, by modeling both claim occurrences and claim sizes. This distinction is important for the implications for market outcomes, optimal contracts and public policy.

[6] See Dionne and Doherty (1994) for an early example.

1.2 Ex Ante and Ex Post Moral Hazard

A central problem in the empirical analysis of insurance data is that insurance companies can typically only provide data on claims that are actually filed with the company, and not on the occurrence of the relevant insured events directly. Even in the absence of false claims, the two may not coincide if reporting an insured loss is to some extent at the insuree's discretion. In such a context, it is useful to distinguish between an ex ante moral hazard effect on the occurrence of insured losses and an ex post moral hazard effect on the propensity to claim once a loss has occurred.

Abbring et al. (2003) focus on the occurrence of claims, and therefore on the combined effects of ex ante and ex post moral hazard. From an insurance company's perspective, this combined effect on claims may be all that matters. However, from an academic and public-policy perspective the distinction between ex ante and ex post moral hazard is of considerable interest.
We address this issue by extending Abbring et al.'s analysis of moral hazard to include not just the occurrences, but also the sizes of claims. The Dutch experience-rating system only punishes the former, not the latter. In particular, given that a claim is filed, the claim amount does not affect future premia. Therefore, a Dutch insuree will be more reluctant to report small losses than to report large losses. After all, the costs of reporting (in terms of increased future premia, etc.) are the same in both cases, but the benefits of reporting small losses are lower. Thus, instead of only analyzing whether the individual claim intensity changes with the past driving history, we also analyze whether the claim sizes change with the number of past claims. The latter is evidence of ex post moral hazard to the extent that, given the occurrence of an accident, insured losses are not affected by ex ante moral hazard.

Complementary information on ex ante and ex post moral hazard can be found in our Dutch car insurance data, where agents have the option to withdraw a claim within six months of the relevant accident, and thus avoid any experience-rating repercussions. This option reduces incentives to underreport accidents when they occur, and provides direct information on ex post moral hazard. In the extreme case that there are no costs to filing and withdrawing a claim (neither direct administrative costs nor indirect informational costs) and insured losses take time (beyond the initial filing period) to be assessed, initial claims will be directly informative on ex ante moral hazard and all ex post moral hazard will manifest itself as claim withdrawals.

1.3 Thesis Overview

This thesis is organized as follows. In Chapter 2 we will test for the presence of asymmetric information in Dutch car insurance using the conditional-correlation approach.
Under asymmetric information, more comprehensive coverage is associated with higher risk, conditional on the information available to the insurer. We explore this prediction by analyzing whether agents with better coverage either have a higher frequency of claims or cause more severe accidents. We also use data on premium, which allows us to develop novel nonparametric methods. Controlling for agents' experience rating, we do not find any evidence of asymmetric information in this market. This is a common result in the empirical literature using the conditional-correlation approach. This chapter is based on Zavadil (2008).

In Chapter 3 we will review and extend results on the identifiability of, and nonparametric tests for, state dependence and heterogeneity in renewal models. The renewal models studied are analogous to linear panel data models with fixed effects and lagged endogenous regressors. This chapter focuses on the specific problems that arise with panel duration data. Most importantly, it explores the implications of the fact that renewal data can typically only be collected over a finite period of time. It shows that such censoring invalidates existing identification results and, in particular if the renewal events are rare, reduces the power of nonparametric tests. This chapter is based on Abbring and Zavadil (2008) and develops some econometric theory used in the next chapter.

Chapter 4 constitutes the core of this thesis. It empirically analyzes moral hazard in car insurance using a dynamic theory of an insuree's dynamic risk (ex ante moral hazard) and claim (ex post moral hazard) choices and Dutch longitudinal micro data. We use the theory to characterize the heterogeneous dynamic changes in incentives to avoid claims that are generated by the Dutch experience-rating scheme, and their effects on claim times and sizes under moral hazard. We develop tests that exploit these structural implications of moral hazard and experience rating.
Unlike much of the earlier literature, we find evidence of moral hazard. This chapter appeared as Abbring, Chiappori, and Zavadil (2008).

Finally, Chapter 5 summarizes the main results from the previous chapters and discusses new ideas for future work. In particular, we argue that a natural next step is to include dynamic contract choices in the analysis. We provide some evidence that our Dutch insurance data contain sufficient variation in contract choices over time to support such an analysis.

Throughout the whole thesis we will use the following convention: "he" will refer to an agent and "she" will refer to an insurance company. This convention is based on linguistics; in most European languages an agent is masculine and a company is feminine.

2 Asymmetric Information in Car Insurance

2.1 Introduction

Analysis of asymmetric information in insurance markets has become a core topic in the recent empirical literature on the economics of contracts. [1] After the seminal work on moral hazard and adverse selection by Arrow (1963), Pauly (1968, 1974) and Rothschild and Stiglitz (1976), who showed that asymmetric information in competitive insurance markets may lead to inefficient outcomes and market failure, economic theorists devoted much effort to the development of adverse selection and moral hazard models. In the last two decades of the twentieth century, contract theory developed at a rapid pace, but empirical applications lagged behind. At the turn of the millennium, this gap was filled with numerous empirical papers analyzing asymmetric information in various insurance markets; see Chiappori and Salanié (2003) for an excellent overview.

[1] To mention just a few recent works: Israel (2004), Ceccarini and Pereira (2004), Cohen (2005), Dionne et al. (2005), Dionne, Dahchour, and Michaud (2006), Chiappori, Jullien, Salanié, and Salanié (2006), Pinquet, Dionne, Vanasse, and Maurice (2007) and Abbring, Chiappori, and Zavadil (2008) analyze asymmetric information in car insurance, and Cardon and Hendel (2001), Hendel and Lizzeri (2003), Fang, Keane, and Silverman (2006) study health and life insurance data.

In the automobile insurance market, the initial empirical studies by Dahlby (1983, 1992) and Puelz and Snow (1994) suggested the existence of adverse selection in car insurance. These findings were later challenged by subsequent research. In particular, Chiappori and Salanié (2000) and Dionne, Gouriéroux, and Vanasse (2001) questioned the results of Puelz and Snow's (1994) analysis, claiming that they used overly constrained functional forms relying on very few variables, and did not control for the agent's seniority and driving experience.

Chiappori and Salanié (2000) adopt an alternative approach based on a theoretical conclusion that higher risks are associated with more comprehensive coverage. This result comes from the seminal works of Rothschild and Stiglitz (1976) and Wilson (1977), who predict under adverse selection that high-risk individuals choose higher insurance coverage and have more accidents (within risk classes). Under moral hazard, Shavell (1979) and Holmström (1979) predict that those with higher insurance coverage have weaker incentives for safe driving and should have more accidents. Consequently, asymmetric information leads to a positive correlation between coverage and frequency of accidents (conditionally on all observables). This prediction is quite general; it does not require any assumptions on preferences or on the firm's pricing policy. Given that it is valid under both adverse selection (bad risks buy more insurance) and moral hazard (comprehensive coverage decreases incentives to drive carefully), tests based on this prediction cannot distinguish between the two. [2]
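The conditional-correlation prediction can be illustrated with a small sketch. Within a cell of observationally identical agents, coverage choice and claim occurrence should be independent if there is no asymmetric information, and a Pearson χ² statistic on the 2×2 table of counts is one simple way to check this. The function name `chi2_independence_2x2` and the counts below are hypothetical and purely illustrative; this is a minimal sketch of the idea, not the implementation used in this chapter.

```python
def chi2_independence_2x2(table):
    """Pearson chi-square statistic for a 2x2 contingency table.

    table[i][j] = number of agents with coverage i (0 = basic, 1 = comprehensive)
    and claim indicator j (0 = no claim, 1 = at least one claim).
    Under independence the statistic is asymptotically chi-square with 1 d.f.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    if n == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


CRITICAL_5PCT = 3.841  # 95% quantile of the chi-square(1) distribution

# Hypothetical cell of observationally identical agents (made-up counts).
cell = [[900, 100],   # basic coverage: no claim, claim
        [450, 50]]    # comprehensive coverage: no claim, claim
stat = chi2_independence_2x2(cell)
reject_independence = stat > CRITICAL_5PCT
```

A rejection by itself does not reveal the sign of the association, so the direction of the correlation would still have to be checked separately.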
Any test based on the correlation between choice of coverage and occurrence of claim must control for all variables observed by the insurer because these are used to price individual risk. Omitting any relevant characteristic observed by both parties can lead to spurious informational asymmetry. Chiappori and Salanié (2000) point out that it is quite problematic to control for the past driving record, which is obviously endogenous. They circumvent this problem by focusing on the subpopulation of young drivers who have no driving history yet. They do not find any evidence of asymmetric information using French car insurance data. This chapter links directly to their work and extends it in various ways.

[2] See Section 1.1 for a discussion of how to disentangle moral hazard from adverse selection.

First, we test for asymmetric information also among senior drivers, controlling for their driving experience observed by the insurer. We argue that conditioning on experience rating is innocuous and, in fact, indispensable because this way we control for possible (symmetric) learning of the insurer about the agent's risk.

Second, we extend the analysis by including sizes of incurred damages. Chiappori (2001) predicts that, under asymmetric information, contracts with more comprehensive coverage are chosen by agents with higher expected accident costs. These depend on two factors: (1) the probability of a claim and (2) its severity. While Chiappori and Salanié (2000) focus only on the first factor by modeling claim frequencies, we also take into account the second factor by exploring the data on observed losses. This is highly relevant when the distribution of incurred losses depends on the agent's characteristics and, especially, on the type of his contract.

Finally, our data also provide information on the actual premium, which represents the insurer's estimate of the expected claim costs.
We provide some empirical evidence that the premium is a good predictor for both the occurrence and the sizes of claims. Under the assumption that the premium is a sufficient statistic for all risk factors observed by the insurer in predicting the occurrence and the sizes of claims, we can model the accident probability and the sizes of incurred losses using only the premium. Conditioning on a single variable gives us room to explore novel fully nonparametric methods. In the end, we do not find any evidence of asymmetric information in Dutch car insurance data.

This chapter is organized as follows. The next section describes the car insurance system in the Netherlands. The third section presents the data. The fourth section tests for asymmetric information in claim frequencies. The fifth section tests for asymmetric information in severity of claims. The sixth section uses premium to test for asymmetric information. The last section concludes. The appendix provides some technical details for the tests used in Subsection 2.6.3.

2.2 Car Insurance in the Netherlands

Dutch law stipulates that all cars must have liability insurance (LI) that covers damage inflicted on other drivers and their cars. Every car insurance company offers this coverage along with two other noncompulsory products: a limited comprehensive insurance (Mini-CASCO), which covers damage caused by nature, fire, vandalism or theft; and a full comprehensive insurance (CASCO), which covers all risks, including at-fault damage to the insured car.

The insurance premium is calculated in two steps. First, the insurer calculates a base premium, which depends only on the observed characteristics of the agent and his car. Then the insurer also takes into account the agent's claim history (if observed), based on which she determines a special discount or a surcharge on the base premium. We will discuss this kind of experience rating later in this section.
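The two-step calculation can be sketched in code. The sketch below applies a bonus-malus (BM) percentage to a base premium and updates the BM class after a contract year, using only the percentages and transitions that this section states for the insurer's scheme (Table 2.1 itself is not reproduced here). The function names and the base premium of 1,000 are hypothetical; classes or transitions that the text does not state raise an error rather than being guessed.

```python
# Percentage of the base premium paid in each bonus-malus (BM) class.
# Only the values stated in the text are included.
PREMIUM_PCT = {1: 120, 2: 100, 3: 90, 7: 55, 12: 35, 13: 30}
PREMIUM_PCT.update({c: 25 for c in range(14, 21)})  # maximum discount of 75%

# Class after one or two claims at fault; the text gives this only for class 12.
AFTER_CLAIMS = {(12, 1): 7, (12, 2): 3}


def premium(base_premium, bm_class):
    """Second step of the calculation: apply the BM discount or surcharge."""
    if bm_class not in PREMIUM_PCT:
        raise KeyError(f"percentage for BM class {bm_class} is not stated in the text")
    return base_premium * PREMIUM_PCT[bm_class] / 100


def next_class(bm_class, claims_at_fault):
    """BM class for the next contract year."""
    if claims_at_fault == 0:
        return min(bm_class + 1, 20)   # one class up, capped at the top class
    if claims_at_fault >= 3:
        return 1                       # three or more claims: bottom class
    key = (bm_class, claims_at_fault)
    if key not in AFTER_CLAIMS:
        raise KeyError(f"transition {key} is not stated in the text")
    return AFTER_CLAIMS[key]


base = 1000.0  # hypothetical base premium (first step of the calculation)
current = premium(base, 12)
after_one_claim = premium(base, next_class(12, 1))
```

Leaving the unstated classes out keeps the sketch faithful to the text; filling them in would require the full Table 2.1.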
Let us first focus on the base premium. The base premium depends on various characteristics and, of course, on the type of coverage. Each coverage has a specific vehicle-dependent factor which is the most important parameter used in the calculation of the premium. For LI it is the weight, for Mini-CASCO the actual value and for CASCO the value at new. Besides that, the premium is calculated using the following characteristics: the agent's age and address (region), use of the car (private/business) and expected kilometrage (number of kilometers driven per year). The CASCO premium also depends on the level of deductible.³ Agents opting for a higher deductible get a discount on the premium. Sometimes a higher deductible is compulsory, in which case there is no discount on the premium. In the period covered by our data (1995-2000), the standard level of deductible was ƒ300;⁴ currently it is 136 euro.⁵

³ Mini-CASCO contracts have only the standard level of deductible, which is equal for all agents. There is no deductible for the liability insurance.
⁴ ƒ (florijn) is a symbol for the Dutch guilder, the currency used in the Netherlands from 1279 to 2002, when it was replaced by the euro. 1 euro = 2.20371 Dutch guilders. Source: European Central Bank (www.ecb.int).
⁵ At present, this is the standard level of deductible applied by most insurance companies in the Netherlands. Source: Dutch Association of Insurers (www.verzekeraars.nl).

Now, the experience rating is implemented by a so-called bonus-malus (BM) system. In this system, each agent is assigned to a certain BM class, which determines the size of the discount (resp. surcharge) applied to his premium. New agents,⁶ who have no claim history yet, pay the full base premium. After each claim-free year, agents are awarded a bonus, which gives them a certain discount on their premium. On the other hand, each claim at fault comes with a malus, which causes a surcharge on the premium.

In this project we will work with the data from a major Dutch insurer which uses the BM scheme given in Table 2.1.⁷ There are 20 BM classes, where the worst class is BM class 1 and the best class is BM class 20. Every new insuree begins in BM class 2, where he pays the full base premium (100%). After each year without a claim at fault, he advances by one class in the BM scheme, which earns him an extra discount on the premium. The maximum discount of 75% is provided in the top BM classes 14 to 20. After each claim at fault, the agent drops by 4 to 6 classes in the BM scheme and usually pays a higher premium. The agent's BM class is updated at the beginning of each contract year and depends on the BM class and the number of claims in the previous contract year. For instance, an agent in BM class 12 pays 35% of his base premium. If he has no accident during the contract year, he will advance into BM class 13, where he will pay 30% of the base premium. However, if he causes an accident, he will drop into BM class 7, where he will pay 55% of the base premium. If he causes 2 accidents, he will drop further into BM class 3 and will pay 90% of the base premium. Finally, every agent causing three or more claims in a year will be degraded into BM class 1, which implies a surcharge of 20% on the base premium.

⁶ Agents who switch from one insurer to another can ask the old insurer for a statement of their claim history, usually in the form of the number of claim-free years they had. Based on this statement they can claim a premium discount from the new insurer.
⁷ This scheme is similar to the one proposed by de Wit et al. (1982), who made an extensive actuarial study of the motor rating structure in the whole of the Netherlands. The authors proposed a BM scheme consisting of 14 classes, with a maximum discount of 70% in the top class 14 and a surcharge of 20% in the bottom class 1. This scheme was broadly introduced in the Netherlands on January 1, 1982. In the course of time, extra bonus classes were added, offering good customers better protection against premium increases. In our case, the highest BM class 20 gives an agent a kind of malus deductible, in that his premium does not increase after one claim at fault.

2.3 Data

Our data describe contract and claim histories of a major Dutch car insurer in the period from 1 January 1995 to 31 December 2000. They provide rich information about agents (sex, age, address), their cars (brand, model, production year, price, weight, power, etc.) and contracts (coverage, bonus-malus class, level of deductible, premium, renewal date). If a claim occurs, we observe its type (accident, theft, windscreen, etc.), the damage (material damage, bodily injury) and whether the agent was at fault or not. The data are longitudinal, describing contract dynamics, such as changes in coverage, insured subjects or agents' addresses, and contract renewals or terminations. Given that this chapter applies static econometric methods, we will use only cross-sectional data. The dynamic aspect of the data will be explored in Chapter 4.

2.3.1 Sample Selection

The raw data contain 163,194 unique contracts. There is, however, no information on claims in 1995, so we excluded this year from the data. From the remaining 142,175 contracts we deleted 1,376 contracts that are not covered by the BM system⁸ and 563 contracts that do not have LI.⁹ As will be explained later in this chapter, our approach requires that all agents have the basic (liability) insurance. Further, we selected the contracts observed for at least one full contract year, which is the period between two contract renewal dates (or the period between the starting date of the contract and its first renewal date).
By focusing only on fully observed contract years we avoid problems with attrition. In the data, there are 111,480 such contracts, of which the vast majority (97.5%) start or renew in 1996. We will select only these contracts, which is convenient for our analysis in Section 2.6, where we condition on the premium, which needs to be calculated for all agents in the same way. By focusing on the sample of contracts starting in one particular year we avoid problems with possible premium increases due to inflation. From the remaining 108,741 contracts we have to delete 135 with erroneous or missing values of weight, which is the key risk factor for LI. We are left with 108,606 contracts, of which 24,101 cover business cars. We noticed that for these cars the agents' characteristics, like sex and age, are missing because business cars are usually used by many different drivers and therefore their premium cannot be based on individual characteristics.

⁸ These are the contracts covering companies' fleets of cars. Such contracts have no individual BM coefficients but general fleet discounts, which are adjusted every year based on the fleets' claim histories.
⁹ Some policyholders insured their car for liability cover at a different insurer. For example, Belgian drivers must insure the liability cover at a Belgian insurer, but can choose an arbitrary insurer for the comprehensive cover.

Table 2.1: Bonus-Malus Scheme

Present    Premium   Future BM class after a contract year with
BM class   paid      no claim   1 claim   2 claims   3 or more claims
20         25%       20         14        8          1
19         25%       20         13        7          1
18         25%       19         12        7          1
17         25%       18         11        6          1
16         25%       17         10        6          1
15         25%       16         9         5          1
14         25%       15         8         4          1
13         30%       14         7         3          1
12         35%       13         7         3          1
11         37.5%     12         6         2          1
10         40%       11         6         2          1
9          45%       10         5         1          1
8          50%       9          4         1          1
7          55%       8          3         1          1
6          60%       7          2         1          1
5          70%       6          1         1          1
4          80%       5          1         1          1
3          90%       4          1         1          1
2          100%      3          1         1          1
1          120%      2          1         1          1
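The transition rules of Table 2.1 can be sketched as a small lookup. This is an illustrative sketch only: the table values are transcribed from Table 2.1, while the names `BM_TABLE`, `next_class`, `premium_pct` and `claim_free_years` are our own and do not come from the insurer.

```python
# Table 2.1 as a lookup: BM class -> (premium as % of the base premium,
# next class after 0, 1, 2 and 3-or-more claims at fault in a contract year).
BM_TABLE = {
    20: (25.0, 20, 14, 8, 1),
    19: (25.0, 20, 13, 7, 1),
    18: (25.0, 19, 12, 7, 1),
    17: (25.0, 18, 11, 6, 1),
    16: (25.0, 17, 10, 6, 1),
    15: (25.0, 16, 9, 5, 1),
    14: (25.0, 15, 8, 4, 1),
    13: (30.0, 14, 7, 3, 1),
    12: (35.0, 13, 7, 3, 1),
    11: (37.5, 12, 6, 2, 1),
    10: (40.0, 11, 6, 2, 1),
    9: (45.0, 10, 5, 1, 1),
    8: (50.0, 9, 4, 1, 1),
    7: (55.0, 8, 3, 1, 1),
    6: (60.0, 7, 2, 1, 1),
    5: (70.0, 6, 1, 1, 1),
    4: (80.0, 5, 1, 1, 1),
    3: (90.0, 4, 1, 1, 1),
    2: (100.0, 3, 1, 1, 1),
    1: (120.0, 2, 1, 1, 1),
}


def premium_pct(bm_class):
    """Premium paid in a BM class, as a percentage of the base premium."""
    return BM_TABLE[bm_class][0]


def next_class(bm_class, claims):
    """BM class in the next contract year given the number of claims at fault."""
    column = min(claims, 3)  # three or more claims share one column
    return BM_TABLE[bm_class][1 + column]


def claim_free_years(start, target):
    """Number of consecutive claim-free years needed to move from start to target."""
    years, current = 0, start
    while current < target:
        current = next_class(current, 0)
        years += 1
    return years


# The worked example from the text: an agent in BM class 12.
for claims in range(4):
    new_class = next_class(12, claims)
    print(claims, new_class, premium_pct(new_class))
```

Under these rules a new policyholder who starts in BM class 2 needs `claim_free_years(2, 20)` claim-free years to reach the top class, and a single claim in class 20 leads to class 14, where the premium percentage is unchanged.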
Furthermore, we discovered that most business cars are new and have full comprehensive coverage, so there is little variation in this group. For all these reasons we decided to discard business cars from our sample and concentrate only on private users. This will allow us to use agents' characteristics in our model and, more importantly, to distinguish between young and senior drivers. Our final sample consists of 84,505 contracts, of which 34,251 have full (CASCO) coverage and 50,254 only basic (LI) coverage.

2.3.2 Claims

A typical issue with using insurance data is that they refer to claims, not accidents. Whether an accident, once it has occurred, becomes a claim (i.e. is declared to the insurance company) is left to the individual's discretion. Obviously, accidents that are not covered will not be claimed. The estimated (conditional) claim-coverage correlation, based on all observed claims, will be significantly positive even in the absence of any asymmetric information: fully insured agents claim more damages simply because they have better coverage. Therefore, if we want to compare claim occurrences between two types of agents, those with and those without full insurance, we have to take into account only accidents which are covered by the contracts of all agents. Such accidents must involve third-party damage, which is covered by the liability insurance that is obligatory for all agents. Moreover, by focusing on third-party claims we can avoid (at least partially) the ex post moral hazard effect, because accidents involving multiple parties are more likely to be claimed to the insurer in any case.

Another issue is insurance fraud, which arises when agents manufacture false claims with the intent to fraudulently obtain payment from the insurer. The information asymmetry of insurance fraud is usually resolved by claim verification and monitoring.¹⁰ Hard insurance fraud is a special case of ex post moral hazard in which agents claim fake damages.
It is more likely to be prevalent among agents with full coverage, who can obtain payment directly from their insurer by pretending to have incurred damage to their own car. Credibly staging a fake accident with multiple cars is very difficult, if not impossible. Therefore, by focusing on third-party claims we largely escape the problems of possible insurance fraud.

Lastly, we will focus only on claims at fault because these are directly informative about agents' risk. Accidents where an agent is not at fault are covered by the offender's insurance, and therefore do not need to be claimed to the insurer. From now on in this chapter, by a claim we will always mean a claim at fault with third-party damage. In our sample, we observe 80,790 contracts without a claim and 3,715 contracts with a claim. Of the latter, 3,583 contracts have one claim, 124 contracts two claims and 8 contracts three claims.

¹⁰ Many empirical studies, for example Cummins and Tennyson (1996) or Abrahamse and Carroll (1999), found evidence of fraud in automobile insurance markets. Tennyson and Salsas-Forn (2002) claim that the vast majority of suspicious claims involve potential buildup (exaggerated loss amounts) rather than outright fraud (illegitimate claims). Insurance experts share the same experience. This means that insurance fraud is more likely to distort our analysis of claimed amounts than of claim occurrences.

2.4 Asymmetric Information and Occurrence of Claims

As explained in the introduction, asymmetric information leads to a positive correlation between coverage and claim costs, conditional on all information available to the insurer. The claim costs depend on both the probability of an accident and the distribution of incurred damages. In this section we will explore the first feature by testing for conditional independence between the choice of better coverage and the occurrence of claim(s).
We will adopt the methods introduced by Chiappori and Salanié (2000), conditioning on various sets of (relevant) variables observed by the insurer.

2.4.1 Pair of Probits

First we develop a simple parametric test based on two independent probits, one for the choice of coverage and the other for the occurrence of a claim. Let $i = 1, \dots, n$ denote agents. We define two 0-1 endogenous variables:

choice of coverage $y$: $y_i = 1$ if agent $i$ bought full coverage and $y_i = 0$ if not;
occurrence of claim $z$: $z_i = 1$ if agent $i$ had a claim at fault, otherwise $z_i = 0$.

These definitions require some remarks. First, $z_i$ equals one only if agent $i$ had at least one claim at fault with third-party damage. If he had no such claim or was not found at fault, $z_i$ equals zero. Second, the variable $y_i$ does not distinguish between agents with only LI and those also having Mini-CASCO. Such a distinction is not necessary here, given that Mini-CASCO insurance does not cover damage caused by traffic accidents, which are the only relevant ones for the definition of the variable $z$. The last remark concerns the level of deductible, which can differ among agents with CASCO insurance. Ideally, these contracts should be treated separately and not bundled together as we do here. However, in our model we concentrate only on liability claims, which are not subject to any deductible. Moreover, only 4% of agents in the sample have a non-standard level of deductible, so there is no need to worry about this issue.

Now we can set up the two probit models. Let $X_i$ be the set of observed risk factors of agent $i$. Then
\[ y_i = I(X_i\beta + \epsilon_i > 0) \quad\text{and}\quad z_i = I(X_i\gamma + \eta_i > 0), \]
where $\epsilon_i$ and $\eta_i$ are some risk factors that are unobserved by the insurer, but possibly observed by the agent. We allow these factors to depend on the covariates $X_i$, but assume that such dependence is linear, i.e. $\epsilon_i = X_i\beta^* + \epsilon_i^*$ and $\eta_i = X_i\gamma^* + \eta_i^*$, where $\epsilon_i^*$ and $\eta_i^*$ are standard normal errors independent of $X_i$. Then we can write
\[ y_i = I(X_i\tilde\beta + \epsilon_i^* > 0) \quad\text{and}\quad z_i = I(X_i\tilde\gamma + \eta_i^* > 0), \]
where $\tilde\beta = \beta + \beta^*$ and $\tilde\gamma = \gamma + \gamma^*$ can be estimated by standard methods.

We estimate both probits independently by maximum likelihood and compute the generalized residuals $\hat{\tilde\epsilon}_i^* = \tilde\epsilon_i^*(\hat{\tilde\beta})$ and $\hat{\tilde\eta}_i^* = \tilde\eta_i^*(\hat{\tilde\gamma})$, where $\tilde\epsilon_i^*(\tilde\beta)$ and $\tilde\eta_i^*(\tilde\gamma)$ are generalized errors, defined by Gourieroux, Monfort, Renault, and Trognon (1987). For instance, $\tilde\epsilon_i^*(\tilde\beta)$ is given by
\[ \tilde\epsilon_i^*(\tilde\beta) \equiv E[\epsilon_i^* \mid y_i, X_i] = y_i \frac{\phi(X_i\tilde\beta)}{\Phi(X_i\tilde\beta)} - (1 - y_i) \frac{\phi(X_i\tilde\beta)}{\Phi(-X_i\tilde\beta)}, \]
where $\phi$ and $\Phi$ denote the density and the cumulative distribution function (cdf) of the standard normal distribution $N(0, 1)$. Gourieroux et al. (1987, Section 3.7) define a test statistic
\[ W = \frac{\left( \sum_{i=1}^{n} \hat{\tilde\epsilon}_i^* \, \hat{\tilde\eta}_i^* \right)^2}{\sum_{i=1}^{n} \hat{\tilde\epsilon}_i^{*2} \, \hat{\tilde\eta}_i^{*2}}, \]
which is, under the null of conditional independence $\mathrm{cov}(\epsilon_i^*, \eta_i^*) = 0$, asymptotically distributed as a $\chi^2(1)$. This provides us with a test of the symmetric information assumption.

2.4.2 Bivariate Probit

Estimating the two probits independently is appropriate under conditional independence, but it is inefficient under the alternative. Therefore we also estimate a bivariate probit in which $\epsilon_i^*$ and $\eta_i^*$ are jointly normal with zero mean, unit variance and a correlation coefficient $\varrho$. We will estimate this coefficient together with its standard error, which will allow us to test the null of $\varrho = 0$.

2.4.3 $\chi^2$ Test of Independence

The two parametric procedures presented above rely on functional forms which are quite restrictive, since the latent models are linear and the errors are normal. If the underlying data generating process is driven by more complicated nonlinear functions of $X$, our results could be biased in unpredictable ways. To remedy this issue, Chiappori and Salanié (2000) adopt a fully nonparametric procedure based on a $\chi^2$ test for independence. They create $M$
groups of agents with similar characteristics observed by the insurer and, for each group, they perform the $\chi^2$ test for independence. In this way they obtain $M$ test statistics, each of which is, under the null of independence, asymptotically distributed as a $\chi^2(1)$. There are many ways to use these statistics for a test of conditional independence. One of them is a Kolmogorov-Smirnov test, which compares the empirical cdf, calculated from all $M$ test statistics, with its theoretical counterpart, the cdf of a $\chi^2(1)$. Under conditional independence, both functions should be asymptotically identical. As the Kolmogorov-Smirnov test is known to have limited power, the authors provide two other tests. The first one is based on the sum of all $M$ test statistics. Under conditional independence this sum should be asymptotically distributed as $\chi^2(M)$. The last approach is based on the number of rejections in all cells. We reject the null of independence in a cell if the test statistic exceeds 3.84, the 5% critical value of the $\chi^2(1)$. Under the null, such a rejection occurs with probability 0.05. Thus, under conditional independence, the total number of rejections should be distributed as a binomial $B(M, 0.05)$.

2.4.4 Implementation

Before we start testing we have to decide which variables to use in the model. Ideally, we should condition on all information available to the insurer. In the data, there are more than 20 descriptive variables about policyholders and their cars. We can easily include all of them in $X$. Estimation of the probits will still be feasible and we can be sure that we are not omitting any information which is known to the insurer. However, there are three caveats to this procedure. (1) For around 5% of agents we do not observe all variables. Consequently, if we want to use all characteristics in the model, we have to exclude these agents from the sample.
(2) Some variables, for example weight and engine volume, are highly correlated. The resulting multicollinearity does not reduce the predictive power or reliability of the model as a whole, but it affects the estimated coefficients of individual predictors. Since we are not interested in the individual coefficient estimates, we do not need to worry much about this issue here. (3) The most serious problem with using all information is that we cannot condition on so many variables in the nonparametric approach, because some groups of similar agents will be very small or even empty. Here we face a trade-off between better conditioning (and thus a larger number of groups of similar agents) and sufficiently large populations in each group (to be able to apply asymptotic results).

To help us with this issue we asked the insurer which characteristics are the most important in the determination of the agents' risk. We were told that, in the period covered by the data, the following factors were used in the calculation of the liability premium for private cars: weight, region, kilometrage and age of policyholder. We will refer to these variables as premium risk factors.

Another possibility is to use the liability premium directly, which is also observed in the data. It is natural to assume that the insurer's objective is to estimate the agents' risk in the best possible way, using all available characteristics. Hence the premium should truly reflect the underlying third-party risk. Therefore, it could be sufficient to condition only on the premium. We will discuss and adopt this approach in Section 2.6.

The premium risk factors and the premium itself represent the information used by the insurer to evaluate the agent's risk. We will verify whether this information is sufficient by testing the independence between the choice of coverage and the occurrence of claims, conditioning only on these premium risk factors, resp.
only on the premium. If we reject the null of independence, we can conclude that the risk factors used by the insurer are insufficient to price the underlying risk. Then it is important to verify whether this information asymmetry would persist if we used extra risk factors that are observed by the insurer, but not used in the pricing. Therefore we will search for other characteristics of agents which also have strong predictive power for the claim occurrence. We will rely on the actuarial study, provided to us by the insurer, which was made by an independent consulting company, using exactly the same data as we do in this thesis. This study suggests many improvements in the pricing policy, so the insurer's pricing of the risk is not optimal in the sense that it does not exploit all available information. We will discuss here only the most important suggestions, mainly concerning the replacement of some currently used premium risk factors by other factors which better explain the underlying risk. We will verify the relevance of these suggestions by estimating the bivariate probit with the risk factors proposed by the actuarial study. If these risk factors improve the explanatory power of the bivariate probit (compared to the premium risk factors), we will include them in our analysis. Our aim is to find a small set of variables with the strongest explanatory power. As a benchmark, we will use our bivariate probit.

2.4.4.1 Actuarial Study

First we estimated the bivariate probit using only the premium risk factors. We found that only the coefficients of the weight and the dummies for two regions were significant in both probits. This may stem from the fact that the underlying risk does not depend proportionally on the premium risk factors. The insurer prices the liability insurance using a specific structure in which the relevant variables enter in a non-proportional way.
In what follows we will discuss in detail all key risk factors used by the insurer (in the period of our data) and possible improvements as proposed by the actuarial study.

Region. The insurer used a four-level region code in her tariff structure to discriminate between urban and rural areas. The first region code relates to the capital city of Amsterdam, the second code to other major cities in the Netherlands, the third code to small towns, and the last code to the countryside. The actuarial study suggests that such a classification is not necessary and that it is sufficient to distinguish only between big cities and the rest of the country. In other words, it suggests bundling the first region together with the second, and the third region together with the fourth. Our analysis came to the same conclusion. We made four dummies, one for each region. We cannot include all four dummies in the model because then the matrix $X$ would be singular. Each time we exclude one dummy, say for region 1 (resp. region 3), the coefficient of the corresponding region 2 (resp. region 4) is insignificant. Therefore we create only one dummy, denoting both regions 1 and 2, which relates to the use of a car in a city. Its estimated coefficient is then significant in both probits; it is positive in the claim-occurrence probit and negative in the coverage-choice probit.

Kilometrage. Another factor used by the insurer in the tariff structure was kilometrage. Each agent entering the insurance was asked to estimate the number of kilometers driven per year and choose one of the following three levels of kilometrage: below 12,000 km per year, at most 20,000 km per year, and an unlimited number of kilometers. This measure turned out to be very unreliable because many agents underreported their actual level of kilometrage, which is seldom checked by the insurer, to get an extra discount on their premium.¹¹ Not surprisingly, we observe in the data that only 4.7% of the policyholders have unlimited kilometrage. Furthermore, from the estimated bivariate probit it seems that there is no significant difference between the estimated coefficients of the dummies for the lowest and the highest kilometrage level; one of the two is insignificant when used in combination with a dummy for the middle kilometrage level.

The actuarial study suggests using the fuel type of the car instead of kilometrage. As a practical rating factor, the fuel type has an advantage over kilometrage in that it is objective and verifiable from the public vehicle licensing database.¹² Moreover, there is evidence in the data that diesel- or gas-fueled cars have approximately 32% higher expected third-party claim costs than petrol cars (all other factors being equal). This is common in European motor markets and most probably results from the fact that the fuel type is a proxy for kilometrage, with diesel- and gas-powered vehicles being more heavily used. Therefore we decided to distinguish petrol (benzine) powered cars from the rest. The estimated coefficient of the corresponding dummy is significant in both probits; it is negative in the claim-occurrence probit and positive in the coverage-choice probit.

¹¹ This happened mainly when the contract was underwritten via an insurer's intermediary who reported lower kilometrage in order to get a good price for his clients. Kilometrage can be checked only by a claim expert when surveying a specific claim. If the claim expert obtains proof that the actual kilometrage is higher than was reported to the insurance company, the cover can lapse. However, many agents make up a good story for why their actual kilometrage is so high. Since the insurer has no means to verify the credibility of the actual kilometrage, she decided to drop it from her tariff structure and use the fuel type instead. This change took effect in 2002, i.e. two years after the period covered by our data.
¹² We know of two databases providing technical data and price information about Dutch cars. One is the RDW database (www.rdw.nl), freely accessible to the public, which contains basic technical data about all vehicles registered in the Netherlands. The other one, the RDC database (www.rdc.nl), is more comprehensive and accessible to car insurance companies for a fee. It contains all technical and price information about the vehicles registered in the whole Benelux.

Age of policyholder. The last characteristic used by the insurer in the rating structure is the age of the policyholder. The insurer discriminates only against young drivers, by applying an extra surcharge to all policyholders aged below 28 years. The actuarial study confirms that young policyholders have very high risk compared to policyholders aged between 28 and 39, who represent the lowest risk. Then the risk starts growing with age again and becomes significantly higher for policyholders aged between 50 and 59, probably because some of these policyholders have high-risk teenage children starting to drive on their parents' insurance cover. The study concludes that policyholders aged over 75 have the highest third-party risk. Our preliminary analysis confirms these results. The estimated coefficient for age is individually insignificant in the claim-occurrence probit when we use the whole sample, apparently because there is no proportional relation between the age of policyholders and the underlying risk. However, when we estimate the claim-occurrence probit using only the subsample of young drivers, the estimated coefficient becomes significantly negative, which could be explained by a learning effect. When we use a subsample of experienced drivers (aged 28 years or more), the estimated coefficient for age is positive, though not very significant.
It becomes significant when we select a subsample of more senior drivers, aged above 40 years. Thus it seems that the age of policyholders is an important risk factor.

Age of car. The actuarial study also suggests using this factor in rating, since there is strong statistical evidence that the underlying third-party risk is higher for older cars. Our preliminary analysis confirms this; the estimated coefficient of the age of car is significantly negative in the coverage-choice probit and significantly positive in the claim-occurrence probit. The age of the car is evidently a very strong determinant of the agent's decision whether to buy full insurance or not. Since the premium of the full insurance is based on the value of the car at new, but the car itself is insured against a maximum loss equal to its actual value (depreciated over time), full insurance is more advantageous for new cars than for old cars. Indeed, we observe in the data that mainly new cars have CASCO coverage. Agents cancel this coverage when their car gets older and usually switch to Mini-CASCO. Most old cars have only the liability cover.¹³ For all these reasons we decided to use the age of car in our tests.

¹³ Chapter 5 provides more details about the age distribution of cars among different types of coverage.

The actuarial study also suggests using motor volume as a rating factor, since cars with a large engine have significantly higher third-party risk costs than cars with a small engine. Indeed, in our bivariate probit the motor volume alone is a significant predictor; in combination with the weight of the car, however, it is no longer as significant. This is probably because of the high correlation between the two factors (around 70%). We get similar results when using engine power; it is also highly correlated with the previous two variables. We will therefore use only the weight of car in our model.¹⁴
To recapitulate, based on our bivariate probit, we found that the following variables are the most relevant risk factors for the liability insurance: weight of car, age of policyholder, age of car and the indicators for the use in city and the fuel type benzine. We will refer to these variables as actuarial risk factors. In the bivariate probit, all estimated coefficients of these variables are significant. Adding extra variables to the model, for example sex of policyholder, value of car or type of its body, does not significantly improve the predictive power of the model; the corresponding estimated coefficients are not significant. We therefore believe that it is sufficient to condition only on these characteristics.

¹⁴ In some countries, for example Slovakia, the key risk factor for the liability insurance is the motor volume. In the Netherlands, as suggested by de Wit et al. (1982), it is the weight of car.

2.4.4.2 Experience Rating

So far, we have discussed only exogenous characteristics that do not depend on the agent's claim history. We know, however, that the insurer also observes past driving records, which are highly informative about claim probabilities and, as such, are used for pricing. Omitting experience rating from the tests can generate spurious information asymmetry, because the corresponding information is treated as being private, whereas it is in fact common to both parties. Indeed, Puelz and Snow (1994) found evidence of adverse selection in the car insurance market when they neglected experience rating. Such an omission can generate a bias that tends to overestimate the level of asymmetric information. This remark clearly suggests that the tests should control for driving experience. Chiappori and Salanié (2000) point out that the introduction of such a variable is problematic because of its obvious endogeneity.
They circumvent this problem by focusing on a subpopulation of beginning drivers for whom no driving history is observed yet. They find no evidence of asymmetric information among young drivers, but suspect that such informational asymmetry can arise in the course of the insurance relationship due to asymmetric learning. This entails that drivers learn faster about their risk than insurers because they also accumulate information on near misses and small accidents which they do not report to the insurer. Hence, it is interesting to test for the presence of asymmetric information especially among experienced drivers. Such an analysis, however, requires conditioning on the observed claim history. We will assume that all claim history observed by the insurer (and naturally also by the agent) is sufficiently expressed by the experience rating, namely the agent's BM class. In the rest of this section we will justify the introduction of this variable in our model.

If we want to include experience rating in our model, we cannot avoid a discussion of dynamic aspects of the contractual relationship, namely learning.¹⁵ Learning can be either symmetric, when both parties learn equally about the agent's risk, or asymmetric, when one party (usually the agent) learns faster than the other party (the insurer).

¹⁵ The basic reference on learning is Harris and Holmstrom (1982), who studied a case of symmetric learning in the labor market. Their model was further applied to life insurance by Hendel and Lizzeri (2003). Learning in car insurance markets was recently studied by Dionne et al. (2006) and Cohen (2005, 2008).

Under the null of symmetric information, i.e. neither adverse selection nor moral hazard, both the agent and the insurer share the same information about the agent's risk. If there is no learning, then the experience rating reflects just some random shocks and does not provide extra information about the agent's risk. If there is some learning, then this learning must be symmetric, since both parties share the same information. In this case it is important to condition on the experience rating, which reflects the general knowledge (common to both parties) about the agent's risk. Omitting experience rating can create spurious informational asymmetry.

Under the alternative of asymmetric information, there is either adverse selection or moral hazard (or both). Under pure moral hazard (without learning), conditioning on experience rating is not necessary, but does no harm either; it leaves variation in CASCO coverage that, under moral hazard, should be related to risk. Under pure adverse selection, the agent either has private information about his risk from the start, or acquires it through asymmetric learning over the course of time. Either way, the experience rating will not fully reflect the agent's risk. Conditioning on the experience rating may reduce the magnitude of the information asymmetry, but will not cancel it out completely. In any case, it is useful to condition on the experience rating because it is directly informative about the level of the insurer's knowledge of the agent's risk.

From the above reflection we can conclude that it is never harmful to condition on the experience rating. The main reason for conditioning on the experience rating is to control for the insurer's learning about the agent's risk. Dionne et al. (2006) also include the BM coefficient in their model, claiming that it provides additional information on the riskiness of policyholders. Their results suggest that agents in high BM classes (who receive a high bonus discount on their premium) tend both to buy more insurance and to have fewer accidents.

In what follows we will present the results of the tests separately for young and senior drivers. It is not necessary to condition on the experience rating for young agents since they have very little driving experience.
We expect this will be important mainly for senior drivers.

2.4.4.3 Results for Young Drivers

First we will turn attention to young drivers. As already mentioned, the insurer charges all policyholders younger than 28 years an extra fee because there is strong empirical evidence that this group represents a higher risk. Therefore we will focus on a subsample of agents below the age of 28 years. The driving experience of such agents is certainly less than 10 years given that in the Netherlands the legal age for obtaining a driving licence for a car is 18 years. Despite this, we observe a big variety among the BM classes. Some agents are even in BM class 20, which they would normally reach after having driven for 18 years without a claim. This comes from the fact that in the period covered by our data, the insurer gave new drivers the possibility to inherit a favorable BM coefficient from their parents.16

Our subsample of young drivers consists of 4,319 individuals, which represents 5% of the whole sample. More than a third of these agents are at the margin age of 27 years. Another 28% are 26 years old. Just less than a quarter of the agents are younger than 25 years.

The first part of Table 2.2 gives the distribution of young agents according to their coverage and occurrence of claim. At first sight we can see that the agents with full coverage have proportionally fewer accidents (3.8%) than the agents with only basic insurance (5.5%). This is somewhat surprising. Consequently, the unconditional claim-coverage correlation is negative (−0.029), but insignificant: the χ² test of independence does not reject the null at a conventional 5% level.17

When we estimate the two independent probits using all available exogenous characteristics,18 we get a value of the W-statistic equal to 0.842, with a p-value of 0.359. The bivariate probit estimates ρ at −0.058 with a standard error of 0.076. We are far from rejecting the null.
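The unconditional χ² test can be recomputed by hand from the young-driver counts reported in Table 2.2 (189 and 33 claims among 3,460 basic and 859 full contracts). The following is a minimal Python sketch under stated assumptions, not the code used in this study: the function name chi2_independence_2x2 is hypothetical, and the p-value relies on the identity P(χ²(1) > x) = erfc(√(x/2)).

```python
import math

def chi2_independence_2x2(table):
    """Pearson chi-square statistic and p-value (1 d.o.f.) for a 2x2 table.

    Hypothetical helper written for illustration only.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of coverage and claim.
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Rows: basic, full coverage; columns: claim, no claim (young drivers, Table 2.2).
young = [[189, 3271], [33, 826]]
stat, p_value = chi2_independence_2x2(young)
print(f"chi2(1) = {stat:.3f}, p-value = {p_value:.3f}")
```

Up to rounding, this should agree with the test of independence reported in Table 2.2 for young drivers. The same function can be applied to the senior-driver counts.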
Conditioning only on the premium risk factors gives W = 4.667 with a p-value of 0.031, and ρ̂ = −0.098 with a standard error of 0.049. Surprisingly, both tests reject the null at a 5% level. However, when we exclude the oldest drivers from the sample (27-year-old agents), the null is not rejected any more. Then W = 1.216 with a p-value of 0.270, and ρ̂ = −0.064 with a standard error of 0.062. Conditioning only on the actuarial risk factors gives W = 0.336 with a p-value of 0.562, and ρ̂ = −0.021 with a standard error of 0.070. Both tests are even further from rejecting the null than when we conditioned on all exogenous characteristics.

16 Nowadays, it is still possible to get the same BM discount on the second car in a household, but only under strict conditions, preventing young drivers from starting with a high discount.

17 The null is even further from being rejected when we exclude 27-year-old agents. Then the p-value of the χ² test is 0.239.

18 There are 228 agents with missing values for some variables, thus the probits are estimated using only 4,091 observations with full information on all variables.

In the nonparametric approach we first group agents based on the premium risk factors. Besides 4 levels for the region and 3 levels for the kilometrage, we make 2 levels for the weight: light cars (below 1,000 kg) and heavy cars (above 1,000 kg). In this way we get 24 groups. Three groups, however, had very few individuals, so we added them to a bigger group with similar characteristics. We ended up with 21 groups, where the smallest one has 11 individuals and the biggest one 1,882 individuals. All three proposed tests are far from rejecting the null of independence. The Kolmogorov-Smirnov test statistic is 0.115 with a p-value of 0.938. All χ² test statistics summed up to 13.468, which is much below 32.671, the 5% critical value of the χ²(21). Finally, there was no rejection of the null in any cell, so the p-value of B(21, 0.05) is 1.

Grouping agents with regard to the actuarial risk factors gives the same result. As before, we make 2 levels for the weight of the car. For the age of the car, we distinguish 3 levels: new (0 - 4 years), as good as new (5 - 9 years) and old (10 years and more). By conditioning also on the remaining two 0-1 variables (the indicators for use in a city and for a benzine-fueled car), we created 23 groups. The smallest one has 12 individuals and the biggest one 1,070 individuals. Again, none of the three proposed tests rejects the null of independence. The value of the Kolmogorov-Smirnov test statistic is 0.174 with a p-value of 0.550. All χ² test statistics summed up to 9.859, which is much below 35.172, the 5% critical value of the χ²(23). Furthermore, there is no rejection of the null in any cell.

All tests gave the same result: the conditional (and also unconditional) correlation is not significant. This suggests that there is no asymmetric information between young drivers and the insurer. This result is consistent with Chiappori and Salanié (2000), who did not find any evidence of asymmetric information among young drivers either. It seems that beginners have no informational advantage over the insurance company. Young drivers have too little driving experience to determine whether they will turn out to be good or bad drivers. Therefore they have no extra (private) information on their risk when they choose an insurance coverage for the first time. New policyholders either randomize their contract choice or distribute across the menu of contracts based on their preferences, which are uncorrelated with risk. In the course of time, however, agents can gain some information advantage due to asymmetric learning. Therefore, we will repeat all tests on the subsample of senior drivers.
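The binomial p-value attached to the number of cell-level rejections is simply an upper tail probability of B(M, 0.05). As an illustration, the sketch below recomputes it for the 21 cells formed from the premium risk factors and repeats the comparison of the summed χ² statistics with the critical value. It is a hedged example rather than the original implementation: the helper name binomial_upper_tail is hypothetical, and the sum of statistics and the critical value are copied from the text, not computed.

```python
from math import comb

def binomial_upper_tail(k, m, p=0.05):
    """Return P(X >= k) for X ~ Binomial(m, p). Hypothetical helper."""
    return sum(comb(m, j) * p**j * (1 - p) ** (m - j) for j in range(k, m + 1))

# Premium risk factors, young drivers: 21 cells, no cell-level rejection.
n_cells, n_rejections = 21, 0
p_value = binomial_upper_tail(n_rejections, n_cells)

# Sum of the cell-level statistics against the 5% critical value of chi2(21);
# both numbers are taken from the text rather than computed here.
sum_of_stats, critical_value = 13.468, 32.671
rejects_sum_test = sum_of_stats > critical_value

print(f"binomial p-value = {p_value:.3f}, sum test rejects: {rejects_sum_test}")
```

With zero rejections the upper tail covers the whole distribution, so the p-value equals 1; the function becomes informative as soon as some cells reject.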
2.4.4.4 Results for Senior Drivers

Here we will focus on senior drivers, i.e. on all agents aged 28 years or more. There is no guarantee that all these agents are experienced drivers because some of them could have started driving at a later age. However, in the data we observe that more than a half of these agents have been insured at our insurance company for at least 10 years. Thus we can be confident that the majority of agents in our subsample are experienced drivers.

Our subsample of senior drivers consists of 80,186 individuals. Around 28% of them are aged below 40 years. Less than 5% of the agents are above 75 years old.

The second part of Table 2.2 gives the distribution of senior agents according to their coverage and occurrence of claim. We can see again that the agents with full coverage have proportionally fewer accidents (3.9%) than the agents with only basic insurance (4.7%). The χ² test strongly rejects the null of (unconditional) independence. The estimated unconditional claim-coverage correlation is, however, quite small: −0.017.

Table 2.2: Distribution of Agents According to Their Coverage and Occurrence of Claim

YOUNG DRIVERS
Coverage    Claim: yes       Claim: no          Total
basic       189 (5.5%)       3,271 (94.5%)      3,460 (100%)
full        33 (3.8%)        826 (96.2%)        859 (100%)
Total       222 (5.1%)       4,097 (94.9%)      4,319 (100%)
Test of independence: χ²(1) = 3.707 with p-value = 0.054

SENIOR DRIVERS
Coverage    Claim: yes       Claim: no          Total
basic       2,178 (4.7%)     44,616 (95.3%)     46,794 (100%)
full        1,315 (3.9%)     32,077 (96.1%)     33,392 (100%)
Total       3,493 (4.4%)     76,693 (95.6%)     80,186 (100%)
Test of independence: χ²(1) = 24.003 with p-value = 0.000

The correlation does not change too much when we condition on all exogenous variables.19 The two independent probits give W = 12.440 with a p-value of 0.000. The bivariate probit estimates ρ̂ = −0.046 with a standard error of 0.013, so the correlation is in both cases significant. However, as soon as we add the BM class,20 the null is not rejected any more: W = 0.572 with a p-value of 0.450 and ρ̂ = −0.010 with a standard error of 0.014.

19 By conditioning on all variables we lose 4,050 observations which have some variables with missing values. All probits are therefore estimated using only the subsample of 76,136 agents for whom all variables are observed.

20 Given that the BM class is an ordinal variable, we create dummies for each class (except for the first one) and add them into X.

When we condition only on the premium risk factors, we again get a significant correlation: W = 36.367 with a p-value of 0.000 and ρ̂ = −0.060 with a standard error of 0.010. The correlation stays significant even after adding the BM class: W = 8.068 with a p-value of 0.005 and ρ̂ = −0.029 with a standard error of 0.010. This result is surprising21 and may come from the fact that our model is not well specified. Interestingly, when we estimate the probits using only the BM class, the correlation is insignificant: W = 2.530 with a p-value of 0.112 and ρ̂ = −0.016 with a standard error of 0.010.

21 Chiappori (2001) points out that any misspecification can lead to a spurious correlation. Parametric approaches, in particular, are highly vulnerable to this type of flaw, especially when they rely upon some simple, linear form.

Conditioning on the actuarial risk factors gives W = 15.613 with a p-value of 0.000 and ρ̂ = −0.048 with a standard error of 0.013, so the estimated correlation is significant. It becomes insignificant when we add the BM class: W = 0.781 with a p-value of 0.377 and ρ̂ = −0.011 with a standard error of 0.013.

From all these results we can conclude that omitting the experience rating (i.e. the BM class) from the model creates a spurious correlation between the coverage choice and the claim occurrence. This result is consistent with Dionne et al. (2006), who also discovered that the BM class masks a correlation between coverage and accidents. Further we noticed that the results are very similar when we condition on all observed variables as when we condition only on the actuarial risk factors, which we thoroughly selected in the previous section. This confirms our earlier claim that it is enough to condition only on these factors.

What is surprising is the fact that by conditioning only on the premium risk factors (i.e. the variables actually used by the insurer in the premium rating), we found a significantly negative correlation, even after adding the BM class into the model. It is quite improbable that the insurer would use such weak variables (with low predictive power) in the premium rating. A much better explanation is that our parametric model is not flexible enough to capture the non-proportional structure used by the insurer in rating the underlying risk. If this is true, then our nonparametric approach should overcome this problem. Another possibility is to use directly the premium calculated by the insurer, which we will do later.

In the nonparametric approach we relied again on our actuarial study. We distinguish two groups of senior agents by age: younger (28 - 39 years), having lower risk, and older (40 and more), having higher risk.
As in the previous section, we distinguish 2 levels of the weight: light cars (below 1,000 kg) and heavy cars (above 1,000 kg).

First, we condition on the premium risk factors, i.e. on the weight (2 levels), the region (4 levels), the kilometrage (3 levels) and the age of policyholder (2 levels).22 In this way we get 48 groups of similar agents. The smallest group has 13 individuals and the biggest one 16,982 individuals. The value of the Kolmogorov-Smirnov test statistic is 0.266 with a corresponding p-value of 0.002. All 48 χ² test statistics sum up to 84.400, which gives a p-value of the χ²(48) distribution equal to 0.001. Finally, the null of independence is rejected in 6 cells,23 which gives a p-value of the B(48, 0.05) equal to 0.032. We see that all three tests reject the null of independence. This happened also earlier when we ignored experience rating.

22 We do not make a special group for the agents older than 75, who have the highest risk, because there are very few (less than 5%) such old agents in the data.

23 All cells where the null was rejected are big, having more than 550 individuals. Therefore the asymptotic results of the χ² tests of independence are reliable.

Conditioning on all 20 BM classes is not appropriate, because some groups would be very small. Therefore we will distinguish only two levels: low BM classes (1 to 10) and high BM classes (11 to 20). In this way we get 96 groups. Three groups are too small, having fewer than 10 individuals, so we merge two of them, which are similar, together, and we attach the remaining small group to another similar group which is bigger. We end up with 94 groups; the smallest one has 10 individuals, the biggest one 14,018 individuals. Now, none of the tests rejects the null any more. The Kolmogorov-Smirnov test statistic is 0.102 with a p-value of 0.255. All 94 χ² test statistics sum up to 105.760, which gives a p-value of 0.191. Finally, there are only 4 rejections,24 which gives a p-value of B(94, 0.05) equal to 0.697. The rejection of the null we got earlier in the parametric approach was very probably caused by a misspecification.

24 Again, all rejections are in big cells. The smallest cell with a rejection has 397 individuals.

Conditioning on the actuarial risk factors gives similar results. We create 48 groups based on the weight (2 levels), the use in a city (2 levels), the benzine-fueled car (2 levels), the age of policyholder (2 levels) and the age of car (3 levels). The smallest group has 13 individuals and the biggest one 10,649 individuals. The Kolmogorov-Smirnov test does not reject the null; the test statistic is 0.095 with a p-value of 0.729. But the other two tests reject the null. All 48 χ² test statistics sum up to 68.675, which gives a p-value of 0.027. As in the previous case, the null is rejected in 6 cells, which gives a p-value of the B(48, 0.05) equal to 0.032. This time, however, one rejection appears in the smallest group (with 13 individuals), so we cannot rely on the asymptotic properties of the tests. When we exclude this group from the analysis, we do not reject the null any more at a conventional 5% level.25

25 Without this group, the sum of the remaining 47 test statistics is 62.716, which gives a p-value of the χ²(47) equal to 0.062. With 5 rejections only, the p-value of the B(47, 0.05) is 0.085.

Anyway, this issue disappears when we condition also on the BM class. Again, by distinguishing low and high BM classes, as in the previous case, we get 96 groups. Two similar groups are very small, so we merge them together. The merged group still has only 7 individuals, but there is no rejection of the null in this group. The biggest group has 9,006 individuals. Now, none of the tests rejects the null any more. The value of the Kolmogorov-Smirnov test statistic is 0.107 with a p-value of 0.221. The sum of the 95 χ² test statistics is 100.115, which gives a p-value of 0.340. Finally, there are 8 rejections in the cells,26 which gives a p-value of B(95, 0.05) equal to 0.103.

26 All cells with a rejection are big enough. The smallest cell has 146 individuals.

Most of the nonparametric tests reject the null when we condition only on exogenous variables. Once we add the BM class, no test rejects the null any more. These results are consistent with the ones we obtained in the parametric approach, except for the one rejection which we got when we used only premium risk factors. As discussed earlier, this rejection is very likely caused by a misspecification. To conclude, we did not find any evidence of asymmetric information between agents and the insurer, even among senior drivers.

2.4.4.5 Negative Claim-Coverage Correlation

It is interesting that the estimated claim-coverage correlation is significantly negative when we do not condition on the BM class. The estimation results from our probits suggest that the BM class is positively correlated with coverage choice and negatively correlated with claim probability. This result is supported also by the study of Dionne et al. (2006). On top of that, the observed claim frequencies from Table 2.2 also suggest that agents with full coverage claim less than agents with only basic coverage. The theory predicts the opposite result under adverse selection or moral hazard.

One explanation for such a relationship could be advantageous selection. De Meza and Webb (2001) postulate that individuals have private information about both their risk type and their risk aversion. Advantageous selection appears if more risk-averse agents have lower risk and buy more insurance. The authors also relate advantageous selection to exaggerated optimism and mistaken reluctance to purchase insurance. Those who are reluctant to purchase insurance are also disinclined to take precautions.
In the context of car insurance we have another explanation, which is based on the experience rating. Agents who do not claim for many years receive a huge discount on their premium. Then the full insurance, which is otherwise quite expensive, becomes more affordable. On the other hand, agents with many claims pay a very high premium, which can discourage them from buying the full insurance or can lead to its cancelation if they already have one.

In any case, the obtained results suggest that the steep experience rating used by our insurer is quite effective in fighting potential threats of asymmetric information. First, it battles adverse selection by making the full insurance attractive mainly for good drivers, who are offered a considerable discount on the premium. Second, it may reduce moral hazard effects by giving the insurees proper incentives to drive more carefully. It would be interesting to better quantify both effects, which is left for future research.

2.5 Asymmetric Information and Incurred Damages

As discussed in the introduction, asymmetric information leads to a positive correlation between the coverage and the expected claim costs, which depend on (1) the probability of claim occurrence and (2) the distribution of incurred losses in the case a claim has already occurred. In the previous section we developed the first aspect by focusing on the occurrence of claims. Now we will explore the second aspect by examining the severity of claims. In particular, we would like to figure out whether there is a relation between the choice of coverage and the distribution of incurred losses.

In case of adverse selection, agents with higher expected losses buy more insurance. Under moral hazard, agents with full coverage cause more (serious) accidents because of decreased incentives for safe driving. One could therefore expect that fully covered agents not only have a higher frequency of claims but also claim higher amounts.
We will test for this prediction by comparing the claim sizes of fully covered and basically covered agents. We expect the former to incur larger losses than the latter.

In this section we will focus only on the agents who had at least one claim. There are 3,715 such agents in our sample. 2,367 of them have only basic insurance, while the remaining 1,348 agents have full insurance. Obviously, each claim involves a certain (positive) third-party damage. For each contract, we calculate the total sum of all observed damages and denote it by L. These amounts vary a lot, from the lowest total damage of 50 to the highest total damage of 1,250,000. The average total damage is 6,205 for all contracts (with a standard error of 39,677). For fully covered contracts it is 6,714 (with a standard error of 45,748) and for basically covered contracts 5,915 (with a standard error of 35,769). This already suggests that the agents with full coverage cause more damage than the agents with only basic coverage. The estimated standard errors are, however, so big that the difference in means is not significant. In what follows, we will do more formal statistical tests which also take into account the observed characteristics of the agents and their cars.

Our null hypothesis will be that the distribution of incurred damages, F, conditional on X, is the same for fully insured agents (y = 1) as well as basically insured agents (y = 0), i.e.

\[ H_0: F(L \mid L > 0, X, y = 1) = F(L \mid L > 0, X, y = 0). \]

We will test the null against a one-sided alternative that fully covered agents cause more damage than basically covered agents, i.e.

\[ H_1: F(L \mid L > 0, X, y = 1) < F(L \mid L > 0, X, y = 0). \]

In other words, we will test for the stochastic dominance of the damages caused by the agents with full insurance. As earlier, we will use two approaches, one parametric and the other nonparametric.
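The informal statement that the difference in average damages is not significant can be illustrated with a back-of-the-envelope calculation. The sketch below is only an approximation under a loudly stated assumption: the figures in parentheses reported with the averages are treated as sample standard deviations of the total damage, and the two groups are treated as independent samples. The function name welch_z is hypothetical.

```python
import math

def welch_z(mean_1, sd_1, n_1, mean_0, sd_0, n_0):
    """Large-sample z statistic for a difference in means (unequal variances).

    Assumption: sd_1 and sd_0 are sample standard deviations.
    """
    se_diff = math.sqrt(sd_1**2 / n_1 + sd_0**2 / n_0)
    return (mean_1 - mean_0) / se_diff

# Full coverage (y = 1) versus basic coverage (y = 0), figures from the text.
z = welch_z(6714, 45748, 1348, 5915, 35769, 2367)

# One-sided p-value for H1: fully covered agents cause more damage.
p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))

print(f"z = {z:.3f}, one-sided p-value = {p_one_sided:.3f}")
```

Under these assumptions the statistic stays well below the usual one-sided 5% critical value of 1.645, in line with the conclusion that the raw difference in means is not significant.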
2.5.1 Classical Regression

The first approach relies on a simple regression estimate of the coefficient for the indicator of better coverage, y, conditioning on all (important) characteristics. We specify a fully parametric model in the following way:

\[ L_i = X_i \beta + \alpha y_i + u_i, \]

where u_i is some zero-mean error and L_i is the total third-party damage caused by an agent i during the whole contract year under consideration. We would like to stress here again that we focus only on realized damages, so we ignore agents who have no claims; therefore L_i > 0 for all i. As before, X_i denotes all (relevant) observed characteristics of an agent, including a dummy for his BM class.

Under the null we expect α = 0 and under the alternative α > 0. We can estimate all parameters of the model by OLS27 and test for the significance of α̂ > 0. As earlier we will use three groups of characteristics: (1) all characteristics, (2) premium risk factors and (3) actuarial risk factors.

By using all characteristics,28 we get an estimate of α equal to 842.62 with a huge standard error of 2,023.38. Premium risk factors give a similar result: α̂ = 1,377.13 with a standard error of 1,425.49. Finally, actuarial risk factors produce α̂ = 815.81 with a standard error of 1,849.01. All estimates of α are insignificant, so we cannot reject the null that the incurred damages do not depend on the type of coverage.

2.5.2 Nonparametric Tests

Since any parametric approach involves a risk of misspecification, we will also develop a fully nonparametric method. As in the previous section, we will make M groups of agents with similar characteristics, and within each group we will test for the null of independence, conditionally on X.
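The grouping step that underlies all cell-by-cell tests can be sketched in a few lines. The example below is purely illustrative and is not the procedure's actual code: the record fields (weight_kg, bm_class, y, damage), the function names and the sample records are all made up, a car weighing exactly 1,000 kg is arbitrarily assigned to the heavy level, and the minimum cell size of 10 is borrowed from the merging rule mentioned in the previous section.

```python
from collections import defaultdict

MIN_CELL_SIZE = 10  # assumption: cells with fewer agents are candidates for merging

def cell_key(agent):
    """Map an agent to a cell defined by car weight and BM class levels."""
    weight_level = "light" if agent["weight_kg"] < 1000 else "heavy"
    bm_level = "low" if agent["bm_class"] <= 10 else "high"
    return (weight_level, bm_level)

def group_into_cells(agents):
    """Collect agents with the same discretized characteristics into one cell."""
    cells = defaultdict(list)
    for agent in agents:
        cells[cell_key(agent)].append(agent)
    return dict(cells)

def small_cells(cells, min_size=MIN_CELL_SIZE):
    """Return the keys of cells that are too small for asymptotic tests."""
    return sorted(key for key, members in cells.items() if len(members) < min_size)

# Made-up records, for illustration only.
agents = [
    {"weight_kg": 850, "bm_class": 4, "y": 0, "damage": 1200.0},
    {"weight_kg": 920, "bm_class": 7, "y": 1, "damage": 300.0},
    {"weight_kg": 1300, "bm_class": 15, "y": 1, "damage": 5400.0},
    {"weight_kg": 1450, "bm_class": 2, "y": 0, "damage": 800.0},
]
cells = group_into_cells(agents)
for key in sorted(cells):
    print(key, len(cells[key]))
```

In practice more characteristics would enter the key, and the cells flagged by small_cells would be merged by hand with similar, bigger cells before any test is run.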
The first test we will do is the Kolmogorov-Smirnov test, which compares the empirical distributions of the total incurred damages, F̂(L|L > 0, X, y), for both types of agents, the ones with full coverage (y = 1) and the others with only basic coverage (y = 0). The test rejects the null in favor of the alternative if F̂(L|L > 0, X, y = 1) lies significantly below F̂(L|L > 0, X, y = 0).

As an alternative, we will also do the Wilcoxon rank-sum test, which compares the sums of ranks of incurred damages in the whole sample between the agents with full insurance and the agents with only basic insurance. The test rejects the null in favor of the alternative if the sum of ranks for the fully covered agents is significantly bigger than the sum of ranks for the basically covered agents.

27 It is quite possible that, under the alternative, y_i is positively correlated with the error. This kind of endogeneity is not a problem here, because we are not interested in any causality. We just want to test whether agents with full coverage cause more damage relative to agents with only basic coverage.

28 228 contracts have missing values for some characteristics, therefore the model is estimated using only 3,487 contracts.

While executing both tests independently, we will register in each cell whether the null was rejected at a 5% level or not. Then, for each test, we will count the total number of rejections. Given that, under the null, a rejection appears with a probability of 5%, the total number of rejections (for each test separately) should be distributed as binomial B(M, 0.05). We will globally reject the null in favor of the alternative if the number of rejections is much bigger than M/20, its expected value under the null.

We will do one more test, which is based on a simple comparison of average damages between both groups of agents.
Under the null, there is a 50% probability in each cell that the average damage of one group is bigger than the average damage of the other group. Under the alternative, it is more probable that the average damage of the fully covered agents is bigger than the average damage of the basically covered agents. For each group we will register whether the latter is true, and then we will count the total number of such cases. Under the null, this number should be distributed as binomial B(M, 0.5). We will reject the null if this number is much bigger than M/2, which is its expected value under the null.

2.5.2.1 Implementation and Results

Before we start implementing the nonparametric method, we have to choose the characteristics based on which we will group the (similar) agents. Here we have to be especially careful with the choice of the appropriate characteristics because our subsample of contracts with a claim is now much smaller than the whole sample, which we used in the previous section. Again, we are facing a trade-off between better conditioning (thus a bigger number of cells) and not too small cells (to be able to apply asymptotic properties of the tests).

As earlier, we make two levels for the weight: light cars (below 1,000 kg) and heavy cars (above 1,000 kg). Then we make three levels for the age: young drivers (below 28 years), experienced drivers (28 to 49 years) and older drivers (50 and more). First we grouped the agents based on the premium risk factors, i.e. besides the age and the weight we also conditioned on the region (4 levels) and the kilometrage (3 levels). We observed, however, that many cells had only a couple of individuals, which was inconvenient. Then we tried to condition on all actuarial risk factors. Again we got many small cells, which prevented us from doing the tests. Therefore we decided to make a compromise between the premium risk factors and the actuarial risk factors.
We replaced two premium factors by the actuarial ones, namely the four-level region code by the two-level indicator of the use in a city and the three-level kilometrage by the two-level indicator for benzine-fueled cars. Now the situation is much better; however, there are still a couple of very small cells, all of them concerning young drivers. We noticed that there are only 222 young drivers in our subsample. Therefore we decided to discriminate between young drivers only based on the weight of their car, which is, naturally, the most important determinant of incurred damages (heavier cars cause more damage than light cars).

In this way we get 2 groups for young drivers and 16 groups for senior drivers, so in total 18 groups. The smallest group has 19 individuals and the biggest group 648 individuals. We observe that only in 7 groups the average damage of the fully insured agents is bigger than the average damage of the basically insured agents. Furthermore, both tests reject the null only in one (different) cell,29 which gives a p-value of B(18, 0.05) equal to 0.603. We are very far from rejecting the null.

29 The Kolmogorov-Smirnov test rejects the null in a cell consisting of 44 agents and the Wilcoxon rank-sum test rejects the null in a cell with 28 agents.

When we condition also on the BM class, distinguishing, as earlier, low (1 - 10) and high (11 - 20) BM classes, we get 36 groups. We merge 2 small groups, having fewer than 10 individuals, with other similar groups. In this way we get 34 groups, the smallest one with 19 individuals and the biggest one with 487 individuals. During the testing procedure, we have to exclude one more group (with 26 individuals) because it includes only agents with basic insurance. In the remaining 33 groups we find only 12 groups where the average damage of the fully insured agents is bigger than the average damage of the basically insured agents. Furthermore, the Kolmogorov-Smirnov test does not reject the null in any cell, while the Wilcoxon rank-sum test does so only in one cell (with 115 individuals).

All these results suggest that there is no significant difference in the incurred damages between the agents with full insurance and the agents with only basic insurance.

2.6 Premium

In the previous two sections we tested for asymmetric information on the agent's risk. One of the practical issues we had to solve was a proper selection of the explanatory variables which we used for conditioning. We had to do so mainly in the nonparametric approach, where conditioning on all characteristics was not possible due to the curse of dimensionality. Therefore we decided first to use the premium risk factors, which are the characteristics used by the insurer in premium pricing. These characteristics performed well in the nonparametric part, where the null of independence was not rejected when we controlled for the experience rating. However, we encountered some problems in the parametric approach, where the estimated conditional correlation was significantly negative even after adding the BM class into the parametric model. Such a result might be caused by a misspecification of X. Therefore we undertook the actuarial study, in which we chose a group of variables with high explanatory power. By conditioning on these actuarial risk factors and the BM class, we obtained an insignificant estimate of the claim-coverage correlation.

Now we would like to take a different approach by using the premium instead of all (important) risk factors. Insurers usually use a more complicated pricing structure, in which the premium risk factors enter in a nonproportional way. Such a structure is periodically reviewed and updated, so that the calculated premium reflects the underlying risk in the best possible way. Moreover, the premium also represents the insurer's knowledge about the agent's risk.
The latter determines not only the expected claim costs, but also the agent's decision whether to buy full insurance or not. This suggests that, when testing for asymmetric information, it could be sufficient to condition only on the premium, which incorporates all the relevant insurer's information about the agent's risk.

The possibility to condition on only one variable, the premium, offers certain advantages. The first one is that parametric models can be specified in a much more flexible way with one variable than with multiple variables, which may involve some mutual interactions or cross-effects. The second advantage is that conditioning on one variable leaves ample room for the application of nonparametric methods because of the low dimensionality. In this section we will profit from both these advantages. First we will repeat the tests of Section 2.4 for asymmetric information in claim occurrences, conditioning only on the premium. Then we will develop a fully nonparametric method to test for asymmetric information in incurred damages. Before we start with the tests, let us first clarify our idea in more detail with some empirical support from the data.

2.6.1 Liability Premium and Expected Damage

If we want to condition on a premium, we have to use a premium which is observed for all agents and calculated in the same way for the agents with full insurance as well as for the agents with only basic insurance. Such a premium is naturally the premium for the liability cover, which is obligatory for all drivers. Moreover, the liability premium is directly informative on the third-party risk that is highly relevant for the third-party claims, on which we focus in this chapter.

In the data, we observe for each agent the yearly base premium, which is calculated using only the exogenous premium risk factors. The actually paid premium depends on the agent's BM class as given in Table 2.1.
Our previous results revealed that omitting experience rating from the model causes a spurious informational asymmetry. Therefore, in our further analysis we will focus only on the actually paid liability premium, which also includes the BM discount or surcharge. From now on in this chapter we will simply call it the premium and denote it by $q$.

A natural conjecture is that the premium reflects the expected underlying risk costs plus some insurer's overheads, i.e.

$$q(X) = h(E[L \mid X]),$$

where $E[L \mid X]$ is the expected damage caused by an agent with characteristics $X$, and $h$ is some strictly increasing function whose graph lies above the diagonal (since the insurer has to cover loading costs and make some profit). We can verify the validity of this formula empirically by smoothing a scatter plot of all observed damages (including zero damages) against all paid premia. In Figure 2.1, we display the scatter plot of the actually paid liability premia against the observed third-party damages, smoothed by the Lowess method with bandwidth 0.8. We can see that the smoothed line, representing the average incurred damage, is indeed strictly increasing in the premium and lies slightly below the diagonal, which is what we expected.

The above formula allows us to write

$$E[L \mid X] = h^{-1}(q(X)) = E[L \mid q(X)],$$

thus the premium is a sufficient statistic for the expected damage. The latter can be further expressed as the product of the probability of a claim and the expected size of the incurred loss once a claim has occurred:

$$E[L \mid q(X)] = \Pr[L > 0 \mid q(X)] \cdot E[L \mid L > 0, q(X)].$$

Further we assume that the premium is a sufficient statistic for both the probability of a claim and the expected size of incurred losses. This means that conditioning on the premium is the same as conditioning on all risk factors, i.e.

$$\Pr[L > 0 \mid q(X)] = \Pr[L > 0 \mid X] \quad \text{and} \quad E[L \mid L > 0, q(X)] = E[L \mid L > 0, X].$$

This assumption is quite strong and might not be valid in some special cases in which the effects of the premium on the probability of a claim and on the expected size of incurred losses work in opposite directions. Imagine, for example, two drivers with the same expected claim costs (i.e. paying the same premium), of whom one has a light car and a high probability of a claim, and the other has a heavy car and a low probability of a claim. The first agent causes many small accidents, while the second one causes few severe accidents. In product, their expected damage is the same, but the effect of the premium on the probability of a claim and on the expected size of the incurred loss is reversed.

Nevertheless, we believe that our assumption holds in general. We have strong empirical evidence that the premium is a good predictor of the probability of a claim; its coefficient in the claim-occurrence probit is significantly positive, suggesting that agents paying a high premium indeed cause more accidents. Agents with a high premium also incur larger losses. When we regress the size of incurred losses on the premium, its coefficient is positive, though significant only at the 10% level.

In what follows we will use our assumption to test for asymmetric information in claim occurrences and claim severities by conditioning only on the premium. The possibility to condition on only one variable considerably simplifies our previous tests, where we had to battle with the curse of dimensionality (in the nonparametric part) and worry about misspecification (in the parametric part). Furthermore, due to the low dimensionality we will be able to develop new fully nonparametric methods.

2.6.2 Test for Asymmetric Information Based on Claim Frequency

As argued earlier, the premium is highly informative on the agent's risk, at least from the insurer's perspective.
The risk in turn influences the decision whether or not to buy full coverage, and naturally determines the probability of an accident. Therefore, when testing for asymmetric information between the choice of coverage and the occurrence of a claim, it should be sufficient to condition only on the premium. This dramatically simplifies the implementation of the tests from Section 2.4. Let us briefly summarize the output of these tests when using only the premium.

Estimation of the two independent probits with the premium as the only covariate gives $W = 2.374$ with a p-value of 0.123. The bivariate probit estimates $\hat{\varrho} = -0.016$ with a standard error of 0.010. In both cases, the estimated correlation is insignificant. This result again confirms our intuition that the rejection of the null that we got in the previous section, when using all premium risk factors together with the BM class, was caused by a misspecification. The premium we use here is a function of exactly the same factors, but it apparently depends on these factors in a nonlinear way (otherwise we would get similar results whether using the premium or the premium risk factors).

Figure 2.1: Damage by Premium for All Agents
Note: Observed damages above 4,000 are not displayed in the graph, but were taken into account when smoothing.

Using the nonparametric approach, we group all agents into 20 cells based on their premium. The first cell groups the agents who pay the lowest premium, below 100 guilders a year. The second cell groups agents with a premium between 100 and 199 guilders, and so on up to group 19, where the agents pay a premium between 1,800 and 1,899 guilders. The last cell groups the agents whose premium is above 1,900 guilders. The smallest group has 74 individuals and the biggest one 25,156 individuals. None of the tests reject the null of independence. The Kolmogorov-Smirnov test statistic is 0.129 with a p-value of 0.842. All 20 test statistics sum up to 15.443, which gives a p-value of the $\chi^2(20)$ distribution equal to 0.751. Finally, there is only one rejection of the null, in group 18 (with 123 individuals), which gives a p-value of the $B(20, 0.05)$ distribution equal to 0.642.

We found no evidence of asymmetric information when we conditioned only on the premium. It should perhaps be emphasized again how important it is to control for the experience rating. When we repeated all these tests using only the base premium, which does not include the BM discount (or surcharge), the null of independence was strongly rejected.

2.6.3 Test for Asymmetric Information Based on Claim Severity

In Section 2.5 we tested whether fully insured agents cause more damage than basically insured agents, and we found no evidence of it. Our suspicion, however, remains, mainly after we juxtaposed incurred damages with paid premia. Figure 2.2 displays a scatter plot of the incurred damages against the premia, smoothed again by the Lowess method with bandwidth 0.8. At first glance we can see that the line representing the smoothed damage caused by fully insured agents lies above the line representing the smoothed damage caused by basically covered agents, at least in the tails. In what follows we will test whether this difference is significant. As discussed earlier, we will condition only on the premium, which allows us to apply a new fully nonparametric method, introduced by Koul and Schick (1997).

2.6.3.1 Model

Let $\mu(q) = E[L \mid L > 0, q]$ denote the expected size of the incurred damage for an agent who pays the premium $q$. We want to test whether this depends on the agent's type of coverage. Therefore we denote by $\mu_y(q) = E[L \mid L > 0, q, y]$ the expected size of the incurred damage for an agent with coverage $y$. Recall that $y = 0$ refers to basic coverage and $y = 1$ to full coverage. Our null hypothesis is that $\mu_y$ is independent of $y$, i.e.

$$H_0 : \mu_0 = \mu_1.$$

Under the alternative we suppose that the agents with full coverage cause more damage than the agents with only basic coverage, i.e.

$$H_1 : \mu_1 > \mu_0,$$

where $\mu_1 > \mu_0$ means $\mu_1(q) \ge \mu_0(q)$ for all values of $q$, with strict inequality for at least one $q$. Note that these hypotheses are, with the next paragraph's independence assumptions, equivalent to the ones we defined in Section 2.5.

Our observations consist of bivariate data $(q_{y,i}, L_{y,i})$, $i = 1, \dots, n_y$, $y = 0, 1$, where $q_{y,i}$ is the premium of agent $i$ with coverage $y$ who caused a damage of size $L_{y,i} > 0$. We assume that the following relations are satisfied:

$$L_{y,i} = \mu_y(q_{y,i}) + \varepsilon_{y,i}, \qquad i = 1, \dots, n_y, \quad y = 0, 1,$$

where the errors $\varepsilon_{y,i}$ are all mutually independent and identically distributed with a density $f$. The covariates $q_{y,i}$ are also all mutually independent and have a common density $g$. In addition, we assume that the covariates are independent of the errors.³⁰

³⁰ This assumption seems to be innocuous despite the fact that the incurred damages are always positive. The support of the errors could be limited for negative values if the expected size of the incurred damage $E(L \mid L > 0, q)$ got close to zero for some premium $q$. Then the errors would not be independent of the covariates. Such a pattern is, however, not observed in the data. From Figure 2.2 we can see that the incurred damages are on average far from zero for all premia. The situation would be different if we modeled just the expected damage $E(L \mid q)$, which indeed gets close to zero for small values of the premium; see Figure 2.1.

Figure 2.2: Incurred Damage by Premium for Fully and Basically Insured Agents
Note: Observed damages above 60,000 are not displayed in the graph, but were taken into account when smoothing.

Koul and Schick (1997) propose two tests for a common design, when $n_0 = n_1$, and two tests for a distinct design, when $n_0 \neq n_1$. Given that, in our case, the sizes of the two subsamples are different, we will focus only on the tests suitable for the distinct design. Appendix 2.A provides some technical details.

2.6.3.2 First Test

The first test requires a kernel regression estimate of $\mu$, based on the pooled sample:

$$\hat{\mu}(q) = \frac{\sum_{y=0}^{1} \sum_{i=1}^{n_y} L_{y,i}\, w_a(q_{y,i} - q)}{\sum_{y=0}^{1} \sum_{i=1}^{n_y} w_a(q_{y,i} - q)}, \qquad q \in \mathbb{R},$$

where $w_a$ is some kernel function with a bandwidth $a$.³¹ Throughout this chapter we will use the Epanechnikov kernel, which has favorable theoretical properties.³²

³¹ More precisely, $w_a(x) = \frac{1}{a} w\!\left(\frac{x}{a}\right)$, $x \in \mathbb{R}$, $a > 0$, where $w$ is a symmetric Lipschitz-continuous density which is positive on $(-1, 1)$ and vanishes off $(-1, 1)$.

³² The Epanechnikov kernel is optimal in the sense of minimizing the asymptotic mean integrated squared error (Jones and Wand, 1995, Section 2.7).

Set

$$\hat{\varepsilon}_{y,i} = L_{y,i} - \hat{\mu}(q_{y,i}), \qquad i = 1, \dots, n_y, \quad y = 0, 1,$$

so that $\hat{\varepsilon}_{y,i}$ mimics $\varepsilon_{y,i}$ under the null hypothesis. Koul and Schick (1997) propose the following test, which rejects the null hypothesis for large values of the test statistic

$$T_2 = \sqrt{\frac{n_0 n_1}{n_0 + n_1}} \left( \frac{1}{n_1} \sum_{i=1}^{n_1} \psi(\hat{\varepsilon}_{1,i}) - \frac{1}{n_0} \sum_{i=1}^{n_0} \psi(\hat{\varepsilon}_{0,i}) \right),$$

where $\psi$ is a nondecreasing measurable function. If the error density $f$ is normal with zero mean, the authors suggest using the specification $\psi(\varepsilon) = \varepsilon$, which makes the test locally asymptotically most powerful.

The test statistic $T_2$ basically compares $\psi$-averages of the residuals in the two subsamples. Under the null, the residuals in both subsamples should behave alike, so $T_2$ should be close to zero. Under the alternative $\mu_1 > \mu_0$, the residuals $\hat{\varepsilon}_1$ of the fully insured agents should tend to be larger than the residuals $\hat{\varepsilon}_0$ of the basically insured agents, so the statistic $T_2$ should be positive.

The authors prove that under the null and some mild additional assumptions on $f$, $g$, $\mu$, $\psi$ and $a$ (see Appendix 2.A.1 for details), $T_2$ is asymptotically normal with zero mean and a variance $\tau_2$ which can be consistently estimated by

$$\hat{\tau}_2 = \frac{1}{n_0 + n_1} \sum_{y=0}^{1} \sum_{i=1}^{n_y} \psi^2(\hat{\varepsilon}_{y,i}) - \left( \frac{1}{n_0 + n_1} \sum_{y=0}^{1} \sum_{i=1}^{n_y} \psi(\hat{\varepsilon}_{y,i}) \right)^2.$$

We estimated the function $\mu(q)$ by the Nadaraya-Watson estimator and looked for a suitable bandwidth by the cross-validation method. We found an optimal value for the bandwidth at 1,297. We used the suggested specification of $\psi$ for normal errors, $\psi(\varepsilon) = \varepsilon$, which produced $T_2 = 26{,}790.3$ with an estimated standard error of 39,662.7. This gives an approximate p-value of 0.250, so we do not reject the null.

As a robustness check we also tried two other specifications: (1) $\psi(\varepsilon) = \mathrm{sign}(\varepsilon)\,\varepsilon^2$ and (2) $\psi(\varepsilon) = \mathrm{sign}(\varepsilon)\sqrt{|\varepsilon|}$. The first one is concave for negative $\varepsilon$ and convex for positive $\varepsilon$, so it makes big errors even bigger. The second one is convex for $\varepsilon < 0$ and concave for $\varepsilon > 0$, so it makes big errors smaller. The first specification produced $T_2 = 2.377 \cdot 10^{10}$ with an estimated standard error of $3.889 \cdot 10^{10}$ and an approximate p-value of 0.271. The second specification gave $T_2 = 39.988$ with an estimated standard error of 72.992 and an approximate p-value of 0.292. In any case we cannot reject the null.

2.6.3.3 Second Test

The second test proposed by Koul and Schick (1997) rejects the null hypothesis for large values of the test statistic

$$T_4 = \sqrt{\frac{n_0 n_1}{n_0 + n_1}}\; \frac{1}{n_0 n_1} \sum_{i=1}^{n_0} \sum_{j=1}^{n_1} \rho(L_{1,j} - L_{0,i})\, w_a(q_{1,j} - q_{0,i}),$$

where $\rho$ is a measurable odd function. If the error density $f$ is normal, the authors suggest using the specification $\rho(x) = x$, which makes the test locally asymptotically most powerful.

Note that this test does not require estimation of the common mean function $\mu$. It just averages weighted differences between the damages incurred by the fully insured agents ($L_1$) and the basically insured agents ($L_0$) who pay a similar premium $q$. Under the null, these damages should be roughly equal, so $T_4$ should be close to zero. Under the alternative $L_1 > L_0$, the test statistic should be significantly positive.
The authors prove that under the null and some mild additional assumptions on $f$, $g$, $\mu$, $\rho$ and $a$ (see Appendix 2.A.2 for details), $T_4$ is asymptotically normal with zero mean and a variance $\tau_4$ which can be consistently estimated by

$$\hat{\tau}_4 = \frac{1}{2} \left( \frac{1}{n_1} \sum_{j=1}^{n_1} \left( \bar{U}_{j,\cdot} - \bar{U}_{\cdot,\cdot} \right)^2 + \frac{1}{n_0} \sum_{i=1}^{n_0} \left( \bar{U}_{\cdot,i} - \bar{U}_{\cdot,\cdot} \right)^2 \right),$$

where

$$U_{j,i} = \rho(L_{1,j} - L_{0,i})\, w_a(q_{1,j} - q_{0,i}), \qquad \bar{U}_{j,\cdot} = \frac{1}{n_0} \sum_{i=1}^{n_0} U_{j,i}, \qquad \bar{U}_{\cdot,i} = \frac{1}{n_1} \sum_{j=1}^{n_1} U_{j,i} \qquad \text{and} \qquad \bar{U}_{\cdot,\cdot} = \frac{1}{n_0 n_1} \sum_{i=1}^{n_0} \sum_{j=1}^{n_1} U_{j,i}.$$

Since we do not know an optimal value for the bandwidth $a$, we will try a wide range of suitable values, say from 1 to 2,000. This will allow us to see how sensitive the estimated p-value of the test is to different values of the bandwidth. Furthermore, to be able to determine a rejection level of the test robustly, we will need to know the minimum of all estimated p-values.

First we tried the suggested specification $\rho(x) = x$, which is optimal if the errors are normal. We observe that the estimated p-value of the test is quite unstable for small values of the bandwidth. This probably happens because observations with unique values of the premium are not taken into account if the bandwidth is too small. Therefore the bandwidth should be reasonably big. Indeed, the estimated p-value stabilizes around 0.25 for bandwidths bigger than 20. It reaches its minimum of 0.220 at a bandwidth of 300. Based on this result, we cannot reject the null.

As a robustness check, we tried two different specifications for the function $\rho$, namely $\rho(x) = \mathrm{sign}(x)\,x^2$ and $\rho(x) = \mathrm{sign}(x)\sqrt{|x|}$. We used the same specifications for the function $\psi$ in the previous test. As before, the estimated p-values are unstable for small values of the bandwidth, but become stable for bandwidths above 20. The minimum p-value is again reached for a bandwidth around 300. It is 0.124 for the first specification and 0.493 for the second one. In any case we cannot reject the null.
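The second test can be sketched in a few lines of standard-library Python. This is only an illustration, not the code used for the reported results: the function names `epanechnikov` and `t4_test` are hypothetical, the weight function is taken as $v \equiv 1$, and the one-sided p-value $1 - \Phi(T_4 / \sqrt{\hat{\tau}_4})$ is our assumed reading of "asymptotically normal with zero mean and variance $\tau_4$, rejecting for large values".

```python
import math


def epanechnikov(u):
    """Epanechnikov density w(u), positive on (-1, 1) and zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def t4_test(q0, L0, q1, L1, a, rho=lambda x: x):
    """Return (T4, tau4_hat, p_value) for the second test with v = 1.

    q0, L0: premia and positive damages of basically insured agents (y = 0)
    q1, L1: premia and positive damages of fully insured agents (y = 1)
    a:      kernel bandwidth; rho: odd function (default rho(x) = x)
    The p-value rule is an assumption of this sketch (one-sided normal).
    """
    n0, n1 = len(q0), len(q1)
    # U[j][i] = rho(L1_j - L0_i) * w_a(q1_j - q0_i), with w_a(x) = w(x / a) / a
    U = [[rho(L1[j] - L0[i]) * epanechnikov((q1[j] - q0[i]) / a) / a
          for i in range(n0)] for j in range(n1)]
    row = [sum(U[j]) / n0 for j in range(n1)]                        # U-bar_{j,.}
    col = [sum(U[j][i] for j in range(n1)) / n1 for i in range(n0)]  # U-bar_{.,i}
    grand = sum(row) / n1                                            # U-bar_{.,.}
    t4 = math.sqrt(n0 * n1 / (n0 + n1)) * grand
    tau4 = 0.5 * (sum((r - grand) ** 2 for r in row) / n1
                  + sum((c - grand) ** 2 for c in col) / n0)
    if tau4 <= 0.0:
        raise ValueError("variance estimate is zero; try a larger bandwidth")
    # One-sided p-value 1 - Phi(T4 / sqrt(tau4)), written with erfc.
    p_value = 0.5 * math.erfc(t4 / math.sqrt(tau4) / math.sqrt(2.0))
    return t4, tau4, p_value


if __name__ == "__main__":
    # Made-up toy data, purely to show the call; not taken from the thesis.
    q0, L0 = [100.0, 200.0, 300.0, 450.0], [1000.0, 1500.0, 2000.0, 1800.0]
    q1, L1 = [120.0, 260.0, 310.0], [1400.0, 1700.0, 2600.0]
    for a in (50.0, 150.0, 300.0):
        print(a, t4_test(q0, L0, q1, L1, a))
```

Looping over a grid of bandwidths, as in the toy call above, mirrors the sensitivity check described in the text, where the p-value is tracked across bandwidths and its minimum is reported.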
Finally, because the observed damages have a huge variance (see the beginning of Section 2.5), as a robustness check we repeated both tests, $T_2$ and $T_4$, using log-transformed damages, $\log(L)$. This should make the errors more homoscedastic. The null was again not rejected.

Our new nonparametric approach delivers the same result as the tests from Section 2.5. The null, that the expected size of the incurred damage does not depend on the type of coverage, is not rejected. Looking back at Figure 2.2, we remark that the two damage lines visibly deviate only in the right tail, where very few data are observed. Most of the observations (above 90%) have a premium below 1,000 guilders, where the two lines almost coincide.

2.7 Conclusion

We did not find any evidence of asymmetric information between agents and the insurer. When controlling for all relevant characteristics, the choice of coverage has no influence on the occurrence of claims, nor on the size of incurred damages.

These results should, however, be interpreted with care. De Meza and Webb (2001) pointed out that the conditional-correlation approach may fail to detect asymmetric information in the presence of both selection on risk preferences and moral hazard. Notably, if more risk-averse drivers tend to buy more insurance and drive more cautiously, the correlation between the coverage and the occurrence of claims can even be negative. Moreover, our analysis uses only third-party claims. If asymmetric information is particularly strong for accidents involving only one car, we cannot detect it.

In Chapter 4 we will explore dynamic features of the data, using all observed claims at fault together with their sizes and timing. We will develop dynamic econometric methods to test for the presence of moral hazard. Since dynamic methods exploit more information from the data, they are in general more powerful in detecting traces of asymmetric information.
APPENDIX TO CHAPTER 2

2.A Technical Details for Tests from Subsection 2.6.3

In Subsection 2.6.3, we used two tests, introduced by Koul and Schick (1997), based on the statistics $T_2$ and $T_4$. The authors derive asymptotic properties of these test statistics under the null $H_0 : \mu_0 = \mu_1 = \mu$. Using the same notation as in Subsection 2.6.3, we assume that

• $\mu$ is a measurable function,
• $q$, $\varepsilon$ and $\varepsilon'$ are independent random variables with respective densities $g$, $f$ and $f$,
• $v$ is a measurable function such that $v \ge 0$ and $0 < E\,v^2(q) < \infty$,
• $\psi$ is a nondecreasing measurable function such that $0 < \mathrm{var}(\psi(\varepsilon)) < \infty$, and
• $\rho$ is an odd measurable function such that $0 < E\,\rho^2(\varepsilon - \varepsilon') < \infty$.

For the asymptotics, we let $n_0$ and $n_1$ tend to infinity and let $a$ depend on $n_0$ and $n_1$ in such a way that $a \to 0$.

2.A.1 Asymptotic Properties of the Test Statistic T2

Koul and Schick (1997, Theorem 2.1): Suppose additionally that

1. $f$ has zero mean and finite variance,
2. $g$ is bounded and bounded away from 0 on a closed interval $I$, and vanishes off $I$,
3. $\mu$ is continuous,
4. $\psi$ is Lipschitz-continuous, and
5. $(n_0 + n_1)\,a^2 \to \infty$.

Then

$$T_2 = \sqrt{\frac{n_0 n_1}{n_0 + n_1}} \left( \frac{1}{n_1} \sum_{i=1}^{n_1} v(q_{1,i})\,\psi(\hat{\varepsilon}_{1,i}) - \frac{1}{n_0} \sum_{i=1}^{n_0} v(q_{0,i})\,\psi(\hat{\varepsilon}_{0,i}) \right)$$

is asymptotically normal with zero mean and variance $\tau_2 = \mathrm{var}(v(q)\,\psi(\varepsilon))$, which can be consistently estimated by

$$\hat{\tau}_2 = \frac{1}{n_0 + n_1} \sum_{y=0}^{1} \sum_{i=1}^{n_y} v^2(q_{y,i})\,\psi^2(\hat{\varepsilon}_{y,i}) - \left( \frac{1}{n_0 + n_1} \sum_{y=0}^{1} \sum_{i=1}^{n_y} v(q_{y,i})\,\psi(\hat{\varepsilon}_{y,i}) \right)^2.$$

The authors study the local asymptotic power of the test and recommend choosing $v \equiv 1$ if $g$ is unknown and $\psi(\varepsilon) = \varepsilon$ if $f$ is normal with zero mean. Since we do not know the exact distribution of the premia, $g$, we took $v \equiv 1$. As for the function $\psi$, we tried the specification recommended for normal errors and two other specifications as a robustness check.
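With $v \equiv 1$, the statistic $T_2$ and its variance estimate reduce to simple averages of transformed residuals, which the following standard-library Python sketch makes explicit. The function names (`epanechnikov`, `nw_fit`, `t2_test`) are hypothetical, the bandwidth is taken as given rather than cross-validated, and the p-value rule $1 - \Phi(T_2 / \sqrt{\hat{\tau}_2})$ is an assumption of this sketch. Read this way, it is consistent with the figures in Subsection 2.6.3.2 if the reported standard error is $\sqrt{\hat{\tau}_2}$: 26,790.3 / 39,662.7 gives a one-sided p-value of about 0.250.

```python
import math


def epanechnikov(u):
    """Epanechnikov density w(u), positive on (-1, 1) and zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def nw_fit(q_all, L_all, a, q):
    """Nadaraya-Watson estimate of mu(q) from the pooled sample.

    The 1/a factor of w_a cancels in the ratio. At a sample point the
    denominator is positive because the point itself has weight w(0).
    """
    w = [epanechnikov((qi - q) / a) for qi in q_all]
    return sum(wi * Li for wi, Li in zip(w, L_all)) / sum(w)


def t2_test(q0, L0, q1, L1, a, psi=lambda e: e):
    """Return (T2, tau2_hat, p_value) for the first test with v = 1.

    The one-sided normal p-value is an assumption of this sketch.
    """
    q_all, L_all = list(q0) + list(q1), list(L0) + list(L1)
    # Residuals from the pooled fit, transformed by psi, per subsample.
    psi0 = [psi(L - nw_fit(q_all, L_all, a, q)) for q, L in zip(q0, L0)]
    psi1 = [psi(L - nw_fit(q_all, L_all, a, q)) for q, L in zip(q1, L1)]
    n0, n1 = len(psi0), len(psi1)
    n = n0 + n1
    t2 = math.sqrt(n0 * n1 / n) * (sum(psi1) / n1 - sum(psi0) / n0)
    pooled = psi0 + psi1
    tau2 = sum(x * x for x in pooled) / n - (sum(pooled) / n) ** 2
    if tau2 <= 0.0:
        raise ValueError("variance estimate is zero")
    p_value = 0.5 * math.erfc(t2 / math.sqrt(tau2) / math.sqrt(2.0))
    return t2, tau2, p_value


if __name__ == "__main__":
    # Made-up toy data with n0 != n1, purely to show the call.
    q0, L0 = [100.0, 200.0, 300.0, 450.0], [1000.0, 1500.0, 2000.0, 1800.0]
    q1, L1 = [120.0, 260.0, 310.0], [1400.0, 1700.0, 2600.0]
    print(t2_test(q0, L0, q1, L1, a=200.0))
```

The robustness specifications from the text can be passed through `psi`, for example `psi=lambda e: math.copysign(math.sqrt(abs(e)), e)` for the square-root variant.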
2.A.2 Asymptotic Properties of the Test Statistic T4

Let

$$r(t) = E\,\rho(t - \varepsilon) = \int \rho(t - x)\, f(x)\, dx, \qquad t \in \mathbb{R},$$

$$R(t) = E\,\rho^2(t + \varepsilon - \varepsilon') = \iint \rho^2(t + x_1 - x_2)\, f(x_1)\, f(x_2)\, dx_1\, dx_2, \qquad t \in \mathbb{R},$$

$$h_s(q, t) = \frac{1}{2} \big( v(q) + v(q + s) \big)\, r\big(t + \mu(q) - \mu(q + s)\big)\, g(q + s), \qquad q, s, t \in \mathbb{R}.$$

Koul and Schick (1997, Theorem 2.4): Suppose that $n_0 a \to \infty$ and $n_1 a \to \infty$, and for some $\eta > 0$,

$$\sup_{|s| < \eta} \int \big( v(q) + v(q + s) \big)^2 R\big(\mu(q) - \mu(q + s)\big)\, g(q + s)\, g(q)\, dq < \infty \tag{2.1}$$

and

$$\lim_{s \to 0} \iint \left| h_s(q, t) - h_0(q, t) \right|^2 f(t)\, g(q)\, dq\, dt = 0. \tag{2.2}$$

Then

$$T_4 = \sqrt{\frac{n_0 n_1}{n_0 + n_1}}\; \frac{1}{n_0 n_1} \sum_{i=1}^{n_0} \sum_{j=1}^{n_1} \frac{1}{2} \big( v(q_{1,j}) + v(q_{0,i}) \big)\, \rho(L_{1,j} - L_{0,i})\, w_a(q_{1,j} - q_{0,i})$$

is asymptotically normal with zero mean and variance $\tau_4 = E\, v^2(q)\, g^2(q)\, r^2(\varepsilon)$, which can be consistently estimated by

$$\hat{\tau}_4 = \frac{1}{2} \left( \frac{1}{n_1} \sum_{j=1}^{n_1} \left( \bar{U}_{j,\cdot} - \bar{U}_{\cdot,\cdot} \right)^2 + \frac{1}{n_0} \sum_{i=1}^{n_0} \left( \bar{U}_{\cdot,i} - \bar{U}_{\cdot,\cdot} \right)^2 \right),$$

where

$$U_{j,i} = \frac{1}{2} \big( v(q_{1,j}) + v(q_{0,i}) \big)\, \rho(L_{1,j} - L_{0,i})\, w_a(q_{1,j} - q_{0,i}),$$

$$\bar{U}_{j,\cdot} = \frac{1}{n_0} \sum_{i=1}^{n_0} U_{j,i}, \qquad \bar{U}_{\cdot,i} = \frac{1}{n_1} \sum_{j=1}^{n_1} U_{j,i} \qquad \text{and} \qquad \bar{U}_{\cdot,\cdot} = \frac{1}{n_0 n_1} \sum_{i=1}^{n_0} \sum_{j=1}^{n_1} U_{j,i}.$$

The authors claim (in Remark 2.5) that if $v$ and $g$ are bounded, $v$ is continuous, and $r$ and $R$ are Lipschitz-continuous, then (2.1) and (2.2) are implied by

$$\lim_{s \to 0} \int \left| \mu(q + s) - \mu(q) \right|^2 g(q)\, dq = 0.$$

This condition is satisfied, for example, when $\mu$ is continuous.

The authors also study the local asymptotic power of the test and recommend choosing $v \equiv 1$ if $g$ is unknown and $\rho(\varepsilon) = \varepsilon$ if $f$ is a normal density. Otherwise, $\rho$ should be nondecreasing and Lipschitz-continuous. As before, we took $v \equiv 1$ and, concerning the function $\rho$, we tried the specification recommended for normal errors and two other specifications as a robustness check.

3 State Dependence

3.1 Introduction

Distinguishing state dependence from heterogeneity in renewal data is of major substantive interest in economics, but hard (Heckman, 1991). Standard techniques for linear panel data with fixed effects, which exploit within-subject variation and dynamic instruments, do not readily apply to renewal, or panel duration, data.
Problems arise for two reasons. First, renewal models are inherently nonlinear. Second, only a selection of renewal events can usually be observed in a finite observational period. This chapter studies identifiability of and testing for state dependence in renewal models with censored data.

There are many examples of substantive economic problems that can be reduced to the analysis of state dependence in renewal data. For example, Abbring, Chiappori, Heckman, and Pinquet (2003) relate state dependence of claim intensities in car insurance to moral hazard. They argue that, under the existing experience-rating schemes, incentives change with each insurance claim. Consequently, under moral hazard and given unobserved determinants, claim intensities depend on the occurrence of past claims.

This chapter's analysis builds on the pioneering work on state dependence in event-history data by Bates and Neyman (1952), Heckman (1981), and Heckman and Borjas (1980). We first explore the literature on the identification of event-history models with state dependence and heterogeneity initiated by Elbers and Ridder (1982) and Heckman and Singer (1984) and reviewed by Van den Berg (2001). We point out that censoring may substantially invalidate existing results for the identification of panel duration models and provide some constructive identification results for this censored case.

Subsequently, we consider testing for state dependence with panel duration data. We focus on a set of (paired) rank tests that were inspired by Holt and Prentice's (1974) and Chamberlain's (1985) partial likelihood methods for paired duration data and further developed and applied by Abbring, Chiappori, and Pinquet (2003). These tests compare two subsequent durations for each subject, in the subsample for which two such durations are observed. They naturally handle the fact that this subsample may only be a strict selection of the full sample if there is censoring.
Our main contribution is to illustrate that the tests have little power in the case in which this selection is very strong. This happens, for example, with data on rare events, such as insurance events.

The remainder of the chapter proceeds as follows. Section 3.2 presents a framework for the analysis of state dependence and heterogeneity in renewal data. Section 3.3 reviews and develops identification results for a range of models in this framework. Section 3.4 explores the power of some nonparametric tests for state dependence. Section 3.5 concludes.

3.2 State Dependence and Heterogeneity in Renewal

Consider a sequence of similar, renewal, events in continuous time $t \in \mathbb{R}_+$. Examples include claims incurred on an insurance contract in contract economics, nominal price changes in macroeconomics, a consumer's purchases of a given good or service in marketing, and transactions of a particular stock in finance.¹ Suppose that the renewal events occur at times $0 = T_0 < T_1 < \cdots$, and let $N(t) \equiv \max\{k \in \mathbb{N} \mid T_k \le t\}$ count such events up to time $t$. Denote durations between renewals by $\Delta T_k \equiv T_k - T_{k-1}$. Suppose that all unobserved heterogeneity is captured by a vector $\lambda$. We suppress observed covariates throughout.²

¹ For example, Abbring, Chiappori, and Pinquet (2003), Dionne, Dahchour, and Michaud (2006) and Abbring et al. (2008) analyze car insurance claims. Campbell and Eden (2007) analyze grocers' price changes. Chintagunta and Dong (2006) review the application of duration analysis in marketing; Jain and Vilcassim (1991) provide one key example of the empirical analysis of repeated consumer purchases. Engle and Russell (1998) discuss methods for the analysis of transaction times in finance, and apply these to the analysis of IBM stock transactions.

² Where of interest, we discuss the way they could enter the analysis.

If the renewal intensity $\theta(t \mid H(t), \lambda)$ at time $t$ depends on

• $t$ conditional on $(H(t), \lambda)$, we say that there is time dependence, or nonstationarity;
• $H(t) \equiv \{N(u);\ 0 \le u < t\}$ conditional on $\lambda$, we say that there is state dependence;
• $\lambda$ conditional on $H(t)$, we say that there is unobserved heterogeneity.

Following Heckman and Borjas (1980), state dependence can be further classified into occurrence dependence (dependence on $N(t-)$), duration dependence (dependence on $t - T_{N(t-)}$), and lagged duration dependence (dependence on $\Delta T_1, \dots, \Delta T_{N(t-)}$).

For example, the dynamic economic models of insurance claim times under moral hazard and experience rating in Abbring, Chiappori, and Pinquet (2003) and Abbring et al. (2008) predict nonstationarity and occurrence dependence, but no (lagged) duration dependence, of claim intensities. They face the empirical challenge of distinguishing these effects, which are of substantive interest, from those of unobserved heterogeneity in risk.

3.3 Identifiability

Separating duration dependence and heterogeneity in data on a single renewal duration, single spell data, is notoriously hard (Lancaster, 1979, Heckman and Borjas, 1980). Elbers and Ridder (1982), Heckman and Singer (1984), Ridder (1990), and Kortram, Lenstra, Ridder, and van Rooij (1995) show that strong assumptions, on separability and otherwise, are needed. In particular, data on external covariates are needed, together with strong assumptions on the variation in these covariates and the way they enter the duration model.

Data on sequences of renewal times for each subject, multiple spell or panel duration data, facilitate the identification of duration dependence under fewer assumptions. Such data also allow for the analysis of state dependence across spells, that is, occurrence dependence and lagged duration dependence (Heckman and Borjas, 1980, Honoré, 1993).
In this section, we will review and develop some identification results for panel duration data. A novel aspect of our analysis is that we explicitly deal with the common problem that renewal events are selectively observed because data are only collected for a finite amount of time. This censoring problem is shown to invalidate existing panel identification results, and greatly reduces the identifying power of panel duration data. Section 3.4 subsequently explores the effects of censoring on common tests for state dependence.

3.3.1 Occurrence Dependence and Duration Dependence

A close analogy with static linear panel data analysis arises in the special case in which there is occurrence and duration dependence, but neither lagged duration dependence nor nonstationarity. Consider a specification for this special case in which $\Delta T_1, \dots, \Delta T_k, \dots$ are mutually independent conditional on $\lambda$, and

$$\theta(t \mid H(t), \lambda) = \xi_{N(t-)+1}\big(t - T_{N(t-)}\big)\, \lambda. \tag{OD}$$

Here, $\lambda$ is a nonnegative random variable with some distribution $G$. The baseline hazard $\xi_k : \mathbb{R}_+ \to (0, \infty)$ reflects duration dependence. It may differ between spells $k$ and has an integral $\Xi_k(t) \equiv \int_0^t \xi_k(u)\, du$ for all finite $t$.³ We exclude defects by assuming that $\lim_{t \to \infty} \Xi_k(t) = \infty$.⁴

³ This can be extended to allow for a finite support of $\Delta T_k$.

⁴ For results on identifiability of mixture duration models with defects, see Abbring (2002, 2007).

3.3.1.1 Full Information

First, consider the case in which at least two renewal events, and therefore $T_1$ and $T_2$, are always observed. Honoré (1993) shows that $\Xi_1$, $\Xi_2$, $G$ are identified in this case.

Proposition 1 (Honoré, 1993, Theorem 1). The functions $\Xi_1$, $\Xi_2$ and $G$ in the model specification (OD) are uniquely determined from the distribution of $(T_1, T_2)$.
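To make specification (OD) concrete, the following minimal Python sketch simulates the first two renewal times per subject. It relies on the standard property of hazard models that, given $\lambda$, the integrated hazard $\Xi_k(\Delta T_k)\,\lambda$ of a completed spell is unit exponential. Everything specific in it is an illustrative assumption rather than part of the model above: the function name `simulate_od`, the Weibull form $\Xi_k(t) = (t / s_k)^{\alpha_k}$ of the integrated baseline hazards, and the unit-mean gamma distribution chosen for $G$.

```python
import random


def simulate_od(n, alphas=(1.0, 1.5), scales=(1.0, 1.0), shape=2.0, seed=0):
    """Draw (lambda, T1, T2) for n subjects from specification (OD).

    Assumptions for illustration only: Weibull integrated baseline hazards
    Xi_k(t) = (t / scales[k]) ** alphas[k], and gamma heterogeneity with
    shape `shape` and mean one.
    """
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        # gammavariate(alpha, beta) has mean alpha * beta, here equal to 1.
        lam = rng.gammavariate(shape, 1.0 / shape)
        durations = []
        for alpha, scale in zip(alphas, scales):
            e = rng.expovariate(1.0)  # unit exponential E_k
            # Xi_k(dT_k) * lam = E_k, so dT_k = Xi_k^{-1}(E_k / lam).
            durations.append(scale * (e / lam) ** (1.0 / alpha))
        t1 = durations[0]
        t2 = t1 + durations[1]
        sample.append((lam, t1, t2))
    return sample


if __name__ == "__main__":
    draws = simulate_od(5)
    for lam, t1, t2 in draws:
        print(round(lam, 3), round(t1, 3), round(t2, 3))
```

Because both durations of a subject share the same draw of $\lambda$, they are positively dependent in the simulated data even though they are independent given $\lambda$, which is the confounding of heterogeneity and state dependence that the identification results address.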
Some intuition for this result follows from the analogy with linear panel data that arises if we rewrite the model as a panel transformation model:

$$\log \Xi_1(\Delta T_1) = -\log \lambda + \log E_1 \qquad \text{and} \qquad \log \Xi_2(\Delta T_2) = -\log \lambda + \log E_2, \tag{3.1}$$

with $E_1$ and $E_2$ unit exponential variables that are mutually independent and independent of $\lambda$.

Because Honoré's result does not require data and assumptions on covariates, it can be interpreted as a result conditional on covariates in the case in which data on covariates are available. In particular, this implies that identifiability extends to the case in which duration dependence and heterogeneity may vary in arbitrary ways with the observed covariates.

3.3.1.2 Censored Data

Now suppose that renewal events are only observed up to and including some random time $C \perp\!\!\!\perp \{T_k;\ k \in \mathbb{N}\}$. Let $D_k \equiv I(C \ge T_k)$ be an indicator of complete observation of $T_k$. In this notation, we have data on the distribution of $\{\min\{T_k, C\}, D_k;\ k \in \mathbb{N}\}$. Because of the independence assumption, this identifies the distribution of $\{T_k \cdot I(T_k \le \bar{C}),\ I(T_k \le \bar{C});\ k \in \mathbb{N}\}$, with $\bar{C}$ the upper bound of the support of $C$. If $\bar{C} = \infty$, then the distribution of $(T_1, T_2)$ is identified.⁵ Consequently, in that case Proposition 1 continues to apply.⁶

⁵ In an extension in which $T_2$ may have finite support, it is sufficient that $\Pr(T_2 > \bar{C}) = 0$.

⁶ The literature has focused on the practical inference problems that arise in this case. In particular, even with independent censoring, the second duration $\Delta T_2$ is censored at a random time $\bar{C} - \Delta T_1$ that is typically not independent of it. Consequently, even under the assumption of independent censoring, the second duration cannot be analyzed in isolation from the first duration (Visser, 1996).

If, on the other hand, $\bar{C} < \infty$, then the distribution of $(T_1, T_2)$ is only identified on the (selected) subpopulation $\{T_2 \le \bar{C}\}$. This case commonly arises in empirical work, where panels are finitely lived, and complicates identification. Honoré's (1993) proof of Proposition 1 does not readily extend to this case.

To partially resolve this identification problem, suppose that duration-dependence patterns are identical between spells, and that occurrence dependence simply shifts the hazard rate proportionally:

$$\theta(t \mid H(t), \lambda) = \beta^{N(t-)}\, \xi\big(t - T_{N(t-)}\big)\, \lambda, \tag{OD*}$$

where $\beta > 0$ reflects occurrence dependence (there is none if and only if $\beta = 1$) and $\xi : \mathbb{R}_+ \to (0, \infty)$ reflects duration dependence, with integral $\Xi(t) \equiv \int_0^t \xi(u)\, du$ for all finite $t$. Intuitively, in this model we can tell whether $\beta < 1$, $\beta = 1$, or $\beta > 1$ by checking the direction of the asymmetry in the distribution of $(\Delta T_1, \Delta T_2)$ on its domain of observation.

Proposition 2 (Identification of Occurrence Dependence from Censored Renewal Data). The sign of $\beta - 1$ in the model specification (OD*) is uniquely determined from the distribution of $(T_1, T_2)$ on $[0, \bar{C}]^2$, $\bar{C} > 0$.

Proof. The proof is constructive. Let $Z \equiv \{(t_1, t_2) \in \mathbb{R}_+^2 : \Xi(t_1) < \Xi(t_2) \text{ and } t_1 + t_2 \le \bar{C}\}$ and note that $Z$ is identified from the distribution of $\Delta T_1$ on $[0, \bar{C}]$. Next, the distribution of $(T_1, T_2)$ on $[0, \bar{C}]^2$ gives

$$\Pr\big(\Delta T_1 \in [t_1, t_1 + dt_1),\ \Delta T_2 \in [t_2, t_2 + dt_2)\big) = \beta\, \xi(t_1)\, \xi(t_2)\, \mathcal{L}''\big[\Xi(t_1) + \beta\, \Xi(t_2)\big]\, dt_1\, dt_2$$

and

$$\Pr\big(\Delta T_1 \in [t_2, t_2 + dt_2),\ \Delta T_2 \in [t_1, t_1 + dt_1)\big) = \beta\, \xi(t_1)\, \xi(t_2)\, \mathcal{L}''\big[\Xi(t_2) + \beta\, \Xi(t_1)\big]\, dt_1\, dt_2$$

for almost all $(t_1, t_2) \in Z$. Here $\mathcal{L}(s) \equiv \int_0^\infty \exp(-sv)\, dG(v)$ is the Laplace transform of $G$. Because $\mathcal{L}''$ is strictly monotonic, comparing the two expressions identifies the sign of

$$\Xi(t_1) + \beta\, \Xi(t_2) - \Xi(t_2) - \beta\, \Xi(t_1) = (\beta - 1)\big[\Xi(t_2) - \Xi(t_1)\big]$$

for almost all $(t_1, t_2) \in Z$. Because $\Xi(t_2) - \Xi(t_1) > 0$ for all $(t_1, t_2) \in Z$, this equals the sign of $\beta - 1$.

3.3.2 Occurrence Dependence and Lagged Duration Dependence

The specification in (OD) can be extended with lagged duration dependence:

$$\theta(t \mid H(t), \lambda) = \mu_{N(t-)+1}\big(\Delta T_1, \dots, \Delta T_{N(t-)}\big)\, \xi_{N(t-)+1}\big(t - T_{N(t-)}\big)\, \lambda, \tag{LD}$$

where $\mu_1 = 1$. The function $\mu_k$ captures the dependence of $\theta(t \mid H(t), \lambda)$ on past durations $\Delta T_1, \dots, \Delta T_{k-1}$, for given $N(t-) = k - 1$, $T_{N(t-)}$, and $t$, $k = 2, 3, \dots$.

Honoré (1993) presents identification results for a two-spell version of (LD) and complete data.⁷ His analysis allows $\lambda$ to vary across spells, but requires proportional variation with external observed covariates, and does not directly carry over to (LD). However, it does strongly suggest that identification of (LD) requires richer external variation in the renewal durations than that required for identification of the basic model in (OD), even without censoring.

⁷ Abbring and Van den Berg (2003) present results for a related extension of the model.

3.3.3 Occurrence Dependence and Nonstationarity

In many applications, the renewal process takes place in a nonstationary environment. One example is the contracting environment of the car insurance claims process analyzed by Abbring, Chiappori, and Pinquet (2003) and Abbring et al. (2008). Insurance premia are often updated annually, at the time of contract renewal. With forward-looking agents who are subject to moral hazard, this leads to (contract) time effects in the claims process.

Time effects can easily be mistaken for state dependence of substantive interest. Therefore, controlling for time effects is important if they cannot be excluded a priori. Consider the following specification of the renewal intensity with occurrence dependence and time effects (Abbring, Chiappori, and Pinquet, 2003):

$$\theta(t \mid H(t), \lambda) = \beta^{N(t-)}\, \psi(t)\, \lambda. \tag{NS}$$

Here, $\psi : \mathbb{R}_+ \to (0, \infty)$ captures time effects. It has an integral $\Psi(t) \equiv \int_0^t \psi(u)\, du$ for all $t \in \mathbb{R}_+$.

Nonstationarity breaks the previous subsection's analogy to the linear static panel data model.
To see this, again rewrite the model as a transformation model,

$$\log \Psi(T_1) = -\log\lambda + \log E_1 \quad\text{and}\quad \log\big[\Psi(T_2) - \Psi(T_1)\big] = -\log\beta - \log\lambda + \log E_2, \tag{3.2}$$

with $E_1$ and $E_2$ unit exponential variables that are mutually independent and independent of $\lambda$, and $E_2$ independent of $T_1$. Thus, in terms of appropriately transformed times, this nonstationary model with occurrence dependence is a dynamic panel data model, with the usual one-factor structure on the errors, but with endogenous variables that cannot be separated. The following partial identification result for this model parallels Proposition 2.

Proposition 3 (Abbring, Chiappori, and Pinquet, 2003, Proposition 2). The sign of $\beta - 1$ in the model specification (NS) is uniquely determined from the distribution of $(N(\bar{C}), T_1, T_2)$ on $[0, \bar{C}]^2$, $\bar{C} > 0$.

In addition, Abbring, Chiappori, and Pinquet conjecture that the parameter $\beta$ is point-identified without further assumptions. They also show that $\mathcal{L}$ and $\Psi$ are identified once $\beta$ is known, under the additional assumption that $E[\lambda] < \infty$. The latter assumption has been common in the analysis of the mixed proportional hazard model since the early work of Elbers and Ridder (1982), but is not innocuous (Ridder, 1990).

3.4 Nonparametric Tests

Tests for state dependence in renewal data can be distinguished by the way they control for unobserved heterogeneity.

One approach follows the intuition from linear panel data analysis, and uses within-subject variation, that is, variation between spells for each given subject. In fact, it is clear from Section 3.3 that, in some special cases, renewal models with heterogeneity can be written as linear panel data models in log durations. In these cases, standard methods for linear panel data with fixed effects can be applied to control for heterogeneity, provided that there is no censoring. By and large, this is the regression approach to the analysis of state dependence put forward by Heckman and Borjas (1980, Section II.b).
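To make the within-subject idea concrete, the following Python sketch simulates uncensored two-spell data in the simplest special case: the stationary model with occurrence dependence and no duration dependence, so that $\Delta T_1$ and $\Delta T_2$ are exponential with rates $\lambda$ and $\beta\lambda$ given $\lambda$. Differencing log durations within subjects then removes $\log\lambda$, and the mean difference estimates $-\log\beta$. The function names, the gamma distribution chosen for $\lambda$ and the sample size are illustrative assumptions, not part of the original analysis.

```python
import math
import random


def simulate_log_duration_differences(n, beta, seed=0):
    """Draw n subjects and return log(dT2) - log(dT1) for each of them.

    Assumptions (illustrative only): lambda is gamma distributed, the first
    spell is exponential with rate lambda and the second with rate
    beta * lambda, and there is no censoring.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n):
        lam = rng.gammavariate(2.0, 0.5)      # subject-specific heterogeneity
        d1 = rng.expovariate(lam)             # first spell
        d2 = rng.expovariate(beta * lam)      # second spell, hazard shifted by beta
        if d1 > 0.0 and d2 > 0.0:             # guard against log(0)
            diffs.append(math.log(d2) - math.log(d1))
    return diffs


def estimate_log_beta(diffs):
    """First-difference (fixed-effects) estimate of log(beta).

    log(dT2) - log(dT1) = -log(beta) + log(E2) - log(E1), and the two
    error terms have the same mean, so the sample mean estimates -log(beta).
    """
    return -sum(diffs) / len(diffs)


if __name__ == "__main__":
    true_beta = 0.5
    diffs = simulate_log_duration_differences(20000, true_beta)
    print("true log(beta):     ", math.log(true_beta))
    print("estimated log(beta):", estimate_log_beta(diffs))
```

The estimate does not depend on the distribution chosen for $\lambda$, which is the point of the fixed-effects argument; with censoring, the observed second spell would be selected and this simple differencing would no longer be valid.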
More generally, linear panel data methods cannot be applied, but Holt and Prentice's (1974) and Chamberlain's (1985) methods for paired duration data can be.

Another approach follows the seminal work of Bates and Neyman (1952) and Heckman and Borjas (1980, Section II.a) and exploits that, in a stationary environment without state dependence, the number of claims in a given data period is a sufficient statistic for the unobserved heterogeneity in the claim intensities. Consequently, any signs of nonstationarity or state dependence of these intensities in subsamples with a given number of events in a period cannot be explained by heterogeneity and are direct evidence of time and state dependence. Note that a test that directly exploits this last result is in fact an omnibus test against time and state dependence. This is true for many tests for state dependence (see e.g. Heckman and Borjas, 1980). Abbring, Chiappori, and Pinquet (2003) use such omnibus tests, but also develop more advanced tests that allow for nonstationarity under the null, and that are designed to have power against particular types of state dependence.

This section investigates the power of a particularly simple rank test for state dependence that they used. Throughout, we will maintain Section 3.3.3's model with occurrence dependence and time effects,

$$\theta(t \mid H(t), \lambda) = \beta^{N(t-)}\, \psi(t)\, \lambda. \tag{NS}$$

Section 3.4.1 studies a version of the test that directly compares the durations of each subject's first and second spells, and has power against both time and state dependence. It is only a test specifically against state dependence if stationarity ($\psi = 1$) is assumed. An extension based on appropriately transformed durations, briefly studied in Section 3.4.2, allows for nonstationarity (general $\Psi$) under the null, and tests specifically against state dependence.
3.4.1 A Simple Rank Test

Throughout this section, we assume stationarity ($\psi = 1$); it is implicitly understood that this section's tests do not allow for nonstationarity under the null. Moreover, we will derive all results conditional on $\lambda$; that is, we will take $\lambda$ to be a fixed nuisance parameter. Because the distributions of this section's statistics with heterogeneous $\lambda$ follow directly from mixing over their distributions for given $\lambda$, power and other results for the case of general heterogeneity follow easily, and are not explicitly discussed.

3.4.1.1 The Case without Censoring

For given $\lambda$, the durations $\Delta T_k$ are independently and exponentially distributed with parameter $\lambda\beta^{k-1}$, $k \in \mathbb{N}$. Consequently,

$$\pi(\lambda, \beta) \equiv \Pr(T_1 \geq T_2 - T_1 \mid \lambda) = \frac{\beta}{\beta + 1}.$$

Note that $\pi(\lambda, \beta)$ does not depend on $\lambda$, and simply write $\pi(\beta)$. Under the null of no state dependence, $\pi(1) = 1/2$. Under the alternative, $\pi(\beta) < 1/2$ if $\beta < 1$ and $\pi(\beta) > 1/2$ if $\beta > 1$.

Now, suppose that we have a sample of $n$ renewal histories and that at least two renewal times are observed for each subject. Thus, we have a sample $((T_{1,1}, T_{2,1}), \ldots, (T_{1,n}, T_{2,n}))$, and reject the null that $\beta = 1$ if the empirical analog of $\pi(\beta)$,

$$\hat{\pi}_n \equiv \frac{1}{n} \sum_{i=1}^{n} I(T_{1,i} \geq T_{2,i} - T_{1,i}),$$

is far enough away from $1/2$. For example, if we test against the alternative that $\beta < 1$, we would reject the null if $\hat{\pi}_n$ is sufficiently small. This is simply a binomial test with standard power and size properties. Note that these do not depend on $\lambda$, so that they hold for general heterogeneity.

3.4.1.2 The Case with Censoring

Unfortunately, the previous section's standard test is not feasible in the typical case that we can only observe renewal events for a finite period of time. For expositional convenience, suppose that we observe all renewal events up to a fixed time, $\bar{C}$, and normalize $\bar{C} = 1$. Then, we have a sample

$$\big(T_{1,1}, \ldots, T_{N_1(1),1}; N_1(1)\big), \ldots, \big(T_{1,n}, \ldots, T_{N_n(1),n}; N_n(1)\big).$$
Consider the feasible statistics

$$\hat{\pi}_{C,n} \equiv \frac{1}{\sum_{k=2}^{\infty} M_{k,n}} \sum_{i=1}^{n} I\big(T_{1,i} \geq T_{2,i} - T_{1,i},\, N_i(1) \geq 2\big)$$

and

$$\hat{\pi}_{C_2,n} \equiv \frac{1}{M_{2,n}} \sum_{i=1}^{n} I\big(T_{1,i} \geq T_{2,i} - T_{1,i},\, N_i(1) = 2\big),$$

where $M_{k,n} \equiv \sum_{i=1}^{n} I(N_i(1) = k)$ is the number of observations in the sample for which $k$ renewal events are observed. We again reject the null if $\hat{\pi}_{C,n}$ or $\hat{\pi}_{C_2,n}$ are sufficiently far away from $1/2$.

For given subsample sizes $\sum_{k=2}^{\infty} M_{k,n}$ and $M_{2,n}$, $\hat{\pi}_{C,n}$ and $\hat{\pi}_{C_2,n}$ are again standard binomial tests, but now with underlying Bernoulli probabilities

$$\pi_C(\lambda, \beta) \equiv \Pr(T_1 \geq T_2 - T_1 \mid \lambda, N(1) \geq 2) = \pi(\beta) \cdot A(\lambda, \beta)$$

and

$$\pi_{C_2}(\lambda, \beta) \equiv \Pr(T_1 \geq T_2 - T_1 \mid \lambda, N(1) = 2) = \pi(\beta) \cdot B(\lambda, \beta),$$

where

$$A(\lambda, \beta) \equiv \frac{1 - \beta + (\beta + 1)e^{-\lambda} - 2e^{-\frac{1}{2}\lambda(\beta + 1)}}{1 - \beta + \beta e^{-\lambda} - e^{-\lambda\beta}}$$

and

$$B(\lambda, \beta) \equiv \frac{\beta + 1}{2\beta + 1} \cdot \frac{2\beta + 1 + e^{-\lambda(\beta^2 - 1)} - 2(\beta + 1)e^{-\frac{1}{2}\lambda(\beta - 1)}}{\beta + e^{-\lambda(\beta^2 - 1)} - (\beta + 1)e^{-\lambda(\beta - 1)}}.$$

Both $A(\lambda, 1) = 1$ and $B(\lambda, 1) = 1$, so that $\pi_C(\lambda, 1) = \pi_{C_2}(\lambda, 1) = 1/2$. For given subsample sizes, this gives the standard one-sided or two-sided rejection regions for a binomial test of a fair Bernoulli trial. Moreover, the rejection probabilities are continuous functions of respectively $\pi_C(\lambda, \beta)$ and $\pi_{C_2}(\lambda, \beta)$. Thus, the tests are more powerful if $|\pi_C(\lambda, \beta) - 1/2|$ and $|\pi_{C_2}(\lambda, \beta) - 1/2|$ are larger, for the relevant alternatives. Therefore, for characterizing the power of the tests for given subsample sizes, it suffices to characterize $|\pi_C(\lambda, \beta) - 1/2|$ and $|\pi_{C_2}(\lambda, \beta) - 1/2|$, as functions of $\beta$ and the nuisance parameter $\lambda$.

First, consider $\pi_C(\lambda, \beta)$. Although $A(\lambda, \beta) > 1$ if $\beta < 1$ and $A(\lambda, \beta) < 1$ if $\beta > 1$, we have that $\pi_C(\lambda, \beta) < 1/2$ if $\beta < 1$ and $\pi_C(\lambda, \beta) > 1/2$ if $\beta > 1$. This implies that $|\pi_C(\lambda, \beta) - 1/2| < |\pi(\lambda, \beta) - 1/2|$ for all $\lambda$ and $\beta \neq 1$, so that $\hat{\pi}_{C,n}$ is less powerful than $\hat{\pi}_n$, even if they would be based on (sub-)samples of the same size. For very small $\lambda$, $\hat{\pi}_{C,n}$ has very little power, because $\lim_{\lambda \to 0} \pi_C(\lambda, \beta) = 1/2$ for all $\beta$. For large $\lambda$, on the other hand, $\lim_{\lambda \to \infty} A(\lambda, \beta) = 1$, and the differences between $\pi_C(\lambda, \beta)$ and $\pi(\lambda, \beta)$ vanish.

Next, consider $\pi_{C_2}(\lambda, \beta)$. Again, $\pi_{C_2}(\lambda, \beta) < 1/2$ if $\beta < 1$ and $\pi_{C_2}(\lambda, \beta) > 1/2$ if $\beta > 1$. However, for $\beta < 1$, we now have that $B(\lambda, \beta) > 1$ for small values of $\lambda$, but $B(\lambda, \beta) < 1$ for large values of $\lambda$. The opposite holds for $\beta > 1$: $B(\lambda, \beta) < 1$ for small $\lambda$ and $B(\lambda, \beta) > 1$ for large $\lambda$. In both cases, $\hat{\pi}_{C_2,n}$ has lower power than $\hat{\pi}_n$ if $\lambda$ is small. Again, it has low power if $\lambda$ is very small: $\lim_{\lambda \to 0} \pi_{C_2}(\lambda, \beta) = 1/2$. However, if $\lambda$ is large, $\hat{\pi}_{C_2,n}$ may be more powerful than $\hat{\pi}_n$, if both are based on subsamples of the same size. In fact, for each $\beta \neq 1$ there exists a unique $\tilde{\lambda}(\beta)$ such that $B(\tilde{\lambda}(\beta), \beta) = 1$ (see Figure 3.1). If $\lambda > \tilde{\lambda}(\beta)$, then $|\pi_{C_2}(\lambda, \beta) - 1/2| > |\pi(\lambda, \beta) - 1/2|$, i.e. $\hat{\pi}_{C_2,n}$ is more powerful than $\hat{\pi}_n$. Moreover, if $\beta > 1$, $\lim_{\lambda \to \infty} \pi_{C_2}(\lambda, \beta) = 1$, so that, in the limit, $\hat{\pi}_{C_2,n} = 1$ almost surely. If $\beta < 1$, then $\lim_{\lambda \to \infty} \pi_{C_2}(\lambda, \beta) = \frac{\beta}{2\beta + 1} < \pi(\beta)$.

Finally, we can compare the power of $\hat{\pi}_{C,n}$ and $\hat{\pi}_{C_2,n}$. For all $\lambda$, $A(\lambda, \beta) > B(\lambda, \beta)$ if $\beta < 1$ and $A(\lambda, \beta) < B(\lambda, \beta)$ if $\beta > 1$. Consequently, $|\pi_C(\lambda, \beta) - 1/2| < |\pi_{C_2}(\lambda, \beta) - 1/2|$ if $\beta \neq 1$, so that $\hat{\pi}_{C_2,n}$ is more powerful than $\hat{\pi}_{C,n}$ if both are based on subsamples of the same size (note though that $\hat{\pi}_{C,n}$ is typically based on a larger subsample). Both tests are equally poor if $\lambda$ is close to zero, and perform equally well with extreme state dependence. If $\lambda$ is very large, $\hat{\pi}_{C,n}$ is almost as good as, and $\hat{\pi}_{C_2,n}$ outperforms, $\hat{\pi}_n$, again for equal (sub-)sample sizes. Table 3.1 summarizes these findings.

In general, we can conclude that the tests for censored data have limited power if renewal events are rare, i.e. if $\lambda$ is small, for given subsample sizes. In such cases, the only way to increase the tests' power is to increase the number of observations. In the next section, we characterize the sample sizes needed for the tests to have reasonable power at various values of $\lambda$ and $\beta$.
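As a numerical companion to the closed forms of this section, the Python sketch below evaluates $\pi(\beta)$, $\pi_C(\lambda, \beta)$ and $\pi_{C_2}(\lambda, \beta)$, together with $P_2(\lambda, \beta) = \Pr(N(1) = 2 \mid \lambda)$ and the sample-size expression (3.4) of Section 3.4.1.3. The function names are hypothetical and chosen for illustration only. The closed forms are of the form $0/0$ at $\beta = 1$, so the sketch assumes the limits stated in the text there ($A = B = 1$) and, for $P_2$, the Poisson probability $\lambda^2 e^{-\lambda}/2$ that applies when there is no state dependence.

```python
import math
from statistics import NormalDist


def pi_uncensored(beta):
    """pi(beta) = beta / (beta + 1), the probability without censoring."""
    return beta / (beta + 1.0)


def pi_c(lam, beta):
    """pi_C(lambda, beta) = pi(beta) * A(lambda, beta), given N(1) >= 2."""
    if beta == 1.0:
        return 0.5  # A(lambda, 1) = 1; the closed form is 0/0 here
    num = (1 - beta + (beta + 1) * math.exp(-lam)
           - 2 * math.exp(-0.5 * lam * (beta + 1)))
    den = 1 - beta + beta * math.exp(-lam) - math.exp(-lam * beta)
    return pi_uncensored(beta) * num / den


def pi_c2(lam, beta):
    """pi_C2(lambda, beta) = pi(beta) * B(lambda, beta), given N(1) = 2."""
    if beta == 1.0:
        return 0.5  # B(lambda, 1) = 1; the closed form is 0/0 here
    e2 = math.exp(-lam * (beta ** 2 - 1))
    num = 2 * beta + 1 + e2 - 2 * (beta + 1) * math.exp(-0.5 * lam * (beta - 1))
    den = beta + e2 - (beta + 1) * math.exp(-lam * (beta - 1))
    # pi(beta) * (beta + 1) / (2 beta + 1) simplifies to beta / (2 beta + 1)
    return beta / (2 * beta + 1) * num / den


def p2(lam, beta):
    """P2(lambda, beta) = Pr(N(1) = 2 | lambda)."""
    if beta == 1.0:
        return 0.5 * lam ** 2 * math.exp(-lam)  # Poisson case, assumed limit
    num = (beta * math.exp(-lam) - (beta + 1) * math.exp(-lam * beta)
           + math.exp(-lam * beta ** 2))
    return num / ((beta + 1) * (beta - 1) ** 2)


def n_min(lam, beta, alpha=0.05):
    """Sample size from expression (3.4); not defined at beta = 1."""
    u = NormalDist().inv_cdf(alpha)  # alpha-quantile of the standard normal
    return u ** 2 / (p2(lam, beta) * (1 - 2 * pi_c2(lam, beta)) ** 2)


if __name__ == "__main__":
    for lam, beta in [(0.1, 0.5), (1.0, 0.5), (1.0, 1.5)]:
        print(lam, beta, pi_uncensored(beta), pi_c(lam, beta),
              pi_c2(lam, beta), n_min(lam, beta))
```

Evaluated at $\lambda = 1$ and $\beta = 0.5$, the sketch gives a sample size close to the 3,025 reported in Table 3.2. The formulas involve differences of nearly equal terms when $\beta$ is very close to 1, so values in that region should be treated with care.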
3.4.1.3 Sample Sizes

In this section, we will focus on the statistic $\hat{\pi}_{C_2,n}$. Under the null that $\beta = 1$, $\hat{\pi}_{C_2,n}$ is asymptotically normal with mean $1/2$ and variance $[4nP_2(\lambda, 1)]^{-1}$, where

$$P_2(\lambda, \beta) \equiv \Pr(N(1) = 2 \mid \lambda) = \frac{\beta e^{-\lambda} - (\beta + 1)e^{-\lambda\beta} + e^{-\lambda\beta^2}}{(\beta + 1)(\beta - 1)^2}.$$

Figure 3.1: Graph of function $B(\lambda, \beta)$ and $\tilde{\lambda}(\beta)$. [Figure not reproduced: surface plot of $B(\lambda, \beta)$ over $\lambda$ and $\beta$.]
Note: For each positive $\beta \neq 1$, $\tilde{\lambda}(\beta)$ is the unique solution $\lambda$ of $B(\lambda, \beta) = 1$. The graph of $\tilde{\lambda}$ is represented in the above figure as an intersection of the function $B$ with the horizontal plane at 1.

Table 3.1: Comparison of $\pi$-functions

General properties of $\pi$, $\pi_C$ and $\pi_{C_2}$

Parameters                       | $\beta < 1$                      | $\beta = 1$                  | $\beta > 1$
$\lambda < \tilde{\lambda}(\beta)$ | $\pi < \pi_{C_2} < \pi_C < 1/2$  | $\pi = \pi_{C_2} = \pi_C = 1/2$ | $\pi > \pi_{C_2} > \pi_C > 1/2$
$\lambda > \tilde{\lambda}(\beta)$ | $\pi_{C_2} < \pi < \pi_C < 1/2$  | $\pi = \pi_{C_2} = \pi_C = 1/2$ | $\pi_{C_2} > \pi > \pi_C > 1/2$

Note: The function $\tilde{\lambda}$ is defined in Section 3.4.1 and plotted in Figure 3.1.

Limiting properties of $\pi_C$ and $\pi_{C_2}$

Limit                | $\lambda \to 0$           | $0 < \lambda < \infty$                                                        | $\lambda \to \infty$
$\beta \to 0$        | $\pi_C = \pi_{C_2} = 1/2$ | $\pi_C = \pi_{C_2} = \frac{(e^{\lambda/2} - 1)^2}{e^{\lambda}(\lambda - 1) + 1}$ | $\pi_C = \pi_{C_2} = 0$
$0 < \beta < 1$      | $\pi_C = \pi_{C_2} = 1/2$ | See table above                                                               | $\pi_C = \pi = \frac{\beta}{\beta + 1} > \pi_{C_2} = \frac{\beta}{2\beta + 1}$
$1 < \beta < \infty$ | $\pi_C = \pi_{C_2} = 1/2$ | See table above                                                               | $\pi_C = \pi = \frac{\beta}{\beta + 1} < \pi_{C_2} = 1$
$\beta \to \infty$   | does not exist            | $\pi_C = \pi_{C_2} = 1$                                                       | $\pi_C = \pi_{C_2} = 1$

Note: Limiting properties of $\pi$ are that $\lim_{\beta \to 0} \pi(\beta) = 0$ and $\lim_{\beta \to \infty} \pi(\beta) = 1$.

Under the null, $n^{-1}M_{2,n}$ is a consistent estimator of $P_2(\lambda, 1)$. In practice, we would take $M_{2,n}$ as given and compute a critical region for a given size $\alpha$ for the normal distribution with mean $1/2$ and variance $(4M_{2,n})^{-1}$.

In this section, we characterize, for different $\lambda$ and $\beta$, the minimal sample size $n_{\min}(\lambda, \beta)$ needed to ensure that the null is rejected at the expected value $\pi_{C_2}(\lambda, \beta)$ of $\hat{\pi}_{C_2,n}$, given that the relevant subsample size $M_{2,n}$ equals its expected value $nP_2(\lambda, \beta)$. Note that this roughly corresponds to the minimum sample size needed to reject the null half of the time at the given values of $\lambda$ and $\beta$. Also note that the results from this computation reflect both the effect of $\beta$ and $\lambda$ on the test's power for a given subsample size, and the effects of $\lambda$ and $\beta$ on that subsample size.

For concreteness, we focus on a one-sided test of $\beta = 1$ against $\beta < 1$. In this case, we will reject the null if

$$\hat{\pi}_{C_2,n} \leq \frac{1}{2} + \frac{u(\alpha)}{2\sqrt{M_{2,n}}}, \tag{3.3}$$

where $u(\alpha)$ is the $\alpha$-quantile of the standard normal distribution. Substituting $nP_2(\lambda, \beta)$ for $M_{2,n}$ and $\pi_{C_2}(\lambda, \beta)$ for $\hat{\pi}_{C_2,n}$, and solving for the sample size $n$ for which (3.3) holds with equality, gives

$$n_{\min}(\lambda, \beta) = \frac{u(\alpha)^2}{P_2(\lambda, \beta)\,[1 - 2\pi_{C_2}(\lambda, \beta)]^2}. \tag{3.4}$$

That is, $\pi_{C_2}(\lambda, \beta)$ is the critical value for rejecting the null $\beta = 1$ in favor of $\beta < 1$ if we have a subsample with $n_{\min}P_2(\lambda, \beta)$ observations, for $\beta < 1$. A similar computation for a one-sided test against $\beta > 1$ leads to the same expression for $n_{\min}(\lambda, \beta)$ for $\beta > 1$.

Note that $n_{\min}(\lambda, \beta)$ is large if the denominator in the right-hand side of (3.4) is small, which happens if $P_2(\lambda, \beta)$ is small (that is, if $\lambda$ and/or $\beta$ is small). It is also large if $\pi_{C_2}(\lambda, \beta)$ is close to $1/2$, which happens if there is little state dependence (that is, if $\beta$ is close to 1) or if $\lambda$ is small. On the other hand, $n_{\min}(\lambda, \beta)$ will be small if $\lambda$ and $\beta$ are large.

Table 3.2 reports $n_{\min}(\lambda, \beta)$ for various values of $\lambda$ and $\beta$, for $\alpha = 5\%$. Note that we need millions of observations if the renewal events are rare, in particular if $\lambda$ is smaller than 0.2.

Table 3.2: Sample size needed to reject the null that $\beta = 1$, for various values of $\lambda$ and $\beta$

$\lambda$ \ $\beta$ | 0.1        | 0.2        | 0.3        | 0.4        | 0.5        | 0.6        | 0.7
0.1 | 25,184,288 | 16,011,558 | 14,012,010 | 14,380,528 | 16,660,656 | 21,824,318 | 33,467,702
0.2 | 1,647,802  | 1,052,549  | 925,740    | 955,190    | 1,112,960  | 1,466,716  | 2,263,569
0.3 | 340,706    | 218,653    | 193,280    | 200,503    | 234,958    | 311,519    | 483,841
0.4 | 112,826    | 72,749     | 64,632     | 67,410     | 79,448     | 105,976    | 165,656
0.5 | 48,361     | 31,330     | 27,976     | 29,336     | 34,774     | 46,669     | 73,420
0.6 | 24,403     | 15,884     | 14,256     | 15,030     | 17,919     | 24,196     | 38,311
0.7 | 13,780     | 9,012      | 8,130      | 8,618      | 10,334     | 14,040     | 22,374
0.8 | 8,450      | 5,552      | 5,034      | 5,366      | 6,472      | 8,847      | 14,190
0.9 | 5,517      | 3,643      | 3,320      | 3,558      | 4,316      | 5,937      | 9,584
1   | 3,785      | 2,511      | 2,301      | 2,479      | 3,025      | 4,187      | 6,803

$\lambda$ \ $\beta$ | 0.8        | 0.9         | 1.1         | 1.2        | 1.3        | 1.4       | 1.5
0.1 | 66,330,810 | 237,500,978 | 197,255,624 | 45,567,046 | 18,850,249 | 9,931,289 | 5,985,657
0.2 | 4,516,366  | 16,284,972  | 13,729,815  | 3,197,037  | 1,333,547  | 708,634   | 430,904
0.3 | 971,876    | 3,529,081   | 3,020,317   | 708,906    | 298,145    | 159,787   | 98,020
0.4 | 334,993    | 1,225,023   | 1,064,261   | 251,785    | 106,766    | 57,706    | 35,709
0.5 | 149,475    | 550,478     | 485,463     | 115,764    | 49,491     | 26,975    | 16,837
0.6 | 78,525     | 291,238     | 260,721     | 62,665     | 27,009     | 14,845    | 9,346
0.7 | 46,172     | 172,462     | 156,723     | 37,967     | 16,497     | 9,143     | 5,805
0.8 | 29,482     | 110,906     | 102,307     | 24,980     | 10,942     | 6,115     | 3,915
0.9 | 20,049     | 75,958      | 71,127      | 17,504     | 7,729      | 4,355     | 2,812
1   | 14,329     | 54,673      | 51,969      | 12,889     | 5,737      | 3,259     | 2,122

Note: This table gives the sample size $n_{\min}(\lambda, \beta)$ needed to reject the null that $\beta = 1$ at the expected value $\pi_{C_2}(\lambda, \beta)$ of $\hat{\pi}_{C_2,n}$, using a 5% one-sided test based on $\hat{\pi}_{C_2,n}$, given that the relevant subsample size $M_{2,n}$ equals its expected value $nP_2(\lambda, \beta)$, for various values of the parameters $\lambda$ and $\beta$.

3.4.2 A Transformed Rank Test

Abbring, Chiappori, and Pinquet (2003) developed a variant of Section 3.4.1's test $\hat{\pi}_{C_2,n}$ that allows for general nonstationarity ($\Psi$) under the null. It is effectively $\hat{\pi}_{C_2,n}$ applied to an appropriate empirical transformation of the observed renewal times. Because the test involves this empirical transformation, its distributional properties cannot be derived by mixing over its properties in the case of a homogeneous $\lambda$. Therefore, we explicitly deal with mixing over the distribution $G$ of $\lambda$ in this section.

First, suppose we know $\Psi$. Then, we can deal with possible nonstationarity by working in integrated-hazard time instead of calendar time. Note that $\Psi$ is increasing on the supports of $T_1$ and $T_2$. So, we can work with the transformed times $\Psi(T_1)$ and $\Psi(T_2)$ instead of $T_1$ and $T_2$ without loss of information. This is convenient, as, for given $\lambda$, $\Psi(T_1)$ and $\Psi(T_2) - \Psi(T_1)$ are again independent exponential random variables with parameters $\lambda$ and $\beta\lambda$, respectively. We can directly apply the earlier analysis for the stationary case, provided that we know $\Psi$. Then, we can construct $\Psi(T_{1,i})$ and $\Psi(T_{2,i})$ and

$$\hat{\pi}_{C_2,n}(\Psi) \equiv \frac{1}{M_{2,n}} \sum_{i=1}^{n} I\big(\Psi(T_{1,i}) \geq \Psi(T_{2,i}) - \Psi(T_{1,i}),\, N_i(1) = 2\big).$$

This is a generalization of $\hat{\pi}_{C_2,n}$ to arbitrary, but still known, nonstationarity.

In general, we do not know $\Psi$, so that $\hat{\pi}_{C_2,n}(\Psi)$ is not feasible. However, Abbring, Chiappori, and Pinquet show that we can estimate $\Psi$ consistently by the empirical analog $\hat{H}_{1,n}$ of $H_1(t) \equiv \Pr(T_1 \leq t \mid N(1) = 1)$ under the null that $\beta = 1$. This suggests substituting $\hat{H}_{1,n}$ for $\Psi$ and using $\hat{\pi}_{C_2,n}(\hat{H}_{1,n})$ as our test statistic. Under the null that $\beta = 1$, $\hat{\pi}_{C_2,n}(\hat{H}_{1,n})$ is asymptotically normal with expectation $1/2$ and variance $[4n\Pr(N(1) = 2)]^{-1} + [6n\Pr(N(1) = 1)]^{-1}$ (Abbring, Chiappori, and Pinquet, 2003, Proposition 7). The variance can be estimated consistently as $1/(4M_{2,n}) + 1/(6M_{1,n})$.

Substitution of $\hat{H}_{1,n}$ for $\Psi$ comes at the price of lower power. This is due to the fact that $\hat{H}_{1,n}$ is only a consistent estimator of $\Psi$ under the null, and generally captures some of the occurrence dependence if $\beta \neq 1$. Abbring, Chiappori, and Pinquet show that the population analog $\pi_{C_2}(H_1(\beta, G); \beta, G)$ of the statistic, as a function of $\beta$ for given distribution $G$ of $\lambda$, strictly increases near $\beta = 1$ if $G$ is nondegenerate with at least two positive points of support. Here, we add that the presence of nontrivial heterogeneity in $\lambda$ is crucial for this result: if $G$ is degenerate, then $\pi_{C_2}(H_1(\beta, G); \beta, G) = 1/2$ for all $\beta$. Simulation results not reported here confirm that the test only has power if there is substantial heterogeneity in $\lambda$.

3.5 Conclusion

Typically, renewal data can only be collected over a finite period of time. This chapter shows that this seriously hampers the analysis of state dependence in the presence of general heterogeneity and, possibly, nonstationarity. In particular, existing identification results for panel duration models do not apply to renewal data that are censored this way. Moreover, nonparametric tests for state dependence lose their power if the renewal events are rare.

4 Moral Hazard in Dynamic Insurance Data

4.1 Introduction

Four decades of theoretical research on asymmetric information have firmly established its importance for insurance relations and competitive insurance markets. The practical relevance of this research, and of the results on the efficiency of insurance markets and the design of optimal contracts that it has produced, depends critically on the empirical relevance of asymmetric information. A substantial and fast-growing literature is now assessing this relevance for a variety of markets, using microeconometric methods and micro data on contracts and insurees. For some markets, notably car insurance, evidence is surprisingly mixed and muted, often pointing to a lack of asymmetric information problems.
Much of the literature, however, uses static theory and cross-sectional data, which limits both its versatility in dealing with truly dynamic aspects of insurance markets, such as experience rating, and variation in the data that can be turned into robust empirical results. The empirical distinction between moral hazard and selection effects using static methods has turned out to be particularly hard; as argued by Abbring, Chiappori, Heckman, and Pinquet (2003), this is the standard econometric problem of distinguishing causal and selection effects.

In this chapter we instead analyze moral hazard in car insurance using panel data on contracts and claims provided by a Dutch insurance company. The analysis exploits some remarkable properties of the Dutch experience-rating system. Specifically, we theoretically analyze the Dutch scheme as a repeated contract between an insuree and an insurer, in which each period's interaction involves memory of the relationship's past history. Using control theory, we study the endogenous changes this structure induces in the incentives agents face at each point in time. Under moral hazard, these changes, in turn, generate specific patterns in the time profile of claim occurrences and sizes that we fully characterize; these patterns are specific in the sense that they would not appear under the null of no moral hazard. Finally, we develop structural econometric tests based on this theory and apply them to the Dutch micro data. The tests are flexibly parametric and nonparametric, and valid in the presence of unobserved heterogeneity of a general type. In contrast to much of the earlier literature, we find evidence of moral hazard in car insurance.

We also discuss the empirical distinction between ex ante and ex post moral hazard. Ex ante moral hazard entails that agents respond to changes in incentives by changing the risk of losses.
Ex post moral hazard concerns the effects of incentives on claiming actual losses. The distinction between ex ante and ex post moral hazard is important because of their different welfare consequences (e.g. Chiappori, 2001). Because an insurer's administrative data typically only contain data on claims, and not on losses, distinguishing between ex ante and ex post moral hazard requires additional structural assumptions. Under a reasonable set of such assumptions, we find that at least some of the detected moral hazard is due to ex post moral hazard.

Our theoretical model specifies an agent's optimal dynamic savings, loss prevention effort, and claim choices under the experience-rating (bonus-malus) scheme in Dutch car insurance. It produces predictions on the joint behavior of the claim occurrence, claim size, and experience-rating processes, for given individual risk and other characteristics. Ex ante moral hazard is captured by the endogenous loss prevention effort; ex post moral hazard by the endogenous claim choice. Endogenous savings allow for self-insurance.

The model provides a characterization of the dynamic heterogeneous incentives to avoid claims inherent to the Dutch experience-rating scheme, and their behavioral consequences under moral hazard. In particular, incentives are defined as the loss in expected discounted utility that would be incurred if a claim were filed. We show how incentives vary with the current bonus-malus state and contract time, and jump with each claim because of its foreseeable effect on the future bonus-malus state. We present an algorithm for numerically characterizing these effects and provide a quantitative analysis of incentives. We restrict attention to computations under the null of no moral hazard. Because claim rates are constant under the null, these computations are relatively straightforward.

Our tests for moral hazard build on these theoretical computations. We first focus on the timing of claims.
Under the null that there is no moral hazard, claim rates do not vary with incentives; under moral hazard, on the other hand, claim rates are lower when incentives are stronger. Our main test exploits the full model structure. It is a score (Lagrange multiplier) test for the dependence of claim rates on incentives in a version of the structural model that allows for flexible heterogeneity in risk. Because it only requires the computation of incentives under the null, it is easy to implement using our algorithm. We find strong evidence that claim rates decrease with incentives, and reject the null of no moral hazard at all conventional levels.

In addition to this structural parametric test for moral hazard, we also present and apply a range of nonparametric tests for state-dependence and contract-time effects on claim rates, controlling for risk heterogeneity. Our theory implies that any such effects must be due to moral hazard and, in this way, identifies the substantive problem of testing for moral hazard with the classical statistical problem of distinguishing state dependence and heterogeneity. This is a hard problem, but one that has been studied at length in statistics and econometrics (Bates and Neyman, 1952, Heckman and Borjas, 1980, Heckman, 1981). Our tests rely on this literature's key insight that, without contract-time and state dependence, the number of claims in a given period is a sufficient statistic for the unobserved heterogeneity in the conditional distribution of claim times in that period. Consequently, any signs of time or state dependence in subsamples with a given number of claims in a period are evidence of moral hazard. Moreover, an implication of the Dutch experience-rating system is that incentives may jump up or down at the time a claim is filed, depending on the current bonus-malus state.
Therefore, we not only test for state dependence, but also for appropriate changes in its sign across bonus-malus states. Even though the nonparametric tests have relatively little power with the type of rare events found in insurance data (see Chapter 3), they corroborate the results from the structural test.

Our theory also attributes a moral-hazard interpretation to state-dependence and contract-time effects on the sizes of claims. Under the assumption that ex ante moral hazard only affects the occurrence, but not the size, of insured losses, the latter are informative on ex post moral hazard. We complement this analysis of ex post moral hazard with data on claim withdrawals. Agents in our data set can withdraw a claim within six months and avoid malus. Under some assumptions, which we spell out in detail, claim withdrawals are observed manifestations of ex post moral hazard.

This chapter contributes to a rich literature on asymmetric information in insurance markets. The seminal work on moral hazard and adverse selection by Arrow (1963), Pauly (1974, 1968), and Rothschild and Stiglitz (1976) showed that competitive insurance markets may be inefficient if information is asymmetric. A vast theoretical literature followed up on their key insights. Increasingly, attention has shifted from the development of theory to the empirical analysis of its relevance (see, e.g., Chiappori, 2001, Chiappori and Salanié, 2003, for reviews).[1] Chiappori (2001) put forward the idea to exploit the rich variation that can be derived from dynamic theory and found in longitudinal data; Abbring, Chiappori, Heckman, and Pinquet (2003) suggested that we base a test for moral hazard on the dynamic variation in individual risk with the idiosyncratic variation in incentives due to experience rating.

[1] Car-insurance data were studied, among others, by Dionne and Vanasse (1992), Puelz and Snow (1994), Dionne and Doherty (1994), Chiappori and Salanié (1997), Dionne, Gouriéroux, and Vanasse (1999), Richaudeau (1999), Chiappori and Salanié (2000), Dionne et al. (2001), Abbring, Chiappori, and Pinquet (2003), Cohen (2005), Dionne et al. (2006), Chiappori et al. (2006), and Pinquet et al. (2007). Health and life insurance data were analyzed by, for example, Holly et al. (1998), Chiappori et al. (1998), Cardon and Hendel (2001), Hendel and Lizzeri (2003) and Fang et al. (2006). Finkelstein and Poterba (2002) studied annuities.

The empirical papers most closely related to this chapter are Abbring, Chiappori, and Pinquet (2003), Dionne et al. (2006) and Pinquet et al. (2007). Our analysis differs from and extends these works in several ways. First, we precisely model the forward-looking behavior of an agent in the actual institutional environment characterizing the insurance market studied. We use this model to define and compute dynamic incentives and construct a structural test that exploits these computations in detail. Secondly, we explicitly distinguish ex ante and ex post moral hazard, which requires a formal analysis of the claim filing behavior. Finally, we model both claim occurrences and claim sizes. Together, this allows us to confront a novel and precise set of dynamic implications for claim occurrences and sizes under moral hazard to longitudinal data.

The remainder of this chapter is organized as follows. Section 4.2 briefly discusses the Dutch car-insurance market, with specific attention for the experience-rating scheme used. It also introduces the data. Section 4.3 develops the theory. We use the theory to analyze the dynamic incentives inherent to experience rating, and to derive the implications of moral hazard for claim rate and size dynamics. Section 4.4 develops an econometric framework for testing the effects of moral hazard from data on claim rates and sizes and presents the empirical results. Section 4.5 concludes. Appendices 4.A and 4.B provide proofs and computational details for Section 4.3. Appendix 4.C gives additional information on the data. Appendices 4.D to 4.G provide robustness checks.

4.2 Institutional Background and Data

4.2.1 Experience Rating in Dutch Car Insurance

In 2006, the 16.3 million inhabitants of the Netherlands were driving 7.2 million private cars.[2] Because liability insurance is mandatory in the Netherlands, this comes with a substantial demand for car insurance. In the same year, 74 insurance companies served this demand.[3] Even though these companies are supervised by the Dutch financial authorities, they are to a great extent free to set their premia and contractual conditions. In doing so, the Dutch insurance companies, united in the Dutch Association of Insurers, have to a great extent coordinated their experience-rating systems in car insurance.

[2] Source: Statistics Netherlands (www.cbs.nl).

[3] Source: Dutch Association of Insurers (www.verzekeraars.nl).

Before 1982, car insurers employed a limited experience-rating scheme. This scheme was commonly considered to be inadequate to price observed risk. In the early 1980s, six of the market's leading firms proposed a much finer bonus-malus (BM) system, based on a large actuarial study (de Wit et al., 1982).[4] Early 1982, this system was introduced in Dutch car insurance in a coordinated way. After some early market turbulence, the insurers by and large settled on similar bonus-malus schemes.

[4] Information on the development of the BM system in Dutch car insurance is scattered throughout the professional literature. de Wit et al. (1982) provides information on the actuarial research underlying the bonus-malus system, and some very early history. Assurantiemagazine (2004) provides more recent historical reflection.

In this chapter we use the same data as in Chapter 2. The data come from one of the six companies that were leading the introduction of the bonus-malus system in Dutch car insurance. During the data period, January 1, 1995 to December 31, 2000, this company used the bonus-malus scheme given in Table 4.1.[5] The premium discount depends on the insuree's current bonus-malus class, which is determined at each annual contract renewal date. Twenty bonus-malus classes are distinguished, from 1 (highest premium) to 20 (lowest premium). Every new insuree starts in class 2 and pays the corresponding premium. We will refer to this premium as the base premium. After each claim-free year, an insuree advances one class, up to class 20. Each claim at fault sets an insuree back into a lower class. The worst class is 1, and implies a surcharge to the base premium. This scheme is representative for the bonus-malus schemes used in the Netherlands in this period.[6] Consequently, throughout this chapter we assume that the drivers in our data set cannot escape Table 4.1's bonus-malus system by switching insurers.

[5] This is the same BM scheme as the one given earlier, in Table 2.1. We repeat it here for convenience and with a special notation introduced later in this chapter.

The empirical analysis in this chapter exploits that the incentives to avoid a claim jump with each claim filed, and vary with contract time and across bonus-malus classes. To gain some first insight in the differences in the cost of a claim to an insuree across different bonus-malus classes and different numbers of claims, we have computed the change in the premium at the next renewal date with each claim in a contract year, for different bonus-malus classes. Table 4.2 gives the percentage premium change after a claim-free contract year, and the subsequent marginal percentage changes in the premium after each claim in the contract year. For example, after a claim-free year in class 8, an insuree will be upgraded to class 9 and pay 45% instead of 50% of the base premium.
This amounts to a 10% reduction in the premium. If he files one claim in the contract year, he will instead be downgraded to class 4 and pay 80% of the base premium. This amounts to a (80 − 45)/45 = 78% increase relative to the premium that would be paid without the claim. A second claim would take him down further to class 1, and a premium equal to 120% of the base (a 50% increase relative to having one claim). A third claim would have no further effect on the premium.

Clearly, unlike the French scheme studied by Abbring, Chiappori, and Pinquet (2003), this scheme is not proportional. The premium increases after a first claim are largest for those in the intermediate bonus-malus classes, and smallest for those in the top and bottom classes. The marginal premium increases after a second or third claim, however, are increasing nearly monotonically with the bonus-malus class, from 0% in the lowest classes to 100–140% in class 20.⁷ This all suggests that incentives to avoid a claim jump down after a first claim for insurees in low classes and jump up after a first claim for insurees in high classes. In Section 4.3, we formally define incentives in a dynamic theoretical setting and provide some numerical computations to formalize this intuition.

Finally, note that insurees are contractually obliged to claim all their insured losses as soon as possible. However, the contract leaves them the option to withdraw their claims within six months from the loss date. Withdrawn claims do not count as at-fault claims in determining the insuree's bonus-malus class and therefore do not affect the premium. Therefore, throughout most of this chapter we treat withdrawn claims like unclaimed losses. That is, we ignore them, together with losses that were not claimed in the first place. Section 4.4.4 discusses the fact that withdrawals are in fact observed manifestations of ex post moral hazard.

⁶ The scheme is similar to the one originally proposed by de Wit et al. (1982), extended with multiple (maximum-bonus) levels that offer good customers some protection against premium increases. Evidence on the development of the bonus-malus system is sketchy (see Footnote 4) but strongly suggests that the sector actively coordinated on similar bonus-malus schemes in the course of the 1980s, before the start of our data period. Moreover, further major innovations to car insurance pricing were only introduced recently, after the end of our data period. We also compared Table 4.1's scheme to schemes currently offered by Dutch insurers and found only minor differences. The maximum discount on premium ranges from 70% to 80% and the maximum surcharge is in the range of 15% to 30%. Some insurance companies also offer collective insurance with more advantageous bonus-malus schemes.
⁷ In the lowest classes, therefore, the bonus-malus scheme itself does not give incentives to avoid a second or third claim. However, the insurance company reserves the right to cancel contracts with three or more claims at fault in a year. Because claims at fault are fairly rare, this is unlikely to affect insurees' decisions a lot. Therefore, we ignore contract cancelations in our theoretical and empirical analysis.

4.2.2 Data

Our data provide the contract and claim histories of personal car insurance clients of a major Dutch insurer from January 1, 1995 to December 31, 2000. The raw data consist of 1,730,559 records. Each record registers a change in a particular contract (renewal, change of car, etcetera), or a claim. The data include 75 variables, with information on drivers (sex, age, occupation, postcode), cars (brand, model, production year, price, weight, power, etc.), contracts (coverage, bonus-malus class, level of deductible, premium, renewal date), and claims (type of claim, damage, etc.).

Table 4.1: Bonus-Malus Scheme

Present    Premium      Future BM class B(K, N) after a contract year with
BM class   paid         no claim   1 claim   2 claims   3 or more claims
(K)        (q = A(K))   (N = 0)    (N = 1)   (N = 2)    (N ≥ 3)
20         25%          20         14        8          1
19         25%          20         13        7          1
18         25%          19         12        7          1
17         25%          18         11        6          1
16         25%          17         10        6          1
15         25%          16         9         5          1
14         25%          15         8         4          1
13         30%          14         7         3          1
12         35%          13         7         3          1
11         37.5%        12         6         2          1
10         40%          11         6         2          1
9          45%          10         5         1          1
8          50%          9          4         1          1
7          55%          8          3         1          1
6          60%          7          2         1          1
5          70%          6          1         1          1
4          80%          5          1         1          1
3          90%          4          1         1          1
2          100%         3          1         1          1
1          120%         2          1         1          1

Note: The notation in parentheses is taken from Section 4.3's model.

Table 4.2: Percentage Premium Change after a Claim-Free Contract Year and Marginal Percentage Changes in the Premium after each Claim, by Bonus-Malus Class

Present    Premium change   Increase in premium after
BM class   if no claim      1st claim   2nd claim   3rd claim
(K)        (N = 0)          (N = 1)     (N = 2)     (N = 3)
20         0%               0%          100%        140%
19         0%               20%         83%         118%
18         0%               40%         57%         118%
17         0%               50%         60%         100%
16         0%               60%         50%         100%
15         0%               80%         56%         71%
14         0%               100%        60%         50%
13         -17%             120%        64%         33%
12         -14%             83%         64%         33%
11         -7%              71%         67%         20%
10         -6%              60%         67%         20%
9          -11%             75%         71%         0%
8          -10%             78%         50%         0%
7          -9%              80%         33%         0%
6          -8%              82%         20%         0%
5          -14%             100%        0%          0%
4          -13%             71%         0%          0%
3          -11%             50%         0%          0%
2          -10%             33%         0%          0%
1          -17%             20%         0%          0%

Note: The notation in parentheses and below is taken from Section 4.3's model. The second column reports

\[ \frac{\text{New premium after claim-free year} - \text{Old premium}}{\text{Old premium}} = \frac{A[B(K, 0)] - A(K)}{A(K)} \]

for each bonus-malus class $K$. The third, fourth and fifth columns report

\[ \frac{A[B(K, N)] - A[B(K, N - 1)]}{A[B(K, N - 1)]} \]

for respectively $N = 1, 2, 3$, for all bonus-malus classes $K$. Here, $A[B(K, N)]$ is the new premium after a year in class $K$ with $N$ claims.
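The entries of Table 4.2 follow mechanically from Table 4.1 and the two formulas in the note to Table 4.2. The sketch below is only an illustration of that arithmetic: the names `PREMIUM`, `NEXT_CLASS` and `premium_changes` are hypothetical and not part of the thesis, and the function returns unrounded percentages (Table 4.2 reports them rounded to whole percentages).

```python
# Table 4.1: premium A(K) as a percentage of the base premium, by class K.
PREMIUM = {
    1: 120.0, 2: 100.0, 3: 90.0, 4: 80.0, 5: 70.0, 6: 60.0, 7: 55.0,
    8: 50.0, 9: 45.0, 10: 40.0, 11: 37.5, 12: 35.0, 13: 30.0, 14: 25.0,
    15: 25.0, 16: 25.0, 17: 25.0, 18: 25.0, 19: 25.0, 20: 25.0,
}

# Table 4.1: next class B(K, N) after a year with N = 0, 1, 2, 3+ claims.
NEXT_CLASS = {
    1: (2, 1, 1, 1), 2: (3, 1, 1, 1), 3: (4, 1, 1, 1), 4: (5, 1, 1, 1),
    5: (6, 1, 1, 1), 6: (7, 2, 1, 1), 7: (8, 3, 1, 1), 8: (9, 4, 1, 1),
    9: (10, 5, 1, 1), 10: (11, 6, 2, 1), 11: (12, 6, 2, 1),
    12: (13, 7, 3, 1), 13: (14, 7, 3, 1), 14: (15, 8, 4, 1),
    15: (16, 9, 5, 1), 16: (17, 10, 6, 1), 17: (18, 11, 6, 1),
    18: (19, 12, 7, 1), 19: (20, 13, 7, 1), 20: (20, 14, 8, 1),
}


def premium_changes(k):
    """Return the (unrounded) Table 4.2 row for class k, in percent.

    First element: premium change after a claim-free year.
    Remaining elements: marginal change after the 1st, 2nd and 3rd claim.
    """
    # New premium A[B(k, N)] for N = 0, 1, 2, 3 claims in the contract year.
    new = [PREMIUM[b] for b in NEXT_CLASS[k]]
    no_claim = 100.0 * (new[0] - PREMIUM[k]) / PREMIUM[k]
    marginal = [100.0 * (new[n] - new[n - 1]) / new[n - 1] for n in (1, 2, 3)]
    return (no_claim, *marginal)


if __name__ == "__main__":
    for k in range(20, 0, -1):
        print(k, " ".join(f"{x:7.1f}%" for x in premium_changes(k)))
```

For class 8 this reproduces the worked example in the text: a 10% reduction after a claim-free year, and a (80 − 45)/45 increase after a first claim.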
The raw data contain 163,194 unique contracts. Because they do not contain information on claims in 1995, we excluded this year from the data. We also excluded the contracts that are not covered by the bonus-malus system.⁸ This leaves 140,799 unique contracts with a total of 101,074 claims. Of these claims, 34,491 are claims at fault that may lead to a malus.⁹ However, in 2,463 of these cases, insurees have avoided a malus by withdrawing their claim.¹⁰ Throughout most of the chapter, we treat withdrawn claims as unclaimed losses, and simply exclude them from the analysis. Section 4.4.4 specifically studies the withdrawal data to learn about moral hazard. Appendix 4.C shows that the empirical results presented in the main text are robust to alternative ways of dealing with withdrawals.

We restrict our analysis to the claim histories from the contracts' first renewal (or start) date in the sample onwards. In the data, there are 124,021 contracts with observed renewal date. Of these contracts, 6,787 were interrupted for some period of time. In these cases, we only use the contract history from its first observed renewal date to its first interruption. For each contract, we registered the claim history, with information on the times and sizes of claims at fault that were not withdrawn. We examined the bonus-malus transitions between all observed contract years, corrected some inconsistencies (see Appendix 4.C for details), and registered the initial bonus-malus class (i.e., the bonus-malus class established at the first renewal date). Along the way, we discovered that the data on the bonus-malus class after the 2000 renewal are not reliable.¹¹ Therefore, we excluded contracts that started in 2000, and the history of ongoing contracts after their 2000 renewal date. Our full final sample consists of 123,169 unique contracts with 23,396 claims at fault. Table 4.3 shows that many of these are observed for the maximum period of 4 years.

⁸ These are the contracts covering companies' fleets of cars. Such contracts have no individual BM coefficients, but general fleet discounts. These discounts are adjusted every year based on the fleets' claim histories.
⁹ The data also include so-called nil claims, which are mostly pro forma claims of amounts below the deductible. These may correspond to an at-fault event, but typically do not affect the agent's bonus-malus status. Therefore, we treat all nil claims as claims not-at-fault.
¹⁰ We use both direct and indirect information to identify withdrawals. See Appendix 4.C for details.
¹¹ 40,104 out of 68,515 bonus-malus transitions in 2000 were incorrect.

We illustrate some of the data's key features using only data on the first fully observed contract years in the sample. Table 4.4 gives the number of contracts in this subsample by bonus-malus class and by number of claims at fault filed in the contract year. Contracts with one claim and contracts with two or more claims will be important, for different reasons, to our empirical analysis. There are a lot of contracts with one claim in our subsample but, because claims are rare, there are only 278 contracts with at least two claims. Figure 4.1 plots the distribution of contracts in our subsample across bonus-malus classes. Classes 1–3 have less than 1% of the contracts each; more than 26% of the contracts are in the highest class 20. The majority of contracts (over 57%) is in the high bonus-malus classes 14–20, where the premium is just 25% of the base premium. Figure 4.1 also plots the shares of contracts in our subsample with at least one and at least two claims at fault, by bonus-malus class. These shares drop substantially with the bonus-malus class. It may be tempting to relate this variation in the number of claims over bonus-malus classes to our discussion of incentives.
However, the overall pattern can be well explained by heterogeneity in risk, with high-risk individuals sorted into the lower bonus-malus classes.

Table 4.3: Contract Exposure Durations in the Sample

Number of   Number of contracts observed
years Y     exactly Y years   between Y − 1 and Y years   Total
1           8,097             11,775                      19,872
2           4,709             9,616                       14,325
3           6,262             7,387                       13,649
4           68,820            6,503                       75,323
Total       87,888            35,281                      123,169

Table 4.4: Number of Contracts Observed for At Least One Full Contract Year, by Bonus-Malus Class and Number of Claims in the First Contract Year

BM      Number of contracts with
class   no claim   1 claim   2 claims   3 claims   4 claims   Total
1       562        118       24         4          1          709
2       749        94        11         1          -          855
3       962        81        10         -          -          1,053
4       1,311      100       9          -          -          1,420
5       1,876      112       13         1          -          2,002
6       2,514      160       14         2          -          2,690
7       3,363      207       16         -          -          3,586
8       4,232      273       16         -          -          4,521
9       4,889      249       15         2          -          5,155
10      6,490      293       11         1          -          6,795
11      6,063      279       12         2          -          6,356
12      6,004      285       16         -          -          6,305
13      5,879      266       11         -          -          6,156
14      6,669      311       13         -          -          6,993
15      6,165      301       6          -          -          6,472
16      6,377      297       13         -          1          6,688
17      5,671      249       7          -          -          5,927
18      4,367      204       10         -          -          4,581
19      3,855      214       5          -          -          4,074
20      27,652     1,373     29         2          -          29,056
Total   105,650    5,466     261        15         2          111,394

Note: Nil and withdrawn claims were excluded from the sample.

Figure 4.1: Distribution of Contracts Observed for At Least One Full Contract Year Across Bonus-Malus Classes; and Shares of Those Contracts with At Least One and At Least Two Claims at Fault in the First Contract Year, by Bonus-Malus Class
[Figure: horizontal axis, BM class (1 to 20); vertical axis, Share (0% to 30%); series: All contracts, Contracts with at least 1 claim, Contracts with at least 2 claims.]

4.3 Model of Claim Rates and Sizes

This section characterizes the dynamic incentives to avoid car insurance claims that are inherent to the Dutch bonus-malus scheme. We do so by analyzing a model of a single agent's risk prevention and claim behavior that combines features of Mossin's (1968) static model of insurance and Merton's (1971) continuous-time analysis of optimal consumption. Our model is related to Briys (1986), but focuses on experience rating and its moral-hazard effects. It is an extension of Abbring, Chiappori, and Pinquet's (2003) model with heterogeneous losses and endogenous claiming, carefully adapted to the Dutch institutional environment. Also, unlike Abbring et al.'s analysis of experience rating in French car insurance, we make the nonstationarity arising from annual premium revision explicit. This is important for our empirical analysis because in the Dutch bonus-malus system, unlike in the French one, both the number of past claims and their distribution across contract years matter for the current bonus-malus status.

4.3.1 Primitives

We consider the behavior and outcomes of an agent $i$ in continuous time $\tau$ with infinite horizon. Time is measured in contract years and has its origin at the moment the agent entered the insurance market. The wealth of agent $i$ at time $\tau$ is denoted by $W_i(\tau)$ and accumulates as follows. At time 0, agent $i$ is endowed with some initial wealth $W_i(0) > 0$. Then, between $\tau$ and $\tau + d\tau$, agent $i$ receives a return $\rho W_i(\tau)\,d\tau$ on his wealth and consumes $c_i(\tau)\,d\tau$. We ignore any other income, such as labor income.¹² The agent causes an accident between $\tau$ and $\tau + d\tau$ with some probability $p_i(\tau)\,d\tau$.¹³ If so, he incurs some monetary loss. Denote the $j$-th loss incurred by agent $i$ by $L_{ij}$. We assume that $L_{ij}$ is drawn independently of the agent's insurance history, including $(L_{i1}, \ldots, L_{i(j-1)})$, from some time-invariant distribution $F_i$.¹⁴ The losses $L_{ij}$ are covered by an insurance contract involving a fixed deductible $D_i$ and a premium $q_i(\tau)\,d\tau$ that is paid continuously. The deductible is applied on a claim-by-claim basis, i.e. if a claim for a loss $L_{ij}$ is filed, the insurer pays $L_{ij} - D_i$ to the agent.

¹² For the purpose of our analysis, this is equivalent to assuming that any such income is perfectly foreseen by the agent (Merton, 1971, Section 7).
¹³ Accidents that are not caused by the agent are fully covered and have no impact on future premia. Such accidents can be and are disregarded in our analysis. From now on, by accident or claim we always mean accident or claim at fault.
¹⁴ This assumption is violated if agents can influence $F_i$ ex ante by choosing to drive more or less carefully. Then, data on claim sizes do not distinguish between ex ante and ex post moral hazard, but are still informative on the overall presence of moral hazard.

The premium $q_i(\tau)$ is determined by agent $i$'s bonus-malus class $K_i(\tau)$ according to Table 4.1. Thus, we can write $q_i(\tau) = A_i(K_i(\tau))$, where $A_i$ is a mapping from agent $i$'s bonus-malus class into his flow premium. Because the base premium to which the discounts in Table 4.1 are applied depends on agent $i$'s characteristics, the mapping $A_i$ will be heterogeneous across agents.¹⁵

Agent $i$ is endowed with an initial bonus-malus class $K_i(0)$. The bonus-malus class is updated at the beginning of each contract year, the renewal date, according to the rule in Table 4.1. Thus, $K_i(\tau)$ is a right-continuous process, with discrete steps at each renewal date $\tau \in \mathbb{N}$ depending on the past contract year's bonus-malus class and number of claims. Denote by $N_i(\tau)$ the number of claims in the ongoing contract year up to and including time $\tau$. That is, $N_i(\tau)$ is a claim-counting process that is set to zero at the beginning of each contract year.
Then, at each renewal date $\tau \in \mathbb{N}$,

\[ K_i(\tau) = B(K_i(\tau-), N_i(\tau-)), \tag{4.1} \]

where $K_i(\tau-)$ and $N_i(\tau-)$ are agent $i$'s bonus-malus class and number of claims in the past contract year, respectively, and $B$ represents Table 4.1's bonus-malus updating rule. Note that this rule is common to all agents. Recall that it moves agents who survive a contract year without claims to a higher bonus-malus class, corresponding to a lower premium, and all other agents to a lower class, with a higher premium.

¹⁵ Here, we abstract from time-varying characteristics other than $K_i$. There is not much harm in treating e.g. age as a time-invariant characteristic, as our empirical analysis will focus on events in only one or a few contract years.

Insurance claims filed by agents are potentially affected by ex ante and ex post moral hazard (Chiappori, 2001). Ex ante moral hazard arises if an agent can affect the probability of an accident. We model this by allowing, at each time $\tau$, the agent to choose the intensity $p_i(\tau)$ of having an accident from some bounded interval $[\underline{p}_i, \overline{p}_i]$, at a utility cost $\Gamma_i(p_i(\tau))$. We assume that $\Gamma_i$ is twice differentiable on $(\underline{p}_i, \overline{p}_i)$, with $\Gamma_i' < 0$, $\Gamma_i'' > 0$. In words, reducing accident rates is costly and returns to prevention are decreasing. For definiteness, we also assume that $\Gamma_i'(\underline{p}_i+) = -\infty$ and $\Gamma_i'(\overline{p}_i-) = 0$.

In addition, we allow for ex post moral hazard by allowing the agent to hide a loss he has actually incurred from the insurer. For clarity of exposition, we assume that claiming and hiding losses are costless, but that the agent cannot claim losses that have not actually been incurred.¹⁶

The agent's instantaneous utility from consuming $c_i(\tau)$ and driving with accident intensity $p_i(\tau)$ at time $\tau$ is $u_i(c_i(\tau)) - \Gamma_i(p_i(\tau))$. We assume that $u_i$ is strictly increasing and concave.
The agent chooses consumption, prevention and claiming plans that maximize total expected discounted utility¹⁷

\[ E\left[ \int_0^\infty e^{-\rho\tau} \left[ u_i(c_i(\tau)) - \Gamma_i(p_i(\tau)) \right] d\tau \right], \]

subject to the intertemporal budget constraint $\lim_{\tau \to \infty} e^{-\rho\tau} W_i(\tau) = 0$ and given the wealth and premium dynamics described above. At each time $\tau$, the agent observes his wealth, bonus-malus class and claim histories. As we have implicitly assumed that any labor and other income is perfectly foreseen by the agent, he only has to form expectations on future accidents and their implications.

¹⁶ Section 4.3.3.1 discusses a simple extension of the model in which hiding losses is costly. Such an extension is needed to formalize variation in the degree of ex post moral hazard in general, and the extreme case that agents report all losses (above the deductible) and do not suffer from ex post moral hazard in particular.
¹⁷ For simplicity, we assume that subjective discount rates equal the interest rate.

4.3.2 Optimal Risk, Claims and Savings

For notational convenience, we now drop the index $i$. It should be clear, however, that all results are valid at the individual level, irrespective of the distribution of preferences and technologies across agents. In particular, the results hold for any type of unobserved heterogeneity in these primitives of the model.

Because our model is Markovian and, apart from annual contract renewal, time-homogeneous, the optimal consumption, prevention and claim decisions at time $\tau$ only depend on the past history through the agent's current wealth $W(\tau)$, bonus-malus class $K(\tau)$, the number of claims at fault $N(\tau)$, and the time $t \equiv \tau - [\tau]$ past in the ongoing contract year. Let $V(t, W, K, N)$ denote the agent's optimal expected discounted utility at time $t$ in the contract year if his wealth equals $W$, he is in bonus-malus class $K$, and has claimed $N$ losses in the ongoing contract year.
This value function satisfies the Bellman equation

\[
\begin{aligned}
V(t, W, K, N) = \max_{c, p, X} \Big\{ & u(c)\,dt - \Gamma(p)\,dt + e^{-\rho\,dt} \times \\
& \Big[ (1 - p\,dt)\, V\big(t + dt, (1 + \rho\,dt) W - c\,dt - A(K)\,dt, K, N\big) \\
& + p\,dt \int_{X} V\big(t + dt, (1 + \rho\,dt) W - \min\{l, D\} - c\,dt - A(K)\,dt, K, N + 1\big)\, dF(l) \\
& + p\,dt \int_{X^c} V\big(t + dt, (1 + \rho\,dt) W - l - c\,dt - A(K)\,dt, K, N\big)\, dF(l) \Big] \Big\},
\end{aligned}
\tag{4.2}
\]

with

\[ V(1, W, K, N) \equiv \lim_{t \uparrow 1} V(t, W, K, N) = V(0, W, B(K, N), 0). \tag{4.3} \]

Equation (4.2) can be interpreted as follows. Between $t$ and $t + dt$ the agent derives flows of utility from his consumption and disutility from his prevention effort. The value $V(t, W, K, N)$ equals the net value of these utility flows, at the optimal consumption and prevention levels, plus the expected optimal discounted utility at time $t + dt$. With probability $1 - p\,dt$ no accident occurs. Then, the agent's wealth is increased with the interest flow minus consumption and the premium, and the number of claims at fault, $N$, stays unchanged. If the agent causes an accident, with probability $p\,dt$, he will incur an additional wealth loss. The size of this wealth loss is subject to ex post moral hazard. If the damage $L$ caused by the accident lies in the optimal choice of the claim set $X$, he claims for insurance compensation and only loses the minimum of $L$ and the deductible $D$. Then, the number of claims at fault, $N$, increases by 1. If $L$ lies in the complement $X^c$ of the optimal claim set, however, he does not claim and pays the full loss $L$. Then, the number of claims at fault, $N$, stays unchanged.

Equation (4.3) reflects the effects of annual premium renewal. It requires that the value in class $K$ with $N$ claims just before a renewal time equals the value in class $B(K, N)$ with 0 claims just after renewal.
Bellman equation (4.2) can be rewritten in a more familiar form by rearranging and taking limits $dt \downarrow 0$,

\[
\begin{aligned}
\rho V(S) = \max_{c, p, X} \Big\{ & u(c) - \Gamma(p) \\
& + p \Big[ \int_{X} V(t, W - \min\{l, D\}, K, N + 1)\, dF(l) + \int_{X^c} V(t, W - l, K, N)\, dF(l) - V(S) \Big] \\
& + V_W(S) \left[ \rho W - c - A(K) \right] + V_t(S) \Big\},
\end{aligned}
\tag{4.4}
\]

where $V_t$ and $V_W$ are the partial derivatives of $V$ with respect to, respectively, $t$ and $W$. The left-hand side of (4.4) is the flow (or perpetuity) value attached by the agent to state $S \equiv (t, W, K, N)$. It equals the (optimal) instantaneous flow of utility from his consumption net of the disutility from his prevention effort plus three expected value (capital) gains terms, (i) the expected value gain because of an accident, (ii) the value gain due to net accumulation of wealth, and (iii) the appreciation of the value over time.

Standard arguments guarantee that (4.4), with (4.3), has a unique solution $V$, and that an optimal consumption-prevention-claim plan exists. In Appendix 4.A, we prove that the value function $V$ is strictly increasing in wealth $W$ (Lemma 1) and that it is weakly increasing in the bonus-malus class $K$ and weakly decreasing in the number of claims at fault $N$ (Lemma 2). One direct implication is that the agent follows a threshold rule for claiming.

Proposition 4. The optimal claim set in state $S$ is given by $X^*(S) \equiv (x^*(S), \infty)$, for some claim threshold $x^*(S) \geq D$.

Thus, if the agent incurs a loss $L$ at time $t$ then, for given wealth $W$, bonus-malus class $K$ and number of claims at fault $N$ right before $t$, he claims if and only if $L > x^*(S)$. The threshold is implicitly defined as the loss at which he is indifferent between claiming and not claiming:

\[ V(t, W - D, K, N + 1) = V(t, W - x^*(S), K, N). \tag{4.5} \]

This assumes an internal solution and, in particular, ignores the trivial, and empirically irrelevant, case in which $X^* = \emptyset$.
Optimality of the two remaining choices, consumption and prevention, requires that the corresponding first-order conditions are satisfied,

\[ u'(c^*(S)) = V_W(S) \tag{4.6} \]

and

\[ -\Gamma'(p^*(S)) = V(S) - \int_0^{x^*(S)} V(t, W - l, K, N)\, dF(l) - \int_{x^*(S)}^{\infty} V(t, W - D, K, N + 1)\, dF(l), \tag{4.7} \]

where $p^*(S)$ and $c^*(S)$ are, respectively, the optimal accident and consumption intensities in state $S \equiv (t, W, K, N)$. The first equation is the standard Euler condition, which balances the marginal utilities from current and future consumption. The second condition requires equality of the marginal cost of prevention and the marginal cost of an accident.

4.3.3 Dynamic Incentives from Experience Rating

4.3.3.1 Measure of Incentives

First, consider ex ante moral hazard. The first-order condition (4.7) embodies two distinct aspects of ex ante moral hazard, the agent's ability to reduce risk and the incentives he is given to do so. If the marginal cost $-\Gamma'$ of reducing risk quickly increases from 0 to $\infty$, changes in incentives have little effect on risk and moral hazard is limited. In the limiting case in which $\Gamma(p) = 0$ if $p \geq p_0$ and $\Gamma(p) = \infty$ if $p < p_0$, for some $p_0 > 0$, the agent will choose an accident rate $p_0$ irrespective of incentives to avoid claims. We will refer to this limiting case as the case of no (ex ante) moral hazard.

The right-hand side of (4.7) is the expected discounted utility cost of a claim. This is a measure of the incentives to avoid an accident, for a given prevention technology $\Gamma$. In this section, we characterize the variation in these incentives with, in particular, $K$, $N$ and $t$.¹⁸ In the next section, we use this characterization to test for moral hazard.

We focus on the dynamic incentives inherent to the bonus-malus scheme and set the deductible $D$ to 0. This simplifies the presentation and does not greatly interfere with our objective of learning about changes in incentives across states.
Section 4.3.3.2 formalizes this point in the context of a particular model specification. We will also restrict attention to incentives in the case without moral hazard. This will be sufficient for computing a score test for moral hazard and for interpreting local behavior of econometric tests near the null of no moral hazard.

¹⁸ Abbring, Chiappori, and Pinquet (2003) obtain unambiguous theoretical results on the change in incentives after each claim in French car insurance. These results rely on the proportional nature of the French bonus-malus system, and do not carry over to the Dutch system. Moreover, Abbring et al. do not model the nonstationarity arising from annual contract renewal and, therefore, do not provide results on contract-time effects.

Without moral hazard, the optimal accident rate $p^*(S)$ equals a fixed number $p_0 > 0$ in all states $S$ and all losses are claimed, so the right-hand side of (4.7) simplifies to

\[ V(t, W, K, N) - V(t, W, K, N + 1). \tag{4.8} \]

We will characterize incentives in the case of no moral hazard by characterizing this difference in utility values as a function of the state $(t, W, K, N)$.

Before we move to these computations, note that (4.8) is also a measure of incentives to avoid a claim given that an accident has occurred. Linearizing (4.5) as a function of the threshold around the deductible $D = 0$ gives

\[ x^*(t, W, K, N)\, V_W(t, W, K, N) \approx V(t, W, K, N) - V(t, W, K, N + 1). \tag{4.9} \]

The right-hand side of this equation is again the expected discounted utility cost of a claim in (4.8). The left-hand side is the marginal cost, in expected discounted utility units and at a time a claim decision needs to be taken, of increasing the threshold just above the deductible $D = 0$. Note that this cost, unlike the cost $\Gamma$ of loss prevention, is not a free parameter of the model. This is a direct consequence of our assumption that claiming and hiding losses is costless, and implies that the model does not have a parameter that indexes the degree of ex post moral hazard. It is straightforward to extend the model with such a parameter. For example, if hiding a loss $L$ leads to a capital loss of $\gamma L$, for some parameter $\gamma \geq 1$, then the left-hand side of (4.9) generalizes to $\gamma\, x^*(t, W, K, N)\, V_W(t, W, K, N)$. Then, the null of no moral hazard is the limiting case in which $\gamma \to \infty$, and hiding losses is prohibitively expensive. Throughout this chapter, it is implicitly understood that the null of no (ex post) moral hazard can be generated this way. For expositional convenience, we will not make this explicit in the notation.

4.3.3.2 Theoretical Characterization of Incentives

We compute the value function and the incentives for the constant absolute risk aversion (CARA) class of utility functions, which is given by

\[ u(c) = \frac{1 - e^{-\alpha c}}{\alpha}, \]

with $\alpha > 0$ the coefficient of absolute risk aversion, $-u''(c)/u'(c)$. Linear utility, $u(c) = c$, arises as a limiting case if we let $\alpha \downarrow 0$. The CARA class brings analytical and computational simplifications that we believe outweigh, for the purpose of this chapter at least, its disadvantages (see e.g. Caballero, 1990, for some discussion). Merton's (1971) results that, with CARA utility, the value and utility functions have the same functional forms and consumption is linear in wealth, provide intuition for

Proposition 5. In the case of no moral hazard with accident rate $p_0$, $D = 0$, and CARA utility, $c^*(S) = \rho\, [W - Q(t, K, N)]$ and

\[ V(S) = \frac{1 - e^{-\alpha\rho [W - Q(t, K, N)]}}{\alpha\rho}, \]

with $S \equiv (t, W, K, N)$ and $Q$ the unique solution to the system of differential equations

\[ \rho\, Q(t, K, N) = \pi(K) + p_0\, \frac{e^{\alpha\rho [Q(t, K, N + 1) - Q(t, K, N)]} - 1}{\alpha\rho} + Q_t(t, K, N), \tag{4.10} \]

\[ Q(1, K, N) = Q(0, B(K, N), 0). \tag{4.11} \]

Here, $Q_t(t, K, N)$ is the partial derivative of $Q(t, K, N)$ with respect to $t$.
Proposition 5 is proved in Appendix 4.A. It provides a characterization of the value function $V$ that can be used to compute incentives under the null of no moral hazard. To gain some insight into Proposition 5's characterization of optimal consumption and the value function, first note that equation (4.10) reduces to

$$\rho Q(t, K, N) = \pi(K) + p_0\,[Q(t, K, N+1) - Q(t, K, N)] + Q_t(t, K, N)$$

if we let $\alpha \downarrow 0$. Thus, in the limiting case of linear utility (that is, a risk-neutral agent), $Q(t, K, N)$ reduces to the expected discounted flow of future premia. The agent simply consumes the flow value $\rho[W - Q(t, K, N)]$ of his net wealth, which produces a value $V(S) = W - Q(t, K, N)$. The expected discounted utility cost of a claim in state $S$ is given by $V(S) - V(t, W, K, N+1) = Q(t, K, N+1) - Q(t, K, N)$. Conveniently, incentives are independent of the level of wealth in this case.

With a risk-averse agent (that is, for fixed $\alpha > 0$), the right-hand side of equation (4.10) involves an additional term

$$p_0\left\{\frac{e^{\alpha\rho[Q(t, K, N+1) - Q(t, K, N)]} - 1}{\alpha\rho} - [Q(t, K, N+1) - Q(t, K, N)]\right\},$$

which is strictly positive for all $(t, K, N)$. As a consequence, $Q(t, K, N)$ strictly exceeds the expected discounted flow of premia, and optimal consumption is lower than with linear utility. This reflects precautionary savings. Incentives in state $S$ are now given by

$$V(S) - V(t, W, K, N+1) = (\alpha\rho)^{-1} e^{-\alpha\rho W}\left[e^{\alpha\rho Q(t, K, N+1)} - e^{\alpha\rho Q(t, K, N)}\right],$$

so that a wealth-invariant measure of incentives is given by

$$\Delta V(t, K, N+1) \equiv \frac{V(S) - V(t, W, K, N+1)}{e^{-\alpha\rho W}} = \frac{e^{\alpha\rho Q(t, K, N+1)} - e^{\alpha\rho Q(t, K, N)}}{\alpha\rho}.$$

Note that this measure again reduces to $Q(t, K, N+1) - Q(t, K, N)$ if we let $\alpha \downarrow 0$.

Before we move to a numerical characterization of incentives, briefly consider the case of a general but state-invariant deductible $D$. In this case, with linear utility, incentives in state $S$ are the sum of the increase $Q(t, K, N+1) - Q(t, K, N)$ in the expected discounted premium flow and the deductible $D$. Because the deductible is not state dependent, changes in incentives across states are not affected. Consequently, tests that focus on changes in incentives across states within agents are robust to an extension to general deductibles (see Section 4.4.2).

4.3.3.3 Numerical Characterization of Incentives

In the remainder of this section, we will numerically characterize incentives by computing $\Delta V(t, K, N+1)$ for various values of the underlying parameters $\alpha$ and $p_0$. An algorithm for computing the function $Q(t, K, N)$ is presented in Appendix 4.B. We set $\rho = \ln(1.04)$ to be consistent with a 4% annual interest and discount rate. In our baseline computations we take $p_0 = 0.053$, which corresponds to a 94.8% probability of having no claim in the contract year. This equals the share of contracts without claims (105,650) in our subsample of 111,394 single contract years (see Table 4.4). We measure the premium $\pi(K)$ in multiples of the base premium. That is, $\pi(K)$ is set equal to the premium reported in Table 4.1 and, in particular, $\pi(2) = 1$.

Figure 4.2 plots the (wealth-invariant measures of the) present discounted utility costs of a first ($\Delta V(1, K, 1)$), a second ($\Delta V(1, K, 2)$) and a third ($\Delta V(1, K, 3)$) claim just before contract renewal, as a function of the bonus-malus class $K$. The bold graphs correspond to the linear-utility case $\alpha = 0$ and give the expected discounted premium cost of a claim in multiples of the base premium. The other graphs correspond to $\alpha = 0.1, 0.2, \ldots, 0.5$, in that order and with the graphs corresponding to $\alpha = 0.1$ closest to the bold graph. At a consumption level equal to 20 times the base premium, $\alpha = 0, 0.1, \ldots, 0.5$ correspond to coefficients of relative risk aversion equal to $0, 2, \ldots, 10$, respectively.
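The calibration and the wealth-invariant incentive measure can be reproduced with a few lines of Python. This is a minimal sketch: the function names are illustrative, and the two values of $Q$ passed to `delta_v` are hypothetical placeholders, since the actual $Q$ has to be computed with the algorithm of Appendix 4.B.

```python
import math

RHO = math.log(1.04)   # discount rate consistent with a 4% annual interest rate
P0_BASELINE = 0.053    # baseline accident intensity under the null


def no_claim_probability(p0):
    """Probability of no claim in a contract year at a constant intensity p0."""
    return math.exp(-p0)


def relative_risk_aversion(alpha, consumption):
    """Coefficient of relative risk aversion implied by CARA utility at a given consumption level."""
    return alpha * consumption


def delta_v(q_next, q_now, alpha, rho=RHO):
    """Wealth-invariant incentive measure
    [exp(alpha*rho*Q(N+1)) - exp(alpha*rho*Q(N))] / (alpha*rho),
    which equals Q(N+1) - Q(N) in the linear-utility case alpha == 0."""
    if alpha == 0.0:
        return q_next - q_now
    a = alpha * rho
    # Algebraically identical to the definition, but stable for small alpha.
    return math.exp(a * q_now) * math.expm1(a * (q_next - q_now)) / a


share_without_claims = 105650 / 111394          # counts quoted from Table 4.4
implied_probability = no_claim_probability(P0_BASELINE)

# Hypothetical Q values, in multiples of the base premium.
incentive_linear = delta_v(13.0, 12.0, alpha=0.0)
incentive_risk_averse = delta_v(13.0, 12.0, alpha=0.2)
```

Under these placeholder inputs the risk-averse measure exceeds the linear one, because the exponential transformation is convex; with $\alpha = 0.1$ and consumption of 20 base premia, `relative_risk_aversion` returns 2, in line with the text.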
The range of relative risk aversion spanned by these values of $\alpha$ is roughly the range considered, with some empirical support, by Caballero (1990).

Incentives near the null of no moral hazard are considerable. In the linear case, total wealth drops by more than the annual base premium. Recall that the base premium is four times the premium in class 20 paid by most insurees in our sample. The cases with risk aversion are very similar.

Incentives also vary a lot between bonus-malus classes. The incentives to avoid a first claim are small in the lowest classes, where the premium paid is already high. They then increase substantially, and again fall to a lower level in the highest classes. Robustly across the values of $\alpha$, these incentives are larger than the incentives to avoid a second or a third claim in low classes $K$, and smaller in high classes $K$. Thus, for agents in high bonus-malus classes, the Dutch bonus-malus system has implications that are similar to those of the French proportional experience-rating scheme studied by Abbring, Chiappori, and Pinquet (2003): The first and also the second claim in a contract year lead to a jump up in incentives, and therefore to a jump down in claim rates under moral hazard. However, the Dutch system allows us to contrast this implication with the effects in low bonus-malus classes, where incentives jump down after a first claim.

Figure 4.3 plots the change $\Delta V(1, K, N+1) - \Delta V(1, K, N)$ for $N = 1$ (resp. $N = 2$) in incentives to avoid a claim when a first (resp. second) claim is filed just before contract renewal, again for different degrees of risk aversion. This graph shows that incentives after a first claim jump down for low $K$ and up for high $K$. The incentives after a second claim do not change for low $K$ (they are already equal to zero), but jump down for middle $K$ and up for high $K$. These effects, and those in Figure 4.2, are computed at a specific point in time, $t = 1$, but are robust to considering alternative times.
To illustrate this, Figure 4.3 also plots the changes $\Delta V(1, K, N) - \Delta V(0, K, N)$ over the course of a contract year in incentives to avoid, respectively, a first, a second and a third claim. Because $\Delta V(t, K, N) - \Delta V(0, K, N)$ is for all $N = 1, 2, 3$ close to linear as a function of time $t$, these graphs of $\Delta V(1, K, N) - \Delta V(0, K, N)$ summarize well the time patterns in incentives, and the variation in these time patterns between bonus-malus classes. The changes in incentives over time are small relative to the jumps in incentives when a claim is filed just before contract renewal. The differences between $\Delta V(1, K, N+1) - \Delta V(0, K, N+1)$ and $\Delta V(1, K, N) - \Delta V(0, K, N)$ are even smaller for $N = 1$ and almost the same for $N = 2$. Because these differences give the change in $\Delta V(t, K, N+1) - \Delta V(t, K, N)$ over the contract year, this implies that the graphs of $\Delta V(1, K, N+1) - \Delta V(1, K, N)$, for both $N = 1$ and $2$, indeed characterize well the jump in incentives after a first and a second claim at all times across the contract year.

Even if the time-variation in incentives is small relative to the jumps in incentives at the times of a claim, it may still affect some of our empirical procedures that focus on the latter. After all, the time-variation in incentives affects all contracts, but only some contracts experience jumps in incentives. We will return to this in Section 4.4 in the specific context of an econometric model.

Finally, we explore the robustness of these results to changes in the accident intensity under the null, $p_0$. Figure 4.4 again plots the jumps in incentives after a first and a second claim and changes in incentives over the course of a contract year for different levels of risk aversion, but for $p_0 = 0$. The graphs are qualitatively similar to those in Figure 4.3 for the average-risk case. Time effects are smaller in the zero-risk case because agents do not expect any accident during the contract year; time preference is the only source of nonstationarity in this extreme case.
Figure 4.5 plots the same graphs for $p_0 = 0.232$, which is the average risk level consistent with the share of contracts without a claim in the worst bonus-malus class, $K = 1$. At this risk level, incentives at the time of a first claim only increase in the highest bonus-malus classes, and decrease at low and intermediate bonus-malus classes. The jumps in incentives at the time of a second claim have similar features as before. Time effects are now substantial. Because agents are very likely to experience an accident during the contract year anyhow, incentives do not jump much early in the year even if they would jump a lot close to renewal.

In sum, the qualitative conclusions for our baseline case with average risk continue to hold as long as $p_0$ is average or low, but change for very large $p_0$. Nevertheless, we can robustly conclude that incentives at the time of a first claim drop in all low classes, and increase in very high classes. The results on jumps in incentives at the time of a second claim are more robust to changes in accident intensity; these incentives do not change in low classes, drop in middle classes and increase in high classes.

4.4 Empirical Analysis

The empirical analysis uses the data set introduced in Section 4.2.2. We formalize this using Section 4.3's notation, with an appropriate change of the time scale's origin. The full sample consists of $n$ contracts. Time is measured in years. For each contract $i \in \{1, \ldots, n\}$, we let time $\tau$ have its origin at the start of the contract's first contract year included in the sample. Then, $K_i(0)$ is the initial bonus-malus class in the sample. Contract $i$ is observed until some random attrition time $C_i$. Let $N_i(\tau)$ count the number of claims at fault on contract $i$ in the ongoing contract year up to and including time $\tau$. Note that the total number of claims incurred on contract $i$ until time $C_i$ is $\bar{N}_i(C_i)$, with

$$\bar{N}_i(\tau) \equiv N_i(\tau) + \sum_{u=1}^{[\tau]} N_i(u-).$$
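The two counting processes can be made concrete with a short Python sketch. It assumes that a contract's claim times are available as a list of numbers measured in years since the start of the first observed contract year, and that a claim filed exactly at a renewal time is counted in the new contract year; both the data layout and this convention are assumptions made for the illustration.

```python
import math


def claims_in_current_year(claim_times, tau):
    """N(tau): claims in the ongoing contract year, up to and including time tau."""
    year_start = math.floor(tau)
    return sum(1 for t in claim_times if year_start <= t <= tau)


def claims_in_completed_year(claim_times, u):
    """N(u-): claims in the contract year that ends just before renewal time u = 1, 2, ..."""
    return sum(1 for t in claim_times if u - 1 <= t < u)


def total_claims(claim_times, tau):
    """N-bar(tau) = N(tau) + sum over u = 1, ..., [tau] of N(u-)."""
    completed = sum(
        claims_in_completed_year(claim_times, u)
        for u in range(1, math.floor(tau) + 1)
    )
    return claims_in_current_year(claim_times, tau) + completed


# Hypothetical claim history: two claims in year 1, one in year 2, one in year 3.
history = [0.3, 0.9, 1.2, 2.5]
current = claims_in_current_year(history, 2.7)
total = total_claims(history, 2.7)
```

By construction, `total_claims` agrees with a direct count of all claim times not exceeding $\tau$, which is a convenient consistency check on the bookkeeping.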
Figure 4.2: Incentives to Avoid First, Second and Third Claim; at an Average Risk Level

[Figure: $\Delta V(1, K, 1)$, $\Delta V(1, K, 2)$ and $\Delta V(1, K, 3)$ (vertical axis: $\Delta V$) plotted against the bonus-malus class $K = 1, \ldots, 20$ (horizontal axis).]

Note: This figure plots $\Delta V(1, K, N)$ for $N = 1, 2, 3$ as functions of $K$ for the CARA case without moral hazard, for $p_0 = 0.053$ and different values of the coefficient of absolute risk aversion $\alpha$. The premium $\pi(K)$ is measured in multiples of the base premium, as in Table 4.1. The bold graphs correspond to the linear-utility case $\alpha = 0$ and give the expected discounted premium cost of a claim in terms of the base premium. The other graphs correspond to $\alpha = 0.1, 0.2, \ldots, 0.5$, in that order and with the graphs corresponding to $\alpha = 0.1$ closest to the bold graph. At a consumption level equal to 20 times the base premium, $\alpha = 0, 0.1, \ldots, 0.5$ correspond to coefficients of relative risk aversion equal to $0, 2, \ldots, 10$, respectively.

Figure 4.3: Change in Incentives to Avoid a Claim after a First and a Second Claim, and Changes in Incentives to Avoid a First, a Second and a Third Claim over the Course of a Contract Year; at an Average Risk Level

[Figure: $\Delta V(1, K, 2) - \Delta V(1, K, 1)$, $\Delta V(1, K, 3) - \Delta V(1, K, 2)$, $\Delta V(1, K, 1) - \Delta V(0, K, 1)$, $\Delta V(1, K, 2) - \Delta V(0, K, 2)$ and $\Delta V(1, K, 3) - \Delta V(0, K, 3)$ plotted against the bonus-malus class $K = 1, \ldots, 20$.]

Note: This figure plots $\Delta V(1, K, N+1) - \Delta V(1, K, N)$ for $N = 1, 2$, and $\Delta V(1, K, N) - \Delta V(0, K, N)$ for $N = 1, 2, 3$ as functions of $K$ for the CARA case without moral hazard, for $p_0 = 0.053$ and different values of the coefficient of absolute risk aversion $\alpha$. The premium $\pi(K)$ is measured in multiples of the base premium, as in Table 4.1. The bold graphs correspond to the linear-utility case $\alpha = 0$ and give the expected discounted premium cost of a claim in terms of the base premium. The other graphs correspond to $\alpha = 0.1, 0.2, \ldots, 0.5$, in that order and with the graphs corresponding to $\alpha = 0.1$ closest to the bold graph. At a consumption level equal to 20 times the base premium, $\alpha = 0, 0.1, \ldots, 0.5$ correspond to coefficients of relative risk aversion equal to $0, 2, \ldots, 10$, respectively.

Figure 4.4: Change in Incentives to Avoid a Claim after a First and a Second Claim, and Changes in Incentives to Avoid a First, a Second and a Third Claim over the Course of a Contract Year; at a Zero Risk Level

[Figure: the same five series as in Figure 4.3, plotted against the bonus-malus class $K = 1, \ldots, 20$.]

Note: This figure plots $\Delta V(1, K, N+1) - \Delta V(1, K, N)$ for $N = 1, 2$, and $\Delta V(1, K, N) - \Delta V(0, K, N)$ for $N = 1, 2, 3$ as functions of $K$ for the CARA case without moral hazard, for $p_0 = 0$ and different values of the coefficient of absolute risk aversion $\alpha$. The premium $\pi(K)$ is measured in multiples of the base premium, as in Table 4.1. The bold graphs correspond to the linear-utility case $\alpha = 0$ and give the expected discounted premium cost of a claim in terms of the base premium. The other graphs correspond to $\alpha = 0.1, 0.2, \ldots, 0.5$, in that order and with the graphs corresponding to $\alpha = 0.1$ closest to the bold graph. At a consumption level equal to 20 times the base premium, $\alpha = 0, 0.1, \ldots, 0.5$ correspond to coefficients of relative risk aversion equal to $0, 2, \ldots, 10$, respectively.
Figure 4.5: Change in Incentives to Avoid a Claim after a First and a Second Claim, and Changes in Incentives to Avoid a First, a Second and a Third Claim over the Course of a Contract Year; at a High Risk Level

[Figure: $\Delta V(1, K, 2) - \Delta V(1, K, 1)$, $\Delta V(1, K, 3) - \Delta V(1, K, 2)$, $\Delta V(1, K, 1) - \Delta V(0, K, 1)$, $\Delta V(1, K, 2) - \Delta V(0, K, 2)$ and $\Delta V(1, K, 3) - \Delta V(0, K, 3)$ plotted against the bonus-malus class $K = 1, \ldots, 20$.]

Note: This figure plots $\Delta V(1, K, N+1) - \Delta V(1, K, N)$ for $N = 1, 2$, and $\Delta V(1, K, N) - \Delta V(0, K, N)$ for $N = 1, 2, 3$ as functions of $K$ for the CARA case without moral hazard, for $p_0 = 0.232$ and different values of the coefficient of absolute risk aversion $\alpha$. The premium $\pi(K)$ is measured in multiples of the base premium, as in Table 4.1. The bold graphs correspond to the linear-utility case $\alpha = 0$ and give the expected discounted premium cost of a claim in terms of the base premium. The other graphs correspond to $\alpha = 0.1, 0.2, \ldots, 0.5$, in that order and with the graphs corresponding to $\alpha = 0.1$ closest to the bold graph. At a consumption level equal to 20 times the base premium, $\alpha = 0, 0.1, \ldots, 0.5$ correspond to coefficients of relative risk aversion equal to $0, 2, \ldots, 10$, respectively.

Denote the time and size of the $j$th claim on contract $i$ with $T_{ij}$ and $L_{ij}$, respectively. Then, for each contract $i$, we observe $C_i$, the contract's initial bonus-malus state $K_i(0)$, and its claim history $H_i[0, C_i)$ up to time $C_i$, where

$$H_i[0, \tau) \equiv \left(\{N_i(u);\ 0 \leq u < \tau\};\ L_{i1}, \ldots, L_{i\bar{N}_i(\tau)}\right).$$

Note that the bonus-malus history up to time $C_i$, $\{K_i(\tau);\ 0 \leq \tau < C_i\}$, does not vary within contract years, and can be constructed from $K_i(0)$ and $\{N_i(u);\ 0 \leq u < C_i\}$. The full unbalanced sample is $\{C_i, K_i(0), H_i[0, C_i);\ i = 1, \ldots, n\}$. We assume that it is a random sample from the distribution of its population counterpart $\{C, K(0), H[0, C)\}$. The claim history $H \equiv H[0, \infty)$,
and its relation to the bonus-malus class $K(0)$ initially occupied by the agent, are the focus of our empirical analysis.

4.4.1 Econometric Model

At the core of our econometric model is the intensity $\theta_l$ of claims of size $l \in \mathbb{R}_+$ or up at time $\tau$, conditional on the claim history $H[0, \tau)$ up to time $\tau$, the initial bonus-malus class $K(0)$, and a nonnegative individual-specific effect $\lambda$. We specify the following model:

$$\theta_l(\tau \mid \lambda, H[0, \tau), K(0)) = \vartheta(t \mid \lambda, N(\tau-), K(\tau)) \cdot \bar{F}\left(\max\{l, x^*(t, \lambda, N(\tau-), K(\tau))\} \mid \lambda\right), \tag{4.12}$$

where $t \equiv \tau - [\tau]$ is time elapsed in the contract year and $\vartheta(t \mid \lambda, N(\tau-), K(\tau))$ is the rate at which losses are incurred at time $\tau$ by an agent with characteristics $\lambda$ who has been in class $K(\tau)$ and has claimed $N(\tau-)$ times in the year up to time $\tau$. Recall that the bonus-malus class $K(\tau)$ is fully determined by the initial class $K(0)$ and the claim history $H[0, \tau)$. The second factor, $\bar{F}(\max\{l, x^*(t, \lambda, N(\tau-), K(\tau))\} \mid \lambda)$, is the conditional probability that the loss is of size $L \geq l$ and that it is claimed, i.e. $L \geq x^*$. This specification incorporates Section 4.3's assumption that losses are drawn from an exogenous and time-invariant distribution $F(\cdot \mid \lambda) = 1 - \bar{F}(\cdot \mid \lambda)$ that may differ between agents. It also reflects the result that agents follow a threshold rule for claiming (Proposition 4). Without further loss of generality, and to facilitate a discussion of theory's implications for (4.12), we write

$$\vartheta(t \mid \lambda, N(\tau-), K(\tau)) = \lambda \cdot \psi(t) \cdot \beta(t \mid \lambda, N(\tau-), K(\tau)), \tag{4.13}$$

with $\psi$ a continuous and positive function representing external contract-time effects, and $\beta$ an almost surely bounded and positive function. We frequently use the notation $\Psi(t) \equiv \int_0^t \psi(u)\,du$ and normalize $\Psi(1) = 1$.

We assume that $\lambda$ has distribution $G_K$ conditional on $K(0) = K$. Together with equation (4.12), this fully specifies the distribution of $H \mid K(0)$. We assume independent censoring, that is $C \perp\!\!\!\perp H \mid K(0)$. This is a standard assumption in event-history analysis (e.g. Andersen, Borgan, Gill, and Keiding, 1993). It ensures that $\theta_l(\tau \mid H[0, \tau), K(0))$ can be identified with the claim rate among surviving contracts, $\theta_l(\tau \mid H[0, \tau), K(0), C > \tau)$.

We are now ready to define the tests' hypotheses within the context of the econometric model. First, consider the simplification of (4.12) that is implied by the absence of moral hazard. In the empirical analysis, we will refer to this case as the null of no moral hazard.

Prediction 1. The claims process under the null of no moral hazard. Without moral hazard, $\beta \equiv 1$ and $x^* \equiv 0$, so that $\theta_l(\tau \mid \lambda, H[0, \tau), K(0)) = \lambda \psi(t) \bar{F}(l \mid \lambda)$. Given $\lambda$, claim rates and sizes do not depend on the past number of claims $N(\tau-)$ or the bonus-malus class $K(\tau)$; they only depend on contract time through the function $\psi$. That is, there is no state dependence in the claims process.

Taken literally, Section 4.3's theory implies that $\psi \equiv 1$, so that $\theta_l(\tau \mid \lambda, H[0, \tau), K(0)) = \lambda \bar{F}(l \mid \lambda)$ is time-invariant, with $\lambda = p_0$. Thus, $\psi$ captures contract-time effects that are external to the model, that is, that are independent of the claim history and the bonus-malus class. We entertain the possibility of such effects because, if they are there for some reason, they are likely to confound our analysis of state dependence.[19] We will present both tests that assume $\psi \equiv 1$ (stationarity) and tests that allow for nonparametric $\psi$. The proportional specification of (4.13) will then capture the first-order effects of any external contract-time effects. Note that in addition we explicitly allow, through $\beta$ and $x^*$, for contract-time effects that arise internally because of the fact that contracts are renewed at discrete times. These internal time effects will in general enter the claim rate nonproportionally.

Under the alternative of moral hazard, Prediction 1 generally fails.
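The structure of (4.12) and (4.13) can be illustrated with a small Python sketch. The exponential loss distribution, the parameter values and the function names below are assumptions made purely for illustration; the model itself leaves $F(\cdot \mid \lambda)$ unspecified.

```python
import math


def loss_survival(l, mean_loss):
    """F-bar(l): probability that a loss is at least l.
    An exponential loss distribution is assumed here for illustration only."""
    return math.exp(-max(l, 0.0) / mean_loss)


def claim_intensity(l, lam, psi_t, beta_t, threshold, mean_loss):
    """Intensity of claims of size l or more, following (4.12)-(4.13).

    lam        individual-specific effect (lambda)
    psi_t      external contract-time effect psi(t)
    beta_t     behavioural loss-rate factor beta(t | lambda, N, K)
    threshold  claiming threshold x*(t, lambda, N, K)
    """
    loss_rate = lam * psi_t * beta_t                                  # equation (4.13)
    return loss_rate * loss_survival(max(l, threshold), mean_loss)    # equation (4.12)


# Null of no moral hazard (Prediction 1): beta = 1 and x* = 0.
null_rate = claim_intensity(0.0, lam=0.053, psi_t=1.0, beta_t=1.0,
                            threshold=0.0, mean_loss=1.0)

# Hypothetical moral-hazard state: lower loss rate and a positive claiming threshold.
mh_rate = claim_intensity(0.0, lam=0.053, psi_t=1.0, beta_t=0.8,
                          threshold=0.5, mean_loss=1.0)
```

In this sketch both channels lower the claim rate: `beta_t < 1` stands for ex ante moral hazard (fewer losses), and a positive `threshold` stands for ex post moral hazard (small losses are not claimed).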
Under moral hazard, $\theta_l(\tau \mid \lambda, H[0, \tau), K(0))$ depends negatively on incentives, which in turn vary with $t$, $\lambda$, and, in particular, $N(\tau-)$ and $K(\tau)$. In Subsection 4.4.2, we impose the full structure of Section 4.3's theory on the econometric model, including $\psi \equiv 1$. We present and apply a score test that can be interpreted as a Lagrange multiplier test for moral hazard in the structural model. In Section 4.4.3, we use more general tests for state dependence in claim times and sizes. There, we only rely on qualitative predictions of the effect of incentives on claim rates and sizes, without directly using incentive computations.

Before we present these tests, we briefly reflect on the possibility that they pick up alternative sources of state dependence in the claims process, such as learning, fear, or cautionary responses to accidents, that are unrelated to financial incentives and moral hazard. In Section 4.3, we have assumed these away by specifying the prevention technology, as represented by the cost function $\Gamma$, to be independent of the accident history. There are two reasons not to be overly concerned about this. First, many of these alternative sources of state dependence are expected to work in one direction, unlike the financial incentives in the Dutch bonus-malus system. Learning from accidents, for example, is likely to reduce the accident rate in all states, irrespective of financial incentives. Therefore, it is unlikely that, for example, learning exactly replicates the implications of moral hazard. Second, learning effects are likely to be small for older drivers. We have confirmed the robustness of the empirical conclusions that follow by repeating our analysis on a subsample of insurees of 28 years old and up; see Appendix 4.D for details.

[19] The theoretical and econometric models only recognize contract time and do not explicitly consider the effects of calendar time (or duration since last event for that matter). In our sample, different contracts have different renewal dates, so that contract time and calendar time do not coincide. If renewal dates are evenly distributed over calendar time, seasonal calendar-time effects are not likely to matter much to the empirical analysis. However, in our sample we observe that the share of contracts starting in January is 13.3%, which is more than twice as much as at the end of the calendar year (6.1% of contracts start in November and 6.2% in December). This variation could be explained by the fact that it is more advantageous to buy a new car at the beginning of a calendar year because the ageing of a car in years depreciates its value much faster than the ageing in months. On the other hand, the shares of contracts starting in other (middle) months of the year are almost equal; they range from 7.1% (August) to 9.4% (April).

4.4.2 Structural Test on the Full Sample of Claim Times

We first focus on the timing of claims and ignore information on claim sizes. Section 4.3 proves that ex ante and ex post moral hazard work in the same direction (see also Section 4.4.3). Thus, we can view tests based on claim times as overall tests for moral hazard.

Assume that there are no external time effects, $\psi \equiv 1$, so that all nonstationarity arises from behavioral responses to variation in incentives over time. In addition, suppose that there are $R$ risk-types $\lambda_1, \ldots, \lambda_R$ of agents. Consider the following auxiliary model of claim rates,

$$\theta(\tau \mid \lambda, H[0, \tau), K(0)) = \lambda \cdot \exp\left[-\beta \Delta V(t, K(\tau), N(\tau-) + 1 \mid \lambda)\right], \tag{4.14}$$

with $\Pr(\lambda = \lambda_r \mid K(0) = K) = \xi_r(K)$, $r = 1, \ldots, R$, and $\sum_{r=1}^{R} \xi_r(K) = 1$, for $K = 1, \ldots, 20$, and $\Delta V(\cdot \mid \lambda)$ equal to $\Delta V(\cdot)$ evaluated at $p_0 = \lambda$. The distributions of $\lambda \mid K(0) = K$ have the same supports $\{\lambda_1, \ldots, \lambda_R\}$ across $K$, but with different probability masses at each support point because of sorting into classes $K$. Under the null of no moral hazard, $\beta = 0$ and claim rates are time-invariant. Under moral hazard, we expect to find evidence that $\beta > 0$.

We will now argue that a score test for $\beta = 0$ in (4.14) can be interpreted as a structural test for moral hazard. The auxiliary model's specification corresponds exactly to theory under the null; in that case $\theta(\tau \mid \lambda, H[0, \tau), K(0)) = \lambda = \theta_0(\tau \mid \lambda, H[0, \tau), K(0))$. It can be seen as an approximation to the theoretical model under local alternatives to the null and a specific functional form of $\Gamma$. Suppose that an agent with characteristics $\lambda$ chooses $p$ from $(0, \lambda]$, with cost function $\Gamma_\lambda(p) = (p/\tilde{\beta})[\ln(p/\lambda) - 1] + \lambda/\tilde{\beta}$, so that $\Gamma_\lambda'(p) = \tilde{\beta}^{-1} \ln(p/\lambda)$. Substituting in the first-order condition (4.7) and assuming that there is no ex post moral hazard gives

$$\begin{aligned} p^*(S) &= \lambda \cdot \exp\left\{-\tilde{\beta}\,[V(t, W, K, N \mid \lambda) - V(t, W, K, N+1 \mid \lambda)]\right\} \\ &\approx \lambda \cdot \exp\left[-\tilde{\beta} e^{-\alpha\rho W} \Delta V(t, K, N+1 \mid \lambda)\right], \end{aligned} \tag{4.15}$$

where the approximation in the second line holds near the null of no moral hazard. Thus, the auxiliary model (4.14) is a good approximation to the optimal claiming hazard near the null, that is, for small $\tilde{\beta}$, with $\beta = \tilde{\beta} e^{-\alpha\rho W}$. Note that $\beta = \tilde{\beta}$ is homogeneous in the population, as in the auxiliary model, in the limiting case of a risk-neutral agent ($\alpha = 0$). In this case, the derivative of $p^*(S)$ with respect to $\beta$ at $\beta = 0$ exactly equals the corresponding derivative of the auxiliary model's claim rate in (4.14). Consequently, a score test for $\beta = 0$ in the auxiliary model exactly equals a Lagrange multiplier test for moral hazard in the structural model.

The score test for moral hazard has so far been narrowly developed for the case without ex post moral hazard, a specific functional form of the cost function $\Gamma$, a zero deductible, and linear utility.
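The link between the cost function $\Gamma_\lambda$ and the exponential form of the auxiliary claim rate can be verified numerically. The sketch below is written under the stated functional form only; the parameter values and the incentive level are hypothetical, and the check of the first-order condition assumes that it equates $-\Gamma_\lambda'(p)$ to the incentive to avoid a claim, which is how the exponential form in (4.15) arises.

```python
import math


def prevention_cost(p, lam, beta_tilde):
    """Gamma_lambda(p) = (p / b) * [ln(p / lam) - 1] + lam / b, for p in (0, lam]."""
    return (p / beta_tilde) * (math.log(p / lam) - 1.0) + lam / beta_tilde


def marginal_prevention_cost(p, lam, beta_tilde):
    """Gamma_lambda'(p) = ln(p / lam) / b, which is non-positive on (0, lam]."""
    return math.log(p / lam) / beta_tilde


def auxiliary_claim_rate(lam, beta, incentive):
    """Claim rate of the auxiliary model (4.14): lam * exp(-beta * incentive)."""
    return lam * math.exp(-beta * incentive)


# Hypothetical values: lam and beta_tilde are illustrative, incentive is a placeholder.
lam, beta_tilde, incentive = 0.053, 0.4, 1.5

p_star = auxiliary_claim_rate(lam, beta_tilde, incentive)

# Central finite difference as an independent check of the analytical derivative.
h = 1e-6
numerical_derivative = (
    prevention_cost(0.04 + h, lam, beta_tilde) - prevention_cost(0.04 - h, lam, beta_tilde)
) / (2.0 * h)
```

At $\beta = 0$ the auxiliary rate collapses to $\lambda$, and at $p = \lambda$ the prevention cost is zero, so not reducing risk at all is costless, as the null requires.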
However, the intuition for a test based on the auxiliary model (4.14) does not rest on this example's specific assumptions, and we expect such a test to have power against moral hazard more generally. For example, with a general but state-invariant deductible $D$, the approximation in (4.15) becomes

$$p^*(S) \approx \tilde{\lambda} \cdot \exp\left[-\tilde{\beta} \Delta V(t, K, N+1 \mid \lambda)\right],$$

with $\tilde{\lambda} = \lambda \exp(-\tilde{\beta} D)$. Clearly, a score test for $\beta = 0$ in the auxiliary model continues to be a test for moral hazard in this extension.

We estimate both restricted ($\beta = 0$) and unrestricted versions of the auxiliary model with parametric maximum likelihood, using the full unbalanced sample and computing $\Delta V$ using the linear specification ($\alpha = 0$). We compute the likelihood using a discrete (daily) approximation, building on

$$\Pr\left(N\left(\left(\tau + \tfrac{1}{365}\right)-\right) - N(\tau-) = 1 \,\middle|\, \lambda, N(\tau-), K(\tau)\right) \approx \frac{\theta(\tau \mid \lambda, H[0, \tau), K(0))}{365}, \qquad \tau \in \left\{\tfrac{k}{365};\ k \in \mathbb{Z}_+\right\}.$$

Each likelihood computation for the unrestricted model, and the computation of the score test statistic, embed the algorithm in Appendix 4.B to compute $\Delta V(\cdot \mid \lambda_r)$ (that is, $\Delta V(\cdot)$ at $p_0 = \lambda_r$), $r = 1, \ldots, R$, at daily times.

In addition to the score test, we also compute Wald and likelihood-ratio statistics to test $\beta = 0$ against the alternative that $\beta \neq 0$. Because the latter two tests involve estimates of the auxiliary model under the alternative of moral hazard, where it only approximates the structural model, their interpretation as structural tests is less clear cut. However, because the approximation holds near the null, we expect them to have good power against, at least, local moral-hazard alternatives.

We estimated the unrestricted model with various numbers of support points for the distribution of $\lambda$, $R = 2, 3, 4, 5$, and obtained stable estimates of $\beta$. Moreover, between $R = 4$ and $R = 5$, the maximum log likelihood only increased by 5.83 points, even though 21 parameters were added.[20]
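The daily approximation suggests a simple way to organise the likelihood of one contract: treat each day as a Bernoulli trial with success probability $\theta/365$ and mix over the risk types. The Python sketch below follows that idea under simplifying assumptions: the daily claim rates per type are taken as given inputs (in the actual procedure they depend on $\Delta V$ from Appendix 4.B), and the probability masses are hypothetical.

```python
import math

DAYS_PER_YEAR = 365


def type_log_likelihood(claim_days, daily_rates):
    """Log-likelihood of daily claim indicators (0/1) given annualised claim rates,
    using Pr(claim on a given day) ~ rate / 365."""
    total = 0.0
    for claimed, rate in zip(claim_days, daily_rates):
        p = rate / DAYS_PER_YEAR
        if claimed:
            if p <= 0.0:
                return float("-inf")  # a claim is impossible for a zero-risk type
            total += math.log(p)
        else:
            total += math.log1p(-p)
    return total


def mixture_log_likelihood(claim_days, rates_by_type, masses):
    """Log-likelihood under a finite mixture over risk types with probability masses xi_r."""
    terms = [
        math.log(mass) + type_log_likelihood(claim_days, rates)
        for rates, mass in zip(rates_by_type, masses)
        if mass > 0.0
    ]
    top = max(terms)
    if top == float("-inf"):
        return top
    # Log-sum-exp keeps the computation stable for long contract histories.
    return top + math.log(sum(math.exp(term - top) for term in terms))


# Hypothetical one-year contract histories and two risk types with constant rates.
no_claims = [0] * DAYS_PER_YEAR
one_claim = [0] * 200 + [1] + [0] * 164
rates_low = [0.0427] * DAYS_PER_YEAR
rates_high = [0.3514] * DAYS_PER_YEAR
masses = [0.7, 0.3]  # hypothetical probability masses

ll_none = mixture_log_likelihood(no_claims, [rates_low, rates_high], masses)
ll_one = mixture_log_likelihood(one_claim, [rates_low, rates_high], masses)
```

Summing `mixture_log_likelihood` over contracts, with masses depending on the initial class $K(0)$, would give a sample log-likelihood of the kind maximised in the text; that outer loop and the optimiser are omitted here.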
Table 4.5 gives the estimates of $\beta$ and the $\lambda$s in the unrestricted model, with their estimated standard errors, for the specification of the model with 3, 4 and 5 support points. It also presents the score, Wald and likelihood-ratio test statistics for the hypothesis that there is no moral hazard: $\beta = 0$. The estimate of $\beta$ is significantly positive. All three tests reject the null of no moral hazard at all conventional levels.

[20] Computation time also became an issue: Estimating the model with five support points took almost a week on a standard PC.

Figure 4.6 plots the probability masses $\xi_r(K)$ of the unrestricted model with 3 support points for each class $K$. For expositional convenience, the estimates of the $\lambda$s in Table 4.5 are in ascending order, i.e. $\hat{\lambda}_1 < \hat{\lambda}_2 < \hat{\lambda}_3$. With this in mind, it is easy to see that the probability masses are slowly moving from the highest risk ($\hat{\lambda}_3$) in bonus-malus class 1 to the lowest risk ($\hat{\lambda}_1$) in bonus-malus class 20. This pattern is consistent with dynamic sorting of agents across bonus-malus classes.

4.4.3 Tests for State Dependence in Claim Times and Sizes

The previous section presents a tightly structured test for state dependence. It is tightly structured in the sense that it concentrates on local alternatives in which all state dependence is channeled through the dynamic incentives computed using Section 4.3's theory. In this section, we explore the application of more universal, nonparametric tests for state dependence from the literature. The interpretation and, in a few cases, construction of these tests rely on the theory's qualitative predictions on the claims process for given $\lambda$ under moral hazard. We first develop and present these predictions.

4.4.3.1 Theoretical Implications for the Claims Process

The theoretical analysis of Section 4.3 can now be applied to predict the properties of the claims process for given $\lambda$ under local moral-hazard alternatives.
First note that the theory implies that incentives to avoid claims vary between initial bonus-malus classes $K$. However, in data the resulting moral-hazard effects on claims are confounded with sorting of agents with different characteristics $\lambda$ into different classes $K$. The problem of empirically separating these selection effects from the causal effects of incentives is the standard problem of causal inference from cross-sectional data. This is a notoriously hard problem that we avoid here. Instead, we exploit that there is idiosyncratic variation in incentives over time.

Table 4.5: Maximum-Likelihood Estimation of the Auxiliary Model (4.14) with Three, Four, and Five Support Points

Three Support Points

Parameter      Estimate   Std. Error
$\beta$        0.4229     0.0428
$\lambda_1$    0.0427     0.0050
$\lambda_2$    0.1770     0.0254
$\lambda_3$    0.3514     0.0234

Tests of $\beta = 0$:
• LM test: 26.14, p-value = 0.00
• LR test: 89.74, p-value = 0.00
• Wald test: 97.54, p-value = 0.00

Four Support Points

Parameter      Estimate   Std. Error
$\beta$        0.3810     0.0337
$\lambda_1$    0.0000     0.0000
$\lambda_2$    0.0552     0.0074
$\lambda_3$    0.2211     0.0125
$\lambda_4$    0.8715     0.1260

Tests of $\beta = 0$:
• LM test: 72.73, p-value = 0.00
• LR test: 94.96, p-value = 0.00
• Wald test: 127.90, p-value = 0.00

Five Support Points

Parameter      Estimate   Std. Error
$\beta$        0.4017     0.0390
$\lambda_1$    0.0135     0.0118
$\lambda_2$    0.0531     0.0117
$\lambda_3$    0.1889     0.0164
$\lambda_4$    0.2629     0.0171
$\lambda_5$    0.8896     0.1219

Tests of $\beta = 0$:
• LM test: 76.56, p-value = 0.00
• LR test: 106.39, p-value = 0.00
• Wald test: 106.09, p-value = 0.00

Note: The left side of each panel presents maximum-likelihood estimates of the relevant parameters in the unrestricted auxiliary model (4.14). The right side of each panel presents Lagrange multiplier (LM), likelihood-ratio (LR) and Wald tests for moral hazard.
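The Wald statistics in Table 4.5 can be cross-checked against the reported estimates. The Python sketch below assumes that each Wald statistic is the squared ratio of the estimate of $\beta$ to its standard error and that it is referred to a chi-square distribution with one degree of freedom; small discrepancies with the table are to be expected because the published estimates are rounded to four decimals.

```python
import math


def wald_statistic(estimate, std_error):
    """Wald statistic for a single restriction: (estimate / standard error) squared."""
    return (estimate / std_error) ** 2


def chi2_one_df_pvalue(statistic):
    """Upper-tail p-value of a chi-square(1) statistic, via Pr(|Z| > sqrt(statistic))."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Estimates of beta and their standard errors, copied from Table 4.5.
beta_estimates = {
    3: (0.4229, 0.0428),
    4: (0.3810, 0.0337),
    5: (0.4017, 0.0390),
}

wald_by_support = {
    points: wald_statistic(estimate, std_error)
    for points, (estimate, std_error) in beta_estimates.items()
}

# The smallest statistic in the table is the LM statistic with three support points.
smallest_pvalue = chi2_one_df_pvalue(26.14)
```

Even the smallest reported statistic has a p-value far below 0.005, which is why all p-values in the table round to 0.00.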
Figure 4.6: Estimated Probability Masses $\xi_r(K)$ of the Auxiliary Model (4.14) with Three Mass Points

[Figure: twenty panels, one for each bonus-malus class (BM class 1 to BM class 20), each showing the estimated probability masses at $\lambda_1$, $\lambda_2$ and $\lambda_3$ on a vertical scale from 0 to 1.]

Prediction 2. Dependence of claims on $N(\tau-)$, by class $K(\tau)$ under moral hazard. Conditional on $\lambda$, loss rates jump down ($\beta(t \mid \lambda, 0, K) > \beta(t \mid \lambda, 1, K) > \beta(t \mid \lambda, 2, K)$) and claim sizes increase ($x^*(t, \lambda, 0, K) < x^*(t, \lambda, 1, K) < x^*(t, \lambda, 2, K)$) at the times of the first and the second claims in high classes $K$. In contrast, in low classes $K$ loss rates jump up ($\beta(t \mid \lambda, 0, K) < \beta(t \mid \lambda, 1, K) \leq \beta(t \mid \lambda, 2, K)$) and claim sizes decrease ($x^*(t, \lambda, 0, K) > x^*(t, \lambda, 1, K) \geq x^*(t, \lambda, 2, K)$) after the first and the second claims. There is no change in loss rates and claim sizes after the second claim in classes $K \leq 5$.

Because the state-dependence effects on loss rates and claim probabilities work in the same direction, the results for the loss rates carry over to claim rates.

Next, for expositional convenience, suppose that there are no external time effects, $\psi \equiv 1$. Then, we have

Prediction 3. Dependence of claims on time $t$, by class $K(\tau)$ under moral hazard. Conditional on $\lambda$, loss rates of an agent with 0 claims, resp. 1 claim (or, more particularly, $\beta(t \mid \lambda, 0, K)$, resp. $\beta(t \mid \lambda, 1, K)$) weakly decrease with $t$ in most classes $K$, but may increase in the highest classes.
Loss rates of an agent with 2 claims ($\beta(t \mid \lambda, 2, K)$) are time-invariant in classes $K \leq 9$ and strictly decrease with $t$ in classes $K > 9$. The opposite results hold for claim thresholds $x^*$, so that the effects on loss rates carry over to claim rates. All these time effects are small compared to the jumps at the time of a claim (Prediction 2), except for very high loss rates.

If there are external contract-time effects, that is if $\psi$ is nontrivial, then Prediction 3 holds relative to these external effects.

Predictions 1-3 are all conditional on $\lambda$; they are predictions at the level of an individual contract. Because $\lambda$ is not observed, tests based on contrasting the predicted behavior under the null (Prediction 1) with the predicted behavior under the moral-hazard alternative (Predictions 2 and 3) are not feasible. The econometric challenge is to develop tests that use these predictions without requiring data on $\lambda$.

Our tests exploit the dynamics of claims implied by Predictions 2 and 3. Rather than studying cross-sectional variation in incentives, and trying to separate these from selection effects, we exploit variation in incentives over time. The problem of separating the corresponding dynamic moral hazard effects from dynamic selection is the classic problem of distinguishing state dependence and heterogeneity. Like the problem of distinguishing causal effects and selection effects in a static setting, this is a hard problem. However, it is a richer problem that has been well-studied in the statistics and econometrics literature. A key result from this literature implies that, under the null, the total number of claims in the contract year is a sufficient statistic for the unobserved heterogeneity in the loss intensities. We use this result to control for unobserved heterogeneity in the loss rates.
We build on Abbring, Chiappori, and Pinquet's (2003) adaptations and extensions of the tests developed in the seminal work by Bates and Neyman (1952), Heckman and Borjas (1980) and Heckman (1981).

We first study time effects in claim rates. Prediction 1 implies that, under the null and after controlling for heterogeneity, time effects should be identical between classes K. Moreover, there should be no time effects at all under the theory's assumption of stationarity (ψ ≡ 1). On the other hand, both Predictions 2 and 3 imply that there will be time effects under moral hazard.

Time effects in claim rates are likely to be small and tests for moral hazard based on observed time effects are not likely to be very powerful. More importantly, they may be confounded by external time effects (ψ). Therefore, we quickly move to comparing (distributions of) first and second claim times and sizes. Here, Prediction 2 takes center stage. Because the jumps in incentives at the time of a claim are much larger than the time-variation in incentives, Prediction 2's structural occurrence dependence (Heckman and Borjas, 1980) effects dominate Prediction 3's time effects. Therefore, we can test for moral hazard by testing the implications of Prediction 2 for the relation between first and second claim times and sizes, across classes K and controlling for heterogeneity and, possibly, external time effects.

For the state-dependence tests, we use the balanced subsample consisting of the first fully observed contract years, presented in Table 4.4. We will only use data on contracts with one claim and contracts with (exactly or at least) two claims in the contract year. We will use the same notation as before, i.e. Ki(0) will denote the initial bonus-malus class (which is the bonus-malus class in the first observed contract year); Tij and Lij will refer to the time and size of the jth claim (in the first contract year).
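The role of the number of claims as a sufficient statistic can be illustrated with a small simulation. The sketch below is not part of the thesis's analysis: it assumes stationarity (ψ ≡ 1), no moral hazard, and an arbitrary three-point distribution for λ (the rates 0.05, 0.3 and 1.0, the sample size and the function names are made up for illustration). Under these assumptions, first claim times in contracts with exactly one claim should have mean close to 1/2, and in contracts with exactly two claims the time to the first claim should exceed the time between the claims in about half of the cases, whatever the distribution of λ.

```python
import random


def simulate_claim_times(rate, rng):
    """Claim times within one contract year for a Poisson process with constant rate."""
    times = []
    t = rng.expovariate(rate)
    while t < 1.0:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def simulate_null(n_contracts=100_000, rates=(0.05, 0.3, 1.0), seed=0):
    """Simulate heterogeneous contracts without moral hazard.

    Returns the mean first claim time among contracts with exactly one claim
    and the share of two-claim contracts with T1 >= T2 - T1.
    """
    rng = random.Random(seed)
    first_times = []      # T1 for contracts with exactly one claim
    first_longer = []     # indicator T1 >= T2 - T1 for contracts with two claims
    for _ in range(n_contracts):
        lam = rng.choice(rates)          # unobserved heterogeneity (illustrative)
        times = simulate_claim_times(lam, rng)
        if len(times) == 1:
            first_times.append(times[0])
        elif len(times) == 2:
            t1, t2 = times
            first_longer.append(t1 >= t2 - t1)
    mean_t1 = sum(first_times) / len(first_times)
    share = sum(first_longer) / len(first_longer)
    return mean_t1, share


if __name__ == "__main__":
    mean_t1, share = simulate_null()
    print(f"mean first claim time given one claim: {mean_t1:.3f}")
    print(f"share with T1 >= T2 - T1 given two claims: {share:.3f}")
```

Changing the illustrative rates changes how many contracts have one or two claims, but not the two conditional quantities, which is what allows the tests below to dispense with data on λ.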
4.4.3.2 Distribution of First Claim Time

Consider the distribution of the first claim time T1 in the subpopulation with exactly one claim in the contract year and in one of the bonus-malus classes in K,

\[ H_1(t \mid K) = \Pr(T_1 \le t \mid N(1-) = 1, K(0) \in K), \]

and its empirical counterpart

\[ \hat{H}_{1,n}(t \mid K) = \frac{1}{M_{1,K,n}} \sum_{i=1}^{n} I(T_{i1} \le t, N_i(1-) = 1, K_i(0) \in K), \]

where n is the total number of contracts in the sample and

\[ M_{k,K,n} \equiv \sum_{i=1}^{n} I(N_i(1-) = k, K_i(0) \in K) \]

is the number of contracts in the sample of contracts in a class in K with exactly k claims.

Under the null of no moral hazard, H1(·|K) = Ψ(·) (Prediction 1 and Abbring, Chiappori, and Pinquet, 2003). Under the moral hazard alternative, H1(·|K) will typically depend on the choice of K and differ from Ψ(·). This variation is caused by both changes in incentives at the time of a claim (Prediction 2) and changes in incentives over time (Prediction 3). We tested the null that H1(·|K) is equal for all K ∈ {1, 2, . . . , 20} using the Kruskal-Wallis test and do not reject the null at conventional levels (see Table 4.6).

Figure 4.7 plots Ĥ1,n(t|K(0) ∈ K) for low BM classes K = {1, . . . , 10} and high BM classes K = {11, . . . , 20}. The difference between these two empirical distributions is not significant: The p-values of Wilcoxon rank-sum and Kolmogorov-Smirnov tests (given in the first lines of Table 4.6) are above conventional levels.

Suppose now that ψ ≡ 1. Then, under the null of no moral hazard, H1 should be a uniform distribution. In Figure 4.7, both empirical distributions of H1 (for low and high BM classes) lie below the diagonal, which suggests that agents file claims later in the year in all bonus-malus classes. This is consistent with the theory's Prediction 2 under moral hazard for low bonus-malus classes, but violates this prediction for high classes.
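The comparison of the empirical distribution of first claim times with the uniform distribution can be sketched in a few lines. The code below is only an illustration under stated assumptions: the function names are hypothetical, the p-value uses the asymptotic Kolmogorov distribution rather than an exact finite-sample one, and nothing here is claimed to reproduce the software actually used for the reported tests.

```python
import math


def ks_distance_to_uniform(claim_times):
    """Largest gap between the empirical CDF of the claim times and the
    uniform CDF on [0, 1] (the one-sample Kolmogorov-Smirnov statistic)."""
    x = sorted(claim_times)
    n = len(x)
    if n == 0:
        raise ValueError("need at least one claim time")
    # The empirical CDF jumps from i/n to (i+1)/n at the i-th ordered time.
    return max(max((i + 1) / n - t, t - i / n) for i, t in enumerate(x))


def asymptotic_ks_pvalue(distance, n, terms=100):
    """Asymptotic p-value 2 * sum_{k>=1} (-1)^(k-1) exp(-2 k^2 n D^2)."""
    lam = math.sqrt(n) * distance
    if lam < 0.2:
        # The alternating series is numerically unreliable for tiny
        # arguments; its limit there is 1.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


if __name__ == "__main__":
    # Illustrative (made-up) first claim times, measured in contract years.
    times = [0.31, 0.48, 0.55, 0.62, 0.70, 0.77, 0.83, 0.91, 0.95, 0.99]
    d = ks_distance_to_uniform(times)
    print(d, asymptotic_ks_pvalue(d, len(times)))
```

An empirical distribution that lies below the diagonal, as described above, shows up here as a large positive gap between t and the empirical CDF.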
Moreover, this apparent anomaly is significant since the p-value of the Kolmogorov-Smirnov test for the uniformity of H1 is 0.015 for high classes. For low classes, the p-value is 0.083; and for all classes it is 0.002 (see Table 4.7).

These results should, however, be interpreted with considerable care, because Predictions 2 and 3 correspond to only small effects of K on H1 and, moreover, work in opposite directions. Therefore, even small external time effects in ψ can explain the anomaly and, together with moral hazard, generate the pattern observed in Figure 4.7. To see this, note that the estimate of H1 for high bonus-malus classes lies above that for low classes. Thus, consistently with Prediction 2, agents in high classes claim earlier in the year relative to agents in low classes.

By comparing across bonus-malus classes, we have controlled for external time effects. Another way to control for such effects is to compare first and second claim times.

Table 4.6: Nonparametric Tests Based on Comparison of H1 and H2 for Different Bonus-Malus Classes

Kruskal-Wallis test              p-value
H1(K) equal for all K            0.592
H2(K) equal for all K            0.271

Wilcoxon test                    p-value
H1 (low K) ∼ H1 (high K)         0.629
H2 (low K) ∼ H2 (high K)         0.006

Kolmogorov-Smirnov test          p-value
H1 (low K) ∼ H1 (high K)         0.676
H2 (low K) ∼ H2 (high K)         0.004

Note: This table is computed using a subsample of first fully-observed contract years from Table 4.4's sample. The values in bold imply rejection of the null of no moral hazard at a 5% level. Low classes are BM classes 1–10 and high classes are BM classes 11–20.

4.4.3.3 Marginal Distributions of First and Second Claim Times

Consider the distribution of the second claim time T2 in the subpopulation with exactly two claims in the contract year and in one of the bonus-malus classes in K′,

\[ H_2(t \mid K') = \Pr(T_2 \le t \mid N(1-) = 2, K(0) \in K'), \]
and its empirical counterpart,

\[ \hat{H}_{2,n}(t \mid K') = \frac{1}{M_{2,K',n}} \sum_{i=1}^{n} I(T_{i2} \le t, N_i(1-) = 2, K_i(0) \in K'). \]

Abbring, Chiappori, and Pinquet's (2003) analysis implies that, under the null of no moral hazard, H2(t|K′) = H1(t|K)², for all ψ and K, K′. They also show that this equality breaks down under moral hazard, and is likely to do so in one direction.

The immediate implication of this result is that under no moral hazard, H2(t|K′) will not depend on the choice of K′. A Kruskal-Wallis test for the null that H2(·|K) is equal for all K ∈ {1, 2, . . . , 20} gives a p-value of 0.271. The result changes if we group the BM classes into low (1–10) and high (11–20). Then, both Wilcoxon and Kolmogorov-Smirnov tests reject the null at conventional levels; see Table 4.6.

Another test of moral hazard compares Ĥ2,n(·|K′) and Ĥ1,n(·|K)² for appropriate choices of K and K′. Figure 4.8 plots Ĥ1,n(·|K), Ĥ1,n(·|K)² and Ĥ2,n(·|K′) for K = K′ = {1, . . . , 10} (low BM classes) and for K = K′ = {11, . . . , 20} (high BM classes). We find some evidence that H2 > H1² in low classes, and that H2 < H1² in high classes. From Abbring, Chiappori, and Pinquet's (2003) analysis and Prediction 2, we may expect the opposite rankings under moral hazard.[21] However, none of the Kolmogorov-Smirnov tests for H2 = H1² with different choices of K and K′ rejects the null; see Table 4.7. This is consistent with our findings from Chapter 3 that nonparametric state-dependence tests, unlike Section 4.4.2's structural test, have little power with data on rare events.

[21] Abbring, Chiappori, and Pinquet (2003) focus on local behavior near the null. In Chapter 3 we showed that the global implications are less clear-cut. This may also explain some of this result.

Table 4.7: Kolmogorov-Smirnov Test Comparing H1 with the Uniform Distribution and H2 with H1² for Different Bonus-Malus Classes

Kolmogorov-Smirnov test              p-value
H1 (all K)  ∼ Uniform                0.002
H1 (low K)  ∼ Uniform                0.083
H1 (high K) ∼ Uniform                0.015
H2 (all K)  ∼ H1² (all K)            0.524
H2 (low K)  ∼ H1² (low K)            0.065
H2 (high K) ∼ H1² (high K)           0.344
H2 (low K)  ∼ H1² (high K)           0.149
H2 (high K) ∼ H1² (low K)            0.541

Note: This table is computed using a subsample of first fully-observed contract years from Table 4.4's sample. The values in bold imply rejection of the null of no moral hazard at a 5% level. Low classes are BM classes 1–10 and high classes are BM classes 11–20.

Figure 4.7: Comparison of Ĥ1 with the Uniform Distribution for Low and High Bonus-Malus Classes
[Figure: uniform CDF, empirical H1 for low BM classes 1–10 and empirical H1 for high BM classes 11–20; horizontal axis: t, from 0 to 1; vertical axis: H1, from 0 to 1.]

Figure 4.8: Comparison of Ĥ1 with the Uniform Distribution and of Ĥ1² with Ĥ2 for Low and High Bonus-Malus Classes, with Ĥ1 and Ĥ2 Estimated on the Same Classes
[Figure: two panels, "Low BM classes 1–10" and "High BM classes 11–20", each showing the uniform CDF and the empirical H1, H1² and H2; horizontal axis: t, from 0 to 1; vertical axis from 0 to 1.]

4.4.3.4 Joint Distribution of First and Second Claim Durations

So far, we have only compared marginal distributions of first and second claim times. Intuitively, much can be gained by comparing first and second claim times within contracts, that is, by studying the joint distribution of first and second claim times. Thus, we compare the time of the first claim T1 and the time between the first and the second claim T2 − T1 in the subpopulation with exactly two claims in the contract year.
Assuming stationarity (ψ ≡ 1) and under the null of no moral hazard, we have that

\[ \Pr(T_1 \ge T_2 - T_1 \mid N(1-) = 2, K(0) \in K) = \frac{1}{2} \]

for all K. Under moral hazard, on the other hand, we would expect this probability to be larger than 1/2 in low classes, where incentives jump down after the first claim, and smaller than 1/2 in high classes, where incentives jump up. Note that here we again use that these jumps in incentives dominate the changes in incentives over time.

Thus, under stationarity (ψ ≡ 1), a test for moral hazard can be based on the share of contracts in classes in K with two claims for which the time to the first claim is larger than the time between the first and the second,

\[ \hat{\pi}_n(K) = \frac{1}{M_{2,K,n}} \sum_{i=1}^{n} I(T_{i1} \ge T_{i2} - T_{i1}, N_i(1-) = 2, K_i(0) \in K). \]

Under the null of no moral hazard, π̂n(K) is asymptotically normal with mean 1/2 and variance 1/(4 nK P2,K), where nK is the total number of contracts in all classes from K, and Pk,K is more generally the probability that a contract in a class in K has k claims in the contract year. The variance of π̂n(K) can be consistently estimated by 1/(4 M2,K,n).

Another test for moral hazard under stationarity can be based on

\[ \widehat{\ln\beta}_n(K) = \frac{1}{M_{2,K,n}} \sum_{i=1}^{n} \ln\!\left(\frac{T_{i1}}{T_{i2} - T_{i1}}\right) I(N_i(1-) = 2, K_i(0) \in K), \]

which is asymptotically normal under the null of no moral hazard, with expectation 0 and variance π²/(3 nK P2,K). The variance can be consistently estimated by π²/(3 M2,K,n).

The first two columns of Table 4.8 give π̂n(K) and ln β̂n(K) with their estimated standard errors for various choices of K. The two statistics' values, and their variation with classes, are consistent with moral hazard. However, the null of no moral hazard is not rejected at a 5% level, because the small numbers of observations imply low precision. This is consistent with our results from Chapter 3 that these tests have limited power with data on rare events.
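The two statistics and their estimated standard errors are simple to compute once the first and second claim times of the two-claim contracts in the chosen classes have been collected. The following is a minimal sketch under that assumption; the function name, the dictionary keys and the input layout (one pair of claim times per contract, in contract years) are illustrative choices, not the thesis's implementation.

```python
import math


def duration_test_statistics(claim_times):
    """Share and log-ratio statistics for contracts with exactly two claims.

    claim_times: list of (t1, t2) pairs with 0 < t1 < t2 <= 1, one pair per
    contract in the selected bonus-malus classes.
    """
    m = len(claim_times)
    if m == 0:
        raise ValueError("need at least one contract with two claims")
    # Share of contracts whose first duration is at least the second duration.
    share = sum(1 for t1, t2 in claim_times if t1 >= t2 - t1) / m
    # Average log ratio of the first duration to the second duration.
    log_ratio = sum(math.log(t1 / (t2 - t1)) for t1, t2 in claim_times) / m
    # Standard errors from the variance estimates 1/(4M) and pi^2/(3M).
    se_share = 1.0 / (2.0 * math.sqrt(m))
    se_log_ratio = math.pi / math.sqrt(3.0 * m)
    return {
        "pi_hat": share,
        "pi_se": se_share,
        "pi_z": (share - 0.5) / se_share,
        "ln_beta_hat": log_ratio,
        "ln_beta_se": se_log_ratio,
        "ln_beta_z": log_ratio / se_log_ratio,
    }


if __name__ == "__main__":
    # Made-up claim times for four contracts, for illustration only.
    example = [(0.2, 0.5), (0.4, 0.6), (0.3, 0.9), (0.5, 0.75)]
    print(duration_test_statistics(example))
```

Because both standard errors depend on the data only through the number of two-claim contracts, precision is driven entirely by subsample size; with 24 contracts, for instance, the formulas give about 0.102 for the share and 0.370 for the log ratio.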
Precision can be increased by pooling classes at both ends of the bonus-malus scheme, but reversing the comparison for the high classes. For example, we can use

\[ \hat{\pi}_n(K_L, K_H) = \frac{1}{M_{2,K_L \cup K_H,n}} \sum_{i=1}^{n} \left[ I(T_{i1} \ge T_{i2} - T_{i1}, N_i(1-) = 2, K_i(0) \in K_L) + I(T_{i1} \le T_{i2} - T_{i1}, N_i(1-) = 2, K_i(0) \in K_H) \right], \]

with KL and KH disjoint sets of low and high bonus-malus classes, respectively. Under moral hazard, we would expect this share to be larger than 1/2. Therefore, we can use a one-sided test. The first two columns of Table 4.9 give the values of π̂n(KL, KH) and a similar variant ln β̂n(KL, KH) of ln β̂n(K). We expect the latter to be positive under moral hazard. The results are again consistent with moral hazard, now with some rejections of the null at a 5% level in very high and very low BM classes.

Abbring, Chiappori, and Pinquet (2003) develop a variant π̂n∗(K|K′) of the statistic π̂n(K) that allows for general external time effects ψ. Adapted to our setting, it compares the transformed durations H1(T1|K′) and H1(T2|K′) − H1(T1|K′) in the subsample with classes in K. As before, K and K′ can be wisely chosen to maximize power. Proposition 7 in Abbring, Chiappori, and Pinquet implies that, under the null of no moral hazard, π̂n∗(K|K′) is asymptotically normal with expectation 1/2 and variance 1/(4 nK P2,K) + 1/(6 nK′ P1,K′), which can be consistently estimated by 1/(4 M2,K,n) + 1/(6 M1,K′,n).

The last two columns of Tables 4.8 and 4.9 give the values of π̂n∗ with the estimated standard errors for different bonus-malus classes. First we estimated H1 using all bonus-malus classes (taking K′ = {1, . . . , 20}) and then using only the tested (current) bonus-malus classes (taking K′ = K). The values of the π̂n∗(K|K′) statistic, and their variation with classes, are again consistent with moral hazard.
However, the null of no moral hazard is not rejected at a 5% level, because the πn∗ test has lower power than the πn test (see Chapter 3).

4.4.3.5 Claim Sizes

As discussed in Section 4.4.3.1, the jumps in incentives at the time of a claim dominate the variation in incentives over time. Therefore, in comparing claim sizes within a contract year, we can focus on Prediction 2's occurrence-dependence effects, and ignore Prediction 3's time effects. This facilitates a test for ex post moral hazard based on a comparison of the sizes of agents' first and second claims in a contract year, even though these occur at different times.

Under ex post moral hazard the size L2 of a second claim in a contract year is stochastically larger than the size L1 of a first claim in high classes where incentives jump up after the first claim. On the other hand, first claim sizes are stochastically larger than second claim sizes in low classes. Under the null of no moral hazard, first and second claim sizes share the same distribution F(·|λ).

Table 4.10 reports p-values of Wilcoxon and sign tests for this hypothesis against one-sided and two-sided alternatives, using subsamples of contracts with two or more claims in different bonus-malus classes. They suggest that L1 and L2 are not identically distributed. In particular, the second claim is stochastically larger in the subpopulation in higher classes. This is consistent with ex post moral hazard: Agents in high classes K increase their claiming thresholds x∗ after experiencing a jump up in their incentives at the time of their first claim.

Table 4.8: Tests Based on Comparison of First and Second Claim Durations, for Different Bonus-Malus Classes

Test statistics (std. error)
BM classes   π̂n              ln β̂n            π̂n∗ (all)       π̂n∗ (current)
1            62.5% (10.2%)   0.392 (0.370)    54.2% (10.2%)   54.2% (10.9%)
1–2          65.7% (8.5%)    0.570 (0.307)    60.0% (8.5%)    60.0% (8.9%)
1–3          62.2% (7.5%)    0.384 (0.270)    57.8% (7.5%)    57.8% (7.8%)
1–4          55.6% (6.8%)    0.211 (0.247)    51.9% (6.8%)    53.7% (7.1%)
1–5          55.2% (6.1%)    0.180 (0.222)    52.2% (6.1%)    55.2% (6.4%)
1–6          54.3% (5.6%)    0.046 (0.202)    50.6% (5.6%)    54.3% (5.8%)
1–7          53.6% (5.1%)    0.086 (0.184)    50.5% (5.1%)    53.6% (5.3%)
1–8          53.1% (4.7%)    0.085 (0.171)    50.4% (4.7%)    53.1% (4.9%)
1–9          52.3% (4.4%)    0.083 (0.160)    49.2% (4.5%)    50.8% (4.6%)
1–10         53.2% (4.2%)    0.106 (0.154)    50.4% (4.3%)    51.1% (4.4%)
All          52.9% (3.1%)    0.107 (0.112)    50.6% (3.1%)    50.6% (3.1%)
11–20        52.5% (4.5%)    0.109 (0.164)    50.8% (4.6%)    50.8% (4.6%)
12–20        51.8% (4.8%)    0.035 (0.173)    50.0% (4.8%)    50.0% (4.8%)
13–20        51.1% (5.2%)    0.070 (0.187)    50.0% (5.2%)    50.0% (5.2%)
14–20        53.0% (5.5%)    0.132 (0.199)    51.8% (5.5%)    51.8% (5.5%)
15–20        51.4% (6.0%)    0.101 (0.217)    50.0% (6.0%)    50.0% (6.0%)
16–20        48.4% (6.3%)    -0.040 (0.227)   46.9% (6.3%)    46.9% (6.3%)
17–20        45.1% (7.0%)    -0.188 (0.254)   43.1% (7.0%)    43.1% (7.1%)
18–20        47.7% (7.5%)    -0.105 (0.273)   45.5% (7.6%)    47.7% (7.6%)
19–20        44.1% (8.6%)    -0.057 (0.311)   41.2% (8.6%)    44.1% (8.6%)
20           41.4% (9.3%)    -0.112 (0.337)   41.4% (9.3%)    44.8% (9.3%)

Note: This table is computed using a subsample of first fully-observed contract years from Table 4.4's sample. The values in bold imply rejection of the null of no moral hazard at a 5% level (two-sided test). In the computation of π̂n∗, we first used all bonus-malus classes to estimate H1, and then only the tested (current) bonus-malus classes listed in the first column.

Table 4.9: Tests Based on Comparison of First and Second Claim Durations that Pool Low and High Bonus-Malus Classes

BM classes        Test statistics (std. error)
low     high      π̂n             ln β̂n            π̂n∗ (all)      π̂n∗ (current)
1       20        60.4% (6.9%)   0.239 (0.249)    56.6% (6.9%)   56.6% (6.9%)
1–2     20        62.5% (6.3%)   0.363 (0.227)    59.4% (6.3%)   59.4% (6.3%)
1–2     19–20     60.9% (6.0%)   0.317 (0.218)    59.4% (6.0%)   59.4% (6.1%)
1–2     18–20     58.2% (5.6%)   0.311 (0.204)    57.0% (5.7%)   58.2% (5.7%)
1–2     17–20     59.3% (5.4%)   0.343 (0.196)    58.1% (5.4%)   58.1% (5.5%)
1–3     19–20     59.5% (5.6%)   0.243 (0.204)    58.2% (5.7%)   58.2% (5.7%)
1–3     18–20     57.3% (5.3%)   0.246 (0.192)    56.2% (5.3%)   57.3% (5.4%)
1–4     17–20     55.2% (4.9%)   0.200 (0.177)    54.3% (4.9%)   54.3% (4.9%)
1–5     16–20     53.4% (4.4%)   0.112 (0.158)    52.7% (4.4%)   52.7% (4.4%)
1–6     15–20     51.7% (4.1%)   -0.022 (0.148)   50.3% (4.1%)   50.3% (4.1%)
1–7     14–20     50.6% (3.7%)   -0.014 (0.135)   49.4% (3.8%)   49.4% (3.8%)
1–8     13–20     51.2% (3.5%)   0.015 (0.126)    50.2% (3.5%)   50.2% (3.5%)
1–9     12–20     50.4% (3.2%)   0.028 (0.118)    49.6% (3.3%)   49.6% (3.3%)
1–10    11–20     50.6% (3.1%)   0.006 (0.112)    49.8% (3.1%)   49.8% (3.1%)

Note: This table is computed using a subsample of first fully-observed contract years from Table 4.4's sample. The values in bold imply rejection of the null of no moral hazard at a 5% level (one-sided test). In the computation of π̂n∗, we first used all bonus-malus classes to estimate H1, and then only the tested (current) bonus-malus classes listed in the first column.

Table 4.10: Comparison of First and Second Claim Sizes for Various Bonus-Malus Classes

                       Wilcoxon test   Sign test of L1 ∼ L2 against
BM classes   # obs.    p-value         L1 ≻ L2    L1 ≺ L2    L1 ≁ L2
1            29        0.336           0.068      0.969      0.136
1–2          41        0.791           0.378      0.734      0.755
1–3          51        0.633           0.500      0.610      1.000
1–4          60        0.802           0.449      0.651      0.897
1–5          74        0.942           0.546      0.546      1.000
1–6          90        0.755           0.701      0.376      0.752
1–7          106       0.458           0.809      0.248      0.497
1–8          122       0.219           0.913      0.120      0.239
1–9          139       0.173           0.913      0.117      0.235
1–10         151       0.240           0.873      0.164      0.329
All          278       –               0.993      0.010      0.019
11–20        127       –               0.994      0.010      0.021
12–20        113       –               0.996      0.007      0.014
13–20        97        –               0.993      0.012      0.025
14–20        86        –               0.994      0.011      0.023
15–20        73        –               0.995      0.009      0.019
16–20        67        –               0.998      0.003      0.007
17–20        53        0.090           0.986      0.027      0.053
18–20        46        0.127           0.973      0.052      0.104
19–20        36        0.388           0.879      0.203      0.405
20           31        0.493           0.763      0.360      0.720
[Wilcoxon p-values for the rows All to 16–20: 0.028, 0.043, 0.035, 0.029, 0.038, 0.061 and 0.055; their assignment to individual rows is not recoverable.]

Note: This table is computed using a subsample of contracts with at least two claims in the first fully-observed contract year from Table 4.4's sample. The values in bold imply rejection of the null of no moral hazard at a 5% level.

4.4.4 Claim Withdrawals

So far, we have ignored withdrawn claims. We will now argue that withdrawals are directly informative on moral hazard, and present some evidence.

Suppose that it takes time for loss amounts to be assessed, so that agents have to file a claim before the loss amount is fully known. Furthermore, suppose that there are no costs, administrative or informational, of filing and withdrawing claims. Then, agents will report all losses to the insurer to secure an option on compensation, and typically withdraw those claims for losses that fall below the threshold. Our data on claims and withdrawals are thus directly informative on ex post moral hazard (withdrawals), and ex ante moral hazard (initial claims).

If we relax our assumptions, some ex post moral hazard will end up reducing initial claims. In any case, the mere fact that some claims are withdrawn in the sample points to ex post moral hazard.
Under the null of no ex post moral hazard, agents will claim all accidents to the insurer and withdraw only those for which the damage falls below the level of the deductible. The agent's decision whether to withdraw a claim or not will therefore depend only on the size of the damage and not on the bonus-malus class. Consequently, the shares of withdrawn claims should be roughly the same among all BM classes.

Figure 4.9 plots the shares of withdrawn claims for each bonus-malus class. We observe that the shares are small for low and high bonus-malus classes and large for the bonus-malus classes in between. This is consistent with the incentives to avoid a first (∆V(1, K, 1)) and a second (∆V(1, K, 2)) claim that we presented in Figure 4.2.

Figure 4.9: Share of Withdrawn Claims per Bonus-Malus Class
[Figure: share of withdrawn claims (vertical axis, 0% to 8%) by BM class (horizontal axis, 1 to 20).]
Note: This graph only includes withdrawals that are directly observed. See Appendix 4.C.

4.5 Conclusion

Putting novel theoretical insights into the dynamic incentives implied by experience rating to empirical use, we find evidence of moral hazard in Dutch car insurance. The earlier literature often fails to find such evidence.

APPENDICES TO CHAPTER 4

4.A Proofs of Results in Section 4.3

Lemma 1. The value function V is strictly increasing in wealth W.

Proof. Consider a state (t, W, K, N) and denote the (stochastic) optimal consumption-prevention-claim plan following this state by (c∗, p∗, X∗). Then, in state (t, W′, K, N) with W′ > W, the agent can attain an expected discounted utility equal to V(t, W, K, N) by following the same plan (c∗, p∗, X∗). Because consuming c∗ + ρ(W′ − W) > c∗ is feasible and instantaneous utility u is strictly increasing, V(t, W′, K, N) > V(t, W, K, N).

Lemma 2. The value function V
is weakly increasing in the bonus-malus class K and weakly decreasing in the number of claims at fault N.

Proof. Consider a state (t, W, K, N) and denote the (stochastic) optimal consumption-prevention-claim plan following this state by (c∗, p∗, X∗). Then, in state (t, W, K′, N′) with K′ ≥ K and N′ ≤ N, the agent can attain an expected discounted utility equal to V(t, W, K, N) by following the same plan (c∗, p∗, X∗). In this case future insurance premia are weakly smaller, in the sense of stochastic dominance, than under optimal behavior in state (t, W, K, N), because B(K, N) is weakly increasing in K and weakly decreasing in N, premia are weakly decreasing in K, and the agent faces the same distribution of future claims. Therefore, choosing (c∗, p∗, X∗) in state (t, W, K′, N′) is feasible and, indeed, V(t, W, K′, N′) ≥ V(t, W, K, N).

Proof of Proposition 5. First, note that the proposition's specifications of the consumption rule and value function satisfy the Euler equation:

\[ u'(c^*(S)) = e^{-\alpha\rho[W - Q(t,K,N)]} = V_W(S). \]

Second, note that ρV(S) = u(c∗(S)), so that Bellman equation (4.4) is satisfied if

\[ 0 = p_0 \left[ V(t, W, K, N+1) - V(S) \right] + V_W(S) \left[ \rho W - c^*(S) - A(K) \right] + V_t(S). \]

Because

\[ V(t, W, K, N+1) - V(S) = e^{-\alpha\rho[W - Q(t,K,N)]} \, \frac{1 - e^{\alpha\rho[Q(t,K,N+1) - Q(t,K,N)]}}{\alpha\rho}, \]

\[ V_W(S) \left[ \rho W - c^*(S) - A(K) \right] = e^{-\alpha\rho[W - Q(t,K,N)]} \left[ \rho Q(t,K,N) - A(K) \right], \]

and

\[ V_t(S) = -e^{-\alpha\rho[W - Q(t,K,N)]} \, Q_t(t,K,N), \]

this is guaranteed by equation (4.10). Third, the Bellman equation's premium renewal conditions (4.3) are satisfied by equation (4.11):

\[ V(1, W, K, N) = \frac{1 - e^{-\alpha\rho[W - Q(1,K,N)]}}{\alpha\rho} = \frac{1 - e^{-\alpha\rho[W - Q(0,B(K,N),0)]}}{\alpha\rho} = V(0, W, B(K,N), 0). \]

Finally, using standard methods it can be proved that there exists a unique solution Q to the system (4.10)–(4.11).

4.B Computation of Proposition 5's Function Q

Let A and B be given by Table 4.1 and attach some values to the parameters ρ, α, and p0.
In the limiting case α ↓ 0, the corresponding initial-value problem (4.10)–(4.11) has an explicit analytical solution Q. In particular, the initial values Q(0, ·, 0) can be computed directly using

\[ \begin{pmatrix} Q(0,1,0) \\ \vdots \\ Q(0,20,0) \end{pmatrix} = \frac{1 - e^{-\rho}}{\rho} \, (I - e^{-\rho} T)^{-1} \begin{pmatrix} \pi(1) \\ \vdots \\ \pi(20) \end{pmatrix}. \tag{4.16} \]

Here, I is the 20 × 20 identity matrix and T is the annual transition probability matrix among bonus-malus classes implied by p0 and B. The solution Q then satisfies the recursive system

\[ Q(t,K,N) = \frac{\pi(K)}{\rho} + e^{-\rho(1-t)} \left[ Q(0,1,0) - \frac{\pi(K)}{\rho} \right] \quad \text{for } N \ge 3, \]

\[ Q(t,K,2) = Q(t,K,3) + e^{-(p_0+\rho)(1-t)} \left[ Q(0,B(K,2),0) - Q(0,1,0) \right], \]

\[ Q(t,K,1) = Q(t,K,2) + e^{-(p_0+\rho)(1-t)} \left\{ Q(0,B(K,1),0) - Q(0,B(K,2),0) + p_0(1-t) \left[ Q(0,B(K,2),0) - Q(0,1,0) \right] \right\}, \]

and

\[ Q(t,K,0) = Q(t,K,1) + e^{-(p_0+\rho)(1-t)} \left\{ Q(0,B(K,0),0) - Q(0,B(K,1),0) + p_0(1-t) \left[ Q(0,B(K,1),0) - Q(0,B(K,2),0) \right] + \tfrac{1}{2} p_0^2 (1-t)^2 \left[ Q(0,B(K,2),0) - Q(0,1,0) \right] \right\}. \]

In the general case, the function Q can be computed iteratively using

Algorithm 1. Give starting values to Q(0, K, 0), K = 1, . . . , 20, and repeat

1. set Q∗(0, K, 0) = Q(0, K, 0), K = 1, . . . , 20;
2. for K = 1, . . . , 20,
   (a) for N ≥ 3, set
       \[ Q(t,K,N) = \frac{\pi(K)}{\rho} + e^{-\rho(1-t)} \left[ Q(0,1,0) - \frac{\pi(K)}{\rho} \right]; \]
   (b) for N = 2, 1, 0, set Q(·, K, N) to the numerical solution of the corresponding single-equation initial-value problem in (4.10)–(4.11);

until max_K |Q(0, K, 0) − Q∗(0, K, 0)| ≤ ε, for some small ε > 0.

The values Q(0, ·, 0) for the linear-utility case, those that satisfy (4.16), can be used as starting values in Algorithm 1. Note that in the linear-utility case itself, this produces Q in one iteration; in cases with α > 0, more iterations are typically needed. If we have to compute Q for multiple values of α, we can use the linear-utility values of Q(0, ·, 0) as starting values for the computations with the lowest value of α, the resulting Q(0, ·, 0) as starting values for the computations with the second-lowest value of α, etcetera.

4.C Data

Recall from Section 4.2.2 that the data provide contract and claim histories of personal car insurance clients of a major Dutch insurer from January 1, 1995 to December 31, 2000. All data, except information on claim withdrawals, came in a single file with 1,730,559 records for 163,194 unique contracts. A second file provided information on the withdrawal of claims by agents after they were filed. Recall from Section 4.2 that agents had the option to avoid a malus after filing a claim by timely withdrawing it.

We excluded information on the year 1995, because it lacked information on claims. From the remaining 142,175 contracts, we deleted 1,376 contracts that were not subject to the bonus-malus system.[22] We also deleted 16,778 contracts with unobserved renewal date. Most of these contracts started in 1995 and did not renew in 1996. Many were also short-term contracts covering only a couple of weeks or months. This left a sample of 124,021 contracts.

[22] We will use this raw sample, consisting of 140,799 contracts, in the last chapter to give some motivation for the future research.

We matched the second file's withdrawal information to the main data set based on contract and claim identifiers, but this matching was not complete. This is important, because consistency of the claim and BM information is crucial to this chapter's empirical analysis. The remainder of this section discusses the ways we enforced such consistency by correcting the claim withdrawal and BM information, and checked for our empirical work's robustness to these corrections.

First, few BM transitions in 2000 were recorded correctly. Therefore, we truncated all contract histories that were renewed in 2000 at the 2000 renewal date.
This cut another 219 contracts that were first observed in 2000 from the sample, leaving 123,802 unique contracts.

Of these 123,802 contracts, 103,930 are observed for more than one year. For each such contract we observe the sequence of BM classes in consecutive contract years, with the number of claims at fault that were not withdrawn in each contract year. For 14,206 contracts, we observed one or more deviations from Table 4.1's BM updating rule. Many of these deviations can be explained by unobserved withdrawals, which may exist because observed withdrawals could not be perfectly matched to the main data file. For example, some contracts were awarded a bonus after a contract year with a claim. This is only consistent with the BM system if the claim was withdrawn. Therefore, we decided to treat those claims as (unobserved) withdrawals. Consequently, we excluded them from the sample. All in all, we found 1,355 unobserved withdrawals in the sample constructed so far.

Even after excluding unobserved withdrawals, the sample still contained incorrect BM transitions. We corrected these anomalies by constructing the most appropriate BM class for the first contract year, that is, the class that minimized changes to the raw data, and deriving the BM classes in all consecutive contract years from this initial BM class and claims using the BM updating rule.

Of the 14,325 contracts observed for two years only, 1,269 have an incorrect BM transition. In these 1,269 cases, we simply set the BM class in the second year to be consistent with the BM class and the number of claims observed in the first year.

For most contracts with inconsistencies observed for more than 2 years, a BM sequence based on the first year's BM class delivered the best fit to the observed BM sequence. A single inconsistency in the middle of a BM sequence, sandwiched between consistent BM classes, also often occurred.
Then, we simply corrected this single BM class using the previous year's BM class and claim information.

In some cases with more than two years of data, the first year's BM class was inconsistent with the BM classes in all later years. Then, we forced the first year's BM class to be consistent with the first BM class later in the sequence that was consistent with later BM classes and claims. Because the BM updating rule in Table 4.1 is not a one-to-one mapping, there were often more consistent choices of a first BM class. For example, an agent who was downgraded to BM class 1 in the second contract year after having one claim in the first, could have been in any of the BM classes 1–5 in the first year. In these cases, we chose the highest consistent BM class.

In a very few cases we were not able to correct the first year's BM class this way. For example, no choice of a first year's BM class is consistent with a claim in the first year and a BM class 15 or higher in the second year. We deleted 12 such contracts, observed for 4 years, from the sample.

Finally, we deleted 621 contracts that were observed for more than 2 years and had only inconsistent BM transitions. This leaves a final sample with 123,169 unique contracts and 23,396 claims at fault that were not withdrawn. All empirical results reported in this chapter are based on this sample.

We checked the robustness of these results with respect to the ways we have selected the sample and corrected the BM classes and claim withdrawal information by recomputing all results on different samples employing different ways of correcting for inconsistencies. First, we used an alternative sample that included only observations with consistent raw data on claims and BM classes. No corrections were applied to these data. The results are presented in Appendix 4.E. Second, we used a sample that was alternatively corrected for inconsistent BM information by deriving all BM sequences from the initial BM classes observed, using the BM updating rule; see Appendix 4.F for the results. Third, we used the main, corrected sample, but included all withdrawn claims as claim-at-fault events; see Appendix 4.G for the outcome. We found that all results reported in this chapter are robust.

4.D Main (Corrected) Sample without Young Drivers

Tables and figures numbered 4.D.1 and up report the results of redoing all analyses on a subsample of insurees of 28 years old and up of the main sample (see Section 4.4.1). Like in the main analysis, all claims observed to be withdrawn are excluded.

Table 4.D.3: Contract Exposure Durations in the Sample

                     Number of contracts observed
Number of years Y    exactly Y years    between Y − 1 and Y years    Total
1                    8,074              11,389                       19,463
2                    5,107              9,329                        14,436
3                    6,576              7,111                        13,687
4                    66,654             6,144                        72,798
Total                86,411             33,973                       120,384

Table 4.D.4: Number of Contracts Observed for At Least One Full Contract Year, by Bonus-Malus Class and Number of Claims in the First Contract Year

            Number of contracts with
BM class    no claim    1 claim    2 claims    3 claims    4 claims    Total
1           519         112        23          4           1           659
2           704         84         11          1                       800
3           902         75         10                                  987
4           1,266       96         8                                   1,370
5           1,682       101        9           2                       1,794
6           2,330       146        13          2                       2,491
7           3,142       192        14                                  3,348
8           3,857       252        15                                  4,124
9           4,584       235        14          2                       4,835
10          6,018       274        11          1                       6,304
11          5,763       262        10          2                       6,037
12          5,895       287        16                                  6,198
13          5,807       258        10                                  6,075
14          6,683       311        13                                  7,007
15          6,241       298        7                                   6,546
16          6,429       297        13                      1           6,740
17          5,692       250        7                                   5,949
18          4,380       202        10                                  4,592
19          3,868       213        5                                   4,086
20          27,651      1,372      28          2                       29,053
Total       103,413     5,317      247         16          2           108,995
Figure 4.D.1: Distribution of Contracts Observed for At Least One Full Contract Year Across Bonus-Malus Classes; and Shares of Those Contracts with At Least One and At Least Two Claims at Fault in the First Contract Year, by Bonus-Malus Class
[Figure: share against BM class (1 to 20); series: all contracts, contracts with at least 1 claim, contracts with at least 2 claims.]

Figure 4.D.2: Incentives to Avoid First, Second and Third Claim; at an Average Risk Level
[Figure: ∆V against K (1 to 20); series: ∆V(1,K,1), ∆V(1,K,2), ∆V(1,K,3).]

Figure 4.D.3: Change in Incentives to Avoid a Claim after a First and a Second Claim, and Changes in Incentives to Avoid a First, a Second and a Third Claim over the Course of a Contract Year; at an Average Risk Level
[Figure: ∆²V against K (1 to 20); series: ∆V(1,K,2) − ∆V(1,K,1), ∆V(1,K,3) − ∆V(1,K,2), ∆V(1,K,1) − ∆V(0,K,1), ∆V(1,K,2) − ∆V(0,K,2), ∆V(1,K,3) − ∆V(0,K,3).]

Figure 4.D.4: Change in Incentives to Avoid a Claim after a First and a Second Claim, and Changes in Incentives to Avoid a First, a Second and a Third Claim over the Course of a Contract Year; at a Zero Risk Level
unchanged

Figure 4.D.5: Change in Incentives to Avoid a Claim after a First and a Second Claim, and Changes in Incentives to Avoid a First, a Second and a Third Claim over the Course of a Contract Year; at a High Risk Level
[Figure: ∆²V against K (1 to 20); series: ∆V(1,K,2) − ∆V(1,K,1), ∆V(1,K,3) − ∆V(1,K,2), ∆V(1,K,1) − ∆V(0,K,1), ∆V(1,K,2) − ∆V(0,K,2), ∆V(1,K,3) − ∆V(0,K,3).]
MORAL HAZARD IN DYNAMIC INSURANCE DATA Table 4.D.5: ML Estimation of the Auxiliary Model (4.14) with 3 and 4 Support Points Three Support Points Parameter Estimate Std. Error β λ1 λ2 λ3 0.4193 0.0390 0.0393 0.0032 0.2023 0.0197 0.4172 0.0322 Tests of β=0 • LM test: 120.11, p-value = 0.00 • LR test: 103.66, p-value = 0.00 • Wald test: 115.75, p-value = 0.00 Four Support Points Parameter Estimate Std. Error β λ1 λ2 λ3 λ4 Table 4.D.6: 0.3795 0.0342 0.0000 0.0000 0.0551 0.0080 0.2240 0.0124 0.8919 0.1273 Tests of β=0 • LM test: 57.25, • LR test: 100.75, • Wald test: 123.05, Nonparametric Tests Based on Comparison of H1 Bonus-Malus Classes Kruskal - Wallis test H1 (K) H2 (K) equal for all equal for all K K Wilcoxon test 0.736 0.285 p-value H1 (low K ) ∼ H1 (high K ) H2 (low K ) ∼ H2 (high K ) 0.005 Kolmogorov - Smirnov test p-value H1 (low K ) ∼ H1 (high K ) H2 (low K ) ∼ H2 (high K ) 148 p-value 0.370 0.433 0.002 p-value and p-value = 0.00 = 0.00 p-value H2 = 0.00 for Dierent 4.D. MAIN (CORRECTED) SAMPLE WITHOUT YOUNG DRIVERS Figure 4.D.7: Comparison of Ĥ1 with the Uniform Distribution for Low and High Bonus- Malus Classes 1 Uniform CDF Empirical H1 for low BM 1 − 10 0.9 Empirical H1 for high BM 11 − 20 0.8 0.7 H1 0.6 0.5 0.4 0.3 0.2 0.1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 t Table 4.D.7: Kolmogorov-Smirnov Test Comparing 2 with H1 for Dierent Bonus-Malus Classes H1 with the Uniform Distribution and H2 Kolmogorov - Smirnov test H1 (all K ) H1 (low K ) H1 (high K ) H2 (all K ) H2 (low K ) H2 (high K ) H2 (low K ) H2 (high K ) ∼ ∼ ∼ ∼ ∼ ∼ ∼ ∼ U nif orm U nif orm U nif orm H12 (all K ) H12 (low K ) H12 (high K ) H12 (high K ) H12 (low K ) p-value 0.003 0.018 0.052 0.649 0.055 0.250 0.107 0.515 149 CHAPTER 4. 
Figure 4.D.8: Comparison of Ĥ1 with the Uniform Distribution and of Ĥ2 with Ĥ1² for Low and High Bonus-Malus Classes, with Ĥ1 and Ĥ2 Estimated on the Same Classes
[Figure: two panels, low BM classes 1–10 and high BM classes 11–20; horizontal axis t from 0 to 1; series: uniform CDF, empirical H1, empirical H1², empirical H2.]

Table 4.D.8: Tests Based on Comparison of First and Second Claim Durations, for Different Bonus-Malus Classes

              Test statistics (std. error)
BM classes    π̂n               ln β̂n             π̂n∗ (all)        π̂n∗ (current)
1             60.9% (10.4%)     0.408 (0.378)    56.5% (10.4%)    56.5% (11.1%)
1–2           64.7% (8.6%)      0.586 (0.311)    61.8% (8.6%)     61.8% (9.1%)
1–3           61.4% (7.5%)      0.392 (0.273)    59.1% (7.6%)     59.1% (7.9%)
1–4           53.8% (6.9%)      0.196 (0.252)    51.9% (7.0%)     51.9% (7.3%)
1–5           54.1% (6.4%)      0.177 (0.232)    52.5% (6.4%)     54.1% (6.7%)
1–6           52.7% (5.8%)      0.033 (0.211)    51.4% (5.8%)     51.4% (6.0%)
1–7           51.1% (5.3%)      0.061 (0.193)    50.0% (5.4%)     50.0% (5.5%)
1–8           52.4% (4.9%)      0.090 (0.179)    51.5% (5.0%)     53.4% (5.1%)
1–9           52.1% (4.6%)      0.092 (0.168)    50.4% (4.7%)     49.6% (4.8%)
1–10          53.1% (4.4%)      0.117 (0.160)    51.6% (4.5%)     51.6% (4.5%)
All           53.4% (3.2%)      0.137 (0.115)    51.4% (3.2%)     51.4% (3.2%)
11–20         53.8% (4.6%)      0.158 (0.166)    51.3% (4.6%)     51.3% (4.6%)
12–20         52.3% (4.8%)      0.075 (0.174)    49.5% (4.8%)     49.5% (4.8%)
13–20         51.6% (5.2%)      0.117 (0.188)    49.5% (5.2%)     49.5% (5.2%)
14–20         51.8% (5.5%)      0.148 (0.199)    50.6% (5.5%)     50.6% (5.5%)
15–20         51.4% (6.0%)      0.124 (0.217)    50.0% (6.0%)     50.0% (6.0%)
16–20         47.6% (6.3%)     -0.042 (0.229)    46.0% (6.3%)     46.0% (6.4%)
17–20         44.0% (7.1%)     -0.192 (0.257)    42.0% (7.1%)     42.0% (7.1%)
18–20         46.5% (7.6%)     -0.109 (0.277)    44.2% (7.6%)     46.5% (7.7%)
19–20         42.4% (8.7%)     -0.060 (0.316)    39.4% (8.7%)     42.4% (8.8%)
20            39.3% (9.4%)     -0.118 (0.343)    39.3% (9.5%)     42.9% (9.5%)
MORAL HAZARD IN DYNAMIC INSURANCE DATA Table 4.D.9: Tests Based on Comparison of First and Second Claim Durations that Pool Low and High Bonus-Malus Classes BM classes low high 1 20 1 2 20 1 2 19 20 1 2 18 20 1 2 17 20 1 3 19 20 1 3 18 20 1 4 1 5 Test statistics (std. error) [ ln βn π̂n∗ (all) π̂n∗ (current) 60.8% (7.0%) 0.249 (0.254) 58.8% (7.0%) 58.8% (7.1%) (6.4%) 0.375 (0.230) (6.1%) 0.327 (0.222) 58.4% (5.7%) 0.320 (0.207) π̂n 62.9% 61.2% 59.5% 59.7% (5.5%) (5.7%) 0.352 (0.198) 0.250 (0.207) 61.3% 61.2% 59.5% 59.7% (6.4%) (6.1%) 58.4% (5.7%) (5.5%) (5.7%) (6.4%) (6.2%) (5.8%) (5.5%) (5.8%) 57.5% (5.4%) 0.252 (0.194) 17 20 54.9% (5.0%) 0.194 (0.180) 54.9% (5.0%) 54.9% (5.0%) 16 20 53.2% (4.5%) 0.108 (0.163) 53.2% (4.5%) 53.2% (4.6%) 1 6 15 20 50.7% (4.2%) -0.043 (0.151) 50.7% (4.2%) 50.7% (4.2%) 1 7 14 20 49.7% (3.8%) -0.040 (0.139) 49.7% (3.9%) 49.7% (3.9%) 1 8 13 20 50.5% (3.6%) -0.008 (0.130) 51.0% (3.6%) 51.0% (3.6%) 1 9 12 20 50.0% (3.3%) 0.012 (0.121) 50.4% (3.4%) 50.4% (3.4%) 1 10 11 20 49.8% (3.2%) -0.015 (0.115) 50.2% (3.2%) 50.2% (3.2%) 152 57.5% (5.4%) 61.3% 61.2% 59.7% 59.5% 59.7% 57.5% (5.4%) 4.D. MAIN (CORRECTED) SAMPLE WITHOUT YOUNG DRIVERS Table 4.D.10: Comparison of First and Second Claim Sizes for Various Bonus-Malus Classes BM # Wilcoxon test Sign test L ∼ L against classes obs. 
p L L L ≺ L L 6∼ L 1 -value 1 2 1 2 2 1 1 28 0.466 0.092 0.956 0.185 1 2 40 0.628 0.437 0.682 0.875 1 3 50 0.490 0.556 0.556 1.000 1 4 58 0.667 0.448 0.653 0.896 1 5 69 0.888 0.595 0.500 1.000 1 6 84 0.656 0.707 0.372 0.744 1 7 98 0.570 0.760 0.307 0.614 1 8 113 0.371 0.827 0.226 0.452 1 9 129 0.344 0.811 0.241 0.481 1 10 141 0.446 0.750 0.307 0.614 0.018 0.007 0.009 0.016 0.020 0.017 0.005 0.035 0.037 0.015 0.018 0.032 0.040 0.034 0.009 0.068 0.135 0.845 0.250 0.500 0.708 0.428 0.856 All 265 0.075 0.987 11 20 124 0.099 0.996 12 20 112 0.092 0.995 13 20 96 0.078 0.991 14 20 86 0.064 0.989 15 20 73 0.100 0.991 16 20 66 0.056 0.998 17 20 52 0.131 0.982 18 20 45 0.185 0.964 19 20 35 0.534 20 30 0.688 2 0.070 153 CHAPTER 4. MORAL HAZARD IN DYNAMIC INSURANCE DATA 4.E Cleaned Data Tables and gures numbered 4.E.1 and up report the results of redoing all analyses with a sample that includes only observations with consistent raw information on claims and bonus-malus classes (see Appendix 4.C). No corrections are applied to these data. Like in the main analysis, all claims observed to be withdrawn are excluded. 154 4.E. 
Table 4.E.3: Contract Exposure Durations in the Sample

Number of     Number of contracts observed
years Y       exactly Y years    between Y − 1 and Y years    Total
1                   8,097                  11,775             19,872
2                   4,296                   8,760             13,056
3                   5,487                   6,560             12,047
4                  59,183                   5,438             64,621
Total              77,063                  32,533            109,596

Table 4.E.4: Number of Contracts Observed for At Least One Full Contract Year, by Bonus-Malus Class and Number of Claims in the First Contract Year

BM       Number of contracts with
class    no claim    1 claim    2 claims    3 claims    4 claims     Total
1             363         96          19                       1       479
2             570         78          11           1                   660
3             788         66           9                               863
4           1,059         88           8                             1,155
5           1,483         94           9           1                 1,587
6           2,019        126          10           2                 2,157
7           2,754        172          16                             2,942
8           3,570        222          13                             3,805
9           4,122        204          14           2                 4,342
10          5,574        264          10           1                 5,849
11          5,197        225           8           2                 5,432
12          5,235        255          15                             5,505
13          4,777        221          10                             5,008
14          5,565        244           9                             5,818
15          5,360        262           4                             5,626
16          5,600        261          11                             5,872
17          4,997        223           5                             5,225
18          3,842        177           8                             4,027
19          3,606        183           4                             3,793
20         26,424      1,224          26           2                27,676
Total      92,905      4,685         219          11           1    97,821

Figure 4.E.1: Distribution of Contracts Observed for At Least One Full Contract Year Across Bonus-Malus Classes; and Shares of Those Contracts with At Least One and At Least Two Claims at Fault in the First Contract Year, by Bonus-Malus Class
[Figure: share against BM class (1 to 20); series: all contracts, contracts with at least 1 claim, contracts with at least 2 claims.]

Figure 4.E.2: Incentives to Avoid First, Second and Third Claim; at an Average Risk Level
[Figure: ∆V against K (1 to 20); series: ∆V(1,K,1), ∆V(1,K,2), ∆V(1,K,3).]
MORAL HAZARD IN DYNAMIC INSURANCE DATA Figure 4.E.3: Change in Incentives to Avoid a Claim after a First and a Second Claim, and Changes in Incentives to Avoid a First, a Second and a Third Claim over the Course of a Contract Year; at an Average Risk Level 4 ∆ V(1,K,2) − ∆ V(1,K,1) ∆ V(1,K,3) − ∆ V(1,K,2) ∆ V(1,K,1) − ∆ V(0,K,1) ∆ V(1,K,2) − ∆ V(0,K,2) ∆ V(1,K,3) − ∆ V(0,K,3) 3 2 ∆2 V 1 0 −1 −2 −3 −4 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 K Figure 4.E.4: Change in Incentives to Avoid a Claim after a First and a Second Claim, and Changes in Incentives to Avoid a First, a Second and a Third Claim over the Course of a Contract Year; at a Zero Risk Level unchanged 158 4.E. CLEANED DATA Figure 4.E.5: Change in Incentives to Avoid a Claim after a First and a Second Claim, and Changes in Incentives to Avoid a First, a Second and a Third Claim over the Course of a Contract Year; at a High Risk Level 2 ∆ V(1,K,2) − ∆ V(1,K,1) ∆ V(1,K,3) − ∆ V(1,K,2) ∆ V(1,K,1) − ∆ V(0,K,1) ∆ V(1,K,2) − ∆ V(0,K,2) ∆ V(1,K,3) − ∆ V(0,K,3) 1 ∆2 V 0 −1 −2 −3 −4 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 K 159 CHAPTER 4. MORAL HAZARD IN DYNAMIC INSURANCE DATA Table 4.E.5: ML Estimation of the Auxiliary Model (4.14) with 3 and 4 Support Points Three Support Points Parameter Estimate Std. Error β λ1 λ2 λ3 0.4203 0.0528 0.0268 0.0051 0.1869 0.0375 0.3023 0.0156 Tests of β=0 • LM test: 5.70, • LR test: 33.08, • Wald test: 63.47, p-value = 0.02 p-value = 0.00 p-value = 0.00 Four Support Points Parameter Estimate Std. 
Error β λ1 λ2 λ3 λ4 0.3688 0.0417 0.0000 0.0000 0.0843 0.0091 0.1992 0.0147 0.5643 0.0881 Tests of β=0 • LM test: 16.13, p-value = 0.00 • LR test: 50.10, p-value = 0.00 • Wald test: 78.23, Table 4.E.6: Nonparametric Tests Based on Comparison of Kruskal - Wallis test equal for all equal for all K K Wilcoxon test 160 p-value 0.406 0.463 p-value H1 (low K ) ∼ H1 (high K ) H2 (low K ) ∼ H2 (high K ) 0.036 Kolmogorov - Smirnov test p-value H1 (low K ) ∼ H1 (high K ) H2 (low K ) ∼ H2 (high K ) = 0.00 H1 and H2 for Dierent Bonus- Malus Classes H1 (K) H2 (K) p-value 0.336 0.433 0.054 4.E. Figure 4.E.7: Comparison of Ĥ1 CLEANED DATA with the Uniform Distribution for Low and High Bonus- Malus Classes 1 Uniform CDF Empirical H1 for low BM 1 − 10 0.9 Empirical H1 for high BM 11 − 20 0.8 0.7 H1 0.6 0.5 0.4 0.3 0.2 0.1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 t Table 4.E.7: Kolmogorov-Smirnov Test Comparing 2 with H1 for Dierent Bonus-Malus Classes H1 with the Uniform Distribution and H2 Kolmogorov - Smirnov test H1 (all K ) H1 (low K ) H1 (high K ) H2 (all K ) H2 (low K ) H2 (high K ) H2 (low K ) H2 (high K ) ∼ ∼ ∼ ∼ ∼ ∼ ∼ ∼ U nif orm U nif orm U nif orm H12 (all K ) H12 (low K ) H12 (high K ) H12 (high K ) H12 (low K ) p-value 0.012 0.219 0.082 0.542 0.109 0.660 0.215 0.746 161 CHAPTER 4. MORAL HAZARD IN DYNAMIC INSURANCE DATA Figure 4.E.8: Comparison of Ĥ1 with the Uniform Distribution and of Low and High Bonus-Malus Classes, with Ĥ1 and Ĥ2 Ĥ12 with Estimated on the Same Classes Low BM classes 1 − 10 1 Uniform CDF Empirical H1 0.9 Empirical H21 0.8 Empirical H2 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.7 0.8 0.9 1 t High BM classes 11 − 20 1 Uniform CDF Empirical H1 0.9 Empirical H21 0.8 Empirical H2 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 0.1 0.2 0.3 0.4 0.5 t 162 0.6 Ĥ2 for 4.E. CLEANED DATA Table 4.E.8: Tests Based on Comparison of First and Second Claim Durations, for Different Bonus-Malus Classes BM classes Test statistics (std. 
error) π̂n [ ln βn π̂n∗ (all) π̂n∗ (current) 57.9% (11.5%) 0.356 (0.416) 47.4% (11.5%) 47.4% (12.2%) 1 2 63.3% (9.1%) 0.577 (0.331) 56.7% (9.1%) 56.7% (9.6%) 1 3 59.0% (8.0%) 0.361 (0.290) 53.8% (8.0%) 53.8% (8.4%) 1 4 53.2% (7.3%) 0.177 (0.265) 48.9% (7.3%) 48.9% (7.6%) 1 5 53.6% (6.7%) 0.141 (0.242) 50.0% (6.7%) 51.8% (7.0%) 1 6 50.0% (6.2%) -0.042 (0.223) 47.0% (6.2%) 47.0% (6.4%) 1 7 50.0% (5.5%) 0.023 (0.200) 47.6% (5.6%) 48.8% (5.7%) 1 8 50.5% (5.1%) 0.082 (0.186) 48.4% (5.2%) 47.4% (5.3%) 1 9 50.5% (4.8%) 0.089 (0.174) 47.7% (4.8%) 46.8% (4.9%) 1 10 51.3% (4.6%) 0.113 (0.166) 48.7% (4.6%) 49.6% (4.7%) 1 All 51.6% (3.4%) 0.094 (0.123) 49.3% (3.4%) 49.3% (3.4%) 11 20 52.0% (5.0%) 0.072 (0.181) 50.0% (5.0%) 49.0% (5.1%) 12 20 52.2% (5.2%) 0.045 (0.189) 50.0% (5.2%) 50.0% (5.3%) 13 20 50.6% (5.7%) 0.088 (0.207) 49.4% (5.7%) 49.4% (5.8%) 14 20 52.2% (6.1%) 0.159 (0.222) 50.7% (6.1%) 50.7% (6.2%) 15 20 51.7% (6.6%) 0.111 (0.238) 50.0% (6.6%) 50.0% (6.6%) 16 20 50.0% (6.8%) 0.006 (0.247) 48.1% (6.8%) 48.1% (6.9%) 17 20 46.5% (7.6%) -0.165 (0.277) 44.2% (7.6%) 44.2% (7.7%) 18 20 47.4% (8.1%) -0.135 (0.294) 44.7% (8.1%) 47.4% (8.2%) 19 20 43.3% (9.1%) -0.106 (0.331) 40.0% (9.1%) 43.3% (9.2%) 20 42.3% (9.8%) -0.143 (0.356) 42.3% (9.8%) 42.3% (9.9%) 163 CHAPTER 4. MORAL HAZARD IN DYNAMIC INSURANCE DATA Table 4.E.9: Tests Based on Comparison of First and Second Claim Durations that Pool Low and High Bonus-Malus Classes BM classes low high Test statistics (std. 
error) π̂n [ ln βn π̂n∗ (all) π̂n∗ (current) 1 20 57.8% (7.5%) 0.233 (0.270) 53.3% (7.5%) 55.6% (7.5%) 1 2 20 60.7% (6.7%) 0.375 (0.242) 57.1% (6.7%) 58.9% (6.8%) 1 2 19 20 60.0% (6.5%) 0.341 (0.234) 58.3% (6.5%) 60.0% (6.5%) 1 2 18 20 57.4% (6.1%) 0.330 (0.220) 55.9% (6.1%) 57.4% (6.1%) 1 2 17 20 57.5% (5.9%) 0.334 (0.212) 56.2% (5.9%) 56.2% (5.9%) 1 3 19 20 58.0% (6.0%) 0.250 (0.218) 56.5% (6.0%) 58.0% (6.1%) 1 3 18 20 55.8% (5.7%) 0.250 (0.207) 54.5% (5.7%) 55.8% (5.8%) 1 4 17 20 53.3% (5.3%) 0.171 (0.191) 52.2% (5.3%) 52.2% (5.3%) 1 5 16 20 51.8% (4.8%) 0.069 (0.173) 50.9% (4.8%) 50.9% (4.8%) 1 6 15 20 49.2% (4.5%) -0.074 (0.163) 48.4% (4.5%) 48.4% (4.6%) 1 7 14 20 49.0% (4.1%) -0.059 (0.149) 48.3% (4.1%) 48.3% (4.2%) 1 8 13 20 50.0% (3.8%) 0.005 (0.138) 49.4% (3.9%) 49.4% (3.9%) 1 9 12 20 49.3% (3.5%) 0.027 (0.128) 48.8% (3.6%) 48.3% (3.6%) 1 10 11 20 49.8% (3.4%) 0.029 (0.123) 49.3% (3.4%) 49.3% (3.4%) 164 4.E. Table 4.E.10: CLEANED DATA Comparison of First and Second Claim Sizes for Various Bonus-Malus Classes BM # Wilcoxon test Sign test L ∼ L against classes obs. p L L L ≺ L L 6∼ L 1 -value 1 2 1 2 2 1 1 20 0.296 0.058 0.979 0.115 1 2 32 0.736 0.430 0.702 0.860 1 3 41 0.781 0.500 0.622 1.000 1 4 49 0.846 0.500 0.612 1.000 1 5 59 0.916 0.603 0.500 1.000 1 6 71 0.705 0.762 0.318 0.635 1 7 87 0.412 0.858 0.196 0.391 1 8 100 0.203 0.933 0.097 0.193 1 9 116 0.153 0.943 0.082 0.163 1 10 127 0.195 0.021 0.022 0.024 0.011 0.015 0.013 0.922 0.107 0.214 0.087 0.992 All 231 11 20 104 12 20 94 13 20 79 14 20 69 15 20 60 16 20 56 17 20 45 18 20 40 0.088 0.992 19 20 32 0.246 0.945 20 28 0.495 0.828 0.051 0.998 0.998 0.999 0.998 0.999 0.999 1.000 0.003 0.004 0.001 0.003 0.002 0.001 0.001 0.018 0.019 0.006 0.008 0.003 0.007 0.004 0.003 0.002 0.036 0.038 0.286 0.572 0.108 2 0.215 165 CHAPTER 4. 
4.F Sample Corrected Based on Initial Bonus-Malus Class

Tables and figures numbered 4.F.1 and up report the results of redoing all analyses on a sample constructed using an alternative correction for inconsistent bonus-malus classes (see Appendix 4.C). This alternative correction takes the initial bonus-malus class observed as given and constructs all further bonus-malus classes from observed claims and withdrawals data. As in the main analysis, all claims observed to be withdrawn are excluded.

Table 4.F.3: Contract Exposure Durations in the Sample

Number of     Number of contracts observed
years Y       exactly Y years    between Y − 1 and Y years    Total
1                   8,097                  11,775             19,872
2                   4,709                   9,616             14,325
3                   6,428                   7,572             14,000
4                  69,059                   6,546             75,605
Total              88,293                  35,509            123,802

Table 4.F.4: Number of Contracts Observed for At Least One Full Contract Year, by Bonus-Malus Class and Number of Claims in the First Contract Year

BM Number of contracts with class no claim 1 claim 2 claims 3 claims 4 claims 1 Total 1 475 230 45 7 2 745 126 16 1 758 3 948 115 11 1,074 4 1,295 149 13 1,457 5 1,796 162 16 1 1,975 6 2,431 261 23 2 2,717 7 3,347 306 18 3,671 8 4,227 326 22 4,575 888 9 4,842 314 15 10 6,473 351 12 2 11 6,039 330 10 12 6,003 349 17 6,369 13 5,504 287 10 5,801 14 6,452 588 14 7,054 15 6,220 297 6 6,523 16 6,432 300 13 17 5,753 253 7 6,013 18 4,426 205 10 4,641 2 3,922 210 4 20 27,832 1,375 29 2 105,162 6,534 311 18 6,837 6,381 1 19 Total 5,173 1 6,746 4,136 29,238 2 112,027
Figure 4.F.1: Distribution of Contracts Observed for At Least One Full Contract Year Across Bonus-Malus Classes; and Shares of Those Contracts with At Least One and At Least Two Claims at Fault in the First Contract Year, by Bonus-Malus Class
[Figure: share against BM class (1 to 20); series: all contracts, contracts with at least 1 claim, contracts with at least 2 claims.]

Figure 4.F.2: Incentives to Avoid First, Second and Third Claim; at an Average Risk Level
[Figure: ∆V against K (1 to 20); series: ∆V(1,K,1), ∆V(1,K,2), ∆V(1,K,3).]

Figure 4.F.3: Change in Incentives to Avoid a Claim after a First and a Second Claim, and Changes in Incentives to Avoid a First, a Second and a Third Claim over the Course of a Contract Year; at an Average Risk Level
[Figure: ∆²V against K (1 to 20); series: ∆V(1,K,2) − ∆V(1,K,1), ∆V(1,K,3) − ∆V(1,K,2), ∆V(1,K,1) − ∆V(0,K,1), ∆V(1,K,2) − ∆V(0,K,2), ∆V(1,K,3) − ∆V(0,K,3).]

Figure 4.F.4: Change in Incentives to Avoid a Claim after a First and a Second Claim, and Changes in Incentives to Avoid a First, a Second and a Third Claim over the Course of a Contract Year; at a Zero Risk Level
unchanged

Figure 4.F.5: Change in Incentives to Avoid a Claim after a First and a Second Claim, and Changes in Incentives to Avoid a First, a Second and a Third Claim over the Course of a Contract Year; at a High Risk Level
[Figure: ∆²V against K (1 to 20); series: ∆V(1,K,2) − ∆V(1,K,1), ∆V(1,K,3) − ∆V(1,K,2), ∆V(1,K,1) − ∆V(0,K,1), ∆V(1,K,2) − ∆V(0,K,2), ∆V(1,K,3) − ∆V(0,K,3).]
MORAL HAZARD IN DYNAMIC INSURANCE DATA Table 4.F.5: ML Estimation of the Auxiliary Model (4.14) with 3 and 4 Support Points Three Support Points Parameter Estimate Std. Error β λ1 λ2 λ3 0.3201 0.0435 0.0512 0.0037 0.1987 0.0240 0.3319 0.0136 Tests of β=0 • LM test: 5.47, • LR test: 36.92, • Wald test: 54.26, p-value = 0.02 p-value = 0.00 p-value = 0.00 Four Support Points Parameter Estimate Std. Error β λ1 λ2 λ3 λ4 0.3356 0.0339 0.0000 0.0000 0.0644 0.0066 0.2070 0.0117 0.3376 0.0123 Tests of β=0 • LM test: 21.79, p-value = 0.00 • LR test: 11.74, p-value = 0.00 • Wald test: 98.28, Table 4.F.6: Nonparametric Tests Based on Comparison of Kruskal - Wallis test equal for all equal for all K K Wilcoxon test 172 p-value 0.619 0.414 p-value H1 (low K ) ∼ H1 (high K ) H2 (low K ) ∼ H2 (high K ) 0.014 Kolmogorov - Smirnov test p-value H1 (low K ) ∼ H1 (high K ) H2 (low K ) ∼ H2 (high K ) = 0.00 H1 and H2 for Dierent Bonus- Malus Classes H1 (K) H2 (K) p-value 0.287 0.530 0.017 4.F. SAMPLE CORRECTED BASED ON INITIAL BONUS-MALUS CLASS Figure 4.F.7: Comparison of Ĥ1 with the Uniform Distribution for Low and High Bonus- Malus Classes 1 Uniform CDF Empirical H1 for low BM 1 − 10 0.9 Empirical H1 for high BM 11 − 20 0.8 0.7 H1 0.6 0.5 0.4 0.3 0.2 0.1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 t Table 4.F.7: Kolmogorov-Smirnov Test Comparing H2 with H12 for Dierent Bonus-Malus Classes H1 with the Uniform Distribution and Kolmogorov - Smirnov test H1 (all K ) H1 (low K ) H1 (high K ) H2 (all K ) H2 (low K ) H2 (high K ) H2 (low K ) H2 (high K ) ∼ ∼ ∼ ∼ ∼ ∼ ∼ ∼ U nif orm U nif orm U nif orm H12 (all K ) H12 (low K ) H12 (high K ) H12 (high K ) H12 (low K ) p-value 0.014 0.015 0.867 0.601 0.152 0.489 0.133 0.380 173 CHAPTER 4. 
Figure 4.F.8: Comparison of Ĥ1 with the Uniform Distribution and of Ĥ2 with Ĥ1² for Low and High Bonus-Malus Classes, with Ĥ1 and Ĥ2 Estimated on the Same Classes
[Figure: two panels, low BM classes 1–10 and high BM classes 11–20; horizontal axis t from 0 to 1; series: uniform CDF, empirical H1, empirical H1², empirical H2.]

Table 4.F.8: Tests Based on Comparison of First and Second Claim Durations, for Different Bonus-Malus Classes

              Test statistics (std. error)
BM classes    π̂n               ln β̂n             π̂n∗ (all)        π̂n∗ (current)
1             62.2% (7.5%)      0.488 (0.270)    55.6% (7.5%)     64.4% (7.9%)
1–2           62.3% (6.4%)      0.418 (0.232)    57.4% (6.4%)     59.0% (6.8%)
1–3           61.1% (5.9%)      0.372 (0.214)    56.9% (5.9%)     59.7% (6.2%)
1–4           57.6% (5.4%)      0.326 (0.197)    54.1% (5.4%)     55.3% (5.7%)
1–5           56.4% (5.0%)      0.263 (0.180)    53.5% (5.0%)     56.4% (5.2%)
1–6           53.2% (4.5%)      0.082 (0.163)    50.0% (4.5%)     52.4% (4.7%)
1–7           52.8% (4.2%)      0.099 (0.152)    50.0% (4.2%)     53.5% (4.3%)
1–8           52.4% (3.9%)      0.107 (0.142)    50.0% (3.9%)     53.7% (4.0%)
1–9           52.0% (3.7%)      0.104 (0.136)    49.2% (3.8%)     54.2% (3.8%)
1–10          52.9% (3.6%)      0.123 (0.131)    50.3% (3.7%)     53.4% (3.7%)
All           52.4% (2.8%)      0.108 (0.103)    50.5% (2.9%)     50.5% (2.9%)
11–20         51.7% (4.6%)      0.084 (0.166)    50.8% (4.6%)     50.0% (4.6%)
12–20         51.8% (4.8%)      0.040 (0.173)    50.9% (4.8%)     50.0% (4.8%)
13–20         50.5% (5.2%)      0.046 (0.188)    50.5% (5.2%)     49.5% (5.2%)
14–20         51.8% (5.5%)      0.098 (0.199)    51.8% (5.5%)     50.6% (5.5%)
15–20         50.7% (6.0%)      0.091 (0.218)    49.3% (6.0%)     49.3% (6.1%)
16–20         47.6% (6.3%)     -0.053 (0.229)    46.0% (6.3%)     46.0% (6.4%)
17–20         44.0% (7.1%)     -0.207 (0.257)    42.0% (7.1%)     42.0% (7.1%)
18–20         46.5% (7.6%)     -0.126 (0.277)    44.2% (7.6%)     46.5% (7.7%)
19–20         42.4% (8.7%)     -0.082 (0.316)    39.4% (8.7%)     42.4% (8.8%)
20            41.4% (9.3%)     -0.112 (0.337)    41.4% (9.3%)     44.8% (9.3%)
MORAL HAZARD IN DYNAMIC INSURANCE DATA Table 4.F.9: Tests Based on Comparison of First and Second Claim Durations that Pool Low and High Bonus-Malus Classes BM classes low high Test statistics (std. error) π̂n 60.8% 61.1% 60.6% 58.7% 59.5% 60.0% 58.3% 1 20 1 2 20 1 2 19 20 1 2 18 20 1 2 17 20 1 3 19 20 1 3 18 20 1 4 17 20 1 5 16 20 54.9% (3.9%) 1 6 15 20 51.8% (3.6%) 1 7 14 20 1 8 13 20 1 9 1 10 176 (5.8%) (5.3%) (5.2%) (4.9%) (4.7%) (4.9%) (4.7%) 57.0% (4.3%) [ ln βn π̂n∗ (all) 0.341 (0.211) 56.8% (5.8%) (0.191) 57.8% (5.3%) 0.300 (0.187) 58.5% (5.2%) (0.178) 56.7% (4.9%) (0.172) 57.7% (4.8%) 0.319 0.297 0.323 0.28 0.282 0.281 (0.177) (0.169) 58.1% (4.9%) 56.5% (4.7%) π̂n∗ (current) 56.8% (5.9%) 60.0% 59.6% 58.7% 59.0% 58.3% (5.4%) (5.2%) (5.0%) 57.7% (4.8%) (5.0%) (4.7%) (0.156) 55.6% (4.3%) 56.3% (4.4%) 0.183 (0.142) 53.7% (3.9%) 53.7% (4.0%) 0.020 (0.131) 50.3% (3.6%) 50.3% (3.7%) 51.1% (3.3%) 0.026 (0.121) 49.3% (3.4%) 49.3% (3.4%) 51.4% (3.1%) 0.052 (0.113) 49.8% (3.2%) 49.8% (3.2%) 12 20 50.5% (2.9%) 0.049 (0.107) 49.1% (3.0%) 49.5% (3.0%) 11 20 51.1% (2.8%) 0.043 (0.103) 49.8% (2.9%) 49.8% (2.9%) 4.F. SAMPLE CORRECTED BASED ON INITIAL BONUS-MALUS CLASS Table 4.F.10: Comparison of First and Second Claim Sizes for Various Bonus-Malus Classes BM # Wilcoxon test Sign test L ∼ L against classes obs. 
p L L L ≺ L L 6∼ L 1 -value 1 2 1 2 2 1 1 53 0.780 0.392 0.708 0.784 1 2 70 0.706 0.640 0.452 0.905 1 3 81 0.596 0.672 0.412 0.824 1 4 94 0.612 0.697 0.379 0.757 1 5 111 0.897 0.776 0.285 0.569 1 6 136 0.246 0.928 0.099 0.198 1 7 154 0.201 0.937 0.085 0.171 1 8 176 0.104 0.979 1 9 193 0.086 0.978 1 10 206 0.148 0.019 0.030 0.030 0.017 0.031 0.020 0.049 0.959 All 331 11 20 125 12 20 113 13 20 96 14 20 86 15 20 72 16 20 66 17 20 52 18 20 45 0.068 0.982 19 20 35 0.238 0.912 20 31 0.493 0.763 0.060 0.998 0.994 0.998 0.995 0.997 0.997 0.999 0.991 0.030 0.030 0.003 0.010 0.004 0.009 0.006 0.006 0.002 0.018 0.036 2 0.059 0.061 0.054 0.109 0.006 0.020 0.008 0.018 0.013 0.013 0.004 0.036 0.155 0.311 0.360 0.720 0.072 177 CHAPTER 4. MORAL HAZARD IN DYNAMIC INSURANCE DATA 4.G Main (Corrected) Sample Including Withdrawn Claims Tables and gures numbered 4.G.1 and up report the results of redoing all analyses using the contract and bonus-malus information from the main (corrected) sample, but including all withdrawn claims as claim-at-fault events (see Appendix 4.C). 178 4.G. 
Table 4.G.3: Contract Exposure Durations in the Sample

Number of     Number of contracts observed
years Y       exactly Y years    between Y − 1 and Y years    Total
1                   8,097                  11,775             19,872
2                   4,709                   9,616             14,325
3                   6,261                   7,385             13,646
4                  68,808                   6,501             75,309
Total              87,875                  35,277            123,152

Table 4.G.4: Number of Contracts Observed for At Least One Full Contract Year, by Bonus-Malus Class and Number of Claims in the First Contract Year

BM Number of contracts with class no claim 1 claim 2 claims 3 claims 4 claims 1 Total 1 562 119 24 4 2 746 95 11 1 710 3 957 83 11 4 1,308 100 9 5 1,876 112 13 1 2,002 6 2,509 162 14 2 2,687 7 3,360 209 17 3,586 8 4,232 272 16 4,520 853 1,051 1,417 9 4,882 252 16 10 6,486 298 11 2 11 6,056 282 12 12 6,002 287 16 6,305 13 5,874 269 11 6,154 14 6,663 313 13 6,989 15 6,161 302 9 6,472 16 6,375 298 14 17 5,669 252 7 5,928 18 4,366 207 11 4,584 2 3,854 214 7 20 27,651 1,374 29 2 105,589 5,500 271 15 6,796 6,352 1 19 Total 5,152 1 6,688 4,075 29,056 2 111,377

Figure 4.G.1: Distribution of Contracts Observed for At Least One Full Contract Year Across Bonus-Malus Classes; and Shares of Those Contracts with At Least One and At Least Two Claims at Fault in the First Contract Year, by Bonus-Malus Class
[Figure: share against BM class (1 to 20); series: all contracts, contracts with at least 1 claim, contracts with at least 2 claims.]

Figure 4.G.2: Incentives to Avoid First, Second and Third Claim; at an Average Risk Level
[Figure: ∆V against K (1 to 20); series: ∆V(1,K,1), ∆V(1,K,2), ∆V(1,K,3).]
MORAL HAZARD IN DYNAMIC INSURANCE DATA Figure 4.G.3: Change in Incentives to Avoid a Claim after a First and a Second Claim, and Changes in Incentives to Avoid a First, a Second and a Third Claim over the Course of a Contract Year; at an Average Risk Level 4 ∆ V(1,K,2) − ∆ V(1,K,1) ∆ V(1,K,3) − ∆ V(1,K,2) ∆ V(1,K,1) − ∆ V(0,K,1) ∆ V(1,K,2) − ∆ V(0,K,2) ∆ V(1,K,3) − ∆ V(0,K,3) 3 2 ∆2 V 1 0 −1 −2 −3 −4 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 K Figure 4.G.4: Change in Incentives to Avoid a Claim after a First and a Second Claim, and Changes in Incentives to Avoid a First, a Second and a Third Claim over the Course of a Contract Year; at a Zero Risk Level unchanged 182 4.G. MAIN (CORRECTED) SAMPLE INCLUDING WITHDRAWN CLAIMS Figure 4.G.5: Change in Incentives to Avoid a Claim after a First and a Second Claim, and Changes in Incentives to Avoid a First, a Second and a Third Claim over the Course of a Contract Year; at a High Risk Level 3 ∆ V(1,K,2) − ∆ V(1,K,1) ∆ V(1,K,3) − ∆ V(1,K,2) ∆ V(1,K,1) − ∆ V(0,K,1) ∆ V(1,K,2) − ∆ V(0,K,2) ∆ V(1,K,3) − ∆ V(0,K,3) 2 1 ∆2 V 0 −1 −2 −3 −4 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 K 183 CHAPTER 4. MORAL HAZARD IN DYNAMIC INSURANCE DATA Table 4.G.5: ML Estimation of the Auxiliary Model (4.14) with 3 and 4 Support Points Three Support Points Parameter Estimate Std. Error β λ1 λ2 λ3 0.4710 0.0353 0.0435 0.0037 0.2623 0.0110 0.3560 0.0126 Tests of β=0 • LM test: 9.22, • LR test: 73.61, • Wald test: 178.15, p-value p-value = 0.00 = 0.00 p-value = 0.00 Four Support Points Parameter Estimate Std. 
Error β λ1 λ2 λ3 λ4 Table 4.G.6: 0.4692 0.0330 0.0396 0.0036 0.2481 0.0150 0.2672 0.0105 0.3611 0.0130 Tests of β=0 • LM test: 229.96, • LR test: 48.01, • Wald test: 201.78, Nonparametric Tests Based on Comparison of H1 Bonus-Malus Classes Kruskal - Wallis test H1 (K) H2 (K) equal for all equal for all K K Wilcoxon test 0.667 0.247 p-value H1 (low K ) ∼ H1 (high K ) H2 (low K ) ∼ H2 (high K ) 0.008 Kolmogorov - Smirnov test p-value H1 (low K ) ∼ H1 (high K ) H2 (low K ) ∼ H2 (high K ) 184 p-value 0.504 0.528 0.003 p-value p-value and = 0.00 = 0.00 p-value H2 = 0.00 for Dierent 4.G. MAIN (CORRECTED) SAMPLE INCLUDING WITHDRAWN CLAIMS Figure 4.G.7: Comparison of Ĥ1 with the Uniform Distribution for Low and High Bonus- Malus Classes 1 Uniform CDF Empirical H1 for low BM 1 − 10 0.9 Empirical H1 for high BM 11 − 20 0.8 0.7 H1 0.6 0.5 0.4 0.3 0.2 0.1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 t Table 4.G.7: Kolmogorov-Smirnov Test Comparing H2 with H12 for Dierent Bonus-Malus Classes H1 with the Uniform Distribution and Kolmogorov - Smirnov test H1 (all K ) H1 (low K ) H1 (high K ) H2 (all K ) H2 (low K ) H2 (high K ) H2 (low K ) H2 (high K ) ∼ ∼ ∼ ∼ ∼ ∼ ∼ ∼ U nif orm U nif orm U nif orm H12 (all K ) H12 (low K ) H12 (high K ) H12 (high K ) H12 (low K ) p-value 0.002 0.011 0.052 0.545 0.059 0.389 0.127 0.658 185 CHAPTER 4. MORAL HAZARD IN DYNAMIC INSURANCE DATA Figure 4.G.8: Comparison of Ĥ1 with the Uniform Distribution and of Low and High Bonus-Malus Classes, with Ĥ1 and Ĥ2 Ĥ12 with Estimated on the Same Classes Low BM classes 1 − 10 1 Uniform CDF Empirical H1 0.9 Empirical H21 0.8 Empirical H2 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.7 0.8 0.9 1 t High BM classes 11 − 20 1 Uniform CDF Empirical H1 0.9 Empirical H21 0.8 Empirical H2 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 0.1 0.2 0.3 0.4 0.5 t 186 0.6 Ĥ2 for 4.G. 
Table 4.G.8: Tests Based on Comparison of First and Second Claim Durations, for Different Bonus-Malus Classes

Test statistics (std. error)
BM classes   π̂n              ln β̂n            π̂n* (all)       π̂n* (current)
1            62.5% (10.2%)    0.392 (0.370)    54.2% (10.2%)   54.2% (10.9%)
1–2          65.7% (8.5%)     0.570 (0.307)    60.0% (8.5%)    60.0% (8.9%)
1–3          60.9% (7.4%)     0.346 (0.267)    56.5% (7.4%)    56.5% (7.7%)
1–4          54.5% (6.7%)     0.183 (0.245)    50.9% (6.8%)    52.7% (7.0%)
1–5          54.4% (6.1%)     0.157 (0.220)    51.5% (6.1%)    54.4% (6.3%)
1–6          53.7% (5.5%)     0.029 (0.200)    50.0% (5.5%)    53.7% (5.7%)
1–7          53.5% (5.0%)     0.080 (0.182)    50.5% (5.1%)    53.5% (5.2%)
1–8          53.0% (4.7%)     0.080 (0.169)    50.4% (4.7%)    53.0% (4.8%)
1–9          51.9% (4.4%)     0.064 (0.158)    48.9% (4.4%)    50.4% (4.5%)
1–10         52.8% (4.2%)     0.089 (0.152)    50.0% (4.2%)    50.7% (4.3%)
All          52.8% (3.0%)     0.090 (0.110)    50.2% (3.1%)    50.2% (3.1%)
11–20        52.7% (4.4%)     0.091 (0.160)    50.4% (4.4%)    51.2% (4.5%)
12–20        52.1% (4.6%)     0.020 (0.168)    49.6% (4.7%)    50.4% (4.7%)
13–20        51.5% (5.0%)     0.049 (0.180)    49.5% (5.0%)    50.5% (5.0%)
14–20        53.3% (5.3%)     0.104 (0.191)    51.1% (5.3%)    52.2% (5.3%)
15–20        51.9% (5.7%)     0.072 (0.207)    49.4% (5.7%)    50.6% (5.8%)
16–20        48.5% (6.1%)    -0.074 (0.220)    47.1% (6.1%)    47.1% (6.1%)
17–20        44.4% (6.8%)    -0.250 (0.247)    42.6% (6.8%)    42.6% (6.9%)
18–20        46.8% (7.3%)    -0.182 (0.265)    44.7% (7.3%)    46.8% (7.4%)
19–20        41.7% (8.3%)    -0.210 (0.302)    38.9% (8.4%)    41.7% (8.4%)
20           41.4% (9.3%)    -0.112 (0.337)    41.4% (9.3%)    44.8% (9.3%)

Table 4.G.9: Tests Based on Comparison of First and Second Claim Durations that Pool Low and High Bonus-Malus Classes

Test statistics (std. error)
BM classes
low     high     π̂n             ln β̂n            π̂n* (all)      π̂n* (current)
1       20       60.4% (6.9%)    0.239 (0.249)    56.6% (6.9%)   56.6% (6.9%)
1–2     20       62.5% (6.3%)    0.363 (0.227)    59.4% (6.3%)   59.4% (6.3%)
1–2     19–20    62.0% (5.9%)    0.387 (0.215)    60.6% (6.0%)   60.6% (6.0%)
1–2     18–20    58.5% (5.5%)    0.348 (0.200)    57.3% (5.5%)   58.5% (5.6%)
1–2     17–20    59.6% (5.3%)    0.376 (0.192)    58.4% (5.3%)   58.4% (5.4%)
1–3     19–20    59.8% (5.5%)    0.286 (0.200)    58.5% (5.5%)   58.5% (5.6%)
1–3     18–20    57.0% (5.2%)    0.263 (0.188)    55.9% (5.2%)   57.0% (5.3%)
1–4     17–20    55.0% (4.8%)    0.216 (0.174)    54.1% (4.8%)   54.1% (4.9%)
1–5     16–20    52.9% (4.3%)    0.116 (0.156)    52.2% (4.3%)   52.2% (4.4%)
1–6     15–20    50.9% (4.0%)   -0.020 (0.144)    50.3% (4.0%)   49.7% (4.0%)
1–7     14–20    50.3% (3.6%)   -0.008 (0.132)    49.7% (3.7%)   49.7% (3.7%)
1–8     13–20    50.9% (3.4%)    0.019 (0.123)    50.5% (3.4%)   50.0% (3.5%)
1–9     12–20    50.0% (3.2%)    0.025 (0.115)    49.6% (3.2%)   49.2% (3.2%)
1–10    11–20    50.2% (3.0%)    0.003 (0.110)    49.8% (3.1%)   49.8% (3.1%)

Table 4.G.10: Comparison of First and Second Claim Sizes for Various Bonus-Malus Classes

                        Wilcoxon test   Sign test: L1 ∼ L2 against
BM classes   # obs.     p-value         L1 ≻ L2    L1 ≺ L2    L1 ≁ L2
1            29         0.336           0.068      0.969      0.136
1–2          41         0.791           0.378      0.734      0.755
1–3          52         0.649           0.445      0.661      0.890
1–4          61         0.815           0.399      0.696      0.798
1–5          75         0.941           0.500      0.591      1.000
1–6          91         0.755           0.662      0.417      0.834
1–7          108        0.413           0.807      0.250      0.501
1–8          124        0.195           0.911      0.121      0.243
1–9          142        0.121           0.923      0.104      0.208
1–10         154        0.174           0.887      0.147      0.295
All          288        †               0.994      0.008      0.016
11–20        134        †               0.994      0.010      0.019
12–20        120        †               0.996      0.007      0.013
13–20        104        †               0.993      0.012      0.024
14–20        93         †               0.994      0.011      0.022
15–20        80         †               0.995      0.009      0.018
16–20        71         †               0.998      0.004      0.009
17–20        56         0.119           0.978      0.041      0.081
18–20        49         0.171           0.957      0.076      0.152
19–20        38         0.335           0.872      0.209      0.418
20           31         0.493           0.763      0.360      0.720
† The Wilcoxon p-values for these seven rows are 0.021, 0.050, 0.042, 0.035, 0.043, 0.070 and 0.068; their assignment to individual rows is not recoverable.

5 Conclusion

In this chapter we will summarize the key empirical results from the previous chapters and integrate them into one consistent conclusion.
Then, we will outline new ideas for future research, providing some support from the data.

5.1 Integration of the Results

In Chapter 2, we tested for general asymmetric information using the conditional-correlation approach. Our tests focused mainly on adverse selection and ex ante moral hazard, by considering only the third-party claims, which are very likely to be reported to the insurer. Using cross-sectional data, which covered one particular contract year, we did not find any evidence of asymmetric information. In Chapter 4, we tested for general moral hazard using structural dynamic econometric methods and longitudinal data covering multiple contract years. We found evidence of moral hazard and concluded that part of this moral hazard is due to ex post moral hazard. Taken at face value, these results imply that there is ex post moral hazard, but little or no ex ante moral hazard and no adverse selection in our data.

However, the results of Chapter 2 should be interpreted with some care. First, the tests in that chapter use only cross-sectional data on a single contract year and, therefore, may not be very powerful. Second, as argued by de Meza and Webb (2001), the conditional-correlation method is fragile to certain types of multidimensional asymmetric information. Notably, if there is asymmetric information on risk preferences and moral hazard, and more risk-averse drivers tend to buy more insurance and drive more cautiously, the conditional correlation between insurance coverage and occurrence of claims can even be negative. Third, the analysis in that chapter used only third-party claims. If asymmetric information is particularly relevant to accidents involving only one car, we cannot detect it.

Chapter 4 provides a more advanced analysis, which explores dynamic features of the data, such as the timing of claims, in relation to contract dynamics, notably experience rating.
Our full sample covers data over a period of (up to) four contract years. This is important, because previous empirical studies using data on only one contract year failed to detect asymmetric information, and particularly moral hazard.¹ Further, we also take into account all claims at fault, including those involving only one car. This is important, because ignoring accidents involving only one car can obscure the presence of moral hazard. Under moral hazard, fully covered agents, who started driving less carefully because of decreased incentives, are more likely to cause both accidents with third-party damages and accidents where only their own car is damaged. Finally, our analysis is robust to general selection on unobservables (including de Meza and Webb's (2001) advantageous selection), since we fully control for the (unobserved) heterogeneity in agents' risks. This is not the case for the analysis in Chapter 2. Therefore we have good reasons to believe that the results from Chapter 4 are more robust than the results from Chapter 2. In particular, it is quite possible that there exists also ex ante moral hazard, not just ex post moral hazard.

¹ Abbring et al. (2003) did not find evidence of moral hazard when using dynamic methods and longitudinal data covering only one contract year. Two other studies (Israel, 2004, Dionne et al., 2005) applied their tests on longer panels and found evidence of moral hazard.

5.2 Dynamic Contract Choice

Chapter 2 explores the cross-sectional relation between contract choice and risk in a single contract year. Chapter 4 studies risk dynamics, but ignores contract choice. Thus, the thesis so far does not exploit a potentially rich source of variation: dynamic contract choice. There are (at least) four reasons to extend the analysis by including dynamic contract selection.

First, such an analysis is informative on the importance of adverse selection.
In particular, one may argue that young, inexperienced drivers cannot assess their driving ability any better than insurers can. This seems to suggest that adverse selection is not relevant for car-insurance markets. However, even if agents and insurers are symmetrically informed initially, asymmetric information on risk may arise in the course of an insurance relationship due to asymmetric learning.² That is, drivers learn faster about their risk than insurers do, for example because they accumulate information on near-accidents and small accidents that they do not report to the insurer. In turn, asymmetric learning may affect dynamic contract selection. Thus, a dynamic form of adverse selection arises. It is therefore of interest to investigate to what extent an analysis of claims and contract-choice processes, based on appropriate dynamic economic models, is informative on asymmetric learning.

² See Subsection 2.4.4.2 for more discussion.

Second, any such analysis should also account for symmetric learning, in relation to experience rating. This will provide a more rigorous foundation for the conditional-correlation analysis in Chapter 2, which conditioned on the bonus-malus class to control for symmetric learning.

Third, dynamic contract selection cannot be ignored in a dynamic study of moral hazard if it is endogenous to the claims process under investigation. An extreme example is the problem of nonignorable attrition that arises, in particular, if multiyear panels are used. Such problems may arise even if contract choice is only affected by events that are symmetrically observed by agents and insurers. For example, agents and insurers can be expected to symmetrically learn from accidents that are reported to the insurer. In turn, this may affect contract choice, or even the decision to terminate a contract. A claim at fault may also affect contract choice and termination through its effect on the premium.
A joint analysis of claims and contract-selection processes can shed light on these dynamic selection effects. This would take the analysis beyond the usual independent censoring assumption in event-history analysis, which we have also maintained in Chapter 4.

Fourth, dynamic contract-choice data may enhance the analysis of asymmetric information by distinguishing learning effects from the financial-incentive effects of claims through the premium. In Chapter 4 we have already addressed this issue in the context of moral hazard. We exploited the fact that learning is probably more important for young drivers than for senior ones, and investigated the robustness of our results on moral hazard to possible learning effects; see Appendix 4.D. Additional information on learning from dynamic contract choice would allow us to further enhance our inference on moral hazard.

Dynamic contract selection in car insurance was recently studied by Dionne et al. (2006), who jointly analyze dynamic contract choices and accidents under experience rating. They propose a causality test that separates moral hazard from learning and adverse selection. Using three years of French longitudinal survey data with dynamic information on both claims and accidents, they estimate a dynamic panel data model and find some evidence of moral hazard among policyholders with significant driving experience (5–15 years).

We intend to build on Dionne et al.'s novel approach to the dynamic empirical analysis of insurance markets by amending and/or extending Chapter 4's structural event-history methods to include learning and dynamic contract choice. By using a more structured and continuous-time approach, we can further clarify the dynamic relations between risk, learning and contract choice in the data. It allows us to fully exploit the continuous-time variation in our data, which is relatively rich compared to the annual discrete-time panel data used by Dionne et al., and develop more powerful tests. As we will see in the next section, many dynamic contract changes may be triggered by more or less external events, such as the purchase of a new car. Therefore, our analysis should control for such external events, in order to enhance our understanding of contract changes that are truly endogenous, such as those related to learning about risk.

Obviously, the empirical analysis of dynamic contract choices would be impossible if we did not have data on such dynamics. Therefore, we finish this thesis with a brief exploration of contract-choice dynamics in our data.

5.3 Observed Contract Dynamics

In this section, we explore the feasibility of the analysis of dynamic contract choice with the Dutch data by providing some evidence that agents change their coverage in the course of time in these data. We will distinguish, as in Chapter 2, only two levels of coverage: basic (liability) coverage and full (comprehensive) coverage.³

In the raw data we observe 29,979 changes in coverage, of which 16,324 are from basic coverage to full coverage (further called 0-1 change) and 13,655 from full coverage to basic coverage (further called 1-0 change). These correspond to 16,056 unique contracts with a 0-1 change and 13,424 unique contracts with a 1-0 change. Obviously, some contracts experience multiple changes in coverage across the whole observed period: 1 contract has as many as 9 changes, 2 contracts have 6 changes, 3 contracts 5 changes, 55 contracts 4 changes, 365 contracts 3 changes, 4,293 contracts 2 changes, 20,042 contracts 1 change, and 116,038 contracts have no change in coverage across the whole observed period.

Table 5.1 gives the number of changes in coverage that take place at the renewal date and during the contract year. We can see that most 1-0 changes occur at the renewal date.
This can be explained by the fact that policy rules allow agents to cancel or reduce their insurance coverage only at the contract renewal date, except in the cases of emigration, death of the insured or change of the insured object. On the other hand, most of the 0-1 changes happen in the course of the contract year, which can be explained by the fact that policy rules give no timing restrictions on extension of the coverage. These changes can be explained by aging of an insured car (1-0 change), purchase of a new car (0-1 change), or by learning: an agent might want to change his coverage once he learns more about his risk.

³ See Appendix 4.C, footnote 22 for more details.

It is easy to determine how many changes in coverage correspond to a change of car, since each car in the Netherlands has a unique license number, which we directly observe in the data. Thus any change of license number in a contract corresponds to a change of insured object. Table 5.2 gives the number of changes in coverage with regard to a change of car.⁴ From this table we can see that most of the 0-1 changes indeed correspond to a change of car. Agents buying a new car usually also buy extra (full) insurance. On the other hand, the 1-0 changes are mostly not related to a change of car. They can correspond to learning or aging of insured cars.

Table 5.3 gives the age distribution of cars with respect to their coverage.⁵ From this table we can see that most of the new cars have full (CASCO) insurance. Older cars usually have restricted comprehensive coverage (Mini-CASCO) and very old cars only liability coverage (LI). Clearly, some, but not all, of the 1-0 variation can indeed be explained by aging of cars.

Our preliminary exploration of the data suggests that there is a lot of contract-choice dynamics in the data.
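The classification of coverage changes described above (direction of the change, timing relative to the renewal date, and whether the license number changed) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout with the keys `contract`, `day`, `coverage`, `license` and `is_renewal` is hypothetical and is not the insurer's actual data format, and the toy records are invented. Coverage is coded 0 for basic and 1 for full, as in the text.

```python
from collections import Counter


def classify_changes(records):
    """Count coverage changes by direction, timing and change of car.

    `records` is an iterable of dicts with hypothetical keys:
    contract (id), day (sortable), coverage (0 = basic, 1 = full),
    license (license number), is_renewal (True if the record starts
    at the contract renewal date).
    """
    by_contract = {}
    for rec in sorted(records, key=lambda r: (r["contract"], r["day"])):
        by_contract.setdefault(rec["contract"], []).append(rec)

    counts = Counter()
    for spells in by_contract.values():
        # Compare each record with the next one of the same contract.
        for prev, cur in zip(spells, spells[1:]):
            if prev["coverage"] == cur["coverage"]:
                continue  # no change in coverage
            direction = f'{prev["coverage"]}-{cur["coverage"]}'
            timing = "renewal" if cur["is_renewal"] else "during year"
            car = "new car" if prev["license"] != cur["license"] else "same car"
            counts[(direction, timing, car)] += 1
    return counts


# Toy data, invented for illustration only.
toy = [
    {"contract": 1, "day": 0, "coverage": 0, "license": "AA-01", "is_renewal": True},
    {"contract": 1, "day": 120, "coverage": 1, "license": "BB-02", "is_renewal": False},
    {"contract": 2, "day": 0, "coverage": 1, "license": "CC-03", "is_renewal": True},
    {"contract": 2, "day": 365, "coverage": 0, "license": "CC-03", "is_renewal": True},
]
result = classify_changes(toy)
```

In the toy data, contract 1 extends its coverage during the year together with a change of car, and contract 2 reduces its coverage at renewal while keeping the same car, which mirrors the two dominant patterns reported in Tables 5.1 and 5.2.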
Many changes in coverage are due to changes of car (purchase of full insurance) or are to a great extent explained by aging of cars (cancellation of full insurance), but there is ample unexplained variation that may be related to learning and other aspects of our future research.

⁴ We had to exclude 140 contracts with a missing license number of the insured car.
⁵ This table includes only cars aged 0 to 20 years. In the sample, we also observe a few older cars, the oldest one being 65 years old.

Table 5.1: Number of Changes in Coverage at Renewal Date and During Contract Year

Change in    Number of changes
coverage     at renewal date    during contract year    Total
0-1          314                16,010                  16,324
1-0          11,166             2,489                   13,655
Total        11,480             18,499                  29,979

Table 5.2: Number of Changes in Coverage Related to Change of Car

             Change of car: YES                      Change of car: NO
Change in    at renewal date   during con. year      at renewal date   during con. year
coverage
0-1          260               15,432                53                526
1-0          131               1,749                 11,032            725

Table 5.3: Age Distribution of Cars with Respect to Coverage

             Number of cars with
Age of car   CASCO             Mini-CASCO        LI only           Total # cars
0            27,523 (98.2%)    271 (1.0%)        309 (1.1%)        28,038
1            34,240 (97.4%)    546 (1.6%)        532 (1.5%)        35,153
2            36,620 (95.5%)    1,289 (3.4%)      888 (2.3%)        38,347
3            36,286 (90.7%)    3,080 (7.7%)      1,427 (3.6%)      40,028
4            35,225 (82.0%)    6,497 (15.1%)     2,431 (5.7%)      42,956
5            31,012 (71.0%)    10,739 (24.6%)    3,622 (8.3%)      43,698
6            25,899 (58.4%)    15,324 (34.5%)    5,043 (11.4%)     44,367
7            20,331 (45.4%)    19,493 (43.5%)    6,905 (15.4%)     44,814
8            15,008 (33.6%)    22,366 (50.1%)    9,110 (20.4%)     44,667
9            10,596 (23.2%)    24,159 (52.9%)    12,502 (27.4%)    45,642
10           6,994 (15.5%)     23,319 (51.6%)    16,132 (35.7%)    45,183
11           4,250 (10.3%)     19,231 (46.6%)    18,701 (45.3%)    41,288
12           2,455 (7.0%)      14,069 (40.4%)    18,882 (54.2%)    34,867
13           1,411 (5.2%)      9,363 (34.8%)     16,506 (61.3%)    26,931
14           707 (3.9%)        5,335 (29.8%)     12,028 (67.2%)    17,904
15           365 (3.4%)        2,789 (26.3%)     7,533 (71.0%)     10,606
16           237 (3.9%)        1,449 (23.8%)     4,428 (72.8%)     6,081
17           173 (4.6%)        860 (22.9%)       2,751 (73.1%)     3,761
18           126 (5.4%)        548 (23.3%)       1,695 (72.1%)     2,351
19           95 (6.0%)         390 (24.7%)       1,104 (69.8%)     1,581
20           66 (6.1%)         271 (25.0%)       753 (69.4%)       1,085
Note: Due to overlapping in coverage, the total number of cars is smaller than the sum of columns.

Dynamic Econometric Analysis of Insurance Markets with Imperfect Information

Summary (Samenvatting)

Asymmetric Information in Insurance

Most people dislike income risk. Fortunately, they can reduce such risks by sharing them with others. In modern economies, this kind of risk sharing is offered in the form of insurance. Life insurance, for example, protects against the income risk attached to an unexpectedly short or long life. For each individual separately this longevity risk is considerable, but the average mortality pattern in a large group of insured persons can be predicted well. An insurer can therefore let its life-insurance customers share their longevity risk without them ever having to meet.

In such a modern, anonymous insurance market, there is a good chance that the insured can hide information that is relevant to the insurance. Insurers and insured are then 'asymmetrically informed'. Asymmetric information can take two forms. First, potential customers may be able to assess their risk better than the insurer can. This can lead to adverse selection on risk: customers who face a relatively high risk compared to other customers with the same risk factors observed by the insurer will buy relatively more insurance. Second, the insurer often has no full control over the risk behavior of the customer. In that case moral hazard arises: customers with better coverage behave more riskily. Insurers adapt their insurance offers to such asymmetric-information problems.
Car insurance, for example, usually has a deductible. It then does not offer full insurance against car damage. The motorist remains partly responsible for the consequences of his driving behavior; moral hazard is limited at the expense of risk sharing. Most insurers furthermore offer a menu of contracts, often with a choice between different deductibles. This menu can be designed so that customers with different hidden characteristics choose different contracts. The choice from the menu then reveals the hidden characteristics of the customer. To some extent, this allows the insurer to tailor its insurance to the hidden characteristics of the customer.

If there is no asymmetric information, insurance markets can provide risk sharing in an efficient way. Asymmetric information, however, can distort the functioning of the market. This can be a reason for government intervention in insurance markets; it explains the strong interest of economists in the asymmetric-information problem. It is therefore important to establish whether asymmetric information really plays a role in practice and, if so, in what form. Research on this kind of problem using data from insurance practice has taken off in the last decade; see Chiappori and Salanié (2003) for a survey.

Dutch Car Insurance and Data

In this PhD project we use rich administrative data from a large Dutch car insurer. Our longitudinal data cover a period of 5 years and contain full information on the insured (age, sex, address), their cars (make, price, engine displacement, power, etc.), policies (coverage, premium, deductible, etc.) and claims (type, damage, impact).

The Dutch insurer uses the experience-rating scheme given in Table 2.1. Premiums are revised annually on the basis of claim behavior. The premium in the next policy year depends on the bonus-malus (BM) class and the number of claims at fault in the previous policy year. In low and medium BM classes, a year without claims yields a premium reduction (bonus). On the other hand, each claim for which the insured is at fault leads to an increase (malus) at the next premium revision, except in the highest and the lowest BM classes.

An important problem with all administrative car-insurance data is that only claims, and not losses, are recorded. If the insured have some freedom to choose whether to report a loss to the insurer, then claims and losses are not the same. In general, the insured will not report small losses, because the equally small compensation is not worth a premium increase. This is called ex post moral hazard. Naturally, it is optimal to claim only losses that exceed a certain threshold, which is higher than the deductible. The effect of insurance on the loss risk itself is called ex ante moral hazard. The distinction between ex ante and ex post moral hazard is of economic importance. Under ex ante moral hazard, insurance affects the loss risk itself; ex post moral hazard affects only the division of risk between the customer and the insurer, and not the risk itself. The welfare consequences of the two forms of moral hazard are therefore radically different.

Our car-insurance data are particularly interesting because they provide direct information on ex post moral hazard. The Dutch insurer offers its customers the option to withdraw a claim within 6 months and thereby avoid a premium increase. This generates information on both forms of moral hazard. Suppose that losses must be reported within a certain period to qualify for compensation, that reporting losses is costless, and that establishing the exact loss amount takes longer than the reporting period. Then all losses are reported to the insurer in order to keep the option of compensation open, and ex post moral hazard manifests itself entirely in the withdrawal of claims when the loss eventually turns out to be too low. Both ex post and ex ante moral hazard can then be studied directly. In practice, the number of withdrawn claims will underestimate ex post moral hazard, and an analysis of the reported losses will overestimate ex ante moral hazard. A first look at the Dutch data suggests that ex post moral hazard is important; the option to withdraw claims is used regularly.

Tests for Asymmetric Information

A simple test of the importance of asymmetric information can be based on the observed relation between claims and the coverage bought by an insured. If better-covered insured claim more, this points to asymmetric information. Take car insurance, for example. In the case of adverse selection, motorists who know, better than the insurer, that they drive well choose insurance with a higher deductible. In the case of moral hazard, motorists with a higher deductible choose to drive more carefully. Because there is no reason for a structural relation between claims and the extent of insurance when there is no asymmetric information, this relation is informative on the presence of asymmetric information. Asymmetric information thus leads to a positive conditional correlation between coverage and claims, which allows for a simple test.

This conditional-correlation approach was first applied by Chiappori and Salanié (2000), who tested for asymmetric information in French car insurance. To circumvent problems with the endogeneity of experience rating, they concentrated on young drivers without a claims history. The authors found no evidence of asymmetric information, but conjecture that an asymmetry may arise in the course of the insurance relationship through asymmetric learning. In this case, young motorists initially do not know more about their risk than the insurer, but they quickly learn from experiences, such as near-accidents, that they do not share with the insurer.

In the first part of this thesis we adopt the method of Chiappori and Salanié and extend it in several ways. First, we also examine asymmetric information among experienced drivers by controlling for their experience rating. In addition, we consider not only the occurrence of claims but also the size of claims. Finally, we also explore the data on premiums, which makes it possible to apply new nonparametric methods. Using only cross-sectional data on one year, we find no traces of asymmetric information.

A drawback of this conditional-correlation approach is that neither the dynamic choice of insurance nor the effect of the dynamic structure of insurance on driving and claim behavior can be examined. In practice these are considered very important, as witnessed by the widespread use of bonus-malus structures. Moreover, the test does not distinguish between adverse selection and moral hazard. This distinction is important, because the two forms of asymmetric information have different implications for optimal contracts and the functioning of insurance markets.

An obvious solution is to analyze the claim behavior of the insured dynamically. In the Dutch BM system, the incentives to avoid claims change with the current BM class, the time elapsed in the current policy year and the number of claims already filed. We measure these changes in incentives and show that Dutch insured who are subject to moral hazard change their claim intensity with each claim at fault. This result links moral hazard in Dutch car insurance to so-called state dependence in the individual claim process: the rate at which the insured claim depends, through the incentives arising from the BM system, on the claim history. We can therefore learn about moral hazard by measuring whether there is such state dependence in the claim process. Here we have to distinguish true, individual state dependence from the effects of unobserved heterogeneity (Heckman, 1981).

This approach was first introduced by Abbring, Chiappori, Heckman, and Pinquet (2003) and applied to French car insurance by Abbring, Chiappori, and Pinquet (2003). In the second part of this thesis we extend this line of work in several ways. First, we develop a fully structural dynamic micro-econometric model to analyze moral hazard in Dutch motor insurance. In particular, we use the rich variation in the incentives arising from the Dutch BM system. Next, we strengthen the statistical power of the tests by extending the analysis to longer panels of insurance data, which allows us to fully exploit the information in our data. Finally, we distinguish between ex ante and ex post moral hazard by modeling both the occurrence and the size of claims.

Our dynamic analysis provides evidence of moral hazard in Dutch car insurance.
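The threshold rule behind ex post moral hazard mentioned above can be written down as a small worked example. The sketch below rests on simplifying assumptions that are not taken from the thesis: the insured is risk neutral, the insurer pays the loss minus the deductible, and the cost of a claim at fault is summarized by a single known amount `malus_cost` (the present value of the extra future premiums). The function names and all numbers are hypothetical.

```python
def claim_threshold(deductible, malus_cost):
    """Smallest loss above which claiming pays off.

    Claiming yields loss - deductible and costs malus_cost in higher
    future premiums, so claiming pays only if loss > deductible + malus_cost.
    """
    return deductible + malus_cost


def should_claim(loss, deductible, malus_cost):
    """Return True if a risk-neutral insured gains from claiming the loss."""
    return loss > claim_threshold(deductible, malus_cost)


# Hypothetical numbers: deductible of 150, malus worth 400 in future premiums.
threshold = claim_threshold(150, 400)
small_loss_claimed = should_claim(500, 150, 400)
large_loss_claimed = should_claim(900, 150, 400)
```

Under these assumptions a loss of 500 is not claimed although it exceeds the deductible, whereas a loss of 900 is. The same comparison would describe the decision to withdraw a reported claim once the final loss amount is known.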
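The conditional-correlation idea described above can be illustrated with a simple cell-based independence check. The sketch below is not the implementation used in this thesis; it is a minimal illustration that assumes the observed characteristics have already been grouped into discrete cells, and it uses the standard Pearson chi-square statistic for a 2x2 table. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def pearson_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return None  # an empty margin: the cell carries no information
    return n * (a * d - b * c) ** 2 / denom


def conditional_independence_stat(observations):
    """Sum per-cell chi-square statistics of coverage versus claim.

    `observations` holds (cell, full_coverage, has_claim) triples, where
    `cell` groups insured with identical observed characteristics.
    Returns (statistic, number of informative cells).
    """
    # Per cell: counts for (coverage, claim) = (0,0), (0,1), (1,0), (1,1).
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for cell, full_coverage, has_claim in observations:
        tables[cell][2 * int(full_coverage) + int(has_claim)] += 1

    total, used = 0.0, 0
    for a, b, c, d in tables.values():
        stat = pearson_2x2(a, b, c, d)
        if stat is not None:
            total += stat
            used += 1
    return total, used


# Toy data, invented for illustration: coverage and claims are unrelated
# in the "young" cell and perfectly related in the "old" cell.
observations = (
    [("young", c, k) for c in (0, 1) for k in (0, 1) for _ in range(10)]
    + [("old", 0, 0)] * 20
    + [("old", 1, 1)] * 20
)
statistic, cells = conditional_independence_stat(observations)
```

If coverage and claims are independent within every cell, the summed statistic is approximately chi-square distributed with as many degrees of freedom as there are informative cells, so a large value points to a conditional relation between coverage and claims. The statistic itself is two-sided; the sign of the relation has to be read from the underlying tables.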
Part of it can be explained by ex post moral hazard. Our result is unusual in the empirical literature, which mostly finds no evidence of asymmetric information in car insurance.

Bibliography

Abbring, J. H. (2002). Stayers versus defecting movers: A note on the identification of defective duration models. Economics Letters 74, 327–331.
Abbring, J. H. (2007). Mixed hitting-time models. Discussion Paper 07-57/3, Tinbergen Institute, Amsterdam.
Abbring, J. H., P. A. Chiappori, J. J. Heckman, and J. Pinquet (2003, April–May). Adverse selection and moral hazard in insurance: Can dynamic data help to distinguish? Journal of the European Economic Association: Papers and Proceedings 1 (2–3), 512–521.
Abbring, J. H., P. A. Chiappori, and J. Pinquet (2003). Moral hazard and dynamic insurance data. Journal of the European Economic Association 1, 767–820.
Abbring, J. H., P. A. Chiappori, and T. Zavadil (2008). Better safe than sorry? Ex ante and ex post moral hazard in dynamic insurance data. Discussion Paper 08-075/3, Tinbergen Institute, Amsterdam.
Abbring, J. H. and G. J. Van den Berg (2003). The non-parametric identification of treatment effects in duration models. Econometrica 71, 1491–1517.
Abbring, J. H. and T. Zavadil (2008). The hand of the past in censored renewal data. Mimeo, VU University Amsterdam, Amsterdam.
Abrahamse, A. F. and S. J. Carroll (1999). The frequency of excess claims for automobile personal injuries. In Automobile Insurance: Road Safety, New Drivers, Risks, Insurance Fraud and Regulation. Springer.
Andersen, P. K., Ø. Borgan, R. D. Gill, and N. Keiding (1993). Statistical Models Based on Counting Processes. New York: Springer-Verlag.
Arrow, K. J. (1963, December). Uncertainty and the welfare economics of medical care. American Economic Review 53 (5), 941–973.
Assurantiemagazine (2004, January 23). Car Insurers Still Face a Hog Cycle, Despite the Introduction of the Bonus-Malus System (in Dutch: Autoverzekeraars Kampen Ondanks Bonus-Malus Nog Met Varkenscyclus). Alphen aan den Rijn: Kluwer.
Bates, G. and J. Neyman (1952). Contributions to the theory of accident proneness II: True or false contagion. University of California Publications in Statistics 1, 255–275.
Briys, E. (1986, December). Insurance and consumption: The continuous time case. Journal of Risk and Insurance 53, 718–723.
Caballero, R. J. (1990, January). Consumption puzzles and precautionary savings. Journal of Monetary Economics 25, 113–136.
Campbell, J. R. and B. Eden (2007, April). Rigid prices: Evidence from U.S. scanner data. Working Paper 2005-08, Federal Reserve Bank of Chicago, Chicago, IL.
Cardon, J. H. and I. Hendel (2001, Autumn). Asymmetric information in health insurance: Evidence from the National Medical Expenditure Survey. RAND Journal of Economics 32 (3), 408–427.
Ceccarini, O. and N. S. Pereira (2004, April). Testing for the presence of moral hazard on dynamic insurance data: Evidence from the Portuguese car insurance industry. Available at www.ios.neu.edu/iioc2004/papers/s4m2.pdf.
Chamberlain, G. (1985). Heterogeneity, omitted variable bias, and duration dependence. In J. J. Heckman and B. Singer (Eds.), Longitudinal Analysis of Labor Market Data. Cambridge, MA: Cambridge University Press.
Chiappori, P. A. (2001). Econometric models of insurance under asymmetric information. In G. Dionne (Ed.), Handbook of Insurance, Huebner International Series on Risk, Insurance, and Economic Security, Chapter 11, pp. 365–393. Dordrecht: Kluwer Academic Publishers.
Chiappori, P. A., F. Durand, and P.-Y. Geoffard (1998, May). Moral hazard and the demand for physician services: First lessons from a French natural experiment. European Economic Review 42 (3–5), 499–511.
Chiappori, P. A., B. Jullien, B. Salanié, and F. Salanié (2006, Winter). Asymmetric information in insurance: General testable implications. RAND Journal of Economics 37 (4), 783–798.
Chiappori, P. A. and B. Salanié (1997, April). Empirical contract theory: The case of insurance data. European Economic Review 41 (3–5), 943–950.
Chiappori, P. A. and B. Salanié (2000, February). Testing for asymmetric information in insurance markets. Journal of Political Economy 108 (1), 56–78.
Chiappori, P. A. and B. Salanié (2003). Testing contract theory: A survey of some recent work. In M. Dewatripont, L. P. Hansen, and S. J. Turnovsky (Eds.), Advances in Economics and Econometrics: Theory and Applications, Eighth World Congress, Econometric Society Monographs, Chapter 4, pp. 115–149. Cambridge: Cambridge University Press.
Chintagunta, P. K. and X. Dong (2006). Hazard/survival models in marketing. In R. Grover and M. Vriens (Eds.), The Handbook of Marketing Research: Uses, Misuses, and Future Advances, Chapter 21, pp. 441–454. Thousand Oaks, CA: Sage Publications.
Cohen, A. (2005, August). Asymmetric information and learning: Evidence from the automobile insurance market. Review of Economics and Statistics 87 (2), 197–207.
Cohen, A. (2008, January). Asymmetric learning in repeated contracting: An empirical study. Working Paper 13752, NBER. Available at www.nber.org/papers/w13752.
Cummins, J. D. and S. Tennyson (1996). Moral hazard in insurance claiming: Evidence from automobile insurance. Journal of Risk and Uncertainty 12, 26–50.
Dahlby, B. G. (1983, February). Adverse selection and statistical discrimination: An analysis of Canadian automobile insurance. Journal of Public Economics 20, 121–130.
Dahlby, B. G. (1992). Testing for asymmetric information in Canadian automobile insurance. In G. Dionne (Ed.), Contributions to Insurance Economics. Kluwer Academic Publishers.
de Meza, D. and D. C. Webb (2001, Summer). Advantageous selection in insurance markets. RAND Journal of Economics 32 (2), 249–262.
de Wit et al., G. (1982). New motor rating structure in the Netherlands. Technical report, ASTIN section of the Actuarial Association, Woerden, Netherlands.
Dionne, G., M. Dahchour, and P.-C. Michaud (2006, June). Separating moral hazard from adverse selection and learning in automobile insurance: Longitudinal evidence from France. Working Paper 04-05, Canada Research Chair in Risk Management, HEC Montréal. Available at SSRN: http://ssrn.com/abstract=583063.
Dionne, G. and N. A. Doherty (1994, April). Adverse selection, commitment, and renegotiation: Extension to and evidence from insurance markets. Journal of Political Economy 102 (2), 209–235.
Dionne, G., C. Gouriéroux, and C. Vanasse (1999). Evidence of adverse selection in automobile insurance markets. In G. Dionne and C. Laberge-Nadeau (Eds.), Automobile Insurance: Road Safety, Insurance Fraud and Regulation, pp. 13–46. Boston: Kluwer Academic Publishers.
Dionne, G., C. Gouriéroux, and C. Vanasse (2001, April). Testing for evidence of adverse selection in the automobile insurance market: A comment. Journal of Political Economy 109 (2), 444–473.
Dionne, G., M. Maurice, J. Pinquet, and C. Vanasse (2005, July). The role of memory in long-term contracting with moral hazard: Empirical evidence in automobile insurance. THEMA Working Papers.
Dionne, G. and C. Vanasse (1992, April–June). Automobile insurance ratemaking in the presence of asymmetrical information. Journal of Applied Econometrics 7 (2), 149–165.
Dionne, G. and C. Vanasse (1997). Une évaluation empirique de la nouvelle tarification de l'assurance automobile au Québec. L'Actualité économique 73, 47–80.
Elbers, C. and G. Ridder (1982). True and spurious duration dependence: The identifiability of the proportional hazard model. Review of Economic Studies 64, 403–409.
Engle, R. F. and J. R. Russell (1998). Autoregressive conditional duration: A new model for irregularly spaced transaction data. Econometrica 66 (5), 1127–1162.
Fang, H., M. P. Keane, and D. Silverman (2006, June). Sources of advantageous selection: Evidence from the Medigap insurance market. NBER working paper no. 12289.
Finkelstein, A. and J. Poterba (2002). Selection effects in the United Kingdom individual annuities market. Economic Journal 112 (476), 28–50.
Gourieroux, C., A. Monfort, E. Renault, and A. Trognon (1987). Generalised residuals. Journal of Econometrics 34, 5–32.
Harris, M. and B. Holmstrom (1982). A theory of wage dynamics. Review of Economic Studies 49 (3), 315–333.
Heckman, J. J. (1981). Heterogeneity and state dependence. In S. Rosen (Ed.), Studies in Labor Markets. University of Chicago Press.
Heckman, J. J. (1991, May). Identifying the hand of the past: Distinguishing state dependence from heterogeneity. American Economic Review 81 (2), 75–79. Papers and Proceedings of the Hundred and Third Annual Meeting of the American Economic Association; in Path Dependence in Economics: The Invisible Hand in the Grip of the Past.
Heckman, J. J. and G. Borjas (1980). Does unemployment cause future unemployment? Definitions, questions and answers from a continuous time model of heterogeneity and state dependence. Economica 47, 247–283.
Heckman, J. J. and B. Singer (1984). The identifiability of the proportional hazard model. Review of Economic Studies 51, 231–241.
Hendel, I. and A. Lizzeri (2003, February). The role of commitment in dynamic contracts: Evidence from life insurance. Quarterly Journal of Economics 118 (1), 299–327.
Holly, A., L. Gardiol, G. Domenighetti, and B. Bisig (1998, May). An econometric model of health care utilization and health insurance in Switzerland. European Economic Review 42 (3–5), 513–522.
Holmström, B. (1979). Moral hazard and observability. The Bell Journal of Economics 10, 74–92.
Holt, J. and R. Prentice (1974). Survival analysis in twin studies and matched pair experiments. Biometrika 61, 17–30.
Honoré, B. E. (1993). Identification results for duration models with multiple spells.
Review of Economic Studies 60, 241246. Israel, M. (2004, February). Do we drive more safely when accidents are more expensive? Identifying moral hazard from experience rating schemes. The Center for the Study of Industrial Organization, Working Paper 0043. Jain, D. C. and N. J. Vilcassim (1991). Investigating household purchase timing decisions: A conditional hazard function approach. 210 Marketing Science 10 (1), 123. BIBLIOGRAPHY Jones, M. C. and M. P. Wand (1995). Kernel Smoothing. London: Chapman & Hall. Kortram, R., A. Lenstra, G. Ridder, and A. van Rooij (1995). Constructive identication of the mixed proportional hazards model. Statistica Neerlandica 49, 269281. Koul, H. L. and A. Schick (1997). Testing for the equality of two nonparametric regression curves. Journal of Statistical Planning and Inference 65, 293 314. Lancaster, T. (1979). Econometric methods for the duration of unemployment. metrica 47, Econo- 939956. Manning, W. G., J. P. Newhouse, N. Duan, E. B. Keeler, and A. Leibowitz (1987, Jun). Health insurance and the demand for medical care: Evidence from a randomized experiment. The American Economic Review 77 (3), Merton, R. C. (1971, December). continuous-time model. 251277. Optimum consumption and portfolio rules in a Journal of Economic Theory 3, 373413. Mossin, J. (1968). Aspects of rational insurance purchasing. omy 76, Journal of Political Econ- 553568. Pauly, M. V. (1968). The economics of moral hazard: Comment. Review 58, American Economic 531537. Pauly, M. V. (1974, February). Overinsurance and public provision of insurance: The roles of moral hazard and adverse selection. Quarterly Journal of Economics 88 (1), 4462. Pinquet, J. L., G. Dionne, C. Vanasse, and M. Maurice (2007, August). Point-record incentives, asymmetric information and dynamic data. Working paper, Economix, Nanterre. Puelz, R. and A. Snow (1994, April). Evidence on adverse selection: Equilibrium signaling and cross-subsidization in the insurance market. 
Journal of Political Economy 102 (2), 236257. 211 BIBLIOGRAPHY Richaudeau, D. (1999, June). Automobile insurance contracts and risk of accident: An empirical test using French individual data. Theory 24 (1), The Geneva Papers on Risk and Insurance 97114. Ridder, G. (1990). The non-parametric identication of generalized accelerated failuretime models. Review of Economic Studies 57, 167182. Rothschild, M. and J. E. Stiglitz (1976, November). Equilibrium in competitive insurance markets: An essay on the economics of imperfect information. Economics 90 (4), 629649. Shavell, S. (1979, November). Economics 93 (4), Quarterly Journal of On moral hazard and insurance. Quaterly Journal of 541562. Tennyson, S. and P. Salsas-Forn (2002, September). Claims auditing in automobile insurance: Fraud detection and deterrence objectives. ance 69 (3), The Journal of Risk and Insur- 289308. Van den Berg, G. J. (2001). ple durations. Duration models: Specication, identication, and multi- In J. J. Heckman and E. Leamer (Eds.), Handbook of Econometrics, Volume 5, Chapter 55, pp. 33813460. Amsterdam: Elsevier Science. Visser, M. (1996). Nonparametric estimation of the bivariate survival function with an application to vertically transmitted AIDS. Biometrika 83, 507518. Wilson, C. (1977). A model of insurance markets with incomplete information. of Economic Theory 16, Journal 167207. Zavadil, T. (2008, August). Do agents with better coverage cause more damage? Testing for asymmetric information in car insurance. Mimeo, VU University Amsterdam. 212 The Tinbergen Institute is the Institute for Economic Research, which was founded in 1987 by the Faculties of Economics and Econometrics of the Erasmus Universiteit Rotterdam, Universiteit van Amsterdam and Vrije Universiteit Amsterdam. The Institute is named after the late Professor Jan Tinbergen, Dutch Nobel Prize laureate in economics in 1969. The Tinbergen Institute is located in Amsterdam and Rotterdam. 
The following books recently appeared in the Tinbergen Institute Research Series:
392. K.G. BERDEN, On technology, uncertainty and economic growth.
393. G. VAN DE KUILEN, The economic measurement of psychological risk attitudes.
394. E.A. MOOI, Inter-organizational cooperation, conflict, and change.
395. A. LLENA NOZAL, On the dynamics of health, work and socioeconomic status.
396. P.D.E. DINDO, Bounded rationality and heterogeneity in economic dynamic models.
397. D.F. SCHRAGER, Essays on asset liability modeling.
398. R. HUANG, Three essays on the effects of banking regulations.
399. C.M. VAN MOURIK, Globalisation and the role of financial accounting information in Japan.
400. S.M.S.N. MAXIMIANO, Essays in organizational economics.
401. W. JANSSENS, Social capital and cooperation: An impact evaluation of a women's empowerment programme in rural India.
402. J. VAN DER SLUIS, Successful entrepreneurship and human capital.
403. S. DOMINGUEZ MARTINEZ, Decision making with asymmetric information.
404. H. SUNARTO, Understanding the role of bank relationships, relationship marketing, and organizational learning in the performance of people's credit bank.
405. M.Â. DOS REIS PORTELA, Four essays on education, growth and labour economics.
406. S.S. FICCO, Essays on imperfect information-processing in economics.
407. P.J.P.M. VERSIJP, Advances in the use of stochastic dominance in asset pricing.
408. M.R. WILDENBEEST, Consumer search and oligopolistic pricing: A theoretical and empirical inquiry.
409. E. GUSTAFSSON-WRIGHT, Baring the threads: Social capital, vulnerability and the well-being of children in Guatemala.
410. S. YERGOU-WORKU, Marriage markets and fertility in South Africa with comparisons to Britain and Sweden.
411. J.F. SLIJKERMAN, Financial stability in the EU.
412. W.A. VAN DEN BERG, Private equity acquisitions.
413. Y. CHENG, Selected topics on nonparametric conditional quantiles and risk theory.
414. M. DE POOTER, Modeling and forecasting stock return volatility and the term structure of interest rates.
415. F. RAVAZZOLO, Forecasting financial time series using model averaging.
416. M.J.E. KABKI, Transnationalism, local development and social security: the functioning of support networks in rural Ghana.
417. M. POPLAWSKI RIBEIRO, Fiscal policy under rules and restrictions.
418. S.W. BISSESSUR, Earnings, quality and earnings management: the role of accounting accruals.
419. L. RATNOVSKI, A Random Walk Down the Lombard Street: Essays on Banking.
420. R.P. NICOLAI, Maintenance models for systems subject to measurable deterioration.
421. R.K. ANDADARI, Local clusters in global value chains, a case study of wood furniture clusters in Central Java (Indonesia).
422. V. KARTSEVA, Designing Controls for Network Organizations: A Value-Based Approach.
423. J. ARTS, Essays on New Product Adoption and Diffusion.
424. A. BABUS, Essays on Networks: Theory and Applications.
425. M. VAN DER VOORT, Modelling Credit Derivatives.
426. G. GARITA, Financial Market Liberalization and Economic Growth.
427. E. BEKKERS, Essays on Firm Heterogeneity and Quality in International Trade.
428. H. LEAHU, Measure-Valued Differentiation for Finite Products of Measures: Theory and Applications.
429. G. BALTUSSEN, New Insights into Behavioral Finance.
430. W. VERMEULEN, Essays on Housing Supply, Land Use Regulation and Regional Labour Markets.
431. I.S. BUHAI, Essays on Labour Markets: Worker-Firm Dynamics, Occupational Segregation and Workplace Conditions.
432. C. ZHOU, On Extreme Value Statistics.
433. M. VAN DER WEL, Riskfree Rate Dynamics: Information, Trading, and State Space Modeling.
434. S.M.W. PHLIPPEN, Come Close and Co-Create: Proximities in pharmaceutical innovation networks.
435. A.V.P.B. MONTEIRO, The Dynamics of Corporate Credit Risk: An Intensity-based Econometric Analysis.
436. S.T. TRAUTMANN, Uncertainty in Individual and Social Decisions: Theory and Experiments.
437. R. LORD, Efficient pricing algorithms for exotic derivatives.
438. R.P. WOLTHOFF, Essays on Simultaneous Search Equilibrium.
439. Y.-Y. TSENG, Valuation of travel time reliability in passenger transport.
440. M.C. NON, Essays on Consumer Search and Interlocking Directorates.
441. M. DE HAAN, Family Background and Children's Schooling Outcomes.