Hedonic Prices and Implicit Markets: Consistent Estimation of Marginal Willingness to Pay for Differentiated Products Without Exclusion Restrictions* by Kelly Bishop† Olin Business School Washington University in St. Louis Christopher Timmins‡ Duke University and NBER November 2010 Abstract Since the publication of Rosen’s “Hedonic Prices and Implicit Markets”, property value hedonics has become the workhorse model for valuation of local public goods and environmental amenities. This has happened in spite of a number of well-known and well-documented econometric problems. Bartik (1987) and Epple (1987) describe a source of endogeneity in the second stage of Rosen’s procedure that is difficult to overcome using standard exclusion restriction arguments. This has led researchers to avoid the estimation of marginal willingness-to-pay functions altogether, relying instead on the first-stage hedonic price function and limiting analysis to the evaluation of marginal policies. We propose a new econometric procedure for the full hedonic model that avoids these endogeneity problems without the need for exclusion restrictions that may be difficult to justify, and we implement our estimator using data on violent crime rates in the Los Angeles and San Francisco metropolitan areas. Results indicate that controlling for the Bartik-Epple endogeneity bias without strong assumptions about cross-market preference heterogeneity has important implications for valuing non-marginal policy interventions. * We thank seminar participants at the AERE/ASSA Annual Meetings, Camp Resources, Iowa State University, Montana State University, Syracuse University, the Triangle Resource and Environmental Economics Workshop, the UEA/RSAI Annual Meetings, and the University of Wisconsin for their helpful comments and suggestions. † Kelly Bishop, Olin Business School, Washington University in St. Louis, St. Louis, MO 63130. [email protected]. ‡ Christopher Timmins, Department of Economics, Duke University, Box 90097, Durham, NC 27708. [email protected]. 1 1. Introduction Dating back to the work of Court (1939), Grilliches (1961) and Lancaster (1966), hedonic techniques have been used to determine the implicit prices associated with the attributes of differentiated products. Rosen’s (1974) seminal work proposed a theoretical structure for the hedonic regression that suggested a procedure for the recovery of marginal willingness to pay (MWTP) functions for heterogeneous individuals. Importantly, Rosen’s two-step approach allowed for two sources of preference heterogeneity: individuals’ MWTPs were allowed to differ with (i) their individual attributes and (ii) the quantity of the product attribute that they consume. The latter is particularly important when considering non-marginal policy changes (i.e., any change that is large enough to alter the individual’s willingness to pay at the margin). The twostep procedure suggested by Rosen (and developed further by later authors)1 uses variation in implicit prices (obtained either by combining data from multiple markets or by allowing for nonlinearity in the hedonic price function) to identify a MWTP function. With Rosen (1974) as a backdrop, property value hedonics has become the workhorse model for the valuation of local public goods and environmental amenities despite of a number of well-known and well-documented econometric problems.2 Some of these problems arise in Rosen’s first stage – for example, omitted variables that may be correlated with the local attribute of interest. There is a large and growing literature (cited below) that describes both quasi-experimental and structural solutions to this problem. Our concern in this paper is with an important problem that arises in the second stage of Rosen’s two-step procedure. Bartik (1987) and Epple (1987), in separate papers, describe a source of endogeneity that is difficult to overcome using standard exclusion restriction 1 See, for example, Brown and Rosen (1982) and Mendelsohn (1985). See Taylor (2003) and Palmquist (2005) for a comprehensive discussion (with a particular focus on environmental applications). 2 2 arguments. Specifically, they note that, unless the hedonic price function is linear, the hedonic price of a product attribute varies systematically with the quantity consumed. The researcher therefore faces a difficult endogeneity problem in the application of Rosen’s second-stage. Moreover, because of the equilibrium features of the hedonic model, Bartik and Epple show that there are very few natural exclusion restrictions that one can use to solve this endogeneity problem. In particular, within-market supply-side shifters – the natural instrument choice when estimating a demand equation – are not valid in this context. This has generally left researchers to choose from a variety of weak instrument strategies, or instruments based on cross-market homogeneity assumptions that may be difficult to justify.3 With a few exceptions, the hedonics literature has subsequently ignored preference heterogeneity, focusing instead on recovering estimates of the hedonic price function and valuing marginal changes in amenities [see, for example, Black (1999), Bui and Mayer (2003), Chay and Greenstone (2005), Davis (2004), Figlio and Lucas (2004), Gayer, Hamilton, and Viscusi (2000), Greenstone and Gallagher (2009), Linden and Rockoff (2008), and Pope (2008)].4 In this paper, we propose an estimation procedure for the recovery of MWTP functions that avoids the Bartik-Epple endogeneity problem without the need for exclusion restrictions that may be difficult to justify. In particular, the exclusion restrictions suggested by Bartik require 3 Bound, Jaeger, and Baker (1995) discusses the biases that result from using weak instruments (i.e., instruments that do a poor job of predicting the endogenous variable). 4 Deacon et al. (1998) noted that “To date no hedonic model with site specific environmental amenities has successfully estimated the second stage marginal willingness to pay function.” Since that time, a number of papers have addressed the problem of recovering preference from hedonic estimates. Bajari and Benkard (2005) avoid the Bartik-Epple endogeneity problem by relying on strong parametric assumptions on utility that turn the second-stage of Rosen’s procedure from an estimation problem into a preference-inversion procedure. Bishop and Timmins (2010) show how some of these assumptions can be relaxed with panel data, but even that solution requires strong assumptions about the stability of preferences over time. Ekeland, Heckman, and Nesheim (2004) provide two alternative approaches to recovering MWTP. The first imposes very little in terms of parametric restrictions, requiring only an additive separability assumption in the MWTP specification. Their approach, however, requires very rich data and can be hard to implement with multiple amenities. Their second approach uses insights from the first strategy to develop a set of exclusion restrictions to be used in identifying MWTP. Like much of the hedonics literature, that approach may also suffer from weak instruments problems. 3 that one assume a strong degree of preference homogeneity across markets that may not be defensible. Using our procedure, any variable that can be used as a Bartik-style instrument can instead be included as an observable form of preference heterogeneity across markets. Importantly, our procedure is computationally simple and easy to implement. Moreover, it does not require any more in terms of data or assumptions than does a typical hedonic regression. Instead, the procedure recovers the structural parameters underlying MWTP functions by exploiting the relationship between the quantity of the amenity being consumed and the attributes of the individuals doing the consumption. That such a relationship should exist in hedonic equilibrium goes back to the idea of “stratification” found in Ellickson (1971), which becomes the basis for the estimable Tiebout sorting models of Epple and a variety of co-authors.5 We illustrate our approach in an extremely simple but empirically relevant context – the linearquadratic multi-market setting used by Epple (1987). In the simplest version of that model, estimates can be recovered with a transparent least squares procedure. With richer variation in the data, estimates can be obtained using a NLLS procedure. In an appendix, we also demonstrate how our method can be adapted to achieve identification in a single-market setting that relies on a flexible representation of the hedonic price function to overcome the identification problems described by Brown and Rosen (1982) and Mendelsohn (1985).6 Importantly, our estimation approach will work in either of these contexts. To demonstrate the usefulness of this approach, we implement our estimation procedure using data on violent crime rates in the San Francisco and Los Angeles metropolitan areas and 5 See, for example, Epple (1987), Epple, Filimon and Romer (1984), Epple and Romano (1998), Epple and Platt (1998) and Epple and Sieg (1999). 6 Ekeland, Heckman, and Neshiem (2004) demonstrate that the linear-quadratic hedonic model, which is the source of the identification problems cited by Brown and Rosen (1982) and Mendelsohn (1985), is a very special case – a non-linear hedonic gradient tends to be the rule rather than the exception. This means that non-linearity provides a perfectly valid source of identification, even using data taken from a single market. 4 find that recovering the shape of the MWTP function is economically important. In particular, an individual’s marginal willingness-to-pay to avoid an incident of violent crime per 100,000 residents increases by $1.04 with each additional incident. In contrast, MWTP falls by more than $0.20 with each additional case under the simple Rosen model. Our results imply welfare costs of $66,256 and $73,322 for one standard deviation increases in violent crime in the San Francisco Bay Area and Los Angeles Metro Areas, respectively. This has important consequences for valuing the welfare implications of a non-marginal increase or reduction in crime. The overall cost of a similar increase in crime measured by a simple model that uses only the first-stage hedonic estimates is only 20% to 30% as large. This paper proceeds as follows. Section 2 describes the endogeneity problem discussed by Bartik (1987) and Epple (1987) and provides intuition for why it is difficult to solve with exclusion restrictions. Section 3 describes our estimation procedure in the context of a linearquadratic model with multi-market data using the model employed by Epple (1987). An appendix describes how the method can be adapted to a data from a single market if the hedonic gradient is non-linear. Section 4 describes the data used in our application – housing transactions data from the Los Angeles and San Francisco metropolitan areas between 1990 and 2008, combined with property and violent crime data from RAND California. Section 5 reports the results of applying our estimator, the Rosen two-step procedure, and Bartik’s instrumental variables strategy to these data. Section 6 concludes. 2. Why Is It So Difficult To Recover The MWTP Function? In their 1987 articles, Epple and Bartik each discuss the econometric problems induced by the equilibrium sorting process that underlies the formation of the hedonic price function. In 5 particular, unobserved determinants of tastes affect both the quantity of an amenity that an individual consumes and (if the hedonic price function is not linear) the hedonic price of the attribute.7 In a regression like that described in the second-stage of Rosen’s two-step procedure, the quantity of the amenity that an individual consumes will therefore be endogenous. Moreover, the typical exclusion restrictions used to estimate demand systems (i.e., using supplier attributes as instruments) will not work because, in addition to affecting the quantity of amenity and the hedonic price paid for it, the unobservable component of preferences also determines the supplier from whom the individual purchases. Supplier attributes, which might naturally be used to trace-out the demand function, are therefore correlated with the unobserved determinants of MWTP because of the sorting process underlying the hedonic equilibrium. To make these ideas concrete, and to introduce the notation used throughout the rest of the paper, consider the following simple example, which is based on Epple’s model. The hedonic price function is given by: (1) where 1, 2, … , , indexes houses. measures the price of house , and measures the level of the amenity associated with that house (for the sake of illustration, we ignore other amenities and house attributes). For the time-being, we consider data from just a single market, but we will allow for multi-market data below. The price gradient associated with this hedonic price function is: 7 The simple case of the linear hedonic price function assumes that attributes can be unbundled and repackaged in any combination without affecting their marginal value (e.g., the marginal value of another bedroom is the same regardless of how many bedrooms a house already has). In most empirical settings, this assumption is unrealistic. 6 2 (2) where represents the hedonic price of at house . The second stage of Rosen’s procedure seeks to recover the coefficients of demand (or marginal willingness-to-pay) and supply functions for the attribute : (3) (demand) (4) (supply) Where and represent attributes of the buyers and sellers of house , respectively. and similarly represent unobserved determinants of tastes and marginal costs. The problem we consider in this paper arises from the fact that correlated with must necessarily be . This is easily shown in the following equation. Noting that in hedonic equilibrium and combining (2) and (3) yields (with some re-arranging): (5) 0, equation (5) makes explicit that Making the standard assumption that cannot be uncorrelated with . Therefore, in order to recover the parameters of equation (3), one requires an instrument for . 7 The typical approach to estimating demand functions with endogenous quantities uses supply function shifters. The problem with that approach in this context, however, is that and hedonic sorting induces a correlation between . This can be easily seen by combining equation (5) with a similarly derived expression based on equation (4). Noting that in hedonic equilibrium (6) Solving for (i.e., the attributes of the supplier of house i): (7) Assuming that suppliers are heterogeneous (i.e., function of differently, instrument for ), equation (7) reveals that , making it ineligible as a potential instrument for will be a in equation (2). Put determines which supplier individual i buys from, so that cannot be used to . Epple and Bartik propose instrumental variables strategies to deal with this problem. with market indicator variables. The intuition Bartik, for example, suggests instrumenting for for this strategy is that variation in the distribution of suppliers across markets will provide an exogenous source of variation in the equilibrium quantity of the amenity chosen by each individual. Using the subscript k = 1, 2, … K to indicate the market in which individual i lives, this IV strategy is implemented (as 2SLS) by first regressing the quantity 8 , on a vector of market indicators, and then using the fitted values as regressors in the estimation of the MWTP function: (8) , (9) , , , 0 , , , Here, and in the remainder of the paper, “ ” is used to denote an estimated parameter value. In order for this procedure to work, we require that market dummies are excluded from the MWTP function. Put differently, the distribution of preferences across markets must be the same. However, if households sort across markets based on their preferences for amenities, this assumption will be violated. There is ample empirical evidence that this is the case.8 Suppose, therefore, that we instead allow the MWTP function to have a different intercept for each market ( (10) , , , ): , , If we were to then implement Bartik’s IV strategy, the parameter identified from (11) , , would not be separately : , , , 8 See, for example, Roback (1982), Blomquist, Berger, and Hoehn (1988), Albouy et al. (2009), Bayer, Keohane and Timmins (2009), and Bishop (2008). 9 In the following section, we describe an alternative econometric approach to dealing with this endogeneity problem in the multiple-markets context. Using this approach, we demonstrate that any variable that could be used as a Bartik-style instrument (e.g., market indicator variables and/or market indicators interacted with individual attributes , ) can instead be included explicitly in MWTP as a form of observable preference heterogeneity. In our application, we show that this source of preference heterogeneity is empirically relevant. In an appendix, we show that our model is similarly identified using data from a single market, as long as the hedonic gradient is not linear. 3. Consistent Estimation Without Exclusion Restrictions In this section, we demonstrate that one can easily avoid the endogeneity problems described by Bartik (1987) and Epple (1987), even in a simple linear-quadratic hedonic specification. Rather than look for an IV strategy, we return to the original setup, but allow the hedonic gradient to differ by market: (12) , , (13) , , , 2 , , , , , , This alone does nothing to help with the IV strategy described in equations (8) – (11), but it does allow for identification of market-level heterogeneity in MWTP using our approach. In order to demonstrate this, we now introduce rich cross-market heterogeneity into MWTP. In particular, 10 allow mean MWTP ( , ) and the marginal effects of indiviaul attributes ( , ) to differ across markets: (14) , , , , , Using hedonic equilibrium to combine (13) and (14), and then solving for , yields the following reduced form equation: (15) , , , , , , , , , , , Assuming we have information on just two markets, estimation of equation (15) by OLS yields six pieces of information – , , , , , , , , equations: (16) , (17) , (18) , (19) , (20) (21) , , , , , , , , , , , , 11 , and , which are used in the following six represents the standard deviation of , , which we assume to be common across markets. While our procedure requires cross-market homogeneity in the variance in the unobserved component of preferences and in the slope of the MWTP function with respect to , , we note that these assumptions are also maintained by Rosen model and the Bartik IV approach to estimating it. Our procedure does, however, allow for rich cross-market heterogeneity in the remainder of the MWTP parameters.9 Equations (20) and (21) can be combined to yield an expression for , (22) Knowing : , , we can proceed to recover the remaining MWTP parameters: 2 (23) , , (24) , , (25) , 2 , , (26) , 2 , , 2 , , , , 9 Moreover, note that, with more than two markets, we would also be able to parameterize and/or . For , where represents the population of example, with data on three markets, we could set market k. 12 4. Data In our application, we estimate a pair of hedonic price functions that are allowed to vary across housing markets (i.e., Los Angeles Metro and San Francisco Bay Areas). Our primary variable of interest is the rate of violent crime, although we also control for house attributes, local property crime rates, and vectors of county and year fixed effects. This requires a great deal of data from multiple sources, which are summarized below. 4.1 Property Transactions Data The real estate transaction data we employ cover the five core counties of the San Francisco Bay Area (Alameda, Contra Costa, San Francisco, San Mateo, and Santa Clara) and the five counties that comprise the Los Angeles Metro Area (Los Angeles, Ventura, San Bernadino, Riverside, and Orange) over the period 1993 to 2008. The data were purchased from Dataquick and include transaction dates, prices, loan amounts, and buyers’, sellers’, and lenders’ names for all transactions. In addition, the data for the final observed transaction for each house include characteristics such as exact latitude and longitude, square footage, year built, lot size, number of rooms, number of bathrooms, number of bedrooms, and number of stories in the building. The process of cleaning the data involves a number of cuts. Many of these are made in order to deal with the fact that we only see housing characteristics at the time of the last sale, but we need to use housing characteristics from all sales as controls in our hedonic price regressions. We therefore seek to eliminate any observations that reflect major housing improvement (e.g., additions) or degradation. First, to control for land sales or re-builds, we drop all transactions 13 where “year built” is missing or with a transaction date that is prior to “year built”. Second, in order to control for property improvements (e.g., an updated kitchen) or degradations (e.g., water damage) that do not present as re-builds, we drop any house that ever appreciates or depreciates in excess of 50 percentage points of the county-year mean price change.10 We also drop any house that moves more than 40 percentiles between sales in the county-year distribution.11 Additionally, we drop transactions where the price is missing, negative, or zero. After using the consumer price index to convert all transaction prices into 2000 dollars, we drop one percent of observations from each tail to minimize the effect of outliers. Finally, as we merge in the crime data using the property’s geographic coordinates, we drop properties where latitude and longitude are missing. This yields a final sample of 378,252 transactions in the San Francisco Bay Area and 386,063 transactions in the Los Angeles Metro. Table 1 reports summary statistics for each housing market. 4.2 Home Mortgage Disclosure Act Information about the race and income of home buyers is obtained from data collected from mortgage applications in accordance with the Home Mortgage Disclosure Act of 1975. One of the main purposes of this law was to look for evidence of discriminatory lending practices. In accordance with that goal, HMDA provides information on the race, income, and gender of mortgage applicants, along with information on the mortgage loan amount, loan date, 10 For example, if a house appreciates 65% between years #3 and #6, and the mean appreciation of houses in that county over those years is 12%, we drop that house. If the mean appreciation had been 20% over that period, that house would not be dropped. 11 For example if a house was in the 20th percentile of homes sold in a county in year #1 and then in the 60th percentile of homes sold in that county in year #7, all observations pertaining to that house would be dropped from the data set. 14 lender’s name, and the census tract where the property is located. We are able to merge these data with Dataquick transactions based on these four variables. Bayer, MacMillan, Murphy, and Timmins (2010) describe the quality of this merge in detail. 4.3 Crime Data Crime statistics come from the RAND California data base, are organized by “city”, and measured as incidents per 100,000 residents.12 The data describe property and violent crime rates for 80 cities in the San Francisco metropolitan area and 175 cities in the Los Angeles metropolitan area between 1986 and 2008. Figures 1 and 2 illustrate the locations of these cities. Property crime is defined as “crimes against property, including burglary and motor vehicle theft”, while violent crime is defined as “crimes against people, including homicide, forcible rape, robbery, and aggravated assault.” Crime rates are imputed for each house in our data set using an inverse-distance weighted average of the crime rate in each city.13 We use the property crime rate as a control in our hedonic estimation and focus attention on violent crimes in our valuation exercise, as these crimes are less likely to be subject to systematic under-reporting. [Gibbons (2004)] Figures 3 and 4 illustrate the distribution of violent crime rates within each metropolitan area, and how average crime rates in each city have declined over time. These downward trends are consistent with drops in violent crime rates over the same period in many parts of the US. Table 1 provides summary statistics for crime in each city, measured at the level of the house transaction. 12 13 See http://ca.rand.org/stats/community/crimerate.html. Specifically, each crime rate is weighted by 1/distance2, where distance is measured in kilometers. 15 5. Results 5.1 Estimation of the Hedonic Gradient The first stage of the property value model estimates the hedonic price function, which , relates the price of house j in market k ( transacted in period ( , , to its attributes, both those that vary over time and those that do not. There are many empirical approaches available for this stage of the hedonic procedure, but we opt for a simple linear quadratic estimation framework. In particular, for each market (LA and SF) we estimate a separate hedonic price function using data from 1993-2008. Given that neighborhood unobservables may be correlated with crime rates, and we do not have a source of quasiexperimental variation in crime, we opt to pool the data over time within each city and include a vector of county fixed effects.14 (27) , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , where VC = violent crime rate (incidents per 100,000 population) PC = property crime rate (incidents per 100,000 population) BUILT = year built SQFT = square footage BATH = number of bathrooms BED = number of bedrooms ROOMS = total number of rooms STORIES = number of stories COUNTY = vector of county dummy variables15 14 Bias from omitted variables arises when the regressors are not orthogonal to the regression error. We want to be clear that any quasi-experimental or instrumental variables approach (like those used in the papers cited in the introduction) could be integrated into this phase of our analysis. 15 In the San Francisco metropolitan region, this includes Alameda, Contra Costa, San Francisco, San Mateo and Santa Clara counties. Alameda is omitted from the regression. In the Los Angeles metropolitan region, it includes 16 YEAR = vector of year dummies (1993-2008, 1999 normalized to zero) We use the coefficients ( , , , ) to construct hedonic prices for use in stage-two of the estimation procedure. (28) , , , 2 , , , Table 2 reports the results of these two specifications. Bootstrapped standard errors are reported as well. We note two important features of these results. First, in both cases, we find a hedonic gradient for violent crime that is negative for relevant values of violent crime and rises gradually with increasing incidence. This result alone suggests that heterogeneity in individual preferences is important – if all individuals had homogenous preferences, the hedonic gradient could be interpreted as a MWTP curve, and these gradients would suggest that individuals have upward sloping demand curves for public safety. The second important feature of the data to note is that there is significant heterogeneity in hedonic gradients across cities. This will provide us with an important source of variation with which to identify our model. 5.2 Recovery of the MWTP function Next, we specify a MWTP function that we use both with our estimation strategy (Bishop-Timmins), the traditional two-stage approach proposed by Rosen (no instruments), and with two versions of the instrumental variables strategy proposed by Bartik (1987). The first uses a market dummy (i.e., 1 , along with interactions of that market dummy and , as Los Angeles, Ventrua, San Bernadino, Riverside and Orange counties. Los Angeles and Ventura counties are omitted from the regression. 17 instruments for , , . The second version allows the Los Angeles market dummy variable to enter directly into the MWTP function (i.e., a limited form of cross-city preference heterogeneity), relying on the interaction terms for instruments. In particular, we estimate the following specifications: Bishop Timmins: (29) , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , Rosen and Bartik I: (30) , , , , , , , , , , , , , , Bartik II: (31) , , , , , , , , , , , , 18 , , , , where = hedonic price of violent crime at house j in market k in year t (based on linear hedonic gradient or log-linear hedonic gradient, depending upon specification) = violent crime rate at house j in market k in year t , , , , = income (in thousands of dollars) of household choosing house j , , = 1 if household choosing house j is Asian-Pacific Islander , , = 1 if household choosing house j is Black , , = 1 if household choosing house j is Hispanic , , The first stage of the Bartik IV procedure is reported in Table 3. Conditional correlations between violent crime and covariates are, for the most part, intuitive. Violent crime is higher in Los Angeles Metro and lower in higher income communities (although that effect is lessened in LA). Compared with houses owned by Whites, crime rates are significantly higher at houses owned by Blacks and Hispanics. Those racial effects are, moreover, significantly greater in LA. Results from the four MWTP estimation procedures are reported in Table 4. First, consider the coefficient on violent crime. Our estimator is the only one that yields a downward sloping MWTP function, which is consistent with a decreasing MWTP for additional public safety. In each of the other three cases, an increase in violent crime reduces the individual’s willingness to pay to avoid it. The coefficients on violent crime in the Bartik I and Rosen models violate second order conditions for utility maximization. Coefficients in the Bartik II specification (where minimal cross-market preference heterogeneity is introduced) are nearly all statistically insignificant. The effect of increasing income in all but Bartik I is to increase individuals’ distaste for violent crime. Our model suggests that this effect is nearly two times larger in the San Francisco Bay Area than in Los Angeles Metro. Our model also finds that the intercept in MWTP is different across the two cities (i.e., with a significantly smaller average MWTP to avoid violent 19 crime found in Los Angeles). The effect of being Hispanic is significantly different across the two cities as well, with those in Los Angeles exhibiting a smaller average distaste for violent crime. 5.3 Interpretation In this sub-section, we consider the welfare effects of a non-marginal increase in violent crime. In particular, we consider a one standard deviation increase in each city (215.9 additional cases in San Francisco and 273.9 additional cases in Los Angeles). This change is approximately equal (but opposite in sign) to the reduction in violent crime that took place in San Francisco during the sample period, and smaller than that which took place in Los Angeles. With non-marginal changes, the implications of the differences between the estimates in Table 4 for welfare will be clear. For the sake of comparison, we also estimate a simple first-stage hedonic model of the following form: (32) , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , These estimates, reported in Table 5, imply a constant willingness to pay to avoid an additional case of violent crime of $66.42 (San Francisco) and $79.45 (Los Angeles). Table 6 reports the results of this welfare analysis. Consistent with the estimates in Table 4, our model yields the biggest mean welfare cost in both cities – $66,256 in San Francisco and 20 $73,322 in Los Angeles. This translates into an average welfare cost of $307 and $268 per each additional case in each city, respectively. There is wide variation in willingnesses to pay within each city – the standard deviations are $52,509 ($243 per case) and $73,481 ($268 per case) in the San Francisco and Los Angeles areas, respectively. In contrast to our model, the Bartik and Rosen models only yield average welfare costs between $127 and $179 per case in San Francisco and between $43 and $113 per case in Los Angeles. As reported above, the average welfare costs per case based on the simple linear hedonic model are only $66 and $79 in San Francisco and Los Angeles, respectively. These results make clear that the shape of the MWTP function can have practical but economically significant impacts on the valuation of non-marginal changes in amenities. In particular, alternative methods, including the simple Rosen two-step procedure, Bartik’s (1987) proposed IV strategy, and the simple estimation of a linear hedonic price function all tend to understate willingness to pay to avoid violent crime. 6. Conclusion To be fair, there are other factors that complicate the estimation of the full hedonic model. For example, the data requirements are more substantial than for the simple estimation of the first-stage hedonic price function – in particular, the researcher needs to observe household attributes (in addition to housing market outcomes) along with local public goods or amenities. With the increasing availability of large-scale housing transactions databases (like the one we work with here) and given the ability to link these data to other data sets, like that created by the 21 Home Mortgage Disclosure Act, this is becoming increasingly feasible to do across multiple markets.16 It is therefore important to develop empirical techniques that can yield unbiased estimates of the MWTP function. The typical policy being evaluated with a hedonic model is associated with a non-marginal change in the amenity, and as we show here, results based only on a simple first-stage hedonic price function can yield very different conclusions about welfare. While there are IV techniques in the literature for estimating MWTP functions, these typically require strong assumptions about the homogeneity of preferences across markets. Given evidence that individuals sort across markets based on their preferences for amenities, there is good reason to be suspicious of this assumption. We show here that by exploiting the properties of hedonic equilibrium, it is straightforward to recover unbiased estimates of the MWTP function allowing preferences to differ across markets. We demonstrate, in an application to violent crime in the San Francisco and Los Angeles metropolitan areas that recovering an unbiased estimate of MWTP and properly accounting for preference heterogeneity across those markets has important ramifications for valuing a nonmarginal change. 16 While it reports owners’ stated values instead of transaction prices, restricted access census micro data also allows for this possibility (and has much richer information about household attributes than any other data source). These data are also available for multiple markets. 22 References Albouy, D., W. Graf, R. Kellogg, and Hendrik Wolff (2009). “The Impact of Climate Change on Household Welfare.” University of Michigan Department of Economics Working Paper. Bajari, P., J. Cooley, K. Kim, and C. Timmins (2010). “A Theory-Based Approach to Hedonic Price Regressions with Time-Varying Unobserved Product Attributes: The Price of Pollution.” NBER Working Paper 15724. Bartik. T.J. (1987). “The Estimation of Demand Parameters in Hedonic Price Models.” Journal of Political Economy. 95(1):81-88. Bayer, P., N. Keohane, and C. Timmins (2009). “Migration and Hedonic Valuation: The Case of Air Quality.” Journal of Environmental Economics and Management. 58(1):1-14. Bishop, K. (2008). “A Dynamic Model of Location Choice and Hedonic Valuation.” Olin Business School Working Paper. Black, S.E. (1999). “Do Better Schools Matter? Parental Valuation of Elementary Education.” Quarterly Journal of Economics. 114:577-599. Blomquist, G., M. Berger, and J. Hoehn (1988). “New Estimates of Quality of Life in Urban Areas.” American Economic Review. 78(1):89-107. Bui, T.M. and C. Mayer (2003). “Regulation and Capitalization on Environmental Amenities: Evidence from the Toxic Release Inventory in Massachusetts.” Review of Economics and Statistics. 83:281-302. Chay, K.Y. and M. Greenstone (2005). “Does Air Quality Matter? Evidence from the Housing Market.” Journal of Political Economy. 113:376-424. Court, A.T. (1939). “Hedonic Price Indexes with Automotive Examples.” In The Dynamics of Automobile Demand. New York: General Motors. pp.98-119. Davis, L.W. (2004). “The Effect of Health Risk on Housing Values: Evidence from a Cancer Cluster.” American Economic Review. 94:1693-1704. Ekeland, I.J., J. Heckman, and L. Nesheim (2004). “Identification and Estimation of Hedonic Models.” Journal of Political Economy. 112:S60-S109. Ellickson, B. (1971). “Jurisdictional Fragmentation and Residential Choice.” American Economic Review. 61(2):334-339. Epple, D. (1987). “Hedonic Prices and Implicit Markets: Estimating Demand and Supply Functions for Differentiated Products,” Journal of Political Economy. 95(1):59–80. 23 ________, R. Filimon, and T. Romer (1984). “Equilibrium Among Local Jurisdictions: Towards an Integrated Approach ofVoting and Residential Choice,” Journal of Public Economics 24: 281–304. ________ and R. Romano (1998). “Competition Between Private and Public Schools, Vouchers and Peer Group Effects,” American Economic Review 88:33–63. ________ and G. J. Platt (1998): “Equilibrium and Local Redistribution in an Urban Economy when Households Differ in both Preferences and Incomes,” Journal of Urban Economics, 43, 23-51. ________ and H. Sieg (1999): “Estimating Equilibrium Models of Local Jurisdiction,” Journal of Political Economy, 107, 645-681 Figlio, D.N. and M.E. Lucas (2004). “What’s in a Grade? School Report Cards and the Housing Market.” American Economic Review. 94:591-604. Gayer, T., J. Hamilton, and W.K. Viscusi (2000). “Private Values of Risk Tradeoffs at Superfund Sites: Housing Market Evidence on Learning about Risk.” Review of Economics and Statistics. 82:439-451. Gibbons, S. (2004). “The Costs of Urban Property Crime.” The Economic Journal. 114:F441F463. Greenstone, M. and J. Gallagher (2008). “Does Hazardous Waste Matter? Evidence from the Housing Market and the Superfund Program.” Quarterly Journal of Economics. 123(3):9511003. Griliches, Z. (1961). “Hedonic Price Indexes for Automobiles: An Econometric Analysis of Quality Change.” In The Price Statistics of the Federal Government: Review, Appraisal, and Recommendations. New York: NBER. Lancaster, K.J. (1966). “A New Approach to Consumer Theory.” Journal of Political Economy. 74:132-157. Linden, L.L. and J.E. Rockoff (2008). “Estimates of the Impact of Crime Risk on Property Values from Megan’s Laws.” American Economic Review. 98(3):1103-1127. Palmquist, R.B. (2005). “Property Value Models.” In Handbook of Environmental Economics. Eds. Karl Goran-Maler and J.R. Vincent. Amsterdam: North Holland. Pope, J.C. (2008). “Buyer Information and the Hedonic: The Impact of a Seller Disclosure on the Implicit Price for Airport Noise.” Journal of Urban Economics. 63:498-516. Taylor, L. (2003). “The Hedonic Method.” In A Primer on the Economic Valuation of the Environment. Eds. P. Champ, T. Brown, and K. Boyle. Kluwer. pp. 331-394. 24 Table 1: Housing and Crime Data Summary Statistics Price Year Built Square Footage Bathrooms Bedrooms Total Rooms Stories Property Crime Violent Crime Year: 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 County: Alameda Contra Costa San Francisco San Mateo Santa Clara Los Angeles Ventura San Bernadino Riverside Orange San Francisco Bay Area (n=378,252) Mean Std Dev 426932 231050 1969 22.94 1702 686.2 2.129 0.721 3.173 0.942 6.987 1.986 1.413 0.512 1719 643.2 390.0 215.9 0.052 0.057 0.049 0.059 0.070 0.074 0.080 0.074 0.058 0.074 0.078 0.083 0.067 0.049 0.039 0.037 0.222 0.231 0.215 0.236 0.255 0.262 0.272 0.262 0.234 0.262 0.269 0.277 0.250 0.215 0.193 0.188 0.253 0.280 0.035 0.074 0.358 0.435 0.449 0.184 0.261 0.479 25 Los Angeles Metro Area (n=386,063) Mean Std Dev 276142 173722 1964 20.86 1648 645.2 2.084 0.749 3.148 0.852 7.288 2.009 1.248 0.437 1960 666.5 565.2 273.9 0.058 0.066 0.056 0.060 0.063 0.075 0.079 0.075 0.077 0.080 0.078 0.065 0.059 0.043 0.031 0.037 0.233 0.248 0.230 0.238 0.244 0.263 0.270 0.263 0.266 0.271 0.268 0.246 0.235 0.203 0.172 0.188 0.556 0.118 0.251 0.051 0.024 0.500 0.323 0.434 0.221 0.152 Table 2: Hedonic Price Function Estimates Constant Year Built Square Footage Baths Bedrooms Total Rooms Stories Property Crime Violent Crime Violent Crime2 1993 1994 1995 1996 1997 1998 2000 2001 2002 2003 2004 2005 2006 2007 2008 Contra Costa San Francisco San Mateo Santa Clara San Bernadino Riverside Orange Indicates significance at San Francisco Bay Area Estimate Std Error *** 3,069,122 29,787 -1,498*** 15.32 *** 22,309 101.9 23,632*** 708.8 *** 414.7 -16,089 2,799*** 297.9 *** -31,222 531.5 -18.56*** 0.5258 3.540 -308.9*** 0.1443*** 0.0022 -39,006*** 1,132 *** -29,662 1,134 1,090 -43,775*** *** -52,539 1,048 -31,315*** 1,042 *** -11,672 1,024 64,637*** 1,075 *** 1,154 72,292 81,285*** 1,038 97,251*** 1,095 *** 151,385 1,136 242,994*** 1,220 *** 1,231 248,131 230,389*** 1,544 *** 1,685 104,010 -61,439*** 599.6 *** 147,726 1,756 101,265*** 1,060 *** 46,695 495.4 -107,776*** -103,291*** -7,124*** 0.01 (***),0.05 (**), or 0.1 (*). 26 Los Angeles Metro Area Estimate Std Error *** 2,156,256 25,849 -1,057*** 13.36 *** 14,762 73.21 19,175*** 525.2 *** -26,475 336.8 6,402*** 178.8 -1,043* 538.6 7.820*** 0.5086 -201.6*** 2.793 0.0734*** 0.0015 23,967*** 869.0 *** 5,184 919.5 -8,858*** 908.0 *** -15,447 830.2 -15,975*** 822.4 *** -6,140 803.5 10,760*** 788.7 24,347*** 741.3 48,473*** 761.7 88,016*** 838.7 *** 149,404 909.3 201,088*** 1,016 *** 220,754 1,029 205,834*** 1,368 *** 77,661 1,201 546.3 711.6 1,060 Table 3: Bartik IV Estimate (Stage #1) Variable Constant LA Income (/10,000) Asian-Pacific Islander Black Hispanic LA * Income (/10,000) LA * Asian-Pacific Islander LA * Black LA * Hispanic Indicates significance at Estimate 415.0*** 88.76*** -0.2349*** 4.477*** 172.1*** 50.24*** 0.0943*** 2.941* 15.39*** 38.05*** Standard Error 2.78 3.03 0.021 1.13 4.39 1.93 0.023 1.56 4.90 2.12 0.01 (***),0.05 (**), or 0.1 (*). Table 4: Estimation Results Variable Constant Violent Crime Income (/10,000) Asian-Pacific Islander Black Hispanic Bishop-Timmins SF LA *** 242.1 395.9*** (29.4) (36.3) *** -1.039 (0.071) *** -0.1667*** -0.3118 (0.033) (0.015) 8.798*** 5.950*** (1.56) (1.45) 228.6*** 222.4*** (15.4) (13.4) 104.7*** 66.72*** (4.78) (6.30) LA 312.6*** (18.4) Indicates significance at 0.01 (***),0.05 (**), or 0.1 (*). 27 Bartik I Bartik II Rosen -435.6*** (11.2) 0.6064*** (0.023) 0.067*** (0.005) -3.494*** (0.383) -77.64*** (4.46) -34.96*** (1.97) -245.8*** (19.2) 0.1325*** (0.047) -0.0119 (0.008) 0.1141 (0.354) 9.036 (8.56) 2.719 (3.75) 51.68*** (5.62) -242.8*** (2.23) 0.2028*** (0.003) -0.0160*** (0.001) -7.668*** (0.426) -0.8210*** (0.148) 2.891*** (0.236) Table 5: Simple Linear Hedonic Price Function Constant Year Built Square Footage Baths Bedrooms Total Rooms Stories Property Crime Violent Crime 1993 1994 1995 1996 1997 1998 2000 2001 2002 2003 2004 2005 2006 2007 2008 Contra Costa San Francisco San Mateo Santa Clara San Bernadino Riverside Orange Indicates significance at San Francisco Bay Area Estimate Std Error *** 29,919 3,017,860 -1,496*** 15.39 *** 22,529 102.7 24,856*** 715.0 *** -17,180 421.5 2,704*** 302.0 *** -32,516 543.5 0.5656 -32.35*** -66.42*** 1.676 -47,626*** 1,116 1,164 -31,436*** *** -44,123 1,110 -52,701*** 1,055 -33,348*** 1,053 -14,001*** 1,036 *** 1,093 65,503 76,585*** 1,155 *** 88,065 1,042 106,629*** 1,103 *** 164,060 1,146 1,227 251,597*** 256,377*** 1,250 *** 239,044 1,563 108,752*** 1,742 *** -54,682 582.9 1,736 137,481*** *** 109,639 1,045 503.4 50,643*** -110,370*** -105,051*** -1,061 0.01 (***),0.05 (**), or 0.1 (*). 28 Los Angeles Metro Estimate Std Error *** 2,001,550 25,729 -995.4*** 13.33 *** 14,902 73.81 19,176*** 526.9 *** -26,289 340.0 5,788*** 179.3 0.4312 537.6 4.357*** 0.4851 -79.45*** 1.078 26,238*** 859.4 6,994*** 900.8 *** -8,831 880.2 -17,434*** 825.1 *** -18,237 825.7 -7,585*** 803.5 *** 11,847 785.9 25,175*** 745.9 *** 50,818 754.1 91,146*** 842.3 *** 154,307 900.0 205,853*** 1,020 224,551*** 1,033 *** 210,080 1,368 82,603*** 1,203 541.1 703.2 1,076 Table 6: Willingness-to-Pay for a One Standard Deviation Increase in Violent Crime17 Bishop-Timmins Bartik I Bartik II Rosen Simple Hedonic SF Bay Area Within Market Mean Std Dev -66,256 52,509 -27,492 30,610 -38,657 6,919 -31,276 10,348 -14,340 0 LA Metro Area Within Market Mean Std Dev -73,322 73,481 -11,669 42,900 -29,161 9,780 -30,904 14,591 -21,761 0 17 A one standard deviation increase consists of 215.9 additional cases per 100,000 residents in the SF Bay Area, and 273.9 additional cases in the LA Metro Area. 29 Figure 1: Crime Monitor Locations San Francisco Bay Area Figure 2: Crime Monitor Locations Los Angeles Metro Area 30 Figure 3 Distributions: Violent Crime Rate (Incidents per 100,000) San Francisco and Los Angeles Metropolitan Areas 60000 50000 40000 30000 20000 10000 LA SF Figure 4 Time Series: Violent Crime Rate (Incidents per 100,000) San Francisco and Los Angeles Metropolitan Areas 1000 900 800 700 600 500 400 300 200 100 0 1990 1995 2000 San Francisco 31 2005 Los Angeles 1900 1800 1700 1600 1500 1400 1300 1200 1100 1000 900 800 700 600 500 400 300 200 100 0 0 Appendix: Identification With a Single Market Using a Non-Parametric Hedonic Gradient It is well-known that the identification problem described in Brown and Rosen (1982) and Mendelsohn (1985) can be solved with the use of functional form restrictions introducing non-linearity into the hedonic gradient, even with data from a single market. It is equally wellknown that the unidentified linear-quadratic model described in Section 2 is a very special case of a hedonic equilibrium; in general, we would not expect the hedonic gradient to be linear. (Heckman, Ekeland, and Nesheim (2004)). In this appendix, we illustrate how to employ our estimation procedure using the non-linearity of the hedonic gradient for identification. The implementation differs in a number of important ways from that used to estimate the linearquadratic model with multi-market data in Section 3. We begin with a flexible representation of the hedonic gradient, (A1) where ; may be an infinite dimensional parameter vector. The model still requires that we specify a functional form for MWTP. Although any specification that allows the researcher to recover a unique value of consistent with an observed choice of would work, for the sake of simplicity, we maintain the additive specification used in the previous section: (A2) Using hedonic equilibrium to combine these equations, with some rearranging, we get: 32 ; (A3) We then find the vector of parameters that maximizes the likelihood of the observed vector : (A4) max ∏ , , ; , where , (A5) ; , √ To implement this maximum likelihood procedure, we need only recover the value of consistent with the observed value of gradient ; , given a vector of parameters and hedonic price , along with the determinant of the Jacobian of the transformation of into (which, in this simple example, is a scalar): (A6) ; Income Net of Payments for Housing Services Even with a linear-quadratic model and multi-market data, one may still need to employ a method similar to that described in the previous sub-section. In particular, this is the case if MWTP is a function of household income remaining after paying for housing services (i.e., 33 , , , ). Under this circumstance, the non-linear hedonic price function appears in the equilibrium expression used for identification: (A7) , , The vector of parameters , , , , , , , , are all estimated in the first stage hedonic price function, so the maximum likelihood procedure described above need only be used to recover estimates of , , . 34
© Copyright 2026 Paperzz