Hedonic Prices and Implicit Markets

Hedonic Prices and Implicit Markets: Consistent Estimation of Marginal
Willingness to Pay for Differentiated Products Without Exclusion Restrictions*
by
Kelly Bishop†
Olin Business School
Washington University in St. Louis
Christopher Timmins‡
Duke University and NBER
November 2010
Abstract
Since the publication of Rosen’s “Hedonic Prices and Implicit Markets”, property
value hedonics has become the workhorse model for valuation of local public
goods and environmental amenities. This has happened in spite of a number of
well-known and well-documented econometric problems. Bartik (1987) and
Epple (1987) describe a source of endogeneity in the second stage of Rosen’s
procedure that is difficult to overcome using standard exclusion restriction
arguments. This has led researchers to avoid the estimation of marginal
willingness-to-pay functions altogether, relying instead on the first-stage hedonic
price function and limiting analysis to the evaluation of marginal policies. We
propose a new econometric procedure for the full hedonic model that avoids these
endogeneity problems without the need for exclusion restrictions that may be
difficult to justify, and we implement our estimator using data on violent crime
rates in the Los Angeles and San Francisco metropolitan areas. Results indicate
that controlling for the Bartik-Epple endogeneity bias without strong assumptions
about cross-market preference heterogeneity has important implications for
valuing non-marginal policy interventions.
* We thank seminar participants at the AERE/ASSA Annual Meetings, Camp Resources, Iowa State University,
Montana State University, Syracuse University, the Triangle Resource and Environmental Economics Workshop,
the UEA/RSAI Annual Meetings, and the University of Wisconsin for their helpful comments and suggestions.
†
Kelly Bishop, Olin Business School, Washington University in St. Louis, St. Louis, MO 63130.
[email protected].
‡
Christopher Timmins, Department of Economics, Duke University, Box 90097, Durham, NC 27708.
[email protected].
1
1. Introduction
Dating back to the work of Court (1939), Grilliches (1961) and Lancaster (1966), hedonic
techniques have been used to determine the implicit prices associated with the attributes of
differentiated products. Rosen’s (1974) seminal work proposed a theoretical structure for the
hedonic regression that suggested a procedure for the recovery of marginal willingness to pay
(MWTP) functions for heterogeneous individuals. Importantly, Rosen’s two-step approach
allowed for two sources of preference heterogeneity: individuals’ MWTPs were allowed to differ
with (i) their individual attributes and (ii) the quantity of the product attribute that they consume.
The latter is particularly important when considering non-marginal policy changes (i.e., any
change that is large enough to alter the individual’s willingness to pay at the margin). The twostep procedure suggested by Rosen (and developed further by later authors)1 uses variation in
implicit prices (obtained either by combining data from multiple markets or by allowing for nonlinearity in the hedonic price function) to identify a MWTP function.
With Rosen (1974) as a backdrop, property value hedonics has become the workhorse
model for the valuation of local public goods and environmental amenities despite of a number
of well-known and well-documented econometric problems.2 Some of these problems arise in
Rosen’s first stage – for example, omitted variables that may be correlated with the local
attribute of interest. There is a large and growing literature (cited below) that describes both
quasi-experimental and structural solutions to this problem.
Our concern in this paper is with an important problem that arises in the second stage of
Rosen’s two-step procedure. Bartik (1987) and Epple (1987), in separate papers, describe a
source of endogeneity that is difficult to overcome using standard exclusion restriction
1
See, for example, Brown and Rosen (1982) and Mendelsohn (1985).
See Taylor (2003) and Palmquist (2005) for a comprehensive discussion (with a particular focus on environmental
applications).
2
2
arguments. Specifically, they note that, unless the hedonic price function is linear, the hedonic
price of a product attribute varies systematically with the quantity consumed. The researcher
therefore faces a difficult endogeneity problem in the application of Rosen’s second-stage.
Moreover, because of the equilibrium features of the hedonic model, Bartik and Epple show that
there are very few natural exclusion restrictions that one can use to solve this endogeneity
problem. In particular, within-market supply-side shifters – the natural instrument choice when
estimating a demand equation – are not valid in this context. This has generally left researchers
to choose from a variety of weak instrument strategies, or instruments based on cross-market
homogeneity assumptions that may be difficult to justify.3 With a few exceptions, the hedonics
literature has subsequently ignored preference heterogeneity, focusing instead on recovering
estimates of the hedonic price function and valuing marginal changes in amenities [see, for
example, Black (1999), Bui and Mayer (2003), Chay and Greenstone (2005), Davis (2004),
Figlio and Lucas (2004), Gayer, Hamilton, and Viscusi (2000), Greenstone and Gallagher
(2009), Linden and Rockoff (2008), and Pope (2008)].4
In this paper, we propose an estimation procedure for the recovery of MWTP functions
that avoids the Bartik-Epple endogeneity problem without the need for exclusion restrictions that
may be difficult to justify. In particular, the exclusion restrictions suggested by Bartik require
3
Bound, Jaeger, and Baker (1995) discusses the biases that result from using weak instruments (i.e., instruments that
do a poor job of predicting the endogenous variable). 4
Deacon et al. (1998) noted that “To date no hedonic model with site specific environmental amenities has
successfully estimated the second stage marginal willingness to pay function.” Since that time, a number of papers
have addressed the problem of recovering preference from hedonic estimates. Bajari and Benkard (2005) avoid the
Bartik-Epple endogeneity problem by relying on strong parametric assumptions on utility that turn the second-stage
of Rosen’s procedure from an estimation problem into a preference-inversion procedure. Bishop and Timmins
(2010) show how some of these assumptions can be relaxed with panel data, but even that solution requires strong
assumptions about the stability of preferences over time. Ekeland, Heckman, and Nesheim (2004) provide two
alternative approaches to recovering MWTP. The first imposes very little in terms of parametric restrictions,
requiring only an additive separability assumption in the MWTP specification. Their approach, however, requires
very rich data and can be hard to implement with multiple amenities. Their second approach uses insights from the
first strategy to develop a set of exclusion restrictions to be used in identifying MWTP. Like much of the hedonics
literature, that approach may also suffer from weak instruments problems.
3
that one assume a strong degree of preference homogeneity across markets that may not be
defensible. Using our procedure, any variable that can be used as a Bartik-style instrument can
instead be included as an observable form of preference heterogeneity across markets.
Importantly, our procedure is computationally simple and easy to implement. Moreover,
it does not require any more in terms of data or assumptions than does a typical hedonic
regression. Instead, the procedure recovers the structural parameters underlying MWTP
functions by exploiting the relationship between the quantity of the amenity being consumed and
the attributes of the individuals doing the consumption. That such a relationship should exist in
hedonic equilibrium goes back to the idea of “stratification” found in Ellickson (1971), which
becomes the basis for the estimable Tiebout sorting models of Epple and a variety of co-authors.5
We illustrate our approach in an extremely simple but empirically relevant context – the linearquadratic multi-market setting used by Epple (1987). In the simplest version of that model,
estimates can be recovered with a transparent least squares procedure. With richer variation in
the data, estimates can be obtained using a NLLS procedure. In an appendix, we also
demonstrate how our method can be adapted to achieve identification in a single-market setting
that relies on a flexible representation of the hedonic price function to overcome the
identification problems described by Brown and Rosen (1982) and Mendelsohn (1985).6
Importantly, our estimation approach will work in either of these contexts.
To demonstrate the usefulness of this approach, we implement our estimation procedure
using data on violent crime rates in the San Francisco and Los Angeles metropolitan areas and
5
See, for example, Epple (1987), Epple, Filimon and Romer (1984), Epple and Romano (1998), Epple and Platt
(1998) and Epple and Sieg (1999).
6
Ekeland, Heckman, and Neshiem (2004) demonstrate that the linear-quadratic hedonic model, which is the source
of the identification problems cited by Brown and Rosen (1982) and Mendelsohn (1985), is a very special case – a
non-linear hedonic gradient tends to be the rule rather than the exception. This means that non-linearity provides a
perfectly valid source of identification, even using data taken from a single market.
4
find that recovering the shape of the MWTP function is economically important. In particular,
an individual’s marginal willingness-to-pay to avoid an incident of violent crime per 100,000
residents increases by $1.04 with each additional incident. In contrast, MWTP falls by more
than $0.20 with each additional case under the simple Rosen model. Our results imply welfare
costs of $66,256 and $73,322 for one standard deviation increases in violent crime in the San
Francisco Bay Area and Los Angeles Metro Areas, respectively. This has important
consequences for valuing the welfare implications of a non-marginal increase or reduction in
crime. The overall cost of a similar increase in crime measured by a simple model that uses only
the first-stage hedonic estimates is only 20% to 30% as large.
This paper proceeds as follows. Section 2 describes the endogeneity problem discussed
by Bartik (1987) and Epple (1987) and provides intuition for why it is difficult to solve with
exclusion restrictions. Section 3 describes our estimation procedure in the context of a linearquadratic model with multi-market data using the model employed by Epple (1987). An
appendix describes how the method can be adapted to a data from a single market if the hedonic
gradient is non-linear. Section 4 describes the data used in our application – housing transactions
data from the Los Angeles and San Francisco metropolitan areas between 1990 and 2008,
combined with property and violent crime data from RAND California. Section 5 reports the
results of applying our estimator, the Rosen two-step procedure, and Bartik’s instrumental
variables strategy to these data. Section 6 concludes.
2. Why Is It So Difficult To Recover The MWTP Function?
In their 1987 articles, Epple and Bartik each discuss the econometric problems induced
by the equilibrium sorting process that underlies the formation of the hedonic price function. In
5
particular, unobserved determinants of tastes affect both the quantity of an amenity that an
individual consumes and (if the hedonic price function is not linear) the hedonic price of the
attribute.7 In a regression like that described in the second-stage of Rosen’s two-step procedure,
the quantity of the amenity that an individual consumes will therefore be endogenous.
Moreover, the typical exclusion restrictions used to estimate demand systems (i.e., using supplier
attributes as instruments) will not work because, in addition to affecting the quantity of amenity
and the hedonic price paid for it, the unobservable component of preferences also determines the
supplier from whom the individual purchases. Supplier attributes, which might naturally be used
to trace-out the demand function, are therefore correlated with the unobserved determinants of
MWTP because of the sorting process underlying the hedonic equilibrium.
To make these ideas concrete, and to introduce the notation used throughout the rest of
the paper, consider the following simple example, which is based on Epple’s model. The
hedonic price function is given by:
(1)
where
1, 2, … , , indexes houses.
measures the price of house , and
measures the
level of the amenity associated with that house (for the sake of illustration, we ignore other
amenities and house attributes). For the time-being, we consider data from just a single market,
but we will allow for multi-market data below. The price gradient associated with this hedonic
price function is:
7
The simple case of the linear hedonic price function assumes that attributes can be unbundled and repackaged in
any combination without affecting their marginal value (e.g., the marginal value of another bedroom is the same
regardless of how many bedrooms a house already has). In most empirical settings, this assumption is unrealistic.
6
2
(2)
where
represents the hedonic price of
at house . The second stage of Rosen’s procedure
seeks to recover the coefficients of demand (or marginal willingness-to-pay) and supply
functions for the attribute :
(3)
(demand)
(4)
(supply)
Where
and
represent attributes of the buyers and sellers of house , respectively.
and
similarly represent unobserved determinants of tastes and marginal costs.
The problem we consider in this paper arises from the fact that
correlated with
must necessarily be
. This is easily shown in the following equation. Noting that
in
hedonic equilibrium and combining (2) and (3) yields (with some re-arranging):
(5)
0, equation (5) makes explicit that
Making the standard assumption that
cannot be
uncorrelated with
. Therefore, in order to recover the parameters of equation (3), one requires
an instrument for
.
7
The typical approach to estimating demand functions with endogenous quantities uses
supply function shifters. The problem with that approach in this context, however, is that
and
hedonic sorting induces a correlation between
. This can be easily seen by combining
equation (5) with a similarly derived expression based on equation (4). Noting that
in
hedonic equilibrium
(6)
Solving for
(i.e., the attributes of the supplier of house i):
(7)
Assuming that suppliers are heterogeneous (i.e.,
function of
differently,
instrument for
), equation (7) reveals that
, making it ineligible as a potential instrument for
will be a
in equation (2). Put
determines which supplier individual i buys from, so that
cannot be used to
.
Epple and Bartik propose instrumental variables strategies to deal with this problem.
with market indicator variables. The intuition
Bartik, for example, suggests instrumenting for
for this strategy is that variation in the distribution of suppliers across markets will provide an
exogenous source of variation in the equilibrium quantity of the amenity chosen by each
individual. Using the subscript k = 1, 2, … K to indicate the market in which individual i lives,
this IV strategy is implemented (as 2SLS) by first regressing the quantity
8
,
on a vector of
market indicators, and then using the fitted values as regressors in the estimation of the MWTP
function:
(8)
,
(9)
,
,
,
0
,
,
,
Here, and in the remainder of the paper, “ ” is used to denote an estimated parameter value. In
order for this procedure to work, we require that market dummies are excluded from the MWTP
function. Put differently, the distribution of preferences across markets must be the same.
However, if households sort across markets based on their preferences for amenities, this
assumption will be violated. There is ample empirical evidence that this is the case.8
Suppose, therefore, that we instead allow the MWTP function to have a different
intercept for each market (
(10)
,
,
,
):
,
,
If we were to then implement Bartik’s IV strategy, the parameter
identified from
(11)
,
,
would not be separately
:
,
,
,
8
See, for example, Roback (1982), Blomquist, Berger, and Hoehn (1988), Albouy et al. (2009), Bayer, Keohane
and Timmins (2009), and Bishop (2008).
9
In the following section, we describe an alternative econometric approach to dealing with
this endogeneity problem in the multiple-markets context. Using this approach, we demonstrate
that any variable that could be used as a Bartik-style instrument (e.g., market indicator variables
and/or market indicators interacted with individual attributes
,
) can instead be included
explicitly in MWTP as a form of observable preference heterogeneity. In our application, we
show that this source of preference heterogeneity is empirically relevant. In an appendix, we
show that our model is similarly identified using data from a single market, as long as the
hedonic gradient is not linear.
3. Consistent Estimation Without Exclusion Restrictions
In this section, we demonstrate that one can easily avoid the endogeneity problems
described by Bartik (1987) and Epple (1987), even in a simple linear-quadratic hedonic
specification. Rather than look for an IV strategy, we return to the original setup, but allow the
hedonic gradient to differ by market:
(12)
,
,
(13)
,
,
,
2
,
,
,
,
,
,
This alone does nothing to help with the IV strategy described in equations (8) – (11), but it does
allow for identification of market-level heterogeneity in MWTP using our approach. In order to
demonstrate this, we now introduce rich cross-market heterogeneity into MWTP. In particular,
10
allow mean MWTP (
,
) and the marginal effects of indiviaul attributes (
,
) to differ across
markets:
(14)
,
,
,
,
,
Using hedonic equilibrium to combine (13) and (14), and then solving for
,
yields the
following reduced form equation:
(15)
,
,
,
,
,
,
,
,
,
,
,
Assuming we have information on just two markets, estimation of equation (15) by OLS yields
six pieces of information –
,
,
,
,
,
,
,
,
equations:
(16)
,
(17)
,
(18)
,
(19)
,
(20)
(21)
,
,
,
,
,
,
,
,
,
,
,
,
11
, and
, which are used in the following six
represents the standard deviation of
,
, which we assume to be common across markets.
While our procedure requires cross-market homogeneity in the variance in the unobserved
component of preferences and in the slope of the MWTP function with respect to
,
, we note
that these assumptions are also maintained by Rosen model and the Bartik IV approach to
estimating it. Our procedure does, however, allow for rich cross-market heterogeneity in the
remainder of the MWTP parameters.9
Equations (20) and (21) can be combined to yield an expression for
,
(22)
Knowing
:
,
, we can proceed to recover the remaining MWTP parameters:
2
(23)
,
,
(24)
,
,
(25)
,
2
,
,
(26)
,
2
,
,
2
,
,
,
,
9
Moreover, note that, with more than two markets, we would also be able to parameterize and/or . For
, where
represents the population of
example, with data on three markets, we could set
market k.
12
4. Data
In our application, we estimate a pair of hedonic price functions that are allowed to vary across
housing markets (i.e., Los Angeles Metro and San Francisco Bay Areas). Our primary variable
of interest is the rate of violent crime, although we also control for house attributes, local
property crime rates, and vectors of county and year fixed effects. This requires a great deal of
data from multiple sources, which are summarized below.
4.1 Property Transactions Data
The real estate transaction data we employ cover the five core counties of the San
Francisco Bay Area (Alameda, Contra Costa, San Francisco, San Mateo, and Santa Clara) and
the five counties that comprise the Los Angeles Metro Area (Los Angeles, Ventura, San
Bernadino, Riverside, and Orange) over the period 1993 to 2008. The data were purchased from
Dataquick and include transaction dates, prices, loan amounts, and buyers’, sellers’, and lenders’
names for all transactions. In addition, the data for the final observed transaction for each house
include characteristics such as exact latitude and longitude, square footage, year built, lot size,
number of rooms, number of bathrooms, number of bedrooms, and number of stories in the
building.
The process of cleaning the data involves a number of cuts. Many of these are made in
order to deal with the fact that we only see housing characteristics at the time of the last sale, but
we need to use housing characteristics from all sales as controls in our hedonic price regressions.
We therefore seek to eliminate any observations that reflect major housing improvement (e.g.,
additions) or degradation. First, to control for land sales or re-builds, we drop all transactions
13
where “year built” is missing or with a transaction date that is prior to “year built”. Second, in
order to control for property improvements (e.g., an updated kitchen) or degradations (e.g., water
damage) that do not present as re-builds, we drop any house that ever appreciates or depreciates
in excess of 50 percentage points of the county-year mean price change.10 We also drop any
house that moves more than 40 percentiles between sales in the county-year distribution.11
Additionally, we drop transactions where the price is missing, negative, or zero. After
using the consumer price index to convert all transaction prices into 2000 dollars, we drop one
percent of observations from each tail to minimize the effect of outliers. Finally, as we merge in
the crime data using the property’s geographic coordinates, we drop properties where latitude
and longitude are missing.
This yields a final sample of 378,252 transactions in the San Francisco Bay Area and
386,063 transactions in the Los Angeles Metro. Table 1 reports summary statistics for each
housing market.
4.2 Home Mortgage Disclosure Act
Information about the race and income of home buyers is obtained from data collected
from mortgage applications in accordance with the Home Mortgage Disclosure Act of 1975.
One of the main purposes of this law was to look for evidence of discriminatory lending
practices. In accordance with that goal, HMDA provides information on the race, income, and
gender of mortgage applicants, along with information on the mortgage loan amount, loan date,
10
For example, if a house appreciates 65% between years #3 and #6, and the mean appreciation of houses in that
county over those years is 12%, we drop that house. If the mean appreciation had been 20% over that period, that
house would not be dropped.
11
For example if a house was in the 20th percentile of homes sold in a county in year #1 and then in the 60th
percentile of homes sold in that county in year #7, all observations pertaining to that house would be dropped from
the data set.
14
lender’s name, and the census tract where the property is located. We are able to merge these
data with Dataquick transactions based on these four variables. Bayer, MacMillan, Murphy, and
Timmins (2010) describe the quality of this merge in detail.
4.3 Crime Data
Crime statistics come from the RAND California data base, are organized by “city”, and
measured as incidents per 100,000 residents.12 The data describe property and violent crime
rates for 80 cities in the San Francisco metropolitan area and 175 cities in the Los Angeles
metropolitan area between 1986 and 2008. Figures 1 and 2 illustrate the locations of these cities.
Property crime is defined as “crimes against property, including burglary and motor vehicle
theft”, while violent crime is defined as “crimes against people, including homicide, forcible
rape, robbery, and aggravated assault.” Crime rates are imputed for each house in our data set
using an inverse-distance weighted average of the crime rate in each city.13 We use the property
crime rate as a control in our hedonic estimation and focus attention on violent crimes in our
valuation exercise, as these crimes are less likely to be subject to systematic under-reporting.
[Gibbons (2004)]
Figures 3 and 4 illustrate the distribution of violent crime rates within each metropolitan
area, and how average crime rates in each city have declined over time. These downward trends
are consistent with drops in violent crime rates over the same period in many parts of the US.
Table 1 provides summary statistics for crime in each city, measured at the level of the house
transaction.
12
13
See http://ca.rand.org/stats/community/crimerate.html.
Specifically, each crime rate is weighted by 1/distance2, where distance is measured in kilometers.
15
5. Results
5.1 Estimation of the Hedonic Gradient
The first stage of the property value model estimates the hedonic price function, which
,
relates the price of house j in market k (
transacted in period (
, ,
to its
attributes, both those that vary over time and those that do not. There are many empirical
approaches available for this stage of the hedonic procedure, but we opt for a simple linear
quadratic estimation framework. In particular, for each market (LA and SF) we estimate a
separate hedonic price function using data from 1993-2008. Given that neighborhood
unobservables may be correlated with crime rates, and we do not have a source of quasiexperimental variation in crime, we opt to pool the data over time within each city and include a
vector of county fixed effects.14
(27)
, ,
,
,
,
,
, ,
, ,
, ,
,
, ,
,
,
, ,
, ,
,
, ,
, ,
, ,
,
,
, ,
, ,
, ,
where
VC = violent crime rate (incidents per 100,000 population)
PC = property crime rate (incidents per 100,000 population)
BUILT = year built
SQFT = square footage
BATH = number of bathrooms
BED = number of bedrooms
ROOMS = total number of rooms
STORIES = number of stories
COUNTY = vector of county dummy variables15
14
Bias from omitted variables arises when the regressors are not orthogonal to the regression error. We want to be
clear that any quasi-experimental or instrumental variables approach (like those used in the papers cited in the
introduction) could be integrated into this phase of our analysis.
15
In the San Francisco metropolitan region, this includes Alameda, Contra Costa, San Francisco, San Mateo and
Santa Clara counties. Alameda is omitted from the regression. In the Los Angeles metropolitan region, it includes
16
YEAR = vector of year dummies (1993-2008, 1999 normalized to zero)
We use the coefficients (
,
,
,
) to construct hedonic prices for use in stage-two of the
estimation procedure.
(28)
, ,
,
2
,
, ,
Table 2 reports the results of these two specifications. Bootstrapped standard errors are
reported as well. We note two important features of these results. First, in both cases, we find a
hedonic gradient for violent crime that is negative for relevant values of violent crime and rises
gradually with increasing incidence. This result alone suggests that heterogeneity in individual
preferences is important – if all individuals had homogenous preferences, the hedonic gradient
could be interpreted as a MWTP curve, and these gradients would suggest that individuals have
upward sloping demand curves for public safety. The second important feature of the data to
note is that there is significant heterogeneity in hedonic gradients across cities. This will provide
us with an important source of variation with which to identify our model.
5.2 Recovery of the MWTP function
Next, we specify a MWTP function that we use both with our estimation strategy
(Bishop-Timmins), the traditional two-stage approach proposed by Rosen (no instruments), and
with two versions of the instrumental variables strategy proposed by Bartik (1987). The first
uses a market dummy (i.e.,
1 , along with interactions of that market dummy and
,
as
Los Angeles, Ventrua, San Bernadino, Riverside and Orange counties. Los Angeles and Ventura counties are
omitted from the regression.
17
instruments for
, ,
. The second version allows the Los Angeles market dummy variable to
enter directly into the MWTP function (i.e., a limited form of cross-city preference
heterogeneity), relying on the interaction terms for instruments. In particular, we estimate the
following specifications:
Bishop Timmins:
(29)
,
, ,
,
,
, ,
, ,
,
,
,
, ,
, ,
,
, ,
,
, ,
,
, ,
, ,
,
, ,
, ,
Rosen and Bartik I:
(30)
, ,
, ,
, ,
, ,
, ,
, ,
, ,
Bartik II:
(31)
, ,
,
,
, ,
, ,
, ,
, ,
18
, ,
, ,
where
= hedonic price of violent crime at house j in market k in year t (based on linear
hedonic gradient or log-linear hedonic gradient, depending upon specification)
=
violent
crime rate at house j in market k in year t
, ,
, , = income (in thousands of dollars) of household choosing house j
, , = 1 if household choosing house j is Asian-Pacific Islander
, , = 1 if household choosing house j is Black
, , = 1 if household choosing house j is Hispanic
, ,
The first stage of the Bartik IV procedure is reported in Table 3. Conditional correlations
between violent crime and covariates are, for the most part, intuitive. Violent crime is higher in
Los Angeles Metro and lower in higher income communities (although that effect is lessened in
LA). Compared with houses owned by Whites, crime rates are significantly higher at houses
owned by Blacks and Hispanics. Those racial effects are, moreover, significantly greater in LA.
Results from the four MWTP estimation procedures are reported in Table 4. First,
consider the coefficient on violent crime. Our estimator is the only one that yields a downward
sloping MWTP function, which is consistent with a decreasing MWTP for additional public
safety. In each of the other three cases, an increase in violent crime reduces the individual’s
willingness to pay to avoid it. The coefficients on violent crime in the Bartik I and Rosen
models violate second order conditions for utility maximization. Coefficients in the Bartik II
specification (where minimal cross-market preference heterogeneity is introduced) are nearly all
statistically insignificant.
The effect of increasing income in all but Bartik I is to increase individuals’ distaste for
violent crime. Our model suggests that this effect is nearly two times larger in the San Francisco
Bay Area than in Los Angeles Metro. Our model also finds that the intercept in MWTP is
different across the two cities (i.e., with a significantly smaller average MWTP to avoid violent
19
crime found in Los Angeles). The effect of being Hispanic is significantly different across the
two cities as well, with those in Los Angeles exhibiting a smaller average distaste for violent
crime.
5.3 Interpretation
In this sub-section, we consider the welfare effects of a non-marginal increase in violent
crime. In particular, we consider a one standard deviation increase in each city (215.9 additional
cases in San Francisco and 273.9 additional cases in Los Angeles). This change is
approximately equal (but opposite in sign) to the reduction in violent crime that took place in San
Francisco during the sample period, and smaller than that which took place in Los Angeles.
With non-marginal changes, the implications of the differences between the estimates in Table 4
for welfare will be clear.
For the sake of comparison, we also estimate a simple first-stage hedonic model of the
following form:
(32)
, ,
,
,
,
, ,
, ,
,
,
, ,
,
, ,
, ,
,
,
, ,
, ,
, ,
,
, ,
, ,
These estimates, reported in Table 5, imply a constant willingness to pay to avoid an additional
case of violent crime of $66.42 (San Francisco) and $79.45 (Los Angeles).
Table 6 reports the results of this welfare analysis. Consistent with the estimates in Table
4, our model yields the biggest mean welfare cost in both cities – $66,256 in San Francisco and
20
$73,322 in Los Angeles. This translates into an average welfare cost of $307 and $268 per each
additional case in each city, respectively. There is wide variation in willingnesses to pay within
each city – the standard deviations are $52,509 ($243 per case) and $73,481 ($268 per case) in
the San Francisco and Los Angeles areas, respectively. In contrast to our model, the Bartik and
Rosen models only yield average welfare costs between $127 and $179 per case in San Francisco
and between $43 and $113 per case in Los Angeles. As reported above, the average welfare
costs per case based on the simple linear hedonic model are only $66 and $79 in San Francisco
and Los Angeles, respectively.
These results make clear that the shape of the MWTP function can have practical but
economically significant impacts on the valuation of non-marginal changes in amenities. In
particular, alternative methods, including the simple Rosen two-step procedure, Bartik’s (1987)
proposed IV strategy, and the simple estimation of a linear hedonic price function all tend to
understate willingness to pay to avoid violent crime.
6. Conclusion
To be fair, there are other factors that complicate the estimation of the full hedonic
model. For example, the data requirements are more substantial than for the simple estimation of
the first-stage hedonic price function – in particular, the researcher needs to observe household
attributes (in addition to housing market outcomes) along with local public goods or amenities.
With the increasing availability of large-scale housing transactions databases (like the one we
work with here) and given the ability to link these data to other data sets, like that created by the
21
Home Mortgage Disclosure Act, this is becoming increasingly feasible to do across multiple
markets.16
It is therefore important to develop empirical techniques that can yield unbiased estimates
of the MWTP function. The typical policy being evaluated with a hedonic model is associated
with a non-marginal change in the amenity, and as we show here, results based only on a simple
first-stage hedonic price function can yield very different conclusions about welfare. While there
are IV techniques in the literature for estimating MWTP functions, these typically require strong
assumptions about the homogeneity of preferences across markets. Given evidence that
individuals sort across markets based on their preferences for amenities, there is good reason to
be suspicious of this assumption. We show here that by exploiting the properties of hedonic
equilibrium, it is straightforward to recover unbiased estimates of the MWTP function allowing
preferences to differ across markets.
We demonstrate, in an application to violent crime in the San Francisco and Los Angeles
metropolitan areas that recovering an unbiased estimate of MWTP and properly accounting for
preference heterogeneity across those markets has important ramifications for valuing a nonmarginal change.
16
While it reports owners’ stated values instead of transaction prices, restricted access census micro data also allows
for this possibility (and has much richer information about household attributes than any other data source). These
data are also available for multiple markets.
22
References
Albouy, D., W. Graf, R. Kellogg, and Hendrik Wolff (2009). “The Impact of Climate Change on
Household Welfare.” University of Michigan Department of Economics Working Paper.
Bajari, P., J. Cooley, K. Kim, and C. Timmins (2010). “A Theory-Based Approach to Hedonic
Price Regressions with Time-Varying Unobserved Product Attributes: The Price of Pollution.”
NBER Working Paper 15724.
Bartik. T.J. (1987). “The Estimation of Demand Parameters in Hedonic Price Models.” Journal
of Political Economy. 95(1):81-88.
Bayer, P., N. Keohane, and C. Timmins (2009). “Migration and Hedonic Valuation: The Case of Air
Quality.” Journal of Environmental Economics and Management. 58(1):1-14.
Bishop, K. (2008). “A Dynamic Model of Location Choice and Hedonic Valuation.” Olin
Business School Working Paper.
Black, S.E. (1999). “Do Better Schools Matter? Parental Valuation of Elementary Education.”
Quarterly Journal of Economics. 114:577-599.
Blomquist, G., M. Berger, and J. Hoehn (1988). “New Estimates of Quality of Life in Urban
Areas.” American Economic Review. 78(1):89-107.
Bui, T.M. and C. Mayer (2003). “Regulation and Capitalization on Environmental Amenities:
Evidence from the Toxic Release Inventory in Massachusetts.” Review of Economics and
Statistics. 83:281-302.
Chay, K.Y. and M. Greenstone (2005). “Does Air Quality Matter? Evidence from the Housing
Market.” Journal of Political Economy. 113:376-424.
Court, A.T. (1939). “Hedonic Price Indexes with Automotive Examples.” In The Dynamics of
Automobile Demand. New York: General Motors. pp.98-119.
Davis, L.W. (2004). “The Effect of Health Risk on Housing Values: Evidence from a Cancer
Cluster.” American Economic Review. 94:1693-1704.
Ekeland, I.J., J. Heckman, and L. Nesheim (2004). “Identification and Estimation of Hedonic
Models.” Journal of Political Economy. 112:S60-S109.
Ellickson, B. (1971). “Jurisdictional Fragmentation and Residential Choice.” American
Economic Review. 61(2):334-339.
Epple, D. (1987). “Hedonic Prices and Implicit Markets: Estimating Demand and Supply
Functions for Differentiated Products,” Journal of Political Economy. 95(1):59–80.
23
________, R. Filimon, and T. Romer (1984). “Equilibrium Among Local Jurisdictions: Towards
an Integrated Approach ofVoting and Residential Choice,” Journal of Public Economics
24: 281–304.
________ and R. Romano (1998). “Competition Between Private and Public Schools, Vouchers
and Peer Group Effects,” American Economic Review 88:33–63.
________ and G. J. Platt (1998): “Equilibrium and Local Redistribution in an Urban
Economy when Households Differ in both Preferences and Incomes,” Journal of Urban Economics, 43, 23-51.
________ and H. Sieg (1999): “Estimating Equilibrium Models of Local Jurisdiction,”
Journal of Political Economy, 107, 645-681
Figlio, D.N. and M.E. Lucas (2004). “What’s in a Grade? School Report Cards and the Housing
Market.” American Economic Review. 94:591-604.
Gayer, T., J. Hamilton, and W.K. Viscusi (2000). “Private Values of Risk Tradeoffs at
Superfund Sites: Housing Market Evidence on Learning about Risk.” Review of Economics and
Statistics. 82:439-451.
Gibbons, S. (2004). “The Costs of Urban Property Crime.” The Economic Journal. 114:F441F463.
Greenstone, M. and J. Gallagher (2008). “Does Hazardous Waste Matter? Evidence from the
Housing Market and the Superfund Program.” Quarterly Journal of Economics. 123(3):9511003.
Griliches, Z. (1961). “Hedonic Price Indexes for Automobiles: An Econometric Analysis of
Quality Change.” In The Price Statistics of the Federal Government: Review, Appraisal, and
Recommendations. New York: NBER.
Lancaster, K.J. (1966). “A New Approach to Consumer Theory.” Journal of Political Economy.
74:132-157.
Linden, L.L. and J.E. Rockoff (2008). “Estimates of the Impact of Crime Risk on Property
Values from Megan’s Laws.” American Economic Review. 98(3):1103-1127.
Palmquist, R.B. (2005). “Property Value Models.” In Handbook of Environmental Economics.
Eds. Karl Goran-Maler and J.R. Vincent. Amsterdam: North Holland.
Pope, J.C. (2008). “Buyer Information and the Hedonic: The Impact of a Seller Disclosure on
the Implicit Price for Airport Noise.” Journal of Urban Economics. 63:498-516.
Taylor, L. (2003). “The Hedonic Method.” In A Primer on the Economic Valuation of the
Environment. Eds. P. Champ, T. Brown, and K. Boyle. Kluwer. pp. 331-394.
24
Table 1: Housing and Crime Data
Summary Statistics
Price
Year Built
Square Footage
Bathrooms
Bedrooms
Total Rooms
Stories
Property Crime
Violent Crime
Year:
1993
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
County:
Alameda
Contra Costa
San Francisco
San Mateo
Santa Clara
Los Angeles
Ventura
San Bernadino
Riverside
Orange
San Francisco Bay Area
(n=378,252)
Mean
Std Dev
426932
231050
1969
22.94
1702
686.2
2.129
0.721
3.173
0.942
6.987
1.986
1.413
0.512
1719
643.2
390.0
215.9
0.052
0.057
0.049
0.059
0.070
0.074
0.080
0.074
0.058
0.074
0.078
0.083
0.067
0.049
0.039
0.037
0.222
0.231
0.215
0.236
0.255
0.262
0.272
0.262
0.234
0.262
0.269
0.277
0.250
0.215
0.193
0.188
0.253
0.280
0.035
0.074
0.358
0.435
0.449
0.184
0.261
0.479
25
Los Angeles Metro Area
(n=386,063)
Mean
Std Dev
276142
173722
1964
20.86
1648
645.2
2.084
0.749
3.148
0.852
7.288
2.009
1.248
0.437
1960
666.5
565.2
273.9
0.058
0.066
0.056
0.060
0.063
0.075
0.079
0.075
0.077
0.080
0.078
0.065
0.059
0.043
0.031
0.037
0.233
0.248
0.230
0.238
0.244
0.263
0.270
0.263
0.266
0.271
0.268
0.246
0.235
0.203
0.172
0.188
0.556
0.118
0.251
0.051
0.024
0.500
0.323
0.434
0.221
0.152
Table 2: Hedonic Price Function Estimates
Constant
Year Built
Square Footage
Baths
Bedrooms
Total Rooms
Stories
Property Crime
Violent Crime
Violent Crime2
1993
1994
1995
1996
1997
1998
2000
2001
2002
2003
2004
2005
2006
2007
2008
Contra Costa
San Francisco
San Mateo
Santa Clara
San Bernadino
Riverside
Orange
Indicates significance at
San Francisco Bay Area
Estimate
Std Error
***
3,069,122
29,787
-1,498***
15.32
***
22,309
101.9
23,632***
708.8
***
414.7
-16,089
2,799***
297.9
***
-31,222
531.5
-18.56***
0.5258
3.540
-308.9***
0.1443***
0.0022
-39,006***
1,132
***
-29,662
1,134
1,090
-43,775***
***
-52,539
1,048
-31,315***
1,042
***
-11,672
1,024
64,637***
1,075
***
1,154
72,292
81,285***
1,038
97,251***
1,095
***
151,385
1,136
242,994***
1,220
***
1,231
248,131
230,389***
1,544
***
1,685
104,010
-61,439***
599.6
***
147,726
1,756
101,265***
1,060
***
46,695
495.4
-107,776***
-103,291***
-7,124***
0.01 (***),0.05 (**), or 0.1 (*).
26
Los Angeles Metro Area
Estimate
Std Error
***
2,156,256
25,849
-1,057***
13.36
***
14,762
73.21
19,175***
525.2
***
-26,475
336.8
6,402***
178.8
-1,043*
538.6
7.820***
0.5086
-201.6***
2.793
0.0734***
0.0015
23,967***
869.0
***
5,184
919.5
-8,858***
908.0
***
-15,447
830.2
-15,975***
822.4
***
-6,140
803.5
10,760***
788.7
24,347***
741.3
48,473***
761.7
88,016***
838.7
***
149,404
909.3
201,088***
1,016
***
220,754
1,029
205,834***
1,368
***
77,661
1,201
546.3
711.6
1,060
Table 3: Bartik IV Estimate (Stage #1)
Variable
Constant
LA
Income (/10,000)
Asian-Pacific Islander
Black
Hispanic
LA * Income (/10,000)
LA * Asian-Pacific Islander
LA * Black
LA * Hispanic
Indicates significance at
Estimate
415.0***
88.76***
-0.2349***
4.477***
172.1***
50.24***
0.0943***
2.941*
15.39***
38.05***
Standard Error
2.78
3.03
0.021
1.13
4.39
1.93
0.023
1.56
4.90
2.12
0.01 (***),0.05 (**), or 0.1 (*).
Table 4: Estimation Results
Variable
Constant
Violent Crime
Income (/10,000)
Asian-Pacific Islander
Black
Hispanic
Bishop-Timmins
SF
LA
***
242.1
395.9***
(29.4)
(36.3)
***
-1.039
(0.071)
***
-0.1667***
-0.3118
(0.033)
(0.015)
8.798***
5.950***
(1.56)
(1.45)
228.6***
222.4***
(15.4)
(13.4)
104.7***
66.72***
(4.78)
(6.30)
LA
312.6***
(18.4)
Indicates significance at
0.01 (***),0.05 (**), or 0.1 (*).
27
Bartik I
Bartik II
Rosen
-435.6***
(11.2)
0.6064***
(0.023)
0.067***
(0.005)
-3.494***
(0.383)
-77.64***
(4.46)
-34.96***
(1.97)
-245.8***
(19.2)
0.1325***
(0.047)
-0.0119
(0.008)
0.1141
(0.354)
9.036
(8.56)
2.719
(3.75)
51.68***
(5.62)
-242.8***
(2.23)
0.2028***
(0.003)
-0.0160***
(0.001)
-7.668***
(0.426)
-0.8210***
(0.148)
2.891***
(0.236)
Table 5: Simple Linear Hedonic Price Function
Constant
Year Built
Square Footage
Baths
Bedrooms
Total Rooms
Stories
Property Crime
Violent Crime
1993
1994
1995
1996
1997
1998
2000
2001
2002
2003
2004
2005
2006
2007
2008
Contra Costa
San Francisco
San Mateo
Santa Clara
San Bernadino
Riverside
Orange
Indicates significance at
San Francisco Bay Area
Estimate
Std Error
***
29,919
3,017,860
-1,496***
15.39
***
22,529
102.7
24,856***
715.0
***
-17,180
421.5
2,704***
302.0
***
-32,516
543.5
0.5656
-32.35***
-66.42***
1.676
-47,626***
1,116
1,164
-31,436***
***
-44,123
1,110
-52,701***
1,055
-33,348***
1,053
-14,001***
1,036
***
1,093
65,503
76,585***
1,155
***
88,065
1,042
106,629***
1,103
***
164,060
1,146
1,227
251,597***
256,377***
1,250
***
239,044
1,563
108,752***
1,742
***
-54,682
582.9
1,736
137,481***
***
109,639
1,045
503.4
50,643***
-110,370***
-105,051***
-1,061
0.01 (***),0.05 (**), or 0.1 (*).
28
Los Angeles Metro
Estimate
Std Error
***
2,001,550
25,729
-995.4***
13.33
***
14,902
73.81
19,176***
526.9
***
-26,289
340.0
5,788***
179.3
0.4312
537.6
4.357***
0.4851
-79.45***
1.078
26,238***
859.4
6,994***
900.8
***
-8,831
880.2
-17,434***
825.1
***
-18,237
825.7
-7,585***
803.5
***
11,847
785.9
25,175***
745.9
***
50,818
754.1
91,146***
842.3
***
154,307
900.0
205,853***
1,020
224,551***
1,033
***
210,080
1,368
82,603***
1,203
541.1
703.2
1,076
Table 6: Willingness-to-Pay for a One Standard Deviation Increase in Violent Crime17
Bishop-Timmins
Bartik I
Bartik II
Rosen
Simple Hedonic
SF Bay Area
Within Market
Mean
Std Dev
-66,256
52,509
-27,492
30,610
-38,657
6,919
-31,276
10,348
-14,340
0
LA Metro Area
Within Market
Mean
Std Dev
-73,322
73,481
-11,669
42,900
-29,161
9,780
-30,904
14,591
-21,761
0
17
A one standard deviation increase consists of 215.9 additional cases per 100,000 residents in the SF Bay Area, and
273.9 additional cases in the LA Metro Area.
29
Figure 1: Crime Monitor Locations
San Francisco Bay Area
Figure 2: Crime Monitor Locations
Los Angeles Metro Area
30
Figure 3
Distributions: Violent Crime Rate (Incidents per 100,000)
San Francisco and Los Angeles Metropolitan Areas
60000
50000
40000
30000
20000
10000
LA
SF
Figure 4
Time Series: Violent Crime Rate (Incidents per 100,000)
San Francisco and Los Angeles Metropolitan Areas
1000
900
800
700
600
500
400
300
200
100
0
1990
1995
2000
San Francisco
31
2005
Los Angeles
1900
1800
1700
1600
1500
1400
1300
1200
1100
1000
900
800
700
600
500
400
300
200
100
0
0
Appendix: Identification With a Single Market Using a Non-Parametric Hedonic Gradient
It is well-known that the identification problem described in Brown and Rosen (1982)
and Mendelsohn (1985) can be solved with the use of functional form restrictions introducing
non-linearity into the hedonic gradient, even with data from a single market. It is equally wellknown that the unidentified linear-quadratic model described in Section 2 is a very special case
of a hedonic equilibrium; in general, we would not expect the hedonic gradient to be linear.
(Heckman, Ekeland, and Nesheim (2004)). In this appendix, we illustrate how to employ our
estimation procedure using the non-linearity of the hedonic gradient for identification. The
implementation differs in a number of important ways from that used to estimate the linearquadratic model with multi-market data in Section 3.
We begin with a flexible representation of the hedonic gradient,
(A1)
where
;
may be an infinite dimensional parameter vector. The model still requires that we
specify a functional form for MWTP. Although any specification that allows the researcher to
recover a unique value of
consistent with an observed choice of
would work, for the sake of
simplicity, we maintain the additive specification used in the previous section:
(A2)
Using hedonic equilibrium to combine these equations, with some rearranging, we get:
32
;
(A3)
We then find the vector of parameters that maximizes the likelihood of the observed vector
:
(A4)
max
∏
,
,
;
,
where
,
(A5)
;
,
√
To implement this maximum likelihood procedure, we need only recover the value of
consistent with the observed value of
gradient
;
, given a vector of parameters
and hedonic price
, along with the determinant of the Jacobian of the transformation of
into
(which, in this simple example, is a scalar):
(A6)
;
Income Net of Payments for Housing Services
Even with a linear-quadratic model and multi-market data, one may still need to employ a
method similar to that described in the previous sub-section. In particular, this is the case if
MWTP is a function of household income remaining after paying for housing services (i.e.,
33
,
,
,
). Under this circumstance, the non-linear hedonic price function appears in the
equilibrium expression used for identification:
(A7)
,
,
The vector of parameters
,
,
,
,
,
,
,
,
are all estimated in the first stage hedonic price function,
so the maximum likelihood procedure described above need only be used to recover estimates of
,
,
.
34