Demand for Cars and their Attributes

Final Report
January 2008
Economics For The Environment Consultancy Ltd (eftec) 73 – 75 Mortimer Street
London W1W 7SQ, tel: 02075805383, fax: 02075805385, [email protected],
www.eftec.co.uk
Report prepared for the Department for Transport
by:
Economics for the Environment Consultancy (eftec)
Acknowledgements:
The study team would like to thank the DfT project manager, Sarah Love, all members of
the steering group and the peer reviewers – Professor Ian Bateman (University of East
Anglia), Professor David Hensher (University of Sydney), Professor Kenneth Train (University
of California, Berkeley) and Dr. Gerard Whelan (MVA Consultancy) for input and comment
throughout the study.
CONTENTS

Executive Summary
E1. Overview
E2. Modelling Strategy
    E2.1 The market for new cars
    E2.2 The behavioural model
    E2.3 The econometric model
    E2.4 Definition of ‘the market’
    E2.5 Definition of ‘choice occasions’
    E2.6 Definition of ‘choice set’
E3. Data
    E3.1 Data: Demand
    E3.2 Data: Vehicle physical attributes
    E3.3 Data: Vehicle costs
E4. Estimation and Results
    E4.2 Own-price and cross price elasticities
    E4.3 Forecasting market demand for various cost change scenarios
E5. Concluding Remarks and Suggestions for Further Research

Main Report
1. Introduction
    1.1 Overview
    1.2 Specific research objectives
    1.3 Report structure
2. Methodological overview: Investigating vehicle purchasing decisions
    2.1 The market for new cars
    2.2 Modelling approaches
        2.2.1 Hedonic pricing model
        2.2.2 Discrete choice models
    2.3 Choice of methodology
        2.3.1 Hedonic pricing
        2.3.2 Aggregate versus disaggregate discrete choice models
3. Modelling Strategy
    3.1 Overview
    3.2 Definition of ‘the market’
    3.3 Definition of ‘choice occasions’
    3.4 Definition of ‘choice set’
    3.5 Factors influencing vehicle choice
    3.6 Household utility function
    3.7 Logit Estimation
    3.8 Nested Logit Estimation
4. Data
    4.1 Data: Demand
        4.1.1 Data Sources
        4.1.2 Matching the DVLA data
        4.1.4 Defining Choice Options
        4.1.5 Calculating Market Shares
        4.1.6 Sales Outside Years of Manufacture
    4.2 Data: Vehicle Attributes
    4.3 Data: Vehicle Costs
        4.3.1 Purchase Costs
        4.3.2 Fixed Costs
        4.3.3 Variable Costs
5. Regression Analysis
6. Demand Elasticities and Demand Forecasting
    6.1 Purchase Prices
    6.2 Annual Fixed Costs
    6.3 Variable Costs
7. Summary and Suggestions for Further Research
8. References
Annex 1: Overview of Some Key Concepts
    A1.1 Measuring the responsiveness of consumption
    A1.2 Estimating demand elasticities: the case of homogeneous goods
    A1.3 Differentiated product: the car market
Annex 2: Literature review: Discrete choice models
    A2.1 Application of aggregate and disaggregate models
    A2.2 Specification of vehicle attributes
    A2.3 Elasticity estimates
EXECUTIVE SUMMARY
E1. OVERVIEW
The new car market abounds with choice. A myriad of models, versions and trims face the
new car purchaser; each different vehicle type carefully designed so as to display a
combination of attributes that differentiates it from the competition.
From a consumer’s point of view, of course, the ‘attributes’ of a vehicle include not only its
physical appearance and motoring capabilities (e.g. body style, engine and performance
specifications, number of doors and seats, equipment, etc.) but also its price, how much it
costs to run and how much it costs to insure and tax.
The fundamental objective of this research is to understand how those various attributes
determine households’ new car purchasing decisions.
As detailed subsequently, our approach to achieving that objective is an econometric one.
We use a large and uniquely detailed data set recording the new car purchasing decisions of
households in the UK. By examining the observed patterns of car purchases we estimate a
model of choice behaviour that predicts the market share that a vehicle will command
based on its attributes.
The model allows us to extrapolate beyond our data, predicting how new car purchasing
behaviour will change when one, some or all of the attributes of one, some or all of the
vehicles available in the market are changed. As such, the model provides a powerful tool
for forecasting the probable outcomes of changes in either the physical characteristics of
the cars in the market or the variable and fixed costs of vehicle purchase and use.
This report records the data collection as well as model specification, estimation and
results stages of this research. For illustrative purposes, the model is used to forecast how
demand changes in response to specific changes in the fixed and variable costs of driving
and to predict how those changes might impact on the CO2 emissions profile of the new car
market.
A separate deliverable provides the model in the form of a user-friendly Visual Basic
programme fronting a large MS Access database. That deliverable can be used to predict
the demand and emissions outcomes resulting from any user-specified set of price and cost
changes.
E2. MODELLING STRATEGY
E2.1 The market for new cars
The supply side of the new car market consists of a relatively small number of
manufacturers each producing a variety of vehicles. Each of these varieties differs in its
attributes (e.g. size, doors, top speed, fuel efficiency, etc.). Since the number of
manufacturers is relatively small, it is unrealistic to regard this as a perfectly competitive
market. Rather, each manufacturer is aware of the actions of the other manufacturers and
tailors their product and pricing decisions accordingly. Such a market structure is commonly
referred to as a non-collusive oligopoly.
On the demand side, there are two fundamentally different types of consumer participating
in the new car market:
(i) Firms or organisations purchasing vehicles for use by their employees; and
(ii) Individual households purchasing vehicles for their private use.
The present research deals with purchase decisions of private households exclusively. In
order to understand the car-purchasing behaviour of companies, one would require data on
vehicle purchase decisions of individual firms, data that was not available for the current
study. Further research using a radically different modelling strategy would be required to
analyse such data and this would be an important area for future research efforts.
E2.2 The behavioural model
Our model of household purchasing behaviour is a discrete choice model. That is to say, on
each choice occasion each household is assumed to face a choice; either buy one particular
type of car from the finite set of car varieties available in the market or make no purchase
at all. In making that choice it is supposed that households assess the benefit that they
would derive from purchasing each different model of car. Some they can reject
immediately as being too expensive, too small, too slow, etc. Others may require more
careful consideration. Either way, it is assumed that this choice involves weighing the
advantages resulting from the purchase of a vehicle with a certain set of attributes against
the concomitant purchase and running costs.
For the purposes of our analysis, we simplify the undoubted complexity of that deliberation
into a utility function. In effect, this utility function ascribes a score to each option,
attributing a higher score to options that provide a greater surplus of advantages over
costs. Households are assumed to purchase the particular vehicle type that scores highest,
provided the utility from that option exceeds the option of not buying a car at all.
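The decision rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the linear scoring rule, the weights, the attribute values and the three example options are all hypothetical and are not the utility specification estimated in this study.

```python
# Illustrative decision rule: buy the highest-scoring option, or nothing.
# The linear score, the weights and the attribute values are hypothetical.

def utility(attributes, weights):
    """Score an option as a weighted sum of its attributes."""
    return sum(weights[name] * value for name, value in attributes.items())

def choose(options, weights, no_purchase_utility=0.0):
    """Return the name of the best option, or None for 'no purchase'."""
    best_name, best_score = None, no_purchase_utility
    for name, attributes in options.items():
        score = utility(attributes, weights)
        if score > best_score:
            best_name, best_score = name, score
    return best_name

# Costs enter with negative weights, desirable attributes with positive ones.
weights = {"price": -1.0, "running_cost": -0.5, "size": 1.0}
options = {
    "small hatchback": {"price": 0.8, "running_cost": 0.5, "size": 1.2},
    "family saloon": {"price": 1.4, "running_cost": 0.7, "size": 2.0},
    "sports car": {"price": 3.0, "running_cost": 1.0, "size": 1.0},
}
print(choose(options, weights))
```

Raising `no_purchase_utility` above the best vehicle score makes the function return `None`, which corresponds to the household deciding not to buy a new car on that choice occasion.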
The utility function takes as its arguments the physical and cost attributes of the car under
consideration as well as the characteristics of the household making the choice. Here it is
sufficient to note that our analysis accounts for household income, vehicle purchase price,
fixed and variable costs of motoring and a host of other attributes describing a vehicle’s
physical appearance and motoring capabilities.
As formally presented in the main report¹, our research requires us to assume some
particular functional form that organises those various arguments in some reasonable and
justifiable manner. The objective of the research then reduces to using data on observed
behaviour to estimate the parameters of the proposed utility function.
Armed with an estimate of this fundamental building block of household choice behaviour,
we are able to predict how purchase behaviour will change in response to changes in the
physical and cost attributes of the vehicles available in the new car market.
Observe that because the model is based on:
• a model of choice between alternative vehicles, it is ideally suited to estimating
  substitution relationships; that is to say, it can be used to identify how changes in the
  physical or cost attributes of one type of vehicle impact on demand for both that
  type of car and other varieties of car (i.e. elasticities of demand with respect to
  prices or to attributes).
• a utility function, it is possible to examine the welfare effects of particular changes
  in market conditions. For example, from changes in the attributes (e.g. fixed or
  variable costs of driving) of particular models or from the introduction or removal of
  vehicles from the market.
¹ In particular, see Sections 3.5 to 3.7 for the specification of the utility function.
As per the research specification our analysis focuses on identifying substitution
relationships. A useful extension to the research would be to explore welfare issues.
E2.3 The econometric model
For reasons described in detail in section 2.2 of the main report we elect to estimate the
parameters of our assumed utility function using an aggregate discrete choice model.
Rather than focussing on individual decisions, the basic observation of demand for an
aggregate analysis is the sum of purchasing decisions of all households in a particular
market.
In this case, demand for vehicles in general is summarised by the proportion of households
choosing not to purchase a new car, while demand for the different vehicles is summarised
as the proportion of households purchasing each particular type. Econometric techniques
are used to analyse these market shares so as to identify the underlying choice behaviour of
households. Put simply, the determinants of household choice (i.e. the parameters of the
utility function) are revealed by examining how market shares relate to the attributes and
prices of the vehicle models available in the market.
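To illustrate how market shares can reveal the parameters of the utility function, the sketch below uses the simplest textbook case, a plain (non-nested) logit with a single representative household. In that case the share of each option relative to the no-purchase share depends only on that option's mean utility, so the log share ratio can be regressed on attributes and prices. This is an assumption made for illustration; the specification actually estimated is set out in the main report, and the names `logit_shares` and `mean_utilities` are hypothetical.

```python
import math

def logit_shares(delta):
    """Market shares implied by mean utilities under a plain logit.

    The no-purchase option is normalised to a mean utility of zero.
    """
    denom = 1.0 + sum(math.exp(d) for d in delta.values())
    shares = {name: math.exp(d) / denom for name, d in delta.items()}
    shares["no purchase"] = 1.0 / denom
    return shares

def mean_utilities(shares):
    """Recover mean utilities as ln(share) - ln(no-purchase share)."""
    log_s0 = math.log(shares["no purchase"])
    return {
        name: math.log(s) - log_s0
        for name, s in shares.items()
        if name != "no purchase"
    }

# Hypothetical mean utilities for two vehicle types.
delta = {"type A": -4.0, "type B": -5.5}
shares = logit_shares(delta)
recovered = mean_utilities(shares)
print(shares)
print(recovered)
```

The recovered values match the utilities that generated the shares, which is what makes a regression of log share ratios on vehicle attributes informative about household preferences in this simplified setting.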
The analysis can be considerably enhanced if market share data from a number of markets
is available. For example, over a series of years we would expect to observe changes in the
consumption patterns in response to changes in the characteristics of households and to
changes in the set of vehicle models offered in the market by manufacturers. The time
dimension also introduces greater variability in important determinants of choice such as
the fixed and variable costs of driving. For example, over time we might expect to see
changes in market shares in response to changes in the tax system (e.g. from changes in
VED) or from changes in fuel prices.
In a similar vein, cross-sectional variation can be introduced by identifying regional
markets. While an identical set of vehicles might be available to households in each region,
we might still observe differences in market shares resulting from differences in the
socioeconomic composition of regions. As such, provided sufficient variation exists in the
data, an aggregate demand model can also identify socioeconomic drivers of demand.
While our original intention had been to investigate such socioeconomic factors in detail,
delays in obtaining data and the complexity of the modelling exercise militated against
such a comprehensive analysis. Extending the analysis to investigate socioeconomic drivers
of purchase behaviour may prove a fruitful area for future research.
E2.4 Definition of ‘the market’
We define the new car market as consisting of a set of households living in a distinct
geographic region in a particular year. In particular, our data allows us to identify the
quantity of sales of different types of vehicle to private households for each year from 2001
to 2006² in each of the eleven Government Office Regions (GORs) of Great Britain.
Subsequently, population data available from the Office for National Statistics provides
details of the number of households in each market and allows us to calculate market
shares; that is, the proportions of households in each market purchasing each type of car as
well as the proportion not purchasing a car at all.
² Note, however, that in order to integrate this data with other data sets we actually only use data up to 2005.
E2.5 Definition of ‘choice occasions’
In common with any discrete choice methodology, the aggregate choice model requires a
definition of a choice occasion. In this research we take that choice occasion as spanning
the period of one calendar year. In other words, we assume that each year, each household
in Great Britain is faced by the decision as to which, if any, of the set of new cars available
in the market to purchase.
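Taken together, the market definition in E2.4 and the one-year choice occasion imply a simple market share calculation, sketched below. The sketch assumes each household makes at most one new car purchase per year, so that the no-purchase share is the remainder; the function name `market_shares` and all figures are hypothetical.

```python
def market_shares(registrations, households):
    """Convert registration counts for one region and year into shares.

    `registrations` maps a vehicle type to its number of new registrations;
    `households` is the number of households in that market.
    Assumes at most one new car purchase per household per year.
    """
    total_sales = sum(registrations.values())
    if total_sales > households:
        raise ValueError("more registrations than households in the market")
    shares = {name: n / households for name, n in registrations.items()}
    shares["no purchase"] = (households - total_sales) / households
    return shares

# Hypothetical market: 10,000 households and two vehicle types.
example = market_shares({"type A": 300, "type B": 200}, households=10_000)
print(example)
```

Because the vast majority of households buy no new car in any given year, the no-purchase share dominates each market.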
E2.6 Definition of ‘choice set’
On each choice occasion, we assume that each household surveys the range of vehicles
available to them in the market and chooses to purchase that vehicle which offers them the
greatest utility. Of course cars come in a vast multiplicity of versions differing in both
major features such as their body type as well as minor features such as whether they are
provided with or without air conditioning. Defining what is and what is not a meaningfully
distinct choice option is not a straightforward task.
As we discuss subsequently, we define one choice option as being a vehicle of the same
make, model, body type, fuel type, transmission type and engine size. As far as we are
aware, this level of disaggregation in the definition of choice options is greater than any
used in previous applied work.
Since it will be important in what follows, we also categorise each of our vehicle options
into one of a series of so-called ‘market segments’. The vehicles within each segment show
considerable homogeneity in size, body shape and price and tend to be marketed at a
particular demographic of household. Table E2.1 lists the market segment definitions
provided in our data and used in our analysis.
Table E2.1: Definitions of Market Segments

Segment     | Description                                                       | Examples
------------|-------------------------------------------------------------------|-------------------------------------------------------------
EU A        | Mini Cars                                                         | Ford Ka, Renault Twingo
EU B        | Super Minis                                                       | Citroen Berlingo, Renault Clio, Toyota Yaris
EU C        | Small/ Medium Family Cars and Prestige Hatchback                  | Honda Civic, Volkswagen Golf, Audi A3, Chrysler Neon
EU D        | Large Family Cars, Compact Executives and Entry-Level Luxury      | Ford Mondeo, Vauxhall Vectra, Lexus IS200, Mercedes C-Class
EU E        | Executive Cars and Luxury Cars                                    | Audi A6, Jaguar S-Type, Mercedes S-Class
EU Mini MPV | Multi-purpose vehicles similar in size to segment C vehicles      | Fiat Multipla, Renault Scenic
EU MPV      | Multi-purpose vehicles                                            | Chrysler Voyager, Renault Espace
EU Sports   | Sports cars                                                       | Ford Puma, Porsche Boxster
EU SUV      | Sports utility vehicles                                           | Jeep Grand Cherokee, Land Rover Freelander

Note: Full descriptions of each segment are provided in Table 3.3 of the main report.
E3. DATA
E3.1 Data: Demand
Market shares were calculated using data taken from the New Car Registrations database
compiled by the Driver and Vehicle Licensing Agency (DVLA).
The data was provided to us partially aggregated. In particular, we were provided with
counts of the number of new registrations by private households of vehicles of the same
type. We defined a “type” as those vehicles that shared the same:
• Make (e.g. Ford, Vauxhall etc.);
• Model (e.g. Ford Focus, Vauxhall Corsa etc.);
• Body type (e.g. hatchback, saloon, estate, cabriolet etc.);
• Fuel type (e.g. petrol, diesel, LPG etc.);
• Transmission type (e.g. automatic, manual); and
• Engine size (1.2 litres, 1.6 litres, 2.0 litres etc.).
Our data identified a total of 2,190 such vehicle types, a considerably larger definition of
the choice set than used in any previous study of which we are aware. Our belief is that this
provides a good approximation to households’ actual perception of the choice set,
capturing essential differences in the fuel type, transmission and engine sizes of vehicles
that are fundamental to examining issues surrounding the CO2 emissions of new vehicles.
The counts of the number of new registrations were further disaggregated by year of
purchase and the GOR of residence of the purchaser. As such, our data showed the number
of registrations of each type of car by private households in each GOR in each year from
2001 to 2006 and consisted of some 847,689 records.
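The structure of these records can be sketched as follows: each count is keyed by a six-attribute vehicle type plus the region and year of registration. The field names, the `VehicleType` tuple and the sample registrations are hypothetical and are not the actual layout of the DVLA extract.

```python
from collections import Counter, namedtuple

# The six attributes that define one vehicle "type" in the text above.
VehicleType = namedtuple(
    "VehicleType", "make model body fuel transmission engine_litres"
)

def count_by_market(registrations):
    """Count registrations per (vehicle type, region, year) cell.

    `registrations` is an iterable of dicts, one per registered vehicle.
    """
    counts = Counter()
    for r in registrations:
        vtype = VehicleType(
            r["make"], r["model"], r["body"],
            r["fuel"], r["transmission"], r["engine_litres"],
        )
        counts[(vtype, r["region"], r["year"])] += 1
    return counts

# Hypothetical registrations, for illustration only.
focus = dict(make="Ford", model="Focus", body="hatchback", fuel="petrol",
             transmission="manual", engine_litres=1.6)
sample = [
    dict(focus, region="London", year=2003),
    dict(focus, region="London", year=2003),
    dict(focus, fuel="diesel", region="London", year=2003),
]
counts = count_by_market(sample)
for key, n in counts.items():
    print(key, n)
```

Note that the petrol and diesel versions of the same model fall into separate cells, which is the distinction the choice set definition is designed to preserve.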
As documented in section 4.1 of the main report there were numerous inconsistencies in
these records and an enormous analytic effort was required to clean the DVLA data.
E3.2 Data: Vehicle physical attributes
Data on the physical attributes of each vehicle type were provided by JATO Dynamics. The
JATO dataset was extremely detailed, recording numerous measures of each vehicle’s
physical appearance (e.g. body type, number of doors, length, width, volume, weight),
performance (e.g. engine size, transmission type, number of gears, acceleration, brake
horsepower, fuel efficiency, braking system) and trim (e.g. safety features, wheel trim).
Matching the DVLA demand data to the JATO attribute data was a very considerable task
requiring the development of complex fuzzy matching algorithms. Again full details are
provided in Section 4.1 of the main report.
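The matching algorithms themselves are documented in the main report and are not reproduced here. Purely to illustrate the general idea of fuzzy matching, the sketch below pairs a vehicle description from one source with its closest counterpart in another using Python's standard `difflib` module. The labels, the `normalise` and `match_label` helpers and the 0.8 similarity cut-off are all hypothetical.

```python
from difflib import get_close_matches

def normalise(label):
    """Lower-case a label, treat hyphens as spaces and collapse whitespace."""
    return " ".join(label.lower().replace("-", " ").split())

def match_label(label, candidates, cutoff=0.8):
    """Return the candidate most similar to `label`, or None if none is close.

    Similarity is difflib's ratio on normalised strings; `cutoff` is an
    arbitrary illustrative threshold.
    """
    lookup = {normalise(c): c for c in candidates}
    hits = get_close_matches(normalise(label), list(lookup), n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None

# Hypothetical descriptions from the two sources.
attribute_labels = ["Ford Focus 1.6 Hatchback", "Vauxhall Corsa 1.2 Hatchback"]
print(match_label("FORD FOCUS 1.6 H/BACK", attribute_labels))
print(match_label("Strathcarron SC-5A", attribute_labels))
```

In practice a single string similarity score is unlikely to be enough on its own; matching on structured fields such as make and engine size before comparing free text would reduce false matches.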
It was not possible to construct complete attribute details for 128 of the 2,190 vehicle
options. These generally pertained to very low production vehicles by small scale
manufacturers (e.g. Strathcarron, Maserati etc.) comprising less than 0.5% of the total sales
in our data.
Table E3.1 summarises the average vehicle attributes of data included in the final data set
disaggregated by market segment.
Table E3.1: Average physical attributes of vehicle options used in the analysis disaggregated by market segment

Market Segment (example vehicle) | Engine Size (litres) | CO2 (g/km) | Automatic* | Num Gears | Size (Length × Width) | Doors | Brake Horsepower | Acceleration (secs to 100km/h) | Number of Airbags | Air Conditioning* | Alloy Wheel Rims* | Anti-Lock Braking System*
---------------------------------|----------------------|------------|------------|-----------|-----------------------|-------|------------------|--------------------------------|-------------------|-------------------|-------------------|--------------------------
A (Ford Ka)                      | 1.09 | 142 | 0.30 | 4.78 | 525 | 3.68 | 65  | 15 | 1.54 | 0.13 | 0.27 | 0.35
B (Renault Clio)                 | 1.45 | 155 | 0.21 | 4.59 | 644 | 4.20 | 86  | 13 | 1.78 | 0.39 | 0.38 | 0.59
C (Honda Civic)                  | 1.75 | 178 | 0.27 | 4.54 | 732 | 4.20 | 113 | 11 | 2.23 | 0.59 | 0.52 | 0.65
Mini MPV (Renault Scenic)        | 1.81 | 186 | 0.27 | 4.79 | 743 | 4.91 | 114 | 12 | 2.54 | 0.68 | 0.51 | 0.71
D (Ford Mondeo)                  | 2.18 | 202 | 0.38 | 4.69 | 809 | 3.98 | 152 | 10 | 2.59 | 0.65 | 0.61 | 0.67
E (Jaguar S-Type)                | 3.00 | 244 | 0.62 | 4.91 | 892 | 4.03 | 219 | 9  | 3.07 | 0.70 | 0.70 | 0.70
Sports (Porsche Boxster)         | 3.01 | 259 | 0.41 | 5.33 | 773 | 2.11 | 242 | 7  | 2.38 | 0.71 | 0.83 | 0.81
MPV (Renault Espace)             | 2.24 | 226 | 0.43 | 4.63 | 869 | 4.76 | 139 | 13 | 2.43 | 0.69 | 0.46 | 0.68
SUV (Land Rover Freelander)      | 2.75 | 266 | 0.54 | 4.70 | 827 | 4.58 | 166 | 12 | 2.52 | 0.72 | 0.69 | 0.73

* Dummy variables
E3.3 Data: Vehicle costs
With regard to vehicle costs we identify three different types:
• Purchase costs, where the price of a particular vehicle option will differ over time;
• Fixed annual costs, which include insurance premiums and vehicle excise duty; and
• Variable costs, which we assume for new cars are confined to motoring costs.
Note that in our analysis all cost variables are scaled to 2005 prices, using the Retail Price
Index (RPI) for non-housing items.
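Rescaling to 2005 prices amounts to multiplying each nominal amount by the ratio of the 2005 index value to the index value for the year in which the cost was incurred. A minimal sketch follows; the function name and the index values are hypothetical placeholders, not actual RPI figures.

```python
def to_base_year_prices(amount, year, index, base_year=2005):
    """Express a nominal `amount` from `year` in `base_year` prices.

    `index` maps each year to the value of a price index for that year.
    """
    return amount * index[base_year] / index[year]

# Hypothetical index values for illustration, NOT actual RPI figures.
example_index = {2001: 100.0, 2003: 105.0, 2005: 110.0}

# A 10,000 pound list price observed in 2001, restated in 2005 prices.
print(to_base_year_prices(10_000.0, 2001, example_index))
```

Amounts already recorded in the base year are left unchanged, since the index ratio is then one.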
Purchase Costs
The JATO dataset provides ‘list’ prices for each vehicle for each year for which we have
DVLA registrations data. We are aware that this price probably overstates the actual price
paid by households since most dealers will offer a “discount” on list prices when making
sales. Given the available data, there is little we can do to address this concern.
Whatever the actual purchase price, a key question is whether that price provides a
complete picture of the cost of purchasing the vehicle as perceived by the household. In
particular, a purchaser can always sell their vehicle on in the second-hand market so they
can always recoup some of their initial outlay.
We suspect that vehicles that command relatively lower prices in the second hand market
will be relatively less preferred by consumers. To investigate this issue we sourced another
data set from EuroTaxGlass. For every year from 2001 to 2005 this data recorded the price
that one year old vehicles having driven the average household annual distance of 13,000km
fetched in the second hand market. Not surprisingly, we found that new cars depreciate
rapidly in value losing, on average, 33% of their list price over the course of one year.
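As a rough cross-check on the scale of first-year depreciation, the calculation can be applied to the segment averages for price and resale price reported in Table E3.2. The sketch below uses the figures for segments B, C and D. It is only indicative: a ratio of segment averages need not equal the average of vehicle-level depreciation rates, and the function name is hypothetical.

```python
def depreciation_rate(list_price, resale_price):
    """Share of the list price lost between purchase and resale."""
    return 1.0 - resale_price / list_price

# Segment averages (pounds) taken from Table E3.2: (price, resale price).
segment_averages = {
    "B": (10_581, 7_096),
    "C": (14_184, 9_172),
    "D": (21_183, 14_433),
}

for segment, (price, resale) in segment_averages.items():
    rate = depreciation_rate(price, resale)
    print(f"Segment {segment}: {rate:.1%} of list price lost")
```

The resulting rates lie between roughly 30% and 36%, which is in line with the average loss quoted above.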
We assume that households’ decisions about which car to buy are influenced by the current
resale values of cars bought in previous years and not the actual resale value that a vehicle
realises one year later which, of course, is unobservable at the time of purchase. We
include this resale value as well as the purchase price in our analysis.
Fixed Costs
With regard to the fixed costs of motoring, details of the Vehicle Excise Duty (VED)
payable on vehicles with different levels of CO2 emissions over the span of our data series
are illustrated in Figure E3.1. Notice that there has been some, though only moderate,
fanning-out of VED over this period with discounted bands being introduced for low
emissions vehicles in 2002 and 2003.
In addition to the VED, a substantial fixed cost associated with owning a vehicle is that of
insuring the vehicle. Since the cost of repairing damage differs substantially across
vehicles, and the risks of incurring damage differ across individuals and locations,
insurance premiums themselves differ across vehicles and individuals.
In order to deal with the intricacies of insurance costs, we sourced data from the AA
detailing a large number of insurance quotes from 51 of the UK’s leading insurers. We used
that data to estimate a model of insurance price formation to predict the insurance
premiums payable for each type of new car. Full details of that model are provided in
Section 4.3.2 of the main report.
Figure E3.1: Vehicle excise duty by vehicle CO2 emissions category 2001 to 2005
[Figure: two panels (Diesel, Petrol). Vertical axis: Price (£/year), 0 to 180. Horizontal axis: year, 2001 to 2005. Series by CO2 emission level (g/km): 100 and below; 101 to 120; 121 to 150; 151 to 165; 166 to 185; 186 and above.]
Variable Costs
The variable costs of motoring depend primarily on three factors: (i) the number of miles
driven, (ii) the fuel efficiency of the car and (iii) the cost of fuel. We sourced historical
data recording fuel prices disaggregated by region from the AA. In addition, the JATO
dataset records the weighted-average fuel efficiency for each vehicle in the dataset. In our
model estimation, we treat variable costs differently from the other costs we have
considered. Indeed, rather than incorporating the variable costs into budget considerations,
we treat them as a characteristic of a vehicle option, combining fuel efficiency and fuel price
in order to specify the cost of driving an average kilometre.
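The combination of fuel efficiency and fuel price can be sketched as below. The units are an assumption made for illustration: fuel consumption is taken to be in litres per 100km and the fuel price in pence per litre, giving a cost in pounds per 100km. The function name and the example figures are hypothetical.

```python
def fuel_cost_per_100km(litres_per_100km, pence_per_litre):
    """Fuel cost in pounds of driving 100km.

    Assumes consumption in litres per 100km and price in pence per litre.
    """
    return litres_per_100km * pence_per_litre / 100.0

# Hypothetical vehicle using 6.5 litres per 100km with fuel at 85p per litre.
cost = fuel_cost_per_100km(6.5, 85.0)
print(f"{cost:.2f} pounds per 100km, {cost / 100.0:.4f} pounds per km")
```

Because fuel prices vary by region and year while fuel efficiency varies by vehicle, a cost measure of this kind differs across every vehicle, region and year combination in the data.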
The cost attributes used in the analysis are summarised in Table E3.2 disaggregated by
market segment.
E4. ESTIMATION AND RESULTS
E4.1 Model Estimation
The aggregate discrete choice model was estimated using the demand data and attribute
data described in the previous section. The estimation data set consisted of some 70,850
observations, where each observation relates the market share of a particular type of
vehicle in a particular GOR in a particular year to the physical and cost attributes of that
vehicle. This represents one of the most comprehensive data sets of its kind ever compiled.
All the same, it remains unlikely that the variables available to describe the physical
attributes are sufficient to capture the myriad details that differentiate one vehicle from
another.
Table E3.2: Average Cost attributes Disaggregated by Market Segment (£)

Market Segment (example vehicle) | Price  | Resale Price | Vehicle Excise Duty | Insurance | Fuel Cost for 100km
---------------------------------|--------|--------------|---------------------|-----------|--------------------
A (Ford Ka)                      | 8,019  | 5,351        | 111                 | 289.0     | 5.09
B (Renault Clio)                 | 10,581 | 7,096        | 126                 | 324.5     | 5.46
C (Honda Civic)                  | 14,184 | 9,172        | 145                 | 396.6     | 6.27
D (Ford Mondeo)                  | 21,183 | 14,433       | 158                 | 507.8     | 7.06
E (Jaguar S-Type)                | 38,514 | 26,639       | 164                 | 688.1     | 8.40
MPV (Renault Espace)             | 22,151 | 15,455       | 165                 | 486.4     | 7.81
SUV (Land Rover Freelander)      | 26,171 | 19,789       | 166                 | 543.8     | 9.08
Sports (Porsche Boxster)         | 42,168 | 32,745       | 162                 | 914.3     | 8.90
Mini MPV (Renault Scenic)        | 14,850 | 9,779        | 153                 | 379.5     | 6.50
The existence of unobserved vehicle attributes is a real problem for the robust estimation
of the parameters of the utility function. Indeed, if unobserved attributes are not
controlled for in the estimation procedure, the estimated parameters will in all likelihood
be biased.
To overcome the problem of unobserved attributes we exploit the high level of
disaggregation available in our data. In particular, we contend that our data is relatively
rich in variables describing the quantifiable features of each vehicle type; what is omitted
are those elements of the styling of a vehicle type that are impossible to quantify through
objective measurement. Since we observe demand in numerous markets and within each
market observe several varieties of vehicle of the same make, model and body type that
differ in their engine size and transmission type, we are in a position to estimate a separate
parameter (a so-called fixed effect) to capture those omitted styling factors. To be precise,
we use a fixed effects estimator, estimating a separate coefficient to represent the fixed
utility associated with each make, model, body type, fuel type combination. Compared to
other approaches to handling unobserved attributes, the fixed effects estimator is
extremely robust to misspecification error and greatly increases our confidence in the
reliability of our results.
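The logic of the fixed effects estimator can be illustrated with a deliberately small example with a single regressor. Subtracting group means from both the outcome and the regressor removes anything that is constant within a group, such as unobserved styling, before the slope is estimated. The data below are hypothetical and constructed so that the true slope is -2 within each group; the function name is also hypothetical.

```python
from collections import defaultdict

def within_slope(groups, x, y):
    """OLS slope of y on x after demeaning both within each group."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for g, xi, yi in zip(groups, x, y):
        totals[g][0] += xi
        totals[g][1] += yi
        totals[g][2] += 1
    numerator = denominator = 0.0
    for g, xi, yi in zip(groups, x, y):
        sum_x, sum_y, n = totals[g]
        dx, dy = xi - sum_x / n, yi - sum_y / n
        numerator += dx * dy
        denominator += dx * dx
    return numerator / denominator

# Two hypothetical models; model "b" is both pricier and more attractive
# for reasons the analyst cannot observe (a higher group intercept).
groups = ["a", "a", "a", "b", "b", "b"]
price = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
demand = [8.0, 6.0, 4.0, 22.0, 20.0, 18.0]

fixed_effects = within_slope(groups, price, demand)
pooled = within_slope(["all"] * len(price), price, demand)
print(fixed_effects, pooled)
```

Ignoring the grouping gives a positive slope, wrongly suggesting that higher prices raise demand, whereas the within-group estimate recovers the negative price response.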
Using the fixed effects estimator, we experimented with a handful of specifications of the
utility function. We discuss and contrast those specifications in Section 5 of the main
report. Our preferred specification is a “nested logit model”. This specification allows for
non-proportional substitution across market segments such that changes in the attributes of
a vehicle type in one market segment will impact differently on demand for vehicles in that
same segment than on demand for vehicles in alternative segments. To give a flavour of the
output from that model, Table E4.1 reports estimates of the vehicle attribute parameters.
The full set of parameter estimates can be found in Table 5.1 of the main report.
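The non-proportional substitution property of a nested logit can be seen in a small numerical sketch. The formula below is the standard textbook nested logit share with a single nesting parameter, written here as `lam`, where `lam = 1` collapses to a plain logit. It is an assumption that this parameterisation corresponds to the one used in the main report, and the utilities and segment labels are hypothetical.

```python
import math
from collections import defaultdict

def nested_logit_shares(delta, nest, lam):
    """Shares under a one-level nested logit.

    `delta` maps vehicle -> mean utility, `nest` maps vehicle -> segment,
    and 0 < lam <= 1 is the nesting parameter. The no-purchase option sits
    in its own nest with utility zero.
    """
    inclusive = defaultdict(float)
    for j, d in delta.items():
        inclusive[nest[j]] += math.exp(d / lam)
    denom = 1.0 + sum(total ** lam for total in inclusive.values())
    shares = {}
    for j, d in delta.items():
        within = math.exp(d / lam) / inclusive[nest[j]]
        shares[j] = within * inclusive[nest[j]] ** lam / denom
    shares["no purchase"] = 1.0 / denom
    return shares

# Two hypothetical vehicles in segment B and one in segment D.
delta = {"b1": 0.0, "b2": 0.0, "d1": 0.0}
nest = {"b1": "B", "b2": "B", "d1": "D"}

before = nested_logit_shares(delta, nest, lam=0.5)
after = nested_logit_shares(dict(delta, b1=0.5), nest, lam=0.5)
for rival in ("b2", "d1"):
    print(rival, after[rival] / before[rival])
```

When vehicle `b1` becomes more attractive, its segment rival `b2` loses proportionally more share than `d1` in the other segment. With `lam=1.0` the two proportional losses are identical, which is the restrictive substitution pattern the nested specification relaxes.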
Table E4.1: Parameter estimates for vehicle attributes (other parameters not presented) from a fixed-effects nested logit model of demand for new cars

Variable                                | Parameter Estimate | p-value
----------------------------------------|--------------------|--------
Cost Attributes (scaled by avg. income) |                    |
Purchase Price                          | -1.0165            | 0.000
Resale Price                            | 0.7046             | 0.000
Fixed Cost                              | -1.9170            | 0.003
Fuel Cost per 100km                     | -0.3850            | 0.000
Physical Attributes                     |                    |
Automatic                               | 0.0808             | 0.000
Num Gears                               | 0.0371             | 0.000
Size (width x length)                   | 0.0010             | 0.000
Brake Horsepower                        | 0.0020             | 0.000
Acceleration (secs to 100km/h)          | -0.0097            | 0.000
Number of Airbags                       | 0.0201             | 0.000
Air Conditioning                        | 0.0087             | 0.203
Alloy Wheel Rims                        | 0.0239             | 0.000
Anti-Lock Braking System                | -0.0724            | 0.000
Doors                                   | 0.0039             | 0.131

N                 | 70,875
Num Fixed Effects | 735
R² Within         | 0.9133
R² Between        | 0.1512
R² Overall        | 0.3257
Consider first the coefficients on the cost attributes. As we would expect, we observe
negative coefficients on purchase price, fixed costs and variable costs, since these represent
losses in wealth, and a positive coefficient on resale price, since this represents money back
in the bank. We also find that each of the cost parameters is highly significant. Clearly,
households consider each of the four cost attributes in making their purchasing decisions,
but the different absolute values of the four parameters indicate that those various costs
are treated with different weights.
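As a rough illustration of such relative weights, the short Python sketch below compares the resale price and purchase price coefficients from Table E4.1. It assumes that both variables are measured in the same income-scaled money units, so that the ratio of coefficients can be read as a relative weight; that reading is our assumption for this illustration and the variable names are hypothetical.

```python
# Coefficients reported in Table E4.1.
beta_purchase = -1.0165
beta_resale = 0.7046

# Assumption: both variables are in the same (income-scaled) money units,
# so the ratio of coefficients can be read as a relative weight.
resale_weight = beta_resale / abs(beta_purchase)
print(f"{resale_weight:.3f}")
```

Under that assumption, a unit of expected resale value carries roughly 0.69 of the weight of a unit of purchase price in the purchase decision.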
In the main, the coefficients estimated on the physical attributes are also significant and
with signs that follow our prior expectations. The notable exception is provided by the
coefficient on anti-lock braking system which remains stubbornly and anomalously negative.
E4.2 Own-price and cross-price elasticities
The model of choice behaviour can be used to estimate how demand for a vehicle type
changes as a result of changes in selling prices. The usual way to describe the
responsiveness of demand to changes in prices is through the estimation of price
elasticities. Formally, own-price elasticity of demand measures the percentage change in
the demand for a good when the price of that good changes by one percent (all other things
being equal). Table E4.2 shows the distribution of own-price elasticity estimates across
vehicle types in each of the market segments.
Table E4.2: Percentile values of own-price elasticity of market shares

Market Segment (example vehicle)    90th      75th      Median    25th      10th
A (Ford Ka)                        -6.887    -5.672    -4.665    -3.962    -3.353
B (Renault Clio)                   -5.629    -4.988    -4.333    -3.738    -3.243
C (Honda Civic)                    -7.140    -6.205    -5.365    -4.525    -3.842
D (Ford Mondeo)                    -5.191    -4.361    -3.597    -3.002    -2.557
E (Jaguar S-Type)                  -5.889    -4.084    -3.238    -2.694    -2.332
MPV (Renault Espace)               -3.650    -3.235    -2.813    -2.407    -2.084
SUV (Land Rover Freelander)        -2.351    -1.855    -1.308    -1.038    -0.846
Sports (Porsche Boxster)           -3.939    -2.538    -1.590    -1.141    -0.839
Mini MPV (Renault Scenic)          -4.888    -4.385    -3.746    -3.088    -2.622
All                                -5.913    -4.827    -3.701    -2.745    -1.635
The own-price elasticities shown in Table E4.2 appear entirely plausible and are not
dissimilar from the sorts of values recorded elsewhere in the literature (see Annex 2). In
accordance with profit-maximising behaviour, the vast majority of vehicles have price
elasticities that exceed unity in absolute value (if demand is inelastic then a manufacturer
can increase profits by increasing prices). Also, in line with our expectations, demand is
generally more price responsive for low-specification vehicles in market segments A, B and C
and less responsive in the high-specification SUV and Sports segments.
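The link between elasticity and profit maximisation can be checked with a short Python sketch. It assumes, purely for illustration, a constant-elasticity demand curve with price normalised to one; the function name and the inelastic value of -0.8 are hypothetical, while -3.701 is the median of the "All" row in Table E4.2.

```python
def revenue_change(elasticity, price_rise=0.01):
    """Proportional change in revenue after a price rise, assuming a
    constant-elasticity demand curve q = p ** elasticity with the initial
    price normalised to 1 (an assumption made for this illustration)."""
    p1 = 1.0 + price_rise
    q1 = p1 ** elasticity
    return p1 * q1 - 1.0

inelastic = revenue_change(-0.8)    # hypothetical inelastic vehicle
elastic = revenue_change(-3.701)    # median own-price elasticity, Table E4.2
print(inelastic > 0, elastic < 0)
```

With inelastic demand a price rise increases revenue while fewer cars are sold, so costs fall and profit must rise; a profit-maximising manufacturer would therefore keep raising prices until demand became elastic.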
Of course, demand for a particular vehicle type is influenced not only by its own price but
also by the prices of other vehicle options available in the market, a relationship that is
described by a cross-price elasticity. Formally, cross-price elasticity measures the
percentage change in demand for a given good when the price of a related good changes by
one percent. For substitute goods, such as vehicle models with similar characteristics, the
cross-price elasticity will be positive (see footnote 3).
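The two definitions can be made concrete with a short Python sketch. The function name and all sales and price figures are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def elasticity(q_before, q_after, p_before, p_after):
    """Percentage change in quantity divided by percentage change in price."""
    pct_q = (q_after - q_before) / q_before
    pct_p = (p_after - p_before) / p_before
    return pct_q / pct_p

# Hypothetical figures: the price of vehicle X rises by 1% (10,000 to 10,100).
own = elasticity(1000, 960, 10000, 10100)      # demand for X itself falls 4%
cross = elasticity(2000, 2004, 10000, 10100)   # demand for substitute Y rises 0.2%
print(round(own, 3), round(cross, 3))
```

The own-price elasticity is negative because sales of X fall when its price rises, while the cross-price elasticity is positive because some buyers switch to the substitute.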
Table E4.3 shows the average within- and across-segment cross-price elasticities for
vehicle options available in 2005. In line with our expectations we find that cross-price
elasticities are larger for vehicle options in the same segment; i.e. that households are
more likely to substitute within rather than across segments.
E4.3 Forecasting market demand for various cost change scenarios
The model allows us to ask numerous sophisticated questions of the data. For example,
imagine we wished to investigate the market ramifications of a change in the price of
vehicles in a particular section of the new car market. By altering the price of those
particular vehicles, we can use our model to predict how households’ purchasing behaviour
will change.
Footnote 3: That is, as the price of a given vehicle increases, we would expect households to switch to an alternative
(substitute) vehicle, increasing the quantity of the substitute consumed. In contrast, complementary goods are expected to have
negative cross-price elasticity: if the price of a product decreases, consumers will increase their consumption of
both the specific product and its complement (e.g. digital set-top boxes and high-definition TVs).
Table E4.3: Average own- & cross-price elasticities from 2005

                                                 Elasticities
Market Segment                  Own-Price    Cross-Price     Cross-Price
(example vehicle)                            Same Segment    Different Segment
A (Ford Ka)                      -4.411        0.105           0.00001
B (Renault Clio)                 -4.239        0.022           0.00003
C (Honda Civic)                  -5.223        0.019           0.00002
D (Ford Mondeo)                  -3.644        0.008           0.00001
E (Jaguar S-Type)                -4.017        0.020           0.00001
MPV (Renault Espace)             -3.654        0.030           0.00001
SUV (Land Rover Freelander)      -2.679        0.030           0.00001
Sports (Porsche Boxster)         -1.950        0.004           0.00002
Mini MPV (Renault Scenic)        -1.534        0.005           0.00003
Consider the data shown in Table E4.4. The left-hand column shows a categorisation of
vehicles according to their CO2 emissions measured in g/km. The second column lists the
number of purchases of vehicles in each CO2 category made in 2005. Notice that the first row of
the data lists the number of households choosing the “outside good”, that is, choosing not
to purchase a new vehicle in 2005. Each subsequent column shows the model’s predictions
of how sales in each category will change as a result of a 1% increase in the price of
vehicles in the CO2 category shown at the top of that column.
Not surprisingly, when the price of vehicles in a particular category goes up, we observe
demand for those vehicles declining; all values along the diagonal are negative. Also in line
with our expectations, the model predicts that households substituting away from the
vehicles experiencing a price rise tend to choose vehicles with a relatively similar CO2
rating. Observe the important role played by the outside good in allowing for households to
substitute out of the market. For example, the 1% increase in price for vehicles generating
between 111 and 120 gCO2/km results in 983 households substituting away from the
vehicles in that category. Whilst 912 of those households are predicted to simply pick a
different vehicle from another category, our model indicates that 71 will choose not to
purchase a vehicle at all.
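The example just described can be reproduced from the "111 to 120" column of Table E4.4 with a few lines of Python. The variable names are ours, and the category-level elasticity computed at the end is an aggregate over all vehicles in the category, not one of the vehicle-level elasticities of Table E4.2. Note that the individual cells for the other vehicle categories sum to 907 rather than the 912 quoted in the text; we assume the difference reflects rounding in the published table.

```python
# Column "111 to 120" of Table E4.4: change in sales of each row category
# after a 1% price rise for vehicles emitting 111 to 120 gCO2/km.
outside_good = 71        # households leaving the new car market
own_category = -983      # lost sales in the 111 to 120 category itself
other_categories = [32, 69, 134, 315, 165, 50, 82, 40, 11, 6, 2, 0, 0, 1, 0]

sales_2005 = 25_857      # 2005 sales in the 111 to 120 category

switched = sum(other_categories)                  # buyers picking another category
left_market_share = outside_good / -own_category  # share choosing not to buy
category_elasticity = (own_category / sales_2005) / 0.01

print(switched, round(left_market_share, 3), round(category_elasticity, 2))
```

Roughly 7% of the households deterred by the price rise leave the market altogether, which is why omitting the outside good would overstate switching between categories.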
The final two rows of Table E4.4 provide an indication of how the price change (for each
CO2 category) impacts upon the CO2 emissions profile of new car purchases. The first of these
rows shows the sales-weighted market average CO2 emissions, that is, the average emissions
level of vehicles sold in the market. Of course, the average emissions of vehicles sold does
not reflect the fact that fewer vehicles might be sold in total. Accordingly, the second row
reports the CO2 emissions averaged over all households in the market.
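The relationship between the two averages can be checked against the figures in Table E4.4. The Python sketch below takes the first value in the market average row (173.64) to be the 2005 baseline and assumes that households buying no new car contribute zero emissions to the population average; both readings are our assumptions, although they reproduce the tabulated value.

```python
# 2005 sales by CO2 category, from the "2005 Sales" column of Table E4.4.
sales = [5_719, 25_857, 32_011, 87_324, 161_520, 164_892, 87_145, 124_628,
         75_216, 61_911, 94_011, 43_366, 33_849, 15_094, 10_285, 1_904]
outside_good = 23_142_289   # households buying no new car in 2005
market_avg_co2 = 173.64     # sales-weighted average, g/km (taken as baseline)

total_sales = sum(sales)
households = total_sales + outside_good

# Assumption: non-purchasing households contribute zero emissions.
population_avg_co2 = market_avg_co2 * total_sales / households
print(round(population_avg_co2, 3))
```

The result agrees with the 7.363 reported in the first cell of the population average row, so a fall in total sales lowers the population average even when the market average is unchanged.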
Since the price changes explored in Table E4.4 are rather small and affect just one small
segment of the market we observe relatively little change in purchasing behaviour. Of
course, more substantial price changes or price changes across a wider segment of the
market would have more substantive impact on choice behaviour and consequently on
average CO2 emissions.
In Sections 6.2 and 6.3 of the main report, we carry out similar analyses looking at the
market consequences of changes in the variable and fixed costs of motoring.
Table E4.4: Changes in sales patterns from 1% change in price of vehicles in different CO2 emissions categories (disaggregated by CO2 emissions
categories)

Change in Sales for Vehicles in Row Category from 1% Change in Price of Vehicles in Column Category

CO2 g/km            2005 Sales 100-110 111-120 121-130 131-140 141-150 151-160 161-170 171-180 181-190 191-200 200-225 226-250 251-275 275-300 300-400 400-500
Outside Good        23,142,289      11      71     101     281     501     651     351     541     351     351     651     361     351     191     151      61
100-110                  5,719    -205      32      30      20     104       8       1       6       8       0       0       0       0       0       0       0
111-120                 25,857      33    -983      72     123     315     160      50      91      52      13       9       3       0       0       1       0
121-130                 32,011      28      69  -1,263     143     316     207     107     148      78      35      22       9       1       0       2       0
131-140                 87,324      19     134     151  -3,006     780     689     279     431     213      90      57      21       2       2       5       0
141-150                161,520      94     315     324     766  -4,641   1,166     374     696     363     142     116      39       5       5       7       1
151-160                164,892       8     165     208     654   1,102  -5,304     605     841     454     250     293     113      11      12      16       3
161-170                 87,145       0      50     111     300     383     637  -3,021     587     323     204     176      82       8       7      18       2
171-180                124,628       5      82     140     395     607     806     580  -4,275     370     275     253     100      17      11      19       3
181-190                 75,216       5      40      64     174     288     409     279     345  -2,572     140     177      75      23      16      15       6
191-200                 61,911       0      11      32      83     122     228     182     267     134  -1,770     173      74      37      20      21      11
200-225                 94,011       0       6      19      48      89     244     144     219     145     160  -2,192     149      85      46      45      35
226-250                 43,366       0       2       7      16      27      82      56      75      53      61     135  -1,106      45      25      23      18
251-275                 33,849       0       0       1       1       2       6       4      11      13      25      62      38    -619      22      20      10
275-300                 15,094       0       0       0       1       2       6       3       6       7      12      30      18      19    -366       9       6
300-400                 10,285       0       1       1       3       4       8       9      10       7      10      23      13      14       6    -351       3
400-500                  1,904       0       0       0       0       0       0       0       0       1       3      11       5       3       2       2    -157
Market avg CO2          173.64  173.65  173.68  173.68  173.71  173.69  173.68  173.65  173.60  173.59  173.61  173.58  173.58  173.60  173.61  173.59  173.61
Population avg CO2       7.363   7.363   7.364   7.364   7.364   7.361   7.360   7.361   7.357   7.358   7.359   7.355   7.358   7.358   7.360   7.360   7.361
As mentioned previously, we have coded an easy-to-use tool that automates the production
of outputs such as that illustrated in Table E4.4 such that the user can examine the market
consequences of any particular price or cost change applied to any section of the new car
market.
E5. CONCLUDING REMARKS AND SUGGESTIONS FOR FURTHER RESEARCH
The research reported in this document records the creation of a remarkably detailed
aggregate data set of new car purchases in GB. We believe our final data set to be one of
the most detailed and accurate data sets of its type. The data are used to estimate a model
of household new car purchasing behaviour that allows us to investigate how market
demand will react to changes in the purchase, fixed and variable costs of cars.
There are several key areas in which the research reported here might be extended.
• First, delays in accessing data and constraints imposed by the magnitude of the
research task mean that the model reported here does little to investigate the
relationship between demand and socioeconomic characteristics. In fact,
the authors have access to a rich data set from two national surveys (the National
Travel Survey and the Expenditure and Food Survey) that provide household level
observations of vehicle purchasing behaviour. Combining this individual level data with
the aggregate data analysed here provides a unique opportunity to estimate a demand
model that can provide valuable insights into how household socioeconomics impact on
demand for new cars.
• Second, the current report focuses on the estimation of substitution patterns resulting
from policies that change costs. The model can also be used to estimate the welfare
impacts of those changing costs. Such an analysis would be particularly insightful if
applied to the model with socioeconomics described previously. In that case, the
distributional impacts of policies across regions and socioeconomic groups could be
investigated in detail.
• Third, the current model only considers the purchasing behaviour of private
households. In fact, each year companies and organisations purchase at
least as many vehicles. Clearly, a priority for further research would be to model
demand for company cars. One of the key problems here is the availability of data.
However, in undertaking the current research project it has become clear how, with
the help of the DVLA, such a data set might be sourced. Moreover, we have some
preliminary ideas as to how a demand model using that data might be formulated.
• Finally, the current research focuses solely on the demand side of the new car market.
In fact, the current analysis can be extended to investigate a number of
supply side issues. In particular, using insights from the theory of non-collusive
oligopolistic markets, it is possible to use the current models to estimate the mark-up
(that is the excess of price over the marginal costs of production) enjoyed by each
vehicle type. Taking those insights one step further, one can attempt to estimate each
manufacturer’s marginal cost function. Such a function would provide the DfT with
estimates, for example, of how costly it is for car makers to manufacture vehicles with
extra fuel efficiency.
The EFTEC research team is eager to pursue all these research ideas.
MAIN REPORT
1. INTRODUCTION
1.1 Overview
The new car market abounds with choice. A myriad of models, versions and trims face the
new car purchaser; each different vehicle type carefully designed so as to display a
combination of attributes that differentiates it from the competition.
From a consumer’s point of view, of course, the ‘attributes’ of a vehicle include not only its
physical appearance and motoring capabilities (e.g. body style, engine and performance
specifications, number of doors and seats, equipment, etc.) but also its price, how much it
costs to run and how much it costs to insure and tax.
The fundamental objective of this research is to understand how those various attributes
determine households’ new car purchasing decisions.
As detailed subsequently, our approach to achieving that objective is an econometric one.
We use a large and uniquely detailed data set recording the new car purchasing decisions of
households in the UK. By examining the observed patterns of car purchases we estimate a
model of choice behaviour that predicts the market share that a vehicle will command
based on its attributes.
The model allows us to extrapolate beyond our data, predicting how new car purchasing
behaviour will change when one, some or all of the attributes of one, some or all of the
vehicles available in the market are changed. As such, the model provides a powerful tool
for forecasting the probable outcomes of changes in either the physical characteristics of
the cars in the market or the variable and fixed costs of vehicle purchase and use.
This report records the data collection as well as model specification, estimation and
results stages of this research. For illustrative purposes, the model is used to forecast how
demand changes in response to specific changes in the fixed and variable costs of driving
and to predict how those changes might impact on the CO2 emissions profile of the new car
market.
A separate deliverable provides the model in the form of a user-friendly Visual Basic
programme fronting a large MS Access database. That deliverable can be used to predict
the demand and emissions outcomes resulting from any user-specified set of price and cost
changes.
1.2 Specific research objectives
The Department for Transport’s (DfT) research specification (DfT, 2006) states that the aim
of the study is to identify how vehicle purchasing choices respond to changes in:
1. The purchase price of vehicles;
2. Fixed costs of motoring (e.g. VED and insurance costs); and
3. Variable costs of motoring (e.g. fuel costs).
In addition, the DfT identified a number of other issues they hoped the research might be
able to address. The full set of research objectives is listed in Table 1.1.
Table 1.1: Research objectives
A. Estimation of own and cross (purchase) price elasticity of private demand for vehicles:
   • By car model & market segment
   • By car model & market segment differentiated by consumer type
B. Estimation of attribute elasticity of private demand for vehicles:
   • With regards to fixed costs of vehicle ownership
   • With regards to variable costs of vehicle ownership (e.g. costs determined by fuel efficiency
     and fuel prices)
   • Investigation of elasticities of demand with regards to other vehicle attributes (e.g. size,
     doors, power, CO2 emissions, etc.)
C. Estimation of ‘price’ elasticity of vehicle usage with regards to:
   • Fixed costs of vehicle ownership
   • Variable costs of vehicle ownership
D. Investigation of other features of private demand for vehicles:
   • Market responses to the addition of a new model of vehicle
   • Investigation of how elasticities differ according to consumer characteristics
E. Investigation of consumer definitions of prices and costs:
   • With regards to purchase price (i.e. capital value or monthly car loan costs)
   • With regards to running costs (i.e. how running costs are compared to fixed costs and
     purchase prices in the purchase decision).
F. Investigation of other facets of the market for new cars with regards to:
   • The demand for vehicles from non-households (e.g. company cars)
   • The EU Voluntary Agreements
Importantly, no single methodology is ideally suited to addressing this entire range of
objectives. We note that methods that lend themselves to investigating household demand
for new cars are not necessarily appropriate for investigating the vehicle purchase decisions
of organisations. Moreover, measures such as voluntary agreements on new car fuel
efficiency impact upon the production decisions of vehicle manufacturers rather than
purchasing behaviour of consumers.
Our methodology was chosen primarily because of its ability to model patterns of
demand substitution in the new car market and thereby address the core research
objectives listed in categories A and B of Table 1.1.
From the outset it should be noted that our analysis is concerned only with new vehicle
purchasing decisions of households. In order to understand the car-purchasing behaviour of
companies, one would require data on vehicle purchase decisions of individual firms, data
that was not available for the current study. In particular, it would be necessary to
distinguish between the areas of business of the different companies. For example the
demands of taxi companies are likely to differ significantly from the demands of delivery
companies, which would also differ from the demands of corporations providing vehicles to
their sales fleet (etc.). Further research using a radically different modelling strategy
would be required to analyse these purchases and this would be an important area for
future research efforts.
In this report, we do not consider the research objectives listed under categories C, D and
F. In contrast, we are careful in our specification of the demand model to allow for
investigation of the issues raised under category E; that is to say, to investigate the relative
weights that consumers place on the various cost components of purchasing a vehicle.
Finally, our research does not investigate the supply side of the market. As such, any
application of our findings to predict changes in market demand must always be qualified
with the caveat “provided supply side conditions remain constant”.
1.3 Report structure
Following this introductory section, this report is structured as follows:
• Section 2: provides a non-technical overview of the methodological approaches to
  investigating new vehicle purchasing decisions, detailing the basis for our choice of
  methodology and a review of relevant literature;
• Section 3: provides the detailed exposition of our econometric model and proposed
  estimation strategy;
• Section 4: describes the various data sources we have accessed and relates the
  construction of our final data set;
• Section 5: reports on the results of our econometric modelling exercises, reporting and
  interpreting parameter estimates;
• Section 6: employs the estimated model to investigate the various objectives of the
  research programme set out in Table 1.1 above; and
• Section 7: provides a summary of the research and some concluding remarks.
A number of additional documents are included as Annexes.
2. METHODOLOGICAL OVERVIEW: INVESTIGATING VEHICLE PURCHASING DECISIONS
In order to provide research outputs that have meaningful interpretations with respect to
purchasing behaviour concerning new vehicles, it is important that any analysis is solidly
founded in economic theory. That theory must inform on the mechanism of price-setting
and equilibrium in the new car market and the process by which consumers make purchase
decisions in that market.
In this section we provide a non-technical overview of the theoretical and methodological
background to this study, including a brief discussion of the structure of the market for new
vehicles (Section 2.1), alternative approaches to investigating purchasing and use decisions
(Section 2.2) and the basis for the choice of methodology for this study (Section 2.3).
2.1 The market for new cars
The supply side of the new car market consists of a relatively small number of car
manufacturers each producing a variety of car models. Each of these car models differs in
its attributes (e.g. size, doors, top speed, fuel efficiency, etc.). Since the number of
manufacturers is relatively small, it is unrealistic to regard this as a perfectly competitive
market. Rather, each manufacturer is aware of the actions of the other manufacturers and
tailors their product and pricing decisions accordingly. Such a market structure is commonly
referred to as a non-collusive oligopoly.
For our purposes, the most important characteristic of such a market structure is that
manufacturers may not necessarily set prices equal to the marginal costs of production. As
a result, each car sold may command a “mark-up” that comprises the excess of price over
marginal costs. Moreover, if market conditions change (e.g. from changes in the structure
of vehicle excise duty) manufacturers will reconsider their pricing structure choosing a new
set of prices for the models they manufacture so as to maximise their profits (given the
pricing and production decisions of their competitors).
On the demand side, we acknowledge that there are two fundamentally different types of
consumer participating in the new car market:
(i) Firms or organisations purchasing vehicles for use by their employees; and
(ii) Individual households purchasing vehicles for their private use.
Company cars represent an intermediate category in which the vehicle is purchased and
owned by a firm but used by an employee for both work and private travel. It is
self-evident that the determinants of the purchase decisions made by these different types of
consumer will differ fundamentally (e.g. firms will be likely to have a far higher demand for
large white vans than private households).
As previously noted, the present research deals exclusively with purchase decisions of
private households. In building a model of those purchasing decisions, we assume that
individual households are price-takers. That is to say, we assume that they do not have the
market power to unilaterally influence car prices.
In addition, we make the assumption that households’ new car purchasing decisions are
driven by a desire to maximise their utility. That is to say, households are assumed to
purchase that particular type of vehicle that provides them with the greatest increase in
well-being. In particular, we assume that the well-being derived from purchasing a
particular vehicle is some function of that vehicle’s physical and cost attributes.
2.2 Modelling approaches
Three main approaches lend themselves to investigating the car-type choice of households
and estimating the demand for car attributes: (i) the hedonic pricing approach; (ii)
aggregate choice models; and (iii) disaggregate choice models. The latter two options are
both rooted in a general discrete choice model.
2.2.1 Hedonic pricing model
The hedonic pricing method is a much-applied tool for analysing product quality. In
essence, the method attempts to identify the part of a product’s price that can be ascribed
to the product’s particular attributes. Having identified the so-called implicit prices of
attributes, data on the attributes of the products actually purchased by households allows
for the estimation of attribute demand curves. This entails two stages of analysis. In the
context of investigating new vehicle purchasing behaviour, they are:
(i) Estimate hedonic price function: this would involve identifying a series of
    independent markets in which vehicles are sold. In each market, the statistical
    relationship between car prices and car attributes would be estimated and, under
    certain assumptions regarding market structure (see below), the parameters of this
    hedonic price function could be interpreted as implicit prices.
(ii) Estimate attribute demand curve(s): this would involve pooling data across markets in
    order to estimate an (or more strictly a system of) attribute demand curve(s). In
    particular, the quantities of an attribute in observed purchases would be regressed
    against the implicit price of those attributes. Notice that to identify this relationship
    it is necessary that the implicit prices of car attributes differ across markets.
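The two stages can be sketched in Python with a single attribute. All market names, prices and quantities below are hypothetical, and the helper `ols` is a plain one-regressor least-squares fit written for this illustration; a real application would use several attributes and a system of demand equations.

```python
def ols(x, y):
    """Slope and intercept of a one-regressor least-squares fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

# Hypothetical data: three independent markets, each listing
# (engine power, price) pairs for the cars sold there.
markets = {
    "north": [(60, 9000), (90, 12000), (120, 15000)],
    "centre": [(60, 10000), (90, 14500), (120, 19000)],
    "south": [(60, 11000), (90, 17000), (120, 23000)],
}
# Hypothetical average power chosen by buyers in each market.
avg_power_bought = {"north": 105.0, "centre": 95.0, "south": 85.0}

# Stage (i): hedonic price function per market gives the implicit price of power.
implicit_price = {m: ols([p for p, _ in cars], [c for _, c in cars])[0]
                  for m, cars in markets.items()}

# Stage (ii): pool markets and regress attribute quantity on implicit price.
names = sorted(markets)
demand_slope, _ = ols([implicit_price[m] for m in names],
                      [avg_power_bought[m] for m in names])
print(implicit_price, demand_slope)
```

The second stage only works because the implicit price differs across the three markets; with identical implicit prices the demand slope would not be identified.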
The apparent simplicity of the hedonic approach belies some particularly restrictive
assumptions regarding the market structure and the nature of household demand.
Specifically, consumers are assumed to choose a car with that particular combination of
attributes that maximises their utility. According to economic theory, that would mean
choosing quantities of attributes such that the household’s marginal willingness to pay for
each attribute just equals that attribute’s implicit price. For that to be possible for every
household, the choice set would have to comprise car models presenting all possible
combinations of attributes (i.e. a continuous choice set). On the supply side, car
manufacturers are assumed to operate under conditions of perfect competition. If that is
the case, the competitive pressure ensures that manufacturers price their products
according to the set of implicit prices that emerge in that market.
Clearly it is difficult to reconcile these conditions with the characterisation of the market
for new cars set out in Section 2.1. In particular, when firms can independently set the
prices for their products, the implicit prices as envisaged by the theory do not exist, so that
any relationship between product prices and attribute quantities would represent only
statistical regularities (and not implicit prices).
2.2.2 Discrete choice models
Discrete choice models can be applied either at: (i) the disaggregate level (individual or
household purchase decisions); or (ii) the aggregate level (outcomes for groups of
individuals). The theoretical framework for both approaches is identical, since in both types
of models consumers are viewed as making a discrete choice, i.e. choosing one particular
car model from the finite set of models provided in the market (or choosing to make no
purchase at all).
In discrete choice models, it is assumed that households assess the benefit that they would
derive from purchasing each type of car available to them in the market. That calculation
requires them to weigh up the advantages resulting from a car’s physical attributes against
the costs of purchasing and running that vehicle. For the purposes of analysis, that complex
calculation is simplified into an estimable utility function which takes as its arguments a
car’s physical and cost attributes as well as the characteristics of the household.
Households are assumed to purchase the type of car which provides them with the highest
utility.
The objective of the discrete choice methodology reduces to estimating the parameters of
the proposed utility function, the so-called taste parameters, from observations on the
purchasing choices of households. The aggregate and disaggregate approaches differ
according to the level of aggregation at which consumption choices are observed.
Aggregate discrete choice model
As its name indicates, the aggregate approach uses the sum of individual purchase decisions
across Great Britain (note that our analysis is restricted to mainland UK). This reveals the
aggregate demand for the various models of vehicle available. In turn, a simplified
description of aggregate purchase behaviour in the new car market is afforded by observing
the proportion of households not purchasing a new car in addition to the proportion of
households purchasing each different model of vehicle available in the market. We shall
refer to these proportions as market shares.
The proportion of households purchasing each of the various types of car model provides
information on the underlying choice behaviour of those households. Therefore, in essence,
the aggregate discrete choice methodology examines how market shares relate to the
attributes and prices of vehicle models available in the market and uses this information to
retrieve the parameters of a representative utility function.
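One standard way of moving from market shares to utilities can be sketched in Python for the simple (non-nested) logit case, where the mean utility of each vehicle equals the log of its share minus the log of the outside-good share. This is offered only as an illustration of the principle under that simplifying assumption; the function names and utilities are hypothetical, and a nested specification would need an additional within-segment share term.

```python
import math

def logit_shares(delta):
    """Simple logit market shares; the outside good has utility 0."""
    expd = [math.exp(d) for d in delta]
    denom = 1.0 + sum(expd)
    return [e / denom for e in expd], 1.0 / denom

def invert_shares(shares, outside_share):
    """Recover mean utilities from observed shares: ln(s_j) - ln(s_0)."""
    return [math.log(s) - math.log(outside_share) for s in shares]

true_delta = [-3.0, -3.5, -4.2]   # hypothetical mean utilities
shares, s0 = logit_shares(true_delta)
recovered = invert_shares(shares, s0)
print([round(d, 6) for d in recovered])
```

Once mean utilities have been recovered in this way they can be regressed on vehicle attributes and prices to estimate the taste parameters.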
The analysis can be considerably enhanced if market share data from a number of years is
available. Indeed, over a series of years we would expect to observe changes in the
consumption patterns in response to changes in the characteristics of GB households and to
changes in the set of vehicle models offered in the market by manufacturers. The time
dimension also introduces greater variability in important determinants of choice such as
the fixed and variable costs of driving (e.g. from changes in the structure of the vehicle
excise duty or from changes in the price of fuel). More importantly, we might expect to see
changes in market shares in response to changes in the tax system (e.g. from changes in
VED) or from changes in fuel prices.
In addition, cross-sectional variation can be introduced by identifying regional markets.
That is to say, rather than using market shares for the whole of GB for each year, the
analysis would use data indicating market shares in each region of GB for each year. Data of
this kind would allow much more accurate definition of the influence of socio-economic
factors on patterns of demand. In particular, the parameters of the utility function that
relate to socio-economic characteristics can be retrieved by observing how market shares
differ across markets with different socioeconomic compositions.
Disaggregate discrete choice model
As noted, the disaggregate discrete choice model is based on the same behavioural
foundation as the aggregate model. Each car model differs in terms of its attributes and
price, and it is assumed that the actual vehicle selected by a household will be the model
containing that combination of attributes and price that provides them with the greatest
level of utility. The aim of the methodology is yet again to identify the parameters of the
specified utility function. However, unlike the aggregate methodology, the parameters of
the utility function are selected to best predict the actual choices made by individuals
observed on the market. Therefore, the parameters directly refer to the individual or
household level.
It follows that to estimate a household-level model, the minimum requirement is a dataset
detailing the purchase decisions of actual consumers along with details of their income and
other socio-economic characteristics. Details of the car’s attributes and price can be taken
from secondary data sources. A single cross-section of data is adequate to estimate the
various parameters of the utility function provided sufficient variation is present in the
attributes of the different vehicles. This might not be the case for important variables such
as fixed and variable costs of driving, particularly those relating to vehicle excise duty and
fuel prices. In that case, a time series recording purchasing decisions of households over a
series of years would be required.
2.3 Choice of methodology
Section 2.2 presented three potential methodologies for investigating purchasing behaviour
in the new car market. Table 2.1 summarises their relative merits.
Table 2.1: Advantages and disadvantages of the research methodologies

Hedonic pricing model
Advantages:
• Availability of data (observed from market)
• Simplicity of the economic and econometric models
Disadvantages:
• Structural assumption remote from observed structure of car market
• Segmentation of the market based on a questionable assumption

Aggregate choice model
Advantages:
• Ideal for identifying substitution patterns and welfare effects
• Addressing market-level issues
• Availability of data (observed from market)
• Ability to include individual-level information
• Inclusion of an ‘outside good’
Disadvantages:
• Risk of misinterpreting individual preferences from aggregate behaviour
• Complex econometric procedure

Disaggregate choice model
Advantages:
• Ideal for identifying substitution patterns and welfare effects
• Direct link between individual choices and their socio-economic details
• Flexibility of the data collection
Disadvantages:
• The large number of available car models requires either clustering or large sample size
• Possible reporting error during survey
2.3.1 Hedonic pricing
There are a number of reasons why the hedonic pricing method is not suited to answering
the research objectives described in Section 1.2. Most importantly, the method seeks to
identify demand relationships for each of the attributes of cars and not for the cars
themselves. That set of attribute demand functions captures households’ choice behaviour
but does not directly inform us as to how households substitute between different
vehicles. Moreover, there is no straightforward means of using those outputs to derive the
price elasticities required from this project.
In addition, as alluded to in Section 2.2.1, the market structure assumed by the hedonic
price model does not provide a satisfactory account of the market for new cars. On the
demand side, the choice set for consumers, whilst very substantial, does not span all the
possible combinations of attribute levels. On the supply side, production is dominated by a
relatively small number of manufacturers, making the perfect competition assumption
unrealistic. It is more likely that each manufacturer is aware of the actions of its
competitor(s) and tailors its product and pricing decisions accordingly. In this
configuration, Pakes (2003) has shown that the hedonic pricing methodology does not
permit the estimation of the parameters of the demand or the supply function.
Finally, the segmentation of the market required to identify the demand curve for each car
attribute will usually need to rely on unverifiable assumptions, which further reduces the
attractiveness of the method.
2.3.2 Aggregate versus disaggregate discrete choice models
In contrast to the hedonic pricing model, the discrete choice methodology has a number of
advantages in the context of this study’s research objectives. In particular, discrete choice
models are ideally suited for examining:
• Substitution relationships between models of vehicle; identifying how these
relationships change with respect to changes in relative prices or in the attributes of
models (i.e. elasticities of demand with respect to prices or to attributes).
• Welfare effects of particular changes in market conditions. For example, from
changes in the attributes (e.g. fixed or variable costs of driving) of particular models
or from the introduction or removal of vehicles from the market.
In distinguishing between aggregate and disaggregate methodologies, the theoretical
advantages of a household-level demand model over an aggregate demand model are
twofold. First, it is able to relate particular purchase decisions to particular households,
allowing a much more explicit examination of the relationship between purchase decisions
and socioeconomic characteristics. Second, collecting data at a household level allows for
investigation of other important issues relating to the purchase of new cars. In particular, it
gives options to collect data on household vehicle use decisions as well as pre-existing
stocks of vehicles. This additional information permits analysis of more complex decision
processes such as how purchases are financed or how choices are determined by the current
fleet of cars maintained by the household.
Despite these advantages, we believe that the methodology that best suits the objectives
of this project is the aggregate discrete choice demand model. Indeed, being based on
household-level data, the disaggregate model is not ideally suited to predicting
market-level responses. That is to say, whilst the disaggregate level model might give accurate
insights into how individual households respond to changes in market decisions, it is less
well-suited to predicting the overall market response to such changes. In contrast, the
aggregate approach specifically answers the question of what happens to the market shares
commanded by various models of cars, when either prices, the costs of motoring or the
attributes of vehicles change. In this respect, the aggregate demand model is ideally suited
to answering the research objectives identified in categories A and B in Table 1.1.
One further advantage of the aggregate approach is the ease with which an “outside good”
(i.e. a “don’t purchase a car” option) can be introduced into the analysis. Whilst it is
possible to include an outside good in a disaggregate analysis, the extra data-collection
burden of surveying a representative sample of non-purchasers frequently proves
prohibitive. A choice model that excludes the possibility of non-participation is unlikely to
reflect real world behaviour where, for example, price increases may convince some
households not to purchase a new car at all. This aspect is likely to be important for any
market-level policy simulation.
Concerning the data requirements, the aggregate demand model has the advantage of
having readily available data from secondary sources. The market level data therefore
provides observations of actual purchasing behaviour in the population of interest. This
contrasts with the household level model where a primary data collection exercise is
needed and would also involve some problems. For instance, to capture the range of
choices available to households, a choice set containing at least 1,000 car models is likely
to be required. Sample sizes that would be required to record purchases of each of these
options would be large. Hence the only way to progress in the disaggregate framework is to
cluster vehicles into a smaller set of alternatives, which limits the model outputs.
Another complication with survey data is that the new car market is dominated by a small
number of relatively similar vehicles (e.g. the Ford Focus, Vauxhall Corsa etc.). Indeed,
even in a relatively large random sample one would suspect that the vast majority of
observed purchases will be of these popular vehicles. Relatively few, if any, observations
will record purchases of cars with very different characteristics. In turn, survey data is
unlikely to provide the sort of variation that would be necessary to allow accurate
identification of the impact of a range of vehicle attributes on choice behaviour.
Overall, the aggregate discrete choice model approach adopted in this study provides a
versatile framework within which a rich variety of policy questions may be investigated.
Additionally, this research methodology draws on a strong foundation of economic theory to
provide a comprehensive model of demand in the new car market. The approach has been
successfully applied to model vehicle markets outside the UK (see Annex 2). The research
requires four fundamental items of information:
i). The market shares commanded by the different models of cars available in one or
more markets over a series of years;
ii). The attributes of each model of vehicle in each market in each year including their
selling prices, volume, engine capacity, fuel efficiency, number of doors, etc.;
iii). The costs of motoring in each market in each year including the fixed costs of
motoring (e.g. vehicle excise duty, insurance) and the variable costs of driving (e.g.
petrol prices); and
iv). The socioeconomic characteristics of the population of each market in each year.
The dataset compiled for this study is described in detail in Section 4.
3. MODELLING STRATEGY
3.1 Overview
The fundamental objective of this research is to understand what drives (excuse the pun)
households’ car-purchasing decisions. Our starting point is to assume that in making those
decisions households assess the benefit that they would derive from purchasing each model
of car available to them in the market. Some they can reject immediately as being too
expensive, too small, too slow, etc. Others may require more careful consideration. Either
way, we assume that this process of assessment involves weighing the advantages resulting
from the purchase of a vehicle with a certain set of attributes against the concomitant
purchase and running costs.
For the purposes of our analysis, we simplify the undoubted complexity of that calculus into
a (conditional indirect) utility function. This utility function effectively scores each
option, attributing a higher score to options offering a greater net benefit. In our simple
behavioural model, households are assumed to purchase the particular model of vehicle
that scores highest, provided the utility from that option exceeds the option of not buying a
car at all.
The utility function takes as its arguments the characteristics and associated costs of the
car under consideration as well as the characteristics of the household making the choice.
As discussed in much greater detail in the next section, our research requires us to assume
some particular functional form that organises those various arguments in some reasonable
and justifiable manner. The objective of the research then reduces to estimating the
parameters of the proposed utility function. Armed with an estimate of this fundamental
building block of household choice behaviour, we are able to answer many of the research
questions outlined by the DfT (see Table 1.1).
In this project we adopt an aggregate choice modelling methodology. This methodology,
pioneered by Berry (1994) and Berry et al. (1995), follows from the observation that the
proportion of households in a market purchasing each of the various types of car model
provides information on the underlying choice behaviour of those households. In essence,
the method retrieves the parameters of the utility function by examining how market
shares relate to the attributes and prices of vehicle models available in the market.
If one wishes to infer how socioeconomic characteristics influence choice then more
information is required. In particular, in the aggregate methodology, parameters of the
utility function that relate socioeconomic characteristics to vehicle purchasing behaviour
are identified by observing how market shares differ across markets with different
socioeconomic compositions. In this project, we had intended to exploit this source of
identification by defining a set of regional markets and observing market shares in those
markets over a series of years. In the event, delays in accessing socioeconomic data and the
size of the research task militated against such a comprehensive analysis.
Similarly, a much more detailed analysis of the relationship between household
characteristics and car purchasing behaviour can be achieved by combining data on
aggregate choices with those recording the individual choices of a representative sample of
households. Again, we had hoped to pursue this line of enquiry using data from the National
Travel Survey (NTS). Unfortunately, we were unable to do so within the timeframe of the
project, though we note that extending the analysis to investigate socioeconomic drivers of
purchase behaviour should prove a fruitful area for future research.
3.2 Definition of ‘the market’
For the purposes of this research, we define the ‘new car market’ as consisting of a set of
households living in a distinct geographic region in a particular year. In particular, our data
(discussed in far greater detail in the next section) allows us to identify the quantity of
sales of different types of vehicle to private households for each year from 2001 to 2006 (to
integrate with other data sets we actually only use data up to 2005) in each Government
Office Region of GB (see Table 3.1).
Subsequently, population data available from the Office for National Statistics provides
details of the number of households in each market and allows us to calculate market
shares; that is, the proportions of households in each market purchasing each type of car as
well as the proportion not purchasing a car at all. These market shares form the basic
observation of demand for our analysis.
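The market share calculation described above can be sketched in a few lines. This is a minimal illustration rather than the procedure used to build the study dataset: the function name `market_shares`, the option labels and all figures are hypothetical, and the only assumption carried over from the text is that the number of households defines the size of each market.

```python
def market_shares(sales_by_option, n_households):
    """Return (shares, outside_share) for one region-year market.

    sales_by_option: dict mapping a vehicle option label to units sold
                     to private households in that market
    n_households:    number of households in the market (the market size)
    """
    total_sales = sum(sales_by_option.values())
    if total_sales > n_households:
        raise ValueError("sales exceed the number of households")
    # Share of households buying each option.
    shares = {opt: q / n_households for opt, q in sales_by_option.items()}
    # Share of households buying no new car at all (the "outside good").
    outside_share = 1.0 - total_sales / n_households
    return shares, outside_share


# Hypothetical figures for a single market (not taken from the report).
sales = {"option_a": 1200, "option_b": 800, "option_c": 500}
shares, s0 = market_shares(sales, n_households=50_000)
print(shares, s0)
```

Because each household is assumed to buy at most one new car per year, the option shares and the outside share sum to one by construction.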
Table 3.1 lists the 11 GORs represented in our data and indicates average household
incomes in those GORs each year from 2001 through to 2005.
Table 3.1: Average Household Income in GB Government Office Regions (£)

Region                        2001     2002     2003     2004     2005
North East                  22,633   23,307   24,445   25,446   26,921
North West                  25,176   25,683   27,095   28,402   29,774
Yorkshire and the Humber    24,393   25,068   26,390   27,594   28,916
East Midlands               25,430   26,595   27,924   28,958   30,826
West Midlands               25,966   26,324   27,780   29,061   30,189
East of England             29,681   30,633   32,208   33,451   34,401
London                      34,773   35,402   37,447   39,224   40,540
South East                  32,239   32,338   33,731   34,801   35,585
South West                  27,102   27,614   29,057   30,216   31,585
Wales                       23,501   25,097   26,169   27,031   28,799
Scotland                    24,264   25,092   26,206   27,156   28,571
3.3 Definition of ‘choice occasions’
In common with any discrete choice methodology, the aggregate choice model requires a
definition of a choice occasion. In this research we take that choice occasion as spanning
the period of one calendar year. In other words, we assume that each year, each household
in Great Britain is faced by the decision as to which, if any, of the set of new cars available
in the market to purchase.
According to this assumption, households purchase at most one new vehicle each year. Of
course, we can test that assumption by observing the actual number of vehicles purchased
by individual households in GB each year. Table 3.2 provides a summary of data taken from
the National Travel Survey.
Evidently some households purchase more than one new vehicle in a particular year though
the incidence of multiple purchases is very low, averaging one third of one percent over the
6 years of data. We believe that this incidence is sufficiently low so as to support, or at
least not to invalidate, our assumption.
Table 3.2: Number of new vehicles purchased in a year by households in the NTS

No. of Cars      1999     2000     2001     2002     2003     2004
0              94.27%   92.17%   93.29%   92.82%   92.90%   92.98%
1               5.19%    7.53%    6.51%    6.79%    6.88%    6.73%
2               0.51%    0.30%    0.21%    0.39%    0.22%    0.27%
3               0.03%    0.00%    0.00%    0.00%    0.00%    0.01%
Total           3,140    3,372    3,412    7,437    8,258    8,122
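The incidence of multiple purchases quoted in the text can be checked directly from the percentages reported in Table 3.2. The short sketch below takes a simple unweighted average across the six survey years; whether the original calculation weighted years by sample size is not stated, so the unweighted mean is an assumption.

```python
# Percentages of NTS households buying two or three new cars, 1999 to 2004,
# copied from Table 3.2.
two_cars = [0.51, 0.30, 0.21, 0.39, 0.22, 0.27]
three_cars = [0.03, 0.00, 0.00, 0.00, 0.00, 0.01]

# Share of households buying more than one new car in each year.
multiple = [a + b for a, b in zip(two_cars, three_cars)]

# Unweighted average over the six years (an assumption, see the lead-in).
average_multiple = sum(multiple) / len(multiple)
print(f"{average_multiple:.2f}% of households per year")
```

The result is roughly 0.32%, consistent with the "one third of one percent" cited above.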
To introduce some terminology, we have $t = 1, \dots, T$ markets, where $T = 55$ since we
have observations on 5 years’ worth of choices in 11 different GORs. In each market there
are $i = 1, \dots, N_t$ consumers, which we take to be the number of households in each region.
3.4 Definition of ‘choice set’
On each choice occasion, we assume that each household surveys the range of vehicles
available to them in the market and chooses to purchase that vehicle which offers them the
greatest net benefit. Of course cars come in a vast multiplicity of versions differing in both
major features such as their body type as well as minor features such as whether they are
provided with or without air conditioning.
As a matter of fact, the multiplicity of vehicles available in the new car market
demonstrates considerable high level structure. In particular, most vehicles can be
categorised into one of a series of so-called ‘market segments’. The vehicles within each
segment show considerable homogeneity in size, body shape and price and tend to be
marketed at a particular demographic of household. Table 3.3 lists the market segment
definitions provided in our data and used in our analysis.
For the purposes of developing our model let us index segments $g = 1, 2, \dots, G$ where, given
the definitions in Table 3.3, $G = 9$ for our data. Obviously, the vehicles within each
segment demonstrate a large degree of variation. Indeed, as we discuss in the next section
in detail, we define a choice option as being a set of vehicles that are identical in terms of
their manufacturer (make), model, body type, fuel type, transmission type and engine size.
Expanding our notation, we denote the choice set as consisting of $j = 1, 2, \dots, J$ different
vehicle options.
3.5 Factors influencing vehicle choice
In essence, there are two sets of vehicle characteristics affecting household choice;
physical attributes and cost attributes.
As we describe in the next section, we have data describing the physical attributes of each
vehicle type including details of each vehicle’s body type, engine size, transmission type,
fuel type, size, power, fuel efficiency and various details of the vehicle’s trim. We denote
this vector of physical attributes for car option j as $\mathbf{x}_j$.
Table 3.3: Definitions of Market Segments

Segment: EU A, Mini Cars
Example: Ford Ka, Renault Twingo
Description: Micro cars designed for city use, normally under 3.5m.

Segment: EU B, Super Minis
Example: Citroen Berlingo, Renault Clio, Toyota Yaris
Description: Super mini vehicles designed for family use and budget motoring. Usually a hatchback.

Segment: EU C, Small/Medium Family Cars and Prestige Hatchback
Example: Honda Civic, Volkswagen Golf, Audi A3, Chrysler Neon
Description: Lower medium class vehicles for family use and the fleet market. Engine capacity is
usually below 2 litres. Usually a saloon or a prestige hatchback.

Segment: EU D, Large Family Cars, Compact Executives and Entry-Level Luxury
Example: Ford Mondeo, Vauxhall Vectra, Lexus IS200, Mercedes C-Class
Description: Upper class family vehicles and near luxury vehicles for the family and fleet markets.
Larger vehicles that provide increased comfort and interior space. Engine capacity is up to 3
litres for some models.

Segment: EU E, Executive Cars and Luxury Cars
Example: Audi A6, Jaguar S-Type, Mercedes S-Class
Description: Luxury vehicles providing enhanced comfort aimed at the premium market or senior
management level in the fleet market.

Segment: EU Mini MPV
Example: Fiat Multipla, Renault Scenic
Description: Smaller vehicles normally based on car platforms, marketed towards family rather
than commercial use. Usually capable of carrying 5 persons with a greater level of versatility
than a normal hatchback. The external dimensions are usually similar to C segment vehicles
except for the height which will be greater.

Segment: EU MPV
Example: Chrysler Voyager, Renault Espace
Description: Larger vehicles based on car or small commercial platforms, marketed towards family
rather than commercial use. Usually capable of carrying between 5 and 8 persons with a great
level of versatility. The vehicles provide features comparable to a large saloon.

Segment: EU Sports
Example: Ford Puma, Porsche Boxster
Description: Vehicles with a higher level of performance and handling, or a vehicle that gives the
impression of being capable of high-speed motoring.

Segment: EU SUV
Example: Jeep Grand Cherokee, Land Rover Freelander
Description: Sport utility vehicles, now marketed as much for lifestyle as for use in off-road
situations.
Of course, associated with each vehicle is a set of costs. For the purposes of our research
we identify three different types of cost:
• Purchase costs, which we denote $p_{jt}$, where the indexing on option $j$ and market $t$
reflects the fact that the price of a particular vehicle option will differ over time (we
do not observe variation in prices over GORs in the same year).
• Fixed annual costs, which we denote $f_{jt}$ and which include insurance premiums and vehicle
excise duty.
• Variable costs, which we assume for new cars are confined to motoring costs and which we
capture through the fuel efficiency argument in the vector of vehicle physical
attributes $\mathbf{x}_j$.
We expand upon the exact definition and calculation of these different cost elements when
we come to discuss our data in the next section. However, it is important to note at this
stage that our assumption that households make an annual choice over whether or not to
purchase a vehicle requires that all the cost elements are defined as annual payments. As
we explain subsequently, we convert the actual purchase price of a vehicle, $P_{jt}$, to an
annual cost, $p_{jt}$, by assuming that a household can partially recoup their investment in the
new vehicle at the end of a year by selling that vehicle on in the second hand market. Our
measure of purchase costs, therefore, is related to the difference between the purchase
price and the resale price after one year.
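As a rough illustration of the annualisation idea, the sketch below computes the simplest version of such a measure: the purchase price less the resale value after one year. The text says only that the measure is "related to" this difference, so the exact formula used in the study (for example any discounting) is not reproduced here; the function name and the figures are hypothetical.

```python
def annual_purchase_cost(purchase_price, resale_price_after_one_year):
    """Annualised purchase cost: the part of the price not recouped on resale.

    Simplifying assumption for this sketch: no discounting and no
    transaction costs, so the cost is a plain difference.
    """
    if resale_price_after_one_year > purchase_price:
        raise ValueError("resale price is assumed not to exceed the purchase price")
    return purchase_price - resale_price_after_one_year


# Hypothetical example: a 15,000 pound car worth 11,250 pounds after one year.
cost = annual_purchase_cost(15_000.0, 11_250.0)
print(cost)
```

Expressing the purchase price this way puts it on the same annual footing as insurance and vehicle excise duty.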
3.6 Household utility function
The fundamental building block of our model of choice behaviour is a household
(conditional indirect) utility function; that is, a function which indicates the net welfare
benefit that would result from household i from market t purchasing vehicle j. In line with
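previous research, we elect to specify that function using the simple linear form:
$$u_{ijt} = \alpha_i \left( y_i - p_{jt} - \bar{\lambda} f_{jt} \right) + \mathbf{x}_j \boldsymbol{\beta}_i + \xi_j + \xi_{jt} + \varepsilon_{ijt} \qquad (i = 1, \dots, N_t;\; j = 1, \dots, J;\; t = 1, \dots, T) \tag{3.1}$$
Here the data consist of:
• $y_i$, the income of household i;
• $p_{jt}$, the purchase price of the vehicle;
• $f_{jt}$, the fixed cost of motoring; and
• $\mathbf{x}_j$, the K-dimensional row vector of vehicle attributes.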
The parameters of the utility function that we wish to estimate consist of:
• $\alpha_i$, household i’s marginal utility of income;
• $\bar{\lambda}$, a parameter allowing the marginal utility of income with regards to changes in
fixed motoring costs to differ from that regarding purchase costs (the bar over the
parameter indicates that this parameter is assumed to be constant across
households); and
• $\boldsymbol{\beta}_i$, the K-dimensional column vector of household-specific taste coefficients with
individual elements $\beta_{ik}$.
Finally we introduce three unknowns into the model:
• $\xi_j$ is the utility resulting from unobserved (by the econometrician) characteristics of
each type of new car;
• $\xi_{jt}$ captures the market-specific deviation from the average utility of each type of
new car; and
• $\varepsilon_{ijt}$ is a mean-zero idiosyncratic error term mopping up that part of utility that is
not captured by the other elements of the model.
Of course, a household may choose not to purchase a vehicle in a particular year. In that
case, they would get to enjoy the utility of the so-called “outside good” which we label
product 0:
$$u_{i0t} = \alpha_i y_i + \gamma_i + \varepsilon_{i0t} \qquad (i = 1, \dots, N_t;\; t = 1, \dots, T) \tag{3.2}$$
where:
• $\gamma_i$ is a household-specific constant capturing the utility offered by the outside good.
We assume that in making their choice, households evaluate (3.1) for each car option
available in the market and (3.2), then select the option that offers them the highest
utility. As such, our model is completed by a choice rule; household i chooses option j if:
$$u_{ijt} \ge u_{ikt} \quad \text{for } k = 0, 1, \dots, J \qquad (i = 1, \dots, N_t;\; t = 1, \dots, T) \tag{3.3}$$
For the purposes of our subsequent discussion, it is convenient to make some simple
adjustments to our model. Notice that equation (3.3) indicates that all that matters to
households is the difference in utility between one option and another. Moreover, observe
that the element $\alpha_i y_i$ (which captures the utility coming from household income) is
common to all options. Since that element does not differ across options it plays no part in
household choice and henceforth can be dropped from our specification of the conditional
utility functions. Along similar lines, since only differences in utility matter, we can
subtract the element $\gamma_i$ from each of the conditional utility functions (3.1 and 3.2) without
changing the model. In that case, the elements of the model captured in $\gamma_i$, that indicate
the benefits provided by choosing the outside good, enter the model as opportunity costs in
the car purchase utility functions (3.1).
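With these changes, the utility of the outside good reduces to:
$$u_{i0t} = \varepsilon_{i0t} \qquad (i = 1, \dots, N_t;\; t = 1, \dots, T) \tag{3.4}$$
Whilst those of the car purchase utility functions can be expressed as:
$$u_{ijt} = -\gamma_i - \alpha_i \left( p_{jt} + \bar{\lambda} f_{jt} \right) + \mathbf{x}_j \boldsymbol{\beta}_i + \xi_j + \xi_{jt} + \varepsilon_{ijt} \qquad (j = 1, 2, \dots, J;\; t = 1, \dots, T) \tag{3.5}$$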
3.7 Logit Estimation
Observe from (3.4) and (3.5) that household characteristics do not appear directly in the
utility function. Rather, they play a role in determining the values taken by the
household-specific taste parameters $\alpha_i$, $\boldsymbol{\beta}_i$ and $\gamma_i$.
We had originally hoped to estimate a model in which disaggregate data on household
characteristics and vehicle purchasing behaviour was used to identify the determinants of
the household-specific taste parameters, $\alpha_i$, $\boldsymbol{\beta}_i$ and $\gamma_i$. This has not proved possible within
the timeframe of the project.
For the purposes of the research reported in this document, we are forced to make the
simplifying assumption that households have identical tastes for the physical attributes of
vehicles. That is to say, we assume $\boldsymbol{\beta}_i = \boldsymbol{\beta}$.
The panel nature of our data allows us some flexibility in specifying the taste parameters
for money, $\alpha_i$, and for the outside good, $\gamma_i$.
With regards to tastes for money, it seems reasonable to assume that the disbenefit of a
particular expenditure declines as household income rises. To capture this diminishing
marginal utility of income, we specify:
$$\alpha_i = \frac{\alpha}{\bar{y}_t} \qquad (i = 1, \dots, N_t;\; t = 1, \dots, T) \tag{3.6}$$
where $\bar{y}_t$ $(t = 1, \dots, T)$ is the average household income in market t. Notice that according
to (3.6) as income rises the marginal utility of income falls, which is in accordance with our
expectations. The specification in (3.6) is easily operationalised by dividing each cost
variable by $\bar{y}_t$.
With regards to tastes for the outside good, we recognise that the benefit of participating
in the market in any particular year may differ systematically across regions because of
differences in the socioeconomic composition, infrastructure and geography of each region
and might differ systematically across years due to changes in the macroeconomic climate.
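To capture this variability, we specify $\gamma_i$ as a function of a series of dummy variables
indicating GOR and year, as follows:
$$\gamma_i = \gamma_t = \gamma_{GOR} + \gamma_{Year}$$
Any remaining heterogeneity, pertaining to differences in tastes for the physical attributes
of vehicles and deviations from market means for the other taste parameters, is captured
in the error term $\varepsilon_{ijt}$. Under these assumptions the utility function in (3.5) becomes:
$$u_{ijt} = \delta_{jt} + \varepsilon_{ijt} \qquad (i = 1, \dots, N_t;\; j = 1, \dots, J;\; t = 1, \dots, T) \tag{3.7}$$
where:
$$\delta_{jt} = -\gamma_t - \frac{\alpha}{\bar{y}_t} \left( p_{jt} + \bar{\lambda} f_{jt} \right) + \mathbf{x}_j \boldsymbol{\beta} + \xi_j + \xi_{jt} \qquad (j = 1, \dots, J;\; t = 1, \dots, T) \tag{3.8}$$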
If we are prepared to make the further assumption that the $\varepsilon_{ijt}$ are independently and
identically distributed according to a Type I extreme-value distribution then we find
ourselves in the company of the traditional logit model. Here the probability that a
particular household will choose a particular option is given by the familiar function:
$$P_{jt}(\theta) = \frac{e^{\delta_{jt}}}{1 + \sum_{k=1}^{J} e^{\delta_{kt}}} \qquad (j = 0, \dots, J;\; t = 1, \dots, T) \tag{3.9}$$
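The logit probabilities in (3.9) are straightforward to compute once the mean utilities are known. The sketch below is illustrative only: the function name `logit_shares` and the mean utility values are hypothetical, and it relies on the normalisation in (3.4) under which the outside good has a mean utility of zero, so that its exponential contributes the 1 in the denominator.

```python
import numpy as np


def logit_shares(delta):
    """Logit choice probabilities for one market.

    delta: mean utilities of the J car options.
    Returns an array of length J + 1; element 0 is the probability of the
    outside good (mean utility normalised to zero), elements 1..J are the
    probabilities of the car options.
    """
    delta = np.asarray(delta, dtype=float)
    exp_delta = np.exp(delta)
    denominator = 1.0 + exp_delta.sum()
    return np.concatenate(([1.0 / denominator], exp_delta / denominator))


# Hypothetical mean utilities for three car options.
probs = logit_shares([-3.0, -4.0, -5.0])
print(probs, probs.sum())
```

The probabilities sum to one, and the ratio of any two depends only on the difference between the corresponding mean utilities.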
Of course, since all households in a particular market are assumed to have identical taste
parameters, the choice probabilities given by the $P_{jt}(\theta)$ are constant across all households
in that market. What’s more, since each household has an identical probability of choosing
a particular option, that probability will identically indicate our model’s prediction of the
market share commanded by that option. Accordingly:
$$\hat{s}_{jt}(\theta) = P_{jt}(\theta) \qquad (j = 0, \dots, J;\; t = 1, \dots, T) \tag{3.10}$$
One strategy for estimating $\theta$ has been suggested by Berry (1994) and Berry et al. (1995).
First, observe from (3.9) and (3.10) that according to our model the natural logarithm of
the market shares (and by extension of our predicted market shares at the true values of
the parameters) reduces to:
$$\ln(s_{jt}) = \delta_{jt} - \ln\left( 1 + \sum_{k=1}^{J} e^{\delta_{kt}} \right) \qquad (j = 1, \dots, J;\; t = 1, \dots, T) \tag{3.11}$$
and
$$\ln(s_{0t}) = -\ln\left( 1 + \sum_{k=1}^{J} e^{\delta_{kt}} \right) \qquad (t = 1, 2, \dots, T) \tag{3.12}$$
where $s_{jt}$ $(j = 0, 1, 2, \dots, J;\; t = 1, 2, \dots, T)$ are the market shares commanded by option j in
market t in our aggregate DVLA dataset.
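Subtracting (3.12) from (3.11) we arrive at:
$$\ln(s_{jt}) - \ln(s_{0t}) = \delta_{jt} = -\gamma_t - \frac{\alpha}{\bar{y}_t} \left( p_{jt} + \bar{\lambda} f_{jt} \right) + \mathbf{x}_j \boldsymbol{\beta} + \xi_j + \xi_{jt} \qquad (j = 1, \dots, J;\; t = 1, \dots, T) \tag{3.13}$$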
The left hand side of (3.13) is simply calculated from the observed market share data.
Moreover, the right hand side of (3.13) is a simple linear function of the model parameters.
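The inversion behind (3.13) can be verified numerically: generating logit shares from known mean utilities and then taking the log of each share minus the log of the outside share should return those mean utilities exactly. The sketch below uses hypothetical values and a hypothetical helper name, `invert_shares`.

```python
import numpy as np


def invert_shares(shares, outside_share):
    """Recover mean utilities from observed shares: ln(s_j) - ln(s_0)."""
    return np.log(np.asarray(shares, dtype=float)) - np.log(outside_share)


# Hypothetical "true" mean utilities for three car options in one market.
delta_true = np.array([-3.0, -4.0, -5.0])

# Shares implied by the logit model, with the outside good normalised to zero.
exp_delta = np.exp(delta_true)
denominator = 1.0 + exp_delta.sum()
s = exp_delta / denominator
s0 = 1.0 / denominator

# The left hand side of the estimating equation, computed from shares alone.
delta_recovered = invert_shares(s, s0)
print(delta_recovered)
```

Because the dependent variable is built entirely from observed shares, the remaining estimation problem is a linear regression of that quantity on prices, costs and attributes.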
One problem with estimating (3.13) is the presence of $\xi_j$ and $\xi_{jt}$. Recall $\xi_j$ represents the
utility impact of unobserved characteristics of each vehicle whilst $\xi_{jt}$ can be interpreted as
a market and vehicle specific demand shock. Whilst it might be reasonable to assume that
the $\xi_{jt}$ are independent of the characteristics of the vehicle, there are many reasons to
believe that the $\xi_j$ will be correlated with price and perhaps with other vehicle attributes.
For example, $\xi_j$ may in part reflect the prestige derived from owning a particular model of
vehicle. Since manufacturers are most likely aware of the prestige associated with
purchasing a model in their range, their pricing decisions will in part reflect $\xi_j$.
Consequently, the unobservables, $\xi_j$, will be correlated with price, $p_{jt}$, and problems of
endogeneity may dog our estimation efforts.
Rather than resorting to instrumental variable estimation we plan to exploit the high level
of disaggregation available in our data. In particular, we contend that our data provide
detailed information on the performance and trim details of each vehicle. Accordingly, the
$\xi_j$ capture those elements of the styling and prestige associated with a particular make,
model and body type of vehicle that are impossible to quantify through objective
measurement. Since we observe demand in numerous markets and within each market
observe several varieties of vehicle of the same make, model and body type that differ in
their engine size and transmission type, we are in a position to estimate the $\xi_j$ directly
from the data. In particular, we use a fixed effects estimator, estimating a separate
coefficient to represent the fixed utility associated with each make, model, body type, fuel
type combination.
We estimate (3.13) using a standard fixed effects estimator for linear regression.
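To show why a fixed effects estimator helps here, the following sketch simulates data in which a price-like regressor is correlated with an unobserved group effect (standing in for prestige), then compares pooled OLS with a within (demeaning) estimator. Everything in it is synthetic and hypothetical: the group structure, coefficient values and function name are assumptions for illustration, not the study's data or code.

```python
import numpy as np


def within_estimator(y, X, groups):
    """Fixed effects (within) estimator: demean y and X by group, then OLS."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    y_dm, X_dm = y.copy(), X.copy()
    for g in np.unique(groups):
        mask = groups == g
        y_dm[mask] -= y[mask].mean()
        X_dm[mask] -= X[mask].mean(axis=0)
    beta, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)
    return beta


rng = np.random.default_rng(0)
n_groups, n_per_group = 50, 20          # e.g. make/model groups, variants x markets
groups = np.repeat(np.arange(n_groups), n_per_group)
xi = rng.normal(size=n_groups)          # unobserved group effect ("prestige")

# The price-like regressor is correlated with the unobserved group effect.
price = 0.8 * xi[groups] + rng.normal(size=groups.size)
engine = rng.normal(size=groups.size)
X = np.column_stack([price, engine])

beta_true = np.array([-1.0, 0.5])
y = X @ beta_true + xi[groups] + 0.1 * rng.normal(size=groups.size)

beta_fe = within_estimator(y, X, groups)

# Pooled OLS with an intercept ignores the group effect and is biased.
X_ols = np.column_stack([np.ones_like(y), X])
beta_ols, *_ = np.linalg.lstsq(X_ols, y, rcond=None)
print("fixed effects:", beta_fe, "pooled OLS:", beta_ols[1:])
```

In this simulation the within estimator recovers the price coefficient closely, whereas pooled OLS understates the price sensitivity because higher prices coincide with higher unobserved quality.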
3.8 Nested Logit Estimation
As is well known, the logit model described in the previous section implies highly restrictive
patterns of substitution across vehicles. The intuition here is that in the logit model, the
relative desirability of a vehicle is defined solely by the vehicle-specific fixed effects,
$\delta_{jt}$ $(j = 1, 2, \dots, J;\; t = 1, \dots, T)$, which are independent of household characteristics. Since
two very different cars, say a luxury BMW and a camper van, may be observed to command
the same market share, the logit model will attribute them the same fixed effect. From the
point of view of the model, these two vehicles are considered equally desirable to each
household in the market. Of course, that is nonsense. In reality, if the price of another
luxury vehicle, say an Audi, were to increase we would expect households to substitute
away from that vehicle. Intuition suggests that demand for cars similar to the Audi, say the
BMW, would rise whilst those for very different vehicles, say the camper van, would be
little changed. Unfortunately, since the logit assumes that both the BMW and the camper
van are equally desirable to all households, it also assumes that demand for both vehicles
will increase equally. Since the identification of accurate elasticities of demand is a key
objective for this research, the restrictive substitution patterns imposed by the logit model
are a concern.
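The substitution pattern described above can be illustrated numerically. The following is a minimal sketch only: the mean utilities are invented for illustration, and `logit_shares` is a hypothetical helper rather than code used in this study. It suggests that when one option becomes less attractive, the simple logit raises the share of every other option by the same proportion, so two options with equal shares gain equally however dissimilar they are.

```python
import math

def logit_shares(mean_utils):
    """Simple logit shares with an outside option whose utility is normalised to 0."""
    expu = {name: math.exp(u) for name, u in mean_utils.items()}
    denom = 1.0 + sum(expu.values())  # the 1.0 is exp(0) for "no purchase"
    return {name: e / denom for name, e in expu.items()}

# Invented mean utilities: the BMW and the camper van share the same value,
# so the logit gives them the same market share.
before = logit_shares({"audi": -5.0, "bmw": -5.2, "camper_van": -5.2})

# A price rise lowers the Audi's mean utility (the size of the drop is arbitrary).
after = logit_shares({"audi": -5.5, "bmw": -5.2, "camper_van": -5.2})

# Proportional change in share for the two non-Audi options.
for name in ("bmw", "camper_van"):
    print(name, after[name] / before[name])
```

Both printed ratios are identical, which is the restrictive pattern the nested logit is intended to relax.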
An alternative to the simple logit model is provided by the nested logit model (McFadden,
1974) that allows for consumer tastes, that is the idiosyncratic utility elements, to be
correlated across more similar products. The model progresses by first identifying groups of
options that are considered by the analyst to be more similar. In our analysis we take these
groupings to be the market segments described in Table 3.3. The nested logit model estimates
a set of parameters, $\sigma_g$ (one for each segment g), that measure the within-segment
correlation of the idiosyncratic utility elements.
As shown by Cardell (1991) and Berry (1994), the nested logit model results in an estimating
equation similar to (3.13), that is to say, for each option j and market t:
$$\ln s_{jt} - \ln(s_{0t}) = \big[\text{right-hand side of (3.13)}\big] + \sigma_g \ln\left(s_{jt}/s_{gt}\right) \tag{3.14}$$
Where the additional element, $\ln(s_{jt}/s_{gt})$, is the natural log of the within-segment share
commanded by option j in market t. Observe that as $\sigma_g$ approaches 1, the within-segment
correlation in the idiosyncratic error elements goes to unity. Conversely as $\sigma_g$ approaches 0
the within-segment correlation goes to zero and the model collapses back to the simple
logit model.
One problem with the estimation of (3.14) is the inclusion of the endogenous within-segment
market shares on the right hand side. Clearly, an idiosyncratic demand shock will not
only impact on the share option j commands of the whole market, but also option j’s share
of its market segment.
To overcome this problem we require a set of instrumental variables that are correlated
with within-segment share but not with the idiosyncratic demand shock. As noted by
Berry (1994) suitable instruments can be constructed by assuming that the characteristics
of a vehicle option are determined prior to the demand shock being realised, and as
such should be independent of that shock. This assumption seems reasonable in the short
run given the difficulties manufacturers have in quickly adjusting the characteristics of
their cars. The level of demand for vehicle option j manufactured by firm f in market
segment g will not only depend on its own characteristics, but also on the characteristics of
the products with which it is competing in that market segment, both those manufactured
by firm f and those manufactured by firm f’s competitors. As such, we would expect
within-segment shares to be correlated with functions of the characteristics of the other
products owned by firm f that are marketed in that segment and of the characteristics of the
competing products (measuring their closeness in the characteristic space). Accordingly, we
follow the lead of Verboven (1996) and use as instrumental variables:
(i) a vehicle option’s own observed characteristics;
(ii) the number of vehicle options and the sums of characteristics of other vehicle
options of the same firm belonging to the same segment, interacted with a segment
dummy variable; and
(iii) the number of products and the sums of the characteristics of competing products
belonging to the same segment, interacted with a segment dummy variable.
The instruments are specified as segment-specific because we allow the correlation
parameters, $\sigma_g$, to differ across segments.
We estimate (3.14) allowing for the vehicle-specific fixed effects using two-stage least squares.
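The construction of the three groups of instruments listed above can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the `Option` record layout, the function name `segment_instruments` and the example characteristics are all hypothetical. The interaction with segment dummies and the two-stage least squares step itself are not shown.

```python
from collections import namedtuple

# Hypothetical record layout: one row per vehicle option in a given market.
Option = namedtuple("Option", "name firm segment chars")

def segment_instruments(options):
    """Counts and characteristic sums over same-firm and rival options in each option's segment."""
    result = {}
    for o in options:
        same_segment = [p for p in options if p.segment == o.segment and p is not o]
        own = [p for p in same_segment if p.firm == o.firm]
        rivals = [p for p in same_segment if p.firm != o.firm]
        n_chars = len(o.chars)

        def col_sums(group):
            # Sum each characteristic across the options in the group.
            return [sum(p.chars[k] for p in group) for k in range(n_chars)]

        result[o.name] = {
            "own_chars": list(o.chars),      # instrument group (i)
            "n_same_firm": len(own),         # instrument group (ii)
            "sum_same_firm": col_sums(own),
            "n_rivals": len(rivals),         # instrument group (iii)
            "sum_rivals": col_sums(rivals),
        }
    return result

# Invented example: characteristics are (engine size in litres, horsepower).
options = [
    Option("focus_1.6", "ford", "C", (1.6, 100.0)),
    Option("focus_2.0", "ford", "C", (2.0, 130.0)),
    Option("civic_1.6", "honda", "C", (1.6, 110.0)),
    Option("ka_1.3", "ford", "A", (1.3, 60.0)),
]
iv = segment_instruments(options)
print(iv["focus_1.6"])
```

Because the sums are taken within a segment, each instrument varies by segment, which is what allows a separate correlation parameter to be identified for each segment.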
4. DATA
The data required for the estimation of the model described in the previous section is very
considerable. Primarily we require information on demand for particular types of vehicle
over a series of years disaggregated by region of GB. Of course, to understand how that
demand is related to the attributes of vehicles we also need data on the physical attributes
of those vehicles. In addition, we require information on the price, resale, insurance and
tax costs associated with those vehicles. In this section we describe the data sources used
in this research and the manipulations that have been undertaken in order to arrive at the
final estimation data set.
4.1 Data: Demand
4.1.1 Data Sources
Data on the demand for vehicles was taken from the New Car Registrations database
compiled by the Driver and Vehicle Licensing Agency (DVLA) since 2001. We shall refer to
this as the DVLA data. This database records vehicle and owner details for each new car
registered in Great Britain (Northern Ireland is excluded). The data is for cars only;
motorcycles, quadbikes, commercial vans, trucks and buses are excluded.
The data indicates whether the purchaser was a company or private individual and in which
of the 11 government office regions (GORs) of GB the vehicle purchaser resides.
We were not provided with the individual records, but with data that was already partially
grouped. That is to say, each record in our data indicates how many vehicles of a particular
make, model & version, body type, engine size and fuel type were purchased by households
and how many by companies in a particular GOR in a particular year.
In total the DVLA data set consists of 847,689 records, documenting new car purchases from
2001 through to 2006. The variables in the DVLA data set are listed in Table 4.1.
Table 4.1: Variables in DVLA Data
Name | Description
Make | Vehicle manufacturer
Model & Version | Make and model of vehicle (including details of transmission type)
Body Type | Thirteen categories of body types (e.g. saloon, cabriolet etc.)
Fuel Type | Six categories of fuel type (e.g. petrol, diesel, electric etc.)
Engine Size | Engine capacity in litres
GOR | Government Office Region
Year | Year of purchases
Purchases | Number of purchases of this vehicle in this GOR in this year
Unfortunately, it appears that entries in the DVLA data are not cross-validated to ensure
that names of the same version of vehicle are entered consistently and that the attributes
of those versions (body type, fuel type, engine size) are correct. Since our objective is to
calculate the demand for each version of vehicle, our first task was to ascertain exactly
which version of vehicle was being recorded in each record of the DVLA data set.
To that end, we drew on two alternative datasets that provided consistent entries of
vehicle versions and their attributes, as follows:
• JATO vehicle attributes database: JATO Dynamics have provided data reporting the
details of current and historical prices and attribute descriptions for each type of car
marketed in the UK from 1995 to 2006. The data does not cover so-called ‘grey-market’
imports; that is, types of car independently imported into the UK but not directly
marketed in this country by the manufacturer.
Like the DVLA data, the JATO data set does not record details of motorcycles,
quadbikes, commercial vans, trucks or buses.
The data is provided at the most disaggregate level. Each record refers to a particular
make, model and version of vehicle. Additionally, versions may be further
disaggregated by a variety of other substantive features including fuel type, body type
and number of doors.
Moreover, for each type of vehicle there are multiple entries recording changes in that
vehicle’s attributes (e.g. the price) over time. In total the data set consists of some
79,619 entries. A typical entry is shown in Table 4.2.
Table 4.2: Example of record of vehicle attributes recorded in JATO data set
Attribute | Example
Record Date:
Date | 01/05/2004
Car Details:
Model year | 2004
Make | FORD
Model | FOCUS
Version | 1.6 GHIA
Trim level | GHIA
Body type | hatchback
Num doors | 5
Fuel type | unleaded
Market Segment | Small Family
Engine Details:
Engine Size (l) | 1.6
Power (hp/PS) | 100
0-100km/h (secs) | 11.4
Driven wheels | front
Transmission type | manual
Num speeds | 5
ABS | standard
Fuel tank | standard
Capacity (l) | 55
Dimensions:
Kerb weight (kg) | 1.197
Length (m) | 4.174
Width (m) | 1.702
Height (m) | 1.430
Efficiency/Pollution Details:
Emission control level | standard
standard met | EU4
CO2g/km combined | 163
Fuel consumption | standard
Standard | ECE 99/100
combined (mpg) | 41.5
combined (l/100km) | 6.8
Trim Level Details:
Air conditioning | standard
Electric windows | front:rear
Wheel rim type | alloy
Front airbag | driver:passenger
Side airbag | front
Roof airbag | not available
Cost Details:
Price | £13,862.00
Insurance | standard
Description | 6E
• Association of British Insurers (ABI) vehicle insurance classification database: The ABI
“Code 44” database details the insurance industry’s group ratings covering all
volume manufactured private cars for the UK market. The dataset provides an almost
complete list of cars driven in the UK, the earliest entry being for a Bugatti Type13-23
manufactured from 1910 to 1920.
The ABI dataset consists of 37,150 records. The variables in the dataset are listed in
Table 4.3.
Table 4.3: Variables in ABI Data
Name | Description
Make | Vehicle manufacturer
Model & Version | Make and model of vehicle (may detail transmission type)
Body Type | Three categories of body type (i.e. cabriolet, estate, other)
Fuel Type | Two categories of fuel type (i.e. petrol or diesel)
Engine Size | Engine capacity in litres
No. Doors | Number of doors
Transmission | Two categories of transmission type (i.e. manual, automatic)
Series | Series number for versions with multiple releases
From | Year first marketed in the UK
To | Year last marketed in the UK
Advisory Insurance Group | Insurance group ranging from 1 to 20 with higher numbers indicating vehicles that are relatively more expensive to insure
Other Insurance Details | Other insurance details (e.g. security code, insurance group suffix)
Whilst we believe the ABI data to be comprehensive, further investigation revealed that the
JATO data does not provide a comprehensive listing of cars. We found numerous entries in
the DVLA data that matched perfectly with the ABI data set but were not present in the
JATO data set. The problem was particularly acute for automatic vehicles especially those
manufactured by BMW, Audi and Volvo. In addition, for vehicles that were present in the
JATO data set numerous records were incomplete, missing details of one or more
attributes. As such we could not rely solely on the JATO data to classify entries in the DVLA
data.
4.1.2 Matching the DVLA data
Since the JATO data set contains multiple entries for each vehicle version recording
changes in (amongst other attributes) the price of that version over time, we first
identified from the JATO dataset a list of unique vehicles differing according to the various
data provided in the DVLA data set. That is to say, we established the set of vehicles with
unique make, model, version, body type, fuel type, transmission type and engine size. That
list contained 27,058 entries covering vehicles marketed in the UK from 1995 to 2006.
As we might expect, this is substantially fewer than the 37,150 recorded in the ABI data set:
not only is the latter more comprehensive, it is also not restricted to vehicles
marketed from 1995 onwards and it contains vehicles entering the UK as grey market
imports.
Since the DVLA data only covers the period from 2001 whilst the JATO data covers the
period from 1995 to 2006, we further subdivided our list of vehicles from the JATO data
into those marketed only prior to 2001 and those available in years from 2001 onwards.
One might imagine that it would then be a relatively simple task to match each DVLA
record to a record in the JATO list. Nothing could be farther from the truth.
One particular difficulty we have encountered is that the JATO data records the vehicle
moniker in three fields; make, model and version, whilst the DVLA data only provides two
fields, combining the model and version into one. For example, the JATO data set might
indicate a vehicle as being an:
“ALFA ROMEO”, “147”, “2.0 T.SPARK SELESPEED LUSSO”
The equivalent vehicle in the DVLA dataset is recorded as an:
“ALFA ROMEO”, “147 T SPARK LUSSO S-SPEED”.
Notice that even once one has identified the DVLA record as being a “147” model, the
remaining version name only partially resembles the version name recorded in the JATO
record. To confound matters further, another entry in the DVLA data set will record the
exact same vehicle as being an:
“ALFA ROMEO”, “147 LUSSO SELESPEED”.
A further difficulty was encountered in matching the thirteen categories used to record
body types in the DVLA data to the fourteen categories used in the JATO data. For
example, in registering their vehicle, one individual might classify its body type as being an
“estate” whilst another might record it as being a “5-door hatchback” and yet another as a
“Multi-Purpose Vehicle”. Again the majority of records will identify the correct body type
for that vehicle, but again many do not. The same story repeats itself for the recording of
fuel types.
In order to overcome the problems in matching the DVLA data to the JATO data we have
developed a sophisticated matching algorithm.
The algorithm progresses through five stages. For each record in the DVLA data:
(1) Find the best match in the JATO database for vehicles marketed from 2001 onwards.
(2) Find the best match in the JATO database for vehicles marketed between 1995 and 2000.
(3) Find the best match in the ABI database for vehicles marketed prior to 1995.
(4) If no matches are found in (1) to (3), find the best match in the ABI database for vehicles
marketed after 1995 and record the last year they were available in the UK market.
(5) Compare the quality of match for (1) to (4) and attribute the DVLA entry to the best match.
The matching logic in stages (1) to (4) itself follows a number of steps.
(i) The version name of the DVLA entry is scanned for certain key words that provide
information on the type of vehicle. For example, the word “auto” in the version name
indicates an automatic car. Likewise the code “TDI” indicates a diesel engine.
(ii) A list of vehicles with the same make and model as the DVLA entry is retrieved from the
matching database (JATO or ABI). The list may be further reduced if other details of the
car have been established from (i). If no vehicles can be found in the matching
database that fit these criteria then no match can be made.
(iii) The DVLA entry is compared to each entry in this list and a matching score is
calculated. The score comprises an element which reflects the closeness of match
between the version names, the body types, the fuel types and the engine sizes.
o Version name match: Since there is no guarantee that the words in the vehicle
name are recorded in the same order in the DVLA entry as they are in the matching
entry we are comparing it against, the first step is to break up the version name of
the DVLA entry and the matching entry into tokens (different words).
We then compare each DVLA token to each matching token using a fuzzy matching
algorithm. The algorithm returns a matching score based on Levenshtein distance
(LD) that measures the similarity between two strings as the number of deletions,
insertions, or substitutions required to transform one token into the other. If two
tokens match exactly then they get a score of zero (no changes need to be made to
one string to transform it into the other). If the two tokens do not match at all
then the score will be the length of the longer string (all the characters in the string
have to be either substituted or deleted). We end up with a matrix showing the
fuzzy match score between each token from the DVLA entry and each token from
the matching entry.
Finally, we use the Kuhn-Munkres algorithm (Hungarian algorithm) to pair DVLA
entry tokens with matching entry tokens to give the lowest overall fuzzy match
score. The score gives the number of character changes that have to be made in
order to convert one version name into another. Dividing this by the number of
characters in the longest of the two version names and subtracting that from one
gives the version match score. A value of one is a perfect match, a value of zero
indicates that the two version names are totally dissimilar.
o Body type match: If the two body types are identical or equivalent, a score of 1 is
allotted. For example, if the matching entry is a “cabriolet” and the DVLA entry is a
“convertible” then the body type match score is 1. Lower scores are attributed for
similar but non-identical body types. For example, a score of 0.66 would be
attributed if the DVLA entry was instead a “coupe” or 0.33 if the DVLA entry was
instead a “2 door saloon”.
o Fuel type match: If the two fuel types are identical a score of 1 is allotted
otherwise the fuel type match score is zero. For example, if the DVLA data entry is
a “petrol” vehicle then the match score will be 1 if the matching entry is also a
“petrol” vehicle, but zero if the matching entry is an “electric” car or runs on
“green fuel”.
o Engine size match: The engine size match is calculated as the square of the ratio of
the engine sizes of the two vehicles always placing the larger of the two engine
sizes in the denominator. For example, if the DVLA entry is recorded as having a
1.8l engine and the matching entry has the same size engine, then the engine size
match score is 1. Alternatively if the matching entry has a 1.6l engine then the
engine size match score is roughly 0.79 (=(1.6/1.8)^2).
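The version name and engine size components described above can be sketched as follows. This is a minimal illustration, not the study's implementation. Several details are assumptions made here because the text does not specify them: names are compared in upper case, unpaired tokens are charged their full length, the name length excludes spaces, and the score is floored at zero. For brevity the pairing step uses an exhaustive search over token pairings in place of the Kuhn-Munkres algorithm; for the handful of tokens in a version name it finds the same minimum.

```python
from itertools import permutations

def levenshtein(a, b):
    """Number of single-character insertions, deletions or substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def version_match_score(dvla_name, candidate_name):
    """1 minus (minimum total edit distance over token pairings / length of longer name)."""
    a, b = dvla_name.upper().split(), candidate_name.upper().split()
    # Assumption: pad the shorter token list with empty tokens, so that an
    # unpaired token costs its full length.
    n = max(len(a), len(b))
    a += [""] * (n - len(a))
    b += [""] * (n - len(b))
    cost = [[levenshtein(x, y) for y in b] for x in a]
    # Exhaustive search over pairings (stand-in for Kuhn-Munkres).
    best = min(sum(cost[i][p[i]] for i in range(n)) for p in permutations(range(n)))
    # Assumption: name length is the total token length, spaces not counted.
    longest = max(sum(map(len, a)), sum(map(len, b)))
    return max(0.0, 1.0 - best / longest) if longest else 1.0

def engine_size_score(size_a, size_b):
    """Square of the ratio of the two engine sizes, larger size in the denominator."""
    small, large = sorted((size_a, size_b))
    return (small / large) ** 2

# Same tokens in a different order are treated as a perfect match.
print(version_match_score("LUSSO SELESPEED", "SELESPEED LUSSO"))
# The 1.8 litre versus 1.6 litre example from the text.
print(round(engine_size_score(1.8, 1.6), 2))
```

Tokenising before comparison is what makes the score insensitive to word order, which a single whole-string edit distance would penalise heavily.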
As such, each of the four matching score elements takes a value between 0 and 1. The four
matching score elements are added as a weighted sum in which twice the weight is
attributed to version name match as to the other elements. The final matching score,
therefore, ranges from 0 for two totally dissimilar entries to 5 for two identical entries. The
entry in the matching list returning the highest score is identified as the best match.
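As a worked illustration of the weighting, the sketch below combines four component scores into the final matching score and picks the best candidate. The function names and the component scores are invented for the example; only the weights (two for version name, one for each other element) come from the description above.

```python
def matching_score(version, body, fuel, engine):
    """Weighted sum of the four component scores (each in [0, 1]); version name counts double."""
    for s in (version, body, fuel, engine):
        if not 0.0 <= s <= 1.0:
            raise ValueError("component scores must lie between 0 and 1")
    return 2.0 * version + body + fuel + engine

def best_match(candidates):
    """Return the (label, score) pair with the highest matching score."""
    scored = [(label, matching_score(*parts)) for label, parts in candidates.items()]
    return max(scored, key=lambda pair: pair[1])

# Invented component scores (version, body, fuel, engine) for two candidate entries.
candidates = {
    "candidate_a": (0.80, 1.00, 1.0, 1.00),
    "candidate_b": (0.90, 0.66, 1.0, 0.79),
}
print(best_match(candidates))
```

In this invented case candidate_b has the closer version name but loses overall because of its weaker body type and engine size scores.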
The matching algorithm takes several hours to run but achieves a very high level of
accuracy. Moreover, since each match is given a similarity score, it is easy to go back and
investigate those entries that have not been adequately matched.
The matching algorithm allows us to identify different vehicles in the DVLA dataset and find
corresponding entries in the JATO and ABI datasets. Our analysis reveals that the DVLA’s
new car registrations database contains a large number of entries that are not “new” and
some entries that are not “cars”! Table 4.4 provides a breakdown of the types of entry
found in the DVLA database.
Table 4.4: Breakdown of registrations in DVLA database (2001 to 2006)
Status | Registrations
Car Matched to JATO or ABI 2001-2006 | 14,605,755
Car Older than 2001 | 66,750
No Match | 182
Make known but Model Unidentified | 500,919
Grey Import | 26,452
Not Cars | 96,664
Total: | 15,296,742
The vast majority of entries, some 95% of registrations, could be matched to the JATO
dataset or, failing that, the more comprehensive ABI dataset. A further 66,750 registrations
were found to concern vehicles that were manufactured before 2001 (frequently many
years before 2001) and, therefore, could not be considered as new cars for the purposes of
this research. A substantial proportion of entries were found not to be complete, recording
the vehicle specification details and the manufacturer’s name but not indicating the name
of the particular model. We found some 26,452 registrations concerned vehicles not
marketed directly to the UK market. Finally we found numerous entries for vehicles that
are not cars (our investigations have revealed entries for trucks and buses, as well as quad
bikes, motorcycles, tractors and even snowmobiles).
As a final step we wanted to allocate the 500,919 registrations for which we were missing
information on the exact model of vehicle to an identified vehicle version. To do that, we
took each unidentified entry in turn and noted the year and GOR to which it referred and
whether the sales in that entry were purchases by private or company buyers. We then
identified the set of vehicles made by the same manufacturer in the same year with the
same specification. The sales of the unidentified entry were allocated to the sales of these
matching vehicles in the correct GOR year and purchaser category in proportion to the
observed sales of those vehicles.
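The proportional allocation just described can be sketched as follows. The function name and the figures are hypothetical; the example covers a single manufacturer, specification, GOR, year and purchaser category.

```python
def allocate_proportionally(unidentified_sales, matched_sales):
    """Share out unidentified registrations across matched versions in proportion to their sales."""
    total = sum(matched_sales.values())
    if total == 0:
        raise ValueError("no observed sales to allocate against")
    return {
        version: sales + unidentified_sales * sales / total
        for version, sales in matched_sales.items()
    }

# Invented figures: 50 unidentified registrations and three matching versions.
matched = {"version_a": 600, "version_b": 300, "version_c": 100}
allocated = allocate_proportionally(50, matched)
print(allocated)
```

The allocation leaves each version's share of the group's sales unchanged while ensuring that the unidentified registrations still count towards total demand.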
Our final data set ascribes the registrations recorded in the DVLA data to an identifiable
vehicle version. From it we can calculate the sales of each version of vehicle to both
private and company purchasers in each GOR in each year from 2001 to 2006.
4.1.4 Defining Choice Options
The 14,605,755 new car registrations that were successfully matched concerned purchases of
some 9,119 different specific vehicle versions. With respect to our analysis, a critical
decision concerned whether we should regard each one of these separate vehicle versions
as representing a choice option or attempt some degree of aggregation.
Following advice from the referees, it was decided that maintaining a high degree of
disaggregation was desirable in order to achieve the objectives of the project. In
particular, it was pointed out to us that CO2 emissions may differ substantially even
between apparently similar vehicles.
As such, we define choice options at the lowest level of disaggregation where we are
confident that our data matching algorithms can successfully allocate observations to the
correct option. To that end, we define a choice option as a set of vehicles sharing the
same:
• Make (e.g. Ford, Vauxhall etc.)
• Model (e.g. Ford Focus, Vauxhall Corsa etc.)
• Body type (e.g. hatchback, saloon, estate, cabriolet etc.)
• Fuel type (e.g. petrol, diesel, LPG etc.)
• Transmission type (e.g. automatic, manual)
• Engine size (1.2 litres, 1.6 litres, 2.0 litres etc.)
Aggregating according to these criteria allowed us to reduce the choice options to 2,190.
We note that this is considerably more than in any previous study of which we are
aware. Our belief is that this definition of choice options provides a close approximation to
the actual choice set faced by households in the GB car market, capturing essential
differences in the fuel type, transmission and engine sizes of vehicles that are fundamental
to examining issues surrounding the CO2 emissions of new vehicles.
4.1.5 Calculating Market Shares
As discussed previously, our analysis concerns only sales to private households. As such we
discard data on sales to companies and aggregate the private sales data by option, year and
GOR.
According to our model, each household in each GOR makes a decision each year as to
whether to buy a new car and, if they decide in the affirmative, which particular car to
purchase. Population data available from the Office for National Statistics provides details
of the number of households in each market and allows us to calculate market shares; that
is, the proportions of households in each market purchasing each type of car as well as the
proportion not purchasing a car at all.
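A minimal sketch of this calculation for a single market (one GOR in one year) is given below. The function name and the figures are invented, and the sketch assumes, as the model does, that each household buys at most one new car per year.

```python
def market_shares(sales_by_option, households):
    """Share of households buying each option, plus the share buying no new car."""
    shares = {opt: n / households for opt, n in sales_by_option.items()}
    shares["none"] = 1.0 - sum(shares.values())
    return shares

# Invented figures for a single GOR-year market.
shares = market_shares({"option_1": 1200, "option_2": 800}, households=100_000)
print(shares)
```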
Table 4.5 summarises the market share data, disaggregating by GOR and market segment.
Observe that the segments showing the largest sales are segments B and C, supermini and
medium sized cars. As might be expected, across all GORs sales of luxury vehicles
(segment E), sports cars and MPVs are relatively lower. The patterns of demand are not
identical across GORs. For example, we observe that total demand is lowest in London;
around 2.4 percentage points lower than in the East and South East regions, which show the
highest levels of market participation. All the same, it is interesting to note that in the
London region luxury and sports cars make up a higher proportion of sales than they do in
any other region.
Table 4.5: Market shares commanded by market segments for each GOR between 2001 and 2005 (%)
Government Office Region | A: Mini Car | B: Super Mini | C: Medium Car | D: Large Car | E: Executive Car | Mini MPV | MPV | Sports | SUV | None
North East | 0.283 | 2.193 | 1.275 | 0.547 | 0.077 | 0.458 | 0.051 | 0.179 | 0.376 | 94.560
North West | 0.291 | 2.064 | 1.325 | 0.555 | 0.098 | 0.429 | 0.062 | 0.230 | 0.354 | 94.591
Yorkshire and Humberside | 0.252 | 1.835 | 1.164 | 0.588 | 0.112 | 0.377 | 0.055 | 0.202 | 0.425 | 94.989
East Midlands | 0.301 | 1.718 | 1.170 | 0.538 | 0.106 | 0.375 | 0.066 | 0.220 | 0.405 | 95.101
West Midlands | 0.348 | 2.040 | 1.505 | 0.602 | 0.106 | 0.385 | 0.058 | 0.249 | 0.432 | 94.275
East | 0.348 | 1.961 | 1.404 | 0.780 | 0.140 | 0.431 | 0.116 | 0.295 | 0.474 | 94.050
London | 0.179 | 1.196 | 0.783 | 0.433 | 0.122 | 0.228 | 0.071 | 0.215 | 0.282 | 96.491
South East | 0.325 | 1.916 | 1.284 | 0.829 | 0.180 | 0.427 | 0.104 | 0.319 | 0.507 | 94.109
South West | 0.293 | 1.606 | 1.152 | 0.537 | 0.095 | 0.400 | 0.068 | 0.199 | 0.431 | 95.218
Wales | 0.378 | 1.893 | 1.262 | 0.500 | 0.073 | 0.434 | 0.055 | 0.168 | 0.348 | 94.889
Scotland | 0.220 | 1.846 | 1.257 | 0.546 | 0.085 | 0.347 | 0.049 | 0.150 | 0.395 | 95.104
4.1.6 Sales Outside Years of Manufacture
One final adjustment was needed to arrive at our final data set. When we plotted out the sales of
vehicles of each option over time, we observed that the sales of a particular option might be
extremely high over a series of years before dropping off dramatically. Further investigation
revealed that this pattern of sales invariably reflected the fact that a particular vehicle had ceased
manufacture midway through our data series. Sales subsequent to the cessation of manufacture of
a vehicle are, we assume, the result of residual stock being sold off by dealers. The opposite
problem was observed when vehicles first entered the market towards the end of a calendar year,
thereby registering only a few sales within that year.
Clearly, it would be very misleading to interpret these observations as indications of annual
demand for a vehicle of that type. As such, in our final data set for a particular year, we only
include market share observations if vehicles were manufactured for the full duration of that year.
As shown in Table 4.6 this required the exclusion of 66,083 car sales equivalent to just over 1% of
the sales recorded between 2001 and 2005.
Table 4.6: Sales of cars to private households (2001 to 2005) included and excluded from the
analysis
Year | Total Sales | Excluded: No Attribute Data | Excluded: Outside Manufacturing Years | Included in Final Data
2001 | 1,341,495 | 6,176 | 13,148 | 1,322,171
2002 | 1,375,929 | 3,212 | 15,455 | 1,357,262
2003 | 1,337,186 | 3,053 | 14,865 | 1,319,268
2004 | 1,212,412 | 2,551 | 14,645 | 1,195,216
2005 | 1,037,796 | 9,828 | 7,970 | 1,019,998
4.2 Data: Vehicle Attributes
For each identified option in each year we wish to recover estimates of that option’s price and
specification from the JATO data set. Since numerous different versions of the same option might
exist in the JATO data set (differing in attributes other than make, model, transmission, fuel type
and engine size) we calculate these values as the sales-weighted average of all versions comprising
an option in a particular year.
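The sales-weighted average can be sketched as below. The record layout, function name and figures are hypothetical; in practice the same calculation would be repeated for each attribute of each option in each year.

```python
def sales_weighted_average(versions, attribute):
    """Average an attribute over the versions of one option, weighting by sales."""
    total_sales = sum(v["sales"] for v in versions)
    if total_sales == 0:
        raise ValueError("option has no recorded sales in this year")
    return sum(v[attribute] * v["sales"] for v in versions) / total_sales

# Invented figures: two versions of one option in one year.
versions = [
    {"price": 13000.0, "sales": 300},
    {"price": 14000.0, "sales": 100},
]
average_price = sales_weighted_average(versions, "price")
print(average_price)
```

Weighting by sales pulls the option's price towards that of its best-selling version, here the cheaper one.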
A problem with which we have struggled is incompleteness of the JATO data set. In particular,
detailed examination of the data revealed that a large number of vehicles represented in the DVLA
data were not present in the JATO data set. The problem was particularly acute for automatic
vehicles especially those manufactured by BMW, Audi and Volvo. In addition, for vehicles that were
present in the JATO data set numerous records were incomplete, missing details of one or more
attributes.
We have dealt with this problem in one of three ways:
1. Where another record existed in the JATO database for this exact version of vehicle but for a
different time period, we updated missing specification details to those of this alternative
vehicle. Clearly, this would not be appropriate for the price details that we expected to vary
across time.
2. Where no record existed for this exact version, we imputed the missing specification and price
data as the average of other JATO records attributed to that option in that year.
3. Where no JATO records existed for an option we sought out alternative data. In particular, we
made use of the Wisebuyer’s website (www.wisebuyers.co.uk) which records specification and
price details for vehicles sold in the UK over the last 20 years. We found that many of the
vehicles missing from the JATO data set were present at this site. With the permission of the
website editors, we programmed a webscraping package to access details on the website and
download these to a database. This source of data was very useful in augmenting the JATO data
set. For example, we found specification and price details for 431 options (mostly automatics)
that had not been recorded in the JATO dataset.
Despite our best efforts, we were unable to construct complete attribute details for 128 of the
2,190 vehicle options. These generally pertained to very low production vehicles by small scale
manufacturers (e.g. Strathcarron, Maserati etc.). Indeed, as shown in Table 4.6, vehicle options for
which we could not construct complete data records only amount to some 24,820 sales, under 0.5%
of the total sales over the 5 year time span of our data.
Table 4.7 summarises the average vehicle attributes of data included in the final data set
disaggregated by market segment.
4.3 Data: Vehicle Costs
New car purchasing decisions are not determined solely by the physical characteristics of the
various options. Clearly, the costs of purchasing and running a car figure prominently in households’
choice. For the purposes of this research, we classify the cost characteristics of cars into three
categories:
• Purchase costs: Either as a one-off lump-sum payment, or through taking out a loan and
repaying the debt plus interest or through one of several other financing schemes (e.g.
hire-purchase).
• Fixed costs: Including insurance, vehicle excise duty and parking costs.
• Variable costs: Including the fuel costs, maintenance costs and perhaps tolls and congestion
charges.
We discuss each type of cost in turn.
4.3.1 Purchase Costs
The JATO dataset provides ‘list’ prices for each vehicle for each year for which we have DVLA
registrations data. We are aware that this price probably overstates the actual price paid by
households since most dealers will offer a “discount” off list prices when making sales. Given the
available data, there is little we can do to address this concern. More importantly, a key question is
whether this list price is a good reflection of the cost of purchasing the vehicle as perceived by the
household.
An important consideration in purchasing a vehicle is the extent to which that vehicle holds its
value such that capital outlay can be recouped on resale. We suspect that vehicles that command
relatively lower prices in the second hand market will be relatively less preferred by consumers.
To investigate this issue we sourced another data set from EuroTaxGlass. For every year from 2001
to 2005, this data set records the price fetched in the second hand market by one year old vehicles
that had been driven the average household annual distance of 13,000km.
Table 4.7: Average physical attributes of vehicle options used in the analysis disaggregated by market segment

Market Segment                Engine    CO2     Automatic*  Num    Size       Doors  Brake       Acceleration  Number   Air            Alloy  Anti-Lock
(example vehicle)             Size      (g/km)              Gears  (Length ×         Horsepower  (secs to      of       Conditioning*  Wheel  Braking
                              (litres)                             Width)                        100km/h)      Airbags                 Rims*  System*
A (Ford Ka)                   1.09      142     0.30        4.78   525        3.68   65          15            1.54     0.13           0.27   0.35
B (Renault Clio)              1.45      155     0.21        4.59   644        4.20   86          13            1.78     0.39           0.38   0.59
C (Honda Civic)               1.75      178     0.27        4.54   732        4.20   113         11            2.23     0.59           0.52   0.65
Mini MPV (Renault Scenic)     1.81      186     0.27        4.79   743        4.91   114         12            2.54     0.68           0.51   0.71
D (Ford Mondeo)               2.18      202     0.38        4.69   809        3.98   152         10            2.59     0.65           0.61   0.67
E (Jaguar S-Type)             3.00      244     0.62        4.91   892        4.03   219         9             3.07     0.70           0.70   0.70
Sports (Porsche Boxster)      3.01      259     0.41        5.33   773        2.11   242         7             2.38     0.71           0.83   0.81
MPV (Renault Espace)          2.24      226     0.43        4.63   869        4.76   139         13            2.43     0.69           0.46   0.68
SUV (Land Rover Freelander)   2.75      266     0.54        4.70   827        4.58   166         12            2.52     0.72           0.69   0.73

* Dummy variables
The data quotes two prices:
• Retail Price: Average selling price of that vehicle at a dealership.
• Trade Price: The typical price that a dealer would offer to buy the car for cash, either at
auction, from another trader, or from a private seller.
Not surprisingly, we find that the trade price is somewhat lower than the retail price, though both,
in general, are considerably lower than the new price. We find that, on average, the trade price
represents a 33% loss in value over the course of one year. The depreciation when calculated from
the retail price is 25%.
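The depreciation figures quoted above are simple ratios of a one year old price to the new price. The sketch below shows that arithmetic; the function name and the three prices are hypothetical, chosen only so that the results echo the 33% and 25% averages in the text, and are not taken from the JATO or EuroTaxGlass data.

```python
def one_year_depreciation(new_price: float, used_price: float) -> float:
    """Fraction of the new (list) price lost after one year.

    A negative result means the used vehicle sells above its new price.
    """
    if new_price <= 0:
        raise ValueError("new_price must be positive")
    return 1.0 - used_price / new_price


# Hypothetical prices (assumption, for illustration only).
new_price = 12_000.0
trade_price = 8_040.0
retail_price = 9_000.0

trade_loss = one_year_depreciation(new_price, trade_price)
retail_loss = one_year_depreciation(new_price, retail_price)
print(f"trade: {trade_loss:.0%}, retail: {retail_loss:.0%}")
```

A used price above the new price, as reported for some vehicles in 2002, simply yields a negative depreciation under this definition.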
The averages hide considerable variation across vehicles. In 2002, the BMW Series 3 Cabriolets were
actually selling for a price higher than their new price. One can only assume that demand had
outstripped the supply of new cars. In contrast the Proton Wira hatchback in 2005 had lost over 50%
of its value one year after purchase as new.
Since our analysis concerns the purchases of households, we take the relevant resale price to be the
trade price. Also, we assume that households’ decisions about which car to buy are influenced by
the current resale values of cars bought in previous years and not the actual resale value that a
vehicle realises one year later which, of course, is unobservable at the time of purchase.
4.3.2 Fixed Costs
With regards to the fixed costs of motoring, details of the Vehicle Excise Duty (VED) payable on
vehicles with different levels of CO2 emissions over the span of our data series are illustrated in
Figure 4.1. Notice that there has been some, though only moderate, fanning-out of VED over this
period, with discounted bands being introduced for low emissions vehicles in 2002 and 2003. Whether
this variation will be sufficient to identify the independent impact of the VED on vehicle demand
is an entirely empirical issue.
In addition to the VED, a substantial fixed cost associated with owning a vehicle is that of insuring
the vehicle. Since the cost of repairing damage differs substantially across vehicles, and the risks of
incurring damage differ across individuals and locations, insurance premiums themselves differ
across vehicles and individuals.
Of course, we do not have observations on the annual insurance premiums payable for each and
every vehicle option for each and every individual in each year. Rather we need some way of
approximating those insurance costs in order to include this important factor in our analysis. We
carry out an ancillary regression analysis to achieve that goal.
The data used for this analysis is that collected by the AA to calculate the British Insurance
Premium Index (BIPI), a quarterly index tracking changes in car insurance costs in the UK. The data
records insurance quotes for 500 ‘virtual’ individuals and, as such, is known as the BIPI 500 dataset.
Each virtual individual is constructed so as to differ with regards to their age, sex, location of
residence and in terms of the car they are trying to insure, the type of insurance they wish to
purchase and their accumulation of years of no-claims bonuses. The 500 individuals are constructed
so as to cover the spectrum of insurance requests observed in the UK.
Each quarter the AA gets 51 of the UK’s leading insurers to provide an insurance quote for each of
the virtual individuals (the individuals remain the same quarter on quarter). The data reports the
average of all 51 quotes, the so-called market average premium, and the average of the three
lowest quotes, the so-called shop-around premium. On the assumption that individuals seek out the
cheapest car insurance, our analysis uses the shop-around values.
Figure 4.1: Vehicle excise duty by vehicle CO2 emissions category 2001 to 2005
[Figure: two panels (Diesel, Petrol); vertical axis: Price (£/year), 0 to 180; horizontal axis: year, 2001 to 2005; one series per CO2 emission level (g/km): 100 and below, 101 to 120, 121 to 150, 151 to 165, 166 to 185, 186 and above]
Each of the 500 records in each quarterly data set records details of the individual, the vehicle and
the type of insurance requested. We discuss important variables in the data subsequently and
provide detailed descriptions of all the variables used in our analysis in Table 4.8.
A primary objective of our analysis is to generate a model from which we might predict the
insurance premiums payable on new cars in each of our vehicle option groupings. To that end, our
vehicle attribute data indicates the Association of British Insurers (ABI) insurance group of each
vehicle option. The ABI categorises vehicles according to insurance liability into one of twenty
groups labelled 1 to 20. Vehicles in higher ABI insurance groups are considered a greater insurance
risk and/or have greater repair costs and, as such, tend to command higher insurance premiums.
For example, a small city run-around like a Citroen C2 would be in groups 1 to 8 (according to
version), while a large sports utility vehicle like a Land Rover Range Rover would be in groups 13 to
18 (according to version) whilst a sports car like a Porsche 911 Carrera would be in Group 20. Since
the BIPI 500 data set records the ABI insurance group of the vehicle being insured by each virtual
individual, our regression analysis will allow us to determine the impact of insurance grouping on
insurance premiums.
In our aggregate analysis, we partition the demand data geographically into the 11 Government
Office Regions of Great Britain. As such, we would also like to determine if insurance premiums
differ systematically across the country. Fortunately, the BIPI 500 data set provides a similar level
of geographic disaggregation, though the East and West Midlands GORs are recorded just as one
region (Midlands) as are the North-East England and Yorkshire & Humber GORs (North-East &
Yorkshire).
Table 4.8: Variables in the BIPI 500 dataset used in the regression analysis
Variable
Description and expected relationship to insurance premiums
Policy Details:
Cover type
As suggested by its name, a Third-Party, Fire and Theft (TPFT) policy will pay out for third-party damages and injuries and in the event that a vehicle is
stolen or damaged by fire. A Comprehensive policy also pays out for damage
to the owner’s vehicle regardless of who was at fault in causing the damage.
The BIPI 500 has observations of both comprehensive and TPFT policies.
Number of individuals insured
Insurance policies are usually specific to a particular individual driving a
particular vehicle. It is possible to add extra named individuals to a policy. As
well as standard individual policies, the BIPI 500 contains observations on the
price of policies with one additional named driver.
Vehicle Details:
Insurance Group
The Association of British Insurers (ABI) categorises cars into one of twenty
insurance groups labelled (imaginatively) 1 to 20. Cars in higher numbered
groups are considered a higher insurance liability because they are more
likely to be involved in accidents and/or have higher costs of replacement (in
the event of fire and theft) and repair (in the event of an accident for a
comprehensive policy holder). As such, vehicles in higher numbered groups
incur relatively higher insurance premiums. The BIPI 500 reports insurance
policy prices for vehicles from each of the 20 groups.
Vehicle Age
Newer cars are usually more expensive to replace (in the event of fire and
theft) or repair (in the event of an accident for a comprehensive policy
holder) such that older cars will usually attract lower premiums. The BIPI 500
includes quotes for vehicles ranging from new to 30 years in age.
Individual Characteristics:
Gender
Males are more likely to be involved in accidents than females and, as such,
women may be able to obtain cheaper car insurance than men, with a number
of insurers specifically targeting this less-risky market. The BIPI 500 includes
quotes for both men and women.
Age
Younger, less experienced drivers are more likely to be involved in accidents
and make claims on their insurance policies. Insurance premiums are
particularly high for teenage drivers and only begin to drop significantly over
the age of 25. The BIPI 500 includes quotes for individuals from 17 to 87 years
of age.
Years “No-Claim” Bonus
Owners accrue a no-claims bonus (discount on their insurance policy) for each
year that a vehicle is insured by an owner without making a claim on the
policy. The no-claims discount may be as great as 70% off the cost of
insurance after accruing 5 or more years no-claims bonus. The BIPI 500 data
includes individuals with 0 to 20 years no-claims bonus.
Location:
Urban
Clearly, accidents, theft and break-ins tend to be more common in towns and
cities. The BIPI 500 data identifies the town in which the policy holder lives.
We matched this with ONS data to get a rough estimate of the size of
population in that location. We identify dummy variables distinguishing large
urban areas (>250,000 inhabitants) from medium urban areas (50,000 to
250,000 inhabitants) from other areas (<50,000 inhabitants). The data does
not allow us to identify rural from urban locations.
Government Office Region
Insurance premiums may differ systematically across GB. We construct a
series of dummy variables indicating different regions of GB to investigate this
contention.
We attempted to source data for all quarters from 2001 to 2005. Unfortunately, the AA were only
able to provide us with BIPI 500 data for the first three quarters of 2005. Of course, since each
quarter’s data provides quotes for the same 500 virtual individuals, each additional quarter’s data
only provides information pertaining to the variability of quotes and how these change over time.
Accordingly, our analysis uses 1,498 observations pertaining to 500 individuals (data for one individual
was only available in the first quarter) from the first three quarters of 2005.
We perform a straightforward regression of the natural log of the shop-around insurance premium
on the covariates described in Table 4.8. This specification allows us to interpret the slope
coefficients on continuous variables (multiplied by 100) as the approximate percentage change in
insurance premium resulting from a one unit change in the covariate. The percentage impact of dummy variables can be calculated
using the formula provided by Halvorsen and Palmquist (1980) and these are reported in the
discussion that follows. We use robust estimation techniques to control for the repetition of
observations for the same individual in the calculation of coefficient standard errors.
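For reference, the Halvorsen and Palmquist (1980) adjustment converts a dummy-variable coefficient c from a log-linear regression into a percentage effect of 100 × (exp(c) − 1). The sketch below applies it to two coefficients reported in Table 4.9 (Third Party, Fire & Theft and Plus One Named Driver); the function name is ours and is only illustrative.

```python
import math


def dummy_percentage_effect(coefficient: float) -> float:
    """Percentage change in the dependent variable when a dummy switches
    from 0 to 1 in a log-linear model: 100 * (exp(c) - 1)."""
    return 100.0 * (math.exp(coefficient) - 1.0)


# Coefficients as reported in Table 4.9.
tpft_effect = dummy_percentage_effect(-0.199)         # TPFT vs comprehensive
named_driver_effect = dummy_percentage_effect(0.125)  # plus one named driver
print(round(tpft_effect, 1), round(named_driver_effect, 1))
```

Rounded to whole percentages, these correspond to the discount of about 18% and the increase of about 13% discussed alongside the results.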
To allow for non-linearities in the relationship between the various continuous covariates and
insurance premiums we make use of linear splines. That is to say, the ABI Insurance Group, Vehicle
Age, Individual Age and Years No-Claims Bonus variables all have piecewise linear specifications
where the knots of the splines were chosen through examination of the data and through a process
of investigation. Categorical variables are entered as dummy variables.
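A linear spline can be entered in a regression by replacing a continuous covariate with one column per segment, each column counting how many units of the covariate fall inside that segment. The sketch below builds such columns with NumPy for the ABI insurance group. The knot positions (0, 4, 8, 12, 16, 20) are our assumption about how the segments "1 to 4", "5 to 8" and so on are defined; the slopes are the segment coefficients reported in Table 4.9.

```python
import numpy as np


def linear_spline_basis(x, knots):
    """Return one column per segment; column k holds the number of units
    of x lying between knots[k] and knots[k + 1]."""
    x = np.asarray(x, dtype=float)
    lower = np.asarray(knots[:-1], dtype=float)
    upper = np.asarray(knots[1:], dtype=float)
    return np.clip(x[:, None], lower, upper) - lower


# Assumed knot placement for ABI groups 1 to 20 (not stated in the report).
ABI_KNOTS = [0, 4, 8, 12, 16, 20]
# Segment slopes as reported in Table 4.9.
ABI_SLOPES = np.array([0.041, 0.071, 0.064, 0.031, 0.223])

groups = [1, 4, 5, 20]
basis = linear_spline_basis(groups, ABI_KNOTS)
log_effect = basis @ ABI_SLOPES  # contribution to the log premium
ratio_20_to_1 = float(np.exp(log_effect[-1] - log_effect[0]))
print(basis)
print(round(ratio_20_to_1, 2))
```

Under these assumptions the implied ratio of a group 20 premium to a group 1 premium is within about 1% of the ratio of the corresponding predictions in Table 4.10, which suggests the knot convention is at least close to the one used.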
The results of the regression analysis are documented in Table 4.9. The model achieves a pleasing
level of fit, returning an R2 statistic of 0.85, indicating significant regularities in the manner in
which insurance premiums are determined by insurers and that the major factors used by insurers
in determining those premiums are included in our model.
Table 4.9: Regression of (natural log) shop-around premium against covariates

Variable                                          Coefficient   Robust s.e.   p-value
Constant                                               11.666         1.780     0.000
Cover Type (dummy variable: baseline = comprehensive)
  Third Party, Fire & Theft                            -0.199         0.036     0.000
Number Insured (dummy variable: baseline = individual insurance)
  Plus One Named Driver                                 0.125         0.029     0.000
Vehicle ABI Insurance Group (piecewise linear)
  Group 1 to 4                                          0.041         0.031     0.184
  Group 5 to 8                                          0.071         0.011     0.000
  Group 9 to 12                                         0.064         0.009     0.000
  Group 13 to 16                                        0.031         0.012     0.012
  Group 17 to 20                                        0.223         0.028     0.000
Vehicle Age (piecewise linear)
  Age 1 to 2                                           -0.022         0.030     0.465
  Age 3 to 5                                           -0.015         0.018     0.401
  Age 6 to 10                                          -0.030         0.007     0.000
  Age 11 to 20                                         -0.031         0.007     0.000
  Age over 20                                           0.059         0.024     0.014
Gender (dummy variable: baseline = female)
  Male                                                  0.059         0.024     0.014
Age (piecewise linear)
  Age 17 to 20                                         -0.262         0.093     0.005
  Age 21 to 25                                         -0.102         0.018     0.000
  Age 26 to 35                                         -0.022         0.005     0.000
  Age 36 to 45                                         -0.014         0.004     0.001
  Age 46 to 55                                         -0.008         0.005     0.122
  Age 56 to 65                                         -0.024         0.006     0.000
  Age over 65                                           0.062         0.006     0.000
Years No-Claims (piecewise linear)
  Years 1 to 2                                         -0.165         0.023     0.000
  Years 3 to 5                                         -0.127         0.014     0.000
  Years 5 to 10                                        -0.009         0.009     0.336
  Years over 10                                         0.005         0.010     0.607
Size of Town of Residence (dummy variable: baseline = Population<50,000)
  Medium Urban (Population: 50,000 to 250,000)          0.034         0.030     0.253
  Large Urban (Population: >250,000)                    0.217         0.040     0.000
Region of Residence (dummy variable: baseline = East England)
  London                                                0.332         0.048     0.000
  Midlands                                              0.144         0.030     0.000
  Northern Ireland                                      1.135         0.073     0.000
  North East & Yorkshire                                0.216         0.035     0.000
  North West                                            0.375         0.042     0.000
  South West                                           -0.098         0.066     0.136
  South East                                            0.211         0.045     0.000
  Wales                                                 0.132         0.058     0.022
  Scotland                                             -0.029         0.044     0.516
N                                                        1498
R2                                                      0.845
The signs on the estimated coefficients generally follow our expectations regarding the relationship
between the covariates and the level of insurance premiums. We observe immediately that TPFT
policies are significantly less expensive, enjoying an 18% discount on comprehensive policies.
Likewise adding a named driver to a policy increases its cost by some 13%. Both these covariates
are, not surprisingly, highly significant determinants of insurance policy price.
Rather than entering the ABI insurance grouping variable as a series of dummy variables, we take
advantage of the fact that insurance premiums are known to be increasing in group number, and
enter this variable as a linear spline with separate segments for each successive group of four ABI
insurance group numbers. Notice that the coefficients estimated on each of these segments are
positive, supporting our contention that premiums are rising in insurance group categorisation.
Notice also that the slopes of the different segments are not identical: premiums increase
relatively slowly moving from group 1 through to 4 (4% per category), the rate increases moving
from group 5 through to 12 (around 6.5% per category), drops down again over the range 13 to 16
(3% per category) and increases dramatically over the highest range of categories from 17 to 20
containing increasingly high performance and expensive vehicles (22% per category).
The piecewise linear specification also reveals some interesting patterns in the relationship
between vehicle age and insurance premiums. Premiums fall slowly over the range 0 to 5 years of
age (in fact the slope coefficients are insignificantly different from zero). Vehicles from 6 to 20
years of age enjoy increasingly more generous discounts (3% discount per year). After 20 years of
age, however, insurance premiums begin to rise significantly (6% increase per year). The
explanation for this pattern is that cars that are kept on the road in excess of 20 years tend to
achieve “classic” status. Of course, repairing or replacing vintage vehicles becomes increasingly
expensive with the passage of time.
The characteristics of the individual purchasing insurance are also found to have significant impacts
on premiums. As expected, insurance premiums are significantly higher for males than females, all
else equal. Age too is an important factor though the relationship between age and insurance
premiums is highly non-linear as captured by our piecewise linear specification. Insurance premia
are greatest for teenage drivers, falling sharply as the policy holder’s age increases from 17 to 20
(26% per year). The rate of decline reduces somewhat over the range of ages from 21 to 25 (10%
per year), then shows only modest reductions from 25 right the way through to 65 years of age
(1.7% per year on average). Not altogether surprisingly, we observe that from an age of 65 onwards
the negative relationship between age and insurance premiums is reversed. It appears that once
policy holders reach a pensionable age, they can expect their insurance premiums to increase by
around 6% per year.
The data reveals that the no-claims bonus system acts so as to discount premiums very
significantly in the first two years that insured drivers do not make a claim. Over this range,
premiums fall by 17% each year. The rate of discount declines to around 13% for each of the next 3
years of no-claims. No significant reductions in premiums are observed for no-claims in excess of 5
years.
With regards to location of residence, we observe that insurance premiums are significantly higher
for those living in large urban areas. In addition, those living in London, Northern Ireland and North-West England attract the highest insurance costs. Residents of the Midlands, the North-East of
England, Yorkshire, the South-East of England and Wales face somewhat lower insurance premiums,
while the lowest premiums are enjoyed by those resident in Scotland, the South-West and East
England.
The model described in Table 4.9 allows us to calculate an expected insurance premium for any
particular values for the covariates. To illustrate, in Table 4.10 we have constructed estimates of
insurance premiums for a TPFT policy for a single driver on vehicles in each ABI insurance grouping.
Quotes are provided for individuals with three different profiles. Each individual is a male living in
a medium-sized urban area in the East of England. The first individual is an 18 year old with 1 year’s
no-claims bonus, the second is a 38 year old with 3 years’ no-claims bonus, whilst the final individual
is a 58 year old with 6 years’ no-claims bonus.
Table 4.10: Predicted insurance premiums by ABI insurance grouping for three male
individuals living in a medium-sized urban location in the East of England in 2005

Insurance          Individual 1:     Individual 2:     Individual 3:
Group              18 years old,     38 years old,     58 years old,
                   1 year ncb        3 years ncb       6 years ncb
1                  £835.88           £171.12           £116.48
2                  £870.96           £178.30           £121.37
3                  £907.52           £185.78           £126.46
4                  £945.61           £193.58           £131.77
5                  £1,015.11         £207.81           £141.45
6                  £1,089.72         £223.08           £151.85
7                  £1,169.81         £239.48           £163.01
8                  £1,255.79         £257.08           £174.99
9                  £1,338.63         £274.04           £186.53
10                 £1,426.93         £292.12           £198.84
11                 £1,521.06         £311.39           £211.96
12                 £1,621.40         £331.93           £225.94
13                 £1,672.45         £342.38           £233.05
14                 £1,725.10         £353.16           £240.39
15                 £1,779.42         £364.27           £247.96
16                 £1,835.44         £375.74           £255.76
17                 £2,294.91         £469.80           £319.79
18                 £2,869.39         £587.41           £399.84
19                 £3,587.69         £734.46           £499.93
20                 £4,485.81         £918.32           £625.08
Recall that the data on which our analysis was based has been drawn from the first three quarters
of 2005. To predict the level of insurance premiums over the course of our data series running from
2001 to 2005, we make use of the BIPI 500 index itself. The index tracks the average price of
insurance in the BIPI 500 dataset for each quarter relative to a 1994 baseline (index = 100 in 1994).
The BIPI 500 index is reproduced in Figure 4.2.
To calculate an insurance premium for a particular policy profile in, for example, 2001, we first use
our model to predict the insurance premium for that profile in the first three quarters of 2005. We
then calculate a scaling relativity as the ratio of the average value of the index in 2001 to the
average value of the index in the first three quarters of 2005 (184.69 for TPFT cover and
195.69 for comprehensive cover). Multiplying the predicted insurance premium for the first three quarters of
2005 by this relativity provides us with an estimate of the insurance premium for that profile in
2001. Table 4.11 lists the relativities used in this calculation.
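The scaling step can be sketched as follows. The index values are those listed in Table 4.11, and each base-period average (184.69 and 195.69) is paired with the index column whose relativities it reproduces. The function names are ours, and the £171.12 premium is the group 1 prediction for Individual 2 in Table 4.10, used here purely as an example input.

```python
def relativity(index_in_year: float, index_in_base_period: float) -> float:
    """Ratio of the BIPI 500 index in a given year to its average over
    the first three quarters of 2005."""
    return index_in_year / index_in_base_period


def rescale_premium(premium_2005: float, index_in_year: float,
                    index_in_base_period: float) -> float:
    """Scale a premium predicted for 2005 (Q1 to Q3) to another year."""
    return premium_2005 * relativity(index_in_year, index_in_base_period)


# 2001 index values from Table 4.11, each paired with the base-period
# average that reproduces the tabulated relativity.
tpft_2001 = relativity(171.0, 184.69)
comprehensive_2001 = relativity(175.8, 195.69)

# Example: a TPFT premium predicted at 171.12 pounds for 2005.
premium_2001 = rescale_premium(171.12, 171.0, 184.69)
print(round(tpft_2001, 3), round(comprehensive_2001, 3), round(premium_2001, 2))
```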
Table 4.11: Relativities used to scale insurance premiums calculated from regression analysis

Year    BIPI 500 Index              Relativity
        TPFT     Comprehensive      TPFT     Comprehensive
2001    171.0    175.8              0.926    0.898
2002    181.7    188.2              0.984    0.962
2003    187.6    193.9              1.016    0.991
2004    187.6    194.9              1.016    0.996
2005    185.4    197.1              1.004    1.007
Figure 4.2: BIPI 500 index for Third Party Fire and Theft (TPFT) and Comprehensive car
insurance policies from 2001 to 2005
[Figure: two series (TPFT, Comprehensive); vertical axis: index value, 160.00 to 205.00]
Even if the variation in VED across vehicles is insufficient to easily identify its independent impact
on demand, a further possibility exists. In particular, for very many people, car tax and car
insurance are paid annually in the same month as the vehicle was purchased. Indeed, it may not be
unreasonable to assume that these annual fixed costs of motoring are regarded as very similar types
of expenditure by the household; that is to say, an extra pound on the annual insurance bill is
regarded identically as an extra pound on the annual VED. Given this assumption, it would not be
unreasonable to sum these two cost elements and treat them as one cost. That is, indicating the
VED for vehicle option j in time t as $VED_{jt}$ and annual insurance costs for that vehicle in that same
year as $Ins_{jt}$, we can denote the fixed costs of motoring as:
$c_{jt} = Ins_{jt} + VED_{jt}$    (4.1)
The advantage of the specification of fixed costs given in (4.1) is that there is considerable
variation in insurance costs across vehicles as well as across individuals, regions and time.
4.3.3 Variable Costs
Finally, our model will require estimates of the variable costs of motoring. These variable costs of
motoring depend primarily on three factors: (i) the number of miles driven, (ii) the fuel efficiency
of the car and (iii) the cost of fuel. As illustrated in Figure 4.3 we have sourced historical data
recording fuel prices disaggregated by region from the AA.
In addition, the JATO dataset records details of a weighted-average fuel efficiency for each vehicle
in the dataset. Whilst the individual-level data we discuss subsequently records the travelling
decisions of a large sample of households, we currently do not intend using this to calculate a
variable indicating the total variable costs of motoring. In particular, we would need to calculate
this variable for each and every vehicle option considered by the household. Since it is reasonable
to assume that individuals will adapt their driving habits according to the efficiency of the vehicle
they are driving, for example driving more in a fuel efficient car than a gas-guzzler, it does not
hold that driving decisions observed for one vehicle can be simply transferred to another.
Figure 4.3: The costs of fuel disaggregated by region from 2001 to present
[Figure: two panels (Petrol, Diesel); vertical axis: Price (pence), 65 to 105; one series per region: Midlands and East Anglia, Northern England, Scotland, Southern England, Wales]
Source: AA Monthly Fuel Price Reports (2001-2007)
Accordingly, we intend treating variable costs differently from the other costs we have considered.
Indeed, rather than incorporating the variable costs into budget considerations, we intend treating
those costs as a simple characteristic of a vehicle option. In particular, our current intention is to
define a variable indicating the cost of driving an average kilometre in each vehicle as:
$\text{fuel cost per km}_{j} = (l_j/km) \times \text{fuel price per litre}$    (4.2)
where $l_j/km$ indicates the fuel efficiency of vehicle j. The assumption here is that in deciding which
vehicle to purchase, households simply regard fuel efficiency as a ‘good thing’. Indeed, the
specification in (4.2) allows fuel efficiency to be an increasingly well-regarded attribute as the
price of fuel rises. What our specification implies is that households first choose a vehicle model to
purchase (with due consideration of its fuel efficiency attribute). Subsequently, and independent of
the purchase decision, households select how much to use that car.
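As a rough illustration of the per-kilometre fuel cost attribute, the sketch below multiplies fuel consumption (litres per km) by a fuel price (pounds per litre) and scales the result to 100km, the unit reported in Table 4.13. The function name, the consumption figure and the price are hypothetical and are not drawn from the JATO or AA data.

```python
def fuel_cost_per_km(litres_per_km: float, price_per_litre: float) -> float:
    """Fuel cost of driving one average kilometre."""
    if litres_per_km < 0 or price_per_litre < 0:
        raise ValueError("inputs must be non-negative")
    return litres_per_km * price_per_litre


# Hypothetical vehicle using 6.0 litres per 100km, fuel at 0.85 pounds per litre.
cost_per_100km = 100 * fuel_cost_per_km(0.06, 0.85)
# The same vehicle when fuel costs 1.00 pound per litre.
cost_per_100km_dearer_fuel = 100 * fuel_cost_per_km(0.06, 1.00)
print(round(cost_per_100km, 2), round(cost_per_100km_dearer_fuel, 2))
```

Because the attribute is a product, the cost gap between an efficient and an inefficient vehicle widens as the fuel price rises, which is the property the text relies on.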
4.3.4 Rescaling Cost Attributes to 2005 Prices
The final step in specifying the cost variables for the analysis was to rescale all costs to 2005
prices. We use the RPI based on non-housing items to calculate the appropriate relativities and
these are listed in Table 4.12.
Table 4.12: Index of relative value of prices

Year    RPI Relative Index
2001    1.061
2002    1.046
2003    1.028
2004    1.016
2005    1.000

Source: National Statistics Data Series (rpi1a) CHAZ series; RPI based on all items barring housing
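A minimal sketch of the rescaling step, assuming that a cost observed in a given year is multiplied by that year's relative index from Table 4.12 to express it in 2005 prices. The dictionary and function names are ours, and the £10,000 figure is a hypothetical input.

```python
# Relative index values from Table 4.12 (2005 = 1.000).
RPI_RELATIVE_INDEX = {2001: 1.061, 2002: 1.046, 2003: 1.028, 2004: 1.016, 2005: 1.000}


def to_2005_prices(nominal_cost: float, year: int) -> float:
    """Express a cost observed in `year` in 2005 prices.

    Assumes the index is applied multiplicatively, which the report
    does not state explicitly.
    """
    if year not in RPI_RELATIVE_INDEX:
        raise KeyError(f"no index value for {year}")
    return nominal_cost * RPI_RELATIVE_INDEX[year]


# Hypothetical list price of 10,000 pounds observed in 2001.
price_in_2005_terms = to_2005_prices(10_000.0, 2001)
print(round(price_in_2005_terms, 2))
```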
The cost attributes used in the analysis are summarised in Table 4.13 disaggregated by market
segment.
Table 4.13: Cost attributes of Options Disaggregated by Market Segment (£)

Market Segment                Price    Resale Price  Vehicle      Insurance  Fuel Cost
(example vehicle)                                    Excise Duty             for 100km
A (Ford Ka)                   8,019    5,351         111          289.0      5.09
B (Renault Clio)              10,581   7,096         126          324.5      5.46
C (Honda Civic)               14,184   9,172         145          396.6      6.27
D (Ford Mondeo)               21,183   14,433        158          507.8      7.06
E (Jaguar S-Type)             38,514   26,639        164          688.1      8.40
MPV (Renault Espace)          22,151   15,455        165          486.4      7.81
SUV (Land Rover Freelander)   26,171   19,789        166          543.8      9.08
Sports (Porsche Boxster)      42,168   32,745        162          914.3      8.90
Mini MPV (Renault Scenic)     14,850   9,779         153          379.5      6.50
5. REGRESSION ANALYSIS
We estimate the parameters of the model described in Section 3 using the data described in Section
4. Our final estimation data set consisted of 70,875 observations. Each observation recorded the
market share (actually log ratio of market share to share of outside good) of a particular vehicle
option in a particular GOR in a particular year.
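The dependent variable described above, the log ratio of an option's market share to the share of the outside good, can be sketched as below. The function name and all figures are hypothetical; in particular, how the market size is defined is set out elsewhere in the report and is simply taken as given here.

```python
import math


def log_share_ratio(option_sales: float, total_new_car_sales: float,
                    market_size: float) -> float:
    """ln(s_j) - ln(s_0): log market share of option j relative to the
    share of the outside good (not buying a new car)."""
    if not 0 < option_sales <= total_new_car_sales < market_size:
        raise ValueError("need 0 < option_sales <= total sales < market size")
    share_option = option_sales / market_size
    share_outside = 1.0 - total_new_car_sales / market_size
    return math.log(share_option) - math.log(share_outside)


# Hypothetical region-year: 10,000 potential purchasers, 2,000 new car
# sales in total, 100 of which are for the option of interest.
y = log_share_ratio(option_sales=100, total_new_car_sales=2_000, market_size=10_000)
print(round(y, 4))
```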
Recall that we define a vehicle option as a unique combination of make, model, body type, fuel
type, transmission type and engine size.
In essence, we use an aggregate logit model of demand, though our estimation procedures are
designed to account for a number of features of the data:
• We group options into 735 groups defined as having the same make, model, body type and fuel
  type and employ a fixed effects estimator which removes from the model all utility components
  that are common to vehicles in each group. On average groups contain 96.4 different vehicle
  options (maximum 527, minimum 3).
  The parameters of the model are identified since options also vary with regards to engine size
  and transmission type. In addition the attributes and costs of vehicle options may differ over
  time and over regions.
  The fixed effects estimator is a crude but highly effective method for sweeping out noise in the
  data that results from the considerable heterogeneity in vehicles available in the market. We
  believe that this specification is far less susceptible to misspecification bias than one in which
  differences between vehicles are captured only through a handful of measurable costs and
  attributes.
• We allow for intra-segment correlation in utility through a nested logit specification. The
  nested logit model requires the use of an instrumental variables estimator. We employ
  instruments described in Section 3.7 based on the observed characteristics of vehicles in each
  market segment for this purpose.
• For each model of vehicle we calculate an insurance cost based on the premia payable for TPFT
  insurance by a 35 year old male living in a metropolitan region with 3 years no claims bonus
  (see Section 4.3.2).
• Since vehicle excise duty (VED) shows very little variation across the sample, we add VED to
  insurance premia to provide a measure of annual fixed costs.
• To provide a more realistic estimate of the perceived purchase cost of a vehicle, we include
  both the selling price of the vehicle and also its one year resale value on the second hand
  market.
• To allow for a diminishing marginal utility of income we scale all cost related variables by the
  market average income (see Section 3.6).
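The fixed effects idea in the first bullet can be illustrated with the standard within-group transformation: subtracting each group's mean from every variable removes anything that is constant within the group. The sketch below is a generic NumPy illustration on made-up data, not the estimation code used for this study, and it ignores the instrumental variables step that the nested logit requires.

```python
import numpy as np


def within_transform(values, group_ids):
    """Subtract the group mean from each observation (1-D input)."""
    values = np.asarray(values, dtype=float)
    _, inverse = np.unique(np.asarray(group_ids), return_inverse=True)
    group_means = np.bincount(inverse, weights=values) / np.bincount(inverse)
    return values - group_means[inverse]


# Made-up data: y = 2 * x + a group-specific effect (10 for "a", -5 for "b").
groups = ["a", "a", "a", "b", "b", "b"]
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
group_effect = np.array([10.0, 10.0, 10.0, -5.0, -5.0, -5.0])
y = 2.0 * x + group_effect

x_within = within_transform(x, groups)
y_within = within_transform(y, groups)

# OLS slope on the demeaned data; the group effects have dropped out.
slope = float(x_within @ y_within / (x_within @ x_within))
print(slope)
```

In this toy case the slope is recovered only because x varies within each group, which mirrors the identification argument in the first bullet.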
Table 5.1 records the parameter estimates from three models; the logit model described by the
estimating equation 3.13, the nested logit model described by the estimating equation 3.14 and a
second run of the nested logit model in which the variable for resale price has been omitted. The
reader is referred back to Sections 3.6 to 3.8 for a detailed discussion as to how the parameters
should be interpreted.
Consider first the parameter estimates from the logit model. The coefficients on the set of dummy
variables used to parameterise variation in the utility of the outside good all prove to be
significant. Clearly, there are structural differences across time and across GORs that influence
participation in the market and are independent of the attributes of the goods in the market. The
pattern of coefficients suggests that total market demand trended down over the years 2001 to
2004, reviving slightly in the base case year 2005. Likewise, overall market demand differs across
GORs, being relatively highest in the base case of the North-East England GOR and, according to the
logit model, substantially lower in London and the South East. Whilst we have evidence to believe
that demand for new cars in London is generally lower than elsewhere in the country (see Table
4.5), the low levels of demand predicted by the logit model for South-East England appear
anomalous.
Table 5.1: Parameter estimates from fixed effects logit and nested logit models of demand for
new cars

                                     I. Logit model    II. Nested Logit   III. Nested Logit
Variable                              Coeff  p-value     Coeff  p-value     Coeff  p-value
Outside Good
  Constant                          -9.6641    0.000   -6.2979    0.000   -6.3750    0.000
  Year 2001                          0.5951    0.000    0.3894    0.000    0.4039    0.000
  Year 2002                          0.2034    0.000    0.2923    0.000    0.3144    0.000
  Year 2003                          0.0573    0.000    0.2355    0.000    0.2462    0.000
  Year 2004                         -0.0547    0.000    0.1247    0.000    0.1342    0.000
  NW                                -0.4719    0.000   -0.1544    0.000   -0.1581    0.000
  Yorks                             -0.3669    0.000   -0.1261    0.000   -0.1252    0.000
  E Mid                             -0.5568    0.000   -0.2211    0.000   -0.2168    0.000
  W Mid                             -0.5484    0.000   -0.1219    0.000   -0.1181    0.000
  East                              -0.9324    0.000   -0.1737    0.000   -0.1637    0.000
  London                            -1.9253    0.000   -0.8828    0.000   -0.8773    0.000
  SE                                -1.0106    0.000   -0.1978    0.000   -0.1915    0.000
  SW                                -0.7905    0.000   -0.3123    0.000   -0.3018    0.000
  Wales                             -0.3404    0.000   -0.1659    0.000   -0.1620    0.000
  Scotland                          -0.5311    0.000   -0.2087    0.000   -0.2013    0.000
Cost Attributes (scaled by avg. income)
  Purchase Price                    -2.0076    0.000   -1.0165    0.000   -0.5253    0.000
  Resale Price                       2.6973    0.000    0.7046    0.000   omitted
  Fixed Cost                       -16.7512    0.000   -1.9170    0.003   -0.3669    0.568
  Fuel Cost per 100km               -0.0129    0.019   -0.3850    0.000   -0.3889    0.000
Physical Attributes
  Automatic                         -0.1276    0.000    0.0808    0.000    0.0818    0.000
  Num Gears                          0.1357    0.000    0.0371    0.000    0.0388    0.000
  Size (width x length)              0.0019    0.000    0.0010    0.000    0.0011    0.000
  Brake Horsepower                   0.0032    0.000    0.0020    0.000    0.0017    0.000
  Acceleration (secs to 100km/h)     0.0269    0.000   -0.0097    0.000   -0.0126    0.000
  Number of Airbags                  0.0876    0.000    0.0201    0.000    0.0221    0.000
  Air Conditioning                   0.0578    0.005    0.0087    0.203    0.0056    0.415
  Alloy Wheel Rims                   0.0470    0.005    0.0239    0.000    0.0228    0.000
  Anti-Lock Braking System          -0.4157    0.000   -0.0724    0.000   -0.0710    0.000
  Doors                              0.0496    0.000    0.0039    0.131    0.0030    0.258
Table 5.1 (cont.)

                                     I. Logit model    II. Nested Logit   III. Nested Logit
Variable                              Coeff  p-value     Coeff  p-value     Coeff  p-value
Nest parameters $\sigma_g$
  A                                     n/a             0.9456    0.000    0.9433    0.000
  B                                     n/a             0.9180    0.000    0.9197    0.000
  C                                     n/a             0.9103    0.000    0.9116    0.000
  D                                     n/a             0.8066    0.000    0.8025    0.000
  E                                     n/a             0.6729    0.000    0.6760    0.000
  mini MPV                              n/a             0.8660    0.000    0.8704    0.000
  MPV                                   n/a             0.7370    0.000    0.7359    0.000
  Sports                                n/a             0.3053    0.000    0.3018    0.000
  SUV                                   n/a             0.4011    0.000    0.3911    0.000

N                                    70,875             70,875             70,875
Num Groups                              735                735                735
R2
  Within                             0.1877             0.9133             0.9114
  Between                             0.124             0.1512             0.1455
  Overall                            0.1279             0.3257             0.3217
Turning to the cost attributes, consider first the coefficients estimated by the logit model on the
price and resale price variables. An immediate observation is that both are highly significant
indicating that households consider both the purchase price and the resale price when buying a new
car. As we would expect, we observe a negative coefficient on purchase price since this represents
a loss of wealth and a positive coefficient on resale price since this represents money back in the
bank. So far, all well and good. But what about the relative size of the coefficients? Observe that
the absolute size of the coefficient on resale price is greater than that on purchase price; a pound
recouped on resale appears to carry more weight than a pound spent on purchase, a result which
does not accord with our expectations.
Consider for a moment the situation that would arise if vehicles did not depreciate in value over
time. In that case, a household could at any time sell on their vehicle and reclaim the cash they
had invested in its purchase. For the sake of argument, imagine that the household decided to
purchase a new vehicle and sell it on one year later. In that case, the cost they perceived in buying
a car would depend on how they financed the original purchase. If they had taken out a loan to
finance that purchase, then all they would have lost over the course of that year would be the
interest payment on the loan (at the end of the year they could repay the capital sum through the
proceeds from the resale).
If alternatively, they had paid for the vehicle out of their savings then over the course of the year
the cost they would perceive would be the foregone interest on those savings. Either way, the
perceived cost is intimately related to the rate of interest, $r_t$, in market t. Assuming this is the
same for both savings and borrowings and indicating the purchase price of car option j in market t
as $p_{jt}$, then a more correct measure of the price of the vehicle to a household would be given by:

$$r_t \, p_{jt}$$

Notice how $r_t p_{jt}$ is increasing in the interest rate, such that cars are perceived as more expensive
when interest rates are high and less expensive when interest rates are low.
Of course, new vehicles do not hold their value over time; in fact, quite the contrary. Accordingly,
the cost as perceived by the household should also include a notion of this depreciation in the value
of the capital asset. If we denote the resale price of car option j after one year as $p^{R}_{jt}$ then our
measure of perceived price becomes:

$$r_t \, p_{jt} + \left(p_{jt} - p^{R}_{jt}\right) = (1 + r_t)\, p_{jt} - p^{R}_{jt} \qquad (5.1)$$

So according to theory, money spent on purchasing a vehicle should be valued as being worth
$(1 + r_t)$ times as much as money being recouped from resales. In fact, the coefficients from the
logit model indicate the reverse relationship, returning a ratio of 0.74. The logit model does not
return coefficients that conform to prior expectations.
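The ratio test described above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it reads the purchase price and resale price coefficients from Table 5.1 and assumes, as equation 5.1 suggests, that the ratio of their absolute values should equal one plus the interest rate. The function name `implied_interest_rate` is our own.

```python
# Coefficients on purchase price and resale price, read from Table 5.1.
COEFFS = {
    "logit": {"purchase": -2.0076, "resale": 2.6973},
    "nested_logit": {"purchase": -1.0165, "resale": 0.7046},
}


def implied_interest_rate(purchase_coeff: float, resale_coeff: float) -> float:
    """Return r such that |purchase| / |resale| = 1 + r (cf. equation 5.1)."""
    return abs(purchase_coeff) / abs(resale_coeff) - 1.0


for name, c in COEFFS.items():
    ratio = abs(c["purchase"]) / abs(c["resale"])
    rate = implied_interest_rate(c["purchase"], c["resale"])
    print(f"{name}: ratio = {ratio:.2f}, implied interest rate = {rate:.0%}")
```

A ratio below one, as in the logit model, corresponds to a negative implied interest rate, which is why those coefficients are regarded as implausible.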
Similar doubt must be cast on the coefficient estimated on the fixed costs variable; that is, the sum
of VED and annual insurance premia. Whilst the coefficient takes the expected negative sign it is an
order of magnitude larger in absolute terms than the coefficients estimated on both purchase and
resale prices. Whilst we would readily accept that households could regard a pound spent on the
purchase of a vehicle differently from a pound spent on the annual fixed costs of motoring, the
relativities implied by the coefficients from the logit model appear implausible.
The logit model turns up numerous other anomalous results. Contrary to our expectations, the
model suggests that slower cars are apparently preferred to cars with greater acceleration and
having an anti-lock braking system (ABS) significantly reduces the appeal of an option.
The numerous irregularities in these results leave us unconvinced by the output from the logit
model.
Now consider the nested logit model (model II) in Table 5.1. The first thing to notice is that each
of the $\sigma_g$ $(g = 1, \ldots, G)$ parameters is significantly different from zero, confirming our suspicion
that the logit model is mis-specified. Notice also that in line with theory each of the $\sigma_g$ parameters
is within the unit interval (McFadden, 1978).
The nested logit model provides more convincing results than the logit model in a number of other
respects. Observe that the coefficients on purchase price and resale price now have the expected
relative sizes, though, at 44%, the interest rate implied by their ratio is somewhat higher than
might be expected. In a similar vein, in contrast to the logit model, the coefficient on fixed costs
estimated by the nested logit model is of a comparable size to that estimated on the other cost
variables. In addition, we find that the automatic transmission and acceleration
variables each now have coefficients that are correctly signed and significant, whilst the
coefficient on ABS remains stubbornly and anomalously negative.
Generally, we believe there are many things to commend the nested logit specification of Model II
and it is this model that we carry forward to the simulation exercises in the next section.
Finally, consider Model III in Table 5.1. This repeats the nested logit specification, but this time
following the standard practice of including only a purchase price variable and not the resale price.
Notice that the coefficient on purchase price is considerably smaller in absolute terms in the new
model (-0.5253 against -1.0165), suggesting (as we shall confirm shortly) that this (mis-specified)
model omitting resale prices implies a lower demand responsiveness to changes in price.
6. DEMAND ELASTICITIES AND DEMAND FORECASTING
As outlined in Section 1.2, the central objective of this project is to examine how the demand for
new vehicles is influenced by changes in the attributes of vehicles in the new car market. Of
particular interest are those changes that impact on the costs of participating in the new car
market through, for example, increasing purchase costs, the fixed annual costs of motoring or the
variable costs of motoring. In the following sections we consider each of these in turn and illustrate
how the model described in the previous section can be used to examine different cost changes.
6.1 Purchase Prices
As outlined in Annex 1 the usual way to describe the responsiveness of demand to changes in prices
is through the estimation of price elasticities. For our nested logit model, the own-price elasticities
of market shares are given by:

$$\eta_{jj} = \frac{\beta_p \, p_j}{y} \cdot \frac{1}{1 - \sigma_g} \left[ 1 - \sigma_g \, s_{j|g} - (1 - \sigma_g) \, s_j \right] \qquad (6.1)$$

where $\beta_p$ denotes the coefficient on purchase price, $s_j$ is the market share enjoyed by vehicle
option j, $s_{j|g}$ is option j's share of its own market segment $g$ and, to simplify notation, we have
suppressed the dependence of income ($y$), prices ($p_j$) and market shares on the market. Using this
formula, Table 6.1 shows the distribution of own-price elasticity estimates across options in each
of the market segments.
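As an illustration of how an own-price elasticity of this kind might be evaluated, the following minimal sketch implements the nested logit formula for a single option, assuming that the price term enters utility scaled by income as in Table 5.1. The function name, argument names and the example price, income and share values are hypothetical; only the coefficient (-1.0165) and the segment A nest parameter (0.9456) are taken from Model II.

```python
def own_price_elasticity(beta_price: float, price: float, income: float,
                         sigma: float, share: float, within_share: float) -> float:
    """Own-price elasticity of market share in a one-level nested logit.

    beta_price   -- coefficient on (price / income); negative for a normal good
    sigma        -- nest parameter of the option's segment, in [0, 1)
    share        -- market share of the option
    within_share -- the option's share of its own segment
    """
    scale = beta_price * price / income
    bracket = 1.0 - sigma * within_share - (1.0 - sigma) * share
    return scale / (1.0 - sigma) * bracket


# Hypothetical option: price, income and shares are made up for illustration.
example = own_price_elasticity(beta_price=-1.0165, price=10_000.0, income=25_000.0,
                               sigma=0.9456, share=0.001, within_share=0.02)
print(f"own-price elasticity: {example:.3f}")
```

Setting `sigma` to zero collapses the expression to the ordinary logit elasticity, which is one quick way to sanity-check an implementation.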
Table 6.1: Percentile values of own-price elasticity of market shares estimated from nested
logit model (II in Table 5.1)

Market Segment (example vehicle)       90th     75th   Median     25th     10th
A (Ford Ka)                          -6.887   -5.672   -4.665   -3.962   -3.353
B (Renault Clio)                     -5.629   -4.988   -4.333   -3.738   -3.243
C (Honda Civic)                      -7.140   -6.205   -5.365   -4.525   -3.842
D (Ford Mondeo)                      -5.191   -4.361   -3.597   -3.002   -2.557
E (Jaguar S-Type)                    -5.889   -4.084   -3.238   -2.694   -2.332
MPV (Renault Espace)                 -3.650   -3.235   -2.813   -2.407   -2.084
SUV (Land Rover Freelander)          -2.351   -1.855   -1.308   -1.038   -0.846
Sports (Porsche Boxster)             -3.939   -2.538   -1.590   -1.141   -0.839
Mini MPV (Renault Scenic)            -4.888   -4.385   -3.746   -3.088   -2.622
All                                  -5.913   -4.827   -3.701   -2.745   -1.635
In accordance with profit-maximising behaviour, the vast majority of vehicles have own-price
elasticities that exceed unity in absolute value (if demand is inelastic then a manufacturer can
increase profits by increasing prices). Also, in line with our expectations, demand is generally more
price elastic for low specification vehicles in market segments A, B and C and less elastic in the
high specification SUV and Sports segments. The own-price elasticities shown in Table 6.1 appear
entirely plausible and are not dissimilar from the sorts of values recorded elsewhere in the
literature (see Annex 2).
As a point of interest, we have also estimated own-price elasticities using the parameters from
Model III; a nested logit model in which the resale price variable was omitted. These are shown in
Table 6.2. Observe that the omission of resale value biases the elasticity estimates considerably
towards zero; the median across all options falls in absolute value from 3.701 to 1.909.
Table 6.2: Percentile values of own-price elasticity of market shares estimated from a nested
logit model omitting resale price (Model III in Table 5.1)

Market Segment (example vehicle)       90th     75th   Median     25th     10th
A (Ford Ka)                          -3.416   -2.813   -2.314   -1.965   -1.663
B (Renault Clio)                     -2.972   -2.633   -2.288   -1.973   -1.712
C (Honda Civic)                      -3.745   -3.254   -2.813   -2.373   -2.015
D (Ford Mondeo)                      -2.628   -2.208   -1.821   -1.520   -1.295
E (Jaguar S-Type)                    -3.072   -2.130   -1.689   -1.405   -1.216
MPV (Renault Espace)                 -1.878   -1.665   -1.447   -1.239   -1.072
SUV (Land Rover Freelander)          -1.195   -0.943   -0.665   -0.528   -0.430
Sports (Porsche Boxster)             -2.025   -1.305   -0.818   -0.587   -0.431
Mini MPV (Renault Scenic)            -2.612   -2.344   -2.002   -1.650   -1.401
All                                  -3.079   -2.507   -1.909   -1.412   -0.836
Of course, demand for a particular option is not only influenced by its own price, but also by the
price of other vehicle options available in the market, a relationship that is described by a
cross-price elasticity. Our nested logit specification allows for the fact that these cross-price
elasticities may be larger for vehicle options in the same segment. In particular, the formula
describing the cross-price elasticity of the market share of option j with respect to the price of a
vehicle k in the same market segment is:

$$\eta_{jk} = -\frac{\beta_p \, p_k}{y} \cdot \frac{1}{1 - \sigma_g} \left[ \sigma_g \, s_{k|g} + (1 - \sigma_g) \, s_k \right] \qquad (6.2)$$

Whilst the cross-price elasticity with respect to a vehicle k in another market segment is:

$$\eta_{jk} = -\frac{\beta_p \, p_k}{y} \, s_k \qquad (6.3)$$
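The two cross-price cases can be sketched in code as follows. This is a minimal illustration under the same assumptions as the formulas above (price scaled by income, a single level of nesting); the function names and the example inputs are hypothetical rather than taken from the estimation code used in this study.

```python
def cross_elasticity_same_segment(beta_price: float, price_k: float, income: float,
                                  sigma: float, share_k: float,
                                  within_share_k: float) -> float:
    """Elasticity of option j's share w.r.t. the price of option k in j's own segment."""
    scale = beta_price * price_k / income
    return -scale / (1.0 - sigma) * (sigma * within_share_k + (1.0 - sigma) * share_k)


def cross_elasticity_other_segment(beta_price: float, price_k: float, income: float,
                                   share_k: float) -> float:
    """Elasticity of option j's share w.r.t. the price of option k in another segment."""
    return -(beta_price * price_k / income) * share_k


# Hypothetical rival option k: price, income and shares are illustrative only.
same = cross_elasticity_same_segment(-1.0165, 10_000.0, 25_000.0,
                                     sigma=0.9456, share_k=0.001, within_share_k=0.02)
other = cross_elasticity_other_segment(-1.0165, 10_000.0, 25_000.0, share_k=0.001)
print(f"same segment: {same:.5f}, other segment: {other:.5f}")
```

With a negative price coefficient both elasticities are positive, and the within-segment value exceeds the across-segment value whenever the nest parameter is above zero.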
Recall from Table 5.1 that the high values for the $\sigma_g$ parameters are indicative of substantial
correlation between options in the same segment. As such we are not surprised to find that the
cross-price elasticities with respect to vehicles in the same segment are substantially greater than
those for vehicles in alternative segments. Table 6.3 shows the average within- and across-segment
cross-price elasticities for vehicle options available in 2005.
Table 6.3: Average own- & cross-price elasticities from 2005

                                              Elasticities
Market Segment                   Own-Price   Cross-Price    Cross-Price
                                             Same Segment   Different Segment
A (Ford Ka)                         -4.411          0.105             0.00001
B (Renault Clio)                    -4.239          0.022             0.00003
C (Honda Civic)                     -5.223          0.019             0.00002
D (Ford Mondeo)                     -3.644          0.008             0.00001
E (Jaguar S-Type)                   -4.017          0.020             0.00001
MPV (Renault Espace)                -3.654          0.030             0.00001
SUV (Land Rover Freelander)         -2.679          0.030             0.00001
Sports (Porsche Boxster)            -1.950          0.004             0.00002
Mini MPV (Renault Scenic)           -1.534          0.005             0.00003
The model allows us to ask numerous sophisticated questions of the data. For example, imagine we
wished to investigate how changes in the prices of a particular set of vehicles impacted on the
profile of CO2 emissions efficiency of vehicles sold in the new car market. By altering the price of
those vehicles impacted by the price change, we can use our model to predict how households’
purchasing behaviour will change.
Consider the data shown in Table 6.4. The left hand column shows a categorisation of vehicles
according to their CO2 emissions measured in g/km. The second column lists the number of
purchases of vehicles in each CO2 category made in 2005. Notice that the first row of the data lists
the number of households choosing the “outside good”, that is, choosing not to purchase a new
vehicle in 2005. Each subsequent column shows the model's predictions of how sales in each
category will change as a result of a 1% increase in the price of vehicles in the CO2 category shown
at the top of that column.
Not surprisingly, when the price of vehicles in a particular category goes up, we observe demand
for those vehicles declining; all values along the diagonal are negative. Also in line with our
expectations, the model predicts that households substituting away from the vehicles experiencing
a price rise tend to choose vehicles with a relatively similar CO2 rating.
Finally, observe also the important role played by the outside good in allowing for households to
substitute out of the market. For example, the 1% increase in price for vehicles generating between
111 and 120 gCO2/km results in 983 households substituting away from the vehicles in that
category. Whilst 912 of those households are predicted to simply pick a different vehicle from
another category, our model indicates that 71 will choose not to purchase a vehicle at all.
The final two rows of Table 6.4 provide an indication of how the price change (for each CO2
category) impacts upon the CO2 emissions profiles of new car purchases. The first row shows the
sales-weighted market average CO2 emissions, that is, the average emissions level of vehicles sold
in the market. Of course, the average emissions of the set of cars sold in the new car market does
not reflect the fact that fewer vehicles might be sold in total. Since a reduction in overall demand
will in itself reduce emissions from the new car fleet, we also report the CO2 emissions averaged
over all households in the market.
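The relationship between the two summary statistics can be checked directly from the baseline figures in Table 6.4. The sketch below assumes that the population average simply assigns zero new-car emissions to households choosing the outside good; that interpretation is ours, although it reproduces the reported baseline value. The variable names are illustrative.

```python
# 2005 sales by CO2 category (g/km) and outside-good households, from Table 6.4.
SALES_2005 = {
    "100-110": 5_719, "111-120": 25_857, "121-130": 32_011, "131-140": 87_324,
    "141-150": 161_520, "151-160": 164_892, "161-170": 87_145, "171-180": 124_628,
    "181-190": 75_216, "191-200": 61_911, "200-225": 94_011, "226-250": 43_366,
    "251-275": 33_849, "275-300": 15_094, "300-400": 10_285, "400-500": 1_904,
}
OUTSIDE_GOOD_2005 = 23_142_289
MARKET_AVG_CO2 = 173.643  # baseline sales-weighted average, g/km

total_sales = sum(SALES_2005.values())
households = total_sales + OUTSIDE_GOOD_2005

# Assumption: non-purchasing households contribute zero emissions to the average.
population_avg_co2 = MARKET_AVG_CO2 * total_sales / households

print(f"total new car sales: {total_sales:,}")
print(f"population average CO2: {population_avg_co2:.3f} g/km")
```

Because the population average is scaled by the share of households that buy, it falls whenever a price change pushes households out of the market, even if the market average is unchanged.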
As might be expected, when the price changes are relatively minor and affect only a small segment
of the market, reductions in the CO2 emissions profile of vehicles in the new car market are
relatively marginal. Of course, more substantial price increases or price increases affecting a wider
segment of the market would likely have a more substantive impact.
6.2 Annual Fixed Costs
Table 6.5 records the results of a similar analysis investigating the impact on the CO2 emissions
profile of household purchases in the new car market resulting from changes in the fixed costs of
motoring. In particular, each column records the change in sales in each CO2 emissions category
when the fixed costs payable on the vehicles in the column category increase by £10. The analysis
confirms that such marginal changes in fixed costs have only very minor impacts on purchasing
behaviour in the new car market; we observe practically no change in market average CO2
emissions as a result of any one of the changes.
6.3 Variable Costs
Our model also allows us to consider how the market would respond to changes in the variable costs
of motoring; i.e. the cost of fuel. Table 6.6, for example, considers the impacts of increasing the
price of diesel independent of the price of other fuels. We examine price rises from 1p through to
10p a litre. Notice that compared to the cost changes we have considered previously, increasing the
price of diesel impacts on a much broader segment of the market. Our model indicates quite
substantial changes in purchasing behaviour. Most interestingly, for each 1p rise in the price of
diesel, our model suggests that in excess of 2,000 households, who otherwise would have purchased
a diesel-fuelled car, decide not to participate in the market. That amounts to around a ¼% fall in
market demand for each 1p rise in the price of diesel.
Another interesting pattern is shown by the summary CO2 emissions statistics reported in the last
two rows of Table 6.6. In particular, as we might expect with falling participation in the market,
the CO2 emissions when calculated per household in the population fall as the price of diesel
increases. In contrast, the patterns of substitution suggested by the model result in the average CO2
emissions of vehicles in the new car market actually increasing, a result of substitution to
petrol-fuelled vehicles (which on average have higher CO2 emissions than diesel-fuelled cars).
Of course, changing only the price of diesel affords households the opportunity of purchasing a
petrol-fuelled vehicle not subject to the increased fuel charges. We carry out one final analysis in
which the prices of all fuels are increased. Table 6.7 shows our model's predictions for increases in
the price of all fuels ranging from 1p per litre to 10p per litre.
The analysis affords some interesting insights. In particular, when the price of all fuels is increased,
we observe a general substitution away from fuel inefficient to fuel efficient vehicles. According to
our model, demand for vehicles producing in excess of 141 gCO2/km declines following the price
rises whilst that for vehicles with emissions below this level increases. In addition the model
predicts that each 1p increase in the per litre price of fuel decreases the demand for new cars by
around 10,000 units or 1%. Finally, observe that in contrast to the pattern observed when only the
price of diesel was increased, we find that increasing prices for all fuels results in both the market
average and population average CO2 emissions of new cars falling.
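The percentage falls in market demand quoted in this section follow from simple arithmetic on the "Outside Good" rows of Tables 6.6 and 6.7. The sketch below is illustrative: it treats those rows as the cumulative number of households leaving the market at each price rise, and the helper name `marginal_exits` is our own.

```python
# 2005 sales by CO2 category, from Table 6.4; the sum is total new car sales.
SALES_2005 = (5_719, 25_857, 32_011, 87_324, 161_520, 164_892, 87_145, 124_628,
              75_216, 61_911, 94_011, 43_366, 33_849, 15_094, 10_285, 1_904)

# Households moving to the outside good for price rises of 1p to 10p per litre.
OUTSIDE_DIESEL = (2_761, 5_411, 7_971, 10_421, 12_791,
                  15_071, 17_271, 19_391, 21_431, 23_391)      # Table 6.6
OUTSIDE_ALL_FUELS = (10_851, 21_571, 32_151, 42_601, 52_911,
                     63_101, 73_161, 83_101, 92_911, 102_601)  # Table 6.7


def marginal_exits(cumulative):
    """Extra households leaving the market for each successive 1p rise."""
    previous = (0,) + tuple(cumulative[:-1])
    return [c - p for c, p in zip(cumulative, previous)]


total_sales = sum(SALES_2005)
for label, series in (("diesel only", OUTSIDE_DIESEL), ("all fuels", OUTSIDE_ALL_FUELS)):
    first_penny_pct = 100.0 * series[0] / total_sales
    print(f"{label}: first 1p rise removes {series[0]:,} households "
          f"({first_penny_pct:.2f}% of 2005 sales)")
    print(f"  marginal exits per 1p: {marginal_exits(series)}")
```

The marginal figures also show that the response to each additional penny shrinks slightly as the price rise grows.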
Table 6.4: Changes in sales patterns from 1% change in price of vehicles in different CO2 emissions categories (disaggregated by CO2 emissions categories)

Change in Sales for Vehicles in Row Category from 1% Change in Price of Vehicles in Column Category

CO2 g/km             2005 Sales 2005 Prices 100-110 111-120 121-130 131-140 141-150 151-160 161-170 171-180 181-190 191-200 200-225 226-250 251-275 275-300 300-400 400-500
Outside Good         23,142,289          £0      11      71     101     281     501     651     351     541     351     351     651     361     351     191     151      61
100-110                   5,719      £6,765    -205      32      30      20     104       8       1       6       8       0       0       0       0       0       0       0
111-120                  25,857      £9,974      33    -983      72     123     315     160      50      91      52      13       9       3       0       0       1       0
121-130                  32,011     £11,577      28      69  -1,263     143     316     207     107     148      78      35      22       9       1       0       2       0
131-140                  87,324     £10,863      19     134     151  -3,006     780     689     279     431     213      90      57      21       2       2       5       0
141-150                 161,520     £13,147      94     315     324     766  -4,641   1,166     374     696     363     142     116      39       5       5       7       1
151-160                 164,892     £14,735       8     165     208     654   1,102  -5,304     605     841     454     250     293     113      11      12      16       3
161-170                  87,145     £14,457       0      50     111     300     383     637  -3,021     587     323     204     176      82       8       7      18       2
171-180                 124,628     £16,862       5      82     140     395     607     806     580  -4,275     370     275     253     100      17      11      19       3
181-190                  75,216     £16,650       5      40      64     174     288     409     279     345  -2,572     140     177      75      23      16      15       6
191-200                  61,911     £18,939       0      11      32      83     122     228     182     267     134  -1,770     173      74      37      20      21      11
200-225                  94,011     £22,204       0       6      19      48      89     244     144     219     145     160  -2,192     149      85      46      45      35
226-250                  43,366     £26,996       0       2       7      16      27      82      56      75      53      61     135  -1,106      45      25      23      18
251-275                  33,849     £33,198       0       0       1       1       2       6       4      11      13      25      62      38    -619      22      20      10
275-300                  15,094     £43,995       0       0       0       1       2       6       3       6       7      12      30      18      19    -366       9       6
300-400                  10,285     £55,920       0       1       1       3       4       8       9      10       7      10      23      13      14       6    -351       3
400-500                   1,904    £113,714       0       0       0       0       0       0       0       0       1       3      11       5       3       2       2    -157
Market avg CO2           173.64              173.65  173.68  173.68  173.71  173.69  173.68  173.65  173.60  173.59  173.61  173.58  173.58  173.60  173.61  173.59  173.61
Population avg CO2        7.363               7.363   7.364   7.364   7.364   7.361   7.360   7.361   7.357   7.358   7.359   7.355   7.358   7.358   7.360   7.360   7.361

In the last two rows, the value under "2005 Sales" is the 2005 baseline.
Table 6.5: Changes in sales patterns from £10 change in fixed costs for vehicles in different CO2 emissions categories (disaggregated by CO2 emissions
categories)

Change in Sales for Vehicles in Row Category from £10 Change in Fixed Costs for Vehicles in Column Category

CO2 g/km             2005 Sales 100-110 111-120 121-130 131-140 141-150 151-160 161-170 171-180 181-190 191-200 200-225 226-250 251-275 275-300 300-400 400-500
Outside Good         23,142,289       1      11      21      51      91     101      51      71      41      31      51      21      19      11      11       1
100-110                   5,719     -57       9       8       6      26       2       0       1       2       0       0       0       0       0       0       0
111-120                  25,857       9    -204      17      27      71      31       8      15       8       2       1       0       0       0       0       0
121-130                  32,011       8      16    -238      29      69      36      16      23      11       4       2       0       0       0       0       0
131-140                  87,324       5      27      29    -559     160     124      45      67      30      12       7       2       0       0       0       0
141-150                 161,520      27      72      69     160    -887     213      60     108      52      18      12       4       1       1       1       0
151-160                 164,892       2      31      37     124     212    -846      92     122      59      29      28       9       1       1       1       0
161-170                  87,145       0       8      17      45      60      92    -455      86      41      26      19       8       1       0       1       0
171-180                 124,628       1      15      23      67     108     122      86    -616      47      34      25       8       1       0       1       0
181-190                  75,216       1       7      11      29      51      59      41      47    -335      16      16       5       1       0       0       0
191-200                  61,911       0       2       5      12      18      30      26      35      16    -204      15       6       2       1       1       0
200-225                  94,011       0       1       3       7      12      28      19      26      16      15    -200      11       5       2       2       1
226-250                  43,366       0       1       1       2       4      10       8       9       6       6      11     -83       3       1       1       1
251-275                  33,849       0       0       0       0       0       1       1       1       1       2       5       3     -35       1       1       0
275-300                  15,094       0       0       0       0       0       1       1       1       1       1       2       1       1     -17       0       0
300-400                  10,285       0       0       0       1       1       1       2       2       1       1       2       1       1       1     -15       0
400-500                   1,904       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0      -3
Market avg CO2          173.643  173.64  173.65  173.65  173.65  173.65  173.65  173.64  173.64  173.64  173.64  173.64  173.64  173.64  173.64  173.64  173.64
Population avg CO2        7.363   7.363   7.363   7.363   7.363   7.362   7.362   7.362   7.362   7.362   7.362   7.362   7.362   7.363   7.363   7.363   7.363

In the last two rows, the value under "2005 Sales" is the 2005 baseline.
Table 6.6: Changes in patterns of sales from increases in the price of diesel (disaggregated by CO2 emissions categories)

                                 Increase in price of Diesel
CO2 g/km             2005 Sales       1p       2p       3p       4p       5p       6p       7p       8p       9p      10p
Outside Good         23,142,289    2,761    5,411    7,971   10,421   12,791   15,071   17,271   19,391   21,431   23,391
100-110                   5,719        9       17       24       30       35       40       45       49       52       55
111-120                  25,857   -1,241   -2,401   -3,484   -4,494   -5,435   -6,310   -7,124   -7,879   -8,579   -9,229
121-130                  32,011   -1,366   -2,652   -3,862   -4,998   -6,063   -7,060   -7,993   -8,865   -9,679  -10,437
131-140                  87,324   -1,035   -2,029   -2,980   -3,889   -4,754   -5,577   -6,357   -7,094   -7,791   -8,448
141-150                 161,520   -1,009   -1,979   -2,910   -3,803   -4,657   -5,472   -6,251   -6,992   -7,698   -8,370
151-160                 164,892     -774   -1,548   -2,321   -3,092   -3,860   -4,622   -5,379   -6,128   -6,869   -7,601
161-170                  87,145      863    1,669    2,420    3,116    3,760    4,352    4,895    5,391    5,842    6,250
171-180                 124,628      819    1,590    2,313    2,990    3,622    4,210    4,755    5,259    5,725    6,153
181-190                  75,216      595    1,170    1,725    2,259    2,772    3,265    3,737    4,189    4,622    5,035
191-200                  61,911      468      921    1,359    1,781    2,188    2,577    2,951    3,307    3,648    3,972
200-225                  94,011      164      324      480      631      776      917    1,051    1,180    1,302    1,418
226-250                  43,366       97      192      287      380      472      562      651      737      822      905
251-275                  33,849     -351     -696   -1,033   -1,363   -1,686   -2,003   -2,313   -2,617   -2,914   -3,205
275-300                  15,094      -97     -192     -285     -375     -462     -548     -631     -712     -791     -868
300-400                  10,285       83      164      245      325      404      483      560      637      713      789
400-500                   1,904       19       38       57       76       95      113      132      151      170      189
Market avg CO2          173.643  173.855  174.058  174.251  174.435  174.610  174.776  174.934  175.083  175.225  175.358
Population avg CO2        7.363    7.352    7.341    7.331    7.321    7.311    7.302    7.293    7.283    7.275    7.266

In the last two rows, the value under "2005 Sales" is the 2005 baseline.
Table 6.7: Changes in patterns of sales from increases in the price of all fuels (disaggregated by CO2 emissions categories)

                                 Increase in Price of Fuel
CO2 g/km             2005 Sales       1p       2p       3p       4p       5p       6p       7p       8p       9p      10p
Outside Good         23,142,289   10,851   21,571   32,151   42,601   52,911   63,101   73,161   83,101   92,911  102,601
100-110                   5,719      134      268      403      537      672      807      942    1,076    1,211    1,344
111-120                  25,857      606    1,219    1,839    2,465    3,097    3,736    4,380    5,030    5,686    6,347
121-130                  32,011      514    1,029    1,544    2,060    2,575    3,090    3,604    4,117    4,630    5,141
131-140                  87,324      389      765    1,127    1,475    1,809    2,128    2,433    2,723    2,998    3,258
141-150                 161,520     -981   -1,980   -2,997   -4,031   -5,081   -6,147   -7,228   -8,323   -9,431  -10,553
151-160                 164,892   -1,362   -2,737   -4,125   -5,524   -6,935   -8,355   -9,785  -11,224  -12,672  -14,126
161-170                  87,145   -1,028   -2,066   -3,113   -4,169   -5,232   -6,301   -7,377   -8,457   -9,541  -10,628
171-180                 124,628   -2,232   -4,443   -6,630   -8,796  -10,937  -13,055  -15,149  -17,218  -19,262  -21,281
181-190                  75,216   -1,627   -3,218   -4,775   -6,298   -7,787   -9,243  -10,666  -12,057  -13,416  -14,745
191-200                  61,911   -1,216   -2,408   -3,577   -4,723   -5,847   -6,948   -8,027   -9,084  -10,120  -11,134
200-225                  94,011   -1,727   -3,419   -5,077   -6,700   -8,289   -9,846  -11,371  -12,864  -14,327  -15,759
226-250                  43,366     -972   -1,915   -2,832   -3,721   -4,585   -5,425   -6,241   -7,033   -7,804   -8,554
251-275                  33,849     -594   -1,177   -1,750   -2,312   -2,864   -3,406   -3,939   -4,461   -4,975   -5,479
275-300                  15,094     -316     -624     -925   -1,219   -1,506   -1,786   -2,060   -2,328   -2,589   -2,845
300-400                  10,285     -358     -698   -1,021   -1,327   -1,620   -1,899   -2,165   -2,420   -2,665   -2,899
400-500                   1,904      -85     -165     -242     -316     -386     -453     -517     -577     -635     -691
Market avg CO2          173.643  173.286  172.933  172.583  172.235  171.890  171.548  171.207  170.870  170.534  170.200
Population avg CO2        7.363    7.270    7.178    7.088    7.000    6.912    6.826    6.741    6.658    6.575    6.494

In the last two rows, the value under "2005 Sales" is the 2005 baseline.
7. SUMMARY AND SUGGESTIONS FOR FURTHER RESEARCH
The research reported in this document records the creation of a remarkably detailed aggregate
data set of new car purchases in GB. The data are used to estimate a model of new car purchasing
behaviour that allows us to investigate how market demand will react to changes in certain policy
instruments.
The construction of the data set has been a major undertaking. Several primary sources have been
integrated so as to match data on sales, on vehicle attributes, and on the costs of driving. Of
particular note is the work that we have done to ensure that our data includes accurate estimates
of insurance costs and contains details of the resale price of vehicles.
The detail in our data set allows us to identify highly disaggregate choice options. We define a
choice option as being a set of vehicles which share the same make, model, body type,
transmission, fuel type and engine size. Over the course of the five years of our data (running from
2001 to 2005) we identify 2,190 unique choice options from which households can choose.
In addition, the data provided by the DVLA allows us to disaggregate our sales data into 11 regional
markets. We exploit this panel nature of our data so as to introduce regional-level differences in
income into our analysis.
We believe our final data set to be one of the most detailed and accurate data sets of its type.
In addition to the aggregate level data on vehicle sales, we had hoped to access disaggregate level
data on individual car purchasing decisions. Whilst this data was eventually made available to us,
we have not had time to pursue the disaggregate modelling approach within the time frame of this
project.
We model choice in the new car market using a nested logit model derived from a random utility
specification of household preferences. The nested logit model is a generalisation of the more
familiar logit model that allows for more realistic patterns of substitution than are allowed for by
the latter. We find strong statistical evidence to support the nested logit model in preference to
the logit model in this case.
We exploit the high-level of disaggregation in our aggregate data set to estimate the parameters of
our model using a fixed effects estimator. The fixed effects estimator allows us to account for a
variety of features of vehicles (e.g. design quality, prestige etc.) that it would be difficult to
control for in any other way. We believe our estimator greatly reduces the possibilities for
mis-specification error.
The coefficients of our favoured model (Model II in Table 5.1) appear very sensible. They conform
with economic theory and in the main, conform with our prior expectations.
Interestingly we find that resale price is an important determinant of household choice of new car
and that when this argument is omitted the implied relationship between price and demand is
biased downwards.
We use our model to estimate own- and cross-price elasticities of demand. Our elasticity estimates
appear very sensible and are broadly similar to estimates published elsewhere in the literature.
In addition we have written a piece of code that uses the model to predict how the new car market
will react to changes in the cost attributes of vehicles. We illustrate the output from this code by
investigating a number of different cost changes. The patterns of demand revealed in this analysis
concur with our prior expectations and seem of a sensible magnitude.
There are several key areas in which the research reported here might be extended.
• First, delays in accessing data and constraints imposed by the magnitude of the research task
mean that the model reported here does little to investigate the relationship between demand
and socioeconomic characteristics. As a matter of fact, the authors have access to a rich data
set from two national surveys (the National Travel Survey and the Expenditure and Food
Survey) that provide household level observations of vehicle purchasing behaviour. Combining
this individual level data with the aggregate data analysed here provides a unique opportunity
to estimate a demand model that can provide valuable insights into how household
socioeconomics impact on demand for new cars.
• Second, the current report focuses on the estimation of substitution patterns resulting from
policies that change costs. The model can also be used to estimate the welfare impacts of
those changing costs. Such an analysis would be particularly insightful if applied to the model
with socioeconomics described previously. In that case, the distributional impacts of policies
across regions and socioeconomic groups could be investigated in detail.
• Third, the current model only considers the purchasing behaviour of private households. As a
matter of fact, each year companies and organisations purchase at least as many vehicles.
Clearly, a priority for further research would be to model demand for company cars. One of
the key problems here is the availability of data. However, in undertaking the current research
project it has become clear how, with the help of the DVLA, such a data set might be sourced.
Moreover, we have some preliminary ideas as to how a demand model using that data might be
formulated.
• Finally, the current research focuses solely on the demand side of the new car market. As a
matter of fact the current analysis can be extended to investigate a number of supply side
issues. In particular, using insights from the theory of non-collusive oligopolistic markets, it is
possible to use the current models to estimate the mark-up (that is the excess of price over
the marginal costs of production) enjoyed by each vehicle type. Taking those insights one step
further, one can attempt to estimate each manufacturer’s marginal cost function. Such a
function would provide the DfT with estimates, for example, of how costly it is for car makers
to manufacture vehicles with extra fuel efficiency.
The EFTEC research team is eager to pursue all these research ideas.
8. REFERENCES
Bajic, V. (1993). “Automobiles and implicit markets: an estimate of a structural demand model for
automobile characteristics”, Applied Economics, 25, pp 541-551.
Berkovec, J. and J. Rust. (1984). “A Nested Logit Model of Automobile Holdings for One-Vehicle
Households”, Transportation Research B, 19 (4), pp 275-286.
Berry, S.T. (1994). “Estimating Discrete-Choice Models of Product Differentiation”, The RAND
Journal of Economics, 25(2), pp. 242-262
Berry, S.T., J. Levinsohn and A. Pakes. (1995). “Automobile Prices in Market Equilibrium”,
Econometrica, 63(4), pp. 841-890
Berry, S.T., J. Levinsohn and A. Pakes. (1999). “Voluntary Export Restraints on Automobiles:
Evaluating a Trade Policy”, The American Economic Review, 89(3), pp. 400-430
Berry, S.T., O.B. Linton and A. Pakes. (2004). “Limit Theorems for Estimating the Parameters of
Differentiated Product Demand Systems”, Review of Economic Studies, 71, pp 613-654.
Bhat, C.R. and S. Sen. (2006). “Household Vehicle Type Holdings and Usage: An Application of the
Multiple Discrete-Continuous Extreme Value (MDCEV) model”, Transportation Research B, 40,
pp 35-53.
Boyd, J.H. and R.E. Mellman. (1980). “The Effect of Fuel Economy Standards on the U.S.
Automotive Market: An Hedonic Demand Analysis”, Transportation Research A, 14A, pp 367-378.
Bresnahan, T.F. (1987). “Competition and Collusion in the American Auto Industry: the 1955 Price
War”, Journal of Industrial Economics, 35, pp 457-482.
Bunch, D.S., M. Bradley, T.F. Golob, R. Kitamura and G.P. Occhiuzzo. (1993). “Demand for
clean-fuel vehicles in California: A Discrete-Choice Stated-Preference Pilot Project”,
Transportation Research A, 27A (3), pp 237-253.
Bunch, D.S., D. Brownstone and T.F. Golob. (1996). “A Dynamic forecasting system for vehicle
markets with clean-fuel vehicle”, in D.A. Hensher, J. King, and T.H. Oum (eds), World
Transportation Research. Pergamon, Oxford.
Bunch, D.S., B. Chen (2007). “Automobile Demand and Type Choice”, in D.A. Hensher and K. Button
(eds), Handbook of Transport Modelling, 2nd edition, Elsevier Science, Oxford.
Cambridge Econometrics (2007). Demand for Cars and Their Attributes Interim report to the
Department for Transport, April 2007.
Cardell, S.N., R. Dobson and F.C. Dunbar (1978). “Consumer Research Implications of Random
Coefficient Models”, Advances in Consumer Research, 5, pp 448-455.
Cardell, S.N., F.C. Dunbar (1980). “Measuring the Societal Impacts of Automobile Downsizing”,
Transportation Research A, 14A, pp 423-434.
De Jong, G.C. (1990). “An indirect utility model of car ownership and private use”, European
Economic Review 34, pp 971-985.
De Jong, G.C. (1996). “A Disaggregate Model System of Vehicle Holding Duration, Type Choice and
Use”, Transportation Research B, 30(4), pp 263-276.
De Jong, G.C., J. Fox, A. Daly, M. Pieters and R. Smit (2004). “Comparison of Car Ownership
Models”, Transport Reviews 24 (4), pp 379-408.
DfT (2006) ‘Research specification: Demand for cars and their attributes’, Department for
Transport, November 2006.
Dinopoulos, E., and M. Kreinin (1988). “Effects of the U.S.-Japan auto VER on European prices and
on US welfare”, The Review of Economics and Statistics, 70(3), pp 484-491.
eftec (2005) Demand for Cars and their Attributes: A Research Programme, Report to Department
for Transport, November 2005.
eftec (2006) Demand for Cars and their Attributes: Road pricing Supplement, Report to
Department for Transport, November 2005.
Feenstra, R. (1985). Automobile prices and protection: The U.S.-Japan trade restraint, Journal of
Policy Modelling, 7(1), pp 49-68.
Goldberg, P.K. (1995). “Product Differentiation and Oligopoly in International Markets: The Case of
the U.S. Automobile Industry”, Econometrica, 63(4), pp 891-951.
Halvorsen, R., and P. Palmquist (1980). “The interpretation of dummy variables in semilogarithmic
equations”, American Economic Review, 70, pp. 474-475.
Hensher, D.A. (1992). Dimension of Automobile Demand: A Longitudinal Study of Household
Automobile Ownership and Use, North-Holland, Amsterdam.
Hensher, D.A. and W.G. Greene. (2001). “Choosing between Conventional, Electric and LPG/CNG
Vehicles in Single-Vehicle Households” in Hensher, D.A. (eds) The Leading Edge of Travel
Behaviour Research, Pergamon, Oxford.
Hensher, D. A. and V. le Plastrier, (1985). “Toward a Dynamic discrete-choice model of household
automobile fleet size and composition”, Transportation Research, 19B, pp 481-495.
Hulten, C. R. (2003). “Price Hedonics: A Critical Review”, FRBNY Economic Policy Review, 9 (3), pp
5-15.
Lave, C. A., K. Train. (1979). “A disaggregate model of auto-type choice”, Transportation Research
13A, pp 1-9.
Mannering, F., H. Mahmassani. (1985). “Consumer Valuation of Foreign and Domestic Vehicle
Attributes: Econometric Analysis and Implications for Auto Demand”, Transportation Research
A, 19A (3), pp 243-251.
Mannering, F., C. Winston, (1985). “A dynamic empirical analysis of household vehicle ownership
and utilization”, RAND Journal of Economics, 16(2), pp 215-236.
Manski, C. F., L. Sherman, (1980). “An empirical analysis of household choice among motor
vehicles”, Transportation Research, 14A, pp 349-366.
McCarthy, P.S. and R.S. Tay. (1998). “New Vehicle Consumption and Fuel Efficiency: A Nested Logit
Approach”, Transportation Research E, 34(1), pp 39-51.
McFadden, D. (1974). “Conditional logit analysis of qualitative choice behaviour”, in P. Zarembka,
ed., Frontiers in Econometrics, New York: Academic Press.
McFadden, D., (1978). “Modelling the choice of residential location”, in A. Karlqvist et al. (eds.),
Spatial Interaction Theory and Planning Models, Amsterdam: North-Holland.
Mohammadian, A., E. J. Miller, (2003). “An Empirical Investigation of Household Vehicle Type
Choice Decisions”, Journal of the Transportation Research Record, 1854, pp 99-106.
Nevo, A. (2000). “A Practitioner’s Guide to Estimation of Random-Coefficients Logit Models of
Demand”, Journal of Economics and Management Strategy, 9(4), pp 513-548.
Pakes, A. (2003). “A Reconsideration of Hedonic Price Indices with an Application to PC’s”,
American Economic Review, 93 (5) pp 1578-1596.
Petrin, A. (2002). “Quantifying the Benefits of New Products: The Case of the Minivan”, Journal of
Political Economy, 110 (4), pp 705-729.
SMMT (2006). SMMT Annual CO2 Report: 2006 Market, Society of Motor Manufacturers and Traders
Publication.
Train, K. (1986). “Qualitative Choice Analysis (Theory, Econometrics and an Application to
Automobile Demand)”, MIT Press, Cambridge MA.
Train, K.E., C. Winston, (forthcoming). “Vehicle choice behaviour and the declining market share of
U.S. automakers”, International Economic Review
van Dalen, J., and B. Bode (2004). “Quality-corrected price indices: The case of the Dutch new
passenger car market, 1990–1999”, Applied Economics, 36, pp 1169–1197.
ANNEX 1: OVERVIEW OF SOME KEY CONCEPTS
A1.1 Measuring the responsiveness of consumption
Consumer demand for a given good is traditionally represented by the relationship between its
price and the quantity consumed, holding all other factors constant. This model of consumer
behaviour can then be used to obtain an ex ante indication of how policy instruments affecting
consumption are likely to affect a market outcome, such as the quantity of fuel efficient cars
purchased. Therefore, in order to evaluate the effect of changes in policy instruments, or the
introduction of new instruments, what is needed is to estimate a functional relationship between a
quantity measure and the price.
An attractive way to quantify the relation between two variables is the concept of elasticity, which
is a measure of causal responsiveness from one variable to another. It can be defined as the ratio of
changes in the two variables, where the changes are measured in relative terms. Formally, the
elasticity of y with respect to x is given by:
$$e_{y,x} = \frac{\partial y}{\partial x} \cdot \frac{x}{y} \approx \frac{\text{percentage change in } y}{\text{percentage change in } x}$$
The main advantage of this measure is that it neither relies on the units of measurement of the two
variables nor relates to a particular value or level of the variables. As a result, elasticities can be
meaningfully compared between different pairs of variables and are therefore particularly suitable
for policy purposes. That said, the concept of elasticity is limited to measuring the impact of
small changes in the variables.
The price elasticity of demand is an important and widely used concept. It gives the percentage
change in the quantity of a given good exchanged in a market when the price changes by one
percent, all other things being equal. Hence if we are interested in the market impact of the
introduction of a new tax, knowing the demand elasticity can help to understand the likely effect
of a price change on the market equilibrium. The price elasticity and the related concepts
explained below are therefore important in understanding the effects of economic policies. A further concept
directly related to the price-elasticity is the price semi-elasticity. This measures the percentage
change in the quantity consumed for a price increase measured in absolute (money) terms. In other
words, it describes a relative change in the quantity exchanged due to a level change of the price.
The price elasticity (and semi-elasticity) of the vast majority of goods is negative, so that an
increase of the price leads to a diminution of the quantity of the good exchanged on the market.4
Additionally, the demand for a good is usually described as inelastic or elastic, the latter being
defined as having an elasticity greater than one in absolute terms. More precisely, the consumption
of a good with an elastic demand function will react to a 1 percent price change by more than 1
percent, i.e. the demand elasticity is smaller than -1. Symmetrically, a good with an inelastic
demand will see its quantity consumed reduced by less than one percent, so that the elasticity in
response to a 1 percent price change is between 0 and -1.
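The definitions above can be illustrated numerically. The following is a minimal sketch rather than part of the study: it assumes a hypothetical linear demand curve (the function `demand` and its coefficients are invented for illustration) and approximates the price elasticity and semi-elasticity by finite differences before classifying demand as elastic or inelastic.

```python
def demand(price):
    """Hypothetical linear demand curve (illustrative assumption, not from the study)."""
    return 100.0 - 2.0 * price


def price_elasticity(q_func, price, step=1e-6):
    """Point elasticity (dQ/dP) * (P/Q), using a central finite difference."""
    slope = (q_func(price + step) - q_func(price - step)) / (2.0 * step)
    return slope * price / q_func(price)


def price_semi_elasticity(q_func, price, step=1e-6):
    """Semi-elasticity (dQ/dP) / Q: relative change in quantity per unit of money."""
    slope = (q_func(price + step) - q_func(price - step)) / (2.0 * step)
    return slope / q_func(price)


def classify(elasticity):
    """Elastic if the elasticity exceeds one in absolute terms, otherwise inelastic."""
    return "elastic" if abs(elasticity) > 1.0 else "inelastic"


for p in (10.0, 40.0):
    e = price_elasticity(demand, p)
    print(p, round(e, 3), round(price_semi_elasticity(demand, p), 4), classify(e))
```

On this assumed curve the same good is inelastic at a low price and elastic at a high price, which shows that an elasticity describes a point on the demand function rather than the good itself.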
Finally, a related concept is that of cross-price elasticity, which is particularly relevant to the
context of this study. Cross-price elasticity measures the percentage change in the quantity
consumed of a given good when the price of a related good changes by 1 percent. For substitute
goods, such as vehicle models with similar characteristics, the cross-price elasticity will be positive.
Indeed if the price of a particular product increases, we would expect consumers to buy more of
the substitute good, increasing the quantity consumed. In contrast, complement goods are
expected to have negative cross-price elasticity: if the price of a product increases, consumers will
reduce their consumption of both the specific product and its complement.
4 The exception is so-called ‘Giffen goods’, for which the income effect of a price increase outweighs the
substitution effect.
A1.2 Estimating demand elasticities: the case of homogeneous goods
By definition, a homogeneous commodity is one whose characteristics do not vary, regardless of the
producer of the good. In turn, consumers cannot attribute a particular product to a producer simply
from its physical characteristics. Agricultural goods are a common illustration of this theoretical
abstraction; for example, wheat marketed by two different producers can be seen as (roughly)
perfect substitutes.
From the homogeneity property, it follows that the demand for wheat can be represented by a
relation linking the quantity bought during a particular time interval to the market price.5 Indeed,
since none of the characteristics of the good vary, the only variable that matters is the price.
Under the standard assumptions of microeconomics, a price increase will reduce the quantity
exchanged at a particular moment (all other things being equal). Hence in order to estimate the
price elasticity of the demand for wheat, it would be necessary to estimate the demand function
from data observed in a particular market; for instance, the market for wheat in the UK.
One first thing to note is that our theoretical representation relating price and quantity is only a
simplification. Observed market outcomes will fluctuate around any estimated demand function.
Because of this, empirical studies specify an error term, which will account for the variation that is
not captured by the stylised demand function. Formally, if we express the market price and
quantity of a good at time t by Pt and Qt respectively, the estimated demand function can be
written as:
$$P_t = \alpha + \beta \cdot Q_t + e_t \qquad (1)$$
where α and β are the structural parameters to be estimated and et is the error term of period t.
This relation is shown graphically in Figure A1.1.
5 We will make the simplifying assumption that the characteristics of the good in question do not vary across time
periods.
Figure A1.1: Graphical representation of the demand for a homogeneous good
[Figure: axes Pt and Qt; labelled elements: stylised demand function, observation for period t, error term for observation t]
One of the fundamental assumptions allowing the estimation of the parameters α and β is that et
and Pt are not functionally related. If this assumption did not hold, then it would not be possible to
attribute the observed variation of Qt to either Pt or et. In our example, this means that the price
of wheat should not be affected systematically by any other events that affect the quantity
exchanged. If the parameters were estimated when this assumption did not hold, it can be
shown that the estimates would be biased toward zero.
Of course, in the present context this assumption is very unlikely to hold. Indeed, the quantity and
price prevailing on the wheat market results from a process involving both the supply and demand.
Therefore, the observations used to estimate the demand are likely to be systematically influenced
by the conditions prevailing on the supply side of the market, and these will be reflected in the
error term. As the price and the error term are related through the supply side of the market, the
estimated parameters will be biased and will provide an invalid representation of the true demand
function.
Because this problem has been recognised for a long time, methods exist to obtain statistical
estimates of the relationship between quantity and price without needing to model both the supply
and the demand. Indeed, this problem is known as price endogeneity and it can be solved by using
instrumental variables. More precisely, assuming that the demand remains the same across each
time period, we need to find a ‘shifter’ of the supply curve, that is, a variable that affects only the
supply side of the market. In turn, the shifter would have the property of being correlated with the
price (through the shifts of the supply curve), but not with the unobserved terms that affect the
demand.
In the case of the wheat market, a good example of a supply shifter is favourable weather
conditions. If we assume that better weather increases the yields of wheat, years with better
weather will increase the supply while leaving the demand curve unchanged. As the
supply during a favourable weather period will be more abundant, the price will be lower than
average. The measure of weather will be unrelated to the unobserved factors that affect demand,
but will have a causal impact on the price through the supply. We can therefore use information
on weather as an instrument to control for the effect of the supply side of the market.
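The role of the supply shifter can be sketched with simulated data. The example below is an illustration under stated assumptions, not the procedure used in this study: the structural parameters are invented, demand is written with quantity as the dependent variable for convenience, and `weather` stands for a hypothetical standardised weather index. It compares an ordinary least squares slope with the simple instrumental variable estimate cov(weather, quantity) / cov(weather, price).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical structural parameters (assumptions for illustration only).
a, b = 100.0, 2.0          # demand: Q = a - b*P + u
c, d, g = 10.0, 1.0, 5.0   # supply: Q = c + d*P + g*W + v

weather = rng.normal(size=n)        # supply shifter W (good weather raises supply)
u = rng.normal(scale=5.0, size=n)   # unobserved demand shock
v = rng.normal(scale=5.0, size=n)   # unobserved supply shock

# Market-clearing price and quantity: demand equals supply in every period.
price = (a - c + u - g * weather - v) / (b + d)
quantity = a - b * price + u

# OLS slope of quantity on price: contaminated because price depends on u.
ols_slope = np.cov(price, quantity)[0, 1] / np.var(price, ddof=1)

# IV slope: uses only the price variation that is driven by weather.
iv_slope = np.cov(weather, quantity)[0, 1] / np.cov(weather, price)[0, 1]

print("true demand slope:", -b)
print("OLS estimate:", ols_slope)
print("IV estimate:", iv_slope)
```

In this simulated market the OLS slope is pulled toward zero because demand shocks raise price and quantity together, whereas the weather instrument recovers a slope close to the assumed demand parameter.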
A1.3 Differentiated product: the car market
The analytical advantage of homogeneous products is that the market can be modelled in a
relatively simple manner. However, most consumer goods cannot be described as being
homogeneous as they tend to differ in many dimensions simultaneously. In turn, the representation
of the market presented previously is likely to be too simple to understand many issues of the ‘real
world’.
For our purpose, the relevant example of a differentiated good market is the car market. While
‘car’ is a generic term for a product that basically serves as a means of transportation, it hides a
multitude of different makes, models and attributes. These will differ in many different
dimensions, including their design, their luxury and prestige and their fuel consumption, to cite
only a few. In other words, while all the products sold on the car market are substitutes in terms of
transportation, their varying characteristics make the analysis of supply and demand much more
complicated.
From this fundamental difference between wheat and cars, it follows that the market transactions
cannot be modelled in the same framework. In order to understand the variations in car market
outcomes, we need to account for the multiple facets of each product, as these will typically be
taken into account by the buyers. The supply side will also reflect them, as different attributes
will most likely entail different production costs. That is to say, the prevailing market price and
quantity exchanged for a differentiated product will reflect the actual and perceived
characteristics of each product.
On the demand side, although all the differentiated goods provide the consumers with the same
basic service, the demand for one specific good among its differentiated family will reflect its
attribute levels. In turn, the elasticity and substitution patterns for this good in particular will be
more complex, and will be closely dependent on the alternative products or bundles of attributes
available. Hence the choice of one product among all the possible alternatives will require
consumers to make tradeoffs between the different dimensions of the goods. Since the price tag of
a particular good will only be one dimension of choice, the elasticity concept needs to be applied
to all the possible attributes.
ANNEX 2: LITERATURE REVIEW: DISCRETE CHOICE MODELS
The following provides a survey of relevant literature, focussing on applications of discrete
choice models investigating vehicle purchasing demand6. Initially we provide an overview of the
technical developments and major applications of these methodologies (Section A2.1). This is
followed by a review of the utility function specified by relevant aggregate level studies and the
inclusion of car attributes within these (Section A2.2). Finally a brief review of selected empirical
results is provided for illustrative purposes (Section A2.3).
A2.1 Application of aggregate and disaggregate models
Aggregate discrete choice models
Aggregate choice models base their inference on the summation of individual choices at the market
level. The ‘traditional’ or ‘standard’ methodology applied to this type of data replicates the
analysis of homogeneous goods7. Specifically, variations in the total quantity exchanged of each
product are sought to be explained by the product’s own price and the price of all possible
substitutes. Accordingly, the traditional approach is of limited tractability, entailing the estimation
of a large number of own-price and cross-price elasticities, and hence requiring a substantial
dataset; see for example the discussion in Bresnahan (1987).
In an influential paper, Berry (1994) addresses the limitations of the traditional approach and
tackles directly the problem of differentiated goods in imperfect markets. In particular, Berry sets
out a model of market-level demand based on individual decision making: households are assumed
to buy a given product if it provides a higher utility than any other model of car available on the
market. Importantly, the choice set of households includes an outside good, i.e. not to purchase a
car at all. Proportions of the population making a decision to purchase a specific vehicle model are
taken as the sum of individual-level utility maximisation choices. In this framework, all elasticities
of the demand are defined directly by the parameters of the utility function.
In econometric terminology, the simplest functional form to model such a ‘discrete choice’ is the
multinomial logit (MNL). However, the MNL imposes an unrealistic substitution pattern: the cross-price
elasticities can only depend on the mean utility level provided by the product, so that any
pair of products with the same market share will have the same cross-price elasticity with a
particular third product8. Since the MNL does not allow the substitution patterns to be influenced
by the product’s characteristics, it cannot account for the fact that consumers are more likely to
choose a product with similar attributes as a substitute when, for example, the prices change.
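The restrictive substitution pattern of the MNL can be checked numerically. The sketch below is illustrative only: the three products, the `quality` values and the price coefficient `alpha` are invented, utility is assumed linear in price, and the outside good is normalised to zero utility. Under these assumptions the elasticity of every rival's share with respect to the price of product k equals alpha * p_k * s_k, whatever the rival's own attributes.

```python
import numpy as np

alpha = 0.1                              # hypothetical price coefficient
prices = np.array([20.0, 22.0, 35.0])    # hypothetical prices
quality = np.array([3.0, 3.5, 4.5])      # hypothetical utility from non-price attributes


def mnl_shares(p):
    """Multinomial logit market shares with an outside good of utility zero."""
    expd = np.exp(quality - alpha * p)
    return expd / (1.0 + expd.sum())


def elasticities_wrt_price(k, step=1e-5):
    """Elasticity of each product's share with respect to the price of product k."""
    up, down = prices.copy(), prices.copy()
    up[k] += step
    down[k] -= step
    slope = (mnl_shares(up) - mnl_shares(down)) / (2.0 * step)
    return slope * prices[k] / mnl_shares(prices)


e = elasticities_wrt_price(2)
print("elasticities with respect to the price of product 2:", e)
print("alpha * p_2 * s_2:", alpha * prices[2] * mnl_shares(prices)[2])
```

The first two entries of `e` coincide even though the two rival products differ in price and quality, which is the pattern the text describes as unrealistic.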
Two strategies to avoid unrealistic substitution patterns are available. First, preferences can be
assumed to vary among consumers, a possibility that was pioneered in this context in articles by
Cardell et al. (1978), Cardell and Dunbar (1980) and Boyd and Mellman (1980). Technically, this is
the random parameter logit (RPL) model, and the ‘taste’ parameters (or coefficients of the utility
function) are specified as parametrically distributed among the population. The estimation then
seeks to reveal the mean and standard deviation of the tastes’ distribution. In a RPL, a consumer
who bought a relatively large size car is treated as having a more pronounced taste for large cars,
so that he will be seen as more likely to substitute consumption to other models of large cars. In
the second strategy, the consumers’ characteristics are modelled as having a systematic effect on
their choices. In this case, a consumer of a large car would be more likely to substitute to a
different large car if, for example, household characteristics dictated this (e.g. a large family).
This can be achieved by interacting individual characteristics with product characteristics.
6 As noted in Section 1.2, the parallel Cambridge Econometrics study is applying a disaggregate choice model. Given this,
less weight is attached to this approach and more emphasis is placed upon the development of the aggregate choice model,
since this provides the methodological basis of this study.
7 For further detail see Annex 1.
8 This arises because the error term is assumed to be independently and identically distributed.
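The effect of random taste parameters on substitution can be sketched by simulation. The example below is a stylised illustration, not the model estimated in this study: it assumes three hypothetical cars (A and B large, C small), a normally distributed taste for size, and invented coefficients. It measures how buyers leaving car A after a price rise split between B and C, relative to the split implied by market shares alone.

```python
import numpy as np

size = np.array([1.0, 1.0, 0.0])        # cars A and B are large, car C is small
prices = np.array([30.0, 30.0, 20.0])   # hypothetical prices
alpha = 0.1                             # hypothetical price coefficient
beta_mean = 1.0                         # hypothetical mean taste for size


def shares(p, sigma, n_draws=5000, seed=1):
    """Average individual logit shares over simulated tastes for size."""
    draws = np.random.default_rng(seed).normal(size=n_draws)
    beta = beta_mean + sigma * draws                        # one taste per consumer
    utility = beta[:, None] * size[None, :] - alpha * p[None, :] + 2.0
    expu = np.exp(utility)
    indiv = expu / (1.0 + expu.sum(axis=1, keepdims=True))  # outside good has utility 0
    return indiv.mean(axis=0)


def relative_diversion(sigma):
    """Gain of B relative to gain of C when A's price rises, divided by the
    ratio of their initial shares (1.0 means diversion proportional to share)."""
    base = shares(prices, sigma)
    new_prices = prices.copy()
    new_prices[0] += 1.0
    gain = shares(new_prices, sigma) - base
    return (gain[1] / gain[2]) / (base[1] / base[2])


print("no taste variation (plain logit):", relative_diversion(0.0))
print("random taste for size:", relative_diversion(2.0))
```

With no taste variation the split is proportional to shares, as in the MNL. Once tastes for size vary, buyers of the large car A are disproportionately consumers who value size, so more of them move to the other large car B.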
A further issue highlighted by Berry is an inability to observe all the characteristics of the product
that are relevant to the choices of individuals. For instance, factors such as style, prestige,
reputation of the producer or past experience are difficult to measure, and hence difficult to
include within the analysis. In order to account for this kind of ‘product specific’ unobserved
attribute, the utility of a typical consumer is assumed to include a ‘catch-all’ term for the average
utility provided by the unmeasured characteristics of a vehicle. Because this catch-all utility is
likely to vary among consumers, the variation around that mean utility is modelled through the
traditional error term.
While more realistic, this representation of the utility function imposes one complication. Indeed, a
product with a relatively high ‘style factor’ will typically command a higher market price (e.g.
because of higher production costs). This implies that the price might be correlated with the
unobserved characteristics, or more technically that one of the independent variables is correlated
with the error term9. If the product-specific error term is functionally related to the price, the
so-called price endogeneity will imply that the estimated parameters will be biased toward zero and
hence unreliable.
In order to address the endogeneity of the price variable, it is usually possible to proceed in the
estimation with instrumental variables, i.e. variables that are related to the price but not to the
error term. Unfortunately, the discrete-choice model formulation precludes the traditional
instrumental variable procedure. Indeed, the relationship between the dependent variable and the
unobserved characteristics is non-linear. Therefore, one of the major contributions of Berry is to
propose a method to ‘linearise’ the aggregate discrete choice model, so that the instrumental
variable procedure can be accommodated, yielding unbiased estimates for the parameters.
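For the simple logit case, the linearisation can be written in closed form: the mean utility of each product equals the log of its market share minus the log of the outside good's share. The sketch below illustrates this under invented numbers (the attribute `x`, the coefficients and the unobserved term `xi` are all hypothetical); richer specifications with random coefficients require a numerical inversion that is not shown here.

```python
import numpy as np

# Hypothetical products; every number here is an assumption for illustration.
alpha_true = 0.1
prices = np.array([20.0, 25.0, 35.0])
x = np.array([1.0, 1.5, 2.5])        # an observed attribute
xi = np.array([0.3, -0.2, 0.4])      # unobserved product characteristic

# Mean utility is linear in the unobserved term.
delta_true = 2.0 * x - alpha_true * prices + xi

# Logit market shares implied by the mean utilities (outside good has utility 0).
expd = np.exp(delta_true)
shares = expd / (1.0 + expd.sum())
outside_share = 1.0 - shares.sum()

# Inversion: observed shares map back to mean utilities in closed form.
delta_recovered = np.log(shares) - np.log(outside_share)

print(delta_true)
print(delta_recovered)
```

Because the recovered mean utility is linear in price and in the unobserved characteristic, it can serve as the dependent variable of a standard instrumental variable regression.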
Finally, Berry also allows for the inclusion of the supply side of the market. Given the oligopolistic
structure of the car market, the firms are assumed to be price-setters. Hence, notionally, profit
maximising firms will determine their price vector in the context of a Nash-in-price market
equilibrium, taking into account the demand function of the consumers. By assuming a specific
functional form for the firm’s cost function, the demand and pricing equation can be estimated
simultaneously to obtain the demand elasticities, marginal cost parameters and the mark-up for
each marketed product.
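The link between demand parameters and mark-ups can be illustrated in the simplest possible setting. The sketch below assumes single-product firms, simple logit demand with utility linear in price, and Nash-in-price competition; under those assumptions the first-order condition of each firm gives a mark-up of 1 / (alpha * (1 - s_j)). The prices, shares and `alpha` are invented, and multi-product firms or random coefficients would change the formula.

```python
import numpy as np

alpha = 0.1                              # hypothetical price coefficient (in absolute value)
prices = np.array([20.0, 25.0, 35.0])    # hypothetical observed prices
shares = np.array([0.15, 0.10, 0.05])    # hypothetical observed market shares

# Logit own-price derivative of the share: ds_j/dp_j = -alpha * s_j * (1 - s_j).
own_derivative = -alpha * shares * (1.0 - shares)

# First-order condition of a single-product firm: s_j + (p_j - c_j) * ds_j/dp_j = 0.
markup = -shares / own_derivative        # equals 1 / (alpha * (1 - s_j))
marginal_cost = prices - markup

print("mark-ups:", markup)
print("implied marginal costs:", marginal_cost)
```

Given estimated demand parameters, implied marginal costs of this kind could then be related to vehicle attributes to approximate a cost function.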
Berry et al. (1995) apply the methodology proposed by Berry (1994), employing product-level
market share data, with observations for multiple periods. The approach is structural in that the
estimated equations are derived from an oligopolistic pricing equilibrium model, hence also
including the supply side of the market10. Berry et al. use a generalised method of moments (GMM)
estimator, which selects a set of estimates that minimises a measure of the difference between the
population restriction and its sample analogue (which is a function of the parameters of interest).
In this particular context, this is roughly equivalent to selecting the parameters of the model that
yield a correlation between the unobserved terms and the instrumental variables close to zero.
Note that Berry et al. (2004) show that this estimation procedure generates estimates with
desirable statistical properties.11
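The idea of choosing parameters so that the unobserved terms are uncorrelated with the instruments can be sketched for a linear mean-utility equation. The example below is a simplified illustration on simulated data, not the estimator of Berry et al. (1995): the variables `x`, `z` and `xi` and all coefficients are invented, and there are exactly as many instruments as parameters, so the GMM minimiser has a closed form.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated product-level data; all numbers are illustrative assumptions.
x = rng.normal(size=n)                   # observed attribute
z = rng.normal(size=n)                   # cost shifter used as instrument
xi = rng.normal(scale=0.5, size=n)       # unobserved product quality
price = 20.0 + 2.0 * z + 3.0 * xi + rng.normal(size=n)   # price rises with xi
delta = 1.0 + 0.5 * x - 0.1 * price + xi                 # mean utility

X = np.column_stack([np.ones(n), x, price])   # regressors
Z = np.column_stack([np.ones(n), x, z])       # instruments
W = np.linalg.inv(Z.T @ Z / n)                # weighting matrix


def gmm_objective(theta):
    """Quadratic form in the sample moments (1/n) * Z'(delta - X theta)."""
    g = Z.T @ (delta - X @ theta) / n
    return float(g @ W @ g)


# With as many instruments as parameters, the minimiser solves Z'(delta - X theta) = 0.
theta_gmm = np.linalg.solve(Z.T @ X, Z.T @ delta)
theta_ols = np.linalg.lstsq(X, delta, rcond=None)[0]

print("GMM price coefficient:", theta_gmm[2], "objective:", gmm_objective(theta_gmm))
print("OLS price coefficient:", theta_ols[2], "objective:", gmm_objective(theta_ols))
```

At the GMM estimate the sample moments are numerically zero, whereas the OLS estimate leaves the residuals correlated with the instrument and understates the price sensitivity in this simulation.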
9 This is the analogue of the homogeneous market simultaneity problem detailed in Annex 1.
10 Structural econometrics refers to a methodology that directly tests the validity of a particular economic model.
The main advantage is that the estimates can be interpreted as the parameters of the model and hence as causal
relationships (as opposed to correlations).
11 Loosely speaking, they show that as the number of observations grows, the distribution of the estimates will
converge to a normal distribution with a mean equal to the true value of the parameter.
In a further paper, Berry et al. (1999) apply the same methodology to evaluate the welfare effects
of a trade policy, namely the voluntary export restraints (VER) that Japan placed on its car exports
to the United States in 1981. The analysis employs panel data detailing the US car market for the
time period 1971 to 1990, with the same structure as in Berry et al. (1995). The authors also use
macroeconomic indicators to model the supply side, notably the exchange rate, the consumer price
deflator, the interest rate, gross national product and foreign wage levels. Finally, they include a
VER dummy if the trade restriction applies to a particular model in a particular year.
The results show that the estimates for the parameters of the tastes distribution (mean and
variance) are statistically significant, highlighting the heterogeneity of preferences. Concerning the
VER, the main finding is that the prices did not significantly increase in the first years, as the
estimates of the tax equivalent of the trade restriction are also shown not to be statistically
different from zero until 1986. The welfare implications of the restriction are assessed by
simulating a counterfactual, i.e. the industry equilibrium that would have prevailed in the absence
of the VER. This is done by setting the implicit tax rate to zero and solving the model for the new
price vector. This suggests that the most price-sensitive consumers switched over to US
manufacturers, which in turn increased the profits of US firms. Finally, the authors estimate the
compensating variation12 for consumers and show that the burden fell disproportionately on buyers
with inelastic demand for Japanese cars.
Petrin (2002) provides another significant development in the aggregate level methodology. In this
study, the RPL estimation procedure is augmented to include the influence of socio-economic
characteristics. Specifically, the procedure restricts the estimated parameters to match averaged
US household (micro) data taken from the Consumer Expenditure Survey (CES). The characteristics
of consumers can therefore be linked to the probability of choosing a product, increasing the
amount of information and reliability of the substitution patterns without the need to have a large
amount of consumer-level data. The estimation procedure follows the GMM approach of Berry et al.
(1995), with additional constraints to match as closely as possible the model predictions to the
observed outcomes from the CES.
With that specification, the model allows the taste parameter for a specific vehicle’s attributes to
vary as a function of the socio-economic indicators. For instance, larger size households may have a
tendency to place greater weight on attributes such as capacity/number of seats, etc. In turn,
patterns of substitution between family vehicles can emerge either because larger families prefer
minivans or because these products share other common characteristics. The socio-economic
characteristics included are household size, the age of the head of the household and the income
level. Finally, the price endogeneity problem is addressed by using instrumental variables, namely
production cost information and characteristics of other products in the same market segment.
Petrin’s results show that both the RPL specification and the inclusion of constraints to account for
average socio-demographic information are found to significantly increase the precision of the
estimates. Additionally, controlling for price endogeneity is found to be important to obtain
unbiased parameters. Because the main aim of the article is not to estimate demand elasticities,
these are not explicitly reported. Nevertheless, the results of Petrin reveal that a reliable
description of welfare impacts may hinge upon the inclusion of socio-economic information.
Indeed, since policies often affect population segments differently, having precisely estimated
taste coefficients for different subgroups is crucial to assess the overall impact.
The final and most recent article developing this methodology is by Berry et al. (2004), which also
employs micro-level data together with aggregate market share data. The individual-level
information was obtained through a survey of General Motors car buyers during 1993. A notable
feature of this work is that it includes information on ‘second choices’ of people who actually
bought a car, i.e. the car model that would have been chosen if their preferred product was not
available. With this data, the parameters can be further restricted to replicate as closely as
possible observed (but hypothetical) substitution patterns.
12 The compensating variation is the income which would generate the same level of utility at the non-VER prices.
Disaggregate discrete choice models
Disaggregate level choice models stem from an established estimation methodology dating from
McFadden (1974). The aim of this approach is to estimate individual-level demand for new vehicles
by linking directly consumers’ and cars’ characteristics with the vehicle model chosen. The
corollary is that this methodology requires survey data with observations on households’
characteristics, together with their vehicle purchased or stated choice13.
An early application of the disaggregate methodology is provided by Lave and Train (1979), who
analyse survey data gathered in seven US cities in 1976. To provide a choice set of manageable
size, vehicle models are classified in ten categories, with ten ‘representative cars’ constructed by
averaging all characteristics and weighting them according to market share of each model. The
econometric model used is the multinomial logit, and the authors include alternative specific
constants for each class of vehicle to mitigate the problem of the ‘independence of irrelevant
alternatives’ (IIA).14 The procedure permits the estimation of the probability that a household
chooses a car within one of the ten categories, provided it previously chose to buy a new car.
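The IIA property mentioned above can be illustrated with a minimal sketch of multinomial logit choice probabilities. The three vehicle classes and their utility values are hypothetical and are not taken from Lave and Train (1979); the point is only that the ratio of two choice probabilities does not move when a third alternative changes.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities for a dict {alternative: utility}."""
    m = max(utilities.values())  # subtract the maximum for numerical stability
    expu = {k: math.exp(v - m) for k, v in utilities.items()}
    total = sum(expu.values())
    return {k: e / total for k, e in expu.items()}

# Hypothetical representative utilities for three vehicle classes
utilities = {"subcompact": 1.0, "compact": 0.5, "luxury": -0.2}
p_before = mnl_probabilities(utilities)

# Make the third alternative more attractive and recompute
utilities_changed = dict(utilities, luxury=1.5)
p_after = mnl_probabilities(utilities_changed)

# Under IIA both ratios equal exp(1.0 - 0.5), whatever happens to "luxury"
ratio_before = p_before["subcompact"] / p_before["compact"]
ratio_after = p_after["subcompact"] / p_after["compact"]
print(ratio_before, ratio_after)
```

Alternative specific constants shift the utility levels of each class but do not remove this proportional substitution property, which is why the text describes them as mitigating rather than solving the problem.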
Building on these results, two studies have thereafter explicitly incorporated heterogeneity in the
taste parameters. First, Berkovec and Rust (1984) have used a sequential choice framework, where
the household chooses the vehicle class before choosing the make/model/vintage. While allowing
the taste parameters to differ conditionally on the class of the vehicle, this type of model is
ultimately limited by the imposed choice structure. The second study, by Mannering and
Mahmassani (1985), estimates separate taste coefficients for consumers who purchased cars
manufactured in the US as opposed to foreign models. In these two studies, the results reveal
significant heterogeneity in consumers’ tastes, which points out the need to specify a flexible
functional form for the utility function.
Further studies have extended the initial scope by modelling the existing size of households’
vehicle fleets, a feature that has been subsequently shown to be crucial in order to obtain the
elasticity estimates (Hensher, 1992). The earliest study of this kind was published by Manski and
Sherman (1980), using 1976 US data. The model posits that if the household does not own a vehicle,
its alternatives are the set of vehicles on the market, while if it owns one, it can choose to retain
the vehicle or sell it and purchase another on the market. An interesting feature of this model is
that it includes a transaction cost component for those households choosing to revise their vehicle
fleet by purchasing a new car. However, two shortcomings have been highlighted (Hensher and Le
Plastrier, 1985). First, the study fails to merge households with more than one car in the same
estimation; i.e. there are separate models of one- and two-car households. Second, there is no real
account for the dynamic component of the choice process; that is, the decision in each time period
whether to revise the household’s fleet or not. Indeed, the incorporation of transaction costs gives
a limited account of this facet of the decision-making process.
In order to overcome these limitations, Hensher and Le Plastrier (1985) and Mannering and Winston
(1985) present models in which the household vehicle fleet and its adjustment over time are part of
the same decision process. The first study, based on a survey implemented in Sydney (Australia),
uses a nested logit approach, so that the two decisions are modelled sequentially. Mannering and
Winston also include the evolution of households’ tastes over time, which permits the analysis of
‘brand loyalty’, i.e. consumers of a particular brand might be more likely to select that same brand
in future vehicle purchase decisions. The data used is based on surveys within the US from 1977 to
1980. Another interesting aspect of the latter study is that it looks at the vehicle use, notably in
terms of miles driven and operating costs, both in the current and previous time period. This
augments the model of household vehicle purchasing by allowing past experience and use to
influence current choices.
13 Note that studies based on hypothetical choices, i.e. stated preference surveys, are omitted from this discussion.
See for example: an assessment of the demand for clean-fuel vehicles in Bunch et al. (1993); application of results for
transport modelling in Bunch et al. (1996); or Hensher and Green (2001) on stated preferences for technologies with very
small market shares (e.g. electric vehicles).
14 The IIA property of the multinomial logit implicitly restricts the choice between two alternatives to be
independent of the characteristics of a third option.
In a similar line of research, Train (1986) and De Jong (1990) estimate models that include car
ownership and mileage. In particular, De Jong (1990) studies the effect of both fixed and variable
utilisation costs on car ownership and yearly kilometres driven (for households that chose to own a
car). The estimated equation is derived from a static utility optimisation problem and the data
stems from a 1985 Dutch survey. The results are used to simulate the impact of change in the cost
of car ownership and usage on both the mileage and the ownership.
Orientated toward transport planning policy simulations, De Jong (1996) augments the methodology
by bringing together models that were traditionally treated as separate. In particular, De Jong
simulates the effect of policies by integrating four decision-making processes: the length of time
the car is used, the choice of the vehicle type, the annual mileage and the fuel consumption while
driving. Therefore, in this context, the vehicle-type choice is conditional on actual replacement of
the currently owned vehicle. De Jong applies the different models to Dutch panel data from 1992
and 1993. In a further attempt to merge together different aspects of vehicle decision-making,
Bhat and Sen (2006) integrated the choice of the number and types of vehicles owned. This allowed
them to assess the responsiveness of the vehicle fleet and usage over time, as a response to
changes in demographics or operating costs.
Closer to the problem considered within the present research, three relatively recent articles focus
on vehicle-type choice exclusively and use a sequential-choice format (nested logit model). First,
Goldberg (1995) uses data sourced from a US consumer survey between 1983 and 1987, and
specifies a five-level decision making process. Specifically, households are assumed to decide
sequentially whether to buy a car or not, whether to buy a new car or a used car, the class of car,
the country of origin, and finally the model. Note that household characteristics only enter the
choice of the class of cars.
Second, McCarthy and Tay (1998) use a 1989 US survey representative of the US car buying
population. Because the study focuses on the fuel efficiency attribute of new vehicles, the nests
are defined in terms of ‘miles per gallon’ bands, and the researchers find it to be a “significant
improvement” over the simpler MNL model. In other words, they find that the structure of the
vehicle demand is not independent of the fuel efficiency category.
Third, Mohammadian and Miller (2003) study data on Canadian households obtained by a
retrospective survey, with information on vehicle transactions for up to nine years. This study also
includes the choice of used cars, which are classified according to their first circulation date. In the
specified choice structure, the household first decides which vehicle class or type they prefer (out
of six alternatives), then continues to choose the vintage of the car (four alternatives). Although the
choice structure is heavily constrained by aggregating a large number of alternatives into just 24,
the results fit the data relatively well.
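The sequential-choice structure used in these three studies can be sketched with a two-level nested logit. The nests (vehicle class) and alternatives (vintage within class) below loosely mirror the class-then-vintage structure described above, but all names, utilities and dissimilarity parameters are hypothetical; none of the cited papers' specifications is reproduced here.

```python
import math

def nested_logit_probabilities(nests, lam):
    """Two-level nested logit.

    nests: {nest: {alternative: utility}}
    lam:   {nest: dissimilarity parameter in (0, 1]}
    """
    within, inclusive = {}, {}
    for n, alts in nests.items():
        scaled = {a: math.exp(v / lam[n]) for a, v in alts.items()}
        denom = sum(scaled.values())
        within[n] = {a: s / denom for a, s in scaled.items()}  # P(alt | nest)
        inclusive[n] = math.log(denom)                          # inclusive value
    weights = {n: math.exp(lam[n] * inclusive[n]) for n in nests}
    total = sum(weights.values())
    probs = {}
    for n in nests:
        p_nest = weights[n] / total                             # P(nest)
        for a, p in within[n].items():
            probs[a] = p_nest * p
    return probs

# Hypothetical structure: choose a class, then a vintage within the class
nests = {
    "small": {"small_new": 0.4, "small_old": 0.1},
    "large": {"large_new": 0.3, "large_old": 0.0},
}
p_nested = nested_logit_probabilities(nests, {"small": 0.5, "large": 0.5})
p_flat = nested_logit_probabilities(nests, {"small": 1.0, "large": 1.0})
print(p_nested)
print(p_flat)
```

With every dissimilarity parameter equal to one the model collapses to the multinomial logit, which is the sense in which a nested specification can be tested as an improvement over the simpler MNL model.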
The final and most recent study, by Train and Winston (forthcoming), identifies the parameters of
the utility function without nesting the choice of the type of car model. The authors apply an RPL
model that captures the effect of brand loyalty, the manufacturer’s product line and distribution
infrastructure, as well as car and household attributes. They analyse survey data on US households
that acquired a new car model in the year 2000. With this model, Train and Winston are able to
examine the underlying causes of market share trends, enabling the authors to attribute the
decline of US car manufacturing mainly to changes in the basic characteristics of the cars (e.g.
price, fuel consumption, engine power, reliability, etc.).
A2.2 Specification of vehicle attributes
One of the key assumptions underlying the discrete choice methodology (aggregate and
disaggregate) is that households select the vehicle that yields the highest possible utility. In turn,
selecting the appropriate parameters that enter the utility index is a crucial step for empirical
applications. There is, however, no theoretically grounded manner for selecting these attributes.
Evidently though, some guide as to the specification of attributes within the utility function can be
taken from existing empirical applications. As such, this section reviews the evidence gathered
from the four principal aggregate level choice modelling studies; namely Berry et al. (1995), Berry
et al. (1999), Petrin (2002) and Berry et al. (2004)15.
Overall, there are three broad types of components that may enter into the utility function:
• Vehicle attributes: these should reflect all characteristics that are of importance to the
consumers. Within this, one attribute (or set of attributes) of particular interest is the cost
of car ownership, including both costs that are fixed in nature (e.g. vehicle excise duty) and
those that are variable (e.g. fuel).
• Household characteristics: these are likely to influence vehicle choice; either because they
influence utility directly (as in the case of income) or because they result in different
weightings of importance for specific vehicle attributes (e.g. a family of four may place more
emphasis on the number of seats than a single person household).
• Interaction terms: namely, interactions between the characteristics of households and vehicle
attributes. This allows the effect of an attribute on utility to depend on the level of
another variable (e.g. the utility obtained through the size of the car might depend on the
number of people in the household, so that we would need to multiply these two variables to
quantify the interaction).
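As a minimal sketch of the third component, the utility index below combines two vehicle attributes with one household-vehicle interaction (number of seats multiplied by household size). The coefficient values, prices and household sizes are hypothetical and chosen only to show how an interaction can reverse the ranking of two vehicles across households; they are not estimates from any of the studies reviewed.

```python
def utility(vehicle, household, beta):
    """Linear utility index with a seats x household-size interaction term."""
    return (
        beta["price"] * vehicle["price"]
        + beta["seats"] * vehicle["seats"]
        + beta["seats_x_size"] * vehicle["seats"] * household["size"]
    )

# Hypothetical coefficients (price in thousands)
beta = {"price": -0.1, "seats": 0.05, "seats_x_size": 0.04}

five_seater = {"price": 20.0, "seats": 5}
seven_seater = {"price": 24.0, "seats": 7}
single = {"size": 1}
family = {"size": 5}

for label, hh in (("single", single), ("family", family)):
    u5 = utility(five_seater, hh, beta)
    u7 = utility(seven_seater, hh, beta)
    print(label, round(u5, 2), round(u7, 2))
```

In this sketch the marginal utility of an extra seat is beta["seats"] + beta["seats_x_size"] * size, so it rises with household size: the single-person household ranks the cheaper five-seater first, while the five-person household ranks the seven-seater first.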
Variables included in the final specification of the four principal aggregate level studies are
presented in Table A2.1. In each case, data on each vehicle type relates to the base model and all
studies specified vehicle price as the retail (‘list’) price, since actual transaction prices were not
available.
Table A2.1: Car attributes specification in aggregate models
Berry et al. (1995):
• Constant
• Log of the ratio of horsepower to weight
• Standard air conditioning*
• Miles per $ of gasoline
• Log of the size (length × width)
• Price: the effect of price is measured through the logarithm of average population income minus
price: ln(y-p)
Berry et al. (1999):
• Constant
• Ratio of horsepower to weight
• Standard air conditioning*
• Miles per $ of gasoline
• Size (length × width)
• Price: the effect of price is measured through the ratio of price to average population
income: p/y
Petrin (2002):
• Constant
• Ratio of horsepower to weight
• Standard air conditioning*
• Miles per $ of gasoline
• Front wheel drive*
• Minivan*
• Station wagon*
• Sport utility vehicle*
• Full-size van*
• Percentage change in GNP
• Price: the price coefficient is interacted with a dummy for three income groups to measure
non-constant marginal utility of income
Berry et al. (2004):
• Constant
• Ratio of horsepower to weight
• Miles per $ of gasoline
• No. seats
• No. power accessories
• Safety (sum of airbag and anti-lock brakes)
• All wheel drive*
• Sport car*
• Minivan*
• Sport utility vehicle*
• Payload × Sport utility
• Payload × Pickup
• Full-size van*
• Chrysler*
• Ford*
• GM*
• Honda*
• Nissan*
• Toyota*
• Small Asian car*
• European car*
• Price: price is interacted with individual-level variables (see Table A2.2)
Notes: All characteristics are random parameters. Dummies are marked with an asterisk.
15 For a review of the parameters used in the utility function of disaggregate studies, see Bunch and Chen (2007).
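The two income-dependent price terms listed in Table A2.1, ln(y-p) and p/y, imply different patterns of price sensitivity across income levels. The sketch below evaluates the marginal effect of price on utility under each form by finite differences. The coefficient alpha, the sign convention on the p/y term, and the income and price values are assumptions made for illustration only.

```python
import math

def price_term_log(alpha, income, price):
    """alpha * ln(y - p): the log form listed for Berry et al. (1995)."""
    return alpha * math.log(income - price)

def price_term_ratio(alpha, income, price):
    """-alpha * p / y: the ratio form listed for Berry et al. (1999); sign assumed."""
    return -alpha * price / income

def marginal_effect(term, alpha, income, price, h=1e-4):
    """Central finite-difference derivative of a price term with respect to price."""
    return (term(alpha, income, price + h) - term(alpha, income, price - h)) / (2 * h)

alpha = 1.0      # hypothetical coefficient
price = 15.0     # hypothetical price, in thousands
for income in (30.0, 60.0):  # hypothetical incomes, in thousands
    d_log = marginal_effect(price_term_log, alpha, income, price)
    d_ratio = marginal_effect(price_term_ratio, alpha, income, price)
    print(income, round(d_log, 4), round(d_ratio, 4))
```

Analytically the derivatives are -alpha/(y-p) and -alpha/y respectively, so in both forms a given price increase reduces utility by less for higher-income consumers.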
In addition to the variables reported above, Petrin (2002) and Berry et al. (2004) complete their
specification with interactions between car attributes and average household characteristics. In
particular, the study of Petrin (2002), which focuses on the introduction of a minivan class of
vehicle, interacts household size with a dummy variable that signals vehicle type (minivans, station
wagons, full-size passenger vans and sport-utility vehicles). Berry et al. (2004) on the other hand
use a much more complex pattern of interaction terms (with dual and triple interactions), which
are reported in Table A2.2.
As discussed in Section A2.1, Berry et al. (2004) account for potential price endogeneity by using an
instrumental variable procedure. In this particular case, valid instruments are variables which are
related to the market price, but which are not functionally related to unobserved attributes,
such as prestige. From this perspective, the literature suggests that instruments for the price of a
particular car model include the attributes of the product itself and the characteristics of all other
products in the same market segment. Indeed, the attribute levels of the car model will be
reflected in production costs, so that the price charged by the producer will be directly linked to
the attributes of the car. However, there is no a priori reason to conceive a relationship between
the unobserved characteristics of the model and its measured attributes.
Table A2.2: Interaction terms in Berry et al. (2004)
Variable 1                     Variable 2                       Variable 3
Price                          Constant
Price                          Income                           Income is lower than the 75th percentile*
Price                          Income                           Income is larger than the 75th percentile*
Price                          Family size
Minivan*                       Kids (<16 y.o.)*
No. passenger seats            No. adults
No. passenger seats            Family size
No. passenger seats            Age of household head
Ratio horsepower to weight     Age of household head
No. power accessories          Age of household head
No. power accessories          Age of household head squared
Payload                        Pickup*                          Age of household head
Payload                        Pickup*                          Rural environment*
Safety                         Age of household head
Sport-utility vehicle*         Age of household head
Sport-utility vehicle*         Rural environment*
All wheel drive*               Rural environment*
Constant                       Total income
Constant                       Family size
Constant                       Number of adults
Note: Dummy variables are signalled with an asterisk.
Additionally, the relationship between the price of a car model and the attributes of the models
marketed in the same segment can be explained by the oligopolistic market structure. Indeed, such
a structure implies that products in the same market segment will have similar mark-ups. In turn,
because of this particular pricing equilibrium, the price of a specific vehicle is likely to be
correlated with the attribute level and cost elements of other products in the same market
segment. On the contrary, there is no a priori reason for these variables to be related to the
unobserved characteristics of the product considered.
According to these arguments, the aggregate studies cited above all use some combinations of the
attributes of the product itself and the characteristics of all other products in the same market
segment.
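A minimal sketch of how such instruments might be assembled is given below: for each product, its own attributes plus the sum of the same attributes over all other products in its market segment. Using sums over rivals is an assumption made here for illustration (the text only states that some combination of own and rival characteristics is used), and the product records are hypothetical.

```python
def segment_instruments(products, attributes):
    """Own attributes plus summed attributes of rival products in the same segment."""
    totals = {}
    for p in products:
        seg = totals.setdefault(p["segment"], {a: 0.0 for a in attributes})
        for a in attributes:
            seg[a] += p[a]
    instruments = {}
    for p in products:
        row = {}
        for a in attributes:
            row["own_" + a] = p[a]
            # segment total minus the product's own value = rivals' sum
            row["rivals_" + a] = totals[p["segment"]][a] - p[a]
        instruments[p["model"]] = row
    return instruments

# Hypothetical products and attribute values
products = [
    {"model": "A", "segment": "compact", "hp_weight": 0.4, "size": 1.2},
    {"model": "B", "segment": "compact", "hp_weight": 0.5, "size": 1.3},
    {"model": "C", "segment": "luxury", "hp_weight": 0.8, "size": 1.6},
]
z = segment_instruments(products, ["hp_weight", "size"])
for model, row in z.items():
    print(model, row)
```

A product alone in its segment (model C here) has rival sums of zero, so in practice the segment definition determines how much variation the rival-based instruments provide.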
A2.3 Elasticity estimates
This final sub-section of the literature review focuses on a selection of empirical evidence
pertaining to elasticity estimates generated by aggregate level vehicle choice models16. In this
regard, not all published studies explicitly report elasticity estimates and no such data is available
for the UK car market. Indeed only Berry et al. (1995) and Berry et al. (2004) report elasticity
estimates, based on US car market data. The following is intended for illustrative purposes only,
rather than to allow for meaningful comparison with the results of this study. Foremost, the
elasticity estimates from the two Berry et al. studies pertain to the US car market, rather than that
of the UK. Moreover, both studies’ primary focus is substitution patterns between vehicle types,
rather than precise elasticity estimates.
16 See Hensher (1992) for an example of empirical results from a disaggregate discrete choice model.
Generally, this study suggests less price sensitivity than the aggregate models detailed in Section A2.2. Several
modelling differences might explain this, notably the fact that Hensher’s analysis explicitly accounts for the
pre-existing vehicle fleet. Additionally, Hensher notes that the model parameters have to be calibrated to
reproduce the market shares before the elasticity can be computed, a step which is missing from the
aggregate studies summarised.
Berry et al. (1995)
The findings presented in Berry et al. (1995) include three different functional forms: a simple MNL
(producing restrictive substitution patterns, as outlined in Section 2.1.2), an MNL instrumental
variable specification (with the same substitution patterns, but accounting for the possible
endogeneity of the price variable), and an RPL. The RPL specification allows substitution
patterns to depend on the characteristics of each product and accommodates the data on income. In
the simple MNL specification, the results display a number of anomalies, the most undesirable of
which is that two thirds of the products are estimated to be price inelastic. This finding is
inconsistent with the assumption of profit maximising price setting.17 When applying the MNL
specification with instrumental variables (to account for the potential endogeneity of the price) the
results are more theoretically consistent, as all attributes are found to yield positive marginal
utility and only 22 out of 2,217 models included in the analysis are estimated to have inelastic
demand.
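To see why the estimated price coefficient drives the count of inelastic products in an MNL model, recall that with mean utility linear in price (coefficient -alpha) the own-price elasticity of product j is -alpha * p_j * (1 - s_j). The sketch below applies this formula to three hypothetical products under a small and a larger value of alpha. The prices, quality indices and alpha values are illustrative assumptions, not estimates from Berry et al. (1995).

```python
import math

def logit_shares(deltas):
    """Logit market shares with an outside good whose mean utility is normalised to 0."""
    expd = [math.exp(d) for d in deltas]
    denom = 1.0 + sum(expd)
    return [e / denom for e in expd]

def own_price_elasticities(quality, prices, alpha):
    """Own-price elasticities -alpha * p_j * (1 - s_j) for a simple MNL model."""
    deltas = [q - alpha * p for q, p in zip(quality, prices)]
    shares = logit_shares(deltas)
    return [-alpha * p * (1.0 - s) for p, s in zip(prices, shares)]

quality = [0.0, 0.5, 1.0]   # hypothetical non-price mean utilities
prices = [6.0, 12.0, 30.0]  # hypothetical prices, in thousands

low = own_price_elasticities(quality, prices, alpha=0.1)
high = own_price_elasticities(quality, prices, alpha=0.3)
print([round(e, 2) for e in low])
print([round(e, 2) for e in high])
print(sum(abs(e) < 1 for e in low), sum(abs(e) < 1 for e in high))
```

With the smaller coefficient the two cheaper products come out as price inelastic, while with the larger coefficient all three are elastic. If ignoring price endogeneity understates the magnitude of alpha, this is one mechanical route by which the simple MNL could produce many inelastic products and the instrumental variable version far fewer.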
With the RPL specification it is found that each attribute has either a statistically significant
positive effect on mean utility or on the variance of utility. This implies that an increase in any of
the characteristics of a given model (except price) will tend to, on average, increase its market
share, all else equal. Additionally every model is found to have an elastic demand, i.e. the
own-price elasticity is larger than 1 in absolute terms. Unambiguously, the RPL specification yields the
most theoretically consistent results and the richest possibilities for the substitution patterns. The
magnitude of the impact of a change in attributes is summarised by the demand elasticities for
each product. Table A2.3 reports the elasticity estimates for attributes (including price) of
selected vehicle models. One of the interesting findings of Berry et al., with regards to informing
policy, is that consumers who purchase cars with high fuel efficiency (expressed in terms of ‘miles
per dollar’) are indeed sensitive to fuel economy, i.e. the elasticity with respect to this attribute is
relatively large (note that model fuel efficiency is not shown in Table A2.3). However, the
elasticity decreases for vehicles with low fuel efficiency. Indeed, for a low efficiency car, the
elasticity of demand eventually becomes negative (albeit close to zero), which according to the
authors signals that consumers purchasing these cars are not concerned with marginal changes in
fuel efficiency.
Table A2.3: Demand elasticities with respect to vehicle attributes and price (selected models)
Model            Horsepower/Weight   Standard AC   Miles per $         Size (width x length)   Price
                 (Acceleration)      (Luxury)      (Fuel efficiency)   (Quality of ride)
Mazda 323        0.46                0.00          1.01                1.34                    -6.36
Ford Escort      0.45                0.00          1.13                1.18                    -6.03
Honda Accord     0.28                0.00          0.13                0.87                    -4.80
Nissan Maxima    0.32                0.40          -0.14               0.93                    -4.85
Lexus LS400      0.07                0.04          -0.01               0.15                    -3.09
BMW 735i         0.06                0.01          -0.02               0.17                    -3.52
Notes: These figures are selected from Table V of Berry et al. (1995). For example, a 10% increase in the
Horsepower/Weight ratio of the Mazda 323 would increase its market share by roughly 4.6%.
17 A price elasticity greater than one (in absolute terms) is the condition for an oligopolistic producer to
operate on the market since it implies a non-negative marginal revenue.
A further finding of Berry et al. (1995) is that models with numerous substitutes (typically in the
compact and sub-compact market segments) have the highest price elasticities. Intuitively, this
represents the large substitution possibilities that these models offer. To illustrate this finding
further, Table A2.4 reports the price and cross-price semi-elasticities18 for a $1,000 increase in
price. For example, a $1,000 increase in the price of the Ford Escort model would decrease its
market share by an estimated 106.5 percent, but would increase that of the Mazda 323 by 8.9
percent; however, the market share of the BMW 735i would not be affected. From Table A2.4 it is
also clear that cross-price semi-elasticities are larger for cars with similar characteristics. Note also
that these are lower for cars that are sold at higher prices.
Table A2.4: Own- and cross-price semi-elasticities for a $1,000 price increase (selected models)
                 Mazda 323   Ford Escort   Honda Accord   Nissan Maxima   Lexus LS400   BMW 735i
Mazda 323        -125.93     8.95          2.19           0.06            0             0
Ford Escort      0.71        -106.5        2.3            0.08            0             0
Honda Accord     0.12        1.59          -51.64         0.31            0.03          0.01
Nissan Maxima    0.01        0.24          1.29           -35.38          0.12          0.02
Lexus LS400      0           0.02          0.3            0.28            -11.2         0.09
BMW 735i         0           0.01          0.2            0.19            0.34          -9.38
Notes: These figures are selected from Table VI of Berry et al. (1995). The table reads as follows: if rows are indexed by i
and columns by j, cell (i,j) gives the percentage change in the market share of model i, when the price of model j increases
by $1,000.
Table A2.4 however displays some apparent anomalies. Indeed, it is very surprising to see the
market share of the Mazda 323 drop to zero when its price increases by $1,000. This may explain
why the authors are only concerned with the general substitution patterns between cars generated
by their model: as the price changes, consumers will be looking at the closest substitute on the
market. Because the Mazda 323 had a list price of $5,049 and is positioned as the cheapest model
in a market segment loaded with possible substitutes (all costing less than $5,800), a strong price
effect is, however, not surprising.
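The link between the semi-elasticities in Table A2.4 and the elasticities in Table A2.3 can be checked with a short calculation. A semi-elasticity expressed as a percentage change in share per $1,000 converts to an elasticity when multiplied by the price in thousands of dollars and divided by 100. The function name below is hypothetical; the inputs are the Mazda 323 figures quoted in the text and tables.

```python
def elasticity_from_semi(semi_pct_per_1000, price_dollars):
    """Convert a semi-elasticity (% share change per $1,000) into a price elasticity."""
    return (semi_pct_per_1000 / 100.0) * (price_dollars / 1000.0)

# Mazda 323: own-price semi-elasticity from Table A2.4, list price from the text
mazda_elasticity = elasticity_from_semi(-125.93, 5049)
print(round(mazda_elasticity, 2))
```

The result is close to the own-price elasticity of -6.36 reported for the Mazda 323 in Table A2.3. One possible reading, which is an interpretation rather than a statement from the source, is that the semi-elasticities are local slopes scaled up to a $1,000 change, so values below -100 percent reflect linear extrapolation of a marginal effect rather than a literal prediction of market share.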
The final result of importance is the substitution pattern with the ‘outside good’. This gives the
proportion of consumers who would react to a price increase by not purchasing a new car rather
than choosing an alternative vehicle. It is found that consumers of lower priced cars are more likely
not to buy any product if the price changes (around 30 percent for the Mazda 323 against 10
percent for the BMW 735i). However, Berry et al. suggest that the reported findings are greater
than what would be expected, highlighting the importance of the interpretation of the outside
good. Indeed, for many consumers the outside good represents alternatives more complex to
model, like keeping their existing car or purchasing a used car.
Berry et al. (2004)
Results reported in Berry et al. (2004) are less detailed than the 1995 study, but again demonstrate
that the more complete the specification, the more realistic the substitution patterns obtained.
Table A2.5 provides some examples of substitution patterns and price semi-elasticities (for a ‘small
increase’ of the price). These highlight the finding that when the price of a car model increases,
only the more price-sensitive fraction of households would substitute away from that vehicle.
18 In this case, the semi-elasticities measure the percent change of market share for a price increase measured in absolute
terms. By contrast, the price elasticity relates the percent change of the market share with a percentage change of price.
For example, a small increase in the price of the Ford Escort would lead the largest fraction of
consumers to substitute to the Ford Tempo (8.2%), and the second largest fraction of consumers to
choose the Chevrolet Cavalier (7.3%). Hence these two cars alone account for roughly 15% of those
consumers who would substitute away from the Ford Escort if its price rose. Finally, a fraction of
consumers substituting away from the Ford Escort (6.6%) would react by not purchasing a car at all.
Table A2.5: Substitution patterns and fraction of consumers affected
Vehicle          Semi-elasticity   1st best substitute   % Movers   2nd best substitute   % Movers   % To outside
Nissan Metro     -1.77             Toyota Tercel         15.0       Ford Festiva          10.6       18.0
Ford Escort      -4.02             Ford Tempo            8.2        Chevrolet Cavalier    7.3        6.6
Honda Accord     -3.92             Toyota Camry          8.6        Honda Civic           4.5        5.1
Lexus LS 400     -3.43             Mercedes 300          8.0        Lincoln Town Car      6.3        5.9
Toyota Pickup    -3.34             Chevy Pickup          43.5       Ford Pickup           13.6       6.0
Notes: These figures are selected from Table 9 of Berry et al. (2004). The ‘Movers’ columns show the percentage of
consumers substituting away from the vehicle whose price increased that are predicted to choose the ‘first best’ or ‘second
best’ substitute. The ‘To outside’ column shows the percentage of consumers substituting away from the vehicle whose
price increased that are predicted not to buy a car at all.
Berry et al. also find that out-of-sample prediction, such as assessing the impact of introducing
a new product, generally generates convincing results. As an example, they simulate the impact of
the introduction of a high-end Mercedes SUV and find plausible response patterns. Specifically, the
price of the new model was simulated with a hedonic regression and substitution patterns are
reproduced in Table A2.6.
Table A2.6: Substitution patterns simulated for a new car model
Model               Price     Old Share   New Share
New Mercedes SUV    33,659    0.000       0.076
Ford Explorer       24,274    0.252       0.237
Chevy Pickup        22,651    0.111       0.107
Toyota Pickup       25,548    0.038       0.035
Luxury cars         -         0.161       0.157
Notes: These figures are selected from Table 12 of Berry et al. (2004). Luxury cars are models priced above $30,000.
Some additional caveats apply:
• The elasticity numbers reported apply only to the fraction of the population who actually
purchased that model of car, making it difficult to extrapolate to potential buyers. This is,
however, an artefact of any statistical analysis using observed behaviour data;
• The market as a whole is likely to be less sensitive to price changes than these figures suggest.
Indeed, a large fraction of cars are bought by companies which may be less price-sensitive than
households. Additionally, this effect might be more pronounced in the upper end of the market
where individual consumers are relatively less important.