DYNAMIC DEMAND MODELLING
AND PRICING DECISION SUPPORT
SYSTEMS FOR PETROLEUM
A thesis submitted to the University of Manchester
for the degree of Doctor of Philosophy
in the Faculty of Engineering and Physical Sciences
2014
By
David Fox
School of Computer Science
Contents
Abstract
Declaration
Copyright
Acknowledgements
1 Introduction
1.1 Research Objectives
1.2 Summary of the Main Contributions of this Research
1.3 Thesis Structure
2 Pricing Decision Support Systems
2.1 Pricing Decision Support Systems for retail
2.2 Issues with current methods
2.3 Summary
3 Forecasting demand
3.1 Introduction
3.2 Dynamic Linear Models
3.2.1 Using bounded DLMs to forecast sales
3.2.2 Multicollinearity
3.2.3 Algorithm to forecast sales
3.3 Summary
4 Forecasting competitor prices
4.1 Introduction
4.2 Competitor price forecasting
4.3 SVR theory
4.4 SVR algorithm to predict competitor prices
4.5 Summary
5 Price Optimisation
5.1 Introduction
5.2 Optimisation under competitive uncertainty
5.3 Optimisation using forecasted competitor prices
5.3.1 Exhaustive Game-Tree Search
5.4 Complete Optimisation
5.5 Summary
6 Case Study: Retail Vehicle Fuel Industry
6.1 Background
6.1.1 Retail Fuel Industry
6.1.2 Pricing Strategies
6.1.3 Collusion in Pricing
6.1.4 Pricing Decision Support Systems in the Retail fuel Industry
6.2 Testing bounded DLM demand model
6.3 Testing SVR competitor price forecasting model
6.4 Testing greatest guaranteed profit optimisation
6.5 Testing Game-Tree Optimisation
6.6 Summary
7 Conclusions and Future Work
7.1 Summary and Main Contributions
7.2 Future Work
8 Appendix
8.1 Basic Assumption
8.2 Two Learning Methods
8.3 Comparing L_LS(X, θ_LS) and L_BLS(X, θ_BLS)
Bibliography
Word Count: 29647
List of Tables
6.1 Example of daily sales data with three competitors
6.2 Comparison of DLM method with [MJL11] method
List of Figures
5.1 Own Price options
5.2 Competitor Price response
5.3 A game-tree
5.4 Game-tree optimisation
5.5 Complete price optimiser
6.1 Actual and predicted sales (in litres) at site 1
6.2 Actual and predicted sales (in litres) at site 2
6.3 Correlogram of the daily error at site 1
6.4 Correlogram of the daily error at site 2
6.5 Plot of the constrained daily movements of own price elasticity
6.6 Denmark box-plots of the Mean Absolute Error
6.7 USA box-plots of the Mean Absolute Error
6.8 Optimisation of own price to obtain highest guaranteed profit
6.9 Actual price data
6.10 Increased profit for Denmark own sites
6.11 Increased profit for USA own sites
Abstract
Pricing decision support systems have been developed in order to help retail companies optimise the prices they set when selling their goods and services. This
research aims to enhance the essential forecasting and optimisation techniques
that underlie these systems. This is first done by applying the method of Dynamic Linear Models in order to provide sales forecasts of a higher accuracy
compared with current methods. Secondly, the method of Support Vector Regression is used to forecast future competitor prices. This new technique aims
to produce forecasts of greater accuracy compared with the assumption currently
used in pricing decision support systems that each competitor’s price will simply remain unchanged. Thirdly, when competitor prices aren’t forecasted, a new
pricing optimisation technique is presented which provides the highest guaranteed
profit. Existing pricing decision support systems optimise price assuming that
competitor prices will remain unchanged but this optimisation can’t be trusted
since competitor prices are never actually forecasted. Finally, when competitor
prices are forecasted, an exhaustive search of a game-tree is presented as a new
way to optimise a retailer’s price. This optimisation incorporates future competitor price moves, something which is vital when analysing the success of a pricing
strategy but is absent from current pricing decision support systems. Each approach is applied to the forecasting and optimisation of daily retail vehicle fuel
pricing using real commercial data, showing the improved results in each case.
Declaration
No portion of the work referred to in this thesis has been
submitted in support of an application for another degree
or qualification of this or any other university or other
institute of learning.
Copyright
i. The author of this thesis (including any appendices and/or schedules to
this thesis) owns certain copyright or related rights in it (the “Copyright”)
and s/he has given The University of Manchester certain rights to use such
Copyright, including for administrative purposes.
ii. Copies of this thesis, either in full or in extracts and whether in hard or
electronic copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988 (as amended) and regulations issued under it
or, where appropriate, in accordance with licensing agreements which the
University has from time to time. This page must form part of any such
copies made.
iii. The ownership of certain Copyright, patents, designs, trade marks and other
intellectual property (the “Intellectual Property”) and any reproductions of
copyright works in the thesis, for example graphs and tables (“Reproductions”), which may be described in this thesis, may not be owned by the
author and may be owned by third parties. Such Intellectual Property and
Reproductions cannot and must not be made available for use without the
prior written permission of the owner(s) of the relevant Intellectual Property
and/or Reproductions.
iv. Further information on the conditions under which disclosure, publication
and commercialisation of this thesis, the Copyright and any Intellectual
Property and/or Reproductions described in it may take place is available
in the University IP Policy (see http://documents.manchester.ac.uk/DocuInfo.aspx?DocID=487), in any relevant Thesis restriction declarations
deposited in the University Library, The University Library’s regulations
(see http://www.manchester.ac.uk/library/aboutus/regulations) and
in The University’s policy on presentation of Theses
Acknowledgements
I would firstly like to thank all of the researchers I have worked with at KSS
Fuels who have each helped me during different stages of my research. I must
also thank KSS Fuels and EPSRC for providing the funding which supported
this research. Finally I want to thank my supervisor Xiao-Jun Zeng who has
continuously provided the guidance and expert advice which has underpinned
this work.
List of Publications and Patents
related to this Thesis
1. A New Approach to Demand Modelling and Optimisation in Pricing Decision Support Systems. David Fox and Xiao-Jun Zeng. School
of Computer Science, University of Manchester, Manchester, UK. Systems,
Man, and Cybernetics (SMC), 2013 IEEE International Conference on.
IEEE, pages 74-79, 2013.
2. Patent: US Patent Application No 14/029432
Due to the commercial and confidential nature of this research, some results
reported in this thesis could not be fully published in this publication.
Chapter 1
Introduction
The most fundamental action of a retailer is deciding on price. In fact Singh states
in [Sin91], “Price is the only marketing mix element which generates revenue:
all others involve expenditure of funds”. The general pricing problem is that
selling your product at a higher price increases the profit for every unit sold, but
usually reduces the total volume of sales, which may in turn reduce overall profit.
However, selling at a lower price will reduce profit per unit sold but may increase
sales and overall profit, unless competitors follow this price decrease, causing sales
to return to their original level and reducing overall profits. All these issues compound
the price setting problem and explain why research has been undertaken to help
improve these pricing decisions.
Unlike in the case of perfect competition, retailers trading in a monopoly (a
market where there is only one supplier of a specific good or service) or an
oligopoly (a market where there are only a few suppliers of a specific good or
service) have market power when setting their price. They are able to
set their optimal price and dictate supply in order to increase their profit, as
shown for example in [Viv99], which presents a number of pricing models in an
oligopoly market. In order to price optimally, a retailer needs to investigate each
potential pricing option so that they can understand which price to implement.
The chosen price should then extract the most money a consumer is willing to
pay for the good or service. As stated in [Mit13], “Pricing has evolved from a
topic purely related to economics and academic research to a very practical and
powerful instrument to drive profitability in firms.” This has predominantly been
due to the successful implementation of many pricing decision support systems.
The underlying techniques used within these systems to optimise price are the
sole focus of this research. Hence the research problem tackled in this thesis
is to develop these pricing decision support systems so that they deliver greater
profit and become more valuable to retailers. This requires improving
both their forecasting accuracy and the methodology they use to decide upon
an optimum pricing strategy. Existing pricing decision support systems model
demand under the assumption that price effects remain static over the long-term.
They also do not use any modelling techniques to forecast competitor prices,
simply assuming that competitor prices will remain unchanged. The optimum
price is then found under these incorrect assumptions. Solving these issues will
be the objective of this research.
Such a project is clearly important for retailers since their success often depends fundamentally on making correct pricing decisions. Maximising the profits
of a company allows it to avoid cash-flow problems as far as possible, reinvest to grow the business, develop products and pay employees. Enabling retailers
to better understand the purchasing decisions of their customers will also allow
customers to exert influence on these pricing decisions. For example, if customers choose to be price sensitive and only buy the cheapest product available,
retailers will understand these buying patterns and will set their price as low as
possible. Using pricing decision support systems should also allow retailers to focus on developing great products since they will not need to spend as much time
setting prices, providing greater choice overall. For all these reasons, the success
of projects like this one trying to advance pricing decision support systems will
benefit the economy as a whole.
1.1 Research Objectives
The research in this thesis is focused upon the price optimisation of a non-perishable product sold at a single price per unit at any given time. In order
to develop the pricing decision support systems currently used to solve this pricing task, the objectives of this research are:
1. To create an improved demand model algorithm that has a superior forecast accuracy compared with techniques currently implemented to forecast
customer demand. These forecasts will be used to compare the impact of
different prices on overall sales. Completing this task is the focus of chapter
3.
2. To create an algorithm to predict the future price movements of competitors
with greater accuracy than is currently achieved. These prices will be
inserted into the demand model in order to improve the sales forecasts.
Completing this task is the focus of chapter 4.
3. To develop a superior algorithm to optimise a retailer’s price when competitor prices are not forecasted, compared with optimisation techniques
currently implemented to do this. Completing this task is the focus of
section 5.2.
4. To develop an algorithm which optimises a retailer’s price using both demand and competitor price model forecasts. Completing this task is the
focus of section 5.3.
After this has been achieved, all of the newly proposed theory will be combined
to construct a completed pricing decision support system. This task is the focus
of section 5.4. Each part will then be tested on the retail vehicle fuel industry, using real commercial data, with the results analysed and compared against
current methodology. Completing this task is the focus of chapter 6. Overall
this thesis aims to shine an academic light upon pricing decision support systems
used to solve this type of pricing problem, further proving their effectiveness and
demystifying some of the secrecy hampering their wider use, hopefully leading to
their greater deployment.
1.2 Summary of the Main Contributions of this Research
This project has focused upon pricing decision support systems and in particular
improving their demand forecasting accuracy, their competitor price forecasting
accuracy and the methodology they use to incorporate these forecasts into finding
an optimum pricing strategy. The main outcomes achieved are that:
• Taking a dynamic approach to the demand modelling problem by applying
a bounded Dynamic Linear Model algorithm provides sales forecasts of a
higher accuracy compared with current demand forecasting methods which
use a static modelling approach. It achieves this superiority by capturing
the short-term correlation in the sales data, which static approaches
fail to do, providing improved estimates of the price elasticities which
are integral to the overall optimisation. The bounded Dynamic Linear
Model algorithm proposed should therefore improve sales forecasts in a
wide variety of commercial sectors whilst providing much greater insight
into the price effects on demand for many companies.
• It’s clear that the profitability of one firm’s pricing strategy depends crucially on the response of its closest rivals. Therefore, when setting its own price, a retailer must predict
how its competitors would react. Despite the
importance of making correct competitor price predictions, techniques for
forecasting competitor prices are lacking from academic literature. Instead
the naive assumption that a competitor’s price will remain unchanged is
often used, as is the case in [MJL11]. This research is the first to address the competitor price forecasting problem explicitly and shows how
the systematic approach of using the method of Support Vector Regression
produces competitor price forecasts of greater accuracy compared with the
assumption currently used in pricing decision support systems that every
competitor’s price will simply remain unchanged.
• Current optimisers don’t change their advised price depending on how a
competitor is forecasted to react to a change in own price. This research
shows that this is incorrect and mathematically proves different optimal
prices exist. Furthermore, when competitor prices aren’t forecasted, a superior pricing optimisation technique is achieved when optimising the highest guaranteed profit instead of optimising a retailer’s price assuming that
competitor prices won’t react, as is currently done. This optimisation is the
first to explicitly incorporate the competitive nature of a market, providing
a superior pricing methodology.
• Current pricing decision support systems fail to incorporate future competitor price moves into their optimisation, something which is vital when
analysing the success of a pricing strategy. Hence the final contribution of
this research is to propose the first price optimisation methodology which
explicitly forecasts future competitor prices. This is achieved by completing
an exhaustive search of a Game-Tree to optimise a retailer’s price, producing increased profit compared with current price optimisation techniques.
It also enables a discrete price optimisation which is different to the continuous price optimisation currently used in pricing decision support systems
that is prone to rounding errors.
1.3 Thesis Structure
Chapter 2 provides a detailed background to pricing decision support systems
and the major milestones passed in applying these systems to different sectors.
The issues with the methods used in current systems are presented in great detail,
explaining the challenges which improved techniques will have to successfully deal
with. Chapter 3 then provides greater detail regarding the challenges faced when
forecasting demand. It then explains the method of Dynamic Linear Models and
how a bounded version of this technique can and should be applied as a new
technique to forecast demand. Chapter 4 explains the absence of any competitor
price forecasting in current pricing decision support systems whilst illustrating
the necessity of such a task. It then describes the required factors needed in
order to understand and forecast competitor pricing decisions. The method of
Support Vector Regression is then presented and a bounded version of this technique is proposed as a way to complete this task. Attention then turns to overall
price optimisation in chapter 5. First the background to the pricing problem is
described together with the current techniques used. A new optimisation to provide the highest guaranteed profit is then proposed for when competitor prices
aren’t forecasted. This theory is applied to explain why prices may not always
retain a continuous margin above cost even in a competitive market. This chapter
then incorporates the forecasting of competitor prices into the price optimisation
and explains the importance of doing so. Using an Exhaustive Game-Tree Search
to find the optimal pricing strategy is then explained in detail. This chapter
ends by combining all of the newly proposed techniques into a completed pricing
decision support system. Chapter 6 presents a highly detailed background to
the pricing problem faced by retailers in the retail vehicle fuel industry. It then
analyses the results and compares the advantages found when testing each of the
individual newly proposed techniques on this pricing problem against methodologies currently used in pricing decision support systems in this sector, using real
commercial data. Finally, chapter 7 gives a summary of the research presented
in this thesis.
Chapter 2
Pricing Decision Support Systems
The idea of using computers to help decision makers was first published
in [Bon63]. A system which incorporates this technology into a decision-making
process is known as a decision support system. [EK06] presents a recent survey of decision support system applications and there are many examples (e.g.
[GCRA10, LH12, SLW10]) of where decision support systems have been successfully implemented. The continual introduction of computerised technology into
new areas of the retail environment has clearly resulted in new opportunities
for retailers, as shown in [BGL94]. The explosion in the availability of data
and computing ability in retail management has led to a new desire to implement
pricing decision support systems which use statistical models to predict consumer
price response using historical information in order to optimise price. They have
been shown to be especially advantageous within the revenue management sector,
where product availability and price are optimised in order to maximise profits.
As stated in [CDV10], “Revenue management deals with maximizing revenue
for a fixed capacity of a product or service. It saves the capacity for the most
valuable customer by proper capacity allocation and constantly attempts to understand, anticipate, and then react to consumer behavior in order to maximize
revenue/profit.” The earliest notable success in this area was the yield management system implemented to strategically control inventory at American Airlines.
It was recognised as producing a staggering $1.4 billion over a three-year period at
the airline, as presented in [SLD92]. This example in particular sparked the growth
of this field. Revenue management and pricing models have advanced significantly
in recent decades, as shown in [EK03] and [BC03]. The major milestones in the
science of pricing and revenue management are identified in [CHC10]. Revenue
management has been applied or is being considered in many different industries including airlines, hotels, car rentals, casinos, restaurants, grocery chains,
golf courses, cruise lines, apartment rentals, sports, performing arts, media, etc.
(e.g., [GF08], [GFKS06], [Gu06], [Haw03], [HGF02], [Kim05], [KS02], [KWN02],
[Kuy02], [LD02], [Lip03], [Vin04]).
The initial revenue management successes were achieved for products with a
given expiration date, known as perishable goods. Once their expiration date has
passed, the product being sold loses its value and becomes worthless. Therefore
there is a time constraint before which these goods need to be sold. There are
also examples where revenue management has been applied to non-perishable
inventory. For example, Ford estimated that about $3 billion in additional profits
came from Revenue Management initiatives, as claimed in [Lei00]. As stated in
[CHC10], “The great public success of pricing and revenue management at Ford
solidified the ability of the discipline to address the revenue generation issues
of virtually any company.” Overall these systems enable firms to make superior
pricing decisions within a dynamic competitive environment, as shown in [CS01].
[TR04] and [CCX07] provide an overview of the revenue management and pricing
literature.
There are many companies in the field of pricing decision support systems in
the retail market including PROS, Zilliant, KSS Fuels, KSS Retail, Dunnhumby,
SAP, JDA and Oracle. This indicates that such systems are being adopted and
are ready for use. Despite this, as [Sim12] states, “Most companies still price based
on rules of thumb, cost-plus considerations, and gut feelings”. This lack of
widespread use is in part due to the implementation challenges of such systems,
as shown in [Mon04]. Another reason for this lack of adoption is that the yield
management systems which have been successfully used for years to sell a specific
type of perishable product are often either unusable or missing some important
pricing elements in order to optimise the price of other types of products, which
make up the vast majority of products sold by retailers. The reasons for this
are firstly that demand for these perishable products is affected by the time remaining until expiration but most products are not like this. Hence the demand
models used in currently implemented yield management systems can’t be applied to forecast demand for these other products. Secondly, yield management
systems are used to optimise the price of a fixed amount of goods sold within
each specified time-frame. This is different to the pricing optimisation problem
faced by most retailers where they can often produce a varying amount of product or instead choose to hold back some products in order to sell them in a more
profitable period. Another difference is the fact that yield management systems
almost always allow the price per unit to change at any moment. The truth is
that most retailers do not have this flexibility and instead can only change their
price at very specific times. Therefore although there have been small successes,
for example as stated previously in [Lei00] regarding Ford, much work is required
to show that the statement made in [CHC10] regarding the potential of using
pricing management systems in order to optimise the price of any product is in
fact true. New pricing systems need to be developed and successfully applied for
these remaining products in order to remove the dependence of most companies
on the human decision maker and to truly prove the worthiness of pricing decision support systems to address the revenue generation issues of virtually any
company.
From an academic perspective, the commercial nature of these systems hampers research into their advancement due to the secretive nature of these companies. This makes it difficult for the academic community to help work on
improving these systems. As [WvBS99] states, “A prime criterion for assessing a
decision support system’s usefulness is the quality of its forecasts or to what extent the forecasts can predict the actual outcome. There is little research on this
topic as companies are reluctant to give insight into their marketing practice.”
2.1 Pricing Decision Support Systems for retail
The general framework for pricing decision support systems was originally proposed in [Sin91], [CS01] and [SB93]. The systems firstly use historical price and
sales data in order to infer trends and relationships between variables, such as the
sensitivity of overall sales to own price. This knowledge is then used to optimise
profit with respect to own price under certain conditions, such as ensuring that
a given volume constraint is not forecast to be violated during the next optimisation period. The question being asked is essentially whether an increase or
decrease in a retailer’s own price will improve profit and if so, how much should
the price be changed in either direction. After the optimised price has been implemented, new data emerges. This will contain the actual implemented own price
and competitor prices, together with the achieved profit and sales figures. The
system uses this new information to update its previous inferences by comparing the actual values to those previously forecasted. Analysing this error allows
the system to understand whether its current inferences seem to be satisfactory
or whether they need to be perhaps significantly changed. Repeating the previous steps over time allows the system to continuously improve its learning and
decision making.
In order to predict how a change in price affects sales, a demand model is required. It infers the correlation by analysing the prices and sales figures observed
in historical data. Once fitted, it explicitly states the estimated relationship. By
assuming that this relationship will remain unchanged during the next time-step,
the fitted demand model can be used to forecast future sales given a set of future
prices. Without a demand model there would be no way a pricing decision support system would be able to forecast how price changes affect future sales and
therefore it would be impossible to optimise overall profit with respect to own
price. The relationship between price and sales would be unknown to a retailer
and they would be unable to fathom whether an increase in price would increase
their profit or whether the opposite would be true. Of course there are other
factors, often deterministic such as the day of the week and month of the year,
whose effect can be incorporated into a demand model when appropriate. These
models then need to be structured in such a way that they can be updated using
new data, in order to improve their inferences over time. In a competitive market
a retailer’s sales are always affected by their own price and also the prices of
their competitors. Therefore sales are a function of these input variables, dictating that the general mathematical formula of a demand model (considering only
price effects) is
\[
\mathit{Sales} = f(\mathit{OwnPrice}, \mathit{Competitor1Price}, \mathit{Competitor2Price}, \ldots) \tag{2.1}
\]
Discovering the underlying function is required in order to understand the effect
a change in any of the input variables would have on overall sales. Specifying a
model structure for this function and then fitting it to historical data allows us to
work out the coefficients which would have provided the best forecasts previously.
If we assume that these coefficients remain constant over the next time-step then
this allows us to use this inferred knowledge to predict future sales. Depending on
the choice of model used, the relationship inferred between the own price variable
and overall sales is known as the direct price elasticity of demand. This is a
measure of the percentage change in quantity demanded of a particular good as
a result of a one percent change in own price. The same is also true for each of
the competitor price effects. Here the relationship is known as the cross price
elasticity for each competitor. It measures the percentage change in the quantity
demanded of a particular good given a one percent change in the price of that
respective competitor. Since price effects can evolve over time, the coefficients
are updated using new data. Time can be seen as an implied factor within the
model. The theory behind demand modelling is explained in chapter 3.
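The elasticity definitions above can be illustrated numerically. The following is a minimal sketch and not the demand model developed in this thesis: it assumes a constant-elasticity (log-log) form for the function in (2.1), and the function names, the scale factor and the elasticity values are all hypothetical, chosen only for illustration.

```python
import math


def forecast_sales(own_price, competitor_prices, scale,
                   own_elasticity, cross_elasticities):
    """Assumed constant-elasticity demand (hypothetical form):
    Sales = scale * OwnPrice^e_own * product(CompetitorPrice_i^e_i)."""
    log_sales = math.log(scale) + own_elasticity * math.log(own_price)
    for price, elasticity in zip(competitor_prices, cross_elasticities):
        log_sales += elasticity * math.log(price)
    return math.exp(log_sales)


def percent_change(before, after):
    """Percentage change from `before` to `after`."""
    return 100.0 * (after - before) / before


# Hypothetical coefficients, not estimated from any real data.
SCALE = 10_000.0
OWN_ELASTICITY = -2.0            # direct price elasticity
CROSS_ELASTICITIES = [0.8, 0.5]  # one cross price elasticity per competitor

base = forecast_sales(1.30, [1.31, 1.29], SCALE,
                      OWN_ELASTICITY, CROSS_ELASTICITIES)

# Raise own price by one percent, holding competitor prices fixed.
own_up = forecast_sales(1.30 * 1.01, [1.31, 1.29], SCALE,
                        OWN_ELASTICITY, CROSS_ELASTICITIES)

# Raise the first competitor's price by one percent, holding own price fixed.
comp_up = forecast_sales(1.30, [1.31 * 1.01, 1.29], SCALE,
                         OWN_ELASTICITY, CROSS_ELASTICITIES)

own_effect = percent_change(base, own_up)     # close to the direct elasticity
cross_effect = percent_change(base, comp_up)  # close to the first cross elasticity

print(f"Sales change after a 1% own price rise: {own_effect:.2f}%")
print(f"Sales change after a 1% competitor price rise: {cross_effect:.2f}%")
```

Under this assumed form the coefficients can be read directly as elasticities, which is why a one percent price change moves forecast sales by approximately the corresponding coefficient, in percent.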
Competitor price effects on demand must also be considered. For example,
it is shown in [CHC09] that Intercontinental Hotels Group (IHG) recognized the
impact of competitor prices, noting that when a competitor changes their price,
the consumer’s perception of IHG’s price also changes, affecting sales. It is often
the case that due to competitive pressure, a change in own price may cause a
corresponding change in a competitor’s price. Therefore future competitor prices
may be affected by own price decisions. Hence demand is not only affected by a
retailer’s own price directly through the direct price elasticity, it is also affected
indirectly through the cross price elasticities since each competitor’s price may
be a function of own price. In fact this indirect effect may greatly dominate
the direct price elasticity effect. Understanding how a competitor will react to
an own price change is therefore very important. For example, an own price
increase may only be worthwhile if a competitor copies this change. Clearly it
is important to successfully predict these prices in order to price pro-actively
rather than reactively. Just like when attempting to forecast future sales, fitting
a model containing the required factors to historical data is the most obvious
way to forecast competitor prices. Since competitor prices can usually change
at any moment, having some sort of time factor as an explicit input variable in
the model is key. Successfully forecasting future competitor prices means that a
retailer can have greater knowledge regarding the implications of their actions on
their competitors’ future prices. Updating these price models as new competitor
data becomes available will hopefully allow us to quickly understand when new
price trends are occurring. The theory behind competitor price forecasting is
explained in chapter 4.
Clearly the overall goal of the pricing decision support system is to maximise
overall profit. This is achieved by combining all of the models and forecasts
from the previous stages in order to find the optimum price. Since profit is the
margin between price and cost multiplied by sales, this is found by maximising
the function
\[ \mathit{Profit} = (\mathit{OwnPrice} - \mathit{Cost}) \times \mathit{Sales} \tag{2.2} \]
where OwnPrice and Cost are per unit. Own price is often constrained to be
within specified price differentials from the other competitors. There is also
usually a sales constraint in order to ensure that at least a certain amount of
stock is forecasted to be sold. Clearly the success of the optimisation stage is
dependent on the accuracy of the competitor price and sales forecasts which are
fed in to it. When optimising own price over multiple t time periods or a range of
p products, the overall mathematical formula of the pricing optimisation problem
is therefore
\[ \mathit{Profit} = \sum_{t} \sum_{p} (\mathit{OwnPrice}_{t,p} - \mathit{Cost}_{t,p}) \times \mathit{Sales}_{t,p} \tag{2.3} \]
subject to the volume constraint
\[ \sum_{t} \sum_{p} \mathit{Sales}_{t,p} \geq \mathit{VolumeConstraint} \tag{2.4} \]
and also subject to the price differential constraint
\[ \mathit{LowerPriceConstraint}_{t,p} \leq \mathit{OwnPrice}_{t,p} \leq \mathit{UpperPriceConstraint}_{t,p} \tag{2.5} \]
It is possible that there may be instances where no own prices exist which are
forecasted to provide the stated minimum volume constraint. In these circumstances the pricing decision support system should make it clear to the user of
the system that this is the case and the own prices which provide the greatest
estimated volume should be advised, unless the user wishes to reduce the volume
constraint before re-running the price optimisation.
The advantage of optimising over multiple periods and multiple products
is that it may be preferable to achieve more profit (at lower volume) from one
product or time window and achieve greater volume (at lower profit) from another.
This may provide greater overall profit than optimising each product and
time window individually, whilst still satisfying the overall volume constraint. The
theory behind price optimisation is explained in chapter 5.
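The constrained problem described above can be sketched as an exhaustive search over allowed price points. This is only a minimal illustration under stated assumptions and not the optimisation method developed in chapter 5: the constant-elasticity `forecast_sales` function stands in for the demand model, the product dictionaries and all numbers are hypothetical, and a single time period is used (further periods could be treated as additional entries in the list).

```python
from itertools import product

def forecast_sales(price, base_sales, ref_price, elasticity):
    """Hypothetical constant-elasticity sales forecast, standing in for the demand model."""
    return base_sales * (price / ref_price) ** elasticity

def optimise_prices(products, min_volume):
    """Search every combination of allowed price points.

    Returns (prices, profit, volume, feasible). If no combination meets the
    volume constraint, the combination with the greatest forecast volume is
    returned with feasible=False so that the user can be warned.
    """
    best_feasible = None
    best_volume = None
    for prices in product(*(p["price_points"] for p in products)):
        sales = [forecast_sales(price, p["base_sales"], p["ref_price"], p["elasticity"])
                 for price, p in zip(prices, products)]
        profit = sum((price - p["cost"]) * s
                     for price, p, s in zip(prices, products, sales))
        volume = sum(sales)
        candidate = (prices, profit, volume)
        if volume >= min_volume and (best_feasible is None or profit > best_feasible[1]):
            best_feasible = candidate
        if best_volume is None or volume > best_volume[2]:
            best_volume = candidate
    if best_feasible is not None:
        return best_feasible + (True,)
    return best_volume + (False,)

# Hypothetical products: allowed price points, unit cost and demand parameters.
products = [
    {"price_points": [1.29, 1.31, 1.33], "cost": 1.20, "base_sales": 1000.0,
     "ref_price": 1.30, "elasticity": -3.0},
    {"price_points": [1.39, 1.41, 1.43], "cost": 1.28, "base_sales": 600.0,
     "ref_price": 1.40, "elasticity": -2.0},
]
prices, profit, volume, feasible = optimise_prices(products, min_volume=1550.0)
```

Calling `optimise_prices(products, min_volume=5000.0)` with an unattainable volume returns the lowest price points with `feasible` set to `False`, mirroring the advice to report the prices that give the greatest estimated volume.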
The main stages required in a pricing decision support system have been
presented above. Without forecasting demand a retailer would be unable to
predict the impact of a price change. Without forecasting competitor prices a
retailer would be unable to cope with the competitive nature of their market,
lacking a competitive edge and failing to understand the impact of price moves.
Without a final price optimisation stage a retailer would fail to utilise the model
predictions, being unable to discover the optimal own price which also satisfies
their constraints. We now briefly talk about existing work in each of these areas.
There has been a lot of previous work on demand estimation in modelling
sales of goods or services as a function of price. Examples include [BCM98] and
[AC09] who both model demand as the result of an arrival process (the number of customers entering a store) and a reservation price distribution for each
customer (with the customer only purchasing the product if the price is lower
than their reservation price). A Poisson distribution is used to model the arrival
process and a probability distribution is used to model the heterogeneity among
customers regarding the reservation price. [CP98] analyses the functional form
of demand models and provides evidence that non-linear demand models appear
superior to linear ones, as well as requiring less restrictive assumptions. [MR99]
and [Mon02] develop a Bayesian approach to incorporate prior information in
a hierarchical setting in order to combine the benefits of economic theory with
statistical shrinkage methods. They use a hierarchical model as a statistical procedure in which to utilise (or partially pool) information across similar brands or
units of aggregation such as stores or chains. The Bayesian approach allows them
to incorporate prior information into the initialisation of the model, allowing the
data to then modify the prior without the need of imposing exact restrictions
from economic theory that do not always hold exactly in reality. [BDR04] simultaneously model response to price, display, and feature promotion in a Bayesian
Hierarchical model.
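The arrival process and reservation price formulation mentioned above can be illustrated with a short simulation. This is a sketch under assumptions, not a reproduction of the cited models: the uniform reservation price distribution, the mean arrival rate and the prices are all hypothetical choices made for the example.

```python
import math
import random

def poisson_sample(rng, mean):
    """Draw a Poisson variate using Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    count, running_product = 0, rng.random()
    while running_product > threshold:
        count += 1
        running_product *= rng.random()
    return count

def simulate_sales(rng, price, mean_arrivals, reservation_low, reservation_high):
    """One period of demand: each arriving customer buys only if the price is
    below their reservation price, drawn here from a uniform distribution."""
    arrivals = poisson_sample(rng, mean_arrivals)
    return sum(1 for _ in range(arrivals)
               if price < rng.uniform(reservation_low, reservation_high))

rng = random.Random(0)
periods = 2000
avg_low_price = sum(simulate_sales(rng, 1.0, 20.0, 0.5, 2.5)
                    for _ in range(periods)) / periods
avg_high_price = sum(simulate_sales(rng, 2.0, 20.0, 0.5, 2.5)
                     for _ in range(periods)) / periods
```

With these assumed parameters, three quarters of arriving customers have a reservation price above 1.0 and one quarter above 2.0, so average sales per period are close to 15 and 5 respectively.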
Academic research into forecasting competitor prices is dominated by the
use of game theory, where competitors maximise their own profit. It is used
to rationalise future competitor decisions in a retail environment. This is done
by comparing the payoff from each of their different potential pricing options.
[ZTW12] and [GK08] show examples of where this game theoretic approach has
been applied. In reality the payoff of each competitor is almost always unknown.
However, forecasting the change in a competitor’s price due to external factors such
as changes in cost or own price, without knowing the payoff of each competitor,
is an approach which has not been implemented. This has led to the current
naive assumption that a competitor’s price will simply remain unchanged for the
foreseeable future. This assumption is used in the pricing decision support system
presented by [MJL11], for example.
Regarding overall price optimisation, the academic literature focuses upon
the solving of a differential game (see for example [DJLS00] and the references
therein). In game theory, differential games use differential equations to model
and analyse a conflict between different competing players. Each player aims to
optimise their strategy in the game with the success/payoff of the chosen strategy
being dependent on the choices made by all players. When each player makes the
best decision they can, taking into account the decisions of all the other players,
they are all in Nash equilibrium. In this scenario it is not worthwhile for any
player to change their strategy. By assuming that each player knows the optimum strategies for all the other players, current academic research on pricing
optimisation focuses upon finding the Nash equilibrium point and the conditions
required for its existence. [AP10] and [DCFN05] are examples of such an optimisation. [JKZ99] presents a survey of differential games applied to management
science. Other pricing decision support systems optimise price continuously, as
is the case in [MJL11] who use quadratic programming methods. They find the
optimal price as a continuous value to numerous decimal places, as the solution
to a profit maximisation problem under certain constraints.
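To make the Nash equilibrium condition concrete, the sketch below checks it by enumeration in a static, one-shot pricing game between two identical retailers. It is far simpler than the differential games in the cited literature, and the payoff function, cost and price points are hypothetical. It also assumes that each player's payoff is known, which is exactly the information that is rarely available in practice.

```python
PRICES = [1.0, 1.1, 1.2]
COST = 0.8

def profit(own_price, rival_price):
    """Hypothetical payoff: linear demand that falls with own price and rises with the rival's."""
    sales = max(0.0, 240.0 - 250.0 * own_price + 100.0 * rival_price)
    return (own_price - COST) * sales

def is_nash(price_a, price_b):
    """True if neither player can gain by unilaterally switching price point."""
    a_ok = all(profit(price_a, price_b) >= profit(alt, price_b) for alt in PRICES)
    b_ok = all(profit(price_b, price_a) >= profit(alt, price_a) for alt in PRICES)
    return a_ok and b_ok

# Enumerate every pair of price points and keep those that are equilibria.
equilibria = [(a, b) for a in PRICES for b in PRICES if is_nash(a, b)]
```

For this assumed payoff the only equilibrium is both retailers pricing at 1.1, since 1.1 is each retailer's best response to every rival price point.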
2.2 Issues with current methods
The main flaw with current academic research in demand modelling is shown
by [BZ09] who states that, “A critical assumption made in most academic studies of revenue management problems is that the functional relationship between
the mean demand rate and price, often referred to as the demand function or
demand curve, is known to the decision maker.” Obviously this assumption is almost never correct. In fact, as stated previously, finding this relationship is often
crucial to the success of a pricing decision support system. When this relationship
is explicitly investigated, prediction models often fail to grasp the dynamics of the
relationship between price and demand, as is the case in [MJL11]. This is because
the estimated price effects are almost always assumed to either remain constant
or to remain constant for long periods of time, when in reality price effects are
continuously evolving and changing. Models using this erroneous assumption are
therefore unable to correctly estimate these price effects, reducing their overall
forecast accuracy. Current methods also fail to simultaneously update their estimates for the price effects whilst dealing with the correlation between successive
periods in the sales data, caused by deterministic effects rather than price effects.
Hence a new technique which continuously updates evolving price effects and also
provides a higher forecast accuracy is required.
Using a game theoretic approach to competitor price forecasting requires
knowing the utility/payoff function of a competitor. In reality this is rarely
known and so this methodology is not applicable to the majority of retailers
when predicting the prices of their competitors. Rather than relying on such
unavailable information, what is required is a way of forecasting the change in a
competitor’s price due to external factors such as changes in cost or own price,
without knowing the payoff of each competitor. This separates the competitor
price forecasting stage from the decision strategy stage used to choose the optimal own price. Competitor prices are not currently explicitly forecasted; therefore
current pricing decision support systems almost always fail to take into account
how competitors will react, ignoring the competitive environment which faces
most companies. Clearly optimising own price without forecasting and taking
into account how a competitor will respond can be sub-optimal and will become
more erroneous over time. Since competitor pricing decisions will have a direct
effect on a retailer’s profit, these decisions need to be taken into account and the
forecasts greatly improved.
Optimising a retailer’s own price by solving a differential game in order to
find the Nash equilibrium is often unrealistic. The flaws with this methodology
are highlighted by McCauley in [McC09] where he states, “A brief journey into
academia quickly reveals that economists teach market equilibrium even while
the real world of economics experiences no stability. Standard microeconomic
theory is based on a deterministic model, called neoclassical economics, where
perfect knowledge of the infinite future is assumed on the part of all players.
That an equilibrium exists mathematically under totally unrealistic conditions
has been proven, but that the hypothetical equilibrium is stable (or computable)
or has anything at all to do with reality has never been demonstrated.” When a
retailer’s own price is instead optimised using quadratic programming techniques,
there are other issues. This is because retailers only allow a limited number of
decimal places when setting their actual price and often only price at specific
price points. Hence the advised price from the pricing decision support system
is rounded, perhaps incorrectly, to the nearest acceptable price point. This optimisation is therefore inefficient since it should only consider prices which can
actually be implemented. These quadratic programming techniques have also
been built without factoring in the forecasting of competitor prices. They are
currently unable to compute their optimisation using a competitor price model
rather than a static price which does not change over time or depending on the
own price chosen. Hence a new optimisation which incorporates these forecasts
and optimises price in a discrete way considering only the acceptable potential
own prices will further improve the overall decision making process. Therefore
improvements in all stages of these pricing systems are required in order to further
increase their effectiveness and profitability.
2.3 Summary
Due to the huge amounts of data and computing power now available to retailers,
pricing decision support systems have been built in order to use this data more
successfully when making pricing decisions. They have proved to be exceptionally
successful within the yield management industry, in particular when used to price
fares for commercial airlines. This success sparked a growth in their use throughout
other sectors. There are now many different companies in many different sectors
offering their expertise in this area. Despite their varied implementation, these
systems still require further enhancements in order for their use to be widespread.
Pricing decision support systems require three specific components. The first
one is the demand model and this is used to allow a retailer to forecast sales
given values for the input variables, which include the own price and competitor
prices. The next stage is the competitor price model which forecasts their future
prices. This needs to be able to forecast prices over time and is essential to ensure
that a retailer understands how an own price change will affect their competitors’
pricing decisions, critical in a competitive market. The final stage is the price
optimisation component. This is required in order to provide a framework through
which the demand model and competitor price model can be utilised. It finds the
optimum price under any volume or price constraints.
Despite all of the work undertaken to improve demand modelling techniques,
current methods often fail to grasp the dynamics of the relationship between price
and demand. This is firstly because the estimated price effects are almost always
assumed to either remain constant or to remain constant for long periods of
time, when in reality price effects are continuously evolving and changing. They
are also unable to simultaneously learn about price effects whilst attempting to
understand current underlying sales trends, reducing the accuracy of their model
estimates. Far less work has been completed on competitor price forecasting.
In fact the only work available in the academic literature has involved a game
theoretic approach to rationalise competitors’ decision making through understanding
their payoff function. Since this payoff is almost never known, it has led to the
current naive assumption that a competitor’s price will simply remain unchanged
for the foreseeable future. Therefore current systems almost always fail to take
into account how competitors will react, ignoring the competitive environment
which faces most companies. Regarding overall price optimisation, the academic
literature focuses upon the solving of a differential game. This is an approach
whose successful application seems unrealistic in practice. Quadratic programming
techniques have also been applied here but these also have major flaws, mainly
due to the fact that current implementations are incapable of incorporating
competitor forecasts into the optimisation.
Chapter 3
Forecasting demand
This chapter will explain the background behind building mathematical models and how they can be applied to forecast demand for a retailer. It will detail
current methodology and highlight the flawed assumptions underlying these techniques. The method of Dynamic Linear Models is then presented and it is shown
how a bounded version can be used and implemented to forecast demand, detailing its theoretical advantages over other methods.
3.1 Introduction
Much of applied econometric analysis begins with the following premise: y and
x are two variables, representing some population, and we are interested in “explaining y in terms of x,” or in “studying how y varies with changes in x.” A
simple equation relating y to x is
\[ y = \beta_0 + \beta_1 x + u \]
and is called the simple linear regression model, where the coefficients $\beta_0$ and $\beta_1$
are constants. In this model, y and x have several different names used interchangeably. For y, these include the dependent variable, the explained variable,
the response variable or the predicted variable. For x, these include the independent variable, the explanatory variable, the control variable, the predictor
variable, the regressor or the covariate. The variable u, called the error term or
disturbance in the relationship, represents factors other than x which affect y.
A simple regression analysis effectively treats all factors affecting y other than x
as being unobserved. You can usefully think of u as standing for ’unobserved’.
Of course the relationship relating y to x can take a much more complex form
and discovering this is fundamental to the success of the model. There can also
be more than one explanatory variable. In almost all cases there are many other
factors that simultaneously affect the dependent variable and adding these to our
model takes us into the area of multiple regression analysis.
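As a brief illustration of the simple linear regression model, the sketch below estimates $\beta_0$ and $\beta_1$ by ordinary least squares, one common fitting method that is discussed later in this chapter. The function name and the data are hypothetical: the observations are generated from an assumed line with small disturbances playing the role of u.

```python
def fit_simple_regression(x, y):
    """Ordinary least squares estimates of beta0 and beta1 in y = beta0 + beta1*x + u."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    beta1 = sxy / sxx
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1

# Hypothetical data generated from y = 50 - 20x plus small disturbances u.
x = [1.00, 1.05, 1.10, 1.15, 1.20]
u = [0.3, -0.2, 0.1, -0.4, 0.2]
y = [50.0 - 20.0 * xi + ui for xi, ui in zip(x, u)]
beta0, beta1 = fit_simple_regression(x, y)
```

Because the disturbances are unobserved, the estimates (about 50.88 and -20.8 here) differ slightly from the assumed values of 50 and -20 used to generate the data.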
We wish to use a regression model because it explicitly
states (by finding the optimum coefficients) the effect which each explanatory
variable has on the dependent variable. Understanding price effects on demand
is an important requirement of a demand model because bounding and controlling
the coefficients is the only way we can ensure that the demand model is making
reasonable sales forecasts. For example we need to be sure that a retailer’s sales
are predicted to decrease when its price increases and also that price effects
don’t change too dramatically from one time-step to the next. Other modelling
techniques do not explicitly state these effects, reducing our control over the
causal effects inferred and used to make future predictions. Therefore regression
models are used within pricing decision support systems as they can be used to
ensure the necessary supervision over the sales predictions.
We wish to model expected demand for a retailer over a given period of
time. Therefore our dependent variable y represents demand. The independent
variables represent all the factors which are believed to affect demand. The
regression model will then be fit to historical data, in order to find the effect of
each explanatory variable, allowing us to forecast future demand. Our major
problem here is that, outside of experiments, we cannot look at the
effect of changing one variable in a mathematical model while holding another
constant. In observational studies we do not control the predictor variables,
so within the data that generated the estimated regression coefficients, as one
variable increases, another variable may naturally change also. The regression
coefficients provide the combined effect of the two variables. Thus the data don’t
really give us a basis for discussing how changing one predictor variable affects the
response when other things are held constant. Despite these difficulties, predictive
models that account for all the predictor variables that could reasonably affect
the response (and account for them appropriately) are typically the closest we
can come to teasing out causation from these predictive models based on nonrandomized data, as explained in [CJBH11].
A time series is a sequence of data points, measured typically at successive
times, spaced at uniform time intervals. Sales data contains sales figures over
consecutive periods and thus represents a time series. Time series data have a
natural temporal ordering. This makes time series analysis distinct from other
common data analysis problems, in which there is no natural ordering of the
observations. A time series model will generally reflect the fact that observations
close together in time will be more closely related than observations further apart.
This may often be true in retail sales where consecutive periods would have
similar sales figures. For example, large sales in one period may be followed by
large sales in the next period. Time series models look at past patterns of data
and attempt to predict the future based upon the underlying patterns. There is
always tension between seeking a parsimonious model (so that fewer parameters
need to be estimated) while ensuring that important effects are not mistakenly
omitted from the model, as shown in [Cha04].
It is very difficult to forecast demand accurately, especially when a retailer
needs to forecast how demand is affected not only by its own price but also by its
competitors’ prices. The aim of the demand model is to understand the ceteris
paribus effect that changing any of the variables would have, for example the
effect of changing own price whilst keeping all the other factors in the demand
model constant. In general, a model to forecast sales, at time t, would take the
form,
\[ S_t = \alpha_{1,t} + \beta_{2,t} OP_t + \beta_{3,t} CP_{1,t} + \dots + \beta_{2+K,t} CP_{K,t} \tag{3.1} \]
where $S_t$ represents the expected sales volume in time period $t$, $OP_t$ is the time-weighted average own price during period $t$ and $CP_{j,t}$ is the time-weighted average
price of competitor $j$ during period $t$. $\alpha_{1,t}$ is the intercept coefficient at time $t$
and the coefficients $\beta_{2,t}, \dots, \beta_{2+K,t}$ show the effect that a one unit change in its
respective price variable would have on sales, whilst holding constant all the
other determinants of demand at time t. Linear price effects are assumed to be
acceptable for the small spectrum of prices for which the demand model is used
to make predictions, as long as the coefficients are regularly updated.
In truth there are many different models and methods to forecast demand but
the linear model (e.g. equation 3.1) is one of the simplest. An example of a more
complicated time series model is an autoregressive-moving-average model with
exogenous inputs. This class of models is formed by combining moving average¹,
autoregressive² and other exogenous³ variables into the chosen model. The insertion of lags of the forecast error and of the dependent variable is done to deal
with the correlation often present in time series data. The general autoregressive-moving-average model was described in the 1951 thesis of Peter Whittle entitled
“Hypothesis testing in time series analysis” and it was popularized by Box and
Jenkins in [BJ70]. Furthermore the simple linear model can also have further
price variables added to it in order to be able to derive a more complex relationship between price and demand. There are also methods which do not have a
specific underlying model but instead have a kind of ‘black box’ approach where
their inner workings are hidden in some way. These are typically machine learning techniques and include the methods of regression trees, neural networks and
support vector regression. The difficulty in using complicated time series or
machine learning methods is firstly that they often require large amounts of data
in order to provide accurate sales forecasts. Retailers often do not have a lot of
historical sales data and are sometimes unable to gather it, hence these forecasting methods are not appropriate for such retailers. Secondly, there is no need to use
complicated models or techniques to forecast sales since the effect of price changes
within a small region can often be approximated well with a simple linear model.
In fact these other methods may find specific price points which are forecasted
to provide extreme sales values but are in fact massively erroneous. This is due
to overfitting and generally occurs when a model is excessively complex, such as
having too many parameters relative to the number of observations. A model
which has been overfit will generally have poor predictive performance, as it can
exaggerate minor fluctuations in the data. This can be dealt with by bounding
the output but it is much harder to ensure that a demand model using these more
complicated techniques makes consistently realistic predictions (such as always
predicting that a retailer’s sales decrease when their price increases) than with
a linear regression model whose coefficients can be easily bounded to ensure this
consistency. Overfitting is also unlikely to occur when using the simple linear
model presented in equation 3.1 due to the reduced number of parameters.
¹ The insertion of lags of the forecast error.
² The insertion of lags of the dependent variable.
³ Any variable that is uncorrelated with the error term in the model of interest.
When using a model such as equation 3.1 to forecast sales over a specified time,
we are assuming that sales are achieved with a uniform distribution throughout
the optimisation period. Therefore if we were forecasting daily sales, the forecasted sales we would expect to achieve during any hour in the day would be the
same (i.e. the total forecasted sales divided by 24). Of course this assumption
may be wrong but if for example we only have historical aggregated daily sales,
there is no proof that the sales are not achieved uniformly throughout each
day. The reason for using average prices instead of actual prices is that prices
can often change at any moment. Therefore to input the actual prices into a
demand model may require numerous variables and become infeasible. On the
other hand, using the average price only requires a single input variable. Furthermore, understanding the effect of a single price variable is hard enough; trying
to do this for many more variables (as would be necessary when inputting multiple actual prices) reduces the confidence we can place in the model estimates and
the predictions it gives. Therefore average prices are commonly used in demand
models. Something to remember is that identical average prices will provide the
same sales forecast whether or not the actual prices and times at which they were
implemented were the same. Of course in an ideal world we would have high
frequency sales data which would allow us to forecast sales at small intervals,
requiring fewer actual prices to be combined into each average price. In theory
this should increase the accuracy of the estimated price effects and overall sales
forecasts.
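The time-weighted average price used as a model input can be computed as sketched below. This is an illustrative calculation only: the function name, the representation of price changes as (time, price) pairs and the example prices are assumptions made for the example.

```python
def time_weighted_average_price(price_changes, period_start, period_end):
    """Average price over [period_start, period_end), weighting each price by
    how long it was in force.

    price_changes is a list of (time, price) pairs sorted by time; the first
    entry must be at or before period_start.
    """
    total = 0.0
    for i, (start, price) in enumerate(price_changes):
        # Each price stays in force until the next change or the end of the period.
        end = price_changes[i + 1][0] if i + 1 < len(price_changes) else period_end
        overlap = min(end, period_end) - max(start, period_start)
        if overlap > 0:
            total += price * overlap
    return total / (period_end - period_start)

# Hypothetical day (hours 0 to 24): 1.30 until 09:00, 1.32 until 18:00, then 1.29.
changes = [(0.0, 1.30), (9.0, 1.32), (18.0, 1.29)]
avg = time_weighted_average_price(changes, 0.0, 24.0)
```

A price held at 1.305 for the whole day gives the same average as this schedule, and would therefore produce the same sales forecast, which illustrates the loss of information noted above.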
In our linear demand model (i.e. equation 3.1), $S_t$ is the dependent variable
and represents total sales. Looking, for example, at hotel reservations, the most popular price-dependent demand models are univariate linear and exponential models
(see [WK03]). Replacing $S_t$ with $\log(S_t)$ as the dependent variable transforms our
model into an exponential model and can be advantageous. According to [BC03],
exponential demand models are commonly used in the retail industry instead of
linear models due to the fact they provide a better fit. An example is the model
utilized by [SA98]. Taking logs of all explanatory variables in fact usually narrows
the range of each variable, in some cases by a considerable amount. This makes
estimates less sensitive to outlying (or extreme) observations on the dependent
variable, as explained in [Woo06], who also explains that it can mitigate, if
not eliminate, any heteroskedastic⁴ or skewed⁵ characteristics in the data. When
logs for all variables in equation 3.1 are taken, the demand model becomes,
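\[ \log(S_t) = \beta_{1,t} + \beta_{2,t} \log(OP_t) + \beta_{3,t} \log(CP_{1,t}) + \dots + \beta_{2+K,t} \log(CP_{K,t}) \tag{3.2} \]
Exponentiating both sides provides us with the demand model,
\[ S_t = \beta_{1,t} \times (OP_t)^{\beta_{2,t}} \times (CP_{1,t})^{\beta_{3,t}} \times \dots \times (CP_{K,t})^{\beta_{2+K,t}} \tag{3.3} \]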
where again $S_t$ represents the expected sales volume in time period $t$, $OP_t$ is the
average own price during period $t$ and $CP_{j,t}$ is the average price of competitor
$j$ during period $t$. $\beta_{1,t}$ is the intercept coefficient at time $t$ and the coefficients
$\beta_{2,t}, \dots, \beta_{2+K,t}$ are known as the elasticities at time $t$. Due to the construction of
this new model, when a small price change is considered these elasticities give the
percentage change in sales in response to a one percent change in its respective
price variable, whilst holding constant all the other determinants of demand. Our
aim is now to find the values of all these coefficients in order to find the most
accurate demand model. These estimated values will be shown to be pivotal in
optimising price (in chapter 5).
⁴ The variance of the error term, given the explanatory variables, is not constant.
⁵ The distribution of the data is not symmetric.
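The interpretation of the coefficients as elasticities can be checked numerically. The sketch below evaluates a constant-elasticity model with a single competitor before and after a one percent price change. The coefficient values and prices are hypothetical, and the multiplicative constant is taken to be the exponential of the log-scale intercept, which is an assumption about how the two intercepts relate.

```python
import math

def forecast_sales(own_price, competitor_price, log_intercept,
                   own_elasticity, cross_elasticity):
    """Constant-elasticity demand with one competitor, in the form of equation 3.2."""
    log_sales = (log_intercept
                 + own_elasticity * math.log(own_price)
                 + cross_elasticity * math.log(competitor_price))
    return math.exp(log_sales)

# Hypothetical coefficients: own price elasticity -2.0, cross price elasticity 1.5.
base = forecast_sales(1.30, 1.32, 7.0, -2.0, 1.5)
after_own_rise = forecast_sales(1.30 * 1.01, 1.32, 7.0, -2.0, 1.5)
after_rival_rise = forecast_sales(1.30, 1.32 * 1.01, 7.0, -2.0, 1.5)

# Percentage change in forecast sales for a 1% change in each price.
own_pct_change = 100.0 * (after_own_rise / base - 1.0)
cross_pct_change = 100.0 * (after_rival_rise / base - 1.0)
```

The resulting changes are close to, but not exactly, -2 percent and 1.5 percent, which is why the elasticity interpretation is stated for small price changes.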
The most common way of finding these coefficients is to use Least Squares regression and the variants thereof. In general this method chooses the coefficients
which minimize the sum of the squared errors (or some similar error statistic)
between the observed responses in an assigned training set and the responses
predicted by the model using those coefficients. The issue here is that there are
often many unrealistic assumptions needed to have confidence in the estimates
found using any Least Squares regression technique. When modelling demand,
the most regularly violated assumption is that the ‘error’ terms form an independent sequence. Violating this assumption can lead to a badly misspecified model and
poor forecasts, as explained in [BN71]. Least Squares also requires the incorrect assumption
that price elasticities remain constant over time. Hence even though there are
recursive versions of Least Squares regression, they fail to grasp the dynamics of
the coefficients when estimating their value. The same issues (i.e. the violation of
the same required assumptions) also arise when using Maximum-likelihood estimation (a method which selects the set of values of the model parameters that
maximizes the likelihood function) and its related techniques, the other common
way of finding estimated coefficients. Therefore this chapter will propose a new
way to find these coefficients, stating its advantages and suitability to finding the
required coefficients in our demand model (i.e. equation 3.3) whilst also showing
how the technique can be implemented.
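The consequence of assuming a constant elasticity can be seen in a small constructed example. The sketch below fits a least squares slope to log sales against log price for two hypothetical regimes with different elasticities, and then to the pooled data. The data are noise-free and invented purely to isolate the effect, and only a single price variable is used.

```python
import math

def ols_slope(x, y):
    """Least squares slope of y on x (with an intercept)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical noise-free data: the own price elasticity is -1.5 in the first
# regime and -2.5 in the second, with the same prices observed in each.
prices = [1.20, 1.25, 1.30, 1.35, 1.40]
log_prices = [math.log(p) for p in prices]
log_sales_early = [7.0 - 1.5 * lp for lp in log_prices]
log_sales_late = [7.0 - 2.5 * lp for lp in log_prices]

early = ols_slope(log_prices, log_sales_early)
late = ols_slope(log_prices, log_sales_late)
pooled = ols_slope(log_prices + log_prices, log_sales_early + log_sales_late)
```

The pooled estimate of -2.0 matches neither regime: a single constant coefficient averages over the change rather than tracking the most recent elasticity.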
3.2 Dynamic Linear Models
Using Dynamic Linear Models (DLMs) as a new approach to demand modelling provides numerous advantages. Firstly, unlike existing demand modelling
approaches which assume coefficients are constant, DLMs assume coefficients are
time-varying random variables, enabling the user to track their behaviour whilst
improving the handling of noisy data. Secondly they keep track of non-linear behaviour because their recursive feature allows the estimates for the coefficients to
continually update. Furthermore the recursive formula overcomes the numerical
issues with traditional regression techniques involving matrix inversion, which
would have arisen here due to the high collinearity between prices. DLMs also
have solid theoretical foundations in Bayesian methodology, providing increased
certainty in the estimates found. Finally, the recursive nature of DLMs means
that they update online and so there is no need to store vast amounts of historical
data as is often currently necessary. The theory behind this technique will now
be presented.
DLMs are a special case of a general state space model⁶, being linear and
Gaussian. State space models were proposed by Kalman in [Aka74, HS76], and
quickly became established in this field (see for example [Har89, WH97, Aok87]).
For DLMs, estimation and forecasting can be obtained recursively by the
Kalman filter, devised by [Kal60] and [KB63]. DLMs were first presented to solve
time series problems by [WHM85]. They use a Bayesian methodology to provide
posterior probability distributions for the coefficients in a model and the forecasts
the model provides. Examples of their successful implementation in forecasting
retail sales, without finding actual price elasticities, are [AHM10, AHM07, JR11].
[PPC09] summarizes the basic aspects of DLMs as follows:
• The observable process (Yt : t=1,2,...) is thought of as determined by a latent^7 process (θt : t=1,2,...), up to Gaussian random errors. If we knew the position of the target at successive time points, every Yt would be independent: what remains are only unpredictable measurement errors. Furthermore, the observation Yt depends only on the position θt of the target at time t.
• The latent process (θt) has fairly simple dynamics: θt does not depend on the entire past trajectory but only on the previous position θt−1, through a linear relationship, up to Gaussian random errors.
• Estimation and forecasting can be obtained sequentially, as new data becomes available.
^6 A model which considers a time series as the output of a dynamic system perturbed by random disturbances.
^7 Variables that are not directly observed but are rather inferred (through a mathematical model) from other variables that are observed (directly measured).
The assumptions of linearity and Gaussianity are specific to dynamic linear models,
but the dependence structure of the processes (Yt) and (θt) is part of the definition
of a general state space model. The assumption of linearity allows us to use the
DLM with regression models. The assumption of Normality is sensible in many
applications and can be justified by central limit theorem arguments.
Both Yt and θt are vectors since the system being modelled can be made up
of numerous observable and latent variables. The length of the Yt vector depends
upon the number of observable processes. If a DLM is used to predict a single
observable variable, Yt would have length 1. The number of latent processes
dictates the size of θt .
A DLM is specified by a Normal prior distribution for the p-dimensional state
vector at time t = 0,
\[ \theta_0 \sim N_p(m_0, C_0), \]
together with a pair of equations for each time t ≥ 1,
\[ Y_t = F_t \theta_t + v_t, \qquad v_t \sim N_m(0, V_t), \]
which is called the 'observation equation', and
\[ \theta_t = G_t \theta_{t-1} + w_t, \qquad w_t \sim N_p(0, W_t), \]
which is called the 'state equation' or 'system equation'. Explaining the remaining
variables in the two equations above:
• The matrix Ft expresses our belief about how the latent variables in the
current time-step combine to form the values seen in the observable variables.
• vt expresses our belief regarding the randomness between the latent variables and the observable variables in the current time-step.
• The matrix Gt expresses our belief about how the latent variables from the
previous time-step combine to update their values in the next time-step.
• wt expresses our belief regarding the random part of the evolution of each
latent variable in the system from the current time-step.
Furthermore it is assumed that θ0 is independent of (vt) and (wt). One can show
that a DLM satisfies the assumptions
1. (θt ) is a Markov chain
2. Conditionally on (θt ), every Yt is independent and Yt depends on θt only,
with Yt | θt ∼ N (Ft θt , Vt ) and θt | θt−1 ∼ N (Gt θt−1 , Wt ). These assumptions fit
perfectly with a regression model that is updated on-line as new data is received in
each time-step. Firstly, the unknown latent effects of the input variables (which
we attempt to infer) only depend on the value at their previous state before
being updated with new information. Secondly, the observed predicted variable
depends only upon the current latent effects of each input variable.
Now suppose we wish to update the coefficients in a DLM where
\[ \theta_{t-1} \mid y_{1:t-1} \sim N(m_{t-1}, C_{t-1}) \]
is normally distributed for some mean $m_{t-1}$ and variance matrix $C_{t-1}$, where $y_{1:t-1}$ is the observed data available up to time t−1. Then:
• The one-step-ahead predictive distribution of the state is $\theta_t \mid y_{1:t-1} \sim N(a_t, R_t)$, where
\[ a_t = G_t m_{t-1}, \qquad R_t = G_t C_{t-1} G_t' + W_t. \]
• The one-step-ahead predictive distribution of the observation is $y_t \mid y_{1:t-1} \sim N(f_t, Q_t)$, where
\[ f_t = F_t a_t, \qquad Q_t = F_t R_t F_t' + V_t. \]
• Finally, the filtering^8 distribution is $\theta_t \mid y_{1:t} \sim N(m_t, C_t)$, where
\[ m_t = a_t + R_t F_t' Q_t^{-1} e_t, \qquad C_t = R_t - R_t F_t' Q_t^{-1} F_t R_t, \]
and $e_t = Y_t - f_t$ is the forecast error.
The proofs of these distributions are given on pages 53-55 of [PPC09].
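As a rough illustration of the three recursions above, the following Python sketch performs one forecast-and-update step for the special case used later in this chapter: a single observed series (so Qt is a scalar) and Gt equal to the identity. The function name `dlm_step` and the list-of-lists matrix representation are assumptions made for this example only, and Ct−1 and Wt are assumed to be symmetric.

```python
def dlm_step(m_prev, C_prev, F, y, V, W):
    """One DLM step for a scalar observation with G_t = I (illustrative only).

    m_prev: list of p posterior means from time t-1
    C_prev, W: p x p symmetric matrices as lists of lists
    F: list of p regressors at time t; y: observed value; V: observation variance
    Returns the filtered mean m_t, filtered variance C_t, forecast f_t and Q_t.
    """
    p = len(m_prev)
    a = list(m_prev)  # a_t = G_t m_{t-1}, with G_t = I
    # R_t = G_t C_{t-1} G_t' + W_t, with G_t = I
    R = [[C_prev[i][j] + W[i][j] for j in range(p)] for i in range(p)]
    f = sum(F[i] * a[i] for i in range(p))  # f_t = F_t a_t
    RF = [sum(R[i][j] * F[j] for j in range(p)) for i in range(p)]  # R_t F_t'
    Q = sum(F[i] * RF[i] for i in range(p)) + V  # Q_t = F_t R_t F_t' + V_t
    e = y - f  # forecast error e_t
    m = [a[i] + RF[i] * e / Q for i in range(p)]  # m_t
    # C_t = R_t - R_t F_t' Q_t^{-1} F_t R_t (uses the symmetry of R_t)
    C = [[R[i][j] - RF[i] * RF[j] / Q for j in range(p)] for i in range(p)]
    return m, C, f, Q


# Toy usage: one latent level observed directly with unit noise.
m, C, f, Q = dlm_step([0.0], [[1.0]], [1.0], y=2.0, V=1.0, W=[[0.0]])
```

Because the observation is scalar, no matrix inverse is needed: dividing by Qt replaces multiplication by its inverse.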
The next section applies this method to the task of forecasting daily sales.
3.2.1
Using bounded DLMs to forecast sales
The existing DLM method cannot be used directly to forecast sales since the
movement of the estimated coefficients in the observation equation is unbounded.
The problem is that elasticity values must always take the correct sign and
their movement needs to be restrained, otherwise incorrect optimal prices will
result, as will be shown. For this reason, this section proposes a new version of
DLMs which bounds the coefficients. Theoretical justification for implementing a
bounded approach is provided in chapter 8.
The general observation equation to use for our DLM, at time t, is
\[ \log(S_t) = \beta_{1,t} + \beta_{2,t}\log(OP_t) + \beta_{3,t}\log(CP_{1,t}) + \dots + \beta_{2+K,t}\log(CP_{K,t}) + v_t \tag{3.4} \]
which is equation (3.2) together with the error term vt added to it.
^8 Determining the distribution of a latent variable at a specific time, given all observations up to that time.
Our first state equation, at time t, is β1,t = β1,t−1 + wt where wt ∼ N (0, Wt ).
The state equations for all the other coefficients are βj,t = βj,t−1 for 2 ≤ j. Hence:
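• $Y_t = \log(S_t)$
• $F_t = [\,1 \;\; \log(OP_t) \;\; \log(CP_{1,t}) \;\; \dots \;\; \log(CP_{K,t})\,]$
• $\theta_t = \begin{pmatrix} \beta_{1,t} \\ \beta_{2,t} \\ \vdots \\ \beta_{2+K,t} \end{pmatrix}$
• $G_t = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \end{pmatrix}$
• $v_t \sim N(0, V_t)$
• $w_t \sim N\left( \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \begin{pmatrix} W_t & 0 & \dots & 0 \\ 0 & 0 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 0 \end{pmatrix} \right)$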
The identity matrix is used for Gt because we do not know how the model
coefficients change from one time-step to the next. Due to the presence of short-term correlation in retail sales, wt has been chosen in such a way that any short-term correlation effects will be primarily absorbed into the new filtered estimate of
the first latent variable, β1,t. Since short-term correlation will most likely affect
the base level of sales (due to general peaks or troughs in overall demand which
are irrespective of price), this choice for wt seems reasonable.
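The structure of Gt and of the variance of wt described above can be written down in a few lines. The sketch below is only illustrative: the helper names, the number of competitors K and the value 0.01 for the level variance are all assumptions chosen for the example, not values from this work.

```python
def identity_matrix(p):
    """G_t: the p x p identity, so each coefficient is carried forward unchanged."""
    return [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]


def level_only_variance(p, w_level):
    """Variance matrix of w_t: only beta_1 (index 0) receives evolution noise."""
    W = [[0.0] * p for _ in range(p)]
    W[0][0] = w_level
    return W


K = 2      # hypothetical number of competitors
p = 2 + K  # intercept, own-price elasticity and K competitor elasticities
G = identity_matrix(p)
W = level_only_variance(p, w_level=0.01)  # 0.01 is an arbitrary illustrative value
```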
3.2.2
Multicollinearity
If own price increases and all other factors remain constant we would expect demand to fall and so the price elasticity on own price should always be negative. If
a competitor raises their price we would expect our sales to increase and therefore
the price elasticity on each competitor should be positive. We have clear theoretical elasticity restrictions but it is possible that updated elasticities could take
the wrong sign. This may be because of randomness in daily sales but more likely
due to multicollinearity between prices. This is caused by competitors retaining
similar price margins in order to be competitive.
High (but not perfect) correlation between two or more independent variables
is called multicollinearity. [Woo06] shows that worrying about high degrees of
correlation among the independent variables in the sample data is really no different from worrying about a small sample size: both work to increase the variance
of the estimated coefficient for a variable in the model. Although the problem
of multicollinearity cannot be clearly defined, it is clear that, everything else being
equal, it is better to have less correlation between the independent variables when estimating the coefficients in a model. Returning to our problem, the variables
corresponding to own price and the prices of our individual competitors are the
variables which are highly collinear. This is because competitors retain similar
price margins in order to be competitive and therefore competitors react similarly
to price changes in order to maintain their pricing strategy. Thus most of the
variance in the price of one retailer can be explained by the variance in other
competitors. This correlation among the regressors will most probably lead to
large standard errors for the coefficients in our demand model.
The effect of these errors when estimating the coefficients in the demand model
can have a significant negative impact. For instance, we would certainly expect
the elasticity for own price to be negative since as own price goes up we would
expect our sales forecast to decrease. If this was not the case then our model
would predict sales to continually increase the higher we raise own price and our
optimisation would therefore advise us to raise the price as high as the constraints
allow. In a similar vein we would expect the elasticity for each of our competitors
to be positive. This is because as their price goes up, customers would find
the current own price more favourable, with some customers now choosing to
purchase from us instead, increasing our sales. Therefore we have clear bounds
which are needed for these elasticities.
A modification must therefore be made to the current unbounded algorithm
to ensure that the price coefficients always maintain their correct sign. To do this,
an updated competitor elasticity which has become negative is set to zero and its original
negative value is added to the elasticity for own price. If the elasticity for own price
becomes positive, it is set to zero and its original positive value, multiplied by
log(OPt), is added to β1,t. To extend this method, the absolute size of any elasticity
should also be restrained. For example, it would seem highly unrealistic in the real
world for a 1% change in price to affect daily sales by 200%. Hence if a competitor
elasticity reaches a value greater than an assigned limit, it should be reduced to
this limit, with the difference added to the elasticity for own price. If the elasticity
for own price takes a value less than its respective limit, it should be increased
to this limit, with the difference multiplied by log(OPt) and added to β1,t.
This is done to keep the balance of the model the same. The justification for
the validity of this correction is, firstly, that competitor prices are
usually close (if not the same), which means that swapping the elasticities, or moving
part of their value between each other, will provide similar sales forecasts and hence
will give a similar predictive accuracy. Also the high collinearity between prices
increases the chances of an error occurring in the found elasticities. It is as if
the model has been unable to distinguish the effect of an individual price variable
and therefore assigned the wrong effect to a variable. High collinearity also means
that there is not a unique solution to the problem posed. Hence the elasticities
found are not necessarily the best, giving further reasoning to allow them to be
artificially changed. Our final justification is that we have come across a model
where the elasticities are certainly wrong (e.g. if the elasticities have taken the
wrong sign) and we have now changed its values slightly to create a model which
at least has the potential to be correct.
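The sign and size corrections described in this section can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the function name, the argument names `own_min` and `comp_max`, and the choice to correct the competitor elasticities before the own-price elasticity are all assumptions, since the text does not fix an order.

```python
def bound_elasticities(theta, log_own_price, own_min, comp_max):
    """Apply the sign and size bounds to theta = [beta_1, own, comp_1, ..., comp_K].

    own_min: (negative) lower limit for the own-price elasticity
    comp_max: (positive) upper limit for each competitor elasticity
    """
    beta = list(theta)
    for k in range(2, len(beta)):
        if beta[k] < 0.0:
            # Wrong sign: set to zero and move the negative value to own price.
            beta[1] += beta[k]
            beta[k] = 0.0
        elif beta[k] > comp_max:
            # Too large: cap at the limit and move the excess to own price.
            beta[1] += beta[k] - comp_max
            beta[k] = comp_max
    if beta[1] > 0.0:
        # Wrong sign: set to zero and move value * log(OP_t) into beta_1.
        beta[0] += beta[1] * log_own_price
        beta[1] = 0.0
    elif beta[1] < own_min:
        # Too negative: raise to the limit, move difference * log(OP_t) into beta_1.
        beta[0] += (beta[1] - own_min) * log_own_price
        beta[1] = own_min
    return beta


# A negative competitor elasticity is folded into the own-price elasticity.
corrected = bound_elasticities([1.0, -2.0, -0.5, 0.3], 2.0, own_min=-5.0, comp_max=5.0)
```

Handling competitors first means that any value transferred to the own-price elasticity is itself checked against the own-price bounds before the function returns.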
3.2.3
Algorithm to forecast sales
Before a retailer can use this algorithm to forecast sales they need to consider
factors, other than their own price and competitor prices, which have an impact
on their sales. Deterministic time factors are most likely to be required. Therefore
depending on the period for which sales are being forecasted, examples of time
factors are:
• the day of the week.
• the month of the year.
• the season.
Any necessary variables need to be investigated and incorporated into the observation equation. Using standard regression model building theory, the addition
of dummy variables to the general observation equation can be used to incorporate time factors. They take the value of 1 to indicate the presence of a stated
categorical effect, otherwise they take the value 0. For example, if we wish to
incorporate the effect of each day of the week we can extend the equation used in
(3.4) to include day of the week factors. Hence the updated observation equation
we will use to forecast daily sales is
\[ \log(S_t) = \beta_{1,t} + \beta_{2,t}D_{2,t} + \dots + \beta_{7,t}D_{7,t} + \beta_{8,t}\log(OP_t) + \beta_{9,t}\log(CP_{1,t}) + \dots + \beta_{8+K,t}\log(CP_{K,t}) + v_t \tag{3.5} \]
where D2,t , ..., D7,t are dummy variables. They take the value of 1 if the day
they represent is the day we are forecasting sales at time t, otherwise they take
the value 0. Explaining the respective coefficients for each new variable in this
model, β1,t is the base level of (log) sales on day 1 and can be viewed as the intercept.
The other coefficients represent the estimated difference in (log) sales between day 1
and that day. For example, β2,t represents the difference in average (log) sales between
day 1 and day 2. This model uses day 1 as the base group. It
is therefore the day against which all other days are compared, and that is
why its dummy variable does not appear in the model. Finding the observation
equation containing the “correct” variables is not an exact science and it is often
not clear cut as to which factors are necessary. Utilising prior knowledge can
often help, whilst analysing the overall accuracy of different models singles out a
superior observation equation.
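To make the role of the dummy variables in (3.5) concrete, the sketch below builds the row Ft for one day. The function name and the convention that days are numbered 1 to 7, with day 1 as the base group, are assumptions made for this example.

```python
import math


def daily_observation_row(day_of_week, own_price, competitor_prices):
    """Build F_t for equation (3.5); day_of_week runs from 1 (base group) to 7."""
    # D_2, ..., D_7: exactly one is 1 unless the day is the base day 1.
    dummies = [1.0 if day_of_week == d else 0.0 for d in range(2, 8)]
    log_prices = [math.log(own_price)] + [math.log(cp) for cp in competitor_prices]
    return [1.0] + dummies + log_prices


# Day 3 with own price and one competitor price both equal to 1 (so the logs are 0).
row = daily_observation_row(3, 1.0, [1.0])
```

For day 1 every dummy is zero, so the forecast for that day depends only on β1,t and the price terms.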
Once the observation equation has been constructed, in order for this algorithm to be used in industry, we need the price elasticities to be relatively stable.
Allowing them to update freely could produce large changes in their values. This
causes sizeable differences in the optimised price between consecutive optimisation
periods, weakening the trust of price setters. This constraint will be implemented
in the following way. If a competitor elasticity tries to update by more than an
assigned limit, it should only be allowed to move up to this limit, with the excess
being added to the elasticity for own price (β2,t in equation (3.4)). The same should be
done if the elasticity for own price is updated by an amount larger than this
limit, except that its excess should be multiplied by log(OPt) and added
to β1,t. The justification for this correction is the same as in section 3.2.2.
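A sketch of this movement constraint is given below. It assumes the coefficient layout of equation (3.4), a single limit `max_step` shared by all elasticities, and that the own-price elasticity is checked after it has received any excess from the competitors; none of these details is fixed by the description above, and the function name is hypothetical.

```python
def limit_movement(old, new, log_own_price, max_step):
    """Restrict how far each elasticity may move in one update.

    old, new: [beta_1, own, comp_1, ..., comp_K] before and after the update.
    """
    beta = list(new)
    for k in range(2, len(beta)):
        step = beta[k] - old[k]
        if abs(step) > max_step:
            allowed = max_step if step > 0 else -max_step
            beta[1] += step - allowed  # excess goes to the own-price elasticity
            beta[k] = old[k] + allowed
    step = beta[1] - old[1]
    if abs(step) > max_step:
        allowed = max_step if step > 0 else -max_step
        beta[0] += (step - allowed) * log_own_price  # excess * log(OP_t) goes to beta_1
        beta[1] = old[1] + allowed
    return beta


# The competitor elasticity tries to move by 0.5 but only 0.25 is allowed.
limited = limit_movement([1.0, -2.0, 0.5], [1.0, -2.0, 1.0], 2.0, max_step=0.25)
```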
In order to use the DLM method, prior means and variances need to be chosen
for each βi to input any prior knowledge into the model. The method of Maximum
Likelihood Estimation (MLE) should then be used to find Vt and Wt , using an
assigned training set. This can be understood as finding the volatility in the
previous data, in order to understand how quickly the coefficients in the model
should update. The way this method works is that, given any DLM for a
data set $y_{1:n}$, its log-likelihood, l, is given by
\[ l = -\frac{1}{2}\sum_{t=1}^{n} \log|Q_t| - \frac{1}{2}\sum_{t=1}^{n} (y_t - f_t)' Q_t^{-1} (y_t - f_t) \tag{3.6} \]
where ft and Qt are as in section 3.2. Minimising the negative log-likelihood with
respect to Vt and Wt provides us with optimal constant values for Vt and Wt. A
training set should be used to initialise the model before forecasting one time-step
ahead. The actual sales should then be used to update the model estimates for
the coefficients. Repeating the forecasting and updating steps should be done for
every future time-step individually. After a given number of time-steps, the MLE
estimates should be updated. Each of these steps is given in algorithm 1.
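For a single observed series, ft and Qt are scalars and the log-likelihood (3.6) reduces to a sum over the forecast errors and their variances. The following sketch evaluates it from quantities that a filtering pass would produce; the function name and the idea of passing the errors and variances as two lists are assumptions for illustration. In practice this value would be evaluated for candidate values of Vt and Wt and the best candidates retained.

```python
import math


def dlm_log_likelihood(errors, variances):
    """Equation (3.6) for scalar observations.

    errors: forecast errors e_t = y_t - f_t for t = 1..n
    variances: the corresponding one-step forecast variances Q_t (all positive)
    """
    return -0.5 * sum(math.log(Q) + e * e / Q for e, Q in zip(errors, variances))


# Two time-steps with unit forecast variance: only the squared errors contribute.
value = dlm_log_likelihood([0.0, 2.0], [1.0, 1.0])
```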
Algorithm 1 Bounded DLM Algorithm
1: Construct observation equation.
2: Set prior means and variances for the coefficients in the observation equation.
3: Use MLE on training data to find Vt and Wt.
4: Let N be the number of time-steps in the data.
5: for i from 1 to N do
6:     Use current coefficients in the observation equation to forecast sales for time-step i.
7:     Use actual sales data from time-step i to update the coefficients.
8:     if any coefficient goes outside of its bound once updated then
9:         Correct the relevant coefficient.
10:    end if
11:    if time-step i is not in the training set then
12:        if the movement of any coefficient is too great when updated then
13:            Correct the relevant coefficient.
14:        end if
15:    end if
16:    Use MLE to find new values for Vt and Wt after a specified number of future time steps.
17: end for
3.3
Summary
A simple linear model can be used to incorporate the factors which are believed
to affect a retailer’s future sales, such as own price and competitor price effects.
There are problems with using more complicated models or techniques which
don’t require an underlying model. Essentially they either risk overfitting when
applied to historical data, failing to understand individual price elasticities or perhaps finding specific price points which are forecasted to provide extreme sales
values but are in fact massively erroneous. In general, a simple linear demand
model can be advanced by logging its variables, mitigating, if not eliminating,
any heteroskedastic or skewed characteristics in the data. Once a demand model
has been chosen, the aim is now to find the most accurate values of its coefficients (using historical data) in order to be able to use it to successfully forecast
sales. The problem with current techniques is that they require many unrealistic
assumptions in order to have confidence in their estimates, therefore a new way
to infer the values of these coefficients is required.
The method of Dynamic Linear Models has many advantages when it is used
to infer estimates of coefficients in a linear demand model. Firstly it assumes
that coefficients are time-varying random variables and its recursive nature means
that estimates update online as new data is received, allowing the user to keep
track of non-linear behaviour. These are important characteristics when trying
to understand the coefficients in a demand model and a technique combining
these features is currently lacking. In order to apply Dynamic Linear Models
to demand modelling, a bounded version of the technique can deal with the
problem of multicollinearity in the data, helping to keep the price elasticities the
correct sign whilst restraining the size of their movement between updates. This
adaptation enables the method of Dynamic Linear Models to be applied to infer
coefficients of a demand model, so that the model can be used to learn from
historical data in order to forecast future sales.
Chapter 4
Forecasting competitor prices
This chapter will explain the need for retailers to forecast their competitors’
prices. Currently this task is done solely by humans, and computers have not previously
been used to forecast these prices in a
general retail setting. Because human price setters are prone to error and limited in the amount of data they can process, new
techniques are required. The method of Support Vector Regression is presented
and it is shown how a bounded version of this method can be used as a way to
automate the learning of competitor price behaviour by analysing historical price
data.
4.1
Introduction
Traditional economic, marketing and operational models view the consumer as a
rational agent. Therefore if price was the only deciding factor, customers would
be expected to simply purchase their product from the retailer with the cheapest
price. This would lead to competitors continually reducing their price in order to
be the cheapest. Price wars would thus ensue and the retailer with the cheapest
price would capture all the demand in the local area. The truth is that in reality
price is not the only factor affecting a consumer’s purchasing decision and thus
retailers are able to differentiate their product and remain profitable whilst not
providing the cheapest available price. Therefore all competitors do not simply
re-align their price to ensure that they are always the cheapest. Retailers often
instead maintain a pricing image within the local competitive market, aligning
their price against a subset of competitors. They then further reduce or increase
their price depending on their recent sales figures. Since the sales of competing retailers are unknown and their assigned competitors are also unknown, forecasting
a competitor’s price is often a difficult task.
When a retailer considers the optimal price to set for their product, the profitability of a price move often depends crucially on the response of its closest
rivals. For example, a price increase may only improve profit if a competitor
also increases their price whereas a price decrease may only improve profit if a
competitor doesn’t follow the price decrease. This makes forecasting future competitor prices crucial for any retailer. Human price setters are often used to make
these predictions but this can cause many issues. For instance, this is impractical
when there are many products and there is a need to continually predict how
each competitor would react to every potential own price decision. Humans are
also unable to process vast amounts of historical price data and human error can
produce forecasts of low accuracy. Therefore new techniques are required. Using
computers to learn from historical price data in order to predict future competitor
prices has been investigated in the context of auctions, where competitor bids have
been forecasted using historical price data, as shown in [WJS08] and [LAH10].
Although this problem is different to the pricing problem faced by an average
retailer, it shows the potential in using previous competitor data to predict their
future prices.
4.2
Competitor price forecasting
Although average prices are commonly used as factors within demand models,
as shown in chapter 3, the actual price needs to be predicted when forecasting
competitor prices. This is because retailers often wish to keep their price within a
certain price differential from their competitors when optimising their own price
and they therefore need to know the forecasted actual competitor prices.
When forecasting the price of a competitor, forecasts only need to be made at
discrete times. This is because even when retailers religiously monitor the current
price of their competitors, it is not possible to do this continuously. Competitors’
prices are instead checked every so often, with the time between price observations
for a given competitor being specific to each retailer. The time until the next
competitor price observation is therefore something which is usually known by
the retailer. Once a competitor’s price is checked, it is assumed that their price
remains unchanged until at least the time of the next price observation, since there
is no way of knowing whether it has moved in the meantime. Prices
which are actually observed are clearly the only prices which we can be certain
about, and they are the only ones against which forecasts can be checked and which can be used
to update any model. This is why forecasts only need to be made at
predetermined times.
When discerning the factors needed to predict the price of a competitor, it
seems reasonable to assume that firstly deterministic factors such as the day and
time are important. One example of how time affects price is given by [FS08]
who observes retail vehicle fuel price cycles that last exactly one week all over
Norway. The underlying cost is also important since changing cost margins can
certainly affect prices. For example, [Noe07c] provides evidence of retail fuel
stations pricing according to costs, retaining the same margin if the wholesale
price changes. Of course the actual cost is not of importance, rather the difference
between a competitor’s price and the cost, as well as the change in cost since a
competitor last changed their price. A retailer’s own price can also affect the
prices of other competitors since they will always want to remain competitive,
especially when selling a homogeneous product^1. For example, in the retail vehicle
fuel sector, [AEW09] and [Con01] provide evidence that retailers often react to
the price movements of their competitors. Again, the actual own price is not of
importance, rather the difference between a competitor’s price and the own price,
as well as the change in own price since a competitor last changed their price.
Since competitors may not change their price regularly, a certain amount of time
may often pass before a new price change is implemented. Competitors may also
take time to react to changes in own price. [AEW09] gives evidence of this in
the retail vehicle fuel sector. Therefore we will use, as inputs, the following variables (all
calculated at the future time for which the competitor price is being forecasted):
1. The day of the week.
2. The time of day.
3. Difference between previously observed competitor price and current cost.
4. Change in cost since the last competitor price change.
5. Difference between previously observed competitor price and current own price.
6. Change in own price since the last competitor price change.
7. Time since last competitor price change.
8. Time since last own price change.
^1 A product sold by one firm which is indistinguishable from the same product sold by another competing firm.
Given these input variables, the output will be the forecasted change in a competitor’s price from the
most recent price observation, which is then added to the previous
price. In order to predict a competitor’s price we don’t use the prices of other
competitors. This is because we don’t know which retailers are in competition
with each other and we won’t have all the necessary prices since they may have
different competitors. Hence only the variables stated above are used. Our aim is
now to find a way of predicting future competitor price changes using these input
variables. This chapter will propose a new technique for making these forecasts,
also showing how it can be implemented.
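The eight inputs listed above can be assembled into a single feature vector. The sketch below shows one possible construction; the function and argument names are hypothetical, times are measured in hours, and the sign conventions (competitor price minus cost, competitor price minus own price) are assumptions, since the text only speaks of "differences".

```python
from datetime import datetime


def competitor_features(forecast_time, last_comp_price, last_comp_change_time,
                        cost_now, cost_at_last_comp_change,
                        own_now, own_at_last_comp_change, last_own_change_time):
    """Return the eight inputs for one competitor at the time being forecasted."""
    def hours(delta):
        return delta.total_seconds() / 3600.0

    return [
        forecast_time.isoweekday(),                       # 1. day of week (Monday = 1)
        forecast_time.hour + forecast_time.minute / 60.0, # 2. time of day in hours
        last_comp_price - cost_now,                       # 3. competitor price vs cost
        cost_now - cost_at_last_comp_change,              # 4. change in cost
        last_comp_price - own_now,                        # 5. competitor vs own price
        own_now - own_at_last_comp_change,                # 6. change in own price
        hours(forecast_time - last_comp_change_time),     # 7. time since competitor change
        hours(forecast_time - last_own_change_time),      # 8. time since own change
    ]


features = competitor_features(
    forecast_time=datetime(2014, 1, 6, 12, 30),
    last_comp_price=130.0,
    last_comp_change_time=datetime(2014, 1, 5, 12, 30),
    cost_now=120.0, cost_at_last_comp_change=118.0,
    own_now=129.0, own_at_last_comp_change=131.0,
    last_own_change_time=datetime(2014, 1, 6, 6, 30),
)
```

The model output would then be the forecasted price change, which is added to `last_comp_price` to obtain the forecasted price.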
4.3
SVR theory
Using Support Vector Regression (SVR) as a new method to forecast competitor prices has numerous advantages. Firstly, the SVR methodology can deal with
non-linear data. This is very important given the non-linear nature of competitor
pricing decisions. SVR also performs well on data sets that have many attributes
(i.e. high dimensional data) since the method strives to reduce the complexity
of the model used. Since we will use quite a few variables to predict the price
of a competitor, this is another reason to use this technique. The SVR
algorithm is also much less computationally intensive than other machine learning
techniques (such as neural networks), allowing the model to be re-tuned continuously whenever new data is received.
Using the method of Support Vector Machines (SVMs) for regression is known
as Support Vector Regression (SVR). SVM theory was first proposed by Vapnik
in [Vap95]. It is grounded in the framework of statistical learning theory and the
methodology quickly developed over time, as shown in [Vap99]. Training an SVM is equivalent to solving a linearly constrained quadratic programming problem, and SVMs have
been applied in many areas such as texture classification, image recognition, data mining and bioinformatics, for example [BTBH01, HCH+04, KJPK02,
MBB+08, CHV99]. SVMs were extended to solve regression problems, first shown
in [DBK+97], and have been successfully applied to different problems of time series prediction such as production forecasts, wind speed prediction and financial
time series forecasting, for example [Cao03, CT01, MHRH04, PL05]. A more
detailed historical background of the development of this methodology is given
by Smola and Schölkopf in [SS04]. Despite these previous uses, SVR has
not yet been used to forecast the price of a competitor in a retail environment.
When using SVR we wish to predict a real-valued $y'$ given variables $x'$. We
have L training points where each input $x_i$ has D attributes (i.e. is of dimensionality D) together with a corresponding real-valued output $y_i^*$. Once the training
data is fitted, we will receive a fitted value $y_i$ for each actual real-valued output.
This gives us training data of the form $\{x_i, y_i\}$ where $i = 1, \dots, L$; $y_i \in \mathbb{R}$; $x_i \in \mathbb{R}^D$
with
\[ y_i = w \cdot x_i + b \tag{4.1} \]
$w \in \mathbb{R}^D$ and $b \in \mathbb{R}$. Fitting the data therefore involves finding optimum values for w
and b. Once these values are found they will be used to predict $y'$ using $x'$.
Using the method of SVR on non-linear data first requires a mapping of this
data, $x_i \to \phi(x_i)$, into a feature space in which the relationship is linear, giving
\[ y_i = w \cdot \phi(x_i) + b \tag{4.2} \]
A kernel with relevant parameters is used to do this. The kernel is related to
the transform $\phi(x_i)$ by the equation $K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$. Therefore, when
using kernels to transform data, only inner products of the mapped inputs need
to be determined; we never need to know or compute $\phi(x_i)$ itself. This operation is often
computationally cheaper than the explicit computation of the coordinates, and
this approach is called the "kernel trick". In order to map data using a Radial
Basis Function we define our kernel to be $K(x_i, x_j) = \exp(-\gamma \| x_i - x_j \|^2)$ with
the kernel parameter $\gamma > 0$, but we never define $\phi(x_i)$.
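The Radial Basis Function kernel is simple to evaluate directly from two input vectors, which is the point of the kernel trick. A minimal sketch follows; the function name `rbf_kernel` is an assumption for this example.

```python
import math


def rbf_kernel(x_i, x_j, gamma):
    """K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2) for gamma > 0."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)


same_point = rbf_kernel([0.0, 0.0], [0.0, 0.0], gamma=0.5)   # identical inputs
unit_apart = rbf_kernel([1.0, 0.0], [0.0, 0.0], gamma=1.0)   # inputs one unit apart
```

The kernel equals 1 for identical inputs and decays towards 0 as the inputs move apart, with γ controlling how quickly.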
When fitting the training data, the method of SVR uses a penalty function
such that if the fitted value, $y_i$, is less than ε away from the actual
value, $y_i^*$ (i.e. if $|y_i^* - y_i| < \varepsilon$), it is not allocated a penalty. The region bounded by
$y_i \pm \varepsilon \;\forall i$ is called the ε-insensitive tube. Points whose actual value lies further than ε
from the fitted value are given one of two slack variable penalties
depending on whether the actual value lies above ($\xi^+$) or below ($\xi^-$) the ε-insensitive tube
(where $\xi_i^+ \ge 0$, $\xi_i^- \ge 0 \;\forall i$). Hence
\[ y_i^* \le y_i + \varepsilon + \xi_i^+ \tag{4.3} \]
\[ y_i^* \ge y_i - \varepsilon - \xi_i^- \tag{4.4} \]
The error function for SVR which we wish to minimize can then be written as
\[ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{L}(\xi_i^+ + \xi_i^-) \tag{4.5} \]
subject to the constraints $\xi_i^+ \ge 0$, $\xi_i^- \ge 0 \;\forall i$ and (4.3) and (4.4), where w is that
used in (4.2) and C is known as the 'cost'. Providing an explanation of this error
function, it is firstly clear why we wish to minimize $\sum_{i=1}^{L}(\xi_i^+ + \xi_i^-)$, since we
want our predicted values to be as close as possible to their actual values, i.e. within the
ε-insensitive tube. The reason why we wish to minimize $\|w\|$ (i.e. making it
as flat as possible) is that doing so reduces the complexity of w, since the
norm of a vector is a measure of its complexity. A small norm ensures that when
there are many features, no feature has an unreasonable weight. Smaller values
in w also mean the model is less sensitive to errors in measurement, random
shocks and non-stationarity of the features, $x_i$. This explains why, given two models
(i.e. two possible values of w) which explain the data equally well, the 'flatter'
one is preferred. Minimizing $\|w\|$ is equivalent to minimizing $\frac{1}{2}\|w\|^2$, and
the use of this term makes it possible to perform Quadratic Programming (QP)
optimization. The parameter C controls the trade-off between the slack variable
penalty and the flatness/complexity of w. It sets the balance between making
sure that as many samples as possible are "well-enough" approximated whilst
using the "simplest" model possible.
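To see how the two terms of (4.5) trade off, the sketch below evaluates the error function for a given weight vector and set of fitted values. The smallest slacks satisfying (4.3) and (4.4) are used, so a point inside the ε-insensitive tube contributes nothing. The function name and argument names are assumptions for illustration.

```python
def svr_objective(w, fitted, actual, epsilon, cost):
    """Evaluate (4.5): 0.5 * ||w||^2 + C * sum of slacks.

    fitted: fitted values y_i; actual: observed values y_i^*.
    """
    flatness = 0.5 * sum(w_d * w_d for w_d in w)
    slack_total = 0.0
    for y_fit, y_act in zip(fitted, actual):
        xi_plus = max(0.0, y_act - y_fit - epsilon)   # actual above the tube
        xi_minus = max(0.0, y_fit - y_act - epsilon)  # actual below the tube
        slack_total += xi_plus + xi_minus
    return flatness + cost * slack_total


# First point lies inside the tube; the other two each incur a slack of 0.5.
objective = svr_objective([3.0, 4.0], [1.0, 2.0, 3.0], [1.25, 3.0, 2.0],
                          epsilon=0.5, cost=2.0)
```

Increasing `cost` makes points outside the tube more expensive relative to the flatness term, which is the trade-off that C controls.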
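In order to minimize (4.5) with its constraints, Lagrange multipliers $\mu_i^+ \ge 0$,
$\mu_i^- \ge 0$, $\alpha_i^+ \ge 0$ and $\alpha_i^- \ge 0 \;\forall i$ are added, giving us the Lagrangian,
\[
\begin{aligned}
L = {} & \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{L}(\xi_i^+ + \xi_i^-) - \sum_{i=1}^{L}(\mu_i^+ \xi_i^+ + \mu_i^- \xi_i^-) \\
& - \sum_{i=1}^{L}\alpha_i^+(\varepsilon + \xi_i^+ + y_i - y_i^*) - \sum_{i=1}^{L}\alpha_i^-(\varepsilon + \xi_i^- - y_i + y_i^*)
\end{aligned}
\tag{4.6}
\]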
The Lagrange multipliers are added due to the constraints on (4.5), which are
$\xi_i^+ \geq 0$, $\xi_i^- \geq 0$ $\forall i$ and (4.3) and (4.4). The notation chosen for these Lagrange
multipliers (i.e. $\mu_i^+$, $\mu_i^-$, $\alpha_i^+$ and $\alpha_i^-$) is just to help make the remaining
explanation of this methodology clearer. Any notation could have been used but this
explanation follows the notation used in [SS04]. It follows from the saddle point
condition that the partial derivatives of L with respect to the primal variables
($w$, $b$, $\xi^+$ and $\xi^-$) have to vanish for optimality. Substituting for $y_i$ (i.e. equation
(4.2)), differentiating with respect to $w$, $b$, $\xi^+$ and $\xi^-$ and setting the derivatives
to zero:
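\[ \frac{\partial L}{\partial w} = 0 \Rightarrow w = \sum_{i=1}^{L} (\alpha_i^+ - \alpha_i^-)\phi(x_i) \tag{4.7} \]
\[ \frac{\partial L}{\partial b} = 0 \Rightarrow \sum_{i=1}^{L} (\alpha_i^+ - \alpha_i^-) = 0 \tag{4.8} \]
\[ \frac{\partial L}{\partial \xi_i^+} = 0 \Rightarrow C = \alpha_i^+ + \mu_i^+ \tag{4.9} \]
\[ \frac{\partial L}{\partial \xi_i^-} = 0 \Rightarrow C = \alpha_i^- + \mu_i^- \tag{4.10} \]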
Due to duality, substituting (4.7), (4.8), (4.9) and (4.10) into (4.6), we now need
to maximise,
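\[ L = -\frac{1}{2} \sum_{i,j} (\alpha_i^+ - \alpha_i^-)(\alpha_j^+ - \alpha_j^-)\phi(x_i) \cdot \phi(x_j) + \sum_{i=1}^{L} (\alpha_i^+ - \alpha_i^-) y_i^* - \varepsilon \sum_{i=1}^{L} (\alpha_i^+ + \alpha_i^-) \tag{4.11} \]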
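with respect to $\alpha_i^+$ and $\alpha_i^-$ where $\alpha_i^+ \geq 0$ and $\alpha_i^- \geq 0$ $\forall i$. Using $\mu_i^+ \geq 0$ and
$\mu_i^- \geq 0$ together with (4.9) and (4.10) means that $\alpha_i^+ \leq C$ and $\alpha_i^- \leq C$. The
constraints on $\alpha_i^+$ and $\alpha_i^-$ are therefore that $0 \leq \alpha_i^+ \leq C$ and $0 \leq \alpha_i^- \leq C$ $\forall i$, and
$\sum_{i=1}^{L} (\alpha_i^+ - \alpha_i^-) = 0$. QP can now be performed to find optimum values for all
$\alpha_i^+$ and $\alpha_i^-$.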
Once (4.11) has been optimised, substituting (4.7) into (4.2) allows us to make
new predictions using
\[ y' = \sum_{i=1}^{L} (\alpha_i^+ - \alpha_i^-)\phi(x_i) \cdot \phi(x') + b \tag{4.12} \]
Computation of $b$ is done by exploiting the so-called Karush–Kuhn–Tucker (KKT)
conditions ([Kar39] and [KT51]) which state that at the point of the solution the
product between variables and constraints must equal zero. Hence
\[ \alpha_i^+ (\varepsilon + \xi_i^+ + y_i - y_i^*) = 0 \tag{4.13} \]
\[ \alpha_i^- (\varepsilon + \xi_i^- - y_i + y_i^*) = 0 \tag{4.14} \]
\[ \mu_i^+ \xi_i^+ = 0 \Rightarrow (C - \alpha_i^+)\xi_i^+ = 0 \tag{4.15} \]
\[ \mu_i^- \xi_i^- = 0 \Rightarrow (C - \alpha_i^-)\xi_i^- = 0 \tag{4.16} \]
This allows us to draw several useful conclusions. Firstly, only $y_i$ with
corresponding $\alpha_i^+ = C$ or $\alpha_i^- = C$ lie outside the ε-insensitive tube. Secondly,
$\alpha_i^+ \cdot \alpha_i^- = 0$, i.e. there can never be a pair of dual variables $\alpha_i^+$, $\alpha_i^-$ which are
both simultaneously nonzero, as this would require nonzero slacks ($\xi^+$ and $\xi^-$) in
both directions. Finally, for $\alpha_i^+ \in (0, C)$, $\xi_i^+ = 0$ and $\varepsilon + \xi_i^+ + y_i - y_i^* = 0$, and
similarly for $\alpha_i^- \in (0, C)$, $\xi_i^- = 0$ and $\varepsilon + \xi_i^- - y_i + y_i^* = 0$. The indices $i$ where
this is true are the $\phi(x_i)$ which make up the set $S$ of support vectors that lie on
either boundary of the ε-insensitive tube. Hence by (4.2),
\[ b = y_i^* - w \cdot \phi(x_i) - \varepsilon \tag{4.17} \]
for $\alpha_i^+ \in (0, C)$ and
\[ b = y_i^* - w \cdot \phi(x_i) + \varepsilon \tag{4.18} \]
for $\alpha_i^- \in (0, C)$. (4.7) is then used together with (4.17) and (4.18) to find the value
of $b$ for each index $i$ in $S$. Taking the average of these values of $b$ provides a
more robust estimate. The method of Support Vector Regression therefore doesn’t
require us to know the transformation φ; the only parameters we need to set are
γ, C and ε.
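The way (4.12), (4.17) and (4.18) fit together can be sketched in a few lines. This is a sketch under stated assumptions rather than the implementation used in this research: it assumes an RBF kernel K(x, x') = exp(−γ‖x − x'‖²) stands in for φ(x) · φ(x'), and the dual variables below are hand-picked for illustration rather than obtained from a QP solver. All function names are hypothetical.

```python
import math


def rbf_kernel(x, z, gamma):
    """Assumed kernel replacing phi(x) . phi(z)."""
    return math.exp(-gamma * sum((a - c) ** 2 for a, c in zip(x, z)))


def kernel_sum(x_new, X, alpha_plus, alpha_minus, gamma):
    """The sum in (4.12), i.e. w . phi(x_new) after substituting (4.7)."""
    return sum((ap - am) * rbf_kernel(xi, x_new, gamma)
               for xi, ap, am in zip(X, alpha_plus, alpha_minus))


def estimate_bias(X, y_actual, alpha_plus, alpha_minus, C, eps, gamma, tol=1e-9):
    """Average b over support vectors on the tube boundary, (4.17) and (4.18)."""
    estimates = []
    for xi, yi, ap, am in zip(X, y_actual, alpha_plus, alpha_minus):
        wx = kernel_sum(xi, X, alpha_plus, alpha_minus, gamma)
        if tol < ap < C - tol:
            estimates.append(yi - wx - eps)   # (4.17)
        elif tol < am < C - tol:
            estimates.append(yi - wx + eps)   # (4.18)
    if not estimates:
        raise ValueError("no support vectors strictly inside (0, C)")
    return sum(estimates) / len(estimates)


def predict(x_new, X, alpha_plus, alpha_minus, b, gamma):
    """Equation (4.12)."""
    return kernel_sum(x_new, X, alpha_plus, alpha_minus, gamma) + b


# Hand-picked illustrative values (not the output of a QP solver).
gamma, C, eps = 1.0, 1.0, 0.1
X = [[0.0], [1.0]]
alpha_plus, alpha_minus = [0.5, 0.0], [0.0, 0.5]   # satisfies (4.8)
s = 0.5 * (1.0 - math.exp(-1.0))
y_actual = [2.0 + s + eps, 2.0 - s - eps]

b = estimate_bias(X, y_actual, alpha_plus, alpha_minus, C, eps, gamma)
print(b, predict(X[0], X, alpha_plus, alpha_minus, b, gamma))
```

Only kernel evaluations appear in the code, which is why φ itself never needs to be known.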
4.4 SVR algorithm to predict competitor prices
We wish to predict the price of any competitor and this will be done using an SVR
model, with a separate model used for each competitor. The SVR model will
predict the change in a competitor’s price from the most recently observed price.
In order to use the model, a training set is first established using a designated
amount of historical pricing data. The SVR model then uses this training data
for tuning in order to find optimum values for γ, C and ε. The model is then
fitted using these values, after which it is ready to be used for prediction over
the period we wish to forecast. Since the SVR method is a black-box method where
we have little control over the forecasts it makes, we need to restrict the model
from making extreme predictions. To do this, two bounds have been assigned.
• The first bound finds the largest positive and negative move between
consecutive competitor price observations in the training data and restricts any
forecasted change in a competitor’s price to be within this range.
• The other bound finds the largest positive and negative overall movement
of a competitor’s price from its initial price (at the start of the optimisation
period) during any period in the training data. It restricts every forecast to
always be within this range from its initial value at the start of the period
we wish to forecast.
Related theoretical justification for implementing a bounded approach is provided
in chapter 8.
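The two bounds described above can be sketched as follows. This is a minimal illustration, not the implementation used in this research. It assumes the training data is supplied as a list of periods, each a list of observed prices whose first entry is the initial price, and it additionally assumes that a zero change is always allowed. All names and the example prices are hypothetical.

```python
def clip(value, low, high):
    return max(low, min(high, value))


def step_bounds(prices):
    """Largest negative and positive move between consecutive observations.

    Zero is included so that 'no change' is always permitted (an assumption).
    """
    steps = [b - a for a, b in zip(prices, prices[1:])]
    return min(steps + [0.0]), max(steps + [0.0])


def drift_bounds(periods):
    """Largest negative and positive movement from each period's initial price."""
    drifts = [p - period[0] for period in periods for p in period]
    return min(drifts), max(drifts)


def bounded_forecast(prev_price, predicted_change, start_price, steps, drifts):
    """Apply the first bound to the change, then the second to the price level."""
    change = clip(predicted_change, steps[0], steps[1])
    price = prev_price + change
    return clip(price, start_price + drifts[0], start_price + drifts[1])


# Made-up training data split into two periods.
periods = [[100.0, 101.0, 99.0], [99.0, 102.0, 103.0]]
history = [p for period in periods for p in period]
steps = step_bounds(history)
drifts = drift_bounds(periods)

print(bounded_forecast(100.0, 5.0, 100.0, steps, drifts))
```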
Initial values for all the necessary data inputs in the SVR model will
be known at the start of each forecasting period. The future times at which a
competitor’s price will be checked (i.e. the times for which we need to forecast)
within the forecasting period will also be known. It is assumed that the future
own price is known since this is something which we have total control over. It is
also assumed that future cost will remain unchanged unless it can be predicted.
We are now ready to start predicting each competitor’s price. Starting at the
beginning of each period we forecast a competitor’s price at each predetermined
time, predicting the change in a competitor’s price from its previous value. If
we were predicting a competitor’s price over the next 24 hours, we would have
initial values for the data inputs at the start of the day (i.e. at midnight). If
the first competitor price observation will occur at 9am, for example, we update
the variables in the model (e.g. current time = 9) and the price at 9am is
then forecasted. If the next observation occurs at 2pm, we further update the
model using the forecasted competitor price at 9am and the price at 2pm is
then forecasted. We repeat this process until there are no more competitor price
observations in the period for which we are forecasting. Once the end of the
forecast period has been reached, the actual competitor prices are used to analyse
the forecast accuracy of the model. The training data set is then updated, the
model is re-trained and predictions for the next forecast period are made, as
shown in algorithm 2.
Algorithm 2 SVR Competitor Price Forecasting Algorithm
1: Assign initial training set from price data of a given competitor.
2: Set parameters of SVR model.
3: for each time period in the data do
4:     Train SVR model using the current training set.
5:     Let N be the number of competitor price observations in the next period.
6:     if N > 0 then
7:         for j from 1 to N do
8:             Find the value for each input of the model at the time of the jth price observation.
9:             Forecast the competitor price at the time of the jth observation.
10:            if the forecasted price is outside of its constraints then
11:                Move its value to the nearest acceptable price forecast value.
12:            end if
13:        end for
14:    end if
15:    Update the training set using the actual prices from the most recent period.
16: end for
17: Repeat for each competitor separately.
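The control flow of algorithm 2 can be sketched as below. This is an illustration only: the function `fit_mean_change` is a hypothetical stand-in for training an SVR model (it simply predicts the mean historical change), and a single symmetric bound `max_move` around the starting price stands in for the two bounds described in the text. The prices and observation times are made up.

```python
def fit_mean_change(training_prices):
    """Stand-in for step 4: returns a model predicting the mean historical change."""
    steps = [b - a for a, b in zip(training_prices, training_prices[1:])]
    mean_step = sum(steps) / len(steps) if steps else 0.0
    return lambda time, prev_price: mean_step


def walk_forward(training_prices, periods, max_move):
    """periods: list of (observation_times, actual_prices), one entry per period."""
    training = list(training_prices)
    all_forecasts = []
    for times, actual in periods:                      # step 3
        model = fit_mean_change(training)              # step 4
        start = training[-1]
        forecasts, prev = [], start
        for t in times:                                # steps 6-7 (empty if N = 0)
            price = prev + model(t, prev)              # steps 8-9
            low, high = start - max_move, start + max_move
            price = max(low, min(high, price))         # steps 10-11
            forecasts.append(price)
            prev = price                               # feed forecast into next step
        all_forecasts.append(forecasts)
        training.extend(actual)                        # step 15
    return all_forecasts


result = walk_forward(
    [100.0, 101.0, 102.0],
    [([9, 14], [103.0, 103.0]), ([9], [104.0])],
    max_move=1.5,
)
print(result)
```

Each forecast within a period is built on the previous forecast rather than on an observed price, mirroring the 9am and 2pm example in the text; actual prices only enter when the training set is updated.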
4.5 Summary
The forecasting of competitor prices is a difficult yet often vital task for retailers.
Deterministic factors, for example the day and time, have been shown to affect
future competitor prices. Other factors such as the underlying cost and a retailer’s
own price are also certainly important factors. Combining these in the right way
in order to be able to forecast competitor prices using historical data is currently
done by human pricing managers but due to their limitations new techniques are
required.
The method of Support Vector Regression has many advantages when used
to forecast competitor prices. The main advantage is that it can learn non-linear
relationships between factors, which is important given the non-linear nature of
competitor pricing decisions. To apply Support Vector Regression, its output
needs to be bounded in order to restrain the model from making extreme predictions. Once this is done, this method can be applied to forecast future competitor
prices using historical data.
Chapter 5
Price Optimisation
This chapter will explain the importance of retailers making optimum pricing
decisions in order to maximise profit. It will be mathematically proven that different optimised own prices exist as competitor price reactions change. This will
show the limitations of current pricing decision support systems which optimise
own price under the unproven assumption that competitor prices will remain unchanged. Therefore an optimisation providing the largest guaranteed profit will
be proposed for when competitor price reactions can’t be forecasted. A further
optimisation methodology involving an exhaustive search of a game-tree will then
be presented which incorporates the forecasting of competitor price reactions into
the price optimisation for the first time. All of the proposed methodologies will
then be combined together into a complete algorithm that can be used by a
retailer to optimise their price over future periods.
5.1 Introduction
Individuals invest their resources into a commercial venture (selling goods or
services) in order to achieve an increased return on their investment in the future.
The difference between the investment and the return is known as the profit.
Clearly the greater the profit the more worthwhile the investment. Investors
wish to attain the maximum possible return, explaining why profit maximisation
is an integral part of any commercial enterprise.
So how does one perform this task of profit maximisation? Answering this
question is about discovering the maximum amount consumers would be willing
to pay for the product you are selling. Given a single own price per unit charged
to all consumers, a certain number of sales will be achieved. If own price is then
increased, some consumers will now be unwilling to pay, causing demand to fall.
If own price decreases the opposite will occur. Since each unit requires a certain
cost to produce, the profit attained on the products sold, at time t, is:
\[ P_t = (OP_t - C_t)\, S_t \tag{5.1} \]
where $P_t$ represents the total profit during period t, $OP_t$ is the own price per
unit during period t, $C_t$ is the wholesale cost per unit during period t and $S_t$
represents the sales achieved during this period. Increasing the price will increase
the margin (the difference between the own price and cost for each unit sold) but
may decrease the sales by enough to cause overall profit to decrease. On the other
hand, decreasing the price will decrease the margin but may increase the sales by
enough to cause overall profit to increase. Optimising profit is about discovering
the own price which gives the best balance between margin and sales. Therefore
the ability to forecast the resultant sales from a given own price is vital.
Comparing the forecasted profit from different pricing scenarios allows one to find
the optimal price which maximises the profit function (i.e. equation 5.1).
In reality, solving this optimisation problem is slightly more complicated than
maximising the profit function. There is often also a minimum volume constraint
which needs to be satisfied during each optimisation period. This is implemented
to reduce the commercial risk of selling to a small customer base, where a change
in purchasing habits of a small number of consumers can have huge consequences
on the viability of the seller’s business. Hence there is usually a minimum volume
constraint (unique to each retailer) when optimising equation 5.1, to help ensure
a seller’s long-term profitability. Another factor to consider is that managers
will often implement price differential constraints when setting prices. This may
be done in order for their product to maintain a certain pricing image when
compared with the prices of other competitors. These price bounds may even
be forced upon them through legal requirements, to avoid consumers paying a
price which is viewed as being ’unreasonable’. Therefore this price differential
constraint also needs to be considered in the price optimisation.
Using the general demand model from equation 3.3 to forecast sales it is
known, as shown in [MB99] for example, that if all competitors don’t follow a
change in own price then the maximum profit, at time t, is achieved when the
own price per unit is set at
\[ OP_t = \frac{\beta_{2,t}\, C_t}{\beta_{2,t} + 1}, \tag{5.2} \]
where $C_t$ is the cost per unit during period t and $\beta_{2,t}$ is the own elasticity at time
t, with $\beta_{2,t} < -1$ since otherwise profit always increases as own price increases. This
is found by taking the profit function (equation 5.1), substituting in the demand
model (equation 3.3), taking logarithms of both sides, differentiating with respect
to own price, setting the result equal to zero and then rearranging. Pricing decision
support systems set the optimised price as close to the price at equation 5.2 as the
volume and price differential constraints allow. This is because they assume that
competitors never react to a change in own price. Even if it is known that a
competitor will react, this information is ignored and the same optimised price is
still implemented. An example of such a system is presented in [MJL11] where
a pricing decision support system is used to optimise retail vehicle fuel pricing
using only current competitor prices. They explain that this is due to the current difficulty in forecasting competitor prices to sufficient accuracy to support
optimisation.
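The closed form in equation 5.2 can be checked numerically against a direct comparison of pricing scenarios. The sketch below assumes that the demand model of equation 3.3 has the constant-elasticity form $S_t = \beta_{1,t}\, OP_t^{\beta_{2,t}}\, CP_t^{\beta_{3,t}}$ for a single competitor, which is the form implied by equation 5.4. The parameter values and function names are made up for illustration.

```python
def sales(own_price, comp_price, b1, b2, b3):
    """Assumed constant-elasticity demand with one competitor."""
    return b1 * own_price ** b2 * comp_price ** b3


def profit(own_price, cost, comp_price, b1, b2, b3):
    """Equation (5.1)."""
    return (own_price - cost) * sales(own_price, comp_price, b1, b2, b3)


def optimal_price_ignore(cost, b2):
    """Equation (5.2): optimum when competitors do not react."""
    if b2 >= -1:
        raise ValueError("equation 5.2 needs own elasticity below -1")
    return b2 * cost / (b2 + 1)


# Illustrative values only.
cost, comp_price = 100.0, 126.0
b1, b2, b3 = 1e9, -5.0, 3.0

# Compare scenarios on a grid of candidate own prices, competitor price held fixed.
grid = [100.0 + 0.5 * k for k in range(1, 201)]
best_on_grid = max(grid, key=lambda p: profit(p, cost, comp_price, b1, b2, b3))
closed_form = optimal_price_ignore(cost, b2)
print(best_on_grid, closed_form)
```

The grid search and the closed form agree here only because the competitor price is held fixed, which is exactly the assumption questioned in the remainder of this chapter.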
Failing to take future competitor price moves into account is obviously incorrect since it’s clear that the profitability of one firm’s pricing strategy depends
crucially on the response of its closest rivals. For example, if a retailer is faced
with a single competitor, a price decrease may be worthwhile if the competitor
doesn’t react to the price change. This is because the increase in sales
due to the lower price more than compensates for the reduction in margin received
for each unit of product sold. But if the competitor will in fact copy the price
change, the increase in sales will be reduced and the overall increase may not
be enough to counteract the reduction in margin, causing a reduction in overall
profit. Therefore the optimised price may differ depending on how competitors
will react, something which will be mathematically proven.
5.2 Optimisation under competitive uncertainty
This research is dealing with the price optimisation of a product sold per unit to
consumers during a specified time period. It has been shown that if a competitor
won’t react to a change in own price, then the optimum own price is known (i.e.
the price at equation 5.2). Of course finding this optimum price is only possible
once the necessary own price elasticity is known. The prior ability to forecast
demand and to understand its sensitivity to price changes is clearly integral to
this optimisation.
The first thing which we shall do in this section is to show that the optimum
own price does not remain static for all possible competitor price reactions. To
do this we will consider the simple case where there is a single competitor and
they copy any change in own price. The optimum price in this pricing scenario
has not previously been mathematically proven but the following observation will
change this.
Observation 1. If a retailer is faced with a single competitor who copies any
change in own price exactly then the maximum profit, at time t, is achieved at
the solution to the equation
\[ \frac{1}{OP_t - C_t} + \frac{\beta_{2,t}}{OP_t} + \frac{\beta_{3,t}}{OP_t + \Delta} = 0 \tag{5.3} \]
where $OP_t$ is the own price per unit during period t, $\Delta$ is the difference between
the competitor’s price and own price per unit in time period t, $C_t$ is the cost per
unit during period t, $\beta_{2,t}$ is the own elasticity at time t, $\beta_{3,t}$ is the competitor’s
elasticity at time t, $OP_t > C_t$ and $OP_t > -\Delta$.
Proof. We take our profit function, equation 5.1, and substitute in our demand
model, equation 3.3, with one competitor. If the competitor will copy every
own price move then $CP_t = OP_t + \Delta$, where $\Delta$ is the difference between the
competitor’s price and own price per unit in time period t, and we substitute this
in. Therefore we want to maximise the function
\[ P_t = (OP_t - C_t)\, \beta_{1,t}\, (OP_t)^{\beta_{2,t}}\, (OP_t + \Delta)^{\beta_{3,t}}. \tag{5.4} \]
After taking logarithms of both sides and then differentiating with respect to own
price we have
\[ \frac{\partial \ln(P_t)}{\partial (OP_t)} = \frac{1}{OP_t - C_t} + \frac{\beta_{2,t}}{OP_t} + \frac{\beta_{3,t}}{OP_t + \Delta}, \tag{5.5} \]
where $OP_t > C_t$ since otherwise there would be no profit made and $OP_t > -\Delta$
since a competitor’s price is always positive. We find the optimum price by
making this equal to zero.
The solution to this equation can then be found by turning it into a polynomial
equation and solving to find its zeroes (checking that it is the profit maximising
price instead of the minimum since we have a quadratic equation). If the solution
to the equation is a price which minimises profit, profit always increases as own
price increases (i.e. the optimum own price occurs at infinity). In the case of a
single competitor who copies every own price change, optimisers should price at
the solution to equation 5.3 instead of at equation 5.2.
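The step of turning equation 5.3 into a polynomial can be made concrete. Multiplying through by $(OP_t - C_t)\,OP_t\,(OP_t + \Delta)$ gives the quadratic $(1 + \beta_{2,t} + \beta_{3,t})\,OP_t^2 + (\Delta + \beta_{2,t}(\Delta - C_t) - \beta_{3,t} C_t)\,OP_t - \beta_{2,t} C_t \Delta = 0$. The sketch below solves it and keeps only a root that is a maximum, judged by the sign of the second derivative of ln(P_t). The function names and parameter values are illustrative assumptions.

```python
import math


def d_log_profit(op, cost, delta, b2, b3):
    """Right-hand side of (5.5)."""
    return 1.0 / (op - cost) + b2 / op + b3 / (op + delta)


def optimal_price_follow(cost, delta, b2, b3):
    """Profit-maximising root of (5.3), or None if no admissible maximum exists."""
    a = 1.0 + b2 + b3
    b = delta + b2 * (delta - cost) - b3 * cost
    c = -b2 * cost * delta
    if a == 0.0:
        roots = [-c / b] if b != 0.0 else []
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            roots = []
        else:
            sq = math.sqrt(disc)
            roots = [(-b - sq) / (2.0 * a), (-b + sq) / (2.0 * a)]
    for op in sorted(roots):
        if op > cost and op > -delta:
            # Second derivative of ln(P_t); negative at a maximum.
            curvature = (-1.0 / (op - cost) ** 2 - b2 / op ** 2
                         - b3 / (op + delta) ** 2)
            if curvature < 0.0:
                return op
    return None


# Illustrative values: cost 100, competitor priced 2 above own price.
op_follow = optimal_price_follow(100.0, 2.0, -8.0, 4.0)
op_ignore = -8.0 * 100.0 / (-8.0 + 1.0)   # equation (5.2) for comparison
print(op_follow, op_ignore)
```

A return value of None corresponds to the case described above in which profit keeps increasing with own price.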
The optimum prices are now known for the case of a single competitor who
either copies an own price change or ignores it. If retailers are faced with
multiple competitors, where some copy own price changes but others do not, then
the same method can be used to find the optimal price. The difference is that
only competitors who copy a price change remain in equation 5.3; the others
disappear when the profit function (incorporating the demand model containing
the additional competitor prices and respective elasticity coefficients) is
differentiated. For example, if there were three competitors, the optimum price
assuming that the competitors copy every change in own price is found at the
solution to the equation,
\[ \frac{1}{OP_t - C_t} + \frac{\beta_{2,t}}{OP_t} + \frac{\beta_{3,t}}{OP_t + \Delta_1} + \frac{\beta_{4,t}}{OP_t + \Delta_2} + \frac{\beta_{5,t}}{OP_t + \Delta_3} = 0 \tag{5.6} \]
with the extra variables $\Delta_i$ being the current difference between own price and the
price for competitor i in time period t and $\beta_{2+j,t}$ being the elasticity for competitor
j in time period t. Hence the assumption that a change in any competitor’s price
reaction never has an impact on optimal own price is incorrect, disproving the
methodology used in current optimisers.
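With several copying competitors, equation 5.6 becomes a higher-degree polynomial, so a numerical root finder is a convenient alternative. The sketch below uses bisection on the derivative of ln(P_t), relying on the fact that the derivative tends to +∞ as own price approaches cost from above. It assumes the caller supplies an upper price at which the derivative is negative and that there is a single sign change in between. Names and parameter values are hypothetical.

```python
def d_log_profit(op, cost, b2, followers):
    """Left-hand side of (5.6); followers is a list of (elasticity, delta) pairs."""
    return (1.0 / (op - cost) + b2 / op
            + sum(bj / (op + dj) for bj, dj in followers))


def optimal_price_follow_multi(cost, b2, followers, upper, tol=1e-9):
    """Bisection for the root of (5.6) on (cost, upper); None if no sign change."""
    if d_log_profit(upper, cost, b2, followers) >= 0.0:
        return None   # profit still increasing at the upper price
    lo, hi = cost, upper   # derivative is positive just above cost
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if d_log_profit(mid, cost, b2, followers) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative values: three competitors who all copy own price changes.
followers = [(2.0, 2.0), (1.5, -1.0), (1.0, 0.0)]
op_follow = optimal_price_follow_multi(100.0, -8.0, followers, upper=1000.0)
op_ignore = -8.0 * 100.0 / (-8.0 + 1.0)   # equation (5.2)
print(op_follow, op_ignore)
```

Competitors who ignore own price changes are simply left out of `followers`, matching the observation that their terms disappear on differentiation.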
The optimum price when competitors ignore changes to own price is denoted
as $OP_{CP\,ignore}$ for the remainder of this section. The optimum price when all
competitors copy any change in own price is denoted as $OP_{CP\,follow}$ for the
remainder of this section. The following observation compares these two prices.
Observation 2. The optimal price when competitors copy every own price
change is never below the optimal price when competitors instead ignore every
own price change (i.e. $OP_{CP\,follow} \geq OP_{CP\,ignore}$).
Proof. To prove this observation we will consider own price increases and decreases separately. When considering an own price increase, the worst case scenario regarding overall profit is if competitors ignore the own price increase instead of copying it. This is because each competitor’s price is cheaper if they
ignore the price increase compared with if they copy it. Although we receive the
same margin per unit sold in each scenario, we receive fewer sales when competitor
prices are cheaper. Therefore if it is worthwhile to increase own price (compared
with leaving own price unchanged) when competitors don’t follow the price rise
(i.e. the worst case scenario), it is certainly a worthwhile price move if they do
follow the price rise. Hence when considering an own price increase, the optimal
price when competitors copy the price increase is never below the optimal price
if competitors ignore the price increase.
When considering an own price decrease, the worst case scenario regarding
overall profit is if competitors copy the own price decrease instead of ignoring it.
This is because each competitor’s price is cheaper if they copy the price decrease
compared with if they ignore it. Although we receive the same margin per unit
sold in each scenario, we receive fewer sales when the competitor prices are cheaper.
Therefore if it is worthwhile to decrease own price (compared with leaving own
price unchanged) when competitors follow the price reduction (i.e. the worst case
scenario), it is certainly a worthwhile price move if they don’t follow the price
reduction. Hence when considering an own price decrease, the optimal price when
competitors copy the price decrease is never below the optimal price if competitors
ignore the price decrease.
We have therefore shown that in either situation the optimal price when competitors copy a price change is never below the optimal price if competitors ignore
a price change.
In many cases, competitor prices can’t be forecasted. In these circumstances
we wish to maximise the highest guaranteed profit. To do this we first need to
consider how a competitor could react to a change in own price. As stated in the
proof of observation 2, in the worst case scenario competitors ignore own price
increases but follow price decreases,
\[ CP_t = \begin{cases} CP_{t-1} & \text{if } OP_t \geq OP_{t-1} \\ CP_{t-1} + \delta & \text{if } OP_t < OP_{t-1} \end{cases} \tag{5.7} \]
where $\delta = OP_t - OP_{t-1}$. This uses the reasonable assumption that competitors
won’t react by moving their price by a greater amount than the change in own
price and that they won’t react in the opposite direction to a change in own price
(i.e. an own price increase won’t be the cause of a competitor price decrease).
If a competitor changed their price by more than the change in own price we
can assume that they were going to change their price anyway, even if own price
had remained unchanged. This theory fits within the setting of a retailer
optimising a kinked demand curve, where competitors do not follow price rises
but do react to price decreases. If competitors react to an own price change but
move their price by a smaller amount (i.e. not copying the price change exactly),
the optimum own price will be somewhere between $OP_{CP\,ignore}$ and $OP_{CP\,follow}$.
This can be shown using the same theory as used in the proof of observation 2.
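The worst-case reaction in equation 5.7, and the guaranteed profit it implies, can be written down directly. The sketch below assumes a single competitor and the constant-elasticity demand form implied by equation 5.4. The function names and numerical values are illustrative only.

```python
def worst_case_competitor_price(cp_prev, op_prev, op_new):
    """Equation (5.7): ignore own price rises, copy own price cuts."""
    if op_new >= op_prev:
        return cp_prev
    return cp_prev + (op_new - op_prev)


def guaranteed_profit(op_new, op_prev, cp_prev, cost, b1, b2, b3):
    """Profit at op_new if the competitor reacts in the worst case."""
    cp = worst_case_competitor_price(cp_prev, op_prev, op_new)
    return (op_new - cost) * b1 * op_new ** b2 * cp ** b3


def profit_if_ignored(op_new, cp_prev, cost, b1, b2, b3):
    """Profit at op_new if the competitor leaves their price unchanged."""
    return (op_new - cost) * b1 * op_new ** b2 * cp_prev ** b3


# Illustrative values: own price cut from 100 to 97, competitor at 102.
args = dict(cost=90.0, b1=1e6, b2=-8.0, b3=4.0)
print(guaranteed_profit(97.0, 100.0, 102.0, **args),
      profit_if_ignored(97.0, 102.0, **args))
```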
Now since $OP_{CP\,ignore} \leq OP_{CP\,follow}$, as proven in observation 2, when maximising
guaranteed profit we should:
• If the current own price is below $OP_{CP\,ignore}$, increase own price to $OP_{CP\,ignore}$.
• If the current own price is above $OP_{CP\,follow}$, decrease own price to $OP_{CP\,follow}$.
This is because however competitors actually react, optimising as above will move
the current own price closer to both $OP_{CP\,ignore}$ and $OP_{CP\,follow}$. Therefore own
price will be moved closer to the actual optimal price and (due to the demand
model used) profits increase as own price moves closer to the optimal price. Hence
using this methodology increases profits. Moving own price further than what has
been advised won’t guarantee an increase in profit as we will then be moving away
from either $OP_{CP\,ignore}$ or $OP_{CP\,follow}$, causing current overall profit to be reduced
if the competitor reacts in the worst case scenario, as described above. When
initialising this optimisation, the current own price could feasibly be between
$OP_{CP\,ignore}$ and $OP_{CP\,follow}$. In this case no price movement improves guaranteed
profit and own price should be left untouched. This is because any change in own
price will move it further away from either $OP_{CP\,ignore}$ or $OP_{CP\,follow}$, causing
overall profit to be reduced (compared with keeping own price unchanged) if the
competitor reacts in the worst case scenario.
If there are volume or price constraints, own price should be set as close as
the constraints allow to the optimum price which provides the highest guaranteed
profit. In fact we can only trust an advised optimal price if it falls within the
range of each competitor’s price found in previous data. Therefore price constraints should always be instituted, with the most relaxed constraints matching
the maximum difference between own price and competitor prices found in the
data used to fit the demand model. This optimisation is displayed in algorithm
3.
Algorithm 3 Optimisation of own price which provides the highest guaranteed
profit
[Let the price which maximises profit when all competitors copy changes in own
price be $OP_{CP\,follow}$.]
1: Set $OP_t$ ← initial own price.
2: if $OP_t < \frac{\beta_{2,t} C_t}{\beta_{2,t}+1}$ then
3:     Set $OP_t$ as close as constraints allow to $\frac{\beta_{2,t} C_t}{\beta_{2,t}+1}$ (i.e. $OP_{CP\,ignore}$).
4: else if $OP_t > OP_{CP\,follow}$ then
5:     Set $OP_t$ as close as constraints allow to $OP_{CP\,follow}$.
6: else if $OP_t$ doesn’t satisfy constraints then
7:     Set $OP_t$ to nearest price which satisfies all the constraints.
8: end if
9: return $OP_t$    ▷ The optimised own price
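Algorithm 3 can be sketched in a few lines once the two reference prices are known. The code below is a minimal illustration under stated assumptions: the volume and price differential constraints are represented simply as a list of feasible prices, and prices are held as integers (tenths of a penny) to avoid floating-point comparisons. The function name and example values are hypothetical.

```python
def guaranteed_profit_price(op_current, op_ignore, op_follow, feasible_prices):
    """Algorithm 3 with constraints expressed as a list of feasible prices."""
    def nearest(target):
        return min(feasible_prices, key=lambda p: abs(p - target))

    if op_current < op_ignore:               # steps 2-3
        return nearest(op_ignore)
    if op_current > op_follow:               # steps 4-5
        return nearest(op_follow)
    if op_current not in feasible_prices:    # steps 6-7
        return nearest(op_current)
    return op_current                        # already between the two optima


# Illustrative values in tenths of a penny; feasible prices end in .9 pence.
feasible = list(range(1059, 1409, 10))
op_ignore, op_follow = 1142.9, 1326.9

for current in (1099, 1389, 1209, 1205):
    print(current, guaranteed_profit_price(current, op_ignore, op_follow, feasible))
```

A price already lying between the two reference prices is left untouched, which is the behaviour that leads to the asymmetry discussed next.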
When a retailer optimises their price according to algorithm 3, often $OP_{CP\,ignore}$
is below $OP_{CP\,follow}$. An interesting scenario to consider is if a cost increase
occurs which makes it optimal to raise own price but is then followed shortly after
by a cost decrease of the same absolute size (i.e. returning the cost to its original
position). Own price ends up at $OP_{CP\,follow}$ after a decrease, hence if own price
was not at $OP_{CP\,follow}$ before the initial cost increase, it will not return to the
same price after the cost decrease. If it was not originally priced at $OP_{CP\,follow}$,
the own price must have been at a lower price since this optimisation (algorithm
3) causes own price to be positioned at or between $OP_{CP\,follow}$ and the lower price
at $OP_{CP\,ignore}$. Hence an increase in cost followed by an identically sized decrease
can cause own price to end up at an increased overall price when optimising
guaranteed profit. Therefore own price can increase by a greater amount as costs
go up compared with how much it decreases when costs go down. If retailers
are unable to predict how competitors will react to a change in price then they
will need to price with some caution, perhaps pricing as advised in or similar
to the optimisation above. A competitive retail environment can therefore cause
prices to remain high even as costs reduce back to previous levels simply because
retailers are worried and unsure about how their competitors will respond. This
explains why prices seem to go up when costs rise but do not return to their
original price when costs return to their original position in a seemingly
competitive marketplace. On the other hand, the reverse can happen when a cost
decrease is followed by an increase of the same amount. Own price would end up at a
lower price compared with its initial price if it was not initially positioned at
$OP_{CP\,ignore}$.
Although businesses today are not employing this algorithm directly, it can
explain their behaviour simply because it correctly incorporates the uncertainty
which retailers are faced with when making pricing decisions, unlike other pricing
optimisations which assume competitor prices will remain unchanged. Retailers
don’t know how competitors will react to an own price change and they need
to consider the worst case scenario in order to decide whether the change would
be worthwhile, just like what is done in this algorithm. They therefore price
with some caution, exhibiting the same characteristics as this optimisation. This
mathematical analysis produces very interesting pricing results which may provide the basis for explaining the presence of price asymmetry (i.e. asymmetric
price transmission) often evident in competitive markets. It also concludes the
research in this thesis regarding the optimisation of own price when competitor
prices aren’t forecasted. The next section attempts to incorporate competitor
price forecasts into the optimisation.
5.3 Optimisation using forecasted competitor prices
As previously stated, current pricing decision support systems do not forecast
competitor prices but instead assume they will remain unchanged during the optimisation period. The own price which maximises profit whilst satisfying the
necessary constraints is then implemented under this assumption. Clearly the
problem with this optimisation is that a competitor’s price will not always remain constant. Therefore when new pricing survey data is received, competitor
prices could be shown to have changed. Once a competitor price is known to
have changed, current pricing decision support systems can either choose the
best own price for the remainder of the optimisation period, under the assumption that no further change will occur, or keep the current optimisation of own
price unchanged. If a new optimised own price is chosen for the remainder of
the optimisation period, this optimisation repeats for every observed competitor
price change. Clearly the actual competitor prices implemented throughout the
optimisation period may have caused the chosen own prices to have been suboptimal. Therefore if a retailer knew that their competitors would act this way
then they may not have made these own price decisions. Hence a superior optimisation would be to optimise own price without assuming competitor prices will
remain unchanged but instead forecasting what their future prices will be.
Before a retailer can decide upon their best own price, every potential pricing
option needs to be considered together with the resultant competitor prices in each
case. During any period for which a retailer wishes to optimise their price, there
will be times at which a change in their price can occur. This may be at the start
of the optimisation period, in reaction to other competitor price moves or at some
other time. For each own price decision there will only be a certain number of
potential options. This is due to price differential constraints and the fact that
retailers often only price at specific price points (for example only pricing at .9
positions). Whenever an own price decision is required, the different potential
own prices need to be found and each one needs to be considered individually, as
shown in figure 5.1.
Figure 5.1: Own Price options
During the same optimisation period there will also be times at which new
competitor price survey data will be received, showing the actual competitor
prices at those times. These prices could obviously have changed since the last
pricing survey and current competitor prices may be affected by previous own
price decisions. Figure 5.2 presents a general example of what may happen with
one competitor, with the different own price choices resulting in different competitor prices.
Figure 5.2: Competitor Price response
Currently pricing decision support systems assume that competitor prices
will remain unchanged. Removing this assumption and instead forecasting these
prices will enable us to compare how different own prices will affect competitor
prices and overall sales. Of course during any optimisation period there can be
multiple own price decisions and multiple competitor pricing surveys. Therefore
every possible choice for own price throughout the optimisation period, together
with the resultant competitor prices for each choice, needs to be considered by a
retailer.
For each product sold by a retailer, if there are n own price decisions during
an optimisation period, the overall optimisation problem is to choose the set of
own prices which maximise the total profit
n
X
(OPt − Ct ) ∗ St
P =
(5.8)
t=1
subject to the volume constraint
n
X
St ≥ V
(5.9)
t=1
and where OP_t can only take values from a discrete range of prices. Here P
represents the total profit during the optimisation period, OP_t is the own price
per unit during period t, C_t is the wholesale cost per unit during period t,
S_t represents the sales achieved by a retailer during period t and V is the overall
volume constraint. As stated previously in this thesis, S_t in equation 5.8 is a
function of both OP_t and the price of each competitor during period t.
Therefore using forecasted competitor prices enables us to forecast S_t.
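As a minimal sketch of how equations 5.8 and 5.9 might be evaluated for one candidate set of own prices, the following assumes the forecasted sales for each period are already available. The function names and the figures are illustrative assumptions, not part of the thesis.

```python
from typing import Sequence


def total_profit(own_prices: Sequence[float],
                 costs: Sequence[float],
                 sales: Sequence[float]) -> float:
    """Equation 5.8: sum over periods of (OP_t - C_t) * S_t."""
    if not (len(own_prices) == len(costs) == len(sales)):
        raise ValueError("one own price, cost and sales figure is needed per period")
    return sum((op - c) * s for op, c, s in zip(own_prices, costs, sales))


def meets_volume(sales: Sequence[float], min_volume: float) -> bool:
    """Equation 5.9: total sales must be at least V."""
    return sum(sales) >= min_volume


# Illustrative figures only (prices and costs in pence per litre, sales in litres).
own_prices = [130, 132]
costs = [125, 125]
sales = [1000, 900]
print(total_profit(own_prices, costs, sales))   # 11300
print(meets_volume(sales, min_volume=1800))     # True
```

Working in whole pence keeps the arithmetic exact; in practice S_t would come from the demand model rather than being supplied by hand.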
My first contribution regarding own price optimisation is to use forecasted
competitor prices instead of the current competitor prices which are naively
used in existing pricing decision support systems. My second contribution is to
treat own price optimisation as a discrete optimisation problem, rather than as
a continuous optimisation problem solved using classical nonlinear programming
techniques as is currently the case. Using this framework will not only be
simpler, it will also be easier for domain pricing managers to understand. This
is important since the easier the system is to understand and accept, the more
likely it is to be transformed from academic research into commercial
application. The next section will propose a new way to complete this
optimisation, also showing how it can be implemented.
5.3.1
Exhaustive Game-Tree Search
Using an exhaustive search of a game-tree to optimise price has numerous
advantages. Firstly, it provides a discrete rather than a continuous optimisation
(the latter is often found in current pricing decision support systems), so it
only considers actual potential pricing options, making the overall optimisation
comparatively more efficient. It also incorporates the future price reaction of
competitors over time into the optimisation, something which is missing from
current price optimisers since they don’t forecast competitor prices. Future own
price decisions are also investigated, dynamically optimising own price rather
than solely implementing a static price which may not fully optimise profits.
Hence it enables the investigation of all the pricing options/scenarios over an
entire optimisation period.
Figure 5.3: A game-tree
Completing an exhaustive search of a game-tree is a relatively straightforward
technique. It simply involves considering all the potential options available to
the decision maker over a given period before choosing the one which returns the
best overall result. A game-tree is used here to represent all the possible pricing
scenarios, by finding and plotting all these possible permutations together. Figure
5.3 presents a general example of a game-tree with two own price decisions and
two competitor pricing surveys for a single competitor. The different combinations
of choices for own price throughout the optimisation period can be viewed as being
different paths of own price. Each path contains the forecasted competitor prices
which are expected to occur should a retailer implement the own prices in this
path. We forecast the prices of each competitor rather than also trying to find
their possible pricing options because:
1. The tree would become too large to search if all these permutations were
considered.
2. We don’t know each competitor’s pricing options because we do not know
their pricing differential constraints.
3. We need to predict which price a competitor will choose and it makes sense
to predict their prices whilst building the game-tree.
Competitor prices are not forecasted to impact each other since we do not know
which of our competitors are in direct competition with each other. We are not
using competitor prices to forecast the prices of other competitors in chapter 4
for the same reason.
Evaluating every pricing scenario is the same as considering every path of
prices. The number of paths is dependent on the number of own price decisions
and also the number of options for own price at each decision. Interestingly,
the number of paths is independent of the number of competitors. In order to
construct a game-tree detailing all the possible paths, we start with the initial
own price and competitor prices. Then working our way chronologically through
the optimisation period:
• At the time of an own price decision we go through each path working out
the discrete set of options available in each case, determined by current
competitor prices. Whenever there is more than one option available, the
number of paths will increase.
• At the time of a pricing survey for any competitor we go through each path
and forecast their price.
Once all the own price decisions have been considered and all the prices at
competitor surveys have been forecasted, a retailer has insight into every
possible pricing scenario for own price together with the expected competitor
prices in each case. Before we can construct the game-tree though, the timing
of these pricing surveys and own price decisions needs to be considered.
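As a rough illustration of how the tree grows, suppose (a simplifying assumption not made in the text) that the t-th own price decision offers the same number of options m_t on every path. Pricing surveys only append a forecast to each existing path, so they never split a path, and the number of paths N after n own price decisions is

\[ N = \prod_{t=1}^{n} m_t . \]

For example, with n = 2 decisions and m_1 = m_2 = 3 options there are N = 3 \times 3 = 9 paths, whether there is one competitor or twenty. The symbols N and m_t are introduced here for illustration only. In general the options available at a decision depend on the competitor prices in that path, so the count has to be accumulated path by path.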
As stated in section 4.2, the time of future pricing surveys is usually known
by a retailer and is therefore already predetermined for the optimisation. The
time at which own price decisions can be made has not been specified. Own
price decisions can be made at any time, but if we simply wish to provide a
superior optimisation compared with current techniques, then we only need to
make own price decisions at the times at which current optimisers allow own
price to change.² They currently allow an own price change to
occur at the start of an optimisation period. As time elapses, new competitor
surveys often show that prices have changed. In this case, the optimisation is
often repeated from the time of the most recent competitor price survey using
the new competitor prices, and own price is allowed to change. Therefore own
price can currently also be changed at the time of competitor pricing
surveys, although the optimisation assumes that the competitor prices will always
remain unchanged from their current price. Our game-tree optimisation will
therefore make own price decisions at the start of the optimisation period and
at the time of competitor price surveys.
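The timing rule above can be sketched as follows. This is a minimal illustration that assumes times are plain numbers (for example hours since the start of the period); the function name is hypothetical, and excluding a survey that falls exactly at the end of the period is an assumption made here.

```python
from typing import Iterable, List


def own_price_decision_times(period_start: float,
                             period_end: float,
                             survey_times: Iterable[float]) -> List[float]:
    """Return the sorted, de-duplicated times at which own price may change."""
    if period_end <= period_start:
        raise ValueError("the optimisation period must have positive length")
    # Own price can always change at the start of the period.
    times = {period_start}
    # Only surveys falling inside the period create an own price decision.
    times.update(t for t in survey_times if period_start <= t < period_end)
    return sorted(times)


# Hours since the start of a 24-hour period; surveys at 9 and 15 (one duplicated),
# plus one survey at hour 30 that lies outside the period.
print(own_price_decision_times(0, 24, [9, 15, 15, 30]))  # [0, 9, 15]
```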
If we are able to forecast the prices of each competitor we can use these predictions together with the initial own price, initial competitor prices, times of pricing
surveys and price differential constraints to construct the game-tree. Beginning
with the initial prices at the start of the optimisation period we work through in
chronological order predicting competitor prices and working out the options for
own price whenever required. Once the end of the optimisation period is reached
the game-tree construction has finished. Each path in the game-tree shows a
different potential choice for own price together with the expected competitor
prices in this scenario.
² This is an assumption which does not always need to hold; it is just used here to show a
comparison between the optimisation in current pricing decision support systems and the new
proposed game-tree optimisation.
Once the game-tree for the optimisation period has been constructed, each
path in the game-tree can be viewed as a separate set of own prices. If we use
average prices to forecast demand, in order to complete the optimisation we firstly
need to work out, for each different path, the time-weighted average own price
and time-weighted average competitor prices. Whichever technique is used to
forecast sales, we find the forecasted sales for each path. The total profit is
then found for each path (using the profit function in equation 5.1).
By only analysing the paths which provide forecasted sales satisfying the volume
constraint, the profits in the remaining paths can be compared and the one which
provides the maximum forecasted profit can then be chosen. We know that the
pricing differential constraints are satisfied since otherwise the own prices would
not have been considered originally when constructing the game-tree.
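A time-weighted average price for one path could be computed along the following lines. This is a sketch under the assumption that each price stays in force until the next change time; the function and argument names are illustrative.

```python
from typing import Sequence


def time_weighted_average(change_times: Sequence[float],
                          prices: Sequence[float],
                          period_end: float) -> float:
    """Average price over a period, weighting each price by how long it was in force.

    prices[i] applies from change_times[i] until the next change time
    (or until period_end for the last price).
    """
    if len(change_times) != len(prices) or not prices:
        raise ValueError("need exactly one price per change time")
    boundaries = list(change_times) + [period_end]
    durations = [later - earlier for earlier, later in zip(boundaries, boundaries[1:])]
    if any(d < 0 for d in durations) or sum(durations) <= 0:
        raise ValueError("change times must be increasing and lie before period_end")
    weighted = sum(p * d for p, d in zip(prices, durations))
    return weighted / sum(durations)


# 130p for 6 hours, 132p for 12 hours, 128p for 6 hours.
print(time_weighted_average([0, 6, 18], [130, 132, 128], 24))  # 130.5
```

The same function would be applied to the own prices and to each competitor's forecasted prices in a path before the averages are passed to the demand model.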
Once the optimal own price has been implemented, future competitor pricing
surveys will be received over time. The forecasted competitor prices for each competitor are then compared with the actual competitor prices. If they are different,
the game-tree optimisation needs to be re-run from the start of the optimisation
period using the actual own prices and competitor prices up to that current point
in time. This continues until all the new pricing survey data is received for the optimisation period. Figure 5.4 presents each of the steps in the algorithm. Clearly
the more accurate the competitor price forecasts are, the greater the expected
improvement in the overall profit optimisation of this proposed method and the
fewer times it needs to be re-run. For any period in which we wish to optimise
own price, algorithm 4 shows how to apply this game-tree optimisation.
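The re-run check described above amounts to comparing each surveyed price with its forecast. A minimal sketch, with a hypothetical function name and a `tolerance` parameter that is an addition for illustration (the default of zero corresponds to re-running on any difference):

```python
from typing import Dict, List


def competitors_to_recheck(forecast: Dict[str, float],
                           actual: Dict[str, float],
                           tolerance: float = 0.0) -> List[str]:
    """Competitors whose surveyed price differs from the forecast by more than tolerance."""
    return sorted(name for name, price in actual.items()
                  if name not in forecast or abs(price - forecast[name]) > tolerance)


# Illustrative prices in pence per litre.
forecast = {"A": 131.9, "B": 129.9}
actual = {"A": 131.9, "B": 130.9}
changed = competitors_to_recheck(forecast, actual)
print(changed)        # ['B']
print(bool(changed))  # True, so the optimisation would be re-run
```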
Figure 5.4: Game-tree optimisation
Algorithm 4 Game-tree Algorithm
Construct the first path of the game-tree using the initial own price and initial competitor prices.
Assign the current time (CT) to be the time at the start of the optimisation period.
while CT is earlier than the time at the end of the optimisation period do
    Assign CT to be the time of the earliest occurrence of either: the next CP survey for any competitor, the next OP decision or the end of the optimisation period.
    if CT = time of next CP survey then
        Let P be the number of different paths in the current game-tree.
        for k from 1 to P do
            In path k, forecast the prices for the required competitors.
            Add the forecasted prices to path k in the game-tree.
        end for
    end if
    if CT = time of next OP decision then
        Let P be the number of different paths in the current game-tree.
        for k from 1 to P do
            In path k, find the potential own prices using the current (perhaps forecasted) competitor prices and price differential constraints.
            Add the potential prices to path k in the game-tree. (If there is more than one potential price, the number of paths will increase.)
        end for
    end if
end while
Stop once the end of the optimisation period is reached.
Let P be the number of different paths in the current game-tree.
for k from 1 to P do
    Using the set of own prices and competitor prices for path k, predict the sales and profit for path k.
end for
Remove paths where the forecasted sales violate the volume constraint.
Find the remaining path of own prices which produces the maximum profit.
return Optimised Own Price    ▷ Implement optimum path of own prices
New pricing surveys are then received during the optimisation period.
for each competitor price in the new pricing survey do
    if new actual price ≠ the forecasted price then
        Repeat optimisation from the beginning using the actual own prices and competitor prices up to the current point in time.
    end if
end for
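The tree construction and exhaustive search in Algorithm 4 might be sketched in Python as below. This is an illustration under stated assumptions rather than the thesis implementation: the `Path` class, the event list representation and the three callbacks (`forecast`, `options`, `evaluate`) are hypothetical stand-ins for the SVR forecasts, the price differential constraints and the demand model. Events are assumed to be supplied in chronological order, with a survey listed before a decision when both occur at the same time.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence, Tuple


@dataclass
class Path:
    own: Tuple[float, ...]               # own prices in chronological order
    comp: Dict[str, Tuple[float, ...]]   # (forecasted) prices per competitor


Event = Tuple[str, Optional[str]]        # ("decision", None) or ("survey", competitor)


def build_game_tree(initial_own: float,
                    initial_comp: Dict[str, float],
                    events: Sequence[Event],
                    forecast: Callable[[Path, str], float],
                    options: Callable[[Path], Sequence[float]]) -> List[Path]:
    """Expand every path through the chronologically ordered events."""
    paths = [Path((initial_own,), {c: (p,) for c, p in initial_comp.items()})]
    for kind, competitor in events:
        if kind == "survey":
            # A survey adds one forecast to each path; the number of paths is unchanged.
            paths = [
                Path(p.own, {**p.comp, competitor: p.comp[competitor] + (forecast(p, competitor),)})
                for p in paths
            ]
        elif kind == "decision":
            # A decision creates one child path per admissible own price.
            paths = [Path(p.own + (price,), dict(p.comp)) for p in paths for price in options(p)]
        else:
            raise ValueError(f"unknown event type: {kind}")
    return paths


def best_path(paths: Sequence[Path],
              evaluate: Callable[[Path], Tuple[float, float]],
              min_volume: float) -> Optional[Tuple[float, Path]]:
    """Drop paths violating the volume constraint; return (profit, path) of the best, or None."""
    best = None
    for path in paths:
        sales, profit = evaluate(path)
        if sales >= min_volume and (best is None or profit > best[0]):
            best = (profit, path)
    return best


# Toy stand-ins (assumptions, in pence per litre) for the forecasting models and constraints.
def forecast(path, competitor):
    return path.own[-1] + 1            # competitor prices 1p above our latest price


def options(path):
    cp = path.comp["A"][-1]
    return [cp - 1, cp, cp + 1]        # price differential constraint of +/- 1p


def evaluate(path):
    own, cp = path.own[-1], path.comp["A"][-1]
    sales = 1000 - 100 * (own - cp)    # toy linear demand, wholesale cost of 125p
    return sales, (own - 125) * sales


events = [("decision", None), ("survey", "A"), ("decision", None)]
paths = build_game_tree(130, {"A": 131}, events, forecast, options)
profit, path = best_path(paths, evaluate, min_volume=1000)
print(len(paths), profit, path.own)    # 9 8000 (130, 132, 133)
```

In this toy run the unconstrained optimum (own price 134p, profit 8100) sells only 900 units, so the volume constraint of 1000 selects the path ending at 133p instead.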
5.4
Complete Optimisation
This section will show how the DLM demand model presented in chapter 3, the
SVR competitor price model presented in chapter 4 and the game-tree optimisation
presented in section 5.3 can be combined to optimise own price over a given
number of optimisation periods. Implementing this optimisation involves a number
of steps. First, the number of previous periods used as training data for each
of the models needs to be assigned. Once this is done the DLM demand model can
be initialised. To do this the observation equation needs to be constructed and
the prior means for each of the coefficients need to be set. The training set is
fed, one period at a time, in chronological order into the demand model and the
coefficients in the model update each time. If the size of any elasticity is
outside the acceptable range after an update, then it is corrected. Once the
training set has been input we have our fitted demand model. The SVR competitor
price model is now initialised. This is done by fitting the training data and by
finding the bounds that will be used to restrict the future predictions.
We now construct the game-tree and then find and implement the optimum
own price. As time passes and competitor pricing surveys are received, if a
competitor’s actual price is different from our forecasted price then the
optimisation is re-run using the actual prices. If the actual price is the same
as the forecasted price then we keep the initial optimised own price and the
optimisation is not re-run. This repeats as each pricing survey is received.
When the end of the optimisation period is reached, all the new pricing survey
data for that period is collated. The actual average prices are found and the
demand model is updated, producing new elasticity estimates. Once again, if the
size of any elasticity is outside the acceptable range, then it is corrected.
Also, if the movement of an elasticity is larger than what is allowed, its
movement is constrained. The training set for the SVR competitor price model is
also updated, with the data from the most recent period added and the data from
the oldest period in the training data removed. The optimiser is now ready for
the next period, perhaps with new constraints. Figure 5.5 presents each of the
steps in the algorithm. Algorithm 5 shows how to apply this completed price
optimiser.
Figure 5.5: Complete price optimiser
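The two end-of-period housekeeping steps can be sketched as follows. How an out-of-range elasticity is "corrected" is defined in chapter 3; the sketch below simply assumes clipping to the nearest bound, and the function name, the bounds and the movement limit are all hypothetical.

```python
from collections import deque


def constrain_coefficient(previous, updated, lower, upper, max_move):
    """Limit the move from the previous estimate, then clip to [lower, upper]."""
    if lower > upper or max_move < 0:
        raise ValueError("invalid bounds or movement limit")
    limited = min(max(updated, previous - max_move), previous + max_move)
    return min(max(limited, lower), upper)


# Hypothetical own price elasticity bounded to [-3.0, -0.5], moving at most 0.5 per period.
print(constrain_coefficient(-2.0, -3.5, -3.0, -0.5, 0.5))    # -2.5
print(constrain_coefficient(-2.75, -3.25, -3.0, -0.5, 0.5))  # -3.0

# Rolling SVR training window: appending a new period drops the oldest one.
window = deque(["period 1", "period 2", "period 3"], maxlen=3)
window.append("period 4")
print(list(window))  # ['period 2', 'period 3', 'period 4']
```

Using a `deque` with `maxlen` keeps the training window at a fixed length without any explicit removal step.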
Algorithm 5 Complete Optimisation Algorithm
Assign training set for the demand model.
Construct observation equation.
Set prior means and variances for coefficients in observation equation.
Use MLE on training data to find Vt and Wt.
Let N be the number of time-steps in the training data.
for i from 1 to N do
    Use current coefficients in the observation equation to forecast sales for time-step i.
    Use actual sales data from time-step i to update the coefficients.
    if any coefficients go outside of their bounds once updated then
        Correct the relevant coefficients.
    end if
end for
for each new period where we wish to optimise Own Price do
    for each competitor do
        Assign training set for the competitor price model.
        Set parameters of SVR model.
        Train SVR model using the current training set.
    end for
    Find initial own price, initial competitor prices, price differential constraints, times of future pricing surveys and times at which own price can change.
    Construct the first path of own prices using the value of the initial own price and initial competitor prices.
    Assign the current time (CT) to be the time at the start of the optimisation period.
    while CT is earlier than the time at the end of the optimisation period do
        Assign CT to be the time of the earliest occurrence of either: the next CP survey for any competitor, the next OP decision or the end of the optimisation period.
        if CT = time of next CP survey then
            Let P be the number of different paths in the current game-tree.
            for k from 1 to P do
                In path k, forecast the prices for the required competitors.
                Add the forecasted prices to path k in the game-tree.
            end for
        end if
        if CT = time of next OP decision then
            Let P be the number of different paths in the current game-tree.
            for k from 1 to P do
                In path k, find the potential own prices using the current (perhaps forecasted) competitor prices and price differential constraints.
                Add the potential prices to path k in the game-tree. (If there is more than one potential price, the number of paths will increase.)
            end for
        end if
    end while
    Stop once the end of the optimisation period is reached.
    Let P be the number of different paths in the current game-tree.
    for k from 1 to P do
        Find the time-weighted average own price and competitor prices for path k.
        Using the set of time-weighted average prices, predict the sales for path k using the current estimates of the coefficients in the demand model.
        Using the profit function, predict the profit for path k.
    end for
    Remove paths where the forecasted sales violate the volume constraint.
    Find the remaining path of own prices which produces the maximum profit.
    return Optimised Own Price    ▷ Implement optimum path of own prices
    New pricing surveys are then received during the optimisation period.
    for each competitor price in the new pricing survey do
        if new actual price ≠ the forecasted price then
            Repeat optimisation from the beginning using the actual own prices and competitor prices up to the current point in time.
        end if
    end for
    Repeat optimisation until all future pricing surveys in the optimisation period have been received.
    Once the end of the optimisation period has been reached, the actual sales and actual average prices for the period are used to update the coefficients in the observation equation.
    if any of the coefficients go outside of their bounds once updated then
        Correct the relevant coefficients.
    end if
    if the movement of any of the coefficients from their previous value is too great when updated then
        Correct the relevant coefficients.
    end if
    Use MLE to find new values for Vt and Wt after a specified number of future time steps.
end for

We can compare this optimisation with the guaranteed profit optimisation
presented in section 5.2. Instead of forecasting competitor prices, the guaranteed
profit optimisation algorithm would insert the competitor price which impacts
own sales the most (i.e. the worst case scenario) whenever own price is changed.
As stated previously, in the worst case scenario competitors ignore own price
increases but copy price decreases. This would result in the advised own price
moves being much more cautious, meaning fewer changes and changes of smaller
absolute size. This is because a competitor will often not be forecasted to react in
the most detrimental way and therefore an advised price move will be more likely
to occur in the price optimiser presented in this section. If an own price move
will increase profit but it isn’t implemented, this will result in reduced profit. In
reality there may be plenty of cases where competitors don’t price in the way
which negatively impacts own sales the most and therefore pricing according to
the algorithm which optimises guaranteed profit will reduce overall profit. Of
course the accuracy of the SVR model will dictate whether competitor prices
can be forecasted correctly. If they can then the algorithm providing maximum
guaranteed profit when pricing under competitive uncertainty isn’t required as it
will reduce overall profit.
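The worst case response used by the guaranteed profit optimisation can be illustrated with a small sketch. It assumes that "copying" a price decrease means the competitor cuts by the same absolute amount, which is one possible reading; section 5.2 gives the actual definition. The function name is hypothetical.

```python
def worst_case_competitor_price(old_own, new_own, competitor_price):
    """Competitor ignores our increases but matches the size of our decreases."""
    if new_own < old_own:
        return competitor_price - (old_own - new_own)
    return competitor_price


# Illustrative prices in pence per litre.
print(worst_case_competitor_price(130, 132, 131))  # 131 (our increase is ignored)
print(worst_case_competitor_price(130, 128, 131))  # 129 (our 2p cut is copied)
```

In the game-tree optimiser this fixed rule is replaced by the competitor price forecast, which is why its advised price moves are less cautious.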
5.5
Summary
Optimising own price is extremely important for any retailer and it is achieved
by maximising profit, finding the own price which gives the best balance
between margin and sales, under certain volume and price constraints. Current
pricing decision support systems optimise own price assuming competitor prices
won’t change, despite the fact they don’t actually forecast their future prices.
It can be shown mathematically that the optimal own price depends on how
competitors react, showing the main flaw with current methodology. Instead,
maximising guaranteed own profit is a superior optimisation of own price
compared with naively assuming that competitor prices will remain unchanged. This
new optimisation is the first to explicitly incorporate the competitive nature of
the market into the pricing solution. An even better optimisation would
be to forecast competitor price moves under different own price decisions and then
choose the own price which maximises profit whilst satisfying any set constraints.
Prices in reality are only ever set at specified price points (rarely containing
more than a few decimal places) and hence any pricing decision is made from a
discrete set of potential prices. Plotting a game-tree of all the own price decisions over a designated optimisation period, together with subsequent forecasted
competitor prices, allows a retailer to visualise all of the potential paths for own
price over that period. An exhaustive search of the resultant profit and sales
achieved from each path in this game-tree can then be completed in order to
find the best own prices. The advantages of solving this discrete optimisation
problem rather than optimising own price as the solution to a continuous optimisation problem solved using classical nonlinear programming techniques, as is
currently the case, are firstly that it is simpler and easier for pricing managers
to understand and implement themselves. This factor increases the chances of
this pricing optimisation system being transformed from academic research into
commercial application. It is also not susceptible to the rounding errors caused
by moving the current continuous price solution to the nearest acceptable pricing position. Incorporating this optimisation stage together with the proposed
techniques from previous chapters used to forecast demand and competitor prices
creates a complete pricing decision support system.
Chapter 6
Case Study: Retail Vehicle Fuel
Industry
This chapter will present the problem retailers are faced with when optimising their
prices in the retail vehicle fuel market. The methods proposed in previous chapters are then individually tested using real commercial data from this industry,
with their results analysed. The first data set contained the daily sales, wholesale
cost per litre, time-weighted average own price per litre and time-weighted average competitor prices per litre (although their sales are unknown) from over an
entire year at 109 different fuel station sites. This was used to test the proposed
demand model methodology. The second data set contained the wholesale cost
per unit at all times, the own price per unit at all times and competitor prices per
unit together with the times at which they were observed from almost an entire
year at 100 different fuel station sites. This was used to test the competitor price
forecasting and overall price optimisation methodologies. Each proposed method
proves its superiority compared with techniques currently implemented.
6.1
Background
Understanding how to optimise price setting for a product is rarely straightforward and the sensitivity of the final profit to price is often crucial, as we have
previously shown. The retail fuel industry is a prime example of this. It is an
extremely competitive industry, where many companies face the task of pricing
multiple grades of fuel throughout a network of stations across a wide geographic
area. This has naturally led researchers to investigate using pricing decision support systems in order to aid these decisions. In fact academic research is very
limited in this field, with industry applications leading the way. The only known
fuel pricing decision support system currently cited academically is PriceNet,
built by KSS Fuels and described later in section 6.1.4. These systems are
used to maximise gross profit. This entails maximising the difference between
the wholesale cost at which a fuel station buys each litre of fuel and the price
that a customer pays per litre when they fill up their car. The next section
introduces the retail fuel pricing problem.
6.1.1
Retail Fuel Industry
In the retail fuel market, profit margins are very small. A typical station attains a
margin, on average, of around 3% from selling each litre of fuel, as [Pet12] shows.
Therefore fuel stations are always looking to maximise their profit, sometimes just
to remain profitable at all. Understanding how to price their product
is therefore fundamental to their success. If they can formulate a method to
slightly increase their revenue from selling each litre, then they can produce huge
extra profits, especially as some sites can sell over 100,000 litres of fuel a week.
This would also be multiplied over a network of stations.
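To give a sense of scale, the arithmetic can be sketched as below. Only the 100,000 litres per week figure comes from the text; the 0.5 pence per litre uplift and the 100-site network are purely illustrative assumptions, and the calculation assumes sales volume is unchanged by the uplift.

```python
def extra_weekly_profit(uplift_pence_per_litre, litres_per_week, sites=1):
    """Extra profit in pounds per week from a small per-litre margin gain."""
    return uplift_pence_per_litre * litres_per_week * sites / 100


print(extra_weekly_profit(0.5, 100_000))             # 500.0
print(extra_weekly_profit(0.5, 100_000, sites=100))  # 50000.0
```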
Regarding the wholesale cost, it is something that we cannot forecast since
the wholesale market is extremely volatile. It is also affected by so many varying
factors. To compound this further, speculators create extra fluctuations as they
buy and sell oil commodities on the global market. We therefore assume that the
wholesale cost remains fixed over the short time-frame that we are optimising our
prices for.
There is of course a long history of academic research into price elasticity in
the retail fuel market, see for instance [Cla66]. Understanding the dynamism of
the price movements within the retail fuel industry appears to be more difficult
than you would think, since retail margins change sizeably over time. As [HMT08]
states, “In a market with little entry or exit, little non-geographic differentiation,
where wholesale prices are observable with little brand variation in retail prices
and inelastic demand, one would expect more constant retail margins”.
A study has been carried out on the Belgian retail gasoline market by [VM03].
He states that, “At any given day retail prices (for fuel) are dispersed. While retailers in a given local market charge more or less the same prices, the price
differences between local markets are considerable. Across locations, even for
stations of the same brand, retail prices vary substantially.” He puts this price
variation down to the differing degrees of local price competition between competitors in each local market.
6.1.2
Pricing Strategies
The wholesale price of fuel is at a record high which has in turn caused the retail
price to spike upwards. For this reason there is a lot of interest regarding the
prices which fuel stations set for their fuel. This section explains some of the
current pricing techniques which are evident in the retail fuel market.
A proportion of retail fuel stations price according to wholesale prices and
retain the same margin if the wholesale price changes. Other competitors try to
maintain very ‘sticky’ prices and so keep their prices the same even if underlying
costs or market conditions change. Both of these pricing strategies are shown in
[Noe07c].
The most popular pricing strategy appears to be for a fuel station to maintain
a pricing ‘image’ along the spectrum of prices for all the local competitors, as
shown in [Con01]. Here a fuel station would try to maintain a pricing position
in respect to the other competing fuel stations and would react to the changing
prices of its competitors in order to ensure that its own price maintained the
same pricing position. Prices at the lower end of the spectrum would be cheap
in comparison to other fuel stations, prices at the higher end would appear to be
expensive. Maintaining this position would allow customers to view a fuel station
as having cheap or expensive prices in comparison to its competitors without the
actual price being of importance. This strategy explains why competitors are
often shown to react to the price movements of their competitors, as [AEW09]
shows.
In certain areas, the competition between gasoline stations seems to be more
intense and prices become unstable as stations compete for market share. Certain
asymmetric cyclic pricing patterns are frequently observed in these areas which
leads the pricing to be called “cyclic” or “restorative” pricing. A restorative market is a highly volatile market in which large price hikes, followed by periods of
gradual price reduction, are frequently seen. This type of behaviour is observed
in gasoline prices in parts of Canada, the US Midwest, and Western Australia.
Several attempts have been made to understand these markets, by governments
concerned about collusion (e.g. [LEC06, Con01]) and by researchers aiming to explain the reasons for the behaviour (e.g. [Noe07b, And11, FS08, Atk08]). Current
evidence favours Edgeworth price cycles being the result of stronger competition
and the source of lower retail gasoline prices, stated by [Noe11].
The literature also contains previous studies done on the retail fuel market in
different locations around the world. One prominent example of this is a data set
consisting of a three year panel of prices from a sample of gasoline stations located
in suburban Washington DC together with a corresponding census of the region’s
stations, collected by [HMT08]. It was used to develop new empirical findings
about retail gasoline pricing in North America. They obtained evidence that
some retailers play very different pricing strategies; that is, some firms may play
a mixed-price strategy while other firms maintain a relative position in the pricing distribution. However, in contrast to the prediction in [BKD92], the stations
that maintain their position in the pricing distribution charge a systematically low
rather than a high price. They conclude that conventional collusion models are
unlikely to explain the observed changes in retail margins in their data. More generally, many of their results can be interpreted as adding to mounting evidence
(e.g. [EW04b, EW04a, Noe07b, Noe07a, Sla92]) that localized retail gasoline
competition appears to be characterized by regime shifts in pricing. They have
also examined how their empirical findings relate to existing theories of pricing
that appear most relevant for retail gasoline. While each of these theories explains some aspects of gasoline pricing, none provide explanations for the pricing
dynamics observed.
Overall, the primary source of retail price variation at a fuel station is when
it changes its price in response to a change in the wholesale price, or when the
station changes its price relative to other stations. The problem arises when we
try to understand when this response to wholesale prices will be, or when a change
in price relative to other stations will occur. It’s clearly difficult to understand
the precise price movements of all competitors since the retail mark-up changes
sizeably over time and some stations appear to use complex pricing rules.
6.1.3
Collusion in Pricing
With current retail prices being as high as they are, there are many people who
believe that collusion plays some part in these increased prices. This section
presents some of the theory and findings given in the literature for whether there
is evidence of collusion.
As explained in [VM03], the most important players in the retail petrol market
have a very dense retail network and therefore meet each other in most local
markets. Since the contribution of [Ber90] it is clear that, in the presence of
such multi-market contact, collusion is easier to sustain. Under multi-market
contact there are more possibilities for punishing deviations from the collusive
outcome. Collusion thus becomes more attractive as the expected punishment
following a deviation increases. As stated by Borenstein in [BS96], “Collusion
is more difficult to sustain (i.e. the highest sustainable collusive margin will be
lower) when either the gain from defection is greater or the anticipated loss from
punishment is lower”.
Governments have understandably worried about collusion between firms being the cause of price cycling and several studies have been performed to investigate this. Studies into the Canadian gasoline market include [Con01] and
[LEC06]. The former report concluded that differences in prices between cities are
a result of the competitive conditions present in each, and furthermore it commented that: “It takes only one dealer who is determined to increase market share
at the expense of competitors to upset the balance”. Contrary to what [Noe07b]
found, they detected no asymmetry in the adjustment of prices. However they
did find that large, vertically integrated firms subsidised their retail operations
so that low prices could be sustained, and in a price war they were also found
to give undisclosed wholesale discounts to their branded retailers to sustain low
prices. Although this practice is seen as unfair by independents, the report finds
that it is lawful. In [LEC06] the author studied the profitability of gasoline retailers. The author found no evidence of “price squeezing” by large firms. In fact
wholesale prices were found to be equal for independents and non-independents
alike. One conclusion was that price differences between firms are attributable
to differences in running costs and not wholesale costs. The Australian market
has also been investigated: [Wan05] reported on an Australian court case where
gasoline firms were found to be guilty of collusion in an attempt to co-ordinate
price rises. In contrast, [CC07] found no evidence of price-fixing or collusion and stated that
“the unleaded petrol industry in Australia is fundamentally competitive”.
To summarise, there is little evidence to support the existence of collusion in
the retail petrol industry although limited cases of collusion have been recorded.
6.1.4 Pricing Decision Support Systems in the Retail Fuel Industry
The main advantage of pricing decision support systems in the retail fuel industry
is that, given the frequency and volume of pricing decisions, these systems can
make them with greater consistency and accuracy, increasing overall profit. Other
than the work done by KSS Fuels, there is a relative lack of academic literature
on pricing decision support systems in the retail fuel industry. One system currently in use is PriceNet from KSS Fuels where, as explained in
[CS01], “Modelling, optimization, and learning interact at the level of a single
station in order to provide profit maximising prices for the station such that its
price positioning and volume goals are satisfied.” Modelling is the forecasting
of demand and is done by using mathematical models to estimate daily future
demand. Optimization uses all the available information to maximise profit
whilst also attaining sales goals. Learning is done within PriceNet by comparing
its models against actual sales performance and updating them accordingly. The
preliminary improvements this system provided are given in [SB93].
An example of its profitability in the real world is shown in [KSS] where Jeff
Miller, the president of Miller Oil, states, “With PriceNet we’re able to challenge
our gas pricing tactics at each store to consistently deliver on our volume and
margin budgets. The results are impressive – we are now running 4.5% over
volume budget and meeting our margin targets”.
[JLM10] shows how PriceNet has also recently been extended to optimise pricing for a network of retail fuel stations. New advances have shown that
profit can be increased by differentiating prices across sites, grades of fuel and time
periods in response to variations in demand from different segments of the customer base. This is shown in [MJL11], whose results show that the price optimizer
produced a statistically significant increase of around 9% in the average level of
weekly gross profit dollars. PriceNet has therefore shown the advantages of using
pricing decision support systems in solving this pricing problem.
To summarise this pricing problem, price setters in this market clearly face
unique challenges. Tight margins, volatile underlying costs and opaque competitor
pricing strategies all combine to make pricing fuel difficult.
To help with this task, pricing decision support systems have been successfully
implemented to aid pricing decisions in this industry. We will now test each of
the newly proposed methodologies presented in previous chapters on this pricing
problem, with the results compared against those from currently implemented
techniques.
6.2 Testing bounded DLM demand model
The first step in implementing the bounded DLM algorithm is to construct the observation equation. Given that current pricing
decision support systems forecast daily retail vehicle fuel sales, it seems sensible
to consider the effect of each day of the week (as done in PriceNet) when using
the proposed DLM algorithm. Therefore we will extend the equation used in
(3.4) to include day of the week factors. Hence the updated observation equation
we will use to forecast daily sales is
\[
\log(S_t) = \beta_{1,t} + \beta_{2,t} D_{2,t} + \dots + \beta_{7,t} D_{7,t} + \beta_{8,t} \log(OP_t) + \beta_{9,t} \log(CP_{1,t}) + \dots + \beta_{8+K,t} \log(CP_{K,t}) + v_t \tag{6.1}
\]
where \(D_{2,t}, \dots, D_{7,t}\) are dummy variables. Each takes the value 1 if the
day it represents is the day for which we are forecasting sales at time \(t\), and
0 otherwise. Turning to the coefficients of the new variables in this model,
\(\beta_{1,t}\) is the average sales on day 1 and can be viewed as the intercept. The other
coefficients represent the estimated difference in sales between day 1 and that day.
For example \(\beta_{2,t}\) represents the difference in average sales between day 1 and day
2. This model uses day 1 as the base group: it is the day against which all other
days are compared, which is why its dummy variable doesn’t appear in the model.
The only changes in the observation equation are that
\[
F_t = [\,1 \;\; D_{2,t} \;\dots\; D_{7,t} \;\; \log(OP_t) \;\; \log(CP_{1,t}) \;\dots\; \log(CP_{K,t})\,]
\]
and \(\theta_t = [\,\beta_{1,t} \;\dots\; \beta_{8+K,t}\,]'\). This is in fact the DLM version of the model used in
[MJL11], a currently implemented state-of-the-art methodology which we will
compare our results against. A prior of mean 0 and variance 10 was assigned for
each \(\beta_{i,t}\). The prior mean was therefore a vector of zeroes and the prior variance
was a diagonal matrix with every entry on the leading diagonal taking the value
10. Competitor elasticities were limited to be less than 20 and the own elasticity
limited to be more than -20. A constraint of ±0.05 was implemented to restrict
the daily movements of the elasticities, once the model had been fitted to the
training data. The first 30 days for each site were used as training data and the
estimates of \(V_t\) and \(W_t\) were updated every 30 days. To test our DLM algorithm,
real commercial data was collected over an entire year at 109 different fuel station
sites. It contained the daily sales (in litres), wholesale cost per litre, time-weighted
average own price per litre and time-weighted average competitor prices per litre
(although their sales are unknown). The confidential and commercial nature of
this data means that it is anonymous. A snapshot of the data is presented in
table 6.1 where in this case there are three competitors.
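As an illustration of the observation equation (6.1), the following minimal Python sketch builds the regression vector for one day and forms a point forecast of log sales. The function names are hypothetical, and the choice of Monday as "day 1" is an assumption made only for the example, since the base day is not stated here.

```python
import math
from datetime import date


def regression_vector(day, own_price, competitor_prices):
    """Return F_t = [1, D_2..D_7, log(OP_t), log(CP_1..CP_K)] as a list."""
    # Assumption: "day 1" (the base group) is Monday; the text does not say
    # which weekday is used as the base.
    weekday = day.isoweekday()  # Monday = 1, ..., Sunday = 7
    dummies = [1.0 if weekday == d else 0.0 for d in range(2, 8)]
    log_prices = [math.log(own_price)] + [math.log(cp) for cp in competitor_prices]
    return [1.0] + dummies + log_prices


def forecast_log_sales(f_t, theta_t):
    """Point forecast of log(S_t): the inner product of F_t and theta_t."""
    return sum(f * b for f, b in zip(f_t, theta_t))


# First row of Table 6.1: 18/11/2010 with three competitors (K = 3).
f_t = regression_vector(date(2010, 11, 18), 1.059, [1.059, 1.049, 1.059])
```

With K = 3 competitors the vector has 8 + K = 11 entries, matching the length of the coefficient vector in the text.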
Table 6.1: Example of daily sales data with three competitors

Date        Volume  Cost   OwnPrice  CP1    CP2    CP3
18/11/2010    2818  0.779  1.059     1.059  1.049  1.059
19/11/2010   14420  0.779  1.049     1.059  1.049  1.059
20/11/2010    7631  0.779  1.049     1.059  1.049  1.059
21/11/2010    5067  0.779  1.049     1.059  1.039  1.039
22/11/2010    6488  0.779  1.049     1.059  1.039  1.039
23/11/2010   10867  0.779  1.049     1.059  1.039  1.039
24/11/2010    7645  0.773  1.049     1.059  1.039  1.039
25/11/2010    8550  0.773  1.049     1.059  1.039  1.039

Algorithm (1) was used to apply the DLM method to the data, one site at
a time. Over the 109 different sites, the average daily mean absolute percentage
error (MAPE) was 8.95%. Figure 6.1 and figure 6.2 show the comparison between
actual and forecasted daily sales for two different sites. When implementing the
methodology used in [MJL11] to forecast this data set, the MAPE was 16.9%.
[MJL11] used a very similar underlying regression model to our bounded DLM
method but learnt long-term historical elasticities through the method of generalized ridge regression on large amounts of historical data. Each day they updated
the base-level of sales in the absence of any price effects (i.e. \(\beta_{1,t}\)) to take into
account short-term fluctuations in sales. (It should be noted that we weren’t
able to recreate the daily forecasts found using [MJL11], because the large amounts
of historical data required to train their initial model were unavailable, but we
were provided with their accuracy on this data set.) There is clearly a large increase
in forecast accuracy achieved when using the bounded DLM method. The main
reason for this seems to be that the new bounded DLM methodology is able to
update all of its coefficients simultaneously and is therefore able to infer more
accurate elasticities, better understanding future demand. In effect the DLM
has dealt with the remaining short-term correlation contained within the price
effects, producing improved forecasts. The correlograms (see footnote 1) in figure
6.3 and figure 6.4 show that all the correlation has been removed from the daily
errors at the two sites displayed here.
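The two diagnostics used above can be sketched as follows. This is a minimal illustration using the standard definitions of MAPE and of the sample autocorrelation coefficients; the function names are hypothetical and the code is not taken from the thesis.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


def sample_acf(errors, max_lag):
    """Sample autocorrelation coefficients r_1, ..., r_max_lag of a series."""
    n = len(errors)
    mean = sum(errors) / n
    dev = [e - mean for e in errors]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        for k in range(1, max_lag + 1)
    ]
```

A correlogram is then a plot of `sample_acf(daily_errors, max_lag)` against the lag; coefficients close to zero at every lag indicate that little correlation remains in the forecast errors.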
It should be noted that changing the size of the prior variance, increasing the
size of the initial training set or marginally changing how often \(V_t\) and \(W_t\) were
updated had a negligible effect on the forecast accuracy of the model. As explained
earlier, the daily movement of the elasticities has been limited and figure 6.5
shows a plot of the daily movement of the elasticity for own price, \(\beta_{8,t}\), for all the
sites. We can see that at first the movements were regularly at the limits of what
the algorithm allowed (i.e. ±0.05) but over time they reduced. After around 120
days, 50% of the movements ranged between -0.015 and 0.015 (i.e. the positions
of the dotted lines). Our estimates for the own price elasticity therefore appear to
stabilise over time and the same was found when analysing the daily movement
of competitors’ elasticities. On this data set, the average size of the own price
elasticity, \(\beta_{8,t}\), found using [MJL11] was -2.38. In comparison, the average size
when using the DLM method, after the final data point has been used, was -5.61,
a noticeable increase in absolute size. These results compare favourably
with other new forecasting techniques. For example, [MLJ11] used a Bayesian
hierarchical approach to model the same problem and also discovered elasticities
of greater absolute value than the ones currently found, providing evidence to
further support our results. Table 6.2 summarises the difference between our new
DLM method and the current industry standard method used in [MJL11].

Footnote 1: A plot of the sample autocorrelation coefficients (ACF). The sample
ACF measures the correlation, if any, between observations in the time series data
at different distances apart.

Figure 6.1: Actual and predicted sales (in litres) at site 1
Figure 6.2: Actual and predicted sales (in litres) at site 2
Figure 6.3: Correlogram of the daily error at site 1
Figure 6.4: Correlogram of the daily error at site 2
Figure 6.5: Plot of the constrained daily movements of the elasticity for own price
for all 109 sites. Between the dashed lines is 80% of the movements, between the
dotted lines is 50% of the movements and the solid line is the median.
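The constraints placed on the elasticities (a daily movement of at most ±0.05, competitor elasticities below 20 and own elasticity above -20) can be illustrated with the short sketch below. It shows only the bounding step; how the unconstrained proposal is produced by the DLM update is defined in Algorithm 1 and is not reproduced here. The function name and the example values are hypothetical.

```python
def bounded_step(previous, proposed, max_step=0.05,
                 lower=float("-inf"), upper=float("inf")):
    """Limit a coefficient's daily movement to +/- max_step, then clamp it to [lower, upper]."""
    step = max(-max_step, min(max_step, proposed - previous))
    return max(lower, min(upper, previous + step))


# Own price elasticity: a proposed move of -0.11 is cut back to -0.05.
own = bounded_step(-5.50, -5.61, lower=-20.0)
# Competitor elasticity: a move of +0.02 is inside the limit and is kept.
comp = bounded_step(0.40, 0.42, upper=20.0)
```

Under these assumptions the own elasticity moves from -5.50 to -5.55 rather than jumping straight to the proposed value, which is the kind of movement plotted in figure 6.5.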
Table 6.2: Comparison of DLM method with [MJL11] method

Method          Average size of own elasticity  MAPE
DLM method      -5.61                           8.95%
[MJL11] method  -2.38                           16.9%

6.3 Testing SVR competitor price forecasting model
In order to test our SVR algorithm, two data sets containing real commercial
data were collected, each encompassing almost an entire year and 50 different own
fuel station sites. Each contained the wholesale cost per unit at all
times, the own price per unit at all times and competitor prices per unit together
with the times at which they were observed. The first data set was from Denmark
and contained high frequency price data and hence the price of each competitor
was recorded around 10 times per day. The second data set was from the USA
and was of much lower frequency, with competitor prices recorded around once
or twice each day. The confidential and commercial nature of this data means
that it is anonymous. Since retail vehicle fuel prices are often optimised daily,
the SVR model was used to predict a single competitor’s price for each 24 hour
period (starting at midnight) in the data. The mean absolute error (MAE) for
all days was then found for this competitor. This was then repeated for all the
remaining competitors. It was compared against the MAE when using the current
assumption that each competitor’s price will remain unchanged during each day.
To illustrate what this error statistic means in practice, an example is now provided.
The actual prices and times are $2 from 00:00 until 12:00 and $3 from 12:00 until
24:00. The forecasted prices and times are $2 from 00:00 until 12:00 and $2.5
from 12:00 until 24:00. Therefore the absolute error using the forecasted prices
is |2 - 2| × (12/24) + |3 - 2.5| × (12/24) = 0 + 0.25 = $0.25. Hence the expected
difference between the forecasted and actual price at any moment in time on this
day is $0.25. Finding the mean difference for all days gives the expected absolute
difference between the forecasted and actual values at any moment in time, the
MAE. For this day, the absolute error assuming that a competitor’s price will
remain unchanged is |2 - 2| × (12/24) + |3 - 2| × (12/24) = 0 + 0.5 = $0.5.
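The worked example above can be reproduced with a short sketch that treats each price series as a step function over the day and weights each absolute difference by the time for which it applies. Representing a path as a list of (start hour, price) pairs, and the function names, are assumptions made for this illustration.

```python
def price_at(path, hour):
    """Price in force at `hour` for a step path [(start_hour, price), ...] sorted by start."""
    current = path[0][1]
    for start, price in path:
        if start <= hour:
            current = price
    return current


def daily_absolute_error(actual, forecast, day_hours=24.0):
    """Time-weighted absolute difference between two step price paths over one day."""
    # Split the day at every time either path changes price.
    breaks = sorted({s for s, _ in actual} | {s for s, _ in forecast} | {day_hours})
    total = 0.0
    for start, end in zip(breaks, breaks[1:]):
        total += abs(price_at(actual, start) - price_at(forecast, start)) * (end - start)
    return total / day_hours


actual = [(0, 2.0), (12, 3.0)]
svr_error = daily_absolute_error(actual, [(0, 2.0), (12, 2.5)])
unchanged_error = daily_absolute_error(actual, [(0, 2.0)])
```

Here `svr_error` is 0.25 and `unchanged_error` is 0.5, matching the two figures in the example. The MAE for a competitor is the mean of these daily values over all days.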
The first data set contained a total of 202 competitors. For each competitor,
the MAE was found using algorithm 2 with the previous 60 days of pricing data
used for training. The chosen parameters used in the model were \(\gamma = 1/8\),
\(\epsilon = 0.001\) and \(C = 1\). The average MAE for all the competitors, assuming that
their price will remain unchanged, was found to be 0.234 and the average MAE
when using the SVR predictions was 0.160. Hence the SVR forecasts provided
an improvement (i.e. decrease) in average MAE of 0.074. To give this some
context, the average price per unit of fuel was around 12.5 Danish kroner ($2.15).
Notched box-plots of each MAE are given in figure 6.6. The box-plot representing
the assumption that a competitor’s price will remain unchanged is on the left-hand side and the box-plot using the SVR predictions is on the right-hand side.
The standard deviation of the MAE was found for each competitor individually.
The mean standard deviation in the error for all the competitors when using
the SVR predictions was 0.199. The mean standard deviation in the error when
assuming that a competitor’s price will remain unchanged was 0.286. Hence using
the SVR competitor price predictions reduced the standard deviation by 0.087.
Figure 6.6: Denmark box-plots of the Mean Absolute Error (box-plot of SVR
predictions is on the right and the box-plot showing the assumption that a competitor’s price will remain unchanged is on the left)

The second data set contained a total of 257 competitors. Again, for each
competitor, the MAE was found using algorithm 2 with the previous 60 days of
pricing data used for training. The chosen parameters used in the model were
again \(\gamma = 1/8\), \(\epsilon = 0.001\) and \(C = 1\). The average MAE for all the competitors,
assuming that their price will remain unchanged, was found to be 0.00449 and
the average MAE when using the SVR predictions was 0.00424. Hence the SVR
forecasts provided an improvement (i.e. decrease) in average MAE of 0.00025.
Notched box-plots of each MAE are given in figure 6.7. The box-plot representing
the assumption that a competitor’s price will remain unchanged is on the left-hand side and the box-plot using the SVR predictions is on the right-hand side.
To give this some context, the average price per unit of fuel is around $3.67.
The standard deviation of the MAE was found for each competitor individually.
The mean standard deviation in the error for all the competitors when using the
SVR predictions was 0.00283. The mean standard deviation in the error when
assuming that a competitor’s price will remain unchanged was 0.00300. Hence
using the SVR competitor price predictions reduced the standard deviation by
0.00017.
Figure 6.7: USA box-plots of the Mean Absolute Error (box-plot of SVR predictions is on the right and the box-plot showing the assumption that a competitor’s
price will remain unchanged is on the left)

Analysing these two data sets, it’s clear that in the first data set there are
frequent competitor price moves each day, unlike in the second data set. This can
be seen when comparing the size of the error when assuming that a competitor
will keep their price unchanged. One could infer that prices are checked more
regularly in the first data set because they move more often, so the assumption
that a competitor will keep their price unchanged holds less often. Hence there
is more need to use the SVR model to
predict competitor prices in the first data set and this explains why the model
had a much bigger advantage here. In fact since in the second data set the
competitor is less likely to change their price, the model is more likely to forecast
that a competitor will keep their price unchanged and in these cases there will
be no difference in the model predictions compared with the assumption of no
competitor price change. This explains the small difference in forecast accuracy
in the second data set. But overall for both data sets the SVR model increased
the forecast accuracy compared with the current assumption that a competitor
will keep their price unchanged. In fact the standard deviation of the error was
also reduced by using the SVR model predictions although this reduction was
more pronounced on the first data set. These results show the advantages of
using this SVR technique.
6.4 Testing greatest guaranteed profit optimisation
An example of an optimisation to provide the highest guaranteed profit (i.e. using
algorithm 3), over 100 days, using simulated daily costs and competitor prices
is shown in figure 6.8. Only one competitor was used in this scenario with both
prices initially at 1.189, cost at 0.879, \(\beta_8 = -4.5\) and \(\beta_9 = 0.4\). Here the higher
dotted line is the best price when the competitor follows a price change, the
lower dashed line is the best price when the competitor doesn’t follow a price
change and the solid line is the optimised own price. If current pricing decision
support systems were applied to this data set, own price would always be set on
the lower dashed line. This would therefore not provide the greatest guaranteed
profit, showing the advantage of the new methodology. In reality fuel stations
often only price at certain intervals and this explains why own price doesn’t sit
perfectly on the dotted or dashed lines. This optimisation is in fact evident in
real pricing data. An example of such a site is given in figure 6.9. The higher
dotted line is the best price when a competitor follows a price change, the lower
dashed line is the best price when a competitor doesn’t follow a price change and
the solid line is the actual own price. Real data shows that fuel stations do not
only price assuming that a competitor will not respond; human price setters seem
to take the future prices of competitors into account when setting their prices.
This shows the superiority of this new optimisation when competitor prices aren’t
forecasted.

Figure 6.8: Optimisation of own price to obtain highest guaranteed profit (the
higher dotted line is the best price when the competitor follows a price change,
the lower dashed line is the best price when the competitor doesn’t follow a price
change and the solid line is the optimised own price)

Figure 6.9: Actual price data (the higher dotted line is the best price when the
competitor follows a price change, the lower dashed line is the best price when
the competitor doesn’t follow a price change and the solid line is the actual own
price)
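The idea of pricing for the greatest guaranteed profit can be sketched as a maximin search over candidate prices. This is not Algorithm 3 itself, which is defined in chapter 5. It is a minimal illustration that assumes the log-log demand form of (6.1) with one competitor, the parameter values quoted for figure 6.8, a hypothetical sales scale `base`, a hypothetical price grid, and the assumption that a competitor who "follows" simply matches own price.

```python
def profit(own, comp, cost, b_own=-4.5, b_comp=0.4, base=1000.0):
    """Profit under log-log demand with one competitor (base is a hypothetical scale)."""
    sales = base * own ** b_own * comp ** b_comp
    return (own - cost) * sales


def guaranteed_price(candidates, current_comp, cost):
    """Return the candidate whose worst-case profit over both responses is largest."""
    def worst_case(price):
        stays = profit(price, current_comp, cost)  # competitor keeps its price
        follows = profit(price, price, cost)       # assumption: competitor matches us
        return min(stays, follows)
    return max(candidates, key=worst_case)


# Hypothetical price grid from 1.000 to 1.300 in steps of 0.001.
candidates = [round(1.0 + 0.001 * i, 3) for i in range(301)]
no_follow_best = max(candidates, key=lambda p: profit(p, 1.189, 0.879))
maximin_best = guaranteed_price(candidates, current_comp=1.189, cost=0.879)
```

Under these assumptions the price that is best when the competitor does not follow is about 1.130, while the price with the greatest guaranteed profit is higher, at about 1.163. This mirrors the ordering of the dashed and dotted lines in figure 6.8, although the exact values depend on the assumed response rule.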
6.5 Testing Game-Tree Optimisation
The exhaustive search optimisation of a game-tree (from section 5.3.1) was completed at different sites in order to dynamically optimise own price over a 24 hour
period. This analysis was completed on real commercial data using the two data
sets which were previously used to test the SVR competitor price forecasting
model in section 6.3. The daily sales achieved at each site were the only input
variable added to the original data sets. The first data set was from Denmark
and contained high frequency competitor price survey data and hence the price of
each competitor was recorded around 10 times per day. Since own price decisions
were made when competitor surveys were received, this network of own sites had
multiple times at which own price decisions were made each day. The second data
set was from the USA and was of much lower frequency and hence the price of
each competitor was recorded around once or twice each day. Therefore this network of own sites had far fewer times at which own price decisions were made
each day. As also stated previously, the confidential and commercial nature of
this data means that it is anonymous. The DLM and SVR models were used to
forecast demand and competitor prices in this optimisation (i.e. using algorithm
5). This one day ahead price optimisation was done at 100 different sites for 20
consecutive days. The path of own prices which maximises forecasted profit can
be viewed as the dynamic optimisation of own price.
Own price was also optimised according to the price optimisation used in
current pricing decision support systems. Starting at the beginning and moving
through each day, whenever an own price decision was needed it was optimised
assuming that current competitor prices will remain unchanged for the remainder
of the optimisation period. Whenever a competitor survey was expected, the
actual competitor prices were forecasted and if a competitor price was predicted
to change, the optimisation was repeated from the time the competitor price was
forecasted to have changed (i.e. by assuming that the competitor price will remain
unchanged from its new price for the remainder of the optimisation period). The
resultant own prices over the optimisation period were recorded together with the
resultant forecasted profit and it can be viewed as the static optimisation of own
price. This static price optimisation was also done for one day ahead on the same
100 different sites for the same 20 consecutive days. The resultant profits from
the dynamic and static prices were compared in order to analyse the difference
between the new proposed optimisation and the current optimisation. Obviously
if there are no competitor price changes during a given period, the dynamic and
static prices will be the same and there is no increase in profit from this new
optimisation. Of course there can never be a negative change in profit when
using the newly proposed optimisation, since the path with optimum profit is
always chosen for the dynamic optimisation of own price (under the assumption
that competitor prices can be perfectly forecasted); the only question is
how much better the new method is.
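The comparison between the dynamic and static optimisations can be sketched as follows. This is a toy illustration rather than Algorithm 5: demand uses the log-log form with hypothetical parameter values, and `forecast_competitor` is a hypothetical reaction rule standing in for the SVR forecast. The dynamic search enumerates every own-price path, while the static search re-optimises at each decision assuming the competitor price stays unchanged.

```python
from itertools import product


def sales(own, comp, b_own=-4.5, b_comp=0.4, base=1000.0):
    """Log-log demand with one competitor; base is a hypothetical scale factor."""
    return base * own ** b_own * comp ** b_comp


def forecast_competitor(prev_comp, prev_own):
    """Hypothetical stand-in for a competitor forecast: match any undercut next period."""
    return min(prev_comp, prev_own)


def path_profit(own_path, comp_start, cost):
    """Forecasted profit of an own-price path, rolling the competitor forecast forward."""
    comp, total = comp_start, 0.0
    for own in own_path:
        total += (own - cost) * sales(own, comp)
        comp = forecast_competitor(comp, own)
    return total


def dynamic_path(price_points, periods, comp_start, cost):
    """Exhaustive search: evaluate every path in the tree and keep the best."""
    return max(product(price_points, repeat=periods),
               key=lambda path: path_profit(path, comp_start, cost))


def static_path(price_points, periods, comp_start, cost):
    """Re-optimise at each decision assuming the competitor price stays put."""
    comp, path = comp_start, []
    for _ in range(periods):
        own = max(price_points, key=lambda p: (p - cost) * sales(p, comp))
        path.append(own)
        comp = forecast_competitor(comp, own)
    return tuple(path)


price_points = [1.129, 1.149, 1.169, 1.189]
dyn = dynamic_path(price_points, 3, comp_start=1.189, cost=0.879)
sta = static_path(price_points, 3, comp_start=1.189, cost=0.879)
dyn_profit = path_profit(dyn, 1.189, 0.879)
sta_profit = path_profit(sta, 1.189, 0.879)
```

Because the static path is itself one of the paths enumerated by the exhaustive search, `dyn_profit` can never be below `sta_profit` when both are evaluated against the same forecasts. The number of paths grows as the number of price points raised to the number of decisions, so this brute-force form is only practical for small trees.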
Figure 6.10: Increased profit for Denmark own sites

The results from the first data set showed that the average increase in daily
forecasted profit was 16.18%, with figure 6.10 showing a histogram
of the results. The results from the second data set showed that the average
increase in daily forecasted profit was 3.82%, with figure 6.11 showing
a histogram of the results. Analysing the results from these two data sets it’s
clear that optimising own price using a game-tree produces a noticeable increase
in profit compared with using the current static optimisation of own price. Clearly
the high frequency competitor price surveys in the Denmark data set provide
much more opportunity for competitor prices to change. In fact we know from
section 6.3 that assuming competitor prices will remain unchanged is less true
here than in the other data set. Therefore the newly proposed optimisation
which is taking into account these future competitor price moves will be much
more profitable compared with the currently implemented optimisation which
doesn’t take these future changes into account. Turning to the USA data set,
despite the lack of competitor price movement, profit still increased using the
new optimisation technique. Of course these results assume that competitor
prices and overall demand can be forecasted but since this section is exclusively
focused on developing the price optimisation methodology, these results show the
advantages of using this game-tree optimisation technique within pricing decision
support systems.
Figure 6.11: Increased profit for USA own sites

6.6 Summary
This chapter introduces the tasks faced by retailers when optimising the price of retail vehicle fuel. In this industry margins are very tight, negatively impacting on overall
profit. Underlying costs are also often very volatile and can’t be accurately predicted. Furthermore the retail mark-up changes sizeably over time and some
competitors appear to use complex pricing rules, making it difficult to predict
competitor price moves. Overall this makes optimising price correctly extremely
important but difficult to achieve. Due to this, pricing decision support systems
have been implemented to aid these pricing decisions. Their main advantage in
this market is that, given the frequency and volume of pricing decisions, these
systems can make them with greater consistency and accuracy, increasing overall
profit. They model demand, learning from historical daily price and sales data,
before optimising price. They can be used to do this for individual products at
a single fuel station or even a whole host of products over a network of fuel stations. From an academic perspective though, other than the work done by KSS
Fuels, there is a relative lack of literature on pricing decision support systems in
the retail fuel industry due to the competitive and therefore secretive nature of
their use. The proposed methods from earlier chapters need to be tested to see
whether they provide an improvement against the techniques used within pricing
decision support systems already employed in this pricing problem.
Firstly, using the bounded DLM algorithm markedly reduced
the error when forecasting daily fuel demand at a retail fuel station, reducing
the average daily mean absolute percentage error to 8.95% from the 16.9% found
when using current forecasting techniques. The main reason for this seems to
be that the DLM algorithm updates daily, unlike current models where each day
of the week is modelled separately and hence they update every week instead.
In effect the DLM has dealt with the remaining short-term correlation between
days, producing improved forecasts.
The SVR algorithm used to forecast future competitor prices showed an improvement over the current method, which simply
assumes that competitor prices will remain unchanged. The results showed the
advantages of using this SVR technique since overall for both data sets the SVR
model increased the forecast accuracy compared with the current assumption that
a competitor will keep their price unchanged.
The algorithm to optimise own price maximising guaranteed profit was tested
using simulated daily costs and competitor prices. It provided a clear understanding of how current pricing decision support systems price differently, failing
to provide the greatest guaranteed profit, showing the advantage of the new
methodology when competitor prices aren’t forecasted.
The results from the game-tree optimisation algorithm were analysed using
the Support Vector Regression algorithm to predict competitor prices and the
bounded Dynamic Linear Models algorithm to forecast the resulting sales from
each path of own prices over the chosen optimisation period. The optimum
path chosen was compared to the pricing decision made from current pricing
decision support system methods, where competitor prices are assumed to remain
unchanged. Overall it’s clear that optimising own price using an exhaustive search
of a game-tree produces a noticeable increase in profit compared with using the
current static optimisation of own price.
Chapter 7
Conclusions and Future Work
7.1 Summary and Main Contributions
There has been much academic work undertaken to understand retail pricing, and
pricing decision support systems have been successfully introduced in the retail
sector. Their main advantage is that given the frequency and volume of pricing
decisions, they can price at a higher level of consistency and accuracy, increasing
overall profit. Despite this there are still potential areas for improvement which
this research has focused upon.
The first contribution of this research is a bounded Dynamic Linear Model
algorithm, proposed because the forecasting of demand is often difficult and
predictions are often limited by the static modelling approaches found in existing
pricing decision support systems. It has been shown to provide a considerable improvement in predictive accuracy compared with current techniques, when
tested on real commercial data from the retail vehicle fuel sector. It achieves this
superiority by dealing with the short-term correlation in the sales data, providing improved estimates of the price elasticities which are integral to the overall
optimisation.
Secondly, it’s clear that the profitability of one firm’s pricing strategy depends
crucially on the response of its closest rivals. Therefore a retailer must predict
how a competitor would react when setting their own price. Despite the importance of making correct competitor price predictions, techniques for forecasting
competitor prices are lacking from academic literature. Instead the naive assumption that a competitor’s price will remain unchanged is often used. To improve
this situation, the second contribution of this research is the proposed method of
Support Vector Regression (SVR) which is used to forecast competitor prices. It
has been thoroughly explained together with a detailed algorithm showing how
it can be implemented. It has then been tested on the retail vehicle fuel industry
using real commercial data. Competitor prices have been forecasted for each day
in the data set with the results showing a clear improvement when using the SVR
competitor price predictions against the current methodology of assuming that a
competitor’s price will remain unchanged.
Thirdly, current optimisers don’t change their advised price depending on how
a competitor is forecasted to react to a change in own price. The third contribution of this research is to show how this is incorrect and in fact mathematically
prove that different optimal prices exist. An algorithm to provide the highest
guaranteed profit when a retailer doesn’t forecast how its competitors will react
has also been presented and this new optimisation has been shown to improve
guaranteed profit compared with current optimisers. It also more closely resembles historical price moves when compared with real commercial data from the retail
vehicle fuel sector. This makes sense as retailers are often unable to forecast
competitor price reactions when setting their price. This optimisation is the first
to explicitly incorporate the competitive nature of a market, providing us with a
superior pricing methodology.
Fourthly, as previously shown, failing to incorporate future competitor price
moves when optimising price can severely reduce overall profit for a retailer and
is a major issue with current price optimisers. In order to change this status
quo, the final contribution of this research is to propose the first price optimisation methodology which explicitly forecasts future competitor prices. It involves
building a game-tree containing every potential path of own price over the optimisation period, together with forecasts of the resultant competitor prices. An
exhaustive search is then completed on the game-tree in order to find the path
of own prices which maximises profit whilst satisfying any volume constraints.
An algorithm showing how to implement this optimisation has been constructed.
When competitor prices can be forecast correctly, results have shown that this
optimisation improves overall profit compared with assuming that competitor
prices will remain unchanged, as is the case in current state-of-the-art price
optimisers.
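The exhaustive game-tree search described above can be sketched as follows. This is a minimal illustration and not the algorithm constructed in this thesis: the functions `forecast_competitor_price` and `demand`, their coefficients, and the assumption that the competitor reacts before demand is realised are all hypothetical placeholders.

```python
from itertools import product

def forecast_competitor_price(own_price, prev_competitor_price):
    # Hypothetical stand-in for a competitor price forecast:
    # the competitor moves halfway towards our price.
    return prev_competitor_price + 0.5 * (own_price - prev_competitor_price)

def demand(own_price, competitor_price):
    # Hypothetical linear demand: falls with own price, rises with competitor price.
    return max(0.0, 1000.0 - 40.0 * own_price + 25.0 * competitor_price)

def best_price_path(price_points, horizon, start_competitor_price,
                    unit_cost, min_volume):
    """Enumerate every own-price path and keep the most profitable feasible one."""
    best_path, best_profit = None, float("-inf")
    for path in product(price_points, repeat=horizon):  # every root-to-leaf path
        competitor_price = start_competitor_price
        profit = volume = 0.0
        for own_price in path:
            competitor_price = forecast_competitor_price(own_price, competitor_price)
            sold = demand(own_price, competitor_price)
            volume += sold
            profit += (own_price - unit_cost) * sold
        # Volume constraint: discard paths that sell too little in total.
        if volume >= min_volume and profit > best_profit:
            best_path, best_profit = path, profit
    return best_path, best_profit

path, profit = best_price_path([10.0, 11.0, 12.0], horizon=3,
                               start_competitor_price=11.0,
                               unit_cost=9.0, min_volume=0.0)
print(path, profit)
```

If no path satisfies the volume constraint, the sketch returns `(None, -inf)`, so a caller would need to handle that case explicitly.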
Overall, each of these proposed techniques has been shown to be superior to
the equivalent techniques used to do the same task in current pricing decision
support systems. Therefore implementing them to replace the old methodologies will provide the improvement in these systems which this research aimed to
deliver.
7.2 Future Work
One possible future research project would be to develop the game-tree optimisation for cases where the total number of pricing decisions a retailer makes during a given
optimisation period is large and where there is a wide range of possible own price
points to consider each time. This may occur when a retailer aims to optimise
the prices of multiple products over multiple days simultaneously. In this case
the tree size may become very large, producing a huge number of different pricing
strategies. Hence more efficient and computationally feasible methods may be
required.
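The growth in tree size can be illustrated with a short calculation. It assumes, for illustration only, that every pricing decision offers the same number of candidate price points and that all paths are enumerated; the function name is hypothetical.

```python
def count_price_paths(price_points_per_decision, decisions):
    # Each additional decision multiplies the number of paths
    # by the number of candidate price points.
    return price_points_per_decision ** decisions

# For example, 2 products priced daily over 7 days gives 14 decisions.
for decisions in (3, 7, 14):
    print(decisions, count_price_paths(10, decisions))
```

With ten price points per decision, three decisions give 1,000 paths, whereas fourteen decisions give 10^14, which is far beyond what an exhaustive search can visit.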
Another possible future research project would be to use probability-based
approaches to forecast competitor prices. That is, to identify the probabilities of
different competitor price moves. If we use a probability-based forecasting model
then the pricing optimisation problem becomes an expected profit maximisation problem and new optimisation algorithms will need to be developed.
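A single-period version of such an expected profit maximisation might look like the sketch below. The discrete scenario list, the probabilities and the `linear_demand` function are made-up assumptions for illustration; no such model is developed in this thesis.

```python
def expected_profit(own_price, competitor_scenarios, unit_cost, demand):
    """Probability-weighted profit over discrete competitor price scenarios."""
    return sum(prob * (own_price - unit_cost) * demand(own_price, comp_price)
               for comp_price, prob in competitor_scenarios)

def best_expected_price(price_points, competitor_scenarios, unit_cost, demand):
    # Choose the candidate own price with the highest expected profit.
    return max(price_points,
               key=lambda p: expected_profit(p, competitor_scenarios,
                                             unit_cost, demand))

def linear_demand(own_price, competitor_price):
    # Hypothetical demand model used only for this illustration.
    return max(0.0, 1000.0 - 40.0 * own_price + 25.0 * competitor_price)

# Hypothetical forecast: (competitor price, probability) pairs summing to one.
scenarios = [(10.0, 0.2), (11.0, 0.5), (12.0, 0.3)]
price = best_expected_price([10.0, 11.0, 12.0], scenarios, 9.0, linear_demand)
print(price, expected_profit(price, scenarios, 9.0, linear_demand))
```

When the forecast collapses to a single scenario with probability one, the expected profit reduces to the deterministic profit used when competitor prices are treated as known.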
A final research project would be to forecast the time at which competitor
prices will actually change, rather than simply checking competitor prices at
predetermined times. Such an algorithm could perhaps inform a retailer
of the best times at which to check a competitor’s price, since it is
impossible and inefficient to check them continuously. This would avoid leaving
the retailer in a position where they are simply guessing, as is currently the case.
Chapter 8
Appendix: Theoretic justification for using a bounded approach
In demand modelling we know that the own price (direct) elasticities are negative
and the competitor price (cross) elasticities are positive. However, due to
noise in both the historical price and sales data, incorrect elasticities can
be obtained if we use the standard unbounded DLM method. In order to overcome
this problem, a bounded DLM method was proposed (in chapter 3) to utilise
the information known about the signs of the elasticities in learning the demand
model. Although this bounding is necessary from a practical point of view, this
chapter provides theoretic justification from a related problem, proving why
a bounded least squares approach produces a more accurate model than the
standard least squares method. The similarities between the DLM and least
squares methodologies are that they both:
1. Require a model structure (linking the input variables to the output variable) which has been previously chosen.
2. Aim to infer optimum model coefficients using historical data.
3. Do not care what the input variables represent or their context to the output
variable.
4. Require the assumption of linearity and Normality in the error.
Of course the least squares method is different to the recursive DLM approach and
the parameters are bounded differently but the proof is still relevant since it gives
a degree of justification for implementing the proposed bounded DLM algorithm
instead of the standard DLM method. The same is also true for bounding the
output of the SVR model used to forecast competitor prices (in chapter 4).
8.1 Basic Assumption
Suppose that the true model of a considered system is the linear model given below:
$$y = L_{true}(X, \theta_{true}) = \alpha_0^{true} + \alpha_1^{true} x_1 + \dots + \alpha_n^{true} x_n \tag{8.1}$$
where
$$\theta_{true} = [\alpha_0^{true} \;\; \alpha_1^{true} \;\; \dots \;\; \alpha_n^{true}]^T \tag{8.2}$$
with $T$ denoting the transpose of the vector. Further, it is assumed that the value
of $\theta_{true}$ is unknown but that we know the bounds of each parameter. That is,
$$\alpha_i^{lower} \le \alpha_i^{true} \le \alpha_i^{higher}, \quad i = 0, 1, \dots, n \tag{8.3}$$
where $\alpha_i^{lower}$ and $\alpha_i^{higher}$ are known. If some $\alpha_i^{lower}$ or $\alpha_i^{higher}$ is unknown, we can
choose $\alpha_i^{lower} = -\infty$ or $\alpha_i^{higher} = \infty$, so this assumption is without loss of
generality.
8.2 Two Learning Methods
Suppose that we have the following observed data from the
system in equation 8.1:
$$\{y(t), X(t)\} = \{y(t), [x_1(t), \dots, x_n(t)]\}, \quad t = 1, 2, \dots, T. \tag{8.4}$$
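The first method to learn the system in equation 8.1 based on the above data
is the standard least squares method, where we find the best parameters $\theta^*$ by
minimising the following sum of squared errors:
$$\theta^* = \arg\min_{\theta \in \mathbb{R}^{n+1}} \frac{1}{2} \sum_{t=1}^{T} \{y(t) - L(X(t), \theta)\}^2 \tag{8.5}$$
where $L(X(t), \theta) = \alpha_0 + \alpha_1 x_1(t) + \dots + \alpha_n x_n(t)$ and $\theta = [\alpha_0 \;\; \alpha_1 \;\; \dots \;\; \alpha_n]^T \in \mathbb{R}^{n+1}$.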
We denote the parameters obtained by this first method as $\theta_{LS}$ as it is the best
least squares solution, that is,
$$\theta_{LS} = [\alpha_0^{LS} \;\; \alpha_1^{LS} \;\; \dots \;\; \alpha_n^{LS}]^T \tag{8.6}$$
The second method to learn the system in equation 8.1 based on the available
data in 8.4 is a bounded least squares method in which the known information
about the parameters given in 8.3 is utilised. That is, we find the best parameters
$\theta^*$ by solving the following bounded least squares problem:
$$\min_{\theta} \frac{1}{2} \sum_{t=1}^{T} \{y(t) - L(X(t), \theta)\}^2 \quad \text{such that } \alpha_i^{lower} \le \alpha_i \le \alpha_i^{higher}, \quad i = 0, 1, \dots, n \tag{8.7}$$
We denote the parameters obtained by this second method as $\theta_{BLS}$ as it is the
best bounded least squares solution, that is,
$$\theta_{BLS} = [\alpha_0^{BLS} \;\; \alpha_1^{BLS} \;\; \dots \;\; \alpha_n^{BLS}]^T \tag{8.8}$$
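In order to give the condition which ensures that the solutions of the first and second
methods exist and are unique, the following notation is introduced:
$$A_T = \begin{bmatrix} 1 & x_1(1) & \dots & x_n(1) \\ 1 & x_1(2) & \dots & x_n(2) \\ \vdots & \vdots & & \vdots \\ 1 & x_1(T) & \dots & x_n(T) \end{bmatrix}, \qquad Y_T = \begin{bmatrix} y(1) \\ y(2) \\ \vdots \\ y(T) \end{bmatrix} \tag{8.9}$$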
It is well known from the least squares method and quadratic programming
that if
$$A_T^T A_T > 0 \tag{8.10}$$
(that is, $A_T^T A_T$ is positive definite) then the optimal solution exists and is unique,
as stated in [ZS97] for example. The question to answer now is which of the models
obtained by the above two methods is better. Before answering
this question, we need to clarify how to measure the “better” model when
comparing the two methods. We define the better
model as the one which is closer to the true model in the least squares sense.
The mathematical definition is therefore as follows:
Definition 8.2.1. Let $L_A(X, \theta_A)$ and $L_B(X, \theta_B)$ be two models to approximate
the true system model $L_{true}(X, \theta_{true})$. Model $L_A$ is said to be better or more
accurate than model $L_B$ if the following inequality holds:
$$\sum_{t=1}^{T} \{L_A(X(t), \theta_A) - L_{true}(X(t), \theta_{true})\}^2 \le \sum_{t=1}^{T} \{L_B(X(t), \theta_B) - L_{true}(X(t), \theta_{true})\}^2 \tag{8.11}$$
With the above definition, we are going to prove in the next section that the
model obtained by the second method (i.e. bounded least squares) is better than
the one obtained by the first method (i.e. standard least squares).
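The two learning methods and the accuracy measure of definition 8.2.1 can be illustrated on a one-regressor example. The sketch below is an illustration under stated assumptions rather than part of the proof: the data are made up, only the slope is bounded (the intercept bounds are taken as infinite), and the helper names `fit_line` and `squared_distance_to_truth` are hypothetical.

```python
def fit_line(xs, ys, slope_bounds=(float("-inf"), float("inf"))):
    """Least squares fit of y = a0 + a1 * x with optional bounds on the slope a1.

    With only the slope bounded, the bounded solution is the unbounded slope
    clipped to the bounds, with the intercept re-fitted for that slope.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = min(max(sxy / sxx, slope_bounds[0]), slope_bounds[1])
    return mean_y - slope * mean_x, slope

def squared_distance_to_truth(model, truth, xs):
    """One side of inequality 8.11: sum over t of (L(X(t)) - L_true(X(t)))^2."""
    return sum(((model[0] + model[1] * x) - (truth[0] + truth[1] * x)) ** 2
               for x in xs)

# Made-up data: true model y = 100 - 2x observed with noise [-4, -1, 1, 4].
truth = (100.0, -2.0)
xs = [10.0, 11.0, 12.0, 13.0]
ys = [76.0, 77.0, 77.0, 78.0]

ls_model = fit_line(xs, ys)                             # standard least squares
bls_model = fit_line(xs, ys, slope_bounds=(-5.0, 0.0))  # slope assumed in [-5, 0]

ls_error = squared_distance_to_truth(ls_model, truth, xs)
bls_error = squared_distance_to_truth(bls_model, truth, xs)
print("LS :", ls_model, ls_error)
print("BLS:", bls_model, bls_error)
```

On this data the unbounded fit returns a positive slope of about 0.6, which has the wrong sign for an own price effect, while the bounded fit stops at the upper bound of zero. The corresponding distances to the true model are roughly 33.8 and 20, so the bounded model is the more accurate one in the sense of definition 8.2.1.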
8.3 Comparing $L_{LS}(X, \theta_{LS})$ and $L_{BLS}(X, \theta_{BLS})$
In this section we give a theoretic justification that the model obtained by the
bounded least squares method is better or more accurate than the one obtained
by the least squares method. More precisely, we are going to prove the following
theorem:
Theorem 3. Assume that
1. The true model of a considered system is given in equation 8.1 with the
known parameter bounds given in 8.3.
2. The training data for the true model is as in 8.4.
3. Condition 8.10 holds. That is, $A_T^T A_T > 0$.
4. $L_{LS}(X, \theta_{LS})$ is the best least squares model obtained by 8.5.
5. $L_{BLS}(X, \theta_{BLS})$ is the best bounded least squares model obtained by 8.7.
Then $L_{BLS}(X, \theta_{BLS})$ is a more accurate model than $L_{LS}(X, \theta_{LS})$ in the sense of
definition 8.2.1.
To prove theorem 3, we need the following lemma.
Lemma 4. Let $G$ be a $p \times p$ non-negative definite matrix and $g$ be a $p \times 1$ matrix.
If:
1. $Y^*$ is an optimal solution of the following unconstrained optimisation problem
$$\min_{Y \in \mathbb{R}^p} \theta(Y) = \frac{1}{2} Y^T G Y - Y^T g \tag{8.12}$$
2. $X^*$ is an optimal solution of the following quadratic programming problem
$$\begin{aligned}
&\min_{X = (x_1, \dots, x_p)^T \in \mathbb{R}^p} \theta(X) = \frac{1}{2} X^T G X - X^T g \\
&\text{such that } e_i^{lower} \le x_i \le e_i^{higher}, \quad i = 0, 1, \dots, p \\
&\text{with } e_i^{lower} \le 0 \text{ and } 0 \le e_i^{higher}, \quad i = 0, 1, \dots, p
\end{aligned} \tag{8.13}$$
Then:
(a) $(X^*)^T G X^* \le (Y^*)^T G Y^*$
(b) If further $G > 0$ and $X^* \ne Y^*$, then $(X^*)^T G X^* < (Y^*)^T G Y^*$
Proof. The proof of this lemma is obtained by choosing $C_{i,j} = 0$ when $i \ne j$ and
$C_{i,j} = 1$ when $i = j$, with $i, j = 0, 1, \dots, p$, in the proof of Lemma A2 found in the
appendix section of [ZS97].
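The role of the sign conditions $e_i^{lower} \le 0 \le e_i^{higher}$ can be seen in the scalar case $p = 1$ with $G > 0$. The following is an illustrative special case worked out under that assumption, not a reproduction of the proof in [ZS97]:

```latex
\begin{aligned}
Y^* &= \frac{g}{G}, \qquad
X^* = \min\{\max\{Y^*, e^{lower}\}, e^{higher}\}, \\
e^{lower} \le 0 \le e^{higher}
&\;\Longrightarrow\; |X^*| \le |Y^*|
\;\Longrightarrow\; G (X^*)^2 \le G (Y^*)^2 .
\end{aligned}
```

Clipping a point to an interval that contains zero never moves it further from zero, which gives part (a), and the inequality is strict whenever the clipping is active, which gives part (b). If the interval excluded zero, for example $[2, 3]$ with $Y^* = 1$, then $X^* = 2$ and the inequality would fail, so the sign conditions cannot be dropped.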
We now prove theorem 3.
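Proof. Firstly, based on 8.9 and some simple matrix manipulation, we have
$$\begin{aligned}
J(\theta) :={}& \frac{1}{2} \sum_{t=1}^{T} \{y(t) - L(X(t), \theta)\}^2 \\
={}& \frac{1}{2} (Y_T - A_T\theta)^T (Y_T - A_T\theta) \\
={}& \frac{1}{2} (Y_T - A_T\theta + A_T\theta_{true} - A_T\theta_{true})^T (Y_T - A_T\theta + A_T\theta_{true} - A_T\theta_{true}) \\
={}& \frac{1}{2} [A_T(\theta - \theta_{true})]^T [A_T(\theta - \theta_{true})] - [A_T(\theta - \theta_{true})]^T [Y_T - A_T\theta_{true}] + \frac{1}{2} [Y_T - A_T\theta_{true}]^T [Y_T - A_T\theta_{true}] \\
={}& \frac{1}{2} (\theta - \theta_{true})^T A_T^T A_T (\theta - \theta_{true}) - (\theta - \theta_{true})^T A_T^T [Y_T - A_T\theta_{true}] + \frac{1}{2} [Y_T - A_T\theta_{true}]^T [Y_T - A_T\theta_{true}] \\
={}& \frac{1}{2} \tilde{\theta}^T A_T^T A_T \tilde{\theta} - \tilde{\theta}^T A_T^T [Y_T - A_T\theta_{true}] + \frac{1}{2} [Y_T - A_T\theta_{true}]^T [Y_T - A_T\theta_{true}] \\
={}& \tilde{J}(\tilde{\theta}) + \frac{1}{2} [Y_T - A_T\theta_{true}]^T [Y_T - A_T\theta_{true}]
\end{aligned}$$
where $\tilde{\theta} = [\tilde{\alpha}_0, \tilde{\alpha}_1, \dots, \tilde{\alpha}_n]^T = \theta - \theta_{true} = [\alpha_0 - \alpha_0^{true}, \alpha_1 - \alpha_1^{true}, \dots, \alpha_n - \alpha_n^{true}]^T$
and $\tilde{J}(\tilde{\theta}) = \frac{1}{2} \tilde{\theta}^T A_T^T A_T \tilde{\theta} - \tilde{\theta}^T A_T^T [Y_T - A_T\theta_{true}]$.
Assigning the constant $C := \frac{1}{2} [Y_T - A_T\theta_{true}]^T [Y_T - A_T\theta_{true}]$, we then have from
above:
$$J(\theta) := \frac{1}{2} \sum_{t=1}^{T} \{y(t) - L(X(t), \theta)\}^2 = \tilde{J}(\tilde{\theta}) + C \tag{8.14}$$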
From 8.14 with $C$ being a constant and 8.5, we have immediately that $\theta_{LS}$ is
the optimal solution of the unconstrained optimisation problem 8.5 if and only
if $\tilde{\theta}_{LS} = \theta_{LS} - \theta_{true}$ is the optimal solution of the following unconstrained
optimisation problem:
$$\min \tilde{J}(\tilde{\theta}) = \frac{1}{2} \tilde{\theta}^T A_T^T A_T \tilde{\theta} - \tilde{\theta}^T A_T^T [Y_T - A_T\theta_{true}]. \tag{8.15}$$
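Further, from 8.14 and 8.7, we have immediately that $\theta_{BLS}$ is the optimal
solution of the bounded least squares problem given in 8.7 if and only if $\tilde{\theta}_{BLS} =
\theta_{BLS} - \theta_{true}$ is the optimal solution of the following quadratic programming
problem:
$$\begin{aligned}
&\min \tilde{J}(\tilde{\theta}) = \frac{1}{2} \tilde{\theta}^T A_T^T A_T \tilde{\theta} - \tilde{\theta}^T A_T^T [Y_T - A_T\theta_{true}] \\
&\text{such that } \alpha_i^{lower} - \alpha_i^{true} \le \tilde{\alpha}_i \le \alpha_i^{higher} - \alpha_i^{true}, \quad i = 0, 1, \dots, n \\
&\text{with } e_i^{lower} = \alpha_i^{lower} - \alpha_i^{true} \le 0, \quad e_i^{higher} = \alpha_i^{higher} - \alpha_i^{true} \ge 0, \quad i = 0, 1, \dots, n
\end{aligned} \tag{8.16}$$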
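Now let
$$G = A_T^T A_T > 0, \qquad g = A_T^T (Y_T - A_T\theta_{true}), \qquad Y^* = \theta_{LS} - \theta_{true}, \qquad X^* = \theta_{BLS} - \theta_{true} \tag{8.17}$$
then the conditions of lemma 4 are satisfied. Based on lemma 4, we have
$$(X^*)^T G X^* \le (Y^*)^T G Y^*. \tag{8.18}$$
That is,
$$(\theta_{BLS} - \theta_{true})^T A_T^T A_T (\theta_{BLS} - \theta_{true}) \le (\theta_{LS} - \theta_{true})^T A_T^T A_T (\theta_{LS} - \theta_{true}). \tag{8.19}$$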
Note that for any $\theta \in \mathbb{R}^{n+1}$ and its corresponding $L(X(t), \theta)$, the following equality
holds:
$$\begin{aligned}
(\theta - \theta_{true})^T A_T^T A_T (\theta - \theta_{true})
&= (A_T\theta - A_T\theta_{true})^T (A_T\theta - A_T\theta_{true}) \\
&= \sum_{t=1}^{T} \{L[X(t), \theta] - L_{true}[X(t), \theta_{true}]\}^2
\end{aligned} \tag{8.20}$$
which implies, by using 8.19,
$$\sum_{t=1}^{T} \{L_{BLS}[X(t), \theta_{BLS}] - L_{true}[X(t), \theta_{true}]\}^2 \le \sum_{t=1}^{T} \{L_{LS}[X(t), \theta_{LS}] - L_{true}[X(t), \theta_{true}]\}^2 \tag{8.21}$$
That is, inequality 8.11 in definition 8.2.1 holds. Therefore $L_{BLS}(X, \theta_{BLS})$
is more accurate than $L_{LS}(X, \theta_{LS})$ based on definition 8.2.1, and this ends the
proof.
Bibliography
[AC09]
V. Araman and R. Caldentey. Dynamic pricing for non-perishable
products with demand learning. Operations Research, 57:1169–1188,
2009.
[AEW09]
B. Atkinson, A. Eckert, and D.S. West. Price matching and
the domino effect in a retail gasoline market. Economic Inquiry,
47(3):568–588, 2009.
[AHM07]
M. Berk Ataman, Harald J. Van Heerde, and Carl F. Mela. Building
Brands. Marketing Science, 27(6):1036–1054, 2007.
[AHM10]
M. Berk Ataman, Harald J. Van Heerde, and Carl F. Mela. The
Long-Term Effect of Marketing Strategy on Brand Sales. Journal of
Marketing Research, 47(5):866–882, 2010.
[Aka74]
H. Akaike. Markovian representation of stochastic processes and
its application to the analysis of autoregressive moving average processes. Annals of the Institute of Statistical Mathematics, 26, 1974.
[And11]
Edward Anderson. A new model for cycles in retail petrol prices. European Journal of Operational Research, 210(2):436–447, April 2011.
[Aok87]
M. Aoki. State Space Modeling of Time Series. 1987.
[AP10]
Elodie Adida and Georgia Perakis. Dynamic pricing and inventory control: uncertainty and competition. Operations Research,
58(2):289–302, 2010.
[Atk08]
Benjamin Atkinson. Retail Gasoline Price Cycles: Evidence from
Guelph, Ontario Using Bi-Hourly, Station-Specific Retail Price Data.
2008.
[BC03]
G. Bitran and R. Caldentey. An overview of pricing models for revenue management. Manufacturing and Service Operations Management, 5:203–229, 2003.
[BCM98]
G. Bitran, R. Caldentey, and S. Mondschein. Coordinating clearance
markdown sales of seasonal products in retail chains. Operations
Research, 46:609–624, 1998.
[BDR04]
Peter Boatwright, Sanjay Dhar, and Peter E. Rossi. The role of retail competition, demographics and account retail strategy as drivers
of promotional sensitivity. Quantitative Marketing and Economics,
2(2):169–190, 2004.
[Ber90]
B Douglas Bernheim. Multimarket contact and collusive behavior.
Rand Journal of Economics, 21(1):1–26, 1990.
[BGL94]
R. C. Blattberg, R. Glazer, and J. D. C. Little. The Marketing
Information Revolution. 1994.
[BJ70]
George Box and Gwilym Jenkins. Time series analysis: Forecasting
and control. 1970.
[BKD92]
M. Baye, D. Kovenock, and C.G. DeVries. It takes two to tango:
equilibrium in a model of sales. Games and Economic Behavior,
4:493–510, 1992.
[BN71]
G.E.P. Box and P. Newbold. Some comments on a paper of Coen,
Gomme and Kendall. Journal of the Royal Statistical Society. Series
A (General), 134, 1971.
[Bon63]
C.P. Bonini. Simulation of information and decision systems in the
firm. 1963.
[BS96]
S. Borenstein and A. Shepard. Dynamic Pricing in Retail Gasoline
Markets. Rand Journal of Economics, 27, 1996.
[BTBH01] R. Burbidge, M. Trotter, B. Buxton, and S. Holden. Drug design by
machine learning: support vector machines for pharmaceutical data
analysis. Computers and Chemistry, 26:5–14, 2001.
[BZ09]
Omar Besbes and Assaf Zeevi. Dynamic pricing without knowing the
demand function: Risk bounds and near-optimal algorithms. Operations Research, 57(6):1407–1420, 2009.
[Cao03]
L.J. Cao. Support vector machines experts for time series forecasting.
Neurocomputing, 51:321–339, 2003.
[CC07]
Australian Competition and Consumer Commission. Petrol prices
and Australian consumers: Report of the ACCC inquiry into the
price of unleaded petrol. Independent examination of the Australian
petroleum market, 2007.
[CCX07]
W. Chiang, J. Chen, and X. Xu. An overview of research on revenue management: current issues and future research. International
Journal of Revenue Management, 1(1):97–128, 2007.
[CDV10]
S. Hossein Cheraghi, Mohammad Dadashzadeh, and Prakash Venkitachalam. Revenue management in manufacturing: a research landscape. Journal of Business and Economics Research, 8(2), 2010.
[Cha04]
Chris Chatfield. The Analysis of Time Series: An Introduction. 2004.
[CHC09]
R. Cross, J. Higbie, and D. Cross. Revenue management’s renaissance:
A rebirth of the art and science of profitable revenue generation.
Cornell Hospitality Quarterly, 50:56–81, 2009.
[CHC10]
Robert G. Cross, Jon A. Higbie, and Zachary N. Cross. Milestones in
the application of analytical pricing and revenue management. Journal of Revenue and Pricing Management, 10(1):8–18, 2010.
[CHV99]
O. Chapelle, P. Haffner, and V. N. Vapnik. Support vector machines
for histogram-based image classification. IEEE Transactions on Neural Networks, 10(5):1055–1064, 1999.
[CJBH11] Ronald Christensen, Wesley Johnson, Adam Branscum, and Timothy E. Hanson. Bayesian Ideas and Data Analysis: An Introduction
for Scientists and Statisticians. 2011.
[Cla66]
H. J. Claycamp. Dynamic effects of short duration price differentials
on retail gasoline sales. Journal of Marketing Research, 3(2):175–178,
1966.
[Con01]
Conference Board of Canada. The Final Fifteen Feet of Hose: The
Canadian Gasoline Industry in the Year 2000. Independent examination of the Canadian petroleum market 2001, 2001.
[CP98]
Ronald W. Cotterill and William P. Putsis. Testing the theory: Vertical strategic interaction and demand functional form. (25211), 1998.
[CS01]
N. Cassaigne and M.G. Singh. Intelligent decision support for the
pricing of products and services in competitive consumer markets.
IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews), 31(1):96–106, 2001.
[CT01]
L.J. Cao and F.E.H. Tay. Financial forecasting using support vector
machines. Neural Computing and Applications, 10:184–192, 2001.
[DBK+ 97] H. Drucker, C.J. Burges, L. Kaufman, A.J. Smola, and V.N. Vapnik.
Support vector regression machines. Advances in Neural Information
Processing Systems, 9:155–161, 1997.
[DCFN05] Y. Dai, X. Chao, S.C. Fang, and H.L.W. Nuttlei. Dynamic pricing
and inventory control: uncertainty and competition. International
Journal of Production Economics, 98:1–16, 2005.
[DJLS00]
E. Dockner, S. Jørgensen, N. Van Long, and G. Sorger. Differential
games in economics and management science. Cambridge University
Press, 2000.
[EK03]
W. Elmaghraby and P. Keskinocak. Dynamic pricing in the presence
of inventory considerations: research overview, current practices, and
future directions. Management Science, 49(10):1287–1309, 2003.
[EK06]
S. Eom and E. Kim. A survey of decision support system applications,
1995 to 2001. Journal of the Operational Research Society, 57:1264–
1278, 2006.
[EW04a]
A. Eckert and D. West. A tale of two cities: price uniformity and price
volatility in gasoline retailing. Annals of Regional Science, 38:25–46,
2004.
[EW04b]
A. Eckert and D. West. Retail gasoline price cycles across spatially
dispersed gasoline stations. Journal of Law and Economics, 47:245–
273, 2004.
[FS08]
O. Foros and F. Steen. Gasoline prices jump up on Mondays: An
outcome of aggressive competition? Business, (June), 2008.
[GCRA10] Andrea Giovannucci, Jesús Cerquides, and Juan A. Rodríguez-Aguilar.
Composing supply chains through multiunit combinatorial reverse
auctions with transformability relationships among goods. IEEE
Transactions on Systems, Man and Cybernetics, Part A: Systems
and Humans, 40(4):767–778, 2010.
[GF08]
L. A. Garrow and M. A. Ferguson. Revenue management and the
analytics explosion: perspectives from industry experts. Journal of
Revenue and Pricing Management, 7(2):219–229, 2008.
[GFKS06] L. A. Garrow, M. A. Ferguson, P. Keskinocak, and J. Swann. Expert opinions: Current pricing and revenue management practice
across U.S. industries. Journal of Revenue and Pricing Management,
5(3):237–247, 2006.
[GK08]
Romualdas Ginevicius and Algirdas Krivka. Application of game
theory for duopoly market analysis. Journal of business economics
and management, 9(3):207–217, 2008.
[Gu06]
Z. Gu. Product differentiation: Key to Macau’s gaming revenue
growth. Journal of Revenue and Pricing Management, 4(4):382–388,
2006.
[Har89]
A.C. Harvey. Forecasting, structural time series models and the
Kalman filter. 1989.
[Haw03]
M. Hawtin. The practicalities and benefits of applying revenue management to grocery retailing, and the need for effective business rule
management. Journal of Revenue and Pricing Management, 2(1):61–
67, 2003.
[HCH+ 04] Z. Huang, H. Chen, C.J. Hsu, W.H. Chen, and S. Wu. Credit rating
analysis with support vector machines and neural networks: a market
comparative study. Decision Support Systems, 37:543–558, 2004.
[HGF02]
A. Heching, L. A. Garrow, and M. A. Ferguson. Mark-down pricing:
An empirical analysis of policies and revenue potential at one apparel
retailer. Journal of Revenue and Pricing Management, 1(2):139–160,
2002.
[HMT08]
D. Hosken, R. McMillan, and C. Taylor. Retail gasoline pricing: What
do we know? International Journal of Industrial Organization,
26(6):1425–1436, November 2008.
[HS76]
P. Harrison and C. Stevens. Bayesian forecasting (with discussion).
Journal of the Royal Statistical Society, series B, 38:205–247, 1976.
[JKZ99]
S. Jørgensen, P. M. Kort, and G. Zaccour. Production, inventory, and
pricing under cost and demand learning effects. European Journal of
Operations Research, 117:382–395, 1999.
[JLM10]
B. Jenkins, T. Liptrot, and D. McCaffrey. Optimisation of prices
across a retail fuel network. Proceedings from the Operational Research 52nd Annual Conference, 2010.
[JR11]
Kurt Jetta and Erick W. Rengifo. A Model to Improve the Estimation
of Baseline Retail Sales. CENTRUM Cathedra, 4(1):10–26, 2011.
[Kal60]
R.E. Kalman. A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 82:35–45, 1960.
[Kar39]
W. Karush. Minima of functions of several variables with inequalities
as side constraints. M.Sc. Dissertation. Dept. of Mathematics, Univ.
of Chicago, 1939.
[KB63]
R.E. Kalman and R.S. Bucy. New results in linear filtering and prediction theory. Journal of Basic Engineering, 83:95–108, 1963.
[Kim05]
S. E. Kimes. Restaurant revenue management: Could it work? Journal of Revenue and Pricing Management, 4(1):95–97, 2005.
[KJPK02] K.I. Kim, K. Jung, S.H. Park, and H.J. Kim. Support vector machines
for texture classification. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 24:1542–1550, 2002.
[KS02]
S. E. Kimes and L. W. Schruben. Golf course revenue management:
A study of tee time intervals. Journal of Revenue and Pricing Management, 1(3):111–120, 2002.
[KSS]
KSSfuels.com. Miller Oil Case Study.
[KT51]
H. W. Kuhn and A. W. Tucker. Nonlinear programming. Proceedings
of 2nd Berkeley Symposium on Mathematical Statistics and Probability, pages 481–492, 1951.
[Kuy02]
A. H. Kuyumucu. Gaming twist in hotel revenue management. Journal of Revenue and Pricing Management, 1(2):161–167, 2002.
[KWN02]
S. E. Kimes, J. Wirtz, and B. M. Noone. How long should dinner
take? measuring expected meal duration for restaurant revenue management. Journal of Revenue and Pricing Management, 1(3):220–
233, 2002.
[LAH10]
Deborah Lim, Patricia Anthony, and Chong Mun Ho. Predict the online auction’s closing price using grey system theory. IEEE International Conference on Systems Man and Cybernetics, pages 156–163,
2010.
[LD02]
W. H. Lieberman and T. Dieck. Expanding the revenue management
frontier: Optimal air planning in the cruise industry. Journal of
Revenue and Pricing Management, 1(2):7–24, 2002.
[LEC06]
LECG Canada. What Determines the Profitability of a Retail Gasoline Outlet? A Study for the Competition Bureau of Canada. Independent expert report, 2006.
[Lei00]
S. Leibs. Ford heeds the profits. CFO Magazine, August 2000.
[LH12]
Yuan Liu and K.W. Hipel. A hierarchical decision model to select
quality control strategies for a complex product. IEEE Transactions
on Systems, Man and Cybernetics, Part A: Systems and Humans,
42(4):814–826, 2012.
[Lip03]
B. W. Lippman. Retail revenue management? competitive strategy
for grocery retailers. Journal of Revenue and Pricing Management,
2(3):229–233, 2003.
[MB99]
Alan L. Montgomery and Eric T. Bradlow. Why Analyst Overconfidence about the Functional Form of Demand Models Can Lead to
Overpricing. Marketing Science, 18(4):569–583, 1999.
[MBB+ 08] D. Martens, L. Bruynseels, B. Baesens, M. Willekens, and J. Vanthienen. Predicting going concern opinion with data mining. Decision
Support Systems, 45:765–777, 2008.
[McC09]
J. L. McCauley. Dynamics of markets: The New Financial Economics.
2009.
[MHRH04] M.A. Mohandes, T.O. Halawani, S. Rehmam, and A.A. Hussain. Support vector machines for wind speed prediction. Renewable Energy,
29:939–947, 2004.
[Mit13]
Kevin Mitchell. The next frontier of the pricing profession. Innovation
in Pricing: Contemporary Theories and Best Practices, page 403,
2013.
[MJL11]
D. McCaffrey, B. Jenkins, and T. Liptrot. Optimization of forecourt
fuel pricing. IMA Journal of Management Mathematics, 2011.
[MLJ11]
David McCaffrey, Tom Liptrot, and Barbara Jenkins. Retail gasoline
pricing: A Bayesian hierarchical approach to modeling the effect of
brand on elasticity. Journal of Revenue and Pricing Management,
10:514–527, 2011.
[Mon02]
Alan L. Montgomery. Reflecting uncertainty about economic theory when estimating consumer demand. Advances in Econometrics,
16:257–294, 2002.
[Mon04]
Alan L Montgomery. The Implementation Challenge of Pricing Decision Support Systems for Retail Managers. Optimization, 1, 2004.
[MR99]
Alan L. Montgomery and Peter E. Rossi. Estimating price elasticities
with theory-based priors. Journal of Marketing Research, 36(4):413–
423, November 1999.
[Noe07a]
M. Noel. Do Gasoline Prices Respond Asymmetrically to Cost
Shocks? The Confounding Effect of Edgeworth Cycles. 2007.
[Noe07b]
M. Noel. Edgeworth price cycles: evidence from the Toronto retail
gasoline market. Journal of Industrial Economics, 55:69–92, 2007.
[Noe07c]
Michael D Noel. Edgeworth price cycles, cost-based pricing, and
sticky pricing in retail gasoline markets. The Review of Economics
and Statistics, 89(2):324–334, 2007.
[Noe11]
Michael D. Noel. Edgeworth price cycles. The New Palgrave Dictionary of Economics, 2011.
[Pet12]
PetrolPrices.com. The Price of Fuel. 2012.
[PL05]
P.F. Pai and C.S. Lin. Using support vector machines in forecasting
production values of machinery industry in Taiwan. International
Journal of Advanced Manufacturing Technology, 27:205–210, 2005.
[PPC09]
Giovanni Petris, Sonia Petrone, and Patrizia Campagnoli. Dynamic
Linear Models with R. 2009.
[SA98]
S. Smith and D. Achabal. Clearance pricing and inventory policies
for retail chains. Management Science, 44(3):285–300, 1998.
[SB93]
M.G. Singh and J.C. Bennavail. Experiments in the use of a knowledge support system for the pricing of gasoline products. Information
and Decision Technologies, 18:427–442, 1993.
[Sim12]
Hermann Simon. How price consulting is coming of age. Advances in
Business Marketing and Purchasing, 19:61–79, 2012.
[Sin91]
M.G. Singh. Knowledge support systems for ’smarter’ pricing and resource allocation. IEEE Control Systems Magazine, 11(5):3–7, 1991.
[Sla92]
M. Slade. Vancouver’s gasoline-price wars: an empirical exercise
in uncovering supergame strategies. Review of Economic Studies,
59:257–276, 1992.
[SLD92]
Barry C. Smith, John F. Leimkuhler, and Ross M. Darrow. Yield
Management at American Airlines. Interfaces, 22(1), 1992.
[SLW10]
Qiang Su, Lei Liu, and Daniel E. Whitney. A systematic study of
the prediction model for operator-induced assembly defects based on
assembly complexity factors. IEEE Transactions on Systems, Man
and Cybernetics, Part A: Systems and Humans, 40(1):107–120, 2010.
[SS04]
A.J. Smola and B. Schölkopf. A tutorial on support vector regression.
Statistics and Computing, 14(3):199–222, 2004.
[TR04]
Kalyan T. Talluri and Garrett Van Ryzin. The Theory and Practice
of Revenue Management. 2004.
[Vap95]
V. N. Vapnik. The nature of statistical learning theory. 1995.
[Vap99]
V. N. Vapnik. An overview of statistical learning theory. IEEE Transactions on Neural Networks, 10(5), 1999.
[Vin04]
B. Vinod. Unlocking the value of revenue management in the hotel
industry. Journal of Revenue and Pricing Management, 3(2):178–190,
2004.
[Viv99]
X. Vives. Oligopoly pricing: old ideas and new tools. 1999.
[VM03]
Wim Van Meerbeeck. Competition and Local Market Conditions on
the Belgian Retail Gasoline Market. De Economist, 151(4):369–388,
December 2003.
[Wan05]
Z. Wang. Timing and Oligopoly Pricing: Evidence from a Repeated
Game in a Timing-Controlled Gasoline Market. 2005.
[WH97]
Mike West and Jeff Harrison. Bayesian forecasting and dynamic models. 1997.
[WHM85] M. West, J. Harrison, and H. Migon. Dynamic generalized linear
models and Bayesian forecasting. American Statistical Association,
80:77–83, 1985.
[WJS08]
Shanshan Wang, Wolfgang Jank, and Galit Shmueli. Explaining and
forecasting online auction prices and their dynamics using functional
data analysis. Journal of Business and Economic Statistics, 26(2),
2008.
[WK03]
L. R. Weatherford and S. E. Kimes. A comparison of forecasting
methods for hotel revenue management. International Journal of
Forecasting, 19(3):401–415, 2003.
[Woo06]
Jeffrey M. Wooldridge. Introductory Econometrics: A Modern Approach. 2006.
[WvBS99] B. Wierenga, G. H. van Bruggen, and R. Staelin. The success of marketing management support systems. Marketing Science, 18(3):196–
207, 1999.
[ZS97]
X-J. Zeng and M. G. Singh. Fuzzy bounded least-squares method for
the identification of linear systems. IEEE Transactions on Systems,
Man and Cybernetics, Part A: Systems and Humans, 27(5):624–635,
1997.
[ZTW12]
Jing Zhao, Wansheng Tang, and Jie Wei. Pricing decision for substitutable products with retail competition in a fuzzy environment. International Journal of Production Economics, 135(1):144–153, 2012.