Optimal pricing model: case of study for convenience stores

1
Optimal pricing model: case of study for
convenience stores
Laura Hervert-Escobar1,2?, Jesus Fabian López-Pérez2 , and Oscar Alejandro
Esquivel-Flores1
1
Instituto Tecnológico y de Estudios Superiores de Monterrey. Monterrey, Nuevo
León, México
2
SINTEC Customer and Operations Strategy. Monterrey, Nuevo León, México
Abstract. Pricing is one of the most vital and highly demanded component in the mix of marketing along with the Product, Place and Promotion. An organization can adopt a number of pricing strategies, which
usually will be based on corporate objectives. The purpose of this paper is to propose a methodology to define an optimal pricing strategy
for convenience stores. The solution approach involves a multiple linear
regression as well as a linear programming optimization model. To prove
the value of the proposed methodology a pilot was performed for selected
stores. Results show the value of the solution methodology. This model
provides an innovative solution that allows the decision maker include
business rules of their particular environment in order to define a price
strategy that meet the objective business goals.
Keywords: Price Strategy, Correlation, Multiple Linear Regression, Optimization
1
Introduction
A convenience store is a small retail business that stocks a range of everyday
items such as groceries, snack foods, candy, toiletries, soft drinks, tobacco products, magazines and newspapers. Such stores may also offer money order and
wire transfer services. In some jurisdictions, corner stores are licensed to sell
alcohol, typically beer and wine. Typically, this type of stores open 24 hours and
operate in locations with surface less than 500 m2 . According to data provided
by marker research firm Euromonitor International, the most important growth
of the market in Mexico is through these small formats. Currently there are
approximately 20000 convenience stores located throughout the country. This
type of stores has had a growth of 50.6% since 2009 [12]. Convenience stores
usually charge significantly higher prices than conventional grocery stores or supermarkets, as convenience stores order smaller quantities of inventory at higher
per-unit prices from wholesalers. However, convenience stores make up for this by
?
Corresponding author. E-mail:[email protected], tel.: +52-181-1068-3479
having longer opening hours, serving more locations, and having shorter cashier
lines. Custom research pricing methods fall into two main approaches: pricing
the total product (pricing for a complete concept /product) and / or pricing
elements of a variable concept /product offering such as branding or features.
This research focus on pricing the total product. In different areas of science
and applied research data analysis and forecasts have been treated using linear regression [15], [9] considering multiple explanatory variables combined with
other methods for selecting variables in order to reduce the dimensionality of the
problem, for instance stepwise and genetic algorithms [2]. Currently the multiple linear regression (MLR) is a technique used in machine learning to refine
decision making in finance and economics. Ismail, et al. [6] use multiple linear
regression as a tool to develop forecasting model for predicting gold prices based
on economic factors. Shabri and Samsudin[14] propose a hybrid model which
integrates wavelet and multiple linear regressions for crude oil price forecasting,
it is important to highlight the use of principal component analysis to process
subseries data in MLR to reduce the dimensions of sub-time components series
. Various proposals have tried to use MLR and correlation analysis to calculate
the optimal price in retail products considering the law of demand known or
uncertain. Meijer et al. [11, 10] address the problem of optimal price retailing
considering law of demand. They use Cox regression to diagnose identity hazard
rates for different products besides characteristics of spline regression is used to
find the price elasticity parameter related to the markdown moment. The main
concept in their work is to derive unbiased estimates to sell through curves used
in retail based through the use of survival analysis. On the other had, Kwong et
al. [8] consider demand as a Bayesian belief updated and cumulative demand as
Brownian motion with unknown drift. Keskin et al. [7] consider dynamic pricing where the seller does not know the parameters of product’s linear demand
curve (incomplete learning). Chakravarty et al. [3] propose a simulation-based
algorithm for computing the optimal pricing policy with a stochastic differential
equation model for uncertain demand dynamics of the product. Close to our
methodology is the proposal of Hu et al. [4] who use predictive modeling based
in regression estimates for home value. They build multivariate regression model
using maximum information coefficient (MIC) statistic [13] to the observed values and predictive values as an evaluation of the regression. The MIC propose a
measure of dependence for two-variate relationships, this coefficient is used as a
measure for selecting the ”best subset” of predictor for building the regression
model. Hu et al. divides the process in some steps, a) select variables, b) build
linear regression models for selected variables, c) analysis for residual errors and
fix problems using Box-Cox transformation, d)build the final model. The organization of the document is as follows, the introduction above served to set the
business frame as well as the literature review. The definition of the problem
is presented in section 2. Then, the mathematical formulation is presented in
section 3. Section 4 describes the design and implementation of the pilot test,
including results of the pilot. Finally the conclusion and future work in section
4.
2
Case Study
The pricing strategy is by definition the efforts aimed at finding a product’s optimum price, typically including overall marketing objectives, consumer demand,
product attributes, competitors’ pricing, and market and economic trends [5].
In this case of study, a company that operates several convenience stores desire
to improve the current pricing strategy in order to maximize the business performance. The main goals are to obtain the optimum price of each product and
to increase the gross margin.
In the current pricing strategy, the company use a simplistic process to define
the prices of the products for all the stores. The process is based on salespeople
information, pricing personnel and other customer-facing staff for pricing intelligence. Although the process considers marketing, competition and financial
data, the company has failed to establish internal procedures to optimize prices.
Consequently, the process is prone to error, in addition, it is hard to recognize opportunities to adjust the price to improve the business performance. The
methodology proposed consist of a pricing model that considers the analysis of
historical data. The information contains product attributes, customers behavior, marketing strategies, competitor’s pricing, among others. The model is given
in three stages given in Fig. 1. The first stage is the cleaning and organization
of data, the main goal is to organize a cluster of products (categories) belonging
to the same commercial hierarchy level and share attributes. The second stage
is a regression model of the clusters obtained in previous stage. The regression
model allows to establish product associations, as well as trend, likelihood and
promotional effects. Finally, the third stage is an optimization model that incorporates business rules, objective goals as well as external factors in order to
define the optimum price.
Fig. 1. Stages of the proposed pricing model
Following section explained in detail the mathematical formulation used in
each stage of the pricing model.
3
Mathematical Formulation
In this case study, the pricing model proposed is based on three stages. The first
stage is the cleaning and organization of data. The complexity of this stage lies
in the purification and management of data in order to select the information
that is most useful for the following models. Typically, historical data is used
to provide insight as to how volume sales (V ) is affected by changes in price
and market variables such as seasonality, advertising, promotions, competitive
product prices and other variables deemed appropriate. However, for a business
based on convenience sales the measure of most interest is the gross margin given
in equation (1). The gross margin (Y ) is the total sales income (IS) minus the
cost of goods (COGS), divided by the total sales income (I), expressed as a
percentage. The higher the percentage, the more the company retains on each
monetary unit of sales, to service its other costs and debt obligations. The margin
is computed for a set of products G.
%yi =
isi − cogsi
;
isi
∀i ∈ G
(1)
Historical weekly data of this study case is given by product and includes
income sales, cost, seasonality, competitors price, promotions, volume sales, commercial hierarchy, and other attributes of the product that allow to build groups
of products with similarities features into categories, where the category C is
a subset of G. Therefore, each category contains similar products and its size
depend on the type of category. Some categories will contain a large number of
different products e.g. a category of candies, and others will have a small group
of products e.g. dairy products.
In the clustering analysis, the objective is to find the relation among the
products of the same category k ∈ C. The analysis is performed for each product
i ∈ C of the category (cluster).Therefore, the information of a product is crossed
with the information of the rest of products of the same category. Historical data
is per product, then, records among products may o may not match. In order to
compare products with similar purchase behavior, a product that shares 80% of
historical records with the product under analysis is selected to next step.
Next, the sample Pearson correlation (ρ) is computed in order to define the
relation between two variables: Margin of the product under analysis (Y ) and
Price (P ) of other product of the category. Where historical data per each product have a size of T weeks and each record is denoted by index t (week), therefore
t ∈ T . Consequently, yi,t represent the margin of the product i at the time t and
analogously for pj,t .ȳi and p̄i are the sample mean. Finally syi and spj are sample
standard deviations. For this step, if the absolute value of the Pearson correlation (2) is moderate, then Pj is selected for next step. Consequently, we obtain
a set of selected prices P Ri =
S
3 ρyi pj ≥ 0.4
Pj
j∈G
T
P
ρ y i pj =
yit pjt − T · ȳi · p̄j
t=1
(T − 1) syi spj
;
∀i, j ∈ Ck ⊂ G;
i 6= j
(2)
Until this step, we have only cross information between Margin and Prices, where
information of Margin comes from the product under analysis and information
of Prices comes from the product under analysis and other products in the
same category. However, there is more information in the historical data that
can be useful in the understanding of the commercial behavior of the product.
Therefore, final step in the cluster analysis is to cross information of tendency
(E), seasonality (S), cost (COGS), promotions (EC), competitive price(CP ),
weather conditions(W ) and selected prices in the previous step (P Ri ).
Lets have a set of information variables associated to a product i given by
Z i = {Ei ∪Si ∪COGSi ∪ECi ∪CPi ∪Wi ∪P Ri }. Where Z i has a size H and each
element is denoted by h. Then, a column vector of ratios ri is computed using
equation given in (3), where ρiyh is the sample Pearson correlation between the
margin and the information variable h and ρih,l is the sample Pearson correlation
between two different information variables. The ratio is computed for each
variable of information. In this way, a high ratio means lower relation with other
variables. This ratio is helpful to avoid multicollinearity issues.
H · |ρy ,h |
ratioih = P i i ;
|ρhi ,li |
∀i ∈ Ck ⊂ G;
h, l ∈ Z i ;
h 6= l
(3)
li ∈Z i
Finally, the vector of ratios is order from higher to lower value, finally, a mix of
nine explanatory variables X are selected to later perform the regression analysis.
The mix of variables consist of 5 variables of prices (X1 , X2 , X3 , X4 , X5 ) including the price of the product under analysis, 2 variables of business metrics (cost,
competitive prices, promotions) X6 , X7 , 1 variable of temporality(tendency, seasonality) and 1 variable of the weather conditions. An example of the selection is
showed in Fig. 2. Phase 1 ends with the selection of variables, leading to Phase
2, the multiple linear regression analysis.
Fig. 2. Example of the selection of explanatory variables using (3)
Multiple linear regression attempts to model the relationship between two or
more explanatory variables and a response variable by fitting a linear equation
to observed data. Multiple linear regression attempts to model the relationship
between two or more explanatory variables and a response variable by fitting a
linear equation to observed data. Every value of the independent variable x is
associated with a value of the dependent variable y. The population regression
line for p explanatory variables x1, x2, ..., xp is defined to be µy = β0 + β1 x1 +
β2 x2 + ... + βp xp . This line describes how the mean response y changes with the
explanatory variables. The observed values for y vary about their means y and
are assumed to have the same standard deviation . The fitted values b0 , b1 , ..., bp
estimate the parameters β0 , β1 , ..., βp of the population regression line. With the
selection of the explanatory variables, formally, the model for multiple linear
regression, given n observations, is given by (4).
yi = β0 +
9
X
βip xip ;
∀i ∈ Ck ⊂ G
(4)
p=1
A stepwise procedure using Akaike information criterion was used in the regression analysis. The procedure start with all candidate variables, testing the
deletion of each variable using a chosen model comparison criterion, deleting
the variable (if any) that improves the model the most by being deleted, and
repeating this process until no further improvement is possible. The price of
the product is always included in the final econometric model. Details of the
Akaike information criterion can be found in [1]. Second phase ends with the
econometric models. The third and final phase is the optimization model. The
optimization phase consider the econometric models obtained during the regression analysis. In order to enrich the proposal with features of the real operation
of the convenience stores, a set of business rules are included. Business rules includes operational considerations of the business as well as the external factors of
the market. The optimization model is performed per category. The mathematical formulation of the model is given in the following equations. The objective
function is given in (5), which consist on maximizes the total gross margin per
category. The objective function is subject to constraints given by (6) to (11).
MaximizeZ =
X
yi ;
(5)
i∈Ck ⊂G
Econometrics models obtain during the regression analysis are used as a constraint in (6), this equation is given by each product i belonging to the same
category C.
9
X
yi = β0 +
βip xip ;
∀i ∈ Ck ⊂ G
(6)
p=1
The primary business rule for a price is to overcome the net cost of the
product for a 3% (7).
xi ≥ 1.03 ∗ cogsi ;
i = 1, 2, 3, 4, 5
(7)
Upper and lower bound are given in (8) and (9). In these equations, the
proposed price should be ±15% of the selected reference. The options are the
competitor price cp or the reference price ref . The reference price is the last
price used in the convenience store.
xi ≥ 0.85 ∗ min (cpi , refi ) ;
∀i ∈ Ck ⊂ G;
i = 1, 2, 3, 4, 5
(8)
xi ≤ 1.15 ∗ max (cpi , refi ) ;
∀i ∈ Ck ⊂ G;
i = 1, 2, 3, 4, 5
(9)
When category of products is under analysis, it is common to find products
of the same brand and same content, they only differ in flavor. Therefore, those
products should have the same price, unless a marketing strategy suggest differentiate one of the products. In this way, all products under the same features F
and marketing strategy are constraint to have the same price (10).
xi = xj ;
∀i 6= j;
i, j ∈ Ck , F ;
i, j = 1, 2, 3, 4, 5
(10)
Another characteristic to consider for a product is the different sizes available
at the store. Then, (11) considers that the price of the product i should be less
than the price of the product j if both are the same product under different size,
where j is bigger than i. Such definition is given in SZ. This constraint also
serve to differentiate one of the products under F in order to define a marketing
strategy.
xi ≥ xj ;
∀i, j ∈ SZ, Ck ;
i, j = 1, 2, 3, 4, 5
(11)
Typically, convenience stores are free to set the price of the products as
appropriate to the business, however, there are products that are controlled by
the government or by special promotion of the supplier LG. Then, the products
under this description are constrained to the controlled price kli (12).
xi = kli ;
∀i ∈ Ck , LG;
i = 1, 2, 3, 4, 5
(12)
The optimization is always given for a period of time. In this way, explanatory
variables for seasonality, business, and weather are set to a constant value kitp
where i is the product t is the period of time and p is the type of explanatory
variable (13).
xi = kitp ;
∀p = 6, 7, 8, 9
(13)
Finally, non-negativity of the variables is given in (14). It is important to
highlight that prices in a convenience store are usually rounded to cents. For
this case study, the prices should be rounded to 0, 0.1¢, 0.5¢, 0.9¢, due to is the
price to be label at the store.
xi ≥ 0
∀i ∈ Ck
i = 1, 2, 3, 4, 5
xi ∈ {a.b |a ∈ N, b ∈ {0, 1, 5, 9} } (14)
The details of the implementation of the models as well as instances and case
study are describe in next section.
4
Test and Results
To test the performance of the proposed models, several instances were tested
in a case study. The data for each instance correspond to a real life case from
stores located in three cities. The stores were classified into two groups: mirror
and pilot. The pilot stores use the price of the proposed methodology while
mirror stores remain with the actual strategy. In this way, mirror stores are used
to compare the value of the proposed methodology. The classification of the
stores was defined by the company according to their business interest.
Table 1 shows the nomenclature used to identify instances. Due to data confidentiality, the nomenclature will not show compromised information of the
actual places. Then, cities are identified as CT . Products are classified in different levels regarding its purpose. In this research, two levels of classification are
used. The first level is named ”Segment” and give information about the market segment of the product. The next level is the ”Category”, which grouped
the products by specific similarities. Categories were built during phase 1 of the
proposed methodology.
Table 1. Instances used for the test of the proposed methodology
Cities Pilot Stores Mirror Stores Segment Category Products
CT1
CT2
CT3
31
25
31
101
96
104
S
G
S,G
26
46
71
527
407
1090
Then, for city CT1, the proposed methodology was tested in 31 stores. The
price strategy includes 558 products classified in 27 categories. The performance
of the store was compared with 164 mirror stores.
A weekly historical data of the total stores (Pilot + Mirror) from 2013 to
2015 was used for the regression analysis. Given the amount of data, a procedure to select, clean and organized data was implemented using Visual Fox
Pro. The selection of explanatory variables as well as the regression analysis was
implemented using a statistical software R. The linear optimization model was
implemented in FICO Xpress 7.0.1, the optimization model runs until optimality.
As shown in the table 1, and according to the objectives of the company,
the variety of size is good enough to prove whether the proposal is efficient for a
business plan. During the test, there were five proposals of price every four weeks
for each city. The generation of the proposal includes the information generated
until two week before implementation. The time available to generate new prices
is one business day, then it is reviewed and authorized to be sent to the pilot
stores for implementation. Meanwhile, the company generate the prices with the
current price strategy for the mirror stores. Results of the methodology are given
in terms of % margin for mirror and pilot stores. Also, the econometrics model
are review in order to monitor the weight of the price in the strategy.
An example of the results obtained after the implementation of the price
strategy is given in Figure 3, where the x-axis is the period of time during
the test,and y-axis shows the % margin. The figure shows highlights of the
implementation of the new strategy.
Fig. 3. Results for instances CT2- segment G during the test period
As figure 3 shows, the performance of pilot stores is better than mirror stores.
Even with a decrease in the sales income, the margin still better. This can be
explained with the proposal of prices, where the optimization model is focused
in changing the prices of those products with higher % margin in the category.
Also, the trend of the results are better in segment S as figure 4 shows. For this
segment and city, both mirror and pilot stores increase their income sales in a
similar rate, however, the price strategy allows to get a better margin in pilot
stores.
Fig. 4. Results for instances CT1- segment S during the test period
Finally, figure 5 shows the benefits obtained for pilot stores. Due to confidentiality agreement, we are only able to present the % of benefits obtained after
compared the actual strategy and the methodology proposed. It can be noted
that in all cities and segments, the pilot stores were able to obtained benefits.
Also, the strategy performed better for segment S, than for segment G.
Fig. 5. Results for instances CT1- segment S during the test period
5
Conclusion
Convenience stores serve the entire purchasing population of its geographical
area but focuses on customers who need to purchase items outside of normal
working hours such as swing shift employees and quick shoppers looking for snacks and related items. In order to capture attention and sales they use prominent signs at the store locations, billboards, media bites on local news, and radio
advertisements to capture customers. These actions combined with a right price
strategy will lead to the success of the business.
In this research we proposed a pricing strategy composed for three phases.
The objective in the first phase was to understand and use the most appropriate
variables for the business, in this way through a MLR we can obtained econometric model that will better focus in the major goal, the increase of the margin.
Then, the optimization model will take additional business rules (internal and
external) in order to obtain a price proposal that allowed to increase the margin
under the most possible real scenario. The strategy was tested in a real case
study, where the results proved the value of the methodology.
The proposed methodology is easy to understand and implement within most
fields of business. Also, it produces information that can be used to create forecasting and planning models and it is relatively inexpensive research process if
a firm has high quality data. Also, the use of business rules give an advantage
over typical econometric models that only considers one price and not crossed
information. However, risks during implementation are the lack of information
or little variation of data, because it would produce a low quality regression
analysis.
Future work in this research includes the exploration of new methodologies
during the selection of explanatory variables, as well as improved the regression
analysis. During the definition of the price strategy, the metric of most interest
is the margin, however, future work considers the inclusion of other metrics, such
as volume sales.
Acknowledgments
The authors are grateful to Sintec for financial and technical support during the
development of this research. Sintec is the leading business consulting firm for
Supply Chain, Customer and Operations Strategies with a consultative model
in Developing Organizational Skills that enable their customers to generate unique capabilities based on processes, organization and IT. Also, we appreciate
the financial support of CONACYT-SNI program in order to promote quality
research.
References
[1] Hirotugu Akaike.
[2] Ganjali Mohammad Reza Norouzi Parviz Avval Zhila Mohajeri, Pourbashir Eslam. Application of genetic algorithm - multiple linear regressions to predict the
activity of rsk inhibitors. Journal of the Serbian Chemical Society, 80:187–196,
2015.
[3] Saswata Chakravarty, Sindhu Padakandla, and Shalabh Bhatnagar. A simulationbased algorithm for optimal pricing policy under demand uncertainty. International Transactions in Operational Research, 21(5):737–760, 2014.
[4] Feng Wenying Hu Gongzhu, Wang Jinping. Multivariate Regression Modeling for
Home Value Estimates with Evaluation Using Maximum Information Coefficient,
pages 69–81. Springer Berlin Heidelberg, Berlin, Heidelberg, 2013.
[5] WebFinance Inc. pricing strategy, 2016.
[6] Z. Ismail, A. Yahya, and A. Shabri. Forecasting gold prices using multiple linear
regre ssion method. American Journal of Applied Sciences, 6:1509–1514, 2009.
[7] N.Bora Keskin and Assaf Zeevi. Dynamic pricing with an unknown demand
model: Asymptotically optimal semi-myopic policies.
[8] H. dharma Kwon, Steven a. Lippman, and Christopher s. Tang. Optimal markdown pricing strategy with demand learning. Probab. Eng. Inf. Sci., 26(1):77–104,
January 2012.
[9] B. Chowdhury M. Abuella. Solar power probabilistic forecasting by using multiple
linear regression analysis. In SoutheastCon 2015, pages 1–5, April 2015.
[10] Rudi Meijer and Sandjai Bhulai. Optimal pricing in retail: a cox regression approach. International Journal of Retail & Distribution Management, 41(4):311–
320, 2013.
[11] Rudi Meijer, Michael Pieffers, and Sandjai Bhulai. Markdown policies for optimizing revenue, towards optimal pricing in retail. Journal of Business and Retail
Management Research, 08(1), 10 2013.
[12] Rivas Raquel. Tiendas de conveniencia, un negocio a tomar en cuenta, 2016.
[13] David N. Reshef, Yakir A. Reshef, Hilary K. Finucane, Sharon R. Grossman,
Gilean McVean, Peter J. Turnbaugh, Eric S. Lander, Michael Mitzenmacher,
and Pardis C. Sabeti. Detecting novel associations in large data sets. Science,
334(6062):1518–1524, 2011.
[14] Samsudin Ruhaidah Shabri Ani. Crude oil price forecasting based on hybridizing
wavelet multiple linear regression model, particle swarm optimization techniques,
and principal component analysis. The Scientific World Journal, 2014, 2014.
[15] Wang Tong Xiaozhen Yan, Xie Hong. A multiple linear regression data predicting
method using correlation analysis for wireless sensor networks. In Cross Strait
Quad-Regional Radio Science and Wireless Technology Conference (CSQRWC),
2011, volume 2, pages 960–963, July 2011.