1 Optimal pricing model: case of study for convenience stores Laura Hervert-Escobar1,2?, Jesus Fabian López-Pérez2 , and Oscar Alejandro Esquivel-Flores1 1 Instituto Tecnológico y de Estudios Superiores de Monterrey. Monterrey, Nuevo León, México 2 SINTEC Customer and Operations Strategy. Monterrey, Nuevo León, México Abstract. Pricing is one of the most vital and highly demanded component in the mix of marketing along with the Product, Place and Promotion. An organization can adopt a number of pricing strategies, which usually will be based on corporate objectives. The purpose of this paper is to propose a methodology to define an optimal pricing strategy for convenience stores. The solution approach involves a multiple linear regression as well as a linear programming optimization model. To prove the value of the proposed methodology a pilot was performed for selected stores. Results show the value of the solution methodology. This model provides an innovative solution that allows the decision maker include business rules of their particular environment in order to define a price strategy that meet the objective business goals. Keywords: Price Strategy, Correlation, Multiple Linear Regression, Optimization 1 Introduction A convenience store is a small retail business that stocks a range of everyday items such as groceries, snack foods, candy, toiletries, soft drinks, tobacco products, magazines and newspapers. Such stores may also offer money order and wire transfer services. In some jurisdictions, corner stores are licensed to sell alcohol, typically beer and wine. Typically, this type of stores open 24 hours and operate in locations with surface less than 500 m2 . According to data provided by marker research firm Euromonitor International, the most important growth of the market in Mexico is through these small formats. Currently there are approximately 20000 convenience stores located throughout the country. This type of stores has had a growth of 50.6% since 2009 [12]. Convenience stores usually charge significantly higher prices than conventional grocery stores or supermarkets, as convenience stores order smaller quantities of inventory at higher per-unit prices from wholesalers. However, convenience stores make up for this by ? Corresponding author. E-mail:[email protected], tel.: +52-181-1068-3479 having longer opening hours, serving more locations, and having shorter cashier lines. Custom research pricing methods fall into two main approaches: pricing the total product (pricing for a complete concept /product) and / or pricing elements of a variable concept /product offering such as branding or features. This research focus on pricing the total product. In different areas of science and applied research data analysis and forecasts have been treated using linear regression [15], [9] considering multiple explanatory variables combined with other methods for selecting variables in order to reduce the dimensionality of the problem, for instance stepwise and genetic algorithms [2]. Currently the multiple linear regression (MLR) is a technique used in machine learning to refine decision making in finance and economics. Ismail, et al. [6] use multiple linear regression as a tool to develop forecasting model for predicting gold prices based on economic factors. Shabri and Samsudin[14] propose a hybrid model which integrates wavelet and multiple linear regressions for crude oil price forecasting, it is important to highlight the use of principal component analysis to process subseries data in MLR to reduce the dimensions of sub-time components series . Various proposals have tried to use MLR and correlation analysis to calculate the optimal price in retail products considering the law of demand known or uncertain. Meijer et al. [11, 10] address the problem of optimal price retailing considering law of demand. They use Cox regression to diagnose identity hazard rates for different products besides characteristics of spline regression is used to find the price elasticity parameter related to the markdown moment. The main concept in their work is to derive unbiased estimates to sell through curves used in retail based through the use of survival analysis. On the other had, Kwong et al. [8] consider demand as a Bayesian belief updated and cumulative demand as Brownian motion with unknown drift. Keskin et al. [7] consider dynamic pricing where the seller does not know the parameters of product’s linear demand curve (incomplete learning). Chakravarty et al. [3] propose a simulation-based algorithm for computing the optimal pricing policy with a stochastic differential equation model for uncertain demand dynamics of the product. Close to our methodology is the proposal of Hu et al. [4] who use predictive modeling based in regression estimates for home value. They build multivariate regression model using maximum information coefficient (MIC) statistic [13] to the observed values and predictive values as an evaluation of the regression. The MIC propose a measure of dependence for two-variate relationships, this coefficient is used as a measure for selecting the ”best subset” of predictor for building the regression model. Hu et al. divides the process in some steps, a) select variables, b) build linear regression models for selected variables, c) analysis for residual errors and fix problems using Box-Cox transformation, d)build the final model. The organization of the document is as follows, the introduction above served to set the business frame as well as the literature review. The definition of the problem is presented in section 2. Then, the mathematical formulation is presented in section 3. Section 4 describes the design and implementation of the pilot test, including results of the pilot. Finally the conclusion and future work in section 4. 2 Case Study The pricing strategy is by definition the efforts aimed at finding a product’s optimum price, typically including overall marketing objectives, consumer demand, product attributes, competitors’ pricing, and market and economic trends [5]. In this case of study, a company that operates several convenience stores desire to improve the current pricing strategy in order to maximize the business performance. The main goals are to obtain the optimum price of each product and to increase the gross margin. In the current pricing strategy, the company use a simplistic process to define the prices of the products for all the stores. The process is based on salespeople information, pricing personnel and other customer-facing staff for pricing intelligence. Although the process considers marketing, competition and financial data, the company has failed to establish internal procedures to optimize prices. Consequently, the process is prone to error, in addition, it is hard to recognize opportunities to adjust the price to improve the business performance. The methodology proposed consist of a pricing model that considers the analysis of historical data. The information contains product attributes, customers behavior, marketing strategies, competitor’s pricing, among others. The model is given in three stages given in Fig. 1. The first stage is the cleaning and organization of data, the main goal is to organize a cluster of products (categories) belonging to the same commercial hierarchy level and share attributes. The second stage is a regression model of the clusters obtained in previous stage. The regression model allows to establish product associations, as well as trend, likelihood and promotional effects. Finally, the third stage is an optimization model that incorporates business rules, objective goals as well as external factors in order to define the optimum price. Fig. 1. Stages of the proposed pricing model Following section explained in detail the mathematical formulation used in each stage of the pricing model. 3 Mathematical Formulation In this case study, the pricing model proposed is based on three stages. The first stage is the cleaning and organization of data. The complexity of this stage lies in the purification and management of data in order to select the information that is most useful for the following models. Typically, historical data is used to provide insight as to how volume sales (V ) is affected by changes in price and market variables such as seasonality, advertising, promotions, competitive product prices and other variables deemed appropriate. However, for a business based on convenience sales the measure of most interest is the gross margin given in equation (1). The gross margin (Y ) is the total sales income (IS) minus the cost of goods (COGS), divided by the total sales income (I), expressed as a percentage. The higher the percentage, the more the company retains on each monetary unit of sales, to service its other costs and debt obligations. The margin is computed for a set of products G. %yi = isi − cogsi ; isi ∀i ∈ G (1) Historical weekly data of this study case is given by product and includes income sales, cost, seasonality, competitors price, promotions, volume sales, commercial hierarchy, and other attributes of the product that allow to build groups of products with similarities features into categories, where the category C is a subset of G. Therefore, each category contains similar products and its size depend on the type of category. Some categories will contain a large number of different products e.g. a category of candies, and others will have a small group of products e.g. dairy products. In the clustering analysis, the objective is to find the relation among the products of the same category k ∈ C. The analysis is performed for each product i ∈ C of the category (cluster).Therefore, the information of a product is crossed with the information of the rest of products of the same category. Historical data is per product, then, records among products may o may not match. In order to compare products with similar purchase behavior, a product that shares 80% of historical records with the product under analysis is selected to next step. Next, the sample Pearson correlation (ρ) is computed in order to define the relation between two variables: Margin of the product under analysis (Y ) and Price (P ) of other product of the category. Where historical data per each product have a size of T weeks and each record is denoted by index t (week), therefore t ∈ T . Consequently, yi,t represent the margin of the product i at the time t and analogously for pj,t .ȳi and p̄i are the sample mean. Finally syi and spj are sample standard deviations. For this step, if the absolute value of the Pearson correlation (2) is moderate, then Pj is selected for next step. Consequently, we obtain a set of selected prices P Ri = S 3 ρyi pj ≥ 0.4 Pj j∈G T P ρ y i pj = yit pjt − T · ȳi · p̄j t=1 (T − 1) syi spj ; ∀i, j ∈ Ck ⊂ G; i 6= j (2) Until this step, we have only cross information between Margin and Prices, where information of Margin comes from the product under analysis and information of Prices comes from the product under analysis and other products in the same category. However, there is more information in the historical data that can be useful in the understanding of the commercial behavior of the product. Therefore, final step in the cluster analysis is to cross information of tendency (E), seasonality (S), cost (COGS), promotions (EC), competitive price(CP ), weather conditions(W ) and selected prices in the previous step (P Ri ). Lets have a set of information variables associated to a product i given by Z i = {Ei ∪Si ∪COGSi ∪ECi ∪CPi ∪Wi ∪P Ri }. Where Z i has a size H and each element is denoted by h. Then, a column vector of ratios ri is computed using equation given in (3), where ρiyh is the sample Pearson correlation between the margin and the information variable h and ρih,l is the sample Pearson correlation between two different information variables. The ratio is computed for each variable of information. In this way, a high ratio means lower relation with other variables. This ratio is helpful to avoid multicollinearity issues. H · |ρy ,h | ratioih = P i i ; |ρhi ,li | ∀i ∈ Ck ⊂ G; h, l ∈ Z i ; h 6= l (3) li ∈Z i Finally, the vector of ratios is order from higher to lower value, finally, a mix of nine explanatory variables X are selected to later perform the regression analysis. The mix of variables consist of 5 variables of prices (X1 , X2 , X3 , X4 , X5 ) including the price of the product under analysis, 2 variables of business metrics (cost, competitive prices, promotions) X6 , X7 , 1 variable of temporality(tendency, seasonality) and 1 variable of the weather conditions. An example of the selection is showed in Fig. 2. Phase 1 ends with the selection of variables, leading to Phase 2, the multiple linear regression analysis. Fig. 2. Example of the selection of explanatory variables using (3) Multiple linear regression attempts to model the relationship between two or more explanatory variables and a response variable by fitting a linear equation to observed data. Multiple linear regression attempts to model the relationship between two or more explanatory variables and a response variable by fitting a linear equation to observed data. Every value of the independent variable x is associated with a value of the dependent variable y. The population regression line for p explanatory variables x1, x2, ..., xp is defined to be µy = β0 + β1 x1 + β2 x2 + ... + βp xp . This line describes how the mean response y changes with the explanatory variables. The observed values for y vary about their means y and are assumed to have the same standard deviation . The fitted values b0 , b1 , ..., bp estimate the parameters β0 , β1 , ..., βp of the population regression line. With the selection of the explanatory variables, formally, the model for multiple linear regression, given n observations, is given by (4). yi = β0 + 9 X βip xip ; ∀i ∈ Ck ⊂ G (4) p=1 A stepwise procedure using Akaike information criterion was used in the regression analysis. The procedure start with all candidate variables, testing the deletion of each variable using a chosen model comparison criterion, deleting the variable (if any) that improves the model the most by being deleted, and repeating this process until no further improvement is possible. The price of the product is always included in the final econometric model. Details of the Akaike information criterion can be found in [1]. Second phase ends with the econometric models. The third and final phase is the optimization model. The optimization phase consider the econometric models obtained during the regression analysis. In order to enrich the proposal with features of the real operation of the convenience stores, a set of business rules are included. Business rules includes operational considerations of the business as well as the external factors of the market. The optimization model is performed per category. The mathematical formulation of the model is given in the following equations. The objective function is given in (5), which consist on maximizes the total gross margin per category. The objective function is subject to constraints given by (6) to (11). MaximizeZ = X yi ; (5) i∈Ck ⊂G Econometrics models obtain during the regression analysis are used as a constraint in (6), this equation is given by each product i belonging to the same category C. 9 X yi = β0 + βip xip ; ∀i ∈ Ck ⊂ G (6) p=1 The primary business rule for a price is to overcome the net cost of the product for a 3% (7). xi ≥ 1.03 ∗ cogsi ; i = 1, 2, 3, 4, 5 (7) Upper and lower bound are given in (8) and (9). In these equations, the proposed price should be ±15% of the selected reference. The options are the competitor price cp or the reference price ref . The reference price is the last price used in the convenience store. xi ≥ 0.85 ∗ min (cpi , refi ) ; ∀i ∈ Ck ⊂ G; i = 1, 2, 3, 4, 5 (8) xi ≤ 1.15 ∗ max (cpi , refi ) ; ∀i ∈ Ck ⊂ G; i = 1, 2, 3, 4, 5 (9) When category of products is under analysis, it is common to find products of the same brand and same content, they only differ in flavor. Therefore, those products should have the same price, unless a marketing strategy suggest differentiate one of the products. In this way, all products under the same features F and marketing strategy are constraint to have the same price (10). xi = xj ; ∀i 6= j; i, j ∈ Ck , F ; i, j = 1, 2, 3, 4, 5 (10) Another characteristic to consider for a product is the different sizes available at the store. Then, (11) considers that the price of the product i should be less than the price of the product j if both are the same product under different size, where j is bigger than i. Such definition is given in SZ. This constraint also serve to differentiate one of the products under F in order to define a marketing strategy. xi ≥ xj ; ∀i, j ∈ SZ, Ck ; i, j = 1, 2, 3, 4, 5 (11) Typically, convenience stores are free to set the price of the products as appropriate to the business, however, there are products that are controlled by the government or by special promotion of the supplier LG. Then, the products under this description are constrained to the controlled price kli (12). xi = kli ; ∀i ∈ Ck , LG; i = 1, 2, 3, 4, 5 (12) The optimization is always given for a period of time. In this way, explanatory variables for seasonality, business, and weather are set to a constant value kitp where i is the product t is the period of time and p is the type of explanatory variable (13). xi = kitp ; ∀p = 6, 7, 8, 9 (13) Finally, non-negativity of the variables is given in (14). It is important to highlight that prices in a convenience store are usually rounded to cents. For this case study, the prices should be rounded to 0, 0.1¢, 0.5¢, 0.9¢, due to is the price to be label at the store. xi ≥ 0 ∀i ∈ Ck i = 1, 2, 3, 4, 5 xi ∈ {a.b |a ∈ N, b ∈ {0, 1, 5, 9} } (14) The details of the implementation of the models as well as instances and case study are describe in next section. 4 Test and Results To test the performance of the proposed models, several instances were tested in a case study. The data for each instance correspond to a real life case from stores located in three cities. The stores were classified into two groups: mirror and pilot. The pilot stores use the price of the proposed methodology while mirror stores remain with the actual strategy. In this way, mirror stores are used to compare the value of the proposed methodology. The classification of the stores was defined by the company according to their business interest. Table 1 shows the nomenclature used to identify instances. Due to data confidentiality, the nomenclature will not show compromised information of the actual places. Then, cities are identified as CT . Products are classified in different levels regarding its purpose. In this research, two levels of classification are used. The first level is named ”Segment” and give information about the market segment of the product. The next level is the ”Category”, which grouped the products by specific similarities. Categories were built during phase 1 of the proposed methodology. Table 1. Instances used for the test of the proposed methodology Cities Pilot Stores Mirror Stores Segment Category Products CT1 CT2 CT3 31 25 31 101 96 104 S G S,G 26 46 71 527 407 1090 Then, for city CT1, the proposed methodology was tested in 31 stores. The price strategy includes 558 products classified in 27 categories. The performance of the store was compared with 164 mirror stores. A weekly historical data of the total stores (Pilot + Mirror) from 2013 to 2015 was used for the regression analysis. Given the amount of data, a procedure to select, clean and organized data was implemented using Visual Fox Pro. The selection of explanatory variables as well as the regression analysis was implemented using a statistical software R. The linear optimization model was implemented in FICO Xpress 7.0.1, the optimization model runs until optimality. As shown in the table 1, and according to the objectives of the company, the variety of size is good enough to prove whether the proposal is efficient for a business plan. During the test, there were five proposals of price every four weeks for each city. The generation of the proposal includes the information generated until two week before implementation. The time available to generate new prices is one business day, then it is reviewed and authorized to be sent to the pilot stores for implementation. Meanwhile, the company generate the prices with the current price strategy for the mirror stores. Results of the methodology are given in terms of % margin for mirror and pilot stores. Also, the econometrics model are review in order to monitor the weight of the price in the strategy. An example of the results obtained after the implementation of the price strategy is given in Figure 3, where the x-axis is the period of time during the test,and y-axis shows the % margin. The figure shows highlights of the implementation of the new strategy. Fig. 3. Results for instances CT2- segment G during the test period As figure 3 shows, the performance of pilot stores is better than mirror stores. Even with a decrease in the sales income, the margin still better. This can be explained with the proposal of prices, where the optimization model is focused in changing the prices of those products with higher % margin in the category. Also, the trend of the results are better in segment S as figure 4 shows. For this segment and city, both mirror and pilot stores increase their income sales in a similar rate, however, the price strategy allows to get a better margin in pilot stores. Fig. 4. Results for instances CT1- segment S during the test period Finally, figure 5 shows the benefits obtained for pilot stores. Due to confidentiality agreement, we are only able to present the % of benefits obtained after compared the actual strategy and the methodology proposed. It can be noted that in all cities and segments, the pilot stores were able to obtained benefits. Also, the strategy performed better for segment S, than for segment G. Fig. 5. Results for instances CT1- segment S during the test period 5 Conclusion Convenience stores serve the entire purchasing population of its geographical area but focuses on customers who need to purchase items outside of normal working hours such as swing shift employees and quick shoppers looking for snacks and related items. In order to capture attention and sales they use prominent signs at the store locations, billboards, media bites on local news, and radio advertisements to capture customers. These actions combined with a right price strategy will lead to the success of the business. In this research we proposed a pricing strategy composed for three phases. The objective in the first phase was to understand and use the most appropriate variables for the business, in this way through a MLR we can obtained econometric model that will better focus in the major goal, the increase of the margin. Then, the optimization model will take additional business rules (internal and external) in order to obtain a price proposal that allowed to increase the margin under the most possible real scenario. The strategy was tested in a real case study, where the results proved the value of the methodology. The proposed methodology is easy to understand and implement within most fields of business. Also, it produces information that can be used to create forecasting and planning models and it is relatively inexpensive research process if a firm has high quality data. Also, the use of business rules give an advantage over typical econometric models that only considers one price and not crossed information. However, risks during implementation are the lack of information or little variation of data, because it would produce a low quality regression analysis. Future work in this research includes the exploration of new methodologies during the selection of explanatory variables, as well as improved the regression analysis. During the definition of the price strategy, the metric of most interest is the margin, however, future work considers the inclusion of other metrics, such as volume sales. Acknowledgments The authors are grateful to Sintec for financial and technical support during the development of this research. Sintec is the leading business consulting firm for Supply Chain, Customer and Operations Strategies with a consultative model in Developing Organizational Skills that enable their customers to generate unique capabilities based on processes, organization and IT. Also, we appreciate the financial support of CONACYT-SNI program in order to promote quality research. References [1] Hirotugu Akaike. [2] Ganjali Mohammad Reza Norouzi Parviz Avval Zhila Mohajeri, Pourbashir Eslam. Application of genetic algorithm - multiple linear regressions to predict the activity of rsk inhibitors. Journal of the Serbian Chemical Society, 80:187–196, 2015. [3] Saswata Chakravarty, Sindhu Padakandla, and Shalabh Bhatnagar. A simulationbased algorithm for optimal pricing policy under demand uncertainty. International Transactions in Operational Research, 21(5):737–760, 2014. [4] Feng Wenying Hu Gongzhu, Wang Jinping. Multivariate Regression Modeling for Home Value Estimates with Evaluation Using Maximum Information Coefficient, pages 69–81. Springer Berlin Heidelberg, Berlin, Heidelberg, 2013. [5] WebFinance Inc. pricing strategy, 2016. [6] Z. Ismail, A. Yahya, and A. Shabri. Forecasting gold prices using multiple linear regre ssion method. American Journal of Applied Sciences, 6:1509–1514, 2009. [7] N.Bora Keskin and Assaf Zeevi. Dynamic pricing with an unknown demand model: Asymptotically optimal semi-myopic policies. [8] H. dharma Kwon, Steven a. Lippman, and Christopher s. Tang. Optimal markdown pricing strategy with demand learning. Probab. Eng. Inf. Sci., 26(1):77–104, January 2012. [9] B. Chowdhury M. Abuella. Solar power probabilistic forecasting by using multiple linear regression analysis. In SoutheastCon 2015, pages 1–5, April 2015. [10] Rudi Meijer and Sandjai Bhulai. Optimal pricing in retail: a cox regression approach. International Journal of Retail & Distribution Management, 41(4):311– 320, 2013. [11] Rudi Meijer, Michael Pieffers, and Sandjai Bhulai. Markdown policies for optimizing revenue, towards optimal pricing in retail. Journal of Business and Retail Management Research, 08(1), 10 2013. [12] Rivas Raquel. Tiendas de conveniencia, un negocio a tomar en cuenta, 2016. [13] David N. Reshef, Yakir A. Reshef, Hilary K. Finucane, Sharon R. Grossman, Gilean McVean, Peter J. Turnbaugh, Eric S. Lander, Michael Mitzenmacher, and Pardis C. Sabeti. Detecting novel associations in large data sets. Science, 334(6062):1518–1524, 2011. [14] Samsudin Ruhaidah Shabri Ani. Crude oil price forecasting based on hybridizing wavelet multiple linear regression model, particle swarm optimization techniques, and principal component analysis. The Scientific World Journal, 2014, 2014. [15] Wang Tong Xiaozhen Yan, Xie Hong. A multiple linear regression data predicting method using correlation analysis for wireless sensor networks. In Cross Strait Quad-Regional Radio Science and Wireless Technology Conference (CSQRWC), 2011, volume 2, pages 960–963, July 2011.
© Copyright 2026 Paperzz