Product Profit Optimization Jack Meyers, Bria Lambert, Kelsey Frain

Product Profit Optimization
Jack Meyers, Bria Lambert, Kelsey Frain, Daniel Bonneville
Introduction
Problem Overview
Like many thriving businesses in the United States, Fastenal continuously develops new
strategies to reduce cost and maximize profit. Over the last fifty years, Fastenal has
negotiated numerous agreements between its top customers and brand manufacturers. Each
agreement, known as a Special Pricing Agreement (SPA), allows Fastenal to distribute large
quantities of a brand's product to a customer at a discounted price. Using large datasets and
predictive analysis, our team analyzed possible areas of profit growth based on Fastenal's
continuing use of SPAs. Our goal was to identify opportunities to create additional SPAs with
other customers based on previous product sales and brand data.
Why Is This Important?
A Special Pricing Agreement gives Fastenal a discounted buying price and its customer a
discounted selling price, and it allows Fastenal to distribute a very large amount of product to
a customer in one sale. The reduction in the buying price occurs between Fastenal and the
manufacturer, and the reduction in the selling price occurs between Fastenal and its customer.
These discounts on both sides of the relationship, combined with the high quantity sold, are
what increase Fastenal's profit.
In addition to the direct profit increase, an increase in quantity sold reduces distribution
complexity, increases inventory flow, and reduces unnecessary brand variety within each
Fastenal distribution center. Because an SPA involves the sale of thousands of units of one
product, the need to repeat this sale in small amounts is eliminated, which reduces distribution
complexity. Similarly, the large sale removes a massive amount of one product from Fastenal's
inventory at one time; the combination of multiple SPAs therefore drastically increases
inventory flow. Because an SPA involves just one brand's product and one customer, that
customer has enough supply of the product to purchase only from the brand involved in the
contract. By focusing on a few brands to create SPAs, Fastenal can reduce its inventory of
other brands that no longer supply large amounts of product to its customers, which reduces
unnecessary brand variety. One specific SPA exists between the following three parties: the
manufacturer 3M, Fastenal and Walmart. Figure 1 shows an example of one such
customer-product relationship.
Figure 1: example of SPA relationship
Because Fastenal is the middleman in the transaction, it not only negotiates the two
discounted prices among the three parties but is also able to control the increase in quantity
sold. The control Fastenal has in forming this agreement has been a large factor in its ability
to maximize profit.
Expected Outcome
1. Discover influential SPA attributes
2. Find possible SPAs
3. Analyze/validate their profit opportunity
Assumptions
As our team worked through an approach, we made several simplifying assumptions in order to
proceed with our analysis:

Any national brand would be willing to form an SPA with a customer

The best candidates for SPA conversion are among the top 1,000 customers based on
quantity purchased

Influential SPA attributes include: quantity sold, product sale price, product cost price
and profitability
Data Analysis
In order to identify where possible SPAs could be formed, our team had to first clearly
understand the data provided and analyze how we can use the data to discover a solution. The
path to using the data involved cleaning, processing and exploring each dataset.
Data Cleaning
Each of the datasets Fastenal provided had millions of entries and an extensive set of
variables describing each entry. In order to make connections between the datasets and
further analyze past SPA relationships, we first had to clean the data and eliminate
variables and values that had no connection to the creation of an SPA.
Because an SPA should only be formed when a product is sold at high quantity, our team
narrowed Fastenal's thousands of customers down to the top 1,000 customers based on
quantity purchased. Within the top 1,000 customers we found multiple products that were
sold only a few times, so we removed products with a quantity sold of less than ten from our
dataset. Once these customers had been identified, our team was able to use sales data to
analyze each top customer's previous sales transactions with Fastenal. Because these are
Fastenal's top customers and SPAs are only formed when a high quantity of one product is
purchased, this subset of transactions holds the most logical possibilities for creating
additional SPAs.
In order to simplify and condense this dataset further, our team removed products with
negative numerical values, which skewed the data. Examples of these entries include products
with an average sale price or average cost less than zero. Similarly, we found multiple
columns with missing values in every entry. Examples of such variables include: Global
Group ID, Vending Machine Customer, Employee Headcount and Binstock Flag. We therefore
removed these variables from the data we focused on.
Without the elimination of irrelevant variables and negative values, our analysis of the
data would have been clouded and the trends in our search for possible SPAs would have been
drastically skewed.
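The cleaning rules described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the team's actual code: the field names (customer, quantity_sold, avg_sale_price, avg_cost_price) and the function name clean_relationships are assumptions, and the rows are made-up examples.

```python
from collections import Counter

def clean_relationships(rows, top_n=1000, min_quantity=10):
    """Keep rows of the top_n customers and drop low-volume or negative entries.

    Each row is a dict with hypothetical keys: customer, product,
    quantity_sold, avg_sale_price, avg_cost_price.
    """
    # Rank customers by total quantity purchased and keep the top_n.
    per_customer = Counter()
    for r in rows:
        per_customer[r["customer"]] += r["quantity_sold"]
    top_customers = {c for c, _ in per_customer.most_common(top_n)}

    return [
        r for r in rows
        if r["customer"] in top_customers
        and r["quantity_sold"] >= min_quantity   # drop products sold fewer than ten times
        and r["avg_sale_price"] >= 0             # drop negative prices
        and r["avg_cost_price"] >= 0             # drop negative costs
    ]

# Made-up rows for illustration only.
rows = [
    {"customer": "A", "product": "p1", "quantity_sold": 5000, "avg_sale_price": 1.2, "avg_cost_price": 0.8},
    {"customer": "A", "product": "p2", "quantity_sold": 4, "avg_sale_price": 2.0, "avg_cost_price": 1.0},
    {"customer": "B", "product": "p1", "quantity_sold": 900, "avg_sale_price": -0.5, "avg_cost_price": 0.3},
    {"customer": "C", "product": "p3", "quantity_sold": 50, "avg_sale_price": 3.0, "avg_cost_price": 2.0},
]
kept = clean_relationships(rows, top_n=2)
```

With top_n=2, customer C falls outside the top customers, the low-volume product and the negative-price product are dropped, and only the first row remains.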
Data Processing
Once our team cleaned the data and extracted the top 1,000 customers, we combined data sets
to tie the customer and product associated to each sales transaction. We also crafted multiple
variables describing each customer-product relationship.
Fastenal provided our team with extensive data revealing valuable information; the data
tables we focused on included: sales history, SPAs, products, brands and contract pricing.
Sales history, SPA existence, customer and product information were scattered across
multiple data sets. Using the inventory item ID as a primary key, our team was able to
tie each sales transaction for an inventory item ID to the variables describing that item in
multiple data sets. Once this connection had been made, we formed a simplified table where
each row is a customer-product transaction and each column is a unique variable describing the
transaction and item. In addition to the variables Fastenal provided describing each
transaction, our team created additional variables in hopes of further classifying an SPA. The
variables our team crafted describe one customer-product relationship over the course of
numerous invoices between one unique customer and product; their corresponding formulas are
shown in Table 2. Additionally, our team identified the transactions with the highest and
lowest sale price of one product with one unique customer and the transactions with the highest
and lowest cost price of one product with one unique customer.
Variable              Formula
Quantity Sold         = Σ quantity sold
Average Sale Price    = Σ (quantity sold × sale price) / Σ quantity sold
Average Cost Price    = Σ (quantity sold × cost price) / Σ quantity sold
Profitability         = (average sale price – average cost price) / average sale price
Table 2: Created variables and their corresponding formulas
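As a minimal sketch of how the Table 2 formulas could be computed, the Python below aggregates invoice lines into one summary per customer-product pair. The dictionary keys and the function name summarize_relationships are hypothetical, the invoices are made up, and the code assumes cleaned data (positive quantities and sale prices) so that no division by zero occurs.

```python
from collections import defaultdict

def summarize_relationships(invoices):
    """Aggregate invoice lines into one row per (customer, product) pair.

    Each invoice is a dict with hypothetical keys: customer, product,
    quantity, sale_price, cost_price (prices are per unit).
    """
    totals = defaultdict(lambda: {"quantity": 0, "revenue": 0.0, "cost": 0.0})
    for inv in invoices:
        t = totals[(inv["customer"], inv["product"])]
        t["quantity"] += inv["quantity"]
        t["revenue"] += inv["quantity"] * inv["sale_price"]
        t["cost"] += inv["quantity"] * inv["cost_price"]

    summary = {}
    for key, t in totals.items():
        avg_sale = t["revenue"] / t["quantity"]   # quantity-weighted average sale price
        avg_cost = t["cost"] / t["quantity"]      # quantity-weighted average cost price
        summary[key] = {
            "quantity_sold": t["quantity"],
            "avg_sale_price": avg_sale,
            "avg_cost_price": avg_cost,
            "profitability": (avg_sale - avg_cost) / avg_sale,
        }
    return summary

# Two made-up invoices for the same customer and product.
invoices = [
    {"customer": "C1", "product": "P1", "quantity": 100, "sale_price": 2.00, "cost_price": 1.00},
    {"customer": "C1", "product": "P1", "quantity": 300, "sale_price": 1.60, "cost_price": 1.00},
]
row = summarize_relationships(invoices)[("C1", "P1")]
```

Because the averages are weighted by quantity, the larger invoice pulls the average sale price toward 1.60 rather than the simple midpoint of the two prices.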
Combining the variables we created with those Fastenal provided, our dataset
describing each customer-product relationship consisted of 79 unique variables. As our
research continued, our team narrowed those variables based on their significance in the
data; this selection is described later in the section on variable selection with Random
Forests (Figure 7).
Using this condensed dataset, we were able to connect a unique brand's product, the
customer and the existence of an SPA. If an SPA exists, the relationship is represented with
1; if not, it is represented with 0. An example of a customer-product relationship in the table
our team crafted involves the brand 3M and the customer Walmart. This manufacturer has
repeatedly sold many individual products through Fastenal to Walmart; our team connected
these relationships and described them in a table with a format similar to what is shown in
Table 3.
Customer - Product     Quantity Sold   Profitability   SPA
Walmart - 3M bolt      120,000         .45             1
Walmart - 3M wrench    400             .30             0
Walmart - 3M screw     320,000         .60             0
...                    ..              ..              ..
Table 3: Portion of our customer-product relationship table.
The dataset our team created connecting these attributes allowed us to further
analyze the influential parameter values that have contributed to the creation of an SPA for
each product. After identifying these values, we applied this information to hundreds of
customer-product relationships that are logical potential SPAs.
Data Exploration
Through our process of cleaning and processing the data, we have extracted valuable
statistics that help better understand the data we are working with. Shown in Table 4 is the
average of the following variables that describe the top 1,000 customers:
           Number of relationships   Percent of data   Profit   Quantity Sold
SPA        1,577                     0.44%             $0.34    151,083 units
Non-SPA    308,562                   99.56%            $0.21    18,513 units
Table 4: Top 1,000 customer statistics.
Comparing these values between relationships that do and do not have an SPA allows our team
to understand which characteristics distinguish an SPA from a Non-SPA product based on
historical sales data. By identifying these attributes and further exploring their role in an
SPA relationship, we can not only identify potential SPA relationships but also investigate
the quality of Fastenal's current SPAs.
Our Method
Overview
Our team's overall goal was to identify potential SPA relationships that will maximize
Fastenal's profit margin. After cleaning and consolidating the scattered data into a dataset of
309,926 customer-product entries, we used a prediction algorithm to analyze when a logical
SPA should be formed. The prediction algorithm helps identify these possible SPA
customer-product relationships.
Classification
Classification is the process of categorizing an instance based on shared characteristics.
A simple classification example is the decision for a bank to award a loan. The goal is to classify
a loan applicant and their desired loan as an acceptance or denial. Characteristics describing
this loan applicant would include: income level, criminal record and years in present job. As
different values are associated to each of the characteristics, such as income level less than
$30,000 or 20 years in present job, the way in which the loan is classified as an acceptance or
denial changes. Similar to this concept, our team worked to classify a customer-product
relationship as an SPA or Non-SPA based on the characteristics of the relationship.
Over the course of 8 weeks, our team experimented with a series of classification
techniques. Our approach to each classification method aligned with one consistent goal: to
understand how an SPA customer-product relationship is classified based on the characteristics
of each relationship. The techniques we explored include: Clustering, Logistic Regression,
Neural Networks and Decision Trees. Decision Trees proved to be the most useful throughout
our approach because they allow us to visualize simply how parameter values have previously
led to the creation of an SPA based on sales data. Our team then used these values to identify
additional SPA opportunities.
What is a Decision Tree?
A Decision Tree is a predictive algorithm used to estimate the probability of one specific
target variable occurring based on the input of several predictor variables. In our case, the
target variable is whether a customer-product relationship is an SPA, yes or no. The predictor
variables are the attributes used to describe each relationship. With a flow-like structure, the
tree includes multiple branches, each representing the application of a predictor variable,
which lead to a leaf holding the probability of the target variable occurring. Shown in Figure 5
is a small part of a decision tree. At the top of the tree is the entire dataset; the data is then
split into multiple subsets based on whether each relationship agrees or disagrees with the
value assigned to each predictor variable.
Figure 5: Part of a Decision Tree.
The Decision Tree algorithm automatically selects, as its first predictor variable, the
variable that causes the most significant split in the data. Similarly, the value assigned to
each predictor variable is the threshold at which splitting the data yields the highest
improvement in classification.
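To make the idea of a "most significant split" concrete, here is a small sketch that searches for the threshold on one predictor that minimizes weighted Gini impurity. Gini impurity is an assumption on our part (it is the criterion used by CART-style trees); the report does not state which criterion the team's software used, and the quantities and labels below are made up.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_threshold(values, labels):
    """Return (threshold, weighted_impurity) of the best 'value < threshold' split."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))  # start from the impurity of the unsplit data
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold separates equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best

# Made-up quantities sold and SPA flags (1 = SPA, 0 = Non-SPA).
quantity = [50, 400, 900, 120_000, 151_000, 320_000]
is_spa = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_threshold(quantity, is_spa)
```

In this toy example the best threshold lies midway between 900 and 120,000, where both sides of the split become pure. A full tree would repeat this search over every predictor and then recurse into each resulting subset.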
As discussed previously, there are only 1,577 SPAs among the 309,926 customer-product
relationships in our data set. Because SPAs are so rare, it becomes harder to see the direct
influence each predictor variable has on separating them from the rest of the data. Therefore,
our team created a subset consisting of all 1,577 SPAs plus a random sample of 100,000
Non-SPA relationships to run through the Decision Tree. We are now analyzing a subset of
101,577 relationships rather than the entire 309,926, and the share of SPAs in this smaller
dataset is roughly 2% compared to 0.44% in the larger dataset. This not only speeds up the
algorithm but also makes it easier for the tree to determine how different predictor values
lead to an SPA cluster. Shown in Figure 6 is an example of a Decision Tree our team has
crafted.
Figure 6: SPA classification Decision Tree
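The subsampling step used to build the tree's input can be sketched as follows. This is an illustrative sketch on toy data, not the team's code; the "spa" key, the function names and the fixed random seed are assumptions.

```python
import random

def build_training_subset(relationships, n_non_spa, seed=0):
    """Keep every SPA row and a random sample of n_non_spa Non-SPA rows."""
    spa = [r for r in relationships if r["spa"] == 1]
    non_spa = [r for r in relationships if r["spa"] == 0]
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = rng.sample(non_spa, min(n_non_spa, len(non_spa)))
    return spa + sample

def spa_share(rows):
    """Fraction of rows that are SPAs."""
    return sum(r["spa"] for r in rows) / len(rows)

# Toy data: 10 SPA rows among 1,000 relationships (1%).
data = [{"id": i, "spa": 1 if i < 10 else 0} for i in range(1000)]
subset = build_training_subset(data, n_non_spa=90)
```

Here the SPA share rises from 1% in the full toy dataset to 10% in the subset, the same mechanism by which the share in the project data rose from 0.44% to roughly 2%.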
At the top of the tree is the dataset of 101,577 customer-product relationships; the first
decimal reveals the portion of relationships that have SPAs, which we have already found to
be 2%. As the data is run through the tree it is separated, or subsetted, based on how it
matches the criteria of the predictor variable. For example, if a product has a high cost less
than $0.20 it is sent to a group on the left, and if the high cost is greater than $0.20 it is
sent to the right. This process continues through multiple variables; some branches are
longer than others, with more predictor variables, resulting in a narrower subset.
The decision tree algorithm as shown in Figure 6 has now separated our one dataset
into nine subsets; each subset consists of a group of customer-product relationships that have
met the criteria in its corresponding tree branch. The decimals shown at the end of each branch
reveal the percentage of the products in that particular subset that have an SPA.
Although most Decision Trees are used as a way of predicting the probability of a
particular event occurring, our team used the Decision Tree model to analyze what predictor
values have already led to the occurrence of an event, or in our case the creation of an SPA.
Thus, this algorithm revealed how a path of different predictor variables and their corresponding
values lead to a cluster of the most SPA products. The four circled subsets in the Decision Tree
in Figure 6 are the subsets in this particular tree that have a majority of SPA products. Once
these SPA subsets were identified, we used those parameter values along the branch
describing each of the subsets to identify other products with similar attributes; thus revealing
potential SPAs.
Variables Selection with Random Forests
Our largest obstacle in creating a useful decision tree was deciding which variables
play the largest role in the creation of an SPA; once the most important variables were
identified, they could be used within the Decision Tree as predictor variables. If too many
variables were used in the tree, we would run into the issue of overfitting. Overfitting the
tree means there are too many branches separating the data due to irrelevant variables, so the
tree fails to reveal the underlying relationship of what ultimately contributes to the creation
of an SPA. Each customer-product relationship originally included 79 variables describing
the sale. Fastenal had already established that quantity sold is the most influential attribute
in creating an SPA, which is why we focused on only the top 1,000 customers in terms of
quantity. Outside of quantity sold, our team identified influential variables with the use of
an algorithm called Random Forest.
Random Forest is a method designed to address the issue of overfitting; it also provides
deeper insight into which variables in the data are significant. A Random Forest is a "forest"
of multiple decision trees; each tree works with a limited set of predictor variables, chosen
independently for each tree, and splits on those that most strongly influence the outcome
variable. We therefore created multiple Random Forests and ran them on different subsets of
the training data; each forest classifies a relationship by taking the mode of the predictions
of every tree in the forest. Training many weak learners on different samples of the data and
combining their predictions is called bagging; it generally decreases the variance of the model
without substantially increasing bias.
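The two ingredients of bagging mentioned above, resampling the training data and taking the mode of the trees' predictions, can be sketched as follows. The three one-rule "trees" and their thresholds are hypothetical stand-ins for trained decision trees and are for illustration only.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, as done for each tree in bagging."""
    return [rng.choice(rows) for _ in rows]

def forest_predict(trees, row):
    """Classify a row by the most common (modal) prediction among the trees."""
    votes = Counter(tree(row) for tree in trees)
    return votes.most_common(1)[0][0]

# Three hypothetical one-rule "trees"; the thresholds are illustrative only.
trees = [
    lambda r: int(r["quantity_sold"] > 100_000),
    lambda r: int(r["high_cost"] > 0.20),
    lambda r: int(r["low_cost"] > 0.33),
]
row = {"quantity_sold": 150_000, "high_cost": 0.50, "low_cost": 0.10}
prediction = forest_predict(trees, row)  # two of three trees vote 1 (SPA)

resampled = bootstrap_sample([row, row, row], random.Random(0))
```

In a real forest each tree would be trained on its own bootstrap sample, so individual trees disagree on some rows and the vote averages out their errors.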
In order to check the stability of the results, we created unique subsets to run the
Random Forest on. Due to our very low ratio of SPAs to Non-SPAs, 1:227, we decided it
would be more effective to take a random sample of Non-SPAs and combine it with all the SPAs
in order to create a training set with a ratio closer to 1 SPA to 3 Non-SPAs. Using this
process, we created 100 training sets, each containing all 1,364 SPAs and a random sample of
5,000 Non-SPAs. By training a Random Forest of 100 trees on each of the 100 subsets, we
obtained 100 trained models to analyze. Based on the results of each forest, we were able to
compare the consistency of the predictor variables chosen from forest to forest and thus
develop concrete evidence of which variables have proven to be the most influential in creating
an SPA.
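One simple way to compare consistency across forests is to count how often each variable ranks among the most important. The sketch below assumes each forest's result has already been reduced to a dictionary of importance scores; the variable names and scores are made up, and this is only one possible way to summarize the comparison described above.

```python
from collections import Counter

def top_variable_frequency(importances_per_forest, k=3):
    """Fraction of forests in which each variable ranks among the top k."""
    counts = Counter()
    for importances in importances_per_forest:
        ranked = sorted(importances, key=importances.get, reverse=True)
        counts.update(ranked[:k])
    n = len(importances_per_forest)
    return {name: c / n for name, c in counts.items()}

# Made-up importance scores from two hypothetical forests.
runs = [
    {"quantity_sold": 0.40, "high_cost": 0.30, "low_cost": 0.20, "high_sale": 0.10},
    {"quantity_sold": 0.35, "high_cost": 0.25, "low_cost": 0.10, "high_sale": 0.30},
]
freq = top_variable_frequency(runs, k=2)
```

A variable with a frequency near 1.0 is selected consistently regardless of which Non-SPA sample was drawn, which is the kind of stability the 100 training sets were meant to test.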
Following the creation of 100 Random Forests, our team produced 100 importance plots
graphing the influential variables used in each forest. An example of one of these importance
plots is shown in Figure 7 below. The variables are listed on the y-axis from most to least
significant, top to bottom, and the x-axis shows the variable importance.
Figure 7: Importance plot from one forest
The use of a Random Forest allowed us to produce concrete evidence of the most
influential SPA attributes; we then increased the accuracy of the Decision Tree by assigning
these attributes as predictor variables in the tree. Although the Random Forest model may be
more accurate, we were able to use this accuracy to our advantage and still produce a Decision
Tree, which lets us visualize how an SPA is categorized among the variables.
Results
The Decision Tree our team crafted allowed us to identify a specific set of
characteristics, more specifically parameter values, that contribute to the creation of
multiple SPA relationships. Once these values were identified, they were used to identify
possibilities for more SPA relationships.
The four circled subsets from the Decision Tree in Figure 6 are groups of products
consisting of over 59% SPA products; thus each subset is populated with a majority of SPA
products. A detailed table of the characteristics that classify each of these SPA subsets,
that is, the parameter values along the tree branch, is shown in Table 8 below.
Subset   High Cost   Quantity Sold   Low Cost   High Sale
#1       > $.20      > 118K          > $.33     --
#2       > $.20      > 302K          < $.33     --
#3       > $.20      12K – 118K      > $2.71    --
#4       > $.20      3K – 12K        > $2.71    > $7.93
Table 8: Variable values describing each SPA subset
The parameter values associated with each SPA subset were extracted and used to identify
products throughout the entire dataset of 309,926 customer-product relationships that lack an
SPA but match those same attributes. Based on the Decision Tree results in Figure 6 and the
data in Table 8, our team discovered 200 total customer-product relationships among the top
1,000 customers that should be SPAs. The number of customer-product relationships with SPA
potential associated with each of the four subsets is detailed in Table 9.
Subset from Tree   Potential SPA Relationships
#1                 42
#2                 15
#3                 52
#4                 91
Total              200
Table 9: Potential SPA relationships according to their subset.
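The search for potential SPAs can be pictured as applying the Table 8 thresholds as simple rules to every relationship that does not yet have an SPA. The sketch below is our illustration, not the team's code: the field names are hypothetical, the thresholds are the rounded values printed in Table 8, how exact boundary values are treated is an assumption, and the example rows are made up.

```python
def matching_subset(r):
    """Return the Table 8 subset number a relationship falls into, or None."""
    if r["high_cost"] <= 0.20:          # every subset requires high cost > $.20
        return None
    q, low, high_sale = r["quantity_sold"], r["low_cost"], r["high_sale"]
    if q > 118_000 and low > 0.33:
        return 1
    if q > 302_000 and low < 0.33:
        return 2
    if 12_000 < q <= 118_000 and low > 2.71:
        return 3
    if 3_000 < q <= 12_000 and low > 2.71 and high_sale > 7.93:
        return 4
    return None

def potential_spas(relationships):
    """List (name, subset) for Non-SPA relationships that match an SPA subset."""
    found = []
    for r in relationships:
        if r["spa"] == 0:               # existing SPAs are not candidates
            subset = matching_subset(r)
            if subset is not None:
                found.append((r["name"], subset))
    return found

# Made-up relationships for illustration only.
rows = [
    {"name": "a", "spa": 0, "high_cost": 0.50, "quantity_sold": 200_000, "low_cost": 0.40, "high_sale": 1.00},
    {"name": "b", "spa": 0, "high_cost": 9.00, "quantity_sold": 5_000, "low_cost": 3.00, "high_sale": 8.50},
    {"name": "c", "spa": 1, "high_cost": 0.50, "quantity_sold": 200_000, "low_cost": 0.40, "high_sale": 1.00},
    {"name": "d", "spa": 0, "high_cost": 0.10, "quantity_sold": 500_000, "low_cost": 0.05, "high_sale": 0.30},
]
candidates = potential_spas(rows)
```

Relationship "c" is skipped because it already has an SPA, and "d" fails the high cost condition shared by all four subsets.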
Based on the characteristics that describe Fastenal's established SPAs, 200 potential
SPA relationships have been identified within the top 1,000 customers alone. This same
process can be applied to the top 2,000 customers, 3,000 customers, and so on. Thus,
associated to each segment of top customers there exists a unique decision tree, unique
parameter values leading to SPA subsets and therefore a new set of potential SPAs to explore.
Validation
In order to validate why an SPA is so influential and how our discovery will benefit
Fastenal, we needed to compare one unique product when it is an SPA against the same
product when it is a Non-SPA. Thus, our team crafted a new data set consisting of 609 unique
products, each recorded both as an SPA and as a Non-SPA. This direct comparison among 609
products allowed our team to validate the profit difference when an SPA was formed.
Although an SPA reduces both the purchase and selling price of a unit, the increase in
quantity sold is how Fastenal has been able to make additional profit. Thus, the profit per
unit is often smaller with an SPA than without one. Based on further analysis, the average
profit per unit is $3.60 with an SPA versus $4.88 without an SPA. Although this may appear
undesirable, the quantity sold of that unit is drastically higher due to the creation of an
SPA, and in return the total product profit is significantly greater. When the unit quantity
sold is increased with a customer, the average total product profit with that customer is
$920,000 with an SPA versus an average of $89,201 without an SPA. Thus, when comparing a
product that lacks an SPA to that same product with an SPA, the average profit increase for
a product line and its corresponding customer is $830,799.
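The arithmetic behind these figures can be checked directly. The short sketch below uses only the averages quoted in the text; the break-even ratio is our own derived illustration and, because it compares averages, should be read as indicative rather than exact.

```python
# Averages quoted in the text (dollars).
avg_total_profit_spa = 920_000      # average product profit with an SPA
avg_total_profit_non_spa = 89_201   # average product profit without an SPA
unit_profit_spa = 3.60              # average profit per unit with an SPA
unit_profit_non_spa = 4.88          # average profit per unit without an SPA

# Average profit increase per product line and customer.
profit_increase = avg_total_profit_spa - avg_total_profit_non_spa

# Volume multiple at which the lower SPA unit profit breaks even with the
# Non-SPA unit profit: selling more than this multiple yields more total profit.
break_even_volume_multiple = unit_profit_non_spa / unit_profit_spa
```

The difference reproduces the $830,799 stated above, and the ratio suggests that an SPA needs to sell roughly 1.36 times the Non-SPA volume before its lower unit profit is offset.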
Shown in Figure 10 is a scatter plot describing the 609 products. There are 1,218 total
points because each product has 2 points: one describing the product as an SPA, in blue, and
one describing the product without an SPA, in red. Each point is characterized by quantity
sold on the X-axis versus profit per item on the Y-axis; by taking the log of quantity and
profit for each item we are able to scale the axes and analyze individual points more closely.
The plot also includes a blue SPA regression line and a red Non-SPA regression line
representing the average relationship between quantity and profit per unit. Both slopes are
negative due to the consistent decrease in unit price as the quantity of the unit increases.
Each point shows relative profit compared to the points around it. The grey area marks where
the potential SPAs would lie among the existing data and shows the average profit increase
from offering an SPA to those relationships.
Figure 10: Quantity vs. Profit per Unit comparison of SPA and Non-SPA products.
The Decision Tree our team crafted in Figure 6 revealed potential SPAs with a high but
broad range of quantity sold per item. Subset #4 from the tree contains 91 potential SPAs;
it is also the subset of potential SPAs with the lowest threshold for quantity sold, 2,830 units.
When this quantity sold threshold is compared to the plot in Figure 10, it falls just to the
right of the intersection of the two regression lines. This supports our finding that an SPA
should be created, because for existing products profit has been greater for SPAs when the
item sold had a quantity greater than 2,830 units.
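The crossing point of two regression lines on a log-log plot can be computed in closed form. The sketch below shows the calculation with hypothetical slopes and intercepts; these coefficients are not fitted to the project data and serve only to illustrate how a crossover quantity would be obtained.

```python
def crossover_quantity(slope_a, intercept_a, slope_b, intercept_b):
    """Quantity at which two lines in log10-log10 space intersect.

    Each line has the form log10(profit) = slope * log10(quantity) + intercept.
    """
    if slope_a == slope_b:
        raise ValueError("parallel lines never cross")
    log_q = (intercept_b - intercept_a) / (slope_a - slope_b)
    return 10 ** log_q

def line_value(slope, intercept, log_q):
    """Evaluate a line at a given log10(quantity)."""
    return slope * log_q + intercept

# Hypothetical (slope, intercept) pairs, not fitted to the project data.
spa_line = (-0.20, 1.0)
non_spa_line = (-0.50, 2.0)
q_star = crossover_quantity(*spa_line, *non_spa_line)

# To the right of the crossing, the flatter SPA line lies above the Non-SPA line.
spa_above_right = line_value(*spa_line, 4.0) > line_value(*non_spa_line, 4.0)
```

With these made-up coefficients the lines cross at a quantity of about 2,154 units; a threshold to the right of the crossing sits where the SPA line is the higher of the two.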
Having explored the profit comparison for SPAs versus their Non-SPA counterparts, we next
examined the robustness of our original Decision Tree model. Since our original model was
based only on the top 1,000 customers by quantity purchased, we decided to create a model
with the next 1,000 customers to see whether the Decision Trees would show similar trends.
We chose this approach because the data for the next 1,000 customers looked very similar to
that of the first 1,000 customers, with about 310K Non-SPAs and 1.5K SPAs after cleaning the
data. The resulting model can be seen below in Figure 11.
Figure 11: Decision Tree of the next 1,000 customers.
This Decision Tree shows patterns similar to the original tree, which we can observe in
the three circled areas in Figure 11. First, the primary split seen in both Figure 11 and
Figure 6, the initial tree, seems to filter out any products with a very low high cost and very
low quantity sold. Next, Group 1 in Figure 11 represents SPAs with a high quantity sold and a
relatively low cost; a similar grouping can be seen in Figure 6 in SPA subsets #1 and #2.
Finally, in Group 2 of Figure 11, there are 3 subsets of SPAs that have a medium-high quantity
sold with a higher sale price; similar characteristics are seen in SPA subsets #3 and #4 of
Figure 6. When comparing the models more carefully, it can be seen that the original model
with the top 1,000 customers generally created SPA subsets that were less expensive and had a
higher quantity sold. This makes sense, since that subset contained the customers that
purchased the most quantity; meanwhile, the second model had SPAs that were generally more
expensive, which is consistent with the fact that they moved lower volume in terms of
quantity sold. It also suggests that it may be more effective to create a different model
depending on the data instead of a single catch-all model for creating SPAs.
Conclusion
Over the course of the last few months our team has been searching for potential
customer-product relationships that would logically benefit from an SPA. The creation of an SPA
not only directly increases Fastenal's profit by increasing the quantity sold of a product, but
also indirectly benefits the inventory flow of each distribution center. The first step our team
took in discovering potential SPAs was to clean, analyze and extract information from extensive
sales data; we could then use the clean data to identify the ways in which Fastenal's current
SPAs are classified. In order to understand their classification, our team first needed to
identify which attributes describing a customer-product sale have proven to be the most
influential in creating an SPA. Evidence of the top attributes was obtained through the use of
Random Forests; we could then use these attributes to further analyze the characteristics that
classify an SPA.
After exploring three other classification techniques over the course of two months,
our team settled on a technique involving Decision Trees. The benefit of a Decision Tree is
that it allowed us to clearly visualize how attribute values contribute to the likelihood of a
product having an SPA, and thus how these attributes classify an SPA. By using the tree to
identify clusters of products with many SPAs, our team was able to identify 200 logical,
potential SPAs that we recommend Fastenal pursue. The results include not only a list of
potential SPAs associated with a subset of data, but also a concrete yet versatile method for
evaluating sales data and identifying possible SPAs among any subset of sales data.
Future Work
In addition to identifying SPA possibilities, Fastenal's profit could be increased by
eliminating SPAs that have a low quantity sold and thus are less profitable. Decision Trees
have assisted our team in identifying profitable SPAs, but the trees can also be used to
identify poor SPAs. Within each Decision Tree there are numerous small subsets containing a
small portion of SPAs; these subsets are filled with customer-product relationships that have a
low quantity sold and provide less profit. Ideally, the SPAs found in these subsets should either
be reevaluated to increase their quantity sold or have their contracts terminated.
Based on the analysis of just the top 1,000 customers, 200 potential SPAs have been
identified; these 200 potential SPAs are associated with a small portion of the sales data
Fastenal has at hand. The method our team crafted to produce these results can be taken
multiple steps further to discover hundreds of additional SPAs associated with various subsets
of data. When the next thousand customers are run through the algorithm, a unique tree will be
produced, along with the tree's unique SPA subsets and thus a unique set of customer-product
relationships that could be potential SPAs. This process can be repeated many times using
our team's method, due to the numerous subsets that can be extracted from the millions of
sales records. Although our team has discovered 200 potential SPAs among just a portion of the
data, the versatile method we crafted will allow Fastenal to analyze and explore hundreds more
customer-product relationships that could substantially increase its profit.