Optimal Product Line Design: Genetic Algorithm Approach to

C 2006)
JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 131, No. 2, pp. 227–244, November 2006 (
DOI: 10.1007/s10957-006-9135-3
Optimal Product Line Design: Genetic Algorithm
Approach to Mitigate Cannibalization
G. E. FRUCHTER,1 A. FLIGLER,2
AND
R. S. WINER3
Communicated by G. Leitmann
Published Online: 11 November 2006
Abstract. In this marketing-oriented era where manufacturers maximize
profits through customer satisfaction, there is an increasing need to design
a product line rather than a single product. By offering a product line, the
manufacturer can customize his or her products to the needs of a variety of
segments in order to maximize profits by satisfying more customers than a
single product would. When the amount of data on customer preferences or
possible product configurations is large and no analytical relations can be established, the problem of an optimal product line design becomes very difficult
and there are no traditional methods to solve it. In this paper, we show that the
usage of genetic algorithms, a mathematical heuristics mimicking the process
of biological evolution, can solve efficiently the problem. Special domain operators were developed to help the genetic algorithm mitigate cannibalization
and enhance the algorithm’s local search abilities. Using manufacturer’s profits as the criteria for fitness in evaluating chromosomes, the usage of domain
specific operators was found to be highly beneficial with better final results.
Also, we have hybridized the genetic algorithm with a linear programming
postprocessing step to fine tune the prices of products in the product line.
Attacking the core difficulty of cannibalization in the algorithm, the operators
introduced in this work are unique.
Key Words. Genetic algorithms, heuristics, product line design, cannibalization, pricing, marketing.
1. Introduction
In this marketing-oriented era where manufacturers maximize profits through
customer satisfaction, there is an increasing need to design a product line rather
1 Associate
Professor, Graduate School of Business Administration, Bar-Ilan University, Ramat-Gan,
Israel.
2 Vice President of Product Management, Olista, Netanya, Israel.
3 Professor, Stern School of Business, New York University, New York, NY.
227
C 2006 Springer Science+Business Media, Inc.
0022-3239/06/1100-0227/0 228
JOTA: VOL. 131, NO. 2, NOVEMBER 2006
than a single product. By offering a product line, the manufacturer can customize
his or her products to the needs of a variety of segments in order to maximize
profits by satisfying more customers than a single product would. We define a
product to be a set of attributes where each attribute can have different levels.
Thus, a product line will be a set of different attribute level configurations.
In offering a product line to achieve maximal profit, the manufacturer needs
to carefully choose the variety of product configurations and their prices given the
products’ costs and customers’ willingness to pay. The main problem in designing
a profit-maximizing product line is to target the right product to the right customer.
A customer wants to optimize the utility from purchase by choosing a product that
maximizes his surplus. In contrast, the manufacturer wants to maximize profits by
extracting the customer’s surplus. However, in an open market, this strategy leads
to cannibalization where customers with high willingness to pay may choose products designed for customers with low willingness to pay. The classical approach
to avoid cannibalization is to reprice the product configurations incorporating the
customer’s willingness and wants. This method is possible when we have a small
set of product configurations and customer segments or alternatively when analytical relations can be established between customer preferences and product
configurations. However, in reality, when the amount of data on customer preferences or possible product configurations is large and no analytical relations can be
established, the problem of an optimal product line design becomes very difficult
and there are no traditional methods to solve it. In fact, the product line design
problem has been shown to belong to a group of problems known in computer
science as NP-complete (Ref. 1). Though it has not yet been proven, it is assumed
that NP-complete problems have a time complexity that is an exponential function
of the problem size. Thus, usage of heuristics or a nondeterministic algorithm is a
common practice for their solution.
The problem of product design has been studied heavily in the literature over
the past 20 years. Heuristics to solve the single product design problem have been
proposed in Ref. 2 using a dynamic programming approach and in Ref. 3 using a
genetic algorithm. Reference 3 solved the product design problem using a genetic
algorithm for the share of choices, seller’s returns, and buyer’s welfare objectives
working with part-worth data. They report their results on a set of randomly
generated datasets and compare the genetic algorithm’s performance with the
dynamic programming heuristics of Ref. 1. Their work showed the advantage of
the genetic algorithms over the dynamic programming approach in solving the
problem of single product design.
Reference 4 was among the first to model the product line design problem.
They presented a heuristic algorithm to solve the seller’s return and buyer’s welfare problems. Reference 5 described a method to solve the product line design
problem using an integer programming method with a model that follows Ref. 4.
Reference 6 presented a heuristics-based approach to the product line design
JOTA: VOL. 131, NO. 2, NOVEMBER 2006
229
problem that extended the model developed by Ref. 4. Reference 7 followed a
dynamic programming-like approach using a model similar to that of Ref. 4. The
ideas of Ref. 7 were extended by Ref. 8 using a beam search method that is based
on a breadth-first search algorithm without backtracking. Reference 9 used the
mathematical formulation of Ref. 7 to test a method using a genetic algorithm.
They showed that genetic algorithms have a significant potential to solve product
line design problems.
Thus, there are two different approaches to attack the product line design
problem. The first approach considers that first a finite set of reference products
is obtained from which a product line is selected (Refs. 4–6). If the number of
attributes and attribute levels is large and if most attribute level combinations
define feasible products, it can be computationally infeasible to enumerate the
utilities of the candidate items. For this class of problems, it is preferable to use
the second approach, which constructs product lines directly from part-worth data.
The methods proposed in Refs. 7–9 use the second approach.
In this paper, we follow the idea of employing genetic algorithms, a mathematical heuristics mimicking the process of biological evolution, to the problem
of the optimal product line design, considering the first approach. Refraining from
working directly with part-worth data in our chromosome genes, we succeeded
in having shorter chromosomes and no dependency on the products’ utility evaluation method, which leads to a more effective search process. Special domain
operators were developed to help the genetic algorithm mitigate cannibalization
and enhance the algorithm’s local search abilities. Using the manufacturer’s profits
as the criteria for fitness in evaluating chromosomes, the usage of domain specific
operators was found to be highly beneficial with better final results. Also, we have
hybridized the genetic algorithm with a linear programming postprocessing step to
fine tune the prices of products in the product line. Attacking the core difficulty of
cannibalization in the algorithm, the operators introduced in this work are unique.
The rest of the paper is organized as following. The product line design problem is formally defined in Section 2. In Section 3, we describe the methodology.
In Section 4, we test the methodology on a real-world scenario as well as on
simulated data sets. In Sections 5, we summarize our findings.
2. Optimal Product Line Design Problem
We consider a manufacturer who wants to design a product line to a heterogeneous market of I customer segments which differ in their preferences in choosing
a product. Let Ni be the number of customers in segment i, i = 1, . . . , I . For simplicity we assume Ni = 1, i = 1, . . . , I ; instead of segment i, we will simply say
customer i.
230
JOTA: VOL. 131, NO. 2, NOVEMBER 2006
A product is represented by a set of attributes and each attribute has a different number of levels. If m is the total number of the attributes and attribute
k ∈ {1, . . . , m} has qk levels, then we have a total of Q = q1 × q2 × · · · × qm
attribute level combinations or possible product configurations. A certain product
configuration j, j = 1, 2, . . . , Q, is a potential new brand targeted to satisfy one
or more customers. Let us assume that the manufacturer’s unit cost associated
with a product configuration j is cj .
We assume that the manufacturer’s goal is to maximize its profit by offering
a set of product configurations which satisfy the customer’s needs. Let pj be the
price of brand j and let yj be a variable indicating whether brand j is offered or
not, i.e.,
1, if brand j is offered,
(1)
yj =
0, else.
The manufacturer’s goal is to determine which brands to offer and at what price.
Each customer may or may not choose a brand and different customers
may choose the same brand. In choosing a product, we assume that each customer
maximizes the surplus from buying the product and that she/he restricts the quantity
of purchase to a single product. By customer surplus, we mean the difference
between the customer’s willingness to pay and the actual price. Let uij be customer
i’s willingness to pay for brand j.
Since several customers may choose the same brand, when designing a product line, i.e. determining which brand and at which price to offer, the manufacturer
needs to carefully consider each customer’s choice to avoid the cannibalization
problem. In other words, it could happen that brands designed at higher prices for
certain customers will be left on the shelf in favor of brands designed at lower
prices that would leave money in the customer’s pocket.
To state more formally this problem, let xij be a variable indicating whether
customer i buys brand j , i.e.,
1 if customer i buys brand j,
(2)
xij =
0, else.
The manufacturer’s problem is to select the right customer with the right brand,
xij , i = 1, . . . , I and j = 1, . . . , Q, and the corresponding price pj to brand j ,
to maximize the profit given that each customer i maximizes his surplus from
buying the brand j . This optimal policy must then be the solution of the following
optimization problem:
Max
x ij ,pj
Q
I i=1 j =1
(pj − cj )xij ,
(3a)
JOTA: VOL. 131, NO. 2, NOVEMBER 2006
s.t.
Q
x ij ≤ 1,
i = 1, . . . , I,
231
(3b)
i=1
Q
(uij − pj )x ij ≥ (uij − pj )y j
j =1
x ij ≤ y j ≤
x ij ,
j = 1, . . . , I,
(3c)
j =1
(uij − pj )x ij ≥ 0,
pj > cj ,
Q
i = 1, . . . , I and j = 1, . . . , Q,
j = 1, . . . , Q,
I
x ij ,
i = 1, . . . , I and j = 1, . . . , Q.
(3d)
(3e)
(3f)
i=1
In the optimization problem (3) we have five constraints. Constraint (3b) guarantees that each customer will buy at most one product; constraint (3c) guarantees
that this product will maximize his surplus (self-selection condition); constraint
(3d) asserts that customer will only buy a product with a nonnegative surplus.
Constraint (3e), guarantees that the offered products are priced above the cost
(exclusion condition). Constraint (3f) ensures that the customers buy only offered
products and that a product will be offered only if it is being bought by at least
one customer.
A solution to problem (3) specifies which brands to offer in the product line
and at which prices.
The product line design problem has been shown to be NP-complete (Ref.
10) and the problem cannot be solved optimally. A full enumeration of the possible
product lines is possible for only the smallest problem sizes. Indeed, for each brand
j, j ∈ {1, . . . , Q}, we have Ij ≤ I possible customers with reservation prices
above product j ’s cost. Thus, the search space for finding the price and the products
to offer to this market could contain in the worst-case I Q possible pricing schemes.
Furthermore, to avoid cannibalization, we allow repricing. Thus, the price pj of
brand j can have intermediate values between the customer reservation prices.
Thus, the search space could contain in the worst case (I )Q possibilities, where
I > I . Clearly, as the number of possible product configurations and customers
grows, the size of the problem space grows exponentially. Due to the problem’s
complexity and hardness, we have resorted to the usage of a hybridized genetic
algorithm in order to solve it.
3. Application of Genetic Algorithms to the Product Line Design Problem
The concept of genetic algorithms (GA) belongs to the family of evolutionary
algorithms, mathematical optimization methods inspired by the processes of natural evolution. Invented in Ref. 11 and developed later in Refs. 12 and 13, GA have
232
JOTA: VOL. 131, NO. 2, NOVEMBER 2006
received great popularity in domains such as manufacturing, robotics, engineering, logistics and CAD. Here, we evaluate and extend the standard GA approach
with respect to the above stated problem of the optimal product line design using
customer preference data evaluated from a conjoint analysis experiment and given
the cost data for the potential brands in the product line.
GA works on the coding of strings (chromosomes), particular string positions
or substrings corresponding to variables (genes), which in turn can take on a
number of values (alleles). Then, each string as whole represents a candidate
solution of the underlying optimization problem.
String Specification. To connect the GA approach with our model, each
chromosome (string) describes which brand j to offer and at which price pj , j =
1, 2, . . . , Q. The brand j is specified by the gene yj that takes the value 1 if the
brand is offered and 0 if not. The price pj is specified by two predefined numbers
of the algorithm,4 {λ, ε}, where λ ≥ 1 is a constant integer and 0 < ε ≤ 1, as well
as a pair of genes (αj , βj ). The first gene in the pair (αj , βj ) indicates the index
number of the customer whose reservation price is above the brand j’s cost. Recall
that there are Ij such customers; thus, αj ∈ {1, . . . , Ij } and each uαi ,j > cj . The
second gene βj ∈ {0, 1, . . . , λ}. Given the fixed numbers {λ, ε} and the genes’
values (αj , βj ), the price pj of brand j is determined by
γ1 uαi ,j + γ2 uαi −1,j , γ2 = βj /λ and γ1 = 1 − γ2 ,
if αj > 1,
pj =
δ1 uαi ,j + δ2 cj ,
δ2 = γ2 (1 − ε) and δ1 = 1 − δ2 , if αj = 1.
(4)
Formula (4) guarantees that the price pj of brand j will take values that are
either at the reservation price (willingness to pay) of one of the customers, uαj ,j ,
where αj ∈ {1, . . . , Ij }, or at some intermediate value between two consecutive
reservation prices, uαj ,j and uαj −1,j , or intermediate values between the lowest
reservation price above the cost, in case αj = 1, and the cost cj .
The chromosome that is the candidate for the underlying optimization problem is composed of a string of length 3 × Q where Q is the total number of possible
product configurations. Thus, each string is composed of Q triplets (triplet ≡ three
genes), where the triplet j is (yj , αj , βj ). The problem of which brands to offer in
the product line to maximize a profit is highly correlated with their prices. Thus,
the brand genes yj are grouped, along the price genes αj , βj , in triplets, in order
to increase the probability that the three genes will flow together under crossover
operations.
As an example, let assume that we have a product line of laptops, represented
by two attributes (memory and screen), where the first attribute (memory) has
two levels (256MB, 512MB) and the second (screen) has also two levels (high
4 They
determine the granularity of intermediate values of reservation prices.
JOTA: VOL. 131, NO. 2, NOVEMBER 2006
233
resolution, low resolution). Thus, in this case, we have Q = 4 possible product
configurations (brands);
Brand 1:
Brand 2:
Brand 3:
Brand 4:
{memory 256MB, high resolution screen},
{memory 256MB, low resolution screen},
{memory 512MB, high resolution screen},
{memory 512MB, low resolution screen}.
Let the costs of the corresponding brands be {2, 3, 6, 8}. Assume that we have two
customers and that the reservation prices are
u11 = 3, u12 = 4, u13 = 6, u14 = 9, u21 = 2, u22 = 5, u23 = 7, u24 = 8.
Let λ = 5 and ε = 0.5. Then, an example of a chromosome representing a possible
product line would be 012 123 114 013, which corresponds to a product line
consisting from Brand 2 {memory 256 MB, low-resolution screen} with price
p2 = (2/5) 5 + (3/5) 4 = 4.4 and Brand 3 {memory 512 MB, high-resolution
screen} with price p3 = 0.6 × 7 + 0.4 × 6 = 6.6.
Using the chromosome coding described above, an initial chromosome population is created by assigning to each chromosome a uniformly distributed random
value for every gene position based on the gene type (brand or price gene) and
triplet number, according to the following rules:
(i) For the first gene in the j th triplet, a value of 0 or 1 is randomly generated.
(ii) For the second gene in the j th triplet, a value in the range of 1, . . . , Ij
is randomly generated.
(iii) For the third gene in the j th triplet, a value in the range of 0, 1, . . . , λ is
randomly generated.
Evaluation. To solve the profit maximization problem (3), we have applied
the standard genetic algorithm life cycle evolutionary process (see Ref. 16 for
a detailed summary of the standard process), which includes fitness evaluation,
selection, crossover, and mutation. In this process, by exchanging high-value
combinations of genes from the best-fitted chromosomes in the population, the
algorithm generates better solutions along generations.
Due to the genetic algorithm’s stochastic nature, product line setting in the
chromosome is random and might be nonoptimal. With this in mind, we have
improved the standard process by introducing three deterministic operators called
domain-specific operators5 (DSOs) before going through the standard fitness evaluation process. We describe the steps of the extended algorithm below. The first
three steps represent the new operators.
5 Those
are operators that improve the chromosomes with respect to specific challenges of the product
line design problem.
234
JOTA: VOL. 131, NO. 2, NOVEMBER 2006
Step 1. Cannibalization Avoidance Operator. The objective of this operator
is to avoid possible cannibalization created due to the suboptimal and stochastic
nature of the GA. The operator first calculates for a customer i, i = 1, . . . , I ,
the price pj that will force self-selection,6 where j = 1, . . . , Q. Let 0 be the
profit generated by the GA before applying this operator; let ij , j = 1, . . . , Q,
be the profits of the manufacturer generated from forcing customer i to buy brand
j, j = 1, . . . , Q, by the self-selection process. As repricing brand j can change
the other customer’s l = i preferences and decrease or increase the profit of the
manufacturer 0 , the operator automatically will chose the maximal profit between
0 and the possible profits i1 , . . . , iQ . Let i = max{i1 , . . . , iQ , 0 }. This
step is repeated iteratively for each customer i ∈ {1, . . . , I }. The operator will end
with a result = max{1 , . . . , I }.
Step 2. Market Diversification Operator. The objective of this operator is
to avoid not offering a brand due to the GA suboptimal and stochastic nature.
The operator attempts to find, for each customer that was not selected by the
standard GA and the previous DSO, the cannibalization avoidance operator (CAO),
a potential brand and price, from the nonoffered brands so far, thus with yj = 0, that
maximizes the profit of the manufacturer. This should be done without affecting the
preferences of the selected customers so far. Thus, this operator deals with market
diversification by offering new products to a new market without cannibalizing
the existing products.
The operator first sets for a customer i in the complementary set of the current
market, say I, the profit-maximizing price to brand j, pj = uij , for each j in the
be the
Let ij , j ∈ Q,
complementary set of the current offered brands, say Q.
corresponding profits of the manufacturer generated from offering the new brands
to customer i. Offering new brands at new prices can change the other customers’
l = i preferences and decrease or increase the profit of the manufacturer ; the
operator automatically will chose the maximal profit, say D
i , between and
This step is repeated iteratively for each customer
the possible profits ij , j ∈ Q.
The operator will end with a result D that is the maximum across all
i in Q.
D
i , i ∈ I.
Step 3. Price Improvement Operator. The objective of this operator is to avoid
too low price settings due to the suboptimal and stochastic nature of the GA. Moreover, since the price settings generated by the cannibalization avoidance operator
depends also on the GA outputs, the current operator is applied to both the GA and
cannibalization avoidance operator deficiencies. In practice, the operator works on
extracting more customer surplus, thus again improving the manufacturer’s profit
without changing the preferences of the customers.
6 In
a similar manner as the approach of Ref. 14, i.e. reprice brand j such that the customer i’s surplus
from buying brand j will be maximal.
JOTA: VOL. 131, NO. 2, NOVEMBER 2006
235
The operator is applied iteratively on each product j in the offered set according to the following procedure: first, the operator calculates for each customer i
j
in Ij , the maximum possible decrease in his surplus, say Di , without changing
the customers’ preferences. In order to keep the whole market of customers, the
j
operator will choose among all Di the smallest decrease, i.e., the value
j
j D j = min D1 , . . . , DIj .
The operator will end with a price improvement,
pj = max pj + D j , min uij .
i∈Ij
Step 4. Fitness Evaluation. Once the chromosomes go through the domain specific operators, they are improved and the algorithm evaluates each
improved chromosome’s fitness by calculating the manufacturer’s profit given
in (3a). For this step, the chromosome undergoes a change in the brand
gene. If yj takes the value 1 and there is i such that xij = 1, for this customer, the brand j yields a nonnegative maximizing surplus. yj is then set
to 1 or else it is reset to 0. Considering our example above and assuming
that the Steps 1–3 did not change the specific chromosome, then after the fitness evaluation step, the chromosome becomes: 012 123 014 013. Thus, y3 is
changed from 1 to 0 because no customer buys brand 3 and y2 remains with
the value 1 because Customer 2 will buy brand 2 (indeed, for Customer 1
u12 − p2 = 4 − 4.4 < 0, u13 − p3 = 6 − 6.6 < 0, thus this customer will not buy
any brand; for Customer 2, u22 − p2 = 5 − 4.4 = 0.6, u23 − p3 = 7 − 6.6 = 0.4,
thus this customer will buy brand
2). In this way, the fitness of the chromosome
can be evaluated by the formula Q
j =1 (pj − cj )xij . In our example, the fitness
will be (p2 − c2 )1 = 4.4 − 3 = 1.4. The higher the fitness, the higher the probability that the product line represented by this chromosome will be part of the next
generation population or the final solution.
Step 5. Selection. After all chromosomes have been evaluated for their fitness
value, we go on to generate the next generation of chromosomes. First, a certain
preconfigured percentage of the highest-fitted chromosomes are copied as is to
the next generation. The operation called elitism in GA terminology is intended to
preserve the highest quality solutions generated thus far by the algorithm. Second,
using a rejuvenation process, a certain preconfigured percentage of the lowest
fitted chromosomes are replaced by randomly generated new chromosomes. The
rejuvenation process enhances the search by randomly creating new candidate
solutions that do not necessarily exist in the population. Third, a certain preconfigured percentage of chromosomes are selected for reproduction using a tournament
selection scheme. In the tournament selection, several chromosomes are selected
randomly from the population and the chromosome with the highest fitness value
236
JOTA: VOL. 131, NO. 2, NOVEMBER 2006
is selected for reproduction. Some of the chromosomes selected for reproduction
go through the process of crossover and mutation as described next.
Step 6. Crossover and Mutation. In the reproduction step, chromosomes
are selected in couples and go through a crossover operation for the creation of
offspring with a probability Pcross . We tried several crossover schemes and the
multiple location one was found to be the best for the problem at hand. In the
multiple location crossover, two chromosomes exchange their genes at location j
with a certain probability Ploc . In our context, by using the crossover operator, we
can change the offered product line and the associated products’ prices through the
exchange of product and price genes between two chromosomes. After two siblings
have been generated using the crossover operator, each goes through the process
of mutation. The probability of going through mutation for each gene location has
been set to equal 1% in our implementation. The gene going through mutation is
set a randomly generated value according to the rules in (i)–(iii) mentioned above.
The genetic algorithm iteratively goes through the process described above
until the stopping condition described next holds true.
Step 7. Stopping Condition and Postprocessing Linear Programming Optimization. The genetic algorithm stops when either the number of GA generations
is higher than a preconfigured limit or when the population is considered converged. We say that the population is converged if the difference in fitness of the
best chromosome in the population has not changed for more than a preconfigured
number of consecutive generations. Once the GA stops, the chromosomes with
the best fitness value in the final population are selected as the algorithm’s final
result.
Our genetic algorithm generates a solution that defines the brands to be offered and their prices. Since it is a probabilistic algorithm, the genetic algorithm is
not robust in fine-tuning the optimal price. Using the deterministic price improvement operator in Stage 3 caused the genetic algorithm to converge prematurely
as it frequently enlarges the existing differences in fitness between solutions and
decreases the variance of chromosomes in the population. Furthermore, the linear
model definition grows in size as the number of products or customers gets larger.
Thus, using this price Improvement operator for every chromosome in the population is prohibitive in running time and memory usage. Thus, we have used the
linear programming method to further improve the final solution for price setting.
Namely, the brands to be offered and the assignments of products to customers are
already defined by the final solution and only optimal prices given the assignments
are left to be found by the linear optimization.
Before the final result is generated, the best solution is always optimized with
respect to product pricing by using linear programming optimization. Examining
our mathematical formulation, it is important to note that, when the xij and yj
values in (3) are given, namely, when the assignment of which customers buy
which brands, the problem becomes linear and can be represented and solved for
JOTA: VOL. 131, NO. 2, NOVEMBER 2006
237
optimal pricing using the linear programming approach. The linear programming
model of finding the optimal pricing when the xij and yj values are given is:
Max
(pj − cj )xij ,
(5)
pj
s.t.
i∈{1,...,I } j ∈{1,...,Q}
(uiki − pki ) ≥ (uij − pj ),
uiki − pki ≥ 0,
i ∈ {1, . . . , I },
i ∈ {1, . . . , I },
j ∈ {1, . . . , Q}
(6)
(7)
where ki defines the index of the brand that customer i is buying in the final
genetic algorithm’s solution. Namely, ki = j if xij = 1 in the genetic algorithm’s
final solution.
We have thus used the linear programming method as means for local optimization of the genetic algorithm’s final solution. By local optimization, we mean
that the brands to be offered and the assignments of brands to customers are already defined by the GA’s final solution; given the assignments, only the optimal
prices are left to be found by the linear optimization.
4. Testing the Methodology
Our methodology includes testing the genetic algorithm on a dataset from a
real-world scenario. As we will show, the data on customers’ reservation prices
for all possible product configurations is generated through a conjoint analysis
study applied to data collected on customers’ preferences on part of the product
configurations. Full enumeration of the possible product lines was not feasible (in
our case, 61144 = 1.22 × 10257 possible product lines in the worst-case scenario).
Thus, to be able to asses how good is the optimal solution, we simulated a second
data set that allows an analytic approach [as in Moorthy (Ref. 14)]. To assess our
methodology, we compare the analytical result for the optimal solution with the
result generated by our methodology. Next, we describe our data sets.
Real Data Set. The product category used to test the performance of our new
algorithm was laptop computers. Assuming a general target market of computer
users, customers understand the attributes of a laptop computer and have the
ability to evaluate their preferences or utility from such a product. Furthermore,
real costs can be calculated easily for the different product configurations. In
defining the number of product attributes, we had to strike a balance between the
size of the product space and the limitations of conjoint analysis processing. On
the one hand, we wanted to define a product space as large as possible to have
a representative problem space with respect to problem complexity. On the other
hand, as the number of attributes gets higher, the conjoint analysis questionnaire
becomes longer and harder for respondents to answer. It is common in conjoint
analysis questionnaires to have no more than 6–15 attributes with each having 3–6
options (levels).
238
JOTA: VOL. 131, NO. 2, NOVEMBER 2006
Table 1.
Laptop computer attributes and options.
Attribute
Option I
Option II
Option III
Memory
Processor speed
Hart disk size
Internal 56k modem
Recordable CD-Rom drive (CD-R)
Screen size and resolution
32 megabytes
400 megahertz
6 gigabytes
No
No
15 XGA with
1024 × 768
resolution
64 megabytes
500 megahertz
12 gigabytes
Yes
Yes
16 XGA with
1400 × 1050
resolution
128 megabytes
650 megahertz
Conjoint analysis is concerned with understanding how people make tradeoffs
of product attributes. By looking at how people made selections between a limited
number of theoretical product profiles involving different tradeoffs, we are able
to predict accurately which choices would be made between previously untested
products. Respondents are presented with a set of representative products each
having different combinations of attribute levels. Respondents are then required
to evaluate their reservation price for the specific combination, namely, how much
they would be willing to pay for a product with the specific combination of features.
The price of the product with the minimal features is presented as a baseline.
Through the conjoint analysis process, we can then calculate the reservation prices
values for the limited product sets and the utility and importance of each attribute
level of every product for every customer.
We have defined a product space with 6 attributes each having 2–3 options
as described in Table 1.
Thus, in this case, the number of possible product configurations is Q =
3 × 3 × 2 × 2 × 2 × 2 = 144.
Issuing questionnaires as part of a university marketing course generated
customer reservation prices for our model. The 61 respondents had an understanding of both the problem and the conjoint-analysis approach. Questionnaires were
analyzed with the conjoint analysis software, and the part-worth values for every
attribute level for each of the 144 possible products (brands) were calculated for
each customer. Since our algorithm does not process part-worth data explicitly but
rather work with complete product utility values to measure customers’ preferences, we have precalculated the utility of each product for every customer from
its part-worth values. It is important to note that our algorithm has no dependency
on the specific method used to evaluate the customers’ preferences and will work
with any method that generates the products’ utility values. The conjoint analysis
questionnaire is described in Ref. 16.
Simulated Data Set. In order to assess how good the genetic algorithm performs in finding the optimal product line and profits, we have tested its performance on a simulated data set based on the model of Moorthy (Ref. 14). By
JOTA: VOL. 131, NO. 2, NOVEMBER 2006
239
using simulated data, we had total control on the problem characteristics such as
customer reservation prices or marginal costs as well as the ability to calculate
analytically the optimal profits and product line configuration. Having the optimal
result at hand, we had a tool to assess the performance of the genetic algorithm in
solving the problem.
The simulated data is based on a model of a heterogeneous market of I
customers; each customer i ∈ {1, . . . , I } has a different reservation price for each
brand. There are Q different brands and each brand j, j ∈ {1, . . . , Q}, has a cost
cj . We have Iˆj = I possible reservation prices uij , i = 1, . . . , Iˆj . We assume
uij = Bj × i and cj = κ × (Bj )2 , where 0 ≤ κ ≤ 1 is a predefined constant and
Bj a variable that needs to be found. The sequence Bj , j = 1, . . . , Q, is assumed
to be monotonically increasing.
Following the approach of Ref. 14, the prices are set to avoid cannibalization
and brand j is designed to customer j ; thus, it is assumed that Iˆj = I = Q.
Thus, p1 = u11 = B1 , p2 = u22 − (u21 − p1 ) = 2B2 − (2B1 − B1 ) = 2B2 − B1 ,
and so on pj = ujj − (ujj −1 − pj −1 ). Therefore, pj is a function of the values
B1 , B2 , . . . , Bj or pj = f (B1 , B2 , . . . , Bj ). The manufacturer’s problem becomes
f (B1 , B2 , . . . , Bj ) − κBj2 ,
(pj − cj ) =
Max =
B1 ,B2 ,...,BQ
j ∈{1,...,Q}
j ∈{1,...,Q}
∗
) is a solution of
where f is known. Thus, the optimal strategy (B1∗ , B2∗ , . . . , BQ
the Q equations
f (B1 , B2 , . . . , Bj − κBj2 ) = 0, j = 1, . . . , Q.
(∂/∂Bj ) =
j ∈{1,...,Q}
For example, if Q = 3, we obtain
B1∗ = (1/2)κ, B2∗ = (1/2)κ, B3∗ = (3/2)κ.
Negative B1∗ means that the manufacturer should not offer this brand under the
optimal product line. Thus, the optimal product line for the manufacturer is to
offer the products
B2∗ = −(1/2)ε, B3∗ = (3/2)ε
at prices (3/2)ε and (9/2)ε, respectively.
Using this model, we have generated 3 sets of simulated data sets with
different problem sizes, (Q = 30, I = 30), (Q = 150, I = 150), (Q = 250, I =
250). In addition, we calculated the optimal solution for each problem size and
used them to asses the genetic algorithm performance. We have used κ = 0.1 in
these calculations.
Simulated vs. Real Data Set. We have used the simulated data set as means
to evaluate the proximity of the results generated by the genetic algorithm to the
240
JOTA: VOL. 131, NO. 2, NOVEMBER 2006
optimal solution value. As apparent from comparing the reservation price graphs
in the simulated and real data sets, and in retrospect by inspecting our results, it
is important to ask if the results on the simulated data set are predictive for the
genetic algorithm’s performance on the real data set.
Several points of interest spring to mind when analyzing the simulated and
real data.
(1) The simulated data set has an optimal solution with no cannibalization
between products. Also, in the optimal solution of the simulated data set,
the number of products and the number of customers is equal and every
customer that is being offered a product is targeted with a different one. In
the real data set, this is not necessarily the case, as in the optimal solution
many customers may buy the same product. Thus, the simulated data is
more sensitive to product pricing and introduction.
(2) The customers population in the simulated data set has a symmetric structure that is absent in the real data set. All customers have the same pattern
of reservation price graphs with only the inclination of the graphs being
different. Also, all products have positive reservation prices for all customers. Contrary to that, in the real data set, the reservation price patterns
of customers are totally different and have random distribution patterns.
Some of the products have negative or below product cost reservation
prices for some customers, while some of the customers do not have a
single product with a reservation price above any product cost. This fact
can make the simulated data less sensitive to changes with respect to the
number of buying customers or number of bought products.
(3) The simulated data were generated in a way that there is a one-to-one
relation between the customer’s space and the product’s space. For every
customer, there is a parallel product configuration. Thus, if we want to
check a 300-product configuration problem, we need to have 300 customers. In the real data model, there is no correlation between the number
of product configurations and the number of customers.
(4) The products in the simulated data set are defined based on the customer’s
reservation price graphs. In the real data set, product configurations definitions are agnostic to the customer’s reservation prices. Furthermore, the
customer’s reservation prices are defined after the product configuration
space has been defined.
The differences between the simulated and real data sets, make the simulated
data at times more sensitive and at times less sensitive to changes in prices and product configurations than in the real data set. More specifically, the simulated data
have an internal order within their structure that prevents us from treating it as general examples for the product line design problem’s data sets. Thus, care should be
taken when using the simulated data as predictions for the algorithm performance
JOTA: VOL. 131, NO. 2, NOVEMBER 2006
241
on real data sets. We use the simulated and real data sets in conjunction in our
experiments. In most cases, we use the simulated data set as other examples to
show the general characteristic of the genetic algorithm. At other times, we use the
simulated data as the main data set, mainly when we like to test the problem scalability or we need reference to the optimal solution’s value. In the sequel, we further
describe how the differences in the genetic algorithm’s performance on the real and
simulated data sets can be attributed to the differences in their internal structure.
Optimal Product Line for the Real Data Set. Implementing our algorithm
to the laptop study resulted in a unique solution for the optimal product line
design problem. The product line includes 15 brands (out of 144 possible product
configurations), a market of 32 customers (out of the total 61 customers), and a
total profit is $12,431. None of the 129 nonoffered brands could be offered to the 29
nonbuying customers without cannibalizing existing products in the product line.
The best chromosome in the initial population had a profit value of $1,816; thus,
our result shows an increase of 684% in fitness. The genetic algorithm reached
99% of the final result fitness by the 54th generation. The optimal product line
profile is described in Table 2.
A key observation, emphasized in Table 2, is that several attributes (like 64M
Memory, 6GB Hard Drive Space, No Modem, High Resolution Monitor) were
found to be popular with the majority of customers and therefore were common
in most of the offered product configurations.
Results of the Simulated Data Set. The genetic algorithm generated good
results on the simulated data set. For a small data set with 30 possible product
configurations, the algorithm generated a profit value which was 99% of the
optimal expected result. All customers in the market were offered a product in
the final product line. 10 products out of 30 possible product configurations were
offered as part of the best product line. For a big data set with 250 possible
product configurations, the algorithm generated a profit value which was 97% of
the optimal expected result. All customers in the market were offered a product
in the final product line. 15 products out of 250 possible product configurations
were offered as part of the final product line.
The fact that all customers were satisfied by the solution can be accredited
to the special structure of the simulated data set, where every customer had an
above-cost reservation price from every product. Thus, there are many product
lines combinations that cater to the tastes of all customers. It is important to
remember that, in order to minimize cannibalization, the optimal solution of the
simulated data set is structured such that customer i buys product i. In our final
solutions, almost all of the satisfied customers bought products with an index
different than their own. Furthermore, 40% of the customers in the 30 strong data
set and 10% in the 150 strong data set bought products with indexes smaller than
their own, namely, products that cannibalized the products they were supposed to
buy in the optimal solution. In the 150 problem size, a mere 2% of the customers
Appearance
117
105
99
97
83
81
75
73
71
65
60
52
49
11
1
Brand
13%
√
32M
√
60%
27%
53%
√
√
√
√
√
33%
√
14%
√
√
√
√
√
√
√
√
√
650 MHz
√
√
√
√
√
√
√
500 MHz
66%
√
√
√
√
√
√
√
√
√
6 GB
√
24%
√
√
√
√
√
12 GB
HD
Optimal product line profile.
Processor
√
√
√
400 MHz
√
√
128M
√
64M
Memory
Table 2.
14%
√
√
Y
86%
√
√
√
√
√
√
√
√
√
√
√
√
N
√
Modem
46%
√
√
√
√
√
√
√
Y
54%
√
√
√
√
√
√
√
N
√
CD
87%
√
√
√
√
√
√
√
√
√
√
√
√
High
√
13%
√
√
Low
Screen
242
JOTA: VOL. 131, NO. 2, NOVEMBER 2006
JOTA: VOL. 131, NO. 2, NOVEMBER 2006
243
bought their optimal product. Furthermore, 87% of the customers bought products
with a higher index number, forcing the higher-value product to be priced at a
lower reservation price.
Because all products have above-cost reservation price for all customers,
although our solutions diverged considerably from the optimal product line configuration, the degradation in the profit value was minimal. Since solutions with
totally different configurations in the final configuration had near optimal values,
we can postulate that the fitness landscape in the simulated data set has many local
maxima, making it hard for the genetic algorithm to find the optimal solution.
5. Conclusions
This research presents a genetic algorithm tailored specifically to solve the
product line design problem. Furthermore, the algorithm is implemented on a
real-world scenario. The algorithm used a coding that reflected which brands to
offer and at which price. In contrast to previous studies where product lines were
constructed directly from part-worths data obtained by conjoint analysis, we have
refrained from working directly with part-worth data in our chromosome genes.
In this way, we succeeded in having shorter chromosomes and no dependency on
the products’ utility evaluation method, an approach that leads to a more effective
search process. Special domain operators were developed to help the genetic algorithm mitigate cannibalization and enhance the algorithm’s local search abilities
to improve fitness criteria (in our case, the profits of the manufacturer). The usage
of domain-specific operators was found to be highly beneficial, with better final
results. Also, we have hybridized the genetic algorithm with a linear programming
postprocessing step to fine tune the prices of products in the product line. Attacking the core difficulty of canibalization in the algorithm, the operators introduced
in this work are unique.
We have tried our algorithm on both simulated and real data sets. Also, it
appears that applying our algorithm to a particular product line problem design,
the single product design (see Ref. 17), we find a very interesting feature of the
product line design problem. We find that there exists a core of high quality
attribute levels. These high quality attributes’ levels appear with high probability
in brands that are part of the final solution of both a single product and a product
line. A comparison of our algorithm’s performance with a known state of the art
heuristics (Ref. 6) shows its superiority (see Ref. 17).
References
1. KOHLI, R., and KRISHNAMURTI, R., Optimal Product Design Conjoint Analysis: Computational Complexity and Algorithms, European Journal of Operational Research,
Vol. 40, pp. 186–195, 1989.
244
JOTA: VOL. 131, NO. 2, NOVEMBER 2006
2. KOHLI, R., and KRISHNAMURTI, R., A Heuristic Approach to Product Design, Management Science, Vol. 33, pp. 1523–1533, 1987.
3. BALAKRISHNAN, P. V., and JACOB, V. S., Genetic Algorithm for Product Design, Management Science, Vol. 42, pp. 1105–1117, 1996.
4. GREEN, P. E., and KRIEGER, A. M., Models and Heuristics for Product Line Selection,
Marketing Science, Vol. 4, pp. 1–19, 1985.
5. MCBRIDE, R. D., and ZUFRYDEN, S. F., An Integer Programming Approach to the
Optimal Product Line Selection Problem, Marketing Science, Vol. 7, pp. 126–140,
1988.
6. DOBSON, G., and KALISH, S., Heuristics for Pricing and Positioning a Product Line
Using Conjoint and Cost Data, Management Science, Vol. 39, pp. 160–175, 1993.
7. KOHLI, R., and SUKUMAR, R., Heuristics for Product Line Design Using Conjoint
Analysis, Management Science, Vol. 12, pp. 1464–1478, 1990.
8. NAIR, S. K., THAKUR, L. S., and WEN, K., Near Optimal Solutions for Product Line
Design and Selection: Beam Search Heuristics, Management Science, Vol. 41, pp.
767–785, 1995.
9. ALEXOUDA, G., and PAPARIZOS, K., A Genetic Algorithm Approach to the Product
Line Design Problem Using the Seller’s Return Criterion: An Extensive Comparative
Computational Study, European Journal of Operational Research, Vol. 134, pp. 165–
178, 2001.
10. DOBSON, G., and KALISH, S., Positioning and Pricing a Product Line, Marketing
Science, Vol. 7, pp. 107–125, 1988.
11. HOLLAND, H. J., Adaptation in Natural and Artificial Systems, MIT Press, Cambridge,
Massachusetts, 1992.
12. GOLDBERG, E. D., Genetic Algorithms in Search, Optimization, and Machine Learning,
Addison-Wesley, Reading, Massachusetts, 1989.
13. MITCHELL, M., An Introduction to Genetic Algorithms, MIT Press, Cambridge,
Massachusetts, 1996.
14. MOORTHY, K. S., Market Segmentation, Self-Selection, and Product Line Design, Marketing Science, Vol. 3, pp. 288–307, 1984.
15. BLICKLE, T., and THIELE, L., A Comparison of Selection Schemes Used in Genetic Algorithms, TIK-Report No. 11, Swiss Federal Institute of Technology, Zurich, Switzerland,
1995.
16. FLIGLER, A., Product Line Design Using a Genetic Algorithm, Thesis, Technion Library,
2003.
17. FRUCHTER, G. E., and FLIGLER, A. On the Performance of a Genetic Algorithm
Approach to Product Line Design, Working Paper, 2005.