Selecting a small number of products for effective user

Expert Systems with Applications 37 (2010) 3055–3062
Selecting a small number of products for effective user profiling
in collaborative filtering
Hyung Jun Ahn *, Hyunjeong Kang, Jinpyo Lee
College of Business Administration, Hongik University, 72-1 Sangsu-Dong, Mapo-gu, Seoul 121-791, Republic of Korea
Article info
Keywords:
Collaborative filtering
Product selection
User profiling
Information theory
Abstract
Collaborative filtering (CF) is one of the most widely used methods for personalized product recommendation at online stores. CF predicts users’ preferences on products using past data of users such as
purchase records or their ratings on products. The prediction is then used for personalized recommendation so that products with highly estimated preference for each user are selected and presented. One of
the most difficult issues in using CF is that it is often hard to collect a sufficient amount of data for each user
to estimate preferences accurately enough. In order to address this problem, this research studies how we
can gain the most information about each user by collecting data on a very small number of selected
products, and develops a method for choosing a sequence of such products tailored to each user based
on metrics from information theory and correlation-based product similarity. The effectiveness of the
proposed methods is tested using experiments with the MovieLens dataset.
© 2009 Elsevier Ltd. All rights reserved.
1. Introduction
Many Internet stores and shopping malls provide personalized
recommendation services to help customers find products and
information easily for efficient shopping. Unlike the early days of
electronic commerce when only a small number of businesses used
such personalization techniques, now not only major Internet
stores but also many smaller ones in diverse sectors provide such
services in many different ways. The personalization services
typically recommend products of potential interests to users
whenever a product is clicked, purchased, or put into a shopping
cart. Many stores also provide separate personal pages customized
to each user where all relevant information, advertisements, and
products are presented to users based on estimated users’
preferences.
There are a variety of methods used for personalized recommendation, and one of the most popular ones is collaborative filtering (CF) (Ahn, 2006; Cohen & Fan, 2000; Greco, Greco, & Zumpano,
2004; Herlocker, Konstan, Terveen, & Riedl, 2004; Konstan et al.,
1997). CF uses customers’ past purchasing records or their ratings
on products to calculate similarity between customers. The calculation is then used to find similar users to a given user, and their
ratings on a given product are used to estimate the preference of
the given user for the given product. That is to say, CF recommends
the products that have been highly rated or purchased often by
those who are similar to the given user.
* Corresponding author. Tel.: +82 2 320 1730; fax: +82 2 322 2293.
E-mail address: [email protected] (H.J. Ahn).
0957-4174/$ - see front matter © 2009 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2009.09.025
The effectiveness of CF has been proven by many studies and
there are many cases of successful adoption of CF in practice such
as by Amazon.com (Cohen & Fan, 2000; Greco et al., 2004; Herlocker et al., 2004; Konstan et al., 1997; Lee, Kim, & Choi, 2003; Vezina
& Militaru, 2004), but there are some limitations and problems.
Among the shortcomings, this paper addresses the difficulty of
collecting sufficient ratings data for each user for effective recommendation (Cylogy, 2005; Maltz & Ehrlich, 1995; Middleton, Alani,
Shadbolt, & De Roure, 2002). This is a significant problem because,
in practice, it is usually costly and difficult to collect sufficient data
for all users. For example, there is always a large portion of new
or inactive customers, and it is usually very disruptive to users to
force them to rate many products.
There have been some studies that have tackled this issue by,
for instance, using additional content-related information of products to supplement the insufficient data (Huang, Chen, & Zeng,
2004; Li & Kim, 2003; Park, Pennock, Madani, Good, & DeCoste,
2006; Salter & Antonopoulos, 2006; Schein, Popescul, Ungar, &
Pennock, 2002), or improving the similarity measure of CF for the
cold-start conditions (Ahn, 2008). Different from these approaches,
this research aims to develop methods of selecting a sequence of a
small number of products about which ratings data will be collected for each user. The sequence will be different for each user,
and hence needs to be tailored to each user, depending on whether
and how each user rates products presented to them. By doing so,
we can help Internet stores gain more information about users
with less disruption to them. The methods are developed using
many measures from information theory and correlation-based
product similarity. The methods are tested with experiments using
the MovieLens dataset which has ratings on many movies by many
users.
This paper is organized as follows: in Section 2, a brief review of
related work is presented. Section 3 presents the product selection
methods. Section 4 presents the details of the experiments along
with the discussion of the results. Section 5 presents the summary
and the conclusion.
2. Review of related work
2.1. Personalized recommendation
There have been numerous studies on recommender systems
which can be classified into two types: ones that develop and test
new recommendation methods, and the others that investigate
empirically the factors affecting the usefulness of recommendation
systems, or the effects of using recommendation systems on consumer purchasing processes.
Regarding the first type of research, we can further classify the
methods developed so far very broadly into collaborative filtering
and content-based filtering methods. The biggest difference between the two is the type of data used for recommendation (Burke,
2002). CF systems typically use ratings data or purchasing records
for calculating the similarity between users or products to estimate
preferences of each user (Ahn, 2006; Konstan et al., 1997; Linden,
Smith, & York, 2003; Resnick, Iacovou, Suchak, Bergstrom, & Riedl,
1994). On the other hand, content-based methods use content data
such as product description or keywords to find the match between users and products (Adomavicius, Sankaranarayanan, Sen,
& Tuzhilin, 2005; Ahn & Kim, 2006; Kim, Yum, Song, & Kim,
2005; Li, Lu, & Xuefeng, 2005; Melville, Mooney, & Nagarajan,
2002; Mirzadeh, Ricci, & Bansal, 2005). In general, CF systems that
use users’ rating data are known to produce better results. More
detailed classification and the reviews of recommender systems
can be found in (Adomavicius & Tuzhilin, 2005; Burke, 2002).
The empirical studies focus on the impacts of recommender
systems on customer behavior, or finding the factors that influence
the usefulness of recommender systems for users, usually through
experiments using real or experimental Internet stores and human
participants. For example, Komiak and Benbasat (2006) studied the impacts of personalization and familiarity on the effectiveness of recommendation systems using a virtual shopping
experiment for products such as notebook computers and desktop
PCs. Tam (2006) also
investigated how personalization affects the cognitive processes
and decision making of customers with experiments involving human participants buying PDAs and downloading music files. A very comprehensive review of this type of study can be
found in the work by Xiao and Benbasat (2007).
This study is based on the widely studied CF method that specifically uses the similarity between users for product recommendation. There are other variations of the CF method including the
ones that use the similarity between products instead (Sarwar,
Karypis, Konstan, & Riedl, 2001), where a recommendation is made
for the products that bear more similarity to the products that have
been preferred by the given user in the past. There are also many
hybrid systems that combine the strengths of more than one method such as those that use both ratings data and content information. More on various hybrid systems can be found in Burke (2002).
2.2. Cold-starting and data sparsity
Despite the popularity of CF systems, it is well known that CF systems do not perform well when there are insufficient user ratings on products. More specifically, CF systems have been found to have cold-starting and data sparsity problems, where the former
refers to the difficulty of recommending new products or recommending to new
users, and the latter to the impaired performance under the sparsity
of the ⟨user × product⟩ rating matrix. There have been many approaches to these problems, the most notable of which is to develop hybrid recommendation systems that use the content
information of products and/or customers together with the insufficient ratings data to circumvent the difficulties (Huang et al.,
2004; Li & Kim, 2003; Park et al., 2006; Salter & Antonopoulos,
2006; Schein et al., 2002). There was also a study that tried to improve the performance of CF systems under cold-start conditions
by devising a new similarity measure for CF (Ahn, 2008).
This study is also related to the general issue of cold-starting,
but its approach is different from the above studies in that it focuses on finding out which products can give us the most additional
information about each customer, while the cold-start studies try
to use various types of available information more effectively to
complement the insufficient data.
Another group of studies related to this work are
those that present instance selection methods (Yu, Xu, Ester, &
Kriegel, 2003; Zeng, Xing, Zhou, & Zheng, 2004). These studies develop techniques to reduce the size of the datasets used for recommendation by selecting a certain subset of them, mainly in order to reduce the
memory requirement for recommendation, speed up the processing,
or improve the accuracy of recommendation. Although these
studies provide some useful insights, the focus of this study is very
different in that this work aims to find the products with which we
can gain maximum information about each user, while the instance selection studies are concerned with reducing the amount
of the data we already have for improving recommendation
performance.
2.3. Information theory
Information theory is a branch of mathematics originally developed for the field of electronic communication in order to quantify
information so that many properties of communications such as
transmission rates, data capacity, and communication reliability
could be calculated effectively (MacKay, 2003; Reza, 1994).
Information theory defines various metrics of
information which can be very useful for this study as well. The
most important metric of information theory for this study is the
entropy, which measures the amount of uncertainty associated
with a random variable. The entropy of a random variable
$X$ is defined as follows:
$$H(X) = E[I(X)] = -\sum_{x} p(x) \log p(x),$$
where $I(X)$ is the amount of information of $X$, also called the self-information of $X$. This can be extended to define a conditional entropy $H(X \mid Y = y)$ of $X$ given an outcome $y$ of a random variable $Y$ as
follows:
$$H(X \mid Y = y) = E[I(X \mid Y = y)] = -\sum_{x} p(x \mid Y = y) \log p(x \mid Y = y).$$
The entropy and conditional entropy measures introduced above
are used in this study to estimate the amount of information about
a user we can gain by acquiring the rating of some product by the
user.
3. Effective selection of products for user profiling
3.1. Research objective
As we have seen in the review, personalization based on CF
cannot effectively work and produce accurate prediction about
user preferences without sufficient data. Moreover, collecting
users’ ratings on products is often difficult because customers
may regard the process of rating many products as disruptive or even
as a breach of privacy. Therefore, the goal of this research is to investigate how we can gain as much information as possible about customers with minimum inconvenience to them by asking them
to provide their preferences on only a small number of reference
products. More formally, the objective of the research is:
To develop methods of selecting a small number of products
sequentially for each user so that we can collect and use the
user’s ratings on the products for effective personalized recommendation of products based on the CF method.
Readers should note that the sequence is different for different
users because the selection of a product for a user is dependent
upon the products the user has already rated and the user’s ratings
on them.
3.2. Overview of the research method and measure components
The goal of this research can be translated into creating a
method with which we can select a product that provides most
additional information about a given user. Using the terms of the
information theory, this approach can be described simply as the
diagram in Fig. 1. Suppose that the user has already rated $k$ products ($M_k$) and we have $I_k$ amount of information about the given
user (the left side of Fig. 1). The question is how to select the next
product $m_{k+1}$ among the product candidates (the right side of
Fig. 1) in order to maximize the amount of information $I_{k+1}$.
We can formally describe this approach as follows. First, $I_k$, the
self-information from some user’s ratings on $k$ products, can be
written as:
$$I_k = I(r_{m_1} = v_1, r_{m_2} = v_2, \ldots, r_{m_k} = v_k),$$
where $r_{m_k}$ is the variable representing the user’s rating on $m_k$, and $v_1, v_2, \ldots, v_k \in W$ are values within the rating scale $W$.
Therefore, the objective is to find the next product, $m_{k+1}$, that
can maximize the additional information:
$$I_{k+1} - I_k.$$
Here, the additional information can be written as:
$$I_{k+1} - I_k = -\sum_{w \in W} p(r_{m_{k+1}} = w \mid r_{m_1} = v_1, r_{m_2} = v_2, \ldots, r_{m_k} = v_k) \log_2 p(r_{m_{k+1}} = w \mid r_{m_1} = v_1, r_{m_2} = v_2, \ldots, r_{m_k} = v_k).$$
For simplicity, we let $M_k$ denote the user $u$’s ratings on the $k$ products as:
$$M_k = \langle r_{m_1} = v_1, r_{m_2} = v_2, \ldots, r_{m_k} = v_k \rangle.$$
Now, the additional information can also be re-written simply as:
$$I_{k+1} - I_k = -\sum_{w \in W} p(r_{m_{k+1}} = w \mid M_k) \log_2 p(r_{m_{k+1}} = w \mid M_k) = H(r_{m_{k+1}} \mid M_k).$$
Hence, we can see that the objective of the research is formally
summarized as finding $m_{k+1}$ for each user that maximizes the conditional entropy given $M_k$ for the user, that is:
$$m_{k+1} = \arg\max_{m'} H(r_{m'} \mid M_k).$$
Therefore, this approach can be most effectively implemented if the
conditional entropy of a candidate product $m'$ could be found. However, in practice, in order to calculate the conditional entropy
$H(r_{m'} \mid M_k)$, we need to be able to estimate the joint probabilities
for all possible combinations of ratings on $k + 1$ products from the
sample ratings data, which is often practically infeasible due to data
insufficiency. For example, when we have 1000 movies with the ratings in integers between 1 and 5, for $k = 3$, we need to estimate the
joint probabilities for
$${}_{1000}C_{4} \times 5^4 = \frac{1000!}{4!\,(1000 - 4)!} \times 5^4 \approx 2.59 \times 10^{13}$$
combinations to compute the conditional entropies.
Considering these constraints, we indirectly develop heuristic
methods using the following components and their combinations:
– Component 1: We can use marginal entropies instead. That is,
$H(r_{m'}) = -\sum_{w \in W} p(r_{m'} = w) \log_2 p(r_{m'} = w)$ is used for choosing
products, assuming that ratings on products with high entropies
will, on average, give more information about the user.
– Component 2: Because it is difficult to estimate $H(r_{m_{k+1}} \mid M_k)$, use
$H(r_{m_{k+1}} \mid r_{m_i} = v_i)$ instead $(i = 1, 2, \ldots, k)$. We can easily estimate
$H(r_{m_{k+1}} \mid r_{m_i} = v_i)$ from sample data.
– Component 3: We can use the distance between products. That
is, we can assume that products with larger distance from a
given set of products will provide more additional information
compared with those that are closer to the given products. We
will use the Pearson’s correlation as the distance measure
between products.
In addition to the above, we can also utilize the level of exposure of each product:
– Component 4: The level of exposure of each product can be utilized since, with other conditions equal, products with higher
exposure can be more useful in finding similar customers to a
given one, which is important since the CF method estimates
the preference of a given user based on the similarity. We
denote the level of exposure of a product $m'$ as
$$e(m') = \frac{\text{\# of users who rated } m'}{\text{total \# of users}}.$$
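As an illustration of Components 1, 3, and 4, the following minimal sketch estimates the marginal entropy, the exposure, and the correlation-based distance from a small ratings table. The dictionary layout `toy_ratings[user][product]`, the function names, and the fallback values for undefined cases are assumptions made for this example only; they are not taken from the original experiments.

```python
import math
from collections import Counter

# Hypothetical toy data: toy_ratings[user][product] = rating on a 1-5 scale.
toy_ratings = {
    "u1": {"A": 5, "B": 1, "C": 3},
    "u2": {"A": 1, "B": 5},
    "u3": {"A": 5, "B": 1, "C": 3},
    "u4": {"A": 1},
}

def product_ratings(ratings, product):
    """All ratings given to `product`, one per user who rated it."""
    return [r[product] for r in ratings.values() if product in r]

def marginal_entropy(ratings, product):
    """Component 1: H(r_m) in bits from the empirical rating distribution."""
    values = product_ratings(ratings, product)
    n = len(values)
    if n == 0:
        return 0.0  # assumption: an unrated product yields no information
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def exposure(ratings, product):
    """Component 4: e(m) = (# of users who rated m) / (total # of users)."""
    return len(product_ratings(ratings, product)) / len(ratings)

def correlation_distance(ratings, a, b):
    """Component 3: 1 - Pearson correlation over users who rated both products.

    Product means are taken over all raters of each product.
    """
    common = [r for r in ratings.values() if a in r and b in r]
    if not common:
        return 1.0  # assumption: no co-ratings is treated as zero correlation
    all_a = product_ratings(ratings, a)
    all_b = product_ratings(ratings, b)
    mean_a = sum(all_a) / len(all_a)
    mean_b = sum(all_b) / len(all_b)
    num = sum((r[a] - mean_a) * (r[b] - mean_b) for r in common)
    den = math.sqrt(sum((r[a] - mean_a) ** 2 for r in common)) * \
          math.sqrt(sum((r[b] - mean_b) ** 2 for r in common))
    if den == 0:
        return 1.0  # assumption: undefined correlation is treated as zero
    return 1.0 - num / den
```

In this toy table, product A splits users evenly between two rating values (one bit of entropy), while product C always receives the same rating (zero entropy) and is seen by only half of the users.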
3.3. Selecting the first products
Selecting the first product for a user is relatively easy because,
in this case, we do not have to consider the relationship between
already-chosen products and a candidate product for estimating
the additional amount of information. Therefore, the following
methods that combine the components were used to choose the
first products.
A. ENT method: The first product $m_1$ is selected such that $m_1$ is
the product with the largest entropy. That is,
$$m_1 = \arg\max_{m'} H(r_{m'}).$$
B. EXP method: The first product $m_1$ is selected such that $m_1$ is
the product with the largest exposure. That is,
$$m_1 = \arg\max_{m'} e(m').$$
C. ENT × EXP method: The product of ENT and EXP is used. That
is,
$$m_1 = \arg\max_{m'} \left[ H(r_{m'}) \times e(m') \right].$$
D. ENT + EXP method: The sum of ENT and EXP is used. That is,
$$m_1 = \arg\max_{m'} \left[ H(r_{m'}) + e(m') \right].$$
Fig. 1. Overview of the research method.
3.4. Selecting products after the first product
From the second product on, we should consider the products
that have already been rated by a user for additional information.
Using the components introduced in Section 3.2, the following
methods were used for selecting the products from the second
product:
A. Average Conditional Entropy (ACE) method: The product
with the largest average conditional entropy for already-selected products is chosen. In other words, a product is
selected if it can give, on average, the most additional information to each of the already-selected products. That is,
$$m_{k+1} = \arg\max_{m'} \frac{\sum_{i=1}^{k} H(r_{m'} \mid r_{m_i} = v_i)}{k}.$$
B. ENT method: The marginal entropies of each product are
used as they are. That is,
$$m_{k+1} = \arg\max_{m'} \frac{\sum_{i=1}^{k} H(r_{m'})}{k},$$
which reduces to $\arg\max_{m'} H(r_{m'})$ because the summand does not depend on $i$.
C. COR method (using Pearson’s correlation): The Pearson’s
correlation among products is used as a measure of the distance between the products. The distance is formally defined
as:
$$\mathrm{Dist}(m_i, m_j) = 1 - \frac{\sum_{u \in U} (r_{u,m_i} - \bar{r}_{m_i})(r_{u,m_j} - \bar{r}_{m_j})}{\sqrt{\sum_{u \in U} (r_{u,m_i} - \bar{r}_{m_i})^2}\,\sqrt{\sum_{u \in U} (r_{u,m_j} - \bar{r}_{m_j})^2}},$$
where $r_{u,m_i}$ is the rating of user $u$ on product $m_i$, $\bar{r}_{m_i}$ is the average
rating on product $m_i$ by all users, and $U$ is the set of all users
who have rated both $m_i$ and $m_j$. Therefore, using the average
distance with $k$ products,
$$m_{k+1} = \arg\max_{m'} \frac{1}{k} \sum_{i=1}^{k} \mathrm{Dist}(m_i, m').$$
D. Largest Minimum Correlation Distance (LMC) method: Similar to the COR method, the correlation distance is used, but
this time, the maximum of the minimum distances among a
candidate product and already-chosen products is used. That
is, only the minimum correlation distance (MCD) between a
candidate product and already-selected ones is considered,
and the candidate product which has the largest MCD is
selected, assuming that the product has the least similarity
with already-selected products. Hence,
$$m_{k+1} = \arg\max_{m'} \left( \min_{i} \{ \mathrm{Dist}(m_i, m') \mid i = 1, 2, \ldots, k \} \right).$$
E. LMC+E method: The sum of LMC and exposure is used. That
is,
$$m_{k+1} = \arg\max_{m'} \left( \min_{i} \{ \mathrm{Dist}(m_i, m') \mid i = 1, 2, \ldots, k \} + e(m') \right).$$
F. EXP method: Simply, the product with the largest exposure
is selected. That is,
$$m_{k+1} = \arg\max_{m'} e(m').$$
In addition to the above, more combinations and hybrids were
tried, but, for simplicity and readability, only those with better or
illustrative results are presented in the paper.
3.5. Random choice
A random choice method was used as a baseline strategy for
comparison with the methods introduced so far. Although readers
may regard the random method as quite an ineffective one, it
might perform well as $k$ increases, because it can evenly choose
random products without any possible bias.
3.6. Collaborative filtering
For the recommendation experiments using the product selection methods presented so far, among many variations of the CF
method, this paper uses the user-based CF method, which predicts
the preference of a given user based on those of similar users. The
prediction of the rating on product $i$ by user $u$ is made using the
following formula (Ahn, 2008; Konstan et al., 1997):
$$r'_{u,i} = \bar{r}_u + \frac{\sum_{t \in T} \mathrm{sim}(u, t)\,(r_{t,i} - \bar{r}_t)}{\sum_{t \in T} |\mathrm{sim}(u, t)|}, \qquad (1)$$
where $\bar{r}_u$ is the average rating of user $u$ on all products, $\mathrm{sim}(u, t)$ is
the similarity between users $u$ and $t$, and $T$ is the set of all reference
users.
There are several similarity measures applicable to the above
formula such as the traditional Pearson’s correlation or cosine
(Herlocker et al., 2004; Konstan et al., 1997). However, a recent
study has proposed a new measure called PIP which shows much
better performance when the number of available ratings is very
small (Ahn, 2008). PIP measures the similarity effectively even with a very
limited number of co-ratings, based on the three factors
of similarity: proximity, impact, and popularity. This study also
uses PIP for most of the experiments. Readers are referred to the
original article for the details of PIP, which are not repeated here. For the measure of prediction accuracy,
the Mean Absolute Error (MAE) between predicted and actual ratings was used (Herlocker et al., 2004).
4. Experiments
4.1. Overview of the experiments
In order to test the effectiveness of the suggested methods, a
series of experiments were performed using a subset of the
widely-used MovieLens dataset that contains ratings on movies
by many users. The ratings are on a scale of 1–5, where 5 represents the strongest preference, and 1 the least. Table 1 summarizes
the details of the dataset used for the experiments.
Table 2 summarizes the experiments that use the methods
introduced in Section 3. Broadly, the experiments are of three
types: ⟨1⟩ selecting the first products, denoted as Best 1 selection,
⟨2⟩ selecting next products after the first ones, denoted as Best K
selection, and ⟨3⟩ comparing the use of two similarity measures,
PIP and COR. Also seen in the table is that experiments with random selection are performed together with each experiment for
comparison. Except for the last experiment, the PIP measure was
used for all the experiments.

Table 1
Description of the dataset.

| Item | Value |
| --- | --- |
| Name | MovieLens dataset (MovieLens, 2005) |
| Description | Ratings of movies by anonymous users in scale of 1–5 |
| Figures | 100,000 ratings by 943 users on 1682 movies |
| Availability | Available at http://www.grouplens.org/ |

Table 2
List of the experiments.

| Experiments | Selection methods | Short description |
| --- | --- | --- |
| ⟨1⟩ Best 1 selection methods | ENT | Entropy only |
| | EXP | Exposure only |
| | ENT × EXP | Entropy multiplied by exposure |
| | ENT + EXP | Entropy plus exposure |
| | Rand | Random |
| ⟨2⟩ Best K selection methods | ACE | Average conditional entropy |
| | ENT | Entropy only |
| | COR | Correlation distance |
| | LMC | Largest minimum correlation distance |
| | LMC + E | LMC plus exposure |
| | EXP | Exposure only |
| | Rand | Random |
| ⟨3⟩ PIP versus COR | The best methods from the previous experiments | Two CF methods, one using PIP for user similarity and the other using Pearson’s correlation |

In each of the experiments, a randomly chosen 80% of the movies
were used for similarity calculation (profiling or learning movies)
and the remaining 20% as recommendation candidates (test movies). Similarly, a randomly chosen 80% of the users were used as reference
users (learning users) and recommendation was made to the
remaining 20% (test users). All the experiments were repeated 20
times for all the movies and users, each time selecting different
subsets of data for learning and testing.
For those methods that use entropy, many sample probabilities
need to be pre-calculated for efficient experiments. Hence, about
6700 marginal probabilities $p(r_k = x)$ and 420,000 conditional probabilities $p(r_k = x \mid r_l = y)$ were prepared before the experiments.
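The pre-calculation of conditional probabilities and their use in the ACE rule can be sketched as follows. This is only an illustrative outline: the nested-counter layout, the function names, and the choice of returning zero entropy when no co-ratings exist are assumptions of this example, not details reported for the original experiments.

```python
import math
from collections import Counter, defaultdict

def conditional_counts(ratings):
    """counts[(l, y)][k][x] = number of users with r_l = y and r_k = x."""
    counts = defaultdict(lambda: defaultdict(Counter))
    for user_ratings in ratings.values():
        for l, y in user_ratings.items():
            for k, x in user_ratings.items():
                if k != l:
                    counts[(l, y)][k][x] += 1
    return counts

def conditional_entropy(counts, k, l, y):
    """H(r_k | r_l = y) in bits, estimated from the co-rating counts."""
    hist = counts.get((l, y), {}).get(k, {})
    n = sum(hist.values())
    if n == 0:
        return 0.0  # assumption: no co-ratings means no estimate is possible
    return -sum((c / n) * math.log2(c / n) for c in hist.values())

def select_ace(counts, rated, candidates):
    """Pick the candidate with the largest average conditional entropy.

    `rated` maps each already-rated product to the rating the user gave it.
    """
    def ace(candidate):
        total = sum(conditional_entropy(counts, candidate, l, y)
                    for l, y in rated.items())
        return total / len(rated)
    return max(candidates, key=ace)

# Hypothetical toy data: among users who gave A a 5, ratings on B are split
# evenly between 1 and 5, while ratings on C are always 3.
sample = {
    "u1": {"A": 5, "B": 1, "C": 3},
    "u2": {"A": 5, "B": 5, "C": 3},
    "u3": {"A": 5, "B": 1, "C": 3},
    "u4": {"A": 5, "B": 5, "C": 3},
    "u5": {"A": 1, "B": 3, "C": 3},
}
sample_counts = conditional_counts(sample)
next_product = select_ace(sample_counts, {"A": 5}, ["B", "C"])
```

Here B is chosen because, given a rating of 5 on A, the rating on B is still uncertain, whereas the rating on C is fully predictable.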
4.2. Best 1 selection
Before presenting the result of the experiments, it needs to be
explained first how the average user ratings should be calculated
in the experiments in predicting the user ratings with formula
(1). When there are only a very small number of ratings available
for each user which is the case in these experiments, if we use only
the available ratings for calculating the average rating for each
user, it will be heavily biased by the small number of ratings.
Therefore, a smoothed
average $\bar{r}'_u$ for each user $u$ was used instead:
$$\bar{r}'_u = \frac{a\,\bar{r} + k\left(\bar{r} + \frac{1}{k}\sum_{k}\left(r_{u,k} - \bar{q}_k\right)\right)}{a + k},$$
where $a \ge 0$ is a constant, $\bar{r}$ is
the average rating of all users for all products, and $\bar{q}_k$ is the average
rating on product $k$ by all users; the sum runs over the products rated so far by user $u$, and $k$ outside the sum denotes the number of those ratings. Here, we can see easily that larger
values of $a$ will make the smoothed average rely more on the global average, and vice versa. In order to find the proper value of $a$,
many experiments were performed with different values of $a$
(see Appendix A), and it was observed that $a = 5$ produces a
good result. Hence, $a = 5$ was used for all the experiments.
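The following sketch shows how a smoothed user average could feed into formula (1). It assumes that the smoothed average is a pseudo-count weighted form in which the global average receives weight a and the user's own product-adjusted average receives weight k, which simplifies to the global average plus the summed offsets divided by (a + k). The function names and argument layouts are hypothetical.

```python
def smoothed_user_average(user_ratings, product_means, global_mean, a=5.0):
    """Smoothed average rating of one user (assumed pseudo-count form).

    user_ratings: product -> rating given by the user (k entries)
    product_means: product -> average rating of that product by all users
    """
    k = len(user_ratings)
    if a + k == 0:
        return global_mean  # nothing to average over
    offsets = sum(r - product_means[m] for m, r in user_ratings.items())
    return global_mean + offsets / (a + k)

def predict_rating(user_average, neighbours):
    """Formula (1): neighbours is a list of
    (similarity, neighbour_rating_on_item, neighbour_average) tuples."""
    denom = sum(abs(s) for s, _, _ in neighbours)
    if denom == 0:
        return user_average  # assumption: fall back to the user's average
    weighted = sum(s * (r - avg) for s, r, avg in neighbours)
    return user_average + weighted / denom

# Hypothetical example: a user with two ratings, each one point above the
# corresponding product average.
avg = smoothed_user_average({"A": 5, "B": 4}, {"A": 4.0, "B": 3.0}, 3.5, a=5.0)
pred = predict_rating(3.5, [(1.0, 5, 4.0), (0.5, 2, 3.0)])
```

With a = 5 and only two ratings, the user's average moves from 3.5 to about 3.79 rather than to 4.5, which is the value obtained with a = 0.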
Fig. 3. Comparing the Best 1 selection methods (prediction accuracy in MAE for ENT, EXP, ENT+EXP, ENT×EXP, and Rand).
Next, in order to give a basic understanding about the data,
Fig. 2 shows the distribution of the exposure and entropy for all
the movies. As can be seen, most of the movies are exposed to less
than 10% of the users, and only a very small number of movies are
exposed to more than 20%. For the entropy, many movies have entropy values around 2.0, but there are also many movies that have
significantly larger or smaller entropy values. An entropy value of 2
of a movie implies that the rating of a user for the movie can give
us on average two bits of information about the user. The correlation between the exposure and the entropy of the movies was
found to be small (0.1416).
Fig. 3 shows the result of the Best 1 selection experiments. As
can be seen, the ENT + EXP method was found to be the most
effective method of selecting the first products. Along with
ENT + EXP, ENT × EXP also shows a good result in comparison
with the random selection. However, using the entropy alone
shows worse performance than the random method, which
implies that exposure should also be taken into account in the
selection because, regardless of entropy values, movies with low
exposure might not be very useful in finding reference users for
the recommendation.
Next, in order to give readers some idea on which movies are
selected using each of the four methods, Table 3 shows the three
movies that are most likely to be selected by each method. We
can see that each method produces different sets of movies, and
also that ENT + EXP and ENT × EXP produce different but similar
movies. The table also shows the average entropies and exposures
of the movies that are selected as the first movie for all the users by
each method.
Fig. 2. Distribution of exposure and entropy.
Table 3
Example movies selected by each method.

| Method | First | Second | Third | Average entropy | Average exposure |
| --- | --- | --- | --- | --- | --- |
| ENT | Lost Highway | Natural Born Killers | Evita | 2.20636 | 0.204127 |
| EXP | Star Wars | Contact | Return of the Jedi | 1.680647 | 0.581351 |
| ENT + EXP | Liar Liar | Scream | Independence Day | 2.09814 | 0.481743 |
| ENT × EXP | Liar Liar | Scream | Contact | 2.03305 | 0.51224 |
Table 4
Pair-wise t-tests for the comparison of performances of LMC and Rand.

| | K = 1 | K = 2 | K = 3 | K = 4 |
| --- | --- | --- | --- | --- |
| LMC | 0.8099 | 0.7974 | 0.7886 | 0.7820 |
| Rand | 0.8295 | 0.8057 | 0.7926 | 0.7847 |
| t-Value | 5.3278*** | 3.995*** | 1.8274** | 1.4744* |

The degree of freedom = 19.
* Significant at 0.90.
** Significant at 0.95.
*** Significant at 0.99.
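The significance markers in Table 4 can be checked against one-tailed critical values of Student's t distribution with 19 degrees of freedom. The sketch below hard-codes those critical values from a standard t-table (rounded to three decimals) and also shows how a paired t statistic would be computed from per-run errors. The function names are hypothetical, and the per-run errors of the original experiments are not available here, so only the reported t-values are classified.

```python
import math
from statistics import mean, stdev

# One-tailed critical values for 19 degrees of freedom (standard t-table).
CRITICAL_T_DF19 = {0.90: 1.328, 0.95: 1.729, 0.99: 2.539}

def paired_t(baseline_errors, method_errors):
    """Paired t statistic on (baseline - method); positive values mean the
    method has the lower error on average."""
    diffs = [b - m for b, m in zip(baseline_errors, method_errors)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

def significance_level(t_value, critical=CRITICAL_T_DF19):
    """Highest confidence level whose critical value is exceeded, or None."""
    passed = [level for level, crit in critical.items() if t_value > crit]
    return max(passed) if passed else None

# t-values reported in Table 4, keyed by K.
table4_t = {1: 5.3278, 2: 3.995, 3: 1.8274, 4: 1.4744}
levels = {k: significance_level(t) for k, t in table4_t.items()}
```

The resulting levels agree with the asterisks in the table: 0.99 for K = 1 and K = 2, 0.95 for K = 3, and 0.90 for K = 4.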
4.3. Best K selection
Fig. 4 shows the result of the Best K selection experiments. Note
that the experiments utilize the best result of the Best 1 experiment, ENT+EXP. The upper chart in Fig. 4 shows the results with
all the selection methods, while the lower chart shows the results
of only the two best performing methods, ENT and LMC, in comparison
with the random selection. As we can see, ENT and LMC show better results than Rand until about $k = 4$. COR also shows a good result for small values of $k$, but ACE and LMC+E do not appear to
be effective. The random selection shows better performance than
all the others from the point $k = 5$, which implies that the random
selection can also be a good strategy when we collect ratings on
more than a certain number of products.
Table 4 shows the result of the one-tailed pairwise t-tests to see
the significance of the differences between LMC, the best performing method, and the random selection. We can see that all the differences are statistically significant, although only at the 0.90 level for $K = 4$.
Next, to illustrate the actual selection of the movies by
some of the distinctive methods, Table 5 shows examples of the
selection process for $k = 4$. For the ACE method, based on the ratings already collected for the three movies, The English Patient,
Contact, and Independence Day, the average conditional entropies
are calculated for the three candidate movies, Bonnie and Clyde,
Fargo, and Ransom. As a result, Fargo is selected since it has
the largest ACE value. In the case of COR, George of the Jungle is selected because it has the largest average correlation distance from
the movies The English Patient, Jerry Maguire, and Ben-Hur. Lastly,
the LMC method selects 2001: A Space Odyssey, which has the largest minimum distance from the three movies.
Fig. 4. Comparing the Best K selection methods.
4.4. PIP versus correlation
The last experiment shows how the proposed methods perform
when used with correlation instead of PIP in calculating the user similarity for the CF method. The best performing Best 1 and Best K selection methods, ENT + EXP and LMC, respectively, were used again in
the experiments. As can be seen in Figs. 5 and 6, the use of correlation shows significantly worse performance than the use of PIP.
Also, the performance of using the selection methods with correlation is even worse than using the random selection with correlation, although the difference is quite small, which may be due to
the overall poor performance of correlation for a very small number
of available ratings.
4.5. Summary of the results and discussion
We can summarize the results of the experiments as follows.
First, for constructing user profiles with a very limited number of
products, the proposed methods proved to be effective. Second,
although the random selection shows inferior results for a very
small number of ratings, it shows good performance as more products are used. This can be due to the limitations of the proposed
methods that might lead to accumulated bias as more and more
products are selected by them. Third, for estimating the average
user ratings for a small number of products, a smoothing method
was proposed to estimate the averages effectively. Fourth, consistent with the previous study (Ahn, 2008), the PIP measure shows
much better performance than correlation when only a limited
number of ratings is available.
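The overall scores reported in Table 5 follow directly from the pairwise values: ACE and COR average the three pairwise scores of each candidate, and LMC takes their minimum. The short check below copies the pairwise values from the table, assuming they are listed in the order of the three already-rated movies, and recomputes which candidate each method selects. Variable names are chosen for this example only.

```python
from statistics import mean

# Pairwise scores copied from Table 5 (three per candidate movie).
ace_pairs = {
    "Bonnie and Clyde": [0.42956674, 0.20270143, 0.5128109],
    "Fargo": [1.2319341, 1.1928089, 1.3904102],
    "Ransom": [0.81961715, 0.9822581, 1.3260816],
}
cor_pairs = {
    "Queen Margot": [0.45098275, 0.7331273, 0.24990857],
    "Pretty Woman": [0.67926437, 0.71803486, 0.61346614],
    "George of the Jungle": [0.9391285, 0.9899691, 0.9863818],
}
lmc_pairs = {
    "Brazil": [0.81344426, 0.88023716, 0.93532586],
    "Othello": [0.7192551, 0.9487815, 0.3709597],
    "2001: A Space Odyssey": [0.9003594, 0.9641372, 0.90802217],
}

# ACE and COR use the average of the pairwise scores; LMC uses the minimum.
ace_scores = {m: mean(v) for m, v in ace_pairs.items()}
cor_scores = {m: mean(v) for m, v in cor_pairs.items()}
lmc_scores = {m: min(v) for m, v in lmc_pairs.items()}

# Each method selects the candidate with the largest overall score.
ace_choice = max(ace_scores, key=ace_scores.get)
cor_choice = max(cor_scores, key=cor_scores.get)
lmc_choice = max(lmc_scores, key=lmc_scores.get)
```

The recomputed selections are Fargo for ACE, George of the Jungle for COR, and 2001: A Space Odyssey for LMC, matching the choices described in Section 4.3.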
The above results have the following practical implications.
First, they show that it is possible to gain more information for
effective recommendation by adopting the PIP measure and requiring
users to rate only a very small number of carefully chosen products. Because users are often very reluctant to
provide personal information to Internet stores, the suggested
method can be used effectively to gain more information with minimum user disturbance. For example, through a promotion campaign on a web page, we can ask customers to provide ratings on
a very small number of products sequentially, and use the result
for recommending products. Second, the same idea can be applied
to other e-business areas such as portals and social communities so
that we can again analyze the preferences and interests of users
with minimum disruption, although different modeling of
the items and users may be needed in that case.
Fig. 5. Comparing correlation and PIP as the user similarity measure for the CF method.
Table 5
Example movies selected by ACE, COR, and LMC methods (CE = conditional entropy, CD = correlation distance).
Method | Movies already rated for k = 3 ⟨a⟩ | Movie considered for k = 4 ⟨b⟩ | Score for each pair of ⟨a⟩ and ⟨b⟩ | Overall score
ACE | The English Patient (r = 5); Contact (r = 5); Independence Day (r = 3) | Bonnie and Clyde | CE = 0.42956674; 0.20270143; 0.5128109 | ACE = 0.38169304
ACE | The English Patient (r = 5); Contact (r = 5); Independence Day (r = 3) | Ransom | CE = 1.2319341; 1.1928089; 1.3904102 | ACE = 1.2717177
ACE | The English Patient (r = 5); Contact (r = 5); Independence Day (r = 3) | Pretty Woman | CE = 0.81961715; 0.9822581; 1.3260816 | ACE = 1.0426522
COR | The English Patient; Jerry Maguire; Ben-Hur | Queen Margot | CD = 0.45098275; 0.7331273; 0.24990857 | Average of CDs = 0.47800618
COR | The English Patient; Jerry Maguire; Ben-Hur | George of the Jungle | CD = 0.67926437; 0.71803486; 0.61346614 | Average of CDs = 0.6702552
COR | The English Patient; Jerry Maguire; Ben-Hur | Brazil | CD = 0.9391285; 0.9899691; 0.9863818 | Average of CDs = 0.9718265
LMC | The English Patient; Jerry Maguire; Ben-Hur | Fargo | CD = 0.81344426; 0.88023716; 0.93532586 | LMC = 0.81344426
LMC | The English Patient; Jerry Maguire; Ben-Hur | Othello | CD = 0.7192551; 0.9487815; 0.3709597 | LMC = 0.3709597
LMC | The English Patient; Jerry Maguire; Ben-Hur | 2001: A Space Odyssey | CD = 0.9003594; 0.9641372; 0.90802217 | LMC = 0.9003594
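The overall scores listed in Table 5 can be reproduced from the pairwise scores with a few lines of arithmetic. The sketch below assumes, based on the numbers in the table, that the ACE score is the mean of the pairwise conditional entropies, the COR score is the mean of the pairwise correlation distances, and the LMC score is the minimum of the pairwise correlation distances. The pairing of candidate movies with score triples follows the order in which they appear in the table, and all variable names are illustrative.

```python
from statistics import mean

# Pairwise scores between each candidate movie and the three already rated
# movies, copied from Table 5 (the candidate-to-triple pairing is assumed
# from the order of appearance in the table).
ce_pairs_ace = {
    "Bonnie and Clyde": [0.42956674, 0.20270143, 0.5128109],
    "Ransom": [1.2319341, 1.1928089, 1.3904102],
    "Pretty Woman": [0.81961715, 0.9822581, 1.3260816],
}
cd_pairs_cor = {
    "Queen Margot": [0.45098275, 0.7331273, 0.24990857],
    "George of the Jungle": [0.67926437, 0.71803486, 0.61346614],
    "Brazil": [0.9391285, 0.9899691, 0.9863818],
}
cd_pairs_lmc = {
    "Fargo": [0.81344426, 0.88023716, 0.93532586],
    "Othello": [0.7192551, 0.9487815, 0.3709597],
    "2001: A Space Odyssey": [0.9003594, 0.9641372, 0.90802217],
}

# Overall score per candidate: mean CE (ACE), mean CD (COR), minimum CD (LMC).
ace_scores = {movie: mean(ces) for movie, ces in ce_pairs_ace.items()}
cor_scores = {movie: mean(cds) for movie, cds in cd_pairs_cor.items()}
lmc_scores = {movie: min(cds) for movie, cds in cd_pairs_lmc.items()}

for label, scores in (("ACE", ace_scores), ("COR", cor_scores), ("LMC", lmc_scores)):
    for movie, score in scores.items():
        print(f"{label:3s} {movie:22s} {score:.7f}")
```

Taking the minimum rather than the mean makes the LMC score sensitive to a single close neighbour: Othello has two large distances but one small one (0.3709597), so its LMC score is low even though its average distance would be moderate.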
There are also limitations of this study. First, the effectiveness of
the suggested methods has yet to be proven for other Internet
stores or products. Therefore, care should be taken in generalizing
the results and applying the methods to other areas. Second, since
the previous literature showed that the use of correlation for user
similarity in CF outperforms PIP when a larger number of ratings
are available (e.g. when k = 11–15), the use of correlation also
needs to be considered when applying the selection methods of
this paper to similar situations.
Fig. 6. Finding the appropriate value of a.
5. Conclusion
This paper presented novel methods of selecting a sequence of a
small number of products for each user, which can be used to construct user profiles effectively for CF-based product recommendation. The methods use several measures of product
characteristics and diversity based on the concepts of entropy,
exposure, and correlation distance, individually
or in combination, to select different products for different users.
The effectiveness of the methods was tested and demonstrated using
experiments with the MovieLens dataset. The main contributions
of this research can be summarized as follows. First, to the authors’
knowledge, this research is the first to address the very practical problem of
selecting a sequence of a small number of products for user profiling. Second, the paper showed that using the
entropy, exposure, and correlation distance of movies together
with the PIP measure can lead to improved user profiling while
requiring ratings on only a small number of products.
Although the study has the limitations already discussed in
Section 4, the authors believe that carefully applying the proposed
methods to many Internet stores can give them practical benefits
in improving their personalized product recommendation.
There are some interesting further research issues as well. First,
the methods need to be applied and tested under more diverse circumstances to see whether the results can be generalized or how
the results might vary. Second, rather than using a dataset, the
proposed methods can be tested empirically with human participants to assess their performance under a more realistic setting. Third, because the proposed methods were applied only to the
CF method in this study, applying them to other methods of product
recommendation can also be interesting and meaningful.
Acknowledgement
This work was supported by the Hongik University new faculty
research support fund.
Appendix A
In order to find the appropriate value of a, the Best K selection
experiments were repeated for a values between 1 and 20, for
k = 1, 2, 3, 4, and 5. For both Rand and LMC, a = 5 shows good
results (see Fig. 6).
References
Adomavicius, G., Sankaranarayanan, R., Sen, S., & Tuzhilin, A. (2005). Incorporating
contextual information in recommender systems using a multidimensional
approach. ACM Transactions on Information Systems, 23(1), 103–145.
Adomavicius, G., & Tuzhilin, A. (2005). Toward the next generation of recommender
systems: A survey of the state-of-the-art and possible extensions. IEEE
Transactions on Knowledge & Data Engineering, 17(6), 734–749.
Ahn, H. J. (2006). Utilizing popularity characteristics for product recommendation.
International Journal of Electronic Commerce, 11(2), 57–78.
Ahn, H. (2008). A new similarity measure for collaborative filtering to alleviate the
new user cold-starting problem. Information Sciences, 178(1), 37–51.
Ahn, H. J., & Kim, J. W. (2006). Feature reduction for product recommendation in
internet shopping malls. International Journal of Electronic Business, 4(5),
432–444.
Burke, R. (2002). Hybrid recommender systems: Survey and experiments. User
Modeling and User-Adapted Interaction, 12(4), 331–370.
Cohen, W. W., & Fan, W. (2000). Web-collaborative filtering: Recommending music
by crawling the web. Computer Networks, 33(1-6), 685–698.
Cylogy (2005, 28/Nov/2005). Personalization overview. Retrieved April 2006, from
<http://www.cylogy.com/library/personalization_overview-kb.pdf>.
Greco, G., Greco, S., & Zumpano, E. (2004). Collaborative filtering supporting web
site navigation. AI Communications, 17(3), 155–166.
Herlocker, J. L., Konstan, J. A., Terveen, L. G., & Riedl, J. T. (2004). Evaluating
collaborative filtering recommender systems. ACM Transactions on Information
Systems, 22(1), 5–53.
Huang, Z., Chen, H., & Zeng, D. (2004). Applying associative retrieval techniques to
alleviate the sparsity problem in collaborative filtering. ACM Transactions on
Information Systems, 22(1), 116–142.
Kim, Y. S., Yum, B.-J., Song, J., & Kim, S. M. (2005). Development of a recommender
system based on navigational and behavioral patterns of customers in e-commerce sites. Expert Systems with Applications, 28(2), 381.
Komiak, S. Y. X., & Benbasat, I. (2006). The effects of personalization and familiarity
on trust and adoption of recommendation agents. MIS Quarterly, 30(4),
941–960.
Konstan, J. A., Miller, B. N., Maltz, D., Herlocker, J. L., Gordon, L. R., & Riedl, J. (1997).
GroupLens: Applying collaborative filtering to usenet news. Communications of
the ACM, 40(3), 77–87.
Lee, D.-S., Kim, G.-Y., & Choi, H.-I. (2003). A web-based collaborative filtering
system. Pattern Recognition, 36(2), 519–526.
Li, Q., & Kim, B. M. (2003). Clustering approach to hybrid recommendation. Paper
presented at the IEEE/WIC international conference on web intelligence
(WI’03).
Li, Y., Lu, L., & Xuefeng, L. (2005). A hybrid collaborative filtering method for
multiple-interests and multiple-content recommendation in E-Commerce.
Expert Systems with Applications, 28(1), 67–77.
Linden, G., Smith, B., & York, J. (2003). Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing, 7(1), 76–80.
MacKay, D. (2003). Information theory, inference and learning algorithms. Cambridge
University Press.
Maltz, D., & Ehrlich, E. (1995, May 1995). Pointing the way: Active collaborative
filtering. Paper presented at the CHI 95 human factors in computing systems,
Denver, USA.
Melville, P., Mooney, R. J., & Nagarajan, R. (2002, July 2002). Content-boosted
collaborative filtering for improved recommendations. Paper presented at the 18th
national conference on artificial intelligence (AAAI-2002), Edmonton, Canada.
Middleton, S. E., Alani, H., Shadbolt, N. R., & De Roure, D. C. (2002, May 2002).
Exploiting synergy between ontologies and recommender systems. Paper presented
at the 11th international world wide web conference (WWW2002), Hawaii,
USA.
Mirzadeh, N., Ricci, F., & Bansal, M. (2005, March 2005). Feature selection methods for
conversational recommender systems. Paper presented at the 2005 IEEE
international conference on e-technology, e-commerce and e-service, Hong
Kong, China.
Park, S.-T., Pennock, D. M., Madani, O., Good, N., & DeCoste, D. (2006). Naive filterbots
for robust cold-start recommendations. Paper presented at the KDD’06,
Philadelphia, Pennsylvania, USA.
Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., & Riedl, J. (1994, October 1994).
GroupLens: An open architecture for collaborative filtering of netnews. Paper
presented at the ACM 1994 conference on computer supported cooperative
work, Chapel Hill, North Carolina.
Reza, F. (1994). An introduction to information theory. Dover Publications.
Salter, J., & Antonopoulos, N. (2006). Cinemascreen recommender agent: Combining
collaborative and content-based filtering. IEEE Intelligent Systems, 21(1), 35–41.
Sarwar, B., Karypis, G., Konstan, J. A., & Riedl, J. (2001, May 2001). Item-based
collaborative filtering recommendation algorithms. Paper presented at the 10th
international world wide web conference (WWW10), Hong Kong.
Schein, A. I., Popescul, A., Ungar, L. H., & Pennock, D. M. (2002, August 2002).
Methods and metrics for cold-start recommendations. Paper presented at the 25th
annual international ACM SIGIR conference on research and development in
information retrieval, Tampere, Finland.
Tam, K. (2006). Understanding the impact of web personalization on user
information processing and decision outcomes. Management Information
Systems Quarterly, 30(1), 32.
Vezina, R., & Militaru, D. (2004). Collaborative filtering: Theoretical positions and a
research agenda in marketing. International Journal of Technology Management,
28(1), 31–45.
Xiao, B., & Benbasat, I. (2007). E-Commerce product recommendation agents: Use,
characteristics, and impact. MIS Quarterly, 31(1), 137–209.
Yu, K., Xu, X., Ester, M., & Kriegel, H. (2003). Feature weighting and instance
selection for collaborative filtering: An information-theoretic approach.
Knowledge and Information Systems, 5(2), 201–224.
Zeng, C., Xing, C., Zhou, L., & Zheng, X. (2004). Similarity measure and instance
selection for collaborative filtering. International Journal of Electronic Commerce,
8(4), 115–129.