Collaborative Deep Learning (CDL)

Collaborative Deep Learning
for Recommender Systems
报告人:王 月
日期:2016/5/26
Outline
Overview for Recommender System
Collaborative Deep Learning
Experiments
Conclusion
Page
2
Overview for Recommender System
Kinds of Recommendation System
 Content-based methods
 CF-based methods
 user-based
 item-based
 model-based (eg. Matrix Factorization)
 Hybrid methods:
 Loosely coupled methods
 Tightly coupled methods (CTR、CDL)
Page
3
Overview for Recommender System
Collaborative Topic Regression (CTR)
 tightly coupled method
 probabilistic graphical model
• topic model: latent Dirichlet allocation (LDA)
• model-based CF method: probabilistic matrix
factorization (PMF)
Page
4
Motivations
 The latent representation learned by CTR is often not effective enough
especially when the auxiliary information is very sparse.
 Deep learning models can learn effective feature representations from
text content automatically.
 DL models are inferior to shallow models such as CF in capturing and
learning the similarity and implicit relationship between items (and users).
Page
5
Outline
Overview for Recommender System
Collaborative Deep Learning
Experiments
Conclusion
Page
6
Collaborative Deep Learning
 A novel tightly coupled method for RS.
 Taking implicit feedback as the training and test data.
 Collaborative Filtering
 for the ratings (feedback) matrix
 to capture the similarity and implicit relationship between
items (and users)
 Deep Learning
 for the content information
 to extract an effective deep feature representation
Page
7
Collaborative Deep Learning
Notation and Problem Formulation
Matrix Factorization
Stacked Denoising Autoencoders
Collaborative Deep Learning
Learning and Prediction
Page
8
Notation and Problem Formulation
notation
meaning
I
J
Number of users
Number of items (articles or movies)
S
Size of vocabulary
Xc
Collection of J items , Clean input ,
J  by  S matrix
R
X0
Xl
Noise-corrupted input, J  by  S matrix
Wl , bl
Weight matrix and bias vector of layer l
L
W
Page
9
R  [ Rij]I  J , I  by  J binary rating
Output of layer L ,
J  by  Kl
matrix
Number of layers
Collection of all layers of weight matrices
and biases
Matrix factorization
Page
10
Matrix factorization
Page
11
Probabilistic matrix factorization (PMF)
Page
12
Disadvantages for Matrix Factorization
Page
13
Stacked Denoising Autoencoders (SDAE)
Page
14
Stacked Denoising Autoencoders (SDAE)
Assume that both the clean input Xc and the corrupted input X0
are observed, we can define the following generative process:
Page
15
Collaborative Deep Learning (CDL)
Page
16
The graphical model of CDL
The graphical model of
the degenerated CDL
Page
17
NN representation for degenerated CDL
Page
18
Learning and Prediction
Page
19
Outline
Overview for Recommender System
Collaborative Deep Learning
Experiments
Conclusion
Page
24
Experiments
Dataset
users
items
ratings
sparsity
citeulike-a
5,551
16,980
207,363
0.22%
citeulike-t
7,947
25,975
144,496
0.07%
Netflix
407,261
9,228
15,348,808
0.41%


Page
25
removing users with less than 3 positive ratings (or
articles)
removing stop words and choosing the top S
discriminative words according to the tf-idf values to
form the vocabulary (S is 8000, 20000, and 20000
for the three datasets)
Experiments
Evaluation:
 five-fold cross-validation with recall under both sparse and dense
settings,
Baselines:
 CMF: Collective Matrix Factorization is a model incorporating different sources
of information by simultaneously factorizing multiple matrices.
 SVDFeature: SVDFeature is a model for feature-based collaborative filtering.
 DeepMusic: DeepMusic is a model for music recommendation
 CTR: Collaborative Topic Regression is a model performing topic modeling and
collaborative filtering simultaneously.
 CDL: Collaborative Deep Learning is our proposed model. It allows different
levels of model complexity by varying the number of layers.
Page
26
Results
Page
27
Results
Page
28
Interpretation: example user I
Page
29
Interpretation: example user II
Page
30
Outline
Overview for Recommender System
Collaborative Deep Learning
Experiments
Conclusion
Page
31
Conclusion
 the first hierarchical Bayesian model to bridge the gap between
state-of-the-art deep learning models and RS
 combine the merits of traditional collaborative filtering and deep
learning model
 provides an interpretable latent structure for users and items
 can form recommendation about both existing and newly items
 conduct extensive experiments on three real-world datasets from
different domains
Page
32
Thank you!