Recommender Systems

Ian Wesley-Smith
http://iwsmith.in
October 26th, 2016
Ian Wesley-Smith
PhD Student
Information School
University of Washington
https://iwsmith.in
Areas: Recommender Systems, Machine Learning, Data Science
Research Question: How do we generate recommendations for corpora with few user-item interactions and many items?
Search vs Recommendation
Why do we need recommenders?
Information Overload
• Netflix ~ 100,000 movies/shows
• Amazon Prime ~ 40,000 movies/shows
• Amazon.com ~ 480M products in US
• Spotify ~ 20M songs
• YouTube ~ 300 hours of content/minute
• All social networks ever: Facebook, Reddit, Tumblr, Twitter
• Content creators: NYTimes, Wall St. Journal, WaPo
Content-Based Filtering
• Find similar items
• Won’t find complementary items
• Manual entry for large catalogues is infeasible
• Features are subjective
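A minimal sketch of "find similar items", assuming each item has a hand-entered feature vector. The titles, the three features (action, comedy, romance), and their scores are hypothetical, and cosine similarity is one common choice of similarity measure, not necessarily the one a given system uses.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector is similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical, manually entered features: [action, comedy, romance].
ITEM_FEATURES = {
    "Die Hard": [1.0, 0.2, 0.0],
    "Lethal Weapon": [0.9, 0.4, 0.0],
    "Notting Hill": [0.0, 0.6, 1.0],
}

def most_similar(title, features, k=1):
    """Return the k items whose feature vectors are closest to `title`."""
    target = features[title]
    scored = [
        (cosine(target, vec), other)
        for other, vec in features.items()
        if other != title
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]

print(most_similar("Die Hard", ITEM_FEATURES))
```

The sketch also shows the weaknesses listed above: every score in ITEM_FEATURES had to be typed in by someone, and the numbers are that person's opinion.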
Collaborative Filtering
• Find similar users
• Domain Free
• Good at finding complementary items
• Cold start problem
• Users might like the same thing for different reasons
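A minimal sketch of user-based collaborative filtering, assuming explicit ratings stored per user. The users, items, and ratings are hypothetical, and the similarity-weighted average is one simple neighbourhood scheme among many. Note that no item features are needed (domain free), and that an item nobody has rated cannot be scored (cold start).

```python
import math

# Hypothetical explicit ratings: user -> {item: rating}.
RATINGS = {
    "ann": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 5, "B": 5, "D": 4},
    "cat": {"A": 1, "C": 5, "D": 2},
}

def user_similarity(r1, r2):
    """Cosine similarity over the items both users have rated."""
    common = set(r1) & set(r2)
    if not common:
        return 0.0
    dot = sum(r1[i] * r2[i] for i in common)
    n1 = math.sqrt(sum(r1[i] ** 2 for i in common))
    n2 = math.sqrt(sum(r2[i] ** 2 for i in common))
    return dot / (n1 * n2)

def predict(user, item, ratings):
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when no other user has rated the item (cold start).
    """
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        sim = user_similarity(ratings[user], theirs)
        num += sim * theirs[item]
        den += abs(sim)
    return num / den if den > 0 else None

print(predict("ann", "D", RATINGS))  # pulled toward bob, ann's closest neighbour
print(predict("ann", "Z", RATINGS))  # None: nobody has rated "Z"
```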
Collaborative Filtering
Koren, Yehuda, Robert Bell, and Chris Volinsky. "Matrix factorization techniques for recommender systems." Computer 42.8 (2009): 30-37.
Netflix Prize (2006)
• Goal: beat the RMSE of Netflix’s own recommender by 10% to win 1M USD
• Training set: 100M ratings, 500K customers, 17K movies
• Explicit user-item interactions: ratings from 1 to 5 stars
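For reference, RMSE (root mean squared error) over N held-out ratings is sqrt((1/N) * sum((predicted - actual)^2)); lower is better. A small sketch follows. The baseline value in it is a made-up placeholder used only to show what a 10% improvement means, not Netflix's actual score.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))

def target_rmse(baseline, improvement=0.10):
    """RMSE that must be reached to improve on `baseline` by `improvement`."""
    return baseline * (1.0 - improvement)

print(rmse([4.0, 3.0, 5.0], [5.0, 3.0, 4.0]))
print(target_rmse(1.0))  # hypothetical baseline of 1.0 star
```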
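Latent Factor Model
• Represent users (U) and items (I) as vectors in a shared latent feature space
• Prediction: $u_i \cdot i_j = \text{score}$, the dot product of user vector $u_i$ and item vector $i_j$
• Factorize the rating matrix to estimate the missing values, fitting with SGD or ALS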
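A minimal sketch of the SGD variant, assuming a squared-error loss with L2 regularization on the observed ratings only. The toy ratings, the number of factors k, and the learning rate, regularization, and epoch settings are illustrative assumptions, not values from the Netflix Prize work.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def factorize(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02,
              epochs=1000, seed=0):
    """Learn user factors P and item factors Q by SGD on observed ratings."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - dot(P[u], Q[i])  # error on this single rating
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def train_rmse(ratings, P, Q):
    """RMSE of the dot-product predictions on the observed ratings."""
    total = sum((r - dot(P[u], Q[i])) ** 2 for u, i, r in ratings)
    return math.sqrt(total / len(ratings))

# Hypothetical observed (user, item, rating) triples; the rest are missing.
RATINGS = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 1.0), (2, 2, 5.0)]

P0, Q0 = factorize(RATINGS, 3, 3, epochs=0)  # untrained starting point
P, Q = factorize(RATINGS, 3, 3)
print(train_rmse(RATINGS, P0, Q0), train_rmse(RATINGS, P, Q))
print(dot(P[0], Q[2]))  # predicted score for a missing (user 0, item 2) entry
```

Only the observed entries contribute to the updates; the missing entries are filled in afterwards by taking the dot product of the learned vectors.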
Recommender Challenges
• Latent factor models work on a homogeneous corpus
• Many techniques require more user-item interactions than items (|interactions| > |items|)
• Many corpora (scholarly articles, personal photos) don’t have this
• Implicit feedback leads to recommender confusion
• Recommenders are currently blind to user intent
• Ensembles are popular
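To make the interactions-versus-items requirement concrete, here is a small arithmetic sketch using the Netflix Prize training-set sizes quoted earlier in the deck (100M ratings, 500K customers, 17K movies). The function names are illustrative only.

```python
def density(n_interactions, n_users, n_items):
    """Fraction of the user-item matrix that is actually observed."""
    return n_interactions / (n_users * n_items)

def interactions_per_item(n_interactions, n_items):
    """Average number of observed interactions per item."""
    return n_interactions / n_items

# Netflix Prize training set, rounded as in the deck.
N_RATINGS, N_USERS, N_ITEMS = 100_000_000, 500_000, 17_000

print(f"density: {density(N_RATINGS, N_USERS, N_ITEMS):.2%}")
print(f"ratings per movie: {interactions_per_item(N_RATINGS, N_ITEMS):.0f}")
```

Even here only about 1% of the matrix is filled in, yet each movie averages thousands of ratings. A corpus where this per-item ratio drops below 1 has, on average, less than one interaction per item to learn from.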
Additional Material
• Koren, Yehuda, Robert Bell, and Chris Volinsky. "Matrix factorization techniques for recommender systems." Computer 42.8 (2009): 30-37.
• Gomez-Uribe, Carlos A., and Neil Hunt. "The Netflix recommender system: Algorithms, business value, and innovation." ACM Transactions on Management Information Systems 6.4 (2015).
• Narayanan, Arvind, and Vitaly Shmatikov. "How to break anonymity of the Netflix prize dataset." arXiv preprint cs/0610105 (2006).
• MovieLens Dataset
• Facebook: Recommending items to more than a billion people