Recommender Systems
Ian Wesley-Smith
http://iwsmith.in
October 26th, 2016

Ian Wesley-Smith
PhD Student, Information School, University of Washington
https://iwsmith.in
Areas: Recommender Systems, Machine Learning, Data Science
Research Question: How do we generate recommendations for corpora with few user-item interactions and many items?

Search vs Recommendation

Why do we need recommenders?

Information Overload
• Netflix ~ 100,000 movies/shows
• Amazon Prime ~ 40,000 movies/shows
• Amazon.com ~ 480M products in US
• Spotify ~ 20M songs
• YouTube ~ 300 hours of content/minute
• All social networks ever: Facebook, Reddit, Tumblr, Twitter
• Content creators: NYTimes, Wall St. Journal, WaPo

Content-Based Filtering
• Finds similar items
• Won’t find complementary items
• Manual feature entry for large catalogues is infeasible
• Features are subjective

Collaborative Filtering
• Finds similar users
• Domain free
• Good at finding complementary items
• Cold start problem
• Users might like the same thing for different reasons

Collaborative Filtering
Koren, Yehuda, Robert Bell, and Chris Volinsky. "Matrix factorization techniques for recommender systems." Computer 42.8 (2009): 30-37.

Netflix Prize (2006)
• Improve on the RMSE of Netflix’s own system by 10% to win 1M USD
• Training set: 100M ratings, 500K customers, 17K movies
• Explicit user-item interactions: ratings from 1 to 5 stars

Latent Factor Model
• Represent users (U) and items (I) as vectors in a shared latent feature space
• Prediction: $u_i \cdot i_j = \mathrm{score}$
• Perform matrix factorization to estimate the missing values, fitted with stochastic gradient descent (SGD) or alternating least squares (ALS)
Koren, Bell, and Volinsky (2009)

Netflix Prize
Koren, Bell, and Volinsky (2009)
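The Latent Factor Model slide can be made concrete with a small sketch. The code below is an illustrative example only, not the method used by any Netflix Prize entry: it fits user and item vectors with plain SGD on a tiny invented set of ratings, then scores an unseen user-item pair with a dot product. The names `factorize_sgd`, `rmse`, `user_vecs`, `item_vecs`, the toy ratings, and every hyperparameter (`n_factors`, `lr`, `reg`, `n_epochs`) are assumptions made for this example.

```python
import numpy as np

def factorize_sgd(ratings, n_users, n_items, n_factors=2,
                  lr=0.05, reg=0.01, n_epochs=500, seed=0):
    """Fit latent vectors to observed (user, item, rating) triples with SGD."""
    rng = np.random.default_rng(seed)
    # Small random start; each row is one user's or item's latent vector.
    user_vecs = rng.normal(scale=0.1, size=(n_users, n_factors))
    item_vecs = rng.normal(scale=0.1, size=(n_items, n_factors))
    for _ in range(n_epochs):
        # Visit the observed ratings in a fresh random order each epoch.
        for idx in rng.permutation(len(ratings)):
            u, i, r = ratings[idx]
            err = r - user_vecs[u] @ item_vecs[i]   # prediction error
            u_old = user_vecs[u].copy()             # keep pre-update value
            # Gradient step on squared error with L2 regularization.
            user_vecs[u] += lr * (err * item_vecs[i] - reg * user_vecs[u])
            item_vecs[i] += lr * (err * u_old - reg * item_vecs[i])
    return user_vecs, item_vecs

def rmse(ratings, user_vecs, item_vecs):
    """Root mean squared error over the given (user, item, rating) triples."""
    errors = [r - user_vecs[u] @ item_vecs[i] for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errors))))

# Toy data (invented): users 0-1 prefer items 0-1, users 2-3 prefer items 2-3.
ratings = [
    (0, 0, 5.0), (0, 1, 4.0), (0, 2, 1.0),
    (1, 0, 4.0), (1, 1, 5.0), (1, 2, 1.0),
    (2, 2, 5.0), (2, 3, 4.0),
    (3, 2, 4.0), (3, 3, 5.0),
]

user_vecs, item_vecs = factorize_sgd(ratings, n_users=4, n_items=4)
train_rmse = rmse(ratings, user_vecs, item_vecs)

# Score for a pair that has no observed rating: user 0, item 3.
predicted = float(user_vecs[0] @ item_vecs[3])
print(f"training RMSE: {train_rmse:.3f}")
print(f"predicted score for user 0, item 3: {predicted:.2f}")
```

Only the observed ratings contribute to the loss, which is what lets the factorization fill in missing entries instead of treating them as zeros. A practical system would typically add bias terms and evaluate RMSE on held-out ratings, which this sketch omits.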
Recommender Challenges
• The latent factor model works on a homogeneous corpus
• Many techniques require more user-item interactions than items: |user-item interactions| > |items|
• Many corpora (scholarly articles, personal photos) don’t have this
• Implicit feedback leads to recommender confusion
• Recommenders are currently blind to user intent
• Ensembles are popular

Additional Material
• Koren, Yehuda, Robert Bell, and Chris Volinsky. "Matrix factorization techniques for recommender systems." Computer 42.8 (2009): 30-37.
• Gomez-Uribe, Carlos A., and Neil Hunt. "The Netflix Recommender System: Algorithms, Business Value, and Innovation." ACM Transactions on Management Information Systems 6.4 (2015).
• Narayanan, Arvind, and Vitaly Shmatikov. "How to break anonymity of the Netflix Prize dataset." arXiv preprint cs/0610105 (2006).
• MovieLens Dataset
• Facebook: Recommending items to more than a billion people