Fused Matrix Factorization with Geographical and Social Influence in Location-based Social Networks Chen Cheng1, Haiqin Yang1, Irwin King1,2 and Michael R. Lyu1 1Department of Computer Science and Engineering The Chinese University of Hong Kong & 2ATT Labs, Research [email protected] AAAI 2012, Toronto, Canada Check-in becomes a life style… AAAI 2012, Toronto, Canada Check-in becomes a life style… Now the number of users surpasses 20 million corresponding to 2 billion check-ins1! 1http://statspotting.com/2012/04/foursquare-statistics-20-million-users-2-billion-check-ins/ AAAI 2012, Toronto, Canada Graph illustration of Location-based Social Networks (LBSNs) Friend link • Community detection • Link prediction • POI recommendation • Next place prediction Checked in Check in? Check in ? POI (lat, lng) AAAI 2012, Toronto, Canada • Travel sequence detection • Trip recommendation Our focus: POI recommendation • Help users explore their surroundings • Provide personalized travel recommendation • Help 3rd-party developers provide personalized services – Advertisements – Coupons – Traffic statistics AAAI 2012, Toronto, Canada Challenges • Large dataset – Crawled from Gowalla from Feb. 2009 to Sep. 2011 – 4,128,714 check-ins from 53,944 users on 367,149 locations • Only positive data is seen • Sparsity : density of our dataset is only 0.0208% AAAI 2012, Toronto, Canada POI recommendation in LBSNs Matrix Factorization can be a promising tool However… Geographical influence is ignored! AAAI 2012, Toronto, Canada POI recommendation in LBSNs Er… a little far.. AAAI 2012, Toronto, Canada Multi-centers and normal distribution • Two centers (home & office) in [Cho et al 2011] • Several centers proposed in our paper AAAI 2012, Toronto, Canada Multi-centers and normal distribution Similar to [Brockmann 2006, Gonzalez 2008] , we assume each center follow the norm distribution AAAI 2012, Toronto, Canada Inverse distance rule AAAI 2012, Toronto, Canada Social influence • On average, overlap of a user’s check-ins to his friends only about 9.6% 90% users have only 20% common check-ins AAAI 2012, Toronto, Canada Our proposal • Multi-center Gaussian Model (MGM) to capture geographical influence • Propose a generalized fused matrix factorization framework to include social and geographical influences • Conduct thorough experiments on large-scale Gowalla dataset AAAI 2012, Toronto, Canada Multi-center Gaussian model • Recall check-in locations are located around several centers • The probability a user visiting a location is inversely proportional to the distance from its nearest center • MGM is proposed to model users’ check-in behavior AAAI 2012, Toronto, Canada Multi-center Gaussian model • Notation – – – : multi-center set for user u : total frequency at center for user u is : the pdf of Gaussian distribution, and denote the mean and covariance matrices of regions around center • The probability a user u visiting a location l given defined as: AAAI 2012, Toronto, Canada Multi-center discovering algorithm A greedy clustering algorithm is proposed due to Pareto principle (top 20 locations cover about 80% check-ins) 0.2 20 search centers AAAI 2012, Toronto, Canada Fused framework User • Traditional Matrix Factorization (MF) only model users’ preference on locations • MGM only models geographical influence • We can fuse both of them Location prob. user u visit location l encode user preference based on MF calculated by MGM AAAI 2012, Toronto, Canada Setup and metric • Split the dataset into 2 non-overlapping sets – Randomly select x% for each user as training data and the rest (1-x)% as the test data, x set to 70 and 80 – Carried out 5 times independently, we report the average • POI recommendation – Return top-N POIs for each user – Find out # of locations in test dataset are recovered • Metric AAAI 2012, Toronto, Canada Comparison Methods • MGM • PMF: [Salakhutdinov and Mnih 2007] – Assume Gaussian distribution on observed data – Gaussian prior on latent feature vector • PMF with Social Regularization (PMFSR): [Ma et al. 2011b] – Social regularization term added to PMF • Probabilistic Factor Model (PFM): [Ma et al. 2011a] – Model frequency data, Gamma prior on latent feature vector and Poisson distribution on the frequency data • Fused MF with MGM (FMFMGM): our proposed method AAAI 2012, Toronto, Canada Results Precision Recall 70% 80% AAAI 2012, Toronto, Canada User check-in distribution AAAI 2012, Toronto, Canada Performance on different users AAAI 2012, Toronto, Canada Conclusion • Extract characteristics of a large dataset crawled from Gowalla • Propose a novel Multi-center Gaussian Model (MGM) to model geographical influence • Propose a fused MF framework which outperforms state-of-the-art methods AAAI 2012, Toronto, Canada Future work • To better model one-class frequency data • To include other information: location category, activity, etc. • To incorporate temporal effect AAAI 2012, Toronto, Canada Thanks Q&A Chen Cheng [email protected] AAAI 2012, Toronto, Canada
© Copyright 2026 Paperzz