Learning user preferences for 2CP-regression for a recommender system
Alan Eckhardt, Peter Vojtáš
Department of Software Engineering, Charles University in Prague, Czech Republic
SOFSEM 2010, Špindlerův Mlýn, Czech Republic, 23.-29.1.2010

Outline
- Motivation
- User model
- Peak and 2CP
- Experiments
- Conclusion and future work

User preference learning
- Helping the user to find what she is looking for
- Only a small amount of information is required from the user (e.g. ratings of notebooks)
- Construction of a general user preference model (e.g. for notebooks); each user has his or her own preference model
- Recommendation of the top k notebooks that the preference model has chosen as the most preferred for the user

Recommendation process
- Initial set: centers of clusters of objects
- Construction of the user model
- Recommendation of items to the user
- More iterations are possible; in each iteration the user model is refined

User decision making - two-step user model
User model learning is divided into two steps:
1. Local preferences - normalization of the attribute values of notebooks to their preference degrees f_i : D_Ai → [0,1], which together transform the attribute space D_A into [0,1]^N.
2.
Global preferences - aggregation of the preference degrees of the attribute values into the predicted rating @ : [0,1]^N → [0,1].

User model - fuzzy sets
- Normalization maps the attribute space into the monotone space [0,1]^N
- This defines the Pareto front: the set of mutually incomparable objects, the candidates for the best object
- (1, …, 1) is the best object
[Figure: normalization f_Price mapping prices 100$, 500$ and 2 000$ into [0,1]]

User model - aggregation
- Resolves the best object from the Pareto front
- The second best object may not lie on the Pareto front
- Two methods - Statistical and Instances
[Figure: the 1st and 2nd best objects under the weighted average @(RAM_U, CPU_U, Price_U) = (5·RAM_U + 1·CPU_U + 3·Price_U) / 9]

Normalization of numerical attributes
- Linear regression: expresses preference of the smallest or the largest value
- Quadratic regression: can detect ideal values, but often fails in experiments
[Figures: rating as a function of price (0-10 000$) under linear and quadratic regression]

2CP regression
- Preference dependence between attributes
- This is not a dependence in the dataset (such as the resolution of the display influencing the price), but the influence of the value of attribute A1 on the preference of attribute A2
- E.g. the producer of a notebook (IBM) influences the preference of the price of the notebook: for IBM, the ideal price is 2200$

Learned preferences of the manufacturer:
  Name     Rating
  ACER     0.2
  ASUS     0.5
  FUJITSU  0.8
  MSI      0.7
  TOSHIBA  0.7
  HP       0.9
  IBM      0.8
  SONY     0.5
  LENOVO   0.4

Ideal price depending on the manufacturer:
  ACER, ASUS, FUJITSU, MSI: 750$
  TOSHIBA, HP, IBM, SONY, LENOVO: 2200$
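The two-step user model above (step 1: local normalization f_i into [0,1]; step 2: the aggregation @) can be sketched in Python. This is a minimal sketch assuming linear local preferences and the 5:1:3 weights from the aggregation example; the attribute domains and function names are illustrative assumptions, not the paper's implementation.

```python
def local_preference(value, lo, hi, larger_is_better=True):
    """Step 1: a linear local preference f_i : D_Ai -> [0, 1].

    Maps the attribute domain [lo, hi] linearly to [0, 1]; for attributes
    such as price, smaller values are preferred, so the scale is flipped.
    """
    t = (value - lo) / (hi - lo)
    return t if larger_is_better else 1.0 - t

def aggregate(degrees, weights):
    """Step 2: the aggregation @ : [0, 1]^N -> [0, 1] as a weighted average."""
    return sum(w * d for w, d in zip(weights, degrees)) / sum(weights)

# Illustrative notebook: 4 GB RAM (domain 1-8 GB), 2.0 GHz CPU (1.0-3.0 GHz),
# price 750$ (domain 300$-2500$); domains are assumptions for the sketch.
degrees = [
    local_preference(4.0, 1.0, 8.0),                    # RAM: more is better
    local_preference(2.0, 1.0, 3.0),                    # CPU: more is better
    local_preference(750.0, 300.0, 2500.0,
                     larger_is_better=False),           # price: less is better
]
rating = aggregate(degrees, weights=[5, 1, 3])  # the 5:1:3 weights from above
```

The predicted rating stays in [0,1], so objects can be ranked directly by it; learning the weights is what the Statistical method does.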
Peak - motivation
- Finding the peak value: a user often prefers one particular value of an attribute
- The (small) training set is traversed, testing the error of linear regressions on both sides of each candidate peak
- We know exactly which value is the most preferred
- Useful for visual representation
[Figure: rating as a function of price (0-10 000$) with a single peak]

2CP regression + Peak
- Dependence of the ideal price on the value of manufacturer
- E.g. ACER => high price, ASUS => lower price
[Figure: peak-shaped price preferences learned separately for ACER and ASUS]

Experiment settings
- Dataset of 200 notebooks with artificial user preferences
- Training sets of sizes 2-60; the rest of the dataset was used as the testing set
- The preference of price was dependent on the value of the producer
- Error measures: RMSE and the Kendall tau coefficient

Tested methods
- Support Vector Machines from Weka
- Mean - returns the average rating from the training set
- Instances - classification; uses objects from the training set as boundaries on the rating
- Statistical - weighted average with learned weights
- 2CP - serves to find the relation between the preference of an attribute value and the value of another attribute
- Both Instances and Statistical can use local preference normalization - Linear, Quadratic, or Peak

Experiment results
[Figures: experiment results]

Conclusion
- Proposal of the method Peak and its combination with 2CP
- Experimental evaluation with very good results, using a rank correlation measure

Future work
- nCP-regression
- Clustering of similar values for better robustness
- Degree of relation between two attributes
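The Peak search described above can be sketched as follows: traverse the sorted training set, fit a least-squares line on each side of every candidate peak, and keep the split with the lowest total squared error. This is a minimal sketch under stated assumptions; the helper names and the toy training set are illustrative, and the paper's exact procedure may differ.

```python
def _fit_sse(points):
    """Least-squares line through (value, rating) points; returns the sum of squared errors."""
    n = len(points)
    if n < 2:
        return 0.0
    sx = sum(x for x, _ in points); sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points); sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx
    if denom == 0:
        return 0.0
    a = (n * sxy - sx * sy) / denom      # slope
    b = (sy - a * sx) / n                # intercept
    return sum((y - (a * x + b)) ** 2 for x, y in points)

def find_peak(train):
    """Return the attribute value from `train` that best splits it into two regression lines."""
    train = sorted(train)
    best_x, best_err = train[0][0], float("inf")
    for i in range(1, len(train) - 1):   # try each inner point as the peak
        err = _fit_sse(train[: i + 1]) + _fit_sse(train[i:])
        if err < best_err:
            best_x, best_err = train[i][0], err
    return best_x

# Toy training set: ratings rise towards price 1000$ and fall afterwards.
train = [(300, 0.1), (600, 0.5), (1000, 0.9), (1500, 0.6), (2200, 0.2)]
peak = find_peak(train)  # -> 1000 for this training set
```

Traversing the training set is cheap because the training set is small; combined with 2CP, a separate peak can be searched for each value of the influencing attribute (e.g. one ideal price per manufacturer).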