BEYOND THE WORDS: PREDICTING USER PERSONALITY FROM
HETEROGENEOUS INFORMATION
PRESENTER: BENYI GONG
MOTIVATION
• User personality is not only essential to many scientific disciplines, but also
has a profound business impact on practical applications such as digital
marketing, personalized recommendation, mental diagnosis, and human
resources management.
• Language usage on social media is effective for personality prediction. However,
leveraging the heterogeneous information on social media could yield a better
understanding of user personality.
PAPER STRUCTURE
• Problem Definition
• HIE Structure
• Heterogeneous Feature Engineering
• Experiments and Results
• Conclusion and Discussion
PROBLEM DEFINITION
• Four types of digital trace data: tweets, emoticons, avatars, and responsive patterns
• The Five-Factor Model of personality: Extraversion, Agreeableness,
Conscientiousness, Neuroticism, and Openness.
• Personality Prediction Evaluation
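The slide does not say which evaluation metric is used. As a minimal sketch, assuming each user has one survey score per Big Five trait and that predictions are compared trait by trait with root-mean-square error (RMSE), evaluation could look like this. The function name `per_trait_rmse` and the sample scores are hypothetical and not taken from the paper.

```python
import math

# Trait order follows the Five-Factor Model list on this slide.
TRAITS = ["Extraversion", "Agreeableness", "Conscientiousness",
          "Neuroticism", "Openness"]


def per_trait_rmse(y_true, y_pred):
    """Return {trait: RMSE} for two lists of per-user five-score vectors.

    Assumption: RMSE is the metric; the slides do not name one.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("need two non-empty lists of equal length")
    result = {}
    for j, trait in enumerate(TRAITS):
        squared_errors = [(t[j] - p[j]) ** 2 for t, p in zip(y_true, y_pred)]
        result[trait] = math.sqrt(sum(squared_errors) / len(squared_errors))
    return result


# Hypothetical survey scores and model predictions for two users.
survey = [[3.0, 4.0, 2.0, 5.0, 1.0],
          [1.0, 2.0, 4.0, 3.0, 5.0]]
predicted = [[3.0, 3.0, 2.0, 4.0, 1.0],
             [1.0, 3.0, 4.0, 2.0, 3.0]]

scores = per_trait_rmse(survey, predicted)
for trait, rmse in scores.items():
    print(f"{trait}: {rmse:.3f}")
```

Reporting one number per trait keeps the five dimensions separate, which fits a claim stated per personality dimension.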
HIE STRUCTURE
STRATEGIES TO EXTRACT SEMANTIC REPRESENTATIONS
• Tweets: LIWC, Pearson correlation, bag-of-words clustering, and Text-CNN
• Avatars: deep learning, k-means clustering
• Emoticons: Pearson correlation, emotion mapping
• Responsive patterns: Responsive-CNN
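Pearson correlation appears above as a strategy for both tweets and emoticons. One plausible reading, sketched here under stated assumptions, is feature selection: correlate each token's per-user usage count with a trait score and keep the tokens whose correlation is strong enough. The threshold `min_abs_r`, the function names, and the toy data are hypothetical; the slides do not give the exact procedure.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant feature carries no signal; treat as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def select_features(usage, trait_scores, min_abs_r=0.5):
    """Keep features whose |r| with the trait is at least min_abs_r.

    usage maps a token (word or emoticon) to its per-user counts.
    min_abs_r is an illustrative cutoff, not a value from the paper.
    """
    selected = {}
    for name, counts in usage.items():
        r = pearson(counts, trait_scores)
        if abs(r) >= min_abs_r:
            selected[name] = r
    return selected


# Hypothetical usage counts for four users and their Extraversion scores.
usage = {
    ":)": [1, 2, 3, 4],
    ":(": [4, 3, 2, 1],
    "the": [2, 2, 2, 2],
    "lol": [1, 3, 2, 2],
}
extraversion = [1.0, 2.0, 3.0, 4.0]

selected = select_features(usage, extraversion)
print(selected)
```

Keeping the signed coefficient rather than only the token name preserves whether a token is positively or negatively associated with the trait.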
EXPERIMENTS AND RESULTS
CONCLUSION AND DISCUSSION
• Proposed HIE, a Heterogeneous Information Ensemble framework that predicts
user personality by integrating heterogeneous information in digital traces,
including self-language usage, avatars, emoticons, and responsive patterns.
• Ran extensive experiments and analysis on a real-world dataset covering both
personality survey results and social media usage from 3,162 volunteers. The
results are promising: HIE outperforms the state-of-the-art models in all
Big Five personality dimensions.
• Limitations, and how could HIE be used in a text retrieval model?
QUESTIONS?