Graph-based Embedding and its Application in Trajectories

Graph-based Embedding
Method and Applications
Hongzhi Shi
2017/6/22
Embedding: representation learning
• Neural-network based
• Word2vec
• Deepwalk
• ……
• Graph based
• LINE
• PTE
• ……
Intuitive and powerful: it requires neither GPUs nor heavy hyperparameter tuning.
Outline
• Methods (a unified framework)
• homogeneous graph
• heterogeneous graph
Embedding one type of item
Embedding multiple types of items
• Applications
Embedding homogeneous graph
Construct the graph
e.g. Social network
Goal:
Learn a low-dimensional representation of each vertex that preserves both first-order and second-order proximity,
i.e.
learn a k-dimensional vector $u_i$ to represent the i-th vertex
Tang, Jian, et al. "LINE: Large-scale information network embedding." WWW 2015.
Learn a k-dimensional vector $u_i$ to represent the i-th vertex
• Define the generating (estimated) probability distribution and the empirical distribution observed from the graph.
• First-order proximity:
estimated: $p_1(v_i, v_j) = \frac{1}{1 + \exp(-u_i^\top u_j)}$; empirical: $\hat{p}_1(v_i, v_j) = \frac{w_{ij}}{W}$, with $W = \sum_{(m,n)\in E} w_{mn}$.
• Second-order proximity:
estimated: $p_2(v_j \mid v_i) = \frac{\exp(u_j'^\top u_i)}{\sum_k \exp(u_k'^\top u_i)}$; empirical: $\hat{p}_2(v_j \mid v_i) = \frac{w_{ij}}{d_i}$, with $d_i$ the out-degree of $v_i$.
• Minimize the KL-divergence of the two probability distributions.
• Objective functions: $O_1 = -\sum_{(i,j)\in E} w_{ij} \log p_1(v_i, v_j)$ and $O_2 = -\sum_{(i,j)\in E} w_{ij} \log p_2(v_j \mid v_i)$.
Tang, Jian, et al. "LINE: Large-scale information network embedding." WWW 2015.
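The second-order objective above can be optimized by edge sampling with negative sampling, as in the LINE paper. Below is a minimal sketch on a hypothetical toy graph (not the authors' reference implementation; LINE uses alias sampling and an asynchronous SGD, simplified here):

```python
# Minimal sketch of LINE-style second-order embedding via edge sampling
# and negative sampling on a toy weighted graph (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

# Toy weighted edge list: (source, target, weight).
edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 0, 1.0), (2, 3, 1.0)]
n, k = 4, 8                                # number of vertices, dimension
U = rng.normal(scale=0.1, size=(n, k))     # vertex embeddings u_i
C = rng.normal(scale=0.1, size=(n, k))     # context embeddings u'_j

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

lr, num_neg = 0.025, 3
weights = np.array([w for _, _, w in edges])
probs = weights / weights.sum()            # sample edges by weight

for _ in range(200):
    i, j, _ = edges[rng.choice(len(edges), p=probs)]
    # Positive update: pull u_i toward the context of its neighbor j.
    g = 1.0 - sigmoid(U[i] @ C[j])
    C[j] += lr * g * U[i]
    U[i] += lr * g * C[j]
    # Negative updates: push u_i away from random contexts.
    for _ in range(num_neg):
        jn = int(rng.integers(n))
        g = -sigmoid(U[i] @ C[jn])
        C[jn] += lr * g * U[i]
        U[i] += lr * g * C[jn]
```

Sampling edges proportionally to their weights is what makes the stochastic gradient match the weighted objective $O_2$ without multiplying gradients by $w_{ij}$.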
Applications
• Language Network
• Word Analogy: Beijing-China≈Paris-France.
• Document classification.
• Social Network
• User classification.
• Citation Network
• Paper classification.
Embedding heterogeneous graph
Embedding one type of item
• Construct the graph,
• e.g. word-word, word-document, and word-label networks
(edge weight: co-occurrence count)
Goal:
Learn a low-dimensional representation of each word that preserves second-order proximity in this graph
Tang, Jian, et al. "PTE: Predictive text embedding through large-scale heterogeneous text networks." KDD 2015.
Embedding heterogeneous graph
Embedding one type of item
• Define the generating probability for each bipartite network.
• Second-order proximity: $p(v_j \mid v_i) = \frac{\exp(u_j^\top u_i)}{\sum_k \exp(u_k^\top u_i)}$
• Empirical distribution: $\hat{p}(v_j \mid v_i) = \frac{w_{ij}}{d_i}$
• Minimize the KL-divergence of the two probability distributions, summed over the word-word, word-document, and word-label networks: $O = O_{ww} + O_{wd} + O_{wl}$.
Tang, Jian, et al. "PTE: Predictive text embedding through large-scale heterogeneous text networks." KDD 2015.
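Because the word embeddings are shared across the three bipartite networks, joint training can alternate edge samples between them. A toy sketch of that idea (names and data are illustrative; PTE also uses negative sampling, omitted here for brevity):

```python
# Sketch of PTE-style joint training: alternate samples from the
# word-word, word-doc, and word-label networks so the shared word
# embeddings fit all three (toy data, illustrative names).
import numpy as np

rng = np.random.default_rng(1)
k = 8
emb = {                                    # one table per node type
    "word":  rng.normal(scale=0.1, size=(5, k)),
    "doc":   rng.normal(scale=0.1, size=(3, k)),
    "label": rng.normal(scale=0.1, size=(2, k)),
}
networks = [                               # (context type, weighted edges)
    ("word",  [(0, 1, 2.0), (1, 2, 1.0)]),   # word-word
    ("doc",   [(0, 0, 1.0), (2, 1, 3.0)]),   # word-document
    ("label", [(1, 0, 1.0), (3, 1, 1.0)]),   # word-label
]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

lr = 0.025
for step in range(300):
    # Round-robin over the three bipartite networks.
    ctx_type, edges = networks[step % 3]
    w, c, _ = edges[int(rng.integers(len(edges)))]
    # Pull the word embedding toward its sampled context node.
    g = 1.0 - sigmoid(emb["word"][w] @ emb[ctx_type][c])
    emb[ctx_type][c] += lr * g * emb["word"][w]
    emb["word"][w] += lr * g * emb[ctx_type][c]
```

The key design point is that only the word table is updated from every network, which is how document and label supervision flows into the word vectors.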
Embedding heterogeneous graph
Embedding multiple types of items
• Construct the graph:
• Data:
• Edges:
• co-occurrence: (1) word-word; (2) word-time; (3) word-location; (4) time-location.
• We set the edge weight to the normalized co-occurrence count.
• neighborhood relationship: (1) location-location; (2) time-time.
• We set the edge weights to the kernel strengths.
Zhang, Chao, et al. "Regions, periods, activities: Uncovering urban dynamics via cross-modal representation learning." WWW 2017.
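The co-occurrence edges above can be built by counting node pairs that appear in the same record. A small sketch in the spirit of the cited paper (record fields and discretization are hypothetical):

```python
# Sketch: build a heterogeneous co-occurrence graph from geo-tagged
# records. Field names and the toy records are illustrative.
from collections import Counter
from itertools import combinations

records = [
    {"words": ["coffee", "latte"], "time": "08h", "location": "cell_12"},
    {"words": ["coffee", "run"],   "time": "08h", "location": "cell_07"},
]

edge_counts = Counter()
for r in records:
    # Typed nodes: (type, value) pairs for words, time bin, location cell.
    nodes = ([("word", w) for w in r["words"]]
             + [("time", r["time"]), ("location", r["location"])])
    for a, b in combinations(nodes, 2):
        edge_counts[frozenset((a, b))] += 1   # co-occurrence count

# Normalize counts to edge weights, as the slide describes.
total = sum(edge_counts.values())
graph = {edge: c / total for edge, c in edge_counts.items()}
```

Neighborhood edges (location-location, time-time) would be added on top of this with kernel-based weights, e.g. a Gaussian kernel over spatial or temporal distance.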
Embedding different types of items into one space
• Define the generating probability (the same as before: second-order proximity)
• Minimize the KL-divergence of the two probability distributions.
Zhang, Chao, et al. "Regions, periods, activities: Uncovering urban dynamics via cross-modal representation learning." WWW 2017.
Application: Cross-domain query
What can be regarded as graph to be embedded?
What can we do through embedding?
[Diagram: entities from the physical world (user, location, time, POI type, …) and from cyberspace (app type, website, …), linked by semantic relations into one graph to be embedded.]
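Once items of different types share one embedding space, a cross-domain query reduces to a nearest-neighbor search across types. A sketch with random stand-in vectors (not trained embeddings; names are illustrative):

```python
# Sketch of a cross-domain query in a shared embedding space:
# rank items of a target type by cosine similarity to a query item.
# Vectors below are random stand-ins for trained embeddings.
import numpy as np

rng = np.random.default_rng(2)
k = 8
space = {  # (type, name) -> embedding in the shared space
    ("location", "cell_12"): rng.normal(size=k),
    ("time", "08h"):         rng.normal(size=k),
    ("word", "coffee"):      rng.normal(size=k),
    ("word", "museum"):      rng.normal(size=k),
}

def query(vec, target_type, topn=2):
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    candidates = [(cos(vec, v), name)
                  for (t, name), v in space.items() if t == target_type]
    return sorted(candidates, reverse=True)[:topn]

# e.g. which words are most associated with location "cell_12"?
results = query(space[("location", "cell_12")], "word")
```

The same call answers queries in any direction (words for a location, locations for a time, and so on), which is the payoff of embedding all types into one space.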
Thank you!
Questions and discussion are welcome!