Graph-based Embedding Methods and Applications
Hongzhi Shi
2017/6/22

Embedding: representation learning
• Neural network based
  • Word2vec
  • DeepWalk
  • ...
• Graph based
  • LINE
  • PTE
  • ...
Graph-based methods are intuitive and powerful: they do not require GPUs or heavy parameter tuning.

Outline
• Methods (a unified framework)
  • Homogeneous graph: embedding one type of item
  • Heterogeneous graph: embedding one type of item, or multiple types of items
• Applications

Embedding a homogeneous graph
• Construct the graph, e.g. a social network.
• Goal: learn a low-dimensional representation of each vertex that preserves both the first-order and the second-order proximity, i.e. learn a k-dimensional vector u_i to represent the i-th vertex.
Tang, Jian, et al. "LINE: Large-scale information network embedding." WWW 2015.

Learn a k-dimensional vector u_i to represent the i-th vertex
• Define the generating probability (the estimated distribution) and the empirical distribution, for both the first-order and the second-order proximity.
• Minimize the KL-divergence between the two distributions; this yields the objective function (a code sketch of the second-order objective appears at the end of these notes).
Tang, Jian, et al. "LINE: Large-scale information network embedding." WWW 2015.

Applications
• Language network
  • Word analogy: Beijing - China ≈ Paris - France.
  • Document classification.
• Social network
  • User classification.
• Citation network
  • Paper classification.

Embedding a heterogeneous graph: embedding one type of item
• Construct the graph, e.g. word-word, word-document, and word-label networks (edge weight: co-occurrence count).
• Goal: learn a low-dimensional representation of each word that preserves the second-order proximity in this graph.
Tang, Jian, et al. "PTE: Predictive text embedding through large-scale heterogeneous text networks." KDD 2015.

Embedding a heterogeneous graph: embedding one type of item
• Define the generating probability (second-order proximity) and the empirical distribution.
• Minimize the KL-divergence between the two distributions.
Tang, Jian, et al. "PTE: Predictive text embedding through large-scale heterogeneous text networks." KDD 2015.

Embedding a heterogeneous graph: embedding multiple types of items
• Construct the graph:
  • Data:
  • Edges:
    • Co-occurrence edges: (1) word-word; (2) word-time; (3) word-location; (4) time-location. The edge weight is the normalized co-occurrence count.
    • Neighborhood edges: (1) location-location; (2) time-time. The edge weights are the kernel strengths.
Zhang, Chao, et al. "Regions, periods, activities: Uncovering urban dynamics via cross-modal representation learning." WWW 2017.

Embedding different types of items into one space
• Define the generating probability (the same as before: second-order proximity).
• Minimize the KL-divergence between the two distributions.
Zhang, Chao, et al. "Regions, periods, activities: Uncovering urban dynamics via cross-modal representation learning." WWW 2017.

Application: cross-domain query

• What can be regarded as a graph to be embedded?
• What can we do through embedding?
(Diagram: example entities from the physical world, such as POI type, user, location, and time, and from cyberspace, such as app type and website, linked through their semantics.)

Thank you! Discussion is welcome!
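To make the second-order objective above concrete: in the cited LINE paper, the model distribution is p(v_j | v_i) = exp(c_j · u_i) / Σ_k exp(c_k · u_i), the empirical distribution is w_ij / d_i (with d_i the out-degree of vertex i), and minimizing the weighted KL-divergence reduces to O_2 = -Σ_{(i,j)∈E} w_ij · log p(v_j | v_i). Below is a minimal NumPy sketch of training this objective with negative sampling. The function name train_line_2nd, the (i, j, weight) edge format, and the uniform negative-sampling distribution are illustrative assumptions, not the released LINE implementation (the paper samples negatives from a noise distribution proportional to vertex degree raised to the 0.75 power and uses alias tables for edge sampling).

```python
# A minimal sketch of LINE-style second-order training with negative sampling,
# using only NumPy. This is NOT the authors' released code; the function name,
# the (i, j, weight) edge format, and uniform negative sampling are simplifying
# assumptions made for illustration.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_line_2nd(edges, n_vertices, dim=8, n_neg=5, lr=0.025, n_iter=10000):
    """Learn vertex embeddings u and context embeddings c so that
    p(j | i) ∝ exp(c[j] · u[i]) approximates the empirical w_ij / d_i."""
    u = (rng.random((n_vertices, dim)) - 0.5) / dim   # vertex embeddings
    c = np.zeros((n_vertices, dim))                   # context embeddings
    weights = np.array([w for _, _, w in edges], dtype=float)
    edge_probs = weights / weights.sum()              # sample edges by weight
    for _ in range(n_iter):
        i, j, _ = edges[rng.choice(len(edges), p=edge_probs)]
        grad_u = np.zeros(dim)
        # One positive target (label 1) plus n_neg uniform negatives (label 0);
        # LINE itself samples negatives proportionally to degree^0.75.
        targets = [(j, 1.0)] + [(int(rng.integers(n_vertices)), 0.0)
                                for _ in range(n_neg)]
        for t, label in targets:
            g = lr * (label - sigmoid(c[t] @ u[i]))   # SGD step
            grad_u += g * c[t]
            c[t] += g * u[i]
        u[i] += grad_u
    return u  # row i is the k-dimensional vector for the i-th vertex
```

On a toy graph, e.g. train_line_2nd([(0, 1, 1.0), (1, 2, 1.0), (2, 3, 2.0)], n_vertices=4, dim=4), the returned rows can be used directly as feature vectors, for instance for the node-classification applications listed above.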