Analysis of Topological Characteristics of Huge Online Social Networking Services
  2007.3.9. Friday 10am, Telefonica Barcelona
  Yong-Yeol Ahn, Seungyeop Han, Haewoon Kwak, Sue Moon, Hawoong Jeong
  To be presented at WWW 2007

Online Social Networking Services
  Portal for people to …
    Stay in touch with friends
    Share photos and personal news
    Find others of common interests
    Establish a forum for discussion

CyWorld
  Largest SNS in South Korea
  Started in September 2001
  10 million users in 2004
  16 million users out of 48 million population
  Front runner of many features
    Friend (il-chon) relationship
    Guestbook
    Testimonial
    Photos - scraps
    Avatar in cyber home

My CyWorld
  “Mini-Homepage”

Overview of the Talk
  Snowball sampling
  CyWorld, MySpace, orkut data sets
  Metrics representative of topological characteristics
  Related works
  Analysis of CyWorld, MySpace, orkut
  Summary
  Future work

Snowball Sampling (I)
  Only feasible sampling method for crawling the web
  Select a seed node

Snowball Sampling (II)
  Pick all nodes directly connected to the seed node (1st layer)

Snowball Sampling (III)
  Pick all nodes directly connected to 1st layer nodes (2nd layer)

CyWorld Data Sets
  Complete snapshot (Nov 2005)
    191 million friend relationships between 12 million users
  Two additional snapshots (Apr/Sep 2005)
  Testimonial network
    100,000 users by snowball sampling

MySpace Data Set
  Largest in the world
  Began in Jul 2003
  Has 130 million users by Nov 2006
  Snowball sampled
    During Sep/Oct 2006
    Random seed to 100,000 users
    About 23% of users had friend list hidden

Orkut Data Set
  Google SNS
  Began in Sep 2002
  Became official Google service in Jan 2004
  Began as invitation-only; open now
  Has 33 million users
  Snowball sampled
    During Jun to Sep 2006
    100,000 users

Metrics of Interest
  Degree distribution
    “Power-law”
    Small number of nodes have large number of links
  Clustering coefficient C(k)
    # of existing links / # of all possible links between a node’s adjacent neighbors
    Close to 1, close to a mesh
  Degree correlation knn
    Degree k vs. mean degree of adjacent neighbors of nodes with degree k
    Assortativity: characteristic of knn distribution

Assortative Mixing
  “Social” + “nonsocial”
  M. E. J. Newman, Phys. Rev. Lett. 89, 208701 (2002)
  [Figure labels: assortative, degree]

Previous Works
  Other networks have been examined
    Milgram’s seminal degree-of-separation experiment
      By snail mail
    Movie actors, human sexual contacts, scientific collaborators
    All heavy tailed and have assortative mixing
  Work on online SNS
    Pussokram.com
      Internet dating community of 30,000 users
    LiveJournal
      1.3 million bloggers with email addr/geography
    Both show some super-heavy tail end

Questions We Raise
  What are the main characteristics of online SNSs?
  How representative is a sample network?
  How does a social network evolve?

CyWorld
  Complete data set
    Degree distribution
    Clustering coefficient
    Degree correlation
    Average path length
  Historical analysis
    Growth in numbers
    Degree distribution, clustering coefficient, degree correlation
    Average path length over time
  Testimonial network

Degree Distribution
  Two scaling regions
  Figure 1-(a): degree distribution, CCDF

Clustering Coefficient Distribution

Degree Correlation
  Not assortative

Average Path Length
  < 5 is about 90%

Historical Analysis

Evolution of Degree Distributions
  Two kinds of driving force

Evolution of Clustering Coefficient
  Becoming more dense

Evolution of Degree Correlation
  Transition from highly disassortative to slightly assortative mixing: sign of forming ‘real-world social relationship’?

Evolution of Path Length
  Start of densification?
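
The three metrics tracked through the analysis slides (degree distribution as a CCDF, clustering coefficient C(k), and degree correlation knn) can be computed on a small undirected graph roughly as follows. This is a minimal sketch, not the authors' code: the function names, the adjacency-dict representation, and the four-node toy graph are all illustrative assumptions.

```python
from collections import defaultdict


def degree_ccdf(adj):
    """Return {k: fraction of nodes whose degree is >= k}."""
    n = len(adj)
    degrees = sorted(len(nbrs) for nbrs in adj.values())
    ccdf = {}
    for i, k in enumerate(degrees):
        # At the first occurrence of k in the sorted list,
        # exactly n - i nodes have degree >= k.
        ccdf.setdefault(k, (n - i) / n)
    return ccdf


def clustering_by_degree(adj):
    """Return {k: mean local clustering coefficient of degree-k nodes}."""
    sums, counts = defaultdict(float), defaultdict(int)
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # undefined when fewer than two neighbors
        nbr_list = list(nbrs)
        # Count existing links among this node's neighbors.
        links = sum(
            1
            for i, u in enumerate(nbr_list)
            for v in nbr_list[i + 1:]
            if v in adj[u]
        )
        sums[k] += links / (k * (k - 1) / 2)
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}


def knn_by_degree(adj):
    """Return {k: mean neighbor degree, averaged over degree-k nodes}."""
    sums, counts = defaultdict(float), defaultdict(int)
    for nbrs in adj.values():
        k = len(nbrs)
        if k == 0:
            continue
        sums[k] += sum(len(adj[v]) for v in nbrs) / k
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}


# Hypothetical toy graph: a triangle a-b-c with a pendant node d on c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
ccdf = degree_ccdf(adj)
c_of_k = clustering_by_degree(adj)
knn = knn_by_degree(adj)
```

In this toy graph knn decreases as k grows (the low-degree node d attaches to the highest-degree node c), which is the disassortative pattern; an assortative network would show knn increasing with k.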
CyWorld Testimonials
  # of Friends ≤ 100
  In CyWorld Testimonials

Degree Correlation
  Not assortative
  Assortative

Sample Networks
  Rare opportunity to validate snowball sampling

Degree Distributions

Clustering Coefficients

Degree Correlations

Sample Networks
  Degree correlation
  Overestimate exponents
  Not fit for clustering coefficient/degree correlation

CyWorld, MySpace, and orkut
  Three major online SNSs of the world
  Common traits?

Degree Distributions

Clustering Coefficient
  The restriction of orkut’s testimonial network system (# of friends is less than about 1000)

Degree Correlation
  Disassortative
  Assortative
  “Real” Social Network
  Figure 8-(c): The degree correlation of two social networks: orkut and MySpace

Summary and Conclusions
  CyWorld has two scaling regions in the degree distribution.
    Other measurements (C(k), degree correlation) also support the existence of two regions.
    Boundary of two regions ~ 100s
      Dunbar’s Law: limit on human social relationships
    Mature online social community
  orkut shows fast-decaying degree distribution and assortative mixing pattern, similar to the small degree region of CyWorld.
    Close-knit, real-world-like social network structure.
  MySpace shows heavy tail and disassortative mixing pattern, similar to the large degree region of CyWorld.
    Popularity is more important than human interaction

Future Work
  Friendship network topology not representative of activities
    Identify steady core and study its evolution
  Explicit vs implicit communities
    Clubs, towns vs cliquish behavior
  Growth model
    Existing preferential attachment-based models do not fit
    Forest fire model? Extensions?

BACKUP SLIDES

Signature of Two Scaling Regions
  Petter Holme, Christofer R. Edling, and Fredrik Liljeros, Structure and time evolution of an Internet dating community, Social Networks, 26, 155 (2004)
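
As a supplement to the Snowball Sampling slides, the layer-by-layer procedure (seed node, then 1st layer, then 2nd layer, until a target sample size is reached) can be sketched as a breadth-first crawl. This is only an illustration under stated assumptions: `snowball_sample`, `get_friends`, and the toy friend lists are hypothetical names and data, not the crawler used for the CyWorld, MySpace, or orkut samples.

```python
from collections import deque


def snowball_sample(get_friends, seed, max_nodes):
    """Breadth-first snowball sample starting from `seed`.

    `get_friends(node)` is an assumed callback returning an iterable of
    neighbors; a user with a hidden friend list would simply yield none.
    Returns {node: layer}, where the seed is layer 0.
    """
    layer_of = {seed: 0}
    queue = deque([seed])
    while queue and len(layer_of) < max_nodes:
        node = queue.popleft()
        for friend in get_friends(node):
            if friend not in layer_of:
                layer_of[friend] = layer_of[node] + 1
                queue.append(friend)
                if len(layer_of) >= max_nodes:
                    break  # target sample size reached
    return layer_of


# Hypothetical toy friend lists; "e" is unreachable from the seed "s".
graph = {
    "s": ["a", "b"],
    "a": ["s", "c"],
    "b": ["s", "d"],
    "c": ["a"],
    "d": ["b"],
    "e": [],
}


def toy_friends(node):
    return graph.get(node, [])


full_sample = snowball_sample(toy_friends, "s", 100)
small_sample = snowball_sample(toy_friends, "s", 3)
```

Because the crawl stops as soon as `max_nodes` is reached, the outermost layer is usually only partially explored, so nodes there appear with fewer links than they really have. That is one reason a sample's metrics can differ from those of the complete network.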