Head/Tail Breaks: A New Classification Scheme for Data with a Heavy-tailed Distribution
Bin Jiang
University of Gävle, Sweden
http://fromto.hig.se/~bjg/

Movement trajectories versus patterns
Commenting on city growth and city modeling, Paul Krugman made this insightful statement: “We have complex messy models, yet reality is startlingly neat and simple.”
I think this statement applies to movement data as well.
When we focus on trajectories, we need very complicated models or algorithms.
When we focus on patterns, the movement data is startlingly simple: as simple as “far more small things than large ones.”

Human mobility patterns: related work
Jiang B., Yin J., and Zhao S. (2009), Characterizing human mobility patterns in a large street network, Physical Review E, 80, 021136.
Jiang B. (2009), Street hierarchies: a minority of streets account for a majority of traffic flow, International Journal of Geographical Information Science, 23(8), 1033-1048.
Jiang B. and Jia T. (2011), Agent-based simulation of human movement shaped by the underlying street structure, International Journal of Geographical Information Science, 25(1), 51-64.

The fourth paradigm (BIG data)
The bigger the data, the more likely it is to be heavy tailed.
Scaling or heavy tailed distributions are ubiquitously observed.

Classification
Data classification involves two basic issues:
Number of classes, and
Class intervals

Natural breaks (Jenks 1963)
Minimizes the variance within classes and maximizes the variance between classes.

Sample data on population densities

Histogram: Gaussian way of thinking
The highly improbable is an outlier under Gaussian thinking.

Rank-size: Scaling way of thinking
The highly improbable is ranked number 1 under scaling thinking.

What is scaling?
Scaling = a recurring structure of far more small things than large ones.
[Figure: two examples labeled NO and YES]

Scaling of geographic space (a hidden order)
Jiang B., Liu X. and Jia T. (2012), Scaling of geographic space as a universal rule for map generalization, Annals of AAG, Preprint: http://arxiv.org/abs/1102.1561.

Heavy tailed distributions
[Figure: plot with axes ln x and ln y]

Head/tail division rule
Given a variable x, if its values follow a heavy tailed distribution, then the mean of x can divide all the values into two parts: those above the mean in the head and those below the mean in the tail (Jiang and Liu 2012).
Jiang B. and Liu X. (2012), Scaling of geographic space from the perspective of city and field blocks and using volunteered geographic information, International Journal of Geographical Information Science, 26(2), 215-229.

Head/tail movement
Head: AT&T, Britannica, national mapping agencies, governments/CNN. Centralized mindset, top-down.
Looooooooong tail: Skype, Wikipedia, OpenStreetMap, WikiLeaks (OpenLeaks). Decentralized mindset, bottom-up.

Victory of the long tail again
Obama is re-elected for a second term.
Romney represents the top 1%.
Obama represents the long long tail.

Head/tail breaks
Iteratively apply the head/tail division rule to a dataset with a heavy tailed distribution until the data in the head is no longer heavy tailed or, more specifically, until the head is no longer a minority of the values (a minority being, e.g., < 40%).
Both the number of classes and the class intervals are automatically or naturally determined.
For example, four classes: [min, m1), [m1, m2), [m2, m3), [m3, max].
Head/tail breaks is more natural than natural breaks (the reasons come later).

Head/tail breaks: A first look

Scaling of USCities

Unrevealed scaling of USCities minorities

Scaling of Swedish streets

Why more natural than natural breaks?
Reflects human binary thinking.
Captures the scaling pattern of the data.
Both the number of classes and the class intervals are automatically or naturally determined.
Reflects figure/ground perception.
Nature and society are like that: “far more small things than large ones”.
The unique thing about the scheme is its simplicity.

Conclusion
This paper has proposed a novel classification scheme, head/tail breaks, for data with a heavy-tailed distribution.
The head/tail breaks scheme captures the hierarchy of the data.
Head/tail breaks can be used for statistical mapping, map generalization and cognitive mapping.
We can unite cartographic mapping and cognitive mapping under the same scaling law.

Thank you very much for your attention!
Questions and comments?
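
As a supplement, the head/tail breaks rule stated in the slides (split the values at their mean, keep the head, and repeat while the head remains a minority) can be sketched in Python. This is a minimal illustration under stated assumptions, not the author's implementation: the names `head_tail_breaks` and `classify`, the 0.4 default (taken from the "e.g., < 40%" hint), the choice to place values equal to the mean in the head (to match intervals of the form [m1, m2)), and the sample values are all hypothetical.

```python
from bisect import bisect_right


def head_tail_breaks(values, head_limit=0.4):
    """Return the successive means m1, m2, ... used as class breaks.

    head_limit is the share below which the head counts as a minority;
    0.4 is an assumption based on the "e.g., < 40%" hint.
    """
    breaks = []
    data = list(values)
    while len(data) > 1:
        m = sum(data) / len(data)
        # Assumption: values equal to the mean go to the head, so that the
        # classes match intervals of the form [m1, m2).
        head = [v for v in data if v >= m]
        # Stop once the head is no longer a minority of the current data.
        if len(head) / len(data) >= head_limit:
            break
        breaks.append(m)
        data = head  # apply the division rule again to the head only
    return breaks


def classify(value, breaks):
    """Return the class index of value: 0 for [min, m1), 1 for [m1, m2), ..."""
    return bisect_right(breaks, value)


# Made-up sample with far more small values than large ones.
sample = [1] * 20 + [5] * 6 + [20] * 3 + [100]
breaks = head_tail_breaks(sample)
print(breaks)  # [7.0, 40.0]
print([classify(v, breaks) for v in (1, 5, 20, 100)])  # [0, 0, 1, 2]
```

In this sample the first mean (7) leaves 4 of 30 values in the head and the second mean (40) leaves 1 of 4, so two breaks and three classes emerge without choosing the number of classes in advance.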