A Dynamic Mobility Histogram Construction Method Based on Markov Chains

Yoshiharu Ishikawa (Nagoya University)
Yoji Machida (University of Tsukuba)
Hiroyuki Kitagawa (University of Tsukuba)

Outline
• Background and Objectives
• Modeling Movement Patterns
• Mobility Histogram: Logical Structure
• Mobility Histogram: Physical Structure
• Experimental Results
• Conclusions

Background
• Advances in GPS and communication technology have enabled the tracking of moving objects
  – Example: a taxi company in Tokyo continually monitors more than 200 taxi cabs
• Movement data is delivered as a data stream
[Figure: moving objects send movement data as a data stream to a moving object database]

Objectives
• Construction and maintenance of a mobility histogram
  – Compact summary of movement data for a specific time period
  – Used for mobility analysis and estimation
• Problems
  – Concrete definition of a mobility histogram
    • How to model movement patterns
  – Compact representation
    • Tradeoff with accuracy
  – Efficient construction and maintenance
    • Incremental processing for streamed data

Basic Idea
[Figure: movement data (as a data stream) enters the histogram maintenance module, which applies incremental updates to the mobility histogram; the mobility analysis / estimation module takes requests for analysis or estimation, issues queries for estimation against the mobility histogram, and returns results]

Approach
• 2-D movement area
• Uniform cell decompositions
  – But allow multiple spatial granularities (e.g., 4 x 4, 16 x 16)
• A movement pattern is represented as a sequence of cell numbers
• Based on the Markov chain model
  – Treats a movement pattern as a Markov chain sequence
  – Well-known model in traffic modeling

Movement Patterns: Example (1)
[Figure: 2 x 2 decomposition with cells 0 to 3 and the trajectories of objects A, B, and C]
Movement pattern of A: 2 2 0 0
Movement pattern of B: 3 3 1 1
Movement pattern of C: 0 2 2 3

Movement Patterns: Example (2)
[Figure: finer cell decomposition; the top two rows of cells are numbered 0 1 4 5 and 2 3 6 7]
• Cell partitioning with different granularities
Movement pattern of A: 11 9 3 1
[Figure: the remaining rows of cells are numbered 8 9 12 13 and 10 11 14 15; the trajectory of object A is overlaid]

Cell Numbering Scheme (1)
   0   1   4   5
   2   3   6   7
   8   9  12  13
  10  11  14  15
• Based on the Z-ordering method
  – Simple encoding method
  – Assigns similar values to neighboring cells
  – Translation to different granularities is easy

Cell Numbering Scheme (2)
Level-1 (2^1 x 2^1) decomposition
Level-2 (2^2 x 2^2) decomposition
[Figure: c(k) denotes cell c of the level-k decomposition; the level-2 cells 0(2), 1(2), 2(2), 3(2) have the binary codes 0000, 0001, 0010, 0011]

Markov Chain Model (example: order = 2)
[Figure: a transition over step 0, step 1, and step 2 shown at two granularities: 2(1) -> 3(1) -> 1(1) at level 1 and 9(2) -> 12(2) -> 6(2) at level 2]

Mobility Histogram as a Data Cube
• Order-n Markov chain statistics are represented as an (n+1)-dimensional data cube
[Figure: data cube; example transition 1(1) 1(1) 0(1)]

Histogram Maintenance
[Figure: the histogram maintenance module applies incremental updates from the movement data stream to the current mobility histogram; the mobility analysis / estimation module issues queries for analysis against the mobility histograms]
• Periodic reconstruction
  – To cope with non-stationary movement patterns
  – Ease of maintenance
  – Old histograms are written to disk

Mobility Histogram: Physical Structure
• Problems in the logical structure: huge space
  – 2 GB (!) for a typical parameter setting
  – Needs multiple cubes for multiple spatial granularities
  – Data cubes are sparse: most mobility patterns rarely occur
• Solution: tree-based representation
  – Unification of quad-tree, k-d tree, and trie
  – Integration of cubes in multiple granularities
  – Selective allocation of nodes
    • Saves memory space

Insertion of 3(2) 6(2) 12(2): BASE method
[Figure: tree with a root and nodes at level 1 and level 2 for step 0, step 1, and step 2; edges are labeled 00, 01, 10, 11 and marked as visited or non-visited; the counter of each node on the visited path is incremented (+1)]
Binary representation
  Step 0: 00 11 (= 3)
  Step 1: 01 10 (= 6)
  Step 2: 11 00 (= 12)

Approximated Histogram (APR)
• Problem of the BASE method
  – Memory size requirement is still high
• Approximated method (APR)
  – Compact histogram construction by adaptive tree expansion
    • Allocate a buffer for each leaf node
    • If skew is observed, the leaf node is expanded
    • The χ² statistic is used to check for non-uniformity
  – The idea is borrowed from decision tree construction over streamed data (e.g., VFDT)

Node Expansion
[Figure: node expansion. Leaf nodes hold buffers; when skew is detected in a buffer, the leaf is expanded into an internal node whose children (edges 00, 01, 10, 11) are internal or leaf nodes. Tree levels are labeled trans_seq[0], trans_seq[1], …]
Quit expansion when the number
of nodes has reached a given constant

Non-uniformity Check
• Use of the χ² test for goodness of fit
[Figure: a leaf buffer holding sequences such as 4(2) 12(2) 6(2), 5(2) 12(2) 9(2), …, 7(2) 13(2) 15(2), and the distribution of next steps over the four counters x00, x01, x10, x11]
• Example: 100 sequences in the buffer
  – Uniform: counts 22, 23, 27, 28
  – Non-uniform: counts 10, 20, 50, 20
• Test statistic:
  \chi^2 = \sum_{c \in \{00, 01, 10, 11\}} \frac{(x_c - \bar{x})^2}{\bar{x}}, \qquad \bar{x} = \frac{1}{4} \sum_{c} x_c
• Null hypothesis: the distribution is uniform
• If the χ² value > 7.815, the distribution is non-uniform at the 5% significance level

Problems in Statistical Test
• Problem: the χ² value is not reliable
  – when the total number is small (e.g., counts 1, 2, 1, 4; total number = 1 + 2 + 1 + 4 = 8)
  – when some values are close to 0 (e.g., counts 0, 10, 20, 25)
  – These situations are common in our case
• Solution: use non-parametric statistics while the χ² value is not reliable
  – Details are given in the paper

Use of Bitmap Cube (APR-BM)
• Minor improvement to the APR method
  – Uses a small bitmap cube in addition to the tree-structured histogram
  – Represents a "correct" summary at some coarse level
  – Improves precision
[Figure: a small bitmap cube at a coarse level combined with the tree-based histogram (APR method); tree nodes at levels 1 and 2 carry counters such as 25336, 13821, 4351, 1293, 538, 299, 53, and 38]
• Accurate estimation for some queries
• Example: partition level = 3, Markov order = 2, bitmap size = 32 KB

Dataset and Environments
• Experimental data
  – Generated with the moving objects simulator by Brinkhoff
  – 1024 x 1024 cells at the finest granularity
  – 1,000 moving objects are on the map at every time instance
• Environment
  – CPU: Pentium 4, 3.2 GHz
  – Memory: 1 GB RAM
  – OS: Cygwin

Histogram Size
• Settings
  – Data size: 1K, 10K, 50K
  – Order-2 Markov transition
• Results
  – The BASE method requires much more storage than APR and APR-BM

Histogram Size (MB)
Data Size   BASE   APR    APR-BM
1K          0.35   0.01   0.04
10K         2.7    0.10   0.13
50K         9.4    0.52   0.55

Construction Time
• Comparison of BASE and APR
  – M: maximal partitioning level (granularity of input sequences)
• Results
  – BASE has a small construction cost
  – APR has nearly O(n^2) cost due to the non-uniformity check, but its processing cost is still small (less than 0.15 ms per input sequence)
[Figure: two charts over data sizes 1K, 10K, and 50K with series M = 5 BASE, M = 5 APR, M = 10 BASE, M = 10 APR (BASE is the naive method, APR the approximate method). Left: Construction Time (ms). Right: Construction Time per Sequence (ms).]

Query Processing Time
• Two types of queries
  – Fine-level: issue queries on the finest partitioning level (M = 10)
  – Mixed-level: issue queries on randomly mixed partitioning levels
• Results
  – Comparison of BASE and APR
  – No difference
  – Quite fast
[Figure: Query Processing Time (ms) by query pattern (fine-level queries, which match the maximal partitioning level, and mixed-level queries, which are coarser than it) for BASE and APR at data sizes 1K, 10K, and 50K]

Accuracy: Histogram Plot (1)
• Order-1 Markov chain histograms
• Partition level = 2
[Figure: histogram plots for BASE ("true" count) and APR]

Accuracy: Histogram Plot (2)
[Figure: histogram difference plot]
Diff Count = |BASE count - APR count|

Precision: Evaluation Measures
• Distance: aggregates the differences (ACT_i - EST_i) over all cube cells i = 1, …, R^(n+1)
  – ACT_i: actual cell value (BASE method)
  – EST_i: estimated cell value (APR and APR-BM methods)
• Relative Error: aggregates the differences between ACT_i and EST_i relative to ACT_i over the cells i = 1, …, 2^(2P(n+1)), normalized by 1 / 2^(2P(n+1))

Evaluation of Precision
• Comparison of APR and APR-BM
  – Using "Distance" and "Relative Error"
• Results
  – Similar results for Distance
  – APR-BM is better in terms of Relative Error
• APR-BM can estimate small cell values accurately
[Figure: two charts comparing APR and APR-BM against Number of Nodes (1K, 2.5K, 5K, 6.692K). Left: Distance. Right: Relative Error.]

Conclusions
• Mobility histogram construction method
  – Based on the Markov chain model
  – Handles streamed trajectory sequences
  – Logical histogram: data cube
  – Physical histogram: tree structure (quad-tree + k-d tree)
    • Adaptive tree growth
    • Approximated representation method
    • Use of non-parametric statistics for exceptional cases
    • Use of a bitmap cube to enhance precision
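
The logical histogram summarized in the conclusions can be illustrated with a minimal Python sketch. It rests on several assumptions that the slides do not state explicitly: cells are numbered by interleaving row and column bits (which reproduces the Z-order grid shown in Cell Numbering Scheme (1)), a coarser cell number is obtained by dropping two bits per level, and the order-n data cube is stored sparsely as a dictionary keyed by (n+1)-tuples. The function names z_cell, coarsen, and count_transitions, as well as the sliding-window counting, are illustrative choices and not the authors' implementation.

```python
from collections import Counter


def z_cell(col, row, level):
    """Return the Z-order cell number of (col, row) in a 2^level x 2^level grid.

    Assumption: each level contributes two bits, row bit first, column bit second.
    """
    cell = 0
    for bit in range(level - 1, -1, -1):
        row_bit = (row >> bit) & 1
        col_bit = (col >> bit) & 1
        cell = (cell << 2) | (row_bit << 1) | col_bit
    return cell


def coarsen(cell, from_level, to_level):
    """Translate a cell number to a coarser level by dropping two bits per level."""
    if to_level > from_level:
        raise ValueError("to_level must not be finer than from_level")
    return cell >> (2 * (from_level - to_level))


def count_transitions(patterns, order):
    """Count order-n transitions as a sparse (n+1)-dimensional cube.

    Assumption: every window of order + 1 consecutive cells is counted once.
    """
    cube = Counter()
    for pattern in patterns:
        for start in range(len(pattern) - order):
            cube[tuple(pattern[start:start + order + 1])] += 1
    return cube


if __name__ == "__main__":
    pattern_a_level2 = [11, 9, 3, 1]
    pattern_a_level1 = [coarsen(c, 2, 1) for c in pattern_a_level2]
    print(pattern_a_level1)
    print(count_transitions([pattern_a_level2], order=2))
```

Applied to the level-2 movement pattern of A from the example slides (11 9 3 1), coarsen yields the level-1 pattern 2 2 0 0 shown in Example (1). A sparse dictionary is used here only because the slides note that the dense cube is mostly empty; the tree-based physical structure described in the slides is a different, more compact representation.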
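
The non-uniformity check that drives adaptive tree growth in the APR method can be sketched in Python as follows. The sketch assumes the statistic is the standard Pearson χ² against a uniform expectation over the four child counters, and it uses the 5% critical value of 7.815 quoted in the slides (3 degrees of freedom). The names chi_square_uniform and is_skewed are hypothetical. The non-parametric fallback for small or near-zero counts is not reproduced, because the slides defer its details to the paper.

```python
# Critical value quoted in the slides: chi-square, 3 degrees of freedom, 5% level.
CHI2_CRITICAL_DF3_5PCT = 7.815


def chi_square_uniform(counts):
    """Pearson chi-square statistic of the counts against a uniform expectation."""
    total = sum(counts)
    if total == 0:
        # An empty buffer carries no evidence of skew.
        return 0.0
    expected = total / len(counts)
    return sum((x - expected) ** 2 / expected for x in counts)


def is_skewed(counts, critical=CHI2_CRITICAL_DF3_5PCT):
    """Return True when uniformity is rejected at the given critical value."""
    return chi_square_uniform(counts) > critical


if __name__ == "__main__":
    for counts in ([22, 23, 27, 28], [10, 20, 50, 20]):
        print(counts, chi_square_uniform(counts), is_skewed(counts))
```

For the two 100-sequence examples in the slides, the counts 22, 23, 27, 28 give a statistic of about 1.04, so uniformity is not rejected, while the counts 10, 20, 50, 20 give 36, which exceeds 7.815 and would trigger expansion of the leaf node.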