NTU/Intel M2M Project: Wireless Sensor Networks
Content Analysis and Management Special Interest Group
Data Analysis Team Monthly Report

1. Team Organization
   Principal Investigator: Shou-De Lin
   Co-Principal Investigator: Mi-Yen Yeh
   Team Members: Chih-Hung Hsieh (postdoc), Yi-Chen Lo (PhD student), Perng-Hwa Kung (Graduate student), Ruei-Bin Wang (Graduate student), Yu-Chen Lu (Undergraduate student), Kuan-Ting Chou (Undergraduate student), Chin-en Wang (Graduate student)

2. Discussion with Champions
   a. Number of meetings with champion in current month: 1 (8/27, by phone)
   b. Major comments/conclusion from the discussion: PRD for phase II

3. Progress between last month and this month
   a. Topic 1: Clustering streams using MsWave
      1) Potential Problem:
         So far, all discussions have assumed that the top coefficient can represent the whole vector. This assumption may fail when the vector is not a time series. In the last experiment, the bounds converged very poorly. We therefore think this problem must be dealt with first; otherwise the results may be poor.
      2) Exchange Dimension:
         - Define d(dim1, dim2) = d( (a11, a21), (a12, a22) ).
         - Find the best sequence (i1, i2, i3, i4) that minimizes ∑_j d(dim_{i_j}, dim_{i_{j+1}}).
         - This is a travelling salesman problem (NP-hard), for which approximate solutions exist.
      3) Two possible directions to work on:
         3.1) Can the result obtained with MsWave be better than that of the original method, e.g. by using fewer frames to summarize? Since MsWave only estimates the real distance, we think its result would not be better and it would save only a little bandwidth.
         3.2) Is there any application that needs the features without sending the frame to the root? We think this has to be discussed with O, since he has more experience in video analysis.
   b. Topic 2: Exploiting Correlation among Sensors
      1) Observations on oversampling:
         1.1) Reason 1: In each cluster, certain sensors send more.
              - Change the “picked sensor” to send more?
                  Choose the sensors that have the max/min total similarity (centroid) within the cluster
                    1. No significant difference
                  Randomly choose the one to send more
              - Parameters to control the sample rate
                    1. Cycle period (5 mod 3)
                    2. Time rate (1→2)
         1.2) Reason 2: Unbalanced clustering.
              Values in parentheses are the actual sampling rates, against an expected rate of 2.5%.
              Cycle period = 10, Time rate = 4
              Berkeley cluster size: 10 (2.59%), 20 (2.53%), 13 (2.79%), 11 (2.71%)
              NO2 cluster size: 5 (5.07%), 7 (3.63%), 10 (2.53%), 38 (2.57%)
         1.3) Summary:
              - Although we have some results, they are still hard to explain.
              - We are collecting more trivial trials to report.
   c. Topic 3: Distributed Nearest Neighbor Search of Time Series Using Dynamic Time Warping
      1) Progress:
         1.1) Rewriting the testing code of both frameworks
         1.2) Investigation of previous experiments
         1.3) Some fixes to the segmentation and the framework
         1.4) Review of the segmentation and bounding techniques
      2) Experiment Conclusion:
         - The smaller S / M is, the better the pruning power.
           1. When T is large, the experiments support the intuition.
           2. As S and T get larger, the variance of the DTWs among all sensors becomes larger, i.e. it is easy for Framework 1 and 2 to prune candidate sensors.
           3. When S and T are small, the experiment outcome is likely to violate the intuition, since it is hard for Framework 1 and 2 to prune candidate sensors.
      3) Future Work:
         - Analyze the number of sites for initialization in Framework 2
         - Analyze the parameters for segmentation
           1. Any background theory?
           2. Different segmentation accuracy for different frameworks?
         - Analyze where to use Framework 1 and Framework 2
           1. K large or small?
           2. Shape of time series?
         - Others
   d. Topic 4: Intelligent Transportation System (ITS)
      Machine Learning: Predict whether the driver will stop at an intersection or not, without using video data.
      1) A new region is adopted to generate features:
      2) Feature Extraction:
         - We extract the data estimated in the region [15m, 45m] ahead of the intersection.
         - Over this 30-meter range, we generate 30 feature values, one per meter of distance.
           1. Ex: GPS speed[1], GPS speed[2], …, GPS speed[30] for distances of 15 meters, 16 meters, …, 44 meters.
           2. Because the GPS sampling rate is once per second, some feature values have to be generated by interpolation.
         - 240 features used:
           1. GPS speed[1], GPS speed[2], …, GPS speed[30] for distances of 15 meters, 16 meters, …, 44 meters.
           2. Acceleration_X[1], …, Acceleration_X[30].
           3. Acceleration_Y[1], …, Acceleration_Y[30].
           4. Acceleration_Z[1], …, Acceleration_Z[30].
           5. Orientation_W[1], …, Orientation_W[30].
           6. Orientation_X[1], …, Orientation_X[30].
           7. Orientation_Y[1], …, Orientation_Y[30].
           8. Orientation_Z[1], …, Orientation_Z[30].
      3) Experimental Results:
         - max validation accuracy: 75.641000
           select 27 features: [13, 9, 3, 14, 1, 10, 12, 2, 11, 4, 18, 7, 23, 6, 8, 20, 17, 28, 30, 25, 19, 15, 24, 21, 22, 5, 29]
         - All selected features are from the GPS_Speed group!
         - best (c,g) = [16.0, 0.001953125], cv-acc = 75.641000
           testing accuracy = 0.743590 (the best so far!)

4. Brief plan for the next month
   a. We will continue the paper survey and refine our proposed approaches.
   b. We will implement our proposed approaches and evaluate their performance.

5. Research Byproducts
   a. Paper: N/A
   b. Served on the Editorial Board of International Journals: N/A
   c. Invited Lectures: N/A
   d. Significant Honors / Awards: N/A
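
Illustrative note on the feature extraction in Topic 4: the report states that once-per-second GPS samples are turned into 30 per-meter values by interpolation, but it does not say which interpolation is used. The sketch below assumes plain linear interpolation over distance to the intersection and clamps to the nearest sample outside the sampled range. The function name `interpolate_features`, its parameters, and the sample readings are hypothetical and are not taken from the project code.

```python
from bisect import bisect_left


def interpolate_features(distances, values, start=15, count=30):
    """Resample `values` onto the grid start, start+1, ..., start+count-1 (meters).

    `distances` holds the distance to the intersection at each raw sample.
    Linear interpolation and edge clamping are assumptions of this sketch.
    """
    # Sort samples by distance so bisect can find the bracketing pair.
    pairs = sorted(zip(distances, values))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]

    features = []
    for k in range(count):
        d = start + k
        i = bisect_left(xs, d)  # first index with xs[i] >= d
        if i < len(xs) and xs[i] == d:
            # A raw sample falls exactly on this grid point.
            features.append(ys[i])
        elif i == 0:
            # Grid point lies before the first sample: clamp.
            features.append(ys[0])
        elif i == len(xs):
            # Grid point lies beyond the last sample: clamp.
            features.append(ys[-1])
        else:
            # Linear interpolation between the two neighbouring samples.
            x0, x1 = xs[i - 1], xs[i]
            y0, y1 = ys[i - 1], ys[i]
            t = (d - x0) / (x1 - x0)
            features.append(y0 + t * (y1 - y0))
    return features


# Hypothetical readings: four one-second samples of a decelerating vehicle.
sample_distances = [50, 38, 26, 14]      # meters to the intersection
sample_speeds = [12.0, 10.0, 8.0, 6.0]   # GPS speed at each sample
gps_speed = interpolate_features(sample_distances, sample_speeds)
print(len(gps_speed), gps_speed[0], gps_speed[-1])
```

Applying the same resampling to each of the eight signal groups (GPS speed, three acceleration axes, four orientation components) would give the 8 × 30 = 240 features listed in the report.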