Modeling Evolution in Spatial Datasets
Paul Amalaman
2/17/2012
Data Mining and Machine Learning Lab

Team Members
Dr Eick Christoph
Nouhad Rizk
Zechun Cao
Sujing Wang
Anirup Dutta
Swati Goyal
Tarikul Islam
Paul Amalaman

I- Background
II- Research Goals
III- Case Study
IV- Summary

I-Background
Machine Learning Techniques are mostly used where
• modeling implicit trends is possible (Regression)
• stable patterns exist in the dataset (Classification)
Simulation Systems are used when
• a model is hard to establish
• there is a great degree of randomness in the attribute values
• there are a lot of interactions between objects
• attributes have to be predicted recursively over many steps
Example Applications of Simulation Systems: Traffic Modeling, Weather Forecasting, Social Networks, Urban Modeling

I-Background continued(3)
Spatial Simulation Systems
ABM
• Cellular Automata (CA) (cell-centered approach)
• Continuous Agent Space or Multi Agent System (MAS) (agent-centered approach)

I-Background continued(3)
Modeling with Cellular Automata
• Concept of neighborhood
• Moore neighborhood
• Von Neumann neighborhood

Moore Neighborhood (the eight cells surrounding P(x,y)):
D(x-1,y-1)  D(x,y-1)  D(x+1,y-1)
D(x-1,y)    P(x,y)    D(x+1,y)
D(x-1,y+1)  D(x,y+1)  D(x+1,y+1)

Von Neumann Neighborhood (the four cells sharing an edge with P(x,y)):
            D(x,y-1)
D(x-1,y)    P(x,y)    D(x+1,y)
            D(x,y+1)

http://en.wikipedia.org/wiki/Von_Neumann_neighborhood
http://en.wikipedia.org/wiki/Moore_neighborhood

I-Background continued(4)
Modeling with Cellular Automata
Cellular Automata
• provide the programmer with a cell-centered programming style in which the set of cells represents regularly organized computing units
• run efficiently on parallel architectures

II-Research Goals
Using Data Mining and Machine Learning Techniques to Enhance Simulation Systems
New approach = Machine Learning Techniques + Spatial Simulation Systems
Goal1: Grid-based Models for Progression in Spatial Datasets
Goal2: Development of Cluster-based Bias Removal Methods

II-Research Goal continued (1)
Goal1: Grid-based Models for Progression in Spatial Datasets
[Figure: a grid at time t and the unknown grid at time t+1]

$$y_{i,j,t+1} = f_{ij}\left(x_{1,1,1,t}, \ldots, x_{1,n,n,t}, \ldots, x_{m,1,1,t}, \ldots, x_{m,n,n,t},\; y_{1,1,t}, \ldots, y_{n,n,t}\right)$$

Known at t: X1(t), X2(t), ..., Xn(t), Y(t)
To predict: X1(t+Δt)=?, X2(t+Δt)=?, ..., Xn(t+Δt)=?, Y(t+Δt)=?
Given that at t we know all the attribute values, including the output variable Y, can we predict all attribute values at t+1?
Challenges:
1. Many target variables to predict; different variables have to be predicted at different locations
2. Target variables are not independent of each other (e.g. some are auto-correlated)
3. Models have to be used over multiple steps

II-Research Goal continued (2)
Goal2: Development of Cluster-based Bias Removal Methods
[Figure: Input x → Model → Output + bias b(x)]
EPA prediction models are meteorological and chemical transport models. Those models are derived from solving differential equations. Over time, the model bias grows larger.
http://www.epa.gov/AMD/CMAQ/ch06.pdf
[Figure: Input x → Model → Output; weather pattern recognition assigns group(x); correction (bias removal) produces the output h(b(x), group(x))]
Bias removal based on weather pattern recognition
Our model h learns group(x) and b(x) and makes a better prediction.

III-Case Study
Improving Ozone Forecasting for the Houston-Galveston Area
Goal1: Development of a Grid-based Prediction Framework
Goal2: Development of Cluster-based Bias Removal Methods
In collaboration with UH-IMAQS, Institute for Multidimensional Air Quality Studies (UH Department of Earth and Atmospheric Science)
- Dr Rappenglueck, Bernhard
- Dr Li, Xiangshang

III-Case Study Continued(1)
Ozone Prediction
Goal 1: Improving Prediction for Spatial Progression
Given what happened at t, can we predict what happens at t+Δ, t+2Δ, ...?

III-Case Study Continued(2)
Ozone Prediction
Goal 2: Improving Forecast Accuracy

III-Case Study Continued(2)
Status of Dissertation
• Methods to collect ozone data and to capture it in a relational database have been developed.
• The necessary knowledge for simulation-based prediction systems in general, and ozone prediction in particular, has been obtained
• Started work on different modeling approaches for grid-based prediction

IV-SUMMARY

Thank you!
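
Backup: the Moore and Von Neumann neighborhoods from the Background section, and the idea of applying a per-cell function recursively over many steps (Goal1), can be illustrated with a minimal sketch. This is not the project's implementation. The names `neighbors`, `step`, `simulate`, and `mean_rule` are hypothetical, and `mean_rule` is only a placeholder for a learned function f_ij; the sketch also assumes a single attribute per cell and simply drops neighbors that fall outside the grid.

```python
from typing import Callable, List, Sequence, Tuple

Grid = List[List[float]]
Offsets = Sequence[Tuple[int, int]]

# Offsets (dx, dy) relative to the center cell P(x, y).
VON_NEUMANN_OFFSETS: Offsets = [(0, -1), (-1, 0), (1, 0), (0, 1)]
MOORE_OFFSETS: Offsets = [
    (dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dx, dy) != (0, 0)
]


def neighbors(grid: Grid, x: int, y: int, offsets: Offsets) -> List[float]:
    """Return the values of the in-bounds neighbors of cell (x, y).

    The grid is indexed as grid[y][x]; neighbors outside the grid are skipped.
    """
    height, width = len(grid), len(grid[0])
    values = []
    for dx, dy in offsets:
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            values.append(grid[ny][nx])
    return values


def mean_rule(center: float, neighbor_values: List[float]) -> float:
    """Hypothetical stand-in for f_ij: average of the cell and its neighbors."""
    values = [center] + neighbor_values
    return sum(values) / len(values)


def step(grid: Grid, offsets: Offsets,
         rule: Callable[[float, List[float]], float]) -> Grid:
    """Compute the grid at t+1 by applying the rule to every cell of the grid at t."""
    return [
        [rule(grid[y][x], neighbors(grid, x, y, offsets))
         for x in range(len(grid[0]))]
        for y in range(len(grid))
    ]


def simulate(grid: Grid, offsets: Offsets,
             rule: Callable[[float, List[float]], float], steps: int) -> Grid:
    """Feed each predicted grid back in as the input for the next step."""
    for _ in range(steps):
        grid = step(grid, offsets, rule)
    return grid
```

For example, `simulate(grid, MOORE_OFFSETS, mean_rule, steps=3)` predicts three steps ahead, each step using the previous prediction as input; this is the setting in which errors can accumulate, as noted in challenge 3 of Goal1.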