Can we increase neighborhood disorder estimation accuracy by incorporating spatially located covariates in kriging models?

Stephen J Mooney, Michael DM Bader, Gina S Lovasi, Kathryn M Neckerman, Andrew G Rundle, Julien O Teitler

Population Association of America

Brief Abstract (150 word limit)

Ordinary kriging, a geostatistical technique that leverages spatial correlation between observed sample points to estimate values at unobserved locations, has been used in the social sciences to estimate contextual measures such as neighborhood physical disorder. Universal kriging extends ordinary kriging by supplementing the spatial model with additional covariates measured at the estimate location. We will use an existing measure of neighborhood physical disorder collected using Google Street View imagery from 1,826 sampled block faces across 4 US cities (New York, Philadelphia, Detroit, and San Jose) to test whether universal kriging can improve disorder estimation. The first analysis will use all measured points within each city to compare accuracy between kriging methods. The second will use random subsamples of the virtually audited data to explore the relationship of sampling density to estimation accuracy. Preliminary results using convenient but not theoretically motivated covariates in Philadelphia achieved only minor decreases in error (-1% to 7%).

Extended Abstract (2-4 pages)

Introduction

In recent years, social science and public health researchers have been increasingly interested in neighborhood contextual factors such as physical disorder or pedestrian infrastructure as drivers of a diverse set of health outcomes (Entwisle 2007). Much of this research has characterized neighborhoods using systematic social observation, wherein trained neighborhood auditors rate street segments on a number of characteristics (Reiss 1971). Such audits have been performed in person (Keyes, McLaughlin et al.
2012), using filmed street images (Sampson and Raudenbush 1999), or using Google Street View imagery (Badland, Opit et al. 2010, Rundle, Bader et al. 2011, Wilson, Kelly et al. 2012). However, because auditing every block in a city is usually prohibitively expensive and time-consuming, many audits have selected a spatial sample of blocks to audit and used geostatistical tools such as ordinary kriging (Cressie 1988) and land use regression (Shmool, Kubzansky et al. 2014) to estimate values at unobserved locations. This approach can substantially reduce audit costs for neighborhood factors whose spatial characteristics are amenable to such techniques while allowing researchers to define neighborhood measures at theoretically relevant scales (Bader and Ailshire 2014).

Figure 1. A map of neighborhood disorder in Philadelphia, PA estimated using ordinary kriging. Green indicates more disorder.

However, because ordinary kriging relies only on spatial covariance to construct estimates, ordinary kriged estimates include error due to differences in small-scale street characteristics such as the presence of retail, due to jurisdictional boundaries, or due to any other cause of differences in the neighborhood construct being measured that cannot be predicted from the spatial covariance structure alone. For example, because pedestrians create litter, places with higher population densities and consequently more pedestrian traffic may have more litter on average, independent of other sources of disorder. However, ordinary kriging cannot incorporate population density into its disorder estimates. By contrast, land use regression can incorporate such covariates, but does not incorporate distances between sample points or spatial covariance. Universal kriging, however, incorporates both spatial correlation and covariates such as land use in prediction models. For example, a universal kriging model could be fitted to spatial data wherein each point in the sample includes a
location, a level of physical disorder, and the population density in the census tract where the sample was taken. Estimates at unobserved locations would then be made by first estimating the level of disorder as in ordinary kriging, then ‘correcting’ for the population density by adding the expected value of the ordinary kriged residual for that population density. In principle, this approach should improve estimation accuracy if the covariates incorporate additional information about the level of disorder (Hengl, Heuvelink et al. 2003). However, to the best of our knowledge, universal kriging has not been explored as a way to improve the accuracy of neighborhood disorder estimates.

Methods

Disorder Data

This analysis will use a measure of neighborhood disorder estimated from 9 virtual audit items collected from a structured spatial sample of 1,826 block faces in four US cities: New York (N=532), Philadelphia (N=503), Detroit (N=502) and San Jose (N=289) using the Computer-Assisted Neighborhood Visual Assessment System (CANVAS) and imagery from Google Street View (Mooney, Bader et al. 2014, Bader, Mooney et al. 2015). The construction and validation of the measure have been described elsewhere (Mooney, Bader et al. 2014).

Covariate Data

We will use spatial data analysis R packages, including ‘rgdal’ and ‘rgeos’, to incorporate covariate information from additional datasets, a ‘high-variety Big Data’ approach (Mooney, Westreich et al. 2015). Specifically, we will merge data from the US Census to identify population density in the census tract surrounding each sampled location (United States Census Bureau 2013). Particularly in cities, population density is a driver of high pedestrian volume, which contributes to minor indicators of disorder such as litter. We will also identify street functional class from TIGER 2015 data released by the US Census Bureau.
We anticipate more indicators of disorder to be present on larger streets owing to the presence of retail and higher pedestrian volumes on such streets. Finally, within New York City, we will identify the community district for each location. Within New York, Community Districts are the smallest unit of municipal government and advocate for a neighborhood’s priorities, potentially including reducing physical disorder.

Universal Kriging vs. Ordinary Kriging

Next, to assess the incremental accuracy gained by incorporating covariates into the kriging model, we will use the R package ‘gstat’ to krige the disorder measure four ways for each city: (1) using ordinary kriging, (2) using universal kriging incorporating population density as a predictor, (3) using universal kriging incorporating street functional class code, and (4) using universal kriging incorporating community district (for New York City only). For each kriging method in each city, we will use leave-one-out (jackknife) cross-validation to estimate the root mean squared error (RMSE) from the model (Bivand, Pebesma et al. 2008). By comparing the RMSE of the universal kriging approach to the RMSE of the ordinary kriging approach, we can estimate the prediction accuracy benefit (or harm) from incorporating additional covariates into the kriging model. Because each city’s kriging model will be estimated independently, consistent findings across all cities may provide evidence of generalizability to other contexts.

Sampling Density

Finally, in order to assess the sensitivity of model accuracy to sampling density, we will estimate the RMSE for the ordinary kriging approach and the best-performing universal kriging approach after deleting sample points. Specifically, we will select a random number between 0 and 100, delete that percentage of sample points from the spatial dataset, then compute the leave-one-out cross-validation RMSE for the resulting model.
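The subsampling procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the R ‘gstat’ workflow proposed here: inverse-distance weighting stands in for a kriging predictor, the function names and the `min_points` threshold are hypothetical, and the data are synthetic. The sketch deletes a random percentage of points, computes the leave-one-out RMSE on what remains, and reports an infinite RMSE when too few points are left to fit a model.

```python
import math
import random


def idw_predict(train, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    A simple stand-in for a kriging predictor (assumption, not kriging itself).
    """
    num = den = 0.0
    for px, py, value in train:
        d = math.hypot(px - x, py - y)
        if d == 0.0:
            return value  # exact hit on a training location
        w = d ** -power
        num += w * value
        den += w
    return num / den


def loo_rmse(points, predict=idw_predict, min_points=3):
    """Leave-one-out cross-validation RMSE over (x, y, value) tuples.

    min_points is a hypothetical threshold: below it the model is treated
    as unfittable and the RMSE is reported as infinite.
    """
    if len(points) < min_points:
        return math.inf
    sq_err = 0.0
    for i, (x, y, value) in enumerate(points):
        train = points[:i] + points[i + 1:]  # hold out point i
        sq_err += (predict(train, x, y) - value) ** 2
    return math.sqrt(sq_err / len(points))


def subsample_rmse(points, rng, predict=idw_predict):
    """Delete a random percentage of points; return (points kept, LOO RMSE)."""
    pct_deleted = rng.uniform(0, 100)
    n_keep = round(len(points) * (1 - pct_deleted / 100))
    kept = rng.sample(points, n_keep)
    return n_keep, loo_rmse(kept, predict)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic block faces: disorder rises from west to east, plus noise.
    pts = [(x, y, 0.1 * x + rng.gauss(0, 0.05))
           for x in range(10) for y in range(10)]
    for _ in range(5):
        n, rmse = subsample_rmse(pts, rng)
        print(f"{n:3d} points kept -> LOO RMSE {rmse:.3f}")
```

Repeating `subsample_rmse` many times and plotting RMSE against the number of points kept gives the kind of density-versus-accuracy curve the analysis aims for; swapping a different `predict` function into `loo_rmse` allows two interpolators to be compared on identical subsamples.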
If too few points remain to estimate a variogram automatically, we will consider the model to have an infinite RMSE. By plotting the relationship of estimated RMSE to sample point count, we can estimate the relationship between sampling density and measure accuracy.

Preliminary Results

As a preliminary analysis, we explored the RMSE resulting from incorporating other variables using a universal kriging model. Because this analysis was preliminary, we used data only from 503 block faces in Philadelphia, and incorporated covariates that had been assessed by virtual street audit rather than the Census measures of more theoretical relevance. As expected, covariates with little relation to physical disorder, such as the condition of the roadway surface, had little impact on estimates. Low sidewalk quality, which is in some cases a marker of neighborhood abandonment, was the only covariate that markedly improved performance. Table 1 displays the results of this preliminary analysis.

Table 1: Preliminary cross-validation results from universal kriging and ordinary kriging models estimating physical disorder in Philadelphia, PA

Geographic feature included as a     Root mean squared error   Percent improvement over
universal kriging covariate          (lower is better)         ordinary kriging
Nothing (ordinary kriging)           0.441                     --
Number of lanes for cars             0.442                     <1%
Presence of a bus stop               0.447                     -1%
Condition of the roadway surface     0.438                     1%
Presence of any visible billboard    0.440                     <1%
Condition of the sidewalk            0.412                     7%
Presence of any rowhouses            0.435                     1%

Conclusions

The proposed project will advance methodological knowledge regarding accuracy and efficiency gains that may be available to social science and public health researchers using universal kriging to assess contextual measures on study subjects.
If we identify accuracy or efficiency gains, we anticipate pursuing independent funding to explore different sampling and spatial interpolation techniques and working to identify relevant, frequently measured covariates to improve estimates.

References

Bader, M. D., S. J. Mooney, Y. J. Lee, D. Sheehan, K. M. Neckerman, A. G. Rundle and J. O. Teitler (2015). "Development and deployment of the Computer Assisted Neighborhood Visual Assessment System (CANVAS) to measure health-related neighborhood conditions." Health & Place 31: 163-172.
Bader, M. D. M. and J. A. Ailshire (2014). "Creating Measures of Theoretically Relevant Neighborhood Attributes at Multiple Spatial Scales [Available online ahead of print February 7, 2014]." Sociological Methodology: (doi:10.1177/0081175013516749).
Badland, H. M., S. Opit, K. Witten, R. A. Kearns and S. Mavoa (2010). "Can Virtual Streetscape Audits Reliably Replace Physical Streetscape Audits?" Journal of Urban Health-Bulletin of the New York Academy of Medicine 87(6): 1007-1016.
Bivand, R. S., E. J. Pebesma and V. Gomez-Rubio (2008). Applied spatial data analysis with R, Springer.
Cressie, N. (1988). "Spatial prediction and ordinary kriging." Mathematical Geology 20(4): 405-421.
Entwisle, B. (2007). "Putting people into place." Demography 44(4): 687-703.
Hengl, T., G. B. Heuvelink and A. Stein (2003). "Comparison of kriging with external drift and regression-kriging." Technical note, ITC 51.
Keyes, K. M., K. A. McLaughlin, K. C. Koenen, E. Goldmann, M. Uddin and S. Galea (2012). "Child maltreatment increases sensitivity to adverse social contexts: neighborhood physical disorder and incident binge drinking in Detroit." Drug and Alcohol Dependence 122(1): 77-85.
Mooney, S. J., M. D. Bader, G. S. Lovasi, K. M. Neckerman, J. O. Teitler and A. G. Rundle (2014). "Validity of an ecometric neighborhood physical disorder measure constructed by virtual street audit." American Journal of Epidemiology: kwu180.
Mooney, S. J., D. J. Westreich and A. M. El-Sayed (2015). "Commentary: Epidemiology in the Era of Big Data." Epidemiology 26(3): 390-394.
Reiss, A. J. (1971). "Systematic observation of natural social phenomena." Sociological Methodology 3(1): 3-33.
Rundle, A. G., M. D. Bader, C. A. Richards, K. M. Neckerman and J. O. Teitler (2011). "Using Google Street View to audit neighborhood environments." American Journal of Preventive Medicine 40(1): 94-100.
Sampson, R. J. and S. W. Raudenbush (1999). "Systematic social observation of public spaces: A new look at disorder in urban neighborhoods." American Journal of Sociology 105(3): 603-651.
Shmool, J. L., L. D. Kubzansky, O. D. Newman, J. Spengler, P. Shepard and J. E. Clougherty (2014). "Social stressors and air pollution across New York City communities: a spatial approach for assessing correlations among multiple exposures." Environmental Health 13(1): 91.
United States Census Bureau. (2013). "TIGER/Line Shapefiles." Retrieved July 31, 2013, from http://www.census.gov/geo/maps-data/data/tiger-line.html.
Wilson, J. S., C. M. Kelly, M. Schootman, E. A. Baker, A. Banerjee, M. Clennin and D. K. Miller (2012). "Assessing the Built Environment Using Omnidirectional Imagery." American Journal of Preventive Medicine 42(2): 193-199.