Visualization of Public Health Data in Maps: The Potential for Misinterpretation
Eric Hanson • Minnesota Environmental Public Health Tracking • Minnesota Department of Health

Introduction

The visualization of public health data in maps can be an informative way to view patterns and trends, to explore and analyze the data, and to convey information quickly. Maps can also cause the data to be misinterpreted. Using an inappropriate classification method when creating a choropleth map, failing to evaluate boundary changes from year to year, and aggregating data in a way that masks patterns can each undermine an accurate cartographic display and analysis of the data. These three potential problems are examined.

Spatial Aggregation and Scale

The ability to identify geographic patterns or have access to a finer geographic scale of data is often dependent on data availability or confidentiality. Viewing data at large geographic units such as county, compared to smaller units such as census tract, reduces the ability to find patterns or see trends in the data. Ideally it is best to view data at smaller geographic units, such as census tract, but often this data is not available, either because of small numbers and confidentiality or because the data was not collected at that level of geography.

In the example below, disease rates are mapped for block groups, census tracts, zip codes and county. The ability to identify patterns in the disease data and provide increased information is most effective with the smaller geographic units of block groups and census tracts. As the size of the units increases to zip code and county, the identification of patterns decreases, as does the ability to make more precise comparisons between geographic units and to access information.

[Figure: disease rates mapped in four panels: Block Groups, Census Tracts, Zip Codes, County. Labels on the figure: Confidentiality, Data availability, Pattern identification, Increased information.]

Choropleth Map Classification Methods

A choropleth map is a type of thematic map in which geographic areas are organized into classes based on values. There are a number of different methods to group values into classes. The most common are equal interval, quantile, natural breaks and standard deviation. Depending on the data distribution, the maps could reveal different results with the same data.

[Figure: the same data set mapped with four classification methods. Each panel pairs a map with a plot of the data distribution and class breaks.]

Equal Interval: Value ranges divided equally.
Classes: 0.00 - 0.66, 0.67 - 1.32, 1.33 - 1.98, 1.99 - 2.64, 2.65 - 3.30

Quantile: Each class contains an equal number of features.
Classes: 0.00 - 0.20, 0.21 - 0.38, 0.39 - 0.68, 0.69 - 0.96, 0.97 - 3.30

Natural Breaks: Classes divided based on natural groupings in the data.
Classes: 0.00 - 0.24, 0.25 - 0.59, 0.60 - 1.07, 1.08 - 2.31, 2.32 - 3.30

Standard Deviation: Classes based on standard deviations from the mean.
Classes: < -0.50 Std. Dev., -0.50 - 0.50 Std. Dev., 0.50 - 1.5 Std. Dev., 1.5 - 2.5 Std. Dev., > 2.5 Std. Dev.

In the example, all four maps use the same data set but show different outcomes. The differences between the equal interval and quantile maps with this data set could result in varied analysis and interpretation. Areas that have relatively low values could be displayed in a high value class, or vice versa, and be misinterpreted as a result.

Often the best choice of classification method may depend on the data that is being displayed and the purpose of the map. Examining the data distribution and how each method categorizes the data should be an essential step in this process.

Geographic Boundary Change

Another source of potential misinterpretation occurs when geographic boundaries change. When viewing trends in data, if the geographic units have changed their boundaries, it could result in an inaccurate comparison of the geographic units between different time periods. The example below shows how an alteration of borders between two geographic units can reveal very different results.

[Figure: Units A through H mapped for Year X and Year Y, with disease counts shown as points.]

From Year X to Year Y, the geographic boundaries of Units A and B change. In Year Y, Unit B loses area to Unit A.

Disease counts are represented by points on the map. If the geographic location of disease data is similar for both Year X and Year Y, but the geographic boundaries change, it could cause a misanalysis when comparing Year X to Year Y. If Unit A has 6 counts of disease in Year X and has 15 counts of disease in Year Y, it may appear there was a large increase of disease counts in Unit A from Year X to Year Y, but in fact, the increase was caused by a geographic border change.

[Figure: choropleth maps of disease counts by unit for Year X and Year Y. Legend classes: 0, 2-3, 4-6, 7-9, 10 - 17.]

Mapping disease counts in a choropleth map for both years will show an increase in disease counts for Unit A in Year Y.

[Figure: bar charts of disease counts by unit (A through H) for Year X and Year Y.]

The apparent increase in disease in Unit A in Year Y is simply the result of a boundary change.

Conclusion

Visualizing public health data on maps can be an important means to examine and analyze data. Maps can also potentially cause misinterpretation of the data if best practices are not applied. Best practices consist of having a firm understanding of the data, of the purposes served by mapping it and of an appropriate classification method for creating a choropleth map; being aware of boundary changes that may have occurred and that could alter comparisons; and, if available and appropriate for the purpose of the map, mapping at the smallest geographic unit possible.
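The difference between equal interval and quantile classification can be checked numerically before any map is drawn. The sketch below is illustrative only: the 20 rates are hypothetical values chosen to be skewed toward zero over the poster's 0.00 to 3.30 range, and the function names are assumptions rather than part of any GIS package. It computes upper class limits for both methods and reports which class a given rate falls into.

```python
from bisect import bisect_left


def equal_interval_breaks(values, k):
    """Upper class limits that split [min, max] into k equal-width ranges."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    # Use the exact maximum as the last limit so the top value is always covered.
    return [lo + width * i for i in range(1, k)] + [hi]


def quantile_breaks(values, k):
    """Upper class limits chosen so each class holds about len(values) / k features."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // k - 1] for i in range(1, k + 1)]


def classify(value, breaks):
    """0-based index of the first class whose upper limit is >= value."""
    return bisect_left(breaks, value)


def class_counts(values, breaks):
    """Number of features that land in each class."""
    counts = [0] * len(breaks)
    for v in values:
        counts[classify(v, breaks)] += 1
    return counts


# Hypothetical disease rates (not the poster's data), skewed toward low values.
rates = [0.00, 0.05, 0.10, 0.20, 0.25, 0.30, 0.35, 0.38, 0.45, 0.50,
         0.60, 0.68, 0.75, 0.80, 0.90, 0.96, 1.20, 1.60, 2.50, 3.30]

ei = equal_interval_breaks(rates, 5)
qu = quantile_breaks(rates, 5)

print("equal interval limits:", [round(b, 2) for b in ei])
print("quantile limits:      ", qu)
print("features per class (equal interval):", class_counts(rates, ei))
print("features per class (quantile):      ", class_counts(rates, qu))
print("class of rate 1.20:", classify(1.20, ei) + 1, "of 5 vs", classify(1.20, qu) + 1, "of 5")
```

With this sample, a rate of 1.20 sits in the second of five classes under equal interval but in the top class under quantile, which is the kind of shift between low and high classes the poster warns about. Comparing the per-class counts for each method is a quick way to examine how a classification treats a given distribution.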
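The boundary-change effect described under Geographic Boundary Change can also be reproduced with a small calculation. This is a minimal sketch under simplifying assumptions: units are modeled as axis-aligned rectangles rather than real polygons, the case locations and Unit B's counts are hypothetical, and only Unit A's counts of 6 and 15 come from the poster's example. The same points are counted twice, once under each year's boundaries.

```python
def count_by_unit(points, units):
    """Count the points that fall inside each unit's rectangle.

    units maps a unit name to (xmin, ymin, xmax, ymax); a point on a shared
    border is assigned to the unit on its east/north side.
    """
    counts = {name: 0 for name in units}
    for x, y in points:
        for name, (xmin, ymin, xmax, ymax) in units.items():
            if xmin <= x < xmax and ymin <= y < ymax:
                counts[name] += 1
                break
    return counts


# Hypothetical case locations, identical in both years.
cases = [(1, 1), (1, 5), (2, 8), (3, 2), (3, 6), (3, 9)]            # 6 cases west of x = 4
cases += [(4, 1), (4, 7), (5, 2), (5, 5), (5, 9),
          (6, 1), (6, 4), (6, 6), (6, 8)]                           # 9 cases between x = 4 and x = 7
cases += [(8, 3), (9, 7)]                                           # 2 cases east of x = 7

# Hypothetical boundaries: the A/B border moves from x = 4 to x = 7,
# so Unit B loses area to Unit A.
units_year_x = {"A": (0, 0, 4, 10), "B": (4, 0, 10, 10)}
units_year_y = {"A": (0, 0, 7, 10), "B": (7, 0, 10, 10)}

year_x = count_by_unit(cases, units_year_x)
year_y = count_by_unit(cases, units_year_y)

print("Year X:", year_x)
print("Year Y:", year_y)
print("Total cases unchanged:", sum(year_x.values()) == sum(year_y.values()))
```

Unit A's count rises from 6 to 15 even though no case location changed and the total stays the same, so the apparent increase comes entirely from the moved border. Checking that totals match across years while individual units shift is one simple signal that a boundary change, not a change in disease, may explain a difference.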