Public Opinion Quarterly, Vol. 81, Special Issue, 2017, pp. 338–356

ASSESSING CHANGES IN COVERAGE BIAS OF WEB SURVEYS IN THE UNITED STATES

DAVID STERRETT*
DAN MALATO
JENNIFER BENZ
TREVOR TOMPSON
NED ENGLISH

Abstract The rising costs and declining response rates of traditional survey modes have spurred many organizations to conduct surveys online. Less expensive broadband connections and the popularity of smartphones have also made it easier for many to access the Internet. The General Social Survey (GSS) shows that the percentage of American adults who use the Internet increased from 69 percent in 2006 to 86 percent in 2014. The increased access has reduced some concerns about the representativeness of Internet surveys. However, there remains little research into coverage bias, which occurs if those not in the sampling frame differ from the target population on variables of interest. This raises a question: With the increase in Internet access, has there been any change in the coverage bias of web surveys? To assess coverage bias, we analyze the GSS to determine whether those with Internet access and those without it became more or less similar between 2006 and 2014. We calculate the potential coverage bias of web-only surveys over these years for sex, age, education, income, race, political ideology, urbanicity, and life satisfaction. We also compare the bias observed in the United States to that in Europe. Our results illustrate that relative coverage bias associated with education, income, race, and age declined between 2006 and 2014, but bias still exists.

David Sterrett is a research scientist at NORC at the University of Chicago, Chicago, IL, USA. Dan Malato is a principal research analyst at NORC at the University of Chicago, Chicago, IL, USA. Jennifer Benz is a principal research scientist at NORC at the University of Chicago in Boston, Boston, MA, USA. Trevor Tompson is the vice president for public affairs research at NORC at the University of Chicago, Chicago, IL, USA. Ned English is a senior research methodologist II at NORC at the University of Chicago, Chicago, IL, USA. The authors thank Rene Bautista, Ipek Bilgen, Allyson Holbrook, Timothy Johnson, and the anonymous reviewers for helpful comments on earlier versions of the article. *Address correspondence to David Sterrett, NORC at the University of Chicago, 55 E. Monroe Street, 30th Floor, Chicago, IL 60603, USA; e-mail: [email protected]. doi:10.1093/poq/nfx002 © The Author 2017. Published by Oxford University Press on behalf of the American Association for Public Opinion Research. All rights reserved. For permissions, please e-mail: [email protected]

The rise of the Internet in the past decade has dramatically changed how many researchers conduct surveys. Researchers frequently rely on the Internet for mixed-mode surveys, web-only surveys, and online panels, with the use of web surveys growing increasingly common across a range of research disciplines (Tourangeau, Conrad, and Couper 2013). The popular media, policy analysts, and academic researchers frequently use Internet surveys of Americans, and many academic publications feature data from surveys conducted online.

Since researchers first began experimenting with web surveys more than a decade ago, a major concern has been whether they can be representative of the general population (Couper 2000, 2007; Stern, Adams, and Elsasser 2009). As of 2014, more than 23 million American households lacked Internet access, and potential coverage bias resulting from differences between those with Internet access and those without access could pose a significant challenge for some uses of web surveys (Bureau of the Census 2014).
An address-based sample, for instance, can reach nearly everyone living in the United States, and only about 2 percent of households do not have access to a cell or landline phone that would be included in a random-digit-dial sample frame (Pew Research Center 2015).

In spite of coverage bias concerns, web surveys are becoming increasingly popular as more Americans gain access to the Internet. Less costly broadband connections and the growth of smartphones have made it easier for many to get online. The General Social Survey (GSS) finds that the percentage of adults in the United States who have Internet access at home or through a mobile device increased from 69 percent in 2006 to 86 percent in 2014 (Smith et al. 1972–2014). As more people are able to get online, coverage of web surveys improves, but it is still far from complete.

Past studies of coverage bias of web-only surveys in the United States and more recent research on declines in coverage bias in Europe raise several key questions: 1) Has there been a change in the coverage bias associated with demographic variables for web-only surveys in the United States? 2) Have changes in coverage bias been similar across demographic groups, or are there differences? 3) How do patterns of change compare to those seen in Europe? 4) And finally, what do the results indicate about the state of coverage bias for web-only surveys? This research aims to answer these questions by using the GSS to compare the demographics, political ideology, and life satisfaction of Americans with Internet access and those without it from 2006 to 2014. The findings are noteworthy for researchers conducting online surveys in the United States.

Background

Coverage bias is a non-observational error in the total survey error framework (Groves 1989; Groves and Lyberg 2010).
Coverage bias can exist whenever certain people or households in the population of interest never have a chance to be studied because they are not part of the sampling frame (Groves et al. 2009). However, coverage bias only occurs when those people who never have a chance to be studied are different from those people who have a chance to be studied. Coverage bias depends on two factors: 1) the proportion of the population not covered; and 2) the difference in the statistic of interest between those who are covered and those who are not covered. Since coverage bias depends on this difference in the statistic of interest, coverage bias can vary on a question-by-question basis. In web-only surveys, those who do not have Internet access are not covered because they do not have an opportunity to complete such a survey.

There is a wealth of research highlighting the potential coverage bias of web-only surveys and documenting disparities in Internet use or access (see Robinson et al. 2015). Access to the Internet varies across locations (Stern, Adams, and Elsasser 2009). In particular, there can be significant differences in connectivity between rural, suburban, and urban communities in the United States (Mossberger, Tolbert, and McNeal 2007; Sylvester and McGlynn 2010; Stern and Rookey 2012). Socioeconomic factors (DiMaggio et al. 2004; Stern and Dillman 2006) and race and ethnicity (Mesch and Talmud 2011; Campos-Castillo 2014) have also been associated with Internet access. Much of this research, however, was conducted when there were lower levels of Internet access in the United States (see Tourangeau, Conrad, and Couper 2013 for a review). Moreover, studies show that the potential coverage bias of web-only surveys has declined in a number of European nations in recent years as Internet access has increased (Vicente and Reis 2012; Mohorko, de Leeuw, and Hox 2013).
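
The two-factor description of coverage bias given earlier in this section can be illustrated with a short Python sketch. It is not taken from the article: the function names are hypothetical, the formula is the standard decomposition of coverage bias for a mean or proportion, and the two group proportions are made-up values chosen only for illustration. The 14 percent not-covered share corresponds to the 2014 GSS estimate of adults without Internet access.

```python
def coverage_bias(share_not_covered: float,
                  mean_covered: float,
                  mean_not_covered: float) -> float:
    """Bias of the covered-group mean relative to the full-population mean.

    Product of the two factors named in the text: the share of the
    population that is not covered, and the covered/not-covered difference.
    """
    return share_not_covered * (mean_covered - mean_not_covered)


def full_population_mean(share_not_covered: float,
                         mean_covered: float,
                         mean_not_covered: float) -> float:
    """Population mean as a share-weighted mix of the two groups."""
    return ((1 - share_not_covered) * mean_covered
            + share_not_covered * mean_not_covered)


# 14 percent without Internet access (GSS 2014). The group values below are
# hypothetical proportions for some survey item, not estimates from the GSS.
share = 0.14
covered, not_covered = 0.60, 0.40

bias = coverage_bias(share, covered, not_covered)
population = full_population_mean(share, covered, not_covered)
print(f"coverage bias: {bias:.3f}")  # 0.028, i.e. 2.8 percentage points
```

If the two groups give the same answer on an item, the second factor is zero and so is the bias, which is why coverage bias can differ from question to question even when the not-covered share is fixed.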
Measuring Internet Coverage

There are a number of ways to measure and conceptualize Internet access (see Tourangeau, Conrad, and Couper 2013 for a review). The GSS has asked about Internet access at home since 2006 and Internet access through a mobile device since 2010. For this research, Internet access from 2010 to 2014 includes both people with Internet at home and those with access via a mobile device. For 2006 and 2008, Internet access includes just those with Internet access at home.

Several other general population surveys attempt to estimate the number of people with access to the Internet, and these surveys vary in their approach and the unit of measurement. Despite differences between how these organizations measure Internet access, they have produced relatively similar estimates of access in the United States during the past decade. A comparison of estimates of Internet access from the GSS, Pew Research Center, and National Telecommunications and Information Administration (NTIA) can be found in figure 1. More details on each organization’s methods for measuring access can be found in appendix 1.

Data for This Analysis

We use the GSS to assess the potential coverage bias of web surveys in the United States. NORC at the University of Chicago, with funding from the National Science Foundation, has conducted the GSS at least every two years since 1972. The data allow for a direct comparison of those with Internet access and those without Internet access. The GSS features a multi-stage area probability sample designed to generate a nationally representative survey. A number of past studies examining coverage bias of web surveys in the United States and Europe have utilized in-person surveys that ask respondents about Internet access (Lee 2006; Mohorko, de Leeuw, and Hox 2013; Tourangeau, Conrad, and Couper 2013).
In-person surveys with address-based sampling designs offer the greatest coverage of the general population and can produce samples that are highly representative of both those with and without Internet access (Groves et al. 2009). The GSS, an in-person survey that asks about Internet access, allows for an analysis of those with and without access based on the same questions asked at the same time in the same survey mode.1

Figure 1. Estimates of Internet Access across Organizations.

1. The items used to assess coverage bias are part of two question modules. The demographic questions and mobile Internet access question are part of a core module asked near the start of the survey. The question about Internet access at home is part of a science module asked in the middle part of the survey. The placement of all these questions within the modules remains consistent across years.

GSS data from 2006 to 2014 are used in this analysis for examining changes in coverage bias over time. Response rates have remained around 70 percent since 2006 (Smith et al. 1972–2014). As with other surveys, nonresponse and sample design present possible sources of error when measuring Internet access on the GSS, but the area-probability sampling and the relatively high response rates help reduce the potential for such error. Response rates and completed interview totals for the 2006–2014 surveys can be seen in table 1. Descriptive statistics and sample sizes for variables included in models can be found in table 2.

The following measures are used for our analysis. The question wording and coding for each variable are in appendix 2.

INTERNET ACCESS

As noted above, Internet access is calculated based on a question about Internet access at home and a question about Internet access via a mobile device. The variable is coded 0 for people who do not have access at home or through a mobile device, and 1 for people who have access either at home or through a mobile device.
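
As a rough illustration of this coding rule, the following Python sketch combines the two yes/no items into a single 0/1 access indicator. The function name and the use of `None` for years in which the mobile question was not asked (2006 and 2008) are assumptions made for the example; the article does not describe how the recode was implemented.

```python
from typing import Optional


def code_internet_access(home_yes: bool,
                         mobile_yes: Optional[bool] = None) -> int:
    """Return 1 if the respondent reports access at home or via a mobile
    device, and 0 otherwise.

    mobile_yes is None when the mobile question was not asked; in that
    case only the home item can produce a 1 (assumption for this sketch).
    """
    return 1 if (home_yes or bool(mobile_yes)) else 0


# Hypothetical respondents
print(code_internet_access(True))          # home access only, mobile not asked
print(code_internet_access(False, True))   # mobile-only access counts as access
print(code_internet_access(False, False))  # no access on either item
```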
About 2 percent of people reported having Internet access through a mobile device but not at home in 2014, and including these people in the access group provides an inclusive measure of coverage. This more inclusive measure of access assumes that a web-only survey can be completed via a mobile device. In addition, this measure assumes that everyone with access at home has an Internet connection conducive to completing surveys (i.e., a connection that is reliable and fast).

TIME

To test whether change in coverage over time is linear, the analysis includes both a linear and a quadratic measure of time. For the linear variable, 2006 = 0, 2008 = 2, 2010 = 4, 2012 = 6, and 2014 = 8. For the quadratic variable, 2006 = 0, 2008 = 4, 2010 = 16, 2012 = 36, and 2014 = 64.

DEMOGRAPHIC CHARACTERISTICS AND ATTITUDES

To make this analysis comparable to that of Mohorko, de Leeuw, and Hox (2013) on Internet coverage in Europe, we include measures of gender, age, education, income, race, ethnicity, urbanicity, life satisfaction, and political ideology.

Table 1. Response Rates for GSS from 2006 to 2014

Year    Response rate (%)    Interviews
2006    71.2                 4,510
2008    70.4                 2,023
2010    70.3                 2,044
2012    71.4                 1,974
2014    69.2                 2,538

Table 2. Sample Sizes and Descriptive Statistics for GSS Variables by Year 2006–2014

                                        2006        2008        2010        2012        2014
Total GSS sample                        N = 4,510   N = 2,023   N = 2,044   N = 1,974   N = 2,538
Internet variable                       N = 1,862   N = 1,492   N = 712     N = 993     N = 1,238
  Have Internet access                  69%         74%         77%         80%         86%
  No Internet access                    31%         26%         23%         20%         14%
Age (missing items)                     18          10          3           5           9
  18–29 years old                       20%         21%         20%         20%         18%
  30–49 years old                       42%         38%         37%         38%         35%
  50–64 years old                       24%         26%         26%         25%         29%
  65 and older                          14%         15%         17%         16%         17%
Race (missing items)                    0           0           0           0           0
  Black                                 12%         13%         14%         14%         14%
  Hispanic                              15%         14%         13%         16%         18%
  Non-Hispanic white                    69%         69%         69%         64%         64%
Education (missing items)               3           1           0           0           0
  Less than high school degree          15%         14%         15%         15%         13%
  A high school degree                  51%         50%         50%         49%         51%
  More than a high school degree        34%         35%         35%         36%         36%
Income (missing items)                  637         249         239         216         224
  Family income < $20,000               18%         17%         21%         18%         17%
  Family income $20,000–$50,000         32%         28%         30%         30%         28%
  Family income > $50,000               50%         55%         49%         51%         55%
Gender (missing items)                  0           0           0           0           0
  Female                                54%         53%         55%         54%         54%
  Male                                  46%         47%         45%         46%         46%
Urbanicity (missing items)              0           0           0           0           0
  Live in rural area                    11%         10%         11%         11%         10%
  Live in suburban area                 33%         32%         31%         35%         35%
  Live in urban area                    57%         58%         58%         55%         55%
Political ideology (missing items)      177         90          71          100         89
  Liberal                               26%         26%         29%         27%         26%
  Moderate                              39%         38%         37%         39%         40%
  Conservative                          35%         36%         34%         34%         34%
Life satisfaction (missing items)       1,524       8           5           10          8
  Very or pretty happy                  88%         86%         86%         87%         88%
  Not too happy                         12%         14%         14%         13%         12%

Note: The 2006 life satisfaction count covers items that were missing or not asked.

Analysis

We model our analysis on the work of Mohorko, de Leeuw, and Hox (2013), which assessed changes in Internet coverage bias between 2005 and 2009 in Europe. They designated the Eurobarometer survey samples for those years as proxies for the target populations of European countries. Similarly, we use the GSS samples as proxies for the US population.
In both the European and US studies, the findings are subject to possible nonresponse and other errors in the surveys that serve as population proxies. Thus, we present relative coverage bias estimates, comparing characteristics of respondents who report having Internet access to the total achieved GSS samples. In regression analyses of change over time, like Mohorko, de Leeuw, and Hox, we analyze the absolute value of the relative coverage bias (the absolute relative coverage bias), to avoid the problem of positive and negative values for relative coverage bias possibly canceling each other out and giving a false impression of no change.

We examine how Internet access relates to demographic characteristics and look at changes from 2006 to 2014. First, we calculate the relative Internet coverage bias for age, gender, income, education, race and ethnicity, political ideology, urbanicity, and happiness. Then, we use bivariate regressions to determine whether there has been a change in the absolute relative coverage bias associated with these characteristics from 2006 to 2014. We test both linear and quadratic measures of time, and we analyze whether the rate of change in bias varies across characteristics. In addition, we use multivariate regression to explore which demographic factors were associated with Internet access in 2014. Finally, we compare the demographic changes in access in the United States to the findings of previous research that documents changes in Internet access in European countries. We conduct all of the analyses in Stata 14 with the GSS weights (WTSSNR) that account for differences in both the probability of selection and nonresponse.

Results

DEMOGRAPHIC DIFFERENCES

Ordinary least squares (OLS) regressions show that absolute relative coverage bias has significantly declined for some, but not all, demographic characteristics in the United States between 2006 and 2014.
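
The relative coverage bias measure and the linear time trend described in the Analysis section can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the authors' Stata code: the function name is hypothetical, relative coverage bias is read as a simple difference in proportions (following the note to table 4), and the trend is an unweighted least-squares fit to five yearly values. The input series is the table 4 row for family income above $50,000.

```python
import numpy as np


def relative_coverage_bias(p_internet: float, p_total: float) -> float:
    """Share of a group among respondents with Internet access minus its
    share in the total sample; positive means overrepresentation."""
    return p_internet - p_total


# Time coding from the TIME subsection: linear and quadratic terms.
years = np.array([2006, 2008, 2010, 2012, 2014])
time_linear = years - 2006            # 0, 2, 4, 6, 8
time_quadratic = time_linear ** 2     # 0, 4, 16, 36, 64

# Relative coverage bias for family income > $50,000 (table 4), by year.
bias_high_income = np.array([0.141, 0.106, 0.115, 0.076, 0.059])

# Absolute relative coverage bias, expressed in percentage points.
abs_bias_pp = np.abs(bias_high_income) * 100

# Assumption: a plain unweighted OLS line through the five yearly values.
# np.polyfit returns coefficients from the highest degree down.
slope, intercept = np.polyfit(time_linear, abs_bias_pp, deg=1)
print(f"intercept = {intercept:.2f}, slope per year = {slope:.2f}")
```

Under these assumptions the fit gives an intercept of about 13.8 and a slope of about –0.97 percentage points per year, close to the income row of table 3. Taking the absolute value first is what prevents positive and negative biases from offsetting one another in the trend.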
Similar to the research on the changes in coverage bias in Europe (Mohorko, de Leeuw, and Hox 2013), the quadratic effect of time was never significant and only the linear effect of time was used in the final models. These models show a significant decline in absolute relative coverage bias associated with income, education, race and ethnicity, and age over time (see table 3).

In 2006, people with more than a high school degree were overrepresented on a web-only survey by about 12.3 percentage points, but there is a decline in absolute relative coverage bias of education of about 1 percentage point a year (coefficient = –1.01, p < .05). A similar pattern emerges for income, with those with high incomes overrepresented in web-only surveys by about 13.8 percentage points in 2006, and a decline in absolute relative coverage bias of about 1 percentage point per year (coefficient = –0.97, p < .05). Overrepresentation of whites also declined significantly between 2006 and 2014 (coefficient = –0.78, p < .01), while the decrease in overrepresentation of people under age 30 is marginally significant (coefficient = –0.18, p < .10). The changes in absolute relative coverage bias associated with gender, political ideology, happiness, and urbanicity are not significant.

Despite the declines in coverage bias related to education levels, income levels, race, and age over time, the web-only population remained significantly different from the overall population on several demographic measures (see table 4). Education, income, age, race, and urbanicity all remain predictive of Internet access in 2014.

Table 3.
Change in Absolute Relative Coverage Bias across Demographic Factors in the United States from 2006 to 2014

Model                                           Intercept (SE)    Time 2006–2014 (SE)
Sex (over-represent male)                       0.82 (.75)        0.07 (.15)
Age (over-represent under 30)                   1.89** (.27)      –0.18+ (.06)
Education (over-represent high education)       12.27** (1.17)    –1.01* (.24)
Income (over-represent high income)             13.79** (.94)     –0.97* (.19)
Race (over-represent white)                     8.37** (.53)      –0.78** (.11)
Urbanicity (over-represent suburb)              3.66** (.60)      –0.19 (.12)
Happiness (over-represent happy)                3.54 (1.56)       –0.42 (.32)
Political ideology (under-represent moderate)   3.10+ (1.09)      –0.29 (.22)

OLS regressions. +p < .10; *p < .05; **p < .01.

Table 4. Relative Coverage Bias for Demographic Characteristics across Years

                                               2006      2008      2010      2012      2014      Average 2006–2014
Age (a)
  18–29 years old                              0.018     –0.014    0.014     –0.012    0.001     0.001
  30–49 years old                              0.025     0.047     0.032     –0.008    0.026     0.024
  50–64 years old                              0.001     0.015     0.005     0.023     0.004     0.010
  65 and older                                 –0.043    –0.047    –0.051    –0.003    –0.031    –0.035
Race and ethnicity (a)
  Black                                        –0.027    –0.033    –0.016    –0.035    0.003     –0.021
  Hispanic                                     –0.076    –0.030    –0.026    –0.009    –0.027    –0.034
  Non-Hispanic white                           0.090     0.061     0.047     0.042     0.021     0.052
Education (a)
  Less than high school degree                 –0.092    –0.057    –0.050    –0.053    –0.044    –0.059
  A high school degree                         –0.038    –0.024    –0.036    –0.016    0.017     –0.019
  More than a high school degree               0.130     0.082     0.086     0.069     0.027     0.079
Household income (a)
  Family income less than $20,000              –0.067    –0.059    –0.051    –0.051    –0.027    –0.051
  Family income between $20,000 and $50,000    –0.057    –0.043    –0.037    –0.019    –0.024    –0.036
  Family income more than $50,000              0.141     0.106     0.115     0.076     0.059     0.099
Gender
  Female                                       –0.002    –0.010    0.019     –0.021    –0.004    –0.004
  Male                                         0.002     0.010     –0.019    0.021     0.004     0.004
Urbanicity (a)
  Live in rural area                           –0.020    –0.015    –0.001    –0.015    –0.012    –0.013
  Live in suburban area                        0.039     0.024     0.034     0.032     0.017     0.029
  Live in urban area                           –0.019    –0.009    –0.032    –0.018    –0.005    –0.017
Political ideology
  Liberal                                      0.017     0.007     0.035     –0.010    0.008     0.011
  Moderate                                     –0.039    –0.015    –0.010    –0.031    –0.002    –0.020
  Conservative                                 0.023     0.008     –0.025    0.041     –0.006    0.008
Life satisfaction
  Very or pretty happy                         0.024     0.055     0.002     –0.007    0.005     0.016
  Not too happy                                –0.024    –0.055    –0.002    0.007     –0.005    –0.016

Note: Each column reports relative coverage bias for the year shown. Positive values indicate overrepresentation, and negative values indicate underrepresentation. A value of 0.01 would indicate 1 percentage point of overrepresentation, while a value of –0.045 would indicate 4.5 percentage points of underrepresentation.
(a) Differences among categories within this variable are statistically significant in the 2014 GSS based on multivariate logistic regressions.

COMPARISON TO PAST RESEARCH ON EUROPE

The changes in coverage bias in the United States over time are comparable to, but distinct from, the declines in coverage bias in Europe detailed by Mohorko, de Leeuw, and Hox (2013). One key similarity is that in both the United States and Europe there has been a significant decline in the overrepresentation of highly educated people in web-only populations. Between 2005 and 2009, bias in Europe of those who are highly educated declined by an average of .78 percentage points per year (Mohorko, de Leeuw, and Hox 2013), which is similar to the 1-percentage-point decline observed in the United States.
The overrepresentation of young people also declined in both Europe and the United States, although the rate of reduction that Mohorko et al. found in Europe from 2005 to 2009 (.60 percentage points a year) is greater than the observed decline in the United States from 2006 to 2014 (.18 percentage points a year). Differences in coverage patterns also emerge between the United States and Europe when looking at trends in gender balance and life satisfaction. From 2005 to 2009, Europe showed a significant decline in the overrepresentation of males and those who say they are happy (Mohorko, de Leeuw, and Hox 2013), though no significant decline in bias among those groups was observed in the United States, as bias related to those variables was relatively low in the United States from 2006 to 2014.

The potential Internet coverage error for most subgroups in Europe has likely decreased since Mohorko, de Leeuw, and Hox’s (2013) analysis, since levels of Internet access have continued to increase since 2009. Statistics from Eurostat (see table 5) measuring the percentage of people aged 16 to 74 who have used the Internet in the past three months show that Internet access for the euro area has increased from an average of 53 percent of the population in 2006 to about 78 percent in 2014 (Eurostat 2016). The proportion of Americans with Internet access in 2014, about 86 percent, is similar to the levels of access in places such as Germany, Belgium, and France. In contrast, more people had Internet access in places such as Iceland, Denmark, and Norway (at least 96 percent), while fewer had access in places such as Poland, Portugal, Greece, and Italy (less than 70 percent).

Table 5. Percent of Those Aged 16–74 Who Used Internet in Past Three Months

Country/region                            2006    2008    2010    2012    2014
Iceland                                   88      91      93      96      98
Denmark                                   83      84      88      92      96
Norway                                    81      89      93      95      96
Luxembourg                                71      81      90      92      95
Netherlands                               81      87      90      93      93
Sweden                                    86      88      91      93      93
Finland                                   77      83      86      90      92
United Kingdom                            66      76      83      87      92
Germany                                   69      75      80      82      86
Belgium                                   62      69      78      81      85
Estonia                                   61      66      74      78      84
France                                    47      68      75      81      84
Austria                                   61      71      74      80      81
Czech Republic                            44      58      66      73      80
Ireland                                   51      63      67      77      80
Slovakia                                  50      66      76      77      80
Euro area                                 53      62      69      74      78
Spain                                     47      56      64      69      76
Latvia                                    50      61      66      73      76
Hungary                                   44      58      61      70      76
Malta                                     38      49      62      69      73
Lithuania                                 42      53      60      66      72
Slovenia                                  51      56      68      68      72
Croatia                                   NA      42      54      62      69
Cyprus                                    34      39      52      61      69
Former Yugoslav Republic of Macedonia     25      42      52      57      68
Poland                                    40      49      59      62      67
Portugal                                  36      42      51      60      65
Greece                                    29      38      44      55      63
Italy                                     36      42      51      56      62
Bulgaria                                  24      35      43      52      55
Romania                                   21      29      36      46      54
Turkey                                    NA      32      38      43      48

Source: Eurostat 2016.

Discussion

This research illustrates that the relative Internet coverage bias associated with education, income, race, and age in the United States declined significantly from 2006 to 2014, and the rate of decline varied across different variables. No significant decline in coverage bias was observed, however, with gender, political ideology, happiness, or urbanicity. The declines in coverage bias related to education and age were similar to those observed in Europe from 2005 to 2009 by Mohorko, de Leeuw, and Hox (2013), although declines in bias associated with gender and life satisfaction were observed in Europe and not the United States. The analysis also shows that education, income, age, race, and urbanicity remain predictors of Internet access in the United States despite the declines in coverage bias across time.

This study may underestimate the extent of potential coverage bias because it assumes that everyone with Internet access has personal capability and adequate facilities to complete a web survey.
Beyond having access to the Internet, people need a certain level of proficiency to complete online tasks such as surveys (Hargittai and Hsieh 2012), and research indicates that Internet proficiency and/or skills can vary across groups (Mossberger, Tolbert, and McNeal 2007; Stern, Adams, and Elsasser 2009). Past studies show that people lacking the necessary proficiency to complete a survey online are likely to be different socially, economically, and politically from those with the access and skills to complete a web survey (Selwyn 2004; Mossberger, Tolbert, and McNeal 2007).

The results of this study have a number of implications for researchers. Our results show that although potential coverage bias has decreased for demographics like education, income, race, and age, there are still significant differences in access to the Internet based on those demographic variables. Since the proportion of Americans with Internet access has remained relatively stable in recent years (Pew reports 84 percent of Americans with access in 2013, 2014, and 2015), researchers need to continue to consider the potential for coverage bias when conducting web-only surveys. The impact of coverage error likely depends on the particular study objectives and highlights the need for a fit-for-purpose research design. The coverage error associated with web-only surveys poses challenges to tracking changes over time, as apparent shifts in statistics of interest could be confounded with increases or decreases in coverage bias among certain subgroups.

The potential coverage bias of key demographic factors inherent to web-only surveys highlights the advantages of using mixed-mode designs to sample and survey the broader population. Mail, telephone, or in-person surveys all provide an opportunity to reach people without Internet access and reduce coverage bias across groups.
However, the reductions in coverage bias of mixed-mode surveys depend on the specific survey design, and reductions in coverage bias need to be weighed against the challenges associated with multiple modes. In addition, it would be worthwhile to assess the benefits and consequences of weight adjustments that consider differences in Internet access within demographic cells.

Potential coverage bias of web-only surveys is declining for several demographic groups, but Americans without Internet access remain a distinct segment of society that should be included in any survey designed to make precise inferences about the broader public.

Appendix 1. Other Measures of Internet Access Rates

Surveys other than the GSS have also attempted to estimate the number of people with access to the Internet, and these surveys vary in their approach and the unit of measurement. Some ask about frequency of Internet use, while other surveys ask about access to the Internet at various places, such as at home or work. Some approach this at an individual level, while others look at it at the household level. The Pew Research Center has tracked Internet use since 2000, and has used a few different approaches to measuring Internet access during that time (Perrin and Duggan 2015).
From January 2005 through February 2012, an Internet user was defined as someone who said “yes” to either “Do you use the Internet, at least occasionally?” or “Do you send or receive email, at least occasionally?” From April 2012 through April 2013, users were defined as anyone who said “yes” to at least one of these three questions: “Do you use the Internet, at least occasionally?”; “Do you send or receive email, at least occasionally?”; or “Do you access the Internet on a cell phone, tablet, or other mobile handheld device, at least occasionally?” Since then, they have included respondents who said “yes” to either “Do you use the Internet or email, at least occasionally?” or “Do you access the Internet on a cell phone, tablet, or other mobile handheld device, at least occasionally?” The National Telecommunications and Information Administration’s (NTIA) estimates of access are based on the Current Population Survey and reflect the number of Americans age 15 and older who say they go online from any location. NTIA also estimates Internet users age three and older and Internet access by household, but figures for those age 15 and older are included here to provide a closer comparison to the GSS and Pew’s findings. The questions used by NTIA to measure access have changed over the years. In 2007–2009, CPS asked, “(Do you/Does anyone) in this household use the Internet at any location?” and then followed up by asking which people use the Internet. Starting in 2010, CPS began to ask if people used the Internet in specific locations, then created an estimate of Internet use based on anyone who accessed the Internet from any of the locations asked about. 
While the exact wording of the questions has changed in small ways and a few additional specific locations have been added to the list asked about between 2010 and 2015, the most recent versions of these questions are “(Do you/Does anyone in this household, including you,) use the Internet at [home/work/school/coffee shop or other business that offers Internet access/while traveling between places/library, community center, park, or other public place/someone else’s home/some other location we haven’t covered yet]?” There is a follow-up for each location to determine who uses the Internet at that location. The NTIA Internet estimate includes anyone who said yes to any of the locations of Internet use (National Telecommunications and Information Administration 2016).

Appendix 2. GSS Question Wording and Variable Coding

INTERNET ACCESS VARIABLES

Internet access at home (INTRHOME): Respondents were asked, “Do you have access to the Internet in your home?” Coded 1 if respondent says yes and 0 if respondent says no.
Internet access, mobile device (WEBMOB): Respondents were asked, “Do you have access to the Internet or World Wide Web in your home through an Internet-enabled mobile device like a smart phone, PDA, or BlackBerry?” Coded 1 if respondent says yes and 0 if respondent says no.
Internet access, at home or through a mobile device (INTRHOMERC): Respondents were asked, “Do you have access to the Internet in your home?” Coded 1 if respondent says yes and 0 if respondent says no. Some respondents were also asked, “Do you have access to the Internet or World Wide Web in your home through an Internet-enabled mobile device like a smart phone, PDA, or BlackBerry?” Coded 1 if respondent says yes to either of these questions and 0 if they did not say yes to either.

DEMOGRAPHIC VARIABLES

Age 18–29 years old (AGE): Respondents were coded 1 if between the ages of 18–29 years old and 0 if older than 29 years old.
Age 30–49 years old (AGE): Respondents were coded 1 if between the ages of 30–49 years old and 0 if younger than 30 or older than 49 years old.
Age 50–64 years old (AGE): Respondents were coded 1 if between the ages of 50–64 years old and 0 if younger than 50 or older than 64 years old.
Age 65 and older (AGE): Respondents were coded 1 if 65 years old or older and 0 if younger than 65 years old.
Black (RACE, HISPANIC): Respondents were asked, “What race do you consider yourself? Are you Spanish, Hispanic, or Latino/Latina?” Coded 1 for non-Hispanic black and 0 for not non-Hispanic black.
Hispanic (RACE, HISPANIC): Respondents were asked, “What race do you consider yourself? Are you Spanish, Hispanic, or Latino/Latina?” Coded 1 for Hispanic and 0 for not Hispanic.
Non-Hispanic white (RACE, HISPANIC): Respondents were asked, “What race do you consider yourself? Are you Spanish, Hispanic, or Latino/Latina?” Coded 1 for non-Hispanic white and 0 for not non-Hispanic white.
Less than a high school degree (DEGREE): Respondents’ education is based on a series of questions about how many years of education they completed and what degrees they received. Coded 1 if respondent reports less than a high school degree, and 0 if respondent reports a high school degree or more education.
A high school degree (DEGREE): Respondents’ education is based on a series of questions about how many years of education they completed and what degrees they received. Coded 1 if respondent reports a high school degree, and 0 if respondent reports either less than a high school degree or more than a high school degree.
More than a high school degree (DEGREE): Respondents’ education is based on a series of questions about how many years of education they completed and what degrees they received. Coded 1 if respondent reports either a junior college degree, bachelor’s degree, or graduate degree. Coded 0 if respondent reports a high school degree or less education.
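The 0/1 coding rules for the Internet access, age, and education variables defined above can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the dictionary keys, and the string labels used for the DEGREE categories are hypothetical and chosen only for this example; the coding rules themselves follow the definitions in this appendix.

```python
from typing import Optional

# Hypothetical string labels for the DEGREE categories coded as
# "more than a high school degree"; the GSS data file uses its own codes.
MORE_THAN_HS = {"junior college", "bachelor", "graduate"}


def internet_access(intrhome: Optional[bool], webmob: Optional[bool]) -> int:
    """Combined indicator in the style of INTRHOMERC.

    Returns 1 if the respondent said yes to either access question and 0
    otherwise. webmob may be None because only some respondents were asked
    the mobile-device question; None is treated as "did not say yes".
    """
    return int(bool(intrhome) or bool(webmob))


def age_indicators(age: int) -> dict:
    """Four mutually exclusive 0/1 age-group indicators for an adult."""
    return {
        "age_18_29": int(18 <= age <= 29),
        "age_30_49": int(30 <= age <= 49),
        "age_50_64": int(50 <= age <= 64),
        "age_65_plus": int(age >= 65),
    }


def education_indicators(degree: str) -> dict:
    """Three mutually exclusive 0/1 education indicators.

    The degree labels are assumptions made for this sketch.
    """
    return {
        "less_than_hs": int(degree == "less than high school"),
        "hs_degree": int(degree == "high school"),
        "more_than_hs": int(degree in MORE_THAN_HS),
    }
```

For instance, a respondent with no home connection who reports access through a mobile device is coded 1 on the combined indicator, so `internet_access(False, True)` evaluates to 1, while a respondent who said no at home and was never asked the mobile question is coded 0.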
Family income less than $20,000 (INCOME06): Respondents were asked, “In which of these groups did your total family income, from all sources, fall last year before taxes, that is.” Respondents were coded 1 if they reported income less than $20,000 and 0 if they reported income higher than $20,000.
Family income between $20,000 and $50,000 (INCOME06): Respondents were asked, “In which of these groups did your total family income, from all sources, fall last year before taxes, that is.” Respondents were coded 1 if they reported income between $20,000 and $50,000 and 0 if they reported income less than $20,000 or income more than $50,000.
Family income more than $50,000 (INCOME06): Respondents were asked, “In which of these groups did your total family income, from all sources, fall last year before taxes, that is.” Respondents were coded 1 if they reported income more than $50,000 and 0 if they reported income less than $50,000.
Gender (SEX): Respondent’s sex is coded 1 for female and 0 for male.
Live in rural area (SRCBELT): Respondents were coded 1 if they lived in an area designated rural and 0 if they lived in any other type of area.
Live in suburban area (SRCBELT): Respondents were coded 1 if they lived in an area designated either large suburb or suburb and 0 if they lived in any other type of area.
Live in urban area (SRCBELT): Respondents were coded 1 if they lived in an area designated as one of the largest 100 standard metropolitan statistical areas or other urban area and 0 if they lived in any other type of area.
Conservative (POLVIEWS): Respondents were asked, “We hear a lot of talk these days about liberals and conservatives. I’m going to show you a seven-point scale on which the political views that people might hold are arranged from extremely liberal—point 1—to extremely conservative—point 7.” Coded 1 if respondent says extremely conservative, conservative, or slightly conservative and 0 if respondent says moderate, extremely liberal, liberal, or slightly liberal.
Moderate (POLVIEWS): Respondents were asked, “We hear a lot of talk these days about liberals and conservatives. I’m going to show you a seven-point scale on which the political views that people might hold are arranged from extremely liberal—point 1—to extremely conservative—point 7.” Coded 1 if respondent says moderate and 0 if respondent says extremely liberal, liberal, slightly liberal, extremely conservative, conservative, or slightly conservative.
Liberal (POLVIEWS): Respondents were asked, “We hear a lot of talk these days about liberals and conservatives. I’m going to show you a seven-point scale on which the political views that people might hold are arranged from extremely liberal—point 1—to extremely conservative—point 7. Where would you place yourself on this scale?” Coded 1 if respondent says extremely liberal, liberal, or slightly liberal and 0 if respondent says moderate, extremely conservative, conservative, or slightly conservative.
Life satisfaction (HAPPY): Respondents were asked, “Taken all together, how would you say things are these days—would you say that you are very happy, pretty happy, or not too happy?” Coded 1 if respondent says very happy or pretty happy and 0 if respondent says not too happy.
References
Bureau of the Census. 2014. “2014 American Community Survey 1-Year Estimates.” Available at http://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=ACS_14_1YR_B28002&prodType=table.
Campos-Castillo, Celeste. 2014. “Revisiting the First-Level Digital Divide in the United States: Gender and Race/Ethnicity Patterns, 2007–2012.” Social Science Computer Review 33:423–39.
Couper, Mick P. 2000. “Web Surveys: A Review of Issues and Approaches.” Public Opinion Quarterly 64:464–94.
———. 2007. “Issues of Representation in eHealth Research (with a Focus on Web Surveys).” American Journal of Preventive Medicine 32:S83–S89.
DiMaggio, Paul, Eszter Hargittai, Coral Celeste, and Steven Shafer. 2004. “From Unequal Access to Differentiated Use: A Literature Review and Agenda for Research on Digital Inequality.” In Social Inequality, edited by K. Neckerman, 355–400. New York: Russell Sage Foundation.
Eurostat. 2016. “Internet Use by Individuals.” Available at http://ec.europa.eu/eurostat/web/products-datasets/-/tin00028.
Groves, Robert M. 1989. Survey Costs and Survey Errors. New York: John Wiley & Sons.
Groves, Robert M., Floyd J. Fowler, Mick P. Couper, James M. Lepkowski, Eleanor Singer, and Roger Tourangeau. 2009. Survey Methodology. New York: John Wiley & Sons.
Groves, Robert M., and Lars Lyberg. 2010. “Total Survey Error: Past, Present, and Future.” Public Opinion Quarterly 74:849–79.
Hargittai, Eszter, and Yuli Patrick Hsieh. 2012. “Succinct Survey Measures of Web-Use Skills.” Social Science Computer Review 30:95–107.
Lee, Sunghee. 2006. “Propensity Score Adjustments as a Weighting Scheme for Volunteer Panel Web Surveys.” Journal of Official Statistics 22:329–49.
Mesch, Gustavo S., and Ilan Talmud. 2011. “Ethnic Differences in Internet Access: The Role of Occupation and Exposure.” Information, Communication & Society 14:445–71.
Mohorko, Anja, Edith de Leeuw, and Joop Hox. 2013. “Internet Coverage and Coverage Bias in Europe: Developments Across Countries and Over Time.” Journal of Official Statistics 29:609–22.
Mossberger, Karen, Caroline J. Tolbert, and Ramona S. McNeal. 2007. Digital Citizenship: The Internet, Society, and Participation. Cambridge, MA: MIT Press.
National Telecommunications and Information Administration. US Department of Commerce. 2016. “Digital Nation Data Explorer—Ages 15+: Uses the Internet (Any Location).” Available at https://www.ntia.doc.gov/other-publication/2016/digital-nation-data-explorer.
Perrin, Andrew, and Maeve Duggan. 2015. “Americans’ Internet Access: 2000–2015.” Pew Research Center, June 26. Available at http://www.pewInternet.org/2015/06/26/americans-Internet-access-2000–2015/.
Pew Research Center. 2015. “Sampling.” Available at http://www.pewresearch.org/methodology/u-s-survey-research/sampling/.
Robinson, Laura, Shelia R. Cotten, Hiroshi Ono, Anabel Quan-Haase, Gustavo Mesch, Wenhong Chen, Jeremy Schulz, Timothy M. Hale, and Michael J. Stern. 2015. “Digital Inequalities and Why They Matter.” Information, Communication & Society 18:569–82.
Selwyn, Neil. 2004. “Reconsidering Political and Popular Understandings of the Digital Divide.” New Media & Society 6:341–62.
Smith, Tom W., Peter Marsden, Michael Hout, and Jibum Kim. General Social Surveys, 1972–2014 [machine-readable data file]. Principal Investigator, Tom W. Smith; Co-Principal Investigator, Peter V. Marsden; Co-Principal Investigator, Michael Hout; Sponsored by National Science Foundation. NORC ed. Chicago: NORC at the University of Chicago [producer and distributor].
Stern, Michael J., Alison E. Adams, and Shaun Elsasser. 2009. “Digital Inequality and Place: The Effects of Technological Diffusion on Internet Proficiency and Usage across Rural, Suburban, and Urban Counties.” Sociological Inquiry 79:391–417.
Stern, Michael J., and Don A. Dillman. 2006. “Community Participation, Social Ties, and Use of the Internet.” City & Community 5:409–24.
Stern, Michael J., and Bryan D. Rookey. 2012. “The Politics of New Media, Space, and Race: A Socio-Spatial Analysis of the 2008 Presidential Election.” New Media & Society 15:519–40.
Sylvester, Dari E., and Adam J. McGlynn. 2010. “The Digital Divide, Political Participation, and Place.” Social Science Computer Review 28:64–74.
Tourangeau, Roger, Fredrick G. Conrad, and Mick P. Couper. 2013. The Science of Web Surveys. New York: Oxford University Press.
Vicente, Paula, and Elizabeth Reis. 2012. “Coverage Error in Internet Surveys: Can Fixed Phones Fix It?” International Journal of Market Research 54:323–45.