Examining Response Categories: Does Minimizing Response Burden Maintain the Instrument?

Anneliese C. Bolland, John M. Bolland, Sara E. Tomek, and Heather M. Moore
The University of Alabama

Abstract

Researchers often cringe when faced with dichotomous data, especially for measured attitudes. Multiple response categories (e.g., Likert-type) are often encouraged because they increase response variability. Yet instrument developers are also cautioned not to overburden respondents. Researchers have concluded that two-answer formats are as reliable as Likert-type response options, while also reducing completion time (Dolnicar & Grun, 2007; Komorita, 1963; Percy et al., 1976). The current study provides additional support for the use of dichotomous response categories, particularly in surveys that require cognitively efficient methods to measure attitudes and beliefs in hard-to-reach populations. Here, two scales (Identity Styles and Ego Strengths) from three consecutive waves (2005, 2006, 2007) of the Mobile Youth Survey (a multiple cohort longitudinal study of poverty and adolescent risk conducted in Mobile, AL between 1998 and 2011) were analyzed. MANOVA was used to examine differences between the 4-point scale (collected in 2005), the artificially dichotomized 4-point scale (collected in 2005), and the true dichotomous scale (collected in 2006), controlling for age differences. Post-hoc analyses were conducted to determine how age contributed to differences. Internal reliability for each version (4-point, artificially dichotomized, true dichotomous) of each subscale was compared. No differences were found between the artificially dichotomized Identity Styles scale and the truly dichotomous scale.
As expected, reliability of the 4-point scale was slightly higher than the reliability of either the artificially dichotomized or the truly dichotomous scale, yet reliability was generally consistent and relatively high in each of the three Identity Styles subscales (values ranged between .56 and .80). Differences were found between the artificially dichotomized Ego Strengths scale and the truly dichotomous scale. We also found that the youngest participants were most affected by the format change, and consistency was higher for the older participants. Reliability was generally consistent and relatively high in each of the eight subscales making up the Ego Strengths scale, regardless of the response format (values ranged between .44 and .69). Generally, consistency is maintained between the four- and two-category scales. Additionally, younger respondents are more affected by response format in the Ego Strengths scale. This result seems counterintuitive, but younger participants may have increased difficulty maintaining consistency across four categories of responses.

Introduction

Binary response options are becoming more favored on attitude surveys (Dolnicar & Grun, 2007). Dolnicar and Grun find two-answer formats equally reliable, while also reducing completion time. The number of questions in a survey can then be increased.

Our goal was to determine how response format affects responses. Implicit in this question is an answer to “Does the response format affect the response?” Considerations in these questions are age, development, the scale, and the subscales. What we really want to know is whether it makes practical and empirical sense to use a dichotomous response on surveys, rather than a Likert-type scale, particularly one with fewer than five response options.
This issue is particularly relevant when working with adolescents or individuals with limited cognitive ability, because a Likert-type scale may create more “noise” or confuse the respondent rather than detect degrees of difference or direction for the researcher. When working with some populations, brevity is important, and less complex response options may be best suited to them.

There are two primary camps on this question, with ample evidence to support both positions. The question is whether the additional options add information or merely create data noise. What appears clear among the mix of research findings is that the answer depends on the measures and possibly on the population. When measuring response styles or behavior, more options may be better; when measuring attitudes or attributes, fewer categories may work better, particularly if the degree of agreement or disagreement is not of interest. Scale reliability appears to be largely unaffected by the number of response categories (see Dolnicar & Grun, 2007), while validity depends largely on subjective interpretation (see Rossiter, 2002, 2011).

Few studies use repeated measures to compare response formats; most simply recode Likert-type scales, which becomes difficult depending on whether there is a clear midpoint. Lee and colleagues (2002) found cultural differences in response patterns dependent on the number of choices, which may also be indicative of issues complex populations may face when responding to surveys. Further, young people and less educated people may have more difficulty with more response options or degrees of intensity (e.g., strongly agree, strongly disagree) (Hartley & MacLean, 2006). The current study provides additional support for the use of dichotomous response categories, particularly in surveys that require cognitively efficient methods to measure attitudes and beliefs in hard-to-reach populations.
Mobile Youth Survey

The MYS is a multiple cohort longitudinal study of poverty and adolescent risk conducted in Mobile, AL between 1998 and 2011. Participants are 10- to 18-year-old adolescents, with over 99% of the sample Black American or mixed race and a mean household income of $6,276 (Bolland, 2005). Current study data consist of the 2005, 2006, and 2007 waves of the MYS.

Participants (N = 12,000+)
  Adolescents aged 9.75 to 19.25 years
  99.5% non-White, low-income (M = $5,000/hh)
  High-risk with low IQ (M = 85.3; KBIT II; n = 463)

Design
  Multiple cohort longitudinal design
  Cluster and convenience sampling
  Targeted low-income neighborhoods
  Intent is 100% response rate from these residents
  Proctor-read with Scantron response forms

Data
  Wave 8 responses: artificially dichotomized Likert-type scale
    Strongly Agree and Agree = Agree
    Strongly Disagree and Disagree = Disagree
  Wave 9 and Wave 10 responses: truly dichotomous scale
    Agree vs. Disagree

Age groups
  Youngest (10-12 years of age)
  Middle (13-15 years of age)
  Oldest (16-18 years of age)

Results

No differences were found between the artificially dichotomized Identity Styles scale and the truly dichotomous scale. As expected, reliability of the 4-point scale was slightly higher than the reliability of either the artificially dichotomized or the truly dichotomous scale, yet reliability was generally consistent and relatively high in each of the three Identity Styles subscales (values ranged between .56 and .80).

Differences were found between the artificially dichotomized Ego Strengths scale and the truly dichotomous scale, F(1, 1288) = 42.063, p < .001. We also found that the youngest participants were most affected by the format change. Reliability was generally consistent and relatively high in each of the eight subscales making up the Ego Strengths scale, regardless of the response format (values ranged between .44 and .69).
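The recoding described under Data and the reliability comparison reported in the Results can be illustrated with a minimal sketch. It is not the authors' analysis code. It assumes a hypothetical item coding (1 = Strongly Disagree, 2 = Disagree, 3 = Agree, 4 = Strongly Agree), which the source does not state, and uses made-up responses rather than MYS data. The function names `dichotomize` and `cronbach_alpha` are illustrative. Cronbach's alpha is computed with the standard formula, alpha = k/(k - 1) * (1 - sum of item variances / variance of total scores); for 0/1 items this is equivalent to KR-20.

```python
from statistics import variance

# Assumed item coding (not stated in the source):
# 1 = Strongly Disagree, 2 = Disagree, 3 = Agree, 4 = Strongly Agree.
VALID_CODES = {1, 2, 3, 4}
AGREE_CODES = {3, 4}


def dichotomize(response):
    """Collapse one 4-point response into 1 (Agree) or 0 (Disagree)."""
    if response not in VALID_CODES:
        raise ValueError(f"unexpected response code: {response!r}")
    return 1 if response in AGREE_CODES else 0


def cronbach_alpha(rows):
    """Cronbach's alpha; rows are respondents, columns are item scores."""
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*rows)]
    # Variance of each respondent's summed scale score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses for one four-item subscale; NOT data from the MYS.
four_point = [
    [4, 3, 4, 3],
    [2, 1, 2, 2],
    [3, 3, 2, 4],
    [1, 2, 1, 1],
    [4, 4, 3, 3],
    [2, 3, 1, 2],
]

# Artificial dichotomization: collapse every response to Agree/Disagree.
artificial = [[dichotomize(value) for value in row] for row in four_point]

print("alpha, 4-point version:", round(cronbach_alpha(four_point), 3))
print("alpha, artificially dichotomized:", round(cronbach_alpha(artificial), 3))
```

Running the same `cronbach_alpha` function on responses collected with a true Agree/Disagree format would give the third reliability value to compare, which is the comparison the study reports for each subscale.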
Table 1: MANCOVA, Estimated Marginal Means, Identity Styles
Table 2: MANCOVA, Estimated Marginal Means, Ego Strengths
Table 3: MANCOVA, Estimated Marginal Means, Ego Strengths: Will
Table 4: Cronbach’s Alpha Over Time, Identity Style Reliability
Table 5: Cronbach’s Alpha Over Time, Ego Strength Reliability

Methodology

MANOVA was run with age as a covariate (i.e., a MANCOVA) so that we could control for development. If there are no statistically significant differences between the artificially dichotomized and truly dichotomous scales, there is no need to examine whether a difference is more pronounced in a particular age group. Likewise, there is no need to examine the subscales or the items individually to determine whether participants seemed to have cognitive difficulty with one of the subscales or with one or more of the items. Internal reliability was also examined across the different waves.

Conclusions

Generally, consistency is maintained between the four- and two-category scales. Additionally, younger respondents are more affected by response format in the Ego Strengths scale. This result seems counterintuitive, but younger participants may have increased difficulty maintaining consistency across four categories of responses. More research is needed to determine age effects in attitude scales.