Evidence and Culture 1

Running head: EVIDENCE QUALITY AND PERSUASIVENESS

Evidence Quality and Persuasiveness: Germans Are Not Sensitive to the Quality of Statistical Evidence

Jos Hornikx
Margje ter Haar
Radboud University Nijmegen

Author Note: Jos Hornikx, Centre for Language Studies, Communication and Information Studies, Radboud University Nijmegen; Margje ter Haar, Centre for Language Studies, Communication and Information Studies, Radboud University Nijmegen. The authors thank Rogier Crijns, Carin Lony, and Frank van Meurs for their help. Correspondence concerning this article should be addressed to Jos Hornikx, Radboud University Nijmegen, Communication and Information Studies, Erasmusplein 1, P.O. Box 9103, 6500 HD Nijmegen, the Netherlands, [email protected]

Abstract

For a long time, research in communication and argumentation has investigated which kinds of evidence are most effective in changing people's beliefs in descriptive claims. For each type of evidence, such as statistical or expert evidence, high-quality and low-quality variants exist, depending on the extent to which the evidence respects norms for strong argumentation. Studies have shown that participants are sensitive to such quality variations in some, but not in all, cultures. This paper expands this work by comparing the persuasiveness of high- and low-quality statistical and expert evidence for participants from two geographically close cultures, the Dutch and the German. Study 1, in which participants (N = 150) judged a number of claims with evidence, underscores earlier findings that high-quality evidence is more persuasive than low-quality evidence for the Dutch, and – surprisingly – also shows that this is less the case for the Germans, in particular for statistical evidence.
Study 2, with German participants (N = 64), shows again that they are not sensitive to the quality of statistical evidence, and rules out that this finding can be attributed to their understanding of the rules of generalization. Together, the findings in this paper underline the need to empirically investigate what norms people from different cultures have for high-quality evidence, and to what extent these norms matter for persuasive success.

Keywords: argument quality, culture, evidence, generalization, uncertainty avoidance

Evidence Quality and Persuasiveness: Germans Are Not Sensitive to the Quality of Statistical Evidence

The quality of the arguments that are provided to convince an audience of a claim determines the extent to which this audience is persuaded. If people are motivated and able to process a message, high-quality arguments are more persuasive than low-quality arguments (e.g., Petty, Rucker, Bizer, & Cacioppo, 2004; Petty & Cacioppo, 1986). The question is: what constitutes a high-quality argument? One way to distinguish between arguments is to start with the concept of evidence, "data (facts or opinions) presented as proof for an assertion" (Reynolds & Reynolds, 2002, p. 429). The study of persuasive evidence has focused on the extent to which types of data affect people's adherence to factual claims that describe (future) events, such as "Listening to classical music helps students to absorb a lot of knowledge in a short period of time". Four types of evidence have generally been compared: statistical and anecdotal evidence in most cases (see Allen & Preiss, 1997; Reinard, 1988), but also causal evidence (providing an explanation for the claim) and expert evidence (in which an expert underscores the claim) (e.g., Hoeken, 2001; Hornikx & Hoeken, 2007). In these comparisons, people judge the probability of different claims, which are supported by different types of evidence.
Research has shown that statistical evidence, which is based on a large number of cases, is generally more persuasive than anecdotal evidence, which is based on a small number of cases (see reviews by Allen & Preiss, 1997; Hornikx, 2005). Studies in this domain give us insights into what kind of data are most likely to change people's beliefs about the phenomena described in claims. Such insights have increased since the notion of evidence quality was introduced. A given piece of evidence, say expert evidence, has a high quality when it respects the norms associated with the underlying argumentation scheme, in this case the argument from expert opinion (e.g., Walton, 1997). For this type of argumentation, just as for others, a number of critical questions have been proposed. These questions can be used to assess whether a given line of argumentation is normatively strong (see, e.g., Garssen, 1997; Walton, Reed, & Macagno, 2008). For the argument from expert opinion, one of the critical questions addresses the relevance of the field of expertise in relation to the topic of the claim (Walton, 1997). In the study of persuasive evidence, variations in the quality of a given type of evidence have been compared (e.g., Hornikx & Hoeken, 2007; Hoeken & Hustinx, 2009; Hoeken, Timmers, & Schellens, 2012). In Hornikx and Hoeken (2007, Study 2), for instance, the persuasiveness of high-quality expert evidence (the field of expertise was relevant to the claim's topic) was compared to that of low-quality expert evidence (the expert's field of expertise was irrelevant to the topic of the claim). Such comparisons are useful for a better understanding of whether the normative quality of an argument has an impact on its persuasive success.
Recently, evidence research has started taking a cross-cultural perspective by comparing the extent to which audiences from different cultural backgrounds are sensitive to evidence quality (e.g., Hornikx & De Best, 2011; Hornikx & Hoeken, 2007). Such research adds to our knowledge about the extent to which normative argument quality is important for persuasive success, and also deepens our insights into which quality criteria matter or do not matter in different cultures. This paper reports on two studies that examine the persuasiveness of high-quality and low-quality evidence for two geographically close Western-European countries, the Netherlands and Germany.

Culture and Persuasive Evidence

In the last sixty years, most research on persuasive evidence has been conducted in the United States, which has led researchers to question "[w]hether other cultures with different expectations for forms of proof would reflect the same outcomes" (Allen & Preiss, 1997, p. 129; McCroskey, 1969; Reynolds & Reynolds, 2002). Research addressing this question has a multiculturalist perspective, which argues that people from other cultures may evaluate evidence differently (see, e.g., MacIntyre, 1988; McKerrow, 1990). This perspective may be considered contradictory to the normative approach to argument/evidence quality, which argues that argument quality is not dependent on the evaluator, but is an intrinsic characteristic of the argument. Only a handful of studies have adopted this multiculturalist perspective in empirical research. In Hornikx (2008), Dutch and French participants ranked the four types of evidence for a number of claims in terms of how persuasive they expected these claims to be for other people (Dutch or French, respectively). No cultural differences were observed; participants' ranking was: statistical (most persuasive for others) – expert – causal – anecdotal evidence.
Hornikx and Hoeken (2007) measured the actual effectiveness of the four types of evidence for other Dutch and French participants (Study 1), and of high- and low-quality variations of statistical and expert evidence for these two cultural groups (Study 2). Cross-cultural differences were mainly found in the second study: Dutch participants were found to be sensitive to evidence quality (high-quality evidence was most persuasive), but French participants were not. In a follow-up study, Hornikx (2011) examined why the French were insensitive to the quality of expert evidence, which varied in the (ir)relevance of the expert's field of expertise to the topic of the claim. Dutch and French students rated the expertise of professors and researchers with a relevant and an irrelevant field of expertise. French students were more obedient than the Dutch students, and more obedience predicted a smaller difference between the relevant and the irrelevant fields of expertise. Thus, the smaller difference in the weight of expertise observed for the French students was partially explained by their obedience, which has a parallel in the French and Dutch educational systems: the French educational system underlines teachers' authority and students' obedience more than the Dutch system does (Hornikx, 2011). Finally, a study in a non-Western setting indicates that Indians are also more persuaded by high-quality expert evidence than by low-quality expert evidence, and more persuaded by (high-quality) statistical evidence than by (low-quality) anecdotal evidence (Hornikx & De Best, 2011). The findings in this study were highly similar to findings reported in studies with Dutch participants. Given that the effects of evidence quality have been investigated in only a small number of cultures, it is important to extend this research to other cultures.
This paper compares the Dutch culture (where effects of evidence quality have been examined before) with the German culture (where such effects have not been studied). Whereas France and the Netherlands differ on both of Hofstede's (1980, 2001) cultural dimensions of power distance (68 versus 38 on a 100-point scale, respectively) and uncertainty avoidance (86 versus 53, respectively), Germany and the Netherlands – which are also highly similar in geographical location and in demographic and economic situation – differ only in uncertainty avoidance (65 versus 53) and not in power distance (35 versus 38). Uncertainty avoidance is "[t]he extent to which the members of a culture feel threatened by uncertain and unknown situations" (Hofstede, 2001, p. 161), and seems relevant for the impact of evidence quality. When judging the probability of a novel claim, evidence supporting that claim helps reduce uncertainty about that judgment. Indeed, a straightforward way to resolve uncertainty is to look for information (e.g., Shuper, Sorrentino, Otsubo, Hodson, & Walker, 2004). Information that presents better reasons for accepting the claim – high-quality evidence – is in a better position to convince a person from a culture with high uncertainty avoidance than low-quality evidence. For instance, the more expert the professor in expert evidence, or the larger the sample size reported in statistical evidence, the more uncertainty about a claim can be reduced. On the other hand, it could also be argued that – given that evidence only affects the evaluation of a claim to a small degree (e.g., Hornikx & Hoeken, 2007) – presenting evidence in a high uncertainty avoidance culture is not very persuasive, regardless of the quality of the evidence.
Given the potential impact of uncertainty avoidance on the evaluation of evidence quality, it is useful to examine the impact of evidence quality on persuasiveness for Germans (higher uncertainty avoidance) and Dutch (lower uncertainty avoidance):

Research question 1: Is the difference between high-quality and low-quality statistical and expert evidence larger for German than for Dutch people?

As the dimension of uncertainty avoidance inspires RQ1, it is important to examine, if RQ1 is answered positively, whether an individual-level measure of the cultural dimension of uncertainty avoidance explains this effect:

Research question 2: Can the difference in RQ1 be explained by individual-level measures of uncertainty avoidance?

The study needed to address these research questions also makes it possible to find additional support for the Dutch sensitivity to evidence quality, and for statistical evidence being more persuasive than expert evidence, both as reported in Hornikx and Hoeken (2007):

Hypothesis 1: For Dutch people, high-quality statistical or expert evidence is more persuasive than low-quality statistical or expert evidence.

Hypothesis 2: For Dutch people, statistical evidence is more persuasive than expert evidence.

Study 1

Method

Material

From a set of 50 descriptive claims in which the consequences of actions or measures are described (Hornikx & Hoeken, 2007), 16 claims were selected: 10 serving as experimental claims and 6 serving as filler claims. The filler claims were supported by anecdotal or causal evidence in order to obscure the study's interest in statistical and expert evidence; these claims were not analyzed in the results. The experimental claims dealt with social issues, for example relating to transportation, work, education, and consumption.
Examples of these claims are (in an English translation) "Boys' performance at school can be improved by putting them next to girls in class", "Playing slow music in supermarkets increases sales", and "Wearing a tie too tightly leads to reduced sight." For each of the 10 experimental claims, four manipulations of evidence were constructed: high-quality statistical evidence, low-quality statistical evidence, high-quality expert evidence, and low-quality expert evidence. The operationalizations were based on earlier studies (Hornikx, 2011; Hornikx & De Best, 2011; Hornikx & Hoeken, 2007), and were adapted for the Dutch and German context. Statistical evidence consisted of the results of a study that demonstrated the effect of the claim for a Dutch or German sample. The quality of statistical evidence was manipulated with respect to the number of cases in the sample, and to the percentage of people in the sample that had experienced the effect. An example of the high- (low-)quality statistical evidence for the claim "Listening to classical music helps students to absorb a lot of knowledge in a short period of time" is "A study among 381 (53) Dutch students has shown that 74% (38%) of them had absorbed a lot of knowledge in a short period of time by listening to classical music." Different percentages and sample sizes were used in the instantiations of statistical evidence. Expert evidence consisted of a university professor who underscored the claim. The quality of expert evidence was manipulated according to the match or mismatch between the domain of expertise of the professor and the topic of the claim. An example of high- (low-)quality expert evidence for the same claim about the effects of classical music is "Professor Dr.
Van Zanten, a specialist in the field of music studies (social psychology) at the University of Amsterdam, underscores that students can absorb a lot of knowledge in a short period of time by listening to classical music." It should be noted that the low-quality evidence instantiations are of lower quality in relative terms (i.e., compared to the high-quality instantiations), but not in absolute terms (e.g., the experts were, in all conditions, credible and knowledgeable). One person translated the Dutch material into German, and another person back-translated the German translations into Dutch (cf. Brislin, 1980). A careful check between the original Dutch material, the translated German material, and the back-translated Dutch material identified a few cases where the wording of claims or evidence was improved. The cross-cultural equivalence of the material (see Harkness, Van de Vijver, & Mohler, 2003) was further ensured for the names and universities employed in the expert evidence (cf. Hornikx & Hoeken, 2007). Dutch and German surnames were selected from databases as relatively common names (e.g., Timmermans and De Groot for the Netherlands; Schmidt and Becker for Germany). Dutch and German universities were selected on the basis of their number of students and the number of inhabitants of the host city (e.g., Erasmus Universiteit Rotterdam and Vrije Universiteit Amsterdam for the Netherlands; Universität Leipzig and Ruhr-Universität Bochum for Germany).

Participants

One hundred and fifty students from Nijmegen (the Netherlands; n = 73) and Münster (Germany; n = 77) participated in the study. Students were on average 22.33 years old (SD = 2.78), ranging between 18 and 36. Although the German participants (M = 23.21, SD = 2.54) were significantly older than the Dutch participants (M = 21.41, SD = 2.73; F (1, 148) = 1.75, p < .001, η2 = .11), age did not interact with evidence quality (F (1, 148) < 1) or evidence type (F (1, 148) = 2.76, p = .10).
The male-female ratio was approximately 50-50 (51.3% male). The percentage of female participants was higher in the Dutch sample (57.5%) than in the German sample (40.3%; χ2(1) = 4.48, p < .05), but there were no significant interactions between gender and evidence quality (F (1, 148) < 1), or between gender and evidence type (F (1, 148) < 1). Finally, when it comes to the education of the participants, the three most frequent disciplines were Biology (17.8%), History (9.6%), and English (9.6%) for the Dutch participants, and Chemistry (37.7%), Medicine (14.3%), and Geology (11.7%) for the German participants.

Design

The study had a 2 (nationality: Dutch, German) x 2 (evidence quality: high, low) x 2 (evidence type: expert, statistical) design, with nationality as a between-subjects factor and evidence quality and evidence type as within-subjects factors. There were five versions of the material. In these versions, the same 10 experimental claims and the same 6 filler claims were presented in an identical order, but the way in which the evidence types and the evidence quality were linked to the 10 experimental claims differed. These links were created on the basis of a Latin square design. Taking all five versions together, each claim was supported by all kinds of evidence (one type in one version). Also, in each version, participants judged two claims with high-quality statistical evidence, two claims with low-quality statistical evidence, two claims with high-quality expert evidence, two claims with low-quality expert evidence, and two claims with no evidence. This no-evidence condition was necessary as a baseline to compute the persuasiveness of evidence (cf. Hoeken & Hustinx, 2009; Hornikx & Hoeken, 2007). In this condition, participants rated only the claim itself.
For the computation of each persuasiveness score for a claim with one of the four other types of evidence (e.g., high-quality statistical evidence), this baseline score for the claim's probability was subtracted, resulting in the contribution that evidence makes to the probability judgments (a positive score means that the evidence increases the probability judgment of the claim).

Instrumentation

The questionnaire, presented as a study on social issues, started with 16 claims with manipulated evidence that were each directly followed by a repetition of the claim without evidence and by a 5-point semantic differential "I find this claim very improbable – very probable" (cf. Hoeken & Hustinx, 2009; Hornikx & Hoeken, 2007). Two individual-level variables (in a Dutch and a German translation) were included as potential constructs to tap into uncertainty avoidance as an explanation for cultural differences in the effects of evidence quality. First, the Need for Structure scale (Neuberg & Newsom, 1993) was included, as it was suggested by Matsumoto and Yoo (2006) as a relevant scale for uncertainty avoidance. Need for Structure measures a person's need for cognitive structure in life with 11 items, such as "I enjoy having a clear and structured mode of life", on 5-point semantic differentials (totally me – totally not me). A principal component analysis showed that 8 of the 11 items loaded on a single factor (34% explained variance; α = .72; deleted items: 5, 8, 10). Second, the Need for Precision scale (Viswanathan, 1997), which measures people's preference for fine-grained processing, was included as another measure of uncertainty avoidance. One example of the 13 items on 5-point semantic differentials (totally me – totally not me) was "Vague descriptions leave me with the need for more information". A principal component analysis showed that 7 of the original 13 items loaded on a single factor (33% explained variance; α = .65; deleted items: 1, 2, 6, 8, 12, 13).
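The reliability coefficients reported for these scales (e.g., α = .72) are Cronbach's alpha values. For readers unfamiliar with the coefficient, the computation can be sketched as follows; this is a minimal illustration with made-up ratings, not the study's actual data:

```python
# Cronbach's alpha for a set of questionnaire items.
# ratings: one row per participant, one column per item (e.g., 5-point scores).
# Illustrative sketch only; the data below are hypothetical.

def cronbach_alpha(ratings):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(ratings[0])  # number of items

    def var(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [var([row[i] for row in ratings]) for i in range(k)]
    total_var = var([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Example: four participants answering three 5-point items.
data = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
]
alpha = cronbach_alpha(data)
```

High alpha values arise when participants who score high on one item also score high on the others, which is why the coefficient is read as internal consistency of a scale.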
Before the final questions about participants' demographics (age, gender, nationality, and current education), the manipulation of the normatively strong and normatively weak expert evidence was checked. On 5-point Likert scales, participants judged the degree to which they believed that the experts they had read about in the evidence had enough expertise to make a judgment about the specific claim they endorsed (cf. Hornikx & Hoeken, 2007).

Procedure and Statistical Tests

On the university campuses of the Radboud University Nijmegen (the Netherlands) and the University of Münster (Germany), students were recruited individually by a Dutch person (speaking Dutch or German, respectively). Those students who were willing to participate were asked to fill in a written questionnaire. Participation was voluntary. After the questionnaires were handed in (after 10 to 15 minutes), participants were thanked for their cooperation. The analysis of the impact of evidence quality on the persuasiveness of evidence was based on the computed difference scores (cf. Hoeken & Hustinx, 2009; Hornikx & Hoeken, 2007). These scores express the degree to which participants found the claims more probable when they were supported by evidence than when they were not. Analyses of variance with repeated measures, with nationality (between-subjects), evidence type (within-subjects), and evidence quality (within-subjects), were conducted to address the research questions and hypotheses.

Results

Preliminary analyses

Study 1 was conducted to examine whether Germans would be more sensitive to evidence quality variations than the Dutch. A preliminary analysis showed that the manipulation of high- and low-quality expert evidence was successful: professors with relevant fields of expertise (M = 3.35, SD = 0.86) were believed to have more expertise than professors with irrelevant fields of expertise (M = 2.16, SD = 0.91; F (1, 148) = 182.77, p < .001, η2 = .55).
This main effect of field of expertise was qualified by a significant interaction with nationality (F (1, 148) = 3.93, p < .05, η2 = .03): the difference between relevant and irrelevant fields was larger for the Dutch participants (M = 3.56, SD = 0.71, and M = 2.19, SD = 0.95, respectively) than for the German participants (M = 3.15, SD = 0.94, and M = 2.13, SD = 0.95, respectively), but – importantly – the difference was present in both groups. Finally, it was checked whether the probability scores of the claims without evidence were indeed low, as intended – giving participants the opportunity to indicate higher probability ratings when evidence was included. The probability scores of the claims without evidence were below the midpoint of the scale, and there was no difference between the German participants (M = 2.45, SD = 0.84) and the Dutch participants (M = 2.49, SD = 0.95), F (1, 148) < 1.

Main Analyses

An analysis of variance showed that there was no main effect of Evidence type (F (1, 148) < 1) or Nationality (F (1, 148) = 2.23, p = .14), and no interactions between Evidence type and Nationality (F (1, 148) < 1) or between Evidence type and Evidence quality (F (1, 148) < 1). However, there was a main effect of Evidence quality (F (1, 148) = 5.74, p < .05, η2 = .04): high-quality evidence (M = 0.41, SD = 0.74) was found to be more effective than low-quality evidence (M = 0.25, SD = 0.59). RQ1 centered on the question of whether this effect of evidence quality would be larger for Germans than for Dutch people. The interaction effect relevant to RQ1 was marginally significant (F (1, 148) = 3.21, p = .08), and since there was a tendency for this interaction to be dependent on Evidence type (F (1, 148) = 3.09, p = .08), the interaction Evidence quality x Nationality was analyzed for each type of evidence separately.
For expert evidence, there was no significant interaction between Evidence quality and Nationality (F (1, 148) < 1); for statistical evidence, there was a significant interaction (F (1, 148) = 5.80, p < .05, η2 = .04). For the Dutch participants, high-quality statistical evidence was more persuasive than low-quality statistical evidence (F (1, 72) = 4.47, p < .05, η2 = .06), but this was not the case for the German participants (F (1, 76) = 1.38, p = .24). The relevant descriptive statistics are given in Table 1.

--- TABLE 1 ABOUT HERE ---

RQ2 addressed the question of whether a larger effect of evidence quality for the German participants could be explained by individual-level measures of uncertainty avoidance. As the answer to RQ1 was not affirmative, the same is true for RQ2. An additional analysis of variance showed no differences between the Dutch and the Germans in the mean scores on Need for Structure (F (1, 148) < 1; Dutch: M = 3.04, SD = 0.55; Germans: M = 3.09, SD = 0.60) and Need for Precision (F (1, 148) = 1.80, p = .18; Dutch: M = 3.35, SD = 0.43; Germans: M = 3.45, SD = 0.46). Finally, when a difference score was computed between the persuasiveness of high- and low-quality evidence (which is a measure of sensitivity to evidence quality), correlation analyses showed that Need for Structure (r(150) = -.01, p = .93) and Need for Precision (r(150) = .01, p = .92) did not correlate with this measure of sensitivity. H1 predicted that the Dutch would be sensitive to evidence quality variations. This hypothesis was supported by the main effect of Evidence quality documented above. A specific test for the Dutch participants alone further underlines that high-quality evidence was found to be more persuasive than low-quality evidence (F (1, 72) = 6.80, p < .05, η2 = .09). H2, finally, predicted that the Dutch would be more persuaded by statistical than by expert evidence (cf.
Hornikx & Hoeken, 2007), but this prediction did not find empirical support in Study 1 (F (1, 72) < 1).

Conclusion and Discussion

Study 1 was conducted to examine the effects of evidence quality for Dutch and German participants. The main question was whether the Germans would be more sensitive to evidence quality variations than the Dutch. Results showed that this was not the case. Whereas the Dutch participants were more persuaded by high-quality than by low-quality evidence (thereby supporting H1), German participants proved to be less sensitive to evidence quality variations than the Dutch (RQ1). This reduced sensitivity was related to statistical evidence. Although the difference between high-quality and low-quality statistical evidence was non-significant, the direction of the effect was striking: low-quality statistical evidence (M = 0.38) seemed to outperform high-quality statistical evidence (M = 0.24) (means from Table 1). This observation is incongruent with earlier studies on the impact of evidence quality on the evaluation of claims. The manipulation of the quality of statistical evidence used in this study was borrowed from Hornikx and Hoeken (2007), and in that study participants proved to be sensitive to the rules of generalization underlying statistical evidence. In Study 1, these rules of generalization were not tested. Although consistent with normative criteria for valid generalizations based on numerical information (i.e., sample size), the manipulations of statistical evidence may have been evaluated differently by the German participants. Therefore, Study 2 was conducted to re-examine German evaluations of high-quality and low-quality statistical evidence, and to control for their understanding of the rules of generalization.
The lack of an effect of evidence quality may also be related to reduced motivation, as motivation has been shown to affect the difference between strong and weak arguments (Petty et al., 2004; Petty & Cacioppo, 1986). In Study 1, potential German participants were contacted by an outgroup member, namely a Dutch person from the Netherlands. In Study 2, a university staff member recruited the potential participants. The central question was:

Research question 3: Is high-quality statistical evidence more persuasive than low-quality statistical evidence for German people?

Study 2

Method

The design of this study was based on Study 1. As the instrumentation now included more control questions and as the focus was on statistical evidence alone, the number of claims with evidence that the German participants judged was lower.

Material

From the 10 experimental claims in Study 1, 6 claims were randomly chosen. There were no filler claims. The same instantiations of statistical evidence quality were used as in Study 1.

Participants

Sixty-four German students from the University of Münster participated in the study. They were on average 22.13 years old (SD = 3.48), ranging between 18 and 41. Most of the participants were female (67.2%), and the majority studied Dutch language and literature (92.2%).

Design

Study 2 had a within-subjects design: participants judged two claims with high-quality statistical evidence, two claims with low-quality statistical evidence, and two claims without evidence. A Latin square design distributed these three conditions of evidence over the six claims and the three versions that were created.

Instrumentation

The introduction and the probability items to measure the effectiveness of evidence were identical to Study 1. After the six judgments, there was a check for evidence quality (developed from an existing, simpler measure in Hornikx & Hoeken, 2007).
Technically, this scale did not measure whether the manipulation was successful (in a way, the evidence respected or did not respect norms for high-quality evidence independently of the perceptions of the participants; O'Keefe, 2003); rather, it measured the extent to which participants understood the rules underlying generalization from samples to a population. In the Understanding of Generalization scale, participants were asked to indicate on 5-point scales which of two examples they would prefer as proof of the generality of the occurrence of an effect. A higher score indicated a better understanding of the criteria for high-quality statistical evidence: "the effect occurs in 35% of 46 persons – the effect occurs in 78% of 314 persons", "the effect occurs in 78% of 46 persons – the effect occurs in 78% of 314 persons", and "the effect occurs in 35% of 46 persons – the effect occurs in 35% of 314 persons". The Understanding of Generalization scale, consisting of the average of the three items, was reliable (α = .81). Next, participants were presented with the Need for Precision scale (Viswanathan, 1997). A principal component analysis identified a 1-factor solution (32% explained variance; deleted items: 3, 4, 6, 10). As this scale was not reliable (α = .33), it was discarded in the analyses. A new scale replaced the Need for Structure scale from Study 1, namely the Preference for Numerical Information scale (Viswanathan, 1993). This scale was measured with Hornikx and Hoeken's (2007) subset of 8 of the 20 original items, such as "I enjoy work that requires the use of numbers". A principal component analysis identified a 1-factor solution (60% explained variance; α = .93; no items deleted). The questionnaire ended with questions about participants' demographics (age, gender, nationality, and current education).
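The normative intuition behind the Understanding of Generalization items can be made concrete with the standard error of a sample proportion, sqrt(p(1-p)/n): larger samples yield more precise estimates of how widespread an effect is. The following sketch (an illustration added here, not part of the original instrument) applies this to the scale's own example figures:

```python
import math

def prop_se(p, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

# The scale's contrasting examples: which result is better proof
# that an effect occurs in general?
se_small = prop_se(0.35, 46)   # "the effect occurs in 35% of 46 persons"
se_large = prop_se(0.78, 314)  # "the effect occurs in 78% of 314 persons"
```

The second example has a standard error roughly a third of the first, so, normatively, a result based on 314 persons licenses a much firmer generalization than one based on 46 persons.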
Procedure and Statistical Tests

Students were recruited at the University of Münster (Germany) by an ingroup person, a Dutch staff member who had been affiliated with the university for years. As in Study 1, those who were willing to participate were asked to fill in a written questionnaire. Again, participation was voluntary, and participants were thanked for their cooperation. In order to compute the effectiveness of evidence, a procedure similar to that of Study 1 was used to compute difference scores.

Results

The main goal of Study 2 was to re-examine the German sensitivity to quality variations of statistical evidence. The probability scores of the claims without evidence were below the midpoint of the scale (M = 2.62, SD = 1.01). As in Study 1, high-quality statistical evidence (M = 0.07, SD = 0.85) was not found to be more persuasive than low-quality statistical evidence (M = 0.08, SD = 0.81) for the German participants (RQ3; F (1, 63) < 1). Study 2 was also conducted to examine whether this finding might be (partly) explained by the participants' understanding of generalization. This explanation is unlikely: participants clearly indicated that they understood what kind of statistical data is best suited to make generalizations. Their mean score of M = 3.83 (SD = 1.03) was significantly higher than the midpoint of the Understanding of Generalization scale (t(61) = 6.40, p < .001). In addition, when controlling for participants' scores on this scale as a covariate, the main effect of Evidence quality was still absent (F (1, 60) < 1). Finally, when a difference score was computed between the persuasiveness of high- and low-quality statistical evidence (which is a measure of sensitivity to evidence quality), correlation analyses showed that Understanding of Generalization (r(62) = .06, p = .62) and Preference for Numerical Information (r(64) = .08, p = .51) did not correlate with this measure of sensitivity.
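The difference-score logic used in both studies (a claim's probability rating with evidence minus the baseline rating of the bare claim) can be sketched as follows; the function name and the ratings are hypothetical, added only to illustrate the computation:

```python
# Hypothetical sketch of the difference-score computation used in both studies:
# the persuasiveness of a piece of evidence is the probability rating of the
# supported claim minus the baseline rating of the same claim without evidence.
# A positive score means the evidence raised the claim's judged probability.

def persuasiveness(with_evidence, baseline):
    """Mean difference score across the claims judged in one evidence condition."""
    diffs = [w - b for w, b in zip(with_evidence, baseline)]
    return sum(diffs) / len(diffs)

# Example: 5-point probability ratings for two claims supported by
# high-quality statistical evidence, against the baseline ratings those
# claims received without any evidence (illustrative numbers only).
ratings_with_evidence = [4, 3]
ratings_baseline = [3, 2]
score = persuasiveness(ratings_with_evidence, ratings_baseline)
```

In this toy case the score is 1.0, meaning the evidence raised the probability judgments by one scale point on average; the studies' reported means (e.g., M = 0.41 for high-quality evidence in Study 1) are such averages computed across participants and claims.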
Conclusion and Discussion
Study 2 corroborates the finding of Study 1 that German participants are not sensitive to the quality of statistical evidence in their evaluations of claims supported by this type of evidence (RQ3). Study 2 ruled out that this finding could be attributed to participants’ lack of understanding of the rules for generalization. Although the German participants proved able to understand what kind of data is more valuable for making generalizations, they apparently did not apply this understanding when evaluating claims based on high- and low-quality statistical evidence.
General Conclusion and Discussion
In empirical research on culture and argumentation, one line of research has focused on the persuasiveness of types of evidence that support claims for audiences with different cultural backgrounds (e.g., Hornikx & De Best, 2011; Hornikx & Hoeken, 2007). The current paper aimed to extend this work by comparing the persuasiveness of high- and low-quality evidence for two types of evidence (statistical and expert evidence) and for participants from two different cultures (Dutch and German). Study 1 addressed the question of whether Germans would be more sensitive to evidence quality variations than Dutch people because of their (presumed) higher level of uncertainty avoidance, which would make them prefer high-quality evidence that provides stronger proof for claims. Dutch and German participants judged a number of claims with or without evidence. Results showed that, as expected, Dutch participants were more persuaded by high-quality than by low-quality evidence (H1). This finding corroborates conclusions drawn in Hoeken and Hustinx (2009) and Hornikx and Hoeken (2007) pertaining to Dutch participants’ sensitivity to evidence quality. Statistical evidence was not found to be more persuasive than expert evidence for the Dutch participants, contrary to H2 and to the findings in Hornikx and Hoeken (2007).
This difference lies in the ratings of expert evidence, which were higher in the present study than in Hornikx and Hoeken (2007). It is not clear why these ratings were higher in the present study. German participants were also sensitive to evidence quality variations in expert evidence (high-quality expert evidence was more persuasive than low-quality expert evidence), but not to variations in the quality of statistical evidence. As Study 1 did not include measures that could potentially explain this finding, Study 2 was conducted to re-examine the sensitivity to high- and low-quality statistical evidence among Germans (RQ3). As in Study 1, German participants were not found to be sensitive to the quality of statistical evidence. In addition, this non-significant effect could not be attributed to a lack of understanding of the rules for generalization. The persuasiveness scores in Study 2 were lower than in Study 1. One difference between these two studies is that participants in Study 1 judged a larger number of claims with a richer variation of evidence (e.g., strong statistical evidence in the experimental claims, and anecdotal evidence in the filler claims), which enabled them to differentiate more clearly between evidence types, evidence quality, and evidence versus no evidence.
Why Are Germans Not Sensitive to the Quality of Statistical Evidence?
Germans were found to be less sensitive to evidence quality variations than Dutch people. Why is that? A first potential explanation may lie in a difference in the motivation of the participants. Motivation is a condition for distinguishing between strong and weak argumentation (Petty et al., 2004; Petty & Cacioppo, 1986), but it is not clear how motivated the participants were in their evaluation of the claims. As motivation was not measured, its potential role cannot be ruled out.
However, it is unlikely that low motivation provides a strong explanation for the insensitivity to high-quality and low-quality statistical evidence: German participants did differentiate between high- and low-quality expert evidence (Study 1), and in Study 2, where motivation may have been higher due to the intervention of an ingroup university member, they still did not differentiate between high- and low-quality statistical evidence. A second potential explanation for the German insensitivity may lie in their educational background. In Study 1, most German participants came from science disciplines, and were less sensitive to statistical evidence quality variations than Dutch participants, who came mostly from the social sciences and the humanities. It does not seem, however, that the latter background affects sensitivity to quality variations, as the German participants in Study 2 were almost exclusively humanities students and did not differentiate between high- and low-quality statistical evidence. A third potential explanation may be related to the link between this sensitivity and uncertainty avoidance measures. Both studies included scales that were expected to be associated with the degree to which members of a given cultural background tolerate and appreciate ambiguity. The Need for Structure scale (Neuberg & Newsom, 1993) was suggested by Matsumoto and Yoo (2006) as an individual-level scale to measure uncertainty avoidance. In Study 1, scores on this scale correlated positively (r(150) = .21, p < .01) with the Need for Precision scale (Viswanathan, 1997), suggesting that this second scale is also related to the concepts of ambiguity, uncertainty, and precision. However, no differences between the Dutch and the German scores on these two scales were found, and scores on these scales did not correlate with sensitivity to evidence quality.
Other scales, such as arithmetic aptitude (Zillmann, Callison, & Gibson, 2009), may be relevant to examine in future studies. Thus, there is no empirical evidence in this study for a link between uncertainty avoidance and sensitivity to evidence quality. This, however, does not automatically imply that one should completely discard the links between uncertainty avoidance and evidence quality. The observation that scores on the two individual-level scales were inconsistent with results found at a cultural level in large-scale surveys (Hofstede, 1980, 2001) is not unique in cross-cultural research. It has indeed proved difficult to empirically measure cultural values at an individual level (see Peng, Nisbett, & Wong, 1997; Smith & Schwartz, 1997). A final potential explanation is a more general one, pertaining to the methodological choices made in the studies. Because of the Latin square design that was used, participants judged all the types of evidence, but each time for another claim. This design has advantages (e.g., results can be generalized over multiple claims and multiple evidence instantiations), but another design could have been helpful in investigating whether German participants are sensitive to argument quality variations. That is, if they receive a claim with both the high-quality and the low-quality variant of statistical evidence, they can be asked to choose a side – similar to the Understanding of Generalization scale. Such a design, by sidestepping the potential role of motivation, could be sufficient to examine whether Germans translate their understanding of generalization to the context of claims with evidence. Another methodological choice is the use of difference scores. The impact of evidence was measured as the difference between the evaluation of a claim with evidence and the evaluation of the claim in isolation. Although this measurement is justifiable, other measures can be used in future research (cf.
Nelson, 2005), such as impact as a ratio: the difference score of each claim divided by the base rate evaluation of the claim. Also, the evaluation of the claim in isolation was here determined by a group of other participants, whereas a study’s focus could also lie on the difference score within an individual participant.
Norms for High-Quality Evidence
Study 2 showed that the German participants understand the criteria for generalization, but that they do not apply these criteria when judging evidence that supports claims. This leads to the following question: what criteria are relevant for high-quality evidence, and how may culture play a role? Findings in the present paper are in line with the multiculturalist view on argument quality. Siegel (1999) argues that the multiculturalist perspective fails against the normative approach, but findings from this study and earlier studies underline that people from different cultures may not be sensitive to the norms linked to such high-quality evidence to the same extent. It is still an open question whether the norms are universal and people’s reactions to them are culture-dependent, or whether the norms themselves may be culture-dependent. Empirical research is needed to gain insight into this question. Studies such as those presented in this paper may reveal that the norms of high sample size and high effect size are not used similarly in different cultures. In addition, there may be other norms related to statistical evidence that were not investigated. One way to increase our insight into the norms of people from different cultural backgrounds is to have people generate norms on the basis of scenarios asking them to reflect on high-quality evidence. Another way is to conduct experiments such as those in this paper, in which different norms are tested. One complication is that there is no agreed list of norms associated with high-quality evidence or arguments.
Over the years, various frameworks have been developed with different argument types and associated norms, ranging from 3 types (Garssen, 1997) to over 60 (Walton et al., 2008). In addition, such frameworks originate from different countries, including the US (e.g., Hastings, 1962), Canada (e.g., Walton et al., 2008), the Netherlands (e.g., Van Eemeren & Grootendorst, 1992), and Austria (Kienpointner, 1992). In argumentation theory, there is a debate about the number of argument types (argumentation schemes), the way they should be classified, and the norms (critical questions) that are relevant (see Katzav & Reed, 2004; Walton et al., 2008). There is no consensus about what the exact norms are for high-quality arguments or evidence, and research is needed to sort this out. Studies such as those reported in this paper, incorporating insights from argumentation theory, communication, and psychology, are needed to further the field of persuasive argumentation (cf. Hornikx & Hahn, 2012). It may very well be that research finds out – for different types of evidence – that some criteria associated with high-quality evidence matter in some cultures and not in others, such as the criterion of a relevant field of expertise for expert evidence in France (Hornikx, 2011; Hornikx & Hoeken, 2007). Such research relates to essential questions, such as the question of what is universal and what is not (cf. Mercier, 2011), and the question of whether normatively stronger also means empirically more persuasive (cf. O’Keefe, 2007). Pursuing experimental work in this area will deepen our insights into the ways in which culture may affect people’s evaluations of arguments.
References
Allen, M., & Preiss, R.W. (1997). Comparing the persuasiveness of narrative and statistical evidence using meta-analysis. Communication Research Reports, 14, 125-131.
Brislin, R.W. (1980). Translation and content analysis of oral and written material. In H.C. Triandis & J.W.
Berry (Eds.), Handbook of cross-cultural psychology: Methodology (pp. 389-444). Boston: Allyn & Bacon.
Eemeren, F.H. van, & Grootendorst, R. (1992). Argumentation, communication, and fallacies: A pragma-dialectical perspective. Hillsdale, NJ: Lawrence Erlbaum.
Garssen, B.J. (1997). Argumentatieschema’s in pragma-dialectisch perspectief: Een theoretisch en empirisch onderzoek [Argumentation schemes in pragma-dialectical perspective: A theoretical and empirical study]. Amsterdam: IFOTT.
Harkness, J.A., Vijver, F.J.R. van de, & Mohler, P.Ph. (2003). Cross-cultural survey methods. Hoboken, NJ: John Wiley & Sons.
Hastings, A.C. (1962). A reformulation of the modes of reasoning in argumentation. Unpublished doctoral dissertation, Northwestern University, Evanston, IL.
Hoeken, H. (2001). Anecdotal, statistical, and causal evidence: Their perceived and actual persuasiveness. Argumentation, 15, 425-437.
Hoeken, H., & Hustinx, L. (2009). When is statistical evidence superior to anecdotal evidence in supporting probability claims? The role of argument type. Human Communication Research, 35, 491-510.
Hoeken, H., Timmers, R., & Schellens, P.J. (2012). Arguing about desirable consequences: What constitutes a convincing argument? Thinking and Reasoning, 18, 394-416.
Hofstede, G. (1980). Culture’s consequences: International differences in work-related values. Beverly Hills, CA: Sage.
Hofstede, G. (2001). Culture’s consequences: Comparing values, behaviors, institutions, and organizations across nations (2nd ed.). Thousand Oaks, CA: Sage.
Hornikx, J. (2005). A review of experimental research on the relative persuasiveness of anecdotal, statistical, causal, and expert evidence. Studies in Communication Sciences, 5, 205-216.
Hornikx, J. (2008). Comparing the actual and expected persuasiveness of evidence types: How good are lay people at selecting persuasive evidence? Argumentation, 22, 555-569.
Hornikx, J. (2011).
Epistemic authority of professors and researchers: Differential perceptions by students from two cultural-educational systems. Social Psychology of Education, 14, 169-183.
Hornikx, J., & Best, J. de (2011). Persuasive evidence in India: An investigation of the impact of evidence types and evidence quality. Argumentation and Advocacy, 47, 246-257.
Hornikx, J., & Hahn, U. (2012). Reasoning and argumentation: Towards an integrated psychology of argumentation. Thinking and Reasoning, 18, 225-243.
Hornikx, J., & Hoeken, H. (2007). Cultural differences in the persuasiveness of evidence types and evidence quality. Communication Monographs, 74, 443-463.
Katzav, J., & Reed, C.A. (2004). On argumentation schemes and the natural classification of arguments. Argumentation, 18, 239-259.
Kienpointner, M. (1992). Alltagslogik: Struktur und Funktion von Argumentationsmustern [Everyday logic: Structure and function of argumentation patterns]. Stuttgart-Bad Cannstatt: Friedrich Frommann.
MacIntyre, A. (1988). Whose justice? Which rationality? Notre Dame, IN: University of Notre Dame Press.
Matsumoto, D., & Yoo, S.H. (2006). Toward a new generation of cross-cultural research. Perspectives on Psychological Science, 1, 234-250.
McCroskey, J.C. (1969). A summary of experimental research on the effects of evidence in persuasive communication. Quarterly Journal of Speech, 55, 169-176.
McKerrow, R.E. (1990). Overcoming fatalism: Rhetoric/argument in postmodernity. In R.E. McKerrow (Ed.), Argument and the postmodern challenge: Proceedings of the eighth SCA/AFA conference on argumentation (pp. 119-121). Annandale, VA: Speech Communication Association.
Mercier, H. (2011). On the universality of argumentative reasoning. Journal of Cognition and Culture, 11, 85-113.
Nelson, J.D. (2005). Finding useful questions: On Bayesian diagnosticity, probability, impact, and information gain. Psychological Review, 112, 979-999.
Neuberg, S.L., & Newsom, J.T. (1993).
Personal need for structure: Individual differences in the desire for simple structure. Journal of Personality and Social Psychology, 65, 113-131.
O’Keefe, D.J. (2003). Message properties, mediating states, and manipulation checks: Claims, evidence, and data analysis in experimental persuasive message effects research. Communication Theory, 13, 251-274.
O’Keefe, D.J. (2007). Potential conflicts between normatively-responsible advocacy and successful social influence: Evidence from persuasion effects research. Argumentation, 21, 151-163.
Peng, K., Nisbett, R.E., & Wong, N.Y.C. (1997). Validity problems comparing values across cultures and possible solutions. Psychological Methods, 2, 329-344.
Petty, R.E., & Cacioppo, J.T. (1986). Communication and persuasion: Central and peripheral routes to attitude change. New York: Springer.
Petty, R.E., Rucker, D.D., Bizer, G.Y., & Cacioppo, J.T. (2004). The Elaboration Likelihood Model of persuasion. In J.S. Seiter & G.H. Gass (Eds.), Perspectives on persuasion, social influence, and compliance gaining (pp. 65-89). Boston: Allyn & Bacon.
Reinard, J.C. (1988). The empirical study of the persuasive effects of evidence: The status after fifty years of research. Human Communication Research, 15, 3-59.
Reynolds, R.A., & Reynolds, J.L. (2002). Evidence. In J.P. Dillard & M. Pfau (Eds.), The persuasion handbook: Developments in theory and practice (pp. 427-444). Thousand Oaks, CA: Sage.
Shuper, P.A., Sorrentino, R.M., Otsubo, Y., Hodson, G., & Walker, A.M. (2004). A theory of uncertainty orientation: Implications for the study of individual differences within and across cultures. Journal of Cross-Cultural Psychology, 35, 460-480.
Siegel, H. (1999). Argument quality and cultural difference. Argumentation, 13, 183-201.
Smith, P.B., & Schwartz, S.H. (1997). Values. In J.W. Berry, M.H. Segall, & C. Kagitçibasi (Eds.), Handbook of cross-cultural psychology: Vol. 3 (2nd ed., pp. 77-118). Boston: Allyn and Bacon.
Viswanathan, M. (1993). The measurement of individual differences in preference for numerical information. Journal of Applied Psychology, 78, 741-752.
Viswanathan, M. (1997). Individual differences in need for precision. Personality and Social Psychology Bulletin, 23, 717-735.
Walton, D.N. (1997). Appeal to expert opinion: Arguments from authority. University Park, PA: Pennsylvania State University Press.
Walton, D.N., Reed, C., & Macagno, F. (2008). Argumentation schemes. Cambridge: Cambridge University Press.
Zillmann, D., Callison, C., & Gibson, R. (2009). Quantitative media literacy: Individual differences in dealing with numbers in the news. Media Psychology, 12, 394-416.

Table 1. Persuasiveness of evidence as a function of type, quality, and nationality (Study 1)

                                     Dutch (n = 73)     Germans (n = 77)
Evidence type  Evidence quality      M       SD         M       SD
expert         high-quality          0.52    1.01       0.34    1.02
               low-quality           0.29    0.90       0.11    0.75
statistical    high-quality          0.56    1.00       0.24    0.82
               low-quality           0.21    0.91       0.38    0.85