USING ICON ARRAY AS A VISUAL AID FOR COMMUNICATING VALIDITY INFORMATION

Don C. Zhang

A Dissertation Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

May 2016

Committee:
    Scott Highhouse, Advisor
    Priscilla K. Coleman, Graduate Faculty Representative
    Richard Anderson
    Margaret Brooks

ABSTRACT

Scott Highhouse, Advisor

To promote better decisions in the workplace, organizational researchers must communicate the value of their scientific findings. Traditional statistics such as the correlation coefficient are difficult to interpret. Graphical visual aids, such as Icon arrays, have recently emerged as effective tools for simplifying probabilistic and statistical information. This dissertation examined the benefits of the Icon array in communicating the validity of structured interviews. People judged the Icon array as more useful than the Binomial Effect Size Display (BESD) for communicating validity information. People were more engaged with the interactive visual aid than with its static counterpart, and judged the interactive visual aid more useful. Finally, people performed better on an objective graph comprehension test when presented with an Icon array than with a bar graph. The benefit of graphical displays (Icon array and bar graph), however, was moderated by individual differences in graph literacy. The bar graph and the BESD were more useful for people with high (vs. low) graph literacy, whereas the Icon array was equally useful for people with high and low graph literacy.

To my parents, for their unwavering support

ACKNOWLEDGMENTS

I would like to thank all the friends I made at BGSU for making this five-year journey beyond tolerable. I would also like to thank my dissertation committee members, Richard Anderson, Margaret Brooks, and Priscilla Coleman, for their invaluable feedback. Finally, I would like to thank my adviser and mentor, Scott Highhouse, for his support, patience, and guidance.

TABLE OF CONTENTS

INTRODUCTION
    Criterion Validity
    Predictive Validity of Structured Interviews
        Benefits of Structured Interviews
        Resistance Against Structured Interviews
    Methods for Communicating Validity
        Alternative Displays of Validity
    Graphical Visual Aids
        Icon Array
        Interactivity
    Individual Differences in Graph Literacy
METHOD
    Participants
    Stimulus Material
    Subjective Graph Literacy
    Design and Procedure
RESULTS
    Preliminary Analysis
        Subjective Graph Literacy
    Hypothesis Testing
        Perceived Visual Aid Usefulness
        Visual Aid Engagement
        Objective Comprehension Test
DISCUSSION
    Limitations and Future Directions
    Practical Implications
    Conclusion
REFERENCES
APPENDIX A. JOB SCREENING ITEMS
APPENDIX B. INTERVIEW VIGNETTE
APPENDIX C. OBJECTIVE COMPREHENSION TEST
APPENDIX D. DEPENDENT VARIABLES
APPENDIX E. DEMOGRAPHICS
APPENDIX F. INFORMED CONSENT
APPENDIX G. HSRB APPROVAL LETTER

LIST OF TABLES

1. Example of Taylor-Russell Table
2. Example of Binomial Effect Size Display
3. Characteristics of Decision Aids that Enhance Comprehension
4. Principal Axis Factoring Results for Subjective Graph Literacy
5. Means, Standard Deviations, Reliabilities, and Intercorrelations of Variables
6. Principal Axis Factoring Results for the Dependent Variables
7. Principal Axis Factoring Results for Revised Dependent Variables
8. Summary of ANOVA Results for Perceived Visual Aid Usefulness
9. Mean Perceived Usefulness Across Visual Aids
10. Summary of ANOVA Results for Engagement
11. Summary of the Ordered Logistic Regression on Number of Correct Answers
12. Predicted Probabilities of Number of Correct Responses
13. Logistic Regression Analysis of Objective Comprehension Test Questions

LIST OF FIGURES

1. Example of Expectancy Chart
2. Example of Icon Array
3. Screenshot of Visual Aid Instructions
4. Screenshot of Interactive BESD
5. Screenshot of Interactive Bar Graph
6. Screenshot of Interactive Icon Array
7. Screenshot of Static BESD
8. Screenshot of Static Bar Graph
9. Screenshot of Static Icon Array
10. Histogram of Subjective Graph Literacy Scale
11. Plot of Means for Perceived Usefulness of Icon Array and Bar Graph

INTRODUCTION

To promote better decision-making, it is critical that scientists communicate the significance of their research to relevant stakeholders. Haensly, Lupkowski, and McNamara (1987) noted “… the greatest impact of research stems from clearly communicating research findings to policy makers and practitioners” (p. 63). Scientific findings, however, are typically technical and difficult for a lay audience to understand. In the social sciences, where considerable research is quantitative and statistical in nature, it can be difficult to communicate the practical implications of research findings. Although traditional effect size indices such as the correlation coefficient are the norm for communicating effect sizes in an academic context, they are not ideal for informing real-world decisions and predictions. Employee selection research often uses Pearson’s correlation coefficient to describe the predictive validity of a selection instrument (e.g., structured interviews). Whereas the correlation coefficient is the standard for communicating predictive validity in the scientific literature, it is difficult for a general population to comprehend (Brogden, 1946), often misunderstood (Lawshe & Bolda, 1958), and not informative for decision-making (Beaton & Barone, 1981; Soyer & Hogarth, 2012). Kuncel and Rigdon (2012) recommended two alternatives to the correlation coefficient for improving understanding of research findings: the first was using alternative effect size indices; the second was using graphical visual aids. Some scholars have explored alternative effect size statistics such as the Common Language Effect Size (CLES) (Brooks, Dalal, & Nolan, 2014; McGraw & Wong, 1992).
Other research has examined tabular and graphical displays such as the Binomial Effect Size Display (BESD) and Taylor-Russell tables (Murphy & Davidshofer, 1988; Rosenthal, 2005), as well as Expectancy Charts (Cascio, 1977; Lawshe & Bolda, 1958). More recently, researchers have begun examining non-traditional graphical visual aids such as Icon arrays (D. Zhang, Y. Zhang, Highhouse, & Brooks, 2014).

The present dissertation extends the current literature on using visual aids to communicate the predictive validity statistics of selection instruments. First, previous research on communicating statistical effect size information focused on alternative numerical displays (e.g., BESD and CLES). Interpreting numerical information is cognitively demanding, especially for people with low numeracy. This study examined graphical displays, which are more accessible to a wide population. Second, research using graphical visual aids has examined the benefits of different forms of user interactivity on learning and decision-making (Cleveland & McGill, 1985; Lowe, 2003; Mayer & Chandler, 2001; Zikmund-Fisher et al., 2014), but the effect of user interactivity in the context of communicating effect sizes remains unexplored. In this study, I examined whether user interaction moderates the usefulness of visual aids for enhancing the reader’s comprehension of validity information. Finally, previous research shows that individual differences such as graph literacy are related to graph comprehension (Okan, Garcia-Retamero, Cokely, & Maldonado, 2012; Galesic & Garcia-Retamero, 2011), yet no research to date has examined the role of graph literacy in the context of using an interactive visual aid to communicate effect size information. The current study therefore also examined the moderating effect of graph literacy on the benefits of graphical visual aids.

Kuncel and Rigdon (2012) suggested that future research on the communication of Industrial and Organizational Psychology findings should: (1) explore tools that effectively communicate the value of evidence-based organizational interventions, and (2) seek to understand the role of individual differences in the comprehension of decision aids. The present dissertation examined these two questions.

Criterion Validity

Criterion validity, or predictive validity, is the cornerstone of personnel selection research. It represents the degree to which a predictor – often an assessment instrument or technique – predicts a job-related criterion (e.g., job performance) (Borsboom, Mellenbergh, & van Heerden, 2004). Schmidt and Hunter (1998) emphasized: “From the point of view of practical value, the most important property of a personnel assessment method is predictive validity: the ability to predict future job performance, job-related learning, and other criteria” (p. 262). Adopting criterion-valid hiring practices can improve work performance, increase profitability, and decrease counterproductive work behaviors (Arthur, 1994; Huselid, 1995; Salgado, 2002; Schmidt & Hunter, 1998; Terpstra & Rozell, 1997).

In the scientific literature, the criterion validity of a predictor is most frequently expressed as the Pearson product-moment correlation coefficient (r), which represents the linear relationship between two variables (e.g., intelligence and job performance; Brogden, 1946). Criterion validity can also be expressed incrementally when multiple predictors of a single criterion are assessed simultaneously and the researcher intends to isolate the validity of a predictor that has shared variance with others. For instance, Schmidt and Hunter (1998) meta-analytically examined the incremental validity of 18 predictors after accounting for the variance explained by cognitive ability. Depending on the statistical assumptions and the intended use of the validity results, reports also contain alternative, but mathematically similar, indices of validity, such as the coefficient of determination (i.e., percent of variance explained), standardized regression weights, or the slope of the regression line for the predictor-criterion relationship.
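To make this incremental-validity logic concrete, the sketch below simulates standardized predictor and criterion scores and expresses incremental validity as the change in variance explained (delta R²) when a second predictor is added over cognitive ability alone. It is an illustration under assumed data, not an analysis from any study cited here; the variable names and effect sizes are hypothetical.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 500

# Hypothetical standardized scores: cognitive ability (g), a second
# predictor (e.g., a structured interview score) that overlaps with g,
# and job performance (y) influenced by both.
g = rng.normal(size=n)
interview = 0.4 * g + rng.normal(size=n)
y = 0.5 * g + 0.3 * interview + rng.normal(size=n)

# Model 1: cognitive ability alone; Model 2: ability plus the interview.
m1 = sm.OLS(y, sm.add_constant(g)).fit()
m2 = sm.OLS(y, sm.add_constant(np.column_stack([g, interview]))).fit()

# Incremental validity expressed as the change in variance explained.
print(f"R^2 (ability only)        = {m1.rsquared:.3f}")
print(f"R^2 (ability + interview) = {m2.rsquared:.3f}")
print(f"Delta R^2                 = {m2.rsquared - m1.rsquared:.3f}")
```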
The standard metric for predictive validity in scientific writing – the correlation coefficient – is difficult for lay readers to comprehend (Cascio, 1977; Lawshe & Bolda, 1958; Hoffrage, Lindsey, Hertwig, & Gigerenzer, 2000). Lawshe and Bolda (1958) declared, “To explain the meaning of r to a non-statistician is next to impossible” (p. 353). Rynes, Colbert, and Brown (2002) surveyed human resource (HR) professionals and found that a majority of them (82%) were not aware that intelligence is a better predictor of job performance than conscientiousness, despite the difference in the meta-analytic validity coefficients (Schmidt & Hunter, 1998; validity of 0.51 for intelligence vs. 0.31 for conscientiousness). Many also believe that integrity tests are not valid predictors of behavior on the job, even though meta-analyses show that the validity coefficient of integrity tests for counterproductive workplace behaviors is 0.41 (Ones, Viswesvaran, & Schmidt, 1993). In education research, the validity of standardized tests (e.g., the SAT) has also been misinterpreted and its utility marginalized (Mattern, Kobrin, Patterson, Shaw, & Camara, 2009). Mattern et al. (2009) argued: “No matter how good a job one does to collect, analyze, and present validity evidence, it may fall on deaf ears if the results are not effectively communicated” (p. 229).

Paradoxically, practitioners value research and cite scientific findings as one of their main justifications for HR-related decisions. Ryan and Sackett (1987) surveyed 163 individuals who conduct individual assessments and found that the most common reason for choosing a test was published data (63%). However, as the evidence shows, readers looking at the same results can come to different or misinformed interpretations. When the validity information for the SAT as a college admissions test was published in 2008, the public formed various opposing beliefs based on the same published validity data (Mattern et al., 2009).

Some researchers have focused on the end-users’ interpretation of validity, rather than the validity statistic itself. Maciver, Anderson, Costa, and Evers (2014) argued that criterion-related validity alone lacks context. They maintain that the concept of validity extends beyond the statistical metric; the interpretation of validity information depends on the context. Proper use of criterion-related validity is contingent on the relation between the user’s interpretation of the test score and the criterion outcome. Traditional validity statistics, such as those that focus on a variance-based interpretation, do not communicate the implication of the statistic in an actual hiring scenario. For example, knowing that a predictor explains 15% of the variance in the criterion does not help the user infer actual hiring outcomes, such as how many employees hired are expected to succeed or fail. Due to the complexity and ambiguity of statistical validity interpretation, validity research has arguably had a limited impact on HR-related decisions.
Johns (1993) observed that the adoption of I/O personnel practices in the workplace is “not influenced by technical merit” (p. 46), and that other institutional factors such as organizational politics and government regulations are often the driving force behind the adoption of personnel innovations at work. Guion (2011) argued that decisions to use a hiring method often have less to do with the test’s psychometric properties than with the organizational and political culture. In a study of 53 organizations in both the private and public sectors, Wolf and Jenkins (2006) surveyed managers who were directly responsible for various recruitment and hiring decisions at their companies. They found that almost all private sector companies had increased their use of testing over the previous five years, and that of the twelve public sector and not-for-profit organizations, nine had increased their use of testing while the others maintained the same level. The authors also observed, based on semi-structured interviews with the managers, that the primary reasons for adopting standardized tests for hiring were cost-effectiveness, organizational culture, and legal requirements; predictive validity had only a subsidiary impact on the decisions to adopt evidence-based hiring practices.

In summary, predictive validity remains one of the most important criteria for evaluating hiring outcomes in the scientific literature, and yet it has had limited influence on organizational change when compared with other organizational, political, and legal factors (Johns, 1993; Wolf & Jenkins, 2006). One explanation is that many managers are either not aware of the validity evidence discovered in research or misinterpret the outcomes (Rynes et al., 2002). Validity information, as traditionally communicated in the scientific literature, is often unhelpful for informing real-world decisions because it is difficult to understand. Improving the accessibility of validity findings to managers is particularly important in an area where science has not informed practice at work: the use of structured interviews in employee selection.

Predictive Validity of Structured Interviews

One of the most valid and most neglected selection tools is the structured interview (Buckley, Norris, & Wiese, 2000; Dipboye, 1997; Highhouse, 2008; van der Zee, Bakker, & Bakker, 2002). The structured interview is typically characterized by the standardization of the interview process (Levashina, Hartwell, Morgeson, & Campion, 2014), which includes the generation of a pre-selected question list based on a job analysis and a quantitative, uniform scoring procedure for all interviewees.

Benefits of Structured Interviews. Research has shown that the structured interview is more predictive of job-relevant outcomes than the traditional (i.e., unstructured) interview (Barrick, Patton, & Haugland, 2000; Huffcutt & Arthur, 1994; McDaniel, Whetzel, Schmidt, & Maurer, 1994). A meta-analysis of 245 validity coefficients showed that the structured interview (ρ = 0.44) is significantly better than the unstructured interview (ρ = 0.33) at predicting future job performance (McDaniel et al., 1994). In a separate meta-analysis, Huffcutt and Arthur (1994) found that implementing even a small amount of structure in the interview process can improve the validity of the interview from 0.20 to 0.35, whereas a fully structured interview had a validity of 0.57. Unstructured interviews may even hinder the effectiveness of other hiring tools.
For instance, Dana, Dawes, and Peterson (2013) found that combining judgments made with unstructured interviews with valid predictors (e.g., Grade Point Average) can actually lead to worse predictions than when unstructured interviews are not administered.

Resistance Against Structured Interviews. Despite what the evidence suggests, many managers still favor the traditional interview over the structured interview because of subjective benefits such as the need for autonomy during the interview process (Dipboye, 1997; Nolan & Highhouse, 2014), the need to exert influence on the applicant (Pfeffer & Lammerding, 1981), and better applicant reactions (Latham & Finnegan, 1993; Schuler, 1993). Another reason for the resistance to using structured interviews is the lack of awareness and understanding of their increased predictive validity over traditional interviews (Priem & Rosenstein, 2000; Rynes, 2009). The lack of awareness can be attributed to factors such as the lack of formal education, limited exposure to the research literature, and poorly disseminated research findings in periodicals (Rynes, 2009). Even when managers are presented with research evidence, they are often not equipped to interpret the statistical results, and therefore perceive the findings as uninteresting, unimportant, and abstract (Bailey & Eastman, 1996; Campbell, Daft, & Hulin, 1982). Furthermore, managers ignore the findings because they believe that the research does not apply to their own unique situations (Highhouse, 2008). The lack of awareness of personnel selection research, combined with a general distrust of and discomfort with statistics (Ayres, 2008), impedes managers from integrating research findings from an area that is as quantitatively oriented as personnel selection (Rynes, 2009). Awareness and understanding can be improved by finding more effective ways to communicate the utility of personnel selection research findings (Kuncel & Rigdon, 2012; Rynes, 2002).

Methods for Communicating Validity

When a researcher wants to describe the validity of a particular predictor, he or she typically turns to one of the aforementioned traditional effect size statistics, Pearson’s correlation coefficient being the most common. To ease the interpretation of effect size statistics for scientists, Cohen (1988) provided general guidelines for what constitutes a “small,” “medium,” or “large” effect size in terms of r. Traditional effect size statistics are useful because they allow researchers to compare their results on a standardized metric. However, the standardization of effect size statistics limits one’s ability to interpret their meaning across diverse real-world contexts, such as the difference between experimental groups or the expected number of successful employee hires. Rosenthal and Rubin (1982) found that even experienced researchers were surprised that a correlation of 0.32 translated into an increase in the success rate of an intervention from 34% to 66%. Validity estimates communicated with traditional effect size statistics are often underestimated by the public. For instance, critics of the SAT as a college admissions test have stated that “the SAT only adds 5.4 percent of variance explained by HSGPA alone” (Kidder & Rosner, 2002, p. 193). However, these misunderstandings can easily be amended by changing the presentation of the same information (Bridgeman, Pollack, & Burton, 2004; Brooks et al., 2014; Davidshofer & Murphy, 2005; Lawshe & Bolda, 1958; McGraw & Wong, 1992; Taylor & Russell, 1939).
In the following paragraphs, I outline several alternatives to traditional statistics for communicating effect size and validity information.

Alternative Displays of Validity. Instead of a point estimate of the linear relation between predictor and criterion, criterion validity can be expressed in terms of probabilities or likelihoods of success based on a candidate’s score on a predictor (e.g., an intelligence test). Taylor-Russell tables, for instance, convert the correlation coefficient to the expected probability of success given one’s standing on a particular predictor, the selection ratio, and the base rate of success (Murphy & Davidshofer, 1988; Taylor & Russell, 1939). To read a Taylor-Russell table, one has to determine the selection ratio (the percentage of people hired from the applicant pool) and the base rate of success, which is the percentage of the population that would succeed on the job. Next, based on the validity of the selection tool, as indicated by the correlation coefficient, one can derive the proportion of successful hires based on the job candidate’s standing on the predictor (i.e., the selection test). For example, in Table 1, if 50% of the candidates are selected based on a selection test with a validity of r = 0.25, and the population has a base rate of success of 30%, then about 37% of the chosen employees will succeed.

Similarly, an expectancy chart shows how one’s standing on a predictor relates to one’s standing on the criterion (Schrader, 1965; Lawshe & Bolda, 1958). In the expectancy chart (Figure 1), the reader is provided with the probability of success for a particular candidate given his score. For instance, if John scored between 16 and 20 on the selection test, the probability of his future success is 80%.

The binomial effect size display (BESD) simplifies the correlation coefficient by presenting the expected success and failure rates in a 2x2 matrix. In a BESD, the cells of the matrix are defined by (0.5 + r/2) x 100 and (0.5 – r/2) x 100, where r is the validity of the test or intervention (Rosenthal & Rubin, 1982). For example, in Table 2, one can evaluate the effectiveness of a Graduate Record Examination (GRE) training program by seeing the probability of improvement: 65% of the people who took the training improved their GRE score, while only 35% of the people who did not take the training showed improvement. Finally, the Common Language Effect Size describes the difference between two groups (e.g., control vs. intervention group) as the probability that a random score from one group will exceed a random score from the other. For example, the effectiveness of the GRE program can be described as “there is a 60% chance that a score from someone who took the GRE training will be better than that of someone without training.”
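All three conversions are simple enough to compute directly. The sketch below is a minimal illustration rather than part of the study’s materials: the BESD and CLES follow the formulas just given, and the Taylor-Russell value assumes standardized bivariate-normal predictor and criterion scores. The printed numbers reproduce the worked examples above (the 65%/35% BESD split implied by r = 0.30, and roughly 37% successful hires for r = 0.25, a 50% selection ratio, and a 30% base rate).

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def besd_cells(r):
    """BESD success rates (percentages) for the treated and untreated rows."""
    return (0.5 + r / 2) * 100, (0.5 - r / 2) * 100

def cles(d):
    """Common Language Effect Size: probability that a random score from
    one group exceeds a random score from the other, for two independent
    normal groups separated by standardized mean difference d."""
    return norm.cdf(d / np.sqrt(2))

def taylor_russell(r, selection_ratio, base_rate):
    """Expected success rate among those selected, assuming standardized
    bivariate-normal predictor and criterion scores with correlation r."""
    x_cut = norm.ppf(1 - selection_ratio)  # predictor (hiring) cutoff
    y_cut = norm.ppf(1 - base_rate)        # criterion (success) cutoff
    joint = multivariate_normal(mean=[0, 0], cov=[[1, r], [r, 1]])
    # P(X > x_cut and Y > y_cut) by inclusion-exclusion on the joint CDF.
    p_both = 1 - norm.cdf(x_cut) - norm.cdf(y_cut) + joint.cdf([x_cut, y_cut])
    return p_both / selection_ratio

print(besd_cells(0.30))                            # (65.0, 35.0), as in Table 2
print(round(cles(0.36), 2))                        # about 0.60, the GRE example
print(round(taylor_russell(0.25, 0.50, 0.30), 2))  # about 0.37, as in Table 1
```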
Despite the number of alternative data presentation tools available, there are only a few empirical examinations of their benefits for comprehension. Brooks et al. (2014) examined several alternatives to traditional effect size displays (e.g., Pearson’s r). They presented participants with the Common Language Effect Size and the Binomial Effect Size Display of the effectiveness of a GRE training program, and found that both alternative effect size displays were judged to be higher in understandability, usefulness, and effectiveness when compared with traditional effect sizes such as Pearson’s r and the coefficient of determination. Bridgeman et al. (2004) used various expectancy charts to present validity information for SAT scores on college academic outcomes. The “straightforward approach,” as they called it, showed the percent of students that fall under different bands of college GPA based on SAT scores. Although expectancy charts have often been employed to present both theoretical and empirical expectancies of predictor-criterion data (e.g., Cascio, 1976; Tiffin & Vincent, 2006; Yankelevich, 2007), no research has directly examined their usefulness for communication compared with alternative presentation methods.

Graphical Visual Aids

Graphical visual aids have a long history in quantitative and scientific education (see Shah & Hoeffner, 2002, for a review). The Joint Committee on Standards for Graphic Presentation published a list of guidelines for using graphics to present quantitative data in 1915. These guidelines have led to a stream of research examining how best to use graphical visual aids in a variety of contexts such as education, decision-making, and communication (Bettman & Zins, 1979; Boucheix & Guignard, 2005; Carter, 1947). There are several benefits to using graphical visual aids to communicate complex numerical information. First, graphical visual aids take advantage of people’s automatic visual perception abilities (Cleveland & McGill, 1985), which improves the memorability (Denis, 1984; Levie & Lentz, 1982) and understandability of quantitative information (MacDonald-Ross, 1977; Winn, 1987). Second, graphical visual aids convey more information than quantitative description (i.e., numbers and statistics) alone (Lewandowsky & Spence, 1989). Visual aids have been shown to improve understanding and decision-making in education, finance, and medicine (Ancker et al., 2006; Garcia-Retamero et al., 2012; Shah & Hoeffner, 2002; Volkov & Laing, 2012).

Icon Array. One type of graphical visual aid that has received modest attention in medical decision-making research is the Icon array. Icon arrays are “graphical representations consisting of a number of stick figures, faces, circles, or other icons symbolizing individuals…” (Galesic et al., 2009, p. 210). Figure 2 shows an example Icon array that communicates the effectiveness of a medical treatment. The array contains 100 icons, each representing a single person in a sample; the sample can be hypothetical or empirical. The icons are separated by color: in the example, green icons represent individuals who are cured by the medication, red icons represent the uncured, and gray icons represent the untreated. Icon arrays have been used to communicate the benefits and risks of health and medical treatments (e.g., Fagerlin, Wang, & Ubel, 2005; Feldman-Stewart, Kocovsky, McConnell, Brundage, & Mackillop, 2000; Garcia-Retamero, Galesic, & Gigerenzer, 2010). Research has shown that Icon arrays can improve the understandability of risk information and help people make more informed decisions (Garcia-Retamero et al., 2012; Galesic & Garcia-Retamero, 2011). For instance, Garcia-Retamero and Hoffrage (2013) found that Icon arrays improved diagnostic judgments for both doctors and patients. Furthermore, Garcia-Retamero et al. (2010) showed that Icon arrays reduced cognitive biases such as base-rate neglect, leading to more accurate judgments of risk.
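Because an Icon array is simply a grid of categorized icons, it can be generated programmatically. The sketch below is a hypothetical plain-text renderer (not the charts used in this study): it prints 100 characters in a 10 x 10 grid, with counts chosen only to mirror the kind of breakdown shown in Figure 2.

```python
def render_icon_array(counts, symbols, per_row=10):
    """Print a plain-text icon array: one symbol per person, row by row.

    counts  -- mapping of category name to number of people
    symbols -- mapping of category name to a single display character
    """
    icons = [symbols[cat] for cat, n in counts.items() for _ in range(n)]
    for i in range(0, len(icons), per_row):
        print(" ".join(icons[i:i + per_row]))
    for cat, n in counts.items():  # legend
        print(f"{symbols[cat]} = {cat} ({n} of {len(icons)})")

# Hypothetical counts in the spirit of Figure 2: of 100 people,
# 40 are treated and cured, 20 treated but not cured, 40 untreated.
render_icon_array(
    counts={"cured": 40, "not cured": 20, "untreated": 40},
    symbols={"cured": "C", "not cured": "X", "untreated": "."},
)
```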
Icon arrays provide several unique advantages over previously studied displays of effect size. First, the Icon array is the most salient graphical representation of frequency information, because it uses an individual icon to represent each observation, thereby bringing attention to the discrete properties of the data. Research has shown that when it comes to making probabilistic inferences, frequency representations are generally easier to understand, and more useful, than probabilities (Gigerenzer & Hoffrage, 1995; Hoffrage et al., 2000). Although the BESD and expectancy charts are also capable of representing frequency information, they are usually used to display proportions or probabilities when communicating effect size information (e.g., Brooks et al., 2014). Second, the Icon array is the only one of these visual aids that can use different shapes to communicate information. An Icon array can represent individual candidates with silhouettes of people, which gives it higher iconicity than the other graphical representations. Iconicity is the degree to which the symbols used in a graph represent their real-life counterparts (Gaissmaier et al., 2012). Icons can also represent categories; for example, a happy face can represent a successful employee and an unhappy face an unsuccessful employee. Finally, the Icon array can overcome base-rate neglect, a cognitive bias that interferes with interpreting and understanding probabilistic information (Bar-Hillel, 1980). The Icon array ameliorates base-rate neglect because the size of the array makes the denominator of the fraction visually explicit, and in turn calls attention to the base rate. Table 3 summarizes the characteristics of different representations of effect size that enhance comprehension. For all these reasons, I hypothesize that the Icon array will be perceived as more useful for communicating the validity of structured interviews, and easier to understand, than both the bar graph and the BESD.

Hypothesis 1.1: Managers will rate the Icon array as more useful than the bar graph and the BESD for communicating the validity of structured interviews in hiring.

Hypothesis 1.2: Managers will rate the Icon array as easier to comprehend than the bar graph and the BESD for communicating the validity of structured interviews in hiring.

Interactivity. With increased access to computers in the current digital age, more and more visual aids are becoming computer-based, which allows graph makers to incorporate animation and user interactivity (Lowe, 2003). Some scholars have also recommended implementing interaction and animation in visual aids to communicate probabilistic information (Spiegelhalter, Pearson, & Short, 2011). Animation and interactivity are interrelated concepts in visual displays: visual aids that are interactive are necessarily animated in some respect, because interactions involve the manipulation of visual elements and control over change in the presentation. Animation and interactivity each have their advantages. Boucheix and Guignard (2005) argued that animated presentations are, by themselves, more interactive than static visual aids. Interactivity refers to giving the user control over the words or pictures (Mayer & Chandler, 2001). Aesthetically, animated graphs are more interesting and attractive, which can enhance user engagement (Ancker, Weber, & Kukafka, 2011; Perez & White, 1985; Rieber, 1990). Animations are ideal for displaying complex concepts and for communicating changes or trends over time (Morrison, Tversky, & Betrancourt, 2000).
When used properly, animations can reduce the user’s cognitive load, which leads to better learning outcomes (Mayer & Chandler, 2001). Boucheix and Guignard (2005) found that animation and user interactivity both improved comprehension of a technical document. Mayer and Chandler (2001) found that a modest amount of interactivity improved deep learning of scientific concepts. However, Gonzalez (1996) cautioned that the benefits of animation for decision-making are contingent on properties of the design such as transition smoothness, realism, and interactivity style. Interactive and animated graphs may also overwhelm users by imposing excessive information that can overload one’s cognitive resources (Lowe, 2003; Morrison & Tversky, 2000). In order for animation and interactivity to be useful, they must be theory-based and must not detract from what is important in the graphs (Mayer & Chandler, 2001).

So far, only one study has implemented user interactivity in Icon array displays (Ancker et al., 2011). That study examined user interactivity in Icon arrays for risk judgments. The authors created a game-like task in which the user clicked on masked squares to reveal the color of the icon underneath; over time, as the squares were unmasked, the user learned the risk proportions. They found that the interactive visual aid did not significantly affect one’s perception of risk or the perceived usefulness of the visual aid. They also found that users with low familiarity with computers were more confused by the interactive visual aids than by their static counterparts. One limitation of this study is the complexity and novelty of the user interaction: clicking squares in a game-like manner is a very specific type of interaction that is unique to a small set of computer tasks, and is not one that most computer users are accustomed to. A second limitation of the Ancker et al. (2011) design is that the implementation of user interactivity is confounded with the information presentation mode. In the interactive visual aids, users learned the risk proportions over time – by clicking on individual icons – rather than at once by looking at a single complete array. Previous research has shown that probability judgments differ depending on whether the underlying distribution is learned over time (decision from experience) or at once (decision from description) (Hau, Pleskac, Kiefer, & Hertwig, 2008). Therefore, it is unclear whether incorporating a simpler interaction, while maintaining the information-gathering process, would improve the comprehension of the interactive visual aid.

The education literature has examined the effectiveness of interactive and animated visual aids extensively, but decision-making research has focused only on static visual aids (e.g., Brooks et al., 2014; Garcia-Retamero & Dhami, 2011; Hess, Visschers, & Siegrist, 2011). Interactive visual aids are just a small part of a much larger body of scholarship: human-computer interaction (HCI) (Preece, Rogers, Sharp, Benyon, Holland, & Carey, 1994). Because of the broad nature of the term “interaction,” for the purpose of this study I constrain the interactive component of a visual aid by defining it as giving the user control over the basic appearance of graphical elements (e.g., labels) and the delivery of information. The implementation of interaction is elaborated in the Method section. I hypothesize that interactive visual aids will improve user engagement with the decision aid and overall comprehension of the data.
I also hypothesize that people will judge the interactive visual aids as more useful for communicating validity information than their static counterparts.

Hypothesis 2.1: Managers will rate interactive visual aids as more useful than the static counterparts of the respective graphs for communicating the validity of structured interviews.

Hypothesis 2.2: Managers will rate interactive visual aids as more engaging than the static counterparts of the respective graphs for communicating the validity of structured interviews.

Hypothesis 2.3: Managers will rate interactive visual aids as easier to comprehend than the static counterparts of the respective graphs for communicating the validity of structured interviews.

Individual Differences in Graph Literacy

More recently, researchers have begun examining individual differences in the ability to comprehend graphical information. Two main factors influence the comprehension of graphical information: the first is content knowledge and the second is graph literacy (Shah & Hoeffner, 2002). Content knowledge is related to one’s interpretation of graphical data. People are more likely to infer relations and trends in familiar than in unfamiliar contexts. Lord et al. (1979) also found that when information presented in graphs is inconsistent with one’s prior experience, viewers are more likely to make systematic errors in judgment. Finally, expertise in the content area also allows the user to make more meaningful interpretations of the data (Chase & Simon, 1976; Egan & Schwartz, 1992).

The second factor that influences graph comprehension is graph literacy, which is the ability to comprehend graphically presented information (Galesic & Garcia-Retamero, 2011). Research has shown that graphs are not equally effective communication tools for everyone. Expert graph viewers are more capable of extracting abstract information from graphs and are less likely to neglect important elements of a graph (Shah & Hoeffner, 2002; Shah & Freedman, 2011). Graph comprehension also takes less cognitive effort when the user is familiar with the content or is proficient at reading graphs (Kosslyn, 1985). Finally, expert graph readers have better memory for graphical displays because they are able to group graphical elements in meaningful ways (Egan & Schwartz, 1992). Okan et al. (2012) found that visual aids (e.g., the Icon array) improved risk comprehension more for individuals with high graph literacy than for those with low graph literacy. Given the importance of graph literacy in interpreting information presented in graphs, I hypothesize that graphical displays (bar graph and Icon array) will be more beneficial for people with high graph literacy than for those with low.

Hypothesis 3.1: There will be an interaction between graph literacy and visual aid type, such that graphical displays (Icon array and bar graph) will have a greater effect on perceived usefulness of the visual aid over a non-graphical display (BESD) for managers with high than with low graph literacy.

Hypothesis 3.2: There will be an interaction between graph literacy and visual aid type, such that graphical displays (Icon array and bar graph) will have a greater effect on perceived comprehension of the visual aid over a non-graphical display (BESD) for managers with high than with low graph literacy.

Previous researchers have cautioned about the possible disadvantages of animation and interactivity (Mayer & Chandler, 2001; Morrison & Tversky, 2001).
Too many simultaneous graphical elements can overload one’s cognitive capacity and hinder comprehension (Chandler & Sweller, 1991; Tindall-Ford, Chandler, & Sweller, 1997). The potential cognitive overload caused by extra graphical elements may be less taxing for individuals with high graph literacy because they are already proficient at processing basic graphical elements, and therefore have spare capacity for incorporating additional graphical elements. This suggests that the addition of interactivity may benefit individuals with high graph literacy more, because they have more cognitive resources available for engaging with and processing the interactive component of the visual aid.

Hypothesis 4.1: There will be an interaction between graph literacy and user interactivity, such that interactivity will have a greater effect on perceived usefulness for managers with high graph literacy than for those with low.

Hypothesis 4.2: There will be an interaction between graph literacy and user interactivity, such that interactivity will have a greater effect on perceived comprehension for managers with high graph literacy than for those with low.

Hypothesis 4.3: There will be an interaction between graph literacy and user interactivity, such that interactivity will have a greater effect on perceived engagement for managers with high graph literacy than for those with low.

Whereas the previous study on communicating validity information with Icon arrays was administered to a lay audience (Zhang et al., 2014), the present study narrows the target population to managers with experience in hiring. There are both theoretical and methodological advantages to surveying people with relevant context experience. First, doing so improves the external validity of the study. The purpose of using visual aids to simplify validity information is to help people make better decisions with regard to choosing interview methods; therefore, surveying those who are in the position of making real-world hiring decisions maximizes the external validity of the study. Second, content knowledge is related to graph interpretation (Shah & Hoeffner, 2002). People with low content knowledge have to expend additional cognitive resources to process the non-numerical information, whereas people with high content knowledge are already familiar with concepts such as job interviews and job performance. Content familiarity also affects people’s interpretation of graphs: people tend to be better at inferring relations in familiar contexts than in unfamiliar ones, and in situations where their prior knowledge aligns with the information presented in the graph. A lay population is usually not familiar with job interviews, and therefore may process the information presented in the graphs differently than those who have hiring experience.

METHOD

Participants

Data were collected on Amazon Mechanical Turk (MTurk). MTurk is a crowdsourcing service where people participate in online tasks for modest pay. Past research has demonstrated that the MTurk population generalizes well to the general adult population and that the service is a valuable platform for conducting workplace-related experimental research (Highhouse & Zhang, 2015; Paolacci, Chandler, & Ipeirotis, 2010). Each participant received 75 cents for completing the survey, which took approximately 10 minutes. Multiple steps were taken to ensure that the sample included managers with experience in conducting job interviews.
First, participants who were interested in the survey completed short screening questions that asked them to indicate their current employment status and employment industry. Participants who were unemployed were excluded from the survey. Next, participants saw a list of common work tasks across many occupations (e.g., interacting with customers, data analysis, manual labor) (Appendix A). These tasks were modified based on a sampling of major job groups and job tasks listed on O*NET (Onetonline.org). Participants were instructed to indicate up to five tasks that they most frequently engage in at work. I excluded participants who did not include either “recruiting/interviewing” or “management” as one of their primary tasks. Because the survey restricted each participant from retaking it, participants were discouraged from faking by taking the survey multiple times until they fulfilled the selection criteria. Moreover, the large number of possible responses greatly reduced the likelihood of participants figuring out the exclusion criteria by chance. Finally, at the end of the study, participants were asked again whether they were currently in a managerial position and about their previous experience with interviewing job candidates. I excluded participants who did not have interviewing experience or who were not in a managerial position.

This entire process yielded 329 completed surveys. Twenty-four participants were removed for missing either of the two attention-check questions (e.g., “If you are still paying attention, please respond with strongly agree”). The final sample comprised 305 employees who were either in managerial positions or had interviewing experience (52% male, mean age = 37, SD = 10, 80% Caucasian). Participants held occupations across a wide range of industries, the most common being retail trade (11%), health care and social assistance (10%), and professional, scientific, and technical services (10%).

Stimulus Material

I created the graphical visual stimuli on Infogr.am (www.infogr.am), an online web service for creating custom charts. In the present study, the interactive visual aids were ones in which the user could compare the validities of different hiring methods (random selection, traditional interview, and structured interview) by clicking on radio buttons in the survey. The visual aid on the screen displayed the validity information for only one hiring method at a time. The user could take as much time as needed to process the information before clicking on the radio buttons to see a different hiring method, and could also go back and review charts he or she had already seen. There was no constraint on the time spent on each chart or on the order in which the charts were displayed; users were free to engage with the charts in any order they preferred until they were satisfied with the information. Figures 4 through 9 are screenshots of the static and interactive visual aids.

Subjective Graph Literacy

The subjective graph literacy scale was developed for this study. Some of the items were modified from the subjective numeracy scale (Fagerlin et al., 2007), a self-report measure of one’s numerical abilities. A sample item is “I am good at creating graphs or charts of numerical information.” Participants responded to the questionnaire using a 5-point response scale (1 = strongly disagree to 5 = strongly agree). The items of the present scale are presented in Table 4.
Because the scale was developed specifically for this study, I conducted a principal axis factoring with oblimin rotation to assess its psychometric properties. A parallel analysis recommended a single-component solution. The single component explained 37% of the total variance, with an eigenvalue of 2.93. All items reached the minimum factor loading required for retention (Tabachnick & Fidell, 2001). The scale also had acceptable internal consistency (α = 0.81).
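For readers who want to reproduce this kind of scale check, the sketch below shows one way to run a parallel analysis, a principal axis factoring with oblimin rotation, and Cronbach’s alpha using the third-party factor_analyzer package. It is a generic illustration, not the study’s analysis script: the item data are simulated, and the eight-item structure is assumed for the example only.

```python
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer  # pip install factor-analyzer

def parallel_analysis(data, n_iter=100, seed=0):
    """Count observed eigenvalues exceeding the mean eigenvalues of
    correlation matrices from same-sized random normal data."""
    rng = np.random.default_rng(seed)
    n, k = data.shape
    rand = np.zeros((n_iter, k))
    for i in range(n_iter):
        noise = rng.normal(size=(n, k))
        rand[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False)))[::-1]
    obs = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    return int((obs > rand.mean(axis=0)).sum())

def cronbach_alpha(data):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
    k = data.shape[1]
    return k / (k - 1) * (1 - data.var(axis=0, ddof=1).sum()
                          / data.sum(axis=1).var(ddof=1))

# Simulated stand-in for the item responses (one column per item).
rng = np.random.default_rng(1)
latent = rng.normal(size=(305, 1))
df_items = pd.DataFrame(0.7 * latent + 0.7 * rng.normal(size=(305, 8)),
                        columns=[f"item{i + 1}" for i in range(8)])

n_factors = parallel_analysis(df_items.to_numpy())
fa = FactorAnalyzer(n_factors=n_factors, method="principal",
                    rotation="oblimin" if n_factors > 1 else None)
fa.fit(df_items)
print(n_factors, fa.loadings_)              # cf. Table 4
print(cronbach_alpha(df_items.to_numpy()))  # internal consistency
```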
Design and Procedure

This study aimed to improve how statistical validity is communicated to managers. As such, I examined three different visual aids for communicating validity information: the Binomial Effect Size Display (BESD), the bar graph, and the Icon array. All three visual aids presented the same statistical validity information for traditional interviews and structured interviews. I also examined the benefits of user-interactive visual aids: each visual aid was either static or interactive. In order to determine the effects of each visual aid and of user interactivity on the hypothesized outcomes (perceived comprehension, perceived usefulness, and engagement), I used a randomized design in which participants were assigned to see validity presented with one of the three visual aids, in either a static or an interactive version.

The study used a 3 (Visual Aid: BESD vs. bar graph vs. Icon array) x 2 (Interactivity: static vs. interactive) between-subjects design. First, participants read a short vignette describing the decision scenario, in which they were asked to assume the role of a manager who had to choose between a traditional interview and a structured interview for the company (Appendix B). Following the vignette, participants were randomly assigned to one of six conditions, each associated with a different graphical visual aid. Participants read a short description of the visual aid, which was the same for all six conditions, and had the opportunity to thoroughly examine the graph before proceeding to the next page (Figure 3). Participants also answered four objective comprehension questions while reviewing the graphs (Appendix C). Next, participants continued to the dependent variables page, where they responded to questions regarding their attitudes toward the visual aid (Appendix D). After completing the dependent variable questions, participants completed the subjective graph literacy scale. Finally, participants provided basic demographic information along with their previous experience with making hiring decisions and computer use (Appendix E). I tested participants’ attentiveness with two attention-check questions (e.g., “Please respond to this question with ‘strongly disagree’”).

RESULTS

Preliminary Analysis

Means, standard deviations, item intercorrelations, and standardized Cronbach’s alphas for the study’s variables are presented in Table 5. Given that the measures were developed specifically for this study, and given the high correlation between the perceived usefulness and perceived comprehension measures (r = 0.70), I first conducted an exploratory factor analysis on the study variables to examine their factor structure, using principal axis factoring with oblimin rotation. Although the measures aimed to assess three constructs – perceived comprehension, perceived usefulness, and engagement – parallel analysis retained two components with eigenvalues greater than one. The first component (eigenvalue = 5.51) accounted for 40% of the total variance; the second component (eigenvalue = 1.26) accounted for 21%. The pattern matrix from the two-factor solution revealed that the four items for perceived usefulness and two items from perceived comprehension loaded strongly onto the same factor, while the three items for perceived engagement loaded onto a separate factor (Table 6). The reverse-coded item from the perceived comprehension measure did not reach the factor loading threshold (0.32) recommended for retaining an item (Tabachnick & Fidell, 2001). I conducted a follow-up principal axis factoring analysis with oblimin rotation without the negatively worded item. The parallel analysis still recommended a two-component solution. The first component (eigenvalue = 5.30) explained 44% of the variance and included the six items intended to assess both perceived usefulness and perceived comprehension. The second component (eigenvalue = 1.26) explained 23% of the variance and included the three items intended to assess engagement. Table 7 shows the standardized factor loadings of the items in the two-factor solution. Given that the items for perceived usefulness and perceived comprehension loaded highly on the same factor, the six items were combined into a single measure of perceived visual aid usefulness. The negatively worded item was removed from the analysis.

Finally, interview experience was not significantly correlated with any of the dependent variables, whereas age was significantly correlated with perceived usefulness (r = 0.17) but not engagement (r = 0.06). Computer experience was significantly correlated with engagement (r = 0.14) but not perceived usefulness (r = 0.09). There were no sex differences in any of the study’s variables. Controlling for age or computer experience as covariates did not change the results of the tests of the study’s hypotheses; therefore, they were excluded from the reported ANOVAs for simplicity.

Subjective Graph Literacy. Subjective graph literacy was measured on an interval scale. Given that the independent variables in the current study are categorical and contain more than two nominal groups, the interactive effects between the categorical IVs and a continuous IV are more difficult to interpret in a multiple regression analysis. Thus, graph literacy was dichotomized to improve the interpretability of the results. Given that the data are negatively skewed (Figure 10), the most natural cutoff point in the data is the median. A median split was conducted to separate participants’ graph literacy scores into high and low groups, and the tests of the hypothesized effects are reported as ANOVAs in which graph literacy is treated as a dichotomous factor. There are several shortcomings to artificially dichotomizing a continuous variable: first, it reduces the variability of the results; second, it reduces the statistical power of the analysis; third, the cut-off values are often subjective (Irwin & McClelland, 2003). Therefore, to ensure that dichotomizing the subjective graph literacy measure did not reduce the power to detect the hypothesized effects, I also tested the study’s hypotheses with multiple regression, keeping subjective graph literacy as a continuous variable. The results from the multiple regression were congruent with those from the ANOVAs.
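As a sketch of this analytic strategy, the code below performs the median split and fits the factorial ANOVA with statsmodels, then repeats the model with graph literacy kept continuous. The DataFrame and its column names (usefulness, aid, interactive, sgl) are hypothetical, and the data are simulated purely so the example runs.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Simulated stand-in data: one row per participant.
rng = np.random.default_rng(7)
n = 305
df = pd.DataFrame({
    "aid": rng.choice(["BESD", "bar", "icon"], size=n),  # visual aid condition
    "interactive": rng.integers(0, 2, size=n),           # 0 = static, 1 = interactive
    "sgl": rng.normal(4.0, 0.7, size=n),                 # subjective graph literacy
})
df["usefulness"] = (4 + 0.2 * df["interactive"]
                    + 0.3 * (df["sgl"] - 4) + rng.normal(0, 0.5, size=n))

# Median split: high vs. low subjective graph literacy.
df["sgl_hi"] = (df["sgl"] > df["sgl"].median()).astype(int)

# Factorial ANOVA with graph literacy as a dichotomous factor (cf. Table 8).
aov = smf.ols("usefulness ~ C(aid) * C(interactive) * C(sgl_hi)", data=df).fit()
print(anova_lm(aov, typ=2))

# Robustness check: the same model with graph literacy kept continuous.
reg = smf.ols("usefulness ~ C(aid) * C(interactive) * sgl", data=df).fit()
print(reg.summary())
```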
Hypothesis Testing

Perceived Visual Aid Usefulness. A summary of the analysis of variance is shown in Table 8. There was a significant main effect of visual aid type on participants’ perceived visual aid usefulness, F(2, 287) = 4.48, p < 0.05, η² = 0.02. The Icon array was judged to be the most useful (M = 4.37, SD = 0.57), followed by the bar graph (M = 4.28, SD = 0.72) and then the BESD (M = 4.10, SD = 0.75). Tukey’s post-hoc comparison tests revealed a significant difference between the Icon array and the BESD (Hedges’ g = 0.40, p < 0.01), but not between the Icon array and the bar graph (Hedges’ g = 0.15, p = 0.62), nor between the bar graph and the BESD (Hedges’ g = 0.23, p = 0.10). Therefore, Hypothesis 1.1 was partially supported.

I found support for Hypothesis 2.1, as there was a significant main effect of interactivity on perceived visual aid usefulness, F(1, 287) = 8.06, p < 0.01, η² = 0.02. People judged the interactive visual aids as more useful (M = 4.37, SD = 0.64) than the static visual aids (M = 4.15, SD = 0.71, Hedges’ g = 0.32). There was also a main effect of subjective graph literacy, F(1, 287) = 39.67, p < 0.01, η² = 0.14. People who scored above the median judged the visual aid to be more useful (M = 4.46, SD = 0.65) than people who scored below the median (M = 3.96, SD = 0.73, Hedges’ g = 0.73). However, there was no significant interaction between user interactivity and subjective graph literacy on perceived usefulness, F(1, 287) = 0.44, p = 0.51. Therefore, Hypotheses 4.1 and 4.2 were not supported.

There was a significant interaction between visual aid condition and subjective graph literacy, F(2, 287) = 5.29, p < 0.01, η² = 0.04. As recommended by Rosnow and Rosenthal (1991), I calculated individual cell means to better illustrate the interactive effects between subjective graph literacy and visual aid type (Table 9). Furthermore, pairwise comparisons of the perceived usefulness of each visual aid were made between people with high and low subjective graph literacy. To control for inflated family-wise error across the multiple comparisons, the Type-I error rate was adjusted for the number of comparisons (3) using the Bonferroni correction (Dunn, 1961), resulting in an adjusted Type-I error rate of 0.017. There was no significant difference in the perceived usefulness of the BESD between people with high subjective graph literacy (M = 4.19, SD = 0.77) and people with low graph literacy (M = 4.01, SD = 0.74, Hedges’ g = 0.24), t(86) = 1.12, p = 0.27. However, there was a difference in the perceived usefulness of the bar graph between people with high subjective graph literacy (M = 4.57, SD = 0.48) and people with low (M = 3.81, SD = 0.78, Hedges’ g = 1.23), t(55) = 5.73, p < 0.01. There was also a difference in the perceived usefulness of the Icon array between people with high subjective graph literacy (M = 4.52, SD = 0.56) and people with low (M = 4.10, SD = 0.48, Hedges’ g = 0.78), t(86) = 4.10, p < 0.01. These results suggest that subjective graph literacy played a role in people’s attitudes toward the two graphical displays (bar graph and Icon array) but not the table (BESD).

To better understand the role of subjective graph literacy in the perceived usefulness of graphical displays, I conducted a follow-up ANOVA on the two graphical displays only (bar graph vs. Icon array). There was no significant main effect of visual aid type on perceived usefulness, F(1, 202) = 0.99, p = 0.32, η² = 0.00. There was, however, a main effect of subjective graph literacy, F(1, 202) = 57.4, p < 0.01, η² = 0.28.
To better understand the role of subjective graph literacy on the perceived usefulness of graphical displays, I conducted a follow-up ANOVA on the two graphical displays only (Bar graph vs. Icon array). There was no significant main effect of visual aid type on perceived usefulness, F(1,202) = 0.99, p = 0.32, η2 = 0.00. There was, however, a main effect of subjective graph literacy, F(1,202) = 57.4, p < 0.01, η2 = 0.28. There was also a significant interaction between graph type and subjective graph literacy, F(1,202) = 4.44, p < 0.05, η2 = 0.03 (see Figure 11). People with low graph literacy perceived the Icon array to be more useful than the bar graph, t(63) = 2.17, p < 0.05, Hedges' g = 0.43, whereas people with high graph literacy did not perceive the two graphical visual aids to differ in usefulness, t(130) = 0.55, p = 0.58, Hedges' g = 0.11. These results suggest that people with high subjective graph literacy perceive both the bar graph and the Icon array as equally useful, whereas people with low graph literacy perceive the Icon array to be more useful than the bar graph. Visual Aid Engagement. Summary of the Analysis of Variance is shown in Table 10. There was a main effect of user interactivity on people's engagement with the visual aid, F(1,281) = 9.85, p < 0.01, η2 = 0.04. People who saw the interactive visual aid judged the visual aid as more engaging (M = 4.24, SD = 0.84) than those who saw a static visual aid (M = 3.97, SD = 0.72, Hedges' g = 0.36). Therefore, Hypothesis 2.2 was supported. There was also a significant main effect of subjective graph literacy on visual aid engagement, F(1,281) = 23.00, p < 0.01, η2 = 0.08. People who scored above the median on the subjective graph literacy scale rated the visual aid as more engaging (M = 4.28, SD = 0.80) than those who scored below the median (M = 3.85, SD = 0.74, Hedges' g = 0.55). However, there was no significant effect of visual aid type on engagement, F(2,281) = 1.37, p = 0.25, η2 = 0.01, nor was there a significant interaction between user interactivity and subjective graph literacy, F(1,281) = 0.98, p = 0.32. Therefore, Hypothesis 4.3 was not supported. Objective Comprehension Test. I computed the total number of correct answers to the objective comprehension questions for each subject. Scores ranged from zero to four. The four objective comprehension questions varied in difficulty. The proportions of correct responses for questions 1 through 4 were 97%, 94%, 91%, and 78%, respectively, indicating that the questions were too easy and produced a severe ceiling effect. Given that the variable is ordinal and the distribution is highly skewed, I used ordered logistic regression to analyze the data. Ordered logistic regression is a variant of logistic regression that allows for more than two ordered response categories (Hardin & Hilbe, 2007). It also relaxes statistical assumptions such as normality of the variables and homogeneity of variances. To conduct an ordered logistic regression with a categorical predictor with more than two levels, I created k − 1 dummy-coded variables, with one level of the independent variable serving as the reference against which the other two levels are compared. In this case, because the test on objective comprehension was not planned, I did not have a theoretical rationale for choosing the reference level. Therefore, I used an empirical approach: I first calculated the mean number of correct answers for the BESD, Bar graph, and Icon array conditions. The mean numbers of correct answers for the three conditions were 3.62, 3.52, and 3.64, respectively, indicating that people who saw the bar graph had the lowest average score while people who saw the BESD and the Icon array both scored higher. Therefore, I used the bar graph condition as the reference point. Table 11 contains the summary of the ordered logistic regression analysis.
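A minimal sketch of this kind of model, assuming the statsmodels OrderedModel API and the same hypothetical file and column names as above (score is the 0 to 4 total; visual_aid has illustrative levels "BESD", "bar", and "icon"):

    import numpy as np
    import pandas as pd
    from statsmodels.miscmodels.ordinal_model import OrderedModel

    df = pd.read_csv("study_data.csv")  # hypothetical file and columns

    # k - 1 dummy codes for the three-level factor; bar graph is the reference.
    dummies = pd.get_dummies(df["visual_aid"], dtype=float).drop(columns="bar")
    X = pd.concat([dummies, df[["interactive", "graph_literacy"]]], axis=1)

    # Ordinal DV (0-4 correct answers) modeled with a logit link.
    fit = OrderedModel(df["score"], X, distr="logit").fit(method="bfgs", disp=False)
    print(fit.summary())

    # Exponentiating a slope gives the odds ratio: the multiplicative change
    # in the odds of landing in a higher score category per unit increase.
    print(np.exp(fit.params[: X.shape[1]]))

    # Predicted probabilities of each score category, averaged by condition
    # (assumes df has a default integer index aligned with the predictions).
    probs = pd.DataFrame(fit.predict(X))
    print(probs.groupby(df["visual_aid"]).mean())

The grouped predicted probabilities correspond to the kind of summary reported in Table 12.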
The model likelihood ratio test showed that the logistic model was a good fit for the data, χ2(4) = 11.24, p < 0.05. An ordered logistic regression can be interpreted in the same way as a logistic regression: the exponentiated B represents the odds ratio of moving from one category of the DV to the next for every unit increase in the predictor. As shown in the table, there was a marginally significant effect of the Icon array variable on the number of correct responses, Exp(B) = 1.77, Wald's z = 1.85, p = 0.06. The odds of scoring in a higher category were 1.77 times greater for participants who saw the Icon array than for those who saw the bar graph. There was no significant effect for the BESD. Alternatively, one can interpret the results by examining the predicted probabilities table. Table 12 shows the predicted probabilities of each score in each of the three visual aid conditions. Participants in the Icon array condition were more likely to obtain a perfect score (75.3%) than those in the bar graph condition (63.2%). Separate analyses for each question showed that the effect was primarily driven by the difference in performance on question four of the objective comprehension test. Logistic regression showed that the odds of answering question four correctly were 3.07 times greater for people who saw the Icon array than for people who saw the bar graph, Exp(B) = 3.07, Wald's z = 3.15, p < 0.01. There was a marginal difference in performance between people who saw the bar graph and the BESD: the odds of answering question four correctly were 1.71 times greater for people who saw the BESD than for people who saw the bar graph, Exp(B) = 1.71, Wald's z = 1.61, p = 0.10. Table 13 shows the summary of the logistic regression analysis.

DISCUSSION

Effective communication of statistical information is paramount for good managerial decision-making (Brooks et al., 2014; Kuncel & Rigdon, 2012). The science-practitioner gap in personnel selection can be partly attributed to a lack of awareness or understanding of the validity evidence from academic research (Colbert, Rynes, & Brown, 2005; Rynes, Bartunek, & Daft, 2001). The structured interview, specifically, is underutilized and undervalued by managers because its benefits are not well communicated or understood (Roulin & Bangerter, 2012; Van der Zee et al., 2002). A growing body of research has shown that "non-traditional" presentations of numerical information, such as the Binomial Effect Size Display and the Icon array, are more effective for communicating complicated statistical and probabilistic information to non-experts (Ancker et al., 2010; Brooks et al., 2014; Garcia-Retamero et al., 2013). Given the limited awareness of the benefits of evidence-based hiring practices among non-academic audiences, there is an emerging need to improve how those benefits are communicated. Furthermore, with the prevalence of technology in people's everyday lives, research should examine how digital platforms (e.g., computers and mobile devices) can be integrated into communicating statistical information. This dissertation accomplishes these goals by examining the benefits of user-interactive graphical displays on the comprehension of statistical validity information. As hypothesized, managers judged the Icon array as more useful than the BESD for communicating the benefits of structured interviews. However, there was no significant difference in perceived usefulness between the Icon array and the bar graph.
These results suggest that, although all three visual aids presented the same type of information (comparing expected outcomes of various hiring aids), the physical presentation of that information does influence people's perception of the visual aid's usefulness; there appears to be an advantage of graphical displays over numerical-only displays. This is consistent with research on how graphs are used to communicate difficult quantitative and probabilistic information in risk communication and quantitative education (Gaissmaier et al., 2012; Garcia-Retamero & Cokely, 2013). The perceived usefulness of the visual aids was moderated by individual differences in subjective graph literacy. As expected, subjective graph literacy influenced people's perceived usefulness of both graphical displays (Bar graph and Icon array) but not the tabular display (BESD). These results are consistent with the theoretical rationale behind the graph literacy construct and with other empirical findings. Because subjective graph literacy measures individual differences in the ability to extract and interpret information from a graphical display, it is expected to affect the perceived usefulness of graphs but not of tabular displays. Previous research has also found that graphical visual aids are more useful for people with higher graph literacy (Okan et al., 2012). These results also suggest that, for non-expert graph readers, graphical displays are as useful as BESDs. More importantly, there was a difference in the role of graph literacy for the two graphical displays (bar graph vs. Icon array). People with low graph literacy judged the Icon array to be more useful than the bar graph, while people with high graph literacy judged both graphical displays as equally useful. These results suggest that the Icon array might not require the same graph-processing skills as a bar graph, which makes Icon arrays useful even for people with low graph literacy. The benefits of the Icon array may be attributed to its design differences from the bar graph. The design differences are central to the three components of graph comprehension (Cleveland, 1993; Pinker, 1990; Shah, Mayer, & Hegarty, 1999). First, viewers must identify key visual features (e.g., bars or icons); next, viewers must relate those visual features to the conceptual relationships depicted in the graph (expected outcomes in hiring across interview methods); and finally, viewers must identify the concepts being quantified in the graph (e.g., applicants). In an Icon array, the key features of the graph (human silhouettes) are themselves representations of the concepts depicted in the graph (applicants). This feature removes the need for graph viewers to identify the important abstract visual features and associate them with a real-world concept, which is a central component of graph comprehension. Individual human-shaped icons also automatically evoke the viewer's association with people, whereas in a bar graph the association is not as direct; the viewer has to refer to additional labels on the axis to infer the meaning of the bars. In other words, the design of the Icon array removes some of the barriers to the graph comprehension process, making it easier for even novice graph readers to understand its meaning. Visual aid type also affected people's scores on the objective graph comprehension test. People who saw the Icon array and the BESD were more likely to make correct objective inferences from the data than people who saw the bar graph.
In other words, even though the two graphical displays (Icon array and bar graph) were perceived as equally useful and easy to understand, they differed when the reader had to make objective inferences from the graphical information. Moreover, even though the BESD was perceived as less useful than the bar graph, people actually performed marginally better on the objective comprehension test when they saw the BESD than when they saw the bar graph. These results suggest that the benefits of the different displays may be contingent on the nature of the criteria and the problem-solving task. The differential effects of visual aid type on the dependent variables can be attributed to the proximity compatibility principle (Carswell, 1992; Carswell & Wickens, 1987), which states that the degree to which a graphical display is useful depends on the nature of the task it is intended to support. In graphical representations of numerical information, there is a trade-off between the ability to accurately perceive precise numerical values and the ability to infer the gist of the data (Shah & Hoeffner, 2002). Tabular displays of numbers, such as the BESD, are best at presenting single point estimates but do not provide integrative gist information about the pattern in the data (Guthrie, Weber, & Kimmerly, 1993). Bar graphs, on the other hand, emphasize comparisons of numerical values across categories, as highlighted by the height of the bars, and are therefore better suited to conveying gist information about comparisons of quantity (Shah et al., 1999). The objective comprehension question that asks, "What percentage of applicants are expected to succeed when using a traditional interview?", requires people to extract exact numerical data from the visual aid, which is best displayed by the BESD. Subjective questions about the overall perceived utility of the display, in contrast, draw on the gist information in the data, which is easier to extract from a bar graph. This difference in task type between the subjective and objective dependent variables may explain why the BESD, while perceived to be less useful, actually resulted in slightly better performance on the objective comprehension test than the bar graph. The Icon array, on the other hand, has design features that highlight both gist information and precise numerical estimates. Graphically, the Icon array shares similarities with the bar graph in that it conveys gist information through the overall size of the array. As in a bar graph, numerical magnitude is represented by the height of the array, making the process of extracting gist information similar to that of a bar graph. Moreover, because the Icon array uses individual icons to represent frequencies, it also draws attention to the precise value of each array, making it easy for the viewer to extract precise numerical information. Results also showed that the objective measure of comprehension was uncorrelated with the subjective measures of comprehension and usefulness. Objective comprehension assessed the viewer's ability to extract precise numerical information, while the subjective measure of comprehension assessed the viewer's overall understanding of the gist information presented in the graph. The two types of comprehension may be differentially beneficial for different organizational decisions and goals.
If the goal is to communicate and compare the effectiveness of assessment methods or interventions and to improve people's intentions to adopt these methods, then it is more beneficial to use graphical visual aids that improve people's understanding of the overall pattern in the data. On the other hand, tabular displays have their uses: precise quantitative information can be valuable when making specific forecasts. For instance, if a manager has to make precise performance projections based on a particular assessment method, a tabular display with numerical information might be more useful. Nevertheless, as shown in this study, the Icon array excels at presenting both gist information and precise numerical information, which makes it the best of both worlds. The benefits of user-interactive graphical displays are more controversial (Ancker et al., 2011; Mayer & Chandler, 2001; Zikmund-Fisher et al., 2012). Many researchers have cautioned that user-interactivity can backfire if the implementation is too complex, distracting, or does not aid information processing (Gonzalez, 1996; Mayer, 2000; Morrison & Tversky, 2000). The user-interactive component of the visual aid examined in this study is simple and intuitive: participants did not need additional instructions to understand how to interact with the graphs. Moreover, the interactive component prompts users to compare the efficacy of the different hiring practices, which is relevant to the context of the information being communicated. Thus, the user-interactivity in this study satisfies the basic requirements of an effective implementation. As expected, people were more engaged with the interactive visual aids and found the information easier to understand. There was no interaction between user-interactivity and subjective graph literacy, however. This finding suggests that the benefits of user-interactivity, as implemented in this study, do not require high graph comprehension skills.

Limitations and Future Direction

There are several limitations to this study. First, the study assumes a base rate of success of 0.50; that is, it assumes that in the population of job applicants, 50% would be successful at the particular job examined in the study. In the real world, this rate varies significantly across jobs. For more technically demanding jobs or executive-level positions, the rate could be much lower, and the expected outcomes of the different hiring methods would change as a result. Future research should address this issue by presenting the validity of different interviewing methods across different types of occupations and by varying statistical parameters such as the selection ratio or effect size. The second limitation concerns the psychometric properties of the objective comprehension questions. The questions used in this study were fairly easy for the sample: more than 90% of participants answered the first three questions correctly. The limited variability in the criterion may have suppressed the true statistical effect of the different visual aids. Future research should examine more difficult objective comprehension questions.
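The consequence of this assumption can be made concrete. Under the bivariate normal model that underlies Taylor-Russell tables such as Table 1, the expected success rate among hires follows directly from the validity coefficient, the base rate, and the selection ratio. The sketch below illustrates that relationship under those assumptions; it is not code from this study. For a 30% base rate, r = 0.50, and a selection ratio of 0.30, it should approximately reproduce the 0.52 entry in Table 1:

    import numpy as np
    from scipy.stats import norm, multivariate_normal

    def success_rate_of_hires(validity, base_rate, selection_ratio):
        # P(success | hired) when hiring the top `selection_ratio` of
        # applicants on a predictor correlating `validity` with job success.
        x_cut = norm.ppf(1 - selection_ratio)  # predictor cutoff
        y_cut = norm.ppf(1 - base_rate)        # success threshold
        joint = multivariate_normal(mean=[0, 0],
                                    cov=[[1, validity], [validity, 1]])
        # P(X > x_cut, Y > y_cut) by inclusion-exclusion on the joint CDF.
        p_both = 1 - norm.cdf(x_cut) - norm.cdf(y_cut) + joint.cdf([x_cut, y_cut])
        return p_both / selection_ratio

    # With r = .50 and a selection ratio of .30, the expected success rate
    # among hires falls as the base rate of success drops:
    for base_rate in (0.5, 0.3, 0.1):
        print(base_rate, round(success_rate_of_hires(0.50, base_rate, 0.30), 2))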
Practical Implications

This study has several practical implications. While most academic journals have guidelines and recommendations for reporting statistical information that enhance understandability and maintain statistical rigor, no such guidelines exist for communicating validity information in non-academic formats. The methods and principles of statistical communication in the present study can serve as possible guidelines. The graphical displays used in the study can also be used to communicate the value of various psychological services. For example, consulting firms and test development companies can use these displays to communicate the value of their services and products. Finally, simplifying scientific evidence can also reduce the science-practice gap across many academic disciplines, especially those where evidence-based practices are not always adopted.

Conclusion

Industrial and organizational psychologists have long struggled with persuading organizations to adopt evidence-based hiring practices (Highhouse, 2008; Lawshe & Bolda, 1958; Rynes, 2009; Rynes, Colbert, & Brown, 2002). One major cause of the struggle is the lack of clear and understandable means of communicating complex statistical evidence. This dissertation improved on how statistical evidence is communicated by using Icon arrays. Compared to non-graphical tables and bar graphs, Icon arrays were perceived as easier to understand and enhanced numerical interpretation of statistical information. Principles of effective statistical communication, as demonstrated in this study, have the potential to inform how scientists present their research to policy makers, managers, and practitioners across many scientific disciplines.

REFERENCES

Ancker, J. S., Senathirajah, Y., Kukafka, R., & Starren, J. B. (2006). Design features of graphs in health risk communication: A systematic review. Journal of the American Medical Informatics Association, 13(6), 608–618.

Ancker, J. S., Weber, E. U., & Kukafka, R. (2011). Effects of game-like interactive graphics on risk perceptions and decisions. Medical Decision Making, 31(1), 130–142.

Arthur, J. B. (1994). Effects of human resource systems on manufacturing performance and turnover. Academy of Management Journal, 37(3), 670–687.

Ayres, I. (2008). Super crunchers: How anything can be predicted. Hachette UK.

Bailey, J. R., & Eastman, W. N. (1996). Tensions between science and service in organizational scholarship. The Journal of Applied Behavioral Science, 32(4), 350.

Barrick, M. R., Patton, G. K., & Haugland, S. N. (2000). Accuracy of interviewer judgments of job applicant personality traits. Personnel Psychology, 53(4), 925–951.

Beaton, A. E., & Barone, J. L. (1981). The usefulness of selection tests in college admissions. ETS Research Report Series, 1981(1), 1–17.

Bettman, J. R., & Zins, M. A. (1979). Information format and choice task effects in decision making. Journal of Consumer Research, 141–153.

Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2004). The concept of validity. Psychological Review, 111(4), 1061.

Boucheix, J.-M., & Guignard, H. (2005). What animated illustrations conditions can improve technical document comprehension in young students? Format, signaling and control of the presentation. European Journal of Psychology of Education, 20(4), 369–388.

Bridgeman, B., Pollack, J., & Burton, N. (2004). Understanding what SAT Reasoning Test™ scores add to high school grades: A straightforward approach. ETS Research Report Series, 2004(2), 1–20.

Brogden, H. E. (1946). On the interpretation of the correlation coefficient as a measure of predictive efficiency. Journal of Educational Psychology, 37(2), 65.

Brooks, M. E., Dalal, D. K., & Nolan, K. P. (2014). Are common language effect sizes easier to understand than traditional effect sizes?
The Journal of Applied Psychology, 99(2), 332–340.

Buckley, M. R., Norris, A. C., & Wiese, D. S. (2000). A brief history of the selection interview: May the next 100 years be more fruitful. Journal of Management History, 6(3), 113–126.

Campbell, J. P., Daft, R. L., & Hulin, C. L. (1982). What to study: Generating and developing research questions (Vol. 32). Beverly Hills, CA: Sage.

Carswell, C. M. (1992). Choosing specifiers: An evaluation of the basic tasks model of graphical perception. Human Factors: The Journal of the Human Factors and Ergonomics Society, 34(5), 535–554.

Carter, L. F. (1947). An experiment on the design of tables and graphs used for presenting numerical data. Journal of Applied Psychology, 31(6), 640.

Cascio, W. F. (1976). Turnover, biographical data, and fair employment practice. Journal of Applied Psychology, 61(5), 576.

Cascio, W. F. (1977). Formal education and police officer performance. Journal of Police Science & Administration, 5(1), 89–96.

Chandler, P., & Sweller, J. (1991). Cognitive load theory and the format of instruction. Cognition and Instruction, 8(4), 293–332.

Chase, W. G., & Simon, H. A. (1973). Perception in chess. Cognitive Psychology, 4(1), 55–81.

Cleveland, W. S., & McGill, R. (1985). Graphical perception and graphical methods for analyzing scientific data. Science, 229(4716), 828–833.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Routledge.

Colbert, A. E., Rynes, S. L., & Brown, K. G. (2005). Who believes us? Understanding managers' agreement with human resource research findings. The Journal of Applied Behavioral Science, 41(3), 304–325.

Dana, J., Dawes, R., & Peterson, N. (2013). Belief in the unstructured interview: The persistence of an illusion. Judgment and Decision Making, 8(5), 512–520.

Davidshofer, C. O., & Murphy, K. R. (2005). Psychological testing: Principles and applications. Upper Saddle River, NJ: Pearson/Prentice Hall.

Denis, M. (1984). Imagery and prose: A critical review of research on adults and children. Text – Interdisciplinary Journal for the Study of Discourse, 4(4), 381–402.

Dipboye, R. L. (1997). Structured selection interviews: Why do they work? Why are they underutilized? In International handbook of selection and assessment (pp. 455–474). London: J. Wiley.

Egan, D. E., & Schwartz, B. J. (1979). Chunking in recall of symbolic drawings. Memory & Cognition, 7(2), 149–158.

Fagerlin, A., Wang, C., & Ubel, P. A. (2005). Reducing the influence of anecdotal reasoning on people's health care decisions: Is a picture worth a thousand statistics? Medical Decision Making, 25(4), 398–405.

Fagerlin, A., Zikmund-Fisher, B. J., Ubel, P. A., Jankovic, A., Derry, H. A., & Smith, D. M. (2007). Measuring numeracy without a math test: Development of the Subjective Numeracy Scale. Medical Decision Making, 27(5), 672–680. http://doi.org/10.1177/0272989X07304449

Feldman-Stewart, D., Kocovski, N., McConnell, B. A., Brundage, M. D., & Mackillop, W. J. (2000). Perception of quantitative information for treatment decisions. Medical Decision Making, 20(2), 228–238.

Gaissmaier, W., Wegwarth, O., Skopec, D., Müller, A.-S., Broschinski, S., & Politi, M. C. (2012). Numbers can be worth a thousand pictures: Individual differences in understanding graphical and numerical representations of health-related information. Health Psychology, 31, 286–296.

Galesic, M., & Garcia-Retamero, R. (2011). Graph literacy: A cross-cultural comparison.
Medical Decision Making, 31(3), 444–457. http://doi.org/10.1177/0272989X10373805

Galesic, M., Garcia-Retamero, R., & Gigerenzer, G. (2009). Using icon arrays to communicate medical risks: Overcoming low numeracy. Health Psychology, 28(2), 210.

Garcia-Retamero, R., & Cokely, E. T. (2013). Communicating health risks with visual aids. Current Directions in Psychological Science, 22(5), 392–399. http://doi.org/10.1177/0963721413491570

Garcia-Retamero, R., & Dhami, M. K. (2011). Pictures speak louder than numbers: On communicating medical risks to immigrants with limited non-native language proficiency. Health Expectations, 14, 46–57. http://doi.org/10.1111/j.1369-7625.2011.00670.x

Garcia-Retamero, R., Galesic, M., & Gigerenzer, G. (2010). Do icon arrays help reduce denominator neglect? Medical Decision Making, 30(6), 672–684.

Garcia-Retamero, R., Okan, Y., & Cokely, E. T. (2012). Using visual aids to improve communication of risks about health: A review. The Scientific World Journal, 2012, 562637. http://doi.org/10.1100/2012/562637

Gigerenzer, G., & Hoffrage, U. (1995). How to improve Bayesian reasoning without instruction: Frequency formats. Psychological Review, 102(4), 684.

Gonzalez, C. (1996). Does animation in user interfaces improve decision making? In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 27–34). ACM.

Guion, R. M. (2011). Assessment, measurement, and prediction for personnel decisions. Taylor & Francis.

Guthrie, J. T., Weber, S., & Kimmerly, N. (1993). Searching documents: Cognitive processes and deficits in understanding graphs, tables, and illustrations. Contemporary Educational Psychology, 18(2), 186–221.

Haensly, P. A., Lupkowski, A. E., & McNamara, J. F. (1987). The chart essay: A strategy for communicating research findings to policymakers and practitioners. Educational Evaluation and Policy Analysis, 9(1), 63–75.

Hardin, J. W., & Hilbe, J. M. (2007). Generalized linear models and extensions. Stata Press.

Hau, R., Pleskac, T. J., Kiefer, J., & Hertwig, R. (2008). The description–experience gap in risky choice: The role of sample size and experienced probabilities. Journal of Behavioral Decision Making, 21(5), 493–518. http://doi.org/10.1002/bdm.598

Hess, R., Visschers, V. H., & Siegrist, M. (2011). Risk communication with pictographs: The role of numeracy and graph processing. Judgment and Decision Making, 6(3), 263–274.

Highhouse, S. (2008). Stubborn reliance on intuition and subjectivity in employee selection. Industrial and Organizational Psychology, 1(3), 333–342.

Highhouse, S., & Zhang, D. (2015). The new fruit fly for applied psychological research. Industrial and Organizational Psychology, 8(2), 179–183. http://doi.org/10.1017/iop.2015.22

Hoffrage, U., Lindsey, S., Hertwig, R., & Gigerenzer, G. (2000). Communicating statistical information. Science, 290(5500), 2261–2262.

Huffcutt, A. I., & Arthur, W. (1994). Hunter and Hunter (1984) revisited: Interview validity for entry-level jobs. Journal of Applied Psychology, 79(2).

Huselid, M. A. (1995). The impact of human resource management practices on turnover, productivity, and corporate financial performance. Academy of Management Journal, 38(3), 635–672.

Irwin, J. R., & McClelland, G. H. (2003). Negative consequences of dichotomizing continuous predictor variables. Journal of Marketing Research, 40(3), 366–371.

Johns, G. (1993).
Constraints on the adoption of psychology-based personnel practices: Lessons from organizational innovation. Personnel Psychology, 46(3), 569–592.

Kidder, W. C., & Rosner, J. (2002). How the SAT creates built-in headwinds: An educational and legal analysis of disparate impact. Santa Clara Law Review, 43, 131.

Kosslyn, S. M. (1985). Graphics and human information processing: A review of five books. Journal of the American Statistical Association, 80(391), 499–512.

Kuncel, N., & Rigdon, J. (2012). Communicating research findings. In Handbook of psychology, industrial and organizational psychology. John Wiley & Sons.

Latham, G. P., & Finnegan, B. J. (1993). Perceived practicality of unstructured, patterned, and situational interviews. Personnel Selection and Assessment: Individual and Organizational Perspectives, 41–55.

Lawshe, C. H., Bolda, R. A., Brune, R. L., & Auclair, G. (1958). Expectancy charts II: Their theoretical development. Personnel Psychology, 11(4), 545–559.

Levashina, J., Hartwell, C. J., Morgeson, F. P., & Campion, M. A. (2014). The structured employment interview: Narrative and quantitative review of the research literature. Personnel Psychology, 67(1), 241–293.

Levie, W. H., & Lentz, R. (1982). Effects of text illustrations: A review of research. ECTJ, 30(4), 195–232.

Lewandowsky, S., & Spence, I. (1989). The perception of statistical graphs. Sociological Methods & Research, 18(2-3), 200–242.

Lord, C. G., Ross, L., & Lepper, M. R. (1979). Biased assimilation and attitude polarization: The effects of prior theories on subsequently considered evidence. Journal of Personality and Social Psychology, 37(11), 2098.

Lowe, R. K. (2003). Animation and learning: Selective processing of information in dynamic graphics. Learning and Instruction, 13(2), 157–176. http://doi.org/10.1016/S0959-4752(02)00018-X

Macdonald-Ross, M. (1977). How numbers are shown. AV Communication Review, 25(4), 359–409.

MacIver, R., Anderson, N., Costa, A.-C., & Evers, A. (2014). Validity of interpretation: A user validity perspective beyond the test score. International Journal of Selection and Assessment, 22(2), 149–164.

Mattern, K. D., Shaw, E. J., & Kobrin, J. L. (2011). An alternative presentation of incremental validity: Discrepant SAT and HSGPA performance. Educational and Psychological Measurement, 71(4), 638–662. http://doi.org/10.1177/0013164410383563

Mayer, R. E., & Chandler, P. (2001). When learning is just a click away: Does simple user interaction foster deeper understanding of multimedia messages? Journal of Educational Psychology, 93(2), 390–397. http://doi.org/10.1037/0022-0663.93.2.390

McDaniel, M. A., Whetzel, D. L., Schmidt, F. L., & Maurer, S. D. (1994). The validity of employment interviews: A comprehensive review and meta-analysis. Journal of Applied Psychology, 79(4), 599.

McGraw, K. O., & Wong, S. P. (1992). A common language effect size statistic. Psychological Bulletin, 111(2), 361.

Carswell, C. M., & Wickens, C. D. (1987). Information integration and the object display: An interaction of task demands and display superiority. Ergonomics, 30(3), 511–527.

Morrison, J. B., & Tversky, B. (2001). The (in)effectiveness of animation in instruction. In CHI'01 extended abstracts on human factors in computing systems (pp. 377–378). ACM.

Morrison, J. B., Tversky, B., & Betrancourt, M. (2000). Animation: Does it facilitate learning? In AAAI spring symposium on smart graphics (pp. 53–59).

Murphy, K. R., & Davidshofer, C. O. (1988). Psychological testing: Principles and applications. Englewood Cliffs, NJ: Prentice Hall.
Nolan, K. P., & Highhouse, S. (2014). Need for autonomy and resistance to standardized employee selection practices. Human Performance, 27(4), 328–346.

Okan, Y., Garcia-Retamero, R., Cokely, E. T., & Maldonado, A. (2012). Individual differences in graph literacy: Overcoming denominator neglect in risk comprehension. Journal of Behavioral Decision Making, 25(4), 390–401.

Ones, D. S., Viswesvaran, C., & Schmidt, F. L. (1993). Comprehensive meta-analysis of integrity test validities: Findings and implications for personnel selection and theories of job performance. Journal of Applied Psychology, 78(4), 679–703. http://doi.org/10.1037/0021-9010.78.4.679

Paolacci, G., Chandler, J., & Ipeirotis, P. G. (2010). Running experiments on Amazon Mechanical Turk. Judgment and Decision Making, 5(5), 411–419.

Perez, E. C., & White, M. A. (1985). Student evaluation of motivational and learning attributes of microcomputer software. Journal of Computer-Based Instruction. Retrieved from http://psycnet.apa.org/psycinfo/1986-10370-001

Pfeffer, J., & Lammerding, C. (1981). Power in organizations (Vol. 33). Marshfield, MA: Pitman.

Pinker, S. (1990). A theory of graph comprehension. Artificial Intelligence and the Future of Testing, 73–126.

Preece, J., Rogers, Y., Sharp, H., Benyon, D., Holland, S., & Carey, T. (1994). Human-computer interaction. Addison-Wesley Longman Ltd.

Priem, R. L., & Rosenstein, J. (2000). Is organization theory obvious to practitioners? A test of one established theory. Organization Science, 11(5), 509–524.

Rieber, L. P. (1990). Using computer animated graphics in science instruction with children. Journal of Educational Psychology, 82(1), 135.

Rosenthal, R. (2005). Binomial effect size display. In Encyclopedia of statistics in behavioral science. John Wiley & Sons, Ltd.

Rosenthal, R., & Rubin, D. B. (1982). A simple, general purpose display of magnitude of experimental effect. Journal of Educational Psychology, 74(2), 166.

Rosnow, R. L., & Rosenthal, R. (1991). If you're looking at the cell means, you're not looking at only the interaction (unless all main effects are zero).

Roulin, N., & Bangerter, A. (2012). Understanding the academic–practitioner gap for structured interviews: "Behavioral" interviews diffuse, "structured" interviews do not. International Journal of Selection and Assessment, 20(2), 149–158.

Ryan, A. M., & Sackett, P. R. (1987). A survey of individual assessment practices by I/O psychologists. Personnel Psychology, 40(3), 455–488.

Rynes, S. (2009). The research-practice gap in industrial-organizational psychology and related fields: Challenges and potential solutions.

Rynes, S. L., Bartunek, J. M., & Daft, R. L. (2001). Across the great divide: Knowledge creation and transfer between practitioners and academics. The Academy of Management Journal, 44(2), 340–355. http://doi.org/10.2307/3069460

Rynes, S. L., Colbert, A. E., & Brown, K. G. (2002). HR professionals' beliefs about effective human resource practices: Correspondence between research and practice. Human Resource Management, 41(2), 149–174.

Salgado, J. F. (2002). The Big Five personality dimensions and counterproductive behaviors. International Journal of Selection and Assessment, 10, 117–125.

Schmidt, F., & Hunter, J. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124(2), 262–274.

Schrader, W. B. (1965). A taxonomy of expectancy tables.
Journal of Educational Measurement, 2(1), 29–35.

Schuler, H. (1993). Social validity of selection situations: A concept and some empirical results. In H. Schuler, J. L. Farr, & M. Smith (Eds.), Personnel selection and assessment: Individual and organizational perspectives (pp. 11–26). Hillsdale, NJ: Lawrence Erlbaum Associates.

Shah, P., & Hoeffner, J. (2002). Review of graph comprehension research: Implications for instruction. Educational Psychology Review, 14(1), 47–69.

Shah, P., Mayer, R. E., & Hegarty, M. (1999). Graphs as aids to knowledge construction: Signaling techniques for guiding the process of graph comprehension. Journal of Educational Psychology, 91(4), 690.

Soyer, E., & Hogarth, R. M. (2012). The illusion of predictability: How regression statistics mislead experts. International Journal of Forecasting, 28(3), 695–711.

Spiegelhalter, D., Pearson, M., & Short, I. (2011). Visualizing uncertainty about the future. Science, 333(6048), 1393–1400. http://doi.org/10.1126/science.1191181

Tabachnick, B. G., Fidell, L. S., & Osterlind, S. J. (2001). Using multivariate statistics.

Taylor, H. C., & Russell, J. T. (1939). The relationship of validity coefficients to the practical effectiveness of tests in selection: Discussion and tables. Journal of Applied Psychology, 23(5), 565.

Terpstra, D. E., & Rozell, E. J. (1997). Why some potentially effective staffing practices are seldom used. Public Personnel Management, 26(4), 483–495.

Tiffin, J., & Vincent, N. L. (1960). Comparison of empirical and theoretical expectancies. Personnel Psychology, 13(1), 59–64.

Tindall-Ford, S., Chandler, P., & Sweller, J. (1997). When two sensory modes are better than one. Journal of Experimental Psychology: Applied, 3(4), 257.

Van der Zee, K. I., Bakker, A. B., & Bakker, P. (2002). Why are structured interviews so rarely used in personnel selection? Journal of Applied Psychology, 87(1), 176.

Volkov, A., & Laing, G. (2012). Assessing the value of graphical presentations in financial reports. Australasian Accounting, Business and Finance Journal, 6(3), 85–108.

Winn, W. (1989). The role of graphics in training documents: Toward an explanatory theory of how they communicate. IEEE Transactions on Professional Communication, 32(4), 300–309.

Wolf, A., & Jenkins, A. (2006). Explaining greater test use for selection: The role of HR professionals in a world of expanding regulation. Human Resource Management Journal, 16(2), 193–213.

Yankelevich, M. (2007). Expectancy chart interpretation and use: Effects of presentation. Bowling Green State University.

Zhang, D. C., Zhang, Y., Highhouse, S., & Brooks, M. E. (2014, November). Using Icon arrays to communicate effect size information. Presented at the Annual Meeting for the Society for Judgment and Decision Making, Long Beach, CA.

Zikmund-Fisher, B. J., Witteman, H. O., Fuhrel-Forbis, A., Exe, N. L., Kahn, V. C., & Dickson, M. (2012). Animated graphics for comparing two risks: A cautionary tale. Journal of Medical Internet Research, 14(4). http://doi.org/10.2196/jmir.2030

Table 1. Example of Taylor-Russell Table (base rate of success = 30%)

Validity        Selection Ratio (SR)
(Pearson's r)   0.05   0.10   0.20   0.30   0.40   0.50
0.00            0.30   0.30   0.30   0.30   0.30   0.30
0.25            0.50   0.47   0.43   0.41   0.39   0.37
0.50            0.72   0.65   0.58   0.52   0.48   0.44
0.75            0.93   0.86   0.76   0.67   0.59   0.52

Table 2. Example of Binomial Effect Size Display

               Improvement   No Improvement
GRE Training   65%           35%
No Training    35%           65%

Table 3.
Characteristics of Decision-aids that Enhance Comprehension

Characteristic                CLES   BESD        Expectancy Charts   Icon array
Frequency Representation      No     Sometimes   Sometimes           Yes
Graph Format                  No     No          Yes                 Yes
Table Format                  No     Yes         No                  No
High Iconicity                No     No          No                  Yes
Overcomes base-rate neglect   No     No          No                  Yes

Table 4. Principal Axis Factoring Results for Subjective Graph Literacy Scale

Item                                                                   Factor Loading
It is easy for me to understand information presented in a graph
or a chart (e.g., pie chart, bar graph)                                0.72
I find that complex information is easier to understand when it is
supported by graphs or charts                                          0.64
I find it easier to communicate information to others using a graph
or a chart                                                             0.70
I am good at creating graphs or charts of numerical information       0.62
When reading the newspaper or magazine, I find the graphs and
charts very helpful                                                    0.71
I have a hard time making sense of data presented in a graph or
chart                                                                  0.38
I am better than most people at visualizing information               0.49
In my opinion, a picture is worth a thousand words                     0.47

Table 5. Means, Standard Deviations, Reliabilities and Inter-correlations of Variables

Variable                    M       SD      1        2        3        4        5        6        7
1 Hiring Experience         2.80    1.10    (0.86)
2 Graph Literacy            4.10    0.52    -0.02    (0.81)
3 Computer Experience       4.50    0.46    0.03     0.26**   (0.65)
4 Perceived Usefulness      4.20    0.75    -0.02    0.34**   0.06     (0.92)
5 Visual-aid Engagement     4.10    0.79    0.00     0.36**   0.14*    0.55**   (0.85)
6 Perceived Comprehension   4.30    0.68    -0.03    0.35**   0.14*    0.70**   0.49**   (0.78)
7 Objective Comprehension   3.59    0.75    0.05     0.22**   0.14*    0.03     0.05     0.14*
8 Age                       37.35   10.69   0.18**   -0.02    -0.05    0.15**   0.06     0.16**   0.08

Notes. * p < 0.05, ** p < 0.01. Diagonals contain standardized Cronbach's alpha.

Table 6. Principal Axis Factoring Results for the Dependent Variables

Item                                                                   Factor 1   Factor 2
I would use this visual aid to communicate the advantages of
structured interviews.                                                 0.82       0.00
I would recommend this visual aid to be used in presenting the
advantages of a structured interview.                                  0.93       -0.03
I would like to have this visual aid to accompany information
about the advantages of a structured interview                         0.93       -0.06
This visual aid clearly demonstrates the advantages of structured
interviews.                                                            0.80       0.06
The graphical visual aid was interesting                               0.10       0.77
I was engaged in the graphical visual aid                              -0.05      0.95
I was bored from looking at the visual aid                             0.03       0.65
The information presented in the visual aid was confusing              0.31       0.17
It was easy to understand the information about the different
interview methods                                                      0.63       0.09
The visual aid made the advantage of structured interview easy to
understand                                                             0.70       0.08

Table 7. Principal Axis Factoring Results for Revised Dependent Variables

Item                                                                   Visual aid    Engagement
                                                                       usefulness
I would use this visual aid to communicate the advantages of
structured interviews.                                                 0.81          0.00
I would recommend this visual aid to be used in presenting the
advantages of a structured interview.                                  0.91          -0.03
I would like to have this visual aid to accompany information
about the advantages of a structured interview                         0.92          -0.06
This visual aid clearly demonstrates the advantages of structured
interviews.                                                            0.79          0.06
The graphical visual aid was interesting                               0.11          0.77
I was engaged in the graphical visual aid                              -0.05         0.98
I was bored from looking at the visual aid                             0.05          0.68
It was easy to understand the information about the different
interview methods                                                      0.62          0.09
The visual aid made the advantage of structured interview easy to
understand                                                             0.69          0.08

Table 8.
Summary of ANOVA Results for Perceived Visual Aid Usefulness

Source                                      DF    MS      F
Interactivity                               1     3.17    8.06**
Visual Aid                                  2     1.76    4.48*
Subjective Graph Literacy                   1     15.62   39.67**
Visual Aid x Interactivity                  2     0.67    1.70
Subjective Graph Literacy x Interactivity   1     0.17    0.44
Visual Aid x Subjective Graph Literacy      2     2.08    5.29**
Three-way Interaction                       2     0.06    0.16
Error                                       286   0.48

Notes. * p < 0.05, ** p < 0.01

Table 9. Mean Perceived Usefulness Across Visual Aids

             Graph Literacy
             Low           High
BESD         4.01 (0.74)   4.19 (0.77)
Bar graph    3.81 (0.78)   4.57 (0.48)
Icon array   4.10 (0.48)   4.52 (0.56)

Notes. Parentheses contain cell standard deviations

Table 10. Summary of ANOVA Results for Engagement

Source                                      DF    MS      F
Interactivity                               1     5.59    9.85*
Visual Aid                                  2     0.77    1.36
Subjective Graph Literacy                   1     12.93   22.79**
Visual Aid x Interactivity                  2     1.10    1.93
Subjective Graph Literacy x Interactivity   1     0.66    1.16
Visual Aid x Subjective Graph Literacy      2     0.36    0.63
Three-way Interaction                       2     0.85    1.50
Error                                       281   0.57

Notes. * p < 0.01, ** p < 0.001

Table 11. Summary of the Ordered Logistic Regression on Number of Correct Answers

                            B      Exp(B)   Wald's z
Interactivity               0.07   1.07     0.29
Icon array                  0.57   1.77     1.85+
BESD                        0.36   1.43     1.18
Subjective Graph Literacy   0.60   1.82     2.81*

Notes. + p < 0.10, * p < 0.05

Table 12. Predicted Probabilities of Number of Correct Responses

             Total number of correct
Visual aid   1      2      3       4
BESD         1.9%   4.5%   21.4%   71.2%
Icon array   1.6%   3.7%   18.7%   75.3%
Bar graph    2.7%   6.2%   26.5%   63.2%

Table 13. Logistic Regression Analysis of Objective Comprehension Test Questions

                            Question 1   Question 2   Question 3   Question 4
Subjective Graph Literacy   2.00         3.45         2.26         1.79*
User Interactivity          0.54         0.53         1.05         1.03
Icon array                  1.81         0.62         0.69         3.07**
BESD                        0.92         0.91         1.02         1.71+

Notes. Values indicate Exponentiated (B); + p < 0.10, * p < 0.05, ** p < 0.01

Figure 1. Example of Expectancy Chart [horizontal bar chart of probability of success (0%–100%) by score range (0–5 through 21–25)]

Figure 2. Example of Icon array

Figure 3. Screenshot of Visual Aid Instructions

Figure 4. Screenshot of Interactive BESD

Figure 5. Screenshot of Interactive Bar Graph

Figure 6. Screenshot of Interactive Icon array

Figure 7. Screenshot of Static BESD

Figure 8. Screenshot of Static Bar Graph

Figure 9. Screenshot of Static Icon array

Figure 10. Histogram of Subjective Graph Literacy Scale

Figure 11. Plot of Means for Perceived Usefulness of Icon array and Bar Graph [line plot of perceived usefulness (1–5) for the bar graph and Icon array, by low vs. high graph literacy]

APPENDIX A: JOB SCREENING ITEMS

• Physical and manual labor
• Interacting with computers
• Teaching and instructing
• Entertainment/Performance
• Recruiting and Hiring
• Interact with customers
• Research and development
• Training and mentoring
• Management
• Care for patients
• Public speaking
• Planning and organizing
• Operate machinery
• Traveling
• Student
• Administration
• Lobbying
• Fund raising
• Writing
• Sales
• Transportation
• Analyst
• Food operation

APPENDIX B: INTERVIEW VIGNETTE

Imagine that you are the manager of a medium-sized software company in charge of hiring. You are tasked with choosing between using a traditional interview and using a structured interview when assessing job candidates. The structured interview requires that the same questions are asked of each applicant. Thus, structured interviews restrict the freedom of the interviewer.
But the advantage of the structured interview is that it results in more accurate predictions and more successful hires than a traditional interview. Next, you will be presented with a visual aid that presents information about the different interview methods. We would like your reaction to the visual aid.

APPENDIX C: OBJECTIVE COMPREHENSION TEST

1. Which hiring method will lead to the highest number of expected successful hires?
2. If 50 applicants were hired with the structured interview, how many of those hires are expected to succeed on the job?
3. If you switched from a traditional interview to a structured interview, how many more successful hires would be expected?
4. What percentage of applicants are expected to succeed when using a traditional interview?

APPENDIX D: DEPENDENT VARIABLES

Visual Aid Usefulness
1. I would use this visual aid to communicate the advantages of structured interviews.
2. I would recommend this visual aid to be used in presenting the advantages of a structured interview.
3. I would like to have this visual aid to accompany information about the advantages of a structured interview
4. This visual aid clearly demonstrates the advantages of structured interviews.

Visual Aid Engagement
1. The graphical visual aid was interesting
2. I was engaged in the graphical visual aid
3. I was bored from looking at the visual aid

Visual Aid Comprehension
1. The information presented in the visual aid was confusing
2. It was easy to understand the information about the different interview methods
3. The visual aid made the advantage of structured interview easy to understand

APPENDIX E: DEMOGRAPHICS

Hiring Experience

• How many times have you made a hiring-related decision about a candidate?
  1: 0 times  2: 1-5 times  3: 5-10 times  4: 10-20 times  5: 20 or more times

• How many times have you made a hiring-related recommendation about a candidate?
  1: 0 times  2: 1-5 times  3: 5-10 times  4: 10-20 times  5: 20 or more times

• How many times have you interviewed a job candidate with a traditional interview?
  1: 0 times  2: 1-5 times  3: 5-10 times  4: 10-20 times  5: 20 or more times

• How many times have you interviewed a job candidate with a structured interview?
  1: 0 times  2: 1-5 times  3: 5-10 times  4: 10-20 times  5: 20 or more times

• How often do you read academic journal articles on the topic of human resources and/or management?
  1: Never  2: A few times a year  3: Every month  4: Every week  5: Almost every day

• What kind of formal training have you had in hiring-related practices? Check all that apply.
  o Professional workshop
  o Online training
  o Written manual
  o Informal instructions
  o Classroom instructions
  o Observation/shadowing

Computer Experience

• How often do you use a computer?
  1: Less than 1 hour per week  2: Less than 1 hour per day  3: 1-2 hours per day  4: 3-6 hours per day  5: 6 or more hours per day

• How would you rate your knowledge about computer use?
  1: Much less than average  2: Less than average  3: About average  4: More than average  5: Much more than average

• How comfortable are you with using a computer for everyday tasks?
  1: Not at all comfortable  2:  3: Somewhat uncomfortable  4: Somewhat comfortable  5: Very comfortable

APPENDIX F: INFORMED CONSENT

APPENDIX G: HSRB APPROVAL LETTER