Judging Audit Quality in Light of Adverse Outcomes: Evidence of Outcome Bias and Reverse Outcome Bias Mark E. Peecher, Ph.D., CPA Deloitte & Touche Teaching Fellow Associate Professor of Accountancy M. David Piercey Deloitte & Touche Doctoral Fellow Doctoral Candidate University of Illinois at Urbana-Champaign February 2006 Under Revision: Comments Appreciated We thank Brooke Elliott, Anne Farrell, Mike Gibbins, Jonathan Grenier, Gary Hecht, Josh Herbold, Karim Jamal, Kathryn Kadous, Susan Krische, Thomas Matthews, Molly Mercer, Joel Pike, Doug Prawitt, Ira Solomon, Kristy Towry, and George Wu for their helpful comments. We also thank participants at the 9th BDRM Conference at Duke University and at the 1st Accounting Research Symposium at Brigham Young University, as well as workshop participants at the University of Alberta, the University of Connecticut, Emory University, and the University of Illinois at Urbana-Champaign. Address email to either [email protected] or [email protected] or mail to either author at UIUC, Department of Accountancy, College of Business, 284 Wohlers Hall, 1206 S. Sixth Street, Champaign, IL, 61820. Judging Audit Quality in Light of Adverse Outcomes: Evidence of Outcome Bias and Reverse Outcome Bias Abstract Considerable auditing research demonstrates that individuals exhibit outcome effects when judging audit quality: That is, they judge auditor negligence higher when given knowledge about adverse audit outcomes. Many studies conclude that knowledge of adverse audit outcomes biases individuals against auditors, and attempt to improve their judgments by reducing outcome effects. Yet, whether outcome effects imply outcome bias is a vexing question: From a Bayesian perspective, individuals should judge auditors more harshly when given adverse outcomes. Logically, individuals could exhibit either outcome bias (over-rely on adverse audit outcomes), reverse outcome bias (under-rely on adverse outcomes), or neither form of bias. 
Applying Prospect Theory’s probability weighting function, we hypothesize and find that individuals’ negligence judgments exhibit outcome bias when the Bayesian probability of auditor negligence is relatively low, but also exhibit reverse outcome bias when the Bayesian probability is relatively high. This finding is robust to both relatively rich and relatively abstract experimental settings, and to judgments made in hindsight and in foresight. While many factors likely contribute to judgments of auditors potentially being too harsh (e.g., large plaintiff losses and auditors’ “deep pockets”), our conclusions suggest that the effect of adverse outcome information on judgments of auditor negligence is not as obvious, and depends on whether the Bayesian probability of auditor negligence is high or low. We suggest that the model for judgments of auditor negligence should expand to include both outcome bias and reverse outcome bias, where predicted by our combination of outcome effects and Prospect Theory’s probability weighting function.

I. INTRODUCTION

In a variety of contexts, individuals assess the quality of auditors’ decision making in light of adverse outcomes, e.g., material misstatements of earnings. Such outcomes often are salient in contexts such as litigation, the popular business press, and alternative dispute resolution. A common concern is that adverse outcomes exert too much influence on individuals’ judgments of auditor negligence. The effects of outcome knowledge on individuals’ judgments of auditor negligence are called outcome effects. A number of accounting studies have shown that individuals judge auditors more harshly when given information about adverse outcomes than when not given such information (e.g., Clarkson et al.
2002; Kadous 2001).1 Some studies conclude that larger outcome effects indicate a larger bias against auditors and, accordingly, attempt to de-bias individuals’ judgments of auditors by reducing outcome effects (e.g., Cornell et al. 2005; Clarkson et al. 2002; Kadous 2001). Yet, the extent to which outcome effects imply outcome bias is more vexing: From a Bayesian perspective, audit outcomes are informative of original audit quality (Hershey and Baron 1995; Hawkins and Hastie 1990; Brown and Solomon 1987). A Bayesian evaluator would judge auditors more harshly given knowledge of adverse audit outcomes, and therefore exhibit such outcome effects (see Section II; Hershey and Baron 1995; Holmstrom 1979).2 Since individuals are regularly non-Bayesian, they could either over- or under-rely on outcome information (e.g., Edwards 1968). Logically, outcome effects, by themselves, could be the same as, greater than, or less than those exhibited by a Bayesian evaluator of audit quality.

1 Outcome effects have also been shown in other accounting contexts, including bankruptcy prediction (Buchman 1985), capital budgeting (Brown and Solomon 1987), variance investigation (Lipe 1993), taxation (Kadous and Magro 2001), and performance evaluation (Frederickson, Peffer and Pratt 1999).

2 It is generally appropriate for evaluators to use outcome information in most real-world contexts (Hershey and Baron 1992, 1995). Normatively, adverse outcomes are to some extent diagnostic of auditor negligence unless either: (1) individuals are certain that they possess all of the information that auditors should have possessed when making their decisions, or (2) individuals are uncertain whether they possess all such information but are somehow certain that all missing information is uncorrelated with adverse outcomes (Brown and Solomon 1987, 565-66).
Several accounting studies include caveats to emphasize that outcome effects can be normatively appropriate (e.g., Brown and Solomon 1987; Lipe 1993; Anderson et al. 1993, 1997; Tan and Lipe 1997; Frederickson et al. 1999), while others explicitly assume that outcome effects likely are non-normative (e.g., Kadous 2001, 441). For example, some studies characterize evaluators as being “vulnerable” or “susceptible” to outcome effects, or stress that evaluators are unlikely to be able to ignore outcomes when evaluating auditors or managers, who did not know the outcomes at the time of their decisions (e.g., Kinney and Nelson 1996; Kadous 2001; Clarkson et al. 2002; Cornell et al. 2005). Other studies take measures to increase the likelihood that outcome effects indicate outcome bias by trying to suppress the diagnosticity of outcomes. For example, some participants are asked to assume they have exactly the same information that auditors (should have) had when deciding (e.g., Anderson et al. 1993, 1997).3 Some participants are told that they should ignore outcomes (e.g., Anderson et al. 1993, 1997; Clarkson et al. 2002).4 Yet, whether and the extent to which outcome effects documented in these studies indicate outcome bias depends in large part on the diagnosticity of the adverse outcomes for audit quality. Outcome effects should arise when participants reasonably judge adverse outcomes to have non-zero diagnosticity for audit quality. We build upon this literature by using a conceptual approach that does not require caveats about whether or the extent to which adverse outcomes are diagnostic of audit quality. We allow adverse outcomes to be diagnostic of audit quality and, unlike prior studies, measure their diagnosticity, as suggested by Hershey and Baron (1992, 1995). Because we empirically measure the diagnosticity of adverse outcomes via a Bayesian Benchmark, we can separate outcome bias from outcome effects. In addition, we present theory that has not been used to illuminate outcome effects — Prospect Theory’s probability weighting function (cf., Kahneman and Tversky 1979; Tversky and Kahneman 1992). By combining extant auditing theory with Prospect Theory, we hypothesize conditions under which judgments of auditor negligence exhibit outcome bias (i.e., are too high) and exhibit reverse outcome bias (i.e., are too low).

3 Of course, it is impossible to quantify with certainty whether prior research rendered outcome information non-diagnostic of audit quality. For example, this may depend on the extent to which participants in an experiment believe it when they are told that they have all of the information that an auditor had. If individuals believe they have incomplete information, they may use outcome information to infer missing information (Hershey and Baron 1992).

4 Whether such instructions render outcomes non-diagnostic is also uncertain. An important distinction is whether such experimental tasks ask participants to recall their prior beliefs about auditor negligence or to judge auditor negligence. If the judgment task were to recall one’s prior beliefs (i.e., what one thought audit quality was before knowledge of the outcome), then the optimal judgment would ignore outcomes. Failure to do so is called hindsight bias (e.g., Kennedy 1995). However, if the judgment task were to judge auditor negligence (i.e., to diagnose original decision-making quality), then the normative judgment (to maximize accurate judgments of auditor negligence) would use outcome information, even if one were given instructions to ignore it (Hershey and Baron 1992, 1995). It is the over-use, not the mere use, of outcome information that leads to unduly large outcome effects and constitutes outcome bias (Baron and Hershey 1988).
We predict and find that individuals’ judgments of auditor negligence exhibit outcome bias when the Bayesian probability of auditor negligence is relatively low (i.e., below the vicinity of 40%), but also exhibit reverse outcome bias when the Bayesian probability of negligence is relatively high (i.e., above the vicinity of 40%). Our results support this prediction in two experiments that collectively use both relatively abstract and relatively rich experimental settings (adapted from Kadous 2001, 2000), and for judgments made in hindsight and in foresight. Prior auditing research has identified many factors that increase the harshness with which individuals judge auditors. For example, individuals may be motivated by plaintiffs’ losses (Kadous 2000), or simply by auditors’ “deep pockets” (i.e., regardless of whether the auditors were in fact negligent), and these factors likely do increase the probability that individuals’ judgments of auditors are too harsh. However, with respect to adverse outcome information, our theory and findings suggest that its incremental effects are not obvious, but rather depend on the Bayesian probability of auditor negligence. Our theory and experimental findings contribute to the accounting literature in at least four ways. One, we use a novel measurement method that conceptually distinguishes among outcome effects, outcome bias, and reverse outcome bias, and that directly measures these biases. Two, based on Prospect Theory’s probability weighting function, we predict that probability weighting constitutes a previously overlooked source of both outcome bias and reverse outcome bias, with the sign of the bias depending on the Bayesian probability of auditor negligence. This extends auditing theory with the first prediction of both outcome bias and reverse outcome bias following adverse outcomes, and extends Prospect Theory with its first prediction of either outcome bias or reverse outcome bias.
Three, we report two experiments to empirically test and replicate our theory-based predictions. Four, our theory and findings suggest that the theoretical model for judgments of auditor negligence — including de-biasing frameworks — should expand to include both outcome bias and reverse outcome bias (i.e., judgments following adverse outcomes that are reliably too lenient), where predicted by our application of Prospect Theory’s probability weighting function. The remainder of this paper is organized as follows. Section II develops our theory and hypotheses. Sections III and IV describe two experiments designed to test our hypotheses. Section V discusses our conclusions and limitations.

II. THEORY

In this section, we develop three theory-based hypotheses. We begin by reviewing the accounting outcome-effect literature from a Bayesian point of view. We then discuss Prospect Theory’s probability weighting function and its applicability to outcome effects observed in audit negligence contexts to develop hypothesis one (H1). Next, we develop H2 by discussing how evaluations conditioned on past, instead of future, outcomes can increase outcome effects. Finally, in developing H3, we discuss the joint effects of probability weighting in H1 and outcome temporality in H2. For expositional purposes, we emphasize the audit negligence context when developing our hypotheses, but our underlying theory generalizes to other accounting contexts.

2.1 A Bayesian Perspective on Outcome Effects

Many accounting studies find that evaluators with information about adverse outcomes more harshly assess auditors’ decision processes than do evaluators without such information (see, e.g., Reimers and Butler 1992; Anderson et al. 1993; Lowe and Reckers 1994; Nelson and Kinney 1996; Anderson et al. 1997; Kadous 2000, 2001; and Clarkson et al.
2002).5 Normatively, outcome effects usually are desirable, since adverse audit outcomes can be and often are informative of the quality of auditors’ decision processes. Hawkins and Hastie (1990, 312) note that outcome effects are consistent with learning from outcome feedback: “The rational or adaptive response to outcome feedback should be to change beliefs or reasoning procedures to incorporate the implications of new information.” In the rare case that an evaluator has certain, accurate, and complete information about auditors’ ex ante decision processes, outcomes would add no incremental information and thus would be non-diagnostic (Hershey and Baron 1992). Under such pristine conditions, evaluators’ outcome-informed and outcome-uninformed assessments of auditor negligence should be equivalent. Pristine conditions rarely occur in the natural audit ecology, however. Even auditors themselves inaccurately recall the information sets used to make their judgments and decisions (Moeckel and Plumlee 1989; Moeckel 1990).6 Third-party evaluators, who were not at the audit, likely obtain uncertain and incomplete information about auditors’ ex ante decision processes. When evaluators have such information, outcomes are diagnostic of auditors’ ex ante decision processes (Brown and Solomon 1987; Hoch and Loewenstein 1989; Hershey and Baron 1992; Kelman, Fallas and Folger 1998). That is, evaluators should use outcome information, along with other diagnostic signals, to assess the quality of auditors’ ex ante decision processes (Hershey and Baron 1995; cf., Holmstrom 1979).

5 Elsewhere, outcome effects are measured as the difference between evaluators’ assessments of decision making conditional on good, as opposed to bad, outcomes (Frederickson et al. 1999; Tan and Lipe 1997). Brown and Solomon (1987) also include a good outcome condition, along with no outcome and bad outcome conditions.
All three of these studies include cautionary statements that larger outcome effects might or might not represent greater outcome bias, but none measure the extent to which this is true.

6 Rather than recalling actual evidence, experienced auditors sometimes rely on default values to reconstruct past mental representations and decision processes (Moeckel 1990, 371-72). Whether outcomes affect the default values auditors use for reconstruction purposes is a topic for future research.

To illustrate, suppose that an evaluator assesses the probability of auditor negligence (AN) conditional on an outcome that indicates materially misstated (MM) financial statements (p(AN|MM)). Before considering the outcome, the evaluator had prior beliefs about base rate probabilities of auditor negligence p(AN) and material misstatements p(MM), as well as about the conditional probability of materially misstated financial statements when auditors are negligent (p(MM|AN)). So long as material misstatements are more likely to go unprevented or undetected when auditors are negligent than when they are not, the likelihood ratio p(MM|AN)/p(MM) must exceed 1. The greater this ratio, the greater the outcome diagnosticity, and the greater the factor by which an evaluator should increase the base rate p(AN) in assessing auditor negligence p(AN|MM):

p(AN|MM) = p(AN) × [p(MM|AN) / p(MM)]   (1)

Therefore, the extent to which outcome effects imply that evaluators are overly harsh or overly lenient is hard to say without measurement. Bayesian evaluators would exhibit outcome effects (p(AN|MM) > p(AN) in equation 1, above), to maximize the expectation of an accurate evaluation. An additional complication is that evaluators often need to assess the quality of auditors’ decisions based on probabilistic, instead of deterministic, outcomes. The probability with which a material misstatement exists often is based on rumor and can be a matter of contentious debate.
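To make equation (1) concrete, the update can be computed directly. The base rates below are hypothetical numbers chosen only for illustration; they are not estimates from our experiments:

```python
# Illustration of equation (1): a Bayesian evaluator's posterior
# probability of auditor negligence (AN) given a material misstatement (MM).
# All base rates are hypothetical, for illustration only.

def bayesian_posterior(p_an, p_mm, p_mm_given_an):
    """p(AN|MM) = p(AN) * [p(MM|AN) / p(MM)]."""
    likelihood_ratio = p_mm_given_an / p_mm
    return p_an * likelihood_ratio

p_an = 0.05           # prior base rate of auditor negligence
p_mm = 0.10           # prior base rate of material misstatement
p_mm_given_an = 0.60  # p(MM|AN): misstatement rate when auditors are negligent

posterior = bayesian_posterior(p_an, p_mm, p_mm_given_an)
# The likelihood ratio is 6, so the posterior is 0.05 * 6 = 0.30:
# a Bayesian evaluator exhibits an outcome effect (0.30 > 0.05).
```

Because the likelihood ratio exceeds 1, the posterior exceeds the prior, which is exactly the outcome effect a Bayesian evaluator would exhibit.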
Were earnings really misstated and, if so, at what point did the misstatement really become material? Deloitte LLP, as just one example, publicly and vigorously disagreed with the SEC’s allegation that an adverse outcome – a material misstatement – existed in Pre-paid Legal Services Inc.’s financial statements: “Deloitte & Touche . . . took the unusual step yesterday of stating publicly that it believes the Securities and Exchange Commission was wrong when it forced a company to restate its financial results. The firm said it would not certify the revised books because it does not believe they are correct” (Glater and Norris 2001). When considering probabilistic adverse outcomes, evaluators generally should treat them as more diagnostic of poor audit decisions as they become more probable.

2.2 Prospect Theory, Probability Weighting & Reverse Outcome Bias

Even aside from Prospect Theory, a Bayesian perspective recognizes the possibility that: (1) outcome effects could reflect evaluators’ warranted belief revision instead of outcome bias, (2) evaluators could exhibit outcome bias (i.e., over-harshness) or reverse outcome bias (i.e., over-leniency), and (3) reductions in outcome effects unwittingly could cause or amplify reverse outcome bias (cf., Hershey and Baron 1992, 1995). While prior accounting studies on outcome effects are silent about or assume away reverse outcome bias, we next apply Prospect Theory’s probability weighting function to specify conditions under which reverse outcome bias (and outcome bias) is likely to obtain. When Kahneman and Tversky’s (1979) Prospect Theory emerged, it included a nascent probability weighting function to account for mounting empirical evidence of individuals’ systematic mistreatment of probabilities (Phillips, Hays and Edwards 1966; Phillips and Edwards 1968; Edwards 1968).
This probability weighting function is distinct from Prospect Theory’s more famous value function, which features a reference point and is about twice as steep for losses as for gains.7 Research on the probability weighting function during the past 25 years has revealed several theory-consistent empirical regularities. Most strikingly, individuals overweight relatively low probabilities but underweight relatively high probabilities, with these trends abating for probabilities in the vicinity of 0 and 1. Multiple empirical studies show that plotting individuals’ weighted probabilities (w(p)) on actual probabilities (p) results in an inverse-S (see Figure 1). Multiple empirical studies also show that, in the vicinity of 40%, the probability weighting function switches from overweighting to underweighting probabilities and also goes from exhibiting concavity to convexity (e.g., Tversky and Kahneman 1992; Camerer and Ho 1994).8 While initial tests of the probability weighting function examine how humans weight stated probabilities when choosing between alternative abstract gambles, more recent work does so in applied contexts featuring intuitive probability assessment (e.g., Fox and Tversky 1998; Wu and Gonzalez 1999).

[Insert Figure 1 here]

Prospect Theorists characterize the probability weighting effect as a primitive, unconscious bias in humans’ perceptions of uncertainty (Wu and Gonzalez 1996; Gonzalez and Wu 1999; Wu, Zhang and Gonzalez 2004). We contend that, as such, Prospect Theory’s probability weighting function can be used to predict conditions under which evaluators under- and over-react to the diagnostic value of probabilistic (i.e., uncertain and certain) adverse audit outcomes.

7 Without the probability weighting function, Prospect Theory cannot explain well-replicated decision behaviors with respect to risk preferences or violations of first-order stochastic dominance in choosing among risky prospects (see, e.g., Fox and Tversky 1998).
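The inverse-S pattern can be sketched with the one-parameter weighting function from Tversky and Kahneman (1992). The functional form is theirs; the code and the choice of γ = 0.61 (their median estimate for gains) are only an illustrative sketch, not part of our experimental materials:

```python
# Tversky-Kahneman (1992) probability weighting function:
# w(p) = p^g / (p^g + (1 - p)^g)^(1/g), here with g = 0.61.
def w(p, g=0.61):
    return p**g / (p**g + (1.0 - p)**g) ** (1.0 / g)

# Endpoints are respected: w(0) = 0 and w(1) = 1.
# Low probabilities are overweighted and high ones underweighted,
# producing the inverse-S of Figure 1:
low = w(0.10)   # greater than 0.10
high = w(0.90)  # less than 0.90
```

With γ = 0.61 the cross-over from over- to underweighting falls in the low-to-mid 30s in percentage terms, broadly consistent with the "vicinity of 40%" regularity discussed above.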
If evaluators mis-weight the probabilities of auditor negligence in accord with Prospect Theory, evaluators will overweight the Bayesian probability of auditor negligence when it is relatively low (below the vicinity of 40%), but underweight the Bayesian probability of auditor negligence when it is relatively high (above the vicinity of 40%). This suggests the following hypothesis (see Figure 1):

H1: (Probability-Weighting Effect) Outcome-informed evaluators will exhibit outcome bias (over-harshness) when the Bayesian probability of auditor negligence is relatively low (below the vicinity of 40%), but reverse outcome bias (over-leniency) when the Bayesian probability of auditor negligence is relatively high (above the vicinity of 40%).

If H1 were empirically supported, it would contribute to theory of how and how well evaluators assess auditors’ negligence. It would constitute an important boundary condition for the presumption that outcome effects imply overly harsh judgments of auditors and for the idea that outcome effects generally should be reduced. Notably, H1 predicts over-leniency for probabilities that have ecological significance in audit litigation contexts – those probabilities that are above legal standards of proof such as “preponderance of the evidence,” “clear and convincing,” and “beyond reasonable doubt.”

8 Wu and Gonzalez (1996) demonstrate the cross-over point characteristic of an inverse-S curvature (also see, e.g., Tversky and Fox 1995). In addition, Prelec’s (1998) elegant model predicts the cross-over point to be at 1/e ≈ 37%.

2.3 Outcome Temporality: Revising Beliefs in Foresight vs. Hindsight

H1 predicts the way evaluators revise their prior beliefs of auditor negligence, conditional on outcome information, relative to a Bayesian. As prior accounting studies on outcome effects demonstrate, many factors likely influence the relative harshness or leniency of evaluators’ judgments of auditor negligence.
For example, some individuals may be motivated by plaintiffs’ losses (Kadous 2000), by auditors’ “deep pockets” (i.e., regardless of whether a material misstatement really exists), or by a personal pre-disposition against auditors. Of particular interest to our belief-revision orientation is whether, holding the stated probability of outcomes constant, evaluators’ pre-posterior analyses, or how they prospectively revise beliefs given future adverse outcomes, differ from their posterior analyses, or how they retrospectively revise beliefs given past adverse outcomes. As Hawkins and Hastie (1990, 311) note, “It is a common observation that events in the past appear simple, comprehensible, and predictable compared to events in the future. Everyone has had the experience of believing that they ‘knew it all along’ the outcome of a horse race, football game, marriage, business investment…” (italics added). Evaluators likely treat past material misstatements as more predictable or controllable (i.e., by auditors) than future material misstatements.9 Since controllability increases outcome effects (cf., Tan and Lipe 1997), evaluators likely will assess auditors less favorably given past, as opposed to future, adverse outcomes. Thus, we propose the following hypothesis:

9 Hawkins and Hastie (1990, 315) discuss several other ways individuals may respond differently to past, as opposed to future, otherwise identical outcomes. They argue individuals’ processing of past outcomes “involves narrow-minded thinking backwards from the given outcome to precipitating events… whereas foresight involves consideration of many possible outcomes.” Thus, individuals naturally may respond to future outcomes by processing them in frequentist terms but to otherwise identical past outcomes by processing them in non-frequentist terms. We leave examination of this interesting possibility to future research, as it is beyond the scope of our study.
H2: (Temporality Effect) Outcome-informed evaluators will judge auditors to be more negligent conditional on past, as opposed to future, adverse audit outcomes.

Two points are noteworthy: One, this outcome-informed comparison differs from typical comparisons in the accounting literature. Typically, assessments of evaluators given information about past outcomes are compared to assessments of evaluators who are uninformed about outcomes (e.g., Lipe 1993; Nelson and Kinney 1996; Kadous 2001; Kadous and Magro 2001; Clarkson et al. 2002). Thus, prior accounting studies do not isolate the extent to which outcome effects are caused by information about outcomes per se or by information about past, as opposed to future, outcomes. Two, while H2 predicts that evaluators will make harsher judgments of auditor negligence given past, as opposed to future, adverse outcomes, it is silent on whether their evaluations will be too harsh or too lenient. Theory for H1 warrants predicting two kinds of directional bias, whereas theory for H2 warrants predicting one directional effect (without specification of bias).10

10 Hindsight bias differs from the temporality effect discussed in H2. Hindsight bias refers to evaluators’ propensity to mis-remember what their prior beliefs about the probability of an outcome were (or would have been), after learning something about a realized outcome (e.g., Kennedy 1995, 253). That would be analogous to giving evaluators outcome information that probabilistically points towards a material misstatement, and then asking them to provide their prior beliefs, as if they did not have the outcome information. In contrast, outcome effects refer to evaluators’ propensity to revise their prior beliefs about decision-making quality, in light of outcome information. For the former type of task, ignoring outcome information will maximize the expectation of accurately recalled priors and reduce hindsight bias. For the latter task, using outcome information will maximize the expectation of accurately updated evaluations of auditor negligence (e.g., Baron and Hershey 1988). The temporality effect of H2 predicts that evaluators’ revised beliefs will be harsher given past versus future outcomes.

2.4 Joint Effects of Probability Mis-Weighting and Outcome Temporality

For relatively low Bayesian probabilities of auditor negligence, H1’s predicted outcome bias and H2’s predicted temporality effect for hindsight evaluations go in the same direction. As such, H1 and H2 jointly warrant predicting outcome bias, for both foresight and hindsight evaluations, given relatively low Bayesian probabilities of auditor negligence. For relatively high Bayesian probabilities, though, H1’s predicted reverse outcome bias and H2’s predicted temporality effect for hindsight evaluations go in opposite directions. Absent additional theory, we would conjecture that the net bias is simply an empirical question. A theoretical case for predicting reverse outcome bias, even in hindsight, exists, however. Specifically, Fox and Tversky (1998) and Wu and Gonzalez (1999) model probability judgments as if they follow a two-stage process. In Stage 1, evaluators amass “support” for probabilities based on event salience and, in Stage 2, they weight the probabilities in accord with the probability weighting function. Many factors influence event salience and thus how much support is amassed; “unpacking” is one (Tversky and Koehler 1994; Fox and Tversky 1998; Fox and Birke 2002). Unpacking breaks outcomes into components – one could unpack the outcome “material misstatement” into “incorrect revenue recognition,” “incorrect inventory valuation,” and so on.
We argue that hindsight temporality, like unpacking, increases event salience and thus how much support evaluators amass in Stage 1, resulting in higher judged probabilities (e.g., Hawkins and Hastie 1990). Then, in Stage 2, evaluators mis-weight these judged probabilities in accord with the probability weighting function. Thus, for relatively high Bayesian probabilities of auditor negligence, we predict reverse outcome bias to obtain in hindsight as well as in foresight. We cannot, however, predict whether the magnitude of reverse outcome bias will be greater in foresight versus in hindsight: That would depend on the relative effect sizes associated with Stage 1’s support gathering versus Stage 2’s probability mis-weighting. As the shape of the probability weighting function clearly indicates (see Figure 2), the extent of underweighting varies nonlinearly for relatively high probabilities. So questions about the relative size of Stage 1’s hindsight-related incremental “support” acquisition and of Stage 2’s probability under-weighting ultimately are empirical. This discussion warrants the following hypotheses, broken down into cases that feature outcome bias for relatively low (H3a) and reverse outcome bias for relatively high (H3b) Bayesian probabilities of auditor negligence:

[Insert Figure 2 here]

H3a: (Relatively Low Bayesian Probabilities of Negligence) Outcome bias will obtain for relatively low Bayesian probabilities of auditor negligence, regardless of whether evaluations are made in hindsight or foresight.

H3b: (Relatively High Bayesian Probabilities of Negligence) Reverse outcome bias will obtain for relatively high Bayesian probabilities of auditor negligence, regardless of whether evaluations are made in hindsight or foresight.
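The two-stage account behind H3a and H3b can be sketched numerically. The weighting function below is Tversky and Kahneman's (1992) one-parameter form, and the hindsight "support" boost is an invented illustrative quantity, not an estimate of any actual effect size:

```python
# Two-stage sketch: Stage 1 amasses "support" (hindsight inflates the
# judged probability); Stage 2 applies inverse-S probability weighting.
# The boost size and gamma are hypothetical, for illustration only.
def w(p, g=0.61):
    """Tversky-Kahneman (1992) probability weighting function."""
    return p**g / (p**g + (1.0 - p)**g) ** (1.0 / g)

def judged_negligence(bayesian_p, hindsight_boost=0.0, g=0.61):
    supported = min(1.0, bayesian_p + hindsight_boost)  # Stage 1
    return w(supported, g)                              # Stage 2

# Low Bayesian probability (0.20): outcome bias in foresight and hindsight.
low_fore = judged_negligence(0.20)        # above 0.20
low_hind = judged_negligence(0.20, 0.05)  # above 0.20
# High Bayesian probability (0.75): even with a hindsight boost, the
# weighted judgment stays below the benchmark (reverse outcome bias).
high_fore = judged_negligence(0.75)        # below 0.75
high_hind = judged_negligence(0.75, 0.05)  # below 0.75
```

Under these illustrative parameter values, the Stage 2 underweighting dominates a modest Stage 1 boost at high probabilities, which is the H3b pattern; whether that holds empirically is what the experiments test.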
If these hypotheses were empirically supported, it would provide a new lens through which to view prior accounting studies on how evaluators use past adverse outcomes to evaluate the decisions of auditors or managers. Prior accounting studies assume or emphasize the condition of evaluator over-harshness and ignore or downplay the possibility of evaluator over-leniency (e.g., Kadous 2001; Kadous and Magro 2001; Clarkson et al. 2002).

III. EXPERIMENT 1

3.1 Participants.

Undergraduates enrolled in an introductory accountancy course at the University of ________ served as participants.11 Nine hundred thirty-three volunteered for extra credit, worth up to 1% of their final grade. Participants averaged 1.37 years of post-high-school education (s = 0.97), 0.09 accounting courses (s = 0.38), and 3.29 business, accounting, and economics courses (s = 1.98). 59.0% were male. On an 11-point Likert scale centered at 0, participants were slightly unsympathetic to auditors, on average (mean = –0.11, s = 1.65, t = –2.03, p = 0.043).

11 Like most outcome-effect studies in accounting, we use non-professional evaluators. Many non-professional and professional evaluators assess the quality of auditors’ decisions in many different contexts – including business school students and other readers of the popular business press, jurors, journalists, mediators, state or federal judges, lawyers, regulators, and so on. Absent a theory that our students revise beliefs differently than other evaluators, our use of student participants is both theoretically and practically justified (Libby, Bloomfield, and Nelson 2002; Peecher and Solomon 2001).

3.2 Task.

The experimental materials consisted of a single paper packet and featured three sections: an introduction, an experimental case, and a post-case questionnaire. The introduction explained basic concepts such as material misstatements, unqualified audit opinions, reasonable assurance, auditor negligence, and due professional care.
It asked participants simple review questions to emphasize these concepts. Because this is the first outcome-effect accounting study to provide theory predicting a condition under which reverse outcome bias (i.e., over-leniency) obtains, we wanted to provide ample opportunity for evaluators’ assessments to be overly harsh. Thus, the introduction included language adapted from the “severe consequences” condition in Kadous (2000), emphasizing financial and emotional losses associated with undetected material misstatements and audit failures. We also added language to highlight how the Enron and WorldCom debacles resulted in lost savings and jobs (Appendix A). In the experimental case section, participants used natural frequencies to convey their prior beliefs and revised beliefs, conditional on probabilistic outcome information. We chose to solicit their beliefs with natural frequencies because people come closer to being Bayesians when using natural frequencies instead of probabilities (Gigerenzer and Hoffrage 1995, 1999; Gigerenzer 2000). Our use of natural frequencies instead of probabilities, therefore, prejudices us against observing our predicted form of non-Bayesian belief revision that is consistent with the probability weighting function. In the experimental case, the instrument referred to all the audits of U.S. companies by Big-4 public accounting firms as the reference population, and elicited participants’ prior beliefs (Appendix B). Specifically, it asked participants about the base rate frequency of material misstatements (p(MM)), auditor negligence (p(AN)), and material misstatements given auditor negligence (p(MM|AN)). Based on these measures, we computed the likelihood ratio (p(MM|AN)/p(MM)) and the Bayesian posterior. 3.3 Manipulations and Experimental Design. We manipulated outcome temporality, outcome probability (P*), and Order in a 2 × 5 × 2 between-participants, fixed-factorial design. 
The first two factors manipulate attributes of outcomes, and the last factor caused participants to provide different priors (Appendix C). We next explain each factor in turn. The two levels of outcome temporality are hindsight and foresight. In the hindsight conditions, participants evaluated auditors based on audits and misstatement outcomes that already had occurred, and all verbs appeared in the past tense. In the foresight conditions, participants evaluated auditors based on audits and misstatement outcomes that had not yet occurred, and all verbs appeared in the future tense (Appendix C). This manipulation differs from manipulations in extant outcome-effect accounting research, in which the baseline conditions typically withhold outcome information. With our manipulation, any differences in evaluations of auditor negligence are due to participants' different encoding or processing of past versus future outcomes. The five levels of outcome probability, P*, are 0%, 25%, 75%, 90%, and 100%. These were five levels of explicitly stated probabilities of material misstatement (i.e., the adverse outcome of interest). To accomplish this manipulation, the instrument described a hypothetical watch list of U.S. financial statement audits. It included wording to the effect that a stated percentage of companies on the list had (will have) materially misstated financial statements, with the percentage stated depending on the level of P* to which given participants were assigned. Later, participants evaluated audit quality for an audit randomly drawn from the list (Appendix C).12 The manipulation of P* at five levels provides three benefits. One, it allows a robustness check of the outcome temporality effect for misstatements at multiple, explicitly stated levels of probability. Two, it enables us to compare participants' revised beliefs against a Bayesian benchmark over the entire [0,1] probability interval.
Three, while levels of P* denoting certainty (i.e., 0% and 100%) make our research comparable to prior outcome effect studies in accounting, levels of P* that convey uncertainty (i.e., 25%, 75%, and 90%) are advantageous because, in the real world, evaluators regularly confront uncertain outcomes (e.g., sometimes whether a material misstatement exists is a judgment call, and frequently the probability of material misstatement is less than 100%). 12 For parsimony, we used a single mechanism (i.e., an earnings management watch list) for all five levels of P*. In the real world, different mechanisms (e.g., an SEC investigation) may signal different levels of P*. Identification of these mechanisms and the levels of P* that people typically associate with them are beyond the scope of Experiment 1. We let participants infer levels of P* based on two realistic signals in Experiment 2. The third factor is Order, by which we caused participants' priors to differ, for control purposes. We counterbalanced whether the instrument first elicited participants' outcome-uninformed, prior beliefs p(AN), p(MM), or p(MM|AN) (Appendix B), or their outcome-informed revised beliefs, p(AN|MMP*) (Appendix C). That is, some participants processed outcome information before providing their priors, whereas other participants first provided their priors. Although Order significantly affected participants' priors, it is insignificant in all tests of our hypotheses and does not change any of our inferences. For simplicity, we collapse across Order in our analyses. 3.4 Manipulation Checks. To encourage participants' attention, review questions appeared throughout the instrument and in its final section. We told participants before the experiment that their extra credit depended on the accuracy of their answers to the review questions and on the reasonableness of their responses to the case questions. Participants spent an average of 28.9 minutes on the experiment (s = 4.9).
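Because the analyses that follow compare participants' revised beliefs with a Bayesian benchmark, a minimal sketch of that benchmark (equation (2) in footnote 14) may be helpful; the input frequencies below are the illustrative values from that footnote's worked example, not experimental estimates:

```python
# Sketch of the Bayesian posterior of auditor negligence when the
# misstatement outcome itself is uncertain (probability P*), per
# equation (2) of footnote 14:
#   Bayes = P* * p(AN|MM) + (1 - P*) * p(AN|no MM).

def bayes_posterior(p_mm: float, p_an: float, p_mm_given_an: float,
                    p_star: float) -> float:
    # Posterior given a misstatement, via Bayes' rule.
    p_an_given_mm = p_an * p_mm_given_an / p_mm
    # Posterior given no misstatement (complement probabilities).
    p_an_given_no_mm = p_an * (1.0 - p_mm_given_an) / (1.0 - p_mm)
    # Mix the two posteriors by the outcome probability P*.
    return p_star * p_an_given_mm + (1.0 - p_star) * p_an_given_no_mm

# Illustrative values from footnote 14: p(MM) = 0.1, p(AN) = 0.2,
# p(MM|AN) = 0.4, and P* = 0.9 yield a posterior of about 0.733.
print(round(bayes_posterior(0.1, 0.2, 0.4, 0.9), 3))  # 0.733
```

Setting P* = 1 recovers the standard posterior p(AN|MM) = 0.8, and P* = 0 recovers p(AN|no MM), matching the limiting cases noted in footnote 14.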
To check on the outcome temporality manipulation, participants responded to a multiple-choice question immediately after receiving outcome information but before responding with their posterior judgment. The question asked whether the outcome information related to audits that "have already finished" or "have not yet started" (Appendix C). Of the 933 participants, 856 (91.8%) passed and 77 (8.2%) failed the manipulation check. When deciding whether to present our findings with or without participants who failed our manipulation check, we wanted to bias against supporting our theory. In H3b, we predict that reverse outcome bias (i.e., over-leniency) will obtain, even in hindsight outcome temporality. So, to reduce the chance of classification errors biasing our findings, we dropped participants who failed the manipulation check from the sample a priori, resulting in a beginning sample size of 856.13 3.5 Experimental Findings. Because of the specific inverse-S empirical shape of the data predicted by our theory (Figure 2), we use a cubic polynomial regression to estimate our participants' revised beliefs about auditor negligence, p(AN|MMP*). H1 and H3 predicted that participants' revised beliefs would obtain as a cubic polynomial on the computed Bayesian revised belief, Bayes.14 Thus, we regress participants' revised beliefs on our two manipulated factors and on the Bayesian revised belief up to a power of 3 (i.e., Bayes, Bayes2, and Bayes3). No higher-order interactions obtain. We also include time spent on the task, Minutes, as an explanatory covariate.15 Note that, if participants were perfect Bayesians, the plotting of their revised beliefs against the Bayesian revised beliefs would produce a 45° line coinciding with the main diagonal (Figure 2). 13 The experimental findings do not change statistically or qualitatively with inclusion of participants who failed the manipulation check.
We also exclude observations due to missing or unintelligible data (30) and outliers (15) identified by Cook's distance (Neter et al. 1996). The experimental findings do not statistically or qualitatively change if we include outlier responses. 14 If a misstatement outcome, MM, has probability P*, the Bayesian posterior of auditor negligence is: Bayes = P* × [p(AN) × p(MM|AN) / p(MM)] + (1 − P*) × [p(AN) × (1 − p(MM|AN)) / (1 − p(MM))] (2), where (1 − p(MM|AN)) and (1 − p(MM)) are the probabilities of a no-misstatement outcome conditional on auditor negligence and unconditionally, respectively. Note that, when P* = 1, equation (2) reduces to Bayes = p(AN|MM), and when P* = 0, equation (2) reduces to Bayes = p(AN|no MM). For example, if 10 out of every 100 audits allow a material misstatement (p(MM) = 0.1), if auditors are negligent on 20 out of every 100 financial statement audits (p(AN) = 0.2), if 8 out of those 20 negligent audits allow a material misstatement (p(MM|AN) = 8/20 = 0.4), and if you are 90% sure that a particular audit under evaluation did end in a material misstatement (P* = 0.9), then the Bayesian probability that the auditors are negligent for that particular audit under evaluation is: Bayes = 0.9 × [0.2 × 0.4 / 0.1] + (1 − 0.9) × [0.2 × (1 − 0.4) / (1 − 0.1)] ≈ 0.733 (2a). Since the Bayesian probability of auditor negligence is above 40%, H1 would predict that individuals with these prior beliefs about auditor negligence and about the diagnosticity of outcomes would under-react to this outcome information (a material misstatement with P* = 90% probability) and, in this case, underestimate the probability of auditor negligence given this outcome (73.3%). 15 For exploratory purposes, the post-test also included thirteen follow-up questions. While some questions asked participants about their gender, education, and degree of sympathy towards auditors, others asked them to think about covariation, randomness, and diagnostic inference.
None of these thirteen response variables is statistically significant in our analyses (all p's > 0.10), and none qualitatively changes our findings. We thus omit these exploratory variables for simplicity. Under perfect Bayesian belief revision, the coefficients for the polynomial terms Bayes2 and Bayes3 would be 0, and the coefficient for Bayes would be 1. Table 1 presents the cubic polynomial regression model, and Figure 3 displays the marginal profile plot.16 As shown, coefficients for Bayes, Bayes2, and Bayes3 are all significant (all p's < 0.001). Furthermore, inspection of the findings in Table 1 and Figure 3 suggests that H1, H2, and H3 are supported. One, statistical significance obtains up to a cubic power for Bayes, and, in the vicinity of 40%, the profile plot plainly switches from being concave and exhibiting outcome bias to being convex and exhibiting reverse outcome bias (when the plot is above the diagonal, over-harshness is indicated). This pattern of findings comprises the exact empirical footprint predicted by H1. Inspection of Figure 3 also reveals that hindsight evaluations are harsher than foresight evaluations, consistent with H2. Finally, consistent with H3a and H3b, inspection of Figure 3 reveals that hindsight evaluations exhibit outcome bias for relatively low Bayesian probabilities of auditor negligence, but reverse outcome bias for relatively high Bayesian probabilities of auditor negligence. We present additional tests of H1–H3 below. [Insert Table 1 and Figure 3 here] As further tests of H1 and H3, we chose a relatively low and a relatively high value of Bayes and contrasted the predictions of participants' revised beliefs, p(AN|MMP*), from the polynomial regression model against the null-hypothesized value of Bayes (Table 2). We chose 25% and 75% as representative low and high values of Bayes, well below and above Prospect Theory's 40%, respectively.
H1 would be supported by observing significant outcome bias at Bayes = 25% and significant reverse outcome bias at Bayes = 75%. H3 would be supported by observing these two effects in both the foresight and the hindsight conditions. Table 2 reports results for contrasts of p(AN|MMP*) against Bayes at 25% and 75% for all five levels of P* and on an overall basis (sensitivity analyses showed that results were qualitatively and statistically similar for other nearby, arbitrarily chosen values). 16 Centering the polynomial terms Bayes2 and Bayes3 to reduce multicollinearity yields statistically and qualitatively similar results (Neter et al. 1996). The right column of Table 2 provides results on an overall basis. When the Bayesian probability of auditor negligence is 25%, the predicted value for p(AN|MMP*) is significantly greater than 25%, both in the foresight condition at 31.78% (t = 4.384, p < 0.001) and in the hindsight condition at 35.53% (t = 6.782, p < 0.001). These two findings indicate significant outcome bias. When the Bayesian probability of auditor negligence is 75%, however, the predicted value for p(AN|MMP*) is significantly lower than 75%, in the foresight condition at 50.08% (t = –10.795, p < 0.001) and in the hindsight condition at 53.84% (t = –9.218, p < 0.001). These two findings indicate significant reverse outcome bias. Overall, this pattern of findings supports H1 and H3.17 [Insert Table 2 here] To test H2, we examine the main effect of hindsight in the cubic polynomial regression model in Table 1. As the Hindsight (temporality) coefficient shows, participants' judgments of auditor negligence were 3.75 percentage points higher (i.e., harsher) in the hindsight conditions than in the foresight conditions (t = 2.343, p = 0.019). This main effect supports H2. 3.6 Supplemental Findings.
To supplement our tests of H1, H2, and H3, we measure participants' outcome effects, outcome bias, and reverse outcome bias over low and high ranges of Bayesian probabilities of auditor negligence, across the hindsight and foresight conditions. Specifically, since our hypotheses predict outcome bias (reverse outcome bias) for Bayesian posteriors below (above) the vicinity of 40%, we tabulate participants' outcome effects and outcome bias (reverse outcome bias) within the ranges 10% ≤ Bayes ≤ 30% and 50% ≤ Bayes ≤ 90%. 17 To obtain convergent validity (cf., Trochim 2001), we compare our findings to those of other studies of the probability weighting function. Tversky and Kahneman (1992) use a single-parameter model to specify the probability weighting function: w(p) = p^γ / [p^γ + (1 − p)^γ]^(1/γ) (3). When γ < 1, the predicted inverse-S curve obtains (as in Figure 1), and the curve becomes more linear as γ approaches 1. For γ = 1, w(p) = p, so the curve coincides with the main diagonal. When γ > 1, a regular-S curve obtains. We estimated γ using non-linear regression and an iterative algorithmic process that continued until reductions in the sum of squared residuals were locally minimized. We estimate γ at 0.61 for hindsight and at 0.57 for foresight. These estimates yield inverse-S curves like those in Figure 3 and typify prior estimates of γ (e.g., Camerer and Ho's 1994 review reports an average γ of 0.56, Wu and Gonzalez 1996 estimate γ at 0.71, and Tversky and Kahneman 1992 estimate γ at 0.61 and 0.69). As Table 3 shows, we replicate the finding of outcome effects documented in prior auditor negligence studies, in both the hindsight and the foresight conditions. However, consistent with the tests of H1 and H3, participants exhibit, on average, outcome bias within the 10% ≤ Bayes ≤ 30% range, but reverse outcome bias within the 50% ≤ Bayes ≤ 90% range. Within the hindsight conditions, participants in the 10% ≤ Bayes ≤
30% range exhibit outcome effects of 14.8 percentage points (t = 7.17, p < 0.001), which is 12.3 percentage points too high (i.e., outcome bias; t = 6.18, p < 0.001). Participants in the 50% ≤ Bayes ≤ 90% range, in contrast, exhibit outcome effects of 16.5 percentage points (t = 5.53, p < 0.001), which is 10.2 percentage points too low (i.e., reverse outcome bias; t = –3.39, p < 0.001). Results for the foresight conditions are similar, but they exhibit less outcome bias and more reverse outcome bias (Table 3). These findings are consistent with H1, H2, and H3. [Insert Table 3 here] IV. EXPERIMENT 2 Since Experiment 1 is the first to report evidence of reverse outcome bias and uses a relatively abstract audit negligence setting, we sought to test whether our theory-based hypotheses would replicate and extend to a richer, more realistic audit negligence setting. To do so, we used a setting quite similar to the Big Time Gravel scenario from Kadous (2000, 2001). Participants evaluated auditor negligence with respect to this particular, vivid audit rather than with respect to an audit randomly drawn from a stylized earnings management watch list. Undergraduates (n = 168) from a different semester of the same introductory accountancy course at the University of _______ volunteered, again for 1% extra course credit. Participants had completed an average of 1.47 years of post-high-school education (s = 1.20), 0.12 accounting courses (s = 0.38), and 3.15 business, accounting, and economics courses (s = 1.77). On an 11-point scale ranging from unsympathetic to auditors (–5) to sympathetic with auditors (+5), participants' average rating was –0.01 (s = 1.39, p = 0.99). We employed a 2 × 2 between-participants factorial design. We manipulated outcome temporality at two levels, hindsight and foresight, by describing the Big Time Gravel audit and misstatement information in either past or future tense (e.g., Appendix D).
We manipulated outcome probability, P*, at two levels, with more realism than in Experiment 1. At one level of P*, participants received outcome information stating that Big Time Gravel's inventory was (or will be, in foresight conditions) overstated according to an SEC investigation (Appendix D). At the other level of P*, the outcome information was rumors among analysts of an SEC investigation of possible overstatement. We measured participants' perceptions of P* by asking them for the probability that the outcome actually was (or will be) a material misstatement, and then used these measured probabilities in computing the Bayesian posterior, Bayes (see equation 2, footnote 14). Results for the polynomial model appear at Table 4 Panel A.18 Consistent with H1, the coefficients for Bayes, Bayes2, and Bayes3 are once again statistically significant (see Table 4 Panel A). Inconsistent with H2, however, the effects caused by the hindsight-foresight distinction are statistically insignificant in this richer, more realistic audit negligence context (t = –0.754, p = 0.452). One plausible explanation is that the influence of the foresight-hindsight distinction on outcome effects and outcome bias becomes negligible as one moves to relatively rich, realistic decision contexts (see, e.g., Christensen-Szalanski and Willham 1991). Since we observe no outcome-temporality effect in Experiment 2, we do not test H3 and collapse across the hindsight and foresight conditions in the posterior probability plot at Figure 18 We used the same manipulation check of the hindsight manipulation as in Experiment 1. Of the 166 subjects, only 14 failed the manipulation check. Results are statistically robust to their inclusion or exclusion at α = 0.05. Centering the polynomial terms Bayes2 and Bayes3 to reduce multicollinearity produced statistically similar results at α = 0.05 (Neter et al. 1996).
Additionally, the tabulated results exclude 11 observations with missing or nonsense responses. 4.19 Observe that the modeled posteriors switch from concavity to convexity, crossing the diagonal at approximately 40% and leaving the empirical footprint of the probability weighting function (H1). Moreover, as Table 4 Panel B shows, collapsed across outcome temporality, participants' evaluations exhibit statistically significant outcome bias when the Bayesian probability of auditor negligence is relatively low, or 25% (t = 1.604, p = 0.055), but also statistically significant reverse outcome bias when the Bayesian probability of auditor negligence is relatively high, or 75% (t = –3.022, p = 0.001), consistent with H1.20, 21 V. CONCLUSION: LIMITATIONS AND IMPLICATIONS We report theory and experimental-empirical findings that contribute to the accounting literature on outcome effects and outcome bias. Outcome effects occur when evaluators use outcomes to revise their beliefs about others' decision quality, and they robustly obtain in many accounting contexts (e.g., Buchman 1985; Brown and Solomon 1987; Lipe 1993; Kinney and Nelson 1996; Frederickson et al. 1999; Kadous 2000, 2001; Kadous and Magro 2001). Although outcome effects generally should obtain after negative outcomes, several accounting studies try to improve evaluators' judgments by reducing outcome effects or their consequences. Such studies use two basic approaches: (1) identify interventions that reduce outcome effects (e.g., Kadous 2001) and (2) identify offsetting effects likely to exist in accounting contexts (e.g., Kinney and Nelson 1996). 19 We do not test H3 because it concerns the joint influence of probability weighting (H1) and outcome temporality (H2). In Experiment 2, only H1's probability weighting effects obtain, there is no joint influence to consider, and so H3 becomes moot.
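The γ-estimation procedure referenced in footnotes 17 and 21 can be illustrated with a crude grid search standing in for the iterative non-linear regression we actually used; the "observed" points below are synthetic, generated for illustration only, not our experimental data:

```python
# Sketch of estimating gamma for the one-parameter probability
# weighting function by minimizing the sum of squared residuals.
# A grid search is a simple stand-in for iterative non-linear
# regression; the "observed" data here are synthetic.

def w(p: float, gamma: float) -> float:
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)

def estimate_gamma(data):
    """Grid-search gamma in (0, 1.5], minimizing squared residuals."""
    best_gamma, best_sse = None, float("inf")
    for i in range(1, 1501):
        g = i / 1000.0
        sse = sum((w(p, g) - obs) ** 2 for p, obs in data)
        if sse < best_sse:
            best_gamma, best_sse = g, sse
    return best_gamma

# Synthetic beliefs generated from gamma = 0.61; the search recovers it.
probs = [0.05, 0.10, 0.25, 0.40, 0.50, 0.75, 0.90, 0.95]
synthetic = [(p, w(p, 0.61)) for p in probs]
print(estimate_gamma(synthetic))  # 0.61
```

With noisy experimental responses in place of the synthetic points, the same objective yields estimates like the 0.61 (hindsight), 0.57 (foresight), and 0.71 (Experiment 2) values reported in footnotes 17 and 21.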
20 We repeated the supplemental analyses to H1 performed for Experiment 1, estimating the amount of outcome bias (reverse outcome bias) in the 10%-to-30% (50%-to-90%) range of Bayesian probabilities. As expected, the average outcome bias within the low range was +8.1 percentage points (p < 0.001, n = 58). Similarly, the average reverse outcome bias within the high range was –11.7 percentage points (p = 0.023, n = 11). 21 The estimate of γ (see footnote 17) for all assessments of auditor negligence (collapsed across hindsight and foresight) was 0.71. This estimate yields a curve that appears very similar to that of the cubic polynomial regression model at Figure 4, and, as with Experiment 1, aligns closely with historical estimates from Prospect Theory. Although these two approaches have led to identification of factors that reduce or offset evaluators' outcome effects, they tend to downplay the possibility of reverse outcome bias and, vexingly, do not separate outcome effects into appropriate belief revision and inappropriate bias. Guided by Prospect Theory's probability weighting function, we predict conditions under which outcome bias and reverse outcome bias will obtain. We also empirically measure the degree to which outcome effects reflect warranted belief revision, outcome bias, or reverse outcome bias. Consistent with our theory, findings from two experiments show outcome bias for relatively low Bayesian probabilities of auditor negligence, but reverse outcome bias for relatively high Bayesian probabilities. Notably, even though reverse outcome bias has not been emphasized in extant outcome-effect accounting studies, we predict and observe that it obtains – even when evaluators assess audit quality in hindsight. As with all experimental investigations of theory, this study naturally has some limitations.
While this is the first study to measure how evaluators' assessments of audit quality compare to a Bayesian benchmark, we rely on participants' judgmental inputs for this benchmark. We neither identify, nor contend that we could identify, accurate base rates of auditor negligence or materially misstated financial statements. Of course, such accuracy information would be quite costly if not impracticable to acquire, and this limitation applies to all studies of outcome effects in the accounting literature. One potentially profitable avenue for future research is to obtain "best-practice" estimates of such base rates from very experienced academics, auditors, standard setters, regulators, and/or managers. A second limitation is that our theory and experimental findings pertain to "on average" results. It would be profitable if future research could develop theory as to how evaluator characteristics influence, or interact with environmental factors to influence, the shape of the probability weighting function (Gonzalez and Wu (1999) report preliminary, exploratory work in this area). Despite these limitations, we provide the first theory-based empirical experimental evidence regarding conditions under which outcome effects in accounting contexts likely reflect warranted belief revision, outcome bias, and reverse outcome bias. Empirical findings from two experiments suggest that both foresight and hindsight evaluators' assessments exhibit outcome bias and reverse outcome bias, depending on whether Bayesian probabilities of auditor negligence are relatively low or relatively high, respectively.
Interestingly, we predict and observe reverse outcome bias for ecologically meaningful probabilities within audit litigation contexts – those that fall at or above legal standards of proof such as "preponderance of the evidence," "clear and convincing," and "beyond reasonable doubt." Our theory and findings collectively suggest it would be helpful to add reverse outcome bias to the lexicon of biases germane to accounting contexts and to embellish de-biasing frameworks in the accounting literature so they better account for: (1) the influence of multiple effects on evaluator bias in applicable accounting contexts (e.g., audit litigation, capital budgeting, performance evaluation), and (2) conditions under which interventions likely will influence such effects or evaluator bias. REFERENCES ANDERSON, J.; D. J. LOWE; AND P. RECKERS. "Evaluation of Auditor Decisions: Hindsight Bias Effects and the Expectation Gap." Journal of Economic Psychology (1993): 711-737. ANDERSON, J.; M. M. JENNINGS; D. J. LOWE; AND P. RECKERS. "The Mitigation of Hindsight Bias in Judges' Evaluations of Auditor Decisions." Auditing: A Journal of Practice & Theory (1997): 20-39. BARON, J. AND J. C. HERSHEY. "Outcome Bias in Decision Evaluation." Journal of Personality and Social Psychology 54 (1988): 569-579. BROWN, C. E. AND I. SOLOMON. "Effects of Outcome Information on Evaluations of Managerial Decisions." The Accounting Review 62 (1987): 564-577. BUCHMAN, T. "An Effect of Hindsight on Predicting Bankruptcy with Accounting Information." Accounting, Organizations and Society 10 (1985): 267-285. CAMERER, C. F. AND T. HO. "Violations of the Betweenness Axiom and Nonlinearity in Probability." Journal of Risk and Uncertainty 8 (1994): 167-196. CHRISTENSEN-SZALANSKI, J. J. J. AND C. F. WILLHAM. "The Hindsight Bias: A Meta-Analysis." Organizational Behavior and Human Decision Processes 48 (1991): 147-168. CLARKSON, P. M.; C. EMBY; AND V. W-S WATT.
"Debiasing the Outcome Effect." Auditing: A Journal of Practice & Theory 21 (2002): 1-20. CORNELL, R. M.; R. C. WARNE; AND M. M. EINING. "Remedial Tactics in Auditor Negligence Litigation." Working paper, University of Utah, 2005. EDWARDS, W. "Conservatism in Human Information Processing." Formal Representations of Human Judgment. B. Kleinmuntz, ed., New York, NY: Wiley, 1968: 17-52. FOX, C. R. AND R. BIRKE. "Forecasting Trial Outcomes: Lawyers Assign Higher Probability to Possibilities that are Described in Greater Detail." Law and Human Behavior 26 (2002): 159-173. FOX, C. R. AND A. TVERSKY. "A Belief-Based Account of Decision under Uncertainty." Management Science 44 (1998): 879-895. FREDERICKSON, J. R.; S. A. PEFFER; AND J. PRATT. "Performance Evaluation Judgments: Effects of Prior Experience Under Different Performance Evaluation Schemes and Feedback Frequencies." Journal of Accounting Research 37 (1999): 151-165. GIGERENZER, G. Adaptive Thinking: Rationality in the Real World. Oxford, UK: Oxford University Press, 2000. GIGERENZER, G. AND U. HOFFRAGE. "How to Improve Bayesian Reasoning Without Instruction: Frequency Formats." Psychological Review 102 (1995): 684-704. GIGERENZER, G. AND U. HOFFRAGE. "Overcoming Difficulties in Bayesian Reasoning: A Reply to Lewis & Keren and Mellers & McGraw." Psychological Review 106 (1999): 425-430. GLATER, J. D. AND F. NORRIS. "Deloitte Parts with S.E.C. Over Audit of Company." The New York Times Online (August 2, 2001). GONZALEZ, R. AND G. WU. "On the Shape of the Probability Weighting Function." Cognitive Psychology 38 (1999): 129-166. HAWKINS, S. AND R. HASTIE. "Hindsight: Biased Judgments of Past Events After the Outcomes Are Known." Psychological Bulletin 107, 3 (1990): 311-327. HERSHEY, J. AND J. BARON. "Judgments by Outcomes: When is it Justified?" Organizational Behavior and Human Decision Processes 53 (1992): 89-93. HERSHEY, J. AND J. BARON.
"Judgments by Outcomes: When is it Warranted?" Organizational Behavior and Human Decision Processes 62, 1 (1995): 127. HOCH, S. AND G. LOEWENSTEIN. "Outcome Feedback: Hindsight and Information." Journal of Experimental Psychology: Learning, Memory, and Cognition 15, 4 (1989): 605-619. HOLMSTROM, B. "Moral Hazard and Observability." The Bell Journal of Economics (Spring 1979): 74-91. HOGARTH, R. M. "Beyond Discrete Biases: Functional and Dysfunctional Aspects of Judgmental Heuristics." Psychological Bulletin 90, 2 (1981): 197-217. KADOUS, K. "The Effects of Audit Quality and Consequence Severity on Juror Evaluations of Auditor Responsibility for Plaintiff Losses." The Accounting Review 75, 3 (2000): 327-341. KADOUS, K. "Improving Jurors' Evaluations of Auditors in Negligence Cases." Contemporary Accounting Research (2001): 425-449. KADOUS, K. AND A. MAGRO. "The Effects of Exposure to Practice Risk on Tax Professionals' Judgments and Recommendations." Contemporary Accounting Research (Fall 2001): 451-475. KAHNEMAN, D. AND A. TVERSKY. "Prospect Theory: An Analysis of Decision under Risk." Econometrica 47, 2 (1979): 263-291. KELMAN, M.; D. FALLAS; AND H. FOLGER. "Decomposing Hindsight Bias." Journal of Risk and Uncertainty 16 (1998): 251-269. KENNEDY, J. "Debiasing Audit Judgment with Accountability: A Framework and Experimental Results." Journal of Accounting Research (Autumn 1993): 231-245. KENNEDY, J. "Debiasing the Curse of Knowledge in Audit Judgment." The Accounting Review 70, 2 (1995): 249-273. KINNEY, W. AND M. NELSON. "Outcome Information and the 'Expectation Gap': The Case of Loss Contingencies." Journal of Accounting Research 34, 2 (1996): 281-299. LIBBY, R.; R. BLOOMFIELD; AND M. W. NELSON. "Experimental Research in Financial Accounting." Accounting, Organizations and Society 27 (2002): 775-810. LIPE, M. "Analyzing the Variance Investigation Decision: The Effects of Outcomes, Mental Accounting, and Framing." The Accounting Review 68, 4 (1993): 748-764.
LOWE, D. J. AND P. M. J. RECKERS. "The Effects of Hindsight Bias on Jurors' Evaluations of Auditor Decisions." Decision Sciences 25, 2 (1992): 401-426. MOECKEL, C. "The Effect of Experience on Auditors' Memory Errors." Journal of Accounting Research (Autumn 1990): 368-387. MOECKEL, C. AND R. PLUMLEE. "Auditors' Confidence in Recognition of Audit Evidence." The Accounting Review (October 1989): 635-668. NETER, J.; M. H. KUTNER; C. J. NACHTSHEIM; AND W. WASSERMAN. Applied Linear Statistical Models. 4th ed. McGraw-Hill, 1996. PEECHER, M. E. AND I. SOLOMON. "Theory and Experimentation in Studies of Audit Judgments and Decisions: Avoiding Common Research Traps." International Journal of Auditing 5 (2001): 193-203. PHILLIPS, L. D. AND W. EDWARDS. "Conservatism in a Simple Probability Inference Task." Journal of Experimental Psychology 72, 3 (September 1966): 346-354. PHILLIPS, L. D.; W. L. HAYS; AND W. EDWARDS. "Conservatism in Complex Probabilistic Inference." IEEE Transactions on Human Factors in Electronics HFE-7, 1 (March 1966): 7-18. PRELEC, D. "The Probability Weighting Function." Econometrica 60 (1998): 497-528. RAIFFA, H. Decision Analysis: Introductory Lectures on Choices under Uncertainty. Reading, MA: Addison-Wesley, 1997. REIMERS, J. AND S. BUTLER. "The Effect of Outcome Knowledge on Auditors' Judgmental Evaluations." Accounting, Organizations and Society 17, 2 (1992): 185-194. TAN, H. AND M. LIPE. "Outcome Effects: The Impact of Decision Process and Outcome Controllability." Journal of Behavioral Decision Making 10 (1997): 315-325. TROCHIM, W. M. The Research Methods Knowledge Base. 2nd ed. Cincinnati, OH: Atomic Dog Publishing, 2001. TVERSKY, A. AND C. R. FOX. "Weighing Risk and Uncertainty." Psychological Review 102, 2 (1995): 269-283. TVERSKY, A. AND D. KAHNEMAN. "Advances in Prospect Theory: Cumulative Representation of Uncertainty." Journal of Risk and Uncertainty 5 (1992): 297-323. TVERSKY, A. AND D. KOEHLER.
“Support Theory: A Non-Extensional Representation of Subjective Probability.” Psychological Review 101 (1994): 547-567.
WU, G. AND R. GONZALEZ. “Curvature of the Probability Weighting Function.” Management Science 42, 12 (1996): 1676-1690.
WU, G. AND R. GONZALEZ. “Nonlinear Decision Weights in Choice under Uncertainty.” Management Science 45, 1 (1999): 74-85.
WU, G.; J. ZHANG; AND R. GONZALEZ. “Decision Under Risk.” In Blackwell Handbook of Judgment and Decision Making, ed. N. Harvey and D. Koehler. Malden, MA: Blackwell Publishers, 2004.

Table 1
Cubic Polynomial Regression of p(AN|MMP*) on Bayes – Experiment 1

Dependent variable: p(AN|MMP*) is participants’ posterior beliefs of auditor negligence. Independent variables: Hindsight = 1 for subjects in the hindsight conditions (verbs describing outcome information in past tense), 0 otherwise (verbs describing outcome information in future tense). Bayes, Bayes2, and Bayes3 refer to the simple, squared, and cubed Bayesian posteriors for p(AN|MMP*), respectively. P* refers to the stated outcome probability that we manipulated at five levels between subjects (P* = 0%, 25%, 75%, 90%, or 100%). Minutes refers to how long it took each participant to complete the instrument.

            Estimated Coefficient   Std. Error       t         p
Intercept            –10.123           5.495      –1.842     0.066
Hindsight              3.753           1.602       2.343     0.019
Bayes                  1.266           0.243       5.203   < 0.001
Bayes2                –2.416           0.645      –3.746   < 0.001
Bayes3                 1.867           0.470       3.975   < 0.001
P*25%                  1.237           2.729       0.453     0.651
P*75%                 11.595           2.948       3.934   < 0.001
P*90%                 18.322           2.986       6.136   < 0.001
P*100%                 5.021           3.071       1.635     0.102
MINUTES                0.525           0.169       3.116     0.002
F                     10.123                               < 0.001
R2                     36.7%
R2adj                  36.0%
n                        811

Table 2
Cubic Polynomial Contrast Tests of p(AN|MMP*) against Bayes – Experiment 1

Contrasts of participants’ posterior beliefs of auditor negligence, p(AN|MMP*), to Bayesian posteriors at low (25%) and high (75%) values of Bayesian posteriors using estimated marginal means from the cubic polynomial model (Table 1).
Panel A: Foresight Conditions

               Bayes    p(AN|MMP*)   Contrast   Hypothesized Sign   Std. Error       t         p
At P* = 0%    25.00%      24.54%      –0.46%           +              2.78%      –0.165     0.869
              75.00%      42.85%     –32.15%           –              3.23%      –9.947   < 0.001
At P* = 25%   25.00%      25.78%       0.78%           +              2.09%       0.372     0.355
              75.00%      44.08%     –30.92%           –              3.14%      –9.842   < 0.001
At P* = 75%   25.00%      36.14%      11.14%           +              2.17%       5.125   < 0.001
              75.00%      54.44%     –20.56%           –              2.62%      –7.845   < 0.001
At P* = 90%   25.00%      42.86%      17.86%           +              2.28%       7.844   < 0.001
              75.00%      61.17%     –13.83%           –              2.58%      –5.355   < 0.001
At P* = 100%  25.00%      55.82%      30.82%           +              7.42%       4.151   < 0.001
              75.00%      47.87%     –27.13%           –              2.80%      –9.706   < 0.001
Overall       25.00%      31.78%       6.78%           +              1.55%       4.384   < 0.001
              75.00%      50.08%     –24.92%           –              2.31%     –10.795   < 0.001

Panel B: Hindsight Conditions

               Bayes    p(AN|MMP*)   Contrast   Hypothesized Sign   Std. Error       t         p
At P* = 0%    25.00%      28.29%       3.29%           +              2.80%       1.177     0.120
              75.00%      46.60%     –28.40%           –              3.24%      –8.764   < 0.001
At P* = 25%   25.00%      29.53%       4.53%           +              2.09%       2.172     0.015
              75.00%      47.84%     –27.16%           –              3.12%      –8.693   < 0.001
At P* = 75%   25.00%      39.89%      14.89%           +              2.18%       6.839   < 0.001
              75.00%      58.20%     –16.80%           –              2.61%      –6.441   < 0.001
At P* = 90%   25.00%      46.62%      21.62%           +              2.29%       9.434   < 0.001
              75.00%      64.92%     –10.08%           –              2.58%      –3.906   < 0.001
At P* = 100%  25.00%      59.57%      34.57%           +              3.47%       9.963   < 0.001
              75.00%      51.62%     –23.38%           –              2.77%      –8.450   < 0.001
Overall       25.00%      35.53%      10.53%           +              1.55%       6.782   < 0.001
              75.00%      53.84%     –21.16%           –              2.30%      –9.218   < 0.001

Table 3
Supplemental Analyses – Experiment 1

Mean outcome effects and outcome bias (reverse outcome bias) within low (10% ≤ Bayes ≤ 30%) and high (50% ≤ Bayes ≤ 90%) ranges of Bayesian posterior probabilities of auditor negligence.
                                         10% ≤ Bayes ≤ 30%       50% ≤ Bayes ≤ 90%
                                         Estimate       p        Estimate       p
Hindsight
  Observed Outcome Effect                  14.8     < 0.001        16.5     < 0.001
  Bayesian Outcome Effect                   2.5        n/a         26.7        n/a
  Outcome Bias (Reverse Outcome Bias)      12.3     < 0.001       –10.2     < 0.001
  n                                         144                      81
Foresight
  Observed Outcome Effect                  10.4     < 0.001         8.9       0.001
  Bayesian Outcome Effect                   2.7        n/a         24.1        n/a
  Outcome Bias (Reverse Outcome Bias)       7.7     < 0.001       –15.2     < 0.001
  n                                         133                      81

Table 4
Panel A: Cubic Polynomial Regression of p(AN|MMP*) on Bayes – Experiment 2

Dependent variable: p(AN|MMP*) is participants’ posterior beliefs of auditor negligence. Independent variables: Hindsight = 1 for subjects in the hindsight conditions (verbs describing outcome information in past tense), 0 otherwise (verbs describing outcome information in future tense). Bayes, Bayes2, and Bayes3 refer to the simple, squared, and cubed Bayesian posteriors for p(AN|MMP*), respectively. P*SEC refers to outcome probability, and equals 1 if the material misstatement outcome was based upon the results of an SEC investigation, and 0 if the outcome was based upon rumors about an SEC investigation. Participants then evaluated P* numerically, and those evaluations of P* were used in the computation of Bayes.

            Estimated Coefficient   Std. Error       t         p
Intercept              2.128           3.589       0.593     0.554
Hindsight             –1.802           2.390      –0.754     0.452
Bayes                  1.909           0.377       5.068   < 0.001
Bayes2                –3.143           1.117      –2.814     0.006
Bayes3                 2.237           0.836       2.677     0.008
P*SEC                 –0.169           2.417      –0.070     0.944
F                     43.609                               < 0.001
R2                     59.1%
R2adj                  57.7%
n                        157

Panel B: Cubic Polynomial Contrast Tests of p(AN|MMP*) against Bayes – Experiment 2

Contrasts of participants’ posterior beliefs of auditor negligence, p(AN|MMP*), to Bayesian posteriors at low (25%) and high (75%) values of Bayesian posteriors using estimated marginal means from the cubic polynomial model (Panel A).
 Bayes    p(AN|MMP*)   Contrast   Hypothesized Sign   Std. Error       t         p
25.00%      31.61%       6.61%           +              4.12%       1.604     0.055
75.00%      60.74%     –14.26%           –              4.72%      –3.022     0.001

Figure 1
A Hypothetical Probability Weighting Function (e.g., Tversky and Kahneman 1992)
[Plot of the weighting function w(p) against probability p, with both axes running from 0% to 100%.]

Figure 2
Probability Weighting Effect (H1) and Temporality Effect (H2)
[Plot of observed posteriors against Bayesian posteriors (both axes 0% to 100%) for the foresight and hindsight conditions, showing outcome bias at low Bayesian posteriors and reverse outcome bias at high Bayesian posteriors (H1), with the temporality effect (H2) as the vertical gap between the two curves.]

Figure 3
Polynomial Profile Plots of p(AN|MMP*) on Bayes – Experiment 1, from the polynomial regression model at Table 1
[Plot of observed posteriors against Bayesian posteriors (both axes 0% to 100%) for the foresight and hindsight conditions.]

Figure 4
Polynomial Profile Plots of p(AN|MMP*) on Bayes – Experiment 2, from the polynomial regression model at Table 4
[Plot of observed posteriors against Bayesian posteriors, both axes 0% to 100%.]

Appendix A
Excerpt from Introduction of Experiment 1

Why is this Important? Unfortunately, undetected material misstatements can cause enormous losses of money, as in the appalling Enron and WorldCom cases. Investors and lenders who rely on financial statements with undetected material misstatements can lose their money. Employees of the company can lose their jobs and their life savings, which are often invested in the stock of the company they work for. Many innocent people, both inside and outside of the company, can be harmed by an accounting or auditing failure. As we have seen, the negative effects of misstated financial statements can touch hundreds of thousands of people. Thus, it is absolutely critical to society that auditors exercise due professional care.

Appendix B
Excerpt from Experiment 1: Elicitation of Prior Beliefs

1. How often do you think that audited financial statements contain a material misstatement?
A: _________ out of every 1,000 audited financial statements contain a material misstatement.

2. How often do you think the auditors are negligent?

A: The auditors are negligent on _________ out of every 1,000 financial statement audits.

THE NEXT QUESTION WILL REFER TO YOUR ANSWER TO #2. BEFORE ANSWERING #3, PLEASE WRITE YOUR ANSWER TO #2 EVERYWHERE YOU SEE A DOTTED LINE IN #3.

3. Now, imagine the negligent audits in your answer to #2. Given that these auditors are negligent, how many of these do you believe contain a material misstatement in the audited financial report?

A: The financial statements contain a misstatement on _________ of the .......... negligent audits from #2.

Appendix C
Excerpt from Experiment 1: Outcome Information and Elicitation of Posterior Beliefs, from the hindsight [foresight] condition at P* = 25%

Instructions: For questions #4-5, we will be referring to an Earnings Management Watch List compiled by a major securities rating agency. This is a list of U.S. companies who have been identified by the rating agency as having potentially manipulated [as potentially manipulating] their reported earnings numbers in the financial statements [in the future]. This Watch List includes information about a number of companies and the Big Four auditors who audited [who will soon begin the audits of] their financial statements. After changing the name of both the auditors and their clients, the list looks like the following:

Auditor       Client
BIG-FOUR-A    U.S. Company, Inc. 1
BIG-FOUR-B    U.S. Company, Inc. 2
BIG-FOUR-C    U.S. Company, Inc. 3
BIG-FOUR-D    U.S. Company, Inc. 4
BIG-FOUR-A    U.S. Company, Inc. 5
BIG-FOUR-B    U.S. Company, Inc. 6
…             …

IMPORTANT: Among the financial statement audits from this Earnings Management Watch List, SOME (25%) of the audited financial reports actually were [will be] materially misstated, even though the auditor gave [will give] the financial statements a clean audit opinion after the audit was completed [once the audit is complete].
Review Question

4. The audits that you just read about of the financial statements of companies from the Earnings Management Watch List:
[ ] have already finished.
[ ] have not yet started.

Case Question

5. We have randomly chosen one audit from the above list. How frequently do you believe that the auditors would [will] have been negligent on audits like the one randomly chosen from this list?

A: I believe that the auditors would [will] have been negligent on ______ out of every 100 audits like the one randomly chosen from this Watch List.

Appendix D
Excerpt from Experiment 2: Outcome Information and Elicitation of Posterior Beliefs, from the hindsight [foresight] condition at P* = SEC Investigation

Outcome Information

IMPORTANT! For questions #11-13, also assume the following: After the audit was completed [is complete], Big Time Gravel encountered [will encounter] some financial difficulty. The SEC (the U.S. Securities and Exchange Commission) began [will begin] an investigation, alleging that the financial statements of Big Time Gravel were [will have been] materially misstated, even though Jones & Company gave [will have given] the financial statements a clean audit opinion. At the end of their investigation, the SEC concluded [will conclude] that Big Time Gravel [will have] understated the amount of gravel and concrete inventory in its audited financial statements by $15 million. As a result of Big Time Gravel’s financial difficulty, Bierhoff Inc., a lender who loaned [will loan] Big Time Gravel $25 million, had [will have] to settle for collecting only $15 million of its $25 million loan, losing $10 million. Bierhoff Inc. alleged [will allege] that it [will have] made the loan relying on the financial statements, audited by Jones & Company, which were [will have been] found by the SEC to be materially misstated.

Case Questions

11.
The SEC conducted [will conduct] an investigation and concluded [will conclude] that the financial statements of Big Time Gravel were [will have been] materially misstated. Think of 100 SEC investigations like the one described above. How frequently do you believe that audited financial statements with an SEC conclusion such as this actually would [will] have been materially misstated? A: I believe that the audited financial statements actually would [will] have been materially misstated for ______ out of every 100 audits with an SEC judgment like this one. 12. Think of 100 cases like Jones & Company’s audit of Big Time Gravel, and with the same outcome information (described above). Given this outcome, how frequently do you believe that the auditors would [will] have been negligent on audits like the one described above? A: I believe that the auditors would [will] have been negligent on ______ out of every 100 audits like this one.
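The paper's analyses rest on two ingredients: a Bayesian posterior probability of auditor negligence built from elicited frequencies like those in Appendices B–D, and Prospect Theory's inverse-S probability weighting function (Figure 1). The following is a minimal numerical sketch, assuming the Tversky and Kahneman (1992) one-parameter weighting form with their median estimate γ = 0.61 and illustrative frequencies of our own; the numbers are not participants' responses, and the weighting form is not the cubic polynomial the paper estimates.

```python
def posterior_negligence(p_mm, p_an, p_mm_given_an):
    """Bayes' rule: p(AN | MM) = p(MM | AN) * p(AN) / p(MM).

    The three inputs correspond to the Appendix B elicitations:
    base rate of material misstatement, base rate of auditor
    negligence, and misstatement rate given negligence.
    """
    return p_mm_given_an * p_an / p_mm

def w(p, gamma=0.61):
    """Tversky-Kahneman (1992) probability weighting function (Figure 1).

    Inverse-S shaped: w(p) > p at low probabilities and w(p) < p at
    high ones, matching outcome bias at low Bayesian posteriors and
    reverse outcome bias at high ones (H1).
    """
    return p ** gamma / (p ** gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)

# Hypothetical Appendix B-style responses: 50/1,000 audits misstated,
# 20/1,000 negligent, misstatement on 90% of negligent audits.
bayes = posterior_negligence(p_mm=0.05, p_an=0.02, p_mm_given_an=0.90)
print(round(bayes, 2))    # 0.36

# Overweighting of a low posterior, underweighting of a high one.
print(round(w(0.25), 3))  # 0.291 > 0.25 (outcome bias region)
print(round(w(0.75), 3))  # 0.568 < 0.75 (reverse outcome bias region)
```

The crossover of w(p) around the middle of the probability scale is what generates the paper's prediction that adverse-outcome knowledge inflates negligence judgments only when the Bayesian posterior is low and deflates them when it is high.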