The Effect of Substantive Testing Outcome and Type of Control

THE EFFECT OF SUBSTANTIVE TESTING OUTCOME AND TYPE OF CONTROL
DEFICIENCY ON AUDITORS’ CONTROL SEVERITY JUDGMENTS
STEPHEN KWAKU ASARE
UNIVERSITY OF FLORIDA
BARBARA MAJOOR
NYENRODE BUSINESS UNIVERSITY
ARNOLD WRIGHT
NORTHEASTERN UNIVERSITY
DECEMBER 2014
We appreciate helpful comments from the editor, Jacqueline Hammersley, the anonymous
reviewers, Bill Felix, Robert Knechel, Mark Peecher and participants at the 2012 International
Symposium on Auditing Research in Tokyo, Japan.
ABSTRACT: This study examines the extent to which auditors’ control severity judgments in
the natural setting reflect their beliefs about how substantive testing outcome (no misstatement
versus immaterial misstatement) and type of control deficiency (entity level versus account
specific) should affect those judgments. This is an important issue because auditors’ lack of
insight into their control severity judgments can lead them to issue internal control reports that
they do not intend. We posit that auditors use an intuitive mode of reasoning in the natural
setting, which leads their control severity judgments to be affected by substantive testing
outcome but not by the type of control deficiency. We also posit that these control severity
judgments do not reflect auditors’ beliefs about how the two variables should affect those
judgments. In experiment 1, we use a between-participants design to examine the likely effect of
the manipulated variables in the natural setting. The results indicate that auditors evaluate a
control deficiency less severely when it has not led to a misstatement. However, the type of
deficiency does not affect auditors’ severity assessments. In experiment 2, we use a
within-participants design, which makes the manipulations more salient, to examine whether auditors
intend the outcome and deficiency effects observed in experiment 1. The results indicate that
auditors evaluate the deficiencies as equally severe, regardless of the substantive testing
outcome, and assess entity level deficiencies as more severe than account specific deficiencies.
Thus, the findings indicate that auditors do not intend either of the effects observed in the
between-participants design. Taken together, the results suggest that auditors’ deficiency
evaluations, in the natural setting, may not reflect their beliefs about how substantive testing
outcomes and type of deficiency should affect those evaluations, raising the possibility of their
issuing unintended internal control reports. The results also suggest the need to consider decision
aids and other mechanisms that align auditors’ knowledge and heuristics.
Keywords: Misstatements; account specific and entity level deficiencies; severity of
deficiencies; material weaknesses; internal control decisions
I. INTRODUCTION
Auditors are required to evaluate the severity of internal control deficiencies to determine
whether the deficiencies, individually or in combination, are material weaknesses as of the
balance sheet date (PCAOB 2007, par. 62).1 This evaluation is an important and difficult
judgment, which determines whether an adverse report is issued on the client’s internal controls
over financial reporting (ICOFR) with its consequential effects, such as a higher cost of capital
or negative market reaction (e.g., Ashbaugh-Skaife et al. 2009; Beneish et al. 2008; Hammersley
et al. 2008; Asare and Wright 2012). Prior studies report that task variables, such as the presence
of a misstatement, affect auditors’ intuitive control severity judgments (e.g., Kinney et al. 2008;
PCAOB 2009; Bedard and Graham 2011; Gramling et al. 2013). However, there is no evidence
on whether this effect reflects auditors’ beliefs about how the task variables should affect their
judgments. Basic research suggests that auditors’ intuitive judgments may not necessarily reflect
their underlying knowledge about control deficiencies (e.g., Kozuch and Nichols 2011). It is,
therefore, important to examine the extent to which auditors’ intuitive control severity judgments
reflect their underlying knowledge about controls.
The purpose of this study is to examine the extent to which auditors’ control severity
judgments in the natural setting reflect their beliefs about how substantive testing outcome (no
misstatement versus immaterial misstatement) and type of control deficiency (entity level versus
account specific) should affect those judgments.2 That is, do auditors intend the effect that
substantive testing outcomes and type of control deficiency have on their control severity
judgments? This is an important question because auditors’ lack of insight into their control
severity judgments can lead them to issue unintended control reports, which, in turn, suggests the
need for decision aids and/or guidance that align their knowledge and heuristics (see Tversky and
Kahneman 1996; Tan et al. 2002; Libby et al. 2002).
We address this question in two separate experiments using a between-participants design
in experiment 1 and a within-participants design in experiment 2. The former setting allows us to
provide evidence on how the two variables affect auditors’ assessments in the natural setting
while the latter setting provides evidence on auditors’ beliefs regarding how the variables should
affect their judgments when differences in substantive testing outcomes and type of control
deficiency are made apparent.3
Auditors are aware of the substantive testing outcomes when the severity of control
deficiencies is being assessed for reporting purposes.4 PCAOB inspection reports suggest that
auditors may inappropriately base their material weakness evaluations solely on the materiality
of identified misstatements in the financial statements (PCAOB 2009).5 Prior research also
shows that the presence of a misstatement influences auditors’ assessments of the likelihood that
a control deficiency is a material weakness (Bedard and Graham 2011; Gramling et al. 2013).
Further, audit partners suggest that they find the identification of ICOFR design weaknesses to
be a particularly difficult task when no misstatements have been detected (Kinney et al. 2008).
These results suggest that the availability of substantive testing results can lead to an outcome
effect (see Kadous 2000; Emby et al. 2002; Peecher and Piercey 2008), which may cause
auditors’ control severity judgments to be influenced by the presence of misstatements (i.e., what
has gone wrong or not gone wrong) rather than the potential for material misstatements (i.e.,
what could go wrong) (see PCAOB 2007). We extend this literature by providing evidence on
whether an outcome effect occurs unconsciously or is intended by auditors. An outcome effect in
auditors’ control severity judgments is less defensible if it can be shown that it is not intended.
1
This requirement applies to auditors of companies that are listed or cross-listed on U.S. exchanges. Thus, the
PCAOB auditing standard has a global reach applying to auditors who operate within and outside the U.S.
Deficiencies that are not severe enough to be classified as material weaknesses are either significant deficiencies or
control deficiencies.
2
Consistent with prior studies, we use the natural setting to characterize the more common practice situation where
auditors make decisions based on only the realized outcome and are unable to compare alternative levels of
outcomes across identical circumstances (Tversky and Kahneman 1996; Libby et al. 2002). From a methodological
perspective, Kahneman and Tversky (1996) suggest that a between-subjects design provides a clean test of subjects’
natural reasoning process, while the within-subjects design draws attention to the independent variable of interest
and thus gives subjects a chance to detect and correct errors and inconsistencies in their responses.
3
Prior studies recommend the use of a combination of between- and within-subjects designs as a method of
partitioning the effects of unintentional biases from intentional judgment policies (Tversky and Kahneman 1996;
Tan et al. 2002; Libby et al. 2002). The combined design highlights how subjects address any conflict between what
they do and what they know (Libby et al. 2002). Evidence of differences using between-subjects treatments, but not
using within-subjects treatments, suggests that the between-subjects differences are unintentional. On the other
hand, evidence of differences using within-subjects treatments, but not using between-subjects treatments, suggests
that subjects are aware of the implications of the differences in the stimuli, but that, in their natural reasoning
process, the stimuli were ignored or subjects’ related knowledge was not accessed and used.
4
Auditors may also identify and evaluate control deficiencies in earlier phases of the audit. The purpose of such
evaluation will be to reassess control risk and modify the substantive program but not to determine the type of audit
report on the ICOFR. At this earlier stage auditors are less likely to know substantive testing outcomes when
evaluating such control deficiencies. Further, early identification and evaluation of control deficiencies provide
management an opportunity to remediate the deficiency prior to the required year-end evaluation of control
deficiencies. Our focus in this paper is on auditors’ evaluation of deficiencies for reporting purposes, which likely
occurs at the completion of the engagement when substantive testing outcomes are known.
With respect to type of control deficiency, auditing standards make a distinction between
entity level and account specific controls, in what is referred to as a “top down” approach
(PCAOB 2007). While the top-down approach favors the selection of entity level controls for
testing to enhance audit cost-effectiveness, the standards do not directly address the issue of how
the type of deficiency should affect control severity judgments (Asare et al. 2013). Entity level
controls are more pervasive, more difficult to audit or to remediate, and have higher operational
consequences than account specific controls (see Asare and Wright 2014). As such, it could be
argued that auditors should evaluate entity-level deficiencies more severely than account specific
deficiencies. Even if auditors are generally aware that entity level deficiencies should be
evaluated more severely than account specific deficiencies, it does not follow that case specific
evaluations of such deficiencies will reflect this general cognitive template (Kozuch and Nichols
2011; Libby et al. 2002; Tversky and Kahneman 1996). We are unaware of any studies that
provide theoretical analysis or empirical evidence on how the type of deficiency affects or should
affect auditors’ control severity judgments. It is therefore of theoretical interest and practical
importance to provide evidence on whether auditors conceptually think that the type of control
deficiency should affect their internal control severity assessments and whether it actually affects
their case-specific assessments in a natural setting.
5
Basing the evaluation solely on identified misstatements is inappropriate because there can be a material
weakness in ICOFR even where no misstatement has occurred.
Auditing standards underscore the notion that the severity of a deficiency does not depend on whether a
misstatement actually has occurred but rather on whether there is a reasonable possibility that the company's
controls will fail to prevent or detect a material misstatement (PCAOB 2007, par. 64). Thus, for example, ineffective
audit committee oversight of financial reporting or an absence of security controls over the enterprise
resource planning system is a material weakness, even if the control deficiency has yet to lead to a misstatement.
Dual-process theory postulates that decision-makers can operate in either an intuitive
(system 1) or deliberative (system 2) mode of reasoning (Kahneman 2011; Evans and Stanovich
2013). System 1 operates autonomously and generates intuitive default responses that allow the
decision maker to solve problems rapidly (Kahneman 2011; Evans and Stanovich 2013). System
2 involves cognitive decoupling and hypothetical thinking, which while exacting on working
memory resources can intervene to override and replace system 1 when the decision situation
presents unusual difficulty or novelty (Kahneman 2011; Evans and Stanovich 2013).
Drawing on dual-process theory, we posit that auditors use an intuitive mode of reasoning
in the natural setting when evaluating control deficiencies. We also posit that auditors will switch
to a deliberative mode of reasoning when the task is restructured to allow a direct comparison of
the underlying controls and substantive testing outcomes. The combined effect of these two
postulates is that auditors’ intuitive control severity judgments may not reflect their beliefs or
deliberative judgments. Accordingly, we propose dual hypotheses for each of our experimental
variables: (1) As regards substantive testing outcome, we hypothesize that (a) auditors evaluate
a control deficiency more severely when the outcome of substantive testing is an immaterial
misstatement compared to no misstatement; and (b) auditors’ severity assessment will not be
affected by the substantive testing outcome in a setting that allows a direct comparison of both
outcomes and, thus, makes the presence or absence of misstatements salient. (2) With regard to
type of deficiency, we hypothesize that (a) auditors’ control severity judgments are unaffected by
type of control deficiency, while (b) auditors believe that entity level control deficiencies should
be evaluated more severely than account specific control deficiencies.
To test our hypotheses, we conduct two experiments. In experiment 1, 95 experienced
auditors evaluate a revenue deficiency in a between-participants design, where we manipulate
the substantive testing outcome (immaterial misstatements detected or misstatements not
detected) and the type of control deficiencies (entity level or account specific). In experiment 2,
we employ a within-participants design with 32 experienced auditors where the manipulations
are made more salient and participants evaluate the deficiency under both substantive outcomes.
They also evaluate the severity of a deficiency assuming it relates to each type of control.
Our findings indicate that: (i) auditors evaluate a control deficiency, for which an
immaterial misstatement has been detected, as more severe than that for which no misstatements
have been detected; (ii) this outcome effect is not intended by auditors because in the
within-participants design, auditors evaluate the deficiencies as equally severe; (iii) in a
between-participants design, auditors’ severity assessments are not affected by the type of control
deficiency; and (iv) as indicated by the results of the within-participants design, auditors believe
that the type of deficiency affects severity assessment (with entity level deficiencies evaluated as
more severe than account specific deficiencies). In sum, the results represent an intriguing
disconnect between auditors’ knowledge and the heuristics or rules of thumb that they actually
use when evaluating control deficiencies for reporting purposes, highlighting the possibility that
auditors may be issuing unintended adverse control reports. This issue is a concern because
internal control problems have become lagging rather than leading indicators of future financial
reporting problems (Jonas et al. 2007). It also raises the specter of auditors being unaware of how
engagement conditions affect their control severity judgments.
Our research design offers several important innovations. First, the between-participants
design provides evidence of the existence of an outcome effect in a more natural setting where
auditors are unable to compare alternative substantive testing outcomes for the same underlying
control process. Second, the salience of the manipulations in the within-participants design
provides an opportunity for participants to reveal their beliefs about how their severity
assessments should vary with the manipulated variables. This design allows us to make
inferences about whether the between-participants outcome effect is intentional (Reffet 2010;
Libby et al. 2002; Kahneman and Tversky 1996). It also highlights that auditors’ judgment
heuristics could lead to internal control decisions that are disconnected from their underlying
knowledge (Libby et al. 2002). Finally, the within-participants results allow us to infer that the
between-participants results are a departure from auditors’ “own standards of judgment,” which
suggests the need for auditor training, task restructuring or decision tools.
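As a rough illustration of the inference logic behind the combined design, the following Python sketch maps the presence or absence of an effect in each design to an interpretation. The function name and labels are hypothetical, and the labels for the two cases in which both designs agree are assumptions added for completeness; the text discusses only the two cases in which the designs disagree.

```python
from enum import Enum


class Interpretation(Enum):
    # Effect in the natural setting that participants do not endorse on comparison.
    UNINTENDED_EFFECT = "unintended effect"
    # Belief revealed under comparison that is not applied in the natural setting.
    KNOWLEDGE_NOT_APPLIED = "knowledge not applied"
    # Assumed labels for the two agreeing cases (not discussed in the text).
    CONSISTENT_EFFECT = "consistent effect"
    CONSISTENT_NO_EFFECT = "consistent no effect"


def interpret(between_effect: bool, within_effect: bool) -> Interpretation:
    """Map the effects found in each design to an interpretation (illustrative only)."""
    if between_effect and not within_effect:
        return Interpretation.UNINTENDED_EFFECT
    if within_effect and not between_effect:
        return Interpretation.KNOWLEDGE_NOT_APPLIED
    if between_effect and within_effect:
        return Interpretation.CONSISTENT_EFFECT
    return Interpretation.CONSISTENT_NO_EFFECT


# Pattern reported for substantive testing outcome: effect between, none within.
outcome = interpret(between_effect=True, within_effect=False)
# Pattern reported for type of deficiency: no effect between, effect within.
deficiency_type = interpret(between_effect=False, within_effect=True)
print(outcome.value)
print(deficiency_type.value)
```

Under these assumptions, the outcome effect falls in the "unintended effect" cell and the type-of-deficiency result falls in the "knowledge not applied" cell.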
The remainder of this paper is organized into four sections. In the next section we review
relevant literature and identify the research hypotheses. This section is followed by a description
of the method and presentation of the findings. The final section is devoted to a discussion of the
major results and their implications for practice and future research.
II. RELEVANT LITERATURE AND RESEARCH HYPOTHESES
Background
A deficiency in ICOFR exists when the design or operating effectiveness of a control
does not allow management or employees, in the normal course of performing their assigned
functions, to prevent or detect misstatements on a timely basis (IAASB 2009, ISA 265;
PCAOB 2007). Under Auditing Standard No. 5, auditors are required to evaluate the severity
of each control deficiency that comes to their attention to serve as a basis of determining
whether an adverse report should be issued on the client’s ICOFR (PCAOB 2007).6 The
standard states that the severity of a deficiency depends on (i) “whether there is a reasonable
possibility that the company’s controls will fail to prevent or detect a misstatement of an
account balance or disclosure”; and (ii) “the magnitude of the potential misstatement resulting
from the deficiency or deficiencies” (PCAOB 2007, par. 63).7 A material weakness, which
requires the issuance of an adverse report, “is a deficiency, or a combination of deficiencies, in
ICOFR such that there is a reasonable possibility that a material misstatement of the
company’s annual or interim financial statements will not be prevented or detected on a timely basis”
(PCAOB 2007). On the other hand, a significant deficiency “is a deficiency, or a combination
of deficiencies, in ICOFR that is less severe than a material weakness, yet important enough to
merit attention by those responsible for oversight of the company’s financial reporting”
(PCAOB 2007).
6
Under international auditing standards, auditors are required to obtain an understanding of internal control relevant
to the audit to identify, assess and respond to the risk of material misstatement (IAASB 2009, ISA 315) and to
evaluate the severity of each identified deficiency to determine whether, individually or in combination, they
constitute significant deficiencies, which are required to be communicated in writing to those charged with the
governance of the company (ISA 265).
7
Factors that affect the likelihood of misstatements include the nature of the financial statement accounts,
disclosures, and assertions involved; the susceptibility of the related asset or liability to loss or fraud; the
subjectivity, complexity, or extent of judgment required to determine the amount involved; the interaction of the
control with other controls; the interaction of the deficiencies and the possible future consequences of the
deficiencies (PCAOB 2007, par. 65). Factors that affect the magnitude of misstatement include the financial
statement amounts or total transactions exposed to the deficiency and the volume of activity in the account balance
or class of transactions exposed to the deficiency that has occurred in the current period or that is expected to occur
in the future periods (PCAOB 2007, par. 66).
Auditing Standard No. 5 indicates that, “the severity of a deficiency does not depend on
whether a misstatement actually has occurred” (PCAOB 2007, par. 64). The standard adopts
this position because even though the audits of the ICOFR and financial statements are
integrated, they focus on different, albeit related, objectives. The ICOFR audit focuses on
processes, which can be flawed even in the absence of misstatements. Thus, for instance,
granting unrestricted access to a company’s revenue system is a material weakness due to the
potential for this deficiency to lead to material misstatements, independent of whether this
control deficiency has or has not actually led to any misstatements up to that point in time. In
essence, when evaluating the severity of a deficiency, auditing standards direct that the
emphasis is on what could go wrong (defined by the likelihood and magnitude of potential
misstatements) in the financial statements, not what has gone wrong.
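The two-factor logic can be made concrete with a minimal Python sketch. It assumes, purely for illustration, that the auditor's judgment has already been reduced to a yes/no "reasonable possibility" assessment and an estimate of the potential misstatement that is compared with a materiality threshold. The function name, parameters, and figures are hypothetical and are not drawn from Auditing Standard No. 5. The point is only that the detected misstatement does not enter the classification.

```python
def classify_deficiency(
    reasonable_possibility: bool,
    potential_misstatement: float,
    materiality: float,
    merits_oversight_attention: bool,
    detected_misstatement: float = 0.0,
) -> str:
    """Classify a control deficiency from likelihood and potential magnitude.

    detected_misstatement is accepted but deliberately ignored: severity
    depends on what could go wrong, not on what has gone wrong.
    """
    if reasonable_possibility and potential_misstatement >= materiality:
        return "material weakness"
    if merits_oversight_attention:
        return "significant deficiency"
    return "control deficiency"


# Same underlying deficiency under two substantive testing outcomes (hypothetical figures).
no_misstatement = classify_deficiency(
    True, 5_000_000, 1_000_000, True, detected_misstatement=0.0
)
immaterial = classify_deficiency(
    True, 5_000_000, 1_000_000, True, detected_misstatement=20_000.0
)
print(no_misstatement == immaterial)  # prints True
```

In this sketch the classification is identical under both outcomes, which corresponds to the position the standard takes.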
Auditing Standard No. 5 also highlights the distinction between entity level controls and
account specific controls and requires the use of a “top down” approach for purposes of
identifying key controls to test (PCAOB 2007, par 21).8 Entity level controls are those controls
emplaced at top management levels such as the control environment, risk assessment processes,
the audit committee, centralized processing, period-end financial reporting process, or
monitoring controls. Thus, entity level controls typically affect multiple accounts (PCAOB 2007,
par. 24). However, entity level controls can also operate at a level of precision that directly
addresses a particular account or assertion (e.g., monthly review of variances in the sales budget
at the corporate level), in which case they operate over the same account at multiple locations.
Account specific controls (PCAOB 2007, par. 34) are those controls pertaining to a particular
account or process (e.g., a control in the revenue process to reduce the risk that shipped goods
will not be billed). While Auditing Standard No. 5 notes that the ultimate decision as to whether
a control should be selected for testing depends on the relevance rather than the labeling of the
control (e.g., entity-level control, transaction-level control, control activity, monitoring control,
preventive control, detective control) (PCAOB 2007, par. 41), it does not directly address the
question of how type of control should affect the evaluation of a deficiency. Thus, the effect of
both substantive testing outcome and the type of deficiency on auditors’ control severity
judgments as well as auditors’ beliefs regarding this effect are of importance from a practical as
well as a theoretical standpoint.
8
Auditing Standard No. 5 makes the distinction between entity level and account specific controls to enhance the
efficiency of the audit of ICOFR, in what is referred to as a “top down” approach. If an entity-level control
sufficiently addresses the assessed risk of misstatement in an account, disclosure or assertion, the auditor need not
test additional account specific controls relating to that risk (PCAOB 2007, par. 23).
Dual-Process Theory of Cognition
Dual-process theory of human cognition postulates that decision makers can operate in
either of two different modes of thought (Kahneman 2011; Evans and Stanovich 2013). One
mode, referred to as system 1, is characterized by intuitive, experiential and affective
reasoning while the other mode (system 2) is characterized by analytical and deliberative
reasoning (Kahneman 2011). System 1 “operates automatically and quickly with little or no
effort” while system 2 “allocates attention to the effortful mental activities that demand it”
(Kahneman 2011). System 1 acts as the default mode of thought for solving problems but
system 2 can be activated when an event or situation is detected that violates the model of the
world that system 1 maintains (e.g., a surprising stimulus) (Kahneman 2011). Thus, dual-process
theory suggests that most of what decision-makers do originates in system 1, with system 2
taking over when things get difficult (Kahneman 2011). According to Kahneman (2011), this
division of labor minimizes cognitive effort and optimizes performance. Further, while system
1 reasoning is generally effective and yields swift and usually accurate decisions in the short term, it
is nonetheless prone to biases in decision-making (Kahneman 2011; Evans and Stanovich
2013). As discussed below, we apply dual-process theory and argue that auditors make
intuitive control severity judgments in the natural setting but can resort to deliberative
reasoning when the task is formulated to draw their attention to the underlying controls and
substantive testing outcomes.
Outcome Effect and Severity Assessments
When evaluating the severity of a control deficiency, the auditor is essentially asking the
question of what could go wrong in the financial statements in light of the deficiency. Inevitably,
at the time of the evaluation, the auditor is also aware of what has gone wrong in the financial
statements as a result of substantive testing, raising the specter of a possible outcome effect
(Brown and Solomon 1987; Lipe 1993; Kadous 2000).9 Extant research on auditors’ evaluation
of internal controls is consistent with an outcome effect in that the presence of a misstatement
influences auditors’ assessments of the likelihood that a control deficiency is a material
weakness (Kinney et al. 2008; Bedard and Graham 2011; Gramling et al. 2013).
We suggest that this outcome effect occurs because auditors’ default mode of reasoning
about control deficiencies is system 1. Specifically, auditors evaluate control deficiencies using a
rapid cognitive reconstruction process (Nisbett and Ross 1980; Einhorn and Hogarth 1981;
Hogarth 1980; Ross and Sicoly 1982). Consistent with Brown and Solomon (1987), we propose
that auditors evaluate control deficiencies by starting with the substantive outcome and working
backward to connect the causal links that may have led to the outcome. They then are expected
to evaluate the severity of the deficiency based on the causal links, leading to their magnitude
and likelihood assessments aligning with the substantive testing outcome. Thus, the detection of
a misstatement enables the auditor to more easily envision a chain of events leading from the
control deficiency to a future material misstatement. As a result, when substantive tests detect
immaterial misstatements, auditors are more likely to construct causal chains where the control
deficiency leads to the occurrence of a larger misstatement compared to when no misstatement
has been detected.10 This discussion leads to the following hypothesis:
H1: Auditors’ control severity judgments are more severe when a related immaterial
misstatement is detected than when no misstatement is detected.
9
While an outcome effect cannot be definitively labeled as a bias, there is considerable evidence that decision
makers’ intuitive judgments often take outcomes into account in a way that is irrelevant to the true quality of the
decision (Baron and Hershey 1988). However, recent research has shown that the existence of such a bias may
depend on the evaluators’ pre-outcome likelihood of the event, suggesting that more emphasis should be placed on
understanding the conditions under which an outcome effect reflects warranted belief revision, outcome bias, or
reverse outcome bias (Peecher and Piercey 2008).
We are unaware of any research that examines auditors’ beliefs about how substantive
testing outcome should affect their control severity judgments. Notwithstanding the prevalence
and robustness of the outcome effect across a variety of decision settings, prior research suggests
that it is an unconscious judgmental phenomenon, which can be mitigated by raising subjects’
awareness. For instance, Gino et al. (2009) show that priming decision makers to adopt a
deliberative perspective (i.e., to provide their most rational, objective judgment), rather than an
intuitive perspective (i.e., to provide their intuitive, “gut feel” judgment), can undo the outcome
bias, as long as the deliberative perspective is required first. That is, participants who first
responded under the rational priming were not subject to the outcome effect but the effect
occurred when they were subsequently asked to respond intuitively. In contrast, participants who
first responded intuitively were subject to the outcome effect and continued to be subject to this
effect when they were asked to respond deliberatively. In an audit litigation setting, Clarkson et
al. (2002) similarly show that bringing an alternative view of the circumstances to the fore can
undo the outcome effect. We argue that priming and an alternative set of circumstances cause the
decision maker to switch to system 2 processing.
10
The accounting literature provides support for a biasing effect of outcome information in various juror
assessments of auditor performance (e.g., Lipe 1993; Lowe and Reckers 1997; Emby et al. 2002; Kadous 2000;
Wright and Wright 2014).
We propose that auditors will similarly switch to system 2 processing when they are
sequentially provided both outcomes (no misstatement and immaterial misstatement), which
provides them an opportunity to detect and correct any errors and/or inconsistencies in their
responses to the separate outcomes. That is, comparison of the outcomes generated by the same
underlying control deficiency shifts auditors’ mode of thinking from an intuitive to a more
deliberative perspective (Gino et al. 2009). A sequential presentation of both outcomes draws
auditors’ attention to the fact that the underlying control deficiency is the same and has simply
led to alternative outcomes (for instance, a vulnerability in access controls is present regardless
of whether a hacker has or has not taken advantage of it). The outcome comparison is a way of
making auditors aware of the alternative plausible outcomes that can be generated by the control
deficiency (Clarkson et al. 2002). We also argue that control severity judgments based on such a
comparison reveal auditors’ beliefs about whether the alternative outcomes should affect severity
assessments (Kahneman and Tversky 1996; Libby et al. 2002; Tan et al. 2002; Reffet 2010).11
Under those circumstances, we predict an absence of an outcome effect in the setting where
multiple outcomes are present for the underlying control deficiency.12 This leads to the following
hypothesis:
H2: Auditors’ control severity judgments will not be affected by substantive testing outcome in a
setting that allows a comparison of both outcomes.
11
Kahneman and Tversky (1996) discuss the importance of a within-participants design as a vehicle for examining
decision makers’ awareness of their judgments in natural settings, where they are unable to compare alternative
levels of a variable of interest (for instance, for a given control deficiency, substantive testing will reveal either a
misstatement or no misstatement). The salience of the manipulations in the within-participants design provides an
opportunity for the participants to reveal their beliefs about how their judgments should vary with the manipulated
variables (see also Libby et al. 2002; Tan et al. 2002; Reffet 2010).
12
The participants in Gino et al. (2009) were required to respond either rationally or intuitively and subsequently
were asked to respond in a reverse order. As discussed in the method section, our participants received both
outcomes prior to providing any responses. The difference in approach reflects a difference in objectives. Their
study was focused on whether the rational mindset can override the outcome effect. Our emphasis is on using the
within-participants design as a standard of judgment for the between-participants responses (i.e., is there a
disconnect between what auditors do in their natural setting and what they think they should be doing?).
Type of Control Deficiency and Severity Assessments
Current auditing standards do not resolve the question of whether the type of control
deficiency should affect auditors’ control severity judgments (PCAOB 2007, par. 41; Asare et al.
2013). However, Auditing Standard No. 5 identifies some factors that should affect internal
control severity assessments that could be linked to type of deficiency. For instance, the standard
identifies “the possible future consequences of the deficiency” as a risk factor that affects the
likelihood of a misstatement. Asare and Wright (2014) suggest that entity level controls have
more serious future consequences because they have higher verification, information, operational
and remediation risk.13 This would suggest that, ceteris paribus, auditors should judge entity level
deficiencies as more severe than account specific deficiencies. Similarly, the standard notes that
“the total of transactions exposed to the deficiency” is a factor that affects the magnitude of
misstatement. Entity-level controls are more pervasive in affecting different types of accounts
and locations, whereas account specific control deficiencies are isolated, and therefore their
effects are not as far-reaching. Thus, it could be argued that a deficiency in entity-level controls
exposes more transactions to potential misstatement.
Bedard and Graham (2011) is the only study that provides data regarding the severity
classification of various types of control deficiencies. They report that the control environment is
13
Remediation risk reflects management’s ability to fix a material weakness. Operational risk reflects loss in
operational efficiency arising from the presence of a material weakness. Verification risk reflects the auditor’s
ability to audit around identified material weaknesses to provide an opinion on the financial statements (see Asare
and Wright 2014).
13
a strong factor in distinguishing a material weakness from significant deficiencies. However,
unexpectedly, they find that entity level monitoring controls and information technology general
controls were more likely to be classified as significant deficiencies rather than material
weaknesses. They also find that revenue account specific deficiencies are more likely to be
classified as material weaknesses. However, their study was not designed to compare the effect
of entity level and account specific deficiencies on severity judgments. Further, with archival
data it is not possible to disentangle the auditor’s judgments from the material weakness
classification, which ultimately results from auditor-client negotiations (see e.g., Earley et al.
2008).14
We posit that auditors default to system 1 in evaluating a control deficiency. This mode
of thinking will likely not involve hypothetical comparison of the effect of an entity-level versus
an account specific deficiency. Without a deliberative consideration, auditors are likely to
consider the situational cues related to the deficiencies (e.g., number of deviations in the sample)
and evaluate the deficiencies as equally severe. In contrast, if auditors are asked to engage in a
direct comparison of entity level and account specific deficiencies, it will increase their
awareness of the differences and lead to different severity assessments. This discussion leads to
the following hypotheses:
H3: Auditors’ control severity judgments are not affected by the type of control deficiency
(entity level or account specific).
H4: Auditors evaluate entity level deficiencies more severely than account specific deficiencies
in a setting that allows a direct comparison of both functional controls.
14
Psychology research suggests that decision makers perceive more risk when a risky item is less observable, is
unknown, and has a pervasive effect (Slovic 1987; Weber 1988; Weber and Bottom 1989, 1990). Consistent with
this expectation, accounting research provides evidence that investors perceive entity-level control deficiencies as
more severe than account-specific deficiencies (e.g., Rose et al. 2010; Asare and Wright 2014) but there is no
evidence on whether auditors’ severity assessments are also similarly affected by the type of control deficiency.
III. METHOD
Overview of Experiment 1 and Participants
In experiment 1 we employ a between-participants design that manipulates substantive
testing outcome and type of control deficiency to test H1 and H3. The participants were 95
experienced auditors (26 partners, 13 directors (i.e., non-equity partners), 26 senior managers,
and 30 managers) with a mean (standard deviation) of approximately 4.5 (1.8) years of Sarbanes-Oxley work-related experience. Thirty-two auditors self-reported that they had evaluated the
severity of a control deficiency 0 to 10 times, twenty (20) between 10 to 25 times, seventeen (17)
between 25 to 50 times, and twenty-six (26) over 50 times.15 Participants were recruited from the
Big 4 firms in the Netherlands. Each of the firms has a worldwide audit approach in examining
and attesting to internal controls as required by the Sarbanes-Oxley Act. Neither the number of
times participants had evaluated the severity of a deficiency (F=1.304, p=.278) nor the
participant’s current rank (F=1.075, p=.364) varied by experimental condition.
The Task
Each participant received a secured internet link to the case from a firm representative.
The first screen described the purpose of the study as obtaining insights on how auditors evaluate
control deficiencies and asked participants to work independently since individual judgments
were sought. Subsequently, participants were asked to assume they were assigned to an audit
engagement (Precision Inc.). The audit is almost complete and there is one unresolved item on
the “summary of control deficiencies.” Specifically, the audit team has identified a control
deficiency and their task is to evaluate the severity of the deficiency.
15
The number of times that the severity of a deficiency had been evaluated is not significant when included as a
covariate in the results presented in the next section. Similarly, excluding participants in the 0 to 10 category does
not qualitatively change the results.
These preliminary instructions were followed by background information about the
company, which was described as a Dutch corporation that is a leading global producer of
flooring and ceiling systems for use primarily in the construction and renovation of buildings.
The company’s shares were traded on the Amsterdam Stock Exchange (ASE) and were cross-listed on the New York Stock Exchange (NYSE). The company, through its Netherlands
operations and international subsidiaries, sells its products in more than 80 countries. The firm
had audited the company’s financial statements since its incorporation in 1998. Participants were
also provided with a 3-year comparative financial summary, including net sales, accounts
receivable, allowance for bad debts, pretax income, overall materiality (5% of pretax income),
materiality for accounts receivable (75% of overall materiality), and total assets. For the current
year net sales, pretax income, overall materiality, and tolerable misstatement for accounts
receivable were reported at €3,844 million, €340 million, €17 million, and €13 million
respectively.
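The materiality figures in the case follow from the stated percentages. A minimal sketch of that arithmetic, assuming the reported amounts are rounded to the nearest € million (the function and variable names are illustrative and are not part of the case materials):

```python
def materiality_thresholds(pretax_income_m, overall_pct=0.05, account_pct=0.75):
    """Return (overall materiality, account-level threshold) in EUR millions.

    overall_pct and account_pct are the percentages stated in the case:
    5% of pretax income and 75% of overall materiality.
    """
    overall = overall_pct * pretax_income_m
    account = account_pct * overall
    return overall, account


# Current-year pretax income reported in the case: EUR 340 million.
overall_m, accounts_receivable_m = materiality_thresholds(340.0)

print(f"Overall materiality: EUR {overall_m:.2f} million")
print(f"Accounts receivable threshold: EUR {accounts_receivable_m:.2f} million")
```

The unrounded account-level figure is €12.75 million, which is consistent with the €13 million reported for accounts receivable.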
Participants next saw a screen that provided a summarized description of a key control
(account specific or entity level, see the following section entitled “manipulations and design”),
the sample plan and results for testing design effectiveness, the sample plan for testing operating
effectiveness, and the sample results for operating effectiveness. In addition, participants
received information about compensating controls and remediation.
For the test of design effectiveness, participants were told the audit team performed a
walkthrough of an account and concluded that the control was effectively designed. This phase
was followed by the sample plan for testing operating effectiveness. The sample results showed
that the control was not operating in 3 of 40 sampled items and additional analysis of the 3
deviations indicated that they were not fraud related.16 The participants were told that no
compensating controls are present and remediation is not possible by year-end because of the
timing of identifying the deficiencies.
Participants were then provided the results of substantive tests, manipulated at either no
misstatement or the detection of an immaterial misstatement of €7 million related to the control
deficiency.17 Participants then provided responses to the following questions regarding the
judged severity of the identified deficiency: (i) the likelihood that the control deficiency could
lead to a material misstatement in revenue; (ii) the likelihood that next year’s unaudited financial
statements will materially misstate revenue; (iii) estimates of the expected amount of misstatement
that the deficiency could lead to; and (iv) the overall severity of the control deficiency for
reporting purposes (on a 3-point scale labeled “control deficiency,” “significant deficiency” or
“material weakness”). The response mode for the likelihood questions was a 5-point scale
labeled “no chance,” “remote,” “reasonably possible,” “probable,” or “certain.”
Manipulations and Design
In experiment 1 the detection of misstatements (detected or not detected) and type of
control deficiency (account specific or entity level) were manipulated between participants
resulting in a fully crossed 2 x 2 between-subject design. In the detected misstatement condition,
participants were told that “as a result of substantive tests the audit team identified transactions
in which sales were overstated by €7 million.” They were further told that these overstatement
errors were related to the control deficiency. They were also informed that management adjusted
the sales accounts to reflect the misstatement and no other misstatements were found.
Conversely, in the no detected misstatement condition, participants were told, “no errors were
identified during the substantive testing of sales.”
16
The sample size of 40 is based on the guidance in PricewaterhouseCoopers’ publication (PWC 2004).
17
The misstatement was 41% of overall materiality and 54% of tolerable misstatement for accounts receivable.
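The detected misstatement is described as immaterial, and footnote 17 expresses it as a share of the two thresholds. A short sketch of that comparison, using the rounded € million amounts reported in the case (variable names are illustrative):

```python
# Amounts in EUR millions, as reported in the case description.
misstatement_m = 7.0
overall_materiality_m = 17.0
tolerable_misstatement_ar_m = 13.0

share_of_overall = misstatement_m / overall_materiality_m
share_of_tolerable = misstatement_m / tolerable_misstatement_ar_m

# The misstatement counts as immaterial here because it is below both thresholds.
is_immaterial = (
    misstatement_m < overall_materiality_m
    and misstatement_m < tolerable_misstatement_ar_m
)

print(f"Share of overall materiality: {share_of_overall:.0%}")
print(f"Share of tolerable misstatement: {share_of_tolerable:.0%}")
print(f"Below both thresholds: {is_immaterial}")
```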
In the account specific condition, the key control was described as:
“The audit team identified the matching of shipping documents with sales invoices and the
investigation of unmatched invoices as a key control over the validity of Sales assertion. The
revenue application module produces an exception report, which lists any unmatched sales
invoices. The assistant controller then investigates the causes of any unmatched items monthly.”
In the entity level condition, the key control was described as:
“The audit team identified the comparison of actual monthly sales to budgeted monthly sales, for
each of the 900 significant customer accounts, and the investigation of variances as a key control
over the validity of Sales assertion. The variance analysis-reporting module produces an
exception report that lists significant customer accounts with unexpected variances. The assistant
controller then investigates the causes of any variances.”
The control deficiencies were identified through examples in auditing standards, firm
manuals, and textbooks and then evaluated by experienced practitioners in the development of
the experimental case as important entity level or account specific controls respectively in the
revenue cycle. The entity level control operates at a level of precision that could adequately
prevent or detect on a timely basis material misstatements related to sales. The use and
classification of budgets and investigation of variances as entity level controls is widely
acknowledged in the professional literature and is a pervasive monitoring function of
management that affects many accounts (see e.g. Ernst and Young 2007, 2; PCAOB 2009, 16,
example 2-2). In addition, a budgetary control deficiency is more difficult to remediate and has
more operational consequences than the investigation of unmatched shipping documents. We
chose this type of entity level control to allow us to hold constant the sample parameters
described below.
The sample plan for testing operating effectiveness of the account specific control
indicated the audit team decided to take a statistical sample of 40 items. This plan was based on
three factors: the control operating multiple times a day, thus averaging about 30 times a day or
about 10,800 times annually (900 significant customer accounts x 12); the control had operated
effectively in the last two years with minimal deviations; and an expected deviation rate of 0%
and tolerable deviation rate of 3%. The plan for the entity level control was based on the same
parameters except that the control was described as operating monthly, averaging about 900
times a month or about 10,800 times annually (900 significant customer accounts x 12).
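The sample parameters can be checked with a few lines of arithmetic. The sketch below reproduces the annual frequency of each control and compares the observed deviation rate (3 of 40 items) with the 3% tolerable rate. The 360-day year used for the account specific control is an assumption made here to reconcile "about 30 times a day" with "about 10,800 times annually"; it is not stated in the case.

```python
def deviation_rate(deviations, sample_size):
    """Observed sample deviation rate as a fraction."""
    return deviations / sample_size


# Entity level control: 900 significant customer accounts reviewed monthly.
entity_level_annual = 900 * 12

# Account specific control: about 30 operations a day.
# Assumption (not in the case): a 360-day year.
account_specific_annual = 30 * 360

observed_rate = deviation_rate(3, 40)  # 3 deviations in 40 sampled items
tolerable_rate = 0.03                  # tolerable deviation rate in the plan
exceeds_tolerable = observed_rate > tolerable_rate

print(f"Entity level operations per year: {entity_level_annual}")
print(f"Account specific operations per year: {account_specific_annual}")
print(f"Observed deviation rate: {observed_rate:.1%}")
print(f"Exceeds tolerable rate: {exceeds_tolerable}")
```

Under these figures both controls operate about 10,800 times a year, so the population size is the same across the two deficiency conditions.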
IV. RESULTS
Manipulation Check
After completing the primary task, participants responded to a manipulation check
question for the detected misstatement independent variable (“Did the audit team identify a
misstatement during the substantive testing of sales?” Yes or No). All participants correctly
answered the question, so the manipulation was deemed successful.
Descriptive Statistics and Hypotheses Tests
H1 predicts that auditors’ control severity judgments are more severe when a related
immaterial misstatement is detected than when no misstatement is detected. H3 predicts that
auditors’ control severity judgments are not affected by the type of control deficiency (i.e., they
evaluate account specific and entity level deficiencies as equally severe). We test H1 and H3
jointly by performing a 2 x 2 ANOVA using substantive testing outcome and type of control
deficiency as independent variables. The control severity judgment is reflected in the judgment
of the magnitude of expected misstatement, the likelihood of misstatement and the overall
conclusion. We present evidence on each of these measures.
The ANOVA results examining the effect on expected misstatement are reported in panel
A of Table 2. Supporting H1, there is a significant main effect for the substantive testing
outcome (F = 7.683, p = .007). The descriptive statistics presented in Table 1 (panel A) show that
the mean expected misstatement for participants in the detected misstatement condition is €244 million
compared to €9 million in the no misstatement condition.18 Neither the deficiency type
manipulation (F = .366, p = .546) nor the interaction of deficiency type and substantive testing
outcome (F = .205, p = .652) is significant. Thus, H3 regarding the type of deficiency is not
rejected.
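The anchor comparison described in footnote 18 can be worked through with the figures reported in the case. A minimal sketch, using the sample deviation rate (3 of 40), current-year net sales, and the mean expected misstatement reported above (variable names are illustrative):

```python
# Figures reported in the case and in the results, in EUR millions.
net_sales_m = 3844.0
reported_mean_expected_misstatement_m = 244.0

# Sample deviation rate from the test of operating effectiveness.
sample_deviation_rate = 3 / 40

# Misstatement implied by projecting the deviation rate onto net sales.
projected_misstatement_m = sample_deviation_rate * net_sales_m

# How close the reported mean is to that projection.
ratio = reported_mean_expected_misstatement_m / projected_misstatement_m

print(f"Projected misstatement: EUR {projected_misstatement_m:.1f} million")
print(f"Reported mean as a share of the projection: {ratio:.0%}")
```

On these inputs the projection is about €288 million, so the reported mean of €244 million is roughly 85% of it.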
Panel B of Table 2 reports the ANOVA results using the likelihood of misstatements as
the dependent variable. Supporting H1, there is a significant main effect for the substantive
testing outcome (F= 35.4, p=.001).19 The descriptive statistics presented in Table 1 show that the
mean likelihood assessment for participants in the detected misstatement condition is 3.40
compared to 2.51 in the no misstatement detected condition.20 Neither the type of control
deficiency (F=.593, p=.443) nor the interaction of control deficiency type and substantive testing
outcome (F=.108, p=.743) is significant. H3 again is not rejected.
18
Interestingly, the magnitude of error for participants in the error condition is approximately equal to the product of
the sample deviation rate (7.5%) and the revenue balance (€3,844 million), suggesting that in judging the magnitude of
potential misstatement, auditors may be using the sample deviation rate as an anchor.
19
We obtain qualitatively similar results if we use the likelihood that next year’s unaudited financial statements will
contain a material misstatement in revenue as the dependent variable with occurrence of misstatement (F= 29.9,
p=.001), type of deficiency (F= 1.052, p=.308), and the interaction of occurrence of misstatement and type of
deficiency (F= 0.180, p=.672). The descriptive statistics in Table 1 show that the likelihood in the error condition is
3.21 (.75) compared to 2.43 (.67) for the no-error condition.
20
The response scale for likelihood is: 1= “no chance”; 2= “remote”; 3= “reasonably possible”; 4= “probable”; 5=
“certain.”
The third measure of severity is the auditors’ overall conclusion. Since the decision to
issue or not issue an adverse control report is the most important outcome for both users and the
auditor, we collapsed the overall conclusion into two categories (material weakness or no
material weakness). The frequency distribution is presented in panel B of Table 1. Twenty-eight
(67%) of the participants in the detected misstatement condition chose a material weakness while
only 4 (8%) did so in the no misstatement detected condition (χ² = 36.67, p = .001), providing
support for H1. Panel C of Table 1 shows that thirteen (30%) of the participants in the account
specific condition chose a material weakness compared to 19 (37%) in the entity level condition
(χ² = .419, p = .517). Thus, consistent with the other measures of severity, deficiency type did not
significantly affect material weakness classifications.
We also tested the hypotheses with a logistic regression using substantive testing
outcome, deficiency type, and their interaction as the independent
variables. The severity classification (material weakness or not material weakness) is the
dependent variable. The results are presented in Panel B of Table 2. The substantive testing
outcome is significant (Wald statistic 18.14, p =.001). However, neither the deficiency type
(Wald statistic .001, p =.99) nor the interaction term (Wald statistic .001, p =.99) is significant.
Lastly, we also performed an ordinal regression that used participants’ overall conclusion as a
dependent variable. The independent variables are as before. The untabulated results show a
significant main effect for the substantive testing outcome (Wald statistic = 36.73, p = .001) but
no significant effect for type of control deficiency (Wald statistic .822, p = .365).21 Thus, we find
consistent and robust results across the full scale of internal control severity measures in support
of our hypotheses on substantive testing outcome (H1) and type of control deficiency (H3).
21
We also tested the hypotheses by performing an ANOVA using participants’ overall conclusion as a dependent
variable (response scale: 1= “control deficiency”; 2 = “significant deficiency”; 3 = “material weakness”). The
independent variables are as before. The ANOVA results in Table 2 show a significant main effect for the
substantive testing outcome (F= 78.3, p=.001) but no significant effect for type of control deficiency and the
interaction term. The descriptive statistics presented in Table 1 show that the mean overall conclusion for
participants in the detected misstatement condition is 2.62 compared to 1.49 in the no detected misstatement condition. These
findings again corroborate the logistic regression results reported previously.
Overview of Experiment 2 and Participants
In experiment 2 we employ a within-participants design that manipulates the substantive
testing outcome and type of control deficiency to test H2 and H4. The participants were 32
experienced auditors (12 partners, 3 directors, 15 senior managers, and 2 managers) with a mean
(standard deviation) of 5.28 (2.4) years of Sarbanes-Oxley work-related experience. Thirteen
auditors self-reported that they had evaluated the severity of a control deficiency 0 to 10 times,
five (5) between 10 to 25 times, nine (9) between 25 to 50 times, and five (5) over 50 times.
Participants were recruited from the same Big 4 firms in the Netherlands, with the proviso that
they had not participated in experiment 1.
Manipulation and Design
Figure 1 provides an overview of the experimental process. There were two tasks in this
experiment focusing on the within-participants manipulation of substantive testing outcome and
type of control deficiency respectively. With a few modifications, the first task on substantive
testing outcome is similar to that used in experiment 1. However, as explained below, we
obtained one overall severity assessment for entity level and account specific deficiencies. Thus,
our measure captures participants’ beliefs regarding the relative severity of the two functional
controls rather than their evaluation of the specific deficiencies used in experiment 1. Our focus
is to demonstrate that participants’ evaluation of the specific control deficiencies used in
experiment 1 does not reflect their general knowledge about the relative severity of deficiencies in
the two functional controls.
Each participant received a secured internet link to the case from a firm representative.
The instructions on the first screen were as before except participants were told they will be
asked some questions about the client’s ICOFR under two different scenarios. They then
received the same background information, financial summary, a summarized description of a
key control (account specific deficiency for all participants), the sample plan and results for
testing design effectiveness, the sample plan for testing operating effectiveness, and the sample
results for operating effectiveness. In addition, consistent with experiment 1, participants
received information about compensating controls, remediation, and sample results.
As shown in Figure 1, participants were then provided the substantive testing outcomes
and were told that the substantive testing was completed during the fieldwork phase under two
independent unrelated scenarios. Half of the participants initially saw the no misstatement
outcome (scenario 1 for these participants) followed by a detection of a misstatement of €7
million related to the control deficiency which had been corrected by management (scenario 2
for these participants). The other half saw the substantive testing outcomes in the reverse order.
The substantive testing outcome was randomly ordered. The purpose of providing both
substantive testing outcomes prior to making severity assessments was to provide auditors an
opportunity to detect and correct any errors and/or inconsistencies in their responses to the
separate outcomes.
After seeing both outcomes, participants moved to the next screen where they were
provided the control deficiency, the substantive testing outcome for their scenario 1 and the
comparative summary performance measures. Participants then provided the same assessments
as in experiment 1. After their scenario 1 assessments, participants were asked if the audit team
identified a misstatement during the substantive testing of sales (Yes or No). This served as a
manipulation check. Participants moved to scenario 2 where they saw the control deficiency, the
substantive testing outcome for their scenario 2 and the comparative summary performance
measures. They then provided the same assessments based on the scenario 2 parameters followed
by an evaluation of whether the audit team identified a misstatement during the substantive
testing of sales (Yes or No).22
After completing the assessments for both scenarios, participants moved to the second
task, which focused on examining the effect of making the manipulation of type of control
deficiency salient. Each participant was provided definitions of an entity level and account
specific control respectively.23 They were then asked to assume they are evaluating a control
deficiency in a key control at the end of the year. The control operates over several transactions
that result in a balance of €4 billion (i.e., approximate size of Precision sales). They were
asked to assume that there are no compensating controls and remediation is not possible. Each
participant was then asked to assess the severity of the deficiency assuming that it is an account
specific deficiency and then an entity level deficiency (order balanced) on an 11-point scale with
endpoints labeled (1=not very severe; 6 = moderately severe; 11 = very severe). We used this
approach in response to negative pilot comments on the repetitiveness of evaluating four full
cases and since it provides the same information more efficiently. Further, our objective in
experiment 2 is to examine whether auditors’ broad knowledge of these two types of controls
affects their severity judgment. Finally, participants provided demographic information.
22
All participants correctly answered these questions, so the manipulation was deemed successful.
23
Specifically, the case indicated that, “An account specific deficiency exists when the design or operation of a
control in a specific account or transaction cycle does not allow management or employees, in the normal course of
performing their assigned functions, to prevent or detect misstatements on a timely basis. An entity level deficiency
exists when the design or operation of a control at the company level does not allow management or employees, in
the normal course of performing their assigned functions, to prevent or detect misstatements on a timely basis.”
Descriptive Statistics and Hypotheses Tests
H2 states that auditors’ control severity judgments will not be affected by the substantive
testing outcome in a setting that allows a comparison of both outcomes. We test H2 by analyzing
each of the three control severity judgments. Our primary test in each case uses a Mixed
ANOVA model with the relevant control severity judgment as a repeated measure and the order
in which the judgments were made as a between-subjects variable.
Across auditors, the mean (standard deviation) expected misstatement when a
misstatement is detected is €45.72 million (48.97) compared to a mean (standard deviation) of €39.47
million (48.3) when a misstatement is not detected. As shown in Table 4, neither the substantive testing
outcome (F=3.533, p=.070) nor its interaction with order of evaluation (F=.043, p=.838) is
significant. Descriptive statistics by experimental condition are presented in Table 3. In the no
misstatement followed by the misstatement order (Panel A), the mean expected misstatement of
€40.5 million is not significantly different from €46.06 million (paired-t = -1.68, p =.114).
Similarly, for the misstatement followed by the no misstatement order (Panel B), the mean
expected misstatement of €45.38 million is not significantly different from €38.44 million
(paired-t = -1.20, p =.247). Taken together, the results support H2 that substantive testing
outcome did not have a significant effect on expected misstatement.
The mixed ANOVA results in Table 4 show that there is no significant effect for
substantive testing outcome when comparing the likelihood of misstatement judgments within
subject (F=2.743, p=.108). Moreover, the interaction (F=.171, p=.662) and the between-subjects
order effect are not significant (F=.056, p=.815). The descriptive statistics in panel A of Table 3
show that the mean likelihood of misstatement of 2.75 is not significantly different from 3.06
(paired-t = 1.23, p =.237). The same results are indicated by panel B where the mean likelihood
of misstatement of 3.06 is not significantly different from 2.88 (paired-t = 1.15, p =.270).
As shown in Table 4, none of the variables are significant at a conventional level of
significance when we use the overall conclusion as the dependent variable (substantive testing
outcome (F=3.253, p=.081); interaction (F=1.446, p=.239); between-subjects order effect
(F=.001, p=.998)). In panel A of Table 3, the mean overall conclusion of 2.06 is not
significantly different from 2.13 (paired-t = 1.01, p =.333). Panel B also shows that the mean
overall conclusion of 2.25 is not significantly different from 1.94 (paired-t = 1.78, p =.096).24
Thus, across the three measures of control severity judgments, we find no significant effect for
substantive testing outcome in support of H2.
H4 states that auditors’ control severity judgments are more severe when an entity level
deficiency is present compared to an account specific deficiency in a setting that allows a
comparison of both functional controls. We used a Mixed ANOVA model with severity
assessments under each type of deficiency as a repeated measure and the order in which the
judgments were made as a between subjects variable. In support of H4, Table 4 shows a
significant effect for type of deficiency (F=115.453, p=.001) on severity assessments. Neither the
interaction (F=.052, p=.821) nor the order (F=.062, p=.804) is significant. Across auditors, the
mean (standard deviation) severity assessment of the entity level deficiency is 8.13 (1.24), which
is larger than the mean (standard deviation) of 6.66 (.971) severity assessment of the account
specific deficiency. The descriptive results (by condition) presented in Table 4 support this
conclusion regardless of the order in which the assessment is made. For auditors who evaluated
entity level followed by the account specific deficiency, the mean severity assessment for entity-level
deficiency is 8.19 (1.27) compared to 6.69 (.873) for account specific deficiency (paired-t =
7.34, p = .001). The mean severity assessment for entity-level deficiency is 8.06 compared to
6.63 for account specific deficiency in the reverse order (paired-t = 7.91, p = .001). Thus, H4 is
supported.
24
The McNemar test of correlated proportions shows that the proportion of material weakness conclusions does not
vary by substantive testing outcome in either order condition (p > .5).
V. DISCUSSION
Our primary findings suggest that: (i) substantive testing results, which are inevitably
known at the time auditors evaluate an unremediated control deficiency for control reporting
purposes, affect auditors’ control severity assessments in a manner that is not intended by
auditors; and (ii) auditors generally believe that entity-level control deficiencies should be
evaluated more severely than account-specific control deficiencies. However, auditors’ control
severity judgments, in a between-participants setting, do not reflect this belief.
The evidence that auditors’ natural severity assessments may not reflect their underlying
beliefs has important implications. Regarding the outcome effect, auditors in a natural setting
behave as if the substantive testing outcome of no misstatements is evidence that there is not a
material weakness. Thus, auditors may underestimate the severity of a deficiency that has yet to
lead to misstatements, which could be a serious problem since misstatements often lag ICOFR
problems. For instance, system vulnerability and exploitation of the system are typically
temporally separated (Bozorgi et al. 2010).
In light of the SEC staff’s concerns about whether the decline in the number of adverse
reports is really attributable to better company ICOFR or auditors’ failure to identify material
weaknesses (Besch 2009; Whitehouse 2010), our findings suggest that a decision prompt that
asks auditors to consider possible misstatements that could occur might attenuate the outcome
effect that results in potentially underestimating the severity of a deficiency. For instance, if no
misstatements are detected, auditors may be asked to assume that the deficiency had led to an
immaterial misstatement and vice versa, as was essentially done in our within-participants design
in experiment 2. Earley et al. (2008) find that asking auditors to evaluate and explicitly document
the likelihood and magnitude of the effect of a control deficiency on the financial statements
mitigates management’s “first mover” influence on auditors’ judgments in the early phases of
control assessments. An alternative promising prompt may require auditors to explicitly
document possible misstatements that could occur as a result of the control deficiency. The issue
is whether the documentation of these possible misstatements will affect likelihood and
magnitude assessments. The efficacy of these prompts is a promising issue for future research.
Conversely, a potential risk resulting from the outcome effect is that auditors may
overestimate the severity of deficiencies that have led to misstatements. However, because it is
assumed that clients would protect their interests during the negotiation process with auditors,
this concern appears less problematic. Nonetheless, the results suggest the use of various quality
control devices may be necessary (e.g., decision aid reminders, training in the psychology of
reasoning) to mitigate this concern. The key is to introduce interventions that encourage auditors
to evaluate control deficiencies deliberatively rather than intuitively.
Our results also indicate that while auditors are aware that entity-level deficiencies are
generally more severe than account-specific deficiencies, their day-to-day severity assessments
may not reflect this belief. This finding suggests that auditors may have difficulties evaluating
the potential impact of entity level deficiencies unless their functionality is highlighted. This
finding may explain why Bedard and Graham (2011) unexpectedly found that entity level
monitoring controls and information technology general controls were more likely to be
classified as significant deficiencies rather than material weaknesses.
From a theoretical perspective, our results are consistent with psychology theories that
suggest that individuals can operate in either of two different modes of thought (see e.g.,
Kahneman 2011). Our findings provide some evidence that auditors use their system 1 (intuitive)
mode of thought when they evaluate control deficiencies for reporting purposes in the natural
setting. However, auditors can use their system 2 (rational and systematic) mode of thought
when the context requires them to think more deliberatively. The challenge in the evaluation of
control deficiencies is to structure the task to bring alternative perspectives to the fore (e.g.,
documenting what misstatements could occur if none occurred and vice versa), thus
triggering the use of auditors’ system 2 reasoning.
As noted previously, our manipulation of type of deficiency in experiment 2 differed
from that used in experiment 1. Our experiment 2 manipulation obtained auditors’ underlying
beliefs and knowledge of the severity of entity level deficiencies versus account specific
deficiencies. Further, we obtained a measure of overall severity assessment rather than the
component judgments obtained in experiment 1. While this difference represents an important
design limitation in comparing the results of the two experiments, we are able to say that, as a
conceptual matter, auditors believe entity level deficiencies are more severe than account
specific deficiencies. However, their evaluation of specific functional controls may differ from
this general template. For instance, with respect to the specific deficiencies used in experiment 1,
auditors do not naturally follow their general functional control deficiency template. Moreover,
we employed an entity level control that operates at the account level, whereas several entity level
controls operate over multiple processes and accounts. As such, future research is needed to
corroborate the findings and consider other factors that may impact such judgments. For
instance, in practice auditors may need to evaluate multiple deficiencies, rather than a single
deficiency as was done in our study, raising issues about aggregation across different deficiency
types. Finally, because entity level deficiencies occur at the corporate level, the client pressures
to evaluate them may be different than those exerted on the auditor when account specific
deficiencies are evaluated, which, in turn, could impact the negotiation pressures placed on the
auditors’ severity assessments. These are important issues for practice consideration and future
research.
Table 1
Experiment 1: Between-Participants Results
Panel A: Descriptive statistics (means (standard deviations)) of expected misstatements,
likelihood of misstatements, likelihood of misstatements in next year unaudited revenue, and
overall conclusions by experimental condition

                                        Account Specific   Entity Level   Total
Misstatement Detected                   n=23               n=19           n=42
  Expected misstatement (EM)¹           284 (788)          196 (161)      244 (589)
  Likelihood of misstatement (LM)²      3.38 (.73)         3.32 (.58)     3.40 (.67)
  Likelihood of misstatement in next
  year’s unaudited revenue (NYLM)³      3.17 (.72)         3.26 (.81)     3.21 (.75)
  Overall Conclusion (OC)⁴              2.52 (.59)         2.74 (.56)     2.62 (.58)
No Misstatement                         n=20               n=33           n=53
  EM                                    17 (20)            4 (8)          9 (15)
  LM                                    2.55 (.83)         2.48 (.67)     2.51 (.73)
  NYLM                                  2.30 (.66)         2.52 (.67)     2.43 (.67)
  OC                                    1.45 (.51)         1.52 (.71)     1.49 (.64)
Total                                   n=43               n=52           n=95
  EM                                    160 (586)          74 (134)       113 (406)
  LM                                    3.05 (.90)         2.79 (.75)     2.91 (.83)
  NYLM                                  2.77 (.81)         2.79 (.80)     2.78 (.80)
  OC                                    2.02 (.77)         1.96 (.89)     1.99 (.83)
Notes
¹ Expected misstatements are in millions of Euros.
² Likelihood of misstatement is on a 5 point scale (1= “no chance”; 2= “remote”; 3= “reasonably
possible”; 4= “probable”; 5= “certain”).
³ Likelihood of misstatement in next year’s unaudited revenue is on a 5 point scale (1= “no
chance”; 2= “remote”; 3= “reasonably possible”; 4= “probable”; 5= “certain”).
⁴ Overall conclusion is on a 3 point scale (1= “control deficiency”; 2 = “significant deficiency”; 3
= “material weakness”).
Participants responded to the following questions:
1. What is your estimate of the expected amount of misstatement that the control deficiency
could lead to?
2. What is your assessment of the likelihood that the control deficiency could lead to a
material misstatement in revenue?
3. What is your assessment of the likelihood that next year’s unaudited financial statements
will contain a material misstatement in revenue?
4. What is your overall conclusion about the severity of the control deficiency for reporting
purposes?
Table 1
Panel B: Frequency of participants’ material weakness decision classified by substantive testing
outcome condition

                             Material Weakness   Not Material Weakness   Total
Misstatement detected        28 (67%)            14 (33%)                42 (100%)
No misstatement detected     4 (8%)              49 (92%)                53 (100%)
Total                        32 (34%)            63 (66%)                95 (100%)

Note: There is a significant difference in the frequency of material weakness decisions of the
participants in the Detected Misstatement condition vs. No Detected Misstatement condition
(χ²(1) = 36.67, p = .001).
Table 1
Panel C: Frequency of participants’ material weakness decision classified by type of deficiency
condition

                     Material Weakness   Not Material Weakness   Total
Account Specific     13 (30%)            30 (70%)                43 (100%)
Entity Level         19 (37%)            33 (63%)                52 (100%)
Total                32 (34%)            63 (66%)                95 (100%)

Note: There is not a significant difference in the frequency of material weakness decisions of the
participants in the Account Specific vs. Entity Level conditions (χ²(1) = .419, p = .517).
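
As a reading aid for Panels B and C, the following minimal sketch recomputes the chi-square statistics from the cell frequencies reported above. It assumes a Pearson chi-square test without continuity correction (the panels do not state which variant was used), and the helper name pearson_chi2_2x2 is hypothetical. Small differences from the reported values are expected from rounding.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table [[a, b], [c, d]].

    Assumption: an uncorrected Pearson test; the source tables do not say
    whether a continuity correction was applied.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Panel B: rows = misstatement detected / not detected,
# columns = material weakness / not material weakness.
chi2_b, p_b = pearson_chi2_2x2(28, 14, 4, 49)

# Panel C: rows = account specific / entity level, same columns.
chi2_c, p_c = pearson_chi2_2x2(13, 30, 19, 33)

print(f"Panel B: chi2(1) = {chi2_b:.2f}, p = {p_b:.3g}")
print(f"Panel C: chi2(1) = {chi2_c:.3f}, p = {p_c:.3f}")
```

Under these assumptions the Panel C values agree with the reported statistic and p-value to three decimals, and the Panel B statistic agrees to within rounding.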
Table 2
Experiment 1: Between-Participants Results

Panel A: ANOVA for Expected Misstatement
Source of Variation                  df   MS          F-test   p-value
Substantive Testing Outcome (O)      1    1,194,557   7.683    .007
Deficiency Type (D)                  1    56,979      .366     .546
OxD                                  1    31,856      .205     .652
Error                                91   155,043

Panel B: ANOVA for Likelihood of Misstatement
Source of Variation                  df   MS          F-test   p-value
Substantive Testing Outcome (O)      1    17.543      35.4     .001
Deficiency Type (D)                  1    .294        .593     .443
OxD                                  1    .054        .108     .743
Error                                91   .495

Panel C: ANOVA for Overall Conclusion
Source of Variation                  df   MS          F-test   p-value
Substantive Testing Outcome (O)      1    29.815      78.3     .001
Deficiency Type (D)                  1    .445        1.170    .282
OxD                                  1    .127        .335     .564
Error                                91   .380

Panel D: Logistic Regression of Overall Conclusion
                                                Wald-test   p-value
Substantive testing outcome manipulation (O)    18.14       .001
Deficiency type manipulation (D)                .001        .99
OxD                                             .001        .99
Table 3
Experiment 2: Within-Participants Results
Descriptive statistics (means (standard deviations)) and within-participants paired sample t test

Panel A: No misstatement condition followed by the misstatement condition
                                  Descriptive statistics        Paired differences and testing
                                  First          Second         Mean              Paired-t   df   p-value
                                  Judgment       Judgment       (std. dev.)                       (2-tailed)
Expected misstatement (EM)        40.5 (47.4)    46.06 (48.2)   -5.563 (13.27)    -1.68      15   .114
Likelihood of misstatement (LM)   2.75 (.93)     3.06 (.78)     -.313 (1.01)      -1.23      15   .237
Overall Conclusion (OC)           2.06 (.77)     2.13 (.81)     -.063 (.44)       -.57       15   .580
Frequency of material
weakness decisions                5 (31%)        7 (44%)
Panel B: The misstatement condition followed by the no misstatement condition
                                  Descriptive statistics        Paired differences and testing
                                  First          Second         Mean              Paired-t   df   p-value
                                  Judgment       Judgment       (std. dev.)                       (2-tailed)
Expected misstatement (EM)        45.38 (51.2)   38.44 (50.7)   6.94 (23.1)       1.20       15   .247
Likelihood of misstatement (LM)   3.06 (.85)     2.88 (.89)     .188 (.65)        1.15       15   .270
Overall Conclusion (OC)           2.25 (.70)     1.94 (.68)     .313 (.77)        1.78       15   .096
Frequency of material
weakness decisions                6 (38%)        4 (25%)
Notes
Participants either saw the no misstatement condition followed by the misstatement condition
(panel A) or the misstatement condition followed by the no misstatement condition (panel B) and
made the following three judgments:
1. What is your estimate of the expected amount of misstatement that the control deficiency
could lead to? Response scale: millions of Euros
2. What is your assessment of the likelihood that the control deficiency could lead to a
material misstatement in revenue? Response scale: 1=no chance; 2=remote; 3=reasonably
possible; 4=probable; and 5=certain.
3. What is your overall conclusion about the severity of the control deficiency for reporting
purposes? Response mode: 1=control deficiency; 2=significant deficiency; 3=material
weakness.
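
The paired-t values in Table 3 can be related to the reported paired differences with a short calculation. The sketch below is illustrative only: it assumes 16 participants per order condition (inferred from the reported df of 15), and the helper name paired_t_from_summary is hypothetical. It uses the Panel A mean differences and standard deviations for expected misstatement and overall conclusion.

```python
import math


def paired_t_from_summary(mean_diff, sd_diff, n):
    """Paired-sample t statistic and degrees of freedom from summary statistics.

    mean_diff: mean of the within-participant differences
    sd_diff:   standard deviation of those differences
    n:         number of participants (assumed to be 16 here, since df = 15)
    """
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error, n - 1


# Panel A, expected misstatement (EM): mean difference -5.563, std. dev. 13.27
t_em, df_em = paired_t_from_summary(-5.563, 13.27, 16)

# Panel A, overall conclusion (OC): mean difference -.063, std. dev. .44
t_oc, df_oc = paired_t_from_summary(-0.063, 0.44, 16)

print(f"EM: t({df_em}) = {t_em:.2f}")
print(f"OC: t({df_oc}) = {t_oc:.2f}")
```

Under this assumption the computed statistics match the paired-t values reported in Panel A to two decimals.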
Table 4

Panel A: Mixed ANOVA for Expected Misstatement
Source of Variation                  df   MS         F-Test   p-value
Within-Subjects
  Substantive Testing Outcome (O)    1    625        3.533    .070
  O x Order                          1    7.563      .043     .838
  Error(O)                           30   176.881
Between-Subjects
  Order                              1    30.25      .006     .937
  Error                              30   4711.41

Panel B: Mixed ANOVA for Likelihood of Misstatement
Source of Variation                  df   MS         F-Test   p-value
Within-Subjects
  Substantive Testing Outcome (O)    1    1.000      2.743    .108
  O x Order                          1    .063       .171     .682
  Error(O)                           30   .365
Between-Subjects
  Order                              1    .062       .056     .815
  Error                              30   1.123

Panel C: Mixed ANOVA for Overall Conclusion
Source of Variation                  df   MS         F-Test   p-value
Within-Subjects
  Substantive Testing Outcome (O)    1    .562       3.253    .081
  O x Order                          1    .250       1.446    .239
  Error(O)                           30   .173
Between-Subjects
  Order                              1    .001       .001     .998
  Error                              30   .981
Table 5
Within-Participants Mean (standard deviation) Severity Assessments and Paired Differences

                                    Entity Level   Account-Specific   Paired        Paired-t   df   p-value
                                    Assessment     Assessment         Difference
Entity-level followed by
account-specific deficiency         8.19 (1.27)    6.69 (.873)        1.50 (.816)   7.34       15   .001
Account-specific deficiency
followed by entity-level
deficiency                          8.06 (1.23)    6.63 (1.08)        1.44 (.727)   7.90       15   .001
Overall                             8.13 (1.24)    6.66 (.971)        1.47 (.761)   10.91      15   .001
Responses are on an 11-point scale (1= “not very severe,” 6 = “moderately severe,” 11= “very
severe”).
Participants responded to the following questions:
An entity level deficiency exists when the design or operation of a control at the company
level does not allow management or employees, in the normal course of performing their
assigned functions, to prevent or detect misstatements on a timely basis. An account-specific
deficiency exists when the design or operation of a control in a specific account or
transaction cycle does not allow management or employees, in the normal course of
performing their assigned functions, to prevent or detect misstatements on a timely basis.
Assume that you are evaluating a control deficiency in a key control at the end of the year.
The control operates over several transactions that result in a balance of €4 billion. Assume
that there are no compensating controls and remediation is not possible.
1. How severe is this control deficiency assuming it is an entity level control?
2. How severe is this control deficiency assuming it is an account-specific control?
Table 6
Experiment 2: Mixed ANOVA for Severity of Deficiency
Source of Variation       df   MS       F-Test   p-value
Within-Subjects
  Deficiency Type (D)     1    34.516   115.45   .001
  D x Order               1    .016     .052     .821
  Error(D)                30   .299
Between-Subjects
  Order                   1    .141     .062     .804
  Error                   30   2.253
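
To make the structure of Table 6 concrete, the sketch below recomputes each F ratio as the effect mean square divided by its error mean square. It assumes the usual mixed-design convention that within-subjects effects (D and D x Order) are tested against Error(D) and the between-subjects effect (Order) against Error; the helper name f_ratio and the variable names are hypothetical. Because the reported mean squares are rounded, the recomputed ratios can differ slightly from the tabled F values.

```python
def f_ratio(ms_effect, ms_error):
    """F statistic as the ratio of an effect mean square to its error mean square."""
    return ms_effect / ms_error


# Mean squares as reported in Table 6 (rounded values).
MS_ERROR_WITHIN = 0.299    # Error(D), df = 30
MS_ERROR_BETWEEN = 2.253   # Error, df = 30

# Within-subjects effects use Error(D) (assumed convention).
f_deficiency = f_ratio(34.516, MS_ERROR_WITHIN)
f_interaction = f_ratio(0.016, MS_ERROR_WITHIN)

# The between-subjects effect uses the between-subjects error term.
f_order = f_ratio(0.141, MS_ERROR_BETWEEN)

print(f"Deficiency Type (D): F(1, 30) = {f_deficiency:.2f}")
print(f"D x Order:           F(1, 30) = {f_interaction:.3f}")
print(f"Order:               F(1, 30) = {f_order:.3f}")
```

The large ratio for Deficiency Type relative to the near-zero ratios for the order terms corresponds to the pattern in the table: a strong deficiency-type effect with no evidence of an order effect.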
References
Asare, S., Fitzgerald, B., Graham, L., Joe, J., Negangard, E., and Wolfe, C. (2013). Auditors’
internal control over financial reporting decisions: Analysis, synthesis, and research
directions. Auditing: A Journal of Practice and Theory.
Asare, S. and Wright, A. (2012). The effect of type of material weakness on users' confidence in
the accompanying financial statement audit report. Contemporary Accounting Research.
29(1): 152-175.
Asare, S. and Wright, A. (2014). Evidence on the factors that mediate the relationship between
type of material weakness and investors' financial reporting risk assessments. Working
paper, Northeastern University.
Ashbaugh-Skaife, H., Collins, D., Kinney, W., & Lafond, R. (2009). The effect of internal
control deficiencies on firm risk and cost of equity capital. Journal of Accounting
Research 47(1): 1-43.
Baron, J., and J. Hershey. (1988). Outcome bias in decision evaluation. Journal of Personality
and Social Psychology 54 [4]: 569-579.
Bedard, J., & Graham, L. (2011). Detection and severity classifications of Sarbanes-Oxley
Section 404 internal control deficiencies. The Accounting Review 86, (3): 825–855.
Beneish, M., Billings, M., & Hodder, L. (2008). Internal control weaknesses and information
uncertainty. The Accounting Review 83 (3): 665-703.
Besch, D. (2009). Speech by SEC Staff: Remarks before the 2009 AICPA National Conference
on Current SEC and PCAOB Developments. (December 7). Available at:
http://www.sec.gov/news/speech/2009/spch120709db.htm
Bozorgi, M., Saul, L., Savage, S., & Voelker, M. (2010). Beyond heuristics: Learning to classify
vulnerabilities and predict exploits. Proceedings of the 16th ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining.
Brown, C. E., and I. Solomon. (1987). Effects of outcome information on evaluations of
managerial decisions. The Accounting Review 62 [3]: 564-577.
Clarkson, P., Emby, C. & Watt, V. 2002. Debiasing the outcome effect: The role of instructions
in an audit litigation setting. Auditing: A Journal of Practice & Theory 21 [2]: 7-20.
Earley, C., Hoffman, V., & Joe, J. (2008). Reducing management’s influence on auditors’
judgments: An experimental investigation of SOX 404 assessments. The Accounting
Review 83 (6): 1461–1485.
Einhorn, H., and Hogarth, R. (1981). Behavioral Decision Theory: Process of Judgment and
Choice. Journal of Accounting Research. (Spring): 32-41.
Emby, C., A. M. G. Gelardi, and D. J. Lowe. (2002). A research note on the influence of
outcome knowledge on audit partners' judgments. Behavioral Research in Accounting 14:
87-103.
Ernst and Young. (2007). Strengthening Internal Control Through More Effective and Efficient
Entity-Level Controls. Available at
http://www.eycom.ch/publications/items/2008_entity_level_controls/2008_ey_entity_level_controls.pdf
Evans, J. S. B., and Stanovich, K. E. (2013). Dual-process theories of higher cognition: Advancing
the debate. Perspectives on Psychological Science, 8(3), 223-241.
Fischhoff, B. (1975). Hindsight ≠ foresight: The effect of outcome knowledge. Journal of
Experimental Psychology: Human Perception and Performance: 288–299.
Gino, F., Moore, D., and Bazerman, M. 2009. No harm, no foul: The outcome bias in ethical
judgments. Working Paper 08-080. Harvard Business School.
Gramling, A., O’Donnell, E., and Vandervelde, S. (2013). An experimental examination of
factors that influence auditor assessments of a deficiency in internal control over financial
reporting. Accounting Horizons 27: 249-269.
Hammersley, J., Myers, L. and Shakespeare, C. (2008). Market reactions to the disclosure of
internal control weaknesses and to the characteristics of those weaknesses under section
302 of the Sarbanes Oxley Act of 2002. Review of Accounting Studies 13 (March).
International Auditing and Assurance Standards Board (IAASB). (2009). Communicating
Deficiencies in Internal Control to Those Charged with Governance and Management.
International Standards on Auditing (ISA) 265. International Federation of Accountants
(IFAC).
International Auditing and Assurance Standards Board (IAASB). (2009). Identifying and
Assessing the Risks of Material Misstatements Through Understanding the Entity and its
Environment. International Standards on Auditing (ISA) 315. International Federation of
Accountants (IFAC).
Jonas, G., and Gale, M., Rosenberg, A., and Hedges, L. (2007). The Third Year of Section 404
Reporting on Internal Control (May). Available at SSRN:
http://ssrn.com/abstract=985546 or http://dx.doi.org/10.2139/ssrn.985546
Kadous, K. K. (2000). The effects of audit quality and consequence severity on juror evaluations
of auditor responsibility for plaintiff losses. The Accounting Review 75 [3]: 327-341.
Kadous, K. K. (2001). Improving juror evaluation of auditors in negligence cases. Contemporary
Accounting Research 18 [3]: 425-444.
Kahneman, D. (1972). Judgment under uncertainty: Heuristics and biases. Cambridge:
Cambridge University Press.
Kahneman, D. (2011). Thinking, Fast and Slow. New York: Farrar, Straus and Giroux.
Kahneman, D., & Tversky, A. (1982). The simulation heuristic. In D. Kahneman, P. Slovic & A.
Tversky (Eds.), Judgment under uncertainty: Heuristics and biases (pp. 201-208). New
York: Cambridge University Press.
Kahneman, D., & Tversky, A. (1996). On the reality of cognitive illusions. Psychological
Review, 103(3), 582–588.
Kennedy, J. 1995. Debiasing the curse of knowledge in audit judgment. The Accounting Review
70 (2) 249-274.
Kinney Jr., W., Martin, R., & Shepardson, M. (2008). Integrated audits: Promise, pitfalls, and
alternatives. Working paper, University of Texas.
Kozuch, B., and Nichols, S. 2011. Awareness of unawareness: Folk psychology and
introspective transparency. Journal of Consciousness Studies. 18(11-12): 135-160.
Libby, R., Bloomfield, R., & Nelson, M. W. (2002). Experimental research in financial
accounting. Accounting, Organizations and Society 27:775-810.
Lipe, M. (1993). Analyzing the variance investigation decision: The effects of outcomes, mental
accounting, and framing. The Accounting Review 68 [4]: 748-764.
Lipe, M. (2008). Discussion of "Judging audit quality in light of adverse outcomes: Evidence of
outcome bias and reverse outcome bias". Contemporary Accounting Research 25 [1]:
275-282.
Lowe, D., and Reckers, P. (1997). The influence of outcome effects, decision aid usage, and
intolerance of ambiguity on evaluations of professional audit judgement. International
Journal of Auditing 1 [1]: 43-58.
Nisbett, R., and L. Ross. (1980). Human Inference: Strategies and Shortcomings of Social
Judgment. Englewood Cliffs, NJ: Prentice-Hall.
Peecher, M.E., and M. D. Piercey. (2008). Judging audit quality in light of adverse outcomes:
Evidence of outcome bias and reverse outcome bias. Contemporary Accounting Research
25 [1]: 243-274.
PricewaterhouseCoopers (PWC). (2004). Sarbanes-Oxley Act: Section 404, Practical Guidance
for Management. (July).
Public Company Accounting Oversight Board (PCAOB). (2007). An Audit of Internal Control
over Financial Reporting that is Integrated with an Audit of Financial Statements.
Standard No. 5.
Public Company Accounting Oversight Board (PCAOB). (2009). An Audit of Internal Control
over Financial Reporting that is Integrated with an Audit of Financial Statements.
Standard No. 5. Guidance for auditors of smaller public companies, Staff views.
Available at http://pcaobus.org/Standards/Auditing/Documents/AS5/Guidance.pdf
Reffet, A. (2010). Can identifying and investigating fraud risks increase auditors’ liability? The
Accounting Review (85): 2145-2167.
Roese, N. J. (1997). Counterfactual thinking. Psychological Bulletin, 121, 133-148.
Rose, J. M., C. S. Norman, and A. M. Rose. (2010). Perceptions of investment risk associated
with material control weakness pervasiveness and disclosure detail. The Accounting
Review 85 (5):1787-1807.
Ross, M., and F. Sicoly. (1982). Egocentric biases in availability and attribution. In D.
Kahneman, P. Slovic and A. Tversky Eds., Judgment Under Uncertainty: Heuristics and
Biases pp. 179-189. Cambridge, MA: Cambridge University Press.
Sarbanes-Oxley Act of 2002. Public Law 107–204, July 30, 2002, 116 Stat. 745.
Slovic, P. (1987). Perception of risk. Science (236) April: 280–285.
Tan, H. T., Libby, R., & Hunton, J. (2002). Analysts’ reactions to earnings preannouncement
strategies. Journal of Accounting Research. 40 (1): 223-246.
Tversky, A. and D. Kahneman (1981) The Framing of Decisions and the Psychology of Choice,
Science, 211 (4481): 453-458.
The Institute of Internal Auditors (IIA). (2008). Sarbanes-Oxley Section 404: A Guide for
Management by Internal Control Practitioners. The Institute of Internal Auditors.
Available at www.theiia.org/download.cfm?file=31866
Weber, E. (1988). A descriptive measure of risk. Acta Psychologica (69) November: 185–203.
Weber, E., and W. Bottom. (1989). Axiomatic measures of perceived risk: Some tests and
extensions. Journal of Behavioral Decision Making (2) April-June: 113–131.
Weber, E., and W. Bottom. (1990). An empirical evaluation of the transitivity, monotonicity,
accounting, and conjoint axioms for perceived risk. Organizational Behavior and Human
Decision Processes (45) April: 253– 275.
Whitehouse, T. (2010). SEC Curious About Drop in Material Weaknesses. Compliance Week
(February 7). Available at: http://vlex.com/vid/sec-curious-drop-material-weaknesses229098255
Wolfe, C., Mauldin, E., & Diaz, M. (2009). Concede or deny: Do management persuasion tactics
affect auditor evaluation of internal control deviations? The Accounting Review 84, (6):
2013–2037.
Wright, A. and S. Wright. 2014. Modification of the audit report: Mitigating investor attribution
by disclosing the auditor’s judgment process. Behavioral Research In Accounting
(forthcoming).