The Value of Truth and the Optimal Standard
of Proof in Legal Disputes
Michael L. Davis
Southern Methodist University
This article models the fact-finding process as one in which a defendant is accused of violating some legal standard (e.g., exceeding the speed limit). The fact finder cannot observe the actual behavior but observes some variable that is a noisy signal of the truth (e.g., the reading from a radar detector). From this signal it is possible to calculate the probability of guilt. The standard of proof (the minimal probability necessary to convict) that minimizes the expected cost of error is seen as a trade-off between the expected cost of false acquittals and convictions. A simple example shows that a clearer signal of truth (a more reliable radar detector) may not always lower error costs, because the utility of a fact-finding method depends also on the population of defendants and the standard of proof. These results clarify the source of disagreements over different fact-finding procedures.
1. Introduction
Imagine that you're a judge being asked to rule on the admissibility of evidence produced by some new technology—say, the use of DNA matching
tests in rape cases, or whether to employ some new breath analyzer for testing
the blood alcohol content of suspected drunk drivers. Although your decision
must consider a range of factors (e.g., the rights of defendants to withhold
testimony), a critical and in some sense more basic sequence of questions is,
first, whether the new technology represents a more accurate way of discovering the truth and, second, whether the increased accuracy is worth any additional cost.
The question of whether to admit a particular sort of evidence is but an
example of a more general problem of legal procedure. The right to a trial by
jury, limitations on discovery, and a host of other rules define a technology for
resolving uncertainty, both as to facts and law. When considering almost any
aspect of procedure, its utility as a means of sorting out the truth will surely be
an important issue.
A natural way of quantifying the accuracy of a set of procedural rules is to calculate the expected cost of error associated with that technology. If we can say that one set of procedures yields, on average, lower error costs, this
should be weighed in the comparison along with other factors (administrative cost, respect for the individual, etc.). The calculation of expected error costs
actually requires two steps: first, calculating what kinds of mistakes will be
made by a particular technology (e.g., what percentage of drunk drivers will
pass a breath analyzer test despite their guilt), and, second, determining how
expensive these mistakes are likely to be. Clearly, this second step is more
difficult and controversial—would you, for example, prefer to acquit 10
murderers rather than send one innocent to the gallows? However, even the
step of determining the errors generated by different fact-finding technologies
is much more complex and subtle than it might appear at first glance. This
article uses the techniques and tools of economics and decision theory to
highlight these complexities.
The formal model presented here offers two sorts of insights. First, it shows
that questions of what should be the optimal standard of proof and the optimal
technology are completely intertwined. (In this context the standard of proof
means the minimum probability of guilt necessary before convicting.) That is,
I show that the optimal standard of proof will depend on the type of technology employed—for example, if a different type of breath analyzer is used, a
different standard may be appropriate. Conversely, if the standard of proof is
changed, a fact-finding technology that was once rejected as causing too many
expensive errors may appear to lower error costs.
In fact, the standard of proof plays a critical and curious role throughout
this analysis. From the perspective of a decision theorist modeling legal
procedure as a means of minimizing error costs, the standard of proof is
viewed as a control variable to be adjusted in response to changing circumstances. In practice, however, the standard appears to be beyond the discretion
of trial courts and perhaps even higher authorities. In other words, there may
be substantial institutional constraints preventing the standard from being
adjusted to the level that would minimize expected error costs. Insofar as this
is true, it may explain why some fact-finding technologies are not used. But it
also raises the question of why the standard is not adjusted to account for
differences in technology and error costs. I briefly consider this issue in
Section 2.2.
In addition to illustrating the importance of the standard of proof, the model
demonstrates how the ranking of fact-finding technologies is interrelated with
both the costs associated with different types of errors and the population to be
judged using the technology. This is important insofar as it shows that the
technical characteristics of a method of determining facts are never a sufficient
basis for judging the desirability of that method.1 Thus, for example, a less
reliable breath analyzer—in the sense that it provides a noisier signal of the
truth—might actually lower the total cost of error when applied to a particular
1. The exception is a technology that never has any error. But since this seems both unlikely and uninteresting, I assume that no technology is perfect.
population. Some specific examples as well as a more complete discussion
appear in Section 3.
By showing that the value of some fact-finding technology depends on the
standard of proof used and the characteristics of the population being tested, it
becomes much easier to identify the root causes of disagreements over legal
procedures. The formal model makes clear that there are at least three reasons
people might disagree about which sorts of procedures are best. First, the
dispute might center on the reliability of the specific procedure—for example,
how often do technicians misclassify a DNA sample, or how reliable is a
child's testimony about abuse. Second, people might not agree on the characteristics of the population of suspects to be judged by a set of procedures—for
example, some people might think that sexual abuse of children is extremely
rare while others think pedophilia is rampant. Third, if there is no consensus
as to the optimal standard of proof, possibly because of different assessments
of the cost of error, there is likely to be disagreement as to the optimal way of
discovering the truth.
In a recent article, Kaplow (forthcoming) also considers the value of accuracy, describing his work as an "attempt to illuminate the following sort of
inquiry: If a contemplated legal reform would increase accuracy in some
specified manner and increase cost by a determined amount, is the reform
desirable?" This article can be seen as, in some sense, complementary, in that
it relies on a formal model to specify more precisely the manner in which a
legal reform increases accuracy. Here I follow in the tradition of much of the
economic analysis of legal procedure in assuming that at least some important
consequences of rules can be quantified as economic costs.2 On the more
narrow problem of factual uncertainty, much of the previous research has
dealt with the question of how changes in the error rate affect the behavior of
the agents involved (e.g., does an increase in Type I error lead to more or less
crime).3 In this article I take the behavior of potential offenders as exogenous.
There is also a literature discussing the optimal standard of proof, much of
which looks at how changes in the standard will influence the expenditures
made by defendants or plaintiffs.4 For my purposes, discussion of the optimal
standard of proof is primarily a stepping stone—albeit a major one—in considering the value of different fact-finding technologies. Thus, I don't explicitly consider how the parties to a legal dispute will alter their behavior in
response to changes in the standard, although one might argue that their
responses are implicitly reflected in the model.
The next section presents the formal model and a discussion of the optimal
2. See, for example, Tullock (1971, 1980), Posner (1973), Ehrlich and Posner (1974), and
Wittman (1974).
3. Examples include Craswell and Calfee (1986), Png (1986) and Polinsky and Shavell
(1989).
4. Rubinfeld and Sappington (1987) examine how defensive efforts will change in response to changes in the standard, and Miceli (1990) looks at the relation between prosecution efforts and the standard.
standard of proof. Section 3 considers how we might characterize a given
technology as better or worse. Section 4 offers a short summary and some
suggestions for further research.
2. The Optimal Standard of Proof
2.1 The Model
If the minimization of error costs is taken as an important objective in analyzing procedure, it is useful to exploit the obvious similarity between most legal
disputes and standard statistical tests. Thus, the "burden of proof" (e.g., the
presumption of innocence in criminal cases) can be viewed as a statement of
the null hypothesis, while the "standard of proof" defines the degree of
confidence—as measured by the probability of being wrong—that must be
reached before the null hypothesis is rejected. Following this paradigm, incorrectly rejecting the null hypothesis (e.g., convicting an innocent) is described
as a Type I error, while incorrectly accepting the null (acquitting a guilty
party) is referred to as a Type II error. In my discussion, I use the language of
criminal procedure (guilt or innocence) and assume that the null hypothesis is
the presumption of innocence. It should be clear, however, that the analysis
applies to all sorts of legal disputes and even many extralegal issues. Indeed,
the critical feature of the analysis is not the legal environment but simply the
need to choose between only two alternatives (accept or reject) when there is
uncertainty.5
Imagine that a violation of the law is defined as allowing the level of some
continuous variable to rise above a certain amount. Let x represent the difference between what the defendant actually did and the legal limit, with positive
values of x representing a violation. For example, if the charge is drunk
driving, x might represent the difference between the defendant's actual blood
alcohol content and the legal limit. For simplicity, all positive x are taken as
the same crime. In this context, the legal procedure is a test of the null
hypothesis, x ≤ 0, and rejection of the null implies a finding of guilt. (Degree
of culpability can be accommodated in the model simply by regarding more
serious offenses as different crimes.)
When a particular defendant is brought to court, x cannot be observed directly, but rather the court must rely on a set of procedures—the trial technology—to obtain an observed value of x, denoted x°. The trial technology is not perfect and so generates an observed value described by x° = x + e, where e, representing error, has some distribution f(e). If this distribution of
errors is known, then whenever an observation on x is made, it is possible to
calculate the probability of guilt, denoted p°. Specifically, letting F(·) indicate the cumulative distribution function of f(e),

$$p^o = \Pr[x > 0 \mid x^o] = \Pr[x^o - e > 0] = \Pr[e < x^o] = F(x^o). \tag{1}$$
5. Any parent who has been called on to arbitrate a dispute among rival siblings over an
indivisible toy knows the extent of the problem. Other examples include the need to make a
decision to fire an employee, or pass a student.
For example, if the court admitted into evidence the results of a breathalyzer
reading showing a drunk driving suspect's blood alcohol level to be .02
percent above the legal standard, the court might conclude that there was a 90
percent chance that the suspect's true blood alcohol was above the limit. If p^s defines the standard of proof, the suspect is convicted if p° > p^s. A different fact-finding technology can be thought of as altering the distribution of e.
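To make Equation (1) concrete, the sketch below (Python) computes p° for an unbiased, normally distributed error; the standard deviation of .015 is a hypothetical property of the machine, chosen so that a reading .02 above the limit implies roughly the 90 percent figure in the example above.

```python
from math import erf, sqrt

def prob_guilt(x_obs, sigma):
    """Equation (1): p = Pr[x > 0 | x_obs] = F(x_obs) when the error e is
    distributed N(0, sigma^2), so F is the normal CDF evaluated at x_obs."""
    return 0.5 * (1.0 + erf(x_obs / (sigma * sqrt(2.0))))

# A reading .02 above the legal limit on a machine whose error has a
# (hypothetical) standard deviation of .015 implies roughly a 91 percent
# probability that true blood alcohol exceeds the limit.
print(round(prob_guilt(0.02, sigma=0.015), 2))   # 0.91
```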
Applying a particular trial technology to a population of suspects will result
in a distribution of observed probabilities of guilt in that population, denoted
H(p°). To give an example, it may be that if Texas allows videotapes of drunk driving suspects to be shown in court, one-third of the defendants will be considered to have less than a 5 percent probability of guilt (H(.05) = 1/3) and two-thirds of the defendants will be considered to have less than a 95 percent probability of guilt (H(.95) = 2/3). This distribution is critical to the analysis because it gives the proportion of the population who will be acquitted when a
because it gives the proportion of the population who will be acquitted when a
given trial technology and standard of proof are used. To extend the example,
if the standard of proof used in judging drunk driving cases results in acquittals when there is less than a 95 percent chance that the defendant is guilty,
two-thirds of the suspects will be acquitted.
This distribution function is also necessary to calculate the percentage of the population falsely convicted and acquitted. The proportion of false convictions is given by

$$T_1(p^s) = \int_{p^s}^{1} (1 - p)\, dH(p). \tag{2}$$

The proportion of false acquittals is given by

$$T_2(p^s) = \int_{0}^{p^s} p\, dH(p). \tag{3}$$
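As a minimal sketch of Equations (2) and (3), the fragment below evaluates both for a discrete distribution of observed probabilities; the point masses at .03, .60, and .97 are hypothetical values chosen only to be consistent with the videotape example (H(.05) = 1/3, H(.95) = 2/3).

```python
def error_shares(H, p_std):
    """H: list of (p, weight) pairs, a discrete analogue of dH(p).
    Returns (T1, T2), the shares of false convictions and false acquittals
    when suspects with p >= p_std are convicted (Equations (2) and (3))."""
    t1 = sum((1 - p) * w for p, w in H if p >= p_std)  # convicted but innocent
    t2 = sum(p * w for p, w in H if p < p_std)         # acquitted but guilty
    return t1, t2

# Hypothetical masses consistent with H(.05) = 1/3 and H(.95) = 2/3.
H = [(0.03, 1/3), (0.60, 1/3), (0.97, 1/3)]
t1, t2 = error_shares(H, p_std=0.95)
print(round(t1, 3), round(t2, 3))  # T1 = (1-.97)/3 = .01; T2 = (.03+.60)/3 = .21
```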
The complication in all this, of course, is that to know the distribution of
observed probabilities of guilt we must know something about the population of suspects to be tested (i.e., we have to know the distribution of x in the
population). In other words, it is impossible to predict how a particular technology will classify a population of suspects without some a priori judgment
about the population. Suppose, for example, that police in two separate towns
used the same standards for selecting drivers to take a breath analyzer test
(improper lane changes, erratic speed, etc.). If one town were inhabited
mainly by elderly people, a large percentage of the population would pass the
test (their unsafe driving resulting from something other than drunkenness).
Thus, there might be a small percentage of false convictions simply because there would be a small percentage of the population convicted. If the
other place were a college town inhabited by libertine faculty and students,
there might be a large percentage of convictions, and hence a larger percentage of false convictions.
To express this formally, let G(x) be the cumulative distribution function
describing the level of x in the population. Thus, in the drunk driving problem, G(0) gives the percentage of the population of suspects with a blood
alcohol content below the legal limit.
When a suspect drawn from this population is examined with the trial
technology, the result will be an observation, x°. The distribution of these observations in the population can be derived as a convolution of the distributions G(x) and F(e):

$$H^x(x^o) = \int_{-\infty}^{\infty} F(x^o - x)\, dG(x). \tag{4}$$
In the drunk driving example, H^x(0) gives the percentage of the population for
whom the breath analyzer can be expected to measure a blood alcohol content
less than the legal limit.
Since the observed probability of guilt is a monotonic transformation of the
observed behavior, the distribution of observed probabilities of guilt is
$$H(p) = \int_{-\infty}^{\infty} F[F^{-1}(p) - x]\, dG(x). \tag{5}$$
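The following sketch evaluates Equations (4) and (5) numerically; the normal error and normal population are illustrative assumptions of mine, not part of the model.

```python
import numpy as np
from scipy.stats import norm

# Illustrative primitives (assumptions mine): measurement error e ~ N(0, 1)
# and population behavior x ~ N(0, 4), where x is actual behavior minus the
# legal limit, as in the text.
s_e, s_x = 1.0, 2.0
xs = np.linspace(-12.0, 12.0, 4001)     # integration grid over the population
dx = xs[1] - xs[0]
gx = norm.pdf(xs, scale=s_x)            # population density g(x) = G'(x)

def Hx(x_obs):
    """Equation (4): share of the population whose reading falls below x_obs."""
    return float(np.sum(norm.cdf(x_obs - xs, scale=s_e) * gx) * dx)

def H(p):
    """Equation (5): share judged to have probability of guilt below p.
    norm.ppf(p, scale=s_e) is F^{-1}(p), the reading that implies probability p."""
    return Hx(norm.ppf(p, scale=s_e))

print(round(Hx(0.0), 3))   # 0.5 by symmetry: half measured below the limit
print(round(H(0.9), 3))    # share judged less than 90 percent likely to be guilty
```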
It is an open question how one learns the underlying distribution of the
population, and it could be that this is informed by the technology itself. For
instance, controlled experiments with a breath analyzer might reveal it to be
unbiased (i.e., the expectation of the error is zero) and so experience with the
machine in the field perfectly characterizes the population (although still
imperfectly categorizing a given suspect). In other cases, prior beliefs about
the distribution of suspects may hinge on other more subjective factors. The
essential point is that it is impossible to say how well a technology is going to
work with a population of suspects without prior beliefs about that population.
It would seem reasonable to suppose that the total cost to society of each
type of error will be an increasing function of the total number of errors.
However, to simplify the notation I assume that the size of the population is
constant, and write error costs as depending on the proportion of errors. Let
the cost of each type of error be given by a monotonically increasing function,
C_i(T_i), where i = 1, 2 depending on the type of error. The standard of proof that minimizes the combined error costs, C_1[T_1(p^s)] + C_2[T_2(p^s)], should satisfy the first-order condition

$$p^* = \frac{C_1'[T_1(p^*)]}{C_1'[T_1(p^*)] + C_2'[T_2(p^*)]}. \tag{6}$$
This first-order condition says that the optimal standard reflects the relative marginal costs of the two types of error. If false convictions are, on the margin, very costly relative to false acquittals, the optimal standard approaches 1. If the marginal cost of each type of error is identical, the optimal standard is 1/2.
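With linear cost functions C_i(T_i) = c_i T_i, Equation (6) reduces to p* = c_1/(c_1 + c_2). The sketch below (Python; the cloud of observed probabilities is hypothetical) confirms this by direct grid search.

```python
import numpy as np

# Hypothetical population of observed probabilities of guilt.
rng = np.random.default_rng(0)
probs = rng.uniform(0.0, 1.0, 10_000)        # observed p for 10,000 suspects

def total_cost(p_std, c1, c2):
    convicted = probs >= p_std
    T1 = np.mean((1 - probs) * convicted)    # false-conviction share, Eq. (2)
    T2 = np.mean(probs * ~convicted)         # false-acquittal share, Eq. (3)
    return c1 * T1 + c2 * T2

c1, c2 = 9.0, 1.0                            # false convictions 9x as costly on the margin
grid = np.linspace(0.0, 1.0, 1001)
best = grid[np.argmin([total_cost(p, c1, c2) for p in grid])]
print(round(best, 3), c1 / (c1 + c2))        # both approximately 0.9, as Eq. (6) predicts
```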
I should point out that this formulation treats both the behavior of the
population and the error cost functions as exogenous. Thus, the model ignores
the possibility that, as the standard of proof or the trial technology changes,
potential suspects may alter their behavior—formally, the distribution G(x) is assumed to be invariant to the standard p^s and independent of the technology F(e). Of course, this assumption ignores one branch of research into the problem of legal error, which has
focused precisely on the question of how error affects behavior (e.g., Polinsky
and Shavell, 1989).
It may also be that the cost of error depends on the consequence of error.
Perhaps the best example is capital punishment. It is certainly a more grievous
mistake to falsely convict a murderer if the punishment is death rather than
life in prison. Thus, if the standard of proof for murder were raised or if
courts discovered a way to lower the number of false convictions, the form
of punishment—and hence the cost of the mistakes that are made—might
change. To include these factors would greatly complicate the analysis without, I believe, contradicting any of the conclusions that I draw.
2.2 Discussion
In this section I consider first the generality of the model, and then discuss
some problems associated with finding the optimal standard of proof.
2.2.1 Subjective and Objective Guilt. In the model the judicial decision
depends on the measurement of the level of some continuous variable. While
many legal issues cannot be described in such a precise way, many can.
Almost any question involving compliance with some technical standard fits
this model. Speeding, for instance, is determined by comparing a defendant's
observed speed against the speed limit. Similarly, environmental laws define a
level of acceptable emissions, which are compared with measurements of the
defendant's emissions. It is also quite common to model civil cases as disputes over the level of some continuous variable. For example, negligence is
often described as the failure to take a cost-effective precaution against an
accident, and the level of precaution is measured by the amount of money or
time expended.6
Many other legal issues differ from this paradigm only to the extent that the
objective standard being measured involves a discontinuous variable. The
question of whether A stole B's car depends not on how long the car was kept
or how far it was driven, but only on whether A did or did not possess the car
without B's permission. Similarly, a number of civil cases involve a discontinuous variable. For example, a contract dispute might center around the
question of whether A did or did not deliver the goods. My model could be
6. See, for example, Shavell (1987).
adjusted to accommodate such discontinuities by interpreting the variable x as
a dummy variable indicating some objective state. The fact-finding technology could then be thought of as giving a stochastic indication of the true value
of this state based on some observations. To introduce such discontinuities
would add some technical complexity to the analysis, but would not disturb
the conclusions that I offer below.
A more serious challenge to the generality of this model arises from legal
disputes involving more subjective questions. Criminal law in particular often
inquires into the defendant's state of mind. Cases of marital rape, for instance,
would seem a great deal murkier than car theft. But even these cases are
structurally similar to those described by the model insofar as the fact-finding
technology considers a body of evidence and produces what can be thought of
as a probability of guilt or innocence. On the other hand, some care should be taken in extending the model into areas where there is not a clear and objective notion of the legal standard.
2.2.2 On the Nature of Error Costs and the Role of Trial Courts in Setting
Standards of Proof. The specific question is this: If the goal of legal procedures is to minimize error costs, why impose a single standard of proof for a
particular class of cases rather than allow trial courts to set the standard on a
case-by-case basis? After all, the trial court is in an ideal position to observe
the characteristics of a particular defendant and determine at least some of the
costs of error. A court might decide, for example, that a particular murder
defendant is so menacing that a false acquittal would be a very expensive
mistake since the defendant is very likely to commit some kind of crime in the
future. Such a judgment would result in setting a low standard of proof for a
particular defendant. Of course, the phrases "proof beyond a reasonable
doubt" or "proof by a preponderance of the evidence" may be so vague as to
allow courts this flexibility in setting the standard of proof. However, this
analysis suggests some reasons why courts should be constrained.7 Since
what matters in setting the standard is the relative marginal error costs, there
are two serious objections to allowing trial courts the freedom to set their own
standards. First, these marginal cost functions are almost certainly nonlinear
and so the marginal cost of any error will depend on the total number of
errors. For example, the marginal cost of falsely convicting a defendant
includes the cost of administering the punishment. As prisons become more
crowded, it seems likely that the cost of imprisonment rises. If the marginal
cost of error varies with the total number of errors, a trial court that wished to
7. Of course, there may be noneconomic explanations for refusing trial courts discretionary
power over the standard of proof. In particular, judging different defendants by different standards
may violate the fundamental notion of equal justice. But this argument is not entirely persuasive
since trial courts often have considerable discretion in other aspects of the process, such as
sentencing—if convicted, the despicable defendant is likely to receive a longer sentence precisely
because he or she is considered a threat.
set the optimal standard would need information about facts not directly
before it.8
The second problem created by nonlinear error costs is that, even if the trial
court knew the marginal cost of error, efficiency might require that it treat
similarly situated defendants differently. In effect, the trial court would be
involved in a sort of "price discrimination" in setting the standard. For instance, a vicious murder defendant who happened to come to trial at a time
when the relative cost of false conviction was high would be judged by a
lower standard than an equally reprehensible defendant who came to trial in
different circumstances. Although this sort of discrimination might reduce
error costs, it might introduce other sorts of inefficiencies by making the
judicial system less predictable.9 While we might not object to this on efficiency grounds, this sort of price discrimination would seem fundamentally
unjust. It is difficult to imagine a judge telling a jury, "The warden informs me
that the prison is at capacity and that if we convict the defendant, the guards
will have to be paid overtime. And so if you vote to convict, I want you to be
very sure (say 99 percent) that you are making the right decision. Today, at
least, a mistake is very expensive."
3. The Value of Truth
We now turn to the question of when one method of discovering truth should
be considered superior to another. The expected error cost functions developed above would seem a natural point of departure for such analysis since a
conclusive demonstration that some method lowered expected error would
comprise a strong argument in its favor. However, a careful analysis of the
model shows just how unlikely such a conclusive demonstration is. Indeed,
the main message seems to be that normative judgments about fact-finding
technologies can almost never be made purely on the objective characteristics
of the technology. Rather, such judgments will hinge on at least three other
factors: (i) a priori assessment of the cost of error, (ii) a priori assessment of
the distribution of suspects to be judged by the technology, and (iii) the
mechanism for setting the standard of proof.
Now this claim may seem strongly counterintuitive. It would appear reasonable, after all, to suppose that there are some types of legal procedures that
do such an obviously bad job of revealing the truth that all sensible people
8. Tullock (1980) makes a similar point in discussing the bias in favor of criminal defendants
implied by the standard of proof beyond a reasonable doubt. He argues, "The real problem here is
analogous to the problem of public goods in economics" (p. 85). In particular, the costs of the
additional crimes caused by false acquittals may not be observed by the court since they are
spread out among an anonymous society.
9. There is, of course, an extensive literature dating at least to Becker (1968) discussing the
efficiency of randomness in the enforcement of law, some of which suggests that some uncertainty
may be desirable. But, even if this is true, chance can be introduced in a much more obvious way
by manipulating the enforcement effort and/or penalties.
should reject them, no matter what their beliefs about these other factors.
Why, for example, would a traffic court judge ever prefer the evidence of an
old, uncalibrated, radar speed detector to a modern, well-maintained, laser
speed detector? Similarly, would an honest judge ever prefer a confession
obtained by torture to a confession resulting from the usual constitutional
safeguards? These are, of course, extreme examples, but it turns out that no
matter how bad a fact-finding method might seem, there is always some
situation where it will generate lower error costs than a competing technology.
To show why this is so, let F^a(e) and F^b(e) define two different ways of
evaluating a suspect. My claim is that no matter how bad one method of
evaluation might seem in comparison to the other, it is not possible to rank
them without reference to a specific set of error cost functions and distribution
of suspects. That is, the following proposition is true.
Proposition 1. For any two different distribution functions, F^a and F^b, it is possible to find a distribution of suspects and a pair of error cost functions such that total error costs will be lower under technology A, and another set of circumstances under which costs will be lower under technology B.
The details of the complete proof are a bit tedious and so are banished to the
Appendix, but the intuition can be grasped with a simple numerical illustration. Imagine two breath analyzers, A and B. Suppose that with each test done
with A, there is a 50 percent chance of underestimating the defendant's true blood alcohol by 1 unit (e = -1) and a 50 percent chance of overestimating by 1 unit (e = 1). Machine B also has a 50 percent chance of understating or overstating the truth, but B always makes bigger mistakes, with an error of -2 or +2. By the usual definition and common intuition, B is noisier than A (formally, it is easily shown that distribution B is a mean-preserving spread of A). Machine B, however, might be the better choice.
To see why, suppose that all innocent people have a true blood alcohol of -1 and all guilty people have a true blood alcohol of 1 (x = -1 or x = 1). Table 1 gives the observed values possible with each machine—for example, if machine A understates the blood alcohol of an innocent person, the reported level will be -2. When applied to this population of suspects, machine A can produce only three observed values: -2 (resulting from x = -1 and e = -1), 0 (x = -1, e = 1; or x = 1, e = -1), or 2 (x = 1, e = 1). When machine A is applied to this population of suspects, an observed blood alcohol of 0 can represent either innocence or guilt and hence there will be some error costs no matter where the standard is set. Machine B, on the other hand, produces four distinct signals and so there will never be an error.10
10. Because this example involves discrete values of e and x, it may make sense to acquit some people who actually have a higher observed level of x. When machine B is used, we acquit the suspect who presents with an x° = 1 (x = -1, e = 2) and convict the suspect with an x° = -1 (x = 1, e = -2). Notice, however, that we are not convicting the person with the higher probability of guilt.
Table 1. Observed Blood Alcohol Levels Possible Using Each Type of Breath Analyzer

                              Machine A                     Machine B
Type of defendant     Under (e = -1)  Over (e = 1)   Under (e = -2)  Over (e = 2)
Innocent (x = -1)          -2              0              -3              1
Guilty (x = 1)              0              2              -1              3
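A small enumeration of this example (Python) makes the comparison mechanical: machine A's possible readings overlap at 0, so some error is unavoidable, while each of machine B's four readings identifies the defendant's true state.

```python
from itertools import product

# Machine A errs by +/-1, machine B by +/-2, on a population whose true blood
# alcohol relative to the limit is -1 (innocent) or +1 (guilty), as in Table 1.
population = [-1, +1]

for name, errors in [("A", [-1, +1]), ("B", [-2, +2])]:
    # map each possible reading to the set of true states that can produce it
    signals = {}
    for x, e in product(population, errors):
        signals.setdefault(x + e, set()).add(x)
    ambiguous = {s for s, states in signals.items() if len(states) > 1}
    print(f"machine {name}: readings {sorted(signals)}, ambiguous {sorted(ambiguous)}")

# machine A: readings [-2, 0, 2], ambiguous [0]    -> errors are unavoidable
# machine B: readings [-3, -1, 1, 3], ambiguous [] -> every reading reveals the truth
```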
Now one should not take Proposition 1 and this example as an argument for
some sort of procedural nihilism. It is possible to rank competing legal procedures in a fairly broad way—that is, to obtain a ranking that should command
broad, if not universal, agreement. But in seeking such a ranking, we must
focus not only on the narrow technical qualities of procedure but also on the
population to be judged by the technology.
In fact, the formal model permits precisely such a comparison. Recall that
when a given set of procedural rules is used to evaluate a population of
suspects, it is possible to determine the proportion of that population who will
be judged to have a probability of guilt less than some amount—that is, it is
possible to derive the distribution H(p). This distribution gives the proportion
of the population who will be acquitted when a given standard of proof is
used, H(p^s), as well as the proportion of false acquittals and convictions.
If two different technologies are being considered for use on a population of
suspects, there are two possible distributions of probabilities of guilt, H^a(p) and H^b(p). It turns out that in some instances it will be possible to say that the procedures which define the distribution H^a(p) are better than the alternative
no matter what the source of error cost. Suppose, for instance, that at some
time in the past authorities had an intense desire to avoid false convictions and
so established elaborate rules to insure that all confessions were truly voluntary. Even if attitudes are now reversed, so that the prime concern is to avoid
false acquittals, the old procedures may have certain characteristics that still
make them superior. Further, the rule for ranking has a clear intuitive interpretation, and also illustrates the critical role played by the standard of proof.
Formally, the following is true:
Proposition 2. If H^a(p) is a mean-preserving spread of H^b(p) and if the
standard of proof is optimal, then the total cost of error will be less whenever
technology A is used.
The proof is given in the Appendix.
Here I first provide an intuitive explanation as to why a spread in the
distribution of observed probabilities would normally lead to a reduction in
error costs. I then present an example to demonstrate why, if the standard
were not optimally set, a change in technology that would otherwise be an improvement might actually make things worse.

[Figure 1. Distributions of observed probabilities of guilt for a perfectly discriminating technology and two imperfect technologies.]
Suppose that on a given weekend 300 people have been arrested for drunk
driving and it is believed that 150 are guilty. When applied to this population,
then, a perfect technology (i.e., the technology such that x° = x) would assign
to half of this population a 100 percent probability of guilt and assign the other
half a 0 percent probability of guilt. In other words, the perfect technology
discriminates perfectly, leaving no doubt as to who was truly guilty. The
distribution of observed probabilities of guilt generated by testing this population of suspects with the perfect technology is bimodal, with half the distribution concentrated at p° = 0 and the other half concentrated at p° = 1. This is
described by the two heavy vertical bars in Figure 1.
Now suppose that the perfect technology is not available and we are forced
to pick between imperfect technologies A and B. Both technologies classify
suspects into one of two groups. Under technology A, half the suspects are
judged to have a 55 to 65 percent chance of guilt and the other half are judged
to have a 35 to 45 percent chance of being guilty, as shown in the unshaded
bimodal distribution of Figure 1. Technology B also divides the suspects into
two groups but gives better discrimination between suspects. When B is used,
half the suspects are judged to have a 25 to 35 percent chance of guilt, the
other half to have a 75 to 85 percent chance of guilt, as represented by the
shaded distribution in Figure 1. Formally, distribution B is a mean-preserving
spread (MPS) of distribution A. When the distribution of observed probabilities is spread out, we get closer to the ideal distribution. Thus, we expect
error costs to be lower under technology B than under A. Proposition 2
confirms this intuition, provided the standard of proof is optimal for each
technology.
Again, we see that the standard of proof plays a key role in evaluating
technologies: if you think the standard of proof is improperly set, you may
oppose the use of fact-finding technologies that better discriminate between
suspects. To see why, suppose that you are an appellate judge who is being
asked to rule on the admissibility of DNA matching tests in rape cases where
the victim cannot identify the assailant. Imagine that your experience as a trial
court judge leads you to believe that without the tests, the lower courts will
observe a distribution of cases like that given by technology A in Figure 1.
Your research leads you to believe that these tests do work (in the sense that
they provide better discrimination between suspects) and that if they are used
for a large number of cases will result in a distribution like that shown for
technology B.
Despite the fact that you believe that DNA matching tests reduce uncertainty, you may still not allow their use if you think the new test will be used in
conjunction with the wrong standard. Suppose that you think the costs of false
convictions are very high and hence you want to convict only those for whom
there is at least a 90 percent chance of guilt (i.e., you believe p^s = .9). In this
case, you don't believe that any of these suspects should be convicted either
with or without the DNA test since none of them meet the standard under
either technology. However, your experience as a trial court judge leads you to
believe that juries will convict when there is at least a 2/3 probability of guilt.
Thus, when technology A is used, no one will be convicted. However, when
DNA tests are used, half the suspects will be convicted. Since you think the
costs of false convictions are so high, the greater number of convictions that
will result when the DNA test is used may lead you to reject the new technology. To summarize, Proposition 2 and the example show that better information may not be valuable if it is used incorrectly.11
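A numerical sketch of this example may help (Python; the point masses and linear cost functions are simplifying assumptions of mine, with A collapsed to probabilities .4 and .6 and B to .2 and .8, so that B remains a mean-preserving spread of A). With marginal costs that make the optimal standard .9, neither technology convicts anyone and error costs tie; under the jury's 2/3 standard, technology B triggers convictions the judge regards as far too cheap, and its error cost rises above A's.

```python
def cost(H, p_std, c1=9.0, c2=1.0):
    """Total error cost with linear costs: c1 * T1 + c2 * T2 (Eqs. (2)-(3))."""
    t1 = sum((1 - p) * w for p, w in H if p >= p_std)   # false convictions
    t2 = sum(p * w for p, w in H if p < p_std)          # false acquittals
    return c1 * t1 + c2 * t2

A = [(0.4, 0.5), (0.6, 0.5)]        # hypothetical point-mass version of technology A
B = [(0.2, 0.5), (0.8, 0.5)]        # a mean-preserving spread of A

# At the judge's preferred standard p* = 0.9 no one is convicted under either
# technology, so the sharper machine does no harm (costs tie here):
print(cost(A, 0.9), cost(B, 0.9))   # 0.5, 0.5

# But if juries actually convict at p = 2/3, B produces convictions that the
# judge regards as too cheap, and B now looks worse:
print(cost(A, 2/3), cost(B, 2/3))   # 0.5 versus 1.0
```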
4. Conclusions
One way to understand the essential point of this article is to ask why well-meaning and well-informed individuals often have substantial disagreements
about the most effective way to discover the truth regarding factual disputes.
By presenting a formal model of factual uncertainty in which the objective is
to minimize the expected cost of mistakes, it is possible to isolate the sources of such disputes. In particular, I show that disagreements can arise not only
because people make different judgments as to the technical characteristics of
different procedures, but also because they have differing opinions as to the
characteristics of the population to be tested with the technology and the
standards of proof to be used in conjunction with the technology. Thus, we
should be suspicious of attempts to resolve disputes about procedure by
strictly "technocratic" methods. Better information about the reliability of
11. For the reader who is unconvinced by this example—perhaps because it assumes such a
high standard of proof—I can supply an earlier version of this article, which presents a more
complex numerical illustration.
different means of fact-finding can never hurt, but it can never solve the
problem of selecting the best method of discovering truth.
Appendix
A.1 Proof of Proposition 1
Let F^b(e) describe a fact-finding technology that is different from F^a(e) in that F^a(e) - F^b(e) ≠ 0 for some range of e. For simplicity, assume that the graphs of the functions have a single crossing at the point e_c, such that [F^a(e) - F^b(e)](e - e_c) > 0. It should be clear that multiple crossings can be accommodated with a more cumbersome notation. The proof will be simplified by writing the error proportions as functions of x. If p^a is the optimal standard of proof when technology A is used, then x^a = F^{a-1}(p^a) is the critical observed level of behavior, and the proportion of false convictions is given by

$$T_1^a(x^a) = \int_{-\infty}^{0} g(x)\,[1 - F^a(x^a - x)]\,dx. \tag{A1}$$
The proportion of false acquittals is given by

$$T_2^a(x^a) = \int_{0}^{\infty} g(x)\,F^a(x^a - x)\,dx. \tag{A2}$$
Using this notation, T_1^b(x^s) and T_2^b(x^s) are respectively the proportions of Type I and Type II errors when technology B is used in conjunction with standard x^s.
The freedom to pick any cost function means that for any g(x) and F^a(e) it is possible to find some error cost functions that would give rise to any possible x^a. This follows from the fact that for any error function, we could always take a simple affine transformation such that C_1'(·) approached 0 for any T_1, which would imply p^a approaching 0. Similarly, a transformation of C_2'(·) that approached 0 for any T_2 would imply p^a approaching 1.
The proof follows by finding cost functions and a distribution function g(x) such that

$$C_1[T_1^a(x^a)] + C_2[T_2^a(x^a)] - C_1[T_1^b(x^a)] - C_2[T_2^b(x^a)] > 0. \tag{A3}$$

(Proving that this holds is sufficient since the total error costs generated by setting the standard at x^a and using technology B are at least as large as the error costs that would result if the standard were set to its optimal level.)
Expanding the functions C_i[T_i^a(x^a)] around T_i^b(x^a) gives

$$C_i[T_i^a(x^a)] = C_i[T_i^b(x^a)] + C_i'[T_i^b(x^a)]\,[T_i^a(x^a) - T_i^b(x^a)] + \cdots. \tag{A4}$$

Thus, if, as we assumed, marginal costs are nondecreasing, the sufficient condition given by Equation (A3) can be written as

$$[T_1^a(x^a) - T_1^b(x^a)] + Q\,[T_2^a(x^a) - T_2^b(x^a)] > 0, \tag{A5}$$

where Q = C_2'[T_2^b(x^a)]/C_1'[T_1^b(x^a)]. This expression can then be written
$$\int_{-\infty}^{0} g(x)\,[F^b(x^a - x) - F^a(x^a - x)]\,dx + Q \int_{0}^{\infty} g(x)\,[F^a(x^a - x) - F^b(x^a - x)]\,dx > 0. \tag{A6}$$
Define the function g(x) as follows:

$$g(x) = \begin{cases} 0, & x < x^a - e_c \ \text{or}\ x > e_c - x^a \\ k, & x^a - e_c \le x \le 0 \\ \dfrac{1}{e_c - x^a} - k, & 0 < x \le e_c - x^a. \end{cases} \tag{A7}$$

(It is easily seen that for any k > 0, g(x) is a distribution function—that is, $\int_{-\infty}^{\infty} g(x)\,dx = 1$.)
Now write (A6) as

$$k \int_{x^a - e_c}^{0} [F^b(x^a - x) - F^a(x^a - x)]\,dx + Q\left(\frac{1}{e_c - x^a} - k\right) \int_{0}^{e_c - x^a} [F^a(x^a - x) - F^b(x^a - x)]\,dx > 0. \tag{A8}$$

For the interval of x relevant to the first integral in this expression, x^a - e_c < x < 0, F^b(x^a - x) - F^a(x^a - x) > 0, because e_c > x^a - x. Thus, the first integral is always some positive number multiplied by k. Similarly, the second expression is always some negative number multiplied by Q(1/(e_c - x^a) - k). But k can be made as close as necessary to 1/(e_c - x^a) to guarantee that the entire expression is nonnegative. ∎
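The construction can be checked numerically. In the sketch below (Python), F^a and F^b are normal CDFs with standard deviations 1 and 2, an illustrative choice of mine that satisfies the single-crossing condition at e_c = 0, with x^a = -.5 and Q = 1. A small k makes the expression in (A8) negative (costs lower under A for that population), while k close to 1/(e_c - x^a) makes it positive, the circumstance the proof requires.

```python
import numpy as np
from scipy.stats import norm

# F^a is the N(0,1) CDF and F^b the N(0,4) CDF; they cross once at e_c = 0
# and satisfy [F^a(e) - F^b(e)](e - e_c) > 0.
Fa = lambda e: norm.cdf(e, scale=1.0)
Fb = lambda e: norm.cdf(e, scale=2.0)
e_c, x_a, Q = 0.0, -0.5, 1.0            # crossing point, fixed standard, cost ratio

def lhs_A8(k, n=2001):
    # first term of (A8): x in (x_a - e_c, 0), where g(x) = k
    x1 = np.linspace(x_a - e_c, 0.0, n)
    t1 = k * np.sum(Fb(x_a - x1) - Fa(x_a - x1)) * (x1[1] - x1[0])
    # second term: x in (0, e_c - x_a), where g(x) = 1/(e_c - x_a) - k
    x2 = np.linspace(0.0, e_c - x_a, n)
    t2 = Q * (1.0 / (e_c - x_a) - k) * np.sum(Fa(x_a - x2) - Fb(x_a - x2)) * (x2[1] - x2[0])
    return t1 + t2

print(lhs_A8(0.1) < 0)   # True: (A8) fails, costs lower under technology A
print(lhs_A8(1.9) > 0)   # True: (A8) holds, costs lower under technology B
```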
A.2 Proof of Proposition 2
Rothschild and Stiglitz (1970) have shown that a mean-preserving spread (MPS) can always be described by marginal shifts in the cumulative distribution function. Thus, letting a shift parameter t characterize distributions, the technology of classifying suspects has undergone an MPS if

$$\int_0^y H_t(p,t)\,dp \ge 0, \quad 0 \le y < 1; \qquad \int_0^1 H_t(p,t)\,dp = 0, \tag{A9}$$

where H_t denotes the partial derivative of H with respect to t.
The impact of a marginal change in t on the total cost of error evaluated at the optimal standard is

$$C_1'(\cdot)\,\partial T_1(\cdot)/\partial t + C_2'(\cdot)\,\partial T_2(\cdot)/\partial t. \tag{A10}$$
Integrating by parts gives

$$\partial T_1(\cdot)/\partial t = -(1 - p^s)H_t(p^s,t) + \int_{p^s}^{1} H_t(p,t)\,dp \tag{A11}$$

$$\partial T_2(\cdot)/\partial t = p^s H_t(p^s,t) - \int_0^{p^s} H_t(p,t)\,dp. \tag{A12}$$
Using these expressions and the fact that

$$\int_{p^s}^{1} H_t(p,t)\,dp = \int_0^1 H_t(p,t)\,dp - \int_0^{p^s} H_t(p,t)\,dp,$$

Equation (A10) can be written as

$$H_t(p^s,t)\,[-C_1'(\cdot)(1 - p^s) + C_2'(\cdot)\,p^s] + C_1'(\cdot)\int_0^1 H_t(p,t)\,dp - [C_1'(\cdot) + C_2'(\cdot)]\int_0^{p^s} H_t(p,t)\,dp. \tag{A13}$$
The first term vanishes because of the first-order condition for p^s, the second term vanishes because of (A9), and the third term is negative because of the definition of a mean-preserving spread. Thus, the total expression is negative. ∎
References
Becker, Gary S. 1968. "Crime and Punishment: An Economic Approach," 76 Journal of Political
Economy 169-217.
Craswell, Richard, and John E. Calfee. 1986. "Deterrence and Uncertain Legal Standards," 2
Journal of Law, Economics, & Organization 279-303.
Ehrlich, Isaac, and Richard A. Posner. 1974. "An Economic Analysis of Legal Rule Making," 3
Journal of Legal Studies 257-86.
Kaplow, Louis. 1994. "The Value of Accuracy in Adjudication," forthcoming in Journal of Legal Studies.
Miceli, Thomas J. 1990. "Optimal Prosecution of Defendants Whose Guilt is Uncertain," 6(1)
Journal of Law, Economics, & Organization 189-202.
Png, Ivan. 1986. "Optimal Subsidies and Damages in the Presence of Judicial Error," 6 International Review of Law and Economics 101-5.
Polinsky, A. Mitchell, and Steven Shavell. 1989. "Legal Error, Litigation and the Incentive to Obey the Law," 5 Journal of Law, Economics, & Organization 99-108.
Posner, Richard A. 1973. "An Economic Approach to Legal Procedure and Judicial Administration," 2 Journal of Legal Studies 399-458.
Rothschild, Michael, and Joseph E. Stiglitz. 1970. "Increasing Risk I: A Definition," 2 Journal of Economic Theory 225-43.
Rubinfeld, Daniel L., and David E. M. Sappington. 1987. "Efficient Awards and Standards of Proof in Judicial Proceedings," 18(2) RAND Journal of Economics 308-15.
Shavell, Steven. 1987. Economic Analysis of Accident Law. Cambridge, Mass.: Harvard University Press.
Tullock, Gordon. 1971. The Logic of the Law. New York: Basic Books.
. 1980. Trials on Trial: The Pure Theory of Legal Procedure. New York: Columbia
University Press.
Wittman, Donald. 1974. "Two Views of Procedure," 3 Journal of Legal Studies 249-56.