DNA Evidence: Probability, Population Genetics, and the Courts

V o l u m e 7, F a l l l s s u e , 1 9 9 3
DNA EVIDENCE:PROBABILITY, POPULATION
GENETICS, A N D THE COURTS
David H. Kaye*
INTRODUCTION
Courts, attorneys, scientists, statisticians, journalists, and g o v e r n m e n t
agencies h a v e b e e n explaining, 1 e x a m i n i n g , 2 p r o m o t i n g , 3 proselytizing, 4
denigrating, s and o t h e r w i s e struggling w i t h D N A identification e v i d e n c e
at least since 1985. 6 In t h e first w a v e o f cases, expert t e s t i m o n y for the
* Regents' Professor, A~zona State University College of Law, Box 877906, Tempe, AZ
85287-7906 (602 965-2922, [email protected]). A version of this paper was presented at the
1992 Joint Statistical Meetings of the American Statistical Association, the Biometric
Society, and the Institute of Mathematical Statistics. I am grateful to Herman Chem0ff for
comments on that paper and to Colin Aitken, Richard Lempert, Brace Weir, and especially
Bernard Devlin for comments on later drafts. The errors that remain despite this guidance
are entirely my own.
1. See, e.g:, David H. Kaye, DNA Paternity Probabilities, 24 FAM. L. Q. 279 (1990);
K.F. Kelly et al., Method and Applications of DNA Fingerprinting: A Guide for the NonScientist, 1987 CRIM. L. REV. 105; Miller, DNA Fingerprints to Aid Sleuths, 128 SCl.
NEWS 390 (1985).
2. See, e.g., OFFICEOF TECHNOLOGYASSESSMENT,GENETICWITNESS:FORENSICUSES
OF DNA TESTS (1990); Alan Guisti et al., Application of Deoxyribonucleic Acid (DNA)
Polymorphisms to the Analysis of DNA Recovered from Sperm, 31 J. FORENSICSCI. 409
(1986).
3. See, e.g., Andre A. Moenssens, DNA Evidence and Its Critics--How Valid Are the
Challenges?, 31 JURIMEr~cs J. 87 (1990).
4. See, e.g., People v. Wesley, 533 N.Y.S.2d 643, 644 (Sup. Ct. 1988) ('It]he single
greatest advance in the'sea~h for truth,' and the goal of convicting the guilty and a~uitting
the innocent, since the advent of cmss-examinadon.'), aft'd, 589 N.Y.S.2d 197 lApp. Div.
1992).
5. See, e.g., Gina Kolata, N.Y. TIMES, Jan. 29, 1990, at A1 (~Leading molecular
biologists say a technique promoted by the nation's top law-enforcement agency for
identifying suspects in criminal trials through the analysis of genetic material is too
upreliable to be used in court.'); Janet C. Hoeffel, Note, The Dark Side of DNA Profiling:
Un~reliable Scientffic Evidence Meets the Criminal Defendant, 42 STAN.L. REV.465 (1990);
Marjorie M. Shultz, Reasons for Doubt: Legal Issues in the Use of DNA Identification
Evidence, DNA ON TRIAL: GENERICIDENTIFICATIONAND CRIMINALJUSTICE19 (Paul R.
Billings ed., 1992), reviewed by, John F.Y. Brookfield, Gene Justice, 363 NATURE122
(1993) (dismissing Professor ShultT'S analysis as ~parochial nonsense"). Several of the
biologists referred to in the New York Times story have complained that their views were
misrepresented. Moenssens, supra note 3, at 99-100.
6. The earliest instance of DNA analysis for legal purposes is Alec J. Jeffreys et al.,
Posifi'~-~. Idetm'fication of an Immigration Test-Case Using Human DNA Fingerprints, 317
NATURE818 (1985) (applying the multilocus probes described in Alec J. Jeffreys et al.,
Individual-Specific "Fingerprints ~ of Human DNA, 316 NATURE76 (1985), and Alec J.
Jeffreys et al., Hypervariable ~Minisatellite ~ Regions in Human DNA, 314 NATURE67
(1985)). Soon after, this group applied the technique to a serial murder case described at
length in JOSEPH WAMBAUGH,THE BLOODING (1989), excluding one suspect and
Harvard Journal ofLaw &Technology
102
: [Vol. 7 -~.
prosecution was rarely countered, and courts readily admired the findings
of commercial laboratories. 7
I n t h e w a k e o f t h i s early e n t h u s i a s m f o r D N A
emerged)
evident,
doubfs
Diligent attorneys and enterprising defendants enlisted well-
c r e d e n t i a l e d e x p e r t s to s c r u t i n i z e t h e w o r k o f c o m m e r c i a l a n d c r i m e
laboratories.
The resulting plethora of questions about laboratory
procedures a n d
analyses 9 convinced many courts, including the
Supreme
C o u r t s o f G e o r g i a , t° M a s s a c h u s e t t s , ~ a n d M i n n e s o t a n to e x c l u d e at l e a s t
incriminating another.
7. See David H. Kaye, The Admissibility of DNA Testing, 13 CARDOZOL. REX'. 353,357
n.17 (1991). A case that is representative of this epoch is Cobey v. State, 559 A.2d 391
(Md. Ct. Spec. App. 1988). A man forced a woman jogging in a park into the woods,
where, as the court of appeals put it, he "ravished" her and drove away in her car. A
policeman issued a traffic citation to Kenneth Cobey, who was driving that car. Ceilmark
Diagnostics performed a "DNA fingerprint analysis" showing "a 'mateh' between the DNA
in Cobey's blood sample and the DNA [extracted from] semen stains [on the woman's
clothing]." ld. at 392. The state produced five experts "who testified that DNA
fingerprinting was accepted in the scientific community," while Coboy "produced no expert
evidence to the contrary." ld. at 392. To buttress the testimony for the state, the court of
appeals relied on a news account in the American Bar Association Journal that "Cellmark
Diagnostics of Germantown, Md., claims its 'DNA fingerprint' test can identify a suspect
with 'virtual certainty,' and that the chances of any two people having the same DNA
fingerprint are one in 30 billion." ld. at 392 n.7 (quoting D. Moss, DNA--The New
Fingerprints, 74 A.B.A.J. 66 (1988)). Although the court cautioned that "we are not, at
this juncture, holding that DNA fingerprinting is now admissible willy-nilly," but "are
merely holding that, based upon this record, [the court below] did not e r r . . , since there
was no evidence to the contrary,"/d, at 398, courts--even those confronted with expert
testimony opposing a DNA identification--frequently cite Cobey for the proposition that all
aspects of DNA analysis and all types of DNA probes are accepted among scientists, even
though this "30 billion" figure pertains to a mnitilocus probe that is no longer used in this
country for criminal identification.
8. See Kaye, supra note 7, at 357 n.18.
9. These included the possible effects of contaminants on forensic samples, the use of
ethidium bromide, corrections for band shifting, the records of laboratories on proficiency
tests, the size of data bases used to assess the significance of matching bands, and the
procedure for calculating the frequency of matching DNA patterns within the general
population.
10. Caldwell v. State, 393 S.E.2d 436 (Ga. 1990) (finding Lifecodes's "sWaight binning
method satisfactory," but because laboratory's calculation that frequency of profile in
population was 1/24,000,000 rested on assumption of Hardy-Weinberg equilibrium
inconsistent with its data base, the more conservative figure of 1/250,000 derived from that
data base would have to be used).
11. Commonwealth v. Cumin, 565 N.E.2d 440 (Mass. 1991) (holding Cellmark's DNA
evidence in rape case erroneously admitted in absence of showing general acceptance of
validity of process leading to conclusion that one Caucasian in 59 million would have
incriminating profile).
12. State v. Schwartz, 447 N.W.2d 422, 428 (Minn. 1989) (responding to Cellmark's
multilocus VNTR probe, said to produce a "banding pauem [whose frequency] in the
Causcasian population is approximately 1 in 33 billion," the court concluded that "DNA
typing has gained general acceptance in the scientific community," but "the laboratory in
this case did not comport" with "appropriate standards," and further holding the statistical
....
Fail; 1993]
D N A Evidence ~ . . . .
~: 103
some aspects of DNA evidence; 13 Nevertheless, in the majorityofcases,
the courts continued to hold DNA matches and probabilities admissible
even in the face of conflicting expert testimony, ~4
With the publication of a long-awaited report of a twelve-member
panel of the National Research Council, Is a third wave of cases i s
crashing down upon this battered legal shoreline. Even before the
National Academy of Sciences released this report f o r publication,
unofficial announcements of an impending call for a moratorium,on
forensic DNA identification~6 produced consternation ~7 and legal
maneuvering. ~8 Although the final report sought no such moratorium and
strongly endorsed the theory behind forensic DNA analysis, it does
question several aspects of current and past practice and does rex~mmend
improvements in the process. The pressure created by these pronounce-
conclusion to be inadmissible, because even if the computation is accurate, "we remain
convinced that juries in criminal cases may give undue weight and deference to presented
statistical evidence").
13. Other courts have also refused to admit certain forms of DNA evidence. See, e.g.,
United States v. Two Bulls, 918 F.2d 56 (8th Cir. 1990), vacated for reh'g en bane but
appeal dismissed due to death of defendant, 925 F.2d 1127 (Sth Cir. 1991); People v.
Castro, 545 N.Y.S.2d 985 (Sup. Ct. 1989); cf. Perry v. State, 586 So.2d 242 (Ala. 1991)
(remanding for hearing on Lifecodes's adherence to proper procedures and acceptability of
statistical methods).
14. See, e.g., United States v. Jakobetz, 747 F. Supp. 250 (D. Vt. 1990) (applying
relevance standard), aft'd, 955 F.2d 786 (2d Cir. 1992), cert. denied, 113 S.Ct. !.04 (1992);
United States v. Yee, 134 F.R.D. 161 (N.D. Ohio 1991) (applying general acceptance
standard); ~ State v. Pierce, 597 N.E.2d 107 (Ohio 1992) (applying relevance standard,
no defense experts); Sateher v. Commonwealth, 421 S.E.2d 821 (Va. 1992) (applying
general acceptance standard and statute, no defense experts).
15. COMMrlTEEON DNA TEG~OLOOY IN FORENSIC SCIENCE, NATIONALRESEARCH
COUNCIL, DNA TECHNOLOGYIN FORENSIC SCIENCE (1992) [hereinafter NRC REPORT].
For a more comprehensive summary of the NRC Report and thoughts on its legal implications, see Kenneth R. Kreiling, Review-Comment, 33 JURnVlETRICSJ. 449 (1993).
16. Gina Kolata, U.S. Panel Seeking Restriction on Use of DNA in Courts, N.Y. TIMES,
Apr. 14, 1992, at A1.
I7. The panel's chair promptly repudiated the N.Y. Times concededly exaggerated
account. Oina Kolata, Chief Says Panel Backs Courts" Use of a Genetic Test, N.Y. TIMES,
Apr. 15, 1992, at A1.
18. FBI "interference" in the preparation of the report and a last-minute compromise in
a crucial section of the report has been alleged. Leslie Roberts, DNA Fingerprinting:
Academy Reports, 256 SCIENCE 300 (1992) (describing compromise within the National
Academy of Sciences Commitee on a Statistical Standard); Rorie Sherman, Genetic Testing
Criticized, NAT'L L.J., Apr. 20, 1992 (some courts have ordered production of the
penultimate draft of the NRC report, which was leaked to and criticized by the FBD.
Charges of FBI interference apparently come readily to the lips of some participants in the
public debate. See e.g., Rorie Sherman, New ScratinyforDNA Testing, NAT'L LJ., Oct.
18, 1993, at 3 (quoting one defense attorney's reaction to a recent decision of the National
Academy to impanel a new commi~e to update the population genetics chapter of the 1992
report as the "offensive~ result of the "law enforcement [community's] dictating to the
independent scientific community how they should examine problems').
104
Harvard Journal o f L a w & Technology
ments 19 is shaping opinions across the nation. 2°
[Vol. 7
: .
O f all the technological and scientific issues in this debate, the most
difficult for the courts, and those that have generated the most d i s a g r e e merit within the scientific community, involve statistics. The disagreements revolve around one central challenge--presenting the degree o f
similarity between D N A in a crime sample and D N A in a defendant's
sample so that a j u d g e or j u r y can fairly assess the probative value o f
D N A evidence. The predominant procedure for criminal D N A testing in
the United States involves two major steps: first, declaring a "match"
between the two samples, and second, if a match is declared, estimating
its relative frequency in a reference population. This frequency indicates,
at least indirectly, the significance o f a match.
It reveals whether the
match is as c o m m o n as a polite smile or as rare as the enigmatic
expression o f the M o n a Lisa.
In determining the admissibility o f testimony on these points, courts
have applied two competing standards.
One is the general acceptance
standard first applied to scientific evidence in Frye v. United States. 21
Under the Frye standard, courts do not inquire directly into scientific
truth, but ascertain whether the seier~tifie community has reached the
consensus that the scientific procedure in question rests on a valid theory
and generates reliable results when properly applied, z'
The other
19. A committee of defense lawyers is reviewing convictions involving DNA
evidence, seeking to apply the report's recommendations retroactively, as it were. See
Tim Beardley, DNA Fingerp.rinting Reconsidered Again, SO. AM,, luly 1992, at 26.
Prosecutors have begun to request calculations of the frequency of matching DNA types
using the "ceiling method" advocated in the report. See Christopher Anderson, Courts
Reject DNA Fingerprinting, Citing Controversy After NAS Report, 359 NATURE 349
(1992).
20. See People v. Barney, 10 Cal. Rptr. 2d 731 (Ct. App. 1992) (finding that product
role calculation method not prescribed by NRC panel for calculating frequency of DNA
pattern is not generally accepted among population geneticists), followed in People v.
Wallace, 17 Cal. Rptr. 2d 721 (Ct. App. 1993); Commonwealth v. Lanigan, 596 N.E.2d
311 (Mass. 1992) (same); State v. Vandebogart, 616 A.2d 483 (N.H. 1992) (same);
State v. Canthron, 846 P.2d 502 (Wash. 1993) (finding error in allowing expert to testify
that defendant was the source of the incriminating DNA and yet excluding testimony of
frequency of the DNA pattern given that the NRC panel had proposed a generally
accepted method of calculation); cf. State v. Bible, 858 P.2d. 1152 (Ariz. 1993) (holding
method as applied to 1988 data base not generally accepted); Springfield v. State, 860
P.2d 435 (Wyo. 1993) (holding frequency re-calculated with "the most conservative'~
NRC method admissible under relevance standard); People v. Atoigue, DCA No. CR
91-95A (Guam Dist. Ct. App. Die. 1992) (method not generally accepted among
population geneticists). For discussion of these cases, see infra Part II(B)(4).
21. 293 F. 1013 (D.C. Cir. 1923) (holding inadmissible expert opinion of truthfulness
formed from a primitive version of the polygraph).
22. See generally, e.g., 1 MCCORMICKON EVIDENCE § 203 (J. Strong ed., 4th ed.
F a i l , 1993]
DNAEvidence
I05,
" : ' ! i.
a p p r o a c h treats general ac~.eptance as b u t o n e factor that b e a r s ' o n the. :" - '>~: ~ "
u l t i m a t e qt~estion o f w h e t h e r the scientific findings are sufficiently reliable
to j u s t i f y their a d m i s s i o n in v i e w o f the dangers 0 f u n c r i t i c a i a c c e p t ~ c e . . . .
b y the j u r y and u n d u e e x p e n s e and c o n s u m p t i o n o f time. 23-,Although b o t h
standards are consistent w i t h the w o r d i n g and history o f R u l e s 403 u and
70225 o f the F e d e r a l a n d the U n i f o r m R u l e s o f E v i d e n c e , in Daubet~ v.
MerreU D o w Pharmaceuticals, Inc., 2s the S u p r e m e C o u r t u n a n i m o u s l y
h e l d that the federal rules i m p l i c i t l y r e j e c t the Frye test. 27 O v e r ! s o m e
dissent, 2s the C o u r t attempted to define "scientific knowledge,"29 and it
articulated f o u r " g e n e r a l observations "3° for use in d e t e r m i n i n g w h e t h e r
1992).
23. See, E.g., State v. Pierce, 597 N.E.2d 107 (Ohio 1992) (applying relevance
standard to uphold admission of DNA statistics despite NRC Report); cases cited,
MCCORMICK,supra note 22, § 203 at 872 n. 31.
24. Rule 403 states: "Although relevant, evidence may be excluded if its probative
value is substantially out~veighed by the danger of unfair prejudice, confusion of the
issues, or misleading the :,;ry, or by considerations of undue delay, waste of time, or
needless presentation of cumulative evidence. ~
25. Rule 702 provides: "If scientific, technical, or other specialized knowledge witl
assist the trier of fact to understand the evidence or to determine a fact in issue, a
witness qualified as an expert by knowledge, skill, experience, training, or education,
may testify thereto in the form of an opinion or otherwise."
26. 113 S.Ct. 2786 (1993).
27. Although Daubert should accelerate the movement away from Frye, two factors
may blunt the force of the decision. First, the Court continued to apply its wooden,
"plain meaning" construction of the rules. See, e.g., Glen Weissenberger, The Supreme
Court and the Interpretation of the Federal Rules of Evidence, 53 OHIO ST. L.J. 1307
(1992). Second, Daubert concerned the general acceptance of a scientific conclusion
about a putative teratogen. The general methodology for determining teratogenicity--the
examination of data from toxiciologic and epidemiologic studies---was not controversial;
only its application was in dispute, and the application of an accepted methodology plays
no part in the normal Frye analysis. For cases hesitating or declining to follow Daubert,
see, for example, State v. Bible, 858 P.2d 1152 (Ariz. 1993) and Fishback v. People,
851 P.2d 884 (Colo. 1993).
28. Chief Justice Relmquist and Justice Stevens dissented from this portion of the
opinion.
29. The Court treated Rule 702 as the "primary locus" of the trial court's obligation
to screen out unacceptable scientific testimony. 113 S.Ct. at 2795. The rule speaks of
"scientific . . . knowledge," and the Court propounded the tautology that "in order to
qualify as 'scientific knowledge,' an inference or assertion must be derived by the
scientific method." ld. "Proposed testimony must be supported by appropriate
validation--i.e., 'good grounds,' based on what is known, ~ recognizing, of course, that
"it would be unreasonable to conclude that the subject of scientific testimony must be
'known' m a certainty." Id. However, it is not the inference or assertion itself that must
be sufficiently "known." "The focus, of course, must be solely on the principles and
methodology, not on the conclusions that they generate." ld. at 2797. This article
contends that "good grounds" exist "based on what is known ~ to support the
introduction of DNA evidence and a variety of statistics or probabilities that indicate how
revealing such evidence is.
30. First, citing the positivist criterion that, in principle, a scientific hypothesis must
106
Harvard Journal of L a w & Technology
[Vol. 7
purportedly scientific testimony possesses sufficient validity and reliability
to qualify as "scientific knowledge" and whether i t ,would Sufficiently
"assist the trier of fact" within the meaning of Rule 7027 z
To help meet the challenge of presenting properly performed. DNA
tests within this legal framework, this Article outlines the statistical
procedures that have been employed or proposed to providejudges and
juries with quantitative measures of such probative value, describes more
fully how the courts have dealt with these procedures;and evaluates ,the
opinions and the statistical analyses from the standpoint of the law of
evidence. Part I outlines the procedure used to declare whether two
samples of DNA "match." It explains how shrinking the size of the
"match window," as some defendants have urged, will decrease the risk
of false matches, but will also exclude hig~hly probative evidence of
identity. This section also demonstrates that a defendant's effort to show
that a smaller match window would not permit the declaration of a match
is irrelevant or misleading. Part II explains procedures for estimating the
frequency of the incriminating genetic characteristics in various populations. These procedures have been the subject of an acrimonious debate,
both in the courts and in the press, about the effect of "population
structure." This section reveals that the population structure objection,
which has proved so effective in court, applies most strongly to only a
limited class of cases Thus, courts have erred in excluding DNA
evidence on the theory that the sciefitific community advocates that the
most "conservative" procedures must be used in all cases. Part M
identifies more fundamental problems in the use of population frequency
estimates. It advocates supplementary and alternative procedures that are
essential if quantitative statements of the probative value of DNA
be subject to some empirical test that could falsify it, the Court observed that "a key
q u e s t i o n . . , will be whether [a theory or technique] can be (and has been) tested." 113
S. Ct. at 2797. Second, a "pertinent consideration is whether the theory or technique
has been subjected to peer review and publication." Id. Third, ordinarily "the known
or potential rate of error" should be assessed. Id. Finally, "general acceptance" within
the scientific community "can yet have a bearing on the inquiry." Id. None of these
factors, except presumably the first which excludes purely metaphysical theorizing, is " a
sine qua non of admissibility," for "It]he inquiry envisioned by Rule 702 is, we
emphasize, a flexible one." Id.
31. To "assist the trier of fact to understand the evidence or to determine a fact in
issue," as Rule 702 requires, the scientific testimony must be "relevant" and "fit" the
circumstances of the case. 113 S.Ct. at 2795, 2796. Assuming that the identity of the
actual criminal is a contested issue, properly conducted DNA tests of identity always will
be relevant. The only arguable lack of "fit" might involve the selection of a data base
for computing related statistics. See infra text accompanying note 155.
'
-~
" .
F a l l , 1993]
D N A Evidence
107 ~ ' ':,i' i: !'i: ::~
evidence are to be admissible. . . . .
I. DECIDING WHETHER D N A FRAGMENTS
MATCH
•
•
..
J'}
..
,
....
•
,
The most c o m m o n form o f D N A analysis m crmunal cases utilizes
four o r five so-called " s i n g l e locus V N T R
probes " ~ to produce a
"multilocus genotype," or, more simply, a " D N A profile." D N A is a
i!7
complicated but stable organic compound found in the cells o f all organisms, from the humblest amoeba to the most arrogant human being. I t
is composed o f two weakly connected strands o f molecules that s p i r a l
around one another to f o r m a double helix. Along the backbone o f each
strand are much smaller, relatively flat molecules known as nucleotide
bases. There are four such b a s e s , often referred to b y their initials, C,
T, A, and G.
The C on one strand always pairs with t h e G on its
complementary strand, and the A w i t h the T. A little reflection reveals
that there is an incredibly large number o f possible orderings o f these
base pairs in a lengthy stretch o f D N A . 33
Using techniques o f molecular biology, 34 fragments o f chromosomes ~
that begin and end with certain sequences o f D N A base pairs are excised
from samples found in blood, semen, o r other material containing
sufficient D N A . 36 The beginning and ending sequences are chosen so that
32. See infra note 40 and accompanying text.
33. At each site, there are four possible pairs: AT, TA, CG, or C,C. Two sites
produce 4 × 4 = 16 possibilities, three produce 16 x 4 -- 64, and so on, so that n sites
can accommodate 4" possibilities.
34. See generally MAXI~ SINGER & PAULBERG, GENES& GENOMES:A CHANGING
P~.S~CT~Ve (1991).
35. A chromosome is essentially a tightly coiled molecule of DNA. Each parent
supplies one each of 23 different chromosomes, so the human genome consists of 46
chromosomes arr~ged into 23 pairs. On the coiling of DNA, see; for example, Michael
Gnmstein, Histories as Regulators of Genes, Sa. AM., Oct. 1992, at 68.
36. Bacterial enzymes are used to cut the DNA into fragments. A given ~restricfion
enzyme" binds to DNA when it encounters a certain short sequence of DNA base pairs
and cleaves the DNA at a specific site. For example, the Hae HI enzyme cleaves the
strand ...GGCC... to yield ...GG and CC .... "Digesting" DNA with such an enzyme
usually produces fragments ranging from several hundred to several thousand base pairs
in length.
A technique for copying DNA permits minute quantities of DNA to be analyzed.
See, e.g., Henry A. Ehrlich et al., Recent Advances in the Polymerase Omin Reaction,
252 SCIENCE 1643 (1991). In most forensic applications to date, DNA that has been
"amplified" in this way has been probed at less revealin[; loci that do n0~~involve
VNTRs. See, e.g., State v. Williams, 599 A.2d 961 (N.J. 1991) (unopposed testimony
of prosecution experts established general acceptance of PCR amplification followed by
:
.....
108
Harvard Journal o f ~ : & T e c h n o l o g y
:~ ,/,
[Vol.:7:
"'+
.
+''
the material they bracket tends to vary in sizefrom person to person.
The lengths of the DNA fragments are measured by seeing how far
they move through a slab of gelatinous material when attracted by an "
electric Charge relative to DNA fragments of known lengths--a process
known as electrophoresis. Just as a sleek panther can wend its way
through a stretch of dense jungle more readily than a bulky elephant,:4n
a given period of time shorter fragments (with low molecular weight)
migrate farther in an electrophoretic gel than longer fragments (of high
molecular weight). 37
The variations in the lengths of the fragments from different people,
referred to as "fragment length polymorphisms, "38 result primarily from
disparities in the number of repetitions of a short sequence Of nucleotide
base pairs. 39 The-number of repetitions of this core or "consensus"
sequence varies greatly among people--hence the phrase "variable
number of tandem repeats," or VNTRs. 4° The fragments containing the
tandem repeats can be detected by speeially constructed molecular
"probes" that bind to a specific consensus sequence. +~ By measuring the
dot-blot detection of HLA DQ,~ polymorphism). Amplification coupled with more
precise detection of VNTRs is, however, also possible and likely to dominate forensic
applications in the near futare. See, e.g., Brace Budowle et al., Analysis of the VlVTR
Locus DIS80 by PCR Followed by High Resolution PAGE, 48 AM. J. HUM. GENETICS
137 (1991). For a set of proposed safeguards in forensic PCR analysis+ see N'RC
REPORT,supra note 15, at 63-73.
37. The molecular weight of a compound is equal to the total mass of its constituent
atoms. Since DNA fragments all have pretty much the same mix o f atoms, a fragment
that has twice the length of another also has about twice the molecular weight.
38. They also are called "RFLPs" or "AmFLPs," depending o n t h e procedure that
yields the fragments.
39. See Yusuke Nakamura et al., VariableNumber of Tandem Repeat (VNTR) Markers
for Human Gene Mapping, 235 SCIENCE 1616 (1987). More than one core sequence
may be repeated in some VNTRs. See Alec J. Ieffreys et:al., MinisateUite Repeat
Coding as a Digital Approach to DNA Typing, 354 NATURE 204 (1991) (proposing an
analysis o f the order in which two interspersed core sequences appear in the repetitive
portion o f fragments in order to provide greater discrimination).
40. The chromosomal locations that give rise to VNTR RFLPs or AmFLPS are known
as VNTR or hypervariable loci. The number of repetitions o f the core sequence can
vary from a handful to a few hundred, depending on the particular locus, but, when the
length of the core sequence is short, the differences between fragments f r o m different
subjects will be too small to detect on a typical electrophoretic gel. Resolution of
fragments which differ by as little as two base pairs, however, has been reported with
newer gels. See Rene Hubert et al., A New Source of Polymorphic DNA Markers for
Sperm Typing: Analysis of MicrosateUite Repeats in Single Cells, 51 AM; J. HUM.
GENETICS 985 (1992).
- 41. These 1.~obes are short segments of single-stranded DNA with a radioactive or other
readily identifmble component attached, like a sticker or tag on a suitcase. When the probe
encounters a strand of DNA with the complementary sequence of bases, it pairs ("hybridizes") with the target DNA.
: "
7
"
......
:'
Fall, 1993]
DNA Evidence
109
distances that the fragments tagged with these probes have migrated,4z'the
-approximate lengths o f the VNTR fragments can be=determined;
Although VNTRs used in forensic work represent a minuscule portion of
the full genome,: the number of distinct combinatious o f them easily runs
into the billions and trillions. 43
flanking
region
.........
many repeats of a single
consensus sequence
>>>>>>>>>>...>>>>>>>>>>>
flanking
region
.........
Pigure 1.
A schematic diagram of a VNTR fragment. Between:two
flanking regions of DNA (-) are many repeats of the same
small sequence of base pairs (the consensus sequence > ) .
The number of repeats often varies, as between the pair of
chromosomes in an individual and as among the chromosomes from different people.
Because the prevailing method of agarose gel electrophoresis for
42. In one common procedure for "visualizing" the target DNA, the DNA is denatured to its single-stranded form and transferred from t h e eleclxophoretic gel to a
niu'ocellulose filter. The probe is applied tO the filter, and any excess, unbound probe is
washed away. X-ray film is placed next to the filter. Radioactivity from the probe
exposes the film, producing a black band whose location reveals how far the restriction
fragment migrated on the gei:~-'See generally JAMES D. WATSON ET AL., RECOMBINANT
DNA (2d ed. 1992). Many opinions liturgically recite all the steps of this "Southern
blotting" procedure. See, e.g., Springfield v. State, 860 P.2d 598 (Wyo. 1993) (quoting
Fishback v. People, 851 P.2d 884 (Colo. 1993)).
43. A typical fragment from a given region of a chromosome (a "locus") easily can
come in 20 or more discernibly different sizes ('alleles"). See, e.g., S.J. Odelberg et
al., Characterization of Eight VNTR Loci in Agarose Gel Electrophoresis, 5 GENOMICS
915 (1989). For the maternal and paternal pair of chromosomes, then, there are many
possibilities: maternal "allele" 1 can pair with paternal "allele" 1, 2, 3 . . . . . or 20;
mammal "allele" 2 can pair with paternal "allele" !, 2, 3 . . . . . or 20; and so on. The
result is 20 × 20 = 400 possible "allele" pairs. Without a stody of family members to
ascertain which "allele" is on which chromosome, however, a single-locus VNTR probe
cannot distinguish a paternal-maternal "ailele" pair from a maternal-paternal one. On a
gel, the pair (1,2); for example, looks the same as the pair (2,1). The 400 possibilities
therefore includes (20 × 19)/2 = 190 duplicates, leaving 210 discernible single-locus
"genotypes." At four such loci, 210 × 210 × 210 × 210 = 1,944,810,000 discernible
"genotypes" are possible; likewise, 21(P -- 408,410,100,000 distinguishable five-locus
"genotypes" are possible. Obviously, not all of the mathematically possible combinations are realized in any human population, and some may be represented more frequently than others. The branch of biology that studies the distribution of genotypos across
populations and within populations over time is known as population genetics.
aevaations-
of a set of mdepend¢
44. See supra note 40. Contra State v.
( ' A variation o f even One nucleetide in the
45. "I'aere is confusion as to what role:(
requires that bands be separated by no more than +2.5% of their mean nml~mla r weight ~
to declare a match. This window is sligh~y larger than the biggest difference observed
in the FBI laboratory for the same sample measured twice. See Bmce Bndowleet aL,
Fixed Bin Analysis for Statistical Evaluation o f Continuous Distributions o f Allele Data
from $TVTRLoci, for Use in Forensic Comparisons, 48 AM. L HOM.Gl.:Nl~'Tl~ 841, 844
"
(1991). According to Erie S . Lander, Invited Editorinl: .Research :'on DNATypin& . . .
Catching Up with Courtroom Application, 48 AM. J. HUM. GI~'ETICS 819,820 (1991),
this FBI stady suggests that the Bureau's laboratory has a standard deviation for the
difference between two measurements o f about 1.5% o f the molecular.weight o f their
mean. I f s o , the-l-2.5% matehwindowcon-espondsto + l . 7 standard deviations. On
the other hand, Nell J. Risch & Bernard Devlin, On the Probability o f Matching DNA
Fingerprints, 255 SCIenCE 717, 72On.9 (1992), conclude from the same FBI studythat
the FBI "measurement error SD'..is 0.625%, which implies a match window of.four ~standard deviations. Likewise, Seymour Geisser,. Some. Statistical Issues in Medicine .i
and Forensics, 87 J. AM. STAT. ASS'N 607, 609 (1992), comments that ?the FBI . . .
will declare a match between two samples if the bands are wilhin 2 . 5 ~ o f the average of
the two . . . a tolerance of about 4 standard deviations of the difference." Britain's
Home Office Forensic Science Service estimates the standard deviation o f the difference
between two independent measurements to be 1.1~. See D o n a l d A . : B e r r y et al.,
Statistical Inference in Crime Investigations UFmg Deoxyribonucleic Acid Profiling; 41
APPLmD STAT. 499, 502 (1992). According to Lander, supra, and Risch & Devlin,
....
supra, commercial laboratories report still smaller figures of abeut:0.6~. Lifecodes
Corporation uses a match window of 1.8%, See Bruce S. Weir, Review:.Pupulation
Genetics in the Forensic DNADebate, 89PROC. NAT'L ARAB. SCI; 11654, 11655
(1992), which amounts to + 3 standard deviations of the'reported measurement e r r o r
Some of these discrepancies may arise from differences in the materiais being
e - ~ t i n ~ in the "calibrati_'~n" studies. More consistent results may be expected from ....
fresh DNA; evidentia~ samples can be influenced by degradation and exln'bit band
shifting. The characterizations of the F B I ' s match window for forensic casework as 4-4
standard deviations of the mean of the two fragments actually pertain to the standard
deviation derived from their KI2 cell line ('control D N A ' ) measurements. It also
should be noted that a match window of :kk standard deviations of the mean of the two
fragments implies that the two fiagments could be + 2 k standard deviations apart and sfill
"match." See infra note 47; Michael J. DiRussu, Note, DNA "Profiles'--The Problerm
of Technology Transfer, 8 N.Y.L. SCH. J. HUM. RTS. 183, 205 (1990) (criticizing
Cellmark for reporting that it used a matchwindow of -4-3 standard deviations when it
actually was using +6).
46. The standard deviation is a statistic that measures the degree of variation in a set
of munbers. If all the numbers are identical, their standard deviation is zero. If they
vary greatly from their mean, then their standard deviation is large. For electrophoretic
: (:. :
~ ,- ~:i i.:
~"!
"
~ ii~i~!
~
.
~/~;
~:i~!::::
:
.~.
:
"'!i
/-/.~::
~.
•
~
.
-~
• :
i
•Fall,: 1 9 9 3 ] • -'
observed
differen
of thesame len~
which two bands
measurements, the S
smaller distances on I
47. In practice, v
lengths for VNTRs in . . . . . . . ~
,~u,,, , ~ , ~ o , a ~ ~ , , , U ~ s ~ t , , ~ , ~ , ~,,~J ~ , ,
from blood taken from the same woman. Both samples containthe sameDNA,.' and:the "
" " .~
two SOUI'C~S Corfespol~d to the s i t u a t i o n i n I a I ~ CaSeS. A n al~l'l'~l~/e',woiJt]d~!bl~'.,Io
,.
:
compare semen and blood samples from the same man. In,such studies conducted by the..,:.-,..,.,
South Carolina L a w Enfc~-.---:~* DivisionDNA laboratory, corresp0nding bands, here r
differed by more than 5.6% of their average length. B . S : . W e i r & B~S. Gaut,':Matchtng
and Binning DNA Fragments in Forensic ~$cience, 34 JURIlVlE~JCSJ.:9.(199~)..: SI]l~h
studies enable the laboratmy to choose a .window i]mt is Wide en0ugh to be likely t0 "
result in a match when two samples come from the same source.
48. Weir & Gaut, supra note 47. lucidly describe the process:
Suppose the vaginal sample length isdenoted by e and the b l e e d s a m p l e
length by s. Then the relationship
(s+e)12
s
2a
"
Is-(s+e)/21 ~ a
(s÷e).t2
[e-(s+e)/2[ ~
(s÷e)t'2
This situation is shown below. In other words, there is uncertainty
associated with an eslimated band length. The lnle length of a band of
estimated length e is thought to be contained in the interval e + a . and
two bands are said to match if they are no more than 2~x apart:
sn*NHu**lsnnnlnnunJ*
*olnJuo*HmslsuHHnstteu
,ql
(e+s)/2
IV
vaginal: e
.H°tn.gmt~n®°H*e*lunll*.
.I.*Nn.o.l*n°u°°°ºunn*l
°l.°Hnms.l~s°°.lJ*lenn°ln
"
"
"
"
defines ce (0.028 for the South Carolina laboratory). Alternatively, each
band is no more than c~ fromthe average of the two lengths:
blood: s
"
"
112
S
(tho;
expe
chall
v
.
,
.
.
.
.
.
rejected through computer analysis, using~Whollyoi~j/~'tive/criteriai ~'s°
Likewise, courts in Arizona: t California,~ and New York',have
that high standards of accuracy, .~c h as thq N R C
panel?s:-i~:for ya
precise and objective matching rule+~ do iiot require exclusion of results
of the FBI's or.Cellmark's match procedures.
+ ': -
These holdings are correct. A s long as no + visual ~
,
. : :.
:
+
- +i~+:.':/ '
~will b e
reported as a match unless confirmed by the quantitative matchlng rule,
the imprecise, subjective phase serves as0nly a preliminary filter. I t
means that in some cases where the purely statistical rule would declare
a match the laboratory will not report a match. W h e n a sample f r o m a
defendant matches both objectively and subjectively, the defendant can
hardly complain that the laboratory should not have bothered with the
subjective phase of the procedure, ss
Judicial discussions of the adequacy of the objective, statistical phase
of matching have been less perspicacious. The issue can surfaceboth
when the prosecution offers proof of a match, and when a defendant
offers evidence of a non-match. In the former, inculpatory situation, a
defendant might argue that the match window is too wide, ss and a more
. .
. ~++, "
.
Once this numerical matching rule has been established for a particular laboratory, the evidence bands • can be compared to bands s from a
suspect. If a visual match is declared, and if all pairs of corresponding
bands in the two profiles differ by no more than 2ct, then the+ two
profiles are said to match.
49. 486 N.W.2d 407 (Minn. 1992).
50. Id. at 420.
51. State v. Bible, 858 P.2d 1152 (Ariz. 1993).
52. People v. Barney, 10 Cal. Rptr. 2d 731 (Ct. App. 1992).
53. People v. Wesley, 589 N.Y.S.2d 197 (App. Div. 1992).
54. NRC REPORT, supra note 15, at 72.
55. When the laboratory reports the proportion of people in the general population
with DNA that would match the crime sample, it uses the purely statistical match nile.
To the extent that the subjective component can only reduce the number of matches in
the population, lifts frequency tends to overstate the degree to which the DNA test would
recriminate innocent people. Of course, there may be separate reasons to question these
estimates of population frequencies. These are analyzed/nfra in Pan II.
56. See, e.g., United States v. Yec, 134 F.R.D. 161, 207-08 (N.D. Ohio 1991); see
+
Fall, 1993]
D N A Evidence
i13 i :::!".i::
'-:
stringent rule that would preclude the deci~tiOn:of amatch ~sho~fldbe : : "
used.~ In many cases, thisargument will be futile, for ali!66,pairsOf,,
:
measurements willliewell withinthe match window, ss:iudeed,when this
happens, experts m a y refer to the concordance of the two measurements
within the match window as ~an exact match,"s9 or "conclusive matches."m Conversely, when atleastone pair of measurements spans nearly
the fulllength of the window, and an analystjust speaks of:"a match,"
a court may stilladmit the evidence, as did the magistratejudge in United
States v. Yee. 61 That courtjustifieditsholding with the observatiouthat
"defendants who would be outside a smaller window but are within the
F.B.I.'s
larger window can make that point clear at t r i a l . "~a
:"
Although this may sound like a reasonable:compromise, t heYee
suggestion invites a potentially confusing exchange. The prosecution says
to the defendant, "under our match rule, you match:" The defendant
replies, "That's your rule. Under a different rule, I don't match., What
is the jury to make of this thrust and counterthrust? If all goesweH; the
exchange will make no difference because the jury also will be presented
with the frequency with which the prosecution's procedure for declaring
also Geisser, supra note 45, at 609 (characterizing the :k2.5% window as ~an extraordinarily wide net to declare a match').
57. In United States v. Jakobelz, 747:F. Supp. 250 (D. Vt. 1990), aft'd, 955 F.2d 786
(2d Cir. 1992), cert. denied, 113 S.Ct. 104 (1992), the defense attacked the FBI's match
window of :i:2.5% on the ground that "It]he FBI derived this 5% window through an
empirical analysis based upon the total variation of matches from known samples r a t h e r
than a statistical approach that utilizes confidence intervals. Defense expert Dr. [JOseph]
Nadeau testified that the distinction renders the FBI's mathematical approach scientifically unacceptable." Id. at 257. Since the statistical properties of a match window do not
depend on how it was derived, the criticism that the court descn'bes is misdirected.
58. This was the case in State v. Vandebogart, 616 A.2d 483,488 (N.H. 1992) ("The
FBI confirmed a visual m a t c h . . , because the degree of variation did not exceed plus or
minus one percent.'), State v. Aft, 504 N.W.2d 38, 47 (Minn. App,~ 1993) ("The
greatestvariance between Ali's D N A and any of the forensicD N A specimens on any of
the probes is 1.3%, approximately half the size of the match window, and within the
match windows suggested by the defense experts.")and Jakober~, 747 F. Supp. at 257.
After criticizing the ±2.5% rule, the defense expert in 3"akobetz "conceded that ff the
antorad matches . . . were within plus or minus 1% of the number o f base pairs, he
would have more confidence in the conclusion that there was in fact a match: ld. Tiffs
allowed the government nimbly to sidestep the criticism by pointing out that "all sixteen
band matches (eight alleles from each the victim and the suspect on four different
automds) were within plus or minus 1%." Id. at 258.
59. See, e.g., State v. Hammond, 604 A.2d 793, 802 (Conn. 1992) (noting that the
"female portion of the DNA that came from the [semen] stain matched 'exactly' with
that of the vicfim~).
60. See, e.g., Jakobetz, 747 F. Supp. at 257.
61. 134 F.R.D. 161 (N.D. Ohio 1991),
62. ld. at 208.
114
Harvard Journal of Law
a match would produce reports of match
members of the reference population.63 ]
results in matches for samples from the,
would be rare among innocent people, then the:evidence provesl soreething, and its probative value is unaffected ~by the truism ihat t h e : s ~ '
measurements would not match under some even more Stringent rUle.~
The defense testimony contemplated in Yeetherefore Proves very !ittle.
Disputes over the strictness of particular windows ~ or the optimal match
window--when there is no such thing67--may confuse and perplex t h e j ~
63. See infra notes 149o208 and accompanying text.
64. Nevertheless, within the window that a labomtury uniformly applies to declare a - .
match, some matches--those well within the window--are more pmhative than others.
Thus, the defense (or the prosecution) should be permitted to argue t h a t t h e smallest
window that could produce a match between the crime scene D N A and the defendant's
DNA is at least as pertinent as any broader window that also produces a match, a n d to
introduce appropriate statistics about the narrower window. In particular, one. could
argue that the probative value of the closer match depends on the frequency with which
the minimally matching window produces matches in reproducibility studies as compared
to the frequency corresponding within the reference population. (I am :indebted to
William C. Thompson for this insight.)
Cfi infra Part HI(C) (likelihood ratio as a
measure of probative value). But once the frequency of match with a given window is
presented, merely introducing testimony that there exists another window that excludes
the defendant is not particularly edifying. See/nfra note 65.
65. Once a jury knows that the match window is large enough to ensure that almost all
duplicate measurements produce matches and that a defendant's VNTR fragments match
'"
the forensic sample in a way that would occur at a frequency o f say, I/I00,000, in t h e
relevant population, it gains little or no useful information from hearing that t h e fragments do not match in a smaller window that would produce asmaller frequency of, say,
1/200,000, if they did match.
66. In Perry v. State, 606 So. 2d 224, 225 (Ala. Crim. App. 1992), the appellate
court reassured itself that a match declared by Lifecodes was acceptable because
Lifecodes's match window "of 1.8%, was stricter than that used by the FBI . . . . which
[is] 2.5 percent." C f State v. Pierce, 597 N.E.2d 107, 113 (1992) (Cellmark's determination of a match was not unreliable just because "other laboratories and experts may
use somewhat different criteria."). Such comparisons of these raw percentages, however, are misleading unless the standard errors of the laboratories are comparable. See
supra note 45. Lifecodes's standard error is smaller than the FBI's, which m a k e s
Lifccodes's window more lenient than the percentages would suggest.
67. Neither statistical theory nor legal doctrine dictates the ideal size of the window.
The former informs us that we can reduce the risk of falsely declaring a match only by
increasing the risk of incorrectly failing to declare a match. Big match windows make
for fewer false exclusions; small windows result in fewer false inclusions. With a match
window of two standard deviations per independent comparison, the risk that at least one
comparison out of ten for samples from the same person will not show a match is not
.05, but 1 - .95 ~° -- .40. Increasing the window to three standard deviations obviously
produces more matches when the samples being compared come from different people,
but it reduces the risk of failing to declare a match when the samples being compared
come from the same person to 1 - .99 ~° = .10. For more sophisticated studies of real
data, see Berry et al., supra note 45, at 520 (match-binning with a window of 4-2.5
standard deviations gives false exclusion rate of nearly 2% per probe) and lan W. Evett
"
.
F a l l , ~ 1993]
DNAEvidence
" ~:::::":::,',~ :
w h e n it c o n s i d e r s t h e p r o b a t i v e v a l u e o f a m a t c h ' . : A s a
1115
result; a c o u r t h a s
:
d i s c r e t i o n to e x c l u d e t h i s t e s t i m o n y . ~
T h e a p p r o p r i a t e n e s s o f t h e r u l e f o r d e c l a r i n g m a t c h e s also c o m e s i n t o
p l a y w h e n t h e d e f e n d a n t s e e k s to p r o v e t h a t n o m a t c h Can b e d e c l a r e d ::::
, : :
u n d e r t h e u s u a l m a t c h i n g rules. 69 A g a i n , it o f t e n w i l l b e t h e c a s e t h a t t h e
exclusion results from measurements that place a pair of fragments well:
et al., An Illustration of theAdvantages of Effident Statistical Meihods for RFLP
Analysis in Forensic Science, 52 AM. J. HUM. GENETICS498, 502 (1993).(152 3-probe .
duplicate measurements produced 20% false,exclusions for a window of =1=1,2% and
2.6% for a window of 5:2%.).
......
As for legal doctrine, one might think that some coum' reliance on a ,'two or three .,, ~ ' i ~
standard deviation rule~ in discrimination litigationshould dictate the.use Of an interval
"i'!
that spans the sfime number of standard deviations. This thought should be ~resisted. :
First, the rule itself does not mesh well with the more-probable-than-not or other"
evidentiary standards of proof. See, e.g., David H. Kaye, Hypothesis Testing in the
Courtroom, in CONTRIBUTIONS TO THE THEORY AND APPLICATION~OF STATISTICS
(A. Gelfand ed. 1987); David H. Kaye, Is Proof of Statistical Significance Reievar~?, 61
WASH. L. REV. 1333 (1986). The two-standard-deviation rule for a normally distributed
statistical measure of the difference between two groups merely means that a disparity o f
at least that magnitude will occur~about 5 % of the time that the rule is applied to cases
where the disparity is a statistical fluctuation rather than a reflection of any real dispari:
ty. It says nothing about the frequency with which the rule will fail to identify true
disparities when they are present, and it does not imply that one can be 95% ~confidant"
that a disparity outside the 95'~ interval is due to an impermissible criterion. Id,; David
H. Kaye, Apples and Oranges: Confidence Coefficients Versus the Burden of Persuasion,
_73 COP.NELL L. REV. 54 (1987). Second, whatever may be the probability that a single
measurement deemed "significant~ under the two-standard-deviation rule is due to
chance, declaring that two samples of DNA match on five probes requires ten comparisons. As shown above, multiple comparisons give the rule quite different statistical
properties. Instead of devising rules that treat relevant evidence as either admissible o r
inadmissible, as totally revealing or utterly worthless, the law here should be concerned
with conveying to the judge or jury sufficient information to gauge the probative value of
the evidence--the extent to which the various pairs'of fragment lengths match.
68. See Fed. R. Evid. 403. If all the defense can say is that a sufficiently small
window would not include the defendant, it is arguable that anything less than exclusion
of the proposed testimony would be error. However, even if the defense fails to
undertake a more careful analysis of the precise degree of matching and its implications,
the testimony about small enough windows could prompt the prosecution to do so, See
supra note 64. And, since the prosecution should be able to demonstrate the tautological
nature of the defense argument, the danger of prejudice is not overwhelming. Consequently, a strict exclusionary rule may not be needed even in this situation. It suffices to
leave it to the trial court to inquire whether additional analysis of the minimally matching
window gives a substantially different picture than the frequency associated with the
laboratory's conventional match window. If it does, argument about the effect of smaller
windows should be allowed; if it does not, the testimony invites a pointless digression
and should be excluded.
69. It has been said that the exclusion rate for most laboratories is about 30%.
Bernard Devlin et al., Statistical Evaluation of DNA Fingerprinting: A Critique of the
NRC's Report, 259 SCIENCE 748 (1993). Some of the exclusions have been dramatic.
See, e.g., Jonathan Rabinovitz, Rape Conviction Overturned on DNA Tests, N. Y.
TIMES, Dec. 2, 1992, at B6 (man convicted of rape released after 11 years in prison).
116
Harvard Journal of Law & Techlzo!ogy, : ',~,:',..: :' [Vol., 7 . .
outside the match' window, and:not'much, will turn on the precise width
: <:
of the window. 7° But what happens'when the putative exclusion is a :,
closer call? A narrow match w i n d o w reduces the.number'0f falsely
included defendants at the cost o f excluding a large number of g u i l t y
defendants whose DNA fragments no longer "match" the cfimesample. 7~
Because there can be little difference between' a pair of bands that barely
falls into a match window and a p a i r that barely fails outside ' the same
window, to consider the former ,an inclusion and the latter an exclusion
would be misleading. To avoid thi s outcome, analysts may be tempted
to designate such weak exclusions as "inconclusive. "72 Thus, in
evaluating a defendant's effort to introduce non-matches'as exculpatory
evidence, the judge or jury should attend to the degree o f non=matching
and not just the label.
The standard matching procedure, with fixed match windows, does not
lend itself to this task, but other statistical procedures do. They replace
the somewhat artificial match vs. no match dichotomy with an inquiry
into (a) the probability of finding the observed degree o f congruence in
the crime fragments and the defendant's fragments when all the fragments
come from the same person, and Co) the probability of finding this
70. Presumably, this was the case in State v. Hammond, 604 A.2d 793 (Conn. 1992).
In this unusual case, an FBI analyst testified on behalf of a man accused of rape. The
analyst stated that the tests had been run properly because the "female portion of the
DNA that came from the [semen] stain matched 'exactly' with that of the victim," while
"neither the defendant nor the victim's boyfriend could have contributed any part of the
semen stain on the victim's underwear." Even when the exclusions are clear, however,
there remains a non-zero probability that the samples came from the same source.
Consequently, occasional dicta like that offered by the Arizona Supreme Court, in State
v. Bible, 858 P.2d 1152 (Ariz. 1993), that "ff samples do not match, they must have
come from different individuals," cannot be taken literally.
71. See supra note 67.
72. The term "inconclusive" is appropriate as applied to testing that fails to produce
any measurements at all. See, e.g., State v. Woodall, 385 S.E.2d 253, 260 (W. Va.
1989) (holding that failure to match defendant's DNA with the sample from a rape
victim was irrelevant because the crime sample lacked sufficient high molecular weight
D N A to make any comparison). It is more problematic when used to designate a close
non-match, or worse, when used to dismiss selected pairs of length measurements that
almost match, so that a frequency of matches for the remaining probes in the population
can be computed. Cf. NRC REPORT,supra note 15, at 61 ("When samples fall outside
the match criterion, they should be declared to be 'inconclusive' or 'nonmatching.").
For cases that may violate this precept, see Polk v. State, 612 So. 2d 381, 392 ('Miss.
1993) (Cellmark's expert "testified that seven of eight bands from Georgia Mae Thomas's DNA met the criteria to be determined a match with the DNA obtained from the
blood on Polk's underwear") and State v. Quatrevingt, 617 So.2d 484, 492 (La. Ct.
App. 1993) (determination of a match was "supported by the evidence" that "two of the
three percentages for probe DXYS14" fell within Lifecodes's match window).
:
Fall, 1993]
DNA Evidence
117 i/: '
congruence when the fragments come from different people. 73 Before
considering these resulting alternatives to categorical matching, however;
we should investigate the second part of the currentmode of presenting
DNA test results in court--estimating the frequency of matching profiles
in some relevant population.
II. ESTIMATING MATCH-BINNING
FREQUENCIES
If a match is declared, the weight of the evidence depends~on the
probability of such a result if the suspect is the source of the sample (an
event that we may denote as S), compared to the probability of a match
if someone other than the suspect is the source (O).74 Although some
experts seem to say that there is no chance of a false inelnsion, 75 there are
scenarios that would produce such an error, 76 and there are reported
instances of false positive identifications, r7 In addition, even if the DNA
fragments really are within the match window, there is some probability
that other people have fragments in this region. If the relative frequency
of the incriminating fragment sizes is large, so that many people would
match, then the finding of a match is not very probative. Estimating the
frequency requires some analysis of population data, and the adequacy of
such analyses is controversial. Furthermore, even if a correct population
frequency can be found, there is a risk that it will be interpreted as the
73. See infra Part rn.
74. See, e.g., Richard Lempert, Some Caveats Concerning DNA as Criminal Identification Evidence: With Thanks to the Reverend Bayes, 13 CARDOZOL. REV. 303 (1991).
See generally David H. Kaye, Comment, Quantifying Probative Value, 66 B.U.L. REV.
761 (1986).
75. See, e.g., Fisbback v. People, 829 P.2d 489, 492 (Colo. Ct. App. 1991), aft'd,
851 P.2d 884 (Colo. 1993); People v. Shi Fu Hnang, 546 N.Y.S. 2d 920, 921 (Sup. Ct.
1989).
76. See William C. Thompson & Simon Ford, in The Meaning of a Match: Sources of
Ambiguity in the Interpretation of DNA Prints, FORENSICDNA TECHNOLOGY (M. Farley
& J. Harrington eds.. 1990).
77. See State v. Bible, 858 P.2d 1152, 1180 n.16 (Adz. 1993) (referring to reports of
errors in paternity determinations); Lempert, supra note 74, at 324-25; NRC REPORT,
supra note 15, at 88. But see People v. Mehlberg, 618 N.E.2d 1168, 1180 (ill. App.
Ct. 1993) (testimony of Robin Cotton of Cellmark Diagnostics that reports of Cellmark's
erroneous attribution of maternity to a woman in Maryland are mistaken). There also
are instances of "clerical errors" in calculating the frequency of matching DNA patterns
in the general population. See Perry v. State, 606 So, 2d 224, 226 (Ala. Crim. App.
1992) (original three-locus frequency estimated to be 1/209.100,000 instead of
1/23.000,000 due to "clerical error").
Harvard Journal ofl~w &TechnOlogy "
118
:-,,[Vol;;7ii::,i,i .:
~robability that someone other than the defendant is the source of the
evidence sample: s As a result, the" use ofmatch!'~luencies or"
probabilities has proved susceptible to challengein court. " ,......'
Martinez v. State~ illustratesthe type of testimony, as tofrequencms
or probabilities that has provoked objections. In Martinez, Lifecodes
Corporation "explained the significance of the match of D N A patterns"
in the following way:
Q. And what would be the answer to that question as far as the
likelihood o f finding another individual whose bands would match . . . . .
up in the same fashion as this?
A.
The final number was that you would expect to fred only one
individual in 234 billion that would have the same banding pattern
that we found in this case.
Q° What is the total earth population, if you know?
A.
Five billion.
Q. This is in excess o f the number of people today?
A.
Yes.
Basically that's what that number ultimately means is that
that pattern is unique within the population o f this planet.
Q. Is that consistent with your opinion earlier that the semen involved
in this case came from Fernando Martinez?
A.
That is correct.
The defendant, Martinez, argued that the introduction o f this testimony
was error simply because "a figure 47 times larger than the world's
current total population was 'nonsensical'; and it was so overwhelming
as to deprive the jury o f its ftmetion in fairly appraising all o f the
evidence." The Florida district court o f appeals rejected this broad-brnsh
argument against small frequencies, but other courts have been more
sympathetic,
especially when more focused arguments have been
advanced and supported by expert testimony for the defense: ° Indeed,
the procedure for computing frequencies like the one in Martinez also has
inspired the sharpest debate about D N A evidence outside of the court-
78. See, e.g., People v. Shi Fu Huang, 546 N.Y.S.2d 920, 921 (Sup. Ct. 1989) ("If
there is an adequate and reliable data base, a forensic scientist cm'acalculate that a match
did not occur by chance."); Lempcrt, supra note 74, at 306; infra text accompanying
note 233.
79. 549 SU.2d 694 (Fla. App. 1989).
80. Compare Perry v. State, 586 SU.2d 242 (Ala. 1991), with Snowdcn v. State, 574
So.2d 960 (Aia. Crim. App. 1990).
Fall, 1993]
D N A Evidence
119 '
room, in the pages of scientific journals. J
This section therefore considers several procedures for computing the
frequency of an incriminating set o f DNA fragment lengths and the
statistical and legal objections that can be raised. Part III considers the
problems in using even an accurately determined population frequency to
gauge the significance of a match.
A. Direct Estimation
How can one estimate the proportion of people in the-relevant
population whose DNA fragments would be considered to match the set
of measured lengths of the VNTR fragments derived from the crime
sample? One procedure recommended-by several commentators8~ is
simply to sample people in the relevant population, analyze their DNA,
and report the number who match the crim~ sampler' Thus, the
laboratory might report that of the, say, N = 1,000 DNA samples it has
analyzed, only the defendant's was found to match the crime sample.
The National Research Council report recommends this approach, at least
for the time being. ~
81. See, e.g., Richard C. Lewontin & Daniel L. Hard, Population Genetics in
Forensic DNA Typing, 254 SCIENCE 1745 (1991); David A. Stoney, Reporting of Highly
Individual Genetic Typing Results: A Practical Approach, 37 J. FORI~SlC SCI. 373
(1992) (recommending direct estimation supplemented by more theoretical methods).
82. In deciding whether a sample of DNA in the database matches, the laboratory
should apply the same matching rule, with the standard error applicable to inmr-gel
comparisons, that it used to declare a match in the case at bar. However, a broader
match window for counting matches in the database could only lead to an overestimate of
the population proportion; it would not prejudice a defendant who objects to DNA
evidence of a match.
83. NRC REPORT, supra note 15, at 91. For no apparent reason, the recommendation
is limited to cases in which no multilocos matches in the database are observed. In
discussing this "counting" method, the panel also suggests that "an upper confidence
limit of the frequency should be used in court~ because "est~natcs used in forensic
science should avoid placing undue weight on incriminating evidence" and "any loss of
power can be offset by studying additional loci." Id. at 75. The first reason, however,
begs the question: How does unbiased estimation place "undue" weight on the evidence7 One could argue against the panel's suggestion, with equal force, thal estimates
used in forensic science should avoid placing too little weight on incriminating evidence.
As for the panel's reliance on testing additional loci to enhance statistical power, such
testing would not increase the number of people in the database; consequently, it might
have no effect on power (as indicated by the width of the confidence interval). Furthermore, the panel misconstrues the meaning of the most common confidence interval when
it explains that "the traditional 95% confidence interval . . . implies that the true value
has only a 5% chance of exceeding the upper bound." A 95% confidence interval is
computed according to a procedure that, if applied to many random samples from the
same population, would include the population proportion in about 95% of these
;~
120
Harvard Jourm
As the N R C panel observes,
it requires no theoretical assure
population of possible perpetrate
among the restriction fragment,,
at least three objections, u For one, it grossly understates the evidential '- ::~
value of the incriminating match. As explained below, there :is eve~:
reason to believe that matches are far less frequent than I/N. Of course,
this does not mean that the defendant should be able to exclude the figure
of 1/N, which errs in his or her favor, but it counsels against a rule that
would make it the sole indication of the significance of the incriminating
match, s5
The second objection is that a random sample of the relevant
population is essential to a valid estimate, but existing databases are
convenience samples. ~ This point has been consistently rejected in
court, s7 largely because it is felt that the distribution of VNTRs is n o
different in a convenience sample than in a random sample, ss Some
samples. Each sample, and hence each interval, would be different, and one cannot say
that there is a 95 % chance that the population proportion lies within the one and only
available 95% confidence interval. See, e.g., DAVID S. MOORE & GEORGE P. MCCABE, INTRODUCTIONTO THE PRACTICE OF STATISTICS § 7.1 (2d ed. 1993). Despite
these flaws in the report, presenting a confidence interval along with the sample frequency of matches is desirable, since it conveys information about the uncertainty in the
unbiased point estimate. See Kaye, supra note 67.
84. Another objection, having to do with the choice of the reference population, is
discussed infra text accompanying notes 149-208.
85. Cf. C. Thomas Caskey, Comments on DNA-boJed Forensic Analysis, 49 AM. J.
HUM. GENETICS 893, 894 (1991) (The use of direct estimation "would represent a loss"
of potential information available from the field of population genetics). The direct
count frequency within a database would be the same, of course, for a match at 20 loci
as for a match with testing at only one locus. Yet, the probability of a random match at
20 separate VNTR loci is many o~ers of magnitude smaller than that of a single locus
match.
86. See, e.g., State v. Bible, 858 P.2d 1152, l186(Ariz, 1993); People v. Mohit, 579
N.Y.S.2d 990, 998 (Sup. Ct. 1992); Geisser, supra note 45. In "convenience sampling, ~ individuals are included in the sample because they are easily accessible. Not
being the result of a procedure that gives every individual in the population a known
probability of being sampled, the statistical properties of convenience samples ate not
well-defined. Lifecedes's samples come from paternity cases, while the FBI and Cellmark Diagnostics rely on bloodbanks. See Welt, supra note 45. Efforts to broaden or
supplement the databases are underway. Id.
87. But see State v. Bible, 858 P.2d 1152, 1186 n.23 (Ariz. 1993) (observing that
"frequency f i g u r e s . . , are valid and accurate only if they come from a truly random
sample," but purporting not "o rely on this consideration in holding that Cellmack's
calculation was erroneously admitted); Commonwealth v. Cumin, 565 N.E.2d 440
(Mass. 1991) (evincing concern that "Cellmatk compiled its Caucasian data base by
testing 200 blood samples collected at a New York City blood bank').
88. United States v. Jakobetz, 747 F. Supp. 250, 261 (D. Vt. 1990) ("Dr. Kidd
Fall, 1993]
D N A Evidence . . . . .
research supports th~.s intuition,s9 .
,
"
"
....
i,: :
.~ .:. . . .
Third, it could be argued; as defendants challenging other.'esthnates : " "
of population proportions have done, that existing databases are t0 o small
topermit usefulestimates. Courtshave also rejected this a r g u m e n t o n the
strength o f conclusory statements by geneticists that sample sizes of a few
hundred are sufficient to permit reasonable estimates.~
However, the
notion that some m i n i m u m size exists above which all e s t i m a t e s are
reliable and below which none are hardly is i n k e e p i n g with statistical
theory.
Even a small sample can supply a foundation for v a l i d l y
estimating the frequency of a characteristic in a vastly larger population.The appropriate reaction to the sample size concern is neither to reject the .~:
sample statistic out of hand n o r to accept-it without qualms, b u t t e press
for a range of estimates indicating the.. extent to which the calculation
might vary from one such small sample to another. 91
I n sum, direct counts of the frequency of the incriminating D N A
profile in the appropriate database o r d i n a r i l y should be admissible.
Indeed, at least one court has excluded indirect estimates in favor of more
direct counts. ~
testified that the composition of the data base may be less rigorous when the targeted
genes or VNTRs occur randomly.'), aft'd, 955 F.2d 758 (2d Cir. 1992), cert. denied,
113 S.Ct. 104 (1992); People v. Shi Fu Huang, 546 N.Y.S.2d 920 (Sup. Ct. 1989)
(testimony from Lifecodes that a database of under 200 samples of university students
from mainland China ~'seemed' m be from a random sampling").
89. Bernard Devlin &Neil Risch, Ethnic Differentiation at VIVTR Loci, with Special
Reference to Forensic Applications, 51 AM. J. HUM. GENETICS534, 545-46 (1992).
90. See, e.g., Jakobetz, 747 F. Supp. at 261 (~According to Dr. Kidd, once it is
determined that the alleles are randomly distributed throughout a targeted population,
sample size can be decreased to as little as 100 individuals.r); Shi Fu Huang, 546
N.Y.S.2d at 921 (minimum size said to be 200). For a more careful treaunent of the
issue, see Ranajit Chakraborty, Sample Size Requirements for Addressing the Population
Genetics Issues of Forensic Use of DNA Typing, 64 HUM. BIOLOGY141 (1992) and
Ranajit Chakraborty et al., Evaluation of Standard Error and Confidence Interval of
Estimated Multilocus Genotype Frobabili~es and Their Imph'cations in DNA Forensics,
52 AM. J. HUM. GENERICS60 (1993) (method for reporting match-binningfrequencies
that accounts for sampling error).
91. On the appropriate statisticalprocedures for producing such interval estimates, see
Bruce S. Weir, Forensic Population Genetics and the NRC, 52 AM. J. HUH. GENETICS
437 (1993).
92. In Caldwell v. State, 393 S.E.2d 436, 443 (1990), Lifecodes's calculation that
frequency of genotype in population was 1/24,000,000 was replaced with the figure of
1/250,000 derived from the "more conservativeapproach [of using] the database itself,
and not "any population theory.'~ Because e~isting databases are much smaller than
250,000, however, it not obvious how "the database itself" was used to produce the
1/250,000 figure.
"
.
:-~:.i
.122
H a r v a r d Journal o f
B.
Inferences f r o m ,,allele~. Frequencies ':ic:.::, ::.:.:!:.::~: ;:!
1. The Independence M e t h o d With M a t c h WindowS E q u a l t6~B i n Widths,,
Basic.Bins
:
, ....
Direct estimates o f match frequencies ,that give! V a n i ~ g l y
.
,-
" "
..:
Small
n u m b e r s , like t h o s e i n Martinez, have b e c o m e p r e v a l e n t d e s p i t e their
shortcomings,
The basic m e t h o d . n o w
e m p l o y e d for "i n f e r r i n g the
" g e n o t y p e " frequency f r o m "allele" frequencies presupposes i n d e p e n dence 93 o f certain genetic characteristics a n d is therefore referred to a s t h e
" i n d e p e n d e n c e method. "94 The m e t h o d involves three steps: estimating
"allele" frequencies, d e d u c i n g " g e n o t y p e " frequencies at each locus, and
d e d u c i n g a " g e n o t y p e " frequency for all the loci.
I n the first step, for each D N A fragment, o n e counts the n u m b e r o f
indistinguishable (or similarly sized) fragments i n the database.
This
c o u n t i n g procedure is often called " b i n n i n g " because it piles fragments
o f slightly different sizes into distinct " b i n s . " T o estimate the r e l a t i v e
frequencies o f different sized fragments, forensic laboratories use either
"floating " ~ o r "fixed "96 bins. 9~ D N A fragments o f similar l e n g t h s that
93. Two events are independent if the occurrence of one is not associated with the
occurrence of the other. Cards, dice, roulette wheels, coins, and balls in urns provide
classic illustrations. For example, if a coin is tossed vigorously twice, obtaining a head
on the second toss is independent of a head on the first. When events are independent,
the probability of their .joint occurrence is the product of the probabilities of each event:
if the coin is fair, the chance of two heads is (1/2)(1/2) = I/4.
94. Courts sometimes speak of a "prnducff or "multiplication rule~ for independent
events, but this terminology is infelicitous, for there is another multiplication rule,
involving conditional probabilities, for dependent events. See, e.g., William Fairley &
Frederick Mosteller, A Conversation About CoUim, 41 U. CHI. L. REV. 242 (1974);
David H. Kaye, Statistics for Lawyers and Law for Statistics, 89 MIC~I. L. REV. 1520
(1991).
95. A floating bin is just the barof a histogram centered on the length of a fragment
seen in the incriminating DNA. (A histogram is a bar chart in which the heights of the
bars represent the proportions of items in the range, or ~bin," covered by that bar.) As
Weir & Gaut, supra note 47, explain:
Since any band of length s satisfying [the first equation of note 48]
would be said to match an evidence band of length e, a bin is constructed around length e to contain all such lengths. From this equation, ~.q
matching lengths d must satisfy
(
1-"~e <
(1+¢C~e
These floating bins have approximate width 4 . centered on the evidence
:
"
F a l l , 1993]
DNA Evidence
'
are p u t into t h e s a m e b i n c o n s t i t u t e a n "allele. "9s
~
123 !
F o r e x a m p l e , i f p,:
d e n o t e s t h e p r o p o r t i o n o f e a c h " a l l e l e " g e n e r a t e d f r o m t h e s e c o u n t s , the.;~ ~. :
first fragment might migrate a distance that puts it ina bincontaining pt
= 1/5 of all fragments for that locus. The only legal objections that can
be raised to the above procedure relate to the source and size o f the
band. Each band in thedatabase is examined, and those satisfying [the
first equation of note 48] are assigned to the bin for that evidence band.
In this way the bin frequency is obtained.
='
96. A fixed bin is a bar of a histogram established before referring to any of the
observed fragments:
A set of fragments of known length are used as:bin~boundaries. These
fragments are those produced by digesting viral DNA with restriction
enzymes, and the lengths serve as "sizing ladders" on electrophoretic
• gels. For binning, however, the important thing is that a set of bins are
pre-defmed with fixed boundaries. Once a match has been declared, the.
evidence band is assigned t o a fixed bin. Because there is uncertainty
associated with the length e of the evidence band, a window of width 2c~
centered on • is constructed. If this window lies wholly within a fixed
bin, the band is assigned to that bin. If the window includes a bin
boundary, it is not known to which fixed bin the true band length
belongs. It is known, however, that the true length belongs to only one
bin, and a conservative procedure is to assign the band to the bin with
highest frequency.
Weir & Gaut, supra note 47. Consequently, "there is no logical basis for the recommendation of a recent National Research Council (NRC) report that the band be assigned
to a bin obtained by adding the two adjacent bins in cases of overlap." Id.
The advantages of fixed binning are that the laboratory can estimate allele frequencies
by consulting a table instead of performing new counts for each fragment in the crime
sample and that statistical tests can be applied to the predefined bins to establish
independence. In practice, the FBI uses bins whose widths exceed the match window,
thus producing overestimates of allele frequencies.
97. In either case, the width of a bin should correspond to the laboratory's matching
role, using the standard error for inter-gel comparisons--an obvious precept that
Lifecodes failed to observe in People v. Castro, 545 N.Y.S.2d 985 (Sup. Ct. 1989), but
now abides by. See People v. Golub, 601 N.Y.S.2d 502, 504 (App. Die. 1993). For
an empirical comparison of fixed and floating bins, showing that, or/average, fixed bins
produce larger frequency estimates for DNA profiles, see Keith L. Monson & Bruce
Budowle, A Comparison of the Fixed Bin Method with the ; .~:ating Bin and Direct Count
Methods: Effect of VNTR Profile Frequency Estimation and Reference Population, 38 J.
FORENSIC SCI. 1037 (1993).
98. The term "allele" is taken from other contexts where it refers to a form of a gene,
that is, a sequence of DNA that codes for observable traits. Two VNTR fragments of
the same length would be considered "alleles" even though their base sequences might
differ and even though they do not code for any known traits. Moreover, since the
VNTR fragments in a database are clumped by size into bins, the bins contain a range of
differently sized fragments. Therefore, a better term for a set of comparable fragments
might be "binelle." Cf. Bernard Devlin et al., Estimation of Allele Frequencies for
VNTRLoci, 48 AM. J. HUM. GENETICS662 (1991) (procedure for deducing the frequencies of fragments with the same numbers of tandem repeats from the distribution of
measured sizes).
.
124
Harvard Journal
database. ~
The second stage' o f ~ e analysis
,, genotype , at each V N T R locus. 10o I!
assump2ion comes into p l a y . Every pel
cont,'~in a particular V N T R . usually giv.us . . . . . . . . . . . . . . . . .
E,-~,,,~. ::
for each enzyme-probe system; ~°2 Having estimated the:rel~Ve
-::I,:I..
frequency of each~t~-fit size, one now computes
~:~:~bserved pair {i,j} of f-ragraents. If the population i
(
calV~Hardy-Weinberg equilibrium, I°3 then I°4 the pr0portions iof the : :
e~ = 2ppj
. . . .
(1).l~ ~,
99. The congruence of the database with the population of plausiblesuspects, which is
treated below, is also open to attack. Contrary m the opinion of the Nebraska Supreme ,::
Court in State v. Houser, 490 N.W.2d 168, 183 (1992), Hardy-weinbergequilibrinm, i
see infra note 103, plays no role in the estimation of "allele" frequencies.
100. Since no genes are involved and the operationally defined %lle|es = include a
hodgepodge o f tree alleles, a more apt term would be "binotype." Devlin et al., supra
note 98. In the usual terminology, however, a "single l o c u s genotype ~ is the set of :
~alleles" detected by a single probe.
.
-101. A single band will appear if a person's mother and father both transmitted the
same allele (the person is homozygous) or if one band has not be~n detected.il 102. As explained in Part !, each restriction enzyme cuts a long D N A molecule into
much shorter fragments by cleaving a specific sequence of bases, and a probe binds to
those fragments that coutai~ varying numbers of the ~ n s e n s u s seq-lence within these
restriction sites. Consequently, the distribution o f fragment sizes i n the population
depends on the enzyme and probe. After "tligesting" DNA with one enzyme and
applying a probe after electropboresis and blotting, the probe can be w~kshed from the
DNA, and a probe that recognizes a different consensus sequence then can bo applied.
This probe identifies a length variation that starts at another location, o r "locus," along
the D N A molecule.
103. Hardy-Weinberg equilibrium follows rigorously under three conditions: (1)~a
Mendelian pattern of inheritance(no mutation and alleles segregate independently); (2)
no selection (the expected number of fertile progeny from a mating that roaches maturity
does not depend on the genotype of the mates); and (3) an infimte, unstructured populalion (i.e., malings and genotypes are uncorrelated in an infinite population). See, e.g.,
L.L. CAVILLA-SFORZA & W.F. BODMER, THE GENETICS OF HUMAN POPULATIONS
(1971).
104. The converse is not true: Hardy-Weinberg equilibrium is not a necessary
condition for independence of alleles at a locus. Independence can exist in the presence
of selection or non-random mating. See Richard C. Lewontln & C.C. Cockerham, The
Goodness-of-Fit Testfor Detecting Natural Selection in Random Mating Populations. 13
EVOLUTION 561 (1959); C.C. Li, Pseudo-random Mating Populations. In Celebration of
the 80th Anniversary of the Hardy-Weinberg Law, 119 GENERICS 731 (1988).
105. The factor of 2 reflec:s the ~fact that "allele" i could lie on the chromosome
inherited from the mother and j on the one from the father, or vice versa. In other
words, the "genotype = could be written (i,j) or (j,i), where the first "allele" is from the
maternal chromosome and the second is from the paternal one. See supra n o t e 4 3 . In a
.........
i) :i ."::-~
?i
Fall, 1993]
.... " " ..... .DNAEvidencei~i ~:~:
T h u s , i f t h e first
fragmenta!
o f all the fragments f r o m ~
falls in
a locus i S : i n a Size
:locus: and the:seco!
a bin that:contains, 1/10
for p e o p l e in the.
of.the f r a g m e n t s , , ~
.-
population, then-the relative frequency o f the , g e n o t y p e = {1,2} at this
locus.1 w o u l d he':P, = 2 ( 1 / 5 ) ( 1 / 1 0 ) =
1/25.
-: '
-::!
Initially, v a r i o u s experts argued t h a t the n u m b e r o f h o m o z y g o t e s - - i n - . . i i
d i v i d u a l s w ~ h apparently equal fragment lengths at a l o c u s , e x c e e d s t h e
If'
.
'~
.
-
:
.
o
'°
:
,
expected v a l u e u n d e r t h e assumplaons o f a H a r d y - W e i n b e r g e q u i l i b n - : :
urn. ~°6
T h e a r g u m e n t s w a y e d s e v e r a l courts, ~°7 a n d ,
indeed,- most
~.::
:
•
:
i.~-' ::::
o p i n i o n s that question the p o p u l a t i o n frequencies d o s o b e c a u s e o f express}
:i .....
doubts about the H a r d y - W e i n b e r g e q u i l i b r i u m a s s u m p f i o u s i / ~ : TO s o m e i / i :
courts __tiff.':
is ,the s?le debatable link in t h e c h a i n o f reasoning t h a t ig e n o t y p e frequency estimates: '°9 I n r e p l y , the F B I and other: : , : :i :::/:!
produces
population in Hardy-Weinberg equilibrium, th~:~,stsituation occurs in a fraction ppj
the population, as does the second. Therefore, the pmlmnion which is either(ij) or 0f.~
is 2ppj: If a DNA sample gives rise tO 0nly one allele," i, :itmay be because the ::::
person inherited the same ~allele" from both parents. Homozygos!ty,-as this is called,
•
can happen Only one way,-(i,i)--2so itsrelative frequency:~s~p~.i H o w e v e r , it,is::als0 • :
".
possible that the :person is hetemzyg0ns but has a "null allele" that escapes detection
/ "
(e.g., because it isvery small and runs off the gel) or has alleles that "coalesce" on the /
autoradiograph because the two alleles are very~close together. See E.M. Steinberger et :
:
al.,On the:Use of Excess'Homozygosity for Subpopulation Detection, 52 AM. J. HUM:
GE:q~-WICS1275.(1993 ). When •loci show only One "allele," the FBI use s the figure 2pi,'
which overstates the"genotype ~ frequency under either scenario; Life.codes nses2p~ for
enzymes that,rarely if ever produce null alleles and 2Pi otherwise.
:.
106. See:Caldwell v. State, 393 S.E.2d 436; 443 (Ga. 1990) (testimony ofJung Choi . . . . .
that Lifec0des's database indicated a population that was not in Hardy-Weinberg :
equilibrium); Joel E. Cohen, DNA Fingerprin~n" g for Forensic Identification: Potential
Effects on Data lraerpretcaion of 8ubpopulation Heterogeneily and Band Number
Variabilily, 46 AM. J. HUM. GENEWICS358 (1990); Eric S. Lander, DNA Fingerprinting
on Trial, 339 NATURE 501, 504 (1989) (editorial noting ,spectacular deviations from
Hardy:Weinberg equilibrium" in Lifecodes's data, ind/cating "genetically distinct
subgroups within the Hispanic sample').
107. See, e.g., State v. Bible, 858 P.2d 1152 (Ariz. 1993) (misreading an expert's
explanation of excess homozygosity as a concession that her calculation of the population
frequency was not based on a generally accepted method); State v. Peunell, 584 A.2d
513 (Del. Super. Ct. 1989) (fact of Cellmark's match in serial murder case admissible
under reasonable reliance test, but match probability of 1/180,000,000,000inadmissible
due to excess homozygosity indicating lack of Hardy-Weinberg equilibrium, and
questionable binning procedures). Oddly, the claim of "statistically significant deviations
from Hardy-Weinberg equilibrium" continues to impress courts even though, as indicated below, the scientific debate has ceased. See, e.g., State v. Cauthron, 846 P.2d 502,
515 (Wash. 1993).
108. See, e.g., Bible, 858 P.2d 1152; Prater v. State, 820 S.W.2d429, 438-39 (Ark.
1991); Ca~weU, 393 S.E.2d 436.
109. See, e.g., State v. Wilson, 817 P.2d 1136 (Kan. App. 1991) (opinion designated
not for publication) (where FBI agent ~testified at trial the formula used in calcnlaling'(:
the frequency of a particular DNA band was based on standard probability theory and!,
derived from the Hardy-We~.berg equilibrium, which has been modified to compensate
-
k
.
:
measuring process rather than a disequilibrium, n° :. In~the:~:ensuing
technical debate, m critics retreated from the claim of excess h0mozygosi:
ty to the weaker claim that the statistical tests are not powerfulen0ugh t0
:
disprove the hypothesized lack of independence.u2 This fullback position
has grown increasingly untenable a s m o r e studies report substantial
equilibrium at most loci. 113
The real issue, however, is not "statistical significance "m- but i=ather :
-:
for limitations in forensic DNA
, -
•
profirmg," and defense expert, a professor ofbiologya~ ...:i
equilibrium, the conflicting testimony did not conclusively show those re,sults :are
unreliable, and disagreement goes "only to the weight.of the test results.?); Mandujano
v. State, 799 S.W.2d 318 ('rex. Ct. App. 1990). For a particularly garbled account, see
Martinez v. State, 549 So.2d 694, 695 (Fla. App. 1989) (descn'bing "the Hardy-weinberg equilibria" as ~an established statistical data base" and then as a "formula"). ~
110. See Bernard Devlin, Nell Risch & Kathryn Roeder, No Excess of Homozygosity at
Loci Used for DNA Fingerprinting, 249 SCIENCE 1416 (1990). The FBI attributes the
excess of apparent homozygotes in its database to "tecbuical problems which sometimes
show only one band exhibited from a heterozygote [as when] a band is so small it rims
right off the gel, or 2 bands occur so close to one another so as to appear as a single
band." People v. Mobil, 579 N.Y.S.2d 990, 996 (Sup. Ct. 1992).
111. See Joel E. Cohen et al., Forensic DNA Tests and Hardy-Weinberg Equilibriun,~,
253 SCIENCE 1037 (1991); Philip Green & Eric S. Lander, Forensic DNA Tests and
Hardy-Weinberg Equilibrium, 252 S ~ C E 1038 (1991); Bernard Devlin etal., Forensic
DNA Tests and Hardy-Weinberg Equilibrium, 252 SCIENCE 1039 (1991). See also
Ranajit Chakraborty et al., Apparent Heterozygote Deficiencies Observed in DNA Typing
Data and Their Implications in Forensic Applications, 56 ANN. HUM. GENEriCS 45
(1992); Ranajit Chakraborty, Statistical Interpretation of DNA l~ping Data, 49 AM. J .
HUM. GENETICS 895 (1991); Ranajit Chakrabony & L~ Jin, Heterozygote Deficiency,
Population Substructure and Their Implications in DNA Fingerprinting, 88 HUI~:.
GENERICS 267 (1992); Bernard Devlin & Nell Risch, A Note on Hardy-Weinberg
Equilibrium of VlVTR Data by Using the Federal Bureau of Investigation's Fixed-Bin
Method, 51 AM. J. HUM. GENETICS549 (1992); Bruce S: Weir, Independence of VNTR
Alleles Defined as Floating Bins, 51 AM. J. HUM. GENEriCS 992 (1992) (all defending
reliance on Hardy-Weinberg equilibrium).
112. See Eric S. Lander, Reply, 49 AM. J. HUM. GENETICS 899, 900 (1991) ("Critics
• . . reply that such tests have insufficient power to detect deviations [from HardyWeinberg equilibrium] if present"); cf. Seymour Geisser & Wesley Johnson, Testing
Hardy-Weinberg Equilibrium on Allelic Data from VlVTR Loci, 51 AM. J. HUM.
GENETICS 1084 (1992) (proposing other statistical procedures for assessing independence).
113. For the view that statistical tests with substantial power demonstrate the independence of VNTR alleles, see, for example, Bruce Budowle et al., Reliability of Statistical
Estimates in Forensic DNA Typing, in DNA ON TRIAL: GENETICIDENTIFICATION AND
CRIMINAL JUSTICE 79, 87-88 (Paul R. Billings ed., 1992)(summarizing studies);
Chakrabony, supra note 90, at 155-56; Bruce S. Weir, Independence of VNTR Alleles
Defined as FixedBins, 130 GENEriCS 873 (1992); Devlin et al., supra note 69; Kathryn
Roeder, DNA Fingerprinting: A Review of the Controversy § 3.2.1 (1993) (unpublished
manuscript, on file with the author) (sun~mlariaing studies).
114. An observed discrepancy would be "statistically significant" if the probability of
observing so large a departure from equilibrium when, in reality, the population is in
ii
i!~i
Fall, 1993]
D N A Evidence
practical or Substantive significance.
127
may, at the same time, make no meaningful difference to the match-.In t h i s regard, simulation studies indicating that
any departure from independence has minor effects o n match-binning
frequency estimates support simple multiplication o f "allele" frequencies
to find the "genotype" frequencies at each locx,s, n6
After estimating "allele" and " g e n o t y p e " frequencies, denoted be Pt
at each locus 1, the final step i n the independence method requires
combining the various Pl's.
:i)"
:.
:'
While~ a small or moderate : :
departure f r o m equilibrium may not be detectable in the existing data, it
binning frequencies, m
'~
This procedure generates the relative
frequency o f the total "genotype" G = {gt,g2,g3,g4,gs} f o r a m a t c h at five
loci. If "linkage equilibrium," a Situation in which there is no correlation
between "genotypes" at different loci, arises, the frequency o f the pattern
for all the loci resembles the outcome o f a series o f coin flips. I t i s t h e
product o f the frequencies at each locus:
P = PtP2 . . . P s
(2)..
. The Population Structure Objection
The most powerful c r i t i c i s m o f this simple calculation concerns the
population stmcturen~--the presence o f subgroups with varying D N A
equilibrium, falls below some threshold, like 0.05.
115. For this reason, it can be misleading to insist that "the product r u l e . . , can only
be appfied when the pairs of alleles are statistically independent,~ Geisser & lohnson,
supra note 112, at 1084; or that "the validity of the mnitipfication rule depends on the
absence of population substmcture, because only in this special case are the different
alleles statistically uncorrelated with one another." NRC REPORT,supra note 15, at 79.
116. See Devlin & Risch, supra note 89; Evett et al., supra note 67, at 502 (simulations of 1.2 million false accusations for all pairs of three probes in Caucasian database,
like previous experiments with other databases and probes, showed that "the assumption
of pairwise independence between probes has no unacceptable practical effects");/an W.
Evett, DNA Statistics: Putting the Problem into Perspective, 33 JURIMETR1CSJ. 139
(1992) (exhaustive pairing of 1500 Caucasians tested at three loci to generate over a
million between-person comparisons to simulate cases of false accusations and estimating
frequencies of matching profiles assuming independence gave nine false matches, most of
which had match frequencies larger than 1/100); lan W. Evett & R. Pinchin, DNA
Single-Locus Profiles: Tests for Robustness of Statistical Procedures Within the Context
of Forensic Science, 104 INT'L J. L. & MED. 267 (1991); Brace S. Weir, Independence
of VNTR Alleles Defined as Floating Bins, 51 AM. J. HUM. GENETICS992 (1992); cf.
Berry et al., supra note 45 (bands are independent at one locus, but measurement errors
are ~rrelated). But see Weir, supra note 113, at 886 (disequilibrium found for some
bins at some loci for single-fragment measurements in FBI Hispanic and Black databases).
117. See United States v. Jakobetz, 747 F. Supp. 250, 263 (13. Vt. 1990)
•
128
Harvard Journal of Ianv &, Technology
: +~ [voL'7
?. + +
.
.
patterns that tend to mate among themselves:m This structure contradicts
the assumptions that guarantee independence of alleles at a specific locus
(Hardy-Weinberg equilibrium),n9 and casts a doubt on the validity of
multiplying "genotype" frequencies across loci. Depending on the
intercorrelations of "alleles" within subgroups, the full "genotypo"
frequency P in the broad population may be higher, lower, or even the
same as PIP2P3P4Ps. Likewise, depending on the details of the population
structure, the multilocns "genotype" frequency in a particular subpopulation may be higher, lower, or the same as PtP2P3P+Ps.
But how much higher or lower?. At present, it is doubtful that
population structure makes much of a difference. Of course, the
independence assumptions do not bold rigorously. Few assumptions in
applied mathematics or statistics do. Since almost all scientific work
proceeds on the basis of simplifying assumptions, the question is whether
the simplification produces satisfactory approximations. Thus, some
medical geneticists argue, often on the basis of impression, that the
degree of population structure is modest n° and that overestimates at some
loci are likely to be countered by underestimates at others, so that the use
of the final product will not systematically disadvantage defendants.m
("[S]ubslructure is arguably the weakest link of the DNA profiling chain.'), aft'd, 955
F.2d 786 (2d Cir. 1992), cert. denied, 113 U.S. 104 (1992); see a/so NRC REPORT,
supra note 15, at 79 ("[W]hether actual populations have significant submucaLre for the
loci used" is "the key question."). But see Peter Donnelly, Discussion of Paper by
Berry, Evett & Pinchin, 41 APPLIED STAT. 521, 524-25 (1992) (presenting a theoretical
basis other than population structure to suspect difficult to detect correlations across
loci).
118. Lewontin & Hart/, supra note 81; Cohen, supra note 111.
119. Despite the representations of some experts testifying in support of FBI findings
of matches, the mere fact that people do not consider the VN_'[~.s of their sexual partners
does not satisfy the "random mating" assumptions behind Hardy-Weinberg equilibrium.
See supra note 103. In Jakobetz, 747 F. Supp.at 260, for instance, the district court
observed that "Dr. Lewontin did discredit the government's experts who casually
concluded that VNTRs must randomly occur throughout the population because individuals do not consciously consider VNTRs when they choose their mates." Even after this
rebuke, however, the FBI appears to have continued to advance this simplistic argument.
See, e.g., People v. Mohit, 579 lq.Y.S.2d 990, 996 (Sup. Ct. 1992) ("The FBI, in
supporting its claim of Hardy-Weinberg, argues that mating is random since individuals
are not aware of their partner's VNTR patterns."). However, unless some other
characteristic affecting mating within subgroups is correlated with VI~I'Rs, the observation implies that each subpopulation is in equilibrium.
120. See, e.g., United States v. Yee, 134 F.R.D. 161, 185-87 (N.D. Ohio 1991).
121. See, Id. at 187 (testimony of Stephen Diager and Kenneth Kidd); Mohit, 579
N.Y.S.2d at 997 (testimony of Michael ConneaUy). Support for these impressions may
be found in Weir, supra note 113, which presents four-locus "genotype" frequencies for
VNTR patterns computed according to "allele" frequencies in all possible pairs of black,
Caucasian, Florida Hispanic and Texas Hispanic databases, and concludes:
-
•
: ii++:
+ .
Fall, 1993]
DNA Evidence
~ 129 ~' ~
Moreover, studies of the frequency of matching ~alleles"i for large
numbers of pairs of different people in labc~ratory databases seem to show
no false matches across four or five loci and rates of matches on subsets
of loci that do not depart markedly from the expected values given
independence of "alleles" across loci. n2
Accumulating evidence supports the independence of the VNTR
loci. 123 Yet, because of the lack of dramatic differences in the frequencies of VNTR alleles across ethnic subpopulafions, 124 and because of the
small differences attributable to using an inapposite racial database ~n
Although different bin frequencies lead to different four-locus
estimated frequencies, the differences are rarely more than two orders of
magnitude, and generally less than one order of magnitude. I t is as
though frequency differences tend to cancel each other--some fragments
are more frequent in one database while others are less frequent.
Id. at 885.
122. See George Herrin, Jr., Probability of Matching RFLP Patterns from Unrelated
Ind/v/dua/s, 52 AM. J. HUM. GENETICS 491 (1993); Neff J. Risch & Bernard Devlin,
DNA Fingerprint Matches, 256 SCIENCE 1744 (1992); Risch & Devlin, supra note 45.
From this analysis, these biostatisticians conclude that ~[t]he .observed independence Of
matehing among loci, both in the FBI and Lifecodes data sets, provides no support for
claims of linkage disequilibrium within ethnic groups, indeed, if linkage disequilibrktm
among loci does exist, it has little effect on the probability of two random individuals
having matching genotypes." Id. at 719. See also Weir, supra note113, at 997 ("By
randomly generating many prof'des, however, this study has demonstrated that . . . . .
whatever levels of dependence do exist are unlikely to have a meaningful impact on
forensic calculations.~).
123. See, e.g., Devlin & Risch, supra note 89; Berry et al.~ supra note 45. But cf.
Weir, supra note 113, at 886 (some two-locus associations found at ,05 but not ,01
significance level for some single-fragment patterns in FBI database for BlacJ~, but even
these associations disappeared when only double hetemzygotes were considered).
124. See, e.g., Devlin & Risch, supra note 89. It is also suggested llmt genotype
frequencies differ more among groups than within ethnic groups.
See Ranajit
Chakraborty, NRC Report on DNA Typing, 260 SCIENCE 1.059 (1993) (letter insisting
that "the extent of regional difference within a racial group is far less than that between
races" and that "analysis of hypervariable DNA loci [demonstrate that] .the mean kinship
within race is 0.4% ~ which is "less by an order of m a g n i t u d e . . , than for blood groups
and isoanzymes~); Bernard Devlin et al., IV'RCReport on DNA Typing, 260 SCIENCE
1057 (1993) (letter asserting that "the estimate of diversity based .on variance .of allele
frequencies among subpopulations is usually quite small--approximately O . 1 % ' ) ;
Bernard Devlin et al., Statistical Evaluation of DNA Fingerprinting: A Critique Of the
NRC's Report. 259 SCIENCE 748, 837 (1993). Contra Daniel L. Haiti & Richard C.
Lewontin, DNA Fingerprinting Report, 260 SCIENCE 473, 474 (1993) (letter asserting
that ~there is approximately as much genetic variation among ethnic groups Within major
races as there is among the races"). This last statement from H a r t and Lewontin seems
to recognize that their original claim of substantially more genetic variation.among ethnic
subgroups within races than across the races was overstated. Lewontin & Harti, supra
note 81, at 1745 (paper typically cited in opinions holding that populatioil structure is so
serious or controversial a problem that big bin computations are rinadmissible).
k=
130
Harvard Journal, of Law.' & Technology . ,'':.:::~?[Vol.:7 .'. ',
simulation studies of false'matches, ns the controversy, over the implica-
tions of population structure for the independence method lingers O n.~,26.,
As one might expect, the debate is not easy for the courts to untangle;n7
In People v. Pizarro, m for instance, the California court of appeals
quoted at length from various scientificpublicationsand submissions, and
lamented:
The difficulty is, where does this place us? It places us in
the middle of the conflict as to whether or not the basic
theory of population genetics involving broad racial and
ethnic groups as opposed to the argumeat of substructure
has any general acceptance in the relevant scientific community--a conflict which we cannot resolve on the present
record.
Neither does the NRC study settle the issue. To the contrary, it pointedly
avoids it. Unable to agree that the population structure objection is valid
for VNTRs, the panel simply "decided to assume that population
substructure might exist" and to propound one particularly "conservative"
method to respond to this hypothetical problem. 129
125. See, e.g., Ian W. Evett, DNA Statistics: Putting the Problems into Perspective,
33 JURIMETRICSJ. 139 (1992) (using Afro-Caribbean instead of Caucasian database to
estimate three-locus profde frequencies in one million simulated cases of false accusations raised the false match rate from 9 to 16 per million). Such studies demonstrate that
the =potential error rate" associated with the independence method weighs in favor of
admitt~g such computations. See supra note 30 (Daubert =considerations" for discerning =scientific knowledge").
126. See Iolm Bmokfield, Law and Probabilities, 355 NATURE207 (1992); Ranajit
Chakraborty & Kenneth K. Kidd, The Utility ~i, DNA Typing in Forensic Work, 254
SCIENCE 1735 (1991); Christopher Wills, Forensic DNA Typing, 255 SCIENCE 1050
(1992); Richard A. Nichols & David J. Balding, Effects of Population Structure on DNA
Fingerprint Analysis in Forensic Science, 66 HEREDITY 297, 301 (1991); Bruce S. Weir,
Discussion of the Paper by Berry, Evett & Pinchin, 41 All'LIED SLAT. 521,528 (1992);
c/. Donnelly, supra note 117. Compare Daniel L. Hartl & Richard C. Lewonfin, DNA
Fingerprinting Report, 260 SCIENCE473 (1993) with B. Devlin et al., NRC Report on
DNA Typing, 260 SCIENCE 1057 (1993) (exchange of letters on genetic variability within
ethnic groups of racial populations as opposed to the variability across populations, and
on the interpretation of Dan E. Krane et al., Genetic Differences at Four DNA Typing
Loci in Finnish, Italian, and Mixed Caucasian Populations, 89 PRO(3. NAT'L ACAD. SCI.
10583, 10585 (1992), discussed infra note 201, on allele frequencies within certain
ethnic groups).
127. The tendency of courts to cite the opinions of other courts rather than scientists
for scientific propositions and to lag behind the rapidly accumulating scientific literature
combine to exascerbate the problem.
128. 12 Cal. Rptr. 2d 436, 456 (Ct. App. 1992).
129. NRC REPORT, supra note 15, at 94. See also id. at 80 (~the committee has
Fall, 1993]
D N A Evidence
~131
Even before the N R C committee spoke, "conservative" ~3° alternatives
for computing the match-binning "genotype ~ f r e q u e n c y P had been
a d v a n c e d - - a n d i m p l e m e n t e d - - t o counter the population structure concern.
W i t h the N R C ' s recommendations for even greater caution, the pressure
to overestimate population frequencies has increased. T h e opinion o f the
Supreme
Judicial
Court
o f Massachusetts,
in
Cotrononwealthv.
L a n i g a n , t31 illustrates how compelling the calls for caution can be.
In
disposing o f matches with population frequencies on the order o f one in
several million, the Massachusetts court explained that "[t]he national call
for considered, conservative approaches to D N A testing, such a s the use
o f ceiling frequencies, and the absence o f such an approach in the present
cases, underscore the w i s d o m o f the motion j u d g e i n excluding the test
evidence." m
However, it may n o t b e so wise to compel the experts t o bend over
backwards in computing population frequencies.
A more accurate
estimate o f the interval in which the true frequency lies may be o f more
assistance to the j u r y , o r a somewhat different, but still conservative
procedure, may be superior to the N R C committee proposal.
For
example, a series o f recent papers show how to incorporate existing
information on population substructure into estirnate~ o f population
frequencies, m Before courts or legislatures decide on which methodology
chosen to assume for the sake of discussion that population substructure may exist and
provide a method for estimating population frequencies in a manner that adequately
accounts for it.').
130. A "conservative" estimate of an allele frequency is an estimate that is too large,
and hence biased in favor of defendant. See, e.g., Commonwealth v. Lanigan, 596
N.E.2d 311,316 (Mass. 1992).
131. Id.
132. Id. at 316. See also State v. Cauthron, 846 P.2d 502, 517 (Wash. 1993)
(remanding for a determination of whether ~the empirical evidence utilized by Ceilmark
is valid under the crtieria set forth by the [NRC] Committee").
133. See, e.g., David Balding & Richard A. Nichols, DNA Profile Match Probobility
Calculation: How to Allow for Population Stratification, Relatedness, Database Selection
and Single Bands (Mar. 24, 1993) (unpublished manuscript, on file with the author)
(proposing a procedure that estimates genotype frequency within a subpopolation or
among relatives using measures of interpopulation variation in allele frequency, said to
be superior to the "complicated, ad hoe and overly-conservative~ ceiling principle);
A.W. Sudbury & J. Martinopoulos, Assessing the Evidential Value of DNA Profiles
Matching Without Using the Assumption of Independent Loci, 33 J. FORENSICSCI. SOC'Y
73 (1993); James F. Crow & Carter Denniston, Population Genetics as It Relates to
Human Identification, PROC. FOR THE FOURTHINT'L SYMP. ON HUM. IDEN'IIHCATION
(forthcoming 1994) (describing a procedure that incorporates existing data on population
structure into computations of the reference population frequency); Bruce S. Weir,
Conditional Genotypic Frequencies in Forensic Analysis, Paper presented at the Nat'l
Institute of Statistical Sciences Forum on DNA Fingerprinting (Oct. 21, 1993) (describ-
132
Harvard Journal o f L a w & Technology
- ~ [Vol. 7
to accept, they should consider the f u l l gamut o f Conservative approaches
and their relation to the basic independence m e t h o d . W i t h the essential
features o f these "overestimation" methods elucidated, we shall return to
the population structure objection to match-binning frequencies.
3. Overestimation Methods
Independence with Big Bins.
The F B I a n d other proponents o f the
independence method have responded to criticism o f the equilibria
assumptions with a form o f confession and avoidance.
They concede
that, strictly speaking, the assumptions do not hold, but they argue that
any plausible underestimate o f the genotype frequency is avoided by the
intentionally "conservative" aspects ofmatch-birming as practiced b y t h e
FBI. Such practices include using big bins relative to match windows,
combining bins so that none contain less than 5 % o f the alleles i n the
population, and treating a fragment near a fixed bin boundary as if it falls
within the larger bin. TM Many courts have accepted the assurances that
these adjustments are more than generous and have held genotype
frequencies obtained with the big bin variation o f the independence
method as admissible. 135
Guessing. A few courts have demanded more. In People v. Mohit, I~
an Iranian-born physician was indicted for rape and sexual abuse o f a
patient during an office examination.
FBI testing revealed a match
between the crime sample (a vaginal swab) and a sample o f Dr. MolaR's
ing another procedure to incorpate data on population structure and the possibility that
the suspect and the perpetrator belong to the same subpopulafion).
134. See, e.g., Springfield v. State, 860 P.2d 435 (Wyo. 1993) (testimony of Michael
Conneaily that "the conservative approach of binning.., more than makes up for any
small differences there might be"); Bruce Bndowle et al., Fixed Bin Analysis for
Statistical Evaluation of Continuous Distributions of Allelic Data from VlCl'RLoci, for
Use in Forensic Comparisons, 48 AM. J. HUM. GENETICS841 (1991); Ranajit Chakraberry, Statistical Interpretation of DNA ~ping Data, 49 AM. J. HUM. GENETICS895,
896 (1991) (letter); supra note 105 (use of 2p rather than p2 in cases of apparent
homozygosity); Newton E. Morton, Genetic Structure of Forensic Populations, 89 PRO(:.
NAT'L ACAD. SCI. 2256 (1992). The NRC Report favors combining the two adjacent,
fixed bins when a fragment lies near the boundary between them. Why this is preferable
to selecting the bin that has the larger frequency is mysterious. See, e.g., Monson &
Budowle, supra note 97, at 1043; supra note 96.
135. See, e.g., United States v. Jakobetz, 747 F.Supp. 250, 259-61 (D. Vt. 1990),
aft'd, 955 F.2d 786 (2d Cir. 1992), cert. denied, 113 S.Ct. 10zt (1992); United States v.
Yee, 134 F.R.D. 161, 182, 210 (N.D. Ohio 1991); State v. Futeh, 860 P.2d 264 (Or.
Ct. App. 1993). Contra, e.g., Commonwealth v. Lanigan, 596 N.E. 2d 311 (Mass.
1992).
136. 579 N.Y.S.2d 990 (Sup. Ct. 1992).
Fall, 1993]
DNA Evidence
133
blood. The FBI reported that "the probability o f such a match occurring
in the United States was 1 in 67,000,000 for Caucasians, 1 in 79,000,000
for Blacks, and 1 in 14,000,000 for Hispanics." While the conservative
features of the FBI's binning did not satisfy the trial court, the court was
not prepared to exclude a sufficiently conservative estimate of the
genotype frequency. Thus the Court proceeded to press the government's
expert, a medical geneticist, for "the most conservative possible estimate
conceivable.'137 The court then held the figure of 1/100,000 supplied by
the geneticist m to be admissible instead of the 1/67,000,000 obtained via
the independence method with big bins.
As a general matter, pressuring experts to raise their estimates until
it is believed that the genotype frequency cannot go any lower lacks a
certain elegance. Other seat-of-the-pants judgments by experts are not
much better, m
Independence with Ceilings. Big bins and other ad hoc adjustments
are disquieting. In the absence of direct studies of the variance of VNTR
"alleles" and "genotypes" across subgroups of broad racial and ethnic
populations and its effect on estimated match frequencies, it is impossible,
a few scientists say, 14° to know whether the overestimation of "allele"
137. /d. at 999.
138. The precise basis for the estimate of 1/100,000 is not clear from the opinion.
Michael Conneally, on whom the court relied, h a d opined that "the highest degree o f
dependence between genotypes on separate chromosomes could not possibly exceed
10%." Id. at 998. "Factoring the 10% correlation into the multiplication of the 4
genotype frequencies, Conneally came up w i t h . . . 1 in 22 million. ~ ld. He provided
the 1/100,000 figure when the ¢ourt demanded that he be still more conservative.
139. Compare People v. Shi Fu Huang, 546 N.Y.S.2d 920 (Crim. Ct. 1989) with
Mohit, 579 N.Y.S.2d 990. In Shi Fu Huang, no objection to the independence method
(with small bins) was raised, but defendant did question Lifecodas's use of a database of
under 200 university students from mainland China. Dr. Michael Baird of Lifecodas
reported that, given the size of the sample, ~the range statistically could be between one
in two and a half billion to one in several trillion." ld. at 922. The court ruled that it
would admit "the lowest figure of probability, namely one billion to one." Id, How
Lifecodes arrived at its interval estimate, and how the court managed to select a figure
below that interval, are not disclosed. Similarly, in People v.-Wesley, 533 N.Y.S.2d
643, 658 (Crim. Ct. 1988), aft'd, 589 N.Y.S.2d 197 (App. Dff. 1992), Dr. Kenneth
Kidd testified that "an examination of the data given by Lifecodes indicated that there
was, in fact, no linkage disequilibrium~ and that he "found no marked deviation from
the expected" genotype frequencies at loci under Hardy-Weinberg equilibrium, but that
slight deviations from Hardy-Weinberg equilibrium warranted reducing "mean power of
identity" by "much less than a factor of 10." The court then limited the prosecution to
estimates reduced by a factor of ten to 1 in 1.4 billion for American Blacks and 1 in
840,000,000 for Caucasoids.
140. See, e.g., United States v. Yee, 134 F.R.D. 161, 183-84 (N.D. Ohio 1991)
(testimony of Eric Lander); Eric S. Lander, 49 AM J. HUM GENETICS 899 (1991)
(letter). Other scientists maintain that ample information already warrants the conclusion
134
Harvard Journal of Law & Technology
[Vok 7:
frequencies introduced at the binning stage14t or elsewheret 42 overcompensates or undercompensates for possible underestimation (with respect to
a particular subpopulation) of the single-locus.frequencies.given b y
equation (1) or their product (2). Persuaded by this limitation in the ad
hoc adjustments to the independence method,:the NRC panel endorsed yet
another variation on the independence method--the "ceiling principle."t4~
The ceiling principle, in its simplest form, capitalizes on studies of
"allele" frequencies among subgroups. Once random samples of DNA
from more or less homogeneous ethnic subgroups, such as "English,
Germ~.ns, Italians, Russians, Navahos, Puerto Ricans, Chinese, Japanese,
Vietnamese, and West Africans,, 1" are collected, the highest frequency,
p m~, for each "allele" i in the crime sample, with respect to all of the
subgroups, is selected. These frequencies are then multiplied as in the
independence method to produce "genotype" frequencies. Since the
largest frequency for any ethnic subgroup studied has been used at each
locus, and no single ethnic subgroup has the maximum frequency at every
locus, it would seem that the result.must overstate the "genotype"
frequency both within each ethnic subgroup and within each broader
group composed of these subgroups. And, the. estimated genotype
frequency P that results from multiplying these ceiling frequencies pi "~
must be the same for every subgroup, Consequently, the committee
insists that "the ceiling principle eliminates the need for investigating the
perpetrator population because it yields an upper bound to the frequency
that would be obtained by that approach; ~t4s
Unless every conceivable ethnic subgroup is studied, however, this
simple formulation of the ceiling principle is not guaranteed to yield the
upper bound. ~¢ It is possible (though at some point implausible) that
that underestimation is a remote possibility. See infra note 199.
141. See supra note 96.
142. See supra note 105.
143. NRC REPORT, supra note 15, at 82-85. For other commentary proposing this
approach, see, for example, Lander, supra note 140; Lewontin & Hartl, supra note 81.
144. NRC REPORT, supra note 15, at 84.
145. ld. at 85. This upper bound is not the only, and certainly not the least upper
bound, that avoids an inquiry into the population of plausible suspects. A lower ceiling,
consisting of the maximum ~genotype" frequency in any of the subpopulations, alsu
could be chosen. See Weir, supra note 91. If computations with the NRC committee's
"ceiling principle ~ are admissible, then, arguably, such refinements also should be.
146. In truth, the ceiling method is not guaranteed to produce an upper bound even
when every subgroup for which there are frequency variations is considered. The
method yields an upper bound only if the alleles occur independently at all loci in each
subpopnlation. If Hardy-Weinberg equilibrium does not exist at s locus, or if linkage
Fall,' 1993]
"
" D N A P,vidence • : ii: .:i:,~.:/::::' .:/" !"/.;'.i ~"~35.iiI'II':/:;:' "::
•
.
.:.
~
...:
:. i!:.., I,
another, as yet unstudied subpopulation, has an "aileie" frequency above:
the ceiling seen so far, To cope with this contingency, the.NRC panel , " .
w o u l d require a m i n i m u m
ceiling frequency O f 5%",, n o matter h o w : l o w
all the subgroup frequencies are. x47
The NRC committee's quest for a suitably conservativeprocedure
does not stop at the imposition of the 5% lower upper bound.; Until
subgroup studies are complete, the panel calls for still higher ceilings.
It recommends that each "allele, frequency be taken to be the higher o f :
either 10% or the"upper 95% confidence limiff Of the frequency seen in
the major "race" with the largest frequency, u8
equilibrium is absent across loci, then the ceiling method can understate the genotype
frequency. See Joel E. Cohen, The Ceiling Principle is not Always Conservative in
Assigning Genotype Frequencies for Forensic DNA Testing, 51 AM. J. HUM. Gi~IETICS
1165 (1992). However, the deviations from equilibrium must be extreme to have this
effect, and no reasons have been advanced for thinking that such deviations exist. See/
supra note 116.
147. NRC REPORT, supra note 15, at 84. This 5 ~ minimum on the ceiling imposes a
lower limit of 11400 on the estimated frequency at each locus and a limit of 1/40(P on a
match at n loci. The panel justifies this 5% lower bound on the upper bounds p ~ as
follows:
Because only a limited number of populations can be sampled, it is
necessary to make some allowance for unexamined populations. As
usual, the problem is rare alleles. Genetic drift has the greatest proportional effect on rare alleles and may cause substantial variation in their
frequency. Even if one sees allele frequencies of 1% in several ethnic
populations, it is not safe to conclude that the frequency might not be
five-fold higher in some subgroups.
Id. :~ some ways, the committee's reasoning here is remarkable. Ordinarily, the fact
that "it is not safe to conclude that [something] might not" is not a ~ s o n m act as if it
actually will happen. If this principle were applied generally, juries, businesses,
governments, and all other decision-makers would be paralyzed, since actions so often
rest on premises that are at best tentatively true. To say that any number less than 5% is
"not safe" is to express a policy judgment about tolerable risk, and such a judgment can
be defended only by specifying the risk in question and the dangers involved. Although
the committee writes that its selection of a lowest upper bound of 5% "was based on
population genetic theory and computational r e s u l t s . . , aimed at accounting for the
effects of sampling error and for genetic drift," id., its report omits any explanation or
description of this theoretical and computational analysis. This omission is troublesome
because more than one model of the introduction o f new alleles, and hence, genetic drift,
in a population can be proposed. See, e.g., Bernard Devlin & Nell Risch, Ethn/c
Differentiation ar VNTR Loci, With Special Reference to Forensic Applications, 51 AM.
J. HUM. GENETICS 534, 545 (1992). The most specific statement the committee does
make is that "[e]ven if one observed allele frequencies of about 1%, one would guard
against the possibility that the frequency in a subpopulation had drifted higher by using
the lower bound of 5%." Id. at 84. As a result, the panel simply fails to explain how it
struck the desired "balance [between] rigor and practicality." ld. at 83.
148. NRC REPORT, supra note 15, at 93. If the NRC committee's recommendations
are uncritically adopted, the interim ceilings may well be permanent. The panel would
-;
136
4.
Harvard Journal of L a w
& :T e C h ~ i o g y ::~ :'~:..'/~c": "" ':" "
Population Structure.and Overestimation
-
~
..
~i!i
:i:.i!:'i::,~.!'
" ~
W e have discussed four methods .of inferring.,the ."genotype"
frequency P from allele frequencies p: the independence method with
bin widths that correspond to match windows, the independence method
with big bins, the method of guessing at upper bounds, and the independence method with ceiling frequencies.: The three modifications.of the
firstform of the independence method all suffer from the same weakness
as the direct estimation procedure in that they are likely to produce
overestimates of the match frequency in a broad ethnic or racial . . . .
population. In theory, the smallest "genotyp~" frequency~that the interim
10% ceiling procecl~e can generate for four probes is I/(200)4 =
I/1,600,000,000, and this "one in a billion" figure should be.small
enough to deligh~:/most prosecutors and to convince most jurors that the
match is no accident. However, in practice, genotype frequencies
computed with "allele" ceilings will be larger than in thenry.149 For
example, the "genotype" in United States v. Yee ~5° had a frequency of
1/35,000 when adjusted upward with big bins. According to some
reports, the ceilings expand this figure by a factor of 2,000, to yield the
frequency of 1/17) 5]
Does the hypothetical possibility of substantial population structure
allow the 5% lower upper hound to replace the 10% ceiling only when ~'the population
studies do not reveal significant substructure." Id. From existing data on Caucasians,
Navajos and West Africans, it is clear that stadsticelly significant differences in allele
frequencies are present (although they do not seem to be ~signlficant" in the practical
sense of producing a large proportion of markedly-different multilocus genotype
frequencies). The same is probably true for English, Navajos and West Africans. I f
~significant substructure~ means statistically significant substructure in the enumerated
subpopulations, then the subpopnlation studies are pointless and the prospect of dropping
from the ten percent ceiling illusory. See Devlin et al., supra note 69,
149, Large genotype ceiling frequencies could come to be most common during the
transition from the interim to the final ceiling method. If the samples of the ethnic
subgroups are small, the upper 95% confidence limit of some "allele" frequencies in
some subgroup easily could exceed even the interim minimum "allele" ceiling of 10%.
150. 134 F.R.D. 161 (N.D. Ohio 1991).
151. See Retie Sherman, New Scrutiny for DNA Testing, N^TL LJ., Oct. 18, 1993,
at 3, 52; Richard C. Lewontin, DNA Evidence: Statistical and Biological Considerations,
Invit~'~.Papers on Statistical Issues in DNA Identification Evidence, IOINT STATISl"ICAL
ME~N~S OF TH~ AMERICAN ST^TISTICAL ASSOCIATION, BIOMET~C SOCIETY AND
INSiTrtrrE oF MATHEMATICALSTATISTICS(Aug. 10, 1992). However, this figure may
be overstated. Cf. Weir & Gaut, supra note 47 (elucidating errors in one expert's
inflated computation of the ceiling frequency in another case). Less dramatic differences
probably are typical. See, e.g., Springfield v. State, 860 P.2d 435 (Wyo. 1993) (using
ceiling frequencies of 1/17,000,000 for blacks and 11221,000 for Native Americans, as
opposed to big bin frequencies of 1/150,000,000 and 11250,000, respectively).
"
Fail; 1993]
DNA Evidence !: ' i/
13:
warrant requiring such overestimation? At ieast in jurisdictions that :.!. i !~'i':''
!(
consider the scientific merit of scientific evidence, ~ the answerdepends
on how hypothetical the population structure argument is and whether the ~
"
independence method allows an expert to present a reasonable estimate of
the population frequency and to quantify the uncertainty in the figure. I
shall argue that speculation about the extent of population structure
notwithstanding, in many cases a suitable population frequency estimate
is obtainable without resort to extreme overestimation. Furthermore, I
will demonstrate that this proposition has not been widely nor directly
disputed in the scientific literature. 153
My analysis builds on a fundamental distinction between what I denote
as a general population cese and a subpopulatiun case. ~ A general
population case arises when the appropriate reference population is a
broad ethnic or racial population, and a representative sample of "allele"
frequencies for this general population is available. A subpopulation case
arises when the appropriate reference population is itself a subpopulatiun
(or a population or set of subpopulations not represented inthe database).
The distinction is important because the presence of substructure in
general population cases can be expected to cause predominantly one type
of error--an overestimate of the population "genotype, frequency--and
only relatively small errors in most instances. As a result, the population
structure objection does not justify a rule of law that demands drastic
overestimation in these cases.
In applying this distinction, it is criticai to understand the limited r01e
that the defendant's ethnic or racial status plays in evaluating the evidence
of a match. The choice of the reference population for any frequency
estimate should be appropriate to the facts of the case. Is the pertinent
frequency to be found from a sample drawn from the general population?
From a particular geographic area? From people resembling or related
to the defendant? These questions are neither new nor special to DNA
evidence. ]54 One'simple principle supplies the answers: The relevant
population consists of all people who might have been the source of the
152. See supra text accompanyingnotes 21-26.
153. This inquiry is particularlyimportantin jurisdictions that neat controversyover a
scientific procedure as an absolute barrier to admissibility, regardlessof the validity of
the procedure. See supra text arid accompanyingnotes 21-26.
154. See, e.g., David H. Kaye, The Admissibility of "Probability Evidence n in
Criminal Trials--Part II, 27 JURIMETRICSJ. 160 (1987);Bruce S. Weir & Ian W. Evett,
Reply to Lewontin, 52 AM. J. HUM.GENERICS206 (19")3).
138
e v i d e n c e sa
defendant's
Y e t , son
appropriate
defendant. |s~ I n P e o p l e v . Mohit,~S7 f o r e x a m p i e , the court was c d n c e n i e d :
::
that the race and ethnicity o f the dentist accused o f r a t ) i n g h i s p a t i ~ t w a s '':
n o t r e p r e s e n t e d in the F B I ' s database. I t noted that:
T h e issue o f i n b r e e d i n g is o f particular~importance in this
case.
;:
T h e defendant, D r . M o ~ t , W a s ' ~ f i ~ i n t h e I r a n i a n
t o w n o f Shnshtar.
H i s ancestors o v e r at l e a s t t h e ~ s t . f i v e
'
generations w e r e o f Persian descent, all f r o m the s a m e t o w n
o r a t o w n close b y .
;
:;
:
- -.
....
' .... ,
.
".
:
::;.
T h e y are a l l o f the ShiRe. Muslim-
r e l i g i o n . D r . M o h i t c l a i m e d that f o r religious reasons, .and
.
as a matter o f tradition, i n b r e e d i n g was v e r y c o m m o n in his
family. H e indicated that his maternal g r a n d m o t h e r was the
daughter
o f his f a t h e r ' s
great-grandparents...Marriage
a m o n g first cousins was c o m m o n in his t o w n . tss
"
.:!.
"
~: " . . . .
T h e issue, h o w e v e r , is n o t the f r e q u e n c y o f matching-~I)NA patterns f o r :' : "
i n b r e d f a m i l i e s o f ShiRe m u s l i m s f r o m Shustar, Iran, but t h e i r frequency. . . . . .
in the v i c i n i t y o f W e s t c h e s t e r C o u m y , N e w Y o r k , or, m o r e precisely~
155. Among commentators, agreement on this point is now virmaUy unanimous. See
Donald A. Berry, Stati~cal Issues in DNA Identification, in DNA ON TRIAL:GENETIC
IDENTIFICATIONAND CRIMINALJusncE 91, 106 (Paul R. Billings ed., 1992)~(~l'he
standard is to use the race of the suspect [but this] makes no sense.~); Bndowle et al.,
supra note 113, at 81-83; IanW. Evett & Brace S. Weir, Flawed Reasoning in Court,
CHANCE, Fall 1991, at 19; Richard Lemport, ,The Suspect PoPulation andDNA Identification, 34 JURIME'r~CS J. 1 ,(1993); Lempert, supra note 74, at 310; Richard C.
Lewontin, Which Population? 52 AM. J. HUM. GI~ETICS 206 (1993) (~[T]he ethnicity
of the defendant is not the directly relevant question, but rather the ethnic composition of
the pool of poss~le alternative ~Lspe.cts.~).
156. See Evett and Weir, ,¢upra note 155; Brace S. Weir & Ian W. Evett, Whose
DNA? 50 AM. J. HUM. GENETICS 869 (1992) Getter :recounting exclusion of DNA
evidence in State v. Passino, No. 185-1-90 ~mnklin Co. Dist. Ct. Vt. May !3, 1991),
because homicide defendant had mixed racial heritage). However, Lewontin, supra note
155, indicates that the trial judge in Pass/no may not have made this misk"~.~-~.' Seealso
Rich,~..-d C. Lewondn, The Dream oftheHuman Genome, N.Y. REV. BOOIC.S)May_28,
1992, at 31. Although the Paasino court's conclusior,, may be defensible, it seems clear
that the court was unduly impressed with =the uncontroverted teslimony"-~that =the i
defendant is one half Indian. three eights native American Indian and one eighth
French. ~
~
.
157. 579 N.Y.S.2d 990 (Sup. Ct. 1992).
158. Id. at 997. The court even cited a study finding that marriage among relatives
(second cousins or closer) in Muslim countries is 20-55%.
.
.
~~: .
'
' '
-.;
Fall, .1993]
DNA Evidence
their frequency among people 'other ~
Dr. Mohit Who might have leR
their semen on the patient. Unless this group consistslargely o f : D r . . :
Mohit's relatives, ~:ze'.:a n o n e e d t o estimate(thefrequency":among . . . . .
people of his racial and ethnic'background.- The frequency: among
bro y defm
and ethm groups is the app0site ag
.
On the other hand, cases do arise where the population of interestis,
arguably, a genetically distinct subpopulation,:and where little orno data ..... :'
specific to that subpopulafion have been collected..- United States:W iTwo :i .ii ....
Bulls gs9 m a y be such a case. Accused o f raping a ~ 1 0 n itilepineRidge ~' i i / . . :
Indian)eservation in South Dakota, Matthew Two =BullSm0ved to-" ':' : '> :.
suppress testimony of amatch between DNAiextracted from semen o n
. :.her underwear and hisDNA~ 1~° The FBI estimated the frequencyofthe ~i . i ....
matching pattern in "a Native American population base. " |e| HoWever,
the appropriate reference population is not all Native Americans, but only ....
.,
the Oglala Sioux. If t h e F B F s "Native ~American, i database is an
amalgam of distinct subpopulations,m while the suspect :population is ' "' .":.i-.:
dominated by one subpopulation, the frequency: of matches-in the FBI's. . - "
database might be beside the point.
Population cases. Although courts are coming toappreeiate.that a: " . : .,:
defendant's ancestry is, at best, tangentially relevant, to-the choice o f a
: , /-ireference population, the relationship between the reference population- i . i i •
and the estimation procedure has yet to be.recognized i n a n y raported. ~{ • -.... ii~i::,
opinion. Even the NRC Report, commendably lucid and comprehensive -/.:>.i ' ; ; ~
in other areas, overlooks the possibility o f adapting the computational: ......: .:~
method to the.circumstances of the case. ~63 . Yet, a simple, numerical
159. 918 F.2d 56 (Sth Cir. 1990),vacated for reh'g en banc but appeal dismiss'ed due
ro death of defendant, 925 F.2d 1127 (Sth Cir. 1991).
160. Id. at 61. The district court had held the evidence admissible, and Two Bulls
entered a conditional plea of guilty under a plea agreement. The court of appeals set the
plea aside and remanded the case to the trial court for ~an expanded pre-trial hearing ~ tO
determine whether the method of DNA typing was generally, accepted, and performed
properly, and whether the statistical evidence was unfairly prejudicial.
161. Id. at 57 n.2. The FBI obtained a frequency of 1/177,000 using the independence method with big bins.
162. See State v. Passino, No. 185-1-90 (Franklin Co. Dist. Ct. Vt. May 13, 1991)
(UTha FBI's Indian Database is made up of a variety of different tribes. Approximately
half of them are Sioux Indians from the Northern Great Plains. Other tribes include the
Cherokee, Arapaho, Zuni and Menomince.'); supra note 156.
163. The closest that the report comes to acknowledging this approach is the following
passage:
Some legal commentators have pointed out that frequencies should
properly be based on the population of possible perpetrators, rather th~n
140
Harvard.lo~rnal of Law- & Technology. '"iJi: [Vol: 7
,iii ii~i,i!(i'):~ii!~,
example illustrates h o w the force of the p o p u l a t i o n s m i c t ~ r e o b j ~ o n
depends on the nature of thereference pop~ation. T o p u t t h e example
in a forensic context, suppose that there has been a Vioientrobbery~d
rape at a rest stop on an~interstate highway and that the robber and rapist,
identified as a Caucasian, left traces ofhisblood orsemen~!~i: susPicion
focuses on a particular man. Careful DNA testing demonstrates :thathe
matches at each locus. If, however, ibis suspectis not the assailant, then
we can say only that someone else is. We:ihave no reason to expect the
•
: ~.
.
i ~. ,-,
~': ;~.~
~:.~.~ "
guilty party to be of the suspect's detailed ~ancestry~ or ;etlmictty.
Therefore, we are interested in the frequencyof the matching "genotype"
among all Caucasians who ~ interstate highways2-and not, the pr0portion in the defendant's subpopulation. When the case comesto trial, the
prosecution offers an estimate of the frequency of this "genotype" in
Caucasians in order to gauge the probative value of the evidence of t h e
match. The prosecution's expert computes I the frequency using the
independence method with bin widths equal to match windows in a large
national database on Caucasian Americans. The defense objects that the
estimate is prejudicial because the population may b e structured, s o the
actu~ frequency could be dramatically larger t~ than the figures computed
by the prosecution's expert using equations (1) and (2).
/
To test the validity of the defense's objection, letus startwith the
simplest possible case of population substructure--one locus with only
two "alleles" and one population composO of two genetically isolated
subpopulations. Subpopulation 1 represents 80% of the population and
subpopulation 2 represents 20%. The ~allele ~ frequencies are presented
in Table 1.
~:~:i ~)
-~
i
:~:
: .........
~
-
:-
:~
on the population to which a particular suspect belongs. Although that
argument is formally correct, practicalities often preclude use of that
approach.
NRC REPORT, supra note 15, at 85. The report nowhere identifies its practical objec~
tions to det'ming the reference population in the only logically acceptable manner
imaginable, and its implication that it is generally appropriate to seek some estimate of
the frequency in the suspect's ethnic subpopulafion is plainlymistaken.
164. Cf. United States v. Jakobetz, 955 F.2d 786 (2d Cir. 1992), cert~ denied, 113
S.Ct. 104 (1992); Edwin McDoweU, Threatof CrimeRises on The Main Highway, N.Y.~
TIMES, Oct. 28, 1992, at A14.
165. It also could be considerably smaller, but the. defense is unlikely to note this
possibility.
',5.,
.
~"~
")
:.
'~
Fall, 1993]
DNAEvidence
subpop.l
Allele
subpop.2
(80%)
(20%)
..,
::
.
~.
:: .141
.... ~
. total
. population
:.
I
3/5
115
(315)(80%) + (I/5)(20~) = 13125 '
2
2/5
4/5
(2/5)(80%) + (4/5)(20%) ffi 12/25
Table 1.
Frequencies of two hypothetical alleles in a structured population.
This population structure implies that equilibrium does not exist for the
broad population, but it does not .impeach, the equilibrium• a~sumptions. . . . . . . within each subpopulation. In the two subpopulatious, equations (1) and~
.
(2) hold and can be used to deduce the "genotype" frequencies within
.....
these subpopulations, and hence, in the total population. 1~ Table 2
presents these frequencies.
Genotype
1,2
Freq. in
Freq. in
Freq.,in
subpop. I
(80%)
subpop. 2
(20%)
total
population
2(3/5)(2/5)
2(1/5)(4/5)
(t2/25)(80~)÷(8/25)(20~)*280/625
• -
Table 2 .
.
.
.
Frequencies of one genotype in a structured population.
Of course, the prosecution expert did not know the subpopul~on •
frequencies. Thus, the expert could use only the population a l l e l e
frequencies 13/25 and 12/25. Using these values in (1) and (2) gives a
calculated population genotype frequency of 2(13/25)(12/25) = 312/625.
In this example, the population structure objection is not well-taken.
While, the simple independence method is slightly inaccurate, reporting
312/625 instead of the true frequency of 280/625, the error favors the
defendant.
This result is the consequence of a general mathematical truth rather
than the consequence of a clever choice o f numbers. As long as-a
population is composed of two isolated subgroups, each of which isin
equilibrium, the frequency for a diallelic locus estimated by ignoring the
166. See supra note 146.
.
142.
Harvard Journal of Law & T e c ~ l o ~ :".i:!~!'i:ii'!i,i:'[Voi!':7
!I'I~'!
.
. :.
~'
..
....
:,'~'.
.
.
population, structure overstates .the true! frequency) 67.' A s a iresulti:'the '.
independence procedure is already conservative, andres0rt to c e ~ g s i s :".' unjustified.l~
, ,...
. ..
..
_...Unfortunately, this example is not representative ofm0re'complex
systems. With more loci or subpopulations, the multilocus:frequencies
estimated without considering population structurecan overstate the true
frequencies for populations.l¢° Since the number of possible alleles in
'
VNTR systems is typically 20 or more, and considerably morethan two
subpopulations may be present, an inequality that applies only to the case
of two alleles and two subpopulations is of little use. Nevertheless, even
in the more realistic situation, on average, the error due topopulation
structure inures to the defendant's benefit,!T° and the differences between
the computed and the true single-locus genotype frequencies w i l l rarely
be large. 171
Partly because this point has not been recognized in thelegal
literature, the population structure objection has proved remarkably
powerful in court. In People v. Barney,172 for instance, a.Califomia court
of appeals concluded that the NRC Report, a paper and a reporter's
observations in Science, and conflicting testimony.of experts in various
other cases demonstrated the existence of an unsettled scientific controversy over population frequencies. 173 Most recently, in State v. Bible,!74 the
167. For a proof, see David H. Kaye, The Effect of Population Structure on Estimated
Allele Frequencies (1993) (unpublished manuscript, on file with author).
:
168. The frequency that independence with ceilings produces in Our example is
2(3/5)(4/5) = 600/625 as compared to the correct value o f 280/625. However, the
degree o f excessive overestimation inherent i n the 'ceiling method Will vary"with the
numbers used in such examples.
169. See C.C. Li, Population Subdivision with Respect to Multiple Alleles, 33 ANNALS
HUM. GENETICS 23 (1969).
170. See Ranajit Chakraborty et al., Effects of Population.Subdivision and Allele
Frequency
Differences on Interpretation of DNA Typing Data for Human Identification,
1992 PROC. THIRD INT'L SYMP. ON HUM. IDENTIFICATION205; Kaye, supra note 167
(about 60% of computations o f allele frequencies in randomlystmclured populations are
overestimates).
171. In simulations of randomly structured populations, the maximum ratio of the true
single locus genotype proportions P to the computed proportions P" seen in simulated
populations was less than three. Although conditions can be created that make P/P"
arbitrsxily large (meaning that the true value is many times larger than the estimated
value), preliminmT study suggests that b o t h P and P" must be very close to zero for this
to occur. See Kaye, supra note 167. If the worst effect of population structure is to
cause the estimated proportion to be one in a billion when the true proportion is one in a
million, the objection seems not to justify the exclusion of DNA evidence in all cases.
172. 10 Cal. Rptr. 2d 731 (Ct. App. 1992).
173. Justice Chin, who wrote the Barney opinion, adhered to this conclusion in the
opinion for another panel in People v. Wallace, 17 Cal. Rptr.:2d 721 (Ct. App. 1993),
-
>
F a l l , 1993]
DNA Evidence". ' ~ ......~: :.~
mark's statistical probability calculations." Tbesecases demonstrate that
the judicial perception that population substructure is a problem In all
DNA cases is widespread, t75
~': ~'
:ii~-:31
~~:.-i..
Nevertheless, the concern is largely misplaced when the.pertinent
frequency is in the general population. In these cases the population ,
structure objection is far less vexing than many~ 0pinions.and a few.; . . . . .
articles suggest. There is a corollary to this conclusion.. : T h e NRC
::
panel's influential call for more conservative meth0ds inthesecases is an
unnecessary response even t o a hypothetical problem. Post=NRCReport
i
cases excluding genotype frequency estimates on the,ground:that
computational methods less conservative than the NRC S version of " ....":
independence with ceilings are inadmissible should not be followed.
Indeed, under the analysis developed in this article, most of the cases
should have found the frequency estimates to be admissible becanse the
:;
circumstances of the offenses pointed to .no specific subpopulation of :r
suspects. In these cases, the relevant population in which to consider~the
ii: 'i:i:
frequency of the incriminating match is a general population, and existing
....
computational methods work reasonably well for such:populations: 176 i i
:i
In Commonwealth v. Lanigan, for instance, the Supreme Ju~licial
Court of Massachnsetts consolidated cases against two
sets of defendants.
7
Thomas Lanigan was indicted for the rape of a child a n d f o r sexual .... i
/::
assault and battery of .three minors. 1~7 Presumably, .these-victims . - . . .
identified their assailant as a Caucasian, and not as a member of some
and chastised the scientific community for questioning the need for or the desirability of
the ceiling approach.
174. 858 P.2d. 1152 (Ariz. 1993).
175. See also People v. Atoigue, DCA No. CR 91-95A (Guam Dist. Ct. App. Div.
1992) (following Barney); Commonwealth v. Lanigan, 596 N.E.2d 311 (Mass. 1992);
State v. Vandebogart. 616 A.2d 483 (N.H. 1992) (relying on NRC Report to establish
that FBI's calculation of population frequency of 1/50,000 is too controversial among
population geneticists, and remanding for a hearing on the general acceptance of the
NRC ceiling frequency); State v. Cauthron, 846 P.2d 502 (Wash. 1993) (relying on
eaxly paper by Eric Lander questioning equilibria assumptions).
176. Cf. Richard Lempert. DNA, Science and the Law: Two Cheers for the Ceiling
: Principle, 34 JUI~r~RICS J, 41 (1993) ("[l]n most forensic situations the problem the
ceiling principle was designed to resolve--the possibility that forensic data bases would .
be ignoring population substructure substantially underestimate relevant allele frequencies--hardly ever exists because the proper reference population for estimating allele
frequencies is typically a mixed population fairly represented by the data bases now in
USe. ~),
177. 596 N.E.2d at 312.
144
Harvard Journal o f L a w & T e c h n o l o g y
s u b p o p u l a t i o n o f that r a c e .
" ii',i
~
[Void/71
.'~-
T h e reference i p o p u l a t i o n consists of. a l l
p e o p l e w h o m i g h t h a v e 'committed the acts.
U n l e s s the victims w e r e
largely isolated f r o m the general population, t h e c l a s s ~ o f plansiblei ~
.....
suspects is Caucasians in general, and not s o m e subpopulation to which
Lanigan belongs.
F o r this broad population, t h e independence m e t h o d :
w i t h o u t c e i l i n g s is appropriate. ~7~
T h e o t h e r actions in Lanigan n a m e d L e o B r e a d m o r e Senior a n d J u n i o r . . . .
as defendants.
T h e y w e r e accused o f r a p i n g ; : a s s a n l t i n g , and h a v i n g
incest w i t h granddaughters and n i e c e s . O n e alleged v i c t i m w h o d e l i v e r e d
a child testified b e f o r e a g r a n d j u r y t h a t she had sexual intercourse o n l y
w i t h the defendants. D N A analysis o f b l o o d samples f r o m the v i c t i m , h e r
child and the B r e a d m o r e s p r o v e d that the y o u n g e r B r e a d m o r e c o u l d not
b e the father, and that the eider B r e a d m o r e had alleles that w e r e " 2 , 5 0 0
times m o r e l i k e l y . . ,
i f he w e r e the father o f the child than i f he w e r e
not the f a t h e r . " O n c e again~ unless the m o t h e r was largely isolated f r o m
the m u l t i - e t h n i c p o p u l a t i o n in Massachusetts,
suspects is Caucasians,
defendants b e l o n g , m
the class o f p l a u s i b l e
and not j u s t the~subpopulation to w h i c h the
Similarly, the t w o eases consolidated in P e o p l e v .
Barney e v i n c e d no circumstances suggesting some special s u ~ a t i o n ; ~ S °
178. The FBI reported that the genotype frequency among Caucasians was 1/2,400,000. /at. at 312-13.
179. However, this population does include a number of people closely related to the
defendants, which poses a special problem. See/nfra note 230.
180. 10 Cal Rptr. 2d 731 (Ct. App. 1992). In People v. Howard, Octavia Matthews
was found on the floor of her home, bleeding from multiple head wounds, with a rope
wrapped around her neck. She soon died. Id. at 732. Kevin Howard was her tenant in
another building. Id. at 733. Ample circumstantial evidence pointed to him. She had
served him with an eviction notice. Id. His fingerprints were on a postcard in her
upstairs bedroom. His wallet was on her bloodstained couch. Id. There also were
bloodstains on a tile floor, a paper napkin in a cosmetics ease and a tissue in a purse. Id.
Howard had a fresh cut on a finger when arrested, and conventional blood analysis
showed tha~ the stains and Howard's blood "shared an unusual blood type found in
approximately 1.2 persons out of 1,000 in the Black population (and not at all in the
White population)." Id. at 73. In these circumstances, the reference population does not
seem to be any special subpopulation of African Americans, and it is reasonable to
consider the fact, as reported by the FBI, that "Howard's DNA pattern matched.., and
the frequency of such a pattern is 1 in 200 million in the Black population." Id.
In People v. Barney, a woman entered her car in the South- Hayward BART parking
lot. Id. A man forced his way in, demanded money, and used a knife to force her to
drive and park some blocks away. Id.
There he molested and tried to rape her,
ejaculated on her clothing, and took about two dollars in small change, her BART ticket
with $3.80 credit on it, and her car keys. ld. The woman found Ralph Edward
Barney's wallet on the floor of her car and recognized Barney as her assailant from his
photo ID in the wallet. Id. When arrested, he had a knife, a BART ticket last used to
enter the transit system from the South Hayward station with the same amount of credit
remaining on it to match the missing ticket, and $1.82 in small change. Id. Again, none
•
:
Fall, 1993]
:7
DNA
Evidence
.
.---
This treatment of the post-NRC Report cases, however;~glglit be
criticized for ignoring the doctrinal basis of the opinions. ~,ih/t~;efaulted
these opinions for not recognizing that the population structure 0bjeCtion .
is weak when the relevant population in which to estimate the 'matchbinning frequency is a collection of subpopulatfons hav~g different
VNTR frequencies. BUt the cases to date come from jurisdictions that,
in theory,~S~ neither ask nor allow their courts to decide what is scientifically valid or invalid, but only to ascertain whether the Scientific
community has reached the consensus that a scientific procedure restS on
a valid theory and generates reliable results when properly applied.::, If
•
of the circumstances suggests any special subpopulation, and it seems reasonable to
consider the frequency of the incriminating DNA profde among African Americans
generally--reported by Cellmark Diagnostics to be "1 in 7.8 million in the Black population." ld. at 734.
Most of the other cases giving population structure as the reason to exclude frequency calculations were general population cases. In both People v. Wallace, 17 Cal. Rptr.
2d 721 (Ct. App. 1993), and State v. Canthron, 846 P.2dS02 (Wash. 1993); police ~ - '
apprehended a man hiding in the bushes with material or implementsof a type used in a
series of unsolved, relatively distinctive rapes in the area. Nothing in the 10pini0ns
suggests that any of the accounts of the victims or Other circumstances pointed' to
membership in some well-defined ethnic subgroup as a characteristic Ofthe rapist.
In State v. Bible a nine-year-old girl bicycling to a ranch in Flagstaffdisappeared,
and her battered body was found hidden in the woods three weekslater. ~The defetidant
was apprehended the day she disappeared, driving a stolen car whose contents m a t c h e d , . .
items found near her body. Cellmark reported that DNA from blood stains on his shirt
matched the girl's DNA, and estimated the genotype frequency in the .caucasian
population to be between 1/60 million and 1/14 billion. I f the defendant w a s not the
person who abducted, molested and killed the girl, then someone else in the:Flagstaff
area that day did, and it is reasonable to consider the frequencies of the incriminating
genotype among broadly defined groups in assessing the probative value of the:match.
Because the New Hampshire Supreme Court's opinion in State v. Vandebogart does
not specify the circumstances that motivated the FBI to use Caucasians as the reference
population, it is difficult to say whether this case falls into the general population
category for which the debate over substructure is essentially irrelevant.
The most awkward case for the general population approach may be People v.
Atoigue As in Lanigan, Atoigue involved sexual intercourse with a child leading to
pregnancy, and the reference population under the hypothesis that the man identified by
the twelve-year-old mother as the father was not responsible is not a unique ethnic or
racial subpopulation. To the extent that the distribution of alleles in the population of
Chamorro, Guam, is markedly different from that in the database actually used, however, the frequency produc~l from the database would be inappropriate. The Polynesian
Chamorrons have been living on Guam for perhaps 1,000 years, and a relatively small
number of people founded the population there. If the Chamorrons have remained
genetically isolated from the other inhabitants of the island, then even a database derived
from the island as a whole may be off the mark. Apparently, Cellmark tried to address
this concern by testing 15 unrelated male police officers in Chamnrrn, but the opinion
does not describe the results of this ad hoc testing and the steps taken to ensure that the
officers were unrelated.
181. For eases that show how far courts in Frye jurisdictions have diverged from this
theory, see McCoRMICK, supra note 22, § 203, at 871=72.
146
leading population geneticists cannot agree o n t h e validity o f the
independence assumptions; '82 and a blue ribbon committee that includes "
scientists among its members recommends the most extreme forms of
overestimation, how can it be said that, without resorting t o ceiling >
frequencies, equations (1) and (2) are generally accepted?
This criticism, however, treats the scientific dispute at too high a level
of generality. No population geneticist or statistician denies that the:
relevant population in which to estimate a match-binning frequency
consists of all the people who might have committed the crime, m At
most, as in the NRC Report, 'm there are occasional slips in phraseology
that make it seem like the particular suspect's subpopulation is necessarily
relevant. However, in reality, the defendant's subpopulation is only
derivatively relevant, to the extent that it conforms to the reference
population of plausible suspects.'S5 Neither does any population geneticist
or statistician dispute the mathematical truism that equations (1) and(2),
when used with allele frequencies for a structured population in which the
independence assumptions hold within each subpopulation, t86 tend to
overstate the frequencies of VNTR genotypes in that structured population.'87
Unfortunately, the leading scientific papers advancing population
structure as a reason to avoid the independence method in estimating
match-binning frequencies do not explicitly analyze the effect of such
182. But see Anderson, supra note 19 (characterizing the debate as "on the scientific
fringe" and the creation of "a few s c i e n t i s t s . . , who proclaim themselves to be extremists
183. 1 have found an exception to this generalization in one conversation with o n e
eminent population geneticist who had not previously thought through the issues i n
forensic DNA testing. Certainly, the publications of scientists endorsing the view that
the relevant population consists of people who might have committed the crime have not
come under attack. See supra notes 155 & 156. And, the publications of population
geneticists illustrate the population structure objection primarily in cases where the
population of potential suspects is very probably a subpopulatiun o f the type of people
represented in the broad racial and ethnic databases. See Lander, supra note 106, at 505
("IT]he crime occurred in a small, inbred Texas town founded by a handful of families."); Lewontin, supra note 156, at 68-69.
184. NRC REPORT,supra note 15, at 94.
185+ See NRC REPORT, supra note 15, at 85 (It is correct to say that "frequencies
should properly be based on the population o f possible perpetrators, rather than on the
population to which a particular suspect belongs."). But see Balding & Nichols, supra
note 133.
186. For a direct analysis of the validity of this assumption, see Dan E. Krane et al.,
Genetic Differences at Four DNA Typing Loci in Finnish, Italian, and Mired Caucasian
Populations, 89 PRec. NAT'L ACAD. SCI. 10583, 10585 (1992).
187. See Chakraborty et al., supra note 170.
.
.
.
.
Fall, 1993]
:: DNA-:E~iden~ '- i.'~! ':." ~
s t r u c t u r e i n g e n e r a l population cases.- ,Rather, t h e y emphasize the
possibility o f error when the reference population c o n s i s t s o f individuals
of the defendant's subpopulation, ns To this extent, the controversy o v e r
the extent a n d impact o f p o p u l a t i o n structure, as developed in the
scientific literature itself, does not compel a conclusion t h a t the theory
underlying multilocus genotype frequency estimates for broadly defined
reference populations is not generally accepted. Is9 Even in the dwindling
n u m b e r of jurisdictions where general acceptance is essential to the
admissibility of scientific evidence, Ige the Frye standard does not demand
the exclusion of the evidence in general population cases.
Lanigan opinion makes this clear.
I n fact, the
Lanigan, recognizing that the
population structure argument concerns "the possibility that using allele
frequencies of larger population groups might produce an inaccurate
frequency estimate for
members o f substructure groups, n excluded the •
frequency evidence. ~9~ However, since there was no reason to estimate
the frequency for such subgroups in Lanigan, "the lively and still very
current dispute"~92 that the court identified did not justify exclusion of the
evidence. ~93 The same is true of Barney) 94 Bible) 9s and Wallace. ~96
188. The leading criticism of independence(with match windows equal to bin widths
or with big bins) is Lewontin & Haril, supra note 81. These geneticists contend that
substantial structuring for VNTR alleles may be present in populations, and they discuss
the ratios in multilocus genotype frequencies for different subpopulations--butnot the
ratio between the frequency computed in the population given a knowledge of its
structure and that computed without this knowledge, ld. at 1748 (tables 1 & 2). They
express concern that uif the wrong ethnic group is used as the referen~ population, then
a very low probability, even zero, may be assignedto a particular VNTR type, when the
true probability may actually be relatively high in the proper ethnic group." ld. They
conclude that "to be scientificallyreliable, the databases must be expanded to include
detailed knowledge of the VNTR frequency distributions in a wide variety of ethnic
subgroups that are likely to be relevant in forensic applications." Id. at 1749. Their
paper demonstrates that at least two eminent population geneticists have doubts about
gcuotype frequenciesderived from broad populationdata but then appfied to cases where
the reference population is but one subpopulation within that broader population.
Although the paper concludes with the more sweeping claim that "esfimates of the
probabilityof a matching DNA proffde based on VNTR data, as currentlycalculated, are
unjustified and generally unreliable," id. a t 1750, the analysis does not focus on the
distinct question of the magnitude of the errors in using allele frequencies in a broad
population when the reference population is that same broad population. Cf. supra note
180.
189. Plainly, the NRC panel's desire for a single method of calculating an upper
bound on genotype frequencies in any likely population or subpepulation is not a
pronouncementabout science, but a mere preference for one jurispradentialpolicy over
another.
190. See supra text accompanyingnote 26.
191. 596 N.E.2d at 315 (emphasis added).
192. ld. at 316.
193. The court relied also on the NRC Report's statement that Uwbetber actual
14s
Harvard soumat Of L a w ;,~"i
rec~Ozogy:{~~ :-.:.::~::Wol. 7 :'
$iLbpopulationcases. The recognitionthat:.thepopulationstruc~
objectionisattenuatedin the generalpopulationca~sdoe~not mean,that
there can never be a legitimateconcernaboUt population:::S~xucture.:-To
the contrary, the o b j e c t i o n has real b i t e w h e n
the g r o u p iof people who
might h a v e left the crime sample are a n a r r o w and p o s s i b l y i n s u l a r :
subpopulation.
I n these s u b p o p u l a t i o n cases, the s c i e n t i f i c q u e s t i o n ' i s
h o w m u c h v a r i a t i o n in g e n o t y p e frequencies exists across s u b p o p u l a f i 0 n s . : . . . .
T w o v i e w s prevail o n t h e subject. T h e skeptical camp contends t h a t : t h e
variations at specific loci might be huge, o r they might b e m i n u s c u l e i ,but
that it is impossible to d e t e r m i n e w i t h o u t studies o f the distributions o f
particular V N T R alleles across subpopulations.197
The other: camp
m a i n t a i n s that subpopulations w i t h i n ethnic-groups ra/ely d i f f e f s u b s t a n tially as compared to variations a c r o s s ethnic groups, 19s so that the
disparities a m o n g subpopulations a r e n o t matters o f pure s p e c u l a t i o n .
A c c o r d i n g to this view, the state o f scientific knowledge, i n c l u d i n g
computations o f m a t c h - b i n n i n g frequencies in various populations and
subpopulations,199 suggest that "differences a m o n g subpopulations are o f
populations have significant substructure for the [alleles] Used for. forensic typing ~ ..
has provoked considerable debate among population geneticists." Id. As we have seen,
however, the relevant debate for a general population case in a Frye jurisdiction is not
just over the degree of population structuring, but over the impact of substructure on
genotype frequency estimates in the broad reference population. Finally~ the court
thought that the NRC panel's willingness to proceed Yon the assumption that population
structure may exist~ established a lack of acceptance.in the scientific community of the
independence method for estimating genotype frequencies in general population cases,
Id. This is an obvious non sequitur.
194. 10 Cal. Rptr. 2d 731 (Ct. App. 1992). Partly on the basis of statements of one
science journalist, Barney detects a ~change in the attitude of the scientific community"
occurring with the publication in Science of the Lewontin-Hard paper and the publication
of the NRC Report. Id. at 744. These authors, the court observes, ~conclude that
because the frequency of a given VNTR allele may differ among subgroups, reference to
a broad data base may produce an inaccurate frequency estimate for a defendant's
subgroup." Id. at 740 (emphasis added). Since the defendant's subgroup was not the
appropriate referenc¢ population in the cases in Barney (see supra text accompanying
note 180), the opinion does not explain why this controversy made it improper to admit
the frequency estimates in the general population. In addition, it seems odd--and
extremely risky--to resolve questions of general acceptance by what science journalists
say scientists have said to them rather than what scientists have written in professional
journals or said to courts. See supra notes 5, 17 & 182.
195. 858 P.2d 1152 (Ariz. 1993).
196. People v. Wallace, 17 Cal. Rptr. 2d. 721 (Ct. App. 1993).
197. See, e.g., Lewontin & Hard, supra note 81.
198. The literature supporting this view is sunmmrized in Devlin et al., supra note 69.
See also supra note 124.
199. One indication that the true "genotype" frequencies are much smaller than the
Ucorrections~ for putative population structure made by the overestimation methods lies
in studies of subgroup frequencies. If data on two subpopulations are available, the
I~. ..:,.
Fall,. 1 9 9 3 ]
"
+DNA'Evidence
.... '. ".149: ~:.::II-:
q u e s t i o n a l f l e i m p o r t a n c e "a°° and+have little i m p a c t o n m u l t i l o c u s " g e n o -
.
t y p e " f r e q u e n c i e s . 2m
singlelocus and multilocus genotype frequencies validly can be estimated u n d e r t h e . :
independence assumptions, since any possible, structure i n the general population is . . . .
eliminated or reduced by fncusing on the subpopulations. Next, a database mixing ~Cse r
"+
subgroups can be constructed, simulating a highly structured population. Using this : .+
simulated database, one can estimate allele frequencies and compute genotype frequencies as if the population were homogeneous. If the artificial population frequencies are
close to the true frequencies in the simulated population, one must conclude that even the
exaggerated substructuring does not produce much error.
Although detailed data on fully homogeneous subpopulations are not:yet available,
analyses along these lines have been performed mixing southeastern and southwestern
Hispunic-Americans, African-Americans. and mixing Caucasian-Americans, and AfroCaribbeans, Asians. and Caucasians living in England. See Devlin & Risch, supra note.
+.
89, at 546; Berry etal., supra note 45; Even & Pinchin, supra note 116, at271 ( " E v e n
in the extreme case of using an Afro-Caribbean instead of a Caucasian database, the ....
consequences are not serious . . . . It is now clear that the precise shapes of the
:
bandweight frequency distributions are not particularly important,"); Monson &
Budowle, supra note 97, at 1044-49 (four-locus genotype frequencies derived by
crossing African-American, Caucasian, suutheastem Hispanic and soiJthwestem Hispanic
databases rarely differ by more than a factor of ten). Comparisons between CaucasianAmericans, African-Americans, Hispanic-Americaus and Chineso-Singapurans, MalaySingaporaus, and Indian-Si.'|gapurans tell much the same story. : Shui TsoChow etaL,
The Development o f a DNA Profiling Database in a HAE l l l Based RFLP System for
Chinese, Malaya, and Indians in Singapore, 38 J. FORENSIC SCI. 874 (1993).
200. Devlin & Risch, supra note 89. at 546.
201. See Newton E. Morton, Genetic Structure of Forensic Populat+ons, 89 PREC.
NAT'L ACAD. SCI. U.S. 2556, 2560 (1992) (Kinship studies show human populations tu
have little structure, making the ceiling approach "absurdly conservative."). Krane et
al., supra note 126, defend the ceiling principle as having "a sufficient margin Of+
safety." ld. at 10586. Krane and his coauthors analyze blood samples from 73 Finns in
Helsinki, 7 9 Italians in Milan, and 1,354 Caucasians in St. Louis to find that allele
frequencies do vary among these groups. To judge the impact on forensic calculations,
they examine discrepancies obtained by switching the databases for Finns and Italians
(i.e., computing three-locus frequencies for Finns using allele frequencies for Italians and
vice versa). Although intriguing, this analysis does not fully simulate the forensic
practice. In court, more loci are used, reducing the probability that the frequencies
estimated at every locus will be too low. In addition, the forensic databases reflect more
heterogeneous populations, like Caucasians, s o that the divergence in allele frequencies
between them and their subpopulations are likely to be less than the disparities in
frequencies between the alleles in two subpopulations. Indeed, when the re,archers
computed three-locus profile frequencies for Finns and Italians with allele fr~luencies
appropriate to the St. Louis Caucasians, the disparities were somewhat less~dramatic.
Most (78%) of these profile frequencies are offby less than a factor of ten, and virtually
all are within a factor of I00 of the correct values for Finns and Italians (which typically
are on the order of lif e or less). Id. at 10586 (Figure 2). These fmdings thus suggest
that the independence assumption with big bins produces genotTpe frequencies that are
roughly correct even when applied to a subpopulation. Indeed, these numbers probably
understate the accuracy of the independence assumptions with big bins. When the
databases are switched, the individual whose genotype frequency is to be estimated is left
in the cognate database, which elevates the frequency of this genotype in that database.
See Chakraborty, supra note 124 ("inherent statistical artifact"); Bernard Devlin & Neil
Risch, NRC Report on DNA Typing, 260 SCIENCE 1057, 1058 (1993) 0otter pointing out
"large upward bias" in Krane et al. for samples that included only 29 Finns and 70
Jr:
+
1OPt
If the -we-know-en0ugh. camp is ..correct, then :~the overestimation: " ' ~ ......
estimates of the genotype frequency P might be exp0sed in any number
of ways. The overestimates--from big bins to c e i l i n g s ' m i g h t be
presented along withless extravagant estimates Of P . Rather than~using
the largest allele frequencies p l m to arrive at a single number f o r P i n
any population, values of P computed via (1) and (2) might be given
across a range of subgroups. 2°2 In light of the mounting evidence that the
independence assumptions are reasonable for the VNTR enzyme-probe
systems in use, the straight-forward independence method, supplemented
by reasonable indications of the uncertainty in the results of these
computations, seems to produce the most appropriate estimates of
"genotype" frequencies.~
In contrast, the NRC Report advocates one form of overestimation
because it seeks a procedure that is "appropriatelyeonservative ~'2°4rather
than reasonably accurate. 2~ By limiting the presentation to the highest
possible range for P in both general population and subpopulati0n eases,
the NRC hopes to sweep the debate about population genetics under.the
proverbial rug. After all, how can scientists and lawyers quarrel when
all that the scientists will say is that the genotype frequency cannot exceed
some "appropriately conservative, value? Although this approach .is not
without appeal in subpopulation cases, where the disparities between the
true genotype frequencies and those computed with the basic independence method are potentially the most pronounced, the NRC committee's
ad hoe determination of what is "appropriately conservative" is as much
a determination based on social policy as a declaration of what is
scientifically acceptable. 2°~ Therefore, it would be a mistake for courts
Italians with three-locus profiles).
202. If the databases are such that sampling error is a serious issue, interval esfimates
can be presented. On the computation of these intervals, see Chakraborty et al., supra
note 90.
203. But see supra note 133 (papers proposing the use of parameters that characterize
the extent of substructure).
204. NRC REPORT, supra note 15, at 94.
205. The panel also "sought to develop a recommendation . . . flexible enough to
apply not only to markers now used, but also to markers that might be technically
preferable in the future." ld. It does not explain w h y the same procedure must be
applied to all markers or why population studies cannot show that the simple independence method will not work with such markers.
206. The consistent undervaluation of the evidence that may result does not trouble the
panel because "[w]~iatever power is sacrificed by requiring conservative estimates can be
-
Fall, 1993]
DNA Evidence: ~!i~
•
to conclude that scientific practice or theory dictates the use o f the one
procedure that the panel deems to be "appropriately conservative."
Still, one must ask if it is bad law to allow scientifically defensible i
estimates to be admitted. The law of evidence does not normally dictate
which of several scientifically acceptable methods of analysis an exPert
may present in court. However, if jurors would be so bemused by a
sensitivity analysis of P, or if they would ignore:the larger estimates in
favor of the more impressively infinitesimal ones, then the overarehing
objective of achieving a fair assessment of the DNA test results may be
difficult to attain with the usual approach. Regrettably, there is no
research to date that can definitively resolve this psychological issue of
how jurors respond to extreme statistics, ~ but when the risk that the jury
will overvalue or be unable to assimilate a range of figures is not
demonstrable, the law should allow a qualified exPert to pursue the
scientifically acceptable approach that the exPert finds most congenial.
At the very least, the law should permit the expert to present both the
"conservative" estimate and the best available estimate. This approach
is well-suited to match-binning frequencies in general populations, and,
arguably, it is acceptable even in the more vexing subpopulation easesi ~
III.
TO BIN OR NOT TO BIN
In Part II, I considered match-binning and the procedures for
determining the frequency of a match in a reference population. I argued
that more than one approach to producing a match frequency or probability is within the bounds of acceptable scientific practice, and that insisting
regained by examining additional loci. ~ ld. at 85. As a purely scientific matmr,
however, it is preferable to be as accurate as possible in estimatingthe frequency, and
then produce a range that reflects the uncermimies in the estimate. Scientistsdo not
normally present paramemrs of theoretical interestby looking only to one end of a
confidence interval, and it would be most peculiar to fred a statisticianadvocating an
inconsistent and biased estimator--one that is expected to depart from the true value,
even as more and more observations are made, and that tends to err in a particular
direction across many samples--simply because it is possible to gather stillmore data.
Worse still,resortto more and more probes raises the risk of false exclusions under a
match-no match rule. See supra text accompanying note 45.
207. David H. Kaye & Jonathan 3. Koehler, Can Jurors Understand Probabiliatic
Evidence? 154(A) J. ROYAL STAT. SOC'Y 75 (1991).
208. The argument against admissibilityof any estimate reaches its zenith in subpopuiation cases involving uncommon, isolated ethnic groups (such as, perhaps, Polynesian
Chamorrons) rather than more c o m m o n subgroups (such as Italian-Americans). See
supra note 180.
:
.. '
.>Y: ~l:)z
"'^
==" ,r '21 ¢~
'0~
.na ara'Jb'ur,
'l
issue, in other words, is no longer,wtiether the evidentiaryirules..SI~
~o expert testimony demand the exclusion o f the match frequency.: ,~.~__
analysis and'review of the scientific literature in Part II establishes that
at least some version o f match-binning--be it basic bins, b i g b i n ~
"cei!ings" o f one kind or another--satisfies both Daubert (or othercases=>-., : { . . ,: !..:
that require a court to assure itself that there.,is a scientifically valid basis
for the testimony) 2°9 and Frye (or other cases that require the court to,find
: =:: ~..... : ::: "
general scientific acceptance). 2~°
,:
.-:
~
., :.The remaining question involves the familiar balancing test for
virtually all evidence: Does the balance o f probative value and prejudice
favor excluding relevant a n d s cientifically acceptable estimates o f the
match frequency P in the: suitably chosen refe_rence population or
populations? 2|| The tbrm of prejudice that arguably ~ e c t s match:binning '
"
estimates is that they will unfairly impress the jury and induce them to
slight other. . important, evidence. Thus, courts h'ave ~
concerned ,that ...... : "
:
very small fractious, by virtue o f their large denominators, are just too
impressive for jurors to handle>properly and that jurors are lilmly to
misconstrue them as stating the probability of innocence. Furthermo~,
commentators have argued,that jurors may not appreciate the limited
:
209. See supra text accompanying note 26.
210. See supra text accompanying note 21. One possible source of confusion should
be put to rest. How can "conservative" procedures like the N R C ' s ceiling methods
satisfy the general acceptance rest when scientists remain divided over the appropriateness of these procedures? This question, howeve-/~ inVites us to confuse the policy
question of whether exv~me overestimates are necessary or desirable with the more
scientifically tractable question o f whether the overestimation methods work as advertised
to ovex~'tate the matching proportion P. "there is liule, if any, dispote over the proposi-' "
tion that for a structured population with each ~ p o p u l a f i o n in e~mh1~um, the ceiling
methods produce generous estimates o f P. See Eric S. Lander, DNA F'mgerprinting: The
~'-~,~.. :
l~!RCReport, 260 SClENCE1221(1993) ("The NRC committee simply coocloded that ltm
chosen upper bound sufficed to eliminate serious scientific objections . . . while still
allowing odds of up to 6,250,000:1 for a match at four genetic loci;"). I n fact, the very
perception that the methods are enormously generous evokes antagonism on the part o f
the scientists and statisticians w h o see d g ceiling computations as inappropriate.
Consequently, the warning o f the Wallace court that "the key players in this dispute,
over the excessiveness of the ceiling principle must "agree to a compron~ise on statigf~.al
calculation ~ or "risk preventing any general acceptance at all, thus pre~.luding the
admissibility of DNA analysis evidence, ~ 17 Cal. Rptr. 2d 7 2 1 , 7 2 5 • (Ct. App. 1993),
rests on a failure to recognize what the debate is about.
211. See supra note 24.
:~
:
"
Faii,1993] "
cannot be evaluated,
costs and benefits ¢
relative to other methods o f informing the jtw.
DNA testin discriminating betweenthe innuc~
is a spectrum ofmodes of pw.senmti0n that may
ways: the pure opinion format, the improbability!fommt, lthe likelih00d
ratio format, and the posterior probability format. :In what follows, I
explain what I mean by these phr~es,:and argue thatIsome combination
of the second and third approaches should be preferred:
:~
A. The Pure Opinion Format
One can imagine a world in which numbers are verboten,and experts
are constrained to stating categorical opinions. In the legal universe, this
world is more hypothetical than real.2 |2 For a time, Minnesota seemed
to have such a rule, and estimates of genotype population frequencies ate
still inadmissible regardless of their accuracy, m The rule forbidding
numerical estimates emerged in a 1978 case involving microscopic
: . ;:
212. Cases explicitly rejecting this rule with DNA evidence include United States v .
Yee, 134 F.ILD. 161,211-12 (N.D. Ohio 1991); Marfinez v. State. 549 So.2d 694 (Fla.
Dist. Ct. App. 1989) (see supra text accompanying note 79); People v. Mehlberg, 618 :
N.E.2d 1168 (Ill. App. Ct. I993); People v. Lipscomb, 574 N.E.2d 1345 (Ill. App. Ct:
1991); State v. Brown, 470 N.W.2d 30 ~0wa 1991); People v. Arian,s, 489N.W.2d 192 c=
(Mich. Ct. App. 1992) ("[T]esting would be a matter of speculation without the statistic' i~::"
cal analysis.'); State v. Cauthron, 846 P.2d 502, 516 (Wash. 1993);:Springfield v.
State, 860 P.2d 435 (Wyo 1993) (rejecting the observation in Rivera v. State, 840 P.2d
933 (Wyo. 1992), that "the bem:r practice" is the Minnesota rule excluding =statistical
probability~ because it %ould be perceived as an ophxionby the expert that the accused.
is guilty'). But see Perry v. State, 586 So.2d 242 (Ala. 1991) (remanding for hearing
on Lifecodes's procedures for single locus VNTR tests and computation o f P = 1/209,100,000, and whether figure is unfairly prejudicial); State v. Pennoil, 584 A.2d 513
(Del. Super. Ct. 1989); John J . Walsh, Forensic DNA ~ping: The Canadian F a p ~ ence, PROC. THn~ INTL SYMP. ON HUM. IDENTIRCATION 85 (1992) (criticizhlg
unreported Canadian cases).
213. In State v. Joun Kyu Kim, 398 N.W.2d 544 (Mum. I987), the court departed
':slightly from the pure ~no numbers" rule. It allowed testimony as to frequencies of each
protein or enzyme marker in a semen stain. Under this variant of the rule, the jury may
be told the frequency of each marker, but not the frequency of their combination.~
Applied to VNTR studies, it would allow the expert, t o give the frequency of each
"allele" and perhaps of the single-locus ~genotYI~es" obtained from equation (1), but not f/
of the muldlocus "genotype" derived from (2). Cases pursuing this exception include .~
State v. Johnson, 498 N.W.2d 10 (Minn. 1993) and State v. AlL 504N.W.2d 38 (Minn.
Ct. App. 1993).
S t a t e v. S c h w a r t z , ~'° ~ e c o u r t h e l d t h a t it: govemed..iDHA evidence ~a s . '
':.. ~:~..
.
well.. I n this last case, police investigating,the stabbing.death o f ~ e
.:
' " ,~ ::~~.. ' .
C o o u r o d f o u n d a n d seized bloodstained b l u e j e a n s i n T h o ~ S c h W a r t z : s
:. ~i" i:::'i/~.i: !:
residence. Cellmark Diagnostic Corporation's rep6rtconcluded,that.fit
:.:~.
is the o p i n i o n o f the u n d e r s i g n e d t h a t the D N A b a n d i n g p a ~ e r n s Obtained
from the stain r e m o v e d f r o m t h e
b l u e j e a n s and t h e b l o 0 d 0 f C a r r i e
Coollrod are f r o m the same individual. "2t7
This :6pinionrested 0na
" b a n d i n g pattern [whose frequency] i n t h e
Caucfisian popnlati0n~iis?
approximately 1 in 33 b i l l i o n . , 2~8 T h e state urged the supreme court,to :;
.
.
.
.
,
,
~
().
.
.
allow this statisticto be admitted "after an adequate opportunity for cross ' ~
examination and limiting instructions."219 The .court declined this
invitation. "In dealing with complex technology, likeD N A testing,"it.
wrote, "we remain convinced thatjuriesin criminalcases may give undue
weight and deference to presented statisticalevidence and are reluctant to
take that r i s k . ' ~
~
"
T h e defect i n the M i n n e s o t a rule is obvious. The c o m p l e x technology
o f D N A testing can p r o d u c e figures that are n o t o n l y relevant, b u t h i g h l y
probative. The j u r y needs some estimate of:the p o p u l a t i o n f r e q u e n c y o r
the p r o b a b i l i t y o f a match with a n o t h e r source to give a m a t c h t h e weight
it deserves. 221 An expert may be needed to calculate P, but the expert's
214. State v. Carlson, 267 N.W.2d 170, 176 (Minn. 1978), described infra note 246.
215. State v. Boyd, 331 N.W.2d 480, 482 (Minn. 1983); Joon K y u K i m (allowing
testimony as to frequencies of each marker but not as to the frequency of the set of
incrimi~..fing markers).
216./447 N.W.2d 422, 428 (Minn. 1989).
217. "ld. at 424.
218. Id. CeUmark used a multilocus probe, which produces pmfdes that are harder to
interprez statistically than the series of single locus probes that have come to dominate
criminal tasting. Thus, the 1/33 billion figure is a binomial probability computed in a
different fashion from (1) and (2), which apply only to single locus:probes. S e e Kaye,
supra note 1. Interestingly, the calculation is essentially identical to one used over a
century ago to analyze an allegedly forged signature in Robinson v. Mandell, 20 F. 1027
(C.C.D. Mass. 1868), described in Paul Meier & Sandy Zabell, Benjamin t~erce and the
Howland Will, 75 J. AM. SWAT.A~'N. 497 (1980).
219. 447 N.W.2d at 428.
220. Id. The Minnesota Supreme Court adhered to this reasoning and result in State
v. Jobe, 486 N.W2d 407 (Minn. 1992). In State v. Nielsen, 467 N.W.2d 615, 620
(Minn. 1991), it intimated that MINN. SWAT. § 634.26 (1989), which was enacted to "=
overturn the Carlson line of cases, is somehow unconstitutional.
221. See, e.g.. Nelson v. State, 628 A.2d 69, 76 (Del. 1993) (finding trial court's
exclusion of mateh frequency ~inherently inconsistent" with its admission of testimony of
a match, because "without the necessary statistical calculations, the evidence of the
match was 'meaningless' to the jury'); State v. Brown, 470 N.W.2d 30 (Iowa~:1991)
"
~i
Fall,- 1993]
-. : DNA Evidence:
..
'•.,.... ~+
.........
155-,
optmon about the conclnsmn to:b~..orawn from this statistic,or ,~.
expertise, in laboratoW":chemistry, genetics;or biostatistics. ~ - i U n l e s s / " ~i~.
invited b y the defendant, such testimony Should not b e allowed. 223
Only if it were certain, ' or nearlyso, ~that jurors would ~ , a r a y
such number would it be desirable, to lear e them at sea and hope that they.:. . . . . .
might make it,to port on their own..But there is noclear indfcation,that -. : ~:::':i
"undue weight and deference~ t o statistical: evidence is any:more'likely
than insensitivity and hostility to the evidence22: •or helpless capitulation
i :i
to an inscrutable opinion. Consequently, ablanket rnleagainst statistics
, :
or probabilities relatingto DNA evidence is unjustified. Global d o u b t s
:
about jurors' abilities to handle statistics do not l'ead to the concius:on that
the dangers of prejudice substantially outweigh the probative value O f :
well-founded estimates of population frequencies.
(holding expert testhnony that "the likelihood o f a person matching in all four fragments~ , : •. . . .
• . . would be one in several billion" admissible, since "[w]ithout statistical evidenee,-the
ultimate results of DNA testing w o ~ d become a,. mat~r o f , speculation");Smte v . - . ..... , ,.
Vandebogart, 616 A.2d 483, 494 (N.H. 1992)=( A match is virtually .meaningless . . . .
, '
without a s~tistical probability expressing the frequency with which a match :,could :
..
occur."); NRC REPORT, supra note 15, at 74 ('To say that two patterns match, without'.
" .
providing any scientifically valid estimate (or, at least, an upper bound) of the frequency
with which such matcbes mightoccur by c _ h ~ . , is meaningless.").
.
-- ' .
It would.not' however, be meaningless to inform the j u ~ that two samples match
and that this match makes it more probable, in an. amount that is not precisely k n o w n , / • .
"
that the DNA in the samples comes from the same p e r s o n . Nor, when all eslimates of
the frequency are in the many millionths or billionths, would it be meaningless to inform
the july that there is a match that is known to be extremely rare, if not unique, in the general population.
At least one court has suggested that the NRC Report's "meaningless" remark
demonstrates that Frye precludes presenting evidence o f a match without an estimate of
the genotype frequency. State v. Bible, 858 P.2d 1152 (Ariz. 1993), construing State v.
Cauthron, 846 P.2d 502 (Wash. 1993). This view i s plainly mistaken. The general
acceptance standard addresses the validity and reliability of the methodology that
produces evidence of identity. The fact of a match is scientifically valid evidence of
identity as long as it can be shown from theory and data that the genotype is not
ubiquitous in the relevant population. How valid scientific evidence of a match should
be presented to a jury is a legal rather than a scientific issue falling far outside the
domain of the Frye test.
222. Even in Minnesota, it may be that opinions beyond the bland statement of a
match are inadmissible. See State v. Aft' 504 N.W.2d 38, 52 (Minn. Ct. App. 1993).
223. The outcome in State v. Cauthron, 846 P.2d at 515-16, is consistent with this
suggestion. Cellmark's Dr. Robin Cotton testified that she had "no doubts" that 1he
defendant was "the source of the semen sample in the five [rape] cases that we got the
result on" and that "the DNA could not have come from anyone else on earth." The
court held that because this opinion testimony was not supplemented or replaced with
"background probability information," it should not have been allowed, ld. at 516.
224. See Kaye & Koehler, supra note 207;
156
Harvard Journal OfLaw & Technolo~
B. The Improbability Format
Because most DNA testers have Chosen tocompute m a t e h : b ~ g :: :'<
frequencies, DNA evidence usually comes ~ the form of adeterminaticiii.)
of a match accompanied by a small numberthat is said to:show the .~ ~:,i: :
improbability of a match in a population o f innocent suspectL: 'Although : . . ~:
nearly all courts dismiss the broad-brush objection-to these nuinbers, =
there are subtle--and troublesome--ways i n which mateh-binni~5
frequencies could be unfairly prejudicial. These possibilities do not, I
think, dictate a fiat ban on P, but they do require steps to avoid abuse or
misuse of the figure.
" "
1. P is not P(Mo I O)
A more sophisticated legal criticism than the global obj~tion to
numbers is that population frequencies may be mistaken for the frequency
with which the laboratory willdeelare a match between the defendant's
sample and the crime sample (MD) when the samples are from different
sources (0). Contrary to what some testifying experts have claimed or
implied, ~ the f r e q u ~ of a DNA profile in a given population only
reveals how often an error-free DNA test will,give false positives when
applied to that population. If matches result both from people whose
DNA truly satisfies the matching and from peop,le whose DNA does not
match, but appears to because of non-random error such as mislabeling,~
then the rate of false positives will be larger than the proportion P . In
practice, of course, DNA tests are not always free o f all non-random
e r r o r s , 227 and even a tiny probability of/[:'-falsepositive error typically will
swamp the vanishingly small estimates of population frequencies
associated with matches at four or five VNTR loci.
Three strategies to Eounter the danger that a jury will confuse a match :'<.-. . .
frequency with the probability of a false positivehave been proposed.
225. See Kelly v. State, 792 S.W.2d 579, 583 (Tex. Ct. App.-1990) ("He [Kevin
McElfresh of Lifecodes] noted that the statistical probabilities o f such a match being
incorrect was one in thirteen million."). For more examples, see Jonathan J. Koehler,
Error and Exaggeration in the Presentation of DNA Evidence at Trial, 34 IURIMETRICS
I. 21 (1993).
226. Undetected degradation and band shifting are not likely to generate false posifives. On the possible sources of false positive laboratory errors, see Thompson & Ford,
supra now 76.
227. See supra text accompanying note 77.
'Fall,-1993]
O n e is to r e d u c e the
c y and s u b m i t t i n g :
ses. ~
A
second
e s t i m a t e and p r e s e n
case at bar, c o n s i d e r i n g b o t h the i a b o r ~ r y ' s : r a t e
o f false p o s i t i v e s o n .
b l i n d p r o f i c i e n c y tests and the p o p u l a t i o n frequency,229.The thirds01ufion
is to p r o v i d e the j u r o r s w i t h b o t h the l a b o r a t o r y false.positiVe e r r o r : ~
,::::
'
'!ii::
and the e s t i m a t e d p o p u l a t i o n p r o p o r t i o n P , t h e r e b y i m p r e s s i n g On the j u r y
that the latter c a n n o t b e equated to the p r o b a b i l i t y o f a false m a t c h . ~ . -
.
. :-
228. Lempert, supra note 74, at 327-28. Whether the costs, beth in terms of resourc r :
es and increased false negatives, justify multiple ~ g
is unclear. !t may be enough to~-i :
give defendants the right to retest at different laboratories and to subsidize multiple :tests:
for indigent defendants who demand them. Cf..James Wooley & Rockne P.: Harmon,"-,
The Forensic DNA Brouhaha: Science or Debate?,i:i51 AM. J. HUM. GENETICS 1164
(1992) (letter urging defense experts to retest rather,than'theorizeabeut theposs~le
sources of laboratory error).
"' "'
"!i ~:~,.'.
In any event, vigorous legislative or administrative,action to reduce the risk of false .: : : /-..
positive and false negative errors alike is eminentlydesirable. Even with the u n u s ' ~ ,
! . 'i'
safeguard of imposing on the proponent of DNA,evidenee'at,a preliminary hearing the :, ~ '
burden of proving that a match follows from properly applied'l~aberatory,procedures;,see .
E. Imwinkelried, The Debate in the DNA Cases Over the Foundation for the AdmissiOn.
of Scientific Evidence: The lmportance o f Human Error as a Cause of Forensic
Misanalysis, 69 WASH. U. L.Q. 19-(1991), it will be immensely difficult to detect
possible errors in a particular case. Moreover, the judicial system is unlikely to produce
sufficient incentives for quality control. If the DNA testing is done moderately well, butnot as well as it could be, the court must decide whetber to exclude generally probative , +
evidence because of the possibility th,3t the laberatory may have erred in the case at bar.
Courts are rightly loathe to exclude such evidence without a specific indication of
laboratory error. If all cases went to trial and all defendants hadskilled anti'astute
+,
counsel with access to experts who could look over the shoulders of the laboratory
technicians, so to speak, the state would feel strong pressure to invest in the laboratories
up to the point at which marginal benefits flowing from the admission of the laboratory
findings equals the marginal cost of improvements in laboratory procedures. However,
the vast majority of cases never reach trial, and very few defense lawyers have the
knowledge and resources required to identify the particular instances when laboratory
imperfections actually cause a problem. Prosecutions will be instituted and m o s t
defendants will plead guilty when faced with infinitesimal match-binning probabilities.
At some point, of course, demands for quality control become excessive, but there is
every reason to get things right before trial. The optimal level of quality control
therefore is farther in the direction of increased expenditures than might at first be
imagined.
229. See Paul J. Hagennan, DNA Typing in the Forensic Arena, 47 AM. J. HUM.
GENETICS876 (1990); Lempert, supra note 74, at 325-26.
230. See Russell Higuchi, Human Error in Forensic DNA Typing, 48 AM. J. HUM.
GENETICS 1215 (1991); NRC REPORT, supra note 15, at 88 & 94 ("A laboratory's
overall rate of incorrect conclusions due to error should be reported, but separately
from, the probability of coincidental matches in the population. Both should be weighted
in evaluating evidence.~); /d. ~i~89 (~The jury should be told beth results."). Presumably, expert testimony could ~ssist by combining the error rate with P to arrive at
P(Mo ] O), the probability that the laboratory will declare a match for defendant given
158
'-Harvard.
These proposals undel
proportion P is not the prol
the defenda' would recriminate an innocent person. Although the p o i n t
does not rv .date a categorical exclusion of P, itdoes militate in favor O f
adopting, as a precondition to the admission of P,: a PrC~edure, such
~++ a s
+
those described above, that would emphasize the distinction to the jury.: :
Requiring the expert to give an estimate of the _g~tteof false matching on
independently administered blind proficiency tests: ~ ; ~ b e the simplest
prophylactic. TM
i!!/i:
2. P is not P(O l MtO
Presenting P also can produce prejudice if the jury misinterprets it as
the probability that someone other than the defendant is the source of the
crime sample. In a sense, this is a more fundamental objection, since it
pertains even to an error-free test, for which P actually is the risk of a
false positive. Assuming, for simplicity, that the test is error-free, the
fallacy works something like this: (a) P, the frequency of a match in the
reference population, is the probability that an innocent perscn would
match the crime sample; Co) defendant does match; therefore, (c) P is the
probability that defendant is innocent.
Where does the fallacy occ~? The first two steps are correct. If the
population proportion is, say, P = 1/100,000, then the probability that
any randomly selected person D will match (an event we may.designate
MD) given that someone other than D is the source of the crime sample
that someone else is the source. If no explanation is provided, the jury may "be
helplessly confused about the weight to accord the testimony [of a match] because
ordinmy people are not very good at working with conditional probabilities." Lempert,
supra note 74, at 325.
Another source of possible error in the interpretation of P is:the presence of
relatives, who have a greater chance of sharing alleles with the defendant.and matching
the crime sample, than the figure P suggests. This is really an aspect of the problem of
defining the reference population. Lempert capably surveys the possible solutions and
concludes that "until technology advances, the most honest approach is to present the
jury with the probability that it was left by one of the group of defendant's relatives
whom the state has not been able to exclude from the suspect population. ~ Id. at 214;
cf. NRC REPORT, supra note 15, at 87; Balding & Nichols, supra note 133. But see
Lempert, supra note 176 (conceding that the ceiling procedure is difficult to justiTy on
scientific grounds, but defending it as an indirect vehicle for accomodating the problems
of laboratory error and "micropopulations").
231. Naturally, defense counsel would remain free to buttress tiffs generalized
information with arguments about the adequacy of the laboratory work in a particular
case.
%/
(an event that may be
fallacy occurs at the last step, whichspeaks of th(
that D is not the source given the match between D and the crime sample.
T h e rules of probability reveal that P(M D [ O ) i s nofgenerallylequal to '
P(O [ MD). The probability that a Card &awnfrom awell shuffled (ieCk
is an ace of diamonds given the fact that iris a red card is 1/26, but the
probability that it is a red card given that it is the a c e o f diamonds is one.
Although it is an elementary mistake to conflate/the conditional"
probability of the match given innocence with the conditional probability
of innocence given the match, 232 more than one court has fallen prey:to
this "inversion fallacy."233 For example, the Califoraia court of appeals, .:
in People v. Axell,TM thought that Cellmark's report that "the frequency
of that DNA banding pattern in the Hispanic population is approximately
1 in 6 billion" meant ,that the chance that any but appellant left the
unknown hairs at the scene of the crime is 6 billion to ~t." Courts in
Arizona, ~ Colorado, ~ Georgia 237, lllinois, 23s Indiana, 239Mississippi, u°
232. See, e.g., McCORMICK, supra note 22, § 211; Jonathan J. Koehler, DNA
Matches and Statistics: Important Questions, Surprising Answers, 78 JUDICATURE 222,
224 (1993). Another way to recognize t h a t P is not P(O [ biD) isto consider the impact
z
of population size on this probability. Cf. Laurence H. Tribe, Trial by Mathematics:
Precision and Ritual in the Legal Process, 84 HARV. L. REV. 1329 (1971). Suppose, as
in Kelley v. State, 792 S.W.2d 579, 582 flex. Ct. App. 1990), the reference population
consists of "white males" and the incriminating profile occurs with an estimated frequency of P = 1113,500,000. If the reference population numbers 27 million, then the
:
expected number of matching DNA profiles is two. The defendant is one of these two,
which suggests--in the absence of other information linking him as opposed to the other
potential match to the crime--that the chance that the other man is the source of the
crime sample is one-half. This is a far cry from the court's thought that "It]be statistical
probability that the semen came from another white male was 1 in 13,5 million." ld..~=~"-~,,
582. Of course, there is no particular reason to think that the reference population in
~
i
Kelly numbers 27 million. It probably is much less. But that does not diminish the
~L
logical force of the argument. The one in 13.5 million figure is just a population
proportion. It neither grows nor shrinks according to the size of the reference population. Yet. the conditional probability P(O M~) that someone other than the matching
defendant is the source is related to the number of other people who could be the source.
Hence, these two quantities are not identical; even though P can b e interpreted as
P(Mv [ O), P(MD [ O) cannot be equated with-P(O [ My). See, e.g., Keehier, supra
note 225.
233. Kaye & Koehler, supra note 207. It also is called the "prosecutor's fallacy."
William Thompson, Are Juries Competent to Evaluate Statistical Evidence?, 52 LAW &
CONTEMP. PROBS. 9 (1989). Commentators, as well as experts, courts, and jurors, also
have been known to commit this error. See, e.g~., Joseph Liebeschuelz, Statutory
Control of DNA Fingerprinting in Indiana, 25 IND. L. REV. 204, 208 (1991) ("The
exclusion frequency is the relative probability that the defendant committed the crime
compared with a person selected at random from the general population.").
234. 1 Cal. Rptr. 2d 411 (Ct. App. 1991).
235. State v. Bible, 858 P.2d 1152, 1165 (Ariz. 1993) ('Cellmark concluded that the
N e w Y o r k , u~ S o u t h D a k o t a , ua T e n n e s s e e , 243 T e x a s , u4~ a n d t h e U n i t e d
I ~ g d o r n ~s h a v e made. o r b e e n p r e s e n t e d ,with s i m i l a r i n v e r s i o n s . ~
chances were one in 14 billion . . . ihat the blood on Defendant's shirt was not the
victim's."); id. at 1189 (referring to "the product role and the resulting opini0n of the
odds against a random match"). The court criticized the state for "tacitly [attempting] to
argue that these probability figures could be equated with the probability that someone
other than Defendant committed the crime." Id. at I185.
236. Fishback v. People, 851 P.2d 884, 888 (Colo. 1993) ("Once a match is determined, its statistical significance. • . is usually expressed~in terms of the likelih0L~dthat
the crime scene samples came from a third person who has the same DNA profile as the
suspect.").
237. Homsby v. State, No. A93A1270, 1993 WL 49"/094, at *6 (Ga. Ct. App. Oct.
18, 1993) ("['r]he chances that the semen recovered from the victim belonged to
someone other than the defendant were one in 70 million . . . . "); Bradford v. State, 420
S.E.2d 4, 5 (Ga. Ct. App. 1992) (Apparently unchallenged FBI DNA tests in rape case
said to show that "the odds someone other than defendant attacked the victim were I in
49 million.").
238. People v. Miles, 577 N.E.2d 477, 484 (IILApp. Ct. 1991) ("The probability of
an African-American other than the defendant leaving the semen stain on the bed sheet
w a s 1 in 300,000.").
239. McElroy v. State, 592 N.E.2d 726, 728 (Ind. Ct. App. 1992) ("The State
presented DNA identification evidence which showed the odds were 20 million to one
that he committed the rape.").
240. Polk v. State, 612 So2d 381, n. 1 (Miss. 1992) ("The probability that the blood
• . . was from any person other than Georgia Mae Thomas was calculated to he 1 in
530,000,000.").
241. People v. Davis, 601 N.Y.S.2d 174, 175 (App. Div, 1993) ("A Lifecodes
t e c h u i c i a n . . , declared at trial that the statistical probability of someone other than the
perpetrator providing the alleged 'match' w a s 'one in ten million.'").
242. United States v. Martinez, 3 F.3d 1191, 1193 (8th C i r . 1993) (~Tbe P'BI
concluded that there was a 1 in 2600 probability that the semen found on the panties
came from someone other than Martinez."); United States v. Two ;~lls, 918 F,2d 56,
57 n. 2 (8th Cir. 1990) ("[P]robability of someone other than Tw'0 Bulls providing a
match was one in 177,000.'), vacatcd f o r reh'g en bane but appeal disraised due to
death o f d e f e ~ ,
925 F.2d 1127 (8th Cir. 1991).
243. State v. Myers, 1993 WL 1416512 (Tenn. Crim. App. May 4, 1993) (Unpublished opinion reporting that an FBI agent "concluded that a 1 in 50,000 chance existed
that an individual unrelated to and other than the defendant produced the semen sample
found on the victim's clothing.").
244. Kelly v. State, 792 S.W.2d 579, 582 ('rex. Ct. App. 1990) ('The statistical
probability that the semen came from another white male was 1 in 13.5 million.");
Transcript at 2327, State v. Bethune, 821 S.W.2d 222 :(Tex. Ct; App. 1991) ('There
would {be] a one in 5 billion chance that anybody else could have committed the
crime.").
245. R. v. Carman, 92 Crim. App. 16 (1991) ("So far as the DNA evidence was
concerned it seems that the chances of anyone else having been responsible for the
semen found on the knickers was something like 260 million to one against.").
246. For still more examples and a discussion of the forces that induce these errors,
see Koehler, supra note 225• The error is hardly confined to DNA identifications. See,
e.g., United States ex tel. DiGiacomo v. Franzen, 680 F.2d 515, 516 (7ih Cir. 1982)
(State criminalist testified that "the chances of another person belonging to that hair
would be 1/4,500."); State v. Carlson, 267 N.W.2d 170, 175 (Minn. 1978,'."(Expert
testified that "the likelihood that the hair found in the r u g . . , in Carlson's b e d r o o m . . .
did not come from the victim would be on the order of one chance in 4,500."). The
•
.
,
Fall, 1993]
DNAEvidence
7i ! 6 1
The Minnesota court in:Schwartz perceived the inversion fallacy as a
reason to exclude P altogether, u7 but this ieaction seems precipitous i f
less drastic measures will reduce the danger. : Such measUres~includ6
cross examination and opposing expert testimony or jury argumentahout
the meaning of p.24, They also include a rule that would preclude
prosecutors or experts from describing P, assome do,~9 in ways;that
encourage the commission of the fallacy. Broader awareness of the
fallacy should go far toward retarding its influence inthe courtroom.
C. The Likelihood Retio Format
I have argued that suitably computed and presented match-binning
frequencies and probabilities pass muster under the conventional rules of
evidence. They pose some danger of misinterpretation, but the risk can
be reduced to the point where the usefulness of the testimony justifies its
admission. This does not mean, however, that P has to be introduced in
court in preference to any alternative. Match-binning, as we have seen,
has several drawbacks. The threshold for declaring a match is arbitrary
and existing match rules may be producing a high rate o f false nonmatches. The need to fit all comparisons into two rigid categories
obscures distinctions that are reintroduced in vague ways when experts
speak of ~exact" matches on the one hand, or "inconclusive" exclusions
hair cases are reviewed more fully in T H E EVOLVING ROLE O F STATISTICAL
ASSESSMENTSAS EVIDENff~IN THE COURTS60-67 (Stephen E. Fienberg ed., 1988).
For a new set of abuses, see Commonwealth v. Pandolfino, 596 N.E.2d 390 (Mass.
App. Ct. 1992).
247. Stat~ v. Schwarz, 447 N.W.2d 422, 428 (Minn. 1989) ('There is a real danger
that the jury will use the evidence as a me~sure of the probability of the defendant's guilt
or innocence.") (quoting State v. Boyd, 331 N.W.2d 480, 483 (Minn. 1983)).
248. See MCCORMICK, supra note 22 § 211; Kaye & Koehler, supra note 207.
Existing empirical research indicates that the inversion fallacy can be counteracted with
an argument like that based on population size. See supra note 232; William C.
Thompson & Edward L. Schumann, Interpretation of Statistical Evidence in Crimina!
Trials: The Prosecutors' Fallacy and the Defense Attorney's Fallacy, 11 LAW & HUM.
BEHAV. 167 (1987). However, this work is based on much larger values for matching
probabilities P, and the countvrargument probably would be less effective-when an
enormous population size would be needed to generate many falsely incriminated people
in the reference population.
249, See, e.g., People v. Miles, 577 N.E.2d 477 (Ill. App. Ct. 1991) (Cellmark's
expert testified that, using database of African-Americans in Detroit, "the probability of
an African-American other than the defendant leaving the semen stain on the bed sheet
• . . was I in 300,000."): ~ United States v. Massey, 594 F.2d 676 (8th Cir, 1979)
(improper closing argument concerning probability of matching hair samples):
:'
162 ~
on the other. ~°
words that make it seem likethe likelih(
Recognizing these problem, some statisticians havle '
methods for conveying the implications of the similarities between!VNA
samples. These alternatives dispense with tile Classification0 f test,~csults ....
into "matches" and "nonmatches," and instead look t o the degree of
similarity in the DNA fragments by quantifying the hypotheses that the
defendant is the source (S) as opposed to the alternative that someone else
is (O). These quantities come together in the likelihood ratio for thetest
results, which expresses how many times more probable the results are
under S than O, and, hence, the relative likelihood of S and O. ~'z If, for
example, the measured differences in lengths of.the VNTR fragments
from the crime sample and the suspect's sample would arise nine times
out of ten when the suspect is indeed the source of the crime sample, but
only one time in 100,000 when someone else in the relevant population
is the source, then the likelihood ratio would be L = ( 9 / 1 0 ) I ( I / 1 0 0 , 0 0 0 )
_-- 9 0 , 0 0 0 . 253
I shall not dwell on the details of producing the likelihoods. There are
competing suggestions. TM All involve a statistical model of the measurement error and an analysis of the distribution of DNA fragment sizes in
a reference population with sampling error. 255 None can b e dismissed as '
250. See supra Part I.
251. See supra Part HI(B)(2).
252. See generally A.W.F. EDWARDS. LIKELIHOOD: AN ACCOUNT OF THE STATISTICAL CONCEPT OF LIKELIHOOD AND ITS APPLICATIONTO SClENTIHC INFERENCE (1972).
253. An analogous ratio using match-binning can be computed. If tbere is a match in
an error-free test at n loci using a match w~ndow of three standard deviations for the 2n
(presumed) independent measurements, this ratio is LM = (.99)~1P. Tbe numerator is
the probability of a match on the 2n fragments from a common source; the denominator
is the probability of a match drawn at random from the reference population in which the
frequency of the matching "genotype" is P. The likelihood ratio L in the text is not
computed in this way, and there need be no "match" (according to some preset match
rule) in the fragments. The numerator of L represents the,probability density of the
measured differences in the fragment lengths (whether or not they fall into some
preordained match window) for a common source, The-~enominator is the probability
density for these differences (without regard to any p-c~efbins or binning rules) for the
crime sample and one drawn at random from the reference population.
254. See Donald A. Berry, Inferences Using DNA Profiling in Forensic Identification
and Paternity Cases, 6 STAT. SCl. 175 (1991); Bernard Devlin et al., Forensic Inference
from DNA Fingerprints. 87 J. AM. STAT. ASS'N 337 (1992); Jeffrey Morris et al.,
Biostatistical Evaluation of Evidence from Continuou~ Allele Frequency Distribution
Deoxyribonucleic Acid (DNA) Probes in Reference to Disputed Paternity and Identity, 34
J. FORENSIC. SCl. 1311 (1989); cf. D.W. Gjertson et al., Calculation of Puternity Using
DNA Sequences, 43 AM. J. HUM. GENETICS 860 (1988) (likelihood ratio for paternity).
255. Sampling error refers to possible differences between the sample and the
F a l l , 1993]
D N A Evidence ~~: .:: ' . ~
1 6 3 'i,~
likelihood ratios a r e so unintelligible as tO ProVide n o assistance to the
jury or so misleading as:to be unduly prejudicial, they should be admissible .
.
.
.
Prejudice seems the more serious o f these possibilitiesi As with match
frequencies, the proposed likelihood ratios do not account for laboratory
r
error, and a jury m i g h t misconstrue even amodified version that did as
a statement o f the odds in favor o f S. 256 Just ~ these objections track
~
those made against the matching frequency, so do the rejoinders. Once
again, admission o f the likelihood ratio L should not beallowed unless the
i
:
risk o f a false positive is incorporated formally or placed a l o n g side it.
As for the second possible misinterpretation o f L, that too is a question
of jury psychology, and the answer is too uncertain to warrant excluding
a statistically acceptable calculation. A n expert who desires to present a
reasonably computed value o f L , either as a substitute for or a supplement
to P, should be allowed to do so. ~
O.
:i:
The Posterior Probabi!itY;F6~s~rnat ~.
(
'-:~'
-
The likelihood ratio, while an improvement"6~,er the match-binning
frequency, is still one step removed from what the j u d g e or j u r y truly
seeks--an estimate o f the probability P(S ] X) that the crime sample is the
suspect's D N A given the observed fragment lengths X i n D N A extracted '
from the samples.
Recognizing this, a number o f statisticians have
argued that the likelihood ratio should n o t be presented to the jury in its
own right, 2as but should b e used t o estimate the probability t h a t the
population from which it is drawn.
256. The possibility of misinterpretation is present in the use of the phrase ~identity
index" that Devlin et al., supra note 254, at 341/=propose for the likelibood ratio in this
context. It also is present with a proposal for ~a verbal convention, ~vhichmaps from
ranges of the likelihood ratio to selected phrases" like "strong evidence~ or ~weak
evident." Ian W. Evett, Comment, 6 STAT. SOl. 200, 201 (1991).~ Cf. David H.
Kaye, The Probability of an Ultir~ate Issue: The Strange Cases of Patem~ Testing, 75
IOWA L. REV. 75, 99-100 (1989) (criticizing the comparable convention of "verbal
predicates" used in paternity testing).
257. If anything, one might argue that inasmuch as L includes all the information in P
and more besides, it should be required, and P should be excluded. Some jurors,
however, may find the less statistically sophisticated P a more comprehensible figure.
As long as both quantities are relevant and not unduly prejudicial, it should be left to the
proponent of the evidence to decide whether to introduce P, L, or both.
258. See. e.g., Evett, supra note 256, at 201 ("[J]ust leaving a court with a likelihood
•
suspect is the source o f the crime sample.~9: And a.few e..~ . . . . . . . .
.. . . . . .
b e e n w i l l i n g t o speak to this probability.:~ I n S m i t h v.: Deppish, ~z t h e
state's " D N A experts informed the jury t h a t . . : , there was m o r e t h a n a . . . .
99 percent probability that Smith w a s a c0ntributor Of the semen f o u n d
on the swab." Likewise, in State v. Thomas, ~ a geneticist testified that:
"the likelihood that the D N A found in Marion's panties came from the
defendant was higher than 99.99%."
.~
Before accepting such pronouncements as a d n ~ i b l e - - a s these courts
apparently have--one should ask how these probabilities are obtained and
whether they are appropriately placed before a jury.
/~lthough the
opinions are silent on these matters, only one mathematically valid
procedure is known for arriving at a probability that a defendant is the
source. From the D N A testing in question a n d data on the distribution
o f the V N T R fragments in the reference population, we can estimate the
probability P(X [ S) o f the measurements X given the hypothesis~'S that
the suspect is the s o u r c e o f the DNA.
Likewise, we can estimate the . . . .
probability P ( X [ O) under the alternative hypothesis O that the crime
sample D N A comes from another source. For concreteness, suppose, as
before, that these probabilities are 9/10 and 1/100,000, respectively.
They are conditional probabilities in that they pertain to an outcome (X)
on the condition that one or another hypothesis (S o r O ) is true. Butthe
conditioning runs in the wrong direction.
We seek
P(SIX),
the • ~;:~
probability that the defendant is the source o f the crime sample given the
data X, or P(O [ X), the probability that someone else is the s o u r ~ given
this same information.
We already h a v e s e e n t h a t P(O [ X) is not
1/100,000--to switch the letters around like this is to commit the
inversion fallacy. 262 To invert P(X [ S) or P(X ] O) correctly takes more
ratio does not seem enough."); cf. Stephen E. Fienberg, Conmlent, The Increasing
Sophistication of Statistical Assessments as Evidence in Discrimination Litigation, 77 J.
AM. SIAT. ASS'N784 (1982) (criticizing presentation of a relative likelihood function).
259. See e.g., Berry, supra note 254. But see Donald A. Berry, Rejoinder, 6 STAT.
SCL 202, 203-04 (1991). The NRC panel pretermitted all proposals involving likelihood
ratios or posterior probabilities on the curious ground that Uno forensic laboratory in this
country has, to our knowledge, used Bayesian methods to interpret the implications~of
DNA matches in criminal cases." NRC REPORT,supra note 15, at 62. Under this
reasoning, the panel should not have urged external blind proficiency of laboratories by a
federal committee as a prerequisite to admissibility and should not have proposed the
ceiling method of computing match-binning probabilities.
260. 807 P.2d 144, 148 (Kan. 1991).
261. 830 S.W.2d 546, 550 (Mo. Ct. App. 1992).
262. See supra Part III(B)(2).
"'
/
I~II
~
No. I]
,. -. . . . . ', 'DNA Evidence
work: The correct answer, however, is well-kn0wn: 2~
Odds(S I X):=
L Odds(S)-
.
(3).
In words, the posterior odds (considering the fragment lengths X) thatthe
defendant is the source are just the likelihood ratio timesthe priorodds : ":
(those formed without knowing this information).TM In 0urillnstrationL - .. :
= (9/10)/(1/100,000) = 90,000. Starting from the ' (dubious) p~mise
that the presumption of innocence should t~e=interpreted to mean that the
-.
defendant has the same chance as anyone else in ~theUnited States of
being the source of the crime Sample,2~ i t follows that the DNA evidence
raises the odds of S to 90,000/300,000,000.= 3/10,000. Alternatively,
starting with prior odds of one, the DNA evidence prompts the conclusion
that the posterior odds are 90,000 to one.
Expressions like (3) have arich history in statistics andlaw. Known
as Bayes's rule because of their ancestry,2~ they have been the subject of
a protracted debate among academically inclined lawyers and statisticians. 267 In courtroom practice, three procedures have been u s ~ . In the
expert-prior-odds implementation, the scientist implicitly Or explicitly
selects a prior probability for the jurors, applies Bayes,s rule, and
informs the jury that the scientific evidence establishes a single probability ~
for the event in question. The prosecution relied on a Bayesian analysis
of this type i n State v. K l i n d t , 2~ a gruesome chainsaw murder case
decided before the emergence ~fDNA testing, and the Supreme Court of
Iowa affirmed the admission of'~ statistician'S testimony as to aposterior
263. See, e.g., MICHAEL O. FINKELSFEIN & BRUCE LEVIN, STATISTICS FOR LAWYERS
93 (1990).
264. The odds in favor of an event are the probability that it will occur divided by the
probability that it will not occur. Hence, Odds(S) = P(S)/P(O) and Odds(S I X) =
P(S [ X)/P(O [ X). For instance, if the probability of an event is II4, then the odds are
(I/4)/(I - I/4) = I/3, or 1:3.
265. This interpretation of the presumption of innocence is found in John Kaplan,
Decision Theory and the Factfinding Process, 20 S'rAN. L. REV. 1065 (1968). If the
population of the United States is 300,000,000, the prior probability is 1/300,000,000, and
the prior odds are 1/299,999,999 = 1/300,000,000.
266. They date back to a paper appearing in 1763 and attributed to the late Reverend
Thomas Bayes.
267. See generally Symposium, 13 CARDOZOL. REV. Nos. 2-3 (1991); David H. Kaye,
Introduction: What is Bayesianism?, in PROBABILITYAND INFERENCEIN THE LAW OF
EVIDENCE: THE LIMITS AND USES OF BAYESIANISM 1 (P. Tillers & E.C. Green eds.,
1988), reprinted as What is Bayesianism? A Guide for the Perplexed, 28 JURIMETRICSJ.
161 (1988).
268. 389 N.W.2d 670 (Iowa 1986).
~:166
probability in
River was wl~
,J however, that
F o r y e a r s , col
routinely admi
ad hoc and o:
half. u° These
the ,probability of paternity" laid before them. hut courts Unmistakably;. . . . . "
apprised of the foundations o f thes: probabilities: have ~n~uNi~t0}i:i
i;:i; i i:;: ~
approve ofthem, r~ Nevertheless, the expert-prior-odds!~appmach~is
~'
clearly ill-advised. It does not permit or assist ~ e juryin !ntegrat~g,~the--•:
scientific proof with the other eviden~ in the Case: (Inst~l, it ~
~:
the j u ~ ~o defe~~to the expert's choice of the prior odds, even though the
~:
scientist'sspecial knowledge and skill merely extend to the production of
the like~hood ratio for the scientific evidence.
,,
A second approach--the jury-prior-odds implementation--overcomes
this defect, it requires the jury to articulate prior odds, to use them as
prescribed by (3). and to return a verdict of guilty if the posterior odds"
exceed some threshold that expresses ~ e point at which the reasonable
doubt standard is satisfied. But this precedure raises serious questions
about the jury's ability to translate beliefs into num~.,,,~ ~ and aboutth e
desirability of ,;4u~ntifyingthe vague concept of ~n~!ei=doubt.272}!=It,
too, is far from optimal.
,,: ~
:
269. This practice first was criticized ~.,Ira Ellman & David Kaye, Probabilities and
Proof.. Can HLA and ~lood Group Testing Prove Paternity?, 54 N.Y..U.L. REV. 1131
(1979) .
.
.
.
:' " ,~:?=:.,
}: '~
270. A few have Imposed restrictions on the practice. ,gee, e.g.,1Coinn~nwealth v.
Beansoldl, 490 N.E:2d 788 (Mass. 1986) (croiticized in Kaye~"supra 'note 256. In
Plemei v. WaiSt, 735 P.2d 1209 (Or. 1987), the Oregon Supreme Cotu~ i~jec~ed the
expert-prior-odds implemenm6zn in favorof the variable-prior-odds Bayesian procedure
discussed, supra text accompanying note 263. See David H. Kaye, Plemel as a tVimer
dn Proving Paternity, 24 WILLAMEFrE L. J.~ 867 (1988). Son~, rape cases in which the
prosecii-fion relies on a "probability of paternity" using undisclosed prior odds ~ r O ~ ~
have generated appellate opinions sritical of that probability, b'ee, e.g., State. V.
'~ ' :
Hartman, 426 N.W.2d 320 (Wis. ,198g). However, the opinions are not wel~ reasohed.
.i~;
See Kaye, supra 256.
":
271. See Tribe. supra n6te 232; David H. Kaye, Comment~ l.;ncertain~T in DNA
:,
Profile Evidence, 6 STAT. SCI. !~,, Lo~ (1991).
272. See Charies R. Nesson, P~asonable Doubt and Permissive Inferences: 2~e Vc~lue
o f Complexity, 92 HARV.<:L. REV. 1187 (1979); Tribe, supra now 232. Compare
generally Charles R. Nes~n, The EvMence or the Event? On Judicial' Proof and t~e
:, ::
:
Acceptability o f Verdicts, 98 HARV.~ L. REV. 1357 (1985)i= w/th Dauiel Shapiro,
StatisticaloProbability Evidence and. the Appe-~r~mce o f Justice, 103 HARV. L.: REV. 530
~4989).
F~
o&
tio:
or
the
displaying the force of the evidence across a w [ d e raxtge of prior
probabilities. No juror need adopt Bayes's rule Or any prior probability) ! :i.
but all jurors can seethe distinction between P(X [ O)~ the probability of ://i!,ii
the evidence under the hypothesis that someone Other t h i n { t h e : d e f e r
" :/:
is the source, and P(S [ X),. the' probability that the:defendant .is
source given the evidence.
., ~
probability of innocence, and from misinterpreting the likelihood ratio L
273. See Ellman & Kaye, supra note 269; Kaye supra note 27l.
274. Michael O. Finkelstein & William. B. Fairley, A Bayes&n Approach to
Idemifio,fion Evidence, 83 I-tARv. L : REV. 489 (1970). F o r L = 90,000, the posterior
probability approaches one for all but invisible values of P(S). For example, the prior
probability would have to be about I/I00,000 o r less to keep the posterior to lessthan
one-half. On the other hand, for smal!er tikelihood ratios the graph responds t o p ( s )
over a broader range. Consider t.b~
match-binning frequency of 1/17 recomputed in.Yee
f,J
according to the ceiling method. See supra note 148 and 151, If this frequency were
used to form the likelihood ratio L = 17, as de~rihed supra note 253 the graph would
look like this:
Figure 2
Po=terlorv= I~'1~ t~dool~l/ f e e L=17
!
S ~
O.7
-
~
i1 •
~'0.40.3 I
O.2
f~
o
o.!
.,~
--.
Thus, the:variable-prior-odds implementation of Bayes.'s rule should
be at l=~st permissible, z75 It has thepotential of preventing the judge and
jury from m i s ~ n s t n ~ g the match-binning probability POVIDI O)ibS the
probability of innocence, from mistaking the probability P(X:I O) for the
o.)
i~
'!
:::~
o.2
o.3
o.o,
o...q
o.a
o,7
o.19
o.91
i
275. The main drawback of the variable-prior-odds implementation is that4t d o e s n o t
necessarily incorporate the risk of labo~mry error. ~
is not wrong, as long as one
make it clear what likelihood is being .~stimated. The concern is that the emphasis on
the numbers that are available may lead the jury to overlook this consideration. As with
the likelihood r~6o or other probab~ties,, however, the most reasonable, resl~'~se is to
insist that no D N A results be admitted without information on the rate of false,, :3sitives
as determined by external proficiency testing.
.,
-:~
'"
:ii
:: :
- ':
.
,
,
1
68 .
.
.
.
.
'
H~yvard dournal.ofLa~='&:.Technology:~ .,': ~: [V.,ol.,~7:
:
..
"
.:
.
:
' """:"
,
-
.
~' . * "
~i
ii ~
~ ":2
.,
:L2.
~.
.~ ~-~:
asthe poster
.~ i~
swamp any I
no great moment, but when.:the most .conservative procedures, for
computing probabilities are usedto generate undulymodest values for L,
the need for the jury to see how strongly.e~,*enthese underestimates affect
"
a reasonably ascertained prior probabi!ity is greatest.2~6 Theexpert . . . .
should not be precluded from presenting a mathematically valid and
possibly revealing explanation of the significance of the fragment lengths. . . . . .
C
O
N
C
L
U
S
I
O
N
"
-
-
Analysxs of DNA samples can produce revealing evidence of identity,
but the search for a procedure ".~
- ~ ,convey--inteUigibly, acatrately and
fairly--the probative value of such evidence has proved challenging.. The
dommant method for assessing "~he evxdential slgmficance of DNA
evidence in this country entails a declaration that~,wo samples either do
or do not match, followed, in the case of a m~tch, b y an estimateof the
matching genotype frequency in some reference population. Some of the
controversy that surrounds this methodolegy is specious. For example,
with regard to the matching phase, once a match is declared, most
arguments about the overbreadth of a match window are misleading.
Other arguments are less easily resolved. Of these, the most
prominent and effective inrecent litigation is the concern that populations
could be structured in ways that seriously vitiate the population frequencyestimates. This scientific issue warrants more refined and complete
judicial scrutiny than it has received. The question is not whether there
is absolutely no structuring. TM It is not whether there are absolutely no
departures from genetic equilibria. TM It is whether the structuring and the
deviations that it induces have an appreciable impact on "qNTR genotype
frequencies in the relevant population.
There is very little evidence, and certainly no scientific .¢0nsensus, that
the impact is substantial in any known population. But neither are
popul;~3ion geneticists and Statisticians unanimous in dismissing the
concern. Where the reference population is a broad and probably
°
.
~./
.
o
.
•
276. S e e s u p r a note 274.
',~
277. Likewise, in F r y e jurisdictions, the question is not whether the s~ientific
community agreqs that there is absolutelyno structuring.
278. In jurisdictions that have adopted'the F r y e standard, ti~ question is ;-~t whether
scientists agree that there are absolutely no such departures.
Fall,.1993]
,?,
D N A Evidence
i...
.. ,
"
169
structured ethnic or racial population, as ~itis
cases, the population structure 0bjeeti0n amour
basic procedure adds and then multiplies when
add allele frequencies. A striking series of studies show that the resulting. : ~":''; i., .~.~
differences in genotype frequencies rarely are dramatic and o f t e n l f a v o r " . .:i.:..,,
defendants. Where the reference population, isitselfla subpopulation,
..:7.:.':
however, requiring resort to exh-eme overestimation procedures,, such as
:i ".~.
the one called for by a committee o f the National Resf,arch Council,: is
more defensible. Still, to the extent that itis feasible to produce a series
of estimates that show not only the best estimate for the subpopulation, but also how much that estimate could be in error, this solution may not : .
be needed.
Beyond the debate over population structure is a~ issue that has lead . . . . ii,
one jurisdiction to eschew probability, estimates altogether. Even when
a suitable reference population frequency can be computed, i t m a y be
misinterpreted. To avoid prejudice, however, it suffices to bar' the-..
p~:oponent of the evidence from miseharacterizing t h e match-binning.
frequency as the probability of a false positive or the probability o f
>
innocence and to apprise the jury of the probability o f a false positive.
In sum, given the current state of scientific knowledge, match-blnning >:
frequencies should be admissible, atieast in general population cases, and
perhaps in most subpopulation cases as well.
This is not to say, however, that such frequencies are the best way to
express the evidential value of DNA testing. To the contrary, the match
vs. no match decision producese:a false dichotomy. There is little
difference between samples that almost match and samples that do not
quite match. The likelihood ratio, which states how much more probable
it is to find the observed degree of similarity when the defendant is the
source than when someone else is, overcomes this dif-fieulty. This
quantity should be admissible, either in lieu of or in addition to, a matchbinning frequency.
Finally, testimony of the probability that the defendant {or someone
else) is the source of the DNA in the crime sample should be admissible
if the calculations conform to Eayes's rule and if they do not rest on a
prior probability that the expert, rather than the jury, has produced. A
Bayesian presentation should involve variable prior odds so that jurors
can consider the other evidence in the ease and are not compelled to
accept an expert's prior probability or to force their own beliefs into a
mathematical mold.
170
Harvard Journal of Law & Technology.
',. [Vol.:~7
~' : i:i:'~i!i:-~
These conclusions rest on a particularphilosophy cOncerning the role' " ! .:i:!.
of expert witness and jury. T h e taskof:the expert is to assist the jury in .:. i.:->
ill
evaluating evidence that it would be hard-pressed tolimderstand fullyon, ~ .:. '"'-': :
its own. The task of the j u r y i s to decide what the evidence~.proves;i: !. _,. '
With DNA evidence, the expertise of the laboratory technician or'sci~tist
is needed to explain what lies behind.the pattern o f d a r k bands.0n a n . i
i-::!
autoradiogram. The expertise of the statistician is nceded toexplaifihow
probable the patterns are .under various.hypotheses',~and how.these
.
probabilities affect the plausibility of these hypotheses...Unless more
fi:ferential errors are likely to arise with the expert testimony than without
it, the rules of evidence should not bar experts from providing all or
~ m e of this information to a jury. And, where the range of uncertainty
can be described, the law should not force an expert to present this
inforrr,
ztion in a manner that always favors one party over another. "~.-,.~
/;
Because it is far from obvious which method of presentation--match,:....
binning with basic bins, match-binning with overestimation, likelihood
ratio, or Bayesian--will prove most helpful to all jurors, the courts should
permit the litigants to advance the combination of reasonably computed
statistics or probabilities that they deem most suitable. A healthy
pluralism is preferable to a rigid catechism.
/ ,Z
Fall, 1993]
DNA Evidence ~, :~-~ i
APPENDIX: THE INTERIMCEILING
......
171 II:'!:'::I'i:~; ;
PROPOSe::/
" ':i ::
As noted in b e body of the article, the NRCcommittce,$,quest:for:a :::: .... :
suitably conservative procedure does not stop at the ~ S i f i 0 n Of the 5 %
':
lower upper bound. Until subgroup studies are complete, the p~el calls
for still higher ceilings. It recommends that each "allele',frequencybe
taken to be the higher of either 10 % or the "upper 95 % confidence Unfit~
of the- frequency seen in the major "race" with the largest frequency,. . . . .
• :'~heupper end o f a confidence interval. The proposal to use these
vaulted ceilings from racial databases while awaiting the~t~sults of
subpopulat/on studies does not flow inexora.~y from any :generally
accepted scientific or statistical theory. Indeed,"fhe upper bound of the
confidence:interval is presented, candidly, as "a pragmatic approach to
recognize uncertainties in current population sampling."279 The enumerated "uncertainties" are the sampl~g method--"the ~'xrrent '~'mvemenee
sample' mauner'2S°--and "sampli~,g error. ~2sl
" r ~ *" v
'
:..~,-.~'~.
~
°
As a response to these c o n c e r ~ i n g the upper end of a confidence
interval is most peculiar. To see why, onemust understand:just what a
confidence interval is. Contrary to the view expressed by some courts, ~
confidence intervals do not account for any and all errors in estimation.
Confidence intervals are one way to address one kind of error. They
express the likely range of sampling error in a probability sample-,one
in which every item sampled has a known probability of being selected.
When, and only when, the probability structure of the sample:is known
can a 95 % confidence interval be said to be ~be result of a procedure that,
under repeated sampling, would generate intervals that capture the true
value about 95 % of the time.
When convenience samples are collected, the laws of probability
caunot reveal how the statistics from the samples will.~E~,e. The
variability and bias arising from convenience sampling are, qu!~. ~;2nply,
v\ !!
not addressed by confide,nee intervals, which are directed to the variabili•
.
~t~i
t y m unbiased, rand~m.zampling. Computing a confidence interval for
a non-probability sample may be a "pragmatic" response to the concern
NRC REPORT,supra note 15, at 92 (emphasis added).
Id. at 92.
See supra text accompanying note 147.
See Brock v. Merrill Dow Pharmaceuticals, 874 F.2d 3if/, 312 (5th Cir. 1989),
cert. denied, 494 U.S. 1046(1990). Contra Michael D. Green, Expert Witnesses and
Sufficiency of Evidence in Toxic Substances Litigation: The Legacy of Agent Orange and
Bendectin Litigation, 86 NW. U. L. REV. 643, 666-68 (1992).
279.
280.
281.
282.
172
over the samplit
is a pragmatic response tOt he concern aboUt pop~ation structure. T h e
committee embraces the former, while it shuns the latter.
.T h e second justification for the upper end of the: 95 % confidence
interval on each allele frequency fares no better.i ~A confidence interval
is a reasonable response to a concern about:sample size, but the
committee's proposed treatment of these intervals remaln.q peculiar. I n
a small database, very rare alleles are likely to be underrepresent~, and
the estimate of any allele frequency is inherently uncertain in the sense
that another sample could produce somewhat different estimates. With
small databases, then, it might be appropriate to pick an a priori lower
bound for the rarest alleles. 2~ But this could notjustify using only the
upper end of a confidence interval, especially in broad racial and ethnic
databases that are reaching appreciable sizes. Instead, the obvious way
to cope with sampling error is to present an interval estimate of the
match-binning frequency P computed after the best ~Yailable estimates 0f
each allele frequency are applied in (1) and (2). TM
The 10% floor on the ceilings. The substitution of 10% for the 5%
lower upper bound while awaiting direct studies of population structiire
"is designed to address a remaining concern that populat[pns might be
substructured in unknown ways with unknown effect and the concern that
the suspect might belong to a population not represented by existing
databanks or a subpopulation within a heterogeneous group."~s But any
concern about the suspect's racial and ethnic identity is misplaced. It
bears repeating that the pertinent reference population is not the defendant, but all people, of whatever race and ethnicity, who plausibly might
be suspected of leaving the trace evidence.~ And, the choice of 10% is
no more scientific than the earlier choice of 5%. Both rest on an
unarticulated balancing of co~,ape'~ing policies.
283. See Chakraborty et. al, supra note 90.
284. See/d.; Weir, supra note 45 (emphasizing the fact that a "confidence limit of a
p r o d u c t . . , is not the product of the confidence limits").
285. NRC REPORT, supra note 15, at 92.
286. See supra text accompanying note 155; NRC REPORT, supra note 15, at 85.
-..,