Projectability, Disagreement, and Consensus
A Challenge to Clinical Equipoise
Mark Fedyk and Michel Shamy
Theoretical & Applied Ethics, Volume 3, Number 1, Summer 2014, pp. 17–34 (Article)
Special Topic: Ethics and Clinical Equipoise
Published by University of Nebraska Press
DOI: 10.1353/tha.2014.0004
http://muse.jhu.edu/journals/tha/summary/v003/3.1.fedyk.html
Clinical equipoise links ethically appropriate medical research with
medical research that has the reasonable chance of resolving debates.
We argue against this principle on the ground that most debates in medicine cannot be resolved by the outcomes of any particular (even extremely rigorous) empirical study. In fact, a deep understanding of the
methodology of scientific research leads to the conclusion that adopting
clinical equipoise as an ethical standard for medical research would deprive medical researchers of the ability to confirm clinical hypotheses.
Introduction
Medical research in the form of clinical trials cannot be accidental, in the sense that the outcomes of medical research must be reasonably anticipated and the methods of any new research protocol must be informed by methods that have been used in the past. Specifically, clinical
research requires explicit evidence that, prior to running a proposed experiment, the experiment meets various orthodox ethical requirements.
Importantly, this ethical evidence is not independent of equally explicit
scientific evidence that the research is likely to generate knowledge, and
this is not because it is unethical to do pointless or uninformed medical research. Rather, as a general rule, the methodology of medical experiments
cannot be carved off from ethical evaluation of the very same experiments,
because ethical evaluation and methodological design deeply and inextricably inform each other. This interdependence is reflected in the practice
of mandatory clinical research review, during which experimental protocols undergo both rigorous methodological assessment and ethical evaluation, where the ethical appropriateness of a trial and the methodological
quality of the trial will be concurrently and reflexively determined.
So, there are two ways in which clinical research cannot be accidental:
it can be neither a scientific nor an ethical accident. This is one reason why
Freedman’s (1987) principle of clinical equipoise is so interesting. According to this principle, part of the evidence needed to certify
that a proposed study is ethical is a particular kind of scientific uncertainty
that, if it exists, may effectively prevent researchers from knowing that
medical research is not methodologically accidental.
What’s more, in order for a proposed study to be ethical, it must also
be likely that this antecedent (ex ante) uncertainty will be ameliorated by
the results of the clinical study. Freedman says that clinical equipoise exists
when there is “no consensus within the expert clinical community about
the comparative merits of the alternatives to be tested.” He continues:
We may state the formal conditions under which such a trial would
be ethical as follows: at the start of the trial, there must be a state of
clinical equipoise regarding the merits of the regimens to be tested,
and the trial must be designed in such a way as to make it reasonable to expect that, if it is successfully concluded, clinical equipoise
will be disturbed. In other words, the results of a successful clinical
trial should be convincing enough to resolve the dispute among
clinicians. (1987, p. 144)
There can be, prior to the trial, no scientific consensus about a hypothesis tested by the trial; yet the trial must also be very likely to generate
consensus-producing knowledge.
To philosophers of science, Freedman’s principle seems to say that it
is only ethical to conduct crucial experiments (Lakatos, 1974). Crucial experiments are those that have the potential to produce evidence strong
enough to resolve scientific disputes about empirical questions. Yet nearly
all accounts of scientific confirmation developed since the demise of logical positivism predict that crucial experiments are exceedingly rare (cf.
Crupi, 2013). So there is a prima facie worry—from the philosophy of
science—that there is tension between Freedman’s principle of clinical
equipoise and scientific methodology. Freedman’s principle may inadvertently set the ethically permitted scientific outcomes of clinical research
impossibly high.
This article argues that this problem is real. Our argument can be summarized in the following way. The antecedent uncertainty that Freedman
refers to will not typically exist for clinical studies that are also not methodologically accidental. Furthermore, the knowledge that ensures that
scientific research is not methodologically accidental is also usually the
grounds for scientific disagreements that cannot be resolved by particular
one-off scientific experiments, no matter how rigorously constructed such
experiments may be. Thus any ethical norm that links ethically appropriate medical research with medical research that has the reasonable chance
of resolving debates will not be practically compatible with scientific progress in the relevant field of research.
In more detail, we argue that it is not possible to carry out scientific
research—in medicine or elsewhere—without relying upon what philosophers of science call projectability judgments. Relying on projectability
judgments, however, leads to a fundamentally different kind of disagreement in medical research than the disagreement that Freedman calls
clinical equipoise. These “hard” disagreements will look, prior to an experiment, like there is “no consensus within the expert clinical community
about the comparative merits of the alternatives to be tested.” However,
what makes these disagreements “hard” is that they are not the kind of disagreements that can be resolved, as is mandated by Freedman’s principle,
by even the most rigorous medical experiments. Importantly, however, the
knowledge that generates these hard disagreements is also the knowledge
that ensures that medical research is scientifically rigorous—that is, is not
methodologically accidental. We conclude that most disagreements in
medical research are hard disagreements, that such disagreements are an
inevitable by-product of the projectability judgments created by increasingly reliable and increasingly useful scientific inquiry in medical research,
and that these two points are in practice (if not in theory) well understood
by medical researchers. Thus our objection to clinical equipoise is that it
makes it unethical to conduct research into topics about which there are
hard disagreements.
Cases where plausible ethical principles are inconsistent with rational
scientific norms are important areas of study for clinical ethics. Work on
such cases can lead to a deeper understanding of medical research, and
this can help clinical ethicists produce deeper and more penetrating analyses, principles, and ethical interventions. But it can also lead to insights
that help reform research methodology. Illustrating the role of projectability judgments in theory confirmation allows us to address the question of
whether the principle of clinical equipoise is compatible with the norms
of scientific research, and also provides a concrete example of one way that
the philosophy of science can contribute to clinical ethics.
Projectability and Confirmation
So, what is a projectability judgment? The concept finds its home in contemporary naturalistic philosophy of science, where it stands for all of the
various and different ways in which the decisions of scientific researchers
are influenced by the background scientific knowledge in their particular
discipline (Boyd 1991, 1992). Intuitively, these are decisions in the present
that reflect expert insight into what worked well in the past. For example,
decisions about which tests of statistical significance to use are projectability judgments; decisions about how to measure outcomes in multiple sclerosis are projectability judgments; decisions about trialing a new stroke
preventative against an accepted therapy (like aspirin) are projectability
judgments. The concept of projectability judgments was introduced by
Nelson Goodman (1983), who used “projectability” to refer to a property that certain terms or concepts had, just in case they had a substantial history of being useful for framing reliable inductive or explanatory
hypotheses. In the philosophy of science, then, projectability judgments
are a key part of the explanation of how different scientific fields can produce increasingly accurate representations of the parts and processes of
the world that they study. It is by relying on projectability judgments that
the approximations of past theories can be incrementally improved upon
through successive generations of scientific research. For our purposes,
then, a good-enough definition of “projectability judgment” is the judgment that a given concept, theory, hypothesis, or assumption is plausible
on the basis of previously accepted scientific research.
Considerations of projectability play an important role in explaining
how scientific theories are confirmed by experimentation or observation.
We begin with two truisms: that there are no deductively valid inferences from observational evidence to theoretical conclusions, and that
nearly all of the most scientifically useful concepts are not definable in observational terms. Given these two facts, how do scientists confirm theories?
Popper famously concluded that no confirmation is possible. If we
start with the assumption that we should only regard as confirmed those
theories that we are completely certain are true, then the two elementary
truisms above seem to reveal that we can never have complete certainty
of the truth of a scientific theory. But an understanding of the role that
projectability judgments play in science provides an elegant way of modeling scientific confirmation that, importantly, implies that while evidence
licensing absolute certainty in the truth of any particular scientific theory
may not be attainable, there can nevertheless be evidence that rationalizes
increasingly strong degrees of confidence in the truth of a scientific theory.
As we will illustrate below, this model conforms very well to methods of
confirmation in medical research. Intuitively, it says that a clinical theory is
confirmed if it is the last of a series of projectable theories standing at some
time. For this reason, it implies that confirmation is provisional and partly
contextual, since a confirmed hypothesis can be rendered unconfirmed by
the emergence of a projectable alternative that fits with or explains the same
set of observations as the first hypothesis. The model also implies that a degree of preliminary confirmation can accrue to a theory even prior to its
experimental testing, such as when the theory is the only projectable theory
about the relevant subject matter in the field at a given time.
Here, then, is a sketch of the reasoning behind the model. We begin by
adding a third truism: it is not possible to deduce predictions from a theory without committing oneself to a host of auxiliary hypotheses. To test
a prediction from even a very simple equivalence, like CPP = MAP − ICP (cerebral perfusion pressure equals mean arterial pressure minus intracranial pressure), we need to make any number of additional assumptions:
platitudinous metaphysical assumptions (there is a material world, brains
exist, there are no hidden arterial channels), routine theological assumptions (God, if she exists, is not fussing with our measuring instruments),
and practical assumptions (most of the pressure measurements will not be
thrown off by something weird that the patient ate yesterday). It is important to note, moreover, that the choice of auxiliary hypotheses is neither
random nor arbitrary. Many auxiliary hypotheses amount to scientific or
philosophical common sense, and these assumptions are very nearly ubiquitous in all scientific reasoning. But in mature sciences, like most fields
of medical research, it is often the case that the most significant auxiliary
hypotheses that are used to frame tests of predictions from a theory are
clusters of previously accepted scientific theories, many of which are proprietary to the relevant field. Thus, one of the ways in which projectability
judgments influence scientific research is by determining which scientific
theories are assumed as auxiliary hypotheses in service of testing some
predictions of a novel, or at least newer, scientific theory.
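The role that auxiliary hypotheses play in testing even the simple equivalence CPP = MAP − ICP can be put into a short sketch. The function names, readings, and plausibility ranges below are hypothetical; the point is only that the prediction becomes testable once the auxiliaries are granted.

```python
# A minimal sketch (hypothetical values and thresholds): testing the
# equivalence CPP = MAP - ICP commits us to auxiliary hypotheses,
# represented here as explicit preconditions on the measurements.

def predicted_cpp(map_mmhg: float, icp_mmhg: float) -> float:
    """The theory's prediction: cerebral perfusion pressure equals
    mean arterial pressure minus intracranial pressure."""
    return map_mmhg - icp_mmhg

def auxiliary_hypotheses_hold(map_mmhg: float, icp_mmhg: float) -> bool:
    """Stand-ins for the practical auxiliary assumptions described in
    the text: the instruments are working and the readings are
    physiologically plausible (the ranges are illustrative only)."""
    return 0 < map_mmhg < 250 and 0 < icp_mmhg < 100

# One hypothetical observation.
map_reading, icp_reading, observed_cpp = 90.0, 10.0, 80.0

# The prediction can be checked against observation only once the
# auxiliary hypotheses are granted.
if auxiliary_hypotheses_hold(map_reading, icp_reading):
    assert predicted_cpp(map_reading, icp_reading) == observed_cpp
```

If any auxiliary fails, the comparison between prediction and observation is never even made, which is the sense in which a test of the equivalence silently commits the tester to the auxiliaries.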
Successful predictions are, all things being equal, evidence that a theory is true. But it is also important to note that quite a few of the most well-confirmed scientific theories appear to be confirmed not because they are
manifestly good at making significant predictions but because they offer
deep and compelling explanations of observed phenomena. For example,
in The Origin of Species Darwin makes almost no nontrivial predictions. Instead, he constructs a theory that does a better job of explaining a number
of observations—like the webbed feet of the upland goose, a species that
never goes anywhere near water—than all of the most scientifically plausible (i.e., projectable) theories of speciation of the mid-nineteenth century.
In so doing, Darwin relied upon any number of projectability judgments
to determine which competing theories at the time were the best. But the
deeper point here is to see that a theory can be confirmed if it provides the
best explanation of a set of observations, and constructing the evidence
that a particular theory offers a better explanation than alternative scientific theories requires relying upon projectability judgments to, among other
things, determine which scientific theories are the relevant alternatives, and also to organize the collection of observations and to analyze the explanatory fit between the competing theories and these observations.
Prediction and explanation both lead to confirmation, and there can be
no prediction or explanation without the use of projectability judgments.
So, one last truism from the philosophy of science: data or observation
under-determines theory. This is a loose way of saying that, for some set
of scientific observations, there will usually be more than one theory that
is consistent with these observations. Sometimes under-determination is
presented as a logical objection to the view that we can have evidence that
a scientific theory is confirmed. However, under-determination is also a
practical problem facing working scientific researchers. One side of the
problem is that it is simply too easy to generate new scientific theories, and
it is practically impossible to test all of the novel theories that research22
Theoretical & Applied Ethics 3:1
ers can create—there is not enough time, money, energy, or lab personnel. A filter is needed between the creative activities of theory generation
and the practical activities of experimentation and testing. Projectability judgments provide this filter. Only those theories that are reasonably
plausible by the light of previously accepted scientific research are able to
make the transition from being a novel hypothesis to a hypothesis subject
to a degree of empirical scrutiny or validation. The other side of underdetermination is that, once the range of projectable alternative scientific
hypotheses has been established, if each of these scientific theories does
no better than any other in terms of prediction or explanation, then none
of these theories is confirmed. More data are required, and confirmation
will accrue only subsequently to whichever theory turns out to be either
not inconsistent with, or able to adequately explain, any new findings.
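The filtering role described above can be sketched as follows. The hypotheses, plausibility scores, and threshold are hypothetical; they merely illustrate how projectability judgments gate which generated theories proceed to empirical testing.

```python
# Hypothetical sketch: projectability judgments as the filter between
# theory generation and empirical testing. Each candidate hypothesis
# carries a plausibility score in the light of previously accepted
# research; only sufficiently projectable candidates go forward.

def projectable(candidates: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return the hypotheses plausible enough, by the lights of past
    research, to be worth the cost of empirical testing."""
    return [name for name, plausibility in candidates.items()
            if plausibility >= threshold]

# Illustrative scores (not measured quantities).
generated = {
    "autoimmune demyelination causes MS": 0.9,  # well grounded in past research
    "venous insufficiency causes MS": 0.2,      # little prior support
}

to_test = projectable(generated)  # only the first hypothesis passes
```

With these illustrative numbers only the first hypothesis passes the filter, anticipating the contrast drawn later between the Fingolimod trial and Zamboni's study.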
Scientists must rely on projectability judgments, then, to test predictions, assess explanations, and determine the degree of confirmation a set
of observations confers upon a particular theory. We can now summarize
the various insights and truisms that lead to this conclusion by formulating a simple model of confirmation:
Some theory T is confirmed by observations O at time t if and only if, at t, either:

(1) all of the projectable alternative scientific theories to T make predictions about O that are less successful than the predictions T makes about O;

(2) T is a better explanation of O than any of T's projectable alternatives; or

(3) the predictive and explanatory abilities of T are, in some combination, better than those of all of its projectable alternatives;

and where the data in O were collected with controls and protocols that do not presuppose either the truth of T or the truth of any of the projectable relevant alternatives to T at t.
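The model just stated can also be rendered as a small decision procedure. This is a sketch under obvious simplifications, not a definitive formalization: predictive success and explanatory fit are collapsed into single hypothetical scores, and "better in some combination" is read as a better summed score.

```python
# Hypothetical rendering of the confirmation model: T is confirmed at t
# iff it beats every projectable alternative on prediction, on
# explanation, or on some combination of the two.

from dataclasses import dataclass

@dataclass
class Theory:
    name: str
    predictive_success: float  # how well it predicts the observations O
    explanatory_fit: float     # how well it explains the observations O

def confirmed(t: Theory, projectable_alternatives: list[Theory]) -> bool:
    """The data in O are assumed to have been collected without
    presupposing the truth of T or of any alternative."""
    alts = projectable_alternatives
    better_prediction = all(t.predictive_success > a.predictive_success for a in alts)
    better_explanation = all(t.explanatory_fit > a.explanatory_fit for a in alts)
    better_combined = all(
        t.predictive_success + t.explanatory_fit
        > a.predictive_success + a.explanatory_fit
        for a in alts
    )
    return better_prediction or better_explanation or better_combined
```

Note that with no projectable alternatives the procedure returns True, mirroring the earlier observation that a theory can accrue preliminary confirmation when it is the only projectable theory about its subject matter; and since adding a new projectable alternative can flip a confirmed theory back to unconfirmed, confirmation comes out provisional and contextual, as the model requires.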
It is easy to find in both the history of medical research and in current
clinical studies practices that exemplify this model of how projectability
judgments regulate confirmation. For example, the recent large, randomized clinical trial (RCT) of the oral multiple sclerosis (MS) drug Fingolimod was constructed on the basis of past laboratory work, experiences
with patients, and smaller clinical trials (Kappos et al., 2006). Each judgment about how to “carry forward” this past research into the design of the
RCT for Fingolimod was a particular projectability judgment. These judgments ensured that the trial was grounded as deeply as possible in past MS
research, and it is this projectability that explains why the community of
medical researchers at large easily accepted the trial’s results.
Fingolimod is hypothesized to work by suppressing the interaction
of lymphocytes and the central nervous system, and one standard theory
about the origins of MS is that autoimmune processes of some kind cause
the demyelination of nerve cells. This latter fact is why the trial of a novel
surgical procedure intended to reverse “chronic cerebrospinal venous insufficiency” (CCSVI) by the Italian surgeon Paolo Zamboni (Zamboni et al., 2008) was
met with profound skepticism. There was little prior research supporting
Zamboni’s theory that CCSVI is part of the cause of MS, and this in turn
explains why his study was viewed with skepticism amongst researchers:
it simply was not very projectable on the basis of past medical research on
MS. Consequently, Fingolimod is better confirmed as a treatment than is
Zamboni’s procedure.
But this is not to conclude either that Zamboni’s proposal is wrong or
that conclusive evidence supports the use of Fingolimod for the treatment
of MS. Rather, the contrast between the amounts of confirmation accruing
to the results of each of these studies reflects the underlying differences in
the projectability—that is, plausibility in the light of previous scientific
research—of various aspects of the Fingolimod RCT compared to the details of Zamboni’s study.
Projectability and the Possibility of Equipoise
It is a methodological problem for medical research if researchers are ethically obliged to abandon projectability judgments. Why? Requiring scientists to abandon projectability judgments automatically deprives
them of the ability to test and confirm medical hypotheses. But abandoning projectability judgments also produces a deeper ethical problem. The
antecedent need for explicit evidence that a planned course of medical research is ethically appropriate requires insight into the quality of the proposed research methodologies, and only relying upon projectability judgments produces such insight. There is no way to avoid using knowledge of
past research when assessing the methodology of a future trial.
So let us turn now to the argument that the principle of clinical equipoise conflicts with this reliance upon projectability judgments. Recall
that the principle requires there to be, in the relevant community of experts, no consensus about the comparative merits of two or more treatment options. However, there must also be the reasonable chance that
the trial will produce evidence that can convince the relevant experts that one
treatment is meaningfully better than another. Thus, clinical equipoise requires performing only those studies for which it is reasonable to expect
movement from an antecedent lack of consensus and reasonable uncertainty to subsequent consensus and reasonable certainty. The crux of our
argument, however, is that when scientists rely on projectability judgments there will be rational disagreement about empirical research both
before and after even the most sophisticated (“gold standard”) studies. For
disagreements that are based on differences in projectability judgments
among medical researchers, the “movement” required by Freedman’s clinical equipoise will be exceedingly rare, if it occurs at all.
Our argument for this conclusion has two parts. First, we demonstrate some of the ways that projectability judgments can lead to rational
variability in judgments about the methodology or results of a particular
clinical trial, study, or experiment. We then argue that most scientific research in medicine will produce this rational variability, and that this will
subsequently cause a lack of consensus about the degree of confirmation
that clinical hypotheses receive from even methodologically impeccable
clinical research.
Antecedent Disagreement Is Not (Just)
Disagreement from Lack of Subsequent Data
In medicine, there is no such thing as a standard study. The number of
experimental conditions, the nature of the controls or placebos, the age
and health of the patient population, the nature of the drug or intervention in the treatment condition—these all can be reasons for setting up
a study in a slightly different way than even very similar past studies. A
randomized trial of botulinum toxin injection (Botox) will require that the placebo treatment also be injected, not swallowed, and only into a standardized location. But trials of psychotherapeutic methodologies cannot
be this uniform, since they must allow for a degree of variability in the
content of the therapy delivered to each individual patient. It is also not
clear what “placebo therapy” could realistically be in psychotherapy. Alternatively, the investigation of genetic treatments for very rare disorders may
not be possible in a randomized, controlled fashion due to the scarcity of
potential subjects. Because of the deep differences in the topics of scientific investigation within medicine, there cannot be a uniform methodology for medical research.
The complexity of the conditions justifying a particular experimental
method will mean that expertise in one field of research (say, geriatrics)
will confer little particular insight into the design of studies in another (e.g.,
pediatrics). A geriatrician who has specific knowledge of the manifestations of diseases in the elderly will have little advantage in assessing how a
disease of newborns should be studied, treated, and measured. Thus, even
at its most advanced levels, medical knowledge is not homogeneous. Different experts know different things, and their judgments about how best
to conduct a study can both differ and yet remain equally well grounded
in medical evidence. Then, as one generation of research in medicine leads
to the next, projectability judgments will play a role in causing increasing
reliability in clinical research. However, this will also lead to more heterogeneity in the background knowledge that informs medical projectability
judgments. Scientific progress begets increasing amounts of specialization
and thereby increases the scope of evidence-based disagreement. After a
sufficient amount of progress, therefore, it becomes common for there to
be rational disagreement between experts within the same field about the
appropriate methodology for a new study, given subtle but scientifically
important differences in individual experts’ background knowledge. Differences in subject matter are thus one explanation of the lack of a uniform
methodology in medical research; another, deeper, cause of the lack of a
uniform methodology in medicine is past-evidence-based disagreement
among experts within a field over how best to carry out a new study.
To make this point more clear, recall one of the reasons why Freedman rejects Fried’s earlier conceptualization of equipoise. In 1974, Charles
Fried proposed that clinicians were ethically justified in enrolling their
patients into clinical trials only if they had equipoise, by which he meant
that they were personally undecided as to the relative merits of the interventions being studied in a given trial. Freedman (1987) argues that this
“theoretical equipoise” places too much emphasis on the persistence of
the uncertainty of an individual researcher, often the principal investigator of the given trial. Clinical equipoise fixes this by obliging the principal
investigator to defer to the state of the debate in the relevant expert community, and thereby distinguish between the degree of her confidence in
a treatment and the views of other similarly qualified researchers. Freedman’s conception is therefore social in a way that Fried’s is not. In Freedman’s view this makes clinical equipoise more stable because individual
changes in confidence cannot easily eliminate equipoise.
However, Freedman overlooks a second dimension in which medical
research is profoundly social. He calls our attention to the “horizontal”
social dimension: the views of other experts in the relevant field who may
disagree prior to a particular study about the merits of a treatment. But
there is also a “vertical” dimension that interacts with the horizontal dimension. The vertical dimension is the “stack” of past horizontal research
communities whose research flows into the decisions of contemporary
researchers whenever these decisions are based upon projectability judgments. And in the history of medical research, consensus about the merits
of a treatment is the exception, not the rule. Yet this past lack of consensus is not due to a lack of information or evidence—but (as per above)
is sometimes the natural by-product of increasing success and specialization in medical research. Since different experts in even a very small field
will have different sets of background knowledge, and since this background knowledge is not perfectly coherent despite the fact that it (usually) reflects progressively deeper empirical insights, there will be rational,
evidence-based disagreement about the appropriate methodologies for
carrying out specific clinical trials or research studies. This will emerge in
the form of differences in the projectability judgments of otherwise comparable experts in one and the same field about, for instance, whether or
not the design of a proposed trial is sufficiently rigorous.
Again, differences in projectability judgments are part of the explanation for why there is no standard study in clinical research. But these differences also entail that it is a mistake to equate scientific rigor (or ethical
appropriateness) in medicine with any particular methodology because
it is either favored by consensus or seems capable of creating consensus.
There can be different and incompatible judgments about how best to
carry out research that are, nonetheless, each rational, because they reflect
different past scientific successes and approximate truths.
So, it is important to distinguish between two kinds of antecedent disagreement. There can be disagreement mediated by projectability judgments, such as when there is more than one rational, evidence-based point
of view to take about the merits of any particular treatment, or about the
merits of some particular study meant to shed light on the merits of some
potential treatment. These disagreements facilitate scientific progress.
But they also persist despite—and are sometimes even amplified by—
progress in medical research, which is why it makes sense to call them
“hard” disagreements. The second kind of disagreement is what Freedman
calls clinical equipoise. This is disagreement due simply to a lack of some
outstanding data or information, and not rational differences in the background scientific knowledge of experts that is expressed as differences in
their projectability judgments.
The principle of clinical equipoise prohibits research into topics
about which there is something approximating hard disagreement. Accepting the principle of clinical equipoise as anything more than a purely
nominal component of the ethical deliberations associated with medical research would therefore be an extremely regressive step. It would
prevent all the scientific knowledge reflected in the “vertical” dimension
about which there is hard disagreement from being used to frame new
experiments. Consequently, it is important to understand the scope of
hard disagreements. They can be caused by specialization in medical
knowledge that results from scientific progress, but specialization can
also lead, especially over the long run, to scientific consensus. So, is there
also evidence that can tell us how common these hard disagreements will
be among medical researchers?
The Ubiquity of Hard Disagreement
Freedman’s definition of clinical equipoise avoids the question of how realistic it is to assume that it is routinely possible to conduct experiments
that confirm a medical hypothesis:
A state of clinical equipoise is consistent with a decided treatment
preference on the part of the investigators. They must simply recognize that their less-favored treatment is preferred by colleagues
whom they consider to be responsible and competent. Even if the
interim results favor the preference of the investigators, treatment
B, clinical equipoise persists as long as those results are too weak to
influence the judgment of the community of clinicians, because of
limited sample size, unresolved possibilities of side effects, or other
factors. (This judgment can necessarily be made only by those who
know the interim results—whether a data-monitoring committee
or the investigators.) (1987, p. 144)
But Freedman does not consider the further question. What if there is no
problem in sample size, or controls for potential side effects, or any other
factors? Suppose that the study is as well designed as is possible, and suppose furthermore that there are no commercial or personal interests that
prejudice members of the relevant community of experts. It is still possible that, once the relevant results are published and become known to the
relevant experts, there remains a genuine disagreement about the merits
of the treatment, because of differences in the projectability judgments of these
experts. Was it therefore an ethical mistake to conduct this study?
Strictly interpreted, Freedman’s principle says that the answer is yes,
the conduct of such a trial would be ethically inappropriate. But that
does not settle the question of the compatibility of Freedman’s principle
and the methodology of scientific research. It may be that projectability-judgment-based disagreements are infrequent enough for the principle of
clinical equipoise to still be generally compatible with the methodology of
medical research; and of course we do not want to simply assume for the
sake of our argument that such disagreements about the results of clinical
trials are pervasive in medicine. In this section, then, we develop an argument that such disagreements are widespread.
One piece of evidence for this claim is the historical paucity of crucial
experiments in medical research. A crucial experiment, again, is an experiment that conclusively settles an important outstanding empirical question.
These experiments are often responses to hard, projectability-judgment-based disagreement, yet they are able to settle the disagreement. In medicine, such experiments would convince any medical expert—irrespective of
her background training, area of specialization, or (to foreshadow) her practical
experience treating patients—of the merits or efficacy of a particular treatment. But most research in medicine does not have this effect. Even extremely well-designed and well-run trials result in rational disagreements about which conclusion, if any, is the “true” one to draw from the study.
From a historical perspective, this is not a surprise. Many medical
studies that are now considered definitive and confirmatory remained
controversial for decades after their initial public presentation. Early investigations of vitamin C as a treatment for scurvy, of hand-washing to prevent infection, and of bloodletting as a treatment for pneumonia are now considered the seminal trials of the modern, “evidence-based” era. However, none of
these trials’ results were widely accepted in their historical milieu. Many
of the chief investigators of these studies were heavily criticized, and some
were even persecuted.
But these historical examples do not help us assess how much
projectability-judgment-based disagreement there is likely to be in contemporary medical research. So, we will argue from a more recent example, the landmark National Institute of Neurological Disorders and Stroke
(ninds) trial. The ninds trial tested the efficacy of the clot-busting drug
tissue plasminogen activator (tPA) in the treatment of patients with
stroke. Despite arriving at a statistically significant demonstration of benefit from the use of tPA, there remains hard disagreement about its application. Importantly, this particular disagreement is based in a component of the “vertical” dimension of a community of medical researchers’ scientific knowledge: the knowledge that they have acquired from the practical experience of treating patients. Because this component is present in nearly every community of medical researchers, we arrive at the conclusion that hard disagreements will be correspondingly common.
Ischemic stroke, in which a blockage in blood flow to the brain leads
to the sudden onset of often catastrophic disability, was a disease without any successful treatment until 1995. The ninds trial, published in the
New England Journal of Medicine, reported results that demonstrated, for
the first time, the beneficial effect to stroke patients of the administration
of a drug called tPA. For some members of the neurological community,
the ninds trial was sufficient to confirm tPA as a treatment for ischemic
stroke in many patients. Because of this support, regulatory agencies approved the treatment, and millions of dollars have been invested in health
systems designed to maximize access to it. However, not all neurologists,
let alone physicians in other fields, agreed. Most prominently, members
of the emergency medicine community, who also possess expertise in the
management of stroke patients, were by and large skeptical about the results of the ninds trial. This difference can be understood in relation to differences in background medical knowledge, manifested as differing estimations of the plausibility of the trial’s results.
Specifically, the neurological interpretation of the ninds trial is based
on the background acceptance of the theory of the penumbra, which is
a physiological theory according to which the brains of stroke patients
are potentially salvageable if treated quickly. Nearly every neurological
article on acute stroke trials begins with a reference to the theory of the
penumbra, despite the fact that the penumbra phenomenon has never
been conclusively shown to exist in humans. In contrast, the emergency
medicine literature contains few, if any, references to the penumbra. The
projectability judgments of emergency medicine physicians are, instead,
most deeply informed by experiences with myocardial infarctions (heart
attacks), in which the use of tPA usually leads to immediate and demonstrable improvements in patient status. This is important, because tPA for
stroke does not lead to immediate improvement, which means that it is
possible, “from the bedside,” to see its use as a failure and then infer that
any subsequent patient improvement is due to other factors. Despite multiple subsequent trials, these differences in interpretation persist. We have
here another example of a hard, projectability-judgment-based disagreement in medicine. The key point, however, is that emergency medicine physicians’ projectability judgments are based in part upon knowledge gleaned from the practical experience of treating patients.
So, is the disagreement about the ninds trial the exception, or the
rule? Are most studies in medicine like the ninds trial in the sense that
they are embedded in an ongoing hard disagreement? We believe that
evidence generated by studies of the clinical decisions of physicians shows
that the answer to this question is yes. Recent work in the field of stroke
care demonstrates how, even among neurologists who accept that the
ninds trial confirmed the effectiveness of tPA, there is still significant
variability in its interpretation, which can be manifested as different decisions about how IV tPA should be administered (Shamy & Jaigobin, 2013).
Even a shared theoretical commitment can lead to subtle differences in
treatment decisions. Observations like these are suggestive of the deeper
reason why projectability-judgment-based disagreements are common.
Recall that the particular outcome that disturbs clinical equipoise for
Freedman is evidence that a particular treatment should be preferred over
its alternatives. In the context of medicine, this means that the treatment
will be more effective than its alternatives, not merely that some hypothesis is seen as more likely to be true. This is important, because the effectiveness of a particular medical treatment is not an essential property
of the treatment itself. It is, instead, a relational property that holds between patients, treatments, and physicians, in a given historical moment.
An easy way to understand this point is to see that there is no such thing
as an abstractly effective treatment that is nevertheless not indicated for
any human population. So, when physicians read and interpret the results
of studies—be they rcts with massive samples, the ninds trial, or small-scale observational studies—any interpretation must be influenced by the
physician’s background knowledge about, and gained slowly from the experience of, treating patients. As the ninds trial example demonstrates,
this knowledge is a source of difference in projectability judgments that
can, in turn, lead to evidence-based disagreements about whether and
how to accept the results of a particular study. Different past experiences
treating patients have a material influence on physicians’ interpretations of
the results of even methodologically impeccable studies.
Now, the practical experience of treating patients is virtually pervasive
among physicians. And as is nicely illustrated by the work of Kathryn Montgomery (2006), this experience is not uniform from physician to physician, and it does not typically lead to perfectly coherent or perfectly generalizable medical knowledge. Instead, it leads to a kind of
“clinical phronesis.” And just like specialist knowledge of past medical
research, this background practical knowledge will create differences in
the projectability judgments that physicians must rely upon in order to
assess the results of new research. These differences can then beget hard
disagreements about new research. Given that the experience of treating
patients is virtually pervasive among physicians, differences in projectability judgments, grounded in the practical knowledge or phronesis needed
to successfully treat patients, will be comparably ubiquitous.
Conclusions
Even the most evidence-based medical research will still exhibit a profound lack of consensus. Some of this may be what Freedman calls clinical equipoise: a lack of consensus caused by a lack of information or evidence about the effectiveness of a particular treatment. However, a significant amount of it will be due to the influence of the “vertical” dimension of medical knowledge both on the design of studies and
on the subsequent interpretations of studies by the relevant community
of physicians. Because of the influence of specialization and practical experience treating patients, the “vertical” dimension is not internally coherent. Relying upon it will therefore lead to disagreement. However, these
disagreements will be rational, because they are based upon past (and
often hard-won) empirical insights, practical wisdom, and experimental
successes. These disagreements are not likely to be resolved by any single
new clinical trial.
The implications for Freedman’s ethical principle are clear: it is not
compatible with the methodology of scientific research, because it obliges
researchers to perform only those studies that are likely to resolve disagreement among the relevant experts. Adopting the principle of clinical
equipoise requires abandoning projectability judgments, as these judgments are in various ways the cause of hard disagreements. But
without projectability judgments, there is no way to confirm hypotheses
tested by clinical research.
Recent debates in several fields of medical research about the operationalization and theoretical basis of equipoise suggest that medical
researchers are to some degree aware of the problems that arise from attempting to apply the principle of clinical equipoise to medical practice
(Goyal et al., 2013). And again, giving up on projectability judgments
would cost medical research its ability to confirm theories; giving up on
the principle of clinical equipoise does not carry this risk. What, then,
should replace clinical equipoise? We support a revised set of ethical standards, built upon a psychologically and methodologically accurate theory
of disagreement, decision making, and consensus in medicine, one that acknowledges the interrelation of methodological and ethical considerations
in the justification of clinical research. But we do not claim to know what
these ethical standards are. We therefore envision an important project
that will involve contributions from researchers in medicine, clinical ethics, and the philosophy of science.
References
Boyd, R. (1991). Confirmation, semantics, and the interpretation of scientific theories.
In R. Boyd, P. Gasper, & J. D. Trout (Eds.), The Philosophy of Science (pp. 3–35).
Cambridge: mit Press.
Boyd, R. (1992). Constructivism, realism, and philosophical method. In J. Earman
(Ed.) Inference, explanation, and other frustrations: Essays in the philosophy of science
(pp. 131–198). Berkeley: University of California Press.
Crupi, V. (2013). Confirmation. Stanford Encyclopedia of Philosophy.
Freedman, B. (1987). Equipoise and the ethics of clinical research. New England Journal
of Medicine, 317(3), 141–145.
Goodman, N. (1983). Fact, fiction, and forecast (4th ed.). Cambridge: Harvard University
Press.
Goyal, M., Shamy, M., Jovin, T., Zaidat, O., Levy, E., Davalos, A., et al. (2013). Endovascular stroke trials: Why we must enroll all eligible patients. Stroke, 44(12),
3591–3595.
Kappos, L., Karlsson, G., Korn, A. A., Haas, T., Polman, C. H., O’Connor, P., et al.
(2006). Oral Fingolimod (fty720) for relapsing multiple sclerosis. New England
Journal of Medicine, 355(11), 1124–1140.
Lakatos, I. (1974). The role of crucial experiments in science. Studies in the History and
Philosophy of Science, 4(4), 344–355.
Montgomery, K. (2006). How doctors think: Clinical judgment and the practice of medicine.
Oxford: Oxford University Press.
Shamy, M. C. F., & Jaigobin, C. S. (2013). The complexities of acute stroke decision-making: A survey of neurologists. Neurology, 81, 1130–1133.
Zamboni, P., Galeotti, R., Menegatti, E., Malagoni, A. M., Tacconi, G., Dall’Ara, S., et
al. (2008). Chronic cerebrospinal venous insufficiency in patients with multiple
sclerosis. Journal of Neurology, Neurosurgery and Psychiatry, 80(4), 392–399.