Projectability, Disagreement, and Consensus: A Challenge to Clinical Equipoise

Mark Fedyk and Michel Shamy

Theoretical & Applied Ethics, Volume 3, Number 1, Summer 2014, pp. 17–34 (Article)
Published by University of Nebraska Press
DOI: 10.1353/tha.2014.0004

Clinical equipoise links ethically appropriate medical research with medical research that has a reasonable chance of resolving debates. We argue against this principle on the ground that most debates in medicine cannot be resolved by the outcomes of any particular (even extremely rigorous) empirical study. In fact, a deep understanding of the methodology of scientific research leads to the conclusion that adopting clinical equipoise as an ethical standard for medical research would deprive medical researchers of the ability to confirm clinical hypotheses.

Introduction

Medical research in the form of clinical trials cannot be accidental, in the sense that both the outcomes of medical research must be reasonably anticipated and the methods of any new research protocol must be informed by methods that have been used in the past. Specifically, clinical research requires explicit evidence, prior to running a proposed experiment, that the experiment meets various orthodox ethical requirements. Importantly, this ethical evidence is not independent of equally explicit scientific evidence that the research is likely to generate knowledge, and this is not simply because it is unethical to do pointless or uninformed medical research. Rather, as a general rule, the methodology of medical experiments cannot be carved off from the ethical evaluation of the very same experiments, because ethical evaluation and methodological design deeply and inextricably inform each other.
This interdependence is reflected in the practice of mandatory clinical research review, during which experimental protocols undergo both rigorous methodological assessment and ethical evaluation, and in which the ethical appropriateness and the methodological quality of a trial are concurrently and reflexively determined. So, there are two ways in which clinical research cannot be accidental: it can be neither a scientific nor an ethical accident. This is one reason why Freedman’s (1987) principle of clinical equipoise is so interesting. According to this principle, part of the evidence needed to certify that a proposed study is ethical is a particular kind of scientific uncertainty that, if it exists, may effectively prevent researchers from knowing that the research is not methodologically accidental. What’s more, in order for a proposed study to be ethical, it must also be likely that this antecedent (ex ante) uncertainty will be ameliorated by the results of the study. Freedman says that clinical equipoise exists when there is “no consensus within the expert clinical community about the comparative merits of the alternatives to be tested.” He continues:

We may state the formal conditions under which such a trial would be ethical as follows: at the start of the trial, there must be a state of clinical equipoise regarding the merits of the regimens to be tested, and the trial must be designed in such a way as to make it reasonable to expect that, if it is successfully concluded, clinical equipoise will be disturbed. In other words, the results of a successful clinical trial should be convincing enough to resolve the dispute among clinicians. (1987, p. 144)

There can be, prior to the trial, no scientific consensus about a hypothesis tested by the trial; yet the trial must also be very likely to generate consensus-producing knowledge.
To philosophers of science, Freedman’s principle seems to say that it is only ethical to conduct crucial experiments (Lakatos, 1974). Crucial experiments are those that have the potential to produce evidence strong enough to resolve scientific disputes about empirical questions. Yet nearly all accounts of scientific confirmation developed since the demise of logical positivism predict that crucial experiments are exceedingly rare (cf. Crupi, 2013). So there is a prima facie worry, from the philosophy of science, that there is a tension between Freedman’s principle of clinical equipoise and scientific methodology: Freedman’s principle may inadvertently set the bar for the ethically permitted scientific outcomes of clinical research impossibly high. This article argues that this problem is real. Our argument can be summarized in the following way. The antecedent uncertainty that Freedman refers to will not typically exist for clinical studies that are not methodologically accidental. Furthermore, the knowledge that ensures that scientific research is not methodologically accidental is also usually the grounds for scientific disagreements that cannot be resolved by particular one-off scientific experiments, no matter how rigorously constructed such experiments may be. Thus any ethical norm that links ethically appropriate medical research with medical research that has a reasonable chance of resolving debates will not be practically compatible with scientific progress in the relevant field of research. In more detail, we argue that it is not possible to carry out scientific research, in medicine or elsewhere, without relying upon what philosophers of science call projectability judgments. Relying on projectability judgments, however, leads to a fundamentally different kind of disagreement in medical research than the disagreement that Freedman calls clinical equipoise.
These “hard” disagreements will look, prior to an experiment, like cases in which there is “no consensus within the expert clinical community about the comparative merits of the alternatives to be tested.” However, what makes these disagreements “hard” is that they are not the kind of disagreement that can be resolved, as is mandated by Freedman’s principle, by even the most rigorous medical experiments. Importantly, however, the knowledge that generates these hard disagreements is also the knowledge that ensures that medical research is scientifically rigorous—that is, not methodologically accidental. We conclude that most disagreements in medical research are hard disagreements, that such disagreements are an inevitable by-product of the projectability judgments created by increasingly reliable and increasingly useful scientific inquiry in medical research, and that these two points are in practice (if not in theory) well understood by medical researchers. Thus our objection to clinical equipoise is that it makes it unethical to conduct research into topics about which there are hard disagreements. Cases where plausible ethical principles are inconsistent with rational scientific norms are important areas of study for clinical ethics. Work on such cases can lead to a deeper understanding of medical research, and this can help clinical ethicists produce deeper and more penetrating analyses, principles, and ethical interventions. But it can also lead to insights that help reform research methodology. Illustrating the role of projectability judgments in theory confirmation allows us to address the question of whether the principle of clinical equipoise is compatible with the norms of scientific research, and it also provides a concrete example of one way that the philosophy of science can contribute to clinical ethics.

Projectability and Confirmation

So, what is a projectability judgment?
The concept finds its home in contemporary naturalistic philosophy of science, where it stands for the various ways in which the decisions of scientific researchers are influenced by the background scientific knowledge of their particular discipline (Boyd, 1991, 1992). Intuitively, these are decisions in the present that reflect expert insight into what worked well in the past. For example, decisions about which tests of statistical significance to use are projectability judgments; decisions about how to measure outcomes in multiple sclerosis are projectability judgments; decisions about trialing a new stroke preventative against an accepted therapy (like aspirin) are projectability judgments. The concept was introduced by Nelson Goodman (1983), who used “projectability” to refer to a property that certain terms or concepts have just in case they have a substantial history of being useful for framing reliable inductive or explanatory hypotheses. In the philosophy of science, then, projectability judgments are a key part of the explanation of how different scientific fields can produce increasingly accurate representations of the parts and processes of the world that they study. It is by relying on projectability judgments that the approximations of past theories can be incrementally improved upon through successive generations of scientific research. For our purposes, then, a good-enough definition of “projectability judgment” is: the judgment that a given concept, theory, hypothesis, or assumption is plausible on the basis of previously accepted scientific research. Considerations of projectability play an important role in explaining how scientific theories are confirmed by experimentation or observation.
We begin with two truisms: there are no deductively valid inferences from observational evidence to theoretical conclusions, and nearly all of the most scientifically useful concepts are not definable in observational terms. Given these two facts, how do scientists confirm theories? Popper famously concluded that no confirmation is possible. If we start with the assumption that we should regard as confirmed only those theories that we are completely certain are true, then the two elementary truisms above seem to reveal that we can never have complete certainty of the truth of a scientific theory. But an understanding of the role that projectability judgments play in science provides an elegant way of modeling scientific confirmation which, importantly, implies that while evidence licensing absolute certainty in the truth of any particular scientific theory may not be attainable, there can nevertheless be evidence that rationalizes increasingly strong degrees of confidence in the truth of a scientific theory. As we will illustrate below, this model conforms very well to methods of confirmation in medical research. Intuitively, it says that a clinical theory is confirmed if it is the last of a series of projectable theories left standing at some time. For this reason, it implies that confirmation is provisional and partly contextual, since a confirmed hypothesis can be rendered unconfirmed by the emergence of a projectable alternative that fits with or explains the same set of observations as the first hypothesis. The model also implies that a degree of preliminary confirmation can accrue to a theory even prior to its experimental testing, such as when the theory is the only projectable theory about the relevant subject matter at a given time. Here, then, is a sketch of the reasoning behind the model.
We begin by adding a third truism: it is not possible to deduce predictions from a theory without committing oneself to a host of auxiliary hypotheses. To test a prediction from even a very simple equivalence, like CPP = MAP − ICP (cerebral perfusion pressure equals mean arterial pressure minus intracranial pressure), we need to make any number of additional assumptions: platitudinous metaphysical assumptions (there is a material world, brains exist, there are no hidden arterial channels), routine theological assumptions (God, if she exists, is not fussing with our measuring instruments), and practical assumptions (most of the pressure measurements will not be thrown off by something weird that the patient ate yesterday). It is important to note, moreover, that the choice of auxiliary hypotheses is neither random nor arbitrary. Many auxiliary hypotheses amount to scientific or philosophical common sense, and these assumptions are very nearly ubiquitous in all scientific reasoning. But in mature sciences, like most fields of medical research, it is often the case that the most significant auxiliary hypotheses used to frame tests of predictions from a theory are clusters of previously accepted scientific theories, many of which are proprietary to the relevant field. Thus, one of the ways in which projectability judgments influence scientific research is by determining which scientific theories are assumed as auxiliary hypotheses in service of testing the predictions of a novel, or at least newer, scientific theory. Successful predictions are, all things being equal, evidence that a theory is true. But it is also important to note that quite a few of the most well-confirmed scientific theories appear to be confirmed not because they are manifestly good at making significant predictions but because they offer deep and compelling explanations of observed phenomena.
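The simple equivalence mentioned above can be made concrete in a minimal sketch (the function name and the sample pressures are our own illustrative choices, not part of the original discussion):

```python
def cerebral_perfusion_pressure(map_mmhg: float, icp_mmhg: float) -> float:
    """CPP = MAP - ICP: cerebral perfusion pressure equals mean arterial
    pressure minus intracranial pressure (all values in mmHg)."""
    return map_mmhg - icp_mmhg

# A hypothetical patient with a MAP of 90 mmHg and an ICP of 15 mmHg:
print(cerebral_perfusion_pressure(90, 15))  # 75
```

Note that even this trivial computation presupposes the auxiliary hypotheses listed above, for example that the instruments used to measure MAP and ICP report true pressures.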
For example, in The Origin of Species Darwin makes almost no nontrivial predictions. Instead, he constructs a theory that does a better job of explaining a number of observations—like the webbed feet of the upland goose, a species that never goes anywhere near water—than all of the most scientifically plausible (i.e., projectable) theories of speciation of the mid-nineteenth century. In so doing, Darwin relied upon any number of projectability judgments to determine which competing theories of the time were the best. The deeper point here is that a theory can be confirmed if it provides the best explanation of a set of observations, and constructing the evidence that a particular theory offers a better explanation than alternative scientific theories requires relying upon projectability judgments to, among other things, determine which scientific theories are the relevant alternatives, organize the collection of observations, and analyze the explanatory fit between the alternative theories and these observations. Prediction and explanation both lead to confirmation, and there can be no prediction or explanation without the use of projectability judgments. So, one last truism from the philosophy of science: data or observation under-determines theory. This is a loose way of saying that, for some set of scientific observations, there will usually be more than one theory that is consistent with those observations. Sometimes under-determination is presented as a logical objection to the view that we can have evidence that a scientific theory is confirmed. However, under-determination is also a practical problem facing working scientific researchers. One side of the problem is that it is simply too easy to generate new scientific theories, and it is practically impossible to test all of the novel theories that researchers can create—there is not enough time, money, energy, or lab personnel.
A filter is needed between the creative activities of theory generation and the practical activities of experimentation and testing. Projectability judgments provide this filter. Only those theories that are reasonably plausible by the light of previously accepted scientific research are able to make the transition from being a novel hypothesis to being a hypothesis subject to a degree of empirical scrutiny or validation. The other side of under-determination is that, once the range of projectable alternative scientific hypotheses has been established, if each of these scientific theories does no better than any other in terms of prediction or explanation, then none of these theories is confirmed. More data are required, and confirmation will accrue only subsequently, to whichever theory turns out to be either not inconsistent with, or able to adequately explain, any new findings. Scientists must rely on projectability judgments, then, to test predictions, assess explanations, and determine the degree of confirmation a set of observations confers upon a particular theory. We can now summarize the various insights and truisms that lead to this conclusion by formulating a simple model of confirmation. Some theory T is confirmed by observations O at time t if and only if, at t, either:

(1) all of the projectable alternative scientific theories to T make predictions about O that are less successful than the predictions T makes about O; or

(2) T is a better explanation of O than any of T’s projectable alternatives; or

(3) the predictive and explanatory abilities of T are better, in some combination, than those of all of its projectable alternatives;

and where the data in O were collected with controls and protocols that do not presuppose either the truth of T or the truth of any of the projectable relevant alternatives to T at t.
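The model can be rendered as a toy computation. The numeric scores, and the rule for combining predictive with explanatory success, are our own illustrative assumptions rather than part of the model itself:

```python
def confirmed(theory, rivals, pred, expl):
    """Toy version of the confirmation model sketched above: `theory` is
    confirmed at a time iff it strictly beats every projectable rival on
    prediction, on explanation, or on the two combined. `pred` and `expl`
    map each theory to a score for predictive success and explanatory fit
    with respect to the same body of observations O."""
    beats_pred = all(pred[theory] > pred[r] for r in rivals)
    beats_expl = all(expl[theory] > expl[r] for r in rivals)
    beats_both = all(pred[theory] + expl[theory] > pred[r] + expl[r]
                     for r in rivals)
    return beats_pred or beats_expl or beats_both

# Two projectable rivals; T wins outright on prediction, so T is confirmed:
pred = {"T": 0.9, "A": 0.6, "B": 0.5}
expl = {"T": 0.4, "A": 0.7, "B": 0.6}
print(confirmed("T", ["A", "B"], pred, expl))  # True

# If no theory does better than the others, none is confirmed
# (the practical face of under-determination):
tie = {"T": 0.5, "A": 0.5, "B": 0.5}
print(confirmed("T", ["A", "B"], tie, tie))  # False
```

One feature of the model shows up immediately in this sketch: adding or removing a rival from the projectable set can flip the verdict, which mirrors the point that confirmation is provisional and partly contextual.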
It is easy to find, both in the history of medical research and in current clinical studies, practices that exemplify this model of how projectability judgments regulate confirmation. For example, the recent large randomized clinical trial (RCT) of the oral multiple sclerosis (MS) drug Fingolimod was constructed on the basis of past laboratory work, experiences with patients, and smaller clinical trials (Kappos et al., 2006). Each judgment about how to “carry forward” this past research into the design of the RCT for Fingolimod was a particular projectability judgment. These judgments ensured that the trial was grounded as deeply as possible in past MS research, and it is this projectability that explains why the community of medical researchers at large easily accepted the trial’s results. Fingolimod is hypothesized to work by suppressing the interaction of lymphocytes and the central nervous system, and one standard theory about the origins of MS is that autoimmune processes of some kind cause the demyelination of nerve cells. This latter fact is why the trial of a novel surgical procedure intended to reverse “chronic cerebrospinal venous insufficiency” (CCSVI), conducted by the Italian surgeon Paolo Zamboni (Zamboni et al., 2008), was met with profound skepticism. There was little prior research supporting Zamboni’s theory that CCSVI is part of the cause of MS, and this in turn explains why his study was viewed with skepticism amongst researchers: it simply was not very projectable on the basis of past medical research on MS. Consequently, Fingolimod is better confirmed as a treatment than is Zamboni’s procedure. But this is not to conclude either that Zamboni’s proposal is wrong or that conclusive evidence supports the use of Fingolimod for the treatment of MS.
Rather, the contrast between the amounts of confirmation accruing to the results of each of these studies reflects the underlying differences in the projectability—that is, plausibility in the light of previous scientific research—of various aspects of the Fingolimod RCT compared to the details of Zamboni’s study.

Projectability and the Possibility of Equipoise

It is a methodological problem for medical research if researchers are ethically obliged to abandon projectability judgments. Why? Requiring scientists to abandon projectability judgments automatically deprives them of the ability to test and confirm medical hypotheses. But abandoning projectability judgments also produces a deeper ethical problem. The antecedent need for explicit evidence that a planned course of medical research is ethically appropriate requires insight into the quality of the proposed research methodologies, and only reliance upon projectability judgments produces such insight. There is no way to avoid using knowledge of past research when assessing the methodology of a future trial. So let us turn now to the argument that the principle of clinical equipoise conflicts with this reliance upon projectability judgments. Recall that the principle requires there to be, in the relevant community of experts, no consensus about the comparative merits of two or more treatment options. However, there must also be a reasonable chance that the trial will produce evidence that can convince the relevant experts that one treatment is meaningfully better than another. Thus, clinical equipoise requires performing only those studies for which it is reasonable to expect movement from an antecedent lack of consensus and reasonable uncertainty to subsequent consensus and reasonable certainty.
The crux of our argument, however, is that when scientists rely on projectability judgments there will be rational disagreement about empirical research both before and after even the most sophisticated (“gold standard”) studies. For disagreements that are based on differences in projectability judgments among medical researchers, the “movement” required by Freedman’s clinical equipoise will be exceedingly rare, if it occurs at all. Our argument for this conclusion has two parts. First, we demonstrate some of the ways that projectability judgments can lead to rational variability in judgments about the methodology or results of a particular clinical trial, study, or experiment. We then argue that most scientific research in medicine will produce this rational variability, and that this will subsequently cause a lack of consensus about the degree of confirmation that clinical hypotheses receive from even methodologically impeccable clinical research.

Antecedent Disagreement Is Not (Just) Disagreement from Lack of Subsequent Data

In medicine, there is no such thing as a standard study. The number of experimental conditions, the nature of the controls or placebos, the age and health of the patient population, the nature of the drug or intervention in the treatment condition—these can all be reasons for setting up a study in a slightly different way than even very similar past studies. A randomized trial of botulinum toxin (Botox) injection will require that the placebo treatment also be injected, not swallowed, and only into a standardized location. But trials of psychotherapeutic methodologies cannot be this uniform, since they must allow for a degree of variability in the content of the therapy delivered to each individual patient. It is also not clear what “placebo therapy” could realistically be in psychotherapy.
Alternatively, the investigation of genetic treatments for very rare disorders may not be possible in a randomized, controlled fashion due to the scarcity of potential subjects. Because of the deep differences between the topics of scientific investigation within medicine, there cannot be a uniform methodology for medical research. The complexity of the conditions justifying a particular experimental method means that expertise in one field of research (say, geriatrics) will confer little insight into the design of studies in another (e.g., pediatrics). A geriatrician who has specific knowledge of the manifestations of diseases in the elderly will have little advantage in assessing how a disease of newborns should be studied, treated, and measured. Thus, even at its most advanced levels, medical knowledge is not homogeneous. Different experts know different things, and their judgments about how best to conduct a study can differ and yet remain equally well grounded in medical evidence. Then, as one generation of research in medicine leads to the next, projectability judgments will play a role in increasing the reliability of clinical research. However, this will also lead to more heterogeneity in the background knowledge that informs medical projectability judgments. Scientific progress begets increasing amounts of specialization and thereby increases the scope of evidence-based disagreement. After a sufficient amount of progress, therefore, it becomes common for there to be rational disagreement between experts within the same field about the appropriate methodology for a new study, given subtle but scientifically important differences in individual experts’ background knowledge.
Differences in subject matter are thus one explanation of the lack of a uniform methodology in medical research; another, deeper cause is past-evidence-based disagreement among experts within a field over how best to carry out a new study. To make this point clearer, recall one of the reasons why Freedman rejects Fried’s earlier conceptualization of equipoise. In 1974, Charles Fried proposed that clinicians were ethically justified in enrolling their patients into clinical trials only if they had equipoise, by which he meant that they were personally undecided as to the relative merits of the interventions being studied in a given trial. Freedman (1987) argues that this “theoretical equipoise” places too much emphasis on the persistence of the uncertainty of an individual researcher, often the principal investigator of the given trial. Clinical equipoise fixes this by obliging the principal investigator to defer to the state of the debate in the relevant expert community, and thereby to distinguish between the degree of her confidence in a treatment and the views of other similarly qualified researchers. Freedman’s conception is therefore social in a way that Fried’s is not. In Freedman’s view this makes clinical equipoise more stable, because individual changes in confidence cannot easily eliminate equipoise. However, Freedman overlooks a second dimension in which medical research is profoundly social. He calls our attention to the “horizontal” social dimension: the views of other experts in the relevant field who may disagree, prior to a particular study, about the merits of a treatment. But there is also a “vertical” dimension that interacts with the horizontal dimension. The vertical dimension is the “stack” of past horizontal research communities whose research flows into the decisions of contemporary researchers whenever those decisions are based upon projectability judgments.
And in the history of medical research, consensus about the merits of a treatment is the exception, not the rule. Yet this past lack of consensus is not due to a lack of information or evidence; it is (as per above) sometimes the natural by-product of increasing success and specialization in medical research. Since different experts in even a very small field will have different sets of background knowledge, and since this background knowledge is not perfectly coherent despite the fact that it (usually) reflects progressively deeper empirical insights, there will be rational, evidence-based disagreement about the appropriate methodologies for carrying out specific clinical trials or research studies. This will emerge in the form of differences in the projectability judgments of otherwise comparable experts in one and the same field about, for instance, whether or not the design of a proposed trial is sufficiently rigorous. Again, differences in projectability judgments are part of the explanation for why there is no standard study in clinical research. But these differences also entail that it is a mistake to equate scientific rigor (or ethical appropriateness) in medicine with any particular methodology because it is either favored by consensus or seems capable of creating consensus. There can be different and incompatible judgments about how best to carry out research that are, nonetheless, each rational, because they reflect different past scientific successes and approximate truths. So, it is important to distinguish between two kinds of antecedent disagreement. There can be disagreement mediated by projectability judgments, such as when there is more than one rational, evidence-based point of view to take about the merits of a particular treatment, or about the merits of a particular study meant to shed light on the merits of some potential treatment. These disagreements facilitate scientific progress.
But they also persist despite—and are sometimes even amplified by—progress in medical research, which is why it makes sense to call them “hard” disagreements. The second kind of disagreement is what Freedman calls clinical equipoise. This is disagreement due simply to a lack of some outstanding data or information, and not to rational differences in the background scientific knowledge of experts expressed as differences in their projectability judgments. The principle of clinical equipoise prohibits research into topics about which there is anything approximating hard disagreement. Accepting the principle of clinical equipoise as anything more than a purely nominal component of the ethical deliberations associated with medical research would therefore be an extremely regressive step. It would prevent all the scientific knowledge reflected in the “vertical” dimension about which there is hard disagreement from being used to frame new experiments. Consequently, it is important to understand the scope of hard disagreements. They can be caused by specialization in medical knowledge that results from scientific progress, but specialization can also lead, especially over the long run, to scientific consensus. So, is there also evidence that can tell us how common these hard disagreements will be among medical researchers?

The Ubiquity of Hard Disagreement

Freedman’s definition of clinical equipoise avoids the question of how realistic it is to assume that it is routinely possible to conduct experiments that confirm a medical hypothesis:

A state of clinical equipoise is consistent with a decided treatment preference on the part of the investigators. They must simply recognize that their less-favored treatment is preferred by colleagues whom they consider to be responsible and competent.
Even if the interim results favor the preference of the investigators, treatment B, clinical equipoise persists as long as those results are too weak to influence the judgment of the community of clinicians, because of limited sample size, unresolved possibilities of side effects, or other factors. (This judgment can necessarily be made only by those who know the interim results—whether a data-monitoring committee or the investigators.) (1987, p. 144)

But Freedman does not consider the further question: What if there is no problem of sample size, or controls for potential side effects, or any other factor? Suppose that the study is as well designed as possible, and suppose furthermore that there are no commercial or personal interests that prejudice members of the relevant community of experts. It is still possible that, once the relevant results are published and become known to the relevant experts, there remains a genuine disagreement about the merits of the treatment, because of differences in the projectability judgments of these experts. Was it therefore an ethical mistake to conduct this study? Strictly interpreted, Freedman’s principle says that the answer is yes: the conduct of such a trial would be ethically inappropriate. But that does not settle the question of the compatibility of Freedman’s principle and the methodology of scientific research. It may be that projectability-judgment-based disagreements are infrequent enough for the principle of clinical equipoise to still be generally compatible with the methodology of medical research; and of course we do not want to simply assume, for the sake of our argument, that such disagreements about the results of clinical trials are pervasive in medicine. In this section, then, we develop an argument that such disagreements are widespread. One piece of evidence for this claim is the historical paucity of crucial experiments in medical research.
A crucial experiment, again, is an experiment that conclusively settles an important outstanding empirical question. These experiments are often responses to hard, projectability-judgment-based disagreement, yet they are able to settle the disagreement. In medicine, such experiments would convince any medical expert—irrespective of her background training, area of specialization, or (to foreshadow) her practical experience treating patients—of the merits or efficacy of a particular treatment. But most research in medicine does not have this effect. Even extremely well-designed and well-run trials result in rational disagreements about which, if any, is the “true” conclusion to draw from the study.

From a historical perspective, this is not a surprise. Many medical studies that are now considered definitive and confirmatory remained controversial for decades after their initial public presentation. Early investigations of using vitamin C to treat scurvy, hand-washing to prevent infections, and bloodletting to treat pneumonia are now considered the seminal trials of the modern, “evidence-based” era. However, none of these trials’ results were widely accepted in their historical milieu. Many of the chief investigators of these studies were heavily criticized, and some were even persecuted. But these historical examples do not help us assess how much projectability-judgment-based disagreement there is likely to be in contemporary medical research. So, we will argue from a more recent example, the landmark National Institute of Neurological Disorders and Stroke (NINDS) trial. The NINDS trial tested the efficacy of the clot-busting drug tissue plasminogen activator (tPA) in the treatment of patients with stroke. Despite arriving at a statistically significant demonstration of benefit for the use of tPA, there remains hard disagreement about its application.
Importantly, this particular disagreement is based in a component of the “vertical” dimension of a community of medical researchers’ scientific knowledge: the knowledge that they have acquired from the practical experience of treating patients. Because this component is present in nearly every community of medical researchers, we arrive at the conclusion that hard disagreements will be correspondingly common.

Ischemic stroke, in which a blockage of blood flow to the brain leads to the sudden onset of often catastrophic disability, was a disease without any successful treatment until 1995. The NINDS trial, published in the New England Journal of Medicine, reported results that demonstrated, for the first time, a beneficial effect for stroke patients of the administration of a drug called tPA. For some members of the neurological community, the NINDS trial was sufficient to confirm tPA as a treatment for ischemic stroke in many patients. Because of this support, regulatory agencies approved the treatment, and millions of dollars have been invested in health systems designed to maximize access to it. However, not all neurologists, let alone physicians in other fields, agreed. Most prominently, members of the emergency medicine community, who also possess expertise in the management of stroke patients, were by and large skeptical about the results of the NINDS trial. This difference can be understood in relation to differences in background medical knowledge, manifesting in the form of differing estimations of the plausibility of the results from the NINDS trial. Specifically, the neurological interpretation of the NINDS trial is based on the background acceptance of the theory of the penumbra, a physiological theory according to which the brains of stroke patients are potentially salvageable if treated quickly.
Nearly every neurological article on acute stroke trials begins with a reference to the theory of the penumbra, despite the fact that the penumbra phenomenon has never been conclusively shown to exist in humans. In contrast, the emergency medicine literature contains few, if any, references to the penumbra. The projectability judgments of emergency medicine physicians are, instead, most deeply informed by experiences with myocardial infarctions (heart attacks), in which the use of tPA usually leads to immediate and demonstrable improvements in patient status. This is important, because tPA for stroke does not lead to immediate improvement, which means that it is possible to judge “from the bedside” that its use has failed, and then to infer that any subsequent patient improvement is due to other factors. Despite multiple subsequent trials, these differences in interpretation persist. We have here another example of a hard, projectability-judgment-based disagreement in medicine. The key point, however, is that for emergency medicine physicians, projectability judgments are based in part upon knowledge gleaned from the practical experience of treating patients.

So, is the disagreement about the NINDS trial the exception, or the rule? Are most studies in medicine like the NINDS trial in the sense that they are embedded in an ongoing hard disagreement? We believe that evidence generated by studies of the clinical decisions of physicians shows that the answer to this question is yes. Recent work in the field of stroke care demonstrates how, even among neurologists who accept that the NINDS trial confirmed the effectiveness of tPA, there is still significant variability in its interpretation, which can be manifested as different decisions about how IV tPA should be administered (Shamy & Jaigobin, 2013). Even a shared theoretical commitment can lead to subtle differences in treatment decisions.
Observations like these are suggestive of the deeper reason why projectability-judgment-based disagreements are common. Recall that the particular outcome that disturbs clinical equipoise for Freedman is evidence that a particular treatment should be preferred over its alternatives. In the context of medicine, this means that the treatment will be more effective than its alternatives, not merely that some hypothesis is seen as more likely to be true. This is important, because the effectiveness of a particular medical treatment is not an essential property of the treatment itself. It is, instead, a relational property that holds between patients, treatments, and physicians in a given historical moment. An easy way to understand this point is to see that there is no such thing as an abstractly effective treatment that is nevertheless not indicated for any human population. So, when physicians read and interpret the results of studies—be they RCTs with massive samples, the NINDS trial, or small-scale observational studies—any interpretation must be influenced by the physician’s background knowledge about, and gained slowly from the experience of, treating patients. As the NINDS trial example demonstrates, this knowledge is a source of difference in projectability judgments that can, in turn, lead to evidence-based disagreements about whether and how to accept the results of a particular study. Different past experiences treating patients have a material influence on physicians’ interpretations of the results of even methodologically impeccable studies.

Now, the practical experience of treating patients is virtually universal among physicians. And as is nicely illustrated by the work of Katherine Montgomery (2006), this experience is not uniform from physician to physician, and it does not typically lead to perfectly coherent or perfectly generalizable medical knowledge.
Instead, it leads to a kind of “clinical phronesis.” And just like specialist knowledge of past medical research, this background practical knowledge will create differences in the projectability judgments that physicians must rely upon in order to assess the results of new research. These differences can then beget hard disagreements about new research. Given that the experience of treating patients is virtually universal among physicians, differences in projectability judgments, grounded in the practical knowledge or phronesis needed to successfully treat patients, will be comparably ubiquitous.

Conclusions

Even the most evidence-based medical research will still exhibit a remarkably profound lack of consensus. Some of this may be what Freedman calls clinical equipoise, which is a lack of consensus caused by a lack of information or evidence about the effectiveness of a particular treatment. However, a significant amount of it will be due to the influence of the “vertical” dimension of medical knowledge on both the design of studies and the subsequent interpretations of studies by the relevant community of physicians. Because of the influence of specialization and practical experience treating patients, the “vertical” dimension is not internally coherent. Relying upon it will therefore lead to disagreement. However, these disagreements will be rational, because they are based upon past (and often hard-won) empirical insights, practical wisdom, and experimental successes. These disagreements are not likely to be resolved by any single new clinical trial. The implications for Freedman’s ethical principle are clear: it is not compatible with the methodology of scientific research, because it obliges researchers to perform only those studies that are likely to resolve disagreement among the relevant experts.
Adopting the principle of clinical equipoise requires abandoning projectability judgments, as these judgments are in various ways the cause of hard disagreements. But without projectability judgments, there is no way to confirm hypotheses tested by clinical research. Recent debates in several fields of medical research about the operationalization and theoretical basis of equipoise suggest that medical researchers are to some degree aware of the problems that arise from attempting to apply the principle of clinical equipoise to medical practice (Goyal et al., 2013). And again, giving up on projectability judgments would cost medical research its ability to confirm theories; giving up on the principle of clinical equipoise does not carry this risk.

What, then, should replace clinical equipoise? We support a revised set of ethical standards, built upon a psychologically and methodologically accurate theory of disagreement, decision making, and consensus in medicine, one that acknowledges the interrelation of methodological and ethical considerations in the justification of clinical research. But we do not claim to know what these ethical standards are. We therefore envision an important project that will involve contributions from researchers in medicine, clinical ethics, and the philosophy of science.

References

Boyd, R. (1991). Confirmation, semantics, and the interpretation of scientific theories. In R. Boyd, P. Gasper, & J. D. Trout (Eds.), The philosophy of science (pp. 3–35). Cambridge: MIT Press.

Boyd, R. (1992). Constructivism, realism, and philosophical method. In J. Earman (Ed.), Inference, explanation, and other frustrations: Essays in the philosophy of science (pp. 131–198). Berkeley: University of California Press.

Crupi, V. (2013). Confirmation. Stanford Encyclopedia of Philosophy.

Freedman, B. (1987). Equipoise and the ethics of clinical research. New England Journal of Medicine, 317(3), 141–145.
Goodman, N. (1983). Fact, fiction, and forecast (4th ed.). Cambridge: Harvard University Press.

Goyal, M., Shamy, M., Jovin, T., Zaidat, O., Levy, E., Davalos, A., et al. (2013). Endovascular stroke trials: Why we must enroll all eligible patients. Stroke, 44(12), 3591–3595.

Kappos, L., Karlsson, G., Korn, A. A., Haas, T., Polman, C. H., O’Connor, P., et al. (2006). Oral fingolimod (FTY720) for relapsing multiple sclerosis. New England Journal of Medicine, 355(11), 1124–1140.

Lakatos, I. (1974). The role of crucial experiments in science. Studies in the History and Philosophy of Science, 4(4), 344–355.

Montgomery, K. (2006). How doctors think: Clinical judgment and the practice of medicine. Oxford: Oxford University Press.

Shamy, M. C. F., & Jaigobin, C. S. (2013). The complexities of acute stroke decision-making: A survey of neurologists. Neurology, 81, 1130–1133.

Zamboni, P., Galeotti, R., Menegatti, E., Malagoni, A. M., Tacconi, G., Dall’Ara, S., et al. (2008). Chronic cerebrospinal venous insufficiency in patients with multiple sclerosis. Journal of Neurology, Neurosurgery and Psychiatry, 80(4), 392–399.