The Decline Effect

The Decline Effect: Exploring Why Effect Sizes Often
Decline Following Repeated Replications
Jonathan W. Schooler
University of California, Santa Barbara
Beginner’s Luck
“Every search begins with beginner’s
luck and ends with the victor’s
being severely tested.”
Paulo Coelho, The Alchemist
Beginner’s luck in research
– It seems to be a ubiquitous aspect of research that
initial attempts are associated with a level of success
that is hard to replicate.
– Decline effect
 Originally described by Rhine to characterize drop in
performance of star clairvoyant participant
 Described by Radin in Entangled Minds
“a frequent observation in psi research is that when a new
experiment is first conducted the outcomes are strikingly
successful. Then, as others try to replicate the effects they
begin to fade.”
 Popularized by Jonah Lehrer in a 2010 New Yorker article
Personal experiences with decline effect
– Mainstream studies
 Verbal overshadowing of non-verbal memories
Basic finding
 Describing a previously seen face interferes with its
subsequent recognition
Paradigms in which I have observed decline effects
 Verbal overshadowing of non-verbal memories
» Faces (Schooler & Engstler-Schooler, 1990)
» Color (Schooler & Engstler-Schooler, 1990)
» Music (Houser, Fiore & Schooler, unpublished)
» Maps (Fiore, 1997, Master’s thesis)
 Related paradigms
» Insight problem solving (Schooler et al., 1993)
» Implicit learning (Fallshore, 1997, dissertation)
» Analogical retrieval (Lane & Schooler, 2004)
Personal experiences of decline effects in
unconventional paradigms
– Temporally reversed perceptual priming (with McSpadden)
– Temporally reversed practice effects (with Franklin)
The arrow of time
– Physicists acknowledge that there is nothing inherent in the
laws of physics that precludes the arrow of time going
from future to past
 “…the laws of physics that have been articulated from Newton
through Maxwell and Einstein and up until today, show a complete
symmetry between past and future. Nowhere in any of these laws
do we find a stipulation that they apply one way in time but not the
other…even though experience reveals over and over again that
there is an arrow of how events unfold in time, this arrow seems not
to be found in the fundamental laws of physics…not only do
known laws fail to tell us why we see events unfold in only one
order, they also tell us that in theory events can unfold in reverse
order.”
Brian Greene (2004, pp. 144-145)
– Nevertheless the presumption that causes precede effects
is rather well entrenched
Meta-analysis of precognition
Summary of Honorton & Ferrari (1989)
– 309 studies
– 62 investigators
– 2 million individual trials
– 50,000 subjects
– Small but reliable effect size: .02
– z = 11.41, p = 6.3 × 10^-25
– 30% of studies significant at p < .05
– No relationship between quality of study and size of effect
– File-drawer estimate: 46 unreported studies for each reported study
Bem’s precognition studies (JPSP, 2011)
Temporally Reversed Implicit Perceptual Priming
– Procedure
 View fixation
 Noise mask
 Briefly flashed image
 Noise mask
 Indicate whether you know what was presented
 Image repeated or followed by a blank screen
+
If you know what the image just flashed was, press the up arrow.
If you do not know what the image just flashed was, press the down arrow.
Experiment 1: Effect of Post-priming
[Figure: Experiment 1, Image Identification Task. Y-axis: proportion of “yes” responses (ticks 0.52 to 0.68); conditions: primed vs. unprimed.]
Experiment 2: Replication w/ random yoked design
[Figure: Experiment 2, Image Identification Task, fully counterbalanced presentation order. Y-axis: percent identifiable (ticks 0.31 to 0.41); conditions: primed vs. unprimed. N = 20, p = .003]
The Decline Effect
Conclusions from temporally reversed perceptual
priming
– Overall effect remained significant
– Evidenced massive decline in significance
– Early studies may have had advantage of smaller n (easier
to get spurious results)
 Decline effect still observed in later studies with larger N
Dice throwing (Bierman, 2001)
Decline Effect in Ganzfeld (Bierman, 2001)
Decline and Return in Ganzfeld (Storm et al., 2010)
Decline and Return: Telekinesis
Conclusions from meta-analyses of psi research
– Decline effect observed in a number of domains
– After decline return observed in longest studied domains
– Could this be a decline of the decline effect?
Decline Effects in Conventional Science
– Drug treatments
– Highly cited medical interventions
– Biology meta-analyses
Decline Effect in drug treatment of schizophrenia (Kemp
et al 2008)
Decline effect in Pravastatin to treat cholesterol
(Kaplan*, personal communication)
* NIMH Associate Director for Behavioral and Social Sciences
Decline effect in Timolol, beta blocker (Kaplan, personal
communication)
[Figure: reported effect size (lowering of IOP in mmHg) by year of publication, 1977 to 2002.]
Decline effect in latanoprost for treatment of glaucoma
(Kaplan, personal communication)
[Figure: Latanoprost (n=32); axis ticks from 0 to 12.]
Ioannidis (2005)
Journal of the American Medical Association
– Main findings
 Of 49 highly cited original clinical research studies, 45 claimed that
the intervention was effective.
 Of these, 7 (16%) were contradicted by subsequent studies,
 7 others (16%) had found effects that were stronger than those of
subsequent studies,
 20 (44%) were replicated,
 11 (24%) remained largely unchallenged.
– Bottom line
 Over 40% of those studies for which a replication was attempted
were either associated with a declined effect size or were outright
contradicted
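The bottom-line figure can be checked from the counts on this slide. This is a minimal sketch, assuming that "a replication was attempted" covers every positive study except the 11 that remained largely unchallenged; the variable names are illustrative only.

```python
# Counts reported above for the 45 studies that claimed an effective intervention.
outcomes = {
    "contradicted": 7,
    "initially stronger": 7,
    "replicated": 20,
    "unchallenged": 11,
}
positive_studies = sum(outcomes.values())

# Assumption: a replication was attempted for every study that is not "unchallenged".
attempted = positive_studies - outcomes["unchallenged"]
declined_or_contradicted = outcomes["contradicted"] + outcomes["initially stronger"]
share = declined_or_contradicted / attempted

print(f"{declined_or_contradicted} of {attempted} = {share:.1%}")  # 14 of 34 = 41.2%
```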
Decline effects in ecological and evolutionary biology
Meta-analyses of decline effect in biology
Conclusions from decline effects in mainstream science
– Decline effects observed in variety of domains
 Drugs
Anti-psychotics
Cholesterol lowering
Beta-blocker
Treatment of glaucoma
 Medical interventions
 Biology
Reasons for declining effect sizes
– Ioannidis
 Initial studies often have smaller n
Among randomized trials, studies with contradicted or initially stronger
effects were smaller (P=.009) than replicated or unchallenged
studies
 Non-randomized studies tend to be unreliable
Five of 6 highly cited nonrandomized studies had been
contradicted or had found stronger effects vs 9 of 39
randomized controlled trials (P=.008).
– But…
 Decline effects observed when study n is controlled for
E.g., Jennions & Møller, 2001
 Decline effects observed in many domains with randomized
assignment
Conventional accounts
of decline effect
– Regression to the mean
 Findings may be exaggerated by error variance
 But does not explain linear decline
– Degradation of procedure
 Effect may depend on the unappreciated importance of arbitrary
methodological elements that are not included in replications
 But why are studies so lucky at first?
– Refinement of procedure
 Improved methodologies remove erroneous sources of positive
effects
 But shouldn’t methodological refinements also sometimes lead to
enhanced ability to find effects?
– Publication bias
 Publication process favors publishing positive results
 Does not explain my personal experience, nor meta-analyses in psi
research where null effects are readily published
– Too many research degrees of freedom
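How regression to the mean and publication bias combine can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of any study in this talk: a true effect of 0.2 SD, small first studies (n = 20 per group) that are "published" only when significant, and larger replications (n = 80 per group) that are always reported. All numbers and function names are hypothetical.

```python
import math
import random
import statistics


def run_study(true_d, n, rng):
    """Observed mean difference for one two-group study (both groups have SD 1)."""
    treatment = [rng.gauss(true_d, 1) for _ in range(n)]
    control = [rng.gauss(0, 1) for _ in range(n)]
    return statistics.fmean(treatment) - statistics.fmean(control)


def simulate(true_d=0.2, n_initial=20, n_replication=80, labs=5000, seed=7):
    rng = random.Random(seed)
    # Smallest difference reaching p < .05 (two-sided z-test, SD assumed known).
    critical = 1.96 * math.sqrt(2 / n_initial)
    published, replications = [], []
    for _ in range(labs):
        first = run_study(true_d, n_initial, rng)
        if first > critical:  # only "successful" first studies get published
            published.append(first)
            replications.append(run_study(true_d, n_replication, rng))
    return statistics.fmean(published), statistics.fmean(replications)


published_mean, replication_mean = simulate()
print(f"mean published initial effect: {published_mean:.2f}")
print(f"mean replication effect:       {replication_mean:.2f}")
```

In this setup the published first studies overestimate the true effect and the replications fall back toward it. Note that this produces a single drop, not the steady decline described above.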
Too many degrees of freedom
False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything
as Significant
Joseph P. Simmons
Yale School of Management
Leif D. Nelson
University of California, Berkeley, Haas School of Business
Uri Simonsohn
University of Pennsylvania, The Wharton School
Psychological Science, 2011
Abstract:
This paper accomplishes two things. First, we show that despite our field’s nominal endorsement of a low rate of false-positive findings (p ≤ .05), flexibility in data collection, analysis, and
reporting dramatically increases actual false-positive rates. In many cases, a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not.
We present computer simulations and a pair of actual experiments that demonstrate how unacceptably easy it is to accumulate (and report) statistically significant evidence for a false
hypothesis. Second, we suggest a simple, low-cost, and straightforwardly effective disclosure-based solution to this problem. It involves six concrete requirements for authors and four
guidelines for reviewers, imposing a minimal burden on the publication process.
Too many degrees of freedom: Simmons et al in press
– Identified 5 sources of experimenter degrees of freedom
 (1) choosing among dependent variables,
 (2) sample size,
 (3) use of covariates,
 (4) reporting subsets of experimental conditions,
 (5) their combination.
– Conducted Monte Carlo simulation,
 Varied degrees of freedom
 observed how often at least one of the resulting p-values in each
sample was below standard significance levels.
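The logic of such a simulation can be sketched in a few lines. This is not the authors' code: it assumes only two of the degrees of freedom listed above (choosing between two dependent variables correlated at r = 0.5, and adding 10 participants per cell when the first test fails), and it uses a z-test with known SD instead of a t-test so that it runs on the standard library alone. All parameter values are illustrative.

```python
import math
import random


def z_test_p(a, b):
    """Two-sided p-value for a difference in means, assuming known SD = 1."""
    diff = sum(a) / len(a) - sum(b) / len(b)
    z = diff / math.sqrt(1 / len(a) + 1 / len(b))
    return math.erfc(abs(z) / math.sqrt(2))


def draw_cell(n, r, rng):
    """n participants, each with two DVs correlated at r; no true effect exists."""
    rows = []
    for _ in range(n):
        dv1 = rng.gauss(0, 1)
        dv2 = r * dv1 + math.sqrt(1 - r * r) * rng.gauss(0, 1)
        rows.append((dv1, dv2))
    return rows


def significant(cell_a, cell_b, flexible_dv, alpha=0.05):
    """True if any permitted DV yields p < alpha."""
    dvs = (0, 1) if flexible_dv else (0,)
    return any(
        z_test_p([row[k] for row in cell_a], [row[k] for row in cell_b]) < alpha
        for k in dvs
    )


def false_positive_rate(flexible_dv, optional_stopping,
                        sims=4000, n=20, extra=10, r=0.5, seed=1):
    rng = random.Random(seed)
    hits = 0
    for _ in range(sims):
        a, b = draw_cell(n, r, rng), draw_cell(n, r, rng)
        hit = significant(a, b, flexible_dv)
        if not hit and optional_stopping:
            # Collect more data only because the first test "failed".
            a += draw_cell(extra, r, rng)
            b += draw_cell(extra, r, rng)
            hit = significant(a, b, flexible_dv)
        hits += hit
    return hits / sims


baseline = false_positive_rate(flexible_dv=False, optional_stopping=False)
flexible = false_positive_rate(flexible_dv=True, optional_stopping=True)
print(f"one DV, fixed n:            {baseline:.3f}")
print(f"two DVs, optional stopping: {flexible:.3f}")
```

With no flexibility the rate should sit near the nominal 5%; allowing both choices pushes it noticeably higher even though no effect exists.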
Too many degrees of freedom: Simmons et al in press
– Conducted actual experiments that varied these degrees of
freedom
 Investigated whether listening to “When I’m Sixty-Four” makes people
younger
 Participants’ reported ages were younger after they were exposed to
“When I’m Sixty-Four” relative to a control song.
– Provided two write ups
 As currently allowed
 With their suggested requirements for additional disclosures
Original write up
 Twenty undergraduates drawn from the same pool as Study 1
listened to either “When I am 64” by The Beatles or “Kalimba.” In an
ostensibly unrelated task, they then indicated their birth date. An
ANCOVA revealed the predicted effect: People were nearly a year-and-a-half younger after listening to “When I am 64” rather than
“Kalimba” (adjusted Ms = 20.1 vs. 21.5), F(1, 17) = 4.92, p = .033.
Evidence that people selectively report research: John,
Loewenstein & Prelec (under review)
– surveyed over 2,000 psychologists about their involvement
in questionable research practices, using an anonymous
elicitation format supplemented by incentives for honest
reporting.
– Two conditions
 One condition received standard instructions
 The other received the Bayesian truth serum scoring algorithm
(Prelec, 2004)
 respondents were told that a donation would be given to a charity of
their choice, and that the size of this donation would depend on the
truthfulness of their response
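For readers unfamiliar with the Bayesian truth serum, the scoring idea can be sketched as follows. This assumes the rule as it is commonly stated from Prelec (2004): each respondent gives an answer and a prediction of how others will answer, and the score is an information score plus alpha times a prediction score. It illustrates the algorithm only; how the survey turned scores into donations is not described here, and the example data are made up.

```python
import math


def bts_scores(answers, predictions, alpha=1.0):
    """Bayesian truth serum scores.

    answers[r]        -- index of the option respondent r chose
    predictions[r][k] -- respondent r's predicted share of people choosing k
                         (must be strictly between 0 and 1)
    """
    n = len(answers)
    options = range(len(predictions[0]))
    # Actual share of respondents endorsing each option.
    x_bar = [sum(1 for a in answers if a == k) / n for k in options]
    # Log of the geometric mean of the predicted shares.
    log_y_bar = [sum(math.log(p[k]) for p in predictions) / n for k in options]

    scores = []
    for answer, pred in zip(answers, predictions):
        # Information score: is my answer more common than collectively predicted?
        info = math.log(x_bar[answer]) - log_y_bar[answer]
        # Prediction score: penalty for predictions far from the actual shares.
        accuracy = sum(
            x_bar[k] * math.log(pred[k] / x_bar[k]) for k in options if x_bar[k] > 0
        )
        scores.append(info + alpha * accuracy)
    return scores


# Hypothetical data: option 1 = "admits the practice", option 0 = "denies it".
answers = [1, 1, 0, 0, 0]
predictions = [[0.8, 0.2]] * 5  # everyone predicts that 20% will admit
scores = bts_scores(answers, predictions)
print([round(s, 3) for s in scores])
```

In this made-up example admissions are more common (40%) than predicted (20%), so the respondents who admit receive the higher scores.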
Questionable reporting practices: John, Loewenstein &
Prelec (under review)
Could selective reporting account for decline effect?
– Certainly could be an important factor
 Initial investigators may be especially motivated to use degrees of
freedom to obtain greatest possible effect
– No question it is a major problem for the field
– But
 Does not entirely explain my personal experiences,
 Hard to account for linear patterns of decline effects across fields
Nonconventional account of decline effect
– Heisenberg effects generalize in some as yet unknown
manner to scientific observation of phenomena
 Genuine effects actually fade with repeated observation
 Suggests laws of nature are not immutable
Possibility also raised by physicists (e.g. Paul Davies)
– Admittedly an explanation of last resort
 Cannot entirely rule it out until we get better handle on actual
source of decline effect
Need the development of an open source data repository
– Need a process to let scientists log their hypotheses and
methodologies before an experiment, and their results afterwards,
regardless of outcome. (Schooler, 2011, Nature)
Challenges of an open access data repository
– Would require
 an automated protocol to enable study methods and results to be
entered and retrieved.
 Some way to assess the quality of the work
perhaps through open-access commentaries moderated in a
manner similar to Wikipedia.
 A way to assure the qualifications of researchers who use it
 The maintenance of a blackout period to protect hypotheses and
findings prior to publication
 Incentives, and perhaps new rules from funders, to take part
– Not insurmountable.
 A similar database has already been set up for clinical trials
(http://clinicaltrials.gov)
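The requirements above could be sketched as a minimal record type. This is a hypothetical design, not an existing system: the class name, field names, and the 365-day blackout default are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class StudyRecord:
    """One entry in a hypothetical repository of conducted studies."""
    investigator: str
    hypothesis: str
    method: str
    registered_on: date
    blackout_days: int = 365  # assumed blackout length, not from the talk
    result: Optional[str] = None
    result_logged_on: Optional[date] = None

    def log_result(self, result: str, on: date) -> None:
        """Record the outcome, whatever it is, after the study was registered."""
        if on < self.registered_on:
            raise ValueError("results must be logged after registration")
        self.result = result
        self.result_logged_on = on

    def is_public(self, today: date) -> bool:
        """The entry stays hidden until the blackout period has passed."""
        return today >= self.registered_on + timedelta(days=self.blackout_days)


record = StudyRecord(
    investigator="A. Researcher",
    hypothesis="Describing a face impairs its later recognition",
    method="Two-group design, n = 40 per group",
    registered_on=date(2012, 1, 15),
)
record.log_result("no significant difference", on=date(2012, 6, 1))
print(record.is_public(date(2012, 6, 1)), record.is_public(date(2013, 2, 1)))
```

Logging the hypothesis before the result, and keeping null results in the same record, is what would let published studies be compared against the full set of conducted studies.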
Benefits of an open access repository
– Would reveal how published studies fit into the larger set of
conducted studies
– Would overcome many problems stemming from excessive
degrees of freedom.
– Would make the scientific process much more transparent
– Would likely reveal the source of the decline effect
Bottom Line
– Decline effect has haunted me my entire career
 Has it happened to you?
– It has also been observed in many domains of science
– Many factors likely to contribute to the decline effect
– Highlights likely impact of questionable yet common
scientific practices
– We cannot completely understand the decline effect until
science does a better job of making available all studies.
 Not just those that have been tailored for publication.
– Open access repository would go a long way both towards
correcting excessive scientific degrees of freedom and
revealing the source(s) of the decline effect