commentary
Report the awful truth!
Negative and null results are routinely produced across all scientific disciplines, but rarely get reported. The key to combating the biases arising from this mismatch lies in disseminating all details about a work, rather than just positive results.
Leonie Mueck
A good metaphor for how science works is the game Battleship1. The
players shoot cannons into a largely
unexplored territory. Sometimes, eureka! The
shot hits a ship, but most cannonballs land
in the bleak and empty ocean. At first sight
those shots seem wasted. But every player
knows that the misses provide valuable
information. Getting to know the terrain
ultimately points you towards the targets.
Science — like Battleship — is often about
learning from the research attempts that
failed to confirm the original hypothesis.
But a thorough look through the scientific
literature reveals that these negative or null
results are rarely reported. Published articles
are usually polished narratives, which convey
that everything went according to plan.
When an article is written up, sent to the editor and goes through peer review, the studies that researchers or journals consider uninteresting are shelved. They disappear into the bottom drawer of the principal investigator, and it is mostly the negative and null results that suffer this fate. This effect is well known and has been dubbed the ‘file-drawer effect’2.
Daniele Fanelli, a scientometrician at the Université de Montréal, Canada, has performed large investigations into just how seldom negative or null results are published. The results are surprising: depending on the discipline, the proportion of articles reporting negative or null results typically amounts to 10–15%. Only in the fields of space science and geoscience does this number exceed 25%. Materials science reports less than 10%, physics around 15%, and the percentages for chemistry, the biological sciences and medicine lie in between3. Ask a few PhD students how many negative or null results they have already produced in their short careers and these percentages seem minuscule.
This positive-outcome bias might harm scientific progress on the whole. “If you are certain that your study was correctly conducted, and as long as you report exactly what you did, the publication of any result — whether positive or negative — will be valuable”, Fanelli says. “But if you only allow positive results to be reported, it might turn out that people base their future research on an occasional happy mistake.” This occasional happy mistake, which might be due to a little impurity in a chemical synthesis or a statistical fluctuation in a clinical trial, will lead fellow researchers down the wrong path, delay scientific progress and lead to irreproducible studies. The rectification of a false positive result or an exaggerated effect will waste a lot of time and money. Conversely, not reporting experiments that did not show the desired outcome might cause duplicate scientific efforts and lead to a slower discovery of errors or scientific fraud.
Biases across the disciplines
In certain disciplines the positive-outcome bias has been discussed for decades and we have a fairly good picture of its effect
on scientific progress. Psychology4 and evolutionary biology5, for example, have
a long history of worrying about the
consequences. One such consequence might
be the infamous decline effect: it seems that
scientific findings become less pronounced
every time they are replicated6,7. The absence
of negative and null results artificially
inflates the magnitude of an effect in the
early years after its discovery — and for the
true magnitude to be determined a slow and
laborious self-correction process is necessary.
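A toy simulation makes this mechanism concrete. The sketch below is an illustration of our own, not a model taken from any of the cited studies: many equally noisy studies estimate the same true effect, only the ones that cross a conventional significance threshold get written up, and a reader then averages what was published.

import random
import statistics

random.seed(1)

TRUE_EFFECT = 0.2   # assumed true effect size (arbitrary standardized units)
STUDY_SEM = 0.25    # assumed standard error of a single small study
N_STUDIES = 10_000  # number of simulated studies
Z_CRIT = 1.96       # two-sided 5% significance threshold

all_estimates, published = [], []
for _ in range(N_STUDIES):
    estimate = random.gauss(TRUE_EFFECT, STUDY_SEM)
    all_estimates.append(estimate)
    # File-drawer rule: only significant positive results leave the bottom drawer.
    if estimate / STUDY_SEM > Z_CRIT:
        published.append(estimate)

print(f"true effect:                   {TRUE_EFFECT:.2f}")
print(f"mean over all studies:         {statistics.mean(all_estimates):.2f}")
print(f"mean over published studies:   {statistics.mean(published):.2f}")
print(f"fraction of studies published: {len(published) / N_STUDIES:.0%}")

With these assumed numbers only roughly one study in eight is published, and the published record reports an average effect about three times larger than the true one; later, better-powered replications then appear to show the effect ‘declining’ towards its real value.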
Another field for which the decline
effect has been well documented and
where researchers have been reiterating
the importance of negative and null results
is medicine8. In this case, it is not purely
of academic interest to foster efficient
communication of an effect’s true magnitude.
Concealing clinical trials that show a new medication is no better than the conventional product might cost the public
health sector billions. The fact that the
positive-outcome bias has a considerable
effect on the overall scientific opinion
has been well documented for various
medical fields9–11.
Gerhard Fröhlich, a philosopher of
science from the Johannes Kepler University
in Linz, Austria, has been criticizing biases
and distortions in clinical trials for a long
time. “We must not merely consider the
published articles to get a balanced picture of
a medical therapy, especially in meta-studies. Rather, the unpublished studies are just as important, as they are much more likely to contain negative or null results”, he says. He
goes on to mention an even more worrying
finding: in medicine, industry-funded research is less likely to report negative or null results than publicly funded studies12.
For disciplines in the physical and
chemical sciences where statistics do not
play such a central methodological role,
publication bias and the file-drawer effect
have been investigated much less. Possibly
it is not a big problem in those disciplines.
Fanelli, at least, has found that disciplines
in the physical sciences in general are
more likely to report negative results than
in the biological sciences, psychology and
medicine3. For Fanelli, an insightful rationale
for this discrepancy lies in a modernized
version of Auguste Comte’s 200-year-old
concept of a ‘Hierarchy of Sciences’. Sciences
may be ranked according to the complexity
of the studied objects, with physics at
the bottom and sociology at the top. The
hypothesis is that disciplines at the top are
more susceptible to biases. “We often treat
all fields in the same way, expecting that
psychology has the same methodological
rigour as physics”, Fanelli remarks.
Nevertheless, he thinks that publication
biases are present in physics as well. But
based on the rationale of the hierarchy of
sciences, he argues: “I suspect that in physics
the decline effect is less of a problem than
in softer sciences. Although some initially
reported effects might be too large or
even due to an artefact, the self-correction
mechanism in physics seems much more
rapid.” At present, Fanelli and his colleagues
are collecting more evidence to back up
this hypothesis13.
Omitting relevant details
It seems, however, that a few bad habits in reporting practices have undesirable consequences even in the hard sciences. If papers are polished a bit too much or if relevant details are left out, anomalies might arise that distort the scientific literature in the long term. One such anomaly has recently been uncovered by organic chemists Tomas Hudlicky and Martina Wernerova from Brock University in St Catharines, Canada. They questioned the exceptionally high yields of more than 95% that are routinely reported for organic synthesis methods nowadays and set out to test whether the reported yields are in fact realistic. A look through the literature shows that such high yields were rarely reported before 1980 and that today they are still absent from the journal Organic Syntheses, where all submissions are independently reproduced before publication. Wernerova and Hudlicky carefully studied how much of the chemical product is lost in the typical work-up procedure and concluded that any yield of more than 94% is unrealistic14. Any higher reported yield seems an ‘occasional happy mistake’, just like the false positive results that find their way into psychological journals. This case is a prime example of how a publication bias is introduced into the official literature because of a bad reporting practice, which by now is almost impossible to correct.
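The arithmetic behind that ceiling is simple: the recoveries of successive work-up operations multiply. The per-step figures in the sketch below are illustrative assumptions of ours, not the losses measured in ref. 14, and the reaction itself is taken to be quantitative.

import math

# Hypothetical fractional recoveries for typical work-up operations;
# illustrative assumptions, not the values measured by Wernerova and Hudlicky.
step_recoveries = [
    0.99,  # extraction into the organic phase
    0.99,  # washing and phase separation
    0.99,  # drying over a desiccant and filtration
    0.98,  # solvent evaporation and transfer to the tared flask
]

# Per-step losses compound multiplicatively, capping the isolated yield.
max_isolated_yield = math.prod(step_recoveries)
print(f"maximum isolated yield: {max_isolated_yield:.1%}")  # about 95%

Any additional purification, such as recrystallization or chromatography, pushes the ceiling lower still, which is why isolated yields well above the mid-nineties deserve a sceptical second look.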
Figure 1 | A competitive research environment might increase the positive-outcome bias. By analysing 1,316 papers published in the US, Fanelli showed that more competition in an individual state correlated with an increased rate of published positive results. The level of competitiveness is measured by the number of articles published per doctorate holder. The percentage of published positive results is given for each state, with the corresponding sample size in parentheses. A complete discussion including the margins of uncertainty can be found in ref. 20. Data courtesy of Daniele Fanelli.
In addition to those distortions, omitting small but relevant details can have direct consequences on other people’s research. This fact is well demonstrated by the following anecdote: during his PhD, Mathias Kläui, now a physics professor at the Johannes Gutenberg University in Mainz, Germany, and active in the field of nanomagnetism, developed a method to determine all in-plane magnetization components in a magnetic film
from a single measurement. For certain angles
this worked very well, for others it did not.
“As it was just a side project, we never really
investigated why this was the case”, Kläui
says. In the publication this measurement
technique was merely used to validate a new
fabrication method for magnetic films, so the
group decided to mention only information
on the successful measurement angles15. Years
later, Kläui received an e-mail from a PhD
student in the US. “Because of our paper, she
had tried to repeat the same measurements
with all angles and came across the same
problems”, he says. “She carried on, trying to
understand the result for two years. Naturally,
she was quite frustrated when she heard that
I had had the same problems but we never
published the details.”
Internal barriers
Reporting all the details of what you did
and how you did it seems to be quite
straightforward, and the fact that this is not the norm is certainly puzzling. After all, there are not many external barriers. In these times
of online publication, preprint repositories
and large supplementary information
files, there are no space limitations and
all necessary infrastructure is available.
Moreover, hardly any journal explicitly
excludes the publication of negative results.
The barriers that keep us from disseminating
the big ‘failed’ experiments as well as the little
erroneous paths rather seem to be internal.
Talk to researchers, especially in physics and chemistry, and they often express the concern that their lack of a positive result is simply due to a trivial
mistake. Or they think that their negative
results are less important, will have little
impact and will not get many citations.
Commonly, if preliminary results indicate
that the study will not produce a positive
outcome, research projects get terminated.
Why waste time and money to finish and
write them up if the research will be of
low impact anyway? There is some truth
to those prejudices: Fanelli showed that
articles reporting negative results get cited
less, but this finding was only statistically
significant for the biological sciences16. In
contrast to positive results, which tend to get
cited by a fixed community of researchers
from a specific discipline, negative results
are cited by a broad range of scientists from
different fields17. Moreover, a simulation
study that was recently published in PLoS
ONE by de Winter and Happee18 suggests
that a certain amount of the file-drawer effect
might actually be beneficial for science as a
whole. The simulation showed that with a
selective publication approach it took fewer
published articles to arrive at a true meta-analytic estimate of the effect than with a
publish-everything approach. Ranking results
according to their scientific impact might
make sense to a certain extent. However,
the simulation also demonstrated that not
reporting negative results leads to poorly
reproducible and contradictory literature.
Fröhlich does not think that those
arguments against publishing negative results
are very strong. His first thought on how to
explain the internal barrier is competition.
“Not publishing negative results is a strategy
to withhold information from scientific
competitors because they are condemned
to go down the same wrong paths and dead-end streets again and to repeat prior errors”,
he says19. In this context, it is a long-standing
hypothesis that the competitive environment
and publish-or-perish culture that scientists nowadays find themselves in contribute to the publication of fewer negative results.
But only recently have the first firm pieces
of evidence started to trickle in. Fanelli, for
example, showed that in US states with a
more competitive research environment the
proportion of positive-outcome studies is larger
than in states with less competition20 (Fig. 1).
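The quantity behind Fig. 1 can be sketched in a few lines of code. The state-level numbers below are invented placeholders, and the analysis in ref. 20 is far more careful, but the sketch shows the basic computation: correlating papers per doctorate holder with the share of papers that report a positive result.

import statistics

# Invented placeholder data: state -> (papers per doctorate holder, share of positive results).
states = {
    "state A": (0.25, 0.70),
    "state B": (0.35, 0.78),
    "state C": (0.50, 0.82),
    "state D": (0.65, 0.85),
    "state E": (0.90, 0.90),
}

def pearson(xs, ys):
    # Pearson correlation coefficient, written out to stay dependency-free.
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5

competitiveness = [c for c, _ in states.values()]
positive_share = [p for _, p in states.values()]
print(f"correlation across states: r = {pearson(competitiveness, positive_share):.2f}")

A positive correlation alone says nothing about why more competitive states publish more positive results; it is the pattern, not its explanation, that Fanelli documented.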
Combating publication biases
From what we know about the undesirable
effects of publication biases, we need to
overcome the barriers that keep us from
publishing more negative results. Especially
in medicine and psychology the matter is
urgent and scientists, journals and policy
makers have become aware of it. Plenty of
proposals about how to combat publication
biases have been made. Most are specific
to certain disciplines, which reflects the
variations in prerequisites and demands
across the different fields.
For clinical studies, comprehensive
clinical-trial registries can make it harder
to draw a veil over negative or null
outcomes; such measures are already being
implemented in various countries. The
publication of more raw data is another
measure to facilitate reproducibility,
especially, but not exclusively, for medical and biological studies. One of the proposals
that immediately comes to Fanelli’s mind is
a reproducibility factor for journals. Rather
than counting citations to measure quality,
this factor would be a gauge for how well the
published articles withstand a reproducibility
test. Fröhlich has a related idea that would
help with its implementation: “When
funding a scientific project, I am in favour of
dedicating a certain portion of the money for
independently checking the reproducibility
of the original results.” But the career-advancement system might compromise this idea. “Unfortunately, it is hard to win laurels as an individual researcher by reproducing previous studies”, Fröhlich adds.
Fanelli also thinks the current evaluation
practices are not optimal for preventing
biases. “Changes in the funding and career-advancement system might make it less of a
conflict of interest to share negative results. If
you get at least some credit for having tried
the wrong experiments first, this will help to
overcome publication biases”, he says.
In physics and chemistry, our general understanding of reporting biases is still very sketchy, but such big measures might not be needed to combat distortions. On the other hand, it is obvious
that the hard sciences are not at all immune
to biases. The general idea is clear: when
researchers write up a scientific article and
when journals select and finalize it, a greater
focus on the experimental details rather
than on the success story will lead to more
transparency. In the example of organic
yields, for instance, requesting the exact
information on how the yield was obtained
and whether any kind of average over several experiments was taken is plain and simple.
Another easy measure to save PhD students
and postdocs from frustration would be to
designate some space in the supplementary
information to specifically talk about the
possible experimental pitfalls and erroneous
paths taken.
The appeal to scientists and
journals must be: report the awful truth!
At the beginning it might seem slightly
embarrassing to disseminate everything,
including errors and failures. But in the end
it will be for the benefit of science.
❐
Leonie Mueck is an Associate Editor at
Nature Communications.
e-mail: [email protected]
References
1. Fröhlich, G. in Der unendliche Prozeß der Zivilisation (eds Kuzmics, H. & Mörth, I.) 95–111 (Campus Verlag, 1991).
2. Rosenthal, R. Psychol. Bull. 86, 638–641 (1979).
3. Fanelli, D. PLoS ONE 5, e10068 (2010).
4. Howard, G. S. et al. Rev. Gen. Psychol. 13, 146–166 (2009).
5. Jennions, M. D. & Møller, A. P. Biol. Rev. 77, 211–222 (2002).
6. Schooler, J. Nature 470, 437 (2011).
7. Jennions, M. D. & Møller, A. P. Proc. R. Soc. Lond. B 269, 43–48 (2002).
8. Ioannidis, J. P. A. J. Am. Med. Assoc. 294, 218–228 (2005).
9. Song, F. et al. Health Technol. Assess. 14, 1–193 (2010).
10. Turner, E. H. CNS Drugs 27, 457–468 (2013).
11. Dwan, K., Gamble, C., Williamson, P. R. & Kirkham, J. J. PLoS ONE 8, e66844 (2013).
12. Bailey, C. S. et al. Can. J. Surg. 54, 321–326 (2011).
13. Fanelli, D. & Ioannidis, J. P. A. Proc. Natl Acad. Sci. USA 110, 15031–15036 (2013).
14. Wernerova, M. & Hudlicky, T. Synlett 18, 2701–2707 (2010).
15. Heyderman, L. J., Kläui, M., Rothman, J., Vaz, C. A. F. & Bland, J. A. C. J. Appl. Phys. 93, 7349–7351 (2003).
16. Fanelli, D. Scientometrics 94, 701–709 (2013).
17. Gumpenberger, C. et al. Scientometrics 95, 277–297 (2013).
18. de Winter, J. & Happee, R. PLoS ONE 8, e66463 (2013).
19. Fröhlich, G. in Knowledge Management und Kommunikationssysteme, Workflow Management, Multimedia, Knowledge Transfer Proc. 6th Int. Symp. for Information Sci. (eds Zimmermann, H. H. & Schramm, V.) 535–549 (UVK Verlagsgesellschaft mbH, 1998).
20. Fanelli, D. PLoS ONE 5, e10271 (2010).