“Making Science Great Again”
From replication crisis to open science: how we can improve research
Roy Salomon, Gonda Center, 12.6.17

This presentation is inspired by presentations from Daniel Lakens, Jim Grange, and Brian Nosek. Most slides are from PD Dr. Felix Schönbrodt, Ludwig-Maximilians-Universität München, and used under a CC-BY 4.0 license.

“Only when certain events recur in accordance with rules or regularities, as in the case of repeatable experiments, can our observations be tested – in principle – by anyone.... Only by such repetition can we convince ourselves that we are not dealing with a mere isolated ‘coincidence’.” – Karl Popper (1959, p. 45)

Outline
• We have a problem!
• What are the causes?

We have a problem!

Timeline, 2011 to 2015
• 2011: Bem
• Simmons et al.: False-positive psychology. The combination of some typical questionable research practices (QRPs) increases the Type-I error rate from 5% to > 50%.
• John et al.: Prevalence of QRPs. “Self-admission rate” for many QRPs > 50%; estimated prevalence partly > 70%.
• Doyen et al. (2012) ➙ “The Bargh rant”
• Kahneman: Open Letter

Cited by 4195

“I believe that you should collectively do something about this mess. I see a train wreck looming.”
http://www.nature.com/polopoly_fs/7.6716.1349271308!/suppinfoFile/Kahneman%20Letter.pdf
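The Simmons et al. point in the timeline above, that combining questionable research practices inflates the Type-I error rate, can be illustrated with a small simulation. This is only a minimal sketch, not the authors' own simulation: it assumes a z-test with known SD = 1, two independent dependent variables, and one round of optional stopping, and all function names are illustrative. With only two practices combined, the inflation is clearly visible but stays well below the > 50% reported for larger combinations.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def z_test_p(group_a, group_b):
    """Two-sided p-value for a difference in means, assuming known SD = 1."""
    se = sqrt(1 / len(group_a) + 1 / len(group_b))
    z = (fmean(group_a) - fmean(group_b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def one_study(rng, flexible, n_start=20, n_extra=10, alpha=0.05):
    """Simulate one study with no true effect; return True if it is 'significant'."""
    n_dvs = 2 if flexible else 1
    # data[dv] holds two groups of observations; every true effect is zero.
    data = [[[rng.gauss(0, 1) for _ in range(n_start)] for _ in range(2)]
            for _ in range(n_dvs)]
    # Practice 1 (flexible only): report whichever dependent variable "works".
    if any(z_test_p(a, b) < alpha for a, b in data):
        return True
    if not flexible:
        return False
    # Practice 2: optional stopping, i.e. add participants and test again.
    for dv in data:
        for group in dv:
            group.extend(rng.gauss(0, 1) for _ in range(n_extra))
    return any(z_test_p(a, b) < alpha for a, b in data)


def false_positive_rate(flexible, n_sims=4000, seed=1):
    """Share of simulated null studies that end up with p < .05."""
    rng = random.Random(seed)
    return sum(one_study(rng, flexible) for _ in range(n_sims)) / n_sims


strict_rate = false_positive_rate(flexible=False)
flexible_rate = false_positive_rate(flexible=True)
print(f"one planned test: {strict_rate:.3f}")
print(f"two DVs + optional stopping: {flexible_rate:.3f}")
```

The single planned test should land near the nominal 5%, while the flexible analysis should come out noticeably higher, even though every simulated effect is exactly zero.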
Original finding: n = 20 in each condition, d = 0.73, 95% CI [0.05; 1.41], cited 577x
Large-sample follow-up: N > 3500 in each condition, d = -0.01, 95% CI [-0.05; 0.04], p = .76
http://www.terryburnham.com/2015/04/a-trick-for-higher-sat-scores.html?m=1

Timeline, continued: Foundation of Center for Open Science (Open Science Framework)

Complete scientific project management: data management, pre-registrations, version control, private/public, private read-only links for reviewers, wikis, email lists, Dropbox/Figshare/Github integration, download statistics …

Timeline, continued: Simonsohn et al.: p-curve

p-curve: Null effect
• Under H₀, p-values are uniformly distributed
• Doing a study = drawing a random p-value from this distribution
[Figure: density of p-values under H₀; x-axis: p value (0 to 1), y-axis: density; label: 5%]

p-curve: Effect size > 0
• With increasing power, the p-curve gets more positively skewed
[Figures: density of p-values at 10% power, at 35% power (average in psychology), and at 80% power; x-axis: p value (0 to 1), y-axis: density]

Elderly priming p-values
• At 60% power, 49% of all p-values are expected to be < .025
• At 60% power, 11% of all p-values are expected to be between .025 and .05
[Figure: density of p-values between 0.00 and 0.20 at 60% power; bin labels: k = 5, k = 13]

http://p-curve.com/
Elderly priming p-values (k = 18): p = .043, p = .034, p = .046, p = .033, p = .017, p = .044, p = .043, p = .048, p = .039 …

Timeline, continued: ManyLabs 1 & Special Issue “Replication”

Social Psychology: Replication Special Issue (Nosek & Lakens, 2014)
Bayesian reanalysis (Marsman, Schönbrodt, Morey, Wagenmakers, in prep.)
7/59 = 12% replicable

Timeline, continued: Schnall-Debate

Timeline, continued: ManyLabs 3

• 10 effects, 20 labs, n > 3400
• ES: d = .09, p = .02
• n for 95% power = 6708
• Power in original study (n = 152): 8%

Timeline, continued: Reproducibility Project: Psychology (RP:P)

https://osf.io/ezcuj/wiki/home/
97 replications
• 36% of all replications were significant
  • PS - cog: 53%
  • JEP:LMC: 48%
  • PS - soc: 29%
  • JPSP - soc: 23%
• 83% of all effect sizes are smaller than the original

Not my problem? An outlook to other disciplines.

• 53 ‘landmark studies’, not randomly selected: fresh approaches targeted for future drug development
• “scientific findings were confirmed in only 6 (11%) cases. Even knowing the limitations of preclinical research, this was a shocking result.”
• Bayer Healthcare: 67 target-validation projects in oncology, women’s health, and cardiovascular medicine. Only 14 (21%) could be reproduced.

Begley, C. G., & Ellis, L. M. (2012). Drug development: Raise standards for preclinical cancer research. Nature, 483, 531–533. doi:10.1038/483531a
Prinz, F., Schlange, T., & Asadullah, K. (2011). Believe it or not: how much can we rely on published data on potential drug targets? Nature Reviews Drug Discovery, 10, 712.
doi:10.1038/nrd3439-c1

“Our results indicate that the average statistical power of studies in the field of neuroscience is probably no more than between ~8% and ~31%, on the basis of evidence from diverse subfields within neuroscience.”

What are the causes? What are the solutions?

Why is reproducibility so low? Why? How?

We are human
• We err.
• We p-hack.
• We HARK.
• We use QRPs.

We are part of a system
• Publication bias
• Problematic incentive scheme

The causes
• Unintentional mistakes
• The garden of forking paths
• Questionable Research Practices (QRPs)
• Fraud
• Publication bias

Unintentional mistakes

• Reproducible analysis code and open data required at submission: “inhouse checking” in review process
• 54% of all submissions had results in the paper that did not match the computed results from the code
• Wrong signs, wrong labeling of regression coefficients, errors in sample sizes, wrong descriptive stats
http://thepoliticalmethodologist.com/2014/12/09/a-decade-of-replications-lessons-from-the-quarterly-journal-of-political-science/

Unintentional mistakes
• Solution: Open Data
• Solution: Open Scripts

The garden of forking paths

Andrew Gelman & Eric Loken, 2013
Inspired by Neuroskeptic’s blog: http://blogs.discovermagazine.com/neuroskeptic/2015/05/18/p-hacking-a-talk-and-further-thoughts/#.VV2TiOePKsN

The garden of forking p-hacks
[Figure: one dataset (“Data”) branching into many analysis paths, each ending in its own p-value: P=0.82, P=0.04, P=0.34, P=0.17, P=0.66, P=0.82, P=0.34, P=0.07, P=0.24]

Let’s do this together:
http://shinyapps.org/apps/p-hacker/

Solution: Preregistration

“The first principle is that you must not fool yourself and you are the easiest person to fool.” – Richard P. Feynman

What is a preregistration?
• It’s the introduction and methods section of your future paper.

What should be included in a preregistration?
• Predictions
  • Hypotheses
  • Models
• Dependent variables
• ROIs
• Confounds
• Exclusion criteria
• Feature definition (“functional connectivity defined as…”)
• Analysis plan
  • Statistical techniques (algorithms)
  • Multiple comparison correction
  • Parameters
http://dx.doi.org/10.1371/journal.pone.0132382

The garden of forking paths
• Solution: Open Data
• Solution: Pre-registration

QRP

Questionable Research Practices (QRPs)
• Solution: Pre-registration

Publication bias

Psychology/Psychiatry: 92%! 34%? 21%?
Fanelli, D. (2011). Negative results are disappearing from most disciplines and countries. Scientometrics, 90(3), 891–904.
doi:10.1007/s11192-011-0494-7

Reviewed Pre-Registration
https://www.elsevier.com/editors-update/story/peer-review/cortexs-registered-reports

Reviewed Pre-Registration: journals
• Advances in Methodologies and Practices in Psychological Science
• AIMS Neuroscience
• Animal Behavior and Cognition
• Attention, Perception, and Psychophysics
• Behavioral Neuroscience
• Cognition and Emotion
• Cognitive Research: Principles and Implications
• Comprehensive Results in Social Psychology
• Cortex
• Drug and Alcohol Dependence
• European Journal of Neuroscience
• Experimental Psychology
• Health Psychology Bulletin
• Human Movement Science
• Infancy
• International Journal of Psychophysiology
• Journal of Business and Psychology
• Journal of Cognitive Enhancement
• Journal of European Psychology Students
• Journal of Experimental Political Science
• Journal of Media Psychology
• Journal of Personnel Psychology
• Journal of Research in Personality
• Judgment and Decision Making
• Management and Organization Review
• Memory
• Nature Human Behaviour
• NFS Journal
• Nicotine & Tobacco Research
• Perspectives on Psychological Science
• Royal Society Open Science
• Stress and Health
• The Leadership Quarterly
• Work, Aging and Retirement

Causes and solutions
• Unintentional mistakes. Solution: Open Data. Solution: Reproducible Scripts.
• The garden of forking paths. Solution: Open Data. Solution: Pre-registration.
• Questionable Research Practices (QRPs). Solution: Pre-registration.
• Fraud
• Publication bias. Solution: Pre-registration, Registered reports.

How can we improve research?

Summary
Our current system’s incentives foster questionable research practices, which decrease the truth value of our shared knowledge. To make science great again we need to adopt new approaches:

Personal level
• Open data
• Open scripts
• Open materials
• Preregistration
• Transparency
• Better statistics
• Peer review openness (make others participate)

System level
• How we appraise and hire people.
• Expect fewer papers but better ones.
• What journals we support
• Open Access!
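As one concrete reading of “better statistics”, two sets of numbers quoted earlier in the deck can be sanity-checked with a normal approximation: the ManyLabs 3 power figures (d = .09, n = 152 in the original study) and the p-curve expectation at 60% power (49% of p-values below .025, 11% between .025 and .05). The following is a minimal sketch, assuming a two-group comparison with equal group sizes and a two-sided test at alpha = .05; the function names are illustrative and do not come from the slides. With the rounded d = .09, the required sample size comes out somewhat below the 6708 on the slide, which may reflect rounding of the effect size or a different power routine.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def rejection_prob(ncp, alpha=0.05):
    """P(two-sided p < alpha) when the z statistic is Normal(ncp, 1)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return (1 - STD_NORMAL.cdf(z_crit - ncp)) + STD_NORMAL.cdf(-z_crit - ncp)


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-group comparison for effect size d."""
    return rejection_prob(d * sqrt(n_per_group / 2), alpha)


def approx_n_per_group(d, power=0.95, alpha=0.05):
    """Approximate per-group sample size needed to reach the given power."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / d) ** 2)


def p_curve_shares(power, alpha=0.05):
    """Expected share of all p-values below alpha/2, and between alpha/2 and
    alpha, for a true effect studied at the given power (power > alpha)."""
    # Noncentrality that yields the requested power (opposite tail ignored).
    ncp = STD_NORMAL.inv_cdf(1 - alpha / 2) + STD_NORMAL.inv_cdf(power)
    below_half = rejection_prob(ncp, alpha / 2)
    below_alpha = rejection_prob(ncp, alpha)
    return below_half, below_alpha - below_half


# ManyLabs 3 example: d = .09, original study with n = 152 (76 per group).
original_power = approx_power(0.09, 76)
needed_per_group = approx_n_per_group(0.09, power=0.95)
print(f"power of original study: {original_power:.2f}")
print(f"total n for 95% power: {2 * needed_per_group}")

# p-curve example: expected shares of p-values at 60% power.
share_low, share_high = p_curve_shares(0.60)
print(f"p < .025: {share_low:.2f}, .025 <= p < .05: {share_high:.2f}")
```

Under these assumptions the power of the original study comes out close to the 8% on the slide, and the two p-curve shares come out at about 49% and 11%, matching the elderly priming slide.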