THE IMPORTANCE OF DATA ACCESS AND RESEARCH TRANSPARENCY
Vera E. Troeger
Department of Economics and CAGE, University of Warwick
Editor-in-Chief, Political Science Research and Methods (PSRM), the Journal of the EPSA
DART in Switzerland, November 7, 2014

TRUST IS GOOD, CONTROL IS BETTER?
TRUST IS GOOD, CONTROL IS BETTER!
TRUST IS GOOD, SELF-CONTROL IS BETTER!

Academic fraud has developed into an endemic disease. Disciplines working with experimental data (especially medicine and psychology) face the question of which results can still be trusted.

WHY FRAUD?
- Like doping in sports, fraud allows researchers to reach their goals (publications, citations, tenure, promotion...) faster.
- With a low probability of detection, the incentives for cheating are high.
- The costs are borne by honest academics, both personally (competition) and as a profession (reputation).

AN ANECDOTE
On Wednesday, 5 November, the FAZ published an article titled "Der Doktortitel gehoert abgeschafft" ("The doctoral title should be abolished"). The argument: abolition would put an end to people trying to cheat their way to a PhD and the doctoral title. In Germany especially, this debate is part of the new trend towards "DART": transparency, academic honesty, and data access. Starting with "VroniPlag", many PhD dissertations (especially those of prominent politicians) have been proven to be plagiarised. Academic fraud and misconduct are thus real issues, and not just in plagiarised dissertations. But proposing to eliminate the very proof of academic qualification (the PhD dissertation) is like eliminating speed limits because drivers violate them all the time...

THUS: CONTROL IS BETTER!

ACADEMIC FRAUD – SOME FACTS: THE TIP OF THE ICEBERG

[Figure: Number of withdrawn articles by publication year. Source: Web of Science. Note: the decrease after 2007 reflects not a reduction in cases but the time it takes from publication to detection of fraud.]

WITHDRAWN ARTICLES BY DISCIPLINE, 1985-2013

Most affected disciplines:
  Discipline          Number   Cases per 1000 articles
  Molecular Biology      293   0.039
  Cell Biology           191   0.062
  Chemistry              187   0.047
  Oncology               145   0.056
  Physics                143   0.040

Selected disciplines for comparison:
  Psychology              81   0.030
  Ecology                 66   0.051
  Economics               34   0.028
  Sociology               10   0.011
  Political Science        3   0.005

Source: Web of Science

WITHDRAWN ARTICLES BY RESEARCH LOCATION, 1985-2013

  Country            Number   Cases per 1000 articles
  India                 183   0.177
  China                 488   0.170
  Singapore              24   0.115
  Japan                 181   0.086
  The Netherlands        90   0.084
  Germany               211   0.073
  USA                   799   0.057
  Italy                  85   0.049
  UK                    133   0.047
  Switzerland            32   0.046
  Canada                 75   0.037
  France                 66   0.029

Source: Web of Science

(The rate column normalises raw counts by publication volume; a short sketch of the calculation follows the tables.)
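The "cases per 1000 articles" column divides raw retraction counts by each field's total publication output, which is why Molecular Biology has the most withdrawn articles while Cell Biology has the highest rate. A minimal sketch of that arithmetic, assuming hypothetical article totals (the tables report only counts and rates, so the totals below are back-solved for illustration and are not Web of Science figures):

```python
# Cases per 1000 articles = 1000 * withdrawn / total published.
# Totals are hypothetical, back-solved from the table above;
# only the withdrawn counts come from the source tables.
disciplines = {
    # name: (withdrawn articles, assumed total articles 1985-2013)
    "Molecular Biology": (293, 7_513_000),
    "Cell Biology":      (191, 3_081_000),
    "Political Science": (3,     600_000),
}

for name, (withdrawn, total) in disciplines.items():
    rate = 1000 * withdrawn / total
    print(f"{name:18s} {withdrawn:3d} withdrawn -> {rate:.3f} per 1000")
```

Raw counts alone would make Molecular Biology look like the worst offender; normalising by output shows that Cell Biology is more affected relative to its size.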
A SPECTACULAR CASE
Jan Hendrik Schoen, a physicist who received his PhD from the University of Konstanz (and lost it after the fraud became public), published 45 articles in 2001, 17 of them in Science and Nature alone, the flagship journals of the natural and life sciences. Sixteen of these articles have since been withdrawn due to proven fraud.

The media has caught on to spectacular cases of academic fraud, but these are pathological rather than representative. Spectacular cases reveal the weaknesses of academia, yet they obscure the much more common, lower-level academic misconduct. Typically, researchers do not invent results or data but tweak results in order to confirm theories and hypotheses:
- selective choice of cases
- filling in missing values at will
- strategic choice of estimation procedures and model specifications
These more subtle forms of academic fraud are much more prevalent. They are also much harder to detect – not least by the typical peer review process.

WHY IS CONTROL IMPORTANT?
Research produces (positive) results whose value hinges on our credibility and reputation. We need to maintain that credibility and reputation by implementing self-control mechanisms that prevent academic fraud and misconduct; DART is such an initiative. We cannot leave this to the (criminal) justice system, since the fraud of a few produces negative externalities for the whole profession.

DETECTION OF FRAUD?
It seems almost impossible to detect this kind of subtle fraud through the typical peer review process, supposedly the main instrument of quality assurance in the academic profession:
- In most cases, authors do not have to provide their data to reviewers (for good reason? – original data, sensitive data, personalised data).
- The peer review process only evaluates the plausibility of results; it assumes honesty.

SOLUTIONS...
Like Ulysses, researchers have to bind themselves "to the mast":
- plagiarism screening
- registration
- publication
- replication
- robustness
Solutions have to increase the perceived probability of detection for the individual researcher.

PLAGIARISM
Publishers can easily integrate plagiarism-detection software into their online submission systems to screen articles and books for potential copying of existing work without proper citation (a sketch of the underlying technique follows below).
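Most screening tools rest on the same core idea: split each document into overlapping word n-grams ("shingles") and flag submissions whose shingle sets overlap heavily with existing work. A minimal, self-contained sketch of that technique (an illustration of the general approach, not the implementation of any particular commercial product):

```python
# Minimal plagiarism screen: 5-word shingles + Jaccard similarity.
# Illustrative sketch only; production systems add text
# normalisation, indexing, and far larger reference corpora.

def shingles(text: str, n: int = 5) -> set[tuple[str, ...]]:
    """Return the set of overlapping n-word shingles in `text`."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a: set, b: set) -> float:
    """Set overlap |A & B| / |A | B|, in [0, 1]."""
    return len(a & b) / len(a | b) if a | b else 0.0

def screen(submission: str, corpus: dict[str, str],
           threshold: float = 0.2) -> list[tuple[str, float]]:
    """Flag corpus documents whose shingle overlap with the
    submission exceeds `threshold`."""
    sub = shingles(submission)
    scores = [(doc_id, jaccard(sub, shingles(text)))
              for doc_id, text in corpus.items()]
    return [(d, s) for d, s in scores if s >= threshold]
```

A score near 1 means near-verbatim copying; where to set the flagging threshold is an editorial policy choice, since legitimate quotation also produces some overlap.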
REGISTRATION
(Disciplines less affected by spectacular fraud seem to be leading the way...)
- Political Science: EGAP – 80 experiments
- Economics: RCT Registry (American Economic Association) – 240 experiments
- Registration is increasing exponentially.
Platforms for the pre-registration of experimental designs ensure that registered experiments cannot be changed ex post to adapt the design to the results.
Problem: registration does not work if researchers regard experimentally generated data as private property that does not have to be published or made available to reviewers – researchers can still remove cases that do not fit...
CPS (Comparative Political Studies) is running a special issue with pre-registered analyses in which manuscripts are reviewed without their empirical results: a review process detached from results avoids publication bias.

PUBLICATION
- Make all data publicly available. Problem: original data, confidential data, personalised data.
- Improve data citation – data are intellectual products for which citation should be required (Mooney 2011).
- Increase incentives for scholars to publish data (citations count!).
- Publication of replication files (datasets and do-files) is necessary but not sufficient.

REPLICATION
Strengthen the review process with ACTUAL replication of empirical results. Journals are key! Journals need to publish null findings and replication studies.
PSRM (sorry, blatant self-advertisement...) requires successful replication by a data analyst before a manuscript is published:
- 90% of replication files do not produce the output reported in the submitted manuscript;
- 10% have serious problems;
- 5% have to be withdrawn because the empirical results cannot be replicated.

IS REPLICATION ENOUGH?
Probably not. The Excel spreadsheet mistakes of Reinhart and Rogoff, as well as the question of how to treat missing values in the Piketty case (raised by the Financial Times), show that simple replication of results will remain insufficient to prevent fraud...

ROBUSTNESS
Robustness checks can close part of the gap – over the last decade they have become increasingly standard in the social sciences. Problem: feasibility – who should check the robustness of results, and at what stage of the process?
Robustness checks do not just replicate empirical results; they take into account that researchers must make many decisions about estimation and specification. Many published studies, however, read as if the presented empirical specification were the only plausible one. Robustness checks assume that alternative specifications are no less plausible and test whether results and conclusions hold under alternative assumptions.
At the moment, authors decide which robustness tests to include... In the future, publishers and editors should adopt joint policies and agree on which robustness checks are necessary (a sketch of such a check follows below).
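Concretely, a robustness check re-estimates the model of interest under alternative, similarly plausible specifications and asks whether the headline coefficient survives. A minimal sketch using simulated data and statsmodels (all variable names, specifications, and data here are hypothetical, chosen purely for illustration):

```python
# Robustness sketch: re-estimate the same effect under alternative
# specifications and compare the coefficient of interest.
# Data and variable names (y, x, z1, z2) are invented.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 500
df = pd.DataFrame({"x": rng.normal(size=n),
                   "z1": rng.normal(size=n),
                   "z2": rng.normal(size=n)})
df["y"] = 0.5 * df["x"] + 0.3 * df["z1"] + rng.normal(size=n)

# Alternative specifications a sceptical reviewer might demand:
specs = {
    "baseline":       "y ~ x",
    "add control z1": "y ~ x + z1",
    "add control z2": "y ~ x + z2",
    "full model":     "y ~ x + z1 + z2",
}

for label, formula in specs.items():
    fit = smf.ols(formula, data=df).fit()
    b, se = fit.params["x"], fit.bse["x"]
    print(f"{label:15s} beta_x = {b:.3f} (se {se:.3f})")
# If beta_x changes sign or loses significance across specs,
# the headline conclusion is not robust.
```

If the coefficient on x changes sign or loses significance across specifications, the published conclusion rests on a specification choice rather than on the data.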
CONTROL
Scientific progress is important, and systematic control to prevent potential fraud is necessary. The scientific community, publishers, and journals need to provide the resources for an infrastructure that increases the probability of detecting academic fraud – much more so than is currently the case. DART is an important step in the right direction.

(Some of the information above was graciously provided by Prof. Thomas Pluemper, Essex – please also consult his FAZ article "Vertrauen ist gut, Kontrolle ist besser" ("Trust is good, control is better") of 20 August 2014.)