(Why) do some fields of science progress faster than others?

IceLab Camp, September 17-20, 2012

Kristian Moss Bendtsen
Lauri Kovanen
Christoph Weise

Proposal duration: 36 months

Proposal summary

Some disciplines, particularly the social sciences, have historically suffered from slow progress. Some reasons for this are known (it is, for example, impossible to run experiments in macroeconomics or evolution research), but little research has been done on measuring the progress of science. There are, however, significant benefits in understanding more closely why, how and when a field of science makes progress. This would benefit both the individuals doing the science and the grant agencies funding them. We propose a set of new quantities for measuring the progress and convergence of fields of science, as well as experimental setups for measuring their values. The experiments are based on semantic analysis of research publications and on analysis of the web of citations formed by the articles in a given field. We plan to apply these methods to large electronic data sets that have recently become available.

Background

There is anecdotal evidence that some fields of science progress faster than others. For example, many human sciences are still trying to answer the same questions that they asked 50 years ago; in some cases there is not even agreement on what questions should be answered. In high energy physics, on the other hand, there is a very high degree of agreement on what questions are important, to the extent that thousands of scientists agreed to spend billions of euros on building a special device, the Large Hadron Collider, to answer them.

Specific aims of the proposed work

Quantitative measurements for scientific progress

Progress can be defined as the amount of new information produced; new information should take the field forward, not sideways. If it is not possible to measure progress directly, we can measure other, related quantities.
For example, using both citation data and semantic analysis it should be possible to measure the convergence of a scientific field. This is useful if convergence can be shown to be a proxy for progress.

Understanding factors behind progress

The classic way of creating new information is through falsification of hypotheses. Some fields have a stronger tradition of formulating testable hypotheses than others. Is progress correlated with the use of falsification? Also, before constructing experiments to falsify hypotheses, one needs to come up with ideas to test. The rate of generating ideas and the rate of falsifying them might both be important.

Understanding limitations to progress

After identifying the features that separate fields with fast and slow progress (e.g. the use of falsification), it makes sense to ask why the good features do not appear in other fields. The reasons might be cultural, but it might also turn out that the best practices are not equally applicable to all fields. There may also be multiple methods for creating new information.

Falsification and models

Building computational models is an increasingly common approach in science. Models are often neither falsified nor falsifiable; in some cases two alternative models are proposed that fit the data equally well. Under what circumstances is it acceptable to use models that cannot be falsified?

Significance

We will compare different scientific disciplines and, using the methods of semantic analysis, try to discern why some fields are more successful than others. Progress in science is determined by the accuracy of its output and the speed with which knowledge is generated. The extent of agreement among peers will serve as a proxy for accuracy. By developing and implementing metrics for the progress of science we hope to understand why particular disciplines progress slowly and to discover better ways to do science. The reasons for slow progress might be methodological, cultural or experimental.
The insights should assist grant agencies in allocating funds toward projects that are likely to produce valuable new information efficiently. The benefits of success would reach beyond academia. For instance, greater progress in social sciences such as economics, and particularly a concrete understanding of the principles (theories) of macroeconomics, could lead to advances in forecasting and help prevent catastrophic events in the future.

Methods

Scientific publications provide a record of the data and ideas generated within a field. Combining citation analysis and semantic analysis on the vast record of papers has proven to reliably identify distinct patterns in publication data, e.g. by mapping the usage of jargon (semantics) within scientific disciplines onto a corresponding citation network (citation). To quantify the progress of science, we therefore propose to use similar methods to identify the emergence of new ideas, hypothesis testing and the acceptance of ideas.

Theoretical work

We will carry out case studies to find out why science has in some cases advanced with great speed (e.g. the discovery of DNA, the understanding of neuroanatomy, and communication theory) and in other cases not (management science, postmodern studies). The theoretical work will be used to refine the measures of progress and to gain a better understanding of the underlying factors.

Semantic analysis

To quantify the rate at which new ideas emerge, we will first choose 20 well-established ideas in each of 5 different fields, each represented by a paper. For each of these 100 papers we will select the first 5 papers citing it. Commonalities in the language used by these novel articles and their “first citers” can be found by performing statistical tests against 500 random papers. Naively put, the word “brilliant” might occur in a first citer but probably not in a random scientific paper. The same procedure can be used to identify papers where a hypothesis is being tested.
A priori we would expect such papers to use words such as “falsify”, “challenge” and “prove”.

Convergence

Using semantic analysis we can identify clusters of words that reflect the acceptance of an idea (“It is generally known that…”) and others that correlate with disagreement (“It is not well understood that…”). Looking into the dynamics of how ideas move from one classification to the other allows us to rank fields according to how fast they reach consensus. We expect to see a correlation between the use of falsification and the speed with which an idea becomes accepted in a field.

Time plan

Definition of measures: months 1-12
Collection of data: months 4-8
Semantic analysis: months 6-24
Convergence analysis: months 6-24
Writing up results: months 24-36
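
As an illustration of the semantic analysis step described under Methods, the following minimal Python sketch ranks words by how much more often they appear in first citers than in random papers. The proposal does not specify which statistical test is to be used; the smoothed log-odds ratio of document frequencies shown here is an assumption made for illustration, as are the function names and the toy texts, which are invented and not real data.

```python
import math
import re
from collections import Counter


def document_frequencies(texts):
    """Count, for each word, the number of texts that contain it at least once."""
    counts = Counter()
    for text in texts:
        counts.update(set(re.findall(r"[a-z]+", text.lower())))
    return counts


def enriched_words(first_citers, random_papers, top=5):
    """Rank words by the smoothed log-odds of appearing in a first citer
    rather than in a random paper (hypothetical helper, not from the proposal)."""
    df_first = document_frequencies(first_citers)
    df_rand = document_frequencies(random_papers)
    n_first, n_rand = len(first_citers), len(random_papers)
    scores = {}
    for word in set(df_first) | set(df_rand):
        # Adding 0.5 to every cell keeps the odds finite when a word
        # is missing from one of the two groups.
        a = df_first[word] + 0.5
        b = n_first - df_first[word] + 0.5
        c = df_rand[word] + 0.5
        d = n_rand - df_rand[word] + 0.5
        scores[word] = math.log((a / b) / (c / d))
    # Highest score first; ties are broken alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


# Toy stand-ins for paper texts (invented for illustration only).
first_citers = [
    "this brilliant idea lets us test the hypothesis",
    "a brilliant result that we test further",
]
random_papers = [
    "we measure the sample and report the result",
    "the method is applied to a new sample",
    "we report the data",
]

ranking = enriched_words(first_citers, random_papers)
for word, score in ranking:
    print(f"{word}: {score:.2f}")
```

In a real study the two lists would hold the 500 first citers and the 500 random papers, and a proper significance test with correction for multiple comparisons would likely replace the simple ranking.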