The Impact of Biomedical Knowledge Accumulation on Mortality: A

NBER WORKING PAPER SERIES
THE IMPACT OF BIOMEDICAL KNOWLEDGE ACCUMULATION ON MORTALITY:
A BIBLIOMETRIC ANALYSIS OF CANCER DATA
Frank R. Lichtenberg
Working Paper 19593
http://www.nber.org/papers/w19593
NATIONAL BUREAU OF ECONOMIC RESEARCH
1050 Massachusetts Avenue
Cambridge, MA 02138
October 2013
The views expressed herein are those of the author and do not necessarily reflect the views of the National
Bureau of Economic Research.
NBER working papers are circulated for discussion and comment purposes. They have not been peerreviewed or been subject to the review by the NBER Board of Directors that accompanies official
NBER publications.
© 2013 by Frank R. Lichtenberg. All rights reserved. Short sections of text, not to exceed two paragraphs,
may be quoted without explicit permission provided that full credit, including © notice, is given to
the source.
The Impact of Biomedical Knowledge Accumulation on Mortality: A Bibliometric Analysis
of Cancer Data
Frank R. Lichtenberg
NBER Working Paper No. 19593
October 2013
JEL No. J11,O33,O40
ABSTRACT
I examine the relationship across diseases between the long-run growth in the number of publications
about a disease and the change in the age-adjusted mortality rate from the disease. The diseases analyzed
are almost all the different forms of cancer, i.e. cancer at different sites in the body (lung, colon, breast,
etc.). Time-series data on the number of publications pertaining to each cancer site were obtained
from PubMed. For articles published since 1975, it is possible to distinguish between publications
indicating and not indicating any research funding support.
My estimates indicate that mortality rates: (1) are unrelated to the (current or lagged) stock of publications
that had not received research funding; (2) are only weakly inversely related to the contemporaneous
stock of published articles that received research funding; and (3) are strongly inversely related to
the stock of articles that had received research funding and been published 5 and 10 years earlier.
The effect after 10 years is 66% larger than the contemporaneous effect. The strong inverse correlation
between mortality growth and growth in the lagged number of publications that were supported by
research funding is not driven by a small number of outliers.
Frank R. Lichtenberg
Columbia University
504 Uris Hall
3022 Broadway
New York, NY 10027
and NBER
[email protected]
3
I.
Introduction
Many people and organizations have expressed the view that biomedical research has
yielded substantial improvements in longevity and health. Nabel (2009) said that “biomedical
research provides the basis for progress in health and health care.” Moses and Martin (2011)
said that “since 1945, biomedical research has been viewed as the essential contributor to
improving the health of individuals and populations, in both the developed and developing
world.” Cutler, Deaton and Lleras-Muney (2006) “tentatively identif[ied] the application of
scientific advance and technical progress (some of which is induced by income and facilitated by
education) as the ultimate determinant of health.” The Federation of American Societies for
Experimental Biology (2013) said that “research in the biomedical sciences has generated a
wealth of new discoveries that are improving our health, extending our lives and raising our
standard of living.” The National Institutes of Health (NIH) said that “in the last 25 years, NIH‐
supported biomedical research has directly led to human health benefits that both extend lifespan
and reduce illnesses” (NIH (2013a)). The Australian Government (2013) said that “the purpose
of health and medical research (HMR) is to achieve better health for all Australians. Better
health encompasses increased life expectancy, as well as social goals such as equity,
affordability and quality of life.”
The hypothesis that biomedical research has yielded substantial improvements in
longevity and health has been examined using two kinds of evidence. The first type of evidence
consists of qualitative “case studies” of specific diseases. NIH (2013b, 2013c) describes the
impacts of its long-term efforts to understand, treat, and prevent chronic diseases (including
cardiovascular disease, cancer, diabetes, depression), and how it has worked to combat infectious
diseases such as HIV/AIDS and influenza by helping to develop new therapies, vaccines,
diagnostic tests, and other technologies.
The second kind of evidence is indirect, (partially) econometric evidence. This evidence
is indirect because it is based on evidence about two links in the following causal chain:
biomedical research  new drugs, devices, and procedures  longevity and health
Regarding the first link: the National Cancer Institute (NCI) says that “approximately one half of
the chemotherapeutic drugs currently used by oncologists for cancer treatment were discovered
and/or developed at NCI” (NCI (2013)), and Sampat and Lichtenberg (2011) demonstrated that
4
new drugs often build on upstream government research. Regarding the second link: a number
of studies have examined the impact of the introduction and use of new drugs, devices, and
procedures on longevity and health.1 For example, Lichtenberg (2011) analyzed the impact of
new drugs and imaging procedures on longevity in the U.S. using longitudinal state-level data;
Lichtenberg (2013a) analyzed the impact of new drugs on longevity in France using longitudinal
disease-level data; and Lichtenberg (2013b) analyzed the impact of therapeutic procedure
innovation on hospital patient longevity in Western Australia using patient-level data.
In this paper, I will use a different econometric approach to assess the impact that
biomedical research has had on longevity: a direct examination of the relationship across
diseases between the long-run growth in the number of research publications and the change in
the mortality rate (in most cases controlling for the disease incidence rate). I hypothesize that the
growth in the number of research publications about a disease is a useful indicator of the growth
in knowledge about the disease. As the National Science Foundation (NSF) says, “Research
produces new knowledge, products, or processes. Research publications reflect contributions to
knowledge” (NSF (2013)). In his model of endogenous technological change, Romer (1990)
hypothesized an aggregate production function such that an economy’s output depends on the
“stock of ideas” that have previously been developed, as well as on the economy’s endowments
of labor and capital. The mortality model that I will estimate may be considered a health
production function, in which the mortality rate is an (inverse) indicator of health output or
outcomes, and the cumulative number of publications is analogous to the stock of ideas.
Previous research on the agricultural and manufacturing sectors of the economy has
found that counts of publications are useful indicators of the stock of knowledge. Evenson and
Kislev (1973) used the publication of crop-specific scientific papers as a measure of agricultural
research output in 75 wheat- and maize- growing countries to explain increases in yield per unit
land in these crops over the period 1948-68. They observed a strong and persistent relationship
between agricultural research and biological productivity-yield in wheat and maize. This
relationship existed both “between” countries and “within” countries over time. Adams (1990)
utilized article count data in each science as measures of knowledge in his analysis of
productivity growth in two-digit manufacturing industries during the period 1949-83.
1
Fuchs (2010) stated that “since World War II…biomedical innovations (new drugs, devices, and procedures) have
been the primary source of increases in longevity.”
5
The diseases we will analyze are almost all the different forms of cancer, i.e. cancer at
different sites in the body (lung, colon, breast, etc.). About one-fourth of U.S. deaths during the
period 1999-2010 were due to cancer. The main reason we focus on cancer is that the NCI
publishes annual data on cancer incidence2 as well as on cancer mortality, by cancer site.
Incidence data aren’t available for most other diseases. A less important reason is that the NCI
uses a uniform cancer site classification scheme for data covering the entire period 1975-present.
There were significant changes in the disease classification scheme for other diseases between
1998 and 1999, when the system used to classify underlying cause of death was changed from
the International Classification of Diseases (ICD) Ninth Revision to the ICD Tenth Revision. As
the CDC (2013) notes, the two classification schemes are different enough to make direct
comparisons of cause-of-death difficult.
In the next section, I will briefly describe the biomedical publications data I will use. In
Section III, I develop the econometric model I will use to investigate the impact of contributions
to knowledge (as measured by publication counts) on cancer mortality rates. Descriptive
statistics will be presented in Section IV. Estimates of the econometric model will be presented
in Section V. Section VI provides a summary and conclusions.
II.
Biomedical publications data
Time-series data on the number of publications pertaining to each cancer site were
obtained from PubMed, a database developed by the National Center for Biotechnology
Information (NCBI) at the National Library of Medicine (NLM), one of the institutes of the
National Institutes of Health (NIH). The database was designed to provide access to citations
(with abstracts) from biomedical journals. PubMed's primary data resource is MEDLINE, the
NLM's premier bibliographic database covering the fields of medicine, nursing, dentistry,
veterinary medicine, the health care system, and the preclinical sciences, such as molecular
biology. MEDLINE contains bibliographic citations and author abstracts from about 4,600
biomedical journals published in the United States and 70 other countries. The database contains
about 12 million citations dating back to the mid-1960s. Coverage is worldwide, but most
2
A cancer incidence rate is the number of new cancers of a specific site/type occurring in a specified population
during a year, usually expressed as the number of cancers per 100,000 population at risk.
6
records are from English-language sources or have English abstracts. In addition to MEDLINE
citations, PubMed provides access to non-MEDLINE resources, such as out-of-scope citations,
citations that precede MEDLINE selection, and PubMed Central (PMC) citations.3
A controlled vocabulary of biomedical terms, the NLM Medical Subject Headings
(MeSH), is used to describe the subject of each journal article in MEDLINE. MeSH contains
approximately 26 thousand terms and is updated annually to reflect changes in medicine and
medical terminology. MeSH terms are arranged hierarchically by subject categories with more
specific terms arranged beneath broader terms.4 PubMed allows one to view this hierarchy and
select terms for searching in the MeSH Database. Skilled subject analysts examine journal
articles and assign to each the most specific MeSH terms applicable—typically ten to twelve.
Applying the MeSH vocabulary ensures that articles are uniformly indexed by subject, whatever
the author's words (National Center for Biotechnology Information (2013)). Figure 1 shows an
abridged sample of a PubMed bibliographic citation. I use three attributes (search fields) in the
citation: the date of publication (line 8), the MeSH headings (lines 27-36), and the Publication
Type (lines 18-20).
For articles published since 1975, the Publication Types identify U.S. government and
non-U.S. government5 financial support of the research that resulted in the published papers
when that support is mentioned in the articles (National Library of Medicine (2013b)). Figure 2
shows data on the number of PubMed publications pertaining to cancer that were published
during the period 1975-2009, by extent and source of research support. Cancer was one of the
main topics discussed (i.e., cancer was a “MeSH Major Topic”) in about 1.5 million articles
published during this period. About 30% of these articles mentioned either U.S. government
support, non-U.S. government support, or both. 20% of the articles indicating any research
funding support mentioned only U.S. government support; 63% of the articles indicating any
research funding support mentioned only non-U.S. government support; and 17% of the articles
3
Together, these are often referred to as “PubMed-only citations.” Out-of-scope citations are primarily from general
science and chemistry journals that contain life sciences articles indexed for MEDLINE, e.g., the plate tectonics or
astrophysics articles from Science magazine. Publishers can also submit citations with publication dates that precede
the journal's selection for MEDLINE indexing, usually because they want to create links to older content. PMC
citations are taken from life sciences journals (MEDLINE or non-MEDLINE) that submit full-text articles to PMC.
4
The MeSH Tree Structure can be browsed online; see National Library of Medicine (2013a).
5
Non-U.S. government financial support includes support by American societies, institutes, state governments,
universities, and private organizations, and by foreign sources (national, departmental, provincial, academic &
private organizations).
7
indicating any research funding support mentioned both U.S. government and non-U.S.
government support. This distribution of funding support by source is quite consistent with data
compiled by Research!America (shown in Figure 3) on the distribution of 2011 U.S. biomedical
and health R&D spending, by source of funding. The Research!America data indicate that the
Federal government accounted for 29% of 2011 U.S. biomedical and health R&D spending. If
we assume that the U.S. government deserves “half the credit” for articles that mentioned both
U.S. government and non-U.S. government support, we can say that the U.S. government
support accounted for 28.5% (= 20% + (17% / 2)) of the funding support for articles that
received any funding support.
Our ability to distinguish between publications indicating and not indicating any research
funding support will allow us to test the hypothesis that an increase in the number of publications
indicating any research funding support has a larger (more negative) effect on mortality than an
increase in the number of publications not indicating any research funding support; the latter may
even have no effect. In principle, our ability to also distinguish between publications indicating
U.S. government and non-U.S. government funding support could also allow us to separately
examine the effects of both kinds of research funding on mortality. However, since almost half
of the articles acknowledging U.S. government support also acknowledged non-U.S. government
support, disentangling the effects of the two kinds of research funding on mortality may be
difficult.
The PubMed database indicates the year of publication of each article, but not the year(s)
in which research funding occurred (for articles that acknowledged research funding). However
the NIH Reporter database (NIH (2013)) enables us to determine the start dates of NIH projects
that yielded PubMed articles, as well as the publication dates of those articles. Hence, we can
analyze the frequency distribution of the lag between project start date and the publication date
of articles. The distribution of NIH-supported articles, by lag between project start date and
publication date, is shown in Figure 4.6 The median lag from project start to article publication
is about six years. However, since this figure is based on right-censored data—articles that were
or will be published after 2011 are excluded—six years should be considered a lower-bound
estimate of the median lag from project start to article publication.
6
Figure 4 is based on data on almost all NIH-supported articles published during 1985-2011 (N = 323,196), not just
articles about cancer.
8
When former NIH Director Harold Varmus testified before Congress in 1998, he said that
“the benefits of research are unpredictable…Although basic research projects initially may
appear to be unrelated to any specific disease, findings from this research ultimately may prove
to be a critical turning point in a long chain of discoveries leading to improved health” (Varmus
(1998)). Determining whether or not a research project is applicable to a specific disease is
therefore likely to be far easier six or more years after the project began (and articles are
published) than it was when the project started.
III.
Econometric model
Two types of statistics are often used to assess progress in the “war on cancer”: survival
rates and mortality rates. Survival rates are typically expressed as the proportion of patients alive
at some point subsequent to the diagnosis of their cancer. For example, the observed 5-year
survival rate is defined as follows:
5-year Survival Rate = Number of people diagnosed with cancer at time t alive at time
t+5 / Number of people diagnosed with cancer at time t
= 1 – (Number of people diagnosed with cancer at time t dead at time t+5 / Number of
people diagnosed with cancer at time t)
Hence, the survival rate is based on a conditional (upon previous diagnosis) mortality
rate. The second type of statistic is the unconditional cancer mortality rate: the number of
deaths, with cancer as the underlying cause of death, occurring during a year per 100,000
population.
The 5-year relative survival rate from cancer has increased steadily since the mid-1970s,
from 49.1% for people diagnosed during 1975-1977 to 67.6% for people diagnosed during 20012008. Although this increase suggests that there has been significant progress in the war against
cancer, it might simply be a reflection of (increasing) lead-time bias. Lead time bias is the bias
that occurs when two tests for a disease are compared, and one test (the new, experimental one)
diagnoses the disease earlier, but there is no effect on the outcome of the disease--it may appear
that the test prolonged survival, when in fact it only resulted in earlier diagnosis when compared
to traditional methods. Welch et al (2000) argued that “improving 5-year survival over
time…should not be taken as evidence of improved prevention, screening, or therapy.” They
9
argued that “while 5-year survival is a perfectly valid measure to compare cancer therapies in a
randomized trial, comparisons of 5-year survival rates across time (or place) may be extremely
misleading. If cancer patients in the past always had palpable tumors at the time of diagnosis
while current cancer patients include those diagnosed with microscopic abnormalities, then 5year survival would be expected to increase over time even if new screening and treatment
strategies are ineffective. To avoid the problems introduced by changing patterns of diagnosis,
observers have argued that progress against cancer be assessed using population-based mortality
rates.” Therefore, the dependent variable I will analyze will be the unconditional cancer
mortality rate, rather than a variable based on the survival rate.7
The unconditional cancer mortality rate is essentially the unconditional probability of
death from cancer (P(death from cancer)). The law of total probability implies the following:
P(death from cancer) = P(death from cancer | cancer diagnosis) * P(cancer diagnosis)
+ P(death from cancer | no cancer diagnosis) * (1 – P(cancer diagnosis))
(1)
The probability of dying from cancer is much lower than the probability of being diagnosed with
cancer: in 2006, the cancer incidence rate was 2.5 times as high as the cancer mortality rate.8
This suggests that the probability that a person who has never been diagnosed with cancer dies
from cancer is quite small: P(death from cancer | no cancer diagnosis) ≈ 0. In this case, eq. (1)
reduces to
P(death from cancer) ≈ P(death from cancer | cancer diagnosis) * P(cancer diagnosis)
(2)
Hence
ln P(death from cancer) ≈ ln P(death from cancer | cancer diagnosis)
+ ln P(cancer diagnosis)
(3)
7
I will control for cancer incidence (by including it in the mortality equation), but in a completely unrestrictive
manner. If changes in incidence are merely due to lead-time bias, the coefficient on incidence should be zero.
8
2006 U.S. age-adjusted incidence and mortality rates were 456.2 and 181.1, respectively.
10
I hypothesize that the conditional mortality rate (P(death from cancer | cancer diagnosis))
is inversely related to the (current or lagged) stock of useful knowledge about cancer.9 The stock
of knowledge is not directly observable, but I also hypothesize that the cumulative number of
scientific publications is a meaningful indicator of the stock of knowledge.
ln P(death from cancer | cancer diagnosis) =  ln(cum_pubst-k)
(4)
Substituting (4) into (3),
ln P(death from cancer) ≈  ln(cum_pubst-k)+ ln P(cancer diagnosis)
(5)
To assess the impact of biomedical research on cancer mortality, I will estimate the
following difference-in-differences version of eq. (5), based on longitudinal, cancer-site-level
data on about 45 cancer sites:10
ln(mort_ratest) =  ln(cum_pubss,t-k) +  ln(inc_ratest) + s + t + st
(6)
mort_ratest = the age-adjusted mortality rate from cancer at site s (s = 1,…, 47) in
year t (t=1995,…,2009)
cum_pubss,t-k = the number of PubMed articles published by the end of year t – k
that were about cancer at site s
inc_ratest = the age-adjusted incidence rate of cancer at site s in year t
s = a fixed effect for cancer site s
t = a fixed effect for year t
st = a disturbance
The fixed year effects control for time-varying factors that influence cancer mortality rates in
general.
9
The stock of useful knowledge may also affect the probability of diagnosis.
Galiani et al (2005) used a difference-in-differences model to assess the impact of privatization of water services
on child mortality in Argentina. They estimated their model using data classified by region and year, whereas the
data I will use are classified by disease and year. Their “treatment variable” (whether water services were publicly
or privately provided) was discrete, whereas my treatment variable (stocks of publications) are continuous.
10
11
I will estimate models based on eq. (6) using three alternative values of k: 0, 5, and 10.11
For concreteness, suppose that k = 10. Now, let’s write specific versions of eq. (6) for the first
and last years of the sample period (t = 1995 and t = 2009):
ln(mort_rates,1995) =  ln(cum_pubss,1985) +  ln(inc_rates,1995) + s + 1995 + s,1995
(7)
ln(mort_rates,2009) =  ln(cum_pubss,1999) +  ln(inc_rates,2009) + s + 2009 + s,2009
(8)
Subtracting eq. (7) from eq. (8),
ln(mort_rates,2009 / mort_rates,1995) =  ln(cum_pubss,1999 / cum_pubss,1985) +
 ln(inc_rates,2009 / inc_rates,1995) + (2009 - 1995) + (s,2009 - s,1995)
(9)
or
ln(mort_rates) = ln(cum_pubss) + ln(inc_rates) + ’ + s’
(10)
where
ln(mort_rates)
ln(cum_pubss)
ln(inc_rates)
’
= ln(mort_rates,2009 / mort_rates,1995)
= ln(cum_pubss,1999 / cum_pubss,1985)
= ln(inc_rates,2009 / inc_rates,1995)
= (2009 - 1995)
The cancer-site fixed effects that were included in the “within” model (eq. (6)) are no longer
present in the “long-difference” model (eq. (10)); the intercept of eq. (10) is the difference
between the initial- and end-year year fixed effects. In this simple model, the long-run growth of
the age-adjusted cancer mortality rate depends on the long-run growth of the (lagged) cumulative
number of publications, the long-run growth of the age-adjusted cancer incidence rate, and a
constant.
Eq. (10) can easily be generalized to allow for two or three different stocks of
publications:
ln(mort_rates) = RESEARCH ln(cum_research_pubss)
+ NON-RESEARCH ln(cum_non_research_pubss) +  ln(inc_rates) + ’ + s’
(11)
 ln(mort_rates) = RESEARCH_US_GOV ln(cum_US_gov_research_pubss)
11
Since data on financial support of research that resulted in published papers begin in 1975, it is not practical to
specify longer lags (k > 10).
12
+ RESEARCH_OTHER ln(cum_other_research_pubss)
+ NON-RESEARCH ln(cum_non_research_pubss) + ln(inc_rates) + ’ + s’
(12)
where
cum_research_pubss,t-k = the number of PubMed articles indicating any research
funding support published by the end of year t–k that were
about cancer at site s
cum_non_research_pubss,t-k = the number of PubMed articles not indicating any research
funding support published by the end of year t–k that were
about cancer at site s
cum_US_gov_research_pubss,t-k = the number of PubMed articles indicating US government
research funding support published by the end of year t–k that
were about cancer at site s
cum_other_research_pubss,t-k = the number of PubMed articles indicating non-US
government research funding support published by the end of
year t–k that were about cancer at site s
I will estimate eqs. 10-12 for three different values of k (0, 5, and 10). These equations
will be estimated via weighted least-squares, weighting by the mean mortality rate of cancer site
s during the period 1985-2009. Since the dependent variable is the log of the mortality rate, I am
analyzing percentage changes in the mortality rate. As shown in Figure 5, the data exhibit
heteroskedasticity: cancer sites with low average mortality rates exhibit much larger positive and
negative percentage changes in mortality rates than cancer sites with high average mortality
rates. Weighted least squares is appropriate in the presence of heteroskedasticity.
IV.
Descriptive statistics
Data on age-adjusted incidence and mortality rates were obtained from SEER Cancer
Query Systems (National Cancer Institute (2013b)). Incidence and mortality rates of all
malignant cancers combined during the period 1973-2009 are shown in Figure 6. Incidence and
mortality both increased between the mid-1970s and the early 1990s, when both began to
decline. Between 1992 and 2009, the incidence rate declined 9% and the mortality rate declined
19%.
Age-adjusted mortality and incidence rates in 1995 and 2009 and PubMed publication
counts 10 years earlier (in 1985 and 1999) for the top 18 cancer sites (ranked by mean mortality
13
rate) are shown in Table 1.12 Lung cancer had the largest mean mortality rate by far; it
accounted for more than one in four cancer deaths. Between 1995 and 2009, the lung cancer
incidence rate declined 12% and the lung cancer mortality rate declined 17%. The cumulative
number of PubMed publications about lung cancer (cum_pubs) approximately doubled between
1985 and 1999; the cumulative number of PubMed publications about lung cancer that cited any
research support (cum_research_pubs) more than tripled.
The second largest cancer (ranked by mean mortality rate) was colon cancer. The
incidence and mortality rates of colon cancer declined about twice as much as the incidence and
mortality rates of lung cancer: by 23% and 34%, respectively. But lagged cum_pubs and
cum_research_pubs increased more slowly for colon cancer than they did for lung cancer: by
77% and 139%, respectively.
The third largest cancer (ranked by mean mortality rate) was breast cancer. The breast
cancer incidence rate declined just 4%, while the breast cancer mortality rate declined by 29%.
Lagged cum_pubs and cum_research_pubs increased more for breast cancer than they did for
lung cancer: by 144% and 294%, respectively.
Weighted means, standard deviations, and correlation coefficients across 47 cancer sites
of 1995-2009 growth in mortality, incidence, and cumulative number of publications 10 years
earlier are shown in Table 2. Observations are weighted by mean mortality rate. The weighted
mean declines in mortality and incidence are consistent with the data shown in Figure 6. The
mean log change in publications acknowledging research funding (cum_research_pubs) was
almost twice as large as the mean log change in total publications (cum_pubs); this is at least
partly due to the fact that only articles published after 1974 include information about research
funding. The mean log change in publications acknowledging non-U.S. government research
funding (cum_other_research_pubs) was 81% larger than the mean log change in publications
acknowledging U.S. government research funding (cum_gov_research_pubs). This is consistent
with data compiled by Research!America, which indicate that the Federal government’s share of
U.S. biomedical R&D has been declining; it fell from 34% in 2002 to 29% in 2011.
As shown in the first row of correlation coefficients in Table 2, there is a significant
positive correlation across cancer sites between the growth in incidence and the growth in
12
Age-adjusted mortality and incidence rates and PubMed publication counts for the other 29 cancer sites not
included in Table 1 are shown in Appendix Table 1.
14
mortality: cancer sites with larger declines in incidence had larger declines in mortality. The
correlation between mortality growth and growth in non-research publications is insignificant,
but the correlations between mortality growth and growth in cum_research_pubs,
cum_gov_research_pubs, and cum_other_research_pubs are negative and significant. The
correlation between the growth of government and other research publications is quite high (r =
0.856), suggesting that disentangling the effects of the two kinds of research funding on
mortality may be difficult.
V.
Estimates of models of 1995-2009 growth of the age-adjusted cancer mortality rate
Weighted least-squares estimates of models of 1995-2009 growth of the age-adjusted
cancer mortality rate (eqs. 10-12) are shown in Table 3. The equations were estimated using
three alternative assumed values of the lag (k) from cumulative publications to the mortality rate:
0, 5, and 10 years. k = 0 in models 1-5; k = 5 in models 6-10; and k = 10 in models 11-15.
Model 1 is a simple regression of the growth in the mortality rate on the growth in
cum_pubs, i.e. the growth in the incidence rate is excluded. The coefficient on the growth in
cum_pubs is insignificant. Model 2 includes the growth in the incidence rate as well as the
growth in cum_pubs. In this model, the coefficient on the growth in cum_pubs is negative and
highly significant (and the coefficient on the growth in the incidence rate () is positive and
significant). This indicates that failure to control for the growth in incidence (which it is not
feasible to do for non-cancer diseases) may bias estimates of the coefficient on the growth in
cum_pubs () towards zero, because growth in the number of publications is positively
correlated across diseases with growth in incidence.13 In model 3, the growth in cum_pubs is
replaced by the growth in cum_research_pubs. The coefficient on the growth in
cum_research_pubs is also negative and highly significant. However, when we control (in
model 4) for the growth in cum_non_research_pubs, the estimate of RESEARCH is only marginally
significant (p-value = 0.092).14 Model 5 is an estimate of eq. (12), in which cum_research_pubs
13
The coefficient on incidence growth is positive, but (contrary to eq. (5) above) significantly less than one: a 10%
rise in incidence is associated with a 7.3% rise in mortality. This may be at least partly due to the fact that measured
incidence is a noisy indicator of true incidence, e.g. due to changing patterns of diagnosis and a changing degree of
lead-time bias.
14
As shown in Table 2, the correlation across cancer sites between growth in cum_ research_pubs and growth in
cum_non_research_pubs is quite high (0.647).
15
is disaggregated into cum_gov_research_pubs and cum_other_research_pubs. Neither
RESEARCH_US_GOV nor RESEARCH_OTHER is significant, which is not surprising given the high
correlation across cancer sites between the growth of government and other research
publications.
Models 6-10 are identical to models 1-5, except the assumed lag from cumulative
publications to the mortality rate is five years rather than zero years. The estimates of models 68 are similar to the estimates of models 1-3, but the contrast between models 9 and 4 (which
include both cum_research_pubs and cum_non_research_pubs) is interesting. Although
RESEARCH is only marginally significant (p-value = 0.092) in model 4, it is highly significant (pvalue = 0.012) in model 9. This means that although the mortality rate is only weakly inversely
related to the contemporaneous stock of publications that had received research funding
(controlling for the contemporaneous stock of publications that had not received research
funding), it is strongly inversely related to the stock of publications that had received research
funding 5 years earlier. Moreover, the magnitude of the point estimate of RESEARCH is 46%
larger in model 9 than it is model 4.
In models 11-15, the assumed lag from cumulative publications to the mortality rate is
ten years. As shown in Figure 7, the magnitude of the point estimate of RESEARCH in model 14 is
14% larger than it is in model 9, and 66% larger than it is in model 4.
Figure 8 shows the partial correlation across cancer sites between the 1985-1999 log
change in the number of research publications and the 1995-2009 log change in the mortality
rate, controlling for the 1995-2009 log change in the incidence rate. The figure is a plot of the
residuals from the weighted simple regression of ln(mort_rates) onln(inc_rates) against the
residuals from the weighted simple regression of ln(cum_research_pubss) onln(inc_rates),
where we assume a 10-year lag from cumulative publications to the mortality rate.15 The figure
suggests that the strong inverse correlation between mortality growth and growth in the lagged
number of publications that were supported by research funding is not being driven by a small
number of outliers. If we exclude lung cancer, which receives the greatest weight by far, from
the sample, the estimate of RESEARCH in model 13 hardly changes: RESEARCH = -0.285 (T = -3.22;
p-value = 0.003).
15
Figure 8 is a partial regression plot of model 13 in Table 3.
16
The magnitude of RESEARCH in model 13 is quite large. As shown in Table 2, the
weighted mean value of ln(cum_research_pubss) is 1.538. The average annual rate of increase
in lagged cum_research_pubs during 1995-2009 was 11.0% (= 1.538 / 14). Model 13 implies
that, during the period 1995-2009, the growth in the lagged number of publications that were
supported by research funding reduced the age-adjusted cancer mortality rate by 3.5% (= -0.319
* 11.0%) per year. During that period, the age-adjusted cancer mortality rate declined at an
average annual rate of 1.5%.16 This means that, in the absence of any growth in the lagged
number of publications that were supported by research funding, the age-adjusted cancer
mortality rate would have increased at an average annual rate of 2.0%. However, since there
was such rapid growth in the number of publications, estimating what would have happened in
the absence of any growth requires substantial out-of-sample prediction, which is certainly
subject to great uncertainty.
VI.
Summary and conclusions
Previous research on the agricultural and manufacturing sectors of the economy has
found that counts of publications are useful indicators of the stock of knowledge: they are
strongly positively correlated with productivity. In this paper, I have examined the relationship
across diseases between the long-run growth in the number of publications about a disease and
the change in the mortality rate from the disease.
The diseases I analyzed are almost all the different forms of cancer, i.e. cancer at
different sites in the body (lung, colon, breast, etc.). About one-fourth of U.S. deaths during the
period 1999-2010 were due to cancer. The main reason I focused on cancer is that the National
Cancer Institute publishes annual data on cancer incidence as well as on cancer mortality, by
cancer site. Failure to control for the growth in incidence (which it is not feasible to do for noncancer diseases) may bias estimates of the effect of publication growth towards zero, because
growth in the number of publications is positively correlated across diseases with growth in
incidence.
16
Eq. (13) implies that declining incidence accounted for about 1/6 of the decline in mortality.
17
Time-series data on the number of publications pertaining to each cancer site were
obtained from PubMed. For articles published since 1975, it is possible to distinguish between
publications indicating and not indicating any research funding support.
My estimates indicated that mortality rates: (1) are unrelated to the (current or lagged)
stock of publications that had not received research funding; (2) are only weakly inversely
related to the contemporaneous stock of published articles that received research funding; and (3)
are strongly inversely related to the stock of articles that had received research funding and been
published 5 and 10 years earlier. The effect after 10 years is 66% larger than the
contemporaneous effect. The strong inverse correlation between mortality growth and growth in
the lagged number of publications that were supported by research funding is not driven by a
small number of outliers.
Research!America (2013) estimates that U.S. biomedical and health R&D spending (from
all sources) declined by more than 3% in fiscal year 2011, and that this is the first drop in overall
spending since 2002. While most of that decrease reflects the end of American Recovery and
Reinvestment Act (ARRA) funding, which allocated $10.4 billion to the National Institutes of
Health over two fiscal years (2009-2010), federal funding declined beyond the drop attributable
to ARRA. In subsequent years, across-the-board cuts could cut billions more out of the federal
research budget. The White House Office of Management and Budget estimated that the NIH
alone could lose $2.53 billion in funding in fiscal year 2013. The evidence in this paper strongly
suggests that reductions in biomedical and health R&D spending will ultimately have an adverse
effect on U.S. longevity growth.
18
References
Adams JD (1990), “Fundamental Stocks of Knowledge and Productivity Growth,” Journal of
Political Economy 98 (4): 673-702, August.
Australian Government (2013), National Health and Medical Research Council, Meeting the
Challenges of the Next Decade to Improve Health through Research
https://www.nhmrc.gov.au/_files_nhmrc/file/research/nhmrc_submission_background_mckeon_
120417.pdf
Canese K, Jentsch J, Myers C, PubMed: The Bibliographic Database, Chapter 2 of The NCBI
Handbook [Internet]. McEntyre J, Ostell J, editors. Bethesda (MD): National Center for
Biotechnology Information (US); 2002-, http://www.ncbi.nlm.nih.gov/books/NBK21094/
Centers for Disease Control and Prevention (2013), Compressed Mortality File 1968-2010,
http://wonder.cdc.gov/wonder/help/cmf.html#Compressed%20Mortality%20File:%20ICD%20R
evision
Cutler D, Deaton A, Lleras-Muney A (2006). “The Determinants Of Mortality,” Journal of
Economic Perspectives 20(3,Summer), 97-120.
Evenson RE, Kislev Y (1973). “Research and Productivity in Wheat and Maize.” Journal of
Political Economy 81(6): 1309-29.
Federation of American Societies for Experimental Biology (2013), Benefits of Biomedical
Research,
http://www.faseb.org/Policy-and-Government-Affairs/Data-Compilations/Benefits-ofBiomedical-Research.aspx
Fuchs, VR (2010), “New Priorities for Future Biomedical Innovations,” N Engl J Med 363:704706 August 19, http://www.nejm.org/doi/full/10.1056/NEJMp0906597
Galiani S, Gertler P, Schargrodsky E, (2005). “Water for Life: The Impact of the Privatization of
Water Services on Child Mortality.” Journal of Political Economy 113 (1), February, 83-120.
Lichtenberg FR (2011), “The quality of medical care, behavioral risk factors, and longevity
growth,” International Journal of Health Care Finance and Economics 11(1): 1-34, March.
Lichtenberg FR (2013a), “The impact of pharmaceutical innovation on longevity and medical
expenditure in France, 2000–2009,” Economics and Human Biology.
Lichtenberg FR (2013b), “The impact of therapeutic procedure innovation on hospital patient
longevity: Evidence from Western Australia, 2000-2007,” Social Science and Medicine 77: 50-9,
January 2013.
Moses H, Martin JB (2011), Biomedical Research and Health Advances. N Engl J Med 364:567571, February 10. http://www.nejm.org/doi/full/10.1056/NEJMsb1007634
19
Nabel EG (2009). Linking biomedical research to health care. J Clin Invest. 119(10):2858–
2858, Oct. 1.
National Cancer Institute (2013a), Drug Discovery at the National Cancer Institute,
http://www.cancer.gov/cancertopics/factsheet/NCI/drugdiscovery
National Cancer Institute (2013b), SEER Cancer Query Systems (CanQues),
http://seer.cancer.gov/canques/
National Center for Biotechnology Information (2013), PubMed Help [Internet],
http://www.ncbi.nlm.nih.gov/books/NBK3827/
National Institutes of Health (2013a), NIH Research’s Impact on Health,
http://www.nih.gov/about/impact/impact_health.pdf
National Institutes of Health (2013b), NIH…Turning Discovery into Health,
http://www.nih.gov/about/discovery/viewbook_2011.pdf ,
National Institutes of Health (2013c), NIH Research’s Impact on Scientific Knowledge,
http://www.nih.gov/about/impact/impact_knowledge.pdf
National Institutes of Health (2012), ExPORTER - NIH RePORTER Database Download,
http://exporter.nih.gov/
National Library of Medicine (2013a), MeSH Tree Structure,
http://www.nlm.nih.gov/cgi/mesh/2013/MB_cgi
National Library of Medicine (2013b), Funding Support (Grant) Information in
MEDLINE/PubMed – 2013, http://www.nlm.nih.gov/bsd/funding_support.html
National Science Foundation (2013), Key Science and Engineering Indicators 2012 Digest,
Research Outputs: Publications and Patents http://www.nsf.gov/statistics/digest12/outputs.cfm
Research!America (2013), U.S. Investment in Health Research,
http://www.researchamerica.org/research_investment
Romer, P., 1990. Endogenous Technological Change. Journal of Political Economy 98 (5, Part
2), S71-S102.
Sampat, B. N., Lichtenberg, F. R., 2011. What are the Respective Roles of the Public and Private
Sectors in Pharmaceutical Innovation? Health Affairs 30(2), 332-9.
20
Varmus H (1998), New Developments in Medical Research: NIH and Patient Groups, Testimony
Before the House Commerce Committee, Subcommittee on Health and Environment,
March 26, http://www.hhs.gov/asl/testify/t980326a.html
Welch, H. Gilbert, Lisa M. Schwartz, and Steven Woloshin (2000), “Are Increasing 5-Year
Survival Rates Evidence of Success Against Cancer?,” JAMA 283(22): 2975-2978.
Figure 1
Abridged sample of a PubMed bibliographic citation
Line
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
PMID- 20425429
OWN - NLM
STAT- MEDLINE
DA - 20100428
DCOM- 20100810
VI - 4
IP - 3
DP - 2009 Jul
TI - Application of immunotherapy in pediatric leukemia.
PG - 159-66
LID - 10.1007/s11899-009-0022-5 [doi]
AD - Center for Cancer Research, National Cancer Institute, National Institutes of
Health, Building 10, Room 1W-3750, 9000 Rockville Pike, MSC-1104, Bethesda, MD
20892, USA. [email protected]
FAU - Wayne, Alan S
AU - Wayne AS
LA - eng
PT - Journal Article
PT - Research Support, N.I.H., Intramural
PT - Review
PL - United States
TA - Curr Hematol Malig Rep
JT - Current hematologic malignancy reports
JID - 101262565
RN - 0 (Immunotoxins)
SB - IM
MH - Child
MH - Graft vs Leukemia Effect/immunology
MH - Hematopoietic Stem Cell Transplantation/methods
MH - Humans
MH - Immunotherapy/*methods
MH - Immunotherapy, Adoptive/methods
MH - Immunotoxins/immunology/therapeutic use
MH - Leukemia/immunology/pathology/*therapy
MH - Models, Immunological
MH - Transplantation, Homologous
RF - 50
EDAT- 2010/04/29 06:00
MHDA- 2010/08/11 06:00
CRDT- 2010/04/29 06:00
AID - 10.1007/s11899-009-0022-5 [doi]
PST - ppublish
SO - Curr Hematol Malig Rep. 2009 Jul;4(3):159-66. doi: 10.1007/s11899-009-0022-5
Figure 2
Number of PubMed publications pertaining to cancer* that were published during the period 1975‐2009, by extent and source of research support All publications
1,532,640
Publications indicating any research funding support
464,556
Publications indicating US government research funding support only
92,723
Publications indicating non‐US government research funding support only
292,801
Publications not indicating any research funding support
1,068,084
Publications indicating both US government and non‐US government research funding support
79,032
* PubMed publications pertaining to cancer are those identified by the search "neoplasms[MeSH Major Topic]"
Figure 3
2011 U.S. Biomedical and Health R&D Spending (millions of dollars)
$19,113, 14%
Industry
Federal government
$39,552, 29%
Other
$77,580, 57%
http://www.researchamerica.org/uploads/healthdollar11.pdf Figure 4
Distribution of NIH‐supported articles, by lag between project start date and publication date
250,000
200,000
Median lag = 6 years
150,000
Number of articles
100,000
50,000
0
0
5
10
15
20
25
Number of years
30
35
40
45
50
Figure 5
Heteroskedasticity: relationship across cancer sites between m
mean mortality rate and log change in mortality rate 0.8
0.6
0.4
0.2
log change in mortality rate, 1995‐2009
0
0
10
20
30
‐0.2
‐0.4
‐0.6
‐0.8
Mean mortality rate, 1973‐2009
40
50
60
Figure 6
Incidence and mortality rates per 100,000 population: all malignant cancers, 1973‐2009
530
220
510.5
510
215.1
210
490
200
470
464.9
198.7
450
190
Mortality rate
Incidence rate
430
180
410
173.1
170
Incidence rate: 9 SEER registries
390
Mortality rate: All Malignant Cancers
385.5
160
370
350
1970
1975
1980
1985
1990
1995
2000
2005
2010
150
2015
Figure 7
Estimates of ‐RESEARCH in eq. (11) based on three alternative assumed values of the lag (k) from cumulative publications to the mortality rate
0.6
0.5
Lower 95% confidence limit
0.4
Point estimate
Upper 95% confidence limit
0.3
0.2
0.1
0
0
5
‐0.1
Assumed lag (in years) from cumulative publications to the mortality rate
10
Figure 8
Partial correlation across cancer sites between 1985‐1999 log change in number of research publications and 1995‐2009 log change in mortality rate,
controlling for 1995‐2009 log change in incidence rate
1995‐2009 log change in mortality rate
0.4
0.3
0.2
0.1
0
‐0.5
‐0.4
‐0.3
‐0.2
‐0.1
0
0.1
‐0.1
‐0.2
‐0.3
1985‐1999 log change in number of research publications
Note: bubble sizes are proportional to mean age‐adjusted mortality rate during 1973‐2009
0.2
0.3
0.4
Table 1
16.8
67.7
prostatic
11.9
61.6
pancreatic
10.7
11.7
lymphoma non hodgkin
7.2
17.0
stomach
5.9
9.6
ovarian
5.2
8.2
urinary bladder
4.7
20.8
brain
4.5
6.5
esophageal
4.1
4.5
kidney
4.0
10.7
rectal
3.6
16.1
multiple myeloma
3.5
5.5
skin
3.4
16.2
liver
3.2
3.9
leukemia myeloid acute
2.5
3.4
leukemia lymphoid
2.3
6.4
44,873
78,143
17,663
27,722
45,221
92,547
14,376
37,152
12,304
24,802
19,556
39,556
26,118
41,271
16,504
30,196
14,145
23,436
37,432
61,182
11,620
20,848
17,449
31,148
14,578
22,567
9,026
14,439
28,990
53,907
28,258
52,669
9,322
15,121
15,652
25,056
3,515
8,623
2,771
5,414
4,941
18,331
1,826
10,400
1,274
3,199
2,165
4,830
498
1,205
1,337
4,220
1,313
2,356
3,147
6,821
314
1,071
1,184
2,332
752
1,056
681
1,801
2,330
4,930
3,824
5,951
2,299
3,507
2,733
4,848
cum_other_re
search_pubs
breast
8,171
24,704
5,757
13,776
11,766
46,391
3,677
21,043
2,764
8,582
4,979
14,397
2,309
9,076
3,398
11,493
2,866
6,773
6,557
17,739
953
4,464
2,597
6,775
2,037
3,769
1,727
5,449
5,061
13,234
8,241
20,603
5,363
10,049
6,512
14,531
cum_US_gov_r
esearch_pubs
41.2
66.8 53,044
58.8 102,847
39.7 23,420
30.5 41,498
72.8 56,987
69.8 138,938
72.2 18,053
69.4 58,195
11.1 15,068
12.8 33,384
19.9 24,535
20.2 53,953
8.3 28,427
7.3 50,347
8.0 19,902
6.8 41,689
20.6 17,011
20.4 30,209
6.5 43,989
6.6 78,921
4.4 12,573
4.5 25,312
11.1 20,046
14.9 37,923
14.4 16,615
12.2 26,336
5.7 10,753
6.1 19,888
18.1 34,051
24.6 67,141
3.7 36,499
7.1 73,272
3.7 14,685
3.6 25,170
6.5 22,164
6.3 39,587
cum_non_rese
arch_pubs
19.9
58.4
48.5
19.4
12.9
17.4
12.4
13.5
8.6
10.4
10.8
8.7
6.3
5.3
3.4
5.2
4.4
4.4
4.3
4.7
4.4
4.3
4.2
4.3
3.9
3.1
2.8
4.0
3.3
3.5
3.7
3.5
4.5
2.4
2.9
2.4
2.0
cum_research
_pubs
colonic
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
cum_pubs
62.9
inc_rate
52.8
mort_rate
mean inc_rate
lung
Site
year
mean mort_rate
Mortality and incidence rates in 1995 and 2009 and PubMed publication counts ten years earlier (in 1985 and 1999), top 18 cancer sites (ranked by mean mortality rate)
5,781
19,825
4,030
10,770
8,554
36,643
2,381
15,453
1,948
7,081
3,789
12,061
1,957
8,380
2,629
9,473
1,966
5,292
4,651
14,331
742
3,860
1,869
5,456
1,493
3,052
1,247
4,569
3,581
10,464
5,678
16,988
4,167
8,492
5,112
12,447
Table 2
Weighted means, standard deviations, and correlation coefficients across 47 cancer sites of 1995‐2009 growth in mortality, incidence, and number of publications ten years earlier
ln(mort_r ln(inc_r ln(cum_ ln(cum_res ln(cum_non ln(cum_U ln(cum_othe
ates) ates) pubss) earch_pubss) _research_p S_gov_rese r_research_pu
arch_pubss) bss) ubss) Mean
Std. dev.
‐0.210
0.356
‐0.053
0.348
0.813
0.364
ln(mort_rates) 1.000
0.631
‐0.119
1.000
0.246
0.058
1.000
ln(inc_rates) ln(cum_pubss) ln(cum_research_p
ubss) ln(cum_non_resea
rch_pubss) 1.538
0.434
0.694
0.345
1.081
0.410
1.957
0.481
Correlation coefficients
‐0.348
‐0.032
‐0.411
‐0.348
0.304
0.026
0.120
0.667
0.981
0.741
0.680
1.000
0.647
0.908
0.939
1.000
0.690
0.643
1.000
0.856
ln(cum_US_gov_re
search_pubss) ln(cum_other_rese
arch_pubss) Observations are weighted by mean mortality rate.
Correlation coefficients in bold are statistically significant (p‐value < 0.05).
1.000
Table 3
Weighted least‐squares estimates of models of 1995‐2009 growth of the age‐adjusted cancer mortality rate (eqs. 10‐12)
Model Publication Statistic ln(inc_r ln(cum_ ln(cum_r ln(cum_ ln(cum_ ln(cum_ Intercept
lag (years)
ates) pubss) esearch_p non_rese US_gov_r other_res
’)
ubss) arch_pub esearch_p earch_pu
ss) ubss) bss) 0
1
Estimate
.
‐0.249
.
.
.
.
‐0.034
T
.
‐1.588
.
.
.
.
‐0.295
PVALUE
.
0.120
.
.
.
.
0.769
2
0
Estimate
T
PVALUE
0.732
6.772
0.000
‐0.426
‐3.801
0.000
.
.
.
.
.
.
.
.
.
.
.
.
0.130
1.573
0.124
3
0
Estimate
T
PVALUE
0.653
6.102
0.000
.
.
.
‐0.262
‐3.555
0.001
.
.
.
.
.
.
.
.
.
0.120
1.404
0.168
4
0
Estimate
T
PVALUE
0.698
5.745
0.000
.
.
.
‐0.195
‐1.726
0.092
‐0.172
‐0.792
0.433
.
.
.
.
.
.
0.149
1.596
0.118
5
0
Estimate
T
PVALUE
0.721
5.148
0.000
.
.
.
.
.
.
‐0.217
‐0.819
0.418
0.024
0.117
0.907
‐0.202
‐0.937
0.355
0.186
1.110
0.274
6
5
Estimate
T
PVALUE
.
.
.
‐0.245
‐1.619
0.113
.
.
.
.
.
.
.
.
.
.
.
.
‐0.029
‐0.254
0.801
7
5
Estimate
T
PVALUE
0.738
6.869
0.000
‐0.422
‐3.928
0.000
.
.
.
.
.
.
.
.
.
.
.
.
0.140
1.697
0.097
8
5
Estimate
T
PVALUE
0.664
6.555
0.000
.
.
.
‐0.300
‐4.366
0.000
.
.
.
.
.
.
.
.
.
0.202
2.281
0.028
9
5
Estimate
T
PVALUE
0.673
5.928
0.000
.
.
.
‐0.284
‐2.641
0.012
‐0.037
‐0.191
0.850
.
.
.
.
.
.
0.206
2.239
0.031
10
5
Estimate
T
PVALUE
0.632
4.781
0.000
.
.
.
.
.
.
0.074
0.317
0.753
‐0.156
‐0.823
0.416
‐0.157
‐0.831
0.411
0.156
0.953
0.346
Table 3
Weighted least‐squares estimates of models of 1995‐2009 growth of the age‐adjusted cancer mortality rate (eqs. 10‐12)
Model Publication Statistic ln(inc_r ln(cum_ ln(cum_r ln(cum_ ln(cum_ ln(cum_ Intercept
lag (years)
ates) pubss) esearch_p non_rese US_gov_r other_res
’)
ubss) arch_pub esearch_p earch_pu
ss) ubss) bss) 11
10
Estimate
T
PVALUE
.
.
.
‐0.092
‐0.587
0.561
.
.
.
.
.
.
.
.
.
.
.
.
‐0.134
‐1.027
0.310
12
10
Estimate
T
PVALUE
0.718
5.967
0.000
‐0.299
‐2.476
0.018
.
.
.
.
.
.
.
.
.
.
.
.
0.071
0.701
0.488
13
10
Estimate
T
PVALUE
0.663
6.132
0.000
.
.
.
‐0.319
‐3.578
0.001
.
.
.
.
.
.
.
.
.
0.316
2.278
0.028
14
10
Estimate
T
PVALUE
0.660
5.551
0.000
.
.
.
‐0.324
‐2.807
0.008
0.011
0.067
0.947
.
.
.
.
.
.
0.316
2.248
0.030
15
10
Estimate
T
PVALUE
0.608
5.090
0.000
.
.
.
.
.
.
0.189
1.092
0.282
‐0.302
‐1.546
0.130
‐0.162
‐1.089
0.283
0.336
2.031
0.049
1.3
2.6
gallbladder
1.0
1.4
tongue
0.8
2.6
hodgkin disease
0.7
2.9
bile duct
0.7
0.5
bone
0.5
0.9
thyroid
0.5
6.8
intestinal
0.4
1.5
vulvar
0.3
1.3
salivary
0.3
1.2
nasopharyngeal
0.3
0.7
tonsillar
0.3
1.3
nose
0.2
0.7
hypopharyngeal
0.2
1.0
oropharyngeal
0.2
0.3
pleural
0.2
0.0
19,731
31,443
10,184
14,907
4,717
12,095
2,556
4,392
2,894
4,357
13,942
17,723
3,510
7,337
36,900
60,486
4,711
14,423
39,706
77,592
2,789
4,442
5,166
8,718
3,753
6,122
881
1,266
5,708
9,402
400
1,161
1,347
2,714
2,850
6,004
951
2,440
272
404
365
626
35
67
60
120
959
1,311
69
232
1,041
2,125
302
950
3,588
9,530
99
142
121
217
232
348
21
35
117
166
11
24
43
132
86
246
cum_other_re
search_pubs
soft tissue
2,696
8,743
712
1,780
851
1,967
126
437
172
641
2,100
3,772
258
962
2,573
6,968
1,235
4,484
8,720
30,022
239
493
467
1,155
628
2,167
52
130
309
692
53
213
141
540
310
1,058
cum_US_gov_r
esearch_pubs
4.5
cum_non_rese
arch_pubs
1.5
1.8
4.6 22,427
1.2
3.5 40,186
1.5
4.4 10,896
1.1
3.1 16,687
1.5
2.8
5,568
1.3
3.3 14,062
0.8
1.4
2,682
0.6
1.1
4,829
0.7
2.5
3,066
0.6
3.3
4,998
0.5
2.8 16,042
0.4
2.9 21,495
0.8
0.8
3,768
1.3
0.8
8,299
0.5
1.0 39,473
0.4
1.0 67,454
0.4
6.2
5,946
0.5 14.3 18,907
0.4
1.7 48,426
0.4
2.2 107,614
0.3
1.3
3,028
0.3
1.4
4,935
0.3
1.2
5,633
0.2
1.3
9,873
0.3
0.8
4,381
0.2
0.6
8,289
0.2
1.2
933
0.2
1.8
1,396
0.2
0.7
6,017
0.2
0.7 10,094
0.2
0.9
453
0.1
0.6
1,374
0.2
0.3
1,488
0.2
0.3
3,254
0.2 .
3,160
0.1
0.0
7,062
cum_research
_pubs
laryngeal
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
cum_pubs
5.3
inc_rate
2.0
mort_rate
mean inc_rate
uterine cervical
Site
year
mean mort_rate
Appendix Table 1
Mortality and incidence rates in 1995 and 2009 and PubMed publication counts ten years earlier (in 1985 and 1999), 29 cancer sites not included in Table 1 (ranked by mean mortality rate)
2,033
7,209
513
1,511
581
1,552
102
392
124
563
1,379
2,894
210
845
1,809
5,714
1,076
4,040
6,525
24,852
177
406
370
1,014
461
1,946
35
104
214
562
49
203
113
459
251
947
0.2
0.2
peritoneal
0.2
0.4
anus
0.1
1.2
retroperitoneal
0.1
0.5
eye orbital
0.1
0.8
ureteral
0.1
0.6
penile
0.1
0.4
leukemia monocytic acute
0.1
0.2
lip
0.0
1.5
cum_other_re
search_pubs
tracheal mediastinal
cum_US_gov_r
esearch_pubs
0.4
cum_non_rese
arch_pubs
0.2
cum_research
_pubs
vaginal
cum_pubs
2.4
inc_rate
0.2
mort_rate
mean inc_rate
testicular
Site
year
mean mort_rate
Appendix Table 1
Mortality and incidence rates in 1995 and 2009 and PubMed publication counts ten years earlier (in 1985 and 1999), 29 cancer sites not included in Table 1 (ranked by mean mortality rate)
1995
2009
1995
2009
1995
0.1
0.1
0.2
0.1
0.1
2.3
2.9
0.4
0.4
0.2
10,173
16,191
2,221
3,119
7,340
1,471 8,702
2,707 13,484
153 2,068
233 2,886
371 6,969
598
852
90
113
215
1,050
2,158
79
148
213
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
1995
2009
0.1
0.1
0.2
0.2
0.2
0.1
0.1
0.1
0.1
0.1
0.1
0.1
0.1
0.1
0.0
0.0
0.0
0.2
0.4
0.6
1.2
1.7
0.4
0.4
0.9
0.8
0.6
0.5
0.3
0.4
0.3
0.2
1.3
0.6
10,607
3,250
7,395
1,822
3,175
3,436
5,445
14,335
23,738
1,765
2,738
1,947
3,211
978
1,416
1,469
2,064
604 10,003
267 2,983
931 6,464
138 1,684
372 2,803
128 3,308
255 5,190
2,241 12,094
4,771 18,967
47 1,718
125 2,613
107 1,840
193 3,018
258
720
423
993
48 1,421
103 1,961
261
94
258
53
132
55
85
1,059
1,638
13
20
44
59
96
115
12
17
420
215
781
102
289
89
198
1,705
4,047
36
111
73
150
205
364
42
95