Sen, Pranab Kumar (1994). "Bridging the Biostatistics-Epidemiology Gap: The Bangladesh Task."

BRIDGING THE BIOSTATISTICS-EPIDEMIOLOGY GAP:
THE BANGLADESH TASK
by
Pranab Kumar Sen
Department of Biostatistics
University of North Carolina
Institute of Statistics
Mimeo Series No. 2131
May 1994
University of North Carolina at Chapel Hill, NC, USA
In spite of having a concordant scientific base, biostatistics and
epidemiology may not always have harmony in their philosophical as
well as operational stands. Part of this dissension is due to a less
than full appreciation of each other's basic foundation and
objectives, and there is ample room for better interaction and
coordination. Incorporation of 'local' factors (and covariates) in
epidemiological modelling and statistical analysis is a vital
component in this respect. With reference to some of the vital
epidemiological issues in Bangladesh, the role of biostatistics in
their effective resolution is discussed here in a broader perspective.
AMS Subject Classifications:
Keywords and phrases: AIDS; biomathematics; biotechnology; cancer;
cardiovascular disease; chemometry; clinical epidemiology; cholera;
demography; drug research; ecology; environmental science; etiology;
infectious disease; immunology; inhalation toxicology; kala-azar;
malaria; occupational health; pharmacoepidemiology; pollution;
reproductive epidemiology; statistical modelling; stochastic
processes; toxicology.

1. INTRODUCTION. Among the major disciplines pertaining to the public
health sciences, the triplet, biostatistics, environmental sciences and
epidemiology, constitutes the so called quantitative (or measurement)
sciences, while health administration and health policy, health
behavior, health education, health promotion and disease prevention,
laboratory practice, maternal and child health, nutrition, parasitology,
public health nursing and other disciplines form the public health
practice (or clinical) sector. Nevertheless, there is no rigid
demarcation of the boundaries of these territories, and there is ample
room for effective interactions not only between the branches within the
same sector but also between the sectors themselves. For example,
without biostatistics, perhaps, nutrition and, to a certain extent,
maternal and child health may lose their scientific foundations.
Likewise, biostatistics cannot have its full development with respect
to the integrated field of public health without the input from each of
the other fields referred to earlier. As a matter of fact, the impact
of biostatistics is by no means confined to the field of public health
alone. There is hardly any area left in medical and health sciences
where biostatistics has not made an appearance as an indispensable tool.
More specifically, in medical investigations, multicenter clinical
trials, biotechnology, immunology, medical diagnostics and image
processing, dentistry, pharmacology, mutagenesis, chemometrics and a
variety of areas of research and practice within the greater domain of
health affairs, biostatistics is emerging as the binding force. It is
an indispensable quantitative tool for the planning (or designing) of
any objective investigation, a mentor for data collection and data
management and a passport for invoking appropriate statistical
methodology pertaining to proper formulation of analysis schemes
providing valid and efficient statistical conclusions from the
experimental outcome. Such conclusions need to be relayed to the
investigators and others who may not have profound statistical
background, and the interpretational role of biostatistics is
equally important. In a broad sense, biostatistics is a hybrid of
biometry and statistics, inheriting the basic emphasis on biological
applications from the former and the affection for sound methodology
from the latter. The advent of modern biomedical and health sciences
has indeed stimulated the veins of biostatistics, and, without any
reservation, it may be said that the emergence of biostatistics as
a discipline marks the most significant development in statistical
sciences, qualifying as a key technology as well as a refined art for
decision making in every walk of life, science and society. This unique
feature is bound to continue beyond the turn of this century.
In this perspective, however, it should not be forgotten that long
before the emergence of biostatistics, epidemiology paved the way for a
genuine quantitative approach to the study of epidemics, infectious
diseases, occupational hazards, and public health as a whole; the grim
environmental impacts on our life have caught the attention of concerned
people only in the recent past, and in these assessments, both
epidemiology and biostatistics are indispensable. For centuries,
different parts of this planet have experienced, periodically,
devastating epidemics (such as cholera, dengue, jaundice, yellow fever,
plague, pox and others). Infectious diseases (such as malaria,
typhoid, hepatitis, tuberculosis and others) have showered catastrophic
effects on human health and population dynamics. It is epidemiology
which emerged as the basic quantitative approach to unfold the intricate
relationship between quality of life (i.e., living styles and security)
and susceptibility to such epidemics and communicable diseases.
Sexually transmitted diseases and immunological epidemiology are both
at the top of the research agenda at the current time. Modern
epidemiology owes a lot to those pioneers whose deep foresight and
penetrating line of objective thinking opened the doors for this
genuinely important branch of public health.
Epidemiology without the active and matching collaboration of
biostatistics is incapable of meeting the challenge of today in public
health, and equally, without the epidemiological impacts, biostatistics
by itself will be dehydrated of public spirit, and cannot resolve the
vital issues in this area of serious human concern. Thus, they are
complementary to each other, so that there is a genuine need to nurture
a healthy partnership between these two vital wings of public health.
This can be implemented effectively only by having a comprehensive view
of the biostatistics-epidemiology integrated discipline, examining the
foundations of each field, their strengths and weaknesses, and then
attempting to bridge the gap, if any, between the two approaches, which
share common objectives to a great extent. Or, in other words, we
need to find suitable avenues for their fruitful integration.
With this objective in mind, in Section 2, we proceed to examine
the interface of biostatistics and epidemiology, with due emphasis on
their individual foundations, domains of application and scope for
further augmentation. Section 3 deals with their basic differences (in
philosophy/concepts/operations) which, often, force a dissension between
biostatistical perspectives and epidemiological objectives. Section 4
is devoted to the basic aspects of possible dissensions and means of
bridging the gap. In epidemiological studies, geographical, cultural,
socio-economic, religious and other factors generally have profound
impacts, and biostatistics has the right ingredients for formulating
suitable models allowing for the role of such factors and drawing
meaningful conclusions and interpretations from such studies. For
example, AIDS (or HIV) is on the march almost everywhere in the world,
and yet, there is distinct geographical variation in the incidence
rate, and, in this respect, socio-economic, cultural and religious
factors are, often, important contributors towards the propagation of
such an epidemic. For this reason, in the concluding section, the most
prevalent epidemiological issues in the greater Bangladesh region (and
in adjacent parts of India, Burma and some other territories) are
highlighted with a view to emphasizing the need for a Bangladesh task
force to resolve these problems to some satisfactory extent. The main
emphasis, in this quest, is, of course, on the need for developing more
appropriate and adequate biostatistics concepts and tools in order that
valid and efficient conclusions can be drawn. The basic drawbacks of
the conventional statistical inference (and planning) procedures in
dealing with such nonstandard problems are also presented side by side,
so that the need for novel methodology would be appreciated more.
2. THE INTERFACE. Almost one hundred years ago, in quest of
quantitative models in heredity and anthropometry, the need for
statistical methodology cropped up, and biometry emerged as a vital
branch to deal with this specialization. Soon afterwards, the need for
development of statistical methodology emerged in the field of
agricultural sciences, and later on, in industrial sectors too. Public
health sciences were themselves in rudimentary forms (mostly), and
statistics gradually found its way into this novel field as other areas
started having considerable developments. No wonder that in most
places, biostatistics and epidemiology were put in a common slot,
although it was not uncommon to house biostatistics in the so called
department of preventive medicine (which is, often, in the school of
medicine rather than public health). One of the interesting points in
this genesis of biostatistics is that not only did public health
recognize the need for sound biostatistical theory and methodology for
its various tributaries; various branches of medical sciences also
realized the need for implementation of biostatistics, not only for
statistical analysis of their experimental outcomes but also for sound
planning of their (medical) studies. Government health departments and
health agencies, in their quest for improving the various vital
statistics records, laid emphasis on the need for training of
biostatisticians with supporting programs in population studies or
demography.
Central agencies, like the National Institutes of Health (NIH) and Food
and Drug Administration (FDA) in USA, the British Council of Medical
Research (in UK) and others elsewhere, realized a far greater need for
statistical methodology for their regulatory purposes. Not only does
control of the spread of various infectious diseases top the list of
their objectives; there is also a genuine need to scrutinize
marketability of new drugs being pushed by various pharmaceutical
research groups (all over the world). At the same time, the
pharmaceutical moguls started recruiting biostatisticians to conduct the
needed statistical analysis of their studies with a view to getting
approval from appropriate agencies for marketability of their products.
In all these ventures, biostatisticians appear as indispensable
personnel. Awareness of environmental impacts has become a "household
word" in the recent past. Smoking habits are going through a lot of
basic changes, and environmental epidemiology has emerged as a vital
area of public health sciences. Biostatistics is an essential component
in this sector too. While some of these features of biostatistics have
been discussed in detail in Sen (1993)2, we would like to emphasize here
mainly the interactive features of the triplet: biostatistics,
epidemiology and environmental sciences. They are not the same; they
differ in their approaches, coverage as well as basic philosophy.
Nevertheless, they all aim at a common goal: How to improve the quality
of our lives by controlling our environment (before it becomes
unmanageable)?
2 Sen (1991) [Statistical perspectives in clinical and health
sciences: The broadway to modern applied statistics. Jr. Appl.
Statist. Sci. 1:1-50].
Environment is not simply what we get from the sun (and other
planets), air, water and other resources, but more what we are
contributing towards making our own lives unsafe! The thinning of the
ozone layer is a concern for the entire planet, not least for the
industrialized nations! Inhalation toxicology is a hot topic of study.
Exhaust from industrial plants, gasoline and diesel combustion by
automobiles (and airplanes too), continuing use of natural resources for
energy producing plants, and the use of various chemical agents have
raised the level of pollution (and radiation too) to an unsafe grade,
almost all over the world. Chemical dumpings and nuclear wastes are
causing serious water and subsoil contamination. Can we drink the
natural water any more? Can we breathe the air comfortably? Where are
we heading to? Epidemiology has taken up this challenge (in cooperation
with biostatistics and environmental sciences) to put a halt to such
disasters. Let me present a very brief outline of the basic sectors in
epidemiology with a view to providing more information on the current
developments.
Epidemiology:
    Epidemics and infectious diseases;
    Ecological impacts;
    Clinical approaches;
    Demographical shades;
    Toxicology.

The early developments related mostly to epidemics and infectious
diseases, and demographical approaches provided the usual tools. In the
course of this traditional progress, a striking feature emerged. This
led to the formal separation of the two philosophical slants: ecology
and toxicology.
In ecology, not much emphasis is usually placed on the
etiological issues, but more on the means of studying the nature of the
development with a view to preserving our environmental treasures. In
toxicology, on the contrary, emphasis is primarily on the cause and
effect type of studies, and the environmental engineering impacts are
more predominant in this respect. Nevertheless, for both ecology and
toxicology approaches, biostatistics is indispensable! To illustrate
the coverage of modern epidemiology, let me mention a few of the most
important areas:
(i)    Infectious diseases epidemiology
(ii)   Epidemiology of immunizations
(iii)  Biochemical epidemiology
(iv)   Epidemiology of sexually transmitted diseases
(v)    Reproductive epidemiology
(vi)   Cardiovascular diseases epidemiology
(vii)  Cancer epidemiology
(viii) Pharmacoepidemiology
(ix)   Occupational health epidemiology
(x)    Environmental epidemiology
(xi)   Clinical trials in epidemiology
There are many other tributaries. The important feature common to all
these branches is that
    Epidemiology deals primarily with human life.
Bearing in mind the complexities of modern life, we may need to
appraise the scope of epidemiology from a much broader perspective, and
in this assessment biostatistics is the main tool. To relate the two
wings of public health in a harmonious blending, it is also necessary to
examine the current state of the art of biostatistics.
Although biostatistics literally means statistics relating to bio
or life sciences, its growth has not been restricted only to biological
sciences. Of the variety of areas enriched with biostatistics (logic
and modelling), we may refer to the following:
BIOSTATISTICS:
    Biomathematics/biomedical engineering
    Statistical methodology
    Sampling methodology
    Biometry
    Demography
    Genetics/Mutagenesis
    Environmental Sciences
    Epidemiology
    Clinical trials
    Stochastic modelling
    Neural network (neurophysiology)
    Medical diagnostics
    Tomography (Image processing)
    Health Sciences (& Management)
    Medical Studies
    Data monitoring/management
    Biotechnology
    Chemometry
    Drug research & information
There is virtually no limit to the scope of applicability of modern
biostatistics theory and methodology.
The question may therefore arise naturally:
    Why does biostatistics have to be so attentive to epidemiology?
The answer to this query is simple:
In order to foster the healthy growth of research on quantitative (or
measurement sciences) analysis in public health, biostatistics has the
natural obligation to side with both epidemiology and environmental
sciences. Only then can the scientific as well as the practical
implications of public health problems be fully assessed and resolved
effectively.
For example, in an environmental or epidemiological problem
(say, inhalation toxicology), it may be quite tempting to incorporate
biomathematics or biomedical engineering methodology and formulate a
suitable quantitative model to depict the nature of the development.
However, to make it applicable in a particular context, the
socio-economic, demographic and other concomitant information may be
vital for an effective resolution. Moreover, keeping in mind the
stochastic elements embedded in such a formulation, there is a genuine
need to introduce stochastic components along with the deterministic
ones. Then, one may need to set a suitable margin of fluctuation for
the stochastic components, so that the deterministic factors can be
quantitatively assessed, and the objectives of the study may therefore
be met adequately. Epidemiological concepts, demographic measures and
environmental interpretations are by no means adequate to address this
problem fully.
Likewise, (mathematical) statistics, by itself, may be grossly
inappropriate to handle such problems. It is essential that a
statistician be fully conversant with the basic
epidemiological/environmental aspects of the problem before planning
such a study, and subsequent statistical analysis schemes may have to be
innovated in the light of the specificities of such problems.
The interface of public health measurement sciences is therefore of
fundamental importance for any objective scientific investigation and
effective resolution of public health problems. Mathematical statistics
is highly valuable in this respect. However, the common assumptions
which are generally made in statistical modelling and inference may not
be tenable in the public health context. For example, equal probability
sampling (with or without replacement) and independence of sample
observations may not generally hold in public health studies. The
nature of sampling may depend very much on the particular problems under
study (viz, air pollution, water contamination, follow-up studies,
retrospective studies, case-control studies, etc.), and typically, in
such a case, standard statistical tools may not be usable.
Biostatistics has been designated as the branch of statistics to deal
with such more general, more practical models, to develop novel
statistical modelling and analysis tools and to provide meaningful
interpretations of the study outcomes not only to statisticians but also
to public health scientists as well as the general public for the
betterment of their lives.
3. THE DISSENSION. (Bio-)statistics is more of an art (than a
science) for providing meaningful interpretations and drawing valid and
objective conclusions from experimental or observational studies, and
is guided by the basic principles of statistical methodology in carrying
out this delicate task with controllable margin of errors. It may
hardly be characterized as a bona fide member of the club of
mathematical, biological, physical or engineering sciences, and there
should not be any attempt to coin the term statistical sciences to
classify (bio-)statistics as a science discipline. On the contrary,
compared to the social sciences (including economics/management and
political sciences), statistics has a far more visible scientific base,
although in principle, it differs from the basic philosophy of
mathematical sciences. Nevertheless, biostatistics combines the
objectivity of a scientific study, allowing the flexibility of an
uncontrolled experimental setup, and retains its adaptability as a handy
tool for scientific assessment in a wide range of applications.
Accommodation of this broader domain of aims and objectives, within the
spectrum of biological, medical and public health sciences, naturally
calls for a close examination of the basic (nonstatistical) problems
from scientific as well as socio-economic perspectives.
This is generally accomplished by the rule of three in biostatistics:
(i) planning (or design) of the experimental scheme, (ii) data
collection and monitoring, and (iii) statistical modelling and drawing
of statistical conclusions from the experimental outcome. This combined
approach enables biostatistics also to provide meaningful
interpretations to the experimenters as to the general outcomes of the
study. In this delicate task, statistical planning, modelling and
inference procedures are to be blended harmoniously in a sufficiently
flexible and yet efficient manner. Therefore, there remains room for
more thoughtful provocations of statistical theory, methodology and
basic principles in the general development of public health sciences.
As has already been pointed out in the previous section, in a
majority of public health investigations, epidemiology and/or
environmental sciences occupy the focal stand, and thereby dominate the
scenario. Since the basic problems generally crop up in this area, it
is quite natural to appeal to their general principles in seeking
interpretations, motivations and justifications of the general features
underlying such experimentations. Frankly, without this patronage,
there would be hardly any ground for a statistical planning/analysis of
public health studies. The main point is therefore a proper
coordination and mutual understanding between biostatisticians and other
public health researchers so that a genuine public health problem can be
statistically well formulated, and then appropriate statistical tools
can be incorporated for proper planning of the study, safe and efficient
data collection and useful statistical analysis. This objective is not
outside the scope of reachability!
Environmental science has a strong engineering component where
chemistry and physical sciences play a basic role. As such, it is not
unexpected to observe an engineering overtone in the formulation of
environmental problems. Ecology is a dissident in this conventional
setup, and it also serves as the liaison between environmental and
epidemiological approaches; nevertheless, (bio-)statistics is a very
vital component of ecology too. Epidemiology deals more directly with
human population. It started with the direct etiological issues of
epidemics and infectious diseases and then with the assessment of their
aftermath. In this setup, often, ecological considerations dominate
over etiological issues, and the prevalent stochastic elements in the
investigations make biostatistics indispensable in epidemiology.
Toxicological approaches in epidemiological studies are more
etiologic-oriented while demographic approaches, often, lay less
emphasis on etiology and more on descriptive demographic factors.
Clinical epidemiology is a relatively new branch where a marriage of
(bio-)statistical principles with epidemiological objectives is usually
aimed at in a (reasonably) controlled experimental setup. Randomized
clinical trials have been incorporated, during the past twenty five
years, in many vital epidemiological investigations. The wealth of
information acquired in this manner has greatly advanced the state of
our knowledge regarding many of these medical, environmental and
epidemiological factors.
From public health perspectives, it is quite clear that there is a
genuine need to nurture healthy growth of each of its vital components,
and yet, there has to be a coordinated effort in fostering
interdisciplinary developments so as to place the composite field in an
effectively adaptable standing. The dissension has its roots in the
intricate philosophical as well as manual channels of the individual
disciplines. It may not be improper to say that, often, the
epidemiologists are themselves divided by their basic ideological (or
philosophical) slants:
    Whither etiological, clinical or ecological foundations!
Environmental, cardiovascular and reproductive epidemiology, toxicology,
mutagenesis and some other areas are, by formulation, more etiologically
oriented, while in cancer epidemiology, AIDS and even in respiratory
disease epidemiology, there is a considerable overtone of ecological
conceptions. Clinical epidemiology comes in between the two
extremes, and it attempts to provide meaningful interpretations through
a clinical approach which preserves the basic objectives of an
etiological study, and yet, accommodates ecological foundations to a
greater extent. It is quite clear that the role of biostatistics (in
modelling, data monitoring and statistical analysis) cannot be
presented in a single framework for these wings of epidemiological
approaches. It is therefore imperative for biostatisticians to identify
first the basic epidemiological foundation before even planning a study
and analyzing the observational data. It is equally imperative for the
epidemiologists to convey clearly the etiological, clinical, ecological
and demographic basis of a study, so that a proper communication channel
can be established with the biostatisticians and a healthy resolution of
their common problem be made. In practice, this may not, often, be the
case.
Both the disciplines (i.e., biostatistics and epidemiology) aim to
acquire, as much as possible, information from observational studies
(viz, case-control studies, field trials, follow-up studies,
retrospective studies, clinical trials, demographic surveys etc.), and
yet, because of their basic philosophies, they differ considerably (and,
often, incomprehensibly) in their operational manuals for drawing
conclusions from acquired data banks. This is, by far, the most notable
cause of their dissension. Let us illustrate this basic phenomenon with
a concrete clinical epidemiological model; a similar problem crops
up in almost all other areas in epidemiology and public health, in
general.
Suppose that a clinical trial is to be planned for studying the
effect of high-fat diet on heart diseases. In a quantitative approach,
the first and foremost job is to define the "risk" of heart attacks in
a clearcut manner. Angina pectoris, strokes (in the ascending aorta),
arteriosclerosis, Parkinson's disease, and heart attacks are all members
of the general class of cardiovascular diseases. The diet may have
differential effects on different members in this class, and hence, it
needs to be clearly formulated: What is the primary end point of the
trial? Are there any secondary end points? Is it then a multiple end
point study? What are the etiological considerations underlying such a
study? Has there been any pilot study to collect some preliminary
information relating to such formulations? As often is the case with
Phase I trials, there might have been some preliminary studies with
subhuman primates: How far can the response characteristics on such
animals be projected for human subjects? Or, is accelerated life testing
valid in the current setup? Above all, the clinical trial is planned
for human subjects, and therefore there are a number of basic queries:
(i)    What sector of the population is the focal point of such a
       study?
(ii)   How are they to be recruited?
(iii)  Retrospective or follow-up study?
(iv)   How to define properly high-fat diet?
(v)    Racial/socio-economic/ethnic considerations?
(vi)   Besides high-fat diet, are there other important factors which
       are very relevant (viz., smoking, drinking, lack of physical
       exercise, tension at working place/home, family history etc.)?
(vii)  Whether to choose some of these factors as treatment variates
       or as concomitant ones?
(viii) How much "control" the trial may have on the follow-up scheme
       with such "outpatients"?
(ix)   What to do with "noncompliances" due to dropouts/
       withdrawals/failure due to other causes?
(x)    How to cope with the medical ethics?
There may be one thousand and one other considerations!
There is a lot of arbitrariness in the formulation of primary
endpoint(s) in clinical trials. Medical and epidemiological
considerations, often, do not match very harmoniously, and, as a result,
statistical formulations are, often, not so precise. In some other
areas in statistical sciences (viz, agricultural experiments/animal
studies), statistical hypotheses are generally quite precise and because
such experiments can be conducted with a greater amount of control,
statistical formulations are usually quite simple. In
epidemiological/environmental studies, this is not generally the case.
Because of possible multiple end points and a large number of
concomitant variates, the stochastic aspects, often, dominate the
deterministic ones, and hence, extra care is needed to identify a
suitable model, plan the study in such a way that information pertaining
to such a model can be properly extracted from the experimental outcome,
and to formulate valid and efficient statistical tools for drawing
conclusions from the acquired data set. With respect to each of these
three basic considerations, epidemiological approaches are, often, at
cross-roads with biostatistical ones.
From an epidemiological point of view, it may be quite natural to
assume that the greater the number of factors and response variables
included in the study, the greater would be the amount of information,
and hence, the better would be the quality of conclusions to be drawn
from the experimental outcome! If there were no stochastic elements in
this formulation, such an expectation would have been very reasonable.
On the contrary, in a stochastic setup, the larger the number of
factors/response variables/concomitant variates, the greater will be
the amount of variability from the true pattern, and hence, the greater
will be the risk of making incorrect/imprecise conclusions from the
acquired data set. There is therefore a genuine need for the
reconciliation of the deterministic vs. stochastic undercurrents, and
the integrated biostatistics-epidemiological study has to be founded on
this mutual standing. Any development in isolation is inappropriate and
inadequate for an integrated study.
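
One simple way to see how the risk of incorrect conclusions grows with the number of variables is sketched below. It is an illustration only and is not taken from the paper: it assumes (hypothetically) that each of k variables is tested separately at a nominal level alpha, that the tests are independent, and that none of the variables has a real effect. The function name familywise_error is introduced here purely for illustration.

```python
def familywise_error(alpha: float, k: int) -> float:
    """Chance of at least one false positive among k independent tests.

    Assumes every null hypothesis is true and the k tests are
    independent, each carried out at the nominal level alpha.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    return 1.0 - (1.0 - alpha) ** k


if __name__ == "__main__":
    # The risk of a spurious finding rises steadily with the number of
    # variables examined, even though each test is run at the 5% level.
    for k in (1, 5, 10, 20):
        print(k, round(familywise_error(0.05, k), 3))
```

Under these assumptions, the chance of at least one spurious finding is 5% for a single variable but exceeds 60% for twenty, so adding variables without regard to the stochastic elements can degrade, rather than improve, the quality of the conclusions.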
Let me iterate the last point with the following salient points.
Suppose that in the planned clinical trial, it is aimed to compare
the cardiovascular problems of a control group (of low-fat diet people)
and a treatment group (of high-fat diet people). If the control group
really has a lower risk, then medical ethics would prompt us to curtail
the study as early as possible (with due evidence on this significance)
and to switch all the subjects to the low-fat diet group so that they
have a greater chance of survival. It is not, therefore, uncommon to
have an interim analysis scheme where the accumulating data set is
periodically examined with this early possible termination in mind:
    Does it matter how many times you look at the data?
Indeed, biostatisticians and epidemiologists, often, do not agree on the
format of such interim analysis! Fortunately, the academics as well as
the agencies are aware of this basic methodological issue, and the past
two decades have witnessed a phenomenal growth of research literature on
interim analysis. Randomized clinical trials have been proposed and
successfully conducted to impose a greater amount of control through the
so called "double blind" studies, and yet, from a practical point of
view, there remains the concern:
    How to make sure that randomization works?
Statistical analyses are usually made to draw valid and efficient
statistical conclusions based on the trial outcome. Interim analysis
has led to time-sequential, progressive censoring and/or repeated (as
well as group sequential) significance testing schemes. Although these
are being advocated more and more, the intricate stochastic base of the
trial data set is often misunderstood by the epidemiologists and often
by biostatisticians as well. This is unfortunate. Routine statistical
packages (SPSS/BMDP/SAS/S-plus, etc.) are mechanically adapted for such
statistical analysis, and the Cox (1972)3 proportional hazard model
(PHM) has become a household word to epidemiologists and
biostatisticians as well! There is a genuine need to examine the
appropriateness of any specific model and/or program in a specific case,
and this calls for a lot of coordination between epidemiologists and
biostatisticians.

4. THE BRIDGE. Biostatistics, having its genesis in biomedical,
clinical, health and statistical sciences, has been catering to their
needs very well. Likewise, epidemiology initiated a quantitative
approach to a general class of problems in health sciences and is very
much aligned to the modern public health sciences. Both the disciplines
deal, in an objective manner, with quantitative (or measurement) aspects
of health problems, and they share a common ground in their foundations
too. This concordance of their basic concepts and objectives is indeed
the bridge between the two seemingly less related wings of public
health. There is, however, a basic need to fortify this link in such a
way as to allow free trespassing of ideas, concepts and operational
manuals from one camp to the other.

3Cox (1972) [Regression Models and Life Tables (with discussion).
J. Roy. Statist. Soc. B. 34:187-220.]

The mathematical (or theoretical) counterpart of biostatistics (which
is more popularly known as the theoretical or mathematical statistics)
has an inherent tendency to rely heavily on probability theory, measure
theory, real and functional analysis as well as other areas of (pure
and applied) mathematics, and may, at times, be quite obscured in
abstractions. This generally limits the access of biostatistics to
refined theoretical statistics, although this is certainly quite
healthy for development of theory and methodology which can be
incorporated in biostatistics as well. For the latter task, it may be
desired to have strong interactive research in biostatistics (theory
and methodology) wherein the mathematical sophistications can be
decoded to a greater extent for making room for fruitful applications.
Nevertheless, in a majority of practical problems arising in
biomedical, clinical and health studies, direct adaptations from
theoretical statistics may stumble into road blocks: The basic setups
may differ considerably, leading to possibly different sets of
regularity assumptions, so that an adaptation, without checking the
validity, may, often, be disastrous in biostatistics applications.
Fortunately, at the present time, through the intervention of competent
statisticians having sound theoretical background and keen interest in
biostatistical applications, there has been a steady growth of research
work on the interface of theoretical statistics and biostatistics.
Epidemiology
comes from the other corner, and putting theoretical statistics on the
same table with epidemiology may not always be wise.
Biomathematics, often, serves as a liaison between mathematics and
biological sciences, and there are some subtle differences between
biomathematical and biostatistical approaches. Let us examine these
basic points so as to prepare the way for a proper bridging of the
biostatistics-epidemiology gap.

(i) Whither biomathematics? In a sense biostatistics combines the
mathematical objectivity of biomathematics and the statistical concepts
of theoretical statistics, and hence, serves as a better bartender in
health sciences where stochastics may not play any insignificant role
compared to the deterministic factors. Biomathematics lays somewhat
more emphasis on the modelling part incorporating all the deterministic
factors, and these models work out well when the stochastic components
are not so dominant. It is indeed possible to include stochastics in
such a setup, but the resulting picture may depend very much on the
distributional properties of such stochastic elements. Often,
stochastic differential equations are imported to describe biological
systems. But their adaptability in a given context may depend very much
on the regularity assumptions on the stochastic components. In most of
the health sciences problems, these regularity conditions are generally
more complex so that such stochastic differential equations may not
lead to simple solutions. Theoretical statistics may have some problems
too, and these will be discussed later on. From this perspective,
biostatistics offers a better choice.

(ii) Beyond the i.i.d. sampling. Conventionally, in (mathematical)
statistics, it is assumed that the sample (on which statistical theory
has to be developed) consists of independent and identically distributed
(i.i.d.) random elements. In sample survey methodology this relates to
the so called simple random sampling with replacement (SRSWR). In
practice, often, the population size (N) is finite and sampling is made
without replacement (WOR). The first step towards this is the equal
probability SRSWOR. In socio-economic, agricultural and demographic
surveys, often, a stratification of the population is incorporated along
with SRSW(O)R to reduce further the sampling error. Objective sampling
methodology has gone through an evolutionary growth in the past 4
decades. In many epidemiological investigations, (stratified or not)
SRSW(O)R are not appropriate, and usually more complex sampling schemes
are adopted to suit the practicality. For example, in case-control
studies, field trials, retrospective or follow-up studies, matching and
other covariate adjustments often eliminate the possibility of using
i.i.d. sampling or some minor modifications of the same. Length biased
sampling or weighted sampling is becoming quite popular in
environmental studies. In theoretical statistics, there have been, too,
some developments to deal with more complex situations than in i.i.d.
sampling, although the main emphasis is on long-range dependence which
typically arises in time-series models. Spatial statistics is another
area of theoretical interest but has good scope for applications in a
variety of models where the so called i.i.d. structure may not be
appropriate. Naturally, there is a genuine need to look into the
diverse type of sampling schemes that are appropriate for
epidemiological (as well as environmental) studies, and to develop more
statistical methodology so as to make them usable in the field of public
health.
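
The difference between SRSWR and SRSWOR can be made concrete with a
small numerical sketch. The following Python fragment is only an
illustration, not part of the original discussion: the population
values are hypothetical, and the function name sampling_variances is
introduced here for convenience. It enumerates every possible sample
from a tiny finite population and compares the exact variance of the
sample mean under the two schemes with the textbook expressions,
sigma^2/n for SRSWR and (sigma^2/n)(N-n)/(N-1) for SRSWOR.

```python
from itertools import combinations, product
from statistics import mean, pvariance


def sampling_variances(population, n):
    """Exact variance of the sample mean under SRSWOR and SRSWR.

    Both variances are obtained by brute-force enumeration, which is
    feasible only for very small populations.
    """
    # SRSWOR: every unordered subset of size n is equally likely.
    wor_means = [mean(s) for s in combinations(population, n)]
    # SRSWR: every ordered n-tuple (repeats allowed) is equally likely.
    wr_means = [mean(s) for s in product(population, repeat=n)]
    return pvariance(wor_means), pvariance(wr_means)


# Hypothetical finite population of N = 6 measurements.
population = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
N, n = len(population), 3

var_wor, var_wr = sampling_variances(population, n)

sigma2 = pvariance(population)   # population variance (divisor N)
fpc = (N - n) / (N - 1)          # finite population correction

print("SRSWR :", var_wr, "formula:", sigma2 / n)
print("SRSWOR:", var_wor, "formula:", sigma2 / n * fpc)
```

Even in this simplest departure from i.i.d. sampling the variance
formula changes; for the matched, stratified or length-biased designs
mentioned above, no such simple correction factor is generally
available.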

(iii) Beyond the Parametrics. There is an abundance of rates,
ratios and proportions in epidemiological measures and interpretations.
Thus, it may be quite intuitive to incorporate some simple parametric
models (resting mostly on binomial, hypergeometric, Poisson, negative
binomial, exponential or normal distributions) in statistical analysis
of such studies. However, most of these parametric models are tied down
to SRSW(O)R, while (as explained in (ii)) in epidemiological studies,
such simple sampling schemes may not be that relevant, and hence, more
complex parametric models may crop up in such studies. In this
direction, the binomial (hypergeometric) law has been extended to the
beta-binomial and the Poisson to the negative binomial laws. Even then,
the situation is not totally satisfactory. The bottom line is that in
most of the epidemiological models, even if one adopts a parametric
approach, because of the complexities, the number of parameters may
become large, and in terms of robustness, the statistical procedures
become more vulnerable. [Refer to a general birth and death process
allowing nonstationarity in the rates and also immigration in a general
form.] It is therefore desirable to consider more complex statistical
models which are built on the epidemiological axioms, and, with due
emphasis on validity and robustness considerations, formulate efficient
and yet practically adoptable statistical procedures. Nonparametric
methods generally fare well in this respect.
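
To indicate why the binomial law is extended to the beta-binomial, the
following Python sketch compares the two variances. It is an
illustration under stated assumptions rather than a procedure from the
text: the function names and the parameter values (n = 10, a = 2,
b = 8) are hypothetical. When the success probability itself varies
between clusters according to a Beta(a, b) law, the variance of the
count is inflated by the factor 1 + (n-1)rho, with rho = 1/(a+b+1).

```python
def binomial_variance(n, p):
    """Variance of a Binomial(n, p) count."""
    return n * p * (1 - p)


def beta_binomial_variance(n, a, b):
    """Variance of a beta-binomial count.

    n trials per cluster; the success probability of a cluster is
    drawn from a Beta(a, b) distribution.
    """
    p = a / (a + b)                # mean success probability
    rho = 1.0 / (a + b + 1.0)      # intra-cluster correlation
    return n * p * (1 - p) * (1 + (n - 1) * rho)


# Hypothetical values: 10 subjects per cluster, mean risk 0.2.
n, a, b = 10, 2.0, 8.0
p = a / (a + b)

var_bin = binomial_variance(n, p)
var_bb = beta_binomial_variance(n, a, b)

print("binomial variance     :", var_bin)
print("beta-binomial variance:", var_bb)
print("inflation factor      :", var_bb / var_bin)
```

An analysis that assumes the simple binomial law in such a setting
would understate the sampling error, which is one concrete form of the
vulnerability referred to above.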

(iv) Whither Linear Models. Linearity of regression function,
homoscedasticity and normality of the error component(s) form the basis
of linear statistical inference. In many biological, epidemiological
and medical studies, the error distribution may be highly skewed, and
hence, often, a transformation (viz., Box-Cox or others) is made on the
response variable to induce "more normality" into the system! Such
transformations may not only affect the error distribution but also the
underlying linear model, if any. As a result, such linear statistical
inference has been characterized as highly nonrobust in many practical
applications, and epidemiology is no exception. In many epidemiological
studies, the response variable may be binary (or ordered categorical),
and hence, the usual linear model may not be very appropriate. There
has been a steady growth of research literature on such models, and
logistic regression, and more generally, generalized linear models have
evolved to eliminate some of the basic inapplicability aspects of the
classical linear models. Nevertheless, because of the basic differences
in the sampling schemes, one needs to pay close attention to the scope
of such generalized linear models to complex sampling schemes (other
than SRS).

(v) Misclassification in Data Synthesis. In epidemiological
studies, often, the response or independent variables are misclassified
(due to latent effects or other reasons). Such a misclassification can
have serious effects on the validity of statistical conclusions to be
made from acquired data sets, and in a majority of cases, severe bias
crops up due to such misclassifications. In epidemiology, the terms
sensitivity and specificity refer to the effects of misclassification,
and there remains a lot to be accomplished to make them usable in more
complex sampling schemes.
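
The bias induced by misclassification can be sketched numerically. The
Python fragment below is a hedged illustration, not a method taken
from the text: the function names and the numerical values are
hypothetical, and the correction used is the standard Rogan-Gladen
type adjustment, which assumes simple random sampling and known
sensitivity and specificity.

```python
def apparent_prevalence(true_prev, sensitivity, specificity):
    """Expected proportion classified as diseased by an imperfect test."""
    true_positives = sensitivity * true_prev
    false_positives = (1 - specificity) * (1 - true_prev)
    return true_positives + false_positives


def corrected_prevalence(apparent_prev, sensitivity, specificity):
    """Rogan-Gladen type correction of an apparent prevalence.

    Valid only when sensitivity + specificity > 1.
    """
    denom = sensitivity + specificity - 1
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    return (apparent_prev + specificity - 1) / denom


# Hypothetical values: true prevalence 10%, imperfect diagnostic test.
true_prev, se, sp = 0.10, 0.90, 0.95

observed = apparent_prevalence(true_prev, se, sp)
recovered = corrected_prevalence(observed, se, sp)

print("apparent prevalence :", observed)
print("corrected prevalence:", recovered)
```

With these hypothetical values the apparent prevalence overstates the
true one by roughly a third. The correction relies on the simple
sampling assumption, which is exactly what may fail in the more complex
sampling schemes referred to above.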

(vi) Measurement (or observational) errors. This phenomenon is
related to misclassification, although this is somewhat more specialized
to error in measurement rather than classification of states etc. The
scope of measurement errors is not only on the primary (response)
variate (when it is continuous, discrete or categorical) but also on
other auxiliary or concomitant variates. Fortunately, the past few
years have witnessed a steady growth of statistical research literature
on this vital topic, albeit mostly relating to SRS and some other simple
models. There is a genuine need to include more complex epidemiological
models in such studies.

(vii) Interim Analysis. Medical, epidemiological and many
health studies are often based on a follow-up scheme which pertains to
accumulating datasets over a period of time. There is a general
consensus among epidemiologists, biostatisticians and medical
researchers that (statistical) monitoring of such a follow-up study can
not only lead to an early termination of the study, saving time and cost
(as well as human lives too), but also provide greater control on
ethical constraints as well as other (viz., side-) effects. On the
other hand, in order to do this statistical monitoring in a valid and
yet efficient manner, there are certain basic factors which are to be
attended to properly. Sometimes, these create the basic cause of
dissension between biostatisticians and others! Epidemiologists may
argue: Does it matter how many times you look into the accumulating
data set? An uncomfortable biostatistician may try to nod in protest:
How are you going to control the level of significance of the test or
coverage probability of the estimates you want to base on the
accumulating data set? Fortunately, the situation is far more clear now
than twenty years ago. Interim analysis has been accepted as a valid
statistical tool for studying statistical properties (and making
decisions too) of accumulating data sets. Repeated significance
testing, group sequential tests, time-sequential procedures and
progressive censoring schemes have been systematically developed to
handle this basic problem in a class of situations. Nevertheless, there
is a genuine need to incorporate more complex epidemiological models in
this statistical methodology.
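
The question of whether the number of looks matters can be answered by
a small simulation. The Python sketch below is an illustration under
simplifying assumptions and is not part of the original discussion:
the function name false_positive_rate, the seed, the number of
simulated trials and the normal test statistic are all hypothetical
choices. Data are generated under the null hypothesis of no treatment
difference, the accumulated data are tested at each equally spaced look
at the nominal 5% level (critical value 1.96), and the trial is
declared significant as soon as any look rejects.

```python
import math
import random


def false_positive_rate(n_looks, n_trials=20000, z_crit=1.96, seed=1994):
    """Monte Carlo estimate of the overall type I error rate when a
    nominal two-sided 5% test is repeated at n_looks interim looks."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        total = 0.0
        for k in range(1, n_looks + 1):
            # Standardized contribution of the k-th group of subjects,
            # generated under the null hypothesis.
            total += rng.gauss(0.0, 1.0)
            # Test statistic based on all data accumulated so far.
            z = total / math.sqrt(k)
            if abs(z) > z_crit:
                rejections += 1
                break  # the trial would be stopped at this look
    return rejections / n_trials


rate_single = false_positive_rate(1)
rate_five = false_positive_rate(5)

print("one look  :", rate_single)
print("five looks:", rate_five)
```

With a single look the estimated rate stays near the nominal 0.05,
whereas with five unadjusted looks it is close to three times as large.
The group sequential procedures mentioned above adjust the critical
values at each look so that the overall level is controlled.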

(viii) Clinical Epidemiology: Whose Child is it After All? It
is generally claimed that it is a hybrid of medical and epidemiological
concepts and practices! Nevertheless, the most significant component in
this venture is biostatistics. The conventional methods of collecting
information for epidemiological research may not always work out that
well when dealing with chronic diseases or with other epidemiological
studies without having a strong etiology, so that a clinical trial with
an adequate number of subjects may be planned to gather the pertinent
information. Usually subjects are enrolled into the project from a pool
of volunteers, and it is imperative that they agree to go through the
clinical trial protocol and satisfy the basic need of randomization
effectively. The very planning of such a study demands considerable
statistical expertise. The scope of such a study is limited to the
population for which the volunteered subjects form a (random) sample!
Issues of noncompliance, elimination of bias, effectiveness of
randomization and validity of standard analysis tools are the most
pertinent ones. They call for a sound and thorough interaction between
epidemiological objectives and statistical principles. There is light
at the other end of the tunnel, and we hope for the best to emerge in
near future.

(ix) Control of Extraneous Factors! In many epidemiological
studies, control of extraneous factors (especially from etiological
perspectives) is essential for elimination of relatively less important
and unrelated causes or factors, so that more precise statistical
conclusions can be drawn. Since human subjects are typically involved
in such studies, it may not be possible to have a completely controlled
experimental setup (as in agricultural/laboratory experiments). For
this reason, matching, analysis of covariance and other means are
adopted to enhance compatibility. Therefore, the real challenge for
biostatisticians is to fathom out the intricacies of the sampling design
[see (ii)], and, in view of such complexities, to develop proper
statistical methodology for possible incorporation in a variety of
epidemiological investigations. Better not to pass on the blame to
epidemiologists for not using a sophisticated statistical package, but
to put more emphasis on interactive research which would permit easy
access to such complex statistical models.

(x) Beyond Epidemic Theory. Although epidemiology has its roots
in the quantitative assessment of epidemics-etiology, its current
branching includes a far wider spectrum of objectives; ecology and
toxicology are both integrated components of the same. The
(bio-)mathematical models for the common epidemics (known as the
epidemic theory), in spite of their elegance, may not be very pertinent
for modern epidemiology. There is a genuine need for more complex, more
flexible, stochastic modellings in modern epidemiology. This approach
should be capable of incorporating the etiological factors along with
the ecological aspects in a way to have a natural evolution. On both
counts, regional, cultural, religious and other factors are important
ones, and hence, in the concluding section, I will touch on some of
these issues with special reference to Bangladesh.

5. THE BANGLADESH TASK. It is almost impossible to track down the
entire set of epidemiological issues in a densely populated country like
Bangladesh. I shall only consider a few important issues, and discuss
the related biostatistical problems. With a geographical area not
larger than the smaller states (in USA), Bangladesh has a population
more than half of USA. Many of the epidemiological issues revolve
around this enormous population, its relatively low profile in income,
health and education, and its low standing with respect to
industrialization. Natural calamities (e.g., tidal floods, tornadoes,
torrential rains) invade this country very regularly, and recurring
famines are not unexpected. The rate of population growth is one of the
highest ones and the per capita income is in the lowest category. Yet,
Bangladesh is making remarkable progress in various directions, and in
this venture, biostatistics plays a vital role. Let me mention the
following:
(i) Reproductive Epidemiology
(ii) Child and Maternal Health Epidemiology
(iii) AIDS and Venereal Diseases
(iv) Depression and Mental Illness
(v) Cholera, Diarrheal Diseases, Kalazar
(vi) Malnutrition
(vii) Pollution and water contamination
(viii) Smoking, Cancer and tuberculosis
(ix) Cardiovascular Disease Epidemiology
(x) Chronic Disease Epidemiology
The sheer weight of population has made demography the custodian
of these studies on Bangladesh, and no wonder that among the Bangladesh
statisticians, there are more demographers than in any other branch!
The wealth of demographic background and insights may very well be
utilized in all of the areas of epidemiology referred to above.
I understand that the International Center for Diarrheal Disease
Research (ICDDR), located in Dhaka (Bangladesh), is also a prominent
center for the study of cholera and other intestinal diseases, although
primarily from epidemiological and demographical points of view. Like
the demography group, the ICDDR has a journal (of Diarrheal Disease
Research) of their own, which collects and disseminates the pertinent
epidemiological findings. There are occasional biostatistical sparks,
although more in line with demography. There is a department of
Nutrition (in the PG Hospital at Dhaka) and the National Institute of
Preventive Medicine (also in Dhaka) which have some biostatistics
expertise too. On the other hand, the main thrust of statistical
activities (research as well as applications in various fields) rests
with the department of Statistics, University of Dhaka. This is
primarily an academic institution with the provision of attracting
bright students and faculty members from the academia. Also, it has an
established statistics training program, and this journal (of
Statistical Research) is a convenient outlet of their creative research.
Therefore, this journal, amidst its silver jubilee celebrations, should
be charged to enlarge its area of jurisdiction. There is no need to
transfer the framework to epidemiology or demography, but it is
imperative to include all the vital components of modern statistics in
"Statistical Research". This research should bridge the gap not only
between biostatistics and epidemiology (with especial emphasis on the
Bangladesh issues) but also between operational biostatistics and
methodological issues arising thereof. The epidemiological issues
referred to above are some of clinical nature, some etiological while
others are more ecologically oriented. Only biostatistics can bring all
of these apparently diverse approaches into a common stream, and for
this unification, the main task is the development of appropriate
methodology. This development, in turn, depends heavily on active
collaboration of researchers from these diverse fields with competent
statisticians who are interested in extending standard methodology to
such nonstandard situations. Unrestricted use of some standard packages
(such as logistic regression/proportional hazards model etc.) without
checking their validity and appropriateness in such nonstandard
situations is dangerous. It is the (bio-)statisticians' responsibility
to develop tools for model specification, cross-validation and
statistical analysis, and only then the bridging of the
biostatistics-epidemiology gap will be complete.
The legendary physician, Dr. Bidhan Chandra Roy, made the remark (in
the context of control of Kalazar) that for every disease there ought to
be a local clue (solution) for the remedy [Presidential address at the
Indian Science Congress Association Meeting, Calcutta, 1957]. It may
not be an exaggeration to iterate his legendary remark further in
saying that for every epidemiological problem there ought to be certain
local/regional factors which provide key information (on etiology as
well as ecology) which should form the base of plausible biostatistical
resolutions (regarding planning of the study, data collection and
monitoring, model selection and drawing statistical conclusions).
Therefore, we may propose biostatistics as the key technology in
extracting local/regional information to the maximum extent possible
and to incorporate the same in a valid and efficient analysis of
epidemiological investigations. For the assessment of cholera
epidemiology, for example, the water contamination problem, the
infectious nature of the disease and local socio-economic patterns are
all vital components, and their study depends a lot on regional as well
as social conditions. As such, any model one wants to consider must
take into account such factors on a local/regional basis.
Therefore, a specific model pertaining to the Indus-delta (Karachi area)
in Pakistan, or the Hooghly-delta (India) region may not have the same
etiology or ecological factors as in the Padma-delta in Bangladesh. In
smoking/respiratory disease epidemiology, similar regional factors are
very pertinent. (Cigar, cigarettes and bidi may not have the same
impact, and how about chewing tobacco?) In cardiovascular epidemiology,
the diet pattern, physical exercise, smoking/not and many other factors
have distinct regional differentials, and hence, the model for Pakistan
or even India may not suit Bangladesh very well. Chronic diseases
epidemiology may contain a significant genetic effect, and, again,
regional/cultural/socio-economic impacts are overwhelming! AIDS/venereal
diseases and other sexually transmitted ones have distinct regional
effects: The models for Kenya may not be appropriate for Ethiopia!
Industrialized nations may have different contours than developing
ones. Malnutrition is a somewhat imprecisely defined measure. It has
been observed (through the pioneering efforts of the FAO (UN) under the
leadership of Professor P. V. Sukhatme) that the amount of calorie
intake with our food not only depends on the physical characteristics
(such as age, height, weight etc.) but also on the climatic factors.
For example, in the northern Europe (or Canada), a daily average of
3,000 calories is ideal (especially in wintertime), whereas in the
Indian subcontinent (especially in the coastal regions) a thousand
calories would suffice. The intake of protein is similarly a matter of
debating: Different religious and cultural sectors have different
patterns, not to speak of socio-economic factors within each sector!
As such, a first and foremost task in the study of malnutrition is to
define the norm taking into account all such pertinent factors. Again,
the resolution has to be highly dependent on the
regional/cultural/religious factors. Reproductive epidemiology is
another area where social customs, economic conditions and other
regional factors have profound impacts. Can we expect that in
Bangladesh the model of India/Pakistan or any other nation will be
totally appropriate? It is a complex study involving demographic
undercurrent, government policies, birth control and contraceptive
measures and religious/cultural factors. In the Indian subcontinent
with evident emphasis on male births, the number of children in a
family may also depend (at least, stochastically) on the outcome of a
boy or girl! Maternal health (and hence, child health too) is not in
commendable shape in Bangladesh (India too). What are the most
important measures to improve the situation? Can biostatistics be kept
at bay in this study too?

To sum up, I would strongly urge the Bangladesh statisticians to
take up this challenge of augmenting biostatisticians in their
"statistical research", and to pay special attention to developing
appropriate statistical models, sound statistical methodology and
efficient (and yet simple) statistical analysis schemes with direct and
in depth collaboration with scientists from epidemiology, public health
and clinical sciences, in general. To eliminate poverty, to eradicate
malaria as well as illiteracy, to eliminate malnutrition and to provide
a balanced diet to population of all ages, to combat epidemics and
infectious diseases, to survive the AIDS episode, to be able to breathe
fresh air, drink natural water and to live in peace and good health, we
need public health awareness and advancements, and biostatistics is the
binding force for all disciplines in this greater domain. Let
statistics in Bangladesh embrace biostatistics and bring the much needed
monsoon of applicable methodological research, and let this be reflected
in the Journal of Statistical Research in the years to follow.