Chapter 9 Scientific methods: background and discussion

This section is not examinable. It is included here only for completeness. We define the scientific
method and discuss the difference between a hypothesis, a scientific fact, a theory, and a model.
The Scientific Method
This section borrows heavily from the website of the University of Rochester (see footnote 1).
The scientific method is the process by which scientists, collectively and over time,
endeavour to construct an accurate (that is, reliable, consistent and non-arbitrary) representation of the world.
Now it is well known that “smart people (like smart lawyers) can come up with very good explanations
for mistaken points of view.” The scientific method seeks to minimize any such bias or prejudice in
an experimenter when a hypothesis or theory is being tested. The method has four steps.
1. Observation and description of a phenomenon or group of phenomena.
2. Formulation of a hypothesis (a clever guess) to explain the phenomena. In physics, the
hypothesis often takes the form of a causal mechanism or a mathematical relation.
3. Use of the hypothesis to predict the existence of other phenomena, or to predict quantitatively
the results of new observations.
4. Performance of experimental tests of the predictions by several independent experimenters,
using properly performed experiments (reproducibility).
Footnote 1: labs/appendixe/appendixe.html
If the experiments bear out the hypothesis it may come to be regarded as a theory or law of
nature. If the experiments do not bear out the hypothesis, it must be rejected or modified. What is
key in the description of the scientific method just given is the predictive power of the hypothesis
or theory, as tested by experiment. It is often said in science that theories can never be proved, only
disproved: a scientific fact is a statement that has not yet been disproved. There is always
the possibility that a new observation or a new experiment will conflict with a long-standing theory.
Hypothesis, theory, model – what’s the difference?
1. A hypothesis is a clever guess to explain observed phenomena. ‘My car won’t start’ is an
observation about the physical world. ‘The battery is low’ is a hypothesis. This must then be
checked by an experiment – perhaps by measuring the voltage across the battery terminals.
2. A scientific theory or law is a hypothesis or group of hypotheses that together form a
framework to explain and predict. The theory must be confirmed by reproducible experiments.
In physics for example, laws are formulated in terms of some basic concepts or equations.
Another view of a scientific theory is that it is a form of data compression – vast amounts
of data about the physical world can be distilled into a few sentences or mathematical symbols.
3. A model is reserved for situations when it is known that the hypothesis has only limited
validity. For example, in Hooke’s Law (which should really be called ‘Hooke’s model’), it is
stated that the force on a mass attached to a stretched spring is proportional to the amount
of stretching. We know this is only valid for small amounts of stretching (otherwise, the spring
will be bent out of shape, or can break). However, this model leads to the prediction of simple
harmonic motion, which is extremely useful in a broad range of applications.
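The step from Hooke's model to simple harmonic motion can be sketched as follows. This is only an illustration: the symbols m (the mass), k_s (the spring stiffness) and x(t) (the displacement from equilibrium) are not defined in the text and are introduced here as assumptions. We also assume that the restoring force acts opposite to the stretching, and we apply Newton's second law:

```latex
m \frac{d^2 x}{dt^2} = -k_s x
\quad\Longrightarrow\quad
x(t) = A \cos(\omega t + \varphi),
\qquad \omega = \sqrt{k_s / m},
```

where the amplitude A and the phase \varphi are fixed by the initial conditions. Substituting x(t) back into the left-hand equation confirms that it is a solution, so within the small-stretching regime the model predicts oscillations at a single angular frequency \omega.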
We have stated that the hypothesis on which a model rests has limited validity. This comes in
two forms:
1. Incomplete information (we have not yet found out everything);
2. A deliberate decision to disregard some information believed to be irrelevant.
A climate model is an example of the first kind: it is simply not possible to know all the effects that
influence the long-run state of the atmosphere. Instead, we assemble the effects that we believe
to be important: momentum transfer, radiative and convective heat transfer, ocean-atmosphere
interactions, the earth’s rotation, sources and sinks of CO2, vegetation cover, precipitation, ice and
so on. These are combined in mathematical equations that are then solved on a computer. In this
case, it can be said that
A model is something you do when you don’t know everything.
Hooke’s ‘law’ is an example of the second kind of limited validity: it might well be possible to
write down an equation for the force F(x) as a function of any amount of stretching, x. This
equation would encompass large stretching values (the plastic limit), and broken springs. However,
when it comes to describing small oscillations, this information is irrelevant. Thus, we are justified
in ignoring these phenomena and using a model. In both types of limited validity, the following
statement applies:
All models are wrong, but some are useful.
On the other hand, theories have a much more exalted status. Remember, a theory must
be confirmed through repeated experimental tests. Theories have universal applicability. Accepted
scientific theories and laws become part of our understanding of the universe and the basis for
exploring less well-understood areas of knowledge. Theories are not easily discarded; new discoveries
are first assumed to fit into the existing theoretical framework. It is only when, after repeated
experimental tests, the new phenomenon cannot be accommodated that scientists seriously question
the theory and attempt to modify it. The word theory in science conveys weight; it is not like my
‘theory’ of why Manchester United are doing better this season than in recent seasons. It is
therefore to be contrasted with the shallow dismissal implied by the expression “It’s only a theory”.
For example, it is unlikely that a person will step off a tall building on the assumption that they will
not fall, because “Gravity is only a theory.”
Discuss whether each of the following examples is a theory or a model. If the example is a model, discuss what kind of limited validity applies (incomplete information or neglect of seemingly irrelevant information).
• A computer program to predict the weather;
• Newton’s law for two point particles: Any two point particles of masses $m_1$ and $m_2$ attract
each other; the attractive force is given by
\[ F = -\frac{G m_1 m_2}{r_{12}^2} \qquad (F < 0 \implies \text{attraction}), \]
where $G$ is a constant and $r_{12}$ is the separation between the particles.
• A description of the earth-sun gravitational interaction in which each element is assumed to
behave like a point particle;
• A description of the earth-sun gravitational interaction in which each element is given its
proper (extended) shape.
• A description of a human population size in terms of a few parameters that are given constant
values, such as the birth rate, the death rate, and the migration rate.
• Newton’s law for a point particle of mass m experiencing a net force F :
m × [acceleration] = F.
Now, having said all of this, the distinction between a ‘model’ and a ‘theory’ is, for the purposes of
this module, moot.
The modelling process
We conclude with a flow chart that describes the steps needed to build and validate a mathematical
model (Fig. 9.1).
Figure 9.1: Flowchart showing the modelling process
Consider the population of a fictitious country, Genovia. Let P(t) be the number of people present
at time t. In the next chapter, we are going to go through the flow chart in Fig. 9.1: a mathematical
problem, assumptions, equations, and equation solutions. We will end up with a model
\[ P(t) = P(0)\, e^{kt}, \tag{9.1} \]
where P(0) is the population at time t = 0 and k is an unknown constant. Thus, to arrive back at
the real-world solution, we need some data to determine k. For a problem involving time-evolution,
this will be information about the past. For example, suppose we have the following data for the
past fifty years (starting from t = 0).
Table 9.1: Population of Genovia over fifty years (t in years).
We overlay a curve of the kind (9.1) on the data (Fig. 9.2)
and choose a k-value that best fits the data. The result is k = 0.021. The fit is not perfect,
and it never will be: there will be small fluctuations in the population size for reasons unknown to
this crude model (see footnote 2). Nevertheless, the fit is good.
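One way to make "choose a k-value that best fits the data" concrete is a least-squares fit on the logarithm of the population. This is a minimal sketch of one possible procedure, not necessarily the one used to obtain k = 0.021 above. The data below are synthetic and invented for illustration (they are not the values of Table 9.1), and the function name fit_growth_rate is hypothetical.

```python
import math

# Synthetic, illustrative data only: these are NOT the values of Table 9.1.
# They are generated from P(t) = 1000 * exp(0.021 * t) with small made-up
# multiplicative perturbations standing in for unexplained fluctuations.
times = [0, 10, 20, 30, 40, 50]
perturbations = [1.00, 1.02, 0.98, 1.01, 0.99, 1.01]
populations = [
    1000 * math.exp(0.021 * t) * p for t, p in zip(times, perturbations)
]


def fit_growth_rate(times, populations):
    """Least-squares estimate of k in P(t) = P(0) * exp(k * t).

    Taking logarithms gives ln(P(t) / P(0)) = k * t, a straight line through
    the origin, whose least-squares slope is sum(t * y) / sum(t * t).
    P(0) is taken to be the first data point, which is an assumption.
    """
    p0 = populations[0]
    ys = [math.log(p / p0) for p in populations]
    numerator = sum(t * y for t, y in zip(times, ys))
    denominator = sum(t * t for t in times)
    return numerator / denominator


k_hat = fit_growth_rate(times, populations)
print(f"estimated k = {k_hat:.3f}")
```

Fitting on the logarithm turns the exponential curve into a straight line, so the best-fit slope has a closed form and no iterative optimisation is needed. The price is that the fit minimises relative rather than absolute errors in the population.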
Figure 9.2: Estimating the k-value from the data in Tab. 9.1.
Footnote 2: Perhaps due to annual fluctuations in the pear harvest.
Suppose now we use the k-value to make a prediction for what the population will be in twenty
years’ time (t = 70):
\[ P(t = 70) = 1000\, e^{0.021 \times 70} \approx 4300 \quad \text{to two significant figures} \]
(our k-value has two significant figures, so we are only allowed to keep two here).
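The arithmetic behind this prediction can be checked with a short sketch. The values P(0) = 1000, k = 0.021 and t = 70 are taken from the text; the helper names predict_population and round_sig are hypothetical and introduced only for this illustration.

```python
import math


def predict_population(p0, k, t):
    """Evaluate the exponential model P(t) = P(0) * exp(k * t)."""
    return p0 * math.exp(k * t)


def round_sig(value, sig):
    """Round a non-zero number to `sig` significant figures."""
    if value == 0:
        return 0.0
    # Position of the leading digit decides how many decimals to keep.
    digits = sig - 1 - math.floor(math.log10(abs(value)))
    return round(value, digits)


p_70 = predict_population(1000, 0.021, 70)
print(p_70)                # unrounded model value
print(round_sig(p_70, 2))  # rounded to two significant figures
```

The unrounded value carries more digits than the two-figure k-value can justify, which is why only the rounded result is quoted as the prediction.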
• If, in twenty years’ time, the population is close to P = 4300, then our model is validated,
and we gain confidence in it (the broken red line in Fig. 9.1 does not need to be followed).
• If, however, in twenty years’ time, the population is wildly different from the prediction P = 4300,
then we need to revisit the model assumptions in Fig. 9.1.
In any event, this example shows another typical (but not essential) property of a mathematical
model (as opposed to a theory):
Often, a mathematical model contains free parameters that need
to be estimated from data before predictions can be made.
Deterministic versus statistical modelling
Depending on your choice of major, in later years you will gain familiarity with statistical modelling. Superficially at least, this bears little resemblance to what is done in this module, which is
deterministic modelling. Typically,
• In deterministic modelling we solve a set of ODEs to make a prediction about the future based
on initial conditions.
• In statistical modelling we assume that an output variable depends on
– An input variable
– Some additional randomness
For example, the height of individuals in a population depends on the age of the individual,
as well as some random contribution because not all people are the same. The random contribution to the dependence is modelled as a random variable that is drawn from a probability
distribution. Statistical modelling involves finding a sensible relationship between the input
and output variables, and a sensible probability distribution with which to characterize the
random contribution. This is often done with a linear regression analysis.
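The statistical-modelling idea described above can be sketched in a few lines. Everything numerical here is invented for illustration: the "true" straight-line relationship between age and height, the size of the random contribution, and the restriction to ages 2 to 12 are all assumptions, and the function name linear_regression is hypothetical. The sketch draws synthetic data from the assumed model and then recovers the relationship by ordinary least squares.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical "true" relationship, invented for illustration only:
# height (cm) = 80 + 6 * age (years) + random contribution
TRUE_INTERCEPT = 80.0
TRUE_SLOPE = 6.0
NOISE_SD = 4.0  # standard deviation of the random contribution (cm)

# Input variable (age) and output variable (height) for 500 individuals.
ages = [random.uniform(2, 12) for _ in range(500)]
heights = [
    TRUE_INTERCEPT + TRUE_SLOPE * age + random.gauss(0, NOISE_SD)
    for age in ages
]


def linear_regression(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = linear_regression(ages, heights)

# The spread of the residuals estimates the size of the random contribution.
residuals = [h - (intercept + slope * a) for a, h in zip(ages, heights)]
residual_sd = (sum(r * r for r in residuals) / (len(residuals) - 2)) ** 0.5

print(f"intercept = {intercept:.1f}, slope = {slope:.2f}, "
      f"residual sd = {residual_sd:.1f}")
```

The fitted intercept and slope describe the relationship between input and output, while the residual standard deviation characterises the probability distribution of the random contribution (here assumed to be normal).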
It is tempting to think that these two approaches constitute two fundamentally different world views.
However, this would be wrong. Indeed, as you will notice in this module, many of the deterministic
ODE models that we write down depend on parameters (e.g. population growth rates) that are
not known a priori. An increasingly popular and extremely effective approach to determining these
parameters is via statistical modelling. Therefore, the ‘dream’ model is often a deterministic one
where the free parameters are determined rigorously via statistical modelling. Thus, at a very high
level, these two apparently disparate approaches can be brought together.