TRYGVE HAAVELMO:
On the Statistical “Testing” of Hypotheses in Economic Theory
Lecture given at the Third Nordic Meeting of younger economists, Copenhagen, May 1939.
Published, for the first time, in Norwegian, in Samfunnsøkonomen, 2008(6), pp. 5–15.
Translated July–August 2011, from the manuscript dated March 13, 1939, by:
Erik Biørn, Department of Economics, University of Oslo
1. INTRODUCTION
In economic theory we attempt to formulate laws for the interaction between events
in economic life. They may be purely qualitative statements, but most of them, and by
far the most important ones, are of a quantitative nature; indeed, what we are most
frequently concerned with are quantitative or quantifiable entities. This emphasis
on quantitative reasoning is seen in almost any work of theory, regardless of whether
the formulation is purely verbal or is given in more precise mathematical terms. The
derivation of such laws rests on a foundation of hypotheses. We proceed from certain
basic hypotheses, maybe introduce supplementary hypotheses along the way, while
proceeding through a chain of conclusions. The value of the results – provided
that their derivation is technically impeccable – then depends on the foundation
of hypotheses. Indeed, each conclusion itself becomes merely a new hypothesis, a
logical transformation of the original assumptions. For this reason I will here use
hypotheses as a common term for the statements in economic theory.
Anyone familiar with economic theory knows how it is often possible to formulate
several, entirely different “correct” theories about one and the same phenomenon.
This is due to differences in the choice of assumptions. One often encounters crossroads in the argument, where one direction a priori appears as just as plausible as
another. To keep it all from becoming a mere logical game, one must at each stage keep the
following questions in mind: Is my argument rooted in reality, or am I operating
within a one hundred percent model world? Is what I have found essential or merely
insubstantial? Here, the requirement of statistical verification can aid us, preventing
our imagination from running riot, and forcing us to a sharp and precise formulation
of the hypotheses. This statistical scrutiny saves us from many empty theories, while
at the same time giving the hypotheses that are verified by data immensely greater
theoretical and practical value.
It may seem that we would be correct in sticking to what we see from the data
only. But that is not so. Then we would never be able to distinguish between
essential and inessential features. Data may give us ideas of how to formulate
hypotheses, but theoretical considerations must be drawn upon. On the other hand,
we should not uncritically reject a hypothesis even if a data set seems to point
in another direction. Many hypotheses, maybe the most fundamental and fruitful
ones, are often not so apparent that they can be tested by data. But we can take
the argument further until we reach the “surface” hypotheses which are testable.
If we are then repeatedly in conflict with our data – and in essential respects –
we shall have to revise our hypotheses. But perhaps the data we have used is not
appropriate, or we have been unable to “clean” it for elements which are not part
of our hypotheses. The crucial problem of statistical hypothesis testing lies in the
analysis of these various possibilities. There are specific testing problems associated
with all hypotheses, but there are also certain problems of a more general nature
and they can be divided into groups. It is these more general problems I will try to
comment upon in the following sections.
2. THE HYPOTHESES IN ECONOMIC THEORY ARE OF A STATISTICAL NATURE
Strictly speaking, exact laws belong in logical thought constructions only. When the
laws are transmitted to the real world, we must always allow room for inexplicable
discrepancies, the exact laws changing into relations of a statistical nature. This
holds true for any science, the natural sciences not excepted. In principle, economic
science is thus not special in this respect, even if, so far, there is an enormous
difference of degree relative to the “exact” sciences.
The theoretical laws we operate with say something about the effects of certain
imagined variations in a more or less simplified model world. For example: how
will changes in price and purchasing power affect the demand for a particular good;
what is the relationship between output and input in a production process, or what
is the connection between changes in interest rates and changes in the price level,
etc.? As a special case, our hypothesis may state that certain entities are constants,
but such conclusions also rely on certain imagined variations. If we now take our
model world into the observation field, innumerable new elements come into play.
The imagined variations are replaced by the variations which have actually taken
place in the real data. Our models in economic theory are often so simple that
we do not expect to find any agreement. Such models are by no means necessarily
without interest. On the contrary, they may represent a very valuable survey of
what would happen under certain assumptions, so that we know what would be
the outcome in a situation where the assumptions were in fact satisfied. Other
hypotheses may be closer to reality, by attempting to include as many realistic
elements as possible. But we shall never find any exact agreements with statistical
data. Nor is this what we are asking for. Our concern is whether
certain relations can be established as statistical average laws. We may say that
such laws connecting a certain set of specified entities are exact in a statistical sense
if, when the number of observations becomes very large, they approach in the
limit a certain form which is virtually independent of elements not included in our
model. That such statistical laws are what we usually have in mind in economic
theory, is confirmed by the fact that we almost invariably deal with variables that
have a certain “weight”. For instance, we do not ask for the demand responses of
specific persons to price changes, but rather seek average responses for a larger
group, or – equivalently – the responses of the typical representatives of certain
groups (“the man in the street”). We study average prices and average quantities
or total quantities for larger groups of objects, etc. The idea is the same as in
statistics, namely that the detailed differences disappear in the mass, while the
typical features cumulate. But the cases where the “errors” vanish completely, are
only of theoretical interest; for practical purposes it is much more important that
they almost disappear when considering large masses. When this is the case, it does
not make much difference whether we, for reasons of convenience, operate with exact
relations instead of relations with errors, e.g., by drawing the relation between price
and quantity in a demand diagram as a curve rather than as a more or less wide
band (Figure 1).
Figure 1 about here
But the range of issues related to hypothesis testing is not exhausted by the
question of smaller or larger degree of precision in the conformity between data and
a given hypothesis. The crucial problems in the testing of hypotheses precede this
stage of the analysis. It turns out – as we shall see – that many hypotheses by no
means lend themselves to verification by data, even if they are quantitatively well
defined and realistic enough. Indeed, we may be led astray if we attempt a direct
verification. Moreover, almost all hypotheses will be connected with substantial “ceteris paribus”-clauses which pose particular statistical problems. In addition comes
the question of the choice of quantitative form of the hypothesis under consideration (the specification problem), and here also we must usually be assisted by data.
Before addressing these various problems it is, however, convenient to take a look
at the general principles of statistical hypothesis testing.
3. ON THE GENERAL PRINCIPLE OF STATISTICAL TESTING OF HYPOTHESES
Let us consider two observation series, r and x, for example real income (r) and the
consumption of pork and other meat (x) in a number of working-class families over
a certain time span during which prices have been constant (Figure 2).
Figure 2 about here
We advance the following hypothesis:
(3.1)   x = k · r + b   (k and b constants).
Now, we might clearly draw an arbitrary straight line in the (x, r)-diagram and consider the observations that do not fall on this line as affected by “errors”. For our
question to have a meaning, we must therefore formulate certain criteria for accepting or rejecting the hypothesis. Of course, the choice of such criteria is not uniquely
determined. Supplementary information about the data at hand, with attention
paid to the intended use of the result, etc., will have to be considered. Obviously,
one set of criteria can lead us to reject, another to accept the same hypothesis. To
illustrate the kind of purely statistical-theoretical problems one may encounter,
we will go through the reasoning for one particular criterion. Let us assume that
k and b shall satisfy the condition that the sum of squares of the deviations from
the line x = k · r + b, taken in the x-direction, is as small as possible. The crucial
issue is, presumably, whether the k determined in this way is significantly positive
(that is, whether consumption increases with income). To proceed further, we need
to make supplementary assumptions about the nature of our observation material.
Let us, for example, consider the given observation set as a random sample from a
two-dimensional normal distribution with marginal expectations and standard deviations equal to those observed. Perhaps we have additional information that makes
such an assumption plausible. With this specification, the testing problem is reduced to examining whether the observed positive correlation coefficient, and hence
k, is significantly positive. In order to examine this, we try the following alternative
hypothesis: the observation set is a random sample from a two-dimensional normal
distribution with marginal expectations and standard deviations equal to those observed, but with correlation coefficient equal to zero. If this alternative hypothesis
is accepted, then our initial hypothesis is thereby rejected. On the other hand, if
the alternative hypothesis must be rejected, then all hypotheses that the correlation
coefficient in the two-dimensional normal distribution is negative must a fortiori
be rejected, i.e., the initial hypothesis must be accepted (under the assumptions
made). But now I can give a quite precise probability statement about the validity
of this alternative hypothesis, since from this hypothesis I am able to calculate the
probability for – in a sample of N observations – getting by chance a correlation coefficient at least as large as the one observed. If this probability is for example 0.05,
then I know that, on average, in 5 of 100 cases like the actual one, I commit an error
by rejecting the alternative hypothesis, that is, by accepting the observed coefficient
as significant. Once I specify how certain I want to be, the decision is thus
completely determined. If now the observed correlation coefficient passes this test,
I can, for example, substitute its value into the two-dimensional normal distribution
and compute a probabilistic expression for the observed distribution being a random
sample from the theoretical distribution thus determined. In this way, I also test
the validity of the assumed two-dimensional normal distribution.
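The reasoning above can be sketched numerically. The following Python sketch fits the line (3.1) by least squares in the x-direction and then estimates, by simulation, the probability of getting by chance a correlation at least as large as the observed one when the true correlation is zero. The income and consumption figures are hypothetical, and the simulation (its number of draws and its seed are arbitrary choices) is only an illustrative stand-in for the exact sampling distribution, not the computation used in the lecture. Because the correlation coefficient does not depend on the means and standard deviations, standard normal draws suffice under the alternative hypothesis.

```python
import math
import random


def least_squares_line(r, x):
    """Fit x = k*r + b by minimising squared deviations in the x-direction."""
    n = len(r)
    mean_r = sum(r) / n
    mean_x = sum(x) / n
    s_rr = sum((ri - mean_r) ** 2 for ri in r)
    s_rx = sum((ri - mean_r) * (xi - mean_x) for ri, xi in zip(r, x))
    k = s_rx / s_rr
    b = mean_x - k * mean_r
    return k, b


def correlation(r, x):
    """Sample correlation coefficient between r and x."""
    n = len(r)
    mean_r = sum(r) / n
    mean_x = sum(x) / n
    s_rr = sum((ri - mean_r) ** 2 for ri in r)
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_rx = sum((ri - mean_r) * (xi - mean_x) for ri, xi in zip(r, x))
    return s_rx / math.sqrt(s_rr * s_xx)


def null_tail_probability(observed_corr, n, draws=2000, seed=0):
    """Share of zero-correlation normal samples of size n whose sample
    correlation is at least as large as the observed one."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        u = [rng.gauss(0.0, 1.0) for _ in range(n)]
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if correlation(u, v) >= observed_corr:
            hits += 1
    return hits / draws


# Hypothetical data: real income r and meat consumption x for 8 families.
income = [20.0, 22.0, 25.0, 27.0, 30.0, 33.0, 35.0, 40.0]
meat = [5.1, 5.0, 5.9, 6.1, 6.0, 7.2, 7.0, 8.1]

k, b = least_squares_line(income, meat)
rho = correlation(income, meat)
p_value = null_tail_probability(rho, len(income))
print(f"k = {k:.3f}, b = {b:.3f}, correlation = {rho:.3f}, tail probability = {p_value:.4f}")
```

If the tail probability falls below the chosen level (0.05 in the text), the zero-correlation hypothesis is rejected and the positive k is accepted as significant, always under the assumptions made.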
One sees that by this kind of testing we are exposed to two types of errors:
1. I may reject the hypothesis when it is correct.
2. I may accept the hypothesis when it is wrong, i.e., when another hypothesis is
correct.
The first type of error is one I may commit when dismissing a hypothesis that does
not seem very likely, but still might be correct. The second type of error occurs
when I accept a particular one among the possible hypotheses that “survive” the
testing process, since one of the others may well be the correct one. What I achieve
by performing hypothesis tests like these, is to delimit a field of possible hypotheses.
Yet, probabilistic considerations applied to economic data may often be of dubious value, so that we may here choose other criteria. But still the argument becomes
similar. It is therefore convenient to take the purely statistical hypothesis testing
technique as our point of departure.
4. THE FREE AND THE SYSTEM-BOUND VARIATION. “VISIBLE” AND “INVISIBLE” HYPOTHESES
Many hypotheses, including perhaps those we reckon as basic in economic theory,
seem to be strongly at odds with the statistical facts. This phenomenon often
provides those with a particular interest in criticizing economic theory with welcome
“statistical counter-evidence”. But there is not necessarily anything paradoxical in
such occurrences. Rather, it may be that such seeming contradictions just serve
to verify the theoretical hypotheses. We will now examine this phenomenon a bit
closer. It relates to matters which are absolutely necessary to keep in mind when
attempting statistical verification.
We see this problem most clearly when reflecting on how the construction of the
hypotheses proceeds: First, we define a specific set of objects – certain economic
variables – to be studied. At the start, they move freely in our model world. Then
we begin studying the effects of certain imagined variations, and are in this way
led towards certain relations which the studied variables should satisfy. Each such
relation restricts the freedom of variation in the group of entities we study. If we
have n variables and m independent equations (n > m), then only n − m degrees of
freedom remain. A person who now takes a glance into our model world will not
detect the free variations on which the formulation of each separate conditioning
equation rested; he will see solely the system-bound variation which follows when
requiring all conditioning equations to be fulfilled simultaneously.
In the demand and supply theory we find simple examples of this. Let us take a
market where the incomes are constant and assume that demand and supply (x) are
each a function only of the price (p) of the commodity considered, that is:
(4.1)   x = f(p)   (the demand curve),
(4.2)   x = g(p)   (the supply curve).
(See Figure 3.) If this holds exactly, the only information we get from data for this
market will be the point A in Figure 3. This alone cannot give us any information
about the shape of the two curves. They are “invisible” hypotheses in this material.
On the other hand, if our hypotheses are realistic, then the observed result follows
by necessity. In this “pure” case, we run no risk of being misled by data. In practice
we should, however, be prepared to find that the demand and the supply relations
are not two curves, but rather two bands, as indicated in Figure 4. And then data
may lead us astray. Indeed, the data then become the arbitrarily located observation
points in the area (a, b, c, d) in Figure 4. If the widths of the two error bands are
approximately equal, then we definitely do not know whether what we observe is
supply variations or demand variations. If the demand band is narrower than the
supply band, we get most knowledge about the form of the demand “function”.
Conversely, if the supply band is the narrower, we obtain most knowledge about the
supply “function”.
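The role of the two error bands can be illustrated with a small simulation. The sketch below assumes linear demand and supply lines with independent normal errors; all numbers (intercepts, slopes, error spreads, sample size, seed) are hypothetical. It computes the market-clearing price and quantity for each draw and reports the slope of the least-squares line seen in the (x, p)-diagram.

```python
import random


def ols_slope(p, x):
    """Slope of the least-squares line of x on p."""
    n = len(p)
    mean_p = sum(p) / n
    mean_x = sum(x) / n
    s_pp = sum((pi - mean_p) ** 2 for pi in p)
    s_px = sum((pi - mean_p) * (xi - mean_x) for pi, xi in zip(p, x))
    return s_px / s_pp


def observed_slope(sd_demand, sd_supply, n=5000, seed=1):
    """Simulate market outcomes and return the slope seen in the data.

    Assumed structure (hypothetical numbers):
      demand: x = 10 - 1.0 * p + u,  u normal with spread sd_demand
      supply: x =  1 + 0.5 * p + v,  v normal with spread sd_supply
    """
    rng = random.Random(seed)
    prices, quantities = [], []
    for _ in range(n):
        u = rng.gauss(0.0, sd_demand)
        v = rng.gauss(0.0, sd_supply)
        # Market clearing: 10 - 1.0*p + u = 1 + 0.5*p + v
        p = (9.0 + u - v) / 1.5
        x = 1.0 + 0.5 * p + v
        prices.append(p)
        quantities.append(x)
    return ols_slope(prices, quantities)


narrow_demand = observed_slope(sd_demand=0.1, sd_supply=1.0)
narrow_supply = observed_slope(sd_demand=1.0, sd_supply=0.1)
equal_bands = observed_slope(sd_demand=1.0, sd_supply=1.0)

print(f"narrow demand band: slope = {narrow_demand:.2f} (demand slope is -1.0)")
print(f"narrow supply band: slope = {narrow_supply:.2f} (supply slope is 0.5)")
print(f"equal bands:        slope = {equal_bands:.2f} (neither curve)")
```

Under these assumptions the fitted slope lies close to the slope of whichever relation has the narrower band, and with equal bands it falls between the two and describes neither curve.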
Figures 3 and 4 about here
Now, let us bring variations in income (r) into the demand function, still letting
the supply depend on the price only, that is
(4.3)   x = f(p, r)   (demand),
(4.4)   x = g(p)   (supply).
Assume, for simplicity, that (4.3) and (4.4) are two linear equations, i.e., two planes
in the (x, p, r)-diagram (see Figure 5). If this holds exactly, then all variation in
this market must take place along the line of intersection between the two planes.
This line of intersection is the confluent market relation emerging from the structural relations (4.3) and (4.4) holding simultaneously. It is, of course, impossible
to determine the slope of the demand plane from statistical data for this market,
as there are innumerable planes (and, for that matter, also curved surfaces) which
give the same line of intersection. The only visible traces of our hypotheses are three
straight lines in, respectively, the (x, p)-, the (x, r)-, and the (p, r)-diagram. In our
example, the straight line in the (x, p)-diagram is, of course, nothing other than the
supply curve (see Figure 6). We know this here because we know the two structural
relations (4.3) and (4.4). But if we only had the data to rely on, we might have been
led to interpret the observed relation in the (x, p)-diagram as an upward sloping demand curve. If, on the contrary, we had formulated the hypotheses (4.3) and (4.4) a
priori, then the observed relation would just have been a verification. Just as in the
previous example, it will be most realistic to consider the demand and the supply
functions, not as exact planes or surfaces, but rather as two surfaces of a certain
thickness, in this way allowing for the random variations. The confluent market
relation then becomes a “box” extending outwards in the (x, p, r)-diagram, and we
get statistical problems similar to those mentioned in connection with Figure 4.
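The exact linear case can be written out as a short derivation. The coefficient names below are illustrative assumptions; the lecture gives (4.3) and (4.4) only in general form. The last line shows why the demand plane cannot be recovered: a whole family of planes passes through the same line of intersection, and the demand plane is merely the member with λ = 0.

```latex
% Assumed linear forms of (4.3) and (4.4); coefficient names are illustrative.
\begin{align*}
  \text{demand:}\quad x &= a_1 p + a_2 r + a_0, \\
  \text{supply:}\quad x &= b_1 p + b_0, \qquad b_1 \neq a_1 .
\end{align*}
% Both relations hold simultaneously only on the line of intersection:
\begin{align*}
  p &= \frac{a_2 r + a_0 - b_0}{b_1 - a_1}, &
  x &= b_1 p + b_0 .
\end{align*}
% On that line, (1 + \lambda) x = (a_1 p + a_2 r + a_0) + \lambda (b_1 p + b_0),
% so every plane of the following family (\lambda \neq -1) contains it:
\begin{equation*}
  x = \frac{(a_1 + \lambda b_1)\, p + a_2 r + (a_0 + \lambda b_0)}{1 + \lambda} .
\end{equation*}
```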
Figures 5 and 6 about here
This problem of confluence emerges as part of practically all testing tasks. It also
occurs in other fields of research, but it occupies a prominent position in economic
testing problems because here we are in general – disregarding the possibilities provided by interview investigations – precluded from performing experiments with free
variations. Thus, the system-bound observed variation is the only information at our
disposal. This is indeed one of the main reasons why refined statistical techniques
must be given such a strong emphasis in modern economic research. Using solely the
“sledge hammers” among statistical methods will not do; we need the most refined
tools among our statistical techniques to come to grips with the problems.
Above we have seen how we can enter into difficulties when trying out hypotheses, without this being due to restrictive assumptions or lack of realism. But the
testing problems, of course, get still more complicated when one is confronted with
hypotheses conditioned by substantial “ceteris paribus”-clauses. We shall now take
a look at the key issues raised by this.
5. THE “CETERIS PARIBUS” CLAUSE AS A STATISTICAL PROBLEM
For “ceteris paribus” statements to lead to something more than mere trivialities, we
should first and foremost make clear to ourselves which other elements are assumed
unchanged. We have no justification for making a “ceteris paribus” statement on
matters that we know nothing about at all. The rational application of the “ceteris
paribus” clause comes into our hypotheses in two forms. The first is that when
formulating a hypothesis as realistic as possible, we start out by trying to specify
the elements essential to the problem at hand. Data and practical prior knowledge
may guide us in this process. But we are forced to restrict our selection and then
we may impose the “ceteris paribus”-clause on all remaining unspecified elements,
because the total effect of these other elements has by experience played no large
part for the problem at hand and neither can they be expected to do so in the future,
so that whether we assume them unchanged or let them play freely is virtually of no
consequence. The second form of the “ceteris paribus”-clause is the one we impose
within the system of specified variables. The idea here is just the same as that of
partial derivatives. We study relations between some among the specified objects,
which are mutually independent, subject to the assumption that the remaining
elements specified are kept constant. Usually, the form that such relations take
depends on the level at which the other elements are fixed. Such reasoning is not
only of theoretical interest; on the contrary, it is also the basis for assessing effects
of practical measures of intervention in the economic activity.
Statistical data do not in general satisfy – fortunately, we should say – such
“ceteris paribus”-clauses. Thereby we indeed get a tool by means of which we can
study the effects of variations in all the entities which are essential to the problem at
hand. Once we know these effects, we can possibly eliminate them. The requirement
of statistical testability is really the most important – not to say the only – means
by which to clarify the nature of our “ceteris paribus” clauses.
Among the statistical techniques, regression analysis comes to the foreground here. It occupies, moreover, a key position in all modern econometric research.
By means of it, the general principle for an extensive group of test problems can be
formulated like this: We attempt to establish regression equations which include, in
addition to the variables entering our hypotheses, as many other “irrelevant” variables as possible whose variations systematically influence what is to be “explained”,
such that the residual variations become as small as possible and non-systematic. If
this is within reach, then the “ceteris paribus” problem is reduced to that of studying the effect of partial variations in the regression equation we have established.
But not many attempts are needed to become convinced that this is far from
following a beaten track.
First, we are again confronted with the issue of confluent versus structural relations, exemplified earlier. For example, if we have, apart from random errors, a
relation like the one illustrated in Figure 5, it would have been completely nonsensical to attempt to determine the slope of the demand plane by regression, including
income as a variable in addition to price. That would have given a completely arbitrary result, entirely determined by the errors in the material. Modern regression
analysis has, however, measures to safeguard against such fictitious results. We are
able to see the confluent relations which the data satisfy. As mentioned above, this
is actually a very important way of testing our hypotheses. If they are realistic, they
should give just the observed confluent relation as a result of elimination. Only in
rare situations are we able to formulate our hypothesis a priori such that this will be
the case. The statistical regression analysis thus becomes an important means for
adjusting the hypothesis formulation. — The above is, of course, merely a couple of
rather superficial remarks to indicate the place of regression analysis as a statistical-technical tool in hypothesis testing. If we should, even only casually, touch upon the
more technical details, we would right away have to erect a considerable construction
of definitions and formulae, for which there is no space here.
Secondly, we are confronted with the specific issue of the significance of the
statistical results. Often the situation here is as follows: Our hypothesis may, for
example, be a relation which the data ought to satisfy. But in a given set of observations the actually realized variations in some variables may have been so small that
the errors dominate. From such data we then cannot see what would have taken
place in another data set in which variations had been pronounced. In such cases we
get nowhere in our attempt to illuminate the effect of our “ceteris paribus”-clauses.
If we assume that the variables which have not had significant variations in our data
set, continue to be of this kind, then we can say that it is inessential to include
these variables in our hypothesis to explain what happens. But what is of greatest
interest is often indeed to find what would have happened if one made interventions
in the system.
I can give an illustration of these problems, taken from a study of the demand for
pork in Copenhagen, undertaken at the Institute of Economics at Aarhus University
this year. (The results conveyed here are only preliminary, and we mention only one
of the trials.) One would expect beforehand that many factors influence the consumption of pork: the price of pork, the price of other meat relative to that of pork,
the income of the purchasers, the cost-of-living, and the size and composition of the
population would all be among the factors one might a priori take into account.
All these elements were included in a regression analysis. The cost-of-living level
was used to transform prices and incomes into real values, the size and composition
of the population were used to calculate consumption per consumer unit. These
transformed variables were used as the explanatory variables. The relationship was
assumed linear; nothing in the data suggested a more complicated relationship. (Attempts with logarithmic transformations gave, moreover, virtually the same result.)
It then turned out that the direct covariation between the consumption (per consumer unit) and the real pork price was the totally dominating feature of our data
set. Without elimination of any other factors this relation showed a correlation of
about 0.90 – which gave a gross elasticity with respect to the pork price of about
−0.8. Inclusion of the other factors had virtually no influence on this correlation.
Attempting to explain the residual variation by incorporating these other factors
worked out as theoretically expected, but their explanatory power was weak.
In particular, this was the case for the effect of real income changes. It is impossible to accept this as a general result. If consumers’ purchasing power declined
to, say, one half, it would unavoidably exert a decisive effect on the consumption
of a commodity like pork. The circumstances which explain the above outcome are
partly that the variations in the real income have been small, and partly that they
have had a confluent covariation with the pork price (correlation ca. −0.70). If this
latter relationship had been very tight, we might equally well have taken income as
the explanatory variable instead of the real pork price. However, the income variation alone gave a much less significant explanation of the changes in consumption
(correlation only about 0.5). Theoretically as well as practically it would have been
of great interest to be able to statistically illustrate the effect of price variations
under constant incomes. But in this case, data were hardly sufficient to assess the
implication of such a “ceteris paribus”-clause.
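Why small and confluent income variation leaves the income effect poorly determined can be illustrated with synthetic data. The sketch below is not a reconstruction of the Aarhus study: the coefficients, error spread, sample size and seed are all hypothetical, and only the idea of an income series that varies little and is negatively correlated with price is taken from the text. It repeatedly estimates a two-variable regression and compares how much the estimated coefficients scatter from sample to sample.

```python
import math
import random
import statistics


def two_regressor_ols(p, r, x):
    """Least-squares coefficients of x on p and r (intercept included)."""
    n = len(x)
    mp, mr, mx = sum(p) / n, sum(r) / n, sum(x) / n
    s_pp = sum((a - mp) ** 2 for a in p)
    s_rr = sum((a - mr) ** 2 for a in r)
    s_pr = sum((a - mp) * (c - mr) for a, c in zip(p, r))
    s_px = sum((a - mp) * (c - mx) for a, c in zip(p, x))
    s_rx = sum((a - mr) * (c - mx) for a, c in zip(r, x))
    det = s_pp * s_rr - s_pr ** 2
    coef_p = (s_px * s_rr - s_rx * s_pr) / det
    coef_r = (s_rx * s_pp - s_px * s_pr) / det
    return coef_p, coef_r


def coefficient_spread(sd_income, corr=-0.7, n=30, replications=300, seed=2):
    """Standard deviation of the estimated price and income coefficients
    across repeated synthetic samples (all parameter values hypothetical)."""
    rng = random.Random(seed)
    price_coefs, income_coefs = [], []
    for _rep in range(replications):
        p, r, x = [], [], []
        for _obs in range(n):
            price = rng.gauss(0.0, 1.0)
            shock = rng.gauss(0.0, 1.0)
            # Income varies with spread sd_income and is correlated with price.
            income = sd_income * (corr * price + math.sqrt(1.0 - corr ** 2) * shock)
            noise = rng.gauss(0.0, 0.3)
            p.append(price)
            r.append(income)
            x.append(-0.8 * price + 0.5 * income + noise)
        coef_p, coef_r = two_regressor_ols(p, r, x)
        price_coefs.append(coef_p)
        income_coefs.append(coef_r)
    return statistics.stdev(price_coefs), statistics.stdev(income_coefs)


small_variation = coefficient_spread(sd_income=0.1)
large_variation = coefficient_spread(sd_income=1.0)
print("spread of (price, income) coefficients, small income variation:", small_variation)
print("spread of (price, income) coefficients, large income variation:", large_variation)
```

In this sketch the income coefficient scatters far more widely when income has varied little, although the true income effect is the same in both cases. This is the sense in which such data cannot show what would happen under pronounced income changes.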
In treating the testing problems we have thus far tacitly skipped another major
problem that usually occurs jointly with it, namely
6. THE SPECIFICATION PROBLEM
This is the problem of choosing the quantitative form for the hypothesis of the
economic theory. In more general theoretical formulations one often, for the time
being, leaves this question open. For instance, one often indicates only that a
variable is some function of some other variables, or, a bit more specifically, that a
certain variable, for example, increases with a partial increase in another variable.
But such general statements often provide only half-way solutions to the problems
presented. It is often at least as important to establish exactly how a change works
and how large its magnitude is. This is not a question to be asked afterwards,
at the stage of applying the theoretical law. In fact, the numerical values of the
parameters quite often have importance for the kind of theoretical conclusions
one can draw, as well. We see this clearly when we, for example, study the solution
forms of a determinate dynamic system. Changes in the numerical values of the
coefficients may well bring the nature of the solution to change, e.g., switch from
cyclical movements to trend movements. The final answer is not obtained until a
statistical analysis is performed. But this requires that the hypothesis is given a
precise quantitative form. Here we should also consult the data, but data alone
cannot provide a unique solution to the specification problem. In fact, it is
meaningless to ask simply: which hypothesis is the best one? Data cannot
decide this. But we can formulate a set of alternative hypotheses and choose certain
testing criteria to distinguish between them, e.g., decide whether a parabola gives a
smaller residual sum of squares than a straight line. What is important is that we,
by using data, are able to eliminate a sizable amount of “unreasonable” hypotheses.
We indicated general principles for doing this in section 3 above.
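The remark that numerical coefficient values can change the nature of the solution of a dynamic system can be illustrated with the simplest case. The sketch below assumes a second-order linear difference equation y[t] = a·y[t-1] + b·y[t-2], which is an illustrative choice and not a system taken from the lecture; the coefficient and starting values are hypothetical. Complex characteristic roots give cyclical movements, while real roots do not. (Negative real roots, which produce a period-two alternation, are not treated separately here.)

```python
import cmath


def characteristic_roots(a, b):
    """Roots of z**2 - a*z - b = 0 for y[t] = a*y[t-1] + b*y[t-2]."""
    root_of_disc = cmath.sqrt(a * a + 4.0 * b)
    return (a + root_of_disc) / 2.0, (a - root_of_disc) / 2.0


def is_cyclical(a, b):
    """True when the characteristic roots are complex (sinusoidal cycles)."""
    first_root, _ = characteristic_roots(a, b)
    return abs(first_root.imag) > 1e-12


def simulate(a, b, y0, y1, steps):
    """Iterate the difference equation from the two starting values."""
    path = [y0, y1]
    for _ in range(steps):
        path.append(a * path[-1] + b * path[-2])
    return path


def turning_points(path):
    """Number of times the path switches between rising and falling."""
    diffs = [later - earlier for earlier, later in zip(path, path[1:])]
    return sum(1 for d0, d1 in zip(diffs, diffs[1:]) if d0 * d1 < 0)


# Two hypothetical systems differing only in the value of one coefficient.
trend_path = simulate(1.6, -0.6, 1.0, 1.2, steps=20)
cycle_path = simulate(1.6, -0.8, 1.0, 1.2, steps=20)

print("b = -0.6: cyclical?", is_cyclical(1.6, -0.6), "turning points:", turning_points(trend_path))
print("b = -0.8: cyclical?", is_cyclical(1.6, -0.8), "turning points:", turning_points(cycle_path))
```

A modest change in one coefficient moves the discriminant a² + 4b across zero, so the qualitative conclusion depends on numerical values that only a statistical analysis can supply.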
The choice between different possible hypotheses may sometimes be reduced to
a question of mathematical approximation. The final choice of specification may
appear to be of lesser importance. But often the choice may also have far-reaching
consequences for the conclusions drawn. Suppose, for example, that the issue is to
choose between the following two forms of a demand curve
(6.1)   x = a · p + b   (a and b constants),
(6.2)   log(x) = e · log(p) + c   (e and c constants).
In (6.1) the demand elasticity with respect to the price is a function of x and p:
(6.3)   e(x, p) = a · p/x = a · p/(a · p + b).
In (6.2) the demand elasticity is constant, equal to e. Assume that both hypotheses
give practically the same correlation in a given data set. This may well happen. (See,
for example, Henry Schultz: Der Sinn der statistischen Nachfragekurven, Bonn 1930,
pp. 57 and 69.) If we insert in (6.3) the full sample means of x and p, we usually
get practically the same value of the elasticity as the constant e in (6.2). This
is distinctly different from inserting the individual observed p’s and the x’s
calculated from (6.1) into (6.3). (Of course, one should not take the observed x-values.) Then we get a more or less strongly varying elasticity. (See for example
Wolff: The Demand for Passengers Cars in U.S.A., Econometrica, April 1938, pp.
123–124.) From a theoretical-economic viewpoint the difference between these two
results is essential. We here need supplementary theoretical considerations to decide
which hypothesis to choose.
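The contrast between the two forms can be made concrete with a few lines of Python. The sketch below uses hypothetical values of a and b and a hypothetical set of prices. It evaluates the elasticity of the linear form at each price and at the sample means, and fits the constant-elasticity form by least squares on logarithms to the quantities calculated from the line.

```python
import math


def linear_elasticity(a, b, p):
    """Elasticity of the linear demand x = a*p + b at price p."""
    return a * p / (a * p + b)


def loglog_slope(prices, quantities):
    """Least-squares slope of log(x) on log(p): the constant elasticity e."""
    log_p = [math.log(p) for p in prices]
    log_x = [math.log(x) for x in quantities]
    n = len(log_p)
    mean_lp = sum(log_p) / n
    mean_lx = sum(log_x) / n
    s_pp = sum((lp - mean_lp) ** 2 for lp in log_p)
    s_px = sum((lp - mean_lp) * (lx - mean_lx) for lp, lx in zip(log_p, log_x))
    return s_px / s_pp


# Hypothetical linear demand curve and price observations.
a, b = -2.0, 100.0
prices = [10.0, 15.0, 20.0, 25.0, 30.0]
quantities = [a * p + b for p in prices]  # x-values calculated from the line

point_elasticities = [linear_elasticity(a, b, p) for p in prices]
mean_p = sum(prices) / len(prices)
mean_x = sum(quantities) / len(quantities)
elasticity_at_means = a * mean_p / mean_x
constant_e = loglog_slope(prices, quantities)

print("elasticity at each price:", [round(e, 3) for e in point_elasticities])
print("elasticity at the means: ", round(elasticity_at_means, 3))
print("constant elasticity e:   ", round(constant_e, 3))
```

With these numbers the elasticity implied by the linear form ranges from -0.25 at the lowest price to -1.5 at the highest, whereas the logarithmic form summarizes the same curve by a single number within that range.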
This arbitrariness is, however, narrowed down by the fact that these various
hypotheses should fit into a large number of interconnections with other economic
factors. This criss-cross testing makes it possible a priori to eliminate a lot of
quite impossible hypotheses. Here as in other cases we clearly see how theoretical
formulation of hypotheses and statistical testing are not two successive steps, but a
simultaneous process in the analysis of economic problems. This is the basic idea of
modern econometric research.
7. THE TREND PROBLEM
We now proceed to take a look at the problem of trend elimination. This is often
perceived as a purely technical-statistical issue, but its nature is really much
more profound. We will attempt to examine briefly the logical foundation for trend
elimination.
In our theoretical formulation of laws, we are always concerned with phenomena
of such a nature that they may be assumed to repeat themselves. This applies to both
static and dynamic law formulations. Now, the most important economic data are
given as time series, a quite specific series of successive events. Is it indeed possible
to test laws for recurrent phenomena on the basis of such time-bound variations?
To be able to say anything about this question we must study the characteristic
path of the observed time series. For economic time series there are usually two
features that catch our attention: one is a steady straight development, the trend
path, the other is certain variations around the trend movement. We often can trace
the trend back to certain more sluggishly moving factors (notably changes in the size
and composition of the populations), factors that are outside the circuit of variables
included in our hypotheses and working independently of the variations we want to
study. In such cases it is natural to take a trend as a datum in the analysis, and
consider what happens apart from the trend variation. This is the rational basis for
a statistical elimination of the trend in our observation series. It is unacceptable
to undertake a purely mechanical trend elimination without being able to give a
concrete interpretation of the emergence of the trend. It might very well be that
an observed trend gets its natural explanation through the relations and the set of
variables that are included in our hypotheses. We shall take a closer look at this.
A realistic system of hypotheses will comprise static as well as dynamic structural
relations. The formulation of the various structural relations is founded on certain
imagined alternative variations, partly in the variables themselves at a given point
in time, and partly in growth rates and “lag” terms. Assume that our efforts lead
to a determinate dynamic system, allowing us to solve it, i.e., finding the time
path of the variables studied. It may then well happen that the observed trend
movements are just the possible solutions of this system. In other words, the trend
movement may emerge as a confluent form of the dynamic system of structural
equations. The observed trend movements can then often be taken to be a statistical
verification of our system of hypotheses. If we mechanically eliminate the trend in
advance as one or another time function (e.g., a straight line or an exponential
function), then we have, first, prevented ourselves from realizing that our system of
hypotheses may give a plausible explanation of the trend movement. Further, we
may have prevented ourselves from undertaking a statistical testing and estimation
of coefficients of certain structural relations where this would otherwise have been
possible, as it may happen that for some variables, the trend is the only significant
element, while the other variations are disguised by random “errors”. And, finally,
we have confined attention to a particular trend path independent of our structural
relations, so that we cannot uncover the way in which changes in the structure would
influence the trend. This latter point can really be of crucial interest in assessing
regulatory measures.
When our testing data constitute series with pronounced trend movements, then
it might be asserted that the hypotheses we verify are not laws for recurrent phenomena, but only a description of a historic development. If this view were to be
accepted in general, it would have been a heavy blow for the strivings to establish
economic laws. But we do not need to restrict ourselves to such a negative position,
as is already apparent from our remarks on the trend problem. Either the trend has
its origins outside our system of hypotheses, and if we specify these causes,
we are entitled to eliminate the trend in advance, so that we consider only the
residual variation as having the character of recurrent cases. Or the trend is rooted
in the structure of the system under consideration, an outcome of an analysis of free
variations being explained through the same system of hypotheses as the one which
leads to the variations of recurrent character. (To examine whether the latter holds
true, it may be of interest to try out the same hypothesis on detrended data.) There
is really no reason why trended data could not at the same time be conceived as
recurrent phenomena; it is only a question of what we are considering, whether the
variables themselves or their growth rates. As soon as a growth rate varies around
a non-zero average, then the corresponding variable will itself get trended. For example, let W be the stock of real capital at a certain time and w and u investment
and depreciation per unit of time, respectively. Regardless of our hypothesis for the
form of the relation between w and u, provided it includes a condition that w on
average should exceed u, then W will follow an increasing trend (we assume w and
u positive). Then we just have
(7.1)
Ẇ (t) = w(t)−u(t) = a positive variable on average.
Here the different situations with respect to Ẇ(t) (the growth rate of W) are the
elements which can repeat themselves, W itself gradually moving towards new positions. And this is what is implicitly expressed in our hypothesis.
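The point of (7.1) can be illustrated by a minimal numerical sketch. It assumes, purely for illustration, that investment w and depreciation u are drawn at random in each period, with w exceeding u on average. The distributions, the parameter values, the discrete-time accumulation, and the function name `simulate_capital` are assumptions of the sketch and are not part of the lecture.

```python
import random

def simulate_capital(periods=200, seed=0):
    """Accumulate W from net investment w - u, a discrete-time analogue of (7.1)."""
    rng = random.Random(seed)
    W = 100.0                       # hypothetical initial capital stock
    levels, growth = [], []
    for _ in range(periods):
        w = rng.uniform(4.0, 8.0)   # investment per period (assumed, positive)
        u = rng.uniform(3.0, 7.0)   # depreciation per period (assumed, positive)
        dW = w - u                  # discrete analogue of the growth rate of W
        W += dW
        growth.append(dW)
        levels.append(W)
    return levels, growth

levels, growth = simulate_capital()
mean_growth = sum(growth) / len(growth)
print(f"average growth rate: {mean_growth:.2f}")
print(f"W after the first and the last period: {levels[0]:.1f}, {levels[-1]:.1f}")
```

In such a run the growth rate keeps returning to values within the same bounded range, so its situations can recur, while W itself drifts steadily upwards.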
8. AVERAGE VERSUS MOMENTARY EXPLANATIONS
We mentioned above how the choice between different possible hypotheses cannot
be made unconditionally; we must establish certain criteria for how to proceed. The
nature of these criteria cannot be the same in all cases. Let us stick to hypotheses
that are given as an equation between certain economic variables. By transforming
the variables, constructing new terms, etc., we will in general be able to express the
relation in linear form so that we can use linear regression analysis as a testing tool.
We will be inclined to accept such a relationship if the statistical agreement between
the observed data and those that can be computed from the regression equation is
good. This question is often mixed up with the question of whether the coefficients
in the hypothesis, as determined by the regression, are significant or not (i.e., whether the
absolute values of the coefficients are large relative to their standard errors). These,
however, are two different issues, the first being about the error in the momentary
explanation, the second about the error of an average relation. (This, by the way,
should not be mixed up with the fact that each observation entering the analysis can
be an average for a larger group of units, as mentioned above when commenting on
the statistical nature of economic laws.) Let us take an example to illustrate this
difference between average explanation and momentary explanation.
Consider N observations on three variables, x, y, z, which are exactly related
through the equation
(8.1)
y = 2x−3z.
For simplicity, we assume that the three variables are measured from their respective
means, and that they have finite standard deviations σx, σy, σz, which together
with the correlation coefficients rxy, rxz, ryz, approach certain fixed numbers when
the number of observations becomes large. Suppose that we do not know the exact
relation (8.1), but believe that there exists a proportionality relationship between y
and x only,
(8.2)
y = bx.
The correlation coefficient between these two variables becomes
(8.3)
$$r_{xy} = \frac{\sum_{i=1}^{N} x_i y_i}{N \cdot \sigma_x \sigma_y} = \frac{\sum_{i=1}^{N} x_i (2x_i - 3z_i)}{N \cdot \sigma_x \sigma_y}.$$
After rearrangement we get
(8.4)
$$r_{xy} = 2 \cdot \frac{\sigma_x}{\sigma_y} - 3 \cdot r_{xz} \cdot \frac{\sigma_z}{\sigma_y}.$$
Thus, from our above assumptions, rxy will converge to a fixed number when N
increases. Now, the elementary regression of y on x is defined by
(8.5)
$$\frac{y}{\sigma_y} = r_{xy} \cdot \frac{x}{\sigma_x}.$$
Inserting (8.4) in (8.5) we get, as our expression for (8.2),
(8.6)
$$y^{\text{calculated}} = \left(2 - 3 r_{xz} \cdot \frac{\sigma_z}{\sigma_x}\right) \cdot x.$$
Now, it is well known that the standard error of the regression coefficient b equals
(8.7)
$$\sigma_b = \frac{1}{\sqrt{N-2}} \cdot \frac{\sigma_y}{\sigma_x} \sqrt{1 - r_{xy}^2},$$
and this entity becomes larger the smaller N is. In other words, the regression
between y and x becomes more precisely determined the larger is the number of
observations at our disposal. Thus, there exists an average relation between y and x
per group of N observations, which is more stable the larger is N. But this does not
necessarily mean that (8.6) gives a good agreement between observed and calculated
values of y, i.e., a good description of the momentary variation in y. Consider now
the mean squared difference between y in (8.1) and y calculated in (8.6). It becomes
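(8.8)
$$\frac{1}{N} \sum_{i=1}^{N} \left(y_i - y_i^{\text{calculated}}\right)^2 = \frac{1}{N} \sum_{i=1}^{N} \left(2x_i - 3z_i - 2x_i + 3 r_{xz} \cdot \frac{\sigma_z}{\sigma_x} \cdot x_i\right)^2 = 9\left(1 - r_{xz}^2\right)\sigma_z^2.$$
We see that irrespective of the number of observations, the same unexplained variance is left in y, expressed by $9(1 - r_{xz}^2)\sigma_z^2$ at the right hand side. When the number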
of observations increases, the error in the average explanation is more and more
reduced, while the error in the momentary explanation remains at the same level
as long as we in our hypothesis do not include new variables (here z), which can
explain more or less of the residual dispersion.
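The contrast between (8.7) and (8.8) can be checked numerically. The following is a minimal sketch, assuming normally distributed x and a z constructed so that its correlation with x is about 0.6 in large samples. The data design, the seed, and the helper names `demean`, `moments`, and `experiment` are assumptions made for illustration only.

```python
import math
import random

def demean(v):
    """Measure a series from its mean, as assumed in the text."""
    m = sum(v) / len(v)
    return [a - m for a in v]

def moments(x, z):
    """Return (sd_x, sd_z, r_xz) for demeaned series, using 1/N moments."""
    n = len(x)
    sx = math.sqrt(sum(a * a for a in x) / n)
    sz = math.sqrt(sum(a * a for a in z) / n)
    rxz = sum(a * b for a, b in zip(x, z)) / (n * sx * sz)
    return sx, sz, rxz

def experiment(n, seed=1):
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Assumed design: z shares part of its variation with x.
    z = [0.6 * a + 0.8 * rng.gauss(0.0, 1.0) for a in x]
    x, z = demean(x), demean(z)
    y = [2 * a - 3 * b for a, b in zip(x, z)]                     # exact relation (8.1)
    sx, sz, rxz = moments(x, z)
    sy = math.sqrt(sum(a * a for a in y) / n)
    b = sum(a * c for a, c in zip(x, y)) / sum(a * a for a in x)  # slope in (8.2)
    rxy = b * sx / sy
    se_b = (sy / sx) * math.sqrt(1 - rxy ** 2) / math.sqrt(n - 2)  # (8.7)
    mse = sum((c - b * a) ** 2 for a, c in zip(x, y)) / n         # left side of (8.8)
    rhs = 9 * (1 - rxz ** 2) * sz ** 2                            # right side of (8.8)
    return b, se_b, mse, rhs

for n in (50, 5000):
    b, se_b, mse, rhs = experiment(n)
    print(f"N={n}: b={b:.3f}, se_b={se_b:.4f}, mse={mse:.3f}, rhs of (8.8)={rhs:.3f}")
```

Under these assumptions the standard error of b falls as N grows, while the mean squared error agrees with the right hand side of (8.8) in every sample and does not shrink.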
One sees from (8.8) that the momentary explanation of y by using (8.6) becomes
better the smaller is the variation in z and the larger is the correlation (rxz) between
the latter variation and that of x as included in (8.6). If the correlation rxz is
high, this obviously means that we, by accounting for the variation in x, also have
accounted for part of the variation in z. If now rxz is small while z displays very little
variation, this means that z is a superfluous variable when it comes to explaining
the observed variation of y in this material. But that does not necessarily mean that
(8.6) will give us a good forecast of y outside the period covered by data. Because
from (8.1) we just see that z will exert a substantial impact if it should happen to
vary markedly stronger than it does in our material. Hence, although x alone may
give a very good explanation of y in our material, it may be of decisive importance
whether or not we can utilize the tiny part of the variation remaining in attempting
to capture the effect of z (i.e., the coefficient value 3 in (8.1)).
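The forecasting problem can be sketched in the same way. The example below assumes, for illustration only, that z is uncorrelated with x, varies very little in the observed material, and varies strongly afterwards. The spreads 0.05 and 2.0, the sample sizes, and the helper names `make_data`, `fit_slope`, and `mean_sq_error` are hypothetical choices, and the variables are treated as measured from their population means.

```python
import random

def make_data(n, z_scale, rng):
    """Generate (x, y) from the exact relation (8.1) with z of a given spread."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z = [rng.gauss(0.0, z_scale) for _ in range(n)]   # z drawn independently of x
    y = [2 * a - 3 * b for a, b in zip(x, z)]
    return x, y

def fit_slope(x, y):
    """Least-squares slope of y on x through the origin, as in (8.2)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def mean_sq_error(x, y, b):
    """Mean squared difference between observed y and b * x."""
    return sum((c - b * a) ** 2 for a, c in zip(x, y)) / len(x)

rng = random.Random(2)
x_in, y_in = make_data(500, z_scale=0.05, rng=rng)    # z nearly constant in the material
x_out, y_out = make_data(500, z_scale=2.0, rng=rng)   # z varies strongly afterwards

b = fit_slope(x_in, y_in)
mse_in = mean_sq_error(x_in, y_in, b)
mse_out = mean_sq_error(x_out, y_out, b)
print(f"b={b:.3f}, in-sample MSE={mse_in:.4f}, out-of-sample MSE={mse_out:.2f}")
```

In this setup the fitted slope stays close to 2 and the fit within the material is nearly perfect, yet the same relation forecasts poorly once z begins to move.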
If now x and z are uncorrelated, then, even if z were to vary strongly, still the
average explanation of y by x would be the same, that is y = 2x. This is seen from
(8.6). But the momentary explanation would be much poorer, as is seen from (8.8).
If there now, in addition to the stronger variations in z, also is a certain correlation between x and z, then we will get more or less distorted information about x’s
effect on y by sticking to relation (8.6), even if this relation, as an average explanation, has full explanatory power from a purely statistical point of view. For while
x, as is seen from (8.1), in reality affects y by twice its own value, this is not the
case in (8.6) as long as rxz differs from zero. If rxz is larger than zero, the coefficient
of x may even become negative. This is just a result we can get if we impose a
“ceteris paribus”-clause on z without knowing its effect, that is, without knowing
the fundamental relation (8.1).
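The distortion can be read directly from the coefficient of x in (8.6). The sketch below simply evaluates that coefficient for a few values of rxz and of the ratio σz/σx. The function name `implied_slope` and the chosen cases are illustrative assumptions.

```python
def implied_slope(r_xz, sd_z, sd_x, beta_x=2.0, beta_z=-3.0):
    """Coefficient of x in (8.6), implied by the exact relation (8.1)."""
    return beta_x + beta_z * r_xz * sd_z / sd_x

# Hypothetical cases: (r_xz, sd_z, sd_x)
cases = [
    (0.0, 1.0, 1.0),   # x and z uncorrelated
    (0.5, 1.0, 1.0),   # moderate positive correlation
    (0.8, 1.0, 1.0),   # strong positive correlation
    (0.5, 2.0, 1.0),   # moderate correlation, z varying twice as much as x
]
for r_xz, sd_z, sd_x in cases:
    slope = implied_slope(r_xz, sd_z, sd_x)
    print(f"r_xz={r_xz}, sd_z/sd_x={sd_z / sd_x}: coefficient of x = {slope:.2f}")
```

Only in the uncorrelated case does the coefficient equal the true value 2. With a positive correlation it falls below 2, and it turns negative as soon as rxz·σz/σx exceeds 2/3.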
This indeed shows how crucial it is to have, in advance, a formulation of the
hypotheses, in which one operates with specific fictitious variations. If one refrains
from doing so, one runs the risk of missing out important variables that for some
reason have not shown significant variation in the material at hand. And although
a simpler hypothesis may give a stable average explanation and for that reason gets
accepted as statistically valid, it may well give a very poor, maybe a completely
worthless momentary explanation and no deeper insight into structural relationships between the studied variables.
Translator’s note: In translating this manuscript into English I have tried to stick as closely to
Haavelmo’s original style and formulations as possible. All underlined words or expressions have
been set in italics, and all quotation marks have been retained. In this lecture, Haavelmo touches
upon several concepts that have become extremely important in econometrics later – e.g., identifiability, autonomy, and omitted variables – without using these terms. I have not tried to modify
or expand his text in this respect. Translating the Norwegian terms ‘gjennomsnittsforklaring’ and
‘momentanforklaring’, which he introduces in the last part of the text, has created some problems,
inter alia, because they seem not to have been given much attention in later econometric literature
(maybe because they were not found fruitful?) – although the subject matter is highly important.
My suggested translations are, for lack of better terms, ‘average explanation’ and ‘momentary explanation’. I thank John Aldrich, Olav Bjerkholt, Duo Qin, and Yngve Willassen for comments,
and Frikk Nesje for technical assistance.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6