Reliability of Metabolism Measurements by the Closed Circuit Method¹
FRANCIS L. HARMON
From the Department of Psychology, Saint Louis University, Saint Louis, Missouri
BECAUSE OF THE WIDESPREAD USE of metabolism tests, both in clinical practice
and research, the question of the variability of such measurements is of crucial importance. Studies dealing with several aspects of this question have been reported
(1-6); yet it remains a fact that the reliability of the metabolism test has not been
fully determined to date. This seems to have been due partly to the application of
relatively inefficient and, in some cases, inappropriate statistical methods to the
problem, and partly to a faulty analysis of the problem itself. More specifically, previous investigators have not usually distinguished clearly enough between the stability of the metabolic rate considered simply as a physiological function, and the
precision with which the typical test measures this function. Strictly speaking, the
problem of reliability concerns only the second of the two points just mentioned;
however, it is obvious that in any practical situation involving the interpretation of
a particular test record, both questions must be taken into account.
The general aim of the present study was to investigate the reliability of measurements made by the closed circuit method upon 29 ostensibly normal, young
adult males.² Three steps were involved: a) carrying out a complex analysis of variance upon a set of 348 test records, in order to isolate certain major sources of variation and evaluate the significance of each; b) deriving, from the preceding analysis,
a valid estimate of the error of measurement of the metabolism test; and c) establishing other critical values for determining the significance of changes in metabolic
rate under certain specified conditions.
The subjects were male university students, ranging in age from 17-27 yr. Twelve measurements of metabolic rate were made upon each subject, the complete series consisting of 10-min. tests at 8:00 A.M., 12:00 M., 6:00 P.M. and 10:00 P.M. on each of 3 different days, approximately a week apart. Every test was preceded by a 30-min. rest, the subject lying upon a cot throughout this period, and during the test itself. In addition, all tests made at 8:00 A.M. were BMR's in the clinical sense, since the subjects were always in the postabsorptive state at these times, and had observed the usual rules regarding exercise, smoking etc. It should be remarked, too, that during the period of the experiment, the subjects followed their normal routines as to outside activities. These routines, while not rigidly constant, were ascertained to be reasonably stable, as well as quite uniform from 1 individual to another. Nevertheless, as further precautions, the subject's oral temperature
Received for publication December 5, 1952.
¹ An abstract of this paper was read at the St. Louis meeting of the National Academy of Sciences, November 12, 1952.
² Acknowledgment is made to Thomas F. Oehrlein and Thomas J. Fitzgerald for assistance in gathering the data.
Downloaded from http://jap.physiology.org/ by 10.220.32.247 on June 18, 2017
FRANCIS L. HARMON    Volume 5
RESULTS
Errors of Measurement. The data upon which the following analyses are based may be obtained from the American Documentation Institute.³ Table 1 presents a complete summary of the analysis carried out upon our 348 coded test records; the model for this analysis is given by Snedecor (7). It can be seen that the experimental design permits us to separate the total sum of squares into 7 portions, according to the source of variation. At the moment we are not primarily concerned with the various tests of statistical significance made possible by the analysis summarized in table 1. Let us remark merely that the variances for subjects, hours, the subjects x hours interaction, and the subjects x days interaction, when tested against the error variance as represented by the triple interaction, all prove significant at the 1% level of confidence or beyond; while none of the rest is significant, even at the 5% level.⁴ It may be added that the absence of a significant hours x days interaction is of interest in the present study chiefly as a 'test of technique.' It justifies our assumption that the conditions under which the experiment was carried out were satisfactorily uniform in one essential respect, since the differences between the 4 hourly means are independent of the days on which the tests were made.
Table 1 also shows the standard deviations obtained by extracting the square roots of the several variances resulting from the analysis. These standard deviations are expressed both as calories per square meter per hour (assuming an R.Q. of 0.82) and, alternatively, as cubic centimeters of oxygen per square meter per minute. Consider first the sigma derived from the triple interaction. In effect, this sigma
³ Four tables of detailed data for this paper have been deposited with the American Documentation Institute, Library of Congress, Washington 25, D. C. For copies of these tables order Document 3955 directly from the Institute, remitting $1.25 for microfilm (images 1 inch high on 35-mm. film) or $1.25 for photoprints readable without optical aid.
⁴ The application of Bartlett's test showed no evidence of heterogeneity of variance among the subgroups in this experiment, since chi square was significant between the ~0% and the I& levels of confidence.
was recorded at the beginning of every experimental session; also he was questioned with regard to his general health, activities, rest and food intake during the period since his last test. In the few instances when temperature was abnormal or questioning brought out marked irregularities, the records for that entire day were discarded, and an additional day of testing was scheduled.
A Sanborn Metabulator, with standard mouthpiece and noseclip, was used in determining metabolic rates, and at least 3 tests for leaks were made during the progress of each session. Like other closed circuit instruments, the Metabulator measures oxygen consumption in cubic centimeters per minute. These values, corrected to S.T.P., were then divided by the subject's body surface area in square meters, as estimated from his height and weight by the Du Bois formula. Lastly, each measure was 'coded' by multiplying it by the constant, 0.2895. Since this is the factor used in basal metabolism tests to convert measures of oxygen consumption into their heat equivalents, assuming an R.Q. of 0.82, it follows that our results for all the 8:00 o'clock tests would be correctly expressed in calories per square meter per hour. The remaining data, based upon tests made later in the day, could be 'decoded' after analysis, and expressed merely as cubic centimeters of oxygen per square meter per minute; this, of course, being necessary because one can assume an R.Q. of 0.82, if at all, only when the subject is in the postabsorptive state.
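The coding step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the Du Bois coefficients are the commonly quoted ones (the paper names the formula but does not print it), and the subject in the example is invented.

```python
# Sketch of the 'coding' step described in the text. The Du Bois coefficients
# below are the commonly quoted ones (an assumption; the paper does not print them).

CODING_FACTOR = 0.2895  # constant given in the text (R.Q. of 0.82 assumed)


def du_bois_area_m2(height_cm: float, weight_kg: float) -> float:
    """Body surface area in square meters from height (cm) and weight (kg)."""
    return 0.007184 * weight_kg ** 0.425 * height_cm ** 0.725


def code_measure(o2_cc_per_min: float, height_cm: float, weight_kg: float) -> float:
    """Oxygen uptake (cc/min, already corrected to S.T.P.) -> coded units."""
    o2_per_m2 = o2_cc_per_min / du_bois_area_m2(height_cm, weight_kg)
    return o2_per_m2 * CODING_FACTOR


def decode_measure(coded: float) -> float:
    """Coded units -> cc of oxygen per square meter per minute."""
    return coded / CODING_FACTOR


# Hypothetical subject: 170 cm, 70 kg, 240 cc of oxygen per minute at S.T.P.
coded = code_measure(240.0, 170.0, 70.0)
print(round(coded, 2), round(decode_measure(coded), 2))
```

The factor 0.2895 is consistent with a caloric equivalent of 4.825 kcal per liter of oxygen, the figure usually quoted for an R.Q. of 0.82, since 4.825 x 60 / 1000 = 0.2895.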
June 1953    RELIABILITY OF METABOLISM MEASUREMENTS
TABLE 1. ANALYSIS OF VARIANCE OF CODED METABOLISM TEST DATA

                                        Degrees of   Sum of                   Standard Deviation
Source of Variation                     Freedom      Squares     Variance     cal/m²/hr.   cc/m²/min.
Between subjects                         28          1265.794     45.2069*     6.72        23.23
Between hours                             3          1152.253    384.0843*    19.60        67.70
Between days                              2            27.950     13.9750      3.64        12.57
Interaction: subjects x hours            84          1111.558     13.2328*     3.64        12.57
Interaction: subjects x days             56           557.000      9.9464*     3.15        10.90
Interaction: hours x days                 6            31.143      5.1905      2.28         7.87
Interaction: subjects x hours x days    168           875.241      5.2098      2.28         7.87
Total                                   347          5020.939     14.4696      3.70        12.79

* Significant at or beyond 1% level of confidence.
Standard deviations expressed in cal/m²/hr. and in cc of O₂/m²/min.
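The arithmetic behind table 1 can be retraced from its degrees of freedom and sums of squares alone. The following is a minimal sketch, not the author's computation: each variance is a sum of squares divided by its degrees of freedom, and each variance ratio is formed against the triple interaction, which the text uses as the error term. The variable names are illustrative.

```python
import math

# (degrees of freedom, sum of squares) for each source, as printed in table 1
table1 = {
    "subjects": (28, 1265.794),
    "hours": (3, 1152.253),
    "days": (2, 27.950),
    "subjects x hours": (84, 1111.558),
    "subjects x days": (56, 557.000),
    "hours x days": (6, 31.143),
    "subjects x hours x days": (168, 875.241),
}

ERROR_TERM = "subjects x hours x days"

# Variance (mean square) = sum of squares / degrees of freedom
variances = {name: ss / df for name, (df, ss) in table1.items()}

# Variance ratio of every other source against the error variance
error_var = variances[ERROR_TERM]
f_ratios = {name: v / error_var for name, v in variances.items() if name != ERROR_TERM}

# The seven portions should add up to the totals shown in the last row
total_df = sum(df for df, _ in table1.values())
total_ss = sum(ss for _, ss in table1.values())

# S.E. of measurement = square root of the error variance
se_measurement = math.sqrt(error_var)

for name, ratio in f_ratios.items():
    print(f"{name:<20} variance {variances[name]:9.4f}  ratio {ratio:6.2f}")
print(total_df, round(total_ss, 3), round(se_measurement, 2))
```

Judging which ratios are significant still requires a table of F for the relevant degrees of freedom; the sketch stops at the ratios themselves.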
single metabolism test differs from 'normal' by 4.47 cal. or more, one may conclude at the 5% level of confidence that the individual's true metabolic rate is not equal to the mean of his norm group; while, if the difference is as great as 5.88 cal., the same conclusion may be stated at the 1% level. This last value, incidentally, conforms exactly to the time-honored '15% rule' for the clinical evaluation of BMR test results, since 5.88 cal./m²/hr. is just 15% of the BMR of normal males of our age group.
Another application of the S.E. is in testing the significance of a difference between 2 measurements. In this case the S.E. is multiplied by √2, and the critical values for the 5% and the 1% levels are then calculated as before. These values are found to be 6.32 cal. (21.83 cc of O₂) and 8.31 cal. (28.75 cc of O₂), respectively. Thus, if the results of 2 metabolism tests differ by as much as 21.83 cc of oxygen, it is possible to say that the chances are 95 in 100 that there is a real difference between the metabolic rates; and, if the difference is as great as 28.75, the chances are at least 99 in 100.
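The critical values quoted in the text follow from the S.E. of measurement by simple multiplication. A minimal sketch, assuming the S.E. of 2.28 cal/m²/hr. and the normal-curve multipliers 1.96 and 2.58 given in the text; the function names are illustrative, and results may differ from the printed figures in the last decimal place because of rounding.

```python
import math

SE_CAL = 2.28   # S.E. of measurement, cal/m2/hr (from the text)
Z_5 = 1.96      # multiplier for the 5% level
Z_1 = 2.58      # multiplier for the 1% level


def single_test_limit(se: float, z: float) -> float:
    """Deviation of one test from a fixed value needed for significance."""
    return se * z


def difference_limit(se: float, z: float) -> float:
    """Difference between two single tests needed for significance.

    The S.E. of a difference between two independent measurements with
    the same S.E. is that S.E. multiplied by the square root of 2.
    """
    return se * math.sqrt(2) * z


for label, z in (("5%", Z_5), ("1%", Z_1)):
    print(label,
          round(single_test_limit(SE_CAL, z), 2),
          round(difference_limit(SE_CAL, z), 2))

# The '15% rule' in the text implies a norm of about 5.88 / 0.15 cal/m2/hr.
print(round(5.88 / 0.15, 1))
```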
In summary, the S.E. of measurement is used to specify the fineness of discrimination of metabolism tests. Such a measure is limited in one important respect, however. While it may inform the investigator of the occurrence of real differences between metabolic rates, it tells him nothing as to the sources of these
is based upon 168 independent observations which, needless to say, constitutes a fair-sized sample. This sigma, furthermore, is a valid estimate of the reliability of any single metabolism test. Since it includes all sources of variation not covered elsewhere in the analysis, but presumably randomized in the present experiment, the S.D. derived from the subjects x hours x days interaction represents the S.E. of measurement.
In practice, standard errors often are interpreted in terms of fiducial limits: the range or band of values over which a series of measurements may reasonably be expected to vary. Generally, too, such fiducial limits are established with reference to some specific 'level of confidence', ordinarily the 5% or the 1% level. In order to fix the limits of the 5% interval, the S.E. is multiplied by 1.96; while for the 1% interval, it is multiplied by 2.58. The resulting values in the present case are 4.47 cal. (or 15.44 cc of O₂) and 5.88 cal. (or 20.33 cc of O₂), respectively. If, then, a
TABLE 2. SHOWING SIGNIFICANT DEVIATIONS IN METABOLISM TEST RESULTS UNDER VARIOUS CONDITIONS

                                                                             Significant Variation
                                               Source of   No. of        cal/m²/hr.       cc/m²/min.
Measure Evaluated                              Estimate    Indep. Obs.   5%      1%       5%      1%
Any single observation, e.g. in comparing      S x H x D   168            4.47    5.88    15.44   20.33
  individual test result with group mean
Difference between two single observations,    S x H x D   168            6.32    8.31    21.83   28.75
  e.g. immediately consecutive tests of
  same individual
Difference between two single observations,    S x D        56            8.72    r1.p    30.20   39.76
  e.g. comparable tests of same individual
  on different days, or different
  individuals on same day
Difference between two single observations,    S x H        84           10.08   13.28    34.81   45.81
  e.g. same individual at different hours
against an error estimate based upon the significant subjects x hours interaction. A positive result from this test would mean that the morning and noon results differed more than could be expected from a normal diurnal variation.
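The two-stage check described here can be sketched as a small decision function. It is an illustration only: the critical differences are those listed in table 2 for a difference between two single observations (subjects x hours x days, then subjects x hours), while the function name and the wording of the returned verdicts are hypothetical.

```python
# Critical differences between two single tests, in cal/m2/hr, from table 2.
# 'measurement' uses the S.E. of measurement (subjects x hours x days);
# 'diurnal' uses the subjects x hours interaction.
CRITICAL = {
    "5%": {"measurement": 6.32, "diurnal": 10.08},
    "1%": {"measurement": 8.31, "diurnal": 13.28},
}


def classify_difference(diff_cal: float, level: str = "5%") -> str:
    """Classify a morning-to-noon difference for one subject."""
    limits = CRITICAL[level]
    size = abs(diff_cal)
    if size < limits["measurement"]:
        return "not significant"
    if size < limits["diurnal"]:
        return "real difference, within ordinary diurnal variation"
    return "exceeds ordinary diurnal variation"


for diff in (5.0, 8.0, 11.0):
    print(diff, "->", classify_difference(diff))
```

Only a difference that passes the second, stricter limit would support attributing the change to the variable introduced between the 2 tests rather than to the ordinary diurnal trend.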
In precisely the same manner, it is possible to derive an error estimate appropriate for evaluating the significance of differences between tests made under comparable conditions on different days. Using the subjects x days interaction for this purpose, the critical values for the 5% and 1% levels of confidence may be calculated. These are the values which should be used in deciding, for example, whether an individual's metabolic rate has changed significantly over a period of time, or in evaluating differences between the metabolic rates of different individuals, since due allowance is made for chance variations from 1 subject to another in their day-to-day rates.
Table 2 summarizes the various error estimates developed in the present analysis, and gives suggestions for their appropriate use. In addition to the source of error and the number of independent observations upon which each estimate is based, this table shows the magnitude of significant variations, both in calories per square meter per hour and in cubic centimeters of oxygen per square meter per minute, for the 5% and the 1% levels of confidence. In general, any given test result must differ from some theoretical value or from another test result by amounts as great as or
differences. Information of this sort usually is obtained by way of the experimental design itself, through the inclusion of control tests, or groups of subjects; and this, of course, is as it should be. Sometimes, though, especially in metabolism research, the cost of establishing any really adequate experimental controls becomes prohibitive. Fortunately, our analysis of variance provides for an effective type of statistical control which may result in a considerable economy of research time and money.
Suppose that a subject, or a group of subjects, be given metabolism tests at 8:00 A.M. and 12:00 M., with a view to determining the effect of some variable (food, muscular exercise etc.) which is introduced between the 2 tests. The difference between the morning and the noon records would then be tested for significance against an error estimate derived from the S.E. of measurement. If this test proved significant, the investigator next would seek to exclude the possibility that the difference was due to an ordinary diurnal trend in metabolic rate. In the absence of a control subject (or group), this could be accomplished by testing the difference
greater than those shown in table 2 before the investigator may conclude that a significant difference exists between the metabolic rates in question.
When the measures to be evaluated are means based upon n observations, or differences between means of n observations each, the critical values shown in table 2 are to be divided by √n. If, however, the 2 means are derived from unequal numbers of observations, the S.E. of each must be estimated separately, and then the S.E. of the difference, this value being multiplied by either 1.96 or 2.58 in order to obtain the critical figure desired. The particular value to be used in estimating the standard errors of the means is obtained from table 1: it will be the S.D. that corresponds to 1 of the 3 interactions (subjects x hours x days, subjects x hours, or subjects x days), depending upon the problem in hand.
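Both rules for means can be written out briefly. The sketch below rests on one assumption not spelled out in the text: that the S.E. of a difference between two independent means is the square root of the sum of their squared standard errors, which is the usual formula. Function names are illustrative.

```python
import math


def critical_value_equal_n(table2_value: float, n: int) -> float:
    """Critical value for means (or differences of means) of n observations each."""
    return table2_value / math.sqrt(n)


def critical_value_unequal_n(sd: float, n1: int, n2: int, z: float) -> float:
    """Critical difference between two means based on n1 and n2 observations.

    sd is the S.D. taken from table 1 for the appropriate interaction;
    z is 1.96 for the 5% level or 2.58 for the 1% level.
    """
    se_mean_1 = sd / math.sqrt(n1)
    se_mean_2 = sd / math.sqrt(n2)
    # Assumed: standard errors of independent means combine in quadrature.
    se_difference = math.sqrt(se_mean_1 ** 2 + se_mean_2 ** 2)
    return z * se_difference


# With equal n the two routes agree, e.g. S.D. 2.28 and 4 tests per mean:
print(round(critical_value_equal_n(6.32, 4), 2))
print(round(critical_value_unequal_n(2.28, 4, 4, 1.96), 2))
# Unequal numbers, e.g. a mean of 3 tests compared with a mean of 6:
print(round(critical_value_unequal_n(2.28, 3, 6, 1.96), 2))
```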
TABLE 3. ANALYSIS OF VARIANCE OF BASAL METABOLISM RECORDS (8:00 A.M. TESTS)

                                 Degrees of   Sum of                 Standard Deviation
Source of Variation              Freedom      Squares    Variance    cal/m²/hr.
Between subjects                 28           548.391    19.5854*    4.43
Between days                      2             5.798     2.8990     1.70
Interaction: subjects x days     56           245.102     4.3768     2.09
Total                            86           799.291     9.2941     3.05

* Significant beyond 1% level of confidence.
Standard deviations expressed in cal/m²/hr.
Special Case of the BMR. There are at least two reasons for giving separate
consideration to BMR measurements in any reliability study. First, of course, the
BMR test is of particular interest in clinical work. In the second place, these records
constitute a special group of measurements, made under conditions which might
be expected to result in considerably greater day-to-day stability than other measurements, representing nonbasal conditions.
For these reasons, a special analysis of variance was carried out upon the 87
BMR records (8:00 A.M. tests), and the results are summarized in table 3. In this
case, the S.E. of measurement comes from the subjects x days interaction, which
yields a sigma of 2.09 cal/m2/hr. The error estimate thus obtained does not differ
appreciably from the corresponding value of 2.28, derived from the subjects x hours x
days interaction in our analysis of the complete data; it is recommended, therefore,
that the latter be used in evaluating the significance of any metabolism test results,
whether basal or otherwise.
The need for conservatism in this area of measurement may be clarified further
by examining the test-retest correlations of the BMR data. Product-moment
coefficients, calculated for every possible pairing of the 8:00 A.M. tests, were found to be
as follows: r₁₂ = 0.703, r₂₃ = 0.563, and r₁₃ = 0.411, where the subscripts refer to
tests given on days 1, 2 and 3.⁵ Since the average of these 3 correlations is but 0.559,
it must be concluded that the metabolism test, under the best of circumstances, is
not a highly reliable instrument; accordingly, any measurements obtained with this
test should be evaluated with due regard to their appropriate error estimates.
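As a quick check on the figures in this paragraph, the average of the 3 test-retest coefficients can be recomputed and each coefficient set against the significant values of r quoted in footnote 5. This is a sketch using only numbers from the text; the variable names are illustrative.

```python
# Test-retest correlations of the 8:00 A.M. (BMR) records, from the text
correlations = {"days 1 and 2": 0.703, "days 2 and 3": 0.563, "days 1 and 3": 0.411}

# Significant values of r for 27 degrees of freedom, as quoted in footnote 5
R_CRITICAL_5 = 0.369
R_CRITICAL_1 = 0.472

mean_r = sum(correlations.values()) / len(correlations)
print("average r:", round(mean_r, 3))

for pair, r in correlations.items():
    if r >= R_CRITICAL_1:
        verdict = "significant at the 1% level"
    elif r >= R_CRITICAL_5:
        verdict = "significant at the 5% level only"
    else:
        verdict = "not significant"
    print(pair, r, verdict)
```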
⁵ For 27 degrees of freedom, the significant values of r at the 5% and 1% levels are 0.369 and 0.472, respectively.
SUMMARY
From an analysis of variance carried out upon 348 test records, the standard error of estimate of the metabolism test is estimated as 2.28 cal/m²/hr. Other critical values for evaluating the significance of changes in metabolic rate under specified conditions are established. Results obtained by this approach are described, together with examples of their applications.
REFERENCES
1. BENEDICT, F. G. AND T. M. CARPENTER. Food Ingestion and Energy Transformations, with Special Reference to the Stimulating Effects of Nutrients. Washington: Carnegie Institute, 1918.
2. HARRIS, J. A. AND F. G. BENEDICT. J. Biol. Chem. 46: 257, 1921.
3. GRIFFITH, F. R., G. W. PUCHER, K. A. BROWNELL, J. II. KLEIN AND M. E. CARMER. Am. J. Physiol. 87: 602, 1929.
4. BOOTHBY, W. M., J. BERKSON AND H. L. DUNN. Am. J. Physiol. 116: 468, 1936.
5. BERKSON, J. AND W. M. BOOTHBY. Am. J. Physiol. 121: 669, 1938.
6. LEWIS, W. H. Am. J. Physiol. 121: 502, 1938.
7. SNEDECOR, G. W. Statistical Methods. Ames: Iowa State College Press, 1940.