The Evolution and Validity of Health-Related Fitness

Quest 2006, 58, 160-175
© 2006 National Association for Kinesiology and Physical Education in Higher Education
Andrew S. Jackson
This paper traces the evolution of fitness testing from an athletic emphasis to one
with a public health focus and examines the forces that brought about the change
in an environment that was not totally receptive. An atmosphere for change was
created during this era with the development of exercise physiology, exercise
epidemiology, and measurement. The publication of a position paper on physical
fitness by exercise scientists and the publication of state health-related fitness
tests in the 1970s increased the pressure for change. Contemporary public health
research clearly documents the positive role of physical activity, cardiorespiratory fitness, and body composition on health outcomes. During this era, exercise
scientists developed valid laboratory and field tests of cardiorespiratory fitness
and body composition that have become the standards of practice. Public health
researchers have been able to provide valid fitness standards for health promotion.
These validate the health-related approach.
During the last half of the 20th century, physical fitness testing in the United
States evolved from an athletic orientation to one with an emphasis on public health.
This paper describes this change through the eyes of one who was academically
and politically involved in this evolution. My observations and analyses are freely
reflected in the paper and the reader should be aware of the limitations and potential
bias of such an approach. This evolutionary process is addressed in three ways:
(a) key factors that fueled the change in fitness testing in a hostile environment,
(b) the scientific validity of health-related fitness, and (c) key milestones that led
to the valid assessment of health-related fitness.
Key Factors That Fueled The Change
In A Hostile Environment
There were many factors that brought about the change. Among them were the
development and maturation of the exercise physiology, exercise epidemiology, and
measurement disciplines. This provided the academic environment that embraced
change. While there were many who wanted and worked for the public health
orientation, there were many others who resisted. This resistance became especially
apparent when exercise scientists challenged the validity of the AAHPERD Youth
Fitness Test (YFT) in the early 1970s. The primary sources of the resistance were
two national organizations, AAHPERD and the Presidentʼs Council on Physical
Fitness and Sports. While there may have been several reasons for their resistance, the
apparent reason was that AAHPERD had a national fitness test, the AAHPERD YFT, and
the Presidentʼs Council had a national fitness award program based on the AAHPERD YFT. Quite simply, they did not want to disrupt their national programs.
The author (AAKPE Fellow #285) is with the Department of Health and Human Performance at the
University of Houston and the Udde Research Center and Rowing Club. E-mail: [email protected]
The formal process of changing fitness testing to a public health emphasis
started in the early 1970s. This was the time when the discipline of exercise physiology was maturing, the roots of exercise epidemiology were starting to spread, and
mainframe computers and statistical packages were becoming readily available,
thereby providing measurement scientists with the tools for complex data
analysis in validation research. In addition to the favorable academic environment for change,
there were two events that fueled the evolution in fitness testing: (a) the formation of an AAHPERD joint committee with a charge to evaluate the validity of the
AAHPERD YFT and (b) the implementation of health-related fitness tests in the
states of Texas and South Carolina.
AAHPERD Joint Committee
In the early 1970s, AAHPERD formed a joint committee with the charge to
examine the rationale for changing the AAHPERD YFT. Membership on the joint
committee represented the Physical Fitness, Measurement and Evaluation, and
Research Councils of ARAPCS. The committee members represented the exercise
physiology and measurement disciplines. The committee members were B. Don
Franks, Frank I. Katch, Victor L. Katch, Sharon A. Plowman, Margaret J. Safrit,
and Andrew S. Jackson who chaired the committee. The committee members were
asked to consider five items: (a) the rationale for test revision, (b) an operational
definition of physical fitness, (c) decision making on the basis of test results, (d)
items that could be used to measure physical fitness, and (e) the feasibility of
using norm-referenced and criterion-referenced standards for defined groups. The
committee initially met in October 1975 for two days at the Big Ten Measurement
Symposium held at Indiana University. In addition to the six committee members,
Dr. Raymond Ciszek of AAHPERD and Dr. Ash Hayes of the Presidentʼs Council
on Physical Fitness and Sports attended the meeting.
In February 1976, the Joint Committee met in Washington, D.C., to finish the
initial working draft of their work. Each committee member critically evaluated
the draft and his or her reactions were used to prepare a second working draft.
This draft was presented at the 1976 AAHPERD Convention. Dr. Ash Hayes, Dr.
Herbert deVries, and Dr. Guy Reiff were each invited to present a 10-minute reaction
to the document. After the paper and formal reactions, discussion and comments
were invited from the audience. Additionally, two convention sessions sponsored
by the Physical Fitness and Measurement and Evaluation Councils of ARAPCS
were devoted to a presentation and reaction to the second working draft. Prior to the
National Convention, the recommendations of this working draft were published
in the AAHPERD publication Update and individuals were encouraged to submit
their reactions in writing to the chairperson.
The Joint Committee met in Chicago in October 1976 and considered all reactions to the second working draft. All this information was considered in preparing
the final draft (Jackson, Franks et al., 1976). The position paper was sent to AAHPERD, but the committee was not invited to publish it.1 Central to the document
were recommendations for changing the AAHPERD YFT. These were separated
into three sections, which are fully presented in the Appendix herein.
AAHPERD YFT and New
State Fitness Tests
While the Joint Committee Position Paper provided direction for change, much
more was needed to create change. The advocates for the public health position
were gaining strength to the point where the states of Texas and South Carolina
published state tests that became competitors of the AAHPERD YFT.
The AAHPERD YFT was initially published in 1958. The original test was
not developed through test validation research; rather, leading physical educators
met and created the test on the basis of logic and current practice. In 1975 the
AAHPERD YFT was revised. The straight-leg sit-up test was replaced with the
flexed-leg test, the softball throw for distance was dropped, and distance run tests
were added as optional tests to the 600-yard run. This change was encouraged by the
1973 development and implementation of the Texas Test (Baumgartner & Jackson,
1982). The Texas Governorʼs Commission on Physical Fitness developed the Texas
Test. I was actively involved in its development and remember that the prevailing
view within the Texas group was that the AAHPERD YFT was not a valid fitness
test. The Texas Test split its battery into two components: motor fitness, which comprised
athletic-related items such as speed, agility, and power; and physical fitness, which
included abdominal and upper body endurance items and distance run tests.
The evidence of the influence of the Texas Test on the 1975 AAHPERD revision was the adoption of tests and norms from the Texas Test. The 1975 revised
test used the bent-leg sit-up test and norms from the Texas test. The distance runs
were 1.5-mile and 12-min walk/run for distance for students in grades 7 to 12 and
1-mile and 9-min walk/run for distance for students in grades 4 to 6 (Baumgartner
& Jackson, 1982). The Texas award program was based on the physical fitness tests alone.
In 1976 a national normative fitness survey was completed and these data were
used to re-norm the AAHPERD YFT (AAHPER, 1976). While the strength of the
1976 revision was the national norms, the major limitation was that the 600-yard
run was the only distance run included in the national study. This was a time when
the validity of longer distance run tests was being documented and advanced by
exercise scientists.
Another major criticism leveled at the AAHPERD YFT was that the battery
emphasized athletic success. The Presidentʼs Council on Physical Fitness and
Sports award system reinforced this by honoring the most athletically gifted.
The Presidentʼs Councilʼs award system used the AAHPERD YFT to award the
Presidential Physical Fitness Award. In order to qualify, a student had to achieve
at or above the 85th percentile on all six AAHPERD tests for his or her age. Since
the correlations among the six tests are considerably less than 1.00, far fewer than
15% would be expected to qualify for the Presidential Award; it would be closer
to 1% of the population.
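The gap between 15% and roughly 1% can be illustrated with a small simulation. The sketch below is not drawn from the award data or the position paper. It assumes, purely for illustration, that the six test scores are standard normal and share one common factor, so that every pair of tests has the same correlation `r` (a hypothetical value chosen by the caller).

```python
import random
from statistics import NormalDist


def share_qualifying(r, n_tests=6, percentile=0.85, n_students=100_000, seed=1):
    """Estimate the share of students at or above `percentile` on every test.

    Assumption (not from the article): scores are standard normal with a
    single common factor, so each pair of tests correlates at `r`.
    """
    rng = random.Random(seed)
    cutoff = NormalDist().inv_cdf(percentile)  # z-score of the 85th percentile
    w_common = r ** 0.5          # weight on the shared factor
    w_unique = (1.0 - r) ** 0.5  # weight on each test's own variation
    qualified = 0
    for _ in range(n_students):
        common = rng.gauss(0.0, 1.0)
        if all(
            w_common * common + w_unique * rng.gauss(0.0, 1.0) >= cutoff
            for _ in range(n_tests)
        ):
            qualified += 1
    return qualified / n_students


if __name__ == "__main__":
    for r in (0.0, 0.4, 1.0):
        print(f"r = {r:.1f}: {share_qualifying(r):.3%} qualify on all six tests")
```

With `r = 1` the six tests are interchangeable and the share approaches 15%. With `r = 0` it falls toward 0.15^6, about 0.001%. Intermediate correlations land between these bounds, which is consistent with the order of magnitude quoted above.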
This professional frustration with the AAHPERD YFT led to the development
of the South Carolina Fitness Test (Pate, 1978). The South Carolina Test used the
AAHPERD position paper on physical fitness (Jackson, Franks et al., 1976) to
guide their test development. The South Carolina Test included both criterion and
norm referenced standards and was the first youth fitness test that included skinfold
measures to evaluate body composition. With the publication and implementation
of the Texas and South Carolina tests, AAHPERD and the Presidentʼs Council no
longer had a “closed shop” in fitness testing.
Even with the implementation of the South Carolina test, progress was slow in
moving to a health-related fitness approach, and the professional conflict remained
apparent well into the 1980s. While many of us experienced the resentment personally, Plowman and colleagues (in press) eloquently captured the source and reasons
for the conflict and our frustration:
The perceived lack of commitment from the AAHPERD leadership to the health
related physical fitness concept, the awkward and time consuming decision
making process utilized by the AAHPERD structure, and the overwhelming financial considerations linked to the awards led concerned members of
AAHPERD to hold several meetings at the annual American College of Sports
Medicine Meetings in Indianapolis. (1986 meeting) . . . decided that they would
work to provide the best physical fitness test to this nation whether through
AAHPERD or other avenues.
Needless to say, moving to the health-related fitness model was not an easy
task. In my judgment, the conflict was a mirror of changes taking place in our
profession during this era. This was the time when exercise science was emerging
at an accelerated pace. The academic values of exercise science faculty differed
from those of traditional physical educators. This evolved into a value conflict of science
versus practice. Another indication of this value conflict was reflected in the painful process of renaming our academic homes. When I arrived at the University of
Houston in 1969, we were the Department of Physical Education with an emphasis
on teacher education. Several years ago, we became the Department of Health and
Human Performance with a major research mission. Our university was not unique;
it was more like the norm. These events motivated many exercise scientists to reduce
their professional activity in AAHPERD and become more actively involved with
professional groups with a stronger research agenda such as ACSM.
Scientific Validity
of Health-Related Fitness
The logic of the physical fitness position paper (Jackson, Franks
et al., 1976) rested on the important role of exercise and fitness in health. The areas
of physiological function identified in the position paper that were viewed as a
national concern were (a) cardiorespiratory function, (b) body composition, and
(c) abdominal and low back musculoskeletal function. Important criteria outlined
in the position paper for test selection were that physical activity be related to the
component and the test score accurately reflect the level of health-related fitness.
How accurate were these views and recommendations in light of current public
health research? The evidence for cardiorespiratory function and body composition is strong and convincing, while the data supporting abdominal and low back
musculoskeletal function are not.
Scientific Validity:
Cardiorespiratory Function
The important role of cardiorespiratory function in health is documented in
the 1996 Surgeon Generalʼs report on physical activity and health (USDHHS, 1996).
A primary purpose of the report was to summarize existing scientific research on
physical activity and fitness. The report provides a detailed summary of the public
health research examining the role of physical activity and fitness on health and
disease. Well over 100 public health studies were reviewed and summarized in
Chapter 4, titled “The Effects of Physical Activity on Health and Disease.” Provided
next is a summary of the findings.
Higher levels of regular physical activity are associated with lower mortality rates for both older and younger adults. Even moderate activity on a regular
basis has a protective effect compared to those who are least active. Both regular
physical activity and cardiorespiratory fitness decrease the risk of CVD mortality
and coronary heart disease mortality in particular. The protective effect of physical
activity on coronary heart disease risk is similar in magnitude to the effect of other lifestyle factors
such as smoking. Regular physical activity helps prevent or delay
the development of hypertension, and it is effective in lowering
blood pressure in hypertensives. The evidence on the role of physical activity and fitness in stroke
was not conclusive. Regular physical activity is associated with a decreased risk
of colon cancer, but the evidence for other cancers is inconsistent.
While included in the Surgeon Generalʼs review, the research from the Aerobics
Center Longitudinal Study (ACLS) is especially important and noteworthy. Blair
and associates (1989) published convincing data showing that level of cardiorespiratory fitness measured with a maximum treadmill test was related to all-cause
and coronary heart disease mortality. They showed that unfit men and women had
a mortality risk over three times higher than that of fit individuals. The unfit group comprised
those men and women in the lowest aerobic fitness quintile (≤ 20th percentile) for
their age group. A review of the Web of Science (web search 11/11/2005) documents the importance of this classic study; it has been cited over 1,100 times in
the literature.
Subsequent research with the ACLS sample strengthens the link between
cardiorespiratory fitness and health. In a second study, Blair and associates (1995)
documented that men who moved from the unfit group to the fit group significantly
reduced their mortality risk. This supported a causal effect. Williams (2003) challenged this view and countered that the change in risk was likely due to measurement error inherent in measuring changes in fitness. Further analysis of the ACLS
data (Jackson et al., 2004) demonstrated that changes in fitness were associated
with changes in body composition and self-reported level of physical activity. These
changes are consistent with physiological theory and support the conclusion that
changes in fitness shown by Blair et al. (1995) were real, not just measurement
error.
A recent study (Myers et al., 2002) published in the New England Journal
of Medicine supports the Blair et al. (1989) research linking mortality to level of
aerobic fitness. They reported that peak exercise capacity measured in METs was
the strongest predictor of risk of death among normal men and those with CVD.
They found that each MET increase in cardiorespiratory fitness conferred a 12%
improvement in survival. These researchers were also able to show that the exercise
capacity and mortality relationship was independent of history of hypertension,
diabetes, smoking, obesity measured by BMI, and high cholesterol. Cardiorespiratory fitness is an independent risk factor for CVD and all-cause mortality.
Scientific Validity: Body Composition
The Surgeon Generalʼs report on physical activity (USDHHS, 1996) documents
that a lack of physical activity is a primary cause of being overweight. Numerous
researchers have linked overweight with medical problems such as hypertension,
diabetes, and heart disease. Overweight individuals are more likely to be diabetic,
hypertensive, and have higher cholesterol levels. Since these are major, independent
CVD risk factors, a common thought was that the increased risk of CVD was due
to these established risk factors and not to being overweight itself. Hubert and associates
(1983) published the classic study showing that this was not the case. Using data
from a 26-year follow-up of the classic Framingham Heart Study, they showed that
weight, per unit of height, was related to the risk of CVD in men and women who
were nonsmokers, under age 50, and had normal cholesterol and blood pressure
levels. They further showed that change in weight was associated with the change
in CVD risk. Those who gained weight increased their CVD risk and those who
lost weight decreased their risk.
A key finding of the Hubert et al. study (1983) was that the 8-year follow-up
did not produce a significant relationship between weight level and CVD risk,
while the 26-year trend did. This supported the conclusion that long-term obesity
is an independent CVD risk factor. This provides stronger data supporting body
composition as a valid component of health-related youth fitness.
Scientific Validity: Abdominal
and Low Back Musculoskeletal Function
While many believe that an adequate development of abdominal strength-endurance and low back-posterior thigh flexibility is important for the prevention
and rehabilitation of low back disorders, Plowman (1992), in a comprehensive
review, reported that there is little scientific evidence to support the contention. In
the same paper, she searched the literature to determine if evidence supported the
inclusion of the 1-min sit-up and sit-and-reach tests as items in health-related fitness
batteries. She concluded with a “cautious yes” and based her weak endorsement
on the following: (a) there is evidence that low back pain syndrome is high and
increasing in industrial societies; (b) there is anatomical logical validity that the
proper functioning of the trunk and thigh musculature is important for a healthy
back; and (c) there is “scant” evidence that “strength/endurance and/or flexibility”
is predictive of first-time low back pain and “marginal” evidence that low levels of
“strength/endurance and/or flexibility” are predictive of recurrent low back pain.
While the evidence relating “strength/endurance and/or flexibility” to low back
pain is weak, Plowman (1992) further theorizes that the problem may lie with the
tests that are used. She reports that 1-min sit-ups and sit-and-reach tests, which are
included in health-related fitness batteries, have serious anatomical shortcomings
and are likely not comprehensive. Jackson et al. (1998) report little relation between
sit-up and sit-and-reach test performance and low back pain in adults.
A primary reason these tests were recommended by the AAHPERD Joint Committee was for the prevention of low back injuries. Ergonomic research documents
(Snook, 1978, 1991; Snook, Campanelli, & Hart, 1978; Waters, Putz-Anderson,
Garg, & Fine, 1993) that low back problems are the leading type of worker injury
and lifting is the major cause. While a low level of strength is often viewed as a
primary cause of low back injury, ergonomic research shows this explains just
half of the variation. The ergonomic literature on lifting and low back injury documents that low back injury is not just a function of lifter strength and endurance,
but also the weight of the load lifted. This finding comes from two different types
of research. First, research shows that low back injuries are reduced by selecting
workers who have the muscular strength and endurance required by the work task
(Ayoub, 1982; Chaffin, 1974, 1975; Chaffin, Herrin, Keyserling, & Garg, 1977;
Chaffin, Herrin, & Keyserling, 1978; Chaffin & Park, 1973; Liles, Deivanayagam,
Ayoub, & Mahajan, 1984). Pre-employment strength testing designed to match a
worker to the demands of the task is an accepted ergonomic strategy for reducing lifting-related back injuries (Jackson, 1994; NIOSH, 1977). Second, Snook
and associatesʼ (1978) classic epidemiological study showed that low back lifting
injuries were a function of lifting weight loads perceived as being too physically
demanding by the lifter. Snook and associates used a psychophysical rating method
to define the acceptable weight of lift. Psychophysical methods measure exercise
intensity in terms of percentage of maximum capacity (Borg, 1998). In the context of lifting, percentage of maximum capacity is a function of lift load and lifter
strength (Jackson, Borg, Zhang, Laughery, & Chen, 1997; Jackson & Sekula, 1999).
These data suggest that it is not possible to assess the level of low back strength
and endurance one needs to reduce the risk of low back injury without knowing
the lift load. NIOSH has published a research-based lift equation that estimates the
maximum acceptable weight of lift for a variety of industrial materials handling
tasks (Waters et al., 1993). The equation clearly demonstrates that the causal factors
of low back injury associated with lifting are complex.
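The lift equation itself is not reproduced in this paper. The sketch below uses the metric form of the revised NIOSH equation as it is commonly published, and it should be checked against Waters et al. (1993) before any practical use. The function names are illustrative. The frequency and coupling multipliers (`fm`, `cm`) come from lookup tables in the NIOSH documentation and are simply passed in here, and the upper limits at which each multiplier drops to zero are not enforced.

```python
LOAD_CONSTANT_KG = 23.0  # load constant of the revised equation (metric form)


def recommended_weight_limit(h_cm, v_cm, d_cm, a_deg, fm=1.0, cm=1.0):
    """Recommended weight limit (kg) from the task geometry.

    h_cm  : horizontal distance of the hands from the ankles
    v_cm  : vertical height of the hands at the start of the lift
    d_cm  : vertical travel distance of the lift
    a_deg : asymmetry (trunk twisting) angle in degrees
    fm, cm: frequency and coupling multipliers taken from the NIOSH tables
            (1.0 means ideal conditions; values here are caller-supplied).
    """
    hm = 25.0 / max(h_cm, 25.0)           # horizontal multiplier
    vm = 1.0 - 0.003 * abs(v_cm - 75.0)   # vertical multiplier
    dm = 0.82 + 4.5 / max(d_cm, 25.0)     # distance multiplier
    am = 1.0 - 0.0032 * a_deg             # asymmetry multiplier
    return LOAD_CONSTANT_KG * hm * vm * dm * am * fm * cm


def lifting_index(load_kg, rwl_kg):
    """Ratio of the actual load to the recommended limit; above 1.0 flags risk."""
    return load_kg / rwl_kg
```

Because every multiplier is at most 1.0, any departure from the ideal posture lowers the recommended limit. The same load can therefore be acceptable in one task and excessive in another, which is why a strength score alone cannot establish risk.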
Key Measurement Milestones
The scientific validation research shows that, during the last half of the
20th century, a strong body of public health research was published supporting the
role of suitable levels of body composition and cardiorespiratory fitness in health.
During this same era, the disciplines of exercise physiology and measurement
matured, which led to the publication of valid laboratory and field tests of these
components. Buskirk (1992) provides a fascinating historical presentation of this
scientific evolution that started at the Harvard Fatigue Laboratory and continued at
the University of Minnesota Laboratory of Physiological Hygiene. These laboratories were instrumental in developing the measurement techniques for assessing
cardiorespiratory fitness and body composition. Presented in this section are key
cardiorespiratory fitness and body composition measurement milestones that buttressed the evolution of health-related fitness testing.
Measurement Milestones:
Cardiorespiratory Fitness
Maximum oxygen uptake (VO2max) measured by indirect calorimetry is recognized as the “gold standard” for evaluating cardiorespiratory fitness (Åstrand
& Rodahl, 1970). While the need and interest in endurance testing can be traced
to the Harvard Fatigue Laboratory, the development of the metabolic measurement of VO2max as a test of cardiorespiratory fitness took place at the University
of Minnesota Laboratory of Physiological Hygiene. Buskirk (1992) reports that
two studies (Buskirk & Taylor, 1957; Taylor, Buskirk, & Henschel, 1955) have
been credited for setting “. . . the stage for maximal oxygen uptake to become the
‘gold standardʼ for determining the functional capacity of cardiovascular system
in health and disease” (p. 16). These two studies came from Buskirkʼs PhD dissertation (1953) under the direction of Henry Taylor. The importance of Buskirkʼs
dissertation was recognized by being identified as a rare and important book in the
history and development of medicine and related sciences (Zeitlin and Ver Brugge:
Booksellers, Los Angeles, CA; Buskirk, 1992).
In the era before computer-controlled metabolic carts, measuring VO2max was a
laborious task. Another important measurement milestone was the development of
submaximal methods, which estimate VO2max from the heart rate response
to submaximal exercise. The classic
paper was published by Åstrand and Ryhming (1954), who developed a single-stage
model and provided a nomogram to ease the computations in this pre-microcomputer era. The Åstrand-Ryhming nomogram has been published in numerous texts
and used extensively in research and fitness settings. The YMCA adult fitness test
(Golding, Meyers, & Sinning, 1989) uses a submaximal cycle ergometer test to
estimate cardiorespiratory fitness. The YMCA test is multistaged and based on
the same physiological theory as the Åstrand-Ryhming test: heart rate response to
submaximal power output.
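The physiological theory behind a multistage test can be sketched as a straight-line extrapolation. The code below is not the YMCA protocol or the nomogram. It fits a line through steady-state heart rates at two or more submaximal work rates, extends that line to an age-predicted maximum heart rate, and converts the resulting work rate to oxygen uptake. The use of 220 minus age and of the ACSM leg-cycling equation are assumptions brought in for illustration, and all names are hypothetical.

```python
def estimate_vo2max_multistage(stages, age_years, body_mass_kg):
    """Estimate VO2max (ml/kg/min) from submaximal cycle ergometer stages.

    stages: list of (work_rate_kgm_per_min, steady_state_hr_bpm) pairs,
            at least two, with differing work rates.
    """
    if len(stages) < 2:
        raise ValueError("need at least two submaximal stages")
    n = len(stages)
    mean_w = sum(w for w, _ in stages) / n
    mean_hr = sum(hr for _, hr in stages) / n
    sxx = sum((w - mean_w) ** 2 for w, _ in stages)
    if sxx == 0:
        raise ValueError("work rates must differ between stages")
    sxy = sum((w - mean_w) * (hr - mean_hr) for w, hr in stages)
    slope = sxy / sxx                    # beats/min per kg·m/min
    intercept = mean_hr - slope * mean_w

    hr_max = 220.0 - age_years           # assumption: age-predicted maximum HR
    max_work_rate = (hr_max - intercept) / slope

    # Assumption: ACSM leg-cycling equation, VO2 = 1.8 * W / mass + 7
    return 1.8 * max_work_rate / body_mass_kg + 7.0
```

For example, `estimate_vo2max_multistage([(300, 110), (600, 140)], age_years=40, body_mass_kg=75)` extends the heart rate line to 180 beats/min. The estimate inherits the error in the predicted maximum heart rate, which is one reason such tests are treated as estimates rather than measurements.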
The YMCA test (Golding et al., 1989) is administered on a cycle ergometer.
While often viewed as a cycle ergometer test, the Åstrand-Ryhming test can be
administered on a treadmill or as a step test. The limitations of these submaximal
tests for mass testing are that heart rate must be measured and the tests must be
administered on calibrated exercise equipment by trained administrators.
An important measurement milestone for youth and mass testing was the validation of distance run tests. Cooper (1968) published a classic study demonstrating
that distance run performance was highly correlated with measured VO2max. Using
115 U.S. Air Force men, Cooper reported a correlation of 0.90 between VO2max
and the distance covered in 12 min. Cooperʼs research not only demonstrated that
distance run tests were valid cardiorespiratory fitness field tests, but also stimulated
many to examine the concurrent validity of distance run tests. This research, summarized in another source (Baumgartner & Jackson, 1999), clearly shows distance
run performance is a valid test of cardiorespiratory fitness. Distance runs are the
test of choice for youth health-related fitness batteries (Baumgartner, Jackson,
Mahar, & Rowe, 2003).
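The paper reports the correlation between 12-min distance and VO2max but not a prediction equation. The sketch below uses a regression that is widely attributed to Cooper (1968) in secondary sources. Its coefficients are an assumption here and should be verified against the original study. It also shows the share of variance that a correlation of 0.90 implies.

```python
def cooper_vo2max(distance_m):
    """Estimate VO2max (ml/kg/min) from distance covered in 12 min, in meters.

    Assumption: the commonly cited form (distance - 504.9) / 44.73,
    which is not given in this paper.
    """
    return (distance_m - 504.9) / 44.73


def variance_explained(r):
    """Proportion of criterion variance accounted for by a correlation r."""
    return r * r
```

Under these assumed coefficients, a runner who covers 2,400 m in 12 min would be estimated at roughly 42 ml/kg/min. A correlation of 0.90 corresponds to about 81% shared variance between distance and measured VO2max.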
A final cardiorespiratory fitness measurement milestone relates to setting
health-related standards. The common method to evaluate fitness has been with
normative standards. The research published by Blair and associates (1989) previously cited in the scientific validation section of this paper not only showed that
cardiorespiratory fitness was a risk factor for coronary heart disease and all-cause
mortality, but also provided evidence supporting a level of fitness needed for health
promotion. The greatest drop in mortality was between the lowest quintile (i.e.,
lowest 20%) and the next quintile for men and women of defined age groups. The
mortality for the highest quintile (i.e., highest 20%) was not much lower than the
moderate fitness group. Blairʼs data showed that the level of cardiorespiratory fitness needed to move out of the unfit group is 35 ml/kg/min for men and 32 ml/kg/min for women
aged 45 and younger. The level of physical activity needed for most
people to achieve this level of fitness would be a daily, brisk walk lasting 30-60
min (Blair et al., 1989).
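A criterion-referenced standard of this kind reduces to a simple comparison. The sketch below applies the two thresholds quoted above and converts them to METs using the conventional value of 3.5 ml/kg/min per MET, which is not stated in this paper. The function names are illustrative, and treating a score exactly at the threshold as meeting the standard is an assumption.

```python
ML_PER_KG_MIN_PER_MET = 3.5  # conventional definition of 1 MET (assumption)

# Thresholds quoted in the text for ages 45 and younger, in ml/kg/min.
HEALTH_THRESHOLDS = {"men": 35.0, "women": 32.0}


def out_of_unfit_group(vo2max_ml_kg_min, sex, age_years):
    """Return True if VO2max reaches the quoted health-related threshold."""
    if age_years > 45:
        raise ValueError("the quoted thresholds apply only to age 45 and younger")
    return vo2max_ml_kg_min >= HEALTH_THRESHOLDS[sex]


def to_mets(vo2_ml_kg_min):
    """Convert oxygen uptake in ml/kg/min to METs."""
    return vo2_ml_kg_min / ML_PER_KG_MIN_PER_MET
```

Expressed this way, the thresholds correspond to 10 METs for men and a little over 9 METs for women. Unlike a percentile norm, the standard does not shift when the fitness of the reference population changes.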
Measurement Milestones:
Body Composition
The first important body composition measurement milestones were the
development of hydrostatic weighing to measure body density and equations for
calculating percent body fat from body density. Underwater weighing became the
“gold standard” for assessing body composition. Underwater weighing can be
traced back to the early 1940s when Behnke used the method to measure specific
gravity (Behnke, Feen, & Welham, 1942). Like the measurement of VO2max, the
development of the hydrostatic weighing method as a “gold standard” can be
traced to the Laboratory of Physiological Hygiene at the University of Minnesota
where Ancel Keys conducted World War II nutrition research on military rations
and semi-starvation (Buskirk, 1992).
The hydrostatic weighing method measures the density of the body, which, as for
any material, equals the ratio of its mass to its volume (Going, 1996). The
second important step was developing equations that converted body density to
percent body fat. Brozek and Siri (Brozek, Grande, & Anderson, 1963; Siri, 1956)
published equations to convert body density to percent body fat. Both equations
assume a two-component model of fat weight and fat-free weight. The assumption
of these two-component equations was that the densities of the fat weight and fat-free weight components were 0.90 g/cc and 1.10 g/cc, respectively. These equations provide nearly
identical estimates of percent body fat for given levels of body density (Baumgartner
et al., 2003). Like the metabolic measurement of VO2max, the hydrostatic weighing method became the “standard of practice” in exercise physiology laboratories
around the world.
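The two-component logic can be made concrete. If a body of density Db is a mixture of fat (0.90 g/cc) and fat-free mass (1.10 g/cc), then 1/Db = f/0.90 + (1 - f)/1.10, and solving for the fat fraction f gives f = 4.95/Db - 4.50. The sketch below implements that derivation alongside the Siri and Brozek equations in their commonly published forms. The coefficients are not printed in this paper and are supplied here from the standard literature.

```python
def two_component_fat_fraction(body_density, fat_density=0.90, ffm_density=1.10):
    """Solve 1/Db = f/d_fat + (1 - f)/d_ffm for the fat fraction f."""
    return (1.0 / body_density - 1.0 / ffm_density) / (
        1.0 / fat_density - 1.0 / ffm_density
    )


def siri_percent_fat(body_density):
    """Siri (1956), commonly published form."""
    return (4.95 / body_density - 4.50) * 100.0


def brozek_percent_fat(body_density):
    """Brozek et al. (1963), commonly published form."""
    return (4.57 / body_density - 4.142) * 100.0


if __name__ == "__main__":
    for db in (1.04, 1.05, 1.06, 1.07, 1.08):
        print(f"Db = {db:.2f}  Siri = {siri_percent_fat(db):5.2f}%  "
              f"Brozek = {brozek_percent_fat(db):5.2f}%")
```

With the default densities, `two_component_fat_fraction` reproduces the Siri equation exactly. Over body densities of 1.04 to 1.08 g/cc the Siri and Brozek estimates differ by less than one percentage point, in line with the statement that they are nearly identical.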
It has now become apparent that the two-component body composition model is
limited. The major problem is that the assumed density of 1.10 g/cc for the fat-free
weight component is not consistent across all racial groups (Wagner & Heyward,
2000). Researchers are now using either a four-component model (Going, 1996)
or dual energy x-ray absorptiometry (DXA; Lohman, 1996) to account for these
racial differences. It also appears that DXA is replacing underwater weighing as
the body composition “gold standard.”
Hydrostatic weighing and DXA, like the metabolic measurement of VO2max, are
laboratory methods that require specialized equipment and trained administrators.
Underwater weighing or DXA cannot be used for mass testing in the field. Brozek
and Keys (1951) were the first to develop a field test. They published a regression
equation to estimate hydrostatically determined body composition
from skinfold fat. This classic paper stimulated a host of body composition validation studies that used hydrostatically measured body density as the dependent
variable and combinations of body circumferences and bone diameters, along with
skinfold fat, as the independent variables. This was also the era when statistical packages became available for
use on mainframe computers. This allowed researchers to use various multiple
regression approaches to develop multivariate regression equations with different combinations of anthropometric variables. While most of these equations
were developed with adults, valid equations have also been published for youth (Lohman, 1992;
Slaughter et al., 1988).
Between 1951, when Brozek and Keys published their first anthropometric
equation, and the mid-1970s, hundreds of anthropometric equations appeared in
the literature. The trend of this research was to develop anthropometric prediction
equations for more narrowly defined groups, such as young and middle-aged men
and women (Pollock, 1975; Pollock, Hickman, & Kendrick, 1976) or athletes
(Sinning, 1978; Sinning, Dolny, & Little, 1985). These were termed “population
specific equations.” The next milestone in body composition measurement was the development of generalized body composition equations. The
term first appeared in the literature in 1978 when we (A. Jackson and M. Pollock)
published the generalized equation for men (Jackson & Pollock, 1978), which
was followed with a generalized model for women (Jackson, Pollock, & Ward,
1980). This research proved to be very popular. The menʼs paper was published
in the British Journal of Nutrition in 1978 and was republished in 2004 as a classic citation paper. The web of science showed these two papers have been cited
over 1,000 times in the literature. A Google search (date 11/11/2005) of the term
“Jackson Pollock percent fat” produced nearly 52,000 hits.
Why did the Jackson-Pollock equations become so popular? I believe the
major reason was that we departed from the trend of attempting to develop equations
for narrowly defined groups (e.g., college students, male athletes) and used
sound measurement and statistical methods to develop valid equations that could
be used with varied populations. We moved from a specific focus to a more
general one. The idea for our generalized approach came from the classic body
composition paper published by Durnin and Womersley (1974). They showed that
the relationship between skinfold fat and hydrostatically measured body composition was nonlinear and that there was an aging effect. They used log transformations
to account for the nonlinearity and skinfold equations with different intercepts to
account for the influence of age. We (Jackson & Pollock, 1976) also discovered
that skinfolds measured at different sites reflected a common factor. This indicated that the most reliable
measure was the sum of several skinfolds, rather than multiple skinfolds entered in
an equation as separate independent variables, which was the standard of practice
at that time. We found that the sum of three skinfolds and the sum of seven skinfolds were equally
valid. Polynomial regression was used to account for the nonlinearity between
body density and skinfold fat. Age, in a linear form, was included in the model to
account for the aging effect.
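The structure of the generalized model described above can be written compactly. This is a sketch of the functional form only, assuming a second-degree polynomial in the skinfold sum. The symbols D_b, S, A, b_0 to b_3, and the error term are notation introduced here for illustration, and no coefficient values are implied.

```latex
\[
  D_b = b_0 + b_1 S + b_2 S^{2} + b_3 A + \varepsilon
\]
% D_b : hydrostatically measured body density (dependent variable)
% S   : sum of skinfolds (three or seven sites)
% A   : age in years, entered in linear form
% b_0, \dots, b_3 : regression coefficients estimated from the sample
```

In this form the squared term carries the nonlinearity between body density and skinfold fat, and the linear age term carries the aging effect.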
It is important to recognize that an equation is valid only for the population sampled. In this respect, all statistically derived regression equations are population
specific. Our approach was to develop models that could be applied to individuals representative of populations that varied widely in terms of age and body
composition. In this respect, the Jackson-Pollock equations are more generalizable than what were termed population-specific equations. The limitation of the
Jackson-Pollock equations is the lack of generalization to non-White subjects.
The subjects used to develop the Jackson-Pollock equations were largely White
individuals. Research documents the need to consider race in body composition
prediction models (Schutte et al., 1984; Wagner & Heyward, 2000). Cross-validation of the Jackson-Pollock equations using DXA percent fat as the dependent
variable showed that while the correlation between measured percent fat and
percent fat estimated with the Jackson-Pollock equations was high, 0.94 (SEE = 3.4), race
accounted for an additional proportion of DXA percent fat variance (Jackson
et al., 2005). Further research is needed to extend the generalizability of these
equations to different ethnic groups.
My final body composition measurement milestone is the World Health Organization (WHO) establishment of BMI-based overweight and obesity standards
for men and women (WHO, 1998). The WHO standards defined a preobese state
(overweight) as a BMI between 25 and 29.9 kg/m² and obesity as a BMI ≥ 30 kg/m².
While exercise physiologists recognize percent body fat measures as the most
valid method of assessing body composition, BMI is easily measured, requiring
just height and weight, and it is easy to understand. A Google search for the term
“body mass index” produced over 23 million hits (web search date 11/11/2005),
which documents its reach in the general population. In my judgment, the
primary value of adopting the WHO BMI obesity standards has been to increase the
public's awareness of the growing incidence of obesity in all segments of society.
Lay individuals are aware that regular physical activity is needed for weight control
and that being obese has health consequences.
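The ease of computing BMI can be illustrated with a short sketch. It uses only the two WHO cut points quoted above (25 and 30 kg/m²). The function names and category labels are hypothetical, and treating values between 29.9 and 30 as preobese is an assumption made here to avoid a gap at the boundary.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI in kg/m^2: weight divided by height squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / (height_m ** 2)


def who_bmi_category(bmi: float) -> str:
    """Classify a BMI value using the two cut points quoted in the text.

    Assumption: values between 29.9 and 30 are grouped with the
    preobese range, so the upper boundary is a plain `< 30` comparison.
    """
    if bmi >= 30.0:
        return "obese"
    if bmi >= 25.0:
        return "preobese (overweight)"
    return "below the overweight cut point"


if __name__ == "__main__":
    # Illustrative values only, not data from any study.
    bmi = body_mass_index(weight_kg=85.0, height_m=1.75)
    print(f"BMI = {bmi:.1f} kg/m^2 -> {who_bmi_category(bmi)}")
```

For the illustrative person above (85 kg, 1.75 m), the BMI is about 27.8 kg/m², which falls in the preobese range. The sketch needs nothing beyond height and weight, which is the practical appeal of BMI noted in the text.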
A limitation of defining obesity with a common BMI standard is that it
assumes that the relationship between BMI and body fatness is independent of variables such as age, sex, ethnicity, and level
of physical activity. A review of research examining the age and sex effect on the
BMI and percent body fat relationship showed that age and sex account for significant percent body fat variation beyond BMI (Deurenberg, Weststrate, & Seidell,
1991; Deurenberg, Yap, & van Staveren, 1998; Deurenberg-Yap, Schmidt, van
Staveren, & Deurenberg, 2000; Gallagher et al., 1996; Jackson et al., 2002). The
male-female percent fat differences were substantial, ranging from 10.8 to 12.1%
fat. The yearly aging effect ranged from 0.13 to 0.23% fat (Jackson et al., 2002).
Gallagher and associates (1996) reported that they did not find a racial effect after
accounting for age and gender. In contrast, Bray et al. (2005), using DXA to measure percent body fat, found a racial effect. For the same BMI, the DXA percent
fat of Black men and women was 1.9% lower than that of White subjects, and the percent
fat of Hispanic subjects was 1.3% higher. We will see much more of this type of
research in the future.
Even with the sex, age, and race bias, the WHO BMI standards have,
in my opinion, been extremely useful. They have been important in identifying
a major public health problem. The issue of bias can be handled by using more
accurate measurement methods such as DXA when accuracy of measurement is
a primary concern.
Summary
In summary, health-related fitness, while initially a controversial topic with both proponents and opponents, is now recognized as being based on sound science, is widely
accepted, and is viewed by many as the standard for physical fitness testing.
References
AAHPER. (1976). Youth fitness test manual. Washington, DC: AAHPER.
Åstrand, P., & Rodahl, K. (1970). Textbook of work physiology. New York: McGraw-Hill.
Åstrand, P., & Ryhming, I. (1954). A nomogram for calculation of aerobic capacity (physical fitness) from pulse rate during submaximal work. Journal of Applied Physiology,
7, 218-221.
Ayoub, M. (1982). Control of manual lifting hazards: III. Preemployment screening. Journal
of Occupational Medicine, 24, 751-761.
Baumgartner, T.A., & Jackson, A.S. (1982). Measurement for evaluation in physical education and exercise science (3 ed.). Dubuque, IA: Wm. C. Brown.
Baumgartner, T.A., & Jackson, A.S. (1999). Measurement for evaluation in physical education and exercise science (6th ed.). Dubuque, IA: Wm. C. Brown.
Baumgartner, T.A., Jackson, A.S., Mahar, M.T., & Rowe, D.A. (2003). Measurement for
evaluation in physical education and exercise science (7th ed.). Dubuque, IA: Wm.
C. Brown.
Behnke, A.R., Feen, B.G., & Welham, W.C. (1942). The specific gravity of healthy men.
Journal of the American Medical Association, 118, 495-498.
Blair, S.N., Kohl, H.W., Paffenbarger, Jr., R.S., Clark, D.G., Cooper, K.H., & Gibbons, L.W.
(1989). Physical fitness and all-cause mortality: A prospective study of healthy men and
women. Journal of the American Medical Association, 262, 2395-2401.
Blair, S.N., Kohl III, H.W., Barlow, M.S., Paffenbarger, Jr., R.S., Gibbons, L.W., & Macera,
C.A. (1995). Changes in physical fitness and all-cause mortality: A prospective study
of healthy and unhealthy men. Journal of the American Medical Association, 273,
1093-1098.
Borg, G. (1998). Borg's perceived exertion and pain scaling method. Champaign, IL:
Human Kinetics.
Bray, M., Ellis, K., Sailors, M., McFarlin, B., Turpin, I., & Jackson, A.S. (2005). Black,
Hispanic and White differences in the relation between BMI and DXA percent fat of
men and women: The TIGER study. Obesity Research, 13, A29.
Brozek, J., Grande, F., & Anderson, J.T. (1963). Densitometric analysis of body composition: Revision of some quantitative assumptions. Annals of the New York Academy of
Sciences, 110, 113-140.
Brozek, J., & Keys, A. (1951). The evaluation of leanness-fatness in man: Norms and intercorrelations. British Journal of Nutrition, 5, 194-206.
Buskirk, E., & Taylor, H.L. (1957). Maximal oxygen intake and its relationship to body
composition, with special reference to chronic physical activity and obesity. Journal
of Applied Physiology, 11, 72-78.
Buskirk, E.R. (1953). Relationships in man between the maximal oxygen intake and
components of body composition. Unpublished PhD, University of Minnesota, Minneapolis.
Buskirk, E.R. (1992). From Harvard to Minnesota: Keys to our History. In J.O. Holloszy (Ed.),
Exercise and sport sciences reviews (pp. 1-26). Baltimore: Williams & Wilkins.
Chaffin, D.B. (1974). Human strength capability and low-back pain. Journal of Occupational
Medicine, 16, 248-254.
Chaffin, D.B. (1975). Ergonomics guide for the assessment of human static strength. American Industrial Hygiene Association Journal, 36, 505-511.
Chaffin, D.B., Herrin, G.D., Keyserling, M., & Garg, A. (1977). A method for evaluating
the biomechanical stresses resulting from manual materials handling jobs. American
Industrial Hygiene Association Journal, 38, 662-675.
Chaffin, D.B., Herrin, G.D., & Keyserling, W.M. (1978). Preemployment strength testing.
Journal of Occupational Medicine, 20, 403-408.
Chaffin, D.B., & Park, K.S. (1973). A longitudinal study of low-back pain as associated
with occupational weight lifting factors. American Industrial Hygiene Association
Journal, 34, 513-525.
Cooper, K.H. (1968). A means of assessing maximal oxygen intake. Journal of the American
Medical Association, 203, 201-204.
Deurenberg, P., Weststrate, J.A., & Seidell, J.C. (1991). Body mass index as a measure of
body fatness: Age- and sex- specific prediction formulas. British Journal of Nutrition,
65, 105-114.
Deurenberg, P., Yap, M., & van Staveren, W.A. (1998). Body mass index and percent body
fat: A meta analysis among different ethnic groups. International Journal of Obesity,
22, 1164-1171.
Deurenberg-Yap, M., Schmidt, G., van Staveren, W.A., & Deurenberg, P. (2000). The paradox of low body mass index and high body fat percentage among Chinese, Malays and
Indians in Singapore. International Journal of Obesity, 24, 1011-1017.
Durnin, J.V.G.A., & Womersley, J. (1974). Body fat assessed from total body density and
its estimation from skinfold thickness: Measurements on 481 men and women aged
from 16 to 72 years. British Journal of Nutrition, 32, 77-92.
Gallagher, D., Visser, M., Sepulveda, D., Pierson, R.N., Harris, T., & Heymsfield, S.B. (1996).
How useful is body mass index for comparison of body fatness across age, sex, and
ethnic groups? American Journal of Epidemiology, 143(3), 228-239.
Going, S.B. (1996). Densitometry. In A.F. Roche, S.B. Heymsfield, T.G. Lohman (Eds.),
Human Body Composition (pp. 3-23). Champaign, IL: Human Kinetics.
Golding, L.A., Meyers, C.R., & Sinning, W.E. (1989). The Y's way to physical fitness (3rd
ed.). Chicago: National Board of YMCA.
Hubert, H.B., et al. (1983). Obesity as an independent risk factor for cardiovascular diseases:
A 26-year follow-up of participants in the Framingham heart study. Circulation, 67,
968-977.
Jackson, A.S. (1994). Preemployment physical evaluation. Exercise and Sport Sciences
Reviews, 22, 53-90.
Jackson, A.S., Franks, B.D., Katch, F.I., Katch, V.L., Plowman, S.A., & Safrit, M.J. (1976).
A position paper on physical fitness. Position paper of a joint committee representing
the Measurement and Evaluation, Physical Fitness and Research Councils of AAHPER,
Washington, DC.
Jackson, A.S., Borg, G., Zhang, J.J., Laughery, K.R., & Chen, J. (1997). Role of physical
work capacity and load weight on psychophysical lift ratings. International Journal
of Industrial Ergonomics, 20, 181-190.
Jackson, A.S., Ellis, K., Sailors, M., McFarlin, B., Turpin, I., & Bray, M. (2005). The generalizability of the Jackson-Pollock skinfold equations for Black and Hispanic men
and women: The TIGER study. Obesity Research, 13, A140.
Jackson, A.S., Kampert, J.B., Barlow, C.E., Morrow, J.R., Jr., Church, T.S., & Blair, S.N.
(2004). Longitudinal changes in cardiorespiratory fitness: Measurement error or true
change? Medicine and Science in Sports and Exercise, 36, 1175-1180.
Jackson, A.S., Katch, F.I., Katch, V.L., Plowman, S.A., & Safrit, M.J. (1976). A position
paper on physical fitness (pp. 50). Washington, DC: AAHPER.
Jackson, A.S., & Pollock, M.L. (1976). Factor analysis and multivariate scaling of anthropometric variables for the assessment of body composition. Medicine and Science in
Sports, 8, 196-203.
Jackson, A.S., & Pollock, M.L. (1978). Generalized equations for predicting body density
of men. British Journal of Nutrition, 40, 497-504.
Jackson, A.S., Pollock, M.L., & Ward, A. (1980). Generalized equations for predicting body
density of women. Medicine and Science in Sports and Exercise, 12, 175-182.
Jackson, A.S., & Sekula, B.K. (1999). The influence of strength and gender on defining
psychophysical lift capacity. Proceedings of the Human Factors and Ergonomics
Society, 43, 723-727.
Jackson, A.S., Stanforth, P.R., Gagnon, J., Rankinen, T., Leon, A.S., Rao, D.C., Skinner, J.S.,
Bouchard, C., & Wilmore, J.H. (2002). The effect of sex, age, and race on estimating
percent body fat from BMI: The HERITAGE Family Study. International Journal of
Obesity, 26, 789-796.
Jackson, A.W., Morrow, J.R., Jr., Brill, P.A., Kohl, H.W., III, Gordon, N.F., & Blair, S.N.
(1998). Relations of sit-up and sit-and-reach tests to low back pain in adults. The Journal
of Orthopaedic & Sports Physical Therapy, 27(1), 22-26.
Liles, D.H., Deivanayagam, S., Ayoub, M.M., & Mahajan, P. (1984). A job severity index
for the evaluation and control of lifting injury. Human Factors, 26, 683-693.
Lohman, T.G. (1992). Advances in body composition assessment. Champaign, IL: Human
Kinetics.
Lohman, T.G. (1996). Dual energy x-ray absorptiometry. In A.F. Roche, S.B. Heymsfield,
T.G. Lohman, (Eds.), Human body composition (pp. 63-78). Champaign, IL: Human
Kinetics.
Myers, J., Prakash, M., Froelicher, V., Do, D., Partington, S., & Atwood, J.E. (2002). Exercise capacity and mortality among men referred for exercise testing. The New England
Journal of Medicine, 346, 793-801.
NIOSH. (1977). Preemployment strength testing. Washington, DC: U.S. Department of
Health and Human Services.
Pate, R.R. (1978). South Carolina physical fitness test manual. Columbia, SC: Governor's
Council on Physical Fitness.
Plowman, S.A., Sterling, C.L., Corbin, C.B., Meredith, M.D., Welk, G.J., & Morrow,
J.R., Jr. (in press). The History of FITNESSGRAM®. Journal of Physical Activity
& Health.
Plowman, S.A. (1992). Physical activity, physical fitness, and low back pain. In J.O. Holloszy (Ed.), Exercise and sport sciences reviews (Vol. 20, pp. 221-242). Baltimore:
Williams & Wilkins.
Pollock, M.L. (1975). Prediction of body density in young and middle-aged women. Journal
of Applied Physiology, 38, 745-749.
Pollock, M.L., Hickman, T., & Kendrick, Z. (1976). Prediction of body density in young
and middle-aged men. Journal of Applied Physiology, 40, 300-304.
Schutte, J.E., Townsend, E.J., Hugg, J., Shoup, R.F., Malina, R.M., & Blomqvist, C.G.
(1984). Density of lean body mass is greater in Blacks than in Whites. Journal of
Applied Physiology, 56, 1647-1649.
Sinning, W.E. (1978). Anthropometric estimation of body density, fat, and lean body weight
in women gymnasts, 10, 234-249.
Sinning, W.E., Dolny, D.G., & Little, K.D. (1985). Validity of ‘generalized’ equations for
body composition in male athletes. Medicine and Science in Sports and Exercise, 17,
124-130.
Siri, W.E. (1956). The gross composition of the body. In C.A. Tobias & J.H. Lawrence (Eds.),
Advances in biological and medical physics (Vol. 4). New York: Academic.
Slaughter, M.H., Lohman, T.G., Boileau, R.A., Horswill, C.A., Stillman, R.J., VanLoan,
M.D., et al. (1988). Skinfold equations for estimating of body fatness in children and
youth. Human Biology, 60, 709-723.
Snook, S.H. (1978). The design of manual handling tasks. Ergonomics, 21, 963-985.
Snook, S.H. (1991). Low back disorders in industry. Proceedings of the Human Factors
Society 35th Annual Meeting, 35, 830-833.
Snook, S.H., Campanelli, R.A., & Hart, J.W. (1978). A study of three preventive approaches
to low back injury. Journal of Occupational Medicine, 20, 478-481.
Taylor, H.L., Buskirk, E., & Henschel, A. (1955). Maximal oxygen intake as an objective
measure of cardiorespiratory performance. Journal of Applied Physiology, 8, 73-80.
USDHHS. (1996). Physical activity and health: A report of the Surgeon General. Washington,
DC: U.S. Department of Health and Human Services.
Wagner, D.R., & Heyward, V.H. (2000). Measures of body composition in Blacks and Whites:
A comparative review. American Journal of Clinical Nutrition, 71, 1392-1402.
Waters, T.R., Putz-Anderson, V., Garg, A., & Fine, L.J. (1993). Revised NIOSH equation
for the design and evaluation of manual lifting tasks. Ergonomics, 7, 749-766.
WHO. (1998). Obesity: Preventing and managing the global epidemic. Report of a WHO
consultation on obesity. Geneva, Switzerland: World Health Organization.
Williams, P.T. (2003). The illusion of improved physical fitness and reduced mortality.
Medicine and Science in Sports and Exercise, 35, 736-740.
Author Note
1. The position paper has been given to the AAKPE for placement in its archives.
Appendix
Position Paper Recommendations
on Youth Physical Fitness
Recommendations—Physical Fitness
It is recommended that a battery of tests be developed to measure physical
fitness related to functional health. Within the limitations of current scientific
information three recommendations are made:
1. That distance run tests be used as field tests of cardiorespiratory function. The
recommended test is (a) one-mile run/walk for time or (b) nine-min run/walk
for distance.
2. That an anthropometric nationwide study be conducted to establish a valid
field test to estimate body fat in school-age children.
3. That the maximum number of flexed-leg sit-ups achieved in one minute be
used as the test to measure abdominal strength/endurance.
Recommendation—Task Specific Tests
Of Motor Performance
It is recommended that physical education teachers who identify specific needs
with regard to aspects of motor performance that are specific to a task be encouraged to select supplementary items to measure these task-specific components. The
AAHPERD Youth Fitness test battery is one source for task specific tests.
Recommendations—Evaluation Strategy
A total of six recommendations are made.
1. That both criterion-referenced and norm-referenced standards be developed
for the physical fitness battery.
2. That the issues related to sampling be discussed with statistical experts in order
to explore the feasibility of sampling sub-groups.
3. That separate norms be utilized for boys and girls if it can be established that the
separation is based on physiological differences rather than social and cultural
differences.
4. That the use of the Nielson-Cozens classification be eliminated. This supports
the change made in the 1976 revision of the AAHPERD Youth Fitness Test.
5. That the percentile age norm be retained.
6. That polynomial regression models be explored as a possible second set of
age-adjusted norms. (Jackson, Katch, Katch, Plowman, & Safrit, 1976)