A Structural Factor Analysis of Vocabulary Knowledge and Relations to Age

Journal of Gerontology: PSYCHOLOGICAL SCIENCES
2005, Vol. 60B, No. 5, P234–P241
Copyright 2005 by The Gerontological Society of America
Ryan P. Bowles, Kevin J. Grimm, and John J. McArdle
Department of Psychology, University of Virginia, Charlottesville.
Vocabulary knowledge may not be a unidimensional construct, and the relations between vocabulary knowledge
and age may depend on the aspect of vocabulary knowledge being assessed. In this study, we examined the factor
structure of a vocabulary test given to a large nationally representative sample of individuals (N ≈ 20,500).
Results indicated that the vocabulary test is not unidimensional but bidimensional, with Basic Vocabulary and
Advanced Vocabulary factors. An analysis of age differences indicates that basic vocabulary is highest around the
age of 30, with a negative relation to age in late adulthood; in contrast, advanced vocabulary is unrelated to age
between ages 35 and 70. Cohort effects may explain some of the differential age trend.
VOCABULARY knowledge is one of the few cognitive
skills that remain relatively intact over adulthood. Unlike
most cognitive abilities, which peak when a person is around
the age of 20 and then decline with age, vocabulary knowledge
seems to peak around age 50 or possibly later, and decline only
slowly, if at all, into old age. This distinction was recognized
early in research on cognitive aging (Jones & Conrad, 1933;
Sorenson, 1938), and the finding has been confirmed in numerous types of studies, including cohort sequential (Schaie, 1996),
repeated cross-sectional (Wilson & Gove, 1999b), and longitudinal (McArdle et al., 2003) studies. The shape of the age trend
also seems to be consistent across types of tests of vocabulary
knowledge, although the magnitude of age-related changes may
vary (Bowles & Salthouse, 2003; Verhaeghen, 2003).
Several theoretical explanations for the age-related stability
of vocabulary knowledge have been proposed. Among the most
influential is the Cattell-Horn theory of fluid and crystallized
intelligence (Cattell, 1941, 1963; Horn, 1998; Horn & Cattell,
1966), which posits that vocabulary knowledge, as a prototypical example of crystallized intelligence, is maintained
across the life span, whereas cognition in general declines with
age. Vocabulary knowledge is maintained because of new learning, and because of reinforcement through usage of previously
learned material (Cattell, 1987). After the school years, most
learning is in specific areas, suggesting that the increase or
maintenance of vocabulary knowledge in later adulthood may
be a result of increasing esoteric knowledge (Cattell, 1998).
A second theory of the aging of vocabulary knowledge is the
dual-representation theory of knowledge (e.g., Brainerd &
Reyna, 1992), or related theories of changing representations
(McGinnis & Zelinski, 2003). Vocabulary knowledge may be
characterized by two types of representations, a general ‘‘gist’’
definition and a detailed specific definition, or by a single representation varying along a specificity continuum (McGinnis &
Zelinski, 2000). Age is associated with a reduction in the ability
to generate and access the specific definition (McGinnis &
Zelinski, 2003). Older adults may compensate for the age-related decline in cognition by relying more heavily on the
general definition (Botwinick & Storandt, 1974; Tun, Wingfield, Rosen, & Blanchard, 1998), thus maintaining vocabulary
knowledge despite cognitive decline.
A third theory of the aging of vocabulary knowledge is the
transmission deficit hypothesis (MacKay & Abrams, 1998;
MacKay & Burke, 1990). The transmission deficit hypothesis is
based on node structure theory (NST), a version of a spreading
activation model of the organization of lexical information
(MacKay, 1987), with each individual representation of knowledge, whether semantic, phonological, or orthographic, contained in a node. Under NST, activation is passed (called
priming in the terminology of NST) between nodes according
to the strength of the connection between the nodes. The
connections become universally weaker or less efficient with
age (MacKay & Abrams) but strengthen with cumulative usage,
so that, overall, vocabulary knowledge remains constant
(Burke, MacKay, & James, 2000).
Some researchers have suggested that the age trend observed
in tests of vocabulary knowledge is at least in part attributable
to cohort effects (Alwin & McCammon, 1999, 2001; Alwin,
McCammon, Rodgers, & Wray, 2003; Glenn, 1999; see also
Wilson & Gove, 1999a, 1999b). What appears to be an effect of
age may actually be explained by an intercohort decline in
vocabulary knowledge, a result of changes in family structure
(Alwin, 1991) or a decline in time spent reading (Glenn, 1994,
1999). Much of the decline is suppressed by an intercohort increase in schooling (Glenn, 1999), yielding what appears to be
overall stability in vocabulary knowledge in adulthood.
One problem with these theories is that the structure of vocabulary knowledge is not generally studied in depth. Most
researchers have assumed or concluded that vocabulary is a
unitary construct. In particular, factor analyses of cognitive tasks
including vocabulary tests generally yield the same results
regardless of the particular vocabulary test employed (Carroll,
1993, p. 158). However, these results are based on the assumption that a single dimension of vocabulary knowledge is tested
on any given vocabulary test. Vocabulary knowledge may be
multidimensional, and there may be different age trends for
different aspects of vocabulary knowledge.
Factor analyses of the item responses to vocabulary tests often
yield two factors, although the identification of the factors
differs across studies. Bailey and Federman (1979) identified
Breadth and Depth factors; Gustafsson and Holmberg (1992)
identified factors associated with word origin for a Swedish
vocabulary test. In an analysis of the items from the Wechsler
Adult Intelligence Scale–Revised, Beck and colleagues (1989)
found that the vocabulary items split into two factors, with the
easy items loading on a Verbal factor, along with most of the
items from the standard verbal scales (Information, Similarities,
Comprehension), and the difficult items loading on an Advanced Vocabulary factor, along with the difficult items from
the Information scale. None of these studies, however, examined the relations between the vocabulary factors and age.

Table 1. Examples of Items From the Thorndike-Gallup Test

Target Word  Option 1      Option 2   Option 3     Option 4       Option 5
Pristine     flashing      earlier    primeval     bound          green
Edible       auspicious    eligible   fit to eat   sagacious      able to speak
Tactility    tangibility   grace      subtlety     extensibility  manageableness
Concern      see clearly   engage     furnish      disturb        have to do with
Encomium     repetition    friend     panegyric    abrasion       expulsion

Note: These items are selected at random from those presented in Miner (1957).

In this study, we assess the factor structure of a test of
vocabulary for potential multidimensionality in vocabulary
knowledge. We then compare dimensions of vocabulary knowledge with age to see if different types of vocabulary knowledge
have different relations to age. These analyses are based on data
from a nationally representative repeated cross-sectional sample, allowing improved generalizations of the results to the
U.S. population.
METHODS
Data
The data are from the General Social Survey (GSS), a
collection of questionnaires given by in-person interview yearly
or biyearly to a random cross section of U.S. households, with
a new cross section selected in each year. (Each year’s cross
section was independent of all other years, so that the likelihood
that a respondent was included more than once is negligible.)
According to 1985 data reported in the GSS codebook
(National Opinion Research Center, 2004), the household population from which the sample is drawn includes approximately 97.3% of all U.S. adults. Only adults are included,
and persons not living in households are excluded. The excluded individuals include 9.4% of adults aged 18 to 24, mostly
military personnel and college students living in dormitories
(but also incarcerated persons); 11.4% of adults aged 75 and
older, mostly nursing home and other long-term-care residents;
and between 0.8% and 1.4% of adults aged 25 to 74. Also
excluded were non-English speaking households, which were
2% or less of the total adult population. Response rates were
approximately constant across all survey years, generally between 75% and 80%.
In 15 years between 1974 and 2000, inclusive, a vocabulary
test was administered as part of the GSS (generally every other
year). The vocabulary test has 10 multiple-choice items, with
each item consisting of a target word with five options (usually
a single word, but in some cases a short phrase). Respondents
were asked to choose the option that is most similar in meaning
to the target word. The items are a subset of the items on the
Thorndike–Gallup test of verbal intelligence (Thorndike, 1942;
Thorndike & Gallup, 1944). The administrators of the GSS ask
that the specific content of the items not be publicized.
Therefore, in order to give readers a sense of the items on the
test, we present five items selected at random from the full
Thorndike–Gallup test in Table 1. We refer to the items on the
GSS vocabulary test as Word A, Word B, . . ., Word J. Responses to the items were coded as correct, incorrect, or no
answer. We considered nonresponses to be incorrect. Other
options for dealing with nonresponses yielded similar results.
The GSS subsample we used is aggregated across the 15 survey
years and consists of N = 20,560 adults who were given the
vocabulary test. Because our unit of analysis is the adult person
whereas the unit of sampling for the GSS was the household
(from which one adult was selected at random), we reweighted
the data for all analyses by the number of adults living in the
household. In two of the survey years (1982 and 1987), Blacks
were oversampled. In order to maintain consistency with other
survey years, we removed this oversample from the data,
although leaving them in the data set yielded nearly identical
results. Consistent with previous analyses of the GSS vocabulary data (e.g., Alwin, 1991), we did not adjust for differential
sample size across survey years, as the sample size was approximately equal in each year. (In three years, 1988, 1989, and
1990, the vocabulary test was given every year instead of every
other year, but only to two thirds of the sample, so that, across
the three years, the sample size was approximately equal to the
sample size of any other two years.) Age ranged from 18 to 89
years (M = 45.2), and birth year ranged from 1885 to 1982 (M =
1943). The sample was 57% female. The mean number of years
of formal schooling was 12.6. Table 2 describes the demographic characteristics of the sample in more detail.
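
The scoring and weighting rules described above can be sketched in a few lines. This is a minimal illustration rather than the authors' script: the function names and the toy data are hypothetical, and it assumes each respondent record carries the number of adults living in the household.

```python
def score(response):
    """Return 1 for a correct answer; incorrect and missing answers both score 0."""
    return 1 if response == "correct" else 0


def weighted_proportion_correct(responses, adults_in_household):
    """Proportion correct on one item, weighting each respondent by household size."""
    if len(responses) != len(adults_in_household):
        raise ValueError("responses and weights must have the same length")
    total_weight = sum(adults_in_household)
    weighted_hits = sum(w * score(r) for r, w in zip(responses, adults_in_household))
    return weighted_hits / total_weight


# Hypothetical toy data: four respondents answering one item.
responses = ["correct", "no answer", "incorrect", "correct"]
adults = [1, 2, 3, 2]
print(weighted_proportion_correct(responses, adults))  # 3 / 8 = 0.375
```

Weighting by the number of adults undoes the unequal selection probability that arises when one adult is drawn per household: a respondent from a three-adult household stands in for three adults.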
The items differed in the proportion of persons who answered
correctly, and in the relation between the proportion of correct
responses and age, as illustrated in Figure 1. It appears that the
more difficult items (lowest on the Y axis) tended to remain
constant across late adulthood ages, whereas the easier items (top
section of the Y axis) tended to display an age-related decline.
Assessing Factor Structure
Responses to binary items are not normally distributed and
therefore violate a key assumption of standard factor analysis
(i.e., the normality of the unique factors; see Browne, 1984).
Previous work has shown that factor analyses of dichotomous
items can yield biased factor loadings (Parry & McArdle, 1991)
or generate spurious factors (McDonald & Ahlawat, 1974).
Researchers have proposed several versions of nonlinear factor
analysis to deal with these problems (McDonald & Ahlawat,
1974; Tate, 2003).
In this study, we report results from the version of nonlinear
factor analysis implemented in the Mplus program (Muthén &
Muthén, 2004). Mplus allows for both exploratory and confirmatory factor analysis of dichotomous items through the use
of tetrachoric correlations of categorical data. The calculation of
tetrachoric correlations involves the assumption that underlying
each categorical variable is a continuous, normally distributed
latent response propensity. Associated with each boundary
between categories is a threshold, so that if a person has a response propensity above the threshold, the person will respond
in the higher category. Factor analysis of the tetrachoric
correlations associated with the categorical variables can be
conceptualized as standard linear factor analysis of the latent
response propensities. (Alternatively, one can approximately
achieve the version of nonlinear factor analysis implemented in
Mplus by estimating the tetrachoric correlations externally and
then using these correlations in any factor analysis or structural
equation modeling program.) Thus, the nonlinearity between a
factor and its indicators comes from the combination of a linear
effect of the factor on the response propensities and a nonlinear
dichotomization of the response propensities. For a more
detailed introduction to nonlinear factor analysis, see Bartholomew, Steele, Moustaki, and Galbraith (2002). We made
comparisons between factor solutions on the basis of the chi-square discrepancy function compared with the degrees of
freedom (with α = .01), the root mean square error of approximation (RMSEA; Browne & Cudeck, 1993), the number
of eigenvalues of the tetrachoric correlation matrix greater than
1, the number of eigenvalues greater than those from simulated unidimensional data, and the overall interpretability of the factors.

Table 2. Demographic Characteristics of GSS Sample

Birth year        Before 1921                1921–1940                  1941–1960                  After 1960
Age              n Female (%) Education     n Female (%) Education     n Female (%) Education     n Female (%) Education
< 25                                                                 747         54      12.4 1,503         53      12.8
25–29                                                              1,145         57      13.3 1,201         56      13.9
30–34                                      24         71      12.2 1,547         57      13.4   823         59      14.0
35–39                                     305         59      13.0 1,667         56      13.9   306         53      13.8
40–44                                     444         53      12.8 1,552         53      13.9
45–49                                     588         57      12.4 1,089         55      13.8
50–54           26         58      12.2   854         55      12.4   591         56      13.7
55–59          247         51      11.5   894         57      12.7   151         52      13.7
60–64          354         58      10.9   841         61      12.4
65–69          543         56      10.5   594         60      12.5
70–74          690         58      10.7   344         59      11.8
75–79          662         63      10.9    97         73      11.3
≥ 80           731         68      11.6

Notes: GSS = General Social Survey. Education refers to the average years of formal schooling. Blank cells indicate that no respondents were in that age and
cohort range.

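
To make the latent response propensity idea concrete, the sketch below estimates a tetrachoric correlation from a single 2 × 2 table of two dichotomous items. It is an illustrative two-step estimator (thresholds from the item margins, then the correlation that reproduces the both-correct cell), not the Mplus routine. The function names, the Simpson integration, and the bisection bounds are all assumptions of the sketch.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def bvn_density(h, k, r):
    """Standard bivariate normal density at (h, k) with correlation r."""
    one_minus_r2 = 1.0 - r * r
    exponent = -(h * h - 2.0 * r * h * k + k * k) / (2.0 * one_minus_r2)
    return math.exp(exponent) / (2.0 * math.pi * math.sqrt(one_minus_r2))


def prob_both_above(h, k, rho, steps=400):
    """P(X > h, Y > k) under correlation rho.

    Uses the identity dP/d(rho) = bivariate density at (h, k), integrated
    from 0 to rho with Simpson's rule (steps must be even).
    """
    independent = (1.0 - STD_NORMAL.cdf(h)) * (1.0 - STD_NORMAL.cdf(k))
    width = rho / steps  # signed step, so a negative rho integrates downward
    total = bvn_density(h, k, 0.0) + bvn_density(h, k, rho)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * bvn_density(h, k, i * width)
    return independent + total * width / 3.0


def tetrachoric(n11, n10, n01, n00):
    """Tetrachoric correlation from cell counts (n11 = both items correct)."""
    n = n11 + n10 + n01 + n00
    h = STD_NORMAL.inv_cdf(1.0 - (n11 + n10) / n)  # threshold for item 1
    k = STD_NORMAL.inv_cdf(1.0 - (n11 + n01) / n)  # threshold for item 2
    target = n11 / n
    lo, hi = -0.999, 0.999
    for _ in range(60):  # bisection: the joint probability increases with rho
        mid = (lo + hi) / 2.0
        if prob_both_above(h, k, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical counts: both thresholds at 0 and one third of people correct on both.
print(round(tetrachoric(200, 100, 100, 200), 3))
```

With both thresholds at zero the both-correct probability is 1/4 + arcsin(ρ)/(2π), so the hypothetical table above corresponds to ρ = .5, which gives a quick check on the numerical integration.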
An important disadvantage of Mplus is that it does not offer
correction for guessing, so it may overestimate the number of
factors (Hulin, Drasgow, & Parsons, 1983, chapter 8; Tate,
2003). We also employed versions of nonlinear factor analysis
with correction for guessing as implemented in the NOHARM
program (Fraser & McDonald, 1988; McDonald, 1981) and in
the TESTFACT program (Bock et al., 2003). Corrections for
guessing were small, and results from these analyses (available
by request) were not different from results with Mplus.
Structural Relations to Chronological Age
After establishing the factor structure of the GSS vocabulary
test, we assessed the relations between chronological age and
the factor(s). To account for a wide range of nonlinear shapes
while maintaining large group sizes, we created 12 dummy-coded variables to reflect 11 five-year age intervals and two
extreme age groups, age younger than 25 and age greater than or
equal to 80. Group sizes ranged from 731 for the oldest age
group to 2,394 for the 30–34 group. We then regressed the
factors on the age group variables, and adjusted the intercept so
that the overall factor mean was set to 0. Comparability of the
factor levels across factors is not clear, so that age trends across
factors can be meaningfully compared, but not levels. We calculated the regressions in two steps. First, we ran a confirmatory
factor analysis with item factor loadings less than .1 in absolute
value from the exploratory factor analysis set to 0 (all other
loadings were above .3). We then anchored the factor loadings
so that the factor variances were equal to 1, and we regressed the
factors on the age group variables directly in Mplus.
We aimed further analyses at modeling the age curves
identified by the age group variable regressions. We employed
the dual exponential growth curve model (McArdle, Ferrer-Caja, Hamagami, & Woodcock, 2002), which has been shown
to fit age curves well and has easily interpretable parameters.
The dual exponential model is given in Equation 1:
\[ F_n = b_0 + \exp(-\mathit{rate}_d \times \mathit{age}_n) - \exp(-\mathit{rate}_g \times \mathit{age}_n), \qquad (1) \]
where F_n is the factor score for person n and b_0 is the intercept.
In dual exponential growth, there are two competing forces,
exponential growth (rate_g) and exponential decline (rate_d).
Figure 1. Relation between age and proportion correct on the GSS
vocabulary items. Letters refer to the items (e.g., A refers to Word A).
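
The age grouping described under Structural Relations to Chronological Age (13 groups represented by 12 dummy-coded variables) can be sketched as follows. The function names are hypothetical, and the choice of the youngest group as the omitted reference category is an assumption; the text does not say which group served as the reference.

```python
# Labels for the 13 age groups: one open-ended group at each end
# and 11 five-year intervals (25-29 through 75-79) in between.
AGE_GROUP_LABELS = ["<25"] + [f"{lo}-{lo + 4}" for lo in range(25, 80, 5)] + [">=80"]


def age_group_index(age):
    """Map an age in years to a group index from 0 (<25) to 12 (>=80)."""
    if age < 25:
        return 0
    if age >= 80:
        return 12
    return (age - 25) // 5 + 1


def dummy_code(age, reference=0):
    """Return 12 indicator variables; the reference group is coded as all zeros."""
    index = age_group_index(age)
    return [1 if index == g else 0 for g in range(13) if g != reference]


print(AGE_GROUP_LABELS[age_group_index(32)])  # 30-34
print(dummy_code(32))
```

Thirteen groups need only 12 indicators because membership in the reference group is implied when every indicator is zero.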
Table 3. Tetrachoric Correlations of GSS Vocabulary Test Items

Item      Word B  Word C  Word D  Word E  Word F  Word G  Word H  Word I  Word J
Word A      .693    .339    .635    .493    .511    .281    .340    .466    .386
Word B              .427    .859    .723    .743    .424    .469    .663    .425
Word C                      .420    .471    .500    .466    .533    .331    .558
Word D                              .752    .738    .423    .454    .678    .411
Word E                                      .693    .459    .507    .480    .523
Word F                                              .463    .525    .534    .557
Word G                                                      .552    .291    .524
Word H                                                              .343    .598
Word I                                                                      .403

Because of the complexity of the nonlinearity in the dual
exponential model, it cannot be estimated in Mplus but instead
requires factor score estimates. Factor score estimates have
well-known validity issues, so we used two types of factor
score estimates: the factor score estimates from Mplus and, as
suggested by Wackwitz and Horn (1971), an unweighted sum
of the item scores of items loading on a single factor. Results
were similar for both estimates, so we report only results for the
simpler item sum score. We estimated the dual exponential
model by using the SAS NLIN procedure. Scripts are available
by request.
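
The simpler of the two factor score estimates, an unweighted item sum, might look like the sketch below. The item sets are an assumption read off the loading pattern in Table 4 (Words E and F load on both factors and are left out here); the text does not list which items entered each sum, and the function name is hypothetical.

```python
# Assumed item sets: items whose only salient loading is on one factor.
BASIC_ITEMS = ("A", "B", "D", "I")
ADVANCED_ITEMS = ("C", "G", "H", "J")


def sum_score(item_scores, items):
    """Unweighted sum of 0/1 item scores over the given items."""
    return sum(item_scores[item] for item in items)


# Hypothetical respondent: 1 = correct, 0 = incorrect or no answer.
respondent = {"A": 1, "B": 1, "C": 0, "D": 1, "E": 1,
              "F": 0, "G": 1, "H": 0, "I": 1, "J": 0}
print(sum_score(respondent, BASIC_ITEMS), sum_score(respondent, ADVANCED_ITEMS))  # 4 1
```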
An important concern in the interpretation of the age curves
is that the age relations in the GSS vocabulary test may reflect
at least in part a cohort effect (Alwin & McCammon, 1999,
2001; Wilson & Gove, 1999a, 1999b). In order to test this
possibility, we repeated the dual exponential analysis, allowing
for a linear effect of cohort. We then compared the age curves
within cohorts (i.e., allowing for a cohort effect) with the age
curves across cohorts.
RESULTS
Exploratory Factor Analyses
Tetrachoric correlations among the items as estimated by
Mplus are reported in Table 3. The first two eigenvalues were
greater than 1, and these values were greater than the eigenvalues of a simulated one-factor model (simulated using the
parameter estimates from the one-factor exploratory factor analysis solution). The RMSEAs for the one-factor and two-factor
solutions were .048 and .022, respectively. Both were below
the suggested .05 threshold (Browne & Cudeck, 1993), but
the two-factor solution had a substantially lower RMSEA. The
two-factor solution fit significantly better than the one-factor
solution (χ² for one factor = 1698, χ² for two factors = 282, Δχ² = 1416, df = 9,
p < .01). The three-factor solution did not converge as a result
of a Heywood case. On the basis of these findings, we
concluded that the two-factor solution best describes the GSS
vocabulary test.
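
The fit comparison can be retraced with simple arithmetic. In the sketch below the chi-square values and the sample size come from the text, but the model degrees of freedom (35 and 26) are assumptions obtained by counting 45 tetrachoric correlations against 10 and 19 free loadings; only their difference of 9 is reported. The RMSEA formula with N − 1 in the denominator is one common convention and may not match the exact Mplus computation.

```python
import math


def rmsea(chi_square, df, n):
    """Root mean square error of approximation for a single model."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


N = 20560                        # sample size reported in the Data section
CHI2_ONE, DF_ONE = 1698.0, 35    # df assumed: 45 correlations minus 10 loadings
CHI2_TWO, DF_TWO = 282.0, 26     # df assumed: 9 fewer than the one-factor model
CRITICAL_DF9_ALPHA_01 = 21.666   # tabled chi-square critical value, df = 9, alpha = .01

delta_chi2 = CHI2_ONE - CHI2_TWO
delta_df = DF_ONE - DF_TWO
two_factor_fits_better = delta_chi2 > CRITICAL_DF9_ALPHA_01

print(delta_chi2, delta_df, two_factor_fits_better)
print(round(rmsea(CHI2_ONE, DF_ONE, N), 3), round(rmsea(CHI2_TWO, DF_TWO, N), 3))
```

Under these assumed degrees of freedom the two RMSEA values round to the reported .048 and .022, which suggests the counting is at least consistent with the published figures.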
Factor Identification
The factor loadings from the Mplus exploratory factor
analysis with an oblique promax rotation (power = 4) are
displayed in Table 4, along with the proportion correct scores.
The factors were directly related to difficulty. One factor consisted of the difficult items, whereas the other factor consisted of
the easy items. The congruence coefficient between the factor
loadings on the first factor and difficulty as indicated by the
proportion correct was .96, whereas the congruence coefficient
between the second factor and one minus difficulty was .97. We
therefore identify the factors as Basic Vocabulary and Advanced
Vocabulary. The correlation between the two factors was .62.

Table 4. Factor Loadings for the Two-Factor Solution

Item      Factor 1 Loading   Factor 2 Loading   Proportion Correct
Word A          .703               .003                .78
Word B          .931               .018                .88
Word C          .045               .668                .21
Word D          .894               .051                .88
Word E          .558               .337                .69
Word F          .565               .356                .74
Word G          .022               .705                .32
Word H          .025               .796                .28
Word I          .685               .033                .73
Word J          .052               .752                .22

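
The congruence coefficient used to link the factors to item difficulty is the uncentered cosine between two vectors (Tucker's coefficient). The sketch below applies that standard formula to the loadings and proportions correct exactly as printed in Table 4. Any minus signs that may have been lost from the printed loadings are not restored, so the results should only be expected to land close to the reported .96 and .97, not to match them digit for digit. The function name is hypothetical.

```python
import math


def congruence(x, y):
    """Tucker's congruence coefficient: sum(xy) / sqrt(sum(x^2) * sum(y^2))."""
    cross = sum(a * b for a, b in zip(x, y))
    return cross / math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))


# Values for Words A through J as printed in Table 4.
factor1 = [.703, .931, .045, .894, .558, .565, .022, .025, .685, .052]
factor2 = [.003, .018, .668, .051, .337, .356, .705, .796, .033, .752]
prop_correct = [.78, .88, .21, .88, .69, .74, .32, .28, .73, .22]

basic = congruence(factor1, prop_correct)
advanced = congruence(factor2, [1 - p for p in prop_correct])
print(round(basic, 2), round(advanced, 2))
```

Unlike a Pearson correlation, the coefficient does not subtract the means, so two vectors that are proportional to each other score exactly 1.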
We examined several alternative factor structures in order to
assess the robustness of the factor identification, both statistically and in content. We examined the items for potential
multidimensionality in the content, but no differences in content
were apparent. In line with Gustafsson and Holmberg (1992),
we also looked at word origin as a possible source of multidimensionality. One of the words was of Germanic origin,
whereas the remaining nine can be traced to Latin through one
of the Romance languages, in most cases French. Therefore,
word origin did not provide a way to differentiate among the
items. Next, we repeated the exploratory factor analyses, dropping the easiest item and separately by dropping the hardest
item. Results were consistent with the two-factor solution with
Basic and Advanced Vocabulary. We also considered a confirmatory factor analysis with an essentially random selection of
items. We split the items into two factors on the basis of the
order of presentation, with odd items on one factor and even
items on the other. We also selected at random one item from
each factor to cross load with the other factor, in order to match
the two items cross loading on the Basic and Advanced
Vocabulary factors. The odd–even factors fit substantially worse
than the Basic and Advanced factors, with a RMSEA of .048
compared with .020. Finally, we considered slight perturbations
to the Basic and Advanced factor structure by exchanging
a randomly selected item from each factor (i.e., an easy item
moved to the Advanced factor and a difficult item moved to the
Basic factor). We repeated this procedure three times. In all
cases the fit was substantially worse, with RMSEAs ranging
from .049 to .050. Thus, we conclude that the Basic and
Advanced factor structure is a robust result.
Because we were interested in the relations between the
factors and age, we examined factorial invariance across age.
Exploratory factor analyses of the GSS data from each group
separately yielded similar factor patterns and loadings at all age
levels. A confirmatory factor analysis with multiple groups
yielded a significant gain in fit from allowing factor loadings to
vary across groups (Δχ² = 467, df = 120, p < .01). However, the RMSEA changed by only .002, indicating that there
was no meaningful change in fit. Furthermore, there were no
substantive or systematic differences in factor variances or
correlations across the age groups. Therefore, we concluded
that factorial invariance (both configural and metric) holds
across age groups (see endnote).
Figure 2. Relation between age and Basic and Advanced
Vocabulary factors. The zero point of both curves was arbitrarily set
to the overall mean.
Age Relations
Results of the age regressions are displayed in Figure 2 for the
two-factor solution. Numerical results are available by request.
Constraining the age relations to be identical yielded a substantial loss in fit (Δχ² = 607, df = 12, p < .01). It is clear that
the two factors have different relations to age. Basic Vocabulary, on one hand, displays an age-related increase in early
adulthood, has a peak at around age 35, and then shows an age-related decline in the later years. Advanced Vocabulary, on the
other hand, displays an age-related increase throughout adulthood, reaching an asymptote around age 45, and showing an
age-related decline only in very old age.
These findings were confirmed with the dual exponential
model. Basic vocabulary had a strong growth rate (.162) and
a smaller decline rate (.067), with a peak around age 30. Advanced vocabulary, in contrast, had a weaker growth rate (.052)
and a small but statistically significant decline rate (.016), with
a long plateau in later adulthood and a peak around age 50. The
factors differed in both growth rate (95% confidence interval,
or CI, for difference = [.074, .116]) and decline rate (95% CI
for difference = [.029, .046]; note that PROC NLIN reports
only confidence intervals, not p values).
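
The reported rates imply a peak location that can be worked out directly. The sketch below assumes the dual exponential model has the usual form F = b0 + exp(−rate_d × age) − exp(−rate_g × age), whose maximum falls at ln(rate_g / rate_d) / (rate_g − rate_d). The text does not say how age was scaled, so counting age in years past 18 (the youngest age in the sample) is purely an assumption of this sketch, as are the function names.

```python
import math


def dual_exponential(age, b0, rate_d, rate_g):
    """Assumed model form: intercept plus slow decline minus fast-fading growth term."""
    return b0 + math.exp(-rate_d * age) - math.exp(-rate_g * age)


def peak_age(rate_d, rate_g):
    """Age (on the model's own scale) where the curve peaks; needs rate_g > rate_d > 0."""
    return math.log(rate_g / rate_d) / (rate_g - rate_d)


AGE_ORIGIN = 18  # assumption: model age = chronological age minus 18

basic_peak = AGE_ORIGIN + peak_age(rate_d=0.067, rate_g=0.162)
advanced_peak = AGE_ORIGIN + peak_age(rate_d=0.016, rate_g=0.052)
print(round(basic_peak, 1), round(advanced_peak, 1))
```

Under that assumed age origin the implied peaks fall in the late 20s for Basic Vocabulary and near 50 for Advanced Vocabulary, roughly in line with the peaks stated in the text. The intercept b0 shifts the whole curve up or down and therefore has no effect on where the peak occurs.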
Cohort Effects
The dual exponential models with linear cohort effect are
displayed in Figure 3 for Basic Vocabulary and Figure 4 for Advanced Vocabulary. A small cohort effect is identifiable for
Advanced Vocabulary, such that more recently born cohorts had
lower Advanced Vocabulary knowledge (coefficient on cohort:
−.0024, 95% CI = [−.0043, −.0006]). After allowing for the
linear cohort effects, we found that there were larger growth
(.074 vs .052) and decline (.020 vs .016) rates. Basic Vocabulary displays a stronger cohort effect. There was a positive
effect of cohort (.0060, 95% CI = [.0048, .0073]), such that
more recently born cohorts had higher Basic Vocabulary knowledge. Furthermore, after allowing for the linear cohort effect, we
found that there were substantially smaller growth (.101 vs .162)
and decline (.040 vs .067) rates, although these rates were both
still larger than the rates for Advanced Vocabulary.
Because cohort effects in vocabulary knowledge may be
related to changes in years of schooling (Glenn, 1994, 1999),
we also repeated the cohort analysis, using years of formal
education instead of birth year as the covariate in the dual
exponential model. More years of schooling was associated
with higher vocabulary knowledge for both Basic Vocabulary
(.024, 95% CI = [.021, .027]) and Advanced Vocabulary (.050,
95% CI = [.047, .053]), although the effect on Advanced
Vocabulary was stronger. After allowing for the linear
education effect, we found that there were small reductions in
the growth (.149 vs .162) and decline (.055 vs .067) rates for
Basic Vocabulary, whereas for Advanced Vocabulary there was
essentially no change in either the growth (.051 vs .052) or
decline (.012 vs .016) rates.
Figure 3. Relation between age and the Basic Vocabulary factor
within and across cohorts.
Figure 4. Relation between age and the Advanced Vocabulary
factor within and across cohorts.
DISCUSSION
Vocabulary knowledge as measured by the GSS vocabulary
test does not appear to be a unidimensional concept. Instead, it is
bidimensional, with the dimensions reflecting differences in
difficulty. We therefore label the dimensions basic vocabulary
and advanced vocabulary. The two dimensions have different
relations to age. Basic Vocabulary has a peak at a relatively
young age, around 35, and has an age-related decline in later
adulthood; Advanced Vocabulary has its peak around age 45,
and it is essentially constant across almost all late adulthood ages.
Difficulty Factors
Factors related to difficulty when dichotomous items are used
can be problematic (McDonald & Ahlawat, 1974). Some researchers seem to indicate that difficulty factors are always
artifactual (Hattie, Krakowski, Rogers, & Swaminathan, 1996),
reflecting a misspecified relation between the true unidimensional factor and the observations. The misspecification can
result from the use of a linear factor analysis instead of a
nonlinear factor analysis (McDonald & Ahlawat, 1974), but it
may still arise if the nonlinear function is not approximately
correct. The misspecification may also arise from a lack of
correction for guessing (Hulin et al., 1983, chapter 8). Our use of
a nonlinear factor analysis may have minimized these problems,
and correcting for guessing had no effect on the results, but it is
not possible to assess to what degree we were successful in
eliminating the potential for artifactual multidimensionality.
In contrast, difficulty factors may indicate true multidimensionality, reflecting differences in the cognitive processes
required for easy and difficult items (McDonald & Ahlawat,
1974). Research into the sources of difficulty factors has not
yielded definitive means of addressing potential artifactuality.
One possibility is to use external validation. If the factors have
different relations to other variables, then a conclusion of actual
multidimensionality may be justified. We turn to the relations
with age as external evidence of the validity of the
bidimensionality of the GSS vocabulary test.
Generalizability
An important feature of these results is that, unlike many
psychological studies of aging, which rely on convenience
samples, the data come from a nationally representative sample.
The sampling procedure is not completely representative, as
institutionalized persons are not included. The excluded group includes nursing
home residents, who are presumably less able at cognitive tasks
than their uninstitutionalized peers, and college students,
who are presumably more able at cognitive tasks than their
uninstitutionalized peers. Therefore, less able older adults and
more able younger adults are likely underrepresented. The
average ability of very old adults on both factors of vocabulary
knowledge is likely overestimated, and the average ability of
the young adults is likely underestimated. However, it is
unlikely that this misestimation affects the general results
of this study, as it probably affects both factors approximately equally.
In this study we looked at only one test, the multiple-choice
GSS vocabulary test, and it is not known to what extent the
results generalize to other vocabulary tests. Analyses of other
tests have yielded less clear results. For example, using the
same methodologies used for the GSS vocabulary test, we also
analyzed Salthouse’s (1993) Synonyms Vocabulary Test (see
Bowles & Salthouse, 2003, for a description of the data). The
Synonyms test replicated the bidimensionality with factors reflecting difficulty, although the factors were less well identified because of a lack of difficult items; the lowest proportion
correct was .56, compared with .21 for the GSS vocabulary
test. The differential age relations were also found, although
again the difference was less clear because of the difficulty
in identifying the Advanced Vocabulary factor.
Related Findings
Some other studies have found differences in age trends
between basic and advanced vocabulary knowledge, although
in no case was the difference a focus of the findings. Hambrick,
Salthouse, and Meinz (1996) found stronger positive correlations between age and ability to complete crossword puzzles for
the more difficult New York Times crossword puzzles than for
relatively easier puzzles (.41 vs .03 and .07 in Study 2, .46 vs
.01 and .12 in Study 3, and .36 and .31 vs .16 in Study 4). In
the Seattle Longitudinal Study, scores on an advanced
vocabulary test peaked later and declined less in a person’s
late adulthood than scores on an easier vocabulary test when
longitudinal data were examined (Schaie, 1996, Fig. 5.5, p.
124), although the difference did not appear in cross-sectional
data (Fig. 4.6, p. 92).
Explanations for Differential Age Trends
It appears that no theory of the aging of vocabulary knowledge offers a good explanation for the seemingly counterintuitive finding that advanced vocabulary remains fairly stable in
late adulthood, whereas basic vocabulary displays an age-related decline. Results from the dual exponential model suggest
that a growth in esoteric knowledge does not account for the
findings, as the decline rates of basic and advanced vocabulary
differ, and advanced vocabulary, which presumably requires
more esoteric knowledge, has a weaker growth rate. Dual-representation theories do not predict differences in the age
trends between basic and advanced vocabulary, although it may
be possible to develop an ad hoc and perhaps convoluted
explanation. The transmission deficit hypothesis also does not
explain the findings, as it suggests that advanced vocabulary
should be more negatively related to age (Burke et al., 2000).
The analysis of cohort effects indicates that a theory
involving cohort effects may provide the best explanation for
the findings, although the currently conceptualized theories are
insufficient to explain all the results. One possible explanation
for the differential age trends is that there are two types of
cohort effects: word obsolescence (Hauser & Huang, 1997),
which would favor older adults and affect advanced vocabulary
more strongly, and the intercohort increase in schooling (Alwin
& McCammon, 2001), which affects both types of vocabulary
knowledge and counteracts the obsolescence effect for advanced vocabulary, thus yielding only a small observed cohort
effect for advanced vocabulary. These cohort effects would
yield an observed intercohort increase in basic vocabulary,
which appears as an age-related decline. However, the findings
on the effect of years of education suggest that education has
little explanatory power for the differential age trends. A more
thorough theory of the mechanisms behind cohort effects on
vocabulary knowledge is needed.
Measurement Issues
Issues of basic structural measurement are often ignored in
psychological research (Bowles & Salthouse, 2003; Michell,
1990). The total score is often taken to represent a
latent trait without testing the success of the representation, and
all measurement properties are assumed without being tested.
One commonly assumed measurement property is unidimensionality, which means that a single latent trait can account for
the observations generated with the instrument. Although in
some cases unidimensionality may hold, it need not, and, as
illustrated in this study, the differences between the dimensions
may be interesting and important. An assumption of unidimensionality can obscure the differences between the dimensions, so
that findings based on the instrument may not be appropriate.
Just as aggregating over persons can obscure important individual differences such that conclusions about the aggregate
may not apply to any particular individual (Allport, 1937; Estes,
1956; Nesselroade & Molenaar, 1997), aggregating over dimensions of measurement with an untenable assumption of
unidimensionality may yield conclusions that do not apply to
any of the true underlying dimensions. Thus, it is important to
test an assumption of unidimensionality, as well as other
assumed measurement properties.
Previous analyses of the GSS vocabulary test have assumed
that the test is unidimensional (e.g., Alwin & McCammon,
1999; Wilson & Gove, 1999a, 1999b). Our results indicate that
the test is bidimensional, suggesting that previous conclusions
may have ignored some of the interesting and informative
properties of the GSS vocabulary data. Future analyses should
incorporate the two-factor model in a structural equation modeling framework, or at least approximate it with an appropriate
selection of a factor score estimate. Although no optimal means
of estimating factor scores with nonlinear factor analysis has
been identified, Wackwitz and Horn (1971) suggest that the sum
of scores on items loading on a single factor may closely
approximate the factor score. Therefore, researchers could calculate two vocabulary scores, a basic vocabulary score equal to
the sum of the scores on Word A, Word B, Word D, and Word I,
and an advanced vocabulary score equal to the sum of the scores
on Word C, Word G, Word H, and Word J. This system of
equations cannot perfectly match the Basic and Advanced
factors, but it approximates them while maintaining a simple
raw score representation of vocabulary knowledge.
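The two sum scores described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the study: the item keys "A" through "J", the 0/1 coding of incorrect and correct responses, and the function name `vocabulary_sum_scores` are assumptions made for the example. Real GSS data files use their own variable names and missing-data codes, which would need to be handled before summing.

```python
from typing import Dict, Mapping

# Item groupings taken from the text; Words E and F are not included
# in either sum there.
BASIC_ITEMS = ("A", "B", "D", "I")
ADVANCED_ITEMS = ("C", "G", "H", "J")


def vocabulary_sum_scores(responses: Mapping[str, int]) -> Dict[str, int]:
    """Return basic and advanced sum scores for one respondent.

    `responses` maps a hypothetical item key ("A".."J") to 1 (correct)
    or 0 (incorrect). A missing item raises KeyError rather than being
    silently scored as incorrect.
    """
    return {
        "basic": sum(responses[item] for item in BASIC_ITEMS),
        "advanced": sum(responses[item] for item in ADVANCED_ITEMS),
    }


# Made-up respondent, for illustration only.
respondent = {"A": 1, "B": 1, "C": 0, "D": 1, "E": 1,
              "F": 0, "G": 1, "H": 0, "I": 0, "J": 1}
print(vocabulary_sum_scores(respondent))  # {'basic': 3, 'advanced': 2}
```

Each score ranges from 0 to 4. As the text notes, these sums only approximate the Basic and Advanced factors.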
Conclusion
Researchers sometimes point to vocabulary knowledge as
a fortunate exception to the general decline found in almost all
cognitive tasks. However, looking at age-related changes in
a unitary construct of vocabulary knowledge appears to be
misleading, as vocabulary knowledge, at least as measured by
the GSS vocabulary test, is better described as bidimensional.
The two dimensions have dramatically different relations to age.
Knowledge of basic vocabulary seems to peak at a relatively
early age and then have an age-related decline, similar to most
cognitive tasks, whereas knowledge of advanced vocabulary
seems to be stable after reaching a peak around the age of 45.
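Some of the age differences may be accounted for by a cohort
effect, but current theories of the aging of vocabulary
knowledge are insufficient to explain the differential age trends.
ACKNOWLEDGMENTS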
We gratefully acknowledge the support provided by Grants T32
AG20500-01 and AG07407 from the National Institute on Aging in the
preparation of this article. We thank Leo Stam and Karen Schmidt for
their help.
Address correspondence to Ryan P. Bowles, Department of Psychology,
University of Virginia, PO Box 400400, Charlottesville, VA 22904-4400.
REFERENCES
Allport, G. W. (1937). Personality: A psychological interpretation. New
York: Holt, Rinehart & Winston.
Alwin, D. F. (1991). Family of origin and cohort differences in verbal
ability. American Sociological Review, 56, 625–638.
Alwin, D. F., & McCammon, R. J. (1999). Aging versus cohort
interpretation of intercohort differences in GSS vocabulary scores.
American Sociological Review, 64, 272–286.
Alwin, D. F., & McCammon, R. J. (2001). Aging, cohorts, and verbal
ability. Journal of Gerontology: Social Sciences, 56B, S151–S161.
Alwin, D. F., McCammon, R. J., Rodgers, W. L., & Wray, L. A. (2003,
September). Populations, cohorts and processes of cognitive aging.
Paper presented at The Dynamic Processes in Ageing: The Relationships Among Cognitive, Social, Biological, Health and Economic
Factors in Aging, Canberra.
Bailey, K. G., & Federman, E. J. (1979). Factor analysis of breadth and
depth dimensions on Wechsler’s similarities and vocabulary subscales.
Journal of Clinical Psychology, 35, 341–345.
Baltes, P. B., Cornelius, S. W., Spiro, A., Nesselroade, J. R., & Willis, S.
(1980). Integration versus differentiation of fluid/crystallized intelligence in old age. Developmental Psychology, 16, 625–635.
Bartholomew, D. J., Steele, F., Moustaki, I., & Galbraith, J. I. (2002). The
analysis and interpretation of multivariate data for social scientists.
Boca Raton, FL: Chapman & Hall.
Beck, N. C., Tucker, D., Frank, R., Parker, J., Lake, R., Thomas, S., et al.
(1989). The latent factor structure of the WAIS-R: A factor analysis
of individual item responses. Journal of Clinical Psychology, 45,
281–293.
Bock, R. D., Gibbons, R., Schilling, S. G., Muraki, E., Wilson, D. T.,
& Wood, R. (2003). TESTFACT (Version 4) [Computer software].
Chicago: Scientific Software International.
Botwinick, J., & Storandt, M. (1974). Vocabulary ability in later life.
Journal of Genetic Psychology, 125, 303–308.
Bowles, R. P., & Salthouse, T. A. (2003, August). Age relations and
processing components of synonyms and antonyms. Paper presented at
the Annual Meeting of the American Psychological Association,
Toronto.
Brainerd, C. J., & Reyna, V. F. (1992). Explaining “memory free”
reasoning. Psychological Science, 3, 332–339.
Browne, M. W. (1984). Asymptotically distribution-free methods for the
analysis of covariance structures. British Journal of Mathematical &
Statistical Psychology, 37, 62–83.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model
fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation
models (pp. 136–162). Beverly Hills, CA: Sage.
Burke, D. M., MacKay, D. G., & James, L. E. (2000). Theoretical
approaches to language and aging. In T. J. Perfect & E. A. Maylor
(Eds.), Models of cognitive aging (pp. 204–237). Oxford, England:
Oxford University Press.
Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic
studies. Cambridge, England: Cambridge University Press.
Cattell, R. B. (1941). Some theoretical issues in adult intelligence testing.
Psychological Bulletin, 38, 592.
Cattell, R. B. (1963). Theory of fluid and crystallized intelligence: A critical
experiment. Journal of Educational Psychology, 54, 1–22.
Cattell, R. B. (1987). Intelligence: Its structure, growth and action.
Amsterdam: Elsevier.
Cattell, R. B. (1998). Where is intelligence? Some answers from the triadic
theory. In J. J. McArdle & R. W. Woodcock (Eds.), Human cognitive
abilities in theory and practice (pp. 29–38). Mahwah, NJ: Erlbaum.
Estes, W. K. (1956). The problem of inference from curves based on group
data. Psychological Bulletin, 53, 134–140.
Fraser, C., & McDonald, R. P. (1988). NOHARM: Least squares item
factor analysis. Multivariate Behavioral Research, 23, 267–269.
Glenn, N. D. (1994). Television watching, newspaper reading, and cohort
differences in verbal ability. Sociology of Education, 67, 216–230.
Glenn, N. D. (1999). Further discussion of the evidence for an intercohort
decline in education-adjusted vocabulary. American Sociological
Review, 64, 267–271.
Gustafsson, J.-E., & Holmberg, L. M. (1992). Psychometric properties of
vocabulary test items as a function of word characteristics. Scandinavian Journal of Educational Research, 36, 191–210.
Hambrick, D. Z., Salthouse, T. A., & Meinz, E. J. (1999). Predictors of
crossword puzzle proficiency and moderators of age–cognition
relations. Journal of Experimental Psychology: General, 128, 131–164.
Hattie, J., Krakowski, K., Rogers, H. J., & Swaminathan, H. (1996). An
assessment of Stout’s index of essential unidimensionality. Applied
Psychological Measurement, 20, 1–14.
Hauser, R. M., & Huang, M.-H. (1997). Verbal ability and socioeconomic
success: A trend analysis. Social Science Research, 26, 331–376.
Horn, J. L. (1998). A basis for research on age differences in cognitive
capabilities. In J. J. McArdle & R. W. Woodcock (Eds.), Human
cognitive abilities in theory and practice (pp. 57–91). Mahwah, NJ:
Erlbaum.
Horn, J. L., & Cattell, R. B. (1966). Age differences in primary mental
ability factors. Journal of Gerontology, 21, 210–220.
Hulin, C. L., Drasgow, F., & Parsons, C. K. (1983). Item response theory:
Applications to psychological measurement. Homewood, IL: Dow
Jones–Irwin.
Jones, J. C., & Conrad, H. S. (1933). The growth and decline of
intelligence: A study of a homogeneous group between the ages of ten
and sixty. Genetic Psychology Monographs, 13, 223–275.
MacKay, D. G. (1987). The organization of perception and action: A
theory for language and other cognitive skills. New York: Springer-Verlag.
MacKay, D. G., & Abrams, L. (1998). Age-linked declines in retrieving
orthographic knowledge: Empirical, practical and theoretical implications. Psychology & Aging, 13, 647–662.
MacKay, D. G., & Burke, D. M. (1990). Cognition and aging: A theory of
new learning and the use of old connections. In T. M. Hess (Ed.), Aging
and cognition: Knowledge organization and utilization (pp. 213–263).
New York: Elsevier.
McArdle, J. J., Ferrer-Caja, E., Hamagami, F., & Woodcock, R. W. (2002).
Comparative longitudinal structural analyses of the growth and decline
of multiple intellectual abilities over the life span. Developmental
Psychology, 38, 115–142.
McArdle, J. J., Grimm, K. J., Hamagami, F., Bowles, R. P., Ferrer-Caja, E.,
& Meredith, W. (2003, October). Modeling latent growth curves using
longitudinal data with non-repeated measurements. Paper presented at
the Annual Meeting of the Society for Multivariate Experimental
Psychology, Charlottesville, VA.
McDonald, R. P. (1981). The dimensionality of test and items. British
Journal of Mathematical and Statistical Psychology, 34, 100–117.
McDonald, R. P., & Ahlawat, K. S. (1974). Difficulty factors in binary data.
British Journal of Mathematical and Statistical Psychology, 27, 82–99.
McGinnis, D., & Zelinski, E. M. (2000). Understanding unfamiliar words:
The influence of processing resources, vocabulary knowledge, and age.
Psychology and Aging, 15, 335–350.
McGinnis, D., & Zelinski, E. M. (2003). Understanding unfamiliar words in
young, young-old, and old-old adults: Inferential processing and the
abstraction-deficit hypothesis. Psychology and Aging, 18, 497–509.
Michell, J. (1990). An introduction to the logic of psychological
measurement. Hillsdale, NJ: Erlbaum.
Miner, J. B. (1957). Intelligence in the United States: A survey—With
conclusions for manpower utilization in education and employment.
New York: Springer.
Muthén, B. O., & Muthén, L. K. (2004). Mplus (Version 3.01). Los
Angeles: Muthén & Muthén.
National Opinion Research Center. (2004). General social survey: 1972–2000
cumulative codebook. Retrieved December 1, 2004, from
http://webapp.icpsr.umich.edu/GSS/
Nesselroade, J. R., & Molenaar, P. C. M. (1997). Pooling lagged covariance
structures based on short, multivariate time series for dynamic factor
analysis. In R. H. Hoyle (Ed.), Statistical strategies for small sample
research (pp. 224–251). London: Sage.
Parry, C. D., & McArdle, J. J. (1991). An applied comparison of methods
for least-squares factor analysis of dichotomous variables. Applied
Psychological Measurement, 15, 35–46.
Salthouse, T. A. (1993). Speed and knowledge as determinants of adult age
differences in verbal tasks. Journal of Gerontology: Psychological
Sciences, 48, P29–P36.
Schaie, K. W. (1996). Intellectual development in adulthood. Cambridge,
England: Cambridge University Press.
Sorenson, H. (1938). Adult abilities. Minneapolis: University of Minnesota
Press.
Tate, R. (2003). A comparison of selected empirical methods for assessing
the structure of responses to test items. Applied Psychological
Measurement, 27, 159–203.
Thorndike, R. L. (1942). Two screening tests of verbal intelligence. Journal
of Applied Psychology, 26, 128–135.
Thorndike, R. L., & Gallup, G. H. (1944). Verbal intelligence in the
American adult. Journal of General Psychology, 30, 75–85.
Tun, P. A., Wingfield, A., Rosen, M. J., & Blanchard, L. (1998). Response
latencies for false memories: Gist-based processes in normal aging.
Psychology and Aging, 13, 230–241.
Verhaeghen, P. (2003). Aging and vocabulary score: A meta-analysis.
Psychology and Aging, 18, 332–339.
Wackwitz, J. H., & Horn, J. L. (1971). On obtaining the best estimates of
factor scores within an ideal simple structure. Multivariate Behavioral
Research, 6, 389–408.
Wilson, J. A., & Gove, W. R. (1999a). The age-period-cohort conundrum
and verbal ability: Empirical relationships and their interpretation:
Reply to Glenn and to Alwin and McCammon. American Sociological
Review, 64, 287–302.
Wilson, J. A., & Gove, W. R. (1999b). The intercohort decline in verbal
ability: Does it exist? American Sociological Review, 64, 253–266.
Received July 8, 2004
Accepted March 18, 2005
Decision Editor: Thomas M. Hess, PhD
END NOTES
The lack of systematic differences in factor correlations across
age goes against typical findings of dedifferentiation, such as those
of Baltes, Cornelius, Spiro, Nesselroade, and Willis (1980). Studies of
dedifferentiation focus on broad test batteries and do not tend to
examine cognitive abilities at the level of detail used in this study.
Therefore, we draw no conclusions about the relation between our
findings and the concept of dedifferentiation.