Grouping Operations in Free Recall¹

JOURNAL OF VERBAL LEARNING AND VERBAL BEHAVIOR 8, 481-493 (1969)
GORDON H. BOWER, ALAN M. LESGOLD, AND DAVID TIEMAN
Stanford University, Stanford, California 94305
In free recall, Ss search for stable groupings of the list words, which groups become
functional recall units. The experiments reported show that recall suffers if S is forced to
adopt groupings differing from those used previously. Although such regroupings disrupt
retrieval processes in recall, they do not reduce recognition memory for the list words,
presumably because recognition depends on "occurrence information" which is independent
of interitem associations utilized in retrieval. Another experiment showed that S's recall
is facilitated by E explicitly aggregating S's functional recall units into larger groups over
trials. A final experiment showed that recall was facilitated by arranging for structural linkages between groups via common words much as in a sausage chain. More groups are
recalled in such linked lists, and Ss clearly used the linkages in moving from one to another
group in recall.
A currently prevalent view of free recall
learning is that it involves organizational
processes (e.g., Bower, 1968; Mandler, 1967;
Tulving, 1962, 1968). There are at least two
aspects to this organization: first, the aggregation or grouping together of several words
of the to-be-recalled list, so that these groups
become functionally integrated units in recall;
and second, the development by S of a retrieval plan or cuing scheme to enable him
to move in recall from one functional unit
to another in his memory. The following
experiments attempt to manipulate the first
factor, the subjective units or groups which S
uses in free recall.
There are several ways to study the role of
subjective groupings in free recall. If a cluster
of items is acting as a functional unit, then
recall of the words in that cluster should
exhibit a characteristic pattern; they should
tend to appear close together in recall, perhaps
even in a stereotyped order, and the interresponse times between recalling words in the
unit will be short. The latency characteristics
of this picture have been investigated by
Pollio (1968), McLean and Gregg (1967), and
Mandler (personal communication, 1968),
who have found the recall pattern to be a rapid
burst of responses in a subjective cluster, then
a pause, another burst of responses from
another cluster, another pause, and so on.
The output stereotypy of words in subjective
groups has been measured by Tulving's (1962)
index of "subjective organization," or Bousfield, Puff, and Cowan's (1964) index of intertrial repetition. Both indices essentially measure the frequency with which S recalls pairs
of words in a consistent order even though he
is free to do otherwise. Although these indices
are mathematically independent of the number
of words recalled, both studies cited reported
that these indices increased with practice on a
list and correlated highly with the number of
words recalled by individual Ss.
Other methods for investigating these
subjective groups are to have S write the words
as they are presented, allowing him to group
them on his study sheet in any way he wishes
(Seibel, 1967), or to have S sort the words,
categorizing them into as many groups as he
wishes before he recalls them (Mandler, 1967).
¹ The research reported in this article was supported by a grant, MH-13950, to the senior author from the National Institute of Mental Health.
The conclusions suggested by these several
methods are that Ss search for stable groupings of the list words, and that they tend to
recall together those items which they have
assigned to the same group. Furthermore,
as multitrial learning proceeds, the assignment
of items to groups becomes more consistent
and the groups become better integrated in
that more words can be recalled from each
group.
Tulving and Mandler have hypothesized
that it is this increase in size and integration
of the subjective groups that is the main cause
of the increase over trials in the number of
words which S can recall. If one believes that
this is the cause for improvement in free recall
with repeated trials, it then follows that if we
somehow force S to change his groupings anew
every trial, we should seriously degrade the
usual increments in recall provided by repeated
trials. This implication is tested in Exp. I.
EXPERIMENT I
In order to test this implication, one ideally
would like to have some means for controlling
how S groups a given set of words and, in
particular, be able to force S to use different
groupings on different input trials. Although
it might be possible to construct lists of related
words that we could get S to classify exhaustively in alternative ways on different trials,
that method of testing the implication struck
us as likely to produce results having only
limited generality. We therefore developed
a technique for imposing any arbitrary groupings which E wishes upon any arbitrary set
of unrelated words (concrete nouns in this
instance). The technique is elementary: S is
merely shown a small group (2-4) of the
words at one time and is told to make up an
elaborate "mental image or picture" in which
all the objects named are interacting together
in some vivid and memorable scene. For
example, with the quartet, dog, cigar, bicycle,
hat, S might mentally imagine a scene in which
a dog wearing a hat and smoking a cigar is
pedaling a bicycle. From pilot work, we know
that this procedure establishes very strong
interassociations among these words, so they
act much like an integrated unit. We also
found that when we did this for quartets of
words in a free recall list, we had practically
complete control of S's clusters in free recall;
although free to recall words in any order, S's
entire protocol could be described as short
runs of two to four responses from the quartet
groupings presented to him. The cluster
scores produced by this method (with unrelated nouns) are in fact much higher than the
clustering scores one obtains with a highly
categorized list of words presented one at a
time in blocked fashion (e.g., Cofer, 1965).
The clustering may be indexed by the modified
ratio of repetition (MRR) which is
$$\mathrm{MRR} = \frac{\sum_j n_{jj}}{\sum_j (n_j - 1)}, \quad \text{for all } n_j > 1,$$
where $n_j$ is the number of words recalled from
category or group j, $n_{jj}$ is the number of pairs
of the $n_j$ words which are recalled in consecutive order, and the summation ranges over
the various categories or groups. This clustering index is a fraction, anchored at 1 when
there is perfect clustering. With the mental
imagery method, we consistently obtain mean
MRR scores of .90 or higher, representing
almost perfect clustering in recall according
to the groups E imposed upon S. It may also
be reported that the latencies of successive
responses in recall appear to follow the pause-burst-pause pattern reviewed earlier. However, we have not made extensive enough
measurements of this to report any statistical
data on latencies.
With this method for controlling S's
functional groupings, we are in a position to
test the implication that multitrial free recall
is retarded if the groupings of the words are
changed on every trial. To change groupings,
E merely rescrambles the list words and presents new quartets on each trial for S to
associate together.
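As an illustration only, the MRR index defined above can be computed from a single recall protocol roughly as follows. This is a minimal sketch: the function name, the data layout (a recall list plus a word-to-group mapping), and every example word other than the dog-cigar-bicycle-hat quartet are assumptions introduced here, not part of the original procedure.

```python
from collections import Counter


def modified_ratio_of_repetition(recall_order, group_of):
    """MRR = sum_j n_jj / sum_j (n_j - 1), summed over groups with n_j > 1.

    recall_order: recalled words in output order (each word at most once).
    group_of: dict mapping every list word to its input-group label.
    """
    labels = [group_of[w] for w in recall_order]
    n = Counter(labels)  # n_j: number of words recalled from group j
    # n_jj: adjacent output pairs whose two words come from the same group j
    n_jj = Counter(a for a, b in zip(labels, labels[1:]) if a == b)
    numerator = sum(n_jj[j] for j, nj in n.items() if nj > 1)
    denominator = sum(nj - 1 for nj in n.values() if nj > 1)
    if denominator == 0:
        return None  # undefined when no group contributes two or more words
    return numerator / denominator


# Hypothetical example: two presented quartets, A and B.
group_of = {"dog": "A", "cigar": "A", "bicycle": "A", "hat": "A",
            "lamp": "B", "river": "B", "shoe": "B", "piano": "B"}
clustered = ["dog", "hat", "cigar", "bicycle", "lamp", "river", "shoe"]
mixed = ["dog", "hat", "cigar", "lamp", "river", "bicycle", "shoe"]
print(modified_ratio_of_repetition(clustered, group_of))  # 1.0
print(modified_ratio_of_repetition(mixed, group_of))      # 0.6
```

In the second protocol three of the five possible within-group adjacencies occur, so the index falls to .6; a protocol recalled strictly quartet by quartet reaches the anchor value of 1.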
Method
Each S had three trials on each of four lists of 24
nouns for free recall. For two of the lists, the quartet
groups were preserved intact over all three trials (Same
condition); for two of the lists, the groupings of words
into quartets were systematically different on all three
trials with that list (Changed condition). The 24 words
in each list were concrete nouns of high imagery ratings
(Paivio, Yuille, & Madigan, 1968) chosen so as to be
as unrelated as possible. Across Ss, each list appeared
equally often first, second, third, or fourth, and was
equally often in the Same or Changed condition.
The 24 words of a list were divided randomly into
6 quartets for Trial 1. These 6 quartets were presented
by a slide projector for 12 sec. each (3 sec. per word)
with S instructed to mentally image an interactive
scene for the four objects named. He was not told that
he had to recall these four words together. After the
sixth input slide, S began oral free recall for 60 sec.,
being told to recall all words he could in any order he
wished. Due to a procedural oversight, recall order
was not recorded in this study. For lists in the Same
condition, the same six slides of Trial 1 were presented
again in the same order on Trials 2 and 3. For lists in the
Changed condition, six new quartets were composed
of the 24 words for slide presentation on Trial 2,
insuring that no two words appeared in the same groups
on Trials 1 and 2. Similarly, six further quartets were
composed for presentation on Trial 3, insuring that no
two words appeared together which had been together
on Trials 1 or 2. The S was instructed to construct a
composite mental image for whatever quartet of words
was shown to him. A 30-second break plus instructional
review were interspersed between the last trial of one
list and the first trial of the next list.
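The text does not say how the Changed-condition quartets were actually composed. One way to satisfy the stated constraint (no two words sharing a quartet on more than one of the three trials) is sketched below. The offset scheme, the function names, and the placeholder words are assumptions of this sketch, and the particular offsets work only for six quartets of four words.

```python
from itertools import combinations

# Offsets applied to each within-quartet position p on Trials 1, 2, 3.
# Trial 1 keeps the original quartets. The Trial-2 and Trial-3 offsets, and
# their position-wise differences, are all distinct modulo 6, which is what
# keeps any pair of words from meeting twice. (Hypothetical scheme, not the
# authors' procedure.)
OFFSETS = {1: (0, 0, 0, 0), 2: (0, 1, 2, 3), 3: (0, 2, 4, 1)}


def quartets_for_trial(quartets, trial):
    """Regroup six quartets (lists of four words) for Trial 1, 2, or 3."""
    n_groups = len(quartets)
    new_groups = [[] for _ in range(n_groups)]
    for g, quartet in enumerate(quartets):
        for p, word in enumerate(quartet):
            new_groups[(g + OFFSETS[trial][p]) % n_groups].append(word)
    return new_groups


def pairs(groups):
    """Set of all unordered word pairs that share a group."""
    return {frozenset(pair) for grp in groups for pair in combinations(grp, 2)}


words = [f"w{i:02d}" for i in range(24)]  # placeholder nouns
trial1 = [words[i:i + 4] for i in range(0, 24, 4)]
trial2 = quartets_for_trial(trial1, 2)
trial3 = quartets_for_trial(trial1, 3)

# No pair of words appears together on more than one trial.
assert not (pairs(trial1) & pairs(trial2))
assert not (pairs(trial1) & pairs(trial3))
assert not (pairs(trial2) & pairs(trial3))
```

Because each new quartet draws one word from each within-quartet position, every word acquires three new partners per trial, which matches the later remark that nine other list words had been associated with a given word by Trial 3.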
The Ss were 12 Stanford undergraduates fulfilling a
service requirement for their introductory psychology
course. They were run individually.
Results
Data were pooled over word lists and over
lists-within-session since neither of these
incidental variables produced significant
effects. The main results are shown in Figure 1
giving words recalled over the three trials for
the Same vs. Changed conditions. Recall in
these conditions began at the same level on
Trial 1, as it should since the treatment was
identical at that point. However, the conditions differed radically in their improvement
over Trials 2 and 3 (p < .001). The list repeated
with the same groupings improved markedly
with practice, whereas the list repeated with
changing groups showed relatively little
improvement. In fact, the null hypothesis of no
improvement in recall in the Changed condition could not be rejected, F(2, 69) = 1.91,
p > .20.
FIG. 1. Mean recall curves for Exp. I.
The differences among conditions appear
both in intertrial retention and intratrial
forgetting (Tulving, 1964). Given that a word
failed to be recalled on Trial n, its conditional
probability of being recalled on Trial n + 1
was .87 in the Same condition and .61 in the
Changed condition. Given that a word was
recalled on Trial n, its conditional probability
of being forgotten on Trial n + 1 was .04 in
the Same condition and .22 in the Changed
condition. Thus, the Changed condition was
poorer in that fewer new words were picked
up each trial, and more previously recalled
words were forgotten each trial.
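The two conditional probabilities just described can be tallied from per-trial recall records as in the following sketch. The function name, the data layout, and the choice to pool counts over successive trial pairs are assumptions made for illustration; the article does not state how its values were pooled.

```python
def transition_probabilities(list_words, recalled_by_trial):
    """Return (gain, loss), pooled over successive pairs of trials.

    gain: P(recalled on Trial n+1 | not recalled on Trial n)
    loss: P(not recalled on Trial n+1 | recalled on Trial n)
    recalled_by_trial: list of sets, the words recalled on Trials 1, 2, ...
    """
    gained = missed = lost = kept = 0
    for prev, nxt in zip(recalled_by_trial, recalled_by_trial[1:]):
        for word in list_words:
            if word in prev:
                if word in nxt:
                    kept += 1
                else:
                    lost += 1
            elif word in nxt:
                gained += 1
            else:
                missed += 1
    gain = gained / (gained + missed) if gained + missed else None
    loss = lost / (lost + kept) if lost + kept else None
    return gain, loss


# Hypothetical six-word list over three trials.
words = ["a", "b", "c", "d", "e", "f"]
trials = [{"a", "b", "c"}, {"a", "b", "d", "e"}, {"a", "b", "d", "e", "f"}]
gain, loss = transition_probabilities(words, trials)
print(round(gain, 2), round(loss, 2))  # 0.6 0.14
```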
Recall according to the clusters presented
also differed considerably between the two
groups. Figure 2 shows the probability histograms of words recalled per presented quartet,
pooling the two conditions on Trial 1, and
pooling Trials 2 and 3 separately for the two
conditions. The Trial-1 histogram shows the
"some-or-none" result reported by Cohen
(1966) for free recall of categorized word
lists, viz., S either does or does not get into a
group or category (the high 0 point), but if he
gets into it he recalls a high percentage of the
words from that group. The distribution for
the Same condition on Trials 2 and 3 shifts
strongly towards "all-or-none" recall of the
quartet, with predominantly all of it being
recalled. On the other hand, the distribution
on Trials 2 and 3 for the Changed lists (scored
relative to the groups presented on those trials)
shifts away from the "some-or-none" pattern
towards a simple linear distribution with the
zero point falling in line along a gradient.
This presumably means that the new quartets
presented on Trial 3, for instance, are not
acting as new functional units, but rather are
conflicting with the prior groupings given
on Trials 1 and 2.
FIG. 2. Probability distributions of the number of words recalled per presented quartet, grouped over subjects. (Left panel: Trial 1, pooled conditions; right panel: Trials 2 and 3, Same Groups vs. Changing Groups; x-axis: words per group.)
These results support the Tulving-Mandler
hypothesis which led to this experiment.
Improvement in multitrial free recall appears
to be a concomitant of developing stable,
integrated groups of list words; and if
measures are taken to diminish stable groupings, multitrial free recall is seriously retarded.
It is probable that Ss experiencing changed
input groupings would eventually improve
so as to recall nearly all the words. After
a while, S may ignore the input groupings
and establish his own output groupings, just
as he apparently does in the usual free recall
procedure. Future experiments shall have to
determine the persistence with which changing
input groupings can override S's demand for
order in his output.
The results contradict the view that free
recall of a given word depends upon the
number of other list words which are associated
with it. By Trial 3 in the Changed condition, at
least nine other list words had been associated
with a given list word. Yet its recall was
considerably poorer than in the Same condition in which only three other list words had
been associated to a given word. However,
association theory suggests an alternate
explanation of our results; namely, that recall
of a given word depends upon the number and
strength of associations to it from other list
words (e.g., Deese, 1959). On this view, items
repeated in the same quartets develop strong
associations to the three other items in that
group, whereas items presented in new quartets
develop weak associations to nine other items.
Moreover, successive sets of three of these
latter associations are related to one another
by a general A-B, A-C (or A-Br) paradigm
of negative transfer, perhaps causing unlearning of prior associations to an item. If the
strength of associations to a given list word is
a more important determinant of its recall than
is the number of words associated with it,
then this negative-transfer hypothesis might
have some hope of accounting for the results.
However, the specific details of an association
theory for free recall are not yet worked out
sufficiently to know whether it would be
consistent with the present results.
EXPERIMENT II
In Exp. I, the input clusters were arbitrarily
specified by E, and S apparently adopted these
as functional units for recall. It is presumed,
however, that if E did not group the items
(i.e., presented them singly), then S would still
come eventually to group them together in his
recall, probably in an idiosyncratic manner.
Suppose that at some point late in S's free-recall learning, we ask him to indicate the
natural word groupings he has been using for
his recall. With this knowledge of S's natural
groupings, we would then be in a position
to predict whether a subsequent grouped input
trial would increase or decrease his next recall
depending upon the groupings imposed. If
the clusters on this input trial are composed
so as to correspond with the natural groupings
which S has told us he uses, then his next
recall should be facilitated by that method of
presentation. However, if the input clusters
are composed so as to systematically violate
S's natural groupings of the words, then his
next recall should be poorer than it was
before, despite the additional input trial. The
reasoning is much the same as for the Same- vs. Changed-groupings comparison in Exp. I,
except in the present experiment S's natural
groupings of the words (rather than E's
arbitrarily imposed groupings) serve as the
reference point for presenting the Same- vs. Changed-groupings of the words on the
next trial. These predictions are tested in
Exp. II.
Method
Each of 8 Ss from the previous source did two free
recall and sorting tasks. On each list of 36 unrelated
concrete nouns, S first had ungrouped input-output
trials (at a 2.5-second input rate) until he reached a
criterion of recalling at least 32 out of the 36 words
(89%). After his criterion trial, S was then asked to
sort the cards containing the 36 list words into 9 groups
of 4 words each, putting together "those words which
you have been recalling together or which you think of
as belonging together in a group." The S was permitted
as much time as he needed to do this, with all taking
between 1.5 and 4 min. After recording S's groupings,
E then gave one further input trial, this time with nine
groups of four words (cards) presented together on the
table top for 10 sec. per quartet with mental imagery
instructions given to S. For one of the lists (Consistent
list), these 9 input quartets were exactly the same as the
nine groups of four words that S had just indicated in his
sorting. For the other list (Inconsistent), the nine input
quartets systematically violated S's sorting categories,
with each word in the quartet coming from a different
one of S's sorting categories. After this input trial, S
gave a final recall of the list.
Each S received one Consistent and one Inconsistent
list, half the Ss in that order, half in the reverse order.
Over Ss, the two sets of 36 words served equally often
as Consistent and Inconsistent lists, and equally often
as first or second list being learned.
Results
Mean trial of reaching criterion was 4.12
for the Consistent list and 4.00 for the Inconsistent list; this difference is not significant. Mean words recalled on the criterion
trial were 32.5 out of 36. Mean recall after the
sorting and grouped input trial increased to
34.1 with the Consistent input trial and
decreased to 30.2 with the Inconsistent input
trial. The effect is small but it is quite consistent
and reliable over the 8 Ss. All 8 Ss increased
their recall after the Consistent trial (t = 5.65,
p < .01), whereas 6 of 8 decreased recall
after the Inconsistent trial (one increased
by one word and one was unchanged),
t = 2.57, p < .05. For all 8 Ss, recall was higher
after their Consistent input trial than
after their Inconsistent input trial, t = 4.78,
p < .01.
The effect of Consistent vs. Inconsistent
groupings was relatively small here, but
various considerations would have led one to
expect this. First, the criterion performance of
32.5 out of 36 words is already very close to the
true asymptote for this task, so one cannot
reasonably expect much of an increment for
the Consistent list. Second, the one Inconsistent input trial was working against the prior
organizations established by an average of
four preceding trials on the list plus the
additional study time (mean was 2.5 min.)
provided by the lengthy sorting trial. The latter
by itself would normally have been expected
to produce an increase in recall but this
apparently was overridden by the intervention of the inconsistently grouped input
trial.
The conclusion from this study is similar to
that from Exp. I: S's recall is poorer if the
input groupings of the words are different
from the familiar groupings he has previously
been using for guiding his recall. The difference between Exp. I and the current one is that
in the former case the prior word groupings
were imposed by E whereas in the latter case
these presumably were developed by S and
were indicated by his sorting behavior.
EXPERIMENT III
The third experiment inquired whether
changing word groupings over trials would
affect recognition memory or discrimination
of list membership. Integration of several
words into a higher-order unit presumably
affects the probability of retrieval, since any
item of the cluster may cue recall of any other;
but conceivably, recognition depends only
upon some "occurrence" information stored
with each item that is independent of interitem associations. For example, Kintsch
(1968) has shown that blocked, categorized
lists are better recalled than unrelated word
lists but that they do not differ in recognition.
In his terms, categories, or the interitem
associations among items in the categories,
affect retrieval processes, but do not affect the
basic discrimination that a word was on the
list. Perhaps we shall be able to show a
similar separation of recognition and recall
for our lists that are grouped in the same way
or in changing ways over trials.
Method
The basic procedure was to give S two input-output
cycles of free recall on a list in which the groupings
were the same or different on the two input trials, and
then have a recognition test after the second output
trial. Each S learned two lists in a counterbalanced
order; for one list, the groups were the same on Trials
1 and 2; for the other list, the groupings changed. To
avoid a ceiling on recognition performance, the lists
were lengthened to 75 words, and the presentation time
was 3 sec. per word. Triplets of words were presented
for 9 sec. with mental imagery instructions. There were
25 such triplet slides in the list. For Same-groupings
lists, the same 25 triplets were repeated on the two
input trials; for Changed-groupings lists, the 75 words
were arranged into 25 new triplets for presentation on
Trial 2. Recall was in writing, with 3 min. allowed.
Immediately after the second recall of a given list, the
recognition test was given. The S was given a sheet of
paper listing in random order the 75 list words mixed
in with 75 synonymic or closely related distractors.
He was told: "Some of these words were on the list you
have just studied and some were not. Check off those
words which you think were on the list." The S was not
told how many list words were on the sheet nor how
many words to check off. Each list word had a
corresponding synonym or related distractor. In order
to achieve 150 related pairs (75 on two lists), a few
abstract nouns were used as well as concrete nouns.
The Ss were 16 students from the previous source.
They were run individually with counterbalancing over
Ss of which word list was first or second and whether
a Same or Changed list was first or second.
Results
The results are summarized in Table 1
pooled over the variables of word lists and
first vs. second list in the session. Perhaps due
to the greater number of functional groups
(25 vs. 6) and the presence of some abstract
words, recall in Exp. III was considerably less
than in Exp. I. The two conditions, Same vs.
Changed lists, were similar in recall on Trial 1
but differed on Trial 2, t(15) = 2.82, p < .01.
The increment in recall from Trial 1 to Trial 2
was significant for both conditions, but the
increment was larger for the Same condition
(.28) than for the Changed condition (.14),
t(15) = 3.60, p < .001. Thus, the recall portion
of this experiment replicates the qualitative
finding of Exp. I, that the trial-to-trial
improvement in free recall is less with changed
groupings of the words.
TABLE 1
FREE RECALL AND RECOGNITION SCORES FOR SAME VS. CHANGED CONDITIONS
                    Same groupings    Changed groupings
Free recall
  Trial 1                .28                 .31
  Trial 2                .56                 .45
  Increment              .28                 .14
Recognition
  Hits                   .82                 .79
  False alarms           .02                 .03
In contrast to the recall differences, there
were no differences in recognition between the
Same vs. Changed conditions. Hit rates
(checks on list words) did not differ, false alarm
rates (checks on nonlist words) did not differ,
nor did the hit-minus-false alarm scores
(latter t(15) = 1.09, p > .15).
The conditional relationships between recall
on Trial 2 and recognition of a word were
examined for the two conditions. If a word
was recalled on Trial 2, its probability of being
recognized was .93 in the Same condition
and .97 in the Changed condition. If a word
was not recalled, its probability of being
recognized was .65 in the Same condition
and .61 in the Changed condition. The difference in recognition of previously recalled
words in favor of the Changed list was
sufficient to offset the difference in recall, thus
yielding near equivalence in net recognition
despite the differences in recall of the two
lists.
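The offsetting argument can be checked with a little arithmetic: overall recognition is the recall-weighted average of the two conditional recognition probabilities. The sketch below uses the rounded Trial-2 recall proportions from Table 1 (.56 and .45) together with the conditional values just quoted. The function name is introduced here for illustration, and because the inputs are rounded the results only approximate the reported hit rates of .82 and .79.

```python
def net_recognition(p_recall, p_recog_if_recalled, p_recog_if_not_recalled):
    """Overall hit rate via the law of total probability."""
    return (p_recall * p_recog_if_recalled
            + (1 - p_recall) * p_recog_if_not_recalled)


same = net_recognition(0.56, 0.93, 0.65)
changed = net_recognition(0.45, 0.97, 0.61)
print(round(same, 3), round(changed, 3))  # 0.807 0.772
```

An 11-point difference in recall thus shrinks to roughly 3 points in recognition, in line with the near equivalence described above.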
The net outcome of this experiment is
exactly the pattern conjectured on the basis
of Kintsch's (1968) prior results. Variables
affecting stability of groupings of list words
influence recall but not recognition. The
account of these results is similar to Kintsch's.
When a word is studied, S stores information
about its occurrence on the list (perhaps as a
recency or contextual frequency tag or as a
trace strength), and he also tries to learn some
method for retrieving that item in recall
(perhaps associating it to a category cue, or to
other list words). Grouping and regrouping
operations affect the efficacy of the latter sorts
of retrieval information, but have no effect
upon the occurrence information stored
alongside the word in S's lexicon. And it is the
latter sort of information that is consulted
when S is shown a word and asked to decide
whether it was on the list just studied.
EXPERIMENT IV
The organizational hypothesis of Tulving
and Mandler implies that multitrial free recall
increases in part because the subjective groups
increase in size over trials; the clusters presumably become larger and better integrated
with practice. In the following experiment, by
use of varied input groupings, we have tried
to facilitate or retard this growth over trials
in the size of subjective clusters. In the Increasing condition, the input groupings increased in
size over the three trials, with individual groups
on Trials 2 and 3 composed of two intact
groups from the preceding input trial. In the
Decreasing condition, the group sizes correspondingly decreased over the three trials, by
halving on Trials 2 and 3 the groups of the
preceding input trial. For the Increasing condition, the group sizes were 3-6-12 over the three
trials; for the Decreasing condition, they were
12-6-3. The hypothesis expects that Ss having
increasing group sizes will recall better than
Ss having decreasing group sizes.
Method
Twenty paid Ss (Stanford students) had three input-output cycles on a list of 60 concrete nouns. These
words were randomly composed into 20 triplets and
typed on 4- × 6-inch cards; then two random triplets
were combined to make a list of ten 6-tuples; then two
random 6-tuples were combined to make a third list of
five 12-tuples. By appropriate spacing, the 6-tuple and
12-tuple cards were clearly divided into the two or four
constituent triplets of words. The cards were presented
manually to S.
Ten of the Ss received these list groupings in the order
3-6-12 on input Trials 1, 2, 3 (Increasing condition),
and ten Ss received the reverse order of groupings over
trials (Decreasing condition). The Ss were instructed
to try to form an elaborate mental image or imaginary
scene in which all the objects named on a given card
were interacting together in some vivid or memorable
way. Presentation time was calculated at 4 sec. per
word, so 3-tuple cards were shown for 12 sec. each,
6-tuples for 24 sec., and 12-tuples for 48 sec. After
presentation of the last input card, S began free recall
orally for 2 min., being instructed to recall as many
words as he could in any order he preferred.
Results
The proportions of words recalled over trials
by the two groups are shown in Figure 3. The
groups are equal in recall on Trial 1, but the
Increasing Ss improve at a faster rate than do
the Decreasing Ss. Although the difference in
total recall is not large in an absolute sense, it
is highly significant statistically, F(1, 54) =
10.96, p < .005.
FIG. 3. Mean recall curves for the increasing (3-6-12) and decreasing (12-6-3) conditions of Exp. IV.
We next examined these data for group
differences in clustering. There are a variety
of ways to look at this feature of the data. One
statistic is the MRR index presented earlier.
These MRR scores calculated for 3-tuples,
6-tuples and 12-tuples were consistently
higher for Increasing Ss than for Decreasing
Ss, indicating greater conformity to the input
clusters by the Increasing Ss. A second statistic
we have examined is the conditional probability that S recalled all the words of a group
given that he recalled at least one word of that
group. This is a convenient index of the
integration of the group. Since the definition of
an input "group" changes systematically over
trials for each condition, we have scored recall
for each definition of "group" on each trial.
For example, on Trial 1 for Increasing Ss who
had input triplets, we scored for triplets and
also for the 6-tuples and 12-tuples they were
going to see on later trials. This index of cluster
integration is depicted in Figure 4 for the three
cluster sizes, for the Increasing condition on
the left and Decreasing condition on the right.
Within each panel, the integration index is
necessarily ordered from 3-, to 6-, to 12-tuples.
The salient conclusion from comparing the
two panels of Figure 4 is that group integration
is markedly higher for the Increasing Ss than
for the Decreasing Ss on all trials for all sizes
of groups.
FIG. 4. Conditional probabilities of subjects recalling all of a group given that they recalled part of that group (Exp. IV). (Left panel: Increasing Groups; right panel: Decreasing Groups; x-axis: Trials.)
Could these clustering results be an artifact
simply of differences in mean recall? The
percentages can be "corrected" on the
assumption that recall of each word in the list
occurs randomly with probability $p_n$ on Trial n
(cf. Figure 3). If individual words were
recalled at random, then the probability of
recalling all of a group of size k given recall
of at least one word of that group would be
$p_n^k / [1 - (1 - p_n)^k]$. After making this correction for the mean recall level, two conclusions
remain: first, that clustering according to
input groupings is still very much greater than
one would expect from random output at a
given average probability level; and second,
that the clustering in excess of chance is still
much greater for the Increasing Ss than for the
Decreasing Ss.
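To make the comparison concrete, the sketch below computes the integration index (probability of recalling all of a group given recall of at least one of its words) alongside the chance value p^k / [1 - (1 - p)^k] expected if each word were recalled independently with probability p. The function names and the toy data are assumptions for illustration; the chance formula is the one stated in the text.

```python
def integration_index(groups, recalled):
    """P(all words of a group recalled | at least one word of it recalled)."""
    recalled = set(recalled)
    entered = [g for g in groups if recalled.intersection(g)]
    if not entered:
        return None  # no group was entered, so the index is undefined
    complete = [g for g in entered if recalled.issuperset(g)]
    return len(complete) / len(entered)


def chance_integration(p, k):
    """Chance value p**k / (1 - (1 - p)**k), for 0 < p <= 1 and group size k."""
    return p ** k / (1 - (1 - p) ** k)


# Hypothetical triplets and recall protocol.
triplets = [("a", "b", "c"), ("d", "e", "f"), ("g", "h", "i")]
recalled = ["a", "b", "c", "d"]
print(integration_index(triplets, recalled))  # 0.5
print(round(chance_integration(0.5, 3), 3))   # 0.143
```

Under independent recall the chance value drops steeply with group size (at p = .5 it is about .14 for triplets and far smaller for 12-tuples), so observed indices well above it indicate recall organized by the input groups.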
Although the results of Exp. IV came out
significantly in the expected direction (Increasing better than Decreasing), it must be
admitted that the effect is not large in an
absolute sense. Why might this be? We
think the answer to this must refer to
certain organizing activities of S that simply
are not controlled by our input-grouping
procedures. For example, although presentation of triplets controls fairly well the tendency
for S to recall these words as a unit, it does
not control the order in which he recalls the
triplets nor does it prevent his organizing
(grouping, associating) several triplets into a
larger subjective unit for his recall. Consequently, when for the Increasing Ss we aggregate two triplets at random for a 6-tuple
presentation on Trial 2, there is a good likelihood that we will aggregate triplets differently from how S did in his Trial-1 recall.
Similarly, in the Decreasing condition, when
we divide a 6-tuple from Trial 2 into two
3-tuples for Trial-3 presentation, we have
done nothing to prevent S from continuing to
use his old association between these 3-tuples
in recalling on Trial 3. The point of these
remarks is to suggest that the groupings we
impose on a particular input trial cannot be
the only organization that S will have available
in his recall on that trial. To the extent that S
rather than E controls some of this organization, manipulation of Increasing vs. Decreasing group sizes will have an attenuated effect
on recall. These considerations may help
explain why these group-size manipulations produced only a small
difference in recall.
EXPERIMENT V
If S forms subjective units or clusters of
words, he still has the problem of getting from
one cluster to the next in his free recall. One
way this can be achieved, it is presumed, is
by associating one cluster with another, or
by aggregating them into a larger cluster as in
Exp. IV. In the following experiment we
have tried another method to aid and abet the
S in moving from one cluster to the next in
his memory. We tried to do this by arranging
for structural linkages between two clusters
via a common element. This common element
could then serve as a mediating bridge for
moving in memory from one cluster to the
next during recall.
The structures of the Linked list and
Control list are schematized in Figure 5,
where letters represent unrelated nouns of the
list to be recalled and ellipses are drawn around
the input quartets. The quartets of the Linked
list are linked together by common elements
(A, D, G, J) which appear in two different
clusters. All quartets were linked at both ends
in this way, much in the manner of a sausage
chain with its end attached back upon its
beginning.
FIG. 5. Structure of presentation lists for Exp. V. Panels: Linked list; Control list, whose quartets are (A D G J), (B C E F), (H I K L), (A D G J).
Although Figure 5 shows only 4
"sausage links," there were 16 in the actual
experiment. They were presented in a random
temporal input order although they were
linked structurally in a continuous chain.
The Control list in Figure 5
shares with the Linked list the fact that the
same words (A, D, G, J) are presented twice
and the remaining words once. Its difference from
the Linked list is that no common linking
elements are provided to enable S to move in
memory from any one cluster to any other,
and S is left to do this by his own devices.
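The two list structures can be made concrete with a small sketch. It rests on assumptions: the position of the linking words within a quartet (first and last) is arbitrary here, and the Control arrangement simply generalizes the pattern of Figure 5, in which a quartet of twice-presented words is shown twice. The function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Quartet = Tuple[str, ...]


def build_linked_list(doubles: Sequence[str], singles: Sequence[str]) -> List[Quartet]:
    """Closed chain: quartet i ends with the double word that opens quartet i + 1."""
    n = len(doubles)
    if len(singles) != 2 * n:
        raise ValueError("need exactly two once-presented words per quartet")
    return [
        (doubles[i], singles[2 * i], singles[2 * i + 1], doubles[(i + 1) % n])
        for i in range(n)
    ]


def build_control_list(doubles: Sequence[str], singles: Sequence[str]) -> List[Quartet]:
    """Same presentation frequencies, but no word is shared between different quartets."""
    if len(doubles) % 4 or len(singles) % 4:
        raise ValueError("word counts must be multiples of four")
    double_quartets = [tuple(doubles[i:i + 4]) for i in range(0, len(doubles), 4)]
    single_quartets = [tuple(singles[i:i + 4]) for i in range(0, len(singles), 4)]
    return double_quartets * 2 + single_quartets  # each double quartet is shown twice


def presentation_counts(quartets: Sequence[Quartet]) -> Counter:
    """Number of presentations of each word across all quartets."""
    return Counter(word for quartet in quartets for word in quartet)


# The four-link illustration of Figure 5.
doubles = ["A", "D", "G", "J"]
singles = ["B", "C", "E", "F", "H", "I", "K", "L"]
linked = build_linked_list(doubles, singles)
control = build_control_list(doubles, singles)
print(linked)
print(control)
```

With 16 double words and 32 single words, the same functions yield 16 quartets for each list, matching the list lengths given in the Method section.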
The prediction is that Ss learning the Linked
list will recall more clusters, and therefore
more words, than Ss learning the Control list;
moreover, the output order of the linked
clusters should correspond in large degree to
the succession of clusters as linked in the
sausage chain. A preliminary test of this with a
list containing eight quartets revealed no
differences in recall between a Linked and
Control list. This worried us until we scored
the protocols for cluster recall, and discovered
practically all Ss in both conditions were
recalling some words from practically all of
the eight input clusters on Trial 1. However,
the hypothesis supposes that any advantage
that might appear for the Linked list would
have to be in terms of more clusters recalled
than in the Control list. But since all eight
clusters were being recalled on the Control
list by practically all pilot Ss, there was no
possibility of showing an advantage for the
Linked list in that situation (i.e., a ceiling
effect). To avoid an artificial ceiling, therefore,
we ran the main experiment reported below
with a longer list of 16 clusters.
Method
Twelve Ss learned two Linked lists and 12 other Ss
learned two comparable Control lists. Each list was
given for three input-output cycles. Each list contained
16 concrete nouns presented twice, and 32 concrete
nouns presented once. The words, grouped into 16
quartets, were presented by a slide projector, at a rate
of 12 sec. per slide with visual imagery instructions to
S. The 16 slides for a given list were shown in a different
random order over the three trials. The structure of the
Linked and Control lists was as illustrated in Figure 5.
After the last study slide, S wrote his free recall for
4 min. There was a rest pause of 1 min. between the
end of S's first list and the beginning of his second. The
Ss were run individually and were not informed of
the structural arrangements of the lists they were to
learn. Over Ss, the two lists of 48 words were used
equally often as first or second lists in the session.
Results
There were only small differences between
recall of the two lists of the session, so they
have been pooled to increase reliability. The
main results are shown in Table 2, giving mean
recall probability on each trial for the two
conditions for the once-presented (singles)
and for the twice-presented (doubles) words.

TABLE 2
MEAN RECALL PROBABILITIES: EXPERIMENT V

            Single               Double
Trial    Linked   Control    Linked   Control
  1       .42      .28        .69      .67
  2       .72      .70        .87      .81
  3       .84      .86        .94      .94

The doubles are recalled much better than
the singles, of course, as any theory would
expect. Although most of the recall differences
between the Linked and Control lists are in
the predicted direction on each trial, the only
significant difference is on Trial-1 recall of the
once-presented words, t(22) = 2.63, p < .01.
The total recall of the once-presented words
over all three trials is also significantly higher
in the Linked list, t(22) = 1.82, p < .05.
We next analyzed for recall of the input
clusters. In recall of the Control list, the
average MRR score was .987 on Trial 1,
.990 on Trial 2, and .993 on Trial 3, reflecting
almost perfect output clustering according to
the input groups. For the Linked list, MRR is
not easily computed because of the double
words which appeared in two different
clusters.
The hypothesis implies that the advantage
for the Linked list is in S's ability to recall
some words from more clusters than is true for
the Control list. One therefore wants to compare the number of clusters recalled (i.e., at
least one word of the cluster) for the two lists.
However, because double words in the Linked
list appear in two different input groups, if S
recalls, say, D B C (see Figure 5) from the
Linked list, we do not know whether to credit
him with two clusters (since D appeared in
two) or one cluster recalled. To avoid all
problems of this kind, we decided to redefine
a recall cluster in terms of the two unique
words in the input quartets; examples in
Figure 5 are BC, EF, HI, and KL. If S recalled
either member of these 16 unique pairs, he
was credited with recall of that cluster. Recall
of the Control list was scored in the same
manner so that, for instance, the unique
quartet B C E F in the Control list of Figure 5
was redefined as consisting of the two clusters
BC and EF. Regarding the prediction of
interest, this method of scoring is conservative,
and gives an advantage to the Control group,
since recall of one of their input groups is being
artificially scored as recall of two redefined
clusters. Nonetheless, the differences between
the two conditions came out quite substantially in the predicted direction. Proportion
of these clusters recalled on Trial 1 was .53 for
the Linked list vs. .32 for the Control list,
p < .01; on Trial 2, the proportions were .82
vs. .74, p < .05; on Trial 3, they were .86 vs.
.88. Thus, even with this conservative scoring
method, cluster recall was higher on early
trials for the list which provided explicit
linkages between input clusters.
The second prediction is that S's temporal
order of recalling the linked clusters will
correspond to a significant degree with the
succession of "sausage links" in the structural
chain, unwinding it in either a clockwise or
counterclockwise direction. By casual inspection of the recall protocols, this is indeed an
accurate description, but to develop a quantitative index of this ordering is a ticklish
problem. After considering various ways to
measure this, we finally hit upon the following
method which proved sufficient to show what
was obvious from inspection of the recall
protocols. The unique words were simply
ignored in recall of the Linked list, and only
recalls of the double words were considered.
These were numbered from 1 to 16 according
to their structural succession in the input list.
For example, in Figure 5 the double words
A, D, G, J would be numbered 1, 2, 3, 4,
respectively. By this coding, a S's protocol
would consist of a sequence of integers, such
as 7, 8, 1, 16, 15, 9, 10, 4, 3, 5. We wished to
measure the pairwise correspondence between
such recall sequences and what one would
obtain if one cycled through the number loop
1-16 either forwards or backwards. Consecutive pairs of numbers in the recall sequence
were therefore scored as plus or minus according to whether or not that pair of numbers were
adjacent to one another in the natural numerical order (considering 1 and 16 as adjacent).
For example, the sequence above has 9
consecutive pairs of which 5 are plus and 4 are
minus. The number of pluses will be called
the "succession score."
All 72 Linked-list recall protocols (12 Ss
× 2 lists × 3 trials) were scored in this manner.
The number of pluses varies according to the
number of double words recalled and which
ones they are. The question is whether the
obtained succession scores exceed those to be
expected by chance if the sequential ordering
of the double words recalled were totally
random. We answered this question with a
computer by running 500 random Monte
Carlo permutations of the recall number for
each of the 72 protocols, counting the successions in each random permutation, and thereby developing a probability distribution of the
succession scores to be expected by chance
if the words recalled in that protocol had been
generated in a random sequence. In this way,
we were able to ascribe to each of the 72
protocols (double-word sequences) the probability that a succession score as high or
higher than that observed would have
occurred had the order of recalled words been
governed by a totally random process.
These results can be summarized in terms
of the number of protocols for which the
observed succession score had, on the null
hypothesis, a theoretical probability less than
.01, or .05, or .50. Thirty-four protocols, or
47%, had p values less than .01; 50, or 69%,
had p values less than .05; and 65, or 90%,
had p values less than .50. Quite obviously,
these outcomes mean that the recall protocols
were strongly ordered in proper succession.
Subjects were indeed recalling around the
"sausage links."
In conclusion, by arranging linkages between input clusters, Ss recall more clusters
on the early trials, and they clearly output the
clusters in an order prescribed by the structural
array. This occurred despite the fact that these
clusters were shown in a random temporal
order. This benefit for recall has been shown
where the linkages were established by a
common word in two clusters. One probably
would be able to demonstrate a similar effect
with highly associated pairs of words, with the
two members of associated pairs being
embedded in different input clusters. Thus,
recall of black in its cluster might cue recall
of white and its input cluster. That experiment
is yet to be done.
The advantage demonstrated here for the
Linked list was small and short lived, disappearing by Trial 3, presumably because
Control Ss were recalling from nearly all of
their input clusters by that trial. Possibly the
advantage would prove more persistent with
a longer list of clusters. The size of the Linked-list advantage may have been attenuated to
some degree because the linking words were
each presented for association with two
different sets of words on each trial. This
within-trial procedure for the linking words
therefore shares some of the features of the
between-trial changes in groupings which
produced deleterious effects on recall in
Exp. I and IV. However, the shifting groupings
in Exp. I and IV were much more extensive
than here, and the present experiment
repeated the identical groupings over all
three trials.
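The succession-score analysis of the Linked-list protocols can be sketched in a few lines. The scoring rule and the use of 500 random permutations follow the description in the Results; everything else (function names, the use of Python's `random` module, counting ties as "as high or higher") is an assumption of this sketch, since the original program is not described in further detail.

```python
import random
from typing import List, Optional, Sequence


def succession_score(seq: Sequence[int], loop_size: int = 16) -> int:
    """Count consecutive recall pairs that are neighbors on the loop 1..loop_size."""
    score = 0
    for a, b in zip(seq, seq[1:]):
        diff = abs(a - b)
        # Neighbors differ by 1, or are the two ends of the loop (1 and loop_size).
        if diff == 1 or diff == loop_size - 1:
            score += 1
    return score


def chance_probability(seq: Sequence[int], n_perm: int = 500, loop_size: int = 16,
                       rng: Optional[random.Random] = None) -> float:
    """Monte Carlo estimate of P(score >= observed) under a random output order."""
    rng = rng or random.Random()
    observed = succession_score(seq, loop_size)
    shuffled: List[int] = list(seq)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # random reordering of the same recalled double words
        if succession_score(shuffled, loop_size) >= observed:
            hits += 1
    return hits / n_perm


# The example protocol from the text: 9 consecutive pairs, 5 of them pluses.
protocol = [7, 8, 1, 16, 15, 9, 10, 4, 3, 5]
print(succession_score(protocol))  # 5
print(chance_probability(protocol, rng=random.Random(0)))
```

Permuting only the double words actually recalled keeps the chance distribution conditional on which words were recalled, as the text requires when it notes that the number of pluses depends on the number and identity of those words.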
DISCUSSION
To recapitulate the argument and our
experimental data: we believe that a fundamental strategy Ss employ in learning is to
group or subdivide the material into subjective
clusters which become integrated units in
recall. These strivings for stable groupings
can be assessed in various ways which tap the
organization S employs in free recall. The
hypothesis is that the improvement with
practice in free recall results in part from the
increasing size and integration of these subjective clusters. It follows that recall should be
retarded if measures are taken to prevent the
development of stable groupings with practice.
Using the "mental imagery" method for
interassociating groups of four unrelated
nouns, Exp. I showed that the normal improvement with practice in free recall practically
vanished when new groupings were imposed
upon the list words on each trial. In Exp. II,
rather than E imposing arbitrary groupings,
S indicated his subjective groupings of the
words developed during preliminary recall
trials. A subsequent input trial with imposed
groupings increased or decreased recall
accordingly as the imposed groupings were
consistent or inconsistent with S's subjective
groupings. We view these two experiments as
establishing the same point: If current input
groupings conflict with prior groupings of the
material, then recall suffers. The experiments
differ only in the nature of the prior groupings
and how they were established.
The third experiment replicated the poorer
recall learning with changed groupings but
showed that the effect was absent in recognition tests of memory. It was proposed that
grouping factors influence response generation and retrieval processes, but not recognition since the latter depends only upon
"occurrence information" stored in memory
for each word, which information is independent of retrieval cues or schemes for
accessing that word in recall. This "occurrence
information" is presumed to be accessed
directly by the word on a recognition test.
Given that S has categorized the list words
into many subjective units, he still has the
problem of getting to all these groups in recall.
Mnemonic techniques use various (cuing)
methods for solving this retrieval problem.
A less dramatic but all-purpose method is to
associate two or more groups together,
integrating two former units into one larger
chunk, thus reducing the number of "units"
to be retrieved. In Exp. IV, this composition
of subgroups into larger groups was either
aided or hindered somewhat by the groupings
presented over successive trials, and recall
was better in the former case, as expected.
In Exp. V, S was aided in moving in memory
from one to another recall cluster by the
presence of a common word linking "adjacent" clusters. These structural linkages improved cluster recall; it was further demonstrated that S clearly used these linkages in
moving from one to another cluster in his
recall.
Our initial assumption was that free recall
reflects in part the amount of grouping which
S has carried out on the list words. This view
is bolstered by several incidental learning
experiments with free recall. Tulving (1966)
found little effect on free recall of having S
merely read the list words many times before
being told they were to recall the words.
Contrariwise, Mandler (1967) found that
incidental Ss, required to group the list words
into consistent categories, recalled as much as
intentional Ss told to categorize then recall.
Considered more generally, however, it is not
clear what kinds of categorization will or will
not benefit recall. For example, if words are
classified by their number of letters, by whether
or not they contain a t, by their part of speech,
etc., one intuitively would expect little incidental recall from such activities. The important
processes probably involve the arousal of
semantic features of the word, and the arousal
of categorizations that have strong associations to the list words.
REFERENCES
BOUSFIELD, W. A., PUFF, C. R., & COWAN, T. M. The
development of constancies in sequential organization during repeated free recall. Journal of
Verbal Learning and Verbal Behavior, 1964, 3,
449-459.
BOWER, G. H. Organization and memory. Address at
meetings of Western Psychological Association,
San Diego, California, March, 1968.
COFER, C. N. On some factors in the organizational
characteristics of free recall. American Psychologist, 1965, 20, 261-272.
COHEN, B. H. Some-or-none characteristics of coding.
Journal of Verbal Learning and Verbal Behavior,
1966, 5, 182-187.
DEESE, J. Influence of inter-item associative strength
upon immediate free recall. Psychological Reports,
1959, 5, 305-312.
KINTSCH, W. Recognition and free recall of organized
lists. Journal of Experimental Psychology, 1968,
78, 481-487.
MANDLER, G. Organization and memory. In K. W.
Spence & J. T. Spence (Eds.), The Psychology of
Learning and Motivation, Vol. 1. New York:
Academic Press, 1967.
McLEAN, R. S., & GREGG, L. W. Effects of induced
chunking on temporal aspects of serial recitation.
Journal of Experimental Psychology, 1967, 74,
455-459.
PAIVIO, A., YUILLE, J. C., & MADIGAN, S. A. Concreteness, imagery, and meaningfulness values for
925 nouns. Journal of Experimental Psychology
Monograph Supplement, 1968, 76, No. 1, Part 2,
1-25.
POLLIO, H. R. Associative structure and verbal
behavior. In T. R. Dixon and D. L. Horton
(Eds.), Verbal Behavior and General Behavior
Theory. Englewood Cliffs, N.J.: Prentice-Hall,
Inc., 1968. Pp. 37-66.
SEIBEL, R. Organization in learning. Tech. Rep.,
Contract No. OE-5-10-431. Pennsylvania State
University, University Park, Pa., 1967.
TULVING, E. Subjective organization in free recall of
"unrelated" words. Psychological Review, 1962,
69, 344-354.
TULVING, E. Intratrial and intertrial retention: Notes
towards a theory of free recall verbal learning.
Psychological Review, 1964, 71, 219-237.
TULVING, E. Subjective organization and effects of
repetition in multi-trial free-recall learning.
Journal of Verbal Learning and Verbal Behavior,
1966, 5, 193-197.
TULVING, E. Theoretical issues in free recall. In T. R.
Dixon and D. L. Horton (Eds.), Verbal Behavior
and General Behavior Theory. Englewood Cliffs,
N.J.: Prentice-Hall, Inc., 1968. Pp. 2-36.
(Received December 16, 1968)