Vol. 49 No. 4 November 1996 Section 5 Page 919

The Quarterly Journal of Experimental Psychology, 1996, 49A (4), 919–939
Word Dose in the Disruption of Serial Recall by Irrelevant Speech: Phonological Confusions or Changing State?
Andrew M. Bridges and Dylan M. Jones
University of Wales College of Cardiff, Cardiff, U.K.
Irrelevant background speech disrupts serial recall of visually presented lists of verbal material. Three experiments tested the hypothesis that the degree of disruption is dependent on the number of words heard (i.e. word dose) whilst the task was undertaken. Experiments 1 and 2 showed that more disruption is produced if the word dose is increased, thereby providing evidence to support the experimental hypothesis. It was concluded from the first two experiments that the word-dose effect might be the result of increasing the amount of changing-state information in the speech. The results of Experiment 3 supported this conclusion by showing an interaction between word dose and changing-state information. It was noted, however, that the results might be explained within the working memory account of the disruptive action of irrelevant speech. A further two experiments cast doubt on this possibility by failing to replicate the finding that the phonological similarity between heard and seen material affects the degree of interference (Salamé & Baddeley, 1982). The findings are discussed in relation to the changing-state hypothesis of the irrelevant speech effect (e.g. Jones, Madden, & Miles, 1992).
It is a well-documented and robust finding that "unattended" background speech reduces performance on serial recall tasks by approximately 30% (Baddeley & Salamé, 1986; Colle, 1980; Colle & Welsh, 1976; Hanley & Broadbent, 1987; Jones, 1994; Jones & Macken, 1993; Jones, Macken, & Murray, 1993; Jones et al., 1992; Jones, Miles & Page, 1990; Miles, Jones & Madden, 1991; Morris & Jones, 1990a, 1990b; Salamé & Baddeley, 1982, 1986, 1987, 1989, 1990). The term "irrelevant speech effect" (Jones et al., 1990) was suggested to be a better description of the phenomenon in preference to referring to the speech as unattended, thereby avoiding pre-judging the nature of its processing. It has been shown that gross characteristics of the speech source such as meaning and intensity do not determine the disruption (Colle, 1980; Colle & Welsh, 1976; Jones et al., 1990) and, by varying the section of the task during which the speech is played, that the effect occurs in memory and not at encoding (Jones, 1993; Macken & Jones, 1995; Miles et al., 1991).
Andy Bridges is now at the Psychology Division, Aston University. Requests for reprints should be sent to Andy Bridges, Psychology Group, Aston University, Aston Triangle, Birmingham, B4 7ET, U.K. Email: a.m.bridges@aston.ac.uk
The work was carried out at the School of Psychology, Cardiff, University of Wales during the period of an ESRC studentship awarded to the first author. Thanks are due to Paul Farrand, Bill Macken, and Alison Murray for help in the development of this paper and to Graham Hitch, Rick Hanley, and one anonymous referee for comments on earlier drafts.
© 1996 The Experimental Psychology Society
The identification of aspects of processing that are pre-attentive was noted by Massaro and Cowan (1993) to be important in an information-processing approach to the study of cognition. This allows the study of processes that arise without active attention and can provide considerable insight into the mechanisms of both peripheral and higher processing. Investigation of the irrelevant speech effect has facilitated this approach in the study of auditory cognition, contributing considerably to the progression of theories concerning memory and attention (see Jones & Morris, 1992 for a review). To date, however, a range of factors concerning the characteristics of auditory material that produce this disruption has yet to be fully explored. Two specific issues are addressed in this study: the degree to which the number of words (word dose) in the irrelevant speech determines the disruption; and whether the phonological similarity between heard and seen material is an important factor in the effect.
The Working Memory Account of the Irrelevant Speech Effect. A number of influential studies have investigated whether irrelevant acoustic material other than speech produces similar disruption of serial recall. It was found that continuous white noise fails to produce disruption (e.g. Baddeley, 1968), bursts of white noise produce some disruption (Salamé & Baddeley, 1982, Experiment 2; Salamé & Wittersheim, 1978), and the most robust interference is produced by speech. The conclusion was drawn, therefore, that non-speech material does not have the capacity to disrupt serial short-term memory to the same degree as speech.
This position harmonizes well with Baddeley and colleagues' influential theory of working memory (Baddeley & Hitch, 1974; see Baddeley, 1992a, 1992b for a review of the model), according to which items are represented by way of phonological codes in short-term memory, automatically in the case of speech, and by deliberate articulation if they are of visual origin. The irrelevant speech effect is therefore the result of an "alphabet soup" in which codes representing rehearsed visual items become confused with the speech codes that have automatic access to the memory store. Non-speech material does not have this automatic access to the phonological store, and therefore does not produce disruption of the representations of the visual material in the same way as speech. Recent studies have, however, shown that it is not only speech that disrupts serial recall. Specifically, sounds such as tones are equally likely to produce an irrelevant "speech" effect (e.g. Jones & Macken, 1993). These findings, by casting doubt on the working memory explanation, have motivated an alternative account of the effect, that is, the changing-state hypothesis (Jones et al., 1992).
The Changing-state Hypothesis. The finding that irrelevant speech consisting of a single repeated syllable did not produce appreciable disruption of serial recall (Jones et al., 1992) demonstrated that speech was not a sufficient constituent of an irrelevant sound for it to bring about an irrelevant speech effect. It was argued, instead, that the constituent items in the speech must change from one to the next ("changing state") in order for the effect to be shown. This view stands in contrast to the working memory account, which proposes that speech has mandatory access to the phonological store, and will, therefore, automatically command some of the resources required for the maintenance of deliberately rehearsed visual material. Hence, any irrelevant speech sound would be expected to interfere with serial short-term memory.
The early studies of irrelevant sound were largely motivated by requirements to produce optimum noise level guidelines for industrial workplaces (for a review see Jones & Broadbent, 1991). Non-speech sounds used in these studies corresponded to the type of background noise that might be heard in such an environment, such as continuous white noise (Baddeley, 1968). Thus, the conclusion that the disruption of serial recall by irrelevant sound was virtually unique to speech was based on the investigation of the effects of "steady-state" sound. The possibility therefore remained that an "irrelevant speech" effect might be produced by changing-state non-speech sounds. Experimental investigation of this possibility demonstrated that disruption of serial recall, of a magnitude equivalent to that produced by speech, was produced by changing-state tones (Jones & Macken, 1993), segmented glissandi (Jones, Macken, & Murray, 1993), and band-pass noise (Jones & Macken, 1995a). The conclusions drawn from these studies, formalized as the changing-state hypothesis, were that speech is neither a necessary nor sufficient constituent of an irrelevant sound for it to produce disruption of serial recall. Rather, the critical factor in determining the disruption is whether or not the irrelevant sound contains changing-state information, that is, each physical unit within the sound must be different to the one that preceded it.
Manipulating Changing State Using "Word Dose". According to the changing-state hypothesis, increasing the amount of changing-state information should increase the degree of disruption. To date this has only been investigated in one way: by manipulating the phonological similarity between the items in the speech (Jones & Macken, 1995b). If the words in the speech are more phonologically similar to each other (as, for example, in rhyming words) there will be less change from word to word, and hence less changing-state information. This type of speech should therefore produce less disruption of serial recall. Experimental support for this position was provided by a series of experiments that contrasted rhyming words (e.g. knee, key, fee) with non-rhyming words (e.g. deaf, pay, bell). It was found that, as predicted, the changing-state (non-rhyming) speech produced more disruption than the steady-state (rhyming) words (Jones & Macken, 1995b). The purpose of the present study was to investigate an alternative technique for manipulating the amount of changing-state information in the speech, by varying the total number of irrelevant spoken words heard during the course of a serial recall task, that is, the "word dose".
Changing-state information, it is suggested, may be derived from the difference between subsequent units in the irrelevant sound. Increasing the number of units would, therefore, increase the number of between-unit changes, and hence the amount of changing-state information. It would therefore be expected that increasing the word dose in an irrelevant speech sequence would increase the degree of disruption to serial recall. The first experiment of this study sought to examine this possibility, by contrasting the effects of three types of irrelevant speech, each containing a different level of word dose, on a serial recall task.
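The link between the number of units and the number of between-unit changes can be illustrated with a small sketch. The function name `count_state_changes` and the idea of counting adjacent non-identical units are assumptions made for illustration only; the paper does not define a formal measure of changing-state information.

```python
def count_state_changes(units):
    """Count adjacent pairs of units that differ (an illustrative proxy only)."""
    return sum(1 for prev, cur in zip(units, units[1:]) if prev != cur)


steady = ["B"] * 10                       # one utterance repeated ten times
changing_low = ["B", "I", "J", "N", "Z"]  # five different utterances
changing_high = changing_low * 2          # the same items at twice the dose

print(count_state_changes(steady))         # 0
print(count_state_changes(changing_low))   # 4
print(count_state_changes(changing_high))  # 9
```

On this simplified reading, adding units to a changing sequence adds changes, whereas adding units to a repeated sequence adds none.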
EXPERIMENT 1
The first experiment in the series was intended to investigate the possibility that word dose is an important factor in the disruptive effect of irrelevant speech on serial recall. Word dose was manipulated by contrasting three types of speech: one in which there was no gap between words (the treatment with most words), one in which there was a small gap between words (fewer words), and one in which there was a relatively big gap between words (fewest words). It was predicted that increasing the word dose would increase the amount of changing-state information. Therefore, the prediction according to the changing-state hypothesis is that a continuum of increased disruption is expected, with most interference occurring in the condition with the most words, and least interference shown when there is the smallest number of words in the speech stream.
Method
Subjects
Twenty-four undergraduate student volunteers were paid for participating in the experiment. All subjects had English as their first language and reported normal hearing.
Apparatus and Materials
Items to be recalled were presented serially on the screen of an Apple Macintosh Quadra microcomputer. Lists were constructed from the random arrangement of the letters F, K, L, M, Q, R, S, T, and Y, with the constraint that a letter could not appear in the same serial position in two consecutive lists and that recognizable words, letter strings, or acronyms were not included. The lists were stored and presented within a Hypercard environment.
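The positional constraint on consecutive lists can be sketched as a simple rejection procedure. This is a minimal illustration, not the original Hypercard implementation: the function name `make_lists` is hypothetical, and the screening for recognizable words, letter strings, and acronyms is omitted because it would require a word list that the paper does not describe.

```python
import random

LETTERS = list("FKLMQRSTY")  # the nine letters named in the text


def make_lists(n_lists, rng):
    """Return n_lists random orderings of LETTERS.

    A candidate is rejected if any letter occupies the same serial
    position as it did in the immediately preceding list.
    """
    lists = []
    while len(lists) < n_lists:
        candidate = rng.sample(LETTERS, len(LETTERS))
        if lists and any(a == b for a, b in zip(lists[-1], candidate)):
            continue  # a letter repeats its serial position: redraw
        lists.append(candidate)
    return lists


trial_lists = make_lists(16, random.Random(0))  # 16 trials per condition
```

Roughly one random ordering in three satisfies the constraint, so redrawing is cheap for lists of this length.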
All three speech conditions consisted of the same five letter names, B (bee), I (eye), J (jay), N (enn), and Z (zed), in random order (these will be referred to as utterances, as they do not all strictly constitute words).¹ All utterances were edited using digital signal processing techniques to last exactly 350 msec, but the length of the silence placed between the utterances was varied between the conditions. In the no-gap condition, no silence was placed between the utterances. The small-gap condition contained 350-msec periods of silence placed between the utterances, and 700-msec gaps were used in the long-gap condition. This resulted in the subjects hearing the most words in the no-gap condition, the fewest in the long-gap condition, and an intermediate number in the small-gap condition. The three speech conditions were contrasted with a quiet control. All words were spoken in a male voice, recorded in a random order using SoundEdit software and stored as "snd" resources within Hypercard. In this manner a recording was constructed for each condition, which lasted approximately 20 sec and was repeatedly looped. Sound was delivered via Sony CD250 headphones at 65 dB(A) as measured by an artificial ear.
¹ All sounds used in the experiments in this paper were recorded and digitally edited using digital signal processing techniques to 8-bit resolution at a sampling rate of 22 kHz.
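The word dose implied by each gap length can be worked out from the stated durations. The sketch below assumes a loop of exactly 20,000 msec (the text says only "approximately 20 sec") and counts whole utterance-plus-gap cycles, so the figures are approximate; the function name `utterances_per_loop` is illustrative.

```python
UTTERANCE_MS = 350
LOOP_MS = 20_000  # assumed; the text gives "approximately 20 sec"

GAPS_MS = {"no gap": 0, "small gap": 350, "big gap": 700}


def utterances_per_loop(utterance_ms, gap_ms, loop_ms=LOOP_MS):
    """Number of whole utterance-plus-gap cycles that fit in one loop."""
    return loop_ms // (utterance_ms + gap_ms)


dose = {name: utterances_per_loop(UTTERANCE_MS, gap) for name, gap in GAPS_MS.items()}
print(dose)  # {'no gap': 57, 'small gap': 28, 'big gap': 19}
```

Under these assumptions the no-gap stream carries about three times as many utterances per loop as the big-gap stream.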
Experimental Design
A repeated measures design was used in which the four treatments were blocked, with 16 trials in each condition. The order of presentation of the blocks was randomized between subjects.
Procedure
Subjects were tested individually, seated in a sound-proofed laboratory approximately 0.5 m from the computer's screen. Before the start of the trial, standard instructions were read by the subject. These informed them of the nature of the recall task and instructed them to ignore any sounds they might hear. In each trial, the nine letters were displayed in random order, as described earlier. Each letter was displayed for 500 msec with an inter-letter interval also of 500 msec. Each list was preceded and followed by a tone. After presentation of the second tone, the word "wait" was flashed on the screen for 10 sec, during which time the subject was expected to rehearse the list. As it has been shown that the disruption occurs in memory and not encoding, it was expected that exposure to the irrelevant speech during a retention interval would result in more marked effects than immediate recall. After this retention interval another tone was played, at which signal the subject was instructed to recall the list in strict serial order. Speech was played throughout the presentation, retention, and recall of the lists. Responses were written on a blank grid comprising rows of nine boxes. The subjects were given 15 sec to recall each list. The experiment was preceded by a five-trial practice session, and the experiment lasted about 40 min in all.
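As a rough consistency check, the stated trial timings can be added up and compared with the reported session length. The sketch assumes that a 500-msec interval follows every letter including the last, and it ignores the duration of the tones, the instructions, and any pauses between blocks, none of which are given in the text.

```python
LETTERS_PER_LIST = 9
LETTER_MS = 500
INTERVAL_MS = 500      # assumed to follow every letter, including the last
RETENTION_S = 10
RECALL_S = 15
TRIALS_PER_BLOCK = 16
BLOCKS = 4             # three speech conditions plus quiet
PRACTICE_TRIALS = 5


def trial_seconds():
    """Duration of one trial: presentation, retention, then recall."""
    presentation_s = LETTERS_PER_LIST * (LETTER_MS + INTERVAL_MS) / 1000
    return presentation_s + RETENTION_S + RECALL_S


def session_minutes():
    """Total duration of practice plus experimental trials, in minutes."""
    n_trials = TRIALS_PER_BLOCK * BLOCKS + PRACTICE_TRIALS
    return n_trials * trial_seconds() / 60


print(trial_seconds())              # 34.0
print(round(session_minutes(), 1))  # 39.1
```

The estimate of roughly 39 min agrees with the reported duration of about 40 min.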
Results
A two-factor ANOVA with auditory condition (four levels) and serial position (nine levels) as factors was used to analyse errors. The mean errors at each serial position are shown in Figure 1. There was a significant effect of condition, F(3, 69) = 14.36; p < .0001, and position, F(8, 184) = 34.96; p < .0001, with a significant interaction between the two, F(24, 552) = 1.58; p < .05. For each condition, the total mean errors out of 16 and standard errors were: 6.51 ± 0.30 for the quiet condition, 7.38 ± 0.31 for the big-gap condition, 8.17 ± 0.30 for the small-gap condition, and 9.20 ± 0.31 for the no-gap condition.
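The degrees of freedom reported for the omnibus tests follow from the design. The sketch below assumes a fully within-subject two-factor ANOVA with uncorrected degrees of freedom; the function name `repeated_measures_df` is illustrative.

```python
def repeated_measures_df(n_subjects, n_conditions, n_positions):
    """Return (effect df, error df) for each term of a two-way within-subject ANOVA."""
    a = n_conditions - 1
    b = n_positions - 1
    n = n_subjects - 1
    return {
        "condition": (a, a * n),
        "position": (b, b * n),
        "interaction": (a * b, a * b * n),
    }


print(repeated_measures_df(24, 4, 9))
# {'condition': (3, 69), 'position': (8, 184), 'interaction': (24, 552)}
```

With 24 subjects, four auditory conditions, and nine serial positions, these values match the F(3, 69), F(8, 184), and F(24, 552) reported above.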
Planned comparisons between the auditory conditions showed there to be significant differences between quiet versus big gap, F(1, 17) = 4.215; p < .05, small gap versus no gap, F(1, 17) = 5.86; p < .05, and big gap versus no gap, F(1, 17) = 18.03; p < .0001, and no significant difference between small gap versus big gap, p = .07. The continuum of increased interference from big gap through to no gap was examined using an orthogonal polynomial analysis, which showed there to be a highly significant linear trend, F(1, 23) = 19.35; p < .0001.
Inspection of Figure 1 shows a possible reason for the interaction between condition and position, namely, the reduced difference between conditions at Serial Positions 1 to 3. This may also explain the finding that the difference between big gap and small gap did not reach significance, despite the statistical evidence from the polynomial analysis of a continuum of interference as dose increased. A series of post-hoc comparisons were therefore used to examine the difference between the conditions in just the last six serial positions. Significant differences were found between all conditions: quiet versus big gap, F(1, 17) = 35.40; p < .0001; big gap versus small gap, F(1, 17) = 23.90; p < .0001; small gap versus no gap, F(1, 17) = 36.30; p < .0001.
FIG. 1. Results of Experiment 1, showing mean errors of serial recall with respect to serial position, contrasting performance in quiet with conditions of irrelevant speech in which the spacing between words was varied from 700 msec (big gap) through 350-msec intervals (small gap) to no gap.
Discussion
The results of the first experiment form a coherent pattern; as the amount of silence between the utterances in an irrelevant speech stream decreases (that is, as the number of words increases) the amount of disruption to serial recall caused by that stream increases: a "word dose" effect. This finding appears to support the hypothesis made earlier that one factor affecting the amount of disruption, possibly by altering the amount of changing-state information, is the number of words in the speech stream. It could be argued, however, that the number of utterances has been confounded with the total speech exposure. That is, in the no-gap condition there were many more utterances than in the big-gap condition, because it contained no silence. This design, however, resulted in greater exposure to speech in the former condition than the latter. A further experiment was undertaken, to manipulate both dose and speech exposure, thereby avoiding the possibility of attributing the results to word dose in error.
EXPERIMENT 2
It has been argued that word dose, the number of distinct and different units in an irrelevant speech stream, is an important factor in determining the extent of its interference with serial recall. We would not, therefore, attribute the results of Experiment 1 to differential amounts of speech exposure across conditions. Experiment 2 sought to test this proposition by contrasting speech streams with the same number of utterances, but different amounts of total speech exposure: short utterances with long gaps, long utterances with short gaps, and short utterances with short gaps. To avoid the possible confound that long utterances would be phonologically different from short utterances, sound-editing software, using digital signal processing techniques that lengthen sounds but retain essential characteristics such as pitch, was used to construct the long utterances from the short utterances. It was predicted that the stream with short utterances and long gaps would show an equivalent degree of disruption to that with long utterances and short gaps (i.e. the same number of utterances in each), despite the latter providing greater speech exposure. The stream containing the greatest number of utterances (short utterances and short gaps) was expected to produce the most disruption.
Method
Subjects
Twenty undergraduate student volunteers were paid for participating in the experiment. All subjects had English as their first language and reported normal hearing.
Apparatus and Procedure
Items to be recalled were presented serially on the screen of an Apple Macintosh Quadra microcomputer. Lists were the same as for Experiment 1, as was the procedure.
Materials
All three speech conditions consisted of the same five utterances of letter names, B (bee), I (eye), J (jay), N (enn), and Z (zed), in random order. Two sets of recordings of the letter names were made. In the first, the utterances were timed to last exactly 350 msec. These recordings were then transformed using Sound Designer software to produce a second set of recordings, each of which lasted exactly 700 msec. Qualities of the original recording, such as pitch, were kept exactly the same for the two sets of recordings. It is important to note that these techniques of transformation do not decrease intelligibility, and that the resultant utterance is perceived as natural speech. The speech materials were assembled from either the short (350 msec) or long (700 msec) utterances with either short (350 msec) or long (700 msec) periods of silence between the utterances to produce recordings of speech for three auditory conditions: short word with short gap, long word with short gap, and short word with long gap. In this manner a recording was constructed for each condition which lasted approximately 20 sec and could be repeatedly looped. All words were spoken in a male voice and stored as resources within Hypercard. The three speech conditions were complemented with a quiet control. The same number of utterances was contained in the recordings for the short-word with long-gap and long-word with short-gap conditions, although the total amount of speech, in terms of time, was greater in the latter condition. The amount of speech exposure was also greater in the short-word with short-gap condition than in the short-word with long-gap condition; however, the former condition contained many more utterances. Sound presentation was as for Experiment 1.
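The way this design separates word dose from speech exposure can be made explicit with the stated durations. In the sketch below, the cycle length (word plus gap) determines the dose, and the proportion of each cycle filled with speech is used as an index of exposure; the function names and this index of exposure are illustrative assumptions rather than measures defined in the paper.

```python
from fractions import Fraction

# (word duration, gap duration) in msec, as given in the Materials
CONDITIONS = {
    "short word, short gap": (350, 350),
    "long word, short gap": (700, 350),
    "short word, long gap": (350, 700),
}


def cycle_ms(word_ms, gap_ms):
    """Onset-to-onset time between utterances; shorter cycles mean a higher dose."""
    return word_ms + gap_ms


def exposure(word_ms, gap_ms):
    """Proportion of the stream that is occupied by speech."""
    return Fraction(word_ms, word_ms + gap_ms)


for name, (word, gap) in CONDITIONS.items():
    print(name, cycle_ms(word, gap), exposure(word, gap))
# short word, short gap 700 1/2
# long word, short gap 1050 2/3
# short word, long gap 1050 1/3
```

The two 1050-msec conditions deliver the same number of utterances per unit time but differ twofold in exposure, while the short-word, short-gap condition has the highest dose with intermediate exposure.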
Results
A two-factor ANOVA was carried out on serial recall errors with respect to auditory condition (four levels) and serial position (nine levels). Mean errors at each serial position are displayed in Figure 2. There was a significant effect of condition, F(3, 57) = 5.89; p < .001, and serial position, F(8, 152) = 23.87; p < .0001, but no significant interaction, p > 0.5. The results of planned comparisons between conditions showed there to be significant differences for quiet versus short word with long gap, F(1, 17) = 3.99; p < .05, quiet versus long word with short gap, F(1, 17) = 4.60; p < .05, short word with long gap versus short word with short gap, F(1, 17) = 4.85; p < .05, and long word with short gap versus short word with short gap, F(1, 17) = 4.22; p < .05. No significant difference was shown for long word with short gap versus short word with long gap, p > 0.5.
FIG. 2. Results of Experiment 2, showing mean errors of serial recall with respect to serial position, contrasting performance in quiet with conditions of irrelevant speech in which the constituent words were 350 msec long, with between-word intervals of either 350 msec (short word with short gap) or 700 msec (short word with long gap), or the words were 700 msec long with 350-msec intervals (long word with short gap).
Discussion
The results of Experiment 2 are as predicted by the word dose hypothesis. The two conditions in which there were the same number of utterances in the speech stream (short word with long gap and long word with short gap) showed an equivalent amount of disruption of serial recall despite there being a greater amount of total speech exposure in the latter condition. The condition in which the speech stream contained many more utterances (short word with short gap) produced significantly more disruption than either of the other two speech conditions. Taking these results in conjunction with those of Experiment 1, we can now state more confidently that the total duration of speech is not the important factor in the dose effect. Rather, as hypothesized, a major determinant of the degree of disruption appears to be the number of distinct utterances in the speech stream. In summation, we have observed a dose effect, that is, the greater the "dose" of units in a speech sequence, the greater the disruption of serial order information that the sequence will produce.
A further prediction that may be derived from the changing-state hypothesis is that there should be an interaction between changing state and word dose. It is suggested that disruption of serial recall is dependent on item-to-item change, hence steady-state material produces little disruption. Therefore, increasing the word dose in a steady-state sound should have little effect on the amount of disruption produced. Certainly, any increase in disruption should not be as marked as that shown in the preceding experiments by changing-state sound. A third experiment was undertaken to examine this prediction.
EXPERIMENT 3
The third experiment was designed to investigate the manipulation of word dose in irrelevant speech consisting of the same repeated utterance (steady state) and speech that contained differing utterances (changing state). It is proposed, from the changing-state hypothesis, first, that changing-state speech would produce significantly more interference than steady state. Second, increasing the dose in steady-state speech should not increase the amount of changing-state information and therefore would not increase the disruption of serial recall. Hence there should be an interaction between changing state and word dose.
Method
Subjects
Twenty-seven undergraduate student volunteers were paid for participating in the experiment. All subjects had English as their first language and reported normal hearing.
Apparatus and Procedure
Items to be recalled were presented serially on the screen of an Apple Macintosh Quadra microcomputer. Lists were the same as for Experiment 1, as was the procedure.
Materials
Two types of speech sequences were constructed. The first consisted of a repeated (steady-state) utterance: the syllable B (bee); the second consisted of the same letter names used in Experiment 1 in random order (changing state): B (bee), I (eye), J (jay), N (enn), and Z (zed). All utterances were spoken in a male voice and edited using digital signal-processing techniques to last exactly 350 msec. Word dose was manipulated in both of the speech types by inserting either a 0-msec (high word dose) or 700-msec (low word dose) gap of silence between the utterances. This resulted in the production of four speech conditions: steady/low, steady/high, changing/low and changing/high. The recording for each condition lasted approximately 20 sec and could be repeatedly looped. These recordings were stored as resources within Hypercard. The four speech conditions were contrasted with a quiet control. Sound presentation and delivery were as for Experiment 1.
Results
The errors were analysed first using an ANOVA with two factors: condition and serial position. Mean errors at each serial position are displayed in Figure 3. There was a significant effect of condition, F(4, 104) = 9.20; p < .0001, and position, F(8, 208) = 63.05; p < .0001, and no interaction, p > .1. For each condition, the total mean errors out of 16 and standard errors were: 6.48 ± 0.29 in the quiet condition, 6.78 ± 0.27 in the steady/low condition, 6.95 ± 0.29 in the steady/high condition, 7.51 ± 0.28 in the changing/low condition, and 9.17 ± 0.27 in the changing/high condition.
Planned comparisons showed there to be significant differences between quiet and the two changing-state conditions: quiet versus changing/low, F(1, 26) = 4.21; p < .05, and quiet versus changing/high, F(1, 26) = 29.04; p < .0001. No significant differences were found between quiet and the two steady-state speech conditions: quiet versus steady/low, p > .1, and quiet versus steady/high, p > .1.
The data for the quiet condition were then removed, and a 2 (speech type: changing vs. steady state) × 2 (dose) ANOVA was used to analyse the data from the speech conditions. There was a significant main effect of speech type, F(1, 26) = 20.99; p < .0001, and dose, F(1, 26) = 6.62; p < .05. Importantly, there was also a significant interaction between dose and speech type, F(1, 26) = 4.95; p < .05. Planned comparisons showed that a significant dose effect was produced only with changing-state speech, changing/low versus changing/high, F(1, 26) = 12.27; p < .01; no significant difference was found between the two steady-state conditions, steady/low and steady/high, p > .1. Analysis of the effect of speech type showed that significantly more errors were produced in the changing/high condition than in either of the conditions of steady-state speech: changing/high versus steady/low, F(1, 26) = 25.33; p < .0001, changing/high versus steady/high, F(1, 26) = 21.88; p < .0001, whereas changing/low speech did not produce more errors than either of the steady-state conditions: changing/low versus steady/low, p > .1; changing/low versus steady/high, p > .1.
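The form of the dose by speech type interaction can be seen directly in the condition means reported for Experiment 3. The sketch below simply differences those means; it is descriptive arithmetic on the published values, not a reanalysis, and the names used are illustrative.

```python
# Total mean errors (out of 16) for the four speech conditions, as reported
MEAN_ERRORS = {
    ("steady", "low"): 6.78,
    ("steady", "high"): 6.95,
    ("changing", "low"): 7.51,
    ("changing", "high"): 9.17,
}


def dose_effect(speech_type):
    """Increase in mean errors from low to high dose for one speech type."""
    return MEAN_ERRORS[(speech_type, "high")] - MEAN_ERRORS[(speech_type, "low")]


steady_effect = dose_effect("steady")
changing_effect = dose_effect("changing")
interaction = changing_effect - steady_effect  # difference of differences

print(round(steady_effect, 2), round(changing_effect, 2), round(interaction, 2))
# 0.17 1.66 1.49
```

The dose effect is close to zero for steady-state speech and about ten times larger for changing-state speech, which is the pattern the significant interaction reflects.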
Discussion
The predictions drawn from the changing-state hypothesis concerning the relationship between word dose and changing-state information were fulfilled in the results of Experiment 3. First, there was a significant interaction between dose and speech type, with only the changing-state speech producing a significant dose effect. Second, significantly more errors were produced under conditions of changing-state speech when presented at a high dose. It would appear that the reduction in changing-state information caused by presenting the changing-state speech at a low dose resulted in this condition producing no more disruption than steady-state speech. Changing-state speech presented at a low dose did, however, produce more errors than when the task was performed in quiet, unlike either of the steady-state conditions.
FIG. 3. Results of Experiment 3, showing mean errors of serial recall with respect to serial position, contrasting performance in quiet with conditions of irrelevant speech that manipulated speech type (steady and changing state) and dose (low dose and high dose).
The first two experiments demonstrated that as the number of discrete words in an
irrelevant speech source increases, so does the disruption to serial recall caused by that
speech. This was suggested to be due to an increase in changing-state information in the
sound. The third experiment supports this conclusion with the demonstration of an
interaction between word dose and whether the sound consisted of changing- or
steady-state items.
A weakness of the preceding conclusions, however, is that the word-dose effect might
also be explained within the working memory model. An important extrapolation of the
working memory account is that, if the irrelevant speech effect is due to phonological
confusions, speech that is phonologically more similar to the to-be-recalled items should
produce more interference. Evidence for this position was provided by Salamé and Baddeley
(1982, Experiment 5), who contrasted the effects of three irrelevant speech sources on the
serial recall of visually presented digits: (1) digits identical to the visual presentation
(semantically similar condition), (2) words that contained the same phonemes as the digits
in a different order, for example, tun, gnu, tree, sore (phonologically similar condition), or
(3) phonologically dissimilar words, for example, tennis, tipple, wicket (phonologically
dissimilar condition). It was found that the semantically similar and phonologically similar
conditions produced an equivalent degree of disruption, which was significantly greater
than that produced by the phonologically dissimilar condition (all speech conditions
produced significantly more errors of recall than when the task was performed in quiet).
One of the conclusions drawn from these findings was that "(the results) seem to offer
strong support for the view that the amount of disruption by irrelevant speech is
determined by the phonological similarity between the material being rehearsed and the
irrelevant distracting material" (Salamé & Baddeley, 1982, p. 160). A second conclusion
was that the ordering of the phonologically similar phonemes was inconsequential. This
argument was drawn from the finding that digits, which were identical to the visual
material, and the phonologically similar speech, which contained the same phonemes as
the digits in a different order, produced similar degrees of disruption.
The conclusions drawn by Salamé and Baddeley (1982) could lead to the suggestion
that it was not changing-state information per se that produced the increased interference
shown by high-dose sound. Rather, it could be argued that increasing the word dose
simply increases the number of confusions within the phonological store. Equally, the
interaction demonstrated in Experiment 3 could be the result of the changing-state
speech containing more phonemes that were also present in the to-be-remembered list.
That is, the changing-state sound was phonologically more similar to the
to-be-remembered items. The changing-state speech contained the syllables B (bee) (containing
the vowel sound ee present in T in the visual list), J (jay) (ay present in K in the visual
list), I (eye) (eye present in Y in the visual list), and N (enn) & Z (zed) (ed present in M
in the visual list). In comparison, the steady-state speech contained only the letter name B
(bee). It may be, therefore, that the changing-state speech produced greater disruption
because it created more phonemic confusions in the phonological store. Furthermore,
increasing the dose of the changing-state speech would produce proportionally greater
potential for phonemic confusions than the higher dose in the steady-state speech.
It should be noted that indirect evidence, from studies of the irrelevant speech effect,
is not very supportive of the hypothesis that the effect is related to the phonological
similarity between heard and seen material. A number of studies using very different
types of speech have produced broadly similar degrees of disruption, such as narrative
(Colle, 1980; Jones et al., 1990), speech in a language that is not understood by the
subject (Jones et al., 1990), sung passages (Morris, Jones, & Quayle, 1989), and reversed
speech (Jones et al., 1990). Furthermore, the findings discussed earlier that certain types
of non-speech produce similar disruption to speech (e.g. Jones & Macken, 1993) cast even
more doubt on the proposal that the phonological similarity between the speech and the
to-be-remembered items is an important factor in the interference. Direct evidence
contrary to the results of Salamé and Baddeley (1982, Experiment 5) comes from another
study in which, although stimuli different to the original experiment were used, an effect
of the phonological similarity between heard and seen items was not found (Jones &
Macken, 1995b, Experiment 2).
Despite these findings, it may be feasible to object to the conclusions of the present
series on the basis of the phonological similarity hypothesis detailed above. It seemed
desirable therefore to attempt a direct replication of the experiment that first
demonstrated the effect (Salamé & Baddeley, 1982, Experiment 5) to examine whether the
phonological similarity effect is as robust as the dose effect demonstrated in the preceding
three experiments.
EXPERIMENT 4
The effects of irrelevant speech have been explained within the working memory model of
Baddeley and colleagues as a result of phonological confusions between heard and seen
material (e.g. Salamé & Baddeley, 1982). It has been noted that the word-dose effect
demonstrated in the previous three experiments could also be accommodated within
this model. Increasing the word dose may simply increase the number of phonological
confusions. Furthermore, it is possible that the speech containing a single repeated
utterance (steady state) used in Experiment 3 was phonologically less similar to the visual
stimuli than the speech containing several words (changing state). This difference in
phonological similarity could therefore be argued to account for the finding that
steady-state speech did not produce a dose effect. In order to be certain that phonological
similarity was not an important factor in the results of Experiment 3, a replication was
attempted of the original experiment that demonstrated the effect (Salamé & Baddeley,
1982, Experiment 5).
Method
Subjects
Eighteen undergraduate student volunteers were paid for participating in the experiment. All
subjects had English as their first language and reported normal hearing.
Apparatus and Materials
Lists of items to be recalled were presented serially on the screen of an Apple Macintosh IIvi
microcomputer. Lists were constructed from the random arrangement of the digits one to nine, as in
Salamé and Baddeley (1982), with the constraint that a digit could not appear in the same serial
position in two consecutive lists. The lists were stored and presented within a HyperCard environment.
The three types of irrelevant speech used were similar to those used by Salamé and Baddeley
(1982), but with the safeguard that all words were timed to exactly 750 msec using editing software.
Material in the semantically similar condition was identical to that used in the visual lists (the digits
one to nine) and phonologically similar material contained the same phonemes arranged into
different pairs: tun, gnu, wee, sore, thrive, fix, heaven, and sign. As in Salamé and Baddeley (1982), the digit
seven, and its rhyme (heaven) were pronounced with the intention of approximating a monosyllabic
word (sev'n and 'eav'n), to conform with the rest of the utterances in their respective speech streams.
The phonologically dissimilar words comprised tennis, jelly, tipple, double, tunnel, hackle, valley, pickle,
and wicket. The three speech conditions were contrasted with a quiet control. All words were spoken
in a male voice, recorded using SoundEdit software and stored as "snd" resources within HyperCard.
The words were recorded in a random order with a gap of 250 msec between each one; this,
combined with the fact that all words were of the same length, ensured that the same word dose
was contained in each speech condition. In this manner a recording was constructed for each
condition, which lasted approximately 20 sec and was repeatedly looped. Sound delivery was as
for Experiment 1.
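
The list-construction constraint described above (a random arrangement of the digits one to nine, with no digit in the same serial position in two consecutive lists) can be illustrated with a minimal sketch. The original lists were built within HyperCard; the Python below is not the authors' procedure. The function name `make_lists` and the use of rejection sampling are assumptions made purely for illustration.

```python
import random

DIGITS = list(range(1, 10))  # the digits one to nine


def make_lists(n_lists, rng):
    """Return n_lists random orderings of DIGITS such that no digit occupies
    the same serial position in two consecutive lists.

    Hypothetical helper: rejection sampling is an illustrative choice,
    not the method reported in the paper.
    """
    lists = []
    previous = None
    while len(lists) < n_lists:
        candidate = rng.sample(DIGITS, k=len(DIGITS))
        # Accept the candidate only if it differs from the previous list
        # at every serial position.
        if previous is None or all(a != b for a, b in zip(candidate, previous)):
            lists.append(candidate)
            previous = candidate
    return lists


if __name__ == "__main__":
    # 4 conditions x 20 trials = 80 lists per subject, as in Experiment 4.
    for digits in make_lists(80, random.Random(1))[:3]:
        print(digits)
```

Roughly one candidate in three passes the check, so simple resampling is adequate at this list length.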
Experimental Design
A repeated-measures design was used in the experiment. The four treatments were blocked, with
20 trials in each condition, and the order of presentation of the blocks was randomized between
subjects.
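
As an illustration of the blocked design just described, the sketch below draws an independent random block order for each of 18 subjects. The paper does not say how the randomization was carried out, so the function names, the fixed seed, and the use of a simple shuffle are all assumptions.

```python
import random

# Condition labels follow the four treatments of Experiment 4.
CONDITIONS = [
    "quiet",
    "semantically similar",
    "phonologically similar",
    "phonologically dissimilar",
]
TRIALS_PER_BLOCK = 20  # from the text


def block_order(rng):
    """Return one random ordering of the four blocked conditions."""
    order = CONDITIONS[:]
    rng.shuffle(order)
    return order


def session_plan(n_subjects, seed=0):
    """Map each subject number to an independently shuffled block order.

    Hypothetical helper: the seed exists only to make the sketch reproducible.
    """
    rng = random.Random(seed)
    return {subject: block_order(rng) for subject in range(1, n_subjects + 1)}


if __name__ == "__main__":
    for subject, order in session_plan(18).items():
        print(subject, order)
```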
Procedure
The procedure was as for Experiment 1, except that in order to replicate the methodology used by
Salamé and Baddeley (1982, Experiment 5), no retention interval was used. Subjects were required to
recall the items immediately following presentation.
Results
Errors were analysed using a two-factor ANOVA with acoustic condition (four levels) and
serial position (nine levels) as factors. There were highly significant effects of condition,
F(3, 51) = 15.44; p < .0001, and serial position, F(8, 136) = 33.56; p < .0001. Planned
comparisons on the main effect of acoustic condition showed there to be no significant
differences between any of the speech conditions (p > 0.05 in all cases), but
significant differences between each of the speech conditions and quiet: quiet versus
semantically similar, F(1, 17) = 20.99; p < .0001, quiet versus phonologically similar,
F(1, 17) = 39.93; p < .0001, quiet versus phonologically dissimilar, F(1, 17) = 27.07; p <
.0001. There was also a significant interaction between the two factors, F(24, 408) = 2.15;
p < .01. There are two possible accounts for the interaction that is displayed in Figure 4.
First, phonologically similar words produce more disruption than the other speech types
at Positions 3, 4, and 5; however, of the three contrasts only that at Position 5 just reaches
significance, and then only at the 5% level. Testing of an isolated point among many is not
strictly legitimate, and it therefore seems likely that the significant difference is a chance
result. A second possible reason for the interaction is the convergence of all speech
conditions with the quiet condition at Serial Positions 1, 2, and 8. The latter explanation
seems more likely, given that the amount of variance accounted for by differences across
conditions is extremely small, η² = 0.07, in comparison to that accounted for by change
across serial position, η² = 0.29.
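
For reference, the proportion-of-variance statistic reported in the preceding paragraph is conventionally defined as the ratio of an effect's sum of squares to the total sum of squares. The text does not state which variant (classical or partial) was computed, so the form below is an assumption and is offered only as a reminder of the usual definition:

```latex
\eta^2 = \frac{SS_{\text{effect}}}{SS_{\text{total}}}
```

On that reading, the two reported values would correspond to roughly 7% of the total variance for acoustic condition and 29% for serial position.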
Discussion
In showing no increased disruption of serial recall by irrelevant speech containing the
same phonemes as the to-be-remembered material, Experiment 4 failed to replicate the
findings of Experiment 5 of Salamé and Baddeley (1982). The phonologically similar
material did, however, show a (largely non-significant) tendency to produce more
disruption at three serial positions. It would seem unwise to attach too much significance to
this, for two reasons. First, the analysis of the amount of variance accounted for showed
that differences across conditions explained far less variance than change across serial
position. Second, if the increase in disruption was due to phonological similarity, we would
have expected the semantically similar speech to have shown a similar trend.

FIG. 4. Results of Experiment 4, showing mean errors of serial recall with respect to serial position.
Performance in quiet was contrasted with performance under conditions of irrelevant speech containing words
that were either identical to the visual stimuli (digits), contained the same rhymes as the stimuli (phonologically
similar), or were phonologically dissimilar.

The failure to find a phonological similarity effect, however, supports the evidence,
discussed earlier, from other irrelevant speech studies, which concluded that phonological
similarity between heard and seen items was not an important factor in the disruption of
serial recall (e.g. Jones & Macken, 1995b, Experiment 2). We would suggest, from this
evidence, that the sole experimental demonstration of a phonological similarity effect
available (Salamé & Baddeley, 1982, Experiment 5) was, in fact, the result of a Type I
sampling error. The result allows us to be more confident that the word-dose effect
demonstrated in previous experiments was not the product of increasing the phonological
similarity between heard and seen material.
Despite supporting predictions, a possible weakness of Experiment 4 is that only the
words in the phonologically dissimilar condition were disyllabic. It could be argued that
dose may be increased, not only by increasing the number of words in the irrelevant
speech, but also by increasing the number of syllables in those words. We could, therefore,
have observed, in the results of Experiment 4, an effect of both dose and phonological
similarity. It could be suggested that as the phonologically dissimilar words were
disyllabic, this condition provided a greater dose than the other speech conditions, hence
producing an increased degree of disruption, equivalent to that produced by phonological
similarity. The results of Experiment 4 could therefore be revealing two mechanisms of
interference rather than just the one. It should, however, be noted that if this position is
taken, a dose effect should have also been shown by the disyllabic words in Salamé and
Baddeley's original investigation. Furthermore, in another experiment in the same series
(Salamé & Baddeley, 1982, Experiment 4), it was shown that multisyllabic words did not
produce more interference than did speech containing monosyllabic words.
Despite that evidence, a final experiment was designed to check that disyllabic words
do not produce more disruption than monosyllabic words. The aim of Experiment 5,
therefore, was to support our contention that the results of Experiment 4 were due not to
a manifestation of both phonological similarity and dose effects, but solely to a failure to
find a phonological similarity effect.
EXPERIMENT 5
This experiment contrasted the effect of speech containing words that were
phonologically identical to the visual stimuli with speech containing either monosyllabic or disyllabic
phonologically dissimilar words. All words were edited to exactly the same duration, and
therefore the same word dose was maintained in all conditions. If dose increases as the
number of syllables in a word increases, disyllabic words should produce more disruption
of serial recall than monosyllabic words.
Method
Subjects
Eighteen undergraduate student volunteers were paid for participating in the experiment. All
subjects had English as their first language and reported normal hearing.
Apparatus and Procedure
Items to be recalled were presented serially on the screen of an Apple Macintosh IIvi
microcomputer. Lists were the same as for Experiment 4, as was the procedure.
Materials
Two of the types of irrelevant speech used, semantically similar (the digits one to nine) and
disyllabic phonologically dissimilar (tennis, jelly, tipple, etc.), were identical to those used in
Experiment 4. The third type comprised monosyllabic phonologically dissimilar words: bed, sap, pick, stop,
neck, tip, nut, cat, and duck. Word length and rate of presentation were as for Experiment 4, again
ensuring the same word dose in each condition. The three speech conditions were contrasted with a
quiet control. All words were spoken in a male voice, recorded using SoundEdit software, and stored
as "snd" resources within HyperCard. The words were recorded in a random order, with a gap of
250 msec between each one. In this manner a recording was constructed for each condition, which
lasted approximately 20 sec and could be repeatedly looped. The method of presenting sound was as
for Experiment 4.
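
The claim that equal word duration and equal gaps fix the word dose can be checked with simple arithmetic. The sketch below uses the 750 msec word duration stated for Experiment 4 (Experiment 5 is described only as using the same word length), the 250 msec gap, and the approximate 20 sec loop. The function names are hypothetical, and treating the loop as exactly 20 sec is an assumption.

```python
WORD_MS = 750  # edited duration of each word (stated for Experiment 4)
GAP_MS = 250   # silent gap between successive words
LOOP_S = 20    # approximate loop length; treated here as exact


def words_per_loop(word_ms, gap_ms, loop_s):
    """Number of whole word-plus-gap cycles that fit in one loop."""
    onset_interval_ms = word_ms + gap_ms
    return (loop_s * 1000) // onset_interval_ms


def words_per_minute(word_ms, gap_ms):
    """Word dose expressed as words heard per minute of exposure."""
    return 60_000 / (word_ms + gap_ms)


if __name__ == "__main__":
    print(words_per_loop(WORD_MS, GAP_MS, LOOP_S))  # 20
    print(words_per_minute(WORD_MS, GAP_MS))        # 60.0
```

Because the onset-to-onset interval is the same 1000 msec in every speech condition, the dose is the same whether the words are monosyllabic or disyllabic.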
Results
The mean errors at each serial position are illustrated in Figure 5. There were significant
effects of auditory condition, F(3, 51) = 7.25; p < .001, and serial position, F(8, 136) =
52.62; p < .0001, with no interaction between the two. Planned comparisons showed there
to be no differences among the speech conditions, p > 0.1 in all cases, and significant
differences between each of the speech conditions and quiet: quiet versus semantically
similar, F(1, 17) = 13.95; p < .001, quiet versus monosyllabic dissimilar, F(1, 17) = 12.45;
p < .001, quiet versus disyllabic dissimilar, F(1, 17) = 16.62; p < .001.

FIG. 5. Results of Experiment 5, showing mean errors of serial recall with respect to serial position.
Performance in quiet was contrasted with performance under conditions of irrelevant speech that contained
words that were either identical to the visual stimuli (digits), or were phonologically dissimilar and either
monosyllabic (e.g. pet, cat, duck), or disyllabic (e.g. tennis, valley, wicket).

Discussion
The results of Experiment 5 show that disyllabic words do not produce more disruption
than do the monosyllabic phonologically dissimilar words. This suggests that dose is not
affected by the number of syllables in the words comprising the irrelevant speech. As the
purpose of this experiment was to investigate an alternative explanation for the results of
Experiment 4, it is useful to consider the two sets of results together. Experiment 4
showed no effect of the phonological similarity between an irrelevant speech source
and the visually presented items of a serial recall task. This finding is contrary to the
results of Salamé and Baddeley (1982, Experiment 5) but harmonizes with the results of
several other irrelevant speech studies (e.g. Jones & Macken, 1995b, Experiment 2). It was
noted, however, that the results may have been due to the action of two mechanisms. If it
is assumed that dose increases as the number of syllables in the constituent words of the
irrelevant speech increases, disyllabic words would provide a greater dose of speech and
hence would be expected to create more interference. Experiment 5, however, shows that
monosyllabic and disyllabic phonologically dissimilar words produce equivalent amounts
of disruption, and, furthermore, they produce interference at a level comparable to that
produced by words that are phonologically similar to the visually presented digits. This
result does not concur with the hypothesis that the findings from Experiment 4 were the
result of both phonological similarity and dose. It does, however, provide further support
for the conclusion that the results of Experiment 4 were due solely to a failure to find an
effect of the phonological similarity between heard and seen items.
GENERAL DISCUSSION
The results of the present series form a clear and coherent pattern. Experiment 1 showed
there to be a monotonic relationship between the number of words in an irrelevant speech
source, the word dose, and its propensity to disrupt serial recall. Experiment 2 showed
that this effect was independent of the overall speech exposure provided by the speech.
The results of Experiment 3 suggested that the effect of word dose was due to an increase
in the amount of changing-state information contained in the speech, as steady-state
speech did not produce a significant dose effect. It was noted, however, that this result
may be due to the changing-state speech containing more phonemes in common with the
to-be-remembered material, resulting in greater phonemic confusion. This possibility was
examined in Experiment 4, which attempted to replicate the finding that speech that is
more phonologically similar to the visual stimuli produces more disruption (Salamé &
Baddeley, 1982, Experiment 5). No effect of phonological similarity was found, but it was
noted that the result may have been confounded by the fact that only the phonologically
dissimilar words were disyllabic. Experiment 5 examined the possibility that dose may be
increased by increasing the number of syllables in the words contained in an irrelevant
speech source. Monosyllabic and disyllabic words produced equivalent degrees of
disruption, implying that dose (in discrete presentation at least) is dependent on the number
of words and not the number of syllables contained in those words. Experiment 5
increases our confidence that the results of Experiment 4 were due solely to a failure
to find a phonological similarity effect. This, in turn, supports the conclusions drawn
from the first three experiments that increased disruption produced by high-dose speech
is the result of an increase in changing-state information within the speech, not an
increase in phonological confusions between the speech and the visual material.
Word Dose in Continuous Speech. Whilst it has been argued that the results point to a
word-dose effect of irrelevant speech, we have been careful to point out throughout the
paper that the experiments have only investigated the effect of discrete presentations of
words. We would also expect to see a dose effect of continuous speech; however, we would
not expect that dose would correspond to the number of written dictionary (WD) words.
A number of studies have suggested that the parsing of continuous speech is facilitated by
the location of stressed syllables (e.g. Cutler, 1989; Cutler, Mehler, Norris, & Segui, 1986;
Grosjean & Gee, 1987). It is beyond the scope of the current discussion to consider the
merits of these arguments, but it may be that dose in continuous speech would be found
to correspond to the number of parsed units or phonological words (Grosjean & Gee,
1987) created by such a segmentation process. This possibility has relevance to the
finding that serial recall is disrupted by irrelevant speech in a language that was not
understood by the subject (e.g. Jones et al., 1990). Although it has been argued that
speech segmentation processes are different in different languages (Cutler, 1993), native
English speakers will parse an unfamiliar language in the same way as they would parse
English. Therefore, we would still expect to see a dose effect, as the unfamiliar speech
would be segmented into discrete units, albeit different units from those that would be
produced by a native speaker of the language.
Implications for Auditory Short-term Memory. The irrelevant speech effect was
suggested by Baddeley and Andrade (1994) to be one of the four primary sources of evidence
for the existence of the phonological store. It was noted, however, that a deeper
understanding of the operations of these effects was necessary for a more precise model of
phonological short-term memory. We believe that the findings of the present study, in
conjunction with other reported findings, go some way towards providing that
understanding of the irrelevant speech effect, and may therefore begin to answer this call for a
more precise model.
The major novel finding of the experimental series is that the number of discrete
segmentable units in an irrelevant sound source is an important factor in its propensity
to disrupt serial recall. This was interpreted as adding to the considerable body of
literature that suggests that changing-state information is largely responsible for the
irrelevant "speech" effect. The conclusion to be drawn from these arguments is that
changing-state information has an important role in the cognitive processing of both
auditory and visual material.
A secondary finding was that, contrary to the results of Salamé and Baddeley (1982,
Experiment 5), the degree of phonological similarity between heard and seen material did
not affect the disruption. This provided at least some evidence to suggest that the effect is
unlikely to be the result of phonemic confusions. This secondary result may provide some
insight into the finding that instrumental music produces some disruption of serial recall
(Salamé & Baddeley, 1989). The conclusion drawn from that finding was that music is in
some way "speech-like" enough to gain access to the phonological store. However, it is
difficult to understand how instrumental music might create the phonemic confusions
suggested to be integral to the irrelevant speech effect. An alternative proposal is that
even continuous music is perceptually segmented into units that differ from one to the
next. Hence, in contrast to a steady-state continuous sound such as white noise, it has the
propensity to disrupt serial recall.
We would not argue that the results discussed above are sufficient to form the basis for
a more precise model of short-term phonological memory. A number of questions have
yet to be addressed concerning the exact mechanisms that produce a changing-state effect
of irrelevant speech. For instance, why should changing-state speech be afforded a greater
degree of processing than steady-state, and exactly what parameters define the degree of
changing-state information in a stimulus source? It is apparent, however, that major
revisions of Baddeley and colleagues' theory of the phonological store would have to be
made to accommodate these findings from the study of irrelevant speech. At the very
least, the assumptions that speech is the only material to gain automatic access to the store
and that the irrelevant speech effect is the result of phonological confusions between
heard and seen items would have to be dropped. Furthermore, considerable additions
would have to be made to the current model to account for the importance of
changing-state information.
REFERENCES
Baddeley, A.D. (1968). How does acoustic similarity influence short-term memory? The Quarterly
Journal of Experimental Psychology, 18, 362–365.
Baddeley, A.D. (1992a). Is working memory working? The fifteenth Bartlett lecture. The Quarterly
Journal of Experimental Psychology, 44A, 1–31.
Baddeley, A.D. (1992b). Working memory: The interface between memory and cognition. Journal of
Cognitive Neuroscience, 4, 281–288.
Baddeley, A.D., & Andrade, J. (1994). Reversing the word-length effect: A comment on Caplan, Rochon
& Walters. The Quarterly Journal of Experimental Psychology, 47A, 1047–1062.
Baddeley, A.D., & Hitch, G.J. (1974). Working memory. In G. Bower (Ed.), Recent advances in learning
and motivation, Vol. III (pp. 47–89). New York: Academic Press.
Baddeley, A.D., & Salamé, P. (1986). The unattended speech effect: Perception or memory? Journal of
Experimental Psychology: Learning, Memory and Cognition, 12, 525–529.
Colle, H.A. (1980). Auditory encoding in visual short-term recall: Effects of noise intensity and spatial
locations. Journal of Verbal Learning and Verbal Behavior, 19, 722–735.
Colle, H.A., & Welsh, A. (1976). Acoustic masking in primary memory. Journal of Verbal Learning and
Verbal Behavior, 15, 17–31.
Cutler, A.J. (1989). Auditory lexical access: Where do we start? In W. Marslen-Wilson (Ed.), Lexical
representation and process (pp. 342–356). Cambridge, MA: MIT Press.
Cutler, A.J. (1993). Segmenting speech in different languages. The Psychologist, 8, 453–455.
Cutler, A.J., Mehler, J., Norris, D., & Segui, J. (1986). The syllable's differing role in the segmentation of
French and English. Journal of Memory and Language, 25, 386–400.
Grosjean, F., & Gee, J.P. (1987). Prosodic structure and spoken word recognition. Cognition, 25,
135–155.
Hanley, J.R., & Broadbent, C. (1987). The effect of unattended speech on serial recall following auditory
presentation. British Journal of Psychology, 78, 287–297.
Jones, D.M. (1993). Objects, streams and threads of auditory attention. In A.D. Baddeley & L.
Weiskrantz (Eds.), Attention: Selection, awareness and control. Oxford: Clarendon Press.
Jones, D.M . (1994) . D isruptio n of mem ory for lipread lists by irrelevant speech: Further suppor t for the
changing state hypothesis. The Qua rterly J ourna l of Experimenta l P sychology, 47A , 143± 160.
Jones, D.M ., & B roadbent, D.E. (1991) . Hum an perfor mance an d noise. In C.M . Har ris (E d.), H a ndbook
of a coustica l mea surements a nd noise control (p p. 24.1± 24.24) . New York: M cGraw -H ill.
Jones, D.M ., & M acken, W.J. (1993) . Irrelevant tones produce an irrelevant speech effect: Im plicatio ns
for pho nological coding in working m em ory. J ourna l of Experimenta l P sychology : Lea rning, M emory
a nd Cognition, 19, 369± 381.
Jones, D.M ., & M ac ken, W.J. (1995a ). Ban d-pass noise, chan ging state an d th e ``irrelevant speech’ ’
effe ct. I n prepa ra tion.
Jones, D.M., & Macken, W.J. (1995b). Phonological similarity in the irrelevant speech effect: Within- or
between-stream similarity? Journal of Experimental Psychology: Learning, Memory and Cognition, 21,
103–116.
Jones, D.M., Macken, W.J., & Murray, A.C. (1993). Disruption of visual short-term memory by
changing-state auditory stimuli: The role of segmentation. Memory and Cognition, 21, 318–328.
Jones, D.M., Madden, C., & Miles, C. (1992). Privileged access by irrelevant speech to short-term
memory: The role of changing state. The Quarterly Journal of Experimental Psychology, 44A, 645–669.
Jones, D.M., Miles, C., & Page, J. (1990). Disruption of proofreading by irrelevant speech: Effects of
attention, arousal or memory? Applied Cognitive Psychology, 4, 89–108.
Jones, D.M., & Morris, N. (1992). Irrelevant speech and serial recall: Implications for theories of
attention and working memory. Scandinavian Journal of Psychology, 33, 212–229.
Massaro, D.W., & Cowan, N. (1993). Information processing models: Microscopes of the mind. Annual
Review of Psychology, 44, 383–425.
Macken, W.J., & Jones, D.M. (1995). Functional characteristics of the inner voice and inner ear: Single
or double agency? Journal of Experimental Psychology: Learning, Memory and Cognition, 21, 436–449.
Miles, C., Jones, D.M., & Madden, C.A. (1991). Locus of the irrelevant speech effect in short-term
memory. Journal of Experimental Psychology: Learning, Memory and Cognition, 17, 578–584.
Morris, N., & Jones, D.M. (1990a). Memory updating in working memory: The role of the central
executive. British Journal of Psychology, 81, 111–121.
Morris, N., & Jones, D.M. (1990b). Habituation to irrelevant speech: Effects on a visual short-term
memory task. Perception and Psychophysics, 47, 291–297.
Morris, N., Jones, D.M., & Quayle, A. (1989). Memory disruption by background speech and singing. In
E.D. Megaw (Ed.), Contemporary ergonomics (pp. 494–499). London: Taylor and Francis.
Salamé, P., & Baddeley, A.D. (1982). Disruption of short-term memory by unattended speech: Implications
for the structure of working memory. Journal of Verbal Learning and Verbal Behavior, 21, 150–164.
Salamé, P., & Baddeley, A.D. (1986). Phonological factors in STM: Similarity and the unattended speech
effect. Bulletin of the Psychonomic Society, 24, 263–265.
Salamé, P., & Baddeley, A.D. (1987). Noise, unattended speech and short-term memory. Ergonomics, 30,
1185–1194.
Salamé, P., & Baddeley, A.D. (1989). Effects of background noise on phonological short-term memory.
The Quarterly Journal of Experimental Psychology, 41A, 107–122.
Salamé, P., & Baddeley, A.D. (1990). The effects of irrelevant speech on immediate free recall. Bulletin of
the Psychonomic Society, 28, 540–542.
Salamé, P., & Wittersheim, G. (1978). Selective noise disturbance of the information input in short-term
memory. The Quarterly Journal of Experimental Psychology, 30, 693–704.
Original manuscript received 5 May 1994
Accepted revision received 1 November 1995