EFFECTS OF CUMULATIVE CONTEXT AND
GUESSING METHODS ON ESTIMATES OF TRANSITION
PROBABILITY IN SPEECH*
JOHN P. BURKE
New York University
and
NICHOLAS SCHIAVETTI**
Newark State College
This study investigated the effects of cumulative context and of various guessing
methods on transition probability estimates derived from the same speech materials.
Transition probability estimates obtained from the single-guess and continuous-guessing
methods were highly correlated and yielded similar distributions of scores for both
isolated and cumulative context material. Forward and backward guessing methods
yielded uncorrelated sets of predictability scores, the distributions of which were
significantly different. Cloze procedure predictability scores were highly correlated
with forward guessing and combined forward-backward guessing results, but were
significantly higher in magnitude. Implications of the effects of procedural differences
for future investigations are discussed.
INTRODUCTION
The present study is the second in a series which is part of a programme of
research whose general purpose is the investigation of methods of estimating
transition probability. The first study considered effects of single-guess versus
continuous-guessing methods (Schiavetti and Burke, 1974).
The possibility of methodological effects on resultant transition probability
estimates is strong. For example, Goldman-Eisler (1958) found that forward guessing
and backward guessing yielded equivalent transition probability estimates for some
words, but not for others. Also, Tannenbaum et al. (1965, p. 136) cited an
unpublished study by Williams which found that their Cloze procedure yielded
significantly different transition probability estimates than did Goldman-Eisler’s
combined forward and backward guessing method. Tannenbaum et al. did not
specify relevant procedural considerations such as amount of context or guessing
* Part of this research was conducted at the Speech Science Laboratory, Teachers
College, Columbia University, where the junior author was Research Associate.
The co-operation of Dr. Ronald J. Baken, Co-ordinator of the Speech Science
Laboratory, is appreciated.
** Now at New York University.
Downloaded from las.sagepub.com at PENNSYLVANIA STATE UNIV on May 17, 2016
method (single versus continuous) used in this sub-study. Therefore, although the
overall procedure yielded different results, direct comparison of the influence of each
individual procedural difference is difficult, since multiple variations in procedure
were possible. Direct comparison of such individual effects would best be accomplished
by a systematic programme of research that would consider the effects of specific
changes in procedure on resultant transition probability estimates. The purpose of
our research programme, therefore, is to specify methodological effects as a guide to
future investigators in the selection of methods of estimating transition probability in
their research.
In a previous study (Schiavetti and Burke, 1974), single-guess and continuous-guessing
methods of estimating transition probability were found to yield essentially
equivalent scores. Thus, this difference between guessing methods alone would seem
to be insufficient to account for the discrepant results of Goldman-Eisler (1958) and
Tannenbaum et al. (1965) regarding transition probabilities of words surrounding
hesitation pauses in speech. It is possible, however, that a contextual factor may
have influenced their discrepant findings, since different amounts of context were
associated with the speech materials in the two studies. Goldman-Eisler’s subjects
guessed words in several unrelated sentences, whereas the subjects in the study by
Tannenbaum guessed words in a continuous context. Thus, the contextual effects
were cumulative over a longer period of time in the Tannenbaum study.
It is also possible that contextual accumulation and method of forward guessing might interact.
That is, the two different methods of forward guessing might yield the same transition
probability estimates when applied to single sentences, but transition probability
estimates might differ for cumulative context.
The purpose of the present study, therefore, was to compare transition probability
estimates derived from sentences presented to guessers in a continuous, cumulative
context and estimates derived from the same sentences presented in isolation from
their true context. Both the single-guess and continuous-guessing methods were
employed to study the possibility of an interaction between method of guessing and
accumulation of context. Also, Cloze procedure and combined forward-backward
guessing were considered. Two experiments were carried out: the first to investigate
specific effects of guessing method and contextual accumulation, and the second,
essentially a replication of the Goldman-Eisler (1958) and Tannenbaum et al. (1965)
procedures, to investigate the complicating factor of backward contextual effects in the
use of Cloze and combined forward-backward guessing methods.
EXPERIMENT I
The specific purpose of Experiment I was to investigate the effects of contextual
accumulation and forward guessing methods on transition probability estimates.
METHOD
Subjects
The subjects were 212 college student volunteers. One hundred and six forward
guessed the isolated context materials, and 106 forward guessed the cumulative context
materials. In each group, 100 subjects (50 males; 50 females) employed the single-guess
method and 6 subjects (3 males; 3 females) employed the continuous-guessing
method. No subject had been previously exposed to the speech material.
Material
The speech material, a paragraph of neutral affect describing the use of precious
stones in jewellery design, was adapted from an article in a popular weekly news
magazine. It consisted of 10 simple sentences ranging from 10 to 15 words in length.
In each sentence, one word from the first half and one from the second half were
deleted as the target words for forward guessing. Four of the words were associated
with each of the five “word weights” used by Brown (1945) to estimate the
“prominence” of words. Thus, an even range of “prominence” of the target words
to be guessed was previously determined. Amount of context between deletions
varied from one to eight words, thus providing a range of difficulty related to the
amount of text between deletions (Fillenbaum et al., 1963).
For the cumulative context condition, a special typescript was prepared. On the
first page of the typescript, only that portion of the first sentence preceding the first
target word was included. The second page included the previous material plus the
correct word supplied in the position at which the subject had made his first guess,
plus the remainder of the sentence preceding the second target word to be guessed.
For example:
The third page presented the entire first sentence and that portion of the second
which preceded the next target word. The typescript continued in this
fashion, presenting the subject with the cumulative context preceding each target
word until the paragraph was completed.
For the non-contextual condition, a different typescript was constructed, in which
the order of the target sentences was randomized and “noise” was added in the form
of distractor sentences. Fifteen sentences, of similar neutral affect, describing urban
events, business activities, food, etc., were adapted from the same weekly news
magazine and added to the target material to distract subjects from the cumulative
context of the jewellery passage.
Distractor sentences also ranged in length from 10 to 15 words.
All 25 sentences were then randomly ordered and a special typescript
was prepared, in which each page presented the subject with the preceding context
of each individual sentence to be guessed. Thus, page one presented the context
preceding the first target word of sentence number one. Page two presented the
previous material plus the correct word supplied where the subject had guessed, plus
the remainder of the sentence preceding the second word to be guessed. Page three
began anew with the second sentence, presenting only material preceding the first target
word of the second sentence. Page four included this material plus the correct word
supplied where the subject had guessed, plus the remainder of the sentence preceding
the second word of the second sentence to be guessed. Previous sentences were,
therefore, not carried over from page to page. For example:
Procedures
Subjects in the single-guess group were told to write their first guess of the next
word in each sentence in the blank space of the typescript.
Subjects in the continuous-guessing group were instructed to guess what they thought
the next word in each sentence would be and to keep on guessing either until told
by the experimenter that their guess was correct or until one minute had elapsed. The
experimenter wrote down the guesses on another copy of the typescript. In addition,
responses were tape-recorded so that the written record of their guesses could be
checked for accuracy after the session. This was necessary since some subjects
guessed faster than the experimenter could write. The same single-guess and
continuous-guessing methods were applied to the contextual and non-contextual
materials.
Data Analysis
Transition probability for the single-guess method was taken as the percentage of
subjects correctly guessing the word.
Transition probability for the continuous-guessing method was taken as the ratio
of correct guesses to the total number of different words guessed as was suggested by
Goldman-Eisler (1958). Only one occurrence of each wrong guess was included in
the total number of guesses for each sentence. In other words, the total number of
types of guesses, rather than tokens, was considered in this measure in order to
replicate Goldman-Eisler’s procedure.
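The single-guess estimate described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the case-insensitive exact-match scoring rule, and the example responses are hypothetical and are not taken from the study. The value is returned as a proportion rather than a percentage. The continuous-guessing ratio is not sketched, because the exact counting rule follows Goldman-Eisler (1958) and is not fully specified here.

```python
def single_guess_probability(guesses, target):
    """Proportion of subjects whose single written guess equals the target word.

    Assumption: a guess counts as correct on a case-insensitive exact match.
    """
    if not guesses:
        raise ValueError("at least one guess is required")
    normalized_target = target.strip().lower()
    correct = sum(1 for g in guesses if g.strip().lower() == normalized_target)
    return correct / len(guesses)


# Hypothetical responses from five subjects for one deleted word.
example_guesses = ["stones", "gems", "Stones", "metals", "stones"]
print(single_guess_probability(example_guesses, "stones"))  # 0.6
```

Multiplying the returned proportion by 100 gives the percentage form used in the text.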
Statistics describing the distributions of the transition probability estimates were
computed. In addition, Pearson product-moment correlations were determined for
the six sets of comparisons generated by the combinations of the four experimental
conditions. The Wilcoxon matched-pairs, signed-ranks test was used to evaluate
the significance of the differences between conditions in each of the six comparisons
(Siegel, 1956).
TABLE 1
Statistics describing transition probability estimates obtained by single-guess and
continuous-guessing methods applied to isolated and cumulative context materials
RESULTS
The results presented below are based on the transition probability estimates of all
20 words. Unlike our first study (Schiavetti and Burke, 1974), none of the
continuous-guessing scores had to be eliminated from the analysis because of transition
probability estimates greater than one.
Table 1 displays the various statistics computed to describe the distributions of the
transition probability estimates. The four distributions were similar in central tendency
and in variability. Transition probability estimates derived from the continuous-guessing
method were slightly higher than those derived from the single-guess method,
as reported in our previous study (Schiavetti and Burke, 1974). The over-estimates
were approximately as would have been predicted from our previous
regression equation (Y = 1.2X + 0.06). The standard deviations and standard
errors of the means were also similar across the four conditions.
Inspection of the quartile data reveals one difference. There was a tendency toward
a slight inflation of the third quartile (and, consequently, of the semi-interquartile
range) of the continuous-cumulative condition. This was due to the fact that one
function word was guessed the first time by all six subjects in that condition, while
it was missed by some in the other groups. This was one of the two scores at the
quartile boundary, and half of the difference between this score and the next lowest
score was added to the next lowest score to compute the third quartile. Therefore,
it is possible that this atypical quartile may have resulted from an artifact produced
by the small number of subjects recommended for use in the continuous-guessing method.
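The quartile computation described above, in which the third quartile is the lower boundary score plus half the distance to the score above it, can be illustrated with a short sketch. The function names and the eight example scores are hypothetical. Applying the same midpoint rule to the first and second quartiles is an assumption, since the text describes it only for the third quartile.

```python
def quartiles_by_midpoint(scores):
    """Return (Q1, Q2, Q3) for a list whose length is a multiple of four.

    Each quartile is the midpoint of the two ordered scores that straddle
    the quartile boundary.
    """
    n = len(scores)
    if n == 0 or n % 4 != 0:
        raise ValueError("this sketch only handles lengths divisible by four")
    ordered = sorted(scores)

    def boundary(k):
        # Midpoint of the k-th and (k+1)-th ordered scores (1-based k).
        lower, upper = ordered[k - 1], ordered[k]
        return lower + (upper - lower) / 2

    return boundary(n // 4), boundary(n // 2), boundary(3 * n // 4)


def semi_interquartile_range(scores):
    """Half the distance between the first and third quartiles."""
    q1, _, q3 = quartiles_by_midpoint(scores)
    return (q3 - q1) / 2


# Hypothetical scores; the second list raises one boundary score to 1.0.
baseline = [0.0, 0.125, 0.25, 0.25, 0.5, 0.5, 0.75, 1.0]
inflated = [0.0, 0.125, 0.25, 0.25, 0.5, 0.5, 1.0, 1.0]
print(quartiles_by_midpoint(baseline)[2])  # 0.625
print(quartiles_by_midpoint(inflated)[2])  # 0.75
```

In this toy case a single score at the boundary moves the third quartile, and with it the semi-interquartile range, which is the kind of artifact the paragraph above attributes to the small continuous-guessing group.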
Table 2 displays the correlations between the various conditions of the experiment.
All correlations are highly significant and indicate strong agreement in the relative
rankings of the transition probability estimates of the 20 words among all conditions.
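For readers who wish to reproduce this kind of comparison, the Pearson product-moment correlation between two sets of word scores can be computed as sketched below. The function name and the example scores are hypothetical. The threshold of 0.503 is the value quoted in the table notes, and comparing the magnitude of r against it is an assumption about how that value was applied.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one set of scores is constant")
    return sxy / math.sqrt(sxx * syy)


CRITICAL_R = 0.503  # value quoted in the table notes for the 0.01 level

# Hypothetical scores for five words under two conditions.
condition_a = [0.10, 0.25, 0.40, 0.60, 0.90]
condition_b = [0.15, 0.30, 0.35, 0.70, 0.95]
r = pearson_r(condition_a, condition_b)
print(abs(r) >= CRITICAL_R)
```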
Table 3 presents the results of the Wilcoxon matched-pairs, signed-ranks test
(T). For a given number of pairs (N) on which there is a difference between scores,
a significant T must be less than or equal to the critical value (CV).
Thus, larger T’s indicate non-significance of differences for this non-parametric test
(Siegel, 1956, pp. 75-83). Inspection of Table 3 reveals that none of the comparisons reached
significance at the predetermined alpha of 0.01.
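The decision rule just described can be sketched as follows: pairs with no difference are dropped, the absolute differences are ranked (tied values share the average rank), and T is the smaller of the two signed-rank sums. The function names and example scores are hypothetical. The critical values are those quoted in the table notes, and the sketch assumes scores that can be compared exactly, such as whole-number percentages.

```python
def wilcoxon_t(first, second):
    """Return (T, N): the smaller signed-rank sum and the number of non-zero differences."""
    if len(first) != len(second):
        raise ValueError("both conditions must have the same number of scores")
    diffs = [a - b for a, b in zip(first, second) if a != b]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j across a run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
    negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(positive, negative), n


# Critical values of T at the 0.01 level, as quoted in the table notes.
CRITICAL_T = {14: 13, 16: 20, 18: 28, 19: 32, 20: 38}


def is_significant(t, n):
    """A difference is significant when T is at or below the critical value for N."""
    return t <= CRITICAL_T[n]


# Hypothetical percentage scores for four words under two conditions.
print(wilcoxon_t([10, 20, 30, 40], [12, 18, 30, 45]))  # (1.5, 3)
```

Only the values of N listed in the table notes are covered by `CRITICAL_T`; any other N would need to be looked up in Siegel (1956).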
The results suggest, then, that the two forward guessing methods yield similar
distributions of highly correlated estimates of transition probability for contextual and
isolated material, with slightly higher estimates derived from the continuous-guessing
method, regardless of context accumulation.
EXPERIMENT II
Since the first experiment indicated no significant effects of context or of guessing
method on resultant transition probability estimates, the specific effects of these two
variables cannot account for the discrepant results of Goldman-Eisler (1958) and
Tannenbaum et al. (1965) regarding predictability of words surrounding hesitation
pauses in speech. The problem, then, may lie in the differential grammatical and
semantic effects of neighbouring context presented by the forward and backward
guessing methods and the Cloze procedure. Since Goldman-Eisler (1958) had
indicated differences between forward and backward guessing on some words, and
since Tannenbaum et al. (1965, p. 136) had referred to an unpublished report of a
significant difference between their Cloze procedure and the combined forward-backward
guessing method, Experiment II investigated transition probability estimates
of the same material predicted by each of these methods. The specific purpose was
to describe in detail the direction and magnitude of any differences in predictability
yielded by these methods, since such data are not readily available in the Goldman-Eisler
and Tannenbaum studies. Reporting these data may help to explain the
discrepancy between the two studies regarding predictability of words surrounding
hesitation pauses in speech, and would provide guidelines for researchers in the
selection of guessing methods in future experiments.
METHOD
Subjects
The subjects were 112 college student volunteers. One hundred (50 males; 50
females) guessed the words by Cloze procedure. Six subjects (3 males; 3 females)
forward guessed and six subjects (3 males; 3 females) backward guessed the material
by the continuous-guessing method. The forward guessing subjects and data were
the same as used in Experiment I.
TABLE 2
Correlation coefficients showing the associations among all of the experimental
conditions*.
* An r value of 0.503 is significant at the 0.01 level, 19 d.f.
TABLE 3
Results of the Wilcoxon matched-pairs, signed-ranks tests for the six comparisons
yielded by the four experimental conditions*.
* Critical values of T (CV) for the given N’s are: N = 14, CV = 13; N = 16, CV = 20;
N = 18, CV = 28; N = 19, CV = 32; N = 20, CV = 38.
Material
The same speech material was used as in Experiment I. For the Cloze procedure,
a printed page was prepared in which the target words were deleted from the text.
For backward guessing, the typescript of Experiment I was reversed so that a blank
space for the target words preceded the following context of the sentence. The
second half of each sentence was shown first, and the first half was shown second.
The isolated condition of the material was used to attempt replication of
Goldman-Eisler’s procedure.
Procedures
Subjects in the Cloze procedure group were told to write their first guess of the
target words in the blank spaces of the typescript.
Subjects in the backward continuous-guessing group were instructed to guess what
they thought the preceding word in each instance had been, and to keep guessing
until told by the experimenter that their guess was correct or until one minute had
elapsed. Procedures for subjects in the forward continuous-guessing group were as
described in Experiment I.
Data Analysis
Transition probability for the Cloze procedure was taken as the percentage of
subjects correctly guessing the word. Transition probability for the backward and
forward continuous-guessing methods was taken as described in Experiment I. In
addition, a combined forward-backward continuous-guessing transition probability
estimate was computed by averaging the forward and backward result for each word.
Statistics describing the distributions of the transition probability estimates were
computed. Also, Pearson product-moment correlations were determined for the three
comparisons of interest, generated by the combinations of the experimental conditions.
The Wilcoxon matched-pairs, signed-ranks test was used to evaluate the significance
of the differences between conditions in each of the three comparisons (Siegel, 1956).
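The combined forward-backward estimate described above is a per-word average of the two continuous-guessing results. A minimal sketch follows; the function name and the example scores are hypothetical.

```python
def combined_forward_backward(forward, backward):
    """Average the forward and backward estimates word by word."""
    if len(forward) != len(backward):
        raise ValueError("forward and backward lists must cover the same words")
    return [(f + b) / 2 for f, b in zip(forward, backward)]


# Hypothetical estimates for three target words.
forward_scores = [0.5, 0.25, 1.0]
backward_scores = [0.25, 0.25, 0.0]
print(combined_forward_backward(forward_scores, backward_scores))  # [0.375, 0.25, 0.5]
```

Because the combined value always lies between the two inputs, it can never exceed the higher of the forward and backward estimates for a word.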
RESULTS
The results presented below are based on the transition probability estimates of all
20 words. Again, none had to be eliminated because of transition probability estimates
greater than one.
TABLE 4
Statistics describing transition probability estimates obtained by Cloze procedure,
forward guessing, backward guessing, and combined forward-backward guessing.
Table 4 displays the various statistics computed to describe the distributions of
the transition probability estimates. Differences in central tendency and variability
are evident.
Forward guessing yielded higher mean and median values with a greater
range, semi-interquartile range, and standard deviation of scores than did backward
guessing. The backward guessing scores were severely restricted to a narrow range
of low scores, reflecting the extreme difficulty of this task. This difficulty was
subjectively voiced by all our guessers, as also had been the case with Goldman-Eisler’s
subjects. The forward-backward scores, of course, reflect the averaged
distribution of the two methods. The Cloze results were higher in central tendency
(mean and median) than the results of the other three conditions. While the standard
deviation of the Cloze results was similar to that of the forward guessing (and higher
than the backward and forward-backward combined), the semi-interquartile range and
the three quartiles were much higher, because almost all scores were uniformly higher
in the Cloze condition.
Table 5 displays the correlations for the comparisons of interest. A low negative
correlation was found between the transition probability estimates derived from the
forward and backward guessing methods. Since this negative relationship was not
significant, it should be concluded that the two sets of data are unrelated. This is
due to the fact that there was a very limited spread of results from the backward
guessing method, while the forward guessing method yielded an obvious spread of
scores.
Thus, a backward guessed word had a low score, regardless of whether or
not the word was easy or difficult to guess by the forward guessing method.
TABLE 5
Correlation coefficients showing associations among transition probability estimates
yielded by Cloze procedure, forward guessing, backward guessing, and combined
forward-backward guessing*.
* An r value of 0.503 is significant at the 0.01 level, 19 d.f.
TABLE 6
Results of the Wilcoxon matched-pairs, signed-ranks tests for the comparison of
Cloze procedure, forward guessing, backward guessing and combined
forward-backward guessing*.
* Critical values of T (CV) for the given N’s are: N = 18, CV = 28; N = 19, CV = 32;
N = 20, CV = 38.
Significantly high positive correlations were found between forward guessing and
Cloze scores, and between combined forward-backward and Cloze scores, indicating
that the words maintained their relative predictabilities, regardless of method.
Table 6 presents the results of the Wilcoxon matched-pairs, signed-ranks test (T).
All three comparisons yielded significant differences (T’s lower than CV’s).
Results can be summarized as follows. Backward guessing results were significantly
lower than, and uncorrelated with, forward guessing results. Cloze procedure results
were significantly higher than both forward guessing and combined forward-backward
guessing results, but the relative positions of the individual word predictabilities were
maintained between Cloze results and both the forward and combined forward-backward
guessing results, as evidenced by high correlations.
DISCUSSION
The results of these experiments provide some guidelines for commenting on the
discrepant results of Goldman-Eisler (1958) and Tannenbaum et al. (1965) regarding
the predictability of words surrounding hesitation pauses in speech, and on the use
of transition probability estimates in such research.
First, the results confirm the findings of our previous experiment (Schiavetti and
Burke, 1974) regarding the equivalence of transition probability estimates yielded
by single-guess and continuous-guessing methods. Secondly, the results rule out the
possibility of accumulation of context as a factor in the discrepancy regarding words
surrounding hesitations.
Our findings on contextual accumulation are in good agreement with those of
Aborn et al. (1959) who studied the effects of context accumulation within sentence
boundaries on predictability scores yielded by Cloze procedures. They found that
predictability increased as context increased from 6 to 11 words, but remained
essentially stable as context increased from 11 to 25 words per sentence. Aborn
et al. (1959, p. 178) concluded that “The relationship between constraint and length
of context does not go on indefinitely.” The present study extends these findings
beyond sentence boundaries to the paragraph, in which the cumulative effect of
context was still negligible.
The possibility may still exist that even longer contexts may cause a reversal in this
trend, possibly due to a familiarity with the vocabulary
of a particular context over a period of time. However, since most research on
predictability has been restricted to isolated sentences or to a paragraph of speech
material, cumulative context may be considered to attain a maximum somewhere
between 5 and 10 words as suggested by Aborn et al. (1959).
Thirdly, the findings regarding the relationships between Cloze procedure and
forward and backward guessing methods are particularly important. While the Cloze
procedure yielded predictability scores that were highly correlated with the forward
and combined forward-backward guessing scores, Cloze scores were significantly
higher. Forward guessing scores were significantly higher than backward guessing
scores, and the two were uncorrelated. These findings account, in part, for the
discrepancy regarding predictability of words in the environment of hesitations.
Goldman-Eisler (1958) found that transition probability estimates of words after
hesitations were lower than those of fluent words in her speech materials. Conversely,
transition probability estimates of words before hesitations were higher than those
of fluent words. Tannenbaum et al. (1965) found that transition probability estimates
of words before and after hesitations were not significantly different from each other
and both were lower than those of other fluent words. It is interesting to note that
Goldman-Eisler found that many words were equally difficult to predict by both
forward and backward guessing methods, some were more predictable by forward
guessing, and some were more predictable by backward guessing. Thus, combined
forward and backward guessing might differentially affect some words, while Cloze
procedure might inflate the transition probability estimates of all words relative to
forward and backward guessing.
Inspection of our raw data indicated that predictability of all but one of the
words (as measured by backward or by forward guessing) was inflated by Cloze
procedure. About half of the words were of low predictability by both forward and
backward guessing methods, and the remainder were of high predictability by forward
guessing and low predictability by backward guessing. No words were of high
predictability by both forward and backward guessing methods. Thus, while almost
all scores were inflated (relative to both forward and backward guessing) by Cloze,
some were deflated (relative to forward guessing) by combined forward-backward
guessing, some were inflated (relative to backward guessing), and some were unaffected
(relative to forward and backward guessing). The differential effects of the two
procedures (Cloze v. combined forward-backward guessing) may have accounted for
the discrepancy between the results of Goldman-Eisler and Tannenbaum et al.
regarding words in the environment of hesitations. Of course, another possibility is
to be found in the speculation of Tannenbaum et al. that different types of hesitations
may be related to the different findings regarding predictability. Obviously, further
research is needed to examine the different hesitation types hypothesis, as is also
suggested by Tannenbaum et al., but future research must also consider differences in
transition probability yielded by different guessing methods.
Interestingly enough, the possibility of differential effects of guessing methods is
also highlighted by the findings of Aborn et al. (1959) who concluded (p. 179) “a
bilaterally distributed context exerts greater constraint than a totally preceding or
totally following context of the same length.” Thus, since the global view of context
offered by Cloze procedure is not the same as the view of context offered to a forward
or a backward guesser, the different “viewpoint” of the guesser must be considered
in all future research employing transition probability estimates in speech. It is,
therefore, suggested that investigators be consistent in applying one method in future
studies. If different methods must be employed, then studies must take into account
the possible differences in results due to experimental method alone.
A deeper question concerns the validity of the various methods as estimators of
transition probability. In reading studies, for example, Cloze procedure would appear
to have the most face validity, since a reader views simultaneous preceding and
following context on a printed page. On the other hand, in studies of speech, the
forward guessing method might have more face validity because a listener has only
preceding context available at the instant of word production. It might also be
argued, however, since the temporal relationships of words involve such small time
frames, that the Cloze procedure is equally valid for predicting words in the midst
of a temporally compact context. Also, the speaker, rather than the listener, may be
the focus of a particular study (as in disfluency investigations) and he has at least
part of the following context somewhat at his disposal while producing a particular
word.
The question of validating each particular method is a difficult one, also involving
the problems of reliability (a subject we have not found previous reference to in the
literature), and of external validity criteria. Many assumptions so far have been based
on face validity, but external validity, defined as the consistency of the relationship
between transition probability estimates and phenomena assumed to be associated
with information transfer (e.g., hesitation phenomena), needs to be considered.
However, the argument may be circular, since the assumptions regarding association
with information transfer may often be based on observations of transition probability
estimates obtained with a particular method. Thus, we may be forced to rely on
operationalism and face validity in selecting a particular method. Therefore, methods
which offer greatest reliability and theoretical face validity will have to suffice for the
present. These and other questions regarding the generalizability of transition
probability estimates yielded by a particular method for use with different subjects
under different research conditions with different materials should be the focus of
future research.
REFERENCES
ABORN, M., RUBENSTEIN, H. and STERLING, T. D. (1959). Sources of contextual constraint
upon words in sentences. J. exp. Psychol., 57, 171.
BROWN, S. F. (1945). The loci of stutterings in the speech sequence.
J. Speech Dis., 10, 181.
FILLENBAUM, S., JONES, L. V. and RAPOPORT, A. (1963). The predictability of words and
their grammatical classes as a function of rate of deletion from a speech transcript.
J. verb. Learn. verb. Behav., 2, 186.
GOLDMAN-EISLER, F. (1958). Speech production and the predictability of words in context.
Quart. J. exp. Psychol., 10, 96.
SCHIAVETTI, N. and BURKE, J. P. (1974). Comparison of methods of estimating transition
probability in speech. Language and Speech, 17, 347.
SIEGEL, S. (1956). Nonparametric Statistics (New York).
TANNENBAUM, P. H., WILLIAMS, F. and HILLIER, C. S. (1965). Word predictability in the
environment of hesitations. J. verb. Learn. verb. Behav., 4, 134.