ON THE COMBINATION OF ASSOCIATIVE PROBABILITIES IN LINGUISTIC CONTEXTS

By DAVIS HOWES, Aero Medical Laboratory, Wright-Patterson AFB, and CHARLES E. OSGOOD, University of Illinois

Source: The American Journal of Psychology, Vol. 67, No. 2 (Jun., 1954), pp. 241-258. Published by: University of Illinois Press. Stable URL: http://www.jstor.org/stable/1418626

It is a commonplace that meanings of words depend upon the contexts in which they occur.
This dependence sets a fundamental problem in the psychology of language: calculation of the psychological effects of a word in its context from the individual properties of the word and of the contextual elements. Recognizing a conventional distinction between the linguistic and non-linguistic contexts of a person's speech, we may conveniently subdivide the former into (a) the homogeneous linguistic context, the context provided by his own previous language behavior; and (b) the heterogeneous linguistic context, that provided by the utterances of other persons in his environment. Recent work by Shannon has provided a statistical model for describing certain problems of homogeneous sequences and has stimulated several experimental studies in that area.1 In the present paper we shall be concerned solely with heterogeneous linguistic contexts; i.e. with the prediction of the language behavior of an experimental subject from the language behavior of another person in his environment.

For experimental purposes we take a sequence of four words spoken by an experimenter (E) and measure as a dependent variable the probability that a given word will be emitted as an association to the last word of the sequence. This is a modified form of the conventional word-association experiment. Since the sequence is spoken by E, the properties of the words constituting it can be controlled as independent variables. The strength of the associative effect of each of the first three words of the sequence upon the subject's (S's) response to the fourth word is the property investigated in the following experiments. Three studies will be reported. If we designate the first three words of the four-word sequence as the context and the fourth word as the test-word, the independent variables defining the three experiments are as follows: (1) interposition, the number of stimulus-words separating a given contextual word from the test-word; (2) density, the number of contextual words having similar associative effects; (3) frequency of occurrence, the number of times a contextual word occurs in a general sample of the language under study. Since the experiments attempt to predict the probability that a particular association will be emitted following stimulation with a sequence of words from knowledge of its probability following single words, these studies take the general form of experiments on the combination of word probabilities.

* Accepted for publication October 23, 1952.
1 C. E. Shannon, A mathematical theory of communication, Bell Syst. Tech. J., 27, 1948, 379-423, 623-656; G. A. Miller and J. A. Selfridge, Verbal context and the recall of meaningful material, this JOURNAL, 63, 1950, 176-185; E. B. Newman, The pattern of vowels and consonants in various languages, this JOURNAL, 64, 1951, 369-379; C. E. Shannon, Prediction and entropy of printed English, Bell Syst. Tech. J., 30, 1951, 50-64.

Definitions and symbols. From the above description it can be seen that the basic idea requiring definition is that of the probability of a word occurring as an association.

For the general case we can define the associative word-probability $p_{j_1, j_2, \ldots, j_m}(i_1, i_2, \ldots, i_n)$ for two populations of subjects U and W as the limiting relative frequency with which the sequence of words $i_1, i_2, \ldots, i_n$ is emitted by population U following the emission of the sequence of words $j_1, j_2, \ldots, j_m$ by population W. To give the definition experimental meaning, certain boundary conditions have to be specified. First, prediction will be attempted for the probability of only the first word of the sequence of associated words, $i_1$ (which can therefore be written simply as i). Similarly, only sequences of four stimulating words j will be considered. The stimulating population W will be a single person (one of the Es), and the associating population U will consist of a class of college students. One other very important condition must be specified, viz. the instructions given the Ss.
These are, briefly, to associate only to the final word of the stimulating sequence. This final word will therefore be called the test-word and will be designated $j_t$. The entire sequence of four stimulating words, called a word-set, will be written $j_1, j_2, j_3, j_t$ to indicate the special emphasis placed on the test-word by the instructions. The three words that precede the test-word constitute a context. The form of our experiments can thus be represented by the expression $p_{j_1, j_2, j_3, j_t}(i)$. In words, this expression represents the probability that word i will occur as the association to a sequence of four words when the Ss are instructed to associate to the last word only.

Associative probabilities following single-word stimulation, such as those reported by Kent and Rosanoff,2 will be called first-order associative probabilities and written $p_j(i)$. Since the strength of these associations is the basic quantity in the present experiments, subscripts will be used to indicate different values of this variable. Large first-order associative probabilities will be indicated by letter subscripts to the stimulus-symbol, small first-order associative probabilities by numerical subscripts. Thus $p_{j_a}(i) \gg p_{j_1}(i)$. It will be convenient to define the associative effect of a stimulus-word j as the probability of the association i following j in the conventional word-association experiment. In the above inequality, for example, a word represented by $j_a$ (e.g. man) can be said to have a stronger associative effect than $j_1$ (boy) upon the association i (woman) for the population studied by Kent and Rosanoff, since in their tables the relative frequency with which woman is associated to man is 0.394 while its relative frequency of association to boy is only 0.002.

2 G. H. Kent and A. J. Rosanoff, A study of association in insanity, Amer. J. Insanity, 67, 1910, 37-96, 317-390.

These symbols
for the strength of first-order associative effects will be carried over to the word-sets: thus, $p_{j_a, j_1, j_2, j_t}(i)$ indicates that $p_{j_a}(i)$ and $p_{j_t}(i)$ are large relative to $p_{j_1}(i)$ and $p_{j_2}(i)$.

The effect of the context of a word-set upon the probability of the association i will be measured as the difference between the probability of i following the word-set containing that context and the probability of i following the test-word with a control context consisting of nonsense words and three-place numbers (represented by $j'$). It is assumed that such contextual materials affect the associative response negligibly. Values of $p_{j_t}(i)$, estimated from the Kent-Rosanoff tables, can also be put to this purpose. But the Kent-Rosanoff data can be applied to the present population only with reservation in view of the type of Ss they used and the historical period in which their measurements were made.

The extent to which the word-set constitutes a meaningful sequence that might be emitted in the course of ordinary continuous speech can be expected to influence the associative response. Response to the word-set "Mary had a little," for example, can be expected to differ greatly from that to a set, such as "a had Mary little," which presents the same components in a less familiar sequence. To control for this factor we can specify that the transitional probabilities for populations U and W shall be negligible, where the transitional probability $p^*_{j_{k-1}}(j_k)$ is defined as the probability that word $j_k$ will follow a given word $j_{k-1}$ in samples from the continuous discourse of a stated population. Populations U and W in the present case both represent samples of the general population of English-speaking persons of college education.
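The definition of the transitional probability $p^*_{j_{k-1}}(j_k)$ can be illustrated with a short sketch. It is not the procedure of the present experiments, which rely on subjective judgment for want of suitable tables; it only shows how the quantity might be estimated as a relative frequency if a sample of continuous discourse were available. The function name and the toy sample are assumptions introduced for illustration.

```python
from collections import Counter

def transitional_probability(tokens, prev_word, next_word):
    """Relative frequency with which next_word immediately follows prev_word."""
    # Count every adjacent pair, and every word that has a successor.
    pair_counts = Counter(zip(tokens, tokens[1:]))
    prev_counts = Counter(tokens[:-1])
    if prev_counts[prev_word] == 0:
        return 0.0  # prev_word never occurs with a successor in this sample
    return pair_counts[(prev_word, next_word)] / prev_counts[prev_word]

# Toy sample of continuous discourse (hypothetical, for illustration only).
sample = "mary had a little lamb and mary had a little dog".split()

print(transitional_probability(sample, "little", "lamb"))  # familiar order
print(transitional_probability(sample, "had", "mary"))     # unfamiliar order
```

A word-set such as "a had Mary little" would be admissible under the control described above only if every adjacent pair in it received a negligible estimate of this kind.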
It should be clear that the symbols defined here represent only experimental variables that we have attempted to control in the present studies, and do not imply that the dependent variable can be completely specified by these concepts. All kinds of complex interactions, in addition to those specified by the transitional probabilities of the word-set, may modify the effects upon the subject of each contextual word. Indeed, one of the purposes of the experiments is to ascertain the extent to which a few simple conditions, such as those defined above, can determine the probability of a given associative response. The elucidation of complex interactions between the effects of different elements of a word-set is a natural corollary of such an experimental program.

All measurements necessary to specify the independent variables of these experiments could be obtained from extensive tables of first-order associative and transitional probabilities analogous to those published for the former by Kent and Rosanoff and others.3 Sufficiently extensive tables, unfortunately, exist in neither case, so that it has been necessary to make subjective estimations of these probabilities.

3 Kent and Rosanoff, op. cit.; J. O'Connor, Born That Way, 1928; Herbert Woodrow and F. Lowell, Children's association frequency tables, Psychol. Monog., 22, 1916, No. 97, 1-110.

Little validity can be imputed to subjective estimations of fine differences in associative or transitional probabilities; the experimental designs have therefore been simplified to require only these very simple judgments: (1) that a given word is a highly probable association, in the conventional word-association test, to a second given word; (2) that a given word is a highly improbable association to a second given word; and (3) that a given sequence of words will occur only very rarely (if at all) in ordinary continuous discourse, and hence represents very low transitional probabilities.
The design of the experiments makes it necessary that the first two judgments be valid only relative to each other; thus only two subjective judgments are really required for the experiments. Selection of sequences with low transitional probabilities can be facilitated by eliminating from the word-sets all connective words (articles, conjunctions, prepositions, etc.), since sequences of high transitional probability usually include one or more of these words. While no validity studies specific to these judgments have been conducted, the ability of subjects to estimate the relative probabilities of occurrence of words would indicate that such simple judgments can be carried out with small error.4 Such estimates, of course, do not possess the objectivity of measurements that are independent of the experimenter's judgment, and the results to be presented below must be interpreted with that limitation in mind. At the same time, it should be emphasized that the difficulties are not inherent in the conceptual design of the investigation, but are artifacts of a temporary lack of empirical information. Whenever more extensive tables of associative and transitional frequencies are available, the present experiments can be replaced by more rigorous ones. Subjects and instructions. Approximately 200 men, students in the introductory psychology course in Yale College, served as Ss in these experiments. They were divided into four groups,5 each of which was given the same test-words but different contexts. The following instructions were read to all the groups. This is an experiment on what is known as 'free association.' When a person calls out a word and you say (or write) the first word that comes into your head, that is a free association. Thus, for example, if I should say tree you might immediately think trunk or leaf or green or sleep, or any word whatsoever. 
The present experiment differs from this simple type in one respect: I will read four words slowly and you are to listen carefully to all of them, but you are to free-associate to the last word only. Thus if I should say 'toy, come, wretched, book,' you would then write down the first word that book made you think of. This is important: listen to each word carefully, but respond only to the last one. Be sure to get the first association down. There will be 50 such word-sets altogether. I will call out the number of the set before I read off the words. Although the test-words will all be meaningful, the other three will sometimes be nonsense words or numbers. Try to avoid responding with one of the words in the set, but write it down if it comes up strongly.

Before the experiment was begun the main rules were repeated: "(1) Listen to all four words carefully. (2) Free-associate to the last word only. (3) Put down the first word that comes to mind; write it immediately. (4) Try to avoid writing down one of the words used in the word set." To set the test-word off from the context, E paused slightly between reading the last word of the context and the test-word and enunciated the latter with an intonation of finality (the tacit period of a spoken sentence). Each word-set required about 5 sec. to read. The same E read the instructions and word-sets to all groups of Ss.

4 D. Howes, The definition and measurement of word probability, Doctoral Dissertation, Harvard University, 1950, 33-36.
5 Scheduling difficulties resulted in a slight imbalance among the groups. The actual number in each group varied from 46 to 57.

Materials. The first 5 word-sets read to each group were given to show the Ss the procedure. Of the remaining 45 sets, 10 were devoted to Experiment 1 (interposition), 10 to Experiment 2 (density), and 8 or 7, depending on the group, to Experiment 3 (frequency of occurrence).
The remaining word-sets concerned variables not reported in this paper (the word-sets for these experiments appeared in random order throughout the list of 50).

Since the Ss were instructed to avoid using contextual words as associations, it was necessary to exclude from each context any word likely to occur as an associate to the corresponding test-word. To insure this condition, all test-words were taken from the list of 100 stimulus-words for which Kent and Rosanoff published 1000 associations, and no word was admitted to a context if in those tables it appeared as an association to the test-word with a frequency greater than 0.005.

Treatment of data. The dependent variable in these experiments, $P_w(i)$,6 is the proportion of Ss who respond to a word-set with a particular association, i. The associations to be so measured must be selected carefully for the probability with which they occur as first-order associations to the test-word. For if $p_{j_t}(i)$ were very high, no context could effect a considerable increase in the probability of i, which must always be less than unity; while too low a value of $p_{j_t}(i)$ would be impossible to measure accurately with groups of only 50 subjects. If a cluster of associations, all having similar first-order associative relations to the contextual words, is taken as the dependent event i, the first-order associative probability of each response-word can be small while the probability of the whole cluster can be large enough to present no serious difficulty of measurement in samples of 50. This solution has been followed in the present experiments. For the word-set: devil, fearful, sinister, dark, for example (a set, used in the density experiment, in which each contextual word should have a strong associative effect upon the response), the frequency of occurrence of i was tabulated for the associative cluster (bad, evil, fear, fright, frightening, gloomy, hell, mysterious, scared, scary).
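The scoring rule just described, in which $P_w(i)$ is the proportion of Ss whose response falls anywhere in the associative cluster, can be sketched as follows. The cluster is the one quoted above for the test-word dark; the list of responses is invented for illustration and is not data from the experiments, and the function name is an assumption.

```python
def cluster_probability(responses, cluster):
    """Proportion of responses that fall in the associative cluster."""
    if not responses:
        raise ValueError("no responses to score")
    members = {word.lower() for word in cluster}
    # A response counts once if it matches any member of the cluster.
    hits = sum(1 for r in responses if r.strip().lower() in members)
    return hits / len(responses)

# Cluster quoted in the text for the word-set: devil, fearful, sinister, dark.
DARK_CLUSTER = ["bad", "evil", "fear", "fright", "frightening",
                "gloomy", "hell", "mysterious", "scared", "scary"]

# Hypothetical responses from ten Ss (invented for illustration only).
responses = ["light", "hell", "night", "evil", "room",
             "night", "scary", "light", "black", "Fear"]

print(cluster_probability(responses, DARK_CLUSTER))
```

Pooling the cluster in this way lets each member word remain a rare first-order associate of the test-word while the pooled proportion stays large enough to measure in a group of about 50.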
The chief danger in specifying these clusters is that personal idiosyncrasies in the meanings of words will influence the selection. Several precautions were observed in order to minimize this effect: (1) each E examined all 200 associations made to a given test-word without distinguishing among those made under different experimental conditions; (2) each E excluded any word about which he felt doubtful (i.e. with regard to its associative relations to the contextual words); (3) separate lists of associative clusters were drawn up by the two Es, and no important differences appear when results are calculated separately for each list; (4) a final list was prepared, including only words listed by both Es, in which a still more strict criterion for selection was adopted. Calculations given below are for this final list. As additional assurance against the problem posed by associations of high probability, any word occurring to the test-word with a relative frequency of 0.02 or higher in the Kent-Rosanoff tables, or of over 0.04 to the control word-sets with nonsense-word contexts, was excluded from the clusters.

6 The subscript w appears here as a general symbol for word-set.

While it is unfortunate that these subjective estimations had to be made, it should be remembered that the discrimination that is basic to all these approximations is simply the ability to select one set of words that occur as associations to a given stimulus-word much more frequently than to a second set of words. This discrimination is made especially easy in the present study by the fact that each associative cluster need include only a few words. Thus any word about which there was doubt could be eliminated.
As will be seen, the experimental design itself provides a further safeguard: any word selected for an associative cluster is counted in all experimental groups, including the control, and poor selection would tend to minimize effects of the experimental variables rather than to exaggerate them.

Results: Experiment 1: Interposition. It can be expected that an increase in the number of stimulus-words, $j_1, j_2, \ldots$, that are presented between a given contextual word $j_a$ and the moment of the S's response will tend to decrease the effect of $j_a$ upon the associative response. The present experiment explores this relation.

The experiment involves measurement of the probability of a given associative cluster i under three conditions: one in which the experimental word $j_a$ is the first word of the context (hence farthest removed from $j_t$); a second in which it is the second word of the context; and a third in which it is the last word in the context (hence nearest to $j_t$). The remaining words in the context are neutral with respect to i. A control context made up of 3-place numbers or nonsense words of two or three syllables must be added. The four word-sets of this experiment can thus be represented:

Condition I, $j_a, j_1, j_2, j_t$;
Condition II, $j_2, j_a, j_1, j_t$;
Condition III, $j_1, j_2, j_a, j_t$;
Condition IV, $j'_1, j'_2, j'_3, j_t$. [1]

For example, let $j_a$ = skin, $j_t$ = rough, and consider the association hands. The word-sets read to the Ss might then be: I, skin, hour, utter, rough; II, hour, skin, utter, rough; III, utter, hour, skin, rough; IV, 318, hokiba, rafuny, rough. It is assumed that neither hour nor utter has an appreciable effect upon the occurrence of hands as an association.

The amount of data this experiment generates can be tripled by selecting second and third associations different from the first one. Representing by $i_\alpha$ the first association, which is a strong first-order associate of $j_a$, we can write the two other associations $i_\beta$ and $i_\gamma$.
We now select a word that has a strong first-order associative effect on $i_\beta$ but negligible first-order associative effects on $i_\alpha$ and $i_\gamma$. This word, which can be represented by $j_b$, can be used in place of the neutral word $j_1$ of the contexts, and thus its separation from $j_t$ will be varied just as the separation between $j_a$ and $j_t$ is varied. A third word, $j_c$, can be chosen in like manner for strong first-order associative effect on $i_\gamma$ and negligible effects upon $i_\alpha$ and $i_\beta$. It can replace $j_2$ in the word-sets of Equation [1]. We can then rewrite the first three word-sets of the experiment:

Condition I, $j_a, j_b, j_c, j_t$;
Condition II, $j_c, j_a, j_b, j_t$;
Condition III, $j_b, j_c, j_a, j_t$; [1a]

where it is assumed that $p_{j_a}(i_\alpha)$, $p_{j_b}(i_\beta)$, and $p_{j_c}(i_\gamma)$ are large relative to the remaining associative probabilities, $p_{j_a}(i_\beta)$, $p_{j_a}(i_\gamma)$, $p_{j_b}(i_\alpha)$, $p_{j_b}(i_\gamma)$, $p_{j_c}(i_\alpha)$ and $p_{j_c}(i_\beta)$.

For illustration, let $j_t$ = rough, $i_\alpha$ = hands, $i_\beta$ = storm, and $i_\gamma$ = rocky, and let the three contexts of [1a] be: I, skin, wind, mountain; II, mountain, skin, wind; III, wind, mountain, skin. We then determine the combined probability of hands following both skin and rough when zero, one, or two neutral contextual words are interposed between them, just as before. In addition, the same calculations can be made for storm following wind and rough and for rocky following mountain and rough. Thus a single word-set provides three determinations of the function.

Fig. 1 shows the probability of associative clusters, $P_w(i)$, as a function of the number of interposed neutral words. Thirty determinations were made for each number of interposed words, three for each of the 10 word-sets. Thus $P_w(i_\alpha)$ is plotted opposite the abscissal number zero for Condition III, in which $j_a$ is the last contextual word, and opposite the number two for Condition I, in which $j_a$ is the first contextual word and separated from $j_t$ by two words neutral with respect to i. Mean values of $P_w(i)$ are indicated by the circles and solid line, quartiles by triangles and dashed lines. At the right of the graph, opposite the abscissal point labeled C, are shown the results for the control condition.

FIG. 1. THE PROBABILITY OF AN ASSOCIATIVE CLUSTER, $P_w(i)$, AS A FUNCTION OF m, THE NUMBER OF NEUTRAL WORDS INTERPOSED BETWEEN THE EXPERIMENTAL WORD AND THE TEST-WORD. Circles and solid line represent means of 30 determinations; triangles and broken lines represent quartiles. Results for the control condition are shown above the letter C.

These data show that a contextual word has its greatest effect upon the association when it occurs immediately prior to the test-word. The word's effect is considerably diminished by introducing another word between it and the test-word. Interposition of two rather than one neutral word results in no appreciable further decrease in the effect of the experimental word. In each position the contextual words have a greater effect than have the control contexts. These trends are equally clear for means and for both quartiles. Statistical analysis, the results of which are presented in Table I, bears out these conclusions.

Ideally, this experiment would consist of a very long context in which the associative relations that define the experiment were preserved. Then the dissipation of a contextual word's effect on the probability of an associative cluster could be described as a function of the number of interposed neutral words with the total number of words in the context as a parameter. For very large numbers of interposed words the effect of a contextual word should become negligible, and $P_w(i)$ should therefore approach its value in the control condition. In Fig.
1, however, the tendency is for $P_w(i)$ to approach a value higher than that found in the control condition. In our opinion, this is not merely an indication of inaccuracy of the data, but results from the reinforcement of the first word of a context by an additional factor. This reinforcement can be thought of as a consequence of greater attention paid the first word of a sequence, or as a result of the fact that the first contextual word is the only one that is free of the competitive tendencies aroused by a prior word. Another possibility is that the transitional effects of the contextual words upon each other are not as negligible as assumed. The problem is amenable to straightforward experimental investigation.

TABLE I
SIGNIFICANCE OF DIFFERENCES IN EXPERIMENT 1
Values of t (df. = 29), with corresponding p values, for differences in $P_w(i)$ when various numbers of neutral words are interposed between the experimental word and the test-word

Compared with           Number of interposed words
                        0         1         2
1 word        t         3.48
              p         .01
2 words       t         2.94      0.12
              p         .01       .9
Control       t         4.47      2.67      2.46
              p         .01       .02       .02

Experiment 2: Density. For this experiment we take three words having strong first-order associative effects upon i (experimental words) and compare the probability of i following a context including only one of them with its probability following two of them or three of them.

The word-sets necessary for this experiment may be specified as follows:

Condition I, $j_a, j_1, j_2, j_t$;
Condition II, $j_a, j_b, j_2, j_t$;
Condition III, $j_a, j_b, j_c, j_t$;
Condition IV, $j'_1, j'_2, j'_3, j_t$,

where $j_a$, $j_b$, $j_c$ are contextual words presumed to have strong first-order associative effects upon the same associative cluster, i, and $j_1$ and $j_2$ are words presumed to have negligible first-order effects upon i. Word-set IV refers to the control condition.
An illustration can be provided by the word-sets: I, devil, eat, basic, dark; II, devil, fearful, basic, dark; III, devil, fearful, sinister, dark; and IV, 429, 124, 713, dark, where the probability of associate hell is to be measured. On the average, the three experimental words can be considered to have approximately equal first-order associative effects upon i, for the contextual words were assigned at random to $j_a$, $j_b$, and $j_c$ in the various word-sets.

The probability of i following word-sets containing one, two, and three experimental words appears in Fig. 2. As in Fig. 1, circles and solid lines represent means, while triangles and dashed lines represent quartiles. Each point is based on 10 measurements, one for each word-set. Results for the control condition are shown opposite the letter C. The probability of an associative cluster is seen to be an increasing function of the number of contextual words having strong first-order associative effects upon that cluster. Statistical analysis, summarized in Table II, indicates that none of the differences in Fig. 2 would be expected to occur by random sampling as often as one time in a hundred.

FIG. 2. THE PROBABILITY OF AN ASSOCIATIVE CLUSTER, $P_w(i)$, AS A FUNCTION OF n, THE NUMBER OF CONTEXTUAL WORDS HAVING STRONG FIRST-ORDER ASSOCIATIVE EFFECTS ON i. Circles and solid line represent means of 10 determinations; triangles and broken lines, quartiles. Results for the control condition are shown above the letter C.
In these raw data the effects of contextual words, which constitute the independent variable, are confounded with those of the test-word. These effects must be separated, for test-word and contextual words cannot be treated uniformly in view of the emphasis placed upon the former by the instructions. It has been suggested previously that the effect of a context can be measured by the difference between the probability of an associative cluster following an experimental word-set and either (a) its probability following the corresponding control context or (b) its first-order associative probability following the test-word (i.e. its Kent-Rosanoff frequency). Either of these procedures tacitly assumes that the associative effects of context and test-word are algebraically additive. Since the effects of the independent variable can be measured by the present technique only if the contribution of the test-word can be extracted, an assumption of this type is indispensable.

TABLE II
SIGNIFICANCE OF DIFFERENCES IN EXPERIMENT 2
Values of t (df. = 9), with corresponding p values, for differences in $P_w(i)$ when different numbers of words with strong first-order associative effects appear in the context

Compared with           Number of contextual words
                        1         2         3
2 words       t         2.97
              p         .02
3 words       t         5.49      5.66
              p         .01       .01
Control       t         3.35      4.73      6.58
              p         .01       .01       .01

A test of the additivity assumption can be obtained from the data presented in Fig. 2. We assume that the first-order associative effects of each member of a word-set are additive. Then the difference $\Delta P_w(i)$ between $P_w(i)$ following word-sets I, II, or III and $P_w(i)$ following the corresponding control word-set (IV) should be directly proportional to n, the number of contextual words having strong associative effects upon i. Taking the interposition variable of Experiment 1 into account, the theoretical
Taking the interpositionvariableof Experiment1 into account,the theoretical ASSOCIATIVEPROBABILITIESIN LINGUISTIC CONTEXTS 251 valueof AP,(i) for a contextof Experiment2 is: APw(i) =r Kpr(i), [3] where K is a weighting factor,evaluatedfrom the resultsof Experiment1, which depends upon the number of words interposed between j, and jt.7 These weights, obtained by subtractingthe probabilitiesof the associative clustersunder control conditionsof Experiment1 from their probabilities under the correspondingexperimentalconditions, are 0.025, 0.028, and 0.071 for, respectively,2, 1, and 0 interposed neutral words. Thus the following theoretical values of APw(i) for Experiment 2 are given by Equation [31: Condition I (one experimentalword), 0.025; Condition II (two experimentalwords), 0.053; Condition III (three experimental words), 0.124. As the design of Experiment1 requiredthree associative clustersfor each word-setwhile Experiment2 requiredonly one, the wordsets of the latter tend to be somewhat larger.This differencein size of associativeclusters can be correctedby multiplying the theoreticalvalues by a constant which equates theoreticaland experimentalvalues for any one of the experimentalconditions. Adjusted in this manner for Condition I, in which only one contextualword has a strong associativeeffect, the theoreticalmeans for Experiment2 are 0.050, 0.106, and 0.248, comparedwith the experimentallyobtainedmeans (data from Fig. 2) of 0.050, 0.137, and 0.245. For neither the two- nor the three-wordexperimental conditionsdoes the theoreticalvalue representa significantdeparturefrom the experimentallyobtainedmean (ts are, respectively,1.03 and 0.08 with 9 df.). This comparisonis based upon the assumptionthat the probabilityof the associative cluster unaffectedby contextualwords can be measuredby the probabilityof the cluster following the control context. 
There is some reason to believe that this method may give too high a value, however, for in a few cases the numbers of control contexts probably had an appreciable associative effect. A second comparison of theoretical and observed values for Experiment 2 was therefore made, using the relative frequency of the words of the associative clusters in the Kent-Rosanoff tables in place of the control-condition values of $P_w(i)$. Computed by this method, the theoretical means are 0.056, 0.116, and 0.228 and the empirical means are 0.056, 0.143, and 0.251. Again the t's fail to approach statistical significance for the two- and the three-word conditions.

7 This assumes that the results of Experiment 1, in which the interposed words were neutral with respect to the measured associative cluster, hold also for the interposition of words having strong associative effects on it.

The assumption that associative effects are algebraically additive is thus consistent with the present data. It can be expected to hold, however, only within the limitations of two defining conditions of these experiments: (1) that the transitional probabilities of the contextual words are negligible; and (2) that the component associative effects are not too large. The latter restriction is imposed by the logical requirement that the sum of the component associative effects be less than unity. As for the former restriction, a word-set consisting of a very familiar sequence (e.g. "Mary had a little") would almost certainly lead to a disproportionate number of associations of the completion or speech-habit type (e.g. lamb).8 It is also probable that the associative effects of the component words, or their weights in influencing the response, would be modified by the transitional probabilities (cf. Experiment 3 below).
Even thus qualified, the assumption of additivity should be accepted only with considerable caution, since the possibility remains that even within the present defining conditions some word-sets can be found that will yield results incompatible with the assumption.

⁸ For the meaning of these classifications cf. R. S. Woodworth, Experimental Psychology, 1938, 350-352.

Experiment 3: Frequency. The extent to which the associative response to a word-set is determined by the first-order associative effects of a particular contextual word may be expected to depend upon how familiar S is with that word. In this experiment we compare the effects of two contextual words, $j_a$ and $j_b$, for which $p(j_a) > p(j_b)$, where $p(j)$ is the probability of occurrence of $j$ in a general sample of the language behavior of the population under study. The further condition is imposed that the two words have approximately equal first-order associative effects upon the associative cluster, i.e. that $p_{j_a}(i) = p_{j_b}(i)$. In lieu of the tables of first-order associative probabilities called for by this condition, a subjective approximation was attempted by selecting pairs of words that are closely synonymous, as judged by the experimenters and corroborated by a thesaurus.⁹ The following word-sets then define the experiment:

Condition I: $j_1, j_2, j_a, j_t$;
Condition II: $j_1, j_2, j_b, j_t$;
Condition III: $j'_1, j'_2, j'_3, j_t$. [4]

For an example let the synonyms praise and panegyric be the contextual words $j_a$ and $j_b$ and let glory be the association $i$ made to the test-word soldier.

⁹ C. O. S. Mawson, Roget's Thesaurus of the English Language in Dictionary Form, 1940.

The mean probabilities of the 10 associative clusters obtained under each of the three conditions of this experiment are as follows: control (Condition III), 0.032; infrequent-word (Condition II), 0.035; frequent-word (Condition I), 0.075.
The differences are statistically reliable between Conditions I and II (t = 2.64, p < 0.05) and between Conditions I and III (t = 3.46, p < 0.01). The difference between Conditions II and III is not significant. These results indicate that the frequency with which a contextual word occurs in the general language behavior of a population can be regarded as a factor weighting that word's contribution to the associative response. The insignificance of the difference between the control and infrequent-word conditions then implies that the average weight of the infrequent words in this experiment was so low that their effective contribution to the response was little greater than that of neutral items.

Next let us express the contextual word's weight in determining the associative response as a function of its frequency of occurrence. On the assumption that associative effects of contextual words are algebraically additive, the desired function is given by $f$ in

$$\Delta P_w(i) = f[p(j)]. \qquad [5]$$

As in Experiment 2, the quantity $\Delta P_w(i)$ represents the difference between the probability of associative cluster $i$ following an experimental word-set (I or II in this experiment) and its probability following the control word-set (III). The term $p(j)$ has been defined as the probability of occurrence of word $j$ in a general sample of the language behavior of the population under study (American college students in this case). The frequencies of words in the Lorge Magazine Count and Thorndike-Lorge Semantic Count, which correlate highly with college students' ratings of the frequency with which they use words, can be used to measure $p(j)$.¹⁰ Taken together, these counts give the number of times that a word appeared in highly varied samples of written language behavior totalling over nine million words. Fig. 3 presents the data.
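As a small worked illustration of the quantity $\Delta P_w(i)$ just defined, the condition means reported for Experiment 3 (0.075, 0.035, and 0.032) can be differenced against the control mean. The sketch below is in Python; the function and variable names are hypothetical, and only the three means come from the text.

```python
# Mean cluster probabilities reported for Experiment 3 (from the text).
MEAN_FREQUENT = 0.075    # Condition I, frequent contextual word
MEAN_INFREQUENT = 0.035  # Condition II, infrequent contextual word
MEAN_CONTROL = 0.032     # Condition III, control word-set


def delta_p(experimental_mean, control_mean):
    """Difference between an experimental and the control cluster probability."""
    return round(experimental_mean - control_mean, 3)


delta_frequent = delta_p(MEAN_FREQUENT, MEAN_CONTROL)
delta_infrequent = delta_p(MEAN_INFREQUENT, MEAN_CONTROL)

print(delta_frequent, delta_infrequent)
```

On these means the frequent word raises the cluster probability by about 0.043 over control and the infrequent word by about 0.003, which is the contrast the weighting interpretation rests on.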
The abscissa, graphed logarithmically to conserve space, gives the Thorndike-Lorge frequency of each experimental word, $j_a$ or $j_b$, and the ordinate gives the difference $\Delta P_w(i)$ between the probability of associative cluster $i$ following a word-set containing the experimental word and the probability of $i$ in the control condition. Correlation coefficients computed for $\Delta P_w(i)$ as a function of $\log p(j)$ are significantly greater than zero ($\eta = 0.88$, $r = +0.77$); the difference between $\eta$ and $r$ is not sufficient to warrant rejection of the hypothesis that the function is rectilinear in $\log p(j)$ ($F = 1.91$; $df = 5, 13$; $p > 0.10$).¹¹ A rough estimate of the reliability of the measurements of $\log p(j)$ is given by the correlation between the frequencies of words in the Semantic Count and their frequencies in the Magazine Count ($r = +0.80$).¹² Since prediction of $\Delta P_w(i)$ from the frequency of occurrence of contextual words is of the same order of accuracy as prediction of word frequencies from one sample to another, a causal relation between the two variables is indicated.

FIG. 3. THE WEIGHTED ASSOCIATIVE EFFECT OF A CONTEXTUAL WORD AS A FUNCTION OF ITS PROBABILITY OF OCCURRENCE
The abscissa shows the Thorndike-Lorge frequency of the experimental word. The ordinate gives the difference between the probability of an associative cluster following a context including the experimental word and its probability following a control context.

¹⁰ E. L. Thorndike and I. Lorge, The Teacher's Word Book of 30,000 Words, 1944; Howes, loc. cit.

¹¹ The small number of Ss made it necessary to estimate $P_w(i)$ for each word-set in relative-frequency units of 0.02. This is too coarse a step-interval to permit accurate estimation of the small values of $P_w(i)$ that obtain for the control word-sets. In computing $\Delta P_w(i)$ for Fig. 3, therefore, the mean probability of $i$ over all 10 of the control word-sets has been used. This is permissible because all associative clusters and contexts for the control condition were subjected to the same selection procedures. If each value of $\Delta P_w(i)$ is recomputed using the control frequency for each individual word-set, $\eta$ and $r$ are reduced to 0.71 and +0.64, respectively.

¹² The four rarest words could not be included in these calculations since their frequencies in the Magazine and Semantic counts are not distinguished in the published Thorndike-Lorge tables; but as the scatter-diagram of the data indicates that the reliability of infrequent words is as low as, or lower than, that of frequent words, the present argument is not invalidated by their omission.

Some of our Ss may not have known the meanings of all of the rare contextual words. The first-order associative effects of these words could not be expected to resemble those of their corresponding frequent contextual words. Hence the infrequent contextual words would function as neutral words, and the infrequent-word contexts would be constituted essentially like control contexts. This possibility offers an alternative interpretation of the fact that the results for the infrequent-word contexts do not differ reliably from those for the control contexts. Although it is impossible to discover directly the extent to which this factor affected the results, it seems unlikely that more than a small proportion of the Ss employed were unfamiliar with most of the infrequent words used here. The function of Fig. 3 also gives no evidence of a discontinuity such as might be expected to result from an artifact of that type. Moreover, the function approaches zero for fairly well-known words (e.g. astringent, delectable): thus words of even greater rarity would also be expected to have zero weights, in which case it would make no difference to the experimental results whether the subjects understood the word or not.

Independence of neutral contextual words. In all of these experiments it has been assumed that the words we have selected as neutral contextual words (represented by numerical subscripts) have no appreciable effect upon the probability of the associative clusters. Precautions taken to assure satisfactory selection of these neutral words have already been explained, but it is desirable to have an empirical check.

The data of Experiment 3 can be used for this purpose. In Conditions I and II of that experiment we have two word-sets that are identical save for their third contextual words. A new associative cluster $u$ can then be so chosen that (1) each of the first two contextual words has a strong associative effect upon $u$, and (2) the words in the third contextual positions are neutral with respect to $u$. These third contextual words in Experiment 3 are synonyms with comparable first-order associative effects upon $u$, but the third word of Condition I is a much more frequent word than that of Condition II. Now let us suppose that the assumption that these words are neutral with respect to cluster $u$ is false. This will mean that the third word of each word-set will contribute to the probability of $u$. By Experiment 3, the contribution of a frequent word to the associative response must be weighted much more heavily than that of a rare word. The third contextual word of Condition I will thus increase the probability of $u$ more than will the third contextual word of Condition II, and therefore the measured value $P_w(u)$ will, on the average, be larger for Condition I. The results, however, show a small difference in the opposite direction: the mean probability of clusters $u$ is 0.100 following Condition I (frequent-word contexts) and 0.117 following Condition II (infrequent-word contexts). A difference of this size would be expected more than 3 out of 10 times by random sampling (t = 0.98, df = 9).
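Two of the test statistics quoted above can be roughly re-derived from the reported summary values. The Python sketch below is not part of the original analysis and rests on stated assumptions: it treats the t test as two-sided, and it assumes that the linearity test is the standard comparison of the correlation ratio with the linear correlation, with k = 7 frequency classes and N = 20 points inferred from the reported df of 5 and 13. None of these details is given in the text, and the helper names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided tail probability, via Simpson's rule on [0, |t|]."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central


def linearity_f(eta, r, k, n):
    """F ratio comparing the correlation ratio eta with linear r.

    Assumed form: [(eta^2 - r^2)/(k - 2)] / [(1 - eta^2)/(n - k)].
    """
    return ((eta ** 2 - r ** 2) / (k - 2)) / ((1 - eta ** 2) / (n - k))


# Neutral-word check: t = 0.98 with 9 df (values from the text).
p_neutral = two_sided_p(0.98, 9)

# Linearity check for Fig. 3: eta = 0.88, r = 0.77 (from the text);
# k = 7 and n = 20 are inferred from df = 5, 13 and are assumptions.
f_linearity = linearity_f(0.88, 0.77, k=7, n=20)

print(round(p_neutral, 2), round(f_linearity, 2))
```

The two-sided probability for t = 0.98 with 9 df falls between 0.3 and 0.4, in line with "more than 3 out of 10 times". With the rounded coefficients the assumed formula gives an F of about 2.09 rather than the reported 1.91; a discrepancy of that size is within what rounding of $\eta$ and $r$ to two decimals could produce, though the text does not say how the reported value was computed.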
Consequently the null hypothesis, that words selected to be neutral have in fact no effect upon an associative cluster, should not be rejected.

Discussion. The results of these experiments lend themselves to a surprisingly simple interpretation. This, however, requires a more refined definition of the associative effect of a stimulus-word. We consider that S is capable of emitting any one of a set of alternative word-responses $i_\alpha, i_\beta, \ldots, i_\nu$. Each response-word we assume to have an average probability of emission $p(i)$ independent of any stimulus-word. These values may be regarded as the relative habit strengths of the words, and presumably they are sampled by tables like those published by Thorndike and Lorge.¹³ The effect of a particular stimulus-word (when S is set by the instructions of the word-association test) is then to redistribute these probabilities, increasing them for some words, decreasing them for others, and leaving some practically unchanged. Thus the associative effect of a stimulus-word is properly measured by the difference $p_j(i) - p(i)$. A set of such probability changes for all possible responses, $i_\alpha, \ldots, i_\nu$, we assume to be a fixed property of the stimulus-word and the population of Ss.

Consider now what happens when an S perceives two or more stimulus-words in the association-experiment. The change in probability of one response-word relative to another is a property of each stimulus-word and cannot be changed by the fact that each stimulus-word now appears as one of a sequence. Only the extent to which a stimulus-word affects the response-mechanism as a whole (its weight in Equation [3]) can vary.
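One possible reading of this redistribution account can be made concrete with a toy calculation. In the Python sketch below every number is hypothetical (the words are borrowed from the example of Experiment 3 purely as labels), and the combination rule, baseline plus a weighted sum of the differences $p_j(i) - p(i)$, is an assumption about how the weights of Equation [3] might enter, not a formula given in the text.

```python
# HYPOTHETICAL baseline emission probabilities p(i) for three responses.
baseline = {"glory": 0.10, "war": 0.30, "other": 0.60}

# HYPOTHETICAL first-order associative probabilities p_j(i).
first_order = {
    "praise": {"glory": 0.40, "war": 0.10, "other": 0.50},
    "soldier": {"glory": 0.20, "war": 0.50, "other": 0.30},
}

# HYPOTHETICAL weights: how strongly each stimulus-word "captures"
# the response-mechanism (position and frequency would set these).
weights = {"praise": 0.2, "soldier": 0.5}


def associative_effect(word):
    """Change p_j(i) - p(i) that a stimulus-word makes to each response."""
    return {i: first_order[word][i] - baseline[i] for i in baseline}


def combined_distribution(words):
    """Assumed additive rule: baseline plus weighted associative effects."""
    result = dict(baseline)
    for word in words:
        effect = associative_effect(word)
        for i in result:
            result[i] += weights[word] * effect[i]
    return result


praise_effect = associative_effect("praise")
combined = combined_distribution(["praise", "soldier"])
print(combined)
```

Because each stimulus-word's changes sum to zero over the response set, the combined values still sum to one; they remain non-negative here because the hypothetical weights sum to less than one, which echoes the restriction that the component effects must not be too large.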
Thus it is the capacity of the perceived stimulus-word to 'capture' the response-mechanism that decreases with the number of other stimulus-words interposed between it and the moment of response (Experiment 1), and that increases in approximate proportion to the logarithm of its probability of occurrence (Experiment 3).¹⁴ Hence the high correlation between $\Delta P_w(i)$ and $p(j)$ found in Experiment 3 (which would hardly be expected if more complicated interactions among the effects of contextual words took place) and the additivity of associative effects found in Experiment 2.

¹³ The correct statistic would be $\sum_j p(j)\,p_j(i)$, the sum of the associative probabilities for all possible stimulus-words weighted according to the probability of occurrence of each stimulus-word. A preliminary comparison of the Kent-Rosanoff and Thorndike-Lorge tables indicates that the Thorndike-Lorge frequency of a word gives a good estimate of this value except for a few special classes of words.

¹⁴ In this connection it is interesting to note that the time for which a stimulus-word must be exposed tachistoscopically in order for it to be perceived can be decreased in approximate proportion to the logarithm of its probability of occurrence (cf. Howes, op. cit., esp. Ch. IV).

In this paper only word-sets with zero transitional probabilities have been considered. What would happen if this restriction were removed? We have already seen that, on empirical grounds, one can expect that the presence of familiar sequences in the word-sets would greatly modify the associations given. The simple model described above predicts such differences. When transitional probabilities are appreciable, the probability of occurrence of a stimulus-word, $p(j)$, depends upon the particular words that precede it.
The weights of the respective contextual words in determining the responses would thus be changed greatly, yielding results for $P_w(i)$ entirely different from those calculated on the simple basis used in the present studies.

This conception of the associative process is much simpler than many views of linguistic processes would lead us to expect. It does not, for example, postulate the representational mediation-processes found necessary by one of the authors to account for many aspects of linguistic behavior, particularly those involving semantic functions.¹⁵ This simplicity, however, relates only to the manner in which certain concepts are interrelated. The concepts themselves are statistically defined and thus are recognized to be the product of complex multiple determination. Study of further variables within the present experimental design may, indeed, require a more complicated interpretation like that afforded by the mediation hypothesis.

¹⁵ C. E. Osgood, The nature and measurement of meaning, Psychol. Bull., 49, 1952, 197-237.

SUMMARY

(1) The prediction of the language behavior of one population of Ss from the language behavior of a second population is formulated in statistical concepts. The basic concept is that of associative word-probability, defined as the probability that one person (or population of persons) will emit a word as an association following the emission of a given stimulus-word by another person. This concept is applied to the prediction of the probability of a word-association following a sequence of stimulus-words from the probability of that association following each of the component stimulus-words taken separately.

(2) Three experiments, each using 200 college students, indicate the following: (a) the effect of a given stimulus-word on an associative response is a decreasing function of the number of additional stimulus-words interposed between it and the time of response; (b) the effect of a sequence of stimulus-words upon an associative response is an increasing function of the proportion of those stimulus-words having similar first-order associative effects on the response; and (c) the effect of a given stimulus-word on an associative response is an increasing function of the frequency of occurrence of the stimulus-word in general linguistic usage.

(3) Quantitative specification of these functions suggests certain assumptions about the way in which the effects of several different stimulus-words interact upon the same associative response. The present data are consistent with the assumption that these effects are algebraically additive.