Linguistic Society of America Annual Meeting
January 8, 2010
Baltimore, MD

Gerard Manley Hopkins’s Sprung Rhythm: Corpus study and stochastic grammar

Bruce Hayes, UCLA
Claire Moore-Cantwell, University of Massachusetts, Amherst

1. Sprung rhythm
• A poetic meter invented by the Victorian poet Gerard Manley Hopkins (1844-1889), the basis of some of his most admired poetry.
• Until recently, nobody understood how it worked. Hopkins’s own explanations give us only hints.
• Some scholarship has even charged Hopkins with being deluded in imagining he was composing to a real meter.

2. Kiparsky (1989)
• Kiparsky claims to have cracked the code. Essential ingredient, neglected by earlier analysts: syllable quantity.

3. Revisiting Kiparsky’s work
• We returned to the same data, using newly available tools:
  - electronic corpus
  - machine scansion
  - stochastic grammar frameworks

4. Goals
• Validate the original analysis, using digital technology for thoroughness.
• Render the analysis more complete by adding constraints and couching it in the framework of stochastic grammar.

5. Outline of the talk
• Summarize the Kiparskian analysis
• Add two new inviolable constraints
• Outline the problem of indeterminate scansions
• Offer a solution: violable constraints and stochastic grammar

THE KIPARSKIAN ANALYSIS OF SPRUNG RHYTHM

6. Notation for syllable weight
  –   heavy
  ⏑   light
  ⏓   ambiguous: heavy or light, as the scansion requires

7. How syllable weight works in sprung rhythm
• Basically, the normal Latin-like syllable weight criterion, with closed and long-voweled syllables counting as heavy.
• With three added complications:
  - You can optionally ignore a single word-final consonant (cf. “consonant extrametricality”; Hayes 1982). So havoc [ˈhæ.vək] is /⏑ ⏓/.
  - You can optionally consider a word-final stressless non-low vowel as short (cf. Chomsky and Halle 1968). So they [eɪ] is /⏓/, I [aɪ] is /–/.
  - You’re allowed to collapse stressless vowel + coda sonorant into a single short syllabic sonorant, which then counts as light (dandled /ˈdændəld/ can be /– ⏑/, interpreted as [ˈdændl̩d]).

8. Framework
• Kiparsky’s work falls in the mainstream research tradition of generative metrics (Halle and Keyser 1969, 1971; Kiparsky 1975, 1977, et seq.)
• Meter is a sequence of strong and weak slots, e.g. W S W S W S W S W.
• There are rules for how you can fill an S or a W slot (cf. Hanson and Kiparsky 1996).
• Here is a case of such filling:
    /w () /s To- /w wery /s city /w and /s bran- /w chy be- /s tween /w () /s to- /w wers;   DO [1]
• These rules refer to stress, syllable weight, phrasing, and word boundaries.

[1] For abbreviations of poem titles see Appendix A below.

9. Meters of sprung rhythm
• Each is an alternating sequence of S (Strong) and W (Weak) positions, beginning and ending with W.
• e.g. W S W S W S W S W, tetrameter
• 2 S’s = dimeter, 3 S’s = trimeter, and so on.
• Possible meters: 2, 3, 4, 5, 6, 8, or varying length in a fixed stanza type

RULES FOR FILLING S AND W POSITION

10. Preliminary definition
• Resolved sequence = stressed light followed by a stressless light in the same word
• Words that embody resolved sequences: dapple [ˈdæpl̩], level [ˈlɛvl̩]

11. The rules for filling W and S position
• W may be filled with any of the following:
  a. A single stressless syllable
  b. A stressed monosyllable
  c. A sequence of stressless light syllables
  d. A resolved sequence
  e. Null
• S may be filled with any of the following:
  a. A single stressed syllable
  b. A resolved sequence
  c. A single stressless syllable, provided it is not light.
• For examples of all of these rules, see Appendix B below.

OUTRIDES

12. Outrides
• Extra syllables not affiliated with either W or S position.
• They are Hopkins’s extension of “extrametrical syllables,” a common phenomenon in ordinary verse (Kiparsky 1977, 230-232).

13.
Conditions on outrides
• Content of outride: same as any W position
• Required context for outrides:
  - Must precede a phonological break (end of P-phrase, I-phrase; Selkirk 1980)
  - Stress on an outride must be weaker than stress on the preceding S.
  - Normally allowed only in tetrameter or longer lines

14. Example of an outride
    /w () /s Shares /w their /s best /w gifts /s sure- /o ly, /w fall /s how /w things /s will),   BC

DATA CORPUS

15. Purpose
• In order to test Kiparsky’s analysis (and our revisions) by our method, we must first code the entire sprung rhythm corpus in digital form.

16. Poems included
• All 25 sprung rhythm poems in which the number of S positions per line is known.
• 583 lines, 6127 syllables

17. Phonological coding
For every syllable we coded:
• Stress: numbers 1-2-3-4, with 4 the greatest. We follow the phrasal stress rules in the literature (e.g. Chomsky and Halle 1968, Selkirk 1984, Hayes 1995).
• Weight: light, heavy, or ambiguous
• Phonological phrasing: Selkirk’s Prosodic Hierarchy (1980, 1986), with levels of phrasing (Word, Clitic Group, Phonological Phrase, Intonational Phrase) and rules for formation as in Hayes (1989).

18. Example of a coded line
    phrasing [2]:  [tree diagram over the line, with node labels IP, PP, CG, Word]
    syllables:     To-  we-  ry   ci-  ty   and  bran-  chy  be-  tween  to-  wers;
    stress:        3    1    1    4    1    1    3      1    1    2      4    1
    weight:        –    ⏓    ⏓    ⏑    ⏓    ⏓    –      ⏓    ⏑    –      –    ⏓

[2] We don’t code the actual tree. Instead: for each syllable, the rank of the highest constituent of which it is the rightmost syllable.

MACHINE SCANSION

19. Chopkins.exe
• This program knows all the options for filling S, W, O.
• It finds all the possible scansions of a line, or where appropriate tells the user that no legal scansions exist.

20.
Procedure: first step
• We inspected the outputs of Chopkins and used them to discover new constraints to add to Kiparsky’s system, in order to make the grammar more restrictive.

PROPOSED ADDITIONAL INVIOLABLE CONSTRAINTS

21. FINAL FALL
Assess a violation when the rightmost S is filled by a syllable that does not have more stress than what fills the following W.

22. *EMPTY W INSIDE LEVEL I
• Empty W is rare inside a word.
• Moreover, all 5 attested cases put the empty W between a Level II (Kiparsky 1982) affix and the stem, like this:
    /w () /s Strokes /w of /s ha- /w voc /s un- /w () /s selve
• So: we impose an inviolable ban on empty W internal to a Level I domain.

23. Modify the grammar
• We altered Chopkins to respect these two new constraints.

TESTING THE ANALYSIS

24. The question
• How well does the grammar work? Might counterexamples have slipped by Kiparsky in his earlier inspection of the data? Do our new constraints still permit the whole corpus to be scanned?

25. Results
• A fair amount of discussion and fine tuning of individual examples is needed (see full paper), but the upshot is that there are about 2 unmetrical lines (583 lines total):
    Forward-like, but however, and like favourable heaven heard these.   BC
    A heart’s-clarion! Away grief’s gasping, joyless days, dejection.   HF

26. Query
• How meaningful is it that there are just 2 exceptions?
• One way to check: do a comparison with prose:
  - How many “exceptions” would we get if we tried to scan lines of ordinary English as sprung rhythm?
  - Earlier uses of this method: Tarlinskaja and Teterina (1974), Tarlinskaja (1976), Biggs (1996)

27. Sources of prose
• We used Hopkins’s own writings:
  - unpublished “Author’s preface”
  - a few of his letters

28.
Forming the sample
• Separated these texts into “pseudo-lines”: sequences separated by punctuation marks
• Selected 155 to match real corpus distribution of line lengths in syllables
• For each: randomly assign to a meter (trimeter, tetrameter, etc.), matching the statistics of the corpus (e.g., lines of n syllables occur with m S positions x% of the time)

29. Result
• About 10% of the prose lines are unscannable with the meter that was randomly assigned to them.
• This proportion is higher, significantly so, than the proportion of unscannable real lines.

              metrical   unmetrical
    verse     581        2
    prose     139        16

    Fisher’s exact test: p < 10^-9

30. Summing up so far
• Kiparsky’s system can be slightly tightened with two inviolable constraints.
• Thus modified, it suffers very few counterexamples and stands up to statistical testing with the prose model method.
• But we think there is still a problem with it: insufficient restrictiveness.

THE PROBLEM OF TOO MANY LEGAL SCANSIONS

31. Defining the problem
• If a metrical analysis allows a great number of scansions, it is unrestrictive.
• An insufficiently restrictive system would be scientifically uninteresting; it would make scansion “as indeterminate as slicing cucumbers” (Kiparsky 1989, 308)

32. How many scansions does the (modified) Kiparskian analysis allow?
• Chopkins can tell us.
  - Lines with a unique scansion: just 47 (out of 583)
  - More than 10: 211 lines
  - More than 100: 12 lines
  - average: 14.8
  - median: 6
• We think this is probably too many.

33. Our proposed solution
• We think Kiparsky’s work only found a subset of the constraints under which Hopkins wrote: the inviolable ones.
• We can and should add additional violable constraints.
• We can deploy these constraints rigorously by using a framework of stochastic grammar.
• … and we can test our proposal, because Hopkins left testimony about which scansions he felt were best.
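
The Fisher’s exact test in (29) can be rechecked from the 2x2 table alone. What follows is a minimal sketch using only the Python standard library. The function name and the two-sided convention (summing every table that is no more probable than the observed one, with all margins held fixed) are assumptions of this sketch, not necessarily the procedure used for the handout.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Hypothetical helper for illustration: with all margins fixed, the
    top-left cell follows a hypergeometric distribution.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def pmf(x):
        # Probability that the top-left cell equals x given the margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = pmf(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(p for p in (pmf(x) for x in range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))


# Counts from (29): verse 581 metrical / 2 unmetrical, prose 139 / 16.
p_value = fisher_exact_two_sided(581, 2, 139, 16)
print(f"two-sided p = {p_value:.3g}")
```

Under these assumptions the value comes out below 10^-9, in line with the figure reported in (29).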

USING HOPKINS’S DIACRITICS AS A DIAGNOSTIC FOR HIS PREFERRED SCANSIONS

34. The diacritics
• Hopkins added them because his friends couldn’t scan his poems.
• They mark outrides, empty W, syllables that should be scanned in S or in W.
• We can use these diacritics to single out a “Hopkins preferred” scansion from the many logically possible scansions.

35. The informative subset of the corpus
• The lines in which only one scansion is compatible with the diacritics.
• This is true for 311/583 lines

36. Goal
• Construct a stochastic grammar that maximally favors the same scansions that Hopkins preferred, according to his diacritics.

CONSTRUCTING A STOCHASTIC GRAMMAR I: CONSTRAINT SET

37. Source
• The literature on generative metrics, including Kiparsky (1989).

38. Avoiding multiply-filled positions
  a. *2 SYL IN W
     Assess a violation for W positions filled by 2 or more syllables.
  b. *3 SYL IN W
     Assess a violation for W positions filled by 3 or more syllables.
  c. *RESOLUTION IN S
     Assess a violation for each resolved sequence in S position.

39. Matching the stress pattern to the S’s and W’s
• Match SW to a “fall” in stress contour (Magnuson and Ryder 1970; Tarlinskaja 1976)
    MATCH SW
    Assess a violation if the (first) syllable occupying W position has more stress than the (first) syllable occupying the preceding S.
• We tried MATCH WS, but it did not improve the model fit and so we omit it.

40. Flanking empty W positions with stressed syllables
    *NO-CLASH EMPTY W
    Assess a violation for an empty W position if the S positions that flank it are not both filled by a stressed syllable.

41. Constraints on outrides
  a. *OUTRIDE
     Assess a violation for every outride.
  b. *OUTRIDE-WEAK BREAK
     Assess a violation for every outride that is only at the end of a Clitic Group.
  c. *OUTRIDE-SHORT LINE
     Assess a violation for every outride in a line with 4 or fewer S positions.

42.
All these constraints are violable

    Constraint             Number of violations in 311 “Hopkins scansions”
    *2 SYL IN W            211
    *3 SYL IN W            23
    *RESOLUTION IN S       21
    MATCH SW               117
    *NO-CLASH EMPTY W      21
    *OUTRIDE               88
    *OUTRIDE-WEAK BREAK    3
    *OUTRIDE-SHORT LINE    2

CONSTRUCTING A STOCHASTIC GRAMMAR II: FORMAL FRAMEWORK

43. Goal
• A system that calculates output probabilities from constraint violation profiles in a principled way.
• Here, for “principled”, we use the maximum likelihood criterion: maximize the predicted probability of Hopkins’s own preferred scansions. This is a standard criterion for fitting models.
• Maxent grammars do this in a fairly simple way, and are backed by solid mathematics.

44. Maxent grammars in recent phonological work
• Goldwater and Johnson 2003, Wilson 2006, Hayes and Wilson 2008, Hayes, Zuraw, Siptár and Londe (in press)

45. Basis of maxent grammars
• They are a subspecies of harmonic grammars (Legendre, Miyata, and Smolensky 1990; Pater 2009; Potts et al., in press)
• Each constraint has a weight, a non-negative number, expressing how much it lowers the output probability of candidates that violate it.

46. Finding the weights
• This is done by fitting them to the data.
• For a presentation of the algorithm involved, see Hayes and Wilson 2008.
• Software used: the Maxent Grammar Tool (Colin Wilson/Ben George), http://www.linguistics.ucla.edu/people/hayes/MaxentGrammarTool/

47. Our maxent simulation
• Feed the software:
  - the 583 sprung rhythm lines of the corpus
  - every legal candidate scansion of these lines, 8633 in total
  - the violation profiles for all the candidate scansions
  - for the 311 lines in which an unambiguous Hopkins scansion was determinable, a designation of this scansion as the “winning” one, for purposes of training the weights.
• The algorithm tries to maximize the probability assigned to these winning scansions.
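
The calculation described in (45)-(47) can be sketched briefly, assuming the standard maxent formula in which a candidate’s probability is proportional to exp(-Σ wᵢ·Cᵢ), normalized over the candidate scansions of the same line. The three weights are those listed for these constraints in (48). The candidate scansions A, B, C and their violation counts are invented for illustration, and the function names are hypothetical. This is not the Maxent Grammar Tool itself, and it does not fit weights; it only shows what is being computed and what quantity the fitting maximizes.

```python
import math

# Weights for three of the constraints, as listed in (48).
WEIGHTS = {"MATCH SW": 1.05, "*2 SYL IN W": 1.75, "*OUTRIDE": 2.22}


def harmony(violations, weights):
    """Weighted sum of a candidate's constraint violations (its penalty)."""
    return sum(weights[name] * count for name, count in violations.items())


def maxent_probabilities(candidates, weights):
    """Probability of each candidate: exp(-harmony), normalized within the line."""
    scores = {label: math.exp(-harmony(viols, weights))
              for label, viols in candidates.items()}
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


def log_likelihood(lines, weights):
    """Summed log probability of the designated winners.

    Fitting the weights means choosing them so that this value is maximal.
    """
    return sum(math.log(maxent_probabilities(cands, weights)[winner])
               for cands, winner in lines)


# Hypothetical line with three legal scansions (violation counts invented).
CANDIDATES = {
    "A": {"*2 SYL IN W": 1},
    "B": {"*OUTRIDE": 1},
    "C": {"*2 SYL IN W": 1, "MATCH SW": 1},
}

probs = maxent_probabilities(CANDIDATES, WEIGHTS)
for label in sorted(probs, key=probs.get, reverse=True):
    print(label, round(probs[label], 3))
print("log likelihood with A as winner:",
      round(log_likelihood([(CANDIDATES, "A")], WEIGHTS), 3))
```

Because the probabilities are normalized line by line, only the difference in weighted violations between competing scansions of the same line matters.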
• Program output:
  - the best-fit weights
  - the predicted probability of every scansion

48. Weights obtained

    MATCH SW               1.05
    *NO-CLASH EMPTY W      1.74
    *RESOLUTION IN S       1.44
    *2 SYL IN W            1.75
    *3 SYL IN W            1.75
    *OUTRIDE               2.22
    *OUTRIDE-SHORT LINE    11.48
    *OUTRIDE-WEAK BREAK    1.69

EVALUATING THE MACHINE-LEARNED GRAMMAR

49. Guesses needed to find right answer
• Procedure: for each line, sort the candidates by predicted probability assigned by the maxent grammar, in descending order.
• Count ties as the larger value (2 candidates tied for 1st = “2”)
• How far down the list (how many such “guesses”) did we have to go to find the Hopkins-preferred scansion?

    Guesses   Number   Cumulative fraction of total
    1         260      0.836
    2         24       0.913
    3         10       0.945
    4         8        0.971
    5         2        0.977
    6         2        0.984
    8         1        0.987
    10        1        0.990
    15        1        0.994
    54        1        0.997
    158       1        1.000

• The average rank of the correct guess is 2.02.
• Without the violable stochastic constraints, guessing candidates with equal Kiparskian probability, the comparable number is 7.5 guesses.

50. Other models (work in progress)
• We plan to run our data on similar stochastic grammar models: Stochastic OT (Boersma 1997, Boersma and Hayes 2001); Noisy Harmonic Grammar (Pater 2009, Potts et al. in press).
• Point of interest: we think there may be harmonically bounded winners in the corpus (Appendix C), which these theories predict to have zero probability.

DISCUSSION

51. Hopkins’s metrical experiment
• He wasn’t deluded about his meter. His system is restrictive, and even more so than Kiparsky claimed.
• With our additions, the system now normally exhibits a strong preference for one single scansion (one dominates the others in its probability), which in a large majority of testable cases coincides with the one Hopkins preferred.

52.
Quasi-prediction in linguistics
• Although all the options that obey the inviolable constraints are in principle legal, the choice among these legal options is far from random and can be “quasi-predicted” by a stochastic grammar.
• Compare Bresnan et al. (2007), who use a stochastic model to quasi-predict which kind of dative construction (V NP PP or V NP NP) speakers use in dative sentences.

53. Methodology of metrics
• We think doing metrics with a corpus and machine search puts you on safer ground, helps you discover new constraints, and helps in verifying you’re on the right track.

Appendix A: poems studied with title abbreviations

    AB   Ashboughs
    BC   The Bugler’s First Communion
    BP   Binsey Poplars
    BR   Brothers
    CC   Carrion Comfort
    CS   The Caged Skylark
    DO   Duns Scotus’s Oxford
    FR   Felix Randal
    HF   That Nature is a Heraclitean Fire and of the Comfort of the Resurrection
    HH   Hurrahing in Harvest
    HP   Harry Ploughman
    HR   Henry Purcell
    ID   Inversnaid
    KF   As Kingfishers Catch Fire
    LE   The Loss of the Eurydice
    MM   The May Magnificat
    NW   No Worst
    PB   Pied Beauty
    RB   Ribblesdale
    SD   The Soldier
    SF   Spring and Fall
    SS   Spelt from Sibyl’s Leaves
    TG   Tom’s Garland
    WH   The Windhover
    WM   At the Wedding March

Appendix B: Examples of how S and W positions are filled

1. W can be any single stressless syllable
• This is the norm in stress-based poetry; many examples.

2. W can be any sequence of stressless light syllables
    /w A /s bea- /w con, an e- /s ter- /w nal /s beam. /w Flesh /s fade, /w and /s mor- /w tal /s trash   HF
    [kən ən ə]

3. W can be any stressed monosyllable
    /w Young /s John: /w then /s fear, /w then /s joy   BR

4. W can be a resolved sequence
    /w Her /s fond /w yellow /s horn- /w light /s wound /w to the /s west, /w her /s wild   [ˈjɛlʊ]
    /w hollow /s hoar- /w light /s hung /w to the /s height   [ˈhɒlʊ]   SS

5. W can be Null
• This is what Hopkins meant by “sprung”.
    /w () /s Áll /w () /s félled, /w () /s félled, /w are /s áll /w () /s félled;   BP

6. S can be filled by a single stressed syllable
• This is the norm in stress-based poetry.

7. S can be filled by a resolved sequence
• Resolved sequence as defined in (4) above.
• Not as common as in W position.
    /w This /s very /w very /s day /w came /s down /w to us /s af- /w ter a /s boon /w he on   [ˈvɛ̆rɪ̆]   BC

8. S can be filled by a single stressless non-light syllable
• This one is ok, because [ənd] can be counted as heavy:
    /w Till a /s life- /w belt /s and /w God’s /s will   [ənd]   LE
• But this constructed line is impossible, since /C₀ə/ is never heavy:
    */w Till it /s streng- /w then /s a /w man’s /s will   [ə]

References

Biggs, Henry (1996) A Statistical Analysis of the Metrics of the Classic French Decasyllable and Classic Alexandrine. Ph.D. dissertation, Program in Romance Linguistics, UCLA, Los Angeles, CA.
Bresnan, Joan, Anna Cueni, Tatiana Nikitina, and Harald Baayen (2007) Predicting the dative alternation. In G. Bouma, I. Kraemer, and J. Zwarts, eds., Cognitive Foundations of Interpretation. Amsterdam: Royal Netherlands Academy of Science, pp. 69-94.
Chomsky, Noam and Morris Halle (1968) The Sound Pattern of English. New York: Harper and Row.
Goldwater, Sharon and Mark Johnson (2003) Learning OT constraint rankings using a maximum entropy model. In Jennifer Spenader, Anders Eriksson, and Östen Dahl, eds., Proceedings of the Stockholm Workshop on Variation within Optimality Theory, pp. 111-120.
Halle, Morris and S. Jay Keyser (1969) Chaucer and the study of prosody. College English 28, 187-219.
Halle, Morris and S. Jay Keyser (1971) English Stress: Its Form, Its Growth, and Its Role in Verse. New York: Harper and Row.
Hanson, Kristin and Paul Kiparsky (1996) A parametric theory of poetic meter. Language 72, 287-335.
Hayes, Bruce (1982) Extrametricality and English stress.
Linguistic Inquiry 13, 227-276.
Hayes, Bruce (1989) The Prosodic Hierarchy in meter. In Paul Kiparsky and Gilbert Youmans, eds., Rhythm and Meter. Orlando, FL: Academic Press, pp. 201-260.
Hayes, Bruce (1995) Metrical Stress Theory: Principles and Case Studies. Chicago: University of Chicago Press.
Hayes, Bruce, Kie Zuraw, Péter Siptár, and Zsuzsa Londe (in press) Natural and unnatural constraints in Hungarian vowel harmony. To appear in Language.
Hayes, Bruce and Colin Wilson (2008) A maximum entropy model of phonotactics and phonotactic learning. Linguistic Inquiry 39, 379-440.
Kiparsky, Paul (1975) Stress, syntax, and meter. Language 51, 576-616.
Kiparsky, Paul (1977) The rhythmic structure of English verse. Linguistic Inquiry 8, 189-247.
Kiparsky, Paul (1982) Lexical phonology and morphology. In In-Seok Yang, ed., Linguistics in the Morning Calm. Seoul: Hanshin, pp. 3-91.
Kiparsky, Paul (1989) Sprung rhythm. In Paul Kiparsky and Gilbert Youmans, eds., Rhythm and Meter. San Diego: Academic Press.
Legendre, Géraldine, Yoshiro Miyata, and Paul Smolensky (1990) Harmonic grammar: A formal multi-level connectionist theory of linguistic well-formedness: an application. In COGSCI 1990, pp. 884-891.
MacKenzie, Norman H. (1990) The Poetical Works of Gerard Manley Hopkins. Oxford: Clarendon Press.
MacKenzie, Norman H. (1991) The Later Poetic Manuscripts of Gerard Manley Hopkins. New York and London: Garland.
Magnuson, Karl and Frank G. Ryder (1970) The study of English prosody: an alternative proposal. College English 31, 789-820.
Pater, Joe (2009) Weighted constraints in generative linguistics. Cognitive Science 33, 999-1035.
Potts, Christopher, Joe Pater, Karen Jesney, Rajesh Bhatt, and Michael Becker (in press) Harmonic grammar with linear programming. To appear in Phonology.
Selkirk, Elisabeth (1980) Prosodic domains in phonology: Sanskrit revisited.
In Mark Aronoff and Mary-Louise Kean, eds., Juncture. Saratoga, CA: Anma Libri.
Selkirk, Elisabeth O. (1984) Phonology and Syntax: The Relation between Sound and Structure. Cambridge, MA: MIT Press.
Selkirk, Elisabeth O. (1986) On derived domains in sentence phonology. Phonology Yearbook 3, 371-405.
Tarlinskaja, Marina (1976) English Verse: Theory and History. The Hague: Mouton.
Tarlinskaja, Marina and L. M. Teterina (1974) Verse-prose-metre. Linguistics 129, 63-86.
Wilson, Colin (2006) Learning phonology with substantive bias: an experimental and computational investigation of velar palatalization. Cognitive Science 30, 945-982.