‘The Sounds of the Psalter: Computational Analysis of Soundplay’

Drayton C. Benner
University of Chicago, Chicago, IL, USA

Correspondence: University of Chicago, Near Eastern Languages and Civilizations, 1155 East 58th Street, Chicago IL 60637, USA. E-mail: [email protected]

This article presents computational techniques for analyzing soundplay in a corpus and applies them to a corpus of Biblical Hebrew poetry, namely, the Book of Psalms. Evidence is presented to show that there is soundplay in the Book of Psalms, and computational techniques are presented to evaluate a poetic passage proposed by a scholar as having soundplay. That is, the computational techniques, though not definitive, help to distinguish between artistic soundplay and the results of chance and a limited phonemic inventory. In addition, visualization tools are presented to aid the researcher in finding soundplay in a corpus.

1 Introduction

In discussions of poetic alliteration and other soundplay, intuition has usually been the sole guide. Soundplay is sometimes sufficiently striking that an author’s artistry is apparent, but other times, poets are more subtle. With a limited phonemic inventory, clusters of particular sounds are inevitable in a passage of poetry. How can a scholar decide whether a cluster of sounds in a poetic passage is artistic or merely the result of chance? Intuition alone is an insufficient guide to problems involving complex probability and statistics.
Computational techniques can aid the critical eye and ear of a reader. These computational techniques do not replace the human critic, but they are tools that can improve the scholar’s reading of poetry. This article presents computational techniques for analyzing soundplay in a corpus and applies them to a corpus of Biblical Hebrew poetry, namely, the Book of Psalms.

There are three major goals of the computational techniques presented here. First, when a scholar has posited that there is artistic soundplay in a poetic passage, computational techniques ought to be capable of assessing the plausibility of the scholar’s assertion. They may not decide the issue definitively, but they can provide independent support for a scholarly proposal. The computational techniques involved need to be firmly founded in mathematical theory and make use of as much data as possible in making that determination. Second, computational techniques ought to assist the scholar in finding statistically anomalous uses of sound in a poetic corpus so as to present possible instances of soundplay to the scholar. Third, computational techniques should be enlisted to determine whether a poetic corpus contains soundplay at all.

Following a discussion of past scholarly attempts to quantify alliteration in poetry and an identification of the source of data used along with how it was segmented, the computational approaches adopted here are presented in three major sections, one for each of the three main goals listed above. Finally, one poetic passage in the psalms containing soundplay is briefly discussed.

Literary and Linguistic Computing, Vol. 29, No. 3, 2014. © The Author 2014. Published by Oxford University Press on behalf of EADH. All rights reserved. For Permissions, please email: [email protected] doi:10.1093/llc/fqu024 Advance Access published on 14 June 2014
2 Previous Attempts to Quantify Alliteration in Poetry

Attempts at quantifying alliteration in ancient Northwest Semitic poetry have been non-existent, with almost all of the work on identifying alliteration being subjective and intuitive.1 However, there is a tradition of quantifying alliteration in other poetic corpora, especially English (of various time periods) and German poetry. This tradition has used valuable concepts and has sometimes been at least partially grounded in mathematical theory, though none of the approaches has been completely satisfactory.

The renowned psychologist B. F. Skinner wrote two articles on quantifying alliteration (Skinner, 1939, 1941). In his first article, Skinner examines Shakespeare’s sonnets. He restricts the consonants under consideration to important syllable-initial consonants, where a determination of importance involves a consideration of stress, whether the word is a content word, and subjective factors. For Skinner, alliteration consists of the repetition of a particular phoneme in a single line. Though he does not use the term, his reliance on the line introduces the useful concept of a window, a contiguous sequence of words in the poem. Skinner uses the binomial distribution to calculate the expected number of lines with a given number of occurrences of any phoneme. In his second article, Skinner includes poetry by Algernon Swinburne and examines windows of 2–10 syllables, regardless of line breaks. He also considers sets of multiple phonemes as well as single phonemes. Finally, he produces a measure of alliteration in a corpus by returning to his technique from his first article, essentially treating lines that are alliterative due to repeated words as one-third as important as other alliterative lines.
He finds no evidence of alliteration in Shakespeare but does find Swinburne alliterative. Skinner’s studies are, in many ways, fine studies. The window is a useful concept in delineating the bounds of a passage that might contain alliteration, and his approach is rightly grounded in probability theory. That said, there are weaknesses. First, is alliteration in Shakespeare really limited to a small handful of the consonants (cf. Wright, 1974)? Would it not be better to treat certain consonantal phonemes as more important than others rather than excluding some altogether? Second, Skinner does not know how to treat repeated words responsibly, an issue that has plagued computational studies of alliteration. Third, Skinner’s first study limits itself to windows of precisely one line in length. Was Shakespeare constrained to a single line in producing alliteration (cf. Wright, 1974)? In his second study, why did he only examine the first and last syllables in his new windows rather than every phoneme in those windows? The later work of Karl Magnuson and David Chisholm on German verse incorporated many of Skinner’s insights but also many of the weaknesses of his approach (Magnuson, 1962, 1966; Chisholm, 1976, 1981). Direct challenges to Skinner’s work, however, fared worse. Elizabeth Jackson compares the proportion of lines that are alliterative by different poets, wherein a line is considered alliterative if at least two consonantal phonemes of important syllables are identical (Jackson, 1942). Jackson dismantles Skinner’s mathematical foundation, ignoring his use of the binomial distribution. She does not even take into account differences in line lengths, an issue even for the tiny samples of iambic pentameter she uses. Finally, Jackson dismisses Skinner’s legitimately uneasy conscience concerning how he was handling repeated words. N. B. 
Wright’s later response to Skinner correctly noted some of the problems with Skinner’s studies, but he, like Jackson, does not recognize how having repeated words complicates matters and explicitly bases his analysis on English orthography rather than phonology (Wright, 1974).

A variety of independent attempts to quantify alliteration in poetry have been relatively unsuccessful. Richard Bailey tries a variety of techniques for quantifying alliteration in poetry and comparing poetic texts to prose texts, but by his own admission, he has no success (Bailey, 1971). Jay Leavitt proposes a technique by which to rank texts relative to one another based on their use of alliteration, but

3 Source of data

This project requires a morphologically tagged electronic edition of the Hebrew Bible. It uses the Westminster Leningrad Codex (WLC) and Westminster Hebrew Morphology (WHM), both version 4.14. WLC is a diplomatic edition of Codex Leningradensis, the oldest complete manuscript of the Hebrew Bible in the Tiberian tradition. That is, it follows Codex Leningradensis faithfully even when there is an occasional scribal error. WHM provides lemmas and morphology codes for each Hebrew segment. These have been developed since the 1980s by a variety of scholars and are currently maintained by the J. Alan Groves Center for Advanced Biblical Research. Though perfection is impossible to attain, particularly given the many legitimate scholarly disputes concerning how to analyze particular passages, these are mature sources of good data.

4 Segmenting the corpus

The primary corpus of study is the Hebrew Bible, and the language is Biblical Hebrew. From WLC, a few parts must be excised: principally the Aramaic parts of the Hebrew Bible but also some nonlinguistic symbols.
In addition, WLC provides the text for the kethib and qere forms, the occasional instances in which the Massoretes preserved the consonants of their written tradition but provided the vowels to a different word that belonged to their oral tradition. To use both would be redundant, so I opted to follow the kethib text.

Working computationally with a corpus requires tokenizing, or segmenting, the corpus. Suppose we have a corpus C in language L consisting of a set of texts $T = \{t_1, t_2, \ldots, t_a\}$, which form a partition over the corpus. I divided the corpus into texts at book boundaries, psalm boundaries, and after the language shifts from Aramaic to Hebrew. Each text t contains a sequence of words $\{w_1, w_2, \ldots, w_b\}$. Word divisions are generally indicated orthographically by the presence of a space or a Massoretic maqqeph. However, following WHM, I allow a word to span across a space or maqqeph for certain

there is no theoretical basis for his measure of alliteration, and his samples are far too short to be useful (each 300–500 phonemes in length) (Leavitt, 1976).

Marc Plamondon has perhaps worked on quantifying sound patterning in poetry more persistently than anyone else (Plamondon, 2001, 2005, 2009). Plamondon maintains sensibly that a phoneme leaves an impression on the hearer, and the impression of this phoneme decreases over time. He posits that the effect of repeating a particular phoneme is additive. This approach yields a graph of the phonemic accumulations for a phoneme or set of phonemes. He then runs a Discrete Fourier Transform on it to try to distinguish the signal from the noise. He can also plot the ratio of the results of the Fourier Transform between two sets of phonemes (e.g. stops and fricatives) (Plamondon, 2009).
Plamondon’s techniques will likely be helpful, but I sought other techniques for identifying a passage that contains alliteration for three reasons. First, there is no theoretical justification for Plamondon’s function for how the impression of a phoneme declines over time, nor is there theoretical or empirical justification for treating the effect of phonemes as additive. Second, while phonemes could theoretically be weighted differently in Plamondon’s system, he does not do so. The final issue for my purposes is that Plamondon’s system is designed to identify a particular moment in the poem wherein the effect of alliteration on a hearer is strongest. A poet, however, uses alliteration over a particular poetic passage. Techniques designed to identify passages containing alliteration will likely produce overlapping results with techniques designed to identify peak effects on a reader, but where the latter is successful, it is likely to identify a peak near the end of an alliterative section and provide less information concerning where the alliterative section began.

In sum, despite the positive characteristics of past work,2 there is a need for a new approach. Any technique for quantifying alliteration in poetry ought to be built from the base of a solid mathematical framework and use all of the data available, weighting different evidence appropriately. Much of the rest of this article presents such a technique.

5 Measuring soundplay in a passage

The central issue in computationally validating a particular window (a sequence of contiguous words in a single text) as containing a cluster of soundplay is that one needs a way in which to determine whether a set of phonemes is overrepresented in a given passage. In particular, this determination needs to be made in a mathematically rigorous fashion.
5.1 Binomial distribution and its cdf

We want to know whether the number of phonemes in a window W belonging to a set of phonemes $F \subseteq P$, where P is the set of consonantal phonemes in the language, is abnormally high under the assumption that the consonantal phonemes in W are selected at random according to their distribution in corpus C. Toward this end, the binomial distribution provides a probability distribution of the number of phonemes in W that belong to F if each consonantal phoneme is chosen from P independently with some probability p, where p is the frequency of the consonantal phoneme in the corpus. Figure 1 shows p for each consonantal phoneme in Biblical Hebrew. From the cumulative distribution function (cdf) of the binomial distribution, for a given window

Fig. 1 Consonantal phoneme probabilities in the Hebrew Bible

proper nouns. Each word w consists of a sequence of segments $\{s_1, s_2, \ldots, s_c\}$. For example, the word (mhmlk; ‘from the king’) consists of three segments corresponding to the lexemes (mn; ‘from’), (h; ‘the’), and (mlk; ‘king’), but the word (jktb; ‘he will write’) consists of only one segment. Each segment s consists of a sequence of consonantal phonemes $p_1, p_2, \ldots, p_d$, where $p_i \in P$, the set of consonantal phonemes in language L. Matres lectionis (vowels indicated via consonant letters before the development of a full system for representing vowels in Hebrew) and Massoretic symbols were removed from WLC. I limit my investigation to the consonants on account of the difficulties in reconstructing the precise vowels of the biblical period, particularly in the Book of Psalms, an anthological corpus with compositions from many different centuries.
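The unweighted model of Section 5.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy corpus are hypothetical, each phoneme is written as a single character, and the probability of the set F is taken to be the sum of the corpus frequencies of its members.

```python
from collections import Counter
from math import comb


def phoneme_probabilities(corpus):
    """Relative frequency of each consonantal phoneme in the corpus.

    `corpus` is a list of texts, each text a list of words, and each word a
    string of one-character phoneme symbols (a toy stand-in for real data).
    """
    counts = Counter(ph for text in corpus for word in text for ph in word)
    total = sum(counts.values())
    return {ph: c / total for ph, c in counts.items()}


def binomial_cdf(k, n, p):
    """Pr(X <= k) for X ~ Binomial(n, p), computed exactly for integer k, n."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def window_cdf(window, target, probs):
    """cdf value for the phonemes of `target` (the set F) in a window of words."""
    phonemes = [ph for word in window for ph in word]
    k = sum(ph in target for ph in phonemes)       # phonemes of the window in F
    p = sum(probs.get(ph, 0.0) for ph in target)   # assumption: p summed over F
    return binomial_cdf(k, len(phonemes), p)


# Toy corpus in consonantal transliteration (invented data, not taken from WLC).
corpus = [["mlk", "ktb", "dbr"], ["rbb", "brk", "mlk"]]
probs = phoneme_probabilities(corpus)
value = window_cdf(["rbb", "brk"], {"b", "r"}, probs)
```

A value of `window_cdf` close to 1 means that the window holds more phonemes from F than the corpus frequencies alone would lead one to expect.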
For the consonants, I assume that the consonantal orthography represents the phonology well, with three exceptions: the representation of /ʕ/ and /ɣ/ by ע (ayin), the representation of /ħ/ and /x/ by ח (het), and the occasional quiescence of א (aleph). I also assume that the Massoretes correctly distinguished between two phonemes in marking ש as שׁ (shin) and שׂ (sin). The articulation of ר (resh) is uncertain. Each token also has a part of speech and a lexeme, which are provided by WHM.

containing n consonantal phonemes, the probability that it will contain at most k consonantal phonemes in F is:

$$\Pr(X \le k) = \sum_{i=0}^{k} \binom{n}{i} p^{i} (1 - p)^{n - i}$$

5.2 Weighting the phonemes of a segment

The foregoing use of the binomial distribution is quite powerful and grounded nicely in mathematical theory, but it assumes that all consonantal phonemes are of equal importance to the poet in producing soundplay, which is simplistic. Poets working within a particular poetic tradition may make more use of certain sounds than others. Consider an example of alliteration from Shakespeare:

From forth the fatal loins of these two foes
A pair of star-cross’d lovers take their life.3

In these two lines, /f/ appears five times and /l/ four times, but in seven of the nine times it is the first sound of the word, and in the remaining two cases it is the final sound in the word. A model that weighted all phonemes equally might miss this pattern.

5.3 Weighting phonemes in Biblical Hebrew

From my observations of Biblical Hebrew poetry, the location of the phoneme within the word does not appear to be significant. However, three factors are important: the relative frequency of lexemes, the repetition of lexemes, and the parts of speech of segments. A list of nouns with the definite article ה (h) does not represent a cluster of the ה (h) phoneme. By contrast, a rare lexeme may have been chosen over a more common one precisely on account of its sound. Soundplay using content words is more effective than that which uses function words. The repetition of a particular lexeme does
Soundplay using content words is more effective than that which uses function words. The repetition of a particular lexeme does 1 a f,d ¼ 8 > > > > 1, > < bðpos Þ ¼ 0:8, > > > 0:6, > > : 0:5, :2 1 4f d 1:2 d ( pos pos pos pos ) verb, common noun, 2 adjective, number ¼ pronoun ¼ particle ¼ proper or gentilic noun Figures 2 and3 show the frequency weights for the frequencies a f , d for the lexemes of the Hebrew Bible. Unlike the functions a and b, one can define c r, m, f , d in a manner that is theoretically grounded. The binomial distribution provides a probability distribution concerning the number of times we would expect a given lexeme to appear in a window containing m segments. If Literary and Linguistic Computing, Vol. 29, No. 3, 2014 365 Downloaded from http://llc.oxfordjournals.org/ at Pennsylvania State University on September 16, 2016 When this calculation produces a value that is greater than some threshold, it is consistent with the hypothesis that the poet used the set of phonemes F in W for artistic effect. not signal that the poet was specifically seeking to employ phonological parallelism, or soundplay, over against another type of parallelism. A clause (pd wpt wp ¿ lk ‘terror, such as pit, and snare are upon you’; Isaiah 24:17; Jeremiah 48:43) is rhetorically effective on account of its use of different lexemes with similar phonemes. Initially, when each consonantal phoneme was weighted equally, each consonantal phoneme counted as a single Bernoulli trial. However, if one weights the consonantal phonemes of each segment according to the relative frequency of their lexemes, their parts of speech, and the repetition of lexemes in the window, then each consonantal phoneme counts as a trial with weight in the range ½0:0, 1:0. 
The weight is calculated as follows:

$$w(f, d, pos, r, m) = a(f, d) \cdot b(pos) \cdot c(r, m, f, d),$$

where f is the number of occurrences of the lexeme in the corpus, d is the number of segments in the corpus, pos is the part of speech of the segment’s lexeme, r is the number of times the lexeme associated with the segment appears in the window, and m is the window size in segments. There is no theoretical guidance for how to define the functions a and b, but they can be defined in an ad hoc but straightforward and reasonable manner as follows:

Fig. 2 Frequency weight $a(f, d)$ for all lexemes in the Hebrew Bible

Fig. 3 Frequency weight $a(f, d)$ of lexemes occurring $\le 100$ times in the Hebrew Bible

the cdf of the binomial distribution is very high, then the lexeme is overrepresented in the window, and the weight of each occurrence ought to be lower. Let $F(k; n, p)$ be the cdf of the binomial distribution, where k is the number of successes in n trials, where each trial has a probability of success p. Let $F'(y; n, p)$ be the inverse cdf of the binomial distribution. Let

$$c(r, m, f, d) = \begin{cases} 1.0, & \text{if } F\left(r - 1;\, m - 1,\, \frac{f}{d}\right) \le 0.9; \\ \dfrac{\max\left(1,\, F'\left(0.45 + \frac{F\left(r - 1;\, m - 1,\, \frac{f}{d}\right)}{2};\, m - 1,\, \frac{f}{d}\right)\right)}{r}, & \text{if } F\left(r - 1;\, m - 1,\, \frac{f}{d}\right) > 0.9 \end{cases}$$

In words, when a lexeme appears in a window, if by chance it would appear at least as many times

6 Discovering soundplay in a text

Three visualization tools to aid the researcher in discovering soundplay will be presented. The first two are independent of the computational techniques for discovering soundplay described above, while the third makes full use of them. The third has been the most productive.

6.1 Two preliminary visualizations

The first two visualization techniques both rely on mapping from phonemes to color.
This mapping is designed such that similar-sounding phonemes will have similar colors, as shown in Figure 4. The two velar fricatives in Biblical Hebrew were not distinguished in the orthography from their pharyngeal counterparts, so they are colored as though they were pharyngeal.

6.1.1 Colored text

With this mapping from consonantal phoneme to color in hand, it is possible to display the text of the Hebrew Bible with each consonantal phoneme colored accordingly, as shown in Figure 5. As this visualization tool was written in XHTML and JavaScript and none of the major web browsers could handle the placement of vowels and cantillation marks

Fig. 4 Mapping from Hebrew consonantal phonemes to colors

elsewhere in the window as it actually does at least one-tenth of the time, then the repetition weight is 1; it receives full weight. However, if the lexeme appears more times than that in the rest of the window, then the weight is scaled down. The function c is defined in such a way that it is continuous.

Without weighting phonemes, a window contained k successes in n Bernoulli trials, where $k, n \in \mathbb{Z}$, so the discrete binomial distribution was appropriate. Once phonemes are weighted differently, $k, n \in \mathbb{R}$, so a continuous distribution is required. The cdf of the binomial distribution can be represented in terms of the continuous regularized incomplete beta function, so it is used.

Weighting the phonemes also affects the baseline probability of a success in the weighted trials. If each phoneme has a constant weight, this presents little challenge. However, the weights, as we have defined them, are dependent on the selection of the window because they take into account repetition of lexemes within the window. A window-independent weight is required for the phoneme.
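The continuous extension of the binomial cdf mentioned above can be sketched as follows. The sketch relies on the standard identity $F(k; n, p) = I_{1-p}(n - k, k + 1)$, where $I_x(a, b)$ is the regularized incomplete beta function, and evaluates $I_x$ with a textbook continued-fraction routine. The function names are hypothetical, and treating k and n as sums of phoneme weights is an assumption about how the weighted trials are tallied, not a detail confirmed by the text.

```python
import math


def _beta_continued_fraction(a, b, x, max_iter=200, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def regularized_incomplete_beta(a, b, x):
    """I_x(a, b) for a, b > 0 and 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_continued_fraction(a, b, x) / a
    return 1.0 - front * _beta_continued_fraction(b, a, 1.0 - x) / b


def binomial_cdf_continuous(k, n, p):
    """F(k; n, p) extended to real-valued k and n."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    return regularized_incomplete_beta(n - k, k + 1.0, 1.0 - p)


def weighted_window_cdf(weights, in_target, p):
    """cdf for a window whose phonemes carry weights in [0, 1].

    Assumption: n is the total weight and k the weight of phonemes in F.
    """
    n = sum(weights)
    k = sum(w for w, hit in zip(weights, in_target) if hit)
    return binomial_cdf_continuous(k, n, p)
```

For integer k and n the function agrees with the discrete cdf, so unweighted windows are a special case of the weighted calculation.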
This window-independent weight is calculated by finding the average weight of the phoneme for all windows of which it is a part, only requiring that the window’s length be in the range $[w_{min}, w_{max}]$ words. In all work described henceforth, $w_{min} = 5$ and $w_{max} = 25$.

while having them a separate color from the consonant to which they are attached, the vowels and cantillation marks surrounding a consonantal phoneme are colored. The chief benefit of this visualization tool is that it allows the researcher to read the text at the same time as seeing the colors, performing a close reading of the text. The same is true of other visualization tools that other scholars have created to aid in the close reading of poetry with particular attention to sound: AnalysePoems, ProseVis, and PoemViewer (Plamondon, 2005; Clement et al., 2013; Coles and Lein, 2013).

Fig. 5 Psalm 119:1-13 with color mapping of Hebrew consonantal phonemes

6.1.2 Coloring with one vertical strip per phoneme

The second visualization tool allows the researcher to view a large swath of text at one time. In this visualization tool, the phonemes themselves are not shown. Rather, each phoneme is given a box that is only one pixel wide and several pixels high. The box is filled with the color corresponding to the phoneme. Importantly, one can fill the boxes for any subset of the phonemes while leaving the rest black. Figures 6 and 7 show this visualization tool, which was written in Java, with particular phonemes selected for viewing. The white boxes at the end of a line indicate the end of a chapter. On mouseover, the researcher learns the location in the text via a tooltip.

Fig. 6 Psalm 1ff. (one Psalm per line) showing ר (resh)

Fig. 7 Psalm 1ff. (one Psalm per line) showing the dental stops (ת (t), ד (d), and ט (t'))

6.2 Visualization with the computational techniques described above

The third and final visualization tool has been the most helpful at finding soundplay because it uses the computational techniques described above. Once the user selects a phoneme or group of phonemes, it tests every possible window in each text $t = \{w_1, w_2, \ldots, w_b\}$ in the range $[w_{min}, w_{max}]$ words long. I exclude all windows that have fewer than $p_{min}$ phonemes in F. In all work described henceforth, $p_{min} = 3$. For each word, it records the highest cdf value of any window containing that word. These values can then be color-coded and shown in a visually compact manner. Each word receives a box one pixel in width and several pixels in height. The colors are chosen according to the value v for each word using a rainbow color scale:

$$\text{Color} = \begin{cases} \text{Red}, & v \ge 0.999 \\ \text{Orange}, & 0.99 \le v < 0.999 \\ \text{Yellow}, & 0.98 \le v < 0.99 \\ \text{Green}, & 0.97 \le v < 0.98 \\ \text{Blue}, & 0.95 \le v < 0.97 \\ \text{Indigo}, & 0.5 \le v < 0.95 \\ \text{Violet}, & v < 0.5 \end{cases}$$

Figures 8 and 9 show this visualization tool for two different sets of phonemes. The location in the corpus is provided in a tooltip on mouseover.

Fig. 8 This shows Psalms 23-37 (one psalm per line) for the phoneme ר (resh)

Fig. 9 This shows Psalms 84-113 (one psalm per line) for the consonantal phonemes in the word Jerusalem.

7 Validating the existence of soundplay in the psalms

One can use statistical and computational techniques to validate that a poetic corpus does in fact contain soundplay. This requires a two-step process. The first step is to produce a corpus that is comparable to the poetic corpus yet has little or no artistic soundplay. The second step is to quantify the amount of soundplay in each of these corpora and compare them. Given the discussion above, the second step is more straightforward and will thus be discussed first. The discussion of developing comparable corpora will be presented next, followed finally by the results of the comparisons, which do validate that there is soundplay in the psalms.

7.1 Measuring soundplay in a corpus

To define a metric of relative soundplay in a corpus, one can make use of the computational techniques described above. In particular, the third visualization technique above described finding a peak value for each word. One can set a threshold for this peak value and count how many words in the corpus exceed that threshold. For example, if the threshold is 0.999, this is equivalent to counting the number of red, one-pixel by several-pixel boxes in the visualization tool. This number can be compared to the number of red boxes for another corpus, dividing each by the corpus size in words if the two corpora are of different lengths.

7.2 Producing a comparable corpus

There are two methods by which one can produce a corpus comparable to the original poetic corpus $C_{poetic}$ yet lacking a significant amount of soundplay. The first is to generate a test corpus by rearranging the words of the poetic corpus at random. The second involves comparing the poetic corpus to a corpus of prose texts. Each of these ways will be discussed in turn.

7.2.2 Using a prose corpus as a test corpus

The second way to produce a corpus for comparison to $C_{poetic}$ is to use an already-existing prose corpus in the same language. Prose can contain soundplay as well, but it is likely that soundplay is not as prominent in prose as it is in poetry in most literary traditions.
In the case of the Hebrew Bible, the following predominantly prose books were selected: Genesis, Exodus, Leviticus, Numbers, Deuteronomy, Joshua, Judges, Ruth, 1-2 Samuel, 1-2 Kings, 1-2 Chronicles, Ezra, Nehemiah, and Esther. There are some potential differences between the two corpora. The language is a bit more formulaic in prose, and word order is less fluid. There is more varied vocabulary in poetry than in prose. Certain particles that are extremely common in prose are less common in poetry, and uncommon words appear more frequently in poetry than in prose. Moreover, content words are likely to be distributed in prose in a manner that less resembles a Poisson distribution than in poetry. This is especially the case for proper nouns, which, excluding divine

7.2.1 Generating a test corpus by rearranging the words of the poetic corpus

To generate a corpus that shares most of the characteristics of $C_{poetic}$ yet has no artistic uses of soundplay, one can rearrange the words of the poetic corpus at random. In this manner, the distributions of phonemes within a word can be preserved, and the text boundaries in $C_{poetic}$ can be preserved as well. It is important to rearrange the words at random, not the phonemes, because the phonemes within a token or word are not independent of one another. This is true of languages generally, but it is especially true of Semitic languages like Biblical Hebrew, wherein nearly every word derives from a tri-consonantal root. There are phonological constraints on the selection of the three consonants of the roots (Greenberg, 1950; cf. Weitzman, 1987; Frisch, 2004; Frisch et al., 2004; Vernet, 2011). This generated corpus cannot possibly share every characteristic of $C_{poetic}$ without itself being $C_{poetic}$.
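The rearrangement step can be illustrated with a short Python sketch. It rests on one reading of the description, flagged here as an assumption: all words of the corpus are pooled, shuffled, and then dealt back out so that every text keeps its original length in words. The function name and the `seed` parameter are hypothetical.

```python
import random


def rearrange_corpus(texts, seed=None):
    """Shuffle whole words across the corpus while keeping text lengths.

    `texts` is a list of texts, each a list of words. Words are moved as
    units, so the phonemes inside each word stay together.
    """
    rng = random.Random(seed)
    pool = [word for text in texts for word in text]
    rng.shuffle(pool)  # rearrange words, never individual phonemes
    rearranged, position = [], 0
    for text in texts:
        # Re-cut the shuffled pool at the original text boundaries.
        rearranged.append(pool[position:position + len(text)])
        position += len(text)
    return rearranged


# One hundred rearranged corpora, as in the comparison described in the text.
toy_psalms = [["mlk", "ktb", "dbr"], ["rbb", "brk"], ["hll"]]
rearranged_corpora = [rearrange_corpus(toy_psalms, seed=i) for i in range(100)]
```

Shuffling within each text separately would be a different but equally plausible design; the pooled version above keeps the corpus-wide word inventory and the text boundaries fixed.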
The differences in characteristics between the generated corpus and $C_{poetic}$, however, are minor and unlikely to be of consequence in measuring soundplay. If the language in $C_{poetic}$ is fairly formulaic, then one would expect some n-grams (for n > 1) at the word level to be quite frequent in $C_{poetic}$, whereas one would not expect this in the generated corpus. This could affect a measure of soundplay in the corpus if the commonly adjoining words happen to contain similar phonemes. However, the psalms are not particularly formulaic; their vocabulary is more varied than the vocabulary of the prose books. Additionally, there is no reason to expect frequent n-grams to contain the same phonemes. The one area that has been argued to be formulaic in the psalms is fixed word-pairs.4 That is, particular pairs of words are commonly used in parallel lines, whether the two words be synonyms, antonyms, etc., but this is unlikely to produce a significant effect. Such a pair consists of only two words, usually separated by several words that vary. Moreover, there is no reason to expect these word-pairs to contain the same phonemes an abnormal proportion of the time.

After rearranging the words of the poetic corpus, we expect the distances between segments corresponding to the same lexemes to follow a Poisson distribution. This is generally also the case in natural language for function words, but occurrences of particular content words tend to bunch together. That is, the variance of the distances between the occurrences of a content word is larger than the variance of a Poisson distribution. This can affect a measure of soundplay, but its effect is mitigated by two factors. First, the more varied vocabulary of the Book of Psalms lessens the effect of this phenomenon. Second, the measure of soundplay that I am using is intentionally chosen to mitigate the effects of the repetition of lexemes by weighting those repeated words lower.
Thus, I do not anticipate that these differences between Cpoetic and the generated corpus pose any problems. D. C. Benner 7.2.3 Results of the comparisons 7.2.3.1 Comparing psalms to rearranged psalms I generated a corpus from the psalms with the words rearranged 100 times. Let us call them Crearranged ¼ Crearranged1 , . . . , Crearranged100 . Using my metric of counting the number of peaks above a certain threshold t, we do indeed find a difference between the psalms and these rearranged corpora, provided that t is set sufficiently high. For t ¼ 0:999, Figure 10 compares the number of peaks above the threshold for each consonantal phoneme in Cpoetic over against Crearranged . The average value in Figure 10 is 74. If each corpus in Crearranged were compared against the remaining 99 corpora in Crearranged , the average value on a comparable graph would be in the range ½35, 64. Similarly, if one totals the peaks above the threshold for all of the consonantal phonemes, this sum for the corpora in Crearranged is in the range ½2444, 3573. By contrast, the sum for Cpoetic is 4340, as shown in Figure 11. Thus, above the 0:999 threshold, it is reasonable to expect approximately 56 82% of the peaks to be the result of chance and a limited phonemic inventory, while the remaining 18 44% are the result of the artistic use of sound. When the threshold is lowered to 0:99, Cpoetic still looks to have more soundplay than the corpora in Fig. 10 Word peaks above threshold 0.999 in psalms as percentile of word peaks above the same threshold in C rearranged Fig. 11 Number of peaks above threshold 0.999 for all consonantal phonemes 372 Literary and Linguistic Computing, Vol. 29, No. 3, 2014 Downloaded from http://llc.oxfordjournals.org/ at Pennsylvania State University on September 16, 2016 names, are much more common in Biblical Hebrew prose than poetry. All of these are mostly mitigated by my weighting strategies. 
If there is any residual effect, it would be to skew measures of soundplay in the direction of prose. Additionally, the texts are longer in prose than in poetry. As a result, a higher proportion of the words in prose are part of as many windows as possible. The more windows of which a word is a part, the more likely it is to have a high maximum value. Again, this difference tilts my measure toward favoring prose. Thus, if my measure of soundplay results in showing more soundplay in poetry despite these differences and despite the fact that there is probably some soundplay in prose as well, then one can be even surer of the result that the poetic corpus contains soundplay. The Sounds of the Psalter 7.2.3.2 Comparing psalms to prose Cprose consists of seventeen books of the Hebrew Bible primarily consisting of prose. Figures 16 and 17 show the peak word values by consonantal phoneme for the thresholds 0:999 and 0:99, respectively. Figure 18 shows the peak word values for all consonantal phonemes for a variety of threshold values in one graph. It shows that the psalms word peaks are substantially higher than the prose word peaks for the rightmost tail of the distribution, where the threshold is 0:99 or higher, indicating that there is more soundplay in the psalms than in the prose books. 7.2.4 Conclusion The comparison between the psalms and Crearranged as well as the comparison between the psalms and Cprose show that there is indeed soundplay in the psalms. Moreover, it shows that a peak value below the threshold 0:99 should not be taken as significant. Regardless of the threshold used, there will be peaks that are simply the result of chance and a limited phonemic inventory. 
Computational techniques can show that sound is plausibly used for artistic purposes in a passage, but it cannot prove that any given instance of soundplay is artistic and not the result of chance; the human critic has to look for additional lines of evidence to distinguish between the two or humbly accept uncertainty. Fig. 12 Word peaks above threshold 0.99 in psalms as percentile of word peaks above the same threshold in C rearranged Fig. 13 Number of peaks above threshold 0.99 for all consonantal phonemes Literary and Linguistic Computing, Vol. 29, No. 3, 2014 373 Downloaded from http://llc.oxfordjournals.org/ at Pennsylvania State University on September 16, 2016 Crearranged , but the gap is much narrower on a percentage basis. Figures 12 and 13 are the two comparable charts with t ¼ 0:99. This suggests that by the time the threshold t ¼ 0:99, most of the peaks correspond to chance. However, perhaps between 3 12% of them are artistic uses of sound, some of which satisfied the more rigorous threshold of t ¼ 0:999. When one lowers the threshold further to 0:98, one finds no distinction between Cpoetic and Crearranged . Thus, in looking for artistic uses of sound in the psalms, one should look only at values above 0:99. Figures 14 and 15 show the peaks in the range ½0:98, 0:99. D. C. Benner Fig. 15 Number of peaks in range ½0:98, 0:99 for all consonantal phonemes Fig. 16 Comparison of word peaks above threshold 0:999 in psalms and prose books by consonantal phoneme 8 Results The computational techniques described above have been successful in uncovering soundplay in the 374 Literary and Linguistic Computing, Vol. 29, No. 3, 2014 psalms. One example is presented here. In Psalm 37:34-36, the r ( ) phoneme appears precisely one time in each of fifteen consecutive words. Psalm 37 is an acrostic poem, and these fifteen words span the Downloaded from http://llc.oxfordjournals.org/ at Pennsylvania State University on September 16, 2016 Fig. 
14 Word peaks in the range ½0:98, 0:99 in psalms as percentile of word peaks in the same range in C rearranged The Sounds of the Psalter Fig. 18 Comparison of peak word values for the psalms and prose books q (k') and r ( ) sections. This soundplay helps to reinforce the acrostic structure of the poem and also serves to bind the q (k') and r ( ) sections together. Figure 19 shows the text using the first visualization tool presented above, along with a transliteration of its consonants and a translation. The second and third visualizations for Psalm 37 are also shown in Figures 20 and 21. With one notable exception (Levine, 2003, p. 78–9), modern exegetes, following the lead of the LXX, the ancient Greek translation of the Hebrew Bible, have not noticed this phonological pattern and have sought to smooth ( ; the sense of verse 35 via emendation: ‘high-spirited, arrogant’) in place of ( ; ‘vio( ; ‘one who elevates himself’) in lent’),5 place of ( ; ‘spreading’), and ( ; ( ; ‘native’). While the ‘cedar’) in place of final of these is plausible, the first two miss the poet’s phonological artistry. The poet’s word choice in verse 35 might be sub-optimal when considered semantically, but that is because sound, not semantics, was driving his word choice. Exegetes’ Literary and Linguistic Computing, Vol. 29, No. 3, 2014 375 Downloaded from http://llc.oxfordjournals.org/ at Pennsylvania State University on September 16, 2016 Fig. 17 Comparison of word peaks above threshold 0:99 in psalms and prose books by consonantal phoneme D. C. Benner Fig. 20 Psalm 37 with Visualization Tool 2 Fig. 21 Psalm 37 with Visualization Tool 3 emendations represent an aesthetically appealing alternative artistry available to the ancient poet, but it is not the artistry that the ancient poet chose. 9 Conclusion Computational techniques are necessary for evaluating soundplay in a corpus. 
Past discussions of soundplay in Northwest Semitic poetry have mostly been driven by intuition, but human critics are not good at distinguishing between artistic uses of sound and the results of chance and a limited phonemic inventory. Computational techniques developed for other poetic corpora have been more advanced, but none of them has been completely satisfactory. The approach described in this article is grounded firmly in mathematical theory, makes maximum use of the available data, and recognizes that some phonemes are more important to a poem than others.

Computational techniques are helpful not only in evaluating soundplay that a scholar has proposed but also in aiding a scholar in finding soundplay. Among the three visualization tools presented herein, the third visualization tool has been the most useful in finding soundplay because it uses the computational techniques designed for evaluating whether soundplay exists in a poetic passage.

Searching a poetic corpus for soundplay is fruitless if there is not actually artistic soundplay in that corpus. As a result, I presented techniques that show that there is soundplay in the psalms.

This work has been specifically geared toward Biblical Hebrew poetry, but the techniques are general enough to apply to other poetic corpora as well with only minor changes. The method of segmenting the data might vary from corpus to corpus; for example, line and stanza divisions might be more important to soundplay in other poetic corpora. The way in which phonemes are weighted might also vary; for example, the location of a phoneme within a word or syllable might be significant in other poetic corpora.

Computational techniques are critical to a study of sound in poetry, but they are not sufficient in and of themselves. They do not replace the human critic. When computational techniques suggest that there may be soundplay in a poetic passage, a human critic can bring other lines of evidence to try to determine whether the soundplay is artistic. Certainty is elusive, but an able critic working with computational techniques will produce the best readings of poetic soundplay.

Fig. 19 Psalm 37 with Visualization Tool 1

References

Bailey, R. W. (1971). Statistics and the sounds of poetry. Poetics, 1: 16–37.

Barquist, C. R. (1987). Phonological patterning in Beowulf. Literary and Linguistic Computing, 2: 19–23.

Barquist, C. R. and Shie, D. L. (1991). Computer analysis of alliteration in Beowulf using distinctive feature theory. Literary and Linguistic Computing, 6: 274–80.

Berlin, A. (1985). The Dynamics of Biblical Parallelism. Bloomington, Ind.: Indiana University Press.

Chisholm, D. (1976). Phonological patterning in German verse. Computers and the Humanities, 10: 5–20.

Chisholm, D. (1981). Phonology and style: a computer-assisted approach to German verse. Computers and the Humanities, 15: 199–210.

Clement, T., Auvil, L., Tcheng, D., Capitanu, B., Monroe, M., and Goel, A. (2013). Sounding for meaning: using theories of knowledge representation to analyze aural patterns in texts. Digital Humanities Quarterly, 7. http://digitalhumanities.org/dhq/vol/7/1/000146/000146.html (accessed 13 May 2014).

Coles, K. and Lein, J. (2013). Finding and figuring flow: notes toward multidimensional poetry visualization. iConference 2013 Proceedings [Online]. http://hdl.handle.net/2142/38940 (accessed 18 October 2013).

Frisch, S. A. (2004). Language processing and segmental OCP. In Hayes, B., Kirchner, R., and Steriade, D. (eds), Phonetically-based phonology. Cambridge: Cambridge University Press.

Frisch, S. A., Pierrehumbert, J. B., and Broe, M. B. (2004). Similarity avoidance and the OCP. Natural Language and Linguistic Theory, 22: 179–228.

Greenberg, J. H. (1950). The patterning of root morphemes in Semitic. Word, 6: 162–81.

Hidley, G. R. (1986). Some thoughts concerning the application of software tools in support of Old English poetic studies. Literary and Linguistic Computing, 1: 156–62.

Jackson, E. (1942). The quantitative measurement of assonance and alliteration in Swinburne. American Journal of Psychology, 55: 115–23.

Leavitt, J. A. (1976). On the measurement of alliteration in poetry. Computers and the Humanities, 10: 333–42.

Lessard, G. and Hamm, J.-J. (1991). Computer-aided analysis of repeated structures: the case of Stendhal’s Armance. Literary and Linguistic Computing, 6: 246–52.

Levine, N. (2003). Vertical poetics: interlinear phonological parallelism in psalms. Journal of Northwest Semitic Languages, 29: 65–82.

Logan, H. M. (1988). Computer analysis of sound and meter in poetry. College Literature, 15: 19–24.

Magnuson, K. (1962). Consonant repetition in the lyric of Georg Trakl. Germanic Review, 37: 263–81.

Magnuson, K. (1966). Phonological Investigations into the Structure of German Verse. Ph.D. dissertation, University of Michigan.

Margalit, B. (1975). Studia ugaritica I: introduction to Ugaritic prosody. Ugarit Forschungen, 7: 289–313.

Pardee, D. (1973). A restudy of the commentary on Psalm 37 from Qumran Cave 4 (Discoveries in the Judaean Desert of Jordan, vol. V, no 171). Revue de Qumran, 8: 163–94.

Plamondon, M. R. (2001). The Musical Aesthetics of the Poetry of Tennyson and Browning. Ph.D. dissertation, University of Toronto.

Plamondon, M. R. (2005). Computer-assisted phonetic analysis of English poetry: a preliminary case study of Browning and Tennyson. TEXT Technology, 14: 153–75.

Plamondon, M. R. (2009). Poetic waveforms, discrete Fourier transform analysis of phonemic accumulations, and love in the garden of Tennyson’s Maud. Digital Studies / Le champ numérique (Online), 1. http://www.digitalstudies.org/ojs/index.php/digital_studies/article/view/179/228 (accessed 26 July 2013).

Shirley, C. G. (1979). Alliteration as evidence in dating a poem of Thomas Churchyard: an exploratory computer-aided study. Modern Philology, 76: 374–7.

Skinner, B. F. (1939). The alliteration in Shakespeare’s sonnets: a study in literary behavior. Psychological Record, 3: 186–92.

Skinner, B. F. (1941). A quantitative estimate of certain types of sound-patterning in poetry. American Journal of Psychology, 54: 64–79.

Vernet, E. L. (2011). Semitic root incompatibilities and historical linguistics. Journal of Semitic Studies, 56: 1–18.

Watson, W. G. E. (1984). Classical Hebrew Poetry: A Guide to its Techniques. Sheffield, England: JSOT Press.

Weitzman, M. (1987). Statistical patterns in Hebrew and Arabic roots. Journal of the Royal Asiatic Society of Great Britain and Ireland, 1: 15–22.

Wright, N. B. (1974). Measuring alliteration: a study in method. In Mitchell, J. L. (ed.), Computers in the Humanities. Edinburgh: Edinburgh University Press.

Notes

1 Baruch Margalit does attempt to delineate rules for alliteration in Ugaritic poetry, but his approach is statistically meaningless (Margalit, 1975, p. 311).

2 Several other studies deserve mention, though their methods are either rather simplistic or not given in as much detail as one might like: Shirley (1979); Hidley (1986); Barquist (1987); Logan (1988); Barquist and Shie (1991); Lessard and Hamm (1991).

3 Romeo and Juliet, Act I, Prologue, lines 5-6.

4 For an overview of research on word pairs in Biblical Hebrew poetry, see Watson (1984, p. 128–44). For a more skeptical treatment that explains fixed word pairs in Biblical Hebrew in terms of psycholinguistics rather than a culturally specific stock vocabulary of word-pairs, see Berlin (1985, p. 65–83).

5 Note that 4Q Psalm 37 preserves עריץ (‘violent’) but is broken such that the other two words are unknown. See Pardee (1973, p. 166).
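Illustrative sketch. The rearrangement procedure of Section 7.2.1 and the percentages reported in Section 7.2.3.1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and is not the code used for this study. The function names `rearrange_words` and `chance_share` are hypothetical. The sketch assumes that words are shuffled across the whole corpus while each text keeps its original number of words, which is one reading of "text boundaries can be preserved". The peak metric itself is not implemented; only the peak totals reported above (2444, 3573, and 4340) are used.

```python
import random
from typing import List, Sequence, Tuple


def rearrange_words(texts: Sequence[Sequence[str]], rng: random.Random) -> List[List[str]]:
    """Shuffle whole words across the corpus, keeping each text's word count.

    Words, not phonemes, are shuffled, so the phoneme sequence inside
    every word is left intact (assumption: one word = one list item).
    """
    pool = [word for text in texts for word in text]
    rng.shuffle(pool)
    rearranged, start = [], 0
    for text in texts:
        rearranged.append(pool[start:start + len(text)])
        start += len(text)
    return rearranged


def chance_share(observed: int, null_low: int, null_high: int) -> Tuple[float, float]:
    """Bounds on the share of observed peaks that chance alone could explain.

    null_low and null_high are the smallest and largest peak totals seen
    in the rearranged corpora; observed is the total for the real corpus.
    """
    return null_low / observed, null_high / observed


if __name__ == "__main__":
    # Toy corpus of two "texts"; real input would be segmented psalms.
    toy = [["a", "b", "c"], ["d", "e"]]
    shuffled = rearrange_words(toy, random.Random(0))
    print([len(text) for text in shuffled])

    # Peak totals reported for t = 0.999.
    low, high = chance_share(4340, 2444, 3573)
    print(f"chance: {low:.0%} to {high:.0%}; artistic: {1 - high:.0%} to {1 - low:.0%}")
```

Dividing the lowest and highest rearranged totals by the observed total gives the bounds on the share of peaks attributable to chance (2444/4340 ≈ 0.56 and 3573/4340 ≈ 0.82), and the complement gives the 18–44% attributed to artistic use of sound. Passing a seeded `random.Random` keeps each generated corpus reproducible.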