Scat syllables and markedness theory*
Patricia A. Shaw
University of British Columbia
There could be no more appropriate dedication to Jack:
Thou Swell (scat solo)
Thou swell 1927. Words by Lorenz Hart, Music by Richard Rodgers.
Scat solo by Betty Carter, transcribed by William R. Bauer (2002a: 251).
A highly creative domain between the prosodies of human language
and the riffs of instrumental jazz is the dynamic vocal jazz idiom of
scat. The present analysis proceeds from the observation that, despite
the distinctly individualistic approaches to scatting by renowned jazz
masters such as Louis Armstrong, Betty Carter, and Chet Baker, the
inventory of the semantically empty syllables used in scat is extremely
limited in comparison to the rich range of combinatorial possibilities
that define well-formed syllables in English. This paper explores the
degree to which the form of scat syllables in the performance repertoire
of various artists conforms to postulated universal markedness
constraints on natural language syllables. Significantly, markedness
theory plausibly accounts for a considerable range of the data.
Nonetheless, certain systematic deviations occur. It is proposed that
the relative markedness of such properties may be genre-dependent,
functioning in scat to enhance musical form or modality.
1. Introduction
Like the majority of human languages in the world, which evolved and persist as strictly
oral traditions, scat emerged in the realm of musical genres as a vibrant, expressive, and
exclusively oral idiom. However, unlike human languages, scat does not build on a
consistent, conventionalized relationship between sound and meaning. Its essence is
creative, improvisational vocal tract sound. Its syllables and sequences are evocative and
emotive, but not denotative. There is no standardized or systematic interpretability to the
musically parsed cadences of scat syllables. For example, the title of Louis Armstrong’s
1926 hit Heebie Jeebies has a consistent interpretation, verifiable across different speakers, as
* I am deeply indebted to Mike Fitzgerald, Kate Hammett-Vaughn, Ted Moore, Tyler Peterson, Suzanne
Pittson, Fred Stride, and particularly Bill Bauer and Alan Matheson for their generous guidance. Special
thanks to Walter Pedersen for his enthusiastic assistance with transcription and in tracking recordings.
Toronto Working Papers in Linguistics 27: 145–191
Copyright © 2008 Patricia Shaw
referring to a kind of nervous energy or a scattered uneasy feeling, the ‘jitters’. However, the
sequence of syllables in the scat line in (1) would elicit no coherent consensual meaning.¹
(1) Bars 5–7 of the scat solo by Louis Armstrong in Heebie Jeebies (1926)²
WRB: | duw Œ daw diy duw də | ‰ diy də də dow diy | dow di dow duw ‰ duw– |
| duw Œ daw diy duw də | ‰ diy də də dow diy | dow dɩ dow duw ‰ duw– |
In a formal linguistic sense, then, scat syllables are semantically ‘empty’.
Nonetheless, of considerable linguistic interest is their form. The present analysis
proceeds from the observation that, despite the distinctly individualistic approaches to
scatting by great vocal jazz masters, the repertoire of the semantically empty syllables used
in scat is extremely limited in comparison to the rich range of combinatorial possibilities
that define well-formed syllables in English. For example, two properties of the excerpt
in (1) are immediately noteworthy, and, as it turns out, are robustly characteristic of scat
vocables produced by a broad diversity of performers. First, consider the onset and coda³
structure of the scat syllables in (1): of the 15 syllables, all have a single consonant as
onset—there are no clusters, no onsetless syllables, and none has even a single consonant
as coda. In other words, all are ‘open’ CV or CVG syllables, despite the fact that English
words are built on an inventory of combinatorial possibilities that readily sanctions codas,
and allows quite extensive complexity within both onset and coda clusters, e.g. as [str...]
and [...ŋkθs] in ‘strengths’ [strɛŋkθs]. Secondly, not only do all the scat syllables in (1) have a
non-complex onset, but in fact they all have the same consonant [d] as the syllable onset.
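The two observations about (1) can be checked mechanically. The following is a minimal sketch, not part of the original analysis: the function name split_syllable and the VOWELS and GLIDES sets are assumptions introduced here for illustration, and they cover only the transliterated symbols cited in this paper. Following the conventions of footnote 3, a post-vocalic glide is treated as part of the nucleus (VG) rather than as a coda.

```python
# Illustrative sketch only: the segment classes below are assumptions chosen
# to cover the transliterated forms cited here, not a full phonology of English.
VOWELS = set("aeiouəɩɛæɔʋ")
GLIDES = set("yw")


def split_syllable(syl):
    """Split a transliterated syllable into (onset, nucleus, coda)."""
    i = 0
    while i < len(syl) and syl[i] not in VOWELS:
        i += 1  # leading consonants form the onset
    j = i
    while j < len(syl) and syl[j] in VOWELS:
        j += 1  # vowel(s) of the nucleus
    if i < j < len(syl) and syl[j] in GLIDES:
        j += 1  # a post-vocalic glide belongs to the nucleus (VG)
    return syl[:i], syl[i:j], syl[j:]


# The 15 syllables of excerpt (1), with rests and bar lines omitted.
EXCERPT_1 = ("duw daw diy duw də diy də də dow diy "
             "dow dɩ dow duw duw").split()

parsed = [split_syllable(s) for s in EXCERPT_1]
onsets = {onset for onset, _, _ in parsed}
codas = {coda for _, _, coda in parsed}
print(len(parsed), onsets, codas)
```

Run on the excerpt, the sketch finds a single onset value, d, and no non-empty coda across all 15 syllables. It does not handle syllabic consonants, which have no vowel and would be returned entirely as an onset.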
As a means of comparison, now consider in (2) the structure of the syllables in
another scat solo by Louis Armstrong from Hotter Than That, recorded 3 years later
(cited from Reeves 2001 by Bauer 2002b: 308). Just as in (1), all the syllables in (2) are
canonical open syllables: all have a single segment onset, none has a complex onset, and
none has a coda. However, in contrast to (1), there are no [d]s. Rather, here all 16 syllables
have [b] as their onset.
(2) Bars 49–54 of the scat solo by Louis Armstrong in Hotter Than That (1929)
WRB: | boh ‰ bə Œ boh | ‰ ba Œ bə ‰ biy | Œ bə ‰ biy Œ | bow ‰ bə Œ bow | ‰ bə Œ ba ‰ biy | Œ ba ‰ biy Œ |
| bɔ ‰ bə Œ bɔ | ‰ ba Œ bə ‰ biy | Œ bə ‰ biy Œ | bow ‰ bə Œ bow | ‰ bə Œ ba ‰ biy | Œ ba ‰ biy Œ |
¹ An independent measure of what has—or has not—conventionalized semantic interpretability is
reflected by which sequences of sounds are accorded entry as ‘words’ in standard English dictionaries.
Consistent with the particular example chosen here, ‘heebie-jeebies’ is listed as a word in the American
Heritage dictionary: “slang. A feeling of uneasiness or nervousness; the jitters.” However, none of the
various potential spellings of the scat syllables (de, dee, deh, di, dih, du, duh, doo, etc.) are.
² The transcription line labelled WRB is by Bauer 2002b: 308; the transliteration beneath it follows the
principles of phonemic interpretation in Appendix 1.
³ The use of the terms onset and coda here does not entail the attribution of category or constituency
status within a formal theory of prosodic structure. Rather, these are simply cover terms to reference (i)
as onset, the string (possibly null) of segments between the left edge of a syllable and the Nucleus, and
(ii) as coda, the string (possibly null) of segments between the Nucleus and the right edge of a syllable.
The Nucleus is assumed to be an independent category node, which in English dominates a short vowel
(V), long vowel (Vː), or diphthong (VG). C abbreviates ‘consonant’, V ‘vowel’, G ‘glide’, σ ‘syllable’.
In sum, two generalizations are strikingly evident from the data in (1) and (2). First,
of the 24 consonants available in the English phonemic inventory, the only two used in
these excerpts are [b] and [d]. Secondly, the syllable structure is consistently open, i.e. not
closed by a coda consonant. To an extreme then, Louis Armstrong’s repertoire in these
citations exemplifies the fundamental premise of this research: scat draws on a very limited
subset of the sounds and of the syllabic groupings that are regularly used in English.
However, how representative are these generalizations? Is the favouring of [b] and
[d] part of Satchmo’s own particular idiosyncratic style, or is this genuinely something that
is broadly characteristic of scat? What—you are doubtless wanting to interject—about
the [š] in shoo be doo? And to what extent do other scat singers use a more diversified and
complex range of syllable shapes? What about codas? After all, who put the bop in the bop
shoo bop shoo bop?⁴
A diverse sampling of vocal scat is investigated here, ranging from classic jazz icons
like Louis Armstrong, Betty Carter, and Chet Baker to pop music song-writers/recording
artists like Johnny Cymbal and Barry Mann, who in the early 60s wittily transported the
playful and unmistakably sexy edginess of scat directly into their rock’n’roll lyrics. Across
these artists, generations, and genres, the basic introductory observations about scat
are consistently affirmed: the inventory of sounds used and their syllabic organization
constitute a significantly small subset of the full diversity of available English options.
The principal goal is to identify just what generalizations about phonological form hold
within this body of scat data, and to explore various hypotheses that might plausibly
explain why the particular patterns that are attested emerge in scat.
From the perspective of linguistic theory, the observations are evaluated in the
context of postulated universal constraints on articulatory phonetics and phonological
markedness. Interestingly, a considerable range of the data is plausibly accounted for by
markedness theory. Equally interesting is the finding that certain systematically attested
scat properties run directly counter to markedness expectations. The highly marked, yet
robustly attested status of these characteristics suggests that over-riding the body of
linguistic constraints on the scat phonological system are competing constraints on scat
as a musical performance genre, constraints that function to enhance the melodic pitch
contour, the musical phrasing, the auditory interpretation, or the distinctive trademarking
of individual artistic style. What results from this analysis is a unique perspective into
the structural and performative interface of two complex systems of human vocal
expression—music and language—each subject to distinct sets of constraints and
conventions, sometimes convergent and sometimes conflicting, but ultimately combining
in the creative exuberance of scat.
1.1. Purview
The analyses in §2 below are sequenced with respect to the recording date chronology.
Beginning with the seminal 1926 Heebie Jeebies recording, the full context of the Louis
Armstrong scat solo from which the three-bar excerpt in (1) was drawn is explored
⁴ Barry Mann and Gerry Goffin did, in their 1961 hit single, Who Put the Bomp?
in §2.1. This is then compared to the phonological properties of a 1929 rendition
of Hotter Than That. In §2.2, the focus shifts to Chet Baker (1955; 1989), an icon
of consonantal minimalism. In contrast, Betty Carter’s repertoire, representatively
examined (1955; 1979) in §2.3, introduces a considerably expanded consonantal
inventory. Through the subsequent decades, these two artists—Chet Baker and Betty
Carter—remained committed to the vocal jazz idiom of scat despite a significant shift
in the general public’s musical interests away from bebop. For each, a comparison of
performances recorded nearly a quarter century apart provides an interesting measure
of individual creative evolution, as well as of particular consistencies despite dramatic
shifts in the musical and cultural backdrop of the latter half of the twentieth century.
Although the popularity of bebop—the jazz medium that had become virtually
synonymous with scat—had significantly declined by the 60s, the vocabulary of scat
itself surged into a different realm of wide-spread prominence in that same period: the
American Hit Parade. As seen in §3.1 and §3.2, in major hits by recording artists like Barry
Mann (1961) and Johnny Cymbal (1963; re-recorded by Sha Na Na in 1980), canonical
scat syllables are directly imported into lyrics like Who Put the Bomp in the Bomp bah
Bomp bah Bomp? Here scat is explicitly objectified, transported, and incorporated into
a different and evolving musical genre. Although bereft of its improvisational core, this
“embedded scat” phenomenon carries forward the continuing identification of scat as
infectiously fun and irresistibly seductive. Despite rife competition for cornering the
sex appeal market from a burgeoning and rapidly diversifying popular music scene in
America, it was scat (the bop, the dip, and the rama lama ding dong) that “made my baby
fall in love with me, yeah!!” By 1963, Mr. Bass Man’s “baw bə bə baw bə baw bə baw baw”
had elevated him to being “the hidden King of Rock’n’Roll” (Johnny Cymbal 1963), and
scat had clearly spread from bebop jazz to become established in the R’n’R mainstream
as eminently cool.
1.2. Methodology
Transcriptions of the body of scat data that informs the present study are presented
in Appendix 2. With a few notable exceptions, particularly Bauer (2002a, b), there is a
paucity of formal documentation of scat, and the diverse original sources that have been
drawn on here differ considerably in transcription conventions and rigour.
Bauer’s work constitutes an immensely detailed and valuable resource: in the
extensive Appendix (2002a: 245–343) to his outstanding contribution to the study of
Betty Carter’s musical genius, Bauer provides a full transcription in musical notation of
Carter’s melodic line, synchronized with the lyrics, for 15 tunes. Of these, six incorporate
scat vocables, phonologically transcribed by Bauer in Trager-Smith notation. The two
chosen for the analysis in §2.3 allow a comparison across a 24-year time frame stretching
from 1955 to 1979. As well as Bauer’s Betty Carter material, the present analysis also
incorporates his transcription (2002b) of Louis Armstrong’s Heebie Jeebies and his
citation of Reeves’s (2001)⁵ transcription of bars 49–55 of Louis Armstrong’s Hotter Than
That scat solo. Note, however, that the Trager-Smith system adopted by Bauer has been
transliterated here, following the transcription conventions detailed in Appendix 1.
Two other helpful sources were Kernfeld’s (1995) transcription of Armstrong’s
Hotter Than That and Bastian’s (Bastian and Alexander 1995) transcriptions of Chet
Baker’s scat solos. As both these writers used different non-standardized representations
(duh, day, doe, etc.) that were ambiguously interpretable, these were re-transcribed⁶ from
audio files of the original recordings, following the principles in Appendix 1. This re-transcription is directly paired with the source transcriptions in Appendix 2.
For the other songs (§3.1, §3.2), the transcriptions presented here are novel. It is
worth foregrounding the complexity and relatively narrow focus of this task. Because the
goal is to relate the articulatory expression of these singers to the range of phonological
parameters that typologically characterize natural language systems, many features
of the sophisticated manipulations of vocal tract sound are not represented in the
relatively broad transcription system adopted here. Further, individual perceptions of the
appropriate categorization of a constantly mutating cadence of vocables into segmental
values may differ, as discussed in detail in Appendix 1. Given the paucity of literature
on linguistic properties of scat, this preliminary study will hopefully “open the door” to
further research into the nature of this interface.
2. The Phonological Properties of Scat
The analytical goal in this section is to examine the phonological inventory of onsets
and codas⁷ in the scat syllables of the tunes documented in the database in Appendix
2, as well as to determine general properties of syllable shape in the output. Some
challenges related to the fluidity of the medium or of individual expression are raised
in the discussion of particular performances below. More general methodological issues
pertaining to the classification of syllabic form are presented in Appendix 1.
2.1. Louis Armstrong
Of Louis Armstrong’s vast repertoire, an examination of two of his recordings from the
early heyday of jazz in the 1920s serves here to establish a frame of reference both for
Armstrong’s own style and for subsequent diachronic developments in scat.
⁵ Reeves, Scott. 2001. Creative Jazz Improvisation. 3d ed. Upper Saddle River, N.J.: Prentice-Hall. This
resource was not available to me, and hence is cited only through Bauer’s (2002b) reference.
⁶ These were re-transcribed independently by myself and by a research assistant with both musical and
linguistic training. Where there was variance in the transcriptions, either between us and/or with cited
sources (e.g. §2.2), I assume sole responsibility for the interpretation adopted in this analysis.
⁷ Thus, the present focus is on consonantal patterns. For analysis of vowel quality in scat syllable nuclei,
the interested reader is referred to Bauer (2002b), which presents detailed discussion of vocalic ‘timbre’.
2.1.1. Louis Armstrong, Heebie Jeebies (1926)
Even a cursory look at the first 4 non-lexical syllables ([eə iyf gæf əmf]) that lead into
the scat solo of Heebie Jeebies (see Appendix 2.1.1) suffices to identify them as unusual in
comparison with the syllabic patterns which follow. Therefore, the analysis below focuses
first on the subsequent 48 syllable tokens.
The chart in (3) summarizes the findings about simplex syllable onsets. Consonants
which are attested in onset position are in white cells, along with their raw frequency
count. Possible, but unattested, onset consonants appear in shaded cells. Additional
information about onsetless syllables and cluster behaviour is on the right.
(3) Onsets:
    p        t         č    k
    b = 5    d = 37    ǰ    g
    f        θ         s    š
    v        ð         z    ž
    m        n
    l = 1    r = 1     y    w
    h        ʔ
    No Onset: 1/48
    Onset clusters: sk = 3
Viewed against the full backdrop of the 24 consonants which can function as syllable onsets
in English, the fact that 20 (83.3%) are not used at all (viz. the shaded grey cells) clearly
underscores the initial premise that scat is highly selective in its segmental inventory. Of
the four segments [d, b, l, r] that do appear as onsets, [d] is the clear favourite, initiating
37 of the 48 syllables (77.1%). As one might expect from the discussion in §1, the runner-up
is [b] and although it trails far behind with only five appearances (10.4%), its occurrence
is nonetheless salient. The liquids [l] and [r] make an early appearance in syllables 5 and
7 respectively of this set of 48, followed very shortly (beginning with σ3 of bar 4) by a
running stream of 19 consecutive [d]-initial syllables.
Markedly heralding the start of a new phrase in bar 8, an initial [b] breaks the
[d]-only alliteration, leading into an alternating b-d-b-d-d sequence. Then, after
this cascade of 22 [d] onsets with only two [b] onsets having disrupted the auditory
flow, in bar 9 the only consonant cluster hits: [sk]. Its alliterative sequencing (three in
a row), its timing, and its composition all contribute to its striking impact. Nothing
has primed the listener for an [sk] cluster. Although [sk] is not at all an uncommon
English onset, in the context of the segmental composition of Louis Armstrong’s
scat sequence here, it is totally deviant: neither [s] nor [k] occur anywhere else, either
before or after, and it has unique status as the only onset cluster. Frequency, then, is
significant—not only at the high end in terms of ascertaining what segments might
most commonly appear in scat vocalization, but also at the low end in terms of observing
what segments and/or combinations are drawn on only very rarely, to powerful effect.
Although the vast majority of syllables in the Heebie Jeebies solo are open (39/48 =
81%), a few coda consonants are attested; their identity and frequency are shown in the chart
in (4). Of the 21 possibilities,⁸ only three appear, with [p] being the most common. Note
that there is no overlap at all in the identity of the consonants that occur as onsets
⁸ As post-vocalic [w] and [y] appear only in diphthongs, they are not counted as possible codas.
[d, b, l, r, sk] and those that occur as codas [p, m, t]. This is patently not an inherent
characteristic of English (cf. words like pad, tab, mask, etc.), but will be seen to be a
common characteristic, particularly of obstruents, in scat.
(4) Codas:
    p = 6    t = 1    č      k
    b        d        ǰ      g
    f        θ        s      š
    v        ð        z      ž
    m = 2    n        [ŋ]
    l        r        (y)    (w)
    No Coda: 39/48
    Coda clusters: Ø
A final question is whether any particular syllabic forms, from a holistic
perspective, are preferred. In this 48-syllable sample, there are three favoured shapes:
nine tokens of [də], eight each of [diy] and [duw]. Aside from these, there is remarkably
little repetition of exactly the same phonological form in the residual 23 syllables. The
frequency counts of the particular scat shapes are given in the following table:
(5) Frequency     Syllable form
    9 (18.8%)     də
    8 (16.7%)     diy, duw
    3 (6.2%)      dɩ, dow
    2 (4.2%)      dɩp, daw, bə
    1 (2.1%)      biy, bam, bəp, duwt, dey, la, rɩp, ɩp, skiyp, skæm, skɩ
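The onset and coda tallies in (3) and (4) can be cross-checked against the syllable forms in (5). The sketch below is offered only as a consistency check under stated assumptions: the token counts for the less frequent forms are inferred here from the percentages in (5) taken as fractions of 48, the names FORM_COUNTS and SHAPE are hypothetical, and post-vocalic [y] and [w] are assigned to the nucleus, as in footnote 8.

```python
import re
from collections import Counter

# Token counts per syllable form, read off table (5). Counts for the lower
# rows are inferred from the percentages as fractions of 48 (an assumption).
FORM_COUNTS = {
    "də": 9, "diy": 8, "duw": 8, "dɩ": 3, "dow": 3,
    "dɩp": 2, "daw": 2, "bə": 2,
    "biy": 1, "bam": 1, "bəp": 1, "duwt": 1, "dey": 1, "la": 1,
    "rɩp": 1, "ɩp": 1, "skiyp": 1, "skæm": 1, "skɩ": 1,
}

V = "aeiouəɩɛæ"
# onset = leading non-vowels; nucleus = vowel(s) plus optional glide; coda = rest
SHAPE = re.compile(rf"([^{V}]*)([{V}]+[yw]?)(.*)")

onsets, codas = Counter(), Counter()
for form, n in FORM_COUNTS.items():
    onset, _nucleus, coda = SHAPE.fullmatch(form).groups()
    onsets[onset or "Ø"] += n  # Ø marks an empty onset
    codas[coda or "Ø"] += n    # Ø marks an open syllable

total = sum(FORM_COUNTS.values())
for label, tally in (("onset", onsets), ("coda", codas)):
    for seg, n in tally.most_common():
        print(f"{label} {seg}: {n}/{total} = {100 * n / total:.1f}%")
```

Under these assumptions the tallies agree with the figures cited in the text: 37 [d]-initial and five [b]-initial syllables, three [sk] clusters, one onsetless syllable, and nine closed syllables out of 48.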
Having established this body of generalizations about onsets (3), codas (4), and overall syllabic form (5), let us return to formally consider the properties of the introductory
four syllables: [eə iyf gæf əmf]. Clearly the initial impression that these four syllables
are unusual in the context of the entire scat sequence is indeed validated. Three of the
four are onsetless, compared to only one of the 48 syllables that follow. The only onset
consonant, [g], is unique: this segment appears nowhere else in the full scat database
examined here. With respect to codas, note that there are no coda clusters anywhere
else in the work, whereas this introductory sequence ends emphatically with an [mf]
cluster. Moreover, the last three of these four syllables reiterate the coda [f]: not only
are codas relatively infrequent in the rest of the work (there are only nine codas in 48
syllables: 18.7%), but the particular segment [f] is unattested elsewhere as either a coda
or an onset. Louis Armstrong’s choice of such unusual scat form in this quadra-syllabic
bridge functions dramatically to grab the listener’s attention as Armstrong moves from
the preceding English lyrics invoking everyone to “c’mon and do the Heebie Jeebies
dance” to settle into the full-blown canonical scat syllables that follow.
2.1.2. Louis Armstrong, Hotter Than That (1929)
The second Louis Armstrong tune analyzed here is the much longer 165-syllable scat solo
from Hotter Than That (see Appendix 2.1.2), from which the excerpt cited earlier
in (2) was taken. Whereas bars 49–54, as seen in (2), draw exclusively on a sequence of
[b]-initial syllables, a full count of onsets throughout the solo shows that [d] (= 78) is in
fact used more frequently than [b] (= 57). [d] and [b] are by far the most prevalent onset
consonants, with [b] exceeding the next ranked candidate [w] by a difference of 49.
(6)
Onsets:⁹ d (), b (), w (), l (), r (), n (), m (), y (), h (), t ()
No Onset: /
Onset clusters: zw (), mw (), bw ()
Even in this work where the inventory of onsets stretches to 10 different segments,
consistent patterns recur. For example, the four onsets attested in Heebie Jeebies, viz. [d,
b, l, r] are all included within this larger set. Of the residual segments, all are attested—
though with low frequency—in the other scat data investigated here, except [t]. The
occurrence of [t] as an onset is unique not only in this song (in the second syllable of the
otherwise uniform [d]-initial syllables in line BK⁵), but also in the entire sample of scat
repertoire studied here.
Moving to a consideration of the onset clusters attested in Hotter Than That, we
encounter an interesting trio: [zw] (time 2:02), followed in the same line by [mw] and
shortly thereafter by [bw]. Not only are none of these found elsewhere in the present
database, but none of these /Cw/ sequences is part of the standard repertoire of English
syllable onsets. Louis Armstrong here is clearly deviating from the canonical constraints
on English well-formedness, and native English listeners would, of course, attend to such
novelty immediately. The hypothesis to be advanced here is that such cases illustrate a
domain of tension between linguistic form and musical expression, where enhancement
of the latter is achieved by violation of markedness constraints on the former.
Consider next the coda inventory:
(7)
Codas: p (), t (), m (), ṃ (), n (), l (), g ()
No Coda: /
Coda clusters: Ø
As was the case in Heebie Jeebies, most (76.4%) of the syllables in this tune too are open.
Although there is somewhat greater segmental diversity in the coda repertoire, it is still
very limited: only six of the 21 possible consonantal codas are attested. There are no
coda clusters. Interestingly, the three coda segments ([p, t, m]) that appear in Heebie
Jeebies constitute a proper subset of the larger coda inventory here, with [p] again being
significantly more frequent (2.75 times as often; 52.4% of coda attestations) than its closest
contender [t] (19.0%). Three syllables in this work are realized exclusively as a “syllabic”
[ṃ]. Apart from these cases, [m] functions once as an onset (see (6)) and three times as
a post-vocalic coda. Interestingly, all instances of [ṃ] follow a coda [w] or a [u] in the
preceding syllable. The shared labial gesture across this sequence is a kind of harmonic
pattern which recurs in various forms in other case studies below.
Although [d] is attested as an onset segment 20 more times than [b], when one
looks at which full syllable shapes recur most frequently, the two are pretty comparable:
⁹ For space reasons, for the rest of the discussion attested consonants will not be contextualized within
the full inventory of English as in (3) and (4), but will simply be listed in rank order of frequency.
[ba] edges out [də] by a count of 16 to 15. [bi] in its variant realizations (i.e. with length
and/or homorganic glide) is tied with [da] at twelve occurrences each, then the favoured
[d] takes over in the next most frequent syllables [di] and [du].
(8) Frequency     Syllable form
    16 (9.7%)     ba
    15 (9.1%)     də
    12 (7.3%)     bi ~ biː ~ biy, da
    11 (6.7%)     di ~ diː ~ diy
    8 (4.8%)      du ~ duː ~ duw
Note that none of the most common syllables here have front/back lax or front/back
mid vowels.
2.2. Chet Baker
Among the major scat artists through the decades, Chet Baker is renowned for the
extreme minimalism of the consonant set that forms the basis for his scat improvisations.
A comparison of different takes of the same tune, Everything Happens to Me, recorded
more than three decades apart (1955 compared with 1989), illustrates remarkable
consistency in the consonantal repertoire employed, despite major differences in the
melodic and rhythmic structure.
Transcriptions of the eight-bar scat bridge in these two versions are given in
Appendix 2.2.1 and 2.2.2. Although Jim Bastian’s transcriptions (labelled JB) and my own
(labelled PAS) differ in orthographic form,¹⁰ they are generally consistent in those features
relevant to the present focus.¹¹ However, two domains of difference merit comment.
One pertains to vowel quality: Chet Baker’s vocalization is extraordinarily mobile.
The looseness and fluidity of movement in Baker’s vocalic articulation present significant
challenges, such that the transcribed values that I propose are at best an approximation of
a nuclear target range within the interconsonantal domain. What emerges most reliably
is a general pattern of lax quality (primarily [ɩ ɛ ʋ æ ə]) and the predominant openness
of syllabic form.
The second notable difference between Bastian’s notation and mine pertains to
consonants. Whereas Bastian remarks on the fact that Chet’s ‘scat vocabulary made
predominant use of syllables beginning with the letter “D” ’ (Bastian and Alexander 1995:
4), not all [d]s are distinctly articulated with a full stop closure. In a number of cases, what
is phonetically realized is the corresponding fricative [ð]. For example, the AIF wave file
in (9) from the 1955 version (time = [2:37.6–2:38.4]) shows a sequence of two syllables,
¹⁰ Whereas Bastian’s orthographic interpretation is English-like, e.g. “ee” for [iː], the transcription I offer
follows the principles in Appendix 1, with explicit representation of the more prominent glides but
otherwise just length on the tense vowels.
¹¹ A discrepancy in bar 6 of the 1989 version is that JB documents 2 more syllables than I am able to
discriminate. The present analysis is based on my total count of 53 syllables vs. Bastian’s 55. However, the
strength of the generalizations is statistically robust, regardless of the difference in syllable count.
the first with a clear [d] stop closure attack in comparison with the lack of full closure [ð]
in the onset of the second syllable:
(9) [Waveform of the two-syllable sequence; image not reproduced]
    JB:    d   eh    d   eh
    PAS:   d   ə     ð   ɛ
This tendency is much more prevalent in the 1955 version, where of the 42 “D”
onsets, 13 are realized as [ð]. In the 1989 version, only one of 45 “D”s is. It is entirely
plausible that the phonological “target” in cases like the second consonantal onset in (9)
is indeed a /d/, as consistently represented in Bastian’s transcriptions, but that its lenition
to the smooth, non-punctuated continuant [ð] may reflect Chet Baker’s “airy”, “almost
weightless”, “romantic crooner” style, disarmingly characterized as “being sweet talked by
the void” (Bastian and Alexander 1995: 4).
Invoking Sapir’s (1933) “psychological reality of the phoneme” argument, the
hypothesis advanced here is that Bastian’s perceived “D” is interpretable as a more abstract
level of representation, i.e. phonemic /d/, and that its sometimes lenited non-plosive
phonetic realization as [ð] is a phonologically non-distinctive, surface level articulation.
Consistent with this interpretation is the broad-based generalization in §1 that [d] is part
of the standard scat repertoire; [ð] is not otherwise attested in any of the scat pieces by
other artists studied here. In the analyses that follow, then, Baker’s [ð] articulations are
taken to be epiphenomenal and are not independently represented in his scat inventory.
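The effect of this decision on the counts can be made explicit. The following is a minimal sketch rather than the procedure actually used in compiling the data: the helper names phonemic_onset and lenition_rate are hypothetical, and the only figures used are those reported above (13 of 42 “D” onsets realized as [ð] in 1955, one of 45 in 1989).

```python
from collections import Counter


def phonemic_onset(phone):
    """Map a surface onset to the phoneme assumed in this analysis.

    Assumption: [ð] is treated as a lenited realization of /d/;
    every other segment is left unchanged.
    """
    return "d" if phone == "ð" else phone


def lenition_rate(lenited, total):
    """Percentage of /d/ onsets realized as the fricative [ð]."""
    return 100 * lenited / total


# Reported figures: (number of [ð] realizations, total "D" onsets).
REPORTED = {"1955": (13, 42), "1989": (1, 45)}

for year, (lenited, total) in REPORTED.items():
    # Rebuild the surface onsets, then collapse them to phonemes.
    surface = ["ð"] * lenited + ["d"] * (total - lenited)
    phonemic = Counter(phonemic_onset(p) for p in surface)
    rate = lenition_rate(lenited, total)
    print(f"{year}: /d/ = {phonemic['d']}, of which [ð] = {lenited} ({rate:.1f}%)")
```

Collapsing [ð] into /d/ leaves the onset totals for /d/ at 42 and 45, while the lenition rate itself falls from roughly 31% of /d/ onsets in 1955 to roughly 2% in 1989.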
2.2.1. Chet Baker, Everything Happens to Me (1955)
The onset repertoire of the early (1955) version of Everything Happens to Me reveals a
highly skewed frequency distribution:
(10)
Onsets: d (), y (), b (), h ()
Onset clusters: Ø
No Onset:¹² /
Ambisyllabic [tṇ] coda/onset: /
Similar to what was seen in Louis Armstrong’s rendition of Heebie Jeebies (§2.1.1), where
[d] initiates 37 of the 48 syllables (77%), here /d/ accounts for 77.8% (42/54) of the onsets.
Concomitantly, the relative infrequency of the residual segments raises questions as to
their distribution and functional load. The next most frequent onset is [y]; it occurs only
three times (3/54).
Interestingly, the distribution of these markedly less frequent segments is often
melodically significant. For example, both [h] and [b]—which occur only once each—
¹² The evaluation of No Onset status is challenged by Baker’s fluidity of articulation. Specifically,
there are six cases in Baker 1955 and three in Baker 1989 where a coda [t] precedes a syllabic [ṇ]: as
the [t] is interpretable as an ambisyllabic transition creating an onset for [ṇ], these are not counted
as No Onset.
appear in particularly prominent prosodic positions. Each is phrase-initial: the only
occurrence of [h] introduces the second major phrase in bar 3, and the sole instance of
[b], in the up-take into bar 7, initiates the final phrase of the scat bridge.
Summarized in (11), the coda inventory is even more minimal.
(11)
Codas: t (), ṇ (), n ()
Coda clusters: Ø
No Coda: /
Combining the consonantal repertoires of (10) and (11), we see that Baker’s 1955
improvisation utilizes a mere six segments from the full English set of 24 options: 25%
of the available inventory.
As observed in the previous works, here too there is a strong preference for open
syllables (39/54 = 72.2%). However, in contrast to Louis Armstrong, for whom [p] was
the most frequent coda, Chet Baker does not use [p] at all, in either of the two scat
performances examined here. Rather, his codas are exclusively alveolar [t, n], with [t] being
the more prevalent.
Somewhat parallel to the trans-syllabic gestural continuity of the feature [labial]
leading into [ṃ] in Louis Armstrong’s Hotter Than That, there is a consistent homorganic
pattern observed in the distribution of [ṇ] in Baker’s scat. Specifically, all instances of [ṇ]
are immediately preceded by a homorganic coda [t]. Further, all cases of coda [n] or [ṇ]
are followed by a homorganic /d/ onset of the subsequent syllable.
In terms of syllable shape, Chet Baker’s preferred forms in this 1955 take are syllables
where his near-ubiquitous /d/ combines with a non-low, non-high lax vowel:
(12) Frequency    Syllable form
     (35.2%)      dɛ(ː) ~ ðɛ(ː)
     (16.7%)      də(ː) ~ ðə(ː)
2.2.2. Chet Baker, Everything Happens to Me (1989)
Although by no means identical in rhythmic, melodic, or expressive form,¹³ the 1989
performance of this same song is remarkably consistent in its consonantal inventory. The
most transparent difference in the onset repertoire is the fact that [b], used only once in
the 1955 version, is completely absent in the 1989 take.
(13)
Onsets: d (), y (), h ()
Onset clusters: Ø
No Onset: /
Ambisyllabic [tṇ] coda/onset: /
As seen in (13), the prevalence of /d/ in the 1989 version emerges as even more
disproportionate, accounting for 85% (45/53) of the onsets. Clearly, /d/ in and of itself
constitutes the core of Chet Baker’s consonantal inventory. Again, where another segment
is used by Baker, it functions through its very uniqueness to demarcate a prosodically
prominent position. Thus, the sole occurrence of [h] introduces what is arguably the most
prosodically salient position: the very first syllable of the first phrase of the scat bridge.
¹³ For example, a very rudimentary comparison shows the opening bar in the 1955 version has 6
syllables moving from Ebm to Ab+ towards Db ∆, whereas bar 1 in the 1989 recording has 10
syllables modulating from Fm through Bb towards Eb ∆. (Note: ∆ = major 7).
In the 1989 version, Baker’s sparse and tightly restrictive treatment of codas is
remarkably consistent with his 1955 repertoire, though their particular distribution in the
scat melodic lines is entirely divergent.
(14)
Codas: t (), ṇ (), n ()
Coda clusters: Ø
No Coda: /
As documented in (14), the same 2 segmental values are attested as in Baker’s 1955 coda
chart in (11). Again, all three instances of [ṇ] are introduced by a dual function coda/onset
[t], and are followed by a homorganic onset [d].
Given the fluid mobility of Chet Baker’s vowel articulations, a characterization
of his favoured syllable shapes unequivocally identifies an open syllable with a [d]
onset but is much less definitive in terms of vowel quality. Most generally, as in the 1955
version (see (12)), his articulation meanders around a mid lax vowel, either schwa [ə] or
a ‘neutral position’ [ɛ], identified for English as the characteristic articulatory setting for
the onset of speech (Chomsky and Halle 1967). However, on notes of longer duration, his
resonant crooning often ascends to a tenser high back [u]. Based on the transcriptions in
Appendix 2, there is considerable consistency between the 1955 and 1989 versions in terms
of a frequency of use ranking:
(15) 1955 Frequency   Syllable form     1989 Frequency   Syllable form
     (35.2%)          dɛ                (17%)            dɛ
     (16.7%)          də                each (13.2%)     də, du, dʋ
However, as is evident from the lower frequency numbers and the three-way tie for
second place in the 1989 count, there are no strongly identifiable constraints on his
wide-ranging vocalic diversity.
2.3. Betty Carter ¹⁴
Even as the repertoire of scat vocabulary expanded through the creatively explosive bebop
rush of the 1940s, [d] and [b] remained particularly prominent. For example, although
Betty Carter was a major innovative force in extending the repertoire of jazz vocables,
Bauer notes that in Carter’s short scat solo in Babe’s Blues (1958), of the nine consonants
which are used as syllable onsets, /b/ and /d/ together “initiate more than half of the
vocable classes used in the solo” (2002b: 312). Other Betty Carter songs attest to this
same generalization: in my count of the 197 syllables in her 36-bar scat solo rendition of
You’re Driving Me Crazy (1958; transcribed by Bauer 2002a: 252–254), the most frequent
onset consonant is [d] (in 80 of the 197 syllables) and the next most frequent is [b] (in
36 of the 197 syllables). Thus, although Carter uses 10 different consonants as onsets in
this solo, the two segments [b] and [d] together comprise the majority (58.9%) of onset
choices. In the following sections, we look at two of her other tunes to broaden the base
of comparison further.
¹⁴ My commentary on Betty Carter is deeply indebted to Bauer’s (2002a,b) insightful and superbly
documented interpretation of her life and work.
2.3.1. Betty Carter, Thou Swell (1955)
Recorded the same year (1955) as the early version of Chet Baker’s Everything Happens to
Me that was considered in §2.2.1, Betty Carter’s scat rendition of the original 1927 classic
Thou Swell draws on the following inventory of eight consonants as simplex onsets (see
Appendix 2.3.1).
(16)
Onsets: d (), b (), l (), y (), w (), h (), r (), š ()
Onset clusters: ly (), dl (), sp ()
No Onset: /
Constituting a combined total of 75/114 (= 65.8%), the consonants [d] and [b] are
reaffirmed as incontestably at the core of Carter’s—and everyone else’s—stock of scat
resources. Although less frequently drawn on, the consonants [l, r, y, w, h] are all familiar
as staple scat segments that have been attested in the work of Louis Armstrong and Chet
Baker examined in the preceding sections.
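The combined shares quoted for Carter’s two most frequent onsets follow directly from the token counts given in the text and in (24). As a minimal sketch of that arithmetic (the function name `combined_share` is an illustrative assumption, not part of the original analysis):

```python
def combined_share(counts, total):
    """Percentage of `total` syllables whose onset is one of the counted segments."""
    return round(100 * sum(counts.values()) / total, 1)

# Token counts as reported in the text: [d] and [b] onsets per solo.
youre_driving_me_crazy = {"d": 80, "b": 36}   # out of 197 syllables
thou_swell = {"d": 40, "b": 35}               # out of 114 syllables

print(combined_share(youre_driving_me_crazy, 197))  # 58.9
print(combined_share(thou_swell, 114))              # 65.8
```

Both results match the percentages cited above for You’re Driving Me Crazy and Thou Swell.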
The innovative element in (16) is Carter’s once-only exploitation of [š] (bar 13,
coupled with the unique attestation of [r] in the sequence [šiy ra]). The use of [š] is rare in
Betty Carter’s scat, although it figured prominently in the influential repertoire of Sarah
Vaughan and became a flagship marker of 1950s doo wop motifs like “shoo bee doo” and
“sha na na”.¹⁵ Despite the collective recognition among jazz artists of certain segments
being standard communal property in the scat arsenal, other specific sounds acquired the
status of individual trademarks. Carter reportedly admonished a young vocalist in 1978:
“Why are you using scat syllables like ‘shoo-bee-doo-bee’? Those belong to Sarah, and
they belong to the fifties.” (Berliner 1994: 254, 804, cited by Bauer 2002b: 314–315) At the
heart of improvisational creativity in music, as in language, is the challenge of innovation
under the constraints of structural limitations, critically the inventory of segments and
restrictions on their combination. Given the very small set of sounds that came to be
established as the “conventional” scat inventory in the works of the early artists, to then
have certain consonants among these evolve into sound symbolic associations with a
particular singer and/or decade effectively heightens the challenge for new artists to
create an individualistic scat voice.
¹⁵ It was from the vocals in the Silhouettes’ 1957 hit song Get a Job that the 50s revival group, Sha Na Na,
took its name.
¹⁶ What a Little Moonlight Can Do (1982) Whatever Happened to Love? Verve/Polygram 835 683–1; see
transcription by Bauer (2002a: 310–343).
Onset clusters are generally quite rare in scat. Of the four cluster tokens that occur in
this work, only one type, [sp], conforms to standard well-formedness constraints of English.
The other two types, [ly] and [dl], draw on segments that are very common in the scat
inventory of onsets, but in bundling them into tauto-syllabic onset sequences Carter pushes
beyond the canonical bounds of regular English. Just as [š] became a Sarah Vaughan scat
trademark, the [dl] onset is a strong candidate for a Betty Carter signature: jumping ahead
27 years to her 1982 recording¹⁶ of What a Little Moonlight Can Do, this same highly marked
onset appears eleven times, most strikingly in a sequence of six syllables in the climactic
scat line of bars 189–190 (WB line as transcribed by Bauer 2002a: 317; transliteration (2nd
line) as in Appendix 1):
(17)
WB: | ‰ ə weh– dlow dlow |– dlə dle dle | dlow dow Œ | ...
| ‰ ə wɛə– dlow dlow |– dlə dle dlɛ | dlow dow Œ | ...
Carter’s usage of codas in Thou Swell is infrequent, as seen in (18), and the observed
patterns are familiar. She draws strictly on the resonants [m, n, l]. Both of the syllabic
segments are alveolar, and follow a homorganic onset [d].
(18)
Codas: m (), ṇ (), l (), ḷ ()
Coda clusters: Ø
No Coda: /
The syllable shapes which surface most frequently in this piece are not at all
surprising either:
(19) Frequency   Syllable form     Frequency   Syllable form
     (13.2%)     bə                (10.5%)     ba
     (12.3%)     duw               (8.8%)      də
In sum, despite the creative uniqueness of how her scat artistry uses them, Carter’s
arsenal of tools as represented by this acclaimed 1955 performance draws on a markedly
standard repertoire.
2.3.2. Betty Carter, Open the Door (1979)
Based on his intimate and broad-based musical insights into the full body of Betty Carter’s
“relatively small recorded output”, Bauer (2002a: xi) contends “that the defining features
of Carter’s style remained consistent even as her approach kept changing.” From the
linguistic perspective of the present study, a comparative analysis of the scat interludes in
Carter’s 1979 version of Open the Door (see Appendix 2.3.2), recorded 24 years later than
the 1955 work discussed above, reveals a tightly focussed phonological repertoire. The
five onset segments that appear in the 1979 version, documented in (20), are a subset of
the eight that were used in Thou Swell (see (16)).
(20)
Onsets: d (), y (), w (), l (), h ()
Onset clusters: Ø
No Onset: /
Notably absent from the attested onsets in (20) is [b]. However, the ubiquitous scat onset
[d] is not only present, but strongly dominant, introducing 19 of the 27 syllables (= 70.4%).
The other onset segments here, viz. [l, y, w, h], are all scat basics, not just in Carter’s
earlier work, but in that of other scat vocalists.
Of particular interest in (21) is the total absence of post-vocalic coda consonants.
(21)
Codas: ṇ (), ṃ ()
Coda clusters: Ø
No Coda: /
The only syllables that are not open CV or CVG structures are the 3 cases where
there is a syllabic nasal. A comparison of the first three scat lines (cf. bars 9, 14, and 16,
respectively, in Appendix 2.3.2) reveals a striking and doubtless strategic parallelism of
form and function where these three syllabic nasals occur. Specifically, each is in absolute
phrase-initial position of the first three scat lines, with each new cycle entailing some
minimal variation from the preceding one: labial [ṃ] in the first phrase shifts place of
articulation to alveolar [ṇ] in the second phrase, which itself is repeated in the third line
but differentiated by the introduction of an [h] onset. Abstracting away from rhythm,
duration, and pitch, the segmental content of these three lines is reproduced below:
(22)
ṃ də dow ...
ṇ duw duw duw ...
hṇ duw duw diy duw ...
What this short prosodic progression illustrates is that far from scat being comprised
of randomly articulated sequences of a delimited set of nonsense syllables, the skill of a
brilliant scat artist like Betty Carter entails masterful structuring of content and sequence:
here, each nasal syllable introduces an iteration of exclusively [d]-initial syllables, and
each line builds substance and momentum by adding one more syllable.
Finally, in determining which syllable shapes are most prevalent, there are two that
clearly emerge as most frequent:
(23) Frequency    Syllable form
     (29.6%)      du(w)
     (25.9%)      dey
While [du(w)] figures prominently in the repertoire of her other work (cf. (19)) and that
of the other singers sampled here, [dey] is less favoured, though not unattested (cf. (5)).
2.4. Syllable Structure Generalizations
Having documented specific aspects of syllable content and form in two different works
from each of three renowned jazz vocalists, spanning the 63 years between 1926 and
1989, we are now in a position to determine what generalizations, if any, hold across this
sample, despite each artist’s highly individualistic musicianship and distinctly unique
approach to the idiom.
The initial question posed in §1 was to what extent the delimitation of onsets to [d]
and [b], as exemplified by the brief excerpts in (1) and (2), is representative of a broader
database of scat. The onset tabulations from each previous section (viz. (3), (6), (10), (13),
(16), (20)) are summarized in the table in (24) below. Note in (24) that the frequencies
of [d] and [b] are given both as a token count and as a percentage value of the number
of scat syllables in each piece. There are three particularly interesting facts revealed by
these results. First, none of the six works studied here—including the full texts of each of
the classic performances from which (1) and (2) were drawn—uses exclusively [d] and/or
[b] onsets. In every case, the vocalist has chosen some scat syllables with other onset
consonants, however minimal this extended range may be. For example, in cases like
Louis Armstrong’s 1926 recording of Heebie Jeebies (§2.1.1) and Chet Baker’s 1989 version
of Everything Happens to Me, there is only one occurrence of each of two other onsets.
(24) Simplex Onsets: comparative usage by different scat vocalists
                          d               b
2.1.1. Armstrong 1926     37 = 77.1%      5 = 10.4%
2.1.2. Armstrong 1929     78 = 47.3%      57 = 34.5%
2.2.1. Baker 1955         42 = 77.8%      1 = 1.9%
2.2.2. Baker 1989         45 = 84.9%
2.3.1. Carter 1955        40 = 35.1%      35 = 30.7%
2.3.2. Carter 1979        19 = 70.4%
Total syllables: 461      261 = 56.6%     98 = 21.3%
Other onset columns (cell entries not recoverable, apart from a stray “3” in the
Baker 1955 row): y, h, l, w, r, n, š, t, m
At the other end of the spectrum, Louis Armstrong’s Hotter Than That employs the
greatest diversity: ten different consonants. A further observation is that there are only nine
consonants other than [d] and [b] which comprise the full set of onsets that are collectively
utilized by these artists. Together, these latter two facts affirm the initial premise of this
research: of the full complement of 24 consonants that can potentially function as syllable
onsets in English, scat draws on a very limited, and largely recurrent, subset.
The third conclusion that emerges from (24) is that there is a consistent asymmetry
in the relative frequency of [d] over [b]. In two of the songs (§2.2.2, §2.3.2), there is no [b]
at all; in a third (§2.2.1), there is a single attestation; in the remaining three, though the
degree of imbalance differs, the direction of difference is constant.
In contrast to the robust generalizations about simplex onsets, the usage patterns
with respect to complex onsets, as summarized in (25), do not at all cohere.
(25) Complex Onsets: comparative usage by different scat vocalists
                          Clusters        σ’s with CmplxOns
2.1.1. Armstrong 1926     sk              6.25%
2.1.2. Armstrong 1929     zw, mw, bw      1.8%
2.2.1. Baker 1955         Ø               ---
2.2.2. Baker 1989         Ø               ---
2.3.1. Carter 1955        ly, sp, dl      3.5%
2.3.2. Carter 1979        Ø               ---
Totals:                                   2.17%
Clusters appear in only three of the six pieces, with an extremely low frequency count
(averaging just over 2%). Significantly, there is no overlap at all in the specific clusters
used in each of the works, even by the same singer. Moreover, with respect to the identity
of segments involved in these clusters, it is patently not the case that these sequences
are compositionally built from the simplex onset consonant inventory: [k], [p], [s], and
[z] in the clusters of (25) are not part of the repertoire of (24). Most striking is that a
majority (5/7) of the attested clusters violate canonical English patterns: although all of
the individual segments involved are legitimate potential simplex onsets, none of [zw],
[mw], [bw], [ly], or [dl] conform to standard “well-formed” sequences in English. Across
these diverse observations, there is in fact a consistent generalization, namely: complex
onsets are highly marked. In terms of frequency they are rare, and in terms of content
they are often exceptional.
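The two set-based claims in this paragraph can be rechecked mechanically. The following is a minimal sketch, assuming the pooled simplex inventory listed in (24) and treating only [sk] and [sp] as well-formed English onsets, since those are the two clusters the text does not list as violations; the variable names are illustrative only.

```python
# Simplex onset inventory pooled across the six works, as listed in (24).
SIMPLEX_ONSETS = {"d", "b", "y", "h", "l", "w", "r", "n", "š", "t", "m"}

# Attested complex onsets from (25), written as tuples of segments.
CLUSTERS = [("s", "k"), ("z", "w"), ("m", "w"), ("b", "w"),
            ("l", "y"), ("s", "p"), ("d", "l")]

# Assumption for illustration: the clusters not listed in the text as violations.
ENGLISH_WELL_FORMED = {("s", "k"), ("s", "p")}

# Segments that occur in clusters but never as simplex onsets.
outside_inventory = {seg for cluster in CLUSTERS for seg in cluster} - SIMPLEX_ONSETS

# Clusters that fall outside canonical English onset patterns.
ill_formed = [cluster for cluster in CLUSTERS if cluster not in ENGLISH_WELL_FORMED]

print(sorted(outside_inventory))             # ['k', 'p', 's', 'z']
print(f"{len(ill_formed)}/{len(CLUSTERS)}")  # 5/7
```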
Consider now the properties of codas, summarized in (26). Whereas the excerpts
in (1) and (2) in §1 were comprised exclusively of open syllables, this generalization does
not hold of any single work considered in its entirety. Nonetheless, open syllables are
unequivocally dominant, ranging from 94% to 72% in individual works and with the
overall average being 82.65%.
(26) Codas: comparative usage by different scat vocalists
                          σ’s with no Coda
2.1.1. Armstrong 1926     81.25%
2.1.2. Armstrong 1929     74.5%
2.2.1. Baker 1955         72.2%
2.2.2. Baker 1989         86.8%
2.3.1. Carter 1955        93.9%
2.3.2. Carter 1979        88.9%
Totals:                   82.0%
Coda columns (cell entries not recoverable): p, t, n, ṇ, ḷ, g, l, ṃ, m
Moreover, there were no complex codas. With respect to segmental identity, of the 21
potential English coda consonants, only six different segments appear. Compared with the
inventory of scat onsets in (24), it is interesting to note that there is overlap in the resonant
repertoire /m, n, l/, but complementarity in the obstruent stops: onset /b, d/ vs. coda
/p, t/.¹⁷ Once again, Louis Armstrong is the king of segmental diversity in his Hotter Than
That rendition (§2.1.2), which draws on seven different codas, whereas the other artists
employ a much more restricted range of between two and four. Across the artists, the most
favoured segments are [t] and [n/ṇ], although Armstrong’s clear personal favourite is [p].
¹⁷ Thus, the unique instances of onset /t/ and coda /g/, both in Armstrong’s Hotter Than That (§2.1.2)
appear anomalous: the /g/ in terms of both place and voicing, and the /t/ in terms of voicing.
Finally, consider in (27) the generalizations that hold regarding the overall form
of scat syllables that are used by these diverse singers. Only syllables which occurred at
least three times, and with greater than 7% frequency in each song, are included in the
table below. Because the overall syllable count differed considerably across the different
selections, the most frequent syllables for each artist are simply ranked, with 1 being
the most frequent. Ties are represented by the same number. The scale descends for
each artist, but may stop at either 2, 3, or 4 depending on the actual frequency values (as
detailed in the corresponding tables in each individual section above). Thus, for example,
for each of Chet Baker in §2.2.1 and Betty Carter in §2.3.2, the very high frequency of
two particular syllables results in no others exceeding the criterion level.
(27) Syllables: comparative usage by different scat vocalists (>7%, Ranked 1 = high)
Columns: də, du(w), ba, da, bi(y), di(y), dʋ, dey, bə, dɛ
Rows: 2.1.1. Armstrong 1926; 2.1.2. Armstrong 1929; 2.2.1. Baker 1955; 2.2.2. Baker 1989;
2.3.1. Carter 1955; 2.3.2. Carter 1979
[rank entries not preserved]
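The inclusion criterion and ranking scheme behind (27) can be made explicit as a short procedure. The sketch below rests on two assumptions: that tied syllables share a rank and the next distinct frequency takes the next integer (dense ranking), and that the token counts shown are purely hypothetical, chosen only to mimic the shape of (15) rather than taken from the appendices.

```python
def rank_syllables(counts, total, min_tokens=3, min_pct=7.0):
    """Rank syllable shapes meeting the criterion: at least `min_tokens`
    occurrences and more than `min_pct` percent of all syllables.
    Ties share a rank; ranks are consecutive integers (dense ranking)."""
    kept = {syl: n for syl, n in counts.items()
            if n >= min_tokens and 100 * n / total > min_pct}
    ranks = {}
    # Each distinct count, highest first, defines rank 1, 2, 3, ...
    for rank, n in enumerate(sorted(set(kept.values()), reverse=True), start=1):
        for syl, count in kept.items():
            if count == n:
                ranks[syl] = rank
    return ranks

# Hypothetical counts for a 53-syllable solo (illustrative only).
example = {"dɛ": 9, "də": 7, "du": 7, "dʋ": 7, "di": 3, "ya": 2}
print(rank_syllables(example, total=53))
```

Here [di] meets the three-token minimum but falls below 7%, and [ya] occurs too rarely, so only four syllables are ranked: [dɛ] at 1 and a three-way tie at 2.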
Comparing the scat choices in the two different recordings by each vocalist shows Baker
to be the most consistent, despite the 34-year time interval between these performances:
[dɛ] and [də] rank 1 and 2, respectively, in both. In contrast, in each of the two recordings
by Armstrong and Carter, different syllables rank 1 and 2, and for each of them, the
top-ranked syllable in one of their pieces does not even reach criterion in the other, viz. [ba]
in Armstrong, and [bə] in Carter. Clearly, there is no single “favoured” syllable shape: in
this sample of six scat performances, there are five different syllables that emerge as the
most frequently used in any given piece.
Nonetheless, the tabulations in (27) provide striking confirmation of the two
generalizations originally observed in the brief Louis Armstrong extracts in (1) and (2).
First, all these favoured syllables are open. Secondly, they all start with [d] or [b]. The
over-all favourite is [də], with [du(w)]—as in “doo-wop”—coming in second.
3. Beyond Bebop
As bebop morphed into hard bop and doo-wop in the 1950s, and classical jazz of the
previous decades diversified under multiple influences, particularly R&B and the explosive
impact of rock’n’roll, jazz scat began to wane in popularity. A few exemplary jazz vocalists
continued the bebop scat tradition, but other genres had come to dominate the pop music
scene. Not until the uniquely versatile creative talents of Al Jarreau and Bobby McFerrin
emerged in the late 70s did vocal improvisation once again top the charts.
The next two tunes come not from core vocal jazz repertoire, but from the heart
of the early 60s Hit Parade era, in the decade following the bebop heyday. What makes
these works substantially different from the preceding ones is that scat is formally scripted
into the lyrics, not improvised: in the first example (§3.1), scat syllables are explicitly
referenced in the English text, and in the second example (§3.2), more extensive scat
lines alternate with English. These case studies are of interest in two respects: first, for
interrogating the extent to which the segmental content and shape of these select scat
tokens conform to the generalizations established for the classical scat vocables examined
in the preceding vocal jazz tunes; and secondly, for the insights that this phenomenon
provides from a historical perspective on the evolving diversification of the cultural
impact of scat. Despite bebop itself having shifted out of the popular mainstream at that
time, the fact that very young¹⁸ creative songwriters chose to incorporate scat syllables
into their lyrics in the 1960s reflects its strong formative influence on their own musical
identities and its enduring legacy in the broad-based musical culture of the era.
3.1. Barry Mann, Who Put the Bomp? (1961)
The infectiously popular music and words of this 1961 hit were co-written by Barry Mann
and Gerry Goffin, with Barry Mann as the original recording artist. Because the lyrics
here are not improvised, but rather are composed in conformity with a tightly structured,
fixed melodic and rhythmic framework, the methodology of previous sections—namely,
a frequency count of attested segmental tokens in a stream of spontaneously improvised
scat—is less revealing than simply the inventory of segments and syllable shapes that are
drawn on. That is, what is particularly significant is just which scat vocables are chosen
for the lyrics, as this very choice implies that these particular forms already (in 1961) had
significant currency in the general public domain as cool and hip.
Archetypal and high-profile scat syllables here (see Appendix 3.1) include the [šu]
~ [šə] attributed to Sarah Vaughan (§2.3.1), the [dɩp] that surfaces as early as Heebie Jeebies
(see (5)), as well as the [bap] that not only persists to this day as the name of the genre, but
that had become the basis of Betty Carter’s moniker: Betty Bebop.¹⁹ The rhythmically
alternating syllables [bə] (line 2) and [dɩ] (line 8) are clearly canonical scat form, adhering
to both the preference for [b]/[d] onsets and No Coda (open syllable).
Although it was noted in every improvisational jazz sample investigated earlier that
open syllables were much more frequent than closed syllables, a superficially inconsistent
observation is that the reverse is the case in Who Put the Bomp?. What this illustrates, I
would suggest, is the potential over-riding effect of prosodic constraints when scripting
lyrics to a fixed melodic line and rhythmic beat. The lyrics for the lines with the scat
syllables [bam], [bap], [dɩp] are basically structured as follows, with the CVC closed
syllables out-numbering the open CV syllables four to two. Each of the underlined
syllables in (28) is directly synchronized with a rhythmic beat.
(28)
Who put the [CVC]
in the [CVC] [CV] [CVC] [CV] [CVC] ?
Because closed CVC syllables are prosodically heavier, aligning a closed syllable
with a rhythmically strong position functions to enhance the prominence of the beat.
¹⁸ Barry Mann and Gerry Goffin were 22 when their co-written success Who Put the Bomp? was released,
and teen idol Johnny Cymbal was 18 when he wrote and recorded Mr. Bass Man.
¹⁹ This was Lionel Hampton’s nickname for her, despite her expressed dislike of it (Bauer 2002a: 4–5).
Note that the initial ‘who’ [huw], even though open, is also heavy by virtue of the long/
tense diphthong. Further enhancing the strong rhythmic stability of these lines is the fact
that the light open scat syllables [bə], [šə], and [dɩ] are never aligned with the beat. While
this kind of prosodic alignment of heavy syllables with positions of rhythmic prominence,
and the complementary preference for light syllables in weak rhythmic positions, most
certainly occurs in improvised scat as well, it would appear to be a significantly less
dominant factor, perhaps since rhythm itself is also subject to improvisation.
Of further interest in the lyrics of this song is that there is a category of forms that
are neither standard English lexical items nor syllables that conform to the characteristics
of scat. Concatenations of essentially semantically empty compounds that rhyme or
alliterate, such as “rama lama”, or that carry some onomatopoetic value like “ding dong”,
or that live on a hip fringe of the English lexicon like “boogity boogity” were also drawn
from the pop music scene of the 50s, namely the Edsels’ major doo-wop hit Rama Lama
Ding Dong, originally released in 1958 on Dub Records and re-released on Twin Records
in 1961,²⁰ and the Quincy Jones composition Boogity Boogity, recorded on Milt Jackson’s
1958 album Plenty, Plenty Soul. Unlike scat, these sequences each pattern basically as a
lexicalized unit, without independent freedom of realization of the constituent syllables.
The form of all 3 of these expressions is essentially reduplicative, with the nature of any
deviance from full identity falling directly within recognized cross-linguistic patterns
of reduplication (e.g. Moravcsik 1978, McCarthy and Prince 1986, Hurch 2005). Finally,
based on the generalizations established in §2, some of the segmental content in these
examples falls markedly outside of that found in core scat, viz. the [ŋ] codas in “ding
dong”, and the [g] onset in “boogity”.
In sum, Who Put the Bomp? is highly syncretic in its explicit references to many
of the rapidly evolving musical influences of the era. The lyrics integrate unmistakably
identifiable scat syllables from the classical vocal jazz tradition, with references from the
rhythm and blues progression into doo-wop, along with the blues-based “modern jazz”
sophistication of Quincy Jones and Milt Jackson. What this tells us is that although the
pure jazz scat genre itself isn’t charting in the mainstream at this point in time, it remains
a major foundational force in the broader musical scene. Moreover, of all the diverse
genres referenced in these lyrics, it is a scat line that is attributed with ultimate success
in the conquest of love: “When my baby heard bam bə bə bam bə bam bə bam bam, every
word went right into her heart...”
3.2. Johnny Cymbal, as recorded by Sha Na Na, Mr. Bass Man (1963)
The second example illustrating the continuing legacy of scat in the pop scene of the early
60s is Johnny Cymbal’s signature song, Mr. Bass Man. Sha Na Na’s re-recording of it in
1980 stands both as a tribute to its enduring popularity and as a major contribution to
ensuring its continued exposure to subsequent generations. The transcription in Appendix
3.2 is based on the Sha Na Na version, and differentiates the scat lines that are sung by Mr.
Bass Man himself (abbreviated BM in Appendix 3.2) from the fledgling attempts of the
“Wanna-be” guy (abbreviated W in Appendix 3.2) who sings, following line 9, “I wanna be a
bass man too”. This separation reveals some fascinating differences.
²⁰ Although the Edsels, like the ill-fated car model they named themselves after, were defunct as a
group by the time their version of Rama Lama Ding Dong rose to prominence on the national charts,
the song itself attained significantly greater longevity as the title song of Sha Na Na’s 1980 album.
Note too that in the historical context of the 50s “Ding Dong” itself carried an established frame of
reference from the title and lyrics of Louis Armstrong’s early 1930s hit, I’m a Ding Dong Daddy From
Dumas (on The Best of Louis Armstrong and His Orchestra: 1930-31. Classics B000001NJB).
As seen in (29), Mr. Bass Man himself uses exclusively [b] onsets. In contrast,
the majority of Wanna-be’s onsets are [b], but his inventory also includes a substantial
number of [d]s and [y]s, both of which accord with the standard scat onsets documented
in (24). Although [ʔ] is not included in (24), its absence is directly attributable to the
transcriptional principles outlined in Appendix 1, so the two attestations of [ʔ] here are
not anomalous. The unique occurrence of [s] at the beginning of line 5 is odd, given the
generalizations of (24), but may be explicable as perseveration of the final sibilant of the
immediately preceding word “songs”, across the juncture from English lyrics to scat.
(29) a. Mr. Bass Man’s scat lines (including back-up line and joint BM/W lines):
Onsets: b ()
Onset clusters: Ø
No Onset: Ø/
b. Wanna-be’s scat lines:
Onsets: b (), d (), y (), ʔ (), s ()
Onset clusters: Ø
No Onset: Ø/
Not only is the greater diversity of segments in the novice’s attempts of interest, so
too is the distribution of these segments. For example, in three lines (lines 5, 6, 17),
Wanna-be switches in mid-sequence from [d]-onsets to [b]-onsets (significantly, a switch to
the correct target), but never does he switch in the opposite direction. All other lines are
either exclusively [d] (lines 13, 14, 25, 26) or exclusively [b] (3, 7, 10, 12, 22, 24, 29).
There is also a marked discrepancy in coda patterns between Mr. Bass Man and
Wanna-be. Mr. Bass Man uses exclusively [m]/[ṃ] codas, whereas Wanna-be models
[m] most frequently, to be sure, but he also draws on the 3 most favoured scat codas that
were documented in (26): [p, t, ṇ]. Nonetheless, note that Wanna-be’s very last solo line
achieves perfect canonical form as defined by Mr. Bass Man: exclusively [b] onsets and
exclusively [m] codas.
(30) a. Mr. Bass Man’s scat lines (including back-up line and joint BM/W lines):
Codas: m (), ṃ ()
Coda clusters: Ø
No Coda: /
b. Wanna-be’s scat lines:
Codas: m (), t (), p (), ṇ ()
Coda clusters: Ø
No Coda: /
Although Wanna-be uses a broader inventory of both onsets and codas, these
segments are significantly constrained in their distribution, in that a consistent pattern
of syllable-internal consonant harmony obtains with respect to place of articulation in
closed syllables. That is, a labial [b] onset is followed by a labial [m] or [p] coda, regardless
of the vowel quality in the nucleus, e.g.: bam, bum, bəm, bom, bʋm, bɩp. Similarly, an
alveolar [d] onset is closed by [ṇ] or [t].²¹ Given that none of the other onsets /y, ʔ, s/
occur in closed syllables, this generalization regarding intra-syllabic consonant harmony
holds throughout the entire work.
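The harmony generalization amounts to a simple check that onset and coda share place of articulation. The sketch below uses the closed syllables cited in the text; representing each syllable as an (onset, nucleus, coda) triple, and the syllabic nasal as a coda with no separate vowel, are assumptions made for illustration.

```python
# Place of articulation for the consonants found in Wanna-be's closed syllables.
PLACE = {"b": "labial", "m": "labial", "p": "labial",
         "d": "alveolar", "t": "alveolar", "ṇ": "alveolar"}

def is_harmonic(onset, coda):
    """True if the onset and coda consonants share place of articulation."""
    return PLACE[onset] == PLACE[coda]

# Closed syllables cited in the text, as (onset, nucleus, coda) triples.
attested = [("b", "a", "m"), ("b", "u", "m"), ("b", "ə", "m"), ("b", "o", "m"),
            ("b", "ʋ", "m"), ("b", "ɩ", "p"), ("d", "ɩ", "t"),
            ("d", None, "ṇ")]  # syllabic nasal: no separate vowel (assumed encoding)

print(all(is_harmonic(onset, coda) for onset, _, coda in attested))  # True
print(is_harmonic("b", "t"))  # False: a hypothetical disharmonic syllable such as [bat]
```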
4. Explanatory Hypotheses
The analyses of these several examples of scat show that, across the diversity of musical
styles and individual expressions, the repertoire of sounds and syllable shapes is
remarkably consistent and extremely limited in comparison to the extensive range of
segments and combinatorial possibilities that are used in English, let alone available
within the articulatory range of the human vocal apparatus. To address the question of
what might account for these patterns, three hypotheses are explored: that vocal scat
is essentially imitative of instrumental jazz (§4.1); that the repertoire of sounds in scat
is constrained by phonological markedness theory (§4.2); and that scat production is
subject to independent constraints on musical form and vocal performance (§4.3).
Although each of these, among other cognitive and performative factors, doubtless
contributes to shaping the output of scat, the argumentation to follow suggests that
specific tenets of phonological markedness theory interacting with the melodic imperative
for a voice line to carry pitch contribute substantially to broadening our understanding of the
attested patterns.
4.1. The Imitative Hypothesis
A number of theorists within the musical literature have hypothesized that scat vocalization
is essentially imitative of jazz instrumental expression. For example, Robinson (2002: 515)
attributes the origin of scat to “singers imitat[ing] the sounds of jazz instrumentalists”.
Bauer (2002b: 303) cites Milton Stewart (1987: 65, 68, 74) as showing that “the vocables
used by such notable exponents of scat as Ella Fitzgerald and Sarah Vaughan often mimic
the tonguing, phrasing, and articulation of instrumentalists.” Stoloff (2003: 4) notes that
Louis Armstrong, “like many other ‘instru-vocalists’ who followed, unconsciously used
scat syllables that emanated from his trumpet style”. The core question in considering
the “Imitative Hypothesis” is to what extent such comparisons are based on essentially
arbitrary associations, as opposed to qualities of instrumental sound production that
are directly reproduced in the choice of consonants and vowels in a scat syllable. That
is, are there consistent, independently verifiable articulatory correlations between an
instrumental rendition and a particular scat vocalization? Or, like the arbitrariness of
the sound-meaning correspondences in natural language, is the seemingly “imitative”
association based on fundamentally arbitrary, conventionalized interpretations?
One type of case is illustrated by the fact that sometimes hand gestures lent an
explicit instrumental identity to the vocables. Stoloff (2003: 5) points out that “Ella, for
example, often used trombone-like hand motions while scatting “du-wah” type syllables”.
²¹ All 3 instances have the same vowel: [dɩt].
The question here then is whether there is anything inherent in the phonetic properties
of the syllables “du-wah” [du wa] that is uniquely representative of the production or
perception of trombone sound, or whether the explicitly iconic identification established
by Ella’s hand gestures substantially contributes to creating a conventionalized significance.
Weighing against a one-to-one interpretation of the Imitative Hypothesis is the fact, noted
earlier in (27), that [du(w)] is the second most frequent syllable used by Louis Armstrong
in Heebie Jeebies, Chet Baker in Everything Happens to Me (1989), and Betty Carter in
Thou Swell. In other words, the documentation in §2 establishes that throughout the scat
repertoire, [du(w)] is simply an extremely common syllable. What seems most plausible, then,
is that a “du-wah”/trombone sound-meaning connection evolved into a conventionalized
relationship, with the explicit interpretive overlay of hand gestures contributing significantly
to establishing this as a semi-lexicalized associative correspondence.
A second type of case exemplifying the frequent interpretation of scat as directly
representative of instrumental effects is illustrated by Robinson’s (2002: 515) identification
of the following line from Louis Armstrong’s Hotter Than That (1927, OK 8535) as one
“which illustrates his clear imitation of a trumpet rip”:
(31)
From L. Armstrong Hotter Than That (1927); transcription J.B. Robinson:
A basic question here is: How much of the interpretation of this phrase being a
“trumpet rip” follows from the initial monosyllabic identity tag “rip”? First, the research
documentation in §2 establishes that “rip” is not in the common inventory of scat syllables.
In fact, it is a unique attestation in the database of 461 scat syllables. Secondly, “rip” is
a recognizable English word, with a particularized semantic interpretation specifically
within the jazz lexicon. Thirdly, this word is positioned strategically at the very beginning
of the scat sequence that is interpreted by Robinson as “a trumpet rip.” In terms of
perceptual salience, initial position is the locus of greatest prosodic prominence in the
phrasal domain. Moreover, note in (31) that “rip” bears the highest pitch level and its
rhythmic value (a quarter note) is twice the value of each individual note in the sequence
of eighth notes that follows. Collectively these prosodic cues of position, pitch level, and
duration converge to focus the listener’s attention on this entry, which is realized not
by a familiar scat syllable, but rather by the lexically informative label that this is a “rip”.
Finally, a complementary question stemming from Robinson’s characterization of this
sequence as a “trumpet” rip, is whether there is anything in the choice of the particular
scat syllables—independently of the lead signifier “rip”—that is uniquely associated with a
trumpet, as opposed to a sax, bass, or any other instrument. Again the collective evidence
in §2 establishes that the specific syllables that follow “rip” in (31) are all unequivocally
canonical scat, used by a diversity of singers across a diversity of melodies, chord
progressions, tempos, and rhythms.
Nonetheless, the fact that it is Louis Armstrong himself, one of the most virtuoso
jazz trumpeters of all time, who is scatting in (31) unquestionably establishes an association
between his vocal and instrumental expression. Of course, a particular musician’s primary
instrumental identity would not preclude scat excursions into imitative or evocative effects
of other instrumentation. However, one might ask: given that Chet Baker and Louis
Armstrong are both jazz trumpeters and scat vocalists, is there any significant parallelism
between them in the choice of scat repertoire? Comparison of their use of onsets in the
chart in (24) and of codas in (26) not only provides a distinct profile for each, but also
establishes no greater similarity between them than between either one of them and
Betty Carter, who was not a trumpet player. In short, the research evidence here argues
that the specific choice of scat syllables for each of these performers follows a canon
of phonological constraints on scat repertoire that are independent of trumpet—or any
other—instrumental realization. Most fundamentally, I would submit, it is the musical
individuality of each of these artists and their unique creative mastery of the cognitive
systems involved that transcends defined conventions on the essential form of notes and
syllables, and systemic constraints on their patterning.
However, to explore the empirical bases of the Imitative Hypothesis yet further,
consider commentary such as that advanced by Stewart (1987: 65–66), who interprets
Ella Fitzgerald’s 1949 performance of Flying Home as follows:
Fitzgerald alternates the bilabial ‘b’ and ‘p’ plosives with the lingua-alveolar ‘d’
plosives. The ‘b’ and ‘p’ sounds are formed similarly to the sounds of jazz wind
instruments, which sound by the release of built-up mouth air pressure onto the
reed, while the ‘d’ sound is similar to the tonguing on jazz brass instruments.
On the basis of a phonological model of natural language sound production, my
hypotheses about the articulatory correlations entailed in initiating and modifying air
flow on reed and brass wind instruments differ from Stewart’s. Specifically, pitch-based
sound on a trumpet or any other brass instrument is produced by bilabial constriction:
labial is the primary articulator. As well, tonguing effects—most commonly coronal,
but also dorsal—function significantly to modify the stream of sound in terms of attacks,
closures, trills, duration, phrasing, tonal quality, etc. Less frequent, but certainly available
within the repertory of articulatory modifications, are uvular and laryngealization effects.
Consequently, under an articulatorily-based Imitative Hypothesis, trumpet-denotative
scat would liberally draw on an inventory of both labial and coronal consonants, but
could also include other articulatory effects. In contrast, in producing the primary sound
on a reed instrument, like a sax or clarinet, the player’s lips and upper teeth hold the
mouthpiece: although lip compression can modify pitch, tone, or timbre, labial is not
a primary articulator in the way that it is with brass instruments. However, the range of
tonguing effects and other articulatory modifications would be similar. The Imitative
Hypothesis implication that follows from this comparison would be that sax- or clarinet-imitative scat should have no [p]s or [b]s (contra Stewart’s interpretation above), whereas
brass-imitative scat could. Essential to testing such articulatory-modeling claims would
be a body of data where the intentionality of the scat singer is unambiguous. As none
of the references drawn on here provide adequate documentation to explore these
hypotheses more definitively, they are left for future research.
In summary, despite various approaches to the hypothesis that scat vocalization is
essentially imitative of jazz instrumental expression, what has been shown is that there
is in fact little empirical evidence to sustain a non-arbitrary relationship in the form of
realization across the two modalities. Moreover, compared with the huge range of distinct
combinatorial possibilities in jazz instrumentation, whether articulated by mouth, hand,
valve, slide, bow, or mallet, the exceedingly small set of segments in the core repertoire of
scat presents a striking contrast. What the Imitative Hypothesis fundamentally fails to
explain is why the rich diversity of instrumental sound is not more extensively mirrored
in scat. The possible articulatory range of the human vocal apparatus far exceeds what
is found in human language systems, let alone in scat. Moreover, even the much more
limited range of segmental and combinatory possibilities in the English phonological
system significantly exceeds what is found in scat. The fundamental question then is
what hypotheses might offer a more insightful and constrained explanation for the small
and remarkably consistent inventory of segments and syllable shapes that characterize
scat. In the next section it is argued that phonological markedness theory constitutes a
productive basis of inquiry.
4.2. Markedness Theory
From a linguistic perspective, the framework of phonological markedness theory
embodies a number of hypotheses against which these empirical generalizations about
scat can be evaluated. It is markedness theory that negotiates the interface of fundamental
questions regarding linguistic diversity vs. universality, seeking to understand across
the manifest differences of human languages just what properties of language may be
universally attested, what properties may be correlated with or implicated by another
property, and what properties are rare or may in fact never be attested. The basic premise
to be evaluated in the context of specific constraints identified in the discussion to follow
is that the phonological form and content of scat are relatively “unmarked” along various
diverse, independent measures of markedness.
4.2.1. Markedness Hypotheses about Syllabic Shape
Consider first syllabic form. Evidence from several diverse domains of natural language—
cross-linguistic studies of canonical syllable structures, phonological epenthesis, cluster
simplification processes, language acquisition, prosodic morphology, etc.—independently
identifies CV syllables as the most basic and the single universally attested syllable shape,
justifying the characterization of CV as the ‘core’ syllable. In accord with this empirical
generalization, all of the diverse approaches to markedness theory (cf. Jakobson 1941/1968;
Trubetzkoy 1939; Chomsky and Halle 1968; Greenberg 1966; Kaye and Lowenstamm
1984; Prince and Smolensky 1993; McCarthy and Prince 1994; de Lacy 2002 among
others) converge on a recognition of open CV syllables as the least marked syllable type.
Within the framework of Optimality Theory (Prince and Smolensky 1993, McCarthy
and Prince 1995, Kager 1999, etc.), the relative markedness of an output sequence is
determined with respect to its violation of each of a ranked set of universal constraints
on phonological structure. Constraints relevant to syllable shape properties are stated in
(32), adapted from Kager (1999: 93, 94, 97):
(32) a. Onset            *σ[V     A syllable must have an onset.
     b. NoCoda           *C]σ     A syllable must not have a coda.
     c. *ComplexOnset    *σ[CC    Onsets are simple.
     d. *ComplexCoda     *CC]σ    Codas are simple.
The optimization of CV results from the fact that this syllable shape violates none of the
constraints in (32).
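The evaluation in (32) can be made concrete with a minimal sketch. The CV-skeleton encoding and the function name `violations` are illustrative assumptions, not part of the original analysis, and syllabic consonants are deliberately left out of this toy encoding.

```python
# Illustrative sketch only: the skeleton encoding and the helper name are
# assumptions introduced here, not part of the original analysis.

def violations(skeleton: str) -> dict:
    """Count violations of the four constraints in (32) for one syllable.

    `skeleton` is a CV skeleton such as "CV", "CVC", or "CCV": a run of C's
    (the onset), exactly one V (the nucleus), then a run of C's (the coda).
    """
    onset, nucleus, coda = skeleton.partition("V")
    if nucleus != "V" or set(onset + coda) - {"C"}:
        raise ValueError(f"not a C*VC* skeleton: {skeleton!r}")
    return {
        "Onset": int(len(onset) == 0),          # (32a) *σ[V
        "NoCoda": int(len(coda) > 0),           # (32b) *C]σ
        "*ComplexOnset": int(len(onset) > 1),   # (32c) *σ[CC
        "*ComplexCoda": int(len(coda) > 1),     # (32d) *CC]σ
    }


if __name__ == "__main__":
    for shape in ["CV", "V", "CVC", "CCV", "CVCC"]:
        print(shape, violations(shape))
```

Under this encoding only the core CV shape incurs no violations; every other shape listed violates at least one constraint.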
The emergence of core CV syllables as ubiquitously preferred in scat is therefore
entirely in conformity with markedness predictions about syllable shape. Different
measures confirm their special status, from the lead observation that the scat excerpts in
(1) and (2) contain exclusively ‘core’ syllables to the accumulated evidence in (27) that the
10 most frequently used syllable shapes are all open CV syllables.
Although the survey of scat in §2 sustains the generalization that the vast majority
of scat syllables adhere to the simplex onset plus no coda pattern, it also reveals that
not one of the six pieces analyzed here consists only of such syllables. Deviation from
this optimally unmarked canon falls into two categories: violations of (32c)
*ComplexOnset (§4.2.2), and violations of (32b) NoCoda (§4.2.3). Notably, there are no syllables
documented in the present database that violate the *ComplexCoda constraint in (32d):
all codas in the tunes sampled here consist of a single consonant.
4.2.2. Complex Onsets
A very small set of syllables (an overall total of 2.17% of the sample, as shown in (25)) have two
consonants as opposed to one in the onset. Such cases violate the constraint *ComplexOnset
in (32c), and fall into two subtypes, dependent on specific segmental content.
First are the clusters [sk] and [sp]. What differentiates these from the second
subtype of *ComplexOnset violations is that [sk] and [sp] are familiar, frequent,
well-formed clusters of English. Interestingly, however, they are not common in scat.
Only Armstrong (1926: bar 9-10 in §2.1.1) uses [sk], and it occurs only in the alliterative
sequence [skiyp skæm skɩ]. Similarly, only Carter uses [sp], and it occurs only once (1955:
bar 17 in §2.3.1). Thus, not only are these clusters marked cross-linguistically by virtue of
being structurally complex onsets, but they are also foregrounded in terms of perceptual
salience within the scat repertoire by virtue of being so infrequent. A final observation is
that outside of their occurrence in these clusters, nowhere else in this scat database do
any of the individual segments [s], [k], or [p] occur as simplex onsets. As a consequence,
these sequences do not conform to the basic generalization that complex margins in
natural language phonological systems are characteristically compositional.²² That is, the
well-formedness of an [sk] or an [sp] onset cluster in English builds on the independent
²² As stated by Greenberg (1963: 263): “If syllables containing sequences of n consonants in a language
are to be found..., then sequences of n-1 consonants are also to be found in the corresponding position
(prevocalic or postvocalic).”
availability of each of [s], [k], and [p] as a simplex onset. Thus, on yet another dimension
of general properties of phonological systems, these clusters are marked. In short, despite
their being entirely within the well-formedness constraints of English, the rare injection
of an [sp] or [sk] cluster into a stream of the more limited consonantal playing field of
scat syllables will effectively cause them to stand out as highly unusual.
In contrast, the second subtype of violations of the *ComplexOnset constraint
in (32c) consists of clusters that deviate from standard English: [bw], [mw], and [zw]
in Armstrong (1929: 2:02, 2:15 in §2.1.2), and [dl] and [ly] in Carter (1955: bar 6, 7, 16
in §2.3.1). Interestingly, although these segmental concatenations are not well-formed
English onsets, they differ from the first subtype in that they are basically compositional
within the scat repertoire of onset consonants. That is, with the exception of [z], each of
the components of these clusters—viz. [b], [d], [m], [l], [w], and [y]—occurs as a simplex
onset in the scat database, as charted in (24). There are two other ways that this second
set of clusters differs from the [sk] and [sp] clusters. First, they comprise exclusively
voiced segments. The fact that the segments in these clusters agree in voice conforms
with Greenberg’s (1978: 252) markedness generalization that “combinations which are
homogeneous in respect to voicing” are favoured “over those which are heterogeneous.”
Secondly, drawing on the Sonority Hierarchy in (33a), note that each of these onset
sequences conforms to the Sonority Sequencing Principle in (33b), in that there is an
increasing sonority cline between the first consonant and the second.
(33) a. Sonority Hierarchy (< indicates ‘less sonorant than’)
Obstruent (O) < Nasal (N) < Liquid (L) < Glide (G) < Vowel (V)
b. Sonority Sequencing Principle: (Clements 1990: 285)
Between any member of a syllable and the syllable peak, only sounds of
higher sonority rank are permitted.
To summarize, although these clusters are not part of the familiar English repertoire,
there are three general cross-linguistic markedness measures to which they conform:
they are compositional; they are homogeneously voiced; and they obey the Sonority
Sequencing Principle.
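The sonority claim can be checked mechanically. The following is a minimal sketch, assuming the five-way hierarchy in (33a) and a segment-to-class mapping supplied here for illustration; the names `SONORITY_RANK`, `SEGMENT_CLASS`, and `rises_in_sonority` are hypothetical.

```python
# Illustrative sketch only. Ranks follow the hierarchy in (33a); the mapping
# of segments to classes is an assumption made for this example.
SONORITY_RANK = {"O": 0, "N": 1, "L": 2, "G": 3, "V": 4}
SEGMENT_CLASS = {
    "b": "O", "d": "O", "z": "O",   # obstruents
    "m": "N",                       # nasal
    "l": "L",                       # liquid
    "w": "G", "y": "G",             # glides
}


def rises_in_sonority(onset: str) -> bool:
    """True if sonority strictly increases across the onset toward the peak."""
    ranks = [SONORITY_RANK[SEGMENT_CLASS[seg]] for seg in onset]
    return all(a < b for a, b in zip(ranks, ranks[1:]))


# The non-English onset clusters discussed in the text.
NON_ENGLISH_CLUSTERS = ["bw", "mw", "zw", "dl", "ly"]

if __name__ == "__main__":
    for cluster in NON_ENGLISH_CLUSTERS:
        print(cluster, rises_in_sonority(cluster))
```

Each of the five clusters shows a rising cline under this coarse classification, in line with (33b). A finer-grained hierarchy would require a richer mapping than the one assumed here.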
What sets this subset of onsets apart from standard English clusters as well as
from general cross-linguistic expectations is their relatively marked status with respect
to two other constraints on segmental sequencing, both of which fall within the broad
purview of the Obligatory Contour Principle (OCP). First, the systematic absence of
Liquid-Glide sequences in English reflects a general constraint on minimal sonority
distance (34a). In standard English, all Liquid-Glide onset clusters are prohibited:
*[ly-, *[lw-, *[ry-, *[rw-. In Betty Carter’s scat, however, [ly- slips past the *[Liquid-Glide
constraint. Secondly, militating against various assimilatory forces within the grammar
are certain context-sensitive pressures to avoid homorganic place. In standard English,
there are no Labial (*Lab–Lab) onset sequences: *[bw-, *[mw-, *[pw-, *[fw-, *[vw-, but
in Louis Armstrong’s scat [bw- and [mw- occur, these being the two that transition from
a voiced [-continuant] attack into the [w]. Similarly, with Betty Carter, it is the voiced
[-continuant] [d] that releases into a liquid [l] that violates the prohibition in standard
English against the *Cor–Cor sequences, *[dl- and *[tl-.
(34) Obligatory Contour Principle (OCP):
a. Minimal Sonority Distance (cf. Vennemann 1988, Clements 1990, Zec 2007):
*[Liquid-Glide: *[ly-
b. Avoidance of homorganicity in consonant-resonant onset clusters:
*Lab–Lab: *[bw-, *[mw-
*Cor–Cor: *[dl-
None of these constraints characterizes the other non-English cluster, [zw], that
Armstrong uses. On a cline of relative markedness, *[zw- is not strongly deviant: it is not
subject to repair strategies in the pronunciation of proper names like “Zwicky”; and its
voiceless onset counterpart [sw], as in sweet, sway, swan, swoon..., has well-established
familiarity in the non-scat lexicon of the romantic lyricists of this same era. Nonetheless,
[zw] is outside the boundaries of standard English phonotactics, and will be recognized
as such by the listener. The hypothesis developed in §4.3 below is that such violations
of the phonological system are not arbitrary: rather, they are strategic manipulations of
the dynamic constraints that conventionally delimit linguistic structure, functioning to
enhance a range of performative musical effects.
To summarize thus far, the argumentation in this section illustrates how
phonological markedness theory provides an insightful framework for characterizing why
certain overwhelmingly common patterns emerge in the scat syllables of different artists.
At the same time, the discussion reveals that this theoretical approach also functions to
identify what properties of the empirical residue are not amenable to general linguistic
explanation. Based just on an examination of syllable onsets, the fact that this residue is
extremely narrow in scope and in realization is itself an interesting finding. In the next
section, the relative markedness of coda realization is explored.
4.2.3. Coda Constraints
Although the vast majority of scat syllables in the repertoire here do not have a coda, 17%
do, as tabulated in (26). However, like onsets, their realization is very restricted. Of the 21
possible coda segments in English (see (4)), only six different segments appear: there are
multiple occurrences of [p, t, n, m, l] and a single occurrence of [g]. As the transcribed
value of this latter segment (§2.1.2, [1:56]) varies between [g] and [v]—either one of
which would be a unique attestation—it will not be incorporated into the following
discussion. In markedness terms, there are several cross-linguistic generalizations that
characterize the identity and distribution of the five other segments.
Note among the obstruents that there are no fricatives or affricates. There are only
the two plain anterior stops [p] and [t] which, in terms of frequency (see (26)), account
for 59% (49/83) of all attested codas. Given that these are the voiceless counterparts of
[b] and [d], which clearly emerge as the overwhelming segmental favourites in onsets,
a major question is why the value of [voice] is in complementary distribution
between onset and coda. Markedness theory offers a straightforward account of the
coda behaviour, in that the preference for obstruents to be voiceless in syllable-final
position (alternatively, at the end of a word or before another obstruent) is a widely
attested cross-linguistic phenomenon. This contextual neutralization underlies the OT
formalization of the positional markedness constraint in (35):
(35) *Voiced Coda (Kager 1999; cf. Steriade 1999, Gordon 2007, Zec 2007)
Obstruents must not be marked for [voice] in coda position.
This constraint is unviolated in the entire scat corpus documented here, and effectively
captures the relevant generalization: if a coda is an obstruent, then it must be voiceless.
Not all the attested codas are obstruents, however. The residual codas [m], [n], and
[l] are all sonorants. On the basis of the cross-linguistic observation that some languages,
like Chinese, allow only sonorants in coda position, Pepperkamp (2003) proposes the
markedness constraint in (36):
(36) *Obstruent Coda
Codas cannot be obstruents.
The postulated constraint in (36) makes two predictions. First, a phonological system
could have only sonorant codas, as Pepperkamp argues for Chinese. Secondly, a
phonological system could not have exclusively obstruent codas: that is, if it has obstruent
codas, then it also must have sonorant codas. This second type of system is exactly what
is documented for both tunes analyzed for each of Louis Armstrong and Chet Baker
(see (26)). Of particular interest, however, is the fact that this is not what has emerged for
either of the Betty Carter recordings. As summarized in (26), her inventory of codas is
precisely the system characterized by the first prediction: there are only sonorant codas.
This is really quite striking confirmation of the role of universal markedness constraints
in governing the strictly delimited inventory of scat.
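The two predictions of (36), together with the voicing restriction in (35), can be restated as a small classification routine. This is a sketch under simplifying assumptions: the obstruent sets below are a partial, illustrative listing, any segment outside them is treated as a sonorant, and the sonorant-only inventory in the usage lines is hypothetical rather than a transcription of any artist's codas.

```python
# Illustrative sketch only: segment sets and function names are assumptions.
OBSTRUENTS = set("ptkbdgfvsz")
VOICED_OBSTRUENTS = set("bdgvz")


def coda_system(codas) -> str:
    """Classify a coda inventory relative to the predictions of (36)."""
    inventory = set(codas)
    if not inventory:
        return "no codas"
    obstruents = inventory & OBSTRUENTS
    sonorants = inventory - OBSTRUENTS   # simplification: all else is sonorant
    if obstruents and not sonorants:
        return "obstruent-only (predicted not to occur)"
    if obstruents:
        return "obstruent and sonorant"
    return "sonorant-only"


def violates_voiced_coda(codas) -> bool:
    """True if any coda is a voiced obstruent, violating (35)."""
    return bool(set(codas) & VOICED_OBSTRUENTS)


if __name__ == "__main__":
    attested = ["p", "t", "n", "m", "l"]      # overall coda set from the text
    hypothetical_sonorant_only = ["n", "m"]   # hypothetical inventory
    print(coda_system(attested), violates_voiced_coda(attested))
    print(coda_system(hypothetical_sonorant_only))
```

The overall attested set falls into the mixed type and incurs no violation of (35); a sonorant-only inventory falls into the first predicted type.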
Moving to a consideration of place of articulation properties of codas, we note that
the limitation of the set of attested scat codas {p, t, n, m, l} to Labials and Coronals is also
systematically derivable from general tenets of markedness theory. Drawing on various
observed asymmetries in inventories, epenthesis, neutralization, etc., the markedness
hierarchy in (37) identifies Dorsal place as the most highly marked:
(37) Place Markedness Hierarchy (de Lacy 2007: 23)
*Dorsal »» *Labial »» *Coronal
Hence, the non-attestation of Dorsals and, concomitantly, the preferred status of
Coronals and Labials follow from this markedness generalization.
Finally, it is important to consider not just the distinctive properties of segments
in a particular prosodic position, but also aspects of their sequential relation to their
neighbours. As a dramatic example of harmonic assimilation, all nine instances of [ṇ] in
Chet Baker’s minimally contrastive articulatory flow are preceded by homorganic [t] and
followed by [d]. Thus, a single coronal non-continuant gesture is sustained across the trisegmental sequence, modulated only by velic movement for the oral-nasal contrast and
laryngeal voicing. Even in the context of the much more diverse articulatory repertoire
in Louis Armstrong’s Hotter Than That, an examination of trans-syllabic properties in it
reveals that the place of articulation in the vast majority of the 42 codas is homorganic
with the place of articulation of the following onset. Specifically, all eight cases of coda [t]
are followed by a [d] onset. Similarly both [n] codas precede [d]. All three post-vocalic
coda [m]s are also homorganic, in one case to [b] and in the other two cases to [w]. All
three tokens of syllabic [ṃ] follow a comparable pattern, preceding onsets [w], [m], and
cluster [mw]. Aside from the unique instance of a [g], the only coda segment that is
ever independent of this assimilatory effect is Louis Armstrong’s favoured coda in these
works, [p]. Still, the majority of [p] codas (13/22 = 59.1%) precede homorganic [b]. The
residual nine—all of which occur before [d]—are the only non-homorganic codas in this
entire scat set.
Again, these coda-onset assimilatory patterns constitute further evidence of
a remarkably consistent and delimited range of vocal behaviours in scat that are
systematically correlated with a broadly motivated positional markedness constraint, the
Coda-Condition:
(38) Coda-Condition (Itô 1989; Kager 1999)
A coda cannot have a place feature different from the following onset.
Note that (38), which fosters adjacent labial-labial or coronal-coronal articulations, is
differentiated from (34b), which militates against labial-labial or coronal-coronal
sequences, by virtue of prosodic context. The former applies across a coda-onset sequence
whereas the latter obtains between segments within a complex onset.
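The coda-onset tallies reported above can be recomputed as a check on (38). The sketch below uses only the non-syllabic coda counts enumerated in the text; the place assignments, including the treatment of [w] as labial, follow the discussion above, while the variable and function names are illustrative assumptions.

```python
# Illustrative sketch only: names are assumptions; counts are those reported
# in the text for non-syllabic codas and their following onsets.
from collections import Counter

PLACE = {
    "p": "labial", "b": "labial", "m": "labial", "w": "labial",
    "t": "coronal", "d": "coronal", "n": "coronal",
}

# (coda, following onset) -> number of tokens
PAIR_COUNTS = Counter({
    ("t", "d"): 8,
    ("n", "d"): 2,
    ("m", "b"): 1,
    ("m", "w"): 2,
    ("p", "b"): 13,
    ("p", "d"): 9,
})


def is_homorganic(coda: str, onset: str) -> bool:
    """True if coda and following onset share place, satisfying (38)."""
    return PLACE[coda] == PLACE[onset]


non_homorganic = sum(
    n for (coda, onset), n in PAIR_COUNTS.items() if not is_homorganic(coda, onset)
)
p_total = sum(n for (coda, _), n in PAIR_COUNTS.items() if coda == "p")
p_homorganic_share = round(100 * PAIR_COUNTS[("p", "b")] / p_total, 1)

if __name__ == "__main__":
    print(non_homorganic)       # 9
    print(p_homorganic_share)   # 59.1
```

The only violations of (38) in this tally are the nine [p] codas before [d], matching the residue identified in the text.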
What has been argued in this section is that all the defining properties of scat
codas in the current sample fall directly within the explanatory framework of the
independently motivated theory of phonological markedness. They may be exclusively
sonorant (36); if obstruent, they are voiceless (35); they are solely coronal and labial (37);
and they are overwhelmingly homorganic with a following onset (38).
4.2.4. Why are [d] and [b] the favoured onsets in scat?
Having examined the markedness properties of syllable shape, of complex onsets, and of
codas, let us now return to two fundamental questions raised in the introduction, where it was
noted that, in the Louis Armstrong excerpts in (1) and (2), [d] and [b] are the only onsets.
The first question was whether this observation is genuinely representative of scat
or whether it is essentially accidental, attributable perhaps to this particular artist, to
selective sampling, to the stylistic phraseology of these brief excerpts, or to some other
factor. The present analysis clearly affirms that these two segments are indeed the most
prevalent onset consonants. As summarized in (24), [d] is incontestably the favoured scat
onset in every tune by every artist examined in §2. The next most frequently attested
onset is [b]. However, as noted earlier, there is an evident asymmetry in their usage. In
two of the six songs, [b] is not used at all; in a third, it appears only once. In the other
three songs, it ranks below [d] with a broad range of variance: from a 66.7% difference
(in Armstrong §2.1.1) to 12.8% (Armstrong §2.1.2) to 4.4% (Carter §2.3.1). Nonetheless,
despite this imbalance between [d] and [b], they both clearly stand out as more prominent
than any of the other nine consonants that appear in onsets.
The second question is what the explanatory basis of this generalization might
be. Notably, it does not mirror standard English frequency patterns. According to the
Francis and Kucera (1982) count of the token frequency of all word-initial onsets in a
corpus of over a million English words, neither [d] nor [b] stands at the head of the
relative frequency ranking of single consonant onsets, summarized in (39a).²³ In fact, in
terms of the absolute measures cited in (39b), [b] is almost twice as frequent as [d] in this
extensive corpus of standard English usage, a result that is opposite to the consistently
greater frequency and breadth of distribution of [d] over [b] in scat.
(39) a. ð > w > h > b > t > s > k, m > f > d > ...
b. [b] = .05335, [d] = .02762
In short, standard English frequency measures do not account for the two most robust
generalizations that have emerged in the scat data: (1) the preference of [d] over [b]; and
(2) the prevalence of [d] and [b] over all other consonants in the English inventory.
Phonological markedness theory contributes significantly to a principled
interpretation of these results. First, consider place of articulation. The Place Markedness
Hierarchy already introduced in (37) effects an internal ranking of the coronal place as
the most optimal (least marked), of labial as an intermediate class, and of dorsal as
the least optimal (most marked). Two major empirical findings about scat onsets follow
directly from this hypothesis of a fixed place hierarchy: one is the preferred status of
coronal [d] over labial [b]; and the second is the absence of dorsal [k] or [g]. Dorsals,
the most marked of the English stops, are simply unattested as scat onsets. Thus, the
place features of [d] and [b] are clearly consistent with fundamental markedness tenets.
But, what about their manner and laryngeal properties?
With respect to manner, the fact that [d] and [b] are both obstruents accords with
the cross-linguistic preference for low sonority onsets, captured by the fixed positional
markedness constraint hierarchy in (40a). Further, within the class of obstruents, the
articulated subcategorizations of (40b) identify stops as less sonorant than fricatives.
(40) a. Optimal Onset Sonority (de Lacy 2001; Prince and Smolensky 2004)
*Onset/L » *Onset/N » *Onset/O
b. Obstruent Sonority (Dell and Elmedlaoui 1985; Prince and Smolensky 2004)
*voicedFric » *voicelessFric » *voicedStop » *voicelessStop
The combined effect of the markedness relations in (40a) and (40b) accords
directly with the notable paucity (a single attestation: see (24)) of fricatives as a simplex
onset (Sarah Vaughan’s and Sha Na Na’s trademark [š] notwithstanding). We conclude
then that, for manner features, phonological markedness theory provides considerable
explanatory coverage of the favoured status of [d] and [b] in the full context of the highly
constrained scat inventory.
However, a major anomaly persists with respect to laryngeal markedness: the
privileged status of the voiced obstruents [d] and [b] and, correlatively, the extreme
rarity of their voiceless counterparts [p] and [t] as scat onsets are directly counter to
the predictions of markedness theory. That is, based on cross-linguistic generalizations
²³ This rank order is constructed from the Francis-Kucera token frequencies for single consonant onsets
cited in the appendix to Stemberger 1990: 157. The cited values in (39b) are from this same source.
from a variety of perspectives (including the typology of sound systems,²⁴ natural classes,
direction of neutralization, direction of language change, segmental complexity, perceptual
and articulatory contrast, and other factors), voiceless obstruents are the unmarked series,
this generalization motivating the markedness constraint in (41):
(41) Voicing markedness (de Lacy 2002)
*[+voice, -sonorant] Obstruents must be voiceless.
As shown in (24), [d] and [b] together comprise 77.9% of scat syllable onsets. In
contrast, there are no instances of [p] in onset position, and only a single occurrence of
[t] (in Louis Armstrong’s Hotter Than That). Beyond the database of tunes analyzed
in §2, a full examination of Bauer’s (2002a: 245–343) prodigious body of transcriptions
of Betty Carter’s scat corroborates this generalization: in the entire collection, [t] is
unattested as an onset and [p] is exceedingly uncommon.²⁵ Given robust cross-linguistic
support for (41), it can only be concluded that the overwhelming preponderance
of the voiced obstruents [d] and [b] as onsets in scat, combined with the virtual
absence of voiceless [t] and [p], is distinctly odd from a markedness perspective. The
very fact that this pervasive asymmetry is clearly defined in markedness terms as
deviant is a productive consequence of the theory, and is pursued further in §4.3.
4.2.5. The Contributions of Markedness Theory
To summarize, the goal of §4.2 has been to explore the degree to which phonological
markedness theory provides an explanatory framework for the observed patterns
in syllable shape and segmental inventory in scat. The results of this approach are of
considerable interest, I believe, to deepening our understanding of the interface of natural
language systems and musical vocal performance. Although couched in an optimality
theoretic framework, the markedness generalizations invoked here essentially derive
from the confluence of a diversity of insightful conceptual approaches to markedness
issues that have spanned many decades of linguistic research. The breadth of empirical
coverage offered by an essentially small body of tightly constrained and independently
motivated hypotheses is considerable.
First, in §4.2.1 it is seen that the robust preference for open CV syllables in scat
directly accords with the four universal markedness constraints in (32) that govern
syllabic form. There are two kinds of deviations from this basic canon: a very small set
(2%) of syllables have complex onsets and a larger set (17%) have simplex codas. There
are no complex codas.
²⁴ Obstruents are exclusively voiceless in the phonological inventories of many languages, e.g.
hən̓q̓əmin̓əm̓ (Salish), Nuu-chah-nulth (Wakashan), Hawaiian (Austronesian), Korean, etc.
Further, the presence of voiced obstruents in a language characteristically entails the presence of
their voiceless counterparts, as in English.
²⁵ Two tokens of [p] are in You’re Driving Me Crazy (Bauer 2002a: 252; bars 7, 9); a third token is in
the initial syllable of bar 57 in the 1979 take of I Could Write a Book (Bauer 2002a: 307). These last
two are plausibly interpretable as an ambisyllabic coda-onset from the preceding [bap] syllable.
Then, to extend the analysis beyond syllable shape, the particular properties
that “mark” the attested complex onsets are examined in §4.2.2, the properties of codas
are analyzed in §4.2.3, and the observed featural asymmetries of simplex onsets are
investigated in §4.2.4. As shown in §4.2.3, the identity and distribution of the limited
repertoire of codas conform to positional markedness constraints on voice (35), manner
(36), and coda-onset place agreement (38), as well as to the fixed place hierarchy in (37).
In §4.2.4, it is seen that all but one dimension of the featural identity of the restricted
inventory of scat onsets follows markedness patterns. Specifically, they comply with the
fixed place hierarchy of (37), with the positional markedness constraints governing the
intersection of sonority and manner in onsets in (40a), as well as with the fixed hierarchy
in (40b) that optimizes non-continuant manner and sonority. Onsets deviate on only
one—albeit a perceptually highly salient—measure: voice, as formalized by the constraint
in (41). An alternate hypothesis was therefore tested, namely, that this “anti-markedness”
result might correlate with identified frequency patterns for standard English onsets.
However, comparative evaluation of the evidence shows no systematic relationship to
support a frequency hypothesis.
In sum, the explanatory power of a markedness account of these diverse
and strikingly consistent factors is substantial. A further empirical strength
of applying markedness theory to the analysis of scat is that the theory characterizes
not only what corresponds with English and/or universal language patterns, but also
the specific nature and locus of deviance. In essence, markedness
theory effectively subcategorizes the residue of scat properties that fall outside of the
predictions of markedness-governed sound patterns into two domains. One is absolutely
pervasive across all singers, namely: the consistent realization of voiced [d] and [b] onsets,
to the virtual exclusion of their voiceless unmarked counterparts [t] and [p]. The other
is a much less coherent set of often unique attestations of highly marked segments, such
as Louis Armstrong’s once-only insertion of a triple sequence of [f] codas and later of
[sk] onsets in Heebie Jeebies. The fact that the empirical residue that is not amenable
to a linguistic markedness explanation is extremely narrow in scope is theoretically
interesting, and suggests that quite specific competing functional forces external to the
linguistic system may be at play. Scat is, after all, a component of each jazz artist’s musical
repertoire and individual creativity. In the following section, the musical interface of
vocal performance with the “marked” linguistic residue is examined.
4.3. The Performative Interface between Vocal Music and Phonological
Markedness
Given that scat is at the interface of constraints on linguistic sound structure and the
exigencies of vocal music production, the central issue to be addressed here is whether
there are factors of musical performance or creative expression that conflict with and
override phonological markedness, thus providing a systematic explanation for the
deviant residue identified in §4.2.
4.3.1. Voice
The most salient property of scat syllables that breaches markedness patterns is the
pervasive preference for [d] and [b] as onsets, to the virtual exclusion of [t] and [p].
In natural language systems, there is a robustly attested cross-linguistic asymmetry:
a phonological inventory may have only voiceless obstruents, or both voiced and voiceless,
but not only voiced obstruents. This asymmetry is formalized in (41) by hypothesizing a
markedness constraint that identifies *[+voice, -sonorant] segments as “marked”, in the
absence of a corresponding constraint prohibiting [-voice, -sonorant] segments. What
has been documented in §2 is that in scat this asymmetry is reversed. The fundamental
question is why: what properties obtain in the performative context of scat that would
create pressure to systematically violate this constraint?
We have seen the role played by the canonical syllable structure constraints in (32)
to optimize a repetitive CV alternation in natural language. Further, it is hypothesized,
through constraints like (40), that the optimal CV contour alternates between minimal
sonority onsets (voiceless stops) and maximal sonority nuclei ([a]), thus enhancing
perceptual contrast of the peaks and troughs. What is happening in scat is that the
optimal voiceless stop onset is being compromised just one notch up the Obstruent
Sonority hierarchy in (40b) to the category of voiced stops. The articulatory factor that
differentiates these two categories is vocal cord vibration. The functional relevance of
this difference is that vocal cord vibration is essential to carry pitch. In natural language,
pitch plays a criterial role in marking tone, intonation, and often stress. However,
linguistically significant pitch is characteristically carried not by the onset, but by the
syllabic nucleus, and sometimes by subsequent coda elements dependent on sonority.
In music, pitch is fundamental to the expression of melody. Unlike language, the
melodic line of music is not structured by constraints that optimize a voiceless-voiced
contour. To the contrary, although periods of relative silence (rests) contribute to phrasing
a melody, the foundation of a melodic line is a continuous soundwave that allows pitch
realization and modification in order to create a succession of tones.
Consequently, in vocal music that incorporates natural language, there is an
inevitable tension between the articulation of voiceless sounds and the fluidity of
the melodic line, since voiceless segments cannot bear pitch. Research on singing in
Tashlhiyt Berber (Dell and Elmedlaoui 2008), a language with extensive sequences of
voiceless obstruents, reveals two strategies for the realization of the melodic line across
such stretches. One strategy is to prolong a preceding vowel so that it carries not only its
own tone, but also the tone that metrical scansion would normally assign to the following
voiceless syllable. A second strategy is to epenthesize an unmarked schwa vowel to carry
the pitch of the associated melody.
What I hypothesize is happening in scat is a third strategy, namely: the musical
melodic imperative for voiced realization overrides the natural language markedness
constraint in (40b) that optimizes voiceless stops in onsets. Because voiced stops can,
however briefly, carry some pitch realization, they satisfy the high-ranked musical
constraint on melodic voicing and therefore emerge as the most prevalent onsets in
scat. This interface between the vocal music “Melodic Voicing” constraint, formalized in
(42a), and the markedness constraints on onset sonority (deconstructing *Onset/O from
(40a) into the more refined hierarchy of (40b)) is illustrated in the tableau in (42b):
(42) a. Vocal Music Melodic Voicing Constraint: *[-voice]/ ♫ melody
        Sung segments must be voiced.
     b.
                  | Music System         | Language System
                  | *[-voice]/ ♫ melody  | *Onset/voicedStop | *Onset/voicelessStop
        a.   [ta] | *!                   |                   | *
        ☞ b. [da] |                      | *                 |
Similarly, the absence of [p] onsets, despite the occurrence of [b] onsets, follows from this
same interface where the musical constraint (42a) outranks the language constraints.
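The evaluation in (42b) can be sketched as a lexicographic comparison of violation profiles, which is one common way of modelling strict constraint ranking. The Python below is a minimal illustration under stated assumptions: the function names are hypothetical, candidates are plain strings, and each constraint inspects only the first segment (the onset), which simplifies (42a) since that constraint is stated over all sung segments.

```python
# Illustrative sketch (not from the source): evaluate candidates against a
# ranked list of constraints, in the manner of the tableau in (42b).

def melodic_voicing(cand):
    """(42a) *[-voice]/melody, simplified here to check the onset only."""
    return 1 if cand[0] in "ptk" else 0

def no_voiced_stop_onset(cand):
    """*Onset/voicedStop, from the hierarchy in (40b)."""
    return 1 if cand[0] in "bdg" else 0

def no_voiceless_stop_onset(cand):
    """*Onset/voicelessStop, from the hierarchy in (40b)."""
    return 1 if cand[0] in "ptk" else 0

def optimal(candidates, ranking):
    """Pick the candidate whose violation profile is lexicographically smallest."""
    return min(candidates, key=lambda c: tuple(con(c) for con in ranking))

# Language system alone versus the music constraint ranked on top.
LANGUAGE_ONLY = [no_voiced_stop_onset, no_voiceless_stop_onset]
SCAT = [melodic_voicing] + LANGUAGE_ONLY

print(optimal(["ta", "da"], LANGUAGE_ONLY))  # ta
print(optimal(["ta", "da"], SCAT))           # da
```

With the language constraints alone the voiceless onset wins; adding the melodic voicing constraint at the top of the ranking reverses the outcome, and the same reversal holds for the pair [pa] and [ba].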
4.3.2. Performative Markedness
The second kind of deviance identified by markedness theory does not cohere in a
single identifiable characteristic. Rather, there is an assortment of infrequent—indeed,
often unique—attestations of sounds that are relatively marked in terms of general
cross-linguistic patterns identified in §4.2. The question to be explored in this section
is whether there might be an independent functional motivation for the inclusion of
marked structure.
Consider, for example, the diverse array of complex onset clusters listed in (25)
and discussed in §4.2.2. A variety of different frequency measures related to these onset
clusters attest to their rarity. Clusters occur in only three of the six scat tunes, and in
the output of only two of the three artists: there are no clusters in Chet Baker’s work.
In total, 10 cluster tokens are attested in the database of 461 syllables: thus, they constitute
a mere 2.17% of onsets. Louis Armstrong’s output contains six of the ten clusters; even
within his scat corpus of 213 syllables, the percentage of complex onsets is only marginally
higher: 2.8% (6/213). In terms of individual frequency, five of the seven different types of
documented clusters are unique attestations. As shown in the summary onset chart in
(24), even certain simplex onsets are attested only once in the present sample.
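The cluster frequencies above follow directly from the reported counts. A short check, using only the figures given in the text (the variable names are introduced here for illustration):

```python
# Counts as reported in the text; variable names are illustrative.
total_syllables, total_clusters = 461, 10
armstrong_syllables, armstrong_clusters = 213, 6

overall_pct = 100 * total_clusters / total_syllables
armstrong_pct = 100 * armstrong_clusters / armstrong_syllables

print(f"{overall_pct:.2f}%")    # 2.17%
print(f"{armstrong_pct:.1f}%")  # 2.8%
```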
The basic generalization that the full body of frequency measures reported on
here sustains is that the inventory of segments and combinatorial possibilities in scat
is extremely limited in comparison to the full inventory of sounds in English, not to
mention the even more extensive repertoire of phones in other natural language systems
or, in fact, in the extraordinary array of oral articulations that the human vocal apparatus
is capable of. A further claim has been advanced that the very limited inventory of scat
as defined in terms of frequency correlates very strongly with segments and structures
that are characterized as relatively unmarked according to basic tenets of phonological
markedness theory. Consider then the cognitive impact of deviation from what has
been identified as the high frequency, phonologically unmarked norm for scat. The
introduction of novelty into an established, familiar sequence will command immediate
auditory attention and interest.
For example, as noted earlier (§1.2.1) in the discussion of Heebie Jeebies, it is after
an extended auditory train of 22 [d]-initial syllables, minimally broken by two [b]-initial
syllables, that Louis Armstrong interjects the alliterative sequence of three syllables with
an [sk] onset cluster. The auditory impact—phonologically and musically—is undeniably
powerful. A variation on Armstrong’s creative use of deviant phonology to musical effect
occurs in Hotter Than That, where the trio of non-English clusters [zw], [mw], and
[bw] occur within a few bars of one another, establishing a brief articulatory leitmotif
that through its disruption of the familiar scat canon simultaneously produces auditory
tension and artistic interest. Betty Carter’s repertoire, as observed in §2.3.1, is similarly
enriched and the listener’s attention engaged by unanticipated occurrences of the highly
marked [dl] or [ly] clusters. Chet Baker’s implementation of phonological deviance is
illustrated by the distribution of [h]. As pointed out in §2.2, in each of the 1955 and the
1989 versions of Everything Happens to Me, [h] occurs only once, in different places, but
in both cases its uniqueness functions to demarcate the initial syllable, a particularly
prominent prosodic position, of a musically important phrase. The hypothesis, then,
is that phonological deviance and low frequency may functionally conspire in scat to
enhance perceptual salience. By challenging the limits of phonological markedness
constraints manifest in scat, a jazz artist can effectively arrest the listener’s attention,
strategically manipulate the cognitive tension of phonological deviance, and creatively
expand the expressive repertoire at the interface of language and vocal music.
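The Heebie Jeebies observation can be checked mechanically against the transliteration in Appendix 2 (§2.1.1). The sketch below is illustrative only: the vowel set, the treatment of rests and hold dashes, and the function name are assumptions made here. It takes the onset of each syllable to be whatever precedes its first vowel symbol, then counts the unbroken run of [d]- and [b]-initial syllables immediately before the first [sk] onset.

```python
import re

# Transliterated (second-line) Heebie Jeebies syllables, copied from Appendix 2
# (section 2.1.1), from the start of the solo through the bar after the [sk] onsets.
PAS_LINES = [
    "eə | iyf Œ gæf Œ | əmf ‰ diy bə ‰ | diy də la bam Œ | rɩp ɩp ‰ dɩ duw diy duwt |",
    "duw Œ duw diy duw də | ‰ diy də də dow diy | dow dɩ dow duw ‰ duw– |",
    "–bə duw biy dey də | skiyp Œ skæm Œ | skɩ bəp ‰ diy də dɩ də |",
]

NON_SYLLABLES = {"|", "Œ", "‰", "≈", "Ó"}  # bar lines and rest symbols
VOWELS = "aeiouæəɩɛɔʋ"                      # assumed set of vowel symbols

def onsets(lines):
    """Return the onset (consonants before the first vowel) of each syllable."""
    result = []
    for token in " ".join(lines).split():
        token = token.strip("–-")           # drop hold markers at bar lines
        if not token or token in NON_SYLLABLES:
            continue
        result.append(re.match(f"[^{VOWELS}]*", token).group())
    return result

seq = onsets(PAS_LINES)
first_sk = seq.index("sk")

# Walk backwards from the first [sk] while the onset is [d] or [b].
start = first_sk
while start > 0 and seq[start - 1] in ("d", "b"):
    start -= 1
run = seq[start:first_sk]

print(run.count("d"), run.count("b"))  # 22 2
```

The run begins after the onsetless syllable [ɩp] and contains 22 [d] onsets and two [b] onsets, followed by three consecutive [sk] onsets.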
5.
Conclusions
Scat extends the vibrant improvisational genre of instrumental jazz to the human voice. As
an oral idiom, scat draws on the same articulatory apparatus as natural human languages
do. Because it is uniquely situated at the interface of the cognitive and performative
systems that underlie both music and language, scat can potentially deepen our insight
into the complex organizational structure of each of these particularly human creative
systems, as well as of their interaction.
Through investigation of the form of scat syllables used by three renowned jazz
vocalists—Louis Armstrong, Chet Baker, and Betty Carter—in performances that range
across time from 1926 to 1989, the analysis in §2 establishes that, despite their markedly
different musical styles and individual creativity, the set of consonantal sounds that these
diverse performers draw on in the creation of scat syllables is strikingly consistent and is
extremely limited in comparison to the extensive range of segments and combinatorial
possibilities that define the inventory of syllables in English. Moreover, as shown in §3,
the observed generalizations apply not only within the classical scat canon of these scat
masters, but also in scat-derived pop music lyrics of the early 60s rock’n’roll era.
Given that the articulatory range of the human vocal apparatus far exceeds what
is found in scat, the fundamental goal pursued in §4 is to explore what might account
for the attested sound patterns in scat. Three hypotheses are investigated. The first (§4.1),
familiar from the jazz literature, holds that jazz vocalization is essentially imitative of
jazz instrumental expression. Although this hypothesis holds considerable popular appeal,
it is difficult to substantiate from an empirical perspective.
The second hypothesis (§4.2) is that phonological markedness theory provides an
insightful framework for characterizing why certain overwhelmingly common patterns
emerge in the scat syllables of different artists. It is argued that there is substantial support
for this approach. Specifically, all the defining properties of scat codas in the current
sample fall directly within the explanatory framework of independently motivated
markedness constraints: codas may be exclusively sonorant, in accordance with (36)
*Obstruent Coda; if obstruent, they are voiceless, following (35) *Voiced Coda; they
are solely coronal and labial, to the exclusion of dorsals, in conformity with (37) the Place
Markedness Hierarchy; and they are overwhelmingly homorganic with a following onset,
in adherence to (38) the Coda-Condition positional markedness constraint. With respect
to onsets, the clear preference for stops accords with the cross-linguistic preference for low
sonority onsets, captured by the positional markedness constraint in (40a) that defines
Optimal Onset Sonority, combined with the obstruent sonority hierarchy in (40b). The
fact that dorsals, the most marked of the English stops, are unattested as scat onsets, as
well as the preferred status of coronal /d/ over labial /b/, follows, as is the case with codas,
from the Place Markedness Hierarchy in (37). In sum, phonological markedness theory
provides considerable explanatory coverage of the highly constrained scat inventory.
Not all of the observed scat data is interpretable in terms of phonological
markedness, however. A valuable strength of the theory is precisely this consequence of
its identifying two sets of empirical residue that are not amenable to general linguistic
explanation. In §4.3 it is argued that such violations of phonological markedness are
systematic, and function to enhance performative components at the interface of vocal
music and the phonological system.
The most striking and consistent property of scat syllables that challenges
markedness theory is the overwhelming preference for the voiced obstruents [d] and [b]
as onsets, in conjunction with the extreme rarity of their voiceless counterparts [p] and
[t]. Whereas in a natural language system, the perceptual contrast between onset and
nucleus is optimized by [-voice] onsets, in a vocal music system, the pitch level of the
melodic line can only be carried by [+voice] segments. What is hypothesized here is that
the musical imperative for melodic/voiced realization over-rides (in optimality terms,
outranks) the natural language markedness constraint.
The second type of scat anomaly involves a diversity of infrequent attestations
of relatively marked sounds. Because the introduction of deviation into a stream of
high frequency, phonologically unmarked scat has considerable cognitive impact, it is
hypothesized that phonological deviance and low frequency may functionally conspire in
scat to enhance perceptual salience. By defying the familiar bounds of the scat inventory,
a jazz singer can effectively capture the listener’s attention, extend the articulatory
repertoire that he or she can creatively work with, and transcend the familiar.
At the heart of improvisational creativity in music, as in language, is the challenge of
innovation under the constraints of structural limitations. Although an intricate variety
of cognitive and performative factors contribute to scat improvisation, what has been
argued in this paper is that phonological markedness theory provides an explanatory
framework of the structural constraints that largely define the articulatory domain of
scat. Interfacing with this phonological framework, and sometimes over-riding specific
constraints within it, are performative exigencies of melodic realization and the ineffable
creative workings of improvisational genius.
References
Bauer, William R. 2002a. Open The Door: The Life and Music of Betty Carter. Ann Arbor:
University of Michigan Press.
Bauer, William R. 2002b. Scat Singing: A Timbral and Phonemic Analysis. Current
Musicology 71–73: 303–322.
Bastian, Jim, and John Alexander. 1995. Chet Baker’s Greatest Scat Solos. Smithfield, RI:
Coastal Publishing and Educational Resources.
Berliner, Paul. 1994. Thinking in Jazz: The Infinite Art of Improvisation. Chicago:
University of Chicago Press.
Chomsky, Noam, and Morris Halle. 1968. The Sound Pattern of English. NY: Harper and
Row.
Clements, G.N. 1990. The Role of the sonority cycle in core syllabification. In Papers in
Laboratory Phonology I: Between the grammar and physics of speech, ed. J. Kingston
and Mary E. Beckman. Cambridge: Cambridge University Press. 283–333.
de Lacy, Paul. 2001. Markedness in prominent positions. In MITWPL 40: HUMIT 2000,
eds. O. Matushansky et al. Cambridge, MA. 53–66. [also ROA 542]
de Lacy, Paul. 2002. The formal expression of markedness. University of Massachusetts,
Amherst: Doctoral dissertation.
de Lacy, Paul. 2007. Themes in phonology. In The Cambridge Handbook of Phonology, ed.
Paul de Lacy. Cambridge: Cambridge University Press.
Dell, F. and M. Elmedlaoui. 1985. Syllabic consonants and syllabification in Imdlawn
Tashlhiyt Berber. Journal of African Languages and Linguistics 7: 105–130.
Dell, F. and M. Elmedlaoui. 2008. Poetic meter and musical form in Tashlhiyt Berber songs.
Cologne: Rüdiger Köppe.
Francis, W.N. and H. Kucera. 1982. Frequency analysis of English usage: Lexicon and
grammar. Boston: Houghton Mifflin.
Gordon, Matthew. 2007. Functionalism in phonology. In The Cambridge Handbook of
Phonology, ed. Paul de Lacy. Cambridge: Cambridge University Press.
Greenberg, Joseph H. 1963. Memorandum concerning language universals. In Universals
of Language, ed. J.H. Greenberg. Cambridge, MA.
Greenberg, Joseph H. 1966. Universals of Language. Cambridge: MIT Press.
Greenberg, J.H. 1978. Some generalizations concerning initial and final consonant
clusters. In Universals of human language 2: Phonology, ed. J.H. Greenberg. 243–280.
Stanford: Stanford University Press.
Hurch, Bernard. 2005. Studies on reduplication. Empirical approaches to language typology,
No. 28. Berlin: Mouton de Gruyter.
Ito, Junko. 1989. A prosodic theory of epenthesis. Natural Language and Linguistic Theory
7: 217–259.
Jakobson, Roman. 1941/1968. Child language, aphasia and phonological universals. The
Hague and Paris: Mouton.
Kager, R. 1999. Optimality Theory. Cambridge: Cambridge University Press.
Kaye, Jonathan, and Jean Lowenstamm. 1984. De la syllabicité. In Forme sonore du
langage: Structure des représentations en phonologie, ed. F. Dell, D. Hirst and J.-R.
Vergnaud. Paris: Hermann. 123–159.
Kernfeld, Barry. 1995. What to Listen for in Jazz. New Haven, Conn.: Yale University Press.
McCarthy, John, and Alan Prince. 1986. Prosodic morphology. Technical report 32. Rutgers
University Center for Cognitive Science. (online revised version:
http://ruccs.rutgers.edu/pub/papers/pm86all.pdf )
McCarthy, John, and Alan Prince. 1994. The emergence of the unmarked: Optimality
in prosodic morphology. In Proceedings of the North East Linguistic Society 24, ed.
Mercè Gonzàlez. Amherst, MA: GLSA Publications. 333–379.
McCarthy, John, and Alan Prince. 1995. Faithfulness and Reduplicative Identity. In
University of Massachusetts Occasional Papers in Linguistics 18, eds. Jill Beckman,
Laura Walsh Dickey, and Suzanne Urbanczyk. 249–384. Amherst, MA: GLSA.
Moravcsik, Edith. 1978. Reduplicative constructions. In Universals of human language 3:
Word structure, ed. J. H. Greenberg. 297–334. Stanford, CA: Stanford University Press.
Peperkamp, S. 2003. Phonological Acquisition: Recent Attainments and New Challenges.
Language and Speech 46, 2–3: 78–113.
Prince, Alan, and Paul Smolensky. 2004. Optimality Theory: Constraint Interaction in
Generative Grammar. Oxford: Basil Blackwell. [1993. ROA 537]
Robinson, J. Bradford. 2002. Scat Singing. In New Grove Dictionary of Jazz, Vol. 3, ed.
Barry Kernfeld. 515–516.
Sapir, Edward. 1933. The Psychological Reality of the Phoneme. In Selected Writings of
Edward Sapir in Language, Culture and Personality, ed. David Mandelbaum. 1986.
Berkeley: University of California Press.
Stemberger, Joseph Paul. 1990. Wordshape errors in language. Cognition 35: 123–157.
Steriade, Donca. 1999. Phonetics in phonology: The case of laryngeal neutralization. In
Papers in Phonology 3, ed. Matthew Gordon. UCLA Working Papers in Linguistics
2. UCLA. 25–146.
Stewart, Milton L. 1987. Stylistic Environment and the Scat Singing Styles of Ella Fitzgerald
and Sarah Vaughan. Jazzforschung/Jazz Research 19: 61–76.
Stoloff, Robert. 2003. Blues Scatitudes. Brooklyn: Gerard and Sarzin Publishing Co.
Trubetzkoy, Nikolaj. 1939. Grundzüge der Phonologie. Göttingen: Vandenhoeck & Ruprecht.
Vennemann, Theo. 1988. Preference laws for syllable structure. Berlin: Mouton de Gruyter.
Zec, Draga. 2007. The Syllable. In The Cambridge Handbook of Phonology, ed. Paul de
Lacy. Cambridge: Cambridge University Press.
Appendix 1. Transcription conventions
Reducing the dynamic auditory flux of articulatory movement in scat vocalization
to discrete transcriptional conventions that are defined in terms of phonologically
independent unitary segments and syllables entails both informed choice and compromise.
Some of the major factors impacting on the transcriptions presented in Appendix 2 are
discussed here.
First, there are some different notational symbols that are aligned with different
transcription traditions. Given the American roots of jazz, certain “Americanist” symbols,
like [š] and [č], are here adopted, rather than their corresponding IPA counterparts [ʃ]
and [tʃ]. Nasalization is indicated by a tilde over the vowel, e.g. [ã]. A “syllabic” resonant
is marked with a subscript dot, e.g. [ṇ].
More nuanced are issues related to levels of abstraction away from phonetic
detail. The transcriptions in Appendix 2 are basically very broad, ignoring most aspects
of phonetic realization that are entirely regular, such as pre-tonic aspiration of stops.
However, certain other features that are normally non-contrastive in English, but that
surface prominently in unpredictable environments are explicitly marked, for example,
the nasalization in Louis Armstrong’s string of syllables initiated in bar 49 of §2.1.2.
A particularly complex domain is the representation of vowels, as their articulation
tends to be highly mobile. Bauer opts for a relatively abstract transcription, based on the
Trager-Smith system for phonemicization of English. Bauer defines the interpretation
of vowels with reference to the words in the table below (see Chart 1 in Bauer (2002a:
238); and Table 3 in Bauer (2002b: 306–307)). In order to standardize the transcription
system used for all the scat data considered in the present study, the Trager-Smith
(abbreviated T-S) representations are here interpreted as in the “PAS” column below.
The major differences are in the representation of lax vowels and of the T-S post-vocalic
/h/. Bauer’s transcriptions reproduced in Appendix 2 include both his T-S notation and
a transliteration in terms of the general correspondences set out in (1).
(1) English Word   T-S   PAS     English Word   T-S   PAS
    beat           iy    iy      boot           uw    uw
    pit            i     ɩ       put            u     ʋ
    bait           ey    ey      boat           ow    ow
    pail           eh    ɛə      caught         oh    ɔ
    pet            e     ɛ       pot            a     a
    pat            æ     æ       cut            ə     ə
Both systems neutralize the considerable variation in vowel realization that may relate to
an individual singer’s articulation, phonological context, or melodic interpretation.
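The correspondences in (1) amount to a vowel-by-vowel rewriting of the T-S line. The following Python sketch shows one way such a transliteration could be mechanized; it is an illustration under stated assumptions, not the procedure actually used to prepare Appendix 2. The function and variable names are hypothetical, the mapping follows table (1) literally, and the output is not guaranteed to match every transliterated line in Appendix 2.

```python
import re

# T-S -> PAS vowel correspondences, copied from table (1).
TS_TO_PAS = {
    "iy": "iy", "uw": "uw", "ey": "ey", "ow": "ow",
    "eh": "ɛə", "oh": "ɔ",
    "i": "ɩ", "u": "ʋ", "e": "ɛ", "a": "a", "æ": "æ", "ə": "ə",
}

# Two-letter nuclei are tried first so that, e.g., "iy" is not read as "i" + "y".
_NUCLEUS = re.compile("|".join(sorted(TS_TO_PAS, key=len, reverse=True)))

def transliterate(ts_line):
    """Rewrite each T-S vowel nucleus in a transcription line as its PAS value."""
    return _NUCLEUS.sub(lambda m: TS_TO_PAS[m.group()], ts_line)

print(transliterate("rip ip ‰ di duw diy duwt"))  # rɩp ɩp ‰ dɩ duw diy duwt
print(transliterate("boh ‰ bə Œ boh"))            # bɔ ‰ bə Œ bɔ
```

Consonants, rests, and bar lines pass through unchanged, so only the lax vowels and the T-S post-vocalic /h/ sequences are affected, in line with the major differences noted above.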
An inherent limitation of the Trager-Smith transcriptions that impacts on the
present analysis derives from the fact that the T-S system does not include glottal stop,
since [ʔ] is non-phonemic in standard English. As a consequence, there is systematic
ambiguity in the onsetless status of syllables that are transcribed by Bauer as vowel-initial.
There are three different types of contexts where this observation is relevant.
First, consider the post-rest “ə” in what is transcribed as / ‰ ə duw .../ in bar 6
of Betty Carter’s Thou Swell (§2.3.1). In the context of the present evaluation of the
relative markedness of scat syllable structure, the question is whether this syllable is
truly onsetless, i.e. simply [ə], which would be a marked syllable structure, or whether
there is a sub-phonemic epenthetic glottal stop functioning as an onset and creating
an unmarked structure, i.e. [ʔə]. My auditory assessment of the phonetic realization of
these contexts basically accords with Bauer’s phonemic transcriptions: generally Carter’s
mellifluous voice transitions very smoothly into a vocalic realization both phrase-initially
and in phrase-internal vowel-vowel sequences, as in [... lyə a əm ...] in bar 7. Despite the
musical appropriateness of these seamless transitions to different vowel targets, within
the linguistic analysis such syllables are tallied as onsetless, and hence “marked”.
The second context is exemplified in the last two syllables of bar 6 of §2.3.1 Thou
Swell, where Bauer’s transcription /duw uw/ implies the second syllable /uw/ has no
onset. Here there are two other potential interpretations: (i) a glottal onset, or (ii) a transsyllabic perseveration from the preceding glide into an onset role. The retranscription
[du wuw], adopted here, interprets the intervocalic glide as an onset. Alternatively the
[w] closure may be considered ambisyllabic. Either way, such cases are not onsetless, and
hence not categorized as “marked” in the present analysis.
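The retranscription of /duw uw/ as [du wuw] can be stated as a simple rule over adjacent syllables. The sketch below is a hypothetical illustration: the function name, the vowel and glide sets, and the list-of-strings representation are assumptions made here. It implements only the onset interpretation adopted above, not the ambisyllabic alternative.

```python
VOWELS = "aeiouæəɩɛɔʋ"  # assumed set of vowel symbols
GLIDES = "wy"

def glide_as_onset(syllables):
    """Reassign a syllable-final glide to a following vowel-initial syllable."""
    out = list(syllables)
    for i in range(len(out) - 1):
        prev, nxt = out[i], out[i + 1]
        if len(prev) > 1 and prev[-1] in GLIDES and nxt[0] in VOWELS:
            out[i] = prev[:-1]            # the glide leaves the first syllable
            out[i + 1] = prev[-1] + nxt   # and becomes the onset of the next
    return out

print(glide_as_onset(["duw", "uw"]))  # ['du', 'wuw']
```

Syllables that already begin with a consonant are left untouched, so a sequence such as /diy də/ is returned unchanged.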
A third and similar type of case where there plausibly is dual functionality of
an intervening consonant is the situation where a ‘syllabic’ nasal is preceded by a syllable
headed by a lax vowel and closed by a stop that is tautosyllabic to the following nasal,
e.g. the sequences [dɛt ṇ] and [dɩt ṇ] in line 7 of Chet Baker’s Everything Happens to Me
(§2.2.2). At a very surface level, such [t]s are arguably ambisyllabic, functioning as both
coda to the preceding syllable and onset to the following one. Consequently, the syllabic
resonant in such cases is not classified in the present analysis as onsetless.
Space limitations here unfortunately preclude inclusion of the corresponding
musical transcription for the full repertoire of scat renditions analyzed here (but see
citations to musical notation by Bauer (2002a, b) and Bastian and Alexander (1995)).
However, because there are, as one might expect, certain phonological correlations in
positions of prosodic prominence, both bar divisions and rests are encoded in some
(but not all) of the transcriptions here. Bar divisions are represented by | . Rests are in
standard musical notation: sixteenth note ≈, eighth note ‰, quarter note Œ, and half note Ó.
A hold, where a syllable is held across a bar line, is indicated by a dash on both sides of
the bar, e.g. [... dip–|– bə ...].
The reality, of course, is that a wealth of auditory information that springs from
this improvisational conjunction of creative and physical forces—the finessed range of
articulatory movement, the rich and highly individualistic molding of acoustic shapes
and tonalities—is not captured in conventional phonetic transcription. For the present
purposes, however, the notation adopted in Appendix 2 provides considerable insight
into the linguistic issues under investigation.
Appendix 2: Scat Transcriptions
2.1.1. Armstrong, Louis. Heebie Jeebies scat solo (February 26, 1926)
Okeh Records 8300. Transcription by W.R. Bauer (2002b: 308).
Transliteration (2nd line) follows the principles outlined in Appendix 1.
[Note: see Bauer for a full musical transcription of the melodic line.]
WRB: eh | iyf Œ gæf Œ | əmf ‰ diy bə ‰ | diy də la bam Œ | rip ip ‰ di duw diy duwt |
eə | iyf Œ gæf Œ | əmf ‰ diy bə ‰ | diy də la bam Œ | rɩp ɩp ‰ dɩ duw diy duwt |
WRB: duw Œ duw diy duw də | ‰ diy də də dow diy | dow di dow duw ‰ duw– |
duw Œ duw diy duw də | ‰ diy də də dow diy | dow dɩ dow duw ‰ duw– |
WRB: –bə duw biy dey də | skiyp Œ skæm Œ | ski bəp ‰ diy də di də |
–bə duw biy dey də | skiyp Œ skæm Œ | skɩ bəp ‰ diy də dɩ də |
WRB: dip dæw diy ‰ dip– | –duw də dæw də ...
dɩp dæw diy ‰ dɩp– | –duw də dæw də ...
2.1.2. Armstrong, Louis. “Hotter Than That” scat solo (December 13, 1927).
Hotter than That, Track 1, 1:18. Okeh Records 8535.
Transcription BK by Barry Kernfeld (1995: 168).
Transcription PAS and time markers by Patricia A. Shaw.
Transcription WB/R by Bauer (2002b: 308), citing Reeves (2001): bars 49–54.
[1:19]
BK1: Dip deh doop da, doe doe doe doe.
PAS: dɩp diː duːp də da do do do
BK2a: Dah dew dah doot doot dew, da dee dee doot,
PAS: da daw da duːt duːt duː də diː diː duːt
BK2b: daw bee do bee dup baw lahp baw.
PAS: da biː duː biː duːp ba la(p-) baw
[1:26]
BK3a: Bah bee boop, buh dee bee doop bee,
PAS: ba biː buːp bə diː biː duːp biː
BK3b: hew law bah de bohm, bah bah bah bough
PAS: hə lo ba di bom ba ba ba baw
[1:30]
BK4a: Wah bee bah bee bee, low bah dah-oh-ah,
PAS: wa biː ba biː biː loː ba də o wa
BK4b: lah dah bee bop bah deep bah feh.
PAS: la da biː bap bə diːp ba bɛ̃
[1:35]
BK5: Dah to dit dit dew dup, dee duh doe.
PAS: da tu dɩt dɩt duː dəp diː də doːl
BK6a: Rip dee duh duh dew dah daw-ee-ya doe doe dip dip,
PAS: rɩp diː də də duː da do wiː yo də dɩt dɩp
[1:40]
BK6b: baw buh bah bah baw beep bah beep baw bah baw bah bah beep bah beep
WRB: | boh ‰ bə Œ boh | ‰ ba Œ bə ‰ biy | Œ bə ‰ biy Œ | bow ‰ bə Œ bow | ‰ bə Œ ba ‰ biy | Œ ba ‰ biy Œ |
| bɔ ‰ bə Œ bɔ | ‰ ba Œ bə ‰ biy | Œ bə ‰ biy Œ | bow ‰ bə Œ bow | ‰ bə Œ ba ‰ biy | Œ ba ‰ biy Œ |
PAS: bõ ba bã bõ bãw biːp bãw biːp bõ ba bõ bõ bã biːp bã biːp
BK6c: thiz dit duh duh.
PAS: di dɩt də də
[1:49]
BK7: Reap dew dit done, dah nah naw naw deep dah dee, dah done dah dew.
PAS: rip diw duːt dən da də na nə diːp də diː da dan də dɛl
[1:52]
BK8: Bah bah dah beep bew.
PAS: ba bə da biːp biw
BK9: Bah bee dut zuh bow.
PAS: ba bi dəp da bõ
[1:56]
BK10: Wah-oh dove dew, duh boop bee dew the boop, wah-oo-lough.
PAS: wæ̃uw dag duː də buːp biː duː diː buːm wæuː læw
[guitar]
[2:02]
BK11: Zwee boo bee dew um-wow dah-dah-wow.
PAS: zwiː buː bi duː ṃ mwaw də də waw
[guitar]
[2:06]
BK12: Oooooo dah-dum-wah um-mough hmaf hwow.
PAS: ũː ː da dəm wæ̃w ṃ mæ̃w ṃ waw
[guitar ... ]
[2:11]
[2:16]
BK13: Reap deh diddle dee tih duh, boo wuh buh bow.
PAS: rɩp di du dəl di bə dæp bwa bæ bo
2.2.1. Baker, Chet. Everything Happens to Me. (1955)
Verve Jazz Masters 32, Verve CD 314 516 939–2.
Track = 3.31 minutes, Scat bridge = [2:27–2:58].
Transcription JB by Jim Bastian (Bastian and Alexander 1995: 14).
Transcription PAS and time markers by Patricia A. Shaw.
[2:27]
JB: | Œ ‰ Det ’n deh dah ’n duh– | – dit dah dah dah– Œ
PAS: | Œ ‰ dɛt ṇ dɛ dʋt ṇ dɛː– | – dɩt dɛ dɛ ðɛː– Œ
[2:34]
JB: | ‰ ah det ’n deh deh deh– | – det dee yah dah– ‰
PAS: | ‰ hɛː– dɛt ṇ dɛ də ðɛː– | – dət diː yə dɛː– ‰
[2:42]
JB: ‰ yah dah deh | deh ‰
PAS: ‰ yʋ dən dæ | dɛː ‰
[2:44]
JB: ‰ dah ee dah dut ’n dah dee– dah | dah– yah dah deh deh– ‰ ‰
PAS: ‰ də iː dɛ dʋt ṇ də deː– də | dəː– yə də ðɛ ðɛ– ‰ ‰
[2:50]
JB: bah | deh– deh deh deh dah dah dah det ’n deh dah dah dah– |– det ’n deh Œ
PAS: bə | ðɛː– ðɩ dɛ də ðɛ ðɛ ðɛ ðʋt ṇ də də ðɛ dɛː– |– ðɛt ṇ ðɛə Œ
2.2.2. Baker, Chet. Everything Happens to Me. (1989)
Chet Baker Sings and Plays from the Film ‘Let’s Get Lost’, Novus CD3054-2-N.
Track = 5:15 minutes, Scat bridge = [3:47–4:21].
Transcription JB by Jim Bastian (Bastian and Alexander 1995: 15).
Transcription PAS and time markers by Patricia A. Shaw.
[3:47]
JB: ‰ ≈ Hoo day dut ’n dah dah deh– deh deh deh | deh– ‰ Ó
PAS: ‰ ≈ huː deː dʋt ṇ de yə deyː– də də də | ðə– ‰ Ó
[3:57]
JB: Œ ‰ ≈ yah deh deh dah deh– | – Œ
PAS: Œ ‰ ≈ dʋː də deːy – dʋ̃ diː– | – Œ
[4:06]
JB: Œ dah deh dah deh deh deh deh deh deh deh deh dee– | –
PAS: Œ dəw dɛ dʋ dɛ dʋ də duː dɛw duː dʋ də diː – | –
[4:10]
JB: – dee dee dee dee dee dee dee dee dee dee deh det ≈ deh– | –
PAS: – diː duː dɩː de du dɛ dʋ duːː dʋ dɛt ≈ dɛː – | –
[4:15]
JB: ‰ ‰ ≈ yeh deh det ’n deh dit ’n deh dee ee– | – day doo ee doo– Œ
PAS: ‰ ‰ ≈ dɛ dɛ dɛt ṇ dɛ dɩt ṇ dɛ du–iy– | – dɛ duː ʋː– ʋː Œ
2.3.1. Carter, Betty (née Lillie May Jones) 1929–1998. Thou Swell. (1955)
Meet Betty Carter and Ray Bryant. Columbia/Legacy CK 64936, 6. (from A
Connecticut Yankee. 1927. Rodgers and Hart, WB Music Corp.)
Transcription by W. R. Bauer (2002a: 251; includes musical transcription).
Transliteration (2nd line) follows the principles outlined in Appendix 1.
(nasal)
WB: ‰ hə lə dow di dl ə yə də | hiy– də də ba yow bow || ba ba duw bə ba ‰ biy |
‰ hə lə dow dɩ dḷ ə yə də | hiy– də də ba yow bow || ba ba duw bə ba ‰ biy |
WB: – ə di dn di də yow bow | ba ba duw bə ba ba duw bə |
– ə dɩ dṇ dɩ də yow bow | ba ba duw bə ba ba duw bə |
WB: uw də ‰ ə duw lyə duw uw – | – uw ə duw lyə a əm | bey dow– ‰ ə |
uw də ‰ ə duw lyə du wuw – | – u wə duw lyə a əm | bey dow– ‰ ə |
WB: la əm biy bə duw bə duw wiy – | – da ə yuw duw bə | du du di – duw bə |
la əm biy bə duw bə duw wiy – | – da ə yuw duw bə | du du dɩ – duw bə |
WB: di dəl də i– duw əm | ba bə də šiy – ra | ‰ ə la əm biy bə duw bə |
dɩ dəl də i– du wəm | ba bə də šiy – ra | ‰ ə la əm biy bə duw bə |
WB: iy bə duw bə ‰ ə ba bə | di dliy ə duw bæw– Œ | ‰ hey ba bə spi də lə di də lə |
iy bə duw bə ‰ ə ba bə | dɩ dli yə duw bæw– Œ | ‰ hey ba bə spɩ də lə dɩ də lə |
WB: ba bə di də lə bey (ay || fiyl ) Œ Œ ||
ba bə dɩ də lə bey ‘I feel’
2.3.2. Carter, Betty. Open the Door (1979)
Words and music by Betty Carter, 1964. MyKag Publ. Co.
The Audience with Betty Carter. Bet-Car MK 1003; reissue Verve 835 684–1.
Transcription by W.R. Bauer (2002a: 294–303; includes musical transcription).
Transliteration (2nd line) follows the principles outlined in Appendix 1.
WB: Œ ṃ– | – də | dow | ... | ... |
WB: Œ Œ ṇ duw– duw– | duw |
WB: Ó Œ ‰ hṇ | duw– duw– | – diy | duw | ... | ... |
WB: Œ ‰ duw iy uw iy uw | duw – | – Ó | ... | ... |
Œ ‰ du wi yu wi yuw | duw – | – Ó | ... | ... |
WB: Œ le də dey dey– | – dey dey– dey dey | dey– | – Œ Ó | ...
Œ lɛ də dey dey– | – dey dey– dey dey | dey– | – Œ Ó | ...
3.1. Barry Mann. “Who Put the Bomp?” (1961)
Words and Music by Barry Mann and Gerry Goffin. ABC-Paramount 45 NK-10237.
Lyrics from The Official Barry Mann and Cynthia Weil Website:
http://www.spectropop.com/hmannandweil.html.
Transcription by Patricia A. Shaw.
Due to space limitations, only lines that have scat are included below, and only scat
syllables are transcribed. For reference, lines are numbered at the left.
1.  Who put the bomp                     bamp
2.  In the bomp bah bomp bah bomp?       baːm bə baːm bə bam
3.  Who put the ram                      ræm
4.  In the rama lama ding dong?          ræ mæ læ mæ dɩŋ dãŋ
5.  Who put the bop                      bap
6.  In the bop shoo bop shoo bop?        bap šə bap šə bap
7.  Who put the dip                      dɩp
8.  In the dip da dip da dip?            dɩp dɩ dɩp dɩ dɩp
...
9.  Boogity boogity boogity              bu gɩ di bu gɩ di bu gɩ di
10. Boogity boogity boogity shoo         bu gɩ di bu gɩ di bu gɩ di šu
...
3.2. Sha Na Na “Mr. Bass Man”
Words and Music by Johnny Cymbal (1963). Original release: Kapp 503.
Re-recorded by Sha Na Na, Rama Lama Ding Dong (1980).
Transcription by Patricia A. Shaw.
[Note: The lyrical lines are marked for BM “Mr. Bass Man”, W “I wanna be...”,
and Back for the back-up singers’ line. English is in italics and in standard
orthography. Only scat syllables are transcribed.]
For reference, scat lines are numbered at the left.
BM: 1. baw bə bə baw bə baw bə baw baw
2. bə bə baw bə bə baw bə baw bə baw bəm bəm
W:
Mr. Bass Man, you’ve got that certain something...
Mr. Bass Man, you set that music thumpin’
To you it’s easy
when you go 1-2-3
3. bə bʋ bə bam ba?
BM:
W:
[0:33]
BM:
W:
Yeah! Mr. Bass Man, you’re on all the songs
5. sə dɩ də bɩ bə bum bum
6. And the dɩ dɩt ba ba bæw
Hey Mr. Bass Man, you’re the hidden King of Rock ‘n’ Roll
7. bə bə bə ba bæw??
No, no!
8. bə bə ba bə bə bu ba bə ba ba bæw
Oh, it don’t mean a thing when the leader’s singin’
Or when he goes
9. ʔay yay yay yay yæ yæ
Hey Mr. Bass Man, I’m askin’ just one thing
Will you teach me, mmm, yeah, the way you sing
‘Cause Mr. Bass Man, I wanna be a bass man too
10. bɩ bʋ bə ba bɑ?
[0:41]
BM:
[1:00]
You mean:
4. bʋ bə baw bʋ bə bʋ baw bə baw baw ba ?
W:
BM:
Try this:
11. bʋ bə bæ bu bu bə ba bə bæw
Oh Mr Bass Man, I really think I’m winnin’
12. With the bɩp bum bõ
13. And a dɩ dɩt dɩ dæ dæ
Oh Mr. Bass Man, now I’m a bass man too
14. dɩ də dɩ dæ
That’s it!
15. bu bə bæ bu bə bu ba bə ba
Back: 16. bəm bəm bəm bə bə bʋm bʋm bə bəm bə bə bə bʋm bʋm ba bə bʋm
BM: Now you!
[1:18]
W: 17. də dɩ dṇ dɩ bə bum bum bo bəm bə bə bə bə bom bʋm
BM: With me
BM/W: 18. bʋm bə bə bɔm bə bə bɔm ba bum ba bəm
19. bʋm bə bə bəm bə bə bəm _ bʋm bəm
20. bəm bə bə bum bə bə bəm bə bə bəm bə bə bʋm bəm
21. bʋm bə bəm bə bə bu bə bu bʋ bə bəm
[1:30]
[1:35]
W: Oh, it don’t mean a thing oh when the leader’s singin’
Or when he goes
21. ʔay yay yay yay yæ ya
Hey Mr. Bass Man, I’m askin’ just one thing
Will you teach me, mmm, yeah, the way you sing
‘Cause Mr. Bass Man, I wanna be a bass man too
22. ba bɑ
BM:
W:
BM:
[2:08]
That’s it.
27. bu bə bæ bu bə bu ba bə bæw
28. bəm bəm bəm bə bə bə bʋm bʋm bʋm bə bə bə bə bə bʋm bʋm bə bə bʋm
Now you!
W: 29. bə bə bom bə bə bom bə bum
BM:
[2:16]
Oh Mr. Bass Man, I really think I’m winnin’
With the
24. bɩp bum bõ
And a
25. dɩ dɩt dɩ dæ da(w)
Oh Mr. Bass Man, now I’m a bass man too
26. dɩ də dɩ dɩ də dæw
[2:05]
[2:12]
Try this:
23. bʋ bə bæ bu bu bə ba bʋ bæw
Soundin’ good ...
With me
BM/W: 30. bʋm bə bə bʋm bə bə bʋm ba bʋm ba bʋm
31. bʋm bə bə bəm bə bə bəm _ bə bəm
32. bəm bəm bə bə bəm bʋm bə bə bəm bə bə bəm bə bə ba bṃ
33. bə bəm bə bə bə bə bʋm bə bəm
34. _ bum bum bə bʋm bə bə bə bə bʋm
35. bə bə bə bə bʋm bʋm bə bʋm bə bə ...