FINDING THE BLUES
AN INVESTIGATION INTO THE ORIGINS AND EVOLUTION OF AFRICAN
AMERICAN MUSIC
by GEORGE BUSBY
A thesis submitted in partial fulfilment of the requirements for the degree of Master
of Research of the University of London and the Diploma of Imperial College
March 30th 2006
ABSTRACT
Genetic data have been used to study human population history. Increasingly, however, scientists
are investigating non-organic sources of information on culture, such as language, and combining
these studies with genetic, geographic and archaeological data in order to study the early migration
and evolution of man.
This study uses a novel dataset on human song style to investigate relationships between
human populations. I studied character profiles on the style of songs from 7 broad geographic
regions, with the aim of identifying how African American music relates to the music of the cultures
thought to influence it.
A global dataset of over 5000 songs, a result of the Cantometrics study of the 1960s, was
used as the source for the song profiles. I used a sample of 1488 songs from North America, Europe
and Africa.
Each song profile contains information on each of 44 variables, each solely concerned with
a different aspect of song style. I produced distance matrices for pair-wise comparisons of these
profiles. I then ran these distance matrices through successive levels of cluster analysis, from 2 to 7
clusters. A priori defined information on the geographic and cultural origins of the songs was then
re-introduced to the clustered songs and any structure observed identified.
I looked for correlations between the songs of each of the different cultural regions in each
of the clusters. The analysis revealed structure in the dataset that could be explained by cultural
information: in general, most songs from different cultural regions fell into different clusters. Some
songs from different cultural regions fell into the same clusters: they showed stylistic relationships.
These relationships could be explained by historical contact and migration between the cultural
regions in question.
Surprisingly, African American song style appears to have diverged greatly from the style of
West Africa, the area of Africa where the founders of the African American population originated.
Comparison with African American music from the Caribbean showed not only that Caribbean songs
are more closely stylistically related to African songs than African American songs are, but also that
African American and Caribbean song styles have evolved in different directions.
TABLE OF CONTENTS
ABSTRACT
1. INTRODUCTION
1.1. The study of culture
1.1.1. Measures of culture
1.1.2. Using song as a measure of culture
1.2. The phylogenetic study of culture
1.2.1. The Cantometrics experiment
1.2.2. Analysing African American Music
1.3. Aim
2. MATERIALS AND METHODS
2.1. Materials
2.1.1. The song sample
2.1.2. Correcting the dataset
2.1.3. Balancing the sample
2.2. Methods
2.2.1. Initial investigation of the data
2.2.2. Production of a distance matrix
2.2.3. The JS clustering algorithm
3. RESULTS
3.1. Songs from different regions differ systematically in style
3.2. Songs from a given region tend to resemble each other when all variables are considered simultaneously
3.3. Songs from different cultures show stylistic relationships
4. DISCUSSION
4.1. Song styles differ systematically among cultures
4.2. Song styles show historical relationships
4.3. African American songs and their origins
4.4. Further work
5. ACKNOWLEDGEMENTS
6. REFERENCES
7. APPENDICES
A1 Cantometrics
A2 The song sample
A3 Problems with the Cantometrics dataset
A4 The distance matrix program
A5 The JS clustering algorithm
A6 Graphs showing variation across each of the 44 variables for all cultural regions
1. INTRODUCTION
Genetic data have been used to study the evolution and relatedness of human populations.
Techniques used in the study of human population genetic variation and structure include:
comparisons of protein and blood group polymorphisms (Lewontin 1972), investigations using
microsatellite markers (Barbujani 1997; Rosenberg et al 2002), investigations using restriction
fragment length polymorphisms (Barbujani et al 1997) and studies using mitochondrial DNA (Cann
et al 1987; Jorde et al 2000). In general the results of these studies have shown that the average
proportion of genetic differences between individuals from different human populations only slightly
exceeds that between unrelated individuals. That is, the majority of variation between humans
occurs at the individual and not population level. There is little evidence for a large genetic basis for
differences between human populations (although see Edwards 2003).
Further phylogenetic analysis has produced evolutionary trees of human populations using
genetic data (Nei and Roychoudhury 1982; Cavalli-Sforza et al 1988; Nei and Takezaki 1996).
Phylogenetic analyses have also produced evolutionary trees of non-organic, cultural data, such as
language (Cavalli-Sforza 1994; Gray and Jordan 2000; Holden 2002; Mace and Holden 2005) and
even Turkmen carpet design (Tehrani and Collard 2002). Cavalli-Sforza (1994) combined genetic,
archaeological and linguistic data in the largest cross-discipline investigation of human origins to
date. Use of complementary data in this way has proved highly informative to those investigating
human origins (Ayub et al 2003). Recently genetic data have also been used to investigate theories
of human migration (Oppenheimer 2003).
Given that the definition of human populations is genetically unclear (Excoffier 2003), it
seems necessary to search different avenues of inquiry if one aims to investigate what it is that
produces differences between populations and in what way these should be classified. One
possibility is to study population differences dependent on linguistic, cultural and geographic
inferences (Pritchard et al 2000). It might then be possible for classification of human populations to
be made from quantitative study of shared cultural traits as well as from genetic congruence.
Therefore, investigation of differences between human populations can be conducted by
investigating differences in their non-genetic inheritance: their culture.
1.1. The study of culture
Humans are not the only animals to have culture. Chimpanzees have been observed learning from
each other how to use tools to pull termites from mounds (Goodall 1964; see also Whiten et al
2005), and desert elephants follow ancient routes to quarry caves, rich in kaolin, in order to acquire
rare salts. These are examples of behaviours ostensibly acquired via learning rather than through
the species’ genetic inheritance, and so can be called culture. What differs with man’s inheritance of
culture is the scale of it, especially with regard to advanced communication (Pinker 1994;
Szathmary and Maynard Smith 1995). Every human is brought up and taught how to live.
Differences in the quality of this teaching will profoundly affect the likelihood of a human’s survival.
Culture is therefore important in man’s history (Heinrich and McElreath 2003). As man becomes
more aware of his own physical evolutionary history, it is of interest to try to use and combine
information on language and culture to help piece together the human past.
1.1.1. Measures of culture
The present study will use human song as a measure of culture. Mace and Holden (2005:116)
define culture “broadly as behavioural traditions that are transmitted by social learning”. This
definition is attractive as it describes culture as a learned behaviour. Importantly it also describes
culture in terms of tradition, so implicit in this notion is the idea of the behaviour lasting over time.
Under this definition song can be described as a part of culture: it is a stable, behavioural tradition
(Frisbie 1971; Merriam 1964) transmitted through social learning.
It is therefore conceivable that song is amenable to evolutionary study in a way similar to
language. Gray and Jordan (2000) used phylogenetic analyses, developed by evolutionary
biologists, to study the migration of human populations in Austronesia. By constructing a tree of the
languages across the Melanesian archipelago, and testing it against hypotheses of human
migration, they were able to add evidence to the so-called ‘express-train hypothesis’ of human
movement in the Pacific. In other words, their analysis of language evolution supports attempts to
explain human movement. Holden (2002) showed that study of Bantu language evolution can be
linked directly to hypotheses on the genetic evolution of the same populations (see Holden and
Mace 2005 for review).
Culture and its music differ markedly over the world. If it is possible to measure the
differences in music and then to quantify them, then it is also possible to draw inferences about the
relationships between cultures based on these differences and so investigate the ascent of man
both biologically and culturally.
1.1.2. Using song as a measure of culture
In order to study the evolution of song with a biological perspective, it is necessary to argue that
songs can act, at least to some extent, in a Darwinian fashion.
Biological organisms replicate, and this is key to the process of evolution. In The Extended
Phenotype, Dawkins (1982) described three fundamental properties of a replicating entity: longevity,
fecundity and fidelity. If culture is to be analysed under an evolutionary framework then these three
characteristics have to be considered. Song can be long-lived; it is passed down over time. It can
also be fecund; certain song styles are said to be fore-runners to other styles. In an example from
European classical music history, Bach’s baroque music was a precursor to the classical style of
Mozart, which was in turn ancestral to Berlioz’s romantic music. Fidelity, however, is a problem.
Songs are not always faithfully reproduced over time. It has also been argued that processes such
as blending of cultural ideas block their ability to evolve (Boyer 1999). Culture changes due to
influences from different sources at a speed very much faster than genetic evolution. However,
there is selection on culture. Ideas and traditions that are favoured remain embedded in the lives of
human populations and those that do not carry the same importance are lost to time. Over time,
therefore, those cultural traditions that have survived have done so because they are supported by the
populations in which they arise.
1.2. The phylogenetic study of song
The phylogenetic study of song requires song analysis in a way that can produce readily
comparable data. A profile of numbers for a song (which refers to its characteristics) needs to be
produced so that its similarities to other songs can be quantitatively investigated. This is analogous
to the production of genetic profiles from DNA for investigating evolutionary hypotheses. Fortunately
data of this type are already available from a world sample of folk and tribal music collated for the
Cantometrics experiment of the 1960s.
Cantometrics is a heuristic method of analysing song, or more precisely, song performance
style, developed by Alan Lomax and others at Columbia University (Lomax 1968). This
classification also attempted to link song style and culture. The song style profiles from this
endeavour will provide the raw data for the analysis in this paper.
1.2.1. The Cantometrics experiment
Cantometrics aimed to analyse world music on a detailed scale of 37 different musical traits. These
include song style characteristics such as tempo, orchestration and relationship of the vocal group.
Songs collected over a 60-year period from all over the world were measured and rated on this
scale, which produces a unique profile of 37 numbers for each song. Lomax et al (1968) used factor
analysis to investigate the global distribution of song style. Through analysis of both musical and
cultural data they found that the world’s music could be divided into stylistic areas that matched
geographical and historical relationships between human populations. It was a large, ambitious
project and a full description is beyond the scope of this paper. The interested reader can find a
detailed explanation of Cantometrics in the supplementary material attached (appendix A1).
1.2.2. Analysing African American Music
The music of African Americans covers a diverse range of styles from jazz and blues to Negro
spirituals. It is clear that the origin of African American music, and in particular the blues, was
influenced by the music brought over with West African slaves during the 16th and 17th centuries
(Kubik 1999). But where exactly this influence came from in the expanse of West Africa is less
certain. Music played in the Sudanic belt of Africa is characterised by a predominance of pentatonic
tuning patterns and of wavy intonation or melisma, and by the absence of asymmetric timeline
patterns and of polyrhythm (Kubik 1999:63). This differs from the style of the Guinea coast with its
percussive rhythms and asymmetric timeline patterns (Kubik 1999:51). This is of interest because
the Guinea coast was a large area from which slaves were recruited (Kubik 1999). It is also of
interest because these are characteristics of song style that were explicitly measured in the
Cantometrics project.
Influences on this area of Africa are deeply rooted in Islamic traditions (Kubik 1999).
Indeed, some report that over 30% of the slaves taken to North America were Muslim (Diouf 2004).
Songs from Muslim North Africa and the Middle East might also hold clues as to the origin of the
distinct African American style.
This musical genre was also affected by the songs of colonial Europeans in North America:
the music of the slave and plantation owners of present day North America.
1.3. Aim
In this paper I will reassess an African American sample of songs from the Cantometrics project in
order to identify what structure in the dataset, if any, can be attributed to shared geographic and
cultural history. This will be achieved through computational cluster analysis using an algorithm
designed to group song styles according to their similarities. Songs with similar profiles are likely to
have a more recent common cultural ancestor than those that are dissimilar. It is therefore possible
to use this analysis to make inferences about the cultural evolutionary history of the human
populations under study.
Profiles of songs from Europe, European overseas settlements in North America, the
Caribbean, Sudanic Africa, North Africa, the Middle East and American Indian cultures will be
combined in the analysis with African American songs in order to address the following questions.
1. Is there variation across the different character states in the a priori defined cultural groups
for each of the Cantometric variables?
2. Does cluster analysis of songs in the absence of predefined information about geographic
origin produce groups of songs that corroborate what we know about the history of the a
priori defined groups?
3. Does cluster analysis reveal the cultural origins of African American music?
2. MATERIALS AND METHODS
2.1. Materials
2.1.1. The song sample
The data used in the analysis come from the Cantometrics Project. The full dataset contains some
5000 song profiles from over 400 different cultures from around the world and represents over five
years’ worth of musical analysis. In the present study a sample of these song profiles is used in
order to concentrate on African-American music and those songs that might have influenced its
origins. I used 1488 songs representing 35% of the total available. The sample contained all song
profiles from seven broad geographic areas, which are listed in table 2.1. (A full description of the
sample can be found in appendix A2.)
Cultural Region                     Code   Number of Songs
African America                     AAF                 44
African Caribbean                   CAF                357
American Indian                     AI                 326
West Africa                         AF                 319
North Africa and the Middle East    NAF                251
Western European                    WEU                105
European Overseas                   OEU                 86
TOTAL                                                 1488
Table 2.1. The Song Sample
African American songs were included as they are the target group of interest. Caribbean songs were
included to test the similarities of another group of African migrants that has experienced different
cultural influences. American Indian songs were included for comparison of African American songs
with the indigenous music of North America. West African songs were included because they come
from the countries of origin of the slaves who made up the original black populations of North
America. An Islamic influence on the Blues, through North Africa and the Middle Eastern slave trade,
has been postulated (Kubik 1999), so it was of interest to investigate the relationship of songs from
these regions with African American music. European songs, both of the North American colonists
and of the countries where they originated, were also included.
All of the songs included in the original Cantometric analysis are thought to be
traditional, having been produced and performed prior to the. These songs are therefore the most
representative of each cultural region’s original type.
2.1.2. Correcting the dataset
Before the songs could be analysed with the clustering algorithm there were nine problems which
had to be corrected. These problems ranged from non-independence between variables to
ambiguous codings. For example, eight variables were concerned with the orchestration of a song.
In the Cantometrics study all songs where there is no orchestration are given a score of 1. This
leads to spurious statistical conclusions when similarity scores are calculated because all songs
without orchestration are perceived similar purely because they have no orchestration, rather than
because of any stylistic similarities. These problems and the remedial action taken are discussed in
detail in appendix A3.
2.1.3. Balancing the sample
The number of songs from each cultural region varied from 44 to 357. Before the analysis could
proceed it was necessary to take a sub-sample of songs from the larger sample with equal numbers
of songs from each cultural region. 44 songs were randomly sampled from each region except for
the African American songs, all of which were used in each sub-sample. In total 50 sub-samples,
each of 308 songs, were produced and analysed separately.
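The balancing step can be illustrated with a minimal Python sketch. It is not the program used in this study: the function name, the song identifiers and the fixed random seed are assumptions made for illustration, and only the region sizes are taken from table 2.1.

```python
import random
from collections import defaultdict

def balanced_subsamples(songs, per_region=44, n_subsamples=50, seed=0):
    """Draw equal-sized random samples from every cultural region.

    `songs` is a list of (song_id, region_code) pairs. A region that holds
    exactly `per_region` songs (AAF here) contributes all of its songs to
    every sub-sample, because sampling 44 from 44 without replacement
    returns the whole set.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    by_region = defaultdict(list)
    for song_id, region in songs:
        by_region[region].append(song_id)

    subsamples = []
    for _ in range(n_subsamples):
        chosen = []
        for region in sorted(by_region):
            chosen.extend(rng.sample(by_region[region], per_region))
        subsamples.append(chosen)
    return subsamples

# Region sizes from table 2.1; the song identifiers are made up.
REGION_SIZES = {"AAF": 44, "CAF": 357, "AI": 326, "AF": 319,
                "NAF": 251, "WEU": 105, "OEU": 86}
songs = [(f"{code}-{i}", code)
         for code, n in REGION_SIZES.items() for i in range(n)]
subsamples = balanced_subsamples(songs)
```

Each element of `subsamples` then holds 7 x 44 = 308 song identifiers, matching the sub-sample size stated above.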
2.2. Methods
2.2.1. Initial investigation of the data
For the full sample (1488 songs) tables were produced giving the number of songs in each state of
each variable. Graphs of the frequency of songs in each state were produced for each of the 44
variables. The tables were tested to see whether the numbers of songs from each culture within each
state were significantly different.
2.2.2. Production of a distance matrix
A distance matrix was produced for each of the 50 sub-samples. This involved producing a pairwise
similarity score for each pair of songs. The similarity score was based on comparisons between each
of the 44 variables that make up the song profiles. The distance matrix program is described in
detail in appendix A4.
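As an illustration of the general idea only, the following Python sketch builds a distance matrix from categorical song profiles. The scoring rule used here (the proportion of variables on which two profiles disagree) is an assumption; the score actually used in this study is the one defined in appendix A4, and the toy profiles below are invented and have 4 variables rather than 44.

```python
def profile_distance(a, b):
    """Proportion of variables on which two song profiles disagree (0.0 to 1.0).

    This simple matching distance is a stand-in, not the A4 scoring rule.
    """
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of variables")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def distance_matrix(profiles):
    """Symmetric matrix of pairwise distances with a zero diagonal."""
    n = len(profiles)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = profile_distance(profiles[i], profiles[j])
    return dist

# Three invented profiles, each scored on four variables.
toy_profiles = [(1, 3, 2, 5), (1, 3, 4, 5), (2, 1, 4, 5)]
toy_matrix = distance_matrix(toy_profiles)
```

Treating every variable as an unordered category is the simplest choice; a rule that respects ordered states or multiply coded variables would need a different `profile_distance`.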
2.2.3. The JS clustering algorithm
The distance matrix that was produced for the sample of songs was run through the JS clustering
program which produces K clusters (K being specified in advance). The criterion for the final
clustering solution was to minimise the total within group variance of each of the K clusters. The
exact workings of the algorithm are detailed in appendix A5.
Multiple runs of the clustering algorithm were performed on the 50 distance matrices, for all values
of K between 2 and 7.
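The clustering criterion can be sketched in Python as follows. This is not the JS algorithm, whose workings are given in appendix A5: the sketch swaps in a plain relocation heuristic that moves one song at a time whenever the move lowers the total within-group variance. The variance is computed from pairwise distances (sum of squared distances over all pairs in a cluster, divided by the cluster size), which equals the within-cluster sum of squares only when the distances are Euclidean; that equivalence, the function names and the toy data are all assumptions.

```python
import random

def within_group_variance(dist, labels, k):
    """Total within-cluster dispersion computed from a distance matrix."""
    total = 0.0
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if len(members) < 2:
            continue  # empty and single-song clusters contribute nothing
        pair_sum = sum(dist[i][j] ** 2
                       for a, i in enumerate(members)
                       for j in members[a + 1:])
        total += pair_sum / len(members)
    return total

def relocate_clustering(dist, k, seed=0, max_sweeps=100):
    """Greedy relocation: move single songs while the criterion improves."""
    rng = random.Random(seed)
    n = len(dist)
    labels = [i % k for i in range(n)]  # start from near-equal cluster sizes
    rng.shuffle(labels)
    best = within_group_variance(dist, labels, k)
    for _ in range(max_sweeps):
        improved = False
        for i in range(n):
            keep = labels[i]
            if labels.count(keep) == 1:
                continue  # never empty a cluster
            for c in range(k):
                labels[i] = c
                score = within_group_variance(dist, labels, k)
                if score < best - 1e-12:
                    best, keep, improved = score, c, True
            labels[i] = keep  # settle on the best cluster found for song i
        if not improved:
            break
    return labels, best

def toy_distance(i, j):
    """Two tight groups of three songs (0-2 and 3-5), far from each other."""
    if i == j:
        return 0.0
    return 0.1 if (i < 3) == (j < 3) else 0.9

toy = [[toy_distance(i, j) for j in range(6)] for i in range(6)]
toy_labels, toy_score = relocate_clustering(toy, k=2)
```

A heuristic of this kind can stop at a local optimum on real data, which is one reason to repeat the clustering several times from different starting points.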
3. RESULTS
3.1. Songs from different regions differ systematically in style
The 1488 songs studied here belong to seven a priori defined “cultural regions”: African America
(AAF), American Indian (AI), West Africa (AF), African Caribbean (CAF), overseas European
(OEU), Western Europe (WEU) and North Africa and the Middle East (NAF). These regions are defined
by a mix of cultural, historical, and geographic criteria. Lomax (1968) collected data for 37 variables
on all these songs, which I have redefined as 44 variables with anywhere between 5 and 228
states each. Most of these variable states are higher-order combinations of single states, arising
where songs were scored for more than one character state in the original Cantometric analysis (see
appendix A3). In order to examine how the states of each variable were distributed across the a priori
defined groups, I began by constructing contingency tables for each variable. Table 3.1 below is an
example. I also plotted the frequency of each state for each variable in each cultural group (figures
1 to 44 in appendix A6). Analysis of the contingency tables showed that the states of each of the 44
variables were distributed unevenly across the 7 cultural groups (Chi-squared test for goodness of
fit; df = 6-42; p<0.001 in all cases; see appendix A6 for full details). In other words, for every
variable, songs from different regions differ from each other in style.
Table 3.1. An example contingency table for Variable 17 – Range. (The songs with ‘more than one state’
represent songs coded with more than one of the character states.)
17. Range                                                   CULTURAL REGION
State description                    State    AAF    AI    AF   CAF   OEU   WEU   NAF  TOTAL
no data                              0          3     0     5     1     4     1     7     21
a monotone to a major second         1          0     1     5     0     0     0     4     10
a minor third to a perfect fifth     2          4    57    55   105     3     2    84    310
a minor sixth to an octave           3         11   143   155   229    43    57    94    732
a minor ninth to a major fourteenth  4         20   117    90    16    30    45    47    365
two octaves or more                  5          6     7     9     6     6     0     9     43
more than one state                  6+         0     1     0     0     0     0     6      7
TOTAL                                          44   326   319   357    86   105   251   1488
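The test applied to each contingency table can be illustrated with a short Python sketch that computes the Pearson chi-squared statistic and its degrees of freedom for the cell counts of table 3.1. Whether this matches the exact procedure used for the results in appendix A6 is an assumption; the function name is hypothetical, and the p-value lookup is left out so that the sketch needs only the standard library.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one song")
    statistic = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Rows are the seven states of Variable 17 (Range); columns are the seven
# cultural regions, in the order of table 3.1. Marginal totals are omitted.
RANGE_TABLE = [
    [3, 0, 5, 1, 4, 1, 7],
    [0, 1, 5, 0, 0, 0, 4],
    [4, 57, 55, 105, 3, 2, 84],
    [11, 143, 155, 229, 43, 57, 94],
    [20, 117, 90, 16, 30, 45, 47],
    [6, 7, 9, 6, 6, 0, 9],
    [0, 1, 0, 0, 0, 0, 6],
]
range_statistic, range_df = chi_squared(RANGE_TABLE)
```

For this table the degrees of freedom are (7 - 1) x (7 - 1) = 36. Several expected counts in the sparse rows are small, so in practice the statistic for such a table should be read with some caution or the sparse states pooled.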
3.2. Songs from a given region tend to resemble each other when all variables are
considered simultaneously
The univariate analysis given above and in appendix A6 suggests that the Cantometric variables
capture at least some of the features that make the songs of one cultural region sound different
from another. However, songs, like genomes, do not differ simply in particular features or variables;
they sound different because of statistical associations between variables, what in genetic terms is
called linkage disequilibrium. To examine the degree to which the songs of a given cultural region
cluster together when all variables are considered simultaneously, I carried out a cluster analysis
using the JS algorithm in which a balanced sub-sample of the songs was clustered successively
into K groups where K was 2, 3, 4, 5, 6 or 7.
Tables 3.2-3.7 show the results for typical cluster runs in which the songs were divided into 2 to 7
groups. The uneven distribution of songs from a given cultural region among clusters suggests that
similar songs are grouping together. Recall that, although a total of 1488 songs were studied, each
run of the clustering algorithm involved only a sub-sample of 308 songs (44 from each of the 7 a
priori defined cultural groups). Each table represents the results from one run using one
sub-sample. For each value of K, I did this 50 times.
The analysis clusters the songs into K groups by assigning songs to clusters so as to minimise
the variance between song style profiles within each cluster (see methods). In this case, for K = 2,
the algorithm produced cluster 1 with the majority of AI and NAF songs and cluster 2 with a majority
of AF and CAF songs. The AAF, OEU and WEU songs were split quite evenly across the two clusters.
For K = 3 the algorithm produced three groups: cluster 1, predominantly AI and NAF; cluster 2,
predominantly AF and CAF; and a third, European cluster, which also contained the largest share of
the AAF songs. For K = 4, cluster 1 has AI and NAF songs, cluster 2 AF and CAF songs, and cluster 3
mainly OEU and WEU. AAF songs appear in cluster 4 along with almost a third of the OEU songs (13 of
44). We can see that as K increases, groups containing different song styles become more distinct.
At K = 5, more of the songs are segregated into individual groups. Cluster 1 is mainly (39/63) AI,
cluster 2 is the AF-CAF cluster, and cluster 3 is AAF, but note also that there are 12 OEU songs in
this group. Cluster 4 contains mostly WEU, but also 19 OEU, and cluster 5 is the NAF group. For K = 7
the algorithm produces clusters containing songs predominantly from just one or two of the a priori
defined cultural regions.
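The tabulation behind tables 3.2 to 3.7, in which the a priori region labels are re-attached to the songs only after clustering, can be sketched in Python as follows. The function name and the six example songs are hypothetical; only the region codes come from this study.

```python
from collections import Counter

REGIONS = ["AAF", "AI", "AF", "CAF", "OEU", "WEU", "NAF"]

def cross_tabulate(regions, clusters, k):
    """Count songs per (region, cluster) once clustering is finished.

    `regions[i]` is the a priori region code of song i and `clusters[i]` is
    the cluster (1..k) to which song i was assigned. The clustering itself
    never sees `regions`; the labels are only re-attached here.
    """
    if len(regions) != len(clusters):
        raise ValueError("need one region and one cluster per song")
    counts = Counter(zip(regions, clusters))
    table = {r: [counts[(r, c)] for c in range(1, k + 1)] for r in REGIONS}
    table["TOTAL"] = [sum(table[r][c] for r in REGIONS) for c in range(k)]
    return table

# Hypothetical assignments for six songs, for illustration only.
example_regions = ["AF", "AF", "CAF", "CAF", "WEU", "AAF"]
example_clusters = [2, 2, 2, 1, 1, 1]
example_table = cross_tabulate(example_regions, example_clusters, k=2)
```

Each row of the resulting table sums to the number of songs from that region in the sub-sample (44 in the balanced design), and the TOTAL row gives the cluster sizes.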
Table 3.2. Table to show the results from one run of the K=2 cluster analysis
Cultural  Cluster
Region         1     2
AAF           23    21
AI            28    16
AF            15    29
CAF            2    42
OEU           25    19
WEU           21    23
NAF           32    12
TOTAL        146   162
Table 3.3. Table to show the results of a K=3 cluster analysis
Cultural  Cluster
Region         1     2     3
AAF           11    12    21
AI            24    12     8
AF            14    26     4
CAF            2    35     7
OEU            8     4    32
WEU            4    11    29
NAF           32     9     3
TOTAL         95   109   104
Table 3.4. Table to show the results of a K=4 cluster analysis
Cultural  Cluster
Region         1     2     3     4
AAF            5     7     3    29
AI            20    12     5     7
AF            12    23     2     7
CAF            0    30     9     5
OEU            6     3    22    13
WEU            3     2    38     1
NAF           27     6     4     7
TOTAL         73    83    83    69
Table 3.5. Table to show the results of a K=5 cluster analysis
Cultural  Cluster
Region         1     2     3     4     5
AAF            4     5    28     1     6
AI            39     0     2     2     1
AF             9    21     4     1     9
CAF            1    31     4     8     0
OEU            1     2    12    19    10
WEU            0     2     1    36     5
NAF            9     6     4     0    25
TOTAL         63    67    55    67    56
Table 3.6. Table to show the results of a K=6 cluster analysis
Cultural  Cluster
Region         1     2     3     4     5     6
AAF           23     4     7     5     4     1
AI             2     2     2     0     0    38
AF             3     5     1    18    11     6
CAF            4    12     3    25     0     0
OEU           11     9    19     1     4     0
WEU            0    20    18     2     3     1
NAF            3     2     4     5    24     6
TOTAL         46    54    54    56    46    52
Table 3.7. Table to show the results of a K=7 cluster analysis
Cultural  Cluster
Region         1     2     3     4     5     6     7
AAF            2     1    16     0     4     3    18
AI             2     2     3    36     0     0     1
AF             5     2     7     6     6    16     2
CAF           12     2     3     0     0    22     5
OEU            5    17     2     0     4     1    15
WEU           19    18     1     0     3     2     1
NAF            2     4     6     4    23     5     0
TOTAL         47    46    38    46    40    49    42
These results show clearly that multivariate analysis of the Cantometric data can produce
structure that closely matches geographical, historical and cultural populations. Note also that
some clusters are highly biased to one cultural region while others contain a mixture. For example,
at K = 7, cluster 4 is mainly AI, while cluster 7 has a combination of styles, AAF and OEU, in almost
equal proportions. This suggests that the song styles of these two regions are similar.
Figure 3.1 shows the distribution of cluster size across each of the different cultural regions
for K = 2-7. These frequency graphs are the result of averaging the proportion of songs in each
cluster over the 50 runs of the algorithm at each K, each run using a different sub-sample. Cluster 1
is always the cluster with the most songs from a given cultural region, cluster 2 is the cluster with
the next highest number, and so on. It can easily be seen that the songs of a region tend towards one
cluster. At all levels of K, the average proportion of songs in cluster 1 is largest, suggesting that,
run after run, the songs of a region fall into only a few of the clusters.
These results show that, on the one hand, the cluster analysis is very good at assigning some
cultural groups to a given cluster. For example, AF and CAF have high proportions of songs in
cluster 1 for all values of K. That is, the majority of the songs from these cultures fall into the same
cluster on repeated runs of the cluster analysis. On the other hand, some cultural groups, such as
AAF and AI, are more widely dispersed across the clusters, suggesting that these data are
more heterogeneous and that the cluster analysis is less effective at assigning these cultural groups
to a given cluster. Nevertheless, I conclude that the multivariate cluster analyses based on song style
can reveal a great deal of geographical and cultural structure in this dataset.
[Figure 3.1: six panels, K=2 to K=7. x-axis: Cluster; y-axis: Frequency (0.0 to 1.0); z-axis: Culture (AAF, AI, AF, CAF, OEU, WEU, NAF).]
Figure 3.1. Frequency distribution graphs of cluster size for all values of K
Frequency appears on the y-axis, culture region is on the z-axis, cluster is on the x-axis. For each run I ranked
the clusters from largest to smallest, then averaged the size of the largest clusters, second largest clusters and
so on. In each case cluster 1 refers to the cluster with the largest number of songs from the particular culture.
Cluster 2 is the cluster with the second highest number of songs from the particular culture, and so on.
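The ranking and averaging described in the caption can be sketched in Python for a single cultural region. The function name is hypothetical, and of the two runs in the example only the first is real (the CAF row of table 3.3); the second is made up so that there is something to average.

```python
def mean_ranked_proportions(runs):
    """Average, over runs, the proportion of a region's songs in its largest
    cluster, its second largest cluster, and so on.

    `runs` holds one list of cluster counts per run for ONE cultural region.
    Clusters are re-ranked within every run, so "cluster 1" always means the
    cluster holding most of that region's songs in that run.
    """
    if not runs:
        raise ValueError("need at least one run")
    k = len(runs[0])
    sums = [0.0] * k
    for counts in runs:
        if len(counts) != k:
            raise ValueError("all runs must use the same K")
        n_songs = sum(counts)
        ranked = sorted((c / n_songs for c in counts), reverse=True)
        sums = [s + p for s, p in zip(sums, ranked)]
    return [s / len(runs) for s in sums]

# CAF row of table 3.3 (one real run) plus one made-up run for illustration.
caf_runs = [[2, 35, 7], [30, 4, 10]]
caf_profile = mean_ranked_proportions(caf_runs)
```

Because the clusters are ranked before averaging, the resulting profile is non-increasing by construction; what carries information is how steeply it falls away from cluster 1.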
3.3. Songs from different cultural regions show stylistic relationships
I next asked whether successive clustering could be used to determine the relationships between
the songs sung by different cultural groups. Tables 3.2-3.7 above give some indication
that songs from some of the different cultural regions do in fact cluster together at different values of
K. To look at the relationships between the songs of different cultural regions, I investigated whether
the songs of a given cultural group (say AF) that cluster together also tended to cluster with those
of another cultural group (say AAF). Table 3.3 shows an example of a K = 3 cluster analysis for
one sub-sample. Here, the majority of both AF and CAF songs occur in cluster 2.
Cluster 3 is characterised by European songs, OEU and WEU. The joint distribution of songs
across the 3 clusters immediately suggests that there may be a correlation between CAF and AF,
but not between AF and songs from another region, say OEU. To formalize this, I asked whether
there was a correlation between the membership of clusters for each pair of groups, at each K.
Recall that 50 runs were done for each K. This means that each pair-wise correlation coefficient
had a sample size of 2 x 50 = 100 for K = 2, 3 x 50 = 150 for K = 3, and so on up to 7 x 50 = 350 for K = 7.
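A minimal Python sketch of this correlation is given below. It pools the per-cluster song counts of two regions over all runs and computes Pearson's r on the pooled pairs, so the sample size is K times the number of runs. The function names and the data layout are assumptions, and the single run used as input is the one shown in table 3.3, so n is only 3 here rather than 150.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)

def membership_correlation(runs, region_a, region_b):
    """Correlate the cluster counts of two regions, pooled over all runs.

    `runs` is a list of dicts mapping region code -> counts per cluster.
    Counts are paired cluster by cluster within a run, so arbitrary cluster
    numbering across runs does not matter.
    """
    xs = [count for run in runs for count in run[region_a]]
    ys = [count for run in runs for count in run[region_b]]
    return pearson_r(xs, ys), len(xs)

# The single K = 3 run of table 3.3.
TABLE_3_3 = {"AAF": [11, 12, 21], "AI": [24, 12, 8], "AF": [14, 26, 4],
             "CAF": [2, 35, 7], "OEU": [8, 4, 32], "WEU": [4, 11, 29],
             "NAF": [32, 9, 3]}
r_af_caf, n_pairs = membership_correlation([TABLE_3_3], "AF", "CAF")
r_af_oeu, _ = membership_correlation([TABLE_3_3], "AF", "OEU")
```

For this one run the AF-CAF coefficient is positive and the AF-OEU coefficient is negative, in line with the reading of table 3.3 given above; with 50 runs the list passed to `membership_correlation` would simply hold 50 such dictionaries.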
The following tables 3.8-3.13 show the correlation matrices for K = 2-7. The lower triangle of each
matrix shows the Pearson's correlation coefficients (r); any significant positive values are shown in
red. The level of significance of each of these values is denoted in the corresponding cell of the
upper triangle (*= p<0.05, **= p<0.01, ***= p<0.001, ****= p<0.0001). The degrees of freedom are
given in the title of each table.
The cluster analysis at all levels gave significant positive correlations between AF and CAF
(r = 0.46-0.77, p<0.0001). West African music and Caribbean music clearly share similar
stylistic characteristics. Songs from WEU and OEU were also highly correlated over all cluster
analyses (r = 0.66-0.95, p<0.0001): the two European-derived song styles also clearly share similar
song style characters. In the 2 cluster analysis significant positive correlations were also found
between AAF and OEU (r = 0.37, p<0.001) and between AAF and NAF (r = 0.52, p<0.0001). At this low
level of clustering African American songs appear to share style with European colonial music and the
songs of North Africa and the Middle East, although these relationships were not found at higher
values of K. Interestingly, AAF also correlated significantly and positively with AF music in the 4, 5,
6 and 7 cluster analyses (r = 0.11-0.21), although at a much lower level than the Caribbean songs
did. Clearly there are still some stylistic similarities between the songs of West Africa and those of
African Americans, whose ancestors left that region some 250 years ago.
There was no positive correlation between AAF songs and CAF songs, even though both of these
styles were sung by peoples who originated from the same region of Africa and both groups
correlated with African songs.
Table 3.8. Correlation matrix for K=2 (n= 100; df= 98)
        AAF     AI     AF    CAF    OEU    WEU    NAF
AAF    1.00                         ***          ****
AI    -0.71   1.00             *
AF    -0.33   0.00   1.00   ****
CAF   -0.72   0.23   0.77   1.00
OEU    0.37  -0.47  -0.70  -0.51   1.00   ****
WEU   -0.19  -0.12  -0.54  -0.10   0.66   1.00
NAF    0.52   0.06  -0.66  -0.90   0.21  -0.14   1.00
Table 3.9. Correlation matrix for K=3 (n= 150; df= 148)
        AAF     AI     AF    CAF    OEU    WEU    NAF
AAF    1.00                                      ****
AI    -0.57   1.00
AF     0.09  -0.13   1.00   ****
CAF   -0.31  -0.22   0.68   1.00
OEU   -0.05  -0.33  -0.81  -0.32   1.00   ****
WEU   -0.15  -0.32  -0.75  -0.21   0.95   1.00
NAF    0.54   0.12   0.14  -0.47  -0.63  -0.70   1.00
Table 3.10. Correlation matrix for K=4 (n= 200; df= 198)
        AAF     AI     AF    CAF    OEU    WEU    NAF
AAF    1.00           ***
AI    -0.54   1.00                               ****
AF     0.21  -0.33   1.00   ****
CAF   -0.28  -0.28   0.66   1.00
OEU   -0.01  -0.29  -0.70  -0.29   1.00   ****
WEU   -0.33  -0.13  -0.69  -0.16   0.88   1.00
NAF   -0.17   0.49   0.00  -0.38  -0.56  -0.53   1.00
Table 3.11. Correlation matrix for K=5 (n= 250; df= 248)
        AAF     AI     AF    CAF    OEU    WEU    NAF
AAF    1.00           ***
AI    -0.36   1.00                                  *
AF     0.21  -0.31   1.00   ****
CAF   -0.27  -0.33   0.70   1.00
OEU   -0.05  -0.39  -0.60  -0.25   1.00   ****
WEU   -0.41  -0.15  -0.64  -0.10   0.84   1.00
NAF   -0.17   0.13  -0.11  -0.36  -0.34  -0.37   1.00
Table 3.12. Correlation matrix for K=6 (n= 300; df= 298)
        AAF     AI     AF    CAF    OEU    WEU    NAF
AAF    1.00             *
AI    -0.41   1.00                                 **
AF     0.14  -0.27   1.00   ****
CAF   -0.29  -0.34   0.63   1.00
OEU    0.09  -0.40  -0.62  -0.23   1.00   ****
WEU   -0.24  -0.26  -0.57  -0.07   0.80   1.00
NAF   -0.20   0.16  -0.08  -0.35  -0.38  -0.43   1.00
Table 3.13. Correlation matrix for K=7 (n= 350; df= 348)
        AAF     AI     AF    CAF    OEU    WEU    NAF
AAF    1.00             *
AI    -0.34   1.00
AF     0.11  -0.27   1.00   ****
CAF   -0.30  -0.27   0.46   1.00
OEU    0.07  -0.37  -0.57  -0.23   1.00   ****
WEU   -0.40  -0.22  -0.50  -0.03   0.72   1.00
NAF   -0.15   0.06  -0.03  -0.33  -0.37  -0.37   1.00
Another way of looking at the relationships between the cultural regions is to look at how often
the majority of songs from two or more cultural regions fell into the same cluster at the higher
values of K.
I calculated the proportion of songs from each a priori defined cultural region in each
of the clusters for all runs of K = 5 and 7. Each culture was assigned to the cluster with the
highest proportion of its songs. For the 5 cluster analysis this proportion ranged from
0.29-0.91 (mean= 0.59 +/- 0.019, 99% CI). In other words, on average, around 60% of the songs
from each a priori defined cultural region fell into a single cluster. Table 3.14 below shows the
frequency of occurrence of pairs of song regions where they appeared in the same group.
The cultures in black represent those that may realistically be linked through a priori notions
of relatedness. The red cultures represent anomalous groupings of cultures.
Of the 21 potential pairings of the seven cultures, only 7 were observed. 98% of runs
produced the European cluster (OEU and WEU), and 88% of runs the AF and CAF cluster.
AF and AAF paired in 10% of the runs. However AAF and CAF songs never appeared
together in a cluster. There were significantly more occurrences of the a priori defined groups
(black) than the anomalous groups (red) (Chi-squared - df 2, p < 0.0001).
PAIRS OF CULTURES     FREQUENCY OCCURRED
OEU and WEU           98
AF and CAF            88
AF and AAF            10
AAF and CAF            0
AF and NAF             2
AI and NAF             2
Table 3.14. Frequency of occurrence of pairs of cultures for K=5 cluster analysis
(In each run there were 2 paired groups, the frequency therefore sums to 200)
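The assignment and pairing steps behind table 3.14 can be sketched in Python as follows. This is an illustration under stated assumptions, not the original program: the dictionary layout, the function names and the proportions in the single example run are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import comb


def assign_to_clusters(proportions):
    """Map each culture to the index of the cluster holding most of its songs."""
    return {
        culture: max(range(len(props)), key=props.__getitem__)
        for culture, props in proportions.items()
    }


def shared_pairs(assignment):
    """Return the pairs of cultures that were assigned to the same cluster."""
    return [
        (a, b)
        for a, b in combinations(sorted(assignment), 2)
        if assignment[a] == assignment[b]
    ]


# Seven cultures give comb(7, 2) = 21 potential pairings.
n_possible_pairs = comb(7, 2)

# One hypothetical K=5 run: proportion of each culture's songs per cluster.
run = {
    "AAF": [0.50, 0.20, 0.10, 0.10, 0.10],
    "AI": [0.10, 0.60, 0.10, 0.10, 0.10],
    "AF": [0.10, 0.10, 0.60, 0.10, 0.10],
    "CAF": [0.05, 0.05, 0.80, 0.05, 0.05],
    "OEU": [0.10, 0.10, 0.10, 0.60, 0.10],
    "WEU": [0.05, 0.05, 0.05, 0.80, 0.05],
    "NAF": [0.10, 0.10, 0.10, 0.10, 0.60],
}

# Tally the shared-cluster pairs over all runs (only one run here).
pair_counts = Counter()
for proportions in [run]:
    pair_counts.update(shared_pairs(assign_to_clusters(proportions)))
```

In this example run the AF and CAF songs share a cluster, as do the OEU and WEU songs, so each of those two pairs is counted once. Repeating the loop over all 50 runs would give frequencies of the kind tabulated above.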
These groupings become even more apparent in figure 3.2. This shows the occurrence of the
different “clusters” over the 50 runs. Cluster here refers to the cluster returned by the
algorithm containing the highest proportion of songs from a given cultural region. Thus the AI
cluster in figure 3.2. refers to the cluster in which the highest proportion of AI songs fell. 5
main clusters are repeatedly produced: AF, AI, NAF, a European cluster (OEU and WEU) and
a Black African Caribbean cluster (AF and CAF).
Figure 3.2. Graph to show the occurrence of all clusters in 50 runs of K=5 cluster algorithm
(Bar chart. x-axis: Clusters, labelled by single cultures and pairs of cultures; y-axis: Occurrence, 0-50)
Recall that in each run of the 5 cluster algorithm, each of the 7 a priori defined cultural regions
was assigned to the cluster which contained the highest proportion of its songs. There were
50 runs of each level of K, so for K = 5 there were 250 clusters returned.
The same graph for the 7 cluster analysis is shown below, figure 3.3. The range of
the highest proportions of songs from each culture in a single cluster was 0.26-0.86
(mean= 0.50 +/- 0.018, 99% CI). So, on average, 50% of the songs from a given culture fell into a single
cluster. If the highest proportion of songs from two different cultural regions fell into the same
cluster, the cluster was identified as having both cultures, e.g. OEU and WEU. Figure 3.3.
also contains two clusters labelled ‘no culture’. The algorithm produced 7 clusters
but, as we have seen in the correlation matrix for K = 7 (table 3.13) both AF and CAF, and
WEU and OEU were highly correlated at this level of K. This is reflected in this figure by the
high occurrence of a WEU and OEU cluster. When a pair of cultural regions occurs in the
same cluster, one cluster must be left without any predominance of songs from a specific
cultural region. These were labelled as ‘no culture 1’. When there were two pairs of cultures in
the same cluster, two clusters were left without a specific identity. The label ‘no culture 2’
refers to this. The CAF and AF cluster occurred in around 4% of the clusters.
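The labelling rule described above can be expressed as a short Python sketch. It assumes that clusters are indexed 0 to K-1 and that the empty clusters are numbered in index order; the function name and the example assignment are hypothetical.

```python
def label_clusters(assignment, k):
    """Label each of k clusters by the cultures assigned to it.

    Clusters that no culture was assigned to are labelled
    'no culture 1', 'no culture 2', ... in index order (an assumption).
    """
    members = {}
    for culture in sorted(assignment):
        members.setdefault(assignment[culture], []).append(culture)
    labels = {idx: " and ".join(names) for idx, names in members.items()}
    empty = [idx for idx in range(k) if idx not in labels]
    for number, idx in enumerate(empty, start=1):
        labels[idx] = f"no culture {number}"
    return labels


# Hypothetical K=7 run with two shared clusters, leaving two clusters unlabelled.
example_assignment = {
    "AAF": 0, "AI": 1, "AF": 2, "CAF": 2, "OEU": 4, "WEU": 4, "NAF": 6,
}
example_labels = label_clusters(example_assignment, 7)
```

With seven cultures and seven clusters, every pair of cultures sharing a cluster leaves exactly one cluster without a cultural label, which is why two pairs produce both ‘no culture’ labels here.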
Figure 3.3. Graph showing the occurrence of all clusters in 50 runs of K=7 cluster algorithm
(Bar chart. x-axis: Clusters, labelled by single cultures, pairs of cultures, ‘no culture 1’ and ‘no culture 2’; y-axis: Occurrence, 0-50)
Over 78% of the clusters were identified as one of AAF, AI, AF, CAF, NAF, OEU and WEU,
and ‘no culture 1’. It appears then, that although the CAF and AF songs were highly
correlated at K=7 (r= 0.46,p<0.0001), the songs from the two cultural regions could be
segregated at the highest level of K.
These data suggest the following:
1. Songs from different cultural regions differ in the stylistic characters of their music.
2. However, some songs do share similar stylistic characteristics, e.g. the European song
styles.
3. African and Caribbean songs share similar song style characters, but higher level cluster
analysis can differentiate between the two regions.
4. Most surprisingly, African American songs have diverged greatly from African songs. They
did correlate with African songs at some levels, but the relationship was small. At the lowest
level of clustering, they also show some affiliation with colonial European songs and North
African and Middle Eastern songs.
5. Both African American and Caribbean songs have diverged from the African music of their
origins, but at different rates and in different directions. African American songs bear no
relationship with Caribbean songs while they both correlate with African music.
4. DISCUSSION
4.1. Song styles differ systematically among cultures
Alan Lomax developed Cantometrics with the conviction that song style around the world
could be explained by the culture in which it was produced. His idea, that song is a measure
of culture, was a bold one. At the time, it had its critics. The methodology was questionable,
the analysis perhaps naïve (see appendix A1). However, the present study has sought to
overcome many of these problems, particularly regarding our removal of a priori cultural
information from the analysis and the quantitative techniques used in the cluster analysis. In
the light of recent cultural evolutionary research, Lomax’s idea now seems prescient.
Music is an ancient human tradition. A bone flute, found in southern Germany, has
been dated to 34000 years BC (see Cross 2004). Cantometrics found regions of song style
that could be explained by culture and historical human population regions (Lomax 1968). It is
therefore reassuring that structure has been found in the cultural regions under present
investigation, given that music is such a prehistoric human behaviour.
I have asked whether geographical, historical and cultural structure can be revealed
by correct cluster analysis of a sample of the new, reconstituted dataset. To an extent, this
was found, with most of the songs in the higher order cluster analysis consistently falling into
clusters with other songs from the same cultural region. Although study of population genetic
structure has failed to classify human races (Lewontin 1972), it might be that correlates of
many loci, analogous here to the multivariate cluster analysis, can reveal geographic structure
in human gene frequencies (Edwards 2003).
4.2. Song styles show historical relationships
Why then was the clustering imperfect? One reason might be influences from multiple
sources, analogous to migration or admixture in genetics. African Americans in North America
still retain around 70% of the African gene pool (Reed 1969; Cavalli-Sforza et al 1994).
Quantifiable cultural relationships between Caribbean African Americans (and, to some extent,
African North Americans) and West Africans show us that comparison of song style
characteristics reflects genetic differences between the same populations. Given that cultural
data, in the form of language (Gray and Jordan 2000; Holden 2002; Mace and Holden 2005)
and historical data, in the form of archaeology (Cavalli-Sforza et al 1994), have corroborated
genetic inferences of population structure, the present study presents a novel, and perhaps
conservative, source of information with which to study human population history.
Nevertheless, one of the most striking results found here is that although one might expect
song to be plastic and susceptible to diffusion or admixture, it largely is not.
Western European and Overseas European song styles did not separate into distinct clusters
and appear to be highly conserved. This is not surprising considering the past cultural domination
of North America by European colonial powers. They took their songs with them. Again, the
present analysis has identified historical relationships that can be substantiated.
4.3. African American songs and their origins
We have seen that the style of African American songs has diverged greatly from that of
traditional West Africa. Influences from the European colonists and possibly from Islamic
North Africa and the Middle East are suggested by these data. Many slaves may have been
Islamic (Kubik 1999; Diouf 2004). Melisma and wavy intonation are apparent in Islamic song
and could be conserved in the Blues which may lead to similarities in this analysis. The
Cantometric data codes specifically for melisma, nasalization, enunciation of consonants and
embellishment, all Islamic stylistic traits (Kubik 1999). However these correlations are likely a
result of the small African American sample size and a predominance of the Blues in the
Cantometrics dataset.
It also appears that African American songs have diverged in a different stylistic
direction from Caribbean songs. At no point were the two styles correlated, yet both related to
some degree to African songs. Perhaps, both cultural types have converged on different
stylistic traits originally present in the African founding population in a process analogous to
speciation.
4.4. Future work
Further work with a larger African American sample would greatly add to the conclusions
reached here. It would also be of interest to analyse which variables are causing the different
correlations. It could be that specific groups of variables measuring similar traits cause the
relationships between various cultural regions.
Further systematic analysis of the contingency tables, identifying which variables are
the most heterogeneous over the different cultural groups would also help to identify which
stylistic characteristics are most important in producing the structure observed.
A full analysis in the same fashion as the present study but on the complete data set
will be important in assessing the suitability of using song as a measure of cultural change
and evolution and could be combined with genetic, linguistic and historical data to gain a
more complete perspective of human history.
5. ACKNOWLEDGEMENTS
Thanks must first and foremost go to Dr Armand Leroi, my supervisor. His enthusiasm from
the start and discussion throughout was invaluable in both the production of this paper and in
fuelling my inquisition into a subject both novel and challenging.
Jonathan Swire, who not only masterminded the reconstitution of the Cantometric
data, but also designed and tested the computer programs, was a benignant aid in
understanding the finer workings of computational analysis.
Thanks also to Dr Timothy Barraclough for help and consultation in the latter stages
of the project.
6. REFERENCES
Ayub, Q., Mansoor, A., Ismail, M., Khaliq, S., Mohyuddin, A., Hameed, A., Mazhar, K.,
Rehman, S., Siddiqi, S., Papaioannou, M., Piazza, A., Cavalli-Sforza, L.L., and
Mehdi, S.Q. (2003) Reconstruction of Human Evolutionary Tree Using Polymorphic
Autosomal Microsatellites. American Journal of Physical Anthropology 122 259-268
Boyer, P. (1999) Cognitive tracks of cultural inheritance: how evolved intuitive ontology
governs cultural transmission. American Anthropologist 100 876-889
Cann, R.L., Stoneking, M. and Wilson, A.C. (1987) Mitochondrial DNA and Human
Populations. Nature 325 (6009) 31-36
Cavalli-Sforza, L.L., Piazza, A., Menozzi, P. and Mountain, J. (1988) Reconstruction of
Human Evolution: bringing together genetic, archaeological and linguistic data. Proc.
Natl. Acad. Sci. USA. 85 6002-6006
Cavalli-Sforza, L.L., Menozzi, P., Piazza, A. The History and Geography of Human Genes.
Princeton University Press. 1994.
Cross, I. (2004) Music and meaning, ambiguity and evolution. in Musical Communication,
eds. D. Miell, R. MacDonald & D. Hargreaves, O.U.P. 2004 (web source)
Dawkins, R. The Extended Phenotype. Oxford University Press. 1982
Diouf, S. (2004) in Muslim Roots of the Blues by Curiel, J. San Francisco Chronicle
15/08/2004 online www.sfgate.com
Edwards, A.W. (2003) Human Genetic Diversity. Lewontin’s fallacy. Bioessays 25 (8) 798-801
Feld, S. (1984) Sound Structure as Social Structure. Ethnomusicology 28 (3) 383-409
Frisbie, C.J. (1971) Anthropological and Ethnomusicological Implications of a Comparative
Analysis of Bushmen and African Pygmy Music. Ethnology 10 265-295
Goodall, J. (1964) Nature 201, 1264-1266
Gray, R.D. and Jordan, F.M. (2000) Language trees support the express-train model of
Austronesian expansion. Nature. 405, 1052-1055
Grauer, V. (2005) Cantometrics: Song and Social Culture. A Response.
www.mustrad.org.uk/articles/cantome2.htm
Grauer, V. (in press) Echoes of our Forgotten Ancestors.
http://pages.prodigy.net/victorag/echoes.htm#_edn17
Hauser, M.D., Chomsky, N. and Fitch, W.T. (2002) The Faculty of Language: What is it?
Who has it? And how did it evolve? Science 298 1569-1579
Henrich, J. and McElreath, R. (2003) The evolution of cultural evolution. Evolutionary
Anthropology 12,123–135
Henry, E.O. (1976) The variety of music in a north Indian Village: Reassessing Cantometrics.
Ethnomusicology 20 49-66
Holden, C.J. (2002) Bantu language trees reflect the spread of farming across sub-Saharan
Africa: a maximum-parsimony analysis. Proc. R. Soc. Lond. B 269, 793–799
Kolata, G.B. (1978) Singing Styles and Human Cultures. Science 200 287-288
Lewontin, R.C. The apportionment of human diversity. In: Dobzhansky T, Hecht MK, Steere
WC, editors. Evolutionary Biology 6. New York: Appleton-Century-Crofts. 1972. p
381–398.
Lomax, A. (et al) Folk Song Style and Culture. Washington D.C. American Association for
the Advancement of Science Publication no: 88. 1968
Lomax, A. and Berkowitz, N. (1972) The Evolutionary Taxonomy of Culture. Science. 177
228-239
Lomax, A. Cantometrics; an approach to the anthropology of music. Audiocassettes and a
handbook. University of California, Berkeley. 1976
Lomax, A. (1980) Factors of Music Style. In Theory and Practise: Essays presented to Gene
Weltfish. Ed S. Diamond 1980
Mace, R. and Holden, C.J. (2005) A phylogenetic approach to cultural evolution. TREE 20
(3) 116-121
Mace, R. and Pagel, M. (1994). The comparative method in anthropology. Current
Anthropology 35 (5) 549-564
Maranda, E.K. (1970) Deep significance or surface significance: Is Cantometrics possible?
Semiotica 2 173-184
McLeod, N. (1974) Ethnomusicological Research and Anthropology. Annual Review of
Anthropology 3 99-115
Merriam, A.P. The Anthropology of Music. Northwestern University Press. 1964
Merriam, A.P. (1969) Folk Song Style and Culture Review Article. Journal of American
Folklore. 82, 385-387
Murdock, G.P. (1962-1967) Ethnographic Atlas. Ethnology 1-5
Nei, M. and Roychoudhury A.K. (1982) Genetic relationship and evolution of human races.
Evolutionary Biology 14 1-59
Nei, M. and Takezaki, N. (1996) The Root of the Phylogenetic Tree of Human Populations.
Molecular Biology and Evolution 13 (1) 170-177
Nettl, B. (1970) Folk Song Style and Culture Review Article. American Anthropologist 72 438-441
Oppenheimer, S. Out of Eden: The Peopling of the World. Constable and Robinson Ltd 2003
Pinker, S. The Language Instinct. New York: Morrow 1994, chapter 11. in Evolution Oxford
Readers 2nd Edition. ed M. Ridley, O.U.P. 2004
Pritchard, J.K., Stephens, M. and Donnelly, P. (2000) Inference of Population Structure
Using Multilocus Genotype Data. Genetics 155 945-959
Reed, T.E. (1969) Caucasian genes in American Negroes. Science 165 762-768
Rosenberg, N.A., Pritchard, J.K., Weber, J.L., Cann, H.M., Kidd, K.K., Zhivotovsky, L.A.,
Feldman, M.W. (2002) Genetic structure of human populations. Science 298 2381–
2385.
Rummel, R.J. (1967) Understanding Factor Analysis. (Scanned from "Understanding Factor
Analysis," The Journal of Conflict Resolution 11 444-480)
www.hawaii.edu/powerkills/UFA.HTM
Szathmary, E. and Maynard Smith, J. (1995). The major evolutionary transitions. Nature.
374, 227-232
Tehrani, J. and Collard, M. (2002) Investigating cultural evolution through biological
phylogenetic analyses of Turkmen textiles. Journal of Anthropological Archaeology.
21, 443–463
Whiten, A., Horner, V. and de Waal, F.B.M. (2005) Conformity to cultural norms of tool use
in chimpanzees. Nature 437 (29) 737-740
Wikipedia. http://en.wikipedia.org/wiki/Classical_music_era
APPENDICES
A1 CANTOMETRICS
A1.1 A brief introduction to Cantometrics
In the 1960s Alan Lomax developed a programme with the aim of trying to evaluate and
quantify musical performance style. While working and collecting songs in Europe in the
1950s, he noticed that the performance style of Spanish songs “varied in terms of the severity
of prohibitions against premarital intercourse” (Lomax 1969:pviii). Songs from the restrictive
southern provinces were characterised by high-pitched, squeezed and pierced singing style,
whereas the style of songs from the more liberal north tended much more towards open,
mixed voice, well-blended choirs. Later in Italy, a thorough collection effort resulted in Lomax
observing similar characteristics (Lomax 1969:pviii). Here he saw a way to define
characteristics of song performance style and so compare them. Through evaluation and
comparison of world song performance style he believed he might be able to understand why
songs have such a deep effect on man’s conscience. He believed that these trends he had
seen in Mediterranean music might hold for music from all over the world. Lomax wanted to
show that a people’s musical style could be explained by the ‘style’ of the culture in which it
has developed: song structure as social structure. In the 1950s a new age was dawning.
Better recording and playback quality meant that music could now be listened to in an
unprecedented way. Songs could be recorded anywhere in the world and then listened to
back in the comfort of one’s home or office. Initially, a pilot study in 1961 developed
phonotactics (a measure of assonance in verse) as a means to evaluate the words in songs
in the Mediterranean song sample (Lomax 1969:pviii). This system showed that there were
indeed geographic regions of song styles that were reflected in the assonance of the verse of
song (Lomax 1969: pix). In the summer of 1963 Lomax went further and sat down with an
associate, Victor Grauer, and listened to a sample of more than 700 songs from around the
world (Lomax 1969:pxiv). They tried to produce a holistic descriptive system that could be
used to characterise the performance style of a song purely by listening to a recording of it.
Cantometrics was born.
There seemed at once to be a dichotomy in the performance styles of the songs that
they listened to: songs tended to be either group-dominating, and therefore individualised,
solo, textually and melodically complex, ornamented and precisely enunciated or they were
group-involving, sung in unison with simple text and melody with no ornamentation and
slurred enunciation (Lomax 1969:p16). Now, both Lomax and Grauer were aware that this
was a generalisation, however they were intrigued by the fact that these patterns were there
at all. They continued to manipulate the method in order to find important characters and traits
of song style performance. They were trying to find the factors that could best be used to
profile a song style so that performances could be compared.
The Taxonomic Concepts of Cantometrics
1  The social organisation of the vocal group
2  The social organisation of the orchestra
3  The level of cohesiveness of both vocal group and orchestra
4  The level of explicitness – text and consonant load
5  The rhythmic organisation of the vocal group and the orchestra
6  The order of melodic complexity
7  The degree and kind of embellishment used
8  The vocal stance
Table A1.1. The Taxonomic Concepts of Cantometrics (Lomax 1969:p20)
Key to Lomax’s thinking was music’s link to culture. It was
therefore not enough to just evaluate song. It was also necessary to investigate the culture of
the area where the song came from. He used Murdock’s Ethnographic Atlas (1962-7) to add
information to the analysis about the culture from where the songs came.
From the initial analysis they had the two broad types which, as the study evolved,
were divided into subsets of types depending on these factors of song style performance.
Different combinations of levels of factors gave different profiles and so songs could be
quantitatively rated on how similar they were. As well as the measures mentioned above, the
songs were graded on their cohesion, wordiness, metre, embellishment and type of voice
(clear or slurred). Factor analysis produced clusters of songs with similar profiles for these
traits. Interestingly, these clusters matched, albeit very generally, geographically and
historically similar areas. That is to say, song styles from African Hunter societies were more
similar to each other than they were to Amerindian or European songs. This crude system
was improved and eight major taxonomic concepts were discovered, each with different
scales beneath them (table A1.1.).
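The idea that songs can be rated on how similar their profiles are may be illustrated with a simple matching coefficient in Python. This is only a sketch: the measure, the function name and the example codes are assumptions chosen for clarity, and are not claimed to be the similarity measure that Lomax and his colleagues used.

```python
def matching_similarity(profile_a, profile_b):
    """Proportion of coded variables on which two song profiles agree."""
    if len(profile_a) != len(profile_b) or not profile_a:
        raise ValueError("profiles must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(profile_a, profile_b))
    return matches / len(profile_a)


# Two hypothetical songs coded on four variables; they differ on the third only.
song_1 = [1, 3, 5, 2]
song_2 = [1, 3, 4, 2]
similarity = matching_similarity(song_1, song_2)
```

A value of 1 would mean identical profiles and 0 would mean no shared codes. Treating the codes as unordered categories is a simplification, since many of the Cantometric scales are ordered.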
Musicologists were already aware of these world song style regions, so it’s not wholly
surprising that Cantometrics confirmed this. However, it is important to note that it did find
these areas in a quantitative and scientific fashion (Merriam 1969).
Figure A1.1. Lomax and Berkowitz’s ‘Tree’ of cultures (1972).
Each box represents one of the cultural clusters recognised by the factor analysis. The numbers refer to the
similarities (in terms of profile similarity) between them – the higher the value the more similar the cultures. The
differentiation scale on the left (the y-axis) is a measure of economic productivity derived from the differentiation
factor that was produced from the factor analysis. There is no x-axis.
A1.2. The Cantometrics Coding Book
The Cantometrics Coding Book has some 37 different song style factors with scales ranging
from 3-13, which allowed for a sophisticated profile to be made for songs from a given culture.
Further analysis involving the inclusion of Murdock’s cultural information (1962-7) as well as
the song style factors, on a larger sample of recordings, ensued. They now had the
beginnings of a system that could be used to audibly rate songs on their performance style,
with a view to making meaningful comparisons between them. Progressively larger samples of
songs were rated, encompassing an ever larger sample of cultures. The final coding system
was published, along with the first large-scale analysis of around 2400 songs from
233 different cultures, as Folk Song Style and Culture (Lomax et al 1968).
This analysis combined Cantometrics, which took on average ten songs from each of the 233
cultures and produced a modal profile for a culture based on this sample (Lomax 1968), with
cultural data using multi-variate factor analysis. Factor analysis is a way of defining patterns
from data (Rummel 1967). By comparing and correlating the profiles for each culture it finds
factors (in this case the cultural and Cantometric measures) that are particularly good at
explaining the variation in the data. In this way, Lomax was able to cluster groups of cultures
together on the basis of a few important cultural and performance traits.
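A modal profile of the kind described above can be sketched in Python as the most frequent code for each variable across a culture's sampled songs. The function name, the tie-breaking rule (the lowest code wins) and the example data are assumptions; the source does not describe how ties were resolved.

```python
from collections import Counter


def modal_profile(song_profiles):
    """Return the most common code for each variable across a set of songs.

    Ties are broken by taking the lowest code, which is an assumption.
    """
    if not song_profiles:
        raise ValueError("at least one song profile is required")
    n_variables = len(song_profiles[0])
    profile = []
    for i in range(n_variables):
        counts = Counter(song[i] for song in song_profiles)
        top = max(counts.values())
        profile.append(min(code for code, c in counts.items() if c == top))
    return profile


# Three hypothetical songs from one culture, each coded on three variables.
songs = [
    [1, 4, 2],
    [1, 5, 2],
    [3, 4, 2],
]
culture_profile = modal_profile(songs)
```

Here the modal profile is the list of codes that each occur in at least two of the three songs. With only around ten songs per culture, a single unusual song can change the mode, which is one of the concerns raised by critics of the sampling.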
A larger data set of around 4000 songs from 400 cultures was used to produce an
evolutionary perspective of culture with songs, (figure A1.1.: Lomax and Berkowitz 1972). The
combined data sets of Cantometrics and cultural factors gave 13 different cultural clusters.
Lomax and Berkowitz (1972) assumed that economic productivity was key to the evolution of
culture. Figure A1.1. shows the 13 cultural clusters against measurements of ‘differentiation’
(or economic productivity). Thus, in figure A1.1 we see the relationships, in terms of similarities, of
the 13 cultural clusters against a differentiation scale. It is a sort of primitive evolutionary tree.
A1.3. Criticisms of Cantometrics
Cantometrics had its critics. Much of the criticism fell into two camps. There were those
who believed that the methodology, both musically and statistically, was unsound and naïve,
and there were those who believed that the conclusions of Cantometrics were unfounded (Kolata 1978).
A1.3.1.1. Problems with collection of songs
Lomax chose ten songs from each culture. He then produced a modal profile for each culture.
In a study of the Kaluli people of Papua New Guinea, Feld (1984) has shown that on a
sample of 500 songs from these people, the intracultural variability is so great that it is almost
impossible to produce a modal profile. How then can Cantometrics claim to have the full
essence of the songs of a culture in a small sample of ten? Indeed, how can we know that the
ten are representative anyway (Nettl 1970; McLeod 1974)? In a stern critique Maranda (1970)
suggests that Lomax deliberately ignores music styles from areas that don’t fit the data. In
another study detailing Indian folk music, Henry (1976) went further and specifically criticised
Lomax’s description of Indian music as ‘Old High Culture’, showing in his own prolonged
study of Indian music, a wealth of diverse music forms, rather than a dominance of the
‘bardic’ solo form Lomax assumed to be present there. He shows that devotional songs, male
group songs and women’s songs have a different style (Henry 1976). He also shows that
within a genre of songs there can be diversity. Non-participatory songs, performed for an
audience, show large diversity in embellishment and interval range. Thus the Khari viraha,
short songs of herdsmen that have narrow intervals, much embellishment and a free rhythm,
differ from the Kaharava songs, which are sung by groups of men and are relatively
unembellished, melodically simple and highly rhythmic (Henry 1976). Merriam (quoted in
Kolata 1978) and Feld (1984) also comment on the time depth in the sample: the songs are
treated as contemporaneous when they were in fact collected over a 70 year time frame.
There seems to be evidence, therefore, that Lomax’s sample size and assumptions were
misleading.
Lomax and Grauer, the co-founders of Cantometrics, point out that the method was
never meant to describe fine detail, but deal instead with ‘general trends on a global basis’
(Lomax 1980; Grauer 2005). It cannot realistically be expected in a global analysis that
precise individual differences be recognised. It is simply too time consuming. Furthermore,
Lomax and Grauer argue that a sample greater than ten produced a similar modal profile to
the ones produced with just ten songs: “we found that we were recording little new information
when we coded more” (Lomax 1968:p32). In order to get a representative sample, wherever
possible, Lomax asked a specialist in a certain field to give his opinion on what songs to take
(Grauer 2005). It is almost never possible to sample the whole of a population. This is one of
the fundamental principles of statistics: in fact it is the whole point of statistics, that is, to use a
sample of a population to make inferences and generalisations about the whole. With
relatively simple statistics, one can find the variance in the profiles of music in a sample and
then make predictions about how variable and valid the sample is (Merriam 1964). I believe
that most ethnomusicologists, who over the last 50 years have diverged away from the
general and into the specific (McLeod 1974), have lost sight of what their discipline was
originally intended to study: which is the cross-cultural study of music (Lomax 1968:p35). Too
many ethnomusicologists are insulted by Lomax’s crude generalisations, purely because they
have spent the vast majority of their lives trying to describe the diversity of music within a
culture. They are therefore obliged to disagree with Cantometrics, or else acknowledge their
own studies as inept and pointless. This has been a huge problem in the development of
Cantometrics since the 1960s.
A1.3.1.2. Problems with analysis of songs
Once the Cantometrics project had a set of songs from different cultures, it was time to start
the analysis. This involved rating every song with the 37 ratings in the Cantometrics Coding
Book (Lomax 1968: chapter 3). As explained above, the ratings were produced by listening to
recordings of the songs and then commenting on the style. Many of the qualities that were
rated were highly subjective (Nettl 1970). Vocal rasp, nasality and vocal width are examples
of fuzzy areas of analysis, where it is difficult to rate with precision. Amount of nonsense in a
song is another area of ambiguity (Maranda 1970). How can non-native speakers comment
on the level of nonsense in a song? Also, many of the scales have dubious states between which
the rater is supposed to discern. For example, the scale to judge embellishment runs as
follows: extreme embellishment, much embellishment, a considerable amount of
embellishment, some embellishment, little or none.
It is easy to see how such subjective measurements can cause offence, but less easy
to justify them. Lomax was bold enough (Feld 1984) to publish the Cantometric training tapes
(Lomax 1976) so that anyone could see how he had rated the songs. It is true that a lot of the
factors were ambiguous to rate, but the use of the tapes certainly helps people to gain
understanding. Indeed, Lomax and Halifax (another Cantometrics collaborator) explain in Folk
Song Style and Culture (Lomax 1968:p112) that the mean reliability of the rater score for
thirteen of the Cantometric parameters was 84.7%, and this test included nasality (76%
consensus) and raspiness (78% consensus), both factors that came under criticism as being
too subjective. At the time of the research there were few if any other options of analysis in
terms of computer software or other sophisticated technologies. It is important to note that the
initial Cantometrics analysis occurred over 40 years ago and the rate of technological
advancement since then has been greater than ever before.
A1.3.2. Statistical Problems
The statistical analysis used in Cantometrics often comes under criticism (Grauer 2005).
However, there is little published critique of the system and even fewer published alternatives.
This criticism stems largely from ignorance rather than truly scientific judgement on the part of
the ethnomusicologists. Cantometrics used multivariate factor analysis to produce clusters of
similar cultures based on factors of song style and culture. Essentially it looks for correlations
between cultures over the different factors and tries to find patterns in the data (Rummel
1967). So Cantometrics sits on analysis that is solely based on correlation and as any
scientist will tell you: correlation does not equal causation. There are always alternative
hypotheses that could explain the data and Lomax simply never explores them (Maranda
1970).
Maranda (1970) alleges that if there were two alternative types of song performance
in a cultural sample then Lomax ignored one of them. She also suggests that Lomax simply
ignores those results that do not say what he wants, whereas he overemphasises those
results that do. And this leads on to probably the biggest criticism of the Cantometric method:
that Lomax had decided what he wanted the results to be a priori and ‘fixed’ the data to it
(Maranda 1970). This is evident in his book Folk Song Style and Culture where he states in
the foreword that during his early folk-song recordings in Spain (that were unrelated to
Cantometrics) he noticed that “Spanish performance style varied in terms of the severity of
prohibitions against feminine premarital intercourse” (Lomax 1968:pviii). From the beginning,
indeed before the beginning, he had preconceived ideas about what he was trying to do with
Cantometrics which influenced what he wanted to find, and this is most notable in the
presence of the ‘dichotomy’ of styles that was found, one being restrictive and individualistic,
the other permissive and group-based, which is essentially what he had noticed in the 1950s’
Spanish songs.
A2 THE SONG SAMPLE
A2.1. Introduction
A sample of the total database of song style profiles was made in order to look specifically at
African American songs. Songs from American Indians, West Africa, Northern Africa, the
Middle East, Western Europe and European Overseas colonies were selected because these
are believed to have been the major influences in shaping African American songs (see Kubik
1999). Similarities in song style variables between these broad geographic areas and African
American style profiles were investigated. Variables, and groups of variables, that occur in
similar states in different geographies imply similar cultural styles and possibly similar cultural
ancestry.
Cantometrics divided the world into 56 a priori defined cultural areas, primarily based
on Murdock’s Ethnographic Atlas (Lomax 1968:75; Murdock 1962-67). These were recorded
on the profile for each of the songs and were used as indicators of areas from which to
sample songs of interest to the present study. In each of the tables below, this cultural area is
referred to by the three digit number at the top left of each cell. Sometimes it was sufficient to
segregate songs purely on this basis, but for others it was necessary to split up songs from
within the cultural area, to avoid using irrelevant songs.
A2.2. The Sample
Out of the total 5266 songs which were originally analysed and given profiles during the
Cantometric study, 1488 were used in the analysis. This is roughly 28% of the total song
sample.
A2.2.1. African American Songs - AF
The Cantometrics database contains many types of American songs ostensibly performed by
African Americans. These include songs from North America such as the blues, delta blues
and jazz; music from Caribbean Islands; songs played by Africans in Latin American countries
and songs from Central American cultures. Only songs from North American cultural areas
were relevant to the present analysis, so all Latin and Central American songs were removed.
Caribbean songs were included, but as a separate group (see below). There is a large
Spanish and Portuguese influence on song style from Latin America, as Africans were
brought there as slaves by the peninsular nations of Europe. Although there is African
influence on these song styles, the Spanish / Portuguese influence was not being investigated
so the inclusion of this music was not relevant. In total 44 out of 460 African American songs
were included. This constituted the smallest of all the geographic song samples.
Song Regions Discarded
521 – Caribbean Islands and Salvador
Songs from:
Androsis, Anguill, Calypso, Carriacou, Columbia
Popayan, Colec Negro, Salvador, Surinam
Song Regions Included
521
Southern US General, Afro-American, Jazz, US
South
A2.2.2. Caribbean Songs - CAF
A subgroup of Caribbean songs was included in the analysis. These were thought to be of
interest because they would produce a contrast to the African American songs. Many of the
Caribbean slaves came from a similar area to the mainland North American slaves and so
investigation of any similarities and differences between the two cultures is of interest. There
were 357 Caribbean songs included in the analysis.
521 – Caribbean Islands and Salvador
AA Anguil, AA Carriac, AA Dominic, AA Grenada, AA Guadalo, AA Martinin, AA Nevis, AA St Luci, AA
Toco, Black Carib Cuba, Haiti, Jamaica, Toco, Trin, Trindad, Virgin Isl
A2.2.3. American Indian Songs - AI
The influence of American Indian songs on the evolution of African American music is
believed to be minimal, given that the two genres act on different tuning scales (Kubik 1999).
It was therefore of interest to include these songs in the analysis. In the Cantometric sample
there are 910 American Indian songs, of which 326 were used. American Indian songs from
non-North American countries, such as Peru, and ancient traditional songs from non-North
American cultures such as the Maya were discarded as they will have had no influence on the
origin of African American music. A large proportion of the American Indian songs on the
Cantometrics database came from Mexican and other Central American cultures and these
were also discarded.
Song Regions Discarded
101 - Patagonia
Yaghan and Ona
103
Mataco, Toba, Siriono, Caingua, Pilaga, Ayore
105 – Chilian Amazon
Araucanian, Quechuan, Aymara, Chipayas
107 – Peruvian Amazon
Cavana, Cayapa, Borawit, Shipibo, Conibo, Jivaro,
Tucano, Yagua, Otavalo, Orejon, Cocamo, Campo
109 – Central Brazilian Amazon
Trumai, Camagura, Iwalapeti
113 – Eastern Brazilian Amazon
Tapirape, Krahocanel, Jahave, Caraje, Ncayaposhu, Juruna, Suya, Kuikuru, Caingang,
Cayapo
115 – North Brazilian Amazon
Yarura, Guahibo, Warao, Yekuana, Oyana,
Piaroa, Kalina, Makirita, Guarauno
117 – North South American Amazon
Goajiro, Motilon
119 – Central America
Lacandon, Cuna, Tzotzil, NoanM, Maya, Cholo,
Chiapas, Mestizo, Miskito
203 – Northern Mexico
Huichol Papago, Pima, Yaqui, Seri, Tarhumara,
Totonactep, Cora, Otomi, Mazatec, Popluco,
Miztec
219 - Alaska
Inuit
Song Regions Included
205 – Californian Region
Wapache, Navaho, Mohave
206 – South Central US
Taos, Zumi, Laguna, Hopi, Santaclar, San
Ildefonso
207 – Missouri/Kansas region
Creek, Iroquis, Wabanaki, Yuchi, Delaware,
Choctaw
209
Pawnee, Meskawaki, Ojibwa, Fox, Menomini
211 – North-East US
Cree, Tetondakot, Kiowa, Blood-Bac, Comanche,
Cheyenne, Teton Dakota
213 – North Central US
Flathead, S.Paiute, Washo, Kutenai, Shoshoni,
Ute, Walapai, Bannock
215 – North West US
Hupa, Pomo, Siletz, Yurok, Yokuts tachi,
Diegueno
217 – Washington/Alaskan region
Tolowa, Kwakiuth, Tsimshian, Salish, Haida,
Salishcoas, Nootka
A2.2.4. African Songs
The Cantometrics database includes songs from all over Africa; from the Bushmen of the
Kalahari to the Dogon of Mali to Arabic Tunisian songs. 1015 songs were labelled as Black
African or Old High Culture (OHC: see Lomax 1968), both groupings containing African
songs. OHC refers to an Afro-Eurasian style region spreading from North Africa to East Asia
and Malaysia.
A2.2.4.1. West Africa - AF
319 out of 763 songs were left in the analysis. These included songs from areas of West
Africa of particular interest because of their links with the slave trade and a qualitative study
(Kubik 1999) on the origin of blues music.
Song Regions Discarded
505 – East Africa
511 – Cape Africa
513 – Botswana/Zimbabwe region
515 – Malawi/Mozambique region
516 – Madagascar
619 – Horn of Africa
Song Regions Included
501 – Western Senegambia
503 – Central Sudanic region
517 – Equatorial west Africa
519 – Guinea Coast
615 – Central Sahara Africa
A2.2.4.2. African/Middle Eastern - NAF
An Islamic influence has been postulated to have given rise to some aspects of the blues
(Kubik 1999). Songs from North Africa, the Maghreb and the Middle East, which are
characterised by the ancient Islamic roots of this area, were therefore included as a group to
see how they compared to African American and American songs.
A group of 251 songs from North Africa, more specifically the Maghreb (Morocco,
Algeria and Tunisia), and the Middle East were included as a separate group. This area has a
mix of influences itself, as well as being the major travel route between the exotic East and the
West during the first half of the last millennium. This is reflected in the Cantometrics literature:
these areas belong to the ‘Old High Culture’ style region, an area depicted as being one of
the major centres of civilisation.
613 - North Algeria / Morocco; 615 - North Central Sahara; 622 - Arabian Peninsula; 630 - Mesopotamia.
A2.2.5. Western European Songs - WEU
Although the Cantometrics database contains 758 songs from all over Europe only 153 of
them are from Western European nations, such as Great Britain and France, which were
believed to be the most important European influence in the evolution of African American
music, as these countries populated much of North America during the 18th and 19th
centuries. Celtic, French, English and Welsh songs were included, while Spanish, central
European, Scandinavian, Nordic and Eastern European songs were discarded. This left 105
out of the 153 total Western European songs in the analysis.
Song Regions Discarded
601 – Central Russia, Central Europe
All songs from Eastern Europe, Scandinavia and
Russian Countries
603 – North European
Songs from Denmark, Holland, Finland, Gypsy,
Faeroe Isles, Norway
607 – West Mediterranean
Songs from the Mediterranean rim, and islands
within
609 – Iberian Peninsula
Spanish and Portuguese songs
Song Regions Used
603 – North European
French: Western and Breton, English, Scottish,
Welsh, Irish
A2.2.6. Overseas European Songs - OEU
Songs from European colonised cultures in North America were included in the analysis.
These were important as they mark the European types of style that would have influenced
African Americans. These include Hill Billy, French Canadian, songs from Nova Scotia, white
US popular music from the 1940s and 1950s and songs from the Kentucky mountains. Songs
from Hispanic colonies, British and French colonies in non-north American countries and
other European colonies in Asia were discarded. There were 86 out of 191 overseas
European songs included in the analysis.
Cultures with Songs Discarded
405 - Australia
British Colonial Songs from Australia
604 – North-East US seaboard
French Colonial Songs from the French West
Indies
611 – Chile, Argentina, Brazil,Mexico
All Hispanic Colonial Songs
Cultures with Songs Used
604 – North-East US seaboard
French Colonial Songs: French Canadian, Nova
Scotia, US Pop 1940s, 1952, Northern US,
Newfoundland, Shaker
605 – US
British Colonial Songs: Kentucky Mountains, Hill
Billy, Laacadian, Southern US
A3 PROBLEMS WITH THE CANTOMETRICS DATASET
A3.1. Introduction
The Cantometrics dataset contains song profiles for over 5000 songs, each having been
scored for the 37 Cantometric variables. This represents a large dataset. Further inspection of
the dataset, the variables that were scored and the way in which they were scored showed
serious biases and problems that required attention before the analysis could continue. As
such, a renewed dataset of 44 variables was produced by Jonathan Swire, and this can be
seen in comparison to the original in table A3.1. The nine major problems are listed below
with the remedial action that was taken to produce a more independent, unbiased and robust
dataset.
NB Scores for all variables can be expressed both as a number and as a description. For
example in the new variable 1 (see table) Number of Singers, a score of “1” means “no
singers”. These have been used interchangeably during the following explanation, depending
on which description carries the most salient meaning.
A3.2. Problems with the data
A3.2.1. Illegal values filled for some variables
Only three of the old variables have 13 character states. All of the rest have between 3 and 9
states and these are all represented on a scale of 1-13 so there are gaps between the values
on the scale. For example old variable 5 “Tonal blend of the vocal group” has five states, from
“1” = “No blend” to “13” = “Maximal blend”. The three other possible states are “4” = “Minimal
blend”, “7” = “Medium blend” and “10” = “Good blend”. Occasionally on scales such as this an
illegal value, such as “3” or “8” was scored. Without going back to the actual songs it is
impossible to know what state the variable should actually be scored as. In these cases the
variable was eliminated from the profile and scored as “0”.
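The check described above can be sketched as follows. This is a minimal illustration only, not the procedure actually used to build the new dataset: the function name `clean_state` and the `LEGAL_STATES` lookup are hypothetical, and only the legal states of old variable 5 quoted above are taken from the text.

```python
# Legal states for old variable 5 ("Tonal blend of the vocal group"),
# as listed in the text. The lookup table itself is an illustrative assumption.
LEGAL_STATES = {5: {1, 4, 7, 10, 13}}

NO_DATA = 0  # score used when a value cannot be trusted


def clean_state(variable: int, score: int) -> int:
    """Return the score if it is a legal state for the variable, else NO_DATA."""
    return score if score in LEGAL_STATES[variable] else NO_DATA


print(clean_state(5, 7))  # legal state, kept as 7
print(clean_state(5, 8))  # illegal state, replaced by 0
```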
A3.2.2. Impossible values filled for some variables
Sometimes the data in the same profile contradicted itself. For example a song might be
scored as having “no instruments” for one variable and then for another variable it might be
scored as having “instrumental rubato”. Clearly if there were no instruments in the song then it
would be impossible for them to be scored as having “rubato”. In these cases the number of
singers and number of instruments in the orchestra were treated as being the most important
variable. Thus, if a song profile was said to have only “one singer”, but the “rhythmic blend”
was scored as being “complex” or “polyrhythmic” then the “one singer” score remained and
the “rhythmic blend” score was discarded.
A3.2.3. Many variables score absence as a value
Various sub-groups of variables are concerned with details of the singing group and of the
orchestra. In the cases where a variable has been scored indicating that there are no singers
(or no orchestra), the variables in these subsequent sub-groups are inconsistently scored as
either “1” or “0”. For all variables, “0” is scored when there is either no data available or the
variable is not relevant, as in this case. Thus these explanatory variables should always be
scored as “0” in these cases.
A3.2.4. Ambiguous codings
Some of the original codings for the variables lead to ambiguity. For example, for old variable
12, “1” can mean either “no singer”, “one singer at a time” or “no rhythmic coherence
between singers”. This is clearly not helpful. In these cases the variables have been split up
in order to extract the useful information (see table A3.1).
A3.2.5. Some variables are in fact unordered
The Cantometrics project regarded all states of the variables as ordered. They are scaled
according to level of integration (after Margaret Mead), but in some cases they clearly are not ordered.
For example, old variable 35, “Raspiness”, has five states from “1” = “Extreme raspiness” to
“5” = “Voices that lack rasp”. This is an ordered variable. However it is difficult to see how
other variables, such as old variable 18, “Number of Phrases”, in which the states are “1” =
“more than eight phrases before a full repeat”, through “8” = “Three or six phrases
asymmetrically arranged”, to “13” = “One or two phrases symmetrically arranged”, can truly
be said to be ordered.
The new set of variables (table A3.1.) takes this into account and splits up some
variables and also takes account of whether in fact the variable, or part of the variable, is
ordered or unordered. In the new variable dataset there are 8 unordered variables and 36
ordered.
A3.2.6. Some variables, which are unordered according to the criterion above, can be
made ordered by splitting
The new variables dataset, which explicitly states whether each variable is ordered or
unordered, deals with this (see also problem A3.2.5 above).
A3.2.7. The same information occurs in more than one variable
In the old variables 2 and 3, “Rhythmic relationship between the voice and orchestra” and
“Social organisation of the orchestral group”, the same piece of information is measured
twice. For both variables state “1” is “No orchestration”. This is a form of pseudoreplication,
and is remedied in the new variables dataset by recording the information for one variable
once as “1”, and then setting it to “0” for all other potentially pseudoreplicated variables.
A3.2.8. Allele values cannot always be evenly spaced over the 13 possibilities
In the old variable dataset, there were 3 variables with 13 states, and the rest (34) had
between 3 and 9. However “1” was always set to be the first state and “13” was always set as
the last state in the progression. This led to unequal spacing between the character states.
For example, old variable 24, “Tempo”, had 5 states from “1” = “Extremely slow” to “13” =
“Very fast”. However, the spaces between the three intermediate states were not always the
same. Tempo is an ordered variable, and as such the absolute differences between the states, rather
than just the fact that they differ, are important. If the gaps are not equal then this
will lead to spurious calculations. Therefore in the new variable dataset the variable states
were set to the scale of the number of states in that variable. So in the example above, for
“Tempo”, “1” = “Extremely slow” and “5” = “Very fast”.
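As an illustration of this rescaling, the sketch below maps the legal states of a variable on the old 1-13 scale onto consecutive integers. The helper `rescale` is hypothetical, and the unevenly spaced intermediate states used in the example are invented for illustration, since the text does not list the original values.

```python
def rescale(old_states: list[int]) -> dict[int, int]:
    """Map each old state to its rank (1..k) among the variable's legal states."""
    return {old: rank for rank, old in enumerate(sorted(old_states), start=1)}


# Hypothetical, unevenly spaced states for a five-state variable on the 1-13 scale.
mapping = rescale([1, 3, 6, 9, 13])
print(mapping)  # {1: 1, 3: 2, 6: 3, 9: 4, 13: 5}
```

After rescaling, adjacent states always differ by exactly 1, so absolute differences between states are comparable within the variable.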
A3.2.9. Ambiguity over polyalleles
Sometimes a song was given two character states for one variable. This could either be
because the scorer was uncertain of the exact value to ascribe to the variable and entered both
states to code that the state lay somewhere in between the two, or it could be that
both of the character states occur at different stages during the song. These are very
different explanations for the same result.
In order to fully remedy this problem in the new dataset, it is necessary to restructure
all of the data, which is a computationally and time intensive exercise. However, in the case
where there are two polyalleles, averaging the two does not make a difference to the overall
rating for the song, so the explanation as to why it is polyallelic does not matter. In the cases
where there are more than two polyallelic states it does make a difference however. These
amount to a very small number of comparisons in total, 0.29% of the total possible
comparisons. For example there are only 5 songs in the whole dataset (>5000 songs) which
have 4 polyallelic states. Therefore the effects of these comparisons will be negligible to the
overall analysis and as such it was decided that the analysis should continue with the new
variable dataset as it is.
A3.3. 0 or ‘no data’ states
In order to alleviate some of the problems described above a new state value of 0 was
introduced to the new variable dataset. This score was given when there was no data or the
variable was irrelevant, such as those regarding orchestration for songs with no orchestra.
When a 0 state was scored for a song, the distance matrix program gave the song the mean
value for all comparisons for that variable across all songs included in the distance matrix
building program.
Table A3.1. Table to show new variables compared to the original Cantometric measures
The new variables were produced in response to the various problems discussed in the text above. This
involved splitting up four of the original variables to produce more robust divisions that use the
information from the Cantometrics dataset in a more logical fashion.
New = new variable number; O/U = ordered (O) or unordered (U); Old = Cantometrics variable number.

New  Variable description                                     O/U  Old  Cantometrics variable description
1    Number of singers                                        O    1    The vocal group
2    Form of the vocal part                                   U    1    The vocal group > one singer
3    Form of overlap alteration                               O    1    The vocal group > one singer, type of alteration
4    Musical organisation of the voice                        O    4    Basic musical organisation of the voice part
5    Polyphonic type                                          O    22   Polyphonic type
6    Tonal blend of vocal group                               O    5    Tonal blend of vocal group
7    Rhythmic blend of vocal group                            O    6    Rhythmic blend of vocal group
8    Rhythmic relationship within vocal group                 U    12   Rhythmic relationship within vocal group
9    Melodic shape of vocal line                              U    15   Melodic shape
10   Melodic form of vocal line                               U    16   Melodic form
11   Type of litany                                           O    16   Melodic form - type of litany
12   Variation of litany/strophe                              O    16   Melodic form - amount of variation
13   Phrase length                                            O    17   Phrase length
14   Number of phrases                                        O    18   Number of phrases
15   Symmetry of phrases                                      U    18   Number of phrases - symmetry
16   Position of final tone                                   O    19   Position of final tone
17   Range                                                    O    20   Range
18   Interval width                                           O    21   Interval width
19   Embellishment                                            O    23   Embellishment
20   Tempo                                                    O    24   Tempo
21   Volume                                                   O    25   Volume
22   Rubato in voice part                                     O    26   Rubato in voice part
23   Glissando in voice part                                  O    28   Glissando in voice part
24   Melisma in voice part                                    O    29   Melisma in voice part
25   Tremolo in voice part                                    O    30   Tremolo in voice part
26   Glottal shake in voice part                              O    31   Glottal shake in voice part
27   Register                                                 O    32   Register
28   Vocal width                                              O    33   Vocal width
29   Nasalization                                             O    34   Nasalization
30   Raspiness                                                O    35   Raspiness
31   Accent                                                   O    36   Accent
32   Enunciation of consonants                                O    37   Enunciation of consonants
33   Words to nonsense                                        O    10   Words to nonsense
34   Overall rhythmic scheme                                  O    11   Overall rhythmic scheme
35   Size of orchestra                                        O    3    The instrumental group
36   Form of the instrumental group                           U    3    Form of the instrumental group
37   Form of overlap alteration - orchestra                   O    3    Alteration in the instrumental group
38   Tonal blend of orchestra                                 O    8    Tonal blend of orchestra
39   Rhythmic blend of orchestra                              O    9    Rhythmic blend of orchestra
40   Overall rhythmic relationships - orchestra               U    14   Rhythmic relationship within the accompaniment
41   Musical organisation of orchestra                        O    7    Basic musical organisation of the orchestra
42   Overall rhythmic structure of accompaniment              O    13   Overall rhythmic structure of accompaniment
43   Rubato in instruments                                    O    27   Rubato in instruments
44   Dominance relationship between orchestra and vocal part  U    2    Dominance relationship between orchestra and vocal part
A4 THE DISTANCE MATRIX PROGRAM
A4.1. Introduction
An algorithm was designed by Jonathan Swire that would produce a distance matrix for the
dataset containing the 44 new variables. This produces pair-wise comparisons, over each of
the 44 variables, for all 5000 songs. The algorithm will be described here for the distance
matrix for all of the songs, however in the analysis of songs in the present study a new
distance matrix was made in the same way for each of the 50 sub-samples of songs taken
from the overall song sample.
The absolute distances between all variables, rather than the sum of squares, were
used as the score for the comparisons between the songs. The variables are not orthogonal,
thus violating the assumption of independence between the variables. It is therefore correct to
use absolute values.
A4.2. Pair-wise comparisons
For each of the different variables in each of the different songs, a field was produced. The
field shows which of the character states was marked for each of the song variables. In the
example below, Field 1, which represents the first variable, has five character
states (1, 2, 3, 4, 5), in which SongA was rated as “2” and SongB as “4”.
Field 1
State   1  2  3  4  5
SongA   0  1  0  0  0
SongB   0  0  0  1  0
This constitutes one comparison for one variable between two songs.
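A field of this kind can be represented as a list of 0/1 indicators, one per character state. The sketch below is illustrative only; the helper name `make_field` is an assumption and is not taken from the original program. It accepts a set of states so that a song scored for more than one state can also be represented.

```python
def make_field(states: set[int], n_states: int) -> list[int]:
    """Return a 0/1 indicator list with a 1 at each marked state (states are 1-based)."""
    return [1 if s in states else 0 for s in range(1, n_states + 1)]


# Field 1 from the text: SongA rated "2", SongB rated "4", five possible states.
song_a = make_field({2}, 5)
song_b = make_field({4}, 5)
print(song_a)  # [0, 1, 0, 0, 0]
print(song_b)  # [0, 0, 0, 1, 0]
```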
A4.3. Producing a similarity score
A4.3.1. Ordered and un-ordered variables
Of the total 44 variables, eight were classed as unordered and the remaining 36 were
ordered. These are treated differently in the distance matrix programme. If the variable under
consideration was ordered, as in the majority of variables, then the absolute distance between
the two states of the field constitutes the difference between the two songs. So in this case
the difference between SongA and SongB is 2. If the variable is unordered then what matters
is only whether the two songs differ in the state of this variable, as they do in this case. If this
was an unordered variable the difference between the two songs would be given an assigned
value. The assigned value could be chosen in advance and in the analysis was always 4. In
this example, and those below, 4 will be used as the assigned value.
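The rule for single-state comparisons can be written compactly. The following is a minimal sketch under the assumptions stated in the text (absolute difference for ordered variables, a fixed assigned value of 4 for unordered ones); the function name `difference` is hypothetical.

```python
ASSIGNED_VALUE = 4  # value used for unordered variables in the analysis


def difference(state_a: int, state_b: int, ordered: bool) -> int:
    """Difference between two single-state scores for one variable."""
    if ordered:
        return abs(state_a - state_b)
    return 0 if state_a == state_b else ASSIGNED_VALUE


print(difference(2, 4, ordered=True))   # 2
print(difference(2, 4, ordered=False))  # 4
print(difference(2, 2, ordered=False))  # 0
```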
In the following example, where the two songs were both rated as 2, the difference
is 0, regardless of whether the variable is ordered or not.
Field 2
State   1  2  3  4  5
SongA   0  1  0  0  0
SongB   0  1  0  0  0
Most of the comparisons worked on this basis, however there are a few ways in which the
comparisons need further explanation.
A4.3.2. Songs with variables with no state scored
Occasionally songs received no score for a particular variable. This could happen for a
number of reasons. These are discussed in appendix A3, and include variables in which an
illegal value was scored, variables in which an impossible value was scored, variables in
which the context is irrelevant and cases of pseudoreplication. A comparison between two
songs in this way might look like the following.
Field 3
State   1  2  3  4  5
SongA   0  0  0  0  0
SongB   0  1  0  0  0
For both ordered and unordered variables, a blank is left until all the songs have been
compared for this variable and then the mean difference is taken for the value of the field,
over all of the songs. This value is then used to compare SongA with SongB for this variable.
In this example, if the mean value for the pair-wise comparisons was 1.9, then this is the
value that is given to the comparison between SongA and SongB for field 3.
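One way to picture this two-pass treatment of missing scores is sketched below. It is an interpretation rather than the original program: it assumes an ordered variable, uses 0 to mark "no state scored", and assumes the mean is taken over those comparisons in which both songs have a score. The function name `field_differences` and the example scores are hypothetical.

```python
from itertools import combinations
from statistics import mean


def field_differences(states: list[int]) -> dict[tuple[int, int], float]:
    """Pair-wise differences for one ordered variable; 0 marks 'no state scored'.

    Pairs involving a missing state receive the mean of all scored comparisons.
    """
    pairs = list(combinations(range(len(states)), 2))
    # First pass: score only the pairs where both songs have a state.
    scored = {
        (i, j): float(abs(states[i] - states[j]))
        for i, j in pairs
        if states[i] != 0 and states[j] != 0
    }
    # Second pass: fill the blanks with the field mean.
    fill = mean(list(scored.values()))
    return {pair: scored.get(pair, fill) for pair in pairs}


# Four hypothetical songs; the first has no state scored for this variable.
diffs = field_differences([0, 2, 4, 5])
print(diffs[(0, 1)])  # 2.0, the mean of the scored differences 2, 3 and 1
```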
A4.3.3. Songs with variables with polyallelic states
A polyallelic score results from a song being scored twice for the same variable (see appendix
A3). There are three possible ways in which comparisons of this sort might occur. A song with
a variable in a polyallelic state might be compared to a song with just one state for the
variable. In another case, both songs might have variables with similar polyallelic states.
Another case might occur when variables of songs are compared that have different numbers
of polyallelic states. These will now be considered in turn.
Field 4
State   1  2  3  4  5
SongA   1  0  0  0  1
SongB   0  0  0  0  1
In the above example SongA is biallelic for field 4, with states “1” and “5”, whereas SongB has only
one state, “5”. If the variable producing field 4 is ordered then in these cases both the
differences are taken, which are 4 and 0, and then divided by the total number of comparisons,
which is 2. (4+0)/2 = 2. So the value of the difference between these two songs for this
variable is 2.
If the variable is unordered then a similar calculation is performed except the
assigned value is used for any difference that is not 0. So in this example the total difference
is (0+assigned value)/2. (0+4)/2 = 2.
Another example shows a song with a variable in a triallelic state.
Field 5
State   1  2  3  4  5
SongA   1  0  1  0  1
SongB   0  1  0  0  0
The calculation follows in the same way as above. For an ordered variable the comparisons
yield (1+1+3) = 5, which is then divided by the total number of comparisons, which is 3. 5/3 =
1.7. For an unordered variable the calculation is 3 times the assigned value, 3x4 = 12, divided
by the number of comparisons, which is 3. 12/3 = 4.
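The triallelic calculation can be checked with a short sketch. It covers only the case where at most one of the two songs is polyallelic; comparisons in which both songs are polyallelic follow the rules described in the remaining examples and are deliberately not reproduced here. The function name `poly_difference` is hypothetical, and each song is assumed to have at least one state scored.

```python
ASSIGNED_VALUE = 4  # assigned value for unordered variables, as in the text


def poly_difference(states_a: set[int], states_b: set[int], ordered: bool) -> float:
    """Mean difference when at most one of the two songs is polyallelic."""
    if len(states_a) > 1 and len(states_b) > 1:
        raise ValueError("both songs polyallelic: not covered by this sketch")
    total = 0
    count = 0
    for a in states_a:
        for b in states_b:
            if ordered:
                total += abs(a - b)
            else:
                total += 0 if a == b else ASSIGNED_VALUE
            count += 1
    return total / count


# Field 5: SongA scored 1, 3 and 5; SongB scored 2.
print(round(poly_difference({1, 3, 5}, {2}, ordered=True), 1))  # 1.7
print(poly_difference({1, 3, 5}, {2}, ordered=False))           # 4.0
```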
Now consider the following example where both SongA and SongB have biallelic states.
Field 6
State   1  2  3  4  5
SongA   1  0  1  0  0
SongB   0  1  1  0  0
SongA is similar to SongB for one state, “3”, and different for the other states. The calculation
for an ordered variable is (0+1)/2 = 0.5. For an unordered variable it is (0+assigned value)
again divided by the number of comparisons, so (0+4)/2 = 2.
One final example shows comparisons between songs with variables with different numbers
of polyallelic states. SongA is triallelic and SongB is biallelic.
Field 7
State   1  2  3  4  5
SongA   1  0  1  0  1
SongB   0  1  1  0  0
In this case, for an ordered variable, all possible comparisons are made between the two
songs and the sum of these is divided by the number of comparisons, so (1+2+1+0+2+3)/6 =
1.5. For an unordered variable it is the number of different states multiplied by the assigned
value, divided by the number of comparisons, which is 4, so (0+3x4)/4 = 3.
A4.3.5. Controlling for fields of different lengths
In all of the above examples the variables had 5 states; however, this is not true for all of the
variables. The number of states will make a difference to the absolute values; for example, a
variable with 8 states has more potential for producing higher difference scores than a
variable with 5 or 3 states. In order to give equal weight to each field it is therefore necessary
to control for this. After the algorithm has produced this first round of difference scores for
each of the variables for each of the pair-wise comparisons, it then corrects the scores over
another run by dividing each difference score by the mean score for the whole field.
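The correction step amounts to rescaling each field so that its mean difference is 1. A minimal sketch follows; the function name `normalise_field` and the example scores are hypothetical, and the sketch assumes the field mean is not zero.

```python
from statistics import mean


def normalise_field(diffs: list[float]) -> list[float]:
    """Divide every difference score in a field by the field's mean score."""
    field_mean = mean(diffs)
    return [d / field_mean for d in diffs]


# Hypothetical first-round scores for a 3-state and an 8-state variable.
narrow = normalise_field([0, 1, 2, 1])
wide = normalise_field([0, 3, 7, 2])
print(narrow)
print(wide)
```

After this step both fields have a mean difference of 1, so the 8-state variable no longer dominates the summed distance simply because its scale is longer.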
A4.4. Producing the matrix
For each pair-wise comparison between two songs, the differences over all the variables
are summed to give a single value for the difference between the two songs. This is then
entered into the resulting distance matrix. The matrix is then run through the JS clustering
algorithm.
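The summation step can be sketched as below. This is a deliberately simplified illustration: it treats every variable as ordered with a single state per song and omits the missing-value, polyallelic and field-length corrections described above. The function name `distance_matrix` and the example profiles are hypothetical.

```python
def distance_matrix(profiles: list[list[int]]) -> list[list[int]]:
    """Sum absolute per-variable differences for every pair of song profiles."""
    n = len(profiles)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            total = sum(abs(a - b) for a, b in zip(profiles[i], profiles[j]))
            matrix[i][j] = matrix[j][i] = total  # the matrix is symmetric
    return matrix


# Three hypothetical songs scored on three ordered variables.
songs = [[2, 1, 5], [4, 1, 3], [2, 2, 5]]
for row in distance_matrix(songs):
    print(row)
```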
A5 THE JS CLUSTERING ALGORITHM
A5.1. Introduction
A clustering algorithm, designed by Jonathan Swire, was programmed to produce groups of
songs where the within group variance of the distance scores was minimised. The number of
clusters, K, is specified in advance and runs were made with several different values of K.
A5.2. Using the distance matrix
The matrix produced by the distance matrix program was loaded into the clustering
programme. This contained the unique pair-wise comparisons between each of the n songs; in
the case of the full dataset n = 5000. For each run of the programme the following factors
could be specified: K; the number of traverses across the matrix, set to 40; and the number of
iterations per traverse, also set to 40.
A5.3. Running the algorithm
K=2
The clustering algorithm first produces n random numbers between 0 and 1 and assigns
one number to each of the songs. When the random number assigned to a song is greater
than 0.5, the song is given the number 1, and if the random number is less than 0.5 then
the song is given a 0. Thus, each of the n songs is initially placed at random in group 1 or 0.
The within group variance in distance scores is then calculated for group 0 and group 1 by
summing the scores for each of the pair-wise comparisons between all members of a group.
The two variances are then summed to give a single variance value for the first pass.
The algorithm then switches the group of the first song only and the within group
variances for the two new groups are calculated. These are again summed to give a single
variance value for the second pass. If this total variance value is less than the value given
by the first random allocation then the first song remains in the switched group. If the total
variance is greater than the total variance calculated for the first pass then the first song is put
back into its original group, as the switch has increased the within group variance. The second
song is then switched and exactly the same calculations are made to establish whether the
switch has decreased the within group variance. This is then repeated for each of the n
songs. Having passed through the matrix switching each song once, the whole process is
repeated until a full pass occurs without any switching.
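The switching procedure described above for K=2 might be sketched as below. This is an illustrative reconstruction under stated assumptions, not the JS program itself: the function names, the `max_passes` safety limit and the decision to put a song back when a switch leaves the total unchanged are all assumptions. The "variance" is computed, as in the text, as the sum of pair-wise distances within each group.

```python
import random

def within_group_total(matrix, groups):
    """Sum of pair-wise distances between songs that share a group."""
    n = len(groups)
    return sum(matrix[i][j]
               for i in range(n) for j in range(i + 1, n)
               if groups[i] == groups[j])

def cluster_two_groups(matrix, rng=None, max_passes=1000):
    """Greedy single-song switching for K=2 (hypothetical sketch)."""
    rng = rng or random.Random()
    n = len(matrix)
    # Random start: a number above 0.5 places the song in group 1, else group 0.
    groups = [1 if rng.random() > 0.5 else 0 for _ in range(n)]
    best = within_group_total(matrix, groups)
    for _ in range(max_passes):
        switched = False
        for song in range(n):
            groups[song] = 1 - groups[song]      # try the switch
            trial = within_group_total(matrix, groups)
            if trial < best:                     # keep only strict improvements
                best = trial
                switched = True
            else:
                groups[song] = 1 - groups[song]  # put the song back
        if not switched:                         # a full pass with no switching
            break
    return groups, best
```

Because only single switches are tried, the result is guaranteed to be a local optimum only; with the same distances, different random starts can end in different groupings.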
Within this process a second degree of randomisation was added. Two vectors were
produced: a vector of the n song numbers and a vector of n random numbers. These were used
as an index with which the algorithm traversed the matrix. Each time a new pass was made,
the algorithm started with a different song and switched the songs in a different order. This was
necessary to reduce the likelihood of the algorithm finding the best local solution rather than
the best overall solution.
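One way to turn the two vectors into a traversal order is to sort the song numbers by their paired random numbers. The text does not say how the index was built, so this is an assumption, and the function name `random_traversal_order` is hypothetical.

```python
import random

def random_traversal_order(n, rng=None):
    """Return the song numbers 0..n-1 in a random visiting order."""
    rng = rng or random.Random()
    song_numbers = list(range(n))
    random_keys = [rng.random() for _ in range(n)]
    # Sorting the songs by their random keys gives a fresh order on every call.
    return sorted(song_numbers, key=lambda song: random_keys[song])
```

Calling this once per pass means each pass starts with a different song and visits every song exactly once.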
For each run of the algorithm, the two worst and the two ‘best’ solutions were
recorded, together with the number of times each was reached. Confidence in the results was
assessed by comparing the number of times the best solution was found and the distance
between the best two results. If the best solution was found on a number of occasions and the
second best solution was closer to the best than to the worst, it can be assumed
that the best solution is a fair estimate of the structure of the songs. If the best solution is only
produced once and the second best solution is not close to it then this implies that the best
solution was ‘lucky’ and the likelihood of it being the true best solution is low.
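The bookkeeping behind this confidence check could look like the sketch below. It assumes the total within group variance of every solution reached in a run has been collected in a list; the function name, the returned keys and the rounding used to decide that two totals are the same solution are all hypothetical.

```python
from collections import Counter

def summarise_runs(totals, ndigits=6):
    """Tally how often each solution total was reached (hypothetical helper)."""
    counts = Counter(round(t, ndigits) for t in totals)
    ranked = sorted(counts)              # smallest total = best solution
    best, worst = ranked[0], ranked[-1]
    second = ranked[1] if len(ranked) > 1 else best
    return {
        "best": best,
        "best_count": counts[best],
        "second_best": second,
        "worst": worst,
        # True when the second best lies nearer the best than the worst.
        "second_closer_to_best": (second - best) < (worst - second),
    }
```

A high `best_count` together with `second_closer_to_best` being true corresponds to the case the text treats as a fair estimate; a `best_count` of 1 with a distant second best corresponds to the ‘lucky’ case.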
K>2
In these cases the algorithm worked in exactly the same way except for the random
assignment of groups at the beginning of the process. K random groups were produced by
dividing 1 by K to produce a number that represented the upper boundary for the first group of
random numbers, i.e. the first group was made from all songs with random numbers between
0 and 1/K. This was group 0. Group 1 was found by converting all numbers between 1/K and
2*(1/K) into a 1. Group 2 was found by converting all numbers between 2*(1/K) and 3*(1/K)
into a 2. This was continued until K*(1/K), which equals 1, was reached. For example, when K=4,
group 0 is all songs with random numbers between 0 and 1/4 = 0.25. Group 1 is all songs with
numbers between 1/4 = 0.25 and 2*(1/4) = 0.5. Group 2 is all songs between 2*(1/4) = 0.5 and
3*(1/4) = 0.75. Finally group 3 is all songs between 3*(1/4) = 0.75 and 4*(1/4) = 1. The grouping
now ends, having reached K*(1/K).
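The boundary scheme above is equivalent to multiplying each random number by K and taking the whole-number part. A minimal sketch follows; the function names are hypothetical, and placing a number that falls exactly on a boundary in the upper group is an assumption, since the text does not cover that case.

```python
import random

def group_for(r, k):
    """Map a random number r in [0, 1) to a group 0..k-1 using boundaries of width 1/k."""
    # min() guards against r == 1.0, which random.random() never returns.
    return min(int(r * k), k - 1)

def initial_groups(n, k, rng=None):
    """Randomly assign each of n songs to one of k starting groups."""
    rng = rng or random.Random()
    return [group_for(rng.random(), k) for _ in range(n)]
```

For K=4 this places 0.1 in group 0, 0.3 in group 1, 0.6 in group 2 and 0.9 in group 3, matching the worked example.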
A6 GRAPHS SHOWING VARIATION ACROSS EACH OF THE 44 VARIABLES
FOR ALL CULTURAL REGIONS
Table A6.1 below gives the results of Chi-squared tests for goodness of fit for all 44 variables.
These show that, for every variable, the numbers of songs in each state were significantly
different.
Table A6.1. Chi-squared results and % of songs with no data and in a polyallelic
state for all 44 variables

Variable   % of songs     % of polyallelic   df   Chi-squared
number     with no data   songs                   p value
1          0.3            0.5                12   7.6E-44
2          36.3           15.7               42   8.0E-75
3          70.3           1.1                12   1.8E-04
4          31.6           2.9                24   7.3E-50
5          31.3           2.0                30   5.7E-52
6          33.2           1.9                24   4.4E-64
7          33.2           2.0                24   3.8E-56
8          34.0           4.6                36   1.8E-33
9          2.3            10.0               18   6.9E-83
10         1.7            2.9                12   5.3E-18
11         8.4            4.4                18   2.1E-101
12         8.7            7.6                12   3.8E-55
13         1.9            24.3               24   4.9E-48
14         1.5            5.2                24   4.2E-103
15         24.8           0.9                6    5.1E-18
16         1.9            3.8                24   6.4E-14
17         1.4            0.5                24   5.0E-39
18         1.6            5.4                24   3.1E-176
19         1.3            0.4                24   3.0E-160
20         1.5            5.0                30   4.3E-36
21         1.6            10.6               24   1.1E-17
22         1.4            3.0                18   1.3E-70
23         1.4            0.5                18   2.6E-93
24         1.4            1.1                12   9.0E-61
25         1.4            1.1                12   3.8E-74
26         1.5            0.6                12   8.1E-114
27         2.3            41.0               24   5.2E-18
28         2.1            28.3               30   3.9E-125
29         1.6            7.7                24   1.3E-60
30         1.5            9.9                24   8.3E-12
31         1.5            8.1                24   3.1E-34
32         1.5            7.0                24   6.0E-125
33         1.4            1.7                24   1.2E-112
34         1.5            3.7                24   1.4E-105
35         0.2            3.1                18   8.3E-64
36         53.1           2.4                42   1.1E-30
37         82.0           0.1                12   5.0E-28
38         52.3           0.1                24   3.4E-16
39         52.2           0.5                24   4.1E-34
40         53.2           2.3                42   3.2E-52
41         31.5           0.9                24   6.6E-70
42         32.5           3.6                30   3.1E-80
43         31.9           2.1                18   2.6E-22
44         34.2           3.1                24   1.8E-64
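The calculation behind these tests is not given. As a hedged illustration only, the sketch below computes a Pearson chi-squared statistic and its degrees of freedom for a table of song counts with one row per cultural region and one column per state. Treating each test as a region-by-state table is an assumption: it is suggested by the df values, which are all multiples of 6, as would be expected with 7 regions and df = 6 x (number of states - 1). The function name and the example counts are hypothetical.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and df for a table of counts.

    table: list of rows (regions), each a list of counts per state.
    Assumes every row and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if state frequencies were the same in every region.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df
```

For example, a table with 7 regions and 3 states gives df = 12, the value listed for variable 1. Converting the statistic to a p value requires the chi-squared distribution, which is not covered by this sketch.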
Figures to show variation across the states for each of the 44 variables
The following 44 figures, 1-44, each represent one of the 44 variables involved in the analysis.
They show the variation across the states of each variable for the different a priori defined
cultural regions.
The state descriptions are noted on the x-axis of each figure, but in the analysis each state is
represented by a number. Songs with ‘no data’ are missing a score for the
variable in question, or the variable is not relevant to the song (see appendix A3). Polyallelic
states are all songs that were scored more than once for the same variable (again this is
discussed in appendix A3).
[Figures 1-44: bar charts, one per variable, with the states of the variable on the x-axis
("state") and "frequency" on the y-axis, and one series for each of the cultural regions AAF,
AI, AF, CAF, OEU, WEU and NAF. The charts are not reproduced here; their captions follow.]
Fig 1. Variable 1: No of Singers
Fig 2. Variable 2: Form of the Vocal Part
Fig 3. Variable 3: Form of Overlap Alteration
Fig 4. Variable 4: Musical Organisation of the Voice
Fig 5. Variable 5: Polyphonic Type
Fig 6. Variable 6: Tonal Blend of the Vocal Group
Fig 7. Variable 7: Rhythmic Blend of the Vocal Group
Fig 8. Variable 8: Rhythmic relationship within the Vocal Group
Fig 9. Variable 9: Melodic Shape of the Vocal Line
Fig 10. Variable 10: Melodic Form of the Vocal Line
Fig 11. Variable 11: Type of Singing Form
Fig 12. Variable 12: Variation of Litany/Strophe
Fig 13. Variable 13: Phrase Length
Fig 14. Variable 14: Number of Phrases
Fig 15. Variable 15: Symmetry of Phrases
Fig 16. Variable 16: Position of the Final Note
Fig 17. Variable 17: Range
Fig 18. Variable 18: Interval Width
Fig 19. Variable 19: Embellishment
Fig 20. Variable 20: Tempo
Fig 21. Variable 21: Volume
Fig 22. Variable 22: Rubato in the Voice Part
Fig 23. Variable 23: Glissando in the Voice Part
Fig 24. Variable 24: Melisma in the Voice Part
Fig 25. Variable 25: Tremolo in the Voice Part
Fig 26. Variable 26: Glottal Shake in the Voice Part
Fig 27. Variable 27: Register
Fig 28. Variable 28: Vocal Width
Fig 29. Variable 29: Nasalization
Fig 30. Variable 30: Raspiness
Fig 31. Variable 31: Accent
Fig 32. Variable 32: Enunciation of Consonants
Fig 33. Variable 33: Words to Nonsense
Fig 34. Variable 34: Overall Vocal Rhythmic Scheme
Fig 35. Variable 35: Size of Orchestra
Fig 36. Variable 36: Form of the Instrumental Group
Fig 37. Variable 37: Form of Overlap Alteration
Fig 38. Variable 38: Tonal Blend of the Orchestra
Fig 39. Variable 39: Rhythmic Blend of the Orchestra
Fig 40. Variable 40: Overall Rhythmic Relationships in the Orchestra
Fig 41. Variable 41: Musical Organisation of the Orchestral Part
Fig 42. Variable 42: Overall Rhythmic Structure of the Accompaniment
Fig 43. Variable 43: Rubato in the Instruments
Fig 44. Variable 44: Dominance Relationship in between Orchestra and Vocal Part