The Complexity of Songs

SPECIAL SECTION
Guest Editor
Peter Neumann
The Complexity
of Songs
DONALD E. KNUTH
Every day brings new evidence that the concepts of
computer science are applicable to areas of life which
have little or nothing to do with computers. The purpose of this survey paper is to demonstrate that important aspects of popular songs are best understood in
terms of modern complexity theory.
It is known [3] that almost all songs of length n require a text of length ~ n. But this puts a considerable
space requirement on one's memory if many songs are
to be learned; hence, our ancient ancestors invented
the concept of a refrain [14]. When the song has a
refrain, its space complexity can be reduced to cn,
where c < 1 as shown by the following lemma.
LEMMA 1.
Let S be a song containing m verses of length V and a
refrain of length R where the refrain is to be sung first,
last, and between adjacent verses. Then, the space
complexity of S is ( V / ( V + R)) n + O(1) for fixed V
and R as m ~ oo.
PROOF.
T h e l e n g t h of S when s u n g i s
n =R+(V+R)m
(1)
while its space complexity is
c = R + Vm.
(2)
The research reported here was supported in part by the National Institute of
Wealth under grant $262,144.
©1984ACMO001-0782/84/0400-0344 75¢
344
Communications of the ACM
By the Distributive Law and the Commutative Law [4],
we have
c= n= n-
(V+R)m
+ mV
Vm-Rm
+ Vm
(3)
=n-Rm.
The lemma follows.
[3
(It is possible to generalize this lemma to the case of
verses of differing lengths V1, V2. . . . ~Vm, provided that
the sequence (Vk) satisfies a certain smoothness condition. Details will appear in a future paper.)
A significant improvement on Lemma 1 was discovered in medieval European Jewish communities where
an anonymous composer was able to reduce the complexity to O(x/n). His song "Ehad Mi Yode'a" or "Who
Knows One?" is still traditionally sung near the end of
the Passover ritual, reportedly in order to keep the children awake [6]. It consists of a refrain and 13 verses
vl . . . . . v13, where v~ is followed by vk-1 • .. v2vl before
the refrain is repeated; hence m verses of text lead to
1/2m2 .-F O(m) verses of singing. A similar song called
"Green Grow the Rushes O" or "The Dilly Song" is
often sung in western Britain at Easter time [1], but it
has only twelve verses (see [1]), where Breton, Flemish,
German, Greek, Medieval Latin, Moldavian, and Scottish
versions are cited.
The coefficient of ~n was further improved by a Scottish farmer named O. MacDonald, whose construction~
appears in Lemma 2.
ActuallyMacDonald'spriorityhas beendisputedby somescholars;Peter
Kennedy([8],p. 676)claimsthat "1BoughtMyselfa Cock"and similarfarmyard songsare actuallymucholder.
April 1984 Volume 27 Number 4
Special Section
L E M M A 2.
Given positive integers a and X, there exists a song
whose complexity is (20 + X + a) ~/n/(30 + 2)0 + O(1).
PROOF.
Consider the following schema [9].
V o = 'Old MacDonald had a farm, ' Ri
R1 = 'Ee-igh, ,2 'oh! '
R2(x) -- Vo 'And on this farm he had
some' x', ' R1 'With a'
U~(x, x') = x', ' x' ' here and a ' x', ' x ' ' there; '
U2(x, y) = x'here a ' y, ' '
U3(x, x') = Ui(x, x')
U2(g, x)
U2('t', x ' )
U2('everyw', x ' , ' x')
Vk = U3(Wk, W~)Vk-i
for
k _> 1
where
W1 = 'chick', W2 = 'quack',
W3 = 'gobble', Wk = 'oink',
(5)
W5 = 'moo', W6 = 'hee' ,
and
for
W~- = W k
k#6;W~
='haw'.
(6)
The song of order m is defined by
(7)
J~,~ = R2(W;[i)VmJ'~-I
for
m_>l,
where
W~' = 'chicks', W~' = 'ducks',
W3" = 'turkeys', W ~ = 'pigs',
(8)
W~' = 'cows', W~' = ' d o n k e y s ' .
when a is fixed. Therefore if MacDonald's farm animals
ultimately have long names they should make slightly
shorter noises.
Similar results were achieved by a French Canadian
ornithologist, who named his song schema "Alouette"
[2, 15]; and at about the same time by a Tyrolean
butcher whose schema [5] is popularly called "Ist das
nicht ein Schnitzelbank?" Several other cumulative
songs have been collected by Peter Kennedy [8], including "The Mallard" with 17 verses and "The Barley
Mow" with 18. More recent compositions, like "There's
a Hole in the Bottom of the Sea" and "I Know an Old
Lady Who Swallowed a Fly" unfortunately have comparatively large coefficients.
A fundamental improvement was claimed in England
in 1824, when the true love of U. Jack gave to him a
total of 12 ladies dancing, 22 lords a-leaping, 30 drummers drumming, 36 pipers piping, 40 maids a-milking,
42 swans a-swimming, 42 geese a-laying, 40 golden
rings, 36 collie birds, 30 french hens, 22 turtle doves,
and 12 partridges in pear trees during the twelve days
of Christmas [11]. This amounted to 1/6 m 3 -.b 1/2 m 2 -b 1/a
m gifts in m days, so the complexity appeared to be
O(3~n); however, it was soon pointed out [10] that his
computation was based on n gifts rather than n units of
singing. A complexity of order ~/n/log n was finally
established (.see [7]).
We have seen that the p a r t r i ~ in the pear tree gave
an improvement of only 1/Vlog n; but the importance
of this discovery should not be underestimated since it
showed that the n °5 barrier could be broken. The next
big breakthrough was in fact obtained by generalizing
the partridge schema in a remarkably simple way. It
was J. W. Blatz of Milwaukee, Wisconsin who first discovered a class of songs known as "m Bottles of Beer on
the Wall"; her elegant construction2 appears in the following proof of the first major result in the theory.
THEOREM 1.
There exis t songs of complexity O(log n).
PROOF.
Consider the schema
The length of Y (m) is
Vk = TkBW', '
n = 30m 2 + 153m
TkB'; '
+ 4(m/1 + (m - 1)/2 -4- . . . + /m)
+ (al + . . . + am)
If one of those bottles should
(9)
happen to fall, '
while the length of the corresponding schema is
Tk-iB W'.'
c = 20m + 211 + (/1 + . . . + / , , )
+ (al + . . . + a,,).
(10)
April 1984
Volume 27
Number 4
where
B = ' bottles of beer' ,
HereA = IWkl + IW/I and ak = IW~Yl,where Ixl
denotes the length of string x. The result follows at
once, if we assume that ,~ = X and ak = a for all
large k. [3
Note that the coefficient (20 + X + a ) / ~ / ~ + 2X
assumes its m i n i m u m value at
k = max(l, a-lO)
(12)
(13)
W = ' on the wall' ,
and where Tk is a translation of the integer k into Eng-
(11)
2 Again Kennedy ([8], p. 631) claims priority for the English, in this case
because of the song "I'11 drink m if you'll drink m + 1." However, the English
start at m = 1 and get no higher than m = 9, possibly because they actually
drink the beer instead of allowing the bottles to fall.
Communications of the ACM
34S
Special Section
lish. It requires only O(m) space to define Tk for all
k < 10" since we can define
Tq.lo.,.r = Tq ' times 10 to the ' Tin' plus ' T,
Acknowledgment I wish to thank J. M. Knuth and J. S.
Knuth for suggesting the topic of this paper.
(14)
for 1 _< q _< 9 and 0 _< r < 10m-L
Therefore the songs Sk defined by
So=e,Sk=
VkSk-1 for k>_ 1
(15)
have length n X k log k, but the schema which defines
them has length O(log k); the result follows. [3
Theorem 1 was the best result known until recently 3,
perhaps because it satisfied all practical requirements
for song generation with limited memory space. In fact,
99 bottles of beer usually seemed to be more than sufficient in most cases.
However, the advent of modern drugs has led to
demands for still less memory, and the ultimate improvement of Theorem 1 has consequently just been
announced:
THEOREM 2.
There exist arbitrarily long songs of complexity O(1).
PROOF: (due to Casey and the Sunshine Band). Consider
the songs Sk defined by (15), but with
Vk = 'That's the way,' U 'I like it, ' U
(16)
U = 'uh huh,' 'uh huh'
for all k. [3
It remains an open problem to study the complexity
of nondeterministic songs.
3 The chief rival for this honor was "This old man, he played m, he played
knick-knack...'.
346
Communications of the ACM
REFERENCES
1. Rev. S. Baring-Gould, Rev. H. Fleetwood Sheppard, and F.W. Bussell, Songs of the West (London: Methuen, 1905}, 23, 160-161.
2. Oscar Brand, Singing Holidays (New York: Alfred Knopf, 1957), 6869.
3. G.J. Chaitin, "On the length of programs for computing finite binary
sequences: Statistical considerations," J. ACM 16 (1969), 145-159.
4. G. Chrystal, Algebra, an Elementary Textbook (Edinburgh: Adam and
Charles Black, 1886), Chapter 1.
5. A. D6rrer, Tiroler Fasnacht (Wien, 1949), 480 pp.
6. Encyclopedia Judaica (New York: Macmillan, 1971), v. 6 p. 503; The
Jewish Encyclopedia (New York: Funk and Wagnalls, 1903); articles
on Ehad Mi Yode'a.
7. U. Jack, "Logarithmic growth of verses," Acta Perdix 15 (1826),
1-65535.
8. Peter Kennedy, Folksongs of Britain and Ireland (New York: Schirmer,
1975), 824 pp.
9. Norman Lloyd. The New Golden Song Book (New York: Golden Press,
1955), 20-21.
10. N. Picker, "Once sefiores brincando al mismo tiempo," Acta Perdix
12 (1825), 1009.
11. ben shahn, a partridge in a pear tree (New York: the museum of
modern art, 1949), 28 pp. (unnumbered).
12. Cecil J. Sharp, ed., One Hundred English Folksongs (Boston: Oliver
Ditson, 1916), xlii.
13. Christopher J. Shaw, "that old favorite, A p i a p t / a Christmastime algorithm," with illustrations by Gene Oltan, Datamation 10, 12 (December 1964), 48-49. Reprinted in Jack Moshman, ed., Faith, Hope
and Parity (Washington, D.C.: Thompson, 1966), 48-51.
14. Gustav Thurau, Beitr~ge zur Geschichte und Charakteristik des Refrains
in derfranzosischen Chanson (Weimar: Felber, 1899), 47 pp.
15. Marcel Vigneras, ed., Chansons de France (Boston: D.C. Heath. 1941),
52 pp.
Permission to copy without fee all or part of this material is granted
provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication
and its date appear, and notice is given that copying is by permission of
the Association for Computing Machinery. To copy otherwise, or to
republish, requires a fee a n d / o r specific permission.
April 1984
Volume 27
Number 4