On a Discriminatory Problem Connected with the Works of Plato
Author(s): D. R. Cox and L. Brandwood
Source: Journal of the Royal Statistical Society. Series B (Methodological), Vol. 21, No. 1
(1959), pp. 195-200
Published by: Wiley for the Royal Statistical Society
Stable URL: http://www.jstor.org/stable/2983942
Accessed: 03-03-2017 13:42 UTC
This content downloaded from 163.1.41.46 on Fri, 03 Mar 2017 13:42:58 UTC
All use subject to http://about.jstor.org/terms
ON A DISCRIMINATORY PROBLEM CONNECTED WITH THE WORKS OF PLATO
By D. R. COX and L. BRANDWOOD
Birkbeck College, University of London
[Received November, 1958]
SUMMARY
A FORM of discriminant analysis for qualitative data is developed and used
to place in order some of the works of Plato.
1. INTRODUCTION
Between writing the Republic (Rep.) and the Laws, Plato wrote the Critias (Crit.),
Philebus (Phil.), Politicus (Pol.), Sophist (Soph.) and Timaeus (Tim.), but it is not known in
what order. Kaluscha (1904) was the first to use statistical methods in an attempt to
assign an order to these works. Billig (1920) continued this line of investigation, but the
conclusions from it have not been generally accepted by classical scholars, partly, perhaps,
because of the very subjective methods used to interpret the statistical tables. Brandwood
(1958) has applied more objective statistical methods to the problem and the aim of the
present note is to explain briefly the technique used.
The stylistic property on which the statistical analysis is based is the distribution of
quantity over the sentence ending (clausula). The last five syllables only of each sentence
are considered, each syllable being classed as long or short. The sentence endings are
thus divided into $2^5 = 32$ types. For each work a frequency distribution is obtained
showing the number of endings of each type (Table 1). There is a marked difference
between the distributions for Rep. and Laws; the problem is in effect to order the other
works in decreasing order of affinity with, say, Rep., and then, provided that Plato's change
in literary style was monotone in time, we will have estimated the order in which the works
were written.
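The tabulation step can be sketched as follows. This is only an illustration, assuming each sentence ending has already been scanned into a string of five symbols, with `-` for a long syllable and `u` for a short one; the helper name `tally_endings` and the sample endings are hypothetical and do not come from the paper.

```python
from collections import Counter
from itertools import product

# All 2**5 = 32 possible endings: "-" marks a long syllable, "u" a short one.
ENDING_TYPES = ["".join(p) for p in product("u-", repeat=5)]

def tally_endings(endings):
    """Count how often each of the 32 ending types occurs.

    `endings` is an iterable of 5-character strings over {"u", "-"};
    the scansion itself is assumed to have been done beforehand.
    """
    counts = Counter({t: 0 for t in ENDING_TYPES})
    for e in endings:
        if e not in counts:
            raise ValueError(f"not a five-syllable ending: {e!r}")
        counts[e] += 1
    return counts

# Hypothetical endings, for illustration only.
sample = ["uuuuu", "-uuu-", "-uuu-", "---u-"]
counts = tally_endings(sample)
percentages = {t: 100.0 * n / len(sample) for t, n in counts.items()}
```

Applied to a whole work, `percentages` would correspond to one column of Table 1.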
The technique used is essentially discriminant analysis. The optimum discriminator
between Rep. and Laws is obtained, giving therefore a method of scoring each type of
sentence ending. The mean score for all sentences in Rep. is substantially negative, that
for Laws is substantially positive; the mean scores for the other works then give the required
ordering. The calculation is completed by obtaining an approximate significance test for
the difference between two mean scores.
The work may be compared with that of Barnard (1935), who used discriminant analysis
to date Egyptian skulls. Her observations, however, were on continuous and approximately
normally distributed variates, and ordinary linear discriminant analysis was used. Here,
by contrast, the observations are on five qualitative variables (long, short). It would be
possible to replace "long" by a 1 and "short" by a 0 and then to use a linear discriminant
function as an approximation; this procedure is not likely to take proper account of the
importance of particular special patterns of quantity among several syllables, and it is
better, and a lot simpler, to regard the 32 different types of
Problems of discrimination with qualitative variables arise also in other fields, for example
in medical diagnosis.
2. THEORY
Let there be two populations $H_0$, $H_1$ of individuals and let each individual fall into
one of $k$ types, the probabilities being $\theta_{01}, \ldots, \theta_{0k}$; $\theta_{11}, \ldots, \theta_{1k}$, assumed for
the moment to be known. Assume that different individuals are statistically independent.
In the application that we have in mind $H_0$ is Rep., $H_1$ is Laws, an individual is a sentence
ending and $k = 32$.
Suppose first that we have $N$ individuals from one or other population, $n_i$ being of
the $i$th type, and that we require to decide which population has in fact been sampled.
Welch (1939) showed that optimum discrimination between the populations is based on
the likelihood ratio. The probability of the observations in sampling from $H_0$ is of the
multinomial form
$$\frac{N!}{\prod_{i=1}^{k} n_i!} \prod_{i=1}^{k} \theta_{0i}^{n_i}, \tag{1}$$
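so that the log-likelihood ratio is
$$\sum_{i=1}^{k} n_i \log\left(\frac{\theta_{1i}}{\theta_{0i}}\right), \tag{2}$$
where any system of logarithms can be used in the calculations. That is, each sentence is given a score
$$s_i = \log\left(\frac{\theta_{1i}}{\theta_{0i}}\right) \tag{3}$$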
and the optimum discriminator is the total score.
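A minimal sketch of the scoring rule, using natural logarithms. The percentages are the first six rows of Table 1 as read from the source (Rep. for the first population, Laws for the second); the passage counts and the function names are hypothetical.

```python
import math

# First six rows of Table 1 (percentages): type -> (Rep., Laws).
rep_laws_pct = {
    "uuuuu": (1.1, 2.4),
    "-uuuu": (1.6, 3.8),
    "u-uuu": (1.7, 1.9),
    "uu-uu": (1.9, 2.6),
    "uuu-u": (2.1, 3.0),
    "uuuu-": (2.0, 3.8),
}

def score(theta0, theta1):
    """Score of one ending type: log(theta_1i / theta_0i), natural logs."""
    return math.log(theta1 / theta0)

# Percentages can be used directly because only the ratio matters.
scores = {t: score(p0, p1) for t, (p0, p1) in rep_laws_pct.items()}

def total_score(counts, scores):
    """Log-likelihood ratio: sum of n_i * s_i over the observed types."""
    return sum(n * scores[t] for t, n in counts.items())

# Hypothetical counts for a short passage, for illustration only.
passage = {"uuuuu": 3, "u-uuu": 1, "uuuu-": 2}
llr = total_score(passage, scores)
```

A positive total favours the second population (Laws) and a negative total the first (Rep.).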
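Now suppose that there exist not just populations $H_0$ and $H_1$, but a series of populations
representing a gradual change from the distribution in $H_0$ to that in $H_1$. A natural way
of representing such a series is to define populations $H_\lambda$, $0 < \lambda < 1$, the probability $\theta_{\lambda i}$
of the $i$th type in $H_\lambda$ being
$$\theta_{\lambda i} = \frac{\theta_{0i}^{1-\lambda}\,\theta_{1i}^{\lambda}}{\sum_{j=1}^{k} \theta_{0j}^{1-\lambda}\,\theta_{1j}^{\lambda}}. \tag{4}$$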
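The intermediate populations can be explored numerically. The sketch below assumes the family has the normalised geometric form, with each probability proportional to the first population's probability raised to the power 1 - lambda times the second's raised to the power lambda. The three-type probabilities and the function names are hypothetical.

```python
import math

def tilted(theta0, theta1, lam):
    """Probabilities proportional to theta0**(1 - lam) * theta1**lam."""
    w = [a ** (1 - lam) * b ** lam for a, b in zip(theta0, theta1)]
    total = sum(w)
    return [x / total for x in w]

def expected_score(theta0, theta1, lam):
    """Expected value of the score log(theta1_i / theta0_i) at a given lam."""
    th = tilted(theta0, theta1, lam)
    return sum(t * math.log(b / a) for t, a, b in zip(th, theta0, theta1))

# Hypothetical three-type populations, for illustration only.
theta0 = [0.5, 0.3, 0.2]
theta1 = [0.2, 0.3, 0.5]
grid = [i / 10 for i in range(11)]
values = [expected_score(theta0, theta1, lam) for lam in grid]
```

Setting `lam` to 0 or 1 recovers the two reference populations, and `values` increases steadily across the grid, which is the property that lets the expected score stand in for the parameter.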
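For a random sample of $N$ individuals, $n_i$ of the $i$th type, the log-likelihood is, except for a
constant,
$$\lambda \sum_{i=1}^{k} n_i \log\left(\frac{\theta_{1i}}{\theta_{0i}}\right) - N \log \sum_{j=1}^{k} \theta_{0j}^{1-\lambda}\,\theta_{1j}^{\lambda},$$
so that a sufficient statistic for $\lambda$ is the total score (2), or more conveniently, the mean
score
$$\bar{s} = \frac{1}{N} \sum_{i=1}^{k} n_i \log\left(\frac{\theta_{1i}}{\theta_{0i}}\right) = \frac{1}{N} \sum_{i=1}^{k} n_i s_i. \tag{5}$$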
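The step from the family (4) to the sufficiency of the total score can be written out as follows. This is a reconstruction from the definitions above, with $s_i = \log(\theta_{1i}/\theta_{0i})$ the score of the $i$th type.

```latex
\begin{aligned}
\log \theta_{\lambda i}
  &= (1-\lambda)\log\theta_{0i} + \lambda\log\theta_{1i}
     - \log\sum_{j=1}^{k}\theta_{0j}^{1-\lambda}\theta_{1j}^{\lambda} \\
  &= \log\theta_{0i} + \lambda s_i
     - \log\sum_{j=1}^{k}\theta_{0j}^{1-\lambda}\theta_{1j}^{\lambda}, \\
\sum_{i=1}^{k} n_i \log\theta_{\lambda i}
  &= \underbrace{\sum_{i=1}^{k} n_i\log\theta_{0i}}_{\text{free of }\lambda}
     + \lambda\sum_{i=1}^{k} n_i s_i
     - N\log\sum_{j=1}^{k}\theta_{0j}^{1-\lambda}\theta_{1j}^{\lambda}.
\end{aligned}
```

The observations enter the part depending on $\lambda$ only through $\sum n_i s_i$, which is why the total (or mean) score is sufficient.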
The random variable $\bar{s}$ converges in probability, as $N$ increases, to $\psi(\lambda) = \sum_i \theta_{\lambda i} s_i$,
which is a monotone function of $\lambda$; it would be possible to convert $\bar{s}$ into an estimate of
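$\lambda$, but this seems unnecessary, and we shall instead take $\psi(\lambda)$ as the parameter of interest.
The variance of $\bar{s}$, considered as an estimate of $\psi(\lambda)$, is
$$V(\bar{s}) = \frac{1}{N}\left\{\sum_{i=1}^{k} \theta_{\lambda i} s_i^2 - \left(\sum_{i=1}^{k} \theta_{\lambda i} s_i\right)^2\right\} \tag{6}$$
and an unbiased estimate of this is
$$V_e(\bar{s}) = \frac{1}{N(N-1)}\left\{\sum_{i=1}^{k} n_i s_i^2 - \frac{1}{N}\left(\sum_{i=1}^{k} n_i s_i\right)^2\right\}. \tag{7}$$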
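The mean score and its estimated variance can be computed directly from the counts. The sketch below assumes the usual unbiased form, namely the sample variance of the N individual scores divided by N; the function name and the three-type data are hypothetical.

```python
def mean_score_and_variance(counts, scores):
    """Return (mean score, unbiased estimate of its variance).

    counts[i] is n_i and scores[i] is s_i; N = sum(counts) must exceed 1.
    """
    N = sum(counts)
    if N < 2:
        raise ValueError("need at least two individuals")
    sum_ns = sum(n * s for n, s in zip(counts, scores))
    sum_ns2 = sum(n * s * s for n, s in zip(counts, scores))
    mean = sum_ns / N
    # Sample variance of the N scores, divided by N.
    variance = (sum_ns2 - sum_ns ** 2 / N) / (N * (N - 1))
    return mean, variance

# Hypothetical three-type example, for illustration only.
counts = [5, 3, 2]
scores = [0.5, -1.0, 0.2]
mean, var = mean_score_and_variance(counts, scores)
```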
Now suppose that we have samples of sizes $N'$ and $N''$ from two populations (two of
the "doubtful" works in the application that we are describing). Calculate the corresponding
mean scores $\bar{s}'$ and $\bar{s}''$ and the estimated variance of the difference,
$$V_e(\bar{s}' - \bar{s}'') = V_e(\bar{s}') + V_e(\bar{s}''). \tag{8}$$
In large samples $(\bar{s}' - \bar{s}'')$ will be nearly normally distributed and a significance test can be
made in the usual way. If a significant difference is obtained, the populations can confidently
be placed in order on the $\lambda$-scale, provided that (4) is a reasonable representation
of the populations. The adequacy of (4) could in principle be examined by a $\chi^2$ test,
although this was not done in the present application. The correctness of (4) matters
only in the derivation of (5) as an optimum statistic and not in the calculation of the
standard error; (5) has in any case a certain intuitive appeal.
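The large-sample test can be sketched as below. The two-sided normal tail area is an assumption about what "the usual way" means here; the function name is hypothetical, and the example uses the mean scores and variances for Soph. and Tim. as printed in Table 2.

```python
import math

def compare_mean_scores(mean1, var1, mean2, var2):
    """Large-sample normal test for the difference of two mean scores.

    Returns (difference, standard error, z, two-sided p-value); the
    variance of the difference is the sum of the two estimated variances.
    """
    diff = mean1 - mean2
    se = math.sqrt(var1 + var2)
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return diff, se, z, p

# Soph. against Tim., with means and variances as printed in Table 2.
diff, se, z, p = compare_mean_scores(-0.0407, 0.0005719, -0.1170, 0.0007218)
```

For this pair the difference and standard error agree with the first row of Table 3, and the tail area falls between 1 and 5 per cent.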
If the population probabilities are known for $H_0$ and $H_1$, this completes the theory.
In the present application the numbers of sentences in Rep. and in Laws are appreciably
greater than those in the other books, so that it is reasonable to replace $\theta_{0i}$ and $\theta_{1i}$ by
observed frequencies and to neglect the consequent error in $s_i$. In general the large-sample
variance of $\bar{s}' - \bar{s}''$ is given by (8) plus
$$\sum_{i} (\theta_{\lambda i} - \theta_{\mu i})^2\, V(\hat{s}_i) + 2 \sum_{i > j} (\theta_{\lambda i} - \theta_{\mu i})(\theta_{\lambda j} - \theta_{\mu j})\, C(\hat{s}_i, \hat{s}_j), \tag{9}$$
where $\theta_{\lambda i}$, $\theta_{\mu i}$ refer to the two populations under comparison, and where $\hat{s}_i$ is the
estimate of $\log(\theta_{1i}/\theta_{0i})$ obtained by replacing probabilities by sample frequencies. If
$M_0$, $M_1$ denote the numbers of observations on the reference populations $H_0$, $H_1$, then a
simple asymptotic calculation gives
$$V(\hat{s}_i) \simeq \left(\frac{1-\theta_{0i}}{M_0\,\theta_{0i}} + \frac{1-\theta_{1i}}{M_1\,\theta_{1i}}\right)(\log e)^2, \tag{10}$$
$$C(\hat{s}_i, \hat{s}_j) \simeq -\left(\frac{1}{M_0} + \frac{1}{M_1}\right)(\log e)^2. \tag{11}$$
Note however that under the null hypothesis that the two populations under comparison
are identical, i.e. $\lambda = \mu$, the additional terms (9) vanish, so that (7) and (8) may be used
for a significance test, with $s_i$ replaced by $\hat{s}_i$.
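To give a feel for the size of the sampling error in an estimated score, the asymptotic formulas can be evaluated numerically. The sketch assumes they take the standard delta-method form for the logarithm of a multinomial proportion, with `log_e` equal to 1 for natural logarithms. The function names are hypothetical; the sample sizes and the first ending type's percentages (1.1 and 2.4) are read from Table 1.

```python
def score_variance(theta0, theta1, m0, m1, log_e=1.0):
    """Approximate variance of an estimated score log(theta1 / theta0).

    log_e is log(e) in the base of logarithms used (1.0 for natural logs).
    """
    return ((1 - theta0) / (m0 * theta0) + (1 - theta1) / (m1 * theta1)) * log_e ** 2

def score_covariance(m0, m1, log_e=1.0):
    """Approximate covariance between the estimated scores of two types."""
    return -(1.0 / m0 + 1.0 / m1) * log_e ** 2

M0, M1 = 3778, 3783                         # sentences in Rep. and Laws
v1 = score_variance(0.011, 0.024, M0, M1)   # first ending type of Table 1
c = score_covariance(M0, M1)
```

Here `v1` is about 0.035 and `c` about -0.0005; both are multiplied by small squared differences of probabilities before entering the variance of a difference of mean scores.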
3. APPLICATION AND DISCUSSION
The first two columns of Table 1 give the percentage distribution among the 32 types
for the reference populations $H_0$, Rep., and $H_1$, Laws. The third column gives the estimated
scores, $\hat{s}_i$, obtained on replacing probabilities in (3) by observed frequencies; for example,
the first entry is $\log_e (2.4/1.1) = 0.779$. The remaining columns give the percentage
frequency distributions for the doubtful works Crit., . . . , Tim.
Table 2 gives the mean scores for the doubtful works; for example that for Crit. is
$$\tfrac{1}{100}\,(3.3 \times 0.779 + 2.0 \times 0.867 + \ldots + 4.0 \times 0.215).$$
Mean scores have been calculated also for Rep. and Laws; these are the expected values
that would apply to any work for which the distribution of quantities differs only randomly
from that in Rep. or Laws.
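As a check on the arithmetic, the mean scores for the two reference works can be recomputed from the Rep., Laws and score columns of Table 1. The values below are as read from the table, in row order, and the function name is hypothetical. Small discrepancies are to be expected because the printed scores carry minor errors in the third decimal place.

```python
# Columns of Table 1 in row order: Rep. %, Laws %, printed score.
REP = [1.1, 1.6, 1.7, 1.9, 2.1, 2.0, 2.1, 2.2, 2.8, 4.6, 3.3, 2.6, 4.6, 2.6,
       4.4, 2.5, 2.9, 3.0, 3.4, 2.0, 6.4, 4.2, 2.8, 4.2, 4.8, 2.4, 3.5, 4.0,
       4.1, 4.1, 2.0, 4.2]
LAWS = [2.4, 3.8, 1.9, 2.6, 3.0, 3.8, 2.7, 1.8, 0.6, 8.8, 3.4, 1.0, 1.1, 1.5,
        3.0, 5.7, 4.2, 1.4, 1.0, 2.3, 2.4, 0.6, 2.9, 1.2, 8.2, 1.9, 4.1, 3.7,
        2.1, 8.8, 3.0, 5.2]
SCORE = [0.779, 0.867, 0.113, 0.315, 0.358, 0.642, 0.255, -0.199, -1.541,
         0.647, 0.030, -0.956, -1.430, -0.548, -0.385, 0.824, 0.372, -0.761,
         -1.224, 0.140, -0.982, -1.946, 0.039, -1.253, 0.536, -0.231, 0.157,
         -0.077, -0.668, 0.765, 0.405, 0.215]

def mean_score(percentages, scores):
    """Mean score of a work from its percentage distribution (divisor 100)."""
    return sum(p * s for p, s in zip(percentages, scores)) / 100.0

rep_mean = mean_score(REP, SCORE)    # compare with -0.2652 in Table 2
laws_mean = mean_score(LAWS, SCORE)  # compare with 0.2176 in Table 2
```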
TABLE 1
Percentage Distribution of Sentence Endings

Type of       $H_0$    $H_1$    $\hat{s}_i$ = natural
ending        Rep.     Laws     log of ratio     Crit.    Phil.    Pol.     Soph.    Tim.
u u u u u      1.1      2.4        0.779          3.3      2.5      1.7      2.8      2.4
- u u u u      1.6      3.8        0.867          2.0      2.8      2.5      3.6      3.9
u - u u u      1.7      1.9        0.113          2.0      2.1      3.1      3.4      6.0
u u - u u      1.9      2.6        0.315          1.3      2.6      2.6      2.6      1.8
u u u - u      2.1      3.0        0.358          6.7      4.0      3.3      2.4      3.4
u u u u -      2.0      3.8        0.642          4.0      4.8      2.9      2.5      3.5
- - u u u      2.1      2.7        0.255          3.3      4.3      3.3      3.3      3.4
- u - u u      2.2      1.8       -0.199          2.0      1.5      2.3      4.0      3.4
- u u - u      2.8      0.6       -1.541          1.3      0.7      0.4      2.1      1.7
- u u u -      4.6      8.8        0.647          6.0      6.5      4.0      2.3      3.3
u - - u u      3.3      3.4        0.030          2.7      6.7      5.3      3.3      3.4
u - u - u      2.6      1.0       -0.956          2.7      0.6      0.9      1.6      2.2
u - u u -      4.6      1.1       -1.430          2.0      0.7      1.0      3.0      2.7
u u - - u      2.6      1.5       -0.548          2.7      3.1      3.1      3.0      3.0
u u - u -      4.4      3.0       -0.385          3.3      1.9      3.0      3.0      2.2
u u u - -      2.5      5.7        0.824          6.7      5.4      4.4      5.1      3.9
- - - u u      2.9      4.2        0.372          2.7      5.5      6.9      5.2      3.0
- - u - u      3.0      1.4       -0.761          2.0      0.7      2.7      2.6      3.3
- - u u -      3.4      1.0       -1.224          0.7      0.4      0.7      2.3      3.3
- u - - u      2.0      2.3        0.140          2.0      ..       ..       ..       ..
- u - u -      6.4      2.4       -0.982          1.3      ..       ..       ..       ..
- u u - -      4.2      0.6       -1.946          4.7      ..       ..       ..       ..
u - - - u      2.8      2.9        0.039          1.3      ..       ..       ..       ..
u - - u -      4.2      1.2       -1.253          2.7      ..       ..       ..       ..
u - u - -      4.8      8.2        0.536          ..       ..       ..       ..       ..
u u - - -      2.4      1.9       -0.231          ..       ..       ..       ..       ..
u - - - -      3.5      4.1        0.157          ..       ..       ..       ..       ..
- u - - -      4.0      3.7       -0.077          4.7      3.3      4.9      3.5      3.0
- - u - -      4.1      2.1       -0.668          6.0      2.3      2.1      4.1      6.4
- - - u -      4.1      8.8        0.765          2.0      9.0      6.8      4.7      3.8
- - - - u      2.0      3.0        0.405          3.3      2.9      2.9      2.6      2.2
- - - - -      4.2      5.2        0.215          4.0      ..       ..       ..       ..

Number of
sentences     3,778    3,783         -            150      958      770      919      762

Entries shown as .. are not legible in the source.
There are minor errors in the third decimal place of the scores s.
The estimated variances of the mean scores, also given in Table 2, are calculated from
(7), again replacing $s_i$ by $\hat{s}_i$. Thus for Crit. the estimated variance is
$$\frac{1}{149}\left\{\frac{3.3 \times 0.779^2 + \ldots + 4.0 \times 0.215^2}{100} - \frac{(3.3 \times 0.779 + \ldots + 4.0 \times 0.215)^2}{100^2}\right\},$$
the divisors 100 being inserted because the frequencies in Table 1 are percentages.
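The percentage form follows from the count form of the variance estimate by a short substitution. In the sketch below $p_i$ denotes the percentage of endings of type $i$ (a symbol introduced here for convenience), so that $n_i = N p_i / 100$.

```latex
\begin{aligned}
V_e(\bar{s})
  &= \frac{1}{N(N-1)}\left\{\sum_i n_i \hat{s}_i^{2}
       - \frac{1}{N}\Bigl(\sum_i n_i \hat{s}_i\Bigr)^{2}\right\},
  \qquad n_i = \frac{N p_i}{100}, \\
  &= \frac{1}{N-1}\left\{\frac{\sum_i p_i \hat{s}_i^{2}}{100}
       - \frac{\bigl(\sum_i p_i \hat{s}_i\bigr)^{2}}{100^{2}}\right\}.
\end{aligned}
```

For Crit., $N = 150$, which gives the divisor 149.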
TABLE 2
Mean Scores and Their Standard Errors

                       Crit.       Phil.       Pol.        Soph.       Tim.        Rep.       Laws
Mean score            -0.0346     0.1996      0.1303     -0.0407     -0.1170     -0.2652     0.2176
Estimated variance     0.003799   0.0003342   0.0003973   0.0005719   0.0007218     -          -
Estimated st. error    0.0616     0.0183      0.01993     0.0239      0.0269        -          -
The variance, and hence the standard error, of the difference between any two mean
scores can now be found, and the order determined. The more critical comparisons are
shown in Table 3; those not shown in the table are very highly significant statistically.
TABLE 3
Some Comparisons of Mean Scores

                                 Estimated     Level of significance
                   Difference    st. error     attained (%)
Tim. v. Soph.        0.0763        0.0360            5
Tim. v. Crit.        0.0824        0.0672           20
Soph. v. Crit.       0.0061        0.0661          >90
Crit. v. Pol.        0.1649        0.0648            1
Pol. v. Phil.        0.0693        0.0270            1
The final ordering is Rep., Tim., Soph., Crit., Pol., Phil., Laws: there is reasonably
strong evidence that Tim. is correctly placed before Soph., but the position of Crit. could
be anywhere between somewhat before Tim. to before Pol. This conclusion agrees broadly
with the one arrived at in the earlier work mentioned in §1. It is not in accord with the
views held by the majority of classical scholars, although there is a minority group who
have reached a similar ordering by apparently independent arguments.
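The ordering itself is simply a sort on the mean scores. The sketch below uses the means and variances as printed in Table 2 and also computes the standardised difference between each adjacent pair of doubtful works; the variable names are hypothetical.

```python
# Mean scores and estimated variances as printed in Table 2.
MEAN = {"Crit.": -0.0346, "Phil.": 0.1996, "Pol.": 0.1303,
        "Soph.": -0.0407, "Tim.": -0.1170, "Rep.": -0.2652, "Laws": 0.2176}
VAR = {"Crit.": 0.003799, "Phil.": 0.0003342, "Pol.": 0.0003973,
       "Soph.": 0.0005719, "Tim.": 0.0007218}

# Order the works from most Rep.-like (lowest score) to most Laws-like.
ordering = sorted(MEAN, key=MEAN.get)

# Standardised difference between each adjacent pair of doubtful works.
doubtful = [w for w in ordering if w in VAR]
adjacent_z = {
    (a, b): (MEAN[b] - MEAN[a]) / (VAR[a] + VAR[b]) ** 0.5
    for a, b in zip(doubtful, doubtful[1:])
}
```

The very small standardised difference between Soph. and Crit. reflects the large variance attached to the short Crit., which is why its position is the least certain.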
Brandwood (1958) has described further aspects of the analysis and given a detailed
discussion of the conclusions. One further point that may be noted briefly is that Rep.
and Laws are each divided into a number of books; these have been treated as homogeneous
in the above analysis. This can be checked by calculating mean scores separately for
each book and comparing the dispersion of these means with that to be expected from
(7). The agreement is good. Further, it was expected on general grounds that there
would be no systematic change in style with serial number; this was confirmed.
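One way such a homogeneity check might be organised is sketched below. The particular statistic (squared deviations of the book means from the pooled mean, each divided by its estimated variance) is an assumption of this sketch and is not necessarily the comparison used in the paper; the counts are hypothetical.

```python
def mean_and_variance(counts, scores):
    """Mean score and its estimated variance for one book."""
    n_total = sum(counts)
    sum_ns = sum(n * s for n, s in zip(counts, scores))
    sum_ns2 = sum(n * s * s for n, s in zip(counts, scores))
    mean = sum_ns / n_total
    var = (sum_ns2 - sum_ns ** 2 / n_total) / (n_total * (n_total - 1))
    return mean, var

def dispersion_check(books, scores):
    """Compare between-book scatter of mean scores with the estimated variances.

    Returns (statistic, degrees of freedom); under homogeneity the statistic
    should behave roughly like chi-squared on that many degrees of freedom.
    """
    pooled_counts = [sum(col) for col in zip(*books)]
    pooled_mean, _ = mean_and_variance(pooled_counts, scores)
    stat = 0.0
    for counts in books:
        mean, var = mean_and_variance(counts, scores)
        stat += (mean - pooled_mean) ** 2 / var
    return stat, len(books) - 1

# Hypothetical counts for three books over three ending types.
scores = [0.5, -1.0, 0.2]
books = [[50, 30, 20], [45, 35, 20], [55, 25, 20]]
stat, df = dispersion_check(books, scores)
```

A statistic far above its degrees of freedom would suggest that the books of a work differ by more than sampling variation.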
A final general remark is that the frequency distributions in Table 1 are based not on
samples, but on complete enumeration. In applying probabilistic arguments we are in
effect assuming that certain aspects of Plato's writings are adequately described by the
laws of probability for independent events; this assumption receives some support from
the fact just noted about the dispersion between books within works.
REFERENCES
BARNARD, M. M. (1935), "The secular variations of skull characters in four series of Egyptian skulls",
Ann. Eug. London, 6, 352-362.
BILLIG, L. (1920), "Clausulae and Platonic chronology", J. Philol., 35, 225-256.
BRANDWOOD, L. (1958), The dating of Plato's works by the stylistic method-a historical and critical survey.
Ph.D. thesis, University of London.
KALUSCHA, W. (1904), "Zur Chronologie der platonischen Dialoge", Wiener Studien, 26, 190-204.
WELCH, B. L. (1939), "Note on discriminant functions", Biometrika, 31, 218-219.