On a Discriminatory Problem Connected with the Works of Plato
Author(s): D. R. Cox and L. Brandwood
Source: Journal of the Royal Statistical Society. Series B (Methodological), Vol. 21, No. 1 (1959), pp. 195-200
Published by: Wiley for the Royal Statistical Society
Stable URL: http://www.jstor.org/stable/2983942
Accessed: 03-03-2017 13:42 UTC
This content downloaded from 163.1.41.46 on Fri, 03 Mar 2017 13:42:58 UTC. All use subject to http://about.jstor.org/terms

ON A DISCRIMINATORY PROBLEM CONNECTED WITH THE WORKS OF PLATO

By D. R. COX and L. BRANDWOOD
Birkbeck College, University of London
[Received November, 1958]

SUMMARY
A form of discriminant analysis for qualitative data is developed and used to place in order some of the works of Plato.

1. INTRODUCTION
Between writing the Republic (Rep.) and the Laws, Plato wrote the Critias (Crit.), Philebus (Phil.), Politicus (Pol.), Sophist (Soph.) and Timaeus (Tim.), but it is not known in what order. Kaluscha (1904) was the first to use statistical methods in an attempt to assign an order to these works. Billig (1920) continued this line of investigation, but the conclusions from it have not been generally accepted by classical scholars, partly, perhaps, because of the very subjective methods used to interpret the statistical tables.
Brandwood (1958) has applied more objective statistical methods to the problem, and the aim of the present note is to explain briefly the technique used.

The stylistic property on which the statistical analysis is based is the distribution of quantity over the sentence ending (clausula). Only the last five syllables of each sentence are considered, each syllable being classed as long or short. The sentence endings are thus divided into $2^5 = 32$ types. For each work a frequency distribution is obtained showing the number of endings of each type (Table 1). There is a marked difference between the distributions for Rep. and Laws; the problem is in effect to arrange the other works in decreasing order of affinity with, say, Rep., and then, provided that Plato's change in literary style was monotone in time, we will have estimated the order in which the works were written.

The technique used is essentially discriminant analysis. The optimum discriminator between Rep. and Laws is obtained, giving therefore a method of scoring each type of sentence ending. The mean score for all sentences in Rep. is substantially negative, that for Laws is substantially positive; the mean scores for the other works then give the required ordering. The calculation is completed by obtaining an approximate significance test for the difference between two mean scores.

The work may be compared with that of Barnard (1935), who used discriminant analysis to date Egyptian skulls. Her observations were on continuous and approximately normally distributed variates, and ordinary linear discriminant analysis was used. Here, by contrast, the observations are on five qualitative variables (long, short).
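As a minimal sketch of the tabulation just described, the following Python fragment classifies sentence endings by the quantities of their last five syllables and tallies the 32 types. The string encoding ('u' for short, '-' for long), the function names and the three sample endings are assumptions made for illustration only; they are not taken from the paper, and the scanning of Greek text into quantities is not attempted.

```python
from collections import Counter
from itertools import product

# All 2**5 = 32 possible patterns of five short ('u') or long ('-') syllables.
ALL_TYPES = ["".join(p) for p in product("u-", repeat=5)]

def ending_type(quantities):
    """Return the pattern of the last five syllables of one sentence."""
    if len(quantities) < 5:
        raise ValueError("need at least five syllables")
    if set(quantities) - set("u-"):
        raise ValueError("syllables must be marked 'u' or '-'")
    return quantities[-5:]

def percentage_distribution(sentences):
    """Percentage of sentence endings falling in each of the 32 types."""
    counts = Counter(ending_type(q) for q in sentences)
    total = sum(counts.values())
    # Counter returns 0 for types that never occur.
    return {t: 100.0 * counts[t] / total for t in ALL_TYPES}

# Hypothetical scanned sentences (quantities only), for illustration.
sample = ["u-uu-uuu-u", "--u-uuu-u", "-u-----"]
dist = percentage_distribution(sample)
```

Treating each five-syllable pattern as a single category, rather than as five separate 0/1 variables, is what lets the later analysis use one multinomial distribution per work.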
It would be possible to replace "long" by 1 and "short" by 0 and then to use a linear discriminant function as an approximation; this procedure is not likely to take proper account of the importance of particular special patterns of quantity among several syllables, and it is better, and much simpler, to regard the 32 different types of sentence ending as separate categories. Problems of discrimination with qualitative variables arise also in other fields, for example in medical diagnosis.

2. THEORY
Let there be two populations $H_0$, $H_1$ of individuals and let each individual fall into one of $k$ types, the probabilities being $\theta_{01}, \ldots, \theta_{0k}$; $\theta_{11}, \ldots, \theta_{1k}$, assumed for the moment to be known. Assume that different individuals are statistically independent. In the application that we have in mind $H_0$ is Rep., $H_1$ is Laws, an individual is a sentence ending and $k = 32$.

Suppose first that we have $N$ individuals from one or other population, $n_i$ being of the $i$th type, and that we require to decide which population has in fact been sampled. Welch (1939) showed that optimum discrimination between the populations is based on the likelihood ratio. The probability of the observations in sampling from $H_0$ is of the multinomial form

$$\frac{N!}{\prod n_i!} \prod \theta_{0i}^{n_i}, \tag{1}$$

so that the log-likelihood ratio is

$$\sum n_i \log(\theta_{1i}/\theta_{0i}), \tag{2}$$

where any convenient system of logarithms may be used in the calculations. That is, each sentence ending of the $i$th type is given a score

$$s_i = \log(\theta_{1i}/\theta_{0i}) \tag{3}$$

and the optimum discriminator is the total score.

Now suppose that there exist not just populations $H_0$ and $H_1$, but a series of populations representing a gradual change from the distribution in $H_0$ to that in $H_1$.
A natural way of representing such a series is to define populations $H_\lambda$, $0 < \lambda < 1$, the probability $\theta_{\lambda i}$ of the $i$th type in $H_\lambda$ being

$$\theta_{\lambda i} = \frac{\theta_{0i}^{1-\lambda}\,\theta_{1i}^{\lambda}}{\sum_{j=1}^{k} \theta_{0j}^{1-\lambda}\,\theta_{1j}^{\lambda}}. \tag{4}$$

For a random sample of $N$ individuals, $n_i$ of the $i$th type, the log-likelihood is, except for a constant,

$$\lambda \sum_{i=1}^{k} n_i \log(\theta_{1i}/\theta_{0i}) - N \log \sum_{j=1}^{k} \theta_{0j}^{1-\lambda}\,\theta_{1j}^{\lambda},$$

so that a sufficient statistic for $\lambda$ is the total score (2), or more conveniently, the mean score

$$\bar{s} = \frac{1}{N} \sum n_i s_i = \frac{1}{N} \sum n_i \log(\theta_{1i}/\theta_{0i}). \tag{5}$$

The random variable $\bar{s}$ converges in probability, as $N$ increases, to $\phi(\lambda) = \sum \theta_{\lambda i} s_i$, which is a monotone function of $\lambda$; it would be possible to convert $\bar{s}$ into an estimate of $\lambda$, but this seems unnecessary, and we shall instead take $\phi(\lambda)$ as the parameter of interest. The variance of $\bar{s}$, considered as an estimate of $\phi(\lambda)$, is

$$V(\bar{s}) = \frac{1}{N} \left\{ \sum \theta_{\lambda i} s_i^2 - \left( \sum \theta_{\lambda i} s_i \right)^2 \right\} \tag{6}$$

and an unbiased estimate of this is

$$V_e(\bar{s}) = \frac{1}{N(N-1)} \left\{ \sum n_i s_i^2 - \frac{\left( \sum n_i s_i \right)^2}{N} \right\}. \tag{7}$$

Now suppose that we have samples of sizes $N'$ and $N''$ from two populations (two of the "doubtful" works in the application that we are describing). Calculate the corresponding mean scores $\bar{s}'$ and $\bar{s}''$ and the estimated variance of the difference,

$$V_e(\bar{s}' - \bar{s}'') = V_e(\bar{s}') + V_e(\bar{s}''). \tag{8}$$

In large samples $(\bar{s}' - \bar{s}'')$ will be nearly normally distributed and a significance test can be made in the usual way. If a significant difference is obtained, the populations can confidently be placed in order on the $\lambda$-scale, provided that (4) is a reasonable representation of the populations. The adequacy of (4) could in principle be examined by a $\chi^2$ test, although this was not done in the present application. The correctness of (4) matters only in the derivation of (5) as an optimum statistic and not in the calculation of the standard error; (5) has in any case a certain intuitive appeal.

If the population probabilities are known for $H_0$ and $H_1$, this completes the theory.
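The quantities in (3) to (8) are simple to compute directly. The sketch below is one possible reading of those formulas in Python, not the authors' own procedure: the function names, the three-type probabilities and the two sets of counts are hypothetical and chosen only to exercise the formulas, and the two-sided normal tail probability is used as the "usual" large-sample test.

```python
import math

def scores(theta0, theta1):
    """Score s_i = log(theta_1i / theta_0i) for each type, as in (3)."""
    return [math.log(p1 / p0) for p0, p1 in zip(theta0, theta1)]

def tilted(theta0, theta1, lam):
    """Intermediate population (4): theta_0i^(1-lam) * theta_1i^lam, normalised."""
    w = [p0 ** (1 - lam) * p1 ** lam for p0, p1 in zip(theta0, theta1)]
    total = sum(w)
    return [x / total for x in w]

def mean_score(counts, s):
    """Mean score (5): sum(n_i * s_i) / N."""
    return sum(n * si for n, si in zip(counts, s)) / sum(counts)

def variance_estimate(counts, s):
    """Unbiased estimate (7) of the variance of the mean score."""
    n_total = sum(counts)
    sum_s = sum(n * si for n, si in zip(counts, s))
    sum_s2 = sum(n * si * si for n, si in zip(counts, s))
    return (sum_s2 - sum_s ** 2 / n_total) / (n_total * (n_total - 1))

def compare(counts_a, counts_b, s):
    """Difference of mean scores, its standard error from (8), two-sided p."""
    diff = mean_score(counts_a, s) - mean_score(counts_b, s)
    se = math.sqrt(variance_estimate(counts_a, s) + variance_estimate(counts_b, s))
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return diff, se, p

# Hypothetical example with k = 3 types (not data from the paper).
theta0 = [0.5, 0.3, 0.2]
theta1 = [0.2, 0.3, 0.5]
s = scores(theta0, theta1)

# phi(lambda) = sum(theta_lambda_i * s_i) on a grid of lambda values.
phi = [sum(p * si for p, si in zip(tilted(theta0, theta1, lam), s))
       for lam in (0.0, 0.25, 0.5, 0.75, 1.0)]

diff, se, p = compare([40, 30, 30], [25, 30, 45], s)
```

In this toy case the values in `phi` increase with $\lambda$, which illustrates the monotonicity that allows the mean score to stand in for $\lambda$ itself.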
In the present application the numbers of sentences in Rep. and in Laws are appreciably greater than those in the other books, so that it is reasonable to replace $\theta_{0i}$ and $\theta_{1i}$ by observed frequencies and to neglect the consequent error in $s_i$. In general the large-sample variance of $\bar{s}' - \bar{s}''$ is given by (8) plus

$$\sum_i (\theta_{\lambda i} - \theta_{\mu i})^2\, V(\hat{s}_i) + 2 \sum_{i > j} (\theta_{\lambda i} - \theta_{\mu i})(\theta_{\lambda j} - \theta_{\mu j})\, C(\hat{s}_i, \hat{s}_j), \tag{9}$$

where $\theta_{\lambda i}$, $\theta_{\mu i}$ refer to the two populations under comparison, and where $\hat{s}_i$ is the estimate of $\log(\theta_{1i}/\theta_{0i})$ obtained by replacing probabilities by sample frequencies. If $M_0$, $M_1$ denote the numbers of observations on the reference populations $H_0$, $H_1$, then a simple asymptotic calculation gives

$$V(\hat{s}_i) = \left( \frac{1 - \theta_{0i}}{M_0 \theta_{0i}} + \frac{1 - \theta_{1i}}{M_1 \theta_{1i}} \right) (\log e)^2, \tag{10}$$

$$C(\hat{s}_i, \hat{s}_j) = -\left( \frac{1}{M_0} + \frac{1}{M_1} \right) (\log e)^2 \quad (i \neq j). \tag{11}$$

Note however that under the null hypothesis that the two populations under comparison are identical, i.e. $\lambda = \mu$, the additional terms (9) vanish, so that (7) and (8) may be used for a significance test, with $s_i$ replaced by $\hat{s}_i$.

3. APPLICATION AND DISCUSSION
The first two columns of Table 1 give the percentage distribution among the 32 types for the reference populations $H_0$, Rep., and $H_1$, Laws. The third column gives the estimated scores, $\hat{s}_i$, obtained on replacing the probabilities in (3) by the observed frequencies; the first entry is $\log_e(2.4/1.1) = 0.779$. The remaining columns give the percentage frequency distributions for the doubtful works Crit., ..., Tim.

Table 2 gives the mean scores for the doubtful works; for example that for Crit. is

$$\frac{1}{100} (3.3 \times 0.779 + 2.0 \times 0.867 + \ldots + 4.0 \times 0.215).$$

Mean scores have been calculated also for Rep. and Laws; these are the expected values that would apply to any work for which the distribution of quantities differs only randomly from that in Rep. or Laws.

TABLE 1
Percentage Distribution of Sentence Endings

| Type of ending | $H_0$ Rep. | $H_1$ Laws | $\hat{s}_i$ = natural log of ratio | Crit. | Phil. | Pol. | Soph. | Tim. |
|---|---|---|---|---|---|---|---|---|
| u u u u u | 1.1 | 2.4 | 0.779 | 3.3 | 2.5 | 1.7 | 2.8 | 2.4 |
| - u u u u | 1.6 † | 3.8 | 0.867 | 2.0 | 2.8 | 2.5 | 3.6 | 3.9 |
| u - u u u | 1.7 | 1.9 | 0.113 | 2.0 | 2.1 | 3.1 | 3.4 | 6.0 |
| u u - u u | 1.9 | 2.6 | 0.315 | 1.3 | 2.6 | 2.6 | 2.6 | 1.8 |
| u u u - u | 2.1 | 3.0 | 0.358 | 6.7 | 4.0 | 3.3 | 2.4 | 3.4 |
| u u u u - | 2.0 | 3.8 | 0.642 | 4.0 | 4.8 | 2.9 | 2.5 | 3.5 |
| - - u u u † | 2.1 | 2.7 | 0.255 | 3.3 | 4.3 | 3.3 | 3.3 | 3.4 |
| - u - u u | 2.2 | 1.8 | -0.199 | 2.0 | 1.5 | 2.3 | 4.0 | 3.4 |
| - u u - u † | 2.8 | 0.6 | -1.541 | 1.3 | 0.7 | 0.4 | 2.1 | 1.7 |
| - u u u - | 4.6 | 8.8 | 0.647 | 6.0 | 6.5 | 4.0 | 2.3 | 3.3 |
| u - - u u | 3.3 | 3.4 | 0.030 | 2.7 | 6.7 | 5.3 | 3.3 | 3.4 |
| u - u - u † | 2.6 | 1.0 | -0.956 | 2.7 | 0.6 | 0.9 | 1.6 | 2.2 |
| u - u u - | 4.6 | 1.1 | -1.430 | 2.0 | 0.7 | 1.0 | 3.0 | 2.7 |
| u u - - u | 2.6 | 1.5 | -0.548 | 2.7 | 3.1 | 3.1 | 3.0 | 3.0 |
| u u - u - | 4.4 | 3.0 | -0.385 | 3.3 | 1.9 | 3.0 | 3.0 | 2.2 |
| u u u - - | 2.5 | 5.7 | 0.824 | 6.7 | 5.4 | 4.4 | 5.1 | 3.9 |
| - - - u u | 2.9 | 4.2 | 0.372 | 2.7 | 5.5 | 6.9 | 5.2 | 3.0 |
| - - u - u † | 3.0 | 1.4 | -0.761 | 2.0 | 0.7 | 2.7 | 2.6 | 3.3 |
| - - u u - | 3.4 | 1.0 | -1.224 | 0.7 | 0.4 | 0.7 | 2.3 | 3.3 |
| - u - - u † | 2.0 | 2.3 | 0.140 | 2.0 | 1.2 | 3.4 | 3.7 | ... |
| - u - u - † | 6.4 | 2.4 | -0.982 | 1.3 | 2.8 | 1.8 | 2.1 | ... |
| - u u - - | 4.2 | 0.6 | -1.946 | 4.7 | 0.7 | 0.8 | 3.0 | ... |
| u - - - u † | 2.8 | 2.9 | 0.039 | 1.3 | 2.6 | 4.6 | 3.4 | ... |
| u - - u - † | 4.2 | 1.2 | -1.253 | 2.7 | 1.3 | 1.0 | 1.3 | ... |
| u - u - - † | 4.8 | 8.2 | 0.536 | 5.3 | 3.3 | 2.8 | 4.5 | 2.4 |
| u u - - - † | 2.4 | 1.9 | -0.231 | 3.3 | 3.3 | 3.0 | 4.6 | 3.4 |
| u - - - - † | 3.5 | 4.1 | 0.157 | 5.3 | 3.0 | 3.3 | 2.9 | 1.8 |
| - u - - - | 4.0 | 3.7 | -0.077 | 4.7 | 3.3 | 4.9 | 3.5 | 3.0 |
| - - u - - | 4.1 | 2.1 | -0.668 | 6.0 | 2.3 | 2.1 | 4.1 | 6.4 |
| - - - u - | 4.1 | 8.8 | 0.765 | 2.0 | 9.0 | 6.8 | 4.7 | 3.8 |
| - - - - u | 2.0 | 3.0 | 0.405 | 3.3 | 2.9 | 2.9 | 2.6 | 2.2 |
| - - - - - | 4.2 | 5.2 | 0.215 | 4.0 † | ... | ... | ... | ... |
| Number of sentences | 3,778 | 3,783 | - | 150 | 958 | 770 | 919 | 762 |

There are minor errors in the third decimal place of the scores $\hat{s}_i$.
† Pattern or entry damaged in this copy and restored from context (the systematic order of the 32 patterns, the printed score, or the worked example for Crit.); the restoration is not certain. For the three rows u - u - - to u - - - - the entries for the doubtful works were likewise reassembled from displaced columns.
... Entry not recoverable from this copy. Ten further entries (2.5, 2.0, 3.3, 3.8, 4, 4.9, 7.3, 2.5, 3.0, 2.2) survive but cannot be assigned to cells with confidence.

The estimated variances of the mean scores, also given in Table 2, are calculated from (7), again replacing $s_i$ by $\hat{s}_i$. Thus for Crit. the estimated variance is

$$\frac{1}{149} \left\{ \frac{3.3 \times 0.779^2 + \ldots + 4.0 \times 0.215^2}{100} - \frac{(3.3 \times 0.779 + \ldots + 4.0 \times 0.215)^2}{100^2} \right\},$$

the divisors 100 being inserted because the frequencies in Table 1 are percentages.

TABLE 2
Mean Scores and Their Standard Errors

| | Crit. | Phil. | Pol. | Soph. | Tim. | Rep. | Laws |
|---|---|---|---|---|---|---|---|
| Mean score | -0.0346 | 0.1996 | 0.1303 | -0.0407 | -0.1170 | -0.2652 | 0.2176 |
| Estimated variance | 0.003799 | 0.0003342 | 0.0003973 | 0.0005719 | 0.0007218 | | |
| Estimated st. error | 0.0616 | 0.0183 | 0.01993 | 0.0239 | 0.0269 | | |

The variance, and hence the standard error, of the difference between any two mean scores can now be found, and the order determined. The more critical comparisons are shown in Table 3; those not shown in the table are very highly significant statistically.

TABLE 3
Some Comparisons of Mean Scores

| Comparison | Difference | Estimated St. Error | Level of Significance Attained (%) |
|---|---|---|---|
| Tim. v. Soph. | 0.0763 | 0.0360 | 5 |
| Tim. v. Crit. | 0.0824 | 0.0672 | 20 |
| Soph. v. Crit. | 0.0061 | 0.0661 | >90 |
| Crit. v. Pol. | 0.1649 | 0.0648 | 1 |
| Pol. v. Phil. | 0.0693 | 0.0270 | 1 |

The final ordering is Rep., Tim., Soph., Crit., Pol., Phil., Laws: there is reasonably strong evidence that Tim. is correctly placed before Soph., but the position of Crit. could be anywhere from somewhat before Tim. to before Pol. This conclusion agrees broadly with the one arrived at in the earlier work mentioned in §1. It is not in accord with the views held by the majority of classical scholars, although there is a minority group who have reached a similar ordering by apparently independent arguments. Brandwood (1958) has described further aspects of the analysis and given a detailed discussion of the conclusions.

One further point that may be noted briefly is that Rep. and Laws are each divided into a number of books; these have been treated as homogeneous in the above analysis. This can be checked by calculating mean scores separately for each book and comparing the dispersion of these means with that to be expected from (7). The agreement is good.
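The comparisons in Table 3 follow from the mean scores and estimated variances in Table 2 by way of (8). As a hedged illustration, the Python sketch below recomputes the differences and standard errors and attaches a two-sided normal tail probability to each; the dictionary layout and function name are assumptions for the example, and the normal approximation is only one reading of the "level of significance attained" column.

```python
import math

# Mean scores and estimated variances as printed in Table 2.
mean = {"Crit": -0.0346, "Phil": 0.1996, "Pol": 0.1303,
        "Soph": -0.0407, "Tim": -0.1170}
var = {"Crit": 0.003799, "Phil": 0.0003342, "Pol": 0.0003973,
       "Soph": 0.0005719, "Tim": 0.0007218}

def compare(a, b):
    """Absolute difference, standard error from (8), two-sided normal p (%)."""
    diff = abs(mean[a] - mean[b])
    se = math.sqrt(var[a] + var[b])
    p_percent = 100.0 * math.erfc(diff / se / math.sqrt(2))
    return diff, se, p_percent

pairs = [("Tim", "Soph"), ("Tim", "Crit"), ("Soph", "Crit"),
         ("Crit", "Pol"), ("Pol", "Phil")]
table3 = {pair: compare(*pair) for pair in pairs}

for (a, b), (diff, se, p) in table3.items():
    print(f"{a} v. {b}: diff={diff:.4f} se={se:.4f} p={p:.1f}%")

# Ordering of the doubtful works by increasing mean score.
order = sorted(mean, key=mean.get)
```

Rounded to four decimals, the recomputed differences and standard errors should agree with Table 3, and sorting by mean score reproduces the ordering Tim., Soph., Crit., Pol., Phil. stated in the text.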
Further, it was expected on general grounds that there would be no systematic change in style with serial number; this was confirmed.

A final general remark is that the frequency distributions in Table 1 are based not on samples, but on complete enumeration. In applying probabilistic arguments we are in effect assuming that certain aspects of Plato's writings are adequately described by the laws of probability for independent events; this assumption receives some support from the fact just noted about the dispersion between books within works.

REFERENCES
BARNARD, M. M. (1935), "The secular variations of skull characters in four series of Egyptian skulls", Ann. Eug. London, 6, 352-362.
BILLIG, L. (1920), "Clausulae and Platonic chronology", J. Philol., 35, 225-256.
BRANDWOOD, L. (1958), The dating of Plato's works by the stylistic method - a historical and critical survey. Ph.D. thesis, University of London.
KALUSCHA, W. (1904), "Zur Chronologie der platonischen Dialoge", Wiener Studien, 26, 190-204.
WELCH, B. L. (1939), "Note on discriminant functions", Biometrika, 31, 218-219.