TEST-RETEST AND INTER-ANALYST RELIABILITY OF THE AUTOMATED READABILITY INDEX, FLESCH READING EASE SCORE, AND THE FOG COUNT

Georgelle Thomas [a]
Georgia Southern College

R. Derald Hartley
So. Carolina Vocational Rehabilitation Dept.

J. Peter Kincaid [b]
Georgia Southern College

Journal of Reading Behavior 1975 VII, 2
Downloaded from jlr.sagepub.com at PENNSYLVANIA STATE UNIV on February 21, 2016

[a] Reprints may be requested from Dr. Thomas, Department of Psychology, Georgia Southern College, Division of Social Sciences, Statesboro, Georgia 30458.
[b] This research was supported by Grant #OEG-4-71-0069 from the United States Office of Education, Department of Health, Education, and Welfare, and by a Georgia Southern College Faculty Research Committee Grant.

Abstract. Using six analysts, test-retest and inter-analyst reliabilities were determined for the Automated Readability Index (ARI), the Flesch Reading Ease Score, and the Fog Count. All coefficients, with the exception of one Flesch measure, were above .94. Analysis of variance applied to measured working times indicated that the Flesch takes significantly longer to use than the ARI and Fog.

A number of readability formulas have been designed to assess the difficulty level of written material. Generally, readability formulas include at least two components: (1) some measure of sentence difficulty (usually sentence length) and (2) some measure of word difficulty. These two measures are typically put into a regression equation which assigns a numerical value denoting ease or difficulty of the material in the form of a total formula score or a grade level.

The Flesch formula (Flesch, 1948; Farr & Jenkins, 1949; Flesch, 1951) and the Fog Count (Gunning, 1952) have been widely used in the last two decades. More recently, a new readability formula, the Automated Readability Index (ARI), has been introduced (Smith & Senter, 1967). With the ARI, reading material is typed on a slightly modified IBM Selectric typewriter, the Readability Index Tabulator. [c] The modification consists of attaching three microswitches which cumulatively record the formula factors: number of strokes (recorded each time the ball of the typewriter goes forward), number of words (recorded each time the space-bar is pressed), and number of sentences (recorded each time the = is pressed). This last operation, recording sentences, is the primary change the typist must make in his usual typing technique; he is simply instructed to punctuate the sentence as usual and follow each end-of-sentence punctuation mark with a =.

[c] A complete description, pictures, and a wiring diagram are in the appendix of the Kincaid, Van Duesen, Thomas, Lewis, Anderson and Moody (1972) report.

The ARI prediction equation is as follows:

    GL = 0.50 (wd/sn) + 4.71 (st/wd) - 21.43

where
    GL    = assigned grade level
    wd/sn = words per sentence, or sentence length
    st/wd = strokes per word, or word length

There are two published validation studies of the ARI (Smith & Kincaid, 1970; Kincaid & Delionbach, 1973) which used military technical material with military personnel serving as subjects. Kincaid et al. (1972) validated the ARI using adult basic reading material and subjects enrolled in a federally sponsored job training program.

Reliability studies of the Flesch Reading Ease Score (Hayes, Jenkins, & Walker, 1950; England, Thomas, & Paterson cited in Klare, 1963) indicated test-retest and between-analyst reliabilities of about +.90. Kincaid et al. (1972) reported very high test-retest reliabilities for the ARI formula factors (each exceeding +.98). A search of the literature failed to locate reliability studies on the Fog Count.

The present investigators obtained reliability coefficients for the ARI, the Flesch Reading Ease Score, and the Fog Count. Test-retest and inter-analyst reliabilities were determined for the separate formula factors, total formula scores, and grade level.
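
The ARI prediction equation given earlier reduces to simple arithmetic on the three tabulated counts. The following is a minimal Python sketch of that arithmetic only; the function name `ari_grade_level` and the example counts are illustrative assumptions and do not come from the original report.

```python
def ari_grade_level(strokes: int, words: int, sentences: int) -> float:
    """Return the ARI grade level (GL) from the three tabulated formula factors.

    Implements GL = 0.50 (wd/sn) + 4.71 (st/wd) - 21.43 as stated in the text.
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must both be positive")
    words_per_sentence = words / sentences  # wd/sn, sentence length
    strokes_per_word = strokes / words      # st/wd, word length
    return 0.50 * words_per_sentence + 4.71 * strokes_per_word - 21.43


if __name__ == "__main__":
    # Hypothetical counts for one passage, chosen only for illustration.
    print(round(ari_grade_level(strokes=500, words=100, sentences=5), 2))
```

With these made-up counts (20 words per sentence, 5 strokes per word) the equation gives a grade level of about 12.1.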
Additionally, time measures were recorded for each analyst on all three formulas.

METHOD

Subjects

Six paid female volunteers served as analysts. They were selected on the basis of a standardized typing test. Group mean typing speed was 56.7 words per minute with a mean error rate of .57 errors per minute.

Materials

The written material consisted of 20 paragraphs of the Minnesota Reading Examinations for College Students, Forms A and B (Haggerty & Eurich, 1930). Instructions for calculating all formulas were compiled and edited for simplicity and clarity. A combined tabulation and computational sheet was devised for each formula. The Readability Index Tabulator and electronic calculators were the only other materials.

Procedure

Prior to the study, the analysts attended a training session in which they were familiarized with all three readability formulas, and were given practice in applying the formulas, using the tabulation sheets, the Readability Index Tabulator and the calculator. In the study, analysts worked independently. Each analyst was given the 20 paragraphs and 20 tabulation sheets and worked with a particular formula until all 20 selections were completed. First, analysts examined (either manually or with the Readability Index Tabulator) each selection for the formula factors required to compute the total score or grade level. When these values were recorded on a tabulation sheet, that sheet was given to the examiner who recorded the counting time spent on that paragraph. After the 20 selections were completed in this manner, all the tabulation sheets were returned to the analysts so that the necessary mathematical computations could be completed. As computations for each paragraph were completed, the tabulation sheets were returned to the examiner who recorded computation time. The analysts completed all three formulas in the above manner.
Order of presentation of the three readability formulas to each analyst was determined by a table of random numbers. After an interval of two weeks, the analysts reapplied all the formulas in the same order as the first session.

RESULTS

The test-retest reliability coefficients are found in Table 1. Inter-analyst reliability was determined by the intraclass procedure described by Guilford (1954); this statistic yields an average intercorrelation based on all possible pairs of analysts. Results are in Table 2.

Total time data are presented in Table 3. An analysis of variance was performed on the total time data with Test-Retest and Formula as factors. A significant Test-Retest factor, F(1, 25) = 16.39, p < .01, indicates the expected practice effect in the use of the formulas. The Formula factor result, F(2, 25) = 21.18, p < .01, was further examined by a Duncan's New Multiple Range Test. Significant differences (p < .01) were found between the ARI and the Flesch, and between the Fog Count and the Flesch.
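
The test-retest coefficients in Table 1 are Pearson product-moment correlations. As a minimal sketch of that statistic, the Python below computes r for two sets of paired scores. The function name `pearson_r`, the pairing of scores by passage, and the example values are assumptions made for illustration; they are not the study's data, and the report does not spell out how scores were paired.

```python
import math


def pearson_r(first, second):
    """Pearson product-moment correlation between two paired score lists."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long lists with at least 2 scores")
    n = len(first)
    mean_first = sum(first) / n
    mean_second = sum(second) / n
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((a - mean_first) * (b - mean_second) for a, b in zip(first, second))
    sxx = sum((a - mean_first) ** 2 for a in first)
    syy = sum((b - mean_second) ** 2 for b in second)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical grade levels for five passages at test and at retest.
    test_scores = [9.8, 11.2, 12.5, 10.1, 13.4]
    retest_scores = [10.0, 11.0, 12.9, 10.4, 13.1]
    print(round(pearson_r(test_scores, retest_scores), 3))
```

The inter-analyst coefficients in Table 2 come from a different statistic, the intraclass procedure, which this sketch does not attempt to reproduce.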
Table 1
Pearson Product Moment Test-Retest Correlations for ARI, Flesch Reading Ease Score, and Fog Count

Readability Measure                    Correlation Coefficient
ARI
  Strokes                              .990
  Words                                .994
  Sentences                            .991
  Total ARI                            .987
  Grade Level                          .989
Flesch Reading Ease Score
  Syllables                            .983
  Words                                .997
  Sentences                            .945
  Total Flesch Reading Ease Score      .795
Fog Count
  Easy Elements                        .977
  Polysyllables                        .980
  Sentences                            .958
  Total Fog Count                      .941
  Grade Level                          .963

Table 2
Inter-Analyst Correlations for ARI, Flesch Reading Ease Score, and Fog Count

Readability Measure                    Correlation Coefficient
ARI
  Strokes                              .994
  Words                                .999
  Sentences                            .998
  Total ARI                            .987
  Grade Level                          .998
Flesch Reading Ease Score
  Syllables                            .995
  Words                                .999
  Sentences                            .979
  Total Flesch Reading Ease Score      .969
Fog Count
  Easy Elements                        .988
  Polysyllables                        .995
  Sentences                            .986
  Total Fog Count                      .990
  Grade Level                          .994

Table 3
Summary Table of Total Time Measures (in Minutes) of Test-Retest and Formula Factors

                               Formula
                       Fog      ARI      Flesch    Total Time    Mean (Per Passage)
Test                   44.2     34.0     62.2      140.4         9.3
Retest                 28.1     29.2     46.8      104.1         6.9
Total Time             72.2     63.3     109.0
Mean (Per Passage)     7.2      6.3      10.9

DISCUSSION

The test-retest and inter-analyst reliability coefficients indicate the consistently high reliability of all three formulas. Only one r (Flesch Reading Ease Score test-retest r = .795) was below .94. The low r might have resulted from the comparatively more difficult application of the Flesch regression equation and hence a practice effect. Analysis of variance of total time measures revealed that the Flesch formula takes significantly longer to use than the ARI and the Fog Count.
It is entirely possible that the use of one of the two available Flesch charts, the Farr-Jenkins Table (Farr & Jenkins, 1949) or the Flesch Nomograph (Flesch, 1951), would decrease total time required for computation. The total application time for the ARI is, of course, dependent upon the efficient use of a typewriter. The analysts in the present study were fairly skilled (57.7 words per minute with a per-minute error rate of .57). It appears, then, that the ARI can be applied about as rapidly as the Fog Count if the typist achieves a speed of 55-60 words a minute.

Aside from time considerations, the ARI has several advantageous qualities. As pointed out by Smith and Kincaid (1970), the Tabulator accomplishes two things simultaneously: it provides a typed copy of the material, and it automatically tabulates the necessary formula factors, thus bypassing the necessity of manual counting.

All three formulas have been programmed for computer use, thus permitting the evaluation of larger quantities of material by a reduced number of personnel. Of the formulas, the ARI lends itself best to computer programming, as strokes, words, and sentences can be perfectly counted by a computer. The report by Kincaid et al. (1972) contains a FORTRAN IV program for the ARI. Both the Fog Count and Flesch Reading Ease Score require the counting of syllables, and this is somewhat more difficult to program for computer computation. However, Klare, Rowe, St. John, and Stolurow (1969) reported a program for the Flesch formula that counts syllables with over 99% accuracy. Because any readability formula gives only an indication of reading difficulty, this accuracy is quite satisfactory. These same investigators (Klare et al., 1969) also developed a computer program for the Fog Count.
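
To illustrate why strokes, words, and sentences are easy for a computer to count, here is a minimal Python sketch. It is not the FORTRAN IV program from the Kincaid et al. (1972) report. The function name `count_ari_factors` is hypothetical, and two counting rules are assumptions of this sketch: a stroke is taken to be any non-whitespace character, and a sentence is taken to end at a run of `.`, `!`, or `?` followed by whitespace or the end of the text, whereas the Tabulator relied on the typist pressing = after each sentence.

```python
import re


def count_ari_factors(text: str) -> tuple:
    """Return (strokes, words, sentences) for a passage of plain text."""
    tokens = text.split()
    # Words: whitespace-separated tokens, mirroring one count per space-bar press.
    words = len(tokens)
    # Strokes (assumed rule): every non-whitespace character, punctuation included.
    strokes = sum(len(token) for token in tokens)
    # Sentences (assumed rule): terminal punctuation followed by whitespace or
    # end of text. Abbreviations such as "Dr." would be over-counted.
    sentences = len(re.findall(r"[.!?]+(?=\s|$)", text))
    return strokes, words, sentences


if __name__ == "__main__":
    passage = "The cat sat. It ran away!"
    print(count_ari_factors(passage))
```

The three counts returned here are the inputs that the ARI prediction equation needs; no syllable counting is involved, which is the point made in the paragraph above.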
Coke and Rothkopf (1970) have developed three algorithms for estimating syllables from vowels per word, consonants per word, and letters per word. The vowels-per-word algorithm gives the best estimate, and this can be used to calculate Flesch Reading Ease scores.

REFERENCES

COKE, E. V., & ROTHKOPF, E. Z. Note on an algorithm for a computer-produced reading ease score. Journal of Applied Psychology, 1970, 54, 208-210.
FARR, J. N., & JENKINS, J. J. Tables for use with the Flesch Readability Formula. Journal of Applied Psychology, 1949, 33, 275-278.
FLESCH, R. A new readability yardstick. Journal of Applied Psychology, 1948, 32, 221-233.
FLESCH, R. How to test readability. New York: Harper & Brothers, 1951.
GUILFORD, J. P. Psychometric methods. New York: McGraw-Hill, 1954.
GUNNING, R. The technique of clear writing. New York: McGraw-Hill, 1952.
HAGGERTY, M. E., & EURICH, A. C. Minnesota reading examinations for college students, Form A and Form B. Minneapolis, Minnesota: University of Minnesota Press, 1930.
HAYES, P. M., JENKINS, J. J., & WALKER, B. J. Reliability of the Flesch Readability Formulas. Journal of Applied Psychology, 1950, 34, 22-26.
KINCAID, J. P., & DELIONBACH, L. J. Validation of the Automated Readability Index: A follow-up. Human Factors, 1973, 15, 17-20.
KINCAID, J. P., VAN DUESEN, J., THOMAS, G., LEWIS, R., ANDERSON, P. T., & MOODY, L. Use of the Automated Readability Index for evaluating peer-prepared material for use in adult reading education. OEG-4-71-0069. Statesboro, Georgia: Georgia Southern College, 1972 (ERIC file # ED-068814).
KLARE, G. R. The measurement of readability. Ames, Iowa: Iowa State University Press, 1963.
KLARE, G. R., ROWE, P. P., ST. JOHN, M. G., & STOLUROW, L. M. Automation of the Flesch "Reading Ease" Readability Formula, with various options. Reading Research Quarterly, 1969, 4, 550-559.
SMITH, E. A., & KINCAID, J. P. Derivation and validation of the Automated Readability Index for use with technical materials. Human Factors, 1970, 12, 457-464.
SMITH, E. A., & SENTER, R. J. Automated Readability Index. AMRL-TR-66-22. Wright Patterson AFB, Ohio: Aerospace Medical Division, 1967.