International Comparisons and the Condition of American Education

GERALD W. BRACEY

Educational Researcher, Vol. 25, No. 1, pp. 5-11

GERALD W. BRACEY is an educational researcher, writer, and Executive Director of The Alliance for Curriculum Reform, 3333 Helen Street, Alexandria, VA 22305. His areas of interest are assessment and policy analysis.

JANUARY-FEBRUARY 1996. Downloaded from http://er.aera.net at PENNSYLVANIA STATE UNIV on September 12, 2016.

International comparisons of achievement levels across various curriculum subject areas have received a considerable amount of publicity. Indeed, some argue that trends on domestic achievement tests are no longer relevant because the salient reference groups are students in other nations. In articles outside of educational research, the reports on international comparisons have accentuated the negative, even when the results have been positive. For instance, the November 1992 issue of The American School Board Journal carried an unbylined report on How in the World Do Students Read? under the headline, "Good News: Our 9-Year-Olds Read Well; Bad News: Our 14-Year-Olds Don't." In fact, as discussed in more detail below, 14-year-olds finished eighth among 31 nations, and only one of the seven countries ahead of the United States actually had significantly higher scores.

On the pages of educational research journals, such comparisons have also been the subject of debate: Is the United States an underachiever in comparison to Japan or not? (Baker, 1993; Westbury, 1992, 1993). Stedman (1994a) addressed the issue in a more balanced manner than most commentators, but even his treatment detailed problems while giving short shrift to positive results (the reading study mentioned previously received only a one-sentence mention, for instance).

As noted, though various researchers have made various interpretations of the studies, elsewhere there has been little ambiguity about what the results show:

American students are performing at much lower levels than students in other industrialized nations (Shanker, 1992a).

International examinations designed to compare students from all over the world usually show American students at or near the bottom (Shanker, 1992b).

Yet even America's best high school students, as international comparisons reveal, rank far behind students in countries challenging us in the multinational marketplace (Gray & Kemp, 1993).

In mathematics and science, American high schoolers finish last or next to last in virtually every international measure (Gerstner, Semerad, Doyle, & Johnston, 1994).

We have gotten used...to coming in dead last in international math comparisons (Krauthammer, 1994).

An 'F' In International Competition (Newsweek, 1992).

Given that the results of at least one study were debated with vigor, given the importance that international studies have assumed, and given the strong negative pronouncements emanating from a variety of sources in the general media, it seems appropriate to summarize the findings of these various reports.

Because some have challenged the methodology of early studies (Rotberg, 1990), I begin with the Second International Mathematics Study (SIMS; 1987). This study is not without its problems, especially at the 12th grade. At this level, SIMS administered tests of advanced algebra and calculus, but only to those students still taking science and advanced mathematics. Such an attempt to measure apples against apples fails because the proportion of students in the mathematics classes varies enormously among countries. For instance, in Hong Kong only 2% of the students were still enrolled in mathematics courses, whereas in Japan 12% were and in Hungary 50% were. This difference no doubt accounts for changes in the relative rankings from 8th grade to 12th grade. Hong Kong moved from an average ranking in Grade 8 to number 1 in Grade 12, whereas Hungary fell from number 2 to near the bottom. Japan finished number 1 at the 8th-grade level, number 2 at the 12th-grade level. Such vagaries cloud the meaning of different countries' average scores.

Such selection biases are likely one major reason why no study since SIMS has attempted to measure achievement at the senior high school level (the Third International Mathematics Study will try). Another is that students at this level, in this country at least, have no reason to perform well on a test that has no relevance to their lives. Such omissions carry their own risks, of course. Because of tracking programs in the high school, curricula may differ; for some nations, eighth grade is the last time that all students in a cohort will be studying the same material.

The United States did not finish "dead last" in any of the comparisons. At the 12th grade, U.S. students were 14th of 15 nations in advanced algebra, 12th of 15 in geometry, and 12th of 15 in calculus. These finishes are quite in line with the SIMS measurement of Opportunity To Learn (OTL). Stedman (1994b) makes much of the SIMS comparisons of the top 1% and 5% of 12th-grade students in calculus, thinking, incorrectly, that this equated nations for selectivity. Says Stedman, "The researchers found that the United States ranked last or next to last in each analysis, scoring significantly lower than many countries." In fact, no comparisons of significance were made. Although the U.S. did achieve the low ranks Stedman claims, the U.S. top 1% scored at 61 and the top-ranked Japanese scored at 70. Differences among the scores of the top 5% in various countries were also small except in comparison to those of the Japanese.

At the eighth-grade level, U.S. students finished in the middle in arithmetic, 10th of 20 nations; they were 12th of 20 in algebra, 16th of 20 in geometry, and 18th of 20 in measurement. Many countries have quite similar scores.
For instance, in geometry U.S. students average 38% correct to finish in 16th place, whereas Scotland attained a 4th-place ranking with an average of 44% correct. America's average score of 51 in arithmetic garnered 10th place, whereas an average of 59 would have meant a third-place finish. One wonders if the differences among the averages are sufficiently large to form a basis for making policy decisions about the state of education in any nation. As will be seen later, average scores are also quite misleading given the enormous variability about the average.

The slightly below-average finish in algebra seems unusually high given that most eighth-grade students in this country do not take algebra as a formal course and most students in other nations do. Indeed, Westbury has contended that only the comparison of arithmetic scores is a like-against-like comparison, at least when looking at American scores versus those of the Japanese, because the curricular sequences differ so greatly in other curricular areas (Westbury, 1993).

It was the concern of comparability that led Westbury to disaggregate U.S. scores according to the kind of class that students were in: remedial, typical, prealgebra, or algebra. With such a disaggregation, U.S. students actually taking algebra scored somewhat higher than Japanese students, whose algebra average score was the highest of any nation. Of course, comparing a group of 20% of American students, probably the most academically able 20%, against an entire Japanese cohort introduces a selection bias favoring the American students. However, when Westbury compared the scores of those Americans taking algebra against the top 20% of Japanese students, they still scored as well as this Japanese elite (Westbury, 1992). Such results, not uncommon, run counter to claims emanating from the work of Stevenson (Stevenson et al., 1990; Stevenson, 1992) that only a tiny fraction of American students score as well in mathematics as the median Japanese student.
The 1992 Second International Assessment of Educational Progress (IAEP-2) produces results similar to those found in Westbury's analyses (Lapointe, Askew, & Mead, 1992; Lapointe, Mead, & Askew, 1992). In mathematics, the U.S. ranks low, 13th of 15 nations for 9-year-olds and 14th of 15 for 13-year-olds. Ranks, of course, obscure performance, and when one looks at the actual scores, the U.S. average is only slightly behind the international average, as shown in Table 1. Table 2 shows the comparable results for science: U.S. 9-year-olds score just above the international average (they ranked 3rd among 15 nations), whereas 13-year-olds have an average score identical to the international average.

Table 1
U.S. and International Averages for Mathematics on the Second International Assessment of Educational Progress (percent correct)

                 U.S.    International
9-year-olds       58          63
13-year-olds      55          58

Another way of looking at these results is to compare the distance from the U.S. score to that of the top-ranked country in terms of number of items. These results are shown in Table 3. The picture that emerges here is clearly mixed, with U.S. 9-year-olds averaging only one item fewer correct in science than the top-ranked nation, while American 13-year-olds missed 14 more items in mathematics, on average, than those in the top-ranked nation.

As with SIMS, scores among countries in IAEP-2 are often tightly bunched, and small differences in scores make large differences in ranks. If U.S. 13-year-olds had scored 72% correct in science instead of 67, they would have finished 5th rather than 13th. Similarly, if the third-ranked 9-year-olds had scored 60 instead of 65, they would have finished 12th.
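The item-level gaps just described can be recomputed from the raw scores reported in Table 3. The following is a minimal sketch rather than part of the original analysis: the dictionary layout and the function name `item_gaps` are illustrative assumptions, and only the numbers come from the table.

```python
# Raw scores from Table 3 (IAEP-2): (top-ranked nation, United States, maximum possible).
# The data layout and all names here are illustrative assumptions, not the study's own.
TABLE_3 = {
    "Mathematics, 9-year-olds": (46, 35, 61),
    "Mathematics, 13-year-olds": (55, 41, 75),
    "Science, 9-year-olds": (39, 38, 58),
    "Science, 13-year-olds": (57, 48, 72),
}


def item_gaps(table):
    """Return, per test, the U.S. gap to the top-ranked nation
    as (number of items, percentage points of the maximum score)."""
    gaps = {}
    for test, (top, us, maximum) in table.items():
        gap_items = top - us
        gap_points = 100 * gap_items / maximum
        gaps[test] = (gap_items, round(gap_points, 1))
    return gaps


if __name__ == "__main__":
    for test, (items, points) in item_gaps(TABLE_3).items():
        print(f"{test}: {items} items ({points} percentage points)")
```

Expressing each gap both in items and as a share of the maximum score makes it easier to see that the same rank distance can correspond to very different score distances.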
These results point to an important and often ignored aspect of international comparisons: Most countries score close together such that small differences in scores make large differences in ranks. It is difficult to imagine drawing strong conclusions or framing major policy decisions from such small differences.

Table 2
U.S. and International Averages for Science on the Second International Assessment of Educational Progress (percent correct)

                 U.S.    International
9-year-olds       65          62
13-year-olds      67          67

Table 3
Raw Scores of U.S. and Top-Ranked Nations, Second International Assessment of Educational Progress

                             Top-ranked nation   United States   Maximum possible
Mathematics, 9-year-olds            46                 35               61
Mathematics, 13-year-olds           55                 41               75
Science, 9-year-olds                39                 38               58
Science, 13-year-olds               57                 48               72

Yet another picture emerges, perhaps the most meaningful picture, if one compares the results from IAEP-2 with results from the 1992 NAEP mathematics assessment, which included state-level data for 41 states. Some such comparisons are made in the NCES report Education in States and Nations, in which the 13-year-old IAEP-2 data are transformed into NAEP scales (NCES, 1993). Using other NAEP reporting categories (ETS, 1992), I found the following situation among the top scorers (Bracey, 1994).

Asian students, U.S. schools               287
Taiwan                                     285
Korea                                      283
Advantaged urban students, U.S. schools    283
Iowa, North Dakota                         283
White students, U.S. schools               277
Hungary                                    277

That Asian American students outscore all other nations, states, or other NAEP reporting groups at the very least raises questions about the sources of their achievement. That White students tie for third place with Hungary (comparing only countries) means that a large majority of American students are scoring at the highest levels: White and Asian students together comprise more than 70% of the K-12 population. With a near-average overall score and some American groups scoring high, some must be scoring low.
At the lower end of the distribution these results appear (Bracey, 1994).

Jordan                                        246
Mississippi                                   246
Hispanic students, U.S. schools               245
Disadvantaged urban students, U.S. schools    239
Black students, U.S. schools                  236

There is no NAEP category of "disadvantaged rural students," else we would likely find another group at this lower end. Research by the Population Reference Bureau reveals a rural underclass about half the size of the urban underclass (O'Hare & Curry-White, 1992). These rural dwellers are in even more dire straits than poor people in the cities.

These data are reported by ethnicity because ethnicity is a commonly used system in this nation for categorizing results (see Note 1). However, other data, such as those reported by NCES (1990, p. 26), Jaeger (1992), and Robinson and Forsyth (1994), strongly convey that much of the difference is actually attributable to social factors. Robinson and Forsyth, for instance, noted that 83% of the variance in state-level NAEP scores could be accounted for by four variables: number of parents in the home, parental educational level, type of community, and state poverty rates at ages 5-17.

Another IAEP-2/NAEP table in Education in States and Nations strongly suggests that comparison of nations' school systems on the basis of average scores is not meaningful. As one moves from Jordan, the lowest nation at 246, to Taiwan, the highest nation at 285, one traverses 39 NAEP scale points. Within Taiwan, though, as one moves from the 5th percentile to the 95th percentile, one traverses 123 NAEP scale points. If the distance were from the 1st to the 99th percentile, another 20-odd points would be added. Many other states and nations also show similar patterns. Thus, the within-country and within-state variance is much greater than the between-country or between-state variance.
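The comparison of spreads above reduces to simple arithmetic. As a rough sketch using only the NAEP-scale figures quoted in the text (the variable names are illustrative assumptions, and a 5th-to-95th percentile range stands in for spread here; it is not a formal variance):

```python
# NAEP-scale figures quoted in the text; variable names are illustrative assumptions.
lowest_nation_mean = 246      # Jordan, lowest national average
highest_nation_mean = 285     # Taiwan, highest national average
taiwan_p5_to_p95_range = 123  # distance from the 5th to the 95th percentile within Taiwan

# Range of national averages (between-country spread).
between_country_range = highest_nation_mean - lowest_nation_mean

# How many times larger the within-Taiwan spread is than the between-country spread.
ratio = taiwan_p5_to_p95_range / between_country_range

print(f"Between-country range of averages: {between_country_range} points")
print(f"Within-Taiwan 5th-95th percentile range: {taiwan_p5_to_p95_range} points")
print(f"Within-country spread is about {ratio:.1f} times the between-country spread")
```

On these figures, the spread of scores inside a single country is roughly three times the distance separating the lowest and highest national averages.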
Though some critics of American schools have stated or strongly implied that students in other nations are monolithically better than U.S. students (e.g., Gray & Kemp, 1993; Shanker, 1992a, 1992b), this is not true. Indeed, the variabilities are such that one must question how representative the average scores are. In IAEP-2, for instance, in a number of comparisons, the 95th percentile of the United States is higher than the 95th percentile of some countries with higher average scores.

The variability of scores, somewhat incidentally, raises important questions for endeavors such as the New Standards Project and other standard-setting activities. Where could one set a standard that would be perceived as credibly high without failing large percentages of students? And what would one do then?

The most damaging assessments of the condition of U.S. schools have come from a series of studies conducted on a much smaller scale than SIMS or IAEP-2 by Harold Stevenson and James Stigler and various colleagues at the University of Michigan (Stevenson, 1992; Stigler & Stevenson, 1991; Stevenson, Lee, Chen, & Lummis, 1990; Stigler, Lee, & Stevenson, 1987; Stevenson, Lee, & Stigler, 1986; and summarized in the book, The Learning Gap, Stevenson & Stigler, 1992). This work has led to the widespread citation of statistics purportedly showing that only 1% of American students perform as well in mathematics as the median Japanese student (Matthews, 1992).

It should be noted that the cover of The Learning Gap reads, in its entirety, "Why American Schools Are Failing and What We Can Learn From Japanese and Chinese Education." However, the book does not examine "American schools." It contains some information from kindergarten, but the bulk of the data are from Grades 1 and 5 only and almost exclusively from mathematics. One wonders how much such data from two elementary grades and one curriculum subject can say about the entire American educational system.
Or the Japanese system.

While the cover wording might be attributed to the publisher's marketing needs, there are methodological problems in the Stevenson et al. studies. The various articles do not reveal how the schools were selected or how representative they are. It would be naive in the extreme to believe that a nation as closed, a nation as obsessed with its public image as The People's Republic of China (demonstrated vividly in the recent two women's conferences held there) would give an American researcher free access to a random sample of schools.

Kazuo Ishizaka, a former teacher and member of the National Institute for Educational Research, has called attention to selection problems in Japan (Ishizaka, 1994). He contends that Stevenson, like other visitors, is gaining entrance only to the schools the government wishes them to see. Ishizaka reports that though the curriculum in Japan is demanding, achievement on that curriculum is low. This outcome is important when coupled with a selection bias: "I took (visiting) teachers to an ordinary level high school and they said 'Oh it is terrible. Why did you guide us to such schools?'"

Ishizaka's observations, empirical but not systematic, need to be followed up, for they have implications far beyond the Stevenson work. Stedman (1994a) and others have assumed either that countries obtained a representative sample of students and/or that, by selecting only the top 1% or 5% in the study, selectivity was countered. Says Stedman (1994a), "The point is that the researchers who conducted the assessments explicitly dealt with the sampling problems." But such dealings were constrained to the samples presented, and Ishizaka's assertions strongly imply that such dealings were bound to fall short.

Similarly, as regards the sampling in the Stevenson et al.
studies, the education of most Chinese is very poor, and the hope of the PRC is to attain universal ninth-grade education sometime around 2000. Yet the Chinese parents in the Stevenson studies had attended school on average more than 11 years.

Stevenson and colleagues have also not presented much demographic information about the Minneapolis, Taiwanese, and Japanese samples, but one report comparing students in Chicago and Beijing contains some important data (Stevenson, Lee, Chen, & Lummis, 1990). In that sample, 13% of the families in Chicago had incomes below $10,000, and the sample seems generally weighted toward low incomes. Black and Hispanic students, as shown earlier, do not perform well on academic tests, and the Chicago sample contained 39% Black and Hispanic children. Nationally, by 1986, after the Stevenson et al. data-gathering had been completed, Blacks and Hispanics constituted 26% of the K-12 population (NCES, 1990). Over 20% of the Chicago children did not speak English at home. The Chicago sample was thus not a representative sample of the United States, nor was it comparable to the Beijing sample on many important demographic variables. The Chicago sample is heavily weighted with variables associated with low achievement.

There are other variables whose effects are difficult to quantify but that render these studies difficult to interpret. For instance, the children in the Chicago sample lived in homes with, on average, two siblings. Because of the population control policies in China, however, most of the Beijing progeny were only children. To reduce a nation that used to count children by the dozen to a nation of single children is to produce jangling cultural changes. The new children are a brood sometimes referred to as "The Spoiled Brats of Beijing." Parents and grandparents alike lavish attention on them.
In addition, some 10% of the Chicago households contained at least one grandparent, whereas 50% of the Beijing households had at least one. How would one enter this "granny factor" into a regression equation? In any case, the Chinese children are in a much more adult-intensive environment than their Chicago peers.

We can note in passing that in an early study, American students were found to do as well as Asian students in reading and better in vocabulary (Stevenson, Lee, Lucker, & Stevenson, 1982). These areas were dropped from later research. Early studies, too, explained the differences in achievement exclusively in terms of time:

If the classrooms visited in this study are representative of American elementary school classrooms, we must conclude that American children fail to receive sufficient instruction. They spend less time each year in school, less time each day in classes, less time in the day in mathematics classes, and less time in each class receiving instruction (Stigler, Lee, & Stevenson, 1987).

If we deem it important to close the mathematics "learning gap," doing so should not be difficult. To do so, though, might be to lower our ranking in international studies in reading which, as mentioned earlier, is quite high. In any case, Stevenson et al. have never explained why they have needed more recently to invoke explanatory variables other than time.

From the work of Stevenson and Stigler has also emerged the notion that Americans believe in ability whereas Asians believe in effort. One section of The Learning Gap is titled "The American Emphasis on Ability," whereas another is headlined "The Asian Emphasis on Effort" (Stevenson, Lee, & Stigler, 1986; Stevenson & Stigler, 1992). This interpretation of Stevenson and Stigler by Merseth (1993) is typical:

A predominant view in America is that one either "has it" or one doesn't.
Effort receives little credit for contributing to successful learning in mathematics, or, for that matter, in any subject. For example, American, Japanese, and Chinese mothers were asked what factors among ability, task difficulty, and luck made their children successful in school. American mothers ranked ability the highest, while Asian mothers gave high marks to effort. This led the researchers to conclude that "the willingness of Japanese and Chinese children to work so hard in school may be due, in part, to the stronger belief on the part of their mothers in the value of hard work."

The researchers that Merseth refers to and cites are Stevenson, Lee, and Stigler, 1986. But in this instance Stevenson, Lee, and Stigler, and Merseth after them, appear to have misinterpreted the data. As can be seen in Figure 1, it is true that American mothers rank ability as more important than Asian mothers. But as the figures show, they rank effort as more important than ability. And, in fact, they rank it almost as highly as Asian mothers. The data create the pictorial impression that the scores are on a 5-point scale, but as the legend shows, they are on a 10-point scale. It is hard to imagine that these differences in the value assigned to effort produce the kinds of dedication to hard work that Merseth and Stevenson, Lee, and Stigler believe they are observing.

[FIGURE 1. Number of points (out of ten) that mothers assigned to the relative importance of factors that affect academic achievement: effort, ability, task difficulty, and luck (from Stevenson & Stigler, 1992).]

As Figure 2 shows, American children have the same belief structure as American mothers: They think ability is more important than do Chinese children, but they think that effort is more important than ability. They rank it virtually as important as Chinese children. It appears that American children and their mothers are still taken with the notion of "an 'A' for effort."

[FIGURE 2. Children's evaluation of the importance of effort and ability for success in school, Chicago and Beijing. (Scale: 5 = very important; 1 = not at all important.) (From Stevenson & Stigler, 1992.)]

One study by Mayer, Tajika, and Stanley (1991) puts mathematics achievement in Japan and America in a somewhat different light. Mayer et al. administered an achievement test consisting largely of computation along with a problem-solving test to Japanese and American students. The Japanese students scored considerably higher than the American students. The researchers then divided the groups into six subgroups based on scores on the achievement test, and looked at the problem-solving test scores of each group. For five of the groups, the American problem-solving scores were higher, and for the remaining group the American and Japanese scores were the same. These results agree with allegations that instruction in mathematics (and, for that matter, other subjects) in Japan emphasizes the rote aspects of the subject (Boylan, 1993; Schooland, 1990; Van Wolferen, 1987).

Stigler and Miller (1993) challenged these findings, claiming that they suffered the "matched group fallacy," where matching on one variable systematically unmatches on other important variables. This was especially important, they contended, because so many American students scored at the lowest level. This meant that Mayer et al. (1991) had obtained a group of very intelligent students at the upper end. Mayer and Tajika retorted that the matching fallacy, and its concomitant regression to the mean, might apply if the match were only on one extreme group at the high end, but that the differences had shown up for all groups (Mayer & Tajika, 1993). They also noted that Stigler and Miller failed to define "intelligence" and were thus begging the question.

In looking at why so many American students had scored at the lowest level, Mayer and Tajika found that their achievement test did screen students on one variable: place of birth. Almost all of the U.S. students who [scored at the lowest levels] were nonnative speakers of English who had been identified as Limited English Proficient by their local schools. Most were recent immigrants from Spanish-speaking countries, many of whom had had limited exposure to formal education.

When studies leave mathematics, the most easily cross-culturally tested subject, for the area of reading, they find that U.S. students have consistently performed well. In the most recent and sophisticated of these studies, conducted on students in 31 nations by the International Association for the Evaluation of Educational Achievement (IEA), American 9-year-olds finished second only to Finland, a small, homogeneous nation that does not concern itself a lot with teaching Finnish as a second language (Elley, 1992). The 9-year-olds were 22 points out of first place on a 600-point scale identical to that of the SAT. American 14-year-olds tied for eighth place. Ranks, as noted earlier, obscure performance. The 14-year-olds were only three points further away from first place, 25 points, than the 9-year-olds. The high-ranking countries were again tightly bunched such that the distance from second-place France to eighth-place America was only 14 points (the scale is 600-point, identical to that of the SAT). An analysis by NCES (OECD, 1993) found that there were, in fact, no significant differences among the second through eleventh place nations.

As with the mathematics studies, there was considerable variance within countries in reading scores. The variability was such that American students of both ages had the highest scores at their 90th, 95th, and 99th percentiles (NCES, 1994, pp. 217-218).

It is worth noting that though the German education system has been considered by some as worthy of emulation, the German students' reading scores were very close to the median. The German Research Service reported these findings with some anxiety, especially after noticing that "German standards were exceeded...even in the USA." Even (German Research Service, 1992). German authorities, called on to explain Germany's poor showing, promptly put the blame on the family for neglecting books.

It is also worth noting, for what it says about the media and the control of perceptions, that when IAEP-2 was released in February of 1992, it received wide media coverage. "An 'F' in World Competition" was the headline at Newsweek, a typical reaction (1992). When the IEA reading study was released in July of 1992, no media outlet carried it. The study surfaced only when a European friend of Education Week reporter Robert Rothman sent him a copy when it was published in Germany. Education Week then carried the story on page one (Rothman, 1992). USA Today picked up the report from Education Week and also ran the story on page 1 (Manning, 1992). No other print or electronic medium covered the event. The USA Today story featured a comment from Francie Alexander, then Deputy Assistant Secretary of Education, dismissing the study: "This is OK for the '80s, but for the '90s and beyond, kids are going to have to do better." Even today, when I speak around the country and ask audiences of administrators and professors for a show of hands from those who have heard of the study, only perhaps 1% of the group knows of it.

In this summary, many of the results in international comparisons have been for mathematics, a subject in which it is typically reported that American students, in general, perform terribly. The data, taken together, however, reveal that many students perform quite well relative to students even in mathematics and in even the highest scoring nations in mathematics.

International Comparisons and Domestic Indicators

These high performances should not really come as a surprise because these results confirm those from domestic indicators. We are often unaware of them because of the ready mindset to perceive failure. We have already noted The American School Board Journal headline on the IEA reading study, where an 8th-place finish of 14-year-olds in a comparison of 31 nations was seen as "bad news." Sometimes the distortion of results seems deliberate. The reader is referred to The Manufactured Crisis for discussion of this possibility (Berliner & Biddle, 1995).

Still other domestic indicators show trends similar to those of international comparisons. For example, in spite of demographic changes in the SAT test-taking pool weighted against math achievement, the proportion of students scoring above 650 is at a record level. The average score on the SAT was set at 500 on the basis of 10,654 students living in the Northeast. Ninety-eight percent of these students were White, 60% were male, and 40% had attended private high schools. Currently, the SAT test-taking population is 30% minority, 52% female, and 31% of the students report family incomes of less than $30,000 a year. Yet the SAT mathematics "national average" has fallen only 22 points and not at all if one controls for these demographic changes (Bracey, 1990).

Currently, the proportion of students scoring above 650 on the SAT mathematics is at an all-time high, something that cannot be accounted for by Asian students. Though Asian students score higher than other ethnic groups, they constitute only 8% of all test takers, up from 4% a decade ago, far too few to produce the numbers seen (College Board, 1992, 1993; Jackson, 1976). When the standards were set on the SAT, 6.68% of the test takers scored above 650. Currently 12% score above 650.

Since 1978, the number of students taking the College Board's Advanced Placement tests has increased more than fourfold, from 98,000 in 1978 to 448,000 in 1994 (College Board, 1994). Yet the average score has fallen only .10 points on the AP's five-point scale. Certainly some increase in the number of test takers is occasioned by economic concerns: AP tests provide an economical way of obtaining college credit. Still, with such an increase in test takers, one might have expected more of a drop in average scores.

Some achievement test scores are at record levels, also (H. D. Hoover, personal communication, June, 1994). Scores on the ITBS and ITED, both in Iowa and the nation, dropped from the 1960s to the mid-1970s. They have been rising consistently since. By 1990, all grades were at record highs.

Like the SAT but unlike most commercial achievement tests, new forms of the ITBS and ITED are equated to previous forms. Thus longitudinal trend data can be tracked back to the 1930s. In addition, that we have such data from Iowa is important partly because, in some ways, Iowa is frozen in time. It has no large cities and is still 98% White. Most state-level then-and-now studies are difficult to interpret because states have changed on so many variables. Iowa has changed, too, of course, but less than most. In addition, the Iowa testing program is part of the air in Iowa schools, having been in existence for almost 60 years. It is not a recently imposed, high-stakes program. As a long-standing low-stakes program, it is less subject than many other programs to testing's "Lake Wobegon Effect."

One can ask why people feel obliged to portray U.S. schools in such a negative light in international comparisons or on domestic indicators. For those approaching the situation from a political or ideological stance, an answer is not hard to find. Some answers are presented in "The Right's Data-Proof Ideologues" (Bracey, 1995b). In addition, some observers have pointed to disingenuousness among business and industry (Bracey, 1994a; Cuban, 1994; Rothstein, 1994). Others have noted the lack of K-12 support in the university research community (Bracey, 1994b; Myrdal, 1969). After all, almost all of the papers commissioned by the people who produced "A Nation At Risk" were written by academics. A lot of university faculty have staked their reputations on research projects that assume the system is in crisis. Shortly after the first "Bracey Report" appeared, I received a letter from one professor that read, in part, "The American common school is an endangered species. The wimps in education won't defend it. They're afraid of losing their money or access to the corridors of power."

While it is common currently to date criticism of the school from the 1983 publication of "A Nation At Risk," the schools have actually been subject to similar criticisms from the public and critics for over a century (Bracey, 1995b; Kent, 1987; Newman, 1978). These criticisms rose to a crescendo in the years just after World War II, were validated to the critics by the launch of Sputnik in 1957, and have never experienced a diminuendo since. Beginning in the period of Reconstruction and continuing through today, whenever faced with a pressing social problem, the nation has turned to its schools for a solution while failing to provide sufficient resources to make the solution a genuine possibility. The reader is referred to Final Exam: A Study of the Perpetual Scrutiny of American Public Schools (Bracey, 1995b) for an elaborated discussion of this position.

Independent of what the findings actually say, the value of international comparisons is a subject of much debate and perhaps the topic of a separate paper at a later time. Suffice it to say here that when the data are analyzed with no prior position as to what they say, the nation's schools are seen to be performing at a much higher level than has been presented by many U.S. Department of Education officials or in the popular press (see Note 2).

Notes

1. About 10 years ago, NAEP moved to cease reporting data by ethnicity. NAEP officials were persuaded to keep these categories, however, largely by Mary Hatwood Futrell, then president of the National Education Association. Futrell argued that it was important to maintain ethnic categories to measure progress, or the lack of it, in attaining equality of educational outcomes. She felt the value of this measurement outweighed the danger that the results were amenable to racist interpretations.

2. I have become identified with having a "position" on what the data say. However, when I was first led to analyze the data in the fall of 1990, it was quite by accident, and I held the position that is consistently found in the annual Phi Delta Kappa/Gallup poll: Parents think the local schools are OK and that there is a crisis in the nation's schools (Rose & Elam, 1995). The conclusion that this condition was, indeed, a "manufactured" crisis was initially quite surprising.

References

Advanced Placement Report, 1993. (1994). New York: The College Board.

Good news: Our 9-year-olds read well; Bad news: Our 14-year-olds don't. (1992, November). American School Board Journal.

An 'F' in World Competition. (1992, February 17). Newsweek, p. 57.

Baker, D. P. (1993). Compared to Japan, the U.S. is a low achiever really. Educational Researcher, 22(3), 18-20.

Bracey, G. W. (1990, November 21). SATs: Miserable or miraculous? Education Week, 36.

Bracey, G. W. (1992). The Second Bracey Report on the Condition of Public Education. Phi Delta Kappan, 72, 104-117.
Bracey, G. W. (1994a). First world, third world, all right here at home. Phi Delta Kappan, 75, 649-651.
Bracey, G. W. (1994b, July). The greatly exaggerated death of our schools. Presentations to the IDEA Summer Fellows Program, Ontario (CA), Denver, and Appleton, WI.
Bracey, G. W. (1995a). Final exam: A study of the perpetual scrutiny of American public schools. Bloomington, IN: Agency for Instructional Technology.
Bracey, G. W. (1995b, January 25). The right's data-proof ideologues. Education Week, p. 44.
Cuban, L. (1994, June 27). The great school scam. Education Week, p. 44.
Education in States and Nations. (1992). Washington, DC: Center for Education Statistics.
Elam, S. M., & Rose, L. C. (1995). The 27th Annual Phi Delta Kappan/Gallup Poll of the public's attitudes toward public schools. Phi Delta Kappan, 77, 41-56.
Elley, W. B. (1992). How in the world do students read? Hamburg: International Association for the Evaluation of Educational Achievement.
German Research Service. (1992). Special Science Reports, VIII, November, 12-14.
Gerstner, L. V., Jr., Semerad, R. D., Doyle, D. P., & Johnston, W. B. (1994). Reinventing education. New York: Dutton Books.
Gray, C. B., & Kemp, E. J., Jr. (1993). Flunking testing: Is too much fairness unfair to school kids? The Washington Post, September 19.
International Mathematics and Science Assessments: What Have We Learned? (1992). Washington, DC: U.S. Department of Education, National Center on Education Statistics.
Ishizaka, K. (1994). Japanese education: The myths and the realities. In Different Visions of the Future of Education. Ottawa, Ontario: Canadian Teachers Foundation.
Jackson, R. (1976). An examination of declining numbers of high-scoring SAT candidates. New York: The College Board.
Jaeger, R. M. (1992). Weak measurement serving presumptive policy. Phi Delta Kappan, 74, 118-128.
Kent, J. D. (1987). A not too distant past: Echoes of the calls for reform. Educational Forum, Winter, 137-150.
Krauthammer, C. (1994). Save the border collie. The Washington Post, July 15, A21.
Lapointe, A. E., Askew, J. M., & Mead, N. A. (1992). Learning mathematics. Princeton, NJ: Educational Testing Service.
Lapointe, A. E., Mead, N. A., & Askew, J. M. (1992). Learning science. Princeton, NJ: Educational Testing Service.
Manning, A. (1992). U.S. kids near top of class in reading. USA Today, September 29, 1992. (The USA Today article is dated before the Education Week article on the same topic, but Education Week typically appears two days before its formal publication date.)
Matthews, J. (1992). Lessons from Asian schools. The Washington Post, November 30, A23.
Mayer, R. E., Tajika, H., & Stanley, C. (1991). Mathematical problem solving in Japan and the United States: A controlled comparison. Journal of Educational Psychology, 83, 69-72.
Mayer, R. E., & Tajika, H. (1993). Conducting and comprehending cross-cultural comparisons: Reply to Stigler and Miller. Journal of Educational Psychology, 85, 560-565.
Merseth, K. K. (1993). How old is the shepherd? An essay about mathematics education. Phi Delta Kappan, 74, 548-554.
Myrdal, G. (1969). Objectivity in social science. New York: Pantheon Books.
NAEP Mathematics Report Card, 1992. (1992). Princeton, NJ: Educational Testing Service.
National Center for Education Statistics. (1994). The Condition of Education 1994. Washington, DC: U.S. Department of Education.
Newman, A. J. (Ed.). (1978). In defense of the American public school. Berkeley, CA: McCutchan.
O'Hare, W. P., & Curry-White, B. (1992). The rural underclass: Examination of multiple-problem populations in urban and rural settings. Washington, DC: Population Reference Bureau.
O'Neill, B. (1994). Anatomy of a hoax. New York Times Sunday Magazine, March 6, 46-49.
Profiles of College Bound Seniors, 1992, 1993. (1994). New York: The College Board.
Robinson, G., & Forsyth, J. (1994). NAEP test scores: Should they be used to compare and rank state educational quality? Arlington, VA: Educational Research Service.
Rotberg, I. (1990). I never promised you first place. Phi Delta Kappan, 72, 296-303.
Rothman, R. (1992, September 30). U.S. ranks high in international study of reading. Education Week, 1.
Rothstein, R. (1994, June). Presentation to the President's Professional Development Workshop, AASA, Arlington, VA.
Schooland, K. (1990). Shogun's ghosts: The dark side of Japanese education. New York: Bergin and Garvey.
Stedman, L. C. (1994a). Incomplete explanations: The case of U.S. performance in the international assessments of education. Educational Researcher, 23(7), 24-32.
Stedman, L. C. (1994b). The Sandia Report and U.S. achievement: An assessment. Journal of Educational Research, 87, 133-146.
Shanker, A. (1994, July 24). Making time. New York Times, Section 4, p. 7.
Shanker, A. (1992a, July 4). World class standards. New York Times, Section 4, p. 7.
Shanker, A. (1992b, July 11). The wrong message. New York Times, Section 4, p. 7.
Stevenson, H. A. (1992). Learning from Asian schools. Scientific American, December, 70-76.
Stevenson, H. W., & Stigler, J. W. (1992). The learning gap. New York: Summit Books.
Stevenson, H. W., Lee, S.-y., & Stigler, J. W. (1986). Mathematics achievement of Chinese, Japanese, and American children. Science, 231, 693-699.
Stevenson, H. W., Lee, S.-y., Chen, C., & Lummis, M. (1990). Mathematics achievement of children in China and the United States. Child Development, 61, 1053-1066.
Stevenson, H. W., Stigler, J. W., Lucker, G. W., & Lee, S.-y. Reading disabilities: The case of Chinese, Japanese, and English. Child Development, 53, 1164-1182.
Stigler, J. W., & Stevenson, H. W. (1991). How Asian teachers polish each lesson to perfection. American Education, Spring, 12-47.
Stigler, J. W., Lee, S.-y., & Stevenson, H. W. (1987). Mathematics classrooms in Japan, Taiwan, and the United States. Child Development, 58, 1272-1285.
Stigler, J. W., Lee, S.-y., Lucker, G. W., & Stevenson, H. W. (1982). Curriculum and achievement in mathematics: A study of elementary school children in Japan, Taiwan, and the United States.
Van Wolferen, K. (1989). The enigma of Japanese power. New York: Knopf.
Westbury, I. (1992). Comparing American and Japanese achievement: Is the United States really an underachiever? Educational Researcher, 21(5), 18-24.
Westbury, I. (1993). American and Japanese achievement ... again. Educational Researcher, 22(3), 21-26.
Yano, H. (1993). What can we learn from the learning gap? Educational Researcher, 22(1), 36-37.

Received May 25, 1995
Revision received August 3, 1995
Accepted August 29, 1995