Speaking My Mind

Robo-Grading and Writing Instruction: Will the Truth Set Us Free?

Dave Perrin
Rochelle Township High School, Rochelle, Illinois
[email protected]

When I was a graduate student in English literature in the mid-1990s, I wrote a paper on Kenneth Branagh’s (then recent) production of Much Ado about Nothing. My chosen topic was discrepancies between Shakespeare’s text and the seduction scene involving Borachio and Margaret as portrayed in Branagh’s film. When the professor returned the paper to me, I noticed that he had commented on the front page something to the effect that mine was one of the most finely written papers he had received. This comment is not why I remember the paper. His comment on the back of the paper is what has stuck with me all these years. The comment began with a question: “So what?” He went on to posit, quite rightly, that my thesis did nothing to shed light on our understanding of Shakespeare or his play. My essay was, itself, “much ado about nothing”: although well written, it was vacuous and devoid of substantial critical insight.

I was reminded of this incident last year as I read of a competition sponsored by the William and Flora Hewlett Foundation to see how algorithms could be developed to assess student writing. Hewlett’s education program director, Barbara Chow, concluded, “We had heard the claim that the machine algorithms are as good as human graders, but we wanted to create a neutral and fair platform to assess the various claims of the vendors. It turns out the claims are not hype” (qtd. in Stross). Earlier that year, a University of Akron analysis of 16,000 middle school and high school essay tests previously graded by humans found that robo-graders “achieved virtually identical levels of accuracy, with the software in some cases proving to be more reliable” (qtd. in Bienstock). Dr. Mark Shermis, Akron’s College of Education dean, conceded that “automatic grading doesn’t do well on very creative kinds of writing. But this technology works well for about 95 percent of all the writing that’s out there” (qtd. in Bienstock).

Regardless of the claims of the proponents of robo-graders, the clear winners in the standards movement thus far have been those in the billion-dollar-a-year testing industry. Within the language arts curriculum, the emphasis of the standards movement until now has been focused almost exclusively on reading and its concomitant multiple-choice testing, but as the Common Core and other movements place renewed emphasis on writing and critical thinking, the test-prep industry will undoubtedly keep churning out “breakthroughs” in these areas of composition curriculum and assessment.

There are skeptics. Les Perelman, a retired director of writing at the Massachusetts Institute of Technology (MIT), has set out to expose the flaws and foibles of robo-grading. After studying the Educational Testing Service’s e-Rater, Perelman found that it privileges certain surface qualities (sentence structure; the length of words, sentences, and paragraphs; and conjunctive adverbs) as evidence of complexity of thought. Where it fails is in recognizing facts, logic, and truth. “E-Rater doesn’t care if you say the War of 1812 started in 1945,” Perelman was quoted as saying in a 2012 New York Times column by Michael Winerip. In fact, Perelman has manufactured essays based on obviously faulty premises and received e-Rater’s top score of six. One such argument reads, “The average teaching assistant makes six times as much money as college presidents. In addition, they often receive a plethora of extra benefits such as private jets, vacations in the south seas, starring roles in motion pictures” (qtd. in Winerip). While the e-Rater fails to grasp the obviously faulty logic of this argument, the indisputable advantage of the robo-grader over the classroom teacher is speed: ETS researcher David Williamson boasts that e-Rater can grade 16,000 essays in 20 seconds (qtd. in Winerip).1
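Why is such a system so easy to game? The sketch below is a deliberately naive illustration, not e-Rater’s actual algorithm: the scorer, its thresholds, and the surface_score function are invented for this example. It simply shows how a grader built only on the surface features Perelman describes, such as sentence length, word length, and conjunctive adverbs, rewards fluent nonsense as readily as sound argument.

    # A toy surface-feature scorer (illustrative only, not e-Rater):
    # it sees word length, sentence length, essay length, and conjunctive
    # adverbs, and is therefore blind to facts, logic, and truth.

    CONJUNCTIVE_ADVERBS = {
        "however", "moreover", "therefore", "furthermore",
        "consequently", "nevertheless", "additionally",
    }

    def surface_score(essay: str) -> int:
        """Return a 1-6 holistic score from surface features alone."""
        normalized = essay.replace("!", ".").replace("?", ".")
        sentences = [s for s in normalized.split(".") if s.strip()]
        words = essay.split()
        avg_sentence_len = len(words) / max(len(sentences), 1)
        avg_word_len = sum(len(w) for w in words) / max(len(words), 1)
        adverbs = sum(w.strip(".,;:").lower() in CONJUNCTIVE_ADVERBS
                      for w in words)

        score = 1
        score += avg_sentence_len >= 15   # long sentences read as "complex"
        score += avg_word_len >= 5        # long words read as "sophisticated"
        score += adverbs >= 2             # transitions read as "organized"
        score += len(words) >= 300        # sheer length reads as "developed"
        score += len(sentences) >= 10
        return min(score, 6)

    # A verbose, adverb-stuffed non-argument earns the top score, because
    # nothing above checks whether any claim is true:
    nonsense = (
        "Moreover, the average teaching assistant unquestionably receives "
        "extraordinarily magnificent compensation, and furthermore, "
        "distinguished college presidents consequently earn significantly "
        "less remuneration annually. "
    ) * 16
    print(surface_score(nonsense))  # prints 6

Because no feature in this sketch touches facts or logic, the absurd teaching-assistant essay above is, to the machine, indistinguishable from a true one.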
A more recent target of Perelman’s ire is EdX, a nonprofit joint venture of Harvard and MIT that offers online courses and promotes essay-grading software as an integral part of its instructional platform. EdX has partnered with other institutions, including Stanford, and grading software is also being used by other “massive open online courses” (MOOCs), including two start-ups founded by Stanford faculty members. Daphne Koller, a computer scientist and founder of one of these MOOCs, Coursera, says of grading software, “It allows students to get immediate feedback on their work, so that learning turns into a game, with students naturally gravitating toward resubmitting the work until they get it right” (qtd. in Markoff). Perelman states that his major concern with the EdX software is that it “did not have any valid statistical test comparing the software directly to human graders” (qtd. in Markoff). He has also joined Professionals Against Machine Scoring of Student Essays in High-Stakes Assessment, an online petition to stop robo-grading that states flatly in its mission, “Computers cannot ‘read’” (qtd. in Markoff). In spite of its critics, EdX has aligned itself with a dozen universities and expects to expand its influence even further.

So What? What’s at Stake in Robo-Grading?

What is most vitally at stake here is not writing assessment or the validity of standardized writing tests. What is most at stake is instruction. In the standards movement, testing drives instruction because the stakes are so high for teachers and schools. When facts, logic, and truth become dispensable in the assessment of writing, then writing instruction, ostensibly, will become focused solely on the mechanics of writing. So much for the short-lived return to critical thinking that the Common Core State Standards initiative promises to bring back to the English curriculum.

Although the e-Rater and its brethren may not be interested in the truth, the truth is that writing teachers always have been. Good writing teachers encourage fact checking, ensure that material is asserted and quoted within its intended context, challenge the logical assumptions of student writing, make personal connections, encourage a student’s development of voice and linguistic ingenuity, and perform a thousand other tasks in responding to student writing that are simply above the pay grade of robo-graders. But will teachers continue to offer rich responses in a high-stakes testing world proliferated with robo-graders, or will they succumb to the pressure to teach to a test that so narrowly defines what “good” writing is?
As a writing teacher for the past 20 years, I have often found myself borrowing my professor’s “So what?” question, using it as a jumping-off point for my responses to student writing whenever a student fails to make a logical connection or, as I did in my Much Ado paper, labors for pages on a banality. As writing teachers know, this type of critical commentary is necessary in teaching students to be critical thinkers. Moreover, “good” writing is always subjective. English teachers are notorious for their pet peeves and personal opinions of what is “good.” Over time, astute student writers will collect these hallmarks of good writing from various teachers, stack them against one another and their own, reject some and embrace others, and eventually develop their own style and criteria for good writing. The adoption of such a narrowly defined concept of writing, in which, for instance, each sentence in a student essay must be at least 15 words long or contain a conjunctive adverb, threatens this process. As Patricia O’Connor points out in the May 2012 English Journal, “I know from my teaching experience that the nature of writing is not as linear as the data miners would lead us to believe. Learning how to write well is a complex, recursive process, and despite years of research, it still resists being broken neatly into discrete segments for assessment” (105).

Writing teachers like O’Connor develop what Elliot Eisner refers to as “connoisseurship,” or “the art of appreciation” (215). Of educational connoisseurship in general, Eisner states, “One must have a great deal of experience with classroom practice to be able to distinguish what is significant about one set of practices or another” (216). Writing teachers develop this connoisseurship of the practices of writing and writing instruction over time through their own reading, writing, and interaction with other readers and writers, as well as through thoughtful reflection on the processes of writing and the teaching of writing. The proponents of robo-grading laud it precisely because it provides some sort of objective quantification of writing, but writing teachers know that a certain degree of subjectivity is inescapable, and indeed even essential, to the assessment of writing, as the self cannot be removed from the act of reading (or grading) any more than it can be removed from the act of writing.

Students must be taught to read and write in a world where facts matter, where logic is challenged, and where the “truth” is often not only subjective but also subject to nearly inscrutable nuances. In short, they must be taught to write for diverse audiences, not algorithms.

Editor’s Note

1. For more information, see the “NCTE Position Statement on Machine Scoring: Machine Scoring Fails the Test,” prepared by the NCTE Task Force on Writing Assessment and adopted by the NCTE Executive Committee in April 2013 (http://www.ncte.org/positions/statements/machine_scoring).

Works Cited

Bienstock, Jordan. “Grading Essays: Human vs. Machine.” Schools of Thought. CNN.com. 10 May 2012. Web. 7 July 2012. <http://schoolsofthought.blogs.cnn.com/2012/05/10/grading-essays-human-vs-machine/>.

Eisner, Elliot. The Educational Imagination. 3rd ed. Upper Saddle River: Prentice, 1994. Print.
Markoff, John. “Essay-Grading Software Offers Professors a Break.” New York Times. 4 Apr. 2013. Web. 14 Apr. 2013. <http://www.nytimes.com/2013/04/05/science/new-test-for-computers-grading-essays-at-college-level.html>.

O’Connor, Patricia. “Waving the Yellow Flag on Data-Driven Assessment of Writing.” English Journal 101.5 (2012): 104–05. Print.

Stross, Randall. “The Algorithm Didn’t Like My Essay.” New York Times. 9 June 2012. Web. 12 June 2012. <http://www.nytimes.com/2012/06/10/business/essay-grading-software-as-teachers-aide-digital-domain.html>.

Winerip, Michael. “Facing a Robo-Grader? Just Keep Obfuscating Mellifluously.” New York Times. 22 Apr. 2012. Web. 7 July 2012. <http://www.nytimes.com/2012/04/23/education/robo-readers-used-to-grade-test-essays.html>.

Dave Perrin has been teaching high school literature and composition for more than 20 years. He currently teaches dual-credit composition courses at Rochelle Township High School through Kishwaukee College. He holds a master’s degree in English literature as well as a doctoral degree in curriculum leadership, both from Northern Illinois University.

The Teacher

He said This line is loose, and tugged at it
and the whole poem began to unravel,
and Why this word and not another?,
and the foundation seemed about to crumble
as if a brick had been removed.
Then he said Ah, this line is good,
and the ground steadied again,
and What an interesting image,
and an oak sprouted from the ground.

—Matthew J. Spireng
© 2013 by Matthew J. Spireng

Matthew J. Spireng’s book What Focus Is was published in 2011 by Word Press. His book Out of Body won the 2004 Bluestem Poetry Award and was published in 2006 by Bluestem Press. He has several chapbooks, including Clear Cut, a signed and numbered limited edition of his poems paired with photographs by Austin Stracke on which the poems are based. Email him at [email protected].