Can Computers Score Student Writing As Accurately As Humans?

 

From the Marshall Memo #438

In this New York Times column, Michael Winerip reports on a recent study finding that computers can score student essays about as accurately as humans. Mark Shermis of the University of Akron collected 16,000 secondary-school essays that had been hand-scored by teachers and fed them through automated grading systems developed by nine different companies. Computer scoring produced “virtually identical levels of accuracy, with the software in some cases proving to be more reliable,” according to a news release. An even bigger advantage was speed. Human graders can assess about 30 essays an hour. One of the computer systems (e-Rater from the Educational Testing Service) can grade 16,000 essays in 20 seconds!

“Is this the end?” asks Winerip. “Are Robo-Readers destined to inherit the earth?” Not so fast, says Les Perelman, director of writing at M.I.T., who was given access to e-Rater, the computer scoring system developed by Educational Testing Service (ETS). He concluded that automated readers are easy for students to game, are vulnerable to test prep, set a very limited and rigid standard for what good writing is, and might pressure teachers to dumb down writing instruction. The biggest problem is that computer readers can’t discern truth from nonsense. “E-Rater doesn’t care if you say the War of 1812 started in 1945,” says Perelman. The software looks only at whether a fact is part of a well-structured sentence. The substance of an argument doesn’t matter, he says, as long as it looks to the computer as if it’s well-argued. 

To prove the point, Perelman fed an essay into e-Rater in which he said that the number one reason for high college costs was excessive pay for greedy teaching assistants. “The average teaching assistant makes six times as much money as college presidents,” he wrote. “In addition, they often receive a plethora of extra benefits such as private jets, vacations in the south seas, and starring roles in motion pictures.” He even threw in a line from Allen Ginsberg’s “Howl” just to see if he could get away with it. E-Rater gave Perelman’s essay the top score of 6.

Perelman also found that the software gives more points for longer essays. He submitted a 716-word essay with more than a dozen nonsensical sentences and got a 6. A well-argued, well-written essay of 567 words got a 5. “Once you understand e-Rater’s biases,” he says, “it’s not hard to raise your test score.” E-Rater doesn’t like short sentences, sentence fragments, short paragraphs, or sentences beginning with “or.” It does like big words and connectors such as “however.”

In fairness to ETS, says Winerip, it was the only company to give Perelman access to its product. And ETS officials defended their system. “E-Rater is not designed to be a fact checker,” said Paul Deane, a research scientist. It’s best used to give students rapid feedback on drafts so they can improve them before submitting a final draft to a teacher. ETS says that for high-stakes situations (like the Graduate Record Exam), e-Rater is always backed up by a human scorer. 

As for being biased in favor of longer essays, Deane argued that good writers have internalized the skills that make them more fluent and are therefore able to write more in a limited amount of time. 

On Perelman’s point about test prep, ETS officials contend that his advice on how to game e-Rater is too complex for most students to absorb, and that students who can absorb it are demonstrating the very kind of high-level thinking the program is designed to pick up. “In other words,” says Winerip, “if they’re smart enough to master such sophisticated test prep, they deserve a 6.”

“Facing a Robo-Grader? No Worries. Just Keep Obfuscating Mellifluously” by Michael Winerip in The New York Times, Apr. 23, 2012 (p. A11), http://nyti.ms/HZlqlj 

 
