Man vs. Computer: Who Wins the Essay-Scoring Challenge?

Would you rather have an actual person score your carefully crafted essay, or an automated software program designed for that purpose?

I'd still take the flawed human being any day, assuming, of course, the proper expertise and a good night's sleep. But a new study suggests there is little, if any, difference in reliability or accuracy between human scoring and the computer approach.

And this may be good news for those who believe essays are an essential component of state testing systems, since the cost savings may well encourage more states to embrace such test items to balance out multiple-choice questions.

"The demonstration showed conclusively that automated essay-scoring systems are fast, accurate, and cost-effective," said Tom Vander Ark, the chief executive officer of Open Education Solutions, and a co-director of the study, in a press release. (Vander Ark is also a former top education official at the Bill & Melinda Gates Foundation.)

The study is described in the news release as the "first comprehensive, multivendor trial to test" claims by companies that provide automated essay-scoring software. Nine companies took part, pitting their scoring systems against one another. Six participating states supplied more than 16,000 essays, with each set varying in length, type, and grading protocols. The essays had already been hand-scored, and the challenge was for the companies to approximate those established scores with their software.
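The article does not say exactly how "approximating established scores" was measured. As a minimal sketch, the snippet below assumes an ordinal agreement statistic such as quadratic weighted kappa, a metric commonly used to compare machine-assigned essay scores with human ratings; the function name and the toy score vectors are illustrative and are not drawn from the study itself.

    import numpy as np

    def quadratic_weighted_kappa(human, machine, min_score, max_score):
        """Agreement between two integer score vectors on the same scale."""
        n = max_score - min_score + 1
        human = np.asarray(human) - min_score
        machine = np.asarray(machine) - min_score

        # Observed rating matrix: how often each (human, machine) score pair occurs.
        observed = np.zeros((n, n))
        for h, m in zip(human, machine):
            observed[h, m] += 1

        # Expected matrix under chance agreement (outer product of the marginals),
        # rescaled so it sums to the same total as the observed matrix.
        expected = np.outer(np.bincount(human, minlength=n),
                            np.bincount(machine, minlength=n)).astype(float)
        expected *= observed.sum() / expected.sum()

        # Quadratic penalty: disagreements count more the farther apart the scores are.
        idx = np.arange(n)
        weights = (idx[:, None] - idx[None, :]) ** 2 / (n - 1) ** 2

        return 1.0 - (weights * observed).sum() / (weights * expected).sum()

    # Toy check: identical score vectors give 1.0 (perfect agreement).
    print(quadratic_weighted_kappa([2, 3, 4, 4], [2, 3, 4, 4], 1, 4))

On this kind of scale, a value near 1 means the software essentially reproduces the human raters' judgments, while values near 0 indicate agreement no better than chance.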

The study was funded by the William and Flora Hewlett Foundation, which also provides financial support for Education Week coverage.

It grew out of a contest Hewlett is sponsoring called the Automated Student Assessment Prize, to evaluate the current state of automated testing and to encourage further developments in the field.

The study comes as two state testing consortia are working to develop new assessment systems pegged to the Common Core State Standards in reading and mathematics. In fact, the two consortia are supporting the Hewlett effort, and three PARCC states and three SMARTER Balanced states supplied student essays for the current study.

"The results demonstrated that overall, automated essay scoring was capable of producing scores similar to human scores for extended-response writing items with equal performance for both source-based and traditional writing genre," says the study, co-authored by Mark Shermis, the dean of the University of Akron's college of education, and Ben Hammer of Kaggle, a private firm that provides a platform for predictive modeling and analytics competitions.

Barbara Chow, the education program director at Hewlett, said in the press release that she believes the results will encourage states to include a greater dose of writing in their state assessments.

And she believes this is good for education.

"The more we can use essays to assess what students have learned," she said, "the greater likelihood they'll master important academic content, critical thinking, and effective communication."
