It had to happen sooner or later. Sixteen months after the Los Angeles Times published rankings of 6,000 third- through fifth-grade teachers in the Los Angeles Unified School District, compiled from seven years of math and English scores, news organizations finally received, under the Freedom of Information Law, data on 18,000 teachers in the New York City school system when a state court declined to hear a last-ditch appeal from the teachers' union to keep the information private. In short order, The New York Times published the names of teachers and their schools, along with rankings based on their students' gains on state standardized tests in math and English over five years through the 2009-10 school year ("City's Ratings of 18,000 Teachers Indicate That Quality Is Widely D...," The New York Times, Feb. 25).
The rationale was that parents have the right to know how their children's teachers rank. It's a compelling argument if it can be proved that publication of ratings leads to better instruction. But it doesn't. For one thing, there is what newspapers report as a huge margin of error in the New York City case. (Actually, the correct term in educational testing is measurement error.) A teacher's score could be 35 points off on the math exam, or 53 points off on the English exam. These numbers hardly instill confidence. For another, teachers who teach English language learners, special education students and disadvantaged students receive lower scores than when they teach affluent students. This raises the question of fairness. Finally, the practice relies on the alleged benefits of naming and shaming, which even Bill Gates opposes ("Shame Is Not the Solution," The New York Times, Feb. 23).
In an attempt to mollify critics, The New York Times offered teachers an opportunity to respond to ratings that they considered unwarranted for one reason or another ("Teachers: An Invitation to Respond to Your Data Report," Feb. 23). It's better than nothing, but it's a sop. The New York Post patted itself on the back in a news story that began: "City parents gave The Post an A-plus yesterday for publishing teacher-evaluation data that revealed valuable information about the educators who are leading their children" ("Parents praise release of NYC public school teacher ratings," Feb. 26).
The issue of evaluating teachers was comprehensively addressed by the Economic Policy Institute in a briefing paper it released on Aug. 29, 2010 ("Problems With the Use of Student Test Scores to Evaluate Teachers"). Ten respected scholars concluded that the value-added metric should be used with "caution." I don't doubt that it can provide some useful information, but that is not how it is being used. Value-added results constitute a disproportionate part of teacher evaluations. I grant that no system of teacher evaluation is perfect. There will always be aspects that are controversial. However, when so much depends on the results of student test scores, it's imperative that more compelling evidence be presented about the value-added metric to justify confidence in it.
I can't think of data reports with a similar margin of error in any other field that have received such prominent coverage. In fact, most editors would in all likelihood dismiss out of hand any study with such shaky statistics. The imprecision alone would constitute a red flag. Aaron Pallas put it best: "For teachers, the key concern is fairness. Fairness is primarily a procedural issue: Teachers, and the unions that represent them, seek an evaluation process that is neither arbitrary nor capricious, relying on stable and valid criteria that they believe accurately characterize the quality of their work" ("Reasonable doubt," Eye on Education, Feb. 6).
In a court of law in this country, prosecutors have to prove beyond a reasonable doubt that a defendant is guilty of a crime in order to get a conviction, and plaintiffs in a civil trial have to show a preponderance of the evidence to win their case. But in the court of public opinion, neither standard must be met to prove that teachers are guilty of ineffectiveness. That makes a travesty of what is taking place.