In 2009, the Bill & Melinda Gates Foundation launched a huge project to demonstrate that evaluating teachers by their students’ standardized test scores would improve education and especially the education of “low-income, minority” students. Now the Gates Foundation has paid for a huge Rand Corporation study that showed its original experiment didn’t work. Although the Gates Foundation can move on to testing another hypothesis, its prescription for grading teachers has done immeasurable damage by injecting econometric teacher evaluation into the laws of many states. It will take a long time for the 50 state legislatures to clean up laws based on a mistake.
Chalkbeat’s Matt Barnum describes the original plan: “Barack Obama’s 2012 State of the Union address reflected the heady moment in education. ‘We know a good teacher can increase the lifetime income of a classroom by over $250,000,’ he said. ‘A great teacher can offer an escape from poverty to the child who dreams beyond his circumstance.’ Bad teachers were the problem; good teachers were the solution. It was a simplified binary, but the idea and the research it drew on had spurred policy changes across the country, including a spate of laws establishing new evaluation systems designed to reward top teachers and help weed out low performers. Behind that effort was the Bill and Melinda Gates Foundation… Now, new research commissioned by the Gates Foundation finds scant evidence that those changes accomplished what they were meant to: improve teacher quality or boost student learning. The 500-plus page report by the Rand Corporation… details the political and technical challenges of putting complex new systems in place…”
The Gates Foundation not only launched a giant experiment without an adequate research base, but it also leveraged the investment of public dollars and used its own lobbying might to influence public policy. The Obama administration conditioned qualification for Race to the Top grants on the use of students’ standardized test scores in teachers’ evaluations and later made the same requirement for states to qualify for No Child Left Behind waivers.
The Washington Post’s Valerie Strauss details the history: “Put this in the ‘they-were-warned-but-didn’t-listen’ category.” She describes the project launched in Hillsborough County (Greater Tampa), Florida; Memphis; and Pittsburgh, along with four charter management organizations: “The Bill & Melinda Gates Foundation pumped nearly $215 million into the project while the partnering school organizations supplied their own money, for a total cost of $575 million.” Federal policy makers jumped into the mix: “The Obama administration, through its Race to the Top initiative, dangled federal funds in front of states that agreed to establish teacher evaluation systems using test scores to varying extents. And Gates funded his ‘Empowering Effective Teachers’ project with the aim of finding proof that such systems could improve student achievement… (M)ost states adopted test-based teacher evaluation systems. In a desperate attempt to evaluate all teachers on tested subjects—reading and math—some of the systems wound up evaluating teachers on subjects they didn’t teach or on students they didn’t have. Some major organizations questioned them, including the American Statistical Association…. And so did the Board on Testing and Assessment of the National Research Council.”
Strauss quotes the conclusion of the Rand Corporation’s huge new assessment of the experiment: “Overall, the initiative did not achieve its stated goals for students, particularly LIM (low-income minority) students. By the end of 2014-2015, student outcomes were not dramatically better than outcomes in similar sites that did not participate in the IP (Intensive Partnerships) initiative. Furthermore, in the sites where these analyses could be conducted, we did not find improvement in the effectiveness of newly hired teachers relative to experienced teachers; we found very few instances of improvement in the effectiveness of the teaching force overall; we found no evidence that LIM students had greater access than non-LIM students to effective teaching; and we found no increase in the retention of effective teachers, although we did find declines in the retention of ineffective teachers in most sites.”
What the Rand Report fails to calculate is the collateral damage. It is well known that, in Hillsborough County, Florida, the Gates Foundation suspended its study before it had been completed—leaving the school district itself to cover a significant part of the cost. But beyond Hillsborough County, the consequences were long lasting as state legislatures, lured by Race to the Top funding and the need to qualify for No Child Left Behind waivers, passed laws basing teachers’ evaluations on students’ standardized test scores. When, in December of 2015, Congress replaced No Child Left Behind with the Every Student Succeeds Act, it removed the requirement that states use students’ test scores in teachers’ evaluations, but the laws the states had put in place to meet federal requirements remained.
For example, only last week, just before its 2018 summer recess, did the Ohio Legislature pass a statute reducing the weight of students’ standardized test scores in the state’s formal teacher evaluation system. The law passed with bipartisan support, and it is hoped that Governor John Kasich will sign it.
Last Sunday, the Columbus Dispatch’s Jim Siegel reported that Ohio has been basing 50 percent of teachers’ ratings on students’ standardized test scores. Keep in mind that it is now 2018, and Ohio, like many other states, is still using a plan that the Rand Corporation has now declared ineffective for measuring the quality of teachers.
Siegel quotes Jonathan Juravich, the 2018 Ohio Teacher of the Year, describing the new system: “No longer… (will) student growth measures be used as a disconnected evaluation factor linked to an arbitrary weighted percentage.”
Ohio is also finally doing away with “shared attribution,” according to Siegel: “Changes include doing away with shared attribution—growth measures attributed to a group of teachers that, critics say, does not accurately measure individual performance….”
State Superintendent Paolo DeMaria is quoted describing the new law: “Most importantly, we want our teachers on a path of continuous improvement, and with these changes the system places a greater focus on improvement in teacher practices that lead to better outcomes for students.”
The collaborative scheme of the Bill & Melinda Gates Foundation and the Obama administration to evaluate teachers econometrically has undermined the morale of schoolteachers and contributed to a climate in which teachers have been blamed unfairly when test scores don’t rise. Contrast the Gates theory, now rejected by the Rand Corporation report, with the research of Harvard’s Daniel Koretz, who explains how the test scores so central to the school accountability movement don’t really measure the quality of the schools or specific teachers, but instead primarily reflect the aggregate economic level of a school’s families and neighborhood:
“One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools. The causes are complex, but the result is simple: some schools have far lower average scores…. Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do. This was not an accidental byproduct of the notion that ‘all children can learn to a high level.’ It was a deliberate and prominent part of many of the test-based accountability reforms…. Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic. The specific targets were often an automatic consequence of where the Proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary.” (The Testing Charade: Pretending to Make Schools Better, pp. 129-130)
Ohio is now joining other states trying to undo the damage. Writing for the Stamford Advocate, Wendy Lecker, a columnist and attorney for the Education Law Center, explains: “Technology writer Eugene Morozov coined the term ‘solutionism’: a pathology that recognizes a problem based on one criterion only… solvable with a simple, preferably technological, solution. Solutionists operate with a myopic hubris, believing that if they get their simple fix right, as the chair of Google once claimed, ‘we can fix all the world’s problems.’”
The story of America’s nine-year experiment with rating teachers by their students’ test scores ought to teach us to beware solutionists with gobs of money and the power to seduce policy makers.
(This blog has tracked education philanthropy from the Bill & Melinda Gates Foundation here.)